Thursday, September 18, 2008

Language feature supporting internal DSLs

There's a lot of discussion going on about internal (or embedded) DSLs and which programming languages are best suited to developing them.
In essence, an internal DSL pushes the syntactic flexibility of a language to its limits in order to write against an API in a domain-specific manner. What that means depends on the DSL, but in most cases it means simplifying client code by implying context, allowing a declarative style and avoiding repetition (DRY violations).

Many people think that it is the dynamic nature of a language that makes it suitable for internal DSLs. That is only partly true. More important than the ability to do metaprogramming is a flexible syntax that allows functions to be invoked in different ways (omitting parentheses, semicolons, etc.). Ruby and Scala are both very flexible in this sense.

Working on the new version of Xtext, I had an idea for what I think would be a cool language feature.

Introducing new syntax via libraries
When declaring a function, one typically declares only a name and the kinds of parameters that need to be passed in. The programming language itself then defines a (more or less flexible) generic way in which such functions can be invoked.

Example:
def foo(String x) : "foo"+x;
which is invoked like this:
foo("bar")
Imagine a language which alternatively allows you to specify the concrete syntax used to invoke such a function:
def myFunc : 'foo' x=ID : "foo"+x;
Here the part between the first and the second colon introduces a new syntax (I've reused the Xtext syntax here) which can be used to invoke the function, so one could simply write:
foo bar
Here 'foo' is a newly introduced keyword (limited, of course, to the static scope where the definition is visible) and bar is an identifier, i.e. a built-in lexer token of type String.
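
To make the idea a bit more tangible, here is a minimal sketch in Python that emulates it at runtime. It is only an illustration, not how a real implementation would work: the class SyntaxRegistry and its methods define and evaluate are names I made up, and the "grammar" is restricted to exactly the shape used above (one keyword followed by one identifier).

```python
import re

# Hypothetical stand-in for the built-in ID lexer token.
ID = re.compile(r"[A-Za-z_]\w*")


class SyntaxRegistry:
    """Maps a leading keyword to a function that takes one identifier."""

    def __init__(self):
        self._rules = {}

    def define(self, keyword, body):
        # Refuse to redefine a keyword that is already visible in this scope.
        if keyword in self._rules:
            raise ValueError(f"keyword {keyword!r} is already defined")
        self._rules[keyword] = body

    def evaluate(self, source):
        tokens = source.split()
        if len(tokens) != 2:
            raise SyntaxError(f"expected '<keyword> <ID>', got {source!r}")
        keyword, arg = tokens
        if keyword not in self._rules:
            raise SyntaxError(f"unknown keyword {keyword!r}")
        # Keywords win over identifiers, as in a typical lexer.
        if arg in self._rules or not ID.fullmatch(arg):
            raise SyntaxError(f"expected an identifier, got {arg!r}")
        return self._rules[keyword](arg)


registry = SyntaxRegistry()
# Roughly: def myFunc : 'foo' x=ID : "foo"+x;
registry.define("foo", lambda x: "foo" + x)
result = registry.evaluate("foo bar")  # "foobar"
```

The important difference to the proposal is that this registry interprets a string at runtime, whereas the language feature would extend the parser itself within the static scope of the definition.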

Another example:
def Person :
('girl'|male?='boy') name=ID (lastName=ID)? ('from' city=ID)? :
new Person(male,name,lastName,city);

def greet :
p1=Person 'greets' 'the'? p2=Person :
p2.greetedBy = p1;
which would allow for expressions like:
boy Sven from Kiel greets the girl Scarlett Johansson
and would effectively construct two instances of Person and link them. :-)
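
Just to check that the two rules above really accept that sentence, here is a hand-written recursive-descent parser in Python that mirrors them. Again, this is a sketch under assumptions: the Parser class, the snake_case field names and the rule that keywords are never treated as identifiers (as in a typical Xtext lexer) are my choices for the illustration, not part of the proposal.

```python
from dataclasses import dataclass
from typing import Optional

# Keywords introduced by the two definitions; they never count as IDs.
KEYWORDS = {"girl", "boy", "from", "greets", "the"}


@dataclass
class Person:
    male: bool
    name: str
    last_name: Optional[str] = None
    city: Optional[str] = None
    greeted_by: Optional["Person"] = None


class Parser:
    def __init__(self, text):
        self.tokens = text.split()
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def accept(self, keyword):
        """Consume the keyword if it is next; report whether it was."""
        if self.peek() == keyword:
            self.pos += 1
            return True
        return False

    def expect(self, keyword):
        if not self.accept(keyword):
            raise SyntaxError(f"expected {keyword!r}, found {self.peek()!r}")

    def at_id(self):
        tok = self.peek()
        return tok is not None and tok not in KEYWORDS and tok.isidentifier()

    def ident(self):
        if not self.at_id():
            raise SyntaxError(f"expected an identifier, found {self.peek()!r}")
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def person(self):
        # ('girl'|male?='boy') name=ID (lastName=ID)? ('from' city=ID)?
        if self.accept("boy"):
            male = True
        else:
            self.expect("girl")
            male = False
        name = self.ident()
        last_name = self.ident() if self.at_id() else None
        city = self.ident() if self.accept("from") else None
        return Person(male, name, last_name, city)

    def greet(self):
        # p1=Person 'greets' 'the'? p2=Person
        p1 = self.person()
        self.expect("greets")
        self.accept("the")  # optional keyword
        p2 = self.person()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected trailing input {self.peek()!r}")
        p2.greeted_by = p1
        return p2


scarlett = Parser("boy Sven from Kiel greets the girl Scarlett Johansson").greet()
```

Note that the optional lastName only works because 'from' is lexed as a keyword and not as an ID; otherwise "Sven from" would be read as first and last name. This is a small taste of the ambiguity problems mentioned below.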

Obviously this would be a tough thing to implement (parsers which dynamically change their behaviour), and it would also be hard to avoid ambiguities with syntaxes imported from other libraries or introduced by the host language. I haven't spent much time thinking it through yet. So maybe there are obvious showstoppers I overlooked, or one of your neighbors already developed such a language a decade ago?
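
The crudest form of such an ambiguity, two libraries (or a library and the host language) claiming the same keyword, could at least be detected mechanically. A small Python sketch of that check follows; the function find_keyword_conflicts and the example keyword sets are hypothetical, and real grammar-level ambiguities would of course need far more than this.

```python
from collections import defaultdict


def find_keyword_conflicts(libraries):
    """Return each keyword claimed by more than one library, with its owners."""
    owners = defaultdict(set)
    for library, keywords in libraries.items():
        for keyword in keywords:
            owners[keyword].add(library)
    return {kw: sorted(libs) for kw, libs in owners.items() if len(libs) > 1}


# Made-up keyword sets: a host language and two imported syntax libraries.
conflicts = find_keyword_conflicts({
    "host": {"def", "new", "from"},
    "people": {"girl", "boy", "from", "greets", "the"},
    "foo_lib": {"foo"},
})
```

Here only 'from' would be reported, claimed by both the made-up host language and the people library.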