Thursday, September 30, 2010

Xbase - A new programming language?


No!

It's the basis for a plethora of new programming languages and domain-specific languages!

What is Xbase?

Xbase is a partial programming language implemented in Xtext. It is meant to be embedded and extended within other programming languages and domain-specific languages (DSLs) written in Xtext.

Why Xbase?

Developing textual modeling languages (aka DSLs) has become incredibly easy with Xtext. Structural languages which introduce new coarse-grained concepts, such as services, entities, value objects or state machines, can be developed in minutes. However, software systems do not consist of structure only. At some point a system needs to do something, hence we want to specify some behavior, which is usually done using so-called expressions. Expressions are the heart of every programming language and are not so easy to get right. That is why most people do not add support for expressions to their DSL, but try to solve this differently.

The most common workaround is to define only the structural information in the DSL and add behavior in a second step by modifying or extending the generated code. Writing, reading and maintaining closely related information in two different places, on two different levels of abstraction and in two different languages is unpleasant. It also only works for compilers (i.e. code generators), not for interpreters. (There are many other reasons why mixing generated and handwritten code is problematic, but they are not the topic of this blog post.)

Still, as of today this is the preferred solution, since adding support for expressions (and a corresponding compiler) to your language is hard, even with Xtext.

Actually, being able to call out to the host language is one big advantage internal DSLs have over external DSLs. With Xbase it will be possible to explicitly allow more complex programming at certain places within your DSL, while still having full control over the syntax and semantics of your language. You don't have to reinvent the wheel by implementing a full-blown programming language, and your language's users won't have a hard time understanding the expression language, since it is closely related to Java and well specified.

Also, the more Xbase-based languages we see, the more widely known it will be.

Main Decisions

We want Xbase to be expressive and convenient to use, but at the same time easy to understand and easy to adapt. Understanding not only means learning how to use it but also understanding the language infrastructure, i.e. the parser, compiler, type checkers, etc., because people should be able to reuse and adapt that stuff easily.

The main target audience for Xbase is Java developers. That is why an Xbase expression looks like a Java expression (or statement) at first glance. This means the most commonly used Java statements and expressions (e.g. string literals, if statement, foreach loop, method invocation, constructor call) are also available as is in Xbase. On the other hand, Java is a very complicated language, especially when it comes to the details. After all, the spec counts over 600 pages, and while it is very precise, most of the text deals with exceptional conditions, often involving the special handling of built-in types, etc.

Xbase shall be significantly simpler, so we have to make some decisions.

Runs on the JVM

The JVM is a great, popular platform. In order to ship a compiler, interpreter as well as static typing, Xbase needs to bind to some target platform.
Other platforms such as C/C++, Objective-C or JavaScript are also very interesting target platforms for Xtext languages, but for now the main focus of Xbase is the JVM. This seems to be a natural decision, since Xtext itself runs on the JVM. Also, we know a lot about this platform and the community.

Compiles to Java

The compiler will translate to Java instead of byte code directly. This is for the following reasons:

  1. Anybody should be able to integrate the expressions compiler with any Java code generator
  2. The output as well as the implementation of the compiler shall be as readable / understandable as possible
  3. The code can be used with non-JVM platforms like GWT or Android
  4. We want to leverage the optimizations coming with proven Java compilers

Another pragmatic reason is that, while we plan to have a debugger for Xbase-based languages, it won't be part of next year's release. Therefore people will have to debug on the Java code level, which wouldn't be possible if we were generating byte code directly.

Interpreter

We also want to ship an interpreter in order to allow interpreted DSLs using Xbase.

Statically Typed

Xbase is statically typed. This means that there is a type checker and also that the compiler will use static type information to do its job. Most important to users might be the rich tooling we can, and plan to, provide based on Xtext and Eclipse in general.

However, it should be possible to remove the type checking phase and change the compiler to do dynamic method invocations, etc.
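To give an idea of what a dynamic method invocation can look like on the JVM, here is a minimal Java sketch based on reflection. The class DynamicInvokeSketch and its invoke helper are made up for illustration; this is not the Xbase compiler's output, just one way to resolve a method by name at run time instead of at compile time.

```java
import java.lang.reflect.Method;

class DynamicInvokeSketch {
    // Looks up a public no-argument method by name at run time and calls it.
    // Nothing about the method is checked at compile time.
    static Object invoke(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            // Lookup or call failed; surface it as an unchecked exception.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invoke("abc", "toUpperCase"));
    }
}
```

The price of this flexibility is that a misspelled method name only fails when the code runs, which is exactly what the type checking phase would otherwise catch.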

Full Java Generics

Xbase uses fully-fledged Java generics and doesn't change anything here. While Java generics are not perfect, they have been understood (or at least people think they have ;-)) by a lot of people.
Introducing a different type model would hurt adoption. Under the hood this is backed by the JVM-Types we introduced with Xtext 1.0.

No built-in types

While the JVM-Types support every Java type, Xbase will automatically convert any references to built-in types to their corresponding wrapper types, and any references to array types to lists.
This means you can use built-in types in your languages if you want to (and you should be able to extend Xbase in a way that it can, too), but you don't have to.
The compiler might use built-in types in the generated Java code, but statically and conceptually everything is a subtype of java.lang.Object (i.e. pure OO).
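As a rough Java illustration of this idea: a signature that a Java developer might write with int[] and int can be seen as working on List<Integer> and Integer instead. The class WrapperSketch and the method sum are hypothetical names; the sketch only shows the conceptual mapping, not what Xbase actually generates.

```java
import java.util.Arrays;
import java.util.List;

class WrapperSketch {
    // Conceptually everything is an object: the array becomes a list and
    // int becomes Integer in the visible signature.
    static Integer sum(List<Integer> values) {
        int total = 0; // the implementation may still use a primitive internally
        for (Integer value : values) {
            total += value; // auto-unboxing
        }
        return total; // auto-boxing back to Integer
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3)));
    }
}
```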

Closures

The main addition in Xbase is the concept of closures. While it looks like Java will have them one day, the lack of them is a major problem with Java.
Xbase comes with a small runtime library that includes interfaces for functions. Closures in Xbase are just sugar for anonymous classes implementing one of these function types.

For instance the following Java expression:

new Function1<String,String>() {
    public String apply(String s) {
        return s.toUpperCase();
    }
}

can be written like this in Xbase:

String s | s.toUpperCase()

Xbase also provides sugar for the types of functions. That is

(String)=>String

is a shorthand for

Function1<String,String>
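To make the desugaring a bit more tangible, here is a small self-contained Java sketch. The Function1 interface below is only a stand-in for the one in the Xbase runtime library (its real package and exact shape are not shown in this post), and ClosureSketch, map and toUpper are names made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the function interface from the runtime library.
interface Function1<P, R> {
    R apply(P p);
}

class ClosureSketch {
    // A plain higher-order helper: applies f to every element of the list.
    static <P, R> List<R> map(List<P> in, Function1<P, R> f) {
        List<R> out = new ArrayList<R>();
        for (P p : in) {
            out.add(f.apply(p));
        }
        return out;
    }

    // The anonymous class that the closure "String s | s.toUpperCase()" stands for.
    static Function1<String, String> toUpper() {
        return new Function1<String, String>() {
            public String apply(String s) {
                return s.toUpperCase();
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(map(Arrays.asList("foo", "bar"), toUpper()));
    }
}
```

Everything inside toUpper() except the single call to toUpperCase() is ceremony, which is what the closure syntax removes.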

Type Inference

Type inference is another important feature of any modern statically typed language. Type inference basically means that the compiler doesn't force you to write redundant information about types. In Java for example the type of a local variable needs to be specified although it could be inferred from the initialization expression:

Map<String,Person> namesToPerson = new HashMap<String,Person>();

In Xbase you don't have to write the type signature twice, but can write the following instead:


val namesToPerson =  new HashMap<String,Person>();

Of course namesToPerson would be of type HashMap<..> here. If you want to be explicit, you can add the type information optionally:


val Map<String,Person> namesToPerson =  new HashMap<String,Person>();

Xbase does type inference for the parameter types of closures as well. That is, the argument types don't need to be specified if they can be inferred from the current context.

Also note that the typing service of Xbase can be used in your language in order to do type inference (for instance for return types in method signatures).

Operator Overloading

Xbase comes with a fixed set of operators, with a fixed precedence and associativity. The difference from Java is that those operators are not bound to certain built-in operations on built-in types but are just shorthands (or sugar) for certain method invocations.

That is, if some type T has a method plus(T2), you can either write

myT.plus(myT2)

or

myT + myT2

This concept is known from Groovy (although it's slightly different there).
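Seen from the Java side, the operator form is nothing more than the method call. The following sketch uses a hypothetical value type Money to show what "a + b" would boil down to; the names Money, OperatorSketch and add are invented for this example, and only the method name plus is taken from the text above.

```java
// Hypothetical value type; only the method name "plus" matters for the mapping.
final class Money {
    final long cents;

    Money(long cents) {
        this.cents = cents;
    }

    Money plus(Money other) {
        return new Money(cents + other.cents);
    }
}

class OperatorSketch {
    // Java form of the expression "a + b" when + is sugar for plus(..).
    static Money add(Money a, Money b) {
        return a.plus(b);
    }

    public static void main(String[] args) {
        System.out.println(add(new Money(150), new Money(250)).cents);
    }
}
```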

Simplicity over Syntactical Flexibility

With operator overloading we could have gone a step further, as done in Scala. In Scala the operators aren't fixed keywords but words with certain characteristics (usually starting with a certain character). That would allow operators which are not predefined in the language. However, this would have introduced a couple of additional lexer rules, which would have limited the available syntactical space dramatically. This would have made extending the language much harder (and even impossible in many cases).

In general we decided to prefer simplicity over syntactic flexibility. This is because with Xbase you already have the largest syntactic freedom. You just create a sublanguage and add or remove anything you want.

Languages like Scala really need to have all this flexibility, because they are designed to add new language features as a library. These special rules about identifiers and operators, other syntactic flexibility like newlines as expression separators (and the situations when this doesn't work), and the different ways to invoke functions are what make Scala syntactically flexible but complicated at the same time.

Xbase is designed to let you easily add new language features on the language level. If you need a certain syntax you can just have it. The base language remains simple.

Everything is an Expression

There's just no good reason to distinguish between expressions and statements. Although most statements are inherently imperative (i.e. about side effects), there's no reason to have this separation (which is a limitation) built into the language.
Instead, in Xbase everything is an expression, that is, everything returns something (and has a type at compile time). This allows the typical imperative statement constructs to be used deeply nested, as in the following expression:

this.setFoo(if (isFoo) "foo" else "bar")

In Java we have the ternary operator to do branches within expressions. In Xbase you can use the if expression, but you can also have for and while loops, try-catch clauses or even the nice switch expression deeply nested.
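For comparison, here is how the same things have to be spelled in Java. The sketch is only meant to show the contrast: the if expression maps onto the ternary operator, while a try-catch that should yield a value needs a temporary variable. ExpressionSketch, update and parseOrDefault are names made up for the example.

```java
class ExpressionSketch {
    private String foo;

    void setFoo(String foo) {
        this.foo = foo;
    }

    String getFoo() {
        return foo;
    }

    // An if used as an argument: Java needs the ternary operator here.
    void update(boolean isFoo) {
        this.setFoo(isFoo ? "foo" : "bar");
    }

    // A try-catch used as a value: Java needs a temporary variable.
    static int parseOrDefault(String text, int fallback) {
        int result;
        try {
            result = Integer.parseInt(text);
        } catch (NumberFormatException e) {
            result = fallback;
        }
        return result;
    }

    public static void main(String[] args) {
        ExpressionSketch sketch = new ExpressionSketch();
        sketch.update(false);
        System.out.println(sketch.getFoo());
        System.out.println(parseOrDefault("not a number", -1));
    }
}
```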

Powerful Switch Expression

This is one of the new features we added. I like pattern matching, but think it is way too complex for many people to use and most people to integrate in their language.
Also I like polymorphic dispatching, like we always had in Xpand and use a lot in Xtext.

On the other hand, the switch statement in Java is just stupid. It is complex (fall through) and limited (switching over strings is only planned for Java 7).

So what we do in Xbase is

  1. we remove fall through (first match wins)

  2. we allow switching over anything (based on equals)

  3. we introduce so-called type guards (which automatically apply down casts)

Example:

val p = getMeSomeObject();

switch ( p ) {
    Foo case p.isSpecialFoo() : "SpecialFoo";
    Foo : "OrdinaryFoo";
    Bar : "It's a "+p.barKind()+" bar";
    default : "don't know";
}

I hope this is intuitive and readable. You can find the details in the Xbase language specification.
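To see what the type guards save you, here is a hand-written Java equivalent of the example above. It is only a sketch of the semantics described in this post (first match wins, the guard implies the down cast), not the code the Xbase compiler will produce. Foo and Bar are minimal made-up classes that offer just the methods used in the example.

```java
// Minimal made-up classes offering the methods used in the switch example.
class Foo {
    private final boolean special;

    Foo(boolean special) {
        this.special = special;
    }

    boolean isSpecialFoo() {
        return special;
    }
}

class Bar {
    private final String kind;

    Bar(String kind) {
        this.kind = kind;
    }

    String barKind() {
        return kind;
    }
}

class SwitchSketch {
    // Each case becomes an instanceof test plus an explicit cast.
    // The cases are tried top to bottom and the first match wins.
    static String describe(Object p) {
        if (p instanceof Foo && ((Foo) p).isSpecialFoo()) {
            return "SpecialFoo";
        }
        if (p instanceof Foo) {
            return "OrdinaryFoo";
        }
        if (p instanceof Bar) {
            return "It's a " + ((Bar) p).barKind() + " bar";
        }
        return "don't know";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Foo(true)));
        System.out.println(describe(new Bar("dive")));
    }
}
```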

Current state

The development of Xbase has just begun. We have a first draft of a language specification and grammars as well as some infrastructure, but we are still at a very early stage.

I hope this post made you interested in Xbase. Feedback is very welcome.

Wednesday, September 29, 2010

Xtext in Indigo

We started development on the next release of Xtext just four weeks ago. Prior to that we did the service release and some internal prototyping and evaluation for upcoming features. We have also already had our yearly kick-off meeting in order to define the main goals for the next release.

With this blog post I'd like to share what we are up to. Feedback is welcome.

Xtext Version 2.0

Yes, that's right. We are aiming at a new major version. This means lots of great new features and improvements as well as incompatible changes. While our API lifecycle contract says that nothing is guaranteed to remain compatible between two major versions, we are of course aware of which APIs most users use. So we think twice about any change and don't make it lightly.
The differences might turn out to be comparable to those we had between the last two releases, which were considered easy to migrate to.

What we did so far

During M2 we did a lot of cleanup. We removed any deprecated code and incremented the bundle versions for those with incompatible changes to 2.0.0.
Also, the whole outline view infrastructure as well as the auto editing have been rewritten. The new outline view is cleaner, faster and simpler. In addition, Xtext now leverages JFace partitions. That is, one can register different JFace service implementations, such as content assist, per partition. By default there are distinct partitions for comments and string literals as well as the default partition type for everything else. See the full list of solved bugzillas here.

Milestone builds

Yesterday we had our M2 of the current Indigo stream. We don't have a promoted milestone build, as we had some issues with the build servers. Please use the nightly builds in case you want to give the latest changes a spin.

Expect the first promoted milestone build (M3) of the current development stream of Xtext in six weeks (November 9th).

What's next? (Plan for Indigo)

The development plan for Indigo is laid out around one big topic, which is called Xbase. Xbase is an expression language bound to the JVM, which will be shipped with Xtext and can be embedded and extended in other Xtext-based languages. It is basically a simplified version of Java's expressions but also adds some new concepts. I will post a separate blog entry on Xbase tomorrow. Note that Xbase is a research project we are doing together with the Christian-Albrechts University of Kiel, and it is sponsored by the Federal Ministry of Education and Research.

Refactorings & Clean ups

As you might know, we usually implement most functionality twice (or more often), because we are simply too stupid to get it right the first time :-).
Xtext 1.0.0 is no different, so we have identified the following parts of the framework which we want to clean up or even re-implement:

  • Outline view (done)

  • Auto Editing (done)

  • Bracket Matching (done)

  • Serializer

  • Formatter

  • some smaller things

New Features

But hey, we know most people are interested in new features. So this is what we want to work on in addition to Xbase:

  • Generic Rename Refactoring

  • Additional Import Namespace Functionality (organize imports, add import, warnings and errors)

  • Default Implementation and Convenience API for Text Hovers

  • Lexer and Parser Fragments

  • View Framework (Convenient way to define tree views, such as call or type hierarchies)

More Languages

We of course eat our own dog food and therefore will implement a couple of new languages based on Xbase.

One is a textual notation to define Ecore elements similar to EMFatic, but where one can specify the behavior in derived EStructuralFeatures and EOperations in the language.

Another is a successor to Xpand and Xtend, called Xtend 2. That one comes with a couple of very interesting features I'll talk about in subsequent blog posts.

We are also thinking about having DSLs for several different view points in Xtext itself, such as scoping. Don't miss tomorrow's blog post on Xbase. :-)

Disclaimer: As usual, this is what we would like to do and defines the general direction. But as we don't have a fortune teller on our team, we might (will) adjust the plan slightly as we go.