Thursday, August 27, 2009

Xtext - Road to Helios

After the Galileo release of Xtext we are now focussing on the next iteration. We've had a face-to-face meeting in Kiel and discussed the different ideas, our process and what the "themes" for our Helios development are.

In this post I want to share our vision of what the road to Helios could look like. I say 'could' because although we have a rough idea of where we want to go, we know from the past that the actual road we take will differ slightly. That is, things we find useful today might have become less important in six months. "Round And Round The Earth Is Turning" :-).
I'll outline our visions and priorities along the four themes we identified for the Helios development period.

Theme I : Clean Code

Our code base is already comparatively clean but we want it to be even cleaner. Cleanness in code means the following to us:

DRY - Don't repeat yourself

Every piece of information should be kept in one place only. Like everything I write here, this is not meant dogmatically. There are of course situations where it is more pragmatic to duplicate a piece of information.

KISS - Keep it small and simple


Solutions should only be as complicated as they need to be. Sometimes it is hard for developers to understand that they have solved a problem but we do not want to check the solution in as it is. That is of course not because we do not appreciate the effort. It is because we think it is too complex, and we do not want to add superfluous complexity to the code base. Of course, opinions on what is superfluous and what is not may differ :-).

Keep it testable and have tests

Code needs to be testable. If it is not testable, unit testing is either not possible or so complicated that the tests themselves become fragile and complex. We want to have sufficient unit tests. 'Sufficient' is again a weak statement. Usually one has too few tests; having too many is very seldom the case. Mature and published code can't have too many tests, as long as each test covers a different case. Note that the cleanness of unit tests is as important as the quality of the production code. When we fix bugs we usually work test-first. That is, we write a failing test reproducing the reported misbehavior and fix it afterwards.
In Xtext we have more LOC in the tests than in production code.
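
The test-first routine for a bug fix could look roughly like the following sketch. It uses plain Java without a test framework to stay self-contained. The class QualifiedNames and its method namespace are hypothetical and not part of Xtext; the imagined bug report is "a name without a dot throws an exception".

```java
public class QualifiedNames {

    // Returns everything before the last '.', or "" if the name has no dot.
    // The hypothetical buggy version was
    //   return qualifiedName.substring(0, qualifiedName.lastIndexOf('.'));
    // which throws StringIndexOutOfBoundsException for names without a dot.
    public static String namespace(String qualifiedName) {
        int lastDot = qualifiedName.lastIndexOf('.');
        return lastDot < 0 ? "" : qualifiedName.substring(0, lastDot);
    }

    // Written first: reproduces the reported misbehavior and failed until the fix was in.
    static void testNameWithoutDotHasEmptyNamespace() {
        check("".equals(namespace("Entity")));
    }

    // Covers a different case, so it is not a redundant test.
    static void testNestedNameKeepsFullNamespace() {
        check("org.example".equals(namespace("org.example.Entity")));
    }

    static void check(boolean condition) {
        if (!condition) {
            throw new AssertionError("test failed");
        }
    }

    public static void main(String[] args) {
        testNameWithoutDotHasEmptyNamespace();
        testNestedNameKeepsFullNamespace();
    }
}
```

The failing test stays in the code base after the fix, so the same misbehavior cannot come back unnoticed.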

Comments

Any comments need to reflect the code! Wrong comments are far more problematic than no comments.

Inline comments are a smell. They tell you that your code is not readable. Most of the time the simplest way to avoid such a comment is to extract a method and name it after the comment you wanted to put in front of that code snippet.
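
As an illustration of that refactoring, here is a small before/after sketch. The class ImportCollector and its methods are made up for this example and are not taken from the Xtext code base.

```java
import java.util.ArrayList;
import java.util.List;

public class ImportCollector {

    // Before: an inline comment has to explain what the condition means.
    static List<String> collectBefore(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            // check whether the line is an import statement
            if (line.trim().startsWith("import ") && line.trim().endsWith(";")) {
                result.add(line.trim());
            }
        }
        return result;
    }

    // After: the comment has become the name of an extracted method.
    static List<String> collectAfter(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            if (isImportStatement(line)) {
                result.add(line.trim());
            }
        }
        return result;
    }

    static boolean isImportStatement(String line) {
        String trimmed = line.trim();
        return trimmed.startsWith("import ") && trimmed.endsWith(";");
    }
}
```

Both variants behave the same; the second one simply reads without needing the comment.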

JavaDoc comments are important. This is an aspect where we should improve our code base.
JavaDocs should be as short as possible and as verbose as needed to understand the contract behind the API at hand. We only add JavaDocs to primary hooks. Internal stuff is seldom commented because we can take a quick look at the code to understand what's going on. At least as long as the code fulfills the next requirement.
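
A sketch of what this could mean in practice: the hook that users are expected to implement documents its contract, while the internal helper carries no JavaDoc. NameProvider and DefaultNameProvider are hypothetical names chosen for this example, not actual Xtext API.

```java
public class JavaDocExample {

    /** Hook for customizing how elements are named. */
    public interface NameProvider {

        /**
         * Computes the name under which an element is visible to other files.
         *
         * @param simpleName the name as written in the model, never {@code null}
         * @param containerName the name of the enclosing element,
         *        or {@code null} for top-level elements
         * @return the qualified name; equal to {@code simpleName} for top-level elements
         */
        String qualifiedName(String simpleName, String containerName);
    }

    public static class DefaultNameProvider implements NameProvider {

        @Override
        public String qualifiedName(String simpleName, String containerName) {
            return containerName == null ? simpleName : join(containerName, simpleName);
        }

        // Internal helper: short enough to understand by reading it, so no JavaDoc.
        private static String join(String container, String name) {
            return container + "." + name;
        }
    }
}
```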

Readability

Code is read far more often than it is written or edited. So it's especially important to write the code so that your intentions are clear. Small functions and expressive names are good starting points.


Theme II : Usability (UI Quality & Features, API Quality, Documentation Quality)

UI Quality and Features

In Galileo we mainly focused on putting the main abstractions in place and creating a solid framework. Some UI features were therefore postponed.
One of the first things we want to do in Helios is add the EMF Index and make it and the corresponding scope provider the default. Based on that we have information about all cross-file references, which enables us not only to automatically trigger validation of referencing models, but also to provide advanced navigation features and things like rename refactoring.
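
To illustrate why such an index helps, here is a deliberately simplified sketch of a reverse-reference lookup. ReferenceIndex is a hypothetical class for this post only; it is not the EMF Index API, and it identifies files by plain strings as an assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ReferenceIndex {

    // Key: referenced file; value: all files that contain a reference to it.
    private final Map<String, Set<String>> referencedBy = new HashMap<>();

    // Records that sourceFile contains a cross reference into targetFile.
    public void addReference(String sourceFile, String targetFile) {
        referencedBy.computeIfAbsent(targetFile, key -> new TreeSet<>()).add(sourceFile);
    }

    // Files whose validation should be re-triggered when changedFile is modified.
    public Set<String> filesToRevalidate(String changedFile) {
        Set<String> sources = referencedBy.get(changedFile);
        return sources == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(sources);
    }
}
```

The same lookup would tell a rename refactoring which files contain references that need to be updated.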

API Quality

It is our desire that people have fun when working with the abstractions we provide.
The mantra for last year's development was 'We aim to make simple things simple and complex things possible.' (by Alan Kay). To us this means that the 80% of things we want to do with a framework need to be as simple as possible, while at the same time we need to make sure that the other 20% are still possible (and ideally also not too hard to accomplish). Of course there are use cases which simply do not fall into the scope of a framework, but that is a different story.

Documentation Quality

Our documentation is in relatively good shape, but can still be improved. We are convinced that good documentation is key to the overall acceptance of a framework.
So keeping the documentation current and of good quality is part of our development process. That is, adding or changing code is always paired with adding and changing tests and updating the documentation.

Theme III : Performance & Scalability

This has been a major theme for last year's development as well.
Given user stories like the talk about Xtext and AUTOSAR proposed for ESE 2009, it seems that we have met this non-functional requirement fairly well.
But that does not mean that we don't need to check it from time to time, or that users never face performance issues. We take this aspect very seriously.
But of course we do not write optimized code as long as we haven't measured that the optimization is worth the extra unreadability, complexity or whatever.
What we do is hold regular profiling sessions and think about different use cases when we design concepts.

Theme IV : Increase Applicability (Base Language, Grammar Features)

So far all the themes are more or less about improving quality. I haven't talked about many new features (apart from mentioning the EMF Index and what it means for some UI features we want to implement). You will of course see many improvements and new features I haven't mentioned in this post. We decide which bugzillas to work on from milestone to milestone. (I'll write about what is coming in M2 in a separate post.)
But there is one huge "feature" we want to do and I want to mention it here:

Base Language

We want to provide a base language which can be extended and customized by users. The idea is not new: the Intentional Workbench as well as MPS both have similar things. Apart from the fact that these frameworks are of a different nature (they are not text-based), we also don't want to implement Java or C# (that is what they do), but want to come up with a much simpler and nicer language. Actually I like Scala very much, because it has very few concepts in it and lets you define new concepts out of them. But with an Xtext-based language you don't need the syntactical flexibility of Scala or Ruby, because the parser and the compiler are open and you can change the language as you see fit. Imagine state machines with action implementations, entity models with implementations for operations and derived attributes, validation languages etc. Looking at the current landscape of external DSLs, I'd say that about 90% of all languages would benefit from support for embedding expressions. Some people have already implemented their own expression language; others reuse the language of the target platform, i.e. they write the behavior in Java and mix it into the generated code. Sebastian and I gave a talk about this at Code Generation 2009 in June.

In order to show and prove how great such a language is, we plan to implement an EMFatic version that supports adding implementation for EOperations and derived EStructuralFeatures.
With that you'll no longer need to change the generated code (at least most of the time) and you'd have everything in one place.
In addition, DSLs for common viewpoints in Xtext, such as scoping, validation, quick fixes, formatting, etc., would be nice.

The base language is a huge effort and will likely not be finished by next year. As of now I'm not sure if and how much it will be included in the Helios release.

Java adaptation

Another really nice thing we're already working on is a thin EMF adapter layer to JDT's Java model. This allows for referencing Java elements from within a DSL, which is very often desired. Reimplementing MWE with a nicer syntax will be a matter of a couple of days once we have this feature. We want this to be tightly integrated with the EMF Index.
This is also a bigger topic but will definitely be part of Helios (actually we plan to have it finished by M4).

Grammar Features

There are a lot of ideas for additional features in the grammar language. We haven't decided on them yet, but things like multiple grammar inheritance might become more important when we ship a base language. What exactly such a feature would mean and what its consequences are still has to mature in our minds.

DISCLAIMER

Please don't take any of these ideas as guaranteed. This is what we currently have in mind but our view of the world will change. Also note that developing the base language we have in mind is a big effort, so it might take us more than just one release cycle.
I'll write a post for each milestone to tell you what exactly we did and what we are working on.

9 comments:

Achim Demelt said...

Cool. Very interesting things planned for Helios. Thanks for posting this road map.

darkviews said...

Seems like a good time to finally have a look at Xtext. What I'm missing: How about debugger support? When I search for errors in the final code, I need to see where each piece of code originally came from. Do you have a solution for this?

Another idea that bugs me is to generate several outputs from the same source. Say you create Wikipedia. Wouldn't it be nice to have a single source base and generate plain HTML, HTML+JavaScript, a version for flash and one using Java? The plain version would work anywhere but if you want some more power, you could use the flash or Java version on your desktop.

For this, you would need to be able to map the source to capabilities on the output end. The source would contain the WYSIWYG editor, dependency management, etc. and the transformers would strip that down.

Sven Efftinge said...

> Seems like a good time to finally
> have a look at Xtext. What I'm
> missing: How about debugger
> support? When I search errors in
> the final code, I need to see
> where each piece of code
> originally came from. Do you have
> a solution for this?

Not a solution but very promising ideas. At least for Java-based languages we want to keep track of the trace between model and generated code and then hook JDT's debugger in order to show the model instead of the Java source file.
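
To make the idea a bit more tangible, here is a very rough sketch of such a trace. GenerationTrace is a hypothetical class and the "file:line" location strings are an assumption for illustration; nothing here is an existing Xtext or JDT API.

```java
import java.util.Map;
import java.util.TreeMap;

public class GenerationTrace {

    // Key: first generated line of a region; value: model location that produced it.
    private final TreeMap<Integer, String> regionStarts = new TreeMap<>();

    // Records that the generated code starting at firstGeneratedLine stems from modelLocation.
    public void record(int firstGeneratedLine, String modelLocation) {
        regionStarts.put(firstGeneratedLine, modelLocation);
    }

    // Returns the model location for a generated line,
    // or null if the line lies before every recorded region.
    public String modelLocationFor(int generatedLine) {
        Map.Entry<Integer, String> entry = regionStarts.floorEntry(generatedLine);
        return entry == null ? null : entry.getValue();
    }
}
```

A debugger integration could use a lookup like this to show the model element instead of the generated Java line.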

I didn't get the other thing. Sounds like you think it is not possible to generate multiple artifacts from one? That's of course possible.

Aaron Digulla said...

How do you work around the lack of a #line directive in Java? Will you sprinkle the output with special comments?


As for the other thing, the idea is to have a single code base and generate useful programs in several languages. Kind of what we have today with Eclipse and RCP (same code base, one gives a Java app, the other a web app).

If you look at Wikipedia today, the UI is pretty clumsy. There is no support for type-ahead, only very basic code templates, etc. If you could generate a high-level WYSIWYG editor and turn that into something that runs with Flash and Java, people could choose while still having the option to use the pure HTML version.

On a more complex level: I'd like to be able to write code once and then get Java, Python, Flash, C#, whatever out of it. Today, you either need Mono for that or you must port the code (see git and jgit) which means >= twice the effort to maintain it.

Krzysiek said...

I am quite interested in the base language (so I would not need to create one myself ;))).

Plugging into IJavaModel with Xtend was quite simple in my experience, but extensions to handle type signatures etc. were needed, so it would be really good to have a well-formed model around it. Then Recipes would not be needed that much, Java-from-Java code generation would be possible, and Java elements could be referenced from Xtext as long as you are inside Eclipse :)

Additionally I would really like to see extended support for running workflows inside Eclipse. For instance, Xtext can use namespace URIs to import models, but Xpand cannot. QVT and Acceleo support URIs. IMO the lack of this causes a number of issues for beginners with metamodel registration, pointing to ecore files or EPackages, genmodel registration etc. I guess that with Xtext, more and more people will want to run workflows from Eclipse plugins. I guess that it would be hard to provide support for this in a standalone setup though.

Regards,
Krzysztof Kowalczyk

randomice said...

I would like to see Xtend implemented in Xtext which would make the functions referenceable from other EMF models.

Sven Efftinge said...

@Aaron : We are currently thinking about keeping the trace information in a separate file. But those are just our current ideas, and they are very vague.

@Krzysztof :
Having a hook to trigger generation in a builder is a planned feature. I just did not list it. :-)

@randomice :
Well, the base language will be very similar but not 100% compatible with Xtend.

rod said...

I think it would be good to add semantic and syntactic predicates before implementing the base language.

They'll eventually be needed for more complex languages anyway and the base language should be able to support more complex features.

Robert Wloch said...

Thanks for the roadmap Sven!



The Javadoc part is a major point I think. Especially when it comes to customization a short explanation of methods I need to override would be really helpful.



For some customizable API classes it could be useful to have a class JavaDoc that contains a short how-to-use-this example. You often find this in JDK and other framework classes, e.g. java.util.GregorianCalendar.



One last thing I'd like to see in Xtext is improved code completion. For instance, although Xtext is able to create error markers for non-existent imports in your model, it is not able to provide completion suggestions for possible files. A second example is the usage of predefined ecore models when specifying a grammar. The rules one specifies need to return types from your ecore model. Although Xtext recognizes wrong types and non-existent features, it does not offer code completion at this point.