Tuesday, December 20, 2011

Groovy, Scala, Java, Xtend - a stupid comparison

Disclaimer: This is probably the worst possible way to compare execution performance. I'm posting it nonetheless since the result was significant. Judge for yourself.

Today I took some time to check out the latest Eclipse plugins for both Groovy and Scala. I mainly wanted to know how nice the tooling is and played around with it a bit. I had been doing some Scala a couple of years ago, but since the Eclipse plugin not even existed those days it was a bit too painful for me to get used to it.

Groovy's tooling looks very good. There are just a few oddities but mainly I think it's well done... in contrast to the Scala plugin (I used 2.0.0 RC4). It holds a lot of surprises. For instance, with the Scala-IDE installed you'll get Scala-related template proposals in the Java and the Groovy editor. Still Scala the language would be my personal choice if I had to decide between the two. Anyway, this post is not about language features or Eclipse-plugin quality, but about what I experienced regarding runtime performance.

I knew Groovy's performance is bad. But I expected the performance problems to diminish, if I use a scenario using dynamic dispatch and mostly untyped objects. So I wrote the following brain dead code:

class Groovy {

  public static void main(String[] args) {
    def absoluteResult = []
    def before = System.currentTimeMillis()
    for (times in 1..10000) {
      def result = ['foo', 23, true]
      for (y in result) {
        absoluteResult += foo(y)
      }
    }
    println("Took : "+(System.currentTimeMillis() - before)
      +" ms, number of elements : "+absoluteResult.size);
  }

  static String foo(String s) {
    'String'
  }
  
  static String foo(Boolean s) {
    'Boolean'
  }
  
  static String foo(Integer s) {
    'Integer'
  }

}

It doesn't do anything interesting but leverages Groovy's built-in way of resolving overloaded methods using the runtime type. I thought it's fair to use this feature in a comparison, also because Xtend's dispatch methods have the same behavior. The result of the code above would be a list containing the string 'String', 'Boolean', and 'Integer' each 10.000 times.

When I run this code on my machine, it prints something like the following:

Took : 2025 ms, number of elements : 30000

A possible Xtend version

This is how I could achieve the same using Xtend:

class Xtend {
  def static void main(String[] args) {
    val absoluteResult = newArrayList()
    val before = System::currentTimeMillis()
    for (times : 1..10000) {
      val result = newArrayList('foo', 23, true)
      for (y : result) {
        absoluteResult += foo(y)
      }
    }
    println("Took : "+(System::currentTimeMillis() - before)
           +" ms, number of elements : "+absoluteResult.size)
  }

  def static dispatch foo(String s) {
    'String'
  }
  
  def static dispatch foo(Boolean s) {
    'Boolean'
  }

  def static dispatch foo(Integer s) {
    'Integer'
  }
}

On a first glance it looks very similar. Note, the dispatch keyword, which turns a set of overloaded methods into dynamically dispatched methods (i.e. the default Groovy behavior).
In contrast to the Groovy version this one is fully statically typed and as it turned out much faster. (Again as you see my measurement technique is everything but professional, but given such a big difference I dare to say it's faster).

It prints something like:

Took : 43 ms, number of elements : 30000

I also did a Scala version:


object Scala extends Application {
  val absoluteResult = MutableList[Any]()
  val before = System.currentTimeMillis()
  for (times <- 0 until 10000) {
    val result = List("foo", 23, true)
    for (y <- result) {
      absoluteResult += foo(y)
    }
  }
  println("Took : "+(System.currentTimeMillis() - before)
      +" ms, number of elements : "+absoluteResult.size)

  def foo(obj : Any) =
    obj match {
          case _:String => "String"
          case _:Boolean => "Boolean"
          case _:Integer => "Integer"
          case _ => throw new IllegalArgumentException()
    }
}

I got this kind of result:

Took : 130 ms, number of elements : 30000

And finally a Java version:

public class Java {
  public static void main(String[] args) {
    List<Object> absoluteResult = new ArrayList<Object>();
    long before = System.currentTimeMillis();
    for (int i=0; i < 10000; i++) {
      List<Object> result = new ArrayList<Object>();
      result.add("foo");
      result.add( 23);
      result.add(true);
      for (Object y : result) {
        absoluteResult.add(foo(y));
      }
    }
    System.out.println("Took : "+(System.currentTimeMillis() - before)
      +" ms, number of elements : "+absoluteResult.size());
  }

  static String foo(Object s) {
    if (s instanceof String) {
      return "String";
    } else if (s instanceof Boolean) {
      return "Boolean";
    } else if (s instanceof Integer) {
      return "Integer";
    } else {
      throw new IllegalArgumentException();
    }
  }
}

Which prints:

Took : 40 ms, number of elements : 30000

10 comments:

  1. Hi Sven.

    If you want maximum performance in Scala you need while loops. For "loops" are "just" syntactic sugar for fold left and filter.

    I rewrote your little program to use while loops and the java utils collections and the scala version was as fast as java - but nearly as ugly to :-D


    Anyway - keep the great work with Xtend!

    ReplyDelete
  2. Hey Sven,

    the Groovy code is too slow for the work it should do. Please change the line
    absoluteResult += foo(y)

    into absoluteResult << foo(y)

    The << operator is much faster. I am not sure if the += construct always create a new list with the old values + the new one.

    Thorsten

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. And in Clojure (without any mutable datastructure):

    (defn foo [x]
    (cond
    (number? x) "Number"
    (string? x) "String"
    (= (class x) Boolean) "Boolean"))

    (time (println (count
    (map foo
    (apply concat
    (repeat 10000 ["foo" 23 true]))))))

    Prints:
    30000
    "Elapsed time: 194.046477 msecs"

    The Java version needs 31ms on my machine.

    ReplyDelete
  5. Sven, you found an example in which Groovy cannot do different than look bad. Groovy uses a fair amount of caching to get method invocations fast, but your code makes this caching not work at all. So your test there makes no argument about dynamic dispatch in general, just for a case, in which Java would look also very bad, if you would reproduce it there. The thing is only you cannot easily do that. In Java the method is preselected during compilation, the maximum you can get there is that you produce a scenario in which Java has to interpret.

    As for the Groovy program, there are some things that you could use to make it faster, but there is a big conceptual difference between the Groovy program and the others. In the Java version you make a foo(Object) method and in there check by instanceof. That means you will get only one method. In Groovy those methods are all existing and there exists no dispatcher method.

    Imagine a class with those three foo methods written in Java. Then in Groovy you can still run your test. I dare to say that in Xtend you cannot make this work without a dispatcher method somewhere - and your dispatch keyword is implying that - thus I assume you cannot make this then work using a Java class.

    But let us not stop here... let us assume that you write the base class in Java, where it knows foo(String) foo(Object) and foo(Boolean). Using this class the Integer variant would call the foo(Object) method in your code. Now extend that class in Java with a foo(Integer) method and Groovy would no longer call foo(Object) but foo(Integer).

    I doubt you can easily simulate that case.

    Anyway.. coming back to your test. The call site caching in Groovy works by assuming that the parameter types do not change in most cases. In your case they change on each call, thus part of the performance problem.

    In the Java case you are not testing method call performance at all, instead you test the performance of ArrayList#add. And the list operation dominates the Groovy program as well, since the Groovy program uses +=, which does create a new list each time. If you change it to <<, then it will also use ArrayList#add and performance should be somewhat factor 5 better. Then Groovy is still 10 times slower than the Java version of course. But at least not those 50x, and would you force Java in interpreted mode, it would have about the same slow down.

    ReplyDelete
  6. This comment has been removed by the author.

    ReplyDelete
  7. Your Java version wouldn't compile for me: "y" needs to be an Object, not a Comparable.

    Also You could write the Java version 3 lines shorter if you use Arrays.asList(new Object[] { "foo", 23, true } as a poor man's List literal. Java isn't always as verbose as everybody is claiming ;-)

    10000 iterations are probably not enough for a comparison: If I go from 10k to 100k iterations, the Scala version needs 5x time -- the Java version only 2x.

    ReplyDelete
  8. @all: Thanks for the feedback. I incorporated it and wrote an update post : http://blog.efftinge.de/2011/12/groovy-scala-java-xtend-updated-stupid.html

    ReplyDelete
  9. virtualeyes1/4/12, 4:59 PM

    Indeed, if you change "+=" to "<<" in the Groovy version you get 256ms, @8X better performance than the rigged, errr, supplied benchmark ;-)

    Still, slower than the statically typed versions, but then again, Groovy's mandate to-date has not been to provide raw number crunching power. The right tool for the right job.

    ReplyDelete
  10. If performance is important, I would have re-written the groovy sample differently. Controlling the class of absoluteResult instead of just using []. Maybe a linked list is better.

    ReplyDelete