The Open Type System vs. Code Gen

Cedric Beust took a look at Gosu’s Open Type System a few weeks back here:

Overall it was a very fair, if brief, look at Gosu’s Open Type System. There was one point at the end, however, that I want to examine:

While elegant, Gosu’s open type system is basically just saving you from a code generation phase.

This is not entirely true, but it is hard to see why at first blush.

What *is* true is that most problems that are addressed with code generation in Java today can be more elegantly addressed in Gosu by using the Open Type System. And, due to the fundamental laziness of the Gosu compilation model, it is possible to model problems with much larger potential type spaces (so long as only a reasonable subset of that type space needs to be materialized at compilation time) since it is not necessary to generate all types up front. So, yes, the Open Type System is a sort of elegant code generation mechanism.

However, it is more than that.

To understand this, consider the Protocols type loader that I wrote a while back:

Protocols are an entirely foreign concept to Gosu. They are a way to achieve “static” duck typing: you define a Protocol in a .proto file, which looks an awful lot like an interface. However, unlike Java or Gosu interfaces, a value’s type does not need to implement a Protocol for the value to be assigned to a variable that has a Protocol type. Rather, the value’s type must conform to the Protocol: for every function defined in the Protocol, there must be an invocable function on the value’s type that can be invoked safely.

An example will help clarify.

Given this procotol:

package test

protocol Protocol1 {
  function doIt()

Any value with a static type that has a method doIt() can be assigned to it:

class Class1 {
  function doIt() { print( "In Class1" ) }

class Class2 {
  function doIt() { print( "In Class2" ) }

var p : test.Protocol1 = new Class1()

p = new Class2()

// p = new ArrayList() -- Won't compile, since ArrayList doesn't have a doIt() method... 
//                        Unless you add one with an enhancement... :)

So, even though Class1 and Class2 are entirely unrelated, we are able to assign them both to a variable with a protocol type to which they both conform. We can’t assign an ArrayList to that variable, however, since an ArrayList does not have a doIt() function (although you could make ArrayList conform to the protocol by using an enhancement to add a doIt() method if you wanted to. But why would you do something crazy like that?)

I was able to implement Protocols in Gosu precisely because the Open Type System gave me more tools than simple code generation: I was actually able to implement the unique assignability semantics of my Protocol types using the IType#isAssignableFrom() method.

This is a level of control and power that mere code generation does not have.


The Dynamic Code Evolution VM

Every single java developer out there needs to know about the Dynamic Code Evolution VM (DCEVM) being turned out by Thomas Würthinger and his team:

This is truly an amazing piece of technology. It enables full hot swap in a JVM, allowing you to change method signatures, add and delete methods and so on, with impunity. You will rarely need to restart your JVM once you start using it.

And it is available…. FOR FREE.

Guidewire has sponsored the development of the DCEVM ever since we first learned about it at the 2009 JVM Language Summit. It has been a huge productivity boon for us and is a crucial bit of technology in the Guidewire (and Gosu) stack.

More people need to know about the DCEVM.

Download it, stop bouncing your JVM, and tell a friend.

Gosu IntelliJ Plugin

In an effort to further IDE support for Gosu, we’ve just open sourced the IntelliJ plugin.

If you are interested, you can clone the repository here:

Running the plugin involves:

  • Cloning the repository
  • Setting up the submodules (git submodule init then git submodule update)
  • Opening up the GosuPlugin.ipr project.

You’ll need to set up an IntelliJ SDK as well. Internally, we’ve been using the latest open source version. You can set this up as follows:

  • Check out from git://
  • build the editor using ant build
  • unzip the version for your platform in out/artifacts.

After you’ve done that, you can set up a new plugin SDK by pointing it at the directory in question.

Note that we will not be doing merges for the plugin in git, but if you’d like to submit a patch we can apply it internally and migrate it back out to the main repository.


RE: Dear Java-Killers

Sven Efftinge has a blog post up giving his dos and don’ts for “java-killer” programming languages. I thought I’d go through his points one by one and see how Gosu stacks up.

1) Don’t make unimportant changes

I think Gosu does a pretty good job here: we didn’t set out to invent a whole new collections hierarchy and we have very few gratuitous or academic features. We tried to make it easy for Java developers to become productive in Gosu and keep on using the old familiar techniques and libraries they are used to.

2) Static Typing


3) Don’t touch generics

In Gosu we actually *simplified* Java generics (at the cost of correctness) by doing away with wildcards, which are the primary pain point for most Java developers. This is in contrast to other languages that have more complicated (but more correct) generics systems.

So I think Gosu improves on Java in a way that is easier for most Java developers to grok: just drop wildcards, and things usually work the way you want.

4) Use Type Inference


5) Care About Tool Support

Tentative Check. We currently have an Eclipse plugin and we expect to have extremely solid tool support in both Eclipse and IntelliJ by the end of the summer. This is our #1 priority right now.

6) Closures

Check. And we address Eric Parnell’s first comment with Enhancements, which allow us to add the usual functional suspects to java.util.Iterable (map(), where(), etc.)

7) Get Rid of Unused Concepts

Gosu retains bitshifts and a fall through switch statement. They come up every once in a while, and, since Java has them, it’s probably best to keep them around.

So Gosu stacks up well against Sven’s Dos and Don’ts.

Gosu isn’t an attempt to reinvent the way that we write code: it is an imperative language with some useful functional concepts as well. Most of the features and syntax are taken from various existing languages, and we’ve tried, for the most part, to deviate from Java as little as possible (e.g. feature literals use the familiar Javadoc syntax.)

The real innovation in Gosu is the Open Type System, which lets Gosu act as an excellent host language for other external DSL’s, such as XSD, WSDL or Java Properties Files, all of which can be accessed and manipulated in a statically typed manner from Gosu.

For Java developers, this should be pure win: all those resources you’ve had to work with using either code-gen tools or with an untyped dynamic API can now be used immediately: just drop your WSDL or Properties file on your classpath and, blam, go.


Feature Literals

In the current open source release of Gosu there is a new feature called, er, feature literals. Feature literals provide a way to statically refer to the features of a given type in the Gosu type system. Consider the following Gosu class:

  class Employee {
    var _boss : Employee as Boss
    var _name : String as Name
    var _age : int as Age

    function update( name : String, age : int ) {
      _name = name
      _age = age

Given this class, you can refer to its features using the '#' operator (inspired by the Javadoc @link syntax):

  var nameProp = Employee#Name
  var ageProp = Employee#Age
  var updateFunc = Employee#update(String, int)

These variables are all various kinds of feature references. Using these feature references, you can get to the underlying Property or Method Info (Gosu’s equivalents to java.lang.reflect.Method), or use the feature references to directly invoke/get/set the features.

Let’s look at the using the nameProp above to update a property:

  var anEmp = new Employee() { :Name = "Joe", :Age = 32 }
  print( anEmp.Name ) // prints "Joe"
  var nameProp = Employee#Name

  nameProp.set( anEmp, "Ed" )
  print( anEmp.Name ) // now prints "Ed"

You can also bind a feature literal to an instance, allowing you to say “Give me the property X for this particular instance“:

  var anEmp = new Employee() { :Name = "Joe", :Age = 32 }
  var namePropForAnEmp = anEmp#Name

  namePropForAnEmp.set( "Ed" )
  print( anEmp.Name ) // prints "Ed"

Note that we did not need to pass an instance into the set() method, because we bound the property reference to the anEmp variable.

You also can bind argument values in method references:

  var anEmp = new Employee() { :Name = "Joe", :Age = 32 }
  var updateFuncForAnEmp = anEmp#update( "Ed", 34 )
  print( anEmp.Name ) // prints "Joe", we haven't invoked 
                      // the function reference yet
  print( anEmp.Name ) // prints "Ed" now

This allows you to refer to a method invocation with a particular set of arguments. Note that the second line does not invoke the update function, it rather gives you a reference that you can use to evaluate the function with later.

Feature literals support chaining, so you could write this code:

  var bossesNameRef = anEmp#Boss#Name

Which refers to the name of anEmp‘s boss.

You can convert method references to blocks quite easily:

  var aBlock = anEmp#update( "Ed", 34 ).toBlock()

Finally, feature references are parameterized on both their root type and the features type, so it is easy to say “give me any property on type X” or “give me any function with the following signature”.

So, what is this language feature useful for? Here are a few examples:

  1. It can be used in mapping layers, where you are mapping between properties of two types
  2. It can be used for a data-binding layer
  3. It can be used to specify type-safe bean paths for a query layer

Basically, any place you need to refer to a property or method on a type and want it to be type safe, you can use feature literals.

Ronin, an open source web framework, is making heavy use of this feature. You can check it out here:


I could not post this here…

so I decided to post it there.

Two Ways to Design A Programming Language

Method One:

Have a legend of the field think deeply and make precisely reasoned arguments for features.

Method Two:

Look at code-gen features in IntelliJ and figure out how to avoid needing them for your language:

Implicit Type Casting

IntelliJ has a macro for the common pattern of checking the type of something and then immediately downcasting/crosscasting the expression to that type:



  Object x = aMethodThatReturnsAnObject();
  if( x instanceof List ) {
    System.out.println( "x is a List of size " + ((List) x).size() );


  var x : Object = aMethodThatReturnsAnObject()
  if( x typeis List ) {
    // hey, you already told us x was a list.  Why make you cast?
    print( "x is a List of size ${x.size()}" )


IntelliJ has a wizard to assist you in delegating all the implementations of an interface to a field:



class MyDelegatingList implements List {

  delegate _delegateList represents List

  construct( delegateList : List ) {
    _delegateList = delegateList

  override function add( Object o ) : boolean {
    print( "Called add!" )
    return _delegateList.add( o )

  // all other methods on List are automatically
  // delegated to _delegateList

I omit the java version out of respect for your eyes.

I should note, Gosu is the new name for GScript, our internal programming language

I Am Hate Method Overloading (And So Can You!)

My hatred of method overloading has become a running joke at Guidewire. My hatred is genuine, icy hot, and unquenchable. Let me explain why.

First Principals
First of all, just think about naming in the abstract. Things should have good names. A good name is unique and easy to understand. If you have method overloading, the name of a method is no longer unique. Instead, the real name of the method is the sane, human chosen name, plus the fully qualified name of each argument’s type. Doesn’t that just seem sort of insane? If you are writing a tool that needs to refer to methods, or if you are just trying to look up a method reflectively, you have to know the name, plus all the argument types. And you have to know this even if the method isn’t overloaded: you pay the price for this feature even when it isn’t used.

Maybe that strikes you as a bit philosophical. People use method overloading in java, so there must be some uses for it. I’ll grant that, but there are better tools to address those problems.

In the code I work with day to day, I see method overloading primarily used in two situations:

Telescoping Methods
You may have a function that takes some number of arguments. The last few arguments may not be all that important, and most users would be annoyed in having to figure out what to pass into them. So you create a few more methods with the same name and fewer arguments, which call through to the “master” method. I’ve seen cases where we have five different versions of a method with varying numbers of arguments.

So how do I propose people deal with this situation without overloading? It turns out to be a solved problem: default arguments. We are (probably) going to support these in the Diamond release of GScript:

  function emailSomeone( address:String, subject:String, body:String,
                         cc:String=null, logToServer:boolean=false, 
                         html:boolean = false ) {
    // a very well done implementation

A much cleaner solution. One method, with obvious syntax and, if your IDE is any good, it will let you know what the default values of the option arguments are (unlike method overloading.)

True Overloading
Sometimes you truly want a method to take two different types. A good example of this is the XMLNode.parse() method, which can take a String or a File or an InputStream.

I actually would probably argue with you on this one. I don’t think three separate parse methods named parseString(), parseFile() and parseInputStream() would be a bad thing. Code completion is going to make it obvious which one to pick and, really, picking a unique name isn’t going to kill you.

But fine, you insist that I’m a terrible API designer and you *must* have one method. OK, then use a union type (also probably available in the Diamond release of GScript):

  function parse( src:(String|File|IOStream) ) : XMLNode {
    if( src typeis String ) {
      // parse the string

A union type lets you say “this argument is this type or that type.” It’s then up to you to distinguish between them at runtime.

You will probably object that this syntax is moderately annoying, but I’d counter that it will end up being fewer lines of code than if you used method overloading and that, if you really want a single function to handle three different types, you should deal with the consequences. If it bothers you too much, just pick unique names for the methods!

Let’s say you accept my alternatives to the above uses of method overloading. You might still wonder why I hate it. After all, it’s just a feature and a pretty common one at that. Why throw it out?

To understand why I’d like to throw it out, you have to understand a bit about how the GScript parser works. As you probably know, GScript makes heavy use of type inference to help developers avoid the boilerplate you find in most statically typed languages.

For example, you might have the following code:

  var lstOfNums = {1, 2, 3}
  var total = 0
  lstOfNums.each( \ i -> { total = total + i  } )

In the code above, we are passing a block into the each() method on List, and using it to sum up all the numbers in the list. ‘i‘ is the parameter to the block, and we infer it’s type to ‘int’ based on the type of the list.

This sort of inference is very useful, and it takes advantage of context sensitive parsing: we can parse the block expression because we know the type of argument that each() expects.

Now, it turns out that method overloading makes this context sensitive parsing difficult because it means that when you are parsing an expression there is no guarantee that there is a single context type. You have to accept that there may be multiple types in context when parsing any expression.

Let me explain that a bit more. Say you have two methods:

  function foo( i : int ) {
  function foo( i : String ) {

and you are attempting to parse this expression:

  foo( someVar )

What type can we infer that the context type is when we parse the expression someVar? Well, there isn’t any single context type. It might be an int or it might be a String. That isn’t a big deal here, but it becomes a big deal if the methods took blocks, or enums or any other place where GScript does context type sensitive parsing. You end up having lists of context types rather than a single context type in all of your expression parsing code. Ugly.

Furthermore, when you have method overloading, you have to score method invocations. If there is more than one version of a method, and you are parsing a method invocation, you don’t know which version of the method you are calling until after you’ve parsed all the arguments. So you’ve got to run through all the argument types and see which one is the “best” match. This ends up being some really complicated code.

Complexity Kills

Bitch, bitch, moan, moan. Just make it work, you say. If the java guys can do it, why can’t you? Well, we have made it work (for the most part.) But there’s a real price we pay for it.

I’m a Berkeley, worse-is-better sort of guy. I think that simplicity of design is the most important thing. I can’t tell you how much more complicated method overloading makes the implementation of the GScript parser. Parsing expressions, parsing arguments, assignability testing, etc. It bleeds throughout the entire parser, its little tentacles of complexity touching places you would never expect. If you come across a particularly nasty part of the parser, it’s a good bet that it’s there either because of method overloading or, at least, is made more complicated by it.

Oh, man up! you say. That’s the parser’s and tool developer’s problem, not yours.

Nope. It’s your problem too. Like Josh Bloch says, projects have a complexity budget. When we blow a big chunk of that budget on an idiotic feature like method overloading, that means we can’t spend it on other, better stuff.

Unfortunately, because GScript is Java compatible, we simply can’t remove support for method overloading. If we could though, GScript would have other, better features and, more importantly, fewer bugs.

That is why I am hate method overloading. And so can you.

API Design

API design is hard. You can tell it is hard because there are so many bad API’s out there, often written by pretty smart people. Why is this?

I believe that one reason is that, in order to do APIs right, you often need to layer their complexity. This layering should make simple stuff dead simple for people who only casually use the API, and then provide greater functionality (and complexity) for more advanced use cases.

A lot of developers get uncomfortable with this idea because it means that There is More Than One Way To Do It, a development philosophy that has earned Perl infamy amongst sane developers. Java developers, in particular, seem to dislike redundancy and would rather eliminate it. Unfortunately, the reasonable goal of eliminating redundancy can lead to miserable-to-use APIs.

Reading a File

As a canonical example of the problems caused by this dichotomy, consider reading a file. It’s as common an I/O operation in day-to-day development as you are likely to find.

This is how you might do it in java:

    String str = null;
    BufferedReader in = null;
    try {
        in = new BufferedReader( new FileReader( "C:/tmp/tempfile.txt" ) );
        StringWriter sw = new StringWriter();
        byte[] buf = new byte[1024];
        while (true) {
          int count =;
          if (count < 0) {
          out.write(buf, 0, count)
        str = sw.toString();
    } catch ( IOException e ) {
      // uhhhhhh...
      throw new RuntimeException( e );
    } finally {
      try {
      } catch ( IOException e ) {
        // double uhhhhhhh.....
        throw new RuntimeException( e );


Now, why is this so complicated?

It’s due to the fact that the I/O library is written against very abstract notion: streams. This was done for good reasons: streams are relatively high performance and they generalize to any kind of input (e.g. network connections). You can see why someone writing an I/O API might find such an abstraction enticing. There is only one* way to do I/O in java, regardless of what sort of I/O you are doing.

Unfortunately, that way sucks.

A Layered Solution

It is unacceptable to us that reading a file be so complex in GScript. To address this, we’ve created a layered approach to file I/O for our users.

Layer 1: Dead Simple

The simplest notion I can imagine for I/O is reading a file into a String. These are two relatively easy to understand objects that nearly all developers have experience with. To make this as easy as possible, we have introduced an enhancement method to

  var file = new File( "C:/tmp/tempfile.txt" )
  var str =

The implementation of File#read() is, of course, much like the java code above. However, just by adding this simple method, GScript users doing simple I/O need not worry about exceptions, what try/catch structure is needed or know what streams are. Just read a file into a string. Simplicity itself.

Layer 2: A bit more Complex

You may object that wastes memory by reading the entire file into a single string. While I think that this objection is often overstated given todays hardware, there are situations where it is valid. To handle these cases, we introduced another method to java.util.File:

  var file = new File( "C:/tmp/tempfile.txt" )
  var str = file.readLines( \ line -> print( line ) )

The readLines() method takes a block that is called with each line of text from the file. It’s somewhat analogous to a SAX-style parser. Note that the user still does not need to know anything about streams to use this API, although they have much more control over the memory footprint of their program.

Layer 3: Whole Hog

Finally, someone might actually need the level of control or performance that streams provide and, of course, are free to use them. But they need not become experts in Java I/O unless absolutely necessary.

A Bit of Redundancy For A Lot of Ease

With this layered approach to File I/O, GScript users do not need to become familiar with the complicated Java I/O libraries to do simple and even moderately complicated things. Yes, there is now some redundancy, but each layer fulfills a particular band of the complexity/performance continuum. GScript programmers are only required to know as much about I/O as is necessary to solve the problem they have with acceptable performance.

Some redundancy, done right, is often the right thing.

* – Actually, this isn’t true anymore, since the NIO library came along for high-performance I/O situations. Let’s not dwell on inconvenient facts that compromise my argument, OK?

Pragmatic Type Systems

What do static type systems provide and what do they cost? It’s tough to know what to make of the dynamic vs. static typing arguments going on right now on sites like Artima without having solid answers to these two questions.

The Benefits

Static type systems have been developed mainly by the academic community, and they focus on soundness and provability. This is understandable: academic computer science is largely a subset of mathematics, and so adopting the rigor of the mathematical community is natural. This rigor leads to the conclusion that the primary goal and benefit of a static type system is that it provides (or *should* provide) proof that your program functions as you expect.

A less extreme version of this view holds that the main benefit of a static type system is that, even if it doesn’t prove your program is correct, it will help you catch errors at compile time rather than run time. The common rejoinder to this argument is that we should all be writing tests anyway, and the tests will catch the errors for us rather than the compiler. That’s true to a point, but it is nice to have the compiler point out exactly where the trouble spots are after you’ve merged in someone else’s changes to your local machine. Running a suite of tests and sifting through the output can be tedious and slow, and compiler errors tend to be clearer and more local to the actual error than test failures. Sure, the errors compilers catch are often trivial, but they find them quickly and easily.

A third, and very different answer is that static type systems make it easier to provide excellent tool support. Things like code-completion, code analysis and refactor tools. Dynamic languages can provide some low-rent versions of these features, but they are either complicated (smalltalk) or poor implementations (ruby IDEs.)

This third benefit is easily the most compelling. IDE’s and other tools help average, fallible humans get code out the door and clean up after themselves on a day-to-day basis. It’s more important that the editor tell me what I can do with a variable than it verify what I’ve done is correct after the fact. The tools provide discoverability and allow developers to leverage up their memory banks, with the tools themselves remembering all the details about types. Once you really start ripping through code with a good IDE like IntelliJ, it’s hard to think about coding without it, and it’s hard to think about how to write an IDE as good as IntelliJ without static typing.

The Costs

The most obvious cost of a static type system is syntactic: you simply have to write more code since you have to declare the types of things. Java is a worst-case example of this: you end up writing types out over and over again on the left and right hand side of assignments.

A second cost of static type systems is conceptual complexity. A good example of this is java’s generics implementation. Josh Bloch made headlines a while ago[1] when he came out against his longtime friend Neal Gafter’s closures proposal. In his talk he cited wildcards as the primary source of complexity in java’s generics. (Note that wildcards were added to preserve type-safety: they aren’t generally of much use to tools.)

Type inference, which helps with the syntactic cost of static typing, can also add to the conceptual complexity if the inference is too aggressive: it can cause incomprehensible, difficult to resolve compilation errors. You can see this in java occasionally with generic method inference. If you’ve ever used an ML-derived language for a non-trivial project, you’ve probably seen this as well.

The GScript Approach: A Pragmatic Type-System

GScript aims to maximize the benefit of static typing while minimizing the costs. Since there is no mathematical formula tying these two opposing goals together, we simply had to go with our guts. (Not that there is no guidance available. Scott McKinney, the creator of GScript, often cites a paper[2] from Microsoft as a source of inspiration when making these tradeoffs. He also follows Anders Hejlsberg’s work on C# 3.0, a very pragmatic language.)

Below are two examples of pragmatic decisions we’ve made with respect to GScript’s type system. Both involve tradeoffs that we feel maximize the benefits of the static type system while minimizing the burden of it.

Pragmatic Type Inference

GScript provides local type-inference:

  var myList = new ArrayList()

You do not need to declare that myList is a List, the GScript compiler will determine that for you. This is especially nice with generics:

  var myMapOfLists = new HashMap<String, List<String>>()

However, unlike some ML-derived languages, GScript does not infer the types of function arguments or return types:

  function joinList( lst : List ) : String {
    var str = new StringBuilder()
    for( elt in lst index i ) {
      if( i > 0 ) str.append( ", " )
      str.append( elt )
    return str.toString()

Note that both the parameter type, List, and return type, String, are annotated.

This is conceptually simpler for both the compiler writer and the GScript coder, since all type information is local to the function. It also allows for a decoupling of types that I will discuss in a later article.

So GScript steers the middle course between no type-inference (java) and total type-inference (ML-like languages.)

Pragmatic Generics

As you can tell from the code above, GScript, like java 1.5 and greater, has a generics implementation, where you can declare a list composed of a certain type:

  var listOfStrings = new ArrayList<String>()

GScript does not have wildcards, however. Instead, we adopted covariance of generic types, so a List<String> is assignable to a List<Object>. This hugely simplifies generics for the end user: generic classes behave just like arrays, which everyone is used to. We do sacrifice some type safety and a bit of expressiveness (see my note on the Microsoft paper below) but we don’t sacrifice anything with respect to tools, and we lop off a huge chunk of complexity.

Static Typing with a Light Touch

With the two features outlined above, GScript manages to have a reasonably type safe static type system that supports tools quite well, while having a relatively small syntactic and conceptual overhead. With this light touch it competes well syntactically and conceptually with dynamic languages such as ruby or python for average, day-to-day development. The type system, for the most part, gets out of the way of the developers except when they need it.

So the costs of a pragmatic static type system are pretty low and the benefits are pretty high, as long as you have access to a language like GScript.

We’re working on that.

[1] – The Closures Controversy – Chapters 12-16
[2] – Static Typing Where Possible, Dynamic Typing When Needed:
The End of the Cold War Between Programming Languages

One interesting note on this paper is that that the authors want two incompatible features: they mention an IComparer interface as their example of generics, but then advocate covariance of generic types. Unfortunately, IComparer is a contravariant type! That is, you can safely assign an IComparer<Object> to a variable of type IComparor<String>, but you can’t safely assign an IComparor<String> to a variable of type IComparor<Object>.

Tradeoffs, tradeoffs, tradeoffs.


Get every new post delivered to your Inbox.

Join 36 other followers