Tuesday, June 25, 2013

Sugar and the "data" keyword

In my last post, I discussed the data keyword as sugar for stateful, struct-like traits. This is mostly pure sugar, although it's also the only way to define a stateless, methodless trait (like Nothing). I'd like to justify those decisions, and also introduce two more pieces of sugar.

As I touched on in an earlier post, there is a balance to be had between deep abstractions vs. ease of use. Syntactic sugar, if designed correctly, can help that balance by highlighting certain aspects of an abstraction. This is useful both for the language design, which can now have a high-level abstraction whose concrete implications are clearer; and from an individual program's design, where it's more obvious which part of the abstraction is important for a given type or object.

The data syntax aims to do this by recognizing that a type whose only functions are getters feels different than one with abstract virtual methods. The first is used purely to store state, whereas the latter doesn't even care about state directly — just about behavior. Effes combines both of concepts into traits, so sugar can help clarify which aspect is more important for a a given type. When you want to focus exclusively on the state-storing ability of a trait, data is a better option, and when you need the behavior-defining aspect of a trait, the default syntax is a better (well, only) option.

As for requiring data syntax for stateless, methodless traits, this is partly pragmatics and partially a philosophical consideration. Given that the syntax for an abstract trait is something like this:

Sizeable:
  size : Int

... how would Nothing look with this syntax? It'd probably be something like one of these:

Nothing:
-- rest of your code here, unindented

Nothing -- no colon
-- rest of your code here, unindented

Nothing:
  pass

The first two feel like weird, dangling tokens; I don't like them from an aesthetic perspective. The last one borrows from Python's pass statement, which inserts a runtime no-op. It was designed for exactly the kind of situations Nothing would have using default trait syntax: the language requires something, but you want nothing. I've always thought this was a bit ugly. Within the context of a function, one could just use return instead; and within the context of a class definition, pass suggests that something weird is going on — a methodless, init-less, stateless class is a weird beast. In Effes, such a thing is indeed useful, but it feels very much like a data type; so, rather than introducing new syntax for it, or generalizing the default syntax to allow weird, ambiguous-looking constructs (like the first or second examples above), I just require the data syntax, which does the job just fine.

Finally, as promised, here are two more minor pieces of sugar for the data syntax. Firstly, you can define multiple types on one line, separated by a semicolon. And secondly, you can follow a data definition with "nicknamed foo" to automatically create a nickname for the union of all of the types defined on that data line. So this:

data LT
data GT
data EQ
nickname Comparison = LT | GT | EQ

...can be expressed more succinctly as:

data LT; GT; EQ nicknamed Comparison

Again, this tries to smooth the transition between abstractions and ease of use. One common use case for a type system is to define enum types; a comparison is either less than, greater than, or equal to. In a language like Haskell, these are different constructors of the same type. In Effes, they would be defined as the union type of the three traits, and one would almost definitely want a nickname for that union type. This sugar allows us to emphasize the enum-like characteristics of the union of stateless, methodless traits.

The use case for this sugar is definitely enum-like types, so I'm tempted to declare that the sugar only works if all of the data types are stateless. This feels slightly more hacky, but it's also easier to reverse: generalizing the syntax will be backwards compatible, whereas restricting it (from all data types to just stateless ones) in the future could break code. I don't anticipate that backwards compatibility will be a major problem for Effes, but I think I'll take the safer approach for now, as an exercise in language evolution if nothing else.

No comments:

Post a Comment