Thursday, February 20, 2014

Method invocation syntax

I’ve been giving some thought to method invocation lately, trying to come up with something that’s fluent in simple cases, and familiar (to programmers) for the more complex cases. After a bit of playing around, I think I have a system I like.

Consider Java’s BigDecimal, and specifically its divide method. It feels very programmer-y:

aNum = myNum.divide(someOtherNum)

Wouldn’t it be nice if we could make this feel more natural?

aNum = myNum dividedBy someOtherNum

That suggests a grammar of object methodName arg0[, arg1, arg2...]. But if you have more than a couple args, this gets a bit muddy; the words all clump together in my brain, and it’s not entirely clear what’s what anymore: foo doBar baz, apple, banana, coconut. If anything, it looks like the logical grouping is (foo doBar baz) (apple banana coconut). Of course, it isn’t, and my brain knows that… but it’s not intuitive to my eye.

As I was looking around at various methods, I noticed another interesting thing: very often for multi-arg methods, there’s one “main” argument that’s followed by “secondary” arguments. In human-language grammar terms, there’s a single direct object, and then some adjectives and adverbs.

BigDecimal.divide(BigDecimal, RoundingMode) is a good example: the first argument is what you’re dividing by, and the second is just some almost-parenthetical info on how to do the division. It feels like this:

aNum = myNum dividedBy someOtherNum: HalfUp

This suggests a grammar of object methodName arg0 [: arg1, arg2, arg3...]. And that’s in fact what I think I’m going to go with (with a slight tweak that I’ll get to in a bit).

There’s an obvious problem, which is that not all methods follow that semantic pattern. For instance, Java’s List.subList takes two arguments, fromIndex and toIndex. Neither modifies the other; they’re both “primary” args. (This might have been different if the arguments were fromIndex and length, but they’re not.) You really do want to invoke this using the parentheses we all know and love:

aList = myList sublist (3, 7)

Yikes! Does that mean I need two ways to invoke methods? Worse yet, do I let the call sites determine which to use, so that sometimes I’ll see myList sublist 3: 7 and sometimes I’ll see myNum dividedBy (someOtherNum, HalfUp)? The latter isn’t bad, but I don’t want my language to encourage inconsistent style on things like this. So maybe I want to let the method declaration define which syntax to use… but how?

The solution is actually pretty simple: methods like sublist take only one arg, but it’s a tuple! That’s not enforced by the language, of course, but the syntax for declaring methods should mirror the syntax for calling them, so that things will naturally work out.

The one big issue with that grammar is that the : char is already used in lots of places, and in particular as a way of declaring a variable’s type (including to upcast it). For instance, myNum dividedBy someOtherNum : SomeType is ambiguous; does it take one arg, someOtherNum : SomeType, or does it take two args, someOtherNum and SomeType?

To solve this, I’m going to make a slight aesthetic concession and replace the : with {...} in method invocation.

aNum = myNum dividedBy someOtherNum { HalfUp } -- two args, num and mode
aList = myList sublist (3, 7)        -- one arg, a tuple of (start, end)
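
To make the rule concrete, here is a rough sketch in Python (not Effes, and not how a real Effes parser would necessarily work) that splits a call into its receiver, method name, primary argument, and any brace arguments. The parse_call name and the regular expression are made up for illustration, and the sketch only handles a primary argument that is a bare name or a flat parenthesized tuple.

```python
import re

# Hypothetical pattern for: receiver method primaryArg [{ secondary, ... }]
# The primary arg is either a bare name or a flat parenthesized tuple.
CALL = re.compile(
    r"^\s*(?P<receiver>\w+)\s+(?P<method>\w+)\s+"
    r"(?P<primary>\([^()]*\)|\w+)"
    r"(?:\s*\{(?P<secondary>[^{}]*)\})?\s*$"
)

def parse_call(source):
    """Split a call into (receiver, method, primary, secondary_args)."""
    match = CALL.match(source)
    if match is None:
        raise ValueError(f"not a simple method call: {source!r}")
    # The brace group is optional; an absent or empty group means no extras.
    secondary = match.group("secondary") or ""
    extras = [part.strip() for part in secondary.split(",") if part.strip()]
    return (match.group("receiver"), match.group("method"),
            match.group("primary"), extras)

print(parse_call("myNum dividedBy someOtherNum { HalfUp }"))
print(parse_call("myList sublist (3, 7)"))
```

Either way there is exactly one primary argument; the tuple case just happens to pack two values into it.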

As I mentioned above, the method declaration should mirror invocation. Something like:

dividedBy divisor: BigDecimal { mode: RoundingMode } -> BigDecimal: ...
sublist (start: Int, end: Int) -> List[E]: ...
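
For readers who think in an existing language, a loose Python analogy for these two shapes might look like the following. This is only an analogy under my own assumptions, not Effes semantics: the keyword-only mode parameter stands in for the brace argument, the function names are invented, and the two-decimal-place rounding is there purely to make the mode visible.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

def divided_by(dividend, divisor, *, mode):
    """One primary arg (divisor) plus one secondary, keyword-only arg (mode)."""
    # Assumption: round to two decimal places so the mode has a visible effect.
    return (dividend / divisor).quantize(Decimal("0.01"), rounding=mode)

def sublist(items, bounds):
    """One arg that happens to be a (start, end) tuple."""
    start, end = bounds
    return items[start:end]

two, three = Decimal("2"), Decimal("3")
print(divided_by(two, three, mode=ROUND_HALF_UP))  # rounds the last digit up
print(divided_by(two, three, mode=ROUND_DOWN))     # truncates instead
print(sublist(list("abcdefgh"), (3, 7)))
```

As in the declarations above, the call site can't choose the style: the secondary argument has to be set apart from the primary one, and the tuple has to arrive as a single value.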

I like this approach a lot, except for the curly braces. Ideally I’d use a colon, or even a pipe, but all of the single-char approaches I could think of would either cause ambiguity or be ugly. For instance, a pipe would be fine at the call site, but create visual ambiguities at declaration:

dividedBy divisor: BigDecimal | mode: RoundingMode -> BigDecimal: ...

That pipe looks like a disjunctive type at a glance. This isn’t an ambiguity from the grammar’s perspective, since mode is clearly a variable name and not a type (Effes enforces the capitalization scheme that Java suggests), but it’s not nice on the eyes. Some optional parentheses would help, but it’s hard to get excited about that. So for now, curly braces are it.

The thing I like about this syntax is that with one rule, I get everything I want. Simple methods look fluent; methods with adverbs look good (if a tad clunky with the braces); and in the worst case, I get something that’s no worse than what most of the popular languages out there require or recommend.
