Yuval Shavit: Statements and expressions: an exploration of ambiguity

I've been working on the parser for Effes a bit, and I got a bit stuck on an ambiguity in case constructs; I want them to work as either statements or expressions.

To anchor things a bit, here are two uses of case, one of which is used as an expression, and the other as a statement:

-- as an expression
firstInt = case ints of
    (): 0
    (head, tail): head

-- as a statement
case ints of
    (): print "empty list!"
    (_): print "list has one element"
    _: print "list has #{intsList size} elements"

Languages handle this in various ways that make things simple. For instance:

In Java, it's always unambiguous whether something is expected to be a statement or an expression.
In Haskell, each function is just a single (potentially complex) expression; there are no statements, and thus no ambiguity
In Scala, you can put an expression anywhere in a function body, and the last expression is the function's return value — so again there's no ambiguity, because you can just make case constructs (match as they're called in Scala) always be expressions.

Scala's approach works, but it also lets you define a function as def g() = { 1; 2; 3; }, which I don't like. Statements and expressions are different beasts to me, and conflating them seems like a lazy and inelegant solution.

So then, is the case in that f example above about to introduce a statement or expression?

One solution is to take a hint from Java and have method bodies always consist of statements. If we take that approach, f... : case ints of is a statement. To make it be an expression whose value is returned, we'd have to write f... : return case ints of....

That's not the end of the world. In fact, I've never liked the Ruby-style return statements, where you just plop an expression at the end of a method:

def ugly
    123
end

There are a few reasons I don't like this, but the main reason is that in an imperative context (which a Ruby/Scala/etc method is), returning a value is an action. It should look like one! When I write imperative code, I'm telling the computer a series of actions to take. An implicit return feels like this to me:

First, ask the user how many apples they want.
Then find out how many apples are available.
Then, the minimum of that number and the number of apples requested.

That last sentence feels wrong, because it's not a sentence; it's a phrase. You can figure out what it means, but it feels stilted.

On the other hand, when writing one-liners, the return feels superfluous. Here's a nice size function for a list:

size -> Int: case this of
    (): 0
    _: 1 + (list tail)

One option I'm considering is to look at the return value if the method is a one-liner (that is, just a single statement or expression — even if it's complex). If it's Unit, that one line is a statement; otherwise, it's an expression. (If the function's body is a block instead of a one-liner, that block consists entirely of statements, including possibly a return statement.)

This feels a bit subtle and potentially confusing, and maybe that should be a big warning. On the other hand, I think that for most cases, it'll "just work." Crucially, since this only applies to one-liners, nearly all the cases should hopefully be simple cases. I can't think of any that wouldn't be.

This approach also means that the compiler will have to know about the Unit type specially. My instincts are that this smells wrong, but maybe it's not so bad.

Ah, what the heck. Despite all these warning bells going off, I'll try it out. If nothing else, it'll be good to see if my intuition (that this is a sketchy idea) is right, and why specifically. As Batman Begins put it, we fall so we can learn to pick ourselves up.

Yuval Shavit

Thursday, February 13, 2014

Statements and expressions: an exploration of ambiguity

No comments:

Post a Comment