Yuval Shavit: functional programming

Showing posts with label functional programming. Show all posts

Thursday, February 13, 2014

Statements and expressions: an exploration of ambiguity

I've been working on the parser for Effes a bit, and I got a bit stuck on an ambiguity in case constructs; I want them to work as either statements or expressions.

To anchor things a bit, here are two uses of case, one of which is used as an expression, and the other as a statement:

-- as an expression
firstInt = case ints of
    (): 0
    (head, tail): head

-- as a statement
case ints of
    (): print "empty list!"
    (_): print "list has one element"
    _: print "list has #{intsList size} elements"

Languages handle this in various ways that make things simple. For instance:

In Java, it's always unambiguous whether something is expected to be a statement or an expression.
In Haskell, each function is just a single (potentially complex) expression; there are no statements, and thus no ambiguity
In Scala, you can put an expression anywhere in a function body, and the last expression is the function's return value — so again there's no ambiguity, because you can just make case constructs (match as they're called in Scala) always be expressions.

Scala's approach works, but it also lets you define a function as def g() = { 1; 2; 3; }, which I don't like. Statements and expressions are different beasts to me, and conflating them seems like a lazy and inelegant solution.

So then, is the case in that f example above about to introduce a statement or expression?

One solution is to take a hint from Java and have method bodies always consist of statements. If we take that approach, f... : case ints of is a statement. To make it be an expression whose value is returned, we'd have to write f... : return case ints of....

That's not the end of the world. In fact, I've never liked the Ruby-style return statements, where you just plop an expression at the end of a method:

def ugly
    123
end

There are a few reasons I don't like this, but the main reason is that in an imperative context (which a Ruby/Scala/etc method is), returning a value is an action. It should look like one! When I write imperative code, I'm telling the computer a series of actions to take. An implicit return feels like this to me:

First, ask the user how many apples they want.
Then find out how many apples are available.
Then, the minimum of that number and the number of apples requested.

That last sentence feels wrong, because it's not a sentence; it's a phrase. You can figure out what it means, but it feels stilted.

On the other hand, when writing one-liners, the return feels superfluous. Here's a nice size function for a list:

size -> Int: case this of
    (): 0
    _: 1 + (list tail)

One option I'm considering is to look at the return value if the method is a one-liner (that is, just a single statement or expression — even if it's complex). If it's Unit, that one line is a statement; otherwise, it's an expression. (If the function's body is a block instead of a one-liner, that block consists entirely of statements, including possibly a return statement.)

This feels a bit subtle and potentially confusing, and maybe that should be a big warning. On the other hand, I think that for most cases, it'll "just work." Crucially, since this only applies to one-liners, nearly all the cases should hopefully be simple cases. I can't think of any that wouldn't be.

This approach also means that the compiler will have to know about the Unit type specially. My instincts are that this smells wrong, but maybe it's not so bad.

Ah, what the heck. Despite all these warning bells going off, I'll try it out. If nothing else, it'll be good to see if my intuition (that this is a sketchy idea) is right, and why specifically. As Batman Begins put it, we fall so we can learn to pick ourselves up.

Tuesday, September 10, 2013

Maybe it's optional?

A lot of functional and functional-inspired languages don't have the concept of null. Instead, they have types called Maybe or Optional — basically a box of 0 or 1 items. Effes is going to take that approach, but I might put a twist on it.

In a nutshell, the idea behind Maybe (I'll settle on Haskell's terminiology) is that there's a Nothing that represents the absense of something, a Just Foo that represents one Foo, and a Maybe Foo type which can be either a Nothing or a Just Foo.

Like other functional languages, Haskell has syntax (called pattern matching) that's kinda-sorta like an instanceof check plus a downcast in Java. Putting it all together looks something like this:

sayHi :: (Show e) => e -> String
sayHi Nothing = "Nothing to say hi to!"
sayHi (Just e) = "Hello, " ++ (show e) ++ "!"

(The (Show e) => syntax just means that e has a show method, which is like Java's toString.) In Effes, a direct translation would be a disjunctive type:

data Nothing
data Just[A] = elem : A
type Maybe[A] = Nothing | Just[A]

sayHi (m:Maybe[A Stringable]) -> String:
  case m of
      Nothing: "Nothing to say hi to!"
      Just e: "Hello, {{e}}!"

Because Effes has a more flexible type system, we can actually get away without the Just part of the Maybe pair. Instead, it looks something like this:

data Nothing
type Maybe[A] = Nothing | A

sayHi (m:Maybe[A Stringable]) -> String:
  case m of
      Nothing: "Nothing to say hi to!"
      e: "Hello, {{e}}!"

There's not a really strong driving force for this, except that it seems a bit cleaner. Instead of a Maybe being "either nothing or a box of one something," it's "either nothing or one thing." Plus it takes advantage of my cool new type system, so that's nice too.

The problem is when the A type is itself a Maybe: Maybe[Maybe[A]]. If we see that it contains a Nothing, does that mean we didn't have anything, or that we had one Nothing? To prevent unreachable code, I'd probably want the type checker to reject this altogether: Maybe[Maybe[String]] would be a type error.

That's not terrible, I guess, but the erroring type could be nestled in some data structure. For instance, if a linked list uses Maybe to signify an end, then LinkedList[Maybe[String]] wouldn't compile — and probably with some unintuitive or frustratingly un-actionable error message.

On balance, I'm leaning towards keeping the Just type. It doesn't add much complexity to the Maybe type, pattern matching keeps the call sites simple, and it eliminates ambiguity.

Wednesday, August 7, 2013

Conjunction disjunction, what's your function?

It's time to turn my attention to functions. Like virtually any language, Effes has functions; like any language written in the last few years, those functions are data. (At what point does the term "functional programming language" start to lose its meaning, given that every language these days incorporates FP principles?)

Every function in Effes has a type, which is just its signature. For instance, doubler(i : Int) = i * 2.0 would have a type of Int -> Double. Subtyping basically follows the Liskov substitution principle: f : Vehicle -> Door is a subtype of g : Car -> Part because you can use f anywhere you'd be able to use g.

Like data types, function types can be composed into conjunctive and disjunctive types. Take the following:

data Cat, Person
countLegs : (Cat -> Int) = c -> 4
sayHi : (Person -> String) = p -> "Hi, I'm #{p.name}"

countHi = countLegs sayHi

countHi is a conjunction of countLegs and sayHi. Invoking this function doesn't chain or compose them; only one will get invoked. Think of it as a box that contains both functions: invoking this box means opening it, taking out one of the functions, and using it.

This follows the data trait analogy: a Cat Person type refers to an object that is both a feline and a human; it also can be thought of as two separate objects stuck together. When you have a Cat Person and ask for its fur type, you're looking at its cat-like nature; when you ask for its social security number, you're looking at its person-like nature. Similarly, countHi has both a countLegs-like nature and a sayHi-like nature.

(There's a slight difference here in that a conjunctive data type can be used in functions that look at both of its aspects; this doesn't apply to functions.)

This means that the input to countHi can either be a Cat (in which case countLegs is invoked) or a Person (in which case sayHi is invoked). In the first case, the result is an Int, and in the second case it's String. Putting that together, we get a type of (Cat | Person) -> (Int | String). More generally:

(Ia -> Ra) (Ib -> Rb) : (Ia | Ib) -> (Ra | Rb)

Disjunctions follow the same analogy; a Cat | Person type refers to a single object which is a feline or person, so functional disjunctions work the same way. Imagine a type that said "I'm either countLegs or sayHi". To pass an argument into it, you'd have to guarantee that the argument could be passed to either function; it would have to be both a Cat and a Person. Depending on which function got invoked, the result would be either an Int or a String.

(Cat -> Int) | (Person -> String)
 : (Cat Person) -> (Int | String)

(I0 -> R0) | (I1 -> R1) : (I0 I1) -> (R0 | R1)

Notice the symmetry here: The input to a conjunction of functions is the disjunction of their inputs, while the input to a disjunction of functions is the conjunction of their inputs. In both cases, the result is a disjunction of their results.

One edge case is when the input to a conjunction is composed of both input types. For instance, what happens if we break out the beakers and engineer a Cat Person? If we pass it to a conjunction countLegs sayHi, both functions in that composed function can process it. Which gets to? (This isn't a problem for a disjunctive type, since its object is just a single function.)

I'm not positive how to handle this case. One solution is to have both functions process it, and then compose the result. So, passing Cat Person to countHi = countLegs sayHi would result in a (4 "Hi, I'm Yuval") : (Int String).