If I haven’t blogged much lately, it’s because I haven’t worked on Effes much lately. Some of it is because I’ve been busy, and some of it is because my current task is, well, a bit difficult. It’s been hard to find the time and energy to focus on it for more than a half-hour here or there, which is really what I need to do.
The problem I’m working on is pattern matching:
truthy : True | False
myValue = case truthy of
True: 1
_: 0
Okay, so that example is pretty easy. The problem is that I really want to disallow what Haskell calls partial functions: functions that might not apply to all inputs that the type system allows. Consider:
possiblyTruthy : True | False | Unknown
myValue = case possiblyTruthy of
True: 1
False: 0
-- no match for "Unknown"
Haskell will happily compile the equivalent of this code, and unhappily throw a runtime exception if myValue
is Unknown
. For a language that prides itself on its awesome type system, that’s not super helpful!
The easy option is to require an “else” (_: foo
) on all case
expressions, but that’s annoying (or even dangerous) when you know (or think) that you’ve already specified all the possibilities. I want to do better: I want the compiler to know whether you’ve specified the possibilities. Specifically, I’d like it to require:
- that all possibilities are specified
- that nothing impossible is specified
To do this, I need a way of “subtracting” types.
t : True | False
myValue = case t of
True: 1 -- t is now (True | False) - True,
-- so t : False
False: 0 -- t is now (False) - False
-- so t : ()
-- no need to specify a catch-all _: case
For simple expressions like this, the subtraction is easy. But it gets tricker when you allow more complex patterns, ones that let you peer into a value’s components. Consider:
Boolean = True | False
List[T] = Cons(head: T, tail: List[T]) | Empty
bools : List[Boolean]
v = case bools of:
Cons(True, _): 0
Cons(False, Cons(_, Cons(False, _))): 1
Cons(False, Cons(_, Cons(False, _))): 2
Cons(True, Empty): 3 -- error!
_: 4
After the first pattern, bools
is:
List[Boolean] - Cons(True, _)
= List[True | False]
- Cons(True, _)
= Cons(True | False, List[Boolean]) | Empty
- Cons(True, _)
Let’s “factor out” the True | False
from the first Cons
. I’ll also use one-letter type names as shorthand, since this gets tricky: B
for Boolean
, T
for True
, etc.
= C(T | F, L[B]) | E
- C(T, _)
= C(T, L[B]) | C(F, L[B]) | E -- factor out the T | F
- C(T, _)
= C(F, L[B]) | E
Okay, that wasn’t so hard. But then, the pattern I matched was pretty simple (“any list that starts with True
). As the patterns get more complex, so does the logic; I might need to recurse down an arbitrary number of times on an given argument. I also need to do this lazily: List[T]
is potentially infinite, so I can’t just factor everything out and subtract out the simple terms.
One way is to do a lazy breadth-first expansion: produce a sequence of types, each with one more layer expanded, and just keep going down that list until I either find the exact type I need, or find that it can’t possibly exist. That would work, but my spidey sense doesn’t like it. It feels like I should be able to hone in better on the expansions I actually want. That will probably also give me better error messages, if a user misses a possibility or types out something impossible (like the Cons(True, Empty)
above, which is impossible since we’ve already covered all lists that start with True
). I don’t think it’s super difficult; but it’s not trivial.