## Saturday, July 4, 2015

### Pattern matching using recursion

(part 1 of 4ish) I’ve made a fair amount of progress in the past few weeks, and have mostly implemented the pattern matching I mentioned in my last post. All that remains now (famous last words…) is to hook up all this pattern matching stuff, which I created as a separate unit, to the compiler.

I ran into a few hurdles along the way, which I’ll split into a few posts. But first, to recap. The objective is to take something like this:

``````t : List[Maybe[Bool]]
case t of:
Cons(Nothing, Cons(_, Cons(One(True), _))): ...
``````

… and figure that after that first `case` matcher, `t` is any list except that whose first element is `Nothing` and whose third element is `One(True)`.

This has a recursive feel to it, since at each argument you can drill down (`Cons -> Cons -> Cons -> One -> True`, for instance). I did end up using recursion, but for a while I was fumbling around without a plan, and getting nowhere. In the end, I had to take a step back and think like a CS 101 student: what’s my base case, what’s the recursive step, what kind of thing is being returned, and how is it combined?

• base case: a simple type (not a disjunction) with no args
• recursive steps:
• disjunctive case
• simple type with args
• return value: a struct that contains a set of matched types and a set of unmatched types.
For instance, if the possibility is `True | False | FileNotFound`, and the match case is `True`, then return value is (matched={`True`}, unmatched={`False`, `FileNotFound`}).
• combining step:
• for disjunctions, recurse down on each component, and combine the results (matched=<matched values from each component>, unmatched similar)
• for simple types, recurse down on each argument. If any argument returns back no matches, the result is no match. Otherwise, do a cartesian product of all of the matched and unmatched arguments. For each row that corresponds only to matched arguments, create a match; for all other rows, create an un-match.

That last point is hard to explain succinctly, but an example will illustrate it. Let’s say you have:

``````VehicleType = Car | Truck
Color       = Red | Green | Blue
Vehicle(type: Car | Truck, color: Color)
t : Vehicle
``````

You match `t` against `Vehicle(Car, Red)`. Recursing down, you find that the first argument is (matched={`Car`}, unmatched={`Truck`}) while the second argument is (matched={`Red`}, unmatched={`Green`, `Blue`}). The cartesian product of these arguments (with matches marked with asterisks) is:

``````*Car,   *Red
*Car,    Green
*Car,    Blue
Truck, *Red
Truck,  Green
Truck,  Blue
``````

Of these six options, only the first row has all matched arguments, so it’s the match; the other rows are unmatched:

``````matched = {
Vehicle(Car, Red)
}
unmatched = {
Vehicle(Car, Green | Blue),
Vehicle(Truck, Red | Green | Blue)
}
``````

This gave me the overall approach, but I still had a couple problems. The first was dealing with laziness (which is needed to handle infinitely recursive types, like `List[T]`), and the second was in figuring out how to structure the recursion. I’ll talk about those in the next couple posts, in reverse order.