Monday, August 3, 2015

Evolutionary programming

I’ve realized that working on Effes is a different beast than working in my day job, primarily due to limited time and a lack of continuity. It hasn’t been quite as challenging as I thought it would be, but it does result in a slightly different development style.

At work, I come in and get to chew on a problem for hours at a time (modulo the usual office interruptions). If I don’t finish, I’ll pick things up the next day pretty much where I left off, with everything still more or less fresh in my head. But with Effes, I usually work on things an hour here, a half-hour there, maybe a stretch of a few hours on a weekend. Sometimes I won’t have time to work on it for weeks at a time.

So I’ve taken an evolution-like approach to Effes. Good traits are brought in atomically and incrementally; bad traits are culled, but ones that are merely superfluous can stick around for a while. Sometimes I work on pieces that I think will mesh, but I work on them without a grand spec of how to combine them. Instead, I work on each one individually, and then try to stitch them together.

It’s not that I don’t do cleanup (I do!), but the goals are different: there’s less emphasis on a tight, succinct code base, and more on making progress. Less on the ethos, and more on the id.

Two things I haven’t deemphasized are code clarity and maintainability. Like with the day job, those characteristics are vital, since I may not come back to a piece of code for months. On the other hand, if something works fine and is clear, but could maybe be collapsed a bit or consolidated with another class for aesthetic reasons — I have less incentive to do that, since that half hour task might well be the only half hour I get to spend on Effes for two weeks.

These are, of course, skills not unique to my hobby project. I have a bit of an “architectural perfectionist” streak in me, and working on Effes has helped me hone my ability to get things done efficiently, without worrying too much about how pretty they are.

Tuesday, July 28, 2015

A type by any other name

I think it’s actually going to be easier than I feared.
~ Me, writing Famous Last Words

Uh, yeah. So it turns out it won’t actually be that easy. Instead of just plugging things in straightforwardly, it turns out that I’m going to need to take a tangent to kinda overhaul the classes that represent Effes types.

Here’s the problem. Let’s say I do something like this:

b : Something[Boolean]
case b of:
  Something[True]: doSomethingWith b
  c: doSomethingElseWith c

… then I want two things to happen. Firstly, in doSomethingWith b, I want the type system to know that b : Something[True]. And secondly, in doSomethingElseWith c, I want the type system to know that c : Something[False] | Nothing.

Neither of those works today, because of an implementation choice I made a while back. To simplify recursive types like Cons[T](head: T, tail: Cons[T] | Empty), I separated each type’s name from its arguments. One bit of information says “I’m a Cons with a generic param T”, and a separate class holds the information “given a Cons[E], I can tell you that its arguments are (E, Cons[E] | Empty).”

This means that there’s no place to store the information that in some particular case, the arguments are (T, Empty) (i.e., that this cons represents the last item of a list). That’s exactly the kind of information that the pattern matching can and should provide.

So, before I go further with the pattern matching, I’m going to have to redo my type registration quite a bit: I’m going to need to remove my “arguments for each type” registry altogether and put that information in the types themselves.
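The shape of that refactor might look something like the following minimal Java sketch. The class names (SimpleType, TypeArg, and so on) are my own stand-ins, not Effes’s real compiler classes; the point is just that each type instance carries its own argument list, so pattern matching can refine it per instance.

```java
import java.util.List;

interface Type {}

// Hypothetical stand-ins for the real classes: the arguments live on
// the type instance itself, not in a shared "arguments for each type"
// registry, so a refined instance can carry refined arguments.
record TypeArg(String argName, Type argType) {}
record SimpleType(String name, List<TypeArg> args) implements Type {}

public class TypeRegistryDemo {
  public static void main(String[] unused) {
    SimpleType intType = new SimpleType("Int", List.of());
    SimpleType empty = new SimpleType("Empty", List.of());

    // The general Cons[Int] declaration says its tail is
    // Cons[Int] | Empty, but this particular Cons (as refined by
    // pattern matching) can say its tail is exactly Empty,
    // i.e. it is the last cell of the list.
    SimpleType lastCell = new SimpleType("Cons", List.of(
        new TypeArg("head", intType),
        new TypeArg("tail", empty)));

    SimpleType tail = (SimpleType) lastCell.args().get(1).argType();
    System.out.println(tail.name()); // Empty
  }
}
```

With this arrangement there is nothing left for a separate registry to answer; asking a type about its arguments is just a method call on the instance.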

Friday, July 10, 2015

I'm too lazy to type

The title is a double joke. It’s a pun with “lazy” and [data] types, but it’s also funny ’cause it’s true! This post was supposed to be about lazy evaluations in my pattern matching code, but I don’t feel like writing much about it.

And actually, there’s not a whole lot to write.

  • the “type subtraction” has to be lazy, because one of the operands is a potentially-infinitely-recursive type. Trivial example: IntSequence(head: Int, tail: IntSequence).
  • for the sake of keeping expansions down, it’s better to defer forcing the evaluations as long as possible — but that was covered in my last post
  • step three, profit I guess?
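The first bullet is the interesting one, and it can be sketched in a few lines of Java. This is an illustration of the idea rather than the real code: I’m using Supplier as the thunk and a made-up LazySimpleType record (with the head omitted for brevity) to stand in for the real type classes.

```java
import java.util.function.Supplier;

interface LazyType {}

// Hypothetical stand-in: a type whose single argument is an unforced
// thunk, so the recursive part only expands on demand. (The head of
// IntSequence is omitted to keep the sketch small.)
record LazySimpleType(String name, Supplier<LazyType> arg) implements LazyType {}

public class LazySubtraction {
  // IntSequence(head: Int, tail: IntSequence): the tail recurses
  // forever, so it has to stay a thunk until someone asks for it.
  static LazySimpleType intSequence() {
    return new LazySimpleType("IntSequence", LazySubtraction::intSequence);
  }

  public static void main(String[] unused) {
    LazySimpleType t = intSequence();
    // Peel off exactly one level; the rest of the (infinite) type
    // remains unevaluated.
    LazySimpleType tail = (LazySimpleType) t.arg().get();
    System.out.println(tail.name()); // IntSequence
  }
}
```

Any operation over such a type, subtraction included, has to work one forced level at a time, which is exactly why the laziness can’t be bolted on afterward.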

Anyway, point is there’s not much to say, so I’m just going to leave it at that.

My next step is to actually plug all this stuff into my compiler. I did a quick look the other day to remind myself what the compiler code actually looks like in that area, and I think it’s actually going to be easier than I feared. There’s already a decent abstraction of “the pattern that’s being matched,” with logic that takes that pattern and assigns variables to it and such, so with luck it’ll be pretty easy to swap in the new abstractions I’ve been using.

Wednesday, July 8, 2015

In Soviet Russia, recursion invokes you!

This is one of those “I know what worked, but I don’t quite know what I learned” posts.

To implement my type-safe pattern matching, I have to essentially recurse down into two structures: the type of the thing being matched, and the individual pattern that’s trying to match it. That is, given:

t: List[Maybe[Boolean]]     //          call this the haystack type
case t of
    Cons(One(True), _): doWhatever(...) // and this the needle type

… I need to recurse both into the List[Maybe[Boolean]] and into the Cons(One(True), _). The question is, which of those looks like recursion, and which looks like a dispatch? The problem is interesting because both structures are polymorphic: the haystack can be a simple type (Cons(head: T, tail: List[T])) or a union type (Cons(...) | Empty), while the needle type can be either a concrete type (One(True)) or a wildcard (_).

Essentially, some piece of code has to look something like this:

Result recursePolymorphically(Arg arg) {
  if (arg instanceof OneThing) { ... }
  else if (arg instanceof AnotherThing) { ... }
}

and the question is whether I:

  • recurse into the haystack polymorphically, and instanceof-ify the needle
  • recurse into the needle polymorphically, and instanceof-ify the haystack

A quick observation that drove my thinking on this: the haystack type is potentially infinite (the tail of a Cons is a disjunction that includes a Cons, the tail of which is a disjunction that includes a Cons, and so on) while the needle is always finite. Thus, the base case must depend on the needle.

I tried the first of those first, since I like the idea of the base case being driven off the method’s arguments rather than an attribute of this; with the needle as an argument, the base case is when that argument represents a wildcard or a concrete, no-arg type.
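That first arrangement looks roughly like this Java sketch. The names (SimpleHaystack, NamedNeedle, and so on) are invented stand-ins for the real classes; the haystack side is recursed into polymorphically, and the needle side is instanceof-ified, with the base case hanging off the needle argument.

```java
import java.util.List;

// Needle side: finite, so the base case is driven by it.
interface Needle {}
final class Wildcard implements Needle {}
record NamedNeedle(String name) implements Needle {}

// Haystack side: potentially infinite, recursed into polymorphically.
interface Haystack { boolean matchedBy(Needle n); }

record SimpleHaystack(String name) implements Haystack {
  public boolean matchedBy(Needle n) {
    // Base case: determined by the argument, not by `this`.
    if (n instanceof Wildcard) return true;
    if (n instanceof NamedNeedle nn) return nn.name().equals(name);
    return false;
  }
}

record UnionHaystack(List<Haystack> alternatives) implements Haystack {
  public boolean matchedBy(Needle n) {
    // Recursing here evaluates every alternative eagerly, which is
    // what forces the haystack more than necessary.
    return alternatives.stream().anyMatch(alt -> alt.matchedBy(n));
  }
}

public class Approach1 {
  public static void main(String[] unused) {
    Haystack h = new UnionHaystack(List.of(
        new SimpleHaystack("Cons"), new SimpleHaystack("Empty")));
    System.out.println(h.matchedBy(new Wildcard())); // true
  }
}
```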

The problem is that this didn’t work well with laziness (the subject of my next post), since it meant forcing the haystack more than I wanted. This in turn caused types to explode out much faster than they needed to. Instead of ending up with something like:

t': Cons(Nothing, ...)
  | Empty

I might end up with something like:

t': Cons(Nothing, Cons(Nothing, ...) | Empty)
  | Cons(Nothing, Cons(Maybe[True], ...) | Empty)
  | Cons(Nothing, Cons(Maybe[False], ...) | Empty)
  | Empty

This made testing more difficult, as I had to start balancing thoroughness and verbosity in the test code. It also means that in the case of a compilation error, the user would see a much more confusing error message. Would you rather see “saw True but expected Cons(Nothing, …)” or “saw True but expected Cons(Nothing, Cons(Nothing, …) | Empty) | Cons(Nothing, Cons(Maybe[True], …) | Empty) | Cons(Nothing, Cons(Maybe[False], …) | Empty) | Empty”? I just wrote that, and I can barely even make sense of it!

As it happens, the testing was important because this approach lent itself to more bugs, though I haven’t done the introspection to figure out why.

The explosion problem pretty much went away when I switched the recursion and the dispatch. Now, the polymorphic call to the needle’s Any form can take its unforced argument (a thunk representing some piece of the haystack’s type) and shove it into the result, still in its lazy form. The result becomes something like matched={<thunk>}, which keeps things from having to expand further.
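Here is a minimal Java sketch of that switched-around shape. Again these are invented stand-ins (Pattern, Any, Concrete, Hay), not the real compiler classes: the dispatch is now polymorphic on the needle, the haystack arrives as an unforced Supplier thunk, and the wildcard case can shove that thunk straight into its result without ever forcing it.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for one forced level of the haystack's type.
record Hay(String name) {}

interface Pattern {
  // The haystack arrives as a thunk; each pattern decides whether to
  // force it.
  String match(Supplier<Hay> haystack);
}

final class Any implements Pattern {
  // The wildcard matches anything, so the thunk stays unforced and
  // goes into the result as-is.
  public String match(Supplier<Hay> haystack) {
    return "matched={<thunk>}";
  }
}

record Concrete(String name) implements Pattern {
  // A concrete pattern forces exactly one level to compare names.
  public String match(Supplier<Hay> haystack) {
    Hay forced = haystack.get();
    return forced.name().equals(name) ? "matched=" + forced.name() : "no match";
  }
}

public class Approach2 {
  public static void main(String[] unused) {
    // A thunk that blows up if forced, to demonstrate that the
    // wildcard never touches the haystack's structure.
    Supplier<Hay> poison = () -> {
      throw new AssertionError("should not be forced by Any");
    };
    System.out.println(new Any().match(poison));              // matched={<thunk>}
    System.out.println(new Concrete("Cons").match(() -> new Hay("Cons"))); // matched=Cons
  }
}
```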

Which gets me back to the question at the top: what did I learn? I still don’t know. I tried one technique, identified its flaws, and tried the other — but I didn’t learn how to pick techniques better. My algorithm is still a linear search.

Maybe what I learned is that the base case should be driven off polymorphism, not off method arguments. Or maybe it’s that lazy evaluation and polymorphism don’t mix well. Or maybe that sometimes, you just have to use brute force to figure new stuff out.