Friday, October 11, 2013

Of object composition and a maybe-dead cat

There's a problem with Effes that could prove deadly, at least to object composition: composed objects can conflict in ways that can't be caught at compile time, and there's no satisfying, simple way to resolve the conflicts. It's a problem I realized a few weeks ago while thinking about various use cases for composition.

To warm up, consider these types:
  • Animal
  • Dog is Animal
  • Happy
  • (Happy Dog) is Animal

Dog and Happy are simple types, and (Happy Dog) is their conjunction. Animal is an "interface-y" type with implementations provided by Dog and (Happy Dog). Let's say balto has a compile-time and runtime type of Happy Dog, and that Animal provides a method speak. Does balto speak invoke the implementation for Dog or Happy Dog?

That's not so tricky (I said it was a warmup!). Happy Dog is intuitively more specific than Dog, so we should invoke its speak. Let's try something just a bit trickier:

type Animal:
  speak -> String
data Dog is Animal:
  speak = "Woof!"
data Cat is Animal
  speak = "Meow!"

balto = Dog
felix = Cat
dogCat = balto @ felix
spoken = dogCat speak

This is trickier. The object dogCat is both a Dog and a Cat, both of which provide equally-specific overrides of Animal::speak. Which gets invoked? If your instinct is to just reject that at compile-time, consider this slightly more dynamic variant:

type Animal:
  speak -> String
data Talking, Dog, Cat -- no inherent relationships to Animal
type Talking Dog is Animal:
  speak = "Woof!"
type Talking Cat is Animal:
  speak = "Meow!"

balto = Talking Dog
felix : Cat = Talking Cat
dogCat = balto @ felix
spoken = dogCat speak

Here, felix has a run-time type of Cat but a compile-time type of Talking Cat, which is Animal. This means we can't detect a conflict at compile time, but dogCat still has two equally-specific implementations of speak at runtime.

We can come up with something clever based on compile-time contortions: take only the left object's implementation when they're composed, for instance, or take only the one whose compile-time type provides an implementation and fail to compile if they both provide an implementation. These would all work for the examples so far, though they're not very inspiring.

The problem is that Effes, like other functional languages, lets you do pattern matching — that is, runtime type checks. This is the killer:

type Animal:
  speak -> String
data Talking, Dog, Cat
type Talking Dog is Animal...
type Talking Cat is Animal...

balto : Dog = Talking Dog
felix : Cat = Talking Cat
dogCat = balto @ felix

spoken = case dogCat of
  Animal: dogCat speak
  else: "whatever"

balto and felix never have a compile-time type of Animal, and neither does dogCat (its compile-time type is just (Cat Dog). Yet both of them have a runtime type of Animal, and we intuit that dogCat should, too. The question, again, is what that speak method does.

There aren't many good, simple solutions to this that I can think of. There are bad, simple solutions; and there are interesting, complex ones. These are the ones I came up with:

  1. A runtime failure at balto @ felix, since it would create a possibly ambiguous object.
  2. A runtime failure when dogCat is matched against Animal and then invokes speak, since this is the aforementioned ambiguity.
  3. Pick one of the implementations by some arbitrary rule, like the left-hand side of the conjunction.
  4. Run both methods and return the conjunction of their results.

I really want to avoid runtime failures here: it's too easy to stumble into this case. The last option is interesting in a computer sciencey way, but it creates a possibly complex world. Let's call this the Shroedinger approach, since it creates an object that's like a superpositioning of barks and meows. What if we then run that object through a pattern match?

who = case spoken of
  "Woof!": "It was a dog"
  "Meow!": "It was a cat"
  else: "What was it?"

If we took the Shroedinger approach for speak, then spoken is ("Woof!" @ "Meow!"); it can match against either of the first two cases. Pattern matching traditionally executes just the first branch that matches, but this approach is inconsistent with the Shroedinger approach that got us this spoken in the first place. So, let's stick with the Shroedinger philosophy of taking all of the cases and conjoining them: the resulting object is a conjunction ("It was a dog" @ "It was a cat"). Of course, the expression "It was a dog" could instead be some method, one which itself might return a "superpositioned" object; and the same could be true of "It was a cat". By the time we finish the case statement, who might be pretty darn complex!

In short, we have three options for handling ambiguity: runtime failure, subtle behavior that depends on composition being non-commutative, or subtle behavior in which each ambiguous decision split the program into a more ambiguous runtime state. The last of those options is the most novel and interesting, but eventually we have to collapse the wave-form: most of the time, we need to end up printing either "the dog says woof" or "the cat says meow."

One option is to provide both philosophies: two parallel case statements, one of which executes all branches and the other of which executes the first it finds. A similar option is to provide a sort of decomposition; given an object with a compile-time type Foo, return an object which is only a Foo, with all other composed objects removed.

Even if we take this approach, all this superpositioning makes the code a lot less strictly typed. The type system provides a very low bound on each object's guarantees! Traditional type systems generally make guarantees like, "it'll be a list, but we don't know what kind of list." Shroedinger-Effes will be able to say "it's an empty list, but we don't know if it's also a non-empty list."

It also raises the question of whether methods are overridden, or if they're always superimposed. Remember our first Balto, the Happy Dog? We intuited that its speak should override the "less specific" implementation provided by Dog. Does that still happen, or do we Shroedinger it up and invoke both?

There is one simpler option. I wrote above that because balto and felix each have a runtime type (but not compile-time type) of Animal we'd want their composed object to also have a runtime type of Animal. What if this weren't the case? That is, that runtime behavior could only ever override types and methods that are known at compile time, but not to perform arbitrary runtime checks such as whether dogCat is an Animal? That may be the way to go; I'll think on it for a bit.