Yuval Shavit: Of Optional and nulls

Here at last is that rant about Optional<T> I've promised for so long. Let me preface it by saying that I am not about to propose an ideal way of handling nulls in Java; I don't think Java's null handling will ever be great. That said, there are better and worse ways of doing it, and I think Optional<T> isn't the best way. What's worse, it's edging out a better way.

For the unfamiliar, Optional<T> is a Guava class that aims to eliminate NullPointerExceptions. It has two forms: Optional.absent() and Optional.of(T item). Rather than a method passing back a nullable Foo, it returns an Optional<Foo>. You then call isPresent(), followed by get() iff the item is present.

Optional<Foo> myFooOpt = tryGetFoo();
if (myFooOpt.isPresent()) { // like a != null check
    Foo myFoo = myFooOpt.get();
    // work with the foo
} else {
    throw NoFooFoundException(); // or whatever
}

The idea is that since you have to call get() to get at the Foo, you'll probably remember to check isPresent first — and thus, no NPEs. It seems reasonable enough, but there are two big problems with it. First, it's verbose; and second, it's not backwards compatible.

The verbosity comes down to a lack of pattern matching in Java. Optional<T> is inspired by functional programming languages that have pattern matching — think of it (very roughly) as an instanceof check combined with an un-constructor. Here's how you'd use Haskell's equivalent of Optional<T>:

case tryGetFoo of
    Just foo -> handleFoo foo
    _ -> handleNoFoo

See how much cleaner that is? Optional<T>-type constructs really benefit from a terse way to get at the wrapped object. Pattern matching lets you do this two ways: by combining the isPresent() and the get(), and by therefore eliminating the need for that temporary, throwaway reference to myFooOpt.

Java is trying to move away from verbose boilerplate; one could argue that the driving force behind both Java 7 and 8 is conciseness, not new features. So why is the Java world embracing the overly-verbose Optional<T>?

The backwards compatibility problem is more clear-cut: existing libraries can't be retrofitted with Optional<T> without huge changes to how overload and method resolution is handled. For instance, Map.get returns V — you can't just change it to return Optional<V> without breaking a lot of code.

Before Optional<T> got cool, one idea people had was to use annotations to do static analysis on the code. Mark a field as @Null, and you know it can be nullable; try to use it without checking for nullity, and you'll get a warning. Nullity can be propagated through result types and arguments, and it all checks out at compile time.

The best part is that you can retrofit it to existing classes. Map.get will never return an Optional<V>, but it could return a @Null V.

There were a few different attempts at these checks, each leading to different sets of annotations. If I had it my way, we'd see one of these — preferably a concise one — get Oracle's official blessing and widespread usage.

A type checker has to be conservative, and that means that you'd have to assume that legacy code always returns nullable references. On the other hand, for new code you'd want an un-annotated method to be assumed to be @NotNull (to cut down on verbosity). This mismatch could be solved in three ways.

Classes compiled annotated with a new @NullChecked annotation would also have their methods assumed to be @NotNull.
All newly compiled code would assume @NullChecked
The type checker could take additional inputs in the form of files that list methods which should be treated as @NotNull regardless of their bytecode.

The third one of those would mean that you could mark methods as not-nullable without touching their bytecode at all. This could be useful for some serialization issues, but more importantly, it would let people locally update projects without waiting on their maintainers.

With that migration path in place, compilers could start treating unsafe dereferencing as errors rather than warnings. And maybe, just maybe, Java can recognize it as important enough as to warrant syntactic sugar: T? as shorthand for @Null T. Kotlin employs a similar trick, and while I haven't actually used it, it sure looks nice.

There are other tricks you can do with annotations that expose a lot of power (including how it interacts with subtyping, etc), at the cost of more complexity. I'm not sure Java needs all those — but even without any of them it's still at least as powerful as Optional<T> — with the added benefit of backwards compatibility.

I'm not sure why annotation-based static analysis never caught on. Maybe the pushes were too fragmented, and developers weren't willing to hack in ugly ways to solve backwards compatibility (like my "additional inputs" file)? Maybe the edge cases are just too many and complicated? A quick google search didn't give me any answers.

Yuval Shavit

Friday, December 27, 2013

Of Optional and nulls

No comments:

Post a Comment