Null reference may not be a mistake


Tony Hoare called the null pointer his “billion-dollar mistake”, but was he really saying that null pointers should never be used? After years of using languages both with null pointers (e.g. Java) and without them (e.g. Haskell), I have found null pointers much easier and more natural to use than their counterparts (e.g. Haskell’s Maybe type). I kept wondering where the notion of a “billion-dollar mistake” came from until I saw the original video in which Tony Hoare claims it as his mistake. In fact, he didn’t really say that null pointers should not be used, so I realized that I had made a mistake by taking the words “billion-dollar mistake” literally.

From this video, you can see that introducing the null reference is not really a mistake. On the contrary, null references are helpful and sometimes indispensable (consider how much trouble it is to write a C++ function that may or may not return a string). The mistake is not in the existence of null pointers, but in how the type system handles them. Unfortunately, many mainstream languages (e.g. C++, Java) don’t handle them correctly.

Every class type A in Java is in fact a union type {A, null}, because you can use null anywhere an object of class A is expected. This plays the same role as the Maybe type in Haskell (where null in Java corresponds to Nothing in Haskell). So the real trouble is that an annotation like {String, null} should be distinguished from String, so that it is clear which values can possibly show up. Unfortunately, most languages don’t provide a convenient union type that lets you put String and null together (Typed Racket is an exception). If Java had union types, we could write something like:

{String, null} findName1() {
  if (...) {
    return "okay";
  } else {
    return null;
  }
}

This says: findName1 may return a name, which is a String, or it may return nothing. In comparison, we can say something slightly different:

String findName2() {
    ...
    return "okay";
}

By distinguishing the return types of findName1() and findName2(), the type system knows that you must check for null after calling findName1(), but that you don’t need to check for null after calling findName2(). So for findName1() you have to write something like:

String s = findName1();  
if (s != null) {
  x = s.length();      // use s as a String only after null check
}

But you may write:

String s = findName2();
x = s.length();

For the latter, you don’t have to check whether s is null because we know definitely that findName2() will return a String which is not null.
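Since {String, null} is not real Java syntax, here is a minimal sketch of the same two functions in today's Java. The @Nullable marker is a hypothetical annotation declared inside the example itself, and the class name, the boolean parameter, and the lengthOrMinusOne helper are likewise made up for illustration. The compiler does not enforce the marker; it only documents what a union-aware type system would check.

```java
class FindNameSketch {
    // Hypothetical marker standing in for the union type {String, null}.
    // It only documents intent; javac does not enforce it.
    @interface Nullable {}

    // Plays the role of findName1(): may return a String or null.
    static @Nullable String findName1(boolean found) {
        return found ? "okay" : null;
    }

    // Plays the role of findName2(): always returns a real String.
    static String findName2() {
        return "okay";
    }

    // Length of a possibly-null name; -1 stands for "no name".
    static int lengthOrMinusOne(@Nullable String s) {
        if (s != null) {
            return s.length();   // s is used as a String only after the null check
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(lengthOrMinusOne(findName1(true)));   // prints 4
        System.out.println(lengthOrMinusOne(findName1(false)));  // prints -1
        System.out.println(findName2().length());                // prints 4, no check needed
    }
}
```

With a real union type, dropping the null check in lengthOrMinusOne would be a compile-time error instead of a NullPointerException at run time.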

In fact, this approach is hinted at by Tony Hoare in the above video at 00:24.00. He said that null should be a class. Indeed, the union type {String, null} treats null as having the same status as String: it is a class.

But we soon realize that it doesn’t really matter whether null is a class or not, since the class Null would have only one value: null. So any language with null references should work equally well, given a correct type system.
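To make the "Null as a class" idea concrete, here is a rough sketch in plain Java. The class Null, its VALUE field, and the helper names are all hypothetical. Java has no union types, so the declared return type falls back to Object, and that loss of precision is exactly what a union type {String, Null} would avoid.

```java
class NullAsClassSketch {
    // Hypothetical class Null with exactly one value.
    static final class Null {
        static final Null VALUE = new Null();
        private Null() {}   // no other instances can be created
    }

    // Emulates {String, Null}. Without union types the best
    // declared type available in Java is Object.
    static Object findName(boolean found) {
        return found ? "okay" : Null.VALUE;
    }

    // Callers must test which member of the union they received.
    static int nameLength(Object result) {
        if (result instanceof String) {
            return ((String) result).length();
        }
        return -1;   // the only other possibility here is Null.VALUE
    }
}
```

Because Null has a single instance, testing `result == Null.VALUE` carries the same information as testing `s != null` against the built-in null, which is why the choice between the two matters little once the type system tracks it.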

Advanced static analysis tools can already help alleviate this issue by essentially inferring union types like {String, null} even when the programmer writes String. Still, a type annotation system that lets programmers specify union types directly would make type checking easier and programs easier to read.
