Category Archives: types

Null reference may not be a mistake


The null pointer is considered to be a “billion-dollar mistake“. I have been wondering why there is such a notion until I saw the video where Tony Hoare claims it to be his mistake. In fact, he didn’t really say that null pointer should not be used.

From this video, you can see that introducing null reference is not really a mistake. On the contrary, null references are helpful and sometimes indispensable. The mistake is not in the existence of the null pointers, but in how the type system treats them. Unfortunately, most languages (C++, Java, C#, …) don’t treat them correctly.

Every class type A of Java is in fact a union type {A, null}, because you can use null where an A object is expected. {A, null} is almost equivalent to the Maybe type of Haskell, where null corresponds to Nothing of Haskell. So the trouble really is that an annotation like {String, null} should be distinguished from String, so that it will be clear whether null can possibly be its value.

Unfortunately most languages don’t provide a convenient union type that you can put String and null together (Typed Racket is an exception). If Java is to have union types, we can say something like:

{String, null} find1() {
  if (...) {
    return "okay";
  } else {
    return null;

This is saying: find1 may return a name which is a String, or it may return nothing. Because of the union type {String, null}, the type system knows that you should check for null when you have called find(), so it will force you to write a null check:

String s = find();  
if (s != null) {
  x = s.length();

In comparison, if we define a slightly different function find2, with a different return type:

String find2() {
    return "okay";

From the return type we know that find2 will never return null, so the type checker can let you you use the String without checking:

String s = find();
x = s.length();

Comments Off on Null reference may not be a mistake

Posted by on June 3, 2013 in programming languages, types


How to reduce the contagious power of ‘const’

For the purpose of optimization and (possibly?) safety, the C++ programming language decided that the users should manually specify ‘const’ annotations for variables that they know will/should never be modified. While I admit the possible usefulness of this annotation, I often see it misused, and this often results in a contagious effect where unnecessarily many variables and parameters are annotated as ‘const’.

In this post, I reason philosophically about the true meaning of the ‘const’ annotation, and propose some tentative principles regarding the use of it. By following them, I hope to reduce the contagious power of ‘const’ to the minimum.

First, the rule for ‘const’ annotations for variables:

Rule 1. Declare a variable as ‘const’ if you want it to be inlined, or you don’t want it to be modified.

By this principle, these examples are legitimate:

const int max = 1000;
const char* name = "permanent name";

These declarations can be global variables as well as local variables. Declaring them as ‘const’ can facilitate the inlining opportunities of them, as well as prevent unwanted modifications.

Now we just have one other rule. This one concerns the ‘const’ annotations on functions:

Rule 2. As a function, be self-disciplined about what you take, but be relaxed on what you give.

This means, put ‘const’ on the parameters whenever you are sure that you will by no chance modify the input argument, but never put ‘const’ on the return type unless the compiler asks you to.

The reason behind this may take a little bit of thinking. By putting ‘const’ on the input parameter, the function is basically declaring: “Whoever passes me this argument, I promise not to modify it.” “I’m side-effect free on this parameter.” “I’m pure on this parameter.”

This is a good thing, because you give the callers of the function more freedom of what they can give you. If you don’t declare your “pureness” (on the parameter), some callers may not be able to pass you certain data because they don’t want their data to be accidentally modified by you. They need their original values intact, because they are going to use them after you return.

So in general, this is a good thing to do in C++:

  void foo(const char* in);

But there is a catch here: whether you can put ‘const’ here depends on the function body and all the function calls inside. Whenever you put ‘const’ on the parameter type, you have to make sure that any functions you call don’t modify this parameter either. That is to say this is a transitive property. For example, something like this will not work:

  void foo(const char* in) {

  void bar(char* in) { ... }

That is because bar may modify its parameter, which is also foo’s parameter.

Is this to say that you should declare bar as void bar(const char* in)? It all depends on whether bar actually modifies its argument or not. If it does, then there is no way you can use ‘const’, consequently you cannot declare foo as taking a const char either. Then the type of foo should be “void foo(char in)”, not having the ‘const’.

There is no way you should use ‘const’ in foo then, because the helper bar modifies the data. You have to say this honestly: “I may modify the contents of in, because one of the functions I call modifies it.”

This is where it can get painful in C++, because the other functions you call may be written by other people and they may not follow the same principles here. They may not have ‘const’ on their parameter types even though they haven’t modified the parameter in their function body. This is not their fault, because adding those ‘const’ annotations is so boring. Some automatic inference can certainly be helpful in these cases, but unfortunately inference for ‘const’ is not provided by C++.

On the other hand, why should you NOT put ‘const’ on the return type? This is because if you do, then you are basically announcing: “Whoever calls me cannot modify the value that I return.”

This like saying this in your will: “I will give my house to my son, but he is not allowed to modify it.” Well, the law may give you the right to specify this, but what good can this do to your son? You are limiting the happiness you can provide. More over, can this kind of restriction really be enforced? Who has the time or right to keep an eye on your son and restrict his action on the house? This is just impossible to work.

Coming back to the context of programming, what’s the point of not allowing the callers to modify what you GIVE them? By returning, you are giving up control of the return value to the caller. It should be up to the caller, and not you, to decide whether to modify the return value or not. Even if you put ‘const’ annotations in the return type, they are usually ignored by the callers. They just use a const_cast whenever the ‘const’ gets in their way. This makes the ‘const’ annotation on return types virtually useless.

If you put ‘const’ on the return type and the callers respect it, then it will contaminate the whole data path wherever the return value goes. This is nasty and unnecessary.

So in general, this is not a good thing to do:

  const char* bar();

Because your caller would have to be like this:

void baz() {
  const char* ret = bar();

And the receiver of the return value (bizaar) must be defined as something like:

  void bizaar(const char* in) {

And now bizaar2 needs to be declared as:

  void bizaar2(const char* in);

And so on… You see how the contagion started? Because bar() returned a const value, which is unnecessarily restrictive. So in general, it is not a good idea to return ‘const’.

I’ll leave this as an thought exercise: When can we reasonably use ‘const’ on the return value?

Comments Off on How to reduce the contagious power of ‘const’

Posted by on April 8, 2013 in programming, types


Propositions as programs

The Curry-Howard correspondence says that propositions are types, and proofs are programs. I had been wondering if there is a simpler way to think about it, so I came up with this:

Propositions are programs; proofs are “abstract interpretation” of them to True.

Here True is a meta-level truth value. A theorem prover is then an “abstract interpreter” which tries to evaluate the proposition to True. This computational process which derives True from the proposition, is called a judgment. Here I may have conflated the concept “abstract interpretation”. I chose to use this term just because those two words “abstract” and “interpretation” conveys the general idea I hope to capture, but the term “abstract interpretation” here may not be what you use to think it is. I could have used “supercompilation”, but I don’t think the word conveys the concept very well.

Any way, this is a special kind of evaluation. It may be partial evaluation, supercompilation, or some sort of static analysis, whatever you call it. It can also take human inputs interactively, as long as it is semantic-preserving — it returns True whenever an actual interpreter would return True. But it is a lot more efficient, because unlike a normal interpreter, it takes shortcuts (e.g. induction hypotheses). If it thus evaluates to True, then the proposition is proved. This is much like fortune-telling. It immediately tells you the result that an actual interpreter would eventually give you. It has a limited number of basic “reduction operations” such as unfolding, induction and rewriting, so we can record reduction sequences as “proofs”, and we can verify them later on.

This seems to match what we have been doing with proof assistants like Coq, and possibly a more natural way of thinking about this kind of theorem proving. I feel that the propositions in Coq are a little far-fetched to be called “types” of the proof terms (note: not to be confused with tactics). For example, we can have a proof term like

fun (n' : nat) (IHn' : n' + 0 = n') => ...

Is n' + 0 = n' the type of IHn'? Well, you can call it whatever you like, but calling it a “type” doesn’t feel natural at all. What we want is just a way to bind an equation to a name, so that we can refer to it. It feels better if we just think of propositions as programs with boolean return types and the proof terms reduction sequences of them into True. If you take a look at the proof terms of Coq, you may find that this is the case. For example, take a look at this simple theorem and the tactics that prove it:

Theorem plus_0_r : forall n:nat, n + 0 = n.
  intros n. induction n as [| n'].
  simpl. rewrite -> IHn'. reflexivity.  Qed.

They produce the following proof term:

plus_0_r = 
fun n : nat =>
nat_ind (fun n0 : nat => n0 + 0 = n0) eq_refl
  (fun (n' : nat) (IHn' : n' + 0 = n') =>
   eq_ind_r (fun n0 : nat => S n0 = S n') eq_refl IHn') n
     : forall n : nat, n + 0 = n

You may think of this proof term as a program with the theorem as its type, but you can also think of it as a reduction of the program n + 0 = n to True, where n is a natural number. It is saying: “Do an induction where the first branch executes an equivalence judgment, and the other branch does an unfolding, a rewriting using the induction hypothesis, and an equivalence judgment.” I think the second way of thinking of it is much more natural.

This interpretation can also explain the operation of Boyer-Moore style theorem provers (e.g. ACL2), as this is almost exactly how they work. They just don’t normally output a machine-verifiable reduction sequence.

You may have noticed that we have an important difference from the orginal Curry-Howard correspondence here:

Proofs are no longer considered programs.

At least proofs are not object-level programs. They are the “operation histories” in the abstract interpreter, which are at the meta-level. We can give this operation history to an independent verifier, so that it can “replay” the abstract interpretation and verify the correctness of the proof.

Alternatively, if you consider the verifier as an interpreter then proofs are its input programs. In this sense you can also say that proofs are programs for the proof verifier, and both propositions and proofs can be considered programs. But they are programs at two different semantic levels: the proofs are at a higher level than the propositions. This is very different from Curry-Howard.

Thus we have arrived at a unified way of thinking about the two very different styles of theorem proving: Curry-Howard correspondence (as in Coq), and Boyer-Moore (as in ACL2).


ADTs and objects

After reading William Cook’s essay On Understanding Data Abstraction, Revisited, let me try to condense the difference between abstract data types (ADTs) and objects into a few words.

(To avoid ambiguity, I use “instances” to stand for data created by instantiating ADTs)

  • “Instances” created by the same ADT share the same functions (methods). Functions may be parameterized, but the code is the same in each instance. Because of this, they need to share the same representation.
  • “Objects” (as proposed by Cook) don’t necessarily share function code. Each object may have a completely different set of functions (with matching names and types). This is in essence the same as “call-backs” in GUI widgets. Because of this diversity, each object may have completely different representation.

Ironically, it is usually the former case in mainstream object-oriented languages like Java and C++. In Java you can somehow achieve the latter with interfaces and anonymous classes, but it is awkward. JavaScript’s prototype-based system seems to be closer to the essence of the latter, but still not feel natural.

But different from Cook’s view, I think it is better not to consider binary operations like set union as methods. Union may take two sets with different representations and create a new one, thus it can operate by using the “true methods” of both arguments (possibly iterators and membership).


A bug in GHC’s type system

Several days ago, I implemented an experimental type inference system with first-class polymorphism. When comparing it with other systems, I found a possible bug in GHC’s type system regarding universal quantification. The phenomemon was confirmed and reproduced by people at #haskell IRC channel for GHC versions above 7.01. The code that causes trouble is:

gen :: [forall a. a -> a]
gen = [id]
test1 = head gen 1

Obviously this program should typecheck, since:

  • id has the type forall a. a -> a.
  • A list gen containing id should have type [forall a. a -> a](as in the annotation).
  • head has the type forall a. [a] -> a.
  • head gen should have the type forall a. a -> a.
  • head gen should be able to be applied to 1.

But GHC rejected this program for a strange reason.

Couldn't match expected type `t1 -> t0'
with actual type `forall a. a -> a'
Expected type: [t1 -> t0]
Actual type: [forall a. a -> a]
In the first argument of `head', namely `gen'
In the expression: head gen 1

On the other hand, it works if (head gen) is bound at let:

test2 = let hg = head gen in hg 1

It doesn’t break the soundness of the type system since it only rejects some correct programs, but this kind of pecularities of type systems can be painful when they add up. I guess this may be caused by the interaction between GHC’s internal type system with the legacy let-polymorphism.

Leave a comment

Posted by on March 5, 2012 in bugs, programming languages, types


Decoupling type classes

(I posted this idea to Haskell Cafe some days ago and got interesting respons from Dominique Devriese, who implemented instance arguments [1,2] (Agda’s equivalence of type classes). The disucssion has been really interesting, so I decided to collect and organize the ideas into a coherent post here. Some syntax is borrowed from their work.)

This post relates the three ideas together: type classes, implicit parameters, and typed dynamic scoping. The first two are researched into some degree but their relationship and design are still subject to debates. The latter is my own analogy which appears to be really helpful clarifying those design decisions.

For the impatient, I have bulletted points with all the main ideas in this post here:

  • We can achieve the same or more expressiveness of type classes without using dictionaries. We just need implicit parameters.
  • Implicit parameters and their types can be fully inferred.
  • Implicit parameters can express MPTCs.
  • Type classes can be thought of as dynamic typing, though more disciplined.

To group, or not to group

Haskell’s type classes tie the two concepts “overloading” and “grouping” together, but in fact they are not necessarily related. Decoupling those two concepts has several benefits:

  • It makes the type system more flexible.
  • It makes the implementation more modular.
  • It frees us from the need to pass a whole dictionary when we only use one function in it.

I’m curious why we need to group those functions into a type class, instead of using them directly as overloaded functions. So I built an experimental type system which “decouples” the dictionary. Instead of passing a dictionary, it passes individual “implicit parameters” to the function. Each implicit parameter corresponds to an “undefined variable” in the function. Those implicit parameters are type inferenced and can contain type parameters just as the functions in a type class. Similarly, they are resolved by their types in the call site’s scope.

The convenience of this approach compared to Haskell’s type classes is that we no longer require a user of a type class to define all the methods in a type class. For example, a user could just define a method + without defining other methods in the Num class: -, *, … He can use the method + independently. For example, if + is defined on the String type to be concatenation, we can use + in another function:

weirdConcat x y = x + y + y

This has a utility, because the methods in the Num class don’t “make sense” for Strings except +, but the current type class design requires us to define them. This is burdensome. By separating the Num class into individual overloaed functions, we enables this kind of overloading without any extra work.

Then a question arises: how do we relate the types of functions if we no longer have this grouping? For example, for the Monad class, we expect return to be of type (a -> m a) if (>>=) is of type (m a -> (a -> m b) -> m b).

The answer is to use a polymorphic record to group the functions when needed, then reference the functions through the record.

Thus we have delegated the task of overloading to separate overloaded functions, and the task of grouping to records. This is a more modular design compared to the original type class system of Haskell.

Expressing MPTC

There is another benefit of this decoupling: it is expressive enough to subsume MPTC. Here is why: because the methods are no longer grouped, there is no longer a “common” type parameter to the methods. Thus we can easily have more than one parameter in the individual methods and conveniently use them as MPTC methods.

For example,

overload g
 ... ...
f x (Int y) = g x y

Notice that in my system, there are no explicit declarations containing type variables. The declaration “overload g” is all that is needed. The system infers f’s type as:

a -> Int -> {{g: a -> Int -> b}} -> b

Here it automatically infers the type for g (a -> Int -> b) just from its usage inside f, as if we have written a type class definition like:

class G a where
 g :: a -> Int -> b

So in this system, not only we don’t need to defined type classes, we don’t even need to declare the types of the overloaded functions! We can infer them from their usage and different calls to the same function don’t even need to have the same type! All it takes is:

overload g

And even this is not really necessary. It is for sanity purposes – to avoid inadvertent overloading of unbound variables.

So if g is used as:

f x y (Int z) = g x z y

then we infer its type as:

a -> b -> Int -> {{ g :: a -> Int -> b -> c}} -> c

and g will be equivalent to the one you would have defined in a MPTC method.

Some implementation details

Here is how it can be implemented. When we see an “undefined” variable in a function definition which has been declared as “overloaded function”, we store the function name, and the type variables that are associated with it. For example,

overload g -- (explicitly declare g as an overloaded function)

f x y (String s) = ...
let z = g x s y in

We don’t know what x and y are, but we know from the body of f that their types satisfy this pattern: g a String b. Thus we store this pattern constraint as an extra (implicit) argument in the type of f:

f :: a -> b -> String {{g: g a String b}}

We may have multiple such arguments.

At the call sites of f, we look for a function g in the scope that satisfies the pattern g a String b, but we don’t pass on the substitution, so they remain polymorphic. Once found, the function is passed as an extra parameter to f. This is essentially dictionary passing, but without grouping. It can be also more efficient because the parameters may be stored in registers.

Here g is explicitly declared as “overloaded”, although my experimental system doesn’t need this. Any undefined variable inside function body automatically becomes overloaded. This may cause unintended overloading and it catches bugs late. That’s why we need the “overload” declarations.

But the automatic overloading of the undefined may be useful in certain situations. For example, if we are going to use Haskell as a shell language. Every “command” must be evaluated when we type them. If we have mutually recursive definitions, the shell will report “undefined variables” either way we order the functions. The automatic overloading may solve this problem. The undefined variables will temporarily exist as automatic overloaded functions. Once we actually define a function with the same name AND satisfies the type constraints, they become implicit parameters to the function we defined before. If we call a function whose implicit parameters are not associated, the shell reports error very similar to Haskell’s “type a is not of class Num …”

Type classes as typed dynamic scoping

It appears to be helpful to think of the calls to overloaded functions as referencing dynamically scoped variables. They are dispatched depending on the bindings we have in the call site’s scope (and not the scope where the method is defined!). So it is very much similar to the much-hated dynamic scoping. But the dispatching is more disciplined — it doesn’t just match the name. It must match both the name and the inferred type. So in general it is useful and doesn’t cause trouble.

This intuition also explains why local instances shouldn’t be allowed, because if we capture the variables at the definition site, the method call becomes “statically scoped”.

Let me post an example from Oleg:

foo x =
 let overload bar (x:Int) = x + 1
 in \() -> bar x

baz =
 let overload bar (x:int) = x + 2
 in foo (1::Int)

Here at the call site foo (1::Int), bar should be resolved to the one inside baz, and not the one inside foo. My analogy to “dynamic scoping” is useful to understand the reason here. If we capture the bar inside foo’s definition site, it would correspond to “lexical scoping”.



Leave a comment

Posted by on January 16, 2012 in programming languages, types


Sum types and union types

In a new type system I’m designing, I was trying to find a good reason to get rid of sum types (as are commonly used in ML and Haskell). Well, I don’t hate them that much, but for simplicity’s sake, I always try to remove things unless there are undeniable reasons that they must exist.

I think sum types have a large overlapping of functionality with product types and cause inelegance and inefficiency, so I hope to replace them with union types (some people call “open unions”). Union types are more orthogonal with respect to product types. I seem to have found a good idea where sum types originated and the reason why we don’t need them in a programming language.

It appears that sum types were originated from mathematics. In mathematics, there are no named product types. All we have are Cartesian products. Cartesian products have no constructors (or type tags) in them. How do we distinguish a student represented as (Height x Weight) with a professor, also represented as (Height x Weight)? We put tags on them before putting them into a disjoint union. This process is called injection. We do this every time when putting something into a disjoint union, and we project the elements out when we need their values.

From a programming point of view, the injections and projections are inefficient and inconvenient. This is equivalent to constructing new objects whenever we need to put them into a list, and destructing them when we need their components. Then construct new objects (again) when we need to put them into another list, and so on.

In order to avoid these shortcomings, programming language researchers decided to attach type tags to the objects when they are created. The tags tell us what the object is and they stay with the objects throughout their life-time. This results in record types. But somehow the PL researchers seemed to haven’t been completely freed from the influence of mathematics. They decided that each variant of a union needs to define a new constructor (injection). Thus sum types were born. For example,

data T1 = Foo Int | Bar String

Here Foo and Bar are essentially injections. Why do we need the injections and projections in a language where type tags always tell us what an object is? Another problem is that a constructor can only belong to one sum type, and there is no way we can reuse it in another. For example, there is no way you can define another sum type like:

data T2 = Foo Bool | Baz Float

because Foo is already defined in T1. What we wanted to express is that Foo belongs to the union T1 and it also belongs to the union T2. This is a very reasonable need, but most Hindley-Milner style type systems doesn’t allow this.

If we relax the requirement that each variant of a sum type must define a new constructor, we get union types. They are more general and orthogonal with respect to product types. We can always choose to put another level of type tags upon the variants, therefore we lose nothing if we don’t have sum types.

Leave a comment

Posted by on August 28, 2011 in programming languages, types