Monthly Archives: March 2012

A reformulation of reducibility

I have found books on the theory of computation very imprecise in their descriptions of Turing machines and reductions. They usually start with something precise and mathematical, for example defining a Turing machine as a 7-tuple, but after that, everything about decidability and reduction is described in impenetrable paragraphs of natural language.

I believe that the use of natural language causes much of the confusion in the theory of computation, because natural language is inherently imprecise and ambiguous. There are too many ways to say the same thing. For example, in the books you can find all of these sentences, which mean exactly the same thing:

  • “Run M on w”
  • “Run M on input w”
  • “Simulate M on w”

The pronouns and references are even more confusing. For example:

“Use the description of M and w to construct the TM M1 just described.”

What is “just described”, and do M and w here mean the same things as in the earlier paragraph?

Another serious problem is that natural language is very weak at representing the notion of “interpretation”, which underlies much of the theory of computation. For example, a Turing machine simulates another Turing machine, which in turn contains and simulates yet another one.

After some thought, I found that we can use something precise, such as mathematical notation combined with programming languages, to describe these concepts. As an example, I will show here how the notion of reduction can be defined precisely as a homomorphism, which can then be instantiated to reduce one problem to another.

Definition 1 (reduction): A reduction (as in theory of computation) is a homomorphism (as in universal algebra):

Reduce(TM, I) = (TM', I')

satisfying the property

TM @ I = TM' @ I'

where

  • TM = the Turing machine we reduce from
  • TM' = the Turing machine we reduce to
  • I = input string of TM
  • I' = input string of TM'
  • @ = simulation, similar to the Scheme code ((eval TM) I)
  • TM @ I = result of simulating TM on I (accept or reject)
  • TM' @ I' = result of simulating TM' on I' (accept or reject)

End of Definition 1.
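To make this concrete, here is a minimal Haskell sketch of Definition 1 (the names Result, Machine, and Reduction are mine, not standard). A machine is modeled as a possibly non-terminating function from its input to a result, so the simulation @ is just function application:

-- Accept and Reject are the only observable results; running forever
-- is modeled by non-termination of the function itself.
data Result = Accept | Reject deriving (Eq, Show)

-- A machine over inputs of type i; the simulation TM @ I is just (tm i).
type Machine i = i -> Result

-- A reduction maps one (machine, input) pair to another, required to
-- satisfy: simulating the first pair equals simulating the second.
type Reduction i j = (Machine i, i) -> (Machine j, j)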

Notice that the simulation (TM @ I) may go into an infinite loop and never produce any result. Now I will show how to use this homomorphism to describe the reduction from ATM to HALT (written ATM ==> HALT), where

  • ATM = the “acceptance problem” (deciding whether a Turing machine M accepts a string w)
  • HALT = the “halting problem” (deciding whether a Turing machine M halts on a string w)

For convenience, we let

  • DATM = “the decider of ATM”
  • DHALT = “the decider of HALT”

Now the reduction can be fully described by the following homomorphism:

Reduce(DATM, (M,w)) = (DHALT, (M',w))
where
  M' = <if (M @ w) then accept else loop>
satisfying
  DATM @ (M,w) = DHALT @ (M',w)

Yes, that is a self-contained formal proof that HALT is undecidable. It even includes the notion of “reduction” itself.

Let me explain it a bit. The first line says that there exists a function (named Reduce) from the pair (DATM, (M,w)) to another pair (DHALT, (M',w)), where M' = <if (M @ w) then accept else loop> is a description of a Turing machine.

Now let’s look at the last line:

DATM @ (M,w) = DHALT @ (M',w)

It says: if we have a decider for HALT (DHALT), then we can use it to define DATM, thus deciding ATM.

Why is this a valid definition of DATM? Because from the definition of M',

<if (M @ w) then accept else loop>

we know that:

  • If (M @ w) accepts, M' halts and accepts, thus DHALT @ (M',w) accepts
  • If (M @ w) rejects, M' loops, thus DHALT @ (M',w) rejects
  • If (M @ w) loops, M' loops, thus DHALT @ (M',w) rejects

Notice from these three cases that DHALT @ (M',w) accepts if and only if M accepts w. Thus this defines a decider for ATM.
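Continuing the Haskell sketch from above (loop and makeM' are again my own names), the wrapper machine M' can be written directly; note that it ignores its own input:

-- A result that never arrives: the machine runs forever.
loop :: Result
loop = loop

-- Build M' from M and the fixed input w, exactly as in
-- <if (M @ w) then accept else loop>.
makeM' :: Machine w -> w -> Machine x
makeM' m w = \_ -> case m w of
                     Accept -> Accept  -- M accepts w: M' halts and accepts
                     Reject -> loop    -- M rejects w: M' loops
                     -- if M loops on w, evaluating (m w) itself loops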

So if DHALT exists, then we can construct DATM. But this contradicts the fact that DATM cannot exist (by the usual diagonalization argument), so DHALT must not exist.
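In the sketch, this step is a one-liner (dATMvia is hypothetical; no real argument can ever be passed to it):

-- Any decider for HALT would immediately yield a decider for ATM.
dATMvia :: Machine (Machine w, w)   -- a hypothetical decider for HALT
        -> Machine (Machine w, w)   -- ... gives a decider for ATM
dATMvia dHALT (m, w) = dHALT (makeM' m w, w)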

If you wonder how this proof corresponds to Definition 1, here are the details of how it is instantiated:

  • TM = DATM (nonexistent)
  • TM' = DHALT (hypothetical)
  • I = (M,w), where M is the description of a Turing machine and we ask whether M accepts the input w
  • I' = (M',w), where M' is <if (M @ w) then accept else loop>
  • TM @ I = DATM @ (M,w) (running the decider DATM on the input (M,w))
  • TM' @ I' = DHALT @ (M',w) (running the decider DHALT on the input (M',w))

This is a lot more concise, accurate, and easier to understand than a natural-language paragraph such as:

F = "On input <M,w>:
  1. Construct the following machine M'
     M' = "On input x:
        1. Run M on x.
        2. If M accepts, accept.
        3. If M rejects, enter a loop."
  2. Output <M',w>."
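In the sketch, the machine F from this paragraph is just makeM' paired with the unchanged input (reduceATMtoHALT is my own name):

-- The transformation F above, expressed in the sketch: map <M,w> to
-- <M',w>, leaving w untouched. Note this has type Reduction from before.
reduceATMtoHALT :: (Machine w, w) -> (Machine w, w)
reduceATMtoHALT (m, w) = (makeM' m w, w)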
 

ADTs and objects

After reading William Cook’s essay On Understanding Data Abstraction, Revisited, let me try to condense the difference between abstract data types (ADTs) and objects into a few words.

(To avoid ambiguity, I use “instances” to stand for data created by instantiating ADTs.)

  • “Instances” created by the same ADT share the same functions (methods). The functions may be parameterized, but the code is the same for every instance. Because of this, all instances must share the same representation.
  • “Objects” (as proposed by Cook) don’t necessarily share function code. Each object may carry a completely different set of functions (only the names and types must match). This is in essence the same as “call-backs” in GUI widgets. Because of this diversity, each object may have a completely different representation, as the sketch below illustrates.
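A minimal Haskell sketch of the object view (all names here are my own, for illustration): an object is just a record of functions, and two “sets” can share an interface while having entirely different code and representations:

-- An object-style set of integers: nothing but its interface.
data IntSet = IntSet { contains :: Int -> Bool }

-- A set with no stored elements at all: membership is computed.
evens :: IntSet
evens = IntSet { contains = even }

-- A set represented by a list: membership is a lookup.
fromList :: [Int] -> IntSet
fromList xs = IntSet { contains = (`elem` xs) }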

Ironically, mainstream object-oriented languages like Java and C++ usually give you the former. In Java you can approximate the latter with interfaces and anonymous classes, but it is awkward. JavaScript’s prototype-based system comes closer to the essence of the latter, but it still doesn’t feel natural.

Unlike Cook, though, I think it is better not to treat binary operations like set union as methods. Union may take two sets with different representations and create a new one, so it can operate by using the “true methods” of both arguments (possibly iterators and membership tests), as sketched below.
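Continuing the earlier IntSet sketch, union can be a free-standing function that touches nothing but the public interfaces of its two arguments:

-- Union never looks inside its arguments; it only calls their methods.
union :: IntSet -> IntSet -> IntSet
union s t = IntSet { contains = \x -> contains s x || contains t x }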

 

A bug in GHC’s type system

Several days ago, I implemented an experimental type inference system with first-class polymorphism. While comparing it with other systems, I found a possible bug in GHC’s type system regarding universal quantification. The phenomenon was confirmed and reproduced by people on the #haskell IRC channel for GHC versions above 7.0.1. The code that causes trouble is:

{-# LANGUAGE ImpredicativeTypes #-}  -- needed for a list of polymorphic values

gen :: [forall a. a -> a]
gen = [id]
test1 = head gen 1

Obviously this program should typecheck, since:

  • id has the type forall a. a -> a.
  • A list gen containing id should have type [forall a. a -> a] (as in the annotation).
  • head has the type forall a. [a] -> a.
  • head gen should have the type forall a. a -> a.
  • head gen should therefore be applicable to 1.

But GHC rejects this program with a strange error:

Couldn't match expected type `t1 -> t0'
with actual type `forall a. a -> a'
Expected type: [t1 -> t0]
Actual type: [forall a. a -> a]
In the first argument of `head', namely `gen'
In the expression: head gen 1

On the other hand, it typechecks if (head gen) is first bound by a let:

test2 = let hg = head gen in hg 1

This doesn’t break the soundness of the type system, since it only rejects some correct programs, but such peculiarities can be painful when they add up. I guess this may be caused by the interaction between GHC’s internal type system and its legacy let-polymorphism.

 

 

On point-free programming

Concatenative programming, or point-free style, is sometimes useful, but it has serious drawbacks similar to those of the SKI combinators. Applicative programs can easily be compiled into point-free style, but writing and reading point-free code directly, at a large scale, is usually a mental burden.

It works well only for functions with one or two arguments. Composing functions with more than two arguments is not so convenient: if the receiver expects its arguments in an order different from the one in which the sender produces them, you need a non-trivial permutation of the argument order.

For example, suppose a function is defined as:

f :: Int -> String -> Bool
f x y = ...

If you want to use it as the predicate for filtering a list of strings, that’s fine: you just write something like filter (f 2) .... But what if you want to filter a list of integers? Then you need to swap the first two arguments before you can partially apply the function, so you write filter (flip f s) ... for some fixed string s. Fine. But what if the function looks like:

g :: Int -> A -> String -> Bool
g x y z = ...

And you want to filter a list of A’s? Which function do you use to switch the argument order, and can you expect the reader of the program to recognize it?
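As a small illustration, here are the two options side by side (the body of g and the type A are placeholders of my own):

data A = A Int

g :: Int -> A -> String -> Bool
g x (A n) z = n > x && not (null z)   -- placeholder definition

-- With a named argument, the intent is immediate:
keepNamed :: [A] -> [A]
keepNamed = filter (\y -> g 2 y "hello")

-- Point-free, the same predicate needs an argument shuffle:
keepPointFree :: [A] -> [A]
keepPointFree = filter (flip (g 2) "hello")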

What about functions with four arguments? Notice that there are 4! = 24 different argument orders. How many order switchers would we need?

To avoid this kind of plumbing, we have to take unnecessary care when deciding the order of parameters, which often makes functions look ugly.

Names are more convenient. Notice that even mathematics uses names when writing permutations (as in the two-line notation of algebra):

(a b c d)
(b a d c)

Concatenative programming is like wiring up the components of a circuit. But even electronic engineers don’t do it that way: they use netlists, with names and labels.

Names are essential and useful in most cases. Concatenative programming, although nice when used sparingly, may not be a good primary means of composition.

At this point, I found this sentence from Tao Te Ching (Chapter 1) especially relevant:

“The nameless is the origin of Heaven and Earth
The named is the mother of myriad things”