Turing machine simulator (Unlambda)

From LiteratePrograms
Jump to: navigation, search
Other implementations: C | C++ | C++ | Java | LaTeX | OCaml | Scheme | Sed | Unlambda

This is an implementation of a turing machine in Unlambda. Besides demonstrating that Unlambda is indeed Turing complete, it also gives an interesting tension: Unlambda is an intentionally obfuscated programming language, which is quite the antithesis to literate programming.

Since Unlambda is a pure functional programming language (there are strictly only functions, nothing else), it also demonstrates how data structures (like the tape and the state table) can be created out of functions.

The following code implements a turing machine with two symbols. The turing machine is running a three-state busy beaver, as described in [1].

Contents

[edit] Definition of the turing machine

A turing machine consists of several parts:

  • A tape which is infinite on one or both sides, and at each position contains one of a finite set of symbols.
  • A read/write head which can move along the tape one position at a time, read the symbol at the current position, and replace that symbol with another symbol.
  • A state machine, which controls the read-write head according to a table defining the actual program.

The state machine's table contains for each combination of current state and current symbol at the head's position

  • what symbol to write at that position (replacing the current symbol)
  • in which direction (left or right) to move the head afterwards
  • the next state of the state machine

There's a special "halt" state, which causes the turing machine to stop operation. Usually that halt state is not counted, i.e. an n-state turing machine has actually n+1 states: n "working" states and the halt state.

[edit] Definition of the busy beaver

A busy beaver is turing machine with the following properties:

  • It has exactly two symbols. One of them may be called "blank" or "zero", the other symbol may be called "one".
  • It runs on a tape which is infinite on both sides.
  • If started on an empty tape (i.e. one which is filled with the "blank" symbol), it stops after a finite number of steps. After that, the tape contains as many ones as possible.

[edit] Defining the tape and head

The first thing to be defined is the tape. The tape is conceptionally infinite on both sides, but will only ever be accessed at the head position. Therefore an obvious tape representation is a pair whose first element represents the tape up to the head position, and the second element represents the tape starting from the tape position. Note that this implicitly also defines the head, which is completely determined by its position.

Of course, unlambda doesn't natively provide us with a pair type, so we have to construct it from functions. The standard way to represent a pair as function is to have a function which takes another function as argument, which it then applies to both elements (in Unlambda, of course in a curried fashion). To retrieve the first/second element of the pair, one passes a function which returns its first/second argument. The implementation of this concept can be found on the unlambda homepage (see the link above):

<<cons>>=
``s``s`ks``s`kk``s`ks``s`k`sik`kk
<<car>>=
``si`kk
<<cdr>>=
``si`k`ki

The initial empty tape is thus just a pair of two empty half-tapes (defined below). Since we will several times later need to apply a function to two copies of the same argument, we define a function to do just this first. In lambda notation it is

λF.λG.``FGG

and abstraction elimination thereof gives

<<apply-to-twice>>=
``ss`ki

With this helper function, the empty tape now looks as follows:

<<empty tape>>=
``apply-to-twice cons empty half-tape

Now, of course we still have to define the data structure for the half-tapes. An obvious choice (at least to people used to program with functional languages like Lisp or Scheme) would be to build a list of symbols by chaining pairs where the first element contains a representation of the symbol, and the second contains the next symbol (or some terminating value to denote the end of the list, i.e. the empty half-tape following). However, here another representation will be used: The only use of the symbol stored on the tape is to select one of two actions, i.e. to call one of two supplied functions. Now that's exactly what a discriminated union provides. The discriminated union also contains a "payload", which can be used to store the rest of the tape after (or before, if looking at the left half of the tape) the symbol. The discriminated union is represented by a function calling its first resp. second argument on the payload. The discriminated union constructors can also be found on the Unlambda homepage:

<<q1>>=
``s`k`s`kk``s`k`sik
<<q2>>=
``s`kk``s`k`sik

Note that the switch function also provided at the Unlambda homepage is not really necessary, since it's just the identity function. Moreover, in the code below, an explicit switch will never be used, because the representation of pairs allows an interesting trick: Assume that you have a pair of functions, and you apply it to a discriminated union (remember, the pair is just a function). This will cause the discriminated union to be applied to the two functions stored in the pair. But applying the discriminated union to the functions means to execute the function corresponding to the discriminator of the union! To demonstrate this, look at the following example:

<<example: print 1, then apply argument to i>>=
``s``s`k.1i`ki
<<example: print 2, then apply argument to i>>=
``s``s`k.2i`ki
<<example: pair>>=
``cons example: print 1, then apply argument to i example: print 2, then apply argument to i
<<example: discriminated union>>=
`q2 r
<<example.unl>>=
`example: pair example: discriminated union

This example program applies a pair of functions to a discriminated union constructed with q2. This union selects the second function (which prints 2, then applies i to its argument) and applies it to its payload (r, which just ends the line). Therefore this program outputs a line containing only "2".

Now our tape representation is almost complete: We just have to define what to store at the end of the represented tape. While the modeled tape is infinite, we certainly must represent it by a finite data structure. However, it would be nice to avoid special-casing the end of the tape. And this is indeed possible. The key observation is that the only way to identify a function is through its behaviour. So it is sufficient if we have a function which behaves as if there were an infinite number of discriminated unions all constructed with q0. That is, the empty half-tape is represented by a function which, if applied to two arguments, executes the first one on the rest of the tape, which, of course, is again the empty half tape. That is, we need a function which takes two arguments, and applies the first argument on itself.

The above, of course, describes a recursive function. Since Unlambda doesn't provide names for functions (except for the built-in functions, of course), the recursivity has to implemented by a function which takes three arguments: The first argument will be the function to recursively call (i.e. itself), the second and third arguments are the actual arguments to the function. That is, the function has the form

λF.λG.λH.`G`FF

which after abstraction elimination gives

<<empty half-tape function>>=
``s`k`s``s`ksk``s`kk``s``s`kskk

Now, to get the empty half tape, we must pass this function to itself as first argument. Thanks to currying, this can be done up-front. The simple way would be to just write the above function twice, but it's easy to avoid that repetition by looking at the expression ``ciF (where F is an arbitrary function). `c is the call/cc function which applies its argument on the current continuation. In this case, the argument to cis i, thus `ci just returns the current continuation. Applying that to F "goes back in time" and causes `ci to return F, so that the whole expression evaluates to `FF – expactly what we want here. Thus our empty half-tape can be written as

<<empty half-tape>>=
``ci empty half-tape function

To test that our half-tape indeed works, we write a simple test program using the definitions. First we define our test half tape, which contains a 0, followed by an 1, followed by an empty half-tape:

<<testing the half-tape: test tape>>=
`q1 `q2 empty half-tape

For our test, we need a function which prints the first half-tape symbol, and returns the rest of the tape. This is just a pair of two output functions for the two values:

<<testing the half-tape: print symbol>>=
``cons .0 .1

Our test code simply prints the first four values of our half-tape. The first two values must be the ones explicitly given (i.e. 0 and 1), and the next two come from our empty half-tape (both 0, of course). We also print a newline.

<<halftape_unittest.unl>>=
`r`testing the half-tape: print symbol`testing the half-tape: print symbol`testing the half-tape: print symbol`testing the half-tape: print symbol testing the half-tape: test tape

[edit] Printing the tape around the head

When the turing machine runs, we want to see what happens on the tape, so we need a function to print the current tape. Since we explicitly defined our tape to be infinite on both ends, the only distinct position on the tape is the head's current position. Therefore we simply print 6 characters on both sides of the current head's position (i.e. if the head moves left, in the output the tail moves right, and vice versa). Outside of the head's position, we use the symbols '*' and '|' to denote 0 and 1, while at the head's position, we use 'o' and 'I'.

The loops for printing the characters are created using Church integers. Those take two functions and apply the first argument n times to the second (i.e. ``<<2>>FX is `F`FX. The following definitions are straight from the Unlambda homepage (since 1 is the simplest number, and we don't need 0, we start at 1):

<<1>>=
i
<<inc>>=
`s``s`ksk
<<mul>>=
``s`ksk

With those functions we can build up our 6:

<<2>>=
`inc 1
<<3>>=
`inc 2
<<6>>=
``mul 2 3

The symbols left from the head are stored in the first half-tape. Printing them contains the extra challenge that we need to print them in the reverse order we can access them. Since Church integers only allow applying a function n times, we have to built up a function which prints the first n symbols of the given half-tape in reverse order. That is, the iteration function takes a function which prints the first n-1 elements of its half-tape argument in reverse order, and returns a function which prints the first n elements in reverse order. For this we first construct a helper function, which takes as first argument the function to print a symbol, as second argument the function to print the first n-1 characters of a half-tape in reverse order, and as third argument the half-tape itself. The function is

λS.λF.λT.`S`FT

which works because Unlambda evaluates the argument `FT before applying S to it, which prints the symbol. Abstraction elimination then gives

<<print reverse helper>>=
``s`ksk

With this we can then build our iteration function:

λf.λt.```<<cons>>``<<print reverse helper>>.*f``<<print reverse helper>>.|ft

or after abstraction elimination:

<<print reverse iteration function>>=
``s``s`ks``s``s`ks``s`k`s`kcons``s`k`s``s`kprint reverse helper`k.*k``s`k`s``s`kprint reverse helper`k.|k`ki

To start our iteration, we start with a function which simply prints nothing and returns the full tape. That's of course just the identity function. Therefore the function to print the first 6 elements of a half-tape in reverse order is

<<print start of half-tape in reverse order>>=
``6 print reverse iteration function i

Let's test that function on the example half-tape from above:

<<reverseprint_unittest.unl>>=
`r`print start of half-tape in reverse order testing the half-tape: test tape

This should print

****|*

Printing the rest of the tape is much simpler. First we print the symbol on the head, which is the first symbol in the right half-tape, and then we use a straightforward Church integer loop to print the next 6 symbols:

λT.``<<6>>``<<cons>>.*.|```<<cons>>.o.IT

Abstraction elimination then gives

<<print start of half-tape with head>>=
``s``s`k6``s``s`kcons`k.*`k.|``s``s``s`kcons`k.o`k.Ii

We test this function as well:

<<forwardprint_unittest.unl>>=
`r`print start of half-tape with head testing the half-tape: test tape

This should print

o|*****

Finally, we can build up our tape printing function. It just applies the reverse printing function to the car of the tape, and the other printing function to the cdr of the tape. Since we are not interested in the function returns, we just use the k function to chain the two function calls (this discards the result of the second call).

λT.``k `<<print start of half-tape in reverse order>>`<<car>>T `<<print start of half-tape with head>>`<<cdr>>T

or after abstraction elimination

<<print tape>>=
``s``s`kk``s`kprint start of half-tape in reverse order``s`kcari``s`kprint start of half-tape with head``s`kcdri

To test the tape printing function, we just build a tape from two copies of the half-tape used in the tetst of the previous section, and pass it to the tape printing function (and output a newline afterwards):

<<printtape_unittest.unl>>=
`r`print tape``cons testing the half-tape: test tape testing the half-tape: test tape

The output should be

****|*o|*****

[edit] Writing to the tape

Writing to the tape means replacing the symbol at the head position by the symbol to write. The symbol at the head position is, of course, given by the discriminator of the first discriminated union of the second element of the pair describing the tape. The symbol to write can conveniently be described by the corresponding union constructor function. The write function therefore takes two arguments: The symbol to be written, and the tape to write to. It returns the tape with the symbol written.

Since writing to the tape only changes the second element of the pair, the write function can be written as follows:

λS.λT.``«cons» `«car»T ``«write half-tape» `«cdr» T S

which after abstraction elimination gives

<<write>>=
``s`k`s``s`kcons``s`kcari``s`k`s``s`kwrite half-tape``s`kcdrik

Now, writing the half-tape just means unconditionally applying the supplied discriminated union constructor to the payload of the first discriminated union (which is the rest of the half-tape). Since the discriminated union definition only allows conditional application, we have to supply the same function twice. Since the arguments to «apply-to-twice» are exactly the arguments to «write half-tape», the functions are actually the same:

<<write half-tape>>=
apply-to-twice

Finally, we define the symbols by their constructors:

<<symbol 0>>=
q1
<<symbol 1>>=
q2

Finally, we test the functions by the following two test programs. The first one just writes one to the empty tape. Since the resulting tape is used again in the second test, we create an explicit chunk for it:

<<testing: single-write tape>>=
``write symbol 1 empty tape

The first test program just outputs this tape:

<<write1_unittest.unl>>=
`r`print tape testing: single-write tape

This should print

******I******

The second test writes a 0 to the tape just created and prints it:

<<write0_unittest.unl>>=
`r`print tape ``write symbol 0 testing: single-write tape

After that, the tape should be empty again, and thus we should get

******o******

[edit] Head movement

Since the head position is encoded as the first symbol in the right half-tape, moving the tape left means moving the initial symbol from the first half-tape to the second one, and vice versa. The movement functions receive the unmoved tape as argument and return the moved tape. To avoid duplication, a common movement function will be implemented, which gets passed a constructor function (cons or reverse cons) and the selectors to use, in addition to the tape. Then move left and move right can simply be implemented by

<<move left>>=
```movement cons car cdr
<<move right>>=
```movement reverse cons cdr car

where «reverse cons» is defined as

λF.λG.``«cons» G F

which after abstraction elimination gives

<<reverse cons>>=
``s`k`s``s`kconsik

Note that the supplied pair-constructor always takes the moved-from half-tape first. Also the selector for the moved-from half-tape is supplied before the selector for the moved-to half-tape.

To read and remove the first symbol from the moved-from half-tape, we need to pass both the symbol (represented by the corresponding union constructor) and the rest of the tape (i.e. the payload of the symbol) to another function used to produce the final moved tape. This "reversion" is necessary because the only way to get to the symbol and payload is to apply the union to a pair of functions. The function which does this is

λH.λF.``H`F«q1»`F«q2»

where H is the moved-from half tape, and F is a function taking the symbol constructor as first and the rest of the tape (i.e. the symbol payload) as second argument, and constructs the moved-to tape. Abstraction elimination of this function gives

<<read and remove first symbol>>=
``s``s`ks``s``s`ksk`k``si`kq1`k``si`kq2

To construct the moved tape, we have to create a pair (using the supplied pair constructor) of the moved-from tape with the first symbol removed, and the moved-to half-tape with that symbol added. Since the first two arguments of the function are supplied by the mechanism above, the order of arguments is mostly fixed; only the last two arguments could be exchanged. It turns out that abstraction elimination gives a shorter result if the pair constructor argument comes before the moved-to half-tape. The function therefore looks like:

λQ.λF.λC.λT.``CF`QT

where Q is the union constructor representing the symbol, F is the moved-from half-tape with the first symbol already removed, C is the pair constructor, and T is the moved-to half-tape where the symbol is to be added. Abstraction elimination then gives

<<construct new tape>>=
``s`k`s``s`ks``s`k`s`ks``s`k`s``s`ksk``s`kkk``s`kkk

Now the actual movement function is straightforward: It has to supply the above construction function as well as the moved-from half-tape to the reading function, which supplies the first two arguments to the construction function, and then pass the remaining elements, i.e. the pair constructor and moved-to half tape, to the result in order to create the new pair. The movement function therefore is

λC.λA.λD.λT.`` ``<<read and remove first symbol>> `AT <<construct new tape>> C `DT

which gives

<<movement>>=
``s``s`ks``s`k`s`ks``s`k`s`kk``s`k`s`ks``s`k`s``s`ks``s``s`ks``s`k`s`k read and remove first symbol i`k`k construct new tape ``s`kkk`k`ki

We test those functions using the test half-tape from above:

<<moveleft_unittest.unl>>=
`r`print tape`move left``apply-to-twice cons testing the half-tape: test tape
<<moveright_unittest.unl>>=
`r`print tape`move right``apply-to-twice cons testing the half-tape: test tape

The output of the first program should be

*****|o*|****

and the output of the second program should be

***|**I******

[edit] Representation of the state

Like the symbols on the tape, the current state is used to select different functions to perform. Therefore it makes sense to implement the states as discriminated union. Similar to lists, discriminated unions of more than two elements can be built from discriminated unions of two elements by just using another discriminated union as payload for the q2 destructor.

For our busy beaver, we need three states plus a halt state. The constructors for the states are therefore

state A constructor = «q1»
state B constructor = λx.`«q2»`«q1»x
state C constructor = λx.`«q2»`«q2»`«q1»x
state HALT constructor = λx.`«q2»`«q2»`«q2»`«q1»x

which gives

<<state A constructor>>=
q1
<<state B constructor>>=
``s`kq2``s`kq1i
<<state C constructor>>=
``s`kq2``s`kq2``s`kq1i
<<state HALT constructor>>=
``s`kq2``s`kq2``s`kq2``s`kq1i

The discriminated union of course also gets a payload, passed to the constructor on construction. This will be used to hold the current tape/head system.

[edit] The state table

The state table is simply a list of functions, one for each state, which take the current tape/head system and return the new state/head/tape according to the rules of the state table. In addition, those functions print out the current state/tape, so the action of the turing machine can be seen.

The list of functions is implemented conventionally using cons pairs, where the car contains the functions, and the cdr contains the next element. Note that applying such a list to a state as defined above directly causes the right function for the state to be called.

The end of the list is denoted by the function v.

<<state table>>=
``cons state A function ``cons state B function ``cons state C function ``cons state HALT function v

Each function prints the current tape and the name of the current state, then calculates the new state/head/tape system. An exception is the halt state, which simply ends the program after printing the current state. It does so using the function e from Unlambda 2.0, so that the main code can simply be coded as an infinite loop. While that's arguably not the cleanest way to code it, it's the simplest.

In order to print the tape, the newly calculated state/head/tape system is passed to the k function, creating a const function for that type, and then the original tape is passed to the tape print function, followed by the output of a space character. The constant function ignores the result of that print function, and returns the new state/tape system from our previus function call. Finally, we output the state's name and a newline. Note that actually the new tape and state are calculated before the old ones are output; but that doesn't matter because calculating the new state is a pure function without side effects (unlike the output functions which have the side effect of output, thus their relative execution order matters).

The ordinary state functions therefore all have the following structure:

state X function = λT.`r`.X``k `«state X action» T `. `«print tape» T

Abstraction elimination makes this into

<<state A function>>=
``s`kr``s`k.A``s``s`kk``s`kstate A actioni``s`k. ``s`kprint tapei
<<state B function>>=
``s`kr``s`k.B``s``s`kk``s`kstate B actioni``s`k. ``s`kprint tapei
<<state C function>>=
``s`kr``s`k.C``s``s`kk``s`kstate C actioni``s`k. ``s`kprint tapei

The halt state omits the state action part (there's nothing to do in the halt state), but ends the program after the output:

state HALT function = λT.`e`r`.T`.L`.A`.H`. `«print tape» T

Abstraction elimiation gives

<<state HALT function>>=
``s`ke``s`kr``s`k.T``s`k.L``s`k.A``s`k.H``s`k. ``s`kprint tapei

To calculate the new state/head/tape, the action functions read the current symbol and act accordingly. Reading the current symbol means applying a pair of functions, one for symbol 0, one for symbol 1, to the cdr of the tape (which is the argument of the state function). Those functions are then applied to the tape in order to create the new machine state. Note that for the halt state, reading the state table simply gives v, which survives all applications. Therefore a return value of v indicates that the turing machine has halted.

state X action = λT.````«cons»«symbol 0 function»«symbol 1 function» `«cdr»T T

Abstraction elimination then gives

<<state A action>>=
``s``s``s``s`kcons`kstate A symbol 0 function`kstate A symbol 1 function``s`kcdrii
<<state B action>>=
``s``s``s``s`kcons`kstate B symbol 0 function`kstate B symbol 1 function``s`kcdrii
<<state C action>>=
``s``s``s``s`kcons`kstate C symbol 0 function`kstate C symbol 1 function``s`kcdrii

The state/symbol specific functions create the new state of the turing machine from the current tape. This is done by applying the new state constructor to the accordingly written and moved tape. Note that the selection mechanism passes the tape after the head symbol as first argument. Since we need the whole tape instead, the first argument will be ignored, and only the second argument, which is the full tape, will be used. The functions therefore have the general structure

λX.λT.`«new state constructor» `«movement function» `«write function» T

This gives the following functions:

<<state A symbol 0 function>>=
`k``s`kstate B constructor``s`kmove right``s`k`writesymbol 1i
<<state A symbol 1 function>>=
`k``s`kstate C constructor``s`kmove left``s`k`writesymbol 1i
<<state B symbol 0 function>>=
`k``s`kstate A constructor``s`kmove left``s`k`writesymbol 1i
<<state B symbol 1 function>>=
`k``s`kstate B constructor``s`kmove right``s`k`writesymbol 1i
<<state C symbol 0 function>>=
`k``s`kstate B constructor``s`kmove left``s`k`writesymbol 1i
<<state C symbol 1 function>>=
`k``s`kstate HALT constructor``s`kmove right``s`k`writesymbol 1i

Again, a few unit test programs make sure that we did it right. The first couple of functions just tests that the state and tape is written correctly. For this, we just pass the different state each with an empty tape to the state table.

<<statetable_print_A_unittest.unl>>=
`state table `state A constructor empty tape
<<statetable_print_B_unittest.unl>>=
`state table `state B constructor empty tape
<<statetable_print_C_unittest.unl>>=
`state table `state C constructor empty tape
<<statetable_print_HALT_unittest.unl>>=
`state table `state HALT constructor empty tape

Each of the programs prints a line containing an empty tape, followed by the letter for the corresponding state. For example, statetable_print_A_unittest.unl prints

******o****** A

The next couple of programs tests that the resulting state is right. For that, we apply the state table twice, first to get to the resulting state, then to print that state (we could write a separate state printing function, but it's easier this way). Also, we apply it both to the empty tape, and after writing 1 to the empty tape, except for the HALT state, where this would not make a difference.

<<statetable_resultstate_A0_unittest.unl>>=
`state table `state table `state A constructor empty tape
<<statetable_resultstate_A1_unittest.unl>>=
`state table `state table `state A constructor ``write symbol 1 empty tape
<<statetable_resultstate_B0_unittest.unl>>=
`state table `state table `state B constructor empty tape
<<statetable_resultstate_B1_unittest.unl>>=
`state table `state table `state B constructor ``write symbol 1 empty tape
<<statetable_resultstate_C0_unittest.unl>>=
`state table `state table `state C constructor empty tape
<<statetable_resultstate_C1_unittest.unl>>=
`state table `state table `state C constructor ``write symbol 1 empty tape
<<statetable_resultstate_HALT_unittest.unl>>=
`state table `state table `state HALT constructor empty tape

Those functions give the following output (note that since we always print the tape centered around the head, moving the head to the left causes the tape to be moved to the right on output):

statetable_resultstate_A0_unittest.unl:

******o****** A
*****|o****** B

statetable_resultstate_A1_unittest.unl:

******I****** A
******o|***** C

statetable_resultstate_B0_unittest.unl:

******o****** B
******o|***** A

statetable_resultstate_B1_unittest.unl:

******I****** B
*****|o****** B

statetable_resultstate_C0_unittest.unl:

******o****** C
******o|***** B

statetable_resultstate_C1_unittest.unl:

******I****** C
*****|o****** HALT

statetable_resultstate_HALT_unittest.unl:

******o****** HALT

Note that in the last case there's only one line because the program was ended by the state HALT function.

[edit] The main loop

Now that all parts of the turing machine are in place, all left to do is to run it. This is done by simply repeatedly applying the state table to the initial state. Since program termination is already encoded in the halt state function, no end condition is needed. Since Unlambda has no looping construct, the loop is done through recursion. As with the empty half-tape, the recursion is implemented by passing the function to recursively call (i.e. the function itself) as first argument. The second argument is the state/head/tape system to operate on. The function therefore looks like this:

main loop body = λF.λT.``FF`«state table»T

Abstraction elimination then gives

<<main loop body>>=
``s``s`ks``s``s`kskk`k``s`kstate tablei

Initially the turing machine is in state A, and the tape is empty:

<<initial machine state>>=
`state A constructor empty tape

Now all left to do is to recursively run the main loop body on the initial machine state:

<<turing.unl>>=
` ``cimain loop body initial machine state

The output generated by this program is:

******o****** A
*****|o****** B
******I|***** A
******o||**** C
******o|||*** B
******o||||** A
*****|I|||*** B
****||I||**** B
***|||I|***** B
**||||I****** B
*|||||o****** B
**||||I|***** A
***|||I||**** C
**||||I|***** HALT
Download code
hijacker
hijacker
hijacker
hijacker