Basic constructs (Unlambda)

From LiteratePrograms
Jump to: navigation, search
Other implementations: AmigaE | C | C++ | Java | Unlambda

This program is under development.
Please help to debug it. When debugging
is complete, remove the {{develop}} tag.

This page describes some basic constructs of Unlambda.

Contents

[edit] Function calls

From a language point of view, there's only one basic construct in Unlambda: Function application. Function application looks like this:

<<function application>>=
`applied function argument

where first the applied function and after that the argument (which itself is a function) is evaluated, and then the function is called in the argument. For example, as applied function we can use a built-in function which, when called, prints an "a" and returns its argument unchanged.

<<applied function>>=
.a

Likewise, the argument is a built-in function which just prints "b" when called, and returns its argument unchanged:

<<argument>>=
.b

Now the following program, which contains just the above function call, just prints "a", because the only function called is .a. The argument is evaluated, but not called (in this case, the argument doesn't contain a function call at all, therefore evaluation just gives the named function):

<<funcall.unl>>=
function application

The following program shows that the function argument is evaluated by using the above function evaluation as argument for another function call, where the argument is .c.

<<funcall2.unl>>=
`function application .c

This program prints "ab": First, the applied function of the outer function call is evaluated. This is just the function application above, which prints a and returns .b. The argument doesn't contain a function call. Now the result of the first function application, i.e. .b, is applied to the argument .c. This call prints "b" and returns the argument .c.

The following shows evaluation of the argument:

<<funcall3.unl>>=
`.cfunction application

In this case, the function application above is the argument of the function .c. Evaluation of that function prints "a" and returns .b, the latter being used as argument to .c which prints "c" and again returns its argument. Therefore this program prints "ac".

Note that Unlambda functions only take a single argument. However multi-argument functions can be defined indirectly by currying. Say you want to implement a function (x,y)\mapsto f(x,y). Then you actually implement x\mapsto(y\mapsto f(x,y)). That is, your function takes only the first argument, but doesn't calculate f, but returns a second function which expects the second argument and then calculates f on both arguments.

[edit] Built-in functions

Unlambda has several built-in functions. Those functions are used as the building block to create other functions.

[edit] Pure functions

Pure functions are functions which do not have any side effects (like I/O or changing the program flow) and do not depend on global state.

[edit] The constant function creator k

The function k takes one function as argument and returns a constant function with that value, i.e. a function which ignores its argument and simply always returns the same value. For example, the following constructs the constant function which always returns .a:

<<constant .a>>=
`k.a

Applying that function to anything returns .a, for example:

<<get .a from constant function>>=
`constant .a .b

The following program proves this by applying the result to yet another function, which causes .a to be printed, thus showing that .a was indeed called:

<<consta.unl>>=
`get .a from constant function .c

Note that .b could have been replaced by an arbitrary unlambda expression, except that any side effects from evaluating it will of course be seen.

Another way to describe this function is that it is a curried function which takes two arguments and returns the first.

[edit] The substitution function s

The substitution function is a curried function that takes three arguments, applies both the first and the second separately to the third, and then applies the result of the first application to the result of the second. In other words, ```sABC is equivalent to ``AC`BC (except for the order of evaluation of A, B and C). The following program demonstrates this:

<<substabc.unl>>=
```s.a.b.c

This prints "abc" as can be seen as follows: By the rule above, the program is equivalent to ``.a.c`.b.c. Now `.a.c prints "a" and returns .c and `.b.c prints "b" and returns "c". Finally, `.c.c prints "c" and returns .c.

Note that s and k are already sufficient to make the language Turing complete.

[edit] The identity function i

The identity function is quite simple: It just returns its argument unchanged. Actually, it could be considered syntactic sugar, because it's equivalent to ``skk. The following code demonstrates the identity function by applying it to .a and then applying the result to another function (.b</code), which has the same effect as directly applying <code>.a to .b, which prints "a" (and returns .b).

<<identity.unl>>=
``i.a.b

[edit] The self-returning function v

The function v is a special case of a constant function: It always returns itself. Like i, it can be considered syntactic sugar, because it can be written as ```s``s`kskk``s``s`kskk.

[edit] Evaluation control

The functions in this section are used to change the order of evaluation.

[edit] The delay function d

Application of the delay function causes its argument not to be evaluated. Since it modifies the normal evaluation algorithm, it's strictly speaking not a function, but a special form. It returns a function which, when called, evaluates the argument of d and then applies that to the argument of the call. The following example demonstrates how d prevents immediate evaluation:

<<delayed function>>=
`d`.a.b
<<delay1.unl>>=
delayed function

Normally, evaluating the argument `.a.b would cause "a" to be printed (and the argument to d to be .b. However, due to the special properties of d the argument remains unevaluated, and the program doesn't print anything.

The following example demonstrates the delayed evaluation of the argument when the resulting function is called. Here, it is applied to the identity function, which causes first the argument to d (`.a.b) to be evaluated (printing "a" and returning .b), and then the result of that evaluation to be applied to the argument (here, i), which in this case causes "b" to be printed. Therefore the result is exactly the same as if the code had been ``.a.bi

<<delay2.unl>>=
`delayed function i

In some sense, this function can also be considered syntactic sugar, because the same effect can be achieved by replacing `dF with λx.`Fx (or actually, the result of doing abstraction elimination on it). To demonstrate that, we note that λx``.a.bx is

<<delayed function without d>>=
``s``s`k.a`k.bi
<codeblock>
Just as with using <code>d</code>, this doesn't print anything when used directly:
<codeblock language="unlambda">
<<delay1a.unl>>=
delayed function without d

But if called, it does print "ab":

<<delay2.unl>>=
`delayed function without d i

[edit] The call with current continuation function c

The function "call with current continuation" (call/cc for short) is the functional language equivalent to goto. What it does is to capture the current program state in a so-called continuation, and pass that to the function passed as argument. The continuation is a special function which replaces the current program state with the restored one, except that it places its argument at the point where the original call/cc was.

Since this description is quite abstract (and the call/cc isn't exactly an easy concept to grasp), here an example program which prints "ab":

<<callcc.unl>>=
``c.a.b

This program runs as follows: First, the call/cc function c is called on .a. This generates a continuation, which could be denoted by the pseudo-Unlambda code <`$.b>. In between the angle brackets is the captured state, with the $ noting the place where the call to call/cc had been. This continuation is now passed as argument to the argument of c, i.e. .a. Therefore after the call to c, the code looks like this:

``.a<`$.b>.b

What happens next is obvious: The function .a is applied to the continuation. This prints "a" and returns the argument, i.e. the continuation, unchanged. Therefore now we have

`<`$.b>.b

At this place, we have a call of the continuation. This means to replace the dollar sign in the continuation by the argument (the .b following the continuation, this gives <`.b.b>), and then replacing the complete expression to evaluate with the content of the continuation, so we now only have to evaluate

`.b.b

that is a simple application of .b which prints "b" and returns its argument (which happens to be .b as well). Therefore the complete output of the program is "ab".

[edit] The exit function e

The exit function, introduced in Unlambda 2.0, just terminates the program, returning its argument (which most Unlambda implementations ignore anyway). It can be considered syntactic sugar because the same effect can be achieved by rewriting the program to a function taking an exit function as argument, and then using a top-level call/cc (c) to supply a continuation to it. The continuation of a top-level call/cc is of course just <$>, and therefore any call to it just replaces the current expression to evaluate with the argument of the call, which in effect is exactly what e does.

The following program demonstrates the use of e:

<<exit.unl>>=
`.a`e`.bi

This prints "b" and then exits. The function .a is never executed since the exit function is executed first (on evaluating the argument of .a).

Note: The perl implementation in the Unlambda-2.0.0 distribution only does this correctly when single-stepping; when just running, it incorrectly prints "a" as well.

[edit] Input/Output functions

The functions in this category are for reading characters from standard input and writing them to standard output. All of them either have side effects or depend on global state, therefore none of them is pure.

[edit] The character output functions .x

For each character x, there's a function named .x which acts like the identity function, except that it has the side effect of printing the character x to standard output. For example, applying .a to a function outputs "a" and returns its argument unchanged. Actually these functions have already been used in all examples above. The following is a simple program which simply prints "a" (and returns the identity function):

<<printa.unl>>=
`.ai

[edit] The newline function r

The function r starts a new line. It is equivalent to .newline. The following program prints an empty line:

<<println.unl>>=
`ri

[edit] The character read function @

Since Unlambda 2.0, also standard input is supported. For this, the concept of the current character is introduced.

The current character is read from standard input with @. This takes as argument one function, which is called either with i if a character could be read, or with v if no character could be read (EOF or read error). This function has the side effect of consuming one character from standard input.

[edit] The character test functions ?x

This function compares the current character to the character x and applies its argument to i if they are equal, and v otherwise. If there's no current character (@ was not yet called, or the latest call failed), the characters are considered not equal.

This function has no side effects, but depends on global state.

[edit] The character reprint function |

If there is a current character (x), this calls its argument on the corresponding output function .x. Otherwise it calls its argument on v.

[edit] Basic data types and control structures

Since Unlambda doesn't provide any data types except functions, all data has to be created using functions. In this section, definitions of some basic data types and control structures are defined. Those can be used as a sort of standard library for Unlambda programs.

[edit] Pseudofunctions

This section will not only define functions, but also what I'll call pseudofunctions. Pseudofunctions are not really a genuine Unlambda concept, however they are an useful concept when writing Unlambda programs.

To explain what I mean with pseudofunctions, let's take an example. As described above, the function `kF (where F is an arbitrary function) is a constant function which ignores its argument and returns F. Now, how would one write a function which takes two arguments (in curried form), ignores them, and returns F? Well, this obviously can be achieved with `k`kF: Applying to the first argument ignores that and returns `kF, which then applied on the second argument ignores that, too, and returns F.

Now if you look at the function `k`kF, on a superficial view it looks like a function application: It has the typical form `«something»F. However, that «something» is k`k which clearly isn't an Unlambda function. This is what I call a pseudofunction: If in a function call you put it lexically between an application operator and an argument, it gives a valid Unlambda expression which acts just as if you had a function there, except that it isn't a function (and isn't evaluated as single block). Note, however, that pseudofunctions cannot be used as arguments; there you need true functions.

Now of course you could also make a true function which has the same effect, so why bother with pseudofunctions at all? Let's look at the example above: The Unlambda function corresponding to the pseudofunction above is, of course, λF.`k`kF, which after abstraction elimination gives ``s`kkk. That is, the function above could also be written as ```s`kkkF, where now the text between the initial ` and the final F is a true Unlambda function. However, this function expression is more complicated (in this case, the extra complication is minor, because the pseudofunction is short, and in additions some shortcuts in abstraction elimination could be applied; however, this isn't true in general; e.g. the pseudofunction .o`.l`.l`.e`.H gets expanded to the true function ``s`k.o``s`k.l``s`k.l``s`k.e``s`k.Hi), and will also take more time to evaluate (because the evaluation forst basically will re-derive the expression `k`kF' and then evaluate that).

The general form of a pseudofunction is a list of Unlambda functions separated by `, i.e. F1`F2`…`Fn. In the following, pseudofunctions will be named by prepending a star to the name of the function.

The concept can be generalized to more arguments: A two-argument pseudofunction only gives a valid Unlambda expression (and thus a function) if "applied" to two functions, and the same for more arguments. An n-argument pseudofunction will be denoted by starting with n stars. As a general rule, an n-star chunk has to be immediately preceded by n evaluation operators (`) to get a valid Unlambda expression.

[edit] Boolean values and operations

Since Unlambda 2.0, there's an "official" boolean representation, which is fixed by the fact that the input functions use it:

<<true>>=
i
<<false>>=
v

With this choice, the and function is quite easy: Just apply the first boolean to the second. If the first one is true, i.e. i, the second argument will be returned, while false, i.e. v, will just return itself. The and function therefore is λxy.`xy, i.e.

<<and>>=
i

On the other hand, the not function is quite complicated: Since there's no comparison function for functions, the only way to distinguish between i and v is to call it. However, v just returns itself, therefore to determine if the called function was v or i, you have to determine whether the return value was v or the argument. This would, of course, again be done by calling the function. But calling v always returns </code>v</code>, but we want a function which returns i when given v. So it seems we are stuck, and if we could only use the pure functions, we would be. However, there is a way out: Continuations: If a continuation is called, the current state of evaluation is replaced by the state stored in the continuation. That way we can distinguish between a continuation and a pure function without looking at the return value. So ``i<continuation>F will call the continuation with F (and therefore make the call/cc return with F) while ``v<continuation>F will just consume the continuation, and subsequently F. So if we ignore that value and just return G (i.e. apply the constant function `kG), that will only be evaluated if the applied function was not v. For the not function, of course F<code> is false and <code>G is true. The continuation is, of course, supplied as an argument by c, so the not function is given by

λB.`cλC.``k«true»``BC«false»

which after abstraction elimination gives

<<not>>=
``s`kc``s`k`s`k`ktrue``ss`k`kfalse

From the above explanation, one sees immediately one way how to write an ifthenelse function: Instead of using v</code for <code>F<code> and <code>i for G, one could simply pass them as function arguments, i.e. use

λBFG.`cλC.``kG``BCF

which would give a huge expression after abstraction elimination. However, it turns out that there's a better (i.e. shorter and more efficient) way to implement it:

Consider what happens if you supply true as first argument: Then you get a function which takes two arguments and returns the first. Also, if you supply false, you get a function which takes two arguments and returns the second. Therefore we can avoid the extra overhead of abstraction elimination by simply using those two functions for F and G. The functions are, of course:

<<return first of two arguments>>=
k
<<return second of two arguments>>=
`ki

Therefore the ifthenelse function reads:

<<ifthenelse>>=
``s`kc``s`k`s`k`kreturn second of two arguments``ss`k`kreturn first of two arguments

With the above explanation, it's also now obvious how to write the or function: If the first argument of or is false, the result equals the second argument; G therefore is the identity function. If however the first argument is true, the result is true, independent of the second argument, therefore F is `k«true». Therefore the or function reads

<<or>>=
``s`kc``s`k`s`k`ki``ss`k`k`ktrue

Let's test the functions above. If everything is correct, the first of the following programs should print a line containing "T", while the second one should write "F" in a line:

<<testiftrue.unl>>=
`r` ```ifthenelse true .T .F i
<<testiffalse.unl>>=
`r` ```ifthenelse false .T .F i

Testing the not function is equally easy:

<<testifnottrue.unl>>=
`r` ```ifthenelse `not true .T .F i
<<testifnotfalse.unl>>=
`r` ```ifthenelse `not false .T .F i

For the further tests, we first define a function which prints out a boolean. It's simply:

λb ````«ifthenelse» b .T .F b

where the b at the end means that apart of the side effect of printing the value, it acts as an identity function. Abstraction elimination gives

<<printbool>>=
``s``s``s``s`kifthenelsei`k.T`k.Fi

Using k to chain the different invocations together (we don't actually care about the results after printing them), the test program for and reads

<<testand.unl>>=
`r
``k `printbool ``and false false
``k `printbool ``and false true
``k `printbool ``and true false
    `printbool ``and true true

and should print "FFFT".

Testing or works the same way:

<<testor.unl>>=
`r
``k `printbool ``or false false
``k `printbool ``or false true
``k `printbool ``or true false
    `printbool ``or true true

This time we should get "FTTT"

Note: This seems to trigger a bug in the Perl implementation of unlambda-2.0.0, but the Java version executes it without problems, and with the correct output.

[edit] Tuples

The simplest compound data structure is the pair. A pair holds two data items, i.e. in Unlambda two functions. A pair can be implemented by a function which takes another function as argument and applies that function to the two elements of the pair. The pair constructor takes the two arguments and returns that function, i.e.

λxyf.``fxy

which after abstraction elimination gives

<<pair>>=
``s``s`ks``s`kk``s`ks``s`k`sik`kk

Of course one needs also functions to get the members of the pair back. To get the first member, the standard way is to pass a function to the pair which takes two arguments and returns the first. The resulting function can be seen on the Unlambda homepage (or here on the page Turing machine Simulator (Unlambda). However there's another, shorter way to get the first element (this function will be called «tfirst», for "tuple first", because there will be another first function for lists, «lfirst»).

Consider what happens if you call a pair (which, after all, is just a function) on a continuation. Well, the pair applies that continuation on its two members. More exactly, due to currying, it first applies it to the first member … and at that point the continuation replaces the current evaluation, effectively making the corresponding call/cc return with the first member of the pair. Now the c builtin function just calls its argument on the current continuation, therefore if we call it on a pair, we get its first member back. Or in short, the «tfirst» function can be written simply as

<<tfirst>>=
c

Now, can we do a similar trick to get the second member of the pair? Well, in principle yes, except that this time it gets more complicated than the standard way. However, as we will see in a moment, it still has some advantage over the standard way. The complication comes from the fact that to get the second member, we have to pass `k«cont» to the pair (so the first argument gets ignored, and the second thrown on the continuation). Therefore the «tsecond» function reads

λF.`cλC.`F`kC

which after abstraction elimination gives

<<tsecond>>=
``s`kc``s``s`ksk`kk

Now an obvious extension of the concept of a pair is the concept of a triple, which contains three elements. Such a triple can be defined along the lines of the tuple as

λxyzf.```fxyz

or after abstraction elimination

<<triple>>=
``s``s`ks``s`k`s`ks``s`k`s`kk``s`k`s`ks``s``s`ks``s`kk``s`ks``s`k`sik`kk`k`kk

Now what about out access functions? Well, due to the use of continuations, the access functions stop evaluation as soon as the corresponding member is thrown at them. Therefore «first» doesn't care that the triple would call it on three arguments instead of two, since after the first one the function returns anyway. Similarly, the function «second» doesn't care that the triple would pass a third argument. This would not work with the standard definitions; on those, one would need to have separate definitions for the accessors.

Of course for our triple, we need an access function for the third member. The definition is straightforward:

λF.`cλC.`F`k`kC

or

<<tthird>>=
``s`kc``s``s`ksk`k``s`kkk

Now there are in principle infinitely many tuples, therefore it doesn't make sense to define a separate constructor for each tuple type. Instead, we define a "tuple extender", which takes an n-tuple and a value and returns an (n+1)-tuple of which the first n values are the same as from the n-tuple, and the last one is the one newly provided. An (n+1)-tuple applies its arguments to n+1 arguments; thanks to currying this means first applying it to n arguments, and then applying the result to the additional argument. Applying the function to the first n arguments is done by passing the function to the original n-tuple. Therefore the tuple extender is

λTxf.``Tfx

where T is the tuple to be extended and x is the new value. Abstraction elimination gives

<<tupleext>>=
``s``s`ks``s`kks`kk

By noting that a "0-tuple" doesn't apply its function at all, i.e. is the identity function i, all tuples can be recursively built using «tupleext». For example, an alternative to ``«triple».a.b.c would be ``«tupleext»``«tupleext»``«tupleext»i.a.b.c.

[edit] Church integers

One data type which is needed quite often are integers. A quite standard way to express integers as functions are church integers: The church integer representation of the number n is a function with two arguments, which applies its first argument to the second argument n times. That is, the number 0 is represented by

λfx.x

that is

<<ci0>>=
`ki

the number 1 by

λfx.`fx

i.e.

<<ci1>>=
i

the number 2 by

λfx.`f`fx

or

<<ci2>>=
``s``s`kski

and so on. The number n+1 can be generated from the number n by the successor function, which generates a function which applies its first argument one extra time, that is

λnfx`f`nfx

or

<<cisucc>>=
`s``s`ksk

Addition m+n can be done by calculating the n-th successor of m, i.e.

λmn.``m«cisucc»n

Since we know by definition that m is a pure function, we can simplify that to

λm.`m«cisucc»

which differs from the previous expression only by the fact that m is immediately applied. Abstraction elimination gives

<<ciplus>>=
``si`kcisucc

The product of m and n is formed by applying n m times, i.e.

λmnfx.``m`nfx

or

<<citimes>>=
``s`ksk

One use of Church integers is to define an accessor function for the n-th member of a tuple (note that the first member has the index 0). The basic observation is that you have to quote the continuation n times in the accessor, i.e. `nk«cont». Therefore we can get the n-th function by adding another lambda-dependency of n and replacing k with `nk in «tsecond»:

λnF.`cλC.`F``nkC

which after abstraction elimination gives

<<tci n-th>>=
``s`k`s`kc``s`k`s``s`ksk``s``s`ksk`k`kk

[edit] Lists

While tuples contain a fixed number of values, often the number of items to be stored isn't known from the beginning. In that case, the data can be stored in a list. A list is just a pair where the first item contains the first item of the list, and the second item contains the list of the other elements. There's one exception, however, which is the empty list, «nil». This will be defined as v:

<<nil>>=
v

We of course need a function to distinguish the empty list and a non-empty list, i.e. an empty list and a pair. Now a pair can only be detected by applying it to a function, which means that the function is applied to the two arguments. However, we are not interested in the two arguments, therefore we use a function which ignores its two arguments. Since applying v to anything returns v again (which is «false»), we will return «true», i.e. i for nonempty lists, i.e. pairs. Therefore our function will be called «nonempty». The definition is simple:

&lambdaf.`f`k`ki

or after abstraction elimination

<<nonempty>>=
``si`k`k`ki

The first element of the list is the first element of the pair:

<<lfirst>>=
tfirst

Note that for «nil» this gives «nil» again.

The rest of the list is the second argument of the list. However since this time we don't have to care about higher tuples, we use the standard form which is shorter for the second element:

<<lrest>>=
``si`k`ki

Note that again, this maps «nil» to «nil». Of course, we can also define functions for the second and third element of a list. The second element is just the first element of the rest of the list. This can be encoded in the pseudofunction

<<*lsecond>>=
`lfirst`lrest

or the function

<<lsecond>>=
``s`klfirst``s`klresti

In the same way, the third element can be accessed with

<<*lthird>>=
`lfirst`lrest`lrest
<<lthird>>=
``s`klfirst``s`klrest``s`klresti

Of course, you can again use the Church integers to define a function accessing the n-th element of a list:

λnL.`«lfirst»``n«lrest»L

or after abstraction elimination

<<lci n-th>>=
``s`k`s`klfirst``s``s`ks``s``s`ksk`k`klrest`ki

As with tuples, the first element has index 0.

Download code
hijacker
hijacker
hijacker
hijacker