Basic constructs (OCaml)
From LiteratePrograms
This article describes some basic constructs in OCaml. This is not a tutorial, just a collection of example code. The Objective CAML Tutorial is a fine tutorial introduction to OCaml.
Contents |
Functions
As OCaml is a functional programming language, it seems appropriate to cover function definition first. In OCaml there are a few different ways to declare a function, but the most common is the standard let definition. Here, the general structure of the definition is let funname param1 ... paramN = body of function.
<<let_function>>= let add a b = a + b;;
Note the two semicolons at the end of the definition. This bit of punctuation is used in OCaml to signify the end of an expression. However, it is usually an optional bit of punctuation. When writing code that will be compiled or run through the interpreter one doesn't need to include the semicolons. The only time they are needed is when working with the interactive toploop, then the semicolons are needed to signal to the interpreter that the expression (or set of expressions) you've just typed in are to be evaluated.
Also note that the parameters are separate elements following the function name, not a parenthesized list like in many common languages. This doesn't mean that you can't use that style of function definition, it just means something slightly different. Unlike the definition above, which takes two parameters adds them and returns the result, the definition
<<let_function2>>= let add' (a, b) = a + b;;
creates a function that takes a single parameter which is a two-tuple, deconstructs the tuple, adds the constituent elements and returns the result. These functions are applied as one might expect:
# add 10 15;; - : int = 25 # add' (30, 40);; - : int = 70
Anonymous functions
Another way to define a function is with the fun keyword:
fun a b -> a + b;;
This creates a function without a name (an anonymous function) that adds two values together. This may seem like an odd concept to some, but in functional languages, a function is just a value, like strings or ints, and no one finds the use of unnamed strings and ints odd. As we will see in a bit, this is actually quite useful.
Anonymous functions can be applied in a fairly straight-forward manner:
# (fun a b -> (a * 10) + b) 5 7;; - : int = 57
And as they are simply values, we can use let to bind them to a name:
<<let_anon>>= let add_anon = fun a b -> a + b;;
# add_anon 10 3;; - : int = 13
This isn't immediately useful, but there are places where it comes in handy. Note also that this is equivalent to let construction shown first -- indeed, the let construction is just sugar for this form.
Pattern matching functions
Functions can also be defined using the function keyword and pattern matching. Here, function accepts a single value and tries to match it to one of several patterns.
<<let_function>>= let silly = function | 0 -> "zero" | 1 -> "one" | x when x < 2 -> "negative number" | _ -> "some other number"
Here, the patterns are separated with the | character (the initial one is optional), and they are constructed by specifying the pattern to be matched, followed by a ->, followed by the code to perform if that case matches. The patterns are checked in order, and only the first match is used, so in the example above, passing a 0 into silly will result in "zero" being returned, and not "negative number", even though it matches the pattern x when x < 2> as well. The final pattern _ is a pattern that matches all values, and here would be matched by all numbers greater than 1. Another variant of this would use a pattern like | x -> "something". Here, the x matches anything, but instead of throwing away the value that was matched, it is bound to the name x, and can be used in the code for that case, e.g. | x -> sprintf "You passed in: %d" x.
Note that these constructs can be combined:
<<combo_function>>= let sillier a = fun b -> function | 0 -> a | 1 -> b | x when x < 2 -> a + b | _ -> a * b
Here, sillier is a function that takes three values, integers a and b, and a third integer that selects the operations we perform on a and b.
# sillier 10 20 0;; - : int = 10 # sillier 10 20 1;; - : int = 20 # sillier 10 20 (-1);; - : int = 30 # sillier 10 20 100;; - : int = 200
Recursive functions
If one wishes to define a recursive function, they must use the let rec construct to inform the compiler/interpreter, that the specified name is to be used recursively.
<<factorial>>= let rec factorial n = if n <= 1 then 1 else factorial (n-1) * n
This can also be used with the function keyword in a useful way. Here we compute the length of a list using pattern matching. First we check if the list is empty, if it is, then the length is 0. If not, we match it against some unused element and the tail of the list, and we then recursively call length on the tail of the list and add 1 to the result.
<<length>>= let rec length = function | [] -> 0 | _::t -> length t + 1
Higher order functions
As we have seen, functions are values in OCaml, just like ints, floats, and strings, and so we can pass them into functions as arguments, as well as create and return them from functions. Functions that accept other functions as parameters are referred to as higher order functions (HOFs). An example of this is a function that searches a list for the first element that matches some predicate.
<<find>>= let rec find test = function | [] -> raise Not_found | h::t -> if test h then h else find test t
Here we pass an anonymous function into find in order to find the first value greater than 10 and then to find the first negative value. The last example looks for the first multiple of 11, and not finding any raises the built-in exception Not_found.
let l = [1; 1; 2; 3; 5; 8; 13; 21; -7; 600];; find (fun x -> x > 10) l;; find (fun x -> x < 0) l;; find (fun x -> x mod 11 = 0) l;;
Currying
Currying is (in essence) a way looking at functions of multiple arguments as a series of functions of one argument that return functions that work with the other arguments. In OCaml, this makes sense, as functions are values, and can be created and returned by other functions trivially. As an example of what this means, consider the first function example on this page, let add a b = a + b. We generally think of this as a function that accepts two arguments and returns a value. Really, it's not that at all.
OCaml converts that function definition under the covers to the equivalent:
<<curry>>= let add = fun a -> fun b -> a + b
What this form means is that add is a function that takes a single argument, binds its value to the name a and returns an anonymous function that takes a single argument, binds its value to the name b, and then computes the value a+b according to the bindings from the function applications and returns the result. Why this is important is that it allows us to use partial application to specialize the functions we have. For example:
<<partial>>= let add1 = add 1;; let add10 = add 10;;
Here, we have used Currying to our advantage to create two new functions, add1 and add10 that each take a single argument and add 1 and 10 to that argument, respectively.
# add1 10;; - : int = 11 # add10 10;; - : int = 20
Conditionals
There are two forms of conditionals in OCaml, if/then expressions and pattern matching.
If/Then
The if then construct is fairly straight-forward, and similar to similar constructs in other languages. The basic form of the if/then is if test then e1 else e2, where test is an expression that evaluates to a boolean value, and e1 and e2 are expressions that are evaluated in the true and false branches of the test respectively. The primary catch with if/then in OCaml is that both e1 and e2 must have the same type, so expressions like if test then 1 else "two" are not allowed. Additionally if the type of the true branch is unit, then the else branch is not required, e.g. if test then print_string "true!" is equivalent to if test then print_string "true!" else (). Note, however, that this con lead to mistakes where multiple unit valued expressions are to be executed in a given branch. For example:
if test then print_endline "hi"; print_string "bye" else print_endline "what?"
This is parsed as
( if test then print_endline "hi" ); print_string "bye" else print_endline "what?"
Which results in a syntax error. More troublesome is something like the following:
if test then print_endline "hi" else print_string "bye"; print_endline "what?"
Which is parsed as:
( if test then print_endline "hi" else print_string "bye"; ); print_endline "what?"
So the print_endline "what?" expression is always evaluated. The reason for this is that the semicolon has lower precedence than the if/then expression. To fix this, wrap multi-statement blocks in parens or begin/end pairs
if test then
(
print_endline "hi";
print_string "baby"
)
else
begin
print_string "bye";
print_endline "what?"
end
Pattern Matching
Pattern matching is a powerful construct of OCaml that allows for conditional branching based on the structure and content of a value. Pattern matching is initiated in OCaml in one of two ways: the first is the function keyword, described above, and the second is the match ... with construct. The general form of the match ... with construct is as follows:
match exp with | patt_1 -> ex_1 ... | patt_n -> ex_n
That is, the keyword match followed by an expression that evaluates to a value, followed by the with keyword, then a series of pattern/expression pairs, separated by | characters (the first one is optional). The value that exn evaluates to is checked against the patterns, in order of definition, and if one matches (say patt_i), then the entire match expression reduces to the corresponding expression (ex_i).
A pattern can be any of the following recursive constructs:
- A constant value : one of { integer-literal, float-literal; char-literal, string-literal, constant variant value,
false,true,[],(), constant polymorphic variant } -
_: matches any pattern and throws away the value - variable-name : matches any pattern and binds the value to the variable name for use in the corresponding expression
- pattern as variable-name : the value that matches pattern is bound to variable-name for use in the corresponding expression
- ( pattern )
- ( pattern : type-annotation )
- pattern | pattern : multiple patterns may correspond to the same expression
- constr pattern : Here, constr is one of OCaml's variant type values, and pattern is its associated data
- `tag-name pattern : Here, tag-name is one of OCaml's polymorphic variants, and pattern is its associated data
- # typeconstr-name : used to match some set of polymorphic variants
- pattern, pattern, ..., pattern : these are tuples of patterns
- { field-name_1 = pattern; ... ; field-name_n = pattern} : matching against records
- [ pattern ; ... ; pattern] : matching against a list
- pattern :: pattern : matching against consed values
- [| pattern ; ... ; pattern |] : matching against arrays
Some example patterns are
-
[]: the empty list -
h::t: a non empty list, binding the head and tail to the variableshandt, respectively -
h::_: a non empty list, binding the head tohand throwing away the tail -
(_,_,x): a three-tuple where the first two values are thrown away and the third is bound tox - h::(h2::_ as t) : a list with at least two elements in it, where the head is bound to
h, the second element is bound toh2, and the list minus its first value is bound tot -
Some ((h1,(h2 : float),_,1)::next::_): aSomeconstructer with associated data that is a list of four-tuples, at least two elements in length, where the first value in the tuple at the head is bound toh1, the second is bound toh2and is restricted to being a floating point value, the third is thrown away, and the fourth is a 1. The second element is bound tonext, and the rest of the list is thrown away.
This kind of feature is very powerful, and lets us write very succinct code that operates on the structure of data. For example, a function that throws away every other element of a list can be written simply:
<<odds>>= let rec odds = function | ([] | [_]) as l -> l | h::_::t -> h::(odds t);;
Here the first pattern matches lists with zero or one elements and returns that list, while the second pattern matches all lists with at least two elements, throws away the second, saves the first, and conses it onto the list created by throwing away every other element of the remaining list.
Looping
While looping is often handled via recursive functions in OCaml, e.g.
<<loop_rec>>= let rec loop times f = match times with | 0 -> () | _ -> f (); loop (times-1) f;;
there are other built-in options, the for and while loops:
for counter = start to stop do (* some expression that evaluates to unit *) done
while test do (* some expression that evaluates to unit *) done
The for loop loops over the range provided (both endpoints inclusive), performing whatever code lies within the do and done keywords, rebinding the counter variable to the current value each time. For loops can also count backwards, using downto rather than to. The endpoints of the loop don't have to be constants, they can be any expression that evaluates to an int, and the entire for loop evaluates to (). Note that there is no "step" option as in languages such as C.
While loops are similar to those found in other languages. So long as the conditional evaluates to true, the loop body is evaluated. Like the for loop, the entire while loop evaluates to ().
I/O
Simple I/O
I/O is performed using stdin, stdout, and stderr using the various read, print, and prerr functions respectively.
-
read_line - Flushes stdout then reads from stdin until a newline character is seen. Returns the resulting string, minus the newline.
-
read_int - Flushes stdout then reads from stdin until a newline character is seen. Attempts to convert the resulting string to an int, raising an exception if it cannot.
-
read_float - Flushes stdout then reads from stdin until a newline character is seen. Attempts to convert the resulting string to a float, raising an exception if it cannot.
-
print_char - prints a character to stdout
-
print_newline - prints a newline character to stdout
-
print_string - prints a string to stdout
-
print_endline - prints a string to stdout followed by a newline character
-
print_int - prints an int to stdout
-
print_float - prints a float to stdout
-
prerr_char - prints a character to stderr
-
prerr_newline - prints a newline character to stderr
-
prerr_string - prints a string to stderr
-
prerr_endline - prints a string to stderr followed by a newline character
-
prerr_int - prints an int to stderr
-
prerr_float - prints a float to stderr
Formatted printing can be done via the Printf module which contain (among other things) functions printf and sprintf that use format strings similar to those of C. E.g.
<<printf>>= Printf.printf "0x%0X\n" 500;; let s = Printf.sprintf "%d: %s --\t%08.3f" 7 "whoopie" 1.4432578;;
will print 0x1F4 followed by a newline to stdout, and bind the string "7: whoopie -- 0001.443" to s.
File I/O
Files are usually worked with via input and output channels. To open the file data.in for reading and the file data.out for writing, one would do the following:
<<channels>>= let in_channel = open_in "data.in";; let out_channel = open_out "data.out";;
The channels can now be read from and written to using the various input and output functions. For example:
<<io>>= let line = input_line in_channel;; output_string out_channel line; output_char out_channel '\n';;
Will read the next line from in_channel (connected to data.in), and copy that line followed by a newline character into out_channel (connected to data.out). Doing this repeatedly would copy data.in to data.out.
In addition to the simple I/O functions, formatted data can also be written using the Printf module's fprintf function. Here, Printf.fprintf channel format arg1 ... argN will print arguments 1 through N to output channel channel according to the format string format.
Command line arguments
Command line arguments are available as a string array named Sys.argv. Sys.argv.(0) contains the name of the program that was executed, and subsequent entries in the array contain the additional arguments, if any, un the order they were presented on the command line. So, to print the name of the program followed by the arguments, each on a new line, one can do:
<<arguments>>= print_endline Sys.argv.(0); for i = 1 to Array.length Sys.argv - 1 do print_endline Sys.argv.(i) done;;
The Program
<<basic_constructs.ml>>= let_function let_function2 let_anon let_function combo_function factorial length find curry partial odds loop_rec printf channels io
See also
| Download code |
