Merge sort (ACL2)

From LiteratePrograms
Jump to: navigation, search
Other implementations: ACL2 | C | C++ | dc | Eiffel | Erlang | Haskell | Java | JavaScript | Lisp | OCaml | Oz | Perl | Prolog | Python | Ruby | Scheme

This article presents an ACL2 implementation of the merge sort sorting algorithm, including a complete formal proof of correctness. The overall structure of our Lisp file will be:

<<mergesort.lisp>>=
(in-package "ACL2")

;--------------------------------------------------------
; Definitions
;--------------------------------------------------------

definitions

;--------------------------------------------------------
; Initial results
;--------------------------------------------------------

initial results

;--------------------------------------------------------
; Show that permutation-p is an equivalence relation
;--------------------------------------------------------

permutation-p reflexivity
permutation-p transitivity
permutation-p symmetry
permutation-p equivalence

;--------------------------------------------------------
; Main lemmas and final result
;--------------------------------------------------------

append congruences
merge-ordered is a permutation of append
append of split yields the original list
Final correctness result

Contents

[edit] Definitions

Our implementation of merge sort has two helpers. The first, merge-ordered, takes two ordered (sorted) lists and efficiently merges them into a single ordered list:

<<definitions>>=
(defun merge-ordered (x y)
  (declare (xargs :measure (+ (acl2-count x) (acl2-count y))))
  (cond ((endp x)             y)
	((endp y)             x)
	((< (car x) (car y))  (cons (car x) (merge-ordered (cdr x) y)))
	(t                    (cons (car y) (merge-ordered x (cdr y))))))

The second, split, is used to break a list into two smaller sublists (roughly half and half). This particular implementation, one of the more convenient to write in a functional style, places the even-index elements in one list and the odd-index elements in the other:

<<definitions>>=
(defun split (lst)
  (if (endp lst)
      (mv nil nil)
    (if (endp (cdr lst))
	(mv lst nil)
      (mv-let (list1 list2)
	  (split (cdr (cdr lst)))
	(mv (cons (car lst) list1) (cons (car (cdr lst)) list2))))))

Next we define the main mergesort function, which splits the list into two sublists, sorts each sublist, then merges them using merge-ordered:

<<mergesort>>=
(defun mergesort (lst)
  (if (or (endp lst) (endp (cdr lst)))
      lst
    (mv-let (list1 list2)
	(split lst)
      (merge-ordered (mergesort list1) (mergesort list2)))))

ACL2 will not admit mergesort because it is not able to prove that it terminates - it doesn't realize that the lists returned from split are smaller than the original list. We throw in a lemma to establish this and then add mergesort:

<<definitions>>=
(defthm split-results-shorter
  (implies (and (consp x) (consp (cdr x)))
	   (and (< (acl2-count (car (split x))) (acl2-count x))
		(< (acl2-count (mv-nth 1 (split x))) (acl2-count x)))))

mergesort

Next, we add some helpful functions describing correctness properties of sorting algorithms. The first determines whether a list is ordered, or sorted:

<<definitions>>=
(defun ordered-p (lst)
  (or (endp lst)
      (endp (cdr lst))
      (and (<= (car lst) (car (cdr lst)))
           (ordered-p (cdr lst)))))

It is not enough for a sorting algorithm to produce ordered output however; the output list must also be a reordering, or permutation, of the input list. We add another function to test this, which depends on a helper function for deleting a given element from a list:

<<definitions>>=
(defun delete-element (a lst)
  (if (endp lst)
      lst
    (if (equal a (car lst))
        (cdr lst)
      (cons (car lst) (delete-element a (cdr lst))))))

(defun permutation-p (left right)
  (or (and (endp left) (endp right))
      (and (consp left)
           (member (car left) right)
           (permutation-p (cdr left) (delete-element (car left) right)))))

[edit] Initial results

We get ordered for free; ACL2 proves it from scratch:

<<initial results>>=
(defthm mergesort-ordered
  (ordered-p (mergesort x)))

It remains to show that mergesort's output is a permutation of its input:

(defthm mergesort-permutation-of-input
  (permutation-p (mergesort x) x))


Attempting to prove this, we get stuck at this simplification checkpoint:

Subgoal *1/2''
(IMPLIES (AND (CONSP X)
              (CONSP (CDR X))
              (PERMUTATION-P (MERGESORT (CAR (SPLIT X)))
                             (CAR (SPLIT X)))
              (PERMUTATION-P (MERGESORT (MV-NTH 1 (SPLIT X)))
                             (MV-NTH 1 (SPLIT X))))
         (PERMUTATION-P (MERGE-ORDERED (MERGESORT (CAR (SPLIT X)))
                                       (MERGESORT (MV-NTH 1 (SPLIT X))))
                        X)).

Clearly this is true, since the two lists returned from split together form X, and merge-ordered combines them back into one list. The difficult task is to prove that the result of merge-ordered is a permutation of the original list.

One way to approach this is to show that the result of merge-ordered is a permutation of the result of appending the two lists, and separately show that appending the two sublists produces a permutation of the original list. The transitivity of permutation-p yields the result. These are still nontrivial to show, though. We start out by proving that permutation-p an equivalence relation, allowing ACL2 to better reason with it.

[edit] permutation-p is an equivalence relation

[edit] Reflexivity

ACL2 has no problem proving reflexivity:

<<permutation-p reflexivity>>=
(defthm permutation-p-reflexive
  (permutation-p x x))

[edit] Transitivity

For transitivity, we attempt to show that:

<<permutation-p-transitive>>=
(defthm permutation-p-transitive
  (implies (and (permutation-p x y)
		(permutation-p y z))
	   (permutation-p x z)))

Attempting to prove permutation-p-transitive gives us this checkpoint:

Subgoal *1/5'
(IMPLIES (AND (CONSP X)
              (NOT (MEMBER (CAR X) Z))
              (MEMBER (CAR X) Y)
              (PERMUTATION-P (CDR X)
                             (DELETE-ELEMENT (CAR X) Y)))
         (NOT (PERMUTATION-P Y Z))).

Dropping unnecessary conditions and rearranging:

(IMPLIES (AND (CONSP X)
              (MEMBER (CAR X) Y)
              (PERMUTATION-P Y Z))
         (MEMBER (CAR X) Z))

So we need to show that if a is in X and X is a permutation of Y, a is in Y. A good general result for this is:

<<permutation-p-implies-member-iff>>=
(defthm permutation-p-implies-member-iff
  (implies (permutation-p x y)
	   (iff (member a x)
		(member a y))))

But if we try to prove it, we only reach this checkpoint:

Subgoal *1/3.2
(IMPLIES (AND (CONSP X)
              (NOT (EQUAL A (CAR X)))
              (NOT (MEMBER A (CDR X)))
              (NOT (MEMBER A (DELETE-ELEMENT (CAR X) Y)))
              (MEMBER (CAR X) Y)
              (PERMUTATION-P (CDR X)
                             (DELETE-ELEMENT (CAR X) Y)))
         (NOT (MEMBER A Y))).

Rearranging and dropping unnecessary conditions:

(IMPLIES (AND (NOT (EQUAL A (CAR X)))
	      (MEMBER A Y))
	 (MEMBER A (DELETE-ELEMENT (CAR X) Y)))

In other words, deleting an element doesn't affect membership of other elements. We prove a lemma to show this:

<<permutation-p transitivity>>=
(defthm delete-different-element-preserves-member
  (implies (not (equal a b))
	   (iff (member a (delete-element b x))
		(member a x))))

We can then prove permutation-p-implies-member-iff:

<<permutation-p transitivity>>=
permutation-p-implies-member-iff

We take another crack at permutation-p-transitive and get stuck at:

(IMPLIES (AND (CONSP X)
              (MEMBER (CAR X) Z)
              (NOT (PERMUTATION-P (DELETE-ELEMENT (CAR X) Y)
                                  (DELETE-ELEMENT (CAR X) Z)))
              (PERMUTATION-P (CDR X)
                             (DELETE-ELEMENT (CAR X) Y))
              (PERMUTATION-P Y Z))
         (PERMUTATION-P (CDR X)
                        (DELETE-ELEMENT (CAR X) Z))).

Rearranging and dropping unneeded hypotheses, we get:

(IMPLIES (AND (CONSP X)
              (MEMBER (CAR X) Z)
              (PERMUTATION-P Y Z))
	 (PERMUTATION-P (DELETE-ELEMENT (CAR X) Y)
			(DELETE-ELEMENT (CAR X) Z)))

Since (car x) is in Z, permutation-p-implies-member-iff tells us and ACL2 that it's also in Y. What it doesn't know is that deleting the same element from both lists preserves permutation-p:

<<delete-same-member-preserves-permutation-p>>=
(defthm delete-same-member-preserves-permutation-p
  (implies (and (member x a)
		(member x b)
		(permutation-p a b))
	   (permutation-p (delete-element x a)
			  (delete-element x b))))

When we attempt to prove this, we fail at:

Subgoal *1/5''
(IMPLIES (AND (CONSP A)
              (NOT (EQUAL X (CAR A)))
              (PERMUTATION-P (DELETE-ELEMENT X (CDR A))
                             (DELETE-ELEMENT X (DELETE-ELEMENT (CAR A) B)))
              (MEMBER X B)
              (MEMBER (CAR A) B)
              (PERMUTATION-P (CDR A)
                             (DELETE-ELEMENT (CAR A) B)))
         (PERMUTATION-P (DELETE-ELEMENT X (CDR A))
                        (DELETE-ELEMENT (CAR A)
                                        (DELETE-ELEMENT X B)))).

Simplifying and rearranging:

(IMPLIES (PERMUTATION-P (DELETE-ELEMENT X (CDR A))
			(DELETE-ELEMENT X (DELETE-ELEMENT (CAR A) B)))
         (PERMUTATION-P (DELETE-ELEMENT X (CDR A))
                        (DELETE-ELEMENT (CAR A) (DELETE-ELEMENT X B))))

Clearly, it doesn't matter in what order we delete two elements from the list. We prove the lemma delete-element-commutes, and the lemmas delete-same-member-preserves-permutation-p and permutation-p-transitive follow:

<<permutation-p transitivity>>=
(defthm delete-element-commutes
  (equal (delete-element a (delete-element b c))
	 (delete-element b (delete-element a c))))

delete-same-member-preserves-permutation-p

permutation-p-transitive

[edit] Symmetry

Next, we go for symmetry:

<<permutation-p-symmetric>>=
(defthm permutation-p-symmetric
  (implies (permutation-p x y)
	   (permutation-p y x)))

Attempting this proof gives:

HARD ACL2 ERROR in REWRITE:  The call depth limit of 1000 has been
exceeded in the ACL2 rewriter.

Examining the stack reveals that permutation-p-implies-member-iff is being applied in an endless loop. We disable it:

<<permutation-p symmetry>>=
(in-theory (disable permutation-p-implies-member-iff))

Trying symmetry again, we get this simplification checkpoint:

Subgoal *1/3'
(IMPLIES (AND (CONSP X)
              (MEMBER (CAR X) Y)
              (PERMUTATION-P (DELETE-ELEMENT (CAR X) Y)
                             (CDR X))
              (PERMUTATION-P (CDR X)
                             (DELETE-ELEMENT (CAR X) Y)))
         (PERMUTATION-P Y X)).

Simplifying:

(IMPLIES (AND (MEMBER (CAR X) Y)
              (PERMUTATION-P (DELETE-ELEMENT (CAR X) Y)
                             (CDR X)))
         (PERMUTATION-P Y X))

We will show that:

Y is a permutation of
(CONS (CAR X) (DELETE-ELEMENT (CAR X) Y)) is a permutation of
(CONS (CAR X) (CDR X)) equals
X

ACL2 can prove the second permutation trivially. Given transitivity, it should be able to prove the result from just a lemma for the first permutation above, but its rewriting heuristics don't work. Instead, we define a more explicit lemma that directly implies the result, adding a hypothesis that is easy to derive from existing hypotheses:

<<permutation-p symmetry>>=
(defthm cons-delete-permutation-of-original
  (implies (and (member a y)
		(permutation-p (cons a (delete-element a y)) x))
	   (permutation-p y x)))

Symmetry now follows, and permutation-p is an equivalence:

<<permutation-p symmetry>>=
permutation-p-symmetric
<<permutation-p equivalence>>=
(defequiv permutation-p)

[edit] Main lemmas and result

[edit] append congruences

Recall the checkpoint of mergesort-permutation-of-input:

Subgoal *1/2''
(IMPLIES (AND (CONSP X)
              (CONSP (CDR X))
              (PERMUTATION-P (MERGESORT (CAR (SPLIT X)))
                             (CAR (SPLIT X)))
              (PERMUTATION-P (MERGESORT (MV-NTH 1 (SPLIT X)))
                             (MV-NTH 1 (SPLIT X))))
         (PERMUTATION-P (MERGE-ORDERED (MERGESORT (CAR (SPLIT X)))
                                       (MERGESORT (MV-NTH 1 (SPLIT X))))
                        X))

Before we show merge-ordered is a permutation of append, first we'll show that either argument in append can have a permutation of that argument substituted without changing the result (with respect to permutations), which will allow us to use the hypotheses above. We start with:

<<append congruences>>=
(defthm permutation-p-implies-permutation-p-append-2
  (implies (permutation-p y y-equiv)
	   (permutation-p (append x y)
			  (append x y-equiv)))
  :rule-classes (:congruence))

The congruence rule class is a special class for rules of this form (we could have also used defcong). ACL2 proves the above automatically, but if we attempt to show the analogous result for argument 1, we get stuck at this checkpoint:

Subgoal *1/3.2
(IMPLIES (AND (CONSP X)
              (PERMUTATION-P (APPEND (CDR X) Y)
                             (APPEND (DELETE-ELEMENT (CAR X) X-EQUIV)
                                     Y))
              (MEMBER (CAR X) X-EQUIV)
              (PERMUTATION-P (CDR X)
                             (DELETE-ELEMENT (CAR X) X-EQUIV)))
         (MEMBER (CAR X) (APPEND X-EQUIV Y))).

Eliminating unneeded hypotheses:

(IMPLIES (MEMBER (CAR X) X-EQUIV)
         (MEMBER (CAR X) (APPEND X-EQUIV Y)))

This is clearly true. We show a general lemma for this, which allows us to prove the desired congruence:

<<append congruences>>=
(defthm member-of-append-iff-member-of-operand
  (iff (member a (append x y))
       (or (member a x)
	   (member a y))))

(defthm permutation-p-implies-permutation-p-append-1
  (implies (permutation-p x x-equiv)
	   (permutation-p (append x y)
			  (append x-equiv y)))
  :rule-classes (:congruence))

[edit] merge-ordered is a permutation of append

Now we focus on showing that merge-ordered is a permutation of append.

<<merge-ordered-permutation-of-append simple>>=
(defthm merge-ordered-permutation-of-append
  (permutation-p (merge-ordered x y)
		 (append x y)))

If we examine the proof, we see that it tried to perform induction based on append. This isn't very useful, as merge-ordered is not expanded, and we know more about append than merge-ordered. Instead, we explicitly tell it to induct over merge-ordered:

<<merge-ordered-permutation-of-append>>=
(defthm merge-ordered-permutation-of-append
  (permutation-p (merge-ordered x y)
		 (append x y))
  :hints (("Goal" :induct (merge-ordered x y))))

Now, merge-ordered has been completely eliminated from the conclusion in our simplification checkpoint:

Subgoal *1/4''
(IMPLIES (AND (CONSP X)
              (CONSP Y)
              (<= (CAR Y) (CAR X))
              (PERMUTATION-P (MERGE-ORDERED X (CDR Y))
                             (APPEND X (CDR Y))))
         (PERMUTATION-P (APPEND X (CDR Y))
                        (DELETE-ELEMENT (CAR Y) (APPEND X Y)))).

Observe that (append (cdr y) x) and (delete-element (car y) (append y x)) are the same list. This means that we should be able to show this if we can just show two lemmas: one showing that switching the args to append gives a permutation, and the other a congruence showing that we can permute the second argument to delete-element:

<<merge-ordered is a permutation of append>>=
(defthm append-commutative-wrt-permutation
  (permutation-p (append x y)
		 (append y x)))

(defthm permutation-p-implies-permutation-p-delete-2
  (implies (permutation-p x x-equiv)
	   (permutation-p (delete-element a x)
			  (delete-element a x-equiv)))
  :rule-classes (:congruence))

No progress is made because ACL2's heuristics won't reduce (DELETE-ELEMENT (CAR Y) (APPEND Y X)) to (APPEND (CDR Y) X). To facilitate this we add a rewrite rule that moves a DELETE-ELEMENT inside an APPEND:

<<merge-ordered is a permutation of append>>=
(defthm delete-append-deletes-from-leftmost-containing
  (equal (delete-element a (append x y))
	 (if (member a x)
	     (append (delete-element a x) y)
	   (append x (delete-element a y)))))

We can now show the desired result:

<<merge-ordered is a permutation of append>>=
merge-ordered-permutation-of-append

[edit] append of split yields the original list

Finally, we must prove that appending the two lists returned by split yields the original input:

<<append-split-permutation-of-original>>=
(defthm append-split-permutation-of-original
  (permutation-p (append (car (split x))
			 (mv-nth 1 (split x)))
		 x))

The first simplification checkpoint looks like:

Subgoal *1/3''
(IMPLIES (AND (CONSP X)
              (CONSP (CDR X))
              (PERMUTATION-P (APPEND (CAR (SPLIT (CDDR X)))
                                     (MV-NTH 1 (SPLIT (CDDR X))))
                             (CDDR X)))
         (PERMUTATION-P (APPEND (CAR (SPLIT (CDDR X)))
                                (CONS (CADR X)
                                      (MV-NTH 1 (SPLIT (CDDR X)))))
                        (CDR X))).

If we could extract the cons out of the append, it would match the hypothesis. We can do this using a simple permutation lemma, and then prove the result:

<<append of split yields the original list>>=
(defthm append-cons-commute-under-permutation
  (permutation-p (append x (cons a y))
		 (cons a (append x y))))

append-split-permutation-of-original

[edit] Final correctness result

Now, we have everything we need to prove the result:

<<Final correctness result>>=
(defthm mergesort-permutation-of-input
  (permutation-p (mergesort x) x))

(defthm mergesort-correct
  (and (ordered-p (mergesort x))
       (permutation-p (mergesort x) x))
  :rule-classes nil)

We use :rule-classes nil on the last theorem so that ACL2 doesn't try to create rewrite rules for it and give spurious warnings.

Download code
hijacker
hijacker
hijacker
hijacker