Reasoning Proofs with Basic Data Structures
This note refers to the the online book series software foundation.
Pair
In inductive type definition, constructors can take arbitrary number of parameters.
For example, here is a pair type:
Inductive natprod : Type :=
| pair (n1 n2 : nat).
We can also give a notation for this type
Notation "( x , y )" := (pair x y).
Example for using pair
Definition fst (p : natprod) : nat :=
match p with
| (x,y) ⇒ x
end.
Example fst1: fst (3,5) = 3.
Pair is an extremely basic type but illustrates the idea of how data structure is like in Coq.
List
The most common list type is given
Inductive natlist : Type :=
| nil
| cons (n : nat) (l : natlist).
This definition can be read:
Any natlist
type is either empty nil
, or is composed by a number with another natlist
.
Similarly, we can define notations parse rules for list
Notation "x :: l" := (cons x l)
(at level 60, right associativity).
Notation "[ ]" := nil.
Notation "[ x ; .. ; y ]" := (cons x .. (cons y nil) ..).
(* These 3 definitions are identical *)
Definition mylist1 := 1 :: (2 :: (3 :: nil)).
Definition mylist2 := 1 :: 2 :: 3 :: nil.
Definition mylist3 := [1;2;3].
The operation on list follows a recursive style. e.g. get the length of a list:
Fixpoint length (l:natlist) : nat :=
match l with
| nil ⇒ O
| h :: t ⇒ S (length t)
end.
A more non-trivial example of operations on list is nonzeros
, which filter 0 out of a list. (From exercise)
Fixpoint nonzeros (l:natlist) : natlist:=
match l with
| nil => nil
| n::l' => match n with
| O => nonzeros l'
| h => h::(nonzeros l')
end
end.
Example test_nonzeros:
nonzeros [0;1;0;2;3;0;0] = [1;2;3].
reflexivity. Qed.
Bag
A bag is another data structure, which is like a set but an element can appear multiple times. A naïve implementation of bag is list.
Definition bag := natlist.
This function count
counts how many target items in a bag:
Fixpoint count (v : nat) (s : bag) : nat :=
match s with
| nil => O
| n::s' => if eq_nat v n then S ( count v s') else count v s'
end.
Example test_count1: count 1 [1;2;3;1;4;1] = 3.
simpl. reflexivity. Qed.
Example test_count2: count 6 [1;2;3;1;4;1] = 0.
simpl. reflexivity. Qed.
Here we gives a proof related to bag and count function to show how to reason data structure in coq
(* adding an element makes the count increase by 1 *)
Theorem add_inc_count: forall n: nat, forall s: bag,
count n (n::s) = S (count n s).
Proof.
intros n s.
destruct s as [| s'].
+ simpl. rewrite -> eq_self. reflexivity.
+ simpl. rewrite -> eq_self. reflexivity.
Qed.
where eq_self
states that a natural number equal to itself
Lemma eq_self: forall n: nat, eq_nat n n = true.
It’s hard to just read the proof and figure out how the proof works without an Coq session. For example, the proof for both subgoals seems to be identical, but it’s not in fact. We can translate it into a informal proof:
-
Theorem: for all natural number n, for all bag s, count n (n::s) = S (count n s). Proof by considering the 2 possible constructors of s.
First let s to be
nil
, we want to prove thatcount n \[n\] = S (count n \[\ ])
by the definition of count, the above equation is equivalent to
(if eq_nat n n then 1 else 0) = 1
This equation can be made true by applying the eq_self theorem.
Then we consider s to be
n::s'
, we want to show thatcount n (n :: s' :: s) = S (count n (s' :: s))
Similarly, we use the definition of count to extend the equation. then we will find the expression
eq_nat n n
again. Then followeq_self
theorem we can make the both side to be identical.
Another example shows the “proofs by induction” tactics applying on structured data. natlist
has 2 constructors: nil
or cons
, which can be used to make induction proof because a list has only 2 possible form: either empty or cons
applies to another list. By induction, we need to show that
- P holds true when
l
isnil
- P holds true when
l
is consists of some numbern
and a smaller listl'
, given P holds true forl'
(* reverse a list *)
Fixpoint rev (l:natlist) : natlist :=
match l with
| nil => nil
| h :: t => rev t ++ [h]
end.
(* The reverse of the concatenation of 2 lists is the concatenation of 2 reversed lists in reverse order*)
Theorem rev_app_distr: forall l1 l2 : natlist,
rev (l1 ++ l2) = rev l2 ++ rev l1.
Proof.
intros l1 l2.
destruct l2 as [| l2'].
- (* rev (l1 ++ [ ]) = rev [ ] ++ rev l1 *) rewrite -> rev_emp. reflexivity.
- (* rev (l1 ++ l2' :: l2) = rev (l2' :: l2) ++ rev l1 *)
induction l1 as [| n1 l1' IHl1].
+ (* rev ([ ] ++ l2' :: l2) = rev (l2' :: l2) ++ rev [ ] *)
rewrite -> app_nil_r. simpl. reflexivity.
+ (* rev ((n1 :: l1') ++ l2' :: l2) = rev (l2' :: l2) ++ rev (n1 :: l1') *)
simpl.
rewrite IHl1.
rewrite app_assoc.
reflexivity.
Qed.
where the lemmas are:
Lemma rev_emp: forall l: natlist, rev (l ++ [ ]) = rev l.
Lemma app_nil_r : forall l : natlist, l ++ [] = l.
Although I make the comments on the branches, but it’s still hard to follow. It is way easier to see what’s going on if you are reading the proof in an interactive Coq session. In this proof, we have 2 extra lemmas. The good practice for lemmas is to make it as general as possible.
Search command
To build a big proof, the common way is to split it to be several small lemmas. But it is sometimes difficult to remember the names of those lemmas. The Search
command can search lemma in the context by a given pattern.
Search rev.
Search (_ + _ = _ + _).
Search (_++_).
Map
A map is a type of data structure that map an id (or a key) to a value.
We give a definition of map in coq:
Inductive id : Type := Id (n : nat).
Inductive partial_map : Type :=
| empty
| record (i : id) (v : nat) (m : partial_map).
Here the id
is simply a nat
internally. But by introducing another name, we make the type more readable and flexible if we want to change the implementation in the future.
The declaration of partial_map
can be read: “ There are 2 ways to construct a partial_map
: either an empty or combines a k-v pair with a partial_map
”
There are 2 operations regarding a map: put (update) and get (find):
Definition update (d : partial_map)(x : id) (value : nat): partial_map :=
record x value d.
Fixpoint find (x : id) (d : partial_map) : natoption :=
match d with
| empty ⇒ None
| record y v d' ⇒ if eqb_id x y
then Some v
else find x d'
end.
There might be some confusion about why the update
is just put a new record on the original map, instead of checking if there is an existing key. It is my understanding that it doesn’t matter in this context because the find
function always returns the first item that it found. If there is a need for delete
function, it is also compatible because the implementation of delete
can just delete all the same keys.
Options
Option
is a common FP pattern that explicitly distinguish null values and non-null values by providing a responsive “container” for the value. This design pattern is so popular that even Java has an Optional class as well.
Here is the practice of Option
in coq
Inductive natoption : Type :=
| Some (n : nat)
| None.
If there is a non-null value, we say there is “Some” value; Otherwise, there is “None” value. This pattern is even more useful in coq since it doesn’t have build-in null
object as many programming languages do.
Here’s an example using natoption
inductive type to write a safer function:
Definition hd_error (l : natlist) : natoption :=
match l with
| nil => None
| n::l => Some n
end.
The hd_error
function returns the first element of list l
. If the list is empty, it gives a “None”; elsewise, it gives “Some” value.