This post is a note based on Software Foundation series 1, covering topics:

  • Implementation of a simple imperative language in Coq
  • Examples about how to use Coq to study other things
  • Basic automated proofs The original chapter in the Software Foundation series is extremely long (29 sheets of A4). So I’ll try hard to compact and summarize all important ideas (or at least I think important) in this post. It could be a good material to revive the knowledge in the future.

Out target language is a small one while basic statements, variables, boolean and arithmetic expressions, which looks this this:

Z := X;
   Y := 1;
   while Z ≠ 0 do
	 Y := Y × Z;
	 Z := Z - 1
   end

Define Expressions

[!Info] Caveat We only focus on the syntax and semantics (evaluation) parts of the language. If you are looking for something related to lexing and parsing, there is another optional chapter for those: LEXING AND PARSING IN COQ.

It is straightforward and explicit to define syntax in Coq using inductive defined data types, which almost resembles the Chomsky Normal Form 2.

Expressions are the part of program that always evaluate to values. There are two types of expressions in our language: arithmetic expressions and boolean expressions.

Inductive aexp : Type :=
  | ANum (n : nat)
  | APlus (a1 a2 : aexp)
  | AMinus (a1 a2 : aexp)
  | AMult (a1 a2 : aexp).
  
Inductive bexp : Type :=
  | BTrue
  | BFalse
  | BEq (a1 a2 : aexp)
  | BNeq (a1 a2 : aexp)
  | BLe (a1 a2 : aexp)
  | BGt (a1 a2 : aexp)
  | BNot (b : bexp)
  | BAnd (b1 b2 : bexp).

Evaluating Expressions

There are two ways to evaluate in Coq: by functions or by relations.

For example, the calculation of boolean evaluation is perfect to define in a function

Fixpoint beval (b : bexp) : bool :=
  match b with
  | BTrue  true
  | BFalse  false
  | BEq a1 a2  (aeval a1) =? (aeval a2)
  | BNeq a1 a2  negb ((aeval a1) =? (aeval a2))
  | BLe a1 a2  (aeval a1) <=? (aeval a2)
  | BGt a1 a2  negb ((aeval a1) <=? (aeval a2))
  | BNot b1  negb (beval b1)
  | BAnd b1 b2  andb (beval b1) (beval b2)
  end.

On the other hand, as it might be bizarre at first glance, define evaluation as a relation between domain and range is also straightforward.

(* Exercise: (bevalR) *)

Reserved Notation "e '==>b' b" (at level 90, left associativity).
Inductive bevalR: bexp -> bool -> Prop :=
  | R_BTrue: BTrue ==>b true
  | R_BFalse: BFalse ==>b false
  | R_BEq (n1 n2: nat) (a1 a2: aexp): 
      a1 ==> n1 -> 
      a2 ==> n2 -> 
      (BEq a1 a2) ==>b n1 =? n2 
  | R_BNeg (n1 n2: nat) (a1 a2: aexp):
      a1 ==> n1 ->
      a2 ==> n2 ->
      (BNeq a1 a2) ==>b negb (n1 =? n2) 
  | R_BLe (n1 n2: nat) (a1 a2: aexp):
      a1 ==> n1 ->
      a2 ==> n2 ->
      (BLe a1 a2) ==>b n1 <=? n2
  | R_BGt (n1 n2: nat) (a1 a2: aexp):
      a1 ==> n1 ->
      a2 ==> n2 ->
      (BGt a1 a2) ==>b negb (n1 <=? n2)
  | R_BNot (e: bexp) (b: bool):
      e ==>b b -> 
      (BNot e) ==>b (negb b)
  | R_BAnd (e1 e2: bexp) (b1 b2: bool):
      e1 ==>b b1 ->
      e2 ==>b b2 ->
      (BAnd e1 e2) ==>b andb b1 b2
where "e '==>b' b" := (bevalR e b) : type_scope
.

Think Curry–Howard Correspondence: a program is, in its nature, a proof3. It’s not formidable to acknowledge that both styles are identical. In fact, it is provable that these two definitions are equivalent.

(* Exercise: (bevalR) *)

Lemma beval_iff_bevalR : forall b bv,
  b ==>b bv <-> beval b = bv.
Proof.
  split.
  (* -> *)
  - intros H.
    induction H;
    (* trivial cases *)
    try (simpl; subst; reflexivity);
    (* cases be resolved by apply aeval_iff_aevalR *)
    try (simpl; apply aeval_iff_aevalR in H, H0; rewrite H, H0; reflexivity).
  (* <- *)
  - generalize dependent bv. induction b;
    try (intros; rewrite <- H; simpl; constructor);
    try (apply aeval_iff_aevalR; reflexivity).
    + apply IHb. reflexivity.
    + apply IHb1. reflexivity.
    + apply IHb2. reflexivity.
Qed.

Aside: Basic Automated Proofs

In the proof of lemma beval_iff_bevalR above, there are several tacticals involved, that is, tactics that takes another tactic as arguments. Tacticals used in the post include:

  • Tactical try T: try to apply T. If fails, try T does nothing at all
  • Tactical T;T' combines two tactic and applies T' them on each subgoal generated by T.
  • Tactical repeat T: repeatedly apply T until fails or no change in the goal or context.

Sometimes it is handy to define new tactic using Ltac.

Other powerful and general tactics are:

  • Tactic Lia solves linear integer arithmetic
  • Tactic subst substitutes all assumptions of the form x=e or e=x to reduce the number of variables
  • Tactic assumption tries to find an exact hypothesis to solve the goal
  • Tactic contradiction tries to find a contradiction hypothesis to solve the goal
  • Tactic constructor tries to find a constructor to solve the goal

Optimization

If we define a function to optimize the evaluation of an expression, we can use Coq to examine that the optimization we perform will not change the result. A silly example is: changing every occurrence of 0+e to e.

Fixpoint optimize_0plus (a:aexp) : aexp :=
  match a with
  | ANum n => ANum n
  | APlus (ANum 0) e2 => optimize_0plus e2
  | APlus  e1 e2 => APlus  (optimize_0plus e1) (optimize_0plus e2)
  | AMinus e1 e2 => AMinus (optimize_0plus e1) (optimize_0plus e2)
  | AMult  e1 e2 => AMult  (optimize_0plus e1) (optimize_0plus e2)
  end.

(* Exercise (optimize_0plus_b_sound) *)

Fixpoint optimize_0plus_b (b : bexp) : bexp :=
  match b with
  | BTrue => BTrue
  | BFalse => BFalse
  | BEq a1 a2 => BEq (optimize_0plus a1) (optimize_0plus a2)
  | BNeq a1 a2 => BNeq (optimize_0plus a1) (optimize_0plus a2)
  | BLe a1 a2 => BLe (optimize_0plus a1) (optimize_0plus a2)
  | BGt a1 a2 => BGt (optimize_0plus a1) (optimize_0plus a2)
  | BNot b => BNot b
  | BAnd b1 b2 => BAnd b1 b2
  end.

Proof of the optimization does not affect the results of evaluations (for boolean expression)

(* Exercise (optimize_0plus_b_sound) *)

Theorem optimize_0plus_b_sound : forall b,
  beval (optimize_0plus_b b) = beval b.
Proof.
  intros b.
  induction b;
  (* branches that does not involve optimization are trivial*)
  try (simpl; reflexivity);
  (* branches that involve optimization are proved by repeatly unfold and apply [optimize_0plus_sound] *)
  repeat (simpl; (rewrite optimize_0plus_sound; try reflexivity)).
Qed.
  

Computational vs Relational Definitions

In most situations, the choice of computational definitions or relational definitions are just a matter of taste. But sometimes using one over another gives more mileage.

Computational definitions are total, which means they are fully defined on all the possible inputs4. Thanks to the strict rules of Coq, evaluation by computational definitions is as simple as run a function. It is bound to halt, after all.

Example test_optimize_0plus:
  optimize_0plus (APlus (ANum 2)
                        (APlus (ANum 0)
                               (APlus (ANum 0) (ANum 1))))
  = APlus (ANum 2) (ANum 1).

(* Coq knows how to calculate this expression *)
Proof. reflexivity. Qed.

On the other hand, there are some situations that relational definitions work better. To be more specific, partial functions and nondeterminism are allowed in relational definitions. For example, it is easy to define the evaluation of loops in relation, we’ll see more about this in the next section

Commands Definition

In the world of imperative languages, commands or statements are the part that express some actions like assignments, loops and conditions. We shall find that relational definitions are more powerful to express commands than computations.

Informally, the commands in this language are (in BNF):

c := skip
	| x := a
	| c ; c
	| if b then c else c end
	| while b do c end

which corresponds to following definition.

Inductive com : Type :=
  | CSkip
  | CAsgn (x : string) (a : aexp)
  | CSeq (c1 c2 : com)
  | CIf (b : bexp) (c1 c2 : com)
  | CWhile (b : bexp) (c : com).

Note that there are some notations defined in the chapter of Software Foundation series, which makes the program more readable.

(* assign *)
Definition plus2 : com :=
  <{ X := X + 2 }>.

(* loop *)
Definition subtract_slowly : com :=
  <{ while X <> 0 do
	    Z := Z - 1 ;
		X := X - 1
     end }>.

I choose to leave out the definitions of these notations as they are not the core of what we want to do. If you want to see the details of the notations, you can refer to the original chapter.1

Evaluating Commands

First we need to introduce a concept of program state, or simply state. A state is designed to remember preceding evaluation. Typically, a state stores the values of variables 5. For convenience, we use a total map (a key-value dictionary with a default value) to be a state.

Definition state := total_map nat.

Remember that Coq only accept functions that are guaranteed to terminate. This rule is employed to ensure functions in Coq are indeed total. If you try to evaluating commands as functions, the function would be “takes the current state and a command and then produces a new state”. there is one huge obstacle in the way: how to define loops?

Fixpoint ceval_fun_no_while (st : state) (c : com) : state :=
  match c with
    (* ... *)
    (* omit other easy cases *)
    
    | <{ while b do c end }> =>
        st  (* bogus *)
  end.

Then what you might do is trying to decide if the loop would eventually terminate. It is , again, a horrible attempt because it is actually the halting problem, which is proved to be undecidable6.

On the other hand, define loops in term of relations is easy. Informally,

beval st b = false
-----------------------------(E_WhileFalse)
st =[ while b do c end ]=> st

beval st b = true
st =[ c ]=> st'
st' =[ while b do c end ]=> st''
--------------------------------(E_WhileTrue)
st  =[ while b do c end ]=> st''

Formally,

Inductive ceval : com -> state -> state -> Prop :=
 (* Omit other easy branches *)
  | E_WhileFalse : forall b st c,
      beval st b = false ->
      st =[ while b do c end ]=> st
  | E_WhileTrue : forall st st' st'' b c,
      beval st b = true ->
      st  =[ c ]=> st' ->
      st' =[ while b do c end ]=> st'' ->
      st  =[ while b do c end ]=> st''
where "st =[ c ]=> st'" := (ceval c st st').

The cost of defining evaluation as relations is that we need to construct a proof to evaluate a program.

(** Exercise (ceval_example2) *)
Example ceval_example2:
  empty_st =[
    X := 0;
    Y := 1;
    Z := 2
  ]=> (Z !-> 2 ; Y !-> 1 ; X !-> 0).
Proof.
  apply E_Seq with (X !-> 0).
  - apply E_Asgn. reflexivity.
  - apply E_Seq with (Y !-> 1; X !-> 0);
    apply E_Asgn; reflexivity.
Qed.

The nature behind the problem why we can define loops as relations but not as functions is that we can’t give a total function of loop: what if the input causes an infinite loop?

Up to this time, we have fully defined our tiny little programming languages.

Reasoning about Programing Languages

We can reason about some properties of the programming languages we just defined.

First, we show that a infinite loops indeed never stops.

(** Exercise (loop_never_stops) *)

Definition loop : com :=
  <{ while true do skip end }>.

Theorem loop_never_stops : forall st st',
  ~(st =[ loop ]=> st').
Proof.
  intros st st' contra. unfold loop in contra.
  remember <{ while true do skip end }> as loopdef
           eqn:Heqloopdef.
  induction contra; try (discriminate).
  (* while false do skip  = loopdef *)
  - inversion Heqloopdef.
    subst. discriminate H.
  (* while false do something not skip = loopdef *)
  - inversion Heqloopdef. subst. apply IHcontra2. reflexivity.
Qed. 

We can also show that if a program does not contain while, it indeed terminates. We give both function and relational definition of the predicate - “no while in the program” and establish their equivalence.

(** Exercise (no_whiles_eqv) **)

Inductive no_whilesR: com -> Prop :=
  | NW_Skip: no_whilesR <{ skip }>
  | NW_Asgn: forall x a, no_whilesR <{ x := a}>
  | NW_Seq: forall c1 c2, 
      no_whilesR c1 -> 
      no_whilesR c2 -> 
      no_whilesR <{c1; c2}>
  | NW_IF: forall b c1 c2,  
      no_whilesR c1 -> 
      no_whilesR c2 -> 
      no_whilesR <{if b then c1 else c2 end}>
  .

Fixpoint no_whiles (c : com) : bool :=
  match c with
  | <{ skip }> =>
      true
  | <{ _ := _ }> =>
      true
  | <{ c1 ; c2 }> =>
      andb (no_whiles c1) (no_whiles c2)
  | <{ if _ then ct else cf end }> =>
      andb (no_whiles ct) (no_whiles cf)
  | <{ while _ do _ end }>  =>
      false
  end.

Theorem no_whiles_eqv:
  forall c, no_whiles c = true <-> no_whilesR c.
Proof.
  split.
  - intros. induction c; try(constructor);
    try(
      inversion H;
      apply andb_true_iff in H1; destruct H1; 
      (* for c1 branch *)
      try(apply IHc1 in H0; exact H0);
      (* for c12branch *)
      try(apply IHc2 in H1; exact H1)
    ).
  - intros. induction c; try(constructor);
    try (simpl; inversion H; subst; 
      apply andb_true_intro; split;
      try (apply IHc1); try (apply IHc2); 
      trivial).
Qed.

Then it’s easy to approach the theorem

Theorem no_whiles_terminating : forall c st,
  no_whilesR c -> exists st', st =[ c ]=> st'.
Proof.
  intros. induction c.
  + (* skip *) exists st. constructor.
  + (* assign *) exists (x !-> (aeval st a); st). constructor. reflexivity.
  
  (* The proof is not hard, albeit tedious.
	 You need to deal with all branches. *)
  (* Let's save the page*) 
  Admitted.

A Step-Indexed Evaluator

We can’t define a evaluation function in Coq that might fall in a indefinite loop, because Coq is not just a general purpose programming language, but it is, more importantly, a consistent logic. A simple makeshift is that we can add a number as a parameter that decreases after every step and terminates after its value hits 0.

Fixpoint ceval_step2 (st : state) (c : com) (i : nat) : state :=
  match i with
  | O => empty_st
  | S i' => (* decrease 1 on each step *)
    match c with
      | <{ skip }> =>
          st
      | <{ l := a1 }> =>
          (l !-> aeval st a1 ; st)
      | <{ c1 ; c2 }> =>
          let st' := ceval_step2 st c1 i' in
          ceval_step2 st' c2 i'
      | <{ if b then c1 else c2 end }> =>
          if (beval st b)
            then ceval_step2 st c1 i'
            else ceval_step2 st c2 i'
      | <{ while b1 do c1 end }> =>
          if (beval st b1)
          then let st' := ceval_step2 st c1 i' in
               ceval_step2 st' c i'
          else st
    end
  end.

Furthermore, we can use optionto suggests whether the program halts after a success evaluation or reaching the step limitation.

Extraction

Extraction is to generate an efficient program in other language, such as Haskell. It’s straightforward to do so.

Extraction Language Haskell.
Extraction "imp1.hs" ceval_step.

A extracted program can be viewed as a certified interpreter. More controlled extraction can be found on the chapter EXTRACTING OCAML FROM COQ.

Reference

  1. https://softwarefoundations.cis.upenn.edu/lf-current/toc.html  2

  2. Chomsky normal form. (2023). Retrieved from https://en.wikipedia.org/wiki/Chomsky_normal_form 

  3. Curry–Howard correspondence. 

  4. https://en.wikipedia.org/wiki/Partial_function 

  5. State (computer science) 

  6. https://en.wikipedia.org/wiki/Halting_problem