Programming Language Elements for Correctness Proofs
Gergely Dévai
ELTE University, Budapest
Department of Programming Languages and Compilers
Supervisor: Dr. Zoltán Csörnyei
Motivation
- A considerable part of software products' life cycle is testing and bug-fixing.
- Expectations concerning safe and secure operation of programs are increasing.
- Formal methods could help, but they are not yet efficient enough: their usage in industry is limited.
- Key problems: integration of formal methods and low efficiency of theorem provers.
Possible solution
- A programming language where, instead of instructions, one writes a formal specification and its proof.
- The task of the compiler is to check the proof and, using its information, to generate code in a “traditional” target language.
- The generated program fulfils the requirements of the specification.
External theorem provers
Workflow (diagram): the program code in a “traditional” language and its specification are represented in an external theorem prover, where the proof obligations are discharged.
- Programming errors are discovered in the last
phase of development.
- Changing the program code may invalidate parts of
the proof.
Annotating the source code
- Few annotations only: checking cannot be (fully) automated, so an external theorem prover is needed.
- More annotations in order to enable automated checking: redundant code.
    public class QSort {
      /*@ requires A != null;
          ensures A.length == \old(A.length)
               && (\forall int k; k < A.length && k > 0; A[k] >= A[k-1]);
      @*/
      public void quickSort(/*@ non_null @*/ int[] A) {
        quicksort(A, 0, A.length - 1);
      }
      ...
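The contract in the annotation can also be read as a runtime check. The Python sketch below only illustrates what the annotation states, not how a static checker works; `check_quicksort_contract`, `stand_in_sort` and `broken_sort` are hypothetical names, and the stand-in sort replaces the elided quicksort body.

```python
def check_quicksort_contract(sort_in_place, a):
    """Run sort_in_place(a) and evaluate the contract from the annotation."""
    # requires A != null
    assert a is not None, "precondition violated: A must not be null"
    old_length = len(a)  # corresponds to \old(A.length)
    sort_in_place(a)
    # ensures A.length == \old(A.length)
    same_length = len(a) == old_length
    # ensures (\forall int k; k < A.length && k > 0; A[k] >= A[k-1])
    ordered = all(a[k] >= a[k - 1] for k in range(1, len(a)))
    return same_length and ordered


def stand_in_sort(a):
    # Stand-in for the elided quicksort body (assumption): sorts in place.
    a.sort()


def broken_sort(a):
    # Deliberately wrong implementation: reverses instead of sorting.
    a.reverse()
```

Calling `check_quicksort_contract(stand_in_sort, [3, 1, 2])` yields `True`, while `broken_sort` fails the ordering clause on `[1, 2, 3]`. Note that the contract as written does not require the result to be a permutation of the input.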
Functional and logic programming
- Program code in these languages may be considered as “executable specification”.
- For some problems (e.g. sorting) either it does not reflect the “natural” specification, or it is extremely inefficient.
    sum [] = 0
    sum [ x : r ] = x + sum r

    naive_sort(List,Sorted) :- perm(List,Sorted), sorted(Sorted).
    sorted([]).
    sorted([_]).
    sorted([X,Y|T]) :- X=<Y, sorted([Y|T]).

    insert_sort(List,Sorted) :- i_sort(List,[],Sorted).
    i_sort([],Acc,Acc).
    i_sort([H|T],Acc,Sorted) :- insert(H,Acc,NAcc), i_sort(T,NAcc,Sorted).
    insert(X,[Y|T],[Y|NT]) :- X>Y, insert(X,T,NT).
    insert(X,[Y|T],[X,Y|T]) :- X=<Y.
    insert(X,[],[X]).
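The contrast between the two sorting programs can be sketched in Python as well. This is an illustrative transcription under the assumption that `perm` enumerates permutations; the function names mirror the predicates above but are otherwise hypothetical.

```python
from itertools import permutations


def is_sorted(xs):
    """True iff every element is <= its successor."""
    return all(x <= y for x, y in zip(xs, xs[1:]))


def naive_sort(xs):
    """'Executable specification': return the first sorted permutation."""
    for candidate in permutations(xs):
        if is_sorted(candidate):
            return list(candidate)


def insert_sort(xs):
    """Insertion sort with an accumulator, mirroring i_sort/insert."""
    acc = []
    for x in xs:
        # Skip the elements of the accumulator that are smaller than x.
        i = 0
        while i < len(acc) and x > acc[i]:
            i += 1
        acc.insert(i, x)
    return acc
```

`naive_sort` states directly what a sorted result is, but it may inspect up to n! permutations; `insert_sort` is far cheaper, yet it no longer reads like the specification.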
Correctness by construction
- The formal specification is refined towards an implementation.
- If the refinement steps are correct, the resulting program fulfils the requirements of the specification.
- Implementations (e.g. the B-method, SpecWare) also use external theorem provers.
    MACHINE First
    VARIABLES x
    INVARIANT x>0
    OPERATIONS ...
    END

    MACHINE Second
    REFINES First
    OPERATIONS ...
    END
In the proposed solution...
- stepwise refinement is used to ensure early discovery of errors and to help in design decisions
- specification is abstract, implementation can be
any (efficient) algorithm that solves the problem
- target-language code is generated automatically,
the programmer writes the proof only
- construction of proofs (both using temporal and
classical logic) is integrated, no external theorem provers are needed
- programming language elements (e.g. templates)
are used to ease proof construction
Current state of the project
- The compiler is implemented in C++ (more than 6000 lines of source code).
- There are already hundreds of test files.
- Simple but useful algorithms (sort, conditional
maximum search) are implemented.
- A small “utility library” is constructed to ease
reasoning about loops etc.
- Supported target languages: C++ (currently),
NASM assembly (in a previous version)
States of a program
- States of a program are described using first-order
logic formulae using program variables and parameter variables.
- “The program starts and the outValue parameter denotes the value of the standard output stream.”
    ip = Start & out = outValue
- “The program terminates and the original value of the standard output stream is extended by the string «Hello!».”
    ip = Stop & out = outValue + "Hello!"
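The two state formulae can be read as predicates over a program state. The Python sketch below assumes, purely for illustration, that a state is a dictionary with the keys `ip` and `out`; the function names are hypothetical.

```python
START, STOP = "Start", "Stop"


def precondition(state, out_value):
    # ip = Start & out = outValue
    return state["ip"] == START and state["out"] == out_value


def postcondition(state, out_value):
    # ip = Stop & out = outValue + "Hello!"
    return state["ip"] == STOP and state["out"] == out_value + "Hello!"
```

Here `out_value` plays the role of the parameter variable: it fixes the original content of the output stream so that the final state can be related to the initial one.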
“Hello World!” example
    ip = Start & out = outValue >>
        out = outValue + "Hello!" & ip = Stop;
- There is no need to refine this specification.
- Tactics can “solve” it automatically.
- The formula before >> is the precondition and the formula after it is the postcondition. The statement is a “progress” property.
Tactics & templates
- Tactics are not built into the compiler; they can be implemented in the language using templates.
- Templates contain proof fragments (refinements) that can be reused and parametrised.
- Compile-time conditions examine the actual parameters of a template call and make the templates more reusable.
- There are several types of templates:
  – to contain axioms of functions or instructions
  – to enable induction
  – to describe proof tactics
Tactic – template example
    sequenceTactic( Boolean #pre, Boolean #post ) tactic
    {
        equals( #post, #a & #b ) :
        block
        {
            #pre >> #a;
            #a >> #post;
        }
    }
- The template has two formal parameters of type Boolean.
- This template implements a tactic.
- Compile-time condition: it is true iff the second argument can be matched with (#a & #b).
- In the block, the parameters are replaced according to the actual arguments and the result of the match.
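The matching step of this tactic can be sketched in Python. The representation is an assumption made only for this illustration: a conjunction is the tuple `("&", a, b)`, any other formula is a plain string, and a refinement step is a `(pre, post)` pair. The sketch follows the template text literally.

```python
def sequence_tactic(pre, post):
    """Return two refinement steps if post has the form (a & b), else None."""
    # Compile-time condition equals( #post, #a & #b ): a structural match.
    if isinstance(post, tuple) and len(post) == 3 and post[0] == "&":
        _, a, _b = post
        # block { #pre >> #a; #a >> #post; }
        return [(pre, a), (a, post)]
    # The condition does not hold, so the tactic is not applicable.
    return None
```

Returning `None` when the match fails corresponds to a compile-time condition that evaluates to false; another tactic would have to be tried in that case.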
Call of a template
- The compiler calls the previous template with the pre- and postcondition of the specification as arguments.
    ip = Start & out = outValue >> out = outValue + "Hello!" & ip = Stop
    {
        sequenceTactic( ip = Start & out = outValue,
                        out = outValue + "Hello!" & ip = Stop );
    }
The specification is now refined by the template call. This template was automatically called by the compiler as a tactic, but it is also possible to call a template explicitly.
Refinement
- The template call is replaced by its definition (after evaluation of the compile-time conditions and substitution of the parameters).
    ip = Start & out = outValue >> out = outValue + "Hello!" & ip = Stop
    {
        ip = Start & out = outValue >> out = outValue + "Hello!";
        out = outValue + "Hello!" >> ip = Stop;
    }
This is a “sequential” refinement consisting of two steps. The two steps are refined further automatically by tactics.
Axioms
- The refinement steps form a proof tree. Its root is the specification and the leaves are axioms.
- Axioms are placed in special templates.
    exit( Label #at ) atom
    {
        independent( $expr, ip ) : [ $expr ];
        ip = #at >> ip = Stop;
    }
- The argument is the label at which the code generator is to place the instruction.
- Safety property: an expression is an invariant of the instruction if it is independent of the ip variable.
- Progress property: this instruction terminates the program.
- This template contains temporal axioms.
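The idea that a proof is finished once every leaf of the proof tree is an axiom can be sketched as follows. The `Step` class and `is_complete` function are hypothetical and say nothing about how the actual compiler stores proofs.

```python
class Step:
    """A node of the proof tree: a statement and the steps refining it."""

    def __init__(self, statement, children=(), is_axiom=False):
        self.statement = statement
        self.children = list(children)
        self.is_axiom = is_axiom


def is_complete(step):
    """A proof is complete when every leaf of the tree is an axiom."""
    if not step.children:
        return step.is_axiom
    return all(is_complete(child) for child in step.children)
```

A specification refined into two axiom leaves is complete; a tree that still contains an unrefined, non-axiom leaf is not.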
Code generation
- The refinement of our example specification is completed (automatically) by the calls of the following two “atoms”:
    write( "Hello!", Start, L0 );
    exit( L0 );
- The code generator uses this “intermediate code” to output the following C++ program:
    int main( int argc, char* argv[] )
    {
        Start: cout << "Hello!";
        L0: exit( 0 );
    }
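How a list of atoms could be turned into target code can be sketched with a small Python generator. The atom encoding is an assumption for illustration only: each atom is a `(name, args)` pair, the second argument of `write` is taken to be the label of the instruction and the third the label of the next one. The real code generator may work quite differently.

```python
def generate_cpp(atoms):
    """Emit the text of a C++ main function from a list of (name, args) atoms."""
    lines = ["int main( int argc, char* argv[] )", "{"]
    for name, args in atoms:
        if name == "write":
            # Assumed argument order: text, own label, label of the next atom.
            text, at, _next = args
            lines.append(f'    {at}: cout << "{text}";')
        elif name == "exit":
            (at,) = args
            lines.append(f"    {at}: exit( 0 );")
        else:
            raise ValueError(f"unknown atom: {name}")
    lines.append("}")
    return "\n".join(lines)
```

For the two atoms of the example, `generate_cpp([("write", ("Hello!", "Start", "L0")), ("exit", ("L0",))])` produces a `main` function whose body contains one labelled statement per atom.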
Further language elements
- declarations (variables, parameters, types, operators, functions)
- “selectional” refinement (for case distinction, reasoning about “if”)
- templates that contain non-temporal logic axioms (like x=y & y=z => x=z)
- templates to generate code (for example code for expression evaluation)
- templates to implement induction (for loops and recursive procedures)
- templates that generate templates
Future directions
- inclusion of more C++ instructions (e.g. methods of the C++ STL)
- support for other target languages (Java, Ada, assembly...)
- further improvement of automatic proof generation
- parallel and concurrent programming
- specification statements concerning resources (memory, time...)
- “fuzzy” temporal statements – reasoning about