Formal verification

Warning

This feature is experimental and always will be. The API is stable and verification is reliable but it always will work only for some simple cases.

Deal has a built-in formal verifier. That means deal turns your code into a formal theorem and then proves that it is formally correct (or finds a counter-example when it is not). Turning wild Python code into mathematical expressions is hard, so application of the verifier is limited. Still, you should try it. It will work for only 1% of your code but when it does work, it finds actual bugs.

python3 -m deal prove project/

How it works

Prerequisites for code to be verified:

  • It is a function or @staticmethod.

  • It is written on pure Python and calls only pure Python code.

  • Every argument is type annotated.

  • Even if everything above is satisfied, the function still can be skipped because a feature it uses is not supported yet.

Below, we use the following terms:

  • counter-example means a combination of input arguments that leads to theorem violation.

  • given means that this is an axiom the theorem uses. Counter-example must satisfy to the given conditions.

  • expected means that this is an assertion that theorem tries to break. Counter-example must violate at least one expected condition.

How different components are interpreted:

  • deal.pre is given.

  • deal.post is expected.

  • deal.pre for function called from this function is expected.

  • deal.ensure is expected.

  • deal.raises is expected to contain every exception the function can ever raise.

  • assert is expected.

Background

  • 1936. Halting problem. Alan Turing proved that you cannot formally verify if a program will ever finish execution. This is one of the most important theorems of formal verification.

  • 1949. Alan Turing publishes the first ever proof of program correctness.

  • 1967. Robert W. Floyd published “Assigning Meanings to Programs”, introducing the idea of program verification using logical assertions.

  • 1969. Hoare logic. Tony Hoare, based on the work of Floyd, introduced the idea of precondition, postcondition, and loop invariant. What’s more important, he proved that this is all you need to formally verify correctness of any program. The only thing you can’t verify this way is if the program will ever stop.

  • 1986. Design by contract. Bertrand Meyer designed Eiffel programming language which introduced the idea of Design by Contract (DbC). DbC is heavily based on Hoare Logic, but this time it turned from a purely mathematical reasoning into an actual OOP language.

  • 2009. Dafny. Microsoft Research releases a programming language that actually uses Hoare logic to formally prove contracts.

  • 2015. Z3. Microsoft Research opens the source code of the formal verifier used inside of Dafny and in a few other places. Z3 provides bindings for many different programming languages, including Python.

  • 2018. deal. We released a Python library that allows to specify contracts that can be validated in runtime. The initial motivation to make another one DbC library is to provide decorator-based API (as opposed to a more popular docstring-based approach), so users can benefit from syntax-highlighting, autocomplete, autoformatters and other tooling.

  • 2019. Deal gets a linter. It finds contract violations using static analysis.

  • 2021. deal-solver. We released a tool that converts Python code (including deal contracts) into Z3 theorems that can be formally verified.

Limitations

Since Python is a dynamically typed interpreted language and is not designed for formal verification, turning Python code into mathematical expressions is very hard. To name a few limitations:

  • There are some mutability bugs that cannot be detected by deal-solver because there is no conception of mutability (and variables) in Z3.

  • set cannot be converted into list because sets are infinite in Z3.

  • Some formulas (like a ** x where x is a variable) take unreasonable amount of time to prove. To avoid freezing to death, deal-solvers sets a timeout for every proof.

  • In general, resolving OOP magic statically is hard. At the moment, deal-solver supports only built-in types and validates only functions and static methods.

  • A big chunk of the standard library is written in C. So, to support the whole standard library, we have to manually rewrite every function implementation as a Z3 formula (because deal-solver can interpret only Python, not C).

  • Verification of loops requires loop invariants. However, deal currently doesn’t have anything like this because it’s not so helpful for other components of the project.

So, deal-solver is more proof-of-concept rather than something that will be a part of your day-to-day tooling.

Further reading

There are few additional links in case if you want to go down the rabbit hole and dig into some hardcore math: