What are you trying to prove?
If you’re thinking of using formal verification to increase the quality and reliability of your software, one of the first decisions you need to make is what you want to prove. Roughly speaking, you have three levels to choose from:
1. Predictable execution – that is, freedom from undefined and implementation-defined behaviour. The proofs cover such things as: all array indices are in bounds, variables are initialized before use, conversions of values to narrower types do not result in overflow, arithmetic does not overflow or underflow, null pointers are not dereferenced, and division by zero does not occur. It includes all the usual causes of what the MISRA standards call “run-time errors”, so it covers MISRA C rule 21.1.
2. Partial correctness – all the above, and the program meets its specification, if it terminates. The specification you provide may be a full functional specification, or a partial specification covering the most important parts, such as safety properties.
3. Full correctness – all the above, and the program terminates.
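To make the three levels concrete, here’s a rough sketch of how the specifications might differ for a trivial clamping function, written with the eCv-style pre and post annotations that appear later in this post. The clamp functions are hypothetical, purely for illustration:

#include "ecv.h"

/* Level 1, predictable execution: no postcondition required. The tool
   merely proves the body free of run-time errors such as overflow. */
int clampPredictable(int x, int lo, int hi);

/* Level 2, partial correctness: a partial specification stating a
   safety property, namely that the result lies in the requested range. */
int clampPartial(int x, int lo, int hi)
pre(lo <= hi)
post(result >= lo)
post(result <= hi);

/* Level 3, full correctness: a full functional specification, plus a
   proof of termination (trivial here, because there are no loops). */
int clampFull(int x, int lo, int hi)
pre(lo <= hi)
post(result == (x < lo ? lo : x > hi ? hi : x));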
Most formal program verification systems support one or more of these levels. For example, SPARK Ada supports levels 1 and 2. In SPARK-speak, level 1 is called “exception freedom”, because Ada programs normally perform run-time checking, so run-time errors manifest themselves as exceptions (which are predictable). On the other hand, Perfect Developer is focused on full correctness, because it is a top-down tool that starts from specifications, leaving you no choice but to specify what the program is intended to achieve. eCv gives you the choice of all three levels.
You don’t have to choose a single verification level for the entire system: you might, for example, require full correctness for the most critical parts and be content with predictable execution elsewhere.
What difference does the choice of verification level make to you as a software developer? The main effect is that if you wish to prove partial or full correctness rather than just predictable execution, you will need to write function postconditions stating what the functions are intended to achieve. If you choose only to prove predictable execution, you will sometimes still need to write function postconditions, but they can be vaguer, stating something about what the function achieves rather than everything it is supposed to achieve.
Here’s an example. Suppose we have read a value from a sensor that has a non-linear response. We wish to linearize the value. To this end, we have a constant table of pairs of (raw value, linearized value), ordered such that both values are monotonically increasing. We will use a binary search to find the two adjacent entries whose raw values bracket the value read from the sensor, then interpolate between the corresponding linearized values. So the function looks something like this:
#include "ecv.h"
#include <stddef.h>

typedef unsigned short uint16_t;   /* C90 has no <stdint.h>; we assume unsigned short is 16 bits */

typedef struct { uint16_t raw; uint16_t corrected; } LinEntry;

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key);

uint16_t linearize(const LinEntry* array table, size_t nElems, uint16_t rawValue) {
    size_t index = bSearch(table, nElems, rawValue);
    if (index == 0) return table[0].corrected;                      /* off the bottom of the table */
    else if (index == nElems) return table[nElems - 1].corrected;   /* off the top of the table */
    else {
        /* Interpolate between the two adjacent entries that bracket rawValue. */
        LinEntry low = table[index - 1], high = table[index];
        return ( ((high.corrected - low.corrected)/(high.raw - low.raw))
                 * (rawValue - low.raw))
               + low.corrected;
    }
}

static const LinEntry linTable[] = { {0, 0}, {10, 15}, {20, 29}, ... };

... linearize(linTable, sizeof(linTable)/sizeof(linTable[0]), someData) ...
The bSearch function takes a pointer to an array of LinEntry records (see my earlier post on taming arrays if you haven’t seen the array keyword before), the number of elements in the array, and the raw data value. I’ve assumed it returns a value in the range 0..nElems, such that 0 means the raw value is off the bottom of the table, nElems means it is off the top, and any other value means that the entry it indexes and the previous entry bracket the raw value. If we want to prove partial or full correctness, we need to specify all of this; but we can be less precise if we’re just proving predictable execution. Let’s see what eCv makes of it with no annotation other than the array keywords:
c:\ecv\ex1.c (11,33): Warning! Unable to prove: Precondition of '[]' satisfied (see C:\ecv\ex1_unproven.htm#9), cannot prove: 0 < table.lim.
c:\ecv\ex1.c (12,43): Warning! Exceeded boredom threshold proving: Precondition of '[]' satisfied (see C:\ecv\ex1_unproven.htm#7), cannot prove: nElems < (1 + table.lim).
c:\ecv\ex1.c (14,29): Warning! Exceeded boredom threshold proving: Precondition of '[]' satisfied (see C:\ecv\ex1_unproven.htm#2), cannot prove: index < (1 + table.lim).
c:\ecv\ex1.c (14,54): Warning! Exceeded boredom threshold proving: Precondition of '[]' satisfied (see C:\ecv\ex1_unproven.htm#4), cannot prove: index < table.lim.
c:\ecv\ex1.c (15,17): Warning! Unable to prove: Precondition of '/' satisfied (see C:\ecv\ex1_unproven.htm#5), cannot prove: low.raw < high.raw.
There are a few problems here, all in function linearize:
- The first warning says that table[0] might be out-of-bounds, because table might be an empty array.
- The second says that table[nElems - 1] might be out-of-bounds, because we haven’t told eCv that nElems is the number of elements in table.
- The third and fourth say that table[index] and table[index - 1] might be out-of-bounds, because eCv doesn’t know that bSearch returns a value between 0 and nElems, and that table points to the start of an array with nElems elements.
- The final warning says that in the interpolation, the divisor (table[index].raw - table[index - 1].raw) might not be positive. Division by zero would typically cause an exception, while division by a negative number yields an implementation-defined result in C90, so eCv allows neither. The division will be safe if the raw values in table are monotonically increasing, as the illustration below shows.
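To see the hazard concretely, here are two hypothetical tables (my values, for illustration only) that violate monotonicity:

/* Hypothetical tables that break the interpolation, for illustration. */
static const LinEntry dupRaw[]  = { {10, 5}, {10, 7} };  /* equal raw values */
static const LinEntry downRaw[] = { {10, 5}, {8, 7} };   /* decreasing raw values */

With dupRaw, high.raw - low.raw is zero, so the interpolation divides by zero. With downRaw, on a typical platform where int is wider than 16 bits, the uint16_t operands are promoted to int and the difference is negative, giving the implementation-defined division just described. The monotonicity precondition we add below rules out both cases.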
So let’s give eCv the information it needs, by adding some partial specifications:
size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key)
post(result <= nElems);

uint16_t linearize(const LinEntry* array table, size_t nElems, uint16_t rawValue)
pre(table.lwb == 0; table.lim == nElems)
pre(nElems >= 1)
pre(forall i in 1..(nElems - 1) :- table[i].raw > table[i - 1].raw)
{
    size_t index = bSearch(table, nElems, rawValue);
    if (index == 0) return table[0].corrected;
    else if (index == nElems) return table[nElems - 1].corrected;
    else {
        LinEntry low = table[index - 1], high = table[index];
        return ( ((high.corrected - low.corrected)/(high.raw - low.raw))
                 * (rawValue - low.raw))
               + low.corrected;
    }
}
In the postcondition of bSearch, we’ve specified that the return value is no greater than nElems (since it is of type size_t, it can’t be less than zero anyway). In the precondition of linearize we’ve said that table is a regular array with lower bound zero and limit nElems. These are enough to ensure that the array accesses table[index - 1] and table[index] in the else-part of the if-statement are in bounds, because the earlier parts of the if-statement handle the cases index == 0 and index == nElems. We’ve also specified that nElems is at least one, to take care of the access to table[0].
The final precondition we’ve added to linearize says that the raw values in table are monotonically increasing, by stating that for all values i from 1 up to nElems - 1 (i.e. the highest index of table), table[i].raw is greater than table[i - 1].raw. In the else-part we know that 1 <= index <= nElems - 1, so instantiating this condition with i == index tells eCv that high.raw > low.raw, which is exactly what the division needs.
These partial specifications are sufficient for eCv to prove that function linearize executes predictably, assuming that it is called in a state satisfying its preconditions, and that bSearch executes predictably, has no side-effects, and returns a value satisfying its postcondition. The preconditions of linearize will be verified at every place where it is called. We’ll look at verifying bSearch next time.
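In the meantime, for readers who want to experiment, here is one plausible implementation of bSearch that is consistent with the contract assumed above. It is a sketch only, not yet annotated beyond its signature or verified:

/* Returns the number of entries whose raw value is <= key: 0 if key is
   below table[0].raw, nElems if key is at or above table[nElems - 1].raw,
   and otherwise an index i such that table[i - 1].raw <= key < table[i].raw,
   which is exactly the bracketing behaviour linearize relies on. */
size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key) {
    size_t low = 0, high = nElems;
    while (low < high) {
        size_t mid = low + (high - low) / 2;   /* midpoint without overflow */
        if (table[mid].raw <= key) {
            low = mid + 1;    /* every entry up to mid has raw <= key */
        } else {
            high = mid;       /* every entry from mid onward has raw > key */
        }
    }
    return low;               /* always in 0..nElems, matching post(result <= nElems) */
}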