Specification with Ghost Functions
In my previous post I showed that the C expression sublanguage extended with quantified expressions (forall and exists) is insufficient to allow some specifications to be expressed. I presented this function (annotated with an incomplete specification) to average an array of data:
int16_t average(const int16_t * array readings, size_t numReadings) pre(readings.lwb == 0; readings.lim == numReadings) pre(numReadings != 0) post(result == ? /* sum of elements of readings */ /numReadings) { int sum = 0; size_t i; for (i = 0; i < numReadings; ++i) keep(i <= numReadings) keep(sum == ? /* sum of first i elements of readings */ ) decrease(numReadings - i) { sum += readings[i]; } return (int16_t)(sum/numReadings); }
The two question marks ? need to be replaced by expressions for sums over the appropriate elements. Last time I showed how we can replace them by over-expressions. In this post, I’ll describe an alternative solution that has more general applications.
The problem with which we are faced is that there is no C expression (even if we include forall and exists expressions) that can express the sum of some of array elements, where the number of elements is not statically known. We can calculate the sum using a loop; but a loop is not an expression and so cannot be used in specifications. However, specification expressions can contain function calls, provided that the functions have no side-effects. We can resolve our problem by defining a ghost function that calculates the sum. We’ll specify the function recursively so that we can handle the variable number of elements.
In the postcondition of average, we need to calculate the sum of all the elements of the readings array. In the loop variant, we need the sum of just the first i elements. So let’s declare a function that returns the sum of the first n elements of an array a, where a and n are parameters. We define the sum to be zero if n is zero, otherwise it’s the sum of the first n-1 elements plus the nth element. Here’s our ghost function specification:
#ifdef __ECV__ ghost integer sumOf(const int16_t * array a, size_t n) pre(a.lwb == 0; n <= a.lim) decrease(n) returns(n == 0 ? 0 : a[n - 1] + sumOf(a, n - 1)); #endif
Declaring the function ghost tells eCv that it is for use in specifications only. A ghost function must have a specification but no body. It can also do a few things that wouldn’t be allowed for a normal function. In particular, it can use types not normally allowed in C, such as integer, which is eCv’s name for the type of unbounded integers. I’ve declared the function as returning integer so that neither we nor eCv need worry about arithmetic overflow in the specification.
The precondition defines the conditions for the function to be well-behaved, as usual. I’ve specified the value that the function returns using a postcondition of the form returns(expression). This has the same meaning as post(result == expression), but it’s shorter. Also, we want to define the function recursively, and eCv doesn’t currently allow a recursive call to the function being specified inside post(…) but it does inside returns(…). I’ve enclosed the whole function declaration in #ifdef __ECV__ … #endif so that the C compiler doesn’t see it.
Recursion would normally be forbidden in critical software (MISRA rule 16.2), however recursive calls in specifications are safe because they don’t carry any risk of run-time stack overflow. To ensure that the specification makes sense, eCv will need to prove that the recursion is bounded. That’s what the decrease clause is for, and it works exactly as if it were a loop variant – that is, it must decrease on each recursive call and it must not go negative.
Having defined sumOf, we can complete the postcondition and loop invariant:
int16_t average(const int16_t * array readings, size_t numReadings) pre(readings.lwb == 0; readings.lim == numReadings) pre(numReadings != 0) returns(sumOf(readings, numReadings)/numReadings) { int sum = 0; size_t i; for (i = 0; i < numReadings; ++i) keep(i <= numReadings) keep(sum == sumOf(readings, i)) decrease(numReadings - i) { sum += readings[i]; } return (int16_t)(sum/numReadings); }
I’ve taken the liberty of replacing the original post(…) postcondition of average with the returns(…) form.
Just like the version of average that I gave in my previous post, this one verifies completely except for possible overflow during integer arithmetic. I’ll show how we can deal with that next time.