Formal Specifications Better Than Function Points for Code Sizing
Mark Staples, Rafal Kolanski, Gerwin Klein, Corey Lewis, June Andronick, Toby Murray, Ross Jeffery, and Len Bass: "Formal Specifications Better Than Function Points for Code Sizing". ICSE'13, 2013, http://www.nicta.com.au/pub?doc=6416.
Size and effort estimation is a significant challenge for the management of large-scale formal verification projects. We report on an initial study of relationships between the sizes of artefacts from the development of seL4, a formally-verified embedded systems microkernel. For each API function we first determined its COSMIC Function Point (CFP) count (based on the seL4 user manual), then sliced the formal specifications and source code, and performed a normalised line count on these artefact slices. We found strong and significant relationships between the sizes of the artefact slices, but no significant relationships between them and the CFP counts. Our finding that CFP is poorly correlated with lines of code is based on just one system, but is largely consistent with prior literature. We find CFP is also poorly correlated with the size of formal specifications. Nonetheless, lines of formal specification correlate with lines of source code, and this may provide a basis for size prediction in future formal verification projects. In future work we will investigate proof sizing.
Donald Knuth reportedly once said that C was the best language ever invented for analyzing algorithms because you could see how many atomic operations there were simply by counting punctuation. In this paper, Staples et al. apply that idea at a much, much higher level. Does the size of the formal specification for part of a micro-kernel correlate with the number of lines required to implement it? The answer appears to be "yes", and moreover, that correlation is much better than the correlation between lines of code and function points. This doesn't necessarily mean that formal specifications are worth writing solely as an aid to project cost estimation, but it does open up some new practical applications—not least, the question of whether the effort saved by having a formal specification repays the cost of creating it. We look forward to seeing this work replicated on other systems.