Estimating the number of remaining defects

As previously stated, the principal deficiency of the capture-recapture models tested within the software engineering domain is that their results tend to diverge from the ideal result under non-ideal conditions. In these situations the models can be regarded as having insufficient information to correctly resolve the "more complex" problem, i.e. the computation is under-constrained. This becomes relatively obvious when we consider that some inspection models recommend having only two participants. Fortunately, recent work provides some insight into the inspection process that can be utilised to solve this problem. Controlled experiments have investigated the ability of novice inspectors to predict the effectiveness of the inspection process, and the results suggest that even novice inspectors are relatively accurate at predicting this quantity. (A recent study in cost estimation provides evidence that if users are held accountable for their estimates, the accuracy of the estimates increases. Subsequently, this direct estimate of effectiveness can become an important component in quality assurance activities.) Hence, this additional information could be utilised to stabilise the estimation process. That is, the estimation process can be recast as a Bayesian estimation process with the subjective performance estimate deployed as the prior distribution. Additionally, the system may utilise historical inspection information to further regularise the estimate. On subsequent rounds of inspection, the previous estimate would be used in place of the historical information.
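To make the Bayesian formulation concrete, the following is a minimal sketch rather than the estimator proposed here. It assumes the simplest capture-recapture likelihood (every inspector detects every defect with the same unknown probability, integrated out under a uniform prior) and encodes the team's subjective effectiveness estimate as a discretised normal prior on the total number of defects, centred on the number of distinct defects found divided by the estimated effectiveness. The function names, the rel_sd spread parameter and the grid limit n_max are all illustrative assumptions.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function, computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def posterior_total_defects(found_per_inspector, distinct_found,
                            effectiveness, rel_sd=0.25, n_max=None):
    """Posterior over the total defect count N on a grid of integers.

    found_per_inspector: number of defects each inspector reported.
    distinct_found: number of distinct defects across all inspectors.
    effectiveness: subjective estimate (0..1] of the fraction found.
    rel_sd: assumed relative spread of the prior (hypothetical choice).
    """
    k = len(found_per_inspector)
    s = sum(found_per_inspector)
    d = distinct_found
    if not 0.0 < effectiveness <= 1.0:
        raise ValueError("effectiveness must lie in (0, 1]")
    if d <= 0 or d < max(found_per_inspector) or d > s:
        raise ValueError("distinct_found is inconsistent with the counts")
    if n_max is None:
        n_max = 10 * d  # assumed upper limit of the grid

    centre = d / effectiveness  # prior guess at the total defect count
    sigma = rel_sd * centre

    log_post = {}
    for n in range(d, n_max + 1):
        # Equal-detection-probability likelihood with p integrated out
        # under a uniform prior; terms that do not depend on N are dropped.
        log_lik = (math.lgamma(n + 1) - math.lgamma(n - d + 1)
                   + log_beta(s + 1, k * n - s + 1))
        # Subjective effectiveness estimate acting as the prior on N.
        log_prior = -0.5 * ((n - centre) / sigma) ** 2
        log_post[n] = log_lik + log_prior

    # Normalise in a numerically stable way.
    peak = max(log_post.values())
    weights = {n: math.exp(v - peak) for n, v in log_post.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def expected_remaining(posterior, distinct_found):
    """Posterior mean of the number of defects not yet found."""
    return sum(n * p for n, p in posterior.items()) - distinct_found
```

As a usage illustration with made-up counts, posterior_total_defects([12, 10], 16, 0.8) returns a distribution over the total defect count for two inspectors who reported 12 and 10 defects, 16 of them distinct, and who judged the inspection to be 80% effective. For a subsequent inspection round, the normal prior could be replaced by the posterior from the previous round, in line with the procedure described above.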

This type of estimator was first explored in the zoological field. That field has again shown that no unique optimal estimator exists, and it has produced a number of different estimators based upon different problem formulations. Much of this work can be adapted to a software engineering context to provide a starting point for finding acceptable solutions. As with the zoological formulations, it is believed that a single optimal estimator will not exist for the defect estimation problem. Hence, techniques for combining multiple models, such as stacking, should be explored in an attempt to combine the best "features" of the various models under different conditions. For example, it was shown that although heterogeneous models produce the best average performance, the maximum likelihood model can be utilised as an effective upper bound on the estimate. Finally, it is envisaged that the system would be deployed as an advisor to the moderator, who would still be responsible for the final decision. Although it is believed that the proposed system will produce accurate estimates, a situation can always arise in which the current inspection diverges sharply from all previous understanding of software inspections. Hence, it is essential that the human component remain within the decision loop.
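The combination of models can be illustrated with a deliberately small sketch. It assumes only two base estimators, learns a single stacking weight by least squares from historical inspections in which the true defect count eventually became known, and then caps the combined estimate at the maximum likelihood estimate, following the upper-bound observation above. The function names and the restriction of the weight to the interval [0, 1] are assumptions made for illustration; a practical system would likely stack more models and validate the weights on held-out inspections.

```python
def fit_stacking_weight(est_a, est_b, actual):
    """Least-squares weight w for the blend w * a + (1 - w) * b.

    est_a, est_b: historical estimates from two base models.
    actual: the defect counts later established for those inspections.
    The weight is clipped to [0, 1] so the blend stays between the models.
    """
    if not (len(est_a) == len(est_b) == len(actual)) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    num = sum((a - b) * (y - b) for a, b, y in zip(est_a, est_b, actual))
    den = sum((a - b) ** 2 for a, b in zip(est_a, est_b))
    if den == 0:
        return 0.5  # the two models always agree, so any weight is equivalent
    return min(1.0, max(0.0, num / den))


def combined_estimate(a, b, weight, ml_upper):
    """Blend two estimates and cap the result at the ML estimate."""
    blended = weight * a + (1.0 - weight) * b
    return min(blended, ml_upper)
```

In use, fit_stacking_weight would be called once on the historical record, and combined_estimate would then be applied to the base estimates from the current inspection, with the result presented to the moderator as advice rather than as a decision.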

Although the estimation process is presented within an inspection context, it has much wider application within software engineering and can be applied to many defect-oriented situations, such as alpha testing.