Realistic assessment of software effort estimation models

Boyce Sigweni, Martin Shepperd, Tommaso Turchi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Citations (Scopus)

Abstract

Context: It is unclear whether current approaches to evaluating or comparing competing software cost or effort models give a realistic picture of how they would perform in actual use. Specifically, we are concerned that the usual practice of using all data with some holdout strategy is at variance with the reality of a data set growing as projects complete. Objective: This study investigates the impact of using unrealistic, though possibly convenient to the researchers, ways to compare models on commercial data sets. Our questions are: does this lead to different conclusions in terms of the comparisons and, if so, are the results biased, e.g., more optimistic than those that might realistically be achieved in practice? Method: We compare a traditional approach based on leave-one-out cross-validation with growing the data set chronologically, using the Finnish and Desharnais data sets. Results: Our realistic, time-based approach to validation is significantly more conservative than leave-one-out cross-validation (LOOCV) for both data sets. Conclusion: If we want our research to lead to actionable findings, it is incumbent upon researchers to evaluate their models in realistic ways. This means a departure from LOOCV techniques, while further investigation is needed for other validation techniques, such as k-fold validation.
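The contrast between the two validation schemes is easier to see in code. The sketch below is not the authors' implementation: it compares conventional LOOCV, where every other project (past or future) is available for training, against a chronological, growing-window evaluation in which each project is predicted using only projects that completed before it. The synthetic data, the linear size-to-effort model, and the minimum window of five completed projects are illustrative assumptions; the study itself used the Finnish and Desharnais data sets.

```python
# A minimal sketch, not the authors' code: contrasting conventional LOOCV
# with a chronological (growing data set) evaluation of an effort model.
# The data are synthetic and the linear regression stands in for whatever
# estimation model is being assessed.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
n = 40
size = rng.uniform(50, 1000, n)              # project size, e.g. function points
effort = 5 * size + rng.normal(0, 300, n)    # actual effort, e.g. person-hours
X, y = size.reshape(-1, 1), effort           # rows assumed sorted by completion date

# 1) Conventional LOOCV: every other project, past or future, trains the model.
loocv_errors = []
for train_idx, test_idx in LeaveOneOut().split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    loocv_errors.append(abs(model.predict(X[test_idx])[0] - y[test_idx][0]))

# 2) Time-based validation: predict each project from earlier projects only,
#    so the training set grows as projects complete.
chrono_errors = []
for i in range(5, n):                        # wait until a few projects have finished
    model = LinearRegression().fit(X[:i], y[:i])
    chrono_errors.append(abs(model.predict(X[i:i + 1])[0] - y[i]))

print(f"Mean absolute error, LOOCV:         {np.mean(loocv_errors):.1f}")
print(f"Mean absolute error, chronological: {np.mean(chrono_errors):.1f}")
```

Because the chronological scheme withholds all future projects from training, its error estimates tend to be less optimistic than LOOCV's, which is the kind of bias the paper reports.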

Original language: English
Title of host publication: Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering, EASE 2016
Publisher: Association for Computing Machinery
Volume: 01-03-June-2016
ISBN (Electronic): 9781450336918
DOIs: https://doi.org/10.1145/2915970.2916005
Publication status: Published - Jun 1 2016
Event: 20th International Conference on Evaluation and Assessment in Software Engineering, EASE 2016 - Limerick, Ireland
Duration: Jun 1 2016 - Jun 3 2016

Other

Other: 20th International Conference on Evaluation and Assessment in Software Engineering, EASE 2016
Country: Ireland
City: Limerick
Period: 6/1/16 - 6/3/16


All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

Cite this

Sigweni, B., Shepperd, M., & Turchi, T. (2016). Realistic assessment of software effort estimation models. In Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering, EASE 2016 (Vol. 01-03-June-2016). [a41] Association for Computing Machinery. https://doi.org/10.1145/2915970.2916005