Title: Technical debt forecasting: An empirical study on open-source repositories
Authors: Tsoukalas, Dimitrios
Kehagias, Dionysios
Siavvas, Miltiadis
Chatzigeorgiou, Alexander
Type: Article
Subjects: FRASCATI::Natural sciences::Computer and information sciences
Keywords: technical debt
software maintenance
software quality
Issue Date: 2020
Publisher: Elsevier
Source: Journal of Systems and Software
Volume: 170
First Page: 110777
Abstract: Technical debt (TD) is commonly used to indicate additional costs caused by quality compromises that can yield short-term benefits in the software development process, but may negatively affect the long-term quality of software products. Predicting the future value of TD could facilitate decision-making tasks regarding software maintenance and assist developers and project managers in taking proactive actions regarding TD repayment. However, no notable contributions exist in the field of TD forecasting, indicating that it is a scarcely investigated field. To this end, in the present paper, we empirically evaluate the ability of machine learning (ML) methods to model and predict TD evolution. More specifically, an extensive study is conducted, based on a dataset that we constructed by obtaining weekly snapshots of fifteen open source software projects over three years and using two popular static analysis tools to extract software-related metrics that can act as TD predictors. Subsequently, based on the identified TD predictors, a set of TD forecasting models are produced using popular ML algorithms and validated for various forecasting horizons. The results of our analysis indicate that linear Regularization models are able to fit and provide meaningful forecasts of TD evolution for shorter forecasting horizons, while the non-linear Random Forest regression performs better than the linear models for longer forecasting horizons. In most of the cases, the future TD value is captured with a sufficient level of accuracy. These models can be used to facilitate planning for software evolution budget and time allocation. The approach presented in this paper provides a basis for predictive TD analysis, suitable for projects with a relatively long history. To the best of our knowledge, this is the first study that investigates the feasibility of using ML models for forecasting TD.
ISSN: 0164-1212
Other Identifiers: 10.1016/j.jss.2020.110777
Appears in Collections: Department of Applied Informatics

Files in This Item:
File: tsoukalas_jss_preprint.pdf
Size: 2.11 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.