Data

Data files used to study the distribution of growth in software systems

Swinburne University of Technology
Rajesh Vasa (Owned by)
Identifier: http://hdl.handle.net/1959.3/190170
Publisher: Swinburne University of Technology
Creator: Rajesh Vasa
Date: 2011
Related: http://hdl.handle.net/1959.3/95058
Subjects: COMPUTER SOFTWARE AND SERVICES; INFORMATION AND COMMUNICATION SERVICES; Software evolution; Software maintenance; Metrics; Software Engineering; INFORMATION AND COMPUTING SCIENCES; COMPUTER SOFTWARE; PhD thesis; Open source software; Open Software
Type: dataset
Language: English

Access the data

The files are made available on open access with the kind permission of the author. Licence for re-use to be determined.

Copyright © 2010 Rajesh Vasa.

Full description

The evolution of a software system can be studied in terms of how its properties, as reflected by software metrics, change over time. Current models of software evolution allow inferences to be drawn about certain attributes of a software system, for instance its architecture and complexity and their impact on development effort. However, an inherent limitation of these models is that they provide no direct insight into where growth takes place. In particular, they cannot be used to assess the impact of evolution on the underlying distribution of size and complexity among the various classes. Such an analysis is needed to answer questions such as 'do developers tend to evenly distribute complexity as systems get bigger?' and 'do large and complex classes get bigger over time?'. These questions are of more than passing interest: by understanding what typical and successful software evolution looks like, we can identify anomalous situations and take action earlier than might otherwise be possible. An analysis of the distribution of growth will also show whether there are consistent boundaries within which a software design structure exists.

In our study of metric distributions, we focused on 10 measures that span a range of size and complexity. The data is supplied as a .zip file of about 0.5 MB in total, containing 4 .txt files and 1 .log file. The raw metric data is in comma-separated values (CSV) format, and the first line of the CSV data contains the header. Detailed output of the statistical analysis is provided as log output generated directly by Stata (statistical analysis software).
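As a rough illustration of how CSV data with a header line, as described above, might be explored, the sketch below reads such data and summarises how unevenly one metric is spread across rows. It is not the analysis performed in the study (which used Stata). The column names `class_name` and `method_count`, the sample values, and the helper `summarise_metric` are hypothetical, since this record does not document the actual file layout.

```python
import csv
import io
import statistics


def summarise_metric(csv_text, column):
    """Summarise one numeric column of CSV text whose first line is the header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Skip rows where the column is missing or blank, then sort ascending.
    values = sorted(
        float(row[column])
        for row in reader
        if (row.get(column) or "").strip()
    )
    if not values:
        raise ValueError(f"no values found in column {column!r}")
    total = sum(values)
    top_n = max(1, len(values) // 10)  # largest 10% of rows, at least one row
    top_share = sum(values[-top_n:]) / total if total else 0.0
    return {
        "count": len(values),
        "median": statistics.median(values),
        "max": values[-1],
        "top_10pct_share": top_share,  # fraction of the total held by the largest rows
    }


# Hypothetical sample data; the real column names are an assumption.
SAMPLE = """class_name,method_count
A,1
B,2
C,2
D,3
E,3
F,4
G,5
H,6
I,8
J,66
"""

if __name__ == "__main__":
    print(summarise_metric(SAMPLE, "method_count"))
```

A high `top_10pct_share` would suggest that a small number of classes hold most of the measured size or complexity, which is the kind of question the description raises. To use the sketch on a real file, read the file's text and pass it in together with one of the column names found in its header line.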

This dataset is part of a larger collection

145.03889,-37.8226

Identifiers