Changes and Bugs – Mining and Predicting Development Activities

by Thomas Zimmermann

Software development results in a huge amount of data: changes to source code are recorded in version archives, bugs are reported to issue tracking systems, and communications are archived in e-mails and newsgroups. In this thesis, we present techniques for mining version archives and bug databases to understand and support software development.
First, we present techniques that mine version archives for fine-grained changes. We introduce the concept of co-addition of method calls, which we use to identify patterns that describe how methods should be called. We use dynamic analysis to validate these patterns and identify violations. The co-addition of method calls can also detect cross-cutting changes, which are an indicator of concerns that could have been realized as aspects in aspect-oriented programming.
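To make the idea of co-addition concrete, the following is a minimal sketch and not the thesis's actual implementation. It assumes each transaction has already been reduced to the set of method calls it added; the function name `co_added_call_pairs`, the `min_support` threshold, and the sample data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_added_call_pairs(transactions, min_support=2):
    """Count pairs of method calls that were added in the same transaction.

    `transactions` is an iterable of collections of added call names
    (an assumed input format). Pairs seen in fewer than `min_support`
    transactions are dropped.
    """
    pair_counts = Counter()
    for added_calls in transactions:
        # Sort so each unordered pair has one canonical form;
        # each pair is counted at most once per transaction.
        for pair in combinations(sorted(set(added_calls)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Hypothetical transactions, each listing the calls it added.
transactions = [
    {"lock", "unlock"},
    {"lock", "unlock", "log"},
    {"open", "close"},
    {"lock", "log"},
]
patterns = co_added_call_pairs(transactions)
```

Pairs that recur across transactions, such as `lock` and `unlock` here, are candidates for usage patterns; code that calls one method of a pair without the other would then be a candidate violation to check further.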
Second, we present techniques to build models that can successfully predict the most defect-prone parts of large-scale industrial software; in our experiments, this was Windows Server 2003. This helps managers allocate quality assurance resources to those parts of a system that are expected to have the most defects. The proposed measures on dependency graphs outperformed traditional complexity metrics. In addition, we found empirical evidence for a domino effect: depending on defect-prone binaries increases the chances of having defects.
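As an illustration of what measures on a dependency graph can look like, here is a minimal sketch. It is an assumption for illustration only: the thesis's actual measures and prediction models are not reproduced here, and the function name `dependency_measures`, the measure names, and the sample binaries are hypothetical.

```python
def dependency_measures(dependencies, defect_prone):
    """Compute simple per-binary measures on a dependency graph.

    `dependencies` maps each binary to the binaries it depends on.
    `defect_prone` is the set of binaries already known to be defect-prone.
    """
    nodes = set(dependencies)
    for targets in dependencies.values():
        nodes.update(targets)

    # fan_in: how many distinct binaries depend on this one.
    fan_in = {node: 0 for node in nodes}
    for targets in dependencies.values():
        for target in set(targets):
            fan_in[target] += 1

    measures = {}
    for node in nodes:
        targets = set(dependencies.get(node, ()))
        risky = len(targets & defect_prone)
        measures[node] = {
            "fan_out": len(targets),
            "fan_in": fan_in[node],
            # Share of direct dependencies that are defect-prone.
            "defect_prone_dependency_ratio": risky / len(targets) if targets else 0.0,
        }
    return measures


# Hypothetical dependency graph between binaries.
dependencies = {"a.dll": ["b.dll", "c.dll"], "b.dll": ["c.dll"], "c.dll": []}
defect_prone = {"c.dll"}
measures = dependency_measures(dependencies, defect_prone)
```

Measures like these could serve as input features for a defect prediction model. The ratio of defect-prone dependencies is one simple way to probe for the domino effect described above.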


Thomas Zimmermann. Changes and Bugs – Mining and Predicting Development Activities. PhD Thesis, Saarland University, May 2008.

BibTeX Entry

    title = "Changes and Bugs – Mining and Predicting Development Activities",
    author = "Thomas Zimmermann",
    year = "2008",
    month = "May",
    school = "Saarland University",
    type = "PhD Thesis",