Mining Bug Databases
Meet iBUGS: a benchmark for defect localization
Researchers have proposed a number of tools for automatic bug localization. Given a program and a description of the failure, such tools pinpoint a set of statements that are most likely to contain the bug. Evaluating these tools is difficult because existing benchmarks are limited in both the size of their subjects and the number of bugs they contain.
We developed an approach that semiautomatically extracts benchmarks for bug localization from the history of a project. The result is the iBUGS dataset, a benchmark with real bugs for large test subjects (AspectJ, Rhino).
The iBUGS Repository – 401 bugs in AspectJ and Rhino
Extraction of Bug Localization Benchmarks from History – ASE 2007
Extraction of Bug Localization Benchmarks from History – extended TR
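To make the evaluation problem concrete, here is a minimal sketch of how a benchmark with known fixes could be used to score a localization tool. It is an illustration only, not the iBUGS tooling: the function name `effort_to_first_bug`, the `File:line` statement identifiers, and the choice of metric (the fraction of statements inspected before reaching the first fixed one) are all assumptions.

```python
def effort_to_first_bug(ranked_statements, fixed_statements, total_statements):
    """Fraction of the program inspected before reaching a fixed statement.

    ranked_statements: the tool's output, most suspicious statement first
                       (hypothetical "File:line" identifiers).
    fixed_statements:  statements changed by the real fix, taken from the
                       benchmark's ground truth.
    total_statements:  number of statements in the program.
    Returns None when the tool never ranks a fixed statement.
    """
    fixed = set(fixed_statements)
    for position, stmt in enumerate(ranked_statements, start=1):
        if stmt in fixed:
            return position / total_statements
    return None


# Made-up example: the fix touched Bar.java:40, which the tool ranks second.
ranked = ["Foo.java:12", "Bar.java:40", "Foo.java:88"]
score = effort_to_first_bug(ranked, {"Bar.java:40"}, total_statements=200)
```

A lower score means the developer inspects less code before reaching the defect. Averaging such scores over many real bugs is only meaningful when the benchmark provides enough bugs in realistically large programs.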
Identification of bug-introducing changes
Bug-fixes are widely used for predicting bugs or finding risky parts of software. However, a bug-fix does not contain information about the change that initially introduced the bug. Such bug-introducing changes can help identify important properties of software bugs, such as correlated factors or causes. For example, they reveal which developers or what kinds of source code changes introduce more bugs.
In contrast to bug-fixes, which are relatively easy to obtain, bug-introducing changes are challenging to extract. We developed algorithms to automatically and accurately identify bug-introducing changes.
When do Changes Induce Fixes? On Fridays – MSR 2005
HATARI: Raising Risk Awareness – ESEC/SIGSOFT FSE 2005
Automatic Identification of Bug-Introducing Changes – ASE 2006
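One common way to approximate a bug-introducing change is to trace the lines that a fix modifies or deletes back to the revision that last touched them. The sketch below illustrates that line-tracking idea on an in-memory history of a single file. It is an assumption-laden illustration and not the algorithms from the papers above: the helper names `blame` and `bug_introducing_candidates` and the snapshot format are hypothetical, and `difflib` stands in for a version control system's annotate feature.

```python
from difflib import SequenceMatcher


def blame(snapshots):
    """For each snapshot, list the revision id that last touched each line.

    snapshots: list of (revision_id, lines) pairs, oldest first.
    """
    origins = []
    prev_lines, prev_origin = [], []
    for rev_id, lines in snapshots:
        # Assume every line is new in this revision, then restore the
        # origin of lines that are unchanged from the previous snapshot.
        origin = [rev_id] * len(lines)
        matcher = SequenceMatcher(a=prev_lines, b=lines, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                origin[j1:j2] = prev_origin[i1:i2]
        origins.append(origin)
        prev_lines, prev_origin = lines, origin
    return origins


def bug_introducing_candidates(snapshots, fix_index):
    """Revisions that last touched the lines a fix replaced or deleted."""
    origins = blame(snapshots)
    before = snapshots[fix_index - 1][1]
    after = snapshots[fix_index][1]
    matcher = SequenceMatcher(a=before, b=after, autojunk=False)
    suspects = set()
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            suspects.update(origins[fix_index - 1][i1:i2])
    return suspects


# Made-up history: r2 adds a faulty guard, r3 fixes it.
history = [
    ("r1", ["def div(a, b):", "    return a / b"]),
    ("r2", ["def div(a, b):", "    if b == 0: return 1", "    return a / b"]),
    ("r3", ["def div(a, b):", "    if b == 0: raise ZeroDivisionError",
            "    return a / b"]),
]
suspects = bug_introducing_candidates(history, fix_index=2)
```

This sketch has clear limits: a fix that only adds lines yields no candidates, and purely cosmetic edits such as reformatting are blamed like any other change. Limits of this kind are part of what makes accurate identification challenging.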






