Codebook: Discovering and Exploiting Relationships in Software Repositories – ICSE 2010
Large-scale software engineering requires communication and collaboration to successfully build and ship products. We conducted a survey with Microsoft engineers on inter-team coordination and found that the most impactful problems concerned finding and keeping track of other engineers. Since engineers are connected by their shared work, a tool that discovers connections in their work-related repositories can help.
Here we describe the Codebook framework for mining software repositories. It is flexible enough to address all of the problems identified by our survey with a single data structure (graph of people and artifacts) and a single algorithm (regular language reachability). Codebook handles a larger variety of problems than prior work, analyzes more kinds of work artifacts, and can be customized by and for end-users. To evaluate our framework's flexibility, we built two applications, Hoozizat and Deep Intellisense. We evaluated these applications with engineers to show effectiveness in addressing multiple inter-team coordination problems.
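As a rough illustration of the "single data structure, single algorithm" idea, the sketch below runs a regular language reachability query over a small labeled graph of people and artifacts. It is a minimal sketch under stated assumptions, not Codebook's implementation: the node names, the edge labels, the "^-1" convention for inverse edges, and the dictionary encoding of the query automaton are all hypothetical and chosen only for this example.

```python
from collections import deque

# Hypothetical toy repository graph as (source, edge_label, target) triples.
# Node names and edge labels are illustrative assumptions, not Codebook's schema.
EDGES = [
    ("alice", "committed", "change_1"),
    ("change_1", "modifies", "parser.c"),
    ("bob", "committed", "change_2"),
    ("change_2", "modifies", "parser.c"),
    ("bug_7", "fixed_by", "change_2"),
    ("carol", "opened", "bug_7"),
]


def build_adjacency(edges):
    """Index edges by source node and add a labeled inverse for each edge."""
    adj = {}
    for src, label, dst in edges:
        adj.setdefault(src, []).append((label, dst))
        # Inverse edge, so that a query can walk a relationship backwards.
        adj.setdefault(dst, []).append((label + "^-1", src))
    return adj


def reachable(adj, start, dfa, dfa_start, accepting):
    """Return nodes reachable from `start` along paths whose sequence of
    edge labels is accepted by the DFA.

    `dfa` maps (state, label) -> next state. The search is a breadth-first
    traversal of the product of the graph and the automaton.
    """
    seen = {(start, dfa_start)}
    queue = deque(seen)
    results = set()
    while queue:
        node, state = queue.popleft()
        if state in accepting:
            results.add(node)
        for label, neighbor in adj.get(node, []):
            next_state = dfa.get((state, label))
            if next_state is None:
                continue  # this label is not allowed in the current state
            pair = (neighbor, next_state)
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return results


ADJ = build_adjacency(EDGES)

# Query "who committed a change that modifies this file?", written as the
# regular expression  modifies^-1 committed^-1  over edge labels.
COMMITTERS_DFA = {(0, "modifies^-1"): 1, (1, "committed^-1"): 2}

committers = reachable(ADJ, "parser.c", COMMITTERS_DFA, 0, {2})
print(sorted(committers))  # ['alice', 'bob']
```

Under these assumptions, a different question only needs a different automaton over the same graph. For example, the label sequence modifies^-1 fixed_by^-1 opened^-1 would find the people who opened bugs fixed by changes to the file.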
Reference
Andrew Begel, Khoo Yit Phang, Thomas Zimmermann. Codebook: Discovering and Exploiting Relationships in Software Repositories. In Proceedings of the 32nd International Conference on Software Engineering (ICSE 2010), Cape Town, South Africa, May 2010.
BibTeX Entry
@inproceedings{begel-icse-2010,
title = "Codebook: Discovering and Exploiting Relationships in Software Repositories",
author = "Andrew Begel and Khoo Yit Phang and Thomas Zimmermann",
year = "2010",
month = "May",
booktitle = "Proceedings of the 32nd International Conference on Software Engineering",
location = "Cape Town, South Africa",
}






