Data for Mining Software Repositories

Last week, Daniel Rodríguez (Information Engineering Research Unit, UAH) visited our department to discuss how we could start collaborating in the field of Mining Software Repositories: where to get data and which topics we could work on jointly. I prepared a set of slides with practical information about datasets, conferences and journals, to help facilitate the discussion. The slides are available on SlideShare.

The presentation contains links to datasets that can easily be used for empirical studies and that make it possible to conduct replicable studies. There is also a paper at MSR 2010 that describes the data sources used for the MSR Challenge. The paper, entitled Mining Challenge 2010: FreeBSD, GNOME Desktop and Debian/Ubuntu, describes the FreeBSD repositories, the FLOSSMetrics data about GNOME, and the Ultimate Debian Database. If you use the paper for your research, please consider citing it (download the BibTeX citation as a text file).

Written on Jun 25 2010 | Tags: #uax, #mining software repositories, #msr
