herraiz.org | Blog


Data for Mining Software Repositories

Last week, Daniel Rodríguez (Information Engineering Research Unit, UAH) visited our department to discuss how we could start collaborating in the field of Mining Software Repositories, where to get data, and which topics we could work on jointly. I prepared a set of slides with practical information about datasets, conferences and journals to help guide the discussion. The slides are available on SlideShare:

Mining Software Repositories

The presentation contains links to datasets that can be easily used for empirical studies and that make it possible to conduct replicable studies. There is also a paper at MSR 2010 that describes the data sources used for the MSR Challenge. The paper is entitled Mining Challenge 2010: FreeBSD, GNOME Desktop and Debian/Ubuntu, and it describes the FreeBSD repositories, the FLOSSMetrics data about GNOME, and the Ultimate Debian Database. If you use the paper for your research, please consider citing it (download the BibTeX citation as a text file):

  author    = {Abram Hindle
               and Israel Herraiz
               and Emad Shihab
               and Zhen Ming Jiang},
  title     = {Mining {C}hallenge 2010:
               {F}ree{BSD}, {GNOME} {D}esktop
               and {D}ebian/{U}buntu},
  booktitle = {Proceedings of the
               7th IEEE International Working Conference
               on Mining Software Repositories},
  pages     = {82--85},
  year      = {2010},
  publisher = {IEEE Computer Society},
Written on Jun 25 2010 | Tags: #uax, #mining software repositories, #msr