herraiz.org | Blog

Don't do empirical software engineering with Excel

Or any other statistical analysis. From time to time, I see papers published in empirical software engineering conferences and journals that use Excel for their statistical analyses (I will not point to any of those papers here :). I have never liked this much, but I thought it was mostly a matter of personal taste, and probably some prejudice too.

But I found this paper today:

On the accuracy of statistical procedures in Microsoft Excel 2007
B.D. McCullough, David A. Heiser
Computational Statistics and Data Analysis 52 (2008) 4570–4578

The paper points out some of the flaws of Excel 2007 in standard statistical methods. Two of them concern the least squares fitting of exponential models and the normality plot of a sample against the deciles of the Normal distribution. Apparently, this has been going on for years, with every new release of Excel. Some of the problems have even been claimed to be fixed, when in fact they still produce wrong results.
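As an illustration of how an exponential fit can go wrong, here is a minimal Python sketch. It is not taken from the paper, and it assumes, purely for illustration, that a tool fits y = a·exp(b·x) by running a linear regression on log(y) instead of minimizing the squared residuals of y itself. The data and the function names (`fit_loglinear`, `fit_nonlinear`, `sse`) are made up for the example, and the one-dimensional search assumes the error has a single minimum inside the bracket.

```python
import math

def sse(a, b, xs, ys):
    """Sum of squared residuals of y = a*exp(b*x) on the original scale."""
    return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))

def fit_loglinear(xs, ys):
    """Fit by ordinary least squares on (x, log y): the linearized shortcut."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx
    a = math.exp(mean_l - b * mean_x)
    return a, b

def best_a(b, xs, ys):
    """For a fixed b, the a that minimizes the SSE has a closed form."""
    num = sum(y * math.exp(b * x) for x, y in zip(xs, ys))
    den = sum(math.exp(2 * b * x) for x in xs)
    return num / den

def fit_nonlinear(xs, ys, b_lo, b_hi, iters=100):
    """Golden-section search over b, with a chosen optimally for each b."""
    ratio = (math.sqrt(5) - 1) / 2

    def profile(b):
        return sse(best_a(b, xs, ys), b, xs, ys)

    for _ in range(iters):
        c = b_hi - ratio * (b_hi - b_lo)
        d = b_lo + ratio * (b_hi - b_lo)
        if profile(c) < profile(d):
            b_hi = d
        else:
            b_lo = c
    b = (b_lo + b_hi) / 2
    return best_a(b, xs, ys), b

# Made-up sample, roughly exponential with some noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.9, 7.2, 20.5, 54.0]

a_ll, b_ll = fit_loglinear(xs, ys)
a_nl, b_nl = fit_nonlinear(xs, ys, b_ll - 1.0, b_ll + 1.0)

print("log-linear fit:", a_ll, b_ll, "SSE:", sse(a_ll, b_ll, xs, ys))
print("nonlinear fit: ", a_nl, b_nl, "SSE:", sse(a_nl, b_nl, xs, ys))
```

The two fits return different coefficients, and on the original scale the log-linear fit's sum of squared residuals is no smaller than the nonlinear one's. Both answer a legitimate question, but only the second is the least squares fit of the exponential model itself.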

If you use Excel in your research, I think you had better consider switching to GNU R and its superb IDE, RStudio.

Written on Nov 23 2012 | Tags: #research, #statistics, #excel