# Terms

Back in 1975, Fred Brooks wrote:

> Show me your flowcharts and conceal your tables, and I shall continue to be mystified; show me your tables and I won’t usually need your flowcharts: they’ll be obvious.

Along the same lines, listing the terms that someone needs to know in order to understand a topic is a quick and dirty way to figure out what a lesson on that topic needs to cover. I have therefore gone through two dozen empirical studies of software engineering and pulled out the terms they use that computer science undergraduates are unlikely to know. It’s an intimidating list, but if we want to teach software engineers how to apply data science to software engineering problems and how to understand empirical software engineering research, I think we’ll have to cover most of it.

The papers these terms came from are listed below the terms.

- accuracy
- alternative hypothesis
- Amdahl's Law
- analysis of variance
- Bayes' Rule
- Benjamini-Hochberg p-value correction
- Bernoulli distribution
- Bessel correction
- binomial distribution
- Bonferroni correction
- box-and-whisker plot
- central moment
- Chebyshev's Inequality
- chi-square test
- Cliff's δ
- Cohen's d
- Cohen's kappa
- conditional probability
- confidence interval
- continuity correction
- convergence
- correlation coefficient
- covariance
- covariance matrix
- cumulative distribution function
- dataframe
- degrees of freedom
- dependent variable
- descriptive statistics
- effect size
- expected value
- explanatory variable
- F-measure
- F-test
- false negative
- false positive
- Gamma distribution
- Gamma function
- geometric distribution
- goal-question-metric
- Greenhouse-Geisser correction
- harmonic mean
- histogram
- independent variable
- interquartile range
- Kano scale
- Kruskal-Wallis test
- Likert scale
- linear regression
- logistic regression
- long tail
- Mann-Whitney U test
- Mauchly's test for sphericity
- maximum likelihood estimation
- mean
- median
- method of moments
- multiple linear regression
- n-gram analysis
- negative binomial distribution
- negative binomial regression
- Noble's Rules
- Not a Number
- normal distribution
- nuisance factor
- null hypothesis
- one-sided distribution
- outlier
- overdispersion
- p hacking
- p value
- Poisson distribution
- pooled sample variance
- population
- population moment
- power law distribution
- precision
- principal component analysis
- probability density function
- probability mass function
- quartile
- rank correlation
- recall
- response variable
- sample
- sample moment
- sample variance
- Shapiro-Wilk test
- sigmoidal curve
- Spearman's rank correlation
- standard deviation
- standard normal distribution
- standard uniform distribution
- statistic
- statistical model
- t-distribution
- t-test
- tidy data
- uniform distribution
- variance
- violin plot
- Wilcoxon rank-sum test
- Wilcoxon signed rank test
- z-test
- Zipf's Law
- Zipf-Mandelbrot distribution

The papers are:
