 # Terms Describing Statistics Used

Causation When a change in the independent variable can be shown to directly produce a change in the dependent variable. The effect must be shown statistically to result from the intervention and not from chance. See “Significance.”

Chi-squared test Generally abbreviated as χ². Compares the proportions of two or more groups to see if the associations or trends are statistically significant.
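
As a rough illustration of what a chi-squared test does with two groups, the sketch below compares observed counts with the counts expected if the groups did not differ. The function name `chi_squared_2x2` and all counts are made up for this example. It assumes a simple 2x2 table with no continuity correction, which may differ from what a given study used.

```python
from math import erfc, sqrt

def chi_squared_2x2(a, b, c, d):
    """Chi-squared statistic and p value for the 2x2 table [[a, b], [c, d]].

    Assumption: no continuity correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    # Counts expected in each cell if the two groups had the same proportions.
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # A 2x2 table has 1 degree of freedom, where the upper-tail
    # probability equals erfc(sqrt(chi2 / 2)).
    p = erfc(sqrt(chi2 / 2))
    return chi2, p

# Hypothetical counts: 30 of 100 in group A and 15 of 100 in group B
# had the outcome of interest.
chi2, p = chi_squared_2x2(30, 70, 15, 85)
print(f"chi-squared = {chi2:.2f}, p = {p:.3f}")
```

With these made-up counts the p value falls below the usual .05 cutoff, so the difference in proportions would be reported as significant.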

Confidence Interval Often abbreviated CI. The range within which the researcher is confident, usually at the 95% level, that the true result for the wider population lies. A narrower CI indicates a more precise estimate.
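
To show why a larger study tends to give a narrower interval, here is a minimal sketch that computes an approximate 95% CI for a mean. The function name `ci95_for_mean` and the blood pressure values are hypothetical. The multiplier 1.96 comes from the normal distribution and is only an approximation; for small samples researchers would typically use a slightly larger value.

```python
from math import sqrt
from statistics import mean, stdev

def ci95_for_mean(values):
    """Approximate 95% CI for the mean (assumes the normal multiplier 1.96)."""
    m = mean(values)
    standard_error = stdev(values) / sqrt(len(values))
    return m - 1.96 * standard_error, m + 1.96 * standard_error

# Hypothetical systolic blood pressures (mmHg).
small_sample = [118, 122, 125, 130, 135]
# The same values repeated 8 times: an artificial way to get a larger n
# with a similar spread.
large_sample = small_sample * 8

low_small, high_small = ci95_for_mean(small_sample)
low_large, high_large = ci95_for_mean(large_sample)
print(f"n = {len(small_sample)}: {low_small:.1f} to {high_small:.1f}")
print(f"n = {len(large_sample)}: {low_large:.1f} to {high_large:.1f}")
```

Both intervals are centered on the same mean, but the interval from the larger sample is narrower.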

Correlation A statistical measure that shows a relationship between two sets of data. A positive correlation means that when one goes up, so does the other. One example of a positive correlation would be that when the temperature goes up, so do ice cream sales. A negative correlation means that when one goes up, the other goes down. An example here would be that when the temperature goes up, sales of mittens go down. CORRELATION DOES NOT INDICATE CAUSATION! It is a signal worth investigating, but you cannot say that one causes the other. One example of this is that over the last 150 years, there has been a negative correlation between household income and horse ownership. Incomes have gone way up, horse ownership way down. However, it would be incorrect to say that having more money makes people not want horses. Other factors, like the invention of cars, have more to do with the decline of horse ownership than income.

Correlation Coefficient A number that expresses the strength of the correlation. It has several abbreviations because there are several ways to calculate it; some of them are r, rₛ and Pearson r. A correlation coefficient of 0 shows no relationship. The further the coefficient is from 0, the stronger the correlation. A coefficient of +1 or -1 is an absolute correlation, so the number never goes above 1 or below -1.
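
As a small sketch of how one common version, the Pearson r, is calculated, the code below reuses the temperature examples from “Correlation.” The function name `pearson_r` and all numbers are invented for illustration, and the data are deliberately perfect straight lines so the results land at the extremes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two variables move together ...
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # ... scaled by how much each one varies on its own.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_movement / (spread_x * spread_y)

# Hypothetical data.
temperature = [10, 15, 20, 25, 30]
ice_cream_sales = [20, 30, 40, 50, 60]  # rises in step with temperature
mitten_sales = [50, 40, 30, 20, 10]     # falls in step with temperature

r_positive = pearson_r(temperature, ice_cream_sales)
r_negative = pearson_r(temperature, mitten_sales)
print(round(r_positive, 3), round(r_negative, 3))
```

Real data are never this tidy, so reported coefficients usually fall somewhere between these extremes.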

Data The information collected in the study. May be numerical (blood pressures, complication rates) or other information, like survey or interview responses.

Mean Often abbreviated x̄ for the sample and μ for the population. The mathematical average: add up all the results and divide by the number of results. Can easily be thrown off by outliers.

Median The middle value of the data: half the data points fall above it and half fall below.

Mode The value that occurs most often.
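
The mean, median and mode can give quite different answers for the same data. The sketch below uses a made-up list of labor lengths with one outlier to show how the mean gets pulled while the median and mode stay put.

```python
from statistics import mean, median, mode

# Hypothetical labor lengths in hours; the last value is an outlier.
hours = [6, 7, 7, 8, 9, 40]

average = mean(hours)   # add everything up, divide by the count
middle = median(hours)  # halfway point of the sorted values
most_common = mode(hours)

print(f"mean = {average:.1f}, median = {middle}, mode = {most_common}")
```

Here the single 40-hour value drags the mean well above every other data point, which is why some studies report the median instead.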

Multiple Regression Similar to correlation but with more than two variables. Please read the description of “Correlation.”

Normal Distribution The familiar bell-shaped curve where most data falls in the middle and less falls at the edges. See the graph under “Standard Deviation.” In a normal distribution, the mean, median and mode are all at the peak of the bell curve. Used as a theoretical reference against which real data are compared.

Odds ratio Often abbreviated “OR.” The odds of an outcome in one group compared with the odds in another. If a study reports an OR of 1.4 for an adverse outcome in babies born by cesarean, it means those babies had 1.4 times the odds of an adverse outcome compared with the control group. In this case, the control group is likely babies born vaginally. An OR of 1 means no difference between the groups.
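
To make the arithmetic concrete, here is a minimal sketch with invented counts chosen so the result comes out to 1.4. The function name `odds_ratio` and the numbers are hypothetical and are not taken from any study.

```python
def odds_ratio(group_events, group_non_events, control_events, control_non_events):
    """Odds of the outcome in one group divided by the odds in the control group."""
    group_odds = group_events / group_non_events
    control_odds = control_events / control_non_events
    return group_odds / control_odds

# Hypothetical counts: 14 adverse outcomes and 100 without in the cesarean
# group; 10 adverse outcomes and 100 without in the control group.
or_value = odds_ratio(14, 100, 10, 100)
print(round(or_value, 2))
```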

Outlier A point of data that falls way off from the rest of the data. Some researchers will use analysis techniques that reduce the effect of outliers; others will exclude outliers from analysis.

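Relative Risk Abbreviated RR. The probability of an outcome in one group divided by the probability of that outcome in another group. Closely related to the odds ratio and read in a similar way, although the two are calculated differently. See “Odds ratio.”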
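
As a rough sketch of how relative risk is calculated and how it compares with an odds ratio from the same data, the code below uses invented counts. The function name `relative_risk` is hypothetical.

```python
def relative_risk(group_events, group_total, control_events, control_total):
    """Risk of the outcome in one group divided by the risk in the control group."""
    return (group_events / group_total) / (control_events / control_total)

# Hypothetical counts: 14 adverse outcomes among 114 births in one group,
# 10 among 110 in the control group.
rr = relative_risk(14, 114, 10, 110)

# Odds ratio from the same counts, for comparison (events / non-events).
or_value = (14 / 100) / (10 / 100)

print(f"RR = {rr:.2f}, OR = {or_value:.2f}")
```

When the outcome is uncommon, as in this made-up example, the two numbers are close. They drift apart as the outcome becomes more common.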

Sample Size Abbreviated by a lower case “n.” The number of items studied. Can be people, can be incidents, can be outcomes.

Significance Usually reported as a p value, written with a lower case p. A statistically calculated value that shows how likely it is that a difference as large as the one observed would arise from chance alone. The smaller the p value, the more significant the results. Most studies use a cutoff of p=.05 for significance, so a result with p below .05 is called significant. If p=.06, the authors will say the results are not significant or are “approaching significance.”
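
One way to see what “due to chance” means is to shuffle the group labels and count how often a difference at least as large as the real one appears. The sketch below does this for every possible relabeling of two tiny made-up groups. This permutation approach is chosen here only for illustration; published studies usually get their p values from tests such as the t-test or chi-squared test. The function name `permutation_p_value` is hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p value for a difference in means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    relabelings = 0
    at_least_as_large = 0
    # Try every way of picking which values get the "group A" label.
    for indices in combinations(range(len(pooled)), len(group_a)):
        chosen = set(indices)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        relabelings += 1
        # Small tolerance guards against floating point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            at_least_as_large += 1
    return at_least_as_large / relabelings

# Hypothetical scores for two groups of four.
group_a = [1, 2, 3, 4]
group_b = [5, 6, 7, 8]
p = permutation_p_value(group_a, group_b)
print(f"p = {p:.3f}")
```

Only 2 of the 70 possible relabelings produce a gap this large, so p is about .029, below the usual .05 cutoff.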

Standard Deviation Abbreviated “SD” or σ. Indicates how widely the data is spread out. Look at the graph below and imagine how it would change if the distance between standard deviations were narrower. A small standard deviation means the data is closely clustered together and the bell curve is tall and narrow, while a large standard deviation indicates a wide spread of data and a shorter, flatter bell curve. Researchers most often report the mean (average) together with a standard deviation.
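
As a quick illustration, the two made-up data sets below share the same mean but have very different standard deviations. The sketch uses the sample standard deviation (`statistics.stdev`); whether a particular study reports the sample or population version is not something this example can tell you.

```python
from statistics import mean, stdev

# Hypothetical measurements with the same average but different spread.
tightly_clustered = [48, 49, 50, 51, 52]
widely_spread = [30, 40, 50, 60, 70]

for label, values in [("tight", tightly_clustered), ("wide", widely_spread)]:
    print(f"{label}: mean = {mean(values)}, SD = {stdev(values):.1f}")

sd_tight = stdev(tightly_clustered)
sd_wide = stdev(widely_spread)
```

Reporting only the mean would make these two data sets look identical; the standard deviation is what reveals the difference.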

t-test Abbreviated with a lower case t. Compares the means (averages) of two groups to determine if the difference between them is significant. There are a variety of t-tests that can be used for different circumstances.
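
As a minimal sketch of one variety, the code below calculates the t statistic for two independent groups without assuming equal variances (often called Welch's t-test). The function name `welch_t` and the data are invented. Turning t into a p value also requires the t distribution and the degrees of freedom, which this sketch leaves out; in practice a statistics package handles that step.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """t statistic for two independent groups (Welch's version)."""
    # Standard error of the difference between the two means.
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical scores for two groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]
t = welch_t(group_a, group_b)
print(round(t, 2))
```

The further t is from 0, the less likely it is that the difference between the group means is due to chance.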