· ASSUMED MATH SKILLS:

· Rounding
Expectation: Expressing a fraction to the nearest whole number.

· Converting Fractions Into Decimals
Expectation: Converting three quarters to 0.75 etc.

· Addition and Subtraction
Expectation: Be able to add up and subtract.

· Multiplication and Division
Expectation: Be able to multiply and divide.

· Proportions and Percentages
Expectation: Be able to express a quantity as a proportion of a larger whole or as a percentage.

· Sum
Expectation: Add up

· Product
Expectation: multiply
Math Formula: (A)(B) = A x B = AB = A(B)

· Square
Definition:
a number multiplied by itself
Math Formula: \(A^2 = A \times A\)

· Square root
Description: The number that, when multiplied by itself, results in a given number.
E.g. the square root of 9 is 3, because 3 x 3 = 9.

· Numerator
Description: The top number of a fraction, indicating how many parts of the whole are being counted.
E.g. in the fraction 1/2, 1 is the numerator.

· Denominator
Definition: The bottom number of a fraction, indicating how many parts of the fraction would make one whole. So for a half, the number would be two, meaning the whole is divided into two parts.
E.g. in the fraction 1/2, 2 is the denominator.


· Nominal Scale

Definition: Nominal measurement consists of assigning items to groups or categories.
Discussion: No quantitative information is conveyed and no ordering of the items is implied. Nominal scales are therefore qualitative rather than quantitative. Religious preference, race, and sex are all examples of nominal scales. Frequency distributions are usually used to analyze data measured on a nominal scale. The main statistic computed is the mode. Variables measured on a nominal scale are often referred to as categorical or qualitative variables.


· Ordinal Scale

Definition: Measurements with ordinal scales are ordered in the sense that higher numbers represent higher values. However, the intervals between the numbers are not necessarily equal.
Example: On a five-point rating scale measuring attitudes toward gun control, the difference between a rating of 2 and a rating of 3 may not represent the same difference as the difference between a rating of 4 and a rating of 5. There is no "true" zero point for ordinal scales since the zero point is chosen arbitrarily. The lowest point on the rating scale in the example was arbitrarily chosen to be 1. It could just as well have been 0 or -5.


· Interval Scale

Definition: On interval measurement scales, a unit anywhere on the scale has the same magnitude.
Example: The difference in measuring pain between 10 and 11 units is the same as the difference between 332 and 333 units, or with temperature the difference between -19 and -18 degrees Fahrenheit is the same as the difference between 39 and 40.
Discussion: With interval scales there is not a 'true' zero point, so it's not possible to make statements about how many times higher one score is than another. For a scale measuring pain, it would not be valid to say that a person with a score of 30 had twice as much pain as a person with a score of 15. True interval measurement is somewhere between rare and nonexistent in the behavioral sciences. No interval-level scale of pain such as the one described in the example actually exists. A good example of an interval scale is the Fahrenheit scale for temperature. Equal differences on this scale represent equal differences in temperature, but a temperature of 40 degrees is not twice as warm as one of 20 degrees.


· Sum of Scores

Definitions (formal/informal): Add all the scores together
Math Symbol: \(\sum\) (capital sigma)
Math Formula: If scores (X) are 2,3,2,1,4,3 then the Sum of Scores is 15. In symbols, it is written: \(\sum X = 15\)


· Number of Observations/Subjects

Math Symbol: n and N.
n=num. of observations of a group.
N=num. of observations in an entire study.


· Degrees of Freedom

Definition: The number of observations in a data set that are free to vary.
Math Formula: df.
Relation to Other Stats (where it came from and what does it lead to): A statistic needed to calculate many statistics, including: mean, variance, covariance, chi-square, F-ratio, and a t-score.
What is Required to Render This Stat? Number of observations and the statistic that is to be computed.
Examples: For mean: df=n;
For variance: df = n-1;
For other stats, df is different


· Degrees of Freedom for the Mean

Definition: When computing the mean, degrees of freedom equals the number of observations.
Math Formula: \(df = n\)
Range of Possibilities: 1 to infinity
Relation to Other Stats This is required for computing the mean


· Degrees of freedom for variance

Definition: When computing the variance from sample data, degrees of freedom equals the number of observations minus one.
Discussion:
Why is it n-1 and not n? (a) It's just the way it is, so remember it. If you want some reasons for this, though, read on...
(b) If the sample variance is computed with n in the denominator, it is a biased estimator of the population variance: the results will systematically tend to be too small. Using n-1 for the df corrects this.
(c) Variance is based on deviation scores (deviations are used to compute the sum of the squared deviations). Deviation scores take the mean of the scores into account, and deviations from the mean always sum to zero. Therefore, if you know the mean and the deviations for all the scores except one, you can perfectly predict the last score. Thus, we can say that the last score is not free to vary. It does not contribute to the degrees of freedom.
Math Formula: \(df = n - 1\)
Relation to Other Stats (where it came from and what does it lead to): variance, st. dev.
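
Worked example (a sketch, not part of the original glossary): reasons (b) and (c) above can be checked by brute force. The four-score population and the helper name `variance` are assumptions made up for this illustration. The code draws every possible sample of size 2 (with replacement) and averages the sample variances computed with n and with n-1.

```python
from itertools import product

def variance(scores, df):
    """Sum of squared deviations divided by the chosen degrees of freedom."""
    mean = sum(scores) / len(scores)
    ss = sum((x - mean) ** 2 for x in scores)
    return ss / df

# Hypothetical population of four scores (illustration only).
population = [1, 2, 3, 4]
pop_var = variance(population, df=len(population))  # true population variance

# Every possible ordered sample of size n = 2, drawn with replacement.
n = 2
samples = list(product(population, repeat=n))

avg_with_n = sum(variance(s, df=n) for s in samples) / len(samples)
avg_with_n_minus_1 = sum(variance(s, df=n - 1) for s in samples) / len(samples)

print(pop_var)             # 1.25
print(avg_with_n)          # 0.625 (systematically too small)
print(avg_with_n_minus_1)  # 1.25  (matches the population variance)

# Reason (c): deviations from the mean sum to zero, so once the mean and
# all deviations but one are known, the last score is fixed.
scores = [2, 3, 2, 1, 4, 3]
mean = sum(scores) / len(scores)
known_deviations = [x - mean for x in scores[:-1]]
last_deviation = -sum(known_deviations)
print(mean + last_deviation)  # 3.0, the last score
```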


· Central Tendency

Measures of central tendency describe the "middle" or "center" of a distribution: its "typical" or "most representative" score, or the "best prediction" for it. Exactly what constitutes the "middle" or "center" of a distribution is somewhat vague, so the term "central tendency" refers to a variety of measures that describe this aspect of a set of scores in different ways. The 'mean' is the most frequently used measure of central tendency.

Common Measures of Central Tendency:

· Central Tendency-Mean

Definition: the sum of the scores divided by the number of scores

Math Formula: \(\bar{X} = \frac{\sum X}{n}\)

Example with Graphics:


<add numerical version of this example>
Relation to Other Stats: Requires: X, df. Leads to: variance, and many others.
Background: What is Required to Render This Stat? (Type of Data/Number of Groups): All the data points of a group/subj. scores. Sum of Scores; Number of Scores (n).

· Central Tendency-Median

Definition: Center score of an ordered list of scores. (Should you have an even number of scores, take the two middle scores and average them to get the median.)
Example: 1,1,1,1,2,2,3,4,6,7,7,8,8,9 Here you have 14 scores, and so the median would be the average between 3 and 4 = 3.5

· Central Tendency-Mode

Definition: most common score
E.g. From this data set: 1,1,1,1,2,2,3,4,6,7,7,8,8,9,9 the mode is 1. Bimodal would imply there are two scores that occur most frequently.
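
Worked example (a sketch): the three measures of central tendency can be computed with Python's standard `statistics` module. The data are the 14 scores from the median example above; the variable names are chosen only for this illustration.

```python
import statistics

# The 14 scores used in the median example.
scores = [1, 1, 1, 1, 2, 2, 3, 4, 6, 7, 7, 8, 8, 9]

mean = statistics.mean(scores)      # sum of scores (60) divided by n (14)
median = statistics.median(scores)  # average of the two middle scores, 3 and 4
mode = statistics.mode(scores)      # most common score

print(round(mean, 2))  # 4.29
print(median)          # 3.5
print(mode)            # 1
```

Note how the three measures disagree for this lopsided set of scores: the mode sits at the low end, while the mean is pulled upward by the larger scores.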


· Deviation

Definition: Deviation describes how much a point deviates from the mean. Deviation is the distance from mean to score. Each individual data point has a 'deviation'.
Formula: \(X - \bar{X}\) / Subtract the mean from the score.

Range of Possibilities: most negative: the lowest score minus the mean; zero: the score is the same as the mean; most positive: the highest score minus the mean.
Example with Graphics:

Relation to Other Stats (where it came from and what does it lead to): needed: mean, individual score; leads to: squared deviation, variance, covariance.
Background: What is Required to Render This Stat? (Type of Data/Number of Groups): The Mean


· Squared Deviation

Definition: The deviation multiplied by itself. Each point has a squared deviation. Squared deviation can be thought of as the area of a square whose side is the deviation of a point.
Discussion: Like Deviation, Squared Deviation represents how much the point deviates from the mean, but squaring enlarges the larger distances disproportionately (a deviation twice as large gives a squared deviation four times as large). This puts more weight on points that are farther from the mean. Squaring also makes all values positive.

Formula: \((X - \bar{X})^2\) / Take the deviation of a point and square it.

Range of Possibilities: Min: zero; Max: Squared extremes from mean to most extreme value (may be above or below mean).
Procedure: (i) Find the deviation for each score; (ii) square the deviation; (iii) add up the squared deviations.
Example with Graphics:

Relation to Other Stats (where it came from and what does it lead to):
Background: What is Required to Render This Stat? (Type of Data/Number of Groups): mean, a score
Background: What are equivalent stats for different data types/number of groups? *


· Sum of Squared Deviations (SS)

Discussion: Sum of squared deviation describes the variability of the group. If a group's scores are all close to the mean, the SS is small. If the scores are far from the mean, the SS is large.
Math Formula: \(SS = \sum (X - \bar{X})^2\)

Range of Possibilities: There is one sum of sq dev for each group
Meaning of Benchmarks:
Examples with Graphics:

Relation to Other Stats (where it came from and what does it lead to):
Background: What is Required to Render This Stat? (Type of Data/Number of Groups):
Background: What are equivalent stats for different data types/number of groups? *
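
Worked example (a sketch): the chain from deviation to squared deviation to SS, using the six scores from the Sum of Scores entry (2,3,2,1,4,3). Variable names are chosen for this illustration only.

```python
# Scores from the Sum of Scores entry; their sum is 15.
scores = [2, 3, 2, 1, 4, 3]
n = len(scores)
mean = sum(scores) / n  # 15 / 6 = 2.5

# (i) deviation of each score: score minus mean
deviations = [x - mean for x in scores]

# (ii) square each deviation
squared_deviations = [d ** 2 for d in deviations]

# (iii) add up the squared deviations to get SS
ss = sum(squared_deviations)

print(deviations)          # [-0.5, 0.5, -0.5, -1.5, 1.5, 0.5]
print(sum(deviations))     # 0.0 (deviations from the mean cancel out)
print(squared_deviations)  # [0.25, 0.25, 0.25, 2.25, 2.25, 0.25]
print(ss)                  # 5.5
```

The plain deviations sum to zero, which is why they are squared before being added up.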


· Variance (S squared)

Definition: Like SS, variance is a measure of how much the scores are spread out: the more spread out the scores, the greater the variance. However, variance is SS divided by df to take into account how many scores are contributing.
Discussion: Consider the difference between the lump sum paid out by a company in salaries and the amount received in per capita earnings. If the workers receive $20 million, it is still difficult to interpret that number if you do not know among how many workers it is distributed (and of course how evenly it is distributed). In the same way, dividing SS by df gives a measure of spread that can be compared across groups of different sizes.

Math Formula: \(S^2 = \frac{SS}{df} = \frac{\sum (X - \bar{X})^2}{n - 1}\)

Range of Possibilities: Min = 0, if all points are equal to the mean. The max. variance possible only makes sense if you have a finite scale, like percentages where there could then be a predetermined maximum distance from the mean.
Meaning of Benchmarks:
Examples with Graphics:

Relation to Other Stats (where it came from and what does it lead to):
Background: What is Required to Render This Stat? (Type of Data/Number of Groups):
Background: What are equivalent stats for different data types/number of groups? *


· Standard Deviation

Definition: The standard deviation is the square root of the variance. This is a statistic that indicates the spread of the scores from the mean. If the same class takes two tests, and on the first test the lowest and highest scores were 24 and 94 while on the second test the lowest and highest scores were 88 and 94, the standard deviation (SD) for the first test would be larger than for the second test.
Math Formula: \(S = \sqrt{S^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}\)

Range of Possibilities: Min = 0, if all points are equal. The max. is roughly the distance from the mean to the furthest score from the mean. A max. possible value only makes sense if you have a finite scale, like percentages.
Meaning of Benchmarks:
Examples with Graphics:
Relation to Other Stats (where it came from and what does it lead to):
Background: What is Required to Render This Stat? (Type of Data/Number of Groups):
Background: What are equivalent stats for different data types/number of groups? *
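
Worked example (a sketch): variance and standard deviation for the six scores 2,3,2,1,4,3 from the Sum of Scores entry, using df = n-1 as described under Degrees of freedom for variance. Variable names are chosen for this illustration only.

```python
import math

scores = [2, 3, 2, 1, 4, 3]
n = len(scores)
mean = sum(scores) / n                     # 2.5
ss = sum((x - mean) ** 2 for x in scores)  # sum of squared deviations = 5.5

df = n - 1                 # degrees of freedom for the sample variance
variance = ss / df         # 5.5 / 5
sd = math.sqrt(variance)   # standard deviation = square root of variance

print(variance)      # 1.1
print(round(sd, 3))  # 1.049
```

The standard deviation is back in the original units of the scores, which is why it is usually easier to interpret than the variance.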


· Cross Product

Definition: Represents deviation of a point (X,Y) from the point that is (Xbar,Ybar). The deviation on X is (X-Xbar). The deviation on Y is (Y-Ybar). The cross product is the product of these two deviations.
Math Formula: \((X - \bar{X})(Y - \bar{Y})\)

Range of Possibilities:
Meaning of Benchmarks:
Examples with Graphics:


Relation to Other Stats (where it came from and what does it lead to): Similar to squared deviation, which is (X-Xbar)(X-Xbar)
Background: What is Required to Render This Stat? (Type of Data/Number of Groups):
Background: What are equivalent stats for different data types/number of groups? *


· Sum of Cross Product

Definition: Sum of cross products for all the cases.
Discussion: For a given set of X scores and Y scores, the Sum of Cross Products...
(1) is greatest when the highest X is paired with the highest Y, the second highest X with the second highest Y, and so on.
(2) is closest to zero when there is no relationship between the value of X and the value of Y.
(3) has its 'largest' negative value when the highest X is paired with the lowest Y, the second highest X with the second lowest Y, and so on.

Things that affect the sum of cross products:
(1) linear relationship between X and Y scores
(2) scale of measurement of the variables (X and Y)
(3) number of cases (data points)

The problem with interpreting Sum of Cross Products is that it depends on how many cases you have and the scale of measurement. If you have more cases, you will have a larger Sum of Cross Products.

Math Formula: \(\sum (X - \bar{X})(Y - \bar{Y})\)

Range of Possibilities:
Meaning of Benchmarks:
Examples with Graphics:


Relation to Other Stats (where it came from and what does it lead to):
Background: What is Required to Render This Stat? (Type of Data/Number of Groups):
Background: What are equivalent stats for different data types/number of groups? *
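
Worked example (a sketch): how the pairing of X and Y scores changes the Sum of Cross Products. The data and the function name `sum_of_cross_products` are made up for this illustration; the same five Y scores are simply paired with the X scores in different orders.

```python
def sum_of_cross_products(xs, ys):
    """Sum of (X - mean of X) * (Y - mean of Y) over all cases."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]

ys_same_order = [10, 20, 30, 40, 50]  # highest X with highest Y, and so on
ys_reversed = [50, 40, 30, 20, 10]    # highest X with lowest Y, and so on
ys_unrelated = [40, 30, 20, 10, 50]   # a pairing with no linear relationship

sp_same = sum_of_cross_products(xs, ys_same_order)
sp_reversed = sum_of_cross_products(xs, ys_reversed)
sp_unrelated = sum_of_cross_products(xs, ys_unrelated)

print(sp_same)       # 100.0
print(sp_reversed)   # -100.0
print(sp_unrelated)  # 0.0

# Doubling the number of cases doubles the Sum of Cross Products,
# even though the relationship between X and Y has not changed.
sp_doubled = sum_of_cross_products(xs * 2, ys_same_order * 2)
print(sp_doubled)    # 200.0
```

The last line shows the interpretation problem described above: the value grows with the number of cases, so it cannot be compared across studies of different sizes.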


· Covariance

Definitions (formal/informal): Covariance is the average cross-product. Covariance is an indicator of the degree of linear relationship between two variables. Covariance is the sum of the cross products divided by the degrees of freedom (n-1).

Discussion: Things that affect covariance: linear relationship between X and Y scores, scale of measurement of the variables (X, Y).

Math Formula: \(S_{xy} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n - 1}\)

Range of Possibilities:
Meaning of Benchmarks:
Examples with Graphics:

The problem with interpreting covariance is that it depends on the scale of measurement of the variables. If you measure two variables in seconds, you will have a larger covariance than if you were to measure those same variables in minutes.


Relation to Other Stats (where it came from and what does it lead to):
Background: What is Required to Render This Stat? (Type of Data/Number of Groups):
Background: What are equivalent stats for different data types/number of groups? *
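
Worked example (a sketch): the same two variables measured in seconds and in minutes give very different covariances. The timing data and the function name `covariance` are made up for this illustration.

```python
def covariance(xs, ys):
    """Sum of cross products divided by the degrees of freedom (n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sp / (n - 1)

# Hypothetical times for five cases, recorded in seconds.
seconds_x = [60, 120, 180, 240, 300]
seconds_y = [120, 240, 300, 240, 300]

# The same measurements expressed in minutes.
minutes_x = [s / 60 for s in seconds_x]  # [1.0, 2.0, 3.0, 4.0, 5.0]
minutes_y = [s / 60 for s in seconds_y]  # [2.0, 4.0, 5.0, 4.0, 5.0]

cov_minutes = covariance(minutes_x, minutes_y)
cov_seconds = covariance(seconds_x, seconds_y)

print(cov_minutes)  # 1.5
print(cov_seconds)  # 5400.0
```

The covariance in seconds is 60 x 60 = 3600 times larger, although the relationship between the two variables is identical.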


· r (The Pearson Product-Moment Correlation Coefficient)

Definition: A correlation is the measure of linear relationship between two variables. The correlation coefficient is the covariance (Sxy) divided by the product of the standard deviations (Sx Sy).
Discussion: Denominator of Correlation: Product of Standard Deviations. The product of the standard deviations is an indicator of the scales used to measure X and Y. The product of the standard deviations is the maximum possible size of the covariance. When the data are perfectly linear, the covariance equals the product of the standard deviations (with a negative sign if the relationship is negative).

Things that affect correlation: linear relationship between X and Y. Correlation is not directly affected by the number of cases. (Remember, the sum of the cross products was divided by the degrees of freedom to give covariance.) Correlation is not affected by the scales of measurement of X and Y. (Covariance was divided by the product of the standard deviations.)

Math Formula: \(r = \frac{S_{xy}}{S_x S_y}\)

Range of Possibilities: -1 through +1
Meaning of Benchmarks:
Examples with Graphics:

 

Relation to Other Stats (where it came from and what does it lead to):
Background: What is Required to Render This Stat? (Type of Data/Number of Groups):
Background: What are equivalent stats for different data types/number of groups? *
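
Worked example (a sketch): r computed as covariance divided by the product of the standard deviations. The data and the function name `pearson_r` are made up for this illustration. The example also checks that rescaling X or repeating every case leaves r unchanged.

```python
import math

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    cov = sp / (n - 1)
    sd_x = math.sqrt(ss_x / (n - 1))
    sd_y = math.sqrt(ss_y / (n - 1))
    return cov / (sd_x * sd_y)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_original = pearson_r(xs, ys)
r_rescaled = pearson_r([x * 60 for x in xs], ys)  # X on a different scale
r_doubled = pearson_r(xs * 2, ys * 2)             # every case entered twice
r_perfect = pearson_r(xs, [3, 5, 7, 9, 11])       # Y = 2X + 1, perfectly linear

print(round(r_original, 4))  # 0.7746
print(round(r_rescaled, 4))  # 0.7746
print(round(r_doubled, 4))   # 0.7746
print(round(r_perfect, 4))   # 1.0
```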


· r^2 (Correlation Coefficient Squared)

Definitions (formal/informal): Proportion of total variance accounted for by the model. It's a value that ranges from 0 to 1, and it represents the proportion of the variance in the dependent variable of a regression model that is accounted for by the independent variable.
Math Formula: \(r^2 = \left(\frac{S_{xy}}{S_x S_y}\right)^2\)
Range of Possibilities: 0-1
Meaning of Benchmarks: If r^2 = .23, it means that only 23% of the variance in the dependent variable could be explained/predicted by the independent variable.
Examples with Graphics:

Relation to Other Stats (where it came from and what does it lead to):
Background: What is Required to Render This Stat? (Type of Data/Number of Groups):
Background: What are equivalent stats for different data types/number of groups? *
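
Worked example (a sketch): for a single independent variable, r squared equals the share of the variability in Y that a straight-line prediction from X accounts for. The data are made up for this illustration, and the least-squares line is an assumption added here to make the "proportion accounted for" idea concrete; it is not described in this glossary.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))  # sum of cross products
ss_x = sum((x - mean_x) ** 2 for x in xs)
ss_y = sum((y - mean_y) ** 2 for y in ys)

# r squared written with sums of squares (the n - 1 terms cancel out).
r_squared = sp ** 2 / (ss_x * ss_y)

# Least-squares line predicting Y from X.
slope = sp / ss_x
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * x for x in xs]

# Variability in Y that the line does NOT account for.
ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))
proportion_explained = 1 - ss_residual / ss_y

print(round(r_squared, 4))             # 0.6
print(round(proportion_explained, 4))  # 0.6
```

Here 60% of the variability in Y is accounted for by X, and the remaining 40% is left unexplained.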


· Product of Standard Deviations

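Definition: The standard deviation of X multiplied by the standard deviation of Y. It is the denominator of the correlation coefficient.
Math Formula: \(S_x S_y\)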
Range of Possibilities:
Meaning of Benchmarks:
Examples with Graphics:
Relation to Other Stats (where it came from and what does it lead to):
Background: What is Required to Render This Stat? (Type of Data/Number of Groups):
Background: What are equivalent stats for different data types/number of groups? *


· Standard Error of Mean

Definition: A statistic that indicates how greatly the mean score of a single sample tends to differ from the mean score of the population.
Math Formula: \(S_{\bar{X}} = \frac{S}{\sqrt{n}}\)

Range of Possibilities:
Meaning of Benchmarks:
Examples with Graphics:
Relation to Other Stats (where it came from and what does it lead to):

Background: What is Required to Render This Stat? (Type of Data/Number of Groups):
Background: What are equivalent stats for different data types/number of groups? *
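
Worked example (a sketch): the estimated standard error of the mean, computed as the sample standard deviation divided by the square root of n. The scores are the six from the Sum of Scores entry; variable names are chosen for this illustration only.

```python
import math

scores = [2, 3, 2, 1, 4, 3]
n = len(scores)
mean = sum(scores) / n

# Sample standard deviation, using df = n - 1.
sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))

# Standard error of the mean: SD divided by the square root of n.
standard_error = sd / math.sqrt(n)

print(round(sd, 3))              # 1.049
print(round(standard_error, 3))  # 0.428
```

Because n is in the denominator, larger samples give a smaller standard error: their means tend to sit closer to the population mean.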


· Confidence Intervals (interval estimate)

Definitions (formal/informal): A range of values within which it is estimated the population parameter will fall.
An interval calculated around a sample estimator within which we are confident, at some level, that the parameter lies. The upper and lower limits of the interval are called the confidence limits.
Math Formula:
Range of Possibilities:
Meaning of Benchmarks:
Examples with Graphics:
Relation to Other Stats (where it came from and what does it lead to):
Background: What is Required to Render This Stat? (Type of Data/Number of Groups):
Background: What are equivalent stats for different data types/number of groups? *


· t-test

Definition: A test of hypothesis about the mean of a single population or about the difference between the means of two populations.
Discussion: The t-test helps you determine whether you should reject the null hypothesis (H0) or not.
Math Formula:

Range of Possibilities: H0: \(\mu_1 - \mu_2 \le 0\) and H1: \(\mu_1 - \mu_2 > 0\)

Meaning of Benchmarks:

Examples with Graphics:

Relation to Other Stats (where it came from and what does it lead to): Calculate the means and st. dev. of each group and the correlation between the two sets of scores, then calculate the estimated standard error of the difference between the means (Sdiff) and then calculate the t. Once you have the t calculated you can determine the p-value. You would use the df and the desired p value to see if you can reject the null hypothesis.

Background: What is Required to Render This Stat? (Type of Data/Number of Groups):

Background: What are equivalent stats for different data types/number of groups? *
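
Worked example (a sketch): the steps listed under "Relation to Other Stats" for two related sets of scores. The data, the helper names, and the specific formula for Sdiff are assumptions made for this illustration: Sdiff is taken as the square root of (SE1 squared + SE2 squared - 2 x r x SE1 x SE2), a common textbook form for related samples.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sd(values):
    """Sample standard deviation with df = n - 1."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (sd(xs) * sd(ys))

# Hypothetical scores for the same five subjects under two conditions.
group_1 = [12, 14, 10, 17, 12]
group_2 = [10, 12, 9, 14, 11]
n = len(group_1)

# Step 1: means, standard deviations, and correlation between the two sets.
r = correlation(group_1, group_2)
se_1 = sd(group_1) / math.sqrt(n)  # standard error of mean 1
se_2 = sd(group_2) / math.sqrt(n)  # standard error of mean 2

# Step 2: estimated standard error of the difference between the means.
s_diff = math.sqrt(se_1 ** 2 + se_2 ** 2 - 2 * r * se_1 * se_2)

# Step 3: t and its degrees of freedom (number of pairs minus one).
t = (mean(group_1) - mean(group_2)) / s_diff
df = n - 1

print(round(s_diff, 3))  # 0.374
print(round(t, 3))       # 4.811
print(df)                # 4
```

The resulting t would then be compared with the critical value for df = 4 in a t table to decide whether to reject H0.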


· ANOVA (Analysis of Variance)

Definitions (formal/informal): A family of tests for comparing two or more means in studies with one or more independent variables.
A technique for statistically analyzing the data from a completely randomized design. This technique uses the F-test, and tests to determine if there is a significant difference in 2 or more independent groups.
Relation to Other Stats (where it came from and what does it lead to):
Background: What is Required to Render This Stat? (Type of Data/Number of Groups):
Background: What are equivalent stats for different data types/number of groups? *


Source: Sharp, Vicky F. (1979) "Statistics for the Social Sciences."

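Level of Measurement | Num of Groups | Nature of Groups / Num of Categories | Test
---------------------+---------------+--------------------------------------+-------------------------
Interval             | 1             |                                      | t
Interval             | 2             | Independent                          | t
Interval             | 2             | Related                              | t
Interval             | More than 2   | Independent                          | 1 way ANOVA
Interval             | More than 2   | Related                              | Randomized Blocks Design
Ordinal              | 1             |                                      | Kolmogorov-Smirnov
Ordinal              | 2             | Independent                          | Mann-Whitney U
Ordinal              | 2             | Related (signs)                      | Sign
Ordinal              | 2             | Related (nums)                       | Wilcoxon
Ordinal              | More than 2   | Independent                          | Kruskal-Wallis
Ordinal              | More than 2   | Related                              | Friedman
Nominal              | 1             | 1 category                           | Chi-Square
Nominal              | 1             | 2 categories (size: 4 or less)       | Binomial
Nominal              | 1             | 2 categories (size: 5 or more)       | Chi-Square
Nominal              | 1             | 3 or more categories                 | Chi-Square
Nominal              | 2             | Independent                          | Chi-Square
Nominal              | 2             | Related                              | McNemar
Nominal              | More than 2   | Independent                          | Chi-Square
Nominal              | More than 2   | Related                              | Cochran Q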


Additional Images:



Which of the following scatter plots (A-G) have a correlation coefficient of about 0.70?



Updated: Aug 2000