There are many different ways to use effect sizes, but here we focus on progress, not on comparisons between classes, teaching methods, and so on.
Imagine a class of students has been administered the same (or a similar) test relating to the curriculum in February and June. We can use the data from these two tests to calculate an effect size. This effect size helps us to understand the impact of our teaching over this period.
The easiest way to calculate an effect size is to use Excel. Here is the formula:
Effect size = (Average (time 2) - Average (time 1)) / standard deviation

where the standard deviation is the spread of the class scores (in the example below, the average of the standard deviations at the two time points).
Consider this example:

|    | A                          | B                   | C                    |
|----|----------------------------|---------------------|----------------------|
| 1  | Student                    | Time 1              | Time 2               |
| 2  | David                      | 40                  | 35                   |
| 3  | Anne                       | 25                  | 30                   |
| 4  | Molly                      | 45                  | 50                   |
| 5  | Barry                      | 30                  | 40                   |
| 6  | Collin                     | 35                  | 45                   |
| 7  | Brad                       | 60                  | 70                   |
| 8  | Juliet                     | 65                  | 75                   |
| 9  | John                       | 70                  | 80                   |
| 10 | Fred                       | 50                  | 75                   |
| 11 | Brooke                     | 55                  | 85                   |
| 12 |                            |                     |                      |
| 13 | Average:                   | 48 =AVERAGE(B2:B11) | 59 =AVERAGE(C2:C11)  |
| 14 | Standard deviation         | 15 =STDEV(B2:B11)   | 21 =STDEV(C2:C11)    |
| 15 | Average standard deviation |                     | 18 =AVERAGE(B14:C14) |
| 16 | Effect size                |                     | 0.62 =(C13-B13)/C15  |

So to recap, the effect size was calculated as:
Effect size = (58.5 - 47.5) / 17.85 ≈ 0.62
(the spreadsheet displays these values rounded to 59, 48, and 18).
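Outside Excel, the same calculation takes only a few lines. Here is a minimal Python sketch of the spreadsheet above (Python's statistics.stdev matches Excel's STDEV, the sample standard deviation):

```python
from statistics import mean, stdev  # stdev is the sample SD, like Excel's STDEV

time1 = [40, 25, 45, 30, 35, 60, 65, 70, 50, 55]
time2 = [35, 30, 50, 40, 45, 70, 75, 80, 75, 85]

# Denominator: the average of the two standard deviations (cell C15 above)
avg_sd = (stdev(time1) + stdev(time2)) / 2

# Effect size = (average at time 2 - average at time 1) / average SD
effect_size = (mean(time2) - mean(time1)) / avg_sd

print(round(effect_size, 2))  # 0.62
```

Note that STDEV (and statistics.stdev) divide by n - 1; with whole-class rather than sampled data, some prefer the population standard deviation (Excel's STDEVP, Python's pstdev), which would give a slightly larger effect size here.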
Interpretation of effect sizes
So now we have the first piece of important information – the class average effect size is 0.62. How should we interpret it? To come up with an independent measure of what expected progress should be, we have used two main considerations.
a. When we look at many major longitudinal databases (PIRLS, PISA, TIMSS, NAEP, NAPLAN)[1] they all lead to a similar estimate of an effect size of 0.40 for a year’s input of schooling. For example, using NAPLAN reading, writing and math data (Australia’s national assessments) for students moving from one year to the next, the average effect size across all students is 0.40.
b. The average effect across 800+ meta-analyses, based on 240 million students, is also 0.40.
Therefore an effect size greater than 0.40 is above the norm and indicates greater than expected growth over a year.
Within a year, expected progress is 0.40. Even when calculating an effect size over, say, five months, the 0.40 benchmark still applies, primarily because teachers often adjust the difficulty of a test to take into account the elapsed time, and because teachers more often create assessments on specific topics within a year’s curriculum. So within a year the aim is greater than 0.40; over two years, 0.80; over three years, 1.2; and so on.
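The year-by-year benchmarks above can be captured in a small helper. This is purely an illustrative sketch: only the 0.40-per-year figure comes from the text, and the function names are ours.

```python
# Benchmark from the text: an effect size of 0.40 represents a typical
# year's growth (0.80 over two years, 1.2 over three, and so on).
YEARLY_BENCHMARK = 0.40

def expected_effect_size(years: float) -> float:
    """Expected cumulative effect size for the given number of years."""
    return YEARLY_BENCHMARK * years

def above_expected(effect_size: float, years: float = 1.0) -> bool:
    """True if observed growth exceeds the benchmark for the period."""
    return effect_size > expected_effect_size(years)

print(above_expected(0.62))           # True: above a typical year's growth
print(above_expected(0.62, years=2))  # False: below two years' expected growth
```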
Individual effect sizes
We can also calculate effect sizes for individual students. When we do this, we assume each student contributes similarly to the overall variance, then use the pooled spread (standard deviation) as an estimator for each student. Here is the formula:
Effect size = (Individual score (post-test) - Individual score (pre-test)) / standard deviation for the whole class
Let’s go back to our example. Remember the average spread (standard deviation) for the class was 18. The effect size for David is (35 - 40)/18 = -0.28, for Anne (30 - 25)/18 = 0.28, and so on.
| Student | Time 1 | Time 2 | Effect size |
|---------|--------|--------|-------------|
| David   | 40     | 35     | -0.28       |
| Anne    | 25     | 30     | 0.28        |
| Molly   | 45     | 50     | 0.28        |
| Barry   | 30     | 40     | 0.56        |
| Collin  | 35     | 45     | 0.56        |
| Brad    | 60     | 70     | 0.56        |
| Juliet  | 65     | 75     | 0.56        |
| John    | 70     | 80     | 0.56        |
| Fred    | 50     | 75     | 1.40        |
| Brooke  | 55     | 85     | 1.68        |

In the above case there are now some important questions for teachers. Why did Fred and Brooke make such high gains, and why did David, Anne, and Molly make such low gains? The data, obviously, do not ascribe the reasons, but they do provide the best evidence from which to develop these important causal explanations. (Note that in this case it was not necessarily the more struggling students who made the lowest gains and the brightest who made the highest.)
Given the assumption that each student contributes similarly to the spread, the most important issue is the questions these data create: what possible explanations could there be for the students who achieved below 0.40, and for those who achieved above 0.40? This allows the evidence to be used to form the right questions. Only teachers can seek the reasons, look for triangulation of these reasons, and devise strategies for these students.
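The per-student calculation and the below/above-0.40 screen described here can be sketched in Python as follows (scores from the example above; variable and function names are ours):

```python
from statistics import stdev

# (pre, post) scores from the example class
scores = {
    "David": (40, 35), "Anne": (25, 30), "Molly": (45, 50),
    "Barry": (30, 40), "Collin": (35, 45), "Brad": (60, 70),
    "Juliet": (65, 75), "John": (70, 80), "Fred": (50, 75),
    "Brooke": (55, 85),
}
time1 = [pre for pre, _ in scores.values()]
time2 = [post for _, post in scores.values()]

# Pooled spread: the average of the two sample SDs, used for every student
avg_sd = (stdev(time1) + stdev(time2)) / 2

# Per-student effect size: (post - pre) / class spread
individual = {name: (post - pre) / avg_sd for name, (pre, post) in scores.items()}

# Screen against the 0.40 benchmark to surface students worth a closer look
below = [name for name, es in individual.items() if es < 0.40]
above = [name for name, es in individual.items() if es > 0.40]
print("Below 0.40:", below)  # David, Anne, Molly
print("Above 0.40:", above)  # the remaining seven students
```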
There are some things to be aware of:
a. Caution should be used with small sample sizes. The smaller the sample, the more care should be taken to cross-validate the findings. Any sample size less than 30 can be considered “small”.
b. A key is to look for outlier students. In a small sample a few outliers can skew the effect sizes, and they may need special consideration (e.g., why did they grow so much more than the other students, or why did they not grow as much?), and perhaps the effect sizes should be recalculated with these students omitted. The perils of small sample sizes!
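As a sketch of this outlier check, using the example class with the two largest gainers (Fred and Brooke) omitted; the helper name is ours:

```python
from statistics import mean, stdev

def class_effect_size(time1, time2):
    """Gain in class averages divided by the average sample SD."""
    avg_sd = (stdev(time1) + stdev(time2)) / 2
    return (mean(time2) - mean(time1)) / avg_sd

time1 = [40, 25, 45, 30, 35, 60, 65, 70, 50, 55]
time2 = [35, 30, 50, 40, 45, 70, 75, 80, 75, 85]

# Full class vs. the class with Fred and Brooke (the last two) omitted
full = class_effect_size(time1, time2)
trimmed = class_effect_size(time1[:-2], time2[:-2])
print(round(full, 2), round(trimmed, 2))  # 0.62 0.38
```

Dropping the two outliers pulls the class effect size from 0.62 down to about 0.38, below the 0.40 benchmark, which is exactly why outliers in small samples deserve scrutiny.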
[1] Progress in International Reading Literacy Study (PIRLS), Programme for International Student Assessment (PISA), Trends in International Mathematics and Science Study (TIMSS), National Assessment of Educational Progress (NAEP), National Assessment Program - Literacy and Numeracy (NAPLAN)