Measurement provides us with powerful tools that we can use to improve our tests from the perspective of both students and teachers. This article offers tips for using these measurement tools for both items and tests.

Item analysis refers to methods used to review and improve the items in our tests.

Test review refers to techniques we can use to examine the stability and appropriateness of each test as a whole.

4 Tips for Item Analysis

1. Know the item difficulty

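Item difficulty (the p-value) is the proportion of students who chose the keyed (correct) answer, so it ranges from 0 to 1, and higher values mean easier items. Knowing it helps you set appropriate expectations when measuring student knowledge, check whether items and their associated skills suit your class or grade level, and identify gaps in instruction.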
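As a rough illustration, item difficulty can be computed from scored responses in a few lines. This is a minimal sketch: the function name `item_difficulty` and the sample data are hypothetical, and it assumes each response has already been scored as 1 (keyed answer chosen) or 0.

```python
def item_difficulty(scored_responses):
    """Return the proportion of students who answered each item correctly.

    scored_responses: one row per student, one 0/1 score per item.
    """
    n_students = len(scored_responses)
    n_items = len(scored_responses[0])
    return [
        sum(row[i] for row in scored_responses) / n_students
        for i in range(n_items)
    ]


# Hypothetical scored data: 5 students x 3 items.
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
]
p_values = item_difficulty(scores)
print(p_values)  # [0.8, 0.6, 0.2]
```

Here the first item is easy for this group (80% correct) and the third is hard (20% correct), which is the kind of contrast worth comparing against your expectations for the class.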

2. Look for problematic items

Item discrimination, the correlation between answering an item correctly and the total test score, can tell you whether an item is confusing or possibly even miskeyed. Watch for keyed answers with negative discrimination: these items are problematic and should be revised before further use or removed from the test.
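The check described above can be sketched as follows. This is an illustrative example, not a reference implementation: the helper names (`pearson`, `item_discrimination`, `flag_negative_items`) and the data are hypothetical, and it assumes discrimination is the plain correlation between the 0/1 item score and the total score, with the item itself included in the total.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable never varies
    return cov / sqrt(var_x * var_y)


def item_discrimination(scored_responses):
    """Point-biserial of each 0/1 item score with the total test score."""
    totals = [sum(row) for row in scored_responses]
    n_items = len(scored_responses[0])
    return [
        pearson([row[i] for row in scored_responses], totals)
        for i in range(n_items)
    ]


def flag_negative_items(discriminations):
    """Return the indices of items whose discrimination is negative."""
    return [i for i, r in enumerate(discriminations) if r < 0]


# Hypothetical scored data: 6 students x 4 items.
scores = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
discriminations = item_discrimination(scores)
flagged = flag_negative_items(discriminations)
print(flagged)  # [3]
```

In this made-up data the last item is answered correctly mostly by low scorers, so it is flagged. That pattern is what a miskeyed or confusing item tends to look like.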

3. Make sure the distractors are plausible

Look at the distractors (incorrect choices) for each item to see whether students are selecting them at similar rates. A distractor that is rarely or never chosen is not a plausible option and should be reviewed.
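A distractor review along these lines might look like the sketch below. The function names and sample responses are hypothetical, and the 5% cutoff for "rarely chosen" is an assumption made for illustration, not a guideline from this article.

```python
from collections import Counter


def option_proportions(choices, options):
    """Proportion of students who selected each answer option."""
    counts = Counter(choices)
    n = len(choices)
    return {opt: counts.get(opt, 0) / n for opt in options}


def unused_distractors(choices, options, key, min_proportion=0.05):
    """Return distractors chosen by fewer than min_proportion of students.

    min_proportion is an assumed cutoff; adjust it to your own context.
    """
    props = option_proportions(choices, options)
    return [
        opt for opt in options
        if opt != key and props[opt] < min_proportion
    ]


# Hypothetical responses from 10 students to one item keyed "A".
choices = list("AABACABAAC")
props = option_proportions(choices, "ABCD")
print(props)  # {'A': 0.6, 'B': 0.2, 'C': 0.2, 'D': 0.0}
print(unused_distractors(choices, "ABCD", key="A"))  # ['D']
```

Options B and C attract students at similar rates, while nobody selects D, so D is the option to review.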

4. Integrate feedback from instruction

Item analysis is stronger when feedback from teachers and students supports the item review. Their feedback can identify items that are inappropriate for the current skill level and show whether the test is well balanced in length.

4 Tips for Test Review

1. Build assessment confidence using reliability

Obtain the test’s reliability coefficient. It tells you how consistent your test scores are, and a high value can give teachers confidence that the scores from the test they are using are dependable.
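One common reliability coefficient is Cronbach's alpha, which compares the sum of the item variances with the variance of the total scores. The sketch below assumes 0/1 scored items and uses the standard formula, alpha = k / (k - 1) * (1 - sum of item variances / total score variance). The function name and data are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(scored_responses):
    """Cronbach's alpha for a students-by-items table of scores.

    Raises ZeroDivisionError if every student has the same total score.
    """
    k = len(scored_responses[0])  # number of items
    item_variances = [
        pvariance([row[i] for row in scored_responses]) for i in range(k)
    ]
    total_variance = pvariance([sum(row) for row in scored_responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scored data: 5 students x 4 items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 2))  # 0.8
```

Population variance is used for both the items and the totals. Using sample variance for both would give the same alpha, because the correction factor cancels in the ratio.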

2. Seek to improve reliability

Cronbach’s alpha reliability can be improved using Tips 1–3 discussed in the previous section. Consider an item analysis and find ways to improve the test for the next testing session.
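One way to connect item analysis to reliability is to recompute alpha with each item left out in turn and see which removal helps most. The sketch below illustrates that idea under stated assumptions: the function names and data are hypothetical, and the items are assumed to be scored 0/1.

```python
from statistics import pvariance


def cronbach_alpha(scored_responses):
    """Cronbach's alpha for a students-by-items table of scores."""
    k = len(scored_responses[0])
    item_variances = [
        pvariance([row[i] for row in scored_responses]) for i in range(k)
    ]
    total_variance = pvariance([sum(row) for row in scored_responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(scored_responses):
    """Alpha recomputed with each item removed, one at a time."""
    k = len(scored_responses[0])
    results = []
    for drop in range(k):
        reduced = [
            [score for i, score in enumerate(row) if i != drop]
            for row in scored_responses
        ]
        results.append(cronbach_alpha(reduced))
    return results


# Hypothetical data: the last item is answered correctly mostly by low scorers.
scores = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
full_alpha = cronbach_alpha(scores)
deleted = alpha_if_item_deleted(scores)
best_to_review = deleted.index(max(deleted))
print(best_to_review)  # 3
```

In this made-up data, removing the last item raises alpha substantially, which suggests that item should be revised or replaced before the next testing session.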

3. Check the test’s validity

What evidence supports your interpretations of the test scores? A review of the literature, or verification that your items are linked to the right skills and standards, can tell you how strong the validity evidence for your test is.

4. Support your test plan using validity

The body of evidence behind your test can show that your test plan and content are appropriate for the students you are measuring. Consider revising the test plan if the test review reveals that the content is not appropriate.

These tips provide a solid foundation to improve the quality of your assessment at both the item and test level. Test improvement is an ongoing process that requires feedback from the item development and exam form creation stages of the assessment development cycle.

Understanding Key Psychometric Terms

Use the terms below as a quick reference for some common psychometric concepts, including interpretative guidelines.

P-value
  Definition: Total responses for option divided by total sample size
  Range: 0 to 1

Item difficulty (p value)
  Definition: Proportion of sample that chose keyed (correct) answer
  Range: 0 to 1

Distractor p-value
  Definition: Proportion that endorsed the non-keyed option(s)
  Range: 0 to 1
  Interpretative Guidelines: Should see similar values for each option

Correlation
  Definition: Statistic that indexes the relationship between two variables
  Range: -1 to 1
  Interpretative Guidelines: Positive correlation: as one variable increases, the other variable tends to increase also

Item discrimination (point-biserial)
  Definition: Correlation between keyed response and total score
  Range: -1 to 1
  Interpretative Guidelines:
    Below .10: Review item
    .10 to .19: Low
    .20 to .29: Good
    .30+: Very Good

Distractor point-biserial
  Definition: Correlation between non-keyed option and total score
  Range: -1 to 1
  Interpretative Guidelines: Want values less than .10

Cronbach’s alpha reliability coefficient
  Definition: Statistic that indexes the consistency of test items (indicates whether items are measuring the same general skill)
  Range: -1 to 1
  Interpretative Guidelines:
    .70 to .79: Acceptable
    .80 to .89: Good
    .90+: High

Validity
  Definition: Theory that supports our interpretation of test scores
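The interpretative bands listed above for item discrimination and Cronbach's alpha can be turned into simple lookup helpers. This is a minimal sketch with hypothetical function names. The label returned for alpha values under .70 is an assumption, since the guidelines do not name that range.

```python
def describe_discrimination(point_biserial):
    """Map an item's point-biserial to the guideline label."""
    if point_biserial < 0.10:
        return "Review item"
    if point_biserial < 0.20:
        return "Low"
    if point_biserial < 0.30:
        return "Good"
    return "Very Good"


def describe_reliability(alpha):
    """Map Cronbach's alpha to the guideline label."""
    if alpha >= 0.90:
        return "High"
    if alpha >= 0.80:
        return "Good"
    if alpha >= 0.70:
        return "Acceptable"
    # Assumption: the guidelines give no label for values under .70.
    return "Below listed ranges"


print(describe_discrimination(0.25))   # Good
print(describe_discrimination(-0.05))  # Review item
print(describe_reliability(0.84))      # Good
```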