There may never be a way to comprehensively understand how student learning was altered by COVID-19. Everyone’s routines were deeply disrupted, and most of us were just doing the absolute best we could under the overwhelming circumstances. As we work together to recover from this difficult time, it’s important to think carefully about how we evaluate student learning and move forward as educators.
The role of assessment
Since long before the COVID-19 pandemic, standardized math and reading assessments have been routinely used to track student progress across grades and understand the effectiveness of educational interventions and policies. These assessments include end-of-year state summative tests, which were cancelled for the 2019–20 school year, and interim assessments like MAP® Growth™. However, the scales used in educational assessments differ, and the same score on two different tests may suggest very different things about what a student knows and is ready to learn next. This can make it challenging for educators, families, and policymakers to understand the practical significance of test score changes over time.
This difficulty has been on clear display during COVID-19, as researchers moved quickly to quantify how disruptions to traditional in-person schooling have affected student learning trajectories. They have reported unfinished learning in a wide variety of metrics, including:
- Percentage of students hitting or missing a proficiency benchmark
- Standard deviation units
- Percentile rank
- Percentage of “typical” gains made
- Months or years of learning
Of these options, the months or years of learning metric has the most obvious appeal, as it has an intuitive meaning to educators, families, and policymakers. However, it also has serious limitations. These have led experts Matthew Baird and John Pane to recommend that researchers “avoid this translation in all cases,” while people consuming research should “look with skepticism toward research results translated into units of time.” In this blog post, I will discuss the pros and cons of the months or years of learning translation and explain why NWEA has chosen to avoid this approach in our research on student academic achievement and growth during the pandemic.
What it means to measure months or years of learning
Months or years of learning are calculated in two steps. First, estimate the ratio of some observed effect to a measure of typical growth on the same scale. Then rescale that ratio into units of time, typically by multiplying it by the length of a typical school year (9.5 months or 180 days). For example, suppose third-grade students were observed to have gained seven points on the MAP Growth Reading RIT scale across the entire 2020–21 school year, while the NWEA 2020 norms show typical third-grade gains of 14 RIT points in reading. The months of learning translation would be calculated as 7/14 × 9.5, which equals 4.75 months of learning for third-grade students during the last school year.
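The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in this post; the function name and parameter names are my own and do not come from any NWEA tool.

```python
def months_of_learning(observed_gain, typical_gain, school_year_length=9.5):
    """Translate an observed gain into units of time.

    observed_gain and typical_gain must be on the same score scale.
    school_year_length is 9.5 (months) by default; pass 180 for days.
    """
    if typical_gain == 0:
        # The ratio is undefined when the benchmark shows no typical growth.
        raise ValueError("typical_gain must be nonzero")
    return (observed_gain / typical_gain) * school_year_length


# Third-grade reading example from the text: 7 RIT points observed, 14 typical.
print(months_of_learning(7, 14))       # 4.75 (months)
print(months_of_learning(7, 14, 180))  # 90.0 (days)
```

Note that the result depends entirely on the value supplied for typical_gain, a point that matters for the disadvantages discussed later.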
There are a couple of advantages to the months or years of learning translation. First, it is straightforward to calculate and report. Second, it has an intuitive appeal to non-researchers that many other options, like standard deviation units and percentile rank, lack. In a recent study, teachers noted that months of progress was one of the most informative metrics for reporting the effectiveness of an intervention. Some important users of research clearly prefer it as well.
However, there are serious disadvantages that limit the usefulness of this metric. First, as Baird and Pane outline in “Translating standardized effects of education programs into more interpretable metrics,” results are highly sensitive to the choice of the benchmark used for “typical” learning, so researchers could cherry-pick a benchmark to inflate or deflate the perceived effect of an intervention (or COVID-19) on student learning. Second, this metric relies on the faulty assumption that learning rates are linear within and across grades. Our research shows that this assumption does not hold for academic achievement tests: growth is typically faster in earlier grades than in later grades and can differ from fall to winter to spring within a grade. Or, to put it another way, not all months and not all years are equal. Third, the months or years of learning metric is not bounded within a reasonable set of values, especially when typical growth rates are low (as is often the case in the upper grades). Using MAP Growth data for eleventh-graders, Baird and Pane found their years of learning translation produced results between +37 and -276 years of learning, which does not pass the sniff test or provide useful, meaningful context. Finally, use of the months or years of learning metric has led teachers to greatly overestimate the effectiveness of an intervention compared with any other available metric.
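To see why low typical growth makes the metric unstable, consider a rough sketch in Python. All of the numbers below are hypothetical values I chose for illustration; they are not NWEA norms or results from Baird and Pane. The same 2-point shortfall is divided by progressively smaller benchmarks for typical annual growth.

```python
def years_of_learning(effect, typical_annual_gain):
    """Express an effect (in scale points) as a fraction of a typical year's gain."""
    return effect / typical_annual_gain


# Hypothetical: students score 2 points lower than expected.
effect = -2.0

# Hypothetical benchmarks for typical annual growth, from large to very small.
for typical_gain in (10.0, 2.0, 0.5, 0.25):
    years = years_of_learning(effect, typical_gain)
    print(f"typical gain {typical_gain:>5} points -> {years:+.1f} years of learning")
```

In this sketch, an identical 2-point difference translates to anywhere from -0.2 to -8.0 years depending only on the benchmark. As the benchmark approaches zero, the translated value grows without limit, which is how implausible figures can arise in grades where typical growth is small.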
Using more reliable metrics for understanding learning during COVID-19
Although the months or years of learning translation is intuitive and well liked by educators, we concur with Baird and Pane’s conclusion that it should be avoided whenever possible. While there is no perfect metric that meets the needs of all users, we have primarily chosen to report test scores in their original RIT metric as well as change in percentile rank relative to a pre-pandemic cohort of students. Although it may take a little more explanation and context to understand these metrics, we believe that effort is worth it to avoid introducing inaccuracies or bias into our research. In addition, as we shift into a period of recovery from the pandemic, describing students’ learning in terms of time may incentivize educators to speed through curriculum to “catch up,” a practice that can be particularly damaging in math.
For more information on the translation of educational effect sizes, see the following articles:
- “Translating standardized effects of education programs into more interpretable metrics”
- “Interpreting effect sizes of education interventions”
- “How should educational effects be communicated to teachers?”
To learn more about how students fared academically on MAP Growth assessments during COVID-19, read our research brief and join us for our webinar, “Research meets practice: How educators are responding to the latest COVID-19 impact results.”
We encourage you to use local data to understand where your students are this fall and to determine whether what we observed nationally and across a number of individual states reflects what you observe with your students. Our guide “Kick-start fall planning: 4 principles for school leaders” will also walk you through specific strategies that can help you be successful in the new school year.