On a recent vacation, my family and I visited a sea turtle hospital. In the education center, there were binders of letters from children who had visited. In one letter, a child named Colton thanked the staff for showing them around and said that he especially liked the baby turtles. What stood out about Colton’s letter was his closing statement: “Did you know that three percent of Antarctica is penguin urine? This has nothing to do with turtles. I just felt like sharing.” In addition to cracking up my family, Colton’s stat about penguins’ contributions to Antarctica got me wondering about its veracity. A quick internet search showed that the claim is challenging to verify, which got me thinking about how we teach kids about data and data literacy.
What is data literacy?
As with so many things, settling on a single definition can be challenging. “Data literacy,” “statistical literacy,” and “information literacy” are sometimes conflated or used interchangeably. For our discussion here, I like researchers Kristin Fontichiaro and Melissa Johnston’s conceptualization of data literacy as involving “numeracy, quantitative literacy, and mathematical and statistical calculations, as well as problem-solving, communication, and decision-making.”
They also reference the definition of “data literacy” from researchers Ellen Mandinach and Edith Gummer, who add that it includes the ability to “understand and use data effectively to inform decisions,” “identify, collect, organize, analyze, summarize, and prioritize data,” and “develop hypotheses, identify problems, interpret the data, and determine, plan, implement, and monitor courses of action.” While this may seem lengthy, it also seems appropriate as data literacy incorporates many skills.
Why is data literacy important?
We interact with data on many fronts and in many forms, sometimes without realizing it. Decisions about what clothes are appropriate for the weather and about when to make big-ticket purchases are both informed by data, for example.
As the volume of data we interact with increases, the ability to question the source, validity, and claims made from that data—as well as how to interpret the data ourselves—becomes increasingly important. In their book The Basics of Data Literacy: Helping Your Students (and You!) Make Sense of Data, authors Michael Bowen and Anthony Bartley sum it up this way: “Data literacy is important for your students even if they aren’t going to be scientists because data are used to argue and persuade people to, among other things, vote for political agendas, support specific types of spending within organizations, sell life insurance, or lease a car. An improved understanding of data practices means that better questions can be asked in all of these situations.”
Additionally, being data literate is becoming increasingly critical in many careers. Small and large business owners need to analyze market and labor trends, as well as projected supply and shipping costs. Teachers need to be able to read and interpret educational research and classroom data from a variety of assessment types and adjust practice and instruction based on this data. In fact, the Bureau of Labor Statistics projects that from 2021 to 2031, data-related occupations are likely to grow significantly faster than the five percent average growth for other occupations over the same period, with roles like data scientists and statisticians predicted to show growth exceeding 30 percent. The 2025 Future of Jobs Report published by the World Economic Forum reported that analytical thinking, a key component of data literacy, is the number one core skill employers look for, with 7 out of 10 companies rating it as an essential skill.
How does data literacy show up in the standards?
Many college-and-career ready standards have expectations related to data literacy, and the need to be data literate is expressed across disciplines.
In April of 2024, five major educational organizations (the National Council of Teachers of Mathematics, National Science Teachers Association, American Statistical Association, National Council for the Social Studies, and Computer Science Teachers Association) issued a joint position statement on the importance of teaching data science. They state that “All subjects in school should recognize the contribution of data to their discipline and take curricular approaches that integrate data with disciplinary lessons where appropriate.”
Additionally, as of last year, 27 states were exploring some type of official stance on increasing data science education. New Jersey’s 2023 math standards broke apart the Measurement and Data domain, expanding data content under the new Data Literacy domain. The University of Michigan School of Information and University Library has compiled a list of “K–12 academic standards related to data literacy” for some of the major standards sets. Additionally, pages XIV and XV of The Basics of Data Literacy: Helping Your Students (and You!) Make Sense of Data have a clear and comprehensive overview of the progressions for data literacy in NCTM Principles and Standards for School Mathematics, the Common Core State Standards (CCSS) for Mathematics, and the Next Generation Science Standards (NGSS).
Let’s take a look at how data literacy standards show up in different content areas.
Math
Both the NCTM Principles and Standards for School Mathematics and the Common Core State Standards for Mathematics infuse data literacy throughout the grades. NCTM spells out four big ideas related to data and statistics, and skills related to data literacy are also components of CCSS’s Standards for Mathematical Practice.
At a high level, in the early grades, both standards start with basic data organization and representation and reading of data in tables, line plots, and bar/picture graphs. In middle school, students are introduced to probability concepts and measures of center and variability. Additionally, they explore the general shape of a set of data with more sophisticated representations, such as histograms, box plots, scatter plots, and linear models. In high school, students summarize, represent, and interpret data on single and multiple variables and analyze patterns and deviations. They use statistical tools to describe variability and make predictions, while also understanding the importance of random sampling and random assignment in drawing valid conclusions. Probability models are introduced to describe random processes, and students compute probabilities. The standards also highlight the role of technology in generating plots, regression functions, and simulations.
Science
The NGSS explicitly include analyzing and interpreting data as one of their eight scientific and engineering practices, although several others, such as engaging in argument from evidence, are also related to data literacy. In terms of the content standards, elementary and middle school expectations are similar to those for math. In high school, students summarize and systematically analyze data to identify patterns or test hypotheses, revising models when data conflicts with expectations. They utilize various tools and data representations to organize and display data, evaluate conclusions, and explore relationships between variables. They distinguish between causal and correlational relationships and collect and analyze data from physical models to assess design performance under different conditions.
English language arts
Data literacy appears in multiple areas of the CCSS ELA standards. Anchor Standard 7 is about evaluating content presented in diverse formats, including visually and quantitatively. Standards about reading informational text include the ability to read text features such as tables, charts, and graphs. High school writing standards call out the use of graphics, like tables and figures, as well as the need to assess the quality, relevance, and authoritativeness of sources of information. Finally, the standards related to cross-disciplinary literacy in science and social studies call on students to integrate quantitative or technical analysis, such as charts and research data, with qualitative analysis and translate quantitative or technical information expressed in words in a text into visual forms, such as tables and charts.
Social studies
The C3 Social Studies Standards incorporate and support data literacy by emphasizing the use of data and evidence throughout the inquiry process. Dimension 1 encourages students to develop questions and plan inquiries, requiring them to identify and evaluate sources that provide multiple perspectives and types of data. Dimension 2 focuses on applying disciplinary tools and concepts, such as using economic data to analyze market outcomes or geographic data to understand spatial patterns. Dimension 3 is dedicated to gathering and evaluating sources, teaching students to assess the credibility and relevance of data. Finally, Dimension 4 involves communicating conclusions and taking informed actions, where students use data to construct arguments, explanations, and critiques, ensuring their conclusions are well-supported by evidence.
Reading misleading data
While the standards are explicit about the need for students to be skilled at making, reading, interpreting, and using data as evidence, they are not often as explicit about the ability to identify errors in interpretation or misleading data. Misleading data can stem from the way data is represented visually, or it can be related to the assertions made about the data. It can result from intentional manipulation or from unintentional errors in interpretation or representation. Teaching students to question data gives them an invaluable skill and can be empowering.
Lynette Hoelter, of the University of Michigan’s Inter-University Consortium for Political and Social Research (ICPSR), has a wonderful 45-minute webinar, “Data, data everywhere and not a number to teach!” which delves into both how data can be misleading and how we can teach students to be more savvy consumers of data. In the webinar, she calls out how we are prone to unquestioningly accept numerical data as evidence, forgetting that “numbers…don’t exist apart from people.” She proposes the following questions to consider when examining any data representation or claim based on data:
- What is the source of the statement/data? (Who collected the data? What did they count? Why did they count it?)
- How is the information reported?
- Is the sample size both adequate and representative?
- Are the graphics misleading in any way? (Do they lack context? Are the appropriate measures of central tendency used? Is the scale appropriate?)
She also gives examples of several common ways that statistics can be confusing or misleading.
Definition issues
These types of issues involve examining what was included and excluded in the data gathering and how the question posed is framed or defined. For example, if you see a statement about the average income for a particular type of job, you might question who is counted in this statistic. Is it only full-time employees? Does it include part-time employees? What about contractors/gig workers? Were the salaries taken from a particular geographic area, across the US, or worldwide? Asking such questions is critical to fully understanding the implications and limitations of data or claims based on that data.
Big numbers
Large numbers can seem impressive simply because of their size, but without additional context it is difficult to understand the relative magnitude of a number. Imagine, for example, a contest or lottery that boasts of having 10,000 winners. That sounds like a large number until you learn that 20 million people entered the contest. Knowing how the number of winners compares to the number of entrants highlights the relative magnitude of the number and shows that the winners represent only 0.05% of those who entered.
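If you want students to check this kind of claim themselves, the arithmetic is easy to script. The short Python sketch below uses the contest numbers from the example above; the function name `share_of_total` is just an illustrative choice.

```python
def share_of_total(part: int, total: int) -> float:
    """Return `part` as a percentage of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * part / total


# Numbers from the contest example in the text.
winners = 10_000
entrants = 20_000_000

pct = share_of_total(winners, entrants)
print(f"{winners:,} winners out of {entrants:,} entrants is {pct:.2f}%")
print(f"That is about 1 winner for every {entrants // winners:,} entrants")
```

Restating the percentage as "1 in 2,000" is often the version that sticks with students, because it puts the big number back into a human-sized comparison.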
Using the wrong measures of center
If mean and median are used incorrectly to describe a data set, it can impact the interpretation of the data. For data sets that are relatively evenly distributed with no outliers, the mean is a good measure of the data. If a data set is skewed or contains outliers, the median is the better choice. Let’s say I want to choose a measure of center that best represents the heights of my students. If my students’ heights are evenly spread between 4 ¾ and 5 ½ feet, mean is a good choice. But if I have one student who is 6 ½ feet tall, then median would be the better choice to minimize the impact of the one outlier student on my data.
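To see the effect of a single outlier, students can compare the two measures before and after adding the tall student. The sketch below uses Python's built-in `statistics` module. The individual heights are made up for illustration (the text gives only the range and the 6 ½-foot outlier), so treat the specific values as an assumption.

```python
import statistics

# Hypothetical class heights in feet, spread between 4 3/4 and 5 1/2 feet.
heights_ft = [4.75, 4.9, 5.0, 5.1, 5.25, 5.5]

# The same class with one 6 1/2-foot student added.
with_tall_student = heights_ft + [6.5]

mean_before = statistics.mean(heights_ft)
median_before = statistics.median(heights_ft)
mean_after = statistics.mean(with_tall_student)
median_after = statistics.median(with_tall_student)

# How far each measure moved when the outlier was added.
mean_shift = mean_after - mean_before
median_shift = median_after - median_before

print(f"Without outlier: mean {mean_before:.2f} ft, median {median_before:.2f} ft")
print(f"With outlier:    mean {mean_after:.2f} ft, median {median_after:.2f} ft")
print(f"Mean moved {mean_shift:.2f} ft; median moved {median_shift:.2f} ft")
```

With these made-up values, the mean moves several times farther than the median, which is the reason the median is the safer summary when one value sits far from the rest.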
Correlation vs. causation
This is a commonly known issue but one that still fools people. When two variables are correlated, it means a change in one variable is accompanied by a change in the other. However, this does not mean that the change in one of the variables necessarily caused the change in the other. For example, ice cream sales and sunscreen sales may show the same patterns of increase and decrease over the course of a year in a particular area, but the act of buying ice cream does not cause you to buy more sunscreen. They are likely correlated because they are both things you are more likely to buy when it is hot, but there is no evidence of a causal relationship.
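A small calculation can make the point concrete for older students. The sketch below computes the Pearson correlation coefficient for a year of monthly figures. All of the numbers are invented for illustration, and `pearson_r` is a hand-written helper rather than a library function. The idea is simply that two series that both follow temperature will also track each other.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly values, January through December (made up for this sketch).
temps_f = [30, 34, 45, 57, 68, 78, 84, 82, 73, 60, 47, 35]
ice_cream = [20, 24, 35, 50, 66, 80, 90, 85, 72, 55, 38, 25]
sunscreen = [5, 6, 12, 22, 35, 48, 55, 52, 40, 25, 14, 7]

r_ice_sun = pearson_r(ice_cream, sunscreen)
r_temp_ice = pearson_r(temps_f, ice_cream)
r_temp_sun = pearson_r(temps_f, sunscreen)

print(f"ice cream vs. sunscreen:   r = {r_ice_sun:.2f}")
print(f"temperature vs. ice cream: r = {r_temp_ice:.2f}")
print(f"temperature vs. sunscreen: r = {r_temp_sun:.2f}")
```

A strong correlation between ice cream and sunscreen sales shows up here even though neither purchase causes the other. Both simply rise and fall with a third variable, temperature, and the coefficient alone cannot tell you that.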
Misleading graphics
There are many ways to intentionally or unintentionally manipulate visual displays of data. Data scales can be chosen to make small changes appear much more significant, or critical context can be left out of the display. Lea Gaslowitz’s TED-Ed video “How to spot a misleading graph” and the Academy 4 Social Civics’s “Misleading graphs: Don’t get fooled graphs series” both give great visual examples of this issue.
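One way to quantify how much a scale choice exaggerates a difference is to compare the heights of the bars as drawn, not the values themselves. The sketch below is a simplified model of a bar chart, with made-up values and a hypothetical helper named `apparent_ratio`. It assumes each bar is drawn from the start of the axis up to its value.

```python
def apparent_ratio(small: float, large: float, axis_start: float = 0.0) -> float:
    """Ratio of the drawn bar heights when the value axis begins at `axis_start`."""
    if axis_start >= small:
        raise ValueError("axis_start must be below the smaller value")
    return (large - axis_start) / (small - axis_start)


# Hypothetical values: a measure that rose from 50 to 52.
last_year, this_year = 50.0, 52.0

honest = apparent_ratio(last_year, this_year)                      # axis starts at 0
truncated = apparent_ratio(last_year, this_year, axis_start=49.0)  # axis starts at 49

print(f"Axis from 0:  taller bar looks {honest:.2f}x the shorter one")
print(f"Axis from 49: taller bar looks {truncated:.2f}x the shorter one")
```

The underlying change is the same 4 percent in both cases, but starting the axis at 49 makes the second bar look three times as tall as the first. Asking students "Where does the axis start?" is a quick habit that catches this.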
When talking about misleading data, it is important to teach students to research who conducted or commissioned the study behind a set of data. This can uncover potential conflicts of interest that could lead to bias in either the findings or the claims based on a set of data.
Strategies for increasing data literacy in the classroom
While there may be a push for more comprehensive data literacy standards and instruction in the future, there are smaller ways you can increase students’ data literacy. Data literacy shouldn’t be limited to math and science classrooms; by its nature, data literacy lends itself to cross-disciplinary study. Here are a few ways to infuse your lessons with data.
Start with data
When starting a lesson or unit on a particular topic, use relevant, real-world data as a starting point for discussion. Even when teaching literature, you can share a graph or piece of data related to either a key concept or the period during which the piece is set. For example, when reading A Raisin in the Sun, you could have students explore graphs from Zillow that show the long-term impact of redlining on home values. In addition to providing an opportunity to explore data with students, this data lends additional context for the play.
Use authentic data
Engaging with authentic data can increase student interest in the work and give them valuable experience with real-world (and often “messy”) data. This means moving beyond graphs of data on how students get to school and favorite lunches that are featured in so many curriculum materials.
At the elementary level, have students gather data on weather or class experiments or teach them to conduct surveys to gather authentic data. Or go to sites like Data Nuggets to find activities built around authentic data. Middle and high school students can explore larger and more complex data sets found at sites like the US Census Bureau or NOAA.
Turn your students into data hunters
Encourage students to be on the lookout for data in the real world. Have them bring in examples of data or claims they see online or out in the world and examine it as a class. This is a great opportunity to practice asking critical questions to help determine the validity of both the underlying data and the claims made from it.
Make time for misleading data
Do not shy away from having students explore claims or data presentations that are misleading. Being able to spot data manipulation is a critical life skill given that purposely manipulated data is often used to persuade people to buy or believe something. A quick internet search of “misleading graphs” turns up a variety of examples suitable for different ages and purposes.
Dig into data!
Incorporating data literacy into education is essential for preparing students to navigate a data-rich world. By teaching them to question, analyze, and interpret data, we empower them to identify both critical and misleading information and make informed decisions.
For more help finding ways to add data to your classroom, keep the following list of lesson and data resources handy:
- The Basics of Data Literacy: Helping Your Students (and You!) Make Sense of Data. This NSTA publication, which covers both fundamental and advanced data literacy, is designed for teachers from elementary to high school and includes activities and resources.
- Creating Data Literate Students. This book, published by the University of Michigan, is available for free online. Each chapter is written by a high school librarian and contains information, strategies, and lesson plans for improving students’ statistical literacy, visual data literacy, and literacy with data in research. The activities in the appendix are frequently categorized by how much time you have to devote to the particular topic and whether you’ll be working with a whole class or multiple classes.
- “Data, data everywhere and not a number to teach!” This webinar from the University of Michigan delves into how data can be misleading and how to teach students to be more savvy consumers of data.
- Data.gov. This is the US government’s open data repository. You can search by topic or by organization. Searching by organization allows you to view data from large cities, like New York, Sioux Falls, and Los Angeles, as well as government agencies, such as the Census Bureau, the Bureau of Labor Statistics, and the Environmental Protection Agency.
- Data Nuggets. This site, founded by Michigan State University, contains free classroom activities for elementary through high school students. Each Nugget was codesigned by educators and scientists and provides information about actual science research projects and activities designed to help students use data to find patterns and explanations of natural phenomena.
- Data Science 4 Everyone. This organization, created by the University of Chicago, is a coalition of multiple organizations promoting the need for data science education. The site includes possible implementation models for getting more data literacy in K–12 education, and it links to resources for teaching data literacy, learning more about data literacy, and data sets.
- Education Development Center’s “Resources for educators using data in the classroom”. This site includes links to free sources of data that can be used in the classroom as well as links to lessons and activities designed to support data literacy education from elementary school to high school.
- Foundations of Data Science for Students in Grades K–12. This publication from the National Academies can be accessed or downloaded for free. It summarizes presentations and discussions at a 2022 conference on K–12 data science education.
- My NASA Data. This site provides educators access to curated NASA Earth data, along with mini-lessons, full lesson plans, and interactives all aligned to Next Generation Science Standards. The content is geared for grades 3–12.
- National Oceanic and Atmospheric Administration data resources for educators. This site provides lesson plans, activities, and curriculum that use real NOAA data, including climate, historical, ocean, freshwater, and real-time data. The resources vary by grade, but there is content available for grades K–12.
- Our World in Data. This nonprofit site publishes data related to current world issues and problems, with the goal of using data to make progress against these issues. Topics include population, health, the environment, education, innovation, and human rights.
- “Pre-K–12 guidelines for assessment and instruction in statistics education II (GAISE II): A framework for statistics and data science education.” This free guide is an official position of the National Council of Teachers of Mathematics and was endorsed by the American Statistical Association. It “lays out a curriculum framework for Pre-K–12 educational programs that is designed to help students achieve data literacy and become statistically literate. The framework and subsequent sections in this book recommend curriculum and implementation strategies covering Pre-K–12 statistics education.”
- Statistics in Schools. This site from the US Census uses real Census Bureau data in lessons and resources suited for K–12 students.