Wait, Why Not Just Use Administrative Data?
“Let’s find districts like yours.” That’s how I started this part of my presentation to the superintendent of the Kansas City, Missouri school district in the spring of 2015. Our discussion topic was the launch of EdWise, a free website I created that used visual tools to make publicly available education data more accessible.* Using a series of filters and over 16 million points of data, I shrank a list of 568 Missouri school districts to three close matches for the superintendent’s school district in under two minutes. The identified school districts matched his district’s student enrollment, staff size, and student diversity (in both racial and economic terms).
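For readers who think in code, here is a minimal sketch of what "a series of filters" can look like. It is not how EdWise was built. The `District` fields, the 10 percent tolerance, and every number in the sample data are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class District:
    # Hypothetical fields; real administrative files use their own column names.
    name: str
    enrollment: int
    staff: int
    pct_nonwhite: float     # share of students, 0-100
    pct_low_income: float   # share of students, 0-100


def within(value, target, tolerance):
    """Return True when value is within a relative tolerance of target."""
    return abs(value - target) <= tolerance * target


def similar_districts(target, districts, tolerance=0.10):
    """Keep only the districts that pass every filter against the target."""
    matches = []
    for d in districts:
        if d.name == target.name:
            continue  # never match a district to itself
        if (within(d.enrollment, target.enrollment, tolerance)
                and within(d.staff, target.staff, tolerance)
                and within(d.pct_nonwhite, target.pct_nonwhite, tolerance)
                and within(d.pct_low_income, target.pct_low_income, tolerance)):
            matches.append(d)
    return matches


# Invented sample data, not real district figures.
target = District("Target District", 15000, 2000, 90.0, 85.0)
candidates = [
    District("District A", 15500, 2100, 88.0, 80.0),
    District("District B", 3000, 400, 20.0, 30.0),
    District("District C", 14000, 1900, 60.0, 84.0),
    District("District D", 16000, 1850, 92.0, 90.0),
]

matched_names = [d.name for d in similar_districts(target, candidates)]
print(matched_names)
```

Each filter narrows the list a little more, and loosening or tightening `tolerance` changes how many "districts like yours" survive.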
The process of building EdWise taught me many things (including a renewed respect for the art of design and the power of caffeine on a late night) but mostly that administrative data can tell stories. These stories about our schools and districts can change lives and improve policies. For school leaders, it’s helpful to know, for example, that districts like theirs can only be found on the other side of the state. These data help them to build a point of comparison for their own work, potentially introducing them to new partners for collaboration and empowering them to better engage the public narrative about their school districts. In fact, if you looked at the test scores of the three school districts most similar to Kansas City’s, two had a history of scores identical to those of Kansas City, but one district was ahead by 15 percentage points. As an insightful leader, this superintendent immediately wanted to know all he could about this high-performing school district.
Let’s define “administrative data.” This is information collected from an institution, such as a program, school, district, or state, for the purposes of regulatory requirements and public reporting. Examples include the number of students enrolled in a school, libraries in a county, or fishing licenses in a state. Typically, the institution collecting this information is either a government entity or a nonprofit organization. Administrative data is reported at the institutional level, which by definition describes a collection of students (or other elements) rather than just one. This means that administrative data, at its core, protects the identity of students. Any count small enough to identify a student or a group of students is redacted through a policy commonly referred to as “cell size.” Cell size refers to the count reported in a single cell of a spreadsheet. When that count is small enough that someone could guess an individual’s identity from it, the cell is redacted. Think about a school with just one arts teacher; that teacher’s identity would be fairly easy to find. This is why those cells are redacted, and why administrative data has legitimate appeal.
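A cell-size rule is easy to sketch in code. The threshold of 10 below is an assumption for illustration only; each agency sets its own minimum, and the group names and counts are invented.

```python
def suppress_small_cells(counts, min_cell_size=10):
    """Replace any count below the minimum cell size with None (redacted).

    min_cell_size=10 is an assumed threshold, not an official standard.
    """
    return {
        group: (n if n >= min_cell_size else None)
        for group, n in counts.items()
    }


# Invented example: staff counts by subject in one school.
staff_by_subject = {"Math": 14, "English": 12, "Arts": 1}
published = suppress_small_cells(staff_by_subject)
print(published)
```

The lone arts teacher disappears from the published table, while the larger groups are reported as usual.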
In fact, administrative data is abundant. The premier clearinghouse for federal administrative data, Data.gov, boasts more than 192,322 datasets on hundreds of topics. Data.gov is just the tip of the iceberg. There are plenty of other data clearinghouses with scads of information. Some of these websites are run by states, cities, research organizations, and even nonprofits. Further, people are already doing interesting things with this data. Here is an example of folks using U.S. education data to evaluate the safety of Ivy League schools, and here’s another one, from the New York Times’s Upshot, on racial and economic equity. Administrative data is valid for identifying problems, trends, opportunities, and even loose relationships between topics.
However, administrative data is not a panacea. Its usefulness rests on its ability to summarize. When a summary hides the very differences you need to see, the tool breaks down. Let’s say you were doing medical research on different treatment plans within a hospital. Would it help to summarize the results to the hospital or county level? Of course not, because you would lose the nuance that made the study work in the first place. The same could be said of comparing reading programs within a group, class, or even a school.
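A tiny worked example shows how a summary can erase the difference you care about. The two reading programs and their score gains are invented for illustration.

```python
from statistics import mean

# Invented score gains for students in two reading programs at one school.
gains = {
    "Program A": [12, 14, 13],
    "Program B": [2, 3, 1],
}

# Program-level view: the two programs look very different.
by_program = {name: mean(values) for name, values in gains.items()}

# School-level view: one number, and the contrast is gone.
school_mean = mean(g for values in gains.values() for g in values)

print(by_program)
print(school_mean)
```

Reported only at the school level, the average gain says nothing about which program produced it.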
The argument here is not that administrative data is perfect, just that it is more valuable and safer than we acknowledge. In a world gripped by one data privacy scandal after another, why not try administrative data first? Websites like 538, the Upshot, and EdWise made a name for themselves with such information. For certain key questions, administrative data holds great opportunity for telling stories in education.
Christopher Laubenthal has 15+ years of experience with data visualization and storytelling, including work across multiple sectors and fields. Whether dashboard or database, his goal is always the same: help people answer questions with data, tell stories, and make data look beautiful. He is currently a Data & Visualization Consultant for Lockton Companies, LLC.