By Steve Evangelista
A hidden bias
Out of 90 charter schools that administered the New York State standardized tests in both 2011 and 2012, Harlem Link had the 8th highest average increase in English Language Arts (ELA) and Math scores. This score improvement was amazing, fantastic, even inspiring. And misleading—because of a small, relatively unknown factor called survivorship bias.
Survivorship bias is a statistical term for the distortion that arises when some hidden factor removes certain members of a data set over time: part of a sample that was there at the beginning is no longer there at the end and does not count in the final analysis. The smaller subset of those who “survive” might look better off than the original whole group simply because of who stayed and who left, not because of any value added over time.
Simply put, every year, at every school, some students leave, and their departure changes the profile of who takes the test from year to year. Sometimes high-scoring students depart. At other times low-scoring students depart.
If a school continuously enrolls new students (and some don’t), the incoming students change the profile of test takers in the same way. At the end of this post I chart a hypothetical situation showing how survivorship bias can make a school appear to improve, without actually adding any value, simply because it does not enroll new students year after year.
In large systems, there is so much mobility that these student profiles tend to cancel each other out because of scale. For example, the student population appears relatively stable from year to year in the third grade in Community School District 3, where 1,342 students in 30 schools took the state English Language Arts exam in 2012. But in small student populations like the one at Harlem Link, where only 52 third-grade students took the 2012 exam, a few students entering or leaving the school with certain test scores can make a big difference.
When the state department of education releases test scores each year, however, it does not provide this or any other contextual background information alongside the scores. I believe that this process penalizes, in the public eye, schools that continue to enroll students to replace those who depart.
(Partly) illusory gains
At Harlem Link, the fact that we test in only three grades guarantees that at least 1/3 of the students taking the tests each year will be different from those who took them the year before. Putting aside the variability in the state test from year to year, this rolling of the dice has produced some dramatic swings in achievement: in some years our school’s test scores have looked worse than the actual performance of our teachers, and at other times (like this year) they may have looked better than they really were.
It turns out that the profile of our students who departed before the last school year was a much less successful one than the profile of the group that left the prior year. In other words, we had to improve less to get apparently lofty gains.
In English Language Arts, we saw an improvement of 18 percentage points from 2011 to 2012, according to the state’s way of reporting the scores. But since many of the students who graduated in 2011 or left for other reasons following the 2010-11 academic year performed poorly on the 2011 exams, the students who returned had a better passing rate by 10 percentage points than the original group. In other words, more than half of our test score gains in ELA could be accounted for by attrition.
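The arithmetic behind that "more than half" claim can be sketched in a few lines. This is only an illustration: the variable names are made up, and it assumes the attrition effect and any real improvement simply add up to the reported gain, which the figures above do not establish.

```python
# Figures quoted in the paragraph above (percentage points).
reported_gain = 18      # ELA improvement from 2011 to 2012 as the state reports it
attrition_effect = 10   # head start of the returning students over the original group

# Assumption (not from the state data): the two effects are additive.
attrition_share = attrition_effect / reported_gain
remaining_points = reported_gain - attrition_effect

print(f"Share of the gain explained by attrition: {attrition_share:.0%}")
print(f"Points left over under this assumption: {remaining_points}")
```

Under that additive assumption, attrition accounts for a little over half of the reported gain.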
Now, I’m not going to say that I’m not proud of our scores or that they are not indicative of a powerful effort by talented and dedicated professionals. I’m not even going to tell you that we didn’t improve our practice last year. I think we have improved in that area every year, because we have been an honest, self-examining, learning organization. But the wide swing in test scores and the state’s failure to describe enrollment patterns when reporting the scores mask the true story of a gradual, continuous march to improvement that is the real hallmark of the growth at Harlem Link.
Best practices often begin as difficult, controversial and seemingly impossible changes to “the way things are.” Strong schools take the time required to plan, assess and tweak new initiatives until they become standard operating procedures. The lack of information provided alongside scores obscures this type of growth, creating perverse incentives for schools to “push out” students who are low performers and to “quick fix” by whittling down large original cohorts to smaller groups of survivors, uncompromised by new admittees.
At Harlem Link, we have resisted these perverse incentives. We have always replaced students who leave, for budgetary reasons (being a small, standalone charter school) and to serve a greater portion of the community starved for high-quality school choices. Each year, we have encouraged some students who are particularly high achieving to leave a year early by helping them apply to competitive public and independent middle schools that only admit in fifth grade, reasoning that we’d rather lose their strong fifth grade test scores than see them lose an opportunity to get firmly on the college track a year ahead of their peers. If we followed the short-sighted state incentive, we would not have urged four of our highest-scoring fourth graders on the state exams in 2012 to apply to and enter the Upper West Side’s highly sought-after Center School. They were admitted and are all attending (a fact that may well push down our fifth grade test scores by as much as 10% next year), and we are thrilled, because we helped four more students living in a high-poverty environment to gain admission to this exclusive public school. We also would not have pushed students to leave after fourth grade in years past to embark on the independent school track by attending the East Harlem School at Exodus House and the George Jackson Academy in lower Manhattan.
In the context of reform
This issue has been raised before in the blogosphere, but not in a thoughtful manner. Instead, it has been wielded as a weapon by those who are against the current strain of education reform. It has been used to defeat the straw man argument that charters are silver bullets and to denigrate the success of networks like KIPP, which is another organization that deserves no such uninformed criticism. (Each year, KIPP asks itself several questions in its annual internal reporting, including, “Are we serving the students who need us?”)
Because it is potentially embarrassing and might burst the balloon of so-called charter education miracles, this issue has also (to my knowledge) been ignored publicly by my colleagues in the charter community. There are many groups of charter schools that go happily on their way winnowing down their large kindergarten classes, educating fewer and fewer students in each cohort each year, not adding new students and narrowing their challenges as they serve a shrinking group of “survivor” students well. And those charters that benefit from network infrastructure and economies of scale can balance their budgets even while shrinking six to eight kindergarten sections down to three or four fifth grade sections.
I’m not passing judgment on those networks. As a charter school founder who has been running a school for almost 10 years, I still believe that the charter experiment has been a profoundly positive one for the communities where such schools have flourished. What I want is for the public to have some understanding of the context behind test scores, so alleged miracles can be put in their proper place, and year-to-year statistical swings that have nothing to do with a school community’s actual performance can be put into their proper perspective.
Hypothetical (with some assumptions): survivorship bias in action
In the example below, compare two schools that start out with similar student profiles. School A replaces each student who departs. School B does not.
Each year at both schools, a greater percentage of academically struggling students than successful students leave. Each year at both schools, neither school is adding any value since no individual’s test scores are changing.
Because the students entering School A have the same academic profile as those who depart, its scores do not change. School B’s passing rate improves more than 20 percentage points, from 40 percent to 62.5 percent, simply by virtue of attrition, the decision not to enroll new students, and the mix of which students are taking the test each year.
School A = Enrolls new students continuously
|Year||1||2||3||4|
|Passing students added||40||5||5||5|
|Failing students added||60||15||15||15|
|Passing students leaving||0||5||5||5|
|Failing students leaving||0||15||15||15|
|Total Passing Students||40||40||40||40|
|Total Failing Students||60||60||60||60|
School B = Does not enroll new students
|Year||1||2||3||4|
|Passing students added||40||0||0||0|
|Failing students added||60||0||0||0|
|Passing students leaving||0||5||5||5|
|Failing students leaving||0||15||15||15|
|Total Passing Students||40||35||30||25|
|Total Failing Students||60||45||30||15|
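For readers who want to check the numbers, the hypothetical above can be replayed as a short simulation. This is a minimal sketch: the function name `simulate_pass_rates` and its parameters are invented for illustration, and it bakes in the same assumptions as the chart (a starting class of 40 passing and 60 failing students, 5 passing and 15 failing students leaving each year, and no individual student's result ever changing).

```python
def simulate_pass_rates(years, replace_leavers):
    """Return the passing rate (in percent) for each year of the hypothetical."""
    passing, failing = 40, 60  # starting class, as in the chart
    rates = []
    for year in range(years):
        if year > 0:
            # Each later year, 5 passing and 15 failing students leave.
            passing -= 5
            failing -= 15
            if replace_leavers:
                # School A enrolls newcomers with the same profile as the leavers.
                passing += 5
                failing += 15
        rates.append(100 * passing / (passing + failing))
    return rates


school_a = simulate_pass_rates(4, replace_leavers=True)
school_b = simulate_pass_rates(4, replace_leavers=False)

print("School A:", school_a)
print("School B:", school_b)
```

Neither school changes any student's score in this sketch, yet School B's passing rate climbs from 40 percent to 62.5 percent while School A's stays at 40 percent.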
This post also appeared at GothamSchools.