The High Cost of Unproven Reforms: How Ideology and Weak Research Have Shaped Decades of Costly Education Policy
Rebecca A. Huggins
When I was a graduate student in education nearly twenty years ago, a professor told me that there wasn’t any good research on educational impacts because you can’t really control groups when people are involved. That claim echoed a broader skepticism about randomized experiments in education that Thomas Cook (2006) and others have documented as philosophical rather than empirical in origin. The consequence has been predictable: ideas that feel right often scale faster than ideas that are tested.
We have seen the consequences in reform after reform. Learning styles theory persisted long after reviews found no reliable evidence to support tailoring instruction to preferred modalities (Pashler et al., 2008). Whole-language approaches to reading were widely adopted despite phonics-based evidence on how early decoding supports comprehension (Foorman et al., 1998; National Reading Panel, 2000). Class-size debates were muddied by misreadings of Project STAR (Word et al., 1990). D.A.R.E. and similar anti-drug programs were rolled out nationwide despite null effects on substance abuse (Ennett et al., 1994; West & O’Neal, 2004). Self-esteem campaigns promised achievement gains that rigorous reviews did not confirm (Baumeister et al., 2003; Marsh & Craven, 2006). Open classroom experiments (Ramey & Piper, 1974; Bennett & Hyland, 1979), technology fads (Banerjee et al., 2007; OECD, 2015), and ill-founded social-promotion policies (Holmes, 1989; Jacob & Lefgren, 2009) all spread on intuition, not causal proof.
When education relies on weak research, ideology, or intuition instead of rigorous evidence, entire generations of students pay the price. I, like many of my peers in colleges of education, was none the wiser: teacher-preparation programs historically privilege qualitative and philosophical approaches and often leave graduates without the tools to evaluate causal claims (Wiens, 2012). Worse, when rigorous research is conducted, it frequently contradicts popular practice. Education is one of the few major human-impact fields where ideas can spread widely without ever being tested.
That pattern persists today. Universal social‑emotional learning (SEL) programs show some positive effects in meta-analyses (Durlak et al., 2011), but the academic gains are modest and the evidence that SEL can serve as a primary lever for raising test scores at scale remains limited (Jacob & Parkinson, 2015). Restorative justice practices are widely promoted for school climate and discipline, yet evidence of academic benefits is mixed (Augustine et al., 2018). Personalized learning platforms and many student-centered models produce small, inconsistent gains; several large implementations show no effect (Pane et al., 2015). Growth mindset interventions, when scaled, yield small average impacts with important contextual limits (Sisk et al., 2018; Yeager et al., 2019). Culturally responsive teaching might be pedagogically important, but causal evidence linking it to measurable academic gains remains thin (Aronson & Laughter, 2016; Dee & Penner, 2017). Project‑based learning (PBL) often fails to raise achievement unless teachers receive substantial, sustained training (Condliffe et al., 2017). And new movements, such as Building Thinking Classrooms, have attracted enthusiastic adopters but lack large-scale causal studies (Liljedahl, 2020).
All of this comes at a cost. Districts routinely spend hundreds of thousands to millions of dollars on curriculum adoptions, professional development, consultants, and technology licenses (Pane et al., 2015; Augustine et al., 2018; CASEL, n.d.). Those expenditures create economic and political inertia: once money, time, and reputation are invested, reversing course becomes difficult. Perhaps most frustrating is that education systems are built structurally, culturally, and politically to preserve ideas, not test them. The U.S. system is fragmented into more than 13,000 districts and fifty state systems (NCES, 2025) with no single research authority to vet and scale proven practices. Teacher-preparation programs often do not prioritize training in research literacy or causal inference (Wiens, 2012). Implementation feedback loops are slow: failures in literacy or numeracy may not be visible for years. And when an instructional approach becomes a moral stance, evidence is treated as optional.
The stakes are particularly high in mathematics. Recent reform movements have placed a strong emphasis on problem solving, discourse, and student-centered approaches. These goals are worthwhile in principle, yet in practice they rest on uncertain evidence of student outcomes. Approaches such as Building Thinking Classrooms, for example, demand a high level of teacher expertise: anticipating misconceptions, selecting and sequencing tasks, and maintaining a clear trajectory toward mathematical understanding. The rhetoric echoes past reforms, specifically mastery learning and poorly designed discovery models, and obscures how mathematical thinking actually develops, particularly the need for fluency and automaticity so that students can engage in the complex thinking we claim to value. When foundational skills are underdeveloped, students are left to grapple with problems they are not yet equipped to solve. Allowing calculators or discovery-only approaches to replace systematic skill-building is like handing my literature students a SparkNotes summary of Macbeth and asking them to have a deep, meaningful conversation on Shakespeare’s language choices with only a superficial grasp of the play. In many cases, the issue is not the idea itself but the assumption that it can be implemented effectively and consistently across classrooms, regardless of the teacher’s pedagogical content knowledge.
This is not a call to abandon innovation. Some interventions have robust evidence and deserve scale. But the field must stop treating evidence as optional. Districts should require independent evaluation before large adoptions, fund randomized or well-designed quasi-experimental pilots, build research literacy into teacher preparation, and demand transparent cost-benefit reporting for major initiatives. Most educators ultimately want to do what is best for students. But good intentions don’t guarantee good outcomes. And the truth is, belief‑driven practices have done little to improve educational results in the nearly twenty years since I graduated from my college of education. Until we begin grounding instructional decisions in evidence rather than ideology, we will continue to repeat the same cycles: new labels, new programs, same outcomes.
References
Aronson, B., & Laughter, J. (2016). The theory and practice of culturally relevant education: A synthesis of research across content areas. Review of Educational Research, 86(1), 163-206. https://doi.org/10.3102/0034654315582066
Augustine, C.H., Engberg, J., Grimm, G.E., Lee, E., Wang, E. L., Christianson, K., & Joseph, A.A. (2018). Can restorative practices improve school climate and curb suspensions? An evaluation of the impact of restorative practices in a mid-sized urban school district. RAND Corporation. https://www.rand.org/pubs/research_reports/RR2840.html
Banerjee, A., Cole, S., Duflo, E., & Linden, L. (2007). Remedying education: Evidence from two randomized experiments in India. Quarterly Journal of Economics, 122(3), 1235–1264. https://doi.org/10.1162/qjec.122.3.1235
Baumeister, R. F., Campbell, J. D., Krueger, J. I., & Vohs, K. D. (2003). Does high self-esteem cause better performance, interpersonal success, happiness, or healthier lifestyles? Psychological Science in the Public Interest, 4(1), 1–44. https://doi.org/10.1111/1529-1006.01431
Bennett, N., & Hyland, T. (1979). Open Plan: Open Education? British Educational Research Journal, 5(2), 159–166. http://www.jstor.org/stable/1501026
CASEL. (n.d.). Cost estimator — CASEL district resource guide. Collaborative for Academic, Social, and Emotional Learning. https://drc.casel.org/cost-estimator/
Condliffe, B., Quint, J., Visher, M.G., Bangser, M., Brohojowska, S., Saco, L., & Nelson, E. (2017). Project‑based learning: A literature review. MDRC. https://www.mdrc.org/sites/default/files/Project-Based_Learning-LitRev_Final.pdf
Cook, T. D. (2006). Sciencephobia: Why education rejects randomized experiments. Education Next, 1(3), 62-68. https://www.educationnext.org/sciencephobia/
Dee, T.S., & Penner, E.K. (2017). The causal effects of cultural relevance: Evidence from an ethnic-studies curriculum. American Educational Research Journal, 54(1), 127-166. https://files.eric.ed.gov/fulltext/EJ1132535.pdf
Durlak, J.A., Weissberg, R.P., Dymnicki, A.B., Taylor, R.D., & Schellinger, K.B. (2011). The impact of enhancing students’ social and emotional learning: A meta-analysis of school-based universal interventions. Child Development, 82(1), 405-432. https://doi.org/10.1111/j.1467-8624.2010.01564.x
Ennett, S. T., Tobler, N. S., Ringwalt, C. L., & Flewelling, R. L. (1994). How effective is drug abuse resistance education? A meta-analysis of Project DARE outcome evaluations. American Journal of Public Health, 84(9), 1394–1401. https://doi.org/10.2105/ajph.84.9.1394
Foorman, B. R., Francis, D. J., Fletcher, J. M., Schatschneider, C., & Mehta, P. (1998). The role of instruction in learning to read: Preventing reading failure in at‑risk children. Journal of Educational Psychology, 90(1), 37–55. https://doi.org/10.1037/0022-0663.90.1.37
Holmes, C. T. (1989). Flunking grades: Research and policies on retention. Falmer Press.
Jacob, B. A., & Lefgren, L. (2009). The effect of grade retention on high school completion. American Economic Journal: Applied Economics, 1(3), 33–58. https://doi.org/10.1257/app.1.3.33
Jacob, R., & Parkinson, J. (2015). The potential for school‑based interventions that target executive function to improve academic achievement: A review. Review of Educational Research, 85(4), 512–552. https://doi.org/10.3102/0034654314561338
Liljedahl, P. (2020). Building Thinking Classrooms in Mathematics, Grades K-12: 14 teaching practices for enhancing learning. Corwin/SAGE.
Marsh, H. W., & Craven, R. G. (2006). Reciprocal effects of self-concept and performance from a multidimensional perspective: Beyond seductive pleasure and unidimensional perspectives. Perspectives on Psychological Science, 1(2), 133–163. https://doi.org/10.1111/j.1745-6916.2006.00010.x
National Center for Education Statistics. (2025). Table 3. Number of operating public elementary and secondary schools, by school type, charter, and state or jurisdiction: School year 2024–25. Common Core of Data. U.S. Department of Education. https://nces.ed.gov/ccd/tables/202425_summary_3.asp
National Reading Panel. (2000). Teaching children to read: An evidence‑based assessment of the scientific research literature on reading and its implications for reading instruction. NIH Publication. https://eric.ed.gov/?id=ED444126
OECD. (2015). Education at a Glance 2015: OECD Indicators. OECD Publishing. https://www.oecd.org/en/publications/education-at-a-glance-2015_eag-2015-en.html
Pane, J.F., Steiner, E.D., Baird, M.D., & Hamilton, L.S. (2015). Continued progress: Promising evidence on personalized learning. RAND Corporation. https://www.rand.org/pubs/research_reports/RR1365.html
Pashler, H., McDaniel, M., Rohrer, D., & Bjork, R. (2008). Learning styles: Concepts and evidence. Psychological Science in the Public Interest, 9(3), 105–119. https://doi.org/10.1111/j.1539-6053.2009.01038.x
Ramey, C. T., & Piper, V. (1974). Creativity in Open and Traditional Classrooms. Child Development, 45(2), 557–560. https://doi.org/10.2307/1127989
Sisk, V. F., Burgoyne, A. P., Sun, J., Butler, J. L., & Macnamara, B. N. (2018). To what extent and under which circumstances are growth mind‑sets important to academic achievement? Two meta‑analyses. Psychological Science, 29(4), 549–571. https://doi.org/10.1177/0956797617739704
West, S. L., & O’Neal, K. K. (2004). Project D.A.R.E. outcome effectiveness revisited. American Journal of Public Health, 94(6), 1027–1029. https://doi.org/10.2105/ajph.94.6.1027
Wiens, P. D. (2012). The Missing Link: Research on Teacher Education. Action in Teacher Education, 34(3), 249–261. https://doi.org/10.1080/01626620.2012.694018
Word, E. R., et al. (1990). Student/Teacher Achievement Ratio (STAR): Tennessee’s K–3 class size study. Final summary report, 1985–1990. ERIC Document No. ED320692. https://eric.ed.gov/?id=ED320692
Yeager, D.S., Hanselman, P., Walton, G.M., et al. (2019). A national experiment reveals where growth mindset improves achievement. Nature, 573, 364-369. https://doi.org/10.1038/s41586-019-1466-y