The Groundbreaking Nature Study Shaking Social Science
A landmark investigation published in Nature has delivered sobering news for the field of social science: only about half of published results can be reliably replicated. This finding comes from the Social, Computational, and Open Research Ecology (SCORE) project, a massive seven-year effort coordinated by the Center for Open Science at the University of Virginia. Researchers scrutinized 3,900 papers from 62 journals spanning disciplines like psychology, economics, political science, sociology, and education, all published between 2009 and 2018.
The project's scope was unprecedented, involving 865 scientists from dozens of countries who assessed reproducibility (running the original analysis on the same data), robustness (trying alternative analyses), and replicability (repeating experiments with new data). The headline result? Just 49% of 164 studies replicated with statistical significance, effect sizes shrank dramatically, and in some cases, fresh analyses flipped conclusions entirely. For US higher education, where social science departments drive policy research and student training, this underscores an urgent need for reform.
Decoding the SCORE Methodology and Key Statistics
The SCORE project broke reliability down into three pillars. Reproducibility checks focused on 600 papers; only 145 had sufficient data and code, and 53% of those yielded exact matches. Robustness tests on 100 papers showed 75% holding up under varied analyses, while 2% contradicted the originals. Replicability, the gold standard, succeeded in 49% of cases, broadly in line with the 36% rate reported for psychology in 2015 but now extending across fields.
- Economics: Lowest replicability at around 40%, echoing a 2016 survey where 40% failed.
- Education: Stronger performance despite poor data sharing.
- Political Science: Higher reproducibility, possibly due to public data norms.
Effect sizes dropped from 0.25 to 0.10 on average, suggesting overstated impacts in originals. Data sharing emerged as the top predictor of success, a boon for open science advocates in US academia.
Roots of the Reproducibility Crisis: A Timeline in US Academia
The crisis erupted in 2011 with psychology fraud cases, but gained traction via the Open Science Collaboration's 2015 replication of 100 studies, succeeding in just 36%. US universities like Harvard and Stanford saw priming effects (e.g., words making people walk slower) vanish. Economics followed in 2016, with many lab experiments failing. By 2026, SCORE confirms it's systemic, not isolated.
Publication bias favors novel p-values under 0.05, small samples amplify noise, and p-hacking (tweaking analyses post-hoc) inflates false positives. In US colleges, 'publish or perish' pressures tenure-track faculty, prioritizing quantity over rigor.
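To see how a significance filter alone can overstate effects, here is a minimal Python sketch. It is an illustration under stated assumptions, not SCORE's method: the true effect (0.10), the sample size (50 per group), and the rule that only significant results in the predicted direction get "published" are all hypothetical choices, and each study's estimate is drawn directly from its sampling distribution rather than from raw data.

```python
import math
import random


def simulate_literature(true_effect=0.10, n=50, studies=5000, seed=7):
    """Return (mean of all estimates, mean of 'published' estimates).

    All parameters are illustrative assumptions, not values from SCORE.
    """
    rng = random.Random(seed)
    se = math.sqrt(2 / n)   # standard error of a difference in means, unit variance
    critical = 1.96 * se    # estimate needed for p < 0.05 in a two-sided z-test
    all_estimates, published = [], []
    for _ in range(studies):
        # Shortcut: draw the study's estimate from its sampling distribution.
        est = rng.gauss(true_effect, se)
        all_estimates.append(est)
        # Hypothetical publication rule: significant and in the predicted direction.
        if est > critical:
            published.append(est)
    mean_all = sum(all_estimates) / len(all_estimates)
    mean_published = sum(published) / len(published)
    return mean_all, mean_published


if __name__ == "__main__":
    mean_all, mean_published = simulate_literature()
    print(f"mean estimate, all studies:      {mean_all:.2f}")
    print(f"mean estimate, published subset: {mean_published:.2f}")
```

Because every published estimate has to clear the significance threshold, the published average sits well above the true effect even though no individual study did anything wrong. Raising `n` lowers the threshold and narrows the gap, which is one reason larger samples help.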
Why Replications Fail: Incentives, Methods, and Human Factors
Small samples (often n=20-50) can yield false-positive rates as high as 80%. Flexibility in analysis allows fishing for significance. Researcher degrees of freedom, such as choosing which outliers to drop or which subgroups to report, exacerbate the problem. Cultural factors like positive-only publication reward flashy findings over null results. In economics, field experiments vary by context, eroding universality.
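The cost of analytic flexibility can be sketched with a short simulation. The setup below is a hypothetical example, not drawn from SCORE: every study has a true effect of zero, a researcher measures five independent outcomes with 30 participants per group, and the "flexible" analyst reports whichever outcome reaches p < 0.05. A z-test with known unit variance stands in for the usual t-test so that the example needs only the standard library.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def null_study_p(rng, n):
    """p-value from one two-group comparison where the true effect is zero."""
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    diff = sum(a) / n - sum(b) / n
    z = diff / math.sqrt(2 / n)  # known unit variance in both groups
    return two_sided_p(z)


def false_positive_rates(n=30, outcomes=5, sims=3000, alpha=0.05, seed=1):
    """Return (honest rate, flexible rate) of false positives under the null."""
    rng = random.Random(seed)
    honest = flexible = 0
    for _ in range(sims):
        ps = [null_study_p(rng, n) for _ in range(outcomes)]
        honest += ps[0] < alpha      # one outcome, fixed in advance
        flexible += min(ps) < alpha  # report whichever outcome "worked"
    return honest / sims, flexible / sims


if __name__ == "__main__":
    honest, flexible = false_positive_rates()
    print(f"pre-specified outcome: {honest:.1%} false positives")
    print(f"best of five outcomes: {flexible:.1%} false positives")
```

With five independent outcomes, the chance that at least one crosses the 0.05 line is 1 - 0.95^5, or about 23%, and the simulated flexible rate should land near that figure. Real outcomes are usually correlated, which softens the inflation but does not remove it.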
Tim Errington of UVA's COS notes, 'Some results are not replicable because of honest mistakes,' and urges readers to treat single studies as 'pieces of a puzzle.' Stanford's John Ioannidis calls the results 'not surprising' and praises the community's response.
Infamous Case Studies: Power Posing and Ego Depletion in US Labs
Carney's 2010 power-posing study (two minutes of posing reportedly raised testosterone by 20% and increased risk-taking) failed in 2015 replications at Columbia and elsewhere; the effect vanished. Baumeister's ego depletion (willpower as a finite resource) crumbled in a 2016 multi-lab Registered Replication Report across 23 labs. In economics, Chetty's mobility studies partially held up, but many lab games did not. These US-originated icons shaped textbooks and policies, which are now being revised at colleges such as UCLA and Michigan.
Shaking US Higher Ed Foundations: Funding, Tenure, and Public Trust
NSF and NIH fund more than $10B a year in social and behavioral research; irreproducible work wastes millions and erodes grant support. Tenure dossiers heavy on publications ignore replication, and departments are rethinking their metrics. Student curricula still teach debunked findings, which erodes trust: Gallup polls show 36% of Americans are confident in higher ed science.
Policy fallout: Failed interventions cost billions (e.g., priming in education). US colleges face lawsuits, donor skepticism amid replication scandals.
Center for Open Science tracks these shifts.
US Universities Lead the Charge: Open Science Training Programs
UVA's COS offers workshops; Berkeley's BITSS trains researchers in transparency. Harvard's DBMI mandates data sharing. Stanford's Meta-Research Innovation Center (METRICS), led by Ioannidis, pushes reforms. Many adopt TOP Guidelines: Level 1 citation standards, Level 2 data/code sharing. Registered Reports (e.g., APS journals) peer-review Stage 1 protocols before data are collected.
- Pre-registration on OSF prevents p-hacking.
- Open data/code via GitHub/Dryad.
- Badges for practices boost CVs.
Training integrates into PhD curricula at Michigan, NYU.
Emerging Solutions: From Preregistration to AI Tools
Preregistration locks in hypotheses before data collection. Multiverse analysis explores alternative analytic decision paths. Larger samples come via crowdsourcing (MTurk/Prolific). Journals like Advances in Methods and Practices in Psychological Science prioritize replications. AI predicts replicability (SCORE tested this; imperfect but promising). NSF requires data management plans (DMPs); tenure committees value rigor.
Abel Brodeur (Ottawa): Recent papers (2022+) show 85% reproducibility gains.
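Multiverse analysis, mentioned above, can be sketched in a few lines of Python. Everything in this example is hypothetical: the simulated dataset, the assumed true effect of 0.2, and the three analytic choices (outlier cutoff, minimum participant age, mean versus median). The idea is to run every defensible combination and report the full spread of estimates instead of one favored result.

```python
import itertools
import random
import statistics


def make_data(rng, n=200, effect=0.2):
    """Hypothetical two-group dataset of (treated, outcome, age) rows."""
    rows = []
    for _ in range(n):
        treated = rng.random() < 0.5
        age = rng.randint(18, 70)
        outcome = rng.gauss(effect if treated else 0.0, 1.0)
        rows.append((treated, outcome, age))
    return rows


def estimate(rows, outlier_sd, min_age, use_median):
    """Treated-minus-control difference under one set of analytic choices."""
    kept = [r for r in rows if r[2] >= min_age]
    if outlier_sd is not None:
        outcomes = [r[1] for r in kept]
        mu, sd = statistics.mean(outcomes), statistics.stdev(outcomes)
        kept = [r for r in kept if abs(r[1] - mu) <= outlier_sd * sd]
    center = statistics.median if use_median else statistics.mean
    treated = [r[1] for r in kept if r[0]]
    control = [r[1] for r in kept if not r[0]]
    return center(treated) - center(control)


def multiverse(rows):
    """Estimate the effect under every combination of analytic choices."""
    grid = itertools.product(
        [None, 3.0, 2.0],  # outlier rule: keep all, drop beyond 3 SD, drop beyond 2 SD
        [18, 25],          # minimum age for inclusion
        [False, True],     # compare means or medians
    )
    return {spec: estimate(rows, *spec) for spec in grid}


if __name__ == "__main__":
    results = multiverse(make_data(random.Random(0)))
    low, high = min(results.values()), max(results.values())
    print(f"{len(results)} specifications, estimates from {low:.2f} to {high:.2f}")
```

A narrow range across specifications suggests the finding does not hinge on any single choice; a range that straddles zero is a warning that the headline result may depend on the path taken through the data.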
Real-World Reforms at American Campuses: Success Stories
At UC Berkeley, BITSS's Summer Institute trains 100+ yearly. UVA integrates reproducibility into psych/stats courses. NYU's Crow Lab mandates open practices; replications count double for tenure. Community colleges like Foothill teach open science in research methods. Impacts: Funded replications rise 300% since 2015.
| University | Initiative | Outcome |
|---|---|---|
| Stanford | METRICS Center | Meta-research grants |
| Harvard | DBMI Open Data | 100% sharing policy |
| UVA | COS Training | Global workshops |
Future Outlook: A More Credible Social Science in US Higher Ed
Optimism grows: Preprints (PsyArXiv) enable early scrutiny; funder mandates (NIH rigor initiatives) enforce change. By 2030, expect replication rates of 70% or more as open-science norms take hold. For US colleges, this rebuilds trust, attracts funding, and prepares students for evidence-based careers. As Errington says, 'The community is taking it seriously.' Embracing open science positions American universities as leaders.
Stakeholders, from deans to undergrads, must prioritize three things: train in stats software (R/jamovi), demand preregistration in theses, and value replications. The crisis, painful as it is, catalyzes a stronger foundation.
