Social Science Reproducibility Crisis: New Nature Study Finds Only Half of Results Replicate in US-Led Probe

The Groundbreaking Nature Study Shaking Social Science

A landmark investigation published in Nature has delivered sobering news for the field of social science: only about half of published results can be reliably replicated. This finding comes from the Social, Computational, and Open Research Ecology (SCORE) project, a massive seven-year effort coordinated by the Center for Open Science at the University of Virginia. Researchers scrutinized 3,900 papers from 62 journals spanning disciplines like psychology, economics, political science, sociology, and education, all published between 2009 and 2018.

The project's scope was unprecedented, involving 865 scientists from dozens of countries who assessed reproducibility (running the original analysis on the same data), robustness (trying alternative analyses), and replicability (repeating experiments with new data). The headline result? Just 49% of 164 studies replicated with statistical significance, effect sizes shrank dramatically, and in some cases, fresh analyses flipped conclusions entirely. For US higher education, where social science departments drive policy research and student training, this underscores an urgent need for reform.

Decoding the SCORE Methodology and Key Statistics

The SCORE project broke down reliability into three pillars. Reproducibility checks focused on 600 papers; only 145 had sufficient data and code, and 53% of those yielded exact matches. Robustness tests on 100 papers showed 75% holding up under varied analyses, while 2% contradicted the originals. Replicability, the gold standard, succeeded in 49% of cases, higher than psychology's infamous 36% rate from 2015 but showing that the problem extends across fields.

  • Economics: Lowest replicability at around 40%, echoing a 2016 survey where 40% failed.
  • Education: Stronger performance despite poor data sharing.
  • Political Science: Higher reproducibility, possibly due to public data norms.

Effect sizes dropped from 0.25 to 0.10 on average, suggesting overstated impacts in originals. Data sharing emerged as the top predictor of success, a boon for open science advocates in US academia.
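To make the shrinkage concrete, here is a rough sketch of what a drop from 0.25 to 0.10 would mean for study design. It assumes the effect sizes are standardized mean differences (Cohen's d) compared between two equal groups, which the article does not state, and it uses the standard normal-approximation formula for sample size. The function name `n_per_group` and the 5% alpha and 80% power defaults are illustrative choices, not part of SCORE.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided,
    two-group comparison (normal approximation, equal group sizes)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Average effect sizes quoted in the article (assumed here to be Cohen's d).
original_effect = 0.25
replication_effect = 0.10

# Proportion of the original effect that disappeared on replication.
shrinkage = 1 - replication_effect / original_effect

n_original = n_per_group(original_effect)
n_replication = n_per_group(replication_effect)

print(f"Shrinkage: {shrinkage:.0%}")
print(f"n per group to detect d={original_effect}: {n_original}")
print(f"n per group to detect d={replication_effect}: {n_replication}")
```

Because required sample size scales with the inverse square of the effect, a study sized for the original estimate would need about 6.25 times as many participants to detect the smaller one under these assumptions.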

Roots of the Reproducibility Crisis: A Timeline in US Academia

The crisis erupted in 2011 with psychology fraud cases, but gained traction via the Open Science Collaboration's 2015 replication of 100 studies, succeeding in just 36%. US universities like Harvard and Stanford saw priming effects (e.g., words making people walk slower) vanish. Economics followed in 2016, with many lab experiments failing. By 2026, SCORE confirms it's systemic, not isolated.

Publication bias favors novel p-values under 0.05, small samples amplify noise, and p-hacking (tweaking analyses post-hoc) inflates false positives. In US colleges, 'publish or perish' pressures tenure-track faculty, prioritizing quantity over rigor.
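The interaction between small samples and a significance filter can be illustrated with a small simulation. This is a minimal sketch under invented assumptions: a true standardized effect of 0.10, 25 participants per group, a normal-approximation test, and a rule that only significant results in the predicted direction get "published". None of these numbers come from SCORE.

```python
import random
from statistics import NormalDist, fmean, stdev


def simulate_published_effects(true_effect=0.10, n=25, studies=5000,
                               alpha=0.05, seed=1):
    """Simulate many small two-group studies and compare the average
    estimate across all studies with the average among 'published' ones."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    all_estimates, published = [], []
    for _ in range(studies):
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        pooled_sd = ((stdev(treated) ** 2 + stdev(control) ** 2) / 2) ** 0.5
        d_hat = (fmean(treated) - fmean(control)) / pooled_sd
        # Normal approximation to the two-sample test statistic.
        z_stat = d_hat * (n / 2) ** 0.5
        all_estimates.append(d_hat)
        # Hypothetical publication rule: significant and in the predicted direction.
        if z_stat > z_crit:
            published.append(d_hat)
    return fmean(all_estimates), fmean(published), len(published) / studies


all_mean, published_mean, share_published = simulate_published_effects()
print(f"Mean estimate, all studies:       {all_mean:.2f}")
print(f"Mean estimate, published studies: {published_mean:.2f}")
print(f"Share of studies published:       {share_published:.1%}")
```

In this setup a study can only clear the significance bar if its estimate is several times larger than the true effect, so the published average is inflated even though no individual researcher did anything wrong.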

[Figure: Timeline of key reproducibility studies in social sciences from US-led projects]

Why Replications Fail: Incentives, Methods, and Human Factors

Small sample sizes (often n=20-50) leave studies underpowered, so that as many as 80% of significant findings may be false positives. Flexibility in analysis allows fishing for significance. Researcher degrees of freedom, such as choosing which outliers to exclude or which subgroups to report, exacerbate the problem. Cultural factors matter too: journals that publish only positive results reward flashy findings over nulls. In economics, field experiments vary by context, which erodes universality.
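Two textbook formulas help show why low power and analytic flexibility matter. The sketch below computes the positive predictive value of a significant result (the chance it reflects a real effect) and the chance of at least one false positive across several independent analyses. The prior, power, and number of analyses are hypothetical inputs chosen for illustration, not estimates from the article.

```python
def positive_predictive_value(prior, power, alpha=0.05):
    """Probability that a significant result reflects a true effect.

    prior: assumed share of tested hypotheses that are actually true
    power: probability of detecting a true effect
    alpha: significance threshold (false positive rate under the null)
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)


def familywise_error(k, alpha=0.05):
    """Chance of at least one false positive across k independent tests
    when no real effect exists."""
    return 1 - (1 - alpha) ** k


# Hypothetical scenario: 1 in 10 tested hypotheses is true.
low_power_ppv = positive_predictive_value(prior=0.10, power=0.20)
high_power_ppv = positive_predictive_value(prior=0.10, power=0.80)
ten_tests = familywise_error(k=10)

print(f"PPV at 20% power: {low_power_ppv:.2f}")
print(f"PPV at 80% power: {high_power_ppv:.2f}")
print(f"Chance of a false positive in 10 tries: {ten_tests:.2f}")
```

The share of false positives among significant findings depends heavily on the assumed prior and power, so any single headline percentage should be read as scenario-specific.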

Tim Errington of UVA's COS notes, 'Some results are not replicable because of honest mistakes,' urging single studies as 'pieces of a puzzle.' Stanford's John Ioannidis calls results 'not surprising,' praising community response.

Infamous Case Studies: Power Posing and Ego Depletion in US Labs

Carney's 2010 power posing study (two minutes in an expansive pose reportedly raising testosterone by 20% and increasing risk-taking) failed to replicate in 2015; the effect vanished. Baumeister's ego depletion (willpower as a finite resource) crumbled in a 2016 multi-lab replication across 23 labs. In economics, Chetty's mobility studies partially held, but many lab games did not. These US-originated icons shaped textbooks and policies, which are now being revised at colleges like UCLA and Michigan.

Shaking US Higher Ed Foundations: Funding, Tenure, and Public Trust

NSF and NIH fund more than $10B a year in social and behavioral research; irreproducible work wastes millions and erodes confidence in grants. Tenure dossiers that lean heavily on publication counts ignore replication, and departments are rethinking their metrics. Student curricula still teach debunked findings, which erodes trust: Gallup polls show 36% of Americans are confident in higher ed science.

Policy fallout: failed interventions cost billions (e.g., priming in education). US colleges face lawsuits and donor skepticism amid replication scandals.

Center for Open Science tracks these shifts.

US Universities Lead the Charge: Open Science Training Programs

UVA's COS offers workshops; Berkeley's BITSS trains researchers in transparency. Harvard's DBMI mandates data sharing. Stanford's Meta-Research Innovation Center (METRICS), led by Ioannidis, pushes reforms. Many adopt TOP Guidelines: Level 1 citation standards, Level 2 data/code sharing. Registered Reports (e.g., in APS journals) peer-review Stage 1 protocols before data are collected.

  • Pre-registration on OSF prevents p-hacking.
  • Open data/code via GitHub/Dryad.
  • Badges for practices boost CVs.

Training integrates into PhD curricula at Michigan, NYU.

Emerging Solutions: From Preregistration to AI Tools

Preregistration locks hypotheses pre-data. Multiverse analysis explores decision paths. Larger samples via crowdsourcing (MTurk/Prolific). Journals like Advances in Methods and Practices in Psychological Science prioritize replications. AI predicts replicability (SCORE tested, imperfect but promising). NSF requires DMPs; tenure committees value rigor.
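Multiverse analysis can be sketched in a few lines: enumerate every combination of defensible analysis choices and report the estimate from each one rather than a single preferred result. The example below is a toy illustration on synthetic data. The decision points (`CUTOFFS`, `ESTIMATORS`), the helper names, and the simulated groups are all hypothetical and do not come from any SCORE analysis.

```python
import itertools
import random
import statistics


def trim(values, cutoff):
    """Drop values more than `cutoff` standard deviations from the mean;
    keep everything when cutoff is None."""
    if cutoff is None:
        return list(values)
    mu = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mu) <= cutoff * sd]


# Hypothetical decision points an analyst might face.
CUTOFFS = [None, 2.5, 3.0]
ESTIMATORS = {"mean": statistics.fmean, "median": statistics.median}


def multiverse(treated, control):
    """Run every combination of outlier rule and estimator."""
    results = []
    for cutoff, (name, estimator) in itertools.product(CUTOFFS, ESTIMATORS.items()):
        t = trim(treated, cutoff)
        c = trim(control, cutoff)
        results.append({
            "outlier_cutoff": cutoff,
            "estimator": name,
            "difference": estimator(t) - estimator(c),
        })
    return results


# Synthetic data: a small true difference between two groups.
rng = random.Random(7)
treated = [rng.gauss(0.1, 1.0) for _ in range(200)]
control = [rng.gauss(0.0, 1.0) for _ in range(200)]

universe = multiverse(treated, control)
for spec in universe:
    print(spec)
```

Reporting the full spread of estimates shows readers how much a conclusion depends on choices that could reasonably have gone another way.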

[Figure: Infographic of open science practices adopted by US universities]

According to Abel Brodeur (Ottawa), recent papers (2022 onward) are about 85% reproducible.

Real-World Reforms at American Campuses: Success Stories

At UC Berkeley, BITSS's Summer Institute trains 100+ yearly. UVA integrates reproducibility into psych/stats courses. NYU's Crow Lab mandates open practices; replications count double for tenure. Community colleges like Foothill teach open science in research methods. Impacts: Funded replications rise 300% since 2015.

University | Initiative     | Outcome
Stanford   | METRICS Center | Meta-research grants
Harvard    | DBMI Open Data | 100% sharing policy
UVA        | COS Training   | Global workshops

Future Outlook: A More Credible Social Science in US Higher Ed

Optimism grows: Preprints (PsyArXiv) enable early scrutiny; funder mandates (NIH rigor initiatives) enforce change. By 2030, expect 70%+ replication via norms. For US colleges, this rebuilds trust, attracts funding, prepares students for evidence-based careers. As Errington says, 'The community is taking it seriously.' Embracing open science positions American universities as leaders.

Stakeholders, from deans to undergrads, must prioritize three things: training in statistics software (R/jamovi), requiring preregistration in theses, and valuing replications. The crisis, painful as it is, catalyzes a stronger foundation.

About the author

Prof. Marcus Blackwell, Academic Jobs In House Author

Frequently Asked Questions

🔬What is the social science reproducibility crisis?

The crisis refers to low replication rates in social sciences, where many published findings fail independent verification. The recent Nature SCORE project found 49% replicability.

📊What did the Nature study specifically find?

SCORE analyzed 3,900 papers; 49% replicated, 53% reproducible with data, 75% robust. Effect sizes halved. Read the full Nature article.

Why do social science results fail to replicate?

Causes include small samples, p-hacking, publication bias favoring positives, and flexible analyses. Not usually fraud, but systemic incentives.

🏫How does this affect US higher education?

Wastes NSF/NIH funds, questions tenure pubs, erodes teaching trust. Universities like UVA, Stanford respond with reforms.

💥What are famous replication failures?

Power posing (Carney), ego depletion (Baumeister)—US psych icons that vanished. Economics lab experiments also struggled.

🛠️What solutions are US colleges implementing?

Preregistration (OSF), open data/code, TOP guidelines, Registered Reports. Training at Berkeley BITSS, UVA COS.

📜How can faculty adapt for tenure?

Prioritize rigor: large N, share data, publish replications. Many depts now value open practices badges.

🎓Role of students in open science?

Undergrads/PhDs preregister theses, use R for analysis, contribute to replications. Courses at NYU, Michigan teach it.

📈Is improvement happening?

Yes—recent papers 85% reproducible per Brodeur. Norms shifting via funders, journals.

🔮Future of social science research in US?

AI prediction tools, mandatory sharing, replication journals. Positions unis as credibility leaders. COS resources.

🚀How to start open practices today?

Sign up at OSF.io, preregister on AsPredicted.org, share code on GitHub. Free UVA courses available.