
Cambridge CSER's SCORE Project Results in Nature: Systematizing Research Confidence

Breakthrough in Research Credibility from Cambridge's Existential Risk Centre




The Landmark Publication from Cambridge's Centre for the Study of Existential Risk

In a significant development for scientific integrity, the Centre for the Study of Existential Risk (CSER) at the University of Cambridge has contributed to the release of results from the SCORE project—Systematizing Confidence in Open Research and Evidence—in a dedicated Nature special collection. Announced on April 7, 2026, this collection addresses a pressing challenge in academia: how to systematically evaluate the credibility of research claims, particularly in the social and behavioural sciences.

The SCORE initiative, a multi-year, international collaboration involving over 865 researchers, evaluated nearly 4,000 claims from published studies. Its findings reveal both strengths and vulnerabilities in how scientific knowledge is constructed, emphasizing the need for multi-dimensional assessments of research reliability.

Background on CSER and Its Mission at Cambridge University

Established in 2012 as one of the world's first interdisciplinary centres dedicated to existential risks—threats that could lead to human extinction or irreversible civilizational collapse—CSER brings together experts from philosophy, computer science, policy, and more to study and mitigate dangers like artificial intelligence misalignment, pandemics, and climate extremes. Housed within the University of Cambridge, CSER exemplifies the UK's leadership in forward-thinking higher education research.

CSER's involvement in SCORE aligns closely with its focus on high-stakes decision-making under uncertainty. Because existential risks demand the highest confidence in evidence, tools that systematize research credibility are invaluable. Assistant Professor Alexandru Marcoci from CSER played a key role, co-authoring preprints on human predictions of replicability and leading related work in Nature Human Behaviour.


Unpacking the SCORE Project: Goals and Methodology

Funded initially by DARPA and led by the Center for Open Science (COS), SCORE sought to create scalable algorithms and indicators for assessing research trustworthiness. It targeted social and behavioural sciences, fields crucial for policy on risks like misinformation or societal resilience.

The project dissected credibility into three pillars:

  • Reproducibility: Can the same analysis on the original data yield the same result? (Computational checks for code and data.)
  • Robustness: Does the conclusion hold under varied but reasonable analyses on the same data? (Addressing 'researcher degrees of freedom.')
  • Replicability: Does new data confirm the original finding? (Independent studies.)
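The distinction between these three pillars can be illustrated with a minimal sketch. Everything here is invented for illustration: synthetic data, a deliberately simple "analysis" (a sample mean), and toy decision rules. None of it is drawn from SCORE's actual protocols.

```python
import random
import statistics

random.seed(0)

# Hypothetical study data: dataset, effect size, and decision rules
# below are illustrative assumptions, not SCORE materials.
original_data = [random.gauss(0.5, 1.0) for _ in range(200)]
new_data = [random.gauss(0.5, 1.0) for _ in range(200)]  # independent replication sample

def original_analysis(data):
    """Stand-in for the study's published analysis: a simple mean effect."""
    return statistics.mean(data)

published_effect = original_analysis(original_data)

# Reproducibility: same data, same analysis -> should match the published number.
is_reproducible = original_analysis(original_data) == published_effect

# Robustness: same data, varied but reasonable analyses (here: median, trimmed mean).
alternative_estimates = [
    statistics.median(original_data),
    statistics.mean(sorted(original_data)[10:-10]),
]
is_robust = all(est > 0 for est in alternative_estimates)  # same conclusion direction

# Replicability: new, independently collected data, original analysis.
is_replicated = original_analysis(new_data) > 0

print(is_reproducible, is_robust, is_replicated)
```

Note how each pillar asks a different question of the same published claim: rerun it, stress it, and re-collect for it.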

Over seven years, teams conducted large-scale tests, human elicitations, and machine learning validations, culminating in open datasets on OSF.

Key Findings from the Nature Special Collection

The Nature collection, titled 'Reliable research in the social and behavioural sciences', features four core articles plus editorials and news pieces.

Highlights include:

  • Reproducibility: 53% of 145 papers reproduced precisely, and 75% approximately. Rates were higher in recent papers and in journals with data-sharing policies.
  • Robustness: only 34% of reanalyses closely matched the original result, though 74% reached the same conclusion direction. Economics and political science showed stronger results.
  • Replicability: 49% of 164 replication attempts succeeded; effect sizes halved on average.
  • These dimensions are distinct, with low correlations between them, underscoring credibility's multi-faceted nature.

Over 85% of papers were computationally reproducible, but 25% had coding errors.

CSER's Specific Contributions and Alexandru Marcoci's Role

Marcoci, CSER's expert in collective decision-making and long-term risks, contributed to eliciting structured human forecasts for replicability across hundreds of claims. His team's preprints—'Large-scale human predictions...', 'Assessments of Credibility...', and 'repliCATS-SCORE'—are part of the collection's supporting materials.

Earlier, he co-authored a Nature Human Behaviour paper on predicting COVID-19 preprint replicability, where structured elicitations from 'beginners' and 'experts' achieved 57-69% accuracy—better than chance but highlighting prediction challenges.

This work advances forecasting for high-uncertainty domains like existential risks.
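Structured elicitation of this kind can be sketched as follows: collect individual probability judgments per claim, aggregate them into a group forecast, then score the forecasts against replication outcomes. The numbers, mean aggregation, and Brier scoring below are illustrative assumptions, not the repliCATS protocol or SCORE data.

```python
import statistics

# Hypothetical forecasts: each claim receives probability-of-replication
# judgements from several assessors (illustrative values only).
forecasts = {
    "claim_A": [0.7, 0.8, 0.6],
    "claim_B": [0.3, 0.2, 0.4],
    "claim_C": [0.6, 0.5, 0.7],
}
outcomes = {"claim_A": 1, "claim_B": 0, "claim_C": 1}  # 1 = replicated

def aggregate(probs):
    """Simple group judgement: the mean of individual forecasts."""
    return statistics.mean(probs)

def brier(p, outcome):
    """Brier score: squared error of a probability forecast (lower is better)."""
    return (p - outcome) ** 2

group = {claim: aggregate(ps) for claim, ps in forecasts.items()}
mean_brier = statistics.mean(brier(group[c], outcomes[c]) for c in outcomes)

# Hit rate when the group forecast is read as a yes/no prediction at 0.5.
hits = sum((group[c] >= 0.5) == bool(outcomes[c]) for c in outcomes)
accuracy = hits / len(outcomes)

print(round(mean_brier, 3), accuracy)
```

Averaging is only the simplest aggregation choice; structured protocols typically add discussion and revision rounds before the final judgment.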


Relevance to Existential Risk Research and Policy

For CSER, SCORE's tools are transformative. Existential risk studies rely on interdisciplinary evidence from social sciences—e.g., AI governance or pandemic preparedness—where low reproducibility could mislead policy. Systematized confidence scores enable better prioritization of robust findings.

In the UK, with UKRI funding existential risk hubs, such methodologies strengthen grant evaluations and evidence synthesis, ensuring taxpayer investments yield reliable insights.

Implications for UK Higher Education Landscape

UK universities, facing Research Excellence Framework (REF) pressures, can leverage SCORE's open tools for internal audits and training. Cambridge's leadership positions it as a hub for metascience, potentially influencing UKRI's openness policies.

Challenges persist: analytical flexibility and publication bias inflate effects, but solutions like Registered Reports—adopted by 300+ journals—offer paths forward.

The higher robustness observed in economics and political science suggests that discipline-specific strategies could lift other fields.


Global and UK Expert Perspectives

Nature's editorial urges self-reflection: 'Progress depends on questioning what we think we know.' COS's Brian Nosek notes no single credibility shortcut exists; multi-method approaches are essential.

UK academics praise the open data (OSF), enabling meta-analyses. Implications for AI ethics and climate policy research are profound, demanding robust evidence.

Emerging Tools and Resources from SCORE

SCORE's legacy includes open repositories: data/code at OSF SCORE registry, algorithms for confidence scoring, and protocols for elicitations. UK labs can integrate these into workflows, fostering a culture of verification.

  • Human forecasting protocols for peer review.
  • Machine learning models trained on 3,900 claims.
  • Guidelines distinguishing credibility dimensions.
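As a toy illustration of what a confidence-scoring algorithm might look like, one could combine the three dimension outcomes into a single score. The weighting scheme below is an arbitrary assumption for illustration; it is not SCORE's published algorithm.

```python
# Toy confidence score combining the three credibility dimensions.
# The weights are an arbitrary illustrative assumption, NOT SCORE's method.
def credibility_score(reproducible: bool, robust: bool, replicated: bool,
                      weights=(0.2, 0.3, 0.5)) -> float:
    """Weighted sum of binary dimension outcomes, yielding a score in [0, 1]."""
    signals = (reproducible, robust, replicated)
    return sum(w * s for w, s in zip(weights, signals))

claims = {
    "claim_X": (True, True, True),    # passed every check
    "claim_Y": (True, False, False),  # reproducible only
}
scores = {name: credibility_score(*dims) for name, dims in claims.items()}
print(scores)
```

Because the dimensions correlate only weakly in SCORE's findings, any single-number summary like this necessarily discards information; the project's own guidance is to report the dimensions separately.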

Case Studies: Social Sciences Under the Microscope

In economics, robustness held across reanalyses, aiding policy like fiscal multipliers. Political science showed high reproducibility, vital for election studies. Yet, effect size shrinkage in replications warns against overreliance on single studies.

COVID-19 preprints tested rapid science: 65% replicated, validating structured predictions even from novices.


Future Outlook: Transforming Research Practices

SCORE paves the way for 'credibility by design': pre-registered analyses, multiverse testing, and AI-assisted checks. For UK higher education, integrating these practices into PhD training and REF preparation could boost global rankings.
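Multiverse testing can be sketched as: run the same research question under every defensible combination of analysis choices and check whether the conclusion survives across all of them. The dataset and choices below are hypothetical.

```python
import itertools
import random
import statistics

random.seed(1)

# Hypothetical dataset of (effect_score, age, flagged_outlier) rows;
# values and analysis choices are invented for illustration.
data = [(random.gauss(0.5, 1.0), random.randint(18, 70), random.random() < 0.05)
        for _ in range(300)]

# Defensible analysis choices a researcher might make.
outlier_rules = {"keep_all": lambda row: True,
                 "drop_flagged": lambda row: not row[2]}
age_filters = {"all_ages": lambda row: True,
               "adults_under_60": lambda row: row[1] < 60}
estimators = {"mean": statistics.mean, "median": statistics.median}

# Multiverse: run the same question under every combination of choices.
results = {}
for (o_name, keep), (a_name, in_range), (e_name, estimate) in itertools.product(
        outlier_rules.items(), age_filters.items(), estimators.items()):
    subset = [row[0] for row in data if keep(row) and in_range(row)]
    results[(o_name, a_name, e_name)] = estimate(subset)

effects = list(results.values())
# The finding is robust if the conclusion direction survives every universe.
same_direction = all(e > 0 for e in effects)
print(len(effects), same_direction, round(min(effects), 3), round(max(effects), 3))
```

Reporting the full range of estimates, rather than one preferred specification, directly addresses the 'researcher degrees of freedom' problem the robustness pillar targets.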

Challenges remain, notably resource costs, but open tools democratize access. As CSER's Marcoci notes, in risk domains, 'better forecasts save lives.'

Explore CSER's work or COS's SCORE page for deeper dives.


Frequently Asked Questions

🔬 What is the SCORE project?

SCORE (Systematizing Confidence in Open Research and Evidence) is a global collaboration evaluating research credibility via reproducibility, robustness, and replicability in the social and behavioural sciences.

🏛️ How is CSER Cambridge involved?

CSER's Alexandru Marcoci contributed to human replicability predictions, co-authoring preprints and a Nature Human Behaviour paper on COVID preprints.

📊 What are the main SCORE findings?

53% precise reproducibility, 49% replicability, and 34% robust reanalyses; the dimensions are distinct, with higher rates in economics and political science.

📚 What is the Nature special collection?

'Reliable research in the social and behavioural sciences': four articles plus news pieces and editorials, published April 2026.

⚠️ Why does this matter for existential risks?

High-confidence evidence is needed for policy on AI and pandemics; SCORE's tools aid prioritization.

🔄 Differences: reproducibility vs replicability?

Reproducibility: same data and same analysis. Replicability: new data from an independent study.

🎓 Implications for UK universities?

Enhances REF preparation and training; supports UKRI's push for openness.

💻 How can I access SCORE resources?

Via the OSF registry for data and code.

🧠 How accurate were human predictions in SCORE?

57-69% on COVID preprints; structured elicitations proved effective even for novices.

🚀 What is the future of research credibility post-SCORE?

Registered Reports, AI-assisted checks, and multiverse analysis; open tools for all.

💼 Robustness in economics and political science?

Higher rates, independent of data sharing or author involvement.