The Landmark Publication from Cambridge's Centre for the Study of Existential Risk
In a significant development for scientific integrity, the Centre for the Study of Existential Risk (CSER) at the University of Cambridge has contributed to the release of results from the SCORE project—Systematizing Confidence in Open Research and Evidence—in a dedicated Nature special collection. Announced on April 7, 2026, this collection addresses a pressing challenge in academia: how to systematically evaluate the credibility of research claims, particularly in the social and behavioural sciences.
The SCORE initiative, a multi-year, international collaboration involving over 865 researchers, evaluated nearly 4,000 claims from published studies. Its findings reveal both strengths and vulnerabilities in how scientific knowledge is constructed, emphasizing the need for multi-dimensional assessments of research reliability.
Background on CSER and Its Mission at Cambridge University
Established in 2012 as one of the world's first interdisciplinary centres dedicated to existential risks—threats that could lead to human extinction or irreversible civilizational collapse—CSER brings together experts from philosophy, computer science, policy, and more to study and mitigate dangers like artificial intelligence misalignment, pandemics, and climate extremes. Housed within the University of Cambridge, CSER exemplifies the UK's leadership in forward-thinking higher education research.
CSER's involvement in SCORE aligns naturally with its focus on high-stakes decision-making under uncertainty: because existential risks demand the highest confidence in evidence, tools that systematize research credibility are invaluable. Assistant Professor Alexandru Marcoci of CSER played a key role, co-authoring preprints on human predictions of replicability and leading related work in Nature Human Behaviour.
Unpacking the SCORE Project: Goals and Methodology
Funded initially by DARPA and led by the Center for Open Science (COS), SCORE sought to create scalable algorithms and indicators for assessing research trustworthiness. It targeted social and behavioural sciences, fields crucial for policy on risks like misinformation or societal resilience.
The project dissected credibility into three pillars:
- Reproducibility: Can the same analysis on the original data yield the same result? (Computational checks for code and data.)
- Robustness: Does the conclusion hold under varied but reasonable analyses on the same data? (Addressing 'researcher degrees of freedom.')
- Replicability: Does new data confirm the original finding? (Independent studies.)
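The distinction between the three pillars can be made concrete with a toy sketch. This is purely an illustration, not SCORE's actual pipeline: the data generator, sample sizes, and analyses below are hypothetical.

```python
# Toy illustration of the three credibility checks for a single claim:
# "group B scores higher than group A". Not SCORE's actual methodology.
import random
import statistics

random.seed(42)

def simulate_sample(n, effect):
    """Hypothetical data generator: group B's mean is shifted by `effect`."""
    a = [random.gauss(0.0, 1.0) for _ in range(n)]
    b = [random.gauss(effect, 1.0) for _ in range(n)]
    return a, b

def mean_difference(a, b):
    return statistics.mean(b) - statistics.mean(a)

def median_difference(a, b):
    return statistics.median(b) - statistics.median(a)

# "Original study": its data and its published analysis.
orig_a, orig_b = simulate_sample(n=200, effect=0.4)
original_result = mean_difference(orig_a, orig_b)

# 1. Reproducibility: same data, same analysis -> identical number.
reproduced = mean_difference(orig_a, orig_b)
assert reproduced == original_result

# 2. Robustness: same data, a reasonable alternative analysis.
robust = median_difference(orig_a, orig_b)

# 3. Replicability: freshly collected data, same analysis.
new_a, new_b = simulate_sample(n=200, effect=0.4)
replicated = mean_difference(new_a, new_b)

print(f"original:    {original_result:+.3f}")
print(f"robustness:  {robust:+.3f}")
print(f"replication: {replicated:+.3f}")
```

The point of the sketch is that the three checks ask different questions of a claim, which is why SCORE measured them separately and found them only weakly correlated.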
Over seven years, teams conducted large-scale tests, human elicitations, and machine learning validations, culminating in open datasets on OSF.
Key Findings from the Nature Special Collection
The Nature collection, titled 'Reliable research in the social and behavioural sciences', features four core articles plus editorials and news pieces.
Highlights include:
- Reproducibility: 53% precise matches in 145 papers; 75% approximate. Higher in recent papers and data-sharing journals.
- Robustness: Only 34% close matches in reanalyses; 74% same conclusion direction. Economics and political science showed stronger results.
- Replicability: 49% success rate in 164 papers; effect sizes halved on average.
- These dimensions are distinct, with low correlations, underscoring credibility's multi-faceted nature.
Over 85% of papers were computationally reproducible, but 25% had coding errors.
CSER's Specific Contributions and Alexandru Marcoci's Role
Marcoci, CSER's expert in collective decision-making and long-term risks, contributed to eliciting structured human forecasts for replicability across hundreds of claims. His team's preprints—'Large-scale human predictions...', 'Assessments of Credibility...', and 'repliCATS-SCORE'—are part of the collection's supporting materials.
Earlier, he co-authored a Nature Human Behaviour paper on predicting COVID-19 preprint replicability, where structured elicitations from 'beginners' and 'experts' achieved 57-69% accuracy—better than chance but highlighting prediction challenges.
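One generic way such individual forecasts can be pooled into a group judgment is to average them on the log-odds scale. This is a common pooling rule for probability judgments, shown here as a minimal sketch; it is not necessarily the exact aggregation used in the repliCATS or SCORE elicitations, and the forecaster values are invented.

```python
# Pool several forecasters' replication probabilities into one group
# estimate by averaging in log-odds space (a standard pooling rule;
# not necessarily the exact repliCATS-SCORE aggregation).
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def aggregate_forecasts(probabilities):
    """Average individual forecasts on the log-odds scale and map back."""
    # Clip to avoid infinite log-odds at exactly 0 or 1.
    clipped = [min(max(p, 0.01), 0.99) for p in probabilities]
    mean_logodds = sum(logit(p) for p in clipped) / len(clipped)
    return inv_logit(mean_logodds)

# Five hypothetical forecasters judging one claim's chance of replicating.
forecasts = [0.7, 0.6, 0.8, 0.55, 0.65]
group = aggregate_forecasts(forecasts)
prediction = group >= 0.5  # classify the claim as "will replicate"
print(f"aggregate P(replicate) = {group:.2f}, predict replicate: {prediction}")
```

Accuracy figures like the 57-69% above come from comparing such binary predictions against the outcomes of the actual replication studies.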
This work advances forecasting for high-uncertainty domains like existential risks.
Relevance to Existential Risk Research and Policy
For CSER, SCORE's tools are transformative. Existential risk studies rely on interdisciplinary evidence from social sciences—e.g., AI governance or pandemic preparedness—where low reproducibility could mislead policy. Systematized confidence scores enable better prioritization of robust findings.
In the UK, with UKRI funding existential risk hubs, such methodologies strengthen grant evaluations and evidence synthesis, ensuring taxpayer investments yield reliable insights.
Implications for UK Higher Education Landscape
UK universities, facing Research Excellence Framework (REF) pressures, can leverage SCORE's open tools for internal audits and training. Cambridge's leadership positions it as a hub for metascience, potentially influencing UKRI's openness policies.
Challenges persist: analytical flexibility and publication bias inflate effects, but solutions like Registered Reports—adopted by 300+ journals—offer paths forward.
Economics and political science's higher robustness suggests discipline-specific strategies could elevate others.
Global and UK Expert Perspectives
Nature's editorial urges self-reflection: 'Progress depends on questioning what we think we know.' COS's Brian Nosek notes no single credibility shortcut exists; multi-method approaches are essential.
UK academics praise the open data (OSF), enabling meta-analyses. Implications for AI ethics and climate policy research are profound, demanding robust evidence.
Emerging Tools and Resources from SCORE
SCORE's legacy includes open resources: data and code in the OSF SCORE registry, algorithms for confidence scoring, and protocols for structured elicitations. UK labs can integrate these into their workflows, fostering a culture of verification:
- Human forecasting protocols for peer review.
- Machine learning models trained on 3,900 claims.
- Guidelines distinguishing credibility dimensions.
Case Studies: Social Sciences Under the Microscope
In economics, conclusions largely held up across reanalyses, supporting the use of such evidence in policy debates like those over fiscal multipliers. Political science showed high reproducibility, vital for election studies. Yet the shrinkage of effect sizes in replications warns against overreliance on single studies.
COVID-19 preprints tested rapid science: 65% replicated, validating structured predictions even from novices.
Future Outlook: Transforming Research Practices
SCORE paves the way for 'credibility by design': pre-registered analyses, multiverse testing, and AI-assisted checks. For UK higher education, integrating these practices into PhD training and REF preparation could boost global standing.
Challenges remain—resource costs—but open tools democratize access. As CSER's Marcoci notes, in risk domains, 'better forecasts save lives.'
Explore CSER's work or COS's SCORE page for deeper dives.
