AI Breakthrough: Latest Models Surpass UTokyo 2026 Secondary Exam Risan III Level

Toshin Reveals AI Dominance in Elite Japanese Admissions Test


The Toshin Survey: AI Masters UTokyo's 2026 Secondary Exam

In a landmark development for artificial intelligence in education, Japan's leading cram school operator Nagase, through its Toshin (東進) brand, ran an immediate post-exam analysis of the University of Tokyo's (UTokyo) 2026 secondary entrance examination, held on February 25 and 26. Three state-of-the-art generative AI models were given the papers, and all three scored above 80% in both the liberal arts (文系) and science (理系) tracks. Most strikingly, they comfortably exceeded the passing threshold for the ultra-competitive Natural Sciences III (理科三類, or Risan III), UTokyo's premier medicine track.

Claude Opus 4.6 from Anthropic led with scores nearing 90%, followed by Google's Gemini 3.1 Pro and OpenAI's GPT 5.2. This breakthrough highlights AI's rapid evolution in handling complex descriptive problems typical of Japan's university entrance exams.

Decoding UTokyo Secondary Exam and Risan III's Prestige

The University of Tokyo secondary exam is the decisive stage of Japan's highly selective admissions process. It follows the national University Entrance Common Test (共通テスト) and tests deep understanding through written-response questions in subjects such as mathematics, the sciences, languages, and history. Risan III, one of UTokyo's science divisions, admits only about 110 students annually from thousands and leads into medicine, so competition is intense. Passing thresholds hover around 60% of the combined score, and secondary-exam performance is pivotal.

This year's exam was notably challenging, with math and English labeled as historically difficult by prep schools and examinees, amplifying the AI achievement's significance.

State-of-the-Art AI Models Under the Spotlight

Nagase selected the most recent models available: Claude Opus 4.6 (Anthropic, February 2026 release), Gemini 3.1 Pro (Google, February 2026), and GPT 5.2 (OpenAI, December 2025). These large language models (LLMs) excel at reasoning and text generation and are trained on vast datasets that include academic content.

The older DeepSeek model was excluded because of its earlier underperformance, keeping the focus on the strongest current models.

Overall Scores: A Clean Sweep Over 80%

All three models cleared the bar in both tracks, the liberal arts divisions (文科一類 to 三類) and the science divisions (理科一類 to 三類), with totals above the roughly 60% benchmark for general admission and above Risan III's steeper requirement. Claude posted the top scores, showing the nuanced reasoning that UTokyo's format demands.

  • Claude Opus 4.6: ~89% aggregate
  • Gemini 3.1 Pro: High 80s
  • GPT 5.2: Solid 80s

This marks a leap from prior years, where AI hovered at 60-70%.

Perfection in Liberal Arts Mathematics

In a stunning display, every model scored 100% on liberal arts math, a subject that demands proofs and written explanations. Despite the spike in difficulty, the models not only solved the problems but also laid out the step-by-step logic flawlessly, a sign that their 'descriptive power' has matured.

Human examinees, by contrast, struggled with a paper described as 'historically' hard.

Breaking Barriers: Risan III Qualification Achieved

Risan III requires exceptional secondary scores on top of strong Common Test results (the first-stage cutoff, or 足切り, was 71.6% this year). The models cleared the bar 'with margin', which implies secondary scores well above the typical minima of roughly 65-70% seen in past years (for example, about 385/550 combined in 2025).
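To make the threshold arithmetic concrete, here is a minimal sketch that converts the 2025 figure quoted above (about 385 out of 550 combined points) into a percentage and compares a score against it. The helper names and the sample AI share are hypothetical, chosen only for illustration; the survey summary does not give an exact combined score for any model.

```python
# Figures quoted in the article; both are approximate.
PAST_RISAN3_MIN_POINTS = 385  # approximate 2025 Risan III minimum (combined)
COMBINED_MAX_POINTS = 550     # combined maximum used in the article's example


def as_share(points: float, maximum: float) -> float:
    """Return points as a fraction of the maximum."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return points / maximum


def clears(score_share: float, threshold_share: float) -> bool:
    """True when a score share meets or exceeds a threshold share."""
    return score_share >= threshold_share


past_min_share = as_share(PAST_RISAN3_MIN_POINTS, COMBINED_MAX_POINTS)
print(f"2025 minimum as a share of the maximum: {past_min_share:.1%}")

# Hypothetical value for illustration only; not a figure from the survey.
hypothetical_ai_share = 0.85
print("Clears the 2025 minimum:", clears(hypothetical_ai_share, past_min_share))
```

The 385/550 example works out to 70%, the upper end of the 65-70% range cited for past minima.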

Science-track math problems, such as those that require sketching a region in the plane, were handled computationally, though the visual elements remained a hurdle.

Strengths, Weaknesses, and Speed Insights

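The models' strengths showed in computation-heavy areas and in speed: history papers were answered in 1-2 minutes, against the hours a human examinee needs. Weaknesses emerged in reading figures (for example, failing on diagrams even when the underlying math was within reach) and in interpreting historical source material, where the models tended to summarize without using era-specific terms.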

  • Speed: Math <20 min; unmatched efficiency
  • Reasoning: Full thought processes included
  • Gaps: Visual/spatial, context-dependent history

Rigorous Testing Methodology

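Toshin gave the models the problems as text and images with no exam-specific training, and required answers to respect the answer sheet limits (30 characters per line). Staff graded the responses against model answers prepared the same day, awarding partial credit where appropriate, to keep the comparison fair. The models had no internet access, so solutions published after the exam could not leak into their answers.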
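The answer sheet constraint can be pictured with a short sketch. The 30-characters-per-line figure comes from the survey description; everything else here is an assumption made for illustration: that every character, punctuation included, occupies one cell, that whitespace is ignored, and that a question allows some fixed number of lines (the `max_lines` parameter is hypothetical). This is not Toshin's actual checking procedure.

```python
CHARS_PER_LINE = 30  # per-line limit mentioned in the survey description


def lines_needed(answer: str, chars_per_line: int = CHARS_PER_LINE) -> int:
    """Count the answer-sheet lines an answer occupies.

    Assumes each character fills one cell and whitespace is not written.
    """
    if chars_per_line <= 0:
        raise ValueError("chars_per_line must be positive")
    text = "".join(answer.split())  # drop spaces and line breaks
    return -(-len(text) // chars_per_line)  # ceiling division


def fits(answer: str, max_lines: int, chars_per_line: int = CHARS_PER_LINE) -> bool:
    """True when the answer fits within max_lines (a hypothetical per-question limit)."""
    return lines_needed(answer, chars_per_line) <= max_lines


sample = "あ" * 61  # 61 characters need a third line at 30 per line
print(lines_needed(sample))
print(fits(sample, max_lines=2))
```

A check like this would run before grading, so that an answer overflowing its allotted lines is flagged instead of being scored as written.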

Drawing on more than 20 billion data records, Nagase uses findings like these to build AI-driven drills that it credits with raising student scores (for example, a 102-point edge on the Common Test).

Evolution from Past AI Attempts

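Earlier attempts were less convincing. The Todai Robot project topped 80% in 2017 yet still fell short overall, and in 2025 OpenAI's o1 cleared the Risan III minimum. The 2026 jump to more than 80% across the board reflects advances in LLMs, achieved on a tougher exam.

In separate testing, AI reached roughly 97% accuracy on the 2026 Common Test.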

Implications for Japan's Higher Education Landscape

Japan's education ministry (MEXT) is pushing AI literacy from K-12 through university, with policies mandating training by 2026. UTokyo and other universities are integrating AI tools, and this survey highlights the need for adaptive admissions, perhaps with more emphasis on creativity than on rote learning.

The results raise fears about cheating, but they also open up opportunities for AI-enhanced learning.

Stakeholder Views and Broader Reactions

Early reaction on X/Twitter noted that the exam's difficulty makes the AI result all the more striking. Experts praised the methodological rigor but cautioned that gaps remain in visual reasoning and human insight. Prep schools such as Toshin are pivoting to 'AI-proof' drills.

UTokyo maintains descriptive focus to test 'true ability'.

Future Trajectories: AI, Exams, and University Careers

Expect exams to evolve, with more visual and oral components. Universities are ramping up AI curricula, and Japan is targeting AI talent through scholarships and jobs.

This milestone positions AI as a partner, not a replacement, in elite education.

Frequently Asked Questions

🏥What is UTokyo's Risan III?

Risan III (理科三類) is UTokyo's elite medicine/science track, admitting ~110 top scorers annually via Common Test + secondary exam.

🤖Which AI models were tested?

Claude Opus 4.6 and Gemini 3.1 Pro (both February 2026 releases) and GPT 5.2 (December 2025), the most recent models available at the time.

📊What scores did AIs achieve?

All >80% in liberal/science tracks; Claude ~89%; perfect liberal math.

How was Risan III passing determined?

AIs exceeded typical ~60-70% thresholds comfortably per Toshin scoring.

⚠️What were AI weaknesses?

Figure interpretation and historical materials; strong in computation.

🔬Exam methodology details?

Real-time input of problems/images; Toshin staff scored with partial credits.

📈How hard was 2026 UTokyo exam?

Math and English were rated historically tough; the AIs scored highly regardless.

📉Past AI performance comparison?

2025 o1 passed minima; 2026 leap to 80%+.

💡Implications for admissions?

Push for creative/visual tests; AI as learning tool.

🎓AI in Japan higher ed?

MEXT mandates AI literacy; universities are integrating AI tools.
