The Toshin Survey: AI Masters UTokyo's 2026 Secondary Exam
In a landmark development for artificial intelligence in education, Japan's leading cram school operator Nagase, through its Toshin (東進) brand, conducted an immediate post-exam analysis of the University of Tokyo's (UTokyo) 2026 secondary entrance examination. Held on February 25 and 26, this rigorous test saw three state-of-the-art generative AI models achieve unprecedented scores, all surpassing 80% across liberal arts (文系) and science (理系) subjects. Most strikingly, they comfortably exceeded the passing threshold for the ultra-competitive Faculty of Science III (理科三類, or Risan III), UTokyo's premier medicine track.
Claude Opus 4.6 from Anthropic led with scores nearing 90%, followed by Google's Gemini 3.1 Pro and OpenAI's GPT 5.2. This breakthrough highlights AI's rapid evolution in handling complex descriptive problems typical of Japan's university entrance exams.
Decoding the UTokyo Secondary Exam and Risan III's Prestige
The University of Tokyo secondary exam is the decisive stage of Japan's highly selective admissions process. Following the national University Entrance Common Test (共通テスト), it probes deep understanding through essay-style questions in mathematics, the sciences, languages, and history. Risan III, one of UTokyo's science divisions, admits only about 110 students annually from thousands of applicants to its medicine-focused track. Competition is intense: admission typically requires roughly 60% of the combined score, and secondary-exam performance is pivotal.
This year's exam was notably challenging, with math and English labeled as historically difficult by prep schools and examinees, amplifying the AI achievement's significance.
State-of-the-Art AI Models Under the Spotlight
Nagase selected the latest models available: Claude Opus 4.6 (Anthropic, February 2026 release), Gemini 3.1 Pro (Google, February 2026), and GPT 5.2 (OpenAI, December 2025). These large language models (LLMs) excel at reasoning and generation, having been trained on vast datasets that include academic content.
The older DeepSeek model was excluded due to prior underperformance, keeping the focus on peak capabilities.
Overall Scores: A Clean Sweep Over 80%
All three AIs dominated both tracks, the liberal arts faculties (文科一類 to 三類) and the science faculties (理科一類 to 三類), with totals exceeding the roughly 60% benchmark for general admission as well as Risan III's steeper demands. Claude topped the field, demonstrating the nuanced reasoning essential for UTokyo's format.
- Claude Opus 4.6: ~89% aggregate
- Gemini 3.1 Pro: High 80s
- GPT 5.2: Solid 80s
This marks a leap from prior years, when AI hovered at 60-70%.
Perfection in Liberal Arts Mathematics
In a stunning display, every AI scored 100% on liberal arts mathematics, a subject demanding proofs and written explanations. Despite the exam's spike in difficulty, the AIs not only solved the problems but articulated step-by-step logic flawlessly, signaling matured 'descriptive power'.
This contrasts sharply with human examinees, many of whom struggled with the exam's 'historic' difficulty.
Breaking Barriers: Risan III Qualification Achieved
Risan III requires exceptional secondary scores on top of strong Common Test results (a first-stage cutoff, or 'foot cut', of 71.6% this year). The AIs cleared the bar 'with margin', implying secondary scores well above the typical minima of roughly 65-70% seen in past years (e.g., about 385/550 combined in 2025).
In science mathematics, challenges such as geometric figure problems were handled computationally, though visual elements remained a hurdle.
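The thresholds cited above can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the 550-point combined scale (Common Test scaled to 110 points plus a 440-point secondary exam) is an assumption about UTokyo's scoring, and the `percent` helper is hypothetical.

```python
# Illustrative sketch: convert the article's cited figures to percentages.
# Assumption: UTokyo's 550-point combined scale (Common Test scaled to 110
# plus a 440-point secondary exam); `percent` is a hypothetical helper.

def percent(score: float, maximum: float) -> float:
    """Return a score as a percentage of its maximum, rounded to one decimal."""
    return round(100 * score / maximum, 1)

# The article's 2025 Risan III example: ~385 of 550 combined.
print(percent(385, 550))  # 70.0, consistent with the ~65-70% minima cited
```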
Strengths, Weaknesses, and Speed Insights
Strengths shone in computation-heavy areas: history essays were completed in one to two minutes versus the hours humans need. Weaknesses emerged in figure reading (e.g., misreading diagrams despite strong math) and in interpreting historical source materials, where the models summarized without using era-specific terminology.
- Speed: Math <20 min; unmatched efficiency
- Reasoning: Full thought processes included
- Gaps: Visual/spatial, context-dependent history
Rigorous Testing Methodology
Toshin fed the models text and image versions of the problems without any exam-specific training, respecting answer-sheet limits (30 characters per line). Staff graded the output against same-day model answers, awarding partial credit, to ensure fairness; the models had no internet access after the exam, preventing leaks.
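The 30-characters-per-line answer-sheet constraint mentioned above can be sketched as a simple validity check. The helper below is hypothetical, for illustration only, and is not Toshin's actual grading tooling.

```python
# Hypothetical sketch of the answer-sheet constraint (30 characters per line);
# not Toshin's actual tooling.

MAX_CHARS_PER_LINE = 30

def fits_answer_sheet(answer: str, limit: int = MAX_CHARS_PER_LINE) -> bool:
    """Return True if every line of the answer respects the per-line limit."""
    return all(len(line) <= limit for line in answer.splitlines())

print(fits_answer_sheet("よって題意は示された。"))  # True: 11 characters
print(fits_answer_sheet("あ" * 31))                # False: 31 characters exceeds 30
```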
Leveraging more than 20 billion data records, Nagase applies this work to AI-driven drills that boost student scores (e.g., a reported +102-point edge on the Common Test).
Evolution from Past AI Attempts
In 2017, the Todai Robot project topped 80% in places but ultimately fell short; in 2025, OpenAI's o1 passed the Risan III minima. The 2026 leap to 80%+ across the board reflects LLM advances, achieved even as the exams grew tougher.
In separate evaluations, AI reached roughly 97% accuracy on the 2026 Common Test.
Implications for Japan's Higher Education Landscape
Japan's education ministry (MEXT) is pushing AI literacy from K-12 through university, with policies mandating training by 2026. UTokyo and peer institutions are integrating AI tools, and this survey highlights the need for adaptive admissions, perhaps emphasizing creativity over rote recall.
The results stoke fears of cheating, but opportunities abound in AI-enhanced learning. Faculty and researchers eyeing AI roles can check research jobs.
Stakeholder Views and Broader Reactions
Early buzz on X (Twitter) notes that the exam's difficulty amplifies the AI feat; experts praise the survey's methodological rigor but caution about gaps in visual reasoning and human insight. Prep schools like Toshin are pivoting toward 'AI-proof' drills.
UTokyo maintains its descriptive-question focus to test 'true ability'.
Future Trajectories: AI, Exams, and University Careers
Expect exams to evolve, with more visual and oral components. Universities are ramping up AI curricula, and Japan is targeting AI talent via scholarships and jobs. Aspiring professors and lecturers can find lecturer jobs and academic CV tips, explore Japan higher ed opportunities, or rate professors.
This milestone positions AI as a partner, not a replacement, in elite education. Stay ahead with higher ed jobs and university jobs.