“Take-home exams are dead.”

A professor used ElevenLabs voice AI to conduct oral exams for 36 students at roughly 42 cents per student. Three LLMs (Claude, Gemini, ChatGPT) independently graded the transcripts, then revised their scores after seeing each other’s assessments. Agreement within one point jumped from 0% to 62%. The motivation: students use LLMs on take-home exams, and pen-and-paper tests can’t verify who contributed what to a group project. Oral exams force real-time reasoning but don’t scale. Until now.
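To make the "agreement within one point" figure concrete, here is a minimal sketch of how such a rate could be computed. It assumes a student counts as "agreed" when the highest and lowest of the three grader scores differ by at most one point; the summary does not give the exact definition, so treat that as an assumption. The function name and the scores are hypothetical, invented for illustration, and are not the professor's data.

```python
from typing import Sequence


def within_one_point_rate(
    scores: Sequence[Sequence[float]], tolerance: float = 1.0
) -> float:
    """Fraction of students whose grader scores all lie within `tolerance`.

    Assumed definition: max score minus min score <= tolerance.
    """
    if not scores:
        raise ValueError("need at least one student")
    agreeing = sum(1 for s in scores if max(s) - min(s) <= tolerance)
    return agreeing / len(scores)


# Hypothetical scores: one row per student, one column per grader.
first_pass = [(12, 15, 17), (10, 14, 11), (18, 15, 16), (9, 12, 14)]
revised = [(14, 15, 15), (11, 12, 11), (17, 15, 16), (11, 12, 13)]

print(within_one_point_rate(first_pass))  # 0.0
print(within_one_point_rate(revised))     # 0.5
```

With these made-up numbers, no student's first-pass scores fall within one point of each other, while half do after revision. That mirrors the direction of the reported jump, not its size.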