NYU Professor Uses AI-Powered Oral Exams to Combat AI-Generated Assignments
An NYU business school professor has turned to AI-powered oral exams to combat the growing use of artificial intelligence in student assignments. Panos Ipeirotis, a data science professor at NYU’s Stern School of Business, noticed that many student submissions, while polished and professional in tone, lacked true understanding. He described the work as sounding like a "McKinsey memo" while failing to reflect a real grasp of the material. When he asked students to explain their reasoning in class, many struggled to defend their own work. "If you cannot defend your own work live, then the written artifact is not measuring what you think it is measuring," Ipeirotis wrote in a recent blog post.

To address the issue, he revived oral exams, once common but difficult to scale, and used AI to make them practical for large classes. He called it "fighting fire with fire." The goal was to create assessments that test genuine understanding, decision-making, and real-time reasoning.

Ipeirotis and a colleague built an AI examiner using ElevenLabs’ conversational speech technology. Setting it up required only a prompt describing the questions and took minutes. The exam had two parts: first, the system questioned students about their capstone projects, probing their choices and logic; then it presented a class case study and asked students to reason through it on the spot.

Over nine days, the AI assessed 36 students, with each session lasting about 25 minutes. Total compute costs were just $15, far less than the hundreds of dollars a human TA would cost. Ipeirotis said the system was not only efficient but also more consistent and fair than human graders.

He also used AI to grade the exams. Three large language models, Claude, Gemini, and ChatGPT, evaluated each transcript independently. They then reviewed each other’s feedback, adjusted their scores, and reached a final decision, with Claude acting as moderator.
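The article does not publish the professor's actual grading code, but the council flow it describes (independent scoring, a cross-review round, then a moderator's decision) can be sketched in a few lines. The sketch below is purely illustrative: the real LLM API calls are replaced with stubbed scores, and names like `grade_transcript` and `council_grade` are assumptions, not Ipeirotis's implementation.

```python
from statistics import median

# Illustrative sketch of the three-model grading "council" described
# above. Real LLM API calls are replaced with fixed stub scores; the
# adjustment rule (nudging toward the group median) is an assumption.

def grade_transcript(model: str, transcript: str) -> float:
    """Round 1: each model grades the transcript independently (stubbed)."""
    stub_scores = {"claude": 8.5, "gemini": 7.0, "chatgpt": 9.0}
    return stub_scores[model]

def revise_score(own: float, others: list[float]) -> float:
    """Round 2: after seeing the others' scores, a model moves its own
    score halfway toward the overall median (a hypothetical rule)."""
    consensus = median([own] + others)
    return own + 0.5 * (consensus - own)

def council_grade(transcript: str) -> float:
    models = ["claude", "gemini", "chatgpt"]
    # Independent evaluation.
    scores = {m: grade_transcript(m, transcript) for m in models}
    # Cross-review and adjustment.
    revised = {
        m: revise_score(s, [v for k, v in scores.items() if k != m])
        for m, s in scores.items()
    }
    # The moderator settles on the median of the revised scores.
    return median(revised.values())

print(council_grade("sample transcript"))  # → 8.5
```

The point of the structure is that no single model's score is final: disagreement between graders is surfaced and reconciled before a decision is reached, which is the consistency property the article attributes to the council.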
Ipeirotis said the AI "council" produced more consistent, fair, and detailed feedback than any human could, and even revealed gaps in how the course material had been taught. Student reactions, however, were mixed: most preferred traditional written exams and found the AI oral exams more stressful, even while acknowledging that the format better measured real understanding. Still, Ipeirotis believes the experiment shows how learning should work: the more you practice, the better you get.

His approach reflects a broader shift in higher education. As AI becomes more prevalent, universities face a growing challenge in designing assessments that can’t be easily bypassed. A September study in the journal Assessment & Evaluation in Higher Education found that instructors are overwhelmed, confused, and divided on how to handle AI in exams. Some see AI as a tool to be mastered; others view it as cheating; many are unsure what to do.

In May, LinkedIn co-founder Reid Hoffman suggested that students may soon face AI examiners, especially since oral exams are harder to fake. He argued that such formats demand real understanding and are less vulnerable to AI shortcuts.
