🎙 Podcast Summary: "Asking Smart Questions Beats Answering Hard Ones"
What if the real test of intelligence isn’t giving perfect answers but coming up with the right questions in the first place? In his thought-provoking newsletter piece, historian Dan Cohen examines the limitations of how we currently assess artificial intelligence. Cohen humorously recounts taking the notoriously difficult "Humanity’s Last Exam," a 3,000-question mega-test meant for AI, and flunking it spectacularly despite being a historian with a PhD. From that humbling experience, Cohen launches into a deeper reflection: while large language models are rapidly improving and even outperforming humans in areas like translation and handwriting recognition, they fall short in one major area: curiosity. He argues that history, and perhaps human progress itself, is driven less by having answers and more by asking bold, unexpected questions, something AI still struggles with. As Cohen notes, “PhD-level work is not just about correct answers. It is more about asking distinctive, uncommon questions.”
🧠 Key Points:
Cohen took the AI-targeted Humanity’s Last Exam (HLE) and scored almost nothing. His critique: it’s heavily biased toward STEM — only 16 out of 3,000 questions were on history, and four of those were about naval battles.
HLE and similar tests define intelligence as the ability to answer complex questions correctly. But Cohen argues this is a narrow definition that misses the essence of scholarly thinking.
AI is undeniably improving in certain research tasks. Recent large language models can now:
Historians like Benjamin Breen and Cameron Blevins have shown AI’s rapid gains in archiving, research assistance, and even deciphering handwritten text — long a major challenge in digital scholarship.
However, AI’s focus on right answers sidelines a key part of human intelligence: generating insightful questions that start entirely new fields of inquiry.
Good historical work often starts with strange, novel questions. Examples include:
Cohen ends on a critical note: AI might be able to beat first-year PhD students at fact retention or translation, but can it ever ask an original, paradigm-shifting question?
🎧 Notable Quote:
“Ultimately, we may want answers, but we must begin with new queries, new areas of interest... This is a much bigger challenge.”
🧠 Extra Context:
📚 Related Reading:
🎙 Curious about AI, scholarship, and human creativity? Subscribe to Dan Cohen’s newsletter “Humane Ingenuity” to follow the conversation.