Researchers built a benchmark for testing AI 'reasoning' models from questions in the NPR Sunday Puzzle challenge.
I come from an improv background. In improv, you know that a game (or the gimmick of a scene) is clear and fun when everyone ...