Evaluation: AI Benchmarks Beyond ARC-AGI, MMMU, MLE-bench, and the FrontierMath Test

Posted January 15, 2025 by Stephen
Categories: agi, ai-benchmarks, arc-agi, artificial-intelligence, frontiermath-test, human-intelligence, human-mind, mle-bench