DeepMind says its Aletheia AI solved six of ten FirstProof math problems
Mar 9th 2026
Google DeepMind released a paper and code claiming that its Aletheia agent independently solved six of ten hard problems in the FirstProof math challenge using a generator-verifier approach and heavy compute. The company has provided logs for community review, and independent confirmation is still under way.
- DeepMind published a paper and GitHub repository claiming its research agent Aletheia independently solved 6 of 10 FirstProof challenge problems and submitted them within the contest deadline.
- The team credits a generator-verifier architecture, combined with a self-filtering mechanism, for reducing hallucinations and improving proof reliability.
- DeepMind reports that one breakthrough problem, P7, required about 16 times the compute of its previous Erdős-1051 run, and that several experts reviewed and affirmed at least some of the solutions.
- The company posted logs and code for the runs and frames the project as prioritizing correctness: when the agent is uncertain, it is meant to withhold an answer rather than produce one.
- Independent peer review and broader community verification are still in progress, so some claims remain to be independently confirmed.
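
The generator-verifier idea with self-filtering mentioned in the bullets can be illustrated with a minimal Python sketch. This is not DeepMind's code: the function names (`solve_or_abstain`, `toy_generate`, `toy_verify`), the confidence threshold, and the scoring scheme are all hypothetical, chosen only to show the general pattern of proposing candidates, scoring them, and abstaining when no candidate clears the bar.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Candidate:
    text: str     # a proposed proof (here just a string)
    score: float  # verifier confidence in [0, 1] (hypothetical scale)


def solve_or_abstain(
    problem: str,
    generate: Callable[[str], Iterable[str]],
    verify: Callable[[str, str], float],
    threshold: float = 0.9,  # assumed cutoff, not taken from the paper
) -> Optional[Candidate]:
    """Return the best-scoring candidate, or None if none clears the threshold."""
    best: Optional[Candidate] = None
    for text in generate(problem):
        score = verify(problem, text)
        if best is None or score > best.score:
            best = Candidate(text, score)
    # Self-filtering step: prefer no answer over a low-confidence one.
    if best is not None and best.score >= threshold:
        return best
    return None


# Toy stand-ins for the generator and verifier (purely illustrative).
def toy_generate(problem: str) -> list[str]:
    return [f"{problem}: attempt {i}" for i in range(3)]


def toy_verify(problem: str, text: str) -> float:
    return 0.95 if text.endswith("attempt 2") else 0.2


if __name__ == "__main__":
    print(solve_or_abstain("P7", toy_generate, toy_verify))
    print(solve_or_abstain("P7", toy_generate, toy_verify, threshold=0.99))
```

In this sketch, raising `threshold` trades coverage for reliability: fewer problems get an answer, but any answer returned has passed a stricter check.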