Ruder AI agents improve complex reasoning in new study
Mar 1st 2026
Giving AI chat agents personality traits and letting them interrupt one another raised accuracy on benchmark reasoning tasks, with the biggest gains when multiple agents were initially wrong.
- Researchers gave LLM agents personality traits from the Big Five and made them generate responses sentence by sentence so they could interrupt or stay silent.
- The team compared three conversation modes: fixed speaking order, dynamic order, and dynamic order with interruption driven by an urgency score.
- On 1,000 MMLU questions, accuracy when one agent initially answered incorrectly was 68.7% with fixed order, 73.8% with dynamic order, and 79.2% with interruption enabled.
- When two agents initially answered incorrectly, accuracy rose from 37.2% with fixed order to 43.7% with dynamic order and 49.5% with interruption enabled.
- The authors say personality-driven, interruptible agents could improve group reasoning and plan to test the approach in collaborative and creative decision-making settings.
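The interruption mode described above can be sketched as a simple turn-taking loop: after every sentence a speaker utters, each listener reports an urgency score, and the highest scorer above a threshold seizes the floor. This is a minimal illustration only; the names (`Agent`, `converse`), the fixed threshold, and the precomputed urgency values are assumptions, since the study's actual scoring formula is not given here.

```python
from dataclasses import dataclass

# Hypothetical interruption threshold; the paper's actual value is not stated here.
THRESHOLD = 0.7

@dataclass
class Agent:
    name: str
    sentences: list   # queued sentences this agent wants to say
    urgencies: list   # toy urgency score after hearing each sentence

def converse(agents, max_turns=10):
    """Dynamic-order conversation with interruption: after each sentence,
    listeners bid with an urgency score; the top bidder above THRESHOLD
    takes the floor, otherwise the current speaker continues."""
    transcript = []
    speaker = agents[0]
    for _ in range(max_turns):
        if not speaker.sentences:
            # Current speaker is done; pass the floor to anyone still waiting.
            waiting = [a for a in agents if a.sentences]
            if not waiting:
                break
            speaker = waiting[0]
        transcript.append((speaker.name, speaker.sentences.pop(0)))
        # Each listener with something to say scores its urgency to interrupt.
        bids = []
        for a in agents:
            if a is speaker or not a.sentences:
                continue
            u = a.urgencies.pop(0) if a.urgencies else 0.0
            bids.append((u, a))
        if bids:
            u, challenger = max(bids, key=lambda b: b[0])
            if u >= THRESHOLD:
                speaker = challenger  # interruption: floor changes hands
    return transcript
```

For example, an agent holding a high-urgency correction would cut in after the first sentence, while a low-urgency agent would stay silent until the speaker finishes, matching the "interrupt or stay silent" behavior the researchers describe.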