τ³-Bench tests AI agents on document navigation and voice calls
New open benchmark extends agent evaluation to knowledge-intensive document retrieval and full-duplex voice; GPT-5.2 reaches about 25% on complex policy navigation tasks.
Intuit releases comprehensive 2026 AI Impact Report mapping AI adoption patterns across 34,000+ small-to-midsize businesses, offering benchmark data for where SMBs stand on AI implementation.