Herculean benchmark measures AI agents on real financial professional work
arXiv paper introduces Herculean, the first benchmark evaluating agentic AI on financial intelligence tasks beyond static Q&A, testing whether agents can reliably perform actual financial professio...