New Benchmark BTF-2: Evaluating Strategic Reasoning Capabilities of AI Forecasting Agents
A new arXiv paper introduces "Bench to the Future 2," a benchmark that systematically evaluates reasoning strategy diffe…