AI reasoning - AI News

Gemini 2.5 Ultra Tops Math Benchmarks

2026-05-10 llm

Google DeepMind's Gemini 2.5 Ultra achieves record scores on major mathematical reasoning benchmarks, surpassing GPT-4o …

2026-05-10 llm

Anthropic's Claude 4 achieves state-of-the-art results on graduate-level reasoning benchmarks, surpassing GPT-4o and Gem…

2026-05-07 llm

Anthropic's Claude 4 sets new records on MATH and GPQA benchmarks, surpassing GPT-4o and Gemini Ultra in advanced reason…

2026-05-07 llm

Anthropic's Claude 4 Opus sets new state-of-the-art scores on GPQA and other graduate-level reasoning benchmarks, outpac…

2026-05-07 llm

Microsoft's Phi-4 small language model matches GPT-4 performance on key reasoning benchmarks while running on a fraction…

2026-05-06 llm

Google DeepMind launches Gemini 2.5 Flash, a cost-efficient reasoning model that challenges premium AI offerings with en…

2026-05-06 llm

OpenAI unveils GPT-5 Turbo featuring advanced reasoning, native multimodal capabilities, and significant API improvement…

2026-05-06 tutorial

Master advanced prompt engineering techniques including Chain-of-Thought and Tree-of-Thought to dramatically improve LLM…

2026-05-06 llm

OpenAI unveils GPT-5 Turbo, featuring built-in chain-of-thought reasoning, 1M token context, and up to 3x benchmark gain…

2026-05-06 llm

Anthropic releases Claude 4.5 Sonnet featuring breakthrough mathematical proof generation that outperforms GPT-4o and Ge…

2026-05-05 llm

Anthropic launches Claude 4 with Extended Thinking, enabling multi-step reasoning for complex scientific and mathematica…

2026-05-05 tutorial

Master advanced chain-of-thought reasoning techniques for Anthropic's Claude 4 to unlock superior AI outputs across comp…