AI Efficiency - AI News

Microsoft MoE Architecture Slashes Inference Costs 70%

2026-05-07 research

Microsoft Research unveils a sparse Mixture-of-Experts architecture that reduces AI inference costs by 70% while maintai…

2026-05-07 llm

DeepSeek R1's benchmark results challenge assumptions about the gap between open-source and proprietary AI models, spark…

2026-05-06 research

South Korea's KAIST develops a novel pruning method that cuts Transformer model size by up to 60% while preserving over …

2026-05-06 research

Microsoft Research proposes a new Sparse Mixture-of-Experts architecture that dramatically improves LLM scaling efficien…

2026-05-05 opinion

Specialized smaller AI models increasingly outperform massive general-purpose systems in cost, speed, and accuracy acros…

2026-05-05 research

A new study reveals Mixture-of-Experts models activate only a fraction of parameters during inference, slashing compute …