Microsoft MoE Architecture Slashes Inference Costs 70%
9 articles about 'Microsoft Research'
Microsoft Research unveils a sparse Mixture-of-Experts architecture that reduces AI inference costs by 70% while maintai…
Microsoft Research unveils a novel framework that reduces large language model hallucinations by up to 90%, potentially …
Microsoft Research proposes a new Sparse Mixture-of-Experts architecture that dramatically improves LLM scaling efficien…
Microsoft Research unveils a breakthrough system capable of translating speech across 120 languages simultaneously in re…
Microsoft Research releases Phi-5, a small language model that rivals GPT-4 performance while running on consumer hardwa…
Microsoft Research presents breakthroughs in large-scale distributed systems, datacenter networking, and AI infrastructu…
Microsoft Research unveils Phi-4, a 14-billion-parameter small language model that matches or exceeds GPT-4 on key bench…
Microsoft Research introduces BitNet b2, pushing extreme quantization to slash LLM memory and compute costs while preser…
Microsoft Research has released its fifth edition of the New Future of Work report, highlighting that generative AI is r…