LLM efficiency - AI News

MIT Sparse Attention Cuts Transformer Memory 80%

2026-05-07 research

MIT researchers introduce a sparse attention mechanism that slashes Transformer memory usage by 80% while preserving mod…

2026-05-06 research

Japan-based Sakana AI develops evolutionary algorithms to merge existing LLMs, creating powerful new models without expe…

2026-05-05 research

Microsoft Research introduces BitNet b2, pushing extreme quantization to slash LLM memory and compute costs while preser…