🏷️ transformer optimization

1 article about 'transformer optimization'

MIT Sparse Attention Cuts LLM Inference Costs by 60%

2026-05-07 research

MIT researchers unveil a new sparse attention mechanism that dramatically reduces LLM inference costs while preserving m…