Google Launches MTP Drafter for Gemma 4, Boosting Speed 3x
3 articles about 'speculative decoding'
Google introduces Multi-Token Prediction drafters for its Gemma 4 AI models, achieving up to 3x faster inference without…
Google's new Gemma 4 open-weight models leverage speculative decoding to deliver up to 3x faster inference with no quali…
A new study proposes SpecTr-GBV, the first method to combine multi-draft strategies with block verification techniques, …