Accelerating Gemma 4: faster inference with multi-token prediction drafters
327 comments