Accelerating Gemma 4: faster inference with multi-token prediction drafters
