DSpark: Speculative decoding accelerates LLM inference [pdf]
337 comments