SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens
Paper: arXiv:2403.18647
The 7B model from the paper "SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens".