Inference speed increase LLM LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published Apr 25 • 73
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published Apr 25 • 73