A Guide to LLM Inference (Part 4): Speculative Decoding & Batching

Stephen Carmody · June 2, 2024

Coming soon…