JD.com Unveils xLLM Speculative Inference Architecture
JD.com reveals its xLLM speculative inference system at AICon Shanghai, promising order-of-magnitude speedups for LLM infer…