Alibaba Qwen3.5 LiveTranslate: 2.8s Latency Breakthrough
Alibaba's Qwen3.5-LiveTranslate-Flash slashes real-time translation latency to 2.8 seconds while preserving speaker voic…
22 articles about 'llm'
New δ-mem framework slashes GPU memory usage for LLMs by 90%, enabling efficient online inference on consumer hardware.
AWS launches Assisted NLU for Amazon Lex, leveraging LLMs to improve intent recognition and slot filling without manual …
DeepSeek's AI model accidentally outputs explicit content from China's V2EX forum, raising data privacy and training set…
Brix is hiring AI engineers to build autonomous recruiting agents. This role focuses on LLM reasoning and multi-agent wo…
A developer systematically refutes three hypotheses on semantic units using geometric algebra and factor attention on du…
Correcting an AI in chat does not instantly update its model. Learn how training data cycles and RAG systems impact long…
Xiaomi's MiMo Orbit initiative distributes nearly 80 trillion tokens in under two weeks, signaling aggressive expansion …
Developers now fine-tune powerful small language models on consumer hardware, reducing costs and boosting privacy for lo…
New study uses LLM judges and TrueSkill to rank 1,000 Show HN posts by merit.
DeepSeek releases R1 model, offering open-source reasoning capabilities that rival top proprietary models at a fraction …
Panasonic integrates large language models into its industrial IoT edge devices, enabling real-time AI inference on fact…