🏷️ model-quantization

2 articles about 'model-quantization'

Deploy Fine-Tuned LLMs on AWS Lambda Fast

2026-05-07 tutorial

A step-by-step guide to deploying fine-tuned large language models on AWS Lambda while minimizing cold-start latency.

2026-05-05 research

Apple's ML research team publishes new techniques enabling large language models to run efficiently on iPhones and iPads…