Blogs
Exploring Optimal Quantization Settings for Small Language Models
An exploration of how Olive applies different quantization strategies, such as GPTQ, mixed precision, and QuaRot, to optimize small language models for efficiency and accuracy.