Optimised AWQ quants for high-throughput deployments of Gemma 2! Compatible with Transformers, TGI & vLLM 🤗
![](https://cdn-avatars.huggingface.co/v1/production/uploads/60f0608166e5701b80ed3f02/Mm_zQGapvXKw4VCCJ8Jbc.png)
Hugging Quants
AI & ML interests
Optimised quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗
Organization Card
Welcome to the home of exciting quantized models! We'd love to see increased adoption of powerful state-of-the-art open models, and quantization is a key component to make them work on more types of hardware.
Resources:
- Llama 3.1 Quantized Models: Optimised quants of Llama 3.1 for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗.
- Hugging Face Llama Recipes: A set of minimal recipes to get started with Llama 3.1.
Collections
3
llama.cpp-compatible quants for the Llama 3.2 3B and 1B Instruct models.
- hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF: Text Generation • 23.3k • 44
- hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF: Text Generation • 829 • 19
- hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF: Text Generation • 385k • 27
- hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF: Text Generation • 26k • 13
models
19
- hugging-quants/gemma-2-9b-it-AWQ-INT4: Text Generation • 1.22k • 6
- hugging-quants/Mixtral-8x7B-Instruct-v0.1-AWQ-INT4: Text Generation • 273
- hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF: Text Generation • 26k • 13
- hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF: Text Generation • 385k • 27
- hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF: Text Generation • 829 • 19
- hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF: Text Generation • 23.3k • 44
- hugging-quants/Meta-Llama-3.1-405B-BNB-NF4: Text Generation • 29 • 2
- hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4: Text Generation • 67 • 5
- hugging-quants/Meta-Llama-3.1-405B-BNB-NF4-BF16: Text Generation • 2.27k • 2
- hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4: Text Generation • 2.83k • 37
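The repository names listed here appear to encode the quantization format as a hyphen-separated token (`AWQ`, `GGUF`, `BNB`). As a rough illustration, the sketch below reads that token and suggests which runtimes to try. The names `describe_quant`, `QuantInfo` and `_FORMATS` are hypothetical helpers, not part of any library. The AWQ and GGUF runtimes are taken from the descriptions on this page (and extended to every AWQ repository, which is an assumption); the `BNB` entry assumes the usual bitsandbytes naming and is not stated on this page.

```python
from dataclasses import dataclass
from typing import Tuple

# (name token, format label, runtimes to try).
# AWQ and GGUF runtimes follow the page text; the BNB row is an assumption.
_FORMATS = [
    ("GGUF", "GGUF", ("llama.cpp",)),
    ("AWQ", "AWQ", ("Transformers", "TGI", "vLLM")),
    ("BNB", "bitsandbytes", ("Transformers",)),
]


@dataclass(frozen=True)
class QuantInfo:
    repo_id: str
    fmt: str
    runtimes: Tuple[str, ...]


def describe_quant(repo_id: str) -> QuantInfo:
    """Guess the quantization format of a repo from its name tokens."""
    # Drop the organization prefix, then split the model name on hyphens.
    name = repo_id.split("/")[-1]
    tokens = name.upper().split("-")
    for marker, fmt, runtimes in _FORMATS:
        if marker in tokens:
            return QuantInfo(repo_id, fmt, runtimes)
    raise ValueError(f"no known quantization token in {repo_id!r}")


if __name__ == "__main__":
    for repo in (
        "hugging-quants/gemma-2-9b-it-AWQ-INT4",
        "hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF",
        "hugging-quants/Meta-Llama-3.1-405B-BNB-NF4-BF16",
    ):
        info = describe_quant(repo)
        print(f"{info.repo_id}: {info.fmt} -> {', '.join(info.runtimes)}")
```

The helper only inspects the name, so it is a heuristic: the model card for each repository remains the place to confirm the actual format and supported runtimes.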
datasets
None public yet