This model is an AWQ-quantized version of RuadaptQwen2.5-32B-Pro-Beta, a fine-tune of Qwen/Qwen2.5-32B.

The quantized weights were produced with AutoAWQ:

pip install autoawq==0.2.8

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
import torch

# Specify paths and hyperparameters for quantization
model_path = "/data/models/RuadaptQwen2.5-32B-Pro-Beta"
quant_path = "/data/models/RuadaptQwen2.5-32B-Pro-Beta-AWQ"
# 4-bit weights, group size 128, zero-point (asymmetric) quantization, GEMM kernels
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the tokenizer and model with AutoAWQ
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoAWQForCausalLM.from_pretrained(
    model_path, safetensors=True, torch_dtype=torch.bfloat16
)

# Run AWQ calibration on the local text dataset and quantize the weights
model.quantize(
    tokenizer,
    quant_config=quant_config,
    calib_data="/data/scripts/RuadaptQwen-Quantization-Dataset",
    text_column="text",
)

# Save the quantized model as sharded safetensors, with the tokenizer alongside
model.save_quantized(quant_path, safetensors=True, shard_size="5GB")
tokenizer.save_pretrained(quant_path)
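
Once saved, the checkpoint loads like any other AWQ model through transformers (which reads the AWQ quantization config automatically when autoawq is installed). A minimal inference sketch, assuming the local quant_path from above; the Russian prompt is illustrative:

from transformers import AutoModelForCausalLM, AutoTokenizer

quant_path = "/data/models/RuadaptQwen2.5-32B-Pro-Beta-AWQ"

tokenizer = AutoTokenizer.from_pretrained(quant_path)
model = AutoModelForCausalLM.from_pretrained(quant_path, device_map="auto")

# Qwen2.5-based models ship a chat template, so apply_chat_template works out of the box
messages = [{"role": "user", "content": "Привет! Расскажи о себе."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))

The same folder can also be served with vLLM, which supports AWQ checkpoints natively.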
