This model is an AWQ-quantized version of RefalMachine/RuadaptQwen2.5-32B-Pro-Beta. The script below reproduces the quantization with AutoAWQ:
pip install autoawq==0.2.8
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
import torch
# Specify paths and hyperparameters for quantization
model_path = "/data/models/RuadaptQwen2.5-32B-Pro-Beta"
quant_path = "/data/models/RuadaptQwen2.5-32B-Pro-Beta-AWQ"
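# 4-bit weights with zero-point quantization, group size 128, GEMM kernel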
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}
# Load the tokenizer and model with AutoAWQ
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoAWQForCausalLM.from_pretrained(
    model_path, safetensors=True, torch_dtype=torch.bfloat16
)

# Quantize against the Russian calibration dataset
model.quantize(
    tokenizer,
    quant_config=quant_config,
    calib_data="/data/scripts/RuadaptQwen-Quantization-Dataset",
    text_column="text",
)

# Save the quantized model and tokenizer
model.save_quantized(quant_path, safetensors=True, shard_size="5GB")
tokenizer.save_pretrained(quant_path)
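To load the quantized checkpoint for inference, here is a minimal sketch using AutoAWQ's from_quantized API (the path matches quant_path above; the prompt is only illustrative):

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

quant_path = "/data/models/RuadaptQwen2.5-32B-Pro-Beta-AWQ"

# Load the quantized weights; fuse_layers fuses attention/MLP modules for faster decoding
model = AutoAWQForCausalLM.from_quantized(quant_path, fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(quant_path)

# Illustrative prompt; the model is adapted for Russian
tokens = tokenizer("Привет! Расскажи о себе.", return_tensors="pt").input_ids.cuda()
output = model.generate(tokens, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))

AWQ checkpoints in this format can also be loaded through transformers' AutoModelForCausalLM or served with engines such as vLLM.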
Model tree for hiauiarau/RuadaptQwen2.5-32B-Pro-Beta-AWQ
- Base model: Qwen/Qwen2.5-32B
- Finetuned: RefalMachine/RuadaptQwen2.5-32B-Pro-Beta