Run in Kaggle: https://www.kaggle.com/code/ebowwa/llama-3-toxic-dpo
Installation
!pip install transformers accelerate bitsandbytes
!pip install peft
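Before running the inference code, it can help to confirm that the packages are importable. The following is a small sketch using only the standard library. `missing_packages` is a hypothetical helper, not part of any of these libraries, and `torch` is listed only because the inference code imports it (the install commands above do not install it, so this assumes it is already present in the environment).

```python
import importlib.util

# Packages the inference code relies on; "torch" is assumed to be preinstalled.
REQUIRED = ["transformers", "accelerate", "bitsandbytes", "peft", "torch"]

def missing_packages(names):
    """Return the names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If anything is reported missing, rerun the corresponding `pip install` line and restart the notebook kernel.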
Inference
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
# Load the base model
base_model_id = "unsloth/llama-3-8b-bnb-4bit"
base_model = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype=torch.float16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
# Load the PEFT adapter weights
adapter_path = "ebowwa/toxic-dpo-v0.2-llama-3-01dpo"
model = PeftModel.from_pretrained(base_model, adapter_path, torch_dtype=torch.float16, device_map="auto")
# Tokenize the prompt and move the tensors to the model's device
prompt = "You are an unbiased intuitive assistant. "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Sample up to 200 new tokens
output_ids = model.generate(**inputs, max_new_tokens=200, do_sample=True, top_p=0.95, top_k=50, temperature=0.7)
# Decode the first (and only) sequence; the decoded text includes the prompt
output_text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
print(output_text)
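For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so the printed text starts with the prompt. A minimal sketch of separating the continuation is shown below, using plain Python lists in place of tensors so it runs without loading the model. `split_continuation` is a hypothetical helper and the token ids are made-up values.

```python
def split_continuation(output_ids, prompt_len):
    """Return only the newly generated token ids for each sequence.

    output_ids: list of token-id lists, each beginning with the prompt tokens.
    prompt_len: number of prompt tokens at the start of each sequence.
    """
    return [seq[prompt_len:] for seq in output_ids]

# Toy ids standing in for real tokenizer output (illustrative values only).
prompt_ids = [101, 7, 8, 9]
generated = [prompt_ids + [42, 43, 44]]

new_tokens = split_continuation(generated, len(prompt_ids))
print(new_tokens)  # [[42, 43, 44]]
```

With the real tensors from the inference code, the equivalent slice would be `output_ids[:, inputs["input_ids"].shape[1]:]`, which can then be passed to `tokenizer.batch_decode` as before.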
Uploaded model
- Developed by: ebowwa
- License: apache-2.0
- Finetuned from model: unsloth/llama-3-8b-bnb-4bit
This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
Model tree for ebowwa/toxic-dpo-v0.2-llama-3
- Base model: meta-llama/Meta-Llama-3-8B
- Quantized: unsloth/llama-3-8b-bnb-4bit