Model Card for Alif 1.0 8B Instruct

Alif 1.0 8B Instruct is an open-source model with advanced multilingual reasoning capabilities. It is trained on human-refined multilingual synthetic data paired with reasoning, which improves cultural nuance and reasoning in English and Urdu.

  • Developed by: large-traversaal
  • License: apache-2.0
  • Base model: unsloth/Meta-Llama-3.1-8B
  • Model: Alif-1.0-8B-Instruct
  • Model Size: 8 billion parameters

This model was trained 2x faster with Unsloth and Hugging Face's TRL library.

How to Use Alif 1.0 8B Instruct

Install the transformers and bitsandbytes libraries, then load Alif 1.0 8B Instruct as follows:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

model_id = "large-traversaal/Alif-1.0-8B-Instruct"

# 4-bit quantization configuration
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4"
)

# Load tokenizer and model in 4-bit
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto"
)

# Create the text-generation pipeline; the model was already placed on devices when it was loaded
chatbot = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Generate a reply; generated_text contains the prompt followed by the completion
def chat(message):
    response = chatbot(message, max_new_tokens=100, do_sample=True, temperature=0.3)
    return response[0]["generated_text"]

# Example chat
user_input = "شہر کراچی کی کیا اہمیت ہے؟"
bot_response = chat(user_input)

print(bot_response)

You can also try the model with TextStreamer or Gradio in Colab. It is also available in GGUF in various quantized formats for Ollama, LM Studio, Jan, and llama.cpp.
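When choosing between quantized formats, a back-of-the-envelope estimate of weight memory can help. The sketch below assumes the "8 billion parameters" figure from this card and a nominal bit width per weight. The helper name `weight_memory_gib` is hypothetical, and the estimate ignores the KV cache, activations, and the per-block scaling metadata that real quantization formats add, so actual usage will be somewhat higher.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB.

    Assumption: every parameter is stored at exactly `bits_per_weight` bits.
    Real formats carry extra metadata, so treat this as a lower bound.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


NUM_PARAMS = 8e9  # "8 billion parameters", as stated in this model card

if __name__ == "__main__":
    # Illustrative bit widths: 4-bit (as in the loading example), 8-bit, 16-bit
    for bits in (4, 8, 16):
        print(f"{bits:>2}-bit: ~{weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```

Under these assumptions the 4-bit weights come to roughly 3.7 GiB, against about 14.9 GiB at 16 bits, which is why the 4-bit configuration fits on much smaller GPUs.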

Model Details

Input: The model accepts text input only.

Output: The model generates text only.

Model Architecture: Alif 1.0 8B Instruct is an auto-regressive language model that uses an optimized transformer architecture. Post-training includes continued pretraining and supervised fine-tuning.

For more details about how the model was trained, see our blog post.

Evaluation

We evaluated Alif 1.0 8B Instruct against Gemma 2 9B, Llama 3.1 8B, Mistral Nemo 12B, Qwen 2.5 7B, and Cohere Aya Expanse 8B on a human-annotated Urdu evaluation dataset. Scores were assigned using gpt-4o as a judge.
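To illustrate the general shape of LLM-as-a-judge scoring, here is a minimal sketch. It is not the evaluation code used for this card: the prompt wording, the 1 to 10 scale, and the names `JUDGE_TEMPLATE`, `build_judge_prompt`, `parse_score`, and `average_score` are all assumptions made for illustration. The `judge` argument is any callable that sends a prompt to a judge model and returns its text reply.

```python
import re
from statistics import mean
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical rubric: the model card does not publish the judge prompt or scale.
JUDGE_TEMPLATE = (
    "You are grading an answer written in Urdu.\n"
    "Question: {question}\n"
    "Answer: {answer}\n"
    "Rate the answer from 1 (poor) to 10 (excellent). "
    "Reply in the form 'Score: <number>'."
)


def build_judge_prompt(question: str, answer: str) -> str:
    """Fill the rubric template with one question/answer pair."""
    return JUDGE_TEMPLATE.format(question=question, answer=answer)


def parse_score(reply: str) -> Optional[int]:
    """Extract an integer score in [1, 10] from the judge reply, or None."""
    match = re.search(r"Score:\s*(\d+)", reply)
    if match is None:
        return None
    score = int(match.group(1))
    return score if 1 <= score <= 10 else None


def average_score(
    pairs: Iterable[Tuple[str, str]],
    judge: Callable[[str], str],
) -> float:
    """Average the parseable judge scores over (question, answer) pairs."""
    scores = []
    for question, answer in pairs:
        score = parse_score(judge(build_judge_prompt(question, answer)))
        if score is not None:  # skip replies the parser cannot read
            scores.append(score)
    if not scores:
        raise ValueError("no parseable judge replies")
    return float(mean(scores))
```

In practice `judge` would wrap a call to the judge model's API. Keeping it as a plain callable makes the scoring logic easy to check with a stub before spending any API budget.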

Model Card Contact

For errors or additional questions about details in this model card, contact: contact@traversaal.ai
