## Usage

```python
from deepsparse import TextGeneration

prompt = "How to get in a good university?"
formatted_prompt = f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n"

model = TextGeneration(model_path="hf:nm-testing/TinyLlama-1.1B-Chat-v0.4-pruned60-quant")
print(model(formatted_prompt, max_new_tokens=200).generations[0].text)

"""
There are several factors to consider when choosing a university:

1. Location: The university should be located in a region with a high number of students. This will ensure that there are enough students to ensure that there are enough professors.
2. Tuition: The tuition of the university should be low. This will ensure that students have enough money to attend the university.
3. Academic: The university should have a good academic program. This will ensure that students have knowledge of the subject.
4. Faculty: The faculty of the university should be good. This will ensure that professors have knowledge of the subject.
5. Faculty: The faculty of the university should be good. This will ensure that professors have knowledge of the subject.
6. Faculty: The faculty of the university should be good. This will ensure that professors have knowledge of the subject.
"""
```
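The prompt is wrapped in ChatML-style tags before being passed to the pipeline. A small helper makes the formatting reusable for multi-turn prompts; `format_chatml` is an illustrative name, not part of the DeepSparse API:

```python
def format_chatml(prompt: str) -> str:
    # Wrap a user message in the ChatML-style tags this chat model expects,
    # leaving the assistant turn open for the model to complete.
    return f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n"

formatted_prompt = format_chatml("How to get in a good university?")
```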

## With Repetition Penalty

The first example repeats itself; setting `repetition_penalty` in the generation config discourages this:

```python
from deepsparse import TextGeneration

generation_config = {
    "repetition_penalty": 1.1,
    "do_sample": True,
    "max_new_tokens": 500,
}

prompt = "How to get in a good university?"
formatted_prompt = f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n"

model = TextGeneration(model="hf:nm-testing/TinyLlama-1.1B-Chat-v0.4-pruned60-quant")
print(model(formatted_prompt, generation_config=generation_config).generations[0].text)

"""
The university is one of the best options for students.
It provides the right atmosphere for studying.
The
"""
```
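A repetition penalty works by dampening the logits of tokens that have already been generated before sampling the next token. The sketch below shows the standard CTRL-style scheme as an illustration; it is not DeepSparse's internal implementation:

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.1):
    """CTRL-style repetition penalty: make previously generated tokens
    less likely by scaling their logits by `penalty` (> 1.0)."""
    out = list(logits)
    for tok in set(generated_ids):
        if out[tok] > 0:
            out[tok] /= penalty  # shrink positive logits toward zero
        else:
            out[tok] *= penalty  # push negative logits further down
    return out
```

With `penalty=1.0` the logits are unchanged; larger values suppress repeats more aggressively, at the cost of sometimes steering the model away from legitimately repeated words.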

## One-shot and Export

```shell
git clone https://github.com/neuralmagic/sparseml
pip install -e "sparseml[transformers]"
python sparseml/src/sparseml/transformers/sparsification/obcq/obcq.py TinyLlama/TinyLlama-1.1B-Chat-v0.4 open_platypus --recipe recipe.yaml --save True
python sparseml/src/sparseml/transformers/sparsification/obcq/export.py --task text-generation --model_path obcq_deployment
cp deployment/model.onnx deployment/model-orig.onnx
python onnx_kv_inject.py --input-file deployment/model-orig.onnx --output-file deployment/model.onnx
```