OSError with Llama3.2-3B-Instruct-QLORA_INT4_EO8
When trying to run Llama3.2-3B-Instruct-QLORA_INT4_EO8, I'm getting the error:
OSError: meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8 does not appear to have a file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt or flax_model.msgpack.
I've tried using the transformers pipeline as well as AutoModelForCausalLM to pull the model, but I get the same error in both cases (minimal repro below).
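For reference, this is roughly what I'm running; both variants raise the OSError above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8"

# Variant 1: pipeline
pipe = pipeline("text-generation", model=model_id)

# Variant 2: explicit model + tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # raises the OSError above
```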
The weights were uploaded in their "original" (Meta) format, and they need to be converted to the Hugging Face format before they can be used with the pipelines. I'm sure they will upload the reformatted version soon.
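You can confirm this by listing what the repo actually ships; a minimal check (the expected file names are my assumption based on Meta's usual "original" layout):

```python
from huggingface_hub import list_repo_files

# For an original-format upload you should see consolidated.*.pth / params.json /
# tokenizer.model, and no HF-format *.safetensors or pytorch_model.bin
# (the exact file names are an assumption, not confirmed for this repo).
print(list_repo_files("meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8"))
```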
Following this. (I thought I was missing something esoteric and GGUF-related at first :D)
Anything new on this, or examples of a library that can work with this?
Any solution?
There is an ExecuTorch run sample here: https://github.com/pytorch/executorch/blob/main/examples/models/llama/README.md.
But it only runs after the model is converted to a .pte binary file, which makes it hard to look into how the model actually works. (A sketch for peeking at the raw checkpoint directly is below.)
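If you just want to inspect the weights without going through ExecuTorch, something like this should work; I'm assuming the repo ships a consolidated.00.pth in Meta's original layout:

```python
import torch
from huggingface_hub import hf_hub_download

# Download one shard of the original-format checkpoint and inspect it.
# "consolidated.00.pth" is an assumption about the file name in this repo.
ckpt_path = hf_hub_download(
    "meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8",
    "consolidated.00.pth",
)
state_dict = torch.load(ckpt_path, map_location="cpu", weights_only=True)

# Print the first few tensors to see the parameter layout and dtypes.
for name, tensor in list(state_dict.items())[:10]:
    print(name, tuple(tensor.shape), tensor.dtype)
```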
I still don't know how to run the model on a GPU the way an HF-format model runs.
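The closest workaround I've found is loading the regular HF-format repo with on-the-fly 4-bit quantization via bitsandbytes; to be clear, these are not the official QLoRA/QAT weights, just an approximation that runs on GPU:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantize the standard HF checkpoint to 4-bit at load time (needs bitsandbytes).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",  # the non-quantized HF repo
    quantization_config=bnb_config,
    device_map="auto",
)
```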