llmware
/

llama-3.1-instruct-gguf

Model card Files Files and versions Community

llama-3.1-instruct-gguf

llama-3.1-instruct-gguf is a GGUF Q4_K_M int4 quantized version of Llama 3.1 Instruct, providing a very fast inference implementation, optimized for AI PCs using Intel GPU, CPU and NPU.

llama-3.1-instruct is a leading open source general foundation model from Meta.

Model Description

Developed by: meta-llama
Model type: llama-3.1
Parameters: 8 billion
Model Parent: meta-llama/Meta-Llama-3.1-8B-Instruct
Language(s) (NLP): English
License: Llama 3.1 Community License
Uses: General chat use cases
RAG Benchmark Accuracy Score: NA
Quantization: int4

Model Card Contact

llmware on github

llmware website

Downloads last month: 0

GGUF

Model size

8.03B params

Architecture

llama

Inference Providers NEW

This model is not currently available via any of the supported third-party Inference Providers, and HF Inference API has been turned off for this model.

Model tree for llmware/llama-3.1-instruct-gguf

Base model

meta-llama/Llama-3.1-8B

Finetuned

meta-llama/Llama-3.1-8B-Instruct

Quantized

(343)

this model