## Supported?
Expect broken or faulty items for the time being. Use at your own discretion.
- ComfyUI-GGUF: all? (CPU/CUDA)
  - Fast dequant: BF16, Q8_0, Q5_1, Q5_0, Q4_1, Q4_0, Q6_K, Q5_K, Q4_K, Q3_K, Q2_K
  - Slow dequant: all other types, via GGUF/NumPy
- Forge: TBC
- stable-diffusion.cpp: see the llama.cpp feature matrix
  - CPU: all
  - CUDA: all?
  - Vulkan: >= Q3_K_S, > IQ4_S; IQ1_S and IQ1_M in PR, IQ4_XS in PR
  - other: ?
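The fast/slow split above comes down to whether a backend has a native kernel for a block format or falls back to generic decoding. As a rough illustration of what dequantizing a legacy quant involves, here is a minimal stdlib-only sketch of Q8_0 block decoding, assuming the standard GGML Q8_0 layout (34-byte blocks: one fp16 scale followed by 32 signed int8 quants):

```python
import struct

# GGML Q8_0: blocks of 32 weights, each stored as one float16 scale `d`
# followed by 32 signed int8 quants. Dequantized weight: w[i] = d * q[i].
BLOCK_SIZE = 32

def dequant_q8_0(block: bytes) -> list[float]:
    """Decode one 34-byte Q8_0 block into 32 float weights."""
    assert len(block) == 2 + BLOCK_SIZE
    (d,) = struct.unpack("<e", block[:2])              # fp16 scale
    qs = struct.unpack(f"<{BLOCK_SIZE}b", block[2:])   # int8 quants
    return [d * q for q in qs]

# Round-trip check: quantize 32 weights, then dequantize.
weights = [i / 10.0 for i in range(-16, 16)]
d = max(abs(w) for w in weights) / 127.0
qs = [round(w / d) for w in weights]
block = struct.pack("<e", d) + struct.pack(f"<{BLOCK_SIZE}b", *qs)
restored = dequant_q8_0(block)
```

The reconstruction error per weight is bounded by about half the scale `d`, plus a small contribution from rounding `d` itself to fp16.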
## Bravo
Combined imatrix from multiple 512x512 images at 25 and 50 steps, generated with city96/flux1-dev-Q8_0 and the euler sampler.
Quantized using llama.cpp quantize (commit cae9fb4) with a modified lcpp.patch.
Experimental; quantized from F16.
Filename | Quant type | File Size | Description | Example Image |
---|---|---|---|---|
flux1-dev-IQ1_S.gguf | IQ1_S | 2.45GB | bad quality | Example |
flux1-dev-IQ1_M.gguf | IQ1_M | 2.72GB | bad quality | Example |
flux1-dev-IQ2_XXS.gguf | IQ2_XXS | 3.19GB | bad quality | Example |
flux1-dev-IQ2_XS.gguf | IQ2_XS | 3.56GB | TBC | - |
flux1-dev-IQ2_S.gguf | IQ2_S | 3.56GB | TBC | - |
flux1-dev-IQ2_M.gguf | IQ2_M | 3.93GB | bad quality | Example |
flux1-dev-Q2_K_S.gguf | Q2_K_S | 4.02GB | TBC | Example |
flux1-dev-IQ3_XXS.gguf | IQ3_XXS | 4.66GB | - | Example |
flux1-dev-IQ3_XS.gguf | IQ3_XS | 5.22GB | worse than IQ3_XXS | Example |
flux1-dev-Q3_K_S.gguf | Q3_K_S | 5.22GB | TBC | Example |
flux1-dev-IQ4_XS.gguf | IQ4_XS | 6.42GB | TBC | - |
flux1-dev-Q4_0.gguf | Q4_0 | 6.79GB | TBC | - |
flux1-dev-IQ4_NL.gguf | IQ4_NL | 6.79GB | TBC | Example |
flux1-dev-Q4_K_S.gguf | Q4_K_S | 6.79GB | TBC | Example |
flux1-dev-Q4_1.gguf | Q4_1 | 7.53GB | TBC | - |
flux1-dev-Q5_K_S.gguf | Q5_K_S | 8.27GB | TBC | Example |
### Observations
- Bravo IQ1_S appears worse than Alpha's IQ1_S?
- Latent loss
- Per-layer quantization cost, from chrisgoringe/casting_cost
- Per-layer quantization cost 2, from Freepik/flux.1-lite-8B: double blocks and single blocks
- Ablation: latent loss per weight type
- Pareto front: loss vs. size
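A loss-vs-size Pareto front can be computed with a simple sort-and-scan over (size, loss) pairs. In this sketch the sizes come from the tables in this card, while the loss values are purely illustrative placeholders (no measured losses are published here):

```python
# Pareto front of (size_gb, loss): keep each quant unless some quant that
# is no larger also has no higher loss.
# Sizes are from the tables above; losses are ILLUSTRATIVE placeholders.
points = {
    "IQ1_S":   (2.45, 0.90),
    "IQ2_XXS": (3.19, 0.55),
    "IQ3_XXS": (4.66, 0.30),
    "IQ3_XS":  (5.22, 0.32),   # worse than IQ3_XXS despite being larger
    "Q4_K_S":  (6.79, 0.15),
    "Q5_K_S":  (8.27, 0.12),
}

def pareto_front(pts):
    """Return names on the Pareto front (minimizing both size and loss)."""
    front = []
    best_loss = float("inf")
    # Sort by size ascending (ties broken by loss); keep only strict
    # improvements in loss as size grows.
    for name, (size, loss) in sorted(pts.items(), key=lambda kv: kv[1]):
        if loss < best_loss:
            front.append(name)
            best_loss = loss
    return front

front = pareto_front(points)
```

With the placeholder numbers above, IQ3_XS drops off the front because IQ3_XXS is both smaller and (per the observation in this card) better.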
## Alpha
Simple imatrix: single 512x512 image, 8/20 steps, generated with city96/flux1-dev-Q3_K_S and the euler sampler.
Imatrix data: `load_imatrix: loaded 314 importance matrix entries from imatrix.dat computed on 7 chunks`
Quantized using llama.cpp quantize (commit cae9fb4) with a modified lcpp.patch.
Experimental; quantized from Q8_0.
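The `load_imatrix` line above refers to llama.cpp's binary imatrix file. As a rough sketch, the legacy (pre-GGUF) `imatrix.dat` layout is assumed here to be an int32 entry count followed by, per entry, a length-prefixed tensor name, an int32 call count, an int32 value count, and that many float32 accumulated values; the round trip below is self-consistent under that assumption:

```python
import io
import struct

# ASSUMED legacy imatrix.dat layout (llama.cpp examples/imatrix):
#   int32 n_entries
#   per entry: int32 name_len, name bytes, int32 ncall, int32 nval,
#              nval * float32 values
def write_imatrix(entries):
    buf = io.BytesIO()
    buf.write(struct.pack("<i", len(entries)))
    for name, ncall, values in entries:
        raw = name.encode()
        buf.write(struct.pack("<i", len(raw)) + raw)
        buf.write(struct.pack("<ii", ncall, len(values)))
        buf.write(struct.pack(f"<{len(values)}f", *values))
    return buf.getvalue()

def load_imatrix(data):
    buf = io.BytesIO(data)
    (n,) = struct.unpack("<i", buf.read(4))
    entries = []
    for _ in range(n):
        (name_len,) = struct.unpack("<i", buf.read(4))
        name = buf.read(name_len).decode()
        ncall, nval = struct.unpack("<ii", buf.read(8))
        values = list(struct.unpack(f"<{nval}f", buf.read(4 * nval)))
        entries.append((name, ncall, values))
    return entries

# Hypothetical entry; real files hold one entry per matmul weight tensor.
entries = [("blk.0.attn_k.weight", 7, [0.5, 1.25, 0.75, 2.0])]
assert load_imatrix(write_imatrix(entries)) == entries
```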
Filename | Quant type | File Size | Description | Example Image |
---|---|---|---|---|
flux1-dev-IQ1_S.gguf | IQ1_S | 2.45GB | obviously bad quality | Example |
- | IQ1_M | - | broken | - |
flux1-dev-TQ1_0.gguf | TQ1_0 | 2.63GB | TBC | - |
flux1-dev-TQ2_0.gguf | TQ2_0 | 3.19GB | TBC | - |
flux1-dev-IQ2_XXS.gguf | IQ2_XXS | 3.19GB | TBC | Example |
flux1-dev-IQ2_XS.gguf | IQ2_XS | 3.56GB | TBC | Example |
flux1-dev-IQ2_S.gguf | IQ2_S | 3.56GB | TBC | - |
flux1-dev-IQ2_M.gguf | IQ2_M | 3.93GB | TBC | - |
flux1-dev-Q2_K.gguf | Q2_K | 4.02GB | TBC | - |
flux1-dev-Q2_K_S.gguf | Q2_K_S | 4.02GB | TBC | Example |
flux1-dev-IQ3_XXS.gguf | IQ3_XXS | 4.66GB | TBC | Example |
flux1-dev-IQ3_XS.gguf | IQ3_XS | 5.22GB | TBC | - |
flux1-dev-IQ3_S.gguf | IQ3_S | 5.22GB | TBC | - |
flux1-dev-IQ3_M.gguf | IQ3_M | 5.22GB | TBC | - |
flux1-dev-Q3_K_S.gguf | Q3_K_S | 5.22GB | TBC | Example |
flux1-dev-Q3_K_M.gguf | Q3_K_M | 5.36GB | TBC | - |
flux1-dev-Q3_K_L.gguf | Q3_K_L | 5.36GB | TBC | - |
flux1-dev-IQ4_XS.gguf | IQ4_XS | 6.42GB | TBC | Example |
flux1-dev-IQ4_NL.gguf | IQ4_NL | 6.79GB | TBC | Example |
flux1-dev-Q4_0.gguf | Q4_0 | 6.79GB | TBC | - |
- | Q4_K | TBC | TBC | - |
flux1-dev-Q4_K_S.gguf | Q4_K_S | 6.79GB | TBC | Example |
flux1-dev-Q4_K_M.gguf | Q4_K_M | 6.93GB | TBC | - |
flux1-dev-Q4_1.gguf | Q4_1 | 7.53GB | TBC | - |
flux1-dev-Q5_K_S.gguf | Q5_K_S | 8.27GB | TBC | Example |
flux1-dev-Q5_K.gguf | Q5_K | 8.41GB | TBC | - |
- | Q5_K_M | TBC | TBC | - |
flux1-dev-Q6_K.gguf | Q6_K | 9.84GB | TBC | - |
- | Q8_0 | 12.7GB | TBC | Example |
- | F16 | 23.8GB | TBC | Example |
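File size scales roughly linearly with bits per weight: size ≈ n_params × bpw / 8, plus metadata and any tensors kept at higher precision. Backing out the parameter count from the F16 row above (23.8 GB at 16 bpw gives ~11.9B weights) and predicting Q8_0 (8.5 bpw) lands close to the 12.7 GB in the table:

```python
# Rough size model: bytes ~= n_params * bpw / 8.
# Parameter count inferred from the F16 row above (23.8 GB at 16 bpw).
GB = 1e9
n_params = 23.8 * GB * 8 / 16          # ~11.9e9 weights

def est_size_gb(bpw: float) -> float:
    return n_params * bpw / 8 / GB

# Q8_0 is effectively 8.5 bpw: 8-bit quants plus one fp16 scale per 32 weights.
q8_0 = est_size_gb(8.5)
```

The small gap between the estimate and the actual 12.7 GB is expected: some tensors are kept at higher precision and the GGUF header adds overhead.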
### Observations
Sub-quants are not differentiated as expected: IQ2_XS == IQ2_S, IQ3_XS == IQ3_S == IQ3_M, Q3_K_M == Q3_K_L.
- Check if lcpp_sd3.patch includes more specific quant level logic
- Extrapolate the existing level logic
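One way to extrapolate the level logic is to key the target type off the tensor name, keeping the in/out layers at high precision and bumping sensitive blocks one level so the _S/_M/_L variants actually differ. The sketch below is hypothetical: the tensor-name prefixes and bump rules are illustrative, not the actual lcpp.patch or lcpp_sd3.patch logic:

```python
# Hypothetical per-tensor quant selection, loosely modeled on how
# llama.cpp picks per-tensor types. Names and rules are ILLUSTRATIVE only.
KEEP_F16 = ("img_in", "txt_in", "time_in", "vector_in", "final_layer")

# One-step "bump" ladder so _S/_M/_L sub-quants produce different files.
BUMP = {"IQ3_XS": "IQ3_S", "IQ3_S": "IQ3_M", "Q3_K_M": "Q3_K_L"}

def tensor_type(name: str, base: str, level: str) -> str:
    if any(name.startswith(p) for p in KEEP_F16):
        return "F16"                       # keep I/O layers at high precision
    if level in ("M", "L") and ".attn." in name:
        return BUMP.get(base, base)        # differentiate sub-quants
    return base

print(tensor_type("img_in.weight", "IQ3_S", "S"))                    # F16
print(tensor_type("double_blocks.0.attn.qkv.weight", "IQ3_S", "M"))  # IQ3_M
```

Without some rule of this shape, neighboring sub-quant levels collapse onto the same per-tensor assignment, which would explain the identical sizes observed above.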
Quant type | High level quants | Middle level quants | Low level quant | Average |
---|---|---|---|---|
IQ1_S | 5.5% 16bpw | - | 94.5% 1.5625bpw | 2.3556bpw |
IQ2_XXS | 4.2% 16bpw | - | 95.8% 2.0625bpw | 2.6504bpw |
IQ2_XS | 3.8% 16bpw | - | 96.2% 2.3125bpw | 2.8297bpw |
IQ2_S | 3.8% 16bpw | - | 96.2% 2.3125bpw | 2.8298bpw |
IQ2_M | 3.4% 16bpw | - | 96.6% 2.5625bpw | 3.0224bpw |
Q2_K_S | 3.3% 16bpw | - | 96.7% 2.625bpw | 3.0723bpw |
IQ3_XXS | 2.9% 16bpw | - | 97.1% 3.0625bpw | 3.4351bpw |
IQ3_XS | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw |
IQ3_S | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw |
IQ3_M | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw |
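The Average column is just the size-weighted mean of the two bit widths; for IQ1_S, 5.5% of weights at 16 bpw and 94.5% at 1.5625 bpw. The small residuals against the table come from the rounded percentages:

```python
# Size-weighted average bits-per-weight, reproducing the Average column.
# Rows: (high-precision fraction, low-level bpw); high level is 16 bpw.
rows = {
    "IQ1_S":   (0.055, 1.5625),
    "IQ2_XXS": (0.042, 2.0625),
    "IQ3_XXS": (0.029, 3.0625),
}

def avg_bpw(hi_frac: float, lo_bpw: float) -> float:
    return hi_frac * 16 + (1 - hi_frac) * lo_bpw

iq1_s = avg_bpw(*rows["IQ1_S"])   # close to the 2.3556 bpw in the table
```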
Base model: black-forest-labs/FLUX.1-dev