Supported?

Expect broken or faulty items for the time being. Use at your own discretion.

Bravo

Combined imatrix: multiple images, 512x512, 25 and 50 steps, city96/flux1-dev-Q8_0, euler

Using llama.cpp quantize (commit cae9fb4) with modified lcpp.patch.
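As a rough illustration, a batch of these files could be produced along the following lines. This is a minimal sketch, assuming a `llama-quantize` binary built from llama.cpp at commit cae9fb4 with the modified lcpp.patch applied, a pre-converted `flux1-dev-F16.gguf`, and the combined importance matrix saved as `imatrix.dat`; all paths and filenames are illustrative, not the exact commands used for this repo.

```python
# Hypothetical batch-quantization sketch (binary path and filenames are assumptions).
import subprocess

# Quant types from the table below.
QUANTS = [
    "IQ1_S", "IQ1_M", "IQ2_XXS", "IQ2_XS", "IQ2_S", "IQ2_M", "Q2_K_S",
    "IQ3_XXS", "IQ3_XS", "Q3_K_S", "IQ4_XS", "Q4_0", "IQ4_NL", "Q4_K_S",
    "Q4_1", "Q5_K_S",
]

for q in QUANTS:
    # llama-quantize accepts an importance matrix via --imatrix.
    subprocess.run(
        [
            "./llama-quantize",
            "--imatrix", "imatrix.dat",   # combined imatrix described above
            "flux1-dev-F16.gguf",         # assumed source filename
            f"flux1-dev-{q}.gguf",
            q,
        ],
        check=True,
    )
```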

Experimental quants from F16:

| Filename | Quant type | File Size | Description | Example Image |
|---|---|---|---|---|
| flux1-dev-IQ1_S.gguf | IQ1_S | 2.45GB | bad quality | Example |
| flux1-dev-IQ1_M.gguf | IQ1_M | 2.72GB | bad quality | Example |
| flux1-dev-IQ2_XXS.gguf | IQ2_XXS | 3.19GB | bad quality | Example |
| flux1-dev-IQ2_XS.gguf | IQ2_XS | 3.56GB | TBC | - |
| flux1-dev-IQ2_S.gguf | IQ2_S | 3.56GB | TBC | - |
| flux1-dev-IQ2_M.gguf | IQ2_M | 3.93GB | bad quality | Example |
| flux1-dev-Q2_K_S.gguf | Q2_K_S | 4.02GB | TBC | Example |
| flux1-dev-IQ3_XXS.gguf | IQ3_XXS | 4.66GB | - | Example |
| flux1-dev-IQ3_XS.gguf | IQ3_XS | 5.22GB | worse than IQ3_XXS | Example |
| flux1-dev-Q3_K_S.gguf | Q3_K_S | 5.22GB | TBC | Example |
| flux1-dev-IQ4_XS.gguf | IQ4_XS | 6.42GB | TBC | - |
| flux1-dev-Q4_0.gguf | Q4_0 | 6.79GB | TBC | - |
| flux1-dev-IQ4_NL.gguf | IQ4_NL | 6.79GB | TBC | Example |
| flux1-dev-Q4_K_S.gguf | Q4_K_S | 6.79GB | TBC | Example |
| flux1-dev-Q4_1.gguf | Q4_1 | 7.53GB | TBC | - |
| flux1-dev-Q5_K_S.gguf | Q5_K_S | 8.27GB | TBC | Example |

Observations

Alpha

Simple imatrix: single image, 512x512, 8/20 steps, city96/flux1-dev-Q3_K_S, euler

imatrix data: `load_imatrix: loaded 314 importance matrix entries from imatrix.dat computed on 7 chunks`

Using llama.cpp quantize (commit cae9fb4) with modified lcpp.patch.

Experimental quants from Q8_0:

| Filename | Quant type | File Size | Description | Example Image |
|---|---|---|---|---|
| flux1-dev-IQ1_S.gguf | IQ1_S | 2.45GB | obviously bad quality | Example |
| - | IQ1_M | - | broken | - |
| flux1-dev-TQ1_0.gguf | TQ1_0 | 2.63GB | TBC | - |
| flux1-dev-TQ2_0.gguf | TQ2_0 | 3.19GB | TBC | - |
| flux1-dev-IQ2_XXS.gguf | IQ2_XXS | 3.19GB | TBC | Example |
| flux1-dev-IQ2_XS.gguf | IQ2_XS | 3.56GB | TBC | Example |
| flux1-dev-IQ2_S.gguf | IQ2_S | 3.56GB | TBC | - |
| flux1-dev-IQ2_M.gguf | IQ2_M | 3.93GB | TBC | - |
| flux1-dev-Q2_K.gguf | Q2_K | 4.02GB | TBC | - |
| flux1-dev-Q2_K_S.gguf | Q2_K_S | 4.02GB | TBC | Example |
| flux1-dev-IQ3_XXS.gguf | IQ3_XXS | 4.66GB | TBC | Example |
| flux1-dev-IQ3_XS.gguf | IQ3_XS | 5.22GB | TBC | - |
| flux1-dev-IQ3_S.gguf | IQ3_S | 5.22GB | TBC | - |
| flux1-dev-IQ3_M.gguf | IQ3_M | 5.22GB | TBC | - |
| flux1-dev-Q3_K_S.gguf | Q3_K_S | 5.22GB | TBC | Example |
| flux1-dev-Q3_K_M.gguf | Q3_K_M | 5.36GB | TBC | - |
| flux1-dev-Q3_K_L.gguf | Q3_K_L | 5.36GB | TBC | - |
| flux1-dev-IQ4_XS.gguf | IQ4_XS | 6.42GB | TBC | Example |
| flux1-dev-IQ4_NL.gguf | IQ4_NL | 6.79GB | TBC | Example |
| flux1-dev-Q4_0.gguf | Q4_0 | 6.79GB | TBC | - |
| - | Q4_K | TBC | TBC | - |
| flux1-dev-Q4_K_S.gguf | Q4_K_S | 6.79GB | TBC | Example |
| flux1-dev-Q4_K_M.gguf | Q4_K_M | 6.93GB | TBC | - |
| flux1-dev-Q4_1.gguf | Q4_1 | 7.53GB | TBC | - |
| flux1-dev-Q5_K_S.gguf | Q5_K_S | 8.27GB | TBC | Example |
| flux1-dev-Q5_K.gguf | Q5_K | 8.41GB | TBC | - |
| - | Q5_K_M | TBC | TBC | - |
| flux1-dev-Q6_K.gguf | Q6_K | 9.84GB | TBC | - |
| - | Q8_0 | 12.7GB | TBC | Example |
| - | F16 | 23.8GB | TBC | Example |

Observations

Sub-quants are not differentiated as expected: IQ2_XS == IQ2_S, IQ3_XS == IQ3_S == IQ3_M, and Q3_K_M == Q3_K_L.

  • Check if lcpp_sd3.patch includes more specific quant-level logic
  • Extrapolate the existing level logic (see the sketch below)
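As a sanity check on the table below, the Average column is consistent with a simple weighted mean of the 16bpw high-level tensors and the bulk low-level quant. A minimal sketch of that arithmetic follows; the small last-digit differences come from the percentages being rounded in the table.

```python
# Reproduce the "Average" column below as a weighted mean of bits per weight.
# Percentages in the table are rounded, so results differ slightly in the
# last decimal places.
def average_bpw(high_frac: float, low_bpw: float, high_bpw: float = 16.0) -> float:
    return high_frac * high_bpw + (1.0 - high_frac) * low_bpw

print(f"IQ1_S:  {average_bpw(0.055, 1.5625):.4f} bpw")  # ~2.3566 (table: 2.3556)
print(f"IQ2_XS: {average_bpw(0.038, 2.3125):.4f} bpw")  # ~2.8326 (table: 2.8297)
print(f"Q2_K_S: {average_bpw(0.033, 2.625):.4f} bpw")   # ~3.0664 (table: 3.0723)
```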
| Quant type | High level quants | Middle level quants | Low level quant | Average |
|---|---|---|---|---|
| IQ1_S | 5.5% 16bpw | - | 94.5% 1.5625bpw | 2.3556bpw |
| IQ2_XXS | 4.2% 16bpw | - | 95.8% 2.0625bpw | 2.6504bpw |
| IQ2_XS | 3.8% 16bpw | - | 96.2% 2.3125bpw | 2.8297bpw |
| IQ2_S | 3.8% 16bpw | - | 96.2% 2.3125bpw | 2.8298bpw |
| IQ2_M | 3.4% 16bpw | - | 96.6% 2.5625bpw | 3.0224bpw |
| Q2_K_S | 3.3% 16bpw | - | 96.7% 2.625bpw | 3.0723bpw |
| IQ3_XXS | 2.9% 16bpw | - | 97.1% 3.0625bpw | 3.4351bpw |
| IQ3_XS | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw |
| IQ3_S | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw |
| IQ3_M | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw |