Supported?

Expect broken or faulty items for the time being. Use at your own discretion.

Bravo

Combined imatrix: multiple images, 512x512, 25 and 50 steps, city96/flux1-dev-Q8_0, euler

Using llama.cpp quantize (commit cae9fb4) with modified lcpp.patch.
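As a rough illustration, a batch of these files could be produced along the following lines. This is a minimal sketch, assuming a `llama-quantize` binary built from llama.cpp at commit cae9fb4 with the modified lcpp.patch applied, a pre-converted `flux1-dev-F16.gguf`, and the combined importance matrix saved as `imatrix.dat`; all paths and filenames are illustrative, not the exact commands used for this repo.

```python
# Hypothetical batch-quantization sketch (binary path and filenames are assumptions).
import subprocess

# Quant types from the table below.
QUANTS = [
    "IQ1_S", "IQ1_M", "IQ2_XXS", "IQ2_XS", "IQ2_S", "IQ2_M", "Q2_K_S",
    "IQ3_XXS", "IQ3_XS", "Q3_K_S", "IQ4_XS", "Q4_0", "IQ4_NL", "Q4_K_S",
    "Q4_1", "Q5_K_S",
]

for q in QUANTS:
    # llama-quantize accepts an importance matrix via --imatrix.
    subprocess.run(
        [
            "./llama-quantize",
            "--imatrix", "imatrix.dat",   # combined imatrix described above
            "flux1-dev-F16.gguf",         # assumed source filename
            f"flux1-dev-{q}.gguf",
            q,
        ],
        check=True,
    )
```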

Experimental quants from F16:

| Filename | Quant type | File Size | Description | Example Image |
|---|---|---|---|---|
| flux1-dev-IQ1_S.gguf | IQ1_S | 2.45GB | bad quality | Example |
| flux1-dev-IQ1_M.gguf | IQ1_M | 2.72GB | bad quality | Example |
| flux1-dev-IQ2_XXS.gguf | IQ2_XXS | 3.19GB | bad quality | Example |
| flux1-dev-IQ2_XS.gguf | IQ2_XS | 3.56GB | TBC | - |
| flux1-dev-IQ2_S.gguf | IQ2_S | 3.56GB | TBC | - |
| flux1-dev-IQ2_M.gguf | IQ2_M | 3.93GB | bad quality | Example |
| flux1-dev-Q2_K_S.gguf | Q2_K_S | 4.02GB | TBC | Example |
| flux1-dev-IQ3_XXS.gguf | IQ3_XXS | 4.66GB | - | Example |
| flux1-dev-IQ3_XS.gguf | IQ3_XS | 5.22GB | worse than IQ3_XXS | Example |
| flux1-dev-Q3_K_S.gguf | Q3_K_S | 5.22GB | TBC | Example |
| flux1-dev-IQ4_XS.gguf | IQ4_XS | 6.42GB | TBC | - |
| flux1-dev-Q4_0.gguf | Q4_0 | 6.79GB | TBC | - |
| flux1-dev-IQ4_NL.gguf | IQ4_NL | 6.79GB | TBC | Example |
| flux1-dev-Q4_K_S.gguf | Q4_K_S | 6.79GB | TBC | Example |
| flux1-dev-Q4_1.gguf | Q4_1 | 7.53GB | TBC | - |
| flux1-dev-Q5_K_S.gguf | Q5_K_S | 8.27GB | TBC | Example |

Observations

Alpha

Simple imatrix: single image, 512x512, 8/20 steps, city96/flux1-dev-Q3_K_S, euler

imatrix data: `load_imatrix: loaded 314 importance matrix entries from imatrix.dat computed on 7 chunks`

Using llama.cpp quantize (commit cae9fb4) with modified lcpp.patch.

Experimental quants from Q8_0:

| Filename | Quant type | File Size | Description | Example Image |
|---|---|---|---|---|
| flux1-dev-IQ1_S.gguf | IQ1_S | 2.45GB | obviously bad quality | Example |
| - | IQ1_M | - | broken | - |
| flux1-dev-TQ1_0.gguf | TQ1_0 | 2.63GB | TBC | - |
| flux1-dev-TQ2_0.gguf | TQ2_0 | 3.19GB | TBC | - |
| flux1-dev-IQ2_XXS.gguf | IQ2_XXS | 3.19GB | TBC | Example |
| flux1-dev-IQ2_XS.gguf | IQ2_XS | 3.56GB | TBC | Example |
| flux1-dev-IQ2_S.gguf | IQ2_S | 3.56GB | TBC | - |
| flux1-dev-IQ2_M.gguf | IQ2_M | 3.93GB | TBC | - |
| flux1-dev-Q2_K.gguf | Q2_K | 4.02GB | TBC | - |
| flux1-dev-Q2_K_S.gguf | Q2_K_S | 4.02GB | TBC | Example |
| flux1-dev-IQ3_XXS.gguf | IQ3_XXS | 4.66GB | TBC | Example |
| flux1-dev-IQ3_XS.gguf | IQ3_XS | 5.22GB | TBC | - |
| flux1-dev-IQ3_S.gguf | IQ3_S | 5.22GB | TBC | - |
| flux1-dev-IQ3_M.gguf | IQ3_M | 5.22GB | TBC | - |
| flux1-dev-Q3_K_S.gguf | Q3_K_S | 5.22GB | TBC | Example |
| flux1-dev-Q3_K_M.gguf | Q3_K_M | 5.36GB | TBC | - |
| flux1-dev-Q3_K_L.gguf | Q3_K_L | 5.36GB | TBC | - |
| flux1-dev-IQ4_XS.gguf | IQ4_XS | 6.42GB | TBC | Example |
| flux1-dev-IQ4_NL.gguf | IQ4_NL | 6.79GB | TBC | Example |
| flux1-dev-Q4_0.gguf | Q4_0 | 6.79GB | TBC | - |
| - | Q4_K | TBC | TBC | - |
| flux1-dev-Q4_K_S.gguf | Q4_K_S | 6.79GB | TBC | Example |
| flux1-dev-Q4_K_M.gguf | Q4_K_M | 6.93GB | TBC | - |
| flux1-dev-Q4_1.gguf | Q4_1 | 7.53GB | TBC | - |
| flux1-dev-Q5_K_S.gguf | Q5_K_S | 8.27GB | TBC | Example |
| flux1-dev-Q5_K.gguf | Q5_K | 8.41GB | TBC | - |
| - | Q5_K_M | TBC | TBC | - |
| flux1-dev-Q6_K.gguf | Q6_K | 9.84GB | TBC | - |
| - | Q8_0 | 12.7GB | TBC | Example |
| - | F16 | 23.8GB | TBC | Example |

Observations

Sub-quants are not differentiated as expected: IQ2_XS == IQ2_S, IQ3_XS == IQ3_S == IQ3_M, and Q3_K_M == Q3_K_L.

  • Check if lcpp_sd3.patch includes more specific quant-level logic
  • Extrapolate the existing level logic (see the sketch below)
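As a sanity check on the table below, the Average column is consistent with a simple weighted mean of the 16bpw high-level tensors and the bulk low-level quant. A minimal sketch of that arithmetic follows; the small last-digit differences come from the percentages being rounded in the table.

```python
# Reproduce the "Average" column below as a weighted mean of bits per weight.
# Percentages in the table are rounded, so results differ slightly in the
# last decimal places.
def average_bpw(high_frac: float, low_bpw: float, high_bpw: float = 16.0) -> float:
    return high_frac * high_bpw + (1.0 - high_frac) * low_bpw

print(f"IQ1_S:  {average_bpw(0.055, 1.5625):.4f} bpw")  # ~2.3566 (table: 2.3556)
print(f"IQ2_XS: {average_bpw(0.038, 2.3125):.4f} bpw")  # ~2.8326 (table: 2.8297)
print(f"Q2_K_S: {average_bpw(0.033, 2.625):.4f} bpw")   # ~3.0664 (table: 3.0723)
```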
| Quant type | High level quants | Middle level quants | Low level quant | Average |
|---|---|---|---|---|
| IQ1_S | 5.5% 16bpw | - | 94.5% 1.5625bpw | 2.3556bpw |
| IQ2_XXS | 4.2% 16bpw | - | 95.8% 2.0625bpw | 2.6504bpw |
| IQ2_XS | 3.8% 16bpw | - | 96.2% 2.3125bpw | 2.8297bpw |
| IQ2_S | 3.8% 16bpw | - | 96.2% 2.3125bpw | 2.8298bpw |
| IQ2_M | 3.4% 16bpw | - | 96.6% 2.5625bpw | 3.0224bpw |
| Q2_K_S | 3.3% 16bpw | - | 96.7% 2.625bpw | 3.0723bpw |
| IQ3_XXS | 2.9% 16bpw | - | 97.1% 3.0625bpw | 3.4351bpw |
| IQ3_XS | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw |
| IQ3_S | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw |
| IQ3_M | 2.6% 16bpw | - | 97.4% 3.4375bpw | 3.7609bpw |