amd-shark/sdxl-quant-fp8 at main

sdxl-quant-fp8

104 GB

4 contributors

History: 63 commits

GiusFra

Upload quant_sdxl_exps/sdxl_quant_artifacts_mxfp4_net2/net_2_mxfp4.irpa with huggingface_hub

e144c20 verified 5 months ago

all_linear_sym_8_calib8
Fix names over 1 year ago
all_quant_int8_sdpa_fp8
Create config.json 10 months ago
all_sym_8_calib10
MI250 QKV fused and all layers sym, FP8 attention, guidance scale 8, calib steps 10 over 1 year ago
brevitas
updated quant_params with QKV fusion over 1 year ago
fp8_irpa
Upload fp8_irpa/unet_quant_bias.irpa with huggingface_hub 5 months ago
linear_conv_fp8_sdpa_fp16_eq_bl
Create config.json about 1 year ago
linear_conv_fp8_sdpa_fp16_no_eq_bl
Create config.json about 1 year ago
linear_conv_fp8_sdpa_fp8_eq_bl
Create config.json about 1 year ago
linear_conv_fp8_sdpa_fp8_no_eq_bl
Create config.json about 1 year ago
nvidia_fp8_unet
Upload nvidia_fp8_unet/params.safetensors with huggingface_hub about 1 year ago
quant_sdxl_exps
Upload quant_sdxl_exps/sdxl_quant_artifacts_mxfp4_net2/net_2_mxfp4.irpa with huggingface_hub 5 months ago
unet_int8_sdpa_fp8_ocp
Upload unet_int8_sdpa_fp8_ocp/params.safetensors with huggingface_hub 10 months ago
unet_int8_sdpa_fp8_vae_int8
Upload unet_int8_sdpa_fp8_vae_int8/vae_quant_params.json with huggingface_hub 10 months ago
unet_int8_sdpa_fp8_vae_int8_v2
Upload unet_int8_sdpa_fp8_vae_int8_v2/vae_quant_params.json with huggingface_hub 10 months ago
.gitattributes

2.63 kB

Upload quant_sdxl_exps/sdxl_quant_artifacts_mxfp4_net2/net_2_mxfp4.irpa with huggingface_hub 5 months ago
attn.py

6.26 kB

Added SDPA math model & test over 1 year ago
math_model.py

7.13 kB

Create math_model.py about 1 year ago
sdxl.json

2.19 MB

Upload sdxl.json with huggingface_hub over 1 year ago
sdxl.safetensors

5.14 GB
xet

Upload sdxl.safetensors with huggingface_hub over 1 year ago
test_attn.py

1.31 kB

[math_model] Make it more obvious that softmax scale comes from the quantizer 12 months ago