did you have to do anything special to quantize this one?

#1
by gghfez

Hey mate, I was trying to quantize this one yesterday with the convert.py script in exllamav2, but it always failed with this error:

 -- Resuming job
 !! Note: Overriding options with settings from existing job
 -- Input: llama-3-70B-Instruct-abliterated/
 -- Output: llama-3-70B-Instruct-abliterated-wip
 -- Using default calibration dataset
 -- Target bits per weight: 8.0 (decoder), 6 (head)
 -- Max shard size: 8192 MB
 -- Full model will be compiled to: exl2/llama-3-70B-Instruct-abliterated-exl2-8BPW
 -- Quantizing...
 -- Layer: model.layers.0 (Attention)
 -- Linear: model.layers.0.self_attn.q_proj -> 1:6b_32g s4, 6.13 bpw
 -- Linear: model.layers.0.self_attn.k_proj -> 1:6b_32g s4, 6.16 bpw
 !! Warning, difference of (0.015625, 0.015625) between unpacked and dequantized matrices
 -- Linear: model.layers.0.self_attn.v_proj -> 1:8b_32g s4, 8.16 bpw
 ## Quantization error (2)
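
For reference, a job with those settings would normally be launched with something along these lines (a rough sketch; the flag names are from exllamav2's convert.py and can change between versions, so check python convert.py --help). Rerunning the same command with the same -o directory is what produces the "Resuming job" line at the top of the log.

 # fp16 input, working directory for job state, and final compiled output dir;
 # -b is the target decoder bits per weight, -hb the bits for the output head
 python convert.py \
     -i llama-3-70B-Instruct-abliterated/ \
     -o llama-3-70B-Instruct-abliterated-wip \
     -cf exl2/llama-3-70B-Instruct-abliterated-exl2-8BPW \
     -b 8.0 \
     -hb 6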

Never had an issue with this before. Did you have to do anything special for this model to make it work?

Nothing special. Just make sure you have the latest exllamav2 version. It's also possible your fp16 download of the original model is corrupted, so you may need to download it again.
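
If you want to rule out a corrupted fp16 download before re-quantizing (updating exllamav2 itself is usually just pip install -U exllamav2, or a git pull plus reinstall for a source build), one option (a sketch, not something from this thread) is to hash each local safetensors shard and compare it against the SHA256 shown in the LFS details on the original model repo's file listing; any mismatch means that shard needs to be fetched again, e.g. with huggingface-cli download.

 import hashlib
 import pathlib

 # Print a SHA256 per shard; compare each against the value shown under
 # "LFS" for that file on the original model's repo page. A mismatch
 # means the local copy is corrupted or truncated.
 model_dir = pathlib.Path("llama-3-70B-Instruct-abliterated")
 for shard in sorted(model_dir.glob("*.safetensors")):
     digest = hashlib.sha256()
     with shard.open("rb") as f:
         for chunk in iter(lambda: f.read(1 << 20), b""):
             digest.update(chunk)
     print(shard.name, digest.hexdigest())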
