Model quantized to 8 bits using a modified version of the [EETQ](https://github.com/NetEase-FuXi/EETQ) repo. Currently working on decoupling its kernels from CUTLASS, which should make this a bit easier to use.
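
For readers unfamiliar with what "8 bits" means here, the following is a minimal, generic sketch of symmetric per-row int8 weight quantization in plain Python. It is not EETQ's code or API: the function names `quantize_per_channel_int8` and `dequantize` are hypothetical, and the per-row symmetric scheme is an assumption chosen for illustration, not a description of how this model was actually quantized.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quantize_per_channel_int8(weight: Matrix) -> Tuple[List[List[int]], List[float]]:
    """Illustrative symmetric int8 quantization, one scale per row (hypothetical helper)."""
    q_rows: List[List[int]] = []
    scales: List[float] = []
    for row in weight:
        max_abs = max((abs(v) for v in row), default=0.0)
        # Map the largest magnitude in the row to 127; avoid dividing by zero.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        # Round to the nearest integer and clamp to the symmetric int8 range.
        q_rows.append([max(-127, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales


def dequantize(q_rows: List[List[int]], scales: List[float]) -> Matrix:
    """Recover approximate float weights: w is roughly q * scale (hypothetical helper)."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]


if __name__ == "__main__":
    w = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
    q, s = quantize_per_channel_int8(w)
    print(q, s)
    print(dequantize(q, s))
```

Each weight is stored as one signed byte plus a shared float scale per row, so the reconstruction error per weight is at most half a quantization step (`scale / 2`).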