CUDA supports all the implementations of GPTQ now

by TheYuriLover

Hello,

CUDA_VISIBLE_DEVICES=0 python llama.py ./models/chavinlo-gpt4-x-alpaca --wbits 4 --true-sequential --groupsize 128 --save gpt-x-alpaca-13b-native-4bit-128g-cuda.pt
I saw that your CUDA quantization doesn't use act-order (there is no --act-order flag in the command above). You should quantize the model again, because it looks like qwopqwop200 has finally combined all the implementations in the cuda branch:
https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/cuda
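
For reference, the re-quantization command should only need the act-order flag added. This is a sketch, not something I've tested here; it assumes the cuda branch's llama.py accepts --act-order together with --groupsize, and it reuses the paths and settings from the command above:

CUDA_VISIBLE_DEVICES=0 python llama.py ./models/chavinlo-gpt4-x-alpaca --wbits 4 --true-sequential --act-order --groupsize 128 --save gpt-x-alpaca-13b-native-4bit-128g-cuda.pt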
