Requesting info
#1 by gsaivinay - opened
Hello,
Could you please provide details of the quantization process? I'd like to know what dataset and sequence length were used for this conversion.
Thanks.
I used the auto-gptq PyPI library and just applied post-training quantization, so I didn't use any dataset.
I developed my own flow for minimalist quantization at https://github.com/seonglae/llama2gptq:
python main.py quantize --safetensor --model meta-llama/Llama-2-13b-chat-hf --output llama-2-13b-chat-hf-gptq