8-bit quantized model

#2 · opened by mrm8488
BERTIN Project org

As seen here: https://huggingface.co/spaces/bertin-project/bertin-gpt-j-6B/discussions/1#633aeb9acbdbadd99c070c74
With the new feature that automatically quantizes the model weights to 8 bits, IMHO, it does not make sense to keep a separate, already-quantized model. What do you think @versae?
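For reference, loading with the automatic int8 feature looks roughly like this (a minimal sketch, assuming a recent transformers with accelerate and bitsandbytes installed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bertin-project/bertin-gpt-j-6B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Quantizes the published fp16/fp32 weights to int8 on the fly at load time,
# so no separate pre-quantized checkpoint is required.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # requires accelerate
    load_in_8bit=True,   # requires bitsandbytes
)
```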

BERTIN Project org

Yeah. It seems the LoRA work might not be maintained in the future, so maybe using the int8 feature in transformers is the way to go. As I see it, there should be some way to serialize the model in int8 so we can create a branch in the model repo that automatically loads in int8.
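If such a branch existed, pointing `from_pretrained` at it would look roughly like this (a sketch; the "8bit" branch name is hypothetical, and whether serialized int8 weights load back directly depends on how they were saved):

```python
from transformers import AutoModelForCausalLM

# Sketch only: "8bit" is a hypothetical branch name in the main model repo.
model = AutoModelForCausalLM.from_pretrained(
    "bertin-project/bertin-gpt-j-6B",
    revision="8bit",
    device_map="auto",
)
```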

BERTIN Project org

I have already done it with the latest ckpt (https://huggingface.co/mrm8488/bertin-gpt-j-6B-ES-v1-8bit). Should I create a branch and push it there?
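One possible way to do that with huggingface_hub (a sketch; the "8bit" branch name and the local checkpoint path are assumptions):

```python
from huggingface_hub import create_branch, upload_folder

repo_id = "bertin-project/bertin-gpt-j-6B"

# Create the branch in the main model repo, then upload the quantized
# checkpoint to that revision instead of keeping a separate repo.
create_branch(repo_id, branch="8bit")
upload_folder(
    folder_path="./bertin-gpt-j-6B-ES-v1-8bit",  # local 8-bit checkpoint (assumed path)
    repo_id=repo_id,
    revision="8bit",
)
```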

BERTIN Project org

That'd be great!
