Add fp16/int8 weights

#2
by mkshing - opened

This PR enables using this model with the Colab Free plan via int8 quantization.
Here's the link to the demo in Colab:

https://colab.research.google.com/github/mkshing/notebooks/blob/main/stabilityai_japanese_stablelm_alpha_7b.ipynb
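
For context, a rough back-of-the-envelope on why int8 is what makes Colab Free feasible (figures are my own estimate, not from the PR):

# Rough memory arithmetic: ~7B parameters at 4 bytes/param in fp32 vs ~1 byte/param in int8.
params = 7e9
print(f"fp32 ≈ {params * 4 / 1e9:.0f} GB, int8 ≈ {params / 1e9:.0f} GB")  # ≈28 GB vs ≈7 GB
# Only the int8 footprint fits comfortably on the ~15 GB T4 GPU of Colab Free.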

mkshing changed pull request status to open
Stability AI org

Generally LGTM! By the way, if we don't include variant="int8" in from_pretrained, it will just load the original fp32 version, is that correct?

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    variant="int8",
    low_cpu_mem_usage=True,
    load_in_8bit=True,  # requires bitsandbytes
)

Exactly!
So, if I'm correct, in that case it loads the fp32 weights first and then converts them to int8:

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
-   variant="int8",
    low_cpu_mem_usage=True,
    load_in_8bit=True,
)
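
Since the title also mentions fp16 weights, here is a minimal sketch of loading that variant (the variant name and kwargs are assumed to mirror the int8 example above):

import torch
from transformers import AutoModelForCausalLM

# Assumed fp16 usage: fetch the fp16 shards added by this PR and keep the
# weights in half precision; no bitsandbytes is needed in this case.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    variant="fp16",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)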
Stability AI org

Nice! Let's merge this. By the way, do you want to also include the variant as a Colab dropdown (defaulting to int8), like model_id, so people are aware of it?

leemeng changed pull request status to merged

@leemeng Sure! I will also add a note that only int8 works on Colab Free.
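
For the dropdown, a sketch of the Colab form field I have in mind (variable names are placeholders):

# "#@param" with a list renders as a dropdown in Colab; int8 is the default.
variant = "int8"  #@param ["int8", "fp16"]
load_in_8bit = variant == "int8"  # only int8 fits on the Colab Free GPU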
