8-bit and sharded weights

#5
by ThreeBlessings - opened

Hi!

I'm updating a lab for a Data-Centric AI course, and it would be great to use this model with the load_in_8bit=True parameter and have it sharded into 2 GB weights for easy use on free-tier Colab GPUs.

Are there plans to add these features?
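
For context, the usage I have in mind looks roughly like this. This is just a sketch assuming the bitsandbytes 8-bit integration in transformers; the repo id is a placeholder for this model:

```python
# Sketch of the desired usage: 8-bit loading on a free-tier Colab GPU.
# Requires: pip install transformers accelerate bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b-instruct"  # placeholder for this model's repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,       # bitsandbytes int8 quantization at load time
    device_map="auto",       # let accelerate place layers on the available GPU
    trust_remote_code=True,  # Falcon ships custom modeling code
)
```

With 2 GB shards, each file fits comfortably in the free tier's RAM while the checkpoint streams in.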

Hi!

I'm not an author of this model, but I sharded it; you can check it here.
I did it in a free-tier Colab environment with no GPU: that environment gives you enough RAM to load models up to 7B.
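
For anyone who wants to reproduce this, here is a minimal sketch of how sharding can be done with plain transformers. The repo id, dtype, and output directory are my assumptions, not necessarily what the poster used:

```python
# Sketch: load the checkpoint on CPU and re-save it in 2 GB shards.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b-instruct",  # assumed checkpoint; any Falcon variant works
    torch_dtype=torch.bfloat16,   # halves the RAM needed versus fp32
    trust_remote_code=True,       # Falcon uses custom modeling code
    low_cpu_mem_usage=True,       # stream weights in instead of double-allocating
)

# save_pretrained splits the checkpoint into files no larger than max_shard_size
model.save_pretrained("falcon-7b-instruct-sharded", max_shard_size="2GB")
```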

Technology Innovation Institute org

There has been some support for 4-bit in a great external library, FalconTune. You can also check out this blog post from Hugging Face.
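
For anyone following along, 4-bit loading also works directly in transformers via the bitsandbytes integration. This is a sketch in the spirit of the Hugging Face blog post mentioned above, not FalconTune itself, and the repo id is a placeholder:

```python
# Sketch: 4-bit NF4 loading via the transformers + bitsandbytes integration.
# Requires recent transformers, accelerate, and bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for the matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b-instruct",  # placeholder repo id
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
```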

Hey, check out my video on how to fine-tune and use the instruct model on a single GPU in free Google Colab: https://youtu.be/AXG7TA7vIQ8
