Hardware requirements for fine-tuning / prompt-tuning bloomz-7b1-mt

#2
by raihan2345 - opened

I am interested in fine-tuning / prompt-tuning the bloomz-7b1-mt model hosted on Hugging Face. However, I am unsure about the specific hardware requirements needed to perform this task efficiently.

Some of the questions I have include:

  • What is the recommended amount of RAM for fine-tuning / prompt-tuning Bloomz-7B-MT?
  • What is the recommended GPU memory size?
  • Are there any specific hardware requirements for training with mixed precision?
  • Are there any other hardware considerations that are important for efficient fine-tuning / prompt-tuning of Bloomz-7B-MT?

I would greatly appreciate any insights and advice.

Interested in exactly the same topic. Anyone working on this?

Hello dear friends, I would also greatly appreciate any help with this question. Could you please tell me the minimum hardware required to run the retraining process for this model?

BigScience Workshop org
edited Jul 20, 2023

Sorry for the late response; I'm not sure how helpful my answer is, as it depends on so many factors 😅

What is the recommended amount of RAM for fine-tuning / prompt-tuning Bloomz-7B-MT?

Bloomz-7B-MT is ~14GB in half precision. If you have more CPU RAM than that, it should be fine. You can even get away with less CPU RAM by transferring the weights to the GPU step by step.
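A minimal sketch of that kind of loading (assumes the accelerate package is installed so that device_map="auto" can place the fp16 shards directly on the available GPU instead of materializing the full model in CPU RAM first):

```python
# Sketch: load bloomz-7b1-mt in fp16, streaming shards straight to the GPU
# rather than building a full copy of the model in CPU RAM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloomz-7b1-mt"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,   # ~14GB of weights in half precision
    device_map="auto",           # place shards on the available GPU(s) as they load
    low_cpu_mem_usage=True,      # avoid holding a full copy in CPU RAM
)
```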

What is the recommended GPU memory size?

Bloomz-7B-MT is ~14GB in half precision.
Fine-tuning: For naive fine-tuning of all parameters you will also need to store optimizer states, gradients, etc. (roughly 2x the weights or more), so having 80GB would probably be sufficient.
Fine-tuning w/ less memory: Using DeepSpeed or similar tooling, you could probably do full fine-tuning with 40GB.
Fine-tuning some parameters: Using LoRA or other parameter-efficient methods, you can probably fine-tune with even less by only training a small number of parameters (see the sketch below).
Prompting: Depends on the length of your prompt; you will need ~14GB plus some amount for the data.
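For the LoRA route, a minimal sketch with the PEFT library (the hyperparameters below are illustrative, not tuned recommendations):

```python
# Sketch: attach LoRA adapters to bloomz-7b1-mt so only a small fraction of
# parameters is trained; the base weights stay frozen in fp16.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-7b1-mt",
    torch_dtype=torch.float16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["query_key_value"],  # BLOOM's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the 7B parameters
```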

Are there any specific hardware requirements for training with mixed precision?

Most modern GPUs support FP16 afaik, e.g. NVIDIA A100, AMD MI250, etc. should all work.
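With the Hugging Face Trainer, mixed precision is usually just a flag in the training arguments (a sketch; the batch size and accumulation steps are placeholders you would tune to your GPU):

```python
# Sketch: enable mixed precision training via TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bloomz-7b1-mt-finetuned",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    bf16=True,                    # use fp16=True instead on GPUs without bfloat16 support
    gradient_checkpointing=True,  # trades extra compute for lower activation memory
)
```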

I want to use this version of the model and make it more ChatGPT-like for my language, Punjabi. Is there a way I can strip out the tokenizer entries and embeddings for languages other than the one I want? This can be done for mT5, where it reduced the model size by more than half. Also, are there any smaller versions of this model coming soon, since I don't have access to a cluster of GPUs?

I have seen multiple tutorials on using the QLoRA and PEFT techniques to fine-tune many 7B-parameter models, but they don't seem to work for this one. I want to fine-tune it on the free tier of Colab, and I don't want it to take much space. Can anyone please help?
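For reference, what I have been trying roughly follows the usual QLoRA recipe (a sketch with commonly used 4-bit settings; I haven't verified it fits in free-tier Colab memory for the 7B checkpoint):

```python
# Rough QLoRA-style setup: 4-bit quantized base model plus LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # fp16 compute for older Colab GPUs (e.g. T4)
)

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-7b1-mt",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["query_key_value"],  # BLOOM's attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```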

Thanks in advance!

BigScience Workshop org

You can try e.g. https://huggingface.co/bigscience/bloomz-3b or https://huggingface.co/bigscience/bloomz-1b7, which should be similar to bloomz-7b1-mt. For -mt, 7b1 is the smallest one though.

Thanks a lot. I am thinking of going ahead with fine-tuning the smallest version of BLOOM itself, taking individual data files from the same xP3mt dataset (I found some cleaning issues with the data, so I want to try with cleaned data). Is there any way to remove the tokens for other languages, decreasing the tokenizer size, though?

BigScience Workshop org

Hm, I have never tried that, but it should work. The easiest approach may be to remove all tokens containing foreign characters that you do not intend to use (e.g. Chinese / Japanese / Korean, etc.).
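A rough, untested sketch of the idea on a small checkpoint: keep only tokens whose decoded text falls inside a whitelist of scripts and slice the tied embedding matrix accordingly. Rebuilding the tokenizer itself with remapped token IDs is the harder part and is not shown here.

```python
# Hypothetical sketch: shrink a BLOOM checkpoint's vocabulary to tokens whose
# characters fall inside a whitelist of Unicode ranges (here basic Latin plus
# Gurmukhi for Punjabi). Adjust the ranges to the languages you want to keep.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloomz-560m"  # small checkpoint for experimenting
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def keep(text):
    # Keep tokens made only of ASCII/Latin or Gurmukhi (U+0A00..U+0A7F) characters.
    return all(ord(c) < 0x0250 or 0x0A00 <= ord(c) <= 0x0A7F for c in text)

# Note: single-byte fallback tokens may decode to the replacement character and
# get dropped here; a real implementation would want to keep them.
keep_ids = [i for i in range(len(tokenizer)) if keep(tokenizer.decode([i]))]

# Slice the input embedding matrix down to the kept rows; BLOOM ties the input
# and output embeddings, so retying covers the LM head as well.
old_emb = model.get_input_embeddings().weight.data
new_emb = torch.nn.Embedding(len(keep_ids), old_emb.shape[1])
new_emb.weight.data.copy_(old_emb[keep_ids])
model.set_input_embeddings(new_emb)
model.tie_weights()
model.config.vocab_size = len(keep_ids)

# The tokenizer would still need to be rebuilt so its token IDs map onto the
# new, smaller embedding rows -- that remapping is not shown here.
```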
