Determining Minimum GPU Memory and Input Text Length Calculation in Model Training

#19
by kobe8-24 - opened

Hello there,

I was wondering if it would be possible to consider the result obtained from this tool as an estimation of the minimum GPU memory required for model training when the batch_size is set to 1. Additionally, I would appreciate it if you could provide some information regarding the length of the input text used to calculate this result.

Thank you kindly.

I am also concerned about it.

accelerate org

As noted in the description, the values shown are the minimum recommended vRAM: loading a model needs at least the size of its "largest layer", and training a model takes roughly 4x the model's size (for Adam). This tool is still under development and we will add more variables (batch_size, ...) in the future. To better understand how much vRAM is needed, I suggest you read this article: https://blog.eleuther.ai/transformer-math/
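The ~4x rule of thumb for Adam comes from counting the per-parameter state that full training keeps in memory: the weights, one gradient per weight, and Adam's two moment estimates. A minimal sketch of that arithmetic (a hypothetical helper, not part of this tool; it deliberately ignores activations, which depend on batch_size and sequence length):

```python
# Rough vRAM estimate for full training with Adam (hypothetical helper).
# Counts weights + gradients + Adam's two moment buffers, all in fp32:
# 4 + 4 + 8 = 16 bytes per parameter, i.e. ~4x the fp32 model size.
# Activations are NOT included -- they scale with batch_size and
# sequence length, which this tool does not model yet.
def estimate_training_vram_gb(num_params: float, bytes_per_param: int = 4) -> float:
    weights = num_params * bytes_per_param           # model weights
    gradients = num_params * bytes_per_param         # one gradient per weight
    adam_states = num_params * 2 * bytes_per_param   # first + second moments
    total_bytes = weights + gradients + adam_states
    return total_bytes / 1024**3

# Example: a 7B-parameter model in fp32 needs on the order of ~104 GB
# for weights + gradients + optimizer state alone.
print(round(estimate_training_vram_gb(7e9), 1))
```

Mixed precision and memory-efficient optimizers change these constants, so treat this as a lower-bound sketch rather than a precise requirement.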
