Only 4 GB of GPU memory used for inference but 38 GB for training

#15

by hayj - opened Aug 30, 2023

Discussion

hayj

Aug 30, 2023

Is it normal that training takes much more GPU memory than inference, or am I using it incorrectly?
I use an NVIDIA A100.

intfloat

Owner Aug 31, 2023

Yes, this is normal. Besides the model weights, training has to keep gradients, optimizer states, and intermediate activations in GPU memory, and together these are several times larger than the weights alone.

Please refer to https://huggingface.co/docs/transformers/v4.20.1/en/perf_train_gpu_one#anatomy-of-models-memory for more details.
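As a rough illustration of the point above, the sketch below applies the per-parameter byte counts usually quoted for AdamW training (4 bytes for fp32 weights, 4 for gradients, 8 for the two optimizer moments; 18 bytes in total under mixed precision, where an extra fp16 copy of the weights is kept). The function name `estimate_memory_gb` and the 1-billion-parameter count are hypothetical and not taken from this thread, and activations are deliberately left out because they depend on batch size and sequence length.

```python
def estimate_memory_gb(num_params: int, training: bool, mixed_precision: bool = False) -> float:
    """Rough GPU memory estimate in GB (1 GB = 1e9 bytes), excluding activations.

    Assumes fp32 weights and, when training, an AdamW-style optimizer
    with two fp32 state tensors per parameter.
    """
    weight_bytes = 4  # fp32 master weights
    if not training:
        return num_params * weight_bytes / 1e9

    if mixed_precision:
        weight_bytes += 2  # extra fp16 copy of the weights
    gradient_bytes = 4     # gradients kept in fp32
    optimizer_bytes = 8    # two fp32 moment estimates per parameter
    total_bytes = weight_bytes + gradient_bytes + optimizer_bytes
    return num_params * total_bytes / 1e9


if __name__ == "__main__":
    n = 1_000_000_000  # hypothetical parameter count, for illustration only
    print(f"inference (fp32):       {estimate_memory_gb(n, training=False):.1f} GB")
    print(f"training (fp32, AdamW): {estimate_memory_gb(n, training=True):.1f} GB")
    print(f"training (mixed, AdamW): {estimate_memory_gb(n, training=True, mixed_precision=True):.1f} GB")
```

Under these assumptions, weights, gradients, and optimizer states alone come to 4 to 4.5 times the inference footprint. Activations come on top of that and grow with batch size and sequence length, which is likely where most of the remaining gap in a measurement like the one reported here comes from.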
