Even 330GB of host RAM is not enough

#8
by BigDeeper - opened

Although the model takes up 180GB on disk, Python runs out of memory while building it up in RAM. I am not sure whether the weights get loaded for inference at full 32-bit precision, or whether the extra RAM is just overhead from how Python represents the data.
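To make the precision question concrete, here is a rough back-of-the-envelope sketch. It assumes the in-memory size scales linearly with the number of bytes per weight and ignores any other overhead. The storage widths below are hypothetical, since the thread does not say how the checkpoint is stored, and `estimate_ram_gb` is just an illustrative helper.

```python
def estimate_ram_gb(disk_gb: float, disk_bytes_per_weight: float, ram_bytes_per_weight: float) -> float:
    """Scale a checkpoint's on-disk size to its in-memory size at a different weight width."""
    return disk_gb * ram_bytes_per_weight / disk_bytes_per_weight


DISK_GB = 180  # on-disk size reported in this thread

# Assumed (hypothetical) widths in bytes per weight: 1 = 8-bit, 2 = 16-bit, 4 = 32-bit.
for disk_width in (1, 2):
    for ram_width in (2, 4):
        ram_gb = estimate_ram_gb(DISK_GB, disk_width, ram_width)
        print(f"{disk_width * 8}-bit on disk -> {ram_width * 8}-bit in RAM: ~{ram_gb:.0f} GB")
```

Under these assumptions, any upcast to a wider type during loading multiplies the footprint by the ratio of the two widths, which alone could exceed 330GB.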

Does NumPy have 8-bit float support?

On my machine I needed about 370GB of memory just to load the model; after it loads, usage drops to 170GB.
But there is virtually no CPU parallelization (in my case), so getting a response to a prompt is slow.
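For anyone who wants to compare the load-time peak with the steady-state usage on their own machine, here is a minimal sketch using only the standard library. It is Unix only, and the `bytearray` allocation is a stand-in for the real model load, which is not shown in this thread. `peak_rss_gb` is a hypothetical helper name.

```python
import resource
import sys


def peak_rss_gb() -> float:
    """Return this process's peak resident set size in GB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    bytes_used = peak if sys.platform == "darwin" else peak * 1024
    return bytes_used / 1e9


before = peak_rss_gb()
buffer = bytearray(50 * 1024 * 1024)  # placeholder for the actual model load
after = peak_rss_gb()
print(f"peak RSS before: {before:.2f} GB, after: {after:.2f} GB")
```

Because this value is a high-water mark, it never decreases, so it captures the load-time peak even after memory usage has dropped again.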
