How to load only the PyTorch shards and not the safetensors files, so that only the PyTorch model is loaded onto the GPU from Hugging Face?

#16
by bilwa99 - opened

Hi All,

How can I load only the PyTorch shards and not the safetensors files, so that only the PyTorch model is loaded onto the GPU from Hugging Face?

Looking at the model card, the PyTorch shards and the safetensors files are 14.5 GB each, and when I load the model the total space used is 29 GB.
I want only the PyTorch shards, so the memory footprint should be 14.5 GB.

Please give me the correct code to do this for the model "Intel/neural-chat-7b-v3-1".
I want it to load only pytorch_model-00001-of-00002.bin and pytorch_model-00002-of-00002.bin.

It must not load model-00001-of-00002.safetensors or model-00002-of-00002.safetensors.

Please include the code for invoking, loading, and using the model.

Thanks.
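
One possible approach, as a minimal sketch rather than a confirmed recipe for this repo: skip the safetensors files at download time with `snapshot_download(..., ignore_patterns=...)` from `huggingface_hub`, then pass `use_safetensors=False` to `from_pretrained` so that only the `.bin` shards are read. The helper names (`wanted`, `load_bin_only`, `demo`), the float16 dtype, the prompt, and the single `"cuda"` device are illustrative assumptions, not something stated in this thread. Note that the 29 GB figure most likely reflects both weight formats sitting on disk; GPU memory use depends on the dtype the weights are loaded in, not on how many files were downloaded.

```python
from fnmatch import fnmatch

MODEL_ID = "Intel/neural-chat-7b-v3-1"

# Files of the weight format we do not want to download.
IGNORE_PATTERNS = ["*.safetensors", "*.safetensors.index.json"]


def wanted(filename: str) -> bool:
    """Return True if a repo file would be kept under IGNORE_PATTERNS."""
    return not any(fnmatch(filename, pattern) for pattern in IGNORE_PATTERNS)


def load_bin_only(model_id: str = MODEL_ID):
    """Download everything except safetensors and load the .bin shards (hypothetical helper)."""
    # Imported lazily so `wanted` can be used without these packages installed.
    import torch
    from huggingface_hub import snapshot_download
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Fetch the repo snapshot while skipping the safetensors files.
    local_dir = snapshot_download(repo_id=model_id, ignore_patterns=IGNORE_PATTERNS)

    tokenizer = AutoTokenizer.from_pretrained(local_dir)
    model = AutoModelForCausalLM.from_pretrained(
        local_dir,
        use_safetensors=False,      # read the pytorch_model-*.bin shards
        torch_dtype=torch.float16,  # assumption: half precision for a 7B model
    )
    return tokenizer, model.to("cuda")  # assumption: one GPU with enough memory


def demo() -> None:
    """Load the model and generate a short completion (assumed prompt)."""
    tokenizer, model = load_bin_only()
    inputs = tokenizer("Hello, how are you?", return_tensors="pt").to("cuda")
    output_ids = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Calling `demo()` runs the download, load, and generation steps end to end. If the model is loaded directly with `from_pretrained("Intel/neural-chat-7b-v3-1", use_safetensors=False)` instead, the explicit `snapshot_download` step may not be needed, since `from_pretrained` should fetch only the checkpoint files it actually uses.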
