Technical Specs required to run the XL

#15
by ishaan812 - opened

Hey, I've been running the XL model mostly on the CPU. I just wanted to know the exact tech specs, if there are any, for running it on a GPU, since I want to upgrade. Also, the GPU version will be faster, right, if I'm not misunderstanding? Thanks for the help!

NLP Group of The University of Hong Kong org

Hi, thanks a lot for your interest in INSTRUCTOR!

It is possible to run the XL model on a GPU; others have done so with 24GB of memory. Memory usage stays reasonable as long as you control the batch size.
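For reference, here is a minimal sketch of what that looks like with the InstructorEmbedding package; the `batch_size=8` is just an assumed starting point to tune against your available VRAM, and the instruction string is modeled on the examples in the INSTRUCTOR README:

```python
# Minimal sketch: embedding document chunks with instructor-xl on a GPU.
# batch_size=8 is an assumption to keep VRAM usage low; raise or lower it
# depending on how much memory your card has.
from InstructorEmbedding import INSTRUCTOR

model = INSTRUCTOR('hkunlp/instructor-xl', device='cuda')

pairs = [
    ['Represent the document for retrieval:', 'First document chunk ...'],
    ['Represent the document for retrieval:', 'Second document chunk ...'],
]
embeddings = model.encode(pairs, batch_size=8, show_progress_bar=True)
print(embeddings.shape)  # (num_pairs, 768)
```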

Feel free to add any further questions or comments!

Hi, can you please tell me the minimum GPU required to run this model? I'm using it for research: I have various documents split into 27,700 chunks, each of length 1,000. My laptop has 16GB RAM, an RTX 2070 Max-Q GPU with 8GB VRAM, 8GB of Intel Optane memory, and an 11th-gen i7. If my specs aren't sufficient, can I run the embedding process on Colab Pro?
It would be great if you could help me out by answering this query.

Long story short: 8GB will usually not be enough; I tried and ran into memory-allocation errors many times. Colab Pro will work with the appropriate runtime, but don't forget that (if you use the embeddings for information retrieval) you also need to run the model at query time, because you need to embed your questions. You'll run into the same memory-related problems there as when creating the embeddings.
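To make that concrete, a hedged sketch of the query-time step (the instruction string follows the pattern from the INSTRUCTOR examples; the vector-store call at the end is a placeholder, not a real API):

```python
from InstructorEmbedding import INSTRUCTOR

# The same model has to be loaded at query time to embed incoming questions,
# so it occupies the same VRAM as it did during indexing.
model = INSTRUCTOR('hkunlp/instructor-xl', device='cuda')

query = 'What GPU do I need to run instructor-xl?'
query_emb = model.encode(
    [['Represent the question for retrieving supporting documents:', query]]
)
# hits = index.search(query_emb, k=5)  # placeholder for your vector store's API
```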

I tried out the instructor-xl model on Colab Pro, and it can indeed be done if you use a high-memory runtime with a premium GPU (I think the one assigned to me had 16GB of VRAM). I also tried the free version of Colab, and it runs out of memory at some point, crashing the environment.
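If you want to confirm which GPU Colab actually assigned you, a quick check with plain PyTorch works:

```python
import torch

# Print the GPU model and total VRAM of the assigned runtime.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f'{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM')
else:
    print('No GPU assigned to this runtime.')
```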

Assuming you are indexing the embeddings in a vector store for information retrieval: you will of course need to load the model at retrieval time too. On my laptop's GPU (a 2070 Max-Q with 8GB), I run out of VRAM quickly when running queries. It is possible to use the CPU instead, but it is a bit slower. You can of course also run the queries in Google Colab.
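As a sketch of that CPU fallback (slower, but it sidesteps the 8GB VRAM ceiling; the instruction string is again an assumption modeled on the INSTRUCTOR examples):

```python
from InstructorEmbedding import INSTRUCTOR

# Loading on CPU avoids GPU out-of-memory errors entirely, at the cost of speed.
model = INSTRUCTOR('hkunlp/instructor-xl', device='cpu')

query_emb = model.encode(
    [['Represent the question for retrieving supporting documents:',
      'minimum hardware for instructor-xl']]
)
```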

Hi @ishaan812 , can you please share the specs of the CPU you have successfully run the XL model on?
