Doing the encoding on the GPU

#7
by Luning-Yang

I'm trying to encode a massive amount of data using INSTRUCTOR. Here is what I did:

import torch
from transformers import AutoTokenizer
from InstructorEmbedding import INSTRUCTOR

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = INSTRUCTOR('hkunlp/instructor-large').to(device)
tokenizer = AutoTokenizer.from_pretrained('hkunlp/instructor-large')

However, I don't know how to properly convert the input data into tensors in order to use the GPU for encoding. Could you elaborate on this?

NLP Group of The University of Hong Kong org

Hi, thanks a lot for your interest in the INSTRUCTOR model!

You may need to move both the model and the texts you are encoding to the GPU. In practice, you don't have to convert the texts to tensors yourself: INSTRUCTOR's encode method tokenizes the inputs and moves each batch to the model's device for you, so the separate AutoTokenizer is not needed.
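A minimal sketch might look like the following, assuming the encode interface that INSTRUCTOR inherits from sentence-transformers; the instruction/sentence pair and the batch size are illustrative placeholders, not values from this thread:

import torch
from InstructorEmbedding import INSTRUCTOR

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Move the model weights to the GPU, as in your snippet.
model = INSTRUCTOR('hkunlp/instructor-large').to(device)

# INSTRUCTOR takes [instruction, text] pairs; these values are placeholders.
texts = [
    ['Represent the Science title:',
     '3D ActionSLAM: wearable person tracking in multi-floor environments'],
]

# encode() tokenizes internally and moves each batch to the model's device,
# so no manual tensor conversion is needed. Tune batch_size to your GPU memory.
embeddings = model.encode(texts, batch_size=32, show_progress_bar=True, device=device)
print(embeddings.shape)  # (len(texts), 768) for instructor-large

Passing device= to encode is optional here since the model is already on the GPU, but it makes the placement explicit when you scale up to a large corpus.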

Feel free to add any questions or comments!
