Improving the inference speed of this bart-large-mnli model

#15
by abhijit57

Hello,

I am working on a text classification research project with a dataset of about 500,000 rows, where each document is fairly long (70-100 tokens). I tried this model on an NVIDIA V100 32GB GPU with just 10 rows and 804 candidate labels, and it took 10 minutes; since the zero-shot pipeline runs one NLI forward pass per document-label pair, those 10 rows already mean 8,040 forward passes. I cannot reduce the candidate label list, as per the requirements. I also tried the Codon compiler and Numba to improve inference speed, but without much luck.
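For reference, here is roughly the invocation pattern, as a minimal sketch of the standard zero-shot pipeline with fp16 and batching as the obvious first speedups (the labels and documents below are placeholders, not my real data):

```python
import torch
from transformers import pipeline

# Zero-shot classification via NLI: each candidate label becomes a hypothesis
# that bart-large-mnli scores against the document (the premise).
classifier = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli",
    device=0,                   # GPU (e.g. the V100)
    torch_dtype=torch.float16,  # half precision roughly halves compute/memory
)

candidate_labels = ["label_1", "label_2"]  # placeholder for the 804 labels
docs = ["First document ...", "Second document ..."]  # placeholder rows

# batch_size batches the (document, label) pairs through the model;
# raise it until the GPU is saturated or memory runs out.
results = classifier(docs, candidate_labels=candidate_labels, batch_size=64)
print(results[0]["labels"][:5], results[0]["scores"][:5])
```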

Has anyone worked with a C++ BART implementation, or used DeepSpeed to speed up predictions for this model?
Any leads or help would be greatly appreciated, thank you.

+1, did you find a way?

Use DeepSpeed.
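Something along these lines; a minimal sketch of wrapping the model with DeepSpeed's inference engine (untested on this exact setup, and kernel-injection coverage for BART varies by DeepSpeed version):

```python
import torch
import deepspeed
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli")

# Wrap the model with DeepSpeed's inference engine; dtype=torch.float16
# converts weights to half precision and moves them to the GPU.
ds_model = deepspeed.init_inference(
    model,
    dtype=torch.float16,
    replace_with_kernel_inject=True,  # fuse ops where the architecture is supported
)

# NLI-style scoring: premise = document, hypothesis = "This example is {label}."
inputs = tokenizer(
    "The new GPU doubles training throughput.",   # placeholder document
    "This example is technology.",                # placeholder label hypothesis
    return_tensors="pt",
).to("cuda")

with torch.no_grad():
    logits = ds_model(**inputs).logits  # [contradiction, neutral, entailment]
entail_score = logits.softmax(dim=-1)[0, 2].item()
print(entail_score)
```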
