How to embed 5 Million documents?

#22 opened by leonshub

I want to use this model for my research and have 5 million texts to embed, each between 100 and 300 tokens long. What would be the fastest way to do that?
I guess my question boils down to: what would be the best batch_size?

I would suggest using the largest batch size that fits in your GPU memory, to keep the utilization rate as high as possible.
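In practice you can find that batch size empirically by doubling it until a forward pass fails. Here is a minimal sketch of that search; `try_batch` is a hypothetical callback you would supply, which runs one encoding batch and returns `False` if it hits an out-of-memory error:

```python
def find_largest_batch_size(try_batch, start=32, max_bs=4096):
    """Double the batch size until try_batch fails (e.g. CUDA OOM),
    and return the largest size that succeeded."""
    bs = start
    best = None
    while bs <= max_bs:
        if try_batch(bs):
            best = bs
            bs *= 2  # keep doubling while the batch still fits
        else:
            break
    return best
```

For a real model, `try_batch` would wrap one `model.encode(...)` call on dummy inputs of your maximum sequence length inside a `try/except torch.cuda.OutOfMemoryError`, so the probe reflects worst-case memory use.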

Another trick is to sort the documents by length, so that unnecessary padding can be minimized.
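The sorting trick can be sketched in plain Python as follows; `encode_batch` is a hypothetical callable (e.g. a wrapper around your model's encode function), and the key detail is restoring the original document order afterwards. Note that if you use sentence-transformers, `model.encode` already applies this length-based sorting internally:

```python
def encode_sorted_by_length(texts, encode_batch, batch_size=256):
    """Encode texts in batches of similar length to minimize padding,
    then return embeddings in the original input order."""
    # Indices of texts, shortest first, so each batch pads little.
    order = sorted(range(len(texts)), key=lambda i: len(texts[i]))
    embeddings = [None] * len(texts)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        batch_embs = encode_batch([texts[i] for i in idx])
        # Scatter results back to the original positions.
        for i, emb in zip(idx, batch_embs):
            embeddings[i] = emb
    return embeddings
```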

Also do not forget to use fp16/bf16 and flash attention if your GPU supports them.
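A small sketch of the dtype choice: bf16 is generally preferable when the GPU supports it (roughly Ampere and newer), since it has fp32's dynamic range and so overflows less than fp16. The commented-out usage below assumes sentence-transformers' `model_kwargs` pass-through and the transformers `attn_implementation` option; `"model-name"` is a placeholder:

```python
def pick_dtype(supports_bf16: bool) -> str:
    """Prefer bfloat16 when supported: same speed class as float16,
    but with float32's exponent range, so fewer overflow issues."""
    return "bfloat16" if supports_bf16 else "float16"

if __name__ == "__main__":
    # Assumed usage (not run here; requires a GPU and the libraries):
    # import torch
    # from sentence_transformers import SentenceTransformer
    # dtype = getattr(torch, pick_dtype(torch.cuda.is_bf16_supported()))
    # model = SentenceTransformer(
    #     "model-name",  # placeholder for the model on this page
    #     model_kwargs={
    #         "torch_dtype": dtype,
    #         "attn_implementation": "flash_attention_2",
    #     },
    # )
    # embeddings = model.encode(texts, batch_size=512)
    pass
```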
