Hosted API

#3
by tiagofreitas87 - opened

Is anyone running an API for embedding? Otherwise what is the best host for a serverless api to do embeddings?
Thanks

NLP Group of The University of Hong Kong org

Hi, thanks for your interest in the INSTRUCTOR model!

One good way to run the INSTRUCTOR model without a local GPU is to compute embeddings in Google Colab. Here is an example script: https://colab.research.google.com/drive/1P7ivNLMosHyG7XOHmoh7CoqpXryKy3Qt?usp=sharing
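Once you have embeddings (for instance from the Colab script above, via the model's `encode` call), ranking documents against a query is just cosine similarity. A minimal NumPy sketch, assuming the embeddings are already computed as arrays; the `top_k` helper name is mine, not from the notebook:

```python
import numpy as np

def top_k(query_emb, doc_embs, k=3):
    """Return indices of the k document embeddings most similar
    to the query embedding, by cosine similarity."""
    # Normalize so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sims = d @ q
    # Sort descending and keep the top k indices.
    return np.argsort(-sims)[:k]
```

With INSTRUCTOR you would pass `query_emb = model.encode([[instruction, query]])[0]` and stack the document embeddings into `doc_embs`; the ranking step itself needs no GPU.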

Hope this helps! Feel free to add further questions or comments!

I need a serverless API (pay-per-second), as it's not worth paying for a full GPU for now.

NLP Group of The University of Hong Kong org

Hi, thanks a lot for your question!

The Colab service is free, so you can try the INSTRUCTOR model there without paying for a GPU. Currently, we do not provide a hosted API for computing embeddings.


@tiagofreitas87 maybe this is relevant for you https://embaas.io

Thanks, but the Discord invite on that page isn't working, and MosaicML just launched an inference service that offers Instructor embeddings.
https://www.mosaicml.com/inference

They are more established, so it would be difficult to compete with them, unless embaas or another service provides cheaper/easier continuous fine-tuning for a specific domain; though there are privacy concerns for enterprises.

Disclaimer: I am also working on embaas.

Thank you for the hint. We fixed the invitation. And it's cool that MosaicML has just added Instructor. We are currently working on fine-tuning the model on other languages. :)

@tiagofreitas87 If you're up for the challenge and latency is not an issue for you, you could try https://github.com/maxsagt/lambda-instructor which lets you deploy Instructor-Large on an AWS Lambda. It runs at about 6 seconds per request and costs less than $0.001 per request (depending on setup).
