Ready-to-use Mistral-7B-Instruct-v0.1-GGUF model as an OpenAI API-compatible endpoint

#29
by limcheekin - opened

Hi there,

I deployed the model as an OpenAI API-compatible endpoint at https://huggingface.co/spaces/limcheekin/Mistral-7B-Instruct-v0.1-GGUF.

I also created a Jupyter notebook to get you started using the API endpoint in no time.
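For anyone who wants to skip the notebook, here is a minimal sketch of building an OpenAI-style chat completion request against the Space. The base URL and model name below are my assumptions (check the Space's README for the exact values), and the request-sending lines are commented out so the sketch runs offline:

```python
import json

# Assumed base URL of the Space's OpenAI-compatible API; verify in the Space's README.
BASE_URL = "https://limcheekin-mistral-7b-instruct-v0-1-gguf.hf.space/v1"

# Standard OpenAI-style chat completion request body.
payload = {
    "model": "mistral-7b-instruct-v0.1",  # model name assumed; some servers ignore it
    "messages": [
        {"role": "user", "content": "Explain GGUF quantization in one sentence."}
    ],
    "max_tokens": 128,
    "temperature": 0.7,
}

# To actually call the endpoint (requires network access):
# import requests
# r = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=120)
# print(r.json()["choices"][0]["message"]["content"])

print(json.dumps(payload, indent=2))
```

Because the endpoint follows the OpenAI wire format, the official `openai` client or LangChain's OpenAI integrations should also work by pointing their base URL at the Space.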

Lastly, if you find this resource valuable, your support in the form of starring the space would be greatly appreciated.

Thank you.

First of all, thanks for this! But do we need to purchase OpenAI credits for this? I am a beginner.


It's compatible with OpenAI's API, not running on it. I'm unsure whether the HF API is free, but any charges would be theirs, not OpenAI's.

It is free of charge.
But I think HF definitely has a cap on the number of requests that can be made to free-tier HF Spaces per hour or per day. Does anyone here know the cap?

That's the reason your support is important.

Alternatively, you can duplicate the space and run your own instance for free.

I'm seeing this warning: "Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained." Are the embeddings of mistralai/Mistral-7B-Instruct-v0.1 fine-tuned?


Thank you so much for this, much appreciated! I used it to build an app for question answering over a PDF. I just wanted to ask: when I use this model through this Space, which API does the request actually hit? Is it OpenAI's, or Hugging Face's API for this model? HF's API wasn't giving the full output (only around 10 tokens) when used through LangChain. And if it's OpenAI, how is it free? Please help, I am a beginner.

Thanks a lot, much appreciated, it works like a charm! I do have one little issue, however: I get this weird output where every new sentence starts with a number. Do you know why that might be the case?
[Screenshot: Screenshot 2023-10-02 at 5.14.07 PM.png]


It is neither the OpenAI API nor the Hugging Face API. The Space runs on the following generous free tier from HF:

| Hardware | GPU Memory | CPU | Memory | Disk | Hourly Price |
|----------|------------|-----|--------|------|--------------|
| CPU Basic | - | 2 vCPU | 16 GB | 50 GB | Free! |

You can find more information at https://huggingface.co/docs/hub/spaces-overview#hardware-resources

You can use the free Space for hosting open-source text embeddings models such as BAAI/bge-large-en, intfloat/e5-large-v2, sentence-transformers/all-MiniLM-L6-v2, sentence-transformers/all-mpnet-base-v2, etc., as an OpenAI API-compatible embeddings endpoint using the following Python package:
https://github.com/limcheekin/open-text-embeddings
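As a quick sanity check of any OpenAI-compatible embeddings endpoint, you can compare the returned vectors with cosine similarity. A small sketch follows; the request body shape is the standard OpenAI embeddings format, the model name is illustrative, and the actual HTTP call is commented out since it needs network access:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# OpenAI-style embeddings request body (model name is illustrative):
payload = {
    "model": "sentence-transformers/all-MiniLM-L6-v2",
    "input": ["GGUF is a quantized model format.", "What format is GGUF?"],
}

# To actually call an endpoint (requires network access):
# import requests
# r = requests.post("<endpoint>/v1/embeddings", json=payload, timeout=60)
# vectors = [d["embedding"] for d in r.json()["data"]]
# print(cosine_similarity(vectors[0], vectors[1]))

print(cosine_similarity([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # identical vectors → 1.0
```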


I get similar output and am not sure why. Perhaps you need to play around with the prompts, or try the original unquantized model weights.
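One thing worth checking when playing with the prompts: Mistral-7B-Instruct expects its `[INST] ... [/INST]` instruction template, and sending raw text without it is a common cause of oddly formatted completions. A minimal sketch of the template, simplified from the model card (verify the exact special tokens there):

```python
def format_mistral_prompt(messages):
    """Wrap chat turns in Mistral-Instruct's [INST] template (simplified sketch)."""
    prompt = "<s>"
    for m in messages:
        if m["role"] == "user":
            prompt += f"[INST] {m['content']} [/INST]"
        elif m["role"] == "assistant":
            # Assistant turns are followed by the end-of-sequence token.
            prompt += f" {m['content']}</s>"
    return prompt

print(format_mistral_prompt([{"role": "user", "content": "Summarize this PDF."}]))
# → <s>[INST] Summarize this PDF. [/INST]
```

A chat-completions server normally applies this template for you; it matters mainly when you hit a raw `/completions`-style endpoint with your own prompt string.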

Hi there,

I just enabled (turned on) the embeddings endpoint. Go ahead and test it out yourself; I'd highly appreciate it if you could share your results here on how it compares to other open-source text embeddings models such as BAAI/bge-large-en, intfloat/e5-large-v2, sentence-transformers/all-MiniLM-L6-v2, sentence-transformers/all-mpnet-base-v2, etc.

By the way, I just created the same endpoints for Mistral-7B-OpenOrca-GGUF model at https://huggingface.co/spaces/limcheekin/Mistral-7B-OpenOrca-GGUF.


Use vLLM: https://github.com/vllm-project/vllm


Thanks for sharing. Does vLLM support GGUF models?


Not sure. Why not try it out?

python -m vllm.entrypoints.openai.api_server --model=

I haven't used GGUF with vLLM before.


I am very much focused on using GGUF models, so I will pass on it for now.
