APIs & Fine-tuning (both Domain Adaptation and instruction fine-tuning) on a $/1000-token basis

#12
by KrishnaKaasyap - opened

I love this model. It is probably the best 7B model out there (as of October 2023). Thanks to the Mistral team & the 🤗 team. In fact, Zephyr-7b-alpha is a bit better than Mistral Instruct, and the UI in the Space that hosts Zephyr is awesome. And thanks a lot for open-sourcing the UI.

But I cannot understand one thing: almost every cloud provider, from the biggest to the smallest, offers models on a $/GPU/hour basis!

As far as I know, only Together.ai offers OPEN models via API in a $/1000-token pricing format. Not only that, they have standard per-1000-token rates based on model size (billions of parameters). Any 7-billion-parameter model costs $0.0002/1k tokens on their cloud. You don't have to pay for hosting, you don't have to painstakingly select GPU/CPU instances, you don't have to plan for demand spikes, and so on. You only pay for what you use, whenever you use it, with whichever model you use! That is a sheer brilliant pricing strategy.
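To make the appeal concrete, here is a minimal sketch of the arithmetic. The $0.0002/1k-token rate is the Together.ai figure cited above; the GPU-hour rate, throughput, and utilization numbers are illustrative assumptions, not quoted prices:

```python
# Per-token pricing vs. renting a GPU by the hour.
# Only PER_1K_TOKEN_RATE comes from the post; the rest are assumptions.

PER_1K_TOKEN_RATE = 0.0002   # $/1k tokens for a 7B model (Together.ai figure)
GPU_HOUR_RATE = 1.10         # $/GPU/hour (assumed, A10G-class instance)
TOKENS_PER_SECOND = 1000     # assumed serving throughput for a 7B model

def per_token_cost(tokens: int) -> float:
    """Cost when billed per 1k tokens: you pay only for tokens processed."""
    return tokens / 1000 * PER_1K_TOKEN_RATE

def gpu_hour_cost(tokens: int, utilization: float) -> float:
    """Cost when renting a GPU: idle time is billed too, so low
    utilization inflates the effective per-token price."""
    busy_seconds = tokens / TOKENS_PER_SECOND
    billed_hours = busy_seconds / 3600 / utilization
    return billed_hours * GPU_HOUR_RATE

million = 1_000_000
print(per_token_cost(million))       # flat, regardless of traffic pattern
print(gpu_hour_cost(million, 1.0))   # GPU fully utilized
print(gpu_hour_cost(million, 0.05))  # spiky traffic, 5% utilization
```

The point of the sketch: at full utilization the two models are comparable, but with spiky, low-utilization traffic the GPU-hour bill dwarfs the per-token one, which is exactly why per-token pricing is attractive to small users.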

But even they don't offer fine-tuning (which, IMHO, includes both Domain Adaptation and instruction fine-tuning) on a $/1000-token basis.

OpenAI (the most closed company in the world) offers not just inference but even fine-tuning on a $/1000-token basis!

https://openai.com/pricing

If there is one thing 🤗 can do to improve its stake in the cloud business, and in the AI future in general, it is to offer the dozens of awesome open models it already possesses (including, and starting with, Zephyr-7b-alpha) via API in a $/1000-token pricing format. And please, please also offer fine-tuning in a $/1000-token pricing format.

You could charge a bit of a premium for fine-tuning offered in a $/1000-token pricing format, because all the user has to do is supply the dataset in the system/prompt/output format! OpenAI charges a hefty premium for fine-tuning and inference of even the basic Babbage model (which is estimated to have 3B parameters, though that is not confirmed).
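For context on why token-based fine-tuning billing is so easy to reason about: the total training cost is roughly dataset tokens × epochs × per-1k-token rate. A sketch with a hypothetical rate (not OpenAI's actual price; see the pricing page linked above for real numbers):

```python
def finetune_cost(dataset_tokens: int, epochs: int, rate_per_1k: float) -> float:
    """Token-based fine-tuning bill: you pay for every training token seen,
    i.e. the dataset size multiplied by the number of epochs."""
    return dataset_tokens * epochs / 1000 * rate_per_1k

# A 2M-token dataset trained for 3 epochs at a hypothetical
# $0.004 per 1k training tokens:
print(f"${finetune_cost(2_000_000, 3, 0.004):.2f}")  # prints "$24.00"
```

No GPU selection, no hosting decisions: the user can price a fine-tuning run from the dataset size alone, which is the ease-of-use argument in a nutshell.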

I know it is a huge challenge to offer multiple models for fine-tuning in a $/1000-token pricing format, but you can offer at least one model, right?

Like Together AI, you could offer many models via API, and at least offer Mistral (base and instruct) and Zephyr-7b-alpha (chat) for fine-tuning in a $/1000-token pricing format.

The two defining features and huge advantages of open models are customisation plus ease of use, and security. Since the code (and sometimes the datasets) is in the open, the security aspect of open source is already achieved. The only pending aspect is customisation plus ease of use.

Trust me, there are millions of people who don't have the technical expertise but do have the need to use Gen AI for their businesses, and the current methodologies are in no way creating ease of use for those people without technical expertise!

If the closed-source models are moving so fast in supporting fine-tuning, why can't the open community, with its much larger support base and much bigger wiggle room, do the same?

More power to the open-source community. Let the wisdom of the commons win.

[Attached image: IMG_20231015_153139.jpg]

I would love to see Mistral AI providing fine-tuning (both Domain Adaptation and instruction fine-tuning) in a $/1000-token pricing format! Since you created this legendary 7B model, you'll know how to host, fine-tune, and serve it effectively and efficiently, in the best way possible. I bet the CoreWeave folks would be interested in this!

@glample
@arthurmensch
@devendrachaplot

@vipul
@orangetin
@mauriceweber

Tagging y'all just to show my appreciation for what you've done for the open-source community.

And also to truly understand why none of the cloud providers offer fine-tuning (both Domain Adaptation and instruction fine-tuning) on a $/1000-token basis.

BTW, I adore the 32k LLaMA 7B. You guys did it way before the LongLLaMA project from Meta AI!

Hugging Face H4 org

Hi @KrishnaKaasyap, thanks for the super detailed and useful feedback! Regarding fine-tuning, have you tried AutoTrain (https://huggingface.co/autotrain)? It doesn't offer a $/1000-token pricing format because it supports many modalities beyond text, but it certainly does support LLM fine-tuning :)

cc @abhishek for vis

> have you tried AutoTrain (https://huggingface.co/autotrain)? It doesn't offer a $/1000 token pricing format because it supports many modalities beyond text, but it certainly does support LLM fine-tuning :)

Yes, but it does not let me select my model of choice (in my case, either Llama 7B or Mistral 7B) via the "manual" selection. And even after selecting the "automatic" option, I am unable to proceed to the next stage; an "internal server error" message appears!

And thanks for the response and suggestions, dude, much appreciated. 🙏🏼

[Attached screenshot: Screenshot_20231019-112728_Chrome.jpg]

Hugging Face H4 org
edited Oct 19, 2023

For LLM fine-tuning, you should use AutoTrain Advanced. The old UI is deprecated.
docs: https://hf.co/docs/autotrain
