question on computation cost

#3
by MicFizzy - opened

Hey BK-Lee,

I'm super impressed with your open-source model on Hugging Face and am considering it for a Chinese AI Q&A app on WeChat. However, I'm torn between developing with your open-source model and using APIs from big tech companies, which offer good performance at very low costs.

Right now, I have a 4090 GPU that runs a 20B model fine for a few users, but I'm worried about how the costs could skyrocket if the user base grows significantly. Could you share any insights on the necessary hardware if many users are online simultaneously? I’m trying to figure out the potential computational costs to better decide between the open-source approach and a commercial API.

Thanks a ton for your help and for making such an awesome tool available to everyone!

Cheers,

Liz

Honestly, I can't give a definitive answer here, because I don't have any experience handling queries from a large number of users.

To serve billions of users, OpenAI is known to have built out a large server infrastructure. There has also been a rumor that OpenAI used 128k GPUs for training and 128 GPUs for inference. Based on those numbers, my rough guess is that at least around 100 GPUs (A100-class, large-VRAM GPUs) would be needed once your user base grows significantly.
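If it helps your planning, here is a minimal back-of-envelope sketch of how such an estimate can be made. All the numbers (tokens per request, requests per user, per-GPU throughput) are illustrative assumptions, not measurements of this model or any specific GPU:

```python
import math

# Rough capacity estimate: how many GPUs are needed to keep up with
# the aggregate token-generation demand of concurrent users.
# Every default value below is an assumption for illustration only.
def gpus_needed(concurrent_users: int,
                tokens_per_request: int = 512,
                requests_per_user_per_min: float = 1.0,
                gpu_throughput_tok_s: float = 1000.0) -> int:
    """Estimate the GPU count from assumed per-user demand and per-GPU throughput."""
    demand_tok_s = (concurrent_users
                    * requests_per_user_per_min
                    * tokens_per_request) / 60.0
    return max(1, math.ceil(demand_tok_s / gpu_throughput_tok_s))

# Example: 10,000 concurrent users at the assumed rates
print(gpus_needed(10_000))  # → 86
```

In practice you would replace the assumed throughput with a measured tokens/sec figure from your own serving stack (batching and quantization change it a lot), but the arithmetic structure stays the same.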
