
Inference on TPU-v3-32

#68
by zhiG - opened
No description provided.

I think using a TPU-v3-32 might be the most cost-effective way to run BLOOM inference.

BigScience Workshop org

Hi @zhiG!
Thanks for the suggestion! We have already worked on TPU inference for BLOOM; you can take a look here: https://github.com/huggingface/bloom-jax-inference
Let us know if you have any more questions :)

BigScience Workshop org

Also, could you please move this into a discussion instead of a PR? 🙏 Thank you!

ybelkada changed pull request status to closed

Thank you for the information! I am quite new to the Hugging Face community, and I'm sorry, but I can't find a way to move it to a discussion. I can't even delete it.

BigScience Workshop org

No worries at all! Let me create a discussion and ping you there ;)
