Hear, hear, AMD MI300Xs have started to emerge much sooner than expected.
Here is a two-part benchmark report on BLOOM-176B inference using @MSFTDeepSpeed optimized for AMD MI300X.
1. https://www.evp.cloud/post/diving-deeper-insights-from-our-llm-inference-testing
2. https://www.evp.cloud/post/diving-deeper-insights-from-our-llm-inference-testing-part-2
This was published in response to our BLOOM-176B super-fast inference blog post https://huggingface.co/blog/bloom-inference-pytorch-scripts
Note that these have 192GB of HBM!
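To put the 192GB figure in context, here is a rough back-of-the-envelope sketch of how many accelerators it takes just to hold the BLOOM-176B weights. The assumptions are mine, not from the linked reports: a nominal 176e9 parameters, 2 bytes per parameter (bf16/fp16), 1 GB = 10^9 bytes, and an 80GB card as the comparison point. It ignores the KV cache, activations, and framework overhead, so real deployments need more headroom.

```python
import math


def min_devices_for_weights(n_params: float, bytes_per_param: int, hbm_gb: float) -> int:
    """Smallest device count whose combined HBM can hold the weights alone.

    Hypothetical helper for illustration; it does not account for the
    KV cache, activations, or any runtime overhead.
    """
    weight_gb = n_params * bytes_per_param / 1e9  # assumes 1 GB = 10^9 bytes
    return math.ceil(weight_gb / hbm_gb)


BLOOM_PARAMS = 176e9  # nominal parameter count (assumption: rounded)

# 2 bytes per parameter (bf16/fp16) gives 352 GB of weights.
devices_192gb = min_devices_for_weights(BLOOM_PARAMS, 2, 192)
devices_80gb = min_devices_for_weights(BLOOM_PARAMS, 2, 80)  # 80GB card assumed for comparison

print(f"192GB devices needed for weights: {devices_192gb}")
print(f"80GB devices needed for weights: {devices_80gb}")
```

Under these assumptions the weights alone fit on 2 devices with 192GB each, versus 5 devices with 80GB each.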
The NVIDIA monopoly is strong, but it will have to start sharing the pie, which will hopefully drive costs down at least somewhat.
Thanks to https://www.linkedin.com/in/eliovp for sharing this writeup with me.
P.S. At the PyTorch conference in the fall, the AMD representative said MI300X would become available to us mortals in Q4-2024/Q1-2025.