At least 1x A100 80G or what?
Maybe use a GGML q4 quant to bring the model size down to around 100GB, then run it on CPU (using bloomz.cpp)? It might be very slow, but it should be possible with 128GB or more of RAM and a powerful CPU.
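As a rough sanity check on the "around 100GB" figure: assuming BLOOMChat has ~176B parameters (it is BLOOM-based; the exact count and the per-block quantization overhead here are approximations, not from this thread), 4-bit weights land in that ballpark:

```python
# Back-of-envelope size estimate for a 4-bit quantized ~176B model.
# 4.5 bits/weight approximates q4_0 (4-bit weights + per-block scales).
params = 176e9          # assumed parameter count (BLOOM-based model)
bits_per_weight = 4.5   # approximate, including quantization overhead
size_gb = params * bits_per_weight / 8 / 1e9
print(f"~{size_gb:.0f} GB")  # ~99 GB, consistent with "around 100GB"
```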
What about int8 mode and disk offloading?
Is there any way to slim the model down with QLoRA-style methods?
Hey folks! Please follow https://github.com/sambanova/bloomchat here. For bf16 inference the minimum requirement is 880GB of A100 memory; for int8 inference the minimum requirement is 480GB of A100 memory. We are exploring other compression techniques and welcome any suggestions/contributions!