Can I run this model with 2x RTX 3090?

#4
by huytungst - opened

I am planning to build a workstation as follows:

  • CPU: AMD 3960x (24 core)
  • GPUs: 2x (dual) RTX 3090
  • RAM: 128GB

Can I run inference on (or fine-tune) this model?

  • falcon-180b-chat.Q6_K.gguf (Max RAM required 150.02 GB)

If it's not possible, I'll probably have to adjust my budgets and plans.

Thank you for your great work.

You might be able to, though I'm not certain, by using this:
https://github.com/huggingface/text-generation-inference
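
For a rough sanity check on the numbers in the question, here is a minimal sketch of the memory arithmetic. It uses the figures from the post (128 GB RAM, 150.02 GB max RAM required) plus the 24 GB of VRAM on each RTX 3090. The function name `memory_budget` is made up for illustration, and the "combined" check assumes the weights can be split between VRAM and system RAM (for example through layer offloading in a GGUF runtime), which is an assumption and not something confirmed in this thread. It covers loading the weights for inference only and says nothing about fine-tuning.

```python
# Figures taken from the post; 24 GB is the VRAM size of one RTX 3090.
GPU_COUNT = 2
VRAM_PER_GPU_GB = 24.0
SYSTEM_RAM_GB = 128.0
MODEL_MAX_RAM_GB = 150.02  # "Max RAM required" for falcon-180b-chat.Q6_K.gguf


def memory_budget(gpu_count, vram_per_gpu_gb, system_ram_gb, model_gb):
    """Compare a model's stated memory need against VRAM, RAM, and both combined.

    Hypothetical helper: the combined check assumes the runtime can split the
    model between GPU and CPU memory, and it ignores context/KV-cache and OS
    overhead.
    """
    total_vram = gpu_count * vram_per_gpu_gb
    combined = total_vram + system_ram_gb
    return {
        "total_vram_gb": total_vram,
        "combined_gb": combined,
        "fits_in_vram": model_gb <= total_vram,
        "fits_in_ram": model_gb <= system_ram_gb,
        "fits_combined": model_gb <= combined,
        "headroom_gb": round(combined - model_gb, 2),
    }


if __name__ == "__main__":
    budget = memory_budget(GPU_COUNT, VRAM_PER_GPU_GB, SYSTEM_RAM_GB, MODEL_MAX_RAM_GB)
    for key, value in budget.items():
        print(f"{key}: {value}")
```

Under these assumptions the model fits in neither the 48 GB of VRAM nor the 128 GB of system RAM alone, and only fits when both are counted together, with limited headroom left for the context and the operating system.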