It's a bit difficult to deploy the 70B model for verification, so let's keep an eye on how things develop

#4
by wawoshashi - opened

Deploying the 70B model myself for verification is a bit difficult; I'll keep an eye on how things develop.

Try this quantized version https://huggingface.co/TheBloke/Xwin-LM-70B-V0.1-GGUF, which needs only a 48GB VRAM card, or about 40GB of RAM for CPU-only inference.

You can try it now with llama.cpp.
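A minimal sketch of a CPU-only run through the llama-cpp-python bindings; the model filename and thread count are assumptions, so substitute whichever .gguf file you actually downloaded from the repo:

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# Filename and quantization level are assumptions; use the .gguf file you downloaded.
llm = Llama(
    model_path="./xwin-lm-70b-v0.1.Q4_K_M.gguf",
    n_ctx=4096,    # context window
    n_threads=8,   # CPU threads; tune for your machine
)

out = llm("USER: Hello, who are you?\nASSISTANT:", max_tokens=128)
print(out["choices"][0]["text"])
```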

There is also a 7B GPTQ version, https://huggingface.co/TheBloke/Xwin-LM-7B-V0.1-GPTQ, which needs only 6GB of VRAM.
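A hedged sketch of loading that GPTQ checkpoint through transformers (requires optimum and auto-gptq installed; the prompt and generation parameters are placeholders):

```python
# pip install transformers optimum auto-gptq
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Xwin-LM-7B-V0.1-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" places the quantized weights on the GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("USER: Hello, who are you?\nASSISTANT:", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```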

I can run the 70B quantized GGUF model (Q3_K_S, with 60 of 83 layers offloaded to the GPU) on a 3090 via llama.cpp.
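For reference, the llama-cpp-python bindings expose that partial offload as n_gpu_layers (needs a CUDA-enabled build; the filename is again an assumption):

```python
from llama_cpp import Llama

# Q3_K_S filename is an assumption; 60 of the model's 83 layers go to the GPU,
# the rest stay in system RAM, which fits a 24GB 3090.
llm = Llama(
    model_path="./xwin-lm-70b-v0.1.Q3_K_S.gguf",
    n_gpu_layers=60,
    n_ctx=4096,
)
print(llm("USER: Hi\nASSISTANT:", max_tokens=64)["choices"][0]["text"])
```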
