Am I supposed to download the entire thing?

#1
by plowthat1998 - opened

I'm confused about whether I'm supposed to download the entire model library for this model and, if so, what the RAM requirements are. If it is the whole model library, how do I put the safetensors files together to use them? Or are there different options for the download?

Correct, this repo contains the entire model in fp16 precision. It's intended for further conversions, as you'd need crazy hardware to run this model in fp16 (a 70B model at 2 bytes per weight is roughly 140 GB for the weights alone).

If you are asking these questions, you probably shouldn't be downloading this.

This repo contains the quantized model intended to run locally (provided you have 2x 3090 or 4090, or an A6000):
https://huggingface.co/Doctor-Shotgun/lzlv-limarpv3-l2-70b-exl2
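
For anyone wanting rough numbers behind the fp16 versus quantized distinction, here is a back-of-the-envelope sketch in Python. It assumes the "70b" in the repo name means about 70 billion parameters, and it counts only the weights, so context cache and runtime overhead come on top. The 4 bits per weight figure is purely illustrative and is not the actual bit rate of the linked exl2 quant; the helper names are made up for this example.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits_in_vram(weights_gb: float, vram_gb: float) -> bool:
    """True if the weights alone fit; real usage needs extra headroom."""
    return weights_gb < vram_gb


# Assumption: "70b" in the repo name means roughly 70 billion parameters.
N_PARAMS = 70e9

# Total VRAM for the setups mentioned above:
# 2x 3090 or 2x 4090 is 2 * 24 GB, and a single RTX A6000 is 48 GB.
VRAM_GB = 48

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # full-precision repo
quant_gb = weight_memory_gb(N_PARAMS, 4)  # illustrative 4-bit quant

print(f"fp16 weights:  {fp16_gb:.0f} GB, fits in {VRAM_GB} GB: {fits_in_vram(fp16_gb, VRAM_GB)}")
print(f"4-bit weights: {quant_gb:.0f} GB, fits in {VRAM_GB} GB: {fits_in_vram(quant_gb, VRAM_GB)}")
```

Under these assumptions the fp16 weights are far larger than 48 GB of VRAM, while a quant at around 4 bits per weight leaves some room for context.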

Is there a version of this that’s intended to run locally on RAM and VRAM?

You'd want a GGUF version for that, although it will run extremely slowly if it doesn't fit entirely in VRAM.
