how is it used?

#48
by Codigoabierto - opened

When I enter the test, an error appears.

@Codigoabierto It isn't a test for inference.

Will they provide one to test that it works?

The model is 314B. I don't think so yet, as this is still a raw model. It would need to be quantised first, since the model is around 228 GB.

If someone creates a 4-bit quantized version of the 314B model, the weights alone would be around ~157 GB, which means it would still need around that much VRAM minimum, right?

Considering that you need about 30-40 GB of VRAM to run inference with a 70B model.
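A back-of-the-envelope sketch of that estimate: weight memory is roughly parameter count times bits per weight. This ignores KV cache, activations, and runtime overhead, so real VRAM needs are higher; the function name and numbers are illustrative, not from the thread.

```python
def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Estimate weight memory in decimal GB: params * bits / 8 bytes each."""
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

# 314B parameters at 4 bits per weight -> 157.0 GB (weights only)
print(weight_memory_gb(314, 4))
# 70B parameters at 4 bits per weight -> 35.0 GB, matching the 30-40 GB rule of thumb
print(weight_memory_gb(70, 4))
```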

Yes, probably. Probably even more VRAM.
