Text Generation
Transformers
Safetensors
English
llama
text-generation-inference
4-bit precision

Fix documentation for loading the model, since the fused attention module doesn't work here either.

#4 opened by mber
No description provided.
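For context, a minimal sketch of the kind of loading change this documentation fix points at, assuming the model is loaded with AutoGPTQ: the fused attention injection is skipped by passing `inject_fused_attention=False` to `from_quantized`. The repository id below is a placeholder, and the exact wording in the README may differ.

```python
# Sketch only: load a 70B GPTQ model with AutoGPTQ, with the fused attention
# module disabled, since it does not work for 70B models.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_name_or_path = "TheBloke/Llama-2-70B-GPTQ"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    use_safetensors=True,
    device="cuda:0",
    inject_fused_attention=False,  # fused attention is not supported for 70B models
)

prompt = "Tell me about AI"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda:0")
output = model.generate(input_ids=input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```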

Ah, thank you! Yes, I need that for all 70B models.

TheBloke changed pull request status to merged

With pleasure :)
