Edit model card

Saiga/Llama3 8B, Russian Llama-3-based chatbot

4bit AWQ-quantized version of Saiga/Llama3 8B (Version 4) by Ilya Gusev.

Quantization parameters:

  • Version: GEMM
  • Group size: 128
  • Zero point: True

Quantization dataset: Den4ikAI/russian_instructions_2 formatted in Llama3 prompt format with Saiga default system prompt.

Downloads last month
21
Safetensors
Model size
1.98B params
Tensor type
FP16
·
I32
·
Inference API
Input a message to start chatting with alekosus/saiga_llama3_8b_awq.
This model can be loaded on Inference API (serverless).