PULI LlumiX 32K instruct (6.74 billion parameters)

Instruct fine-tuned version of NYTK/PULI-LlumiX-32K.

Provided files

Quant method   Bits   Use case
Q3_K_M         3      very small, high quality loss
Q4_K_S         4      small, greater quality loss
Q4_K_M         4      medium, balanced quality - recommended
Q5_K_S         5      large, low quality loss - recommended
Q5_K_M         5      large, very low quality loss - recommended
Q6_K           6      very large, extremely low quality loss
Q8_0           8      very large, extremely low quality loss - not recommended
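
As a rough guide to what "small" and "very large" mean in the table above, file size can be estimated from the nominal bit width as parameters × bits / 8. The sketch below is only an approximation: K-quant files mix bit widths across tensors and carry metadata, so actual files are typically somewhat larger than this estimate. The function name `approx_size_gb` is hypothetical.

```python
# Parameter count taken from the model card title.
N_PARAMS = 6.74e9

# Nominal bit widths from the table above (real files use mixed widths per tensor).
NOMINAL_BITS = {
    "Q3_K_M": 3, "Q4_K_S": 4, "Q4_K_M": 4,
    "Q5_K_S": 5, "Q5_K_M": 5, "Q6_K": 6, "Q8_0": 8,
}

def approx_size_gb(quant: str, n_params: float = N_PARAMS) -> float:
    """Rough size estimate in GB (10^9 bytes): params * nominal bits / 8."""
    return n_params * NOMINAL_BITS[quant] / 8 / 1e9

for name in NOMINAL_BITS:
    print(f"{name}: ~{approx_size_gb(name):.2f} GB (nominal estimate)")
```

The estimate is mainly useful for checking whether a given file is likely to fit in available RAM or VRAM before downloading it.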

Training platform

Runpod RTX 4090 GPU

Hyperparameters

  • Epochs: 3
  • LoRA rank (r): 16
  • LoRA alpha: 16
  • Learning rate: 2e-4
  • Learning rate scheduler: cosine
  • Optimizer: adamw_8bit
  • Weight decay: 0.01
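
To make two of the settings above concrete, the following sketch computes the standard LoRA scaling factor (alpha / r) and a cosine learning-rate curve starting at 2e-4. It assumes decay to zero with no warmup, neither of which is stated in this card, and `total_steps` is a hypothetical value; the function names are illustrative and not taken from the training code.

```python
import math

# Values copied from the hyperparameter list above.
LORA_RANK = 16
LORA_ALPHA = 16
BASE_LR = 2e-4

def lora_scaling(alpha: int = LORA_ALPHA, rank: int = LORA_RANK) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return alpha / rank

def cosine_lr(step: int, total_steps: int, base_lr: float = BASE_LR) -> float:
    """Cosine decay from base_lr to 0. Assumption: no warmup, final LR of 0."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

total_steps = 1000  # hypothetical; the real step count depends on batch size
print("LoRA scaling:", lora_scaling())
for step in (0, 250, 500, 750, 1000):
    print(step, cosine_lr(step, total_steps))
```

With rank and alpha both set to 16, the scaling factor is 1, so the low-rank update is added to the frozen weights without extra amplification.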

Dataset

boapps/szurkemarha

Only the Hungarian instructions were selected: about 53,000 prompts.

Prompt format: ChatML

<|im_start|>system
Egy segítőkész mesterséges intelligencia asszisztens vagy. Válaszold meg a kérdést legjobb tudásod szerint!<|im_end|>
<|im_start|>user
Ki a legerősebb szuperhős?<|im_end|>
<|im_start|>assistant
A legerősebb szuperhős a Marvel univerzumában Hulk.<|im_end|>
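
The template above can be assembled programmatically. The helper below is a minimal sketch, not part of any library: `build_chatml_prompt` is a hypothetical name, and placing a newline after each `<|im_end|>` is an assumption based on the layout shown above. The prompt ends with an open assistant turn so the model can continue from there.

```python
from typing import Dict, List

def build_chatml_prompt(messages: List[Dict[str, str]],
                        add_generation_prompt: bool = True) -> str:
    """Render role/content messages in the ChatML layout shown above."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    prompt = "".join(parts)
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the answer.
        prompt += "<|im_start|>assistant\n"
    return prompt

# System prompt from the example above. English: "You are a helpful
# artificial intelligence assistant. Answer the question to the best
# of your knowledge!"
SYSTEM = ("Egy segítőkész mesterséges intelligencia asszisztens vagy. "
          "Válaszold meg a kérdést legjobb tudásod szerint!")

prompt = build_chatml_prompt([
    {"role": "system", "content": SYSTEM},
    # English: "Who is the strongest superhero?"
    {"role": "user", "content": "Ki a legerősebb szuperhős?"},
])
print(prompt)
```

When generating, `<|im_end|>` is the natural stop sequence, since the template closes every turn with it.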

Base model

  • Trained with OpenChatKit (GitHub)
  • The LLaMA-2-7B-32K model was continually pretrained on a Hungarian dataset
  • The model has been extended to a context length of 32K with position interpolation
  • Checkpoint: 100 000 steps
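
Position interpolation, mentioned in the list above, rescales token positions so that the extended window maps back into the position range seen during the original pretraining. The sketch below illustrates the idea with rotary angles. It assumes an original context of 4096 tokens (the LLaMA-2 default, not stated in this card), a head dimension of 128, and a rotary base of 10000; the function names are hypothetical.

```python
from typing import List

ORIGINAL_CONTEXT = 4096   # assumption: LLaMA-2 pretraining context
EXTENDED_CONTEXT = 32768  # 32K, as stated in this card

def interpolated_position(pos: int,
                          original: int = ORIGINAL_CONTEXT,
                          extended: int = EXTENDED_CONTEXT) -> float:
    """Linearly rescale a position in the extended window into the original range."""
    return pos * original / extended

def rope_angles(pos: float, head_dim: int = 128,
                base: float = 10000.0) -> List[float]:
    """Rotary angles pos * base^(-2i/d), one per frequency pair."""
    return [pos * base ** (-2 * i / head_dim) for i in range(head_dim // 2)]

# The last position of the 32K window lands just inside the original range.
last = interpolated_position(EXTENDED_CONTEXT - 1)
print("scaled position:", last)
print("first angles:", rope_angles(last)[:3])
```

Under these assumptions every position is scaled by 4096 / 32768 = 1/8, so the model never sees a rotary position larger than those it met in the original pretraining.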

Base model dataset for continued pretraining

  • Hungarian: 7.9 billion words, from documents (763K) that each exceed 5,000 words in length
  • English: Long Context QA (2 billion words), BookSum (78 million words)

Limitations

  • max_seq_length = 32 768
  • float16
  • vocab size: 32 000
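
In practice the sequence limit above is shared between the prompt and the generated text. The helper below is a small sketch of that budget check; `max_new_tokens_allowed` is a hypothetical name, and the prompt token count is assumed to come from whatever tokenizer the runtime uses.

```python
MAX_SEQ_LENGTH = 32768  # from the limits listed above

def max_new_tokens_allowed(prompt_tokens: int,
                           max_seq_length: int = MAX_SEQ_LENGTH) -> int:
    """Tokens left for generation after the prompt; never negative."""
    return max(max_seq_length - prompt_tokens, 0)

# Example: a 30,000-token prompt leaves room for 2,768 generated tokens.
print(max_new_tokens_allowed(30_000))
```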