This model was fine-tuned using a TPU on Kaggle. The training script can be found in this repository by Locutusque.
GGUF: static and imatrix quantizations are made available by mradermacher.
Training Details:
First step training:
- BF16 with QLoRA
- Examples used: 5K (4K English and 1K Indonesian)
- LoRA rank: 64
- LoRA alpha: 16
- LoRA dropout: 0.05
- Learning rate: 1e-5
- Epochs: 3
Second step training:
- BF16 with QLoRA
- Examples used: 30K (25K English and 5K Indonesian)
- LoRA rank: 64
- LoRA alpha: 32
- LoRA dropout: 0.05
- Learning rate: 6e-5
- Epochs: 1
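As a rough illustration, the two steps above map onto a standard PEFT + bitsandbytes QLoRA setup along these lines. The base model id, output directory, and 4-bit quantization settings are assumptions; only the LoRA hyperparameters, learning rates, and epoch counts come from this card:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

# QLoRA: 4-bit base weights with BF16 compute (assumed interpretation of
# "BF16 with QLoRA" above).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",  # assumed base model; not named in the card
    quantization_config=bnb_config,
)

# Step 1: rank 64, alpha 16, dropout 0.05, lr 1e-5, 3 epochs.
# Step 2 keeps rank 64 and dropout 0.05 but uses alpha 32, lr 6e-5, 1 epoch.
lora_step1 = LoraConfig(r=64, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_step1)

args_step1 = TrainingArguments(
    output_dir="step1",  # placeholder path
    learning_rate=1e-5,
    num_train_epochs=3,
    bf16=True,
)
```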
Both adapters were merged on a Kaggle T4, which is why the final weights are in FP16.
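A minimal sketch of that merge step, assuming the usual PEFT workflow (the model id and adapter path are placeholders):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base in FP16 (the T4 has no BF16 support), attach the adapter,
# then fold the LoRA deltas into the base weights.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",  # assumed base model
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(base, "path/to/adapter")  # placeholder path
merged = model.merge_and_unload()
merged.save_pretrained("merged-fp16-model")
```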
Datasets Used for the Final Model: 30K examples were taken from the following datasets:
Some of the examples that don't have a "system" message are concatenated to fill the 8K context.
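A minimal sketch of that packing step, assuming plain-text examples and a Hugging Face tokenizer (the function and its signature are illustrative, not the card's actual script):

```python
def pack_examples(texts, tokenizer, max_len=8192):
    """Greedily concatenate tokenized examples into sequences of at most max_len tokens."""
    packs, current = [], []
    for text in texts:
        ids = tokenizer(text, add_special_tokens=False)["input_ids"]
        # Start a new pack when the next example would overflow the context.
        if current and len(current) + len(ids) > max_len:
            packs.append(current)
            current = []
        current.extend(ids)
    if current:
        packs.append(current)
    return packs
```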
Llama 3 Chat Template:
<|start_header_id|>system<|end_header_id|>

{system}<|eot_id|><|start_header_id|>user<|end_header_id|>

{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{response}<|eot_id|>
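The same formatting can be produced with tokenizer.apply_chat_template, assuming a tokenizer that ships the Llama 3 template (the repo id below is a placeholder):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # renders the header/eot structure shown above
```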
Notes:
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 70.43 |
| AI2 Reasoning Challenge (25-Shot) | 64.16 |
| HellaSwag (10-Shot) | 83.40 |
| MMLU (5-Shot) | 67.68 |
| TruthfulQA (0-shot) | 54.70 |
| Winogrande (5-shot) | 79.95 |
| GSM8k (5-shot) | 72.71 |
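The average is the unweighted mean of the six benchmark scores: (64.16 + 83.40 + 67.68 + 54.70 + 79.95 + 72.71) / 6 ≈ 70.43.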