|
--- |
|
library_name: transformers |
|
base_model: beomi/Llama-3-Open-Ko-8B |
|
datasets: |
|
- kyujinpy/OpenOrca-KO |
|
pipeline_tag: text-generation |
|
license: llama3 |
|
--- |
|
|
|
# Llama-3-Ko-OpenOrca |
|
|
|
Llama-3-Ko-OpenOrca is [beomi/Llama-3-Open-Ko-8B](https://huggingface.co/beomi/Llama-3-Open-Ko-8B) fine-tuned on the [kyujinpy/OpenOrca-KO](https://huggingface.co/datasets/kyujinpy/OpenOrca-KO) dataset.
|
|
|
|
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
|
|
|
Original model: [beomi/Llama-3-Open-Ko-8B](https://huggingface.co/beomi/Llama-3-Open-Ko-8B) (2024.04.24 version)
|
|
|
Dataset: [kyujinpy/OpenOrca-KO](https://huggingface.co/datasets/kyujinpy/OpenOrca-KO) |
|
|
|
### Training details |
|
|
|
Training: fine-tuned for 4 epochs with LoRA (8-bit) using Axolotl.
|
- sequence_len: 4096 |
|
- bf16 |
|
|
|
Training time: 6 hours on 2× A6000 GPUs.
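
The training setup above can be sketched as an Axolotl config. This is a hypothetical reconstruction, not the actual config used: the field names follow Axolotl's YAML schema, but the dataset `type`, LoRA rank, and learning-rate values are assumptions.

```yaml
# Hypothetical Axolotl config reflecting the training details above.
# Only base_model, datasets.path, load_in_8bit, adapter, sequence_len,
# num_epochs, and bf16 come from this card; the rest are placeholders.
base_model: beomi/Llama-3-Open-Ko-8B

datasets:
  - path: kyujinpy/OpenOrca-KO
    type: alpaca          # assumption: actual prompt format not stated

adapter: lora             # LoRA-8bit, as described above
load_in_8bit: true
lora_r: 16                # assumption
lora_alpha: 32            # assumption
lora_target_linear: true  # assumption

sequence_len: 4096
num_epochs: 4
bf16: true
learning_rate: 0.0002     # assumption
```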
|
|
|
### Evaluation |
|
|
|
- 0-shot KoBEST
|
|
|
|
|
| Tasks |n-shot| Metric |Value | |Stderr| |
|
|----------------|-----:|--------|-----:|---|------| |
|
|kobest_boolq | 0|acc |0.5021|± |0.0133| |
|
|kobest_copa | 0|acc |0.6920|± |0.0146| |
|
|kobest_hellaswag| 0|acc |0.4520|± |0.0223| |
|
|kobest_sentineg | 0|acc |0.7330|± |0.0222| |
|
|kobest_wic | 0|acc |0.4881|± |0.0141| |
|
|
|
|
|
- 5-shot KoBEST
|
|
|
|
|
| Tasks |n-shot| Metric |Value | |Stderr| |
|
|----------------|-----:|--------|-----:|---|------| |
|
|kobest_boolq | 5|acc |0.7123|± |0.0121| |
|
|kobest_copa | 5|acc |0.7620|± |0.0135| |
|
|kobest_hellaswag| 5|acc |0.4780|± |0.0224| |
|
|kobest_sentineg | 5|acc |0.9446|± |0.0115| |
|
|kobest_wic | 5|acc |0.6103|± |0.0137| |
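
The KoBEST scores above can be reproduced along these lines with EleutherAI's lm-evaluation-harness. This is a sketch under assumptions: the harness version, batch size, and dtype used for the card's numbers are not stated, and the model ID is a placeholder for this checkpoint's Hub ID.

```
# Hypothetical lm-evaluation-harness invocation; <model-id> is a placeholder.
lm_eval --model hf \
  --model_args pretrained=<model-id>,dtype=bfloat16 \
  --tasks kobest_boolq,kobest_copa,kobest_hellaswag,kobest_sentineg,kobest_wic \
  --num_fewshot 5 \
  --batch_size 8
```

For the 0-shot rows, set `--num_fewshot 0`.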
|
|
|
|
|
### License
|
[https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license) |