---
library_name: transformers
tags:
- trismegistus
- llama3
- esoteric
license: llama3.2
datasets:
- teknium/trismegistus-project
base_model:
- meta-llama/Llama-3.2-1B
pipeline_tag: text-generation
---

## Model Details

### Model Description

Trismegistus for Llama 3.2 1B: a fine-tune of `meta-llama/Llama-3.2-1B` on the `teknium/trismegistus-project` dataset. Credit to teknium for the dataset and the original Trismegistus model.

### Model Sources

- Base model: Llama 3.2 1B (`meta-llama/Llama-3.2-1B`)

## Uses

Use for esoteric joy.

## Bias, Risks, and Limitations

May be biased as hell.

- Recommendation: don't take it personally.

## How to Get Started with the Model

Run it with the `transformers` library as a text-generation model.
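
A minimal sketch of running the model with `transformers`, assuming the weights are published as a merged checkpoint that `AutoModelForCausalLM` can load directly. `MODEL_ID` is a hypothetical placeholder, not this model's real repository name, and the sampling settings are arbitrary illustration values. The card does not state a prompt format, so the prompt is passed as plain text. If only a LoRA adapter is published, loading would need to go through `peft` instead.

```python
# Hypothetical placeholder: replace with this model's actual Hub repo id.
MODEL_ID = "your-username/trismegistus-llama-3.2-1b"


def generate(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 128) -> str:
    """Generate a completion for `prompt` and return only the newly generated text."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,    # assumption: sampling on; not taken from the card
        temperature=0.7,   # assumption: illustrative value only
        pad_token_id=tokenizer.eos_token_id,
    )
    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("What is the Emerald Tablet?")` downloads the weights on first use and returns the completion as a string.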

### Training Data

- `teknium/trismegistus-project`

#### Training Hyperparameters

- LoRA with 4-bit quantization, via PEFT

#### Speeds, Sizes, Times

- global_step: 16905
- train_loss: 1.169401215731269
- train_runtime: 21882.4747 s
- train_samples_per_second: 3.09
- train_steps_per_second: 0.773
- total_flos: 4.437195883099177e+17
- epoch: 5.0
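
A few derived quantities follow from the figures listed above. The sketch below is plain arithmetic on those numbers; the variable names are mine, and the "effective batch size" interpretation assumes `global_step` counts optimizer steps, as the Hugging Face `Trainer` reports it.

```python
# Figures copied from the training statistics listed above.
global_step = 16905
train_runtime_s = 21882.4747
samples_per_second = 3.09
epochs = 5.0

runtime_hours = train_runtime_s / 3600
steps_per_second = global_step / train_runtime_s
# Samples consumed per optimizer step (per-device batch x accumulation x devices).
effective_batch = samples_per_second * train_runtime_s / global_step
steps_per_epoch = global_step / epochs

print(f"runtime: {runtime_hours:.2f} h")
print(f"steps/s: {steps_per_second:.3f}")
print(f"effective batch size: {effective_batch:.2f}")
print(f"steps per epoch: {steps_per_epoch:.0f}")
```

This works out to roughly 6.08 hours of training, an effective batch size of about 4, and 3381 steps per epoch. The recomputed steps-per-second value agrees with the reported 0.773.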

## Evaluation and Metrics

|    Tasks    |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|-------------|------:|------|-----:|--------|---|-----:|---|-----:|
|arc_challenge|      1|none  |     0|acc     |↑  |0.3345|±  |0.0138|
|             |       |none  |     0|acc_norm|↑  |0.3695|±  |0.0141|
|arc_easy     |      1|none  |     0|acc     |↑  |0.6044|±  |0.0100|
|             |       |none  |     0|acc_norm|↑  |0.5694|±  |0.0102|
|boolq        |      2|none  |     0|acc     |↑  |0.6410|±  |0.0084|
|hellaswag    |      1|none  |     0|acc     |↑  |0.4400|±  |0.0050|
|             |       |none  |     0|acc_norm|↑  |0.5728|±  |0.0049|
|openbookqa   |      1|none  |     0|acc     |↑  |0.2260|±  |0.0187|
|             |       |none  |     0|acc_norm|↑  |0.3540|±  |0.0214|
|piqa         |      1|none  |     0|acc     |↑  |0.7002|±  |0.0107|
|             |       |none  |     0|acc_norm|↑  |0.7024|±  |0.0107|
|winogrande   |      1|none  |     0|acc     |↑  |0.5785|±  |0.0139|
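
The table layout matches the output of EleutherAI's lm-evaluation-harness, so the sketch below assumes that tool produced it; the card does not say so explicitly. `MODEL_ID` is a hypothetical placeholder for this model's repository name, and the batch size is an arbitrary choice. The task list and the zero-shot setting are taken from the table.

```python
# Tasks and n-shot setting as they appear in the results table.
TASKS = ["arc_challenge", "arc_easy", "boolq", "hellaswag", "openbookqa", "piqa", "winogrande"]
# Hypothetical placeholder: replace with this model's actual Hub repo id.
MODEL_ID = "your-username/trismegistus-llama-3.2-1b"


def run_eval(model_id: str = MODEL_ID, tasks: list[str] = TASKS) -> dict:
    """Run zero-shot evaluation and return the per-task metrics dictionary."""
    # Imported lazily: requires the `lm-eval` package to be installed.
    import lm_eval

    results = lm_eval.simple_evaluate(
        model="hf",
        model_args=f"pretrained={model_id}",
        tasks=list(tasks),
        num_fewshot=0,
        batch_size=8,  # assumption: not stated in the card
    )
    return results["results"]
```

Scores from a rerun may differ slightly from the table depending on harness version, hardware, and precision.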

## Environmental Impact

Will steal your horse and kill your cat.