---
library_name: transformers
tags:
- trismegistus
- llama3
- esoteric
license: llama3.2
datasets:
- teknium/trismegistus-project
base_model:
- meta-llama/Llama-3.2-1B
pipeline_tag: text-generation
---
## Model Details
### Model Description
A Trismegistus fine-tune of Llama 3.2 1B. Credit to teknium for the dataset and the original Trismegistus model.
### Model Sources
Base model: meta-llama/Llama-3.2-1B
## Uses
Use for esoteric joy.
## Bias, Risks, and Limitations
May be biased as hell.
- Recommendation: Don't take it personally.
## How to Get Started with the Model
Run it.
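
For anyone who wants slightly more than "run it", here is a minimal sketch using the `transformers` library named in the card metadata. The helper names `build_generation_kwargs` and `generate_text`, the sampling defaults, and the plain-text prompt format are all assumptions made for illustration; the card does not specify a prompt template or recommended generation settings.

```python
def build_generation_kwargs(max_new_tokens=128, temperature=0.7, top_p=0.9):
    """Return sampling settings for generate().

    The defaults are illustrative assumptions, not values from this card.
    """
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
    }


def generate_text(model_id, prompt, **overrides):
    """Load a causal LM by repo id and generate a completion for `prompt`."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **build_generation_kwargs(**overrides))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Call `generate_text` with this model's repository id as `model_id` and any prompt string; keyword arguments such as `max_new_tokens=64` override the assumed defaults.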
### Training Data
Trained on the teknium/trismegistus-project dataset.
#### Training Hyperparameters
- 4-bit LoRA via PEFT
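
The card gives no further hyperparameters, so the following is only a sketch of what a 4-bit LoRA setup with `peft` and `transformers` commonly looks like. Every value in `LORA_SETTINGS` (rank, alpha, dropout, target modules), the NF4 quantization type, and the bfloat16 compute dtype are assumptions, not the settings used to train this model. The helper name `load_4bit_lora_model` is hypothetical.

```python
# All values here are illustrative assumptions, not this card's actual settings.
LORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",
}


def load_4bit_lora_model(base_model_id="meta-llama/Llama-3.2-1B"):
    """Load the base model in 4-bit and wrap it with LoRA adapters."""
    # Lazy imports: torch, peft, transformers and bitsandbytes are only
    # needed when the model is actually loaded.
    import torch
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",              # assumed quantization type
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
    )
    model = AutoModelForCausalLM.from_pretrained(
        base_model_id, quantization_config=quant_config
    )
    model = prepare_model_for_kbit_training(model)
    return get_peft_model(model, LoraConfig(**LORA_SETTINGS))
```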
#### Speeds, Sizes, Times
- global_step: 16905
- train_loss: 1.169401215731269
- train_runtime: 21882.4747 (seconds)
- train_samples_per_second: 3.09
- train_steps_per_second: 0.773
- total_flos: 4.437195883099177e+17
- epoch: 5.0
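
The reported figures can be cross-checked against each other with a little arithmetic. The sketch below derives the step rate, an implied effective batch size, and an approximate per-epoch sample count. The batch size and sample count are inferences from rounded throughput numbers, not values stated in the training log.

```python
# Values copied from the training summary above.
global_step = 16905
train_runtime_s = 21882.4747
samples_per_second = 3.09
steps_per_second = 0.773
epochs = 5.0

# Step rate recomputed from step count and runtime; should match the log.
derived_steps_per_second = global_step / train_runtime_s

# Samples per optimizer step implied by the two throughput figures (inferred).
effective_batch = samples_per_second / steps_per_second

# Optimizer steps in one pass over the data.
steps_per_epoch = global_step / epochs

# Approximate training examples per epoch, assuming the rounded batch size.
approx_samples_per_epoch = steps_per_epoch * round(effective_batch)

runtime_hours = train_runtime_s / 3600

print(f"steps/s (derived):       {derived_steps_per_second:.3f}")
print(f"effective batch (approx): {effective_batch:.2f}")
print(f"steps per epoch:          {steps_per_epoch:.0f}")
print(f"samples per epoch (est.): {approx_samples_per_epoch:.0f}")
print(f"runtime (hours):          {runtime_hours:.2f}")
```

The derived step rate rounds to the logged 0.773, and the throughput ratio points to roughly 4 samples per step, which puts the run at about 6 hours over roughly 13.5k examples per epoch.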
## Evaluation and Metrics
| Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|-------------|------:|------|-----:|--------|---|-----:|---|-----:|
|arc_challenge| 1|none | 0|acc |↑ |0.3345|± |0.0138|
| | |none | 0|acc_norm|↑ |0.3695|± |0.0141|
|arc_easy | 1|none | 0|acc |↑ |0.6044|± |0.0100|
| | |none | 0|acc_norm|↑ |0.5694|± |0.0102|
|boolq | 2|none | 0|acc |↑ |0.6410|± |0.0084|
|hellaswag | 1|none | 0|acc |↑ |0.4400|± |0.0050|
| | |none | 0|acc_norm|↑ |0.5728|± |0.0049|
|openbookqa | 1|none | 0|acc |↑ |0.2260|± |0.0187|
| | |none | 0|acc_norm|↑ |0.3540|± |0.0214|
|piqa | 1|none | 0|acc |↑ |0.7002|± |0.0107|
| | |none | 0|acc_norm|↑ |0.7024|± |0.0107|
|winogrande | 1|none | 0|acc |↑ |0.5785|± |0.0139|
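
To read the `Stderr` column, one option is to turn each score into an approximate 95% confidence interval. The sketch below does this for the `acc` rows of the table, assuming a normal approximation (z = 1.96). The helper name `confidence_interval` is made up for this example.

```python
# (accuracy, standard error) pairs copied from the acc rows of the table above.
results = {
    "arc_challenge": (0.3345, 0.0138),
    "arc_easy": (0.6044, 0.0100),
    "boolq": (0.6410, 0.0084),
    "hellaswag": (0.4400, 0.0050),
    "openbookqa": (0.2260, 0.0187),
    "piqa": (0.7002, 0.0107),
    "winogrande": (0.5785, 0.0139),
}


def confidence_interval(value, stderr, z=1.96):
    """Return (low, high) bounds under a normal approximation."""
    margin = z * stderr
    return value - margin, value + margin


for task, (acc, stderr) in results.items():
    low, high = confidence_interval(acc, stderr)
    print(f"{task:<14} acc={acc:.4f}  95% CI ~ [{low:.4f}, {high:.4f}]")
```

Tasks with fewer test items, such as openbookqa, have wider intervals, so small differences on those rows should be read with caution.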
## Environmental Impact
Will steal your horse and kill your cat.