metadata
library_name: transformers
tags:
  - trismegistus
  - llama3
  - esoteric
license: llama3.2
datasets:
  - teknium/trismegistus-project
base_model:
  - meta-llama/Llama-3.2-1B
pipeline_tag: text-generation

Model Details

Model Description

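Trismegistus for Llama 3.2 1B: a fine-tune of meta-llama/Llama-3.2-1B on the teknium/trismegistus-project dataset. Credit to teknium for the dataset and the original Trismegistus model.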

Model Sources

Base model: meta-llama/Llama-3.2-1B

Uses

Use for esoteric joy.

Bias, Risks, and Limitations

May be biased as hell.

  • Recommendation:
    • Don't take it personally.

How to Get Started with the Model

Run it.
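
A minimal sketch of running the model through the transformers text-generation pipeline (the pipeline_tag listed in the metadata). The repository ID is a placeholder, since this card does not state it, and the sampling settings are illustrative assumptions rather than recommended values.

```python
# Placeholder: replace with this model's actual Hub repository ID (not stated in this card).
MODEL_ID = "your-username/your-trismegistus-model"


def load_generator(model_id=MODEL_ID):
    """Build a text-generation pipeline; weights are downloaded on first use."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline("text-generation", model=model_id)


def generate(generator, prompt, max_new_tokens=128):
    """Return the generated text (prompt plus continuation) for a single prompt."""
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,  # assumed value, tune to taste
        do_sample=True,
        temperature=0.7,  # assumed value, not taken from this card
    )
    return outputs[0]["generated_text"]
```

Typical use would look like `print(generate(load_generator(), "What is the Emerald Tablet?"))`, where the prompt is only an example.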

Training Data

teknium/trismegistus-project

Training Hyperparameters

  • LoRA fine-tuning with PEFT on a 4-bit quantized base model
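
A rough sketch of what a 4-bit LoRA setup with PEFT can look like. The card only names the technique, so every hyperparameter below (rank, alpha, dropout, target modules, quantization type) is an assumption for illustration and not the configuration used to train this model.

```python
# All values are illustrative assumptions; the card does not list them.
QUANT_KWARGS = {
    "load_in_4bit": True,          # 4-bit quantized base weights
    "bnb_4bit_quant_type": "nf4",  # assumed quantization type
}
LORA_KWARGS = {
    "r": 16,                                 # assumed adapter rank
    "lora_alpha": 32,                        # assumed scaling factor
    "lora_dropout": 0.05,                    # assumed dropout
    "target_modules": ["q_proj", "v_proj"],  # assumed: Llama attention projections
    "task_type": "CAUSAL_LM",
}


def build_peft_model(base_model_id="meta-llama/Llama-3.2-1B"):
    """Load the base model in 4-bit and wrap it with trainable LoRA adapters."""
    # Lazy imports: this function needs transformers, peft, and bitsandbytes.
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    model = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        quantization_config=BitsAndBytesConfig(**QUANT_KWARGS),
    )
    # Prepares the quantized model for training (e.g. casts norm layers, enables input grads).
    model = prepare_model_for_kbit_training(model)
    return get_peft_model(model, LoraConfig(**LORA_KWARGS))
```

Only the adapter weights are trained in this arrangement; the quantized base weights stay frozen.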

Speeds, Sizes, Times

  • global_step: 16905
  • training_loss: 1.169401215731269
  • train_runtime: 21882.4747 (seconds)
  • train_samples_per_second: 3.09
  • train_steps_per_second: 0.773
  • total_flos: 4.437195883099177e+17
  • epoch: 5.0
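
The logged numbers can be cross-checked with a little arithmetic. The sketch below uses only the values listed above. The effective batch size is an estimate derived from two rounded rates, not a logged value.

```python
# Values copied from the training log above.
global_step = 16905
train_runtime_s = 21882.4747  # the Hugging Face Trainer reports runtime in seconds
samples_per_second = 3.09
steps_per_second = 0.773
epochs = 5.0

runtime_hours = train_runtime_s / 3600
derived_steps_per_second = global_step / train_runtime_s
steps_per_epoch = global_step / epochs
# Samples consumed per optimizer step; approximate because both rates are rounded.
effective_batch_size = samples_per_second / steps_per_second

print(f"runtime: {runtime_hours:.2f} h")
print(f"steps/s (derived): {derived_steps_per_second:.3f}")
print(f"steps per epoch: {steps_per_epoch:.0f}")
print(f"effective batch size (approx.): {effective_batch_size:.2f}")
```

This works out to about 6.1 hours of training, 3381 steps per epoch, and an effective batch size of roughly 4, with the derived step rate matching the logged 0.773.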

Evaluation and Metrics

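| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| arc_challenge | 1 | none | 0 | acc | 0.3345 | ± 0.0138 |
| | | none | 0 | acc_norm | 0.3695 | ± 0.0141 |
| arc_easy | 1 | none | 0 | acc | 0.6044 | ± 0.0100 |
| | | none | 0 | acc_norm | 0.5694 | ± 0.0102 |
| boolq | 2 | none | 0 | acc | 0.6410 | ± 0.0084 |
| hellaswag | 1 | none | 0 | acc | 0.4400 | ± 0.0050 |
| | | none | 0 | acc_norm | 0.5728 | ± 0.0049 |
| openbookqa | 1 | none | 0 | acc | 0.2260 | ± 0.0187 |
| | | none | 0 | acc_norm | 0.3540 | ± 0.0214 |
| piqa | 1 | none | 0 | acc | 0.7002 | ± 0.0107 |
| | | none | 0 | acc_norm | 0.7024 | ± 0.0107 |
| winogrande | 1 | none | 0 | acc | 0.5785 | ± 0.0139 |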
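
The Stderr column can be turned into an approximate 95% confidence interval for each score. The sketch below does this for the acc rows using a normal approximation (value ± 1.96 × stderr). That choice of interval is an assumption made here for illustration, not something the evaluation itself reports.

```python
# (task, acc, stderr) copied from the acc rows of the results above.
RESULTS = [
    ("arc_challenge", 0.3345, 0.0138),
    ("arc_easy", 0.6044, 0.0100),
    ("boolq", 0.6410, 0.0084),
    ("hellaswag", 0.4400, 0.0050),
    ("openbookqa", 0.2260, 0.0187),
    ("piqa", 0.7002, 0.0107),
    ("winogrande", 0.5785, 0.0139),
]
Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def confidence_interval(value, stderr, z=Z_95):
    """Return (low, high) bounds of a normal-approximation confidence interval."""
    return value - z * stderr, value + z * stderr


for task, acc, stderr in RESULTS:
    low, high = confidence_interval(acc, stderr)
    print(f"{task:14s} acc={acc:.4f}  95% CI=({low:.4f}, {high:.4f})")
```

Differences between models that fall inside these intervals are best treated as noise.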

Environmental Impact

Will steal your horse and kill your cat.