
Model Card for Model ID

Model Details

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card was generated automatically.

  • Developed by: Tatman Electric
  • Funded by [optional]: Spare Pocket Lint
  • Shared by [optional]: TRL
  • Model type: Sliced Layered
  • Language(s) (NLP): Mixed
  • License: Pythia @ EleutherAI
  • Finetuned from model [optional]: EleutherAI/pythia-2.8b-deduped

Model Sources [optional]

  • Repository: [More Information Needed]
  • Paper [optional]: [More Information Needed]
  • Demo [optional]: [More Information Needed]

Uses

Before there were merged models, there were slices of shards of... stuff. Those slices have meaning. Those slices are real slices too.

Direct Use

This model is part of a series of slice-and-dice mods.

Single Hidden Layer Pythia

What does a single hidden layer preserve from a 12-layer base model?

[More Information Needed]

Downstream Use [optional]

[More Information Needed]

Out-of-Scope Use

[More Information Needed]

Bias, Risks, and Limitations

[More Information Needed]

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
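A minimal sketch of loading the checkpoint with 🤗 transformers, assuming it loads through the standard `AutoModelForCausalLM` and `AutoTokenizer` classes like its Pythia base. The card does not state the Hub repository ID, so `MODEL_ID` is a placeholder to replace, and the `generate` helper name and its defaults are illustrative only.

```python
# Placeholder (assumption): the card does not give the Hub repository ID.
MODEL_ID = "your-namespace/your-model-id"


def generate(prompt, model_id=MODEL_ID, max_new_tokens=40):
    """Greedy-decode a continuation of `prompt` with the given checkpoint."""
    # Imported lazily so the module can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding, so repeated runs give the same text
        pad_token_id=tokenizer.eos_token_id,  # avoids the missing-pad-token warning
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(generate("Before there were merged models,"))
```

Given the evaluation scores reported in this card, expect the generated text to be weak; the sketch is meant for probing what the slice retains, not for production use.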

Training Details

Training Data

[More Information Needed]

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

  • Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

| Groups | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| Open LLM Leaderboard | N/A | none | 5 | rouge1_max | 36.3550 | ± 0.9462 |
| | | flexible-extract | 5 | exact_match | 0.0220 | ± 0.0066 |
| - arc_challenge | 1 | none | 25 | acc | 0.1760 | ± 0.0170 |
| | | none | 25 | acc_norm | 0.2320 | ± 0.0189 |
| - gsm8k | 3 | strict-match | 5 | exact_match | 0.0060 | ± 0.0035 |
| | | flexible-extract | 5 | exact_match | 0.0220 | ± 0.0066 |
| - hellaswag | 1 | none | 10 | acc | 0.3520 | ± 0.0214 |
| | | none | 10 | acc_norm | 0.4040 | ± 0.0220 |
| - winogrande | 1 | none | 5 | acc | 0.5120 | ± 0.0224 |
| | | none | 5 | bleu_diff | -0.6500 | ± 0.6421 |
| | | none | 5 | rouge1_acc | 0.3700 | ± 0.0216 |
| | | none | 5 | rouge1_diff | -1.5564 | ± 1.0223 |
| | | none | 5 | acc | 0.2664 | ± 0.0036 |
| | | none | 5 | rougeL_max | 33.8798 | ± 0.9367 |
| | | none | 5 | rouge2_diff | -3.3178 | ± 0.9477 |
| | | none | 5 | bleu_max | 15.2292 | ± 0.6714 |
| | | none | 5 | bleu_acc | 0.4360 | ± 0.0222 |
| | | none | 5 | rouge2_max | 16.4873 | ± 1.0172 |
| | | none | 5 | acc_norm | 0.3180 | ± 0.0145 |
| | | strict-match | 5 | exact_match | 0.0060 | ± 0.0035 |
| | | none | 5 | rougeL_diff | -0.7765 | ± 1.0034 |
| | | none | 5 | rougeL_acc | 0.3860 | ± 0.0218 |
| | | none | 5 | rouge2_acc | 0.1920 | ± 0.0176 |
| - mmlu | N/A | none | 0 | acc | 0.2533 | ± 0.0039 |
| - humanities | N/A | none | 5 | acc | 0.2408 | ± 0.0075 |
| - other | N/A | none | 5 | acc | 0.2443 | ± 0.0080 |
| - social_sciences | N/A | none | 5 | acc | 0.2538 | ± 0.0081 |
| - stem | N/A | none | 5 | acc | 0.2740 | ± 0.0079 |
| - truthfulqa | N/A | none | 0 | rouge1_max | 36.3550 | ± 0.9462 |
| | | none | 0 | bleu_diff | -0.6500 | ± 0.6421 |
| | | none | 0 | rouge1_acc | 0.3700 | ± 0.0216 |
| | | none | 0 | rouge1_diff | -1.5564 | ± 1.0223 |
| | | none | 0 | acc | 0.3435 | ± 0.0137 |
| | | none | 0 | rougeL_max | 33.8798 | ± 0.9367 |
| | | none | 0 | bleu_max | 15.2292 | ± 0.6714 |
| | | none | 0 | bleu_acc | 0.4360 | ± 0.0222 |
| | | none | 0 | rouge2_max | 16.4873 | ± 1.0172 |
| | | none | 0 | rougeL_acc | 0.3860 | ± 0.0218 |
| | | none | 0 | rougeL_diff | -0.7765 | ± 1.0034 |
| | | none | 0 | rouge2_acc | 0.1920 | ± 0.0176 |
| | | none | 0 | rouge2_diff | -3.3178 | ± 0.9477 |
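One way to read the accuracy rows in the evaluation table above is to ask how far each score sits from random guessing, measured in standard errors. The sketch below does that for four tasks. The values and standard errors are copied from the table; the chance levels are assumptions not stated in the card (0.5 for the two-option winogrande, 0.25 for the four-option mmlu and hellaswag, and roughly 0.25 for arc_challenge, where most questions have four options).

```python
# (accuracy, standard error) copied from the evaluation table in this card.
RESULTS = {
    "arc_challenge": (0.1760, 0.0170),
    "hellaswag": (0.3520, 0.0214),
    "winogrande": (0.5120, 0.0224),
    "mmlu": (0.2533, 0.0039),
}

# Assumed chance-level accuracy per task (not stated in the card).
CHANCE = {
    "arc_challenge": 0.25,  # approximate: most questions have four options
    "hellaswag": 0.25,
    "winogrande": 0.5,
    "mmlu": 0.25,
}


def z_vs_chance(value, stderr, chance):
    """Distance of a score from chance, in units of its standard error."""
    return (value - chance) / stderr


def summarize():
    """Map each task to its z-score against the assumed chance level."""
    return {
        task: z_vs_chance(value, stderr, CHANCE[task])
        for task, (value, stderr) in RESULTS.items()
    }


if __name__ == "__main__":
    for task, z in summarize().items():
        print(f"{task:15s} z = {z:+.2f}")
```

Under these assumptions, winogrande and mmlu land within about one standard error of chance, hellaswag is several standard errors above it, and the unnormalized arc_challenge accuracy falls below the assumed 0.25.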

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: OldAsDirt
  • Hours used: 5
  • Cloud Provider: YourMomsBasement
  • Compute Region: Siberia
  • Carbon Emitted: 8ppm

No yaks were harmed in the making of this model.
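For a rough number in the spirit of that calculator, emissions can be approximated as energy used multiplied by the carbon intensity of the local grid. The sketch below is a simplification, not the calculator's own code. Only the 5 hours comes from this card; the power draw and grid intensity are hypothetical placeholders, since the card gives no real hardware or region figures.

```python
def estimate_emissions_kg(hours, power_watts, grid_kg_per_kwh):
    """Estimate CO2-equivalent emissions in kilograms.

    energy (kWh) = hours * power (W) / 1000
    emissions (kg CO2eq) = energy (kWh) * grid intensity (kg CO2eq per kWh)
    """
    energy_kwh = hours * power_watts / 1000.0
    return energy_kwh * grid_kg_per_kwh


HOURS_USED = 5  # from the card
ASSUMED_POWER_WATTS = 250.0  # hypothetical: average draw of the training machine
ASSUMED_GRID_KG_PER_KWH = 0.4  # hypothetical: carbon intensity of the local grid

if __name__ == "__main__":
    kg = estimate_emissions_kg(HOURS_USED, ASSUMED_POWER_WATTS, ASSUMED_GRID_KG_PER_KWH)
    print(f"Estimated emissions: {kg:.2f} kg CO2eq")
```

Swapping in the measured power draw and the published grid intensity for the actual region would turn this into a usable estimate.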

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]

Safetensors model size: 704M params. Tensor type: F32.