
GPT-NeoX trained on MiniPile, as a baseline to compare my MANN models against. It uses NeelNanda/gpt-neox-tokenizer-digits for tokenization.
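
For reference, a minimal sketch of loading the released checkpoint with the standard `transformers` auto classes (the prompt is arbitrary and only serves as a quick smoke test):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("euclaise/gpt-neox-122m-minipile-digits")
tokenizer = AutoTokenizer.from_pretrained("NeelNanda/gpt-neox-tokenizer-digits")

# Generate a short continuation as a quick sanity check
inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```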

The exact model configuration is as follows:

```python
from transformers import AutoTokenizer, GPTNeoXConfig

# The tokenizer is loaded first so that vocab_size matches it exactly
tokenizer = AutoTokenizer.from_pretrained("NeelNanda/gpt-neox-tokenizer-digits")

cfg = GPTNeoXConfig(
    vocab_size=len(tokenizer),
    hidden_size=768,
    intermediate_size=768 * 4,
    num_hidden_layers=12,
    num_attention_heads=12,
    tie_word_embeddings=True,
    hidden_act="gelu_new",
    tokenizer="NeelNanda/gpt-neox-tokenizer-digits",
)
```
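
As a sketch of how this config could be turned into a randomly initialized model (standard `transformers` usage, not taken from the original training code):

```python
from transformers import GPTNeoXForCausalLM

# Build a randomly initialized GPT-NeoX model from the config above
model = GPTNeoXForCausalLM(cfg)

# Rough sanity check on the parameter count
print(f"{model.num_parameters() / 1e6:.1f}M parameters")
```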

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 25.1 |
| ARC (25-shot) | 20.73 |
| HellaSwag (10-shot) | 27.03 |
| MMLU (5-shot) | 25.31 |
| TruthfulQA (0-shot) | 49.19 |
| Winogrande (5-shot) | 52.33 |
| GSM8K (5-shot) | 0.0 |
| DROP (3-shot) | 1.09 |