A reproduction of OpenLLaMA, trained on 128 H100 GPUs in bfloat16.

The pretraining data consists of Falcon, StarCoder, and the Wikipedia, arXiv, books, and StackExchange subsets of RedPajama. In total, this amounts to nearly 1 trillion tokens.

The model was trained for a single epoch with 2,000 warm-up steps and a cosine learning rate schedule, starting at 3e-5 with a 4M batch size.
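The schedule described above can be sketched as follows. This is a minimal illustration, not the actual training code: it assumes that 3e-5 is the peak rate reached at the end of a linear warm-up, that the 4M batch size is measured in tokens, that the corpus is rounded to exactly 1 trillion tokens, and that the cosine decays to zero. None of these details are stated in the card, and all names are hypothetical.

```python
import math

PEAK_LR = 3e-5        # from the card; assumed to be the peak rate after warm-up
WARMUP_STEPS = 2000   # from the card
TOTAL_TOKENS = 1e12   # "nearly 1 trillion tokens", rounded for illustration
BATCH_TOKENS = 4e6    # assumes the 4M batch size is counted in tokens
MIN_LR = 0.0          # assumed floor; the card does not state one

# Single epoch: one pass over the data, so steps = tokens / tokens per batch.
TOTAL_STEPS = int(TOTAL_TOKENS / BATCH_TOKENS)


def learning_rate(step: int) -> float:
    """Linear warm-up to PEAK_LR, then cosine decay to MIN_LR."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS))
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 1000, 2000, 126000, TOTAL_STEPS):
        print(s, learning_rate(s))
```

Under these assumptions the run takes roughly 250,000 optimizer steps, so the warm-up covers less than 1% of training.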

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	47.09
AI2 Reasoning Challenge (25-Shot)	46.16
HellaSwag (10-Shot)	76.40
MMLU (5-Shot)	42.82
TruthfulQA (0-shot)	36.65
Winogrande (5-shot)	70.88
GSM8k (5-shot)	9.63
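The Avg. row appears to be the unweighted arithmetic mean of the six benchmark scores above. A quick check, using only the values from the table (the variable names are illustrative):

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 46.16,
    "HellaSwag (10-Shot)": 76.40,
    "MMLU (5-Shot)": 42.82,
    "TruthfulQA (0-shot)": 36.65,
    "Winogrande (5-shot)": 70.88,
    "GSM8k (5-shot)": 9.63,
}

# Unweighted mean over all benchmarks, rounded to two decimals like the table.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 47.09, matching the reported Avg.
```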

Evaluation results

Benchmark	Metric	Split	Value
AI2 Reasoning Challenge (25-Shot)	normalized accuracy	test	46.160
HellaSwag (10-Shot)	normalized accuracy	validation	76.400
MMLU (5-Shot)	accuracy	test	42.820
TruthfulQA (0-shot)	mc2	validation	36.650
Winogrande (5-shot)	accuracy	validation	70.880
GSM8k (5-shot)	accuracy	test	9.630

Source: Open LLM Leaderboard
