
Model Details

The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.

Model Developers Mistral AI.

Variations None.

Input Text only.

Output Text only.

Model Architecture Mistral-7B-v0.1 is a transformer model with the following architecture choices (illustrated in the sketch after this list):

  • Grouped-Query Attention
  • Sliding-Window Attention
  • Byte-fallback BPE tokenizer
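
These choices are visible in the released checkpoint's configuration. Below is a minimal inspection sketch, assuming the model is published on the Hugging Face Hub as mistralai/Mistral-7B-v0.1; it is not part of this card:

```python
# Minimal sketch (an illustration, not part of this card): inspecting the
# architecture choices via the Hugging Face transformers config.
from transformers import AutoConfig, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # assumed Hub id
config = AutoConfig.from_pretrained(model_id)

# Grouped-Query Attention: fewer key/value heads than query heads.
print("query heads:", config.num_attention_heads)
print("key/value heads:", config.num_key_value_heads)

# Sliding-Window Attention: each token attends to a fixed-size local window.
print("sliding window:", config.sliding_window)

# Byte-fallback BPE tokenizer: characters outside the vocabulary decompose
# into raw byte pieces, so any input string tokenizes without <unk>.
tokenizer = AutoTokenizer.from_pretrained(model_id)
print(tokenizer.tokenize("⇒"))  # rare symbols may fall back to byte tokens
```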

Model Dates Mistral-7B-v0.1 was trained between June and September 2023.

Status This is a static model. Future models will have new version numbers.

License Apache 2.0 license.

Research Paper Coming soon.

Where to send questions or comments about the model TODO: How do people send comments?

Intended Use

Intended Use Cases Mistral-7B-v0.1 is intended for commercial and research use. As a pretrained base model, it can be fine-tuned or prompted for a variety of natural language generation tasks.
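
A minimal usage sketch, not an official quickstart: prompting the base model for free-form completion with the transformers library (the Hub id mistralai/Mistral-7B-v0.1 is an assumption here):

```python
# Minimal usage sketch (not from this card): free-form text completion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Base models complete text rather than follow instructions, so phrase the
# prompt as a passage to continue.
inputs = tokenizer("The mistral is a wind that", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```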

Evaluation Results

We report standard benchmark results for Mistral-7B-v0.1 below. The results were produced with our custom evaluation library.

Benchmark           Mistral-7B-v0.1 (7B)
hellaswag           81.19%
winogrande          75.53%
piqa                82.92%
boolq               83.52%
arc_easy            80.01%
arc_challenge       55.38%
naturalqs           23.96%
naturalqs_5shot     28.92%
triviaqa_5shot      69.88%
triviaqa            63.22%
humaneval_pass@1    29.88%
mbpp_pass@1         47.86%
mmlu                59.99%
math                11.94%
gsm8k               39.35%
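
The custom evaluation library is not public. As a purely illustrative stand-in (an assumption, not the tooling behind the table above), a subset of these tasks can be scored with EleutherAI's lm-evaluation-harness:

```python
# Illustrative only: the scores above come from a custom internal library, not
# this tool. This sketch uses EleutherAI's lm-evaluation-harness
# (pip install lm-eval); exact numbers will differ.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mistralai/Mistral-7B-v0.1,dtype=float16",
    tasks=["hellaswag", "winogrande", "piqa", "arc_easy", "arc_challenge"],
    num_fewshot=0,  # these commonsense tasks are reported 0-shot below
)
for task, metrics in results["results"].items():
    print(task, metrics)
```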

Theme-based grouping

  • Commonsense Reasoning: 0-shot average of Hellaswag, Winogrande, PIQA, SIQA, OpenbookQA, ARC-Easy, ARC-Challenge, and CommonsenseQA.

  • World Knowledge: 5-shot average of NaturalQuestions and TriviaQA (recomputed in the sketch after this list).

  • Reading Comprehension: 0-shot average of BoolQ and QuAC.

  • Math: Average of 8-shot GSM8K with maj@8 and 4-shot MATH with maj@4.

  • Code: Average of 0-shot HumanEval and 3-shot MBPP.

  • Popular aggregated results: 5-shot MMLU, 3-shot BBH, and 3-5-shot AGI Eval (English multiple-choice questions only).
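
As a quick sanity check, the groupings whose component scores all appear in the table above can be recomputed directly; a minimal sketch:

```python
# Recomputing the theme averages that use only scores listed in the table
# above. Other themes (e.g. Commonsense Reasoning) include benchmarks such as
# SIQA and CommonsenseQA whose individual scores are not in this card, so they
# cannot be recomputed from it.
scores = {
    "naturalqs_5shot": 28.92,
    "triviaqa_5shot": 69.88,
    "humaneval_pass@1": 29.88,  # 0-shot
    "mbpp_pass@1": 47.86,       # 3-shot
}

world_knowledge = (scores["naturalqs_5shot"] + scores["triviaqa_5shot"]) / 2
code = (scores["humaneval_pass@1"] + scores["mbpp_pass@1"]) / 2

print(f"World Knowledge: {world_knowledge:.2f}%")  # 49.40%
print(f"Code:            {code:.2f}%")             # 38.87%
```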

Ethical Considerations and Limitations

TODO: what do we say here?