Bam4d committed on
Commit f592c5f
Parent: 0e5a0e2

Small updates

Files changed (1): README.md (+52, -3)
README.md CHANGED
@@ -1,3 +1,52 @@
- ---
- license: apache-2.0
- ---
+ # **Model Details**
+
+ The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.
+
+ **Model Developers** Mistral AI.
+
+ **Variations** None.
+
+ **Input** Text only.
+
+ **Output** Text only.
+
+ **Model Architecture** Mistral-7B-v0.1 is a transformer model with the following architecture choices (the first two are sketched in the toy example below):
+ - Grouped-Query Attention
+ - Sliding-Window Attention
+ - Byte-fallback BPE tokenizer
+
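As a rough, non-authoritative sketch of the first two choices (plain PyTorch with toy sizes, not Mistral AI's implementation): sliding-window attention restricts each position to a fixed-size causal window, and grouped-query attention lets several query heads share one key/value head.

```python
import torch

def sliding_window_causal_mask(seq_len: int, window: int) -> torch.Tensor:
    # True where query position i may attend to key position j:
    # causal (j <= i) and inside the window (i - j < window).
    i = torch.arange(seq_len).unsqueeze(1)  # query positions as a column
    j = torch.arange(seq_len).unsqueeze(0)  # key positions as a row
    return (j <= i) & (i - j < window)

def expand_kv_heads(kv: torch.Tensor, n_query_heads: int) -> torch.Tensor:
    # Grouped-Query Attention: a small number of K/V heads is shared by
    # groups of query heads, so K/V are repeated along the head dimension.
    batch, n_kv_heads, seq_len, head_dim = kv.shape
    group_size = n_query_heads // n_kv_heads
    return kv.repeat_interleave(group_size, dim=1)

# Toy values chosen for readability; the released model uses far larger ones.
print(sliding_window_causal_mask(seq_len=8, window=4).int())
keys = torch.randn(1, 2, 8, 16)        # 2 K/V heads
print(expand_kv_heads(keys, 8).shape)  # shared across 8 query heads: (1, 8, 8, 16)
```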
+ **Model Dates** Mistral-7B-v0.1 was trained between June and September 2023.
+
+ **Status** This is a static model. Future models will have new version numbers.
+
+ **License** Apache 2.0.
+
+ **Research Paper** TODO: Coming soon.
+
+ **Where to send questions or comments about the model** TODO: How do people send comments?
+
+ # **Intended Use**
+ **Intended Use Cases** Mistral-7B-v0.1 is intended for commercial and research use. It can be adapted for a variety of natural language generation tasks, as in the sketch below.
+
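A minimal loading-and-generation sketch using the Hugging Face `transformers` library. The repository id `mistralai/Mistral-7B-v0.1` is an assumption on our part (use whatever id this card is published under), and a GPU with enough memory for a 7B model is assumed.

```python
# Minimal sketch, not an official example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so a 7B model fits on one GPU
    device_map="auto",          # requires the `accelerate` package
)

prompt = "Mistral-7B-v0.1 is a pretrained generative text model that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```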
+ # **Evaluation Results**
+ We report standard benchmark results for Mistral-7B-v0.1, produced with a custom evaluation library.
+
+ | Model | Size | hellaswag | winogrande | piqa | boolq | arc_easy | arc_challenge | naturalqs | naturalqs_5shot | triviaqa_5shot | triviaqa | humaneval_pass@1 | mbpp_pass@1 | mmlu | math | gsm8k |
+ |-----------------|------|-----------|------------|--------|--------|----------|---------------|-----------|-----------------|----------------|----------|------------------|-------------|--------|--------|--------|
+ | Mistral-7B-v0.1 | 7B | 81.19% | 75.53% | 82.92% | 83.52% | 80.01% | 55.38% | 23.96% | 28.92% | 69.88% | 63.22% | 29.88% | 47.86% | 59.99% | 11.94% | 39.35% |
+
+ **Theme-based grouping** (the sketch after this list shows how two of these averages are computed):
+ - Commonsense Reasoning: 0-shot average of HellaSwag, Winogrande, PIQA, SIQA, OpenBookQA, ARC-Easy, ARC-Challenge, and CommonsenseQA.
+
+ - World Knowledge: 5-shot average of NaturalQuestions and TriviaQA.
+
+ - Reading Comprehension: 0-shot average of BoolQ and QuAC.
+
+ - Math: Average of 8-shot GSM8K with maj@8 and 4-shot MATH with maj@4.
+
+ - Code: Average of 0-shot HumanEval and 3-shot MBPP.
+
+ - Popular aggregated results: 5-shot MMLU, 3-shot BBH, and 3-5-shot AGIEval (English multiple-choice questions only).
+
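As a worked example, the snippet below recomputes the two theme averages whose components all appear in the table above; the other themes depend on scores not listed in this card (e.g. SIQA, QuAC, BBH), so they are omitted.

```python
# Worked example: theme averages from the per-benchmark scores in the table.
scores = {
    "naturalqs_5shot": 28.92,
    "triviaqa_5shot": 69.88,
    "humaneval_pass@1": 29.88,  # 0-shot
    "mbpp_pass@1": 47.86,       # 3-shot
}

themes = {
    "World Knowledge": ["naturalqs_5shot", "triviaqa_5shot"],
    "Code": ["humaneval_pass@1", "mbpp_pass@1"],
}

for theme, benchmarks in themes.items():
    avg = sum(scores[b] for b in benchmarks) / len(benchmarks)
    print(f"{theme}: {avg:.2f}%")
# World Knowledge: 49.40%
# Code: 38.87%
```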
+ # **Ethical Considerations and Limitations**
+ TODO: what do we say here?