Zardos committed on
Commit
5989100
1 Parent(s): 3eb7634
Files changed (1)
  1. README.md +28 -5
README.md CHANGED
@@ -7,12 +7,35 @@ license: apache-2.0
  # Model Yaml

  ```
- # Prompt Format
- ###### chatml
  ```

- ### Instruction:

- ### Response:
- ```
  # Model Yaml

+ The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters.
+ Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.
+
+ For full details of this model please read our [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/announcing-mistral-7b/).
+
+ ## Model Architecture
+
+ Mistral-7B-v0.1 is a transformer model, with the following architecture choices:
+ - Grouped-Query Attention
+ - Sliding-Window Attention
+ - Byte-fallback BPE tokenizer
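
As an editorial illustration of the Sliding-Window Attention item above, here is a minimal sketch of the attention mask that the technique implies: each position may attend only to itself and a fixed number of immediately preceding positions. The function name `sliding_window_causal_mask` and the toy sizes are assumptions made for this example; they are not taken from the Mistral codebase, which applies the idea inside its attention layers with a far larger window.

```python
from typing import List


def sliding_window_causal_mask(seq_len: int, window: int) -> List[List[bool]]:
    """Build a boolean mask for causal sliding-window attention.

    mask[i][j] is True when query position i may attend to key position j,
    i.e. when j is not in the future (j <= i) and lies within the last
    `window` positions (i - j < window).
    """
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Toy sizes chosen for readability only.
    for row in sliding_window_causal_mask(seq_len=5, window=3):
        print("".join("x" if allowed else "." for allowed in row))
```

Each printed row marks with `x` the positions a token can see. With a window of 3, the last token sees only the final three positions, which is what keeps the per-token attention cost bounded as the sequence grows.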
+
+ ## Troubleshooting
+
+ - If you see the following error:
+ ```
+ KeyError: 'mistral'
  ```
+ - Or:
+ ```
+ NotImplementedError: Cannot copy out of meta tensor; no data!
  ```

+ Ensure you are utilizing a stable version of Transformers, 4.34.0 or newer.
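
As an editorial aside, the version requirement above can be checked before loading the model. The sketch below reads the installed Transformers version through the standard-library `importlib.metadata` module and compares it with 4.34.0. The helper name `is_supported` is hypothetical, and treating any version with a non-numeric component (such as `.dev0` or `rc1`) as unsupported is a deliberately conservative assumption about what "stable" means here.

```python
from importlib.metadata import PackageNotFoundError, version

MIN_VERSION = (4, 34, 0)  # minimum release named in the README


def is_supported(installed: str, minimum: tuple = MIN_VERSION) -> bool:
    """Return True if `installed` is a plain numeric release >= `minimum`.

    Assumption: versions with non-numeric components (dev, rc, post, ...)
    are rejected, which is stricter than full version-specifier parsing.
    """
    pieces = installed.split(".")
    if not all(piece.isdigit() for piece in pieces):
        return False
    release = tuple(int(piece) for piece in pieces)
    # Pad short versions such as "5" to the same length as the minimum.
    release += (0,) * max(0, len(minimum) - len(release))
    return release >= minimum


if __name__ == "__main__":
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        print("transformers is not installed")
    else:
        status = "OK" if is_supported(installed) else "upgrade recommended"
        print(f"transformers {installed}: {status}")
```

If the check reports that an upgrade is recommended, running `pip install --upgrade transformers` in the same environment is the usual remedy.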

+ ## Notice
+
+ Mistral 7B is a pretrained base model and therefore does not have any moderation mechanisms.
+
+ ## The Mistral AI Team
+
+ Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.