cledoux42 committed on
Commit feff255
1 Parent(s): 74c51a3

Update README.md

Files changed (1)
  1. README.md +33 -27
README.md CHANGED
@@ -1,40 +1,46 @@
  ---
  tags:
- - autotrain
- - text-generation
- widget:
- - text: "I love AutoTrain because "
- license: other
  ---

- # Model Trained Using AutoTrain

- This model was trained using AutoTrain. For more information, please visit [AutoTrain](https://hf.co/docs/autotrain).

- # Usage

- ```python

- from transformers import AutoModelForCausalLM, AutoTokenizer

- model_path = "PATH_TO_THIS_REPO"

- tokenizer = AutoTokenizer.from_pretrained(model_path)
- model = AutoModelForCausalLM.from_pretrained(
-     model_path,
-     device_map="auto",
-     torch_dtype='auto'
- ).eval()

- # Prompt content: "hi"
- messages = [
-     {"role": "user", "content": "hi"}
- ]

- input_ids = tokenizer.apply_chat_template(conversation=messages, tokenize=True, add_generation_prompt=True, return_tensors='pt')
- output_ids = model.generate(input_ids.to('cuda'))
- response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)

- # Model response: "Hello! How can I assist you today?"
- print(response)
- ```
  ---
+ license: apache-2.0
+ pipeline_tag: text-generation
+ language:
+ - en
  tags:
+ - pretrained
+ inference:
+   parameters:
+     temperature: 0.7
  ---

+ # Model Card for Mistral-7B-v0.1

+ The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters.
+ Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.

+ For full details of this model please read our [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/announcing-mistral-7b/).

+ ## Model Architecture

+ Mistral-7B-v0.1 is a transformer model, with the following architecture choices (see the sketch after this list):
+ - Grouped-Query Attention
+ - Sliding-Window Attention
+ - Byte-fallback BPE tokenizer
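A minimal sketch, assuming the canonical repository id `mistralai/Mistral-7B-v0.1` and a Transformers release with Mistral support (4.34.0 or newer), of how the grouped-query and sliding-window choices show up in the model config:

```python
# Sketch only: "mistralai/Mistral-7B-v0.1" and transformers >= 4.34.0 are assumptions,
# not something stated by this card.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")

# Grouped-Query Attention: fewer key/value heads than query heads.
print(config.num_attention_heads, config.num_key_value_heads)

# Sliding-Window Attention: the window each attention layer attends over.
print(config.sliding_window)
```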

+ ## Troubleshooting

+ - If you see the following error:
+ ```
+ KeyError: 'mistral'
+ ```
+ - Or:
+ ```
+ NotImplementedError: Cannot copy out of meta tensor; no data!
+ ```

+ Ensure you are using a stable version of Transformers, 4.34.0 or newer.
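For reference, a minimal load-and-generate sketch under the same assumptions (canonical repository id `mistralai/Mistral-7B-v0.1`, Transformers 4.34.0 or newer; the prompt and sampling settings are only illustrative, with the temperature mirroring the frontmatter above):

```python
# Sketch only: repo id, prompt, and generation settings are assumptions, not part of this card.
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

# 4.34.0 or newer avoids KeyError: 'mistral' when resolving the model type.
print(transformers.__version__)

model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the accelerate package; torch_dtype="auto" keeps the checkpoint dtype.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

inputs = tokenizer("My favourite condiment is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```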
 
 
 
+ ## Notice

+ Mistral 7B is a pretrained base model and therefore does not have any moderation mechanisms.
+
+ ## The Mistral AI Team
+
+ Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.