Update README.md
README.md CHANGED
@@ -1 +1,67 @@
---
language:
- en
license: apache-2.0
base_model: mistralai/Mixtral-8x7B-v0.1
inference:
  parameters:
    temperature: 0.5
widget:
- messages:
  - role: user
    content: What is your favorite condiment?
---

# Model Card for Mixtral-8x7B

## Instruction format

This format must be strictly respected; otherwise, the model will generate sub-optimal outputs.

The template used to build a prompt for the Instruct model is defined as follows:
```
<s> [INST] Instruction [/INST] Model answer</s> [INST] Follow-up instruction [/INST]
```
Note that `<s>` and `</s>` are special tokens for beginning of string (BOS) and end of string (EOS), while `[INST]` and `[/INST]` are regular strings.
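
To see the difference concretely, one can inspect the tokenizer directly. This is a small illustrative check rather than part of the original card; it only assumes the Instruct tokenizer published on the Hub:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")

# <s> and </s> correspond to single special-token ids on the tokenizer ...
print(tok.bos_token_id, tok.eos_token_id)
# ... while "[INST]" is ordinary text and encodes to several regular token ids.
print(tok.encode("[INST]", add_special_tokens=False))
```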

As a reference, here is the pseudo-code used to tokenize instructions during fine-tuning:
```python
def tokenize(text):
    return tok.encode(text, add_special_tokens=False)

[BOS_ID] +
tokenize("[INST]") + tokenize(USER_MESSAGE_1) + tokenize("[/INST]") +
tokenize(BOT_MESSAGE_1) + [EOS_ID] +
…
tokenize("[INST]") + tokenize(USER_MESSAGE_N) + tokenize("[/INST]") +
tokenize(BOT_MESSAGE_N) + [EOS_ID]
```

In the pseudo-code above, note that the `tokenize` method should not add a BOS or EOS token automatically, but should add a prefix space.
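
For illustration, here is a minimal, runnable version of that pseudo-code. The helper `build_instruct_ids` and the sample conversation are hypothetical additions (not part of the reference implementation); the sketch assumes the Instruct tokenizer from the Hub, whose `bos_token_id` and `eos_token_id` correspond to `<s>` and `</s>`:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")

def tokenize(text):
    # Encode without letting the tokenizer add BOS/EOS on its own.
    return tok.encode(text, add_special_tokens=False)

def build_instruct_ids(turns):
    """Build the token ids for a list of (user_message, bot_message) pairs."""
    ids = [tok.bos_token_id]                               # a single BOS at the very start
    for user_message, bot_message in turns:
        ids += tokenize("[INST]") + tokenize(user_message) + tokenize("[/INST]")
        ids += tokenize(bot_message) + [tok.eos_token_id]  # EOS closes each model answer
    return ids

ids = build_instruct_ids([("What is your favorite condiment?", "Lemon juice, mostly.")])
print(tok.decode(ids))
```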

In the Transformers library, one can use [chat templates](https://huggingface.co/docs/transformers/main/en/chat_templating), which make sure the right format is applied.

<details>
<summary> Click to expand </summary>

```diff
+ import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)

+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to("cuda")

outputs = model.generate(input_ids, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
</details>
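
To check what the chat template actually renders, the prompt can also be produced as plain text instead of token ids. A small sketch, reusing the same example conversation (exact whitespace in the output may differ slightly from the illustration above):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice."},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

# Render the conversation as a string to inspect the applied format.
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)  # should follow the <s> [INST] ... [/INST] ... </s> [INST] ... [/INST] pattern
```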