tcapelle committed
Commit b2484a0
1 Parent(s): f17fc1f

Update README.md

Files changed (1):
  1. README.md +23 -2
README.md CHANGED
@@ -1,6 +1,27 @@
 ---
-library_name: transformers
 license: mit
+library_name: transformers
+datasets:
+- HuggingFaceH4/deita-10k-v0-sft
+base_model: mistralai/Mistral-7B-v0.1
 ---
 
-Mistral on the new [HuggingFaceH4/deita-10k-v0-sft](https://huggingface.co/datasets/HuggingFaceH4/deita-10k-v0-sft) dataset using 16k context.
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/llm_surgery/gemma-zephyr)
+
+# Mistral 7B Zephyr SFT V2
+
+The [Zephyr](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) SFT recipe applied on top of Mistral 7B (a new recipe using the chatML format).
+
+## Model description
+
+- **Model type:** A 7.2B-parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
+- **Language(s) (NLP):** Primarily English
+- **Finetuned from model:** [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+
+## Recipe
+
+We trained using the [alignment handbook recipe](https://github.com/huggingface/alignment-handbook/blob/main/scripts/run_sft.py) and logged the run to W&B.
+
+Visit the [W&B workspace here](https://wandb.ai/llm_surgery/gemma-zephyr?nw=nwusercapecape).
+
+## Compute provided by Lambda Labs (8xA100 80GB node)
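Since the updated card says the new recipe uses the chatML format, here is a minimal sketch of that conversation layout. Assumptions: the standard chatML special tokens `<|im_start|>` / `<|im_end|>` as used by the alignment handbook's chat template; the helper name `to_chatml` is hypothetical, not part of the repo.

```python
def to_chatml(messages):
    """Render a list of {"role", "content"} dicts as a chatML string.

    Each turn becomes:
        <|im_start|>{role}\n{content}<|im_end|>\n
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    return "".join(parts)


conversation = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there."},
]
print(to_chatml(conversation))
```

In practice you would not format this by hand: `tokenizer.apply_chat_template(conversation)` with the model's tokenizer produces the same layout, including the correct special tokens.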