wandb
/

mistral-7b-zephyr-sft

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

tcapelle commited on Mar 9, 2024

Commit

b2484a0

·

verified ·

1 Parent(s): f17fc1f

Update README.md

Files changed (1) hide show

README.md +23 -2

README.md CHANGED Viewed

@@ -1,6 +1,27 @@
 ---
-library_name: transformers
 license: mit
 ---
-Mistral on the new [HuggingFaceH4/deita-10k-v0-sft](https://huggingface.co/datasets/HuggingFaceH4/deita-10k-v0-sft) dataset using 16k context.

 ---
 license: mit
+library_name: transformers
+datasets:
+- HuggingFaceH4/deita-10k-v0-sft
+base_model: mistralai/Mistral-7B-v0.1
 ---
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/llm_surgery/gemma-zephyr)
+# Mistral 7B Zephyr SFT V2
+The [Zephyr](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) SFT recipe applied on top of Mistral 7B (new recipe with chatML format)
+## Model description
+- **Model type:** A 7.2B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
+- **Language(s) (NLP):** Primarily English
+- **Finetuned from model:** [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+## Recipe
+We trained using the [alignment handbook recipe](https://github.com/huggingface/alignment-handbook/blob/main/scripts/run_sft.py) and logging to W&B
+Visit the [W&B workspace here](https://wandb.ai/llm_surgery/gemma-zephyr?nw=nwusercapecape)
+## Compute provided by Lambda Labs - 8xA100 80GB node