formatting of README
README.md
CHANGED
@@ -11,9 +11,11 @@ pipeline_tag: text-generation
 
 # Wizard Mega 13B
 
-Wizard Mega is a Llama 13B model fine-tuned on the ShareGPT, WizardLM, and Wizard-Vicuna datasets. These particular datasets have all been filtered to remove responses where the model responds with "As an AI language model...", etc or when the model refuses to respond.
+Wizard Mega is a Llama 13B model fine-tuned on the [ShareGPT](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered), [WizardLM](https://huggingface.co/datasets/ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered), and [Wizard-Vicuna](https://huggingface.co/datasets/ehartford/wizard_vicuna_70k_unfiltered) datasets. These particular datasets have all been filtered to remove responses where the model responds with "As an AI language model...", etc or when the model refuses to respond.
 
-Release (Epoch Two)
+## Release (Epoch Two)
+
+The Wizard Mega 13B SFT model is being released after two epochs as the eval loss increased during the 3rd (final planned epoch). Because of this, we have preliminarily decided to use the epoch 2 checkpoint as the final release candidate. https://wandb.ai/wing-lian/vicuna-13b/runs/5uebgm49
 
 ## Build
 
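
The README text in this diff says the training datasets were filtered to drop responses containing "As an AI language model..." and responses where the model refuses to answer. As a rough illustration only, that kind of filter might be sketched in Python as below. The marker phrases, the `response` field name, and the function names are all assumptions made for this example; the actual filtering scripts and phrase lists are not part of this commit.

```python
# Hypothetical marker phrases; the real lists used for these datasets
# are not given in the README.
REFUSAL_MARKERS = (
    "as an ai language model",
    "i'm sorry, but i cannot",
)


def is_filtered(response: str) -> bool:
    """Return True if the response contains any marker phrase (case-insensitive)."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def filter_records(records):
    """Keep only records whose 'response' field (assumed name) passes the filter."""
    return [r for r in records if not is_filtered(r["response"])]


if __name__ == "__main__":
    # Made-up records, purely for demonstration.
    sample = [
        {"prompt": "Capital of France?", "response": "Paris."},
        {"prompt": "Tell me a joke.", "response": "As an AI language model, I cannot joke."},
    ]
    for record in filter_records(sample):
        print(record["prompt"], "->", record["response"])
```

A plain substring match like this is the simplest option; a real pipeline would likely need a longer phrase list and some manual review of borderline cases.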