Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ datasets:
 
 ## Model Description
 
-StableVicuna-13B is a [Vicuna-13B
+StableVicuna-13B is a [Vicuna-13B v0](https://vicuna.lmsys.org/) model fine-tuned using reinforcement learning from human feedback (RLHF) via Proximal Policy Optimization (PPO) on various conversational and instructional datasets.
 
 ### Apply Delta Weights
 