dvilasuero committed
Commit f818073
Parent: 62c23f5

Update README.md

Files changed (1)
1. README.md +1 -1
README.md CHANGED
@@ -172,7 +172,7 @@ model-index:
 
 # Model Card for Notus 7B v1
 
- Notus is a collection of fine-tuned models using Direct Preference Optimization (DPO) and related RLHF techniques. This model is the first version, fine-tuned with DPO (Direct Preference Optimization) over `zephyr-7b-sft-full`, which is the SFT model produced to create `zephyr-7b-beta`.
+ Notus is a collection of fine-tuned models using Direct Preference Optimization (DPO) and related RLHF techniques. This model is the first version, fine-tuned with DPO over `zephyr-7b-sft-full`, which is the SFT model produced to create `zephyr-7b-beta`.
 
 Following a **data-first** approach, the only difference between Notus-7B-v1 and Zephyr-7B-beta is the preference dataset used for dDPO.
 
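For context on the paragraph being edited above: below is a minimal sketch of what DPO fine-tuning over `zephyr-7b-sft-full` can look like with the TRL library. This is not the actual Notus v1 training recipe; the dataset choice, column mapping, and hyperparameters are illustrative assumptions.

```python
# A minimal DPO fine-tuning sketch over the zephyr-7b-sft-full checkpoint
# using TRL. NOT the exact Notus recipe: dataset, column mapping, and
# hyperparameters below are illustrative assumptions.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "HuggingFaceH4/zephyr-7b-sft-full"  # SFT model used to create zephyr-7b-beta
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# DPO trains on preference pairs; TRL expects "prompt", "chosen", and
# "rejected" columns, so a real run may need to rename/map raw columns first.
dataset = load_dataset("argilla/ultrafeedback-binarized-preferences", split="train")

args = DPOConfig(
    output_dir="notus-7b-dpo-sketch",
    beta=0.1,                    # strength of the KL penalty toward the reference model
    learning_rate=5e-7,
    per_device_train_batch_size=2,
)

trainer = DPOTrainer(
    model=model,                 # TRL clones a frozen reference model when none is passed
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,  # older TRL versions take tokenizer= instead
)
trainer.train()
```

The key point the model card makes is that the base model, objective, and training setup mirror Zephyr-7B-beta; only the preference dataset passed to the trainer differs.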