Commit 81eaacd
Parent(s): ac2a77a
Update README.md
README.md CHANGED
@@ -36,12 +36,6 @@ The Buzz model, Dataset, and Code are to be released to build a toolkit that aim
 
 the **Buzz dataset** and two additional models: **Buzz-2.5B-Small** and **Buzz-5B-Medium**, the codebase to refine, filter and augment the data, as well as prune and train your own variants, will additionally be released in the coming days.
 
-## Performance
-
-Buzz-8b-Large achieves remarkably low train and validation loss, with unseen data loss reaching around **0.5** by the end of training. This performance showcases the effectiveness of our novel iterative fine-tuning approach, which maximizes the reuse of pretrained weights. Even the smallest variant, Buzz-Small, maintains a steady train loss of approximately **0.4-0.6**, on entirely new data and hold out sets.
-
-[ benchmark scores table here]
-
 ## Iterative Fine-Tuning Methodology
 
 Our research builds upon the concepts introduced in several key papers, including: