Commit • 9851677
1 Parent(s): ddf6c45
Update README.md
README.md CHANGED
@@ -29,7 +29,7 @@ I picked these models because:
 
 Based on the parent models, I expect this model to be used with an 8192 context window. Please use NTK scaling alpha of 2.6 to experimentally try out 16383 context.
 
-**Let me be candid:** Despite the test scores,
+**Let me be candid:** Despite the test scores, this model is **NOT is a GPT killer**. I think it's a very sharp model **for a 7B**, it probably punches way above its weight **for a 7B**, but it's still a 7B model. Even for a 7B model, I think **it's quirky and has some weird outputs**. Keep your expectations in check 😉
 
 **MT-Bench Average Turn**
 | model | score | size
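
The README text in this hunk mentions an "NTK scaling alpha of 2.6" without saying how alpha is applied. As a rough sketch only, the commonly used NTK-aware RoPE scaling rule enlarges the rotary base by `alpha ** (d / (d - 2))`, where `d` is the per-head dimension. The base of 10000 and head dimension of 128 below are assumptions (typical for 7B Llama-style models), not values stated in this commit, and the function names are hypothetical. The sketch does not check whether alpha 2.6 actually yields a usable 16383-token context.

```python
def ntk_scaled_rope_base(base: float, alpha: float, head_dim: int) -> float:
    """Return the RoPE base after NTK-aware scaling by `alpha`.

    Uses the commonly cited rule base * alpha ** (d / (d - 2)).
    alpha = 1.0 leaves the base unchanged.
    """
    return base * alpha ** (head_dim / (head_dim - 2))


def rope_inverse_frequencies(base: float, head_dim: int) -> list[float]:
    """Return the head_dim // 2 rotary inverse frequencies for a given base."""
    return [base ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


# Assumed values: only ALPHA comes from the README text in the diff.
BASE = 10000.0   # assumption: usual RoPE base
HEAD_DIM = 128   # assumption: usual per-head dimension for a 7B model
ALPHA = 2.6      # from the README

scaled_base = ntk_scaled_rope_base(BASE, ALPHA, HEAD_DIM)
inv_freq = rope_inverse_frequencies(scaled_base, HEAD_DIM)
```

Most inference backends expose alpha as a single loader setting and apply a rule like this internally, so the computation here is only meant to show what the number controls: a larger base lowers the rotary frequencies, stretching the positions the model can distinguish.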