Update README.md
Commit ddf6c45 by SanjiWatsuki
1 parent: 95327a4
README.md
CHANGED
@@ -29,7 +29,7 @@ I picked these models because:

 Based on the parent models, I expect this model to be used with an 8192 context window. Please use NTK scaling alpha of 2.6 to experimentally try out 16383 context.

-**Let me be candid:** Despite the test scores, I do not believe this model is a GPT killer. I think it's a very sharp model, it probably punches way above its weight, but it's still a 7B model. Keep your expectations in check 😉
+**Let me be candid:** Despite the test scores, I do not believe this model is a GPT killer. I think it's a very sharp model, it probably punches way above its weight, but it's still a 7B model. Even for a 7B model, I think it's quirky and has some weird outputs. Keep your expectations in check 😉

 **MT-Bench Average Turn**
 | model | score | size