Update README.md
README.md CHANGED
@@ -30,6 +30,8 @@ This model employs linear RoPE scaling, which now has native support in Transformers.
Please comment with any questions. I'll likely upload a GPTQ and (possibly) a GGML version soon, especially if anyone expresses interest.

+Ooba use: Be sure to increase the `Truncate the prompt up to this length` parameter to 8192 to utilize the full context capabilities.
+
## Motivation
Previous experiments have demonstrated that orca-like datasets yield substantial performance improvements on numerous benchmarks. Additionally, the PI method of context extension requires finetuning to minimize the performance impact relative to the original (non-context-extended) model. My most successful models for context extension with PI methods employ a pretraining phase on long sequences, but due to the compute requirements, I have not scaled this beyond roughly 200 iterations. Many groups (including OpenAssistant) have performed such training at scale, and this model uses one such model as its starting point.
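
For readers picking up the change above: a minimal sketch of loading a model that uses linear RoPE scaling via the native Transformers support mentioned in the hunk header. The model ID and scaling factor here are illustrative assumptions, not taken from this commit; models trained this way typically ship `rope_scaling` in their `config.json`, so the explicit argument is only needed as an override.

```python
# Minimal sketch, assuming a hypothetical model ID and an illustrative
# scaling factor -- substitute the values from the actual model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-namespace/your-rope-scaled-model"  # hypothetical placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Transformers (>= 4.31) understands `rope_scaling` natively for
# Llama-family models; "linear" corresponds to the PI method.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    rope_scaling={"type": "linear", "factor": 2.0},  # factor is illustrative
    device_map="auto",
)

# Generate with the extended context window (up to 8192 tokens here).
prompt = "Summarize the following document:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```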