bhenrym14 committed
Commit
26b7edf
1 Parent(s): f267968

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -30,6 +30,8 @@ This model employs linear RoPE scaling, which now has native support in Trans
 
 Please comment with any questions. I'll likely upload a GPTQ and (possibly) a GGML version soon, especially if anyone expresses interest.
 
+Ooba use: Be sure to increase the `Truncate the prompt up to this length` parameter to 8192 to utilize the full context capabilities.
+
 ## Motivation
 
 Previous experiments have demonstrated that orca-like datasets yield substantial performance improvements on numerous benchmarks. Additionally, the PI method of context extension requires finetuning to minimize performance impacts relative to the original (non context extended) model. My most successful models for context extension with PI methods employ a pretraining phase on long sequences, but due to the compute requirements, I have not scaled this to more than 200 iterations or so. Many groups (including OpenAssistant) have performed such training at scale. This model uses such a model as a starting point.
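
Since the diff's hunk header notes that linear RoPE scaling now has native support in Transformers, here is a minimal sketch of loading such a model with that support (Transformers >= 4.31). The repo id is a placeholder and the scaling factor of 4.0 (a 2048-token LLaMA-family base stretched to the 8192-token window mentioned above) is an assumption for illustration, not taken from this commit:

```python
# Minimal sketch of Transformers' native linear RoPE scaling;
# not the author's exact usage.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- substitute the actual model repository.
model_id = "bhenrym14/some-8192-context-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    # Linear RoPE scaling divides position ids by `factor`.
    # factor=4.0 assumes a 2048-token native context, giving 8192 tokens.
    rope_scaling={"type": "linear", "factor": 4.0},
)
```

In text-generation-webui ("Ooba"), which handles model loading itself, the corresponding step is raising the `Truncate the prompt up to this length` parameter to 8192, as the diff above advises.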