Update README.md
README.md CHANGED
@@ -34,8 +34,8 @@ Given the excellent performance of llama-2 13b finetunes relative to llama 33b,
 
 ## Relative Performance (wikitext perplexity)
 
-| Context (tokens) | bhenrym14/airoboros-l2-13b-PI-16k-fp16 | bhenrym14/airophin-v2-13b-PI-8k-fp16 | bhenrym14/airophin-13b-pntk-16k-fp16 | bhenrym14/airoboros-13b-gpt4-1.4.1-PI-8192-fp16 | bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-fp16 | jondurbin/airoboros-l2-13b-gpt4-1.4.1 |
-| --- | --- | --- | --- | --- | --- |
+| Context (tokens) | **bhenrym14/airoboros-l2-13b-PI-16k-fp16** | bhenrym14/airophin-v2-13b-PI-8k-fp16 | bhenrym14/airophin-13b-pntk-16k-fp16 | bhenrym14/airoboros-13b-gpt4-1.4.1-PI-8192-fp16 | bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-fp16 | jondurbin/airoboros-l2-13b-gpt4-1.4.1 |
+| --- | --- | --- | --- | --- | --- | --- |
 | 512 | 7.67 | 7.38 | 7.62 | 8.24 | 7.90 | **7.23** |
 | 1024 | 6.15 | 5.99 | 6.20 | 6.71 | 6.17 | **5.85** |
 | 2048 | 5.29 | 5.22 | 5.38 | 5.87 | 5.23 | **5.07** |
@@ -43,7 +43,7 @@ Given the excellent performance of llama-2 13b finetunes relative to llama 33b,
 | 8192 | **4.71** | **4.71** | 4.90 | 5.32 | Not Tested | 57.1 |
 | 12000 | **4.54** | 55 | 4.82 | 56.1 | Not Tested | Not Tested |
 
-- Larger PI scaling factors increase short context performance degradation. If you don't require 16k context, you're better off using a model with a different context extension method, or a smaller (or no) PI scaling factor.
+- Larger PI scaling factors increase short context performance degradation. If you don't require 16k context, you're better off using a model with a different context extension method, or a smaller (or no) PI scaling factor. Given this, don't expect anything special on the HF leaderboard.
 - Beyond 8k, this model has lower perplexity than all other models tested here.
 - I'm actively exploring/implementing other context extension methods that may ameliorate the tendency of PI methods to impair the ability of the model to attend to the context space equally.
 
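For context on the "PI scaling factor" trade-off discussed in the notes above: linear position interpolation (PI) extends a RoPE model's context window by compressing position indices back into the pretrained range, so a larger factor squeezes short-range positions closer together, which is the source of the short-context perplexity penalty. The snippet below is a minimal illustrative sketch, not code from this model or repo; the function name `rope_angles` and the 4096 → 16384 factor of 4 are assumptions for the example.

```python
import torch

# Illustrative sketch (hypothetical, not this repo's code): linear position
# interpolation (PI) rotates position m as if it were m / pi_scale, keeping
# all positions inside the pretrained RoPE range.
def rope_angles(positions, dim=128, base=10000.0, pi_scale=4.0):
    """Return RoPE rotation angles with positions compressed by the PI scaling factor."""
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    scaled_positions = positions.float() / pi_scale   # the only change PI introduces
    return torch.outer(scaled_positions, inv_freq)    # shape: (seq_len, dim // 2)

# Assuming a 4096-token pretrained base extended to 16k (pi_scale = 16384 / 4096 = 4):
# position 16383 maps to ~4095.75, inside the pretrained range, but neighboring
# short-context positions end up only 0.25 "pretrained positions" apart.
angles = rope_angles(torch.arange(16384), pi_scale=4.0)
```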