shisa-ai/shisa-v1-llama3-8b


leonardlin committed on May 21

Commit 533f412 • 1 Parent(s): 0064154

Update README.md

Files changed (1)

README.md +14 -0

@@ -7,6 +7,20 @@ model-index:
 - name: outputs/lr-8e6
   results: []
 ---
+I ran each test twice to reduce variance. All runs use temperature 0.2, min_p 0.1, and frequency penalty 0.5.
+
+| Model                       | AVG Score | ELYZA100 | JA MT-Bench | Rakuda | Tengu-Bench | JA Char % |
+|-----------------------------|-----------|----------|-------------|--------|-------------|-----------|
+| shisa-v1-llama3-8b.lr-2e4   | 3.97      | 4.60     | 4.54        | 3.33   | 3.42        | 92.42%    |
+| shisa-v1-llama3-8b.lr-5e5   | 5.73      | 6.28     | 6.45        | 5.37   | 4.81        | 90.93%    |
+| shisa-v1-llama3-8b (2e5 avg)| 6.33      | 6.51     | 6.66        | 6.68   | 5.48        | 91.51%    |
+| shisa-v1-llama3-8b.8e6      | 6.59      | 6.67     | 6.95        | 7.05   | 5.68        | 91.30%    |
+| shisa-v1-llama3-8b.5e6      | 6.42      | 6.33     | 6.76        | 7.15   | 5.45        | 91.56%    |
+| shisa-v1-llama3-8b.2e6      | 6.31      | 6.26     | 6.88        | 6.73   | 5.38        | 92.00%    |
+
+* The 2e-4 and 5e-5 runs are clearly overtrained and score markedly lower (3.97 and 5.73 AVG, versus 6.31 or higher for the other runs).
+* 2e-5 is borderline: WeightWatcher shows the embedding layer as slightly overtrained at 2e-5, while the NEFTune version is not.
+* 8e-6 performs best (6.59 AVG), and 5e-6 (6.42) also scores slightly higher than 2e-5 (6.33).
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
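
The README does not say how the AVG Score column is derived. It appears to be the unweighted mean of the four benchmark scores (ELYZA100, JA MT-Bench, Rakuda, Tengu-Bench), with JA Char % left out. The sketch below checks that assumption against the table and picks the best learning rate. The dictionary keys are learning rates inferred from the row labels (for example `lr-2e4` read as 2e-4), and the variable and function names are illustrative only.

```python
# Benchmark scores copied from the table in the diff:
# (ELYZA100, JA MT-Bench, Rakuda, Tengu-Bench) per learning rate.
# Keys are inferred from the row labels; this mapping is an assumption.
scores = {
    "2e-4": (4.60, 4.54, 3.33, 3.42),
    "5e-5": (6.28, 6.45, 5.37, 4.81),
    "2e-5": (6.51, 6.66, 6.68, 5.48),
    "8e-6": (6.67, 6.95, 7.05, 5.68),
    "5e-6": (6.33, 6.76, 7.15, 5.45),
    "2e-6": (6.26, 6.88, 6.73, 5.38),
}

# AVG Score column as reported in the table.
reported_avg = {
    "2e-4": 3.97,
    "5e-5": 5.73,
    "2e-5": 6.33,
    "8e-6": 6.59,
    "5e-6": 6.42,
    "2e-6": 6.31,
}


def mean_score(values):
    """Unweighted mean of the benchmark scores for one run."""
    return sum(values) / len(values)


# Recompute the average for every learning rate.
computed_avg = {lr: mean_score(vals) for lr, vals in scores.items()}

# Compare against the reported column, allowing for two-decimal rounding.
for lr, avg in computed_avg.items():
    matches = abs(avg - reported_avg[lr]) < 0.005
    print(f"lr={lr}: computed {avg:.2f}, reported {reported_avg[lr]:.2f}, match={matches}")

# Learning rate with the highest recomputed average.
best_lr = max(computed_avg, key=computed_avg.get)
print("best learning rate by average score:", best_lr)
```

If every row matches within rounding, the plain-mean reading of the column is consistent with the table, and the ranking in the notes follows directly from it.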