SparseLLM/prosparse-llama-2-7b

Raincleared committed on May 17, 2024

Commit 4b55046 · verified · 1 parent: 178909c

Update README.md

Files changed (1)

README.md +2 -2

@@ -64,8 +64,8 @@ The 7B model is trained on 8 A100 GPUs. The learning rate (LR) is controlled by
 |        1        |   \\(5e-3\\)    | 6,000  |         12.58          |
 |        2        |   \\(5e-2\\)    | 10,000 |         20.97          |
 |        3        |   \\(5e-2\\)    | 12,000 |         25.17          |
-|        4        |   \\(5e-1\\)    | 16,000 |         33.55          |
-|        5        |   \\(5e-1\\)    | 16,500 |         34.60          |
+|        4        |   \\(2e-1\\)    | 16,000 |         33.55          |
+|        5        |   \\(2e-1\\)    | 16,500 |         34.60          |
 ### Evaluation Results
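
Review note: the commit changes only the second column of rows 4 and 5 (`5e-1` to `2e-1`); the two numeric columns to its right are untouched. As a quick sanity check on those untouched values, the sketch below assumes the third column is a cumulative step count and the fourth is cumulative training tokens in billions. The table header lies outside this hunk, so both readings are assumptions, and `TOKENS_PER_STEP` is a hypothetical constant inferred from the numbers rather than anything stated in the diff.

```python
# Rows copied from the hunk: (stage, cumulative steps, cumulative tokens in billions).
# Column meanings are ASSUMED; the table header is not part of this hunk.
ROWS = [
    (1, 6_000, 12.58),
    (2, 10_000, 20.97),
    (3, 12_000, 25.17),
    (4, 16_000, 33.55),
    (5, 16_500, 34.60),
]

# Hypothetical constant inferred from the rows (2**21 = 2,097,152 tokens per step).
TOKENS_PER_STEP = 2 ** 21


def tokens_in_billions(steps: int) -> float:
    """Convert a step count to billions of tokens, rounded to two decimals."""
    return round(steps * TOKENS_PER_STEP / 1e9, 2)


def check_rows(rows=ROWS) -> bool:
    """Return True if every row matches steps * TOKENS_PER_STEP after rounding."""
    return all(tokens_in_billions(steps) == tokens for _, steps, tokens in rows)


if __name__ == "__main__":
    for stage, steps, tokens in ROWS:
        print(stage, steps, tokens_in_billions(steps), tokens)
    print("consistent:", check_rows())
```

Under those assumptions, all five rows agree with a fixed token budget per step, which suggests the edit to the second column does not require any change to the step or token figures.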