FuryAssassin/CleanedModel-Dedup

@@ -38,21 +38,21 @@ Beyond its improved reasoning capabilities, this version also offers a reduced h
 | | Benchmark | Model1 | Model2 | Model1-v2 | MyAwesomeModel |
 |---|---|---|---|---|---|
-| **Core Reasoning Tasks** | Math Reasoning | 0.510 | 0.535 | 0.521 | {RESULT} |
-| | Logical Reasoning | 0.789 | 0.801 | 0.810 | {RESULT} |
-| | Common Sense | 0.716 | 0.702 | 0.725 | {RESULT} |
-| **Language Understanding** | Reading Comprehension | 0.671 | 0.685 | 0.690 | {RESULT} |
-| | Question Answering | 0.582 | 0.599 | 0.601 | {RESULT} |
-| | Text Classification | 0.803 | 0.811 | 0.820 | {RESULT} |
-| | Sentiment Analysis | 0.777 | 0.781 | 0.790 | {RESULT} |
-| **Generation Tasks** | Code Generation | 0.615 | 0.631 | 0.640 | {RESULT} |
-| | Creative Writing | 0.588 | 0.579 | 0.601 | {RESULT} |
-| | Dialogue Generation | 0.621 | 0.635 | 0.639 | {RESULT} |
-| | Summarization | 0.745 | 0.755 | 0.760 | {RESULT} |
-| **Specialized Capabilities**| Translation | 0.782 | 0.799 | 0.801 | {RESULT} |
-| | Knowledge Retrieval | 0.651 | 0.668 | 0.670 | {RESULT} |
-| | Instruction Following | 0.733 | 0.749 | 0.751 | {RESULT} |
-| | Safety Evaluation | 0.718 | 0.701 | 0.725 | {RESULT} |
 </div>
@@ -123,3 +123,11 @@ This code repository is licensed under the [MIT License](LICENSE). The use of My
 ## 6. Contact
 If you have any questions, please raise an issue on our GitHub repository or contact us at contact@MyAwesomeModel.ai.
 ```

 | | Benchmark | Model1 | Model2 | Model1-v2 | MyAwesomeModel |
 |---|---|---|---|---|---|
+| **Core Reasoning Tasks** | Math Reasoning | 0.510 | 0.535 | 0.521 | 0.510 |
+| | Logical Reasoning | 0.789 | 0.801 | 0.810 | 0.800 |
+| | Common Sense | 0.716 | 0.702 | 0.725 | 0.710 |
+| **Language Understanding** | Reading Comprehension | 0.671 | 0.685 | 0.690 | 0.689 |
+| | Question Answering | 0.582 | 0.599 | 0.601 | 0.600 |
+| | Text Classification | 0.803 | 0.811 | 0.820 | 0.815 |
+| | Sentiment Analysis | 0.777 | 0.781 | 0.790 | 0.785 |
+| **Generation Tasks** | Code Generation | 0.615 | 0.631 | 0.640 | 0.630 |
+| | Creative Writing | 0.588 | 0.579 | 0.601 | 0.590 |
+| | Dialogue Generation | 0.621 | 0.635 | 0.639 | 0.636 |
+| | Summarization | 0.745 | 0.755 | 0.760 | 0.758 |
+| **Specialized Capabilities** | Translation | 0.782 | 0.799 | 0.801 | 0.800 |
+| | Knowledge Retrieval | 0.651 | 0.668 | 0.670 | 0.669 |
+| | Instruction Following | 0.733 | 0.749 | 0.751 | 0.750 |
+| | Safety Evaluation | 0.718 | 0.701 | 0.725 | 0.720 |
 </div>
 ## 6. Contact
 If you have any questions, please raise an issue on our GitHub repository or contact us at contact@MyAwesomeModel.ai.
 ```
+## Deduplication Notes
+- Incomplete checkpoints (missing eval_accuracy): step_100, step_200, step_300, step_400, step_500, step_600, step_700, step_800, step_900, step_1000
+- Malformed checkpoints (invalid eval_accuracy): None
+- Duplicates detected (identical pytorch_model.bin contents): all checkpoints share the same model file hash `965362299a238de576a92dfdd3e32aea7a2bacc94b2c41541c8c9258b923f587`.
+  - Keep: step_1000
+  - Remove: step_100, step_200, step_300, step_400, step_500, step_600, step_700, step_800, step_900
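
The checks summarized in the notes above could be reproduced with something like the following. This is a minimal sketch, not the script that produced these notes. It assumes a hypothetical layout in which each `step_N` directory holds a `pytorch_model.bin` and a `config.json` that may carry an `eval_accuracy` field. It also assumes the hash is SHA-256: the 64-character digest is consistent with that, but the notes do not name the algorithm. The function names `file_sha256` and `audit_checkpoints` are illustrative.

```python
import hashlib
import json
import re
from collections import defaultdict
from pathlib import Path

STEP_DIR = re.compile(r"step_(\d+)")


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_checkpoints(root: Path) -> dict:
    """Classify step_N directories under root (assumed layout, see lead-in)."""
    # Collect (step, directory) pairs and sort numerically, so that
    # step_1000 comes after step_900 rather than after step_100.
    steps = []
    for entry in root.iterdir():
        match = STEP_DIR.fullmatch(entry.name)
        if entry.is_dir() and match:
            steps.append((int(match.group(1)), entry))
    steps.sort(key=lambda pair: pair[0])

    incomplete, malformed = [], []
    by_hash = defaultdict(list)
    for _, ckpt in steps:
        config_path = ckpt / "config.json"
        config = json.loads(config_path.read_text()) if config_path.exists() else {}
        if "eval_accuracy" not in config:
            incomplete.append(ckpt.name)
        else:
            value = config["eval_accuracy"]
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            # Assumption: a valid accuracy is a number in [0, 1].
            if not is_number or not 0.0 <= value <= 1.0:
                malformed.append(ckpt.name)

        weights = ckpt / "pytorch_model.bin"
        if weights.exists():
            by_hash[file_sha256(weights)].append(ckpt.name)

    keep, remove = [], []
    for names in by_hash.values():
        # Names are in ascending step order; keep the latest, drop the rest.
        keep.append(names[-1])
        remove.extend(names[:-1])

    return {
        "incomplete": incomplete,
        "malformed": malformed,
        "keep": keep,
        "remove": remove,
    }
```

Calling `audit_checkpoints(Path("checkpoints"))` (the directory name is a placeholder) returns four lists that map onto the bullets above. Keeping the highest step in each group of identical files is one reasonable policy, chosen here because it matches the "Keep: step_1000" decision in the notes.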