Update README.md
README.md CHANGED

@@ -6,6 +6,7 @@ datasets:
 - QingyiSi/Alpaca-CoT
 - HuggingFaceH4/MATH-500
 - zai-org/LongWriter-6k
+- m-a-p/DeepWriting-20K
 language:
 - en
 pipeline_tag: text-generation
@@ -66,6 +67,7 @@ Not intended: safety-critical use, heavy factual QA at web scale, or domains req
 - QingyiSi/Alpaca-CoT ~128K Tokens [2, 1024], [1, 2048], [4, 512]
 - HuggingFaceH4/MATH-500 ~256K Tokens [8, 256], [4, 512]
 - zai-org/LongWriter-6k ~128K Tokens [2, 1024], [1, 2048]
+- SFT: prithivMLmods/Deepthink-Reasoning [8, 256] ~ Final Loss 0.3200 / Total Tokens 128512.0

 Training used modest token budgets (hundreds of thousands). Reported training logs showed healthy loss descent on both 512 and 1024 sequence lengths on CPU runs. Exact metrics will vary with tokenizer, preprocessing, and optimizer settings.
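The bracketed pairs alongside each dataset read as [batch_size, sequence_length] training shapes — an assumption, since the diff itself does not label them. Under that reading, all three shapes listed for Alpaca-CoT cover the same number of tokens per optimizer step, and the quoted ~128K-token budget implies only a few dozen steps. A minimal sketch of the arithmetic:

```python
def tokens_per_step(batch_size: int, seq_len: int) -> int:
    # Each optimizer step processes batch_size sequences of seq_len tokens.
    return batch_size * seq_len

# The three shapes listed for Alpaca-CoT all cover 2048 tokens per step:
for b, s in [(2, 1024), (1, 2048), (4, 512)]:
    assert tokens_per_step(b, s) == 2048

# A ~128K-token budget at [2, 1024] works out to about 62 steps.
steps = 128_000 // tokens_per_step(2, 1024)
print(steps)  # 62
```

This is consistent with the note below the table: budgets in the hundreds of thousands of tokens are small enough to run on CPU, which matches the reported CPU loss curves at the 512 and 1024 sequence lengths.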