Update README.md
README.md
CHANGED
|
@@ -73,7 +73,7 @@ def generate_with_reasoning(prompt_text):
|
|
| 73 |
return generated, duration, num_generated_tokens
|
| 74 |
|
| 75 |
# Example Arabic math problem
|
| 76 |
-
|
| 77 |
|
| 78 |
result, time_taken, tokens = generate_with_reasoning(prompt)
|
| 79 |
print(result)
|
|
@@ -128,7 +128,7 @@ torch==2.4.1
|
|
| 128 |
The model is trained to follow a reasoning-first format:
|
| 129 |
|
| 130 |
```
|
| 131 |
-
<think>
|
| 132 |
<answer> 240,000 </answer>
|
| 133 |
```
|
| 134 |
|
|
@@ -168,26 +168,8 @@ The model is trained to follow a reasoning-first format:
|
|
| 168 |
|
| 169 |
---
|
| 170 |
|
| 171 |
-
## 🧑‍🔬 Authors
|
| 172 |
-
|
| 173 |
-
Developed and trained by **Omar Paniego** with adaptation of the DeepSeek-R1 training recipe using Hugging Face's open tools and datasets.
|
| 174 |
-
|
| 175 |
-
---
|
| 176 |
-
|
| 177 |
-
## 📢 License
|
| 178 |
-
|
| 179 |
-
Refer to the license file in the repository.
|
| 180 |
-
|
| 181 |
-
---
|
| 182 |
-
|
| 183 |
-
## ❤️ Acknowledgements
|
| 184 |
-
|
| 185 |
-
Thanks to:
|
| 186 |
-
- **Hugging Face Science Team** for `trl` and `math_verify`
|
| 187 |
-
- **AI-MO** for the NuminaMath-TIR dataset
|
| 188 |
-
- **DeepSeek Team** for releasing their methodology and insights
|
| 189 |
-
|
| 190 |
Happy reasoning! ✨
|
|
|
|
| 191 |
## Citations
|
| 192 |
|
| 193 |
Cite GRPO as:
|
|
|
|
| 73 |
return generated, duration, num_generated_tokens
|
| 74 |
|
| 75 |
# Example Arabic math problem
|
| 76 |
+
prompt_text = '''في مدينة يبلغ عدد سكانها 1 مليون نسمة، إذا كان 60% من السكان بالغين، و40% من البالغين يعملون، فكم عدد العاملين في المدينة؟'''
|
| 77 |
|
| 78 |
result, time_taken, tokens = generate_with_reasoning(prompt)
|
| 79 |
print(result)
|
|
|
|
| 128 |
The model is trained to follow a reasoning-first format:
|
| 129 |
|
| 130 |
```
|
| 131 |
+
<think> أولاً، نحسب 60% من مليون نسمة، وهو 600,000. ثم نحسب 40% من هذا العدد، وهو 240,000. </think>
|
| 132 |
<answer> 240,000 </answer>
|
| 133 |
```
|
| 134 |
|
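A minimal sketch of how output in this reasoning-first format could be split into its two parts. The helper name `parse_reasoning_output` is hypothetical and not part of the README; the sample string is an English paraphrase of the example above (60% of one million is 600,000; 40% of that is 240,000), and the code assumes each tag appears at most once per completion.

```python
import re

def parse_reasoning_output(text):
    """Split a completion into (reasoning, answer); either may be None.

    Hypothetical helper: assumes at most one <think> and one <answer> block.
    """
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    reasoning = think.group(1).strip() if think else None
    final = answer.group(1).strip() if answer else None
    return reasoning, final

# English paraphrase of the README's sample output (assumption, for illustration).
sample = (
    "<think> 60% of 1,000,000 is 600,000; 40% of that is 240,000. </think>\n"
    "<answer> 240,000 </answer>"
)
reasoning, final = parse_reasoning_output(sample)

# Cross-check the arithmetic: 60% of 1,000,000, then 40% of that.
adults = 1_000_000 * 60 // 100
workers = adults * 40 // 100
print(final, workers)
```

Returning `None` for a missing tag, rather than raising, lets a caller decide how to handle completions that do not follow the format.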
|
|
|
| 168 |
|
| 169 |
---
|
| 170 |
|
| 171 |
Happy reasoning! ✨
|
| 172 |
+
|
| 173 |
## Citations
|
| 174 |
|
| 175 |
Cite GRPO as:
|