Omartificial-Intelligence-Space committed on
Commit 40dd8fa · verified · 1 Parent(s): 4ff268f

Update README.md

Files changed (1):
  1. README.md +3 -21
README.md CHANGED
@@ -73,7 +73,7 @@ def generate_with_reasoning(prompt_text):
     return generated, duration, num_generated_tokens
 
 # Example Arabic math problem
-prompt = """A conversation between User and Assistant. The user asks a question, and the Assistant solves it. The assistant first thinks about the reasoning process in the mind and then provides the user with the answer either in Arabic or English based on the user's language. The reasoning process and answer are enclosed within <think> </think> and <answer> </answer> tags, respectively, i.e., <think> reasoning process here </think><answer> answer here </answer> في مدينة يبلغ عدد سكانها 1 مليون نسمة، إذا كان 60% من السكان بالغين، و40% من البالغين يعملون، فكم عدد العاملين في المدينة؟"""
+prompt_text = '''في مدينة يبلغ عدد سكانها 1 مليون نسمة، إذا كان 60% من السكان بالغين، و40% من البالغين يعملون، فكم عدد العاملين في المدينة؟'''  # "In a city of 1 million people, if 60% of the population are adults and 40% of the adults work, how many workers are in the city?"
 
-result, time_taken, tokens = generate_with_reasoning(prompt)
+result, time_taken, tokens = generate_with_reasoning(prompt_text)
 print(result)
@@ -128,7 +128,7 @@ torch==2.4.1
 The model is trained to follow a reasoning-first format:
 
 ```
-<think> First, we calculate 60% of 1 million, which is 600,000. Then, 40% of that is 240,000. </think>
+<think> أولاً، نحسب 60% من مليون نسمة، وهو 600,000. ثم نحسب 40% من هذا العدد، وهو 240,000. </think>
 <answer> 240,000 </answer>
 ```
@@ -168,26 +168,8 @@ The model is trained to follow a reasoning-first format:
 
 ---
 
-## 🧑‍🔬 Authors
-
-Developed and trained by **Omar Paniego** with adaptation of the DeepSeek-R1 training recipe using Hugging Face's open tools and datasets.
-
----
-
-## 📢 License
-
-Refer to the license file in the repository.
-
----
-
-## ❤️ Acknowledgements
-
-Thanks to:
-- **Hugging Face Science Team** for `trl` and `math_verify`
-- **AI-MO** for the NuminaMath-TIR dataset
-- **DeepSeek Team** for releasing their methodology and insights
-
 Happy reasoning! 🔍✨
+
 ## Citations
 
 Cite GRPO as:
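The `<think>`/`<answer>` output format that the README change above documents lends itself to simple post-processing. A minimal sketch of how a completion could be split into its reasoning and answer parts; `parse_reasoning_output` and the sample `completion` string are illustrative, not part of the repository:

```python
import re

def parse_reasoning_output(text):
    """Split a completion in the <think>/<answer> format into its two parts.

    Returns (reasoning, answer); either may be None if its tag is missing.
    """
    # Non-greedy matches with DOTALL so multi-line reasoning is captured.
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return (
        think.group(1).strip() if think else None,
        answer.group(1).strip() if answer else None,
    )

completion = (
    "<think> 60% of 1,000,000 is 600,000; 40% of that is 240,000. </think>"
    "<answer> 240,000 </answer>"
)
reasoning, answer = parse_reasoning_output(completion)
print(answer)  # 240,000
```

Keeping the extraction tolerant (returning `None` rather than raising when a tag is absent) is useful here, since a model trained on this format can still occasionally emit malformed completions.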