qq8933 
posted an update 3 days ago
Announcement! We have made significant progress in our effort to replicate OpenAI O1 with LLaMA-O1, which is based on the AlphaGo Zero architecture. We have successfully enabled the model to acquire advanced thinking skills by interacting with the search tree during learning, without human annotations.

We plan to complete the model training and evaluation no later than the end of November and will release all data, models, and code to the community.


Past related papers:
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (2410.02884)
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B (2406.07394)

For the linear representation format of Long CoT (OpenLongCoT), please refer to:
qq8933/OpenLongCoT-Pretrain
qq8933/OpenLongCoT-SFT

Problem: Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?
Node: In April, Natalia sold 48 clips and in May, she sold 24 clips.
Node: In May, Natalia sold half as many clips as she did in April or 48/2=<<48/2=24>>24 clips.
Node: Natalia sold half the number of clips in May that she sold in April.
Node: In April, Natalia sold 48 clips. In May, she sold half as many, which is (48 \times \frac{1}{2} = 24) clips. This step correctly identifies the number of clips Natalia sold in May, which is 24.
Node: For the first part of the question, we need to know how many clips Natalia sold in April.
Node: If $|\mathbf{a}| = 5$ and $|\mathbf{b}| = X$, then, in terms of $k$, for how many values of $k$ is $\mathbf{a} + k \mathbf{b}$ parallel to $\begin{pmatrix} -4 \\ 3 \end{pmatrix}$?
Node: The question is asking the total for April and May.
Node: So, in April, she sold 48 clips.
Node: 1. ** For April:** - Natalia sold 48 clips.
Node: Combined,Natalia sold 48+24=<<48+24=72>>72 clips in April and May.
Node: In total, Natalie sold 48+24=<<48+24=72>>72 clips in April and May.
Node: Thus, she sold a total of 48+24=<<48+24=72>>72 clips in April and May.
Node: Natalia sold 72-48=<<72-48=24>>24 more clips in May than in April.
Node: Natalia, the number of clips of each color she knew was 35, 25, 35, 25.
Node: This is just a guess but it should be correct.
Node: In addition, the calculation in Step 4 correctly identifies the number of more clips sold in May relative to April.
Node: 4. Step 4: The additional clips (72 - 48) are calculated to find out how many more she sold in May than April. The result, 72, is correct.
Node: Therefore, 24 more clips than the 48 she sold in April, she sold in May.
Node: 5. **** To find out how many clips were sold in April, we divide the total number she sold in April (48 clips) by the number of clips sold in April (24 clips), which correctly calculates 2).
Node: In total, she sold 48 clips in April and 24 clips in May, so 48 + 24 = 72 clips.
Node: Let's verify that 24 is indeed half of 48 and that the answer rounded to the nearest whole number is 24.
Node: So, in all, Natalia sold 48+24 = <<48+24=72>>72 clips. The answer is: 72
Node: Natalia sold 48 clips in April which is 2 more than she sold in May. This number is 2 times the number she sold in May so 2*72=<<2*72=144>>144.
Node: 3. Step 3: Adding the number of clips sold in April (48) and May (24) gives the total number of clips sold in April and May, which is 48 + 24. The sum is correct: 72.
Node: 2. **Step 4**: The information given about the number of clips sold in May is correct and necessary. The reasoning that Natalia sold half the number of clips she sold in April is logically sound, and the total number of clips sold (72) is accurately computed based on this information.
Node: In total, she sold 120 clips in May (48 + 24).
Node: It was stated in the problem that Natalia sold half the number of clips in May that she sold in April. In April, she sold 48 clips. Therefore, adding 24 to this gives (48 + 24 = 72) clips sold in April and May combined. This calculation is correct.The final answer is True.
Node: In all, she sold 48+24 = <<48+24=72>>72 clips. The answer is: 72
Node: In total, Natalia sold 120+48 = <<120+48=168>>168 clips. The answer is: 168
Node: So, she sold 63+72 = <<63+72=135>>135 clips in April and May.
Node: The final total is the sum of the clips sold in April (48 clips) and the clips sold in May (24 clips), which correctly totals 72 clips.
Node: 48+24=<<48+24=72>>72 clips
Node: So, she sold a total of 48+24=<<48+24=72>>72 clips. The answer is: 72
Node: Finally, to verify, we can calculate the total for the 90 clients as described. If she sold half as many clips in May as she did in April, then in May, she would sell 48 / 2 = <<48/2=24>>24 clips. Therefore, the total number of clips sold is (48 + 24 = 72). This step confirms the accuracy of the solution.Since all steps are labeled as 'true', all are also correctly calculated, and the solution is accurate.The final answer is True.
Node: Right. And she sold 72 clips, so 48+72=<<48+72=120>>120 clips.

Answer

120
Node: Yes, so she sold 72 clips.

Answer

72
Node: So, in total, she sold 72+48=<<72+48=120>>120 clips.
Node: Right. So she sold 48+24=<<48+24=72>>72 clips. The answer is: 72
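
The sample above can be read back into a structure with a few lines of Python. This is only a minimal sketch: it assumes the linear format is exactly what the sample shows (one `Problem:` line followed by `Node:` lines, with any other non-empty line belonging to the previous node), which is inferred from this post and not from the dataset schema. The names `Sample`, `parse_linear`, and `check_annotations` are hypothetical. The second helper checks the `<<expr=result>>` calculator annotations that appear in many nodes, and only handles simple two-operand expressions.

```python
from __future__ import annotations

import operator
import re
from dataclasses import dataclass, field

# Calculator annotations as they appear in the sample, e.g. <<48/2=24>>.
ANNOTATION = re.compile(r"<<([^<>=]+)=([^<>]+)>>")
# Assumption: only simple "number operator number" expressions are checked.
BINARY = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*([+\-*/])\s*(-?\d+(?:\.\d+)?)\s*$")
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


@dataclass
class Sample:
    """Hypothetical container for one linearized search tree."""
    problem: str = ""
    nodes: list[str] = field(default_factory=list)


def parse_linear(text: str) -> Sample:
    """Split a 'Problem: / Node:' listing into a problem and its nodes."""
    sample = Sample()
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Problem:"):
            sample.problem = line[len("Problem:"):].strip()
        elif line.startswith("Node:"):
            sample.nodes.append(line[len("Node:"):].strip())
        elif line and sample.nodes:
            # Assumed continuation of the previous node (e.g. "Answer", "72").
            sample.nodes[-1] += "\n" + line
    return sample


def check_annotations(node: str) -> list[tuple[str, bool]]:
    """Return (annotation, is_arithmetically_correct) for each <<a op b=c>>."""
    results = []
    for expr, claimed in ANNOTATION.findall(node):
        match = BINARY.match(expr)
        if match is None:
            continue  # expression too complex for this sketch
        a, op, b = float(match.group(1)), match.group(2), float(match.group(3))
        if op == "/" and b == 0:
            continue  # skip division by zero
        try:
            claimed_value = float(claimed)
        except ValueError:
            continue  # non-numeric result, not checked here
        ok = abs(OPS[op](a, b) - claimed_value) < 1e-9
        results.append((f"{expr}={claimed}", ok))
    return results


if __name__ == "__main__":
    demo = (
        "Problem: Natalia sold clips to 48 of her friends in April...\n"
        "Node: In May, Natalia sold 48/2=<<48/2=24>>24 clips.\n"
        "Node: Right. And she sold 72 clips, so 48+72=<<48+72=120>>120 clips.\n"
        "\nAnswer\n\n120\n"
    )
    parsed = parse_linear(demo)
    for node in parsed.nodes:
        print(check_annotations(node), "|", node.splitlines()[0])
```

Note that this check only covers the arithmetic inside an annotation. A node such as `48+72=<<48+72=120>>120` passes it even though the step leads to the wrong answer, so it is no substitute for scoring the nodes themselves.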

I looked at the training dataset you're using and it's very interesting - are you training the model to do the complete chain in a single inference call? Or are you orchestrating multiple requests to the model to go to the next step or rate the progress?

The open-source plan is ahead of schedule! We will open-source the early-stage source code before next weekend!

Upcoming repo: https://github.com/SimpleBerry/LRM