Di Zhang

qq8933

AI & ML interests

AI4Chem, LLM, Green LLM

qq8933's activity

posted an update about 17 hours ago
LLaMA-O1: Open Large Reasoning Model Frameworks For Training, Inference and Evaluation With PyTorch and HuggingFace
Large Reasoning Models powered by Monte Carlo Tree Search (MCTS), Self-Play Reinforcement Learning, PPO, AlphaGo Zero's dual-policy paradigm, and Large Language Models!
https://github.com/SimpleBerry/LLaMA-O1/
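
As a rough illustration of how these pieces could fit together (not the repository's actual code; every function and class name below is a placeholder), here is a minimal MCTS loop in which a language model acts as both the policy that proposes candidate reasoning steps and the value that scores them, in the spirit of AlphaGo Zero's dual-policy paradigm:

```python
import math
import random
from dataclasses import dataclass, field

# Placeholder "LLM" calls: in a real system these would be generations and
# scores from the language model. They are stubbed so the sketch runs standalone.
def propose_steps(state: str, k: int = 3) -> list[str]:
    return [f"{state} -> step{random.randint(0, 9)}" for _ in range(k)]

def score_state(state: str) -> float:
    return random.random()  # stand-in for a model-assigned reward in [0, 1]

@dataclass
class Node:
    state: str
    parent: "Node | None" = None
    children: list["Node"] = field(default_factory=list)
    visits: int = 0
    value_sum: float = 0.0

    def ucb(self, c: float = 1.4) -> float:
        if self.visits == 0:
            return float("inf")
        exploit = self.value_sum / self.visits
        explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return exploit + explore

def search(question: str, n_simulations: int = 32) -> Node:
    root = Node(state=question)
    for _ in range(n_simulations):
        # 1. Selection: walk down the tree by UCB until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.ucb)
        # 2. Expansion: the "policy" proposes candidate next reasoning steps.
        node.children = [Node(state=s, parent=node) for s in propose_steps(node.state)]
        leaf = random.choice(node.children)
        # 3. Evaluation: the "value" scores the new partial solution.
        reward = score_state(leaf.state)
        # 4. Backpropagation: update visit and value statistics up to the root.
        while leaf is not None:
            leaf.visits += 1
            leaf.value_sum += reward
            leaf = leaf.parent
    # The most-visited child is the preferred next reasoning step.
    return max(root.children, key=lambda n: n.visits)

if __name__ == "__main__":
    best = search("Natalia sold clips to 48 of her friends in April ...")
    print(best.state, best.visits)
```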

What will happen when you compound MCTS ❤ LLM ❤ Self-Play ❤ RLHF?
Just a little bite of strawberry! 🍓

Past related works:
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (2410.02884)
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B (2406.07394)
replied to their post 5 days ago
reacted to their post with 🤗 6 days ago
reacted to their post with 🚀 6 days ago
reacted to their post with ❤️ 6 days ago
reacted to their post with 🔥 6 days ago
reacted to their post with 👍 6 days ago
reacted to their post with 😎 6 days ago
replied to their post 6 days ago

Problem: Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?
Node: In April, Natalia sold 48 clips and in May, she sold 24 clips.
Node: In May, Natalia sold half as many clips as she did in April or 48/2=<<48/2=24>>24 clips.
Node: Natalia sold half the number of clips in May that she sold in April.
Node: In April, Natalia sold 48 clips. In May, she sold half as many, which is (48 \times \frac{1}{2} = 24) clips. This step correctly identifies the number of clips Natalia sold in May, which is 24.
Node: For the first part of the question, we need to know how many clips Natalia sold in April.
Node: If $|\mathbf{a}| = 5$ and $|\mathbf{b}| = X$, then, in terms of $k$, for how many values of $k$ is $\mathbf{a} + k \mathbf{b}$ parallel to $\begin{pmatrix} -4 \ 3 \end{pmatrix}$?
Node: The question is asking the total for April and May.
Node: So, in April, she sold 48 clips.
Node: 1. ** For April:** - Natalia sold 48 clips.
Node: Combined,Natalia sold 48+24=<<48+24=72>>72 clips in April and May.
Node: In total, Natalie sold 48+24=<<48+24=72>>72 clips in April and May.
Node: Thus, she sold a total of 48+24=<<48+24=72>>72 clips in April and May.
Node: Natalia sold 72-48=<<72-48=24>>24 more clips in May than in April.
Node: Natalia, the number of clips of each color she knew was 35, 25, 35, 25.
Node: This is just a guess but it should be correct.
Node: In addition, the calculation in Step 4 correctly identifies the number of more clips sold in May relative to April.
Node: 4. Step 4: The additional clips (72 - 48) are calculated to find out how many more she sold in May than April. The result, 72, is correct.
Node: Therefore, 24 more clips than the 48 she sold in April, she sold in May.
Node: 5. **** To find out how many clips were sold in April, we divide the total number she sold in April (48 clips) by the number of clips sold in April (24 clips), which correctly calculates 2).
Node: In total, she sold 48 clips in April and 24 clips in May, so 48 + 24 = 72 clips.
Node: Let's verify that 24 is indeed half of 48 and that the answer rounded to the nearest whole number is 24.
Node: So, in all, Natalia sold 48+24 = <<48+24=72>>72 clips. The answer is: 72
Node: Natalia sold 48 clips in April which is 2 more than she sold in May. This number is 2 times the number she sold in May so 272=<<2*72=144>>144.
Node: 3. Step 3: Adding the number of clips sold in April (48) and May (24) gives the total number of clips sold in April and May, which is 48 + 24. The sum is correct: 72.
Node: 2. ** Step 4
*: The information given about the number of clips sold in May is correct and necessary. The reasoning that Natalia sold half the number of clips she sold in April is logically sound, and the total number of clips sold (72) is accurately computed based on this information.
Node: In total, she sold 120 clips in May (48 + 24).
Node: It was stated in the problem that Natalia sold half the number of clips in May that she sold in April. In April, she sold 48 clips. Therefore, adding 24 to this gives (48 + 24 = 72) clips sold in April and May combined. This calculation is correct.The final answer is True.
Node: In all, she sold 48+24 = <<48+24=72>>72 clips. The answer is: 72
Node: In total, Natalia sold 120+48 = <<120+48=168>>168 clips. The answer is: 168
Node: So, she sold 63+72 = <<63+72=135>>135 clips in April and May.
Node: The final total is the sum of the clips sold in April (48 clips) and the clips sold in May (24 clips), which correctly totals 72 clips.
Node: 48+24=<<48+24=72>>72 clips
Node: So, she sold a total of 48+24=<<48+24=72>>72 clips. The answer is: 72
Node: Finally, to verify, we can calculate the total for the 90 clients as described. If she sold half as many clips in May as she did in April, then in May, she would sell 48 / 2 = <<48/2=24>>24 clips. Therefore, the total number of clips sold is (48 + 24 = 72). This step confirms the accuracy of the solution.Since all steps are labeled as 'true', all are also correctly calculated, and the solution is accurate.The final answer is True.
Node: Right. And she sold 72 clips, so 48+72=<<48+72=120>>120 clips.

Answer

120
Node: Yes, so she sold 72 clips.

Answer

72
Node: So, in total, she sold 72+48=<<72+48=120>>120 clips.
Node: Right. So she sold 48+24=<<48+24=72>>72 clips. The answer is: 72

posted an update 6 days ago
Announcement! We have made significant progress in our efforts to replicate OpenAI O1 based on the AlphaGo Zero architecture—LLaMA-O1. We have successfully enabled the model to acquire advanced thinking skills through interaction with the search tree during the learning process without human annotations.

We plan to complete the model training and evaluation no later than the end of November and will release all data, models, and code to the community.
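
For intuition, "learning through interaction with the search tree" can be pictured in AlphaGo Zero terms: the tree's own statistics become the training targets, so no human annotations are needed. The sketch below is only an illustration under that reading; the function, tensor names, and loss form are placeholders, not the LLaMA-O1 training code.

```python
import torch
import torch.nn.functional as F

def self_play_loss(policy_logits: torch.Tensor,
                   value_pred: torch.Tensor,
                   visit_counts: torch.Tensor,
                   search_value: torch.Tensor) -> torch.Tensor:
    """AlphaGo Zero-style objective built purely from search statistics.

    policy_logits: (batch, num_candidates) model scores for candidate steps
    value_pred:    (batch,) model's predicted value of the current state
    visit_counts:  (batch, num_candidates) MCTS visit counts (no human labels)
    search_value:  (batch,) backed-up value estimate from the search tree
    """
    # Policy target: normalized visit distribution taken from the search tree.
    pi = visit_counts / visit_counts.sum(dim=-1, keepdim=True)
    policy_loss = F.cross_entropy(policy_logits, pi)  # soft probability targets
    # Value target: regress toward the tree's backed-up value.
    value_loss = F.mse_loss(value_pred, search_value)
    return policy_loss + value_loss

# Example with dummy search statistics:
logits = torch.randn(2, 4)
value = torch.zeros(2)
visits = torch.tensor([[10., 3., 1., 0.], [2., 2., 8., 4.]])
target_v = torch.tensor([0.7, 0.4])
print(self_play_loss(logits, value, visits, target_v))
```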


Past related papers:
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (2410.02884)
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B (2406.07394)

For the linear representation format of long CoT (OpenLongCoT), please refer to:
qq8933/OpenLongCoT-Pretrain
qq8933/OpenLongCoT-SFT
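
To inspect that linear format, the datasets can be pulled with the standard Hugging Face datasets library (the split name and record fields below are assumptions; only load_dataset itself is taken as given):

```python
from datasets import load_dataset

# Pull the linearized long-CoT corpora from the Hugging Face Hub.
pretrain = load_dataset("qq8933/OpenLongCoT-Pretrain", split="train")
sft = load_dataset("qq8933/OpenLongCoT-SFT", split="train")

# Peek at one linearized record to see the representation format.
print(pretrain[0])
print(sft.column_names)
```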
posted an update about 2 months ago
🚀 Exciting News! At the end of September, we're releasing an upgraded version of the MCTSr algorithm, LLaMA-O1, as an open-source fork of the OpenAI Strawberry project. This enhanced tool specializes in tackling the most challenging Mathematical Olympiad tasks, with results including:

A remarkable score of 8/30 in AIME 2024 🏆
An impressive 26.7% in GPQA Diamond 💎
A solid 58% in AMC 2023 📊
Stay tuned for the release and get ready to explore the cutting edge of mathematical problem-solving! #OpenSource #MathOlympiad #Innovation

Previous version:
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B (2406.07394)
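
For readers unfamiliar with the acronym, MCTSr is the Monte Carlo Tree Self-refine procedure from the paper above: complete answers are tree nodes, a node is selected by UCT, the model critiques and rewrites it into a child answer, the child is self-scored, and the score is backpropagated. A toy sketch of that loop follows; every helper name is a placeholder, and the real prompting and scoring live in the paper and repository.

```python
import math
import random

# Stand-ins for LLM calls: in MCTSr these would be model generations.
def initial_answer(question: str) -> str:
    return "draft answer"

def critique_and_refine(question: str, answer: str) -> str:
    return answer + " (refined)"

def self_evaluate(question: str, answer: str) -> float:
    return random.random()  # model-assigned score in [0, 1]

def mctsr(question: str, rollouts: int = 8, c: float = 1.0) -> str:
    # Each node is a complete answer with its own value/visit statistics.
    answers = [initial_answer(question)]
    q, n = [self_evaluate(question, answers[0])], [1]
    for t in range(1, rollouts + 1):
        # Selection: UCT over existing answers.
        ucb = [q[i] / n[i] + c * math.sqrt(math.log(t) / n[i]) for i in range(len(answers))]
        i = max(range(len(answers)), key=ucb.__getitem__)
        # Refinement: the model criticizes and rewrites the chosen answer.
        child = critique_and_refine(question, answers[i])
        reward = self_evaluate(question, child)
        answers.append(child); q.append(reward); n.append(1)
        # Backpropagation: credit the parent for producing a good child.
        q[i] += reward; n[i] += 1
    best = max(range(len(answers)), key=lambda i: q[i] / n[i])
    return answers[best]

print(mctsr("What is 48 + 24?"))
```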
posted an update 3 months ago
replied to their post 3 months ago
posted an update 4 months ago
Preview:
We will open-source the 2.5B ChemVLM and the tool-enhanced ChemLLM-7B in the near future.
posted an update 5 months ago
reacted to their post with 😎 5 months ago
posted an update 5 months ago
Tools Ready!
Thanks to ChemCrow's great work, ChemLLM now supports professional toolkits, including:
Molecule Name Retrieval
Molecule Property Query
Patent Check
Molecule Safety Query
Try it on chemllm.org
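
As a rough illustration of what a "Molecule Property Query" style of tool computes, here is a small RDKit-based function; this is not the ChemLLM tool interface (which is not shown here), just an assumed example of the kind of call such a toolkit could wrap.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

def molecule_properties(smiles: str) -> dict:
    """Compute a few basic properties for a molecule given as SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    return {
        "molecular_weight": Descriptors.MolWt(mol),
        "logp": Descriptors.MolLogP(mol),
        "h_bond_donors": Descriptors.NumHDonors(mol),
        "h_bond_acceptors": Descriptors.NumHAcceptors(mol),
    }

# Example: aspirin
print(molecule_properties("CC(=O)OC1=CC=CC=C1C(=O)O"))
```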
posted an update 5 months ago
A new appearance from Ollama Open WebUI!
And also web search, real-time talking, and file RAG!
https://chemllm.org/


posted an update 5 months ago
replied to their post 5 months ago