jwhj committed on
Commit 97ef421 · verified · 1 Parent(s): ab53ac8

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -4,7 +4,7 @@ Source code for [Offline Reinforcement Learning for LLM Multi-Step Reasoning](ht
 
 Model: [Policy](https://huggingface.co/jwhj/Qwen2.5-Math-1.5B-OREO) | [Value](https://huggingface.co/jwhj/Qwen2.5-Math-1.5B-OREO-Value)
 
-<img src="./OREO.png" alt="Image description" width="50%" />
+<img src="https://raw.githubusercontent.com/jwhj/OREO/refs/heads/main/OREO.png" alt="Image description" width="50%" />
 
 
 # Installation