Spaces: ceoavinash/codearena-rl

havinashpatil committed 12 days ago

Commit 8599a81

1 Parent(s): 3f9399a

chore: update dependencies and include training results for README

Files changed (5)

requirements.txt CHANGED

@@ -3,3 +3,9 @@ uvicorn[standard]>=0.23.0
 pydantic>=2.0.0
 openai>=1.0.0
 httpx>=0.24.1
+pandas
+matplotlib
+transformers
+torch
+datasets
+trl

results/reward_by_task.png ADDED

results/reward_curve.png ADDED

rewards_log.csv ADDED


+timestamp,task_id,step,reward,compile_score,test_ratio,efficiency_score
+2026-04-25T11:18:35.777063,easy-1,5,0.01,0.0,0.0,0.0
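
As a rough illustration of how a log in this shape could be consumed, the sketch below parses the header and the single row shown above and averages `reward` per `task_id`. It is not the repository's `plot_rewards.py`, whose contents are not part of this diff; the function name `mean_reward_by_task` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Header and row copied verbatim from the rewards_log.csv diff above.
LOG_TEXT = """timestamp,task_id,step,reward,compile_score,test_ratio,efficiency_score
2026-04-25T11:18:35.777063,easy-1,5,0.01,0.0,0.0,0.0
"""


def mean_reward_by_task(text: str) -> dict[str, float]:
    """Hypothetical helper: average the reward column for each task_id."""
    rewards = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        rewards[row["task_id"]].append(float(row["reward"]))
    return {task: sum(vals) / len(vals) for task, vals in rewards.items()}


print(mean_reward_by_task(LOG_TEXT))  # {'easy-1': 0.01}
```

With a real log file, the same function could be fed `open("rewards_log.csv").read()` instead of the inline string.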

test_server.py ADDED

+import httpx
+import time
+import subprocess
+import sys
+import os
+def main():
+    print("Starting server...")
+    server_process = subprocess.Popen([sys.executable, "-m", "uvicorn", "server.app:app", "--port", "7860"])
+    time.sleep(3) # Wait for server to start
+    try:
+        print("Testing /reset...")
+        res = httpx.post("http://localhost:7860/reset", json={"task_id": "auto"})
+        res.raise_for_status()
+        print("Running inference.py...")
+        # Run only the easy-1 task for a single episode to save time
+        env = os.environ.copy()
+        env["CODEARENA_TASK"] = "easy-1"
+        # No real OpenAI key or HF model is available here, so inference.py
+        # is expected to hit its fallback path and still succeed
+        subprocess.run([sys.executable, "inference.py", "--backend", "openai"], env=env, check=True)
+        print("Running plot_rewards.py...")
+        subprocess.run([sys.executable, "plot_rewards.py"], check=True)
+        print("All tests passed.")
+    except Exception as e:
+        print("Test failed:", e)
+        sys.exit(1)  # Exit non-zero so the failure is visible; finally still runs
+    finally:
+        server_process.terminate()
+        server_process.wait()
+if __name__ == "__main__":
+    main()
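
The fixed `time.sleep(3)` in `test_server.py` can be too short on a slow machine and needlessly long on a fast one. One possible alternative, sketched here with only the standard library, is to poll the port until it accepts a TCP connection. The helper name `wait_for_port` and its parameters are assumptions for illustration and are not part of this commit.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0,
                  interval: float = 0.1) -> bool:
    """Return True once a TCP connection to (host, port) succeeds.

    Returns False if no connection could be made before `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or refused); back off briefly and retry.
            time.sleep(interval)
    return False
```

In the test script this could replace the sleep with something like `if not wait_for_port("localhost", 7860): raise RuntimeError("server did not start")`. Note that an open port only shows the socket is listening; it does not guarantee the application has finished its own startup.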