RiddleHe committed on
Commit c3da79f · 1 Parent(s): dce1f7c

Add YC-Bench evaluation results (avg $1,208,190)

YC-Bench medium preset evaluation across seeds 1, 2, 3.

Score: $1,208,190 average final funds (USD).

Benchmark: https://huggingface.co/datasets/collinear-ai/yc-bench
Source: https://github.com/collinear-ai/yc-bench

Files changed (1)
  1. .eval_results/yc-bench.yaml +9 -0
.eval_results/yc-bench.yaml ADDED
@@ -0,0 +1,9 @@
+ - dataset:
+     id: collinear-ai/yc-bench
+     task_id: medium
+   value: 1208190
+   date: "2026-03-24"
+   source:
+     url: https://github.com/collinear-ai/yc-bench
+     name: "YC-Bench eval"
+   notes: "avg final funds (USD) across seeds 1,2,3. GLM-5 (via OpenRouter z-ai/glm-5)"
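
The entry above reports one value averaged over three seeds. As a rough illustration of how such a record might be assembled from per-seed results, here is a minimal Python sketch. The function name `build_eval_entry`, its parameters, and the rounding to whole dollars are assumptions made for illustration; they are not taken from the YC-Bench repository, and the per-seed figures are not part of this commit.

```python
from statistics import mean


def build_eval_entry(final_funds_by_seed, date, model_note):
    """Build one result record shaped like the YAML entry in this commit.

    final_funds_by_seed: hypothetical mapping {seed: final funds in USD}.
    date, model_note: free-form strings copied into the record.
    """
    if not final_funds_by_seed:
        raise ValueError("need at least one seed result")

    seeds = sorted(final_funds_by_seed)
    # Assumption: the reported value is the arithmetic mean, rounded to whole dollars.
    average = round(mean(final_funds_by_seed[s] for s in seeds))
    seed_list = ",".join(str(s) for s in seeds)

    return {
        "dataset": {"id": "collinear-ai/yc-bench", "task_id": "medium"},
        "value": average,
        "date": date,
        "source": {
            "url": "https://github.com/collinear-ai/yc-bench",
            "name": "YC-Bench eval",
        },
        "notes": f"avg final funds (USD) across seeds {seed_list}. {model_note}",
    }
```

Calling the function with the three per-seed final funds returns a dictionary with the same keys as the YAML entry, which a YAML library could then serialize.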