ojus1 committed on
Commit 32a8720 · verified · 1 Parent(s): 8fc0d8b

Update README.md

Files changed (1)
  1. README.md +6 -0
README.md CHANGED
@@ -33,6 +33,12 @@ The code used to generate the dataset can be found [here](https://github.com/pre
  <img src="assets/line_plot.png" alt="Line Plot" width="80%">
  </div>
 
+ Notes:
+ - *The Funcdex-0.6B result is the average performance across the individual Funcdex-0.6B models.*
+ - For cost, we track the number of prompt/completion tokens used to evaluate 300 conversations.
+ - e.g. If token prices are input=$1 and output=$10 per million tokens, and evaluation needed `0.5M` input and `0.1M` output tokens, then the cost is `1 * 0.5 + 10 * 0.1 = $1.50`.
+ - *Qwen3-0.6B and Qwen3-1.7B evaluation costs are estimated by extrapolating from Llama3.2-3B serverless costs. Other models' costs are sourced from OpenRouter.*
+
  ## Results
 
  ### BFCL v3
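
The cost note added in this commit amounts to a weighted sum of token counts and per-million-token prices. A minimal sketch of that arithmetic follows, using the example figures from the note. The function name `evaluation_cost` and its parameter names are hypothetical, chosen for illustration, and are not taken from the Funcdex code.

```python
def evaluation_cost(input_tokens_m, output_tokens_m, input_price, output_price):
    """Return the dollar cost of an evaluation run.

    Token counts are given in millions of tokens; prices are in dollars
    per million tokens. (Hypothetical helper, not from the source repo.)
    """
    return input_tokens_m * input_price + output_tokens_m * output_price


# Example figures from the note: 0.5M input tokens at $1 per million,
# 0.1M output tokens at $10 per million.
cost = evaluation_cost(0.5, 0.1, input_price=1.0, output_price=10.0)
print(f"${cost:.2f}")
```

With the figures from the note, the input side contributes $0.50 and the output side $1.00.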