clembench-playpen/meta-llama-Meta-Llama-3.1-8B-Instruct_SFT_E1_D20001

Generated from Trainer


Nicohst committed on Sep 26

Commit

415b42b


1 Parent(s): 7b4ac55

Update README.md

added training info

Files changed (1)

README.md +16 -2

README.md CHANGED

@@ -22,7 +22,21 @@ This model is a fine-tuned version of [unsloth/meta-llama-3.1-8b-instruct-bnb-4b
 ## Model description
-More information needed
 ## Intended uses & limitations
@@ -30,7 +44,7 @@ More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure

 ## Model description
+The model is trained only on successful episodes produced by the top 10 models from clembench benchmark versions 0.9 and 1.0.
+Models were ranked by their total number of successful episodes across all games.
+| Place | Model pair |
+|-------|------|
+| 1 | gpt-4-0613-t0.0--gpt-4-0613-t0.0 |
+| 2 | claude-v1.3-t0.0--claude-v1.3-t0.0 |
+| 3 | gpt-4-1106-preview-t0.0--gpt-4-1106-preview-t0.0 |
+| 4 | gpt-4-t0.0--gpt-4-t0.0 |
+| 5 | gpt-4-0314-t0.0--gpt-4-0314-t0.0 |
+| 6 | claude-2.1-t0.0--claude-2.1-t0.0 |
+| 7 | gpt-4-t0.0--gpt-3.5-turbo-t0.0 |
+| 8 | claude-2-t0.0--claude-2-t0.0 |
+| 9 | gpt-3.5-turbo-1106-t0.0--gpt-3.5-turbo-1106-t0.0 |
+| 10 | gpt-3.5-turbo-0613-t0.0--gpt-3.5-turbo-0613-t0.0 |
 ## Intended uses & limitations
 ## Training and evaluation data
+Training data: D20001
 ## Training procedure
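
The selection rule in the new model description (rank models by their number of successful episodes across all games, keep the top 10, train only on their successful episodes) could be sketched roughly as follows. This is a minimal illustration, not the code used to build D20001: the record layout (`model`, `game`, `success` keys), the function name, and the sample values are all hypothetical.

```python
from collections import Counter


def select_training_episodes(episodes, top_k=10):
    """Return (top_models, selected_episodes).

    `episodes` is assumed to be a list of dicts with the hypothetical
    keys 'model', 'game' and 'success'; the real data layout may differ.
    """
    # Count successful episodes per model, pooled over all games.
    wins = Counter(ep["model"] for ep in episodes if ep["success"])
    # Keep the top_k models with the most successful episodes.
    top_models = [model for model, _ in wins.most_common(top_k)]
    keep = set(top_models)
    # Retain only successful episodes played by those models.
    selected = [ep for ep in episodes if ep["success"] and ep["model"] in keep]
    return top_models, selected


# Toy records with placeholder names, for illustration only.
episodes = [
    {"model": "model-a", "game": "game-1", "success": True},
    {"model": "model-a", "game": "game-2", "success": True},
    {"model": "model-b", "game": "game-1", "success": True},
    {"model": "model-b", "game": "game-2", "success": False},
    {"model": "model-c", "game": "game-1", "success": False},
]
top_models, selected = select_training_episodes(episodes, top_k=1)
```

With `top_k=1`, only the two successful `model-a` episodes are kept. How ties between models are broken is not stated in the README, so the sketch simply follows the order returned by `Counter.most_common`.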