Tags: Text Generation, Transformers, PyTorch, Safetensors, English, gpt2, alignment, instruction tuned, text generation, conversation, assistant, dpo, text-generation-inference, Inference Endpoints
nicholasKluge committed
Commit 1a8184c
1 parent: 2e7461f

Create README.md

Files changed (1)
  1. README.md +65 -0
README.md ADDED
@@ -0,0 +1,65 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - nicholasKluge/reward-aira-dataset
+ language:
+ - en
+ library_name: transformers
+ pipeline_tag: text-generation
+ tags:
+ - text-generation-inference
+ ---
+
+ # Aira-2-124M-DPO-checkpoint-200
+
+ ## Hyperparameters
+
+ ```yaml
+ model_args:
+   base_model: "nicholasKluge/Aira-2-124M"
+   model_ref: "nicholasKluge/Aira-2-124M"
+   cache_dir: null
+ data_args:
+   dataset_name: "nicholasKluge/reward-aira-dataset"
+   dataset_split: "english"
+   validation_split_percentage: null
+   streaming: false
+   max_prompt_length: 150
+   max_length: 600
+   sanity_check: false
+ training_args:
+   output_dir: "checkpoints"
+   do_eval: false
+   evaluation_strategy: "no"
+   save_strategy: "steps"
+   logging_strategy: "steps"
+   logging_steps: 200
+   max_steps: 2400
+   save_steps: 200
+   per_device_train_batch_size: 8
+   per_device_eval_batch_size: 8
+   gradient_accumulation_steps: 1
+   gradient_checkpointing: false
+   optim: "adamw_torch"
+   learning_rate: 0.00005
+   lr_scheduler_type: "cosine"
+   warmup_steps: 100
+   hub_token: null
+   push_to_hub: false
+   hub_model_id: null
+ extra_args:
+   project_name: "Aira-2"
+   wandb_token: null
+   beta: 0.8
+ ```
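
To show where `beta: 0.8` enters training, here is a minimal sketch of the per-pair DPO loss. It assumes the standard sigmoid DPO objective; the README does not say which loss variant or trainer was used. The function name `dpo_loss` and its arguments are illustrative, not taken from the training script. The inputs are summed log-probabilities of the chosen and rejected responses under the policy (`base_model`) and the frozen reference (`model_ref`).

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.8):
    """Sigmoid DPO loss for a single preference pair (illustrative sketch).

    beta defaults to the value listed under extra_args in the config above.
    """
    # How much more (in log space) the policy favors each response
    # than the frozen reference model does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # beta scales the preference margin; a larger beta penalizes
    # drifting away from the reference model more strongly.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)),
    # written in a numerically stable form for either sign of margin.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference agree exactly, the margin is zero and the loss equals `log(2)`; the loss falls below that once the policy prefers the chosen response more than the reference does.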
+
+ ## Eval
+
+ | Task |Version| Metric |Value | |Stderr|
+ |-------------|------:|--------|-----:|---|-----:|
+ |arc_challenge| 0|acc |0.2031|± |0.0118|
+ | | |acc_norm|0.2491|± |0.0126|
+ |toxigen | 0|acc |0.5521|± |0.0162|
+ | | |acc_norm|0.4340|± |0.0162|
+ |truthfulqa_mc| 1|mc1 |0.2485|± |0.0151|
+ | | |mc2 |0.4368|± |0.0153|
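
The `Value ± Stderr` pairs in the table can be turned into approximate confidence intervals. The sketch below assumes the `Stderr` column is the standard error of the metric and that a normal approximation is acceptable; neither is stated in the README. The `RESULTS` list and the helper `confidence_interval` are illustrative names, with numbers copied from the table.

```python
# (task, metric, value, stderr) rows copied from the Eval table.
RESULTS = [
    ("arc_challenge", "acc", 0.2031, 0.0118),
    ("arc_challenge", "acc_norm", 0.2491, 0.0126),
    ("toxigen", "acc", 0.5521, 0.0162),
    ("toxigen", "acc_norm", 0.4340, 0.0162),
    ("truthfulqa_mc", "mc1", 0.2485, 0.0151),
    ("truthfulqa_mc", "mc2", 0.4368, 0.0153),
]


def confidence_interval(value, stderr, z=1.96):
    """Return (low, high) for a normal-approximation interval.

    z=1.96 corresponds to roughly 95% coverage.
    """
    half_width = z * stderr
    return value - half_width, value + half_width


if __name__ == "__main__":
    for task, metric, value, stderr in RESULTS:
        low, high = confidence_interval(value, stderr)
        print(f"{task:14s} {metric:9s} {value:.4f}  [{low:.4f}, {high:.4f}]")
```

Intervals like these are mainly useful when comparing this checkpoint with others from the same run: differences smaller than the interval width are hard to distinguish from noise.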