---
license: apache-2.0
datasets:
- nicholasKluge/reward-aira-dataset
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- text-generation-inference
---

# Aira-2-124M-DPO-checkpoint-200

## Hyperparameters

```yaml
model_args:
  base_model: "nicholasKluge/Aira-2-124M"
  model_ref: "nicholasKluge/Aira-2-124M"
  cache_dir: null
data_args:
  dataset_name: "nicholasKluge/reward-aira-dataset"
  dataset_split: "english"
  validation_split_percentage: null
  streaming: false
  max_prompt_length: 150
  max_length: 600
  sanity_check: false
training_args:
  output_dir: "checkpoints"
  do_eval: false
  evaluation_strategy: "no"
  save_strategy: "steps"
  logging_strategy: "steps"
  logging_steps: 200
  max_steps: 2400
  save_steps: 200
  per_device_train_batch_size: 8
  per_device_eval_batch_size: 8
  gradient_accumulation_steps: 1
  gradient_checkpointing: false
  optim: "adamw_torch"
  learning_rate: 0.00005
  lr_scheduler_type: "cosine"
  warmup_steps: 100
  hub_token: null
  push_to_hub: false
  hub_model_id: null
extra_args:
  project_name: "Aira-2"
  wandb_token: null
  beta: 0.8
```

## Eval

| Task          | Version | Metric   | Value  |   | Stderr |
|---------------|--------:|----------|-------:|---|-------:|
| arc_challenge |       0 | acc      | 0.2031 | ± | 0.0118 |
|               |         | acc_norm | 0.2491 | ± | 0.0126 |
| toxigen       |       0 | acc      | 0.5521 | ± | 0.0162 |
|               |         | acc_norm | 0.4340 | ± | 0.0162 |
| truthfulqa_mc |       1 | mc1      | 0.2485 | ± | 0.0151 |
|               |         | mc2      | 0.4368 | ± | 0.0153 |
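
## Usage

A minimal sketch for loading this checkpoint with the `transformers` text-generation API. The repository ID below is assumed from this card's title, and the prompt and sampling settings are illustrative only; adjust them to the checkpoint's actual Hub location and your use case.

```python
# Minimal usage sketch: load the DPO checkpoint and generate a completion.
# NOTE: the repo ID is an assumption based on the card title, not confirmed by the source.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nicholasKluge/Aira-2-124M-DPO-checkpoint-200"  # assumed repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "What is artificial intelligence?"
inputs = tokenizer(prompt, return_tensors="pt")

# Sampling parameters here are illustrative, not the values used during training.
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    top_k=50,
    top_p=0.95,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```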