---
license: apache-2.0
base_model: NousResearch/Nous-Hermes-2-Mistral-7B-DPO
tags:
- generated_from_trainer
model-index:
- name: workspace/out-mistral-2B
  results: []
---

[Built with Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)
See axolotl config

axolotl version: `0.4.0`
```yaml
adapter: null
base_model: NousResearch/Nous-Hermes-2-Mistral-7B-DPO
batch_size: 2
bf16: auto
dataset_prepared_path: null
datasets:
- ds_type: json
  path: /workspace/data.jsonl
  type: context_qa.load_v2
debug: null
deepspeed: null
early_stopping_patience: null
evals_per_epoch: 4
flash_attention: null
fp16: null
fsdp: null
fsdp_config: null
gptq_groupsize: null
gptq_model_v1: null
gradient_checkpointing: true
group_by_length: false
learning_rate: 1.0e-05
local_rank: null
logging_steps: 1
lora_alpha: 32
lora_dropout: 0.2
lora_fan_in_fan_out: null
lora_model_dir: null
lora_r: 64
lora_target_linear: true
lora_target_modules: null
lr_scheduler: cosine
max_packed_sequence_len: null
micro_batch_size: 1
model_config:
  output_router_logits: true
model_type: MistralForCausalLM
num_epochs: 4
optimizer: adamw_bnb_8bit
output_dir: /workspace/out-mistral-2B
resume_from_checkpoint: null
saves_per_epoch: 1
sequence_len: 2048
special_tokens:
  bos_token: <s>
  eos_token: <|im_end|>
  pad_token: </s>
tf32: true
tokenizer_type: LlamaTokenizer
torchdistx_path: null
train_on_inputs: false
trust_remote_code: true
val_set_size: 0.05
wandb_entity: null
wandb_log_model: Nous-Hermes-2-Mistral-7B-DPO
wandb_name: mistral
wandb_project: Ultron-llama
wandb_watch: null
warmup_steps: 40
weight_decay: 0.0
xformers_attention: true
```
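For context, a config like the one above is normally launched through the Axolotl CLI. The filename `config.yml` below is illustrative (this card does not name the config file), and the dataset declared at `/workspace/data.jsonl` must exist locally.

```bash
# Sketch of the standard Axolotl 0.4.x launch command for the YAML above,
# assuming it has been saved as config.yml; not a command taken from this card.
accelerate launch -m axolotl.cli.train config.yml
```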

# workspace/out-mistral-2B

This model is a fine-tuned version of [NousResearch/Nous-Hermes-2-Mistral-7B-DPO](https://huggingface.co/NousResearch/Nous-Hermes-2-Mistral-7B-DPO) on a custom JSON dataset (`/workspace/data.jsonl`, loaded with the `context_qa.load_v2` prompt strategy; see the config above).
It achieves the following results on the evaluation set:
- Loss: 0.5036

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- total_eval_batch_size: 2
- optimizer: 8-bit AdamW (`adamw_bnb_8bit`) with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 40
- num_epochs: 4

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.6411        | 0.02  | 1    | 0.4803          |
| 0.5321        | 0.26  | 11   | 0.3867          |
| 0.4077        | 0.51  | 22   | 0.3591          |
| 0.4455        | 0.77  | 33   | 0.3995          |
| 0.2921        | 1.02  | 44   | 0.4368          |
| 0.3459        | 1.28  | 55   | 0.4884          |
| 0.2768        | 1.53  | 66   | 0.4978          |
| 0.4168        | 1.79  | 77   | 0.4808          |
| 0.14          | 2.05  | 88   | 0.4547          |
| 0.1132        | 2.3   | 99   | 0.4856          |
| 0.1055        | 2.56  | 110  | 0.4916          |
| 0.1385        | 2.81  | 121  | 0.4783          |
| 0.0455        | 3.07  | 132  | 0.4677          |
| 0.0211        | 3.33  | 143  | 0.4892          |
| 0.0236        | 3.58  | 154  | 0.5016          |
| 0.009         | 3.84  | 165  | 0.5036          |

### Framework versions

- Transformers 4.39.0.dev0
- Pytorch 2.1.2+cu121
- Datasets 2.18.0
- Tokenizers 0.15.0
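
## Inference example

A minimal inference sketch, assuming the ChatML chat template inherited from the base Nous-Hermes-2 tokenizer and loading from the config's `output_dir`; the local path, prompt, and generation settings are illustrative, not part of the original card.

```python
# Sketch: load the fine-tuned checkpoint from the axolotl config's output_dir
# and generate with the ChatML format used by the Nous-Hermes-2 base model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/workspace/out-mistral-2B"  # output_dir from the config above

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # training used bf16
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain gradient checkpointing in one paragraph."},
]
# apply_chat_template relies on the ChatML template inherited from the base tokenizer
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>"),
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```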