---
license: apache-2.0
base_model: TheBloke/Mistral-7B-Instruct-v0.2-GPTQ
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: v3
  results: []
datasets:
- hieunguyenminh/roleplay
---
# v3

This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on the [hieunguyenminh/roleplay](https://huggingface.co/datasets/hieunguyenminh/roleplay) dataset.
## Model description

This model can adapt to a wide range of characters and generates responses personalized to the character it is asked to play.
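As a fine-tune of Mistral-7B-Instruct, the model presumably expects Mistral's `[INST] ... [/INST]` instruct format. A minimal sketch of building a role-play prompt follows; the system-prompt wording and the helper name are illustrative assumptions, not the exact template used in training:

```python
def build_roleplay_prompt(character_description: str, user_message: str) -> str:
    """Wrap a character description and a user turn in Mistral's
    [INST] ... [/INST] instruct format.

    Illustrative only: the exact prompt template used during
    fine-tuning is not documented in this card.
    """
    system = f"You are role-playing the following character: {character_description}"
    return f"<s>[INST] {system}\n\n{user_message} [/INST]"

prompt = build_roleplay_prompt(
    "A stoic medieval knight who speaks formally.",
    "What brings you to this village?",
)
print(prompt)
```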
## Training and evaluation data

The model was trained with supervised fine-tuning (SFT) on the hieunguyenminh/roleplay dataset; training with DPO is planned for a future version.
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- training_steps: 400
- mixed_precision_training: Native AMP
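Assuming the run used `trl`'s SFT workflow on top of `transformers`, the hyperparameters above correspond roughly to a `TrainingArguments` configuration like the sketch below. The field names are standard `TrainingArguments` parameters; the `output_dir` value and the `fp16` mapping for "Native AMP" are assumptions, since the actual training script is not part of this card:

```python
from transformers import TrainingArguments

# Hedged reconstruction of the reported hyperparameters.
# Adam betas=(0.9, 0.999) and epsilon=1e-08 match the
# TrainingArguments defaults, so no explicit optimizer
# arguments are needed here.
training_args = TrainingArguments(
    output_dir="v3",                  # assumed; matches the model name
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="cosine",
    max_steps=400,
    fp16=True,                        # "Native AMP" mixed precision
)
```

A config fragment like this would then be passed to `trl.SFTTrainer` together with the base model and the dataset.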
### Training results

Loss after 400 steps: 0.73
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0