Edit model card

Visualize in Weights & Biases

arxiv-assistant-mistral7b

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3332
  • Num Input Tokens Seen: 8960242

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
  • mixed_precision_training: Native AMP

Training results

Training Loss Epoch Step Validation Loss Input Tokens Seen
0.6005 0.1938 1000 0.4064 1827684
0.5877 0.3877 2000 0.3750 3600506
0.4922 0.5815 3000 0.3551 5407592
0.498 0.7753 4000 0.3394 7199648
0.5224 0.9692 5000 0.3332 8960242

Framework versions

  • PEFT 0.11.2.dev0
  • Transformers 4.42.0.dev0
  • Pytorch 2.2.2+cu118
  • Datasets 2.19.2
  • Tokenizers 0.19.1
Downloads last month
2
Unable to determine this model’s pipeline type. Check the docs .

Adapter for