Mists-7B-v01-simple-projector-trained

This model is a fine-tuned version of HachiML/Mists-7B-v01-not-trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0152
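
Because this repository contains custom modeling code (it is not yet supported by the serverless Inference API), it has to be loaded locally with `trust_remote_code=True`. The sketch below is a hedged example of how that loading might look; the `AutoModel`/`AutoTokenizer` entry points and the BF16 dtype choice are assumptions based on the repository metadata, so check the repo files for the intended classes.

```python
# Hedged loading sketch, not an official usage example for this checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

repo_id = "HachiML/Mists-7B-v01-simple-projector-trained"

# The repo ships custom code, so trust_remote_code=True is required.
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # checkpoint tensors are stored in BF16/F32
    trust_remote_code=True,
)
```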

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.002
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
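
As a rough reconstruction, the hyperparameters listed above correspond to a `transformers` `TrainingArguments` configuration like the one below. This is a hedged sketch of equivalent settings, not the actual training script; in particular, `per_device_train_batch_size=32` assumes a single device with no gradient accumulation, and the `output_dir` name is made up for illustration.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mists-7b-v01-simple-projector-trained",  # hypothetical name
    learning_rate=2e-3,
    per_device_train_batch_size=32,  # assumes a single device, no accumulation
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.05,
    num_train_epochs=1,
)
```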

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.3727        | 0.0444 | 100  | 0.0301          |
| 0.0274        | 0.0888 | 200  | 0.0243          |
| 0.0615        | 0.1332 | 300  | 0.0382          |
| 0.0367        | 0.1776 | 400  | 0.0325          |
| 0.0327        | 0.2220 | 500  | 0.0304          |
| 0.0304        | 0.2664 | 600  | 0.0243          |
| 0.0242        | 0.3108 | 700  | 0.0216          |
| 0.0236        | 0.3552 | 800  | 0.0214          |
| 0.0226        | 0.3996 | 900  | 0.0188          |
| 0.0206        | 0.4440 | 1000 | 0.0181          |
| 0.0197        | 0.4885 | 1100 | 0.0188          |
| 0.0192        | 0.5329 | 1200 | 0.0175          |
| 0.0190        | 0.5773 | 1300 | 0.0171          |
| 0.0180        | 0.6217 | 1400 | 0.0166          |
| 0.0173        | 0.6661 | 1500 | 0.0169          |
| 0.0170        | 0.7105 | 1600 | 0.0163          |
| 0.0172        | 0.7549 | 1700 | 0.0164          |
| 0.0186        | 0.7993 | 1800 | 0.0161          |
| 0.0168        | 0.8437 | 1900 | 0.0167          |
| 0.0161        | 0.8881 | 2000 | 0.0163          |
| 0.0157        | 0.9325 | 2100 | 0.0156          |
| 0.0160        | 0.9769 | 2200 | 0.0152          |

Framework versions

  • Transformers 4.42.3
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1