---
tags:
- generated_from_trainer
model-index:
- name: seq2seq_huggingface_mix_results
  results: []
---

# seq2seq_huggingface_mix_results

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 7.0175
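
To put the loss in more familiar terms, it can be converted to perplexity. The sketch below assumes the reported value is the mean per-token cross-entropy in nats, which is common for sequence-to-sequence training but is not confirmed by this card.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 7.0175

# Assumption: the loss is a mean per-token cross-entropy measured in nats,
# in which case perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.0f}")
```

Under that assumption the perplexity is above 1,000, which is high for most text generation tasks, so generated output should be treated with caution.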

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 12
- eval_batch_size: 12
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 48
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 3
- mixed_precision_training: Native AMP
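
As a rough illustration of how these values interact, the sketch below recomputes the total train batch size and the learning rate at a few steps under a linear schedule with warmup. The helper `linear_lr`, the single-device setting `NUM_DEVICES = 1`, and `TOTAL_STEPS = 624` are assumptions made for this example; the step total is estimated from the Epoch and Step columns of the training results, not read from the training logs.

```python
# Values taken from the hyperparameter list above.
BASE_LR = 5e-05
TRAIN_BATCH_SIZE = 12
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 500

# Assumptions (not stated in the card).
NUM_DEVICES = 1    # a single device is consistent with 12 * 4 = 48
TOTAL_STEPS = 624  # estimated: about 208 optimizer steps per epoch, 3 epochs

# Number of examples contributing to each optimizer update.
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Linear warmup from 0 to BASE_LR, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


print("effective batch size:", effective_batch_size)
for step in (10, 250, 500, 620):
    print(f"step {step:>3}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate reaches its peak only at step 500, so warmup covers most of the run and the decay phase lasts little more than 100 steps.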

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 10.5072       | 0.0480 | 10   | 10.4491         |
| 10.3574       | 0.0959 | 20   | 10.1991         |
| 10.0831       | 0.1439 | 30   | 9.8790          |
| 9.7946        | 0.1918 | 40   | 9.5780          |
| 9.5118        | 0.2398 | 50   | 9.3344          |
| 9.3333        | 0.2878 | 60   | 9.1722          |
| 9.1888        | 0.3357 | 70   | 9.0610          |
| 9.0913        | 0.3837 | 80   | 8.9742          |
| 9.0007        | 0.4317 | 90   | 8.9005          |
| 8.9134        | 0.4796 | 100  | 8.8328          |
| 8.8583        | 0.5276 | 110  | 8.7615          |
| 8.7722        | 0.5755 | 120  | 8.6873          |
| 8.7092        | 0.6235 | 130  | 8.6137          |
| 8.6223        | 0.6715 | 140  | 8.5340          |
| 8.5312        | 0.7194 | 150  | 8.4538          |
| 8.4582        | 0.7674 | 160  | 8.3681          |
| 8.3748        | 0.8153 | 170  | 8.2801          |
| 8.2637        | 0.8633 | 180  | 8.1936          |
| 8.1704        | 0.9113 | 190  | 8.1001          |
| 8.0697        | 0.9592 | 200  | 8.0079          |
| 7.9792        | 1.0072 | 210  | 7.9126          |
| 7.9000        | 1.0552 | 220  | 7.8175          |
| 7.8134        | 1.1031 | 230  | 7.7236          |
| 7.7153        | 1.1511 | 240  | 7.6328          |
| 7.6087        | 1.1990 | 250  | 7.5477          |
| 7.5328        | 1.2470 | 260  | 7.4634          |
| 7.4347        | 1.2950 | 270  | 7.3862          |
| 7.3531        | 1.3429 | 280  | 7.3179          |
| 7.3059        | 1.3909 | 290  | 7.2513          |
| 7.2403        | 1.4388 | 300  | 7.1955          |
| 7.2128        | 1.4868 | 310  | 7.1506          |
| 7.1508        | 1.5348 | 320  | 7.1105          |
| 7.1104        | 1.5827 | 330  | 7.0835          |
| 7.0670        | 1.6307 | 340  | 7.0655          |
| 7.0594        | 1.6787 | 350  | 7.0558          |
| 7.0591        | 1.7266 | 360  | 7.0411          |
| 7.0129        | 1.7746 | 370  | 7.0381          |
| 7.0107        | 1.8225 | 380  | 7.0344          |
| 7.0549        | 1.8705 | 390  | 7.0268          |
| 7.0358        | 1.9185 | 400  | 7.0249          |
| 7.0395        | 1.9664 | 410  | 7.0242          |
| 7.0105        | 2.0144 | 420  | 7.0215          |
| 7.0113        | 2.0624 | 430  | 7.0259          |
| 6.9985        | 2.1103 | 440  | 7.0213          |
| 7.0218        | 2.1583 | 450  | 7.0218          |
| 6.9735        | 2.2062 | 460  | 7.0275          |
| 7.0132        | 2.2542 | 470  | 7.0254          |
| 7.0241        | 2.3022 | 480  | 7.0219          |
| 7.0127        | 2.3501 | 490  | 7.0238          |
| 6.9644        | 2.3981 | 500  | 7.0249          |
| 7.0103        | 2.4460 | 510  | 7.0259          |
| 7.0060        | 2.4940 | 520  | 7.0266          |
| 6.9882        | 2.5420 | 530  | 7.0235          |
| 7.0016        | 2.5899 | 540  | 7.0235          |
| 7.0020        | 2.6379 | 550  | 7.0217          |
| 6.9782        | 2.6859 | 560  | 7.0196          |
| 6.9833        | 2.7338 | 570  | 7.0198          |
| 6.9967        | 2.7818 | 580  | 7.0202          |
| 6.9644        | 2.8297 | 590  | 7.0196          |
| 6.9825        | 2.8777 | 600  | 7.0199          |
| 7.0097        | 2.9257 | 610  | 7.0178          |
| 6.9909        | 2.9736 | 620  | 7.0175          |
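
The validation loss drops steadily during the first half of training and then flattens near 7.02. One simple way to locate that plateau is sketched below, using a handful of rows copied from the table. The cutoff `THRESHOLD = 0.01` is a hypothetical choice for illustration, not something used during training.

```python
# (step -> validation loss) pairs copied from selected rows of the table above.
val_loss = {
    100: 8.8328,
    200: 8.0079,
    300: 7.1955,
    400: 7.0249,
    500: 7.0249,
    600: 7.0199,
    620: 7.0175,
}

# Hypothetical cutoff: an interval improving by less than this counts as flat.
THRESHOLD = 0.01

steps = sorted(val_loss)
plateau_start = None
for prev, curr in zip(steps, steps[1:]):
    # Positive values mean the validation loss went down over the interval.
    improvement = val_loss[prev] - val_loss[curr]
    print(f"{prev:>3} -> {curr:>3}: improvement {improvement:+.4f}")
    if plateau_start is None and improvement < THRESHOLD:
        plateau_start = prev

print("plateau starts near step", plateau_start)
```

With these sampled rows the first flat interval begins at step 400, which suggests that the final epoch contributed very little to the evaluation loss.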


### Framework versions

- Transformers 4.40.2
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1