---
license: other
base_model: deepseek-ai/deepseek-coder-6.7b-instruct
tags:
- generated_from_trainer
metrics:
- accuracy
- bleu
- sacrebleu
- rouge
model-index:
- name: deepseek-coder-6.7b-instruct_En__size_52_epochs_10_2024-06-21_06-20-33_3556409
  results: []
---

# deepseek-coder-6.7b-instruct_En__size_52_epochs_10_2024-06-21_06-20-33_3556409

This model is a fine-tuned version of [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.4340
- Accuracy: 0.042
- Chrf: 0.734
- Bleu: 0.608
- Sacrebleu: 0.6
- Rouge1: 0.707
- Rouge2: 0.494
- Rougel: 0.637
- Rougelsum: 0.693
- Meteor: 0.534

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 3407
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 4
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 52
- training_steps: 520

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Chrf  | Bleu  | Sacrebleu | Rouge1 | Rouge2 | Rougel | Rougelsum | Meteor |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-----:|:-----:|:---------:|:------:|:------:|:------:|:---------:|:------:|
| 0.1233        | 4.0   | 52   | 1.1674          | 0.027    | 0.726 | 0.601 | 0.6       | 0.681  | 0.458  | 0.612  | 0.674     | 0.539  |
| 0.5834        | 8.0   | 104  | 1.2639          | 0.032    | 0.708 | 0.57  | 0.6       | 0.686  | 0.458  | 0.617  | 0.679     | 0.483  |
| 0.1938        | 12.0  | 156  | 1.2723          | 0.032    | 0.708 | 0.574 | 0.6       | 0.684  | 0.457  | 0.609  | 0.673     | 0.479  |
| 0.1681        | 16.0  | 208  | 1.2437          | 0.036    | 0.719 | 0.595 | 0.6       | 0.697  | 0.469  | 0.619  | 0.682     | 0.524  |
| 0.176         | 20.0  | 260  | 1.4102          | 0.037    | 0.699 | 0.565 | 0.6       | 0.666  | 0.435  | 0.588  | 0.652     | 0.507  |
| 0.4563        | 24.0  | 312  | 1.3416          | 0.039    | 0.717 | 0.586 | 0.6       | 0.69   | 0.452  | 0.609  | 0.678     | 0.521  |
| 0.114         | 28.0  | 364  | 1.3758          | 0.041    | 0.728 | 0.602 | 0.6       | 0.703  | 0.478  | 0.618  | 0.683     | 0.524  |
| 0.4204        | 32.0  | 416  | 1.4116          | 0.042    | 0.727 | 0.598 | 0.6       | 0.705  | 0.476  | 0.621  | 0.689     | 0.545  |
| 0.1118        | 36.0  | 468  | 1.4229          | 0.042    | 0.734 | 0.607 | 0.6       | 0.709  | 0.497  | 0.64   | 0.694     | 0.528  |
| 0.2482        | 40.0  | 520  | 1.4340          | 0.042    | 0.734 | 0.608 | 0.6       | 0.707  | 0.494  | 0.637  | 0.693     | 0.534  |

### Framework versions

- Transformers 4.37.0
- Pytorch 2.2.1+cu121
- Datasets 2.20.0
- Tokenizers 0.15.2
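
### Reproducing the training configuration (sketch)

The dataset, preprocessing, and trainer wiring are not documented in this card, but the hyperparameters listed above map directly onto `transformers.TrainingArguments`. A minimal sketch, assuming a standard `Trainer` run with evaluation every 52 steps (the cadence visible in the results table); the `output_dir` and the eval/logging intervals are assumptions, not taken from the card:

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed in "Training hyperparameters".
# output_dir, evaluation_strategy, eval_steps and logging_steps are assumptions.
training_args = TrainingArguments(
    output_dir="deepseek-coder-6.7b-instruct_finetune",  # hypothetical
    learning_rate=1e-3,
    per_device_train_batch_size=1,   # 4 GPUs -> total train batch size 4
    per_device_eval_batch_size=1,    # 4 GPUs -> total eval batch size 4
    seed=3407,
    max_steps=520,
    lr_scheduler_type="linear",
    warmup_steps=52,
    adam_beta1=0.9,                  # Adam betas/epsilon as listed in the card
    adam_beta2=0.999,
    adam_epsilon=1e-6,
    evaluation_strategy="steps",
    eval_steps=52,                   # matches the 52-step eval interval in the results table
    logging_steps=52,
)
```
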
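### Inference example (sketch)

No usage example is provided with this card. Below is a minimal inference sketch, assuming the fine-tuned checkpoint is loaded from a local or Hub path (the path shown is hypothetical) and that the tokenizer retains the base deepseek-coder-instruct chat template:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical path; replace with the actual repo id or local checkpoint directory.
model_id = "deepseek-coder-6.7b-instruct_En__size_52_epochs_10_2024-06-21_06-20-33_3556409"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # reduces memory; use float32 if bf16 is unsupported
    device_map="auto",           # requires accelerate; remove for plain single-device loading
)

messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```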