---
language:
- en
- ko
license: apache-2.0
library_name: transformers
tags:
- translation
- t5
- en-to-ko
datasets:
- aihub-koen-translation-integrated-base-10m
metrics:
- bleu
model-index:
- name: traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko
  results:
  - task:
      name: Translation
      type: translation
    dataset:
      name: AIHub KO-EN Translation Integrated Base (10M)
      type: aihub-koen-translation-integrated-base-10m
    metrics:
    - name: BLEU
      type: bleu
      value: 18.838066
      epoch: 2
    - name: BLEU
      type: bleu
      value: 18.006119
      epoch: 1
---
# Model Description
This model, named **traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko**, is a machine translation model that translates English to Korean. It is fine-tuned from the [KETI-AIR/ke-t5-base](https://huggingface.co/KETI-AIR/ke-t5-base) model using the [aihub-koen-translation-integrated-base-10m](https://huggingface.co/datasets/traintogpb/aihub-koen-translation-integrated-base-10m) dataset.
## Model Architecture
The model uses the ke-t5-base architecture, which is based on the T5 (Text-to-Text Transfer Transformer) model.
## Training Data
The model was trained on the aihub-koen-translation-integrated-base-10m dataset, which is designed for English-to-Korean translation tasks.
## Training Procedure
### Training Parameters
The model was trained with the following parameters:
- Learning Rate: 0.0005
- Weight Decay: 0.01
- Batch Size: 64 (training), 128 (evaluation)
- Number of Epochs: 2
- Save Steps: 500
- Max Save Checkpoints: 2
- Evaluation Strategy: At the end of each epoch
- Logging Strategy: No logging
- Use of FP16: No
- Gradient Accumulation Steps: 2
- Reporting: None
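
As a rough sketch, the parameters above can be written as keyword arguments named after the fields of `Seq2SeqTrainingArguments` in Transformers. This mapping is an assumption: the card does not show the original training script, and it does not say whether the batch size is per device. With a single GPU the two readings coincide, so the sketch also works out the effective batch size per optimizer step.

```python
# Hyperparameters from the list above, keyed by the matching
# Seq2SeqTrainingArguments field names (assumed mapping, not the
# author's actual training script).
training_kwargs = {
    "learning_rate": 5e-4,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 64,
    "per_device_eval_batch_size": 128,
    "num_train_epochs": 2,
    "save_steps": 500,
    "save_total_limit": 2,
    # Named "evaluation_strategy" in older Transformers releases.
    "eval_strategy": "epoch",
    "logging_strategy": "no",
    "fp16": False,
    "gradient_accumulation_steps": 2,
    "report_to": "none",
}

num_gpus = 1  # the card reports a single-GPU system

# Samples consumed per optimizer step: batch size x accumulation steps x GPUs.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * num_gpus
)
print(effective_batch_size)  # 128
```

Under these assumptions, the dictionary could be unpacked into `Seq2SeqTrainingArguments(output_dir=..., **training_kwargs)`, where the output directory is a placeholder that the card does not specify.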
### Hardware
The training was performed on a single GPU system with an NVIDIA A100 (40GB).
## Performance
The model achieved the following BLEU scores, evaluated at the end of each training epoch:
- Epoch 1: 18.006119
- Epoch 2: 18.838066
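
For a quick sense of scale, the change between the two epochs can be computed directly from the reported scores. This is plain arithmetic on the numbers above, not an additional evaluation.

```python
# BLEU scores reported in this card, keyed by epoch.
bleu_by_epoch = {1: 18.006119, 2: 18.838066}

# Absolute and relative gain from epoch 1 to epoch 2.
absolute_gain = bleu_by_epoch[2] - bleu_by_epoch[1]
relative_gain_pct = 100 * absolute_gain / bleu_by_epoch[1]

print(f"absolute gain: {absolute_gain:.6f} BLEU")
print(f"relative gain: {relative_gain_pct:.2f}%")
```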
## Usage
This model is suitable for applications that translate English text into Korean. The following example shows how to use it with Hugging Face Transformers:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Model name as given in this card; a Hub repo ID may also need the owner's namespace prefix.
model_name = "traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Tokenize the English source sentence into PyTorch tensors.
inputs = tokenizer("This is a sample text.", return_tensors="pt")
# Generate the Korean translation; raise max_new_tokens for longer inputs.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```