---
license: apache-2.0
language:
- ko
pipeline_tag: image-to-text
tags:
- trocr
- vision-encoder-decoder
---
# trocr-small-korean
## Model Details
TrOCR is an encoder-decoder model made up of an image Transformer encoder and a text Transformer decoder.
The image encoder was initialized from DeiT weights, and the text decoder was initialized from RoBERTa weights that we pretrained ourselves.
The model was trained on Cloud TPUs provided through Google's TPU Research Cloud (TRC).
## How to Get Started with the Model
```python
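import torch
from transformers import VisionEncoderDecoderModel

# Load the pretrained encoder-decoder checkpoint from the Hugging Face Hub.
model = VisionEncoderDecoderModel.from_pretrained("team-lucid/trocr-small-korean")
# Placeholder input shaped (batch, channels, height, width); use a real image tensor in practice.
pixel_values = torch.rand(1, 3, 384, 384)
# Autoregressively generate output token IDs for the image.
generated_ids = model.generate(pixel_values)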
```
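The snippet above feeds a random tensor, so its output is not meaningful text. As a minimal sketch of how a real image could be turned into an input of the same `(1, 3, 384, 384)` shape, the helper below resizes and normalizes a PIL image. The function name `to_pixel_values` is hypothetical, and the normalization constants (mean 0.5, std 0.5) are an assumption that this model card does not confirm; check the repository's preprocessing configuration before relying on them.

```python
import numpy as np
from PIL import Image


def to_pixel_values(image, size=384, mean=0.5, std=0.5):
    """Convert a PIL image into a (1, 3, size, size) float32 array.

    The mean/std defaults are assumed values, not taken from the model card.
    """
    # Force three channels, then resize to the square input resolution.
    resized = image.convert("RGB").resize((size, size))
    # Scale pixel values to [0, 1], then normalize.
    array = np.asarray(resized, dtype=np.float32) / 255.0
    array = (array - mean) / std
    # Reorder HWC -> CHW and add the batch dimension.
    batched = array.transpose(2, 0, 1)[np.newaxis, ...]
    return np.ascontiguousarray(batched, dtype=np.float32)
```

The result can be wrapped with `torch.from_numpy(...)` and passed to `model.generate` in place of the random tensor. Turning `generated_ids` back into text additionally requires the tokenizer that matches the decoder, which this card does not describe.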
## Training Details
### Training Data
The model was trained on 6 million images synthesized with [synthtiger](https://github.com/clovaai/synthtiger).
### Training Hyperparameters
| Hyperparameter | Small |
|:--------------------|--------:|
| Warmup Steps | 4,000 |
| Learning Rate | 1e-4 |
| Batch Size | 512 |
| Weight Decay | 0.01 |
| Max Steps | 500,000 |
| Learning Rate Decay | 0.1 |
| Adam \\(\beta_1\\) | 0.9 |
| Adam \\(\beta_2\\) | 0.98 |
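
The table does not say which schedule shape was used or what "Learning Rate Decay 0.1" means exactly. The sketch below shows one plausible reading, stated as an assumption: linear warmup to the peak rate over 4,000 steps, followed by linear decay to 0.1 times the peak at step 500,000. The function name `learning_rate` and the parameter `final_ratio` are hypothetical.

```python
def learning_rate(step, peak_lr=1e-4, warmup_steps=4_000,
                  max_steps=500_000, final_ratio=0.1):
    """Learning rate at a given step under an assumed warmup + linear decay schedule."""
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase completed, clamped to 1.0 after max_steps.
    progress = min(1.0, (step - warmup_steps) / (max_steps - warmup_steps))
    # Assumption: decay linearly from peak_lr down to final_ratio * peak_lr.
    return peak_lr * (1.0 - (1.0 - final_ratio) * progress)
```

Under this reading, the rate peaks at 1e-4 when warmup ends and finishes training at 1e-5. If the actual run used a different shape (for example cosine or inverse square root decay), only the last line would change.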