---
language: ko
license: apache-2.0
---

# team-lucid/t5-v1_1-large-ko

`t5-v1_1-large-ko` is a [Google T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) Version 1.1 model trained on a Korean corpus.

OOV์„ ๋ง‰๊ธฐ ์œ„ํ•ด BBPE๋ฅผ ์‚ฌ์šฉํ•˜์˜€์œผ๋ฉฐ, HyperCLOVA์—์„œ ํ˜•ํƒœ์†Œ ๋ถ„์„์ด ์„ฑ๋Šฅ์„ ๋†’ํžˆ๋Š”๋ฐ ๋„์›€์ด ๋˜๋Š” ๊ฒƒ์„ ๋ณด๊ณ  ํ† ํฌ๋‚˜์ด์ € ํ•™์Šต ๊ณผ์ •์—์„œ MeCab์„ ์ด์šฉํ•ด ํ˜•ํƒœ์†Œ๊ฐ€ ์ด์ƒํ•˜๊ฒŒ ํ† ํฐํ™”๋˜์ง€ ์•Š๋„๋ก ํ•˜์˜€์Šต๋‹ˆ๋‹ค.

This model was trained on Cloud TPUs provided through Google's TPU Research Cloud (TRC).

## Usage
```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained('team-lucid/t5-v1_1-large-ko')
model = T5ForConditionalGeneration.from_pretrained('team-lucid/t5-v1_1-large-ko')
```