
RoBERTa-base Korean

λͺ¨λΈ μ„€λͺ…

This RoBERTa model was pretrained at the syllable level on a variety of Korean text datasets, using a custom-built Korean syllable-level vocabulary.
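
As a rough illustration of what syllable-level tokenization means, each Hangul syllable block is treated as one unit; the tokenizer files shipped with this model define the actual behavior, including special tokens and whitespace handling:

# Rough illustration only: syllable-level tokenization treats each
# Hangul syllable block as one unit. The real segmentation is defined
# by this model's tokenizer files, not by this snippet.
text = "ν•œκ΅­μ–΄ ν…μŠ€νŠΈ"
syllables = list(text.replace(" ", ""))
print(syllables)  # ['ν•œ', 'κ΅­', 'μ–΄', 'ν…', 'μŠ€', '트']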

Architecture

  • λͺ¨λΈ μœ ν˜•: RoBERTa
  • μ•„ν‚€ν…μ²˜: RobertaForMaskedLM
  • λͺ¨λΈ 크기: 256 hidden size, 8 hidden layers, 8 attention heads
  • max_position_embeddings: 514
  • intermediate_size: 2048
  • vocab_size: 1428
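
For reference, the values above correspond roughly to the following Hugging Face RobertaConfig. This is a sketch; the config.json shipped with the model is authoritative:

from transformers import RobertaConfig, RobertaForMaskedLM

# Sketch of the configuration implied by the values above;
# the repo's config.json is authoritative.
config = RobertaConfig(
    vocab_size=1428,
    hidden_size=256,
    num_hidden_layers=8,
    num_attention_heads=8,
    intermediate_size=2048,
    max_position_embeddings=514,
)
model = RobertaForMaskedLM(config)
print(model.num_parameters())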

Training Data

μ‚¬μš©λœ 데이터셋은 λ‹€μŒκ³Ό κ°™μŠ΅λ‹ˆλ‹€:

  • λͺ¨λ‘μ˜λ§λ­‰μΉ˜: μ±„νŒ…, κ²Œμ‹œνŒ, μΌμƒλŒ€ν™”, λ‰΄μŠ€, λ°©μ†‘λŒ€λ³Έ, μ±… λ“±
  • AIHUB: SNS, 유튜브 λŒ“κΈ€, λ„μ„œ λ¬Έμž₯
  • 기타: λ‚˜λ¬΄μœ„ν‚€, ν•œκ΅­μ–΄ μœ„ν‚€ν”Όλ””μ•„ 총 ν•©μ‚°λœ λ°μ΄ν„°λŠ” μ•½ 11GB μž…λ‹ˆλ‹€.

ν•™μŠ΅ 상세

  • BATCH_SIZE: 112 (per GPU)
  • ACCUMULATE: 36
  • MAX_STEPS: 12,500
  • Train Steps Γ— Batch Size: 100M (12,500 steps Γ— 112 per-GPU batch Γ— 36 accumulation steps Γ— 2 GPUs β‰ˆ 100.8M sequences)
  • WARMUP_STEPS: 2,400
  • μ΅œμ ν™”: AdamW, LR 1e-3, BETA (0.9, 0.98), eps 1e-6
  • ν•™μŠ΅λ₯  감쇠: linear
  • μ‚¬μš©λœ ν•˜λ“œμ›¨μ–΄: 2x RTX 8000 GPU

[Figure: evaluation loss graph]

[Figure: evaluation accuracy graph]

How to Use

from transformers import AutoModel, AutoTokenizer

# λͺ¨λΈκ³Ό ν† ν¬λ‚˜μ΄μ € 뢈러였기
model = AutoModel.from_pretrained("your_model_name")
tokenizer = AutoTokenizer.from_pretrained("your_tokenizer_name")

# ν…μŠ€νŠΈλ₯Ό ν† ν°μœΌλ‘œ λ³€ν™˜ν•˜κ³  예츑 μˆ˜ν–‰
inputs = tokenizer("여기에 ν•œκ΅­μ–΄ ν…μŠ€νŠΈ μž…λ ₯", return_tensors="pt")
outputs = model(**inputs)
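
Since the checkpoint is a RobertaForMaskedLM, it can also be queried through the fill-mask pipeline. A minimal sketch, reusing the placeholder model name above and assuming the tokenizer uses RoBERTa's default <mask> token:

from transformers import pipeline

# Sketch: masked-token prediction. "your_model_name" is the same
# placeholder as above; <mask> assumes RoBERTa's default mask token.
fill = pipeline("fill-mask", model="your_model_name")
# With a syllable-level vocab, each <mask> is predicted as one syllable.
print(fill("였늘 날씨가 μ’‹<mask>."))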