---
{}
---

The L2AI-dictionary model is fine-tuned for multiple choice, specifically for selecting the best dictionary definition of a given word in a sentence. Below is an example usage:

```python
import torch
from transformers import AutoModelForMultipleChoice, AutoTokenizer

model_name = "JesseStover/L2AI-dictionary-klue-bert-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMultipleChoice.from_pretrained(model_name)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Prompt: 'The definition of "강아지" in "강아지는 뽀송뽀송하다." ("The puppy is fluffy.") is '
prompts = "\"강아지는 뽀송뽀송하다.\"에 있는 \"강아지\"의 정의는 "
candidates = [
    # 'It is "(noun) the young of a dog."'
    "\"(명사) 개의 새끼\"예요.",
    # 'It is "(noun) an affectionate term that parents or grandparents use for their children or grandchildren."'
    "\"(명사) 부모나 할아버지, 할머니가 자식이나 손주를 귀여워하면서 부르는 말\"이에요.",
]

# Pair the prompt with each candidate definition for multiple-choice scoring.
inputs = tokenizer(
    [[prompts, candidate] for candidate in candidates],
    return_tensors="pt",
    padding=True,
)

labels = torch.tensor(0).unsqueeze(0).to(device)

# The model expects inputs of shape (batch, num_choices, seq_len), hence the unsqueeze.
with torch.no_grad():
    outputs = model(
        **{k: v.unsqueeze(0).to(device) for k, v in inputs.items()}, labels=labels
    )

# Probability of each candidate definition being correct.
print({i: float(x) for i, x in enumerate(outputs.logits.softmax(1)[0])})
```
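
The printed dictionary maps each candidate's index to its probability, and the index with the highest probability is the model's chosen definition. As a minimal sketch of that final selection step, with made-up logits standing in for `outputs.logits[0]` (no model download required):

```python
import math

# Hypothetical logits for the two candidate definitions above.
logits = [2.3, -1.1]

# Softmax converts logits to probabilities over the candidates.
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

# The highest-probability candidate is the model's chosen definition.
best = max(range(len(probs)), key=lambda i: probs[i])
print(best)  # index of the best definition (here: 0)
```

With these example logits, the first candidate ("the young of a dog") wins, which is the correct sense of 강아지 in the example sentence.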

Training data was procured under Creative Commons [CC BY-SA 2.0 KR DEED](https://creativecommons.org/licenses/by-sa/2.0/kr/) from the National Institute of Korean Language's [Basic Korean Dictionary](https://krdict.korean.go.kr) and [Standard Korean Dictionary](https://stdict.korean.go.kr/).