The L2AI-dictionary model is a fine-tuned checkpoint of klue/bert-base for multiple choice, specifically for selecting the best dictionary definition of a given word in a sentence. Below is an example of its usage:
```python
import torch
from transformers import AutoModelForMultipleChoice, AutoTokenizer

model_name = "JesseStover/L2AI-dictionary-klue-bert-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMultipleChoice.from_pretrained(model_name)
model.to(torch.device("cuda" if torch.cuda.is_available() else "cpu"))

# Prompt: 'The definition of "강아지" (puppy) in "강아지는 뽀송뽀송하다." (The puppy is fluffy.) is '
prompts = "\"강아지는 뽀송뽀송하다.\"에 있는 \"강아지\"의 정의는 "

# Candidate definitions: '"(noun) a dog's young"' and '"(noun) an affectionate term
# used by parents or grandparents for their children or grandchildren"'
candidates = [
    "\"(명사) 개의 새끼\"예요.",
    "\"(명사) 부모나 할아버지, 할머니가 자식이나 손주를 귀여워하면서 부르는 말\"이에요.",
]

# Pair the prompt with each candidate; padding aligns the sequences to equal length.
inputs = tokenizer(
    [[prompts, candidate] for candidate in candidates],
    return_tensors="pt",
    padding=True,
)

labels = torch.tensor(0).unsqueeze(0)

with torch.no_grad():
    # unsqueeze(0) adds the batch dimension expected by the multiple-choice head.
    outputs = model(
        **{k: v.unsqueeze(0) for k, v in inputs.items()}, labels=labels
    )

# Softmax over the choice dimension gives a probability for each candidate definition.
print({i: float(x) for i, x in enumerate(outputs.logits.softmax(1)[0])})
```
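To select the model's preferred definition, take the candidate with the highest probability. A minimal sketch in plain Python, using a hypothetical probability dict of the shape printed above:

```python
# Hypothetical softmax output: candidate index -> probability.
# (Values here are illustrative, not actual model output.)
probs = {0: 0.91, 1: 0.09}

# The best definition is the candidate with the highest probability.
best_index = max(probs, key=probs.get)
print(best_index)  # index into the candidates list, here 0
```

The returned index maps directly back into the `candidates` list passed to the tokenizer.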
Training data was procured under Creative Commons CC BY-SA 2.0 KR DEED from the National Institute of Korean Language's Basic Korean Dictionary and Standard Korean Dictionary.