JesseStover committed on
Commit
8f83403
•
1 Parent(s): e5af8b5

Create README.md

Files changed (1)
  1. README.md +40 -0
README.md ADDED
@@ -0,0 +1,40 @@
---
language:
- ko
---

The L2AI-dictionary model is fine-tuned for multiple choice: given a word in a sentence, it selects the best-fitting dictionary definition from a list of candidates. Example usage:

```python
import torch
from transformers import AutoModelForMultipleChoice, AutoTokenizer

model_name = "JesseStover/L2AI-dictionary-klue-bert-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMultipleChoice.from_pretrained(model_name)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Prompt, in English: 'The definition of "강아지" in "강아지는 뽀송뽀송하다." is '
# (the sentence means "The puppy is fluffy.")
prompt = "\"강아지는 뽀송뽀송하다.\"에 있는 \"강아지\"의 정의는 "
candidates = [
    # "(noun) the young of a dog"
    "\"(명사) 개의 새끼\"예요.",
    # "(noun) a word parents or grandparents use to affectionately address their children or grandchildren"
    "\"(명사) 부모나 할아버지, 할머니가 자식이나 손주를 귀여워하면서 부르는 말\"이예요."
]

# Encode one (prompt, candidate) pair per candidate: shape (num_choices, seq_len).
inputs = tokenizer(
    [[prompt, candidate] for candidate in candidates],
    return_tensors="pt",
    padding=True
)

# Index of the correct candidate; passing labels also makes the model return a loss.
labels = torch.tensor(0).unsqueeze(0).to(device)

with torch.no_grad():
    # Add a batch dimension, (1, num_choices, seq_len), on the same device as the model.
    outputs = model(
        **{k: v.unsqueeze(0).to(device) for k, v in inputs.items()}, labels=labels
    )

# Probability assigned to each candidate index.
print({i: float(x) for i, x in enumerate(outputs.logits.softmax(1)[0])})
```
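
The prompt and candidate strings in the example follow a visible pattern: the sentence and target word are wrapped in a fixed Korean template, and each candidate ends in 예요 when the definition ends in a vowel and in 이예요 when it ends in a final consonant. The sketch below reproduces that pattern and picks the highest-scoring candidate from the printed probabilities. The helper names (`build_prompt`, `format_candidate`, `best_candidate`) are hypothetical, and the idea that this template generalizes beyond the single example is an assumption, not something stated here.

```python
def ends_in_final_consonant(text: str) -> bool:
    """Return True if the last character is a Hangul syllable with a final consonant."""
    if not text:
        return False
    code = ord(text[-1])
    # Precomposed Hangul syllables occupy U+AC00..U+D7A3. Each run of 28 code points
    # shares an initial consonant and vowel; offset 0 in the run means no final consonant.
    if 0xAC00 <= code <= 0xD7A3:
        return (code - 0xAC00) % 28 != 0
    return False


def build_prompt(sentence: str, word: str) -> str:
    # Assumed template, copied from the example: '"<sentence>"에 있는 "<word>"의 정의는 '
    return f'"{sentence}"에 있는 "{word}"의 정의는 '


def format_candidate(part_of_speech: str, definition: str) -> str:
    # Assumed rule, inferred from the two example candidates only.
    ending = "이예요." if ends_in_final_consonant(definition) else "예요."
    return f'"({part_of_speech}) {definition}"{ending}'


def best_candidate(probabilities: dict) -> int:
    # Index of the candidate with the highest probability.
    return max(probabilities, key=probabilities.get)


prompt = build_prompt("강아지는 뽀송뽀송하다.", "강아지")
candidates = [
    format_candidate("명사", "개의 새끼"),
    format_candidate("명사", "부모나 할아버지, 할머니가 자식이나 손주를 귀여워하면서 부르는 말"),
]
```

Strings built this way can be passed to the tokenizer exactly as in the example, and `best_candidate` can be applied to the dictionary that the example prints.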

The training data was obtained under the Creative Commons [CC BY-SA 2.0 KR](https://creativecommons.org/licenses/by-sa/2.0/kr/) license from the National Institute of Korean Language's [Basic Korean Dictionary](https://krdict.korean.go.kr) and [Standard Korean Dictionary](https://stdict.korean.go.kr/).