Eiki committed on
Commit
ff43b17
1 Parent(s): ec23d3f

Create README.md

Files changed (1)
  1. README.md +62 -0
README.md ADDED
@@ -0,0 +1,62 @@
+ ---
+ language: ja
+ widget:
+ - text: X が 部屋 で ゲーム するxEffect
+ ---
+
+ # COMET-GPT2 ja
+
+ A GPT-2 model fine-tuned on the large version of [ATOMIC ja](https://github.com/nlp-waseda/comet-atomic-ja) with a causal language modeling (CLM) objective.
+ The original and large versions of ATOMIC ja were introduced in [this paper](https://www.anlp.jp/proceedings/annual_meeting/2023/pdf_dir/B2-5.pdf) and [this paper](https://www.anlp.jp/proceedings/annual_meeting/2023/pdf_dir/B9-1.pdf), respectively.
+
+
+ ### How to use
+
+ You can use this model directly with a pipeline for text generation.
+ Because generation uses sampling, set a seed for reproducibility:
+
+ ```python
+ >>> from transformers import pipeline, set_seed
+ >>> generator = pipeline('text-generation', model='nlp-waseda/comet-gpt2-small-japanese')
+ >>> set_seed(42)
+ >>> generator('X が 副業 を 始めるxEffect', max_length=30, num_return_sequences=5, do_sample=True)
+
+ [{'generated_text': 'X が 副業 を 始めるxEffect X が 収入 を 得る'},
+ {'generated_text': 'X が 副業 を 始めるxEffect X が 時間 を 失う'},
+ {'generated_text': 'X が 副業 を 始めるxEffect X が 儲かる'},
+ {'generated_text': 'X が 副業 を 始めるxEffect X が 稼ぐ'},
+ {'generated_text': 'X が 副業 を 始めるxEffect X が 稼げる ように なる'}]
+ ```
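Each `generated_text` repeats the prompt, so downstream code usually wants only the part after the relation label. The sketch below is one way to separate the two, assuming the output keeps the `HEAD<relation> TAIL` layout seen in the example above. The helper name `split_generation` is hypothetical and not part of the model or of `transformers`.

```python
from typing import Tuple


def split_generation(generated_text: str, relation: str = "xEffect") -> Tuple[str, str]:
    """Split 'HEAD<relation> TAIL' into (head, tail).

    Assumption: the relation label appears exactly as it was written in the
    prompt, so the first occurrence marks the boundary between head and tail.
    """
    head, sep, tail = generated_text.partition(relation)
    if not sep:
        raise ValueError(f"relation {relation!r} not found in {generated_text!r}")
    return head.strip(), tail.strip()


# Two entries copied from the example output above.
outputs = [
    {"generated_text": "X が 副業 を 始めるxEffect X が 収入 を 得る"},
    {"generated_text": "X が 副業 を 始めるxEffect X が 時間 を 失う"},
]
pairs = [split_generation(o["generated_text"]) for o in outputs]
```

Here `pairs` holds one `(head, tail)` tuple per generated sequence, which makes it easy to deduplicate or filter the inferred events.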
+
+ ### Preprocessing
+
+ Texts are first segmented into words with Juman++ and then tokenized into subwords with SentencePiece.
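As a minimal sketch of what a prompt in this format looks like, the snippet below joins an already segmented word list and appends the relation label. It assumes, based only on the example prompt in this README, that words are separated by single spaces and that the label follows the last word with no space. The helper `build_prompt` is hypothetical, and the word list is segmented by hand here as a stand-in for Juman++ output.

```python
from typing import Sequence


def build_prompt(words: Sequence[str], relation: str = "xEffect") -> str:
    """Join pre-segmented words with spaces and append the relation label.

    Assumption: the label is attached directly to the last word, mirroring
    the prompt 'X が 副業 を 始めるxEffect' shown in this README.
    """
    if not words:
        raise ValueError("words must not be empty")
    return " ".join(words) + relation


# Hand-segmented stand-in for Juman++ output (not produced by Juman++ here).
prompt = build_prompt(["X", "が", "副業", "を", "始める"])
```

The resulting `prompt` string can be passed to the text-generation pipeline; SentencePiece tokenization is then applied by the model's tokenizer.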
+
+ ## Evaluation results
+
+ The model achieves the following results:
+
+ | BLEU | BERTScore |
+ |:-----:|:---------:|
+ | - | - |
+
+ ### BibTeX entry and citation info
+
+ ```bibtex
+ @InProceedings{ide_nlp2023_event,
+   author = "井手竜也 and 村田栄樹 and 堀尾海斗 and 河原大輔 and 山崎天 and 李聖哲 and 新里顕大 and 佐藤敏紀",
+   title = "人間と言語モデルに対するプロンプトを用いたゼロからのイベント常識知識グラフ構築",
+   booktitle = "言語処理学会第29回年次大会",
+   year = "2023",
+   url = "https://www.anlp.jp/proceedings/annual_meeting/2023/pdf_dir/B2-5.pdf",
+   note = "in Japanese"
+ }
+ @InProceedings{murata_nlp2023,
+   author = "村田栄樹 and 井手竜也 and 榮田亮真 and 河原大輔 and 山崎天 and 李聖哲 and 新里顕大 and 佐藤敏紀",
+   title = "大規模言語モデルによって構築された常識知識グラフの拡大と低コストフィルタリング",
+   booktitle = "言語処理学会第29回年次大会",
+   year = "2023",
+   url = "https://www.anlp.jp/proceedings/annual_meeting/2023/pdf_dir/B9-1.pdf",
+   note = "in Japanese"
+ }
+ ```