julien-c (HF staff) committed on
Commit c161f38 • 1 Parent(s): 1bf286c

Migrate model card from transformers-repo

Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/mrm8488/GPT-2-finetuned-common_gen/README.md

Files changed (1)
  1. README.md +63 -0
README.md ADDED
---
language: en
datasets:
- common_gen
widget:
- text: "<|endoftext|> apple, tree, pick:"
---

# GPT-2 fine-tuned on CommonGen

[GPT-2](https://huggingface.co/gpt2) fine-tuned on [CommonGen](https://inklab.usc.edu/CommonGen/index.html) for *Generative Commonsense Reasoning*.

## Details of GPT-2

GPT-2 is a transformer model pretrained on a very large corpus of English data in a self-supervised fashion. This means it was pretrained on raw text only, with no humans labelling it in any way (which is why it can use lots of publicly available data), using an automatic process to generate inputs and labels from the text. More precisely, it was trained to guess the next word in sentences.

Inputs are sequences of continuous text of a certain length, and the targets are the same sequences shifted one token (word or piece of a word) to the right. The model internally uses a masking mechanism to make sure the predictions for token `i` only use the inputs from `1` to `i` and not the future tokens.
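
As a minimal sketch of that objective (assuming `torch` and the `transformers` library are installed), passing the input ids as the labels is enough: the model shifts them one position internally and applies the causal mask when computing the next-token loss.

```python
# Minimal sketch of the causal language-modeling objective with the base GPT-2 model.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("A kid dances in the room.", return_tensors="pt")

# Labels are the input ids themselves; the model shifts them right by one token
# and masks future positions, so token i is predicted only from tokens 1..i.
outputs = model(**inputs, labels=inputs["input_ids"])
print(outputs.loss)  # average cross-entropy of the next-token predictions
```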

This way, the model learns an inner representation of the English language that can then be used to extract features useful for downstream tasks. The model is, however, best at what it was pretrained for: generating text from a prompt.

## Details of the dataset 📚

CommonGen is a constrained text generation task, associated with a benchmark dataset, to explicitly test machines for the ability of generative commonsense reasoning. Given a set of common concepts, the task is to generate a coherent sentence describing an everyday scenario using these concepts.

CommonGen is challenging because it inherently requires 1) relational reasoning using background commonsense knowledge, and 2) compositional generalization ability to work on unseen concept combinations. The dataset, constructed through a combination of crowd-sourcing from AMT and existing caption corpora, consists of 30k concept-sets and 50k sentences in total.

| Dataset    | Split | # samples |
| ---------- | ----- | --------- |
| common_gen | train | 67389     |
| common_gen | valid | 4018      |
| common_gen | test  | 1497      |
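
For a quick look at what a CommonGen example contains, here is a small sketch using the `datasets` library; the prompt construction mirrors the widget format above and is an assumption about how concept-sets map to this model's input, not a documented recipe.

```python
# Sketch: inspect CommonGen and build a prompt in the style this model expects.
from datasets import load_dataset

common_gen = load_dataset("common_gen", split="train")
sample = common_gen[0]
print(sample)  # a concept-set ("concepts") paired with a reference sentence ("target")

# Assumed prompt format, mirroring the widget example "<|endoftext|> apple, tree, pick:"
prompt = "<|endoftext|> " + ", ".join(sample["concepts"]) + ":"
print(prompt)
```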

## Model fine-tuning 🏋️‍♂️

You can find the fine-tuning script [here](https://github.com/huggingface/transformers/tree/master/examples/language-modeling).
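
For orientation, a condensed sketch of the same idea with the `Trainer` API is shown below; the prompt format, hyperparameters, and sequence length are illustrative assumptions rather than the author's exact recipe.

```python
# Illustrative sketch of causal-LM fine-tuning on CommonGen with the Trainer API.
# Prompt format and hyperparameters are assumptions, not the author's exact settings.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

def format_and_tokenize(sample):
    # Concatenate the concept-set prompt with the reference sentence.
    text = "<|endoftext|> " + ", ".join(sample["concepts"]) + ": " + sample["target"]
    return tokenizer(text, truncation=True, max_length=64)

dataset = load_dataset("common_gen")
tokenized = dataset.map(format_and_tokenize, remove_columns=dataset["train"].column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-common_gen", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```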

## Model in Action 🚀

```bash
python ./transformers/examples/text-generation/run_generation.py \
    --model_type=gpt2 \
    --model_name_or_path="mrm8488/GPT-2-finetuned-common_gen" \
    --num_return_sequences 1 \
    --prompt "<|endoftext|> kid, room, dance:" \
    --stop_token "."
```
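
The same generation can also be run from Python with the `pipeline` API; the sampling parameters below are illustrative, not the settings used by the author.

```python
# Sketch: text generation with the fine-tuned checkpoint via the pipeline API.
from transformers import pipeline

generator = pipeline("text-generation", model="mrm8488/GPT-2-finetuned-common_gen")

prompt = "<|endoftext|> kid, room, dance:"
outputs = generator(prompt, max_length=32, do_sample=True, num_return_sequences=1)
print(outputs[0]["generated_text"])
```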

> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488) | [LinkedIn](https://www.linkedin.com/in/manuel-romero-cs/)

> Made with <span style="color: #e25555;">&hearts;</span> in Spain