Moreno La Quatra committed on
Commit 94d162b
1 Parent(s): 16fd9d9

Update README.md

Files changed (1)
  1. README.md +13 -6
README.md CHANGED
@@ -2,9 +2,18 @@
  license: apache-2.0
  tags:
  - generated_from_trainer
+ - distilgpt2
+ - text-generation
+ - english
  model-index:
  - name: distilgpt2-fables-demo
    results: []
+ pipeline:
+ - text-generation
+ widget:
+ - text: Once upon a time,
+ - text: There was a time when
+ - text: Long time ago
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -12,23 +21,21 @@ should probably proofread and complete it, then remove this comment. -->

  # distilgpt2-fables-demo

- This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on an unknown dataset.
+ This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on [demelin/understanding_fables](https://huggingface.co/datasets/demelin/understanding_fables) dataset.
  It achieves the following results on the evaluation set:
  - Loss: 3.2165

  ## Model description

- More information needed
+ The model is a demo for the fine-tuning of decoder-only models using `transformers` library.

  ## Intended uses & limitations

- More information needed
+ It can be used mainly for prototyping and educational purposes.

  ## Training and evaluation data

- More information needed
-
- ## Training procedure
+ The [demelin/understanding_fables](https://huggingface.co/datasets/demelin/understanding_fables) dataset has been split into train/test/validation using an 80/10/10 random split (`random_seed = 42`). Google Colab has been used for model fine-tuning.

  ### Training hyperparameters