Sakonii committed
Commit: 5f35de6
Parent: 38ba3d6

Update README.md

Files changed (1):
1. README.md (+16, -8)
README.md CHANGED
@@ -3,9 +3,18 @@ license: apache-2.0
 base_model: Sakonii/distilgpt2-nepali
 tags:
 - generated_from_trainer
+widget:
+- text: 'नेपाली राजनीतिमा युवा पिढीको भूमिका के हो? '
+  example_title: Example 1
+- text: 'नेपालको ग्रामीण र शहरी क्षेत्रमा स्वास्थ्य सेवा कस्तो छ? '
+  example_title: Example 2
+- text: 'नेपाली राजनीतिमा युवा पिढीको भूमिका के हो? '
+  example_title: Example 3
 model-index:
-- name: distilgpt2-nepali-patrakar-qa
+- name: distilgpt2-nepali-qa
   results: []
+language:
+- ne
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -19,18 +28,17 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-More information needed
+Refer to the original [distilgpt2](https://huggingface.co/distilgpt2).
 
 ## Intended uses & limitations
 
-More information needed
-
-## Training and evaluation data
-
-More information needed
+This marginally fine-tuned model can be used for Nepali text generation and, possibly, question answering, and is intended to be further fine-tuned on Nepali-language generative downstream tasks.
+Because the language model was trained on data with texts grouped into blocks of 512 tokens, it handles text sequences of up to 512 tokens.
 
 ## Training procedure
 
+The model is trained with the same configuration as the original [distilgpt2](https://huggingface.co/distilgpt2), but with 512 tokens per instance, 72 instances per batch, and around 34.14K training steps (excluding the pre-training with the CLM objective).
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -58,4 +66,4 @@ The following hyperparameters were used during training:
 - Transformers 4.32.1
 - Pytorch 2.0.0
 - Datasets 2.1.0
-- Tokenizers 0.13.3
+- Tokenizers 0.13.3
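
For readers of the updated card, here is a minimal sketch of the text-generation use described in the new "Intended uses & limitations" paragraph, prompted with the first widget example (roughly, "What is the role of the young generation in Nepali politics?"). The repository id `Sakonii/distilgpt2-nepali-qa` is inferred from the model-index name in this diff and is an assumption, as are the sampling settings.

```python
# Minimal usage sketch; the checkpoint id and generation settings are assumptions,
# not taken from the model card itself.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Sakonii/distilgpt2-nepali-qa",  # assumed repo id, from the model-index name
)

prompt = "नेपाली राजनीतिमा युवा पिढीको भूमिका के हो? "
outputs = generator(
    prompt,
    max_length=128,          # stays well under the 512-token limit noted in the card
    do_sample=True,
    top_k=50,
    top_p=0.95,
    num_return_sequences=1,
)
print(outputs[0]["generated_text"])
```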
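
The training-procedure note mentions texts grouped into blocks of 512 tokens. The sketch below illustrates that grouping step in the style of the standard Transformers causal-LM examples; the corpus file name is hypothetical, and the diff does not show the author's actual preprocessing script.

```python
# Illustration of the block-size-512 grouping described in the training note.
# The corpus file name is hypothetical; the actual training data is not named here.
from datasets import load_dataset
from transformers import AutoTokenizer

BLOCK_SIZE = 512  # "512 tokens per instance" from the training note

tokenizer = AutoTokenizer.from_pretrained("Sakonii/distilgpt2-nepali")

def tokenize(examples):
    return tokenizer(examples["text"])

def group_texts(examples):
    # Concatenate all tokenized sequences, then cut them into fixed 512-token blocks.
    concatenated = {k: sum(examples[k], []) for k in examples.keys()}
    total_length = (len(concatenated["input_ids"]) // BLOCK_SIZE) * BLOCK_SIZE
    result = {
        k: [t[i : i + BLOCK_SIZE] for i in range(0, total_length, BLOCK_SIZE)]
        for k, t in concatenated.items()
    }
    result["labels"] = result["input_ids"].copy()  # causal LM: labels mirror the inputs
    return result

raw = load_dataset("text", data_files={"train": "nepali_corpus.txt"})
tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
lm_dataset = tokenized.map(group_texts, batched=True)
```

At 72 blocks per batch, the reported ~34.14K steps correspond to roughly 2.46 million 512-token blocks processed during fine-tuning (counting any repeats across epochs).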