Update README.md
README.md CHANGED
@@ -3,9 +3,18 @@ license: apache-2.0
 base_model: Sakonii/distilgpt2-nepali
 tags:
 - generated_from_trainer
+widget:
+- text: 'नेपाली राजनीतिमा युवा पिढीको भूमिका के हो? '
+  example_title: Example 1
+- text: 'नेपालको ग्रामीण र शहरी क्षेत्रमा स्वास्थ्य सेवा कस्तो छ? '
+  example_title: Example 2
+- text: 'नेपाली राजनीतिमा युवा पिढीको भूमिका के हो? '
+  example_title: Example 3
 model-index:
-- name: distilgpt2-nepali-
+- name: distilgpt2-nepali-qa
   results: []
+language:
+- ne
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -19,18 +28,17 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-More information needed
+Refer to the original [distilgpt2](https://huggingface.co/distilgpt2).
 
 ## Intended uses & limitations
 
-More information needed
-
-## Training and evaluation data
-
-More information needed
+This marginally fine-tuned model can be used for Nepali text generation and possibly question answering, and is intended to be further fine-tuned on Nepali-language generative downstream tasks.
+As the language model was trained on data with texts grouped into blocks of 512 tokens, it handles text sequences of up to 512 tokens.
 
 ## Training procedure
 
+The model is trained with the same configuration as the original [distilgpt2](https://huggingface.co/distilgpt2), but with 512 tokens per instance, 72 instances per batch, and around 34.14K training steps (excluding the pre-training with the CLM objective).
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -58,4 +66,4 @@ The following hyperparameters were used during training:
 - Transformers 4.32.1
 - Pytorch 2.0.0
 - Datasets 2.1.0
-- Tokenizers 0.13.3
+- Tokenizers 0.13.3
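
The added "Intended uses & limitations" text pitches the checkpoint for Nepali text generation within its 512-token context. Below is a minimal usage sketch with the `transformers` pipeline; note that the repo id `Sakonii/distilgpt2-nepali-qa` is an assumption inferred from the model-index name, so substitute the model's actual Hub id.

```python
# Minimal generation sketch for this checkpoint.
# NOTE: the repo id below is hypothetical, inferred from the model-index
# name "distilgpt2-nepali-qa"; replace it with the real Hub id.
from transformers import pipeline

generator = pipeline("text-generation", model="Sakonii/distilgpt2-nepali-qa")

# First widget example: "What is the role of the young generation in Nepali politics?"
prompt = "नेपाली राजनीतिमा युवा पिढीको भूमिका के हो? "

outputs = generator(
    prompt,
    max_new_tokens=100,  # prompt + continuation must fit the 512-token window
    do_sample=True,
    top_k=50,
    top_p=0.95,
)
print(outputs[0]["generated_text"])
```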
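
The training-procedure note above ("texts grouped to a block size of 512") matches the standard Hugging Face causal-LM data preparation. A sketch of that grouping step, assuming the usual `run_clm`-style recipe rather than the author's exact script:

```python
# Sketch of the block-size-512 grouping used for causal-LM training
# (standard run_clm-style recipe; an assumption, not the author's exact code).
from transformers import AutoTokenizer

block_size = 512
tokenizer = AutoTokenizer.from_pretrained("Sakonii/distilgpt2-nepali")

def tokenize(examples):
    return tokenizer(examples["text"])

def group_texts(examples):
    # Concatenate every tokenized field, then cut into fixed 512-token blocks,
    # dropping the short remainder at the end.
    concatenated = {k: sum(examples[k], []) for k in examples.keys()}
    total_length = (len(concatenated["input_ids"]) // block_size) * block_size
    result = {
        k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
        for k, t in concatenated.items()
    }
    # For CLM, labels are a copy of the inputs; the model shifts them internally.
    result["labels"] = result["input_ids"].copy()
    return result
```

Each resulting training instance is exactly 512 tokens, which is why the card caps input sequences at 512 tokens.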