Gabriel Tan committed
Commit 6243969 · Parent(s): 386b070 · Update README.md

README.md CHANGED
@@ -15,9 +15,22 @@ inference: false
 A DialoGPT model fine-tuned on Tagalog conversational data scraped from the web. This model is an output of research on BERT-based data augmentation for low-resource languages. The base model used is DialoGPT-medium.
 
 # Latest release: July 25, 2021
-* As of the moment, the model is only able to respond based on the history of 3 previous utterances before being limited. This is a result of the
+* At the moment, the model can only respond based on the history of the 3 previous utterances; this limit is a result of the scarce amount of Tagalog conversations in our dataset.
 
 
+# Dataset and Scripts
+The training data used was collected under the following categories:
+* Food and Drinks
+* Home and Garden
+* Style and Fashion
+* Travel and Leisure
+* Visas and Immigration
+* Health and Wellness
+* Body and Fitness
+* Small Talk
+
+Pinoy Exchange (PEx) Conversational Dataset to be released soon.
+
 # Usage
 
 Here is an example of using Beam Search as the decoding method for our model.
@@ -41,5 +54,4 @@ for step in range(2):
     print("DialoGPT: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))
 ```
 
-
-To be released
+
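The Usage snippet is only partially visible in this diff (the hunks show just the loop header and the final `print`). A minimal, self-contained sketch of the standard DialoGPT beam-search chat loop might look like the following. Note the checkpoint name is a placeholder: the fine-tuned model's actual repo id is not shown in this diff, so the base model named in the README is used, and the interactive `input()` turns are replaced with scripted example utterances so the snippet runs non-interactively.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint: the fine-tuned model's repo id is not shown in this
# diff, so we load the base model the README names (DialoGPT-medium).
checkpoint = "microsoft/DialoGPT-medium"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Scripted example turns instead of input(), so the loop runs non-interactively.
user_utterances = ["Hello, how are you?", "Where do you live?"]

chat_history_ids = None
for step, text in enumerate(user_utterances):
    # Encode the user turn, appending the end-of-sequence token.
    new_input_ids = tokenizer.encode(text + tokenizer.eos_token, return_tensors="pt")
    # Append the new turn to the running conversation history.
    bot_input_ids = (
        torch.cat([chat_history_ids, new_input_ids], dim=-1) if step > 0 else new_input_ids
    )
    # Beam-search decoding, as described in the README's Usage section.
    chat_history_ids = model.generate(
        bot_input_ids,
        max_length=1000,
        num_beams=5,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Decode only the tokens generated after the user's input.
    reply = tokenizer.decode(
        chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True
    )
    print("DialoGPT: {}".format(reply))
```

The `num_beams` and `no_repeat_ngram_size` values here are illustrative defaults for beam search, not values taken from the model card.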