Gabriel Tan committed on
Commit 6243969
1 Parent(s): 386b070

Update README.md

Files changed (1)
  1. README.md +15 -3
README.md CHANGED
@@ -15,9 +15,22 @@ inference: false
  A DialoGPT model fine-tuned on Tagalog conversational data scraped from the web. This model is an output of a research on BERT-based data augmentation for low resource languages. The base model used is DialoGPT-medium.

  # Latest release: July 25, 2021
- * As of the moment, the model is only able to respond based on the history of 3 previous utterances before being limited. This is a result of the limited amount of Tagalog conversations in our dataset.
+ * As of the moment, the model is only able to respond based on the history of 3 previous utterances before being limited. This is a result of the scarce amount of Tagalog conversations in our dataset.


+ # Dataset and Scripts
+ The training data used was collected under the following categories:
+ * Food and Drinks
+ * Home and Garden
+ * Style and Fashion
+ * Travel and Leisure
+ * Visas and Immigration
+ * Health and Wellness
+ * Body and Fitness
+ * Small Talk
+
+ Pinoy Exchange (PEx) Conversational Dataset to be released soon.
+
  # Usage

  Here is an example of using Beam Search as the decoding method for our model.
@@ -41,5 +54,4 @@ for step in range(2):
  print("DialoGPT: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))
  ```

- # Dataset and Scripts
- To be released
+
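
The release note in this diff says the model only responds well to a history of 3 previous utterances. One way a caller might respect that limit is to trim the chat history before each generation step. The sketch below is an illustration only and is not part of the README's own usage code: `truncate_history` and `MAX_HISTORY_UTTERANCES` are hypothetical names, and it assumes each utterance is stored as a list of token ids that already includes its end-of-sequence token.

```python
from typing import List

# Assumption: the limit of 3 comes from the release note; it is not a
# parameter exposed by the model itself.
MAX_HISTORY_UTTERANCES = 3


def truncate_history(
    utterances: List[List[int]],
    max_utterances: int = MAX_HISTORY_UTTERANCES,
) -> List[int]:
    """Keep only the most recent utterances and flatten them to one id list.

    Each inner list is assumed to hold the token ids of a single utterance,
    including its end-of-sequence token.
    """
    if max_utterances <= 0:
        return []
    recent = utterances[-max_utterances:]
    # Concatenate the kept utterances in their original order.
    return [token_id for utterance in recent for token_id in utterance]
```

The flattened list could then be wrapped in a tensor and passed as the input ids for the next generation step, in place of an unbounded `chat_history_ids`.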