nikokons committed on
Commit
3e8814d
1 Parent(s): 60558c6

Update README.md

  1. README.md +1 -0
README.md CHANGED
@@ -1 +1,2 @@
 
+ # A brief description
  This model uses the open-source weights of DialoGPT (microsoft/DialoGPT-small) and is fine-tuned on the PERSONA-CHAT dataset using an augmented input representation and a multi-task learning scheme, as described in the paper "TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents". The model fine-tunes quickly on PERSONA-CHAT, and 5 epochs of training were sufficient. Training uses a batch size of 4 with gradients accumulated over 8 iterations, giving an effective batch size of 32, and the Adam optimizer with a learning rate of 6e-5.
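
The batch-size arithmetic in the README can be checked with a small sketch. The values 4, 8, and 6e-5 come from the README; everything else (the toy one-weight model, the synthetic data, and the helper names `mse_gradient` and `accumulated_gradient`) is a hypothetical illustration, not the author's training code. It shows that averaging gradients over 8 micro-batches of 4 examples gives the same gradient as one batch of 32.

```python
import random

MICRO_BATCH_SIZE = 4      # per-step batch size stated in the README
ACCUMULATION_STEPS = 8    # gradient accumulation steps stated in the README
LEARNING_RATE = 6e-5      # learning rate stated in the README
EFFECTIVE_BATCH_SIZE = MICRO_BATCH_SIZE * ACCUMULATION_STEPS  # 4 * 8 = 32


def mse_gradient(weight, batch):
    """Gradient of the mean squared error for the toy model y_hat = weight * x."""
    return sum(2.0 * (weight * x - y) * x for x, y in batch) / len(batch)


def accumulated_gradient(weight, examples):
    """Sum scaled micro-batch gradients, as gradient accumulation does."""
    total = 0.0
    for start in range(0, len(examples), MICRO_BATCH_SIZE):
        micro_batch = examples[start:start + MICRO_BATCH_SIZE]
        # Scaling by the number of accumulation steps keeps the result a mean.
        total += mse_gradient(weight, micro_batch) / ACCUMULATION_STEPS
    return total


# Hypothetical toy data: 32 (x, y) pairs scattered around y = 3x.
rng = random.Random(0)
examples = []
for _ in range(EFFECTIVE_BATCH_SIZE):
    x = rng.uniform(-1.0, 1.0)
    examples.append((x, 3.0 * x + rng.gauss(0.0, 0.1)))

weight = 0.0
full_batch_grad = mse_gradient(weight, examples)
accumulated_grad = accumulated_gradient(weight, examples)

# A plain gradient-descent step stands in for Adam, which is not reproduced here.
updated_weight = weight - LEARNING_RATE * accumulated_grad
```

The two gradients agree up to floating-point rounding because all micro-batches have the same size, so the mean of the micro-batch means equals the mean over the full batch.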