nflubis committed
Commit 64a1ea2
1 Parent(s): af9a20c

Update README.md

Files changed (1): README.md (+46, -0)
README.md CHANGED

---
language:
- en
license: apache-2.0
tags:
- dialogue policy
- task-oriented dialog
---

# lava-policy-multiwoz

This is the best-performing LAVA_kl model from the [LAVA paper](https://aclanthology.org/2020.coling-main.41/). It can be used as a word-level policy module in the ConvLab-3 pipeline.

Refer to [ConvLab-3](https://github.com/ConvLab/ConvLab-3) for the model description and usage.
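
As a minimal sketch of how the checkpoint might be fetched for use with ConvLab-3 (the repo id `ConvLab/lava-policy-multiwoz` is assumed from the model name; the actual policy class and loading logic live in ConvLab-3 itself):

```python
# Sketch: download the model files locally with the Hugging Face Hub client;
# wiring them into the ConvLab-3 pipeline follows that repository's LAVA policy docs.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="ConvLab/lava-policy-multiwoz")  # assumed repo id
print(f"Model files downloaded to: {local_dir}")
```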

## Training procedure

The model was trained on MultiWOZ 2.0 data using the [LAVA codebase](https://gitlab.cs.uni-duesseldorf.de/general/dsml/lava-public). Training started with VAE pre-training, followed by fine-tuning with an informative-prior KL loss, and concluded with corpus-based RL using REINFORCE.

### Training hyperparameters

The following hyperparameters were used during SL training (collected in the config sketch after this list):
- y_size: 10
- k_size: 20
- beta: 0.1
- simple_posterior: true
- contextual_posterior: false
- learning_rate: 1e-03
- max_vocab_size: 1000
- max_utt_len: 50
- max_dec_len: 30
- backward_size: 2
- train_batch_size: 128
- seed: 58
- optimizer: Adam
- num_epoch: 100, with early stopping based on the validation set
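
For convenience, the same SL settings are gathered below as a plain Python dictionary; the key names simply mirror the list above and may not match the exact field names in the LAVA codebase config:

```python
# Illustrative only: the SL hyperparameters from the list above in one place.
sl_config = {
    "y_size": 10,                  # discrete latent dimensions; see the LAVA paper
    "k_size": 20,                  # categories per latent variable
    "beta": 0.1,                   # weight of the KL term
    "simple_posterior": True,
    "contextual_posterior": False,
    "learning_rate": 1e-3,
    "max_vocab_size": 1000,
    "max_utt_len": 50,
    "max_dec_len": 30,
    "backward_size": 2,
    "train_batch_size": 128,
    "seed": 58,
    "optimizer": "Adam",
    "num_epoch": 100,              # with early stopping on the validation set
}
```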

The following hyperparameters were used during RL training (a REINFORCE update sketch follows the list):
- tune_pi_only: false
- max_words: 100
- temperature: 1.0
- episode_repeat: 1.0
- rl_lr: 0.01
- momentum: 0.0
- nesterov: false
- gamma: 0.99
- rl_clip: 5.0
- random_seed: 38
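
As a rough, generic sketch (not the authors' code) of how these RL settings map onto a REINFORCE-style policy-gradient update with plain SGD and gradient clipping (whether the original implementation clips by norm or by value, and how per-turn rewards are derived from the corpus, are assumptions here):

```python
import torch

def reinforce_step(policy, optimizer, log_probs, rewards, gamma=0.99, rl_clip=5.0):
    """One REINFORCE update for a single dialogue episode.

    log_probs: per-turn log-probabilities (tensors) of the sampled responses.
    rewards:   per-turn scalar rewards, e.g. a task-success signal.
    """
    # Discounted returns, computed backwards through the episode (gamma = 0.99).
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.insert(0, g)
    returns = torch.tensor(returns)

    # Policy-gradient loss: negative log-likelihood weighted by the return.
    loss = -(torch.stack(log_probs) * returns).sum()

    optimizer.zero_grad()
    loss.backward()
    # rl_clip = 5.0 from the list above (norm clipping assumed).
    torch.nn.utils.clip_grad_norm_(policy.parameters(), rl_clip)
    optimizer.step()
    return loss.item()

# SGD matching the list above: rl_lr = 0.01, momentum = 0.0, nesterov = false.
# optimizer = torch.optim.SGD(policy.parameters(), lr=0.01, momentum=0.0, nesterov=False)
```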