---
language:
- en
license: apache-2.0
tags:
- dialogue policy
- task-oriented dialog
datasets:
- ConvLab/multiwoz21
---

# ddpt-policy-sgd

This is a [DDPT](https://aclanthology.org/2022.coling-1.21/) model trained on [MultiWOZ 2.1](https://huggingface.co/datasets/ConvLab/multiwoz21).

Refer to [ConvLab-3](https://github.com/ConvLab/ConvLab-3) for the model description and usage.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 64
- seed: 1
- optimizer: Adam
- num_epochs: 40
- the checkpoint that performed best on the validation set was kept

### Framework versions
- Transformers 4.18.0
- Pytorch 1.10.2+cu111
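
The hyperparameters listed above can be sketched as a plain PyTorch setup. This is an illustrative reconstruction only, not the actual ConvLab-3 training script; the placeholder network and variable names are assumptions.

```python
import torch

# Illustrative sketch of the listed hyperparameters; the real DDPT policy
# network and training loop live in ConvLab-3.
torch.manual_seed(1)        # seed: 1

model = torch.nn.Linear(8, 4)  # placeholder standing in for the DDPT policy network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-05)  # optimizer: Adam, learning_rate: 1e-05

TRAIN_BATCH_SIZE = 64       # train_batch_size: 64
NUM_EPOCHS = 40             # num_epochs: 40
```

In practice, validation performance would be evaluated after each epoch and the best-scoring checkpoint kept, as the list above notes.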