Visible device: cuda
Seed used: 0
Batch size: 64
Epochs: 1
Learning rate: 1e-05
Entropy weight: 0.01
Regularization weight: 0.0
Only use multiwoz like domains: False
We use: 100.0% of the data
Dialogue order used: 0
Vectorizer: Data set used is sgd
We filter state by active domains: True
Vectorizer: Data set used is sgd
Embedding semantic descriptions: True
Embedded descriptions successfully. Size: torch.Size([1678, 768])
Data set used for descriptions: sgd
We use Roberta to embed actions.
Didnt load a model
Start training
Epoch: 0
Average actions: 1.684490442276001
Average target actions: 2.024200201034546
Precision: 0.3306945737954022
Recall: 0.27521008403361347
F1: 0.3004118891239007
<> epoch 0: saved network to mdl
Best Precision: 0.3306945737954022
Best Recall: 0.27521008403361347
Best F1: 0.3004118891239007