---
language:
- en
license: apache-2.0
tags:
- t5-large
- text2text-generation
- conversational question rewriting
datasets:
- CANARD
metrics:
- BLEU
model-index:
- name: t5-large-coqr-canard
  results:
  - task:
      type: text2text-generation
      name: conversational question rewriting
    dataset:
      type: CANARD
      name: CANARD
      split: test
    metrics:
    - type: BLEU
      value: 77.8
      name: BLEU
widget:
- text: "Rewrite the question according to the given context to make the dialog fluent using anaphora and ellipsis.\n\nquestion: What else happened during 1977-1981 other than Superstar Billy Graham's return?\n\ncontext: Superstar Billy Graham\nReturn to WWWF (1977-1981)\nWhy did he return to the WWWF?\nan agreement with promoter Vincent J. McMahon (Senior\nWhat was his agreement with McMahon?\nI don't know.\nHow did people respond to his return?\nI don't know."
- text: "Rewrite the question according to the given context to make the dialog fluent using anaphora and ellipsis.\n\nquestion: why did Billy Graham personally sued Zahorian and the WWF?\n\ncontext: Superstar Billy Graham\nDisputes with the McMahons\nwhat disputes did he have?\nGraham personally sued Zahorian and the WWF,"
inference:
  parameters:
    max_length: 100
---

# t5-large-coqr-canard

This model is a fine-tuned version of [t5-large](https://huggingface.co/t5-large) on the [CANARD](https://sites.google.com/view/qanta/projects/canard) dataset.
It achieves the following results on the test set:
- Loss: 0.3064
- BLEU: 77.1979
- Generation Length: 9.576

## Model description

The CANARD dataset rewrites the questions in a conversation to make them context-independent, i.e. understandable without the dialog history. This model is trained in the opposite direction: it rewrites context-independent questions into conversational ones, aiming to produce fluent dialog through anaphora and ellipsis.

Input:
```
Rewrite the question according to the given context to make the dialog fluent using anaphora and ellipsis.

question: How did people respond to Superstar Billy Graham's return?

context: Superstar Billy Graham
Return to WWWF (1977-1981)
Why did he return to the WWWF?
an agreement with promoter Vincent J. McMahon (Senior
What was his agreement with McMahon?
I don't know.
```

Target:
```
How did people respond to his return?
```

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 512
- total_eval_batch_size: 512
- optimizer: Adafactor
- lr_scheduler_type: linear
- num_epochs: 1.0

### Training results

| Training Loss | Epoch | Step | Validation Loss | BLEU    | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
| No log        | 1.0   | 62   | 0.2987          | 77.2361 | 9.4534  |

### Framework versions

- Transformers 4.20.1
- Pytorch 1.11.0+cu113
- Datasets 2.6.1
- Tokenizers 0.12.1
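
## How to use

A minimal usage sketch with the `transformers` library. The model ID below is a placeholder; substitute the actual Hub repository path of this checkpoint. The prompt follows the Input format from the Model description above, and `max_length=100` matches the inference parameter in the card metadata.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder model ID; replace with the actual Hub repository path.
model_id = "t5-large-coqr-canard"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Build the prompt as in the Input example above: an instruction,
# the context-independent question, and the dialog history as context.
prompt = (
    "Rewrite the question according to the given context to make the dialog "
    "fluent using anaphora and ellipsis.\n\n"
    "question: How did people respond to Superstar Billy Graham's return?\n\n"
    "context: Superstar Billy Graham\n"
    "Return to WWWF (1977-1981)\n"
    "Why did he return to the WWWF?\n"
    "an agreement with promoter Vincent J. McMahon (Senior\n"
    "What was his agreement with McMahon?\n"
    "I don't know."
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Expected rewrite (per the Target example above): "How did people respond to his return?"
```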