adding support for bi-text tasks like NLI or QA #8

by MoritzLaurer - opened

If I understand the interface correctly, one can currently only use datasets with a single input column. Tasks like NLI or QA require two input texts (e.g. a premise column and a hypothesis column for multi_nli). Could support for bi-text tasks be added? (Note that the order in which the two texts are concatenated is quite important, so if someone puts the columns in the wrong order, the output would be misleading.)

Hi @MoritzLaurer thank you for sharing this great feedback! We're planning to add support for bi-text tasks soon and I'll report back here when it's available :)

Hey @MoritzLaurer, we've just deployed support for bi-text tasks under the natural_language_inference task in the UI.


Under the hood, we use the align_labels_with_mapping() function from datasets to ensure that the model's label mapping is aligned with the dataset. This should handle the issue you mentioned about the order of inputs being important.
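
To illustrate what label alignment means here, a minimal plain-Python sketch follows. It is not the datasets implementation: the helper name `align_labels` and the two label mappings are hypothetical, chosen only to show how dataset label ids could be remapped to the ids a model expects.

```python
def align_labels(examples, dataset_label_names, model_label2id, label_column="label"):
    """Remap integer labels so they follow the model's label2id mapping.

    Hypothetical helper for illustration; label names are matched
    case-insensitively.
    """
    # Model side: lower-cased label name -> model label id.
    name_to_model_id = {name.lower(): idx for name, idx in model_label2id.items()}
    # Dataset side: old label id (position in the name list) -> model label id.
    old_to_new = {
        old_id: name_to_model_id[name.lower()]
        for old_id, name in enumerate(dataset_label_names)
    }
    return [{**ex, label_column: old_to_new[ex[label_column]]} for ex in examples]


# Assumed mappings for illustration: the dataset and the model use opposite id orders.
dataset_label_names = ["entailment", "neutral", "contradiction"]
model_label2id = {"CONTRADICTION": 0, "NEUTRAL": 1, "ENTAILMENT": 2}

examples = [
    {"premise": "A man is sleeping.", "hypothesis": "A person is asleep.", "label": 0},
    {"premise": "A man is sleeping.", "hypothesis": "A man is running.", "label": 2},
]

aligned = align_labels(examples, dataset_label_names, model_label2id)
print([ex["label"] for ex in aligned])  # [2, 0]
```

Without this step, a model that predicts "entailment" as id 2 would be scored against a dataset that stores "entailment" as id 0, so the metrics would be wrong even if the predictions were right.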

Let us know if you run into any issues 🤗!

That's great, thank you very much for implementing this! I'm testing it now and will tell you if I run into issues.
(Regarding the order of inputs: I actually meant that it makes a difference whether the input for the tokenizer is "{premise} [SEP] {hypothesis}" or "{hypothesis} [SEP] {premise}". With the second (wrong) order, the model would probably perform worse. But I see in the new interface that you allow people to select the correct order, so this is great :) )
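
To make the ordering point concrete, here is a small sketch in plain Python. The helper `build_pair_text` and the example sentences are made up for illustration, and "[SEP]" is just the placeholder separator from the templates above:

```python
def build_pair_text(example, first_column, second_column, sep_token="[SEP]"):
    """Join two text columns in the given order (hypothetical helper)."""
    return f"{example[first_column]} {sep_token} {example[second_column]}"


example = {"premise": "A man is sleeping.", "hypothesis": "A person is asleep."}

# Order an NLI model was (presumably) trained on: premise first.
correct = build_pair_text(example, "premise", "hypothesis")
# Columns selected in the wrong order: hypothesis first.
swapped = build_pair_text(example, "hypothesis", "premise")

print(correct)  # A man is sleeping. [SEP] A person is asleep.
print(swapped)  # A person is asleep. [SEP] A man is sleeping.
```

In practice a tokenizer usually takes the two texts as separate arguments and inserts its own special tokens, but the same point holds: whichever column is passed first ends up first in the model input.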

MoritzLaurer changed discussion status to closed