adding support for bi-text tasks like NLI or QA

#8
by MoritzLaurer HF staff - opened

If I understand the interface correctly, one can currently only use datasets with a single input column. Tasks like NLI or QA require two input texts (e.g. a premise column and a hypothesis column for multi_nli). Could support for bi-text tasks be added? Note that the order in which the two texts are concatenated matters: if someone puts the columns in the wrong order, the output would be misleading.

Evaluation on the Hub org

Hi @MoritzLaurer thank you for sharing this great feedback! We're planning to add support for bi-text tasks soon and I'll report back here when it's available :)

Evaluation on the Hub org

Hey @MoritzLaurer, we've just deployed support for bi-text tasks under the natural_language_inference task in the UI.

[Screenshot: Screen Shot 2022-08-29 at 20.58.53.png]

Under the hood, we use the align_labels_with_mapping() function from datasets to ensure that the model's label mapping is aligned with the dataset. This should handle the issue you mentioned about the order of inputs being important.
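To make the label alignment idea concrete, here is a minimal pure-Python sketch of remapping a dataset's integer labels to a model's label ids. It is not the datasets implementation: the helper `align_labels`, the example rows, and both label orders are made up for illustration, and the case-insensitive matching is an assumption of this sketch.

```python
def align_labels(examples, dataset_names, model_label2id):
    """Return copies of the examples with labels remapped to the model's ids.

    dataset_names[i] is the label name that the dataset stores as integer i.
    model_label2id maps label names to the ids the model was trained with.
    """
    # Assumption of this sketch: match label names case-insensitively.
    model_ids = {name.lower(): idx for name, idx in model_label2id.items()}
    old_to_new = {
        old_id: model_ids[name.lower()]
        for old_id, name in enumerate(dataset_names)
    }
    # Build new dicts so the original examples are left untouched.
    return [{**ex, "label": old_to_new[ex["label"]]} for ex in examples]


# Hypothetical dataset label order.
dataset_names = ["entailment", "neutral", "contradiction"]
examples = [
    {"premise": "A man is sleeping.", "hypothesis": "A man is awake.", "label": 2},
    {"premise": "A dog runs.", "hypothesis": "An animal moves.", "label": 0},
]

# Hypothetical model config that uses a different id order and casing.
model_label2id = {"CONTRADICTION": 0, "NEUTRAL": 1, "ENTAILMENT": 2}

aligned = align_labels(examples, dataset_names, model_label2id)
print([ex["label"] for ex in aligned])  # [0, 2]
```

Without a step like this, a model that predicts id 0 for "contradiction" would be scored against a dataset where id 0 means "entailment", and the reported metrics would be wrong even if the model is good.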

Let us know if you run into any issues 🤗!

That's great, thank you very much for implementing this! I'm testing it now and will tell you if I run into issues.
(Regarding the order-of-inputs issue: I actually meant that it makes a difference whether the input to the tokenizer is "{premise} [SEP] {hypothesis}" or "{hypothesis} [SEP] {premise}". With the second (wrong) order, the model would probably perform worse. But I see in the new interface that you allow people to select the correct order, so this is great :) )
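The ordering point can be sketched in a few lines of plain Python. The helper `build_pair_input`, the column names, and the literal "[SEP]" separator are illustrative assumptions; with a Hugging Face tokenizer one would normally pass the two texts as `tokenizer(premise, hypothesis)` and let it insert the model's own special tokens.

```python
def build_pair_input(example, first_column, second_column, sep_token="[SEP]"):
    """Join two text columns in the given order, separated by sep_token."""
    return f"{example[first_column]} {sep_token} {example[second_column]}"


# Hypothetical NLI row.
example = {"premise": "A man is sleeping.", "hypothesis": "A man is awake."}

# Order the model is assumed to have been trained on: premise first.
correct = build_pair_input(example, "premise", "hypothesis")

# Columns selected in the wrong order: same texts, different model input.
swapped = build_pair_input(example, "hypothesis", "premise")

print(correct)  # A man is sleeping. [SEP] A man is awake.
print(swapped)  # A man is awake. [SEP] A man is sleeping.
```

Both strings are valid inputs, so nothing fails when the columns are swapped; the only symptom is a lower score, which is why the column order has to be chosen explicitly.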

MoritzLaurer changed discussion status to closed
