Does adding that many datasets increase the accuracy?
#3 opened by AHDMK
If the model is meant to be used to create embeddings for medical text, wouldn't using that much general data shift the model's focus away from the medical terms?
Only the MNLI and SNLI datasets are general-domain datasets for natural language inference. The other datasets are medical-domain datasets. Adding a mixture of both types of datasets improves model performance. However, the overall accuracy will depend on the task at hand.
AHDMK changed discussion status to closed