Does adding that many datasets increase the accuracy?

#3
by AHDMK - opened

If the model is meant to create embeddings for medical text, wouldn't using that much general data shift the model's focus away from medical terms?

Only MNLI and SNLI are general-domain natural language inference datasets. The other datasets are medical-domain. Training on a mixture of both types improves model performance. However, the overall accuracy will depend on the task at hand.
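To make the idea of a mixture concrete, here is a minimal sketch of weighted sampling across general and medical sources. It is not the actual training recipe for this model: the `mix_datasets` helper, the `medical_a`/`medical_b` names, the placeholder examples, and the 1:1:3:3 weights are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def mix_datasets(datasets, weights, n_samples, seed=0):
    """Draw n_samples (source_name, example) pairs, choosing the source by weight."""
    rng = random.Random(seed)
    names = list(datasets)
    probs = [weights[name] for name in names]
    mixed = []
    for _ in range(n_samples):
        # Pick a source dataset in proportion to its weight, then one example from it.
        name = rng.choices(names, weights=probs, k=1)[0]
        mixed.append((name, rng.choice(datasets[name])))
    return mixed


# Placeholder corpora: only the MNLI/SNLI names come from the thread;
# the medical dataset names and every example string are hypothetical.
datasets = {
    "mnli": [f"general NLI pair {i}" for i in range(100)],
    "snli": [f"general NLI pair {i}" for i in range(100)],
    "medical_a": [f"medical pair {i}" for i in range(100)],
    "medical_b": [f"medical pair {i}" for i in range(100)],
}

# Hypothetical weights that keep general-domain data at roughly a quarter of the mix.
weights = {"mnli": 1, "snli": 1, "medical_a": 3, "medical_b": 3}

batch = mix_datasets(datasets, weights, n_samples=1000)
counts = Counter(name for name, _ in batch)
general_share = (counts["mnli"] + counts["snli"]) / len(batch)
print(f"general-domain share of the mix: {general_share:.2f}")
```

With weights like these, the medical sources still dominate what the model sees, while the general NLI pairs add variety. Whether that ratio helps on a given task would need to be checked by evaluating on that task.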

AHDMK changed discussion status to closed
