This is a version of the SciBERT encoder trained to retrieve datasets, based on their textual descriptions, given a natural language query.

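As a rough illustration of how a retrieval encoder like this is commonly applied, the sketch below embeds a query and a set of dataset descriptions, then ranks the datasets by inner product. It assumes a bi-encoder setup, which the description above does not spell out. `toy_encode` is a hypothetical stand-in (a hashed bag-of-words) used only so the example runs without downloading any weights; in real use it would be replaced by a call to the trained SciBERT encoder. The dataset names and descriptions are made up for illustration.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def toy_encode(text: str, dim: int = 64) -> Vector:
    """Hypothetical stand-in encoder: hashed bag-of-words, L2-normalized.

    ASSUMPTION: this is NOT the SciBERT model; swap in the real encoder here.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        # Deterministic bucket from character codes (avoids Python's salted hash()).
        bucket = sum(ord(ch) for ch in token) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def rank_datasets(
    query: str,
    descriptions: Dict[str, str],
    encode: Callable[[str], Vector] = toy_encode,
) -> List[Tuple[str, float]]:
    """Score each dataset description against the query and sort best-first."""
    query_vec = encode(query)
    scored = []
    for name, description in descriptions.items():
        desc_vec = encode(description)
        # Inner product of the two embeddings is the relevance score.
        score = sum(q * d for q, d in zip(query_vec, desc_vec))
        scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative (made-up) dataset descriptions.
descriptions = {
    "qa_corpus": "question answering over wikipedia paragraphs",
    "image_set": "labeled images of handwritten digits",
    "mt_corpus": "parallel sentences for machine translation",
}

ranked = rank_datasets("dataset for question answering over wikipedia", descriptions)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```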
If you find this model useful, please cite:

    title = {DataFinder: Scientific Dataset Recommendation from Natural Language Descriptions},
    author = {Vijay Viswanathan and Luyu Gao and Tongshuang Wu and Pengfei Liu and Graham Neubig},
    booktitle = {Annual Conference of the Association for Computational Linguistics (ACL)},
    address = {Toronto, Canada},
    month = {July},
    url = {https://arxiv.org/abs/2305.16636},
    year = {2023}
