Sentence Similarity
sentence-transformers
PyTorch
English
mpnet
feature-extraction
medical
biology
Inference Endpoints
FremyCompany commited on
Commit
4f4afec
1 Parent(s): 1343694

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -26,7 +26,7 @@ This model was trained using BioLORD, a new pre-training strategy for producing
26
 
27
  State-of-the-art methodologies operate by maximizing the similarity in representation of names referring to the same concept, and preventing collapse through contrastive learning. However, because biomedical names are not always self-explanatory, it sometimes results in non-semantic representations.
28
 
29
- BioLORD overcomes this issue by grounding its concept representations using definitions, as well as short descriptions derived from a multi-relational knowledge graph consisting of biomedical ontologies. Thanks to this grounding, our model produces more semantic concept representations that match more closely the hierarchical structure of ontologies. BioLORD-2023 establishes a new state of the art for text similarity on both clinical sentences (MedSTS) and biomedical concepts (MayoSRS).
30
 
31
  This model is based on [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) and was further finetuned on the [BioLORD-Dataset](https://huggingface.co/datasets/FremyCompany/BioLORD-Dataset) and LLM-generated definitions from the [Automatic Glossary of Clinical Terminology (AGCT)](https://huggingface.co/datasets/FremyCompany/AGCT-Dataset).
32
 
 
26
 
27
  State-of-the-art methodologies operate by maximizing the similarity in representation of names referring to the same concept, and preventing collapse through contrastive learning. However, because biomedical names are not always self-explanatory, it sometimes results in non-semantic representations.
28
 
29
+ BioLORD overcomes this issue by grounding its concept representations using definitions, as well as short descriptions derived from a multi-relational knowledge graph consisting of biomedical ontologies. Thanks to this grounding, our model produces more semantic concept representations that match more closely the hierarchical structure of ontologies. BioLORD-2023 establishes a new state of the art for text similarity on both clinical sentences (MedSTS) and biomedical concepts (EHR-Rel-B).
30
 
31
  This model is based on [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) and was further finetuned on the [BioLORD-Dataset](https://huggingface.co/datasets/FremyCompany/BioLORD-Dataset) and LLM-generated definitions from the [Automatic Glossary of Clinical Terminology (AGCT)](https://huggingface.co/datasets/FremyCompany/AGCT-Dataset).
32