Commit d5452fe (parent: 552d3fc)
Muthukumaran committed: update README from Bhatta's revisions

README.md CHANGED
@@ -12,7 +12,7 @@ pipeline_tag: sentence-similarity
 
 # Model Card for nasa-smd-ibm-st-v2
 
-`nasa-smd-ibm-st.38m` is a Bi-encoder sentence transformer model, that is fine-tuned from nasa-smd-ibm-v0.1 encoder model. it is a smaller version of `nasa-smd-ibm-st` with better performance, using fewer parameters (shown below). It's trained with 271 million examples along with a domain-specific dataset of 2.6 million examples from documents curated by NASA Science Mission Directorate (SMD). With this model, we aim to enhance natural language technologies like information retrieval and intelligent search as it applies to SMD NLP applications.
+`nasa-smd-ibm-st.38m` is a bi-encoder sentence transformer model fine-tuned from a distilled version of the nasa-smd-ibm-v0.1 encoder model. It is a smaller version of `nasa-smd-ibm-st` with better performance, using fewer parameters (shown below). It was trained on 271 million examples, along with a domain-specific dataset of 2.6 million examples from documents curated by the NASA Science Mission Directorate (SMD). With this model, we aim to enhance natural language technologies such as information retrieval and intelligent search as they apply to SMD NLP applications.
 
 ## Model Details
 - **Base Encoder Model**: nasa-smd-ibm-v0.1
@@ -25,7 +25,7 @@ pipeline_tag: sentence-similarity
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61099e5d86580d4580767226/fcsd0fEY_EoMA1F_CsEbD.png)
 
-Figure: dataset sources for sentence transformers (
+Figure: dataset sources for sentence transformers (362M in total)
 
 Additionally, 2.6M abstract + title pairs collected from NASA SMD documents.
 
@@ -41,7 +41,7 @@ Following models are evaluated:
 1. All-MiniLM-l6-v2 [sentence-transformers/all-MiniLM-L6-v2]
 2. BGE-base [BAAI/bge-base-en-v1.5]
 3. RoBERTa-base [roberta-base]
-4. nasa-smd-ibm-
+4. nasa-smd-ibm-rtvr_v2 [nasa-impact/nasa-smd-ibm-st-v2]
 
 
@@ -99,6 +99,7 @@ IBM Research
 - Aashka Trivedi
 - Masayasu Muraoka
 - Bishwaranjan Bhattacharjee
+- Takuma Udagawa
 
 NASA SMD
 - Muthukumaran Ramasubramanian
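The README describes a bi-encoder for sentence similarity; on the Hugging Face Hub such models are typically used through the `sentence-transformers` library, where each text is embedded independently and documents are ranked by cosine similarity to the query. A minimal sketch of that retrieval step is below; the tiny hand-made vectors are stand-ins so it runs offline, and in practice the embeddings would come from something like `SentenceTransformer("nasa-impact/nasa-smd-ibm-st-v2").encode(texts)` (model usage assumed, not taken from this diff):

```python
import numpy as np

# In real use the vectors come from the bi-encoder, e.g. (assumed usage):
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("nasa-impact/nasa-smd-ibm-st-v2")
#   query_vec = model.encode("solar wind measurements")
#   doc_vecs = model.encode(list_of_abstracts)
# Here we use small stand-in vectors so the sketch runs without a download.

def rank_by_cosine(query_vec: np.ndarray, doc_vecs: np.ndarray) -> list:
    """Return document indices sorted by descending cosine similarity to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q  # cosine similarity of each document to the query
    return list(np.argsort(-scores))

query = np.array([1.0, 0.0, 1.0])
docs = np.array([
    [0.9, 0.1, 0.8],  # nearly parallel to the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
])
print(rank_by_cosine(query, docs))  # document 0 ranks first
```

The bi-encoder design means document embeddings can be computed once and cached, so each query costs only one forward pass plus a dot product per document.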