Commit d5452fe (parent: 552d3fc)
Muthukumaran committed: update README from Bhatta's revisions

README.md CHANGED
@@ -12,7 +12,7 @@ pipeline_tag: sentence-similarity
 
 # Model Card for nasa-smd-ibm-st-v2
 
-`nasa-smd-ibm-st.38m` is a Bi-encoder sentence transformer model, that is fine-tuned from nasa-smd-ibm-v0.1 encoder model. it is a smaller version of `nasa-smd-ibm-st` with better performance, using fewer parameters (shown below). It's trained with 271 million examples along with a domain-specific dataset of 2.6 million examples from documents curated by NASA Science Mission Directorate (SMD). With this model, we aim to enhance natural language technologies like information retrieval and intelligent search as it applies to SMD NLP applications.
+`nasa-smd-ibm-st.38m` is a bi-encoder sentence transformer model fine-tuned from a distilled version of the nasa-smd-ibm-v0.1 encoder model. It is a smaller version of `nasa-smd-ibm-st` with better performance, using fewer parameters (shown below). It was trained on 271 million examples, along with a domain-specific dataset of 2.6 million examples from documents curated by the NASA Science Mission Directorate (SMD). With this model, we aim to enhance natural language technologies such as information retrieval and intelligent search as they apply to SMD NLP applications.
 
 ## Model Details
 - **Base Encoder Model**: nasa-smd-ibm-v0.1
@@ -25,7 +25,7 @@ pipeline_tag: sentence-similarity
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61099e5d86580d4580767226/fcsd0fEY_EoMA1F_CsEbD.png)
 
-Figure: dataset sources for sentence transformers (
+Figure: dataset sources for sentence transformers (362M in total)
 
 Additionally, 2.6M abstract + title pairs collected from NASA SMD documents.
 
@@ -41,7 +41,7 @@ Following models are evaluated:
 1. All-MiniLM-l6-v2 [sentence-transformers/all-MiniLM-L6-v2]
 2. BGE-base [BAAI/bge-base-en-v1.5]
 3. RoBERTa-base [roberta-base]
-4. nasa-smd-ibm-
+4. nasa-smd-ibm-rtvr_v2 [nasa-impact/nasa-smd-ibm-st-v2]
 
 
@@ -99,6 +99,7 @@ IBM Research
 - Aashka Trivedi
 - Masayasu Muraoka
 - Bishwaranjan Bhattacharjee
+- Takuma Udagawa
 
 NASA SMD
 - Muthukumaran Ramasubramanian
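The README describes a bi-encoder for sentence similarity; on the Hugging Face Hub such models are typically used through the `sentence-transformers` library, where each text is embedded independently and documents are ranked by cosine similarity to the query. A minimal sketch of that retrieval step is below; the tiny hand-made vectors are stand-ins so it runs offline, and in practice the embeddings would come from something like `SentenceTransformer("nasa-impact/nasa-smd-ibm-st-v2").encode(texts)` (model usage assumed, not taken from this diff):

```python
import numpy as np

# In real use the vectors come from the bi-encoder, e.g. (assumed usage):
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("nasa-impact/nasa-smd-ibm-st-v2")
#   query_vec = model.encode("solar wind measurements")
#   doc_vecs = model.encode(list_of_abstracts)
# Here we use small stand-in vectors so the sketch runs without a download.

def rank_by_cosine(query_vec: np.ndarray, doc_vecs: np.ndarray) -> list:
    """Return document indices sorted by descending cosine similarity to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q  # cosine similarity of each document to the query
    return list(np.argsort(-scores))

query = np.array([1.0, 0.0, 1.0])
docs = np.array([
    [0.9, 0.1, 0.8],  # nearly parallel to the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
])
print(rank_by_cosine(query, docs))  # document 0 ranks first
```

The bi-encoder design means document embeddings can be computed once and cached, so each query costs only one forward pass plus a dot product per document.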