Update README.md
README.md
CHANGED
@@ -6,6 +6,11 @@ language:
 - en
 ---

 **Aug 2023 Update:**
 1. The SPECTER 2.0 Base and proximity adapter models have been renamed in Hugging Face based upon usage patterns as follows:

@@ -18,9 +23,7 @@ language:
 However, for benchmarking purposes, please continue using the current version.


-<!-- Provide a quick summary of what the model is/does. -->

-# SPECTER 2.0 (Base)
 SPECTER 2.0 is the successor to [SPECTER](https://huggingface.co/allenai/specter) and is capable of generating task specific embeddings for scientific tasks when paired with [adapters](https://huggingface.co/models?search=allenai/specter-2_).
 This is the base model to be used along with the adapters.
 Given the combination of title and abstract of a scientific paper or a short textual query, the model can be used to generate effective embeddings to be used in downstream applications.
@@ -39,7 +42,7 @@ Post that it is trained with additionally attached task format specific adapter
 Task Formats trained on:
 - Classification
 - Regression
-- Proximity
 - Adhoc Search


@@ -69,12 +72,12 @@ It builds on the work done in [SciRepEval: A Multi-Format Benchmark for Scientif

 |Model|Name and HF link|Description|
 |--|--|--|
-|
-|Adhoc Query|[allenai/specter2_adhoc_query](https://huggingface.co/allenai/specter2_adhoc_query)|Encode short raw text queries for search tasks. (Candidate papers can be encoded with proximity)|
 |Classification|[allenai/specter2_classification](https://huggingface.co/allenai/specter2_classification)|Encode papers to feed into linear classifiers as features|
 |Regression|[allenai/specter2_regression](https://huggingface.co/allenai/specter2_regression)|Encode papers to feed into linear regressors as features|

-*

 ```python
 from transformers import AutoTokenizer, AutoModel
@@ -86,7 +89,7 @@ tokenizer = AutoTokenizer.from_pretrained('allenai/specter2_base')
 model = AutoModel.from_pretrained('allenai/specter2_base')

 #load the adapter(s) as per the required task, provide an identifier for the adapter in load_as argument and activate it
-model.load_adapter("allenai/
 #other possibilities: allenai/specter2_<classification|regression|adhoc_query>

 papers = [{'title': 'BERT', 'abstract': 'We introduce a new language representation model called BERT'},
@@ -6,6 +6,11 @@ language:
 - en
 ---

+<!-- Provide a quick summary of what the model is/does. -->
+
+# SPECTER 2.0 (Base)
+
+
 **Aug 2023 Update:**
 1. The SPECTER 2.0 Base and proximity adapter models have been renamed in Hugging Face based upon usage patterns as follows:

@@ -18,9 +23,7 @@ language:
 However, for benchmarking purposes, please continue using the current version.



 SPECTER 2.0 is the successor to [SPECTER](https://huggingface.co/allenai/specter) and is capable of generating task specific embeddings for scientific tasks when paired with [adapters](https://huggingface.co/models?search=allenai/specter-2_).
 This is the base model to be used along with the adapters.
 Given the combination of title and abstract of a scientific paper or a short textual query, the model can be used to generate effective embeddings to be used in downstream applications.
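
As a rough illustration of the "combination of title and abstract" input described above, the sketch below joins the two fields into one string per paper. The helper name `format_paper` is hypothetical, and the `"[SEP]"` separator is an assumption (a BERT-style separator token); with a loaded tokenizer, its own `sep_token` attribute would be the safer choice.

```python
from typing import Dict, List


def format_paper(paper: Dict[str, str], sep_token: str = "[SEP]") -> str:
    """Join a paper's title and abstract into a single input string.

    `sep_token` is an assumption for illustration; prefer the separator
    token reported by the tokenizer that is actually in use.
    """
    title = paper.get("title") or ""
    abstract = paper.get("abstract") or ""  # a missing abstract becomes ""
    return f"{title}{sep_token}{abstract}"


papers: List[Dict[str, str]] = [
    {"title": "BERT",
     "abstract": "We introduce a new language representation model called BERT"},
    {"title": "Untitled note"},  # no abstract available
]

# One string per paper, ready to be passed to a tokenizer as a batch.
text_batch = [format_paper(p) for p in papers]
```

A short textual query has no abstract, so it can be passed through unchanged rather than through this helper.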
@@ -39,7 +42,7 @@ Post that it is trained with additionally attached task format specific adapter
 Task Formats trained on:
 - Classification
 - Regression
+- Proximity (Retrieval)
 - Adhoc Search


@@ -69,12 +72,12 @@ It builds on the work done in [SciRepEval: A Multi-Format Benchmark for Scientif

 |Model|Name and HF link|Description|
 |--|--|--|
+|Proximity*|[allenai/specter2](https://huggingface.co/allenai/specter2)|Encode papers as queries and candidates, e.g. Link Prediction, Nearest Neighbor Search|
+|Adhoc Query|[allenai/specter2_adhoc_query](https://huggingface.co/allenai/specter2_adhoc_query)|Encode short raw text queries for search tasks. (Candidate papers can be encoded with the proximity adapter)|
 |Classification|[allenai/specter2_classification](https://huggingface.co/allenai/specter2_classification)|Encode papers to feed into linear classifiers as features|
 |Regression|[allenai/specter2_regression](https://huggingface.co/allenai/specter2_regression)|Encode papers to feed into linear regressors as features|

+*Proximity model should suffice for downstream task types not mentioned above
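
The table and its footnote can be read as a small lookup: each task format maps to one adapter, and anything not listed falls back to the proximity adapter. The sketch below is only an illustration of that reading; the `ADAPTERS` dictionary and the `adapter_for` helper are hypothetical names, and the model identifiers are copied from the table rows in this change.

```python
# Adapter identifiers as listed in the table; the dictionary itself is a
# hypothetical convenience and is not part of the README.
ADAPTERS = {
    "proximity": "allenai/specter2",
    "adhoc_query": "allenai/specter2_adhoc_query",
    "classification": "allenai/specter2_classification",
    "regression": "allenai/specter2_regression",
}


def adapter_for(task_format: str) -> str:
    """Return the adapter identifier for a task format.

    Task formats that are not in the table fall back to the proximity
    adapter, following the table's footnote.
    """
    key = task_format.strip().lower().replace(" ", "_")
    return ADAPTERS.get(key, ADAPTERS["proximity"])


# Example lookups: a listed format and an unlisted one.
classification_adapter = adapter_for("Classification")
fallback_adapter = adapter_for("clustering")
```

The returned string is what would be passed as the first argument to `model.load_adapter(...)` in the usage snippet of the README.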

 ```python
 from transformers import AutoTokenizer, AutoModel
@@ -86,7 +89,7 @@ tokenizer = AutoTokenizer.from_pretrained('allenai/specter2_base')
 model = AutoModel.from_pretrained('allenai/specter2_base')

 #load the adapter(s) as per the required task, provide an identifier for the adapter in load_as argument and activate it
+model.load_adapter("allenai/specter2", source="hf", load_as="proximity", set_active=True)
 #other possibilities: allenai/specter2_<classification|regression|adhoc_query>

 papers = [{'title': 'BERT', 'abstract': 'We introduce a new language representation model called BERT'},