Update README.md
README.md CHANGED
@@ -1,64 +1,19 @@
 ---
-language:
 library_name: sentence-transformers
 tags:
 - sentence-transformers
 - sentence-similarity
 - feature-extraction
-base_model: yano0/my_rope_bert_v2
 metrics:
-- pearson_cosine
-- spearman_cosine
-- pearson_manhattan
-- spearman_manhattan
-- pearson_euclidean
-- spearman_euclidean
-- pearson_dot
-- spearman_dot
-- pearson_max
-- spearman_max
 widget: []
 pipeline_tag: sentence-similarity
-model-index:
-- name: SentenceTransformer based on yano0/my_rope_bert_v2
-  results:
-  - task:
-      type: semantic-similarity
-      name: Semantic Similarity
-    dataset:
-      name: Unknown
-      type: unknown
-    metrics:
-    - type: pearson_cosine
-      value: 0.8363388345473755
-      name: Pearson Cosine
-    - type: spearman_cosine
-      value: 0.7829140815230603
-      name: Spearman Cosine
-    - type: pearson_manhattan
-      value: 0.8169134821588451
-      name: Pearson Manhattan
-    - type: spearman_manhattan
-      value: 0.7806182228552376
-      name: Spearman Manhattan
-    - type: pearson_euclidean
-      value: 0.8176194153920942
-      name: Pearson Euclidean
-    - type: spearman_euclidean
-      value: 0.7812646926795144
-      name: Spearman Euclidean
-    - type: pearson_dot
-      value: 0.790584312051173
-      name: Pearson Dot
-    - type: spearman_dot
-      value: 0.7341313863604967
-      name: Spearman Dot
-    - type: pearson_max
-      value: 0.8363388345473755
-      name: Pearson Max
-    - type: spearman_max
-      value: 0.7829140815230603
-      name: Spearman Max
 ---
 
 # SentenceTransformer based on yano0/my_rope_bert_v2
@@ -66,10 +21,13 @@ model-index:
 This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [yano0/my_rope_bert_v2](https://huggingface.co/yano0/my_rope_bert_v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
 ## Model Details
 
 ### Model Description
 - **Model Type:** Sentence Transformer
-- **Base model:** [yano0/my_rope_bert_v2](https://huggingface.co/yano0/my_rope_bert_v2) <!-- at revision a392086c08b3bf3a9b9030267a8965af0552d7fb -->
 - **Maximum Sequence Length:** 1024 tokens
 - **Output Dimensionality:** 768 tokens
 - **Similarity Function:** Cosine Similarity
@@ -181,41 +139,31 @@ You can finetune this model on your own dataset.
 *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
 -->
 
-## Training Details
 
-### Training Logs
-| Epoch | Step | spearman_cosine |
-|:-----:|:----:|:---------------:|
-| 0     | 0    | 0.7829          |
 
 
-### Framework Versions
-- Python: 3.10.13
-- Sentence Transformers: 3.0.0
-- Transformers: 4.44.0
-- PyTorch: 2.3.1+cu118
-- Accelerate: 0.30.1
-- Datasets: 2.19.2
-- Tokenizers: 0.19.1
 
-
 
-
 
-<!--
-## Glossary
-
-*Clearly define terms in order to be accessible across audiences.*
--->
-
-<!--
-## Model Card Authors
-
-*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
--->
-
-<!--
-## Model Card Contact
 
-
--->
---
language:
- ja
library_name: sentence-transformers
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
metrics:
widget: []
pipeline_tag: sentence-similarity
license: apache-2.0
datasets:
- hpprc/emb
- hpprc/mqa-ja
- google-research-datasets/paws-x
---

# SentenceTransformer based on yano0/my_rope_bert_v2

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [yano0/my_rope_bert_v2](https://huggingface.co/yano0/my_rope_bert_v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details
The model is a 1024-context sentence embedding model based on RoFormer (a transformer variant that uses rotary position embeddings).
It was pre-trained on Wikipedia and cc100, then fine-tuned as a sentence embedding model.
Fine-tuning begins with weakly supervised learning on mc4 and MQA; after that, we apply the same three-stage training process as [GLuCoSE v2](https://huggingface.co/pkshatech/GLuCoSE-base-ja-v2).

### Model Description
- **Model Type:** Sentence Transformer
- **Maximum Sequence Length:** 1024 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
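
A minimal usage sketch. Two assumptions: `"<this-model-id>"` is a placeholder for the Hub repository id this model is published under, and `trust_remote_code=True` is presumed necessary because the base model ships a custom RoPE BERT implementation:

```python
from sentence_transformers import SentenceTransformer

# Placeholder id — replace with the repository id this model is published under.
# trust_remote_code=True is assumed for the custom RoPE BERT modeling code.
model = SentenceTransformer("<this-model-id>", trust_remote_code=True)

sentences = [
    "今日は天気が良い。",      # "The weather is nice today."
    "今日は晴れている。",      # "It is sunny today."
    "明日は雨が降るだろう。",  # "It will probably rain tomorrow."
]

# Encode into 768-dimensional dense vectors (up to 1024 tokens per input).
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 768)

# Pairwise cosine similarities (the model's similarity function).
similarities = model.similarity(embeddings, embeddings)
print(similarities)
```

`encode` and `similarity` are the standard Sentence Transformers 3.x APIs; the same embeddings can feed retrieval, clustering, or classification pipelines.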

## Benchmarks

### Retrieval
Evaluated with [MIRACL-ja](https://huggingface.co/datasets/miracl/miracl), [JQaRA](https://huggingface.co/datasets/hotchpotch/JQaRA) and [MLDR-ja](https://huggingface.co/datasets/Shitao/MLDR).

| model | size | MIRACL<br>Recall@5 | JQaRA<br>nDCG@10 | MLDR<br>nDCG@10 |
|----------|------|--------------------|------------------|-----------------|
| me5-base | 0.3B | 84.2 | 47.2 | 25.4 |
| GLuCoSE | 0.1B | 53.3 | 30.8 | 25.2 |
| RoSEtta | 0.2B | 79.3 | 57.7 | 32.3 |
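
For orientation, here is an illustrative sketch of embedding-based retrieval with this model. The model id is the same placeholder as above and the two-passage corpus is toy data; the scores in the table come from the official benchmark setups, not from this snippet:

```python
from sentence_transformers import SentenceTransformer, util

# Placeholder model id and a toy corpus, for illustration only.
model = SentenceTransformer("<this-model-id>", trust_remote_code=True)

queries = ["日本で一番高い山は?"]  # "What is the highest mountain in Japan?"
corpus = [
    "富士山は日本で最も高い山である。",  # "Mt. Fuji is the highest mountain in Japan." (relevant)
    "琵琶湖は日本最大の湖である。",      # "Lake Biwa is the largest lake in Japan." (not relevant)
]

query_emb = model.encode(queries)
corpus_emb = model.encode(corpus)

# Rank passages by cosine similarity; metrics such as Recall@5 or nDCG@10 are
# then computed from the ranked ids against the gold relevance judgments.
hits = util.semantic_search(query_emb, corpus_emb, top_k=5)
for hit in hits[0]:
    print(corpus[hit["corpus_id"]], hit["score"])
```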

### JMTEB
Evaluated with [JMTEB](https://github.com/sbintuitions/JMTEB).
* The time-consuming tasks ['amazon_review_classification', 'mrtydi', 'jaqket', 'esci'] were excluded from the evaluation.
* The reported average is the macro-average over the task categories (see the worked check after the table).

| model | size | Class. | Ret. | STS. | Clus. | Pair. | Avg. |
|----------|------|--------|------|------|-------|-------|------|
| me5-base | 0.3B | 75.1 | 80.6 | 80.5 | 52.6 | 62.4 | 70.2 |
| GLuCoSE | 0.1B | 82.6 | 69.8 | 78.2 | 51.5 | 66.2 | 69.7 |
| RoSEtta | 0.2B | 79.0 | 84.3 | 81.4 | 53.2 | 61.7 | 71.9 |
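
As a quick check on the macro-averaging, RoSEtta's Avg. is the unweighted mean of its five category scores: (79.0 + 84.3 + 81.4 + 53.2 + 61.7) / 5 ≈ 71.9.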

## Authors
Chihiro Yano, Go Mocho, Hideyuki Tachibana, Hiroto Takegawa, Yotaro Watanabe

## License
This model is published under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).