---
license: mit
language:
- zh
pipeline_tag: sentence-similarity
---


# PromCSE(sup)

A supervised PromCSE model for Chinese sentence similarity.
## Data List
The following datasets are all in Chinese; a loading sketch follows the table.
|          Data          | Size (train) | Size (valid) | Size (test) |
|:----------------------:|:------------:|:------------:|:-----------:|
|   [ATEC](https://link.zhihu.com/?target=https%3A//pan.baidu.com/s/1gmnyz9emqOXwaHhSM9CCUA%3Fpwd%3Db17c)   |  62477|  20000|  20000|
|   [BQ](https://link.zhihu.com/?target=https%3A//pan.baidu.com/s/1M-e01yyy5NacVPrph9fbaQ%3Fpwd%3Dtis9)     | 100000|  10000|  10000|
|   [LCQMC](https://pan.baidu.com/s/16DfE7fHrCkk4e8a2j3SYUg?pwd=bc8w )                                      | 238766|   8802|  12500|
|   [PAWSX](https://link.zhihu.com/?target=https%3A//pan.baidu.com/s/1ox0tJY3ZNbevHDeAqDBOPQ%3Fpwd%3Dmgjn)  |  49401|   2000|   2000|
|   [STS-B](https://link.zhihu.com/?target=https%3A//pan.baidu.com/s/10yfKfTtcmLQ70-jzHIln1A%3Fpwd%3Dgf8y)  |   5231|   1458|   1361|
|   [*SNLI*](https://link.zhihu.com/?target=https%3A//pan.baidu.com/s/1NOgA7JwWghiauwGAUvcm7w%3Fpwd%3Ds75v)   | 146828|   2699|   2618|
|   [*MNLI*](https://link.zhihu.com/?target=https%3A//pan.baidu.com/s/1xjZKtWk3MAbJ6HX4pvXJ-A%3Fpwd%3D2kte)   | 122547|   2932|   2397|
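
The downloads above are plain-text pair files. A minimal loading sketch, assuming the common tab-separated `sentence1<TAB>sentence2<TAB>label` layout used by LCQMC-style releases (the file name `LCQMC.train.data` is a hypothetical placeholder; adjust both to the actual files):
```python
import csv

def load_pairs(path):
    """Read tab-separated (sentence1, sentence2, label) rows."""
    with open(path, encoding="utf-8") as f:
        return [(s1, s2, int(label))
                for s1, s2, label in csv.reader(f, delimiter="\t")]

train = load_pairs("LCQMC.train.data")  # hypothetical file name
print(len(train), train[0])
```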




## Model List
All evaluation data is in Chinese, and every method uses the same backbone language model, **RoBERTa base**. Because some test sets are small enough that test-only scores would vary widely, evaluation here runs on the train, valid, and test splits together, and the final score is their **weighted average (w-avg)**; a sketch of this computation follows the table.
|          Model          | STS-B (w-avg) | ATEC | BQ | LCQMC | PAWSX | Avg. |
|:-----------------------:|:-------------:|:----:|:--:|:-----:|:-----:|:----:|
|  BERT-Whitening  |  65.27| -| -| -| -| -|
|  SimBERT   |  70.01| -| -| -| -| -|
|  SBERT-Whitening  |  71.75| -| -| -| -| -|
|  [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh)  |  78.61| -| -| -| -| -|
|  [hellonlp/simcse-base-zh](https://huggingface.co/hellonlp/simcse-roberta-base-zh)  |  80.96| -| -| -| -| -|
|  [hellonlp/promcse-base-zh](https://huggingface.co/hellonlp/promcse-bert-base-zh)  |  **81.57**| -| -| -| -| -|
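
A minimal sketch of the **w-avg** computation described above, assuming each split's score is weighted by its size (the split sizes are STS-B's from the Data List; the per-split scores are hypothetical placeholders):
```python
def weighted_average(scores, sizes):
    """Size-weighted mean of per-split evaluation scores."""
    return sum(score * size for score, size in zip(scores, sizes)) / sum(sizes)

# STS-B split sizes (train, valid, test) from the Data List above.
sizes = [5231, 1458, 1361]
scores = [81.9, 81.2, 80.5]  # hypothetical per-split scores (x100)
print(round(weighted_average(scores, sizes), 2))  # -> 81.54
```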




## Uses
To use the model, first install the `promcse` package from [PyPI](https://pypi.org/project/promcse/):
```bash
pip install promcse
```

After installing the package, you can load the model with two lines of code:
```python
from promcse import PromCSE
# Arguments (per this card's example): model name, pooler type ("cls"),
# and the prompt length (PromCSE's pre_seq_len).
model = PromCSE("hellonlp/promcse-bert-base-zh", "cls", 10)
```

Then you can use the model to encode sentences into embeddings:
```python
# "Wuhan is a beautiful city."
embeddings = model.encode("武汉是一个美丽的城市。")
print(embeddings.shape)
# torch.Size([768])
```
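
Since `encode` returns a plain `torch` tensor, you can also score a pair yourself. A minimal sketch using `torch.nn.functional.cosine_similarity` (the two example sentences are taken from the demo below):
```python
import torch.nn.functional as F

emb_a = model.encode("你好吗")    # "How are you?"
emb_b = model.encode("你还好吗")  # "Are you doing OK?"

# Add a batch dimension so cosine_similarity compares the two 768-d vectors.
score = F.cosine_similarity(emb_a.unsqueeze(0), emb_b.unsqueeze(0)).item()
print(round(score, 4))  # should roughly match the 0.8431 reported below
```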

Compute the cosine similarities between two groups of sentences:
```python
sentences_a = ['你好吗']  # "How are you?"
# A mix of paraphrases of '你好吗' and unrelated sentences (e.g. "I ate an apple").
sentences_b = ['你怎么样','我吃了一个苹果','你过的好吗','你还好吗','你',
               '你好不好','你好不好呢','我不开心','我好开心啊', '你吃饭了吗',
               '你好吗','你现在好吗','你好个鬼']
similarities = model.similarity(sentences_a, sentences_b)
print(similarities)
# [(1.0, '你好吗'),
#  (0.9167, '你好不好'),
#  (0.8956, '你好不好呢'),
#  (0.8431, '你还好吗'),
#  (0.7919, '你怎么样'),
#  (0.7649, '你现在好吗'),
#  (0.7458, '你过的好吗'),
#  (0.6844, '你好个鬼'),
#  (0.6177, '你'),
#  (0.5654, '你吃饭了吗'),
#  (0.3612, '我好开心啊'),
#  (0.1875, '我不开心'),
#  (0.0866, '我吃了一个苹果')]
```
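
Judging from the output above, `model.similarity` returns `(score, sentence)` pairs already sorted by score, so retrieving the best matches for a query is just a slice:
```python
# Top-3 most similar sentences to '你好吗' ("How are you?").
for score, sentence in similarities[:3]:
    print(f"{score:.4f}  {sentence}")
# 1.0000  你好吗
# 0.9167  你好不好
# 0.8956  你好不好呢
```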