license: mit
language:
  - zh
pipeline_tag: sentence-similarity

PromCSE(sup)

Data List

The following datasets are all in Chinese.

| Dataset | Train size | Valid size | Test size |
|---|---|---|---|
| ATEC | 62477 | 20000 | 20000 |
| BQ | 100000 | 10000 | 10000 |
| LCQMC | 238766 | 8802 | 12500 |
| PAWSX | 49401 | 2000 | 2000 |
| STS-B | 5231 | 1458 | 1361 |
| SNLI | 146828 | 2699 | 2618 |
| MNLI | 122547 | 2932 | 2397 |

Model List

All evaluation datasets are in Chinese, and every method is evaluated with the same backbone language model, RoBERTa Large. Because the test sets of some datasets are small, which can cause large variance in the measured accuracy, the evaluation uses the train, valid, and test splits together, and the final score is their weighted average (w-avg).

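| Model | STS-B (w-avg) | ATEC | BQ | LCQMC | PAWSX | Avg. |
|---|---|---|---|---|---|---|
| BAAI/bge-large-zh | 78.61 | - | - | - | - | - |
| BAAI/bge-large-zh-v1.5 | 79.07 | - | - | - | - | - |
| hellonlp/simcse-large-zh | 81.32 | - | - | - | - | - |
| hellonlp/promcse-large-zh | xx | - | - | - | - | - |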
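
The w-avg score combines the per-split results into a single number. The sketch below shows one plausible way to compute it, assuming each split's score is weighted by its number of examples; this card does not state the exact weighting, so treat that as an assumption. The function name `weighted_average` and the per-split scores are hypothetical placeholders, not reported results; only the STS-B split sizes come from the Data List.

```python
from typing import Sequence


def weighted_average(scores: Sequence[float], sizes: Sequence[int]) -> float:
    """Average per-split scores, weighting each split by its example count.

    Assumption: the weight of a split is its size (not stated in the card).
    """
    if len(scores) != len(sizes):
        raise ValueError("scores and sizes must have the same length")
    total = sum(sizes)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    return sum(score * size for score, size in zip(scores, sizes)) / total


# STS-B split sizes from the Data List (train, valid, test).
stsb_sizes = [5231, 1458, 1361]

# Hypothetical per-split scores, for illustration only.
example_scores = [80.0, 82.0, 81.0]

w_avg = weighted_average(example_scores, stsb_sizes)
print(f"STS-B w-avg (illustrative): {w_avg:.2f}")
```

Weighting by split size makes the result equal to scoring the pooled data when the metric is a per-example mean; for a correlation-based metric the two can differ, so the pooled score would need to be computed directly.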