---
license: apache-2.0
language:
- zh
tags:
- bert
- feature-extraction
- text2vec
datasets:
- shibing624/nli_zh
pipeline_tag: sentence-similarity

---
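Introduction:
See https://github.com/shibing624/text2vec
This model follows the CoSENT architecture, uses hfl/chinese-roberta-wwm-ext as the base model, and was fine-tuned again on the Chinese STS-B dataset, with max_seq_length extended from the original 128 to 512.
eval_spearman: 0.833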

---
Downstream tasks:
The model can be loaded with either the text2vec library or the sentence-transformers library.
Text embedding:
```
>>> from text2vec import SentenceModel, EncoderType
>>> model = SentenceModel('EricLee/text2vec-roberta-512', encoder_type=EncoderType.FIRST_LAST_AVG, max_seq_length=512)
>>> embedding = model.encode("今天天气不错啊")
>>> embedding.shape
(768,)
```
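Since the model is tagged for sentence similarity, embeddings are usually compared with cosine similarity. The following is a minimal sketch of that comparison, not part of text2vec itself: `cosine_similarity` is a hypothetical helper written for illustration, and the short toy vectors stand in for the 768-dimensional embeddings so the snippet runs without downloading the model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Dot product and Euclidean norms, computed element by element.
    dot = sum(float(x) * float(y) for x, y in zip(a, b))
    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real 768-dimensional embeddings.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

With a model loaded as in the example above, passing the results of two `model.encode(...)` calls to this helper should give a score in [-1, 1], where higher values indicate more similar sentences.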