Update README.md

README.md CHANGED

````diff
@@ -16,24 +16,6 @@ library_name: sentence-transformers
 
 # sup-simcse-ja-base
 
-## Model Summary
-
-- Fine-tuning method: Supervised SimCSE
-- Base model: [cl-tohoku/bert-base-japanese-v3](https://huggingface.co/cl-tohoku/bert-base-japanese-v3)
-- Training dataset: [JSNLI](https://nlp.ist.i.kyoto-u.ac.jp/?%E6%97%A5%E6%9C%AC%E8%AA%9ESNLI%28JSNLI%29%E3%83%87%E3%83%BC%E3%82%BF%E3%82%BB%E3%83%83%E3%83%88)
-- Pooling strategy: cls (with an extra MLP layer only during training)
-- Hidden size: 768
-- Learning rate: 5e-5
-- Batch size: 512
-- Temperature: 0.05
-- Max sequence length: 64
-- Number of training examples: 2^20
-- Validation interval (steps): 2^6
-- Warmup ratio: 0.1
-- Dtype: BFloat16
-
-See the [GitHub repository](https://github.com/hppRC/simple-simcse-ja) for a detailed experimental setup.
-
 
 ## Usage (Sentence-Transformers)
 
````
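As a quick sanity check on the hyperparameters in the summary being moved, the listed numbers combine into the following training schedule. This is plain arithmetic over the bullet values; the variable names are illustrative, not identifiers from the simple-simcse-ja codebase:

```python
# Derived from the Model Summary bullets; names are illustrative only.
num_examples = 2 ** 20       # "Number of training examples: 2^20"
batch_size = 512             # "Batch size: 512"
warmup_ratio = 0.1           # "Warmup ratio: 0.1"

total_steps = num_examples // batch_size        # 2048 optimizer steps
warmup_steps = int(total_steps * warmup_ratio)  # 204 warmup steps
validation_interval = 2 ** 6                    # validate every 64 steps

print(total_steps, warmup_steps, validation_interval)
```

So the run is a single pass of 2048 steps, with roughly the first 204 spent on learning-rate warmup and a validation pass every 64 steps.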
````diff
@@ -54,6 +36,23 @@ embeddings = model.encode(sentences)
 print(embeddings)
 ```
 
+## Model Summary
+
+- Fine-tuning method: Supervised SimCSE
+- Base model: [cl-tohoku/bert-base-japanese-v3](https://huggingface.co/cl-tohoku/bert-base-japanese-v3)
+- Training dataset: [JSNLI](https://nlp.ist.i.kyoto-u.ac.jp/?%E6%97%A5%E6%9C%AC%E8%AA%9ESNLI%28JSNLI%29%E3%83%87%E3%83%BC%E3%82%BF%E3%82%BB%E3%83%83%E3%83%88)
+- Pooling strategy: cls (with an extra MLP layer only during training)
+- Hidden size: 768
+- Learning rate: 5e-5
+- Batch size: 512
+- Temperature: 0.05
+- Max sequence length: 64
+- Number of training examples: 2^20
+- Validation interval (steps): 2^6
+- Warmup ratio: 0.1
+- Dtype: BFloat16
+
+See the [GitHub repository](https://github.com/hppRC/simple-simcse-ja) for a detailed experimental setup.
 
 
 ## Usage (HuggingFace Transformers)
````
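For context on the "Supervised SimCSE" and "Temperature: 0.05" entries in the moved summary, here is a minimal NumPy sketch of the supervised contrastive objective. This is an illustration, not code from the simple-simcse-ja repository; the function name and array shapes are assumptions:

```python
import numpy as np

def sup_simcse_loss(anchors, positives, negatives, temperature=0.05):
    """Illustrative supervised SimCSE loss (not the repository's code).

    anchors, positives, negatives: (batch, hidden) sentence embeddings,
    e.g. premise / entailment-hypothesis / contradiction-hypothesis
    encodings for an NLI dataset such as JSNLI.
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    a, p, n = normalize(anchors), normalize(positives), normalize(negatives)
    # Each anchor is scored against every positive and every hard negative
    # in the batch; cosine similarities are scaled by the temperature.
    sims = np.concatenate([a @ p.T, a @ n.T], axis=1) / temperature
    # Cross-entropy where the correct "class" for anchor i is column i
    # (its own positive); all other columns act as in-batch negatives.
    logits = sims - sims.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    batch = anchors.shape[0]
    return -log_probs[np.arange(batch), np.arange(batch)].mean()
```

With the batch size of 512 listed above, each anchor would be compared against 1023 other embeddings per step: the other 511 positives plus all 512 contradiction negatives.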