hpprc committed
Commit e005b06
1 Parent(s): fe06738

Update README.md

Files changed (1)
  1. README.md +18 -19
README.md CHANGED
@@ -36,25 +36,6 @@ embeddings = model.encode(sentences)
  print(embeddings)
  ```

- ## Model Summary
-
- - Fine-tuning method: Supervised SimCSE
- - Base model: [cl-tohoku/bert-base-japanese-v3](https://huggingface.co/cl-tohoku/bert-base-japanese-v3)
- - Training dataset: [JSNLI](https://nlp.ist.i.kyoto-u.ac.jp/?%E6%97%A5%E6%9C%AC%E8%AA%9ESNLI%28JSNLI%29%E3%83%87%E3%83%BC%E3%82%BF%E3%82%BB%E3%83%83%E3%83%88)
- - Pooling strategy: cls (with an extra MLP layer only during training)
- - Hidden size: 768
- - Learning rate: 5e-5
- - Batch size: 512
- - Temperature: 0.05
- - Max sequence length: 64
- - Number of training examples: 2^20
- - Validation interval (steps): 2^6
- - Warmup ratio: 0.1
- - Dtype: BFloat16
-
- See the [GitHub repository](https://github.com/hppRC/simple-simcse-ja) for a detailed experimental setup.
-
-
  ## Usage (HuggingFace Transformers)
  Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word embeddings.

 
@@ -96,6 +77,24 @@ SentenceTransformer(
  )
  ```

+ ## Model Summary
+
+ - Fine-tuning method: Supervised SimCSE
+ - Base model: [cl-tohoku/bert-base-japanese-v3](https://huggingface.co/cl-tohoku/bert-base-japanese-v3)
+ - Training dataset: [JSNLI](https://nlp.ist.i.kyoto-u.ac.jp/?%E6%97%A5%E6%9C%AC%E8%AA%9ESNLI%28JSNLI%29%E3%83%87%E3%83%BC%E3%82%BF%E3%82%BB%E3%83%83%E3%83%88)
+ - Pooling strategy: cls (with an extra MLP layer only during training)
+ - Hidden size: 768
+ - Learning rate: 5e-5
+ - Batch size: 512
+ - Temperature: 0.05
+ - Max sequence length: 64
+ - Number of training examples: 2^20
+ - Validation interval (steps): 2^6
+ - Warmup ratio: 0.1
+ - Dtype: BFloat16
+
+ See the [GitHub repository](https://github.com/hppRC/simple-simcse-ja) for a detailed experimental setup.
+
  ## Citing & Authors

  ```
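
The hunks above include only the intro sentence of the "Usage (HuggingFace Transformers)" section; the snippet itself sits in the collapsed context. As a rough illustration of the flow that sentence describes, here is a minimal sketch assuming the standard transformers API and the CLS pooling listed in the Model Summary; the model id is a placeholder, not this repository's actual id, and the snippet is not the README's own code.

```python
# Minimal sketch: encode sentences with the base transformer, then apply
# CLS pooling. The Model Summary lists "cls" pooling, with the extra MLP
# head used only during training, so inference takes the raw [CLS] state.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "your-model-id"  # placeholder: substitute this repository's id
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

sentences = ["This is an example sentence.", "Each sentence is converted."]
# max_length=64 mirrors the "Max sequence length" in the Model Summary.
batch = tokenizer(sentences, padding=True, truncation=True,
                  max_length=64, return_tensors="pt")

with torch.no_grad():
    outputs = model(**batch)

# CLS pooling: the hidden state of the first token of each sequence.
embeddings = outputs.last_hidden_state[:, 0]
print(embeddings.shape)  # e.g. torch.Size([2, 768]), given hidden size 768
```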
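Likewise, the Model Summary's hyperparameters (supervised SimCSE fine-tuning, temperature 0.05) imply an InfoNCE-style contrastive loss over in-batch negatives. The sketch below illustrates that objective under the usual supervised-SimCSE assumption of entailment positives and contradiction hard negatives drawn from the NLI pairs; it is an illustration, not code from the linked repository.

```python
# Illustrative supervised-SimCSE loss: each anchor's positive is the
# matching entailment hypothesis; all other positives and all contradiction
# hypotheses in the batch serve as negatives.
import torch
import torch.nn.functional as F

def sup_simcse_loss(anchor, positive, negative, temperature=0.05):
    """anchor/positive/negative: (batch, hidden) sentence embeddings,
    e.g. premise / entailment hypothesis / contradiction hypothesis."""
    # Cosine similarity of every anchor against every positive/negative.
    sim_pos = F.cosine_similarity(anchor.unsqueeze(1), positive.unsqueeze(0), dim=-1)
    sim_neg = F.cosine_similarity(anchor.unsqueeze(1), negative.unsqueeze(0), dim=-1)
    logits = torch.cat([sim_pos, sim_neg], dim=1) / temperature  # (batch, 2*batch)
    # The diagonal of sim_pos holds each anchor's own positive pair.
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)
```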