Update README.md
README.md
CHANGED
@@ -16,7 +16,7 @@ language:
 
 This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
 
-The base model is [studio-ousia/luke-japanese-base-lite](studio-ousia/luke-japanese-base-lite) and was trained
+The base model is [studio-ousia/luke-japanese-base-lite](https://huggingface.co/studio-ousia/luke-japanese-base-lite) and was trained 1 epoch with [shunk031/jsnli](https://huggingface.co/datasets/shunk031/jsnli).
 
 ## Usage (Sentence-Transformers)
 
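The usage snippet itself falls outside this diff's context, but for orientation, typical sentence-transformers usage of such a model looks like the minimal sketch below. The Hub model ID is a placeholder, not taken from this diff.

```python
from sentence_transformers import SentenceTransformer

# Placeholder ID for illustration only; substitute this model's actual Hub repository ID.
model = SentenceTransformer("<this-model-hub-id>")

# Encode Japanese sentences into 768-dimensional dense vectors.
sentences = ["猫がソファで寝ている。", "ソファの上で猫が眠っている。"]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768)
```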
@@ -84,47 +84,4 @@ The results of the evaluation by JSTS and JSICK are available [here](https://git
 
 ## Training
 
 Training scripts are available in [this repository](https://github.com/oshizo/JapaneseEmbeddingTrain).
-This model was trained 1 epoch on Google Colab Pro A100 and took approximately
+This model was trained 1 epoch on Google Colab Pro A100 and took approximately 40 minutes.
-
-The model was trained with the parameters:
-
-**DataLoader**:
-
-`sentence_transformers.datasets.NoDuplicatesDataLoader.NoDuplicatesDataLoader` of length 2304 with parameters:
-```
-{'batch_size': 128}
-```
-
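For context on the block removed above: NoDuplicatesDataLoader batches InputExample items while guaranteeing that no text appears twice in the same batch, which matters when the rest of the batch supplies negatives. A minimal sketch, with made-up JSNLI-style pairs standing in for the real training set:

```python
from sentence_transformers import InputExample
from sentence_transformers.datasets import NoDuplicatesDataLoader

# Hypothetical (premise, entailed hypothesis) pairs; the real run used the
# full jsnli training data, which at batch_size=128 gives the 2304 batches above.
train_examples = [
    InputExample(texts=["男性がギターを弾いている。", "人が楽器を演奏している。"]),
    InputExample(texts=["子供が公園で走っている。", "子供が屋外にいる。"]),
]

# Two toy examples cannot actually fill a 128-item batch; shown for shape only.
train_dataloader = NoDuplicatesDataLoader(train_examples, batch_size=128)
```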
-**Loss**:
-
-`sentence_transformers.losses.MultipleNegativesRankingLoss.MultipleNegativesRankingLoss` with parameters:
-```
-{'scale': 20.0, 'similarity_fct': 'cos_sim'}
-```
-
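The removed loss entry is sentence-transformers' MultipleNegativesRankingLoss: each (anchor, positive) pair is scored against every other positive in the batch as a negative, with cross-entropy over cosine similarities multiplied by scale=20.0 (these are also the library defaults). A sketch of that configuration:

```python
from sentence_transformers import SentenceTransformer, losses, util

# Start from the base checkpoint; sentence-transformers adds mean pooling
# automatically when handed a plain Hugging Face model.
model = SentenceTransformer("studio-ousia/luke-japanese-base-lite")

# In-batch negatives, softmax cross-entropy over cosine similarities scaled by 20.
train_loss = losses.MultipleNegativesRankingLoss(
    model, scale=20.0, similarity_fct=util.cos_sim
)
```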
-Parameters of the fit()-Method:
-```
-{
-    "epochs": 1,
-    "evaluation_steps": 230,
-    "evaluator": "sentence_transformers.evaluation.EmbeddingSimilarityEvaluator.EmbeddingSimilarityEvaluator",
-    "max_grad_norm": 1,
-    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
-    "optimizer_params": {
-        "lr": 2e-05
-    },
-    "scheduler": "WarmupLinear",
-    "steps_per_epoch": null,
-    "warmup_steps": 231,
-    "weight_decay": 0.01
-}
-```
-
-
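Continuing the sketches above (reusing model, train_dataloader, and train_loss), the removed fit() parameters map onto roughly the following call; the evaluator's dev sentences and gold scores are hypothetical stand-ins, while the card's actual evaluation uses JSTS and JSICK:

```python
import torch
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

# Hypothetical dev pair with a gold similarity score in [0, 1].
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["男性がギターを弾いている。"],
    sentences2=["人がギターを演奏している。"],
    scores=[0.9],
)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    evaluator=evaluator,
    epochs=1,
    evaluation_steps=230,
    scheduler="WarmupLinear",
    warmup_steps=231,  # roughly 10% of the 2304 training steps
    optimizer_class=torch.optim.AdamW,
    optimizer_params={"lr": 2e-05},
    weight_decay=0.01,
    max_grad_norm=1,
)
```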
-## Full Model Architecture
-```
-SentenceTransformer(
-  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: LukeModel
-  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
-)
-```
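The removed architecture dump corresponds to the two standard sentence-transformers modules: a Transformer wrapper around the LUKE encoder that truncates inputs at 128 tokens, followed by mean pooling into a 768-dimensional sentence vector. Rebuilding it explicitly would look roughly like this:

```python
from sentence_transformers import SentenceTransformer, models

# Module (0): the LUKE encoder, inputs truncated to 128 tokens.
word_embedding_model = models.Transformer(
    "studio-ousia/luke-japanese-base-lite",
    max_seq_length=128,
    do_lower_case=False,
)

# Module (1): mean pooling over token embeddings, one 768-dim vector per sentence.
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 768
    pooling_mode_mean_tokens=True,
)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
print(model)  # should mirror the architecture printed above
```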