Update README.md
README.md
CHANGED
@@ -16,7 +16,7 @@ language:
 
 This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
 
-The base model is [studio-ousia/luke-japanese-base-lite](studio-ousia/luke-japanese-base-lite) and was trained
+The base model is [studio-ousia/luke-japanese-base-lite](https://huggingface.co/studio-ousia/luke-japanese-base-lite) and was trained 1 epoch with [shunk031/jsnli](https://huggingface.co/datasets/shunk031/jsnli).
 
 ## Usage (Sentence-Transformers)
 
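The usage snippet itself falls outside this diff's context, but for orientation, typical sentence-transformers usage of such a model looks like the minimal sketch below. The Hub model ID is a placeholder, not taken from this diff.

```python
from sentence_transformers import SentenceTransformer

# Placeholder ID for illustration only; substitute this model's actual Hub repository ID.
model = SentenceTransformer("<this-model-hub-id>")

# Encode Japanese sentences into 768-dimensional dense vectors.
sentences = ["猫がソファで寝ている。", "ソファの上で猫が眠っている。"]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768)
```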
@@ -84,47 +84,4 @@ The results of the evaluation by JSTS and JSICK are available [here](https://git
 
 ## Training
 
 Training scripts are available in [this repository](https://github.com/oshizo/JapaneseEmbeddingTrain).
-This model was trained 1 epoch on Google Colab Pro A100 and took approximately
+This model was trained 1 epoch on Google Colab Pro A100 and took approximately 40 minutes.
-
-The model was trained with the parameters:
-
-**DataLoader**:
-
-`sentence_transformers.datasets.NoDuplicatesDataLoader.NoDuplicatesDataLoader` of length 2304 with parameters:
-```
-{'batch_size': 128}
-```
-
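For context on the block removed above: NoDuplicatesDataLoader batches InputExample items while guaranteeing that no text appears twice in the same batch, which matters when the rest of the batch supplies negatives. A minimal sketch, with made-up JSNLI-style pairs standing in for the real training set:

```python
from sentence_transformers import InputExample
from sentence_transformers.datasets import NoDuplicatesDataLoader

# Hypothetical (premise, entailed hypothesis) pairs; the real run used the
# full jsnli training data, which at batch_size=128 gives the 2304 batches above.
train_examples = [
    InputExample(texts=["男性がギターを弾いている。", "人が楽器を演奏している。"]),
    InputExample(texts=["子供が公園で走っている。", "子供が屋外にいる。"]),
]

# Two toy examples cannot actually fill a 128-item batch; shown for shape only.
train_dataloader = NoDuplicatesDataLoader(train_examples, batch_size=128)
```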
-**Loss**:
-
-`sentence_transformers.losses.MultipleNegativesRankingLoss.MultipleNegativesRankingLoss` with parameters:
-```
-{'scale': 20.0, 'similarity_fct': 'cos_sim'}
-```
-
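The removed loss entry is sentence-transformers' MultipleNegativesRankingLoss: each (anchor, positive) pair is scored against every other positive in the batch as a negative, with cross-entropy over cosine similarities multiplied by scale=20.0 (these are also the library defaults). A sketch of that configuration:

```python
from sentence_transformers import SentenceTransformer, losses, util

# Start from the base checkpoint; sentence-transformers adds mean pooling
# automatically when handed a plain Hugging Face model.
model = SentenceTransformer("studio-ousia/luke-japanese-base-lite")

# In-batch negatives, softmax cross-entropy over cosine similarities scaled by 20.
train_loss = losses.MultipleNegativesRankingLoss(
    model, scale=20.0, similarity_fct=util.cos_sim
)
```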
-Parameters of the fit()-Method:
-```
-{
-    "epochs": 1,
-    "evaluation_steps": 230,
-    "evaluator": "sentence_transformers.evaluation.EmbeddingSimilarityEvaluator.EmbeddingSimilarityEvaluator",
-    "max_grad_norm": 1,
-    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
-    "optimizer_params": {
-        "lr": 2e-05
-    },
-    "scheduler": "WarmupLinear",
-    "steps_per_epoch": null,
-    "warmup_steps": 231,
-    "weight_decay": 0.01
-}
-```
-
-
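Continuing the sketches above (reusing model, train_dataloader, and train_loss), the removed fit() parameters map onto roughly the following call; the evaluator's dev sentences and gold scores are hypothetical stand-ins, while the card's actual evaluation uses JSTS and JSICK:

```python
import torch
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

# Hypothetical dev pair with a gold similarity score in [0, 1].
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["男性がギターを弾いている。"],
    sentences2=["人がギターを演奏している。"],
    scores=[0.9],
)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    evaluator=evaluator,
    epochs=1,
    evaluation_steps=230,
    scheduler="WarmupLinear",
    warmup_steps=231,  # roughly 10% of the 2304 training steps
    optimizer_class=torch.optim.AdamW,
    optimizer_params={"lr": 2e-05},
    weight_decay=0.01,
    max_grad_norm=1,
)
```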
-## Full Model Architecture
-```
-SentenceTransformer(
-  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: LukeModel
-  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
-)
-```
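The removed architecture dump corresponds to the two standard sentence-transformers modules: a Transformer wrapper around the LUKE encoder that truncates inputs at 128 tokens, followed by mean pooling into a 768-dimensional sentence vector. Rebuilding it explicitly would look roughly like this:

```python
from sentence_transformers import SentenceTransformer, models

# Module (0): the LUKE encoder, inputs truncated to 128 tokens.
word_embedding_model = models.Transformer(
    "studio-ousia/luke-japanese-base-lite",
    max_seq_length=128,
    do_lower_case=False,
)

# Module (1): mean pooling over token embeddings, one 768-dim vector per sentence.
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 768
    pooling_mode_mean_tokens=True,
)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
print(model)  # should mirror the architecture printed above
```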