Sentence Similarity · sentence-transformers · PyTorch · Transformers · Japanese · luke · feature-extraction
akiFQC committed on
Commit 008d110
1 Parent(s): 6040316

small update (readme)

Files changed (2)
  1. README.md +2 -0
  2. README_JA.md +11 -0
README.md CHANGED

```diff
@@ -22,6 +22,8 @@ datasets:
 
 # GLuCoSE (General Luke-based COntrastive Sentence Embedding)-base-Japanese
 
+[日本語のREADME/Japanese README](https://huggingface.co/pkshatech/GLuCoSE-base-ja)
+
 GLuCoSE (General LUke-based COntrastive Sentence Embedding, "glucose") is a Japanese text embedding model based on [LUKE](https://github.com/studio-ousia/luke). In order to create a general-purpose, user-friendly Japanese text embedding model, GLuCoSE has been trained on a mix of web data and various datasets associated with natural language inference and search. This model is not only suitable for sentence vector similarity tasks but also for semantic search tasks.
 - Maximum token count: 512
 - Output dimension: 768
```
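
The README above presents GLuCoSE as a sentence-embedding model for similarity tasks. Below is a minimal sketch of that use case, assuming the model loads through the sentence-transformers library listed in the page tags; the model id is taken from the README link above, and the example sentences are illustrative placeholders, not from the model card.

```python
from sentence_transformers import SentenceTransformer, util

# Load GLuCoSE via sentence-transformers (the library named in the page tags).
# The model id matches the README link above.
model = SentenceTransformer("pkshatech/GLuCoSE-base-ja")

# Illustrative placeholder sentences ("The weather is nice today." /
# "Today's weather is clear."); any Japanese text up to 512 tokens works.
sentences = [
    "今日は良い天気です。",
    "今日の天気は快晴です。",
]

# encode() returns one 768-dimensional vector per sentence
# (the output dimension stated in the README).
embeddings = model.encode(sentences, convert_to_tensor=True)

# Cosine similarity between the two sentence vectors.
print(util.cos_sim(embeddings[0], embeddings[1]))
```

Since encode() returns plain vectors, any cosine-similarity routine would do; util.cos_sim is simply the library's convenience helper.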
README_JA.md CHANGED

```diff
@@ -8,10 +8,21 @@ tags:
 - feature-extraction
 - sentence-transformers
 inference: false
+datasets:
+- mc4
+- clips/mqa
+- shunk031/JGLUE
+- paws-x
+- hpprc/janli
+- MoritzLaurer/multilingual-NLI-26lang-2mil7
+- castorini/mr-tydi
+- hpprc/jsick
 ---
 
 # GLuCoSE (General Luke-based COntrastive Sentence Embedding)-base-Japanese
 
+[English README/英語のREADME](https://huggingface.co/pkshatech/GLuCoSE-base-ja)
+
 GLuCoSE (General LUke-based COntrastive Sentence Embedding, "glucose") is a Japanese text embedding model based on [LUKE](https://github.com/studio-ousia/luke). Aiming to be a general-purpose, easy-to-use text embedding model, it has been trained on a mix of web data and multiple datasets for natural language inference and retrieval. It is suitable not only for sentence-vector similarity tasks but also for semantic search tasks.
 - Maximum token count: 512
 - Output dimension: 768
```
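
Both READMEs also call out semantic search. The following is a hedged sketch of that workflow using sentence-transformers' semantic_search utility; the corpus and query strings are invented for the example and are not part of the model card.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("pkshatech/GLuCoSE-base-ja")

# Illustrative placeholder corpus: a sentence about GLuCoSE, one about Tokyo,
# and one about LUKE.
corpus = [
    "GLuCoSEはLUKEをベースにした日本語のテキスト埋め込みモデルです。",
    "東京は日本の首都です。",
    "LUKEは知識拡張型のTransformerモデルです。",
]
query = "日本語の文埋め込みモデル"  # "Japanese sentence embedding model"

# Embed the corpus once, then rank it against the query by cosine similarity.
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:  # results for the first (and only) query
    print(f"{hit['score']:.3f}  {corpus[hit['corpus_id']]}")
```

For a real search application, corpus embeddings would typically be computed once and cached or indexed, since only the query needs to be embedded at request time.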