bwang0911 committed on
Commit
95f65b5
1 Parent(s): f5bccc2

Update README.md

Files changed (1)
  1. README.md +16 -2
README.md CHANGED
@@ -15,9 +15,23 @@ license: apache-2.0
  
  The text embedding suite trained by [Jina AI](https://github.com/jina-ai), [Finetuner team](https://github.com/jina-ai/finetuner).
  
- ## Intended Usage
+ ## Intended Usage & Model Info
  
- ## Model Info
+ `jina-embedding-s-en-v1` is a language model that has been trained using Jina AI's Linnaeus-Clean dataset.
+ This dataset consists of 380 million sentence pairs, including query-document pairs.
+ These pairs were obtained from various domains and were carefully selected through a thorough cleaning process.
+ The Linnaeus-Full dataset, from which the Linnaeus-Clean dataset is derived, originally contained 1.6 billion sentence pairs.
+
+ The model has a range of use cases, including information retrieval, semantic textual similarity, text reranking, and more.
+
+ With a compact size of just 35 million parameters,
+ the model enables lightning-fast inference while still delivering impressive performance.
+ Additionally, we provide the following options:
+
+ - jina-embedding-b-en-v1: 110 million parameters.
+ - jina-embedding-l-en-v1: 800 million parameters.
+ - jina-embedding-xl-en-v1: 3 billion parameters.
+ - jina-embedding-xxl-en-v1: 11 billion parameters.
  
  ## Data & Parameters
  
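The README text added in this commit lists information retrieval and text reranking among the model's use cases. As a rough illustration of how embeddings are typically used for those tasks, here is a minimal sketch that ranks documents by cosine similarity to a query. It is not taken from the README: the helper names `cosine_similarity` and `rank_documents` are hypothetical, and the 3-dimensional vectors are made-up placeholders standing in for real embeddings, which would come from encoding text with a model such as `jina-embedding-s-en-v1`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction, so treat its similarity as 0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (document index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Placeholder vectors (assumption): real embeddings have far more dimensions
# and are produced by the embedding model, not written by hand.
query = [1.0, 0.0, 1.0]
docs = [
    [0.9, 0.1, 0.8],  # points in nearly the same direction as the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.5, 0.5, 0.5],  # partially aligned with the query
]

ranking = rank_documents(query, docs)
for index, score in ranking:
    print(f"doc {index}: {score:.3f}")
```

The same scoring function covers semantic textual similarity (compare two sentence vectors directly) and reranking (re-sort a candidate list by score); only the source of the candidate vectors differs.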