bwang0911 committed on
Commit 34234f6
1 Parent(s): 05421be

Update README.md

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -2622,7 +2622,7 @@ model-index:
 ## Intended Usage & Model Info
 
 `jina-embedding-b-en-v2` is an English, monolingual **embedding model** supporting **8192 sequence length**.
-It is based on a Bert architecture (Jina Bert) that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to support longer sequence length.
+It is based on a Bert architecture (JinaBert) that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to allow longer sequence length.
 The backbone `jina-bert-b-en-v2` is pretrained on the C4 dataset.
 The model is further trained on Jina AI's collection of more than 400 millions of sentence pairs and hard negatives.
 These pairs were obtained from various domains and were carefully selected through a thorough cleaning process.
@@ -2631,18 +2631,18 @@ The embedding model was trained using 512 sequence length, but extrapolates to 8
 This makes our model useful for a range of use cases, especially when processing long documents is needed, including long document retrieval, semantic textual similarity, text reranking, recommendation, RAG and LLM-based generative search,...
 
 With a standard size of 137 million parameters, the model enables fast inference while delivering better performance than our small model. It is recommended to use a single GPU for inference.
-Additionally, we provide the following embedding models, supporting 8k sequence length as well:
+Additionally, we provide the following embedding models:
 
-### V1 (Based on T5)
+### V1 (Based on T5, 512 Seq)
 
 - [`jina-embedding-s-en-v1`](https://huggingface.co/jinaai/jina-embedding-s-en-v1): 35 million parameters.
 - [`jina-embedding-b-en-v1`](https://huggingface.co/jinaai/jina-embedding-b-en-v1): 110 million parameters.
 - [`jina-embedding-l-en-v1`](https://huggingface.co/jinaai/jina-embedding-l-en-v1): 330 million parameters.
 
-### V2 (Based on JinaBert)
+### V2 (Based on JinaBert, 8k Seq)
 
-- [`jina-embedding-s-en-v2`](https://huggingface.co/jinaai/jina-embedding-s-en-v2): 33 million parameters.
-- [`jina-embedding-b-en-v2`](https://huggingface.co/jinaai/jina-embedding-b-en-v2): 137 million parameters **(you are here)**.
+- [`jina-embedding-s-en-v2`](https://huggingface.co/jinaai/jina-embedding-s-en-v2): 33 million parameters **(you are here)**.
+- [`jina-embedding-b-en-v2`](https://huggingface.co/jinaai/jina-embedding-b-en-v2): 137 million parameters.
 - [`jina-embedding-l-en-v2`](https://huggingface.co/jinaai/jina-embedding-l-en-v2): 435 million parameters.
 
 ## Data & Parameters
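
The README above credits the 8k extrapolation to the symmetric bidirectional variant of ALiBi. As a rough illustration of that mechanism (not JinaBert's actual implementation — the geometric slope schedule here is an assumption carried over from the original ALiBi paper), the attention bias penalizes each head linearly by token distance, symmetrically in both directions:

```python
import numpy as np

def alibi_slopes(n_heads: int) -> np.ndarray:
    # Assumed geometric slope schedule from the ALiBi paper: 2^(-8/n), 2^(-16/n), ...
    return np.array([2.0 ** (-8.0 * (k + 1) / n_heads) for k in range(n_heads)])

def symmetric_alibi_bias(seq_len: int, n_heads: int) -> np.ndarray:
    # Bidirectional (encoder-style) variant: the penalty depends only on |i - j|,
    # so each token attends symmetrically to left and right context. Because the
    # bias is a function of distance alone, it extends to sequences longer than
    # those seen in training without new learned positions.
    pos = np.arange(seq_len)
    dist = np.abs(pos[:, None] - pos[None, :])        # (seq_len, seq_len)
    slopes = alibi_slopes(n_heads)                    # (n_heads,)
    return -slopes[:, None, None] * dist[None, :, :]  # (n_heads, seq_len, seq_len)

bias = symmetric_alibi_bias(seq_len=4, n_heads=2)
```

This additive bias would be summed onto the pre-softmax attention scores; no positional embeddings are needed, which is what makes training at 512 and inferring at 8192 possible.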
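
The retrieval, similarity, and reranking use cases listed in the README all reduce to comparing embedding vectors, most commonly by cosine similarity. A minimal, model-free sketch (the short vectors below are toy stand-ins for what the encoder would produce, e.g. 768-dimensional vectors for `jina-embedding-b-en-v2`):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine of the angle between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical query/document embeddings for illustration only.
query = np.array([0.2, 0.8, 0.1])
docs = {
    "doc_a": np.array([0.1, 0.9, 0.0]),
    "doc_b": np.array([0.9, 0.1, 0.2]),
}

# Rank documents by similarity to the query, highest first.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
```

In a real pipeline the same comparison runs over embeddings of whole long documents, which is where the 8192-token input window pays off.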