bwang0911 committed
Commit 3ea4a9e
1 Parent(s): abc42e1

Update README.md

Files changed (1): README.md (+4 −4)
README.md CHANGED
```diff
@@ -2621,7 +2621,7 @@ model-index:
 ## Intended Usage & Model Info
 
 `jina-embedding-s-en-v2` is an English, monolingual **embedding model** supporting **8192 sequence length**.
-It is based on a Bert architecture (Jina Bert) that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to support longer sequence length.
+It is based on a Bert architecture (JinaBert) that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to allow longer sequence length.
 The backbone `jina-bert-s-en-v2` is pretrained on the C4 dataset.
 The model is further trained on Jina AI's collection of more than 400 million sentence pairs and hard negatives.
 These pairs were obtained from various domains and were carefully selected through a thorough cleaning process.
@@ -2630,15 +2630,15 @@ The embedding model was trained using 512 sequence length, but extrapolates to 8
 This makes our model useful for a range of use cases, especially when processing long documents is needed, including long document retrieval, semantic textual similarity, text reranking, recommendation, RAG, and LLM-based generative search.
 
 This model has 33 million parameters, which enables lightning-fast and memory-efficient inference, while still delivering impressive performance.
-Additionally, we provide the following embedding models, supporting 8k sequence length as well:
+Additionally, we provide the following embedding models:
 
-### V1 (Based on T5)
+### V1 (Based on T5, 512 Seq)
 
 - [`jina-embedding-s-en-v1`](https://huggingface.co/jinaai/jina-embedding-s-en-v1): 35 million parameters.
 - [`jina-embedding-b-en-v1`](https://huggingface.co/jinaai/jina-embedding-b-en-v1): 110 million parameters.
 - [`jina-embedding-l-en-v1`](https://huggingface.co/jinaai/jina-embedding-l-en-v1): 330 million parameters.
 
-### V2 (Based on JinaBert)
+### V2 (Based on JinaBert, 8k Seq)
 
 - [`jina-embedding-s-en-v2`](https://huggingface.co/jinaai/jina-embedding-s-en-v2): 33 million parameters **(you are here)**.
 - [`jina-embedding-b-en-v2`](https://huggingface.co/jinaai/jina-embedding-b-en-v2): 137 million parameters.
```
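The symmetric bidirectional ALiBi variant mentioned in the model description can be sketched in plain Python. The slope schedule follows the ALiBi paper's power-of-two recipe; the function names below are illustrative and not taken from the Jina codebase.

```python
def alibi_slopes(n_heads):
    """Per-head slopes 2^(-8/n), 2^(-16/n), ... (ALiBi paper recipe for
    power-of-two head counts)."""
    return [2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)]

def symmetric_alibi_bias(seq_len, slope):
    """Encoder (bidirectional) variant: the attention-logit penalty grows
    linearly with token distance |i - j| in both directions, with no causal
    mask. Because the bias depends only on relative distance, it extends
    naturally past the 512-token training length."""
    return [[-slope * abs(i - j) for j in range(seq_len)]
            for i in range(seq_len)]

# Bias matrix for the steepest of 8 heads over a 4-token sequence.
bias = symmetric_alibi_bias(4, alibi_slopes(8)[0])
```

The bias is added to the attention logits before softmax, so distant tokens are softly down-weighted rather than hard-masked, which is what permits extrapolation to 8192 tokens.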
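For the semantic textual similarity use case listed above, a typical pipeline mean-pools the encoder's token embeddings into one sentence vector and compares sentences by cosine similarity. This is a minimal sketch using dummy vectors in place of real model output; all names are illustrative.

```python
import math

def mean_pool(token_embeddings):
    """Average a list of token vectors into a single sentence embedding."""
    dim = len(token_embeddings[0])
    n = len(token_embeddings)
    return [sum(tok[d] for tok in token_embeddings) / n for d in range(dim)]

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

sentence_vec = mean_pool([[1.0, 0.0], [0.0, 1.0]])  # two 2-d token vectors
```

In practice the token embeddings would come from the model's last hidden state, and the resulting sentence vectors can be indexed for retrieval or reranking.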