bwang0911 committed
Commit 9cf3ad7
1 Parent(s): 38ca931

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -21,8 +21,10 @@ tags:
 
 ## Intended Usage & Model Info
 
-`jina-embeddings-v2-base-code` is an multilingual **embedding model** speaks English and 29 widely used programming languages supporting **8192 sequence length**.
-It is based on a Bert architecture (JinaBert) that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to allow longer sequence length.
+`jina-embeddings-v2-base-code` is an multilingual **embedding model** speaks **English and 30 widely used programming languages**.
+Similar as other jina-embeddings-v2 series models, it supports **8192 sequence length**.
+
+`jina-embeddings-v2-base-code` is based on a Bert architecture (JinaBert) that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to allow longer sequence length.
 The backbone `jina-bert-v2-base-code` is pretrained on the [github-code](https://huggingface.co/datasets/codeparrot/github-code) dataset.
 The model is further trained on Jina AI's collection of more than 150 millions of coding question answer and docstring source code pairs.
 These pairs were obtained from various domains and were carefully selected through a thorough cleaning process.
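
The README text in this diff mentions a symmetric bidirectional variant of ALiBi. As a rough illustration of that idea, the sketch below builds a per-head attention bias that penalizes tokens by absolute distance in both directions. The function names `alibi_slopes` and `symmetric_alibi_bias` are hypothetical, and the slope schedule is the geometric one from the ALiBi paper for a power-of-two head count. This is an assumption-laden sketch, not JinaBert's actual implementation.

```python
def alibi_slopes(num_heads):
    """Per-head slopes following the ALiBi paper's geometric schedule.

    Assumption: num_heads is a power of two (the paper's simple case).
    """
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch assumes a power-of-two head count")
    start = 2.0 ** (-8.0 / num_heads)
    # Head h (1-based) gets slope start ** h.
    return [start ** (h + 1) for h in range(num_heads)]


def symmetric_alibi_bias(seq_len, slope):
    """Bias matrix added to attention scores before softmax.

    Entry [i][j] is -slope * |i - j|, so the penalty grows with distance
    and is identical for tokens to the left and to the right.
    """
    return [[-slope * abs(i - j) for j in range(seq_len)]
            for i in range(seq_len)]


if __name__ == "__main__":
    slopes = alibi_slopes(8)
    bias = symmetric_alibi_bias(4, slopes[0])
    for row in bias:
        print(row)
```

Because the bias depends only on relative distance and has no learned position table, the same formula extends to any sequence length at inference time, which is the property the README relies on when it mentions longer sequences.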