Yura Kuratov committed on
Commit
32b0375
1 Parent(s): a34e087

update paper links in readme

Files changed (1)
  1. README.md +6 -5
README.md CHANGED
@@ -17,11 +17,11 @@ GENA-LM (`gena-lm-bert-base-fly`) model is trained with a masked language model
  - 768 Hidden size
  - 32k Vocabulary size
 
- We pre-trained `gena-lm-bert-base-fly` using TODO(data). Pre-training was performed for 1,900,000 iterations with batch size 256 and sequence length was equal to 512 tokens. We modified Transformer to use [Pre-Layer normalization](https://arxiv.org/abs/2002.04745).
+ We pre-trained `gena-lm-bert-base-fly` using TODO(data). Pre-training was performed for 1,900,000 iterations with batch size 256 and sequence length was equal to 512 tokens. We modified Transformer to use [Pre-Layer normalization](https://arxiv.org/abs/2002.04745). We upload checkpoint with the best MLM accuracy on validation set.
 
  Source code and data: https://github.com/AIRI-Institute/GENA_LM
 
- Paper: https://www.biorxiv.org/content/10.1101/2023.06.12.544594v1
+ Paper: https://www.biorxiv.org/content/10.1101/2023.06.12.544594
 
  ## Examples
 
@@ -74,13 +74,14 @@ For evaluation results, see our paper: https://www.biorxiv.org/content/10.1101/2
  ```bibtex
  @article{GENA_LM,
  author = {Veniamin Fishman and Yuri Kuratov and Maxim Petrov and Aleksei Shmelev and Denis Shepelin and Nikolay Chekanov and Olga Kardymon and Mikhail Burtsev},
- title = {GENA-LM: A Family of Open-Source Foundational Models for Long DNA Sequences},
+ title = {GENA-LM: A Family of Open-Source Foundational DNA Language Models for Long Sequences},
  elocation-id = {2023.06.12.544594},
  year = {2023},
  doi = {10.1101/2023.06.12.544594},
  publisher = {Cold Spring Harbor Laboratory},
- URL = {https://www.biorxiv.org/content/early/2023/06/13/2023.06.12.544594},
- eprint = {https://www.biorxiv.org/content/early/2023/06/13/2023.06.12.544594.full.pdf},
+ URL = {https://www.biorxiv.org/content/early/2023/11/01/2023.06.12.544594},
+ eprint = {https://www.biorxiv.org/content/early/2023/11/01/2023.06.12.544594.full.pdf},
  journal = {bioRxiv}
  }
+
  ```
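
As a rough sanity check on the pre-training figures quoted in the changed README paragraph, the sketch below multiplies them out. It assumes that "batch size 256" counts sequences per iteration and that every sequence is a full 512 tokens, so the result is an upper bound rather than a reported figure. The function name `token_upper_bound` is hypothetical and is not part of the GENA-LM code.

```python
# Hyperparameters as quoted in the README paragraph touched by this commit.
ITERATIONS = 1_900_000  # pre-training iterations
BATCH_SIZE = 256        # assumption: sequences per iteration
SEQ_LEN = 512           # tokens per sequence


def token_upper_bound(iterations: int, batch_size: int, seq_len: int) -> int:
    """Upper bound on tokens processed, assuming every sequence is full length."""
    return iterations * batch_size * seq_len


total_tokens = token_upper_bound(ITERATIONS, BATCH_SIZE, SEQ_LEN)
print(f"{total_tokens:,} tokens at most")
```

Under these assumptions the bound is roughly 249 billion tokens. Padding or shorter sequences would lower the real count.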