ccasimiro committed on
Commit
cada4a6
1 Parent(s): 257a27c

Rollback to the first model version

Files changed (10)
  1. .gitattributes +1 -0
  2. README.md +40 -13
  3. args.json +12 -23
  4. config.json +2 -2
  5. dict.txt +1738 -0
  6. merges.txt +0 -0
  7. process.log +1 -8
  8. pytorch_model.bin +2 -2
  9. tokenizer_config.json +1 -1
  10. vocab.json +0 -0
.gitattributes CHANGED
@@ -25,3 +25,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zstandard filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ pytorch_model.bin filter=lfs diff=lfs merge=lfs -text
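The added `.gitattributes` line routes `pytorch_model.bin` through Git LFS. As a rough way to see which patterns a `.gitattributes` file sends to LFS, here is a minimal sketch. The helper name `lfs_patterns` is hypothetical, and the whitespace-based parsing is a simplifying assumption that ignores quoted patterns containing spaces.

```python
def lfs_patterns(gitattributes_text):
    """Return the path patterns whose attribute list includes filter=lfs."""
    patterns = []
    for line in gitattributes_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # First field is the pattern, the rest are attributes.
        pattern, *attrs = line.split()
        if "filter=lfs" in attrs:
            patterns.append(pattern)
    return patterns


# The context lines and the added line from the hunk above.
sample = """\
*.zip filter=lfs diff=lfs merge=lfs -text
*.zstandard filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
pytorch_model.bin filter=lfs diff=lfs merge=lfs -text
"""

print(lfs_patterns(sample))
```

After this commit, `pytorch_model.bin` appears in the returned list alongside the existing glob patterns.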
README.md CHANGED
@@ -13,8 +13,10 @@ widget:
  - text: "En el <mask> toraco-abdómino-pélvico no se encontraron hallazgos patológicos de interés."
  ---
 
+ **⚠️NOTICE⚠️: THIS MODEL HAS BEEN MOVED TO THE FOLLOWING URL AND WILL SOON BE REMOVED:** https://huggingface.co/PlanTL-GOB-ES/roberta-base-biomedical-es
+
  # Biomedical language model for Spanish
- Biomedical pretrained language model for Spanish. For more details about the corpus, the pretraining and the evaluation, check the official [repository](https://github.com/PlanTL-GOB-ES/lm-biomedical-clinical-es).
+ Biomedical pretrained language model for Spanish. For more details about the corpus, the pretraining and the evaluation, check the official [repository](https://github.com/PlanTL-SANIDAD/lm-biomedical-clinical-es) and read our [preprint](https://arxiv.org/abs/2109.03570) "_Carrino, C. P., Armengol-Estapé, J., Gutiérrez-Fandiño, A., Llop-Palao, J., Pàmies, M., Gonzalez-Agirre, A., & Villegas, M. (2021). Biomedical and Clinical Language Models for Spanish: On the Benefits of Domain-Specific Pretraining in a Mid-Resource Scenario._".
 
 
  ## Tokenization and model pretraining
@@ -37,20 +39,20 @@ To obtain a high-quality training corpus, a cleaning pipeline with the following
  - deduplication of repetitive contents
  - keep the original document boundaries
 
- Finally, the corpora are concatenated and further global deduplication among the corpora has been applied.
+ Finally, the corpora are concatenated and further global deduplication among the corpora have been applied.
  The result is a medium-size biomedical corpus for Spanish composed of about 963M tokens. The table below shows some basic statistics of the individual cleaned corpora:
 
 
  | Name | No. tokens | Description |
  |-----------------------------------------------------------------------------------------|-------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
- | [Medical crawler](https://zenodo.org/record/4561970) | 903,558,13 | Crawler of more than 3,000 URLs belonging to Spanish biomedical and health domains. |
+ | [Medical crawler](https://zenodo.org/record/4561970) | 745,705,946 | Crawler of more than 3,000 URLs belonging to Spanish biomedical and health domains. |
  | Clinical cases misc. | 102,855,267 | A miscellany of medical content, essentially clinical cases. Note that a clinical case report is a scientific publication where medical practitioners share patient cases and it is different from a clinical note or document. |
- | [Scielo](https://github.com/PlanTL-GOB-ES/SciELO-Spain-Crawler) | 60,007,289 | Publications written in Spanish crawled from the Spanish SciELO server in 2017. |
+ | [Scielo](https://github.com/PlanTL-SANIDAD/SciELO-Spain-Crawler) | 60,007,289 | Publications written in Spanish crawled from the Spanish SciELO server in 2017. |
  | [BARR2_background](https://temu.bsc.es/BARR2/downloads/background_set.raw_text.tar.bz2) | 24,516,442 | Biomedical Abbreviation Recognition and Resolution (BARR2) containing Spanish clinical case study sections from a variety of clinical disciplines. |
  | Wikipedia_life_sciences | 13,890,501 | Wikipedia articles crawled 04/01/2021 with the [Wikipedia API python library](https://pypi.org/project/Wikipedia-API/) starting from the "Ciencias\_de\_la\_vida" category up to a maximum of 5 subcategories. Multiple links to the same articles are then discarded to avoid repeating content. |
  | Patents | 13,463,387 | Google Patent in Medical Domain for Spain (Spanish). The accepted codes (Medical Domain) for Json files of patents are: "A61B", "A61C","A61F", "A61H", "A61K", "A61L","A61M", "A61B", "A61P". |
  | [EMEA](http://opus.nlpl.eu/download.php?f=EMEA/v3/moses/en-es.txt.zip) | 5,377,448 | Spanish-side documents extracted from parallel corpora made out of PDF documents from the European Medicines Agency. |
- | [mespen_Medline](https://zenodo.org/record/3562536#.YTt1fH2xXbR) | 4,166,077 | Spanish-side articles extracted from a collection of Spanish-English parallel corpus consisting of biomedical scientific literature. The collection of parallel resources is aggregated from the MedlinePlus source. |
+ | [mespen_Medline](https://zenodo.org/record/3562536#.YTt1fH2xXbR) | 4,166,077 | Spanish-side articles extracted from a collection of Spanish-English parallel corpus consisting of biomedical scientific literature. The collection of parallel resources are aggregated from the MedlinePlus source. |
  | PubMed | 1,858,966 | Open-access articles from the PubMed repository crawled in 2017. |
 
 
@@ -81,7 +83,35 @@ The model is ready-to-use only for masked language modelling to perform the Fill
  However, the is intended to be fine-tuned on downstream tasks such as Named Entity Recognition or Text Classification.
 
  ## Cite
- To be announced soon.
+ If you use our models, please cite our latest preprint:
+
+ ```bibtex
+
+ @misc{carrino2021biomedical,
+ title={Biomedical and Clinical Language Models for Spanish: On the Benefits of Domain-Specific Pretraining in a Mid-Resource Scenario},
+ author={Casimiro Pio Carrino and Jordi Armengol-Estapé and Asier Gutiérrez-Fandiño and Joan Llop-Palao and Marc Pàmies and Aitor Gonzalez-Agirre and Marta Villegas},
+ year={2021},
+ eprint={2109.03570},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+
+ ```
+
+ If you use our Medical Crawler corpus, please cite the preprint:
+
+ ```bibtex
+
+ @misc{carrino2021spanish,
+ title={Spanish Biomedical Crawled Corpus: A Large, Diverse Dataset for Spanish Biomedical Language Models},
+ author={Casimiro Pio Carrino and Jordi Armengol-Estapé and Ona de Gibert Bonet and Asier Gutiérrez-Fandiño and Aitor Gonzalez-Agirre and Martin Krallinger and Marta Villegas},
+ year={2021},
+ eprint={2109.07765},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+
+ ```
 
  ---
 
@@ -90,13 +120,13 @@ To be announced soon.
  ```python
  from transformers import AutoTokenizer, AutoModelForMaskedLM
 
- tokenizer = AutoTokenizer.from_pretrained("PlanTL-GOB-ES/roberta-base-biomedical-es")
+ tokenizer = AutoTokenizer.from_pretrained("BSC-TeMU/roberta-base-biomedical-es")
 
- model = AutoModelForMaskedLM.from_pretrained("PlanTL-GOB-ES/roberta-base-biomedical-es")
+ model = AutoModelForMaskedLM.from_pretrained("BSC-TeMU/roberta-base-biomedical-es")
 
  from transformers import pipeline
 
- unmasker = pipeline('fill-mask', model="PlanTL-GOB-ES/roberta-base-biomedical-es")
+ unmasker = pipeline('fill-mask', model="BSC-TeMU/roberta-base-biomedical-es")
 
  unmasker("El único antecedente personal a reseñar era la <mask> arterial.")
  ```
@@ -134,7 +164,4 @@ unmasker("El único antecedente personal a reseñar era la <mask> arterial.")
  "token_str": " presión"
  }
  ]
- ```
-
- ## Funding
- This work was funded by the Spanish State Secretariat for Digitalization and Artificial Intelligence (SEDIA) within the framework of the Plan-TL.
+ ```
 
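The README states a final corpus of about 963M tokens, while the table lists per-corpus counts for the individually cleaned corpora. A quick sum of the counts on the added (`+`) side of the table shows how the two relate. The dictionary name `corpus_tokens` is only illustrative, and reading the gap as the effect of the global deduplication step is an assumption, since the README does not spell it out.

```python
# Per-corpus token counts copied from the "+" side of the README table.
corpus_tokens = {
    "Medical crawler": 745_705_946,
    "Clinical cases misc.": 102_855_267,
    "Scielo": 60_007_289,
    "BARR2_background": 24_516_442,
    "Wikipedia_life_sciences": 13_890_501,
    "Patents": 13_463_387,
    "EMEA": 5_377_448,
    "mespen_Medline": 4_166_077,
    "PubMed": 1_858_966,
}

total = sum(corpus_tokens.values())
print(f"{total:,}")  # 971,841,323

# Share of the total contributed by the largest corpus.
largest = max(corpus_tokens, key=corpus_tokens.get)
print(largest, f"{corpus_tokens[largest] / total:.1%}")
```

The per-corpus sum comes to roughly 972M, slightly above the stated 963M, which would be consistent with some tokens being dropped when the concatenated corpora are deduplicated globally.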
args.json CHANGED
@@ -1,28 +1,17 @@
  {
- "tokenizer_path_name": null,
- "vocab_name": "biomedical",
- "tokenizer": "bbpe-roberta",
+ "output_root": "/gpfs/projects/bsc88/corpus-utils-lm/23-12-2020-72f8c7e/output/model-ready_output/2020-12-23-1900-daf4-ab38",
+ "files": "/gpfs/projects/bsc88/corpus-utils-lm/23-12-2020-72f8c7e/output/model-ready_output/2020-12-23-1900-daf4-ab38/train_valid_test_split_output/2020-12-23-1905-daf4-a0e0/train.txt",
+ "vocab_name": "roberta-ca",
+ "clean_text": true,
+ "handle_chinese_chars": true,
+ "strip_accents": false,
  "lowercase": false,
- "vocab_size": 50262,
- "min_frequency": 6,
- "extra_tokens": [],
+ "vocab_size": 52000,
  "limit_alphabet": 1000,
- "max_len": 512,
- "no_show_progress": false,
- "strip_accents": false,
- "no_handle_chinese_chars": false,
- "no_clean_text": false,
+ "show_progress": true,
+ "min_frequency": 2,
+ "extra_tokens": [],
  "reserve_tokens": 0,
- "use_tokenizers": false,
- "no_fairseq": false,
- "bbpe_add_prefix_space": true,
- "single_paragraph_add_punct": true,
- "tok_batch_size": 100000000,
- "files": [
- "/home/shared/dt01/temutauro/ccasimiro/corpus-utils-lm/output/model-ready_output/biomedical-vocab-50262-2021-12-09-1207-d1d3-e42b/train_valid_test_split_output/biomedical-2021-12-09-1210-d1d3-ad85/train.txt",
- "/home/shared/dt01/temutauro/ccasimiro/corpus-utils-lm/output/model-ready_output/biomedical-vocab-50262-2021-12-09-1207-d1d3-e42b/train_valid_test_split_output/biomedical-2021-12-09-1210-d1d3-ad85/valid.txt",
- "/home/shared/dt01/temutauro/ccasimiro/corpus-utils-lm/output/model-ready_output/biomedical-vocab-50262-2021-12-09-1207-d1d3-e42b/train_valid_test_split_output/biomedical-2021-12-09-1210-d1d3-ad85/test.txt"
- ],
- "output_root_path": "/home/shared/dt01/temutauro/ccasimiro/corpus-utils-lm/output/model-ready_output/biomedical-vocab-50262-2021-12-09-1207-d1d3-e42b",
- "commit_hash": "d1d3920e7012caf14c9d6968fded36e0dd719a51"
+ "tokenizer": "bbpe",
+ "commit_hash": "daf4d660ec8a4b28d2bc29b3063779100ab85796\n"
  }
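Because the `args.json` hunk reorders and renames many keys, a line diff is hard to scan. A key-level comparison can make the rollback easier to review. The sketch below uses a hypothetical helper `diff_keys` and only a subset of the keys from the two sides of the hunk, so it illustrates the approach rather than the full file.

```python
def diff_keys(old, new):
    """Compare two flat dicts: return (added, removed, changed)."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = {
        key: (old[key], new[key])
        for key in sorted(old.keys() & new.keys())
        if old[key] != new[key]
    }
    return added, removed, changed


# Subset of the "-" side of the hunk.
old_args = {
    "vocab_name": "biomedical",
    "tokenizer": "bbpe-roberta",
    "lowercase": False,
    "vocab_size": 50262,
    "min_frequency": 6,
    "max_len": 512,
}
# Subset of the "+" side of the hunk.
new_args = {
    "vocab_name": "roberta-ca",
    "tokenizer": "bbpe",
    "lowercase": False,
    "vocab_size": 52000,
    "min_frequency": 2,
    "show_progress": True,
}

added, removed, changed = diff_keys(old_args, new_args)
print("added:", added)
print("removed:", removed)
for key, (before, after) in changed.items():
    print(f"changed: {key}: {before!r} -> {after!r}")
```

With the real files, `json.load` on each version would supply the two dictionaries.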
config.json CHANGED
@@ -18,8 +18,8 @@
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
- "transformers_version": "4.6.0",
+ "transformers_version": "4.4.0",
  "type_vocab_size": 1,
  "use_cache": true,
- "vocab_size": 50262
+ "vocab_size": 52000
  }
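The `vocab_size` change in `config.json` (50262 to 52000) can be cross-checked against the `dict.txt` hunk header (`-50255,3 +50255,1741`). The sketch below rests on two assumptions that the diff itself does not state: that `dict.txt` follows the fairseq dictionary convention, where four built-in symbols (`<s>`, `<pad>`, `</s>`, `<unk>`) are not listed and one `<mask>` token is added for masked language modelling, and that the `dict.txt` hunk runs to the end of the file. The function name `expected_vocab_size` is hypothetical.

```python
# Assumption: four fairseq built-in symbols are not listed in dict.txt,
# and the masked-LM setup adds one <mask> token on top.
FAIRSEQ_BUILTIN_SYMBOLS = 4
MASK_TOKENS = 1


def expected_vocab_size(dict_txt_lines: int) -> int:
    """Model vocabulary size implied by a dict.txt with the given line count."""
    return dict_txt_lines + FAIRSEQ_BUILTIN_SYMBOLS + MASK_TOKENS


# Assumption: the hunk reaches the end of the file on both sides,
# so start + length - 1 is the last line number.
old_dict_lines = 50255 + 3 - 1      # hunk "-50255,3"
new_dict_lines = 50255 + 1741 - 1   # hunk "+50255,1741"

print(expected_vocab_size(old_dict_lines))  # 50262
print(expected_vocab_size(new_dict_lines))  # 52000
print(new_dict_lines - old_dict_lines)      # 1738
```

Under those assumptions the numbers line up with both `vocab_size` values and with the `+1738` reported for `dict.txt` in the file list.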
dict.txt CHANGED
@@ -50255,3 +50255,1741 @@
50255
  50258 100
50256
  50259 100
50257
  50260 100
50258
+ 50261 100
50259
+ 50262 100
50260
+ 50263 100
50261
+ 50264 100
50262
+ 50265 100
50263
+ 50266 100
50264
+ 50267 100
50265
+ 50268 100
50266
+ 50269 100
50267
+ 50270 100
50268
+ 50271 100
50269
+ 50272 100
50270
+ 50273 100
50271
+ 50274 100
50272
+ 50275 100
50273
+ 50276 100
50274
+ 50277 100
50275
+ 50278 100
50276
+ 50279 100
50277
+ 50280 100
50278
+ 50281 100
50279
+ 50282 100
50280
+ 50283 100
50281
+ 50284 100
50282
+ 50285 100
50283
+ 50286 100
50284
+ 50287 100
50285
+ 50288 100
50286
+ 50289 100
50287
+ 50290 100
50288
+ 50291 100
50289
+ 50292 100
50290
+ 50293 100
50291
+ 50294 100
50292
+ 50295 100
50293
+ 50296 100
50294
+ 50297 100
50295
+ 50298 100
50296
+ 50299 100
50297
+ 50300 100
50298
+ 50301 100
50299
+ 50302 100
50300
+ 50303 100
50301
+ 50304 100
50302
+ 50305 100
50303
+ 50306 100
50304
+ 50307 100
50305
+ 50308 100
50306
+ 50309 100
50307
+ 50310 100
50308
+ 50311 100
50309
+ 50312 100
50310
+ 50313 100
50311
+ 50314 100
50312
+ 50315 100
50313
+ 50316 100
50314
+ 50317 100
50315
+ 50318 100
50316
+ 50319 100
50317
+ 50320 100
50318
+ 50321 100
50319
+ 50322 100
50320
+ 50323 100
50321
+ 50324 100
50322
+ 50325 100
50323
+ 50326 100
50324
+ 50327 100
50325
+ 50328 100
50326
+ 50329 100
50327
+ 50330 100
50328
+ 50331 100
50329
+ 50332 100
50330
+ 50333 100
50331
+ 50334 100
50332
+ 50335 100
50333
+ 50336 100
50334
+ 50337 100
50335
+ 50338 100
50336
+ 50339 100
50337
+ 50340 100
50338
+ 50341 100
50339
+ 50342 100
50340
+ 50343 100
50341
+ 50344 100
50342
+ 50345 100
50343
+ 50346 100
50344
+ 50347 100
50345
+ 50348 100
50346
+ 50349 100
50347
+ 50350 100
50348
+ 50351 100
50349
+ 50352 100
50350
+ 50353 100
50351
+ 50354 100
50352
+ 50355 100
50353
+ 50356 100
50354
+ 50357 100
50355
+ 50358 100
50356
+ 50359 100
50357
+ 50360 100
50358
+ 50361 100
50359
+ 50362 100
50360
+ 50363 100
50361
+ 50364 100
50362
+ 50365 100
50363
+ 50366 100
50364
+ 50367 100
50365
+ 50368 100
50366
+ 50369 100
50367
+ 50370 100
50368
+ 50371 100
50369
+ 50372 100
50370
+ 50373 100
50371
+ 50374 100
50372
+ 50375 100
50373
+ 50376 100
50374
+ 50377 100
50375
+ 50378 100
50376
+ 50379 100
50377
+ 50380 100
50378
+ 50381 100
50379
+ 50382 100
50380
+ 50383 100
50381
+ 50384 100
50382
+ 50385 100
50383
+ 50386 100
50384
+ 50387 100
50385
+ 50388 100
50386
+ 50389 100
50387
+ 50390 100
50388
+ 50391 100
50389
+ 50392 100
50390
+ 50393 100
50391
+ 50394 100
50392
+ 50395 100
50393
+ 50396 100
50394
+ 50397 100
50395
+ 50398 100
50396
+ 50399 100
50397
+ 50400 100
50398
+ 50401 100
50399
+ 50402 100
50400
+ 50403 100
50401
+ 50404 100
50402
+ 50405 100
50403
+ 50406 100
50404
+ 50407 100
50405
+ 50408 100
50406
+ 50409 100
50407
+ 50410 100
50408
+ 50411 100
50409
+ 50412 100
50410
+ 50413 100
50411
+ 50414 100
50412
+ 50415 100
50413
+ 50416 100
50414
+ 50417 100
50415
+ 50418 100
50416
+ 50419 100
50417
+ 50420 100
50418
+ 50421 100
50419
+ 50422 100
50420
+ 50423 100
50421
+ 50424 100
50422
+ 50425 100
50423
+ 50426 100
50424
+ 50427 100
50425
+ 50428 100
50426
+ 50429 100
50427
+ 50430 100
50428
+ 50431 100
50429
+ 50432 100
50430
+ 50433 100
50431
+ 50434 100
50432
+ 50435 100
50433
+ 50436 100
50434
+ 50437 100
50435
+ 50438 100
50436
+ 50439 100
50437
+ 50440 100
50438
+ 50441 100
50439
+ 50442 100
50440
+ 50443 100
50441
+ 50444 100
50442
+ 50445 100
50443
+ 50446 100
50444
+ 50447 100
50445
+ 50448 100
50446
+ 50449 100
50447
+ 50450 100
50448
+ 50451 100
50449
+ 50452 100
50450
+ 50453 100
50451
+ 50454 100
50452
+ 50455 100
50453
+ 50456 100
50454
+ 50457 100
50455
+ 50458 100
50456
+ 50459 100
50457
+ 50460 100
50458
+ 50461 100
50459
+ 50462 100
50460
+ 50463 100
50461
+ 50464 100
50462
+ 50465 100
50463
+ 50466 100
50464
+ 50467 100
50465
+ 50468 100
50466
+ 50469 100
50467
+ 50470 100
50468
+ 50471 100
50469
+ 50472 100
50470
+ 50473 100
50471
+ 50474 100
50472
+ 50475 100
50473
+ 50476 100
50474
+ 50477 100
50475
+ 50478 100
50476
+ 50479 100
50477
+ 50480 100
50478
+ 50481 100
50479
+ 50482 100
50480
+ 50483 100
50481
+ 50484 100
50482
+ 50485 100
50483
+ 50486 100
50484
+ 50487 100
50485
+ 50488 100
50486
+ 50489 100
50487
+ 50490 100
50488
+ 50491 100
50489
+ 50492 100
50490
+ 50493 100
50491
+ 50494 100
50492
+ 50495 100
50493
+ 50496 100
50494
+ 50497 100
50495
+ 50498 100
50496
+ 50499 100
50497
+ 50500 100
50498
+ 50501 100
50499
+ 50502 100
50500
+ 50503 100
50501
+ 50504 100
50502
+ 50505 100
50503
+ 50506 100
50504
+ 50507 100
50505
+ 50508 100
50506
+ 50509 100
50507
+ 50510 100
50508
+ 50511 100
50509
+ 50512 100
50510
+ 50513 100
50511
+ 50514 100
50512
+ 50515 100
50513
+ 50516 100
50514
+ 50517 100
50515
+ 50518 100
50516
+ 50519 100
50517
+ 50520 100
50518
+ 50521 100
50519
+ 50522 100
50520
+ 50523 100
50521
+ 50524 100
50522
+ 50525 100
50523
+ 50526 100
50524
+ 50527 100
50525
+ 50528 100
50526
+ 50529 100
50527
+ 50530 100
50528
+ 50531 100
50529
+ 50532 100
50530
+ 50533 100
50531
+ 50534 100
50532
+ 50535 100
50533
+ 50536 100
50534
+ 50537 100
50535
+ 50538 100
50536
+ 50539 100
50537
+ 50540 100
50538
+ 50541 100
50539
+ 50542 100
50540
+ 50543 100
50541
+ 50544 100
50542
+ 50545 100
50543
+ 50546 100
50544
+ 50547 100
50545
+ 50548 100
50546
+ 50549 100
50547
+ 50550 100
50548
+ 50551 100
50549
+ 50552 100
50550
+ 50553 100
50551
+ 50554 100
50552
+ 50555 100
50553
+ 50556 100
50554
+ 50557 100
50555
+ 50558 100
50556
+ 50559 100
50557
+ 50560 100
50558
+ 50561 100
50559
+ 50562 100
50560
+ 50563 100
50561
+ 50564 100
50562
+ 50565 100
50563
+ 50566 100
50564
+ 50567 100
50565
+ 50568 100
50566
+ 50569 100
50567
+ 50570 100
50568
+ 50571 100
50569
+ 50572 100
50570
+ 50573 100
50571
+ 50574 100
50572
+ 50575 100
50573
+ 50576 100
50574
+ 50577 100
50575
+ 50578 100
50576
+ 50579 100
50577
+ 50580 100
50578
+ 50581 100
50579
+ 50582 100
50580
+ 50583 100
50581
+ 50584 100
50582
+ 50585 100
50583
+ 50586 100
50584
+ 50587 100
50585
+ 50588 100
50586
+ 50589 100
50587
+ 50590 100
50588
+ 50591 100
50589
+ 50592 100
50590
+ 50593 100
50591
+ 50594 100
50592
+ 50595 100
50593
+ 50596 100
50594
+ 50597 100
50595
+ 50598 100
50596
+ 50599 100
50597
+ 50600 100
50598
+ 50601 100
50599
+ 50602 100
50600
+ 50603 100
50601
+ 50604 100
50602
+ 50605 100
50603
+ 50606 100
50604
+ 50607 100
50605
+ 50608 100
50606
+ 50609 100
50607
+ 50610 100
50608
+ 50611 100
50609
+ 50612 100
50610
+ 50613 100
50611
+ 50614 100
50612
+ 50615 100
50613
+ 50616 100
50614
+ 50617 100
50615
+ 50618 100
50616
+ 50619 100
50617
+ 50620 100
50618
+ 50621 100
50619
+ 50622 100
50620
+ 50623 100
50621
+ 50624 100
50622
+ 50625 100
50623
+ 50626 100
50624
+ 50627 100
50625
+ 50628 100
50626
+ 50629 100
50627
+ 50630 100
50628
+ 50631 100
50629
+ 50632 100
50630
+ 50633 100
50631
+ 50634 100
50632
+ 50635 100
50633
+ 50636 100
50634
+ 50637 100
50635
+ 50638 100
50636
+ 50639 100
50637
+ 50640 100
50638
+ 50641 100
50639
+ 50642 100
50640
+ 50643 100
50641
+ 50644 100
50642
+ 50645 100
50643
+ 50646 100
50644
+ 50647 100
50645
+ 50648 100
50646
+ 50649 100
50647
+ 50650 100
50648
+ 50651 100
50649
+ 50652 100
50650
+ 50653 100
50651
+ 50654 100
50652
+ 50655 100
50653
+ 50656 100
50654
+ 50657 100
50655
+ 50658 100
50656
+ 50659 100
50657
+ 50660 100
50658
+ 50661 100
50659
+ 50662 100
50660
+ 50663 100
50661
+ 50664 100
50662
+ 50665 100
50663
+ 50666 100
50664
+ 50667 100
50665
+ 50668 100
50666
+ 50669 100
50667
+ 50670 100
50668
+ 50671 100
50669
+ 50672 100
50670
+ 50673 100
50671
+ 50674 100
50672
+ 50675 100
50673
+ 50676 100
50674
+ 50677 100
50675
+ 50678 100
50676
+ 50679 100
50677
+ 50680 100
50678
+ 50681 100
50679
+ 50682 100
50680
+ 50683 100
50681
+ 50684 100
50682
+ 50685 100
50683
+ 50686 100
50684
+ 50687 100
50685
+ 50688 100
50686
+ 50689 100
50687
+ 50690 100
50688
+ 50691 100
50689
+ 50692 100
50690
+ 50693 100
50691
+ 50694 100
50692
+ 50695 100
50693
+ 50696 100
50694
+ 50697 100
50695
+ 50698 100
50696
+ 50699 100
50697
+ 50700 100
50698
+ 50701 100
50699
+ 50702 100
50700
+ 50703 100
50701
+ 50704 100
50702
+ 50705 100
50703
+ 50706 100
50704
+ 50707 100
50705
+ 50708 100
50706
+ 50709 100
50707
+ 50710 100
50708
+ 50711 100
50709
+ 50712 100
50710
+ 50713 100
50711
+ 50714 100
50712
+ 50715 100
50713
+ 50716 100
50714
+ 50717 100
50715
+ 50718 100
50716
+ 50719 100
50717
+ 50720 100
50718
+ 50721 100
50719
+ 50722 100
50720
+ 50723 100
50721
+ 50724 100
50722
+ 50725 100
50723
+ 50726 100
50724
+ 50727 100
50725
+ 50728 100
50726
+ 50729 100
50727
+ 50730 100
50728
+ 50731 100
50729
+ 50732 100
50730
+ 50733 100
50731
+ 50734 100
50732
+ 50735 100
50733
+ 50736 100
50734
+ 50737 100
50735
+ 50738 100
50736
+ 50739 100
50737
+ 50740 100
50738
+ 50741 100
50739
+ 50742 100
50740
+ 50743 100
50741
+ 50744 100
50742
+ 50745 100
50743
+ 50746 100
50744
+ 50747 100
50745
+ 50748 100
50746
+ 50749 100
50747
+ 50750 100
50748
+ 50751 100
50749
+ 50752 100
50750
+ 50753 100
50751
+ 50754 100
50752
+ 50755 100
50753
+ 50756 100
50754
+ 50757 100
50755
+ 50758 100
50756
+ 50759 100
50757
+ 50760 100
50758
+ 50761 100
50759
+ 50762 100
50760
+ 50763 100
50761
+ 50764 100
50762
+ 50765 100
50763
+ 50766 100
50764
+ 50767 100
50765
+ 50768 100
50766
+ 50769 100
50767
+ 50770 100
50768
+ 50771 100
50769
+ 50772 100
50770
+ 50773 100
50771
+ 50774 100
50772
+ 50775 100
50773
+ 50776 100
50774
+ 50777 100
50775
+ 50778 100
50776
+ 50779 100
50777
+ 50780 100
50778
+ 50781 100
50779
+ 50782 100
50780
+ 50783 100
50781
+ 50784 100
50782
+ 50785 100
50783
+ 50786 100
50784
+ 50787 100
50785
+ 50788 100
50786
+ 50789 100
50787
+ 50790 100
50788
+ 50791 100
50789
+ 50792 100
50790
+ 50793 100
50791
+ 50794 100
50792
+ 50795 100
50793
+ 50796 100
50794
+ 50797 100
50795
+ 50798 100
50796
+ 50799 100
50797
+ 50800 100
50798
+ 50801 100
50799
+ 50802 100
50800
+ 50803 100
50801
+ 50804 100
50802
+ 50805 100
50803
+ 50806 100
50804
+ 50807 100
50805
+ 50808 100
50806
+ 50809 100
50807
+ 50810 100
50808
+ 50811 100
50809
+ 50812 100
50810
+ 50813 100
50811
+ 50814 100
50812
+ 50815 100
50813
+ 50816 100
50814
+ 50817 100
50815
+ 50818 100
50816
+ 50819 100
50817
+ 50820 100
50818
+ 50821 100
50819
+ 50822 100
50820
+ 50823 100
50821
+ 50824 100
50822
+ 50825 100
50823
+ 50826 100
50824
+ 50827 100
50825
+ 50828 100
50826
+ 50829 100
50827
+ 50830 100
50828
+ 50831 100
50829
+ 50832 100
50830
+ 50833 100
50831
+ 50834 100
50832
+ 50835 100
50833
+ 50836 100
50834
+ 50837 100
50835
+ 50838 100
50836
+ 50839 100
50837
+ 50840 100
50838
+ 50841 100
50839
+ 50842 100
50840
+ 50843 100
50841
+ 50844 100
50842
+ 50845 100
50843
+ 50846 100
50844
+ 50847 100
50845
+ 50848 100
50846
+ 50849 100
50847
+ 50850 100
50848
+ 50851 100
50849
+ 50852 100
50850
+ 50853 100
50851
+ 50854 100
50852
+ 50855 100
50853
+ 50856 100
50854
+ 50857 100
50855
+ 50858 100
50856
+ 50859 100
50857
+ 50860 100
50858
+ 50861 100
50859
+ 50862 100
50860
+ 50863 100
50861
+ 50864 100
50862
+ 50865 100
50863
+ 50866 100
50864
+ 50867 100
50865
+ 50868 100
50866
+ 50869 100
50867
+ 50870 100
50868
+ 50871 100
50869
+ 50872 100
50870
+ 50873 100
50871
+ 50874 100
50872
+ 50875 100
50873
+ 50876 100
50874
+ 50877 100
50875
+ 50878 100
50876
+ 50879 100
50877
+ 50880 100
50878
+ 50881 100
50879
+ 50882 100
50880
+ 50883 100
50881
+ 50884 100
50882
+ 50885 100
50883
+ 50886 100
50884
+ 50887 100
50885
+ 50888 100
50886
+ 50889 100
50887
+ 50890 100
50888
+ 50891 100
50889
+ 50892 100
50890
+ 50893 100
50891
+ 50894 100
50892
+ 50895 100
50893
+ 50896 100
50894
+ 50897 100
50895
+ 50898 100
50896
+ 50899 100
50897
+ 50900 100
50898
+ 50901 100
50899
+ 50902 100
50900
+ 50903 100
50901
+ 50904 100
50902
+ 50905 100
50903
+ 50906 100
50904
+ 50907 100
50905
+ 50908 100
50906
+ 50909 100
50907
+ 50910 100
50908
+ 50911 100
50909
+ 50912 100
50910
+ 50913 100
50911
+ 50914 100
50912
+ 50915 100
50913
+ 50916 100
50914
+ 50917 100
50915
+ 50918 100
50916
+ 50919 100
50917
+ 50920 100
50918
+ 50921 100
50919
+ 50922 100
50920
+ 50923 100
50921
+ 50924 100
50922
+ 50925 100
50923
+ 50926 100
50924
+ 50927 100
50925
+ 50928 100
50926
+ 50929 100
50927
+ 50930 100
50928
+ 50931 100
50929
+ 50932 100
50930
+ 50933 100
50931
+ 50934 100
50932
+ 50935 100
50933
+ 50936 100
50934
+ 50937 100
50935
+ 50938 100
50936
+ 50939 100
50937
+ 50940 100
50938
+ 50941 100
50939
+ 50942 100
50940
+ 50943 100
50941
+ 50944 100
50942
+ 50945 100
50943
+ 50946 100
50944
+ 50947 100
50945
+ 50948 100
50946
+ 50949 100
50947
+ 50950 100
50948
+ 50951 100
50949
+ 50952 100
50950
+ 50953 100
50951
+ 50954 100
50952
+ 50955 100
50953
+ 50956 100
50954
+ 50957 100
50955
+ 50958 100
50956
+ 50959 100
50957
+ 50960 100
50958
+ 50961 100
50959
+ 50962 100
50960
+ 50963 100
50961
+ 50964 100
50962
+ 50965 100
50963
+ 50966 100
50964
+ 50967 100
50965
+ 50968 100
50966
+ 50969 100
50967
+ 50970 100
50968
+ 50971 100
50969
+ 50972 100
50970
+ 50973 100
50971
+ 50974 100
50972
+ 50975 100
50973
+ 50976 100
50974
+ 50977 100
50975
+ 50978 100
50976
+ 50979 100
50977
+ 50980 100
50978
+ 50981 100
50979
+ 50982 100
50980
+ 50983 100
50981
+ 50984 100
50982
+ 50985 100
50983
+ 50986 100
50984
+ 50987 100
50985
+ 50988 100
50986
+ 50989 100
50987
+ 50990 100
50988
+ 50991 100
50989
+ 50992 100
50990
+ 50993 100
50991
+ 50994 100
50992
+ 50995 100
50993
+ 50996 100
50994
+ 50997 100
50995
+ 50998 100
50996
+ 50999 100
50997
+ 51000 100
50998
+ 51001 100
50999
+ 51002 100
51000
+ 51003 100
51001
+ 51004 100
51002
+ 51005 100
51003
+ 51006 100
51004
+ 51007 100
51005
+ 51008 100
51006
+ 51009 100
51007
+ 51010 100
51008
+ 51011 100
51009
+ 51012 100
51010
+ 51013 100
51011
+ 51014 100
51012
+ 51015 100
51013
+ 51016 100
51014
+ 51017 100
51015
+ 51018 100
51016
+ 51019 100
51017
+ 51020 100
51018
+ 51021 100
51019
+ 51022 100
51020
+ 51023 100
51021
+ 51024 100
51022
+ 51025 100
51023
+ 51026 100
51024
+ 51027 100
51025
+ 51028 100
51026
+ 51029 100
51027
+ 51030 100
51028
+ 51031 100
51029
+ 51032 100
51030
+ 51033 100
51031
+ 51034 100
51032
+ 51035 100
51033
+ 51036 100
51034
+ 51037 100
51035
+ 51038 100
51036
+ 51039 100
51037
+ 51040 100
51038
+ 51041 100
51039
+ 51042 100
51040
+ 51043 100
51041
+ 51044 100
51042
+ 51045 100
51043
+ 51046 100
51044
+ 51047 100
51045
+ 51048 100
51046
+ 51049 100
51047
+ 51050 100
51048
+ 51051 100
51049
+ 51052 100
51050
+ 51053 100
51051
+ 51054 100
51052
+ 51055 100
51053
+ 51056 100
51054
+ 51057 100
51055
+ 51058 100
51056
+ 51059 100
51057
+ 51060 100
51058
+ 51061 100
51059
+ 51062 100
51060
+ 51063 100
51061
+ 51064 100
51062
+ 51065 100
51063
+ 51066 100
51064
+ 51067 100
51065
+ 51068 100
51066
+ 51069 100
51067
+ 51070 100
51068
+ 51071 100
51069
+ 51072 100
51070
+ 51073 100
51071
+ 51074 100
51072
+ 51075 100
51073
+ 51076 100
51074
+ 51077 100
51075
+ 51078 100
51076
+ 51079 100
51077
+ 51080 100
51078
+ 51081 100
51079
+ 51082 100
51080
+ 51083 100
51081
+ 51084 100
51082
+ 51085 100
51083
+ 51086 100
51084
+ 51087 100
51085
+ 51088 100
51086
+ 51089 100
51087
+ 51090 100
51088
+ 51091 100
51089
+ 51092 100
51090
+ 51093 100
51091
+ 51094 100
51092
+ 51095 100
51093
+ 51096 100
51094
+ 51097 100
51095
+ 51098 100
51096
+ 51099 100
51097
+ 51100 100
51098
+ 51101 100
51099
+ 51102 100
51100
+ 51103 100
51101
+ 51104 100
51102
+ 51105 100
51103
+ 51106 100
51104
+ 51107 100
51105
+ 51108 100
51106
+ 51109 100
51107
+ 51110 100
51108
+ 51111 100
51109
+ 51112 100
51110
+ 51113 100
51111
+ 51114 100
51112
+ 51115 100
51113
+ 51116 100
51114
+ 51117 100
51115
+ 51118 100
51116
+ 51119 100
51117
+ 51120 100
51118
+ 51121 100
51119
+ 51122 100
51120
+ 51123 100
51121
+ 51124 100
51122
+ 51125 100
51123
+ 51126 100
51124
+ 51127 100
51125
+ 51128 100
51126
+ 51129 100
51127
+ 51130 100
51128
+ 51131 100
51129
+ 51132 100
51130
+ 51133 100
51131
+ 51134 100
51132
+ 51135 100
51133
+ 51136 100
51134
+ 51137 100
51135
+ 51138 100
51136
+ 51139 100
51137
+ 51140 100
51138
+ 51141 100
51139
+ 51142 100
51140
+ 51143 100
51141
+ 51144 100
51142
+ 51145 100
51143
+ 51146 100
51144
+ 51147 100
51145
+ 51148 100
51146
+ 51149 100
51147
+ 51150 100
51148
+ 51151 100
51149
+ 51152 100
51150
+ 51153 100
51151
+ 51154 100
51152
+ 51155 100
51153
+ 51156 100
51154
+ 51157 100
51155
+ 51158 100
51156
+ 51159 100
51157
+ 51160 100
51158
+ 51161 100
51159
+ 51162 100
51160
+ 51163 100
51161
+ 51164 100
51162
+ 51165 100
51163
+ 51166 100
51164
+ 51167 100
51165
+ 51168 100
51166
+ 51169 100
51167
+ 51170 100
51168
+ 51171 100
51169
+ 51172 100
51170
+ 51173 100
51171
+ 51174 100
51172
+ 51175 100
51173
+ 51176 100
51174
+ 51177 100
51175
+ 51178 100
51176
+ 51179 100
51177
+ 51180 100
51178
+ 51181 100
51179
+ 51182 100
51180
+ 51183 100
51181
+ 51184 100
51182
+ 51185 100
51183
+ 51186 100
51184
+ 51187 100
51185
+ 51188 100
51186
+ 51189 100
51187
+ 51190 100
51188
+ 51191 100
51189
+ 51192 100
51190
+ 51193 100
51191
+ 51194 100
51192
+ 51195 100
51193
+ 51196 100
51194
+ 51197 100
51195
+ 51198 100
51196
+ 51199 100
51197
+ 51200 100
51198
+ 51201 100
51199
+ 51202 100
51200
+ 51203 100
51201
+ 51204 100
51202
+ 51205 100
51203
+ 51206 100
51204
+ 51207 100
51205
+ 51208 100
51206
+ 51209 100
51207
+ 51210 100
51208
+ 51211 100
51209
+ 51212 100
51210
+ 51213 100
51211
+ 51214 100
51212
+ 51215 100
51213
+ 51216 100
51214
+ 51217 100
51215
+ 51218 100
51216
+ 51219 100
51217
+ 51220 100
51218
+ 51221 100
51219
+ 51222 100
51220
+ 51223 100
51221
+ 51224 100
51222
+ 51225 100
51223
+ 51226 100
51224
+ 51227 100
51225
+ 51228 100
51226
+ 51229 100
51227
+ 51230 100
51228
+ 51231 100
51229
+ 51232 100
51230
+ 51233 100
51231
+ 51234 100
51232
+ 51235 100
51233
+ 51236 100
51234
+ 51237 100
51235
+ 51238 100
51236
+ 51239 100
51237
+ 51240 100
51238
+ 51241 100
51239
+ 51242 100
51240
+ 51243 100
51241
+ 51244 100
51242
+ 51245 100
51243
+ 51246 100
51244
+ 51247 100
51245
+ 51248 100
51246
+ 51249 100
51247
+ 51250 100
51248
+ 51251 100
51249
+ 51252 100
51250
+ 51253 100
51251
+ 51254 100
51252
+ 51255 100
51253
+ 51256 100
51254
+ 51257 100
51255
+ 51258 100
51256
+ 51259 100
51257
+ 51260 100
51258
+ 51261 100
51259
+ 51262 100
51260
+ 51263 100
51261
+ 51264 100
51262
+ 51265 100
51263
+ 51266 100
51264
+ 51267 100
51265
+ 51268 100
51266
+ 51269 100
51267
+ 51270 100
51268
+ 51271 100
51269
+ 51272 100
51270
+ 51273 100
51271
+ 51274 100
51272
+ 51275 100
51273
+ 51276 100
51274
+ 51277 100
51275
+ 51278 100
51276
+ 51279 100
51277
+ 51280 100
51278
+ 51281 100
51279
+ 51282 100
51280
+ 51283 100
51281
+ 51284 100
51282
+ 51285 100
51283
+ 51286 100
51284
+ 51287 100
51285
+ 51288 100
51286
+ 51289 100
51287
+ 51290 100
51288
+ 51291 100
51289
+ 51292 100
51290
+ 51293 100
51291
+ 51294 100
51292
+ 51295 100
51293
+ 51296 100
51294
+ 51297 100
51295
+ 51298 100
51296
+ 51299 100
51297
+ 51300 100
51298
+ 51301 100
51299
+ 51302 100
51300
+ 51303 100
51301
+ 51304 100
51302
+ 51305 100
51303
+ 51306 100
51304
+ 51307 100
51305
+ 51308 100
51306
+ 51309 100
51307
+ 51310 100
51308
+ 51311 100
51309
+ 51312 100
51310
+ 51313 100
51311
+ 51314 100
51312
+ 51315 100
51313
+ 51316 100
51314
+ 51317 100
51315
+ 51318 100
51316
+ 51319 100
51317
+ 51320 100
51318
+ 51321 100
51319
+ 51322 100
51320
+ 51323 100
51321
+ 51324 100
51322
+ 51325 100
51323
+ 51326 100
51324
+ 51327 100
51325
+ 51328 100
51326
+ 51329 100
51327
+ 51330 100
51328
+ 51331 100
51329
+ 51332 100
51330
+ 51333 100
51331
+ 51334 100
51332
+ 51335 100
51333
+ 51336 100
51334
+ 51337 100
51335
+ 51338 100
51336
+ 51339 100
51337
+ 51340 100
51338
+ 51341 100
51339
+ 51342 100
51340
+ 51343 100
51341
+ 51344 100
51342
+ 51345 100
51343
+ 51346 100
51344
+ 51347 100
51345
+ 51348 100
51346
+ 51349 100
51347
+ 51350 100
51348
+ 51351 100
51349
+ 51352 100
51350
+ 51353 100
51351
+ 51354 100
51352
+ 51355 100
51353
+ 51356 100
51354
+ 51357 100
51355
+ 51358 100
51356
+ 51359 100
51357
+ 51360 100
51358
+ 51361 100
51359
+ 51362 100
51360
+ 51363 100
51361
+ 51364 100
51362
+ 51365 100
51363
+ 51366 100
51364
+ 51367 100
51365
+ 51368 100
51366
+ 51369 100
51367
+ 51370 100
51368
+ 51371 100
51369
+ 51372 100
51370
+ 51373 100
51371
+ 51374 100
51372
+ 51375 100
51373
+ 51376 100
51374
+ 51377 100
51375
+ 51378 100
51376
+ 51379 100
51377
+ 51380 100
51378
+ 51381 100
51379
+ 51382 100
51380
+ 51383 100
51381
+ 51384 100
51382
+ 51385 100
51383
+ 51386 100
51384
+ 51387 100
51385
+ 51388 100
51386
+ 51389 100
51387
+ 51390 100
51388
+ 51391 100
51389
+ 51392 100
51390
+ 51393 100
51391
+ 51394 100
51392
+ 51395 100
51393
+ 51396 100
51394
+ 51397 100
51395
+ 51398 100
51396
+ 51399 100
51397
+ 51400 100
51398
+ 51401 100
51399
+ 51402 100
51400
+ 51403 100
51401
+ 51404 100
51402
+ 51405 100
51403
+ 51406 100
51404
+ 51407 100
51405
+ 51408 100
51406
+ 51409 100
51407
+ 51410 100
51408
+ 51411 100
51409
+ 51412 100
51410
+ 51413 100
51411
+ 51414 100
51412
+ 51415 100
51413
+ 51416 100
51414
+ 51417 100
51415
+ 51418 100
51416
+ 51419 100
51417
+ 51420 100
51418
+ 51421 100
51419
+ 51422 100
51420
+ 51423 100
51421
+ 51424 100
51422
+ 51425 100
51423
+ 51426 100
51424
+ 51427 100
51425
+ 51428 100
51426
+ 51429 100
51427
+ 51430 100
51428
+ 51431 100
51429
+ 51432 100
51430
+ 51433 100
51431
+ 51434 100
51432
+ 51435 100
51433
+ 51436 100
51434
+ 51437 100
51435
+ 51438 100
51436
+ 51439 100
51437
+ 51440 100
51438
+ 51441 100
51439
+ 51442 100
51440
+ 51443 100
51441
+ 51444 100
51442
+ 51445 100
51443
+ 51446 100
51444
+ 51447 100
51445
+ 51448 100
51446
+ 51449 100
51447
+ 51450 100
51448
+ 51451 100
51449
+ 51452 100
51450
+ 51453 100
51451
+ 51454 100
51452
+ 51455 100
51453
+ 51456 100
51454
+ 51457 100
51455
+ 51458 100
51456
+ 51459 100
51457
+ 51460 100
51458
+ 51461 100
51459
+ 51462 100
51460
+ 51463 100
51461
+ 51464 100
51462
+ 51465 100
51463
+ 51466 100
51464
+ 51467 100
51465
+ 51468 100
51466
+ 51469 100
51467
+ 51470 100
51468
+ 51471 100
51469
+ 51472 100
51470
+ 51473 100
51471
+ 51474 100
51472
+ 51475 100
51473
+ 51476 100
51474
+ 51477 100
51475
+ 51478 100
51476
+ 51479 100
51477
+ 51480 100
51478
+ 51481 100
51479
+ 51482 100
51480
+ 51483 100
51481
+ 51484 100
51482
+ 51485 100
51483
+ 51486 100
51484
+ 51487 100
51485
+ 51488 100
51486
+ 51489 100
51487
+ 51490 100
51488
+ 51491 100
51489
+ 51492 100
51490
+ 51493 100
51491
+ 51494 100
51492
+ 51495 100
51493
+ 51496 100
51494
+ 51497 100
51495
+ 51498 100
51496
+ 51499 100
51497
+ 51500 100
51498
+ 51501 100
51499
+ 51502 100
51500
+ 51503 100
51501
+ 51504 100
51502
+ 51505 100
51503
+ 51506 100
51504
+ 51507 100
51505
+ 51508 100
51506
+ 51509 100
51507
+ 51510 100
51508
+ 51511 100
51509
+ 51512 100
51510
+ 51513 100
51511
+ 51514 100
51512
+ 51515 100
51513
+ 51516 100
51514
+ 51517 100
51515
+ 51518 100
51516
+ 51519 100
51517
+ 51520 100
51518
+ 51521 100
51519
+ 51522 100
51520
+ 51523 100
51521
+ 51524 100
51522
+ 51525 100
51523
+ 51526 100
51524
+ 51527 100
51525
+ 51528 100
51526
+ 51529 100
51527
+ 51530 100
51528
+ 51531 100
51529
+ 51532 100
51530
+ 51533 100
51531
+ 51534 100
51532
+ 51535 100
51533
+ 51536 100
51534
+ 51537 100
51535
+ 51538 100
51536
+ 51539 100
51537
+ 51540 100
51538
+ 51541 100
51539
+ 51542 100
51540
+ 51543 100
51541
+ 51544 100
51542
+ 51545 100
51543
+ 51546 100
51544
+ 51547 100
51545
+ 51548 100
51546
+ 51549 100
51547
+ 51550 100
51548
+ 51551 100
51549
+ 51552 100
51550
+ 51553 100
51551
+ 51554 100
51552
+ 51555 100
51553
+ 51556 100
51554
+ 51557 100
51555
+ 51558 100
51556
+ 51559 100
51557
+ 51560 100
51558
+ 51561 100
51559
+ 51562 100
51560
+ 51563 100
51561
+ 51564 100
51562
+ 51565 100
51563
+ 51566 100
51564
+ 51567 100
51565
+ 51568 100
51566
+ 51569 100
51567
+ 51570 100
51568
+ 51571 100
51569
+ 51572 100
51570
+ 51573 100
51571
+ 51574 100
51572
+ 51575 100
51573
+ 51576 100
51574
+ 51577 100
51575
+ 51578 100
51576
+ 51579 100
51577
+ 51580 100
51578
+ 51581 100
51579
+ 51582 100
51580
+ 51583 100
51581
+ 51584 100
51582
+ 51585 100
51583
+ 51586 100
51584
+ 51587 100
51585
+ 51588 100
51586
+ 51589 100
51587
+ 51590 100
51588
+ 51591 100
51589
+ 51592 100
51590
+ 51593 100
51591
+ 51594 100
51592
+ 51595 100
51593
+ 51596 100
51594
+ 51597 100
51595
+ 51598 100
51596
+ 51599 100
51597
+ 51600 100
51598
+ 51601 100
51599
+ 51602 100
51600
+ 51603 100
51601
+ 51604 100
51602
+ 51605 100
51603
+ 51606 100
51604
+ 51607 100
51605
+ 51608 100
51606
+ 51609 100
51607
+ 51610 100
51608
+ 51611 100
51609
+ 51612 100
51610
+ 51613 100
51611
+ 51614 100
51612
+ 51615 100
51613
+ 51616 100
51614
+ 51617 100
51615
+ 51618 100
51616
+ 51619 100
51617
+ 51620 100
51618
+ 51621 100
51619
+ 51622 100
51620
+ 51623 100
51621
+ 51624 100
51622
+ 51625 100
51623
+ 51626 100
51624
+ 51627 100
51625
+ 51628 100
51626
+ 51629 100
51627
+ 51630 100
51628
+ 51631 100
51629
+ 51632 100
51630
+ 51633 100
51631
+ 51634 100
51632
+ 51635 100
51633
+ 51636 100
51634
+ 51637 100
51635
+ 51638 100
51636
+ 51639 100
51637
+ 51640 100
51638
+ 51641 100
51639
+ 51642 100
51640
+ 51643 100
51641
+ 51644 100
51642
+ 51645 100
51643
+ 51646 100
51644
+ 51647 100
51645
+ 51648 100
51646
+ 51649 100
51647
+ 51650 100
51648
+ 51651 100
51649
+ 51652 100
51650
+ 51653 100
51651
+ 51654 100
51652
+ 51655 100
51653
+ 51656 100
51654
+ 51657 100
51655
+ 51658 100
51656
+ 51659 100
51657
+ 51660 100
51658
+ 51661 100
51659
+ 51662 100
51660
+ 51663 100
51661
+ 51664 100
51662
+ 51665 100
51663
+ 51666 100
51664
+ 51667 100
51665
+ 51668 100
51666
+ 51669 100
51667
+ 51670 100
51668
+ 51671 100
51669
+ 51672 100
51670
+ 51673 100
51671
+ 51674 100
51672
+ 51675 100
51673
+ 51676 100
51674
+ 51677 100
51675
+ 51678 100
51676
+ 51679 100
51677
+ 51680 100
51678
+ 51681 100
51679
+ 51682 100
51680
+ 51683 100
51681
+ 51684 100
51682
+ 51685 100
51683
+ 51686 100
51684
+ 51687 100
51685
+ 51688 100
51686
+ 51689 100
51687
+ 51690 100
51688
+ 51691 100
51689
+ 51692 100
51690
+ 51693 100
51691
+ 51694 100
51692
+ 51695 100
51693
+ 51696 100
51694
+ 51697 100
51695
+ 51698 100
51696
+ 51699 100
51697
+ 51700 100
51698
+ 51701 100
51699
+ 51702 100
51700
+ 51703 100
51701
+ 51704 100
51702
+ 51705 100
51703
+ 51706 100
51704
+ 51707 100
51705
+ 51708 100
51706
+ 51709 100
51707
+ 51710 100
51708
+ 51711 100
51709
+ 51712 100
51710
+ 51713 100
51711
+ 51714 100
51712
+ 51715 100
51713
+ 51716 100
51714
+ 51717 100
51715
+ 51718 100
51716
+ 51719 100
51717
+ 51720 100
51718
+ 51721 100
51719
+ 51722 100
51720
+ 51723 100
51721
+ 51724 100
51722
+ 51725 100
51723
+ 51726 100
51724
+ 51727 100
51725
+ 51728 100
51726
+ 51729 100
51727
+ 51730 100
51728
+ 51731 100
51729
+ 51732 100
51730
+ 51733 100
51731
+ 51734 100
51732
+ 51735 100
51733
+ 51736 100
51734
+ 51737 100
51735
+ 51738 100
51736
+ 51739 100
51737
+ 51740 100
51738
+ 51741 100
51739
+ 51742 100
51740
+ 51743 100
51741
+ 51744 100
51742
+ 51745 100
51743
+ 51746 100
51744
+ 51747 100
51745
+ 51748 100
51746
+ 51749 100
51747
+ 51750 100
51748
+ 51751 100
51749
+ 51752 100
51750
+ 51753 100
51751
+ 51754 100
51752
+ 51755 100
51753
+ 51756 100
51754
+ 51757 100
51755
+ 51758 100
51756
+ 51759 100
51757
+ 51760 100
51758
+ 51761 100
51759
+ 51762 100
51760
+ 51763 100
51761
+ 51764 100
51762
+ 51765 100
51763
+ 51766 100
51764
+ 51767 100
51765
+ 51768 100
51766
+ 51769 100
51767
+ 51770 100
51768
+ 51771 100
51769
+ 51772 100
51770
+ 51773 100
51771
+ 51774 100
51772
+ 51775 100
51773
+ 51776 100
51774
+ 51777 100
51775
+ 51778 100
51776
+ 51779 100
51777
+ 51780 100
51778
+ 51781 100
51779
+ 51782 100
51780
+ 51783 100
51781
+ 51784 100
51782
+ 51785 100
51783
+ 51786 100
51784
+ 51787 100
51785
+ 51788 100
51786
+ 51789 100
51787
+ 51790 100
51788
+ 51791 100
51789
+ 51792 100
51790
+ 51793 100
51791
+ 51794 100
51792
+ 51795 100
51793
+ 51796 100
51794
+ 51797 100
51795
+ 51798 100
51796
+ 51799 100
51797
+ 51800 100
51798
+ 51801 100
51799
+ 51802 100
51800
+ 51803 100
51801
+ 51804 100
51802
+ 51805 100
51803
+ 51806 100
51804
+ 51807 100
51805
+ 51808 100
51806
+ 51809 100
51807
+ 51810 100
51808
+ 51811 100
51809
+ 51812 100
51810
+ 51813 100
51811
+ 51814 100
51812
+ 51815 100
51813
+ 51816 100
51814
+ 51817 100
51815
+ 51818 100
51816
+ 51819 100
51817
+ 51820 100
51818
+ 51821 100
51819
+ 51822 100
51820
+ 51823 100
51821
+ 51824 100
51822
+ 51825 100
51823
+ 51826 100
51824
+ 51827 100
51825
+ 51828 100
51826
+ 51829 100
51827
+ 51830 100
51828
+ 51831 100
51829
+ 51832 100
51830
+ 51833 100
51831
+ 51834 100
51832
+ 51835 100
51833
+ 51836 100
51834
+ 51837 100
51835
+ 51838 100
51836
+ 51839 100
51837
+ 51840 100
51838
+ 51841 100
51839
+ 51842 100
51840
+ 51843 100
51841
+ 51844 100
51842
+ 51845 100
51843
+ 51846 100
51844
+ 51847 100
51845
+ 51848 100
51846
+ 51849 100
51847
+ 51850 100
51848
+ 51851 100
51849
+ 51852 100
51850
+ 51853 100
51851
+ 51854 100
51852
+ 51855 100
51853
+ 51856 100
51854
+ 51857 100
51855
+ 51858 100
51856
+ 51859 100
51857
+ 51860 100
51858
+ 51861 100
51859
+ 51862 100
51860
+ 51863 100
51861
+ 51864 100
51862
+ 51865 100
51863
+ 51866 100
51864
+ 51867 100
51865
+ 51868 100
51866
+ 51869 100
51867
+ 51870 100
51868
+ 51871 100
51869
+ 51872 100
51870
+ 51873 100
51871
+ 51874 100
51872
+ 51875 100
51873
+ 51876 100
51874
+ 51877 100
51875
+ 51878 100
51876
+ 51879 100
51877
+ 51880 100
51878
+ 51881 100
51879
+ 51882 100
51880
+ 51883 100
51881
+ 51884 100
51882
+ 51885 100
51883
+ 51886 100
51884
+ 51887 100
51885
+ 51888 100
51886
+ 51889 100
51887
+ 51890 100
51888
+ 51891 100
51889
+ 51892 100
51890
+ 51893 100
51891
+ 51894 100
51892
+ 51895 100
51893
+ 51896 100
51894
+ 51897 100
51895
+ 51898 100
51896
+ 51899 100
51897
+ 51900 100
51898
+ 51901 100
51899
+ 51902 100
51900
+ 51903 100
51901
+ 51904 100
51902
+ 51905 100
51903
+ 51906 100
51904
+ 51907 100
51905
+ 51908 100
51906
+ 51909 100
51907
+ 51910 100
51908
+ 51911 100
51909
+ 51912 100
51910
+ 51913 100
51911
+ 51914 100
51912
+ 51915 100
51913
+ 51916 100
51914
+ 51917 100
51915
+ 51918 100
51916
+ 51919 100
51917
+ 51920 100
51918
+ 51921 100
51919
+ 51922 100
51920
+ 51923 100
51921
+ 51924 100
51922
+ 51925 100
51923
+ 51926 100
51924
+ 51927 100
51925
+ 51928 100
51926
+ 51929 100
51927
+ 51930 100
51928
+ 51931 100
51929
+ 51932 100
51930
+ 51933 100
51931
+ 51934 100
51932
+ 51935 100
51933
+ 51936 100
51934
+ 51937 100
51935
+ 51938 100
51936
+ 51939 100
51937
+ 51940 100
51938
+ 51941 100
51939
+ 51942 100
51940
+ 51943 100
51941
+ 51944 100
51942
+ 51945 100
51943
+ 51946 100
51944
+ 51947 100
51945
+ 51948 100
51946
+ 51949 100
51947
+ 51950 100
51948
+ 51951 100
51949
+ 51952 100
51950
+ 51953 100
51951
+ 51954 100
51952
+ 51955 100
51953
+ 51956 100
51954
+ 51957 100
51955
+ 51958 100
51956
+ 51959 100
51957
+ 51960 100
51958
+ 51961 100
51959
+ 51962 100
51960
+ 51963 100
51961
+ 51964 100
51962
+ 51965 100
51963
+ 51966 100
51964
+ 51967 100
51965
+ 51968 100
51966
+ 51969 100
51967
+ 51970 100
51968
+ 51971 100
51969
+ 51972 100
51970
+ 51973 100
51971
+ 51974 100
51972
+ 51975 100
51973
+ 51976 100
51974
+ 51977 100
51975
+ 51978 100
51976
+ 51979 100
51977
+ 51980 100
51978
+ 51981 100
51979
+ 51982 100
51980
+ 51983 100
51981
+ 51984 100
51982
+ 51985 100
51983
+ 51986 100
51984
+ 51987 100
51985
+ 51988 100
51986
+ 51989 100
51987
+ 51990 100
51988
+ 51991 100
51989
+ 51992 100
51990
+ 51993 100
51991
+ 51994 100
51992
+ 51995 100
51993
+ 51996 100
51994
+ 51997 100
51995
+ 51998 100
merges.txt CHANGED
The diff for this file is too large to render.
 
process.log CHANGED
@@ -1,8 +1 @@
1
- Executing train_tokenizer.py
2
- ------------------------------
3
- training bbpe tokenizer
4
- Initialize an empty tokenizer
5
- training
6
- saving model tokenizer to /home/shared/dt01/temutauro/ccasimiro/corpus-utils-lm/output/model-ready_output/biomedical-vocab-50262-2021-12-09-1207-d1d3-e42b/train_tokenizer_output/train-tokenizer-2021-12-09-1223-d1d3-4dfc
7
- saving pretrained to /home/shared/dt01/temutauro/ccasimiro/corpus-utils-lm/output/model-ready_output/biomedical-vocab-50262-2021-12-09-1207-d1d3-e42b/train_tokenizer_output/train-tokenizer-2021-12-09-1223-d1d3-4dfc
8
- saving config to /home/shared/dt01/temutauro/ccasimiro/corpus-utils-lm/output/model-ready_output/biomedical-vocab-50262-2021-12-09-1207-d1d3-e42b/train_tokenizer_output/train-tokenizer-2021-12-09-1223-d1d3-4dfc
 
1
+ INFO:root:Function "train_tokenizer" took 306.3926444167737 seconds to complete.
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:7d2855202fb3571f8eb79f9cb08005f80709a5b7b6e5e64e9926a0485ca7d95f
3
- size 499067539
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d720b1dddaef37080df8761bea199e3a307cd86cdd261fe3430a674579118f21
3
+ size 504420627
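
The `pytorch_model.bin` entry above is a Git LFS pointer, not the weights themselves: it records only the spec version, a SHA-256 `oid`, and the byte `size` of the real file. A minimal sketch of how a downloaded checkpoint could be checked against such a pointer is shown below. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and chosen for illustration; they are not part of this repository or of any Git LFS tooling.

```python
import hashlib
import os

# New-side pointer text, copied from the pytorch_model.bin diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d720b1dddaef37080df8761bea199e3a307cd86cdd261fe3430a674579118f21
size 504420627
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest recorded in `pointer`."""
    # Cheap check first: a size mismatch means the digest cannot match either.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as fh:
        # Read in chunks so a checkpoint of several hundred MB is never held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

Assuming the weights have been fetched to a local `pytorch_model.bin`, calling `matches_pointer("pytorch_model.bin", parse_lfs_pointer(NEW_POINTER))` would indicate whether the local file corresponds to the rolled-back version recorded in this commit.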
tokenizer_config.json CHANGED
@@ -1 +1 @@
1
- {"unk_token": {"content": "<unk>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "bos_token": {"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "eos_token": {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "add_prefix_space": true, "errors": "replace", "sep_token": {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "cls_token": {"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "pad_token": {"content": "<pad>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "max_len": 512, "special_tokens_map_file": null, "name_or_path": "/home/shared/dt01/temutauro/ccasimiro/corpus-utils-lm/output/model-ready_output/biomedical-vocab-50262-2021-12-09-1207-d1d3-e42b/train_tokenizer_output/train-tokenizer-2021-12-09-1223-d1d3-4dfc"}
 
1
+ {"unk_token": {"content": "<unk>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "bos_token": {"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "eos_token": {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "add_prefix_space": true, "errors": "replace", "sep_token": {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "cls_token": {"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "pad_token": {"content": "<pad>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "max_len": 512, "special_tokens_map_file": null, "name_or_path": "/gpfs/projects/bsc88/corpus-utils-lm/23-12-2020-72f8c7e/output/model-ready_output/2020-12-23-1900-daf4-ab38/train_tokenizer_output/2020-12-23-1913-daf4-ed9c"}
vocab.json CHANGED
The diff for this file is too large to render.