---
language:
- English
tags:
- Clinical notes
- Discharge summaries
- RoBERTa
license: "cc-by-4.0"
datasets:
- MIMIC-III
---

* Continued pre-training of RoBERTa-base using discharge summaries from the MIMIC-III dataset.
* Details can be found in the following paper:

> Xiang Dai, Ilias Chalkidis, Sune Darkner, and Desmond Elliott. 2022. Revisiting Transformer-based Models for Long Document Classification. (https://arxiv.org/abs/2204.06683)

* Important hyper-parameters:

| Hyper-parameter | Value |
|---|---|
| Max sequence length | 128 |
| Batch size | 128 |
| Learning rate | 5e-5 |
| Training epochs | 15 |
| Training time | 40 GPU-hours |
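
* Usage

A minimal loading sketch with the Hugging Face `transformers` library. The model ID below is a placeholder (the actual repository name of this checkpoint should be substituted); the masked-language-modeling head and the 128-token limit reflect how the model was pre-trained.

```python
# Minimal sketch: loading the continued-pre-trained RoBERTa checkpoint.
# "this-model-id" is a placeholder, not the real repository name.
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "this-model-id"  # replace with the actual model repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Encode a short clinical note; pre-training used a maximum sequence
# length of 128 tokens, so truncate to that length here as well.
inputs = tokenizer(
    "Patient was discharged home in stable condition.",
    truncation=True,
    max_length=128,
    return_tensors="pt",
)
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch_size, sequence_length, vocab_size)
```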