---
language:
- en
tags:
- Clinical notes
- Discharge summaries
- RoBERTa
license: "cc-by-4.0"
datasets:
- MIMIC-III
---
* This model continues pre-training RoBERTa-base on discharge summaries from the MIMIC-III dataset (a minimal usage example is given at the end of this card).

* Details can be found in the following paper:

> Xiang Dai, Ilias Chalkidis, Sune Darkner, and Desmond Elliott. 2022. Revisiting Transformer-based Models for Long Document Classification. (https://arxiv.org/abs/2204.06683)
* Important hyper-parameters (a configuration sketch follows the table):

| Hyper-parameter | Value |
|---|---|
| Max sequence length | 128 tokens |
| Batch size | 128 |
| Learning rate | 5e-5 |
| Training epochs | 15 |
| Training time | 40 GPU-hours |
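
This card does not include the original training script, so the following is a minimal sketch of how the hyper-parameters above could map onto a Hugging Face `Trainer` masked-language-modeling run. The dataset loading, the output path, and the 15% masking probability (RoBERTa's default) are assumptions, not details from the paper, and MIMIC-III itself requires credentialed access.

```python
# A minimal sketch, not the paper's actual training code: it maps the
# hyper-parameters in the table above onto a Hugging Face Trainer MLM run.
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

def tokenize(batch):
    # Max sequence length of 128 tokens, per the table above.
    return tokenizer(batch["text"], truncation=True, max_length=128)

# Placeholder: MIMIC-III requires credentialed access; discharge summaries
# would be extracted separately and tokenized with `tokenize` above.
# dataset = raw_dataset.map(tokenize, batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="roberta-base-mimic-dsum",  # hypothetical output path
    per_device_train_batch_size=128,       # batch size 128
    learning_rate=5e-5,                    # learning rate 5e-5
    num_train_epochs=15,                   # 15 training epochs
)

# 15% masking is RoBERTa's default; the card does not state the value used.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=args,
    data_collator=collator,
    # train_dataset=dataset,  # the tokenized discharge summaries
)
# trainer.train()
```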
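
To load the resulting checkpoint, treat it as a standard RoBERTa model. The model ID below is a placeholder, since the Hub path is not stated in this section.

```python
# Loading sketch; "your-namespace/roberta-base-mimic" is a placeholder,
# not the confirmed Hub path for this checkpoint.
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

model_id = "your-namespace/roberta-base-mimic"  # hypothetical model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Quick sanity check with RoBERTa's <mask> token on a clinical-style sentence.
fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill("The patient was discharged home in <mask> condition."))
```

Note that with a maximum sequence length of 128 tokens, longer discharge summaries need to be chunked before encoding.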