---
language: en
license: cc-by-4.0
tags:
- clinical notes
- discharge summaries
- longformer
datasets:
- MIMIC-III
---
* This model continues pre-training RoBERTa-base on discharge summaries from the MIMIC-III dataset (see the loading sketch below).
* Details can be found in the following paper:
> Xiang Dai, Ilias Chalkidis, Sune Darkner, and Desmond Elliott. 2022. Revisiting Transformer-based Models for Long Document Classification. (https://arxiv.org/abs/2204.06683)
* Important hyper-parameters (a training-configuration sketch follows the table):
| Hyper-parameter | Value |
|---|---|
| Max sequence length | 4096 |
| Batch size | 8 |
| Learning rate | 5e-5 |
| Training epochs | 6 |
| Training time | 130 GPU-hours |
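
To load the released checkpoint, the standard `transformers` pattern should apply. This is a minimal sketch: the model id below is a placeholder, so substitute the actual id of this repository.

```python
# Minimal loading sketch; "your-org/mimic-roberta-base" is a placeholder
# for this repository's actual model id.
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "your-org/mimic-roberta-base"  # placeholder id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

text = "The patient was discharged home in stable condition."
inputs = tokenizer(text, return_tensors="pt")
logits = model(**inputs).logits
print(logits.shape)  # (batch_size, sequence_length, vocab_size)
```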
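
The hyper-parameters above map naturally onto a standard masked-language-modelling continued pre-training setup. The sketch below is illustrative only, assuming the Hugging Face `Trainer` API; the output directory, masking rate, and dataset handling are assumptions, not the authors' actual training script, and no data loader is shown because MIMIC-III access requires a data use agreement.

```python
# Hedged sketch of continued MLM pre-training using the hyper-parameters
# from the table above; identifiers and the masking rate are assumptions.
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

# Standard 15% masking rate (an assumption; not stated in this card).
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="mimic-roberta-continued",  # placeholder output path
    per_device_train_batch_size=8,         # batch size from the table above
    learning_rate=5e-5,                    # learning rate from the table above
    num_train_epochs=6,                    # training epochs from the table above
)

# `train_dataset` would be MIMIC-III discharge summaries tokenized to the
# 4096-token maximum sequence length listed above.
# trainer = Trainer(model=model, args=args, data_collator=collator,
#                   train_dataset=train_dataset)
# trainer.train()
```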