metadata
license: cc-by-nc-4.0
KEPTlongformer is pretrained using contrastive learning.
First, it is initialized from RoBERTa-base-PM-M3-Voc-distill from bio-lm.
It is then pretrained with Hierarchical Self-Alignment Pretraining (HSAP) using the UMLS knowledge graph. HSAP covers (a) hierarchy, (b) synonyms, and (c) abbreviations. For more information, see Section 3.3 of the paper.
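To make the contrastive objective concrete, here is a minimal sketch of a generic InfoNCE-style loss in plain Python. It is not the authors' HSAP implementation: the function names, the temperature value, and the toy 2-dimensional vectors are all assumptions for illustration. The idea it shows is that an anchor term's embedding is pulled toward a positive (for example a synonym or abbreviation of the same concept) and pushed away from negatives.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: -log softmax of the positive among all candidates.

    Hypothetical helper for illustration only; the temperature is an assumed value.
    """
    pos_logit = cosine(anchor, positive) / temperature
    logits = [pos_logit] + [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - pos_logit


# Toy embeddings (assumed): the anchor matches its positive in the first case
# and matches the negative instead in the second case.
aligned = info_nce_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
misaligned = info_nce_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(aligned < misaligned)
```

The loss is small when the anchor is closest to its positive and large when a negative is closer, which is the behavior a contrastive pretraining step optimizes for.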
See here for how to use this model for automatic ICD coding.
It achieves the following results:
| Metric | Score |
|---|---|
| rec_micro | 0.5844294992252652 |
| rec_macro | 0.12471916602840005 |
| rec_at_8 | 0.4138093882408751 |
| rec_at_75 | 0.8581874197033126 |
| rec_at_50 | 0.8109877644497351 |
| rec_at_5 | 0.2923155353947738 |
| rec_at_15 | 0.586890060777621 |
| prec_micro | 0.6537291416981642 |
| prec_macro | 0.1382069689951297 |
| prec_at_8 | 0.7835112692763938 |
| prec_at_75 | 0.20033214709371291 |
| prec_at_50 | 0.2810260972716489 |
| prec_at_5 | 0.8551008303677343 |
| prec_at_15 | 0.6288256227758008 |
| f1_micro | 0.6171399726721254 |
| f1_macro | 0.13111711325953157 |
| f1_at_8 | 0.54158310388029 |
| f1_at_75 | 0.324835806140454 |
| f1_at_50 | 0.4174099512237087 |
| f1_at_5 | 0.4356905906241822 |
| f1_at_15 | 0.6071345676658747 |
| auc_micro | 0.9653561390964384 |
| auc_macro | 0.8572490224880879 |
| acc_micro | 0.4462779749767132 |
| acc_macro | 0.09732882850157536 |
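
As a reading aid for the table, the F1 rows appear consistent with the harmonic mean of the matching precision and recall rows (including `f1_macro`, which matches the harmonic mean of `prec_macro` and `rec_macro`). The sketch below checks this for three rows; the helper name `f1_score` is an assumption for illustration and is not taken from the model's evaluation code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {
    "micro": (0.6537291416981642, 0.5844294992252652, 0.6171399726721254),
    "macro": (0.1382069689951297, 0.12471916602840005, 0.13111711325953157),
    "at_8": (0.7835112692763938, 0.4138093882408751, 0.54158310388029),
}

for name, (precision, recall, f1) in reported.items():
    # Print the recomputed value next to the reported one for comparison.
    print(name, round(f1_score(precision, recall), 4), round(f1, 4))
```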