whaleloops/KEPTlongformer-PMM3

license: cc-by-nc-4.0

KEPTlongformer, pretrained using contrastive learning.

The model is first initialized from RoBERTa-base-PM-M3-Voc-distill from bio-lm.

It is then pretrained with Hierarchical Self-Alignment Pretrain (HSAP) using the UMLS knowledge graph. HSAP covers three kinds of relations: (a) hierarchy, (b) synonym, and (c) abbreviation. For more information, see Section 3.3 of the paper.
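To give a rough feel for what contrastive pretraining on knowledge-graph term pairs optimizes, here is a minimal sketch of a generic InfoNCE-style loss in plain Python. It is an illustration only, not the HSAP objective from the paper: the function names, the temperature value, and the toy 2-d "embeddings" are all assumptions made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: low when the anchor is closer to the
    positive than to the negatives. The temperature is an assumed value."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Hypothetical toy embeddings (not produced by the model).
term = [1.0, 0.0]        # e.g. a clinical term
synonym = [0.9, 0.1]     # e.g. a synonym of that term in the knowledge graph
unrelated = [0.0, 1.0]   # e.g. an unrelated term

loss_aligned = info_nce(term, synonym, [unrelated])
loss_misaligned = info_nce(term, unrelated, [synonym])
print(f"aligned: {loss_aligned:.4f}  misaligned: {loss_misaligned:.4f}")
```

Minimizing a loss of this shape pulls related terms together in embedding space and pushes unrelated terms apart. How the paper builds its pairs from hierarchy, synonym, and abbreviation relations is described in its Section 3.3.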

See here for how to use this model for automatic ICD coding.

On automatic ICD coding, the model obtains the following results:

| Metric | Score |
| --- | --- |
| rec_micro | 0.5844294992252652 |
| rec_macro | 0.12471916602840005 |
| rec_at_8 | 0.4138093882408751 |
| rec_at_75 | 0.8581874197033126 |
| rec_at_50 | 0.8109877644497351 |
| rec_at_5 | 0.2923155353947738 |
| rec_at_15 | 0.586890060777621 |
| prec_micro | 0.6537291416981642 |
| prec_macro | 0.1382069689951297 |
| prec_at_8 | 0.7835112692763938 |
| prec_at_75 | 0.20033214709371291 |
| prec_at_50 | 0.2810260972716489 |
| prec_at_5 | 0.8551008303677343 |
| prec_at_15 | 0.6288256227758008 |
| f1_micro | 0.6171399726721254 |
| f1_macro | 0.13111711325953157 |
| f1_at_8 | 0.54158310388029 |
| f1_at_75 | 0.324835806140454 |
| f1_at_50 | 0.4174099512237087 |
| f1_at_5 | 0.4356905906241822 |
| f1_at_15 | 0.6071345676658747 |
| auc_micro | 0.9653561390964384 |
| auc_macro | 0.8572490224880879 |
| acc_micro | 0.4462779749767132 |
| acc_macro | 0.09732882850157536 |
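
Judging from the numbers alone, each F1 row above appears to be the harmonic mean of the precision and recall rows with the same suffix, including `f1_macro`, which would then be computed from the macro-averaged precision and recall rather than averaged per label. The short check below uses only values copied from the results; the dictionary layout and the helper name `harmonic_f1` are assumptions for illustration, and this is not the evaluation script itself.

```python
# Precision, recall, and F1 values copied verbatim from the reported results.
precision = {
    "micro": 0.6537291416981642, "macro": 0.1382069689951297,
    "at_5": 0.8551008303677343, "at_8": 0.7835112692763938,
    "at_15": 0.6288256227758008, "at_50": 0.2810260972716489,
    "at_75": 0.20033214709371291,
}
recall = {
    "micro": 0.5844294992252652, "macro": 0.12471916602840005,
    "at_5": 0.2923155353947738, "at_8": 0.4138093882408751,
    "at_15": 0.586890060777621, "at_50": 0.8109877644497351,
    "at_75": 0.8581874197033126,
}
reported_f1 = {
    "micro": 0.6171399726721254, "macro": 0.13111711325953157,
    "at_5": 0.4356905906241822, "at_8": 0.54158310388029,
    "at_15": 0.6071345676658747, "at_50": 0.4174099512237087,
    "at_75": 0.324835806140454,
}

def harmonic_f1(p, r):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Absolute difference between the recomputed and the reported F1, per suffix.
gaps = {k: abs(harmonic_f1(precision[k], recall[k]) - reported_f1[k])
        for k in reported_f1}
max_gap = max(gaps.values())
print(f"largest gap between recomputed and reported F1: {max_gap:.2e}")
```

The table also shows the usual trade-off as k grows: precision at k falls while recall at k rises.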