---
license: cc-by-2.0
datasets:
- allenai/s2orc
language:
- en
pipeline_tag: token-classification
---
Another name for this model is SciDeBERTa v2 [1].
This model is trained from scratch on the S2ORC dataset (260 GB), which includes the abstracts and body text of papers, using the DeBERTa v2 architecture.
It achieves state-of-the-art performance on NER on the SciERC dataset.
Starting from this model, MediBioDeBERTa was continually trained from SciDeBERTa v2 on domain data (bio, medical, and chemistry), with additional intermediate fine-tuning for specific BLURB benchmark tasks, and achieves rank 11 on the BLURB benchmark.

[1] Eunhui Kim, Yuna Jeong, Myung-seok Choi, "MediBioDeBERTa: Biomedical Language Model with Continuous Learning and Intermediate Fine-Tuning," IEEE Access, Dec. 2023.