unknown committed on
Commit
aabd7e5
2 Parent(s): 1283c90 0edf4e2

Merge branch 'main' of https://huggingface.co/m3rg-iitd/matscibert

  1. README.md +25 -0
README.md ADDED
@@ -0,0 +1,25 @@
+ # MatSciBERT
+ ## A Materials Domain Language Model for Text Mining and Information Extraction
+
+ This is the pretrained model presented in [MatSciBERT: A Materials Domain Language Model for Text Mining and Information Extraction](https://arxiv.org/abs/2109.15290), a BERT model trained on materials science research papers.
+
+ The training corpus comprises papers on a broad range of materials: alloys, glasses, metallic glasses, cement, and concrete. We used the abstracts and, when available, the full text of the papers. All research papers were downloaded from [ScienceDirect](https://www.sciencedirect.com/) using the [Elsevier API](https://dev.elsevier.com/). The detailed methodology is given in the paper.
+
+ The code for pretraining and fine-tuning on downstream tasks is available on [GitHub](https://github.com/m3rg-repo/MatSciBERT).
+
+ If you find this useful in your research, please consider citing:
+ ```
+ @article{gupta_matscibert_2021,
+ title = {{{MatSciBERT}}: A {{Materials Domain Language Model}} for {{Text Mining}} and {{Information Extraction}}},
+ shorttitle = {{{MatSciBERT}}},
+ author = {Gupta, Tanishq and Zaki, Mohd and Krishnan, N. M. Anoop and Mausam},
+ year = {2021},
+ month = sep,
+ journal = {arXiv:2109.15290 [cond-mat]},
+ eprint = {2109.15290},
+ eprinttype = {arxiv},
+ primaryclass = {cond-mat},
+ archiveprefix = {arXiv},
+ keywords = {Computer Science - Computation and Language,Condensed Matter - Materials Science}
+ }
+ ```
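
The README added in this commit describes a pretrained BERT checkpoint but does not show how to load it. The following is a minimal sketch, not taken from the README itself. It assumes the checkpoint is published under the repository id `m3rg-iitd/matscibert` (inferred from the URL in the merge message) and that it loads with the standard `AutoTokenizer` and `AutoModel` classes from the `transformers` library. The helper names `load_matscibert` and `embed_sentences` are hypothetical and exist only for illustration.

```python
# Assumption: repository id inferred from the merge-commit URL above.
MODEL_ID = "m3rg-iitd/matscibert"


def load_matscibert(model_id=MODEL_ID):
    """Download (or read from the local cache) the tokenizer and encoder."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def embed_sentences(sentences, tokenizer, model):
    """Return one vector per sentence, taken from the [CLS] token position."""
    import torch

    inputs = tokenizer(
        sentences, padding=True, truncation=True, return_tensors="pt"
    )
    with torch.no_grad():  # no gradients needed for feature extraction
        outputs = model(**inputs)
    # last_hidden_state has shape (batch, sequence_length, hidden_size);
    # index 0 along the sequence axis is the [CLS] token.
    return outputs.last_hidden_state[:, 0]
```

Calling `tokenizer, model = load_matscibert()` followed by `embed_sentences(["Metallic glasses lack long-range order."], tokenizer, model)` would return a tensor with one row per input sentence. Using the `[CLS]` vector is only one simple pooling choice; the fine-tuning code in the linked GitHub repository may handle downstream tasks differently.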