---
language: "en"
tags:
- Financial Language Modelling
widget:
- text: "Stocks rallied and the British pound [MASK]."
---
## Dataset Summary
- **Homepage:** https://salt-nlp.github.io/FLANG/
- **Models:** https://huggingface.co/SALT-NLP/FLANG-BERT
- **Repository:** https://github.com/SALT-NLP/FLANG

## FLANG
FLANG is a set of large language models for Financial LANGuage tasks. These models use domain-specific pre-training with preferential masking to build more robust representations for the financial domain. The models in the set are:\
[FLANG-BERT](https://huggingface.co/SALT-NLP/FLANG-BERT)\
[FLANG-SpanBERT](https://huggingface.co/SALT-NLP/FLANG-SpanBERT)\
[FLANG-DistilBERT](https://huggingface.co/SALT-NLP/FLANG-DistilBERT)\
[FLANG-Roberta](https://huggingface.co/SALT-NLP/FLANG-Roberta)\
[FLANG-ELECTRA](https://huggingface.co/SALT-NLP/FLANG-ELECTRA)
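
The checkpoints listed above are hosted on the Hugging Face Hub, so they can presumably be loaded with the `transformers` library. The following is a minimal sketch, assuming `transformers` and a backend such as PyTorch are installed. The helper name `top_predictions` is hypothetical and not part of FLANG.

```python
# Model ID taken from the FLANG-BERT link above.
MODEL_ID = "SALT-NLP/FLANG-BERT"


def top_predictions(text, k=5):
    """Return up to k (token, score) pairs for the single [MASK] in `text`.

    `top_predictions` is a hypothetical helper written for this example.
    """
    # Imported lazily so that defining the helper does not require the library.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    results = fill_mask(text, top_k=k)
    return [(r["token_str"], r["score"]) for r in results]
```

Calling `top_predictions("Stocks rallied and the British pound [MASK].")` with the sentence from this card's widget downloads the checkpoint on first use and returns candidate tokens with their scores.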

## FLANG-BERT
FLANG-BERT is a pre-trained language model that uses financial keywords and phrases to preferentially mask domain-specific terms. It is built by further training the BERT language model in the finance domain, and its use of domain knowledge and vocabulary improves performance over previous models.
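
To illustrate the general idea of preferential masking, here is a toy sketch in which tokens from a keyword list are masked with a higher probability than other tokens. The keyword set, both probabilities, and the function name are assumptions made for this example; they are not the authors' vocabulary, settings, or implementation.

```python
import random

# Hypothetical values for illustration only; not the authors' settings.
FINANCIAL_TERMS = {"stocks", "pound", "dividend", "bond", "equity"}
P_TERM = 0.30   # assumed masking probability for domain terms
P_OTHER = 0.10  # assumed masking probability for all other tokens


def preferential_mask(tokens, rng, mask_token="[MASK]"):
    """Mask domain terms more often than other tokens.

    Returns the masked sequence and a label list that holds the original
    token at each masked position and None everywhere else.
    """
    masked, labels = [], []
    for tok in tokens:
        p = P_TERM if tok.lower() in FINANCIAL_TERMS else P_OTHER
        if rng.random() < p:
            masked.append(mask_token)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels
```

For example, `preferential_mask("Stocks rallied and the British pound fell".split(), random.Random(0))` returns the masked tokens together with the labels a masked-language-model objective would be trained to predict.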

## FLUE
FLUE (Financial Language Understanding Evaluation) is a comprehensive and heterogeneous benchmark built from five diverse, domain-specific financial datasets.

Sentiment Classification: [Financial PhraseBank](https://huggingface.co/datasets/financial_phrasebank)\
Sentiment Analysis, Question Answering: [FiQA 2018](https://huggingface.co/datasets/SALT-NLP/FLUE-FiQA)\
News Headlines Classification: [Headlines](https://www.kaggle.com/datasets/daittan/gold-commodity-news-and-dimensions)\
Named Entity Recognition: [NER](https://paperswithcode.com/dataset/fin)\
Structure Boundary Detection: [FinSBD3](https://sites.google.com/nlg.csie.ntu.edu.tw/finweb2021/shared-task-finsbd-3)

## Citation
Please cite the model with the following citation:
```bibtex
@INPROCEEDINGS{shah-etal-2022-flang,
    author = {Shah, Raj Sanjay  and
      Chawla, Kunal and
      Eidnani, Dheeraj and
      Shah, Agam and
      Du, Wendi and
      Chava, Sudheer and
      Raman, Natraj and
      Smiley, Charese and
      Chen, Jiaao and
      Yang, Diyi },
    title = {When FLUE Meets FLANG: Benchmarks and Large Pretrained Language Model for Financial Domain},
    booktitle = {Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
    year = {2022},
    publisher = {Association for Computational Linguistics}
}
```

## Contact information

Please contact Raj Sanjay Shah (rajsanjayshah[at]gatech[dot]edu), Sudheer Chava (schava6[at]gatech[dot]edu), or Diyi Yang (diyiy[at]stanford[dot]edu) with any FLANG-BERT related issues and questions.


---
license: afl-3.0
---