minhtriphan committed on
Commit 64470e5
1 Parent(s): dff2da5

Create README.md
---
language:
- en
---
# Introduction
This is an implementation of a BERT-style model using the LongNet architecture (paper: https://arxiv.org/pdf/2307.02486.pdf). The model is pre-trained on 10-K/Q filings of US firms from 1994 to 2018.
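LongNet replaces dense self-attention with *dilated* attention: the sequence is split into segments, and within each segment only every r-th position attends, which cuts the quadratic cost on long filings. The NumPy sketch below is a toy, single-head illustration of that sparsification only; the segment lengths, dilation rates, and multi-pattern mixing of the actual model follow the paper and repo, not this simplification.

```python
import numpy as np

def dilated_indices(seq_len, segment_len, dilation):
    """Per segment, keep every `dilation`-th position (LongNet-style sparsification)."""
    idx = []
    for start in range(0, seq_len, segment_len):
        seg = np.arange(start, min(start + segment_len, seq_len))
        idx.append(seg[::dilation])
    return np.concatenate(idx)

def dilated_self_attention(x, segment_len=4, dilation=2):
    """Toy single-head attention restricted to the dilated positions of each segment.

    Positions dropped by the dilation are left as zeros here; the real model
    covers them via a mixture of several (segment_len, dilation) patterns.
    """
    seq_len, d = x.shape
    out = np.zeros_like(x)
    for start in range(0, seq_len, segment_len):
        seg = np.arange(start, min(start + segment_len, seq_len))
        keep = seg[::dilation]
        q, k, v = x[keep], x[keep], x[keep]
        scores = q @ k.T / np.sqrt(d)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[keep] = weights @ v
    return out

x = np.random.randn(16, 8)
y = dilated_self_attention(x)   # each kept position attends only within its segment
```

Because each position attends to at most `segment_len / dilation` others, cost grows linearly in sequence length for fixed segment size, which is what makes pre-training on full filings feasible.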

# Training code
https://github.com/minhtriphan/LongFinBERT-base/tree/main

# Training configuration
* The model is trained for 4 epochs on the Masked Language Modeling (MLM) task;
* The masking probability is 15%;
* Details of the training configuration are given in the log file `train_v1a_0803_1144_seed_1.log`.

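The 15% masking probability follows the standard BERT MLM recipe. As a minimal, self-contained illustration (assuming the usual 80/10/10 replacement scheme; the exact settings used here are recorded in the log file above):

```python
import random

MASK = "[MASK]"
VOCAB = ["the", "firm", "reported", "revenue"]  # toy vocabulary for random replacement

def mlm_mask(tokens, mask_prob=0.15, seed=0):
    """BERT-style masking: select ~`mask_prob` of positions as prediction targets;
    of those, 80% become [MASK], 10% a random token, 10% stay unchanged.
    Returns (corrupted inputs, labels), with labels None at unselected positions."""
    rng = random.Random(seed)
    inputs, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok              # the model must predict the original token
            r = rng.random()
            if r < 0.8:
                inputs[i] = MASK         # 80%: replace with the mask token
            elif r < 0.9:
                inputs[i] = rng.choice(VOCAB)  # 10%: replace with a random token
            # else 10%: keep the original token
    return inputs, labels

tokens = "the firm reported strong revenue growth".split()
inputs, labels = mlm_mask(tokens)
```

The loss is computed only at positions where `labels` is not `None`, so roughly 15% of tokens contribute to each training step.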
# Instructions for loading the pre-trained model
* Clone the git repo
```
git clone https://github.com/minhtriphan/LongFinBERT-base.git
cd LongFinBERT-base
```
or, in a notebook:
```
!git clone https://github.com/minhtriphan/LongFinBERT-base.git
import sys
sys.path.append('./LongFinBERT-base')
```

* Load the pre-trained tokenizer, model configuration, and model weights
```
from model import LongBERT
from custom_config import LongBERTConfig
from tokenizer import LongBERTTokenizer

# Hugging Face Hub repository hosting the pre-trained artifacts
backbone = 'minhtriphan/LongFinBERT'

tokenizer = LongBERTTokenizer.from_pretrained(backbone)
config = LongBERTConfig.from_pretrained(backbone)
model = LongBERT.from_pretrained(backbone)
```

# Contact
For any comments, questions, or feedback, please get in touch with us via phanminhtri2611@gmail.com or triminh.phan@unisg.ch.

# Paper
(updating)