satyaalmasian committed
Commit 72632d2 (1 parent: 887b547)

Create README.md

Files changed (1): README.md (+54, -0)
README.md ADDED

# BERT-based temporal tagger

Token classifier for temporal tagging of plain text using the German GELECTRA model.

# Model description

GELECTRA is a transformer (ELECTRA) model pretrained on a large corpus of German data in a self-supervised fashion. We use GELECTRA for token classification to tag the tokens in text with the following classes (the tags follow the English TIMEX3 format):
6
+ ```
7
+ O -- outside of a tag
8
+ I-TIME -- inside tag of time
9
+ B-TIME -- beginning tag of time
10
+ I-DATE -- inside tag of date
11
+ B-DATE -- beginning tag of date
12
+ I-DURATION -- inside tag of duration
13
+ B-DURATION -- beginning tag of duration
14
+ I-SET -- inside tag of the set
15
+ B-SET -- beginning tag of the set
16
+ ```
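For illustration only (this example is not taken from the training data), a German phrase such as "am 5. Mai 2021" would be tagged token by token as:
```
am    O
5.    B-DATE
Mai   I-DATE
2021  I-DATE
```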

# Intended uses & limitations

This model is best used together with the code from the [repository](https://github.com/satya77/Transformer_Temporal_Tagger). In particular for inference, the direct output can be noisy and hard to decipher; the repository provides alignment functions and voting strategies for the final output. The repository examples use the English models, but the German model can be used in the same way.

# How to use

You can load the model as follows:
```
from transformers import AutoTokenizer, BertForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("satyaalmasian/temporal_tagger_German_GELECTRA", use_fast=False)
model = BertForTokenClassification.from_pretrained("satyaalmasian/temporal_tagger_German_GELECTRA")
```
For inference, use:
```
# `input_text` is the plain-text string you want to tag
processed_text = tokenizer(input_text, return_tensors="pt")
result = model(**processed_text)
classification = result[0]  # per-token logits
```
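If you just want to inspect the raw predictions without the repository helpers, a minimal sketch (assuming only the `id2label` mapping stored in the model config) is:
```
# Pick the highest-scoring class per subword token and map it to its tag name.
predictions = classification.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(processed_text["input_ids"][0].tolist())
tags = [model.config.id2label[p.item()] for p in predictions]
print(list(zip(tokens, tags)))
```
Note that these are per-subword predictions; for properly aligned and merged spans, use the post-processing described below.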
For an example with post-processing, refer to the [repository](https://github.com/satya77/Transformer_Temporal_Tagger).
We provide a function `merge_tokens` to decipher the output.
To further fine-tune the model, use the `Trainer` from Hugging Face (a minimal sketch is shown below). An example of a similar fine-tuning setup can be found [here](https://github.com/satya77/Transformer_Temporal_Tagger/blob/master/run_token_classifier.py).
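As a rough starting point, a fine-tuning sketch with the `Trainer` could look as follows. Here `train_dataset` is assumed to be a tokenized token-classification dataset that already provides `input_ids`, `attention_mask`, and `labels` (see the repository script for the full data preparation), and the hyper-parameters mirror the fine-tuning settings reported under Training procedure; the output directory name is hypothetical.
```
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="temporal_tagger_finetuned",  # hypothetical output directory
    learning_rate=5e-5,                      # fine-tuning learning rate (see Training procedure)
    per_device_train_batch_size=16,          # fine-tuning batch size (see Training procedure)
)

trainer = Trainer(
    model=model,                # the BertForTokenClassification loaded above
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()
```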

# Training data

For pre-training, we use a large corpus of news articles automatically annotated with HeidelTime.
We use two data sources for fine-tuning:
[TempEval-3](https://www.cs.york.ac.uk/semeval-2013/task1/index.php%3Fid=data.html), automatically translated to German, and the
[KRAUTS dataset](https://github.com/JannikStroetgen/KRAUTS).

# Training procedure

The model is trained from the publicly available checkpoint on Hugging Face (`deepset/gelectra-large`) with a batch size of 192. For pre-training, we use a learning rate of 1e-07 with an Adam optimizer and linear weight decay.
For fine-tuning, we use a batch size of 16 and a learning rate of 5e-05 with an Adam optimizer and linear weight decay.
We fine-tune with 3 different random seeds; this version of the model uses seed=7.
For training, we use 2 NVIDIA A100 GPUs with 40GB of memory.