satyaalmasian committed on
Commit
46bdd51
1 Parent(s): 6f162cd

Update README.md

Files changed (1)
  1. README.md +29 -33
README.md CHANGED
@@ -1,55 +1,51 @@
1
- # BERT2BERT temporal tagger
2
 
3
- Seq2seq model for temporal tagging of plain text using the BERT language model. The model is introduced in the paper "BERT got a Date: Introducing Transformers to Temporal Tagging" and released in this [repository](https://github.com/satya77/Transformer_Temporal_Tagger).
4
- RoBERTa version of the same model is also available [here](https://huggingface.co/satyaalmasian/temporal_tagger_roberta2roberta) and has better performance.
5
 
6
  # Model description
7
- BERT is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. We use BERT in an encoder-decoder architecture for text generation, where the input is raw text and the output is the temporally annotated text. The model is pre-trained on a weakly annotated dataset from a rule-based system (HeidelTime) and fine-tuned on the temporal benchmark datasets (Wikiwars, Tweets, Tempeval-3).
8
 
9
  # Intended uses & limitations
10
- This model is best used together with the code from the [repository](https://github.com/satya77/Transformer_Temporal_Tagger). For inference in particular, the raw output can be noisy and hard to decipher; the repository provides functions that clean the output and insert the temporal tags from the generated text into the input text. If you have temporally annotated data, you can fine-tune this model.
11
 
12
  # How to use
13
  You can load the model as follows:
14
  ```
15
- tokenizer = AutoTokenizer.from_pretrained("satyaalmasian/temporal_tagger_BERT_tokenclassifier")
16
- model = EncoderDecoderModel.from_pretrained("satyaalmasian/temporal_tagger_BERT_tokenclassifier")
17
 
18
  ```
19
  For inference, use:
20
  ```
21
- model_inputs = tokenizer(input_text, truncation=True, return_tensors="pt")
22
- out = model.generate(**model_inputs)
23
- decoded_preds = tokenizer.batch_decode(out, skip_special_tokens=True)
24
 
25
  ```
26
- For an example with post-processing, refer to the [repository](https://github.com/satya77/Transformer_Temporal_Tagger).
27
- To fine-tune further, use the `Seq2SeqTrainer` from Hugging Face. An example of a similar fine-tuning can be found [here](https://github.com/satya77/Transformer_Temporal_Tagger/blob/master/run_seq2seq_bert_roberta.py).
28
- ```
29
- trainer = Seq2SeqTrainer(
30
- model=model2model,
31
- tokenizer=tokenizer,
32
- args=training_args,
33
- compute_metrics=metrics.compute_metrics,
34
- train_dataset=train_data,
35
- eval_dataset=val_data,
36
- )
37
-
38
- train_result=trainer.train()
39
- ```
40
- where the `training_args` is an instance of `Seq2SeqTrainingArguments`.
41
  # Training data
42
- We use four data sources:
43
- Pre-training: 1 million weakly annotated samples from HeidelTime. The samples are from news articles published between 1 January 2019 and 30 July.
44
- Fine-tuning: [Tempeval-3](https://www.cs.york.ac.uk/semeval-2013/task1/index.php%3Fid=data.html), Wikiwars, and Tweets datasets. For the correct data versions, please refer to our [repository](https://github.com/satya77/Transformer_Temporal_Tagger).
45
 
46
  # Training procedure
47
- The model is pre-trained on the weakly labeled data for 3 epochs on the train set, starting from a publicly available checkpoint on Hugging Face (`roberta-base`), with a batch size of 12. We use a learning rate of 5e-05 with an Adam optimizer and linear weight decay.
48
- Additionally, we use 2000 warmup steps.
49
- We fine-tune on the 3 benchmark datasets for 8 epochs with 5 different random seeds; this version of the model is the one trained with seed=4.
50
- The batch size and the learning rate are the same as in the pre-training setup, but the warm-up steps are reduced to 100.
51
  For training, we use 2 NVIDIA A100 GPUs with 40GB of memory.
52
- For inference in seq2seq models, we use greedy decoding, since beam search gave worse results.
53
 
54
 
55
 
1
+ # BERT-based temporal tagger
2
 
3
+ Token classifier for temporal tagging of plain text using the BERT language model. The model is introduced in the paper "BERT got a Date: Introducing Transformers to Temporal Tagging" and released in this [repository](https://github.com/satya77/Transformer_Temporal_Tagger).
 
4
 
5
  # Model description
6
+ BERT is a transformer model pretrained on a large corpus of English data in a self-supervised fashion. We use BERT for token classification, tagging each token in the text with one of the following classes:
7
+ ```
8
+ O -- outside of a tag
9
+ I-TIME -- inside tag of time
10
+ B-TIME -- beginning tag of time
11
+ I-DATE -- inside tag of date
12
+ B-DATE -- beginning tag of date
13
+ I-DURATION -- inside tag of duration
14
+ B-DURATION -- beginning tag of duration
15
+ I-SET -- inside tag of set
16
+ B-SET -- beginning tag of set
17
+ ```
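A sequence of these BIO tags can be grouped into labelled spans by opening a span at each `B-` tag and extending it over the following `I-` tags of the same type. The block below is only a minimal sketch of that idea and is not the repository's own post-processing; the helper name `bio_to_spans` and the example sentence are hypothetical.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (text, type) spans.

    Hypothetical helper for illustration; not part of the repository.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        # "B-DATE" -> ("B", "DATE"); "O" -> ("O", "")
        prefix, _, tag_type = tag.partition("-")
        if prefix == "B" or (prefix == "I" and tag_type != current_type):
            # Start a new span. A stray I- tag with no matching open span
            # is treated as the beginning of a span.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag_type
        elif prefix == "I":
            # Continue the currently open span of the same type.
            current_tokens.append(token)
        else:
            # "O": close any open span.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    # Flush a span that runs to the end of the sequence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Made-up example sentence and tags, for illustration only.
example_tokens = ["We", "met", "on", "12", "May", "2019", "for", "two", "hours"]
example_tags = ["O", "O", "O", "B-DATE", "I-DATE", "I-DATE", "O", "B-DURATION", "I-DURATION"]
print(bio_to_spans(example_tokens, example_tags))
```

Treating a stray `I-` tag as the start of a span is one possible design choice for noisy predictions; dropping such tokens would be an equally valid alternative.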
18
+
19
 
20
  # Intended uses & limitations
21
+ This model is best used together with the code from the [repository](https://github.com/satya77/Transformer_Temporal_Tagger). For inference in particular, the raw output can be noisy and hard to decipher; the repository provides alignment functions and voting strategies to produce the final output.
22
 
23
  # How to use
24
  You can load the model as follows:
25
  ```
26
+ tokenizer = AutoTokenizer.from_pretrained("satyaalmasian/temporal_tagger_BERT_tokenclassifier", use_fast=False)
27
+ model = BertForTokenClassification.from_pretrained("satyaalmasian/temporal_tagger_BERT_tokenclassifier")
28
 
29
  ```
30
  For inference, use:
31
  ```
32
+ processed_text = tokenizer(input_text, return_tensors="pt")
33
+ result = model(**processed_text)
34
+ classification = result[0]  # token-level logits, one score per class
35
 
36
  ```
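The first element of the model output holds one score per class for every token, so a tag sequence can be read off by taking the highest-scoring class for each token and looking it up in the model's label map. The sketch below works on plain Python lists so that it runs without downloading the model. The helper `logits_to_tags`, the toy label map, and the toy scores are hypothetical; the real index-to-label mapping should be read from `model.config.id2label` rather than assumed.

```python
from typing import Dict, List


def logits_to_tags(token_logits: List[List[float]], id2label: Dict[int, str]) -> List[str]:
    """Pick the highest-scoring label for each token.

    Hypothetical helper for illustration; not part of the repository.
    """
    tags = []
    for scores in token_logits:
        # Index of the largest score, i.e. an argmax over the label dimension.
        best_index = max(range(len(scores)), key=scores.__getitem__)
        tags.append(id2label[best_index])
    return tags


# Toy label map and scores for three tokens (assumed values, not the
# model's real mapping or output).
toy_id2label = {0: "O", 1: "B-DATE", 2: "I-DATE"}
toy_logits = [
    [2.0, 0.1, -1.0],
    [0.2, 3.5, 0.3],
    [-0.5, 0.4, 2.2],
]
print(logits_to_tags(toy_logits, toy_id2label))

# With the real model (not executed here), the inputs would come from:
#   token_logits = classification[0].tolist()   # first sequence in the batch
#   tags = logits_to_tags(token_logits, model.config.id2label)
```

The resulting tags are per subword token and include the tokenizer's special tokens, which is why mapping them back to words still needs the alignment step from the repository.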
37
+ For an example with post-processing, refer to the [repository](https://github.com/satya77/Transformer_Temporal_Tagger).
38
+ We provide a function `merge_tokens` to decipher the output.
39
+ To fine-tune further, use the `Trainer` from Hugging Face. An example of a similar fine-tuning can be found [here](https://github.com/satya77/Transformer_Temporal_Tagger/blob/master/run_token_classifier.py).
40
+
41
  # Training data
42
+ We use three data sources:
43
+ [Tempeval-3](https://www.cs.york.ac.uk/semeval-2013/task1/index.php%3Fid=data.html), Wikiwars, and Tweets. For the correct data versions, please refer to our [repository](https://github.com/satya77/Transformer_Temporal_Tagger).
 
44
 
45
  # Training procedure
46
+ The model is trained starting from a publicly available checkpoint on Hugging Face (`bert-base-uncased`), with a batch size of 34. We use a learning rate of 5e-05 with an Adam optimizer and linear weight decay.
47
+ We fine-tune with 5 different random seeds; this version of the model is the one trained with seed=4.
 
 
48
  For training, we use 2 NVIDIA A100 GPUs with 40GB of memory.
 
49
 
50
 
51