satyaalmasian committed on
Commit
a2ee042
1 Parent(s): 6feb3d6

Update README.md

Files changed (1)
  1. README.md (+1 -1)
README.md CHANGED
@@ -44,7 +44,7 @@ For Pretraining :1 million weakly annotated samples from heideltime. The samples
  Fine-tuning: [Tempeval-3](https://www.cs.york.ac.uk/semeval-2013/task1/index.php%3Fid=data.html), Wikiwars, Tweets datasets. For the correct data versions, please refer to our [repository](https://github.com/satya77/Transformer_Temporal_Tagger).
 
  # Training procedure
- The model is pre-trained on the weakly labeled data for 3 epochs on the train set, from publicly available checkpoints on huggingface (`roberta-base`), with a batch size of 12. We use a learning rate of 5e-05 with an Adam optimizer and linear weight decay.
+ The model is pre-trained on the weakly labeled data for 3 epochs on the train set, from publicly available checkpoints on huggingface (`bert-base-uncased`), with a batch size of 12. We use a learning rate of 5e-05 with an Adam optimizer and linear weight decay.
  Additionally, we use 2000 warmup steps.
  We fine-tune on the 3 benchmark datasets for 8 epochs with 5 different random seeds; this version of the model corresponds to seed=4.
  The batch size and the learning rate are the same as in the pre-training setup, but the warm-up steps are reduced to 100.
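
For reference, the hyperparameters described in the changed section translate roughly into Hugging Face `TrainingArguments` as in the minimal sketch below. Only the values stated in the README (the `bert-base-uncased` checkpoint, 3/8 epochs, batch size 12, learning rate 5e-05, 2000/100 warmup steps, seed 4) come from the source; the token-classification head, `num_labels`, output paths, and the `weak_train_set` variable are illustrative assumptions, not the authors' actual training script.

```python
from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "bert-base-uncased"  # checkpoint named by this commit (was `roberta-base`)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Token-classification head and num_labels are assumptions for illustration only.
model = AutoModelForTokenClassification.from_pretrained(model_name, num_labels=5)

# Pre-training on the weakly labeled data: 3 epochs, batch size 12, lr 5e-05, 2000 warmup steps.
pretrain_args = TrainingArguments(
    output_dir="./weak_pretraining",   # illustrative path
    num_train_epochs=3,
    per_device_train_batch_size=12,
    learning_rate=5e-05,
    lr_scheduler_type="linear",        # Trainer's default AdamW + linear schedule
    warmup_steps=2000,
    seed=4,                            # the released checkpoint corresponds to seed=4
)

# Fine-tuning on TempEval-3 / Wikiwars / Tweets: same batch size and learning rate,
# but 8 epochs and only 100 warmup steps.
finetune_args = TrainingArguments(
    output_dir="./benchmark_finetuning",  # illustrative path
    num_train_epochs=8,
    per_device_train_batch_size=12,
    learning_rate=5e-05,
    lr_scheduler_type="linear",
    warmup_steps=100,
    seed=4,
)

# trainer = Trainer(model=model, args=pretrain_args, train_dataset=weak_train_set)  # `weak_train_set` is assumed
# trainer.train()
```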