pranaydeeps committed
Commit db7bf93
1 Parent(s): f9d785f

Update README.md

Files changed (1): README.md (+28 -4)
README.md CHANGED
@@ -1,7 +1,6 @@
-
  # Ancient Greek BERT

- ![](https://media.nationalgeographic.org/assets/photos/141/591/6a85829d-16e1-4392-be49-74e0461e77ec_c0-713-4188-2078_r230x75.JPG?55192c09e6fa5ad5cce49f0b30f7fc05b6c6cb9e)

  The first and only available Ancient Greek sub-word BERT model!

@@ -15,13 +14,27 @@ Please refer to our paper titled: "A Pilot Study for BERT Language Modelling and
  ## How to use

  Can be directly used from the HuggingFace Model Hub with:

  ```python
  from transformers import AutoTokenizer, AutoModel
  tokeniser = AutoTokenizer.from_pretrained("pranaydeeps/Ancient-Greek-BERT")
  model = AutoModel.from_pretrained("pranaydeeps/Ancient-Greek-BERT")
  ```

  ## Training data

  The model was initialised from [AUEB NLP Group's Greek BERT](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1)

@@ -31,7 +44,18 @@ Gorman's Treebank
  ## Training and Eval details

  Standard de-accentuating and lower-casing for Greek as suggested in [AUEB NLP Group's Greek BERT](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1)

- The model was trained on 4 NVIDIA Tesla V100 16GB GPUs for 80 epochs, with a max-seq-len of 512 and results in a perplexity of 4.8 on the held out test set.
- It also gives state-of-the-art results when fine-tuned for PoS Tagging and Morphological Analysis on all 3 treebanks averaging >90% accuracy. Please consult our paper or contact [me](mailto:pranaydeep.singh@ugent.be) for further questions!
# Ancient Greek BERT

<img src="https://ichef.bbci.co.uk/images/ic/832xn/p02m4gzb.jpg"/>

The first and only available Ancient Greek sub-word BERT model!

  ## How to use

Requirements:

```bash
pip install transformers
pip install flair
# unicodedata is part of the Python standard library and needs no separate install
```

The model can be used directly from the HuggingFace Model Hub with:

```python
from transformers import AutoTokenizer, AutoModel
tokeniser = AutoTokenizer.from_pretrained("pranaydeeps/Ancient-Greek-BERT")
model = AutoModel.from_pretrained("pranaydeeps/Ancient-Greek-BERT")
```
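
As a quick sanity check, the snippet below shows one way to turn the loaded model into sentence embeddings. It is an illustrative sketch, not part of the original card: the example sentence, the mean-pooling choice and the 768-dimensional output (standard for a BERT-base model) are assumptions.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokeniser = AutoTokenizer.from_pretrained("pranaydeeps/Ancient-Greek-BERT")
model = AutoModel.from_pretrained("pranaydeeps/Ancient-Greek-BERT")
model.eval()

# Example input (an assumption): Iliad 1.1, already lower-cased and de-accented
# as described under "Training and Eval details" below.
sentence = "μηνιν αειδε θεα πηληιαδεω αχιληος"

inputs = tokeniser(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool the final hidden states into a single sentence vector.
embedding = outputs.last_hidden_state.mean(dim=1)
print(embedding.shape)  # expected: torch.Size([1, 768]) for a BERT-base model
```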

## Fine-tuning for POS/Morphological Analysis

Please refer to the GitHub repository for the code and details regarding fine-tuning.
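
For orientation only, a PoS tagger could be fine-tuned on top of this model with flair (listed in the requirements above) roughly as sketched below. This is not the authors' pipeline: the data paths, column layout, label type and hyperparameters are placeholders, and the calls follow the flair >= 0.10 API; the actual fine-tuning scripts live in the GitHub repository.

```python
from flair.datasets import ColumnCorpus
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Hypothetical CoNLL-style treebank files: one token per line, "form <tab> upos".
corpus = ColumnCorpus(
    "data/",
    {0: "text", 1: "upos"},
    train_file="train.txt",
    dev_file="dev.txt",
    test_file="test.txt",
)

# Contextual sub-word embeddings from Ancient-Greek-BERT, updated during training.
embeddings = TransformerWordEmbeddings("pranaydeeps/Ancient-Greek-BERT", fine_tune=True)

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=corpus.make_label_dictionary(label_type="upos"),
    tag_type="upos",
    use_crf=True,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune("taggers/grc-upos", learning_rate=5e-5, mini_batch_size=16, max_epochs=10)
```

Transformer embeddings plus a CRF head is flair's standard sequence-tagging recipe; the >90% accuracy reported in the Training and Eval details section comes from the authors' own configuration, not from this sketch.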
  ## Training data
  The model was initialised from [AUEB NLP Group's Greek BERT](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1)
 
## Training and Eval details

Standard de-accentuating and lower-casing for Greek was applied, as suggested in [AUEB NLP Group's Greek BERT](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1).
The model was trained on 4 NVIDIA Tesla V100 16GB GPUs for 80 epochs, with a max-seq-len of 512, and reaches a perplexity of 4.8 on the held-out test set.
It also gives state-of-the-art results when fine-tuned for PoS Tagging and Morphological Analysis on all 3 treebanks, averaging >90% accuracy. Please consult our paper or contact [me](mailto:pranaydeep.singh@ugent.be) for further questions!
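
The de-accenting step can be reproduced with Python's built-in unicodedata module (the reason it appears in the requirements list). The helper below is a minimal sketch of that normalisation, assuming NFD decomposition with combining marks stripped; the authors' exact preprocessing lives in their repository.

```python
import unicodedata

def strip_accents_and_lowercase(text: str) -> str:
    # Decompose characters (NFD), drop combining accent/breathing marks,
    # then lower-case, mirroring the preprocessing described above.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()

print(strip_accents_and_lowercase("Μῆνιν ἄειδε, θεά"))  # -> "μηνιν αειδε, θεα"
```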

## Cite

If you end up using Ancient-Greek-BERT in your research, please cite the paper:

```
@inproceedings{ancient-greek-bert,
    author    = {Singh, Pranaydeep and Rutten, Gorik and Lefever, Els},
    title     = {A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek},
    year      = {2021},
    booktitle = {The 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2021)}
}
```