fdalvi committed on
Commit
0b39bec
1 Parent(s): 530d225

Update README.md

  1. README.md +33 -1
README.md CHANGED
@@ -7,4 +7,36 @@ tags:
  license: cc-by-nc-3.0
  ---
 
- # BERT-base-multilingual-cased finetuned for Part-of-Speech tagging
+ # BERT-base-multilingual-cased finetuned for Part-of-Speech tagging
+
+ This is a multilingual BERT model fine-tuned for English part-of-speech tagging. It was trained on the Penn Treebank (Marcus et al., 1993) and achieves an F1-score of 96.69.
+
+ ## Usage
+ A *transformers* pipeline can be used to run the model:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForTokenClassification, TokenClassificationPipeline
+
+ model_name = "QCRI/bert-base-multilingual-cased-pos-english"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForTokenClassification.from_pretrained(model_name)
+
+ pipeline = TokenClassificationPipeline(model, tokenizer)
+ outputs = pipeline("A test example")
+ print(outputs)
+ ```
+
+
+ ## Citation
+ This model was used for all part-of-speech tagging results in *Analyzing Encoded Concepts in Transformer Language Models*, published at NAACL'22. If you find this model useful for your own work, please use the following citation:
+
+ ```bib
+ @inproceedings{sajjad-NAACL,
+ title={Analyzing Encoded Concepts in Transformer Language Models},
+ author={Hassan Sajjad and Nadir Durrani and Fahim Dalvi and Firoj Alam and Abdul Rafae Khan and Jia Xu},
+ booktitle={North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL)},
+ series={NAACL~'22},
+ year={2022},
+ address={Seattle}
+ }
+ ```
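
The usage snippet added in this commit calls the pipeline without any aggregation, so by default it should return one prediction per WordPiece token, and a word that the tokenizer splits will show up as several entries. As a minimal sketch, assuming each entry is a dict with `word` and `entity` keys and that continuation pieces start with `##`, the pieces could be merged back into whole words as shown below. The helper name `merge_word_pieces` and the `sample_outputs` list are hypothetical; the sample is hand-written for illustration and is not real output from the model.

```python
from typing import Dict, List, Tuple


def merge_word_pieces(token_outputs: List[Dict]) -> List[Tuple[str, str]]:
    """Collapse per-word-piece predictions into (word, tag) pairs.

    Assumption: continuation pieces start with "##", and a word takes
    the tag predicted for its first piece.
    """
    words: List[Tuple[str, str]] = []
    for token in token_outputs:
        piece, tag = token["word"], token["entity"]
        if piece.startswith("##") and words:
            # Continuation piece: glue it onto the previous word, keep that word's tag.
            previous_word, previous_tag = words[-1]
            words[-1] = (previous_word + piece[2:], previous_tag)
        else:
            # First piece of a new word.
            words.append((piece, tag))
    return words


# Hand-written stand-in for the pipeline's `outputs`; NOT real model predictions.
sample_outputs = [
    {"word": "A", "entity": "DT", "score": 0.99, "index": 1, "start": 0, "end": 1},
    {"word": "tag", "entity": "NN", "score": 0.98, "index": 2, "start": 2, "end": 5},
    {"word": "##ging", "entity": "NN", "score": 0.97, "index": 3, "start": 5, "end": 9},
    {"word": "example", "entity": "NN", "score": 0.99, "index": 4, "start": 10, "end": 17},
]

tagged_words = merge_word_pieces(sample_outputs)
print(tagged_words)
```

With the real model, `outputs` from the README snippet would be passed in place of `sample_outputs`. Taking the first piece's tag is one simple convention; other choices, such as the highest-scoring piece, are equally plausible.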