piotr-rybak committed
Commit f6c1e93
1 parent: 078eaf8

Create README.md

  1. README.md +43 -0
README.md ADDED
@@ -0,0 +1,43 @@
---
license: cc-by-sa-4.0
pipeline_tag: fill-mask
---
# Model Card for Silesian HerBERT Base

Silesian HerBERT Base is a [HerBERT Base](https://huggingface.co/allegro/herbert-base-cased) model equipped with a Silesian tokenizer and fine-tuned on Silesian Wikipedia.

## Usage
Example code:
```python
from transformers import AutoTokenizer, AutoModel

# Load the Silesian tokenizer and the fine-tuned encoder from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("ipipan/silesian-herbert-base")
model = AutoModel.from_pretrained("ipipan/silesian-herbert-base")

# Tokenize a batch of sentences (plain strings, one per entry) and run the model.
output = model(
    **tokenizer(
        [
            "Wielgŏ Piyramida we Gizie, mianowanŏ tyż Piyramida ôd Cheopsa, to je nojsrogszŏ a nojbarzij znanŏ ze egipskich piyramid we Gizie.",
        ],
        padding="longest",
        add_special_tokens=True,
        return_tensors="pt",
    )
)
```
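
Because the card declares `pipeline_tag: fill-mask`, the model can presumably also be queried for masked-token predictions. The following is a minimal sketch rather than a confirmed recipe: it assumes the published checkpoint includes a masked-language-modelling head, which the card does not state explicitly. The helper names `mask_word` and `predict_masked` are hypothetical and introduced only for illustration.

```python
def mask_word(sentence: str, word: str, mask_token: str) -> str:
    """Replace the first occurrence of `word` in `sentence` with `mask_token`.

    This is plain substring replacement, so `word` may also match inside a
    longer word. Hypothetical helper, not part of the model's repository.
    """
    if word not in sentence:
        raise ValueError(f"{word!r} does not occur in the sentence")
    return sentence.replace(word, mask_token, 1)


def predict_masked(
    sentence: str,
    word: str,
    model_id: str = "ipipan/silesian-herbert-base",
    top_k: int = 5,
):
    """Mask `word` and return the model's top candidate tokens with scores.

    Assumption: the checkpoint ships with a masked-language-modelling head.
    """
    # Imported lazily so that mask_word stays usable without transformers.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_id)
    # Use the tokenizer's own mask token instead of hard-coding it.
    masked = mask_word(sentence, word, fill_mask.tokenizer.mask_token)
    # With a single mask, the pipeline returns one list of candidate dicts.
    return [(p["token_str"], p["score"]) for p in fill_mask(masked, top_k=top_k)]
```

Calling `predict_masked` with the example sentence above and a word such as `"piyramid"` downloads the model on first use and returns up to `top_k` `(token, score)` pairs.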

## License
CC BY-SA 4.0

## Citation
If you use this model, please cite the following paper:
```

```

## Authors
The model was created by Piotr Rybak from the [**Linguistic Engineering Group at the Institute of Computer Science, Polish Academy of Sciences**](http://zil.ipipan.waw.pl/).

This work was supported by the European Regional Development Fund as a part of the 2014–2020 Smart Growth Operational Programme, CLARIN (Common Language Resources and Technology Infrastructure), project no. POIR.04.02.00-00C002/19.