Shaltiel committed on
Commit
ffb24f1
1 Parent(s): 2ce194d

Update README.md

  1. README.md +29 -0
README.md CHANGED
@@ -1,3 +1,32 @@
  ---
  license: apache-2.0
+ language:
+ - he
+ library_name: transformers
+ tags:
+ - bert
  ---
+
+ # Introducing BEREL 2.0 - New and Improved BEREL: BERT Embeddings for Rabbinic-Encoded Language
+
+ When using BEREL 2.0, please reference:
+
+ Avi Shmidman, Joshua Guedalia, Shaltiel Shmidman, Cheyn Shmuel Shmidman, Eli Handel, Moshe Koppel, "Introducing BEREL: BERT Embeddings for Rabbinic-Encoded Language", Aug 2022 [arXiv:2208.01875]
+
+
+ 1. Usage:
+
+ ```python
+ from transformers import AutoTokenizer, BertForMaskedLM
+
+ tokenizer = AutoTokenizer.from_pretrained('dicta-il/BEREL_2.0')
+ model = BertForMaskedLM.from_pretrained('dicta-il/BEREL_2.0')
+ ```
+
+ > NOTE: This code will **not** work correctly and will produce poor results if you use `BertTokenizer`. Please use `AutoTokenizer` or `BertTokenizerFast`.
+
+ 2. Demo site:
+ You can experiment with the model in a GUI here: https://dicta-bert-demo.netlify.app/?genre=rabbinic
+ - The main part of the GUI consists of word buttons that visualize the tokenization of the input sentence. Clicking a button masks that word, and a bubble with three BEREL predictions appears. Clicking the bubble expands it to 10 predictions; alternatively, Ctrl-clicking the initial bubble expands it to 30 predictions.
+ - Ctrl-clicking adjacent word buttons combines them into a single token for the mask.
+ - The edit box at the top contains the input sentence; it can be modified at will, and the word buttons update accordingly.
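
The mask-and-predict behavior that the README describes for the demo can be approximated locally. The following is a minimal sketch, not the demo's actual implementation: `top_k_ids` and `predict_masked` are hypothetical helper names, the literal `[MASK]` placeholder is an assumed input convention, and the fast-tokenizer check is one possible way to act on the README's warning about `BertTokenizer`. Only the model ID `dicta-il/BEREL_2.0` and the `AutoTokenizer` / `BertForMaskedLM` classes come from the README itself.

```python
def top_k_ids(scores, k=3):
    """Return the indices of the k highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def predict_masked(sentence, k=3, model_name="dicta-il/BEREL_2.0"):
    """Return one list of k candidate tokens per [MASK] in `sentence`.

    Hypothetical helper: downloads the model on first use.
    """
    # Heavy imports are kept local so top_k_ids stays usable on its own.
    import torch
    from transformers import AutoTokenizer, BertForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # The README warns against the slow BertTokenizer, so fail early.
    if not tokenizer.is_fast:
        raise RuntimeError("expected a fast tokenizer (e.g. BertTokenizerFast)")
    model = BertForMaskedLM.from_pretrained(model_name)
    model.eval()

    # Assumption: callers write the literal placeholder "[MASK]".
    text = sentence.replace("[MASK]", tokenizer.mask_token)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

    # Locate every mask position and rank the vocabulary at each one.
    is_mask = inputs["input_ids"][0] == tokenizer.mask_token_id
    predictions = []
    for pos in is_mask.nonzero(as_tuple=True)[0].tolist():
        best = top_k_ids(logits[0, pos].tolist(), k)
        predictions.append(tokenizer.convert_ids_to_tokens(best))
    return predictions
```

Under these assumptions, a call such as `predict_masked("בראשית ברא [MASK] את השמים ואת הארץ", k=10)` would return a single list of ten candidate tokens, roughly what the demo shows after expanding a prediction bubble. The candidates are vocabulary entries, so some may be word pieces rather than whole words. In practice `torch.topk` would be the usual way to rank the logits; the pure-Python helper is used here only to keep the ranking step easy to inspect.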