Shaltiel committed on
Commit
6558638
1 Parent(s): dd000af

Update README.md

Files changed (1)
  1. README.md +34 -0
README.md CHANGED
@@ -1,3 +1,37 @@
  ---
  license: apache-2.0
+ language:
+ - he
+ library_name: transformers
+ tags:
+ - bert
  ---
+
+ > Update 2023-5-23: This model is `BEREL` version 1.0. We are now happy to provide a much improved `BEREL_2.0`.
+
+
+ # Introducing BEREL: BERT Embeddings for Rabbinic-Encoded Language
+
+ When using BEREL, please reference:
+
+
+ Avi Shmidman, Joshua Guedalia, Shaltiel Shmidman, Cheyn Shmuel Shmidman, Eli Handel, Moshe Koppel, "Introducing BEREL: BERT Embeddings for Rabbinic-Encoded Language", Aug 2022 [arXiv:2208.01875]
+
+
+
+ 1. Usage:
+
+ ```python
+ from transformers import AutoTokenizer, BertForMaskedLM
+
+ tokenizer = AutoTokenizer.from_pretrained('dicta-il/BEREL')
+ model = BertForMaskedLM.from_pretrained('dicta-il/BEREL')
+ ```
+
+ > NOTE: This code will **not** work correctly and will produce bad results if you use `BertTokenizer`. Please use `AutoTokenizer` or `BertTokenizerFast`.
+
+ 2. Demo site:
+ You can experiment with the model in a GUI here: https://dicta-bert-demo.netlify.app/?genre=rabbinic
+ - The main part of the GUI consists of word buttons visualizing the tokenization of the sentences. Clicking on a button masks that word, and three BEREL word predictions are then shown in a bubble. Clicking on that bubble expands it to 10 predictions; alternatively, ctrl-clicking on the initial bubble expands it to 30 predictions.
+ - Ctrl-clicking adjacent word buttons combines them into a single token for the mask.
+ - The edit box on top contains the input sentence; this can be modified at will, and the word buttons will adjust accordingly.
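
A minimal sketch, not part of this commit, of how the masked-word predictions described for the demo could be reproduced locally with the tokenizer and model loaded in the usage snippet. The helper names `top_k` and `predict_masked` are hypothetical, and it is an assumption that the demo ranks candidates by raw logit at the masked position; the demo's actual implementation is not shown in the README.

```python
from typing import List, Tuple


def top_k(scores: List[float], k: int) -> List[Tuple[int, float]]:
    """Return the k highest-scoring (index, score) pairs, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def predict_masked(sentence: str, k: int = 3) -> List[str]:
    """Predict the k most likely tokens for the single mask token in `sentence`.

    Hypothetical helper: `sentence` must contain the tokenizer's mask token
    exactly once.
    """
    # Imported lazily so that `top_k` can be used without these packages.
    import torch
    from transformers import AutoTokenizer, BertForMaskedLM

    # AutoTokenizer is used on purpose; the README warns against BertTokenizer.
    tokenizer = AutoTokenizer.from_pretrained('dicta-il/BEREL')
    model = BertForMaskedLM.from_pretrained('dicta-il/BEREL')
    model.eval()

    inputs = tokenizer(sentence, return_tensors='pt')

    # Locate the mask token in the encoded input.
    is_mask = inputs['input_ids'][0] == tokenizer.mask_token_id
    mask_positions = is_mask.nonzero(as_tuple=True)[0]
    if len(mask_positions) != 1:
        raise ValueError('expected exactly one mask token in the sentence')

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

    # Rank the whole vocabulary by logit at the masked position.
    mask_logits = logits[0, mask_positions[0]].tolist()
    return [tokenizer.convert_ids_to_tokens(idx) for idx, _ in top_k(mask_logits, k)]
```

Under these assumptions, calling `predict_masked(text, k=3)` on a sentence containing `tokenizer.mask_token` would correspond to the demo's initial bubble, and `k=10` or `k=30` to its expanded views. The model is loaded inside the function only to keep the sketch self-contained; in real use it would be loaded once and reused.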