Shaltiel committed
Commit 2c00e7a • 1 Parent(s): 706324f

Update README.md

  1. README.md +77 -0
README.md CHANGED
@@ -1,3 +1,80 @@
  ---
  license: cc-by-4.0
+ language:
+ - he
  ---
+ # DictaBERT: A State-of-the-Art BERT Suite for Modern Hebrew
+
+ State-of-the-art language model for Hebrew, as released [here](link to arxiv).
+
+ This is the fine-tuned model for the prefix segmentation task: it splits the prefixes attached to a Hebrew word (such as ב, ה, and ו) from the word itself.
+
+ Sample usage:
+
+ ```python
+ import json
+
+ from transformers import AutoModel, AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained('dicta-il/dictabert-seg')
+ # trust_remote_code=True lets transformers load the custom model code from the repository
+ model = AutoModel.from_pretrained('dicta-il/dictabert-seg', trust_remote_code=True)
+
+ model.eval()
+
+ # "In 1948 Ephraim Kishon completed his studies in metal sculpture and art history and began publishing humorous articles"
+ sentence = 'בשנת 1948 השלים אפרים קישון את לימודיו בפיסול מתכת ובתולדות האמנות והחל לפרסם מאמרים הומוריסטיים'
+ # predict() takes a list of sentences and returns one list of segmented words per sentence
+ segmented_sentence = model.predict([sentence], tokenizer)
+
+ # ensure_ascii=False prints the Hebrew characters instead of \u escapes
+ print(json.dumps(segmented_sentence, indent=2, ensure_ascii=False))
+ ```
+
+ Output:
+ ```json
+ [
+   [
+     [ "[CLS]" ],
+     [ "ב","שנת" ],
+     [ "1948" ],
+     [ "השלים" ],
+     [ "אפרים" ],
+     [ "קישון" ],
+     [ "את" ],
+     [ "לימודיו" ],
+     [ "ב","פיסול" ],
+     [ "מתכת" ],
+     [ "וב","תולדות" ],
+     [ "ה","אמנות" ],
+     [ "ו","החל" ],
+     [ "לפרסם" ],
+     [ "מאמרים" ],
+     [ "הומוריסטיים" ],
+     [ "[SEP]" ]
+   ]
+ ]
+ ```
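
The nested lists in the output can be post-processed with plain Python. The sketch below is not part of the model's API: `join_segments` and `prefixed_words` are hypothetical helpers, the `|` separator is an arbitrary choice, and the code assumes that `[CLS]` and `[SEP]` are the only marker entries and that the last segment of each word is the base word, as in the sample output.

```python
# Sample output copied from the README: one list of words per sentence,
# each word given as a list of segments (prefix first, then the base word).
segmented_sentence = [
    [
        ["[CLS]"],
        ["ב", "שנת"],
        ["1948"],
        ["השלים"],
        ["אפרים"],
        ["קישון"],
        ["את"],
        ["לימודיו"],
        ["ב", "פיסול"],
        ["מתכת"],
        ["וב", "תולדות"],
        ["ה", "אמנות"],
        ["ו", "החל"],
        ["לפרסם"],
        ["מאמרים"],
        ["הומוריסטיים"],
        ["[SEP]"],
    ]
]

# Assumption: these are the only marker entries that need to be dropped.
SPECIAL_TOKENS = {"[CLS]", "[SEP]"}


def join_segments(words, separator="|"):
    """Hypothetical helper: rebuild the sentence with prefix boundaries marked."""
    kept = [w for w in words if not (len(w) == 1 and w[0] in SPECIAL_TOKENS)]
    return " ".join(separator.join(segments) for segments in kept)


def prefixed_words(words):
    """Hypothetical helper: (prefix, base word) pairs for words that were split."""
    # Assumption: the last segment is the base word, everything before it is prefix.
    return [("".join(w[:-1]), w[-1]) for w in words if len(w) > 1]


for words in segmented_sentence:
    print(join_segments(words))
    print(prefixed_words(words))
```

Marking boundaries with a separator keeps the original word order and spacing, which makes it easy to compare the segmented sentence against the input.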
+
+
+ ## Citation
+
+ If you use DictaBERT in your research, please cite `DictaBERT: A State-of-the-Art BERT Suite for Modern Hebrew`
+
+ **BibTeX:**
+
+ To add
+
+ ## License
+
+ Shield: [![CC BY 4.0][cc-by-shield]][cc-by]
+
+ This work is licensed under a
+ [Creative Commons Attribution 4.0 International License][cc-by].
+
+ [![CC BY 4.0][cc-by-image]][cc-by]
+
+ [cc-by]: http://creativecommons.org/licenses/by/4.0/
+ [cc-by-image]: https://i.creativecommons.org/l/by/4.0/88x31.png
+ [cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg