---
license: cc-by-4.0
language:
- he
inference: false
---

# DictaBERT: A State-of-the-Art BERT Suite for Modern Hebrew

State-of-the-art language model for parsing Hebrew, released [update url].

This is the fine-tuned model for the joint parsing of the following tasks:

- Prefix Segmentation
- Morphological Disambiguation
- Lexicographical Analysis (Lemmatization)
- Syntactic Parsing (Dependency Tree)
- Named-Entity Recognition

This model was initialized from dictabert-**large**-joint and tuned on the Hebrew UD Treebank and NEMO corpora, to align the model's predictions with the tagging methodology used in those corpora.

A live demo of the `dictabert-joint` model with instant visualization of the syntax tree can be found [here](https://huggingface.co/spaces/dicta-il/joint-demo).

For a faster model, you can use the equivalent bert-tiny model for this task [here](https://huggingface.co/dicta-il/dictabert-tiny-parse).

For the bert-base models for other tasks, see [here](https://huggingface.co/collections/dicta-il/dictabert-6588e7cc08f83845fc42a18b).

---

The model currently supports 3 types of output:

1. **JSON**: The model returns a JSON object for each sentence in the input, where for each sentence we have the sentence text, the NER entities, and the list of tokens. For each token we include the output from each of the tasks.

    ```python
    model.predict(..., output_style='json')
    ```

1. **UD**: The model returns the full UD output for each sentence, according to the style of the Hebrew UD Treebank.

    ```python
    model.predict(..., output_style='ud')
    ```

1. **UD, in the style of IAHLT**: The model returns the full UD output, with slight modifications to match the style of IAHLT. These differences mostly concern the granularity of some dependency relations, the way a word's suffix is broken up, and implicit definite articles. The actual tagging behavior doesn't change.

    ```python
    model.predict(..., output_style='iahlt_ud')
    ```
46
+ ---
47
+
48
+ If you only need the output for one of the tasks, you can tell the model to not initialize some of the heads, for example:
49
+ ```python
50
+ model = AutoModel.from_pretrained('dicta-il/dictabert-parse', trust_remote_code=True, do_lex=False)
51
+ ```
52
+
53
+ The list of options are: `do_lex`, `do_syntax`, `do_ner`, `do_prefix`, `do_morph`.
54
+
55
+ ---
56
+
57
+ Sample usage:
58
+
59
+ ```python
60
+ from transformers import AutoModel, AutoTokenizer
61
+
62
+ tokenizer = AutoTokenizer.from_pretrained('dicta-il/dictabert-parse')
63
+ model = AutoModel.from_pretrained('dicta-il/dictabert-parse', trust_remote_code=True)
64
+
65
+ model.eval()
66
+
67
+ sentence = 'ื‘ืฉื ืช 1948 ื”ืฉืœื™ื ืืคืจื™ื ืงื™ืฉื•ืŸ ืืช ืœื™ืžื•ื“ื™ื• ื‘ืคื™ืกื•ืœ ืžืชื›ืช ื•ื‘ืชื•ืœื“ื•ืช ื”ืืžื ื•ืช ื•ื”ื—ืœ ืœืคืจืกื ืžืืžืจื™ื ื”ื•ืžื•ืจื™ืกื˜ื™ื™ื'
68
+ print(model.predict([sentence], tokenizer, output_style='json')) # see below for other return formats
69
+ ```

Output:
```json
[
  {
    "text": "בשנת 1948 השלים אפרים קישון את לימודיו בפיסול מתכת ובתולדות האמנות והחל לפרסם מאמרים הומוריסטיים",
    "tokens": [
      {
        "token": "בשנת",
        "offsets": {
          "start": 0,
          "end": 4
        },
        "syntax": {
          "word": "בשנת",
          "dep_head_idx": 2,
          "dep_func": "obl",
          "dep_head": "השלים"
        },
        "seg": [
          "ב",
          "שנת"
        ],
        "lex": "שנה",
        "morph": {
          "token": "בשנת",
          "pos": "NOUN",
          "feats": {
            "Gender": "Fem",
            "Number": "Sing"
          },
          "prefixes": [
            "ADP"
          ],
          "suffix": false
        }
      },
      {
        "token": "1948",
        "offsets": {
          "start": 5,
          "end": 9
        },
        "syntax": {
          "word": "1948",
          "dep_head_idx": 0,
          "dep_func": "compound:smixut",
          "dep_head": "בשנת"
        },
        "seg": [
          "1948"
        ],
        "lex": "1948",
        "morph": {
          "token": "1948",
          "pos": "NUM",
          "feats": {},
          "prefixes": [],
          "suffix": false
        }
      },
      {
        "token": "השלים",
        "offsets": {
          "start": 10,
          "end": 15
        },
        "syntax": {
          "word": "השלים",
          "dep_head_idx": -1,
          "dep_func": "root",
          "dep_head": "הומוריסטיים"
        },
        "seg": [
          "השלים"
        ],
        "lex": "השלים",
        "morph": {
          "token": "השלים",
          "pos": "VERB",
          "feats": {
            "Gender": "Masc",
            "Number": "Sing",
            "Person": "3",
            "Tense": "Past"
          },
          "prefixes": [],
          "suffix": false
        }
      },
      {
        "token": "אפרים",
        "offsets": {
          "start": 16,
          "end": 21
        },
        "syntax": {
          "word": "אפרים",
          "dep_head_idx": 2,
          "dep_func": "nsubj",
          "dep_head": "השלים"
        },
        "seg": [
          "אפרים"
        ],
        "lex": "אפרים",
        "morph": {
          "token": "אפרים",
          "pos": "PROPN",
          "feats": {},
          "prefixes": [],
          "suffix": false
        }
      },
      {
        "token": "קישון",
        "offsets": {
          "start": 22,
          "end": 27
        },
        "syntax": {
          "word": "קישון",
          "dep_head_idx": 3,
          "dep_func": "flat:name",
          "dep_head": "אפרים"
        },
        "seg": [
          "קישון"
        ],
        "lex": "קישון",
        "morph": {
          "token": "קישון",
          "pos": "PROPN",
          "feats": {},
          "prefixes": [],
          "suffix": false
        }
      },
      {
        "token": "את",
        "offsets": {
          "start": 28,
          "end": 30
        },
        "syntax": {
          "word": "את",
          "dep_head_idx": 6,
          "dep_func": "case:acc",
          "dep_head": "לימודיו"
        },
        "seg": [
          "את"
        ],
        "lex": "את",
        "morph": {
          "token": "את",
          "pos": "ADP",
          "feats": {},
          "prefixes": [],
          "suffix": false
        }
      },
      {
        "token": "לימודיו",
        "offsets": {
          "start": 31,
          "end": 38
        },
        "syntax": {
          "word": "לימודיו",
          "dep_head_idx": 2,
          "dep_func": "obj",
          "dep_head": "השלים"
        },
        "seg": [
          "לימודיו"
        ],
        "lex": "לימוד",
        "morph": {
          "token": "לימודיו",
          "pos": "NOUN",
          "feats": {
            "Gender": "Masc",
            "Number": "Plur"
          },
          "prefixes": [],
          "suffix": "ADP_PRON",
          "suffix_feats": {
            "Gender": "Masc",
            "Number": "Sing",
            "Person": "3"
          }
        }
      },
      {
        "token": "בפיסול",
        "offsets": {
          "start": 39,
          "end": 45
        },
        "syntax": {
          "word": "בפיסול",
          "dep_head_idx": 6,
          "dep_func": "nmod",
          "dep_head": "לימודיו"
        },
        "seg": [
          "ב",
          "פיסול"
        ],
        "lex": "פיסול",
        "morph": {
          "token": "בפיסול",
          "pos": "NOUN",
          "feats": {
            "Gender": "Masc",
            "Number": "Sing"
          },
          "prefixes": [
            "ADP"
          ],
          "suffix": false
        }
      },
      {
        "token": "מתכת",
        "offsets": {
          "start": 46,
          "end": 50
        },
        "syntax": {
          "word": "מתכת",
          "dep_head_idx": 7,
          "dep_func": "compound:smixut",
          "dep_head": "בפיסול"
        },
        "seg": [
          "מתכת"
        ],
        "lex": "מתכת",
        "morph": {
          "token": "מתכת",
          "pos": "NOUN",
          "feats": {
            "Gender": "Fem",
            "Number": "Sing"
          },
          "prefixes": [],
          "suffix": false
        }
      },
      {
        "token": "ובתולדות",
        "offsets": {
          "start": 51,
          "end": 59
        },
        "syntax": {
          "word": "ובתולדות",
          "dep_head_idx": 7,
          "dep_func": "conj",
          "dep_head": "בפיסול"
        },
        "seg": [
          "וב",
          "תולדות"
        ],
        "lex": "תולדה",
        "morph": {
          "token": "ובתולדות",
          "pos": "NOUN",
          "feats": {
            "Gender": "Fem",
            "Number": "Plur"
          },
          "prefixes": [
            "CCONJ",
            "ADP"
          ],
          "suffix": false
        }
      },
      {
        "token": "האמנות",
        "offsets": {
          "start": 60,
          "end": 66
        },
        "syntax": {
          "word": "האמנות",
          "dep_head_idx": 9,
          "dep_func": "compound:smixut",
          "dep_head": "ובתולדות"
        },
        "seg": [
          "ה",
          "אמנות"
        ],
        "lex": "אומנות",
        "morph": {
          "token": "האמנות",
          "pos": "NOUN",
          "feats": {
            "Gender": "Fem",
            "Number": "Sing"
          },
          "prefixes": [
            "DET"
          ],
          "suffix": false
        }
      },
      {
        "token": "והחל",
        "offsets": {
          "start": 67,
          "end": 71
        },
        "syntax": {
          "word": "והחל",
          "dep_head_idx": 2,
          "dep_func": "conj",
          "dep_head": "השלים"
        },
        "seg": [
          "ו",
          "החל"
        ],
        "lex": "החל",
        "morph": {
          "token": "והחל",
          "pos": "VERB",
          "feats": {
            "Gender": "Masc",
            "Number": "Sing",
            "Person": "3",
            "Tense": "Past"
          },
          "prefixes": [
            "CCONJ"
          ],
          "suffix": false
        }
      },
      {
        "token": "לפרסם",
        "offsets": {
          "start": 72,
          "end": 77
        },
        "syntax": {
          "word": "לפרסם",
          "dep_head_idx": 11,
          "dep_func": "xcomp",
          "dep_head": "והחל"
        },
        "seg": [
          "לפרסם"
        ],
        "lex": "פרסם",
        "morph": {
          "token": "לפרסם",
          "pos": "VERB",
          "feats": {},
          "prefixes": [],
          "suffix": false
        }
      },
      {
        "token": "מאמרים",
        "offsets": {
          "start": 78,
          "end": 84
        },
        "syntax": {
          "word": "מאמרים",
          "dep_head_idx": 12,
          "dep_func": "obj",
          "dep_head": "לפרסם"
        },
        "seg": [
          "מאמרים"
        ],
        "lex": "מאמר",
        "morph": {
          "token": "מאמרים",
          "pos": "NOUN",
          "feats": {
            "Gender": "Masc",
            "Number": "Plur"
          },
          "prefixes": [],
          "suffix": false
        }
      },
      {
        "token": "הומוריסטיים",
        "offsets": {
          "start": 85,
          "end": 96
        },
        "syntax": {
          "word": "הומוריסטיים",
          "dep_head_idx": 13,
          "dep_func": "amod",
          "dep_head": "מאמרים"
        },
        "seg": [
          "הומוריסטיים"
        ],
        "lex": "הומוריסטי",
        "morph": {
          "token": "הומוריסטיים",
          "pos": "ADJ",
          "feats": {
            "Gender": "Masc",
            "Number": "Plur"
          },
          "prefixes": [],
          "suffix": false
        }
      }
    ],
    "root_idx": 2,
    "ner_entities": [
      {
        "phrase": "אפרים קישון",
        "label": "PER",
        "start": 16,
        "end": 27,
        "token_start": 3,
        "token_end": 4
      }
    ]
  }
]
```
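
To make the structure above concrete, here is a minimal sketch (plain Python, no model required) that walks a result shaped like the `json` output. The `prediction` object below is a hand-trimmed fragment of the output shown above, keeping only the fields the loop touches:

```python
# A hand-trimmed fragment of the 'json' output style shown above.
prediction = [{
    "text": "בשנת 1948 השלים אפרים קישון",
    "tokens": [
        {"token": "בשנת", "offsets": {"start": 0, "end": 4}, "lex": "שנה",
         "morph": {"pos": "NOUN"}},
        {"token": "1948", "offsets": {"start": 5, "end": 9}, "lex": "1948",
         "morph": {"pos": "NUM"}},
    ],
    "ner_entities": [
        {"phrase": "אפרים קישון", "label": "PER", "start": 16, "end": 27},
    ],
}]

for sent in prediction:
    for tok in sent["tokens"]:
        # Each token's offsets index directly into the sentence text.
        start, end = tok["offsets"]["start"], tok["offsets"]["end"]
        assert sent["text"][start:end] == tok["token"]
        print(tok["token"], tok["lex"], tok["morph"]["pos"])
    # NER spans carry character offsets alongside the matched phrase.
    for ent in sent["ner_entities"]:
        print(ent["label"], ent["phrase"])
```

The same offsets-based slicing works on the full output, since every token dictionary carries the same `offsets`, `lex`, and `morph` fields.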

You can also choose to get your response in UD format:

```python
sentence = 'בשנת 1948 השלים אפרים קישון את לימודיו בפיסול מתכת ובתולדות האמנות והחל לפרסם מאמרים הומוריסטיים'
print(model.predict([sentence], tokenizer, output_style='ud'))
```

Results:
```json
[
  [
    "# sent_id = 1",
    "# text = בשנת 1948 השלים אפרים קישון את לימודיו בפיסול מתכת ובתולדות האמנות והחל לפרסם מאמרים הומוריסטיים",
    "1-2\tבשנת\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\tב\tב\tADP\tADP\t_\t2\tcase\t_\t_",
    "2\tשנת\tשנה\tNOUN\tNOUN\tGender=Fem|Number=Sing\t4\tobl\t_\t_",
    "3\t1948\t1948\tNUM\tNUM\t\t2\tcompound:smixut\t_\t_",
    "4\tהשלים\tהשלים\tVERB\tVERB\tGender=Masc|Number=Sing|Person=3|Tense=Past\t0\troot\t_\t_",
    "5\tאפרים\tאפרים\tPROPN\tPROPN\t\t4\tnsubj\t_\t_",
    "6\tקישון\tקישון\tPROPN\tPROPN\t\t5\tflat:name\t_\t_",
    "7\tאת\tאת\tADP\tADP\t\t8\tcase:acc\t_\t_",
    "8-10\tלימודיו\t_\t_\t_\t_\t_\t_\t_\t_",
    "8\tלימוד_\tלימוד\tNOUN\tNOUN\tGender=Masc|Number=Plur\t4\tobj\t_\t_",
    "9\t_של_\tשל\tADP\tADP\t_\t10\tcase\t_\t_",
    "10\t_הוא\tהוא\tPRON\tPRON\tGender=Masc|Number=Sing|Person=3\t8\tnmod:poss\t_\t_",
    "11-12\tבפיסול\t_\t_\t_\t_\t_\t_\t_\t_",
    "11\tב\tב\tADP\tADP\t_\t12\tcase\t_\t_",
    "12\tפיסול\tפיסול\tNOUN\tNOUN\tGender=Masc|Number=Sing\t8\tnmod\t_\t_",
    "13\tמתכת\tמתכת\tNOUN\tNOUN\tGender=Fem|Number=Sing\t12\tcompound:smixut\t_\t_",
    "14-16\tובתולדות\t_\t_\t_\t_\t_\t_\t_\t_",
    "14\tו\tו\tCCONJ\tCCONJ\t_\t16\tcc\t_\t_",
    "15\tב\tב\tADP\tADP\t_\t16\tcase\t_\t_",
    "16\tתולדות\tתולדה\tNOUN\tNOUN\tGender=Fem|Number=Plur\t12\tconj\t_\t_",
    "17-18\tהאמנות\t_\t_\t_\t_\t_\t_\t_\t_",
    "17\tה\tה\tDET\tDET\t_\t18\tdet\t_\t_",
    "18\tאמנות\tאומנות\tNOUN\tNOUN\tGender=Fem|Number=Sing\t16\tcompound:smixut\t_\t_",
    "19-20\tוהחל\t_\t_\t_\t_\t_\t_\t_\t_",
    "19\tו\tו\tCCONJ\tCCONJ\t_\t20\tcc\t_\t_",
    "20\tהחל\tהחל\tVERB\tVERB\tGender=Masc|Number=Sing|Person=3|Tense=Past\t4\tconj\t_\t_",
    "21\tלפרסם\tפרסם\tVERB\tVERB\t\t20\txcomp\t_\t_",
    "22\tמאמרים\tמאמר\tNOUN\tNOUN\tGender=Masc|Number=Plur\t21\tobj\t_\t_",
    "23\tהומוריסטיים\tהומוריסטי\tADJ\tADJ\tGender=Masc|Number=Plur\t22\tamod\t_\t_"
  ]
]
```
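
Each sentence in the UD output is a list of tab-separated CoNLL-U strings. As a minimal sketch (the helper `parse_ud_sentence` below is illustrative, not part of the model's API), you can split those strings into structured rows, skipping the `#` comment lines and the multi-word token range lines such as `1-2`:

```python
def parse_ud_sentence(lines):
    """Turn CoNLL-U-style lines into (id, form, lemma, upos, head, deprel) tuples."""
    rows = []
    for line in lines:
        if line.startswith("#"):
            continue  # sent_id / text metadata lines
        cols = line.split("\t")
        if "-" in cols[0]:
            continue  # range line introducing a multi-word token; no annotation
        rows.append((int(cols[0]), cols[1], cols[2], cols[3], int(cols[6]), cols[7]))
    return rows

# A few lines copied from the output above.
sent = [
    "# sent_id = 1",
    "# text = בשנת 1948",
    "1-2\tבשנת\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\tב\tב\tADP\tADP\t_\t2\tcase\t_\t_",
    "2\tשנת\tשנה\tNOUN\tNOUN\tGender=Fem|Number=Sing\t4\tobl\t_\t_",
    "3\t1948\t1948\tNUM\tNUM\t\t2\tcompound:smixut\t_\t_",
]
print(parse_ud_sentence(sent))
```

The head column holds the 1-based index of the governing word (0 for the root), so the tuples can be fed straight into any dependency-tree visualizer or evaluation script.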

## Citation

If you use DictaBERT-parse in your research, please cite `MRL Parsing without Tears: The Case of Hebrew`:

**BibTeX:**

```bibtex
to add
```

## License

Shield: [![CC BY 4.0][cc-by-shield]][cc-by]

This work is licensed under a
[Creative Commons Attribution 4.0 International License][cc-by].

[![CC BY 4.0][cc-by-image]][cc-by]

[cc-by]: http://creativecommons.org/licenses/by/4.0/
[cc-by-image]: https://i.creativecommons.org/l/by/4.0/88x31.png
[cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg