Adriane Boyd committed on
Commit
7506945
1 Parent(s): 8e4eb53

Add vi_udv25_vietnamesevtb_trf-0.0.1

.gitattributes CHANGED
@@ -25,3 +25,8 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
25
  *.zip filter=lfs diff=lfs merge=lfs -text
26
  *.zstandard filter=lfs diff=lfs merge=lfs -text
27
  *tfevents* filter=lfs diff=lfs merge=lfs -text
28
+ *.whl filter=lfs diff=lfs merge=lfs -text
29
+ *.npz filter=lfs diff=lfs merge=lfs -text
30
+ *strings.json filter=lfs diff=lfs merge=lfs -text
31
+ vectors filter=lfs diff=lfs merge=lfs -text
32
+ model filter=lfs diff=lfs merge=lfs -text
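The five patterns added above send wheels, NumPy archives, files ending in `strings.json`, and files named `vectors` or `model` through Git LFS. As a rough illustration only, the sketch below approximates how these slash-free patterns apply by comparing each one with a path's basename using Python's `fnmatch`. Real gitattributes matching has additional rules that this does not cover, and `is_lfs_tracked` and the sample paths are hypothetical names introduced for the example.

```python
from fnmatch import fnmatchcase
from posixpath import basename

# Patterns added by this commit (new lines 28-32 of .gitattributes).
NEW_LFS_PATTERNS = ["*.whl", "*.npz", "*strings.json", "vectors", "model"]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: compare each slash-free pattern with the basename."""
    name = basename(path)
    return any(fnmatchcase(name, pattern) for pattern in NEW_LFS_PATTERNS)


if __name__ == "__main__":
    # Sample paths are illustrative; only the last one appears in this commit.
    for sample in (
        "tagger/model",
        "vocab/strings.json",
        "experimental_char_ner_tokenizer/cfg",
    ):
        print(sample, is_lfs_tracked(sample))
```

Under this approximation, small text files such as `config.cfg` and the per-component `cfg` files stay in regular Git, while the large binary weights go to LFS.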
LICENSE.txt ADDED
@@ -0,0 +1,426 @@
1
+ Attribution-ShareAlike 4.0 International
2
+
3
+ =======================================================================
4
+
5
+ Creative Commons Corporation ("Creative Commons") is not a law firm and
6
+ does not provide legal services or legal advice. Distribution of
7
+ Creative Commons public licenses does not create a lawyer-client or
8
+ other relationship. Creative Commons makes its licenses and related
9
+ information available on an "as-is" basis. Creative Commons gives no
10
+ warranties regarding its licenses, any material licensed under their
11
+ terms and conditions, or any related information. Creative Commons
12
+ disclaims all liability for damages resulting from their use to the
13
+ fullest extent possible.
14
+
15
+ Using Creative Commons Public Licenses
16
+
17
+ Creative Commons public licenses provide a standard set of terms and
18
+ conditions that creators and other rights holders may use to share
19
+ original works of authorship and other material subject to copyright
20
+ and certain other rights specified in the public license below. The
21
+ following considerations are for informational purposes only, are not
22
+ exhaustive, and do not form part of our licenses.
23
+
24
+ Considerations for licensors: Our public licenses are
25
+ intended for use by those authorized to give the public
26
+ permission to use material in ways otherwise restricted by
27
+ copyright and certain other rights. Our licenses are
28
+ irrevocable. Licensors should read and understand the terms
29
+ and conditions of the license they choose before applying it.
30
+ Licensors should also secure all rights necessary before
31
+ applying our licenses so that the public can reuse the
32
+ material as expected. Licensors should clearly mark any
33
+ material not subject to the license. This includes other CC-
34
+ licensed material, or material used under an exception or
35
+ limitation to copyright. More considerations for licensors:
36
+ wiki.creativecommons.org/Considerations_for_licensors
37
+
38
+ Considerations for the public: By using one of our public
39
+ licenses, a licensor grants the public permission to use the
40
+ licensed material under specified terms and conditions. If
41
+ the licensor's permission is not necessary for any reason--for
42
+ example, because of any applicable exception or limitation to
43
+ copyright--then that use is not regulated by the license. Our
44
+ licenses grant only permissions under copyright and certain
45
+ other rights that a licensor has authority to grant. Use of
46
+ the licensed material may still be restricted for other
47
+ reasons, including because others have copyright or other
48
+ rights in the material. A licensor may make special requests,
49
+ such as asking that all changes be marked or described.
50
+ Although not required by our licenses, you are encouraged to
51
+ respect those requests where reasonable. More_considerations
52
+ for the public:
53
+ wiki.creativecommons.org/Considerations_for_licensees
54
+
55
+ =======================================================================
56
+
57
+ Creative Commons Attribution-ShareAlike 4.0 International Public
58
+ License
59
+
60
+ By exercising the Licensed Rights (defined below), You accept and agree
61
+ to be bound by the terms and conditions of this Creative Commons
62
+ Attribution-ShareAlike 4.0 International Public License ("Public
63
+ License"). To the extent this Public License may be interpreted as a
64
+ contract, You are granted the Licensed Rights in consideration of Your
65
+ acceptance of these terms and conditions, and the Licensor grants You
66
+ such rights in consideration of benefits the Licensor receives from
67
+ making the Licensed Material available under these terms and
68
+ conditions.
69
+
70
+
71
+ Section 1 -- Definitions.
72
+
73
+ a. Adapted Material means material subject to Copyright and Similar
74
+ Rights that is derived from or based upon the Licensed Material
75
+ and in which the Licensed Material is translated, altered,
76
+ arranged, transformed, or otherwise modified in a manner requiring
77
+ permission under the Copyright and Similar Rights held by the
78
+ Licensor. For purposes of this Public License, where the Licensed
79
+ Material is a musical work, performance, or sound recording,
80
+ Adapted Material is always produced where the Licensed Material is
81
+ synched in timed relation with a moving image.
82
+
83
+ b. Adapter's License means the license You apply to Your Copyright
84
+ and Similar Rights in Your contributions to Adapted Material in
85
+ accordance with the terms and conditions of this Public License.
86
+
87
+ c. BY-SA Compatible License means a license listed at
88
+ creativecommons.org/compatiblelicenses, approved by Creative
89
+ Commons as essentially the equivalent of this Public License.
90
+
91
+ d. Copyright and Similar Rights means copyright and/or similar rights
92
+ closely related to copyright including, without limitation,
93
+ performance, broadcast, sound recording, and Sui Generis Database
94
+ Rights, without regard to how the rights are labeled or
95
+ categorized. For purposes of this Public License, the rights
96
+ specified in Section 2(b)(1)-(2) are not Copyright and Similar
97
+ Rights.
98
+
99
+ e. Effective Technological Measures means those measures that, in the
100
+ absence of proper authority, may not be circumvented under laws
101
+ fulfilling obligations under Article 11 of the WIPO Copyright
102
+ Treaty adopted on December 20, 1996, and/or similar international
103
+ agreements.
104
+
105
+ f. Exceptions and Limitations means fair use, fair dealing, and/or
106
+ any other exception or limitation to Copyright and Similar Rights
107
+ that applies to Your use of the Licensed Material.
108
+
109
+ g. License Elements means the license attributes listed in the name
110
+ of a Creative Commons Public License. The License Elements of this
111
+ Public License are Attribution and ShareAlike.
112
+
113
+ h. Licensed Material means the artistic or literary work, database,
114
+ or other material to which the Licensor applied this Public
115
+ License.
116
+
117
+ i. Licensed Rights means the rights granted to You subject to the
118
+ terms and conditions of this Public License, which are limited to
119
+ all Copyright and Similar Rights that apply to Your use of the
120
+ Licensed Material and that the Licensor has authority to license.
121
+
122
+ j. Licensor means the individual(s) or entity(ies) granting rights
123
+ under this Public License.
124
+
125
+ k. Share means to provide material to the public by any means or
126
+ process that requires permission under the Licensed Rights, such
127
+ as reproduction, public display, public performance, distribution,
128
+ dissemination, communication, or importation, and to make material
129
+ available to the public including in ways that members of the
130
+ public may access the material from a place and at a time
131
+ individually chosen by them.
132
+
133
+ l. Sui Generis Database Rights means rights other than copyright
134
+ resulting from Directive 96/9/EC of the European Parliament and of
135
+ the Council of 11 March 1996 on the legal protection of databases,
136
+ as amended and/or succeeded, as well as other essentially
137
+ equivalent rights anywhere in the world.
138
+
139
+ m. You means the individual or entity exercising the Licensed Rights
140
+ under this Public License. Your has a corresponding meaning.
141
+
142
+
143
+ Section 2 -- Scope.
144
+
145
+ a. License grant.
146
+
147
+ 1. Subject to the terms and conditions of this Public License,
148
+ the Licensor hereby grants You a worldwide, royalty-free,
149
+ non-sublicensable, non-exclusive, irrevocable license to
150
+ exercise the Licensed Rights in the Licensed Material to:
151
+
152
+ a. reproduce and Share the Licensed Material, in whole or
153
+ in part; and
154
+
155
+ b. produce, reproduce, and Share Adapted Material.
156
+
157
+ 2. Exceptions and Limitations. For the avoidance of doubt, where
158
+ Exceptions and Limitations apply to Your use, this Public
159
+ License does not apply, and You do not need to comply with
160
+ its terms and conditions.
161
+
162
+ 3. Term. The term of this Public License is specified in Section
163
+ 6(a).
164
+
165
+ 4. Media and formats; technical modifications allowed. The
166
+ Licensor authorizes You to exercise the Licensed Rights in
167
+ all media and formats whether now known or hereafter created,
168
+ and to make technical modifications necessary to do so. The
169
+ Licensor waives and/or agrees not to assert any right or
170
+ authority to forbid You from making technical modifications
171
+ necessary to exercise the Licensed Rights, including
172
+ technical modifications necessary to circumvent Effective
173
+ Technological Measures. For purposes of this Public License,
174
+ simply making modifications authorized by this Section 2(a)
175
+ (4) never produces Adapted Material.
176
+
177
+ 5. Downstream recipients.
178
+
179
+ a. Offer from the Licensor -- Licensed Material. Every
180
+ recipient of the Licensed Material automatically
181
+ receives an offer from the Licensor to exercise the
182
+ Licensed Rights under the terms and conditions of this
183
+ Public License.
184
+
185
+ b. Additional offer from the Licensor -- Adapted Material.
186
+ Every recipient of Adapted Material from You
187
+ automatically receives an offer from the Licensor to
188
+ exercise the Licensed Rights in the Adapted Material
189
+ under the conditions of the Adapter's License You apply.
190
+
191
+ c. No downstream restrictions. You may not offer or impose
192
+ any additional or different terms or conditions on, or
193
+ apply any Effective Technological Measures to, the
194
+ Licensed Material if doing so restricts exercise of the
195
+ Licensed Rights by any recipient of the Licensed
196
+ Material.
197
+
198
+ 6. No endorsement. Nothing in this Public License constitutes or
199
+ may be construed as permission to assert or imply that You
200
+ are, or that Your use of the Licensed Material is, connected
201
+ with, or sponsored, endorsed, or granted official status by,
202
+ the Licensor or others designated to receive attribution as
203
+ provided in Section 3(a)(1)(A)(i).
204
+
205
+ b. Other rights.
206
+
207
+ 1. Moral rights, such as the right of integrity, are not
208
+ licensed under this Public License, nor are publicity,
209
+ privacy, and/or other similar personality rights; however, to
210
+ the extent possible, the Licensor waives and/or agrees not to
211
+ assert any such rights held by the Licensor to the limited
212
+ extent necessary to allow You to exercise the Licensed
213
+ Rights, but not otherwise.
214
+
215
+ 2. Patent and trademark rights are not licensed under this
216
+ Public License.
217
+
218
+ 3. To the extent possible, the Licensor waives any right to
219
+ collect royalties from You for the exercise of the Licensed
220
+ Rights, whether directly or through a collecting society
221
+ under any voluntary or waivable statutory or compulsory
222
+ licensing scheme. In all other cases the Licensor expressly
223
+ reserves any right to collect such royalties.
224
+
225
+
226
+ Section 3 -- License Conditions.
227
+
228
+ Your exercise of the Licensed Rights is expressly made subject to the
229
+ following conditions.
230
+
231
+ a. Attribution.
232
+
233
+ 1. If You Share the Licensed Material (including in modified
234
+ form), You must:
235
+
236
+ a. retain the following if it is supplied by the Licensor
237
+ with the Licensed Material:
238
+
239
+ i. identification of the creator(s) of the Licensed
240
+ Material and any others designated to receive
241
+ attribution, in any reasonable manner requested by
242
+ the Licensor (including by pseudonym if
243
+ designated);
244
+
245
+ ii. a copyright notice;
246
+
247
+ iii. a notice that refers to this Public License;
248
+
249
+ iv. a notice that refers to the disclaimer of
250
+ warranties;
251
+
252
+ v. a URI or hyperlink to the Licensed Material to the
253
+ extent reasonably practicable;
254
+
255
+ b. indicate if You modified the Licensed Material and
256
+ retain an indication of any previous modifications; and
257
+
258
+ c. indicate the Licensed Material is licensed under this
259
+ Public License, and include the text of, or the URI or
260
+ hyperlink to, this Public License.
261
+
262
+ 2. You may satisfy the conditions in Section 3(a)(1) in any
263
+ reasonable manner based on the medium, means, and context in
264
+ which You Share the Licensed Material. For example, it may be
265
+ reasonable to satisfy the conditions by providing a URI or
266
+ hyperlink to a resource that includes the required
267
+ information.
268
+
269
+ 3. If requested by the Licensor, You must remove any of the
270
+ information required by Section 3(a)(1)(A) to the extent
271
+ reasonably practicable.
272
+
273
+ b. ShareAlike.
274
+
275
+ In addition to the conditions in Section 3(a), if You Share
276
+ Adapted Material You produce, the following conditions also apply.
277
+
278
+ 1. The Adapter's License You apply must be a Creative Commons
279
+ license with the same License Elements, this version or
280
+ later, or a BY-SA Compatible License.
281
+
282
+ 2. You must include the text of, or the URI or hyperlink to, the
283
+ Adapter's License You apply. You may satisfy this condition
284
+ in any reasonable manner based on the medium, means, and
285
+ context in which You Share Adapted Material.
286
+
287
+ 3. You may not offer or impose any additional or different terms
288
+ or conditions on, or apply any Effective Technological
289
+ Measures to, Adapted Material that restrict exercise of the
290
+ rights granted under the Adapter's License You apply.
291
+
292
+
293
+ Section 4 -- Sui Generis Database Rights.
294
+
295
+ Where the Licensed Rights include Sui Generis Database Rights that
296
+ apply to Your use of the Licensed Material:
297
+
298
+ a. for the avoidance of doubt, Section 2(a)(1) grants You the right
299
+ to extract, reuse, reproduce, and Share all or a substantial
300
+ portion of the contents of the database;
301
+
302
+ b. if You include all or a substantial portion of the database
303
+ contents in a database in which You have Sui Generis Database
304
+ Rights, then the database in which You have Sui Generis Database
305
+ Rights (but not its individual contents) is Adapted Material,
306
+
307
+ including for purposes of Section 3(b); and
308
+ c. You must comply with the conditions in Section 3(a) if You Share
309
+ all or a substantial portion of the contents of the database.
310
+
311
+ For the avoidance of doubt, this Section 4 supplements and does not
312
+ replace Your obligations under this Public License where the Licensed
313
+ Rights include other Copyright and Similar Rights.
314
+
315
+
316
+ Section 5 -- Disclaimer of Warranties and Limitation of Liability.
317
+
318
+ a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE
319
+ EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS
320
+ AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF
321
+ ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS,
322
+ IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION,
323
+ WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR
324
+ PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS,
325
+ ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT
326
+ KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT
327
+ ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU.
328
+
329
+ b. TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE
330
+ TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION,
331
+ NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT,
332
+ INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES,
333
+ COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR
334
+ USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN
335
+ ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR
336
+ DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR
337
+ IN PART, THIS LIMITATION MAY NOT APPLY TO YOU.
338
+
339
+ c. The disclaimer of warranties and limitation of liability provided
340
+ above shall be interpreted in a manner that, to the extent
341
+ possible, most closely approximates an absolute disclaimer and
342
+ waiver of all liability.
343
+
344
+
345
+ Section 6 -- Term and Termination.
346
+
347
+ a. This Public License applies for the term of the Copyright and
348
+ Similar Rights licensed here. However, if You fail to comply with
349
+ this Public License, then Your rights under this Public License
350
+ terminate automatically.
351
+
352
+ b. Where Your right to use the Licensed Material has terminated under
353
+ Section 6(a), it reinstates:
354
+
355
+ 1. automatically as of the date the violation is cured, provided
356
+ it is cured within 30 days of Your discovery of the
357
+ violation; or
358
+
359
+ 2. upon express reinstatement by the Licensor.
360
+
361
+ For the avoidance of doubt, this Section 6(b) does not affect any
362
+ right the Licensor may have to seek remedies for Your violations
363
+ of this Public License.
364
+
365
+ c. For the avoidance of doubt, the Licensor may also offer the
366
+ Licensed Material under separate terms or conditions or stop
367
+ distributing the Licensed Material at any time; however, doing so
368
+ will not terminate this Public License.
369
+
370
+ d. Sections 1, 5, 6, 7, and 8 survive termination of this Public
371
+ License.
372
+
373
+
374
+ Section 7 -- Other Terms and Conditions.
375
+
376
+ a. The Licensor shall not be bound by any additional or different
377
+ terms or conditions communicated by You unless expressly agreed.
378
+
379
+ b. Any arrangements, understandings, or agreements regarding the
380
+ Licensed Material not stated herein are separate from and
381
+ independent of the terms and conditions of this Public License.
382
+
383
+
384
+ Section 8 -- Interpretation.
385
+
386
+ a. For the avoidance of doubt, this Public License does not, and
387
+ shall not be interpreted to, reduce, limit, restrict, or impose
388
+ conditions on any use of the Licensed Material that could lawfully
389
+ be made without permission under this Public License.
390
+
391
+ b. To the extent possible, if any provision of this Public License is
392
+ deemed unenforceable, it shall be automatically reformed to the
393
+ minimum extent necessary to make it enforceable. If the provision
394
+ cannot be reformed, it shall be severed from this Public License
395
+ without affecting the enforceability of the remaining terms and
396
+ conditions.
397
+
398
+ c. No term or condition of this Public License will be waived and no
399
+ failure to comply consented to unless expressly agreed to by the
400
+ Licensor.
401
+
402
+ d. Nothing in this Public License constitutes or may be interpreted
403
+ as a limitation upon, or waiver of, any privileges and immunities
404
+ that apply to the Licensor or You, including from the legal
405
+ processes of any jurisdiction or authority.
406
+
407
+
408
+ =======================================================================
409
+
410
+ Creative Commons is not a party to its public licenses.
411
+ Notwithstanding, Creative Commons may elect to apply one of its public
412
+ licenses to material it publishes and in those instances will be
413
+ considered the "Licensor." Except for the limited purpose of indicating
414
+ that material is shared under a Creative Commons public license or as
415
+ otherwise permitted by the Creative Commons policies published at
416
+ creativecommons.org/policies, Creative Commons does not authorize the
417
+ use of the trademark "Creative Commons" or any other trademark or logo
418
+ of Creative Commons without its prior written consent including,
419
+ without limitation, in connection with any unauthorized modifications
420
+ to any of its public licenses or any other arrangements,
421
+ understandings, or agreements concerning use of licensed material. For
422
+ the avoidance of doubt, this paragraph does not form part of the public
423
+ licenses.
424
+
425
+ Creative Commons may be contacted at creativecommons.org.
426
+
README.md ADDED
@@ -0,0 +1,108 @@
1
+ ---
2
+ tags:
3
+ - spacy
4
+ - token-classification
5
+ language:
6
+ - vi
7
+ license: cc-by-sa-4.0
8
+ model-index:
9
+ - name: vi_udv25_vietnamesevtb_trf
10
+ results:
11
+ - task:
12
+ name: TAG
13
+ type: token-classification
14
+ metrics:
15
+ - name: TAG (XPOS) Accuracy
16
+ type: accuracy
17
+ value: 0.8805048216
18
+ - task:
19
+ name: POS
20
+ type: token-classification
21
+ metrics:
22
+ - name: POS (UPOS) Accuracy
23
+ type: accuracy
24
+ value: 0.9018631331
25
+ - task:
26
+ name: MORPH
27
+ type: token-classification
28
+ metrics:
29
+ - name: Morph (UFeats) Accuracy
30
+ type: accuracy
31
+ value: 0.9695345305
32
+ - task:
33
+ name: LEMMA
34
+ type: token-classification
35
+ metrics:
36
+ - name: Lemma Accuracy
37
+ type: accuracy
38
+ value: 0.8934519139
39
+ - task:
40
+ name: UNLABELED_DEPENDENCIES
41
+ type: token-classification
42
+ metrics:
43
+ - name: Unlabeled Attachment Score (UAS)
44
+ type: f_score
45
+ value: 0.6807696182
46
+ - task:
47
+ name: LABELED_DEPENDENCIES
48
+ type: token-classification
49
+ metrics:
50
+ - name: Labeled Attachment Score (LAS)
51
+ type: f_score
52
+ value: 0.6063552526
53
+ - task:
54
+ name: SENTS
55
+ type: token-classification
56
+ metrics:
57
+ - name: Sentences F-Score
58
+ type: f_score
59
+ value: 0.943275972
60
+ ---
61
+ UD v2.5 benchmarking pipeline for UD_Vietnamese-VTB
62
+
63
+ | Feature | Description |
64
+ | --- | --- |
65
+ | **Name** | `vi_udv25_vietnamesevtb_trf` |
66
+ | **Version** | `0.0.1` |
67
+ | **spaCy** | `>=3.2.1,<3.3.0` |
68
+ | **Default Pipeline** | `experimental_char_ner_tokenizer`, `transformer`, `tagger`, `morphologizer`, `parser`, `experimental_edit_tree_lemmatizer` |
69
+ | **Components** | `experimental_char_ner_tokenizer`, `transformer`, `senter`, `tagger`, `morphologizer`, `parser`, `experimental_edit_tree_lemmatizer` |
70
+ | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
71
+ | **Sources** | [Universal Dependencies v2.5](https://lindat.mff.cuni.cz/repository/xmlui/handle/11234/1-3105) (Zeman, Daniel; et al.) |
72
+ | **License** | `CC BY-SA 4.0` |
73
+ | **Author** | [Explosion](https://explosion.ai) |
74
+
75
+ ### Label Scheme
76
+
77
+ <details>
78
+
79
+ <summary>View label scheme (81 labels for 6 components)</summary>
80
+
81
+ | Component | Labels |
82
+ | --- | --- |
83
+ | **`experimental_char_ner_tokenizer`** | `TOKEN` |
84
+ | **`senter`** | `I`, `S` |
85
+ | **`tagger`** | `!`, `"`, `,`, `-`, `.`, `...`, `:`, `;`, `?`, `@`, `A`, `C`, `CC`, `E`, `I`, `L`, `LBKT`, `M`, `N`, `NP`, `Nb`, `Nc`, `Np`, `Nu`, `Ny`, `P`, `R`, `RBKT`, `T`, `V`, `VP`, `X`, `Y`, `Z` |
86
+ | **`morphologizer`** | `POS=NOUN`, `POS=ADP`, `POS=X\|Polarity=Neg`, `POS=VERB`, `POS=ADJ`, `POS=PUNCT`, `POS=X`, `POS=SCONJ`, `NumType=Card\|POS=NUM`, `POS=DET`, `POS=CCONJ`, `POS=PROPN`, `POS=AUX`, `POS=PART`, `POS=INTJ` |
87
+ | **`parser`** | `ROOT`, `advcl`, `advmod`, `amod`, `appos`, `aux`, `aux:pass`, `case`, `cc`, `ccomp`, `compound`, `conj`, `cop`, `csubj`, `dep`, `det`, `discourse`, `iobj`, `list`, `mark`, `nmod`, `nsubj`, `nummod`, `obj`, `obl`, `parataxis`, `punct`, `xcomp` |
88
+ | **`experimental_edit_tree_lemmatizer`** | `0` |
89
+
90
+ </details>
91
+
92
+ ### Accuracy
93
+
94
+ | Type | Score |
95
+ | --- | --- |
96
+ | `TOKEN_F` | 87.90 |
97
+ | `TOKEN_P` | 86.84 |
98
+ | `TOKEN_R` | 89.00 |
99
+ | `TOKEN_ACC` | 98.42 |
100
+ | `SENTS_F` | 94.33 |
101
+ | `SENTS_P` | 96.23 |
102
+ | `SENTS_R` | 92.50 |
103
+ | `TAG_ACC` | 88.05 |
104
+ | `POS_ACC` | 90.19 |
105
+ | `MORPH_ACC` | 96.95 |
106
+ | `DEP_UAS` | 68.08 |
107
+ | `DEP_LAS` | 60.64 |
108
+ | `LEMMA_ACC` | 89.35 |
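The README reports the same metrics twice: as fractions in the `model-index` front matter and as percentages in the Accuracy table. The sketch below shows that the table values follow from the front-matter values by multiplying by 100 and rounding to two decimals. The pairing of front-matter entries with table keys (for example `TAG (XPOS) Accuracy` with `TAG_ACC`) is an assumption based on the names, and `as_percent` and `accuracy_rows` are hypothetical helpers.

```python
# Metric values copied from the model-index front matter of README.md.
# The table keys they are paired with are assumed from the metric names.
MODEL_INDEX = {
    "SENTS_F": 0.943275972,
    "TAG_ACC": 0.8805048216,
    "POS_ACC": 0.9018631331,
    "MORPH_ACC": 0.9695345305,
    "DEP_UAS": 0.6807696182,
    "DEP_LAS": 0.6063552526,
    "LEMMA_ACC": 0.8934519139,
}


def as_percent(fraction: float) -> str:
    """Format a 0-1 score as a percentage with two decimals."""
    return f"{fraction * 100:.2f}"


def accuracy_rows(scores: dict) -> list:
    """Render markdown table rows in the style of the Accuracy table."""
    return [f"| `{name}` | {as_percent(value)} |" for name, value in scores.items()]


if __name__ == "__main__":
    print("\n".join(accuracy_rows(MODEL_INDEX)))
```

The `TOKEN_*` and `SENTS_P`/`SENTS_R` rows have no front-matter counterpart, so they are left out of the comparison.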
config.cfg ADDED
@@ -0,0 +1,254 @@
1
+ [paths]
2
+ train = "corpus/UD_Vietnamese-VTB/train.spacy"
3
+ dev = "corpus/UD_Vietnamese-VTB/dev.spacy"
4
+ vectors = null
5
+ init_tok2vec = null
6
+ tokenizer_source = "training/UD_Vietnamese-VTB/tokenizer/model-best"
7
+ transformer_source = "training/UD_Vietnamese-VTB/transformer/model-best"
8
+
9
+ [system]
10
+ gpu_allocator = "pytorch"
11
+ seed = 0
12
+
13
+ [nlp]
14
+ lang = "vi"
15
+ pipeline = ["experimental_char_ner_tokenizer","transformer","senter","tagger","morphologizer","parser","experimental_edit_tree_lemmatizer"]
16
+ batch_size = 64
17
+ disabled = ["senter"]
18
+ before_creation = null
19
+ after_creation = null
20
+ after_pipeline_creation = null
21
+ tokenizer = {"@tokenizers":"spacy-experimental.char_pretokenizer.v1"}
22
+
23
+ [components]
24
+
25
+ [components.experimental_char_ner_tokenizer]
26
+ factory = "experimental_char_ner_tokenizer"
27
+ scorer = {"@scorers":"spacy-experimental.tokenizer_scorer.v1"}
28
+
29
+ [components.experimental_char_ner_tokenizer.model]
30
+ @architectures = "spacy.TransitionBasedParser.v2"
31
+ state_type = "ner"
32
+ extra_state_tokens = false
33
+ hidden_width = 64
34
+ maxout_pieces = 2
35
+ use_upper = true
36
+ nO = null
37
+
38
+ [components.experimental_char_ner_tokenizer.model.tok2vec]
39
+ @architectures = "spacy.Tok2Vec.v2"
40
+
41
+ [components.experimental_char_ner_tokenizer.model.tok2vec.embed]
42
+ @architectures = "spacy.MultiHashEmbed.v2"
43
+ width = 128
44
+ attrs = ["ORTH","LOWER","IS_DIGIT","IS_ALPHA","IS_SPACE","IS_PUNCT"]
45
+ rows = [1000,500,50,50,50,50]
46
+ include_static_vectors = false
47
+
48
+ [components.experimental_char_ner_tokenizer.model.tok2vec.encode]
49
+ @architectures = "spacy.MaxoutWindowEncoder.v2"
50
+ width = 128
51
+ depth = 4
52
+ window_size = 4
53
+ maxout_pieces = 2
54
+
55
+ [components.experimental_edit_tree_lemmatizer]
56
+ factory = "experimental_edit_tree_lemmatizer"
57
+ backoff = "orth"
58
+ min_tree_freq = 1
59
+ overwrite = false
60
+ scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
61
+ top_k = 1
62
+
63
+ [components.experimental_edit_tree_lemmatizer.model]
64
+ @architectures = "spacy.Tagger.v1"
65
+ nO = null
66
+
67
+ [components.experimental_edit_tree_lemmatizer.model.tok2vec]
68
+ @architectures = "spacy-transformers.TransformerListener.v1"
69
+ grad_factor = 1.0
70
+ upstream = "transformer"
71
+ pooling = {"@layers":"reduce_mean.v1"}
72
+
73
+ [components.morphologizer]
74
+ factory = "morphologizer"
75
+ extend = false
76
+ overwrite = false
77
+ scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
78
+
79
+ [components.morphologizer.model]
80
+ @architectures = "spacy.Tagger.v1"
81
+ nO = null
82
+
83
+ [components.morphologizer.model.tok2vec]
84
+ @architectures = "spacy-transformers.TransformerListener.v1"
85
+ grad_factor = 1.0
86
+ upstream = "transformer"
87
+ pooling = {"@layers":"reduce_mean.v1"}
88
+
89
+ [components.parser]
90
+ factory = "parser"
91
+ learn_tokens = false
92
+ min_action_freq = 5
93
+ moves = null
94
+ scorer = {"@scorers":"spacy.parser_scorer.v1"}
95
+ update_with_oracle_cut_size = 100
96
+
97
+ [components.parser.model]
98
+ @architectures = "spacy.TransitionBasedParser.v2"
99
+ state_type = "parser"
100
+ extra_state_tokens = false
101
+ hidden_width = 64
102
+ maxout_pieces = 3
103
+ use_upper = false
104
+ nO = null
105
+
106
+ [components.parser.model.tok2vec]
107
+ @architectures = "spacy-transformers.TransformerListener.v1"
108
+ grad_factor = 1.0
+ upstream = "transformer"
+ pooling = {"@layers":"reduce_mean.v1"}
+
+ [components.senter]
+ factory = "senter"
+ overwrite = false
+ scorer = {"@scorers":"spacy.senter_scorer.v1"}
+
+ [components.senter.model]
+ @architectures = "spacy.Tagger.v1"
+ nO = null
+
+ [components.senter.model.tok2vec]
+ @architectures = "spacy-transformers.TransformerListener.v1"
+ grad_factor = 1.0
+ upstream = "transformer"
+ pooling = {"@layers":"reduce_mean.v1"}
+
+ [components.tagger]
+ factory = "tagger"
+ neg_prefix = "!!!"
+ overwrite = false
+ scorer = {"@scorers":"spacy.tagger_scorer.v1"}
+
+ [components.tagger.model]
+ @architectures = "spacy.Tagger.v1"
+ nO = null
+
+ [components.tagger.model.tok2vec]
+ @architectures = "spacy-transformers.TransformerListener.v1"
+ grad_factor = 1.0
+ upstream = "transformer"
+ pooling = {"@layers":"reduce_mean.v1"}
+
+ [components.transformer]
+ factory = "transformer"
+ max_batch_items = 4096
+ set_extra_annotations = {"@annotation_setters":"spacy-transformers.null_annotation_setter.v1"}
+
+ [components.transformer.model]
+ @architectures = "spacy-transformers.TransformerModel.v3"
+ name = "xlm-roberta-base"
+ mixed_precision = true
+
+ [components.transformer.model.get_spans]
+ @span_getters = "spacy-transformers.strided_spans.v1"
+ window = 128
+ stride = 96
+
+ [components.transformer.model.grad_scaler_config]
+
+ [components.transformer.model.tokenizer_config]
+ use_fast = true
+
+ [components.transformer.model.transformer_config]
+
+ [corpora]
+
+ [corpora.dev]
+ @readers = "spacy.Corpus.v1"
+ path = ${paths.dev}
+ max_length = 0
+ gold_preproc = false
+ limit = 0
+ augmenter = null
+
+ [corpora.train]
+ @readers = "spacy.Corpus.v1"
+ path = ${paths.train}
+ max_length = 0
+ gold_preproc = false
+ limit = 0
+ augmenter = null
+
+ [training]
+ train_corpus = "corpora.train"
+ dev_corpus = "corpora.dev"
+ seed = ${system:seed}
+ gpu_allocator = ${system:gpu_allocator}
+ dropout = 0.1
+ accumulate_gradient = 3
+ patience = 5000
+ max_epochs = 0
+ max_steps = 20000
+ eval_frequency = 200
+ frozen_components = []
+ before_to_disk = null
+ annotating_components = []
+
+ [training.batcher]
+ @batchers = "spacy.batch_by_padded.v1"
+ discard_oversize = true
+ get_length = null
+ size = 2000
+ buffer = 256
+
+ [training.logger]
+ @loggers = "spacy.ConsoleLogger.v1"
+ progress_bar = false
+
+ [training.optimizer]
+ @optimizers = "Adam.v1"
+ beta1 = 0.9
+ beta2 = 0.999
+ L2_is_weight_decay = true
+ L2 = 0.01
+ grad_clip = 1.0
+ use_averages = true
+ eps = 0.00000001
+
+ [training.optimizer.learn_rate]
+ @schedules = "warmup_linear.v1"
+ warmup_steps = 250
+ total_steps = 20000
+ initial_rate = 0.00005
+
+ [training.score_weights]
+ token_f = 0.0
+ token_p = null
+ token_r = null
+ token_acc = null
+ sents_f = 0.05
+ sents_p = 0.0
+ sents_r = 0.0
+ tag_acc = 0.11
+ pos_acc = 0.05
+ morph_acc = 0.05
+ morph_per_feat = null
+ dep_uas = 0.11
+ dep_las = 0.11
+ dep_las_per_type = null
+ lemma_acc = 0.52
+
+ [pretraining]
+
+ [initialize]
+ vectors = ${paths.vectors}
+ init_tok2vec = ${paths.init_tok2vec}
+ vocab_data = null
+ lookups = null
+ before_init = null
+ after_init = null
+
+ [initialize.components]
+
+ [initialize.tokenizer]
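
The `[training.optimizer.learn_rate]` block in this config selects `warmup_linear.v1` with `warmup_steps = 250`, `total_steps = 20000`, and `initial_rate = 0.00005`. As a rough sketch only, and assuming the schedule ramps linearly from zero up to the peak rate and then decays linearly back to zero, the per-step rate could be computed as below. The function name `warmup_linear_rate` is hypothetical and is not the library's API; edge-case behavior in the real schedule may differ.

```python
def warmup_linear_rate(step, initial_rate=0.00005, warmup_steps=250, total_steps=20000):
    """Hypothetical helper: learning rate at `step` for a linear warmup
    followed by a linear decay. Defaults are copied from config.cfg."""
    if step < warmup_steps:
        # Ramp up from 0 towards initial_rate during the warmup period.
        factor = step / max(1, warmup_steps)
    else:
        # Decay linearly so the rate reaches 0 at total_steps (never negative).
        factor = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return factor * initial_rate


if __name__ == "__main__":
    for step in (0, 125, 250, 10125, 20000):
        print(step, warmup_linear_rate(step))
```

Under these assumptions the peak rate is reached at step 250, and `max_steps = 20000` in `[training]` coincides with the point where the rate has decayed to zero.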
experimental_char_ner_tokenizer/cfg ADDED
@@ -0,0 +1,13 @@
+ {
+   "moves":null,
+   "update_with_oracle_cut_size":100,
+   "multitasks":[
+
+   ],
+   "min_action_freq":1,
+   "learn_tokens":false,
+   "beam_width":1,
+   "beam_density":0.0,
+   "beam_update_prob":0.0,
+   "incorrect_spans_key":null
+ }
experimental_char_ner_tokenizer/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:81510e06c952a2e4115bedffd791243378f939d0ad4d563eb648295a1d498ff5
+ size 6922248
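
The `model` files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, the object ID, and the size in bytes. A minimal sketch of reading such a pointer follows. The helper name `parse_lfs_pointer` and the returned dictionary keys are illustrative assumptions, and the pointer text is copied from the hunk above.

```python
def parse_lfs_pointer(text):
    """Hypothetical helper: parse the 'key value' lines of a Git LFS
    pointer file into a small dictionary."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is a key, one space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields.get("oid", "").partition(":")
    return {
        "version": fields.get("version"),
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:81510e06c952a2e4115bedffd791243378f939d0ad4d563eb648295a1d498ff5
size 6922248
"""

info = parse_lfs_pointer(POINTER)

if __name__ == "__main__":
    print(info)
```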
experimental_char_ner_tokenizer/moves ADDED
@@ -0,0 +1 @@
+ ��moves�h{"0":{},"1":{"TOKEN":80997},"2":{"TOKEN":80997},"3":{"TOKEN":80997},"4":{"TOKEN":80997,"":1},"5":{"":1}}�cfg��neg_key�
experimental_edit_tree_lemmatizer/cfg ADDED
@@ -0,0 +1,5 @@
+ {
+   "labels":[
+     0
+   ]
+ }
experimental_edit_tree_lemmatizer/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9349eac597785b4fab82e3b17d543069734e4196082ac9e131a34db0f4354333
+ size 3664
experimental_edit_tree_lemmatizer/trees ADDED
Binary file (101 Bytes).
meta.json ADDED
@@ -0,0 +1,307 @@
+ {
+   "lang":"vi",
+   "name":"udv25_vietnamesevtb_trf",
+   "version":"0.0.1",
+   "description":"UD v2.5 benchmarking pipeline for UD_Vietnamese-VTB",
+   "author":"Explosion",
+   "email":"contact@explosion.ai",
+   "url":"https://explosion.ai",
+   "license":"CC BY-SA 4.0",
+   "spacy_version":">=3.2.1,<3.3.0",
+   "spacy_git_version":"800737b41",
+   "vectors":{
+     "width":0,
+     "vectors":0,
+     "keys":0,
+     "name":null
+   },
+   "labels":{
+     "experimental_char_ner_tokenizer":[
+       "TOKEN"
+     ],
+     "transformer":[
+
+     ],
+     "senter":[
+       "I",
+       "S"
+     ],
+     "tagger":[
+       "!",
+       "\"",
+       ",",
+       "-",
+       ".",
+       "...",
+       ":",
+       ";",
+       "?",
+       "@",
+       "A",
+       "C",
+       "CC",
+       "E",
+       "I",
+       "L",
+       "LBKT",
+       "M",
+       "N",
+       "NP",
+       "Nb",
+       "Nc",
+       "Np",
+       "Nu",
+       "Ny",
+       "P",
+       "R",
+       "RBKT",
+       "T",
+       "V",
+       "VP",
+       "X",
+       "Y",
+       "Z"
+     ],
+     "morphologizer":[
+       "POS=NOUN",
+       "POS=ADP",
+       "POS=X|Polarity=Neg",
+       "POS=VERB",
+       "POS=ADJ",
+       "POS=PUNCT",
+       "POS=X",
+       "POS=SCONJ",
+       "NumType=Card|POS=NUM",
+       "POS=DET",
+       "POS=CCONJ",
+       "POS=PROPN",
+       "POS=AUX",
+       "POS=PART",
+       "POS=INTJ"
+     ],
+     "parser":[
+       "ROOT",
+       "advcl",
+       "advmod",
+       "amod",
+       "appos",
+       "aux",
+       "aux:pass",
+       "case",
+       "cc",
+       "ccomp",
+       "compound",
+       "conj",
+       "cop",
+       "csubj",
+       "dep",
+       "det",
+       "discourse",
+       "iobj",
+       "list",
+       "mark",
+       "nmod",
+       "nsubj",
+       "nummod",
+       "obj",
+       "obl",
+       "parataxis",
+       "punct",
+       "xcomp"
+     ],
+     "experimental_edit_tree_lemmatizer":[
+       0
+     ]
+   },
+   "pipeline":[
+     "experimental_char_ner_tokenizer",
+     "transformer",
+     "tagger",
+     "morphologizer",
+     "parser",
+     "experimental_edit_tree_lemmatizer"
+   ],
+   "components":[
+     "experimental_char_ner_tokenizer",
+     "transformer",
+     "senter",
+     "tagger",
+     "morphologizer",
+     "parser",
+     "experimental_edit_tree_lemmatizer"
+   ],
+   "disabled":[
+     "senter"
+   ],
+   "sources":[
+     {
+       "name":"Universal Dependencies v2.5",
+       "url":"https://lindat.mff.cuni.cz/repository/xmlui/handle/11234/1-3105",
+       "author":"Zeman, Daniel; et al."
+     }
+   ],
+   "performance":{
+     "token_f":0.8790426353,
+     "token_p":0.8683898305,
+     "token_r":0.8899600486,
+     "token_acc":0.9842035036,
+     "sents_f":0.943275972,
+     "sents_p":0.9622886866,
+     "sents_r":0.925,
+     "tag_acc":0.8805048216,
+     "pos_acc":0.9018631331,
+     "morph_acc":0.9695345305,
+     "morph_per_feat":{
+       "NumType":{
+         "p":1.0,
+         "r":0.9409594096,
+         "f":0.969581749
+       },
+       "Polarity":{
+         "p":1.0,
+         "r":0.9932885906,
+         "f":0.9966329966
+       }
+     },
+     "dep_uas":0.6807696182,
+     "dep_las":0.6063552526,
+     "dep_las_per_type":{
+       "cc":{
+         "p":0.625282167,
+         "r":0.5794979079,
+         "f":0.6015200869
+       },
+       "nummod":{
+         "p":0.84375,
+         "r":0.8503937008,
+         "f":0.8470588235
+       },
+       "compound":{
+         "p":0.5607375271,
+         "r":0.6251511487,
+         "f":0.5911949686
+       },
+       "nsubj":{
+         "p":0.7237880496,
+         "r":0.6881028939,
+         "f":0.7054945055
+       },
+       "advmod":{
+         "p":0.843937575,
+         "r":0.762472885,
+         "f":0.8011396011
+       },
+       "root":{
+         "p":0.7582128778,
+         "r":0.72125,
+         "f":0.7392696989
+       },
+       "obj":{
+         "p":0.7544843049,
+         "r":0.5674536256,
+         "f":0.6477382098
+       },
+       "case":{
+         "p":0.7461538462,
+         "r":0.8164983165,
+         "f":0.7797427653
+       },
+       "obl":{
+         "p":0.5305164319,
+         "r":0.5566502463,
+         "f":0.5432692308
+       },
+       "xcomp":{
+         "p":0.5512552301,
+         "r":0.5648445874,
+         "f":0.5579671784
+       },
+       "amod":{
+         "p":0.511627907,
+         "r":0.4074074074,
+         "f":0.4536082474
+       },
+       "nmod":{
+         "p":0.5875,
+         "r":0.4159292035,
+         "f":0.4870466321
+       },
+       "det":{
+         "p":0.8123393316,
+         "r":0.7559808612,
+         "f":0.7831474597
+       },
+       "aux:pass":{
+         "p":0.8360655738,
+         "r":0.6144578313,
+         "f":0.7083333333
+       },
+       "ccomp":{
+         "p":0.3333333333,
+         "r":0.4194630872,
+         "f":0.3714710253
+       },
+       "parataxis":{
+         "p":0.3555555556,
+         "r":0.2461538462,
+         "f":0.2909090909
+       },
+       "mark":{
+         "p":0.18,
+         "r":0.1551724138,
+         "f":0.1666666667
+       },
+       "iobj":{
+         "p":0.0,
+         "r":0.0,
+         "f":0.0
+       },
+       "cop":{
+         "p":0.7843137255,
+         "r":0.8080808081,
+         "f":0.7960199005
+       },
+       "csubj":{
+         "p":0.1538461538,
+         "r":0.1379310345,
+         "f":0.1454545455
+       },
+       "aux":{
+         "p":0.6071428571,
+         "r":0.7906976744,
+         "f":0.6868686869
+       },
+       "conj":{
+         "p":0.5621468927,
+         "r":0.5012594458,
+         "f":0.5299600533
+       },
+       "advcl":{
+         "p":0.2118644068,
+         "r":0.2212389381,
+         "f":0.2164502165
+       },
+       "dep":{
+         "p":0.1785714286,
+         "r":0.1041666667,
+         "f":0.1315789474
+       },
+       "discourse":{
+         "p":0.4716981132,
+         "r":0.3048780488,
+         "f":0.3703703704
+       },
+       "appos":{
+         "p":0.4615384615,
+         "r":0.4,
+         "f":0.4285714286
+       }
+     },
+     "lemma_acc":0.8934519139
+   },
+   "requirements":[
+     "pyvi",
+     "spacy-experimental>=0.2.0,<0.3.0",
+     "spacy-transformers>=1.1.3,<1.2.0"
+   ]
+ }
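
The `performance` block of meta.json can be related to the `[training.score_weights]` section of config.cfg. The sketch below assumes the overall training score is a plain weighted sum of the listed metrics, with `null` weights skipped. The function name `weighted_score` is hypothetical; the weights and scores are copied from this commit.

```python
# Weights from [training.score_weights] in config.cfg (None stands for null).
SCORE_WEIGHTS = {
    "token_f": 0.0, "token_p": None, "token_r": None, "token_acc": None,
    "sents_f": 0.05, "sents_p": 0.0, "sents_r": 0.0,
    "tag_acc": 0.11, "pos_acc": 0.05, "morph_acc": 0.05,
    "dep_uas": 0.11, "dep_las": 0.11, "lemma_acc": 0.52,
}

# Scalar scores from the "performance" block of meta.json.
PERFORMANCE = {
    "token_f": 0.8790426353, "token_p": 0.8683898305,
    "token_r": 0.8899600486, "token_acc": 0.9842035036,
    "sents_f": 0.943275972, "sents_p": 0.9622886866, "sents_r": 0.925,
    "tag_acc": 0.8805048216, "pos_acc": 0.9018631331,
    "morph_acc": 0.9695345305, "dep_uas": 0.6807696182,
    "dep_las": 0.6063552526, "lemma_acc": 0.8934519139,
}


def weighted_score(scores, weights):
    """Hypothetical helper: sum weight * score over all metrics whose
    weight is not None. Missing scores count as 0.0."""
    total = 0.0
    for name, weight in weights.items():
        if weight is None:
            continue
        total += weight * scores.get(name, 0.0)
    return total


final_score = weighted_score(PERFORMANCE, SCORE_WEIGHTS)
weight_sum = sum(w for w in SCORE_WEIGHTS.values() if w is not None)

if __name__ == "__main__":
    print(f"sum of weights: {weight_sum:.2f}")
    print(f"weighted score: {final_score:.4f}")
```

The non-null weights sum to 1.0, and more than half of the total (0.52) rests on `lemma_acc`, so under this assumption lemmatizer accuracy dominates the combined score.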
morphologizer/cfg ADDED
@@ -0,0 +1,38 @@
+ {
+   "extend":false,
+   "labels_morph":{
+     "POS=NOUN":"",
+     "POS=ADP":"",
+     "POS=X|Polarity=Neg":"Polarity=Neg",
+     "POS=VERB":"",
+     "POS=ADJ":"",
+     "POS=PUNCT":"",
+     "POS=X":"",
+     "POS=SCONJ":"",
+     "NumType=Card|POS=NUM":"NumType=Card",
+     "POS=DET":"",
+     "POS=CCONJ":"",
+     "POS=PROPN":"",
+     "POS=AUX":"",
+     "POS=PART":"",
+     "POS=INTJ":""
+   },
+   "labels_pos":{
+     "POS=NOUN":92,
+     "POS=ADP":85,
+     "POS=X|Polarity=Neg":101,
+     "POS=VERB":100,
+     "POS=ADJ":84,
+     "POS=PUNCT":97,
+     "POS=X":101,
+     "POS=SCONJ":98,
+     "NumType=Card|POS=NUM":93,
+     "POS=DET":90,
+     "POS=CCONJ":89,
+     "POS=PROPN":96,
+     "POS=AUX":87,
+     "POS=PART":94,
+     "POS=INTJ":91
+   },
+   "overwrite":false
+ }
+ }
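
In morphologizer/cfg, each combined label maps to its morphological features in `labels_morph` (the label with the `POS=` field removed) and to an integer part-of-speech ID in `labels_pos`. The sketch below reproduces only the string side of that mapping. The helper `split_morph_label` is hypothetical and is not how the library necessarily derives these tables.

```python
def split_morph_label(label):
    """Hypothetical helper: split a combined label such as
    'NumType=Card|POS=NUM' into (pos, morph_features), where
    morph_features keeps every field except POS."""
    pos = ""
    features = []
    for field in label.split("|"):
        key, _, value = field.partition("=")
        if key == "POS":
            pos = value
        else:
            features.append(field)
    return pos, "|".join(features)


if __name__ == "__main__":
    for label in ("POS=NOUN", "POS=X|Polarity=Neg", "NumType=Card|POS=NUM"):
        print(label, "->", split_morph_label(label))
```

For the three label shapes that occur in this file, the second element of the result matches the corresponding `labels_morph` value.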
morphologizer/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9c062a431612822ef8a58e7bcdb29ea343b40f02bf7440f19830430ef9881db8
+ size 46728
parser/cfg ADDED
@@ -0,0 +1,13 @@
+ {
+   "moves":null,
+   "update_with_oracle_cut_size":100,
+   "multitasks":[
+
+   ],
+   "min_action_freq":5,
+   "learn_tokens":false,
+   "beam_width":1,
+   "beam_density":0.0,
+   "beam_update_prob":0.0,
+   "incorrect_spans_key":null
+ }
parser/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1d154d8d5a1cb42ff39210effe328c3dfe4e9cad85c62cbb4a166e1041dcdad4
+ size 522845
parser/moves ADDED
@@ -0,0 +1 @@
+ ��moves�|{"0":{"":7792},"1":{"":11095},"2":{"nsubj":1680,"advmod":1120,"case":1082,"punct":834,"cc":578,"nummod":435,"compound":418,"det":368,"obl":242,"cop":183,"advcl":171,"aux:pass":162,"aux":115,"obj":66,"discourse":65,"amod":65,"csubj":45,"xcomp":38,"dep":32,"nmod":25,"parataxis":22,"mark":15,"ccomp":13},"3":{"punct":2095,"obj":1764,"xcomp":1664,"compound":1244,"conj":734,"ccomp":588,"nmod":505,"amod":445,"obl":374,"advmod":354,"det":337,"parataxis":185,"cc":154,"advcl":132,"nummod":130,"mark":82,"discourse":62,"case":52,"dep":44,"appos":37,"aux:pass":25,"punct||conj":24,"iobj":17,"cc||conj":11,"nsubj":8,"list":7},"4":{"ROOT":1400}}�cfg��neg_key�
senter/cfg ADDED
@@ -0,0 +1,3 @@
+ {
+   "overwrite":false
+ }
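
The `senter` component is shipped with the package but listed under `disabled` in meta.json, which is why it appears in `components` and not in `pipeline`. The relationship between the three lists can be sketched as follows; `active_pipeline` is a hypothetical helper, and the lists are copied from meta.json.

```python
COMPONENTS = [
    "experimental_char_ner_tokenizer",
    "transformer",
    "senter",
    "tagger",
    "morphologizer",
    "parser",
    "experimental_edit_tree_lemmatizer",
]
DISABLED = ["senter"]


def active_pipeline(components, disabled):
    """Hypothetical helper: keep component order, drop disabled names."""
    disabled_names = set(disabled)
    return [name for name in components if name not in disabled_names]


PIPELINE = active_pipeline(COMPONENTS, DISABLED)

if __name__ == "__main__":
    print(PIPELINE)
```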
senter/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a8fbfd60fcb2540c0b62ba4faef6d62f9fd75aa3eadb038840e593e8bba35d9d
+ size 6740
tagger/cfg ADDED
@@ -0,0 +1,40 @@
+ {
+   "labels":[
+     "!",
+     "\"",
+     ",",
+     "-",
+     ".",
+     "...",
+     ":",
+     ";",
+     "?",
+     "@",
+     "A",
+     "C",
+     "CC",
+     "E",
+     "I",
+     "L",
+     "LBKT",
+     "M",
+     "N",
+     "NP",
+     "Nb",
+     "Nc",
+     "Np",
+     "Nu",
+     "Ny",
+     "P",
+     "R",
+     "RBKT",
+     "T",
+     "V",
+     "VP",
+     "X",
+     "Y",
+     "Z"
+   ],
+   "neg_prefix":"!!!",
+   "overwrite":false
+ }
tagger/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fec970c90ba6bb8ea757696accb2904bd4a664ecbc600ff8b213270edcaaa79b
+ size 105174
transformer/cfg ADDED
@@ -0,0 +1,3 @@
+ {
+   "max_batch_items":4096
+ }
transformer/model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:66356f788556ac16eb8de4f95095ff725f8f5f6bad73724c7f59a25fe7fdf48e
+ size 1126406104
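
The transformer weights stored here are used with the `strided_spans.v1` span getter configured in config.cfg (`window = 128`, `stride = 96`), so long documents are cut into overlapping windows before being passed to the transformer. A simplified sketch of that windowing over token offsets follows. `strided_windows` is a hypothetical function, not the spacy-transformers implementation, and its handling of the final window may differ from the library's.

```python
def strided_windows(n_tokens, window=128, stride=96):
    """Hypothetical helper: return (start, end) token offsets of
    overlapping windows that together cover n_tokens tokens."""
    spans = []
    start = 0
    while start < n_tokens:
        end = min(start + window, n_tokens)
        spans.append((start, end))
        if end == n_tokens:
            # The last window already reaches the end of the document.
            break
        start += stride
    return spans


if __name__ == "__main__":
    print(strided_windows(300))
```

With these settings, consecutive full windows overlap by `window - stride = 32` tokens, so tokens near a window boundary are seen with context on both sides.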
vi_udv25_vietnamesevtb_trf-any-py3-none-any.whl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5ea1fd24d0b0e17d042f208f69016562cb4204ede0d2fcec2c8d62c0bb732aa9
+ size 839950911
vocab/key2row ADDED
@@ -0,0 +1 @@
+
vocab/lookups.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:76be8b528d0075f7aae98d6fa57a6d3c83ae480a8469e668d7b0af968995ac71
+ size 1
vocab/strings.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f26b38047d913d09c0d4ff670d26060a267e56c9359f2befee1e38860ddfdfc6
+ size 368227
vocab/vectors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:14772b683e726436d5948ad3fff2b43d036ef2ebbe3458aafed6004e05a40706
+ size 128
vocab/vectors.cfg ADDED
@@ -0,0 +1,3 @@
+ {
+   "mode":"default"
+ }