EC2 Default User committed on
Commit
faa7399
1 Parent(s): 0c4a7b8

Update spaCy pipeline

LICENSES_SOURCES CHANGED
@@ -450,557 +450,6 @@ Creative Commons may be contacted at creativecommons.org.
450
 
451
 
452
 
453
- # Lemmatization Lists
454
-
455
- * Author: Michal Měchura
456
- * URL: https://github.com/michmech/lemmatization-lists/
457
- * License: ODbL
458
-
459
- ```
460
- ## ODC Open Database License (ODbL)
461
-
462
- ### Preamble
463
-
464
- The Open Database License (ODbL) is a license agreement intended to
465
- allow users to freely share, modify, and use this Database while
466
- maintaining this same freedom for others. Many databases are covered by
467
- copyright, and therefore this document licenses these rights. Some
468
- jurisdictions, mainly in the European Union, have specific rights that
469
- cover databases, and so the ODbL addresses these rights, too. Finally,
470
- the ODbL is also an agreement in contract for users of this Database to
471
- act in certain ways in return for accessing this Database.
472
-
473
- Databases can contain a wide variety of types of content (images,
474
- audiovisual material, and sounds all in the same database, for example),
475
- and so the ODbL only governs the rights over the Database, and not the
476
- contents of the Database individually. Licensors should use the ODbL
477
- together with another license for the contents, if the contents have a
478
- single set of rights that uniformly covers all of the contents. If the
479
- contents have multiple sets of different rights, Licensors should
480
- describe what rights govern what contents together in the individual
481
- record or in some other way that clarifies what rights apply.
482
-
483
- Sometimes the contents of a database, or the database itself, can be
484
- covered by other rights not addressed here (such as private contracts,
485
- trade mark over the name, or privacy rights / data protection rights
486
- over information in the contents), and so you are advised that you may
487
- have to consult other documents or clear other rights before doing
488
- activities not covered by this License.
489
-
490
- ------
491
-
492
- The Licensor (as defined below)
493
-
494
- and
495
-
496
- You (as defined below)
497
-
498
- agree as follows:
499
-
500
- ### 1.0 Definitions of Capitalised Words
501
-
502
- "Collective Database" – Means this Database in unmodified form as part
503
- of a collection of independent databases in themselves that together are
504
- assembled into a collective whole. A work that constitutes a Collective
505
- Database will not be considered a Derivative Database.
506
-
507
- "Convey" – As a verb, means Using the Database, a Derivative Database,
508
- or the Database as part of a Collective Database in any way that enables
509
- a Person to make or receive copies of the Database or a Derivative
510
- Database. Conveying does not include interaction with a user through a
511
- computer network, or creating and Using a Produced Work, where no
512
- transfer of a copy of the Database or a Derivative Database occurs.
513
- "Contents" – The contents of this Database, which includes the
514
- information, independent works, or other material collected into the
515
- Database. For example, the contents of the Database could be factual
516
- data or works such as images, audiovisual material, text, or sounds.
517
-
518
- "Database" – A collection of material (the Contents) arranged in a
519
- systematic or methodical way and individually accessible by electronic
520
- or other means offered under the terms of this License.
521
-
522
- "Database Directive" – Means Directive 96/9/EC of the European
523
- Parliament and of the Council of 11 March 1996 on the legal protection
524
- of databases, as amended or succeeded.
525
-
526
- "Database Right" – Means rights resulting from the Chapter III ("sui
527
- generis") rights in the Database Directive (as amended and as transposed
528
- by member states), which includes the Extraction and Re-utilisation of
529
- the whole or a Substantial part of the Contents, as well as any similar
530
- rights available in the relevant jurisdiction under Section 10.4.
531
-
532
- "Derivative Database" – Means a database based upon the Database, and
533
- includes any translation, adaptation, arrangement, modification, or any
534
- other alteration of the Database or of a Substantial part of the
535
- Contents. This includes, but is not limited to, Extracting or
536
- Re-utilising the whole or a Substantial part of the Contents in a new
537
- Database.
538
-
539
- "Extraction" – Means the permanent or temporary transfer of all or a
540
- Substantial part of the Contents to another medium by any means or in
541
- any form.
542
-
543
- "License" – Means this license agreement and is both a license of rights
544
- such as copyright and Database Rights and an agreement in contract.
545
-
546
- "Licensor" – Means the Person that offers the Database under the terms
547
- of this License.
548
-
549
- "Person" – Means a natural or legal person or a body of persons
550
- corporate or incorporate.
551
-
552
- "Produced Work" – a work (such as an image, audiovisual material, text,
553
- or sounds) resulting from using the whole or a Substantial part of the
554
- Contents (via a search or other query) from this Database, a Derivative
555
- Database, or this Database as part of a Collective Database.
556
-
557
- "Publicly" – means to Persons other than You or under Your control by
558
- either more than 50% ownership or by the power to direct their
559
- activities (such as contracting with an independent consultant).
560
-
561
- "Re-utilisation" – means any form of making available to the public all
562
- or a Substantial part of the Contents by the distribution of copies, by
563
- renting, by online or other forms of transmission.
564
-
565
- "Substantial" – Means substantial in terms of quantity or quality or a
566
- combination of both. The repeated and systematic Extraction or
567
- Re-utilisation of insubstantial parts of the Contents may amount to the
568
- Extraction or Re-utilisation of a Substantial part of the Contents.
569
-
570
- "Use" – As a verb, means doing any act that is restricted by copyright
571
- or Database Rights whether in the original medium or any other; and
572
- includes without limitation distributing, copying, publicly performing,
573
- publicly displaying, and preparing derivative works of the Database, as
574
- well as modifying the Database as may be technically necessary to use it
575
- in a different mode or format.
576
-
577
- "You" – Means a Person exercising rights under this License who has not
578
- previously violated the terms of this License with respect to the
579
- Database, or who has received express permission from the Licensor to
580
- exercise rights under this License despite a previous violation.
581
-
582
- Words in the singular include the plural and vice versa.
583
-
584
- ### 2.0 What this License covers
585
-
586
- 2.1. Legal effect of this document. This License is:
587
-
588
- a. A license of applicable copyright and neighbouring rights;
589
-
590
- b. A license of the Database Right; and
591
-
592
- c. An agreement in contract between You and the Licensor.
593
-
594
- 2.2 Legal rights covered. This License covers the legal rights in the
595
- Database, including:
596
-
597
- a. Copyright. Any copyright or neighbouring rights in the Database.
598
- The copyright licensed includes any individual elements of the
599
- Database, but does not cover the copyright over the Contents
600
- independent of this Database. See Section 2.4 for details. Copyright
601
- law varies between jurisdictions, but is likely to cover: the Database
602
- model or schema, which is the structure, arrangement, and organisation
603
- of the Database, and can also include the Database tables and table
604
- indexes; the data entry and output sheets; and the Field names of
605
- Contents stored in the Database;
606
-
607
- b. Database Rights. Database Rights only extend to the Extraction and
608
- Re-utilisation of the whole or a Substantial part of the Contents.
609
- Database Rights can apply even when there is no copyright over the
610
- Database. Database Rights can also apply when the Contents are removed
611
- from the Database and are selected and arranged in a way that would
612
- not infringe any applicable copyright; and
613
-
614
- c. Contract. This is an agreement between You and the Licensor for
615
- access to the Database. In return you agree to certain conditions of
616
- use on this access as outlined in this License.
617
-
618
- 2.3 Rights not covered.
619
-
620
- a. This License does not apply to computer programs used in the making
621
- or operation of the Database;
622
-
623
- b. This License does not cover any patents over the Contents or the
624
- Database; and
625
-
626
- c. This License does not cover any trademarks associated with the
627
- Database.
628
-
629
- 2.4 Relationship to Contents in the Database. The individual items of
630
- the Contents contained in this Database may be covered by other rights,
631
- including copyright, patent, data protection, privacy, or personality
632
- rights, and this License does not cover any rights (other than Database
633
- Rights or in contract) in individual Contents contained in the Database.
634
- For example, if used on a Database of images (the Contents), this
635
- License would not apply to copyright over individual images, which could
636
- have their own separate licenses, or one single license covering all of
637
- the rights over the images.
638
-
639
- ### 3.0 Rights granted
640
-
641
- 3.1 Subject to the terms and conditions of this License, the Licensor
642
- grants to You a worldwide, royalty-free, non-exclusive, terminable (but
643
- only under Section 9) license to Use the Database for the duration of
644
- any applicable copyright and Database Rights. These rights explicitly
645
- include commercial use, and do not exclude any field of endeavour. To
646
- the extent possible in the relevant jurisdiction, these rights may be
647
- exercised in all media and formats whether now known or created in the
648
- future.
649
-
650
- The rights granted cover, for example:
651
-
652
- a. Extraction and Re-utilisation of the whole or a Substantial part of
653
- the Contents;
654
-
655
- b. Creation of Derivative Databases;
656
-
657
- c. Creation of Collective Databases;
658
-
659
- d. Creation of temporary or permanent reproductions by any means and
660
- in any form, in whole or in part, including of any Derivative
661
- Databases or as a part of Collective Databases; and
662
-
663
- e. Distribution, communication, display, lending, making available, or
664
- performance to the public by any means and in any form, in whole or in
665
- part, including of any Derivative Database or as a part of Collective
666
- Databases.
667
-
668
- 3.2 Compulsory license schemes. For the avoidance of doubt:
669
-
670
- a. Non-waivable compulsory license schemes. In those jurisdictions in
671
- which the right to collect royalties through any statutory or
672
- compulsory licensing scheme cannot be waived, the Licensor reserves
673
- the exclusive right to collect such royalties for any exercise by You
674
- of the rights granted under this License;
675
-
676
- b. Waivable compulsory license schemes. In those jurisdictions in
677
- which the right to collect royalties through any statutory or
678
- compulsory licensing scheme can be waived, the Licensor waives the
679
- exclusive right to collect such royalties for any exercise by You of
680
- the rights granted under this License; and,
681
-
682
- c. Voluntary license schemes. The Licensor waives the right to collect
683
- royalties, whether individually or, in the event that the Licensor is
684
- a member of a collecting society that administers voluntary licensing
685
- schemes, via that society, from any exercise by You of the rights
686
- granted under this License.
687
-
688
- 3.3 The right to release the Database under different terms, or to stop
689
- distributing or making available the Database, is reserved. Note that
690
- this Database may be multiple-licensed, and so You may have the choice
691
- of using alternative licenses for this Database. Subject to Section
692
- 10.4, all other rights not expressly granted by Licensor are reserved.
693
-
694
- ### 4.0 Conditions of Use
695
-
696
- 4.1 The rights granted in Section 3 above are expressly made subject to
697
- Your complying with the following conditions of use. These are important
698
- conditions of this License, and if You fail to follow them, You will be
699
- in material breach of its terms.
700
-
701
- 4.2 Notices. If You Publicly Convey this Database, any Derivative
702
- Database, or the Database as part of a Collective Database, then You
703
- must:
704
-
705
- a. Do so only under the terms of this License or another license
706
- permitted under Section 4.4;
707
-
708
- b. Include a copy of this License (or, as applicable, a license
709
- permitted under Section 4.4) or its Uniform Resource Identifier (URI)
710
- with the Database or Derivative Database, including both in the
711
- Database or Derivative Database and in any relevant documentation; and
712
-
713
- c. Keep intact any copyright or Database Right notices and notices
714
- that refer to this License.
715
-
716
- d. If it is not possible to put the required notices in a particular
717
- file due to its structure, then You must include the notices in a
718
- location (such as a relevant directory) where users would be likely to
719
- look for it.
720
-
721
- 4.3 Notice for using output (Contents). Creating and Using a Produced
722
- Work does not require the notice in Section 4.2. However, if you
723
- Publicly Use a Produced Work, You must include a notice associated with
724
- the Produced Work reasonably calculated to make any Person that uses,
725
- views, accesses, interacts with, or is otherwise exposed to the Produced
726
- Work aware that Content was obtained from the Database, Derivative
727
- Database, or the Database as part of a Collective Database, and that it
728
- is available under this License.
729
-
730
- a. Example notice. The following text will satisfy notice under
731
- Section 4.3:
732
-
733
- Contains information from DATABASE NAME, which is made available
734
- here under the Open Database License (ODbL).
735
-
736
- DATABASE NAME should be replaced with the name of the Database and a
737
- hyperlink to the URI of the Database. "Open Database License" should
738
- contain a hyperlink to the URI of the text of this License. If
739
- hyperlinks are not possible, You should include the plain text of the
740
- required URI's with the above notice.
741
-
742
- 4.4 Share alike.
743
-
744
- a. Any Derivative Database that You Publicly Use must be only under
745
- the terms of:
746
-
747
- i. This License;
748
-
749
- ii. A later version of this License similar in spirit to this
750
- License; or
751
-
752
- iii. A compatible license.
753
-
754
- If You license the Derivative Database under one of the licenses
755
- mentioned in (iii), You must comply with the terms of that license.
756
-
757
- b. For the avoidance of doubt, Extraction or Re-utilisation of the
758
- whole or a Substantial part of the Contents into a new database is a
759
- Derivative Database and must comply with Section 4.4.
760
-
761
- c. Derivative Databases and Produced Works. A Derivative Database is
762
- Publicly Used and so must comply with Section 4.4. if a Produced Work
763
- created from the Derivative Database is Publicly Used.
764
-
765
- d. Share Alike and additional Contents. For the avoidance of doubt,
766
- You must not add Contents to Derivative Databases under Section 4.4 a
767
- that are incompatible with the rights granted under this License.
768
-
769
- e. Compatible licenses. Licensors may authorise a proxy to determine
770
- compatible licenses under Section 4.4 a iii. If they do so, the
771
- authorised proxy's public statement of acceptance of a compatible
772
- license grants You permission to use the compatible license.
773
-
774
-
775
- 4.5 Limits of Share Alike. The requirements of Section 4.4 do not apply
776
- in the following:
777
-
778
- a. For the avoidance of doubt, You are not required to license
779
- Collective Databases under this License if You incorporate this
780
- Database or a Derivative Database in the collection, but this License
781
- still applies to this Database or a Derivative Database as a part of
782
- the Collective Database;
783
-
784
- b. Using this Database, a Derivative Database, or this Database as
785
- part of a Collective Database to create a Produced Work does not
786
- create a Derivative Database for purposes of Section 4.4; and
787
-
788
- c. Use of a Derivative Database internally within an organisation is
789
- not to the public and therefore does not fall under the requirements
790
- of Section 4.4.
791
-
792
- 4.6 Access to Derivative Databases. If You Publicly Use a Derivative
793
- Database or a Produced Work from a Derivative Database, You must also
794
- offer to recipients of the Derivative Database or Produced Work a copy
795
- in a machine readable form of:
796
-
797
- a. The entire Derivative Database; or
798
-
799
- b. A file containing all of the alterations made to the Database or
800
- the method of making the alterations to the Database (such as an
801
- algorithm), including any additional Contents, that make up all the
802
- differences between the Database and the Derivative Database.
803
-
804
- The Derivative Database (under a.) or alteration file (under b.) must be
805
- available at no more than a reasonable production cost for physical
806
- distributions and free of charge if distributed over the internet.
807
-
808
- 4.7 Technological measures and additional terms
809
-
810
- a. This License does not allow You to impose (except subject to
811
- Section 4.7 b.) any terms or any technological measures on the
812
- Database, a Derivative Database, or the whole or a Substantial part of
813
- the Contents that alter or restrict the terms of this License, or any
814
- rights granted under it, or have the effect or intent of restricting
815
- the ability of any person to exercise those rights.
816
-
817
- b. Parallel distribution. You may impose terms or technological
818
- measures on the Database, a Derivative Database, or the whole or a
819
- Substantial part of the Contents (a "Restricted Database") in
820
- contravention of Section 4.74 a. only if You also make a copy of the
821
- Database or a Derivative Database available to the recipient of the
822
- Restricted Database:
823
-
824
- i. That is available without additional fee;
825
-
826
- ii. That is available in a medium that does not alter or restrict
827
- the terms of this License, or any rights granted under it, or have
828
- the effect or intent of restricting the ability of any person to
829
- exercise those rights (an "Unrestricted Database"); and
830
-
831
- iii. The Unrestricted Database is at least as accessible to the
832
- recipient as a practical matter as the Restricted Database.
833
-
834
- c. For the avoidance of doubt, You may place this Database or a
835
- Derivative Database in an authenticated environment, behind a
836
- password, or within a similar access control scheme provided that You
837
- do not alter or restrict the terms of this License or any rights
838
- granted under it or have the effect or intent of restricting the
839
- ability of any person to exercise those rights.
840
-
841
- 4.8 Licensing of others. You may not sublicense the Database. Each time
842
- You communicate the Database, the whole or Substantial part of the
843
- Contents, or any Derivative Database to anyone else in any way, the
844
- Licensor offers to the recipient a license to the Database on the same
845
- terms and conditions as this License. You are not responsible for
846
- enforcing compliance by third parties with this License, but You may
847
- enforce any rights that You have over a Derivative Database. You are
848
- solely responsible for any modifications of a Derivative Database made
849
- by You or another Person at Your direction. You may not impose any
850
- further restrictions on the exercise of the rights granted or affirmed
851
- under this License.
852
-
853
- ### 5.0 Moral rights
854
-
855
- 5.1 Moral rights. This section covers moral rights, including any rights
856
- to be identified as the author of the Database or to object to treatment
857
- that would otherwise prejudice the author's honour and reputation, or
858
- any other derogatory treatment:
859
-
860
- a. For jurisdictions allowing waiver of moral rights, Licensor waives
861
- all moral rights that Licensor may have in the Database to the fullest
862
- extent possible by the law of the relevant jurisdiction under Section
863
- 10.4;
864
-
865
- b. If waiver of moral rights under Section 5.1 a in the relevant
866
- jurisdiction is not possible, Licensor agrees not to assert any moral
867
- rights over the Database and waives all claims in moral rights to the
868
- fullest extent possible by the law of the relevant jurisdiction under
869
- Section 10.4; and
870
-
871
- c. For jurisdictions not allowing waiver or an agreement not to assert
872
- moral rights under Section 5.1 a and b, the author may retain their
873
- moral rights over certain aspects of the Database.
874
-
875
- Please note that some jurisdictions do not allow for the waiver of moral
876
- rights, and so moral rights may still subsist over the Database in some
877
- jurisdictions.
878
-
879
- ### 6.0 Fair dealing, Database exceptions, and other rights not affected
880
-
881
- 6.1 This License does not affect any rights that You or anyone else may
882
- independently have under any applicable law to make any use of this
883
- Database, including without limitation:
884
-
885
- a. Exceptions to the Database Right including: Extraction of Contents
886
- from non-electronic Databases for private purposes, Extraction for
887
- purposes of illustration for teaching or scientific research, and
888
- Extraction or Re-utilisation for public security or an administrative
889
- or judicial procedure.
890
-
891
- b. Fair dealing, fair use, or any other legally recognised limitation
892
- or exception to infringement of copyright or other applicable laws.
893
-
894
- 6.2 This License does not affect any rights of lawful users to Extract
895
- and Re-utilise insubstantial parts of the Contents, evaluated
896
- quantitatively or qualitatively, for any purposes whatsoever, including
897
- creating a Derivative Database (subject to other rights over the
898
- Contents, see Section 2.4). The repeated and systematic Extraction or
899
- Re-utilisation of insubstantial parts of the Contents may however amount
900
- to the Extraction or Re-utilisation of a Substantial part of the
901
- Contents.
902
-
903
- ### 7.0 Warranties and Disclaimer
904
-
905
- 7.1 The Database is licensed by the Licensor "as is" and without any
906
- warranty of any kind, either express, implied, or arising by statute,
907
- custom, course of dealing, or trade usage. Licensor specifically
908
- disclaims any and all implied warranties or conditions of title,
909
- non-infringement, accuracy or completeness, the presence or absence of
910
- errors, fitness for a particular purpose, merchantability, or otherwise.
911
- Some jurisdictions do not allow the exclusion of implied warranties, so
912
- this exclusion may not apply to You.
913
-
914
- ### 8.0 Limitation of liability
915
-
916
- 8.1 Subject to any liability that may not be excluded or limited by law,
917
- the Licensor is not liable for, and expressly excludes, all liability
918
- for loss or damage however and whenever caused to anyone by any use
919
- under this License, whether by You or by anyone else, and whether caused
920
- by any fault on the part of the Licensor or not. This exclusion of
921
- liability includes, but is not limited to, any special, incidental,
922
- consequential, punitive, or exemplary damages such as loss of revenue,
923
- data, anticipated profits, and lost business. This exclusion applies
924
- even if the Licensor has been advised of the possibility of such
925
- damages.
926
-
927
- 8.2 If liability may not be excluded by law, it is limited to actual and
928
- direct financial loss to the extent it is caused by proved negligence on
929
- the part of the Licensor.
930
-
931
- ### 9.0 Termination of Your rights under this License
932
-
933
- 9.1 Any breach by You of the terms and conditions of this License
934
- automatically terminates this License with immediate effect and without
935
- notice to You. For the avoidance of doubt, Persons who have received the
936
- Database, the whole or a Substantial part of the Contents, Derivative
937
- Databases, or the Database as part of a Collective Database from You
938
- under this License will not have their licenses terminated provided
939
- their use is in full compliance with this License or a license granted
940
- under Section 4.8 of this License. Sections 1, 2, 7, 8, 9 and 10 will
941
- survive any termination of this License.
942
-
943
- 9.2 If You are not in breach of the terms of this License, the Licensor
944
- will not terminate Your rights under it.
945
-
946
- 9.3 Unless terminated under Section 9.1, this License is granted to You
947
- for the duration of applicable rights in the Database.
948
-
949
- 9.4 Reinstatement of rights. If you cease any breach of the terms and
950
- conditions of this License, then your full rights under this License
951
- will be reinstated:
952
-
953
- a. Provisionally and subject to permanent termination until the 60th
954
- day after cessation of breach;
955
-
956
- b. Permanently on the 60th day after cessation of breach unless
957
- otherwise reasonably notified by the Licensor; or
958
-
959
- c. Permanently if reasonably notified by the Licensor of the
960
- violation, this is the first time You have received notice of
961
- violation of this License from the Licensor, and You cure the
962
- violation prior to 30 days after your receipt of the notice.
963
-
964
- Persons subject to permanent termination of rights are not eligible to
965
- be a recipient and receive a license under Section 4.8.
966
-
967
- 9.5 Notwithstanding the above, Licensor reserves the right to release
968
- the Database under different license terms or to stop distributing or
969
- making available the Database. Releasing the Database under different
970
- license terms or stopping the distribution of the Database will not
971
- withdraw this License (or any other license that has been, or is
972
- required to be, granted under the terms of this License), and this
973
- License will continue in full force and effect unless terminated as
974
- stated above.
975
-
976
- ### 10.0 General
977
-
978
- 10.1 If any provision of this License is held to be invalid or
979
- unenforceable, that must not affect the validity or enforceability of
980
- the remainder of the terms and conditions of this License and each
981
- remaining provision of this License shall be valid and enforced to the
982
- fullest extent permitted by law.
983
-
984
- 10.2 This License is the entire agreement between the parties with
985
- respect to the rights granted here over the Database. It replaces any
986
- earlier understandings, agreements or representations with respect to
987
- the Database.
988
-
989
- 10.3 If You are in breach of the terms of this License, You will not be
990
- entitled to rely on the terms of this License or to complain of any
991
- breach by the Licensor.
992
-
993
- 10.4 Choice of law. This License takes effect in and will be governed by
994
- the laws of the relevant jurisdiction in which the License terms are
995
- sought to be enforced. If the standard suite of rights granted under
996
- applicable copyright law and Database Rights in the relevant
997
- jurisdiction includes additional rights not granted under this License,
998
- these additional rights are granted in this License in order to meet the
999
- terms of this License.```
1000
-
1001
-
1002
-
1003
-
1004
  # Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)
1005
 
1006
  * Author: Explosion
450
 
451
 
452
 
453
  # Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)
454
 
455
  * Author: Explosion
README.md CHANGED
The diff for this file is too large to render. See raw diff
accuracy.json CHANGED
@@ -3,311 +3,155 @@
3
  "token_p": 0.998357254,
4
  "token_r": 0.9988754325,
5
  "token_f": 0.9986162761,
6
- "sents_p": 0.8262295082,
7
- "sents_r": 0.8168557536,
8
- "sents_f": 0.8215158924,
9
- "dep_uas": 0.7404241849,
10
- "dep_las": 0.6700432626,
11
- "dep_las_per_type": {
12
- "root": {
13
- "p": 0.7385620915,
14
- "r": 0.7325769854,
15
- "f": 0.7355573637
16
- },
17
- "obl": {
18
- "p": 0.5110132159,
19
- "r": 0.5296803653,
20
- "f": 0.5201793722
21
- },
22
- "nmod": {
23
- "p": 0.7538644471,
24
- "r": 0.7624774504,
25
- "f": 0.7581464873
26
- },
27
- "amod": {
28
- "p": 0.7503805175,
29
- "r": 0.7458396369,
30
- "f": 0.7481031866
31
- },
32
- "cc": {
33
- "p": 0.7351129363,
34
- "r": 0.7665952891,
35
- "f": 0.750524109
36
- },
37
- "conj": {
38
- "p": 0.4879032258,
39
- "r": 0.5475113122,
40
- "f": 0.5159914712
41
- },
42
- "obl:arg": {
43
- "p": 0.5672131148,
44
- "r": 0.5831460674,
45
- "f": 0.5750692521
46
- },
47
- "acl": {
48
- "p": 0.4695945946,
49
- "r": 0.4527687296,
50
- "f": 0.4610281924
51
- },
52
- "advmod": {
53
- "p": 0.7456359102,
54
- "r": 0.736453202,
55
- "f": 0.741016109
56
- },
57
- "det": {
58
- "p": 0.7043010753,
59
- "r": 0.8238993711,
60
- "f": 0.7594202899
61
- },
62
- "xcomp": {
63
- "p": 0.7944664032,
64
- "r": 0.858974359,
65
- "f": 0.8254620123
66
- },
67
- "advcl": {
68
- "p": 0.4106280193,
69
- "r": 0.3373015873,
70
- "f": 0.3703703704
71
- },
72
- "parataxis": {
73
- "p": 0.4444444444,
74
- "r": 0.3636363636,
75
- "f": 0.4
76
- },
77
- "advmod:emph": {
78
- "p": 0.6756756757,
79
- "r": 0.5841121495,
80
- "f": 0.626566416
81
- },
82
- "nsubj": {
83
- "p": 0.7166392092,
84
- "r": 0.7038834951,
85
- "f": 0.7102040816
86
- },
87
- "acl:relcl": {
88
- "p": 0.6865671642,
89
- "r": 0.6388888889,
90
- "f": 0.6618705036
91
- },
92
- "case": {
93
- "p": 0.8396946565,
94
- "r": 0.8291457286,
95
- "f": 0.8343868521
96
- },
97
- "csubj": {
98
- "p": 0.5454545455,
99
- "r": 0.375,
100
- "f": 0.4444444444
101
- },
102
- "mark": {
103
- "p": 0.7863247863,
104
- "r": 0.7796610169,
105
- "f": 0.7829787234
106
- },
107
- "cop": {
108
- "p": 0.7697841727,
109
- "r": 0.8294573643,
110
- "f": 0.7985074627
111
- },
112
- "obj": {
113
- "p": 0.8015665796,
114
- "r": 0.7561576355,
115
- "f": 0.7782002535
116
- },
117
- "dep": {
118
- "p": 0.0,
119
- "r": 0.0,
120
- "f": 0.0
121
- },
122
- "ccomp": {
123
- "p": 0.6395348837,
124
- "r": 0.625,
125
- "f": 0.632183908
126
- },
127
- "appos": {
128
- "p": 0.7333333333,
129
- "r": 0.4230769231,
130
- "f": 0.5365853659
131
- },
132
- "nummod": {
133
- "p": 0.7099236641,
134
- "r": 0.6739130435,
135
- "f": 0.6914498141
136
- },
137
- "nummod:gov": {
138
- "p": 0.0,
139
- "r": 0.0,
140
- "f": 0.0
141
- },
142
- "flat": {
143
- "p": 0.3541666667,
144
- "r": 0.1603773585,
145
- "f": 0.2207792208
146
- },
147
- "nsubj:pass": {
148
- "p": 0.5,
149
- "r": 0.4470588235,
150
- "f": 0.4720496894
151
- },
152
- "flat:foreign": {
153
- "p": 0.0,
154
- "r": 0.0,
155
- "f": 0.0
156
- },
157
- "csubj:pass": {
158
- "p": 0.0,
159
- "r": 0.0,
160
- "f": 0.0
161
- },
162
- "iobj": {
163
- "p": 0.0,
164
- "r": 0.0,
165
- "f": 0.0
166
- }
167
- },
168
- "ents_p": 0.750907441,
169
- "ents_r": 0.827913957,
170
- "ents_f": 0.7875327147,
171
- "ents_per_type": {
172
- "PERSON": {
173
- "p": 0.0,
174
- "r": 0.0,
175
- "f": 0.0
176
- },
177
- "GPE": {
178
- "p": 0.0,
179
- "r": 0.0,
180
- "f": 0.0
181
- },
182
- "PRODUCT": {
183
- "p": 0.0,
184
- "r": 0.0,
185
- "f": 0.0
186
- },
187
- "ORG": {
188
- "p": 0.0,
189
- "r": 0.0,
190
- "f": 0.0
191
- },
192
- "LOC": {
193
- "p": 0.0,
194
- "r": 0.0,
195
- "f": 0.0
196
- },
197
- "TIME": {
198
- "p": 0.0,
199
- "r": 0.0,
200
- "f": 0.0
201
- }
202
- },
203
- "speed": 6365.8683040846,
204
- "pos_acc": 0.9496907038,
205
- "morph_acc": 0.8704416663,
206
- "morph_micro_p": 0.9129765114,
207
- "morph_micro_r": 0.9070390207,
208
- "morph_micro_f": 0.909998081,
209
  "morph_per_feat": {
210
  "Case": {
211
- "p": 0.925956329,
212
  "r": 0.924287119,
213
- "f": 0.925120971
214
  },
215
  "Gender": {
216
- "p": 0.9346576059,
217
- "r": 0.931573463,
218
- "f": 0.933112986
219
  },
220
  "Number": {
221
- "p": 0.9204563213,
222
- "r": 0.9207440988,
223
- "f": 0.9206001876
224
  },
225
  "Definite": {
226
- "p": 0.9225490196,
227
- "r": 0.9144800777,
228
- "f": 0.9184968277
229
  },
230
  "Degree": {
231
- "p": 0.8665028665,
232
- "r": 0.8824020017,
233
- "f": 0.8743801653
234
  },
235
  "Polarity": {
236
- "p": 0.9271948608,
237
- "r": 0.9173728814,
238
- "f": 0.922257721
239
  },
240
  "Tense": {
241
- "p": 0.8390889052,
242
- "r": 0.8378576669,
243
- "f": 0.8384728341
244
  },
245
  "VerbForm": {
246
- "p": 0.8966809422,
247
- "r": 0.8871822034,
248
- "f": 0.8919062833
249
  },
250
  "Voice": {
251
- "p": 0.7972166998,
252
- "r": 0.7566037736,
253
- "f": 0.7763794773
254
  },
255
  "PronType": {
256
- "p": 0.9333333333,
257
- "r": 0.9240924092,
258
- "f": 0.9286898839
259
  },
260
  "Aspect": {
261
- "p": 0.8356545961,
262
- "r": 0.826446281,
263
- "f": 0.8310249307
264
  },
265
  "Hyph": {
266
- "p": 0.8974358974,
267
- "r": 0.9308510638,
268
- "f": 0.9138381201
269
  },
270
  "Reflex": {
271
- "p": 0.8128078818,
272
- "r": 0.6088560886,
273
- "f": 0.6962025316
274
  },
275
  "Mood": {
276
- "p": 0.890083632,
277
- "r": 0.8911483254,
278
- "f": 0.8906156605
279
  },
280
  "Person": {
281
- "p": 0.8801980198,
282
- "r": 0.8872255489,
283
- "f": 0.8836978131
284
  },
285
  "AdpType": {
286
- "p": 1.0,
287
- "r": 0.9850746269,
288
- "f": 0.992481203
289
  },
290
  "NumForm": {
291
- "p": 0.9270833333,
292
  "r": 0.89,
293
- "f": 0.9081632653
294
- },
295
- "NumType": {
296
- "p": 0.8,
297
- "r": 0.6428571429,
298
- "f": 0.7128712871
299
  },
300
  "Abbr": {
301
- "p": 0.9686098655,
302
  "r": 0.943231441,
303
- "f": 0.9557522124
304
  },
305
  "Foreign": {
306
- "p": 0.5714285714,
307
- "r": 0.25,
308
- "f": 0.347826087
309
  }
310
  },
311
- "tag_acc": 0.8630012545,
312
- "lemma_acc": 0.7106344332
313
  }
3
  "token_p": 0.998357254,
4
  "token_r": 0.9988754325,
5
  "token_f": 0.9986162761,
6
+ "pos_acc": 0.9468766223,
7
+ "morph_acc": 0.8705658418,
8
+ "morph_micro_p": 0.9154675098,
9
+ "morph_micro_r": 0.9045524101,
10
+ "morph_micro_f": 0.9099772297,
11
  "morph_per_feat": {
12
  "Case": {
13
+ "p": 0.9271740917,
14
  "r": 0.924287119,
15
+ "f": 0.9257283545
16
  },
17
  "Gender": {
18
+ "p": 0.9329511899,
19
+ "r": 0.9327891629,
20
+ "f": 0.9328701693
21
  },
22
  "Number": {
23
+ "p": 0.9187705818,
24
+ "r": 0.9158980772,
25
+ "f": 0.9173320808
26
  },
27
  "Definite": {
28
+ "p": 0.9205955335,
29
+ "r": 0.9013605442,
30
+ "f": 0.9108765038
31
  },
32
  "Degree": {
33
+ "p": 0.8616144975,
34
+ "r": 0.8723936614,
35
+ "f": 0.866970576
36
  },
37
  "Polarity": {
38
+ "p": 0.9304725693,
39
+ "r": 0.907309322,
40
+ "f": 0.9187449718
41
  },
42
  "Tense": {
43
+ "p": 0.8707224335,
44
+ "r": 0.8400586941,
45
+ "f": 0.855115758
46
  },
47
  "VerbForm": {
48
+ "p": 0.9033134166,
49
+ "r": 0.8808262712,
50
+ "f": 0.8919281309
51
  },
52
  "Voice": {
53
+ "p": 0.8136645963,
54
+ "r": 0.741509434,
55
+ "f": 0.7759131293
56
  },
57
  "PronType": {
58
+ "p": 0.9304635762,
59
+ "r": 0.9273927393,
60
+ "f": 0.9289256198
61
  },
62
  "Aspect": {
63
+ "p": 0.8342696629,
64
+ "r": 0.8181818182,
65
+ "f": 0.826147427
66
  },
67
  "Hyph": {
68
+ "p": 0.9037433155,
69
+ "r": 0.8989361702,
70
+ "f": 0.9013333333
71
  },
72
  "Reflex": {
73
+ "p": 0.7579908676,
74
+ "r": 0.6125461255,
75
+ "f": 0.6775510204
76
  },
77
  "Mood": {
78
+ "p": 0.9086479903,
79
+ "r": 0.8923444976,
80
+ "f": 0.9004224502
81
  },
82
  "Person": {
83
+ "p": 0.9114688129,
84
+ "r": 0.9041916168,
85
+ "f": 0.9078156313
86
  },
87
  "AdpType": {
88
+ "p": 0.9899749373,
89
+ "r": 0.9825870647,
90
+ "f": 0.986267166
91
  },
92
  "NumForm": {
93
+ "p": 0.9417989418,
94
  "r": 0.89,
95
+ "f": 0.9151670951
96
  },
97
  "Abbr": {
98
+ "p": 0.9642857143,
99
  "r": 0.943231441,
100
+ "f": 0.9536423841
101
  },
102
  "Foreign": {
103
+ "p": 0.6551724138,
104
+ "r": 0.59375,
105
+ "f": 0.6229508197
106
+ },
107
+ "NumType": {
108
+ "p": 0.7608695652,
109
+ "r": 0.625,
110
+ "f": 0.6862745098
111
+ }
112
+ },
113
+ "tag_acc": 0.8632116283,
114
+ "sents_p": 0.8447712418,
115
+ "sents_r": 0.8379254457,
116
+ "sents_f": 0.8413344182,
117
+ "dep_uas": 0.7353933769,
118
+ "dep_las": 0.6609365113,
119
+ "dep_las_per_type": {},
120
+ "lemma_acc": 0.8484193228,
121
+ "ents_p": 0.7557354926,
122
+ "ents_r": 0.8404202101,
123
+ "ents_f": 0.7958313595,
124
+ "ents_per_type": {
125
+ "ORG": {
126
+ "p": 0.6943866944,
127
+ "r": 0.7625570776,
128
+ "f": 0.7268770403
129
+ },
130
+ "TIME": {
131
+ "p": 0.7280334728,
132
+ "r": 0.7909090909,
133
+ "f": 0.7581699346
134
+ },
135
+ "LOC": {
136
+ "p": 0.7134502924,
137
+ "r": 0.7554179567,
138
+ "f": 0.7338345865
139
+ },
140
+ "PRODUCT": {
141
+ "p": 0.3829787234,
142
+ "r": 0.5714285714,
143
+ "f": 0.4585987261
144
+ },
145
+ "GPE": {
146
+ "p": 0.7651663405,
147
+ "r": 0.9654320988,
148
+ "f": 0.8537117904
149
+ },
150
+ "PERSON": {
151
+ "p": 0.9010791367,
152
+ "r": 0.9109090909,
153
+ "f": 0.9059674503
154
  }
155
  },
156
+ "speed": 9529.4689235955
 
157
  }
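Review note: each reported `f` in the scores above should be the harmonic mean of the `p` and `r` next to it. The sketch below recomputes a few entries from the `+` side to check this. The helper name `f_score` and the dictionary labels are made up for illustration. Returning 0.0 when both inputs are zero is an assumption, chosen to match the all-zero rows on the `-` side.

```python
def f_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero (assumed convention)."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


# (precision, recall, reported F) copied from the "+" side of the diff above.
# The dictionary keys are illustrative labels, not real lookup paths.
reported = {
    "ents_per_type.PERSON": (0.9010791367, 0.9109090909, 0.9059674503),
    "ents_per_type.ORG": (0.6943866944, 0.7625570776, 0.7268770403),
    "morph_per_feat.Foreign": (0.6551724138, 0.59375, 0.6229508197),
}

for name, (p, r, f) in reported.items():
    # The diff shows values rounded to 10 decimals, so compare with a tolerance.
    assert abs(f_score(p, r) - f) < 1e-8, name
    print(f"{name}: recomputed F = {f_score(p, r):.10f}")
```

The same check can be pointed at any other `p`/`r`/`f` triple in the file.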
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
config.cfg CHANGED
@@ -10,7 +10,7 @@ seed = 0
 
  [nlp]
  lang = "lt"
- pipeline = ["tok2vec","morphologizer","tagger","parser","senter","attribute_ruler","lemmatizer","ner"]
+ pipeline = ["tok2vec","morphologizer","tagger","parser","lemmatizer","senter","attribute_ruler","ner"]
  disabled = ["senter"]
  before_creation = null
  after_creation = null
@@ -26,11 +26,22 @@ scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
  validate = false
 
  [components.lemmatizer]
- factory = "lemmatizer"
- mode = "lookup"
- model = null
+ factory = "trainable_lemmatizer"
+ backoff = "orth"
+ min_tree_freq = 3
  overwrite = false
  scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
+ top_k = 1
+
+ [components.lemmatizer.model]
+ @architectures = "spacy.Tagger.v2"
+ nO = null
+ normalize = false
+
+ [components.lemmatizer.model.tok2vec]
+ @architectures = "spacy.Tok2VecListener.v1"
+ width = ${components.tok2vec.model.encode:width}
+ upstream = "tok2vec"
 
  [components.morphologizer]
  factory = "morphologizer"
@@ -39,8 +50,9 @@ overwrite = true
  scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
 
  [components.morphologizer.model]
- @architectures = "spacy.Tagger.v1"
+ @architectures = "spacy.Tagger.v2"
  nO = null
+ normalize = false
 
  [components.morphologizer.model.tok2vec]
  @architectures = "spacy.Tok2VecListener.v1"
@@ -70,7 +82,7 @@ nO = null
  @architectures = "spacy.MultiHashEmbed.v2"
  width = 96
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
- rows = [5000,2500,2500,2500,100]
+ rows = [5000,1000,2500,2500,50]
  include_static_vectors = true
 
  [components.ner.model.tok2vec.encode]
@@ -108,8 +120,9 @@ overwrite = false
  scorer = {"@scorers":"spacy.senter_scorer.v1"}
 
  [components.senter.model]
- @architectures = "spacy.Tagger.v1"
+ @architectures = "spacy.Tagger.v2"
  nO = null
+ normalize = false
 
  [components.senter.model.tok2vec]
  @architectures = "spacy.Tok2Vec.v2"
@@ -130,12 +143,14 @@ maxout_pieces = 2
 
  [components.tagger]
  factory = "tagger"
+ neg_prefix = "!"
  overwrite = false
  scorer = {"@scorers":"spacy.tagger_scorer.v1"}
 
  [components.tagger.model]
- @architectures = "spacy.Tagger.v1"
+ @architectures = "spacy.Tagger.v2"
  nO = null
+ normalize = false
 
  [components.tagger.model.tok2vec]
  @architectures = "spacy.Tok2VecListener.v1"
@@ -152,7 +167,7 @@ factory = "tok2vec"
  @architectures = "spacy.MultiHashEmbed.v2"
  width = ${components.tok2vec.model.encode:width}
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
- rows = [5000,2500,2500,2500,100]
+ rows = [5000,1000,2500,2500,50]
  include_static_vectors = true
 
  [components.tok2vec.model.encode]
@@ -189,7 +204,7 @@ dropout = 0.1
  accumulate_gradient = 1
  patience = 5000
  max_epochs = 0
- max_steps = 0
+ max_steps = 100000
  eval_frequency = 1000
  frozen_components = []
  before_to_disk = null
@@ -224,18 +239,18 @@ eps = 0.00000001
  learn_rate = 0.001
 
  [training.score_weights]
- pos_acc = 0.06
- morph_acc = 0.05
+ pos_acc = 0.1
+ morph_acc = 0.09
  morph_per_feat = null
- tag_acc = 0.06
+ tag_acc = 0.1
  dep_uas = 0.0
- dep_las = 0.16
+ dep_las = 0.29
  dep_las_per_type = null
  sents_p = null
  sents_r = null
- sents_f = 0.02
- lemma_acc = 0.5
- ents_f = 0.16
+ sents_f = 0.04
+ lemma_acc = 0.1
+ ents_f = 0.29
  ents_p = 0.0
  ents_r = 0.0
  ents_per_type = null
@@ -252,6 +267,13 @@ after_init = null
 
  [initialize.components]
 
+ [initialize.components.lemmatizer]
+
+ [initialize.components.lemmatizer.labels]
+ @readers = "spacy.read_labels.v1"
+ path = "corpus/labels/trainable_lemmatizer.json"
+ require = false
+
  [initialize.components.morphologizer]
 
  [initialize.components.morphologizer.labels]
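Review note: the `[training.score_weights]` hunk drops `lemma_acc` from 0.5 to 0.1 and spreads the difference over the other metrics. The sketch below shows roughly how the new weights would combine with the `+` side scores from meta.json. It assumes the model-selection score is a plain weighted sum and that `null` weights contribute nothing. It uses only the weights visible in this hunk, so any weight outside the hunk context is not accounted for. The function name `weighted_score` is hypothetical.

```python
# Weights from the "+" side of [training.score_weights]; entries set to null
# in the config are left out on the assumption that they are ignored.
score_weights = {
    "pos_acc": 0.1, "morph_acc": 0.09, "tag_acc": 0.1,
    "dep_uas": 0.0, "dep_las": 0.29, "sents_f": 0.04,
    "lemma_acc": 0.1, "ents_f": 0.29, "ents_p": 0.0, "ents_r": 0.0,
}

# Matching scores copied from the "+" side of the meta.json diff.
scores = {
    "pos_acc": 0.9468766223, "morph_acc": 0.8705658418, "tag_acc": 0.8632116283,
    "dep_uas": 0.7353933769, "dep_las": 0.6609365113, "sents_f": 0.8413344182,
    "lemma_acc": 0.8484193228, "ents_f": 0.7958313595,
    "ents_p": 0.7557354926, "ents_r": 0.8404202101,
}


def weighted_score(scores: dict, weights: dict) -> float:
    """Plain weighted sum (assumed); metrics missing from `scores` count as 0.0."""
    return sum(scores.get(name, 0.0) * w for name, w in weights.items())


total_weight = sum(score_weights.values())
final = weighted_score(scores, score_weights)
print(f"sum of visible weights: {total_weight:.2f}")
print(f"weighted score under these assumptions: {final:.4f}")
```

The visible weights sum to 1.01 on both sides of the hunk, which looks like per-weight rounding to two decimals.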
lemmatizer/cfg ADDED
@@ -0,0 +1,730 @@