---
language: English
tags:
- classification English model
datasets:
- jrc-acquis
widget:
- text: "Extract from the winding-up decision according to Article 14 of Directive 2001/17/EC of the European Parliament and of the Council of 19 March 2001 on the reorganisation and winding-up of insurance undertakings concerning INGO BALTIC UADB (2005/C 289/03) The bankruptcy proceedings were initiated against private insurance company INGO BALTIC UADB (company code 110426768) under a respective ruling of Vilnius Regional Court on 18 August 2005. The court ruling came into effect on 30 August 2005. Public company Jupoga (company code 5008557, Pramones str. 21a, Alytus, Lithuania, authorization No 48a for the rendering of bankruptcy administration services as issued by the Ministry of Economy of the Republic of Lithuania, phone (370) 685 67424) has been appointed as the administrator of the company that is subject to bankruptcy proceedings. Juozas Dzekunskas (certificate of competence No 468a) has been appointed to act as assignee for the administrator. Please send any creditors' claims immediately to the following address: Odminiu str. 3, LT-01122 Vilnius, Lithuania. Further information is available by phone (370) 5 264 90 90, fax (370) 5 23 13 117, e-mail: ingo@ingo.lt. The authority authorized by the State to supervise the bankruptcy proceedings against the company shall be the Insurance Supervisory Commission of the Republic of Lithuania (Ukmerges str. 222, Vilnius, Lithuania, phone: (370) 5 243 13 70, fax: (370) 5 272 36 896, e-mail: dpk@dpk.lt). Bankruptcy proceedings are governed by Lithuania's law. --------------------------------------------------"
---

# legal_t5_small_cls_en model

Model for classification of legal text written in English. It was first released in [this repository](https://github.com/agemagician/LegalTrans). This model is trained on three parallel corpora from JRC-Acquis.

## Model description

legal_t5_small_cls_en is based on the `t5-small` model and was trained on a large corpus of parallel text. This is a smaller model, which scales the baseline model of t5 down by using `dmodel = 512`, `dff = 2,048`, 8-headed attention, and only 6 layers each in the encoder and decoder. This variant has about 60 million parameters.

## Intended uses & limitations

The model can be used for classification of legal texts written in English.

### How to use

Here is how to use this model to classify legal text written in English in PyTorch:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, TranslationPipeline

# The classifier is a T5 sequence-to-sequence model, so it is run through a
# TranslationPipeline and the prediction is returned as generated text.
pipeline = TranslationPipeline(
    model=AutoModelForSeq2SeqLM.from_pretrained("SEBIS/legal_t5_small_cls_en"),
    tokenizer=AutoTokenizer.from_pretrained(
        pretrained_model_name_or_path="SEBIS/legal_t5_small_cls_en",
        do_lower_case=False,
        skip_special_tokens=True,
    ),
    device=0,  # first GPU; use device=-1 to run on CPU
)

en_text = "Extract from the winding-up decision according to Article 14 of Directive 2001/17/EC of the European Parliament and of the Council of 19 March 2001 on the reorganisation and winding-up of insurance undertakings concerning INGO BALTIC UADB (2005/C 289/03) The bankruptcy proceedings were initiated against private insurance company INGO BALTIC UADB (company code 110426768) under a respective ruling of Vilnius Regional Court on 18 August 2005. The court ruling came into effect on 30 August 2005. Public company Jupoga (company code 5008557, Pramones str. 21a, Alytus, Lithuania, authorization No 48a for the rendering of bankruptcy administration services as issued by the Ministry of Economy of the Republic of Lithuania, phone (370) 685 67424) has been appointed as the administrator of the company that is subject to bankruptcy proceedings. Juozas Dzekunskas (certificate of competence No 468a) has been appointed to act as assignee for the administrator. Please send any creditors' claims immediately to the following address: Odminiu str. 3, LT-01122 Vilnius, Lithuania. Further information is available by phone (370) 5 264 90 90, fax (370) 5 23 13 117, e-mail: ingo@ingo.lt. The authority authorized by the State to supervise the bankruptcy proceedings against the company shall be the Insurance Supervisory Commission of the Republic of Lithuania (Ukmerges str. 222, Vilnius, Lithuania, phone: (370) 5 243 13 70, fax: (370) 5 272 36 896, e-mail: dpk@dpk.lt). Bankruptcy proceedings are governed by Lithuania's law. --------------------------------------------------"

pipeline([en_text], max_length=512)
```

## Training data

The legal_t5_small_cls_en model was trained on the [JRC-ACQUIS](https://wt-public.emm4u.eu/Acquis/index_2.2.html) dataset, which consists of 14 thousand texts.

## Training procedure

### Preprocessing

### Pretraining

A unigram model with 88M parameters is trained over the complete parallel corpus to get the vocabulary (with byte pair encoding), which is used with this model.

## Evaluation results

When evaluated on the classification test dataset, the model achieves the following results:

Test results :

| Model | F1 score |
|:-----:|:-----:|
| legal_t5_small_cls_en | 0.6247 |

### BibTeX entry and citation info