language: English
tags:
- classification English model
datasets:
- jrc-acquis
widget:
- text: >-
Extract from the winding-up decision according to Article 14 of Directive
2001/17/EC of the European Parliament and of the Council of 19 March 2001
on the reorganisation and winding-up of insurance undertakings concerning
INGO BALTIC UADB (2005/C 289/03) The bankruptcy proceedings were initiated
against private insurance company INGO BALTIC UADB (company code
110426768) under a respective ruling of Vilnius Regional Court on 18
August 2005. The court ruling came into effect on 30 August 2005. Public
company Jupoga (company code 5008557, Pramones str. 21a, Alytus,
Lithuania, authorization No 48a for the rendering of bankruptcy
administration services as issued by the Ministry of Economy of the
Republic of Lithuania, phone (370) 685 67424) has been appointed as the
administrator of the company that is subject to bankruptcy proceedings.
Juozas Dzekunskas (certificate of competence No 468a) has been appointed
to act as assignee for the administrator. Please send any creditors'
claims immediately to the following address: Odminiu str. 3, LT-01122
Vilnius, Lithuania. Further information is available by phone (370) 5 264
90 90, fax (370) 5 23 13 117, e-mail: ingo@ingo.lt. The authority
authorized by the State to supervise the bankruptcy proceedings against
the company shall be the Insurance Supervisory Commission of the Republic
of Lithuania (Ukmerges str. 222, Vilnius, Lithuania, phone: (370) 5 243 13
70, fax: (370) 5 272 36 896, e-mail: dpk@dpk.lt). Bankruptcy proceedings
are governed by Lithuania's law.
--------------------------------------------------
legal_t5_small_cls_en model
Model for the classification of legal text written in English. It was first released in this repository. The model is trained on three parallel corpora from jrc-acquis.
Model description
legal_t5_small_cls_en is based on the t5-small model and was trained on a large corpus of parallel text. It is a smaller model that scales down the T5 baseline by using d_model = 512, d_ff = 2,048, 8-headed attention, and only 6 layers each in the encoder and decoder. This variant has about 60 million parameters.
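As a rough sanity check on the "about 60 million parameters" figure, the count can be reproduced from the hyperparameters above. This is a minimal sketch, not the authors' code: the vocabulary size (32,128), the per-head dimension (64), the 32 relative-position buckets, and the tied input/output embeddings are assumptions taken from the stock t5-small configuration and may differ for this checkpoint.

```python
# Hyperparameters stated in the model description.
D_MODEL = 512
D_FF = 2048
NUM_HEADS = 8
NUM_LAYERS = 6  # per stack (encoder and decoder)

# Assumptions borrowed from the stock t5-small configuration.
VOCAB_SIZE = 32128   # assumed; the legal vocabulary may differ
D_KV = 64            # assumed per-head dimension
REL_BUCKETS = 32     # assumed number of relative-position buckets


def t5_param_count():
    inner = NUM_HEADS * D_KV
    attention = 4 * D_MODEL * inner   # q, k, v, o projections (no biases)
    feed_forward = 2 * D_MODEL * D_FF  # two linear maps (no biases)
    layer_norm = D_MODEL              # scale only

    encoder_layer = attention + feed_forward + 2 * layer_norm
    # A decoder layer adds a cross-attention block and a third layer norm.
    decoder_layer = 2 * attention + feed_forward + 3 * layer_norm

    # Each stack has one relative-position bias table and a final layer norm.
    stack_extra = REL_BUCKETS * NUM_HEADS + layer_norm

    embedding = VOCAB_SIZE * D_MODEL  # shared with the output head (assumed tied)
    encoder = NUM_LAYERS * encoder_layer + stack_extra
    decoder = NUM_LAYERS * decoder_layer + stack_extra
    return embedding + encoder + decoder


total = t5_param_count()
print(f"approximate parameter count: {total / 1e6:.1f}M")
```

Under these assumptions the total comes out at roughly 60.5 million, consistent with the figure quoted above.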
Intended uses & limitations
The model can be used for the classification of legal texts written in English.
How to use
Here is how to use this model to classify legal text written in English in PyTorch:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, TranslationPipeline

# T5 is a text-to-text model, so the predicted class comes back as generated text.
pipeline = TranslationPipeline(
    model=AutoModelForSeq2SeqLM.from_pretrained("SEBIS/legal_t5_small_cls_en"),
    tokenizer=AutoTokenizer.from_pretrained(
        pretrained_model_name_or_path="SEBIS/legal_t5_small_cls_en",
        do_lower_case=False,
        skip_special_tokens=True,
    ),
    device=0,  # first GPU; use device=-1 to run on CPU
)
en_text = "Extract from the winding-up decision according to Article 14 of Directive 2001/17/EC of the European Parliament and of the Council of 19 March 2001 on the reorganisation and winding-up of insurance undertakings concerning INGO BALTIC UADB (2005/C 289/03) The bankruptcy proceedings were initiated against private insurance company INGO BALTIC UADB (company code 110426768) under a respective ruling of Vilnius Regional Court on 18 August 2005. The court ruling came into effect on 30 August 2005. Public company Jupoga (company code 5008557, Pramones str. 21a, Alytus, Lithuania, authorization No 48a for the rendering of bankruptcy administration services as issued by the Ministry of Economy of the Republic of Lithuania, phone (370) 685 67424) has been appointed as the administrator of the company that is subject to bankruptcy proceedings. Juozas Dzekunskas (certificate of competence No 468a) has been appointed to act as assignee for the administrator. Please send any creditors' claims immediately to the following address: Odminiu str. 3, LT-01122 Vilnius, Lithuania. Further information is available by phone (370) 5 264 90 90, fax (370) 5 23 13 117, e-mail: ingo@ingo.lt. The authority authorized by the State to supervise the bankruptcy proceedings against the company shall be the Insurance Supervisory Commission of the Republic of Lithuania (Ukmerges str. 222, Vilnius, Lithuania, phone: (370) 5 243 13 70, fax: (370) 5 272 36 896, e-mail: dpk@dpk.lt). Bankruptcy proceedings are governed by Lithuania's law. --------------------------------------------------"
pipeline([en_text], max_length=512)
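The pipeline call above returns a list with one dictionary per input text. The following sketch shows one way to pull the predicted label text out of that result. The helper name `extract_labels` is hypothetical, and the sample output string is invented purely for illustration; the `translation_text` key is the one a `TranslationPipeline` uses for its generated text.

```python
def extract_labels(outputs):
    """Return the generated label string for each pipeline result.

    `outputs` is expected to look like the return value of a
    TranslationPipeline call: a list of dicts with a "translation_text" key.
    """
    return [item["translation_text"].strip() for item in outputs]


# Illustrative stand-in for `pipeline([en_text], max_length=512)`;
# the label text here is made up, not real model output.
sample_outputs = [{"translation_text": " example label "}]
print(extract_labels(sample_outputs))
```

In practice the list passed to `extract_labels` would be the value returned by `pipeline([en_text], max_length=512)`.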
Training data
The legal_t5_small_cls_en model was trained on the JRC-ACQUIS dataset, which consists of 14 thousand texts.
Training procedure
Preprocessing
Pretraining
A unigram model with 88M parameters was trained over the complete parallel corpus to obtain the vocabulary (with byte pair encoding) that is used with this model.
Evaluation results
When the model is evaluated on the classification test dataset, it achieves the following results:
Test results:
Model | F1 score |
---|---|
legal_t5_small_cls_en | 0.6247 |
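For readers who want to compute a comparable metric on their own predictions, here is a minimal sketch of an F1 computation. The model card does not state which averaging was used for the reported score, so the macro-averaging below is an assumption, and the function name `macro_f1` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Macro-averaged F1 over all labels seen in gold or predicted."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # gold label was missed

    labels = sorted(set(gold) | set(predicted))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Toy labels for illustration only.
gold = ["a", "a", "b", "b"]
predicted = ["a", "b", "b", "b"]
print(round(macro_f1(gold, predicted), 4))
```

If the reported score was micro-averaged or weighted instead, the per-label scores would need to be combined differently.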