Mainak Manna committed on
Commit e6d9b25
1 Parent(s): 5a9d376

First version of the model

Files changed (1)
  1. README.md +7 -3
README.md CHANGED
@@ -6,7 +6,7 @@ tags:
  datasets:
  - jrc-acquis
  widget:
- - text: "of 2 December 2004 related to the listing of a Conformity Assessment Body under the Sectoral Annex on Telecommunication Equipment (2004/885/EC) THE JOINT COMMITTEE, Having regard to the Agreement on Mutual Recognition between the European Community and the United States of America and in particular Articles 7 and 14, HAS DECIDED AS FOLLOWS: 1. The Conformity Assessment Body in Attachment A is added to the list of Conformity Assessment Bodies in Section V of the Sectoral Annex on Telecommunication Equipment. 2. The specific scope of listing, in terms of products and conformity assessment procedures, of the Conformity Assessment Body indicated in Attachment A has been agreed by the Parties and will be maintained by them. This Decision, done in duplicate, shall be signed by representatives of the Joint Committee who are authorised to act on behalf of the Parties for purposes of amending the Agreement. This Decision shall be effective from the date of the later of these signatures. Signed in Washington DC, 26 November 2004. On behalf of the United States of America James C. Sanford Signed in Brussels, 2 December 2004. On behalf of the European Community Joanna Kioussi -------------------------------------------------- -------------------------------------------------- +++++ ANNEX 1 +++++</br> "
 
  ---
 
@@ -38,7 +38,7 @@ tokenizer=AutoTokenizer.from_pretrained(pretrained_model_name_or_path = "SEBIS/l
  device=0
  )
 
- en_text = "of 2 December 2004 related to the listing of a Conformity Assessment Body under the Sectoral Annex on Telecommunication Equipment (2004/885/EC) THE JOINT COMMITTEE, Having regard to the Agreement on Mutual Recognition between the European Community and the United States of America and in particular Articles 7 and 14, HAS DECIDED AS FOLLOWS: 1. The Conformity Assessment Body in Attachment A is added to the list of Conformity Assessment Bodies in Section V of the Sectoral Annex on Telecommunication Equipment. 2. The specific scope of listing, in terms of products and conformity assessment procedures, of the Conformity Assessment Body indicated in Attachment A has been agreed by the Parties and will be maintained by them. This Decision, done in duplicate, shall be signed by representatives of the Joint Committee who are authorised to act on behalf of the Parties for purposes of amending the Agreement. This Decision shall be effective from the date of the later of these signatures. Signed in Washington DC, 26 November 2004. On behalf of the United States of America James C. Sanford Signed in Brussels, 2 December 2004. On behalf of the European Community Joanna Kioussi -------------------------------------------------- -------------------------------------------------- +++++ ANNEX 1 +++++</br> "
 
  pipeline([en_text], max_length=512)
  ```
@@ -48,11 +48,13 @@ pipeline([en_text], max_length=512)
  The legal_t5_small_summ_en model was trained on the [JRC-ACQUIS](https://wt-public.emm4u.eu/Acquis/index_2.2.html) dataset, which consists of 22 thousand texts.
 
  ## Training procedure
 
 
 
  ### Preprocessing
 
  ### Pretraining
- An unigram model with 88M parameters is trained over the complete parallel corpus to get the vocabulary (with byte pair encoding), which is used with this model.
 
 
  ## Evaluation results
@@ -67,3 +69,5 @@ Test results :
 
 
  ### BibTeX entry and citation info
 
 
 
  datasets:
  - jrc-acquis
  widget:
+ - text: "on the illegal trade in firearms (SCH/Com-ex (99) 10) THE EXECUTIVE COMMITTEE, Having regard to Article 132 of the Convention implementing the Schengen Agreement, Having regard to Article 9 of the abovementioned Convention, HAS DECIDED AS FOLLOWS: Henceforth, the Contracting Parties shall submit each year by 31 July their national annual data for the preceding year on illegal trade in firearms, on the basis of the joint table for compiling statistics annexed to document SCH/I-ar (98) 32. Luxembourg, 28 April 1999. The Chairman C. H. Schapper ANNEX SCH/I-Ar (98) 32 %gt%PIC FILE= %quot%L_2000239EN.047002.EPS%quot%%gt% %gt%PIC FILE= %quot%L_2000239EN.047101.EPS%quot%%gt% %gt%PIC FILE= %quot%L_2000239EN.047201.EPS%quot%%gt% %gt%PIC FILE= %quot%L_2000239EN.047301.EPS%quot%%gt% "
 
  ---
 
 
  device=0
  )
 
+ en_text = "on the illegal trade in firearms (SCH/Com-ex (99) 10) THE EXECUTIVE COMMITTEE, Having regard to Article 132 of the Convention implementing the Schengen Agreement, Having regard to Article 9 of the abovementioned Convention, HAS DECIDED AS FOLLOWS: Henceforth, the Contracting Parties shall submit each year by 31 July their national annual data for the preceding year on illegal trade in firearms, on the basis of the joint table for compiling statistics annexed to document SCH/I-ar (98) 32. Luxembourg, 28 April 1999. The Chairman C. H. Schapper ANNEX SCH/I-Ar (98) 32 %gt%PIC FILE= %quot%L_2000239EN.047002.EPS%quot%%gt% %gt%PIC FILE= %quot%L_2000239EN.047101.EPS%quot%%gt% %gt%PIC FILE= %quot%L_2000239EN.047201.EPS%quot%%gt% %gt%PIC FILE= %quot%L_2000239EN.047301.EPS%quot%%gt% "
 
  pipeline([en_text], max_length=512)
  ```
 
  The legal_t5_small_summ_en model was trained on the [JRC-ACQUIS](https://wt-public.emm4u.eu/Acquis/index_2.2.html) dataset, which consists of 22 thousand texts.
 
  ## Training procedure
+ A unigram model was trained on 88M lines of text from the parallel corpus (of all possible language pairs) to get the vocabulary (with byte pair encoding), which is used with this model.
 
+ The model was trained on a single TPU Pod V3-8 for 250K steps in total, using sequence length 512 (batch size 64). It has a total of approximately 220M parameters and uses the encoder-decoder architecture. The optimizer used for pre-training is AdaFactor with an inverse square root learning rate schedule.
  ### Preprocessing
 
  ### Pretraining
+
 
 
  ## Evaluation results
 
 
 
  ### BibTeX entry and citation info
+
+ > Created by [Ahmed Elnaggar/@Elnaggar_AI](https://twitter.com/Elnaggar_AI) | [LinkedIn](https://www.linkedin.com/in/prof-ahmed-elnaggar/)
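
The training note added in this commit mentions AdaFactor with an inverse square root learning rate schedule but gives no formula. As a rough illustration only, the sketch below shows one common form of that schedule, `lr = base_lr / sqrt(max(step, warmup_steps))`. The function name, `base_lr`, and the `warmup_steps` default of 10,000 are assumptions made for this example and are not stated in the README; only the 250K step count comes from the text.

```python
import math


def inverse_sqrt_lr(step: int, warmup_steps: int = 10_000, base_lr: float = 1.0) -> float:
    """Hypothetical helper: constant rate during warm-up, then 1/sqrt(step) decay.

    warmup_steps and base_lr are illustrative assumptions, not values
    taken from the model card.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    if warmup_steps <= 0:
        raise ValueError("warmup_steps must be positive")
    # Clamp the step so the rate stays flat until warm-up ends.
    return base_lr / math.sqrt(max(step, warmup_steps))


if __name__ == "__main__":
    # Print the rate at a few points, ending at the 250K steps named in the README.
    for step in (0, 10_000, 100_000, 250_000):
        print(step, inverse_sqrt_lr(step))
```

Under these assumptions the rate is flat for the first 10,000 steps and then shrinks with the square root of the step count, so late steps make much smaller updates than early ones.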