SEBIS
/

legal_t5_small_trans_fr_en

Text2Text Generation

translation French English model

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

Mainak Manna commited on Jan 29, 2021

Commit

0c3589a

·

1 Parent(s): c281d3d

First version of the model

Files changed (1) hide show

README.md +4 -4

README.md CHANGED Viewed

@@ -6,7 +6,7 @@ tags:
 datasets:
 - dcep europarl jrc-acquis
 widget:
-- text: "Lundi 26 mars 2012, de 15 heures à 18 h 30"
 ---
@@ -38,7 +38,7 @@ tokenizer=AutoTokenizer.from_pretrained(pretrained_model_name_or_path = "SEBIS/l
     device=0
 )
-fr_text = "Lundi 26 mars 2012, de 15 heures à 18 h 30"
 pipeline([fr_text], max_length=512)
 ```
@@ -49,12 +49,12 @@ The legal_t5_small_trans_fr_en model was trained on [JRC-ACQUIS](https://wt-publ
 ## Training procedure
-An unigram model trained with 88M lines of text from the parallel corpus (of all possible language pairs) to get the vocabulary (with byte pair encoding), which is used with this model.
 The model was trained on a single TPU Pod V3-8 for 250K steps in total, using sequence length 512 (batch size 4096). It has a total of approximately 220M parameters and was trained using the encoder-decoder architecture. The optimizer used is AdaFactor with inverse square root learning rate schedule for pre-training.
 ### Preprocessing
 ### Pretraining

 datasets:
 - dcep europarl jrc-acquis
 widget:
+- text: "quels montants ont été attribués et quelles sommes ont été effectivement utilisées dans chaque État membre? 4."
 ---
     device=0
 )
+fr_text = "quels montants ont été attribués et quelles sommes ont été effectivement utilisées dans chaque État membre? 4."
 pipeline([fr_text], max_length=512)
 ```
 ## Training procedure
 The model was trained on a single TPU Pod V3-8 for 250K steps in total, using sequence length 512 (batch size 4096). It has a total of approximately 220M parameters and was trained using the encoder-decoder architecture. The optimizer used is AdaFactor with inverse square root learning rate schedule for pre-training.
 ### Preprocessing
+An unigram model trained with 88M lines of text from the parallel corpus (of all possible language pairs) to get the vocabulary (with byte pair encoding), which is used with this model.
 ### Pretraining