mt5-small-german-query-generation
Model description:
This model generates possible search queries for a German input article.
We fine-tuned the multilingual T5 model mt5-small on the mMARCO dataset, the machine-translated version of the MS MARCO dataset.
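As a rough illustration of how such a model could be called, the sketch below loads a checkpoint with the Hugging Face `transformers` library and samples a few queries for one German paragraph. It is not taken from the original card. The `MODEL_ID` value is an assumption (the card gives only the model name, so the Hub path may need an organization prefix), the helper name `generate_queries` and the sampling settings (`top_p`, `num_queries`) are hypothetical, and the length limits simply mirror the training limits listed in this card. It assumes `transformers`, `torch`, and `sentencepiece` are installed.

```python
# Assumption: the card only gives the model name; the actual Hub id may
# need an organization prefix in front of it.
MODEL_ID = "mt5-small-german-query-generation"


def generate_queries(article, model_id=MODEL_ID, num_queries=3,
                     max_input_length=512, max_query_length=64):
    """Sample candidate queries for one German article (hypothetical helper)."""
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate the article to the input length used during training.
    inputs = tokenizer(
        article,
        max_length=max_input_length,
        truncation=True,
        return_tensors="pt",
    )

    # Nucleus sampling is an illustrative choice to obtain several
    # different queries; the card does not state a decoding strategy.
    outputs = model.generate(
        **inputs,
        max_length=max_query_length,
        do_sample=True,
        top_p=0.95,
        num_return_sequences=num_queries,
    )
    return [tokenizer.decode(ids, skip_special_tokens=True) for ids in outputs]
```

Calling `generate_queries(text)` with a German paragraph returns a list of `num_queries` strings. Because the queries are sampled, they differ between runs.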
The model was trained for 1 epoch on 200,000 unique queries from the dataset. Training ran on a single K80 GPU for 25,000 iterations with the following parameters:
- learning rate: 1e-3
- train batch size: 8
- max input sequence length: 512
- max target sequence length: 64
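The iteration count follows from the other numbers: 200,000 queries at a batch size of 8 give 25,000 steps for one epoch. The sketch below collects the listed settings in a small config object and checks that arithmetic. The `TrainingConfig` class and its field names are hypothetical, and the calculation assumes one optimizer step per batch with no gradient accumulation, which the card does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters from the card (class and field names are illustrative)."""
    learning_rate: float = 1e-3
    train_batch_size: int = 8
    max_input_length: int = 512
    max_target_length: int = 64
    num_epochs: int = 1
    num_training_queries: int = 200_000

    @property
    def total_iterations(self) -> int:
        # Ceiling division: a final partial batch still counts as one step.
        # Assumes one optimizer step per batch (no gradient accumulation).
        steps_per_epoch = -(-self.num_training_queries // self.train_batch_size)
        return steps_per_epoch * self.num_epochs


config = TrainingConfig()
print(config.total_iterations)  # 25000
```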
Model Performance:
The model was evaluated on 2,000 evaluation paragraphs from the dataset. The table reports mean ROUGE F1 scores.
| ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|
| 0.162 | 0.052 | 0.161 |
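To make the metric concrete, here is a minimal pure-Python sketch of how mean ROUGE-1, ROUGE-2, and ROUGE-L F1 scores can be computed over pairs of generated and reference queries. It is an illustration only: the card does not say which ROUGE implementation or tokenizer was used, so the lowercased whitespace tokenization and all function names here are assumptions, and the numbers it yields will not necessarily match the table.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(overlap, pred_total, ref_total):
    """F1 from an overlap count and the two totals; 0.0 when nothing matches."""
    if overlap == 0 or pred_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(prediction, reference, n):
    """ROUGE-N F1 with clipped n-gram counts (whitespace tokens, an assumption)."""
    pred = _ngrams(prediction.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    overlap = sum((pred & ref).values())  # Counter intersection clips counts
    return _f1(overlap, sum(pred.values()), sum(ref.values()))


def _lcs_length(a, b):
    """Length of the longest common subsequence, row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(prediction, reference):
    """Sentence-level ROUGE-L F1 based on the longest common subsequence."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    return _f1(_lcs_length(pred, ref), len(pred), len(ref))


def mean_rouge(pairs):
    """Average the three F1 scores over (prediction, reference) pairs."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    totals = {"rouge-1": 0.0, "rouge-2": 0.0, "rouge-l": 0.0}
    for prediction, reference in pairs:
        totals["rouge-1"] += rouge_n(prediction, reference, 1)
        totals["rouge-2"] += rouge_n(prediction, reference, 2)
        totals["rouge-l"] += rouge_l(prediction, reference)
    return {name: total / len(pairs) for name, total in totals.items()}
```

For example, `mean_rouge([("wie hoch ist die zugspitze", "wie hoch ist der berg zugspitze")])` compares one generated query against one reference; four of the five predicted tokens appear in the six-token reference, so ROUGE-1 F1 is 8/11.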