t5-base-dutch-demo 📰
Created by Yeb Havinga & Dat Nguyen during the Hugging Face community week
This model is based on t5-base-dutch and fine-tuned to create summaries of news articles.
For a demo of the model, head over to the Hugging Face Spaces for the Netherformer 📰 example application!
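For local use, the following is a minimal sketch of how the model could be called through the Transformers `summarization` pipeline. The Hub ID `flax-community/t5-base-dutch-demo`, the availability of PyTorch weights, and the generation limits are assumptions for illustration, not settings documented in this card.

```python
# Assumption: the checkpoint is published on the Hub under this ID; adjust if it differs.
MODEL_ID = "flax-community/t5-base-dutch-demo"


def summarize(article: str, max_length: int = 96, min_length: int = 16) -> str:
    """Return a Dutch summary of `article` produced by the fine-tuned checkpoint."""
    # Imported lazily so that defining this function neither requires
    # transformers to be installed nor triggers a weight download.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    result = summarizer(
        article,
        max_length=max_length,  # illustrative limits, not the authors' settings
        min_length=min_length,
        truncation=True,  # cut inputs that exceed the model's maximum length
    )
    return result[0]["summary_text"]
```

Calling `summarize(article_text)` with a Dutch news article downloads the weights on first use and returns the generated summary as a string.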
Dataset
t5-base-dutch-demo is fine-tuned on three mixed news sources:
- CNN DailyMail translated to Dutch with MarianMT.
- XSUM translated to Dutch with MarianMT.
- News article summaries distilled from the nu.nl website.
The total number of training examples in this dataset is 1366592.
Training
Training consisted of fine-tuning t5-base-dutch with the following parameters:
- Constant learning rate 0.0005
- Batch size 8
- 1 epoch (170842 steps)
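As a rough sanity check, the step count can be related to the dataset size and batch size. The sketch below assumes one optimizer step per batch of 8, with no gradient accumulation and no multi-device scaling, neither of which this card states. Under those assumptions it yields 170824 steps, close to but not identical with the 170842 reported above; the card does not explain the difference.

```python
import math

num_examples = 1_366_592  # training examples reported in the Dataset section
batch_size = 8            # batch size reported above
epochs = 1

# One optimizer step consumes one batch (assumed: no gradient accumulation,
# single device).
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * epochs

print(total_steps)  # 170824
```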
Evaluation
The performance of the summarization model is measured with the ROUGE metric from the Hugging Face Datasets library. The metric names are `"rouge{n}"` (e.g. `"rouge1"`, `"rouge2"`), where {n} is the n-gram length used for overlap scoring, and `"rougeL"`, which scores the longest common subsequence. The model obtains the following scores:
- Rouge1: 23.8
- Rouge2: 6.9
- RougeL: 19.7
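To make the two score types concrete, here is a simplified pure-Python sketch of ROUGE-N and ROUGE-L F1 scores. It is an illustration only, not the implementation used by the Datasets library: it tokenizes on whitespace, applies no stemming, and returns values in the range 0 to 1, whereas the scores above appear to be on a 0 to 100 scale. The function names are hypothetical.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(overlap, pred_total, ref_total):
    """Harmonic mean of precision and recall, or 0.0 when there is no overlap."""
    if overlap == 0 or pred_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(prediction, reference, n=1):
    """F1 over n-grams shared by prediction and reference (clipped counts)."""
    pred = _ngrams(prediction.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    overlap = sum((pred & ref).values())
    return _f1(overlap, sum(pred.values()), sum(ref.values()))


def _lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(prediction, reference):
    """F1 based on the longest common subsequence of tokens."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    return _f1(_lcs_length(pred, ref), len(pred), len(ref))
```

For the toy pair `"de kat zit op de mat"` and `"de kat ligt op de mat"`, five of six unigrams and three of five bigrams match, so `rouge_n(..., n=1)` is about 0.83 and `rouge_n(..., n=2)` is 0.6; the longest common subsequence has five tokens, so `rouge_l` is also about 0.83.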
These scores are expected to improve if the model is trained with evaluation configured for the CNN DM and XSUM datasets (translated to Dutch) individually.