Model Card for mT5-base-trimmed_deplain-apa

Fine-tuned mT5 model for German sentence-level text simplification.

Model Details

Model Description

  • Model type: Encoder-decoder Transformer
  • Language(s) (NLP): German
  • Finetuned from model: google/mT5-base
  • Task: Text-Simplification
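
A minimal usage sketch, assuming the checkpoint loads with the standard transformers seq2seq classes (i.e. the trimmed tokenizer and merged LoRA weights are part of the repository); the input sentence and generation settings are illustrative:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "vera-8/mT5-base-trimmed_deplain-apa"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Simplify a German sentence (placeholder input).
text = "Die Vorschriften zur Beantragung des Dokuments sind kompliziert."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```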

Training Details

Training Data

DEplain/DEplain-APA-sent
Stodden et al. (2023), arXiv:2305.18939

Training Procedure

Parameter-efficient fine-tuning with LoRA. The vocabulary was trimmed to the 32,000 most frequent tokens for German.
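
A minimal sketch of the vocabulary-trimming step, assuming Hugging Face transformers and a placeholder German corpus for the frequency counts; rebuilding the SentencePiece tokenizer with the new id mapping is omitted:

```python
from collections import Counter

import torch
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")

# 1) Count token frequencies over a German corpus (placeholder sentences).
german_corpus = ["Das ist ein Beispielsatz.", "Noch ein Satz auf Deutsch."]
counts = Counter()
for text in german_corpus:
    counts.update(tokenizer(text)["input_ids"])

# 2) Keep all special tokens plus the 32,000 most frequent ids.
special_ids = set(tokenizer.all_special_ids)
kept_ids = sorted(special_ids | {i for i, _ in counts.most_common(32_000)})

# 3) Slice the input and output embeddings down to the kept ids
#    (mT5 does not tie them, so both matrices are trimmed).
d_model = model.config.d_model
old_in = model.get_input_embeddings().weight.data
old_out = model.get_output_embeddings().weight.data

new_in = torch.nn.Embedding(len(kept_ids), d_model)
new_in.weight.data = old_in[kept_ids].clone()
model.set_input_embeddings(new_in)

new_out = torch.nn.Linear(d_model, len(kept_ids), bias=False)
new_out.weight.data = old_out[kept_ids].clone()
model.set_output_embeddings(new_out)

model.config.vocab_size = len(kept_ids)
# The tokenizer's SentencePiece vocabulary must be rebuilt with the same
# old-id -> new-id mapping; that step is not shown here.
```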

Training Hyperparameters

  • Batch Size: 16
  • Epochs: 1
  • Learning Rate: 0.001
  • Optimizer: Adafactor
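
A sketch of matching Seq2SeqTrainingArguments from transformers; only the values listed above are taken from the training setup, the output directory and all defaults are placeholders:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-base-trimmed-deplain-apa",  # placeholder path
    per_device_train_batch_size=16,
    num_train_epochs=1,
    learning_rate=1e-3,
    optim="adafactor",  # use the Adafactor optimizer
)
```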

LoRA Hyperparameters

  • R: 32
  • Alpha: 64
  • Dropout: 0.1
  • Target modules: all linear layers
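
A sketch of an equivalent LoRA configuration, assuming the Hugging Face PEFT library; the string "all-linear" is PEFT's shorthand for targeting every linear layer:

```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    target_modules="all-linear",  # all linear layers
    task_type="SEQ_2_SEQ_LM",
)
# model = get_peft_model(model, lora_config)  # wrap the trimmed mT5 base model
```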