
Model Card of germanInstructionBERTcased for Bertology

A minimalistic German instruction model built on a pretrained encoder that has already been analyzed extensively, dbmdz/bert-base-german-cased. This makes it possible to study Bertology with instruction-tuned models, inspect the attention, and investigate what happens to the BERT embeddings during fine-tuning.
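As a starting point for the attention analysis mentioned above, the following is a minimal sketch of how the encoder's attention weights could be read out with the Transformers library. It assumes that the checkpoint loads as an `EncoderDecoderModel` and that the tokenizer files are stored in the same Hub repository; the helper name `encoder_attentions` and the example sentence are hypothetical.

```python
MODEL_ID = "Bachstelze/germanInstructionBERTcased"  # Hub repository id of this model


def encoder_attentions(text, model_id=MODEL_ID):
    """Return the input tokens and the per-layer encoder attention for `text`."""
    # Imported lazily so that defining this helper does not require the libraries.
    import torch
    from transformers import AutoTokenizer, EncoderDecoderModel

    # Assumption: the tokenizer is shipped alongside the model weights.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = EncoderDecoderModel.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Only the encoder is run; ids and mask are passed explicitly.
        outputs = model.encoder(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            output_attentions=True,
        )

    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    # outputs.attentions holds one tensor per layer,
    # each of shape (batch, heads, seq_len, seq_len).
    return tokens, outputs.attentions


# Example call (downloads the checkpoint on first use):
# tokens, attentions = encoder_attentions("Nenne drei Farben.")
```

Running the same helper on dbmdz/bert-base-german-cased loaded as a plain BERT model would give a baseline to compare against, which is one way to see how fine-tuning changed the attention patterns.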

The training code is released in the instructionBERT repository. We used the Hugging Face API for warm-starting encoder-decoder models with BertGeneration for this purpose.
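Warm-starting can be illustrated with a short sketch. This is not the released training code (see the instructionBERT repository for that); it uses the generic `EncoderDecoderModel.from_encoder_decoder_pretrained` call from Transformers rather than the BertGeneration classes, and the special-token mapping below is an assumption based on the common BERT-to-BERT recipe. The function name `build_warm_started_model` is hypothetical.

```python
ENCODER_CHECKPOINT = "dbmdz/bert-base-german-cased"  # base model named in this card


def build_warm_started_model(checkpoint=ENCODER_CHECKPOINT):
    """Create an encoder-decoder model whose two halves start from BERT weights."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import AutoTokenizer, EncoderDecoderModel

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)

    # Encoder and decoder are both initialised from the same checkpoint.
    # The decoder's cross-attention layers have no pretrained counterpart
    # and are therefore initialised randomly.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(checkpoint, checkpoint)

    # Assumption: BERT has no dedicated BOS/EOS tokens, so [CLS] and [SEP]
    # are reused to start and end generation.
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.eos_token_id = tokenizer.sep_token_id
    model.config.pad_token_id = tokenizer.pad_token_id
    return tokenizer, model


# Example call (downloads the base checkpoint on first use):
# tokenizer, model = build_warm_started_model()
```

After this step the model still has to be fine-tuned on instruction data before it produces useful output, since the cross-attention weights are untrained.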

Training parameters

  • base model: "dbmdz/bert-base-german-cased"
  • trained for 3 epochs
  • batch size of 16
  • 40000 warm-up steps
  • learning rate of 0.0001
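
To make the interaction between the warm-up steps and the learning rate concrete, here is a small sketch in plain Python. The card does not state the shape of the schedule, so a linear warm-up followed by a constant rate is assumed here; the function names and the dataset size in the example are hypothetical.

```python
import math

PEAK_LR = 1e-4         # learning rate from the list above
WARMUP_STEPS = 40_000  # warm-up steps from the list above
BATCH_SIZE = 16        # batch size from the list above
EPOCHS = 3             # number of epochs from the list above


def warmup_lr(step, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step`: linear ramp to `peak_lr`, then constant (assumed)."""
    if step < 0:
        raise ValueError("step must be non-negative")
    if step >= warmup_steps:
        return peak_lr
    return peak_lr * step / warmup_steps


def total_steps(num_examples, batch_size=BATCH_SIZE, epochs=EPOCHS):
    """Number of optimizer steps when the last partial batch is kept."""
    return epochs * math.ceil(num_examples / batch_size)


# Halfway through warm-up the rate is half of the peak value.
print(warmup_lr(20_000))

# With a hypothetical dataset of 1,000,000 examples, compare the total
# number of steps with the 40,000 warm-up steps.
print(total_steps(1_000_000))
```

Comparing `total_steps` for the actual dataset size with `WARMUP_STEPS` shows what fraction of training is spent below the peak learning rate.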

Purpose of germanInstructionBERTcased

germanInstructionBERTcased is intended for research purposes. The model-generated text should be treated as a starting point rather than a definitive solution for potential use cases. Users should be cautious when employing this model in their applications.

Model size: 138M params (Safetensors, tensor type F32)