
Quant

This repository (cstr/whisper-large-v3-turbo-german-int8_float32) contains only an int8 quantization of primeline/whisper-large-v3-turbo-german, produced with the CTranslate2 converter, for use with CTranslate2, faster-whisper, and similar runtimes.
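A conversion like the one described above can be sketched with the CTranslate2 Python converter. This is a minimal sketch, not necessarily the exact procedure used for this repository: the `int8_float32` quantization type is inferred from the repository name, the output directory name is hypothetical, and the two copied files are assumed to exist in the source repository. Running it requires `ctranslate2`, `transformers`, and `torch`.

```python
from pathlib import Path

# Source checkpoint named in this card.
SOURCE_MODEL = "primeline/whisper-large-v3-turbo-german"
# Hypothetical local output directory (assumption, chosen for illustration).
OUTPUT_DIR = Path("whisper-large-v3-turbo-german-int8_float32")
# Files that runtimes such as faster-whisper usually expect next to the model
# (assumption: both exist in the source repository).
COPY_FILES = ["tokenizer.json", "preprocessor_config.json"]


def convert_to_ctranslate2(source=SOURCE_MODEL, output_dir=OUTPUT_DIR,
                           quantization="int8_float32"):
    """Convert a Transformers Whisper checkpoint to the CTranslate2 format."""
    # Imported lazily so the module can be loaded without ctranslate2 installed.
    from ctranslate2.converters import TransformersConverter

    converter = TransformersConverter(source, copy_files=COPY_FILES)
    # convert() writes the converted model and returns the output directory.
    return converter.convert(str(output_dir), quantization=quantization)
```

Calling `convert_to_ctranslate2()` downloads the source checkpoint and writes the quantized model to the output directory.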

Model card from primeline/whisper-large-v3-german

Summary

This model card describes a model based on Whisper Large v3 that has been fine-tuned for German speech recognition. Whisper is a powerful speech recognition model developed by OpenAI. This fine-tuned version has been specifically optimized for processing and recognizing German speech.

Applications

This model can be used in various application areas, including:

  • Transcription of spoken German language
  • Voice commands and voice control
  • Automatic subtitling for German videos
  • Voice-based search queries in German
  • Dictation functions in word processing programs

Model family

Model | Parameters
Whisper large v3 german | 1.54B
Whisper large v3 turbo german | 809M
Distil-whisper large v3 german | 756M
tiny whisper | 37.8M

Evaluations

Dataset | openai-whisper-large-v3-turbo | openai-whisper-large-v3 | primeline-whisper-large-v3-german | nyrahealth-CrisperWhisper | primeline-whisper-large-v3-turbo-german
common_voice_19_0 | 6.31 | 5.84 | 4.30 | 4.14 | 4.28
Tuda-De | 11.45 | 11.21 | 9.89 | 13.88 | 8.10
multilingual librispeech | 18.03 | 17.69 | 13.46 | 10.10 | 4.71
All | 14.16 | 13.79 | 10.51 | 8.48 | 4.75
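To make the comparison easier to read, the relative change between the OpenAI turbo baseline and the fine-tuned turbo model can be computed from the table. This is a small arithmetic sketch; it assumes the figures are error rates where lower is better, which the table itself does not state.

```python
# Scores copied from the evaluation table (assumed: lower is better).
BASELINE = "openai-whisper-large-v3-turbo"
FINETUNED = "primeline-whisper-large-v3-turbo-german"

SCORES = {
    "common_voice_19_0": {BASELINE: 6.31, FINETUNED: 4.28},
    "Tuda-De": {BASELINE: 11.45, FINETUNED: 8.10},
    "multilingual librispeech": {BASELINE: 18.03, FINETUNED: 4.71},
    "All": {BASELINE: 14.16, FINETUNED: 4.75},
}


def relative_reduction(baseline, finetuned):
    """Fraction by which the fine-tuned score is lower than the baseline."""
    return (baseline - finetuned) / baseline


REDUCTIONS = {
    name: relative_reduction(row[BASELINE], row[FINETUNED])
    for name, row in SCORES.items()
}

for name, value in REDUCTIONS.items():
    print(f"{name}: {value:.1%} lower than {BASELINE}")
```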

Training data

The training data for this model includes a large amount of spoken German from various sources. The data was carefully selected and processed to optimize recognition performance.

Training process

The model was trained with the following hyperparameters:

  • Batch size: 12288
  • Epochs: 3
  • Learning rate: 1e-6
  • Data augmentation: No
  • Optimizer: AdEMAMix

How to use

import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
from datasets import load_dataset

# Use the GPU with half precision when available, otherwise CPU with float32.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# Original (non-quantized) checkpoint in Transformers format.
model_id = "primeline/whisper-large-v3-turbo-german"
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
)
model.to(device)
processor = AutoProcessor.from_pretrained(model_id)

# Chunked long-form transcription with segment-level timestamps.
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    chunk_length_s=30,
    batch_size=16,
    return_timestamps=True,
    torch_dtype=torch_dtype,
    device=device,
)

# Demo audio only: this LibriSpeech sample is English; substitute German audio.
dataset = load_dataset("distil-whisper/librispeech_long", "clean", split="validation")
sample = dataset[0]["audio"]
result = pipe(sample)
print(result["text"])
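The snippet above loads the original checkpoint with Transformers. For the int8 CTranslate2 files in this repository, a faster-whisper sketch might look like the following. It assumes that faster-whisper can load this repository directly by its Hub ID and that the `int8_float32` compute type suits the target hardware; `format_segments` and `transcribe_german` are hypothetical helper names introduced for illustration.

```python
from types import SimpleNamespace

# Repository ID of the quantized model described in this card.
MODEL_ID = "cstr/whisper-large-v3-turbo-german-int8_float32"


def format_segments(segments):
    """Render segments as one '[start -> end] text' line per segment."""
    return "\n".join(
        f"[{seg.start:.2f} -> {seg.end:.2f}] {seg.text.strip()}"
        for seg in segments
    )


def transcribe_german(audio_path, device="cpu", compute_type="int8_float32"):
    """Transcribe a German audio file with faster-whisper."""
    # Imported lazily so the formatting helper works without faster-whisper.
    from faster_whisper import WhisperModel

    model = WhisperModel(MODEL_ID, device=device, compute_type=compute_type)
    # transcribe() returns a lazy generator of segments plus metadata.
    segments, _info = model.transcribe(audio_path, language="de", beam_size=5)
    return format_segments(segments)


# Stand-in segment objects, used only to show the output format
# without downloading the model.
demo = format_segments([
    SimpleNamespace(start=0.0, end=2.5, text=" Guten Tag "),
    SimpleNamespace(start=2.5, end=4.0, text="Hallo"),
])
print(demo)
```

With faster-whisper installed, `transcribe_german("aufnahme.wav")` would return the transcript of a local file; the file name here is a placeholder.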

About us

primeline AI

Your partner for AI infrastructure in Germany
Experience the powerful AI infrastructure that drives your ambitions in Deep Learning, Machine Learning & High-Performance Computing. Optimized for AI training and inference.

Model author: Florian Zimmermeister
