Whisper-Small-Morse

This model is fine-tuned from openai/whisper-small on 100k synthetic samples of Morse code audio, each paired with a transcription (the raw decoded text, in capitals) and a translation (an English interpretation). It repurposes the otherwise unused <|startoflm|> token as its language marker.

Ethical Considerations

This model was trained on synthetic text generated by Anthropic's Claude. It may carry over biases of that underlying model, especially in the translation task. Much of the generated text relates to the amateur radio domain.

The audio was synthesized from the ground-truth transcription text, with randomly chosen tone, noise, and speed in words per minute (WPM).
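As a rough illustration of this kind of audio generation, the sketch below keys a sine tone on and off following International Morse timing and adds Gaussian noise. It is not the author's pipeline: the value ranges, the sine-wave keying, the noise model, the 16 kHz sample rate, and the names `MORSE` and `synthesize` are all assumptions made for the example. The dot length of 1.2 / WPM seconds follows the common PARIS convention.

```python
import math
import random

# International Morse code for letters and digits.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..", "0": "-----", "1": ".----", "2": "..---",
    "3": "...--", "4": "....-", "5": ".....", "6": "-....", "7": "--...",
    "8": "---..", "9": "----.",
}

SAMPLE_RATE = 16_000  # assumed; Whisper models take 16 kHz input


def synthesize(text, rng, sample_rate=SAMPLE_RATE):
    """Return (samples, params) for `text`, keyed with randomly drawn settings."""
    # Illustrative ranges only (assumptions, not the model card's values).
    wpm = rng.uniform(10.0, 30.0)
    tone_hz = rng.uniform(400.0, 900.0)
    noise_std = rng.uniform(0.0, 0.1)
    unit = 1.2 / wpm  # dot length in seconds (PARIS convention)

    # Build (tone_on, length_in_units) segments with standard Morse spacing:
    # 1 unit between symbols, 3 between characters, 7 between words.
    segments = []
    for wi, word in enumerate(text.upper().split()):
        if wi:
            segments.append((False, 7))
        chars = [c for c in word if c in MORSE]  # skip unsupported characters
        for ci, ch in enumerate(chars):
            if ci:
                segments.append((False, 3))
            for si, sym in enumerate(MORSE[ch]):
                if si:
                    segments.append((False, 1))
                segments.append((True, 1 if sym == "." else 3))

    # Render each segment as tone or silence, then add noise to every sample.
    samples = []
    for tone_on, units in segments:
        for _ in range(round(units * unit * sample_rate)):
            t = len(samples) / sample_rate
            tone = 0.5 * math.sin(2 * math.pi * tone_hz * t) if tone_on else 0.0
            samples.append(tone + rng.gauss(0.0, noise_std))

    return samples, {"wpm": wpm, "tone_hz": tone_hz, "noise_std": noise_std}


if __name__ == "__main__":
    audio, params = synthesize("CQ CQ DE TEST", random.Random(0))
    print(params, f"{len(audio) / SAMPLE_RATE:.2f} s")
```

Passing a seeded `random.Random` makes each sample reproducible, and the returned `params` can be stored next to the text so the drawn settings stay attached to each clip.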

Training

This model was fine-tuned on an NVIDIA RTX 4070 Ti SUPER for 15 epochs, which took approximately 48 hours.

Model size: 0.2B params (Safetensors, F32 tensors)
