👳 Arabic-Whisper-CodeSwitching-Edition

This model is a fine-tuned version of OpenAI's Whisper Large v2, trained on an Arabic-English code-switching dataset.

Model Details

Model Description

Arabic-Whisper-CodeSwitching-Edition is designed to transcribe Arabic audio with embedded English words. It builds on the original Whisper Large v2 and targets better performance on Arabic-English code-switching speech.

  • Developed by: العبد لله
  • Model type: Speech Recognition
  • Language(s) (NLP): Arabic, English (in the context of Arabic audio)
  • License: GPL-3.0

👷 Uses

Direct Use

The model can be used directly for transcribing Arabic speech that includes English words. It is particularly useful in multilingual environments where code-switching is common.
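
One quick way to check that a transcript actually contains code-switched content is to tag each token by script. The sketch below is a hypothetical helper, not part of this model or of the transformers library: the names `token_script`, `tag_tokens`, and `is_code_switched` are invented for illustration. It assumes English words are transcribed in Latin letters and Arabic words in Arabic script, which may not hold if the model transliterates.

```python
import unicodedata
from typing import List, Tuple


def token_script(token: str) -> str:
    """Return 'arabic', 'latin', or 'other' based on the letters in the token."""
    arabic = 0
    latin = 0
    for ch in token:
        if not ch.isalpha():
            continue  # skip digits, punctuation, and diacritics
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC"):
            arabic += 1
        elif name.startswith("LATIN"):
            latin += 1
    if arabic == 0 and latin == 0:
        return "other"
    return "arabic" if arabic >= latin else "latin"


def tag_tokens(transcript: str) -> List[Tuple[str, str]]:
    """Pair every whitespace-separated token with its detected script."""
    return [(tok, token_script(tok)) for tok in transcript.split()]


def is_code_switched(transcript: str) -> bool:
    """True when the transcript mixes Arabic-script and Latin-script tokens."""
    scripts = {script for _, script in tag_tokens(transcript)}
    return {"arabic", "latin"} <= scripts


if __name__ == "__main__":
    # Illustrative sentence ("I have a meeting tomorrow"), not model output.
    print(tag_tokens("عندي meeting بكرة"))
```

A check like this is only a rough filter: it looks at the script of each word, not at its language, so Arabic words written in Latin letters would be counted as English.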

Out-of-Scope Use

The model is not intended for monolingual speech in languages other than Arabic or English, or for code-switching between other language pairs, and it may perform poorly on such input.

😨 Bias, Risks, and Limitations

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

import librosa  # extra dependency, used here only to load and resample the audio
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model_id = "MohamedRashad/Arabic-Whisper-CodeSwitching-Edition"
processor = WhisperProcessor.from_pretrained(model_id)
model = WhisperForConditionalGeneration.from_pretrained(model_id)

# Example usage
# The processor expects a 16 kHz mono waveform (an array), not a file path.
audio, sampling_rate = librosa.load("path_to_audio_file.wav", sr=16000)
inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
generated_ids = model.generate(inputs["input_features"])
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(transcription)
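
Whisper models work on 30-second windows, and by default the feature extractor truncates longer input, so the snippet above suits short clips only. For longer recordings, one option is to split the waveform into windows first. The following is a minimal sketch, assuming a mono waveform already loaded at 16 kHz; `split_waveform` is a hypothetical helper name, not a transformers function.

```python
from typing import List, Sequence

WHISPER_WINDOW_SECONDS = 30  # Whisper operates on 30-second audio windows


def split_waveform(
    samples: Sequence[float],
    sampling_rate: int = 16000,
    chunk_seconds: int = WHISPER_WINDOW_SECONDS,
) -> List[Sequence[float]]:
    """Split a mono waveform into consecutive chunks of at most chunk_seconds."""
    if sampling_rate <= 0 or chunk_seconds <= 0:
        raise ValueError("sampling_rate and chunk_seconds must be positive")
    chunk_size = sampling_rate * chunk_seconds
    # Slicing works for plain lists and NumPy arrays alike.
    return [
        samples[start:start + chunk_size]
        for start in range(0, len(samples), chunk_size)
    ]


if __name__ == "__main__":
    # 70 seconds of silence at 16 kHz, split into 30-second windows.
    chunks = split_waveform([0.0] * (70 * 16000))
    print([len(chunk) / 16000 for chunk in chunks])
```

Each chunk can then go through `processor(...)` and `model.generate(...)` as in the snippet above, with the partial transcriptions joined afterwards. Fixed cuts like these can split a word at a boundary; the transformers automatic-speech-recognition pipeline accepts a `chunk_length_s` argument that chunks with overlap and may be a better fit.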

πŸ‘¨β€πŸŽ“ Citation

BibTeX:

@misc{rashad2024arabicwhisper,
  title={Arabic-Whisper-CodeSwitching-Edition},
  author={Mohamed Rashad},
  year={2024},
  url={https://huggingface.co/spaces/MohamedRashad/Arabic-Whisper-CodeSwitching-Edition},
}

APA:

Rashad, M. (2024). Arabic-Whisper-CodeSwitching-Edition. Retrieved from https://huggingface.co/spaces/MohamedRashad/Arabic-Whisper-CodeSwitching-Edition

Model size: 1.54B params (Safetensors), tensor type BF16