arxiv:2406.18120

ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs

Published on Jun 26 · Submitted by ahmedheakl on Jun 28

Abstract

Motivated by the widespread increase in code-switching between Egyptian Arabic and English in recent times, this paper explores the intricacies of machine translation (MT) and automatic speech recognition (ASR) systems, focusing on translating code-switched Egyptian Arabic-English to either English or Egyptian Arabic. Our goal is to present the methodologies employed in developing these systems, utilizing large language models such as LLaMA and Gemma. In the field of ASR, we explore the use of the Whisper model for code-switched Egyptian Arabic recognition, detailing our experimental procedures, including data preprocessing and training techniques. Through the implementation of a consecutive speech-to-text translation system that integrates ASR with MT, we aim to overcome challenges posed by limited resources and the unique characteristics of the Egyptian Arabic dialect. Evaluation against established metrics shows promising results, with our methodologies yielding a significant improvement over the state of the art of 56% in English translation and 9.3% in Arabic translation. Since code-switching is deeply inherent in spoken language, it is crucial that ASR systems handle this phenomenon effectively; doing so enables seamless interaction in various domains, including business negotiations, cultural exchanges, and academic discourse. Our models and code are available as open-source resources. Code: http://github.com/ahmedheakl/arazn-llm, Models: http://huggingface.co/collections/ahmedheakl/arazn-llm-662ceaf12777656607b9524e.
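The consecutive speech-to-text translation system described above cascades Whisper-based ASR with LLM-based MT. Below is a minimal sketch of that cascade using the Hugging Face transformers pipelines; the model IDs and the prompt are placeholder assumptions, not the authors' fine-tuned checkpoints (those are in the linked collection).

```python
# Minimal sketch of a consecutive (cascaded) speech-to-text translation system:
# (1) transcribe code-switched Egyptian Arabic-English audio with Whisper,
# (2) translate the transcript to English with a causal LLM.
# Model IDs and prompt are placeholders, not the paper's fine-tuned checkpoints.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
translator = pipeline("text-generation", model="google/gemma-2b-it")

def speech_to_english(wav_path: str) -> str:
    """Audio file -> code-switched transcript -> English translation."""
    transcript = asr(wav_path)["text"]
    prompt = (
        "Translate the following code-switched Egyptian Arabic-English "
        f"sentence to English:\n{transcript}\nTranslation:"
    )
    out = translator(prompt, max_new_tokens=128, do_sample=False)
    # The generated text includes the prompt; keep only the continuation.
    return out[0]["generated_text"][len(prompt):].strip()

# Usage:
# print(speech_to_english("recording.wav"))
```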

Community

Paper author · Paper submitter

This paper addresses the growing phenomenon of code-switching between Egyptian Arabic and English by developing machine translation (MT) and automatic speech recognition (ASR) systems that translate code-switched speech into either English or Egyptian Arabic. Utilizing large language models like LLaMA and Gemma for MT, and the Whisper model for ASR, we detail our experimental procedures, including data preprocessing and training techniques. Our integrated speech-to-text translation system, designed to cope with limited resources and the unique traits of Egyptian Arabic, shows significant improvements over state-of-the-art benchmarks: 56% in English translation and 9.3% in Arabic translation. Handling code-switching effectively is critical for ASR in domains such as business negotiations, cultural exchanges, and academic discourse. Our models and code are available as open-source resources.
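As a rough illustration of how translation output is typically scored against references (the paper reports its gains on established metrics; see the paper for the exact evaluation setup), here is a small sketch using sacrebleu with invented placeholder sentences:

```python
# Hedged sketch: scoring translation hypotheses against references with BLEU
# via sacrebleu. The sentences below are invented placeholders, not data or
# results from the paper.
import sacrebleu

hypotheses = [
    "I have a meeting tomorrow at the office.",
    "Send me the report before noon, please.",
]
references = [[  # one inner list per reference set, aligned with hypotheses
    "I have a meeting at the office tomorrow.",
    "Please send me the report before noon.",
]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}")
```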

Hi @ahmedheakl, congrats on this work!!

Would it be possible to link the models and collection to this paper page? See here for more info: https://huggingface.co/docs/hub/en/model-cards#linking-a-paper
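For reference, linking usually amounts to mentioning the arXiv URL in each model's README so the Hub can associate it with this paper page. A hedged sketch with huggingface_hub, where the repo ID is a placeholder:

```python
# Sketch (not an official recipe): the Hub links a model to a paper page when
# the model card mentions the arXiv URL. The repo ID below is a placeholder.
from huggingface_hub import ModelCard

repo_id = "ahmedheakl/some-arazn-model"  # placeholder
paper_url = "https://arxiv.org/abs/2406.18120"

card = ModelCard.load(repo_id)
if paper_url not in card.text:
    card.text += f"\n\nPaper: {paper_url}\n"
    card.push_to_hub(repo_id)  # requires write access to the repo
```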

Paper author

Done, thank you!


Models citing this paper: 13


Datasets citing this paper: 2

Spaces citing this paper: 0


Collections including this paper: 1