MALIBA-ASR-v1: Revolutionizing Bambara Speech Recognition

MALIBA-ASR-v1 represents a breakthrough in African language technology, setting a new standard for Bambara speech recognition. Developed by MALIBA-AI, this model significantly outperforms all existing open-source solutions for Bambara ASR, bringing high-quality speech technology to Mali's most widely spoken language.

Bridging the Digital Language Divide

Despite being spoken by over 22 million people, Bambara has remained severely underrepresented in speech technology. MALIBA-ASR-v1 directly addresses this critical gap, achieving performance levels that make digital voice interfaces accessible to Bambara speakers. This work represents a crucial step toward digital language equality and demonstrates that high-quality speech technology is possible for African languages.

Performance Metrics

MALIBA-ASR-v1 achieves the following results on the oza75/bambara-asr benchmark (WER is word error rate, CER is character error rate; lower is better):

Metric Value
WER 0.22
CER 0.10
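
For readers unfamiliar with these metrics, the sketch below shows how WER and CER are conventionally defined: the edit distance between reference and hypothesis, divided by the reference length in words or characters. This card does not state which text normalization (casing, punctuation, spacing) was applied before scoring, so the helper names and the absence of normalization here are assumptions for illustration, not the evaluation script behind the numbers above.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Placeholder strings (not real benchmark data): one of four words is wrong.
print(wer("a b c d", "a x c d"))  # 0.25
```

Under this definition, a WER of 0.22 means roughly 22 word-level errors per 100 reference words.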

Exceptional Code-Switching Capabilities

One of the most significant advantages of MALIBA-ASR-v1 is its handling of code-switching, the natural mixing of Bambara with French or other languages that characterizes everyday speech in Mali. MALIBA-ASR-v1 accurately transcribes multilingual content, making it practical for real-world applications.

Transforming Access to Technology in Mali

MALIBA-ASR-v1 enables numerous applications previously unavailable to Bambara speakers:

  • Healthcare: Voice interfaces for medical information and services
  • Education: Audio-based learning tools for literacy and education
  • News & Media: Automated transcription of Bambara broadcasts and podcasts
  • Preservation: Documentation of oral histories and traditional knowledge
  • Accessibility: Voice technologies for visually impaired Bambara speakers
  • Mobile Access: Voice commands for smartphone users with limited literacy

Training Details

Dataset and Evaluation

The model was trained on the [coming soon] dataset, representing diverse speakers, dialects, and recording conditions.

Training Procedure

  • Base Model: openai/whisper-large-v2
  • Adaptation Method: LoRA (PEFT)
  • Training Duration: 6 epochs
  • Batch Size: 128 (32 per device with gradient accumulation steps of 4)
  • Learning Rate: 0.001 with linear scheduler and 50 warmup steps
  • Mixed Precision: Native AMP
  • Optimizer: AdamW (betas=(0.9, 0.999), epsilon=1e-08)
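
The batch size and learning-rate settings above can be checked with a little arithmetic. The sketch below is an illustration under stated assumptions: a single training device (32 x 4 = 128 only holds for one device), a total of 3186 optimizer steps (the final step listed under Training Results), and the usual linear warmup followed by linear decay to zero. The function names are hypothetical and this is not the training script.

```python
PER_DEVICE_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 4
PEAK_LR = 1e-3
WARMUP_STEPS = 50
TOTAL_STEPS = 3186  # assumption: last step in the training results (6 epochs x 531 steps)


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Samples contributing to each optimizer step."""
    return per_device * accum_steps * num_devices


def linear_lr(step, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS,
              total_steps=TOTAL_STEPS):
    """Learning rate at a given step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(PER_DEVICE_BATCH_SIZE, GRAD_ACCUM_STEPS))  # 128
for step in (0, 25, 50, 1593, 3186):
    print(step, linear_lr(step))
```

With these values the rate climbs to 0.001 over the first 50 steps and then falls steadily, reaching zero at the final step.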

Training Results

Training Loss   Epoch   Step   Validation Loss
0.3265          1.0     531    0.4117
0.2711          2.0     1062   0.3612
0.2230          3.0     1593   0.3397
0.1802          4.0     2124   0.3330
0.1268          5.0     2655   0.3339
0.0932          6.0     3186   0.3491
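
In the results above, training loss keeps falling while validation loss bottoms out and then rises, a pattern commonly read as the onset of overfitting. The sketch below picks the checkpoint with the lowest validation loss from those rows. The card does not say which checkpoint was released or whether selection used loss or WER, so treat this purely as a way to read the table; `best_checkpoint` is a hypothetical helper.

```python
# Rows copied from the training results: (epoch, step, training_loss, validation_loss)
TRAINING_LOG = [
    (1, 531, 0.3265, 0.4117),
    (2, 1062, 0.2711, 0.3612),
    (3, 1593, 0.2230, 0.3397),
    (4, 2124, 0.1802, 0.3330),
    (5, 2655, 0.1268, 0.3339),
    (6, 3186, 0.0932, 0.3491),
]


def best_checkpoint(log):
    """Return the row with the lowest validation loss."""
    return min(log, key=lambda row: row[3])


epoch, step, train_loss, val_loss = best_checkpoint(TRAINING_LOG)
print(f"lowest validation loss {val_loss} at epoch {epoch} (step {step})")
```

By validation loss alone, the epoch 4 checkpoint (step 2124) is the strongest candidate; the two later epochs lower training loss without improving validation loss.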

Usage Examples

    COMING SOON 
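
Until official examples are published, the following is a minimal, unofficial sketch of how a LoRA adapter for Whisper is typically loaded with the `transformers` and `peft` libraries. Several things here are assumptions rather than facts from this card: the adapter repo id (taken from the citation URL), that the processor is loaded from the base model, that audio is 16 kHz mono (Whisper's expected input), and that no special language or task prompt is needed. Because the repository is gated, you would also need to accept its conditions and authenticate with Hugging Face first.

```python
BASE_MODEL_ID = "openai/whisper-large-v2"  # base model named in this card
ADAPTER_ID = "MALIBA-AI/maliba-asr-v1"     # assumption: repo id from the citation URL


def load_model(adapter_id=ADAPTER_ID, base_model_id=BASE_MODEL_ID):
    """Load the base Whisper model and attach the LoRA adapter on top of it."""
    # Imported lazily so the module can be read without the libraries installed.
    from peft import PeftModel
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(base_model_id)
    base_model = WhisperForConditionalGeneration.from_pretrained(base_model_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return processor, model


def transcribe(audio, processor, model, sampling_rate=16000):
    """Transcribe one mono waveform (a 1-D float array sampled at 16 kHz)."""
    import torch

    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        # Keyword argument keeps the call compatible with the PEFT wrapper.
        predicted_ids = model.generate(input_features=inputs.input_features)
    return processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
```

A typical call would be `processor, model = load_model()` followed by `transcribe(waveform, processor, model)`, where `waveform` is audio you have already loaded and resampled to 16 kHz.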

The MALIBA-AI Impact

MALIBA-ASR-v1 is part of MALIBA-AI's broader mission to ensure "No Malian Language Left Behind." This initiative is actively transforming Mali's digital landscape by:

  1. Breaking Language Barriers: Providing technology in languages that Malians actually speak
  2. Enabling Local Innovation: Allowing Malian developers to build voice-based applications
  3. Preserving Cultural Heritage: Digitizing and preserving Mali's rich oral traditions
  4. Democratizing AI: Making cutting-edge technology accessible to all Malians regardless of literacy level
  5. Building Local Expertise: Training Malian AI practitioners and researchers

Future Development

MALIBA-AI is committed to continuing this work with:

  • Extension to other Malian languages (Songhoy, Pular, Tamasheq, etc.)

Join Our Mission

MALIBA-ASR-v1 embodies our commitment to open science and the advancement of African language technologies. We believe that by making cutting-edge speech recognition models freely available, we can accelerate NLP development across Africa.

Join our mission to democratize AI technology:

  • Open Science: Use and build upon our research - all code, models, and documentation are open source
  • Data Contribution: Share your Bambara speech datasets to help improve model performance
  • Research Collaboration: Integrate our models into your research projects and share your findings
  • Application Development: Build tools that serve Malian communities using our models
  • Educational Impact: Use our models in educational settings to train the next generation of African AI researchers

License

This model is released under the Apache 2.0 license to encourage research, commercial use, and innovation in African language technologies while ensuring proper attribution and patent protection.

Citation

@misc{maliba-asr-v1,
  author = {MALIBA-AI},
  title = {MALIBA-ASR-v1: Bambara Automatic Speech Recognition},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/MALIBA-AI/maliba-asr-v1}}
}

Acknowledgements

  • We thank OpenAI for the Whisper model that served as our foundation
  • We acknowledge the jeli-asr contributors [coming soon] for providing the Bambara ASR dataset
  • We appreciate the support of the Bambara-speaking community in Mali

MALIBA-AI: Empowering Mali's Future Through Community-Driven AI Innovation

"No Malian Language Left Behind"
