---
language:
- ckb
tags:
- generated_from_trainer
datasets:
- PawanKrd/asr-ckb
metrics:
- wer
model-index:
- name: ASR CKB
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: PawanKrd/asr-ckb
      type: PawanKrd/asr-ckb
    metrics:
    - name: Wer
      type: wer
      value: 4.1303699778079555
---

# Automatic Speech Recognition - CKB

This model was trained on the [PawanKrd/asr-ckb](https://huggingface.co/datasets/PawanKrd/asr-ckb) dataset and is intended specifically for the Central Kurdish (Sorani) language.

## Model Performance

The model achieves the following results on the evaluation set:

- **Loss**: 0.0048
- **Word Error Rate (WER)**: 4.1304%
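
For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model output, divided by the number of reference words. The sketch below is a plain-Python illustration of that definition, not the evaluation code used for this model; the function name `word_error_rate` and the whitespace tokenization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, using whitespace-separated words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("a b c d", "a b x d"))  # 25.0
```

Published scores may also depend on text normalization (punctuation, casing, digit handling) applied before comparison, which this sketch leaves out.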

## Model Description

This Automatic Speech Recognition (ASR) model transcribes spoken Central Kurdish (Sorani) into written text. It uses a deep learning architecture designed for speech-to-text tasks, is built with the Transformers library, and was trained on a diverse set of Central Kurdish audio recordings.

## Intended Uses & Limitations

This model is intended for automatic transcription of Central Kurdish audio. It performs best on clear, high-quality recordings; accuracy may degrade with background noise, strong accents, or atypical pronunciations.

### Intended Uses

- Transcribing interviews and speeches in Central Kurdish.
- Creating subtitles for Kurdish videos.
- Assisting in the documentation and preservation of the Kurdish language.
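
For the subtitle use case, one possible approach is to convert timestamped segments into SubRip (SRT) text. The sketch below assumes segments shaped like the `chunks` list that the Transformers ASR pipeline returns when called with `return_timestamps=True` (a list of dicts with a `timestamp` pair in seconds and a `text` string). Whether this particular model produces usable timestamps has not been verified here, and the helper names `format_srt_time` and `chunks_to_srt` are hypothetical.

```python
def format_srt_time(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def chunks_to_srt(chunks: list[dict]) -> str:
    """Build SRT text from [{"timestamp": (start, end), "text": str}, ...]."""
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:
            # Assumption: an open-ended final segment gets a 2-second duration.
            end = start + 2.0
        blocks.append(
            f"{index}\n"
            f"{format_srt_time(start)} --> {format_srt_time(end)}\n"
            f"{chunk['text'].strip()}\n"
        )
    return "\n".join(blocks)


# Hand-written segments standing in for pipeline output.
example_chunks = [
    {"timestamp": (0.0, 2.5), "text": " first segment"},
    {"timestamp": (2.5, 5.0), "text": " second segment"},
]
print(chunks_to_srt(example_chunks))
```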

### Limitations

- Performance may be suboptimal on audio with heavy background noise.
- Strong regional accents or non-standard pronunciations can impact accuracy.
- Not suitable for real-time transcription without further optimization.
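
Whether a given deployment is fast enough for live use can be checked by measuring the real-time factor (RTF): processing time divided by audio duration, where values below 1 mean transcription runs faster than the audio plays. The sketch below is one way to measure it; `real_time_factor` is a hypothetical helper, and `transcribe` stands for any callable that accepts an audio path, such as an ASR pipeline object.

```python
import time
from typing import Callable


def real_time_factor(
    transcribe: Callable[[str], object],
    audio_path: str,
    audio_duration_s: float,
) -> float:
    """Return processing time divided by audio duration for one file."""
    if audio_duration_s <= 0:
        raise ValueError("audio_duration_s must be positive")
    start = time.perf_counter()
    transcribe(audio_path)
    elapsed = time.perf_counter() - start
    return elapsed / audio_duration_s


# Stand-in transcriber so the sketch runs without downloading a model;
# in practice, pass the ASR pipeline object instead.
def fake_transcribe(path: str) -> dict:
    time.sleep(0.05)
    return {"text": ""}


rtf = real_time_factor(fake_transcribe, "audio.wav", audio_duration_s=10.0)
print(f"RTF: {rtf:.3f}")
```

Measurements like this depend heavily on hardware and audio length, so they are best taken on the target machine with representative recordings.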

## Training and Evaluation Data

The model was trained and evaluated on the [PawanKrd/asr-ckb](https://huggingface.co/datasets/PawanKrd/asr-ckb) dataset, which consists of diverse audio samples in Central Kurdish. Training was aimed at maximizing recognition accuracy for this language.

## Training Procedure

### Hyperparameters

- **Learning Rate**: 1e-05
- **Train Batch Size**: 32
- **Eval Batch Size**: 16
- **Seed**: 42
- **Optimizer**: Adam (betas=(0.9, 0.999), epsilon=1e-08)
- **Learning Rate Scheduler**: Linear
- **Warmup Steps**: 500
- **Epochs**: 3
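
To make the schedule concrete, the sketch below computes the learning rate that a linear scheduler with warmup would produce at a given step: a linear ramp from 0 to the peak over the warmup steps, then a linear decay to 0 at the end of training. The total of 15,570 steps is an estimate inferred from the step and epoch columns of the training results (about 5,190 steps per epoch over 3 epochs), not a value reported by the trainer, and `linear_lr` is a hypothetical helper rather than the library's own scheduler.

```python
def linear_lr(step: int, peak_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


PEAK_LR = 1e-05
WARMUP_STEPS = 500
TOTAL_STEPS = 15_570  # assumption: ~5,190 steps/epoch x 3 epochs

# Start of training, mid-warmup, peak, midpoint of the decay, end of training.
for step in (0, 250, 500, 8_035, 15_570):
    print(step, linear_lr(step, PEAK_LR, WARMUP_STEPS, TOTAL_STEPS))
```

Under these assumptions the rate reaches its peak of 1e-05 at step 500 and has fallen to roughly half of that by step 8,035.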

### Training Results

| Training Loss | Epoch  | Step  | Validation Loss | WER (%) |
|:-------------:|:------:|:-----:|:---------------:|:-------:|
| 0.0966        | 0.1927 | 1000  | 0.1457          | 29.30   |
| 0.0952        | 0.3854 | 2000  | 0.0988          | 22.26   |
| 0.0582        | 0.5780 | 3000  | 0.0741          | 17.51   |
| 0.0523        | 0.7707 | 4000  | 0.0532          | 15.14   |
| 0.0164        | 0.9634 | 5000  | 0.0412          | 14.19   |
| 0.0271        | 1.1561 | 6000  | 0.0519          | 15.68   |
| 0.0358        | 1.3487 | 7000  | 0.0407          | 11.18   |
| 0.0208        | 1.5414 | 8000  | 0.0327          | 9.94    |
| 0.031         | 1.7341 | 9000  | 0.0268          | 10.86   |
| 0.033         | 1.9268 | 10000 | 0.0191          | 7.70    |
| 0.0269        | 2.1195 | 11000 | 0.0138          | 6.48    |
| 0.025         | 2.3121 | 12000 | 0.0111          | 6.83    |
| 0.003         | 2.5048 | 13000 | 0.0086          | 5.78    |
| 0.0021        | 2.6975 | 14000 | 0.0065          | 4.66    |
| 0.0031        | 2.8902 | 15000 | 0.0048          | 4.13    |
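
The log above is not strictly monotonic, which is easy to miss when reading the table. As a small illustration, the sketch below copies the (step, WER) pairs from the table and reports the best checkpoint, the evaluations where WER rose compared with the previous one, and the overall relative reduction. The variable and function names are chosen for this example only.

```python
# (step, WER) pairs copied from the training results table.
wer_log = [
    (1000, 29.30), (2000, 22.26), (3000, 17.51), (4000, 15.14),
    (5000, 14.19), (6000, 15.68), (7000, 11.18), (8000, 9.94),
    (9000, 10.86), (10000, 7.70), (11000, 6.48), (12000, 6.83),
    (13000, 5.78), (14000, 4.66), (15000, 4.13),
]


def wer_regressions(log):
    """Return the steps whose WER is higher than at the previous evaluation."""
    return [
        step
        for (_, prev_wer), (step, wer) in zip(log, log[1:])
        if wer > prev_wer
    ]


best_step, best_wer = min(wer_log, key=lambda row: row[1])
first_wer = wer_log[0][1]
relative_reduction = 100.0 * (first_wer - best_wer) / first_wer

print("best checkpoint:", best_step, best_wer)
print("WER rose at steps:", wer_regressions(wer_log))
print(f"relative WER reduction: {relative_reduction:.1f}%")
```

With these values, WER rises at steps 6000, 9000, and 12000 before continuing to fall, and the final logged checkpoint is also the best one.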

### Framework Versions

- **Transformers**: 4.41.0.dev0
- **PyTorch**: 2.3.0+cu121
- **Datasets**: 2.19.1
- **Tokenizers**: 0.19.1

## Example Usage

The example below loads the model with the Transformers `pipeline` API and transcribes a single audio file:

```python
from transformers import pipeline

# Load the fine-tuned model, naming the ASR task explicitly
asr_pipeline = pipeline("automatic-speech-recognition", model="PawanKrd/asr-large-ckb")

# Transcribe an audio file
audio_file = "audio.wav"
transcription = asr_pipeline(audio_file)

# Print the transcription
print(transcription["text"])
```

This code loads the model and uses it to transcribe a Central Kurdish audio file.
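
For longer recordings, or for transcribing several files in one go, a slightly fuller sketch may help. It relies on the Transformers ASR pipeline's `chunk_length_s` argument, which splits long audio into windows before decoding. The 30-second value, the helper names `load_asr` and `transcribe_files`, and the file paths are illustrative assumptions rather than settings published with this model.

```python
from typing import Callable, Iterable


def load_asr(chunk_length_s: float = 30.0):
    """Create the ASR pipeline; the import is local so the helper below
    can be used and tested without loading the model."""
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model="PawanKrd/asr-large-ckb",
        chunk_length_s=chunk_length_s,  # assumption: 30 s windows
    )


def transcribe_files(asr: Callable, paths: Iterable[str]) -> dict[str, str]:
    """Map each audio path to its transcription text."""
    return {path: asr(path)["text"].strip() for path in paths}


# Typical use (downloads the model on first run):
#   asr = load_asr()
#   results = transcribe_files(asr, ["interview_1.wav", "interview_2.wav"])
#   for path, text in results.items():
#       print(path, text)
```

Keeping model loading separate from the per-file loop means the model is loaded once and reused for every file.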