asr-large-ckb / README.md

PawanOsman

	---
	language:
	- ckb
	tags:
	- generated_from_trainer
	datasets:
	- PawanKrd/asr-ckb
	metrics:
	- wer
	model-index:
	- name: ASR CKB
	  results:
	  - task:
	      name: Automatic Speech Recognition
	      type: automatic-speech-recognition
	    dataset:
	      name: PawanKrd/asr-ckb
	      type: PawanKrd/asr-ckb
	    metrics:
	    - name: Wer
	      type: wer
	      value: 4.1303699778079555
	---

	# Automatic Speech Recognition - CKB

	This model is trained on the [PawanKrd/asr-ckb](https://huggingface.co/datasets/PawanKrd/asr-ckb) dataset and is intended specifically for the Central Kurdish (Sorani) language.

	## Model Performance

	On the evaluation set, the final checkpoint (step 15000) achieves:
	- Loss: 0.0048
	- Word Error Rate (WER): 4.1304
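
For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The block below is a minimal sketch of that standard definition, not the evaluation code used for this model. The function name `word_error_rate` and the sample strings are illustrative, and the result is scaled to a percentage on the assumption that the figures reported here are percentages.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed so far
    # and the first j hypothesis words (Levenshtein distance, one row at a time).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# Illustrative strings only; they are not taken from the dataset.
reference = "the quick brown fox"
hypothesis = "the quick brown dog jumps"

# One substitution plus one insertion over four reference words.
print(word_error_rate(reference, hypothesis))  # 50.0
```

Because insertions count as errors, WER can exceed 100 when the hypothesis is much longer than the reference.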

	## Model Description

	This Automatic Speech Recognition (ASR) model transcribes spoken Central Kurdish (Sorani) into written text. It is built with the Transformers library and trained on a diverse set of Central Kurdish audio recordings.

	## Intended Uses & Limitations

	This model is intended for automatic transcription of Central Kurdish audio and performs best on clear, high-quality recordings.

	### Intended Uses
	- Transcribing interviews and speeches in Central Kurdish.
	- Creating subtitles for Kurdish videos.
	- Assisting in the documentation and preservation of the Kurdish language.

	### Limitations
	- Performance may be suboptimal on audio with heavy background noise.
	- Strong regional accents or non-standard pronunciations can impact accuracy.
	- Not suitable for real-time transcription without further optimization.

	## Training and Evaluation Data

	The model was trained and evaluated on the [PawanKrd/asr-ckb](https://huggingface.co/datasets/PawanKrd/asr-ckb) dataset, which consists of diverse audio samples in Central Kurdish.

	## Training Procedure

	### Hyperparameters

	- Learning Rate: 1e-05
	- Train Batch Size: 32
	- Eval Batch Size: 16
	- Seed: 42
	- Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
	- Learning Rate Scheduler: Linear
	- Warmup Steps: 500
	- Epochs: 3
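
With a linear scheduler and 500 warmup steps, the learning rate ramps from 0 to 1e-05 over the first 500 optimizer steps and then decays linearly to 0 at the end of training. The sketch below reproduces that shape using the same formula as `get_linear_schedule_with_warmup` in Transformers. The total step count of 15,570 is an assumption estimated from the training log (step 15000 corresponds to epoch 2.8902 of 3); it is not stated in this card, and the function name `linear_lr` is illustrative.

```python
BASE_LR = 1e-05       # from the hyperparameter list
WARMUP_STEPS = 500    # from the hyperparameter list
TOTAL_STEPS = 15_570  # ASSUMPTION: estimated from the training log, not stated in the card


def linear_lr(
    step: int,
    base_lr: float = BASE_LR,
    warmup_steps: int = WARMUP_STEPS,
    total_steps: int = TOTAL_STEPS,
) -> float:
    """Learning rate at a given optimizer step under linear warmup then linear decay."""
    if step < warmup_steps:
        # Warmup: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 250, 500, 8_000, 15_570):
    print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```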

	### Training Results

	| Training Loss | Epoch  | Step  | Validation Loss | WER   |
	|:-------------:|:------:|:-----:|:---------------:|:-----:|
	| 0.0966        | 0.1927 | 1000  | 0.1457          | 29.30 |
	| 0.0952        | 0.3854 | 2000  | 0.0988          | 22.26 |
	| 0.0582        | 0.5780 | 3000  | 0.0741          | 17.51 |
	| 0.0523        | 0.7707 | 4000  | 0.0532          | 15.14 |
	| 0.0164        | 0.9634 | 5000  | 0.0412          | 14.19 |
	| 0.0271        | 1.1561 | 6000  | 0.0519          | 15.68 |
	| 0.0358        | 1.3487 | 7000  | 0.0407          | 11.18 |
	| 0.0208        | 1.5414 | 8000  | 0.0327          | 9.94  |
	| 0.031         | 1.7341 | 9000  | 0.0268          | 10.86 |
	| 0.033         | 1.9268 | 10000 | 0.0191          | 7.70  |
	| 0.0269        | 2.1195 | 11000 | 0.0138          | 6.48  |
	| 0.025         | 2.3121 | 12000 | 0.0111          | 6.83  |
	| 0.003         | 2.5048 | 13000 | 0.0086          | 5.78  |
	| 0.0021        | 2.6975 | 14000 | 0.0065          | 4.66  |
	| 0.0031        | 2.8902 | 15000 | 0.0048          | 4.13  |
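
The log is not strictly monotonic: WER rises at a few checkpoints before continuing to fall. The short sketch below reads the step, epoch, and WER columns from the table to list those checkpoints, pick the best one, and estimate the number of optimizer steps per epoch. The variable names are illustrative, and the steps-per-epoch figure assumes the epoch column is simply the step divided by a constant number of steps per epoch.

```python
# (step, epoch, validation WER) copied from the training results table
LOG = [
    (1000, 0.1927, 29.30), (2000, 0.3854, 22.26), (3000, 0.5780, 17.51),
    (4000, 0.7707, 15.14), (5000, 0.9634, 14.19), (6000, 1.1561, 15.68),
    (7000, 1.3487, 11.18), (8000, 1.5414, 9.94), (9000, 1.7341, 10.86),
    (10000, 1.9268, 7.70), (11000, 2.1195, 6.48), (12000, 2.3121, 6.83),
    (13000, 2.5048, 5.78), (14000, 2.6975, 4.66), (15000, 2.8902, 4.13),
]

# Checkpoints where WER got worse than at the previous evaluation
regressions = [curr[0] for prev, curr in zip(LOG, LOG[1:]) if curr[2] > prev[2]]

# Checkpoint with the lowest WER
best_step, _, best_wer = min(LOG, key=lambda row: row[2])

# ASSUMPTION: epoch = step / steps_per_epoch with a constant steps_per_epoch
steps_per_epoch = round(LOG[-1][0] / LOG[-1][1])

print("WER regressions at steps:", regressions)  # [6000, 9000, 12000]
print("best checkpoint:", best_step, best_wer)   # 15000 4.13
print("estimated steps per epoch:", steps_per_epoch)  # 5190
```

The last logged checkpoint is also the best one, which is consistent with the summary figures reported under Model Performance.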

	### Framework Versions

	- Transformers: 4.41.0.dev0
	- PyTorch: 2.3.0+cu121
	- Datasets: 2.19.1
	- Tokenizers: 0.19.1

	## Example Usage

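	To transcribe an audio file with this model, load it through the Transformers `pipeline` API:

	```python
	from transformers import pipeline

	# Load the fine-tuned model from the Hugging Face Hub.
	# The task is named explicitly instead of being inferred from the model metadata.
	asr_pipeline = pipeline(
	    task="automatic-speech-recognition",
	    model="PawanKrd/asr-large-ckb",
	)

	# Transcribe a local audio file (replace with your own path)
	audio_file = "audio.wav"
	transcription = asr_pipeline(audio_file)

	# The pipeline returns a dict; the transcript is under the "text" key
	print(transcription["text"])
	```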

	This code demonstrates how to load the model and use it to transcribe an audio file in Central Kurdish.