whisper-small-urdu-int8-ct2

An INT8-quantized version of khawajaaliarshad/whisper-small-urdu.
Converted with CTranslate2 for faster inference on CPU and GPU, and aimed at mobile and edge deployment.


Original Model

  • Name: whisper-small-urdu
  • Author: Khawaja Ali Arshad
  • License: Apache 2.0
  • This repository contains a quantized INT8 version; no retraining was performed.

Conversion Details

  • Conversion tool: CTranslate2
  • Quantization: INT8
  • Purpose: Reduce model size and accelerate inference for low-resource environments (mobile/CPU/GPU).
  • Backend support: Faster-Whisper compatible.
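
The conversion described above can be sketched as follows. This is an assumption about how such a conversion is typically run, not a record of the exact command used for this repository. It builds a call to CTranslate2's `ct2-transformers-converter` command-line tool with INT8 quantization. The output directory name and the copied files (`tokenizer.json`, `preprocessor_config.json`) are illustrative and assume those files exist in the source repository.

```python
import subprocess


def build_convert_command(
    source_model: str = "khawajaaliarshad/whisper-small-urdu",
    output_dir: str = "whisper-small-urdu-int8-ct2",  # illustrative name
    quantization: str = "int8",
) -> list[str]:
    """Build the ct2-transformers-converter invocation as an argument list."""
    return [
        "ct2-transformers-converter",
        "--model", source_model,
        "--output_dir", output_dir,
        "--quantization", quantization,
        # Assumption: these files exist in the source repo. faster-whisper
        # reads them from the model directory when they are present.
        "--copy_files", "tokenizer.json", "preprocessor_config.json",
    ]


def convert(**kwargs) -> None:
    """Run the conversion (requires ctranslate2, transformers and torch)."""
    subprocess.run(build_convert_command(**kwargs), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is downloaded here
    print(" ".join(build_convert_command()))
```

Calling `convert()` runs the command and downloads the source model, so it is kept separate from the command builder.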

Benchmark (T4 GPU, 5-second audio sample)

Version          Inference time   Notes
Original FP32    9.07 s           Standard Hugging Face PyTorch model
INT8 quantized   0.54 s           CTranslate2, roughly 16.8x speed-up (9.07 / 0.54)
  • Model size: ~967 MB for the original FP32 weights; the INT8 version is expected to be roughly 1/4–1/3 of that (not measured here; check the actual folder size).

⚠️ Note: Actual speed may vary depending on device, CPU/GPU, and batch size.
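
The arithmetic behind the figures above can be checked with a short sketch. The function names are hypothetical, and the size estimate only assumes that FP32 stores 4 bytes per weight and INT8 stores 1; real INT8 folders are somewhat larger because of quantization scales and any layers left unquantized.

```python
def speedup(baseline_s: float, optimized_s: float) -> float:
    """Ratio of baseline latency to optimized latency."""
    return baseline_s / optimized_s


def int8_size_lower_bound_mb(fp32_mb: float) -> float:
    """Best-case INT8 size: 1 byte per weight instead of 4."""
    return fp32_mb / 4


if __name__ == "__main__":
    # Figures taken from the benchmark table above
    print(f"Speed-up: {speedup(9.07, 0.54):.1f}x")  # Speed-up: 16.8x
    print(f"INT8 lower bound: {int8_size_lower_bound_mb(967):.0f} MB")  # 242 MB
```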


Usage Example

from faster_whisper import WhisperModel

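# Load the INT8 quantized model from a local directory of converted files
# (a Hugging Face Hub repo ID for a CTranslate2 model also works here)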
model = WhisperModel(
    "whisper-small-urdu-int8-ct2",
    device="cpu",        # or "cuda" for GPU
    compute_type="int8"
)

# Transcribe an audio file. `segments` is a lazy generator, so decoding
# only runs as the loop below iterates over it.
segments, info = model.transcribe("audio.wav")

print("Detected language:", info.language)
for segment in segments:
    print(segment.text)
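
As a variation on the example above, the language can be fixed to Urdu and each segment printed with its timestamps. This is a minimal sketch: `format_segment` and `transcribe_urdu` are hypothetical helpers, `beam_size=5` is an illustrative choice, and the model path assumes the converted files sit in a local folder of that name.

```python
from types import SimpleNamespace


def format_segment(segment) -> str:
    """Render one segment as '[start -> end] text' using its timestamps."""
    return f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text.strip()}"


def transcribe_urdu(audio_path: str, model_path: str = "whisper-small-urdu-int8-ct2"):
    """Yield formatted lines for an audio file, with the language fixed to Urdu."""
    from faster_whisper import WhisperModel  # imported lazily; heavy dependency

    model = WhisperModel(model_path, device="cpu", compute_type="int8")
    # language="ur" skips automatic language detection
    segments, _info = model.transcribe(audio_path, language="ur", beam_size=5)
    for segment in segments:
        yield format_segment(segment)


if __name__ == "__main__":
    # Stand-in segment so the formatting can be checked without a model
    demo = SimpleNamespace(start=0.0, end=2.5, text=" example text ")
    print(format_segment(demo))  # [0.00s -> 2.50s] example text
```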

License & Attribution

  • This model is released under the Apache 2.0 License, the same as the original model.

  • Original model by Khawaja Ali Arshad: khawajaaliarshad/whisper-small-urdu

  • INT8 quantization done by Muhammad Khubaib Ahmad using CTranslate2.

  • Please retain proper attribution when using this model.

Recommendations

  • For mobile/edge: Use INT8 version for faster inference and lower memory usage.

  • For training/fine-tuning: Use the original FP32 model; quantized INT8 is not suitable for further training.

  • For benchmarking: Test on your target hardware for accurate latency measurements.

  • Compatibility: Works with the faster-whisper API.
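
For the benchmarking recommendation, a small timing harness along these lines may help when measuring latency on target hardware. It is a sketch under stated assumptions: `measure_latency` is a hypothetical helper, and the warm-up and run counts are arbitrary defaults.

```python
import statistics
import time
from typing import Callable


def measure_latency(fn: Callable[[], object], warmup: int = 1, runs: int = 5) -> dict:
    """Time repeated calls to fn and report min, median and max in seconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed (model load, caches)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "min_s": min(timings),
        "median_s": statistics.median(timings),
        "max_s": max(timings),
    }


if __name__ == "__main__":
    # Dummy workload; replace with a real transcription call
    print(measure_latency(lambda: sum(range(100_000))))
```

When timing faster-whisper, the callable should consume the segment generator, for example `lambda: list(model.transcribe("audio.wav")[0])`, because decoding only runs while the segments are iterated.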
