---
language:
- ar
license: apache-2.0
base_model: openai/whisper-large-v3
tags:
- hf-asr-leaderboard
- generated_from_trainer
datasets:
- Voice_Cleverlytics
model-index:
- name: Whisper_Cleverlytics
  results: []
metrics:
- wer
---

# Whisper_Cleverlytics

## Usage

To run the model, first install the Transformers library from its GitHub repository, together with Accelerate and Datasets (with audio support).

```bash
pip install --upgrade pip
pip install --upgrade git+https://github.com/huggingface/transformers.git accelerate datasets[audio]
```

```python
import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline

# Use the GPU with half precision when available, otherwise fall back to CPU and float32.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

model_id = "smerchi/Arabic-Morocco-Speech_To_Text"

# Load the fine-tuned Whisper checkpoint and move it to the selected device.
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=False, use_safetensors=True
)
model.to(device)

# The processor bundles the tokenizer and the audio feature extractor.
processor = AutoProcessor.from_pretrained(model_id)

# Long audio is split into 30-second chunks that are transcribed in batches.
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    chunk_length_s=30,
    batch_size=16,
    return_timestamps=True,
    torch_dtype=torch_dtype,
    device=device,
)

# Path to the audio file to transcribe.
audio = "/content/audio.mp3"

# In a notebook, prefix the next line with `%time` to measure inference time.
result = pipe(audio)
print(result["text"])
```

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- num_epochs: 20

### Training results

### Framework versions

- Transformers 4.35.2
- Pytorch 2.0.1+cu117
- Datasets 2.16.0
- Tokenizers 0.14.1
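
### Computing WER (illustrative sketch)

The card metadata lists `wer` (word error rate) as the metric but reports no results. As a reference for readers who want to evaluate transcriptions themselves, the following is a minimal sketch of how WER is commonly defined: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The function name `word_error_rate` and the example strings are assumptions made for illustration; this is not the evaluation script used to train this model, and it applies no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution and one deletion over four reference words.
score = word_error_rate("a b c d", "a x c")
print(score)
```

For real evaluations, an established package such as `jiwer` (`jiwer.wer(reference, hypothesis)`) is usually preferable, and both transcripts should be normalized the same way before scoring.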