---
library_name: transformers
tags: []
---

# Model Card for fastinom/ASR_fassy

## Model Details

### Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub.

- **Developed by:** Fastino Mateteva
- **Model type:** Wav2Vec2 (CTC) model for automatic speech recognition
- **Language(s) (NLP):** Shona
- **License:** [More Information Needed]

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

### Running the model
First install the dependencies:

```bash
pip install transformers datasets torchaudio
```

```python
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "fastinom/ASR_fassy"

# Load model and processor
model = Wav2Vec2ForCTC.from_pretrained(model_id)
processor = Wav2Vec2Processor.from_pretrained(model_id)

def load_audio(file_path):
    # Load the audio file and resample it to the 16 kHz rate the model expects
    speech_array, sampling_rate = torchaudio.load(file_path)
    resampler = torchaudio.transforms.Resample(sampling_rate, 16000)
    speech = resampler(speech_array).squeeze().numpy()
    return speech

# Example audio file path (replace with your own recording)
audio_file = "/content/drive/MyDrive/recordings/wavefiles/1.wav"
speech = load_audio(audio_file)

# Preprocess the audio
inputs = processor(speech, sampling_rate=16000, return_tensors="pt", padding=True)

# Perform inference
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Decode the output
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)
print(transcription[0])
```
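Alternatively, the high-level `pipeline` API can wrap the same checkpoint and handle loading, resampling, and decoding in one call. This is a minimal sketch, assuming the checkpoint's processor configuration is complete enough for the ASR pipeline; the audio path is a placeholder.

```python
from transformers import pipeline

# Minimal sketch: the ASR pipeline bundles feature extraction, inference, and decoding.
asr = pipeline("automatic-speech-recognition", model="fastinom/ASR_fassy")

# Replace with your own recording
result = asr("/content/drive/MyDrive/recordings/wavefiles/1.wav")
print(result["text"])
```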
## Training Details

### Training Data

[More Information Needed]

### Training Procedure

#### Preprocessing [optional]

[More Information Needed]

### Training hyperparameters

The following hyperparameters were used during training (a sketch of how they map onto `TrainingArguments` appears at the end of this card):

- learning_rate: 5e-4
- per_device_train_batch_size: 4
- eval_batch_size: 2
- evaluation_strategy: steps
- gradient_checkpointing: True
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- num_train_epochs: 3
- save_total_limit: 1
- fp16: True
- save_steps: 400
- eval_steps: 200
- logging_steps: 200
- push_to_hub: True

### Training results

| Training Loss | WER  | Step | Validation Loss |
|:-------------:|:----:|:----:|:---------------:|
| 6.427         | 1.00 | 200  | 4.1518          |
| 3.7979        | 1.00 | 400  | 3.8410          |
| 3.6924        | 1.00 | 600  | 3.4249          |
| 0.8357        | 0.26 | 800  | 0.2396          |
| 0.1528        | 0.24 | 1000 | 0.2155          |
| 0.1415        | 0.24 | 1200 | 0.2036          |
| 0.1278        | 0.24 | 1400 | 0.2028          |

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** T4 GPU
- **Hours used:** 3
- **Cloud Provider:** Google Colab

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Model Card Authors [optional]

Fastino Mateteva

## Model Card Contact

fastinomateteva@gmail.com
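## Appendix: Training arguments sketch

For reference, here is a hedged sketch of how the values listed under *Training hyperparameters* could map onto `transformers.TrainingArguments`. Only the values mirror the list above; `output_dir` and the surrounding trainer wiring are assumptions, not part of this card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ASR_fassy",          # assumption: not stated in the card
    learning_rate=5e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=2,    # "eval_batch_size" in the list above
    evaluation_strategy="steps",
    gradient_checkpointing=True,
    gradient_accumulation_steps=4,   # 4 per device x 4 accumulation = total batch size 16
    num_train_epochs=3,
    save_total_limit=1,
    fp16=True,
    save_steps=400,
    eval_steps=200,
    logging_steps=200,
    push_to_hub=True,
)
```

These arguments would then be passed to a `Trainer` together with the model, processor, and datasets; that wiring is omitted here because the card does not document it.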