MengniWang
committed on
Commit
•
705a1ad
1
Parent(s):
91121b0
add code
Browse files
README.md
CHANGED
@@ -48,6 +48,46 @@ Download the model by cloning the repository:
|
|
48 |
git clone https://huggingface.co/Intel/whisper-large-int8-static
|
49 |
```
|
50 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
51 |
## Metrics (Model Performance):
|
52 |
| Model | Model Size (GB) | wer |
|
53 |
|---|:---:|:---:|
|
|
|
48 |
git clone https://huggingface.co/Intel/whisper-large-int8-static
|
49 |
```
|
50 |
|
51 |
+
Evaluate the model with the code below:
|
52 |
+
```python
|
53 |
+
import os
|
54 |
+
from evaluate import load
|
55 |
+
from datasets import load_dataset
|
56 |
+
from transformers import WhisperForConditionalGeneration, WhisperProcessor, AutoConfig
|
57 |
+
|
58 |
+
model_name = 'openai/whisper-large'
|
59 |
+
model_path = 'whisper-large-int8-static'
|
60 |
+
processor = WhisperProcessor.from_pretrained(model_name)
|
61 |
+
model = WhisperForConditionalGeneration.from_pretrained(model_name)
|
62 |
+
config = AutoConfig.from_pretrained(model_name)
|
63 |
+
wer = load("wer")
|
64 |
+
librispeech_test_clean = load_dataset("librispeech_asr", "clean", split="test")
|
65 |
+
|
66 |
+
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
|
67 |
+
from transformers import PretrainedConfig
|
68 |
+
model_config = PretrainedConfig.from_pretrained(model_name)
|
69 |
+
predictions = []
|
70 |
+
references = []
|
71 |
+
sessions = ORTModelForSpeechSeq2Seq.load_model(
|
72 |
+
os.path.join(model_path, 'encoder_model.onnx'),
|
73 |
+
os.path.join(model_path, 'decoder_model.onnx'),
|
74 |
+
os.path.join(model_path, 'decoder_with_past_model.onnx'))
|
75 |
+
model = ORTModelForSpeechSeq2Seq(sessions[0], sessions[1], model_config, model_path, sessions[2])
|
76 |
+
for idx, batch in enumerate(librispeech_test_clean):
|
77 |
+
audio = batch["audio"]
|
78 |
+
input_features = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt").input_features
|
79 |
+
reference = processor.tokenizer._normalize(batch['text'])
|
80 |
+
references.append(reference)
|
81 |
+
predicted_ids = model.generate(input_features)[0]
|
82 |
+
transcription = processor.decode(predicted_ids)
|
83 |
+
prediction = processor.tokenizer._normalize(transcription)
|
84 |
+
predictions.append(prediction)
|
85 |
+
wer_result = wer.compute(references=references, predictions=predictions)
|
86 |
+
print(f"Result wer: {wer_result * 100}")
|
87 |
+
accuracy = 1 - wer_result
|
88 |
+
print("Accuracy: %.5f" % accuracy)
|
89 |
+
```
|
90 |
+
|
91 |
## Metrics (Model Performance):
|
92 |
| Model | Model Size (GB) | wer |
|
93 |
|---|:---:|:---:|
|