jonatasgrosman committed on
Commit d12252b
1 Parent(s): 209410b

adjust README

Files changed (1)
  1. README.md +20 -9
README.md CHANGED
@@ -24,10 +24,10 @@ model-index:
  metrics:
    - name: Test WER
      type: wer
-     value: 13.42
    - name: Test CER
      type: cer
-     value: 8.63
  ---

  # Wav2Vec2-Large-XLSR-53-Dutch
@@ -49,8 +49,9 @@ from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

  LANG_ID = "nl"
  MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-dutch"

- test_dataset = load_dataset("common_voice", LANG_ID, split="test[:2%]")

  processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
  model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
@@ -64,17 +65,28 @@ def speech_file_to_array_fn(batch):
      return batch

  test_dataset = test_dataset.map(speech_file_to_array_fn)
- inputs = processor(test_dataset[:2]["speech"], sampling_rate=16_000, return_tensors="pt", padding=True)

  with torch.no_grad():
      logits = model(inputs.input_values, attention_mask=inputs.attention_mask).logits

  predicted_ids = torch.argmax(logits, dim=-1)

- print("Prediction:", processor.batch_decode(predicted_ids))
- print("Reference:", test_dataset[:2]["sentence"])
  ```

  ## Evaluation

  The model can be evaluated as follows on the Dutch test data of Common Voice.
@@ -134,6 +146,5 @@ print("CER: {:2f}".format(100 * cer.compute(predictions=result["pred_strings"],

  **Test Result**:

- - WER: 13.42%
-
- - CER: 8.63%
 
  metrics:
    - name: Test WER
      type: wer
+     value: 13.60
    - name: Test CER
      type: cer
+     value: 8.12
  ---

  # Wav2Vec2-Large-XLSR-53-Dutch
 

  LANG_ID = "nl"
  MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-dutch"
+ SAMPLES = 5

+ test_dataset = load_dataset("common_voice", LANG_ID, split=f"test[:{SAMPLES}]")

  processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
  model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
 
      return batch

  test_dataset = test_dataset.map(speech_file_to_array_fn)
+ inputs = processor(test_dataset["speech"], sampling_rate=16_000, return_tensors="pt", padding=True)

  with torch.no_grad():
      logits = model(inputs.input_values, attention_mask=inputs.attention_mask).logits

  predicted_ids = torch.argmax(logits, dim=-1)
+ predicted_sentences = processor.batch_decode(predicted_ids)

+ for i, predicted_sentence in enumerate(predicted_sentences):
+     print("-" * 100)
+     print("Reference:", test_dataset[i]["sentence"])
+     print("Prediction:", predicted_sentence)
  ```

+ | Reference | Prediction |
+ | ------------- | ------------- |
+ | DE ABORIGINALS ZIJN DE OORSPRONKELIJKE BEWONERS VAN AUSTRALIË. | DE ABORIGONALS ZIJN DE OORSPRONKELIJKE BEWONERS VAN AUSTRALIË |
+ | MIJN TOETSENBORD ZIT VOL STOF | MIJN TOETSEN BORT ZIT VOL STOF. |
+ | ZE HAD DE BANK BESCHADIGD MET HAAR SKATEBOARD. | ZE HAD DE BANK BESCHADIGD MET HAAR SCHEETBOORD |
+ | WAAR LAAT JIJ JE ONDERHOUD DOEN? | WAAR LAAT JIJ JE ONDERHOUD DOEN |
+ | NA HET LEZEN VAN VELE BEOORDELINGEN HAD ZE EINDELIJK HAAR OOG LATEN VALLEN OP EEN LAPTOP MET EEN QWERTY TOETSENBORD. | NA HET LEZEN VAN VELE BEOORDELINGEN HAD ZE EINDELIJK HAAR OOG LATEN VALLEN OP EEN LAPTOP MET EEN KWERTIETOETSENBORD |
+
  ## Evaluation

  The model can be evaluated as follows on the Dutch test data of Common Voice.
 

  **Test Result**:

+ - WER: 13.60%
+ - CER: 8.12%
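
For readers unfamiliar with the two metrics changed in this commit, the following is a minimal sketch of how WER and CER are conventionally defined: the Levenshtein edit distance between reference and prediction, counted over words or characters, divided by the reference length. It is not the README's evaluation script. The helper names (`edit_distance`, `normalize`, `error_rate`) and the punctuation-stripping step are assumptions made for illustration, and the two sentence pairs are taken from the Reference/Prediction table in this diff.

```python
import string


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (word lists or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalize(text):
    # Assumption: strip ASCII punctuation, upper-case, collapse whitespace.
    # The README's actual normalization is not visible in this diff.
    cleaned = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.upper().split())


def error_rate(references, predictions, unit="word"):
    """Return the error rate in percent; unit is "word" (WER) or "char" (CER)."""
    errors = total = 0
    for ref, hyp in zip(references, predictions):
        ref, hyp = normalize(ref), normalize(hyp)
        if unit == "word":
            ref, hyp = ref.split(), hyp.split()
        errors += edit_distance(ref, hyp)
        total += len(ref)
    return 100 * errors / total


# Two pairs copied from the Reference/Prediction table in this commit.
references = ["MIJN TOETSENBORD ZIT VOL STOF", "WAAR LAAT JIJ JE ONDERHOUD DOEN?"]
predictions = ["MIJN TOETSEN BORT ZIT VOL STOF.", "WAAR LAAT JIJ JE ONDERHOUD DOEN"]

print("WER: {:.2f}".format(error_rate(references, predictions, unit="word")))
print("CER: {:.2f}".format(error_rate(references, predictions, unit="char")))
```

Because a split word such as "TOETSEN BORT" counts as one substitution plus one insertion at the word level but only two character edits, WER is typically higher than CER on the same data, which matches the relationship between the two reported figures.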