MahmoodAnaam/lrs2_train_validation_test
How to use MahmoodAnaam/MSP-Audio-V0 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="MahmoodAnaam/MSP-Audio-V0", trust_remote_code=True)
# Load model directly
from transformers import AutoModelForCTC
model = AutoModelForCTC.from_pretrained("MahmoodAnaam/MSP-Audio-V0", trust_remote_code=True, dtype="auto")
This model is a fine-tuned version of facebook/wav2vec2-base-960h on an unknown dataset. It achieves the following results on the evaluation set:
Note: we evaluate the test set with batch_size=1 on purpose due to this issue. Because padded inputs don't yield exactly the same output as non-padded inputs, a better WER can be achieved by not padding the input at all.
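As a rough illustration of the note above, the sketch below scores one utterance at a time, so no padding is ever applied, and accumulates corpus-level WER and CER from a plain Levenshtein distance. The names `edit_distance`, `evaluate_unpadded`, and the `transcribe` callable are hypothetical helpers introduced for this example; this is not the evaluation script used for the model.

```python
from typing import Callable, Iterable, Sequence, Tuple


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def evaluate_unpadded(
    samples: Iterable[Tuple[object, str]],
    transcribe: Callable[[object], str],
) -> Tuple[float, float]:
    """Return (WER, CER) over (audio, reference) pairs.

    Each audio is passed to `transcribe` on its own (batch size 1),
    so inputs are never padded to a common length.
    Assumes at least one non-empty reference.
    """
    word_errors = word_total = char_errors = char_total = 0
    for audio, reference in samples:
        hypothesis = transcribe(audio)
        ref_words, hyp_words = reference.split(), hypothesis.split()
        word_errors += edit_distance(ref_words, hyp_words)
        word_total += len(ref_words)
        char_errors += edit_distance(reference, hypothesis)
        char_total += len(reference)
    return word_errors / word_total, char_errors / char_total
```

With the `pipe` object from the usage snippet, `transcribe` could be something like `lambda audio: pipe(audio)["text"]`, assuming each `audio` is a waveform the pipeline accepts. Text normalization (casing, punctuation) is left out here and would affect both rates.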
Training results, evaluated every 500 steps:
| Training Loss | Epoch | Step | Validation Loss | WER | CER |
|---|---|---|---|---|---|
| 2.9020 | 0.6821 | 500 | 0.3734 | 0.3326 | 0.2717 |
| 2.9679 | 1.3643 | 1000 | 0.3505 | 0.3264 | 0.2593 |
| 2.9390 | 2.0464 | 1500 | 0.3923 | 0.3659 | 0.2725 |
| 2.8775 | 2.7285 | 2000 | 0.3607 | 0.3614 | 0.2675 |
| 2.9122 | 3.4106 | 2500 | 0.3953 | 0.3812 | 0.2770 |
| 2.8879 | 4.0928 | 3000 | 0.3950 | 0.3800 | 0.2774 |
| 2.8735 | 4.7749 | 3500 | 0.4303 | 0.3827 | 0.2849 |
| 2.9131 | 5.4570 | 4000 | 0.4071 | 0.3833 | 0.2847 |
| 2.8792 | 6.1392 | 4500 | 0.3638 | 0.3640 | 0.2703 |
| 2.8804 | 6.8213 | 5000 | 0.3389 | 0.3544 | 0.2669 |
| 2.8883 | 7.5034 | 5500 | 0.3495 | 0.3583 | 0.2693 |
| 2.8861 | 8.1855 | 6000 | 0.3985 | 0.3827 | 0.2849 |
| 2.8934 | 8.8677 | 6500 | 0.3453 | 0.3590 | 0.2694 |
| 2.9068 | 9.5498 | 7000 | 0.3327 | 0.3344 | 0.2596 |
| 2.8741 | 10.2319 | 7500 | 0.3176 | 0.3321 | 0.2577 |
| 2.8961 | 10.9141 | 8000 | 0.3362 | 0.3309 | 0.2591 |
| 2.8826 | 11.5962 | 8500 | 0.3344 | 0.3272 | 0.2564 |
| 2.8922 | 12.2783 | 9000 | 0.3172 | 0.3359 | 0.2568 |
| 2.8963 | 12.9604 | 9500 | 0.3175 | 0.3228 | 0.2525 |
| 2.8683 | 13.6426 | 10000 | 0.2987 | 0.3147 | 0.2521 |
| 2.8781 | 14.3247 | 10500 | 0.2992 | 0.3222 | 0.2552 |
| 2.8693 | 15.0068 | 11000 | 0.2764 | 0.3099 | 0.2482 |
| 2.8676 | 15.6889 | 11500 | 0.3020 | 0.3140 | 0.2522 |
| 2.8953 | 16.3711 | 12000 | 0.2932 | 0.3080 | 0.2470 |
| 2.9023 | 17.0532 | 12500 | 0.2895 | 0.3075 | 0.2478 |
| 2.8665 | 17.7353 | 13000 | 0.2889 | 0.3098 | 0.2466 |
| 2.9208 | 18.4175 | 13500 | 0.2753 | 0.3114 | 0.2461 |
| 2.8623 | 19.0996 | 14000 | 0.2749 | 0.3077 | 0.2447 |
| 2.9092 | 19.7817 | 14500 | 0.2711 | 0.3066 | 0.2433 |
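A small sketch of how the table can be read programmatically: it estimates the number of optimizer steps per epoch from the Step and Epoch columns and picks the checkpoint with the lowest word error rate. The `rows` list holds a handful of rows copied from the table; the helper names are hypothetical and chosen only for this example.

```python
# (step, epoch, validation_loss, wer, cer), copied from the table above
rows = [
    (500, 0.6821, 0.3734, 0.3326, 0.2717),
    (7500, 10.2319, 0.3176, 0.3321, 0.2577),
    (11000, 15.0068, 0.2764, 0.3099, 0.2482),
    (14000, 19.0996, 0.2749, 0.3077, 0.2447),
    (14500, 19.7817, 0.2711, 0.3066, 0.2433),
]


def steps_per_epoch(row) -> int:
    """Estimate steps per epoch as step / epoch, rounded to an integer."""
    step, epoch = row[0], row[1]
    return round(step / epoch)


def best_by_wer(table):
    """Return the row with the lowest word error rate."""
    return min(table, key=lambda r: r[3])


estimate = steps_per_epoch(rows[-1])
best = best_by_wer(rows)
print(f"about {estimate} steps per epoch")
print(f"lowest WER {best[3]:.4f} at step {best[0]} (CER {best[4]:.4f})")
```

In the full table the last row (step 14500) also has the lowest validation loss, word error rate, and character error rate, so selecting on any of the three metrics gives the same checkpoint.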
Base model: facebook/wav2vec2-base-960h