---
datasets:
- coscan-speech2
license: cc0-1.0
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: wav2vec2-large-voxrex-swedish-coscan-no-region
  results:
  - dataset:
      name: Coscan Speech
      type: NbAiLab/coscan-speech2
    metrics:
    - name: Test Accuracy on Coscan Speech
      type: accuracy
      value: 0.6155107552811807
    - name: Validation Accuracy on Coscan Speech
      type: accuracy
      value: 0.8773432861141742
    - name: Test F1 (micro) on Coscan Speech
      type: f1
      value: 0.6155107552811807
    - name: Validation F1 (micro) on Coscan Speech
      type: f1
      value: 0.8773432861141742
    task:
      name: Audio Classification
      type: audio-classification
tags:
- generated_from_trainer
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# wav2vec2-large-voxrex-swedish-coscan-no-region
This model is a fine-tuned version of [KBLab/wav2vec2-large-voxrex-swedish](https://huggingface.co/KBLab/wav2vec2-large-voxrex-swedish) on the coscan-speech2 dataset.
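It achieves the following results on the evaluation (validation) set, matching the epoch-4 checkpoint in the training results table. F1, precision, and recall equal accuracy, which is what micro-averaging yields for single-label classification: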
- Loss: 1.0151
- Accuracy: 0.8773
- F1: 0.8773
- Precision: 0.8773
- Recall: 0.8773
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
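
The scheduler settings above can be turned into concrete numbers. The following is a minimal sketch, not the Trainer's own code: it assumes 6468 optimizer steps per epoch (the step count logged at epoch 1.0 in the training results), that warmup steps are the total step count times the warmup ratio rounded up, and that the rate decays linearly to zero after warmup. The names `linear_lr`, `TOTAL_STEPS`, and `WARMUP_STEPS` are illustrative.

```python
import math

# Values taken from the hyperparameter list and the training results table.
BASE_LR = 3e-05
WARMUP_RATIO = 0.1
NUM_EPOCHS = 5
STEPS_PER_EPOCH = 6468  # step count logged at epoch 1.0

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH
# Assumption: warmup steps = total steps * ratio, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at optimizer step `step`: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Print the learning rate at each epoch boundary.
    for epoch in range(NUM_EPOCHS + 1):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch}: step {step:>5}, lr {linear_lr(step):.3e}")
```

Under these assumptions the peak rate of 3e-05 is reached about halfway through the first epoch, and the rate then falls steadily until the final step.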
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|:------:|:---------:|:------:|
| 0.1651 | 1.0 | 6468 | 0.5657 | 0.8650 | 0.8650 | 0.8650 | 0.8650 |
| 0.1217 | 2.0 | 12936 | 0.9411 | 0.8487 | 0.8487 | 0.8487 | 0.8487 |
| 0.0013 | 3.0 | 19404 | 0.9991 | 0.8617 | 0.8617 | 0.8617 | 0.8617 |
| 0.0652 | 4.0 | 25872 | 1.0151 | 0.8773 | 0.8773 | 0.8773 | 0.8773 |
| 0.0001 | 5.0 | 32340 | 1.1031 | 0.8700 | 0.8700 | 0.8700 | 0.8700 |
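
The table can be read programmatically to see which checkpoint the headline metrics correspond to. The sketch below copies the rows by hand and compares two selection criteria; the variable names are illustrative, and the card does not state which criterion was actually used to pick the reported checkpoint.

```python
# (epoch, step, training_loss, validation_loss, accuracy), copied from the table above.
ROWS = [
    (1, 6468, 0.1651, 0.5657, 0.8650),
    (2, 12936, 0.1217, 0.9411, 0.8487),
    (3, 19404, 0.0013, 0.9991, 0.8617),
    (4, 25872, 0.0652, 1.0151, 0.8773),
    (5, 32340, 0.0001, 1.1031, 0.8700),
]

# Checkpoint with the highest validation accuracy.
best_by_accuracy = max(ROWS, key=lambda row: row[4])
# Checkpoint with the lowest validation loss.
best_by_val_loss = min(ROWS, key=lambda row: row[3])
# True if validation loss increases from every epoch to the next.
val_loss_rises_every_epoch = all(a[3] < b[3] for a, b in zip(ROWS, ROWS[1:]))

if __name__ == "__main__":
    print("best accuracy at epoch", best_by_accuracy[0])
    print("lowest validation loss at epoch", best_by_val_loss[0])
    print("validation loss rises every epoch:", val_loss_rises_every_epoch)
```

Selecting by accuracy picks epoch 4, which matches the evaluation results reported at the top of this card, while selecting by validation loss would pick epoch 1. Validation loss rising while training loss approaches zero is commonly read as a sign of overfitting.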
### Classification report on Coscan Speech (test set)
```
                         precision    recall  f1-score   support

Bergen og Ytre Vestland       0.65      0.97      0.78      1809
     Hedmark og Oppland       0.12      0.06      0.08      2302
               Nordland       0.97      0.47      0.63      2195
           Oslo-området       0.78      0.42      0.55      6957
               Sunnmøre       0.94      0.71      0.81      2636
         Sør-Vestlandet       0.96      0.46      0.62      2860
              Sørlandet       0.62      0.81      0.70      2490
                  Troms       0.67      1.00      0.80      2867
              Trøndelag       0.52      0.94      0.67      2666
         Voss og omland       0.70      0.71      0.71      2641
         Ytre Oslofjord       0.20      0.49      0.29      1678

               accuracy                           0.62     31101
              macro avg       0.65      0.64      0.60     31101
           weighted avg       0.68      0.62      0.61     31101
```
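
The summary rows of the report follow from the per-class rows. As a rough check, the sketch below recomputes them from the rounded values printed above, so results agree with the report only to about two decimals. The helper `f1_from` and the variable names are illustrative and are not part of the original evaluation code.

```python
# (region, precision, recall, f1, support), copied from the report above.
REPORT = [
    ("Bergen og Ytre Vestland", 0.65, 0.97, 0.78, 1809),
    ("Hedmark og Oppland", 0.12, 0.06, 0.08, 2302),
    ("Nordland", 0.97, 0.47, 0.63, 2195),
    ("Oslo-området", 0.78, 0.42, 0.55, 6957),
    ("Sunnmøre", 0.94, 0.71, 0.81, 2636),
    ("Sør-Vestlandet", 0.96, 0.46, 0.62, 2860),
    ("Sørlandet", 0.62, 0.81, 0.70, 2490),
    ("Troms", 0.67, 1.00, 0.80, 2867),
    ("Trøndelag", 0.52, 0.94, 0.67, 2666),
    ("Voss og omland", 0.70, 0.71, 0.71, 2641),
    ("Ytre Oslofjord", 0.20, 0.49, 0.29, 1678),
]


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(row[4] for row in REPORT)
# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[3] for row in REPORT) / len(REPORT)
# Weighted average: per-class F1 weighted by class support.
weighted_f1 = sum(row[3] * row[4] for row in REPORT) / total_support
# For single-label classification, support-weighted recall equals accuracy.
accuracy_from_recall = sum(row[2] * row[4] for row in REPORT) / total_support
# Class with the lowest F1 score.
weakest = min(REPORT, key=lambda row: row[3])

if __name__ == "__main__":
    print(f"support total: {total_support}")
    print(f"macro F1: {macro_f1:.2f}, weighted F1: {weighted_f1:.2f}")
    print(f"accuracy from recall: {accuracy_from_recall:.2f}")
    print(f"weakest class: {weakest[0]}")
```

The lowest-scoring class by F1 is Hedmark og Oppland. Ytre Oslofjord also has low precision, and both sit well below the other regions in the report.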
### Framework versions
- Transformers 4.22.0.dev0
- PyTorch 1.10.1+cu102
- Datasets 2.4.1.dev0
- Tokenizers 0.12.1