---
datasets:
- coscan-speech2
license: cc0-1.0
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: wav2vec2-large-voxrex-swedish-coscan-no-region
  results:
  - dataset:
      name: Coscan Speech
      type: NbAiLab/coscan-speech2
    metrics:
    - name: Test Accuracy on Coscan Speech
      type: accuracy
      value: 0.6155107552811807
    - name: Validation Accuracy on Coscan Speech
      type: accuracy
      value: 0.8773432861141742
    - name: Test F1 (micro) on Coscan Speech
      type: f1
      value: 0.6155107552811807
    - name: Validation F1 (micro) on Coscan Speech
      type: f1
      value: 0.8773432861141742
    task:
      name: Audio Classification
      type: audio-classification
tags:
- generated_from_trainer
---


# wav2vec2-large-voxrex-swedish-coscan-no-region

This model is a fine-tuned version of [KBLab/wav2vec2-large-voxrex-swedish](https://huggingface.co/KBLab/wav2vec2-large-voxrex-swedish) on the coscan-speech2 dataset.
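It achieves the following results on the validation set (the epoch 4 checkpoint in the training results table):
- Loss: 1.0151
- Accuracy: 0.8773
- F1 (micro): 0.8773
- Precision (micro): 0.8773
- Recall (micro): 0.8773

On the test set, accuracy and micro-averaged F1 are both 0.6155.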
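
Accuracy, F1, precision, and recall are identical here because micro-averaging over a single-label multiclass task counts every misclassification once as a false positive (for the predicted class) and once as a false negative (for the true class). The sketch below illustrates this with a pure-Python computation; the five labels and predictions are made-up toy data, not samples from coscan-speech2, and `micro_scores` is a hypothetical helper name.

```python
from collections import Counter


def micro_scores(y_true, y_pred):
    """Return accuracy and micro-averaged precision, recall, and F1."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # wrong prediction: false positive for the predicted class
            fn[truth] += 1  # and a false negative for the true class

    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())

    precision = total_tp / (total_tp + total_fp) if total_tp + total_fp else 0.0
    recall = total_tp / (total_tp + total_fn) if total_tp + total_fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = total_tp / len(y_true)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy data (illustrative only): 3 of 5 predictions are correct.
y_true = ["Troms", "Troms", "Nordland", "Sunnmøre", "Sørlandet"]
y_pred = ["Troms", "Trøndelag", "Troms", "Sunnmøre", "Sørlandet"]

scores = micro_scores(y_true, y_pred)
print(scores)
```

All four values coincide for any single-label prediction set, so the micro-averaged columns carry no information beyond accuracy; per-class or macro-averaged scores are needed to see class-level behaviour.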

## Model description

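A wav2vec 2.0 audio classification model, initialized from the Swedish [KBLab/wav2vec2-large-voxrex-swedish](https://huggingface.co/KBLab/wav2vec2-large-voxrex-swedish) checkpoint and fine-tuned to assign a speech recording to one of the 11 Norwegian regions listed in the classification report below.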

## Intended uses & limitations

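Intended uses have not been documented. The reported results point to two limitations: test accuracy (0.62) is well below validation accuracy (0.88), and per-class test performance is uneven, with an F1 of 0.08 for Hedmark og Oppland and 0.29 for Ytre Oslofjord.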

## Training and evaluation data

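The model was fine-tuned and evaluated on the coscan-speech2 dataset (`NbAiLab/coscan-speech2`). The test set used for the classification report below contains 31,101 examples across 11 region labels. How the training, validation, and test splits were constructed has not been documented.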

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
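
To make the `linear` scheduler with `lr_scheduler_warmup_ratio: 0.1` concrete, here is a minimal sketch of the resulting learning-rate curve. It assumes the usual behaviour of a linear schedule with warmup (ramp from 0 to the peak rate, then linear decay to 0) and takes the total of 32340 optimizer steps from the training results table (5 epochs of 6468 steps). The function name `lr_at` is hypothetical, and the exact warmup step count used in the original run is an assumption.

```python
import math

PEAK_LR = 3e-05          # learning_rate
NUM_EPOCHS = 5           # num_epochs
STEPS_PER_EPOCH = 6468   # from the Step column of the training results table
WARMUP_RATIO = 0.1       # lr_scheduler_warmup_ratio

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH
# Assumption: warmup length is the ratio times the total steps, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay from the peak rate down to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for epoch in range(NUM_EPOCHS + 1):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch}: step {step:>5}, lr {lr_at(step):.2e}")
```

Under these assumptions the peak rate is reached within the first epoch, so most of training runs on the decaying part of the schedule.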

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy | F1     | Precision | Recall |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|:------:|:---------:|:------:|
| 0.1651        | 1.0   | 6468  | 0.5657          | 0.8650   | 0.8650 | 0.8650    | 0.8650 |
| 0.1217        | 2.0   | 12936 | 0.9411          | 0.8487   | 0.8487 | 0.8487    | 0.8487 |
| 0.0013        | 3.0   | 19404 | 0.9991          | 0.8617   | 0.8617 | 0.8617    | 0.8617 |
| 0.0652        | 4.0   | 25872 | 1.0151          | 0.8773   | 0.8773 | 0.8773    | 0.8773 |
| 0.0001        | 5.0   | 32340 | 1.1031          | 0.8700   | 0.8700 | 0.8700    | 0.8700 |
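
The headline validation numbers match the epoch 4 row, which has the highest accuracy but not the lowest validation loss. The sketch below picks both rows from the table so the difference is explicit; the values are copied from the table above, and the variable names are illustrative only.

```python
# (training_loss, epoch, step, validation_loss, accuracy), copied from the table.
RESULTS = [
    (0.1651, 1.0, 6468, 0.5657, 0.8650),
    (0.1217, 2.0, 12936, 0.9411, 0.8487),
    (0.0013, 3.0, 19404, 0.9991, 0.8617),
    (0.0652, 4.0, 25872, 1.0151, 0.8773),
    (0.0001, 5.0, 32340, 1.1031, 0.8700),
]

# Row with the highest validation accuracy.
best_by_accuracy = max(RESULTS, key=lambda row: row[4])
# Row with the lowest validation loss.
best_by_loss = min(RESULTS, key=lambda row: row[3])

print("best accuracy:", best_by_accuracy[4], "at epoch", best_by_accuracy[1])
print("lowest validation loss:", best_by_loss[3], "at epoch", best_by_loss[1])

# Validation loss rises after the first epoch while training loss falls towards zero.
val_losses = [row[3] for row in RESULTS]
loss_rises_every_epoch = all(a < b for a, b in zip(val_losses, val_losses[1:]))
print("validation loss increases every epoch:", loss_rises_every_epoch)
```

Validation loss climbing while training loss approaches zero is a common sign of overfitting, which is worth keeping in mind when reading the gap between validation and test accuracy.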


### Classification report on Coscan Speech (test set)

```
                         precision    recall  f1-score   support

Bergen og Ytre Vestland       0.65      0.97      0.78      1809
     Hedmark og Oppland       0.12      0.06      0.08      2302
               Nordland       0.97      0.47      0.63      2195
           Oslo-området       0.78      0.42      0.55      6957
               Sunnmøre       0.94      0.71      0.81      2636
         Sør-Vestlandet       0.96      0.46      0.62      2860
              Sørlandet       0.62      0.81      0.70      2490
                  Troms       0.67      1.00      0.80      2867
              Trøndelag       0.52      0.94      0.67      2666
         Voss og omland       0.70      0.71      0.71      2641
         Ytre Oslofjord       0.20      0.49      0.29      1678

               accuracy                           0.62     31101
              macro avg       0.65      0.64      0.60     31101
           weighted avg       0.68      0.62      0.61     31101

```
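
As a sanity check, the macro and weighted averages in the report can be recomputed from the per-class rows. The sketch below does this in plain Python. Because the per-class values are rounded to two decimals in the report, the recomputed averages should only be expected to agree with the printed ones to within about 0.01.

```python
# (label, precision, recall, f1, support), copied from the report above.
ROWS = [
    ("Bergen og Ytre Vestland", 0.65, 0.97, 0.78, 1809),
    ("Hedmark og Oppland", 0.12, 0.06, 0.08, 2302),
    ("Nordland", 0.97, 0.47, 0.63, 2195),
    ("Oslo-området", 0.78, 0.42, 0.55, 6957),
    ("Sunnmøre", 0.94, 0.71, 0.81, 2636),
    ("Sør-Vestlandet", 0.96, 0.46, 0.62, 2860),
    ("Sørlandet", 0.62, 0.81, 0.70, 2490),
    ("Troms", 0.67, 1.00, 0.80, 2867),
    ("Trøndelag", 0.52, 0.94, 0.67, 2666),
    ("Voss og omland", 0.70, 0.71, 0.71, 2641),
    ("Ytre Oslofjord", 0.20, 0.49, 0.29, 1678),
]
METRICS = {"precision": 1, "recall": 2, "f1": 3}  # column index per metric


def macro_avg(rows, col):
    """Unweighted mean over classes: every region counts equally."""
    return sum(row[col] for row in rows) / len(rows)


def weighted_avg(rows, col):
    """Mean over classes weighted by support (number of test examples)."""
    total = sum(row[4] for row in rows)
    return sum(row[col] * row[4] for row in rows) / total


total_support = sum(row[4] for row in ROWS)
macro = {name: macro_avg(ROWS, col) for name, col in METRICS.items()}
weighted = {name: weighted_avg(ROWS, col) for name, col in METRICS.items()}

print("total support:", total_support)
for name in METRICS:
    print(f"{name:>9}: macro {macro[name]:.3f}, weighted {weighted[name]:.3f}")
```

The macro average treats the 1678 Ytre Oslofjord examples the same as the 6957 Oslo-området examples, while the weighted average is dominated by the larger classes.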


### Framework versions

- Transformers 4.22.0.dev0
- PyTorch 1.10.1+cu102
- Datasets 2.4.1.dev0
- Tokenizers 0.12.1