w11wo committed eff41b2 (1 parent: 017e952)

Update README.md

Files changed (1): README.md (+45, -38)
README.md CHANGED
---
language: en
license: apache-2.0
tags:
- audio-classification
- generated_from_trainer
metrics:
- accuracy
- f1
model-index:
- name: wav2vec2-xls-r-adult-child-cls
  results: []
---

# Wav2Vec2 XLS-R Adult/Child Speech Classifier

Wav2Vec2 XLS-R Adult/Child Speech Classifier is an audio classification model based on the [XLS-R](https://arxiv.org/abs/2111.09296) architecture. It is a fine-tuned version of [wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on a private adult/child speech classification dataset.

The model was trained with HuggingFace's Transformers framework on PyTorch. All training was done on a Tesla P100 GPU provided by Kaggle, and [training metrics](https://huggingface.co/w11wo/wav2vec2-xls-r-adult-child-cls/tensorboard) were logged via Tensorboard.
## Model

| Model                            | #params | Arch. | Training/Validation data (audio)          |
| -------------------------------- | ------- | ----- | ----------------------------------------- |
| `wav2vec2-xls-r-adult-child-cls` | 300M    | XLS-R | Adult/Child Speech Classification Dataset |
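
As a usage sketch (the `classify_file` and `top_prediction` helpers are illustrative, and the sample filename is a placeholder), the checkpoint can be loaded through the `transformers` audio-classification pipeline:

```python
def classify_file(path):
    """Run the adult/child classifier on an audio file via the HF pipeline.

    `transformers` is imported lazily so the rest of the sketch runs
    even without it installed.
    """
    from transformers import pipeline

    classifier = pipeline(
        "audio-classification",
        model="w11wo/wav2vec2-xls-r-adult-child-cls",
    )
    # wav2vec2 models expect 16 kHz mono input; the pipeline decodes and
    # resamples audio files via ffmpeg when available.
    return classifier(path)


def top_prediction(predictions):
    """Pick the highest-scoring label from the pipeline's output."""
    return max(predictions, key=lambda p: p["score"])["label"]
```

`top_prediction(classify_file("sample.wav"))` would then return the single most likely label for that clip.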

## Evaluation Results

The model achieves the following results on the evaluation set:

| Dataset                           | Loss   | Accuracy | F1     |
| --------------------------------- | ------ | -------- | ------ |
| Adult/Child Speech Classification | 0.1851 | 94.69%   | 95.08% |
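
For reference, accuracy and an F1 score of the kind reported above can be computed from predictions and ground-truth labels. A small pure-Python sketch for the binary adult/child case (the card does not specify the exact metric configuration used, so this is illustrative only):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true, y_pred, positive="child"):
    """F1 score for one class: harmonic mean of precision and recall."""
    tp = sum(t == p == positive for t, p in zip(y_true, y_pred))
    fp = sum(p == positive and t != positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```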

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- `learning_rate`: 3e-05
- `train_batch_size`: 8
- `eval_batch_size`: 8
- `seed`: 42
- `gradient_accumulation_steps`: 4
- `total_train_batch_size`: 32
- `optimizer`: Adam with `betas=(0.9, 0.999)` and `epsilon=1e-08`
- `lr_scheduler_type`: linear
- `lr_scheduler_warmup_ratio`: 0.1
- `num_epochs`: 5
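
A quick arithmetic check on the hyperparameters above (the per-epoch step count of 383 comes from the training results table):

```python
# Effective batch size: per-device batch size times accumulation steps.
train_batch_size = 8
gradient_accumulation_steps = 4
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# 5 epochs at 383 optimizer steps each gives 1915 total steps; with a
# linear schedule and warmup_ratio = 0.1, roughly the first 10% of those
# steps (~192) ramp the learning rate up from 0 toward 3e-05.
steps_per_epoch = 383
num_epochs = 5
total_steps = num_epochs * steps_per_epoch

print(total_train_batch_size, total_steps)
```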

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
| :-----------: | :---: | :--: | :-------------: | :------: | :----: |
| 0.2906        | 1.0   | 383  | 0.1856          | 0.9372   | 0.9421 |
| 0.1749        | 2.0   | 766  | 0.1925          | 0.9418   | 0.9465 |
| 0.1681        | 3.0   | 1149 | 0.1893          | 0.9414   | 0.9459 |
| 0.1295        | 4.0   | 1532 | 0.1851          | 0.9469   | 0.9508 |
| 0.2031        | 5.0   | 1915 | 0.1944          | 0.9423   | 0.9460 |
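
The headline numbers in "Evaluation Results" correspond to the epoch-4 checkpoint, which attains the lowest validation loss in the table above. A minimal selection sketch over those rows:

```python
# (epoch, validation_loss, accuracy, f1) rows from the table above.
results = [
    (1, 0.1856, 0.9372, 0.9421),
    (2, 0.1925, 0.9418, 0.9465),
    (3, 0.1893, 0.9414, 0.9459),
    (4, 0.1851, 0.9469, 0.9508),
    (5, 0.1944, 0.9423, 0.9460),
]

# Epoch 4 minimizes validation loss and also maximizes accuracy and F1.
best_epoch, best_loss, best_acc, best_f1 = min(results, key=lambda r: r[1])
print(best_epoch, best_f1)
```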

## Disclaimer

Consider the biases of the pre-training datasets, which may carry over into this model's predictions.

## Authors

Wav2Vec2 XLS-R Adult/Child Speech Classifier was trained and evaluated by [Wilson Wongso](https://w11wo.github.io/). All computation and development were done on Kaggle.

## Framework versions

- Transformers 4.17.0.dev0
- PyTorch 1.10.2+cu102