bookbot/distil-wav2vec2-xls-r-adult-child-cls-89m

@@ -1,66 +1,73 @@
 ---
 license: apache-2.0
 tags:
-- generated_from_trainer
 metrics:
-- accuracy
-- f1
 model-index:
-- name: distil-wav2vec2-xls-r-adult-child-cls-v2
-  results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# distil-wav2vec2-xls-r-adult-child-cls-v2
-This model is a fine-tuned version of [w11wo/wav2vec2-xls-r-adult-child-cls](https://huggingface.co/w11wo/wav2vec2-xls-r-adult-child-cls) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.3048
-- Accuracy: 0.9354
-- F1: 0.9420
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 3e-05
-- train_batch_size: 32
-- eval_batch_size: 32
-- seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 128
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_ratio: 0.1
-- num_epochs: 5
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
-|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
-| 0.7711        | 1.0   | 96   | 0.5413          | 0.9017   | 0.9156 |
-| 0.5551        | 2.0   | 192  | 0.4627          | 0.9164   | 0.9272 |
-| 0.4166        | 3.0   | 288  | 0.3832          | 0.9261   | 0.9352 |
-| 0.3928        | 4.0   | 384  | 0.3242          | 0.9331   | 0.9406 |
-| 0.3622        | 5.0   | 480  | 0.3048          | 0.9354   | 0.9420 |
-### Framework versions
 - Transformers 4.17.0.dev0
 - Pytorch 1.10.2+cu102

 ---
+language: en
 license: apache-2.0
 tags:
+  - audio-classification
+  - generated_from_trainer
 metrics:
+  - accuracy
+  - f1
 model-index:
+  - name: distil-wav2vec2-xls-r-adult-child-cls-v2
+    results: []
 ---
+# DistilWav2Vec2 XLS-R Adult/Child Speech Classifier
+DistilWav2Vec2 XLS-R Adult/Child Speech Classifier is an audio classification model based on the [XLS-R](https://arxiv.org/abs/2111.09296) architecture. It is a distilled version of [wav2vec2-xls-r-adult-child-cls](https://huggingface.co/w11wo/wav2vec2-xls-r-adult-child-cls), trained on a private adult/child speech classification dataset.
+This model was trained with Hugging Face's Transformers library in PyTorch. All training was done on a Tesla P100 GPU provided by Kaggle. [Training metrics](https://huggingface.co/w11wo/distil-wav2vec2-xls-r-adult-child-cls-v2/tensorboard) were logged via TensorBoard.
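Since the card tags the model as `audio-classification`, inference can probably be run through the Transformers `pipeline` API. The following is a minimal sketch, not the author's documented usage: the repository ID is assumed from the page header, `top_prediction` and `classify` are hypothetical helper names, and decoding an audio file from a path assumes `ffmpeg` is available.

```python
from typing import Dict, List, Tuple

# Assumption: repository ID taken from the page header; adjust if the model
# is hosted under a different name.
MODEL_ID = "bookbot/distil-wav2vec2-xls-r-adult-child-cls-89m"


def top_prediction(predictions: List[Dict]) -> Tuple[str, float]:
    """Return the (label, score) pair with the highest score."""
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], best["score"]


def classify(audio_path: str, model_id: str = MODEL_ID) -> Tuple[str, float]:
    """Classify one audio file and return its most likely label."""
    # Imported lazily so top_prediction can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    # The pipeline returns a list of {"label": ..., "score": ...} dicts.
    return top_prediction(classifier(audio_path))
```

Calling `classify("sample.wav")` (a hypothetical file) downloads the checkpoint on first use. The label names it returns depend on the model's configuration and are not listed in this card.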
+## Model
+| Model                                      | #params | Arch. | Training/Validation data (audio)          |
+| ------------------------------------------ | ------- | ----- | ----------------------------------------- |
+| `distil-wav2vec2-xls-r-adult-child-cls-v2` | 89M     | XLS-R | Adult/Child Speech Classification Dataset |
+## Evaluation Results
+The model achieves the following results on the evaluation set:
+| Dataset                           | Loss   | Accuracy | F1     |
+| --------------------------------- | ------ | -------- | ------ |
+| Adult/Child Speech Classification | 0.3048 | 93.54%   | 0.9420 |
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- `learning_rate`: 3e-05
+- `train_batch_size`: 32
+- `eval_batch_size`: 32
+- `seed`: 42
+- `gradient_accumulation_steps`: 4
+- `total_train_batch_size`: 128
+- `optimizer`: Adam with `betas=(0.9,0.999)` and `epsilon=1e-08`
+- `lr_scheduler_type`: linear
+- `lr_scheduler_warmup_ratio`: 0.1
+- `num_epochs`: 5
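The hyperparameters above fit together arithmetically. As a rough sketch, assuming a single device and the usual linear warmup followed by linear decay (the total of 480 optimizer steps is read from the training results table, and `linear_lr` is a hypothetical helper, not the Trainer's own code):

```python
import math

# Values copied from the hyperparameter list.
LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.1
# Assumption: total optimizer steps, taken from the last row of the results table.
TOTAL_STEPS = 480

# Samples contributing to each optimizer step on one device: 32 * 4 = 128.
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Steps spent ramping the learning rate up from zero.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / (TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    print("warmup steps:", warmup_steps)
    for step in (0, 24, 48, 264, 480):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 3e-05 once warmup ends and falls back to zero at the final step.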
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |   F1   |
+| :-----------: | :---: | :--: | :-------------: | :------: | :----: |
+|    0.7711     |  1.0  |  96  |     0.5413      |  0.9017  | 0.9156 |
+|    0.5551     |  2.0  | 192  |     0.4627      |  0.9164  | 0.9272 |
+|    0.4166     |  3.0  | 288  |     0.3832      |  0.9261  | 0.9352 |
+|    0.3928     |  4.0  | 384  |     0.3242      |  0.9331  | 0.9406 |
+|    0.3622     |  5.0  | 480  |     0.3048      |  0.9354  | 0.9420 |
+## Disclaimer
+Consider the biases in the pre-training datasets, which may carry over into this model's results.
+## Authors
+DistilWav2Vec2 XLS-R Adult/Child Speech Classifier was trained and evaluated by [Wilson Wongso](https://w11wo.github.io/). All computation and development were done on Kaggle.
+## Framework versions
 - Transformers 4.17.0.dev0
 - Pytorch 1.10.2+cu102