---
license: cc-by-sa-4.0
datasets:
- Ar4ikov/iemocap_audio_text_splitted
language:
- en
- zh
metrics:
- f1
library_name: transformers
pipeline_tag: audio-classification
tags:
- speech-emotion-recognition
---

# Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition

This model is [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) fine-tuned on English and Chinese speech from speakers of all age groups.
It is trained on the training sets of [CREMA-D](https://github.com/CheyneyComputerScience/CREMA-D), [CSED](https://github.com/AkishinoShiame/Chinese-Speech-Emotion-Datasets), [ElderReact](https://github.com/Mayer123/ElderReact), [ESD](https://github.com/HLTSingapore/Emotional-Speech-Data), [IEMOCAP](https://sail.usc.edu/iemocap/iemocap_release.htm), and [TESS](https://www.kaggle.com/datasets/ejlok1/toronto-emotional-speech-set-tess).
When using this model, make sure that your speech input is sampled at 16 kHz.
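
As a quick way to check the 16 kHz requirement before inference, the sketch below reads the sampling rate from a WAV header using only the Python standard library. The helper names `wav_sample_rate` and `needs_resampling` are hypothetical and not part of this model's code; the sketch assumes uncompressed PCM WAV input and only detects a mismatch, it does not resample.

```python
import wave

# Sampling rate this model card asks for.
TARGET_RATE = 16_000


def wav_sample_rate(path):
    """Return the sampling rate (Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, target_rate=TARGET_RATE):
    """Return True when the file's rate differs from the expected rate."""
    return wav_sample_rate(path) != target_rate
```

If `needs_resampling` returns `True`, the audio should be resampled to 16 kHz with an audio library of your choice before being passed to the model.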

The script used for training and evaluation can be found here:
[https://github.com/HLTCHKUST/elderly_ser/tree/main](https://github.com/HLTCHKUST/elderly_ser/tree/main)

## Evaluation Results

For details (e.g., the statistics of the `train`, `valid`, and `test` data), please refer to our paper on [arXiv](https://arxiv.org/abs/2306.14517).
The paper also reports the model's speech emotion recognition performance on the following groups: English-All, Chinese-All, English-Elderly, Chinese-Elderly, English-Adults, and Chinese-Adults.

## Citation

Our paper will be published at INTERSPEECH 2023. In the meantime, it is available on [arXiv](https://arxiv.org/abs/2306.14517).
If you find our work useful, please consider citing it as follows:
```
@misc{cahyawijaya2023crosslingual,
      title={Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition}, 
      author={Samuel Cahyawijaya and Holy Lovenia and Willy Chung and Rita Frieske and Zihan Liu and Pascale Fung},
      year={2023},
      eprint={2306.14517},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```