SECap: Speech Emotion Captioning with Large Language Model

This repository contains the implementation of the paper "SECap: Speech Emotion Captioning with Large Language Model". It includes the model code, training and testing scripts, and a test dataset. The test dataset consists of 600 wav audio files and their corresponding emotion descriptions.

More details are available in the GitHub repo: https://github.com/xuyaoxun/SECaps

Checkpoint

Download the model checkpoint from this repo and place it in the main folder of SECaps.

You will also need to download the weights folder and place it in the main folder of SECaps.
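Before running the training or testing scripts, it can help to confirm that both downloads ended up in the right place. The following is a minimal sketch, not part of the SECap code: the function name `check_secap_layout` is hypothetical, and the default names `model.ckpt` and `weights` are placeholders, since this README does not state the actual file and folder names. Substitute the names of the files you downloaded.

```python
from pathlib import Path


def check_secap_layout(main_dir, checkpoint_name="model.ckpt", weights_dir_name="weights"):
    """Return a list of problems; an empty list means both items were found.

    checkpoint_name and weights_dir_name are placeholder defaults,
    not names confirmed by the SECap repository.
    """
    root = Path(main_dir)
    if not root.is_dir():
        return [f"main folder not found: {root}"]

    problems = []
    # The checkpoint is expected to be a single file directly in the main folder.
    if not (root / checkpoint_name).is_file():
        problems.append(f"checkpoint file missing: {root / checkpoint_name}")
    # The weights are expected to be a folder directly in the main folder.
    if not (root / weights_dir_name).is_dir():
        problems.append(f"weights folder missing: {root / weights_dir_name}")
    return problems


if __name__ == "__main__":
    # "SECaps" is assumed to be the path of the cloned main folder.
    for problem in check_secap_layout("SECaps"):
        print(problem)
```

Run from the directory that contains the SECaps folder, the script prints nothing when both items are present and one line per missing item otherwise.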

Citation

If you use this repository in your research, please cite our paper:

@article{SECap,
  title={SECap: Speech Emotion Captioning with Large Language Model}
}