DLF: Disentangled-Language-Focused Multimodal Sentiment Analysis, AAAI 2025.
Arxiv Paper
Main Contributions
Our main contributions can be summarized as follows:
- Proposed Framework: In this study, we propose a Disentangled-Language-Focused (DLF) multimodal representation learning framework to promote MSA tasks. The framework follows a structured pipeline: feature extraction, disentanglement, enhancement, fusion, and prediction.
- Language-Focused Attractor (LFA): We develop the LFA to fully harness the potential of the dominant language modality within the modality-specific space. The LFA exploits the language-guided multimodal cross-attention mechanisms to achieve a targeted feature enhancement ($X$->Language).
- Hierarchical Predictions: We devise hierarchical predictions to leverage the pre-fused and post-fused features, improving the total MSA accuracy.
Usage
Prerequisites
- Python 3.9.13
- PyTorch 1.13.0
- CUDA 11.7
Installation
- Create a conda environment. Please make sure you have installed conda before.
conda create -n DLF python==3.9.13
- Activate the built DLF environment.
conda activate DLF
- Install Pytorch with CUDA
pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117
- Clone this repo.
git clone https://github.com/pwang322/DLF.git
- Install the necessary packages.
cd DLF
pip install -r requirements.txt
Datasets
Data files (containing processed MOSI, MOSEI datasets) can be downloaded from here.
You can first build and then put the downloaded datasets into ./dataset
directory and revise the path in ./config/config.json
. For example, if the processed the MOSI dataset is located in ./dataset/MOSI/aligned_50.pkl
. Please make sure "dataset_root_dir": "./dataset" and "featurePath": "MOSI/aligned_50.pkl".
Please note that the meta information and the raw data are not available due to the privacy of YouTube content creators. For more details, please follow the official website of these datasets.
Run the Codes
- Training
You can first set the training dataset name in ./train.py
as "mosei" or "mosi", and then run:
python3 train.py
By default, the trained model will be saved in ./pt
directory. You can change this in train.py
.
- Testing
You can first set the testing dataset name in ./test.py
as "mosei" or "mosi", and then test the trained model:
python3 test.py
We also provide pre-trained models for testing. (Google drive)
๐ค Option 2: Load Pretrained Models from Hugging Face Hub
We also release pre-trained models on Hugging Face for direct use:
from trains.singleTask.model.DLF import DLF
model = DLF.from_pretrained("Peter180/DLF_mosei") # or "Peter180/DLF_mosi"
Citation
If you find the code and our idea helpful in your research or work, please cite the following paper.
@article{wang2024dlf,
title={DLF: Disentangled-Language-Focused Multimodal Sentiment Analysis},
author={Wang, Pan and Zhou, Qiang and Wu, Yawen and Chen, Tianlong and Hu, Jingtong},
journal={arXiv preprint arXiv:2412.12225},
year={2024}
}
- Downloads last month
- 58