arxiv:2308.11073

Audio-Visual Class-Incremental Learning

Published on Aug 21, 2023

Abstract

In this paper, we introduce audio-visual class-incremental learning, a class-incremental learning scenario for audio-visual video recognition. We demonstrate that joint audio-visual modeling can improve class-incremental learning, but current methods fail to preserve semantic similarity between audio and visual features as the number of incremental steps grows. Furthermore, we observe that audio-visual correlations learned in previous tasks can be forgotten as incremental steps progress, leading to poor performance. To overcome these challenges, we propose AV-CIL, which incorporates a Dual-Audio-Visual Similarity Constraint (D-AVSC) to maintain both instance-aware and class-aware semantic similarity between the audio and visual modalities, and Visual Attention Distillation (VAD) to retain the previously learned audio-guided visual attention ability. We create three audio-visual class-incremental datasets, AVE-Class-Incremental (AVE-CI), Kinetics-Sounds-Class-Incremental (K-S-CI), and VGGSound100-Class-Incremental (VS100-CI), based on the AVE, Kinetics-Sounds, and VGGSound datasets, respectively. Our experiments on AVE-CI, K-S-CI, and VS100-CI demonstrate that AV-CIL significantly outperforms existing class-incremental learning methods in audio-visual class-incremental learning. Code and data are available at: https://github.com/weiguoPian/AV-CIL_ICCV2023.
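To make the class-incremental setting concrete, the following is a minimal sketch of how a label set might be partitioned into disjoint incremental steps, with evaluation after each step covering all classes seen so far. It is not taken from the AV-CIL repository: the function names, the 100-class label range, the 10-classes-per-step split, and the seeded shuffle are all illustrative assumptions, and the paper's actual dataset protocol may differ.

```python
import random
from typing import List, Sequence


def split_classes_into_steps(
    class_ids: Sequence[int], classes_per_step: int, seed: int = 0
) -> List[List[int]]:
    """Shuffle class ids and partition them into disjoint incremental steps.

    Hypothetical helper for illustration only; not the paper's split.
    """
    if classes_per_step <= 0:
        raise ValueError("classes_per_step must be positive")
    ids = list(class_ids)
    # A seeded shuffle keeps the (assumed) class order reproducible.
    random.Random(seed).shuffle(ids)
    return [ids[i:i + classes_per_step] for i in range(0, len(ids), classes_per_step)]


def seen_classes(steps: List[List[int]], step_index: int) -> List[int]:
    """Return every class introduced up to and including `step_index`.

    In class-incremental learning the model trains on the new classes of the
    current step but is evaluated on all classes seen so far.
    """
    return sorted(c for step in steps[: step_index + 1] for c in step)


# Illustrative configuration: 100 classes, 10 new classes per step (assumed).
steps = split_classes_into_steps(range(100), classes_per_step=10)
for t, new_classes in enumerate(steps):
    eval_classes = seen_classes(steps, t)
    print(f"step {t}: {len(new_classes)} new classes, evaluate on {len(eval_classes)}")
```

Because the evaluation set grows at every step while training data for earlier classes is largely unavailable, forgetting of earlier audio-visual correlations shows up as a drop in accuracy on the earlier classes.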
