arxiv:2305.17651

DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models

Published on May 28, 2023

Authors:

Yifan Peng ,

Abstract

Self-supervised learning (SSL) has achieved notable success in many speech processing tasks, but the large model size and heavy computational cost hinder the deployment. Knowledge distillation trains a small student model to mimic the behavior of a large teacher model. However, the student architecture usually needs to be manually designed and will remain fixed during training, which requires prior knowledge and can lead to suboptimal performance. Inspired by recent success of task-specific structured pruning, we propose DPHuBERT, a novel task-agnostic compression method for speech SSL based on joint distillation and pruning. Experiments on SUPERB show that DPHuBERT outperforms pure distillation methods in almost all tasks. Moreover, DPHuBERT requires little training time and performs well with limited training data, making it suitable for resource-constrained applications. Our method can also be applied to various speech SSL models. Our code and models will be publicly available.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2305.17651 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2305.17651 in a dataset README.md to link it from this page.

DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models

Abstract

Community

Models citing this paper 0

Datasets citing this paper 0

Spaces citing this paper 4

Collections including this paper 4