|
---
license: apache-2.0
language:
- en
library_name: fairseq
pipeline_tag: automatic-speech-recognition
inference: false
---
|
|
|
|
|
|
|
|
# ARMHuBERT Model Card |
|
|
|
This repository contains the model checkpoints from our INTERSPEECH 2023 paper [**Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation**](https://arxiv.org/abs/2305.11685).
|
|
|
|
|
## Model details |
|
|
|
**Model type:** |
|
ARMHuBERT is an open-source speech SSL model distilled from HuBERT-Base using attention map reusing and masking distillation.

We also provide checkpoints for MaskHuBERT (distilled with masking distillation only, without attention map reusing) and ARMwavLM (the same strategy with a WavLM-Base teacher).
|
|
|
- Attention Map Reusing: reuse a previous layer's attention map so that the reusing Transformer layer needs no key or query projection parameters (sketched below).

- Masking Distillation: distill the teacher's representations while treating the student's masked and unmasked frames separately.
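
The snippet below is a minimal, self-contained sketch of the attention-map-reusing idea, not the released fairseq implementation: the first layer computes a single-head attention map, and the second layer applies that map to its own value projection, so it carries no query/key parameters at all. All class and variable names here are illustrative.

```python
import torch
import torch.nn as nn

class ReuseAttention(nn.Module):
    """Single-head self-attention that can reuse an external attention map."""

    def __init__(self, dim: int, compute_map: bool = True):
        super().__init__()
        self.compute_map = compute_map
        if compute_map:  # only the "source" layer keeps query/key parameters
            self.q_proj = nn.Linear(dim, dim)
            self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x, attn_map=None):
        if self.compute_map:
            # standard scaled dot-product attention map: (batch, T, T)
            scores = (self.q_proj(x) @ self.k_proj(x).transpose(-2, -1)) * self.scale
            attn_map = scores.softmax(dim=-1)
        # reusing layer: apply the given map directly; no Q/K needed
        out = self.out_proj(attn_map @ self.v_proj(x))
        return out, attn_map

x = torch.randn(1, 50, 256)                      # (batch, frames, dim)
layer1 = ReuseAttention(256, compute_map=True)
layer2 = ReuseAttention(256, compute_map=False)  # reuses layer1's map
h, attn_map = layer1(x)
h, _ = layer2(h, attn_map=attn_map)
```

Dropping the query/key projections in the reusing layer is where the parameter savings come from; the value and output projections remain per layer.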
|
|
|
**License:** |
|
Apache 2.0 License |
|
|
|
**Where to send questions or comments about the model:** |
|
https://github.com/sungnyun/ARMHuBERT/issues |
|
|
|
|
|
## Training dataset |
|
Pretraining data: [LibriSpeech](https://www.openslr.org/12) |
|
- ``[ModelName]-100h.ckpt``: trained on train-clean-100 (100 hours)

- ``[ModelName]-960h.ckpt``: trained on train-clean-100 + train-clean-360 + train-other-500 (960 hours)
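
As a quick sanity check after downloading, a checkpoint can be opened with plain PyTorch. The file name below is an example, and the assumption that the file deserializes to a dictionary of hyperparameters and weights is illustrative, not a documented guarantee; the supported loading path is described in the GitHub repository.

```python
import torch

# Inspect a downloaded checkpoint (file name is an example);
# map_location="cpu" avoids needing a GPU just to peek at the contents.
ckpt = torch.load("ARMHuBERT-960h.ckpt", map_location="cpu")

# Most training frameworks save a dict containing a state_dict and config --
# an assumption here, so we check before printing.
if isinstance(ckpt, dict):
    print(sorted(ckpt.keys()))
```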
|
|
|
|
|
|
|
|
More details are available in our GitHub repository: https://github.com/sungnyun/ARMHuBERT.