license: mit

Content Vec Best

Official Repo: ContentVec
This repo ports the fairseq ContentVec model to HuggingFace Transformers.

How to use

To use this model, first define a HubertModel subclass that adds the final projection layer:

from torch import nn
from transformers import HubertModel


class HubertModelWithFinalProj(HubertModel):
    def __init__(self, config):
        super().__init__(config)

        # The final projection layer is only used for backward compatibility.
        # Following https://github.com/auspicious3000/contentvec/issues/6,
        # removing this layer is necessary to achieve the desired outcome.
        self.final_proj = nn.Linear(config.hidden_size, config.classifier_proj_size)

and then load the model with

model = HubertModelWithFinalProj.from_pretrained("lengyue233/content-vec-best")

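# `audio` is assumed to be a float tensor of raw waveform samples
# with shape (batch, num_samples), as expected by HubertModel.
x = model(audio)["last_hidden_state"]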
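
The usage above can be tried end to end without downloading any weights. The following is a minimal sketch, not the released model: it builds the same subclass from a tiny, randomly initialised HubertConfig (all sizes here are arbitrary assumptions chosen for speed) and feeds it one second of random noise, assumed to stand in for 16 kHz audio. It shows that `last_hidden_state` has `hidden_size` channels, while `final_proj` maps it to `classifier_proj_size` channels.

```python
import torch
from torch import nn
from transformers import HubertConfig, HubertModel


class HubertModelWithFinalProj(HubertModel):
    def __init__(self, config):
        super().__init__(config)
        # Kept only so that checkpoints containing this layer load cleanly.
        self.final_proj = nn.Linear(config.hidden_size, config.classifier_proj_size)


# Tiny random config for illustration only; these sizes are assumptions
# and do NOT match the released content-vec-best checkpoint.
config = HubertConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    classifier_proj_size=16,
)
model = HubertModelWithFinalProj(config).eval()

# Dummy waveform: (batch, num_samples). 16000 samples is assumed to be 1 s.
audio = torch.randn(1, 16000)

with torch.no_grad():
    # Features as used in the model card: output of the last encoder layer.
    hidden = model(audio)["last_hidden_state"]
    # Optional legacy projection, shown only to illustrate its output shape.
    projected = model.final_proj(hidden)

print(hidden.shape, projected.shape)
```

With the real checkpoint, replace the config-based construction with the `from_pretrained` call shown above; the rest of the snippet stays the same.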

How to convert

Download the ContentVec_legacy model from the official repo, then run

python convert.py