Possible fix for CedModel's forward method for long audio
#1 by daisukelab - opened
Hi, thanks for sharing the implementation on Hugging Face.
I noticed issues with 30-s audio clips and made a fix locally. The problems are around these lines:
https://huggingface.co/mispeech/ced-base/blob/main/ced_model/modeling_ced.py#L456-L459
The problem with this code is that self.forward_head and self.ced are not found in the class.
I guess we should instead simply call forward_features, the same as when the audio is short.
In addition, there seems to be an issue with the reshape right after that.
The following is my local fix:
x = self.forward_features(x)
SPLB, T, D = x.shape
x = torch.reshape(x, (n_splits, SPLB // n_splits, T, D))
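To illustrate the shape handling, here is a minimal, self-contained sketch. It assumes the long input is cut along the time axis into equal-length chunks that are stacked along the batch axis before encoding; I have not reproduced the repository's actual splitting or padding code. `forward_features_stub` and `encode_long` are hypothetical names used only for this example.

```python
import torch


def forward_features_stub(x: torch.Tensor, embed_dim: int = 8) -> torch.Tensor:
    """Hypothetical stand-in for forward_features: (N, F, T) -> (N, T, D)."""
    n, _, t = x.shape
    # Average over the frequency axis, then repeat along a new feature axis.
    return x.mean(dim=1).unsqueeze(-1).expand(n, t, embed_dim)


def encode_long(x: torch.Tensor, chunk_len: int) -> torch.Tensor:
    """Encode a long input chunk by chunk (hypothetical helper).

    x is (B, F, T_total); T_total is assumed to be a multiple of chunk_len.
    Returns a tensor of shape (n_splits, B, T, D).
    """
    splits = x.split(chunk_len, dim=-1)      # n_splits tensors of (B, F, chunk_len)
    n_splits = len(splits)
    stacked = torch.stack(splits, dim=0)     # (n_splits, B, F, chunk_len)
    flat = torch.flatten(stacked, 0, 1)      # (n_splits * B, F, chunk_len)

    feats = forward_features_stub(flat)      # (n_splits * B, T, D)
    SPLB, T, D = feats.shape
    # Recover the per-chunk batch dimension, as in the fix above.
    return torch.reshape(feats, (n_splits, SPLB // n_splits, T, D))


if __name__ == "__main__":
    dummy = torch.randn(2, 4, 12)            # B=2, F=4, T_total=12
    out = encode_long(dummy, chunk_len=4)    # 3 chunks of 4 frames each
    print(tuple(out.shape))
```

With this layout, `out[i, b]` holds the features of chunk `i` for batch item `b`, because the chunks were stacked on the leading axis before flattening.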
I hope it helps.
Fixed. Thanks a lot!
jimbozhang changed discussion status to closed