The size of tensor a (110) must match the size of tensor b (1214) at non-singleton dimension 1

#7
by Kivaa - opened

I am trying to use AST for the first time. I followed the documentation and tutorials, but I ran into an error whose cause I can't figure out.
The error is as follows:
/usr/local/lib/python3.10/dist-packages/transformers/models/audio_spectrogram_transformer/modeling_audio_spectrogram_transformer.py in forward(self, input_values)
85 distillation_tokens = self.distillation_token.expand(batch_size, -1, -1)
86 embeddings = torch.cat((cls_tokens, distillation_tokens, embeddings), dim=1)
---> 87 embeddings = embeddings + self.position_embeddings
88 embeddings = self.dropout(embeddings)
89

RuntimeError: The size of tensor a (110) must match the size of tensor b (1214) at non-singleton dimension 1

My input is a tensor with a shape of torch.Size([9887, 128, 100]).

I appreciate any help, thank you.

@Kivaa AST expects input of shape [Batch, 1024, 128], so your input data format is not quite right. Try using the ASTFeatureExtractor from Hugging Face; it will give you the proper format. In case you get [9887, 1, 1024, 128], just squeeze out the size-1 dimension.
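
To see where the 110 and 1214 in the error message come from, here is a rough sketch of the patch arithmetic. It assumes the default AST configuration (16x16 patches, stride 10 along both time and frequency, plus the two extra tokens visible in the traceback: cls and distillation). The function name `ast_sequence_length` is made up for illustration and is not part of transformers.

```python
def ast_sequence_length(time_frames, mel_bins, patch_size=16,
                        time_stride=10, frequency_stride=10):
    """Estimate the token count AST builds from a [time_frames, mel_bins] input.

    Assumes the default patch size and strides; a checkpoint with a
    different config would give different numbers.
    """
    # Number of patches along each axis for a strided, unpadded convolution.
    freq_patches = (mel_bins - patch_size) // frequency_stride + 1
    time_patches = (time_frames - patch_size) // time_stride + 1
    # +2 for the cls token and the distillation token.
    return freq_patches * time_patches + 2


# What the position embeddings were built for: 1024 frames x 128 mel bins.
expected = ast_sequence_length(time_frames=1024, mel_bins=128)

# Input of shape [9887, 128, 100] is read as 128 frames x 100 mel bins.
actual = ast_sequence_length(time_frames=128, mel_bins=100)

print(expected, actual)  # 1214 110
```

So the model builds 110 tokens from each [128, 100] example, while the position embeddings have 1214 entries, which is exactly the mismatch at dimension 1 in the RuntimeError.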
