Using the model with larger tables than the example

#2
by wmwandre - opened

Hi!

Firstly, I'd like to congratulate the authors on this work. The model is very relevant and will be very useful in my project.

I'm trying to run the example with a slightly larger table, for instance 20 rows and 3 columns. I then got an error regarding the length of the token indices. I saw in a discussion that I could fix this by truncating to a maximum length of 1024 when defining the tokenizer. When I did this, I got the following error:

[Screenshot of the error: Captura de tela 2023-10-10 122358.png]

Is there a way to fix this? What is the best way to use the model with larger tables?

Best regards!

Microsoft org

@wmwandre Thanks for your interest in our work! If you have further questions, please raise them in the code repository: https://github.com/microsoft/Table-Pretraining, otherwise I will not receive a notification.

Could you share your full code here? I can't work out the problem 😂

Hi!

I found the problem. The error happened because I didn't pass return_tensors="pt" when calling the tokenizer. The example I had followed for the truncation option did not include the return_tensors parameter.
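For anyone who lands here with the same error, this is a minimal sketch of a tokenizer call that combines truncation with return_tensors="pt". It is not code from the original example. The checkpoint name microsoft/tapex-base-finetuned-wtq, the helper names build_table and answer, and the toy table contents are all assumptions for illustration, and it assumes a transformers release that still ships TapexTokenizer.

```python
def build_table(n_rows: int = 20) -> dict:
    """Build a toy table with n_rows rows and 3 columns (hypothetical data).

    Every cell is a string, since the table is flattened to text before encoding.
    """
    return {
        "city": [f"city_{i}" for i in range(n_rows)],
        "year": [str(2000 + i) for i in range(n_rows)],
        "population": [str(1000 * (i + 1)) for i in range(n_rows)],
    }


def answer(query: str, table_dict: dict,
           model_name: str = "microsoft/tapex-base-finetuned-wtq",  # assumed checkpoint
           max_length: int = 1024) -> list:
    """Encode a (table, query) pair with truncation and generate an answer."""
    # Heavy imports are kept local so build_table can be used without them.
    import pandas as pd
    from transformers import BartForConditionalGeneration, TapexTokenizer

    tokenizer = TapexTokenizer.from_pretrained(model_name)
    model = BartForConditionalGeneration.from_pretrained(model_name)

    table = pd.DataFrame.from_dict(table_dict)

    # truncation + max_length keep the flattened table within the length limit;
    # return_tensors="pt" makes the tokenizer return PyTorch tensors, which is
    # what model.generate expects (plain Python lists would fail here).
    encoding = tokenizer(
        table=table,
        query=query,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )

    outputs = model.generate(**encoding)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


# Example call (downloads the checkpoint on first use):
# print(answer("what is the population of city_3?", build_table()))
```

Note that truncation discards part of the table, so an answer that depends on the dropped content may be wrong. For much larger tables it may be worth filtering rows before encoding.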

Thank you so much!

Microsoft org

@wmwandre That's awesome! Glad to hear that, and hope you enjoy tapex!

SivilTaram changed discussion status to closed
