How can this model be used on tables that I have stored in Postgres
#2 · opened by Chelcie
I have a table called employee from adventureworks that I pulled from Postgres:

```python
# create the employee DataFrame from the Postgres employee table
with engine.begin() as conn:
    query = text("""SELECT businessentityid, jobtitle FROM humanresources.employee""")
    employee = pd.read_sql_query(query, conn)
employee
```
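As a side note, the same read pattern can be exercised without a live Postgres instance; here is a minimal sketch using an in-memory SQLite engine (the real adventureworks connection string and schema are assumed elsewhere, and the sample rows below are made up for illustration):

```python
import pandas as pd
from sqlalchemy import create_engine, text

# Hypothetical stand-in for the Postgres adventureworks database:
# an in-memory SQLite engine with a tiny employee table.
engine = create_engine("sqlite://")

with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE employee (businessentityid INTEGER, jobtitle TEXT)"
    ))
    conn.execute(text(
        "INSERT INTO employee VALUES "
        "(1, 'Chief Executive Officer'), "
        "(2, 'Vice President of Engineering')"
    ))

# same pattern as above: text() query + read_sql_query inside a transaction
with engine.begin() as conn:
    query = text("SELECT businessentityid, jobtitle FROM employee")
    employee = pd.read_sql_query(query, conn)

print(employee.shape)  # (2, 2)
```

One caveat worth knowing for the TAPEX step later: the tokenizer expects every table cell to be a string, so `employee.astype(str)` before tokenizing is a safe habit.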
I then run the following:

```python
query = "how many rows does the employee table have?"
encoding = tokenizer(table=employee, query=query, return_tensors="pt")
outputs = model.generate(**encoding)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```
and get the following warning and error:

```
Token indices sequence length is longer than the specified maximum sequence length for this model (2813 > 1024). Running this sequence through the model will result in indexing errors
IndexError: index out of range in self
```
@Chelcie Hello, thanks for your interest in our work! I think the problem is that the linearized table is longer than the maximum number of positions TAPEX supports. You can use the default truncation strategy defined in TAPEX:

```python
encoding = tokenizer(
    table=employee,
    query=query,
    max_length=1024,
    truncation=True,
    return_tensors="pt",  # keep this so the encoding can feed model.generate
)
```

And try again!
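For intuition, the failure mode can be sketched without loading the model: BART-based TAPEX has 1024 learned position embeddings, so any token position past that range has no embedding to look up, which surfaces as `IndexError: index out of range in self`. The sketch below uses illustrative names, not the actual transformers internals:

```python
# Illustrative: a "position embedding table" with 1024 slots, mirroring
# TAPEX's maximum sequence length. Indexing past it raises IndexError,
# analogous to the model's position-embedding lookup failing.
MAX_POSITIONS = 1024
position_embeddings = [0.0] * MAX_POSITIONS  # stand-in for learned vectors

token_ids = list(range(2813))  # the oversized table+query from the warning

def lookup_positions(ids):
    # one embedding lookup per token position
    return [position_embeddings[pos] for pos in range(len(ids))]

try:
    lookup_positions(token_ids)  # 2813 positions > 1024 slots
except IndexError as exc:
    print("IndexError:", exc)

# what truncation=True with max_length=1024 does: cap the sequence first
truncated = token_ids[:MAX_POSITIONS]
assert len(lookup_positions(truncated)) == MAX_POSITIONS  # now within range
```

The trade-off is that truncation drops table rows beyond the length budget, so answers about the cut-off rows can be wrong; for a row-count question like this one, counting in pandas (`len(employee)`) is the reliable path.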
SivilTaram changed discussion status to closed