OmniTab
OmniTab is a table-based QA model proposed in OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering. The original Github repository is https://github.com/jzbjyb/OmniTab.
Description
neulab/omnitab-large
(based on BART architecture) is initialized with microsoft/tapex-large
and continuously pretrained on natural and synthetic data.
Usage
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import pandas as pd
tokenizer = AutoTokenizer.from_pretrained("neulab/omnitab-large")
model = AutoModelForSeq2SeqLM.from_pretrained("neulab/omnitab-large")
data = {
"year": [1896, 1900, 1904, 2004, 2008, 2012],
"city": ["athens", "paris", "st. louis", "athens", "beijing", "london"]
}
table = pd.DataFrame.from_dict(data)
query = "In which year did beijing host the Olympic Games?"
encoding = tokenizer(table=table, query=query, return_tensors="pt")
outputs = model.generate(**encoding)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
# [' 2008']
Reference
@inproceedings{jiang-etal-2022-omnitab,
title = "{O}mni{T}ab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering",
author = "Jiang, Zhengbao and Mao, Yi and He, Pengcheng and Neubig, Graham and Chen, Weizhu",
booktitle = "Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
month = jul,
year = "2022",
}
- Downloads last month
- 715
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.