Table Structure Recognition Model for Pix2Text (P2T)

Currently, this model is a fork of https://huggingface.co/microsoft/table-transformer-structure-recognition-v1.1-all (many thanks to the authors) and is expected to evolve from there.

Documents for Pix2Text


Table Transformer (TATR) model trained on PubTables-1M and FinTabNet.c. It was introduced in the paper "Aligning benchmark datasets for table structure recognition" by Smock et al. and first released in this repository.

Disclaimer: The team releasing Table Transformer did not write a model card for this model, so this model card was written by the Hugging Face team.

Model description

The Table Transformer is equivalent to DETR, a Transformer-based object detection model. Note that the authors decided to use the "normalize before" setting of DETR, which means that layernorm is applied before self- and cross-attention.
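
To make the "normalize before" ordering concrete, here is a minimal pure-Python sketch contrasting it with the normalize-after ordering. It is an illustration only, not the model's implementation: `toy_sublayer` is a hypothetical stand-in for an attention layer, and the layer norm here has no learned scale or shift.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned parameters)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def pre_norm_block(x, sublayer):
    """'Normalize before': the sublayer sees normalized input, the residual is added after."""
    y = sublayer(layer_norm(x))
    return [a + b for a, b in zip(x, y)]


def post_norm_block(x, sublayer):
    """Normalize after: the residual sum itself is normalized."""
    y = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, y)])


def toy_sublayer(v):
    """Hypothetical stand-in for self- or cross-attention."""
    return [0.5 * t for t in v]


x = [1.0, 2.0, 3.0]
pre = pre_norm_block(x, toy_sublayer)    # residual path keeps the input's mean
post = post_norm_block(x, toy_sublayer)  # output is re-centered to zero mean
```

In the pre-norm variant the residual path carries the unnormalized input through unchanged, which is the main practical difference between the two orderings.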

Usage

You can use the raw model to detect the structure of tables (such as rows, columns, and cells) in document images. See the documentation for more info.
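
A minimal sketch of running the model with the Hugging Face `transformers` library follows. It assumes that `torch`, `transformers`, and `Pillow` are installed, and that the `breezedeus/pix2text-table-rec` repository ships a preprocessor config that `AutoImageProcessor` can load; neither is confirmed by this card. The helper names `format_detections` and `detect_table_structure`, and the default threshold of 0.6, are illustrative choices rather than part of any library.

```python
def format_detections(scores, labels, boxes, id2label):
    """Turn parallel lists of scores, label ids, and boxes into readable dicts."""
    return [
        {"label": id2label[label], "score": round(score, 3), "box": tuple(box)}
        for score, label, box in zip(scores, labels, boxes)
    ]


def detect_table_structure(image_path, model_id="breezedeus/pix2text-table-rec",
                           threshold=0.6):
    """Run table structure recognition on a cropped table image (hypothetical helper)."""
    # Heavy dependencies are imported lazily so the helper above stays usable alone.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, TableTransformerForObjectDetection

    # Assumption: the repository provides a loadable preprocessor config.
    processor = AutoImageProcessor.from_pretrained(model_id)
    model = TableTransformerForObjectDetection.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # PIL reports (width, height); post-processing expects (height, width).
    target_sizes = torch.tensor([image.size[::-1]])
    result = processor.post_process_object_detection(
        outputs, threshold=threshold, target_sizes=target_sizes
    )[0]

    return format_detections(
        result["scores"].tolist(),
        result["labels"].tolist(),
        result["boxes"].tolist(),
        model.config.id2label,
    )
```

Calling `detect_table_structure("table.png")` on a cropped table image (the file name is a placeholder) would return a list of dicts, each with a label, a confidence score, and a pixel-space box in `(x_min, y_min, x_max, y_max)` order.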
