Skimformer

A collaboration between reciTAL & MLIA (ISIR, Sorbonne Université)

Model description

Skimformer is a two-stage Transformer that replaces self-attention with Skim-Attention, a self-attention module that computes attention solely from the 2D positions of tokens on the page. The model works in two steps: first, the skim-attention scores are computed once, using layout information alone; then, these attention scores are reused in every layer of a text-based Transformer encoder. For more details, please refer to our paper:

Skim-Attention: Learning to Focus via Document Layout. Laura Nguyen, Thomas Scialom, Jacopo Staiano, Benjamin Piwowarski. EMNLP 2021.
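
The two-step approach described above can be illustrated with a minimal NumPy sketch. This is not the released implementation: it assumes a single attention head, random stand-ins for the layout and text embeddings, and a simplified layer with only a value projection and a residual connection (no layer normalization or feed-forward block). All function and variable names (`skim_attention`, `encode`, `w_q`, `w_k`, `value_weights`) are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def skim_attention(layout_emb, w_q, w_k):
    """Step 1: attention probabilities from layout embeddings only."""
    q = layout_emb @ w_q
    k = layout_emb @ w_k
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1)


def encode(text_emb, attn, value_weights):
    """Step 2: reuse the same attention matrix in every layer."""
    hidden = text_emb
    for w_v in value_weights:
        # Simplified layer: value projection, attention mixing, residual.
        hidden = hidden + attn @ (hidden @ w_v)
    return hidden


rng = np.random.default_rng(0)
n_tokens, d_layout, d_text, n_layers = 5, 8, 16, 3

# Random stand-ins (assumption): in practice, layout embeddings would be
# derived from token bounding boxes and text embeddings from token ids.
layout_emb = rng.normal(size=(n_tokens, d_layout))
text_emb = rng.normal(size=(n_tokens, d_text))

w_q = rng.normal(size=(d_layout, d_layout))
w_k = rng.normal(size=(d_layout, d_layout))
value_weights = [0.1 * rng.normal(size=(d_text, d_text)) for _ in range(n_layers)]

attn = skim_attention(layout_emb, w_q, w_k)     # computed once
output = encode(text_emb, attn, value_weights)  # shared by all layers
```

In this sketch the text only enters through the value path, so the attention matrix depends on layout alone and does not need to be recomputed per layer.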

Citation

@article{nguyen2021skimattention,
    title={Skim-Attention: Learning to Focus via Document Layout}, 
    author={Laura Nguyen and Thomas Scialom and Jacopo Staiano and Benjamin Piwowarski},
    journal={arXiv preprint arXiv:2109.01078},
    year={2021},
}