r"""
Atienza, Rowel. "Vision Transformer for Fast and Efficient Scene Text Recognition."
In International Conference on Document Analysis and Recognition (ICDAR). 2021.
https://arxiv.org/abs/2105.08582

All source files, except `system.py`, are based on the implementation listed below,
and hence are released under the license of the original.

Source: https://github.com/roatienza/deep-text-recognition-benchmark
License: Apache License 2.0 (see LICENSE file in project root)
"""