This is the CaiT model from [1]. It was first implemented in TensorFlow, and the original parameters from [2] were then ported into that implementation. Refer to [3] for more details.

## References

[1] Going deeper with Image Transformers: https://arxiv.org/abs/2103.17239

[2] CaiT GitHub: https://github.com/facebookresearch/deit

[3] CaiT-TF GitHub: https://github.com/sayakpaul/cait-tf