deit_small_patch16_224
Implementation of DeiT proposed in Training data-efficient image transformers & distillation through attention
An attention based distillation is proposed where a new token is added to the model, the [dist]{.title-ref} token.
DeiT.deit_tiny_patch16_224()
DeiT.deit_small_patch16_224()
DeiT.deit_base_patch16_224()
DeiT.deit_base_patch16_384()