timm documentation

Model Summaries

You are viewing v1.0.8 version. A newer version v1.0.20 is available.

The model architectures included in timm come from a wide variety of sources. Those sources are listed below: papers, original implementations that I rewrote or adapted ("reference code"), and PyTorch implementations that I leveraged directly ("code").

Most included models have pretrained weights. The weights are either:

- from their original sources
- ported by me from their original implementation in a different framework (e.g. TensorFlow models)
- trained from scratch using the included training script

The validation results for the pretrained weights are listed on the Results page.

A more exciting view (with pretty pictures) of the models within timm can be found at paperswithcode.

Big Transfer ResNetV2 (BiT)

Implementation: resnetv2.py
Paper: Big Transfer (BiT): General Visual Representation Learning - https://arxiv.org/abs/1912.11370
Reference code: https://github.com/google-research/big_transfer

Cross-Stage Partial Networks

Implementation: cspnet.py
Paper: CSPNet: A New Backbone that can Enhance Learning Capability of CNN - https://arxiv.org/abs/1911.11929
Reference code: https://github.com/WongKinYiu/CrossStagePartialNetworks

DenseNet

Implementation: densenet.py
Paper: Densely Connected Convolutional Networks - https://arxiv.org/abs/1608.06993
Code: https://github.com/pytorch/vision/tree/master/torchvision/models

DLA

Dual-Path Networks

Implementation: dpn.py
Paper: Dual Path Networks - https://arxiv.org/abs/1707.01629
My PyTorch code: https://github.com/rwightman/pytorch-dpn-pretrained
Reference code: https://github.com/cypw/DPNs

GPU-Efficient Networks

Implementation: byobnet.py
Paper: Neural Architecture Design for GPU-Efficient Networks - https://arxiv.org/abs/2006.14090
Reference code: https://github.com/idstcv/GPU-Efficient-Networks

HRNet

Implementation: hrnet.py
Paper: Deep High-Resolution Representation Learning for Visual Recognition - https://arxiv.org/abs/1908.07919
Code: https://github.com/HRNet/HRNet-Image-Classification

Inception-V3

Implementation: inception_v3.py
Paper: Rethinking the Inception Architecture for Computer Vision - https://arxiv.org/abs/1512.00567
Code: https://github.com/pytorch/vision/tree/master/torchvision/models

Inception-V4

Implementation: inception_v4.py
Paper: Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning - https://arxiv.org/abs/1602.07261
Code: https://github.com/Cadene/pretrained-models.pytorch
Reference code: https://github.com/tensorflow/models/tree/master/research/slim/nets

Inception-ResNet-V2

Implementation: inception_resnet_v2.py
Paper: Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning - https://arxiv.org/abs/1602.07261
Code: https://github.com/Cadene/pretrained-models.pytorch
Reference code: https://github.com/tensorflow/models/tree/master/research/slim/nets

NASNet-A

Implementation: nasnet.py
Paper: Learning Transferable Architectures for Scalable Image Recognition - https://arxiv.org/abs/1707.07012
Code: https://github.com/Cadene/pretrained-models.pytorch
Reference code: https://github.com/tensorflow/models/tree/master/research/slim/nets/nasnet

PNasNet-5

Implementation: pnasnet.py
Paper: Progressive Neural Architecture Search - https://arxiv.org/abs/1712.00559
Code: https://github.com/Cadene/pretrained-models.pytorch
Reference code: https://github.com/tensorflow/models/tree/master/research/slim/nets/nasnet

EfficientNet

Implementation: efficientnet.py
Papers:
- EfficientNet NoisyStudent (B0-B7, L2) - https://arxiv.org/abs/1911.04252
- EfficientNet AdvProp (B0-B8) - https://arxiv.org/abs/1911.09665
- EfficientNet (B0-B7) - https://arxiv.org/abs/1905.11946
- EfficientNet-EdgeTPU (S, M, L) - https://ai.googleblog.com/2019/08/efficientnet-edgetpu-creating.html
- MixNet - https://arxiv.org/abs/1907.09595
- MNASNet B1, A1 (Squeeze-Excite), and Small - https://arxiv.org/abs/1807.11626
- MobileNet-V2 - https://arxiv.org/abs/1801.04381
- FBNet-C - https://arxiv.org/abs/1812.03443
- Single-Path NAS - https://arxiv.org/abs/1904.02877
My PyTorch code: https://github.com/rwightman/gen-efficientnet-pytorch
Reference code: https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet

MobileNet-V3

Implementation: mobilenetv3.py
Paper: Searching for MobileNetV3 - https://arxiv.org/abs/1905.02244
Reference code: https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet

RegNet

Implementation: regnet.py
Paper: Designing Network Design Spaces - https://arxiv.org/abs/2003.13678
Reference code: https://github.com/facebookresearch/pycls/blob/master/pycls/models/regnet.py

RepVGG

Implementation: byobnet.py
Paper: Making VGG-style ConvNets Great Again - https://arxiv.org/abs/2101.03697
Reference code: https://github.com/DingXiaoH/RepVGG

ResNet, ResNeXt

Implementation: resnet.py
ResNet (V1B)
- Paper: Deep Residual Learning for Image Recognition - https://arxiv.org/abs/1512.03385
- Code: https://github.com/pytorch/vision/tree/master/torchvision/models
ResNeXt
- Paper: Aggregated Residual Transformations for Deep Neural Networks - https://arxiv.org/abs/1611.05431
- Code: https://github.com/pytorch/vision/tree/master/torchvision/models
‘Bag of Tricks’ / Gluon C, D, E, S ResNet variants
- Paper: Bag of Tricks for Image Classification with CNNs - https://arxiv.org/abs/1812.01187
- Code: https://github.com/dmlc/gluon-cv/blob/master/gluoncv/model_zoo/resnetv1b.py
Instagram pretrained / ImageNet tuned ResNeXt101
- Paper: Exploring the Limits of Weakly Supervised Pretraining - https://arxiv.org/abs/1805.00932
- Weights: https://pytorch.org/hub/facebookresearch_WSL-Images_resnext (NOTE: CC BY-NC 4.0 License, NOT commercial friendly)
Semi-supervised (SSL) / Semi-weakly Supervised (SWSL) ResNet and ResNeXts
- Paper: Billion-scale semi-supervised learning for image classification - https://arxiv.org/abs/1905.00546
- Weights: https://github.com/facebookresearch/semi-supervised-ImageNet1K-models (NOTE: CC BY-NC 4.0 License, NOT commercial friendly)
Squeeze-and-Excitation Networks
- Paper: Squeeze-and-Excitation Networks - https://arxiv.org/abs/1709.01507
- Code: Added to the ResNet base; this is the current version going forward, and the old senet.py is being deprecated
ECAResNet (ECA-Net)
- Paper: ECA-Net: Efficient Channel Attention for Deep CNN - https://arxiv.org/abs/1910.03151v4
- Code: Added to the ResNet base; ECA module contributed by @VRandme, reference code at https://github.com/BangguWu/ECANet

Res2Net

Implementation: res2net.py
Paper: Res2Net: A New Multi-scale Backbone Architecture - https://arxiv.org/abs/1904.01169
Code: https://github.com/gasvn/Res2Net

ResNeSt

Implementation: resnest.py
Paper: ResNeSt: Split-Attention Networks - https://arxiv.org/abs/2004.08955
Code: https://github.com/zhanghang1989/ResNeSt

ReXNet

Implementation: rexnet.py
Paper: ReXNet: Diminishing Representational Bottleneck on CNN - https://arxiv.org/abs/2007.00992
Code: https://github.com/clovaai/rexnet

Selective-Kernel Networks

Implementation: sknet.py
Paper: Selective-Kernel Networks - https://arxiv.org/abs/1903.06586
Code: https://github.com/implus/SKNet, https://github.com/clovaai/assembled-cnn

SelecSLS

Implementation: selecsls.py
Paper: XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera - https://arxiv.org/abs/1907.00837
Code: https://github.com/mehtadushy/SelecSLS-Pytorch

Squeeze-and-Excitation Networks

Implementation: senet.py (NOTE: I am deprecating this version of the networks; the new ones are part of resnet.py)
Paper: Squeeze-and-Excitation Networks - https://arxiv.org/abs/1709.01507
Code: https://github.com/Cadene/pretrained-models.pytorch

TResNet

Implementation: tresnet.py
Paper: TResNet: High Performance GPU-Dedicated Architecture - https://arxiv.org/abs/2003.13630
Code: https://github.com/mrT23/TResNet

VGG

Implementation: vgg.py
Paper: Very Deep Convolutional Networks For Large-Scale Image Recognition - https://arxiv.org/pdf/1409.1556.pdf
Reference code: https://github.com/pytorch/vision/blob/master/torchvision/models/vgg.py

Vision Transformer

Implementation: vision_transformer.py
Paper: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale - https://arxiv.org/abs/2010.11929
Reference code and pretrained weights: https://github.com/google-research/vision_transformer

VovNet V2 and V1

Implementation: vovnet.py
Paper: CenterMask: Real-Time Anchor-Free Instance Segmentation - https://arxiv.org/abs/1911.06667
Reference code: https://github.com/youngwanLEE/vovnet-detectron2

Xception

Implementation: xception.py
Paper: Xception: Deep Learning with Depthwise Separable Convolutions - https://arxiv.org/abs/1610.02357
Code: https://github.com/Cadene/pretrained-models.pytorch

Xception (Modified Aligned, Gluon)

Implementation: gluon_xception.py
Paper: Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation - https://arxiv.org/abs/1802.02611
Reference code: https://github.com/dmlc/gluon-cv/tree/master/gluoncv/model_zoo, https://github.com/jfzhang95/pytorch-deeplab-xception/

Xception (Modified Aligned, TF)

Implementation: aligned_xception.py
Paper: Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation - https://arxiv.org/abs/1802.02611
Reference code: https://github.com/tensorflow/models/tree/master/research/deeplab