datasets:
  - imagenet-1k
metrics:
  - accuracy
library_name: timm

Model Card for QLNet

Based on quasi-linear hyperbolic systems of PDEs [Liu et al., 2023], QLNet enters uncharted waters of ConvNet model space, marked by the use of (element-wise) multiplication instead of ReLU as the primary nonlinearity. It achieves performance comparable to ResNet50 on ImageNet-1k (top-1 accuracy 78.4%), which suggests that it has a similar level of capacity/expressivity and deserves more study (hyperparameter tuning, optimizers, etc.) by the community.

One notable feature is that the architecture (trained or not) admits a continuous symmetry in its parameters. See the notebook for a demo that applies a particular transformation to the weights while leaving the output unchanged.
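As an illustration of how a symmetry of this kind can arise, the sketch below relies on the fact that a radial activation (one that rescales a vector based only on its norm) commutes with rotations of the channel space. Rotating the weights that feed the activation by an orthogonal matrix Q, and multiplying the weights that follow it by the transpose of Q, should therefore leave the output unchanged. This is not necessarily the transformation used in the notebook: the two-layer toy network, the `radial_clip` activation (a stand-in, not the card's hardball), and all sizes are assumptions made for illustration.

```python
import torch

torch.manual_seed(0)
dtype = torch.float64  # double precision keeps round-off small


def radial_clip(v: torch.Tensor) -> torch.Tensor:
    """Hypothetical radial activation: cap the norm of each row at sqrt(C)."""
    radius = v.shape[-1] ** 0.5
    norm = v.norm(dim=-1, keepdim=True)
    # Rows with norm <= radius are unchanged; longer rows are rescaled.
    return v * (radius / norm.clamp(min=radius))


def toy_net(x, w1, w2):
    """Linear map, radial activation, linear map (rows of x are samples)."""
    return radial_clip(x @ w1.T) @ w2.T


c_in, c_mid, c_out = 8, 16, 4
w1 = torch.randn(c_mid, c_in, dtype=dtype)
w2 = torch.randn(c_out, c_mid, dtype=dtype)
x = 3 * torch.randn(32, c_in, dtype=dtype)

# A skew-symmetric generator gives a one-parameter family of rotations
# Q(t) = exp(t * G), which is what makes the symmetry continuous.
a = torch.randn(c_mid, c_mid, dtype=dtype)
generator = a - a.T


def rotation(t: float) -> torch.Tensor:
    return torch.linalg.matrix_exp(t * generator)


reference = toy_net(x, w1, w2)
max_diffs = []
for t in (0.1, 0.5, 1.0):
    q = rotation(t)
    # Transform the weights: w1 -> Q w1, w2 -> w2 Q^T.
    transformed = toy_net(x, q @ w1, w2 @ q.T)
    max_diffs.append((reference - transformed).abs().max().item())

print(max_diffs)  # differences should be at round-off level
```

The weights change substantially for each value of `t`, while the outputs agree up to floating-point error.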


Model Details

Model Description

The bottleneck block of ResNet50 consists of 1x1, 3x3, and 1x1 convolutions in succession. This simplest version of QLNet instead applies a 1x1 convolution, splits the result into two equal halves and multiplies them element-wise, then applies a 3x3 depthwise convolution followed by a 1x1 convolution. No activation functions are used except at the end of the block, where a radial activation function that we call hardball is applied.
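The block described above can be sketched in PyTorch roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the class name `QLBlockSketch`, the channel widths, the split along the channel dimension, the absence of normalization layers and skip connections, and the `radial_clip` function standing in for hardball (whose exact definition is not given in this card) are all hypothetical.

```python
import torch
import torch.nn as nn


def radial_clip(x: torch.Tensor) -> torch.Tensor:
    """Stand-in radial activation (assumption, not the card's hardball).

    Rescales each pixel's channel vector so that its norm does not
    exceed sqrt(C); the direction of the vector is left unchanged.
    """
    radius = x.shape[1] ** 0.5
    norm = x.norm(dim=1, keepdim=True)
    return x * (radius / norm.clamp(min=radius))


class QLBlockSketch(nn.Module):
    """1x1 conv -> split and multiply -> 3x3 depthwise -> 1x1 conv -> radial activation."""

    def __init__(self, channels: int, hidden: int):
        super().__init__()
        # The first 1x1 conv produces 2 * hidden channels so the split yields two halves.
        self.expand = nn.Conv2d(channels, 2 * hidden, kernel_size=1, bias=False)
        # groups=hidden makes the 3x3 convolution depthwise.
        self.depthwise = nn.Conv2d(
            hidden, hidden, kernel_size=3, padding=1, groups=hidden, bias=False
        )
        self.project = nn.Conv2d(hidden, channels, kernel_size=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = self.expand(x).chunk(2, dim=1)  # two equal halves (channel split assumed)
        y = a * b                              # multiplication is the only nonlinearity here
        y = self.depthwise(y)
        y = self.project(y)
        return radial_clip(y)                  # single activation at the end of the block


torch.manual_seed(0)
block = QLBlockSketch(channels=8, hidden=16)
out = block(torch.randn(2, 8, 5, 5))
print(out.shape)
```

With `padding=1` in the depthwise convolution, the block preserves the spatial size of its input, so blocks of this form can be stacked in the same way as ResNet bottlenecks.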

  • Developed by: Yao Liu 刘杳
  • Model type: Convolutional Neural Network (ConvNet)
  • License: [More Information Needed]
  • Finetuned from model: N/A (trained from scratch)

Model Sources [optional]

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

Training Details

Training and Testing Data

ImageNet-1k

[More Information Needed]

Training Procedure

We use the training script in timm:

python3 train.py ../datasets/imagenet/ --model resnet50 --num-classes 1000 --lr 0.1 --warmup-epochs 5 --epochs 240 --weight-decay 1e-4 --sched cosine --reprob 0.4 --recount 3 --remode pixel --aa rand-m7-mstd0.5-inc1 -b 192 -j 6 --amp --dist-bn reduce 
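To give a feel for what `--lr 0.1 --warmup-epochs 5 --epochs 240 --sched cosine` implies, the sketch below computes a generic linear-warmup plus cosine-decay learning rate per epoch. It is an approximation for intuition only: the function `lr_at_epoch` is hypothetical, and timm's own scheduler may differ in details such as the warmup starting value, the minimum learning rate, and whether warmup epochs count toward the cosine period (the values used here for those are assumptions).

```python
import math


def lr_at_epoch(
    epoch: int,
    base_lr: float = 0.1,         # --lr 0.1
    warmup_epochs: int = 5,       # --warmup-epochs 5
    total_epochs: int = 240,      # --epochs 240
    warmup_start_lr: float = 0.0, # assumption, not taken from the command
    min_lr: float = 0.0,          # assumption, not taken from the command
) -> float:
    """Generic linear warmup followed by cosine decay (not timm's exact code)."""
    if epoch < warmup_epochs:
        # Linear ramp from warmup_start_lr to base_lr.
        frac = epoch / warmup_epochs
        return warmup_start_lr + (base_lr - warmup_start_lr) * frac
    # Cosine decay from base_lr to min_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for e in (0, 5, 60, 120, 240):
    print(e, round(lr_at_epoch(e), 5))
```

Under these assumptions the learning rate peaks at 0.1 once warmup ends and decays smoothly toward zero by the final epoch.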

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

  • Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

Results

Top-1 accuracy: 78.40%

Summary

Model Examination [optional]

[More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

single GPU :(

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]