	---
	datasets:
	- imagenet-1k
	metrics:
	- accuracy
	library_name: timm
	---

	# Model Card for QLNet

	Based on quasi-linear hyperbolic systems of PDEs [[Liu et al, 2023](https://github.com/liuyao12/ConvNets-PDE-perspective)], QLNet ventures into uncharted waters of the ConvNet model space, marked by the use of (element-wise) multiplication instead of ReLU as the primary nonlinearity. It achieves performance comparable to ResNet50 on ImageNet-1k (top-1 acc = 78.4), demonstrating that it has the same level of capacity/expressivity, and deserves more study (hyperparameter tuning, optimizer, etc.) by the community.


	![](https://huggingface.co/liuyao/QLNet/resolve/main/QLNet.jpeg)

	One notable feature is that the architecture (trained or not) admits a continuous symmetry in its parameters. Check out the [notebook](https://colab.research.google.com/#fileId=https://huggingface.co/liuyao/QLNet/blob/main/QLNet_symmetry.ipynb) for a demo that makes a particular transformation on the weights while leaving the output unchanged.
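
As a rough illustration of how such a symmetry can arise, the sketch below uses a radial activation (one that only rescales a vector by a function of its norm) between two linear maps. Because a radial activation commutes with rotations, rotating the first weight matrix by `Q` and undoing the rotation in the second leaves the output unchanged. The `radial` function, the 2x2 weights, and the angle are all hypothetical stand-ins chosen for this example; the transformation used in the notebook may differ.

```python
import math


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def radial(v, radius=1.0):
    """Assumed radial activation: shrink v onto the ball of the given radius."""
    norm = math.hypot(*v)
    scale = min(1.0, radius / norm) if norm > 0 else 1.0
    return [scale * c for c in v]


def two_layer(w1, w2, x):
    """Linear map, radial activation, linear map."""
    return matvec(w2, radial(matvec(w1, x)))


# A rotation by an arbitrary angle; any orthogonal matrix would work.
theta = 0.7
q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]

# Hypothetical weights and input, for illustration only.
w1 = [[1.5, -0.3], [0.4, 2.0]]
w2 = [[0.2, 1.0], [-1.1, 0.5]]
x = [0.9, -1.4]

original = two_layer(w1, w2, x)
# Rotate the first layer's output, and undo the rotation in the second layer.
transformed = two_layer(matmul(q, w1), matmul(w2, transpose(q)), x)

print(original)
print(transformed)  # agrees with `original` up to floating-point error
```

Varying `theta` continuously traces out a one-parameter family of different weights that all compute the same function, which is what "continuous symmetry" refers to in this toy setting.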


	## Model Details

	### Model Description

	Instead of the `bottleneck` block of ResNet50, which consists of 1x1, 3x3, and 1x1 convolutions in succession, this simplest version of QLNet applies a 1x1 convolution, splits the result into two equal halves and multiplies them element-wise, then applies a 3x3 (depthwise) convolution followed by a 1x1 convolution. No activation functions are used except at the end of the block, where a radial activation function that we call `hardball` is applied.
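
The data flow described above can be sketched in PyTorch roughly as follows. This is an illustration under several assumptions, not the released implementation: the class name `QLBlockSketch`, the channel widths, splitting along the channel dimension, and the exact form of `hardball` (taken here as projecting each pixel's channel vector onto a ball of radius `radius`) are all guesses made for this example. Normalization layers and skip connections are left out because the description does not specify them.

```python
import torch
from torch import nn


class QLBlockSketch(nn.Module):
    """Illustrative QLNet-style block; not the author's implementation."""

    def __init__(self, channels: int, hidden: int, radius: float = 1.0):
        super().__init__()
        # 1x1 conv producing 2 * hidden channels, to be split into two halves.
        self.expand = nn.Conv2d(channels, 2 * hidden, kernel_size=1, bias=False)
        # Depthwise 3x3: groups == channels gives one filter per channel.
        self.depthwise = nn.Conv2d(hidden, hidden, kernel_size=3, padding=1,
                                   groups=hidden, bias=False)
        # Final 1x1 conv back to the block's input width.
        self.project = nn.Conv2d(hidden, channels, kernel_size=1, bias=False)
        self.radius = radius

    def hardball(self, x: torch.Tensor) -> torch.Tensor:
        # Assumed form of the radial activation: compute the norm across
        # channels at each pixel and shrink vectors that fall outside the ball.
        norm = torch.linalg.vector_norm(x, dim=1, keepdim=True)
        scale = torch.clamp(self.radius / norm.clamp_min(1e-12), max=1.0)
        return x * scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = self.expand(x).chunk(2, dim=1)  # two equal halves
        y = a * b                              # multiplication as the nonlinearity
        y = self.depthwise(y)
        y = self.project(y)
        return self.hardball(y)                # only activation in the block
```

For example, `QLBlockSketch(8, 16)(torch.randn(2, 8, 5, 5))` returns a tensor of the same shape as its input, with every per-pixel channel vector having norm at most `radius`.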



	- Developed by: Yao Liu 刘杳
	- Model type: Convolutional Neural Network (ConvNet)
	- License: [More Information Needed]
	- Finetuned from model: N/A (trained from scratch)

	### Model Sources [optional]

	<!-- Provide the basic links for the model. -->

	- Repository: [ConvNet from the PDE perspective](https://github.com/liuyao12/ConvNets-PDE-perspective)
	- Paper: [A Novel ConvNet Architecture with a Continuous Symmetry](https://arxiv.org/abs/2308.01621)
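	- Demo: [Colab notebook](https://colab.research.google.com/#fileId=https://huggingface.co/liuyao/QLNet/blob/main/QLNet_symmetry.ipynb) demonstrating the parameter symmetry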

	## How to Get Started with the Model

	Use the code below to get started with the model.

	[More Information Needed]

	## Training Details

	### Training and Testing Data

	ImageNet-1k

	[More Information Needed]

	### Training Procedure

	We use the training script in `timm`:

	```
	python3 train.py ../datasets/imagenet/ --model resnet50 --num-classes 1000 --lr 0.1 --warmup-epochs 5 --epochs 240 --weight-decay 1e-4 --sched cosine --reprob 0.4 --recount 3 --remode pixel --aa rand-m7-mstd0.5-inc1 -b 192 -j 6 --amp --dist-bn reduce
	```
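
To give a feel for what `--lr 0.1 --warmup-epochs 5 --epochs 240 --sched cosine` means for the learning rate, here is a simplified per-epoch sketch: a linear warmup followed by cosine decay. It is not `timm`'s scheduler code; the function name `lr_at_epoch`, the warmup starting value `warmup_start_lr`, the floor `min_lr`, and the choice to start the cosine phase right after warmup are assumptions made for this example.

```python
import math


def lr_at_epoch(epoch, base_lr=0.1, warmup_epochs=5, total_epochs=240,
                warmup_start_lr=1e-5, min_lr=0.0):
    """Simplified warmup + cosine schedule (assumed form, not timm's exact code)."""
    if epoch < warmup_epochs:
        # Linear ramp from warmup_start_lr up to base_lr.
        return warmup_start_lr + (base_lr - warmup_start_lr) * epoch / warmup_epochs
    # Cosine decay from base_lr down to min_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


for epoch in (0, 5, 60, 120, 180, 240):
    print(epoch, round(lr_at_epoch(epoch), 5))
```

Under these assumptions the learning rate peaks at 0.1 at the end of warmup and decays smoothly to `min_lr` by epoch 240.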

	#### Preprocessing [optional]

	[More Information Needed]


	#### Training Hyperparameters

	- Training regime: mixed precision (the `--amp` flag in the training command)

	#### Speeds, Sizes, Times [optional]

	<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

	### Results

	Top-1 accuracy on ImageNet-1k: 78.40%

	#### Summary



	## Model Examination [optional]

	<!-- Relevant interpretability work for the model goes here -->

	[More Information Needed]

	## Technical Specifications [optional]

	### Model Architecture and Objective

	[More Information Needed]

	### Compute Infrastructure

	[More Information Needed]

	#### Hardware

	single GPU :(

	#### Software

	[More Information Needed]

	## Citation [optional]

	<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

	BibTeX:

	[More Information Needed]

	APA:

	[More Information Needed]

	## Glossary [optional]

	<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

	[More Information Needed]

	## More Information [optional]

	[More Information Needed]

	## Model Card Authors [optional]

	[More Information Needed]

	## Model Card Contact

	[More Information Needed]