---
datasets:
- imagenet-1k
language:
- en
metrics:
- accuracy
library_name: timm
---

# Model Card for QLNet

Based on quasi-linear hyperbolic systems of PDEs, QLNet explores an uncharted region of the ConvNet model space: it uses multiplication (of same-sized tensors) instead of ReLU as the nonlinearity. It achieves accuracy comparable to ResNet50 on ImageNet-1k, demonstrating that it has a similar level of capacity/expressivity. It deserves more study (hyperparameter tuning, choice of optimizer, etc.) than I alone am able to do.

*This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).*



## Model Details

### Model Description

Instead of the `bottleneck` block of ResNet50, which consists of 1x1, 3x3, and 1x1 convolutions in succession, this simplest version of QLNet applies a 1x1 convolution, splits the result into two equal halves and **multiplies** them, then applies a 3x3 (depthwise) convolution and a 1x1 convolution. None of these steps uses an activation function. The only activation is a *radial activation function*, which we call `hardball`, applied at the end of the block.
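
The two non-standard pieces of the block, the split-and-multiply step and the radial activation, can be illustrated on the channel vector at a single pixel. This is a minimal sketch in plain Python, not the model's implementation: it assumes the split is along the channel dimension, it omits the convolutions entirely, and `radial_clip` is only an assumed stand-in for `hardball`, whose exact definition is not given in this card. The function names and the `radius` parameter are hypothetical.

```python
import math


def split_multiply(channels):
    """Split a channel vector into two equal halves and multiply them elementwise."""
    if len(channels) % 2 != 0:
        raise ValueError("channel count must be even")
    half = len(channels) // 2
    first, second = channels[:half], channels[half:]
    return [a * b for a, b in zip(first, second)]


def radial_clip(channels, radius=1.0):
    """Assumed stand-in for `hardball` (not the author's definition).

    Rescales the vector so its Euclidean norm is at most `radius`.
    Only the length changes, never the direction, which is what makes
    an activation radial.
    """
    norm = math.sqrt(sum(c * c for c in channels))
    if norm <= radius:
        return list(channels)
    scale = radius / norm
    return [c * scale for c in channels]


# Channel vector at one pixel after the first 1x1 convolution (made-up values).
pixel = [1.0, 2.0, 3.0, 4.0, 0.5, -1.0, 2.0, 0.0]
mixed = split_multiply(pixel)         # 8 channels in, 4 channels out
out = radial_clip(mixed, radius=1.0)  # same direction as `mixed`, norm at most 1
```

Note that the multiplication halves the channel count, so the first 1x1 convolution has to produce twice as many channels as the depthwise 3x3 convolution consumes.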



- **Developed by:** Yao Liu 刘杳
- **Model type:** Convolutional Neural Network (ConvNet)
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** *from scratch*

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** [More Information Needed]
- **Paper [optional]:** [A Novel ConvNet Architecture with a Continuous Symmetry](https://arxiv.org/abs/2308.01621)
- **Demo [optional]:** [More Information Needed]

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

## Training Details

### Training and Testing Data

ImageNet-1k

[More Information Needed]

### Training Procedure 

We use the training script in `timm`:

```
python3 train.py ../datasets/imagenet/ --model resnet50 --num-classes 1000 --lr 0.1 --warmup-epochs 5 --epochs 240 --weight-decay 1e-4 --sched cosine --reprob 0.4 --recount 3 --remode pixel --aa rand-m7-mstd0.5-inc1 -b 192 -j 6 --amp --dist-bn reduce 
```
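
To give a feel for what `--lr 0.1 --warmup-epochs 5 --epochs 240 --sched cosine` means, the sketch below computes a per-epoch learning rate with linear warmup followed by cosine decay. It is a simplified illustration, not `timm`'s scheduler: the `warmup_lr` and `min_lr` values are assumptions (they are not set in the command above), and `timm`'s own handling of the warmup phase and of per-step updates may differ in detail.

```python
import math


def lr_at_epoch(epoch, base_lr=0.1, epochs=240, warmup_epochs=5,
                warmup_lr=1e-5, min_lr=0.0):
    """Learning rate at the start of `epoch` (0-based).

    `warmup_lr` and `min_lr` are assumed values chosen for illustration.
    """
    if epoch < warmup_epochs:
        # Linear ramp from warmup_lr up to base_lr.
        return warmup_lr + (base_lr - warmup_lr) * epoch / warmup_epochs
    # Cosine decay from base_lr down to min_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / (epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for epoch in (0, 5, 120, 239):
    print(f"epoch {epoch:3d}: lr = {lr_at_epoch(epoch):.5f}")
```

Under these assumptions the learning rate peaks at 0.1 once warmup ends at epoch 5 and decays smoothly towards zero by epoch 240.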

#### Preprocessing [optional]

[More Information Needed]


#### Training Hyperparameters

- **Training regime:** mixed precision (`--amp` in the training command) <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

#### Speeds, Sizes, Times [optional]

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

### Results

Top-1 accuracy on ImageNet-1k: 78.40%
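
For readers unfamiliar with the metric, top-1 accuracy is the percentage of images whose highest-scoring class matches the ground-truth label. A minimal sketch in plain Python follows. The function name and the toy scores are illustrative only and are not taken from the evaluation code used for this model.

```python
def top1_accuracy(scores, labels):
    """Percentage of samples whose highest-scoring class equals the label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score in this row.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example: 4 samples, 2 classes; the third prediction is wrong.
toy_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
toy_labels = [1, 0, 0, 0]
toy_result = top1_accuracy(toy_scores, toy_labels)
```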

#### Summary



## Model Examination [optional]

<!-- Relevant interpretability work for the model goes here -->

[More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

single GPU :(

#### Software

[More Information Needed]

## Citation [optional]

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]