Image Classification
timm
PDE
ConvNet
liuyao committed on
Commit 79fd544
1 Parent(s): 444c338

Update README.md

Files changed (1)
  1. README.md +10 -0
README.md CHANGED
@@ -21,6 +21,16 @@ Based on **quasi-linear hyperbolic systems of PDEs** [[Liu et al, 2023](https://
 
 One notable feature is that the architecture (trained or not) admits a *continuous* symmetry in its parameters. Check out the [notebook](https://colab.research.google.com/#fileId=https://huggingface.co/liuyao/QLNet/blob/main/QLNet_symmetry.ipynb) for a demo that makes a particular transformation on the weights while leaving the output *unchanged*.
 
+ FAQ (as the author imagines it):
+
+ - Q: Who needs another ConvNet, when state-of-the-art top-1 accuracy on ImageNet-1k is already in the low 80s for models of comparable size?
+ - A: Aside from a shortage of resources for extensive experiments, the real answer is that the new symmetry has the potential to be exploited in different ways. The non-elementwise nonlinearity also has a certain "naturalness" (coordinate independence) that is inherent in the equations of mathematics and physics.
+ - Q: Multiplication is so simple, surely someone has tried it before?
+ - A: Perhaps. My bet is that whoever tried it soon found the model failing to train with standard ReLU, and without the conviction that comes from the underlying PDE perspective, it may not have been pushed to its limit.
+ - Q: Is it not similar to attention in the Transformer?
+ - A: It is, indeed. It is natural to wonder whether the activation functions in the Transformer could be removed (or reduced) while still achieving comparable performance.
+
+
 *This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).*
 
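To make the continuous-symmetry claim in the README paragraph above more concrete, here is a minimal NumPy sketch of the kind of weight transformation that can leave a multiplication-based layer's output unchanged. This is a generic toy construction, not the actual QLNet layer or the transformation demonstrated in the linked notebook; the names `toy_layer`, `W1`, `W2`, and the diagonal family `D(t)` are hypothetical and introduced purely for illustration.

```python
# Toy sketch (assumption: NOT the actual QLNet code). A layer whose
# "nonlinearity" is a product of two linear projections,
#     y = (W1 @ x) * (W2 @ x),
# admits a continuous family of weight transformations that leave the
# output unchanged: rescale W1 by a diagonal D(t) and W2 by D(t)^-1.
import numpy as np

rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal(d)
W1 = rng.standard_normal((d, d))
W2 = rng.standard_normal((d, d))

def toy_layer(W1, W2, x):
    # elementwise product of two linear projections (no ReLU anywhere)
    return (W1 @ x) * (W2 @ x)

# one-parameter family of symmetries: D(t) = diag(exp(t * a))
a = rng.standard_normal(d)
t = 0.7
D = np.diag(np.exp(t * a))
D_inv = np.diag(np.exp(-t * a))

y_before = toy_layer(W1, W2, x)
y_after = toy_layer(D @ W1, D_inv @ W2, x)
print(np.allclose(y_before, y_after))  # True: the output is unchanged
```

The transformation acting on QLNet's actual weights may differ (the notebook shows the precise one); the toy example is meant only to capture the flavor: when the nonlinearity is multiplicative rather than an elementwise ReLU, a continuous reparameterization of the weights can cancel out exactly.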