cloneofsimo committed · Commit 24ce408 · verified · 1 Parent(s): c4ff943

Update README.md

Files changed (1)
  1. README.md +9 -11
README.md CHANGED
@@ -2,19 +2,19 @@
  license: apache-2.0
  ---
 
- # Equivarient 16ch, f8 VAE
+ # Equivariant 16ch, f8 VAE
 
  <video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/6311151c64939fabc00c8436/6DQGRWvQvDXp2xQlvwvwU.mp4"></video>
 
- AuraEquiVAE is novel autoencoder that fixes multiple problem of existing conventional VAE. First, unlike traditional VAE that has significantly small log-variance, this model admits large noise to the latent.
- Next, unlike traditional VAE the latent space is equivariant under `Z_2 X Z_2` group operation (Horizonal / Vertical flip).
+ AuraEquiVAE is a novel autoencoder that addresses multiple problems of existing conventional VAEs. First, unlike traditional VAEs that have significantly small log-variance, this model admits large noise to the latent space.
+ Additionally, unlike traditional VAEs, the latent space is equivariant under `Z_2 X Z_2` group operations (Horizontal / Vertical flip).
 
- To understand the equivariance, we give suitable group action to both latent globally but also locally. Meaning, latent represented as `Z = (z_1, \cdots, z_n)` and performing the permutation group action `g_global` to the tuples such that `g_global` is isomorphic to `Z_2 x Z_2` group.
- But also `g_local` to individual `z_i` themselves such that `g_local` is also isomorphic to `Z_2 x Z_2`.
+ To understand the equivariance, we apply suitable group actions to both the latent space globally and locally. The latent is represented as `Z = (z_1, ..., z_n)`, and we perform a global permutation group action `g_global` on the tuples such that `g_global` is isomorphic to the `Z_2 x Z_2` group.
+ We also apply a local action `g_local` to individual `z_i` elements such that `g_local` is also isomorphic to the `Z_2 x Z_2` group.
 
- In our case specifically, `g_global` corresponds to flips, `g_local` corresponds to sign flip on specific latent dimension. changing 2 channel for sign flip for both horizonal, vertical was chosen empirically.
+ In our specific case, `g_global` corresponds to flips, while `g_local` corresponds to sign flips on specific latent dimensions. Changing 2 channels for sign flips for both horizontal and vertical directions was chosen empirically.
 
- The model has been trained on [Mastering VAE Training](https://github.com/cloneofsimo/vqgan-training), and detailed explanation for training could be found there.
+ The model has been trained using the approach described in [Mastering VAE Training](https://github.com/cloneofsimo/vqgan-training), where detailed explanations for the training process can be found.
 
  ## How to use
 
@@ -72,7 +72,7 @@ decimg = Image.fromarray(decimg) # PIL image.
 
  ## Citation
 
- If you find this material useful, please cite:
+ If you find this model useful, please cite:
 
  ```
  @misc{Training VQGAN and VAE, with detailed explanation,
@@ -83,6 +83,4 @@ If you find this material useful, please cite:
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/cloneofsimo/vqgan-training}},
  }
- ```
-
-
+ ```
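
Note on the README text in this diff: the combined action it describes (`g_global` as a spatial flip, `g_local` as a sign flip on two latent channels per direction) can be illustrated with a small NumPy sketch. This is only a toy check under stated assumptions, not the model's actual code: the channel indices `H_CHANNELS` and `V_CHANNELS` and the function names are hypothetical, since the README excerpt does not say which channels are negated. The sketch only verifies that the two actions behave like the generators of `Z_2 x Z_2` (each is its own inverse, and they commute).

```python
import numpy as np

# Hypothetical channel assignment (an assumption for illustration only;
# the README excerpt does not state which channels change sign).
H_CHANNELS = [12, 13]  # negated under a horizontal flip
V_CHANNELS = [14, 15]  # negated under a vertical flip


def act_horizontal(z):
    """Apply the horizontal-flip action to a latent of shape (C, H, W)."""
    out = z[:, :, ::-1].copy()  # g_global: mirror the spatial grid left-right
    out[H_CHANNELS] *= -1       # g_local: sign flip on the chosen channels
    return out


def act_vertical(z):
    """Apply the vertical-flip action to a latent of shape (C, H, W)."""
    out = z[:, ::-1, :].copy()  # g_global: mirror the spatial grid top-bottom
    out[V_CHANNELS] *= -1       # g_local: sign flip on the chosen channels
    return out


rng = np.random.default_rng(0)
z = rng.standard_normal((16, 4, 4))  # 16-channel toy latent

# Each generator is an involution, and the two generators commute,
# which is exactly the structure of Z_2 x Z_2.
assert np.array_equal(act_horizontal(act_horizontal(z)), z)
assert np.array_equal(act_vertical(act_vertical(z)), z)
assert np.array_equal(
    act_horizontal(act_vertical(z)), act_vertical(act_horizontal(z))
)
```

In this picture, equivariance would mean that encoding a flipped image gives the same latent as applying the corresponding action above to the latent of the original image; checking that requires the real encoder and is not attempted here.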