AbstractPhil committed
Commit 3cb02ca · verified · 1 Parent(s): 25d3742

Update README.md
Files changed (1): README.md (+249, -3)

---
license: apache-2.0
---
# MobiusNet

A vision architecture built on continuous topological principles, replacing traditional activations with wave-based interference gating.

## Overview

MobiusNet introduces a fundamentally different approach to neural network design:

- **MobiusLens**: Wave superposition as a gating mechanism, replacing standard activations (ReLU, GELU)
- **Thirds Mask**: Cantor-inspired fractal channel suppression for regularization
- **Continuous Topology**: Layers sample a continuous manifold via the `t` parameter rather than acting as discrete units
- **Twist Rotations**: Smooth rotation through representation space across network depth

## Performance

| Model | Params | GFLOPs | Tiny ImageNet (top-1) |
|-------|--------|--------|-----------------------|
| ResNet-18 | 11M | 1.8 | 50-55% |
| MobiusNet-M | 14.6M | 2.69 | 55.4% |
| MobiusNet-Base | 33.7M | 2.69 | TBD |

## Installation

```bash
pip install torch torchvision safetensors huggingface_hub tensorboard tqdm
```

## Quick Start

### Training

```python
from mobius_trainer_full import train_tiny_imagenet

model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    lr=3e-4,
    batch_size=128,
    use_integrator=True,
    data_dir='./data/tiny-imagenet-200',
    output_dir='./outputs',
    hf_repo='AbstractPhil/mobiusnet',
    save_every_n_epochs=10,
    upload_every_n_epochs=10,
)
```

### Continue from Checkpoint

```python
# From local directory
model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    continue_from="./outputs/checkpoints/mobius_base_tiny_imagenet/20240101_120000",
)

# From HuggingFace (auto-downloads)
model, best_acc = train_tiny_imagenet(
    preset='mobius_base',
    epochs=200,
    continue_from="checkpoints/mobius_base_tiny_imagenet/20240101_120000",
)
```

### Inference

```python
import torch
from safetensors.torch import load_file
from mobius_trainer_full import MobiusNet, PRESETS

# Load model
config = PRESETS['mobius_base']
model = MobiusNet(num_classes=200, use_integrator=True, **config)
state_dict = load_file("best_model.safetensors")
model.load_state_dict(state_dict)
model.eval()

# Inference on a preprocessed image batch
# (image_tensor: float tensor of shape (1, 3, 64, 64) for Tiny ImageNet)
with torch.no_grad():
    logits = model(image_tensor)
    pred = logits.argmax(1)
```

## Model Presets

| Preset | Channels | Depths | ~Params |
|--------|----------|--------|---------|
| `mobius_tiny_s` | (64, 128, 256) | (2, 2, 2) | 500K |
| `mobius_tiny_m` | (64, 128, 256, 512, 768) | (2, 2, 4, 2, 2) | 11M |
| `mobius_tiny_l` | (96, 192, 384, 768) | (3, 3, 3, 3) | 8M |
| `mobius_base` | (128, 256, 512, 768, 1024) | (2, 2, 2, 2, 2) | 33.7M |
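
The preset names resolve to config dicts in `mobius_trainer_full.PRESETS`, which the Inference example unpacks into the `MobiusNet` constructor. The sketch below shows one plausible shape for that mapping; the key names `channels` and `depths` are assumptions taken from the table headers, not necessarily the repo's actual keys.

```python
# Hypothetical sketch of the preset-to-config mapping implied by the table above.
# Key names ('channels', 'depths') are assumptions; the real dict is PRESETS in
# mobius_trainer_full and may use different keys or extra fields.
PRESETS_SKETCH = {
    "mobius_tiny_s": dict(channels=(64, 128, 256), depths=(2, 2, 2)),
    "mobius_tiny_m": dict(channels=(64, 128, 256, 512, 768), depths=(2, 2, 4, 2, 2)),
    "mobius_tiny_l": dict(channels=(96, 192, 384, 768), depths=(3, 3, 3, 3)),
    "mobius_base": dict(channels=(128, 256, 512, 768, 1024), depths=(2, 2, 2, 2, 2)),
}

config = PRESETS_SKETCH["mobius_base"]
# model = MobiusNet(num_classes=200, use_integrator=True, **config)
```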

## Architecture

```
                Input
                  │
                  ▼
┌───────────────────────────────────┐
│ Stem (Conv → BN)                  │
└───────────────────────────────────┘
                  │
                  ▼
┌───────────────────────────────────┐
│ Stage 1-N                         │
│ ┌─────────────────────────────┐   │
│ │ MobiusConvBlock (×depth)    │   │
│ │ ├─ Depthwise-Sep Conv       │   │
│ │ ├─ BatchNorm                │   │
│ │ ├─ MobiusLens (wave gate)   │   │
│ │ ├─ Thirds Mask              │   │
│ │ └─ Learned Residual         │   │
│ └─────────────────────────────┘   │
│ Downsample (stride-2 conv)        │
└───────────────────────────────────┘
                  │
                  ▼
┌───────────────────────────────────┐
│ Integrator (Conv → BN → GELU)     │ ← Task collapse
└───────────────────────────────────┘
                  │
                  ▼
┌───────────────────────────────────┐
│ Pool → Linear → Classes           │
└───────────────────────────────────┘
```
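
Read as code, one stage block is roughly a depthwise-separable convolution wrapped with the gate, mask, and a learned residual. The sketch below is structural only: `nn.GELU` stands in for MobiusLens, an all-ones buffer stands in for the Thirds Mask (both detailed under Core Components), and the weighted-residual form is an assumption rather than the repo's implementation.

```python
import torch
import torch.nn as nn

class MobiusConvBlockSketch(nn.Module):
    """Structural sketch of one block from the diagram above (not the reference code)."""

    def __init__(self, channels: int, gate: nn.Module, mask: torch.Tensor):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)  # depthwise
        self.pw = nn.Conv2d(channels, channels, 1)                              # pointwise
        self.bn = nn.BatchNorm2d(channels)
        self.gate = gate                                    # stand-in for MobiusLens
        self.register_buffer("mask", mask)                  # stand-in for the Thirds Mask
        self.res_weight = nn.Parameter(torch.tensor(1.0))   # learned residual (assumed form)

    def forward(self, x):
        y = self.bn(self.pw(self.dw(x)))   # depthwise-separable conv -> BN
        y = self.gate(y)                   # wave gate
        y = y * self.mask                  # channel suppression
        return self.res_weight * x + y     # residual with a learned weight

# Usage sketch: block = MobiusConvBlockSketch(64, gate=nn.GELU(), mask=torch.ones(1, 64, 1, 1))
```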

## Core Components

### MobiusLens

Wave-based gating mechanism with three interference paths:

```python
L = wave(phase_l, drift_l)   # Left path (+1 drift)
M = wave(phase_m, drift_m)   # Middle path (0 drift, ghost)
R = wave(phase_r, drift_r)   # Right path (-1 drift)

# Interference
xor_comp = abs(L + R - 2 * L * R)   # Differentiable XOR
and_comp = L * R                    # Differentiable AND

# Gating
gate = weighted_sum(L, M, R) * interference_blend
output = input * sigmoid(layernorm(gate))
```

The middle path (M) acts as a "ghost": present but diminished. It maintains gradient continuity while biasing information flow toward the L/R edges (a Cantor-like structure).
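
Since only this schematic appears in the README, here is a minimal, self-contained sketch of a gate in this style for readers who want something runnable. Everything in it is illustrative: the class name `WaveGate`, the sinusoidal wave form, how `interference_blend` mixes the XOR/AND components, the fixed ±1 drifts, and the use of `GroupNorm` in place of `layernorm` are assumptions, not the reference MobiusLens implementation.

```python
import torch
import torch.nn as nn

class WaveGate(nn.Module):
    """Minimal sketch of a MobiusLens-style wave gate (names and wave form are assumed)."""

    def __init__(self, channels: int):
        super().__init__()
        # Per-channel learnable phases for the three paths.
        self.phase_l = nn.Parameter(torch.zeros(channels))
        self.phase_m = nn.Parameter(torch.zeros(channels))
        self.phase_r = nn.Parameter(torch.zeros(channels))
        self.omega = nn.Parameter(torch.ones(channels))                    # wave frequency
        self.path_weights = nn.Parameter(torch.tensor([1.0, 0.25, 1.0]))   # L, M (ghost), R
        self.blend = nn.Parameter(torch.tensor(0.5))                       # XOR/AND mix
        self.norm = nn.GroupNorm(1, channels)                              # stand-in for layernorm

    def wave(self, x, phase, drift):
        # Bounded wave in [0, 1]; drift shifts the per-channel phase.
        return 0.5 * (1.0 + torch.sin(self.omega.view(1, -1, 1, 1) * x
                                      + (phase + drift).view(1, -1, 1, 1)))

    def forward(self, x):                      # x: (B, C, H, W)
        L = self.wave(x, self.phase_l, +1.0)   # left path
        M = self.wave(x, self.phase_m, 0.0)    # middle "ghost" path
        R = self.wave(x, self.phase_r, -1.0)   # right path

        xor_comp = (L + R - 2 * L * R).abs()   # differentiable XOR
        and_comp = L * R                       # differentiable AND
        interference = self.blend * xor_comp + (1 - self.blend) * and_comp

        w = torch.softmax(self.path_weights, dim=0)
        gate = (w[0] * L + w[1] * M + w[2] * R) * interference
        return x * torch.sigmoid(self.norm(gate))

# Usage: gate = WaveGate(64); y = gate(torch.randn(2, 64, 32, 32))
```

Down-weighting the middle entry of `path_weights` is one way to realize the "present but diminished" ghost behaviour described above.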

### Thirds Mask

Rotating channel suppression inspired by Cantor set construction:

```
Layer 0: suppress channels [0:C/3]
Layer 1: suppress channels [C/3:2C/3]
Layer 2: suppress channels [2C/3:C]
Layer 3: back to [0:C/3]
```

Forces redundancy and prevents co-adaptation across channel groups.
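
A rotating mask of this kind could be generated per layer as sketched below; the function name `thirds_mask`, the hard zeroing of the suppressed band, and the broadcast shape are assumptions for illustration (the actual implementation may attenuate rather than zero the band).

```python
import torch

def thirds_mask(channels: int, layer_idx: int, suppress: float = 0.0) -> torch.Tensor:
    """Illustrative sketch of a rotating 'thirds' mask (not the repo's implementation)."""
    third = channels // 3
    start = (layer_idx % 3) * third
    end = channels if layer_idx % 3 == 2 else start + third  # last band absorbs any remainder
    mask = torch.ones(channels)
    mask[start:end] = suppress       # suppress one third of the channels
    return mask.view(1, -1, 1, 1)    # broadcast over (B, C, H, W)

# Layer 0 suppresses [0:C/3], layer 1 [C/3:2C/3], layer 2 [2C/3:C], then it repeats.
# x = x * thirds_mask(x.shape[1], layer_idx)
```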

### Continuous Topology

Each layer samples a continuous manifold:

```python
t = layer_idx / (total_layers - 1)   # 0 → 1

twist_in_angle = t * π
twist_out_angle = -t * π
scales = scale_range[0] + t * scale_span
```

Adding layers = finer sampling of the same underlying structure.
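
For intuition, the snippet below just evaluates those formulas for a hypothetical 5-layer stack; `scale_range = (0.5, 1.5)` is an invented example value and `scale_span` is assumed to be its width.

```python
import math

# Evaluate the manifold coordinates above for a hypothetical 5-layer stack.
total_layers = 5
scale_range = (0.5, 1.5)                        # example value, not from the repo
scale_span = scale_range[1] - scale_range[0]

for layer_idx in range(total_layers):
    t = layer_idx / (total_layers - 1)          # 0 -> 1 across depth
    twist_in_angle = t * math.pi
    twist_out_angle = -t * math.pi
    scale = scale_range[0] + t * scale_span
    print(f"layer {layer_idx}: t={t:.2f}  twist_in={twist_in_angle:+.3f}  "
          f"twist_out={twist_out_angle:+.3f}  scale={scale:.2f}")
```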

## Checkpoints

Saved to: `checkpoints/{variant}_{dataset}/{timestamp}/`

```
├── config.json
├── best_accuracy.json
├── final_accuracy.json
├── checkpoints/
│   ├── checkpoint_epoch_0010.pt
│   ├── checkpoint_epoch_0010.safetensors
│   ├── best_model.pt
│   ├── best_model.safetensors
│   ├── final_model.pt
│   └── final_model.safetensors
└── tensorboard/
```

## TensorBoard

Monitor training:

```bash
tensorboard --logdir ./outputs/checkpoints
```

Tracks:
- Loss, train/val accuracy
- Per-layer lens parameters (omega, alpha, twist angles, L/M/R weights)
- Residual weights
- Weight histograms

## Data Setup

### Tiny ImageNet

```bash
wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
unzip tiny-imagenet-200.zip -d ./data/
```

## HuggingFace Integration

Checkpoints auto-upload to the HuggingFace Hub when `hf_repo` is set:

```python
# Set the token in Colab
from google.colab import userdata
token = userdata.get('HF_TOKEN')

# Or set it as an environment variable before launching:
#   export HF_TOKEN=your_token_here
```
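
To pull an uploaded checkpoint down manually (outside of `continue_from`), `huggingface_hub` can fetch individual files. The in-repo path below is only an example assembled from the checkpoint layout above; the actual filenames present in the repo may differ.

```python
from huggingface_hub import hf_hub_download

# Example only: the filename path is assembled from the checkpoint layout shown
# earlier and is not guaranteed to exist in the repo.
local_path = hf_hub_download(
    repo_id="AbstractPhil/mobiusnet",
    filename="checkpoints/mobius_base_tiny_imagenet/20240101_120000/checkpoints/best_model.safetensors",
)
print(local_path)
```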

## License

Apache-2.0 (per the `license` field in the metadata above).

## Citation

```bibtex
@misc{mobiusnet2024,
  title={MobiusNet: Wave-Based Topological Vision Architecture},
  author={AbstractPhil},
  year={2024},
  url={https://huggingface.co/AbstractPhil/mobiusnet}
}
```