Update README.md
README.md
CHANGED
@@ -10,14 +10,14 @@ Heavily inspired by [google/gemma-scope](https://huggingface.co/google/gemma-sco
 
 | Layer | MSE | Explained Variance | Dead Feature Proportion |
 |-------|-----|--------------------|-------------------------|
-| 2 |
-| 5 |
-| 8 |
-| 11 |
-| 14 | |
-| 17 |
-| 20 |
-| 22 |
+| 2 | 267.95 | 0.763 | 0.000912 |
+| 5 | 354.46 | 0.665 | 0 |
+| 8 | 357.58 | 0.642 | 0 |
+| 11 | 321.23 | 0.674 | 0 |
+| 14 | 319.64 | 0.689 | 0 |
+| 17 | 261.201 | 0.731 | 0 |
+| 20 | 278.06 | 0.706 | 0.0000763 |
+| 22 | 299.96 | 0.684 | 0 |
 
 Training logs are available [via wandb](https://wandb.ai/lewington/ViT-L-14-laion2B-s32B-b82K/workspace) and training code is available on [github](https://github.com/Lewington-pitsos/vitsae). The training process is heavily reliant on [AWS ECS](https://aws.amazon.com/ecs/) so may contain some strange artefacts when a spot instance is killed and the training is resumed by another instance. Some of the code is ripped directly from [Hugo Fry](https://github.com/HugoFry/mats_sae_training_for_ViTs).
 