Commit decae61 by lewington
1 parent: aaf0642

Update README.md

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -10,14 +10,14 @@ Heavily inspired by [google/gemma-scope](https://huggingface.co/google/gemma-sco

 | Layer | MSE | Explained Variance | Dead Feature Proportion |
 |-------|-----|--------------------|-------------------------|
-| 2 | | | |
-| 5 | | | |
-| 8 | | | |
-| 11 | | | |
-| 14 | | | |
-| 17 | | | |
-| 20 | | | |
-| 22 | | | |
+| 2 | 267.95 | 0.763 | 0.000912 |
+| 5 | 354.46 | 0.665 | 0 |
+| 8 | 357.58 | 0.642 | 0 |
+| 11 | 321.23 | 0.674 | 0 |
+| 14 | 319.64 | 0.689 | 0 |
+| 17 | 261.201 | 0.731 | 0 |
+| 20 | 278.06 | 0.706 | 0.0000763 |
+| 22 | 299.96 | 0.684 | 0 |

 Training logs are available [via wandb](https://wandb.ai/lewington/ViT-L-14-laion2B-s32B-b82K/workspace) and training code is available on [github](https://github.com/Lewington-pitsos/vitsae). The training process is heavily reliant on [AWS ECS](https://aws.amazon.com/ecs/) so may contain some strange artefacts when a spot instance is killed and the training is resumed by another instance. Some of the code is ripped directly from [Hugo Fry](https://github.com/HugoFry/mats_sae_training_for_ViTs).
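
The README does not say how the three table columns are computed. For readers unfamiliar with these metrics, here is a minimal sketch of one common set of definitions for a sparse autoencoder, written with NumPy. Everything in it is an assumption rather than the author's code: the function name `sae_metrics`, the argument names, the choice to average the squared error over every element, and the rule that a feature counts as "dead" if it never activates on the evaluated samples. The training repository may use different reductions (for example, summing the error over the model dimension), so the absolute MSE values in the table should not be expected to match this formula.

```python
import numpy as np


def sae_metrics(x, x_hat, feature_acts):
    """Hypothetical helper: reconstruction metrics for a sparse autoencoder.

    x            -- original activations, shape (n_samples, d_model)
    x_hat        -- SAE reconstructions, same shape as x
    feature_acts -- SAE feature activations, shape (n_samples, n_features)
    """
    residual = x - x_hat

    # Assumed definition: mean squared error over every element.
    mse = float(np.mean(residual ** 2))

    # Assumed definition: 1 - (residual sum of squares / total sum of squares),
    # where the total is measured around the per-dimension mean of x.
    total = float(np.sum((x - x.mean(axis=0)) ** 2))
    explained_variance = 1.0 - float(np.sum(residual ** 2)) / total

    # Assumed definition: a feature is dead if it never fires (> 0)
    # on any of the evaluated samples.
    dead = (feature_acts > 0).sum(axis=0) == 0
    dead_proportion = float(dead.mean())

    return mse, explained_variance, dead_proportion
```

Under these definitions, a perfect reconstruction gives an MSE of 0 and an explained variance of 1, while a reconstruction that always predicts the per-dimension mean gives an explained variance of 0.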