AbstractPhil committed (verified)
Commit 004b4b6 · 1 Parent(s): 60adb7c

Update model card (step 5625)

Files changed (1): README.md (+79, -3)
README.md CHANGED

---
tags:
- vae
- multimodal
- text-embeddings
- clip
- t5
license: mit
---

# VAE Lyra 🎵

VAE Lyra is a multi-modal variational autoencoder that fuses CLIP-L and T5-base text embeddings into a shared latent space and reconstructs them, using a Cantor-based geometric fusion strategy.

## Model Details

- **Fusion Strategy**: cantor
- **Latent Dimension**: 768
- **Training Steps**: 5,625
- **Best Loss**: 0.2159

## Architecture

- **Modalities**: CLIP-L (768d) + T5-base (768d)
- **Encoder Layers**: 3
- **Decoder Layers**: 3
- **Hidden Dimension**: 1024

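
The geovocab2 repository is the source of truth for how these pieces are wired together. As a rough, illustrative sketch only, this is what per-modality encoder/decoder stacks with the documented sizes (3 layers, hidden dimension 1024, latent dimension 768) look like in plain PyTorch; the layer layout and activation are assumptions, and the Cantor fusion of the two streams is not shown.

```python
import torch
import torch.nn as nn

HIDDEN, LATENT = 1024, 768

# Illustrative 3-layer encoder for one 768-d modality; the last layer is
# widened to 2 * LATENT so it can emit both mu and logvar.
encoder = nn.Sequential(
    nn.Linear(768, HIDDEN), nn.GELU(),
    nn.Linear(HIDDEN, HIDDEN), nn.GELU(),
    nn.Linear(HIDDEN, 2 * LATENT),
)

# Illustrative 3-layer decoder mapping the latent back to a 768-d embedding space.
decoder = nn.Sequential(
    nn.Linear(LATENT, HIDDEN), nn.GELU(),
    nn.Linear(HIDDEN, HIDDEN), nn.GELU(),
    nn.Linear(HIDDEN, 768),
)

x = torch.randn(2, 77, 768)                           # a batch of token embeddings
mu, logvar = encoder(x).chunk(2, dim=-1)              # [2, 77, 768] each
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
recon = decoder(z)                                    # [2, 77, 768]
```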

## Usage

```python
from geovocab2.train.model.vae.vae_lyra import MultiModalVAE, MultiModalVAEConfig
from huggingface_hub import hf_hub_download
import torch

# Download the checkpoint from the Hub
model_path = hf_hub_download(
    repo_id="AbstractPhil/vae-lyra",
    filename="model.pt"
)

# Load the checkpoint on CPU (move the model to GPU afterwards if desired)
checkpoint = torch.load(model_path, map_location="cpu")

# Recreate the model with the configuration it was trained with
config = MultiModalVAEConfig(
    modality_dims={"clip": 768, "t5": 768},
    latent_dim=768,
    fusion_strategy="cantor"
)

model = MultiModalVAE(config)
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()

# Run a forward pass (see below for one way to produce the embeddings)
inputs = {
    "clip": clip_embeddings,  # [batch, 77, 768]
    "t5": t5_embeddings       # [batch, 77, 768]
}

reconstructions, mu, logvar = model(inputs)
```
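
The snippet above assumes `clip_embeddings` and `t5_embeddings` already exist; this card does not specify how they were produced. A minimal sketch, assuming the standard Hugging Face CLIP-L and T5-base text encoders (both with a 768-d hidden size) padded to the same 77-token length; if the VAE was trained against different encoder checkpoints, substitute those instead:

```python
import torch
from transformers import CLIPTextModel, CLIPTokenizer, T5EncoderModel, T5Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
prompts = ["a watercolor painting of a lighthouse at dusk"]  # example prompt

# CLIP-L text encoder (768-d hidden states, 77-token context)
clip_tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
clip_enc = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14").to(device).eval()

# T5-base encoder (768-d hidden states), padded/truncated to the same 77 tokens
t5_tok = T5Tokenizer.from_pretrained("t5-base")
t5_enc = T5EncoderModel.from_pretrained("t5-base").to(device).eval()

with torch.no_grad():
    clip_in = clip_tok(prompts, padding="max_length", max_length=77,
                       truncation=True, return_tensors="pt").to(device)
    t5_in = t5_tok(prompts, padding="max_length", max_length=77,
                   truncation=True, return_tensors="pt").to(device)
    clip_embeddings = clip_enc(**clip_in).last_hidden_state  # [1, 77, 768]
    t5_embeddings = t5_enc(**t5_in).last_hidden_state        # [1, 77, 768]
```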

## Training Details

- Trained on 10,000 diverse prompts
- Prompt mix: 85% LAION flavors, 15% synthetic prompts
- KL annealing: enabled
- Learning rate: 0.0001

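
The card lists the KL-annealing flag and learning rate but not the exact objective. As a minimal sketch of what an annealed multi-modal VAE loss conventionally looks like (the reconstruction term, the `warmup_steps` value, and the weighting are assumptions; the actual geovocab2 training loop may differ):

```python
import torch
import torch.nn.functional as F

def kl_beta(step: int, warmup_steps: int = 1000) -> float:
    """Linear KL annealing: beta ramps from 0 to 1 over the warmup."""
    return min(1.0, step / warmup_steps)

def vae_loss(recons, targets, mu, logvar, beta):
    """Per-modality reconstruction error plus the annealed KL term."""
    recon = sum(F.mse_loss(recons[k], targets[k]) for k in targets)
    # KL divergence between N(mu, sigma^2) and the standard normal prior
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl
```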

## Citation

```bibtex
@software{vae_lyra_2025,
  author = {AbstractPhil},
  title  = {VAE Lyra: Multi-Modal Variational Autoencoder},
  year   = {2025},
  url    = {https://huggingface.co/AbstractPhil/vae-lyra}
}
```