Update README.md
README.md
CHANGED
@@ -36,12 +36,12 @@ The Barlow Twins objective explicitly minimizes redundancy between embedding dim
 
 | Attribute | Value |
 |----------|-------|
-| **Base architecture** | Custom RoBERTa-style transformer (
+| **Base architecture** | Custom RoBERTa-style transformer (6 layers, 320 hidden dim, 4 attention heads, ~8M params) |
 | **Initialization** | Random (not pretrained on text or chemistry) |
 | **Training objective** | **Barlow Twins**, redundancy-reduction via cross-correlation matrix |
 | **Augmentation** | Stochastic SMILES enumeration (`MolToSmiles(..., doRandom=True)`) |
 | **Training data** | ~24K unique molecules → augmented into positive pairs |
-| **Sequence length** |
+| **Sequence length** | 514 tokens |
 | **Embedding dimension** | 320 |
 | **Projection head** | 3-layer MLP with BatchNorm (2048 → 2048 → 2048) |
 | **Pooling** | Mean pooling over token embeddings |
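
The **Training objective** row names Barlow Twins redundancy reduction via a cross-correlation matrix. As a rough illustration of that idea only, here is a dependency-free Python sketch. The function names and the `off_diag_weight` default are hypothetical and do not come from this repository; the actual training code presumably works on projection-head outputs in a tensor framework and may differ in normalization details.

```python
from statistics import fmean, pstdev


def standardize_columns(batch):
    """Normalize each embedding dimension to zero mean, unit variance over the batch."""
    columns = list(zip(*batch))
    means = [fmean(col) for col in columns]
    # Guard against zero variance; the fallback value is an assumption.
    stds = [pstdev(col) or 1e-12 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in batch]


def cross_correlation(z_a, z_b):
    """Cross-correlation matrix between two batches of embeddings (one row per sample)."""
    n = len(z_a)
    a = standardize_columns(z_a)
    b = standardize_columns(z_b)
    dim = len(a[0])
    return [
        [sum(a[k][i] * b[k][j] for k in range(n)) / n for j in range(dim)]
        for i in range(dim)
    ]


def barlow_twins_loss(z_a, z_b, off_diag_weight=5e-3):
    """Pull the diagonal toward 1 (invariance) and the off-diagonal toward 0 (decorrelation).

    off_diag_weight is a hypothetical default, not a value taken from this model card.
    """
    c = cross_correlation(z_a, z_b)
    dim = len(c)
    on_diag = sum((1.0 - c[i][i]) ** 2 for i in range(dim))
    off_diag = sum(c[i][j] ** 2 for i in range(dim) for j in range(dim) if i != j)
    return on_diag + off_diag_weight * off_diag


# Toy stand-ins for the embeddings of two SMILES views of the same four molecules.
view_a = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
view_b = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
loss = barlow_twins_loss(view_a, view_b)
```

In this toy case the two views agree and the two dimensions are uncorrelated, so both terms vanish. Making the dimensions copies of each other leaves the diagonal term at zero but activates the off-diagonal penalty, which is the redundancy the objective is meant to suppress.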