Upload README.md with huggingface_hub

pipeline_tag: feature-extraction
---

# EDT-Former Encoder (Stage 1)

The pretrained **EDT-Former encoder** from the ICLR 2026 paper:

> **Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding**
> Zihao Jing, Qiuhao Zeng, Ruiyi Fang, Yan Sun, Boyu Wang, Pingzhao Hu

## Model Description

The EDT-Former encoder is a Dual Q-Former that bridges molecular graphs and language. It uses:
- **Entropy-guided dynamic token selection** to focus on informative molecular patches
- **BRICS fragment IDs** for substructural awareness
- **Cross-attention over graph node features** to generate a token sequence aligned with text (see the sketch below)

This Stage 1 checkpoint (~699 MB) is trained on the PubChem pretraining corpus and is used to initialize Stage 2 (full model) training.
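
To make the selection mechanism concrete, here is a minimal PyTorch sketch of how entropy-guided token selection can work. It is an illustrative assumption throughout: the function `entropy_guided_select`, the tensor shapes, and the rule of keeping the lowest-entropy (most focused) query tokens are not taken from the DQ-Former codebase.

```python
# Hypothetical sketch of entropy-guided dynamic token selection.
# Names, shapes, and the "keep low-entropy (focused) queries" rule are
# illustrative assumptions, not the repository's actual implementation.
import torch


def entropy_guided_select(queries, node_feats, num_keep):
    """queries: (B, Q, D) learned query tokens; node_feats: (B, N, D) graph
    node features. Returns (B, num_keep, D) selected dynamic tokens."""
    d = queries.size(-1)
    # Cross-attention of query tokens over graph node features.
    attn = torch.softmax(queries @ node_feats.transpose(1, 2) / d ** 0.5, dim=-1)  # (B, Q, N)
    tokens = attn @ node_feats  # (B, Q, D) attention-pooled patch summaries
    # Shannon entropy of each query's attention distribution: a peaked,
    # low-entropy distribution suggests the query locked onto an informative patch.
    entropy = -(attn * attn.clamp_min(1e-9).log()).sum(dim=-1)  # (B, Q)
    keep = entropy.topk(num_keep, dim=-1, largest=False).indices  # (B, num_keep)
    return tokens.gather(1, keep.unsqueeze(-1).expand(-1, -1, d))


# Toy usage with random tensors.
out = entropy_guided_select(torch.randn(2, 32, 256), torch.randn(2, 50, 256), num_keep=8)
print(out.shape)  # torch.Size([2, 8, 256])
```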

Use this checkpoint as the Stage 1 initialization for Stage 2 fine-tuning:

```yaml
# configs/stage2_dqw2d/model_config.yaml
stage1_path: path/to/EDT-Former-encoder/model.safetensors
```

Or download and use directly:

```python
from huggingface_hub import snapshot_download

snapshot_download("zihaojing/EDT-Former-encoder", local_dir="checkpoints/edt_former_s1_large/final_model")
```
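
Once downloaded, the weights can be inspected or loaded manually as a plain state dict. A minimal sketch, assuming the checkpoint inside the snapshot is stored as `model.safetensors`, consistent with the `stage1_path` example above:

```python
# Load the downloaded Stage 1 weights as a state dict for inspection.
# The exact filename is an assumption based on the stage1_path example above.
from safetensors.torch import load_file

state_dict = load_file("checkpoints/edt_former_s1_large/final_model/model.safetensors")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))  # peek at a few parameter names and shapes
```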

To reproduce Stage 1 training from scratch:
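
```bash
bash scripts/training/pretraining.sh
```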

| Resource | Link |
|----------|------|
| Pretrain Data | [zihaojing/EDT-Former-pretrain-data](https://huggingface.co/datasets/zihaojing/EDT-Former-pretrain-data) |
| SFT Data | [zihaojing/EDT-Former-sft-data](https://huggingface.co/datasets/zihaojing/EDT-Former-sft-data) |
| Full Model (Stage 2) | [zihaojing/EDT-Former-model](https://huggingface.co/zihaojing/EDT-Former-model) |
| Code | [selmiss/DQ-Former](https://github.com/selmiss/DQ-Former) |

## Citation