zihaojing committed on
Commit 7dff722 (verified)
1 Parent(s): 352188f

Upload README.md with huggingface_hub

Files changed (1): README.md (+81 -0, new file)
---
license: mit
language:
- en
tags:
- molecules
- chemistry
- graph-encoder
- qformer
- molecular-understanding
pipeline_tag: feature-extraction
---

# DQFormer Encoder (Stage 1)

This is the pretrained **DQ-Former encoder** from EDT-Former, as described in the ICLR 2026 paper:

> **Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding**
> Zihao Jing, Qiuhao Zeng, Ruiyi Fang, Yan Sun, Boyu Wang, Pingzhao Hu
> *ICLR 2026* · [Paper](https://www.arxiv.org/abs/2602.02742) · [Code](https://github.com/selmiss/DQ-Former)

## Model Description

The DQ-Former encoder is a Dual Q-Former that bridges molecular graphs and language. It uses:
- **Entropy-guided dynamic token selection** to focus on informative molecular patches (see the illustrative sketch below)
- **BRICS fragment IDs** for substructural awareness
- **Cross-attention over graph node features** to generate a variable-length token sequence aligned with text

This Stage 1 checkpoint (~699 MB) is trained on the PubChem pretraining corpus and is used to initialize Stage 2 (full model) training.

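To give a feel for the token-selection idea, here is a small, self-contained sketch. It is **not** the repository's implementation: the way entropy is computed from attention scores and the way the token budget is derived below are illustrative assumptions, meant only to show how an entropy signal can decide how many node patches survive as dynamic tokens.

```python
# Illustrative sketch only -- not the code from selmiss/DQ-Former.
# Idea: use the entropy of an attention distribution over graph nodes to
# decide *how many* node patches to keep, then keep the highest-scoring ones.
import torch


def entropy_guided_select(node_feats: torch.Tensor,
                          query_tokens: torch.Tensor) -> torch.Tensor:
    """node_feats: (N, d) graph-node features; query_tokens: (Q, d).
    Returns a variable-length (k, d) subset of node_feats."""
    d = node_feats.shape[-1]
    # Attention of each query token over the graph nodes.
    attn = (query_tokens @ node_feats.T / d ** 0.5).softmax(dim=-1)  # (Q, N)

    # Importance of each node = average attention mass it receives.
    importance = attn.mean(dim=0)  # (N,)

    # Normalised entropy of the importance distribution: close to 1 means
    # attention is spread evenly (keep more nodes); close to 0 means it is
    # concentrated on a few nodes (keep fewer).
    probs = importance / importance.sum()
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum()
    norm_entropy = entropy / torch.log(torch.tensor(float(probs.numel())))

    # Dynamic token budget grows with entropy; at least one node survives.
    k = max(1, int(norm_entropy.item() * node_feats.shape[0]))
    top_idx = importance.topk(k).indices
    return node_feats[top_idx]


if __name__ == "__main__":
    nodes = torch.randn(20, 512)    # e.g. 20 atom/patch embeddings, embed_dim=512
    queries = torch.randn(32, 512)  # num_query_tokens = 32
    selected = entropy_guided_select(nodes, queries)
    print(selected.shape)           # torch.Size([k, 512]), k chosen by entropy
```
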
**Architecture config:**
- `num_query_tokens`: 32
- `embed_dim`: 512
- `cross_attention_freq`: 1
- `num_layers`: 8 (blending module)
- `num_heads`: 8

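For quick reference, the same hyperparameters can be mirrored as a plain Python dictionary. This is only a hypothetical mirror of the list above; the actual key names and nesting used by the repo's YAML configs may differ.

```python
# Hypothetical mirror of the hyperparameters listed above; the repo's real
# config files may use different key names or nesting.
dqformer_encoder_config = {
    "num_query_tokens": 32,     # learnable query tokens attending to the graph
    "embed_dim": 512,           # shared graph/text embedding width
    "cross_attention_freq": 1,  # cross-attention applied in every block
    "num_layers": 8,            # depth of the blending module
    "num_heads": 8,             # attention heads per layer
}
```
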
## Usage

Use this checkpoint as the Stage 1 initialization for Stage 2 fine-tuning:

```yaml
# configs/stage2_dqw2d/model_config.yaml
stage1_path: path/to/DQFormer-encoder/model.safetensors
```

Or load directly:

```python
# First clone the repo and install dependencies (see github.com/selmiss/DQ-Former)
from models.edt_former import EDTFormerEncoder

encoder = EDTFormerEncoder.from_pretrained("zihaojing/DQFormer-encoder")
```

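If you only need the raw Stage 1 weights (for example, to initialize your own module), the checkpoint can also be fetched from the Hub and read as a standard safetensors state dict without cloning the repo. A minimal sketch, assuming the weights file is named `model.safetensors` as in the Stage 2 config above:

```python
# Download the checkpoint from the Hub and load it as a plain state dict.
# Assumes the weights file is named model.safetensors (as referenced above).
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

ckpt_path = hf_hub_download(
    repo_id="zihaojing/DQFormer-encoder",
    filename="model.safetensors",
)
state_dict = load_file(ckpt_path)  # dict[str, torch.Tensor]
print(len(state_dict), "tensors in the Stage 1 checkpoint")
```
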
To reproduce Stage 1 training from scratch:

```bash
# Set up environment first (see repo README)
bash scripts/training/pretraining.sh
```

## Related Resources

| Resource | Link |
|----------|------|
| Pretrain Data | [zihaojing/DQFormer-pretrain-data](https://huggingface.co/datasets/zihaojing/DQFormer-pretrain-data) |
| SFT Data | [zihaojing/DQFormer-sft-data](https://huggingface.co/datasets/zihaojing/DQFormer-sft-data) |
| Full Model (Stage 2) | [zihaojing/DQFormer-model](https://huggingface.co/zihaojing/DQFormer-model) |
| Code | [selmiss/DQ-Former](https://github.com/selmiss/DQ-Former) |

## Citation

```bibtex
@inproceedings{jing2026edtformer,
  title={Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding},
  author={Jing, Zihao and Zeng, Qiuhao and Fang, Ruiyi and Sun, Yan and Wang, Boyu and Hu, Pingzhao},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}
```