sync: snapshot at correct top-level path
results/current_snapshot_20260426.md
ADDED
@@ -0,0 +1,123 @@
# DNAThinker: current results snapshot (2026-04-26 00:40)

Hand-aggregated from `metrics.json` files; the slower
`build_results_table.py` will fill in once the in-flight jobs finish
and 226046 fires.

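A minimal sketch of what the hand aggregation amounts to, assuming each mode directory holds a flat `metrics.json` directly under it (the exact file location and key names are assumptions, not taken from the repo). The helper names `collect_metrics` and `to_markdown` are hypothetical and are not part of `build_results_table.py`.

```python
import json
from pathlib import Path


def collect_metrics(grid_dir, modes, keys):
    """Load <grid_dir>/<mode>/metrics.json per mode; None marks a missing file."""
    rows = {}
    for mode in modes:
        path = Path(grid_dir) / mode / "metrics.json"  # assumed layout
        if not path.is_file():
            rows[mode] = None  # e.g. the job is still in flight
            continue
        data = json.loads(path.read_text())
        rows[mode] = {key: data.get(key) for key in keys}
    return rows


def to_markdown(rows, keys):
    """Render the collected rows as a pipe table, one row per mode."""
    lines = ["| Mode | " + " | ".join(keys) + " |",
             "|" + "---|" * (len(keys) + 1)]
    for mode, values in rows.items():
        if values is None:
            cells = ["n/a"] * len(keys)
        else:
            cells = ["n/a" if values[k] is None else f"{values[k]:.3f}" for k in keys]
        lines.append(f"| {mode} | " + " | ".join(cells) + " |")
    return "\n".join(lines)
```

Modes whose file is missing are rendered as `n/a` rows rather than dropped, so an in-flight job stays visible in the table.
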
## T1 enhancer_generation grid (separatedQA, complete)

`runs/exp_t1_grid_separatedQA_20260424_154915/{zs_raw,zs_enriched,lora_raw,lora_enriched}/`

| Mode | parse | gc_err ↓ | len_ratio | FBD ↓ | spec ↑ | argmax ↑ | div | emb_cos |
|---|---|---|---|---|---|---|---|---|
| zs_raw | 1.000 | 0.0932 | 1.829 | **11.93** | 2.752 | 0.328 | 0.442 | 0.651 |
| zs_enriched | 1.000 | 0.0957 | 1.615 | **11.32** | 3.129 | 0.328 | 0.430 | 0.713 |
| lora_raw | 1.000 | **0.0698** | 3.642 | 29.27 | 3.236 | **0.844** | 0.203 | 0.880 |
| lora_enriched | 1.000 | 0.1023 | 3.897 | 32.50 | 2.753 | 0.578 | 0.278 | 0.837 |

Notable tension: **lora_raw wins gc_err + argmax_acc, but both LoRA modes
lose badly on FBD** (29-33 vs 11-12): LoRA generates sequences roughly 2×
longer than the zero-shot modes (len_ratio 3.6-3.9 vs 1.6-1.8) and drifts from
the real DNA distribution. Strong motivation for the Fusion-SFT →
Loop-SFT → SV-GSPO chain.

Per-cell-type FID exists in `*/genqual/genqual.json::per_cell_type`, but it
currently covers a single cell type (Ex) only since these are sample-128 evals.

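The snapshot does not define `gc_err` or `len_ratio`. The sketch below shows one plausible reading, assuming `gc_err` is the mean absolute gap in GC fraction between paired generated and reference sequences and `len_ratio` is mean generated length over mean reference length. Both definitions are assumptions and may differ from what the evaluation code computes.

```python
def gc_fraction(seq):
    """Fraction of G/C bases in a DNA string (case-insensitive)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def gc_err(generated, reference):
    """Mean absolute GC-fraction gap over paired sequences (assumed definition)."""
    gaps = [abs(gc_fraction(g) - gc_fraction(r)) for g, r in zip(generated, reference)]
    return sum(gaps) / len(gaps)


def len_ratio(generated, reference):
    """Mean generated length over mean reference length (assumed definition)."""
    mean_gen = sum(len(s) for s in generated) / len(generated)
    mean_ref = sum(len(s) for s in reference) / len(reference)
    return mean_gen / mean_ref


# Toy sequences for illustration only; not taken from any run.
generated = ["GGCC", "ATATATAT"]
reference = ["GGAT", "ATAT"]
print(gc_err(generated, reference), len_ratio(generated, reference))
```

Under this reading, a `len_ratio` near 1.0 means generated sequences match the reference length, so values of 3.6 and above point to substantial over-generation.
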
## T1 enhancer_generation grid (original-DEDUP, in flight as 226007)

`runs/exp_t1_grid_original_DEDUP_20260425_165628/{zs_raw,zs_enriched,lora_raw,lora_enriched}/`

| Mode | parse | gc_err | len_ratio | status |
|---|---|---|---|---|
| zs_raw | 1.000 | 0.1126 | 1.948 | ✅ |
| zs_enriched | 1.000 | 0.1090 | 2.113 | ✅ |
| lora_raw | 1.000 | 0.1529 | 3.717 | ✅ |
| lora_enriched | n/a | n/a | n/a | 226007 in flight |

Genqual not yet computed for this split.

## T2 pair_aux ablation (production, n=128 each)

`runs/exp_t2_pair_aux_{none,supcon_pair,tier_aware_supcon}_20260425_192434_prod/`

| Variant | acc | F1 | precision | recall | parse |
|---|---|---|---|---|---|
| none (no aux) | **0.773** | **0.808** | 0.701 | 0.953 | 1.000 |
| supcon_pair | 0.719 | 0.710 | 0.733 | 0.688 | 1.000 |
| tier_aware_supcon | 0.711 | 0.776 | 0.634 | 1.000 | 1.000 |

⚠️ Small (n=128) eval; the no-aux baseline edges out tier_aware_supcon
on F1 (0.808 vs 0.776). The full-data run (#42, blocked on dataset rewrite
225994) will give the real number.

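As a quick consistency check on the T2 table, F1 can be recomputed from the reported precision and recall with the standard harmonic-mean formula. The inputs are the three-decimal values from the table, so agreement is only expected to within rounding.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the T2 pair_aux table.
rows = {
    "none": (0.701, 0.953, 0.808),
    "supcon_pair": (0.733, 0.688, 0.710),
    "tier_aware_supcon": (0.634, 1.000, 0.776),
}
for name, (p, r, reported) in rows.items():
    print(f"{name}: recomputed F1={f1_score(p, r):.3f}, reported={reported:.3f}")
```

All three recomputed values match the reported F1 to three decimals, so the table is internally consistent.
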
## Aligner loss ablation (7-cell production)

`runs/exp_aligner_t1_{infonce,lit,siglip}_20260425_210442_7cell/`

| Variant | val/train ratio | Wandb |
|---|---|---|
| lit | 1.22 (best generalisation gap) | dnathinker-align |
| infonce | 1.27 | |
| siglip | (in flight, 226025) | |

Per memory note `reference_benchmark_suite`: **lit overfits less than
infonce** on the 7-cell strat7c split (ratio 1.22 < 1.27).

## Oracle weights

| Path | Size | val_pearson_mean | val_spearman_mean |
|---|---|---|---|
| `runs/exp_oracle_ds_7cell_fdr_both_20260424_162210/oracle.pt` | 1.4 MB | 0.136 | 0.086 |
| `runs/exp_oracle_ds_7cell_100k_20260424_003143/oracle.pt` | 1.4 MB | (debug) | |
| `runs/exp_oracle_enformer_full_<jid>/` | (in flight, 225956, ~30h elapsed) | | |

Per-cell pearson range (DeepSTARR-7cell fdr_both): -0.017 (Mic) to 0.363 (Ast).

## Currently in flight (squeue snapshot, 2026-04-26 00:40)

| JID | Job | Elapsed | State |
|---|---|---|---|
| 225956 | oracle_enformer_full | 1d 8h | RUNNING |
| 225994 | T2 dataset rewrite (225994) | 20h+ | RUNNING (blocks #42) |
| 226007 | T1 task_prog (original-DEDUP) | 7h 42m | RUNNING |
| 226025 | aligner T1 siglip 7-cell | 3h 34m | RUNNING |
| 226037 | arch_llava (control) | 2h 14m | RUNNING (step ~110/4375) |
| 226038 | encoder NTv3-8m | 2h 14m | RUNNING (step ~110/4375) |
| 226043 | sv_gspo_v5 (NTv3-650m, fixed) | ~2 min | RUNNING (loading NTv3) |
| 226044 | arch_unified_ntp (messages-fix) | ~2 min | RUNNING |
| 226045 | arch_unified_mdlm (messages-fix) | n/a | PENDING (Resources) |
| 226046 | arch_ablation_table | n/a | PENDING (Dependency afterany 226037,226038,226043,226044,226045) |

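The `afterany` dependency on 226046 means the job starts once every listed upstream job has terminated, whether it succeeded or failed. A small sketch of building that flag for `sbatch` follows; `afterany_flag` is a hypothetical helper and `arch_ablation_table.sbatch` is a hypothetical script name used only for illustration.

```python
def afterany_flag(job_ids):
    """Build sbatch's --dependency value that waits for every listed job to end,
    regardless of exit status (Slurm's `afterany` semantics)."""
    if not job_ids:
        raise ValueError("need at least one upstream job id")
    return "--dependency=afterany:" + ":".join(str(j) for j in job_ids)


upstream = [226037, 226038, 226043, 226044, 226045]
# Hypothetical script name; the real submission script is not shown in the snapshot.
command = ["sbatch", afterany_flag(upstream), "arch_ablation_table.sbatch"]
print(" ".join(command))
```

Because `afterany` does not require success, the aggregator can fire with some run directories incomplete, so missing rows in its output are worth checking against the failure list.
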
## Cancelled / failed (with root cause + fix commit)

| JID | Why | Fix |
|---|---|---|
| 226030/226031/226032 | unk_token=None Qwen3 BPE | `8acf261` (committed pre-resubmit) |
| 226033/226034 | LoRA leaf-view in-place op | `b43c106` |
| 226034 (resub) | SDPA mask dtype mismatch | `1c5e270` |
| 226039 | sv_gspo encoder size mismatch (650m ckpt vs 8m model) | resubmit 226043 with NTv3-650m |
| 226040/226041 | unified loss=0 (UnifiedCollator read flat fields, dataset emits messages) | `9f6706e` (this turn) |

## Aggregator runs

```bash
# After 226037, 226038, 226043, 226044, and 226045 all finish, 226046 auto-fires.
# Expected outputs:
#   results/arch_ablation_20260425_220006.md
#   results/encoder_ablation_20260425.md

# Manual rerun (faster after run dirs land):
pixi run python scripts/build_results_table.py \
  --runs-root /extra/.../runs \
  --name-filter 'exp_t1_arch_*_20260425_220006' \
  --output results/arch_ablation_20260425_220006.md \
  --per-cell-type --per-tier
```

## Open questions

1. Does the unified-mode messages-schema fix actually produce non-zero
   loss now? Watching 226044; the first step's loss should appear in ~10 min.
2. Should we push oracle.pt + the v5 warm-start (7.2 GB) + the T1 grid
   metrics to HF for two-machine sync? Need `HF_TOKEN`.
3. T2 #42 is still blocked on 225994 finishing the rewrite of
   `train.pair_prediction.jsonl` (~17% done currently).