Initial placeholder upload
- README.md +28 -0
- all_config.yaml +14 -0
- fsdp2_epoch_0/.placeholder +2 -0
- train_metadata.yaml +3 -0
README.md
ADDED
@@ -0,0 +1,28 @@
+---
+license: apache-2.0
+library_name: pytorch
+tags:
+- hrm
+- hierarchical-reasoning
+- text2sql
+---
+
+# HRM-XL Base Checkpoint
+
+A 1.2B-parameter hierarchical reasoning language model. Used as the base for fine-tuning in the [HRM-Text](https://github.com/sapientinc/HRM-Text) tutorial.
+
+## Architecture
+
+- 16 layers, hidden size 1536, 12 heads (head dim 128)
+- H_cycles=2, L_cycles=3 (dual-timescale recurrence)
+- RoPE positional encoding
+- Precision: bf16
+
+## Usage
+
+```bash
+huggingface-cli download SapientIntelligence/HRM-XL-base --local-dir ./ckpts/base
+python -u pretrain.py --config-name cfg_finetune_demo resume_from=./ckpts/base
+```
+
+See [github.com/sapientinc/HRM-Text](https://github.com/sapientinc/HRM-Text) for the full tutorial.
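For anyone scripting the download step from the README's Usage block, the same fetch can be done from Python with `huggingface_hub`'s `snapshot_download` (a minimal sketch; the repo id and local path are copied from the README above, and this helper is standard `huggingface_hub` API rather than part of this repo):

```python
# Download the checkpoint repo programmatically instead of via the CLI.
# Equivalent to:
#   huggingface-cli download SapientIntelligence/HRM-XL-base --local-dir ./ckpts/base
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="SapientIntelligence/HRM-XL-base",
    local_dir="./ckpts/base",
)
print(f"Checkpoint files downloaded to {local_path}")
```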
all_config.yaml
ADDED
@@ -0,0 +1,14 @@
+arch:
+  name: baselines.hrm_nocarry_more_bp_no_x@HierarchicalReasoningModel
+  head: lm_head@LMHead
+  n_layers: 16
+  hidden_size: 1536
+  num_heads: 12
+  expansion: 4
+  H_cycles: 2
+  L_cycles: 3
+  norm_type: pre
+  norm_eps: 1e-6
+  rope_theta: 10000.0
+  pos_emb_type: rope
+  init_type: lecun_normal
fsdp2_epoch_0/.placeholder
ADDED
@@ -0,0 +1,2 @@
+HRM-XL base checkpoint placeholder.
+Full checkpoint weights are distributed separately.
train_metadata.yaml
ADDED
@@ -0,0 +1,3 @@
+vocab_size: 65536
+max_seq_len: 4097
+total_length: 8748043
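If `total_length` is the token count of the packed training set (an assumption; the field is not documented in this upload), the metadata implies the rough number of full-length sequences per epoch:

```python
# Back-of-envelope epoch size from train_metadata.yaml.
# ASSUMPTION: total_length counts tokens in the packed dataset.
vocab_size = 65536      # 2**16, so token ids fit in uint16
max_seq_len = 4097      # possibly 4096 + 1 extra token for next-token targets
total_length = 8_748_043

full_sequences = total_length // max_seq_len
print(f"~{full_sequences} full sequences of {max_seq_len} tokens per epoch")
# -> ~2135 full sequences
```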