abcd1927 committed on
Commit 4a3b431 · verified · 1 Parent(s): 6f4f58d

Initial placeholder upload
README.md ADDED
@@ -0,0 +1,28 @@
+ ---
+ license: apache-2.0
+ library_name: pytorch
+ tags:
+ - hrm
+ - hierarchical-reasoning
+ - text2sql
+ ---
+
+ # HRM-XL Base Checkpoint
+
+ A 1.2B-parameter hierarchical reasoning language model, used as the base checkpoint for fine-tuning in the [HRM-Text](https://github.com/sapientinc/HRM-Text) tutorial.
+
+ ## Architecture
+
+ - 16 layers, hidden size 1536, 12 heads (head dim 128)
+ - H_cycles=2, L_cycles=3 (dual-timescale recurrence)
+ - RoPE positional encoding
+ - Precision: bf16
+
+ ## Usage
+
+ ```bash
+ huggingface-cli download SapientIntelligence/HRM-XL-base --local-dir ./ckpts/base
+ python -u pretrain.py --config-name cfg_finetune_demo resume_from=./ckpts/base
+ ```
+
+ See [github.com/sapientinc/HRM-Text](https://github.com/sapientinc/HRM-Text) for the full tutorial.
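The README lists RoPE positional encoding with head dim 128 (hidden size 1536 / 12 heads) and, per `all_config.yaml`, `rope_theta: 10000.0`. A minimal stdlib sketch of the standard rotary-embedding frequency bands under those numbers; the function names here are illustrative, not from the repo:

```python
import math

def rope_frequencies(head_dim: int = 128, theta: float = 10000.0):
    # One rotation frequency per pair of feature dimensions.
    return [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]

def rotate_pair(x: float, y: float, pos: int, freq: float):
    # Rotate one (x, y) feature pair by pos * freq radians.
    angle = pos * freq
    c, s = math.cos(angle), math.sin(angle)
    return x * c - y * s, x * s + y * c

freqs = rope_frequencies()
print(len(freqs))   # 64 frequency bands for head dim 128
print(freqs[0])     # 1.0 -- the fastest-rotating pair
```

At position 0 the rotation is the identity, so queries and keys at the start of the sequence are unchanged.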
all_config.yaml ADDED
@@ -0,0 +1,14 @@
+ arch:
+ name: baselines.hrm_nocarry_more_bp_no_x@HierarchicalReasoningModel
+ head: lm_head@LMHead
+ n_layers: 16
+ hidden_size: 1536
+ num_heads: 12
+ expansion: 4
+ H_cycles: 2
+ L_cycles: 3
+ norm_type: pre
+ norm_eps: 1e-6
+ rope_theta: 10000.0
+ pos_emb_type: rope
+ init_type: lecun_normal
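A quick consistency check on the config above: the README's head dim of 128 follows directly from `hidden_size` and `num_heads`. This sketch inlines a subset of `all_config.yaml` and assumes PyYAML is available:

```python
import yaml  # third-party: pip install pyyaml

# Subset of all_config.yaml, inlined for a self-contained example.
cfg_text = """
arch:
  n_layers: 16
  hidden_size: 1536
  num_heads: 12
  expansion: 4
"""

cfg = yaml.safe_load(cfg_text)["arch"]
head_dim = cfg["hidden_size"] // cfg["num_heads"]
print(head_dim)  # 128, matching the README's "head dim 128"
```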
fsdp2_epoch_0/.placeholder ADDED
@@ -0,0 +1,2 @@
+ HRM-XL base checkpoint placeholder.
+ Full checkpoint weights are distributed separately.
train_metadata.yaml ADDED
@@ -0,0 +1,3 @@
+ vocab_size: 65536
+ max_seq_len: 4097
+ total_length: 8748043
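The metadata's units are not documented here; assuming `total_length` is a token count, simple arithmetic shows how many full `max_seq_len` windows the training data spans:

```python
# Values from train_metadata.yaml; the token-count interpretation of
# total_length is an assumption, not stated in the repo.
max_seq_len = 4097
total_length = 8748043

full_sequences, remainder = divmod(total_length, max_seq_len)
print(full_sequences, remainder)  # 2135 948
```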