NorthernTribe-Research commited on
Commit
d671937
·
verified ·
1 Parent(s): 6853460

Apply curated official model card (SOTA 0.5B) with full project context.

Browse files
Files changed (1) hide show
  1. README.md +223 -388
README.md CHANGED
@@ -1,490 +1,325 @@
1
  ---
 
 
 
2
  library_name: transformers
3
  pipeline_tag: text-generation
4
  tags:
5
  - mathematics
6
- - conjecture
 
7
  - reasoning
8
- - peft
9
  - lora
 
 
10
  - lean
11
- base_model: Qwen/Qwen2.5-0.5B-Instruct
12
- base_model_relation: adapter
 
 
 
 
 
13
  ---
14
 
15
- # NorthernTribe-Research/math-conjecture-model
16
 
17
- ## Parameter Count Visualization
18
 
19
- | Metric | Value |
20
- | --- | --- |
21
- | Total parameters | **502.831M** (`502,830,976`) |
22
- | Trainable parameters | **8.798M** (`8,798,208`) |
23
- | Frozen parameters | **494.033M** (`494,032,768`) |
24
- | Trainable ratio | **1.7497%** |
25
 
26
- `[#---------------------------]` trainable share
27
 
28
- ## Training Reference
29
 
30
- - Summary source: `workspace/runs/math-conjecture-sota-050b-quick/training_summary.json`
31
- - This card is generated from the repository README plus the latest training summary.
32
 
33
- ## Project Documentation
34
 
35
- <details>
36
- <summary>Expand full project README context</summary>
 
 
 
 
37
 
38
- This repository builds a merged dataset for training math AI systems aimed at
39
- **unsolved conjecture reasoning**. The v1 pipeline combines local conjecture
40
- data with curated open-license Hugging Face datasets (competition, structured
41
- reasoning, and formal proof corpora).
42
 
43
- It now also reaches into broader open math sources such as OpenR1-Math-220k,
44
- FineProofs-SFT, LeanStatement_CoT, NuminaMath-LEAN, and DeepSeek-Prover-V1 to
45
- better cover proof traces, theorem formalization, and reasoning-heavy
46
- competition data.
47
 
48
- ## Repository Layout
49
 
50
- ```text
51
- configs/
52
- source_registry.yaml
53
- data/
54
- raw/
55
- unsolved_conjectures.jsonl
56
- processed/
57
- train.jsonl
58
- validation.jsonl
59
- test.jsonl
60
- manifest.json
61
- interim/
62
- discovery.json
63
- pull_report.json
64
- normalized_rows.jsonl
65
- merged_train.jsonl
66
- merged_validation.jsonl
67
- merged_test.jsonl
68
- normalize_stats.json
69
- merge_stats.json
70
- validation_report.json
71
- releases/
72
- v1/
73
- train.parquet
74
- validation.parquet
75
- test.parquet
76
- manifest.json
77
- excluded_sources.json
78
- dataset_card.md
79
- size_report.json
80
- push_report.json
81
- schemas/
82
- conjecture_record.schema.json
83
- training_example.schema.json
84
- normalized_training_row.schema.json
85
- scripts/
86
- build_dataset.py
87
- validate_dataset.py
88
- pipeline.py
89
- manage_hf_bucket.py
90
- release_and_space_run.py
91
- model_development/
92
- configs/
93
- math_conjecture_sota.yaml
94
- math_conjecture_sota_state_of_art.yaml
95
- math_conjecture_sft.yaml
96
- math_conjecture_scratch.yaml
97
- math_conjecture_scratch_smoke.yaml
98
- qwen25_math_sota.yaml
99
- scripts/
100
- train_sft.py
101
- train_sota.py
102
- train_scratch.py
103
- eval_sota.py
104
- generate_rft_data.py
105
- merge_and_push.py
106
- requirements.txt
107
- README.md
108
- space_trainer/
109
- app.py
110
- configs/
111
- math_conjecture_sota.yaml
112
- math_conjecture_sota_state_of_art.yaml
113
- qwen25_math_sota.yaml
114
- scripts/
115
- train_sota.py
116
- eval_sota.py
117
- requirements.txt
118
- README.md
119
- space_conjecture_lab/
120
- app.py
121
- requirements.txt
122
- README.md
123
- docs/
124
- math_conjecture_lean_ai_rollout_runbook.md
125
- model_sota_strategy_2026-03-23.md
126
- state_of_art_math_blueprint_2026-03-25.md
127
- ```
128
 
129
- ## Existing Local Dataset Build
 
 
130
 
131
- ```bash
132
- python scripts/build_dataset.py \
133
- --seed-path data/raw/unsolved_conjectures.jsonl \
134
- --output-dir data/processed \
135
- --split-seed 17
136
 
137
- python scripts/validate_dataset.py \
138
- --seed-path data/raw/unsolved_conjectures.jsonl \
139
- --processed-dir data/processed
140
- ```
141
 
142
- ## Merged Corpus Pipeline (v1)
 
 
 
 
 
143
 
144
- Use the project virtualenv:
 
145
 
146
- ```bash
147
- .venv/bin/python scripts/pipeline.py discover
148
- .venv/bin/python scripts/pipeline.py pull
149
- .venv/bin/python scripts/pipeline.py normalize
150
- .venv/bin/python scripts/pipeline.py merge
151
- .venv/bin/python scripts/pipeline.py validate
152
- .venv/bin/python scripts/pipeline.py pack
153
- .venv/bin/python scripts/pipeline.py push
154
- ```
155
 
156
- Default publish target:
157
- - HF account: from env (`HF_USERNAME`/`HF_NAMESPACE`) or `huggingface-api-key.json`
158
- - dataset repo: `HF_DATASET_REPO_ID` or fallback `<username>/math-conjecture-training-corpus`
159
- - visibility: public dataset repo
160
 
161
- The registry keeps these broader sources capped and split-aware so the corpus
162
- can grow materially without assuming unrealistic storage or training budgets.
163
- License policy checks now evaluate both dataset card fields (`license` and
164
- `license_name`), handle list-valued license metadata, and still block unresolved
165
- custom/unknown licenses.
166
 
167
- ## Hugging Face Buckets
 
168
 
169
- Current Hugging Face Hub docs expose storage buckets through both the CLI and
170
- Python API.
171
 
172
- CLI commands documented today:
173
- - `hf buckets create BUCKET_ID [--private] [--exist-ok]`
174
- - `hf buckets info BUCKET_ID`
175
- - `hf buckets list [NAMESPACE|BUCKET_ID]`
176
- - `hf buckets delete BUCKET_ID`
177
- - `hf buckets move FROM_ID TO_ID`
178
- - `hf buckets remove ...`
179
- - `hf buckets sync ...`
180
- - `hf buckets cp ...`
181
 
182
- Python bucket helpers are documented as available since `huggingface_hub`
183
- `v1.5.0`, including `create_bucket`, `list_bucket_tree`, and `sync_bucket`.
184
 
185
- This checkout currently has `huggingface_hub 1.4.1`, so the bucket CLI/API is
186
- not available here yet. The helper script below detects that mismatch and
187
- prints a clear upgrade/permission error instead of failing silently.
188
 
189
- Examples:
 
 
 
 
 
 
190
 
191
- ```bash
192
- # Check whether a bucket exists in the current namespace
193
- python scripts/manage_hf_bucket.py status my-bucket --namespace NorthernTribe-Research
194
 
195
- # Create a private bucket
196
- python scripts/manage_hf_bucket.py create my-bucket --namespace NorthernTribe-Research --private
197
 
198
- # Ensure a bucket exists, creating it if needed
199
- python scripts/manage_hf_bucket.py ensure my-bucket --namespace NorthernTribe-Research --private
 
 
 
200
 
201
- # Upstream CLI, when using a newer huggingface_hub install
202
- hf buckets create NorthernTribe-Research/my-bucket --private
203
- hf buckets list NorthernTribe-Research
204
- hf buckets info NorthernTribe-Research/my-bucket
205
- ```
206
 
207
- If you want the bucket commands in the CLI on this machine, upgrade the Hub
208
- package to a version at or above `huggingface_hub>=1.5.0`.
209
 
210
- ## hf-mount Setup (Standard Workflow)
 
 
 
 
211
 
212
- Use `scripts/hf_mount_setup.sh` as the project-standard entrypoint for
213
- installing and operating `hf-mount`.
214
 
215
- One-time bootstrap (installs binaries, persists PATH, writes runtime env):
216
 
217
- ```bash
218
- scripts/hf_mount_setup.sh bootstrap --persist-path
219
- ```
220
 
221
- This writes helper defaults to:
222
- - `workspace/runtime/hf_mount.env`
 
 
 
223
 
224
- In new shells, load project defaults:
225
 
226
- ```bash
227
- source workspace/runtime/hf_mount.env
228
- ```
229
 
230
- `hf_mount_setup.sh` now writes and maintains these Hugging Face project defaults:
231
-
232
- - `HF_NAMESPACE`
233
- - `HF_DATASET_REPO_ID`
234
- - `HF_MODEL_REPO_ID`
235
- - `HF_TRAINER_SPACE_REPO_ID`
236
- - `HF_LAB_SPACE_REPO_ID`
237
- - `HF_OPS_BUCKET_ID`
238
-
239
- Common operations:
240
-
241
- ```bash
242
- # Mount the dataset repo (read-only)
243
- scripts/hf_mount_setup.sh mount-repo \
244
- --repo "datasets/${HF_DATASET_REPO_ID}" \
245
- --target workspace/hf_mounts/dataset
246
-
247
- # Mount the model repo (read-only)
248
- scripts/hf_mount_setup.sh mount-repo \
249
- --repo "${HF_MODEL_REPO_ID}" \
250
- --target workspace/hf_mounts/model
251
-
252
- # Mount a Space repo (read-only)
253
- scripts/hf_mount_setup.sh mount-repo \
254
- --repo "spaces/${HF_TRAINER_SPACE_REPO_ID}" \
255
- --target workspace/hf_mounts/space_trainer
256
-
257
- # Mount a bucket (read-write by default)
258
- scripts/hf_mount_setup.sh mount-bucket \
259
- --bucket "${HF_OPS_BUCKET_ID}" \
260
- --target workspace/hf_mounts/math_conjecture_ops
261
-
262
- # Inspect and stop mounts
263
- scripts/hf_mount_setup.sh status
264
- scripts/hf_mount_setup.sh stop --target workspace/hf_mounts/dataset
265
- ```
266
 
267
- If your host requires privileged NFS/FUSE mounts, add `--sudo` to mount/stop
268
- commands.
269
 
270
- ## Push-And-Run Orchestrator
 
 
 
 
 
 
271
 
272
- Use `scripts/release_and_space_run.py` to execute the full promotion flow for:
273
 
274
- - dataset repo: `HF_DATASET_REPO_ID` (default: `NorthernTribe-Research/math-conjecture-training-corpus`)
275
- - space repo: `HF_TRAINER_SPACE_REPO_ID` (default: `NorthernTribe-Research/math_trainer`)
276
- - model repo: `HF_MODEL_REPO_ID` (default: `NorthernTribe-Research/math-conjecture-model`)
277
- - ops bucket: `HF_OPS_BUCKET_ID` (default: `NorthernTribe-Research/math-conjecture-ops`)
278
 
279
- Install/upgrade tooling in the virtualenv:
280
 
281
- ```bash
282
- .venv/bin/python -m pip install -r model_development/requirements.txt
283
- ```
284
 
285
- Run the rollout sequence:
 
 
 
 
 
286
 
287
- ```bash
288
- .venv/bin/python scripts/release_and_space_run.py prepare
289
- .venv/bin/python scripts/release_and_space_run.py bucket
290
- .venv/bin/python scripts/release_and_space_run.py publish-dataset
291
- .venv/bin/python scripts/release_and_space_run.py deploy-space
292
- .venv/bin/python scripts/release_and_space_run.py run-space
293
- .venv/bin/python scripts/release_and_space_run.py verify
294
- ```
295
 
296
- `run-space` now retries transient client/runtime failures by default. Tune with
297
- `--max-retries` and `--retry-sleep-seconds`.
298
 
299
- Non-destructive Space API probe (preflight only):
300
 
301
- ```bash
302
- .venv/bin/python scripts/release_and_space_run.py run-space \
303
- --preflight-only \
304
- --no-push-to-hub \
305
- --no-run-eval \
306
- --max-stages 1 \
307
- --allow-failed-result
308
- ```
309
-
310
- Optional safety pins for dataset publish/verify:
311
 
312
- ```bash
313
- .venv/bin/python scripts/release_and_space_run.py publish-dataset \
314
- --expected-created-at 2026-03-23T11:03:54+00:00 \
315
- --expected-total-rows 473349
316
- ```
317
 
318
- `verify` is strict by default and expects a successful `run-space` report plus
319
- model artifacts. For baseline infrastructure checks without a completed run:
320
 
321
- ```bash
322
- .venv/bin/python scripts/release_and_space_run.py verify \
323
- --no-require-space-run-success \
324
- --no-require-model-artifacts
325
- ```
 
 
326
 
327
- Generated reports are written to `data/releases/v1/`:
328
 
329
- - `promotion_prepare_report.json`
330
- - `promotion_bucket_report.json`
331
- - `promotion_dataset_publish_report.json`
332
- - `promotion_space_deploy_report.json`
333
- - `promotion_space_run_report.json`
334
- - `promotion_verify_report.json`
335
 
336
- ## Model Development (Lean + SOTA Math Profiles)
337
 
338
- The model fine-tuning workspace is under `model_development/`.
339
- The SOTA curriculum now profiles responses across simple, intermediate,
340
- advanced, and Lean-formalized bands.
341
 
342
- ```bash
343
- .venv/bin/python -m pip install -r model_development/requirements.txt
 
344
 
345
- .venv/bin/python model_development/scripts/train_sft.py \
346
- --config model_development/configs/math_conjecture_sft.yaml
347
 
348
- .venv/bin/python model_development/scripts/train_sft.py \
349
- --config model_development/configs/math_conjecture_sft.yaml \
350
- --max-train-samples 120000
 
 
 
351
 
352
- .venv/bin/python model_development/scripts/train_sota.py \
353
- --config model_development/configs/math_conjecture_sota.yaml
354
 
355
- .venv/bin/python model_development/scripts/train_sota.py \
356
- --config model_development/configs/qwen25_math_sota.yaml
357
 
358
- .venv/bin/python model_development/scripts/train_sota.py \
359
- --config model_development/configs/math_conjecture_sota_state_of_art.yaml
 
 
 
 
360
 
361
- .venv/bin/python model_development/scripts/train_scratch.py \
362
- --config model_development/configs/math_conjecture_scratch.yaml \
363
- --init-only
 
 
 
 
 
364
 
365
- .venv/bin/python model_development/scripts/train_scratch.py \
366
- --config model_development/configs/math_conjecture_scratch_smoke.yaml \
367
- --dry-run
368
  ```
369
 
370
- The SOTA eval report now includes `difficulty_band_metrics`,
371
- `response_profile_metrics`, and `simple_to_lean`.
372
- The training summary now records model parameter stats under
373
- `model.parameter_counts` (total/trainable/frozen + ratio).
374
- When `push_to_hub` is enabled, `train_sota.py` now also updates model
375
- `README.md` on Hugging Face with a parameter-count visualization table and
376
- trainable-share bar.
377
- The eval flow also supports consensus/verifier-aware selection metrics such as
378
- `selected_pass_at_k`, `consensus_rate`, and `consensus_pass_at_k`.
379
- State-of-art eval now also supports stricter grading controls:
380
- `--allow-substring-match` (off by default) and optional SymPy symbolic checks
381
- (`symbolic_verifier_enabled` reported in eval output).
382
-
383
- Self-improvement (rejection-sampling) data generation:
384
-
385
- ```bash
386
- .venv/bin/python model_development/scripts/generate_rft_data.py \
387
- --config model_development/configs/math_conjecture_sota_state_of_art.yaml \
388
- --adapter-path model_development/runs/math-conjecture-sota-state-of-art/final_adapter \
389
- --input-file data/releases/v1/train.parquet \
390
- --output-file model_development/runs/math-conjecture-rft/rft_train.parquet \
391
- --k 8 \
392
- --max-samples 2000
393
- ```
394
 
395
- SOTA blueprint and implementation notes:
396
- - `docs/state_of_art_math_blueprint_2026-03-25.md`
 
 
 
 
 
397
 
398
- Optional adapter merge and model publish:
399
 
400
- ```bash
401
- .venv/bin/python model_development/scripts/merge_and_push.py \
402
- --adapter-path model_development/runs/math-conjecture-sota/final_adapter \
403
- --output-dir model_development/merged/math-conjecture-model \
404
- --push-to-hub \
405
- --repo-id NorthernTribe-Research/math-conjecture-model
 
 
 
 
406
  ```
407
 
408
- The publish flow now auto-generates a Hugging Face model card `README.md` with
409
- parameter-count visualization (total/trainable/frozen + trainable-share bar),
410
- so model size is visible directly on the Hub page.
411
 
412
- ### llama.cpp Inference (GGUF)
413
 
414
- `llama.cpp` inference is validated in this workspace for the trained
415
- math-conjecture adapter flow.
416
 
417
- Build `llama.cpp`:
 
 
418
 
419
- ```bash
420
- git clone --depth 1 https://github.com/ggml-org/llama.cpp workspace/llama.cpp
421
- cmake -S workspace/llama.cpp -B workspace/llama.cpp/build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS
422
- cmake --build workspace/llama.cpp/build -j
423
- ```
424
 
425
- Merge adapter to full model, then convert to GGUF:
426
 
427
- ```bash
428
- .venv/bin/python model_development/scripts/merge_and_push.py \
429
- --adapter-path workspace/runs/math-conjecture-sota-050b-quick/final_adapter \
430
- --output-dir workspace/runs/math-conjecture-sota-050b-quick/merged_model
431
 
432
- .venv/bin/python workspace/llama.cpp/convert_hf_to_gguf.py \
433
- workspace/runs/math-conjecture-sota-050b-quick/merged_model \
434
- --outfile workspace/runs/math-conjecture-sota-050b-quick/math-conjecture-sota-050b-f16.gguf \
435
- --outtype f16
436
- ```
437
 
438
- Optional quantization:
 
 
 
 
 
439
 
440
- ```bash
441
- ./workspace/llama.cpp/build/bin/llama-quantize \
442
- workspace/runs/math-conjecture-sota-050b-quick/math-conjecture-sota-050b-f16.gguf \
443
- workspace/runs/math-conjecture-sota-050b-quick/math-conjecture-sota-050b-q4_k_m.gguf \
444
- Q4_K_M
445
- ```
446
 
447
- Run one-shot inference via helper script:
448
 
449
- ```bash
450
- scripts/llama_cpp_infer.sh \
451
- --model workspace/runs/math-conjecture-sota-050b-quick/math-conjecture-sota-050b-q4_k_m.gguf \
452
- --prompt "2+2=" \
453
- --n-predict 8
454
- ```
455
 
456
- You can also save output:
 
 
 
457
 
458
- ```bash
459
- scripts/llama_cpp_infer.sh \
460
- --model workspace/runs/math-conjecture-sota-050b-quick/math-conjecture-sota-050b-f16.gguf \
461
- --prompt "Solve: a+b=10 and a-b=4. Return JSON with keys a and b only." \
462
- --n-predict 64 \
463
- --out workspace/runs/math-conjecture-sota-050b-quick/llama_cpp_inference.json.txt
464
- ```
465
 
466
- ## Hugging Face Space Trainer
467
 
468
- `space_trainer/` contains a GPU Space app that runs staged training and pushes
469
- artifacts to `NorthernTribe-Research/math-conjecture-model`.
 
 
 
 
 
470
 
471
- ## Conjecture Showcase Space
472
 
473
- `space_conjecture_lab/` contains a separate Lean+AI conjecture analysis Space
474
- that:
475
 
476
- - loads evidence from `NorthernTribe-Research/math-conjecture-training-corpus`,
477
- - attempts conjecture analysis with `NorthernTribe-Research/math-conjecture-model`,
478
- - falls back to a base model when needed,
479
- - always displays explicit work traces, prompt text, and a Lean stub in UI.
 
 
 
 
 
480
 
481
- ## Policy
482
 
483
- - Include only open, known-license, non-gated datasets from the registry.
484
- - Exclude unknown-license and gated datasets in v1.
485
- - Prefer capped, high-signal subsets for very large sources so training stays
486
- practical while coverage expands.
487
- - Keep release compact with deterministic filtering, de-duplication, and
488
- split assignment by hashed prompt.
489
 
490
- </details>
 
1
  ---
2
+ language:
3
+ - en
4
+ license: apache-2.0
5
  library_name: transformers
6
  pipeline_tag: text-generation
7
  tags:
8
  - mathematics
9
+ - conjectures
10
+ - theorem-proving
11
  - reasoning
12
+ - qlora
13
  - lora
14
+ - peft
15
+ - formal-math
16
  - lean
17
+ - research
18
+ base_model: "Qwen/Qwen2.5-0.5B-Instruct"
19
+ datasets:
20
+ - NorthernTribe-Research/math-conjecture-training-corpus
21
+ model-index:
22
+ - name: math-conjecture-model
23
+ results: []
24
  ---
25
 
26
+ # Math Conjecture SOTA 0.5B
27
 
28
+ Math Conjecture SOTA 0.5B is a research-focused language model adapted for **mathematical reasoning**, **conjecture analysis**, **proof-style generation**, and **formalization-aware responses**. It is part of the broader **Math Conjecture Training Corpus** effort, which builds open-license training pipelines for unsolved conjectures, structured reasoning, competition mathematics, proof traces, and Lean-oriented theorem workflows.
29
 
30
+ This checkpoint is intended for research and experimentation around long-form mathematical reasoning rather than proof certification.
 
 
 
 
 
31
 
32
+ ---
33
 
34
+ ## Model Details
35
 
36
+ ### Model description
 
37
 
38
+ This model is fine-tuned to produce more structured mathematical outputs, including:
39
 
40
+ - intuition-first explanations
41
+ - stepwise proof sketches
42
+ - conjecture decomposition
43
+ - theorem-style reasoning
44
+ - informal-to-formal transition hints
45
+ - Lean-aware reasoning patterns
46
 
47
+ The surrounding project supports multiple development paths including supervised fine-tuning, state-of-the-art math profiles, scratch initialization, evaluation, self-improvement data generation, adapter merge, Hub publishing, and local `llama.cpp` inference after conversion.
 
 
 
48
 
49
+ ### Model type
 
 
 
50
 
51
+ Parameter-efficient fine-tuned causal language model for math reasoning.
52
 
53
+ ### Hub repo
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
54
 
55
+ - **Model repo:** `NorthernTribe-Research/math-conjecture-model`
56
+ - **Dataset repo:** `NorthernTribe-Research/math-conjecture-training-corpus`
57
+ - **Trainer Space repo:** `NorthernTribe-Research/math_trainer`
58
 
59
+ ---
 
 
 
 
60
 
61
+ ## Parameter Count Visualization
 
 
 
62
 
63
+ | Metric | Value |
64
+ |---|---:|
65
+ | Total parameters | 502.831M (502,830,976) |
66
+ | Trainable parameters | 8.798M (8,798,208) |
67
+ | Frozen parameters | 494.033M (494,032,768) |
68
+ | Trainable ratio | 1.7497% |
69
 
70
+ **Trainable share:**
71
+ `[#---------------------------] 1.7497%`
72
 
73
+ This checkpoint uses a parameter-efficient adaptation setup in which only a small fraction of model weights are trainable while the vast majority remain frozen. The project documentation explicitly states that training summaries record `model.parameter_counts` with total, trainable, frozen, and ratio fields, and that the Hub README is auto-updated with this visualization when `push_to_hub` is enabled.
 
 
 
 
 
 
 
 
74
 
75
+ ---
 
 
 
76
 
77
+ ## Training Reference
 
 
 
 
78
 
79
+ - **Summary source:** `workspace/runs/math-conjecture-sota-050b-quick/training_summary.json`
80
+ - This model card is derived from the repository README together with the latest training summary. The repo also includes a helper for refreshing the model card directly from those sources: `scripts/update_model_card.py`.
81
 
82
+ ---
 
83
 
84
+ ## Intended Uses
 
 
 
 
 
 
 
 
85
 
86
+ ### Direct use
 
87
 
88
+ This model is suitable for:
 
 
89
 
90
+ - mathematical reasoning research
91
+ - conjecture exploration demos
92
+ - proof-sketch generation
93
+ - theorem-style answer generation
94
+ - reasoning benchmark experiments
95
+ - formalization-oriented prompting
96
+ - research prototypes on Hugging Face Spaces
97
 
98
+ ### Downstream use
 
 
99
 
100
+ Potential downstream uses include:
 
101
 
102
+ - math-focused copilots
103
+ - conjecture analysis interfaces
104
+ - educational proof assistants
105
+ - formal/informal bridge systems
106
+ - evaluation pipelines for reasoning-heavy LLMs
107
 
108
+ ### Out-of-scope use
 
 
 
 
109
 
110
+ This model is **not** intended to be treated as:
 
111
 
112
+ - a formal theorem prover
113
+ - a replacement for proof assistants
114
+ - a certified symbolic solver
115
+ - a source of guaranteed-correct proofs
116
+ - an authoritative system for high-stakes mathematical claims
117
 
118
+ ---
 
119
 
120
+ ## Training Data
121
 
122
+ This model is part of a larger corpus-building effort designed for training math AI systems aimed at **unsolved conjecture reasoning**. The v1 dataset pipeline combines local conjecture data with curated open-license Hugging Face datasets covering competition mathematics, structured reasoning, and formal proof corpora. The repository also documents broader open math sources such as:
 
 
123
 
124
+ - OpenR1-Math-220k
125
+ - FineProofs-SFT
126
+ - LeanStatement_CoT
127
+ - NuminaMath-LEAN
128
+ - DeepSeek-Prover-V1
129
 
130
+ These sources are included to improve coverage across proof traces, theorem formalization, and reasoning-heavy mathematical examples. The project documentation also emphasizes open-license, practical-size, and deterministic filtering choices in the corpus pipeline.
131
 
132
+ ---
 
 
133
 
134
+ ## Training Procedure
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
135
 
136
+ The model-development workspace supports:
 
137
 
138
+ - SFT training
139
+ - SOTA math profile training
140
+ - scratch initialization
141
+ - scratch dry-runs
142
+ - self-improvement / rejection-sampling data generation
143
+ - evaluation with richer reasoning metrics
144
+ - adapter merge and Hub publish flows
145
 
146
+ The SOTA curriculum profiles responses across:
147
 
148
+ - simple
149
+ - intermediate
150
+ - advanced
151
+ - Lean-formalized bands
152
 
153
+ ### Evaluation capabilities
154
 
155
+ The project documentation states that the SOTA evaluation flow includes:
 
 
156
 
157
+ - `difficulty_band_metrics`
158
+ - `response_profile_metrics`
159
+ - `simple_to_lean`
160
+ - `selected_pass_at_k`
161
+ - `consensus_rate`
162
+ - `consensus_pass_at_k`
163
 
164
+ It also supports stricter grading controls such as optional SymPy symbolic verification and substring-match controls.
 
 
 
 
 
 
 
165
 
166
+ ### Self-improvement flow
 
167
 
168
+ The repository includes a rejection-sampling data-generation path via `generate_rft_data.py`, used to create self-improvement training data from model outputs.
169
 
170
+ ---
 
 
 
 
 
 
 
 
 
171
 
172
+ ## Project Architecture
 
 
 
 
173
 
174
+ The wider project includes:
 
175
 
176
+ - merged dataset construction
177
+ - validation and release packaging
178
+ - model development configs and scripts
179
+ - a Space trainer app
180
+ - a conjecture-lab Space app
181
+ - rollout runbooks
182
+ - state-of-the-art math blueprints
183
 
184
+ This means the model should be understood as one artifact within a broader research pipeline rather than a standalone checkpoint.
185
 
186
+ ---
 
 
 
 
 
187
 
188
+ ## Example Usage
189
 
190
+ ### Transformers
 
 
191
 
192
+ ```python
193
+ from transformers import AutoTokenizer, AutoModelForCausalLM
194
+ import torch
195
 
196
+ repo_id = "NorthernTribe-Research/math-conjecture-model"
 
197
 
198
+ tokenizer = AutoTokenizer.from_pretrained(repo_id)
199
+ model = AutoModelForCausalLM.from_pretrained(
200
+ repo_id,
201
+ torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
202
+ device_map="auto"
203
+ )
204
 
205
+ prompt = """Analyze the following mathematical conjecture.
 
206
 
207
+ Conjecture:
208
+ If a sequence is eventually periodic, then its generating function is rational.
209
 
210
+ Return:
211
+ 1. Intuition
212
+ 2. Proof sketch
213
+ 3. Key assumptions
214
+ 4. Formalization notes
215
+ """
216
 
217
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
218
+ outputs = model.generate(
219
+ **inputs,
220
+ max_new_tokens=512,
221
+ do_sample=True,
222
+ temperature=0.7,
223
+ top_p=0.9
224
+ )
225
 
226
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
 
227
  ```
228
 
229
+ ### Prompting style
230
+
231
+ This model generally benefits from prompts that request structure explicitly, such as:
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
232
 
233
+ - intuition
234
+ - proof sketch
235
+ - formalization notes
236
+ - assumptions
237
+ - edge cases
238
+ - candidate counterexamples
239
+ - Lean-style outline
240
 
241
+ ### Example prompt
242
 
243
+ ```text
244
+ Consider the statement:
245
+
246
+ "If a + b is even, then a and b have the same parity."
247
+
248
+ Provide:
249
+ 1. An intuitive explanation
250
+ 2. A proof
251
+ 3. A compact formalization outline
252
+ 4. Any assumptions or edge cases
253
  ```
254
 
255
+ ---
 
 
256
 
257
+ ## Local Inference
258
 
259
+ The project documentation confirms a local inference path using `llama.cpp` after:
 
260
 
261
+ 1. merging the adapter into a full model
262
+ 2. converting the merged model to GGUF
263
+ 3. optionally quantizing for lighter deployment
264
 
265
+ This supports local experimentation outside standard Transformers-based serving.
 
 
 
 
266
 
267
+ ---
268
 
269
+ ## Limitations
 
 
 
270
 
271
+ This is a research model and may still:
 
 
 
 
272
 
273
+ - generate invalid proofs that sound convincing
274
+ - confuse symbolic plausibility with correctness
275
+ - fail on deep multi-hop theorem reasoning
276
+ - produce incomplete or brittle formalization hints
277
+ - overuse familiar proof templates
278
+ - hallucinate mathematical structure
279
 
280
+ All substantive outputs should be independently verified with formal tools, symbolic systems, or expert review.
281
+
282
+ ---
 
 
 
283
 
284
+ ## Risks and Recommendations
285
 
286
+ Because the model is optimized for proof-style text generation, users may over-trust fluent mathematical output. For that reason:
 
 
 
 
 
287
 
288
+ - verify results independently
289
+ - use proof assistants or symbolic systems when correctness matters
290
+ - treat outputs as research assistance, not certification
291
+ - benchmark behavior before deployment in educational or analytical systems
292
 
293
+ ---
294
+
295
+ ## Project Documentation
 
 
 
 
296
 
297
+ The repository includes documentation and workflow support for:
298
 
299
+ - dataset release
300
+ - model development
301
+ - rollout automation
302
+ - Space deployment
303
+ - evaluation
304
+ - adapter merge and publish
305
+ - model-card refresh automation
306
 
307
+ ---
308
 
309
+ ## Citation
 
310
 
311
+ ```bibtex
312
+ @misc{northerntribe_math_conjecture_sota_05b_2026,
313
+ title = {Math Conjecture SOTA 0.5B},
314
+ author = {NorthernTribe Research},
315
+ year = {2026},
316
+ publisher = {Hugging Face},
317
+ howpublished = {Model repository}
318
+ }
319
+ ```
320
 
321
+ ---
322
 
323
+ ## Disclaimer
 
 
 
 
 
324
 
325
+ This model is intended for research, experimentation, and educational exploration in mathematical reasoning. It does not guarantee theorem validity, proof correctness, or novel mathematical discovery.