Instructions to use Edmon02/mathphd-plus-plus-0.5b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Edmon02/mathphd-plus-plus-0.5b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Edmon02/mathphd-plus-plus-0.5b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Edmon02/mathphd-plus-plus-0.5b")
model = AutoModelForCausalLM.from_pretrained("Edmon02/mathphd-plus-plus-0.5b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Notebooks
Google Colab
Kaggle
Local Apps

vLLM

How to use Edmon02/mathphd-plus-plus-0.5b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Edmon02/mathphd-plus-plus-0.5b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Edmon02/mathphd-plus-plus-0.5b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/Edmon02/mathphd-plus-plus-0.5b

SGLang

How to use Edmon02/mathphd-plus-plus-0.5b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Edmon02/mathphd-plus-plus-0.5b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Edmon02/mathphd-plus-plus-0.5b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Edmon02/mathphd-plus-plus-0.5b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Edmon02/mathphd-plus-plus-0.5b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Edmon02/mathphd-plus-plus-0.5b with Docker Model Runner:
```
docker model run hf.co/Edmon02/mathphd-plus-plus-0.5b
```

Edmon02 commited on 14 days ago

Commit

bc174f2

verified ·

1 Parent(s): eeafa31

Update README.md

Browse files

Files changed (1) hide show

README.md +93 -164

README.md CHANGED Viewed

@@ -1,199 +1,128 @@
 ---
 library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
+language:
+  - en
+license: apache-2.0
 library_name: transformers
+pipeline_tag: text-generation
+tags:
+  - math
+  - reasoning
+  - chain-of-thought
+  - qwen2
+  - conversational
+  - rlvr
+base_model: Qwen/Qwen2.5-0.5B-Instruct
 ---
+# MathPhD++ 0.5B
+**MathPhD++** is a small (≈0.5B parameter) language model fine-tuned for **mathematical reasoning** in natural language. It is built on [Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) and trained with the **MathPhD++** open-source pipeline (see linked code repository in your Hub “Model sources” if you publish it): supervised fine-tuning (SFT) on curated math instruction data with structured `<thinking>` / `<answer>` (and related) tags, optional process reward modeling (PRM), and reinforcement learning from verifiable rewards (GRPO) using SymPy-backed correctness checks.
+This Hub release is intended as a **reproducible checkpoint** for research and experimentation on math LLMs at the edge of what fits comfortably on a single consumer or Colab GPU.
+## Model summary
+| Attribute | Value |
+|-----------|--------|
+| **Architecture** | Qwen2 (causal LM), ~0.5B parameters |
+| **Precision** | FP16 (typical Hub export) |
+| **Chat format** | ChatML (`<|im_start|>` / `<|im_end|>`) — prefer `tokenizer.apply_chat_template` when available |
+| **Primary use** | Step-by-step math word problems, competition-style reasoning (informal proofs / chain-of-thought) |
+| **Developed by** | Edmon (Edmon02) — community research project |
+| **Finetuned from** | `Qwen/Qwen2.5-0.5B-Instruct` |
+## Training data (high level)
+SFT mixes multiple public sources (non-exhaustive; see project config for exact caps):
+- MetaMath-style QA
+- Competition MATH (train)
+- GSM8K (train)
+- OpenMathInstruct-2 (subset)
+- NuminaMath-CoT (subset)
+Examples are formatted in **ChatML** with structured assistant outputs (reasoning blocks and final answers) to encourage verifiable extraction and consistent formatting for downstream RL.
+## Evaluation (reported from project notebook run)
+Results below are **indicative** and used a **200-sample** cap per benchmark (`QUICK_TEST`-style eval). For publication-quality numbers, run full GSM8K test (1,319) and a standard MATH split with fixed protocol.
+| Benchmark | Subset / protocol | Accuracy |
+|-----------|-------------------|----------|
+| GSM8K | 200 / test | **18.5%** (37/200) |
+| MATH | 200 / MATH-500 | **6.0%** (12/200) |
+These scores reflect the **SFT-loaded** policy evaluated after the pipeline fix that loads fine-tuned weights from checkpoint storage (not the raw base model). A 0.5B model remains **capacity-limited** on MATH; GSM8K is the more informative “did SFT help?” signal at this scale.
+## How to use
+### Transformers (generate)
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+model_id = "Edmon02/mathphd-plus-plus-0.5b"
+tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.float16,
+    device_map="auto",
+    trust_remote_code=True,
+)
+problem = "What is the sum of the first 100 positive integers?"
+prompt = (
+    "<|im_start|>system\nYou are MathPhD++, an advanced mathematical reasoning assistant. "
+    "Show your complete reasoning step-by-step.<|im_end|>\n"
+    f"<|im_start|>user\n{problem}<|im_end|>\n"
+    "<|im_start|>assistant\n"
+)
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=512,
+    do_sample=False,
+    pad_token_id=tokenizer.pad_token_id,
+)
+print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
+```
+Use **greedy or low temperature** for benchmarking; use sampling for exploratory interaction.
+## Limitations
+- **Small model:** Will underperform larger instruction models on hard competition math and long proofs.
+- **Informal reasoning:** Outputs are not formally verified unless you pair the model with an external proof checker or code execution sandbox.
+- **Data contamination:** Public math benchmarks overlap train/eval sources; treat leaderboard-style claims with care unless you hold out data strictly.
+- **Language:** Primarily English math text; mixed-language or non-math prompts are out of distribution.
+## Bias, safety, and responsible use
+This model inherits behaviors and limitations of the base Qwen2.5 model and the fine-tuning corpora. It may produce confident but incorrect mathematics. **Do not** use as a sole authority for safety-critical, financial, medical, or legal reasoning. Prefer human review and independent verification.
+## Environmental note
+If your Hub UI shows an unrelated arXiv paper (e.g. carbon footprint of ML), that is often an **automatic metadata artifact**. This model card is the authoritative description; consider removing incorrect `arxiv:` tags under model settings.
+## Links
+- **Checkpoints / artifacts (author):** [Google Drive — mathphd_checkpoints](https://drive.google.com/drive/folders/14T6zF9B_Zh0JbKUW2nFEWz7QrYtW_r85?usp=sharing) (SFT, PRM, GRPO, eval exports — access as permitted by owner)
+- **Base model:** [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct)
+## Citation
+If you use this model, cite the base model and this Hub repository as appropriate:
+```bibtex
+@misc{mathphd_plus_plus_05b,
+  title        = {MathPhD++ 0.5B: Math Reasoning Model (Qwen2.5-0.5B-Instruct fine-tune)},
+  author       = {Edmon02},
+  year         = {2026},
+  howpublished = {\url{https://huggingface.co/Edmon02/mathphd-plus-plus-0.5b}},
+}
+```
+---
+*Model card written for professional Hub documentation. Update the GitHub URL and evaluation table when you publish full-benchmark runs.*