Dyck Sequence Completion Model
Fine-tuned DeepSeek-R1-Distill-Qwen-1.5B for completing Dyck sequences (balanced bracket problems) with step-by-step reasoning.
Model Description
- Base Model: DeepSeek-R1-Distill-Qwen-1.5B
- Parameters: 1.5B (all parameters fully fine-tuned)
- Training Method: Full fine-tuning (not LoRA)
- Language: English
- Task: Dyck sequence completion with structured reasoning
Intended Use
This model completes Dyck sequences by appending the minimal set of closing brackets needed to match every open bracket. Given a prefix like ([<{, it generates:
- Step-by-step reasoning showing the stack-based algorithm
- The complete balanced sequence as the final answer
Example:
Input: ([<{
Output:
# 1: ( open -> push ) | [')']
# 2: [ open -> push ] | [')',']']
# 3: < open -> push > | [')',']','>']
# 4: { open -> push } | [')',']','>','}']
# 5: done | stack LIFO [')',']','>','}']
# +1: add '}'
# +2: add '>'
# +3: add ']'
# +4: add ')'
# add: }>])
# full: ([<{}>])
FINAL ANSWER: ([<{}>])
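The stack-based procedure traced in the example can be sketched in a few lines of Python. This is a minimal reference implementation of the task itself, not the model's code; the names `PAIRS`, `CLOSERS`, and `complete_dyck` are illustrative and do not come from the training scripts.

```python
# Map each opening bracket to its closing partner (the four types used in this card).
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}
CLOSERS = set(PAIRS.values())


def complete_dyck(prefix: str) -> str:
    """Return the prefix followed by the minimal closing brackets that balance it."""
    stack = []  # expected closing brackets, innermost last
    for i, ch in enumerate(prefix):
        if ch in PAIRS:
            # Opening bracket: remember which closer it will need.
            stack.append(PAIRS[ch])
        elif ch in CLOSERS:
            # Closing bracket: it must match the most recent open bracket.
            if not stack or stack.pop() != ch:
                raise ValueError(f"unmatched {ch!r} at position {i}")
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    # Pop the remaining closers in LIFO order.
    return prefix + "".join(reversed(stack))


print(complete_dyck("([<{"))  # ([<{}>])
```

A function like this can serve as ground truth when checking model outputs, since the minimal completion of a valid prefix is unique.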
Training Data
- Dataset: 100,000 Dyck sequence completion examples
- Bracket types: (), [], {}, <>
- Sequence lengths: 12-32 characters (varying difficulty)
- Format: User prompt + assistant reasoning (using ASCII symbols ->, |) + final answer
Training Details
Hyperparameters
- Epochs: 4 (about 384k training examples seen, given the 96k training split)
- Batch size: 16 per device, gradient accumulation 8 (effective batch: 128)
- Learning rate: 2e-5 with cosine schedule
- Warmup: 10% of total steps
- Optimizer: AdamW (fused)
- Precision: bfloat16
- Gradient clipping: 1.0
- Loss weighting: Final answer tokens weighted 10× (ANSWER_LOSS_WEIGHT=10.0)
- Sequence length: MAX_LENGTH=738 (exact max from dataset)
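To illustrate what the 10× loss weighting does, here is a framework-agnostic sketch that takes per-token losses and a mask marking the final-answer tokens. The function name `weighted_token_loss` and the choice to normalise by the sum of weights are assumptions made for illustration; the actual training code may normalise differently.

```python
def weighted_token_loss(token_losses, is_answer, answer_weight=10.0):
    """Weighted mean of per-token losses; answer tokens count answer_weight times.

    token_losses: per-token cross-entropy values (floats).
    is_answer:    booleans, True where the token belongs to the final answer.
    """
    if len(token_losses) != len(is_answer):
        raise ValueError("token_losses and is_answer must have the same length")
    if not token_losses:
        raise ValueError("need at least one token")
    weights = [answer_weight if flag else 1.0 for flag in is_answer]
    total = sum(w * loss for w, loss in zip(weights, token_losses))
    # Assumption: normalise by the total weight so the result stays on the
    # same scale as an ordinary mean.
    return total / sum(weights)


# Two reasoning tokens with low loss, one answer token with high loss.
losses = [0.2, 0.2, 1.0]
mask = [False, False, True]
print(weighted_token_loss(losses, mask, answer_weight=1.0))   # plain mean
print(weighted_token_loss(losses, mask, answer_weight=10.0))  # answer dominates
```

With the higher weight, an error on the answer token moves the loss far more than an error in the reasoning trace, which is the intended effect of ANSWER_LOSS_WEIGHT.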
Infrastructure
- GPU: NVIDIA L40S (48GB)
- Training time: ~8 hours
- Memory usage: ~20-24 GB
- Framework: PyTorch + Transformers
Optimization
- Gradient checkpointing enabled for memory efficiency
- Best checkpoint selection by lowest eval_loss
- 96k train samples, 4k eval samples (96/4 split)
- Evaluation every 250 steps
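The figures above imply a concrete step count and learning-rate curve. The sketch below works through that arithmetic, assuming a single GPU, linear warmup from zero, and cosine decay to zero; those details and the helper name `lr_at` are assumptions, not taken from the training script.

```python
import math

TRAIN_SAMPLES = 96_000   # training split
PER_DEVICE_BATCH = 16
GRAD_ACCUM = 8
NUM_DEVICES = 1          # assumption: one L40S
EPOCHS = 4
PEAK_LR = 2e-5
WARMUP_RATIO = 0.10
EVAL_EVERY = 250

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
steps_per_epoch = math.ceil(TRAIN_SAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = round(total_steps * WARMUP_RATIO)
num_evals = total_steps // EVAL_EVERY


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, steps_per_epoch, total_steps, warmup_steps, num_evals)
# 128 750 3000 300 12
```

Under these assumptions the run has 3,000 optimizer steps, reaches the peak learning rate at step 300, and produces 12 evaluation points from which the best checkpoint is chosen.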
Performance
The model learns to:
- Generate structured reasoning using ASCII symbols (->, |) for clarity
- Follow the exact dataset format without manual parsing needed
- Complete Dyck sequences accurately with minimal closing brackets
- Provide step-by-step stack operations matching the training data style
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
MODEL_ID = "results" # or your HF repo path
SEQUENCE = "([<{"
# Load model
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model.eval()
# Format prompt (same as training)
prompt = f"""Complete the Dyck sequence with minimal closing brackets.
Sequence: {SEQUENCE}
Rules: add only closings that match open brackets; no extra pairs.
Format: use -> for steps (e.g. open -> push close | stack=[...]); # +k: add 'X'; end FINAL ANSWER: <full_sequence>. No prose."""
messages = [{"role": "user", "content": prompt}]
chat_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(chat_text, return_tensors="pt").to(model.device)
# Generate (greedy decoding for deterministic output)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=600,
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id,
    )
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
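Since the response ends with a FINAL ANSWER line, the result can be extracted and checked without running the model again. The following is a minimal sketch using only the standard library; the helper names (`extract_final_answer`, `is_balanced`, `is_valid_completion`) are hypothetical and not part of this repository.

```python
import re

PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}
CLOSERS = set(PAIRS.values())


def extract_final_answer(response: str):
    """Return the sequence after the last 'FINAL ANSWER:' marker, or None."""
    matches = re.findall(r"FINAL ANSWER:\s*(\S+)", response)
    return matches[-1] if matches else None


def is_balanced(seq: str) -> bool:
    """True if every bracket in seq is closed in the correct order."""
    stack = []
    for ch in seq:
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif not stack or stack.pop() != ch:
            return False  # wrong closer, stray closer, or non-bracket character
    return not stack


def is_valid_completion(prefix: str, response: str) -> bool:
    """Check that the answer keeps the prefix, adds only closers, and balances."""
    answer = extract_final_answer(response)
    if answer is None or not answer.startswith(prefix):
        return False
    added = answer[len(prefix):]
    return all(ch in CLOSERS for ch in added) and is_balanced(answer)


sample = "# full: ([<{}>])\nFINAL ANSWER: ([<{}>])"
print(extract_final_answer(sample))         # ([<{}>])
print(is_valid_completion("([<{", sample))  # True
```

Requiring that the added suffix contain only closing brackets enforces the "no extra pairs" rule from the prompt: a balanced answer that appends a new pair such as `()` is rejected.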
Limitations
- Trained specifically on Dyck sequences with 4 bracket types: (), [], {}, <>
- Sequence length limited to patterns similar to training data (12-32 chars)
- Output format is structured/code-like, not natural language prose
- Best performance on sequences that follow training distribution
Model Card Authors
Created for the Dyck sequence completion task using full fine-tuning.
License
Inherits license from base model DeepSeek-R1-Distill-Qwen-1.5B.