Dyck Sequence Completion Model

A fine-tune of DeepSeek-R1-Distill-Qwen-1.5B that completes Dyck sequences (balanced-bracket problems) with step-by-step reasoning.

Model Description

Base Model: DeepSeek-R1-Distill-Qwen-1.5B
Parameters: 1.5B (all parameters fully fine-tuned)
Training Method: Full fine-tuning (not LoRA)
Language: English
Task: Dyck sequence completion with structured reasoning

Intended Use

This model completes Dyck sequences by appending the minimal set of closing brackets needed to match every unclosed opening bracket. Given a prefix like ([<{, it generates:

Step-by-step reasoning showing the stack-based algorithm
The complete balanced sequence as the final answer

Example:

Input: ([<{

Output:

# 1: ( open -> push ) | [')']
# 2: [ open -> push ] | [')',']']
# 3: < open -> push > | [')',']','>']
# 4: { open -> push } | [')',']','>','}']
# 5: done | stack LIFO [')',']','>','}']
# +1: add '}'
# +2: add '>'
# +3: add ']'
# +4: add ')'
# add: }>])
# full: ([<{}>])

FINAL ANSWER: ([<{}>])
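
The stack procedure traced in the example can be sketched in a few lines of plain Python. This is a reference illustration, not the model's code: the function name `complete_dyck` is hypothetical, the trace lines only approximate the format above, and the `close -> pop` wording for closing brackets is an assumption, since the example shows only opening brackets.

```python
# Map each opening bracket to the closing bracket it requires.
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}
CLOSERS = set(PAIRS.values())


def complete_dyck(prefix):
    """Return (trace_lines, completed_sequence) for a valid Dyck prefix."""
    stack = []  # expected closing brackets, innermost last
    trace = []
    for i, ch in enumerate(prefix, start=1):
        if ch in PAIRS:
            stack.append(PAIRS[ch])
            trace.append(f"# {i}: {ch} open -> push {PAIRS[ch]} | {stack}")
        elif ch in CLOSERS:
            # A closer must match the most recently opened bracket.
            if not stack or stack[-1] != ch:
                raise ValueError(f"unexpected {ch!r} at position {i}")
            stack.pop()
            trace.append(f"# {i}: {ch} close -> pop | {stack}")  # assumed wording
        else:
            raise ValueError(f"unsupported character {ch!r} at position {i}")
    # Closing brackets are emitted in LIFO order: innermost first.
    suffix = "".join(reversed(stack))
    return trace, prefix + suffix


trace, full = complete_dyck("([<{")
print("\n".join(trace))
print("FINAL ANSWER:", full)
```

Because the stack is popped in LIFO order, the suffix for ([<{ is }>]), closing the innermost bracket first.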

Training Data

Dataset: 100,000 Dyck sequence completion examples
Bracket types: (), [], {}, <>
Sequence lengths: 12-32 characters (varying difficulty)
Format: User prompt + assistant reasoning (using ASCII symbols ->, |) + final answer

Training Details

Hyperparameters

Epochs: 4 (four passes over the 96k-example training split, about 384k training examples seen)
Batch size: 16 per device, gradient accumulation 8 (effective batch: 128)
Learning rate: 2e-5 with cosine schedule
Warmup: 10% of total steps
Optimizer: AdamW (fused)
Precision: bfloat16
Gradient clipping: 1.0
Loss weighting: Final answer tokens weighted 10× (ANSWER_LOSS_WEIGHT=10.0)
Sequence length: MAX_LENGTH=738 (exact max from dataset)
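
The loss weighting listed above can be illustrated with a small pure-Python sketch. The card does not say how the weighted loss is normalized or how answer tokens are identified, so dividing by the sum of the weights and the `is_answer` mask are both assumptions, and `weighted_token_loss` is a hypothetical name. A real training loop would compute the same quantity on tensors.

```python
import math

ANSWER_LOSS_WEIGHT = 10.0  # value from the hyperparameter list


def log_softmax(logits):
    """Numerically stable log-softmax over one token's logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def weighted_token_loss(logits_per_token, targets, is_answer,
                        answer_weight=ANSWER_LOSS_WEIGHT):
    """Weighted mean of per-token cross-entropy.

    Tokens flagged in is_answer count answer_weight times; others count once.
    Normalizing by the total weight is an assumption, not taken from the card.
    """
    total = 0.0
    weight_sum = 0.0
    for logits, target, answer in zip(logits_per_token, targets, is_answer):
        weight = answer_weight if answer else 1.0
        total += weight * -log_softmax(logits)[target]
        weight_sum += weight
    return total / weight_sum


# Two tokens over a toy 2-word vocabulary; the second is a final-answer token.
loss = weighted_token_loss([[2.0, 0.0], [0.0, 0.0]], [0, 0], [False, True])
print(round(loss, 4))
```

With this normalization, a mistake on a final-answer token moves the loss ten times as much as the same mistake on a reasoning token.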

Infrastructure

GPU: NVIDIA L40S (48GB)
Training time: ~8 hours
Memory usage: ~20-24 GB
Framework: PyTorch + Transformers

Optimization

Gradient checkpointing enabled for memory efficiency
Best checkpoint selected by lowest eval_loss
Data split: 96k train samples, 4k eval samples (96/4)
Evaluation every 250 steps
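
The figures above imply a concrete step budget, which the following sketch works out. It assumes a single GPU (so the effective batch is 16 × 8 = 128) and a schedule with linear warmup followed by cosine decay to zero. The card states only "cosine schedule" and "10% warmup", so the exact shape and the helper name `lr_at` are assumptions.

```python
import math

TRAIN_SAMPLES = 96_000       # train split size
EFFECTIVE_BATCH = 16 * 8     # per-device batch x gradient accumulation, one GPU assumed
EPOCHS = 4
PEAK_LR = 2e-5
WARMUP_RATIO = 0.10
EVAL_EVERY = 250

steps_per_epoch = TRAIN_SAMPLES // EFFECTIVE_BATCH   # optimizer steps per epoch
total_steps = steps_per_epoch * EPOCHS
warmup_steps = round(total_steps * WARMUP_RATIO)
num_evals = total_steps // EVAL_EVERY


def lr_at(step):
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(steps_per_epoch, total_steps, warmup_steps, num_evals)
print(lr_at(150), lr_at(warmup_steps), lr_at(total_steps))
```

Under these assumptions the run takes 750 optimizer steps per epoch and 3,000 in total, with 300 warmup steps and 12 evaluations.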

Performance

The model learns to:

Generate structured reasoning using ASCII symbols (->, |) for clarity
Follow the exact dataset format, so no manual parsing is needed
Complete Dyck sequences accurately with minimal closing brackets
Provide step-by-step stack operations matching the training data style

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

MODEL_ID = "results"  # or your HF repo path
SEQUENCE = "([<{"

# Load model
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model.eval()

# Format prompt (same as training)
prompt = f"""Complete the Dyck sequence with minimal closing brackets.

Sequence: {SEQUENCE}

Rules: add only closings that match open brackets; no extra pairs.
Format: use -> for steps (e.g. open -> push close | stack=[...]); # +k: add 'X'; end FINAL ANSWER: <full_sequence>. No prose."""

messages = [{"role": "user", "content": prompt}]
chat_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(chat_text, return_tensors="pt").to(model.device)

# Generate (greedy decoding for deterministic output)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=600,
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id,
    )

response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
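
Since generated text can still be wrong, it may be worth checking the answer before using it. The sketch below pulls the sequence that follows FINAL ANSWER: out of a response string and verifies that it extends the input prefix with closing brackets only and is fully balanced. The helper names are hypothetical, and the regular expression assumes the answer contains no whitespace.

```python
import re

PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}
CLOSERS = set(PAIRS.values())


def extract_final_answer(response):
    """Return the sequence after the last 'FINAL ANSWER:' marker, or None."""
    matches = re.findall(r"FINAL ANSWER:\s*(\S+)", response)
    return matches[-1] if matches else None


def is_balanced(seq):
    """True if seq consists only of correctly nested, fully closed brackets."""
    stack = []
    for ch in seq:
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif not stack or stack.pop() != ch:
            return False  # wrong closer, stray closer, or unsupported character
    return not stack


def is_valid_completion(prefix, answer):
    """True if answer is prefix plus closing brackets only, and is balanced."""
    if answer is None or not answer.startswith(prefix):
        return False
    suffix = answer[len(prefix):]
    return all(ch in CLOSERS for ch in suffix) and is_balanced(answer)


sample = "# full: ([<{}>])\n\nFINAL ANSWER: ([<{}>])"
answer = extract_final_answer(sample)
print(answer, is_valid_completion("([<{", answer))
```

A balanced completion that adds only closing brackets is unique for a given prefix, so passing this check also means the completion is minimal.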

Limitations

Trained specifically on Dyck sequences with 4 bracket types: (), [], {}, <>
Reliable only for sequence lengths similar to the training data (12-32 characters)
Output format is structured/code-like, not natural language prose
Best performance on sequences that follow the training distribution
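
A cheap guard before inference can flag inputs that fall outside these limits. The sketch below is illustrative only: `distribution_warnings` is a hypothetical helper, and it assumes the 12-32 character range applies to the input sequence, which the card does not state explicitly.

```python
SUPPORTED = set("()[]{}<>")            # the four bracket types seen in training
TRAIN_MIN_LEN, TRAIN_MAX_LEN = 12, 32  # assumed to describe the input length


def distribution_warnings(sequence):
    """Return human-readable warnings for inputs unlike the training data."""
    warnings = []
    unsupported = sorted(set(sequence) - SUPPORTED)
    if unsupported:
        warnings.append(f"unsupported characters: {unsupported}")
    if not TRAIN_MIN_LEN <= len(sequence) <= TRAIN_MAX_LEN:
        warnings.append(
            f"length {len(sequence)} outside training range "
            f"{TRAIN_MIN_LEN}-{TRAIN_MAX_LEN}"
        )
    return warnings


for seq in ["([<{", "([{<" * 4, "(a" + "(" * 12]:
    print(repr(seq), distribution_warnings(seq))
```

An empty list means the input looks in-distribution by these two checks; it does not guarantee a correct completion.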

Model Card Authors

Created for the Dyck sequence completion task using full fine-tuning.

License

Inherits the license of the base model, DeepSeek-R1-Distill-Qwen-1.5B.
