Dyck Sequence Completion Model

A fine-tuned DeepSeek-R1-Distill-Qwen-1.5B model that completes Dyck sequences (balanced-bracket problems) with step-by-step reasoning.

Model Description

  • Base Model: DeepSeek-R1-Distill-Qwen-1.5B
  • Parameters: 1.5B (all parameters fully fine-tuned)
  • Training Method: Full fine-tuning (not LoRA)
  • Language: English
  • Task: Dyck sequence completion with structured reasoning

Intended Use

This model completes a Dyck sequence by appending the minimal set of closing brackets needed to match every unclosed opening bracket. Given a prefix such as ([<{, it generates:

  1. Step-by-step reasoning showing the stack-based algorithm
  2. The complete balanced sequence as the final answer

Example:

Input: ([<{

Output:

# 1: ( open -> push ) | [')']
# 2: [ open -> push ] | [')',']']
# 3: < open -> push > | [')',']','>']
# 4: { open -> push } | [')',']','>','}']
# 5: done | stack LIFO [')',']','>','}']
# +1: add '}'
# +2: add '>'
# +3: add ']'
# +4: add ')'
# add: }>])
# full: ([<{}>])

FINAL ANSWER: ([<{}>])
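
The stack procedure that the trace walks through can be sketched in a few lines of Python. This is an illustrative reference only, not code from the training pipeline; the names `complete_dyck` and `PAIRS` are hypothetical, and raising an error on a mismatched closer is an assumption about how invalid prefixes should be handled.

```python
# Hypothetical reference implementation of the stack-based completion.
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}
CLOSERS = set(PAIRS.values())


def complete_dyck(prefix: str) -> str:
    """Append the minimal closing brackets that balance `prefix`."""
    stack = []  # holds the closers still owed, most recent last
    for i, ch in enumerate(prefix):
        if ch in PAIRS:
            # Opening bracket: remember which closer it needs.
            stack.append(PAIRS[ch])
        elif ch in CLOSERS:
            # Closing bracket: it must match the most recent open bracket.
            if not stack or stack.pop() != ch:
                raise ValueError(f"unexpected {ch!r} at position {i}")
        else:
            raise ValueError(f"not a bracket: {ch!r} at position {i}")
    # Pop in LIFO order: the last bracket opened is the first one closed.
    return prefix + "".join(reversed(stack))


print(complete_dyck("([<{"))
```

Because the closers are emitted in reverse order of the opens, the innermost bracket is always closed first, which is why the suffix for ([<{ starts with } and ends with ).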

Training Data

  • Dataset: 100,000 Dyck sequence completion examples
  • Bracket types: (), [], {}, <>
  • Sequence lengths: 12-32 characters (varying difficulty)
  • Format: User prompt + assistant reasoning (using ASCII symbols ->, |) + final answer

Training Details

Hyperparameters

  • Epochs: 4 (about 384k training examples seen, given the 96k train split)
  • Batch size: 16 per device, gradient accumulation 8 (effective batch: 128)
  • Learning rate: 2e-5 with cosine schedule
  • Warmup: 10% of total steps
  • Optimizer: AdamW (fused)
  • Precision: bfloat16
  • Gradient clipping: 1.0
  • Loss weighting: Final answer tokens weighted 10× (ANSWER_LOSS_WEIGHT=10.0)
  • Maximum sequence length: MAX_LENGTH=738 (the longest example in the dataset)
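
The 10× weighting of final-answer tokens could be implemented roughly as follows. This is a minimal sketch, not the training code used for this model: the function name `weighted_lm_loss`, the boolean `answer_mask` input, and the choice to normalize by the total weight are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

ANSWER_LOSS_WEIGHT = 10.0  # value stated in the hyperparameters


def weighted_lm_loss(logits, labels, answer_mask, answer_weight=ANSWER_LOSS_WEIGHT):
    """Causal LM loss with extra weight on final-answer tokens.

    logits:      (batch, seq, vocab) float tensor
    labels:      (batch, seq) long tensor, -100 on positions to ignore
    answer_mask: (batch, seq) bool tensor, True on final-answer tokens
                 (how the mask is built is not specified in this card)
    """
    # Shift so that position t predicts token t + 1.
    shift_logits = logits[:, :-1, :]
    shift_labels = labels[:, 1:]
    shift_mask = answer_mask[:, 1:]

    # Per-token cross-entropy; ignored positions get a loss of 0.
    per_token = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=-100,
        reduction="none",
    ).view_as(shift_labels)

    # Weight 1 for ordinary tokens, answer_weight for answer tokens, 0 for ignored.
    weights = torch.ones_like(per_token)
    weights[shift_mask] = answer_weight
    weights = weights * (shift_labels != -100)

    # Assumption: weighted mean, so the loss scale stays comparable to plain CE.
    return (per_token * weights).sum() / weights.sum().clamp(min=1e-8)
```

With an all-False mask this reduces to the ordinary mean cross-entropy, so the weighting only changes how strongly errors on the final answer are penalized relative to errors in the reasoning trace.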

Infrastructure

  • GPU: NVIDIA L40S (48GB)
  • Training time: ~8 hours
  • Memory usage: ~20-24 GB
  • Framework: PyTorch + Transformers

Optimization

  • Gradient checkpointing enabled for memory efficiency
  • Best checkpoint selected by lowest eval_loss
  • 96k train samples, 4k eval samples (96/4 split)
  • Evaluation every 250 steps
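
The step counts implied by these settings can be worked out directly. The sketch below assumes a single GPU (the card lists one L40S) and that the warmup fraction is applied to the total number of optimizer steps; the function name `training_schedule` is hypothetical.

```python
import math


def training_schedule(train_samples=96_000, epochs=4, per_device_batch=16,
                      grad_accum=8, num_gpus=1, warmup_ratio=0.10, eval_every=250):
    """Derive step counts from the stated hyperparameters (num_gpus=1 is an assumption)."""
    effective_batch = per_device_batch * grad_accum * num_gpus
    steps_per_epoch = math.ceil(train_samples / effective_batch)
    total_steps = steps_per_epoch * epochs
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": round(total_steps * warmup_ratio),
        "evaluations": total_steps // eval_every,
        "samples_seen": train_samples * epochs,
    }


for name, value in training_schedule().items():
    print(f"{name}: {value}")
```

Under these assumptions the run takes 750 optimizer steps per epoch and 3,000 in total, with 300 warmup steps and 12 evaluation points from which the best checkpoint is chosen.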

Performance

The model is trained to:

  • Generate structured reasoning using ASCII symbols (->, |) for clarity
  • Follow the exact dataset format, so no manual parsing of free-form text is needed
  • Complete Dyck sequences accurately with the minimal closing brackets
  • Provide step-by-step stack operations matching the training data style

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

MODEL_ID = "results"  # or your HF repo path
SEQUENCE = "([<{"

# Load model
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model.eval()

# Format prompt (same as training)
prompt = f"""Complete the Dyck sequence with minimal closing brackets.

Sequence: {SEQUENCE}

Rules: add only closings that match open brackets; no extra pairs.
Format: use -> for steps (e.g. open -> push close | stack=[...]); # +k: add 'X'; end FINAL ANSWER: <full_sequence>. No prose."""

messages = [{"role": "user", "content": prompt}]
chat_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(chat_text, return_tensors="pt").to(model.device)

# Generate (greedy decoding for deterministic output)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=600,
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id,
    )

response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
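
Since greedy decoding does not guarantee a correct completion, it can be useful to extract the final answer and verify it before relying on it. The helpers below are a hypothetical sketch, not part of the model or its training code; they assume the response contains a `FINAL ANSWER:` line in the format shown in the example.

```python
import re

PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}
CLOSERS = set(PAIRS.values())


def extract_final_answer(response: str):
    """Return the sequence after the last 'FINAL ANSWER:' marker, or None."""
    matches = re.findall(r"FINAL ANSWER:\s*(\S+)", response)
    return matches[-1] if matches else None


def is_balanced(seq: str) -> bool:
    """True if every bracket in `seq` is closed in the correct order."""
    stack = []
    for ch in seq:
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif not stack or stack.pop() != ch:
            return False
    return not stack


def is_valid_completion(prefix: str, answer) -> bool:
    """True if `answer` is `prefix` plus only closing brackets and is balanced."""
    if answer is None or not answer.startswith(prefix):
        return False
    suffix = answer[len(prefix):]
    return all(ch in CLOSERS for ch in suffix) and is_balanced(answer)


sample = "# full: ([<{}>])\n\nFINAL ANSWER: ([<{}>])"
answer = extract_final_answer(sample)
print(answer, is_valid_completion("([<{", answer))
```

Requiring the suffix to contain only closing brackets enforces minimality: a balanced answer that opens any new pair after the prefix is rejected.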

Limitations

  • Trained only on Dyck sequences with 4 bracket types: (), [], {}, <>
  • Reliable only for sequence lengths similar to the training data (12-32 characters)
  • Output format is structured/code-like, not natural language prose
  • Performance is best on sequences that follow the training distribution

Model Card Authors

Created for the Dyck sequence completion task using full fine-tuning.

License

Inherits the license of the base model, DeepSeek-R1-Distill-Qwen-1.5B.
