DeepSeek R1 Medical Reasoning (Fine-Tuned with LoRA)

This repository contains the LoRA fine-tuned DeepSeek-R1-Distill-Llama-8B model, specifically adapted for advanced medical reasoning tasks. It was trained on a subset of the Medical O1 Reasoning SFT dataset using Low-Rank Adaptation (LoRA) for efficient fine-tuning.

Model Information

  • Base Model: unsloth/DeepSeek-R1-Distill-Llama-8B
  • LoRA Rank: 16
  • LoRA Alpha: 16
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Training Dataset: Medical O1 Reasoning SFT

Fine-tuning Configuration

  • Epochs: 1
  • Max Steps: 60
  • Batch Size per Device: 2
  • Gradient Accumulation Steps: 4
  • Learning Rate: 2e-4
  • Optimizer: AdamW (8-bit)
  • Training Precision: FP16 (half precision; the base model is loaded in 4-bit)
  • Seed: 3407 (for reproducibility)
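The effective batch size and the number of examples seen follow directly from the settings above; a quick sanity check (pure arithmetic, no training dependencies):

```python
# Effective batch size = per-device batch size × gradient accumulation steps
per_device_batch = 2
grad_accum = 4
effective_batch = per_device_batch * grad_accum  # 8 sequences per optimizer step

# With max_steps=60, the run processes at most this many training examples
max_steps = 60
examples_seen = effective_batch * max_steps  # 480

print(effective_batch, examples_seen)
```

This is why a 60-step run finishes in minutes: it touches only a few hundred examples from the dataset.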

Targeted Modules for LoRA

  • Self-attention projections (q_proj, k_proj, v_proj, o_proj)
  • Feed-forward layers (gate_proj, up_proj, down_proj)
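With rank 16 applied to all seven projections, the adapter size can be estimated from the layer shapes. The dimensions below are assumptions based on the Llama-3-8B architecture underlying DeepSeek-R1-Distill-Llama-8B (hidden size 4096, grouped-query K/V dim 1024, MLP intermediate 14336, 32 layers); a LoRA adapter on a d_out × d_in weight adds r × (d_in + d_out) trainable parameters:

```python
# Estimated LoRA trainable-parameter count (assumed Llama-3-8B shapes)
r = 16
hidden = 4096   # model hidden size
kv_dim = 1024   # k/v projection output dim (grouped-query attention)
inter = 14336   # MLP intermediate size
layers = 32

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    # LoRA factors the weight update as B @ A, with A: rank×d_in and B: d_out×rank
    return rank * (d_in + d_out)

per_layer = (
    lora_params(hidden, hidden, r)    # q_proj
    + lora_params(hidden, kv_dim, r)  # k_proj
    + lora_params(hidden, kv_dim, r)  # v_proj
    + lora_params(hidden, hidden, r)  # o_proj
    + lora_params(hidden, inter, r)   # gate_proj
    + lora_params(hidden, inter, r)   # up_proj
    + lora_params(inter, hidden, r)   # down_proj
)
total = per_layer * layers
print(f"~{total / 1e6:.1f}M trainable parameters")  # ~41.9M, well under 1% of 8B
```

Only these adapter weights are updated during fine-tuning; the 8B base weights stay frozen.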

Usage

from unsloth import FastLanguageModel

model_id = "NikkeS/deepSeek-finetuned-Medical-O1-Reasoning-SFT"

# Load fine-tuned model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_id,
    max_seq_length=2048,
    load_in_4bit=True
)

# Set to inference mode
FastLanguageModel.for_inference(model)

# Example inference
question = """A 61-year-old woman with involuntary urine loss during coughing but no leakage at night undergoes a gynecological exam and Q-tip test. What would cystometry reveal about residual volume and detrusor contractions?"""

inputs = tokenizer([question], return_tensors="pt").to("cuda")
outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200
)
response = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(response)
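The Medical O1 Reasoning SFT dataset is typically paired with an instruction-style prompt that wraps the question and elicits a <think> reasoning chain before the final answer. The exact template used for this fine-tune is not documented here; the sketch below is an assumed format modeled on common fine-tuning notebooks for this dataset, so adjust it to match whatever prompt was used at training time:

```python
# Assumed prompt template (hypothetical; align with the training-time prompt)
PROMPT_TEMPLATE = """Below is a medical question. Think through the problem step by step,
then give a final answer.

### Question:
{question}

### Response:
<think>
"""

def build_prompt(question: str) -> str:
    """Wrap a raw question in the instruction template."""
    return PROMPT_TEMPLATE.format(question=question)

prompt = build_prompt("What would cystometry reveal about residual volume?")
print(prompt)
```

Pass `build_prompt(question)` to the tokenizer instead of the bare question; matching the training prompt generally produces noticeably better reasoning traces.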
