DeepSeek R1 Medical Reasoning (Fine-Tuned with LoRA)
This repository contains a LoRA fine-tune of DeepSeek-R1-Distill-Llama-8B adapted for medical reasoning tasks. It was trained on a subset of the Medical O1 Reasoning SFT dataset using Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning method.
Model Information
- Base Model: unsloth/DeepSeek-R1-Distill-Llama-8B
- LoRA Rank: 16
- LoRA Alpha: 16
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Training Dataset: Medical O1 Reasoning SFT
Fine-tuning Configuration
- Epochs: 1
- Max Steps: 60
- Batch Size per Device: 2
- Gradient Accumulation Steps: 4
- Learning Rate: 2e-4
- Optimizer: AdamW (8-bit)
- Precision: FP16
- Seed: 3407 (for reproducibility)
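
For reference, these hyperparameters map roughly onto a standard transformers TrainingArguments setup. The sketch below is a reconstruction from the values listed above, not the actual training script; output_dir and logging_steps are assumptions.

from transformers import TrainingArguments

# Hypothetical reconstruction of the training setup from the values listed above
training_args = TrainingArguments(
    output_dir="outputs",             # assumed; not stated in the card
    num_train_epochs=1,
    max_steps=60,                     # caps the run at 60 optimizer steps
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,    # effective batch size of 8 per device
    learning_rate=2e-4,
    optim="adamw_8bit",               # 8-bit AdamW (bitsandbytes)
    fp16=True,                        # FP16 mixed precision
    logging_steps=10,                 # assumed; not stated in the card
    seed=3407,
)

In a typical Unsloth workflow these arguments are passed to trl's SFTTrainer together with the LoRA-wrapped model and the Medical O1 Reasoning SFT dataset.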
Targeted Modules for LoRA
- Self-attention projections: q_proj, k_proj, v_proj, o_proj
- Feed-forward layers: gate_proj, up_proj, down_proj (a configuration sketch follows below)
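
Given the rank, alpha, and target modules above, the adapter could be attached with Unsloth's FastLanguageModel.get_peft_model roughly as follows. This is a sketch, not the actual training code; lora_dropout, bias, and the 4-bit base-model loading are assumptions rather than values stated in this card.

from unsloth import FastLanguageModel

# Load the base model (4-bit loading assumed, matching the usage example below)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters (rank 16, alpha 16) to the attention and feed-forward projections
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # self-attention projections
        "gate_proj", "up_proj", "down_proj",      # feed-forward layers
    ],
    lora_dropout=0,      # assumed; not stated in the card
    bias="none",         # assumed; not stated in the card
    random_state=3407,
)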
Usage
from unsloth import FastLanguageModel
model_id = "NikkeS/deepSeek-finetuned-Medical-O1-Reasoning-SFT"
# Load fine-tuned model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
model_name=model_id,
max_seq_length=2048,
load_in_4bit=True
)
# Set to inference mode
FastLanguageModel.for_inference(model)
# Example inference
question = """A 61-year-old woman with involuntary urine loss during coughing but no leakage at night undergoes a gynecological exam and Q-tip test. What would cystometry reveal about residual volume and detrusor contractions?"""
inputs = tokenizer([question], return_tensors="pt").to("cuda")
outputs = model.generate(
input_ids=inputs.input_ids,
attention_mask=inputs.attention_mask,
max_new_tokens=1200
)
response = tokenizer.batch_decode(outputs)[0]
print(response)
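
For long reasoning traces it can be convenient to stream tokens as they are generated instead of waiting for the full completion. The variant below uses transformers' TextStreamer and is an optional addition, not something prescribed by this card.

from transformers import TextStreamer

# Stream the reasoning and answer to stdout token by token
streamer = TextStreamer(tokenizer, skip_prompt=True)
_ = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    streamer=streamer,
)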