
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

ms-swift is a comprehensive framework for fine-tuning and deploying large language models (LLMs) and multi-modal models. It supports 500+ text models and 200+ multi-modal models, providing a complete pipeline from training to deployment, with support for LoRA/QLoRA tuning, RLHF methods (DPO, GRPO, PPO, etc.), and accelerated inference backends.

Core Architecture

CLI Commands Structure

The main entry point is the swift command with subcommands:

  • swift sft - Supervised Fine-Tuning
  • swift pt - Pre-training
  • swift rlhf - Human alignment training (DPO, GRPO, PPO, KTO, etc.)
  • swift infer - Interactive inference
  • swift deploy - Model deployment with OpenAI-compatible API
  • swift eval - Model evaluation
  • swift export - Model export and quantization
  • swift app - Launch a Gradio demo app for a single model
  • swift web-ui - Launch the Gradio web UI for training, inference, and evaluation
  • swift sample - Sampling/distillation

Key Components

Models & Templates (swift/llm/model/, swift/llm/template/):

  • Model registration system supporting 500+ LLMs and 200+ VLMs
  • Template system for chat formatting and tokenization
  • Organized by vendor (qwen.py, llama.py, deepseek.py, etc.)

Training Framework (swift/llm/train/, swift/trainers/):

  • Unified training interface supporting SFT, pre-training, and RLHF
  • Custom trainers for different alignment methods (DPO, GRPO, PPO, KTO, CPO, SimPO, ORPO)
  • Advanced techniques: sequence parallelism, gradient accumulation, mixed precision

Tuners (swift/tuners/):

  • Extensive PEFT support: LoRA, QLoRA, DoRA, AdaLoRA, etc.
  • Custom Swift tuning methods and adapters
  • Integration with HuggingFace PEFT

Inference Engines (swift/llm/infer/infer_engine/):

  • Multi-backend support: PyTorch, vLLM, SGLang, LMDeploy
  • Optimized for different deployment scenarios
  • OpenAI-compatible API endpoints

Megatron Integration (swift/megatron/):

  • Parallel training support for large models
  • Model/tensor parallelism for distributed training
  • Optimized for MoE and large parameter models

Web UI (swift/ui/):

  • Gradio-based interface for training, inference, and evaluation
  • Modular components for different tasks (LLM training, GRPO, evaluation, etc.)

Common Development Commands

Installation & Setup

# Install from PyPI
pip install ms-swift -U

# Install from source (development)
git clone https://github.com/modelscope/ms-swift.git
cd ms-swift
pip install -e .

Build & Test Commands

# Build documentation  
make docs

# Run linter (if linter.sh exists)
make linter

# Run tests (if citest.sh exists)  
make test

# Build wheel package
make whl

# Clean build artifacts
make clean

Training Examples

# Basic LoRA fine-tuning
swift sft \
    --model Qwen/Qwen2.5-7B-Instruct \
    --dataset 'AI-ModelScope/alpaca-gpt4-data-en#500' \
    --train_type lora \
    --output_dir output

# Pre-training
swift pt \
    --model Qwen/Qwen2.5-7B \
    --dataset swift/chinese-c4 \
    --streaming true \
    --train_type full

# RLHF training
swift rlhf \
    --rlhf_type dpo \
    --model Qwen/Qwen2.5-7B-Instruct \
    --dataset hjh0119/shareAI-Llama3-DPO-zh-en-emoji

Inference & Deployment

# Interactive inference
swift infer \
    --model Qwen/Qwen2.5-7B-Instruct \
    --stream true

# Deploy with vLLM backend
swift deploy \
    --model Qwen/Qwen2.5-7B-Instruct \
    --infer_backend vllm

# Launch web UI
SWIFT_UI_LANG=en swift web-ui
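Once deployed, the server exposes OpenAI-compatible chat-completions endpoints. As a minimal sketch using only the Python standard library — the host, port 8000, and URL path below are assumptions; check the `swift deploy` startup logs for the actual address:

```python
import json
from urllib import request

# Assumed default address; `swift deploy` prints the actual host/port on startup.
BASE_URL = "http://127.0.0.1:8000/v1/chat/completions"


def build_payload(prompt, model="Qwen/Qwen2.5-7B-Instruct", max_tokens=256):
    """Build an OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt):
    """POST the request to the deployed server and return the reply text."""
    req = request.Request(
        BASE_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(chat("Who are you?"))
```

Any OpenAI-compatible client (e.g. the official `openai` Python package pointed at the server's base URL) works the same way.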

Key Configuration Patterns

Model Loading

  • Models are resolved automatically from the ModelScope or HuggingFace hub
  • Use --use_hf true to force HuggingFace downloads
  • Custom models supported via registration system

Training Configuration

  • Training arguments inherit from HuggingFace Transformers TrainingArguments
  • DeepSpeed configurations in swift/llm/ds_config/
  • Support for multi-GPU (DDP, DeepSpeed) and multi-node training

Dataset Format

  • Supports multiple formats: JSON, JSONL, CSV
  • Built-in datasets via dataset IDs (e.g., AI-ModelScope/alpaca-gpt4-data-en)
  • Custom datasets via path specification
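As an illustration, a custom dataset can be written as JSONL using the conversational messages schema; the field names below follow the common role/content messages format — verify the expected schema for your task type against the ms-swift custom-dataset documentation:

```python
import json
import tempfile
from pathlib import Path

# One conversation per line; each record uses the role/content messages schema.
records = [
    {"messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is LoRA?"},
        {"role": "assistant", "content": "LoRA is a parameter-efficient fine-tuning method."},
    ]},
]


def write_jsonl(records, path):
    """Serialize records to a JSONL file, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")


path = Path(tempfile.gettempdir()) / "custom_dataset.jsonl"
write_jsonl(records, path)
# The file can then be passed by path, e.g.:
#   swift sft --dataset /tmp/custom_dataset.jsonl ...
```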

LoRA/Adapter Management

  • Checkpoint paths contain training args for auto-loading
  • Use --adapters to specify adapter checkpoint paths
  • Supports adapter merging and quantization

Testing Framework

  • Tests located in tests/ directory
  • Organized by functionality: train/, infer/, eval/, models/, etc.
  • Individual test files for specific components and features
  • No specific test runner configured - use standard pytest

Important Notes

  • Default model/dataset source is ModelScope (Chinese AI platform)
  • Extensive multi-modal support for images, videos, and audio
  • Plugin system for custom loss functions, metrics, and training components
  • Built-in support for popular quantization methods (AWQ, GPTQ, BNB)
  • Web UI provides zero-code training and deployment interface