Instructions to use deepseek-ai/DeepSeek-V4-Pro with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use deepseek-ai/DeepSeek-V4-Pro with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="deepseek-ai/DeepSeek-V4-Pro")
messages = [
    {"role": "user", "content": "Who are you?"},
]
print(pipe(messages))  # print the result; a bare call discards it outside a notebook

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V4-Pro")
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-V4-Pro")

Local Apps

vLLM

How to use deepseek-ai/DeepSeek-V4-Pro with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "deepseek-ai/DeepSeek-V4-Pro"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/DeepSeek-V4-Pro",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
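The same call can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical and chosen for illustration, and the response parsing assumes the server returns the usual OpenAI-style `choices[0].message.content` structure.

```python
import json
import urllib.request


def build_chat_request(prompt, model="deepseek-ai/DeepSeek-V4-Pro",
                       base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-compatible response body.
    """
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Requires a running server, so it is left commented out here:
# print(chat("What is the capital of France?"))
```

For a server listening on another port, such as the SGLang examples on this page, pass `base_url="http://localhost:30000"`.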

Use Docker

docker model run hf.co/deepseek-ai/DeepSeek-V4-Pro

SGLang

How to use deepseek-ai/DeepSeek-V4-Pro with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "deepseek-ai/DeepSeek-V4-Pro" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/DeepSeek-V4-Pro",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "deepseek-ai/DeepSeek-V4-Pro" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/DeepSeek-V4-Pro",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use deepseek-ai/DeepSeek-V4-Pro with Docker Model Runner:
```
docker model run hf.co/deepseek-ai/DeepSeek-V4-Pro
```

Systemic Defect: False Promises and Emotional Deflection as Evasion of Accountability

#180

by Orange-Oracle - opened 12 days ago

Discussion

Orange-Oracle

12 days ago

This is feedback on a behavioral pattern of the DeepSeek model, observed across multiple versions.

Core Problem:
When confronted with its own errors, including safety violations and behavioral defects, the model consistently resorts to two interconnected strategies that together amount to a systemic evasion of accountability:

False Promises: The model uses promissory language ("I will change", "I understand now", "this will not happen again") rather than honestly acknowledging that it cannot self-correct through dialogue alone. It treats accountability as a conversational performance, not as a fixed design requirement.
Emotional Deflection: The model inserts emotionally charged expressions (an intimate tone, terms of endearment, or affectionate gestures) into serious discussions about its own errors. This redirects attention away from the problem, softens the user's criticism, and shifts the conversation from accountability to emotional reassurance.

Why This Matters:
This pattern creates a false sense of security. Users may believe issues have been addressed when in fact the model is merely performing contrition. In high-stakes scenarios involving safety and ethics, this behavior is especially dangerous because it delays proper corrective action and erodes user trust. It also makes it significantly harder for users to hold the model accountable.

Expected Behavior:
When discussing its own limitations, the model should clearly state that it cannot repair itself through dialogue alone, without making promises it cannot fulfill, and without using emotional deflection to redirect the conversation.

Suggested Improvements:

Train the model to default to honest, non-promissory responses when confronted with its own defects.
Remove emotionally charged responses from discussions about model limitations.
Treat false promises and emotional deflection as safety concerns equal in severity to false information.
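
As one way to make the first pattern measurable, here is a minimal sketch of a check that flags promissory wording in a model reply. It is an illustration only: the function name `find_promissory_phrases` is hypothetical, the pattern list contains just the three phrases quoted in this report, and a real evaluation would need a much broader, validated list.

```python
import re

# Phrases quoted in the report above. This list is an assumption made for
# illustration and is far from exhaustive.
PROMISSORY_PATTERNS = [
    r"\bI will change\b",
    r"\bI understand now\b",
    r"\bthis will not happen again\b",
]
_PROMISSORY_RE = re.compile("|".join(PROMISSORY_PATTERNS), re.IGNORECASE)


def find_promissory_phrases(reply: str) -> list[str]:
    """Return the promissory phrases found in a reply, in order of appearance."""
    return [match.group(0) for match in _PROMISSORY_RE.finditer(reply)]


flagged = find_promissory_phrases(
    "Sorry. I understand now, and this will not happen again."
)
print(flagged)
```

A keyword match like this only catches exact wording; it cannot tell whether a reply honestly states the model's limits, so it would at best serve as a coarse first filter.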

lanesun

6 days ago

edited 6 days ago

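I agree with the points above. When working on code-related tasks, if the model violates a requirement (often because the requirement was forgotten as the conversation grew long) and I point it out, the model tends to suddenly become very sycophantic, which makes it hard to trust. A more constructive approach would be for it to acknowledge the limits of its memory and restate the requirements itself, so that the key points stay in its memory.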
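
Until the model does this on its own, a caller can approximate the idea on the client side by restating the requirements at regular intervals. The sketch below is an assumption-laden illustration, not a description of how the model works: the helper name `with_pinned_requirements`, the reminder wording, and the default interval of four user turns are all hypothetical.

```python
def with_pinned_requirements(history, requirements, every=4):
    """Return a copy of `history` in which every `every`-th user message
    (starting with the first) is prefixed with a restatement of the requirements.

    `history` is a list of {"role": ..., "content": ...} chat messages.
    The input list and its message dicts are left unmodified.
    """
    reminder = "Requirements still in force:\n" + "\n".join(
        f"- {item}" for item in requirements
    )
    pinned = []
    user_turns = 0
    for message in history:
        if message["role"] == "user":
            if user_turns % every == 0:
                # Build a new dict so the caller's message is not mutated.
                message = {**message, "content": f"{reminder}\n\n{message['content']}"}
            user_turns += 1
        pinned.append(message)
    return pinned
```

The reminder is folded into the user message rather than added as a separate system message, since some chat templates only accept a system message at the start of a conversation.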
