Instructions for using DeepXR/Helion-V2.5-Rnd with the Transformers library and with local apps (vLLM, SGLang, Docker Model Runner).

Libraries

How to use DeepXR/Helion-V2.5-Rnd with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="DeepXR/Helion-V2.5-Rnd", trust_remote_code=True)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("DeepXR/Helion-V2.5-Rnd", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("DeepXR/Helion-V2.5-Rnd", trust_remote_code=True)
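
The two snippets above only load the model. A minimal sketch of generating text with the pipeline follows. The helper name `complete`, the `main` wrapper, and the sampling values (borrowed from the curl examples on this page) are illustrative assumptions, not settings documented by the model authors.

```python
def complete(pipe, prompt, max_new_tokens=64, temperature=0.5):
    """Run a text-generation pipeline and return only the generated string.

    `pipe` is any callable with the Transformers text-generation pipeline
    interface: it returns a list of dicts with a "generated_text" key.
    """
    outputs = pipe(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,  # sampling must be enabled for temperature to matter
        temperature=temperature,
    )
    return outputs[0]["generated_text"]


def main():
    # Imported here so `complete` can be reused without loading the model.
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="DeepXR/Helion-V2.5-Rnd",
        trust_remote_code=True,  # executes code from the model repo; review it first
    )
    print(complete(pipe, "Once upon a time,"))
```

Calling `main()` downloads the model weights on first use, so it is left uncalled here. By default the text-generation pipeline returns the prompt followed by the continuation in `generated_text`.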

Local Apps

vLLM

How to use DeepXR/Helion-V2.5-Rnd with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "DeepXR/Helion-V2.5-Rnd"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DeepXR/Helion-V2.5-Rnd",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
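
The same request can be sent from Python using only the standard library. This is a sketch that mirrors the curl call above; the function names `build_completion_request` and `send` are hypothetical helpers, and the response is assumed to follow the usual OpenAI completions shape (`choices[0].text`).

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build a POST request for an OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires the server started with `vllm serve` on port 8000):
# request = build_completion_request(
#     "http://localhost:8000", "DeepXR/Helion-V2.5-Rnd", "Once upon a time,")
# print(send(request)["choices"][0]["text"])
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network call is made.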

Use Docker (Docker Model Runner)

docker model run hf.co/DeepXR/Helion-V2.5-Rnd

SGLang

How to use DeepXR/Helion-V2.5-Rnd with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "DeepXR/Helion-V2.5-Rnd" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DeepXR/Helion-V2.5-Rnd",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
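
The curl call returns a JSON body. A small sketch of pulling the generated text out of it is shown here, assuming the server follows the OpenAI completions response shape (`choices`, each with `text` and `finish_reason`). The helper name `extract_texts` is hypothetical, and the sample dictionary is hand-written for illustration, not real model output.

```python
def extract_texts(response):
    """Return (text, finish_reason) pairs from a /v1/completions response body."""
    choices = response.get("choices")
    if not choices:
        # Error bodies differ between servers, so surface the whole payload.
        raise ValueError(f"no choices in response: {response!r}")
    return [(choice["text"], choice.get("finish_reason")) for choice in choices]


# Hand-written example body, only to show the expected structure.
sample_response = {
    "object": "text_completion",
    "model": "DeepXR/Helion-V2.5-Rnd",
    "choices": [
        {"index": 0, "text": " (illustrative text)", "finish_reason": "length"},
    ],
}

for text, reason in extract_texts(sample_response):
    print(repr(text), reason)
```

A `finish_reason` of `length` means generation stopped at `max_tokens`; raising that value in the request is the usual remedy for truncated output.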

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "DeepXR/Helion-V2.5-Rnd" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DeepXR/Helion-V2.5-Rnd",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
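
A freshly started container needs time to download and load the weights, so a request sent right after `docker run` may be refused. One way to handle this is to poll the server before sending work. The sketch below assumes the server exposes the OpenAI-style `GET /v1/models` route; the helper names, attempt count, and delay are illustrative choices.

```python
import time
import urllib.request


def http_probe(url, timeout=5):
    """Return True if a GET on `url` answers with HTTP 200, else False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


def wait_until_ready(url, probe=http_probe, attempts=60, delay=5.0, sleep=time.sleep):
    """Poll `url` until `probe` succeeds. Return False after `attempts` failures."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no need to wait after the final attempt
    return False


# Usage (assumes the container above is publishing port 30000):
# if not wait_until_ready("http://localhost:30000/v1/models"):
#     raise SystemExit("server did not become ready in time")
```

The `probe` and `sleep` parameters are injectable so the retry logic can be exercised without a running server.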

Docker Model Runner
How to use DeepXR/Helion-V2.5-Rnd with Docker Model Runner:
```
docker model run hf.co/DeepXR/Helion-V2.5-Rnd
```

Helion-V2.5-Rnd / requirements.txt

Create requirements.txt (Trouter-Library, commit 9a52a70, verified, 6 months ago), 925 Bytes

	# Core ML Framework
	torch==2.2.0
	torchvision==0.17.0
	torchaudio==2.2.0

	# Transformers and Model Serving
	transformers==4.40.0
	tokenizers==0.15.2
	sentencepiece==0.2.0
	accelerate==0.28.0
	safetensors==0.4.2
	huggingface-hub==0.21.4

	# High-Performance Inference
	vllm==0.3.3
	ray[default]==2.10.0

	# Quantization Support
	bitsandbytes==0.43.0

	# Web Server
	fastapi==0.110.0
	uvicorn[standard]==0.29.0
	aiohttp==3.9.3
	websockets==12.0

	# Data Processing
	numpy==1.26.4
	scipy==1.12.0
	pandas==2.2.1
	pyarrow==15.0.2

	# Model Utilities
	pydantic==2.6.4
	pyyaml==6.0.1
	omegaconf==2.3.0

	# Monitoring and Logging
	prometheus-client==0.20.0
	gputil==1.4.0
	psutil==5.9.8
	py-cpuinfo==9.0.0
	pynvml==11.5.0

	# HTTP Clients
	requests==2.31.0
	httpx==0.27.0

	# Development Tools
	pytest==8.1.1
	pytest-asyncio==0.23.6
	black==24.3.0
	flake8==7.0.0
	mypy==1.9.0

	# Optional: Advanced Features
	scikit-learn==1.4.1
	matplotlib==3.8.3
	seaborn==0.13.2
	pillow==10.2.0
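
Because every dependency above is pinned with `==`, it can be useful to check an existing environment against the file before running anything. The following is a minimal sketch using the standard library's `importlib.metadata`; the names `parse_pins` and `find_mismatches` are hypothetical, and the parser only handles the simple `name[extras]==version` form used in this file.

```python
import re
from importlib import metadata

# Matches `name==version` with an optional `[extras]` part after the name.
PIN = re.compile(r"^([A-Za-z0-9_.-]+)(?:\[[^\]]*\])?\s*==\s*(\S+)$")


def parse_pins(text):
    """Return {package: version} for each pinned line, skipping comments and blanks."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        match = PIN.match(line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def find_mismatches(pins, installed_version=metadata.version):
    """Return {package: (pinned, installed or None)} where the two differ."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = installed_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Usage:
# with open("requirements.txt") as handle:
#     for name, (want, have) in find_mismatches(parse_pins(handle.read())).items():
#         print(f"{name}: pinned {want}, installed {have}")
```

The comparison is a plain string match, so a build with a local version suffix (for example a CUDA-specific PyTorch wheel) is reported as a mismatch even when the base version agrees.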