Enxin's picture
Upload folder using huggingface_hub
96fe658 verified
# README: GRPO Internal(Colocate) Mode Execution Scripts
---
**NOTE**
## **Introduction**
The GRPO (Group Relative Policy Optimization) training framework supports high-performance inference engines like vLLM to accelerate the sampling process. The **Internal Mode** allows you to deploy vLLM and perform training using the same GPU resources.
This folder contains scripts and instructions for running GRPO in **Internal Mode**
## Training with Internal mode
```bash
--use_vllm true \
--vllm_mode colocate \
--vllm_gpu_memory_utilization [ut_ratio] \
```
## Multi-Node Training
On each node, execute the original single-node training script, using the environment variables `NNODES` and `NODE_RANK`, and ensure consistent use of configuration parameters across all nodes.