# README: GRPO Internal (Colocate) Mode Execution Scripts
---
## Introduction
The GRPO (Group Relative Policy Optimization) training framework can use high-performance inference engines such as vLLM to accelerate sampling. **Internal Mode** (also called colocate mode) deploys vLLM on the same GPUs that run training, so sampling and training share the same GPU resources.
This folder contains scripts and instructions for running GRPO in **Internal Mode**.
## Training with Internal Mode
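```bash
# Add these flags to the GRPO training command.
# [ut_ratio] is a placeholder for the fraction of GPU memory that vLLM may use.
--use_vllm true \
--vllm_mode colocate \
--vllm_gpu_memory_utilization [ut_ratio] \
```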
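
The flags above can be assembled by a small helper so that the memory ratio is checked before a long training job starts. This is a minimal sketch, not part of the provided scripts: the function name `colocate_args` and the variable `VLLM_ARGS` are hypothetical, and the accepted range (greater than 0, at most 1) is an assumption based on the ratio being a fraction of GPU memory.

```bash
# Hypothetical helper: print the Internal Mode flags for a given memory ratio.
colocate_args() {
  # Reject empty input, non-numeric characters, and more than one decimal point.
  case "$1" in
    ''|*[!0-9.]*|*.*.*)
      echo "ut_ratio must be a decimal number, got '$1'" >&2
      return 1
      ;;
  esac
  # Assumed valid range: 0 < ratio <= 1 (a fraction of GPU memory).
  if ! awk -v r="$1" 'BEGIN { exit !(r + 0 > 0 && r + 0 <= 1) }'; then
    echo "ut_ratio must be greater than 0 and at most 1, got '$1'" >&2
    return 1
  fi
  echo "--use_vllm true --vllm_mode colocate --vllm_gpu_memory_utilization $1"
}

# Example: let vLLM use half of each GPU's memory.
VLLM_ARGS="$(colocate_args 0.5)"
# Then append the flags (unquoted, so they split into separate words)
# to whatever training command the scripts in this folder use:
#   <your GRPO training command> $VLLM_ARGS
```

A lower ratio leaves more memory for training on the shared GPUs, so the right value depends on model size and batch settings.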
## Multi-Node Training
On each node, run the original single-node training script with the environment variables `NNODES` (total number of nodes) and `NODE_RANK` (index of the current node) set, and use the same configuration parameters on all nodes.
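
Because a wrong `NNODES` or `NODE_RANK` on a single node can stall the whole job, it may help to validate both variables before launching. The sketch below is one possible check, not part of the provided scripts: the function name `check_node_env` and the script name `train_single_node.sh` are hypothetical, and it assumes node ranks are numbered from 0 to `NNODES - 1`.

```bash
# Hypothetical pre-launch check for the multi-node environment variables.
check_node_env() {
  # NNODES: total number of nodes, a positive integer, identical on every node.
  case "${NNODES:-}" in
    ''|*[!0-9]*|0*)
      echo "NNODES must be a positive integer, got '${NNODES:-}'" >&2
      return 1
      ;;
  esac
  # NODE_RANK: index of this node, a non-negative integer, unique per node.
  case "${NODE_RANK:-}" in
    ''|*[!0-9]*)
      echo "NODE_RANK must be a non-negative integer, got '${NODE_RANK:-}'" >&2
      return 1
      ;;
  esac
  # Assumed numbering: ranks run from 0 to NNODES - 1.
  if [ "$NODE_RANK" -ge "$NNODES" ]; then
    echo "NODE_RANK ($NODE_RANK) must be less than NNODES ($NNODES)" >&2
    return 1
  fi
  echo "node $NODE_RANK of $NNODES"
}

# Example for the second of two nodes (script name is a placeholder):
#   export NNODES=2 NODE_RANK=1
#   check_node_env && bash train_single_node.sh
```

Running the same check on every node before launch catches a duplicated or out-of-range rank early.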