➀ Overview of the verl framework and its benefits for large-scale reinforcement learning from human feedback (RLHF);
➁ Introduction to AMD ROCm software support and the Docker image for verl v0.3.0.post0;
➂ Detailed instructions for building Docker images and running training scripts in single-node and multi-node setups;
➃ Performance results of verl on AMD Instinct™ MI300X GPUs, focusing on throughput and convergence accuracy.