# Page Not Found

The URL `ru-men/reinforcement-learning-rl-guide/preference-dpo-orpo-and-kto` does not exist.

You might be looking for one of these pages:
- [Preference Optimization Training - DPO, ORPO, and KTO](https://unsloth.ai/docs/zh/kai-shi-shi-yong/reinforcement-learning-rl-guide/preference-dpo-orpo-and-kto.md)
- [Vision Reinforcement Learning (VLM RL)](https://unsloth.ai/docs/zh/kai-shi-shi-yong/reinforcement-learning-rl-guide/vision-reinforcement-learning-vlm-rl.md)
- [Advanced Reinforcement Learning Documentation](https://unsloth.ai/docs/zh/kai-shi-shi-yong/reinforcement-learning-rl-guide/advanced-rl-documentation.md)
- [Reinforcement Learning GRPO with 7x Longer Context](https://unsloth.ai/docs/zh/kai-shi-shi-yong/reinforcement-learning-rl-guide/grpo-long-context.md)
- [FP8 Reinforcement Learning](https://unsloth.ai/docs/zh/kai-shi-shi-yong/reinforcement-learning-rl-guide/fp8-reinforcement-learning.md)

## How to find the correct page

1. **Browse the full index**: [/sitemap.md](https://unsloth.ai/docs/sitemap.md) - Complete documentation index
2. **View the full content**: [/llms-full.txt](https://unsloth.ai/docs/llms-full.txt) - Full content export

## Tips for requesting documentation

- For markdown responses, append `.md` to URLs (e.g., `/docs/zh/kai-shi-shi-yong/reinforcement-learning-rl-guide/preference-dpo-orpo-and-kto.md`)
- Alternatively, send an `Accept: text/markdown` request header to ask for markdown through content negotiation
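
The two tips above can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper names `markdown_url` and `build_markdown_request` are hypothetical, the example URL is the one suggested on this page, and the sketch assumes URLs without query strings or fragments. No network call is made until the request is passed to `urllib.request.urlopen`.

```python
from urllib.request import Request


def markdown_url(url: str) -> str:
    """Return the URL with `.md` appended, unless it already ends in `.md`.

    Hypothetical helper; assumes the URL has no query string or fragment.
    """
    if url.endswith(".md"):
        return url
    return url.rstrip("/") + ".md"


def build_markdown_request(url: str) -> Request:
    """Build a request that asks for markdown via the Accept header.

    Hypothetical helper; the URL itself is left unchanged.
    """
    return Request(url, headers={"Accept": "text/markdown"})


page = (
    "https://unsloth.ai/docs/zh/kai-shi-shi-yong/"
    "reinforcement-learning-rl-guide/preference-dpo-orpo-and-kto"
)

# Tip 1: request the markdown version by appending `.md`.
md_page = markdown_url(page)

# Tip 2: keep the URL as-is and negotiate the format with a header.
req = build_markdown_request(page)

print(md_page)
print(req.get_header("Accept"))
```

To actually fetch the page, pass either `md_page` or `req` to `urllib.request.urlopen` and decode the response body as UTF-8.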