TY - JOUR
T1 - Research on Reinforcement Learning Methodologies for Large Language Models Using TRPO, PPO, and DPO
AU - Kim, Taehyun
AU - Park, Soohyun
JO - The Journal of Korean Institute of Communications and Information Sciences
PY - 2025
DA - 2025/1/1
DO - 10.7840/kics.2025.50.5.790
KW - RLHF
KW - LLMs
AB - As the utilization of reinforcement learning (RL) in training large language models (LLMs) becomes more prevalent, the necessity to identify optimal RL methodologies tailored for LLMs has emerged. The fields of LLMs and RL are continually evolving through the development of novel techniques that contribute to their mutual advancement. This paper addresses the current trends in reinforcement learning algorithms aimed at enhancing the performance of large language models.