TY  - JOUR
T1  - Research on Reinforcement Learning Methodologies for Large Language Models Using TRPO, PPO, and DPO
AU  - Kim, Taehyun
AU  - Park, Soohyun
JO  - The Journal of Korean Institute of Communications and Information Sciences
PY  - 2025
DA  - 2025/01/01
DO  - 10.7840/kics.2025.50.5.790
KW  - RLHF
KW  - LLMs
AB  - As reinforcement learning (RL) is increasingly used to train large language models (LLMs), it has become necessary to identify the RL methods best suited to LLMs. The two fields continue to evolve, with new techniques in each advancing the other. This paper examines current trends in reinforcement learning algorithms aimed at improving the performance of large language models.