Proximal Policy Optimization with Double Clipping Boundaries
Author: Zhang Jun, Wang Hong-Cheng
Abstract:

Proximal policy optimization (PPO) is a stable deep reinforcement learning algorithm. Its core mechanism is a clipped surrogate objective that limits the step size of each policy update. Experiments show that even with an empirically optimal clipping coefficient, no upper bound on the Kullback-Leibler (KL) divergence between successive policies can be guaranteed, which contradicts trust-region optimization theory. This study proposes an improved algorithm, PPO with double clipping boundaries (PPO-DC). PPO-DC constrains the KL divergence through two probability-based clipping boundaries, keeping the parameters within the trust region while ensuring that the sample data are fully utilized. On several continuous control tasks, PPO-DC achieves better performance than other algorithms.
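For reference, PPO's clipped surrogate objective takes the form L_CLIP(θ) = E_t[ min( r_t(θ) A_t, clip(r_t(θ), 1 - ε, 1 + ε) A_t ) ], where r_t(θ) = π_θ(a_t|s_t) / π_θold(a_t|s_t) is the probability ratio and A_t the advantage estimate. The sketch below (PyTorch) shows this standard objective plus a purely illustrative second, probability-based clipping stage; the function double_clip_loss and the bounds p_low/p_high are assumptions made here for illustration and do not reproduce the exact PPO-DC boundaries derived in the paper.

    import torch

    def ppo_clip_loss(log_prob_new, log_prob_old, advantages, eps=0.2):
        """Standard PPO clipped surrogate loss (negated, so it can be minimized)."""
        ratio = torch.exp(log_prob_new - log_prob_old)            # r_t(theta)
        unclipped = ratio * advantages
        clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
        return -torch.min(unclipped, clipped).mean()

    def double_clip_loss(log_prob_new, log_prob_old, advantages,
                         eps=0.2, p_low=0.05, p_high=0.95):
        """Hypothetical second clipping stage applied to the new action probability
        itself; p_low/p_high are illustrative bounds, not the paper's PPO-DC rule."""
        prob_new = torch.exp(log_prob_new).clamp(p_low, p_high)  # probability boundary
        ratio = prob_new / torch.exp(log_prob_old)
        unclipped = ratio * advantages
        clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
        return -torch.min(unclipped, clipped).mean()

    if __name__ == "__main__":
        # Toy batch of three transitions.
        lp_new = torch.log(torch.tensor([0.30, 0.10, 0.60]))
        lp_old = torch.log(torch.tensor([0.25, 0.20, 0.55]))
        adv = torch.tensor([1.0, -0.5, 0.3])
        print(ppo_clip_loss(lp_new, lp_old, adv).item(),
              double_clip_loss(lp_new, lp_old, adv).item())

Either function can serve as the policy loss in a standard actor-critic loop; note that the second one only interprets the abstract's "probability-based clipping boundaries" as bounds on the action probability, which may differ from the construction in the paper itself.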

Get Citation

Zhang J, Wang HC. Proximal policy optimization with double clipping boundaries. Computer Systems & Applications, 2023, 32(4): 177-186.

History
  • Received: August 23, 2022
  • Revised: September 27, 2022
  • Online: December 23, 2022