Energy Management Strategy of an Extended-Range Electric Light Truck Based on Deep Reinforcement Learning
DOI: 10.13949/j.cnki.nrjgc.2023.06.011
Key Words: deep Q-network (DQN); deep deterministic policy gradient (DDPG); twin delayed deep deterministic policy gradient (TD3) algorithm; extended-range electric light truck
Authors:
DUAN Longjin*, Yunnan Key Laboratory of Internal Combustion Engines, Kunming University of Science and Technology, Kunming 650500, China; E-mail: 1456249466@qq.com
WANG Guiyong*, Yunnan Key Laboratory of Internal Combustion Engines, Kunming University of Science and Technology, Kunming 650500, China; E-mail: wangguiyong@kust.edu.cn
WANG Weichao, Yunnan Key Laboratory of Internal Combustion Engines, Kunming University of Science and Technology, Kunming 650500, China; E-mail: 3262386925@qq.com
HE Shuchao, Kunming Yunnei Power Co., Ltd., Kunming 650500, China; E-mail: 3564097974@qq.com
Abstract: To solve the problem of reasonable energy allocation between the auxiliary power unit (APU) and the power battery in an extended-range electric light truck, a control-oriented simulation model was established in Simulink, and a real-time energy management strategy (EMS) based on the twin delayed deep deterministic policy gradient (TD3) algorithm was proposed to reduce engine fuel consumption. With the battery state of charge (SOC) variation as the optimization objective, deep reinforcement learning agents were trained on the Worldwide Harmonized Light Vehicles Test Procedure (WLTP) cycle. The simulation results show that the TD3-based EMS has good stability and adaptability, which was validated under different operating conditions. The TD3 algorithm achieves continuous control of engine speed and torque, making the output power smoother. The TD3-based EMS was compared with EMSs based on the traditional deep Q-network (DQN) algorithm and the deep deterministic policy gradient (DDPG) algorithm: its fuel economy was improved by 12.35% and 0.67%, respectively, reaching 94.85% of that of an EMS based on dynamic programming (DP), and its convergence speed was improved by 40.00% and 47.60%, respectively.
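The core of a TD3-based EMS of this kind is a reward that trades off instantaneous fuel consumption against battery SOC deviation, plus the TD3 update rules (twin critics, target policy smoothing, and delayed actor updates) that make stable continuous control of engine speed and torque possible. The following Python sketch illustrates these two pieces under stated assumptions; the weights, SOC reference, and network modules are illustrative placeholders, not the authors' Simulink implementation.

```python
# Minimal sketch of an EMS reward and the TD3 target computation.
# FUEL_WEIGHT, SOC_WEIGHT, and SOC_REF are assumed values for illustration;
# the paper's actual vehicle model and EMS are built in Simulink.

import torch
import torch.nn as nn

SOC_REF = 0.30      # assumed charge-sustaining SOC reference
FUEL_WEIGHT = 1.0   # assumed weight on instantaneous fuel rate (g/s)
SOC_WEIGHT = 50.0   # assumed weight on SOC deviation

def ems_reward(fuel_rate: torch.Tensor, soc: torch.Tensor) -> torch.Tensor:
    """Negative cost: fuel use plus a quadratic penalty on SOC deviation."""
    return -(FUEL_WEIGHT * fuel_rate + SOC_WEIGHT * (soc - SOC_REF) ** 2)

def td3_target(actor_tgt: nn.Module, q1_tgt: nn.Module, q2_tgt: nn.Module,
               next_state: torch.Tensor, reward: torch.Tensor,
               done: torch.Tensor, gamma: float = 0.99,
               policy_noise: float = 0.2, noise_clip: float = 0.5) -> torch.Tensor:
    """Clipped double-Q target with target policy smoothing (TD3's key tricks).

    The actor outputs a continuous action in [-1, 1]^2, interpreted here as
    normalized [engine speed, engine torque]; in TD3 the actor is updated
    only every other critic step (the delayed policy update).
    """
    with torch.no_grad():
        action = actor_tgt(next_state)
        # Target policy smoothing: clipped Gaussian noise on the target action.
        noise = (torch.randn_like(action) * policy_noise).clamp(-noise_clip, noise_clip)
        next_action = (action + noise).clamp(-1.0, 1.0)
        sa = torch.cat([next_state, next_action], dim=-1)
        # Clipped double-Q: the minimum of the twin target critics curbs the
        # value overestimation that destabilizes DDPG-style training.
        q_min = torch.min(q1_tgt(sa), q2_tgt(sa))
        return reward + gamma * (1.0 - done) * q_min
```

The minimum over twin critics and the smoothed target action are what distinguish TD3 from DDPG, which is consistent with the smoother power output and faster convergence reported in the abstract.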