Journal of Systems & Management, 2024, Vol. 33, Issue 1: 150-161. DOI: 10.3969/j.issn.1005-2542.2024.01.011


Stock Trading Strategy via Deep Reinforcement Learning with Behavior Cloning

YANG Xingyu, CHEN Liangwei, ZHENG Xiaoteng, ZHANG Yong   

  1. School of Management, Guangdong University of Technology, Guangzhou 510520, China
  • Received: 2022-11-28 Revised: 2023-06-23 Online: 2024-01-28 Published: 2024-01-26
  • Supported by: National Natural Science Foundation of China (72371080); Guangdong Basic and Applied Basic Research Foundation (2023A1515012840); Guangdong Provincial Philosophy and Social Science Planning Project (GD23XGL022)

Abstract:

To improve the return on stock investment and reduce its risk, this paper introduces behavior cloning, an imitation learning technique, into the deep reinforcement learning framework to design a stock trading strategy. The strategy combines the dueling deep Q-network (dueling DQN) algorithm with behavior cloning, so that the agent imitates the decisions of a pre-constructed investment expert while exploring autonomously. Numerical experiments on stocks selected from different industries show that the designed strategy outperforms the comparison strategies on return and risk metrics, including the annualized percentage yield (APY), Sharpe ratio (SR), and Calmar ratio (CR). The results indicate that combining imitation learning with deep reinforcement learning gives the agent both exploration and imitation abilities, which improves the generalization ability of the model and the applicability of the strategy.

Key words: stock trading strategy, deep reinforcement learning, imitation learning, behavior cloning, dueling deep Q-network (dueling DQN)
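
The abstract names two ingredients: a dueling DQN objective extended with a behavior cloning term, and three evaluation metrics. The sketch below illustrates both in plain Python under stated assumptions, and is not the authors' implementation. The cross-entropy form of the cloning term, the weight `lam`, the 252-day trading year, the zero risk-free rate, and all function names are illustrative choices that do not come from the paper.

```python
import math

def dueling_q(value, advantages):
    """Dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

def td_loss(q_taken, reward, next_q, gamma=0.99):
    """Squared one-step TD error against r + gamma * max_a' Q(s', a')."""
    target = reward + gamma * max(next_q)
    return (target - q_taken) ** 2

def bc_loss(q_values, expert_action):
    """Assumed cloning term: cross-entropy of softmax(Q) vs. the expert action."""
    m = max(q_values)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(q - m) for q in q_values))
    return log_z - q_values[expert_action]

def combined_loss(q, action, reward, next_q, expert_action, lam=0.5, gamma=0.99):
    """TD loss plus lam times the cloning loss (lam is a hypothetical weight)."""
    return td_loss(q[action], reward, next_q, gamma) + lam * bc_loss(q, expert_action)

def annualized_return(wealth, periods_per_year=252):
    """Geometric annualized return from a cumulative wealth curve."""
    years = (len(wealth) - 1) / periods_per_year
    return (wealth[-1] / wealth[0]) ** (1 / years) - 1

def sharpe_ratio(wealth, periods_per_year=252, risk_free=0.0):
    """Annualized mean excess return over annualized volatility.
    Assumes the per-period returns are not all identical."""
    rets = [b / a - 1 for a, b in zip(wealth, wealth[1:])]
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1)
    return (mean * periods_per_year - risk_free) / math.sqrt(var * periods_per_year)

def max_drawdown(wealth):
    """Largest relative drop from a running peak."""
    peak, mdd = wealth[0], 0.0
    for w in wealth:
        peak = max(peak, w)
        mdd = max(mdd, (peak - w) / peak)
    return mdd

def calmar_ratio(wealth, periods_per_year=252):
    """Annualized return divided by maximum drawdown."""
    mdd = max_drawdown(wealth)
    apy = annualized_return(wealth, periods_per_year)
    return float("inf") if mdd == 0 else apy / mdd
```

In this sketch, setting `lam=0` recovers a plain TD objective, while a larger `lam` pulls the agent's greedy action toward the expert's choice. How the paper actually weights or schedules the two terms is not stated in the abstract.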

