• CN:11-2187/TH
  • ISSN:0577-6686

Journal of Mechanical Engineering ›› 2024, Vol. 60 ›› Issue (22): 165-178. doi: 10.3901/JME.2024.22.165

• Instrument Science and Technology •


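Domain Generalization D3QN for Machinery Fault Diagnosis Across Different Working Conditions

BO Lin, HE Mugeng, CHEN Bingkui, LIU Xiaofeng

  1. College of Mechanical and Transportation Engineering, Chongqing University, Chongqing 400044
  • Received: 2024-01-09 Revised: 2024-06-22 Online: 2024-11-20 Published: 2025-01-02
  • About the authors: BO Lin, male, born in 1972, PhD, professor. His main research interests are machinery condition monitoring and fault diagnosis, and intelligent testing and instruments. E-mail: bolin0001@aliyun.com; HE Mugeng, male, born in 1997, master's degree. His main research interest is machinery condition detection and fault diagnosis. E-mail: 414398805@qq.com; CHEN Bingkui, male, born in 1966, PhD, professor, doctoral supervisor. His main research interests are precision transmission, uncertainty analysis and optimization design. E-mail: bkchen@cqu.edu.cn; LIU Xiaofeng (corresponding author), female, born in 1980, PhD, professor, doctoral supervisor. Her main research interests are equipment monitoring and fault diagnosis, and structural damage detection. E-mail: liuxfeng0080@126.com
  • Funding:
    Supported by the National Science and Technology Major Project (J2019-IV-0001-0068) and the National Natural Science Foundation of China (52175077, 51975067).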

Abstract: Deep reinforcement learning models depend on their interaction environment, which makes them hard to transfer across working conditions in machinery fault diagnosis. To address this problem, a domain generalization dueling double deep Q network (DGD3QN) is proposed for machinery fault diagnosis across different working conditions. An adaptively weighted max-relevance min-redundancy (mRMR) method is used to select features, which removes redundancy from the data environment and refines it. A domain recognition network branch is added to the dueling and double Q network structure of D3QN so that fault state information can be separated and extracted while the working-condition information is masked. A quantitative reward matrix is constructed from the inter-class distances of the fault modes and combined with a domain recognition reward to form a graded (divide-and-conquer) reward strategy, which strengthens the agent's ability to identify fault modes that overlap across working conditions. Cross-condition diagnosis experiments on gearbox and bearing faults show that the proposed DGD3QN better reconciles the dependence of deep reinforcement learning networks on the interaction environment with the environment independence required in cross-condition fault diagnosis, allows the trained model to be reused and transferred across working conditions, and improves the applicability of deep reinforcement learning to cross-domain fault diagnosis.

Key words: fault diagnosis, domain generalization, feature screening, graded reward strategy, deep reinforcement learning
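
For readers unfamiliar with the D3QN backbone named in the abstract, the following is a minimal sketch of its two standard ingredients only: the dueling aggregation Q(s, a) = V(s) + A(s, a) - mean(A) and the double-Q target, in which the online network selects the next action and the target network evaluates it. It does not reproduce the paper's domain recognition branch, feature selection, or reward design. The function names and all numbers are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a of A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]


def double_q_target(reward: float, gamma: float,
                    next_q_online: Sequence[float],
                    next_q_target: Sequence[float],
                    done: bool) -> float:
    """Double-Q target: the online net picks the action, the target net scores it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Toy values (hypothetical): three candidate fault classes act as the actions.
q_online = dueling_q_values(1.0, [0.5, -0.5, 0.0])    # [1.5, 0.5, 1.0]
q_target = dueling_q_values(0.5, [0.25, 0.5, -0.75])  # [0.75, 1.0, -0.25]

# The online net prefers action 0, so the target net's value for action 0 is used,
# even though the target net itself would rate action 1 highest.
target = double_q_target(1.0, 0.5, q_online, q_target, done=False)
print(target)  # 1.375
```

In this toy case a plain DQN target, which takes the maximum of the target network's own values, would be 1.0 + 0.5 * 1.0 = 1.5. Decoupling action selection from evaluation is how the double-Q target reduces that overestimation.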

CLC number: