• CN:11-2187/TH
  • ISSN:0577-6686

Journal of Mechanical Engineering ›› 2020, Vol. 56 ›› Issue (9): 102-117. doi: 10.3901/JME.2020.09.102

• Machinery Dynamics •


数据驱动故障诊断方法泛化性能的经验性分析

郑怀亮, 王日新, 杨远涛, 尹建程, 徐敏强   

  1. 哈尔滨工业大学深空探测基础研究中心 哈尔滨 150001
  • 收稿日期:2019-04-08 修回日期:2019-11-21 出版日期:2020-05-05 发布日期:2020-05-29
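  • Corresponding author: WANG Rixin, male, born in 1963, PhD, associate professor. His main research interests include intelligent diagnosis technology for spacecraft and mechanical equipment. E-mail: wangrx@hit.edu.cn
  • About the author: ZHENG Huailiang, male, born in 1989, PhD candidate. His main research interests include fault diagnosis of mechanical equipment, intelligent diagnosis algorithms, and transfer learning. E-mail: hlzhenghit@126.com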

An Empirical Analysis about the Generalization Performance of Data-driven Fault Diagnosis Methods

ZHENG Huailiang, WANG Rixin, YANG Yuantao, YIN Jiancheng, XU Minqiang   

  1. Deep Space Exploration Research Center, Harbin Institute of Technology, Harbin 150001
  • Received:2019-04-08 Revised:2019-11-21 Online:2020-05-05 Published:2020-05-29

摘要: 近年来数据驱动的故障诊断方法被广泛研究,但是这些方法有效的一个前提条件是训练诊断模型的数据与待测试数据应需采集自相同的设备和运行环境,然而这个前提条件在实际的诊断情形中很难得到满足,实际能够用来训练诊断模型的通常是采集自同类型设备或不同工况的历史数据。对于实际诊断情形下存在潜在差异的数据集,数据驱动故障诊断方法是否有效的问题鲜有讨论。首先讨论了影响诊断方法泛化性能的可能因素,然后构建了多个跨数据集诊断任务,在此基础上对几个数据驱动诊断方法的泛化性能进行了经验性的分析,分析发现相较于模型复杂度数据集间的分布差异是影响跨域诊断泛化性能的主要因素;并进一步从信号特性分析的角度解释了设备型号差异和工况差异对跨域诊断性能影响的深层次原因。这些讨论有益于启发面向实际诊断情形的数据驱动诊断方法的研究。

关键词: 故障诊断, 数据驱动, 泛化性能, 经验性分析

Abstract: Data-driven fault diagnosis methods have been widely studied in recent years. These methods rest on the premise that the data used to train a diagnosis model and the data to be tested are collected from the same machine under the same operating environment. In practice this premise is rarely satisfied: the data actually available for training are usually historical data collected from other machines of the same type or under different operating conditions. Whether data-driven fault diagnosis methods remain effective on datasets with such potential discrepancies has rarely been discussed. This paper first discusses the factors that may affect the generalization performance of diagnosis methods and then constructs multiple cross-dataset diagnosis tasks. On this basis, the generalization performance of several data-driven fault diagnosis methods is analyzed empirically. The analysis shows that, compared with model complexity, the distribution discrepancy between datasets is the main factor affecting generalization performance in cross-domain diagnosis. The underlying reasons why differences in machine model and in operating conditions degrade cross-domain diagnosis performance are further explained from the perspective of signal characteristics. These discussions may help inspire research on data-driven diagnosis methods suited to practical diagnosis scenarios.

Key words: fault diagnosis, data-driven, generalization performance, empirical analysis
