Journal of Mechanical Engineering, 2025, Vol. 61, Issue 15: 57-81. DOI: 10.3901/JME.2025.15.057
• Review •
Human-centric Smart Manufacturing: Analysis and Prospects of Human Activity Recognition

LIU Tingyu1,2, WENG Chenyi1, WANG Baicun3, ZHENG Pai4, ZHAO Qiangqiang5, WANG Haoqi6, DONG Yuanfa7, ZHUANG Cunbo8, LENG Jiewu9, XIANG Feng10, CHEN Chengjun11, ZHOU Xiaozhou1, LI Xingyu12, JIAO Lei1, WANG Xiaoyu1, NI Zhonghua1,2
About the authors:
LIU Tingyu, male, born in 1982, PhD, professor, doctoral supervisor. His main research interests include the theory and engineering application of intelligent sensing and optimized decision-making for major land, sea, air, and space equipment and systems. E-mail: tingyu@seu.edu.cn; WENG Chenyi, male, born in 1995, PhD candidate. His main research interests include industrial human activity recognition and prediction; NI Zhonghua (corresponding author), male, born in 1967, PhD, professor, doctoral supervisor. His main research interests include the integration and application of advanced manufacturing theory and related enabling technologies, and the common fundamental scientific problems and key technologies in the design and manufacturing of micro/nano medical devices. E-mail: nzh2003@seu.edu.cn
Received: 2025-03-18
Revised: 2025-05-23
Published: 2025-09-28
Abstract: With the continuing deep integration of new-generation information technology and manufacturing technology, the human-centric smart manufacturing paradigm is reshaping traditional industrial production models. Human activity recognition (HAR), a key enabling technology for human-centric smart manufacturing, studies the intelligent recognition and understanding of human activity semantics and shows broad application prospects. A systematic examination of the development status, key challenges, and application prospects of HAR in industrial scenarios helps advance the theory and practice of human-centric smart manufacturing. First, tracing the development of HAR technology, the evolution of its core technologies, namely human sensing, activity modeling, and activity recognition, is analyzed in depth, laying the technical groundwork for industrial applications of HAR. Second, addressing the special requirements of industrial scenarios, the research status of key technologies is reviewed, including multimodal robust sensing systems, multi-scale activity understanding frameworks, intention-aware human-robot collaboration, and optimized deployment in industrial settings. On this basis, industrial human activity datasets are systematically analyzed and evaluated for quality, and practical progress in typical application scenarios such as production safety management, production scheduling optimization, process improvement, and work behavior improvement is described. Finally, drawing on emerging technologies such as spatial intelligence, physiological-cognitive fusion, and multimodal large language models, future directions for industrial HAR are outlined.
LIU Tingyu, WENG Chenyi, WANG Baicun, ZHENG Pai, ZHAO Qiangqiang, WANG Haoqi, DONG Yuanfa, ZHUANG Cunbo, LENG Jiewu, XIANG Feng, CHEN Chengjun, ZHOU Xiaozhou, LI Xingyu, JIAO Lei, WANG Xiaoyu, NI Zhonghua. Human-centric Smart Manufacturing: Analysis and Prospects of Human Activity Recognition[J]. Journal of Mechanical Engineering, 2025, 61(15): 57-81.