GAP
Human Activity Recognition (HAR) is an important area of computer vision with applications in diverse scenarios. In industrial settings, manufacturing assembly steps are categorized into macro and micro tasks. Macro steps involve attaching or inserting components, while micro steps encompass swift actions such as screwing. HAR for screwing tasks has been studied in detail with respect to architecture, window size, sliding rate, weighting methods, and model development. The focus is on differentiating short-duration activities (e.g., screwing) from continuous movements (e.g., walking). ATM assembly involves numerous steps, each containing specific screwing actions that are vital for completion. Tracking these actions makes it possible to compare the work performed against the requirements, which helps reduce errors, detect them faster, and lower costs. Additionally, human-machine collaboration could enhance worker confidence and shorten training periods.
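To make the roles of window size and sliding rate concrete, the following minimal sketch segments a stream of IMU samples into fixed-length, overlapping windows. The function name `sliding_windows`, the parameter values, and the synthetic 3-axis samples are assumptions chosen for illustration; they are not the settings used in this project.

```python
from typing import List, Sequence


def sliding_windows(
    samples: Sequence[Sequence[float]], window_size: int, stride: int
) -> List[List[Sequence[float]]]:
    """Split a sample stream into fixed-length, possibly overlapping windows."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    windows = []
    # Only full windows are kept; a trailing partial window is dropped.
    for start in range(0, len(samples) - window_size + 1, stride):
        windows.append(list(samples[start:start + window_size]))
    return windows


# Ten hypothetical 3-axis accelerometer samples (x, y, z).
stream = [(float(i), 0.0, 9.81) for i in range(10)]

# A window of 4 samples moved forward by 2 samples gives 50 % overlap.
windows = sliding_windows(stream, window_size=4, stride=2)
```

A short window with a small stride tends to suit swift actions such as screwing, at the cost of producing more windows to classify; the appropriate values would have to be determined empirically.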
Automated understanding of work steps in industrial assembly is important for assistive guidance technologies in employee-machine collaboration. Our aim is to identify macro work steps from depth images, and micro activities of employees during the assembly of ATMs, in order to assist them in their complex daily tasks with hand-operated tools.
Owing to advances in inertial measurement unit (IMU) technologies and pattern recognition, IMU-based sensing combined with machine learning has gained momentum in work step recognition. It was selected for this study in combination with a depth camera mounted on the ceiling with a top-down view. The focus of this work is on the seamless embedding of non-impeding body-worn IMUs, or their integration into smart devices, while the depth sensor preserves the privacy of the operators. Together, they allow unobstructed monitoring of tool usage patterns and thus recognition of assembly work steps.
The results of this study are supported by empirical observations of assembly work steps executed by (i) hand screwing, (ii) screwdriver screwing, (iii) machine screwing, and (iv) wrench screwing, with the null class being disproportionately dominant in the data set. Deep learning models, including LSTM, Temporal Convolutional Network (TCN), and CNN architectures, are proposed for detecting micro activities and macro work steps and for identifying the current work step, which also helps recognize the transition between two consecutive macro work steps. The next research challenge is a sophisticated counting mechanism for the classified activities that uses weakly labeled features from each IMU sensor together with the temporal information from the depth sensor.
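One common way to counter a dominant null class is to weight each class inversely to its frequency when training the classifier. The sketch below illustrates that general idea only; the function name `inverse_frequency_weights`, the short label names, and the label counts are hypothetical, and the source does not state which weighting method the project uses.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def inverse_frequency_weights(labels: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Return one weight per class: total / (number_of_classes * class_count)."""
    counts = Counter(labels)
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Hypothetical, heavily imbalanced window labels: 16 null windows and
# one window for each of the four screwing activities.
labels = ["null"] * 16 + ["hand", "screwdriver", "machine", "wrench"]

weights = inverse_frequency_weights(labels)
# The null class receives a weight below 1, the rare classes a weight above 1.
```

Such weights can be passed to a weighted loss function so that errors on rare screwing activities count more than errors on the null class.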
Goals
This project advances cognitive assistance systems for assembly by incorporating new machine learning-based mechanisms for macro and micro work step detection and by using weakly annotated data to improve both the data annotation process and the quality of the recognition results. It also assists workers by recognizing the current work step and work activities, helping them complete complex tasks without errors.
The initial phase adapts the project's approach based on insights gained in the previous period. This entails transitioning to smartwatches, which reduces the reliance on Shimmer sensors and improves usability. Because the transition changes the input configuration, the machine learning algorithms must be adjusted accordingly. At the same time, user management becomes a priority, enabling personalized use of the project tools. The project also explores automatic work step identification, which complements the user's ability to navigate work segments through the smartwatch; this algorithm is refined using privacy-friendly image processing techniques.
The aim of the project is to investigate personalized machine learning algorithms that can operate on weakly annotated data for micro work step detection, and privacy-friendly image processing algorithms for macro work step detection.
Approach
The system detects micro and macro work steps during an industrial assembly process, as shown in the figure. The process begins when the application is initialized on the smartwatch, which starts the collection of the IMU and depth data, as shown in the upper part of the image. The collected data is then sent via Bluetooth to a CPU and used as input to the deep learning models. These models predict the number of activities that constitute each module (left side of the watch in the image) and the module of the workflow in which the employee currently is (right side of the watch in the image).
Individual deep models are trained to recognize similar patterns and, in addition, to perform scene classification; the results are provided as feedback through a wrist-worn smartwatch. The deep learning models that count the activities are trained on weakly annotated data.
Expected and Achieved Results
In the final stage, the user will have the complete system running on a smartwatch. Each node transmits data to the IoT system, where the IMU data is processed and the activities are counted, while the new work step is classified from depth images. The system will detect, classify, and count the activities to determine the stage of the process. Finally, it will provide online visual feedback on its built-in display, combined with vibrations for delivering messages. The smartwatch shows how many micro activities have been completed at each timestamp, how many are yet to be done, and the current macro work step.
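The completed and remaining activity counts could, for instance, be derived from a sequence of per-window class predictions by collapsing consecutive identical labels into single events. This is a deliberately simple sketch and not the project's counting model, which is trained on weakly annotated data; the function names, the `min_run` threshold, the short label names, and the required counts per work step are all assumptions.

```python
from collections import Counter
from itertools import groupby
from typing import Dict, Iterable


def count_activity_events(
    window_labels: Iterable[str], null_label: str = "null", min_run: int = 2
) -> Counter:
    """Count one event per run of identical non-null labels of at least min_run windows."""
    counts: Counter = Counter()
    for label, run in groupby(window_labels):
        run_length = sum(1 for _ in run)
        # Very short runs are treated as spurious predictions and ignored.
        if label != null_label and run_length >= min_run:
            counts[label] += 1
    return counts


def remaining_activities(required: Dict[str, int], done: Counter) -> Dict[str, int]:
    """Return how many events of each activity are still missing in this work step."""
    return {label: max(0, n - done.get(label, 0)) for label, n in required.items()}


# Hypothetical per-window predictions from the micro-activity classifier.
predictions = [
    "null", "hand", "hand", "null", "null", "wrench", "null",
    "hand", "hand", "hand", "null", "machine", "machine",
]
# Hypothetical requirement for the current macro work step.
required = {"hand": 3, "wrench": 1, "machine": 1}

done = count_activity_events(predictions)
todo = remaining_activities(required, done)
```

In this example the single isolated `wrench` window is discarded as noise, so the wrench activity still counts as outstanding.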
Currently, the system is deployed in an industrial environment, where the collected data is used to train and fine-tune the models. LSTM and CNN architectures are implemented to identify the activities and the work steps in this supervised problem. Classification and recognition of the activities have already been achieved, while work step recognition yields promising results. The counting part of the problem, as shown below, has been tested successfully on a public dataset of industrial activities.


