Thorsten Schüppstuhl Kirsten Tracht Jürgen Fleischer Editors

Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022

# Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022

*Editors* Thorsten Schüppstuhl Institute of Aircraft Production Technology (IFPT) Hamburg University of Technology Hamburg, Germany

Jürgen Fleischer wbk Institute of Production Science Karlsruhe Institute of Technology Karlsruhe, Germany

Kirsten Tracht Bremen Institute for Mechanical Engineering (bime) University of Bremen Bremen, Germany

ISBN 978-3-031-10070-3 ISBN 978-3-031-10071-0 (eBook) https://doi.org/10.1007/978-3-031-10071-0

© The Editor(s) (if applicable) and The Author(s) 2023. This book is an open access publication.

**Open Access** This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland


**Artificial Intelligence**

### **High Precision Peg-in-Hole Assembly Approach Based on Sensitive Robotics and Deep Recurrent Q-Learning**

Nehal Atef Afifi, Marco Schneider, Ali Kanso and Rainer Müller

#### **Abstract**

Sensitive robot systems are used in various assembly and manufacturing technologies. Assembly is a vital activity that requires high-precision robotic manipulation. One of the challenges in high-precision assembly arises when the required task precision exceeds the robot's precision. In this research, Deep Q-Learning (DQN) is used to perform a Peg-in-Hole assembly task with a very tight clearance. Moreover, recurrence is introduced into the system via a Long Short-Term Memory (LSTM) layer to tackle the drawbacks of DQN. The LSTM layer can encode prior decisions, allowing the agent to make more informed decisions. The robot's sensors are used to represent the state. Despite the tight hole clearance, this method successfully achieved the task at hand, which has been validated on a 7-DOF KUKA LBR iiwa sensitive robot. This paper focuses on the search phase. Furthermore, our approach has the advantage of working in environments that vary from the learned environment.

#### **Keywords**

Assembly • Peg-in-Hole • Deep Learning • Sensitive robotics

N. A. Afifi (B) · M. Schneider · A. Kanso · R. Müller

ZeMA – Zentrum für Mechatronik und Automatisierungstechnik gGmbH, Saarbrücken, Germany e-mail: n.afifi@zema.de

M. Schneider e-mail: m.schneider@zema.de

A. Kanso e-mail: a.kanso@zema.de

R. Müller e-mail: rainer.mueller@zema.de

#### **1 Introduction**

Industrial robotics plays a key role in production, notably in assembly. Although industrial robots are currently used primarily for repetitive, dangerous, or relatively heavy operations, robotic applications are increasingly being challenged to do more than simple pick-and-place activities [1, 2]. They must be able to react to their surroundings. Sensitive robot systems are capable of conducting force- or torque-controlled applications, which are used to achieve this contact with the environment. Although there is no clear definition of the term sensitivity, the measurement technology norm DIN 1319 defines sensitivity as the change in the value of the output variable of a measuring instrument in relation to the causal change in the value of the input variable [3]. Special control strategies are required in the case of physical contact with the environment, since pure position control, as utilized in part manipulation, is no longer sufficient. Furthermore, relying on force control alone is insufficient, so it makes sense to employ a hybrid force/position control [4, 5]. Depending on the task, it is therefore necessary to decide which of the translational and rotational degrees of freedom are position controlled and which are force controlled [6]. The Peg-in-Hole assembly is an example of a robotic task that requires direct physical contact with the surrounding environment [7]. It has been extensively researched in both 2-D [8, 9] and 3-D environments [10, 11], and a variety of techniques for solving it have been presented [8–15]. Conventional online programming methods, in which a teach pendant is used to guide the robot to the desired positions while each movement is recorded, have been widely used to train robots to perform precise industrial processes as well as assembly activities. This strategy is time consuming and challenging to adapt to new environments.
Another approach is offline programming (simulation) [9, 12], and while it has many advantages in terms of downtime, it is difficult to simulate a precise actual environment due to environmental variance, and it is inefficient in industrial activities when the required precision exceeds robot accuracy. So due to the limitation of these techniques, a new skill acquisition technique has been proposed [11, 15], where the robot learns to do the high precision mating task using reinforcement learning [11].

#### **2 State of the Art**

A variety of techniques for tackling Peg-in-Hole assembly challenges have been suggested [8–15]. This section goes over some of these strategies. Gullapalli et al. [8] investigated a 2D Peg-in-Hole insertion task, focusing on employing associative reinforcement learning to learn reactive control strategies in the presence of uncertainty and noise, with a 0.8 mm gap between peg and hole. A Zebra Zero robot with a wrist force sensor and position encoders was used. Their evaluation was conducted over 500 sequential training runs. Hovland et al. [15] proposed skill learning by human demonstration, for which they implemented a hidden Markov model. Nuttin et al. [12] ran a simulation with a CAD-based contact force simulator. Their results show that the insertion is effective if the force level or time surpasses a particular threshold. Their approach focuses solely on the insertion and uses reactive control with reinforcement as its strategy, in which the learning process is divided into two phases. The first phase is the controller, which consists of two networks: a policy network and an exploration network. The second phase is an actor-critic algorithm, in which the actor calculates the action policy and the critic is responsible for computing the Q-value. Yun [9] imitated the human arm using passive compliance and learning. He used a simulation implemented in MATLAB to solve a 2-D Peg-in-Hole task with a 3-DOF manipulator, focusing on the search phase only. The accuracy is 0.5 mm, and the training was done on a gap of 10 mm. The main goal of that research is to demonstrate the significance of passive compliance in association with reinforcement learning. We use integrated torque sensors with the deep learning algorithm, unlike Abdullah et al. [14], who used a vision system with force/torque sensors to achieve automatic assembly by imitating human operating steps; vision systems have limitations due to changes in illumination that may cause measurement errors.
Also, unlike the strategy of Inoue et al. [11], in which the robot's movement is caused by a force condition in the *x* and *y* directions, in our approach the robot's motion is a discrete displacement action in the *x* or *y* direction. A motion resulting from a force condition carries the risk that the force condition cannot be reached due to the physical interaction between the robot and the environment (e.g. the stick-slip effect), eventually resulting in a theoretically infinite motion. Furthermore, in the aforementioned approaches, the Peg-in-Hole task has not been conducted on a very narrow hole clearance, and some of these approaches were only confirmed in simulation, which is not as exact as the real world, adding to the challenge of adjusting to real-world variance. Moreover, our approach has the advantage of adapting to variations in both hole location and environmental settings, as well as the ability to take actions based on a prior state trajectory rather than just the current state. It is also capable of compensating for sensor delays.

#### **3 Problem Formulation and Task Description**

As previously stated, when the required precision of the assembly task surpasses the robot's precision, it is difficult to perform Peg-in-Hole assembly tasks, and it is even more challenging to perform them using force-controlled robotic manipulation. Our approach to solving the Peg-in-Hole task employs a recurrent neural network trained with reinforcement learning using skill acquisition techniques [11, 13]. The first learned skill is the search phase, in which the robot seeks to align the peg center within the clearance zone of the hole center. A successful search phase is followed by the insertion phase, in which the robot is responsible for correcting the orientational misalignment. This paper focuses solely on the search phase. The research is carried out with a clearance of 30 µm using a robot with a repeatability of 0.14 mm and a positional inaccuracy of some millimeters.

#### **4 Reinforcement Learning**

Reinforcement learning (RL) is an agent-in-the-loop learning approach in which an agent learns by performing actions on an environment and receiving a reward (*rt*) and an updated state (*st*) of the environment as a result of those actions. The aim is to learn an optimal action policy for the agent that maximizes the cumulative reward (*Rt*) shown in Eq. (1), where γ is the discount factor, *rt* is the reward generated by performing action *at*, and *t* denotes the step number. The learned action policy is the probability of selecting an action from a set of possible actions in the current state [11, 16].

$$R_t = r_t + \gamma \, r_{t+1} + \gamma^2 \, r_{t+2} + \dots + \gamma^T \, r_{t+T} = r_t + \gamma \, R_{t+1} \tag{1}$$
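The two forms of Eq. (1), the explicit sum and the recursion, can be checked against each other with a short sketch. The function names and the example reward list are illustrative assumptions and are not taken from the paper.

```python
def discounted_return(rewards, gamma):
    """Return R_t at t = 0 via the recursion R_t = r_t + gamma * R_{t+1}."""
    ret = 0.0
    # Walk backwards through the episode so each step applies the recursion once.
    for r in reversed(rewards):
        ret = r + gamma * ret
    return ret


def discounted_return_direct(rewards, gamma):
    """Same quantity computed from the explicit sum in Eq. (1)."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


# Hypothetical reward sequence used only for illustration.
example_rewards = [1.0, 0.0, 2.0]
print(discounted_return(example_rewards, 0.5))
print(discounted_return_direct(example_rewards, 0.5))
```

Both functions should agree up to floating-point error, which is the content of the second equality in Eq. (1).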

**Deep Q-Learning** Q-Learning is a model-free off-policy RL technique. Model-free techniques do not require an environment model. Off-policy techniques learn the optimal action policy implicitly by learning the optimal Q-value function. The Q-value function at a given state-action pair (*s*,*a*) is a measure of the desirability of taking action *a* in state *s*, as illustrated in Eq. (2).

$$Q^{\pi}(s, a) = \mathbb{E}\left[\sum_{k=0}^{T} \gamma^k r_{t+k} \mid s_t = s, a_t = a\right] \tag{2}$$

Q-Learning employs the ε-greedy policy as its behavior policy, in which an agent chooses a random action with probability ε and chooses the action that maximizes the Q-value for the (*s*,*a*) pair with probability 1 − ε (see Eq. (3)). In this paper, exploration and exploitation are not set to fixed percentages. Instead, the exploration rate decays linearly per episode as shown in Eq. (4).

$$a = \begin{cases} a \sim \mathrm{random}(A_t), & \text{with } P = \epsilon \\ \arg\max_{a} Q(s, a), & \text{with } P = 1 - \epsilon \end{cases} \tag{3}$$

$$\epsilon_{n+1} = \epsilon_{initial} - \epsilon_{decay} \cdot n \tag{4}$$
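A minimal sketch of the ε-greedy selection of Eq. (3) together with the linear decay of Eq. (4) is given below. The lower bound `eps_min` and all function names are assumptions added for illustration; the paper does not state how the rate is bounded.

```python
import random


def epsilon_after(n, eps_initial, eps_decay, eps_min=0.0):
    """Exploration rate after n completed episodes (epsilon_{n+1} in Eq. (4)).

    eps_min is an added assumption that keeps the rate from going negative.
    """
    return max(eps_min, eps_initial - eps_decay * n)


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over a list of Q-values, one entry per action."""
    if rng.random() < epsilon:
        # Explore: pick any action index uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the index of the largest Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


# With epsilon = 0 the choice is purely greedy.
print(select_action([0.1, 0.9, 0.3], epsilon=0.0))
```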

The simplest form of Q-Learning is the tabular form, which uses an iterative Bellman-based update rule as seen in Eq. (5), where α is the learning rate. Tabular Q-Learning computes the Q-value function for every (*s*,*a*) pair in the problem space, which makes it unsuitable for the assembly task at hand due to the complexity and variety of the environment. To overcome the drawbacks of the tabular formulation, DQN was introduced in [16], in which a neural network is employed as a function approximator of the Q-value of a (*s*,*a*) pair.

$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right] \tag{5}$$
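The tabular update of Eq. (5) can be sketched in a few lines. The dictionary-based table, the state and action labels, and the handling of terminal transitions through `done` are assumptions for illustration only.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha, gamma, done=False):
    """Apply one tabular Q-Learning step (Eq. (5)) and return the TD error."""
    # Assumption: a terminal transition has no future value.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


# Hypothetical two-state example; unseen (state, action) pairs default to 0.
q_table = defaultdict(float)
q_table[("s1", "right")] = 2.0
q_update(q_table, "s0", "left", 1.0, "s1", ["left", "right"], alpha=0.5, gamma=0.9)
print(q_table[("s0", "left")])
```

The table grows with every distinct (*s*,*a*) pair, which is the scaling problem that motivates the function approximator.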

**Deep Recurrent Q-Learning** While Deep Q-Learning can learn action policies for problems with large state spaces, it struggles to learn sequential problems where the action choice is based on a truncated trajectory of prior states and actions. This challenge motivated the use of another DQN variant that has a memory to encode previous trajectories. In this paper, a Deep Recurrent Q-Network (DRQN) is utilized as a suitable DQN variant. DRQN was introduced in [17] to solve the RL problem in a partially observable Markov decision process (POMDP). DRQN utilizes long short-term memory (LSTM) layers to add recurrency to the network architecture. The LSTM layer can encode previous (*s*,*a*) trajectories, providing enhanced information for learning the Q-values. In addition, the recurrency can account for sensor and communication delays.

**Action and Learning Loops** The Deep Recurrent Q-Learning algorithm is illustrated in Fig. 1. The algorithm can be divided into two parallel loops: the action loop (Green) and the learning loop (Yellow). The action loop is responsible for choosing the agent's action, where the current environment state is fed through a policy network. The policy network estimates the Q-value function over the current state and the set of available actions. Based on the ε-greedy exploration rate, the agent's action is either the action with the highest Q-value or a randomly sampled action, as illustrated in Eq. (3). At each step, the experience (*st*, *rt*, *at*, *st*+1) is saved in a replay memory. After a predefined number of episodes, the agent starts learning from randomly sampled experience batches. Each experience batch is a sequence of steps with a defined length from a randomly sampled episode. The target network is an additional network serving as a temporarily fixed target for the optimization of the Bellman Eq. (5). The weights of the target network are copied from the policy network after a number of steps. The policy network estimates the Q-value of the (*st*, *at*) pair, while the target network estimates the maximum Q-value achievable in *st*+1. The outputs of both networks are used to compute the loss function in Eq. (6). The policy network is trained by gradient descent, with the loss backpropagated as illustrated in Eq. (7).
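The replay memory described above stores whole episodes and returns a fixed-length run of consecutive steps from one randomly chosen episode. The following sketch shows one way this could be organised; the class name, the capacity handling, and the tuple layout are assumptions and not the authors' implementation.

```python
import random
from collections import deque


class EpisodicReplayMemory:
    """Keeps the most recent episodes and samples consecutive step sequences."""

    def __init__(self, capacity_episodes, rng=None):
        # Oldest episodes are dropped automatically once the capacity is reached.
        self.episodes = deque(maxlen=capacity_episodes)
        self.rng = rng or random.Random()

    def add_episode(self, transitions):
        """Store one episode as a list of (s_t, r_t, a_t, s_next) tuples."""
        if transitions:
            self.episodes.append(list(transitions))

    def sample_sequence(self, seq_len):
        """Return seq_len consecutive transitions from one random episode."""
        candidates = [ep for ep in self.episodes if len(ep) >= seq_len]
        if not candidates:
            raise ValueError("no stored episode is long enough")
        episode = self.rng.choice(candidates)
        start = self.rng.randrange(len(episode) - seq_len + 1)
        return episode[start:start + seq_len]
```

Sampling consecutive steps rather than isolated transitions is what lets a recurrent layer see a trajectory during training.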

**Fig. 1** Action and Learning Loops

$$L_{\theta} = \frac{1}{2} \left[ \text{target} - \text{prediction} \right]^2 = \frac{1}{2} \left[ r + \gamma \max_{a'} Q_{\theta'}(s', a') - Q_{\theta}(s, a) \right]^2 \tag{6}$$

$$\theta \gets \theta + \alpha \left( r + \gamma \max_{a'} Q_{\theta'}(s', a') - Q_{\theta}(s, a) \right) \nabla_{\theta} Q_{\theta}(s, a) \tag{7}$$
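The target, the squared TD loss of Eq. (6), and the periodic weight copy to the target network can be illustrated without a deep learning library. In this sketch the network outputs are passed in as plain lists of Q-values; the function names, the `done` flag, and the list-based weights are assumptions for illustration.

```python
def td_target(reward, next_q_target, gamma, done=False):
    """Compute r + gamma * max_a' Q_theta'(s', a') from target-network outputs."""
    # Assumption: no bootstrap term is added on a terminal step.
    return reward if done else reward + gamma * max(next_q_target)


def td_loss(q_policy, action, reward, next_q_target, gamma, done=False):
    """Squared TD error of Eq. (6) for a single transition."""
    prediction = q_policy[action]
    target = td_target(reward, next_q_target, gamma, done)
    return 0.5 * (target - prediction) ** 2


def maybe_sync(step, sync_every, policy_weights, target_weights):
    """Copy policy weights into the target network every sync_every steps."""
    if step % sync_every == 0:
        target_weights[:] = policy_weights
        return True
    return False


# Hypothetical Q-values for a two-action problem.
print(td_loss([0.2, 0.5], action=1, reward=1.0, next_q_target=[0.0, 1.0], gamma=0.9))
```

In a full implementation the gradient of this loss with respect to the policy weights gives the update of Eq. (7); an automatic differentiation framework would normally provide it.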

#### **5 Search Skill Learning Approach**

This paper focuses on the search skill, which is discussed in more detail in the following subsections. Fig. 2 illustrates the learning process.

**Initial Position** Each episode starts with the peg in a random position. The polar coordinates of the initial position are determined by a predefined radius from the hole's center and a randomly sampled angle between 0 and 2π (see Fig. 3). The advantage of utilizing such an initialization method is that it maintains the initial distance to the hole center while searching the full task space.
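The initialization can be sketched as sampling a point on a circle around the hole center. The function name, the argument layout, and the use of Cartesian output coordinates are assumptions made for illustration.

```python
import math
import random


def sample_initial_position(hole_xy, radius, rng=random):
    """Sample a start position at a fixed distance from the hole center.

    The angle is drawn uniformly between 0 and 2*pi, so the start points
    cover the whole circle while the distance to the hole stays constant.
    """
    angle = rng.uniform(0.0, 2.0 * math.pi)
    x = hole_xy[0] + radius * math.cos(angle)
    y = hole_xy[1] + radius * math.sin(angle)
    return x, y


# Hypothetical hole center and radius, in arbitrary length units.
print(sample_initial_position((0.0, 0.0), 3.0))
```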

**Fig. 2** Illustration of How Robot Learns New Skill Using Deep RL

**Fig. 3** Peg Initial Position Strategy

**State** At each time step, the reinforcement learning (RL) agent receives a new state sensed by the robot (see Fig. 2, lower arrows), which consists of the forces in *x*, *y*, and *z* (*Fx*, *Fy*, *Fz*), the moments around *x* and *y* (*Mx*, *My*), and the rounded positions in *x* and *y* (*P̃x*, *P̃y*), as seen in Eq. (8). In order to provide enough robustness against positional inaccuracy, it was assumed that the hole and the peg were not precisely positioned. *P̃x* and *P̃y* are computed using the grid indicated in Fig. 4, where *C* is the margin of the positional error. This approach provides auxiliary inputs to the network, which can aid in accelerating learning convergence.

$$S = \left[ F_x, F_y, F_z, M_x, M_y, \tilde{P}_x, \tilde{P}_y \right] \tag{8}$$
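The state vector of Eq. (8) combines raw force and moment readings with grid-rounded positions. Since Fig. 4 is not reproduced here, the exact rounding rule is unknown; the sketch below assumes that each coordinate is snapped to the nearest multiple of the grid size *C*, and all names are illustrative.

```python
def round_to_grid(value, cell):
    """Snap a coordinate to the nearest multiple of the grid size.

    Assumption: nearest-multiple rounding; the paper's Fig. 4 may define
    the grid differently.
    """
    if cell <= 0:
        raise ValueError("grid size must be positive")
    return round(value / cell) * cell


def build_state(fx, fy, fz, mx, my, px, py, cell):
    """Assemble the seven-element state of Eq. (8)."""
    return [fx, fy, fz, mx, my, round_to_grid(px, cell), round_to_grid(py, cell)]


# Hypothetical sensor readings and a grid size of 0.5 length units.
print(build_state(0.1, -0.2, 5.0, 0.01, 0.02, 1.26, -0.2, cell=0.5))
```

Rounding means that two measured positions inside the same grid cell produce the same input, so small positional errors do not change what the network sees.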

**Action** A deep neural network (policy network) is utilized in the current system to estimate the Q-value of each (*s*,*a*) pair, from which an action index is generated and sent to the robot controller. The action index selects one of four discrete actions (see Fig. 2, upper arrows), each of which applies a constant force in the *z*-direction (*F*<sub>z</sub><sup>d</sup>). According to Eq. (9), each action shifts the desired position in either *x* or *y* by ±*d*<sub>x</sub><sup>d</sup> or ±*d*<sub>y</sub><sup>d</sup>. For all four discrete actions, the orientation of the peg (*R*<sub>x</sub><sup>d</sup>, *R*<sub>y</sub><sup>d</sup>) is set to zero throughout the search phase. The advantage of maintaining a constant and continuous force in the *z*-direction is that when the search algorithm finds the hole, the peg height drops by a fraction of a millimeter, which serves as the success criterion for the search phase.

$$a = \left[ d_x^d, d_y^d, F_z^d, R_x^d, R_y^d \right] \tag{9}$$
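The mapping from an action index to the command vector of Eq. (9) could look as follows. The ordering of the four actions, the field names, and the step size are assumptions; the paper only states that there are four discrete displacement actions with constant *z*-force and zero orientation.

```python
from typing import NamedTuple


class Command(NamedTuple):
    """Desired displacement, contact force, and orientation (Eq. (9))."""
    dx: float
    dy: float
    fz: float
    rx: float
    ry: float


def action_to_command(index, step, fz):
    """Translate an action index (0..3) into a robot command.

    Assumed ordering: 0 = +x, 1 = -x, 2 = +y, 3 = -y.
    """
    moves = [(step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)]
    if not 0 <= index < len(moves):
        raise ValueError("action index must be between 0 and 3")
    dx, dy = moves[index]
    # The orientation stays at zero during the search phase.
    return Command(dx, dy, fz, 0.0, 0.0)


print(action_to_command(1, step=0.1, fz=5.0))
```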

**Reward** A reward function is used to evaluate how much an agent is rewarded or punished for performing an action in the current state. In the proposed methodology, the reward (*rt*) is calculated after the completion of each episode. The reward zones for our search task are illustrated in Fig. 5. First, the inner circle (Green Zone) indicates that the peg has either reached the goal position or that the maximum number of steps (*kmax*) per episode has been reached with the peg close to the goal. Inside the second circle (White Zone), the peg is at a distance less than the initial distance (*do*) and receives a reward of zero. When the robot moves away from the starting position toward the boundaries of the safety limits (Yellow Zone), the agent receives a negative reward. Finally, the outer square (Red Zone) is the working space barrier; it indicates that the peg is violating the safety restrictions (*D*) and receives the highest negative reward.

**Fig. 4** Examples of Peg Position Rounding Approach Using Grid Size

$$\text{On failure:}\quad r = -\frac{d - d_o}{D - d_o}, \qquad d_o < d < D$$

$$\text{On success:}\quad r = 1 - \frac{k}{k_{max}}$$
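The reward zones can be collected into one function. The success and Yellow Zone expressions follow the equations above and the zero reward of the White Zone follows the text; the value of -1 for the Red Zone is an assumption (it is the limit of the Yellow Zone expression at *d* = *D*), as is reading *k* as the number of steps taken in the episode.

```python
def episode_reward(success, k, k_max, d, d_o, d_limit, red_zone_reward=-1.0):
    """Reward assigned at the end of an episode.

    success: whether the hole was found
    k, k_max: steps taken (assumed meaning of k) and maximum steps per episode
    d: final distance to the hole, d_o: initial distance, d_limit: safety limit D
    red_zone_reward: assumed value for leaving the work space
    """
    if success:
        return 1.0 - k / k_max            # faster success gives a larger reward
    if d <= d_o:
        return 0.0                        # White Zone
    if d < d_limit:
        return -(d - d_o) / (d_limit - d_o)   # Yellow Zone, between 0 and -1
    return red_zone_reward                # Red Zone


print(episode_reward(True, 25, 100, 0.0, 1.0, 5.0))
print(episode_reward(False, 100, 100, 3.0, 1.0, 5.0))
```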

**Fig. 5** Different Reward Zones

#### **6 Implementation and Validation**

A KUKA LBR iiwa, a sensitive robot arm with an open kinematic chain and integrated sensors, is used for this work. The integrated torque sensors are based on strain gauges in each of the robot's joints and allow the external forces and torques acting on the robot to be determined. Force-controlled robot applications are therefore possible when combined with the control approach discussed. The peg and block used in this study are made of corrosion-resistant stainless steel, which is well suited for this purpose given the continuous force exerted during the experiments. The clearance between the peg and the hole is 30 µm. The experimental setup is displayed in Fig. 6. As mentioned before, the assembly is carried out with the assistance of artificial intelligence because the required task precision exceeds the robot's precision. According to KUKA, the position repeatability of the LBR iiwa is ±0.15 mm [18]. This was verified according to DIN 9283:1998 with the help of a high-precision laser tracker API R50-Radian with an accuracy of ±10 µm + 5 µm/m. The measured repeatability was 0.14 mm, which is roughly five times the clearance between the peg and the hole. To ensure data flow between the DRQN and the robot, Message Queuing Telemetry Transport (MQTT) was used. MQTT is a bidirectional network protocol based on the client-server principle rather than the end-to-end connection paradigm used by many other network protocols. Messages are not sent directly to clients; rather, communication is event-based and follows the publish-subscribe paradigm [19].

**Fig. 6** Experimental Setup: in simulation (left) and in reality (right)

To validate our approach, we conducted experiments by running 200 learning episodes followed by some test runs. To achieve near-optimal hyperparameter values, a few tests were conducted in which all variables were held constant except one, which was adjusted. Throughout the training, the agent identified the hole in 130 of the 200 episodes. Two test trials were conducted, in which the agent located the hole 18 times out of 21 and 27 times out of 31, for an overall success rate of 86.5%. As shown in Fig. 7c, the loss decreases during the training process, which also indicates a well-chosen learning rate. Fig. 7a shows that the peg strives to stay near the hole position and only drifts further away a few times. Experiments revealed that a sparse reward function (Fig. 7b) is not the best fit for the search challenge, and that denser reward functions should be investigated. Fig. 7d shows a cutout from the trajectory for two cases in which the hole was identified (Success) and one case in which the defined limits were exceeded (Failure).
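The reported rates can be reproduced from the counts given above. The overall test figure is the pooled rate over both trials, not the mean of the two per-trial rates; the helper name is an illustrative assumption.

```python
def success_rate(trials):
    """Pooled success rate over (successes, attempts) pairs."""
    successes = sum(s for s, _ in trials)
    attempts = sum(n for _, n in trials)
    return successes / attempts


training = success_rate([(130, 200)])          # 130 / 200
testing = success_rate([(18, 21), (27, 31)])   # 45 / 52

print(f"training: {training:.1%}, testing: {testing:.1%}")
```

The repeatability comparison works the same way: 0.14 mm divided by the 0.03 mm clearance is about 4.7, which the text rounds to five.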

**Fig. 7** Experimental results

#### **7 Conclusion and Future Work**

This research demonstrated and validated the success of our proposed strategy using DRQN in addressing a high-precision Peg-in-Hole assembly task using a 7-DOF sensitive robot with integrated sensors. The employed approach was successful in completing the search phase. It was also shown that integrating recurrence into a reinforcement learning system via an LSTM layer overcomes the drawbacks of DQN: the LSTM layer was able to encode previously taken decisions, allowing the agent to make better-informed decisions and overcome sensor delays. In the future, the approach will be extended to the insertion phase, and the network architecture will be improved, including tuning the hyperparameters, in order to reach an overall success rate of 100%. In addition, we are planning to evaluate continuous action space techniques such as DDPG, DPPO, or NEAT, which may further enhance the performance.

**Acknowledgements** The research is funded by the Interreg V A Großregion within Robotix-Academy project (no 002-4-09-001).

#### **References**


### **Generalized Model for the Optimization of a LIFO Topology Storage Using a Metaheuristic Algorithm**

Dominik Kuhn, Jan Adelsbach, Martin Karkowski and Rainer Müller

#### **Abstract**

LIFO topologies, due to their simplicity and high degree of space usage efficiency, are common in applications for which a flexible and cost-effective storage solution is required. This topology, however, represents an optimization challenge due to insertion and removal constraints. A scalable generalized model for the optimization of this topology using a population-based metaheuristic algorithm is presented in this paper. The model representing this storage topology in a way suitable for population-based metaheuristics, and its implementation, are discussed. The model is validated using practical usage scenarios from logistics and assembly such as non-stacking condensed pallet storage.

#### **Keywords**

LIFO • Metaheuristic • Genetic algorithm

#### **1 Introduction**

Last-In-First-Out (LIFO) type storage is used in logistics and assembly systems, where a high degree of storage efficiency is required. Application scenarios include part shelves in assembly, automated guided vehicles in matrix production [1] and in general warehousing.

A particular optimization problem of this storage type is the reduction of item shuffling, that is the reduction of item relocation in order to clear access to an item that is to be retrieved. In an optimal case the item would always be at the front of the respective strip.

D. Kuhn (B) · J. Adelsbach · M. Karkowski · R. Müller

Centre for Mechatronics and Automation gGmbH (ZeMA), Saarbrücken, Germany e-mail: d.kuhn@zema.de

<sup>©</sup> The Author(s) 2023

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_2

However, this can only be guaranteed if the strip is homogeneous in terms of item types. As described in [2, 3], in certain industries such as food, medical and chemicals this is not always possible due to expiration dates and production batch requirements. Furthermore, a type-pure homogeneous storage would require enough strips to handle all item types as well as any potential backlog thereof, which is unrealistic in many application scenarios.

Usually, an attempt is made to describe this problem mathematically and solve it as a Linear Programming (LP) problem, as described in [4]. Many works are grouped under the term *Storage Location Assignment Problem (SLAP)*, many of which do not explicitly consider the LIFO topology. In a previous paper [2], we presented an approach that deals precisely with the problem of optimizing this type of LIFO storage using metaheuristics. In this paper, we discuss the structure and further development of the data structure, as well as the effects of the different approaches.

The original motivation of this work was the optimization of a LIFO type pallet warehouse of a beverage producer. However, given the aforementioned further application scenarios and the small amount of research on this subject, it was decided to generalize the approach further, as we see potential in its use in adaptable assembly systems.

We first define the model used for the algorithm, the prior work done and give a brief overview of the theoretical basis. Subsequently we describe the evolution of the data structure used and the convergence behavior as well as the concepts of exploitation versus exploration with the said data structure.

#### **2 Problem Description**

The general model concerned with this optimization problem is that of a LIFO type storage consisting of an arbitrary number of strips. Each strip acts like an individual LIFO row, in that items can only be removed in the reverse order in which they were stored. This mirrors a typical generalized application scenario of LIFO storage, such as pallet warehousing.
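A single strip behaves like a bounded stack, and the shuffling effort for a retrieval is the number of items standing in front of the requested one. The following sketch illustrates this; the class, its method names, and the item labels are assumptions and do not reproduce the data structure used in the paper.

```python
class Strip:
    """One LIFO row; the last element of the list is the accessible front."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def push(self, item):
        """Store an item at the front of the strip."""
        if len(self.items) >= self.capacity:
            raise OverflowError("strip is full")
        self.items.append(item)

    def pop(self):
        """Remove and return the front item."""
        return self.items.pop()

    def relocations_to_reach(self, item):
        """Count the items that must be moved to reach the nearest match."""
        for depth, stored in enumerate(reversed(self.items)):
            if stored == item:
                return depth
        raise KeyError(item)


# Hypothetical strip holding two item types.
strip = Strip(capacity=3)
for label in ["A", "B", "A"]:
    strip.push(label)
print(strip.relocations_to_reach("B"))
```

A type-pure strip always yields zero relocations, which is the optimal case mentioned in the introduction.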

Following a similar definition as [3], let *W* be a storage composed of *ns* strips, *W* = {1, ..., *ns*}, each of which may contain up to *ni* items, *Wi* = (*s*1, ...). Given a sequence of *nz* items to be placed into the storage, *Z* = (*z*1, ...), the goal is to find a *configuration* for the placement of the items such that the storage, evaluated by a scoring method *f*(*W*) → ℝ, sits at Pareto optimality.

The implementation of *f* (*W*) depends on the desired properties and can be implemented to examine single items, strips or the storage as a whole. Depending upon the formulation this can represent a multi-objective optimization problem. For example [2, 5] describe possible variants examining the homogeneity of strips in various manners and their combination to a single score. Further multi-dimensional approaches could additionally assess the access to items through empty neighbouring strips.
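As an illustration of a strip-level scoring method, the sketch below rates each strip by the share of its most frequent item type and averages over the storage. This particular measure is an assumption chosen for simplicity; it is not the fitness formulation of [2] or [5].

```python
from collections import Counter


def strip_homogeneity(strip):
    """Share of the most frequent item type in a strip (1.0 if empty)."""
    if not strip:
        return 1.0
    most_common_count = Counter(strip).most_common(1)[0][1]
    return most_common_count / len(strip)


def storage_score(storage):
    """Mean strip homogeneity over all strips; higher is better."""
    if not storage:
        return 0.0
    return sum(strip_homogeneity(s) for s in storage) / len(storage)


# Hypothetical storage with three strips, given as lists of item types.
print(storage_score([["A", "A"], ["A", "B"], []]))
```

Several such measures could be combined into a single score, or kept separate for a multi-objective formulation.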

#### **3 Previous Works**

Current research approaches consider single item allocation, such as [3] with an ad-hoc placement strategy and a *greedy relocation procedure*, [5] with a genetic algorithm refined with simulated annealing or [4] with a linear model. However in those algorithms the placement of only single items is being considered, restricting the view of potential storage optimality that the consideration of multiple items could offer.

It is therefore advantageous to optimize the storage allocation for multiple items at once. This allows the algorithm for example to take into account the current production schedule. In the previous work [2] we present an approach to solve the allocation problem for an ordered batch of items. The focus of the latter was on illustrating and explaining the problem in form of an implementation of a genetic algorithm. The implementation of typical storage restrictions in terms of the fitness function formulation was furthermore handled in detail and mathematically described. It should be noted that the way in which the fitness methods are implemented has a significant influence on the runtime and parameterisation of the algorithm.

This approach was developed in the Python programming language on the basis of a genetic algorithm module. The module was extended by a number of adapted fitness, crossover and mutation methods. A meta class inherits from the original genetic algorithm module and overrides certain functions, adding support for the LIFO data structures described below as well as debugging and profiling functionality. The approach was tested in a real scenario using real-time data, obtained through an application programming interface (API), from a regional beverage producer that operates such a LIFO storage. The storage has a capacity of up to 5000 pallets, with strip sizes varying up to 30 pallets. The pallets whose storage location is to be optimized are taken from a production queue that varies between 8 and 12 pallets.

#### **4 Theoretical Basis**

#### **4.1 Biological Model and Basic Idea**

The idea of genetic algorithms is inspired by evolutionary processes in nature, through which individuals adapt increasingly well to environmental conditions. The principle, described e.g. in [6, 7], goes back to Charles R. Darwin and is commonly summarised as "survival of the fittest". The basic idea of transferring this approach to mathematical optimisation problems as a metaheuristic goes back to [8].

As an evolutionary optimisation method, the solution is represented by an individual in genetic algorithms. Multiple individuals together build a population and thereby a set of solutions in a search space.

The individuals of each generation must be characterised with regards to the problem under consideration for solution quality. This task is usually performed by a fitness function that quantifies the quality of an individual with the use of a real value.

Corresponding to the biological mechanisms of crossing, mutation and selection, there are also methods to enable the population of individuals to evolve.

#### **4.2 Structure and Function of the Genetic Algorithm**

For a better understanding, this section first briefly reviews the workings of a genetic algorithm. The algorithm consists mainly of the components and phases described below, which are executed in an iterative manner.


The first step of an evolutionary algorithm such as a genetic algorithm is to define a description or **representation** of the context and the search space of the problem. This often involves simplifying or abstracting a real-world problem to derive a clearly defined context. It must be decided how a possible solution should be coded and stored so that it can be processed by a computer in the given programming language. Simple problems are often implemented in the form of binary permutations, so-called *genotype representations*. More complex problems must be implemented using more complex data types, as simple binary encoding is no longer sufficient. In this case, several structures are often used to map the relationships.

After coding the problem, the next step is to **initialise** a starting population randomly. The **fitness** of the individuals is then calculated using a suitable fitness function.

A **selection** method is used to pick out a given number of the fittest individuals from the current population in order for them to be transferred into the next population. This usually takes into account a stochastic component when selecting the fittest individuals [6, 9].

During **recombination**, new individuals are generated from two selected individuals with a crossover operator. From the newly emerged individuals, candidates for **mutations** are selected with a low probability to strengthen the exploration of the search space. Mutation involves changing a random part of the individual. Various mutation methods are evaluated in [10]. Depending on the coding chosen, a mutation method has a different impact on exploration. Depending on the problem, certain mutation operations are not effective and produce useless solutions in the search space [9].
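The phases described above can be sketched as a short, generic loop. This is a minimal illustration on a toy problem (maximising the number of ones in a bit string), not the implementation used in this work. The tournament selection, the operator choices and all parameter values are assumptions made for the example.

```python
import random


def evolve(fitness, new_individual, crossover, mutate,
           pop_size=30, generations=40, elite=2, p_mut=0.1, seed=0):
    """Elitist GA loop; returns the best individual and the best fitness per generation."""
    rng = random.Random(seed)
    population = [new_individual(rng) for _ in range(pop_size)]  # initialisation
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)   # fitness evaluation
        history.append(fitness(ranked[0]))
        next_pop = ranked[:elite]                  # elitism: keep the fittest unchanged
        while len(next_pop) < pop_size:
            # Tournament selection provides the stochastic component.
            a = max(rng.sample(population, 3), key=fitness)
            b = max(rng.sample(population, 3), key=fitness)
            child = crossover(a, b, rng)           # recombination
            if rng.random() < p_mut:               # mutation with low probability
                child = mutate(child, rng)
            next_pop.append(child)
        population = next_pop
    return max(population, key=fitness), history


def new_bits(rng):
    return [rng.randint(0, 1) for _ in range(20)]


def one_point(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def flip(ind, rng):
    i = rng.randrange(len(ind))
    return ind[:i] + [1 - ind[i]] + ind[i + 1:]


best, history = evolve(sum, new_bits, one_point, flip)
```

Because the fittest individuals are copied unchanged, the best fitness recorded in `history` never decreases from one generation to the next.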

**Fig. 1** Sparse and dense data structures for the same item configuration. Shaded boxes in the storage are occupied, empty boxes are free. Numbered vector elements are strip indices

#### **5 Data Structure**

The data structure in the previous work [2] was initially chosen in such a way that standard crossover and mutation operators for the traveling salesman problem, such as those described in [6], can be applied, while the interpretation of the data structure keeps track of the LIFO principle and the item sequence.

This was originally implemented as two vectors. In the first, the *map* vector, every element corresponds to an empty slot in the LIFO storage and holds the number of the strip it belongs to. That is, if a strip *i* has *n* free slots, the vector contains *n* elements with value *i*. The second vector, the *individual*, has the same size and corresponds by index to the *map* vector. Each of its elements holds either an empty indicator or the identifier of an item to be placed (Fig. 1).

This individual vector would then be interpreted sequentially from left to right, skipping over empty elements and pushing each item into the free position farthest from the front of the corresponding strip. This is illustrated in Algorithm 1 for an individual vector *I* and a map vector *M*, both of size *n*, together with a function pushToStrip(strip, item) that pushes the item onto the back of a strip.

#### **Algorithm 1** Sparse vector to storage mapping

```
Require: I, M, n, pushToStrip(strip, item)
 for i := 1 to n do
    if Ii ≠ ∅ then
       pushToStrip(Mi, Ii)
    end if
 end for
```
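As a runnable counterpart, the sparse mapping can be sketched in Python as follows. The use of `None` as the empty indicator, the helper `build_map` and the example values are assumptions for illustration. The sketch follows the description in the text, which skips empty elements and pushes only occupied ones.

```python
EMPTY = None  # empty-element indicator (assumed representation)


def build_map(free_slots):
    """Map vector: strip i appears once for every free slot in strip i."""
    return [strip for strip, n in sorted(free_slots.items()) for _ in range(n)]


def sparse_to_storage(individual, strip_map, storage):
    """Interpret the individual from left to right, skipping empty elements."""
    for item, strip in zip(individual, strip_map):
        if item is not EMPTY:
            storage[strip].append(item)   # push the item onto the strip (LIFO)
    return storage


free_slots = {1: 2, 2: 1, 3: 3}           # hypothetical free slots per strip
M = build_map(free_slots)                 # [1, 1, 2, 3, 3, 3]
I = ["z2", "z1", EMPTY, EMPTY, "z3", EMPTY]
storage = sparse_to_storage(I, M, {1: [], 2: [], 3: []})
```

In this example, strip 1 receives `z2` first and `z1` second, so `z1` must be removed before `z2` can be reached.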
Standard mutation and crossover operators need a continuous vector filled with values. In order to facilitate the use of those operators, another vector would be generated from the individual in which all empty element indicators are filled with values that are distinct from the item identifiers.

After the application of the standard operator the filled in values are replaced again with empty element indicators, such that the resulting data structure matches the representation as described above.
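The fill, apply and restore steps can be sketched as follows. The negative integer placeholders and the swap mutation are assumptions chosen for the example. Any operator that expects a fully populated vector of distinct values could be passed in instead.

```python
import random

EMPTY = None  # empty-element indicator (assumed representation)


def with_standard_operator(individual, operator, rng):
    """Fill empties with unique placeholders, apply the operator, restore empties."""
    # Negative integers are assumed never to be used as item identifiers.
    placeholders = iter(range(-1, -len(individual) - 1, -1))
    filled = [next(placeholders) if x is EMPTY else x for x in individual]
    result = operator(filled, rng)
    return [EMPTY if isinstance(x, int) and x < 0 else x for x in result]


def swap_mutation(vec, rng):
    """Standard swap mutation on a fully populated vector."""
    i, j = rng.sample(range(len(vec)), 2)
    out = list(vec)
    out[i], out[j] = out[j], out[i]
    return out


I = [EMPTY, "z2", "z1", EMPTY, "z3", EMPTY]
mutated = with_standard_operator(I, swap_mutation, random.Random(1))
```

The result contains the same items and the same number of empty elements as the input, only in a different arrangement.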

A problem with this data structure is that, due to the high sparsity of the vector, probabilistic selection methods will often pick empty elements. Attempts were made to overcome this problem by having probabilistic operations choose only from occupied elements, using an index vector of the non-empty elements derived as described in the concepts of [11].

Furthermore, attempts were made to minimize the overall vector size by limiting the number of free slot elements per column to the maximum number of items to be placed, or by filtering out the worst scoring strips.

In order to address these issues, the data structure was redesigned such that the individual vector is a standard dense vector. In this approach the map vector has as many elements as there are items to be stored and holds their identifiers. The individual vector, on the other hand, contains the strip numbers. The number of free slots per strip is stored in a separate data structure, which is implemented using a standard dictionary. Care needs to be taken to ensure that each strip index occurs in the individual vector at most as often as there are free slots in that strip. This requires special handling in the initial population generation and in the mutation operators.

This vector is still interpreted from left to right in order to maintain the storage order. Following the same terminology as for Algorithm 1, Algorithm 2 shows how this structure is mapped into a storage.
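A possible Python reading of the dense mapping is sketched below. It is written from the prose description only and is not a reproduction of Algorithm 2. The function names and example values are assumptions.

```python
from collections import Counter


def is_feasible(individual, free_slots):
    """Each strip index may occur at most as often as the strip has free slots."""
    return all(n <= free_slots.get(s, 0) for s, n in Counter(individual).items())


def dense_to_storage(individual, items, storage):
    """individual[k] is the target strip of items[k]; read from left to right."""
    for strip, item in zip(individual, items):
        storage[strip].append(item)       # push the item onto the strip (LIFO)
    return storage


items = ["z1", "z2", "z3"]                # map vector: identifiers in sequence order
free_slots = {1: 2, 2: 1, 3: 3}           # free slots per strip, kept in a dictionary
individual = [3, 1, 3]                    # dense individual: one strip index per item
feasible = is_feasible(individual, free_slots)
storage = dense_to_storage(individual, items, {1: [], 2: [], 3: []})
```

Compared with the sparse form, the vector length equals the number of items rather than the number of free slots, which is what makes the representation dense.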


In this approach the crossover and mutation functions need to be significantly adjusted to handle and modify the vector. In particular, a mutation function that does not merely swap elements must ensure, when altering a strip index, that the total number of occurrences of the new index in the individual is less than or equal to the number of free slots in that strip. This can be achieved by first counting the free slots for every strip in the storage and then subtracting the slots already claimed by the individual. The mutation can then select a new strip index from the remaining ones.
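The counting procedure can be sketched as a capacity-aware mutation. The function name `mutate_strip` and the choice to mutate exactly one position are assumptions for illustration.

```python
import random
from collections import Counter


def mutate_strip(individual, free_slots, rng):
    """Reassign one position to a strip that still has an unclaimed free slot."""
    pos = rng.randrange(len(individual))
    claimed = Counter(individual)
    claimed[individual[pos]] -= 1          # the mutated gene releases its own slot
    remaining = [s for s, n in free_slots.items()
                 if n - claimed[s] > 0 and s != individual[pos]]
    if not remaining:                      # no alternative strip is available
        return list(individual)
    mutated = list(individual)
    mutated[pos] = rng.choice(remaining)
    return mutated


free_slots = {1: 2, 2: 1, 3: 3}
individual = [3, 1, 3]
mutated = mutate_strip(individual, free_slots, random.Random(0))
```

Since a strip is only offered while it has at least one unclaimed slot, the mutated individual always satisfies the capacity constraint.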

Standard crossover functions require distinct values. This can be achieved by operating on a *shadow* vector with increasing values that represent the indices of the individual vector. After the crossover has been applied to the shadow vector, the result is used as a permutation of the individual.
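The last step, applying a permuted shadow vector to the individual, can be sketched as follows. How the shadow vectors of two parents are combined is not detailed here, so the permuted shadow in this sketch is a hand-written stand-in for the output of a permutation crossover.

```python
from collections import Counter


def apply_permutation(individual, shadow):
    """Reorder the individual according to a permuted shadow (index) vector."""
    if sorted(shadow) != list(range(len(individual))):
        raise ValueError("shadow must be a permutation of the indices")
    return [individual[i] for i in shadow]


individual = [3, 1, 3, 2]        # strip indices, may contain duplicates
shadow = [2, 0, 3, 1]            # stand-in for the result of a permutation crossover
child = apply_permutation(individual, shadow)   # [3, 3, 2, 1]
```

A permutation only reorders the strip indices, so the number of occurrences of each strip is unchanged and the capacity constraint of the dense vector remains satisfied.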

#### **6 Convergence Behavior**

The convergence behavior can be assessed both in terms of computational performance and maximum achieved optimality. In order to provide comparable results for the data structure approaches, a snapshot of a warehouse configuration from a beverage manufacturer was taken and used to validate the performances.

In order to improve performance, the fitness of all strips in the storage as-is is computed upfront. During the execution of the algorithm, the fitness is only recomputed for strips to which items have been assigned according to the configuration of the individual. This allows fast computation of the fitness score for the whole storage. It results in a predictable performance pattern: if items are clustered, with an iteratively decreasing spread across the storage, the execution time of the fitness function over the population decreases. This behavior can be examined in Figs. 2 and 3. The pattern applies to both data structures, with the sparse structure having a significantly higher offset time and taking 10 to 20 times longer to compute than the dense structure.
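The caching idea can be illustrated with a small sketch. The per-strip score used here is a placeholder and the function names are assumptions. The point is only that strips which receive no items reuse their precomputed score.

```python
def strip_score(items):
    """Placeholder per-strip fitness: share of the most frequent item type."""
    return max(items.count(x) for x in set(items)) / len(items) if items else 1.0


def cached_fitness(storage, base_scores, individual, items):
    """Re-score only strips that receive items; reuse cached scores elsewhere."""
    touched = {}
    for strip, item in zip(individual, items):
        touched.setdefault(strip, list(storage[strip])).append(item)
    total = sum(base_scores.values())
    for strip, new_items in touched.items():
        total += strip_score(new_items) - base_scores[strip]
    return total / len(storage)


storage = {1: ["cola"], 2: ["water"], 3: []}
base_scores = {s: strip_score(v) for s, v in storage.items()}   # computed once upfront
value = cached_fitness(storage, base_scores, [1, 2], ["cola", "cola"])
```

The fewer distinct strips an individual touches, the fewer per-strip scores have to be recomputed, which matches the observation that clustered placements are evaluated faster.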

**Fig. 2** Average fractional execution time of the different steps in the genetic algorithm using the dense vector approach from around 100 executions

**Fig. 3** Convergence behavior of the dense vector approach using exploitation only and both exploitation and exploration. The higher the convergence score, the better

In terms of absolute performance, the choice of data structures, in particular of the dictionary or map type containers used for the various lookup operations, can significantly affect the performance. In this case, however, both variants were implemented using standard Python container data structures, so the performance increase should be read as a relative comparison.

In terms of optimality, the algorithm sometimes gets stuck in a local maximum. With an elitist genetic algorithm, this can result in multiple generations without improvement followed by erratic jumps. This behavior makes it challenging to define an appropriate stop criterion. In the current algorithm, the last *n* fitness values are kept and examined for a sufficient change δ. Once the change falls below a threshold, the algorithm is stopped. Currently, appropriate values are found to be a window of the last 20 generations and a required change of at least 0.01 for fitness scores in the range [−1, 1].
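A stop criterion of this kind could be sketched as below. Measuring the change as the difference between the largest and smallest value in the window is an assumption, as is the class name. The defaults mirror the values of 20 generations and 0.01 given in the text.

```python
from collections import deque


class StagnationStop:
    """Stop when the best fitness changed by less than delta over the last n generations."""

    def __init__(self, n=20, delta=0.01):
        self.window = deque(maxlen=n)      # keeps only the last n fitness values
        self.delta = delta

    def should_stop(self, best_fitness):
        self.window.append(best_fitness)
        if len(self.window) < self.window.maxlen:
            return False                   # not enough history yet
        return max(self.window) - min(self.window) < self.delta


# Shortened window for demonstration purposes only.
stop = StagnationStop(n=3, delta=0.01)
decisions = [stop.should_stop(v) for v in (0.10, 0.50, 0.50, 0.505, 0.506)]
```

A longer window makes the criterion more tolerant of the plateaus followed by jumps that an elitist algorithm tends to produce, at the cost of additional generations.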

#### **7 Exploration vs. Exploitation**

In a genetic algorithm, the mutation methods have the task of introducing new entropy into an individual in order to expand the search space within a generation; this is called *exploration*. The crossover method, on the other hand, is in charge of combining the best aspects of two individuals in order to pass an improvement on to the next generation; this is called *exploitation*. Without mutation methods and purely with crossover methods, there is a risk that the algorithm will remain in a local extremum. Without crossover methods, on the other hand, the algorithm relies solely on the mutation methods and its behavior becomes purely probabilistic.

As far as the nature of the mutation function is concerned, the structure of the data leads to significant differences in exploration in the model described above. If the problem is represented as a single sparse vector, mutation methods on their own are able to generate a proper exploration rate, because the vector attains a new configuration when its elements are randomly shuffled. If, however, the data structure is implemented as a dense vector as described above for the new approach, exploitative mutation methods that shuffle elements, such as *CIM* and *RSM* as described in [10], offer little to no exploration. This is because the dense vector only contains a subset of the possible configurations, so shuffling does not expand the search space beyond what was initially generated with the individual. To obtain the desired exploration rate in this case, new configurations need to be introduced. This can be achieved with mutation methods that not only shuffle elements but can also alter them, an example of which is the Twors mutation [10]. Figure 3 illustrates the behavior of the dense vector approach when using only exploitative versus explorative mutation methods.

#### **8 Discussion**

The method described herein assumes that the strips are constrained to insertion and removal from the front only in a true Last-In-First-Out fashion. Methods to use the neighbouring strips to access items are not discussed, but could be further considered in the fitness methods, if applicable. This approach still considers all items as single entities, even if they are of the same type. A further refinement could be to consider clusters of items in order to improve convergence behavior and item location quality on otherwise sub-optimal cases.

As mentioned, the stopping criterion is a particular problem for this type of algorithm. Advanced methods utilizing early stopping techniques such as those described in [12] were not further explored or evaluated for their applicability in this context.

The size of the population in relation to the number of items to be stored and to the size of the storage has not been mathematically quantified. As a consequence, the population size is based on intuition or trial and error.

Compared with the sparse vector approach, the dense vector approach yields a significant performance gain at the cost of higher complexity. As a result, the theoretical performance of the dense approach is harder to assess than that of the sparse vector approach.

Further research could explore the addition of a memetic algorithm approach based on the work of [13], in which individuals are further refined.

Adaptive assembly systems allow many degrees of flexibility but are inherently complex to optimize, and we hope to adapt the described algorithm to this usage scenario. Because of this inherent complexity, we first explore the approach in simpler but practically usable application scenarios.

**Acknowledgements** This article was written as a part of the Mittelstand 4.0-Kompetenzzentrum Saarbrücken project, a part of the BMWi's "Mittelstand-Digital" funding initiative. The competence centers, which are funded all over Germany, inform small and medium-sized enterprises about the opportunities and challenges of digitalisation. The responsibility for the content of this publication lies with the authors.

#### **References**



### **Machine Learning as an Enabler for Automated Assistance Systems for the Classification of Tool Wear on Milling Tools**

Björn Papenberg, Sebastian Hogreve and Kirsten Tracht

#### **Abstract**

Tool wear and the decision when to replace tools are a universal challenge in the metal cutting industry. While the tool wear state can be accurately determined using optical measuring methods, the tool wear of milling tools is often examined by the CNC machine operators, especially in small and medium enterprises. In order to increase the accuracy with which tool wear can be correctly classified, it is advisable to use an assistance system that automatically removes the tools from a buffer, examines the tool wear state based on visual sensor data and sorts them into separate boxes according to the classification result. In this context, the accurate classification of tool wear is a key capability that can be enabled using methods of machine learning, based on image data that was labeled by human experts. In this paper different machine learning models are examined based on their ability to classify images of milling tools into the categories worn and not worn. The EfficientNet\_b0 model achieves an accuracy of 91.47% and outperforms human experts that classified similar images by 22.87%.

#### **Keywords**

Machine learning · Tool wear prediction · Image processing

B. Papenberg (\*) · S. Hogreve · K. Tracht

University of Bremen, Bremen Institute for Mechanical Engineering (Bime), Bremen, Germany e-mail: papenberg@bime.de

© The Author(s) 2023

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_3

#### **1 Milling Tool Assessment in the Machining Industry**

In the manufacturing industry the product quality needs to be optimized and production cost minimized in order to compete with other enterprises. While the usage of worn tools decreases product quality, the underuse of a tool's remaining lifespan results in an increase in production cost [1]. In order to maintain a sufficient product quality, only 50%–80% of the mean tool life is generally used [2].

Thus, the need arises for effective assistance systems to determine tool wear in order to reduce production costs by as much as 10%–40% [3].

In medium and small enterprises, the decision whether or not a milling tool can still be used is often made by the machine operators. While they have specific tools such as magnifying glasses or microscopes at their disposal when handling the milling tools, the classification is still subjective and contains an underlying, individual bias, which can lead to different individuals classifying the same milling tool differently.

In order to classify tool wear on a more accurate, deterministic basis, an automated assistance system is necessary. This assistance system could, by means of an industrial robot, remove milling tools from a predefined buffer and then feed them to a camera, which takes several images of the milling tool. Using these images, the tool wear could be examined. Following the image-based classification, the milling tools could be sorted into separate, predefined output buffers based on the classification results. In the context of the described assistance system, the required image processing of the collected image data is a key component. Methods of machine learning can be used in order to classify the images, therefore enabling the usage of the aforementioned system.

#### **2 Classification of Tool Wear Using Methods of Machine Learning**

Tool wear can be classified using either an indirect or a direct approach. The indirect approach utilizes cutting parameters such as force, vibration, acoustic emission or the measured power of the CNC machine [4–7]. Since these parameters can be measured during the milling process, no intervention in the process is necessary to draw conclusions about tool wear [5, 6]. Using statistical methods, the indirect approach determines a correlation between tool wear and the recorded sensor signals as a basis to classify tool wear [7].

In the direct approach, the tool wear is measured by means of optical sensors via the geometric properties of the tool [4, 6, 7]. For the optical measuring of the tool wear it is generally necessary for the milling tool to be removed from the machine [6]. This disadvantage causes machine downtime [7]. The direct measurement of the tool wear offers a higher recognition accuracy under ideal conditions than the indirect approach [4, 6]. Uncertainties may arise from the interpretation of the image data by human operators [4]. The presence of chips or cutting fluids in the image data affects the recognition accuracy as well [4, 6].

Classical methods of computer vision such as the Sobel and Canny algorithms as well as the active contour method have been applied in the literature to detect tool wear [4]. Deep learning approaches outperform classical approaches with regard to the classification of images [8]. Additionally, the classification accuracy of machine learning methods is more robust towards changing light conditions [4].

Methods of machine learning can be used in order to classify tool wear based on both the indirect and the direct approach. Neural networks are the most used method for the indirect classification of tool wear [6]. Machine learning approaches such as neural networks are able to extract knowledge from large amounts of data and map this knowledge in a model, which is then able to apply the learned knowledge to the specific application. For the classification of tool wear, deep learning methods are particularly suitable, since they can detect patterns in the input data independently, which is why external feature detection is not necessary [9]. These methods require a large amount of data, which is not always accessible [6, 10]. This is particularly true for use cases where expert knowledge is required to label the data. For these specific cases, which include the classification of tool wear, deep learning approaches such as ensemble learning or transfer learning look promising [6].

Since deep learning methods are able to detect features in the datasets without the use of external feature detection algorithms, they can be used to find correlations in sensor data that is recorded during milling processes. This data can be processed through the use of deep learning methods such as convolutional neural networks (CNN) [10]. This is done by encoding the time series data as images, which can then be processed by CNNs [11, 12].

Table 1 shows an overview of the presented literature and their key parameters. The indirect approaches using sensor data in order to classify tool wear reach an accuracy of 86% to 90%. These approaches use sensor data based on the entire lifespan of milling tools in order to classify the wear. The predicted classes range from no wear to steady state wear and finally tool failure. The direct approaches use image data to classify tool wear. The approach proposed by Wu et al. does not classify whether an image depicts a worn or a not worn tool but different kinds of wear phenomena [13]. Bergs et al. use image segmentation instead of image classification to detect tool wear. Therefore, they use the Intersection over Union (IoU) metric instead of accuracy to evaluate their results. Ambadekar et al. classify images of surface quality of workpieces in order to classify the wear of the used cutting tool. Using this approach, they classify the wear state of the tool with an accuracy of 87.26% [9].

Many state-of-the-art CNN architectures are trained on publicly available datasets, such as the ImageNet dataset, in order to evaluate their performance. CNNs trained on the ImageNet dataset are observed to be biased towards detecting textures instead of object shapes [14]. This property is beneficial for the detection of tool wear, since the detection of tool wear is a texture recognition problem [4]. This should enable network architectures that are good at classifying images on the ImageNet dataset to reliably classify tool wear. CNN architectures such as VGG [15] or ResNet50 [16] have successfully been used to classify tool wear [9, 13].

**Table 1** Comparable approaches for the classification of tool wear

#### **3 Approach to Aligning the Classification Accuracy of a Machine Learning Algorithm With Expert Knowledge**

#### **3.1 Image Acquisition Device**

The images necessary to train the neural network are taken using a Nikon D5600 camera with a Sigma 150 mm camera lens. The camera is mounted on top of a special fixture which prevents relative movement between the camera and the tool holding fixture for the milling tools. This ensures that the images are taken under identical initial conditions. To prevent image blur when the camera shutter button is pressed, a remote control is used. The tool holding fixture, in which the milling tools are inserted, can be rotated around its axis by 360 degrees. The edges of the tool holding fixture allow it to be manually turned in intervals of 45 degrees, so that the entire circumference of the milling tools can be photographed through eight individual images. In order to capture images of the front side of the milling tools, the tool holding fixture can be attached at a different angle. The entire image acquisition device is placed inside a photo box when taking the images in order to ensure constant illumination, as shown in Fig. 1. The photo box contains LEDs which illuminate it with diffuse light. Tool holding fixtures of different diameters can be used for different milling tools.

**Fig. 1** Image acquisition device within a photo box

#### **3.2 Dataset and Preprocessing**

The dataset acquired by using the aforementioned device consists of 328 images of 41 different milling tools. These milling tools were classified by an expert into the categories worn or not worn, using magnifying glasses or microscopes. This classification is taken as the ground truth for the images in the dataset. Therefore, uncertainties in the dataset can be expected. The dataset is split into different subsets for training, validation and test at a ratio of 60:20:20. In order to enable the used method of machine learning to process the image data more efficiently, the images are preprocessed.

Initially the images are cropped, so that the majority of the background, which contains no information on the tool wear, is removed, thereby reducing the size of the image. In order to further enlarge the dataset, different filters are applied to the individual images, which increases the robustness of the model after training [4]. These filters include an increase of contrast, an increase of illumination, as well as a sharpening and a softening filter. Image augmentation techniques such as translation and rotation are not used, since the position of the milling tools relative to the camera is fixed. Therefore, these augmentation techniques offer no benefit. By applying these filters to the images, the dataset is increased to 1640 images. Fig. 2 shows four images of the same milling tool with the different filters.
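The growth of the dataset can be retraced with a small sketch: 41 tools photographed from eight angles give 328 images, and keeping each original alongside its four filtered variants gives 328 × 5 = 1640 images. That the originals are retained is inferred from these numbers. The filter implementations and parameter values below are simple stand-ins on nested lists of grey values, not the filters used for the actual dataset.

```python
def clamp(v):
    """Round and limit a grey value to the range 0..255."""
    return max(0, min(255, round(v)))


def brighten(img, offset=40):
    return [[clamp(p + offset) for p in row] for row in img]


def stretch_contrast(img, factor=1.5):
    mean = sum(map(sum, img)) / sum(len(row) for row in img)
    return [[clamp(mean + factor * (p - mean)) for p in row] for row in img]


def soften(img):
    """Horizontal 3-tap mean filter with edge replication."""
    out = []
    for row in img:
        padded = [row[0]] + row + [row[-1]]
        out.append([clamp(sum(padded[i:i + 3]) / 3) for i in range(len(row))])
    return out


def sharpen(img):
    """Unsharp masking: add the difference between image and softened image."""
    return [[clamp(2 * p - s) for p, s in zip(row, srow)]
            for row, srow in zip(img, soften(img))]


FILTERS = (stretch_contrast, brighten, sharpen, soften)


def augment(images):
    """Return each original followed by its four filtered variants."""
    return [variant for img in images
            for variant in [img] + [f(img) for f in FILTERS]]


# 41 tools x 8 views = 328 synthetic 2x4 placeholder images.
originals = [[[(17 * (r + c + k)) % 256 for c in range(4)] for r in range(2)]
             for k in range(41 * 8)]
dataset = augment(originals)      # 328 * 5 = 1640 images
```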

**Fig. 2** Images of the same milling tool using different filters. From left to right: contrast, illumination, sharpening, softening

#### **3.3 Convolutional Neural Network Implementation**

State-of-the-art CNN architectures such as VGG and ResNet50 are capable of classifying tool wear, since the recognition of tool wear is a texture recognition problem instead of an object detection problem, as described in the previous section. In order to classify the wear on milling tools based on the dataset, several training runs are conducted using the VGG [15], ResNet50 [16] and EfficientNet\_b0 [17] architectures. The EfficientNet scores a better result than the VGG and ResNet50 when trained on the ImageNet dataset, while utilizing fewer parameters and training more quickly. Since a large number of parameters is one factor that contributes to overfitting, which was observed in previous papers when classifying tool wear using VGG and ResNet50, the EfficientNet is employed as well.

Transfer learning is one possibility to reduce the effect of overfitting, especially for small datasets. In order to evaluate the classification results of the different architectures and the influence of the use of transfer learning based on the ImageNet dataset, the VGG-16, VGG-19, ResNet50 and EfficientNet\_b0 models are trained with and without the usage of pretrained weights based on the ImageNet dataset. The base models are extended by the following layers in order to fine-tune the model to be able to detect tool wear. After the base model, global average pooling is used. Following the global average pooling, a fully connected layer with 128 neurons is added. This fully connected layer (FC layer) uses the ReLU activation function. The last layer is another fully connected layer with two neurons, representing the two possible classification results. This layer uses the SoftMax activation function. The architecture is depicted in Fig. 3.

For the training of the model an NVIDIA RTX 2060 graphics card is used. The code was implemented in Python, using the TensorFlow 2.6 framework [18].

**Fig. 3** Architecture of the CNN


**Table 2** Results of the training runs

For training the models the Adam optimizer is used. The loss function is categorical cross entropy. The images used for the training are passed to the models at a resolution of 405×150 pixels in batches of eight images. The images are downscaled in order to reduce the number of parameters and increase training speed. The models are trained for 50 epochs. Since the accuracy of the models does not increase after a certain number of epochs, no further training runs above 50 epochs are conducted.
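For readers less familiar with the loss, the following sketch computes the softmax output of a two-neuron layer and the resulting categorical cross entropy in plain Python. The logits and the class order are hypothetical values chosen for illustration. In the actual training these computations are performed by the framework.

```python
import math


def softmax(logits):
    """Convert raw outputs of the last layer into class probabilities."""
    m = max(logits)                            # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot, probs):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t)


probs = softmax([2.0, 0.0])                    # hypothetical logits for (not worn, worn)
loss = categorical_cross_entropy([1, 0], probs)
uniform_loss = categorical_cross_entropy([1, 0], [0.5, 0.5])
```

A model that assigns both classes a probability of 0.5 yields a loss of ln 2 ≈ 0.693, which is a useful reference value when reading loss curves of a two-class problem.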

#### **4 Results**

The results of the training runs are shown in Table 2. Transfer learning significantly improves the accuracy and decreases the loss of every model. The models perform in accordance with their performance on the ImageNet dataset. The VGG-16 model scores an accuracy of 50% without the use of transfer learning and 67.65% with the use of transfer learning. The VGG-19 model scores an accuracy of 50% without the use of transfer learning and 72.06% with the use of transfer learning. The second-best model is the ResNet50, which scores 76.76% without the use of transfer learning and 89.71% with the use of transfer learning.

The best overall accuracy is achieved by the EfficientNet\_b0 model with transfer learning, which scores an accuracy of 91.47%. This model also achieves a significantly lower loss on the test dataset.

**Fig. 4** Accuracy and loss of the EfficientNet\_b0 model without transfer learning

**Fig. 5** Accuracy and loss of the EfficientNet\_b0 model with transfer learning

The course of the accuracy and error of the EfficientNet\_b0 model using no transfer learning is shown in Fig. 4. The accuracy on the training dataset shows a linear increase in the first 20 epochs before rising significantly and converging to one. The loss on the training dataset shows a drop at the first epoch and remains almost constant for another twenty epochs. After twenty epochs the loss decreases in a volatile manner to values between 0.2 and zero. The accuracy on the validation dataset barely increases at all. The loss on the validation dataset is highly volatile and does not decrease below a value of 0.7.

Figure 5 shows the course of the accuracy and error of the EfficientNet\_b0 model using transfer learning. As with the model without transfer learning, the accuracy and error of the model on the training dataset converge to one and zero respectively.

The significant difference between the models is that the convergence is achieved at a substantially faster rate. The validation accuracy increases significantly in the first few epochs and converges around 0.9. The error on the validation dataset decreases significantly in the first few epochs, down to a value of 0.2. While the course of the validation error of the model with transfer learning is still volatile, it is significantly steadier than the validation error of the model that does not make use of transfer learning.

The confusion matrix of model eight is shown in Fig. 6. The correct classifications on the test dataset are shown on the main diagonal of the matrix. 46.7% of the samples are tools without tool wear that were classified correctly, and 44.7% are tools with tool wear that were classified correctly. The model misclassifies 5.29% of the samples where tool wear is present and 3.26% of the samples where no tool wear is present. This results in 91.4% accuracy, which is higher than most accuracies that can be found in the literature.
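The accuracy quoted above follows directly from the main diagonal of the confusion matrix. The following minimal sketch recomputes it, assuming that all four percentages are shares of the whole test set (they sum to roughly 100%); the variable names and the per-class recall values are illustrative additions and are not reported in the text.

```python
# Shares of the test set in percent, as quoted for Fig. 6.
# Assumption: all four values refer to the whole test set.
true_no_wear = 46.7   # no wear present, classified as no wear
true_wear = 44.7      # wear present, classified as wear
missed_wear = 5.29    # wear present, classified as no wear
false_alarm = 3.26    # no wear present, classified as wear

total = true_no_wear + true_wear + missed_wear + false_alarm

# Accuracy is the share of samples on the main diagonal.
accuracy = (true_no_wear + true_wear) / total

# Per-class recall (illustrative, not stated in the text).
recall_wear = true_wear / (true_wear + missed_wear)
recall_no_wear = true_no_wear / (true_no_wear + false_alarm)

print(f"accuracy:       {accuracy:.3f}")
print(f"recall wear:    {recall_wear:.3f}")
print(f"recall no wear: {recall_no_wear:.3f}")
```

Under this reading, worn tools are missed slightly more often than unworn tools are flagged.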

The dataset that is used to train, validate and test the models consists of images of milling tools which were classified into the categories wear or no wear by a human expert, as described in Sect. 3.2. Therefore, the labels in the dataset can be expected to contain an individual bias, resulting in some uncertainty in the classification of tool wear by the models. Thus, it is unlikely that 100% accuracy can be achieved. Taking this background into account, the classification accuracy achieved by EfficientNet\_b0 is all the more remarkable. The results show that it is possible to reproduce human expert knowledge using a CNN without having to perform metrological evaluations for the annotation of the data. In contrast to the existing literature, it was not investigated whether or which wear can be detected, but whether the tool would be classified as worn or not yet worn by a human expert. Compared with the classification of different wear features, this type of classification is particularly challenging, as the number and extent of the wear features in the image have a non-trivial influence on the wear condition of the depicted tool. Contrary to the approaches in the literature, all potential wear features have to be considered at the same time.

In order to evaluate how well the model is able to match the knowledge of machine operators regarding the classification of milling tools, ten different milling tools were classified by 14 different machine operators using magnifying glasses to help them with the classification. These milling tools are a subset of the tools used for the creation of the dataset and were classified by the same expert for their wear condition. The results are shown in the right matrix of Fig. 6.

**Fig. 6** Confusion matrix of the EfficientNet\_b0 with transfer learning

The average accuracy of the 14 humans when classifying these ten milling tools is 68.6%, which is significantly lower than the model accuracy on the test dataset. Thus, it can be concluded that the CNN is able to match human expertise very well.

Therefore, the usage of an assistance system which classifies the tool wear on milling tools based on images can be an effective tool for humans in making these decisions while handling the milling tools, thus reducing production costs and conserving resources.

#### **5 Conclusion and Outlook**

Dealing with tool wear is a challenge faced by every company in the machining industry. The decision whether or not a tool can still be used is often made by human machine operators, specifically in small and medium enterprises. Tool condition monitoring systems can help to provide objective decision making and therefore help to reduce costs. The usage of image processing via machine learning serves as an enabler towards the development of such an assistance system that stores, handles and classifies milling tools by the state of their tool wear.

In recent years, methods of deep learning have proven to be able to detect tool wear based on indirect and direct approaches. For the direct approach, which classifies the tool wear using images of the tools, state-of-the-art CNN architectures that perform well on the ImageNet dataset, such as VGG and ResNet50, have proven to be able to detect tool wear. The VGG-16, VGG-19, ResNet50 and EfficientNet\_b0 models were trained on the created dataset, which consists of 1640 images of 41 different milling tools that were classified as worn or not worn by experts. The usage of the weights based on the ImageNet dataset significantly boosted the performance of every model. The EfficientNet\_b0 model with the ImageNet weights performed best with an accuracy of 91.47%. The model outperforms human machine operators in classifying the wear on milling tools by 22.87 percentage points (91.47% versus 68.6%).

An image-based assistance system that helps machine operators in classifying the wear on milling tools could decrease production costs, since a larger proportion of the possible tool life expectancy could be used. Furthermore, the usage of worn tools would become less likely by using such an assistance system, which leads to an increase in product quality.

To achieve further improvements in detection performance, the dataset should first be enlarged. Furthermore, the dataset contains an inherent bias, since the data was labeled by a human expert. This bias could be removed by classifying the samples based on measured wear phenomena in their geometry.

A comparable approach could be used to assess the wear of turning tools or in optical quality assurance. The approach is particularly suitable in areas where there are no clearly defined boundaries between the classification results, which means that classic, analytical approaches cannot be used.

**Acknowledgements** The project "KI\_Café" ("Einführung entscheidungsunterstützender KI-Systeme in der Produktion") is funded by the German Federal Ministry of Labour and Social Affairs within the "Zukunftsfähige Unternehmungen und Verwaltungen im digitalen Wandel" program, within the framework of the "Initiative neue Qualität der Arbeit" (www.inqa.de).

#### **References**



**Resource Allocation**

### **Simulation-Based Potential Analysis of Line-Less Assembly Systems in the Automotive Industry**

Jonas Rachner, Lea Kaven, Florian Voet, Amon Göppert and Robert H. Schmitt

#### **Abstract**

Increasing product variety, shorter product life cycles, and the ongoing transition towards electro-mobility demand higher flexibility in automotive production. Especially in the final assembly, where most variant-dependent processes take place, the currently predominant concept of flowing line assembly is already being pushed to its flexibility limits. Line-less assembly systems break up the rigid line structures by enabling higher routing and operational flexibility using individual product routes that are takt-time independent. Hybrid approaches consider the combination of line and matrix-structured systems to increase flexibility while maintaining existing structures. Such system changes require a high planning effort and investment costs. For a risk-minimized potential evaluation, discrete-event simulation is a promising tool. However, the challenge is to model the existing line assembly concept and line-less assembly for comparison. In this work, a comprehensive scenario analysis based on real assembly system data is conducted to evaluate the potential of line-less assembly in the automotive industry. Within the simulation, an online scheduling algorithm for adaptive routing and sequencing is used. Based on an automated experiment design, several system parameters are varied full-factorially and applied to different system configurations. Various scenarios considering worker capabilities, station failures, material availability, and product variants are simulated in a discrete-event simulation considering realistic assumptions. Results show that throughput and utilization can be increased in the hybrid and line-less systems when station failures are assumed and the order input remains unchanged.

J. Rachner (\*) · L. Kaven · A. Göppert · R. H. Schmitt WZL | RWTH Aachen University, Aachen, Germany e-mail: j.rachner@wzl-mq.rwth-aachen.de

R. H. Schmitt Fraunhofer Institute for Production Technology IPT, Aachen, Germany

F. Voet Dr. Ing. H.c. F. Porsche AG, Stuttgart, Germany

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_4

#### **Keywords**

Potential analysis · Automotive Industry · Line-less Assembly · Discrete-event simulation · Flexibilisation

#### **1 Introduction**

Production systems are evolving towards smart, cognitive, and more adaptable systems to cope with new global challenges [1]. Especially in the automotive sector, assembly systems face several drivers for flexibilisation: The shift of strategy from centralized production to decentralized plants to cope with local demands leads to a reduction in production volumes. Therefore, frequent adjustments of production lines are necessary, implying high investment costs per unit for inflexible systems [2]. Due to shorter product life cycles, production systems must adapt to new or expanded product portfolios [3]. Specifically, as the demand for electric cars is rising but not high enough to operate individual production lines economically, multi-model lines are used. Moreover, the higher frequency of production start-ups demands a more efficient ramp-up enabled by flexible systems [4]. Additionally, unexpected global pandemic events and political restrictions lead to material shortages resulting in long downtimes when using inflexible assembly systems [5, 6]. In summary, existing conventional automotive production systems do not meet these new requirements [7].

Therefore, in the past, various systems have been developed for flexibilization in the automotive industry [8]. Enablers for flexible systems are flexible transport systems (e.g., AGVs), highly qualified workers, and modular, reconfigurable stations [1]. Existing flexibility levels (e.g., line-less assembly, agile hybrid assembly systems) differ in the number of relaxed restrictions and applied flexibility [9]. The decision for one of these flexibility levels demands assessing potential investments, benefits and risks. In this context, both green-field planning of production systems to be newly designed and the estimation of the conversion potential of existing systems must be taken into account. For such modeling and planning, especially in the automotive industry, the method of discrete event-driven simulation (DES) is an established tool [10]. So far, no method or tool exists to derive the potential of different flexibility levels and systems in an application-specific and automated way. In order to conduct comprehensive analyses, standardised models are required for the low-effort investigation of different models on the basis of common system variables. Therefore, this paper presents a methodology for an automated simulation-based potential analysis of different flexibility levels in the automotive sector. Subsequently, an application of this methodology to an automotive industry use case and the derivation of relevant potentials are presented.

#### **2 State of the Art: Simulation Studies of Assembly Flexibilisation**

To model an automotive final assembly, special requirements have to be taken into account. For example, there is a large number of stations in which a combination of manual and automated processes is performed. The predominant organisation form of takt line assembly is characterised by a continuous product flow and a strictly planned takt. Product- and system-side flexibilisation has been investigated through simulation studies in several publications with differing focus.

In particular, the impact of routing flexibility on the performance of manufacturing systems is often simulated with discrete-event simulation to evaluate the effects of flexibilization [11–15]. Routing flexibility can be defined as "the ability of a manufacturing system to produce a part by alternate routes through the system" and can be achieved by having multi-purpose stations and allowing individual product routes. On the other hand, operation flexibility describes "the ability of a part to be produced in a different way" and thus refers to product-side flexibility [16, 17].

While most publications do not specifically refer to line-less assembly systems in comparison to line assembly systems, Hofmann et al. [18] examine the effect of routing and operation flexibility in so-called matrix productions and production lines. The resulting flexibility levels are simulated for different levels of failure probabilities and evaluated in terms of throughput time, tardiness, output and utilization. The result shows that especially long downtimes are a motivator for matrix production, as adherence to schedules is significantly better in this case. However, only 10 work stations are included in the evaluated system, which is not based on industrial data.

Schönemann et al. [19] also compare matrix with line production in discrete-event simulation and investigate the influence of buffer sizes and machine failures. The results show that the matrix system can achieve a higher utilisation due to the redundancy of work stations and the adaptive control strategy used. Only the given scenario with eight work packages is investigated for the two designs, and no intermediate levels of flexibility are considered.

Göppert et al. [20] use an automated scenario analysis to investigate, in a high number of scenarios, the influence of operation and routing flexibility on the flow time in line-less assembly systems compared to line assembly systems for different station failures and interarrival times. Results show that especially operation flexibility can compensate for station failures and bottlenecks due to low interarrival times. The evaluated system consists of 8 work stations and is not based on industrial data.

Küpper et al. [21] define the concept of flexible-cell manufacturing, in which workers are assigned to individual matrix-structured work stations, which are divided into specialised and generalised cells. In a simulation study, the final assembly for a real automotive use case is modelled for the existing line concept and as a flexible-cell concept. The focus of the evaluation is on worker utilisation, which is increased by 12% in a shift to flexible-cell manufacturing.

In conclusion, the number of stations and processes to be executed is not sufficiently taken into account in the publications presented, and only Schönemann et al. [19], Hofmann et al. [18], Göppert et al. [20] and Küpper et al. [21] compare a flexible production system with takt line production. Only Küpper et al. [21] refer specifically to the requirements of the automotive industry and use real data to map the complexity. Here, however, only the contrast between line production and flexible-cell production is modelled, without considering hybrid forms. Therefore, an investigation of hybrid line-less assembly systems compared to classical line assembly and line-less assembly, based on real industry data for automotive final assembly, is required.

#### **3 Automated Scenario Analysis for Modelling of Takt Line and Line-Less Assembly Systems**

To assess the potential of flexibilization in automotive assembly, a simulation-based automated scenario analysis is applied: Based on standardized input data (see Fig. 1), a full-factorial experimental plan of simulations to be performed is generated.

**Fig. 1** Automated scenario analysis to generate full-factorial simulation experiments to be performed on different flexibility levels

The input data is based on industry data, e.g., MTM times, building plans and error protocols. It contains information about the layout, the stations, operations, products, transport systems, and general simulation information. A normal distribution for the mean time between failures (MTBF) and a uniform distribution for the mean time to repair (MTTR) are assigned to each station for breakdown simulation. A random seed makes the stochastic influences repeatable and helps to avoid outliers. The simulations to be executed differ in the combinations of the input parameters and in the applied assembly system flexibility levels. Three flexibility levels are considered:


The discrete-event simulation models are generated and executed in the software Tecnomatix Plant Simulation [22]. A greedy algorithm controls the online routing and scheduling decisions during the simulation. To minimize the assembly makespan, the decision on the next process step and execution location is made on a product-individual basis, considering the current queue lengths, transport times, and station failures. The result of the automated scenario analysis is a detailed statement of relevant KPIs (e.g., makespan, output quantity, utilization, waiting times) for each scenario. The comparison of all scenarios allows for conclusions about the potential of different flexibility levels in automotive assembly.
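A greedy routing decision of the kind described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the algorithm implemented in Plant Simulation: the `Station` fields, the function `choose_next_step`, the operation names, and the cost estimate (processing starts once both the queue has drained and the product has arrived) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Station:
    name: str
    capabilities: set        # operations the station can execute
    queue_time: float        # work already queued at the station, in seconds
    transport_time: float    # travel time from the product's position, in seconds
    failed: bool = False     # failed stations are skipped


def choose_next_step(open_operations, stations, process_times):
    """Greedily pick the (operation, station) pair with the earliest estimated finish.

    Returns (operation, station name, estimated finish time) or None if no
    working station can execute any open operation.
    """
    best = None
    for operation in open_operations:
        for station in stations:
            if station.failed or operation not in station.capabilities:
                continue
            # Assumption: the product travels while the queue drains.
            start = max(station.queue_time, station.transport_time)
            finish = start + process_times[operation]
            if best is None or finish < best[2]:
                best = (operation, station.name, finish)
    return best


# Hypothetical example data.
stations = [
    Station("S1", {"mount_seat", "fit_trim"}, queue_time=120.0, transport_time=20.0),
    Station("S2", {"mount_seat"}, queue_time=0.0, transport_time=45.0),
    Station("S3", {"fit_trim"}, queue_time=0.0, transport_time=10.0, failed=True),
]
process_times = {"mount_seat": 60.0, "fit_trim": 40.0}
decision = choose_next_step(["mount_seat", "fit_trim"], stations, process_times)
print(decision)
```

Because the decision only looks one step ahead, it is cheap to evaluate online but can miss routes that a global optimization would find.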

#### **4 Modelling of an Industrial Use Case**

The methodology is applied to an industrial use case with data taken from an automotive OEM's final assembly. Six product types are assembled in the system. These include three car types, each of which is built in two variants. Therefore, each vehicle type is modelled once with minimum and once with full equipment, whereby all assembly-relevant features are taken into account. The order quantities of the product types are assumed to be equally distributed and the sequence is randomly generated.

In all flexibility levels, a new order is released into the system during the given takt time if one required station is free. This ensures comparability, as the product input and its quantity are the same in all systems. Each simulation is run with five random seeds to compensate for statistical influences. The system is first loaded for eight hours and only then are the statistics gathered in order to exclude ramp-up effects. After the ramp-up, another eight hours of production time are simulated. The transportation between the stations is done by AGVs for the hybrid and line-less systems and is simulated with AGV routing based on distance and velocity but without traffic. In the takt line system, no transportation time is considered due to the continuous product flow.

The considered part of the final line assembly consists of about 100 stations which are modelled, while the number of stations remains the same for all system configurations. The process times are recorded in detail based on MTM data and assigned to the stations. The processes at a station sometimes take longer than the specified takt time, since in the real system compensation is achieved by principles such as workers who jump between stations or processes overlapping into the next station. Most processes are manual and can therefore theoretically be executed at any place. However, some processes are automated using robotics or need special tooling equipment that is only available at one specific station. This results in the stations being divided into generalized and special stations that are considered in the hybrid and line-less approach. Each station is assigned the same availability (between 98–100% to cover a broad but realistic field in the industry) and therefore randomly fails with an MTTR value of 5–30 min, assumed to be uniformly distributed. The type of failure is not further specified, so it can be of a technical or organisational character and may also mean an absence of material. The modelled flexibility levels are displayed in Fig. 2.
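How a given station availability and the MTTR range translate into failure behaviour can be illustrated with a short sketch. It relies on the standard steady-state relation availability = MTBF / (MTBF + MTTR) to derive the mean MTBF; whether the simulation model derives it this way is an assumption, as are the function names and the relative standard deviation of the normally distributed MTBF.

```python
import random


def mean_mtbf_minutes(availability, mean_mttr_minutes):
    """Mean time between failures implied by a steady-state availability.

    Uses availability = MTBF / (MTBF + MTTR). An availability of 100%
    means the station never fails, so it is rejected here.
    """
    if not 0.0 < availability < 1.0:
        raise ValueError("availability must be strictly between 0 and 1")
    return availability / (1.0 - availability) * mean_mttr_minutes


def sample_failure_cycle(rng, availability, mttr_range=(5.0, 30.0), mtbf_rel_std=0.1):
    """Draw one (uptime, downtime) pair in minutes.

    Uptime is normally distributed around the implied MTBF (relative
    standard deviation is an assumed value); downtime is uniform over
    the 5-30 min MTTR range given in the text.
    """
    mean_mttr = sum(mttr_range) / 2.0
    mean_mtbf = mean_mtbf_minutes(availability, mean_mttr)
    uptime = max(0.0, rng.gauss(mean_mtbf, mtbf_rel_std * mean_mtbf))
    downtime = rng.uniform(*mttr_range)
    return uptime, downtime


rng = random.Random(1)  # fixed seed makes the draw repeatable
print(mean_mtbf_minutes(0.98, 17.5))
print(sample_failure_cycle(rng, 0.98))
```

At 98% availability and a mean repair time of 17.5 min, a station would fail roughly once every 14 hours on average, so with about 100 stations several failures per eight-hour shift are to be expected.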

Since there is a continuous product flow in the real takt line assembly, each vehicle must remain in the station for the length of the takt time, even when the processing time is over. In case of station failure, the products have to wait in the stations behind the affected station, but upstream processes continue to run since no rigid conveyor belt but AGVs are used in the current system as flexible means of transport.

**Fig. 2** Overview of the modelled flexibility levels

The line is divided into sections, between which four buffer positions are available for decoupling. The processes are uniquely assigned to stations and follow the predefined precedence graph.

In the first version of the hybrid assembly system, clusters are formed based on the existing sections in the real line assembly (*Cluster 1*). A cluster always consists of the stations that are in a section; within the cluster, the capabilities are combined and the process sequence restriction is dissolved, but the general sequence of the sections remains. In the second version, two sections are always combined, making the matrix areas larger, which is also associated with increased employee skills (*Cluster 2*).
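The two cluster variants can be pictured as grouping consecutive line sections and pooling the capabilities of their stations. The sketch below is a simplified illustration with hypothetical station and operation names; it ignores the distinction between generalized and special stations and is not the model used in the study.

```python
def build_clusters(sections, sections_per_cluster=1):
    """Group consecutive line sections into clusters and pool their capabilities.

    `sections` is an ordered list; each section maps a station name to the
    set of operations that station can execute. The order of the clusters
    preserves the general sequence of the sections.
    """
    clusters = []
    for start in range(0, len(sections), sections_per_cluster):
        group = sections[start:start + sections_per_cluster]
        station_names = [name for section in group for name in section]
        pooled = set()
        for section in group:
            for capabilities in section.values():
                pooled |= capabilities
        clusters.append({"stations": station_names, "capabilities": pooled})
    return clusters


# Hypothetical line with four sections of two stations each.
line_sections = [
    {"S01": {"op_a"}, "S02": {"op_b"}},
    {"S03": {"op_c"}, "S04": {"op_d"}},
    {"S05": {"op_e"}, "S06": {"op_f"}},
    {"S07": {"op_g"}, "S08": {"op_h"}},
]

cluster_1 = build_clusters(line_sections, sections_per_cluster=1)
cluster_2 = build_clusters(line_sections, sections_per_cluster=2)
print(len(cluster_1), len(cluster_2))
```

Larger clusters give the routing logic more alternative stations per operation, which is why the second variant also demands a broader skill set from the workers.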

In the line-less assembly system, all stations of the real system are transferred to a matrix layout and the restrictions are dissolved. In one version, the process sequence is fixed as in the real system; in the second version, the operational flexibility is set to 100% by dissolving all process restrictions, which cannot be transferred to the real system but allows potentials to be identified. Manual processes can be executed at all generalised stations, while the identified special stations remain (caused by automated processes or special handling tools). The number of stations remains the same and the system is not optimised (e.g., by resolving resulting bottlenecks).

Since throughput is one of the most relevant KPIs in the automotive industry, the effect of station availabilities on this is examined in the following for all system variants.

#### **5 Discussion on Simulation Results**

For the scenario analysis, the station availability was examined in five levels between 98–100% for five flexibility levels and the throughput was determined. For this purpose, the average of five simulation runs was calculated, resulting in a total of 125 simulations. The results for the throughput can be seen in Fig. 3. In an idealised system with 100% availability, the Takt Line Assembly performs best, as there is a continuous flow without interruptions.

**Fig. 3** Results on the average throughput based on station availability for all flexibility levels

**Fig. 4** Results on the average station utilization based on five levels of station availabilities for all flexibility levels

The utilization is the highest for takt line assembly with 90.9% due to process times that are shorter than the takt time (see Fig. 4). In theory, the more flexible systems should be able to reach the results of the line at 100% availability, as they could reproduce the process and station sequences of the line. The fact that this is not the case shows the complexity in the planning and control of the systems and can be attributed to the use of a simple greedy algorithm to generate the individual product routes.
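The size of the experimental plan in this section (five availability levels, five flexibility levels, five random seeds) follows from a full-factorial combination, which can be reproduced with a short sketch. The 98.5% level is an assumption, since the text only names 98%, 99%, 99.5% and 100%; the dictionary keys and seed values are likewise illustrative.

```python
from itertools import product

# Five availability levels between 98% and 100%; 98.5% is assumed.
availabilities = [0.98, 0.985, 0.99, 0.995, 1.0]

# The five flexibility levels compared in the study.
flexibility_levels = [
    "Takt Line Assembly",
    "Hybrid Assembly Cluster 1",
    "Hybrid Assembly Cluster 2",
    "Line-less Assembly, fixed operation sequence",
    "Line-less Assembly, free operation sequence",
]

# Five random seeds per scenario (seed values are placeholders).
seeds = range(1, 6)

# Full-factorial plan: every combination of the three factors.
plan = [
    {"availability": a, "flexibility_level": f, "seed": s}
    for a, f, s in product(availabilities, flexibility_levels, seeds)
]

# A scenario is one availability/flexibility pair, averaged over its seeds.
scenarios = {(run["availability"], run["flexibility_level"]) for run in plan}

print(len(plan), "simulation runs for", len(scenarios), "scenarios")
```

Each of the 25 scenarios is then reported as the mean over its five seeded runs.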

Already at 99.5% station availability, it can be seen that throughput drops significantly in Takt Line Assembly, as a single station failure disrupts the entire product flow, while only a slight decrease can be observed in the more flexible alternatives. At 100% availability, compared to the Takt Line Assembly, the throughput is 10% lower for the Hybrid Assembly Cluster 1 and 5% lower for the Line-less Assembly with free operation sequence. However, at 98% availability, throughput is increased by 38% in a Hybrid system and by as much as 51% for the Line-less system. Increasing the size of the clusters (from 1 to 2) gives an advantage (3% increase in throughput at 98% availability), although this requires increased worker skills. Both cluster variants and line-less assembly are more resilient to station failures. With regard to the precedence graph, it can be seen that relaxing it within clusters, and especially allowing a free operation sequence, yields higher throughput and utilisation. In comparison to the Takt Line Assembly, the Line-less Assembly System with fixed operation sequence is significantly more resilient to station failures due to the system-side flexibility of allowing alternative stations for manual processes. Figure 4 shows the average utilisation across all stations and the five simulation runs. Taking station failures into account, the utilisation can be significantly increased by flexibilisation (e.g., an increase of 17.8% from takt line assembly to Hybrid Assembly Cluster 1 for an average station availability of 99%).

As a critical reflection, it is noted that completely free operation sequences in line-less assembly systems are unrealistic, but the results show that an investigation of the precedence graph in terms of flexibility is worthwhile, as increased operational flexibility offers great advantages. The assumption for hybrid systems, on the other hand, is legitimate, but still needs to be validated by practical tests that could be done by flexibilising the process sequences and worker tasks for selected line sections without changing the general layout. In addition, the floor space requirement is not directly taken into account, although the number of stations remains the same. Line-less systems, however, require larger path areas due to the increased transport effort. Furthermore, workers were considered only indirectly; a dedicated worker scheduling system is needed for control. Compared with the state of the art, it can be confirmed here on the basis of real data that system- and product-side flexibilisation is worthwhile, especially when station breakdowns are taken into account. However, the study also shows the complexity of the control system with a high number of stations, as the throughput and utilisation of the line could not be achieved with full availability.

#### **6 Conclusion and Outlook**

In conclusion, hybrid and line-less assembly systems show high potential for automotive final assembly, as shown by extensive simulation studies. Whereas in an idealised system Takt Line Assembly Systems achieve the best values, throughput and capacity utilisation can be increased in Hybrid and Line-less Assembly Systems when taking into account station failures while maintaining the same order volume. While complete line-less systems still require a high degree of planning and control effort (e.g., employee qualifications), hybrid systems can be implemented more easily.

As a next step, in the hybrid and line-less systems, bottlenecks can be identified and the system can be planned better (e.g., identify which automated stations are worth duplicating; lower the inter-arrival time of jobs; implement order release strategies based on the system status to maximise utilisation). In addition, the takt line assembly can be modelled even more realistically by taking into account balancing principles, where processes take place across stations. The potential exists here to catch up in a line operated by AGVs through increased speeds after disruptions. Also, breaking up the rigid line structure into a Hybrid System shows benefits for this use case. A further object of investigation is the integration of new variants into the system. This raises the question of how complex line-less systems deal with constantly new product integrations and integrated ramp-ups and how these can be supported by simulation modules.

**Acknowledgements** We would like to thank the BMWK, DLR-PT and our partners for their kind support.

This work is part of the research project "AIMFREE – Agile assembly of electric vehicles" that is funded by the German Federal Ministry for Economic Affairs and Climate Action (BMWK) on a joint initiative to fund research and development in the field of electromobility (funding number: 01MV19002A) and supported by the project management agency German Aerospace Center (DLR-PT). The author is responsible for the content. Find more information at aimfree.wzl.rwth-aachen.de.

#### **References**



### **Ontology-Based Task Allocation for Heterogeneous Resources in Line-Less Mobile Assembly Systems**

Aline Kluge-Wilkes, Balaji Gunaseelan and Robert H. Schmitt

#### **Abstract**

Volatile markets and production call for assembly systems that are adaptable to changes in product types, production capacity, and product order. Computer-aided decision support systems facilitate scheduling, planning, and controlling adaptive and flexible assembly systems. Formal description models of resources and their capabilities, and of assembly tasks and their requirements, are necessary for automated decision-making. This paper contributes a conceptual CAPabILity-based resource AllocatioN Ontology (CAPILANO). The ontology is tailored as a uniform description of heterogeneous assembly resources and their (combined) capabilities, connected to a capability-based task allocation approach. The intended application of the resulting framework is the identification of suitable assembly resources in Line-less Mobile Assembly Systems (LMAS) and their allocation to assembly tasks, based on a unified and formal description. To date, ontologies in assembly have been limited to querying resources and their capabilities; here, subsequent task allocation is presented as an integral component of a tailored framework. The resulting framework consists of a model of heterogeneous resources and their capabilities in an ontology created in Protégé in OWL, SPARQL-based querying, and a consecutive and availability-aware task allocation in Python. The development of the ontology-based task allocation framework, including ontology taxonomy, querying and task allocation, is discussed. Its applicability in LMAS is demonstrated through linear scalability of task allocation, and future advances are discussed.

A. Kluge-Wilkes (\*) · B. Gunaseelan · R. H. Schmitt

Chair of Production Metrology and Quality Management, Department Model-based Systems, Laboratory for Machine Tools and Production Engineering WZL of RWTH Aachen, Aachen, Germany

e-mail: A.Kluge-Wilkes@wzl.rwth-aachen.de

R. H. Schmitt Fraunhofer Institute for Production Technology (IPT), Aachen, Germany

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_5

#### **Keywords**

Ontology · Task Allocation · Line-less Mobile Assembly Systems

#### **1 Station Control in Line-Less Mobile Assembly Systems**

The trend of consumers demanding individualized products requires adaptable and fexible production and therefore adaptable and fexible assembly systems [1]. The paradigm of Line-less Mobile Assembly Systems (LMAS) offers a solution for realizing such an assembly system based on the three principles: Mobilized resources, a clean foor approach, and dynamic job-routes [2]. To ensure the adaptation of stations in LMAS to new tasks, each new task involves checking which resources provide the capability to perform the allocated task, which has been done manually beforehand. Therefore, the objective of this paper is to provide a framework of the two necessary steps for formation planning in a fexible assembly station: a formalized representation of assembly resources and their capabilities and a task allocation based on this representation [3, 4]. Accordingly, the paper aims to answer the research question: *How can assembly tasks, assembly resources and boundary conditions for automated task allocation be described?* The resulting framework is intended as a foundation for adaptive assembly station planning through feasibility checking, whereby the offered capabilities of the resources and the requested requirements of the tasks are matched.

The remainder of the paper is structured as follows: In Sect. 2, the related work on assembly resource modeling, capability modeling and task allocation is reviewed with regard to its applicability for formation planning in LMAS. In Sect. 3, the methodology followed while developing the framework is summarized. Section 4 introduces the methodology's results and details the derived conceptual schema of the ontology and the task allocation. A use-case-specific implementation on operation level is presented and its performance is evaluated in Sect. 5. Finally, conclusions are drawn and future work is presented in Sect. 6.

#### **2 Related Work in the Context of Station Planning in LMAS**

LMAS enables dynamic adaptation to changing demands through temporary and reconfigurable layouts (formations) of assembly stations. The task-dependent and ever-changing formation of the heterogeneous and mobilized resources in assembly stations requires adaptable formation planning and task allocation [2]. The foundation for automated task allocation to assembly resources is a consistent and formal modeling of the resources, their capabilities and their taxonomy as a digital representation [4, 5]. Ontologies provide a means of formally representing knowledge by describing instances and their relations, and are thus suitable for modeling resources and their capabilities in the manufacturing domain [6].

In the following, the requirements imposed on a capability-based resource allocation ontology for a station in LMAS are detailed. The ontology has to be **scalable**, so that it can be extended with new classes and instances representing newly integrated resources. It has to enable **queries for assembly resources and their capabilities** to allocate resources to assembly tasks depending on availability and requested capabilities [4]. Considering cooperating resources or capabilities that change depending on equipment and tools, it has to provide **inheritance of combinational capabilities**. For station control, the **time-relevant update of properties** (e.g. a resource being idle or not idle) must be included [2]. To allow for **task allocation**, the transfer of query results to third-party software has to be implemented. In the following, existing ontologies and frameworks are evaluated with regard to these requirements.

The Product Resource Order Staff Architecture (PROSA) provides one of the first semantic representations intended for smart manufacturing. However, it leaves matching capabilities to requirements for future research [7]. MANDATE defines an International Standard for representing manufacturing management data, including the product, process and resource paradigm [6]. The "Referenzarchitekturmodell Industrie 4.0 (RAMI 4.0)" for information systems defines a reference architecture of technical assets and their relevant aspects throughout their entire life cycle [3].

With MAnufacturing's Semantics Ontology (MASON), a semantic net was presented as an upper-level ontology for manufacturing, including entities, operations and resources, but lacking a representation of capabilities [8]. In the BaSys 4.0 ontology, modular resources provide combined basic and slave capabilities orchestrated by master capabilities. BaSys 4.0 includes querying by matching requested and provided capabilities, but excludes task allocation to individual resources [9]. Weser et al. aim to create an upper-level ontology (C4I) that enables matching the capabilities provided by resources with the capabilities required to fulfill a task. They define capabilities as a hardware-agnostic representation of the resources' functionalities, consisting of sub-capabilities. C4I lacks the means of allocating tasks to individual resources [5]. In the Manufacturing Resource Capability Ontology (MaRCO), the four classes of product, process, capability and resource are differentiated. Combined capabilities are modeled analogously to C4I: they result from cooperating resources or from combining resources into an aggregated resource [10]. A wide-ranging review of ontologies intended for robotic utilization can be found in [11].

Currently, ontologies can be queried for matching a task's requested capabilities to the resources' provided capabilities, resulting in a list of individual or combined resources that provide the requested capability. For formation planning, however, it is necessary to allocate one specific resource to one specific task. No publicly accessible ontology fulfills all stated requirements and could serve as a basis for deriving an ontology for task allocation for stations in LMAS. Currently, the planning of formations of mobile resources in LMAS takes place manually. Due to this manual process, reconfigurable stations in LMAS are still inefficient for industrial applications, especially for prototype production and lot size one.

#### **3 Methodology and Foundations**

This research aims at developing an ontology-based task allocation framework as a foundation for station planning in LMAS. Varying methods of creating such a framework can be found in the literature. For building the domain ontology, the broadly accepted seven-step procedure according to Noy et al. [12] was followed due to its application-oriented structure. Moreover, the first two steps (definition phase: developing an ontology; modeling phase: modeling of a use case) of the digital twin pipeline of Göppert et al. were applied [13]. Finally, validation and verification are carried out following Sargent et al. and Gómez-Pérez through an application ontology and performance testing [14, 15].

The conceptual ontology is built in Protégé using the Web Ontology Language OWL. OWL is a machine-readable knowledge representation language that enables reasoning systems to derive implicit knowledge from explicitly defined knowledge [9]. OWL allows for reasoning based on semantic and syntactic rules. It is thus formal, allows capabilities to be inherited from one instance to another, and allows capabilities to be composed of other capabilities [12].

#### **4 Capability-Based Resource Allocation Ontology CAPILANO**

In the following, the first phase of ontology-based modeling, the definition phase, according to Göppert et al. [13], is described and the conceptual CAPabILity-based resource AllocatioN Ontology (CAPILANO) is presented. The broadly applied concept of dividing assembly systems into product, process and resource of Martin et al. was followed. We focus on the resources and their allocation to tasks through capability-matching, defining a process as a set of tasks [11, 16].

The **resource** class consists of the heterogeneous individual resources (class objects) and the associated capabilities of the individuals. Table 1 provides an example of the individual FASIMA\_ABB4600 of the resource class "ABB\_4600" and its parameters. According to RAMI 4.0, resources are assumed to be an entity, i.e., a uniquely identifiable, represented, and known asset [3]. Consistent with the definitions of RAMI 4.0 [3] and PROSA [7], the resources follow the definition of an asset or holon. Thus they can be delimited individually but may also be composed of other resources. Resources have a defined boundary, can be composed of other identifiable resources, can be combined to form resources, and can be assigned a value and a purpose [3, 7]. The utilization of individual parameters was adapted from MASON [8], PROSA [7] and PMK [17]. The actual static and dynamic parameters were adjusted from MASON [8]. The interaction of these parameters with the capabilities was adapted from PROSA [7] and PMK [17].


**Table 1** Example of the robot resource class 'ABB\_4600' of CAPILANO

Following Kluge, we assume **capabilities** to be describable by referring to the elemental assembly operations defined in standards and guidelines such as VDI 2860 and DIN 8580, 8582, 8588, 8592 and 8593 [18]. Here we define a capability as a hardware-agnostic means of fulfilling a function. Capabilities are defined as classes and are assigned to the resources through class restrictions adapted from MaRCO and C4I [5, 11]. To model the resources' capabilities, the functional methodology from MaRCO [11] and the specifications of VDI 2860 are adapted, inheriting the concept of combinatory capabilities. In contrast to MaRCO, parameters like 'Payload' are assigned to the capability class instead of the resource class, allowing for easier adaptation by changing the individual itself instead of an entire class. Simple capabilities are directly assigned to the individual resources and are therefore represented as individual parameters. Complex capabilities consist of multiple simple capabilities, as visualized in Fig. 1. Depending on the related capabilities, a resource inherits these parameters [7]. For example, if the individual 'Gripper' has the individual parameter 'GrippingForce', the 'Gripping' capability inherits this parameter. The combined capability combines the capabilities and the related individual parameters, resulting in the capability parameter. For example, the resource 'mobile manipulator' consists of the resources 'robot' and 'AGV' and inherits the following capabilities: 'EndEffector', 'Positioning', 'Transporting' and, assuming the end effector 'ScrewDriver', e.g. 'Screwing'. The inheritance process itself is adapted from the MASON ontology [8].
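The inheritance of combined capabilities described above can be illustrated with a minimal Python sketch. It is not taken from CAPILANO or its repository: the `Resource` class, the `provided_capabilities` method and all parameter values are hypothetical, and plain Python objects stand in for OWL classes and restrictions.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A resource with directly assigned capabilities and optional component resources."""
    name: str
    # capability name -> capability parameters (hypothetical values)
    capabilities: dict[str, dict[str, float]] = field(default_factory=dict)
    parts: list["Resource"] = field(default_factory=list)

    def provided_capabilities(self) -> dict[str, dict[str, float]]:
        """Return own capabilities plus those inherited from all component resources."""
        merged: dict[str, dict[str, float]] = {}
        for part in self.parts:
            for cap, params in part.provided_capabilities().items():
                merged.setdefault(cap, {}).update(params)
        for cap, params in self.capabilities.items():
            merged.setdefault(cap, {}).update(params)
        return merged


# Illustrative individuals; the numbers are placeholders, not data from the paper.
robot = Resource("robot", {"Positioning": {"Payload": 20.0}, "EndEffector": {}})
agv = Resource("AGV", {"Transporting": {"Velocity": 1.5}})
screwdriver = Resource("ScrewDriver", {"Screwing": {"Torque": 5.0}})

# The combined resource has no capabilities of its own; it inherits all of them.
mobile_manipulator = Resource("mobile manipulator", parts=[robot, agv, screwdriver])
print(sorted(mobile_manipulator.provided_capabilities()))
```

In this sketch the combined resource simply takes the union of its parts' capabilities. In the ontology, a reasoner would additionally check whether the parts are actually combinable.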

Figure 2 presents CAPILANO as the main result of the definition phase, depicted with the related ontologies. Compliant with the development phase defined in Göppert et al., elements of existing ontologies were inherited, adapted and extended.

**Fig. 1** Combined capabilities and their properties in UML

**Fig. 2** Ontological approach and inheritances towards CAPILANO

**Fig. 3** Conceptual framework including the ontology, task allocation and interfaces

#### **5 Task Allocation Framework**

The conceptual framework for task allocation is depicted in Fig. 3. The framework consists of a means of storing and retrieving data (data lake), converting this data into a tailored schema (data conversion), querying the data in the ontology (here: CAPILANO) followed by allocation of the querying results, and converting the results back into a storable data format and thus closing the circle to the data lake. Realizing the framework's goal to match the task's requirements to the resources' capabilities, the process of querying and consecutive task allocation is detailed below. The resulting framework can be found under: https://github.com/AKlugeWilkes/IoP-CAPILANO.

In a **pre-processing** stage, CAPILANO, including the resources and provided capabilities, is extracted. Moreover, the input process chart is retrieved from the data lake and converted into a custom schema adapted from MaRCO [11] and the C4I metamodel [5]. The process chart contains the required capabilities to perform the tasks and the task order.

During the **processing** step, querying and task allocation are carried out. For querying, the requested list of capabilities of the process chart is converted into a SPARQL query by the Python-based Cython Phaser. Figure 4 visualizes one query loop for the capabilities 'Screwing', 'Positioning' and 'Transporting'. The SPARQL queries are forwarded to the Java-based HermiT Reasoner, which checks the compatibility of capabilities and infers implicit capabilities, and the output is cached [19]. First, individuals that provide the requested capability are identified (e.g. 'ScrewDriver\_1' and 'ScrewDriver\_2' for 'Screwing'). Then the HermiT Reasoner checks for combinable capabilities of the resources, e.g. whether a required capability could be provided by a specific robot in combination with a particular end effector (here: 'ScrewDriver\_1' and 'Mobile-Manipulator' are combinable through 'EndEffector Type 01'). This function addresses the requirement of combinatory capability inheritance. If several resources match the capabilities, they are chosen in descending order, so that in the first query loop the one with the closest matching parameter is selected, represented in Fig. 4 by a white square (e.g. if a payload of 20 kg is requested, a gripper providing a payload of 25 kg would be preferred over one providing 40 kg). The loop continues until all possible combinations of resources are listed in descending order. Once the Cython Phaser has processed all queries, the cached results are converted into CSV/TSV and forwarded to the task allocation step.
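The two ideas in this step, turning a requested capability into a SPARQL query and ordering matching resources by how closely they meet a requested parameter, can be sketched as follows. This is an illustrative assumption rather than the code of the published framework: the namespace `http://example.org/capilano#`, the property `providesCapability`, the helper names and the gripper individuals are all hypothetical. The sketch also assumes that resources offering less than the requested value are infeasible.

```python
def build_query(capability: str) -> str:
    """Build a SPARQL SELECT query for resources providing a capability.

    The prefix and property name are hypothetical placeholders.
    """
    return (
        "PREFIX cap: <http://example.org/capilano#>\n"
        "SELECT ?resource WHERE {\n"
        f"  ?resource cap:providesCapability cap:{capability} .\n"
        "}"
    )


def rank_candidates(requested: float, offers: dict[str, float]) -> list[str]:
    """Order feasible resources by closeness to the requested parameter value.

    Resources offering less than requested are dropped (assumption); among the
    rest, the smallest surplus over the requested value comes first.
    """
    feasible = {name: value for name, value in offers.items() if value >= requested}
    return sorted(feasible, key=lambda name: feasible[name] - requested)


print(build_query("Screwing"))

# Payload example from the text: 20 kg requested, 25 kg preferred over 40 kg.
payload_offers = {"Gripper_40kg": 40.0, "Gripper_25kg": 25.0, "Gripper_10kg": 10.0}
print(rank_candidates(20.0, payload_offers))  # ['Gripper_25kg', 'Gripper_40kg']
```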

**Fig. 4** Querying for 'Screwing', 'Positioning' and 'Transporting' in CAPILANO

(Depending on the interchangeable third-party system carrying out the task allocation, this conversion could be adapted, providing compatibility with other programs.) During task allocation, the best possible resource is selected. 'Best' is currently defined by the following criteria: (1) the necessary equipment is already mounted on the robot, (2) highest battery charge, and (3) first item of the query list.
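One possible reading of these three criteria is a lexicographic priority: equipped resources first, then higher battery charge, then earlier position in the query list. The following Python sketch implements that reading. The `Candidate` class, its fields and the individuals are hypothetical, and the priority order is an assumption rather than a confirmed detail of the framework.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    equipped: bool   # necessary equipment already mounted
    battery: float   # state of charge in percent


def allocate(candidates: list[Candidate]) -> Candidate:
    """Select the 'best' candidate from a non-empty, ordered query result list."""
    # Sort key: equipped before not equipped, then higher battery, then list order.
    _, best = min(
        enumerate(candidates),
        key=lambda item: (not item[1].equipped, -item[1].battery, item[0]),
    )
    return best


query_result = [
    Candidate("Kairos_1", equipped=False, battery=90.0),
    Candidate("Kairos_2", equipped=True, battery=60.0),
    Candidate("Kairos_3", equipped=True, battery=60.0),
]
print(allocate(query_result).name)  # Kairos_2
```

Here 'Kairos\_1' loses despite its higher charge because it lacks the equipment, and the tie between the other two is broken by list position.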

During **post-processing**, the list of allocated resources and tasks is converted to be OWL readable and the results are integrated and updated in CAPILANO.

#### **6 Evaluation and Results**

Following Gómez-Pérez [15], the fulfillment of the evaluation criteria consistency, completeness and expandability was investigated to validate and verify CAPILANO. Consistency was proven by deriving the inferred hierarchy from the asserted hierarchy of CAPILANO with the HermiT Reasoner and applying the ROMEO (Requirements-oriented methodology for evaluating ontologies) methodology [20]. To verify the correct implementation and programming of the conceptual ontology (computerized model verification), the ontology taxonomy evaluation and a comparison of the asserted and inferred hierarchy were applied [14].

An **application ontology** was implemented to validate the framework within its intended scope and determine whether its output behavior provides an acceptable accuracy [14]. Use-case-specific instances are modeled according to the conceptual ontology, creating a knowledge base/description model on an operational level, consistent with [8, 13] and [12]. The process of truck chassis assembly is used as an application scenario: The parts 'cross member', 'front member' and 'rear member' have to be transported from storage to the chassis and have to be screwed onto 'chassis'. To fulfill this process, the capabilities 'Screwing', 'Transporting' and 'Positioning' are requested with differing property parameters of acceleration, velocity, jerk, etc. As resources, several stationary robots (ABB\_4600, ABB\_2600), mobile robots (Kairos) and equipment (gripper, screwdriver) are available, which provide the requested capabilities in varying resource combinations.

Based on the application ontology, the framework's performance is analyzed by measuring the time to process a task allocation. Twenty unique queries with varying degrees of complexity were created and used as the seed for a randomizer to generate process charts. For each data point, five process charts with the same number of queries, but unique randomized queries, are processed. The run-time of these five charts is averaged to ensure uniform distribution of complexity within the five process charts. The queries and process charts can be found here: https://github.com/AKlugeWilkes/IoP-CAPILANO/tree/main/03\_Evaluation.

**Fig. 5** Runtime analysis depending on the number of queries

The graph "Average runtime vs. number of queries" a) in Fig. 5 presents the scaling performance of the ontology. The querying was carried out for 5, 25, 50, 100, 250, 500, 1000, 1500 and then every 1500 queries. Based on the test data, it shows linear scaling behavior. The graph "Runtime per query vs. number of queries" b) in Fig. 5, along with an R-Squared trendline, presents an upward trend, representing a non-linear behavior. Linear behavior is obtained when normalized to an error percentage of 0.8385. It is concluded that the developed ontology has a linear scaling behavior within a margin of 0.8385%.
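The evaluation procedure (average the run-time of five process charts per data point, then judge linearity with a trendline) can be reproduced in a few lines of Python. The sketch below uses ordinary least squares and the coefficient of determination R²; this is an assumption about how such a trendline could be computed, not the authors' script. The timings are made-up placeholders, not measurements from the paper.

```python
from statistics import mean


def linear_fit(xs, ys):
    """Least-squares line y = a*x + b and coefficient of determination R^2."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = y_bar - a * x_bar
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot


# Placeholder run-times in seconds (NOT data from the paper):
# number of queries -> run-times of five randomized process charts.
runs = {
    5: [0.4, 0.5, 0.6, 0.5, 0.5],
    25: [2.4, 2.6, 2.5, 2.5, 2.5],
    50: [5.0, 5.1, 4.9, 5.0, 5.0],
    100: [10.0, 9.8, 10.2, 10.0, 10.0],
}

n_queries = sorted(runs)
avg_runtime = [mean(runs[n]) for n in n_queries]  # one averaged data point per n
slope, intercept, r_squared = linear_fit(n_queries, avg_runtime)
print(f"runtime ~ {slope:.3f} s/query * n + {intercept:.3f} s, R^2 = {r_squared:.4f}")
```

An R² close to 1 for the averaged run-times supports a linear scaling claim. Plotting run-time per query against the number of queries, as in panel b), makes deviations from linearity easier to see.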

It was shown that the developed framework is scalable, as new entities can be integrated for an application and other ontologies can be inherited. As visualized in the figures above, querying and matching resources to tasks based on the required and provided capabilities was realized through SPARQL querying and Python-based allocation. Capabilities resulting from combining resources into a new one can be inherited from one instance to another. The transfer of query results to third-party software to enable task allocation was realized by way of example through interfaces to a Python program and can be adapted for other third-party software.

#### **7 Conclusion and Outlook**

This paper contributes the ontology CAPILANO. CAPILANO formally describes assembly resources and their combined capabilities as a function of equipment using the Web Ontology Language (OWL). Based on CAPILANO, a framework matching resource capabilities with task requirements and subsequent availability-aware task allocation was developed. Compared to the manual allocation of tasks to resources, automatic allocation requires less time and provides reproducible results. In conclusion, the developed framework supports the planning of mobile assembly stations by displaying possible combinations of resources and allocations to assembly tasks, reducing the time required for task allocation compared to manual allocation.

In future research, the framework will be extended by investigating the spatial and temporal requirements of a feasible formation. Spatial reachability and manipulability of the allocated tasks as well as collision avoidance of the allocated resources will be researched. The task allocation applied will be extended by incorporating criteria like proximity of the resource and the allocated task pose, to optimize production time.

To explicitly integrate a higher degree of detail of the implicitly existing knowledge of humans necessary for automated assembly planning and the subsequent assembly execution into the ontology, the ontology can be enhanced by additional parameters. For example, parameters such as the wear and tear of resources over time or the consideration of measurement systems for localization on a map can be added.

**Acknowledgements** Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy – EXC-2023 Internet of Production – 390621612.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Exploratory Pilot Study for the Integration of Task-Specific Load Alternation into a Cyclic Assembly Process**

Steffen Jansing, Christoph Rieger, Tim Jabs, Jochen Deuse, Florestan Wagenblast, Robert Seibt, Julia Gabriel, Judith Spieler, Monika Rieger and Benjamin Steinhilber

#### **Abstract**

Takt work represents a significant risk factor for the development of musculoskeletal complaints and diseases, especially in short-cycle processes. The increased risk results primarily from a permanent uniform load on the musculoskeletal system. Studies on motor variability suggest that an increase in load variation can have positive effects on reducing the risk.

The research project "Integration of activity-specific load changes to reduce physical stress during takt work" aims to demonstrate the increase in load variation by introducing specific load changes during takt work as a possible means of preventing musculoskeletal disorders without causing negative effects on productivity. For this purpose, a pilot study was already carried out with ten subjects, which is presented in more detail in this paper.

As a foundation for the description of this study, this paper first provides background on the applied theoretical concepts as well as the design of the overall research project. This is followed by the presentation of the experimental procedure and the results of the pilot study on cyclic assembly. Based on the stress profiles determined via surface electromyography, the sequence of the analysed reference assembly process is reconfigured in order to integrate load changes. Future investigations within the research project are planned to compare both processes in terms of risk surrogate parameters for musculoskeletal disorders.

S. Jansing (\*) · C. Rieger · T. Jabs · J. Deuse

Institute of Production Systems, TU Dortmund University, Dortmund, Germany e-mail: steffen.jansing@tu-dortmund.de

F. Wagenblast · R. Seibt · J. Gabriel · J. Spieler · M. Rieger · B. Steinhilber Institute of Occupational and Social Medicine and Health Services Research, University Hospital Tübingen, Tübingen, Germany

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_6

#### **Keywords**

Takt work · Cyclic assembly · Manual assembly · Musculoskeletal disorders · Load alternation

#### **1 Introduction**

A large proportion of all employees in manufacturing companies in Germany is involved in takt-based work. The main reasons for this widespread use are the advantages of increased transparency in production processes as well as increased productivity and reduced training time for employees [1]. Although a cycle time of about one minute has become typical in many manufacturing companies [2], a continuous reduction in the amount of work per cycle with a concomitant decrease in cycle time is discernible. Thus, 42% of all production processes in industrialised countries have a cycle time of less than 1.5 min and 26% of all processes have a cycle time of less than 30 s [3].

On the other hand, takt work is a significant risk factor for the development of various musculoskeletal disorders (MSDs) and complaints (MSCs) due to frequent repetitive and uniform movements [4, 5]. This also manifests itself in employees' absenteeism from work. In 2020, for example, 26.8% of all days of incapacity to work among employees in the manufacturing sector were attributable to MSDs, resulting in a loss of gross value added of 10.6 billion Euro [6]. Since a shift away from takt work is unlikely due to its widespread use, there is a need for new approaches to work design that adapt the form of takt work to the employee.

Both in industrial practice and in science, load alternation is proposed as a possible means of reducing physical stress. The reason for this is the assumption that the same motor units and associated muscle fibres are generally always activated and stressed when performing uniform activities. This can lead to overload or even degeneration of individual muscle fibres [7]. It is assumed that load changes protect individual motor units from such overload situations [8]. Furthermore, it is assumed that a greater variation in load contributes to a relief of motor units [9, 10]. However, there is a lack of evidence for the targeted use of this approach, which is why it is the focus of this research project. For the investigations presented in this paper, load changes are therefore defined as targeted relief or different types of loads on muscles between activity segments. This publication is primarily concerned with the first sections of the project: the definition of exemplary assembly processes with the help of a pilot study.

Before providing details on the pilot study as well as the overarching research project, the necessary theoretical background is outlined in the following chapter. This includes the characterisation of repetitive activities as well as the definition of load and stress. In addition, the state of the art in recording physiological stress is presented.

#### **2 State of the Art**

Repetitive activities are not clearly defined in the literature. However, they are unanimously described as activities that continuously stress the same muscle groups, tendons, etc. in a short time sequence and are performed over a period of at least 60 min [11, 12]. Loads are defined according to [13] as external conditions and demands in a system that affect the physiological or psychological stress of a person, whereby the objectivity of the load is a central characteristic [14, 15]. The internal reaction resulting from the load, which is individual for each person, is referred to as stress and depends on the person's individual characteristics [13].

Both subjective and objective methods are available for measuring stress. The subjective techniques include not only the questioning of perceived exertion, e.g. via the Borg scale [16], but also the description of physical complaints or self-assessment via standardised questionnaires (e.g. self-state scale [17] and NASA task load index [18]). Compared to objective methods, subjective ones are particularly disadvantageous because of their low resolution and the fact that they can be influenced at will [19]. In addition to the evaluation of e.g. produced quantities and number of errors to determine the work performance [20], physiological methods for determining stress represent the core of objective methods.

As a prominent physiological method, surface electromyography (SEMG) allows electrical muscular activity to be measured noninvasively. By normalising the amplitude parameter RMS (root mean square), calculated from the SEMG signal during physical work, to the RMS during maximum voluntary activation (%MVE) for each person and muscle, work-related muscular stress can be characterised and attributed to a wide range of work activities [21]. A characterisation method often used in ergonomic research is the amplitude probability distribution function with the 10th percentile, the median and the 90th percentile as MSC risk indicators [22]. In cyclic assembly work, SEMG assessment therefore indicates muscle-specific peaks or prolonged episodes of muscle stress that would be of interest for a redesign [23].
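The normalisation and percentile evaluation described above can be illustrated with a minimal Python sketch. It makes several simplifying assumptions that are not taken from the study: the reference value is the RMS of the whole MVC recording, windows do not overlap, percentiles use the nearest-rank definition, and the sample values are synthetic. The function names are hypothetical.

```python
import math


def rms(samples):
    """Root mean square of a sequence of signal samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def percent_mve(signal, mvc_signal, window):
    """RMS of consecutive non-overlapping windows in percent of the MVC RMS."""
    reference = rms(mvc_signal)
    return [
        100.0 * rms(signal[i:i + window]) / reference
        for i in range(0, len(signal) - window + 1, window)
    ]


def percentile(values, p):
    """Nearest-rank percentile (p in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Synthetic samples for illustration only.
mvc_recording = [2.0, -2.0, 2.0, -2.0]
work_recording = [1.0, -1.0, 0.5, -0.5, 2.0, -2.0]

activation = percent_mve(work_recording, mvc_recording, window=2)
print(activation)  # [50.0, 25.0, 100.0]

# Static, median and peak level of the amplitude probability distribution.
for p in (10, 50, 90):
    print(p, percentile(activation, p))
```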

For ergonomic design, a reduction in muscular stress during physically demanding work activities is considered positive [24]. A correlation with an increased risk of MSCs has already been demonstrated, particularly for static muscle stress indicated by the 10th percentile [25]. In addition, other risk surrogate parameters can be calculated from the SEMG measurement. For example, the number of muscle activities [26] and the relative total duration of activities below 0.5% of the maximum activation [27] could be associated with an increased risk of MSCs in the shoulder–neck region [28]. Similarly, a low degree of cycle-dependent standard deviation of muscular activation (motor variability) [29] is thought to be associated with the development of MSCs and MSDs [30].

#### **3 Basics of the Project**

In the research project "Integration of activity-specific load changes to reduce physical stress during takt work", the aim is to provide conceptual proof of the positive effect of load changes and an increase in load variation during takt work on the aforementioned risk surrogate parameters for MSCs and MSDs. After introducing the overall study design in this chapter, the focus of this paper is on the preliminary pilot study that was already carried out, including the presentation of the findings in Chap. 4.

#### **3.1 Project Design**

The research project is divided into four consecutive parts. In the first work package (WP), an assembly process is defined, which serves as a reference for the entire study and which fulfils the essential characteristics of a cyclic, manual work system. In the subsequent second WP, the reference process is carried out by ten experimental subjects (half male and half female) in a pilot study. The test persons are equipped with measurement technology (SEMG) during the execution of the assembly. The muscular stress profiles generated from the measurements are used to subsequently reconfigure the chronological sequence of partial activities of the assembly process to integrate load changes. In addition, the duration of the study for the main experiment is determined by analysing the timing of the increase in stress in the sense of physical complaints, physical exertion or signs of muscular fatigue. In the subsequent main experiment, WP three, the reference process is compared to the reconfigured assembly process. For this purpose, data is collected from 40 test persons (half male and half female) who are assigned to both assembly processes on two different days under laboratory conditions in a randomised, balanced and blinded manner. During execution, SEMG data is collected as well as information on execution times and errors. In WP four, a methodological approach is developed based on the comparison and evaluation of the results from the main experiment. This approach should provide a procedure for industrial practice to classify the muscular stresses and loads of partial activities without measurement support. This makes it possible to determine an optimised sequence of stresses for the assembly to be carried out. In the case of a successful proof of concept (PoC), further studies are required to validate and expand the developed method.

Due to the planned laboratory studies with experiments on humans, an ethics application with the planned study protocol was prepared at the beginning of the research project and submitted to the responsible ethics committee at the Medical Faculty of Tübingen. In this protocol, all methods and precautionary measures for the pilot study as well as the main experiment are described in detail. The application received a positive vote from the ethics committee.

#### **3.2 Definition of the Reference Process and Work System Design**

At the beginning of the study, a manual assembly process was defined as a reference process, which fulfils essential characteristics of a cyclic work system. All requirements for this process were formulated within the scope of a specification sheet. In addition to the conditions defined in the research proposal and results of a literature research, interviews with representatives from industrial practice were fundamental to this. Exemplary requirements include a target process duration of 60 s, the integration of ambidextrous work and the integration of static as well as dynamic loads. Based on these specifications, several drafts for assembly processes were developed, physically implemented in the laboratory and evaluated in terms of time and ergonomics. Through close cooperation between the research partners, the assembly process could be checked with regard to the defined requirements and iteratively adjusted.

The final reference process consists of 13 partial activities, which are mostly independent in terms of the assembly sequence. Due to the design as a manual assembly process, only two operations require the use of tools in the form of an electric screwdriver. Based on the MTM (Methods-Time Measurement) process description, the movements of the partial activities range from the Get and Place of larger individual parts to the Handle Tool of electric screwdrivers. The individual partial activities can be described in terms of the elements of the basic movement cycle Reach, Grasp, Move, Position and Release and are composed of various basic operations [31]. Due to the different motion lengths, despite a maximum execution time of approx. 3 s for a basic movement cycle, a partly dynamic or static strain of the different muscle groups takes place over several sequence segments.

The analysis of the execution time of the process using MTM-UAS (Universal Analysis System) as a system of predetermined times leads to a standard time of approx. 63 s for the assembly process. The ergonomic assessment of the work system by means of the ergonomic assessment worksheet (EAWS) results in a medium overall risk and thus requires measures for risk control. The defined work system shown in Fig. 1 is height-adjustable and corresponds to the state of the art. The individual components with weights ranging from a few grams to 2.1 kg are located on a rack in front of the worker. They are assembled in housings fixed on trays, which are moved from left to right on a roller conveyor.

**Fig. 1** Work system for assembly of components

#### **4 Pilot Study**

In preparation for the main investigation, a pilot study was conducted to derive the experimental setup. The aim of the series of measurements was to create a data basis for the reconfiguration of the assembly process and to define the observation period for the following main investigation. In addition, the influence of previous manual experience was to be examined as well as the training concept used.

#### **4.1 Methods and Procedure**

The study population for the pilot study consisted of ten right-handed subjects with an average age of 30 years, half women and half men, who had different amounts of experience in assembly work and provided written informed consent prior to participation. In addition, no limitations in the musculoskeletal system of the upper body and general physical health without previous illnesses were defined as inclusion criteria.

First, the subjects were prepared for the measurement and instructed in the measurement procedure. Descriptions based on [32] were used for the localisation of the following muscles of the forearm and shoulder–neck area on the right side of the body: extensor digitorum muscle, flexor digitorum superficialis muscle, infraspinatus muscle, deltoid anterior muscle and upper trapezius muscle (also on the left side of the body). These muscles were selected because of their relationships with work-related complaints in repetitive work [33]. Then, electrodes were attached according to the SENIAM recommendations [34]. Subsequently, the normalisation procedure was carried out. For this purpose, participants had to perform three maximum voluntary contractions (MVC) of each targeted upper body muscle (1 min pause between MVCs) while the muscle activity was recorded. In order to achieve a uniform working method with a standardised procedure, the test persons were then instructed and trained in the assembly process. The reference performance to be achieved for the subsequent measurement to start was defined as the execution of two consecutive error-free processes with a permissible deviation of ±10% from the standard performance. The four-step method according to REFA, which focuses on manual, short-cycle and simply structured tasks with a standardised sequence, was used as the training concept [35].

During the subsequent assembly, various subjective and objective measurements were taken. As a subjective procedure, the test persons were asked before and after the assembly as well as every 20 min during the assembly about their perceived exertion using the Borg scale [36] and their personal discomfort at parts of the upper body [37] using a numerical rating scale from 0 to 10. As objective procedures, a SEMG measurement was carried out as the core element, in addition to determining the execution times and errors. With the help of the recorded execution time, a uniform execution speed was determined based on the degree of time defined by REFA [35], whereby a deviation of ±10% from the standard time was defined as permissible.
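The check on execution speed can be illustrated with a short sketch. It assumes the common definition of the degree of time as the ratio of standard time to measured time in percent; the function names and the sample cycle times are hypothetical, and the 63 s standard time is the MTM-UAS value reported in Sect. 3.2.

```python
STANDARD_TIME_S = 63.0  # MTM-UAS standard time reported for the reference process


def time_degree(actual_time_s, standard_time_s=STANDARD_TIME_S):
    """Degree of time in percent (assumed definition: standard / actual * 100)."""
    if actual_time_s <= 0:
        raise ValueError("actual time must be positive")
    return standard_time_s / actual_time_s * 100.0


def within_tolerance(actual_time_s, standard_time_s=STANDARD_TIME_S, tolerance=0.10):
    """True if the cycle time deviates by at most +/-10 % from the standard time."""
    return abs(actual_time_s - standard_time_s) <= tolerance * standard_time_s


cycle_times_s = [61.5, 64.2, 70.1]  # hypothetical measured cycle times
for t in cycle_times_s:
    print(f"{t:5.1f} s: time degree {time_degree(t):6.2f} %, permissible={within_tolerance(t)}")
```

In this sketch the third cycle would be rejected, because it exceeds the standard time by more than 10%.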

The SEMG signals were sampled at 4096 Hz by a SEMG device (PS12-II, Thumedi GmbH & Co. KG, Germany) using a combined data analyser and logger, which calculated the root mean square (RMS) from the power spectrum in real-time. For the signal normalisation, the RMS during the assembly was divided by the maximum RMS during the MVCs and is expressed as a percentage [%MVE]. To calculate the average muscle activity during each partial activity for each subject, the medians of the normalised RMS of the single partial activities were obtained from five uninterrupted 1 min assembly cycles after the first 30 min.
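The normalisation and aggregation steps can be sketched as follows. This is a minimal illustration and not the authors' processing pipeline: the function names, the activity labels and all numbers are hypothetical, and the sketch pools the RMS samples of one subject and one muscle without modelling the five separate cycles.

```python
from statistics import median


def normalise_to_mve(rms_values, mvc_rms_values):
    """Express RMS samples as a percentage of the maximum RMS seen during the MVCs."""
    reference = max(mvc_rms_values)
    if reference <= 0:
        raise ValueError("MVC reference must be positive")
    return [100.0 * value / reference for value in rms_values]


def median_per_activity(normalised, activity_labels):
    """Median %MVE for each partial activity, given one label per RMS sample."""
    grouped = {}
    for value, label in zip(normalised, activity_labels):
        grouped.setdefault(label, []).append(value)
    return {label: median(values) for label, values in grouped.items()}


# Hypothetical RMS values in microvolts for one muscle of one subject.
mvc_rms = [180.0, 200.0, 190.0]          # three MVC trials
rms = [20.0, 30.0, 40.0, 10.0, 12.0]     # samples during assembly
labels = ["A1", "A1", "A1", "A2", "A2"]  # partial activity of each sample

print(median_per_activity(normalise_to_mve(rms, mvc_rms), labels))
```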

#### **4.2 Results and Discussion**

The evaluation of the subjective measurement data shows a continuous increase in the subjects' perceived exertion and discomfort over the duration of the exercise. For example, half of all test persons report physical complaints in the area of the neck and the right shoulder after 120 min. At that point, three out of ten test persons report complaints in the area of the left shoulder, and another four out of ten test persons report complaints in the area of the right wrist. Over the same period, an increase in perceived exertion from a median of 6.0 to 9.5 with an interquartile range of 3.5 was observed. The evaluation of the objectively collected measurement data and process-related parameters shows that with regard to the degree of time, there is no need to differentiate the test persons according to gender and previous manual experience in assembly. A mean time degree of 98.15% with a standard deviation (SD) of 0.31% was recorded for all participants. Likewise, no remarkable difference can be determined regarding the number of errors with a mean value of 0.1 execution errors per cycle (SD=0.06).

The evaluation from an occupational health point of view is based in particular on the assessment of the stress of the partial activities shown in the SEMG measurements. The result is exemplified for the right trapezius muscle in the following Fig. 2. This shows the time course for a representative cycle of muscular load. The median as well as the 10th and 90th percentile of the load are shown for the individual partial activities for all subjects. Sections with a high load are highlighted in colour. The illustration shows that there is a concentration of highly stressful partial activities in the first section of the assembly process and a decrease in the stress level towards the end of the cycle.

In summary, the pilot study shows the validity of the training concept, and from a process-technical point of view, there is no need to differentiate between subjects based on previous assembly experience and gender. The structured training results in homogenous cycle times for all study participants, which is necessary to eliminate possible effects of different work paces [38, 39]. Additionally, error rates are held at a low level to resemble skilled industrial assembly with uniform movements.

From an occupational health perspective, however, the differentiation of male and female test persons remains relevant due to different physiological prerequisites [40]. In the analysis of muscular stress and perceived complaints, an increase in the frequency of complaints and in muscular stress is observed after only two hours. As a result, a measurement duration of 2.5 h is deemed sufficient for the future main study. Concerning the recorded Borg values, the feeling of exertion is generally considered to be low to moderate in that time span, as expected from the EAWS assessment of the work system ex ante. Furthermore, the stress based on the SEMG data is in line with previous studies on manual assembly [41]. As an extension to existing approaches, different stress levels are associated with the single partial activities of the process, which serves as the foundation for the subsequently described reconfiguration.

**Fig. 2** Median and percentiles of the stress on the trapezius muscle (right) for all subjects as well as exemplary progression

#### **5 Reconfiguration**

Based on the stress profile of the reference process determined in the pilot study, the reconfiguration is carried out by defining a new sequence. The aim is to achieve the highest possible variation of the partial activities in order to prevent a continuous or uniform stress on individual muscle groups. The redefinition of the assembly sequence has to take place under the framework conditions of an unchanged temporal and ergonomic evaluation of the process. The work system is not modified in order to prevent overlapping effects. In addition, the realism and practical relevance should be maintained in the redesign and thus the joining sequence should not be abstracted too much.

The result of the reconfiguration is based on the stresses per partial activity determined in the pilot study and is shown again for the trapezius muscle (right) in the following Fig. 3. It is visible that in the reconfiguration the partial activities with high stress alternate with partial activities with low stress. Additionally, it can be stated that due to the limitations of the technically possible assembly sequences and the goal of a constant takt time, it is not possible to determine an optimal solution regarding stress.
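The idea of alternating high and low stress can be illustrated with a simple greedy ordering. This is only a sketch under strong assumptions and not the procedure used in the study: it ignores the technically possible assembly sequences and the takt time, considers a single muscle, and uses hypothetical activity names and stress values.

```python
def alternate_high_low(stress_by_activity):
    """Order activities so that high- and low-stress ones alternate (greedy heuristic)."""
    ordered = sorted(stress_by_activity, key=stress_by_activity.get, reverse=True)
    sequence = []
    high, low = 0, len(ordered) - 1
    take_high = True
    while high <= low:
        if take_high:
            sequence.append(ordered[high])  # next most stressful activity
            high += 1
        else:
            sequence.append(ordered[low])   # next least stressful activity
            low -= 1
        take_high = not take_high
    return sequence


# Hypothetical median stress per partial activity in %MVE.
stress = {"A": 22.0, "B": 5.0, "C": 18.0, "D": 7.0, "E": 12.0}
print(alternate_high_low(stress))  # ['A', 'B', 'C', 'D', 'E']
```

A realistic reconfiguration would additionally have to respect precedence constraints between joining operations and the loads on several muscle groups at once, which is why no optimal solution regarding stress can be expected.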

#### **6 Outlook**

The presented reconfiguration of the assembly process is the basis for data collection in the future main study of the research project. In this study, 40 test persons (half male and half female) will perform both process sequences on two different days in a blinded, balanced and randomised experimental design. This is followed by the processing, evaluation and interpretation of the results. The aim of the main study is to prove the positive influence of specific load changes on risk surrogate parameters for MSDs and MSCs. In order to provide a methodology for the integration of specific load changes in industrial practice, a procedure will finally be developed. In case of a successful PoC, further studies will be required to validate and expand this methodological approach.
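One way to obtain a balanced and randomised order of the two process sequences is sketched below. The stratification by gender, the subject identifiers and the fixed seed are assumptions made for illustration; the text does not specify how the allocation will be carried out.

```python
import random


def assign_orders(subject_ids, rng):
    """Randomly split subjects into two equally sized groups with opposite day order."""
    ids = list(subject_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for subject in ids[:half]:
        orders[subject] = ("reference", "reconfigured")
    for subject in ids[half:]:
        orders[subject] = ("reconfigured", "reference")
    return orders


rng = random.Random(42)  # fixed seed only to make the sketch reproducible
women = [f"F{i:02d}" for i in range(1, 21)]  # hypothetical subject identifiers
men = [f"M{i:02d}" for i in range(1, 21)]

# Assumption: balance the order within each gender group separately.
schedule = {**assign_orders(women, rng), **assign_orders(men, rng)}
reference_first = sum(1 for order in schedule.values() if order[0] == "reference")
print(len(schedule), reference_first)
```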


**Fig. 3** Stress on the trapezius muscle (right) for the reconfguration based on the pilot study

**Acknowledgements** The project "Integration of activity-specific load changes to reduce physical stress during takt work" is funded under no. FF-FP 0458 by the German Social Accident Insurance. We thank the research advisory group for the constructive exchange of ideas.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

**Augmented Reality**

### **Input and Tracking System for Augmented Reality-Assisted Robot Programming**

Michael Brand , Marvin Gravert , Lukas Antonio Wulff and Thorsten Schüppstuhl

#### **Abstract**

Augmented Reality-assisted robot programming systems (ARRPS) aim to make the programming of industrial robots more efficient by providing an AR-based human machine interface that allows operators to program robots intuitively and quickly. This work aims to contribute to the field by presenting an input and tracking system based on the VIVE Lighthouse technology that can act as a basis for ARRPS systems, improving maturity, costs and accessibility. To evaluate the system, ARRPS core functionality has been implemented so as to demonstrate its basic feasibility. An extensive evaluation of the system accuracy has been conducted, as this is one of the key criteria for potential adoption of the technology. The feasibility could be successfully demonstrated and it could be shown that the end-to-end mean absolute error of the robot path point placement amounts to 11 mm in a workspace of 0.6 × 0.6 × 0.25 m³ volume. Finally, the robustness and setup time of the system still need to be improved.

M. Brand (\*) · M. Gravert · L. A. Wulff · T. Schüppstuhl Institute of Aircraft Production Technology, Hamburg University of Technology, Hamburg, Germany e-mail: michael.brand90@gmail.com

M. Gravert e-mail: marvin.gravert@tuhh.de

L. A. Wulff e-mail: lukas.wulff@icarus-consult.de

T. Schüppstuhl e-mail: schueppstuhl@tuhh.de

Michael Brand and Marvin Gravert contributed equally and are joint first authors.

L. A. Wulff ICARUS Consulting GmbH, Lüneburg, Germany

#### **Keywords**

Augmented reality · ARRPS · VIVE · HoloLens · Accuracy

#### **1 Introduction**

Automation is one of the key technologies for enhancing the efficiency of industrial production processes. Due to the trends of an increasingly high number of variants and small series, industrial robots need to be programmed more and more frequently. Especially Small and Medium Enterprises (SMEs) have difficulties employing automation economically because of their small batch sizes and the high costs [1]. This puts cutting down the costs of robot programming into the spotlight. Today, the dominating programming method in mass production is hybrid programming [2], a two-stage procedure consisting of an extensive offline programming (OLP) and simulation step, followed by a relatively short online commissioning step using teach-in. New approaches that aim to make robot programming more intuitive and less costly include Programming by Demonstration (PbD) as well as Augmented Reality-assisted programming systems<sup>1</sup> [4]. They are not as costly and complex as enterprise-grade OLP software suites, yet more efficient and intuitive than online programming using the teach pendant. These approaches address production processes which have low to medium complexity and thus do not require extensive simulation-based optimization.

ARRPS have been researched extensively [3, 5–8]. The basic idea is to provide an AR-based human machine interface (HMI) for robot programming that overlays useful virtual information like robot paths with the real environment and essentially aims to replace online programming via the teach pendant. This allows for faster and more intuitive programming. Typically, path points can be programmed via 3D user input and the resulting robot motion can be previewed based on a basic simulation. It has been shown that efficiency could be increased significantly when compared to conventional teach-in [5]. However, there are still considerable issues with existing concepts, including the unavailability of mature input methods and high hardware costs. The goal of this paper is to present a novel input and tracking system for AR-based robot programming, as a step towards mitigating some of the existing deficits. The contributions include:

<sup>1</sup>We adopt the term ARRPS coined by the group of Ong et al. as a general term for similar systems [3].


In Chap. 2 the theoretical background and related research are presented and analyzed. Subsequently, in Chap. 3 the concept for the proposed system is detailed, and in Chap. 4 the implementation is briefly outlined. In Chap. 5 the evaluation is presented and the results are thoroughly discussed. The paper concludes with a summary and an outlook on future work in Chap. 6.

#### **2 Related Work**

ARRPS systems [3, 5–8] try to improve robot programming by employing AR as an intuitive HMI utilized at the production site. The most comprehensive milestone publications are presented subsequently.

Lambrecht et al. [6] present an ARRPS that consists of a tablet and a Kinect to provide gesture input. The components are calibrated to each other using ArUco markers. The accuracy of the gesture recognition component is evaluated to be around 6 mm; the end-to-end accuracy of the system is not examined.

Vogl [5] presents a system for robot programming using an infrared-tracked stylus for path point placement. The robot trajectory is projected onto workpieces with high accuracy using a laser projector. The overall system accuracy is not examined; however, based on the components' accuracies, the overall error is likely under 2 mm.

In their recent work Ong et al. [8] show an ARRPS for welding applications. The system is built with the commercial OptiTrack system as the main tracking system. The input device is a computer mouse with OptiTrack markers attached to it. The system is evaluated with user tests, showing that programming time could be saved and the accuracy necessary for the process could be achieved.

Analyzing the literature, little focus has gone into examining end-to-end system accuracy of the proposed concepts, although this is a crucial criterion determining which range of production processes could be covered. As for the components, the presented input devices mostly have low maturity or few functions. The employed tracking systems tend to be either very costly or rather inaccurate. Lastly, none of the systems are released as open-source software.

The aforementioned deficits are addressed by designing a novel input and tracking system that has low costs, yet reasonable accuracy. The input hardware needs to have a good level of maturity and a rich function set. The system shall be implemented in a modular software platform and released as open-source software, so others can build upon it. The system will be evaluated through a demonstration of basic feasibility. Furthermore, an extensive examination of the system accuracy is going to be conducted.

<sup>2</sup> https://github.com/MarvinGravert/ViveBasedArrpsPlatform

#### **3 Concept**

The basis for the proposed system is the Microsoft HoloLens, a modern Optical See-Through Head Mounted Display. It is combined with the Lighthouse (LH) system as an additional outside-in tracking system which was originally built for VR applications. This enables the use of the VIVE controllers and the so-called VIVE Trackers for object tracking. The LH system has not previously been used as an input and tracking system for ARRPS systems.

This setup enables natural and ergonomic input using established controller hardware. The functional space of the controller is rich: multiple buttons and a trackpad can be used. Also, diverse gesture input can be implemented using a 3D-tracked controller. The VIVE setup used can be considered low-cost when compared to other commercially available tracking systems. The LH technology is robust against environmental conditions, as it uses infrared light. Its mean static accuracy has been determined to be < 3 mm [9], which is a very good performance for a low-cost system. The recommended workspace with two LHs is 3.5 × 3.5 m² but can generally be arbitrarily chosen. The system can be extended by adding more LHs, yielding flexibility in terms of setup. On the flip side, the mobility of the system is limited because the LHs need to be set up. By placing them on movable tripods and relying on their ability to calibrate themselves automatically, decent mobility of the system can still be ensured.

Figure 1 shows the system overview, that is, the main components and their interactions. The communication of all components is carried out via the central server. The HoloLens acts as the main user interface. A distributed software architecture is employed where the HoloLens is the front-end, and the server is the back-end. The LH system complements the internal capabilities of the HoloLens. It acts as the central tracking system, tracking the controller as well as the HoloLens with a Tracker mounted on it. The LH system sends the tracking information as well as the controller inputs to the server. In order to automatically register the virtual AR scene with objects tracked by the LH tracking system, the HoloLens is tracked externally. The robot interface exchanges robot motion commands as well as the current state of the robot with the server. A Tracker is initially mounted to the robot flange to reference the robot with the LH; it can be removed during usage of the system.

**Fig. 1** System overview

#### **4 Implementation**

Figure 2 shows the system architecture. The server is implemented using a modular microservice architecture. It provides essential services to the HoloLens application, namely, the tracking hub service, the robot path service and the registration service. The tracking hub service collects all the information from the LH tracking system. Additionally, the registration service helps with the Lighthouse-HoloLens-registration procedure. The robot path service stores robot paths and controls the robot via the robot interface. Finally, the front-end app on the HoloLens manages the application state and provides the user interface.

The server is run in a Docker container for system-independent and robust operation. Internally, the communication is carried out via gRPC. Since the HoloLens does not support gRPC, TCP is used for the communication between the server and the HoloLens. The server is implemented using the Python programming language, whereas the HoloLens front-end is designed using the Unity game engine. The latter is rather basic, as the focus of this work does not lie on the user interface but rather on the tracking system. The robot-specific robot interface is not implemented; instead, the data is transferred manually. It could be implemented in the future by using the API of the robot manufacturer or using a common robot interface like ROS (Robot Operating System). The robot used is a KUKA KR 6 R900 sixx, and the interface to the LH system is implemented with SteamVR.

**Fig. 2** Software architecture

**Fig. 3** HoloLens-Lighthouse-registration

In order for the tracking scheme to work in practice, the Tracker needs to be calibrated with respect to the HoloLens it is mounted on. We call this procedure HoloLens-Lighthouse-registration. It needs to be carried out once by each user.

Figure 3 shows a scheme of the transformations; the transformation <sup>HoloLens</sup>*T*<sub>HoloTracker</sub> is sought. In order to determine it, the user is asked to align a real object (with a Tracker mounted onto it) with a virtual copy that they can control through the AR interface. Based on this, correspondences, which are the same points expressed in two different coordinate frames, can be collected. The collected data set can be used to solve for the unknown transformation <sup>HoloLens</sup>*T*<sub>HoloTracker</sub> using Point Set Registration methods [10].

In order to ultimately be able to move the robot flange along user-defined path points, the robot location needs to be known with respect to the LH system. The transformation scheme is shown in Fig. 4a; the goal is to determine the unknown transformation between the robot base and the LH system, <sup>Base</sup>*T*<sub>Lighthouse</sub>. A procedure called Robot-Lighthouse-referencing is proposed. A Tracker is initially attached to the robot flange and is then moved to discrete locations according to a predefined robot program (as depicted in Fig. 4b). At the same time, the LH tracking system records the locations of the Tracker. The resulting correspondences can be used to determine the unknown transformation using Point Set Registration methods. After the referencing is done, the Tracker can be removed. To make this procedure more efficient, a quick changer system for the robot flange can be used. All of the steps in the Robot-Lighthouse-referencing could potentially be automated.
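Both the HoloLens-Lighthouse-registration and the Robot-Lighthouse-referencing reduce to estimating a rigid transformation from point correspondences. The following sketch shows one well-known Point Set Registration approach, Horn's closed-form unit-quaternion method; the text does not state which method from [10] is actually used, so this is an illustrative choice. The function names and test points are hypothetical, and the dominant eigenvector is found by shifted power iteration only to keep the sketch free of external libraries.

```python
import math


def _mean(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(3)]


def register_points(source, target, iterations=2000):
    """Estimate rotation R and translation t such that target ~= R * source + t."""
    cs, ct = _mean(source), _mean(target)
    # Cross-covariance sums S[a][b] = sum of (centred source_a * centred target_b).
    S = [[0.0] * 3 for _ in range(3)]
    for p, q in zip(source, target):
        for a in range(3):
            for b in range(3):
                S[a][b] += (p[a] - cs[a]) * (q[b] - ct[b])
    (Sxx, Sxy, Sxz), (Syx, Syy, Syz), (Szx, Szy, Szz) = S
    # Horn's symmetric 4x4 matrix; its dominant eigenvector is the rotation quaternion.
    N = [
        [Sxx + Syy + Szz, Syz - Szy, Szx - Sxz, Sxy - Syx],
        [Syz - Szy, Sxx - Syy - Szz, Sxy + Syx, Szx + Sxz],
        [Szx - Sxz, Sxy + Syx, -Sxx + Syy - Szz, Syz + Szy],
        [Sxy - Syx, Szx + Sxz, Syz + Szy, -Sxx - Syy + Szz],
    ]
    # The shift bounds the spectral radius, so power iteration finds the largest eigenvalue.
    shift = sum(abs(v) for row in N for v in row)
    if shift == 0.0:
        raise ValueError("degenerate point sets")
    quat = [1.0, 0.5, 0.25, 0.125]
    for _ in range(iterations):
        quat = [sum(N[i][j] * quat[j] for j in range(4)) + shift * quat[i] for i in range(4)]
        norm = math.sqrt(sum(v * v for v in quat))
        quat = [v / norm for v in quat]
    w, x, y, z = quat
    R = [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
    t = [ct[i] - sum(R[i][j] * cs[j] for j in range(3)) for i in range(3)]
    return R, t


# Hypothetical correspondences: Tracker positions in the LH frame (source) and the
# same positions in the robot base frame (target), generated from a known transform.
angle = math.radians(30.0)
R_true = [[math.cos(angle), -math.sin(angle), 0.0],
          [math.sin(angle), math.cos(angle), 0.0],
          [0.0, 0.0, 1.0]]
t_true = [0.1, -0.2, 0.3]
source = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.4, 0.0), (0.0, 0.0, 0.3), (0.5, 0.4, 0.3)]
target = [tuple(sum(R_true[i][j] * p[j] for j in range(3)) + t_true[i] for i in range(3))
          for p in source]
R_est, t_est = register_points(source, target)
```

With noisy measurements, the residuals of the correspondences under the estimated transformation give the registration error; evaluating it on points that were not used for the fit, as done in Sect. 5.2, avoids an overly optimistic estimate.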

**Fig. 4** Robot-Lighthouse-referencing: **a** Transformation scheme, **b** Robot movement scheme

#### **5 Evaluation**

To evaluate the system, a basic demonstration is conducted in Sect. 5.1. An in-depth examination of the system accuracy is presented in Sect. 5.2.

#### **5.1 Demonstration**

To evaluate the ARRPS platform, a basic functional demonstration has been conducted. The essential feature set to prove the feasibility are basic tools for robot path creation and editing using the controller.

This feature set has been implemented and shown to work reliably. The creation of a robot path is depicted in Fig. 7. To program a path point, the user places the lower tip of the controller to the desired location, where it gets created after pressing the trigger button. By pressing the trackpad, the motion type can be altered (Point-to-Point or Linear). Multiple points are visually connected with lines, whose color indicates the motion type. Via the Menu button, the last point can be deleted. The Grip button saves the path to the storage.
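The button mapping described above can be summarised as a small state sketch. All class and method names are hypothetical and do not reflect the actual Unity front-end or server code; the sketch also assumes that a trackpad press affects the points created afterwards, which the text leaves open.

```python
from dataclasses import dataclass, field
from enum import Enum


class MotionType(Enum):
    PTP = "Point-to-Point"
    LIN = "Linear"


@dataclass
class PathPoint:
    position: tuple  # (x, y, z) of the lower controller tip
    motion: MotionType


@dataclass
class PathEditor:
    motion: MotionType = MotionType.PTP
    points: list = field(default_factory=list)
    saved_paths: list = field(default_factory=list)

    def on_trigger(self, tip_position):
        """Trigger button: create a path point at the controller tip."""
        self.points.append(PathPoint(tuple(tip_position), self.motion))

    def on_trackpad(self):
        """Trackpad press: switch between Point-to-Point and Linear motion."""
        self.motion = MotionType.LIN if self.motion is MotionType.PTP else MotionType.PTP

    def on_menu(self):
        """Menu button: delete the last point, if there is one."""
        if self.points:
            self.points.pop()

    def on_grip(self):
        """Grip button: store a copy of the current path."""
        self.saved_paths.append(list(self.points))


editor = PathEditor()
editor.on_trigger((0.10, 0.20, 0.30))
editor.on_trackpad()                   # following points use Linear motion
editor.on_trigger((0.15, 0.20, 0.30))
editor.on_trigger((0.99, 0.99, 0.99))  # misplaced point
editor.on_menu()                       # remove it again
editor.on_grip()
print([(p.position, p.motion.value) for p in editor.saved_paths[0]])
```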

The result of the programming process is a list of path points stored on the server which could be transferred through the robot interface. However, this transfer has not been implemented. Instead, the robot path is entered manually in the robot teach pendant in order to validate it. Further and more complex functionality could be implemented; however, creating a functionally rich and mature HMI was not the goal of this work.

#### **5.2 Examination of the System Accuracy**

Subsequently, the accuracy of the system is evaluated. First, an error analysis is conducted to explain which errors are examined and to understand what components contribute to the overall errors. Afterwards, the augmentation error and the Robot-Lighthouse-referencing are examined separately. Finally, the overall end-to-end accuracy of the path point placement is examined because it is the system's most relevant performance characteristic.

#### **Error Analysis**

First, the errors and transformations of the path point placement are presented. The path point placement error consists of the user error, the LH tracking error, the Robot-Lighthouse-referencing error and the robot error, as shown in Fig. 5. This leads to a deviation between the location intended by the user and the location the robot would actually move to.

Note that the AR display is not playing any role in this, since the path point location is created directly based on the controller location and transformed into robot coordinates. The location displayed in AR is meant only as a visual assistance and represents the approximate path point location. This is a deliberate design choice because it reduces the path point placement error considerably. Other designs could still be implemented with the platform, but should be expected to have a different accuracy.

**Fig. 5** Path point placement error

**Fig. 6** Augmentation error
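The conversion of a controller location into robot coordinates amounts to applying the referencing transformation to the tracked tip position. A minimal sketch with a hypothetical 4 × 4 homogeneous matrix (a 90° rotation about the z-axis plus a translation, chosen only for illustration) is:

```python
def transform_point(T, point):
    """Apply a 4x4 homogeneous transformation T to a 3D point."""
    x, y, z = point
    return tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3))


# Hypothetical Base_T_Lighthouse; the real matrix results from the referencing procedure.
base_T_lighthouse = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.2],
    [0.0, 0.0, 0.0, 1.0],
]

tip_in_lighthouse = (0.3, 0.1, 0.4)  # tracked controller tip in metres (hypothetical)
tip_in_base = transform_point(base_T_lighthouse, tip_in_lighthouse)
print(tip_in_base)  # approximately (0.9, 0.8, 0.6)
```

Because only this transformation chain is involved, errors of the HoloLens display and of the HoloLens-Lighthouse-registration do not propagate into the programmed path point.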

Even though it's not used for path point placement, the error of the augmentation is still important because augmentations should be near the correct locations in order to be useful for user information and guidance. As shown in Fig. 6, the augmentation error comprises the tracking error, the HoloLens-Lighthouse-registration error as well as the HoloLens error. The HoloLens error includes all errors that come from the AR display, such as the internal tracking and display error.

#### **Augmentation Error**

The augmentation is evaluated with user-generated ground truth data. To acquire it, the user is asked to align a real controller with a virtual one that they can control via the AR system. This is repeated in five different locations across a flat, rectangular workspace of 0.55 m edge length. The LHs are located at the corners of a 4 × 4 m rectangle, at the center of which the workspace is located. The absolute error of the augmentation is measured as (15.6 ± 10.3) mm translationally and (1.0 ± 1.1)° rotationally. As expected, the accuracy of the augmentations is not very high, which is the reason why they are not used for the actual path point placement.

#### **Accuracy of the Robot-Lighthouse-referencing**

The Robot-Lighthouse-referencing is evaluated by adding a set of test points to the Robot-Lighthouse-referencing procedure and evaluating the registration error for this test set, using the transformation that was calculated using the original data set. All points are located within a cuboid of 0.55 × 0.55 × 0.45 m³ volume. The absolute error of the test set is determined as (10.8 ± 3.0) mm. These results are worse than expected before the experiments. The observed error is of random nature, as no axis-specific bias could be proven with statistical t-tests. The following steps have been taken to find out whether they can reduce the error: variation of position and number of LHs, covering of metallic surfaces, change of the axis configuration of the robot, as well as change of the rotations of the robot flange. However, the error could not be reduced. More research on this problem is needed.
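The reported quantities (mean absolute error with standard deviation, and a per-axis bias check) can be computed as in the following sketch. The error vectors are hypothetical and deliberately contain a z-bias for illustration; the function name is an assumption, and the one-sample t statistic still has to be compared against the critical value of the t distribution with n − 1 degrees of freedom, which is omitted here.

```python
import math
from statistics import mean, stdev


def error_summary(errors_mm):
    """Summarise 3D error vectors: absolute error plus per-axis bias and t statistic."""
    norms = [math.sqrt(sum(c * c for c in e)) for e in errors_mm]
    n = len(errors_mm)
    axes = {}
    for i, axis in enumerate("xyz"):
        values = [e[i] for e in errors_mm]
        bias, sd = mean(values), stdev(values)
        # One-sample t statistic against the null hypothesis of zero bias.
        t_stat = bias / (sd / math.sqrt(n)) if sd > 0 else float("nan")
        axes[axis] = {"bias": bias, "sd": sd, "t": t_stat}
    return {"abs_mean": mean(norms), "abs_sd": stdev(norms), "axes": axes}


# Hypothetical measured-minus-reference error vectors in millimetres.
errors = [(1.0, 0.0, 9.0), (-1.0, 0.0, 10.0), (0.0, 1.0, 8.0), (0.0, -1.0, 9.0)]
summary = error_summary(errors)
print(f"absolute error: {summary['abs_mean']:.1f} +/- {summary['abs_sd']:.1f} mm")
for axis, stats in summary["axes"].items():
    print(axis, f"bias={stats['bias']:.2f} mm, t={stats['t']:.2f}")
```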

#### **End-to-end Accuracy of the Path Point Placement**

To examine the end-to-end accuracy of the path point placement, the following experiment is conducted. The user places a path point at one corner of a cuboid-shaped, 3D-printed workpiece that is mounted on a tripod. The placement is repeated 5 times per point, so as to rule out random user error at each location. Subsequently, the tip of a 3D-printed tool that holds a needle is moved to the programmed location. This setup is shown in Fig. 7. Both the position of the workpiece and the tool are measured using a Leica LTD 800 laser tracker. That is why the tool and the workpiece are constructed to be able to hold three laser tracker targets each. This is repeated 14 times within a cuboid-shaped workspace of approximately 0.6 × 0.6 × 0.25 m³ volume.

Table 1 shows an overview of the errors that have been measured. The absolute value of the error is (10.7 ± 3.5) mm. There is a bias in the z-axis of 9.3 mm. A possible explanation for the bias is the random error of the Robot-Lighthouse-referencing, which was carried out once prior to the path point placement and thus would appear as a systematic error in the subsequent experiment. Thus, as mentioned before, the referencing procedure should be revised. Overall, a mean absolute error of 10.7 mm is not as good as was expected when designing the system; however, the large bias suggests that by fixing the suspected issue, the achievable accuracy should be substantially higher.

**Fig. 7** (a) Path point placement in front of a workpiece, (b) Path point placement experiment

**Table 1** Error of the path point placement

Only a relatively small workspace was tested because the robot used in the experiments limited the workspace of the evaluation procedure. It is expected that the accuracy will be lower in bigger workspaces.

#### **5.3 Discussion**

To conclude, the mean absolute error of 10.7 mm in a relatively small workspace is decent, especially when considering it is a low-cost system, but not as good as was expected when designing the system. On top of that, in real-world application the overall error is expected to increase rather than decrease, due to perturbations and generally less ideal conditions, so robustness needs to be evaluated as well. Because of the determined accuracy, only industrial processes such as painting and handling could be programmed using the presented system, since they usually have a compatible tolerance of the tool positioning. The presented stylus-based placement method is ideal for the quick creation of small robot programs from scratch. However, development of more complex functionality, e.g. the commissioning of a robot program that was planned in a simulation software, is also possible. At the same time, the applications need to be limited to medium-size workspaces of about 1 m³, which is likely a sensible trade-off between workspace size and accuracy for this particular setup. To sum up, the accuracy in medium-size workspaces will be quite high if the problem in the referencing can be solved.

#### **6 Conclusion**

In this work, a flexible and cost-efficient input and tracking system based on the VIVE technology has been developed. Overall, the system is feasible as a flexible and low-cost ARRPS system. The end-to-end mean absolute error of the path point placement is 10.7 mm. The VIVE controller integration allows for many interesting possibilities when designing the user experience. The software platform can be a starting point for other developers to implement their own ARRPS-related ideas.

In future work, the robot interface could be implemented, e.g. using ROS. The robustness of the Robot-Lighthouse-referencing should be improved, so as to improve overall accuracy. Also, the setup time of the Robot-Lighthouse-referencing should be reduced by fully automating the procedure. Furthermore, the system should be evaluated in terms of usability and ergonomics for the intended use-cases.

**Acknowledgement** This work was created as part of the research project "MiReP" and is supported by the Federal Ministry for Economic Affairs and Climate Action as part of the Federal Aeronautical Research Programme "LuFo V-3".

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Usability of Augmented Reality Assisted Commissioning of Industrial Robot Programs**

Lukas Antonio Wulff, Michael Brand and Thorsten Schüppstuhl

#### **Abstract**

This paper analyses the usability of *Augmented Reality* (AR) in the commissioning and programming of industrial robots. In two individual studies with a total of 31 participants, we analysed the three dimensions of usability (effectiveness, efficiency, and user satisfaction) by comparing our developed AR system with the conventional Teach-In programming method during the commissioning and modification of offline created robot programs. The results indicate that, while less accurate and hence less effective, the AR system is more efficient and achieves a higher user satisfaction. Beyond that, a posture analysis indicates that during a timeframe of 30 min the additional weight of the AR device does not significantly worsen the posture of a worker. Complemented by the positive result of the System Usability Scale (SUS), which rates the analysed AR system as having good usability, the overall results indicate that, while still limited by its achievable accuracy, AR is an intuitive medium for robot programming and commissioning.

#### **Keywords**

Augmented Reality · Manufacturing · Mixed Reality · Industrial Robot · Robot Programming · NASA-TLX · OWAS · SUS

M. Brand · T. Schüppstuhl Institute of Aircraft Production Technology, Hamburg University of Technology, TUHH, Hamburg, Germany

L. A. Wulff (\*)

ICARUS Consulting GmbH, Lüneburg, Germany e-mail: lukas.wulff@icarus-consult.de

#### **1 Introduction**

Augmented Reality (AR) is a novel technology with the ability to combine spatially mapped digital and real content in an interactive and multimodal interface [1]. As such, AR can serve as a human–machine interface (HMI) and is capable of enhancing the flexible skills of human workers in an Industry 4.0 environment [2]. Offering a more intuitive approach to human–robot collaboration, AR-based robot programming could be a potential alternative to conventional online and offline programming.

In the literature, a variety of systems realising AR-assisted robot programming have been developed, ranging from path point modification [3], trajectory planning [4], and collision detection [5] to human–machine collaboration [6]. In particular, natural gesture-based programming methods have shown higher efficiency as well as good user satisfaction when compared to conventional programming methods [7]. However, a breakthrough of AR in industrial robot programming beyond the proof-of-concept stage has not been achieved yet.

Limited by the available stable accuracy, scenarios beyond pick-and-place applications [8] are difficult to industrialise. With recent advances such as the introduction of additional equipment like three-dimensionally tracked styluses [9] or external LIDAR sensors [10], performance, throughput, and accuracy can be increased. Nevertheless, AR is not yet powerful enough to be a standalone alternative robot programming method in high-tier automation.

Hence, we chose a different approach and view AR-assisted robot programming not as an alternative to, but as an enhancement of, existing conventional robot programming methods. Especially in high-tier automation industries, where a combination of offline planning and programming with online commissioning and optimisation is characteristic, AR can smooth the transition between the two phases.

With AR we can, on the one hand, assist the worker in the shop floor environment with additional simulative abilities. On the other hand, deviations can be detected and directly corrected by comparing the digital model to the real workstation, thus creating a more accurate digital representation. Based on this motivation, we developed a system that utilizes the visual and interactive abilities of AR to harness the features of an offline robot programming system inside a factory environment to work with programs on a real robot [11]. While this approach is generally feasible, more detailed work in the assessment of accuracy, efficiency, and satisfaction, i.e. usability [12], is necessary.

In the following sections, we give a brief overview of our developed system, after which we define the usability of a process and introduce different methods to measure the individual dimensions. Finally, we present the results of two studies and discuss the usability of our system in the scope of AR-assisted robot programming.

#### **2 MiReP—Mixed Reality Programming**

We try to smooth the transition from the digital planning environment to the real workstation by utilising AR in the commissioning of an offline created robot program. We chose a modularised architecture as our basic design pattern to combine the functional range of offline robot programming systems with the multimodal interactive capabilities of AR. Adhering to the dependency inversion principle [13], we split our application into one core and six independent microservices.

Figure 1 (left) shows the general design of our application. Each microservice is implemented as an interchangeable plugin that adheres to a standardised interface defined by the core. Because only the standardised functions of the interface are used to orchestrate the different plugins, each system component is enclosed in an independent shell. This not only increases testability and opens up the possibility of decentralised, cloud-computed systems, but also creates an inherent extensibility. If, for example, a new plugin for a different AR device is developed, it can be introduced to the system without the need to update the entire infrastructure.
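The core-and-plugin idea can be illustrated with a minimal sketch. It is not the MiReP code: the names `Plugin`, `Core`, `DummyArDevice`, and the `handle(command, payload)` signature are assumptions chosen for illustration only. The point is that the core owns the interface and never depends on a concrete plugin.

```python
from abc import ABC, abstractmethod


class Plugin(ABC):
    """Interface owned by the core (hypothetical); every plugin implements it."""

    @property
    @abstractmethod
    def name(self):
        """Role of the plugin, e.g. 'ar_device' or 'simulation'."""

    @abstractmethod
    def handle(self, command, payload):
        """Execute a command and return a result dictionary."""


class Core:
    """Orchestrates plugins through the Plugin interface only."""

    def __init__(self):
        self._plugins = {}

    def register(self, plugin):
        # Registering under an existing role swaps the implementation
        # without touching the rest of the system.
        self._plugins[plugin.name] = plugin

    def call(self, name, command, payload):
        if name not in self._plugins:
            raise KeyError(f"no plugin registered for role '{name}'")
        return self._plugins[name].handle(command, payload)


class DummyArDevice(Plugin):
    """Stand-in for a concrete AR device plugin (illustrative only)."""

    @property
    def name(self):
        return "ar_device"

    def handle(self, command, payload):
        return {"device": "dummy", "command": command, "ok": True}
```

Replacing `DummyArDevice` with a plugin for another headset would only require a second class with the same `name`; the `Core` stays unchanged.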

The current implementation, schematically displayed in Fig. 1 (right), utilises a Microsoft HoloLens 2, a Microsoft Controller, and the simulation system Process Simulate (PS). By accessing the API of PS, we can not only import, modify, simulate, and export programs, but also directly export the geometry and position of CAD elements from the digital model. In combination with the model tracking capabilities of the Vuforia engine [14], we can detect CAD elements in the real world and register our system accordingly. After detection, a reference geometry is displayed and the user is prompted to confirm the correctness of the registration, as shown in Fig. 2 (left).

Afterwards, programs from accessible machines are automatically imported. The user then selects a program to work on in the AR path editor (Fig. 2, right). While the visualisation and interaction happen on the AR and input device, all calculations regarding reachability, movement simulation, or tool changes are done in the simulation system. When ready, the modified program is re-exported to the associated robot.

**Fig. 1** Schematic sketches of the general architecture (left) and the implementation (right)

**Fig. 2** AR view of the MiReP system prior and post program optimisation

#### **3 Target Dimensions of Usability**

The usability of a system offers a general evaluation of its suitability in a specific use case and is generally defined by three dimensions [12]:

- Effectiveness
- Efficiency
- User satisfaction

Applied to the scope of the MiReP application, an HMI enabling a worker to analyse, modify, and evaluate robot programs of a real machine based on a digital simulation model, these generalised dimensions can be made concrete.

The effectiveness directly relates to the quality of the modified process executed by the real robot. It consists of both the available functionality (e.g., the modification of the pose of a path point or the correct evaluation of a collision) and the quality of the corrective modification (e.g., the accuracy of the modification). The efficiency is defined by the amount of time and effort a user invests in reaching the aspired result. The user satisfaction is a more complex parameter, as it embodies multiple interdependent and highly individual parameters. It consists of parameters like mental, physical, and temporal load, as well as frustration, effort, and the perceived performance.

#### **4 Methodology**

Measuring the usability of AR-assisted robot programming is not trivial. While parameters like end-to-end accuracy or duration can be measured absolutely, parameters like effort are coupled with the individual user, the scenario, as well as the current environment. However, while an absolute scale is difficult to realise, a relative comparison between two processes can be made. Hence, we will compare the usability of AR in the commissioning of an offline created robot program with the conventional Teach-In.

One applicable standardised questionnaire is the NASA Task Load Index (NASA-TLX) [15]. The questionnaire shown in Fig. 3 consists of six questions, each targeting a different category, that together offer an assessment of the global workload users perceive while processing their task.

Based on the results of the different categories, a global workload index with a range of 0 to 100 can be calculated. The lower the value, the better the result.
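As an illustration of how such an index can be computed, the sketch below uses the unweighted ("raw") variant, i.e. the mean of the six category ratings on a 0 to 100 scale. Whether the raw or the weighted NASA-TLX was used in this study is not stated, so this is an assumption, and the category keys are illustrative names.

```python
from statistics import mean

# Six NASA-TLX categories (key names chosen for this sketch).
TLX_CATEGORIES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Return the unweighted global workload (0..100) for one participant.

    `ratings` maps each category to a rating on a 0..100 scale, where a
    lower value is better.
    """
    missing = [c for c in TLX_CATEGORIES if c not in ratings]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    for category in TLX_CATEGORIES:
        if not 0 <= ratings[category] <= 100:
            raise ValueError(f"{category} must be within 0..100")
    return mean(ratings[c] for c in TLX_CATEGORIES)
```

Averaging the per-participant values then gives a global task load per programming method.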

Especially during prolonged work, the ergonomics of a task are an important element in a healthy work environment. As HMIs like the HoloLens increase the strain on the neck of the user due to their weight, a detrimental effect on the posture is expected. Hence, another more specifc analysis of working posture is necessary in addition to the NASA-TLX, as bad posture correlates with physical demand.

The Ovako Working Posture Assessment System (OWAS) offers an objective method to analyse the posture of a human over a prolonged period [16]. A score between 1 and 4 is calculated depending on the relative position of back, arms, legs, and the handled load. The scoreboard is depicted in Fig. 4.

As an example, Fig. 5 shows a user in two different working postures. According to the scoreboard, the user on the left has a bent back (2), both arms are below shoulder level (1), he squats (3) and handles a load below 10 kg (1), scoring a value of 2, which implies that corrective actions to improve the working posture are required in the near future.

**Fig. 3** NASA-TLX questionnaire [15]

**Fig. 4** Ovako Working Posture Assessment System (OWAS) [16]

**Fig. 5** User during programming with AR (left) and Teach-In (right)

**Fig. 6** Sketch of original and aspired contour in simulation (left); view in AR setup (right)

As each OWAS analysis represents only one moment during task execution, it has to be repeated over a prolonged period of time. An overall grade between 100 and 400 is calculated depending on the percentage of time the user stays in a bad posture.
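The text does not give the formula for the overall grade. One common formulation, assumed here, weights each action category (1 to 4) by the share of observations that fall into it, which yields 100 if every observation is in category 1 and 400 if every observation is in category 4. The function name and the sample distribution are illustrative.

```python
from collections import Counter


def owas_index(action_categories):
    """Return an overall OWAS grade between 100 and 400.

    `action_categories` holds one action category (1..4) per observed
    moment, e.g. one per sampled video frame.
    """
    if not action_categories:
        raise ValueError("at least one observation is required")
    if any(c not in (1, 2, 3, 4) for c in action_categories):
        raise ValueError("action categories must be 1, 2, 3 or 4")
    counts = Counter(action_categories)
    weighted = sum(cat * counts[cat] for cat in (1, 2, 3, 4))
    return 100.0 * weighted / len(action_categories)
```

For instance, a hypothetical recording with 91 observations in category 1 and 9 in category 2 gives `owas_index([1] * 91 + [2] * 9) == 109.0`.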

Accuracy and working time, as absolute values, can be measured in a simple experimental setup, as depicted in Fig. 6.

The user modifies an erroneous program (red) by adding and repositioning path points until a defined contour (green) is obtained. The resulting program is then exported to a robot, which, equipped with a pen, draws the contour on paper. The accuracy and working time can be assessed by measuring the offset of each path point as well as the time the user took to modify the program.

An additional method to evaluate the usability of a system is the System Usability Scale (SUS) [17]. Utilising ten standardised questions, an absolute score between 0 and 100 can be calculated. The corresponding questions are listed in Table 1.

Each question is to be answered with one of the following statements: "Strongly Agree", "Agree", "Neutral", "Disagree" or "Strongly Disagree". From that, a global score can be calculated. Generally, a value exceeding 70 indicates a good usability.


**Table 1** System Usability Scale Questionnaire
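The score calculation can be sketched as follows, assuming the standard SUS scoring scheme in which odd-numbered items are positively worded and even-numbered items negatively worded, and coding the answers from 1 ("Strongly Disagree") to 5 ("Strongly Agree"). The function name is illustrative.

```python
def sus_score(responses):
    """Return the SUS score (0..100) for ten responses coded 1..5."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    total = 0
    for index, response in enumerate(responses, start=1):
        if index % 2 == 1:
            # Odd items: agreement is good, contribution 0..4.
            total += response - 1
        else:
            # Even items: disagreement is good, contribution 0..4.
            total += 5 - response
    # Ten items with 0..4 points each, scaled to 0..100.
    return total * 2.5
```

A participant who answers "Neutral" throughout scores 50, which is below the value of 70 named above as the threshold for good usability.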

#### **5 Mock-Up and Conduction of the Experiment**

We conducted two independent studies with a total of 31 participants. In each study, every user commissioned an offline created robot program with both the AR-assisted and the Teach-In method. In the first study, we used the OWAS to assess the working posture of ten users aged between 19 and 54.

Figure 7 shows the original program in the simulation system as well as the displaced program as viewed in AR. The users were split into two separate groups. After a brief 10-min introduction to either the MiReP system or Teach-In programming, the user commissioned the program for 30 min. The same procedure was repeated with the other programming method after a short break. The order depended on the group affiliation. Each user was recorded with a camera during execution. The average risk index of the MiReP system is 109/400, whereas Teach-In programming has a value of 104/400.

Similar to the previous experiment, in the second study two groups of users with a total of 21 participants aged between 17 and 35 commissioned an erroneous offline created robot program in different orders. A sheet of paper indicated the aspired contour as a guideline for the optimisation.

Figure 8 shows a user during the two different tasks. In preparation for the task, each user was given 10 to 20 min of guided training with each of the two systems. During training, 5 of the handled pens were broken due to programming errors while controlling the robot with the Teach-In method.

During the experiment, no pens were broken. However, some users needed additional assistance while using the Teach-In programming due to operating issues.

In the end, any created program with either method was valid and runnable. The calculated results regarding accuracy and working time are shown in Fig. 9.

Users filled out a NASA-TLX questionnaire immediately after completing a programming task. The averaged results are displayed in Fig. 10.

**Fig. 7** Program in simulation system (left); Displaced program in reality (right)

**Fig. 8** User during programming with Teach-In (left) and AR (right)

**Fig. 9** Results of accuracy analysis (left) and efficiency analysis (right)

**Fig. 10** Results of the NASA-TLX

The average global task load of AR is 29 with a standard deviation of 13.4, whereas the Teach-In method averaged at 34.9 with a standard deviation of 13.4.

At the end of the experiment, each user filled out a SUS questionnaire. The averaged result was 75 with a standard deviation of 12.

#### **6 Discussion**

Both calculated OWAS scores are acceptable. Even though MiReP scored slightly worse (109/400), we assume that using AR for a 30-min commissioning does not significantly worsen the posture of a user when compared to the Teach-In method.

As expected, the result of the accuracy assessment shows that the average error of 9.7 mm with a standard deviation of 6 mm when using AR is significantly worse than the average error using Teach-In, which is 1 mm with a standard deviation of 0.7 mm.

However, as depicted in Fig. 11, the potential influence of a systematic error can be detected. After calculating the systematic error and adjusting the result accordingly, an average error of 2.9 mm is obtained, confirming the existence of a systematic effect.
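How the systematic error was calculated is not described. One simple way to separate it, assumed here purely for illustration, is to treat the mean offset vector of all path points as a constant translational bias, subtract it, and recompute the mean error from the residuals. The function names and sample offsets below are hypothetical.

```python
import math


def mean_error(offsets):
    """Mean Euclidean length of 2D offset vectors (dx, dy) in mm."""
    return sum(math.hypot(dx, dy) for dx, dy in offsets) / len(offsets)


def remove_bias(offsets):
    """Split offsets into a constant bias and per-point residuals.

    Only a pure translation is modelled; rotation or scaling caused by a
    poor registration would need a richer model.
    """
    n = len(offsets)
    bias_x = sum(dx for dx, _ in offsets) / n
    bias_y = sum(dy for _, dy in offsets) / n
    residuals = [(dx - bias_x, dy - bias_y) for dx, dy in offsets]
    return (bias_x, bias_y), residuals


# Hypothetical offsets between drawn and aspired path points in mm.
offsets = [(9.0, 1.0), (11.0, -1.0), (10.0, 2.0), (10.0, -2.0)]
bias, residuals = remove_bias(offsets)
```

If the residual mean error is much smaller than the raw mean error, as in this made-up sample, most of the deviation stems from a common shift rather than from imprecise placement of individual points.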

As users initially register the device from an individually chosen angle and position, and as AR glasses have known limitations regarding depth perception [18], we assume the registration to play a major part in the systematic error. However, as the best user achieved a fairly high level of accuracy (2.8 mm), this also shows that a proper setup can result in a higher program quality. In addition to accuracy, Fig. 9 also shows a 32% reduction in programming time when compared to Teach-In. A two-sample t-test shows that this difference is significant.
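The variant of the two-sample t-test is not specified. The sketch below computes the test statistic and the degrees of freedom for Welch's version, which does not assume equal variances; this choice, the function name, and the sample values in the usage note are assumptions. Turning the statistic into a p-value additionally requires the t-distribution, for example from a statistics table or library.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) of Welch's two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two values")
    # Squared standard errors of the two sample means.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df
```

With made-up working times `[1, 2, 3, 4, 5]` and `[2, 3, 4, 5, 6]`, the function returns a statistic of -1 with 8 degrees of freedom, which would not indicate a significant difference.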

The results of the NASA-TLX show that the global task load of AR is slightly better than that of Teach-In programming, even though the difference is not significant. However, it is noticeable that the physical load perceived when working with AR is lower than when using the Teach-In. While this contradicts the results of the OWAS, it can be partly explained by the shorter working time and the lower effort needed to reach the aspired goal. Moreover, the grading of the perceived performance correlates with the measured accuracy.

**Fig. 11** Comparison of best (avg. error 2.8 mm) to worst (avg. error 25.4 mm) result with AR

Even though the sample size of 21 is small, the results of the SUS indicate a generally good usability of the presented AR-assisted robot programming.

#### **7 Conclusion and Outlook**

The presented studies show that, when utilising AR-assisted robot programming, the commissioning of an offline created robot program is more efficient but less accurate than with Teach-In robot programming. It was shown that the initial registration has a major effect on the overall error; hence, the introduction of additional visual assistance and continuous feedback to the user could improve the performance of AR-assisted robot programming. The OWAS showed that during a task duration of 30 min the use of an AR device does not significantly worsen the posture of a user. This is supported by the NASA-TLX, which shows a slightly better global workload than in Teach-In programming. Complemented by the results of the SUS, the presented AR-assisted robot programming in the commissioning of offline created robot programs has a generally good usability.

The results show that the presented AR-assisted robot programming is currently not accurate enough to fully substitute Teach-In programming in the commissioning of offline created robot programs. However, due to its intuitiveness, it is plausible to use AR either when its accuracy suffices for the given use case or as an intermediate step that reduces the necessary amount of Teach-In optimisation and thereby the overall duration of commissioning.

**Acknowledgement** This work was created in the scope of the research project MiReP—Mixed Reality Programming and is supported by the Federal Ministry for Economic Affairs and Energy as part of the Federal Aeronautical Research Programme LuFo V-3.

#### **References**



## **Verification and Integration of Safety Systems for Industrial Robot Cells Using Laser-Based Distance Measurement and Augmented Reality**

Patrick Seim, Benedict Theren, Jürgen Kutschinski and Bernd Kuhlenkötter

#### **Abstract**

The commissioning of robot cells requires an individual safety analysis followed by an appropriate dimensioning of safety components to ensure worker safety. With increasing complexity of robot work cells, developing proper safety concepts becomes more challenging. Therefore, updated validation concepts are needed that support safety engineers and reduce commissioning delays. This paper presents a solution to decrease the commissioning time of robot cells that is caused by safety considerations and the implementation of measurements. Data from a digital twin (DT) is used to generate test programs that are able to automatically measure relevant safety distances. The virtual robot cell is used to generate robot paths for the real cell. A laser-based distance sensor measures the distance to relevant objects within the real cell. The programs run automatically, and the safety engineer only defines safety-relevant points within the DT. Furthermore, augmented reality (AR) is used to visualize safety zones specific to the individually generated safety concept.

#### **Keywords**

Robot safety · Augmented reality · Layout planning

P. Seim · B. Theren (\*) · J. Kutschinski

RIF Institut für Forschung und Transfer e. V., Dortmund, Germany e-mail: theren@lps.rub.de

B. Kuhlenkötter Lehrstuhl für Produktionssysteme, Ruhr-Universität Bochum, Bochum, Germany

#### **1 Introduction**

Safety analysis for industrial robot cells is a crucial step during commissioning [1]. Every foreseeable potential hazard needs to be evaluated and minimized to ensure machine operator safety. Enhanced robot controller features open up more and more opportunities for the use of robot cells. This involves layout and task-specific details, like work objects, grippers and handled objects [2, 3]. Since all these factors determine the safety concept, the demand for easy-to-handle safety systems with short commissioning times is further increasing. More versatile assembly or production systems need continuous updates of the safety documentation [4]. Safety calculations are mostly based on minimal permitted distances between robot and objects or safety zones within the robot cell. Usually, safety concepts are derived based on assumptions for the position of robot and human and their respective maximum velocities. In practice, even minimal deviations of objects and safety equipment within the real cell can lead to a threat to the worker.

State-of-the-art robot controllers enable complex configurations regarding safety functions. ABB's SafeMove Pro feature, for example, allows for a case-specific reduction of robot velocity, restricted working spaces by using safety zones, and a limited range of motion. Some safety zones only exist within the software, which makes them harder to evaluate within the real cell (Fig. 1). To compensate for the lack of physical separating safety components, more flexible solutions like light curtains or laser scanners are used. Testing an individual component may be easy to accomplish, but evaluating the overall safety concept is more complicated.

Deviations in the minimal permitted spatial configuration of objects can lead to an insufficient safety concept. A proper validation concept needs to support the safety engineer to evaluate spatially complex safety zones. The following solutions are developed and presented in this work:

**Fig. 1** Example of a safety configuration for ABB robots (different colors mean different safety zones that are activated by the operator)


#### **1.1 Safety Concepts for Robot Cells**

For a variety of assembly or production systems, different approaches to safety concepts have been developed. The higher demand for individualized products calls for production lines to get closer and closer to lot size 1 [5]. A well-known concept behind this demand is the reconfigurable manufacturing system (RMS) [6].

Separating protective devices are widely used as a solution to ensure machine safety. Unfortunately, those systems lack flexibility and cause high costs if a change in cell design is needed. As an alternative, sensor-based safety components such as laser scanners or light barriers are established. An example of a cell with different safety components is presented in Fig. 2.

To ensure the exact position of every safety component, and therefore the safety of the whole cell, a safety certification must be conducted. To evaluate the risk and to validate and verify the cell's safety, standards such as EN ISO 10218-1 [8] and EN ISO 10218-2 [9] are used.

**Fig. 2** Safety components of a robot cell [7]

#### **2 Materials and Methods**

As mentioned before, the main goal of this work is to reduce the validation time of safety concepts for industrial robot cells. One major part of every safety concept is to examine the distances between safety-relevant objects, especially workers, and the robot. As stated in German national standards [8, 9] and international standards [10, 11], safety engineers must examine these aspects. The presented software VISIBLE provides an automated solution for this process. Based on a DT of the robot cell, distances between objects can easily be derived. This process generates the setpoints for the distances that must then be evaluated within the real system. A laser distance sensor mounted on the robot is used to examine the correct positioning of objects within the cell. The VISIBLE software plans a tool path for the laser distance sensor to properly measure the desired distances.

This method is designed to assist the safety engineer while commissioning production systems according to the mentioned standards.

Additionally, AR is used to visualize an overlap of real and virtual objects. The safety engineer uses AR-glasses (Microsoft HoloLens) to project the robot's virtual safety zones while physically standing in the cell. The calibration of the AR environment is guided by an assistant that automatically calculates the accuracy so that the user can properly calibrate the AR-glasses. The AR-glasses recognize the position of a calibration marker and align their coordinate system relative to the robot. A value benefit analysis was carried out comparing several different components such as:


The analysis identified AR-glasses and a distance measuring device as the most suitable solution. The hardware used is shown in Fig. 3. For the AR application, a *HoloLens* (HoloLens Development Edition by Microsoft) is used. For distance measurement, a *GLM 120 C Professional* by Bosch is used.

**Fig. 3** Used hardware: distance measuring device (e.g. line laser) (left), AR-glasses HoloLens (right)

The whole concept is manufacturer independent. New installations and changes are easy to implement. The new methods of visualization can project even complex contours such as planes at different height levels. The new system links solutions for robotics and AR and lowers the barriers to entry for advanced industrial robot cells.

#### **3 Results**

The following section presents the results of this work. It is structured in different parts each regarding different aspects of the solution.

#### **3.1 Development of a Method to Verify Safety Components**

Initially, a literature review was conducted to get an overview of the state-of-the-art approaches to verify safety components. The approach builds on several norms and guidelines such as EN ISO 10218, EN ISO 12100, EN ISO 13849 and several more. The norms list different methods for planning, design, risk evaluation, verification, validation and documentation of safety components. The state-of-the-art sequence of the necessary tasks is presented in Fig. 4.

After evaluation of the established sequence for plant commissioning, a new method was developed. The implementation of the new method includes a comparison of the layout of the cell and the information of the DT. The process enables an automated documentation by picking up contours of physical and optical safety components from the planning tool in relation to the robot base coordinate system. This feature is included in the newly developed sequence presented in Fig. 5.

Furthermore, the use of the HoloLens offers a solution to examine virtual safety zones. Zones that only exist within the robot controller lack a physical counterpart in the real cell, which makes them hard to evaluate. Visualizing these zones using AR offers an easy-to-handle solution for the safety engineer.

**Fig. 4** State-of-the-art sequence of the necessary tasks for commissioning of safety concepts

**Fig. 5** Method for automated commissioning of robot cells

The planning and verification of safety components are to be connected to the developed software. An analysis of robot safety controllers and their interfaces was carried out to identify those controllers that are able to integrate safety zones. The controllers are:


The safety controllers of the above-mentioned manufacturers were also evaluated as to whether they can connect measuring or visualization devices. The controllers of ABB and KUKA fulfill the requirements best.

#### **3.2 Development of the Planning, Communication and Verification Software**

The overall approach for a software architecture is presented in Fig. 6. The concept illustrates the interaction of planning, communication and validation functions. After a detailed analysis the methods of a "line laser" and "laser projection" were evaluated as inappropriate.

**Fig. 6** Architecture of the software for planning, communication, and verification

The developed tool converts the parametrized safety zones from the robot controller into mesh-geometry for further use. The converted objects are easily visualized via the AR-glasses. The desired and the actual state can be displayed.
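The conversion itself is not detailed in the text. As a rough illustration only, the sketch below assumes that a safety zone is parametrized as a convex polygon in the robot base frame plus a height range, and extrudes it into a triangle mesh that an AR renderer could display. The function name and the parametrization are assumptions and do not reflect the actual controller data format.

```python
def extrude_zone(polygon_xy, z_min, z_max):
    """Return (vertices, triangles) of a prism over a convex polygon.

    `polygon_xy` lists the corners as (x, y) in counter-clockwise order
    when seen from above, so that the triangle normals face outwards.
    Triangles are index triples into the vertex list.
    """
    n = len(polygon_xy)
    if n < 3:
        raise ValueError("a polygon needs at least three corners")
    if z_max <= z_min:
        raise ValueError("z_max must be greater than z_min")
    bottom = [(x, y, z_min) for x, y in polygon_xy]
    top = [(x, y, z_max) for x, y in polygon_xy]
    vertices = bottom + top
    triangles = []
    for i in range(n):
        j = (i + 1) % n
        # Two triangles per side wall between corner i and corner j.
        triangles.append((i, j, n + j))
        triangles.append((i, n + j, n + i))
    for i in range(1, n - 1):
        # Fan triangulation of the caps; valid for convex polygons only.
        triangles.append((0, i + 1, i))          # bottom cap, facing down
        triangles.append((n, n + i, n + i + 1))  # top cap, facing up
    return vertices, triangles
```

A square zone yields 8 vertices and 12 triangles; non-convex zones would need a proper polygon triangulation instead of the fan used here.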

To calibrate the AR-glasses within the robot cell, a marker and an assistant were developed. An iterative algorithm calculates the position of the glasses relative to the robot position based on predefined points. The accuracy of the calibration process increases with every calibration step. A stop criterion is derived that stops the iteration if the accuracy does not increase any further. This process is a state-of-the-art process for calibrating AR-glasses. For the given scenario, several different marker positions were examined by trial and error.

Both simple measuring points and complex contours can be defined within the software. The measurement of these points and contours can be performed automatically using the software by direct control of the laser measuring device and the robot controller. After the measurement, the results are automatically summarized in a report. Figure 7 shows the user interface of the planning software.

In order to move the robot to the defined points or along the defined contours, a system was introduced that does not require inverse kinematics. The system is based on a rough pre-positioning of the robot by the commissioning engineer. The pre-positioning can then be used to calculate the final poses for the defined points and contours. Subsequently, the check points and their distances to the robot, as well as any deviations between measured (in the real cell) and calculated (in the virtual cell) distances, are summarized in a technical report.
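The comparison behind such a report can be sketched in a few lines. The data structure, the field names, and the tolerance parameter are assumptions made for this illustration; the report format of the actual software is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckPoint:
    """One safety-relevant distance (hypothetical structure)."""

    name: str
    calculated_mm: float  # setpoint derived from the virtual cell
    measured_mm: float    # laser measurement in the real cell

    @property
    def deviation_mm(self):
        return self.measured_mm - self.calculated_mm


def build_report(check_points, tolerance_mm):
    """Return one report row per check point with a pass/fail flag."""
    rows = []
    for cp in check_points:
        rows.append(
            {
                "name": cp.name,
                "calculated_mm": cp.calculated_mm,
                "measured_mm": cp.measured_mm,
                "deviation_mm": cp.deviation_mm,
                "within_tolerance": abs(cp.deviation_mm) <= tolerance_mm,
            }
        )
    return rows
```

A safety engineer would then only need to inspect the rows in which `within_tolerance` is false.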

**Fig. 7** User interface of the planning software

#### **3.3 Development of a Modular Measuring- and Visualization System**

Prior to using the planning software, two calibration processes are needed. First, the operating point of the distance measuring device must be described within the robot base coordinate system. Second, a connection between this coordinate system and the AR-glasses must be established. This leads to three functional components that must be mounted on the robot:


The components are mounted on a plate with a standardized hole pattern so that it can be attached directly to the robot flange. The modular setup allows a case-specific mounting of the components on the tool. Figure 8 shows the tool.

The accuracy of this calibration method was tested using an ABB IRB 4600. For the calibration of the distance measuring device, an angular error between 0.09° and 0.2° between the measured robot tool and the laser beam was measured. For more acute angles between laser beam and surface, the angular error increases. The error in the alignment of the light point (distance of 1000 mm, beam angle of 90° to the surface) is less than 4 mm in an unpredictable direction. Therefore, check points must keep a minimum distance to corners and edges to ensure that the desired object is hit. The precision of the test can be further increased by using a more accurate robot.
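The relation between the angular error and the position of the light point can be checked with simple trigonometry. The sketch below assumes a beam that hits a surface perpendicularly at a given distance, so that an angular deviation shifts the light point by the distance times the tangent of the angle; the text does not state how the 4 mm bound was derived, and the function name is hypothetical.

```python
import math


def lateral_offset_mm(distance_mm, angular_error_deg):
    """Shift of the laser point on a perpendicular surface for a given angular error."""
    return distance_mm * math.tan(math.radians(angular_error_deg))
```

For the reported angular errors of 0.09° and 0.2° at a distance of 1000 mm, this gives about 1.6 mm and 3.5 mm, which is consistent with the stated bound of less than 4 mm. The offset grows in proportion to the measuring distance.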

**Fig. 8** Modular robot tool

To calibrate the connection between the robot base and the AR-glasses, the calibration marker is used at different calibration positions. For every position the connection between both systems is known, and therefore the pose of the robot coordinate system and the AR-glasses system can be derived. The highest accuracy was achieved when the marker was viewed through the AR-glasses along an arc of 90°. This method allows a visualization with an accuracy of 5 mm between the virtual safety zone from the DT and the visualized safety zone in augmented reality.

#### **4 Summary and Conclusion**

A new system for the safety analysis of robot cells was presented. The safety analysis is usually very time-consuming when commissioning new robot cells and has great potential for automation. Software was introduced that semi-automatically chooses points within the digital twin of a robot cell. The software then calculates robot poses in which the distances of the checkpoints taken from the virtual model can be measured in the real work cell. In the tests performed, an accuracy of 5 mm was achieved. However, for greater distances the effect of the angular error increases further.

The increasing computing power of robot controllers allows the use of more and more complex virtual safety zones within the safety setup. This makes it harder for engineers to evaluate the safety concept. Therefore, AR-glasses were used to display virtual safety zones in the real cell. To calibrate the system, a method was developed that uses a calibration marker to connect the coordinate system of the AR-glasses to that of the robot. This method allows the holograms of the safety zones to be displayed with an overall accuracy of 5 mm.

The presented system contributes to the aim of decreasing commissioning time and assists the safety engineer in evaluating increasingly complex safety concepts.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

**Digital Twin**

### **Digital Twin for Evaluating Support Characteristics for Industrial Exoskeletons for Upper Body Activities**

Samuel Villotti, Lukas Durst and Robert Weidner

#### **Abstract**

Despite the advancing trends in automation, especially with regard to Industry 4.0, workers performing non-automated activities in industrial factories often face repetitive tasks with heavy workloads. Whenever methods or adaptations in technology and organization are insufficient to optimize working conditions, person-related interventions such as exoskeletons come into question. They may prove successful in alleviating musculoskeletal disorders and relieving physical strain. The increasing number of exoskeletons often makes it challenging for users or companies to select or specify an appropriate system for their applications. To address this problem, this paper presents the possibility for developers to use a digital twin for evaluating particular support characteristics of exoskeletons at an early stage of product development. The process for a user-specific design is strongly dependent on the activity and its environment. As a use case for the validation of a digital twin, an overhead work activity is analyzed and relevant factors such as muscle activity are examined in this paper. Initial simulation results show promising possibilities for varying the parameters of different properties of an industrial work process in order to create a starting point for the future development of an optimally tailored upper body exoskeleton.

S. Villotti (\*) · L. Durst · R. Weidner

Institute of Mechatronics, Chair of Production Technology, University of Innsbruck, Innsbruck, Austria

R. Weidner e-mail: Robert.Weidner@uibk.ac.at; Robert.Weidner@hsu-hh.de

R. Weidner Laboratory of Manufacturing Technology, Helmut Schmidt-University/University of the Federal Armed Forces Hamburg, Hamburg, Germany

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_10

#### **Keywords**

Exoskeletons · Digital Human Model simulation · Biomechanical analysis · Human–Machine Interaction

#### **1 Introduction**

Today, technological advances in industry, such as automation and mechanization, affect the redesign of workplaces as well as the implementation of new systems in industrial applications (e.g., systems for human–machine cooperation [1], exoskeletons [2], or augmented reality systems [3]). Demographic change forces employers to provide more technical assistance systems to reduce musculoskeletal loads and enable a longer, healthier and safer working life [4, 12]. More than one in three people handle heavy goods during the workday, and 43 percent work daily in tiring, exhausting, or painful postures [4]. As a result, workers are physically burdened and exposed to a risk of developing musculoskeletal disorders (MSD) [4]. According to [5], days of incapacity to work caused by MSD account for a loss of gross value of 21.6 billion euros. Forecasts predict a worldwide market volume of up to 5.6 billion dollars in 2025 for the exoskeleton industry, with work-assisting devices in particular growing exponentially [6]. Upper body exoskeletons will take on a primary role among possible future solutions for specific work tasks such as lifting heavy parts and overhead work. In industrial applications, exoskeletons are externally wearable mechanical devices [7] that either empower, facilitate, stabilize, or add movements [8]. Support systems such as exoskeletons are used with the aim of reducing strain on workers without extensive interventions in the work process flow [8]. An ergonomic design of the work process can reduce the development of musculoskeletal disorders but can also become an economic challenge in the case of significant process and product changes [9]. In practice, it is difficult and costly to prove the exact effectiveness of exoskeletons for supporting specific work activities. Laboratory and field studies are conducted for this purpose, but they require a great amount of time and expense in product development [10].

Currently, there are no exoskeletons on the market that can be manufactured according to variable parameters and specified boundary conditions. Users cannot find a suitable system that satisfactorily addresses their individual requirements: for example, movements are restricted, kinematic structures do not match movement patterns, or interfaces are uncomfortable [10]. The process for a user- and task-specific new design of exoskeletons is influenced by various aspects. The technically complex replication of joints (e.g., the shoulder joint with several degrees of freedom) is always a challenge in system development, since the user's movements must not be restricted. There is a need for ways to validate and optimize exoskeletons for the applied task with respect to movements and loads. End-users of exoskeletons vary in population characteristics such as anthropometry, muscle strength, body mass, and manner of executing movements, and each application scenario varies in its movement- and load-specific boundary conditions. Digital human models comprising the human as well as the exoskeleton in a single biomechanical system offer the chance to consider all these aspects in parallel [6].

#### **2 Evaluation of the Biomechanical Requirements for the Digital Twin**

A detailed evaluation of the workplace and the working environment, as well as the choice of the right biomechanical parameters to improve, is of great importance in the first step of creating a digital twin.

#### **2.1 Analysis of the Workflow and the Specific Environment**

Typical human activities in, for example, the automotive industry, aircraft production, logistics and retail include handling loads, performing tasks at or above head height, and assembling very small products. These and other tasks lead to different strains (e.g. with regard to the body region). The analysis of such tasks is important, as it defines the starting point for interactions between human and technology in order to improve the quality of work and to relieve employees. Four main activities in particular need to be distinguished in the industrial context: lifting and carrying, working at and above head height, pushing and pulling, and drilling and screwing. Depending on the activity, different parts of the body are stressed to different degrees [12]. On the basis of the identified task, various distinguishing characteristics can be derived for activities in industrial production; these must be taken into account when introducing an exoskeletal system into the workflow of a company:


#### **2.2 Biomechanical Analysis**

In order to determine the requirements and the exact need for support by an exoskeletal system, the movement sequence of the workflow must first be analyzed biomechanically. For this purpose, it is crucial which parameters can be examined. Biomechanical analysis usually covers different aspects of the interaction of the human with the environment, e.g. the movement (kinematics), the external forces and moments (kinetics) acting on the body or caused by its interaction with the environment, internal forces, and the muscle activity that causes voluntary body movement. The following list summarizes a selection of the most important values used for evaluating the biomechanical effects of a physical support system such as an exoskeleton [11].


Of all the methods listed in this section, electromyographical analysis (EMG) is the one most often used in the evaluation of physical support systems (exoskeletons) [14]. EMG measurements are expressed relative to a reference value, which is obtained from maximum voluntary contraction (MVC) measurements taken directly before the actual measurements. The MVC is taken to represent a contraction of 100%, and all subsequent measurements are reported as a percentage of the MVC [14].
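The normalization described above amounts to a single division. A minimal sketch, assuming that the EMG amplitudes and the MVC reference have already been processed into comparable amplitude values (the function name is hypothetical):

```python
def percent_mvc(emg_amplitudes, mvc_amplitude):
    """Express EMG amplitudes as a percentage of the MVC reference amplitude."""
    if mvc_amplitude <= 0:
        raise ValueError("MVC reference amplitude must be positive")
    return [100.0 * amplitude / mvc_amplitude for amplitude in emg_amplitudes]
```

For example, amplitudes of 0.125, 0.25 and 0.5 against an MVC reference of 0.5 correspond to 25%, 50% and 100% MVC.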

#### **2.3 Digital Human Model**

Basically, the implementation of a digital twin requires suitable tools and a configurable digital environment. Digital Human Models (DHMs) are required to carry out dynamic biomechanical analyses that predict relevant biomechanical parameters (e.g. internal forces, torque, muscle activities) acting on the human body. DHM software is a computer-aided design tool [15] with which a design can be evaluated from an ergonomics perspective using virtual simulation before the real physical prototype is built. Popular commercially available DHM software includes JACK, Sammie, Ramsis, OpenSim and Biomechanics of Bodies (BoB) [10]. Some important previous research work [15] was conducted with a biomechanical modelling system, namely the AnyBody Modeling System (AMS). This system makes it possible to investigate biomechanical interactions at the musculoskeletal level. In such musculoskeletal models, structures like bones, tendons or muscles are modelled in great detail. AMS offers the possibility to simulate these models in interaction with their environment and to perform an inverse kinematic analysis.

To perform a simulation, motions and reactions normally have to be recorded from a human subject and transferred to the DHM. Motion capture is usually used to record movements. The entire human movement is captured with the help of a limited number of markers on the body via optical, three-dimensional kinematic cameras [14]. Afterwards, the positions and orientations of the markers are transferred to the DHM with inverse kinematics to perform the simulation.
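The step from marker positions to model coordinates can be illustrated with a strongly reduced example that recovers a single joint angle from two markers. This is not the inverse kinematics procedure of AMS, which fits a whole-body model to all markers at once; the marker placement, the thorax axis and the function name are assumptions for illustration.

```python
import math


def elevation_angle_deg(shoulder, elbow, thorax_up=(0.0, 0.0, 1.0)):
    """Angle between the upper arm (shoulder to elbow) and the downward thorax axis.

    0 degrees: arm hanging down; 90 degrees: horizontal; 180 degrees: overhead.
    """
    arm = tuple(e - s for e, s in zip(elbow, shoulder))
    down = tuple(-u for u in thorax_up)
    dot = sum(a * d for a, d in zip(arm, down))
    norm = math.hypot(*arm) * math.hypot(*down)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))
```

Evaluating this for every recorded frame yields an angle trajectory of the kind that drives a joint in a simulation model.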

For this paper AMS was chosen because of the possibility to model additional mechanical variables (exoskeletal effect) in the simulation environment.

#### **2.4 Process for the Prototypical Implementation of a Digital Twin**


users. The movement must be imported from the motion capturing and validated for further simulation.


#### **3 Application Example: Overhead Lifting Task**

To illustrate the application of a digital twin for the evaluation of an exoskeletal system, this paper provides an example with a corresponding parameter study. The aim of this evaluation is to determine the effects of changes in weight and support on muscle activity for the defined movement sequence. A common overhead work activity from the industrial sector was selected as the use case: an overhead lifting activity with the right arm. To investigate the effect of exoskeletal support when lifting a variable load, a model was built in AMS to replicate the characteristics of the exoskeleton Lucy [9] as a starting point. Lucy is a shoulder exoskeleton that supports the abduction/elevation of the upper arms by means of a pneumatic actuator, depending on the angle between the upper arm and the upper body (upper arm elevation angle) [9]. The model created in AMS generates an upward-acting torque in the right shoulder that depends on the angle between the humerus and the thorax. It therefore supports lifting the arm forwards (anteversion) and sideways (abduction). The following figure shows the movement sequence, called the humerus-thorax elevation. This movement was recorded with motion capture according to step 2 in Sect. 2.4 (Fig. 1).

**Fig. 1** Movement sequence of the overhead work activity (displayed in AMS)

The subject stands upright at the beginning of the movement. The left arm hangs freely downwards. The muscles of the left arm are not examined in this study, which is why the movement of the left arm during the drilling activity is not described further here. The right arm is already pointing forward at the start of the movement. During the motion capture recording, the test person held the drilling tool in his right hand. This is not shown in the simulation. The mass of the tool is changed in the different scenarios and acts at the center of the right palm. The test person moves the right arm evenly upwards until the tip of the tool reaches the point of action. In a realistic drilling process, the subject would now increase the pressure to perform the drilling. This detail is not simulated, for reasons of clarity and simplicity. Since different masses are defined as the tool weight in the different scenarios, the same effects should be shown this way.

This use case is limited to a consideration of muscle activities. Muscle activity is defined as the active state of the muscle as a fraction of the maximum voluntary contraction. This means that at a muscle activity of 100%, the muscle has reached its theoretical load limit. A value greater than 100% is not possible in practice, but can occur in the simulation [13]. The muscle activities serve as a representative parameter for the relieving effect of the exoskeleton on the human musculoskeletal system. Comparing the maximum muscle activity with the average gives information about the unevenness of the effort: the further the maximum lies from the mean value, the more irregular the load.

The parameter study is based on several scenarios, in each of which one parameter is changed and a comparison is generated. In the first comparison, two simulations are carried out in which the movement is simulated without exoskeletal support. The load is 0 kg in the first simulation and 2 kg in the second. The simulation without load corresponds to simply lifting the right arm. These two scenarios are simulated, among other reasons, to obtain a reference for the results and to carry out a plausibility check of the simulation. The assumption that muscle activity increases with increasing load was confirmed as plausible. In order to analyze the influence of the load, simulations with 4, 6 and 10 kg are carried out sequentially. In the second comparison, the movement is simulated with the support of the assisting torque. The implemented torque curves (virtual torque applied to the shoulder hinge) are based on the exoskeleton Lucy and are shown in Fig. 2. The theoretical maximum assistance torque of the exoskeleton Lucy is 12 Nm at a humerus-thorax angle of 90 degrees [9].

**Fig. 2** Support torque with maximum peak at 90 degrees
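An angle-dependent support torque of this kind can be sketched as a simple function. The exact shape of the implemented curves is only given graphically in Fig. 2, so the sinusoidal profile below is an assumption chosen because it peaks at 90 degrees; only the peak value of 12 Nm is taken from the text, and the function name is hypothetical.

```python
import math


def support_torque_nm(elevation_deg, peak_nm=12.0):
    """Assumed sinusoidal support torque with its maximum at 90 degrees elevation."""
    if not 0.0 <= elevation_deg <= 180.0:
        return 0.0  # no support outside the assumed working range
    return peak_nm * math.sin(math.radians(elevation_deg))
```

Lower support levels for a parameter study can be obtained by passing a smaller `peak_nm`.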

#### **3.1 Results**

In this use case, only the most heavily strained muscles in the shoulder were taken into consideration. For this purpose, the Deltoideus (anterior), Supraspinatus and Infraspinatus were analyzed in the different scenarios. For a better understanding, the structure of the shoulder muscles is illustrated in Fig. 3 (right side). Fig. 3 (left side) shows that in the part of the cycle where the applied torque is high, the effect on muscle activity is also high. It can also be seen that the effect on the Deltoid muscle is greater than on the Infraspinatus muscle. Especially in the last third of the cycle, the exoskeletal support does not have such a large effect on the muscle activity of these two muscles.

The results of all the different scenarios were summarized in Figs. 4 and 5. For this purpose, they are displayed in boxplots. The minimum, the lower quartile, the median, the upper quartile and the maximum are shown. For each muscle, a boxplot was created with one box per simulation scenario.
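The five values shown per box can be computed directly from a series of simulated muscle activities. The sketch below uses the Python standard library; which quartile convention was used for the figures is not stated, so the "inclusive" method is an assumption, and the function names are hypothetical. The second function expresses the distance between maximum and mean that the text uses as an indicator of uneven effort.

```python
import statistics


def five_number_summary(activities):
    """Minimum, lower quartile, median, upper quartile and maximum of a series."""
    q1, median, q3 = statistics.quantiles(activities, n=4, method="inclusive")
    return {
        "min": min(activities),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(activities),
    }


def peak_to_mean_gap(activities):
    """Distance between the maximum and the mean activity of a series."""
    return max(activities) - statistics.fmean(activities)
```

The interquartile range discussed for the individual muscles is the difference between `q3` and `q1`.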

**Fig. 3** Left: Muscle activities under varying support (torque) with 2 kg load, Right: Anatomy of the shoulder

**Fig. 4** Muscle activity of the Supraspinatus (left) and Infraspinatus (right) under varying support and load

The main task of the Supraspinatus muscle in this use case is abduction and external rotation of the upper arm, especially below an abduction angle of 15°. When the second box (0 Nm support, 2 kg load) and the fourth box (12 Nm, 2 kg load) in Fig. 4 (left) are compared, it can be seen that the torque support has a clearly positive effect on muscle activity. The last three boxes (on the right of the diagram) show that an increase in load also increases the scatter of the data.

The Infraspinatus muscle is mainly responsible for the external rotation of the upper arm. The exoskeletal support has a different influence here than on the Supraspinatus muscle. It is noticeable in Fig. 4 (right) that in the second to fourth box (from the left side of the diagram) the upper quartiles and the maxima are very close together. Here it is mainly the interquartile range that changes. This means that the exoskeletal support definitely relieves the Infraspinatus.

The Deltoideus Anterior muscle is largely involved in lifting the arm forward and is therefore the most important muscle in this activity. The greatest effect on muscle activity is therefore expected here. This can also be easily seen in the second to fourth box in Fig. 5 (from the left side). With successive increases in torque, all values decrease steadily. Increasing the load results in clearly higher muscle activity, and the scatter of the data increases as well.

**Fig. 5** Muscle activity of the Deltoideus Anterior under varying support and load

#### **3.2 Discussion of the Results**

In general, the simulation results show that a higher supporting torque leads to lower muscle activities in the Deltoideus Anterior, Infraspinatus and Supraspinatus. The lowest muscle activities were seen at lower loads and with greater support. An important observation is that, for this movement in this simulation, the average muscle activity increases approximately linearly with a constant increase in load. However, it is not possible to predict whether this increase remains similar with further increases in load. The simulation is particularly suitable for looking at the muscles individually, since different effects can be observed for the individual muscles. For example, in the Supraspinatus, the dispersion of the data increases when the load is increased. With increased torque, on the other hand, the dispersion remains similar and the mean value of the activities decreases. The interquartile range (IQR) also remains similar, suggesting that the load remains consistently strenuous. For the Infraspinatus, the dispersion of the data increases with higher support torque, but the mean value decreases. For the Deltoideus Anterior, the dispersion remains relatively constant with an increase in torque. The aim for an optimal system is to keep the dispersion as low as possible in order to keep muscle activity at a constant low level [13]. In general, the results confirm that muscle activity increases with higher load and decreases with higher support torque. For the downward movement of the arm, the support torque has less effect on muscle activity, as muscle work has to be done against the system. It should be noted that the weight of the exoskeleton was not considered, on the assumption that it would not have a major impact on the muscle activities considered here. Contact forces were also not taken into account for the use case presented. It is assumed that the contact forces have a greater influence on the subjective feeling of comfort when wearing an exoskeleton than on the measurable muscle activities. Another point of discussion is the way in which the support torque is transmitted to the body, or how exactly it acts. In the virtual model (DHM), it is a torque in the shoulder hinge without contact points.

#### **4 Conclusion and Outlook**

This paper has examined the possibility and the potential of using a digital twin to evaluate different characteristics of a physical support system. In this first approach, promising evaluation possibilities could be shown without using a real system. The digital twin provides a fast and agile way to investigate user-specific configurations and derive an optimal support setting, which is essential for the construction of a real exoskeleton. The next step will be to deduce the correct mechanical design of the exoskeleton based on the optimal support characteristics. For this purpose, mechanical (active/passive) elements have to be dimensioned to generate the necessary support. A possible validation of the design would be to import the CAD model of the exoskeleton into the DHM.

#### **References**



### **Scenario-Driven Data Generation with Experimentable Digital Twins**

Osama Maqbool and Jürgen Roßmann

#### **Abstract**

Synthetic data is an indispensable supplement to difficult-to-acquire real data in meeting the substantial demand of machine learning based systems. Since data plays the key role in machine learning models, objective and maintainable quality metrics for it are vital for quality assurance of the whole system. This paper introduces a systematic and domain-neutral methodology for the generation of synthetic data based on formalized scenario variation and experimentable digital twins. The methodology uses human-readable scenarios and semantically meaningful parameter variations to describe possible entities, actions and events to be simulated, whereas experimentable digital twins bring the scenarios to life by integrating various domains of a system, such as mechanics, sensors, actuators and communication, under one platform that can be simulated as a whole. Scenario description and digital twin simulation are carried out iteratively to derive the optimal distribution of synthetic data. Thus, scenarios and experimentable digital twins can together serve as a medium to systematically cover diverse application scenarios, test dangerous situations and find faults within a system.

#### **Keywords**

Synthetic Datasets • Experimentable Digital Twins • Scenario Variation

J. Roßmann e-mail: rossmann@mmi.rwth-aachen.de

O. Maqbool (B) · J. Roßmann

Institute for Man-Machine-Interaction, RWTH Aachen, Aachen, Germany e-mail: maqbool@mmi.rwth-aachen.de

#### **1 Introduction**

The increasing complexity of machine learning (ML) based systems necessitates rigorous design and validation approaches to ensure the correctness and trustworthiness of a system. Unlike traditional algorithms composed of specified logical rules, ML algorithms are data-driven and implicitly derive their own inferences. The "reasoning" and results are often unpredictable and difficult to interpret, resulting in a loss of transparency of the system. This renders powerful techniques used in traditional software development, e.g. unit testing and regression testing, either ineffective or in need of serious modification.

The data-collection process can be quite expensive or in some cases impossible; simulations are therefore used to cover the remaining demand with synthetic data. Quality assurance for the ML-based system requires a substantial volume as well as verifiable quality metrics of the synthetic data. This paper presents a systematic and domain-agnostic methodology for synthetic data generation that addresses two aspects of data quality: transparency and the diversity of scenarios behind the data. The methodology is based on formal application scenario descriptions, appended with formal scenario variation descriptions, and experimentable digital twins. Application scenarios describe the environment, the entities, actions, goals and the initial configuration of an *experiment*. The digital twin of a system, in turn, is its comprehensive representation, i.e. it collects the set of knowledge representations of a system that may belong to different domains and cater to diverse functionalities. The digital twin can be simulated within virtual testbeds, platforms that provide various simulation functionalities, to create the *experimentable digital twin* (EDT) [14]. The proposed methodology integrates the two concepts in an iterative manner.

#### **2 State of the Art**

To achieve simulation-based variation, a target scenario is typically explicitly modelled in the simulation platform of choice, and its parameters are varied accordingly—e.g. the steering angle and acceleration of a constant turn rate and acceleration (CTRA) model [16]. On the other end, adversarial methodologies are increasingly used to challenge ML-based systems, where another ML-system iteratively generates adversarial configurations for the system-under-test [5].

Both ends of the spectrum lack a generic, platform-independent and semantically meaningful description of the scenario and the parameters to be varied. Jian et al. use a configurable scene grammar to describe static scenes, with stochasticity as part of the description to capture the possible scene variation [11]. Fremont et al. developed a probabilistic programming language to describe dynamic scenarios with variation, which can be integrated into a simulation engine [9]. Fremont et al. [10] use the formal probabilistic description in SCENIC for test-case generation of autonomous vehicle safety scenarios. Another prevalent approach for describing dynamic scenes is found in the automotive industry in the OpenSCENARIO standard [2]. The PEGASUS methodology [3] defines logical scenarios as a supplement to the OpenSCENARIO description to specify parameter variation. In contrast to SCENIC, the decoupled description of the scenario and the scenario variation offers a higher potential for systematization and optimization of scenario variation, as will be seen later in this paper. The PEGASUS methodology, however, does not deal with complex probability distributions and inter-parameter constraints for parameter variation, which are addressed in this paper.

There is a vast body of literature and many frameworks for the validation of ML models by exploring certain parameter spaces, regardless of whether the parameters are semantically meaningful or not. The VERIFAI framework allows the user to define an abstract feature space as input, which it varies to run falsification tests for the ML model [7]. DeepXplore varies inputs for deep learning systems to explore the resulting neuron coverage, and can find the inputs that contribute most to differential behavior [13].

#### **3 The Scenario Variation Methodology**

As data quality plays a vital role in quality assurance for ML-based systems, the data generation process should incorporate maximum transparency and formalism, as with quality control for conventional software. Furthermore, the process should allow the identification and control of data quality metrics such as data accuracy, understandability, correctness and context coverage [8]. The scenario variation methodology, summarized in Fig. 1, affords the designer control over these factors via a systematic workflow and semantically meaningful control parameters. The following sections go through each of the steps.

**Fig. 1** The scenario variation methodology for synthetic data generation

#### **3.1 Scenario Configuration**

The scenario configuration stage involves the definition of the basic application scenario. This paper follows Dahmen et al. in classifying scenarios into abstract, logical and concrete scenarios [6].

**Abstract Scenario** The *abstract scenario* provides the description of an environment and defines the participating entities, actions and goals. Certain parameters at this level are abstract, i.e. either undefined or assigned preliminary values. The abstract scenario must be specified in a human-readable and formal syntax (e.g. a standardized XML-Schema like OpenSCENARIO [2]), and be semantically complete and consistent. For example, the abstract scenario of a vehicle performing a lane-change maneuver may be (informally) described with abstract parameters $p_1$ to $p_4$:

Given a road with $p_1$ lanes, the actor *car* with the initial position on lane $p_2$ and velocity $p_3$ moves to lane $p_1 - 1$ after $p_4$ minutes have passed.

**Logical Scenario** The logical scenario uses the abstract parameters to specify rules for scenario variation; like the abstract scenario, it follows a formal syntax and is semantically meaningful. Maqbool et al. [12] introduced an XML-based test specification that defines logical scenarios via a dedicated meta-model, which allows hierarchical modeling of parameter ranges, probability distributions, and inter-parameter mathematical and logical constraints. The example in Scenario 1 illustrates this approach. A generic element, *vehicle\_speed\_urban*, is defined for vehicle speeds in urban settings. The two abstract parameters, *speed\_vehicle\_1* and *speed\_vehicle\_2*, inherit the attributes of this element, and *speed\_vehicle\_2* overwrites the distribution. A mathematical constraint is additionally specified between the abstract parameters: regardless of the values chosen for the abstract parameters, the constraint must hold.

#### **Scenario 1** Example of a logical scenario

**Define:** vehicle\_speed\_urban

1: range ← {20, 50} km/h

2: distribution ← Gaussian{$\mu = 30$, $\sigma^2 = 4$}

#### **Parameter Variation:**

3: speed\_vehicle\_1 ← vehicle\_speed\_urban

4: speed\_vehicle\_2 ← vehicle\_speed\_urban

5: **overwrite** distribution ← Gaussian{$\mu = 25$, $\sigma^2 = 4$}

#### **Constraints:**

6: speed\_vehicle\_1 > speed\_vehicle\_2 + 10
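Scenario 1 can be read as a small data structure plus a sampling rule. The following Python sketch is illustrative only: the dictionary layout, the function names and the use of plain rejection sampling are assumptions made here for clarity. The test specification in [12] is XML-based, and the methodology itself relies on Markov-Chain-Monte-Carlo variants rather than rejection sampling.

```python
import random

# Generic element from Scenario 1 (lines 1-2); the dict layout is hypothetical.
VEHICLE_SPEED_URBAN = {"low": 20.0, "high": 50.0, "mu": 30.0, "var": 4.0}

# Both abstract parameters inherit the element; speed_vehicle_2 overwrites the mean.
LOGICAL_SCENARIO = {
    "speed_vehicle_1": dict(VEHICLE_SPEED_URBAN),
    "speed_vehicle_2": {**VEHICLE_SPEED_URBAN, "mu": 25.0},
}


def constraint_holds(values):
    """Inter-parameter constraint from line 6 of Scenario 1."""
    return values["speed_vehicle_1"] > values["speed_vehicle_2"] + 10.0


def sample_parameter(rng, spec):
    """Draw from the Gaussian and retry until the value lies inside the range."""
    sigma = spec["var"] ** 0.5
    while True:
        value = rng.gauss(spec["mu"], sigma)
        if spec["low"] <= value <= spec["high"]:
            return value


def sample_concrete_scenarios(count, seed=0):
    """Rejection sampling: keep only candidates that satisfy the constraint."""
    rng = random.Random(seed)
    scenarios = []
    while len(scenarios) < count:
        candidate = {
            name: sample_parameter(rng, spec)
            for name, spec in LOGICAL_SCENARIO.items()
        }
        if constraint_holds(candidate):
            scenarios.append(candidate)
    return scenarios
```

Each returned dictionary corresponds to one concrete scenario. With the values of Scenario 1 only a small fraction of candidates satisfies the constraint, which is one reason why more elaborate samplers are attractive for tightly constrained parameter spaces.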

**Logical Scenario Design** The logical scenario discussed above is well-equipped to generate possible, impossible, probable and improbable scenarios. As Fig. 1 illustrates, both domain expertise and historical data may serve as sources for the design of a logical scenario. Design by domain expertise is exemplified in [17], where sets of possible parameter values are derived by listing and clustering the pre-conceived situations the system may encounter. Design by historical data can be seen in [16], where a driving study from BMW is used to estimate probable driver inputs for a car within a sharp curve. The logical scenario methodology supports both approaches via the specification of parameter distributions and constraints, while using a platform-independent and formal syntax to do so. The third method for logical scenario design illustrated in Fig. 1, via feedback from scenario evaluation, is discussed in Sect. 3.4.

#### **3.2 Scenario Variation**

The scenario variation stage uses the logical scenarios to generate *concrete scenarios*. Concrete scenarios have concrete values for the previously abstract parameters, distributed according to the logical scenario specification. This stage uses sampling techniques to generate samples whose distribution matches the specified parameter space as closely as possible. This contribution uses the variants of Markov-Chain-Monte-Carlo proposed by Maqbool et al. [12] to generate the samples.
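To make the sampling idea concrete, the following Python sketch applies a plain random-walk Metropolis sampler to the parameter space of Scenario 1. It is a generic textbook variant and not the specific Markov-Chain-Monte-Carlo variants of [12]; the step size, burn-in length and starting point are assumptions chosen for illustration.

```python
import math
import random


def log_target(v1, v2):
    """Unnormalised log-density of Scenario 1; -inf outside the feasible set."""
    in_range = 20.0 <= v1 <= 50.0 and 20.0 <= v2 <= 50.0
    if not in_range or not v1 > v2 + 10.0:
        return -math.inf
    # Independent Gaussians with variance 4 and means 30 and 25.
    return -((v1 - 30.0) ** 2 + (v2 - 25.0) ** 2) / (2.0 * 4.0)


def metropolis(n_samples, step=1.0, burn_in=200, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    state = (35.0, 22.0)  # hand-picked feasible starting point (assumption)
    current = log_target(*state)
    samples = []
    for i in range(burn_in + n_samples):
        proposal = (
            state[0] + rng.gauss(0.0, step),
            state[1] + rng.gauss(0.0, step),
        )
        proposed = log_target(*proposal)
        # Accept with probability min(1, p(proposal) / p(state));
        # infeasible proposals have probability zero and are always rejected.
        if rng.random() < math.exp(min(0.0, proposed - current)):
            state, current = proposal, proposed
        if i >= burn_in:
            samples.append(state)
    return samples
```

Because infeasible proposals are rejected, every returned sample respects the range and the inter-parameter constraint. Consecutive samples are correlated, so a practical setup would typically thin the chain before using the samples as concrete scenarios.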

The scenario-based approach with decoupled abstract, logical and concrete scenarios helps to impart understandability to ML data, as concrete scenarios provide a unique, formal and human-readable basis behind each data-set. In addition, the distributions and constraints offer control over data accuracy: they determine how closely the concrete scenarios follow the desired realistic distribution. Accuracy of the data is further ensured by the digital twin approach for simulating the concrete scenarios, discussed in the next section.

#### **3.3 Scenario Evaluation**

The scenario evaluation stage brings the concrete scenarios to life using simulation techniques. The authors propose the use of experimentable digital twins (EDT) to match the flexible and multi-domain nature of the scenarios. The EDT of a system is the digital twin implemented as a simulation model in a virtual testbed that offers diverse simulation functionalities. EDTs collect various aspects of the system and can be easily reconfigured for different application contexts throughout the training and validation process. Additionally, EDTs offer scalability in the level of detail (e.g. simulation realism, sensor resolution) and computing resources [15]. Figure 2 illustrates the EDT for a rover on an extra-terrestrial terrain modeled in the multi-domain simulation software VEROSIM. It shows how the EDT-based simulation allows the fusion of environment generation, multi-body dynamics and various perception sensors.

The flexible and modular nature of EDTs makes them an ideal fit for the parameterized and iterative scenario evaluation methodology in Fig. 1. For instance, scenario design iterations can be performed on simplistic models without expensive sensor rendering, and the models can be seamlessly upgraded as required in subsequent iterations. The EDT-based scenario evaluation stage generates the ground truth data and the replay data. The ground truth data is annotated and labeled by the simulation and serves as ML input data, whereas the replay data contains the simulation events and results that may be used for the post-analysis of a particular simulation run. EDTs thus impart further control over the accuracy and coherency of the synthetic data-sets through flexibility in the realism, scope and configuration of simulation entities.

#### **3.4 Scenario Redesign**

As previously mentioned, design-by-domain-expertise and design-by-historical-data are not always feasible in practical scenario design. Simulation results provide valuable insight into the effect of parameters and open the way towards iterative scenario design. Parameterization via logical scenarios makes every scenario amenable to iterative redesign, and this iterative process can be carried out by a domain expert or an optimization algorithm. Consider the example (illustrated in detail in Sect. 4) of an automotive simulation, where the ML-designer requires data-sets from both accidental and non-accidental situations, but the desired scenario distribution for such data-sets is unknown. Random simulation within the complete parameter space can offer insight and allow the scenario designer to set the desired parameter bounds. Various optimization- and heuristic-based algorithms can be used for this purpose [1, 4].

#### **3.5 ML Training and Validation**

Once the concrete scenarios in the scenario design phase have acquired the desired characteristics, the EDT-based simulations can be used to generate ground truth or input data for training and validating ML-based systems. The scenario variation methodology suggests another feedback loop after training or validating the ML-system. This loop can be utilized, e.g. to iteratively find critical scenarios for the ML-model using the same heuristics as in Sect. 3.4.

#### **4 Application Examples**

Two examples from the space and automotive domains are presented to illustrate the multi-domain capability of the scenario variation methodology. In both examples, the OpenSCENARIO standard is adapted to describe the abstract scenario, whereas the logical scenario is specified via the test specification in [12]. VEROSIM is used as the EDT-based simulation software.

**Rendezvous and Docking Scenario** The rendezvous and docking (RvD) maneuver, illustrated in Fig. 3a, requires a chaser shuttle to scan a target satellite via LiDAR and determine the relative pose. The ML-based pose-estimation algorithm is to be trained via synthetic LiDAR scans with ground-truth information. The specifications of the chaser's LiDAR scanner and of the ML-model impose two requirements. Firstly, all measurements must be taken such that the chaser is within a flight corridor. The flight corridor is specified via a cone with its apex on the satellite, length $l_C$ and radius $r_C$, see Fig. 3b. Secondly, the closer the chaser is to the satellite, the higher the likelihood of it being at the center of the corridor. The simulated data-sets should reflect this distribution.

(a) The rendezvous and docking scenario.

(b) The logical scenario problem; the square is the satellite and the dot is the chaser.

(c) Concrete scenarios; each red dot illustrates a concrete chaser position.

#### **Fig. 3** Logical scenario design of a satellite docking maneuver

#### **Scenario 2** Logical scenario for rendezvous and docking scenario

#### **Parameter Variation:**

```
1: x_chaser, y_chaser, z_chaser
2: distribution ← RvD_Dist{r_C, l_C}
```
#### **Constraints:**

3: $0 \le n \cdot d \le l_{corridor}$

4: $\arccos\left(\dfrac{n \cdot d}{|n|\,|d|}\right) \le \theta; \quad 0 \le \theta \le \pi$

The abstract scenario is modeled via the *teleport* action of OpenSCENARIO: both chaser and satellite are teleported to initial positions, and the initial coordinates of the chaser are defined as abstract parameters. The logical scenario is specified in Scenario 2. Lines 3–4 enforce requirement 1 via inter-parameter constraints. As Fig. 3 illustrates, given the unit vector *n* along the corridor axis and the vector *d* from the target to the chaser, the dot product of the two vectors, i.e. the distance between target and chaser along the corridor axis, must be non-negative and must not exceed the corridor length. Furthermore, the angle between *n* and *d* must not exceed the corridor angle θ, so that the chaser is always within the corridor bounds. Lines 1–2 implement requirement 2. The distribution "RvD\_Dist" is implemented in the scenario variation engine by extending the meta-model in [12], and is simply referred to in the logical scenario. The implementation uses a Gaussian distribution dependent on the distance between the chaser and target. The resulting concrete scenarios with unique chaser positions are illustrated in Fig. 3c.
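The two corridor constraints can be checked for a single chaser position as in the following Python sketch. The function name and argument layout are hypothetical, and deriving the corridor angle θ from the cone radius and length is an assumption made here; the source only states that θ is the corridor angle.

```python
import math


def in_flight_corridor(chaser, target, axis, corridor_length, corridor_radius):
    """Check the constraints in lines 3-4 of Scenario 2 for one chaser position.

    axis is the corridor direction n (normalised below); d points from the
    target to the chaser.
    """
    norm_axis = math.sqrt(sum(a * a for a in axis))
    n = [a / norm_axis for a in axis]
    d = [c - t for c, t in zip(chaser, target)]
    along_axis = sum(ni * di for ni, di in zip(n, d))  # n . d
    norm_d = math.sqrt(sum(di * di for di in d))
    if norm_d == 0.0:
        return True  # chaser sits exactly on the apex of the cone
    # Assumption: theta is the half-opening angle implied by radius and length.
    theta = math.atan2(corridor_radius, corridor_length)
    cos_angle = max(-1.0, min(1.0, along_axis / norm_d))  # guard against rounding
    angle = math.acos(cos_angle)
    return 0.0 <= along_axis <= corridor_length and angle <= theta
```

A sampler can call such a check to discard candidate positions outside the corridor before they become concrete scenarios.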

**Automotive Collision Avoidance** In the second use-case, a collision avoidance ML-model must steer a vehicle through an evasive maneuver to avoid an incoming truck. The ML-model needs sufficient samples of both collision and non-collision scenarios for training, otherwise it runs the risk of over-fitting to a particular case. To find the target logical scenario, an initial logical scenario is set up with the velocity of the car $v$ and the point of curvature $p_{arc}$ as abstract parameters. The point of curvature is the evasion-inducing point within the spline trajectory of the vehicle. The first iteration of the logical scenario, shown in Scenario 3, assigns suitable ranges to the two abstract parameters with uniform probability distribution functions (PDF). The resulting concrete scenarios and their EDT simulations are illustrated in Fig. 4a and b respectively. The share of no-collision scenarios is much lower than that of collision scenarios, and such an imbalance may cause the ML-model to over-fit to the over-represented case. Based on the results, the next iteration of logical scenario design can either impose new parameter ranges or an appropriate PDF to ensure a sufficient number of both collision and non-collision scenarios. A bi-modal Gaussian PDF with two means located within the highest collision and no-collision densities is chosen. The Gaussian PDF gives the scenario designer more flexibility by providing a finer balance between the area of the sampling region and the frequency of outlier sampling. The logical scenario is formulated in the second iteration of Scenario 3. The resulting concrete scenarios and simulations are illustrated in Fig. 4c and d, and show an equal distribution of accident and no-accident scenarios. With the desired logical distribution now found, the number of concrete scenarios can be increased further and the simulation can be made more complex by adding realistic sensor EDTs.

#### **Scenario 3** Iterative logical scenario design for automotive collision avoidance

**Iteration 1**

1: $v$

2: distribution ← uniform{8, 20}

3: $p_{arc}$

4: distribution ← uniform{0, 30}

**Iteration 2**

5: $v$, $p_{arc}$

6: distribution ← Gaussian{$\mu_1 = 17$, $\mu_2 = 17$, $\sigma_1 = 1.5$, $\sigma_2 = 1.5$}
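The two building blocks of this iteration, measuring the class balance of the simulated outcomes and drawing from a bi-modal Gaussian, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the outcome labels would come from the EDT simulation (not reproduced here), and the mode parameters are supplied by the scenario designer.

```python
import random


def collision_share(outcomes):
    """Fraction of simulated concrete scenarios that ended in a collision."""
    return sum(1 for collided in outcomes if collided) / len(outcomes)


def sample_bimodal(rng, modes, low, high):
    """Draw one value from a mixture of Gaussians, truncated to [low, high].

    modes is a list of (weight, mean, sigma) tuples, one per mode.
    """
    weights = [weight for weight, _, _ in modes]
    while True:
        _, mean, sigma = rng.choices(modes, weights=weights)[0]
        value = rng.gauss(mean, sigma)
        if low <= value <= high:
            return value
```

A possible call for the velocity parameter is `sample_bimodal(random.Random(0), [(0.5, 11.0, 1.5), (0.5, 17.0, 1.5)], 8.0, 20.0)`. Here the range, the mean 17 and the standard deviation 1.5 are taken from Scenario 3, whereas the second mean of 11 and the equal weights are placeholders, since the designer places the modes at the observed collision and no-collision density peaks.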

#### **5 Conclusions**

This contribution introduced a methodology for synthetic data generation based on formal scenarios, semantic parameter variation and experimentable digital twins (EDT). The methodology provides transparency and formality to the data generation process, and delivers control over data quality via scenario distribution and EDT configurations. A human-readable concrete scenario behind each synthetic data-set imparts a higher degree of understanding about the data. The proposed logical scenarios allow a formal specification of the scenario distribution. They support domain expertise, historical data and iterative methods as sources for deriving the scenario distribution. EDTs concurrently provide a platform to simulate the scenarios throughout the scenario design process, offering high flexibility in simulation perspective, complexity and scale. Future work will carry out further research on the iterative design of logical scenarios using metrics from trained machine learning models, and explore techniques to derive exploratory, exploitative and adversarial logical scenarios.

**Acknowledgements** This work is part of the project "KImaDiZ", supported by the German Aerospace Center (DLR) with funds of the German Federal Ministry of Economics and Technology (BMWi), support code 50 RA 1934.

#### **References**



### **Living Earth—A Methodology for Modeling the Environment of Construction Sites Via Digital Twins**

Artem Bliznyuk, Michael Schluse and Jürgen Roßmann

#### **Abstract**

The architecture, engineering and construction (AEC) industry appears hesitant to embrace new digital innovations. One of the few recent successful examples is the introduction of the building information modeling (BIM) paradigm. However, the focus here lies mainly on the building itself and does not support the construction environment. This paper presents a methodology for the development and application of Digital Twins representing and supporting the working environment of a construction site. By combining available geodata with real-time sensor data from mobile construction machines, it is possible to create always-up-to-date Digital Twins of the relevant objects and processes in the field in order to facilitate supervision, additional planning steps, management, control and security activities. The proposed concept is currently being tested on a local test site to generate, update and adapt the Digital Twins as well as to incorporate additional semantic information about e.g. the soil and various working processes.

**Keywords**

Construction • Environment • Digital Twin • Geospatial Data

A. Bliznyuk (✉) · M. Schluse · J. Roßmann

Institute for Man-Machine Interaction, RWTH Aachen University, Aachen, Germany e-mail: bliznyuk@mmi.rwth-aachen.de

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_12

#### **1 Introduction**

The architecture, engineering and construction (AEC) industry is one of the least digitized industries [12] and only slowly embraces paradigms such as Building Information Modeling (BIM). BIM emphasizes the processes and technologies used to create a digital model that represents the physical and functional characteristics of the project. In its present form, BIM targets mainly the building, but as Choi et al. [6] point out, the work space and the building environment are important resources for managing a construction site as well. The introduction of environmental information requires the integration of BIM and Geographic Information Systems (GIS), which brings new challenges, as GIS and BIM follow different paradigms [1].

Moreover, such systems are limited by their static representation of a building and its environment [13], as they lack any automatic information flow from the construction site to the GIS+BIM model. This synchronization between real assets and their digital model is a key feature of a Digital Twin (DT). The concept of DTs describes a virtual representation of a physical object and the corresponding flow of data between these two parts [10]. It is a major part of the Industry 4.0 roadmap and has already been embraced by other industries such as manufacturing and production, whereas its development in AEC is still in its infancy, as Deng et al. [7] state. The majority of current research explores the integration of BIM models and sensors, but focuses on building-related topics during the operation phase, such as indoor hazard monitoring, thus again leaving out the building environment during the construction phase.

This paper presents a methodology for establishing and operating a DT of the environment of construction sites. Inspired by GIS applications, it uses geodata to initialize the DT and employs mobile construction machines to create information flow from the environment to the DT. This methodology is then tested on a local test site.

#### **2 State of the Art**

The majority of current research does not directly consider the environment of construction sites, but touches on it in efforts to automate site monitoring. In their literature review, Boje et al. [2] gather several examples, such as the use of drones or laser scanning to capture changes in construction status and save them to a BIM system. Although there is progress in automating the integration of captured data into the chosen building model, operating the drone or preparing and capturing laser scans remains manual. Moreover, all examples focus mainly on the constructed object and provide only a visual record of its evolution.

Xu et al. [15] employed already existing construction site models to solve multi-objective dynamic construction site layout problems, but considered only facility or process related features, such as safety and environmental hazards posed by a facility. Song et al. [14] developed an automated tool to calculate optimized equipment travel paths using the site layout within BIMs, however they assumed a 2D flat surface and square obstacles without consideration of elevation or ground properties.

One exception is the work by Cheung and Lin [5], which assessed the level of hazardous gases around the construction site using a Wireless Sensor Network. Although this idea represents dynamic updates of the environmental state, only a single attribute is monitored without integration into a more extensive model of the environment. Similarly, Arroyo et al. [1] examined the use of geological shallow subsurface data for construction and design applications, which is only a limited subset of properties of the construction environment.

This analysis indicates that the area of DTs for environmental modeling of construction sites has not yet been extensively explored, as the current focus lies mainly on the constructed building, without taking the rest of the site into consideration. Additionally, there are hardly any automatic approaches for updating the state of the environmental model, which is required by the DT paradigm.

#### **3 Methodology**

This section takes a closer look at the steps of the process described in Fig. 3 and their implementation in the preliminary study currently being carried out on a local test site.

**Available Geodata** Geodata (or geographic data) is data that can be referenced by a location relative to the Earth. The two most common types of geodata are rasterized and vectorized data. The former stores values such as height, e.g. in digital elevation models (DEM) (Fig. 2), or color, e.g. in digital orthophotos (DOP) (Fig. 1), on a regular grid. By georeferencing the 'origin' and defining the resolution of the grid, every cell can be easily accessed by its geographic position. The latter type explicitly defines the geometry of objects by describing them with geometrical primitives, whose vertices are assigned geographic coordinates. This work suggests using the following geodata, depending on quality and availability: Digital Surface Models (DSM), Digital Elevation Models (DEM), Digital Orthophotos (DOP), topographical maps, geological maps, city maps and building models. Such data is usually accessible via public databases such as 'OpenGeodata.NRW', the official geodatabase of the government of North Rhine-Westphalia, which was the main source used in this work. The 'Open Geospatial Consortium' (OGC), an authority on geospatial information, offers free and open standards for interaction with such databases via its 'OGC Web Services'. Geodata can come in different formats, so for consistency and simplified interaction with the future Digital Twin, this work suggests transforming and combining rasterized data into a multilayered GeoTIFF file, an official OGC standard that offers functionalities for georeferenced raster data. Similarly, vectorized data is transformed into the CityJSON format.
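The cell access by geographic position mentioned above reduces to simple arithmetic once origin and resolution are known. The following Python sketch assumes a north-up raster whose origin is the upper-left corner and whose coordinates are given in a projected, metric coordinate system; these conventions and the function names are assumptions for illustration, not taken from the source.

```python
import math


def world_to_cell(x, y, origin_x, origin_y, resolution):
    """Return (row, col) of the raster cell that contains the point (x, y).

    Assumes a north-up grid: columns grow eastwards, rows grow southwards
    from the upper-left origin.
    """
    col = math.floor((x - origin_x) / resolution)
    row = math.floor((origin_y - y) / resolution)
    return row, col


def cell_center(row, col, origin_x, origin_y, resolution):
    """Return the geographic coordinates of the centre of cell (row, col)."""
    x = origin_x + (col + 0.5) * resolution
    y = origin_y - (row + 0.5) * resolution
    return x, y
```

Because every layer of a multilayered raster shares the same origin and resolution, the same index pair addresses the corresponding cell in all layers.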

**Fig. 1** Digital Orthophoto and building model of the test site area. Building outlines are colored in pink and can be seen in the top right corner

**Fig. 2** Digital Elevation Model of the test site area

The described data offers a starting point for the DT. However, public geodata sometimes lacks quality. In particular, its temporal resolution can be in the range of several years, so the data is quickly outdated. Figures 1 and 2 illustrate this, as they were taken before the construction site was created and therefore show only an empty field. This issue calls for more frequent updates, ideally in real time, to create a true DT of the environment.

**Live Data Acquisition** Mobile construction vehicles are a promising platform for gathering live data about the construction site environment. They frequently traverse it and directly interact with it, e.g. by excavating earth or transporting material. Moreover, modern machines are already equipped with different sensors. Most of these sensors are typically used for condition monitoring of the vehicle [9], thus measuring internal data and focusing on the *machine*. Our case, however, focuses on the *environment*.

This situation calls for two solutions: inferring external, environmental data from the available internal sensors, and equipping the construction vehicles with additional sensors. This work implemented both strategies. A wheel loader, a typical mobile construction machine for moving material using a front-mounted bucket, was equipped with the following sensors: Inertial Measurement Unit (IMU), Global Positioning System (GPS) tracker, RGB camera, Light Detection and Ranging (LiDAR) sensors, pressure sensors and stroke transducers of the hydraulic cylinders, and wheel encoders measuring the rotation of all four wheels. All measurements are indexed by their unique UNIX timestamps. Depending on the accuracy of the GPS sensor, it can be necessary to correct the measured positions. This can be achieved by fusing GPS and IMU readings as described for example in [3]. After this optional preprocessing step, a set of measurements, each with a unique time and clear geographical position, is ready for further analysis.

**Data Analysis** An effective data analysis strategy is key to extracting *information* from the previously gathered data. As construction progresses, the site experiences changes in material placement, object placement, soil conditions and the general surface model. These changes are of special interest for an up-to-date DT.

*Surface Model* The use of IMU and LiDAR sensors enables Simultaneous Localization and Mapping (SLAM) based approaches. Such algorithms try to localize the robot (or machine) within its surroundings, while building a map of those surroundings at the same time. Such a 3D map of the environment allows for updates of the surface model. The generated point clouds can then be rasterized by placing them on a regular grid and aggregating the height values of all points within a cell. This work used the average height within every cell. Additionally, differences in resolution between the point cloud and the grid can leave some cells empty, as points get sparser with distance from the LiDAR or the rays are obstructed by the roughness of the terrain. A straightforward solution is to use interpolation to estimate missing cell values.
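The rasterization and gap-filling steps can be sketched in Python as follows. The grid convention (north-up, upper-left origin) and the function names are assumptions, and the gap filling is simplified to one-dimensional linear interpolation along each grid row, which is only one of several possible interpolation schemes.

```python
import math


def rasterize_mean_height(points, origin_x, origin_y, resolution, n_rows, n_cols):
    """Average the heights of all (x, y, z) points that fall into each cell.

    Cells that receive no point are returned as None.
    """
    sums = [[0.0] * n_cols for _ in range(n_rows)]
    counts = [[0] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        col = math.floor((x - origin_x) / resolution)
        row = math.floor((origin_y - y) / resolution)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            sums[row][col] += z
            counts[row][col] += 1
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else None for c in range(n_cols)]
        for r in range(n_rows)
    ]


def fill_gaps_in_row(row):
    """Linearly interpolate None entries that lie between two known heights."""
    filled = list(row)
    known = [i for i, value in enumerate(row) if value is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            t = (i - left) / (right - left)
            filled[i] = row[left] + t * (row[right] - row[left])
    return filled
```

Empty cells outside the outermost known values of a row are left as `None`, since interpolation has no second support point there.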

*Terrain Condition* Besides elevation, the type of terrain is important information about the environment of a construction site. One possible approach is proposed by Kurup et al. [11], who use a support-vector-machine based algorithm to classify different terrains with features from camera images and IMU readings. Differences between the vehicle speed and the rotational speed of its wheels indicate some form of slip. This information can in turn be combined with the corresponding GPS readings to detect areas with ground that is difficult to traverse. Soil compaction is another terrain property that can be inferred from the movement and position of the vehicle around the construction site, as the wheels exert pressure on the ground. All extracted information about the properties of the terrain is then added to the corresponding layer of the grid describing the environment (Fig. 3).
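As a rough illustration of the slip idea, the Python sketch below uses the common longitudinal slip ratio for a driven wheel, which compares the wheel's circumferential speed with the vehicle's ground speed. The source does not prescribe this formula; the function names, the sample layout and the threshold are hypothetical.

```python
def slip_ratio(wheel_angular_speed, wheel_radius, vehicle_speed):
    """Longitudinal slip of a driven wheel: (omega * r - v) / (omega * r).

    Units: rad/s, m and m/s. Returns 0.0 for a wheel that is not turning.
    """
    wheel_speed = wheel_angular_speed * wheel_radius
    if wheel_speed <= 0.0:
        return 0.0
    return (wheel_speed - vehicle_speed) / wheel_speed


def flag_slip_positions(samples, wheel_radius, threshold):
    """Return the GPS positions at which the slip ratio exceeds the threshold.

    samples is a list of (lat, lon, wheel_angular_speed, vehicle_speed) tuples.
    """
    return [
        (lat, lon)
        for lat, lon, omega, speed in samples
        if slip_ratio(omega, wheel_radius, speed) > threshold
    ]
```

The flagged positions can then be mapped to grid cells and written into a dedicated terrain layer.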

*Material Transport* The main changes in the environment are caused by excavation and transport of material performed by the machinery. Stroke transducers and pressure sensors yield the actuator position of e.g. the bucket of a wheel loader. Since the kinematics of the vehicle and its current geographical position are known, the location of removed or placed material can be determined. Volume and mass of the transported material can then be estimated from its physical properties and the technical specifications of the machine used, e.g. its bucket volume. Detected material, e.g. a heap of topsoil, can then be described as a geographic object in vector format (see Sect. 3), containing its coordinates and attributes.

**Fig. 4** Example of a GeoJSON object

*Object Localization* Finally, all objects that are not terrain or earth material also have to be considered, such as fences or building material placed on the site. One strategy is to use RGB cameras and LiDARs to detect and localize objects relative to the vehicle [4, 8]. This has, however, so far only been tested in the context of autonomous driving, and more datasets with objects from construction sites are needed. Localized objects should be saved in vector format. At the end of this data analysis step, the environment measurements from construction vehicles have been transformed into information in vector and raster form and are ready to be incorporated into the DT.

**Structure of Digital Twin** As discussed in Sect. 3, geodata is usually handled in two different formats, vector and raster data. The proposed structure embraces this distinction and splits the DT into two parts: a multilayered GeoTIFF file and a database for vector objects.

The GeoTIFF format is suitable for storing the raster part of the current environmental state due to its native integration of geospatial location and reference coordinate system, and the possibility to include multiple layers in one file. Since every layer shares the same grid structure, this format enforces consistency between different layers with regard to resolution and geographic location. The GeoTIFF file could contain the following selection of layers: DSM, DEM, red, green and blue channels of color data, terrain type map, soil compactness map, geological map. The second part of the environmental model stores all distinct geographical objects in a JSON document database. We propose using the GeoJSON format. Figure 4 shows an example of such an object. It consists of a set of user-defined properties and primitive geometries referenced by geographic coordinates. In this case, it is a 'fence' type object in the form of a line with additional information about its height. Together with the previously mentioned CityJSON objects (see Sect. 3), the DT needs only one database to manage all vector data, allowing for efficient and consistent access, searching and editing.
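A 'fence' object of the kind described above could be assembled as in the following Python sketch. The overall layout follows the GeoJSON Feature structure (a `geometry` with longitude/latitude coordinate pairs plus free-form `properties`), but the property names, the helper function and the coordinates are hypothetical and do not reproduce the object shown in Fig. 4.

```python
import json


def make_fence_feature(coordinates, height_m):
    """Build a GeoJSON Feature describing a fence as a LineString.

    coordinates is a list of [longitude, latitude] pairs.
    """
    return {
        "type": "Feature",
        "geometry": {"type": "LineString", "coordinates": coordinates},
        "properties": {"type": "fence", "height": height_m},
    }


# Hypothetical fence segment; the coordinates are placeholders.
fence = make_fence_feature([[6.0600, 50.7800], [6.0605, 50.7802]], 2.0)
document = json.dumps(fence, indent=2)
```

The serialized `document` can be stored directly in a JSON document database and queried by its properties or geometry.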

With this structure, the information about the current state of the environmental model can be used to support future updates of the DT by combining it with new measurements. This information can also be exported back into public geodatabases, improving and updating their data.

**Interfaces and Applications** A complete and consistent DT acts as the single source of truth for the state of the construction site. Through a selection of suitable interfaces and representations, such as 3D models, maps or dashboards and statistics, the site operator is supported in their monitoring, controlling and planning tasks. This constitutes an indirect connection from the Digital Twin back to the Real Twin, as its state directly influences the actions of the user, who in turn changes the state of the real construction site through their managerial decisions. Possible applications can be optimization of route planning based on elevation and ground properties or editing the site layout after changes in the construction plan.

#### **4 Experiments**

The presented methodology was implemented in a preliminary study using a wheel loader and a local test site. During initialization, all available data sources depicted the site as an empty and flat area. The machine then performed several test drives around the area and carried out typical construction tasks. It was equipped with the sensors described in Sect. 3 to emulate one update pass of the digital twin.

#### **4.1 Surface Model**

Figure 5 demonstrates the information gain from LiDAR recordings. The point cloud records the area in front of the machine. For better visibility, points are colored based on their height. At some point in time, a depression in the ground could be clearly detected ahead of the vehicle, indicated by the blue region, as well as some tall vegetation in the form of the ragged yellow-orange structures on the right. Neither of those were present in the initial surface model. The recorded point cloud was then rasterized, linearly interpolating empty pixels (Fig. 6) and used to update the corresponding area of the global DSM, as the current geographical position of the vehicle is known.

#### **4.2 Slip and Soil Compaction**

**Fig. 5** Recorded point cloud during wheel loader operation. Points are colored based on their height. The white symbol indicates the position of the viewer. A sink can be clearly observed ahead of the vehicle, indicated by the blue region. On the right-hand side, the rough and yellow-orange patches indicate vegetation

**Fig. 6** Elevation map generated from the point cloud in Fig. 5 with 10 cm resolution. Empty grid cells were linearly interpolated

Figure 7 shows a qualitative map of slip in the area traversed during operation of the machine. The measure of slip was defined as the difference in rotation frequency between wheels on mostly straight movement segments. Darker regions show areas with more slip. Figure 8 shows a quantitative map of soil compaction, which was caused by the pressure of the wheels of the vehicle on the ground. Additionally, crossing the same spot multiple times and the added weight of transported material were also considered in the estimation. Darker regions show a higher level of compaction. The dark patch on the right side of the figure shows an area where the wheel loader piled up and subsequently removed a heap of earth. This new environmental information is not normally available. It offers new possibilities, e.g. for future route planning, and is added to the corresponding layers of the DT.
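The slip measure described above can be sketched as follows. The text does not state which wheels are compared or how straight segments are detected, so the left/right wheel pairing, the yaw-rate threshold, and all names below are hypothetical choices made for illustration.

```python
def slip_measure(freq_left_hz, freq_right_hz, yaw_rate_rad_s, max_yaw_rate=0.05):
    """Return |f_left - f_right| on near-straight segments, else None."""
    if abs(yaw_rate_rad_s) > max_yaw_rate:
        return None  # turning: wheel speeds differ by design, so skip the sample
    return abs(freq_left_hz - freq_right_hz)

def accumulate_slip(samples, cell_size=1.0):
    """Keep the largest slip value seen in each map cell.

    samples: iterable of (x, y, freq_left_hz, freq_right_hz, yaw_rate_rad_s)
    """
    slip_map = {}
    for x, y, f_left, f_right, yaw_rate in samples:
        slip = slip_measure(f_left, f_right, yaw_rate)
        if slip is None:
            continue
        cell = (int(x // cell_size), int(y // cell_size))
        slip_map[cell] = max(slip, slip_map.get(cell, 0.0))
    return slip_map
```

Storing the maximum per cell is one possible aggregation; a mean or a count of slip events would serve equally well as a qualitative map layer.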

#### **5 Discussion**

The experiments demonstrated the potential of employing a DT in the context of construction site environments. Each test drive can be seen as a single update pass from the real environment to the DT. In a real use case, these updates should happen continuously and in near real time during the whole construction process. The implementation of an automatic system for updating the DT needs to be developed and tested in further studies.

Constant updates provide the possibility to monitor the evolution of the construction site, as the DT can store its past states, which can be a powerful controlling tool. However, this raises ethical questions about the surveillance of construction site personnel, such as machine operators, since their behavior is indirectly recorded through their machine and thus also stored inside the DT. Another issue can arise when the full potential of the DT is used and its latest geodata are procured to update the databases of public bodies. It is important to distinguish confidential business data from data cleared for public use before any export.

The methodology describes only the essential structure of an environmental DT. Based on the use case, the proposed contents of the DT could be extended or further transformed to fit a certain application. Additional information, like the target layout of the site or safety thresholds on slip values, will enable detection of deviations in the current layout or provide insights for path planning, respectively.

#### **6 Conclusion**

We proposed a methodology for modeling the environment of construction sites via a Digital Twin (DT). After the DT is initialized with available geodata, it needs to be coupled with information about events happening right now at the construction site. Using construction vehicles equipped with different sensors, raw data about the environment can then be gathered. Next, this data is fused and analyzed to derive the desired information, such as an elevation model or soil condition. This finally updates the previously initialized DT. Every new set of measurements can then be combined with current information from the DT to enhance future updates. The cycle is finally closed with a range of applications, such as monitoring and controlling. The DT can 'act' on the real environment through the management and planning decisions of the user, supported by suitable human-machine interfaces. One update loop of this methodology was then implemented on a test site using a wheel loader. Different information, such as changes in surface model, regions with slip and soil compaction, have been derived from sensor measurements during vehicle operation and added into the existing environmental model of the test site.

Future research will include the implementation and evaluation of a fully connected infrastructure to automate the measurement, analysis and update steps. In combination with that, other sensors, such as radar or stereo cameras, should be tested for deployment on construction machinery. In the long term, the extension of the environmental DTs to incorporate DTs of constructed objects and the construction machines themselves will offer new topics for investigation.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

**Modeling**

### **Analyzing Natural Resting Aspects of Arbitrary Components Using a Physics Engine**

Torge Kolditz , Jakob Hentschel and Annika Raatz

#### **Abstract**

Part feeding systems play a vital role in automated assembly, linking in-house logistics with individual assembly stations. One of the main tasks of part feeding systems is to transfer components from a disordered state (e.g. bulk material) to an ordered state (defined position and orientation) so that they can be further processed by automated handling equipment. Knowledge of the natural resting aspects (probability that a geometrical body rests in a certain orientation) of the components is essential for the development and design of part feeding systems. The experimental determination of natural resting aspects is time-consuming and expensive since extensive drop tests have to be carried out. Therefore, many approaches have been taken to derive the natural resting aspects mathematically based on the component geometry or by direct dynamic simulation. In this work, the physics engine of the open-source software Blender is used to determine natural resting aspects of arbitrary components without the need for experimental drop tests. In virtual drop tests, components are imported in the common STL format and are dropped on a surface from random initial orientations. The resting orientations of the components are exported and automatically evaluated using MATLAB. The functionality and accuracy of the approach are evaluated by conducting experimental drop tests with five exemplary components. The evaluation shows good agreement between simulated and experimental results.

T. Kolditz (\*) · J. Hentschel · A. Raatz

Institute of Assembly Technology, Leibniz University Hannover, Garbsen, Germany e-mail: kolditz@match.uni-hannover.de URL: http://www.match.uni-hannover.de

© The Author(s) 2023

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_13

#### **Keywords**

Natural resting aspects · Physics engine · Drop test · Simulation

#### **1 Introduction**

Modern production environments are characterized by an increasing level of automation. Especially in assembly, there is still a lot of potential to enhance efficiency and productivity through automation. Key components of automated assembly systems are part feeding systems, which transfer the assembly components from a disordered state (e.g. bulk material) to an ordered state, with a defined orientation and position. Vibratory bowl feeders are an example of widely used devices [1]. They consist of a vibrating bowl with a spiral track. Due to the vibration, the components are transported up the track, which is equipped with different chicanes or traps that sort out (reject) components in undesired orientations. In order to achieve high feeding rates, the number of rejected components should be minimized. Therefore, knowledge of the probabilities for a component to naturally adopt certain orientations (natural resting aspects, cf. Fig. 1) is essential for an efficient design of vibratory bowl feeders [2]. The design of the chicanes must allow components in highly probable orientations to pass and reject components in unlikely orientations. Apart from vibratory bowl feeders, knowledge of the natural resting aspects of a component is also valuable for the design of various types of conveying and part feeding systems like linear feeders, camera-based pick-and-place systems, or aerodynamic feeding systems [3].

The experimental determination of the natural resting aspects in manual drop tests is very time-consuming because a component has to be dropped several hundred times and the resulting resting orientations have to be documented manually. This work presents a novel method for an automated determination of the natural resting aspects of arbitrary components with the use of a physics engine. In the first step, the component is imported into a physics engine in the common STL format (Sect. 3.1). Then, the virtual component is repeatedly dropped on a surface from a defined height with a random initial orientation (Sect. 3.2). The component bounces off the surface, changing orientation multiple times. When the component comes to rest, its orientation is exported for further processing (Sect. 3.3). The extraction of the natural resting aspects from the exported data is explained in Sect. 4. Ultimately, the experimental evaluation is presented (Sect. 5) and the results are discussed (Sect. 6).

**Fig. 1** Natural resting aspects of an L-shaped exemplary component

#### **2 Related Work**

In one of the first works regarding model-based determination of natural resting aspects, Boothroyd and Ho used the energy barrier between different orientations (resting aspects) of a component as an indicator for the stability of their resting aspects [2]. They assumed that the probability that a component comes to rest in a particular orientation is proportional to the energy needed to change the orientation. Boothroyd and Ho applied the method to simple, regular prismatic and cylindrical components and validated it experimentally with drop tests. The results showed high consistency [2]. However, even though the method is sometimes referenced for comparison, it was not developed any further (cf. [4]) and is strongly limited with regard to the component complexity.

To compensate for the limitations mentioned above, Ngoi et al. presented the centroid solid angle (CSA) method [5]. Ngoi et al. proposed that the probability with which a component comes to rest on any of the feasible aspects is proportional to the solid angle from the centroid of the component to the considered aspect. They also proposed that this probability is inversely proportional to the height of the centroid above the considered aspect. In [5], the CSA method is compared to Boothroyd and Ho's energy barrier method and successfully validated with experimental drop tests using a T-shaped prism as exemplary component. In subsequent works, Ngoi et al. refined and evaluated the CSA method using different exemplary components, for example with a displaced center of gravity or form elements like bores or grooves [6]. In [7], the method and the drop test results were also validated using a vibratory bowl feeder.
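The two proportionalities stated for the CSA method can be written as a short normalization. The sketch below encodes only what the text says (weight proportional to the solid angle, inversely proportional to the centroid height); the exact formulation in [5] may differ, and the function name and inputs are hypothetical.

```python
def csa_probabilities(solid_angles, centroid_heights):
    """Normalize the weights omega_i / h_i so that they sum to 1."""
    weights = [omega / h for omega, h in zip(solid_angles, centroid_heights)]
    total = sum(weights)
    return [w / total for w in weights]

# Made-up values for two aspects with equal centroid height:
# the aspect with twice the solid angle receives twice the probability.
example = csa_probabilities([2.0, 1.0], [1.0, 1.0])
```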

In [8], Ngoi et al. introduced and evaluated the critical solid angle (CRSA) method. For the CRSA method, they assumed that the probability that a component comes to rest on a certain aspect is proportional to the difference between the centroid solid angle of that aspect and the average of the critical solid angles of the surrounding aspects. The critical solid angle between two aspects of a component is determined by the critical position of the centroid when tilting the component from one aspect to another. A detailed explanation of the CRSA method is given in [8].

Chua and Tay developed the stability method, which analyzes the stability of a component when resting on a certain aspect. The stability is defined as a function of the contact area of the aspect with the surface the component is resting on and the distance of the center of gravity to said surface [9]. The stability is proportional to the contact area and inversely proportional to the distance of the center of gravity.

The described methods (energy barrier, CSA, CRSA and stability) were evaluated by Suresh et al. [10] and Udhayakumar et al. [11] using brake pads and sector shaped parts as exemplary components. The results show good accuracy for all methods.

In contrast to the analytical methods described above, Moll and Erdmann introduced a numerical approach to simulate drop tests with the aim of determining the optimal drop height and surface shape for a component to rest on a certain aspect [12]. They used a dynamic simulator to simulate the behavior of polyhedral rigid bodies when dropped on a surface from a defined height. However, the dynamic simulator was limited to two-dimensional components.

Várkonyi introduced numerical dynamic simulation to determine the natural resting aspects of randomly generated three-dimensional polyhedra [13]. The aim of this work was to create a dataset to compare the accuracy of three existing analytical methods and three estimators (developed by Várkonyi). Experimental results showed good agreement when the components were dropped on a hard surface, but significant deviations when the components were dropped on a soft surface. Boothroyd and Ho define a hard surface (e.g. metal, glass) as a surface with a negligible horizontal impact force (no friction) as opposed to a soft surface (e.g. rubber), where significant horizontal forces occur on impact [2]. For the dynamic simulation, Várkonyi only considered vertical impact forces, resulting in frictionless impacts, limiting the model to the simulation of drop tests on hard surfaces.

The review of the related work shows that there are multiple approaches towards a model-based analysis of natural resting aspects. However, the presented approaches are limited with regard to the component spectrum, the usability, or the adaptability of the simulated environment (e.g. surface properties, surface geometry). To counteract these limitations, a novel, more adaptable approach is presented in the following.

#### **3 Drop Test Simulation Using a Physics Engine**

In this paper, the physics engine integrated in Blender [14] is used for the simulation of the natural component behavior. The software is used to simulate a standard drop test where a component is dropped onto a soft surface from a constant height. Using the physics engine of the open-source software Blender promises multiple advantages: The components can be imported in the common STL format, which results in an accurate representation of the simulated component and a high flexibility with regard to the component spectrum. Furthermore, Blender offers Python-based script control, which enables full automation of the iterative drop test simulation. Lastly, the environment can be adapted freely: the shape, inclination and other parameters of the surface can be changed, and walls or other restricting objects can be placed in the virtual setup. In this work, the framework for the drop test simulation and the identification of the natural resting aspects is presented and evaluated experimentally.

The drop test is performed for *n* iterations with random initial orientations and yields a distribution of the natural resting aspects of arbitrary components. Blender is particularly well suited for integrating the simulation model into a statistical test design due to its dynamic script control in Python. Figure 2 shows the general program flow for performing drop tests of a component. The program is divided into the simulation environment preparation (Sect. 3.1), the simulation run (Sect. 3.2), and the export of the rotation data (Sect. 3.3).

#### **3.1 Preparation of the Simulation Environment**

Firstly, the simulation environment is automatically set up in Blender using a Python script. To do this, the plane on which the components fall, as well as the component itself, are imported as STL files. The local component coordinate system $(CS)_W$ is then relocated to the component's center of gravity and aligned with the inertial coordinate system $(CS)_0$. After that, the component is moved to the constant drop height *h* with the vector ${}_0\vec{r}_W = (0, 0, 60)^T$ mm. Lastly, the component is randomly oriented in $\mathbb{R}^3$ with uniform distribution (cf. Figs. 3 and 4a, b).
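The text does not state how the uniformly distributed random orientation is generated. One common way to do so, shown here as an assumption rather than as the method used in the study, is to draw a random unit quaternion with Shoemake's subgroup method; sampling three Euler angles uniformly would not yield a uniform distribution of orientations.

```python
import math
import random

def random_unit_quaternion(rng=random):
    """Draw a unit quaternion (a, b, c, d) uniformly distributed over all rotations."""
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    a = math.sqrt(1.0 - u1) * math.sin(2.0 * math.pi * u2)
    b = math.sqrt(1.0 - u1) * math.cos(2.0 * math.pi * u2)
    c = math.sqrt(u1) * math.sin(2.0 * math.pi * u3)
    d = math.sqrt(u1) * math.cos(2.0 * math.pi * u3)
    return (a, b, c, d)
```

The resulting quaternion can be assigned as the initial orientation of the component before each simulated drop.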

In the following, the simulation boundary conditions are defined. The simulation is a rigid body simulation: all components are modeled as ideal solid bodies and component deformation is neglected. The falling component is characterized as an active rigid body and is assigned a mass according to its density. It can move freely in $\mathbb{R}^3$ with 6 DOF. The plane, on the other hand, is defined as a passive rigid body and is thus fixed in space. Subsequently, the gravitational field is defined with a constant acceleration of $g = 9.81\,\mathrm{m/s^2}$ along the negative Z-axis of $(CS)_0$. Finally, the interaction of the two rigid bodies is determined. The surface response (bounciness and friction) is particularly important here. The bounciness *b* (0 ≤ *b* ≤ 1) describes the tendency of a rigid body to bounce after colliding with another, where 0 represents a completely inelastic collision and 1 a completely elastic one. The friction µ (0 ≤ µ ≤ 1) describes the resistance between two touching rigid bodies with a relative velocity. For a collision behavior that matches the interaction with a soft surface, the bounciness is set to *b* = 0.8 and the friction is set to µ = 0.5.

**Fig. 2** Flow chart of drop test simulation

**Fig. 3** Initial position and orientation of a component in the simulation environment

**Fig. 4** Coordinate transformation during the simulation

#### **3.2 Simulation Run and Cancellation Criterion**

For a representative distribution of the natural resting aspects of each component, the drop test is performed for a total number of *n* iterations. Each iteration *i* represents a dropping process of the component and results in a stable pose (natural resting aspect). To ensure that the component remains in a stable pose at the end of each iteration, a cancellation criterion is defined. Each iteration is divided into simulated frames *f*. The density of frames can be set by the number of frames per second. For each frame *f*, the simulated position of the component's center of gravity and its orientation are saved. If neither the position nor the orientation changes for several frames in a row, the part is in a stable final pose, the simulation of the current iteration *i* is finished, and a new iteration is started. This process continues until all *n* iterations are simulated.
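The cancellation criterion can be sketched independently of Blender. The number of unchanged frames required and the numerical tolerance are not given in the text, so `hold_frames` and `tol` below are hypothetical parameters, as is the data layout of one (position, quaternion) pair per frame.

```python
def first_rest_frame(frames, hold_frames=10, tol=1e-6):
    """Return the index of the frame at which the pose has stayed unchanged
    for `hold_frames` consecutive frame steps, or None if that never happens.

    frames: list of (position, quaternion) tuples, one per simulated frame,
            with position = (x, y, z) and quaternion = (a, b, c, d).
    """
    unchanged = 0
    for k in range(1, len(frames)):
        prev = tuple(frames[k - 1][0]) + tuple(frames[k - 1][1])
        curr = tuple(frames[k][0]) + tuple(frames[k][1])
        if all(abs(p - c) <= tol for p, c in zip(prev, curr)):
            unchanged += 1
            if unchanged >= hold_frames:
                return k
        else:
            unchanged = 0  # any movement restarts the count
    return None
```

In a scripted simulation, the iteration would be stopped at the returned frame and the pose of that frame exported.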

#### **3.3 Export of Rotation Data**

The last simulated frame of each iteration of the physics simulation returns a quaternion ${}_0\vec{\nu}_i$, which gives information about the orientation of the component, and a location vector ${}_0\vec{r}_i$, which determines its position in the inertial coordinate system $(CS)_0$. A quaternion is a hypercomplex number and is constructed as follows:

$$
{}_0\vec{\nu}_i = a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} \text{ with } \mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = -1 \text{ and } a, b, c, d \in \mathbb{R} \tag{1}
$$

The change of orientation represented by a unit quaternion is described by a rotation around the axis ${}_0\vec{x}_i$ in $\mathbb{R}^3$ by the angle ϕ. The coefficient *a* encodes the rotation angle with *a* = cos(ϕ/2), and *b*, *c*, *d* determine the components of the rotation axis ${}_0\vec{x}_i$:

$$
{}_0\vec{x}_i = \begin{pmatrix} {}_0x_1 \\ {}_0x_2 \\ {}_0x_3 \end{pmatrix} = \frac{1}{\sin(\cos^{-1}(a))} \cdot \begin{pmatrix} b \\ c \\ d \end{pmatrix} \tag{2}
$$

Each quaternion ${}_0\vec{\nu}_i$ indicates the rotation of $(CS)_0$ into $(CS)_{W_i}$ (cf. Fig. 4).
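Eq. (2) translates directly into a few lines of code. The sketch assumes a unit quaternion given in the order (a, b, c, d); the guard for a vanishing denominator (no rotation, axis undefined) is an addition not discussed in the text.

```python
import math

def rotation_axis(q, eps=1e-9):
    """Extract the rotation axis of a unit quaternion following Eq. (2)."""
    a, b, c, d = q
    a = max(-1.0, min(1.0, a))  # clamp against rounding before acos
    s = math.sin(math.acos(a))
    if abs(s) < eps:
        return None  # identity rotation: the axis is not defined
    return (b / s, c / s, d / s)
```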

#### **4 Data Evaluation and Identification of Natural Resting Positions**

The data exported from Blender, which provide information about the orientation and position of the individual final poses, are imported into a MATLAB framework and processed in the next step. To get a probability distribution of the final poses (natural resting aspects), the raw data are evaluated and sorted. The component pose can be precisely determined or assigned by specifying its orientation. However, there are several component poses representing the same natural resting aspect. To determine which poses represent the same resting aspect, a classification feature based on the rotation data (quaternion) was worked out. It states that two poses can be assigned to the same resting aspect if the coordinate systems can be transformed into each other by a pure rotation around the Z-axis of the coordinate system $(CS)_0$. Figure 4c shows two different component orientations (${}_0\vec{\nu}_2$, ${}_0\vec{\nu}_3$). In both cases the X-axis and Y-axis are aligned with the ground surface, which means the component rests on the same aspect. Both orientations therefore represent the same natural resting aspect.

To classify the final component poses into natural resting aspects, all orientations ${}_0\vec{\nu}_i$ (with *i* = 1 ... *n*) are systematically compared with each other. In the following, two iterations of the drop test and thus two different end orientations are used to explain the classification algorithm. The first orientation is represented by the quaternion ${}_0\vec{\nu}_i$, the second one by the quaternion ${}_0\vec{\nu}_{i+1}$. To compare the two orientations, ${}_0\vec{\nu}_i$ is multiplied with the complex conjugate of ${}_0\vec{\nu}_{i+1}$. The resulting quaternion ${}_i\vec{z}$ then describes the rotation of the first orientation $(CS)_i$ into the second orientation $(CS)_{i+1}$:

$$
{}_i\vec{z} = {}_0\vec{\nu}_i \cdot {}_0\overline{\vec{\nu}}_{i+1} \tag{3}
$$

According to Eq. (2), the axis of this rotation ${}_i\vec{x}_{i,i+1}$ is extracted and then transformed into the inertial coordinate system $(CS)_0$:

$$
{}_0\vec{x}_{i,i+1} = {}_0\mathbf{R}_i \cdot {}_i\vec{x}_{i,i+1} \tag{4}
$$

As already mentioned, two orientations represent the same pose if the rotation axis ${}_0\vec{x}_{i,i+1}$ is parallel to the Z-axis of $(CS)_0$:

$$
{}_0\vec{x}_{i,i+1} = (0 \; 0 \; 1)^T \tag{5}
$$

Each quaternion ${}_0\vec{\nu}_i$ is thus assigned to one of *m* stable component poses. The result is a probability distribution of the natural resting aspects. The results are automatically plotted in a pie chart with a corresponding figure of each stable resting aspect (Fig. 6).
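The classification idea can be illustrated with a compact sketch. It is written in Python rather than MATLAB and deviates from the two-step procedure of Eqs. (3) and (4): assuming Hamilton-convention unit quaternions that map component coordinates to $(CS)_0$, the relative rotation is formed directly in the inertial frame, so its axis needs no further transformation. The greedy grouping, the tolerance, and all names are assumptions made for illustration.

```python
import math

def q_mul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def q_conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def same_resting_aspect(q_i, q_j, tol=1e-3):
    """True if q_j follows from q_i by a pure rotation about the inertial Z-axis."""
    _, b, c, _ = q_mul(q_j, q_conj(q_i))
    # The X and Y parts of the axis vanish for a rotation about Z
    # (and also for the identity, i.e. identical orientations).
    return math.hypot(b, c) < tol

def classify(orientations, tol=1e-3):
    """Group orientations greedily; return representatives and relative frequencies."""
    representatives, counts = [], []
    for q in orientations:
        for k, rep in enumerate(representatives):
            if same_resting_aspect(rep, q, tol):
                counts[k] += 1
                break
        else:
            representatives.append(q)
            counts.append(1)
    n = len(orientations)
    return representatives, [cnt / n for cnt in counts]
```

With simulated data, the tolerance has to be chosen large enough to absorb small residual tilts of the resting component, but small enough to keep neighboring aspects apart.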

**Fig. 5** Exemplary components (3D-printed, no infill)

#### **5 Experimental Evaluation**

For the experimental evaluation, the results of the simulated drop tests are compared to the results of manual drop tests using five different 3D-printed exemplary components (Fig. 5). They vary in shape and are intended to cover a wide spectrum of possible components in reality. For both the simulated and the manual drop tests, each component is dropped on a soft surface 1000 times (*n* = 1000) from random initial orientations. In the experimental setup, the soft surface consists of a 2 mm thick rubber mat adhered to a wooden board.

**Fig. 6** Comparison of simulated and experimental drop test results for five components

#### **6 Results**

Figure 6 shows the results of the simulated and experimental drop tests. The results show that, with two exceptions, all possible natural resting aspects of the components were identified. The exceptions occur for component 5, where the resting aspects 4 and 5 have a very small probability of occurrence. The simulation and the experimental drop tests each failed to identify one of these two resting aspects. Because these aspects occur so rarely, the number of drop tests (1000) may not be sufficient to reliably identify all natural resting aspects and could be increased in future work.
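The sample size argument can be made concrete with a short calculation. Assuming independent drops and a resting aspect with true probability *p* (the value used below is hypothetical, not a measured one), the chance that the aspect never appears in *n* drops is (1 − *p*)^*n*.

```python
def miss_probability(p, n):
    """Probability that an aspect with probability p never occurs in n independent drops."""
    return (1.0 - p) ** n

# Hypothetical rare aspect that occurs once in 1000 drops on average:
# it is missed entirely in roughly 37 % of all test series with n = 1000.
rare_aspect_missed = miss_probability(0.001, 1000)
```

Under the same assumption, raising *n* to 5000 would reduce this chance to below 1 %.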

The average deviation between the probability of occurrence of a resting aspect predicted by the simulation model and the experimental results is 6.4%. However, the accuracy varies depending on the considered component. Components 2, 3 and 5 have an average deviation of 3.2, 1.7 and 6.7% respectively, while the results of components 1 and 4 deviate by 10.0 and 10.4% on average. The highest deviation occurs in component 4 with a deviation of 31.2% for orientation 6 (Figs. 6 and 7).
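The reported overall value is numerically consistent with an unweighted mean of the five per-component averages; whether it was actually computed this way is an assumption:

$$
\frac{10.0 + 3.2 + 1.7 + 10.4 + 6.7}{5} = \frac{32.0}{5} = 6.4
$$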

In their extensive experimental evaluation Udhayakumar et al. determined average deviations of 11.1, 8.5 and 8.7% between the experimental results and the results returned by the CSA-, stability- and CRSA-method respectively [11]. They investigated eight different sector shaped components. The maximum deviation in [11] was 23.3%.

The deviations between simulation and reality can have different reasons. As already mentioned, it can be assumed that a larger sample, i.e. more iterations per component in the simulation as well as in the real drop test, can lead to a higher agreement of the data. However, the larger deviations (component 1: aspects 1 and 3 (cf. Fig. 1); component 4: aspects 3 and 6) cannot be explained by an insufficient sample size. It is assumed that differences in the bouncing and damping behavior of the components and the surface between the simulation and the real test setup are a major factor for these deviations. Furthermore, the influence of the drop height on the resting aspects was not taken into account and could also lead to significant deviations.

**Fig. 7** Natural resting aspects of component 4

#### **7 Conclusion and Outlook**

In this work, a novel method for the automated determination of natural resting aspects of arbitrary components by means of a physics engine was presented. The physics engine iteratively simulates a drop test and exports the resulting resting poses of the component. A MATLAB framework then compares all exported component orientations, automatically clusters them, and returns the probability distribution of the natural resting aspects. An experimental evaluation of the new method shows promising results with an average deviation of 6.4% between simulated and experimental results. Nevertheless, for some components, the deviation between simulated and experimental probability of particular resting aspects is higher. In order to increase the simulation accuracy, to extend the validated component spectrum, and to include external influences, future work will focus on two aspects:


#### **References**



### **Two-Stage Robotic Bin Picking of Small Metallic Objects**

Meike Herbert , Paul Bach, Markus Lieret , Jens Fürst and Jörg Franke

#### **Abstract**

Robotic grasping of small metallic objects such as bolts is a challenging task due to their small dimensions and textureless reflective surfaces. Depth images acquired of such objects are often noisy and error-prone. In addition, parts often overlap, as they are provided randomly oriented in a box such as a small load carrier. To overcome the limitations of existing solutions for bolt separation, a flexible and cost-effective system is developed using an industrial robot and a magnetic gripper. In a two-stage procedure, the bolts are first grasped blindly from a box and placed on a flat surface. In the second step, object detection and pose estimation are performed and the individual bolts are grasped and inserted into a fixture, so that the bolts are finally in a defined position. Industrial use cases for this system are the automated preparation of bolts for robotic screwing processes or automated commissioning of small objects for assembly tasks. The methodology, implementation and evaluation of the proposed solution are presented in this paper.

#### **Keywords**

Bin picking · Magnetic gripper · Object detection · Convolutional neural network

M. Herbert (\*) · P. Bach · M. Lieret · J. Franke

Institute for Factory Automation and Production Systems (FAPS), Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany e-mail: meike.herbert@fau.de

J. Fürst Siemens Healthcare GmbH, Erlangen, Germany

© The Author(s) 2023

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_14

#### **1 Introduction**

The automation of assembly tasks provides benefits such as increasing productivity and quality, relieving employees of monotonous tasks and reducing costs. Due to their flexibility, industrial robots are applied for automation in assembly in different use cases like the automation of handling and screwing processes. The flexibility of the automation system is particularly crucial with low quantities, a high number of variants or short product life cycles.

To enable robotic assembly automation, the individual components usually have to be placed in defined poses, which is also required for bolts used for robotic screwing. Conventional systems such as step feeders or vibratory bowl feeders do not provide the necessary flexibility to handle different types of bolts and require additional space and investment. If, in contrast, the industrial robot designated for the automation task is used for handling of parts, system utilization can be increased and further investment costs can be saved.

One use case considered is the robotic screwing of components, where the industrial robot can be used to grasp and separate the bolts. If the assembly process takes place in a partially automated production line with 1- or 2-shift operation, the preparation can be done by the automated system in the overnight shift. Another use case is the automated commissioning of parts to provide them in a defned number and position e.g. in shadow boards in order to reduce search times in manual assembly.

The aim of this research is to develop a cost-effective solution for the use cases mentioned above. The use of an industrial robot with a suitable end effector enables grasping of different objects and therefore offers a high degree of variant flexibility. The metallic bolts used in these applications are characterized by small dimensions and a textureless, reflective surface. These characteristics hamper the realization of a cost-effective and flexible solution and pose challenges for existing bin-picking solutions.

Therefore, a novel two-stage method for robotic bin picking of small magnetic objects is presented in this paper. The proposed system is characterized by its flexibility and the use of edge computing devices for object detection, pose estimation and motion planning. As a result, the system can be easily integrated into existing applications without the need for major modifications to the overall robotic cell.

In the following, the corresponding state of the art is described in detail and the need for action is identified. The methodology, implementation and evaluation are presented subsequently.

#### **2 State of the Art**

In addition to the selection of a suitable gripper, a key requirement in the implementation of bin-picking solutions is the precise and robust estimation of a suitable grasping pose. There are already numerous solutions for determining the object pose or suitable gripping positions, which are summarized, for example, in [1]. However, many of those solutions require high computation power and are often not suitable for small, textureless and symmetric objects as present in the specified use case. Therefore, the detailed state of the art regarding object recognition on edge devices as well as pose estimation and grasping of small, metallic objects is presented in the following.

For object recognition in colour images, the state of the art offers a variety of solutions that can also be executed on low-power hardware. Widely used solutions include YOLO and its version optimized for mobile devices, Tiny-YOLO [2], as well as Pelee [3].

The segmentation of individual objects can also be performed robustly on low-power hardware. Solutions such as FuseNet, SegNet, or YolactEdge can run on edge devices like the NVIDIA Jetson TX2 or NVIDIA Jetson AGX Xavier, enabling semantic segmentation and, in the case of YolactEdge, instance segmentation at 30 FPS and above [4, 5].

Due to the dimensions and metallic surface of the bolts as well as their unordered positions, e.g. in a load carrier, the automated bin picking of bolts is highly challenging. The textureless, metallic surface of the bolts causes reflections and does not provide many distinct features, which leads to significant noise in the data captured with common RGB-D cameras and inadequate point clouds of the objects. Thus, some approaches try to grasp and separate bolts or similar objects without the use of computer vision.

Mathiesen et al. present a solution whereby a robot equipped with a scoop-shaped tool grabs the required parts from a box. Within the tool, an orienting groove is used to ensure that only objects with the desired orientation are kept in the scoop. Afterwards the oriented objects can be grasped from the scoop using a separate tool or gripper [6].

Ishige et al. also avoid the application of computer vision and instead use a gripper with two individually movable fingers and integrated tactile sensors for object grasping and separation. First, multiple objects are grasped from a box at once and the number of bolts between the fingers is counted using the tactile sensors. Then the gripper fingers are moved so that excess bolts fall out and finally only one bolt remains in the gripper [7].

In a complementary approach, von Dirgalski et al. propose the combined use of computer vision and force sensors to determine the pose of an object between the gripper fingers [8].

This contrasts with methods using colour and depth data to determine the pose of individual bolts. Furukawa et al. use RGB-D data combined with a template matching approach to detect M6 bolts and subsequently grasp them using a two-finger gripper [9]. The solution presented by Nakano is based on machine learning instead and uses a single shot 6DoF pose estimator to determine the pose of a bolt before grasping it [10].

To circumvent the effects of erroneous depth information for reflective objects, Sato et al. propose a two-step process. In this process, multiple objects are grasped from a load carrier using a magnetic gripper and are placed on a flat surface. Subsequently, the objects are classified using RGB information and are individually grasped with the magnetic gripper. The objects thereby remain in an unknown pose and can thus only be sorted but not fitted into a fixture or mounted to other components [11].

Another way of coping with noisy depth data is the 6DoF pose estimation pipeline for textureless, metallic objects presented by Blank et al. However, this solution is only suitable to a limited extent for small, bulk components, since the objects are too close to each other and overlap and thus cannot be clearly segmented [12].

While all of the presented solutions enable the separation of small, metallic objects, they still come with drawbacks. The approaches either require the design of an object-specific tool, provide the separated objects with an unknown pose, or are only suitable for objects above a certain size. At the same time, established solutions for separation and orientation such as vibratory bowl feeders or step feeders do not offer the necessary flexibility and are characterized by high space requirements and investment costs. Thus, a novel approach for the separation of small, metallic bolts is presented in the following.

#### **3 Two-Stage Bin Picking: System Design and Methodology**

The method provides the bolts in a defined pose after the two-stage grasping process. At the same time, the approach copes with noisy depth information and is characterized by its low investment costs and flexibility.

#### **3.1 Requirements and System Design**

The overall aim is the development of a flexible and cost-effective system that can be easily adapted to different objects or variants. The bolts to be separated have small dimensions with a total length of about 15 mm to 35 mm and diameters of 3 mm to 5 mm at the cylindrical shaft and 10 mm to 15 mm at the head of the bolt. The metallic bolts are magnetic and have textureless, reflective surfaces.

In addition, space requirements and the integration of suitable sensors have to be considered. Small and lightweight sensors for object recognition should be arranged at the robotic end effector, while fixed infrastructural sensor systems should be avoided. A cost-efficient solution is preferably selected.

The proposed setup consists of an industrial robot with an appropriate end effector containing a magnetic gripper and a vision sensor. Thus, there is the restriction that only magnetic objects, such as metallic bolts, can be picked. A box, e.g. a small load carrier, containing the bolts in random poses is placed in the robot's workspace. In addition, a fixture is used to store the objects in a defined position after grasping.

Grasping the small metallic objects directly from the box is complex and challenging. Due to overlapping of the randomly oriented parts, difficulties arise in finding suitable grasping poses and high accuracy is required when grasping the objects. Furthermore, reflections occur, causing noisy or faulty depth measurements and difficulties in identifying the bolts. Therefore, a two-stage procedure is proposed.

#### **3.2 Two-Stage Procedure for Bin Picking with Magnetic Gripper**

Due to the challenges described above, a two-stage procedure is presented for the bin picking task, consisting of a blind, i.e. visionless, grasp into the box in the first stage and the grasping of individual bolts from the work surface in the second stage. The procedure is shown in Fig. 1.

An image of the workspace is taken in a defined scan pose using the vision system attached to the industrial robot. The camera is positioned parallel to the work surface at a predefined height. If no bolt is detected, a blind grasp is performed, whereby the robot moves the magnetic gripper into the box without using visual information. Some bolts are grasped and afterwards placed on the work surface next to the box.

If a bolt is detected in the image, its pose is estimated, the bolt is grasped and placed in the fixture. The process is repeated until the required number of bolts is reached or the fixture is completely equipped.

**Fig. 1** Two-stage procedure for bin picking with magnetic gripper
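The sequence control of Fig. 1 can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `TwoStagePicker` and the callbacks `detect_bolt`, `blind_grasp` and `pick_and_place` are hypothetical placeholders for the vision system, the first-stage grasp and the second-stage grasp, and a slot is only counted as filled when the placement succeeds.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Pose2D = Tuple[float, float, float]  # x, y and rotation angle on the work surface


@dataclass
class TwoStagePicker:
    """Sequence control sketch; all callbacks are hypothetical stand-ins."""
    detect_bolt: Callable[[], Optional[Pose2D]]     # image at scan pose -> pose or None
    blind_grasp: Callable[[], None]                 # stage 1: visionless grasp into the box
    pick_and_place: Callable[[Pose2D, int], bool]   # stage 2: grasp one bolt, fill a slot
    log: List[str] = field(default_factory=list)

    def run(self, slots: int, max_cycles: int = 100) -> int:
        """Repeat both stages until `slots` bolts are placed or the cycle limit is hit."""
        filled = 0
        for _ in range(max_cycles):
            if filled >= slots:
                break
            pose = self.detect_bolt()
            if pose is None:
                # No bolt on the work surface: fetch several bolts from the box.
                self.blind_grasp()
                self.log.append("blind")
                continue
            # A bolt was detected: grasp it individually and place it in the fixture.
            if self.pick_and_place(pose, filled):
                filled += 1
            self.log.append("pick")
        return filled
```

The `max_cycles` limit is an added safeguard so that the loop terminates even if the box is empty or every grasp fails.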

#### **3.3 Grasping Process and Pose Estimation**

A custom-made magnetic gripper consisting of an electromagnet with a microcontroller is used to grasp the bolts. Process knowledge is used to control the force of the magnetic gripper depending on the type of bolts (especially the weight) and the intended picking process (first step or second step). When grasping blindly into the box in the first step, the force is strong enough to pick up several bolts and place them on the work surface. Subsequently, in order to grasp one bolt, the tip of the magnetic gripper is placed at the head of the bolt and an appropriate force is set to grasp exactly one bolt.

To this end, the pose of the bolt is determined using the implemented computer vision system. The bolts lying on the work surface have three degrees of freedom (DOF): two translational and one rotational. Applying a previously trained convolutional neural network (CNN), the positions of the bolt as well as the bolt's head are determined in the image. The center points of the bounding boxes of these two object classes are provided. In addition to the position of the bolt in x- and y-direction, the orientation of the bolt is determined using the two center points and the corresponding angle β, as depicted in Fig. 2.
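The pose computation from the two bounding-box centers can be sketched as follows. The exact definition of β is given in Fig. 2, which is not reproduced here, so this sketch assumes that β is the angle of the vector from the bolt center to the head center relative to the image x-axis and that the head center serves as the grasp point. The names `box_center` and `bolt_pose` are hypothetical.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # x_min, y_min, x_max, y_max in pixels


def box_center(box: Box) -> Tuple[float, float]:
    """Center point of an axis-aligned bounding box."""
    x_min, y_min, x_max, y_max = box
    return 0.5 * (x_min + x_max), 0.5 * (y_min + y_max)


def bolt_pose(bolt_box: Box, head_box: Box) -> Tuple[float, float, float]:
    """Return (x, y, beta): head center in pixels and orientation in radians.

    Assumption: beta is measured from the image x-axis to the vector
    pointing from the center of the whole bolt to the center of its head.
    """
    bx, by = box_center(bolt_box)
    hx, hy = box_center(head_box)
    beta = math.atan2(hy - by, hx - bx)
    return hx, hy, beta
```

Using `atan2` rather than `atan` keeps the full angular range, so a bolt pointing left is distinguished from one pointing right.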

#### **4 Implementation and Evaluation**

To evaluate the described system, it is implemented as depicted in Fig. 3. A lightweight robot UR10 from Universal Robots is used to automate the process and an Intel RealSense L515 LIDAR is used to capture the environment (see Fig. 3a). The sensor uses the time-of-flight principle and provides a point cloud of the environment with a maximum resolution of 1024×768 spatial points. In addition, a colour image of the environment with a maximum resolution of 1920×1080 pixels is captured and superimposed with the generated point cloud. This allows the pixel coordinates of the colour image to be converted to the corresponding spatial points.

**Fig. 3 a** Implementation of the overall system, **b** identification of bolts using YOLOv3 and **c** grasping position of an individual bolt
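The conversion from pixel coordinates to spatial points mentioned above can be illustrated with an ideal pinhole camera model. This is a generic sketch and not the sensor's SDK: the type `Intrinsics`, the function `deproject` and the numeric intrinsics in the test values are hypothetical, and lens distortion is ignored.

```python
from typing import NamedTuple, Tuple


class Intrinsics(NamedTuple):
    """Pinhole camera parameters in pixels (hypothetical values are supplied by the caller)."""
    fx: float  # focal length in x
    fy: float  # focal length in y
    cx: float  # principal point x
    cy: float  # principal point y


def deproject(u: float, v: float, depth_m: float, cam: Intrinsics) -> Tuple[float, float, float]:
    """Map pixel (u, v) with a depth value to a 3D point in the camera frame (metres)."""
    x = (u - cam.cx) * depth_m / cam.fx
    y = (v - cam.cy) * depth_m / cam.fy
    return x, y, depth_m
```

In a real setup the depth value would come from the point cloud aligned with the colour image, and the resulting point would still have to be transformed from the camera frame into the robot base frame.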

The CNN YOLOv3 is used to identify the bolts. To train the CNN, 350 images were manually annotated. The complete bolt as well as the head of the bolt are annotated individually in order to distinguish the respective parts during recognition. For training, the annotated image dataset is divided into training, validation and test data in a ratio of 80%, 10% and 10%. The hyperparameters for the training were chosen as listed in Table 1. After completing the training, the model achieved a mean average precision of 95% on the test set. Figure 3b shows the identification of the bolts using YOLOv3 with bounding boxes of the complete bolt (purple) as well as the heads (light green).
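For the 350 annotated images, the stated ratio corresponds to 280 training, 35 validation and 35 test images. A minimal sketch of such a split is shown below; the function `split_dataset`, the fixed random seed and the file names in the usage example are hypothetical and not taken from the paper.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(
    items: Sequence[str],
    ratios: Tuple[float, float, float] = (0.8, 0.1, 0.1),
    seed: int = 0,
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle the items reproducibly and split them into train/validation/test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(ratios[0] * len(shuffled))
    n_val = round(ratios[1] * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder, so no image is dropped
    return train, val, test


# Usage with placeholder file names
train, val, test = split_dataset([f"img_{i:03d}.png" for i in range(350)])
```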


**Table 1** Hyperparameters used for the training of YOLOv3

The developed magnetic gripper consists of an electromagnet with a diameter of 25 mm, a length of 20 mm and a maximal retention force of 50 N. The electromagnet is switched via a bridge circuit and the overall magnetic gripper is controlled via an Arduino Uno, whereby the magnetic field strength can be adjusted via a pulse-width modulated signal. An additional crash protection is installed between the gripper and the robot flange to avoid damage to the robot in the case of a faulty gripping attempt.

The software required to achieve an automated grasping and separation process is implemented using the Robot Operating System (ROS). The darknet\_ros package can be used for the integration of YOLO. The motion planning for the UR10 is done using the MoveIt framework. The calculation of the gripping pose, the control of the magnetic gripper as well as the sequence control are implemented using the middleware provided by ROS. The whole software including object recognition, gripping pose calculation and motion planning is executed on an NVIDIA Jetson AGX Xavier.

To evaluate the presented system, multiple test runs are performed. During the test runs, M5 cylinder head bolts are used as gripping objects. The objective of every test run is to place 16 bolts in the fixture. The process starts by grasping multiple bolts from the box and placing them on the work surface. The bolts are then grasped individually (see Fig. 3c) and inserted into the holes of the fixture. Once no more bolts are detected on the work surface, new bolts are grasped from the box. Each test run continues until the fixture is fully equipped or an error occurs.

No errors or faults occurred during the runs when grasping the bolts out of the box. In every iteration, between two and five bolts were grasped from the box and placed on the work surface. In the subsequent process of grasping and placing the individual bolts, 70% of the bolts were grasped successfully and 60% of the bolts could be deposited successfully in the fixture.
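The two rates can be related to each other under the assumption, not stated explicitly in the evaluation, that both percentages refer to the same set of individual grasp attempts and that only a grasped bolt can be deposited. The share of grasped bolts that end up correctly in the fixture then follows as

$$
P(\text{placed} \mid \text{grasped}) = \frac{0.60}{0.70} \approx 0.86,
$$

so that, under this reading, roughly one in seven successfully grasped bolts is lost during placement.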

The following issues caused unsuccessful placements in the fixture and failed grasping attempts. An unsuccessful placement of a bolt was always connected to a faulty pose estimation or a faulty grasping attempt. When the bolt is not centred beneath the gripper or is tilted after grasping it from the work surface, it cannot be placed in the fixture correctly. Since currently no optical verification of the correct grasping pose is integrated, a wrongly oriented or tilted bolt is placed next to a hole when it is inserted into the fixture. This can also result in an unacceptably high compression force and an emergency stop of the robot.

Unsuccessful or faulty grasping attempts are mainly caused by bolts lying close together. In these cases, the bolt and its head cannot be recognized unambiguously and thus no grasping pose or an invalid one is calculated. When no bolt is grasped from the work surface, the process continues, but the space in the holder remains empty after the placement is completed.

Another error was residual magnetization of the bolts after insertion into the fixture, which prevented them from being released from the gripper. However, this error can be reliably prevented by moving the gripper away at an angle after insertion.

#### **5 Summary and Outlook**

The system presented in this paper enables the flexible and cost-effective separation of bolts using an industrial robot and a magnetic gripper. After separation, the bolts are in a known pose, allowing them to be inserted directly into a custom fixture. As the depth data of the small and reflective bolts is noisy and error-prone, a two-stage process is used to separate the bolts. First, multiple bolts are grasped from the box and placed on an even surface. Afterwards, object detection and pose estimation are performed to grasp a single bolt in a defined manner. The presented implementation and evaluation demonstrate the functionality and potential of the system. Further test runs have to be performed with different bolt variants.

In order to address the errors encountered during the evaluation, the following improvements and enhancements will be made in the next development step. An additional colour camera will be integrated to check whether a bolt has been successfully grasped from the surface and whether it is in the required pose. If not, the bolt is put down again and the grasping process is repeated. Furthermore, after placing the bolt in the fixture, the colour camera of the LIDAR should be used to check that the bolt has been inserted correctly.

Moreover, a combined force and position control will be used while inserting a bolt in the holes of the fixture to compensate for small deviations in the placement position. As shown by Metzner et al., the application of a suitable compensation strategy can significantly increase the success rate when inserting objects into holes [13].

Finally, after integrating the improvements mentioned above, an evaluation of the overall process with different bolt types will be carried out and the success rate when loading different fixtures will be evaluated.

#### **References**



## **Investigation and Compensation of Hysteresis in Robot Joints with Cycloidal Drives**

### Patrick Mesmer, Patrick Riedel, Armin Lechler and Alexander Verl

#### **Abstract**

Improving the dynamic path accuracy has been a major research topic in industrial robotics for decades. It is known that the drivetrains installed in the robot joints limit further improvements. There is considerably more literature on the dynamic behavior of harmonic drives (HDs) than on that of cycloidal drives (CDs), which are usually installed in industrial robots (IRs) with heavy payload. However, a more profound knowledge of the occurring effects offers the potential for both design- and control-based enhancements. Therefore, this paper presents an experimental study of the friction and hysteresis behavior with explicit consideration of further dependencies, such as temperature and load. Based on these investigations, a model as well as a control-based compensation approach that does not require additional gearbox output sensors is proposed. The investigation and validation are carried out with an experimental setup equivalent to the drivetrain of an IR with heavy payload.

#### **Keywords**

Industrial robot • Modeling and identification • Temperature • Bouc-Wen model • Sensorless compensation

P. Mesmer (B) · P. Riedel · A. Lechler · A. Verl

Institute for Control Engineering of Machine Tools and Manufacturing Units, University of Stuttgart, Stuttgart, Germany

e-mail: patrick.mesmer@isw.uni-stuttgart.de

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_15

#### **1 Introduction**

It has been known for several decades that the dynamic behavior of the drivetrains installed in the robot joints limits the obtainable path accuracy [1, 2]. These drivetrains, usually consisting of a precision gearbox and a permanent magnet synchronous machine (PMSM), exhibit a variety of nonlinear effects such as torque ripple, kinematic error, friction and hysteresis. It is also known that CDs are installed as precision gearboxes in IRs with heavy payloads instead of HDs, which are commonly used for lightweight robots. The reason for this is the overload capability of CDs due to the operating principle with rolling contact instead of tooth meshing [3].

Previous studies [4–14] have focused almost entirely on HDs. From these investigations it is known that the friction depends on additional quantities, such as temperature or load. However, to the best of the authors' knowledge, no study has yet been published on a potential dependence of the hysteresis on additional quantities. Therefore, this paper addresses the knowledge transfer from HDs to CDs by presenting an experimental investigation of additional dependencies of the friction and hysteresis behavior of CDs. Furthermore, a model of these dependencies as well as a control-based compensation approach is proposed.

#### **2 Related Work**

Since the 1990s, research efforts have been made to model the hysteresis behavior of HDs, which is caused by friction and nonlinear stiffness. Early dynamic models were proposed by Seyfferth et al. [4], Taghirad and Bélanger [5] as well as Dhaouadi et al. [6], among others. Among more recent work on this topic, the studies of Tjahjowidodo et al. [15] and Ruderman et al. [7, 8] are noteworthy. Tjahjowidodo et al. [15] use parallel Maxwell-slip elements to describe the nonlinear dynamics, whereas Ruderman and Iwasaki [8] adopt a rate-independent Bouc-Wen hysteresis model. In addition to the modeling, Ruderman and Iwasaki propose a sensorless hysteresis compensation approach based on a generalized momentum observer [16] and a Stribeck friction model. To the best of the authors' knowledge, only Dhaouadi et al. [6] investigated a possible multidimensionality of the hysteresis of HDs. In that study, hysteresis curves for different frequencies were determined, and no additional dependence was found.

In contrast, there is significantly more work exploring the friction behavior. Bittencourt et al. [9] investigated the load and temperature dependence using the second joint of an ABB IRB 6620 as an example. They found that the effects of temperature and load are independent and, based on this, suggested an empirical model. In contrast, in [10] no temperature dependence but an additional position dependence was considered using lookup tables. Carlson et al. [11] model the temperature dependence of both an ABB IRB140 and an ABB YuMi using a temperature-dependent Coulomb friction adaptation based on the estimated thermal energy stored in the robot joints. Simoni et al. [12] also studied the temperature dependence of the friction in the assembled state. Considering the second joint of a Comau SMART NS-16-1.65, two different modeling approaches based on a polynomial friction model were proposed: a model with a linear temperature dependence of the entire friction model, and a model in which each parameter exhibits an individual but linear dependence. Madsen et al. [13] consider the temperature dependence of a Universal Robots UR5e using an additive, polynomial friction term. The authors note that their approach does not extrapolate well and thus may lead to problems in practical applications. In addition, the load dependence is taken into account using an adaptation of the Coulomb friction coefficient with respect to the squared load torque. Another approach is the approximation of the temperature dependence of the friction using a neural network [14]. The validation is carried out on a testbench with a single joint of DLR's humanoid robot David with position, torque, and temperature sensors.

An experimental investigation of the hysteresis behavior of CDs, which is closely linked to the friction behavior, has not yet been published, except for our previous work [17]. In that work, we proposed a Bouc-Wen model as well as a nonlinear auto-regressive with exogenous inputs (NARX) model to represent the hysteresis behavior of CDs. However, a possible temperature or frequency dependence of the hysteresis behavior was not investigated. In addition, the models were not validated using a compensation scheme.

#### **3 Experimental Setup**

All subsequent investigations of the friction and hysteresis behavior of CDs are carried out on the experimental setup shown in Fig. 1, which simulates a robot joint of the heavy payload class with one degree of freedom. The CD under test is the precision gearbox RH380-N from Nabtesco ➀ with a rated torque of 3.7 kNm and a gear ratio *u* of 185. The lubricant temperature of the CD is measured using a PT100 sensor ➅. The CD is driven by the PMSM MSK070D from Bosch Rexroth ➁, which is equipped with a 13-bit encoder. The joint torque τ<sub>g</sub> is measured with the torque sensor T40B from HBM ➂. A dynamic load torque can be applied via the water-cooled high-torque motor DST2-315KO from Baumüller ➄. For this purpose, the load motor ➄ is connected to the output side of the CD ➀ using a Roba DS 1400 double-jointed coupling from Mayr ➃ with a torsional stiffness of 15e6 Nm/rad.

The experimental setup is operated with a rapid prototyping platform from Speedgoat, which executes a Simulink model. The rapid prototyping platform communicates via EtherCAT with a cycle time of 500 µs with industrial motion controllers from Bosch Rexroth and Baumüller, on which the current control of the motors runs.

**Fig. 1** Experimental setup used for the investigation (adapted from [17])

#### **4 Investigation**

From the related work (see Sect. 2) it is known that the friction depends on the load and temperature in addition to the velocity. However, for the hysteresis, there is no study that has examined additional dependencies. Therefore, in this section, the friction and a potential multidimensionality of the hysteresis behavior of CDs are experimentally investigated.

#### **4.1 Friction**

Classical, static friction models describe a functional relationship between the friction torque τ<sub>f</sub> and the relative velocity of the contact surfaces. Assuming only a dependence on the motor velocity θ̇, as is often the case in industrial robotics, it is possible to identify the friction behavior by closed-loop motion trajectories with constant velocity. We assume that the effects of temperature and load are independent, which significantly reduces the investigation burden. The validity of this assumption was already shown in [9].

To investigate an additional dependence on the load torque, this experiment was repeated several times, while a constant load torque τext was applied using the output motor. Figure 2 shows the results for load torques of 0 to 3 kNm as well as for negative and positive loads. It is obvious that with increasing load torque, the friction torque increases, too. This can be explained by the fact that an increased load leads to an increase in the contact surface, which in turn results in a higher friction torque. However, this relationship between the load torque and the friction torque is nonlinear as well as dependent on the direction of rotation.

To investigate the temperature dependence, a constant motor velocity of 150 rad/s is used to heat up the joint. Once the target temperature is reached, the experiment is carried out. About 30 min were required to heat from 20 to 50 °C. The corresponding friction curves are shown in Fig. 3. The rising temperature leads to an increase in the static and Coulomb friction, whereas the viscous friction decreases. With increasing temperature, the viscosity of lubricants decreases, which explains the decrease in viscous friction. Simultaneously, the increase in temperature leads to an expansion of the material, which increases the contact surface and thus may explain the increase in static and Coulomb friction.

**Fig. 2** Joint torque τg in dependence of load torque τext

**Fig. 3** Friction torque τf in dependence of temperature *T* (in steps of 5 K)

#### **4.2 Hysteresis**

The hysteresis behavior of robot joints is typically modeled as a nonlinear differential equation of the joint torque depending on the joint torsion as well as its derivative. To the best of the authors' knowledge, other potential dependencies, such as on frequency or temperature, have not been examined. These dependencies are investigated by applying a sine signal of the load torque τext with an amplitude of 3 kNm while varying the additional quantities. Subsequently, a static hysteresis curve is obtained in each case by plotting the joint torque τ<sub>g</sub> against the torsion angle φ.

First, the frequency of the sine signal of the load torque is varied between 0.125 Hz and 2 Hz. The resulting static hysteresis curves, which are nearly identical, are shown in Fig. 4a. This corresponds to a frequency independence, which is beneficial since simpler, rate-independent hysteresis models are sufficient. The procedure to heat up the robot joint corresponds to that of the friction investigation in Sect. 4.1. The obtained static hysteresis curves for the temperatures 20, 35 and 50 °C are shown in Fig. 4b. Note that a frequency of the sine signal of 0.5 Hz was chosen; however, this is irrelevant due to the frequency independence. An increasing stiffness with rising temperature is noticeable, although the basic shape of the hysteresis curve does not change significantly. This stiffness increase may be explained by a temperature-dependent material expansion.

**Fig. 4** Friction torque τf in dependence of load torque τext frequency and temperature *T*

#### **5 Compensation Method**

The control-based hysteresis compensation of robot joints is an approach to meet the ever-increasing accuracy requirements in industrial robotics. In this case, cost-effective approaches that do not require additional gearbox output sensors are advantageous. In the following, we first propose a model based on the investigation above. Thereafter, we present a compensation approach without gearbox output sensors and validate it on the experimental setup.

#### **5.1 Modeling**

The proposed model originates from the flexible joint model according to Spong [2]. However, this model is supplemented by a temperature-dependent hysteresis spring as well as a velocity-, load-, and temperature-dependent friction. This leads to the dynamics of a single robot joint

$$
\ddot{q} = M^{-1} \left[ \tau_{\text{g}} + \tau_{\text{ext}} \right], \qquad \ddot{\theta} = J_{\text{m}}^{-1} \left[ \tau_{\text{m}} - \tau_{\text{f}} - u^{-1} \tau_{\text{g}} \right], \tag{1}
$$

with the motor position θ and joint position *q*, the gear ratio *u*, the joint-side inertia *M*, the motor inertia *J*<sub>m</sub>, the motor torque τ<sub>m</sub>, external load τ<sub>ext</sub>, friction τ<sub>f</sub> and joint torque τ<sub>g</sub>.

For the temperature-dependent hysteresis spring we adopt a Bouc-Wen model based on our previous work [17]. This model, which is rate-independent, reads as follows:

$$\tau_{\text{g}} = w k \phi + (1 - w) k x, \qquad \dot{x} = \dot{\phi} - \beta \left| \dot{\phi} \right| |x|^{n-1} x - \gamma \dot{\phi} |x|^{n}, \tag{2}$$

with the nonlinear, temperature-dependent stiffness

$$k(T) = k\_0(T) + k\_1(T)|\phi| + k\_2(T)|\phi|^\beta,\tag{3}$$

the torsion φ = *u*<sup>−1</sup>θ − *q*, the weighting factor 0 < *w* < 1, the internal state *x*, the shape parameters γ, β, *n* and the temperature *T*. Due to the nonlinear behavior of the model, the identification is performed using particle swarm optimization. Extending our previous work [17], a temperature-dependent stiffness (3) is considered to account for the observed temperature behavior. To this end, the identification procedure is repeated at 35 and 50 °C, with only the stiffness parameters included as free model parameters. Subsequently, a second-order temperature-dependent polynomial is fitted separately for each stiffness parameter *k*<sub>0</sub>, *k*<sub>1</sub>, *k*<sub>2</sub> by minimizing the mean squared error (MSE) using the temperature-independent parameter estimates.
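The Bouc-Wen spring of Eq. (2) can be evaluated numerically as sketched below. This is a simplified illustration and not the identified model: the stiffness *k* is taken as a constant instead of the temperature- and torsion-dependent expression (3), all parameter values used with it are hypothetical, and the function name `simulate_bouc_wen` is made up for this example. Multiplying the state equation by the time step gives an increment form in which only the torsion increments appear, which makes the rate independence explicit.

```python
from typing import List, Sequence


def simulate_bouc_wen(
    phi: Sequence[float], k: float, w: float, beta: float, gamma: float, n: float
) -> List[float]:
    """Joint torque for a torsion sequence using Eq. (2) with explicit Euler steps.

    Assumptions: constant stiffness k, internal state x starting at zero, n >= 1.
    """
    x = 0.0
    prev = phi[0]
    tau: List[float] = []
    for p in phi:
        dphi = p - prev
        # Increment form of Eq. (2): time cancels, only the torsion path matters.
        dx = dphi - beta * abs(dphi) * abs(x) ** (n - 1.0) * x - gamma * dphi * abs(x) ** n
        x += dx
        prev = p
        tau.append(w * k * p + (1.0 - w) * k * x)
    return tau
```

Feeding the same torsion path with a finer sampling yields nearly the same torque values, and the torque at zero torsion differs in sign between loading and unloading, which is the hysteresis loop.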

To account for the load and temperature dependence of the friction, we assume, following [9], that the temperature and load friction effects

$$\tau_{\text{f}}(\dot{\theta}, T, \tau_{\text{g}}) = \tau_{\text{f,T}}(\dot{\theta}, T) + \tau_{\text{f,l}}(\dot{\theta}, \tau_{\text{g}}) \tag{4}$$

are independent. For the load-dependent friction τ<sub>f,l</sub> we apply a 2-D lookup table as proposed in [10]. Regarding the temperature-dependent friction τ<sub>f,T</sub>, we adopt a LuGre model [18]

$$\tau_{\text{f,T}} = \sigma_0 z + \sigma_1 \exp\left(-\left(\dot{\theta}/v_{\text{d}}\right)^2\right) \dot{z} + F_{\text{v}}(T)\dot{\theta}, \qquad \dot{z} = \dot{\theta} - \sigma_0 \frac{|\dot{\theta}|}{g(\dot{\theta})} z, \tag{5}$$

with the temperature-dependent Stribeck curve

$$g(\dot{\theta}) = F_{\text{c}}(T) + \left(F_{\text{s}}(T) - F_{\text{c}}(T)\right) \exp\left(-\left|\dot{\theta}/v_{\text{s}}(T)\right|^{\delta}\right),\tag{6}$$

the Coulomb *F*<sub>c</sub>, viscous *F*<sub>v</sub> and static *F*<sub>s</sub> friction coefficients, the bristle stiffness σ<sub>0</sub> and damping σ<sub>1</sub>, the shaping factor δ and the Stribeck velocity *v*<sub>s</sub>. The identification is done in a two-step process. First, the static friction parameters (*F*<sub>c</sub>, *F*<sub>v</sub>, *F*<sub>s</sub>, δ, *v*<sub>s</sub>) are obtained using the Levenberg-Marquardt algorithm to minimize the MSE between a classical Stribeck model and the measurement (cf. Fig. 3) at each temperature. Second, for each of the parameters separately, a second-order temperature-dependent polynomial is fitted in the same way as for the hysteresis behavior. Subsequently, the dynamic parameters σ<sub>0</sub>, σ<sub>1</sub>, *v*<sub>d</sub> are identified employing a particle swarm optimization.
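The static part of the friction model, i.e. the Stribeck curve (6) with second-order temperature polynomials for its parameters, can be sketched as follows. The coefficient values in `EXAMPLE` are purely illustrative and not identified from the measurements; they are only chosen so that static and Coulomb friction rise and viscous friction falls with temperature, as observed in Fig. 3. The names `poly2`, `StribeckParams`, `stribeck_g` and `static_friction` are hypothetical, and δ is treated as a constant here.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Coeffs = Tuple[float, float, float]  # c0, c1, c2 of c0 + c1*T + c2*T^2


def poly2(c: Coeffs, T: float) -> float:
    """Second-order polynomial in the temperature T."""
    return c[0] + c[1] * T + c[2] * T * T


@dataclass(frozen=True)
class StribeckParams:
    Fc: Coeffs    # Coulomb friction
    Fs: Coeffs    # static friction
    Fv: Coeffs    # viscous friction
    vs: Coeffs    # Stribeck velocity
    delta: float  # shaping factor, assumed temperature-independent


def stribeck_g(theta_dot: float, T: float, p: StribeckParams) -> float:
    """Stribeck curve g of Eq. (6)."""
    Fc, Fs, vs = poly2(p.Fc, T), poly2(p.Fs, T), poly2(p.vs, T)
    return Fc + (Fs - Fc) * math.exp(-abs(theta_dot / vs) ** p.delta)


def static_friction(theta_dot: float, T: float, p: StribeckParams) -> float:
    """Classical static Stribeck friction: sign(v) * g(v) + Fv(T) * v."""
    if theta_dot == 0.0:
        return 0.0
    return math.copysign(stribeck_g(theta_dot, T, p), theta_dot) + poly2(p.Fv, T) * theta_dot


# Illustrative coefficients only (not identified values)
EXAMPLE = StribeckParams(
    Fc=(1.0, 0.01, 0.0), Fs=(1.5, 0.01, 0.0), Fv=(0.02, -1e-4, 0.0),
    vs=(5.0, 0.0, 0.0), delta=2.0,
)
```

With these example values the friction torque at low velocity is higher at 50 °C than at 20 °C, while at high velocity the ordering reverses because the viscous term dominates.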

#### **5.2 Compensation Scheme**

The proposed compensation scheme, shown in Fig. 5, is adapted from [8, 19]. The compensation is based on an inversion of the hysteresis, which requires the joint torque. Since the joint torque is not measured, a so-called generalized momentum observer, known from collision detection of robots [16], is used instead of a sensor. The starting point for the derivation is the generalized momentum *p* = *J*<sub>m</sub> · θ̇. Taking the time derivative yields

$$
\dot{p} = J_{\mathrm{m}} \cdot \ddot{\theta} = \tau_{\mathrm{m}} - \tau_{\mathrm{f}} - u^{-1} \tau_{\mathrm{g}} \tag{7}
$$

and the residual

$$r = K_{\mathrm{o}} \left[ \int \dot{p} \, \mathrm{d}t - p \right] = K_{\mathrm{o}} \left[ \int (\tau_{\mathrm{m}} - \tau_{\mathrm{f}} - r) \, \mathrm{d}t - p \right] \tag{8}$$

of the generalized momentum. This residual equals the estimate of the joint torque, which becomes obvious by taking its time derivative

$$\dot{r} = K_{\mathrm{o}} \left[ \tau_{\mathrm{m}} - \tau_{\mathrm{f}} - r - \dot{p} \right] = K_{\mathrm{o}} \left[ -r + u^{-1} \tau_{\mathrm{g}} \right] \tag{9}$$

and transforming it into the Laplace domain

$$\lim_{K_{\mathrm{o}} \to \infty} \frac{r}{u^{-1} \tau_{\mathrm{g}}} = \lim_{K_{\mathrm{o}} \to \infty} \frac{K_{\mathrm{o}}}{s + K_{\mathrm{o}}} = 1. \tag{10}$$
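The first-order convergence of the residual towards the motor-side joint torque can be checked numerically. The sketch below is a discrete-time illustration under assumed values: the inertia, gear ratio, viscous friction and torque levels are hypothetical, and explicit Euler integration is used for both the plant and the observer.

```python
def simulate_momentum_observer(K_o=100.0, dt=1e-3, steps=2000):
    """Return (residual r, motor-side joint torque tau_g / u) after the simulation."""
    J_m = 0.01    # motor inertia (hypothetical)
    u = 100.0     # gear ratio (hypothetical)
    tau_g = 50.0  # constant joint torque (hypothetical)
    tau_m = 1.0   # constant motor torque (hypothetical)
    f_v = 0.2     # viscous friction stand-in for tau_f (hypothetical)

    theta_dot = 0.0  # motor velocity
    integral = 0.0   # integral of (tau_m - tau_f - r), cf. Eq. (8)
    r = 0.0          # residual
    for _ in range(steps):
        tau_f = f_v * theta_dot
        # Observer side: integrate the modeled momentum derivative.
        integral += (tau_m - tau_f - r) * dt
        # Plant side: J_m * theta_ddot = tau_m - tau_f - tau_g / u, cf. Eq. (7).
        theta_dot += (tau_m - tau_f - tau_g / u) / J_m * dt
        p = J_m * theta_dot
        r = K_o * (integral - p)
    return r, tau_g / u
```

In this discretization the residual obeys r[k+1] = r[k] + K_o·dt·(τ_g/u − r[k]), the sampled counterpart of (9), so a larger K_o speeds up convergence as long as K_o·dt stays well below 2.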

Subsequently, the hysteresis behavior (2) is inverted to obtain the estimated joint torsion

$$
\hat{\phi} = \frac{1}{wk} \cdot \left[ \frac{1}{u}\, r - (1 - w)\,k\,x \right]. \tag{11}
$$

Finally, the estimated torsion is added to the desired joint position *q*<sub>d</sub>.

It is known from [19] that residual oscillations may occur with this compensation scheme. To avoid this effect, a dead zone is applied to the position error; its width matches the noise of the estimated joint torsion φ̂ at standstill.
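A dead zone of this kind can be sketched as follows. The continuous (offset-subtracting) form and the choice of the width as the largest absolute standstill sample are assumptions for illustration; the text does not state which variant is used.

```python
def dead_zone(error, width):
    """Return 0 inside [-width, width] and the excess beyond the band outside it."""
    if abs(error) <= width:
        return 0.0
    return error - width if error > 0 else error + width


def width_from_standstill(torsion_samples):
    """Assumed rule: band width = largest absolute torsion estimate at standstill."""
    return max(abs(s) for s in torsion_samples)


# Hypothetical standstill noise of the estimated torsion in rad.
noise = [2.0e-5, -8.0e-5, 1.0e-4, -3.0e-5]
band = width_from_standstill(noise)
```

Errors within the noise band then produce no corrective action, which removes the excitation that would otherwise sustain the oscillation at standstill.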

Moreover, the compensation scheme needs an estimate of the friction. Extending [8, 19], we apply a LuGre observer instead of a static Stribeck model. The observer

$$
\hat{\tau}_{\mathrm{f,T}} = \sigma_0 \hat{z} + \sigma_1 \exp\left(-(\dot{\theta}/v_{\mathrm{d}})^2\right) \dot{\hat{z}} + F_{\mathrm{v}} \dot{\theta} \tag{12a}
$$

$$\dot{\hat{z}} = \dot{\theta} - \sigma_0 \frac{|\dot{\theta}|}{g(\dot{\theta})}\, \hat{z} + k_{\mathrm{f}} \left( \tau_{\mathrm{m}} - J_{\mathrm{m}} \ddot{\theta}_{\mathrm{d}} - r - (\hat{\tau}_{\mathrm{f,T}} + \hat{\tau}_{\mathrm{f,l}}) \right), \tag{12b}$$

with the observer gain *k*<sub>f</sub>, is adapted from [20] and extended by the previously modeled temperature and load dependence. Because the sensor resolution is insufficient, we use the desired acceleration θ̈<sub>d</sub> instead of the measured motor acceleration θ̈.

#### **5.3 Experimental Validation**

The experimental validation is performed on the test bench of Fig. 1 by applying a point-to-point trajectory of the desired joint position *q*<sub>d</sub> with a trapezoidal acceleration profile. Simultaneously, a sinusoidal load torque *τ*<sub>ext</sub> with an amplitude of 3 kNm and a frequency of 1/16 Hz, imitating a gravity-induced force, is applied. The experiment is conducted at gearbox lubricant temperatures of *T* = 20 and 35 °C. Figure 6 shows the desired joint position *q*<sub>d</sub>, the load torque *τ*<sub>ext</sub> and the tracking errors *e*<sub>q</sub> = *q*<sub>d</sub> − *q* of the experiment. The presented tracking errors correspond to the scenarios without compensation (*e*<sub>q</sub>), with compensation according to [8] (*e*<sub>c</sub>) and with the proposed compensation scheme (*e*<sub>e</sub>). With the compensation according to [8], the *L*<sup>1</sup> norm of the tracking error at *T* = 20 °C is reduced by 81% from 13.8 mrad to 2.64 mrad. However, oscillations are evident at standstill. Using the proposed compensation, the *L*<sup>1</sup> norm is reduced by another 56% to 1.17 mrad, and the oscillations are avoided. At *T* = 35 °C, the proposed scheme reduces the tracking error to a larger extent than the compensation according to [8] because of the modeled temperature dependence.

**Fig. 6** Tracking experiment without (black), with compensation according to [8] (blue) and the proposed compensation (yellow) at *T* = 20 and 35 °C
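The reductions reported for *T* = 20 °C can be checked directly from the stated *L*<sup>1</sup> norms. The third value, the overall reduction relative to the uncompensated case, is not stated in the text and is derived here from the same numbers:

$$1 - \frac{2.64}{13.8} \approx 0.81, \qquad 1 - \frac{1.17}{2.64} \approx 0.56, \qquad 1 - \frac{1.17}{13.8} \approx 0.92.$$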

#### **6 Conclusions**

In this paper, an experimental investigation of the friction and hysteresis behavior of cycloidal drives was presented. The investigation revealed a significant load and temperature dependence of the friction. However, the hysteresis is rate-independent, and there is a low, temperature-dependent increase in stiffness. Therefore, the results indicate a great similarity between HDs and CDs regarding the friction and hysteresis behavior. Moreover, a compensation approach with an extended friction model was proposed, which improves the trajectory tracking performance compared to the state of the art.

In the future, the temperature sensor may be replaced by an observer. In addition, the approach should be validated on a six-degree-of-freedom manipulator in a practical application such as milling.

**Acknowledgements** This work was supported by the German Research Foundation within the project No: 443677015.


**Ergonomics**

### **LiDAR-based Real-Time Measurement and Control of Shoulder Torque—Preview on an Experimental Approach**

Max Herrmann, Christoph Ebenhoch, Jens von der Wense and Robert Weidner

#### **Abstract**

A concept is given of how the load imposed by an exoskeleton on the upper arm affects shoulder torque, using a mechanical mock-up of the shoulder-arm-system and a serial kinematic robot. System identification methods for linear surrogate models of the human shoulder-arm-system and their embedding in control loops are introduced. Early measurements of a novel multi-sensor LiDAR system for real-time capture of human motion are presented, and their implications are discussed. The experimental setup is used for direct shoulder torque readings and control.

#### **Keywords**

Shoulder-arm biomechanics • Experimental testing • System identification • Linear surrogate models • Control • LiDAR

M. Herrmann (✉)

Helmut Schmidt University, Department of Mechanical Engineering, Laboratory of Manufacturing Technology, Hamburg, Germany e-mail: max.herrmann@hsu-hh.de

C. Ebenhoch · R. Weidner

University of Innsbruck, Institute of Mechatronics, Chair of Production Technology, Innsbruck, Austria

J. von der Wense

Department of Mechanical Engineering, Central Workshop Construction, Helmut Schmidt University, Hamburg, Germany

**Fig. 1** Different layers of simulation. *(1)*: Base case of a test person *(a)* wearing an exoskeleton *(c)*. *(2)*: The wearable is replaced and simulated by a collaborative, serial kinematic robot *(b)*. *(3)*: The shoulder-arm-system is mimicked by a servo drive with double pendulum attached *(d)*

#### **1 Introduction**

Exoskeletons are wearable devices supporting tasks commonly found in industrial applications [1]. They modify the wearer's internal load distribution by means of active [2] or passive [5] elements, or by a combination of both [7]. When aiming for a defined reduction of joint torques and forces, the question arises of how to reliably determine these quantities. Joint forces and torques result from the sum of the individual muscle forces and their lever arms acting on the segments, and they cannot be measured in vivo.

Our proposed method to determine and manipulate internal load distributions with the aid of exoskeletons is to replace the human wearing an exoskeleton with a mechanical mock-up of a simplified shoulder-arm-system, restricted to planar movement. Additionally, the effect of the exoskeleton is simulated with a collaborative robot (cobot) that is mechanically coupled to the upper arm of the analog simulator and imposes forces to achieve a control goal, such as constant shoulder torque over time while carrying out a pre-defined task (see Fig. 1). The advantage of this approach is that it enables direct readings of the torque and forces acting on the shoulder via appropriate sensors and by measuring motor currents. The cobot, used as a substitute for the actual exoskeleton, allows for distinct control inputs, thus simulating the support of the exoskeleton.

#### **2 Related Work**

Lower-dimensional surrogate models for the prediction and feature extraction of human motion data are an actively researched area in which principal component analysis (PCA), neural networks, and statistical methods are among the most popular approaches [6, 8, 9]. Gaussian process latent variable models (GPLVM), also regarded as probabilistic nonlinear PCA, have been used by Marin [13] to create a low-dimensional surrogate model embedded in an optimization problem that minimizes ergonomic scores of drilling tasks. Methods based on dynamic mode decomposition (DMD) have not yet become established for analyzing and predicting human motion. The work of Enes [12] isolates the reason for this and introduces a delay-embedded DMD algorithm to remedy the drawbacks of exact DMD. Patil [4] fused LiDAR and inertial measurement unit (IMU) sensor data to track human motion in real time. In [25], a motion-controlled mechanical mock-up of the shoulder joint is introduced that exhibits a rotational degree of freedom of the scapula.

#### **3 Mechanical Mock-up**

The mechanical mock-up of the shoulder-arm-system comprises a gearless servo drive, a double pendulum attached to its shaft, and sensors for force and angular readings. The upper arm and forearm are made of milled aluminum parts and reflect the mass and dimensions of their human counterparts. At the hand position, additional mass may be mounted for different load scenarios. Rotary encoders for absolute angular measurements are integrated into the servo drive and mounted to the (elbow) joint connecting upper arm and forearm. The muscles are modeled as McKibben fluidic muscles, i.e. fiber-reinforced elastomers that contract when pneumatically pressurized [17]. The muscles' insertion points are 50 mm from the elbow joint center for the biceps and 25 mm from the elbow joint center for the triceps. Table 1 lists the components used and their specifications. For the mechanical simulation of the impact of an exoskeleton, a collaborative serial kinematic robot is used. It introduces a pressure force via a link to the shoulder-arm-system test stand (Fig. 2).




**Fig. 2** Mechanical mock-up of the shoulder-arm-system comprising servo drive *(a)*, upper arm *(b)*, forearm *(c)*, biceps *(f)*, triceps *(g)*, rotary encoder at the elbow joint *(e)*, force sensor mounted in the base plate *(h)*, and additional mass at hand position *(d)*

#### **4 Model**

The mathematical model that describes the behavior of the planar shoulder-arm-system is a Lagrangian description of the first kind of a double pendulum with lumped masses, as depicted in Fig. 3.

The governing equations are [22]:

$$m_{i}\ddot{x}_{i} = \sum_{j}^{n_{F}} F_{j} + \sum_{a}^{n_{c}} \lambda_{a} \frac{\partial f_{a}}{\partial x_{i}}, \qquad i = 1, \dots, 2N, \tag{1}$$

where *m* denotes mass, *x* is a Cartesian coordinate, and *F* is an applied force (gravitation, actuation, damping, support). *n*<sub>F</sub> is the number of forces, *n*<sub>c</sub> is the number of holonomic constraints, *λ* is a Lagrange multiplier, *f* is a holonomic constraint, and *N* is the number of mass points.

**Fig. 3** Double pendulum schematic and characteristics for modeling according to the Lagrangian of the first kind. (*x*<sub>1</sub>, *z*<sub>1</sub>) is the position of the lumped upper arm mass, (*x*<sub>2</sub>, *z*<sub>2</sub>) is the position of the lumped forearm mass, *ϕ*<sub>1</sub>, *ϕ*<sub>2</sub> are the respective angles enclosed with the *z* axis. *M*<sub>1</sub> is the torque introduced by the shoulder servo drive, *M*<sub>2</sub> is the torque introduced by the biceps/triceps pair about the elbow joint, *M*<sub>d1</sub>, *M*<sub>d2</sub> are the respective damping torques, proportional to angular velocity. *F*<sub>sup</sub> denotes the support vector imposed by the exoskeleton, *g* is gravitational acceleration, and *m*<sub>1</sub>, *m*<sub>2</sub>, *l*<sub>1</sub>, *l*<sub>2</sub> are the lumped masses and lengths of the limbs, respectively

For the integration of the differential equations, we use an implicit Runge-Kutta method, which has proven to be numerically more stable than explicit schemes. The chosen parameters are *m*<sub>1</sub> = 3 kg, *m*<sub>2</sub> = 2 kg, *l*<sub>1</sub> = *l*<sub>2</sub> = 0.3 m, *g* = 10 m/s<sup>2</sup>, *d*<sub>1</sub> = *d*<sub>2</sub> = 0.4.
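A minimal simulation with the stated parameters might look as follows. Several simplifications are assumptions of this sketch rather than the authors' implementation: it uses minimal coordinates (the two angles measured from the downward vertical) instead of the constraint-based formulation of the first kind, the implicit midpoint rule as the simplest one-stage implicit Runge-Kutta scheme, damping torques at shoulder and elbow proportional to the respective joint velocity, and no drive, muscle or support inputs.

```python
import math

M1, M2 = 3.0, 2.0    # lumped masses [kg]
L1, L2 = 0.3, 0.3    # limb lengths [m]
G = 10.0             # gravitational acceleration [m/s^2]
D1, D2 = 0.4, 0.4    # damping coefficients (units assumed: N m s/rad)


def derivatives(state):
    """Right-hand side for state = [phi1, phi2, omega1, omega2]."""
    phi1, phi2, w1, w2 = state
    delta = phi1 - phi2
    m11 = (M1 + M2) * L1 ** 2
    m12 = M2 * L1 * L2 * math.cos(delta)
    m22 = M2 * L2 ** 2
    # Generalized damping torques: shoulder (absolute) and elbow (relative velocity).
    q1 = -D1 * w1 + D2 * (w2 - w1)
    q2 = -D2 * (w2 - w1)
    b1 = q1 - M2 * L1 * L2 * math.sin(delta) * w2 ** 2 - (M1 + M2) * G * L1 * math.sin(phi1)
    b2 = q2 + M2 * L1 * L2 * math.sin(delta) * w1 ** 2 - M2 * G * L2 * math.sin(phi2)
    det = m11 * m22 - m12 ** 2  # always positive for positive masses
    return [w1, w2, (m22 * b1 - m12 * b2) / det, (m11 * b2 - m12 * b1) / det]


def implicit_midpoint_step(state, h, iterations=20):
    """Solve y_new = y + h * f((y + y_new) / 2) by fixed-point iteration."""
    new = state[:]
    for _ in range(iterations):
        mid = [(s + n) / 2.0 for s, n in zip(state, new)]
        new = [s + h * f for s, f in zip(state, derivatives(mid))]
    return new


def energy(state):
    """Total mechanical energy (kinetic plus potential)."""
    phi1, phi2, w1, w2 = state
    kinetic = (0.5 * (M1 + M2) * L1 ** 2 * w1 ** 2 + 0.5 * M2 * L2 ** 2 * w2 ** 2
               + M2 * L1 * L2 * w1 * w2 * math.cos(phi1 - phi2))
    potential = -(M1 + M2) * G * L1 * math.cos(phi1) - M2 * G * L2 * math.cos(phi2)
    return kinetic + potential


def simulate(state, h=0.005, t_end=10.0):
    for _ in range(int(round(t_end / h))):
        state = implicit_midpoint_step(state, h)
    return state
```

Releasing the arm from rest at a small deflection, e.g. `simulate([0.5, 0.5, 0.0, 0.0])`, lets the damping dissipate energy, so the final energy lies below the initial one and above the resting value of the hanging arm.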

The model is used for testing control and identification algorithms before deploying the code on the test stand, and to get a qualitative understanding of the underlying dynamics and characteristics of the system.

#### **5 Surrogate Model**

Surrogate models are small scale approximations of full-scale descriptions of system dynamics. Their main purpose is to adequately estimate and predict the motion in phase space, usually in a given subset of possible states, limiting the application range and accuracy of the surrogate model.

In this article, we advocate the use of linear regression techniques, particularly the Hankel Alternative View of Koopman (HAVOK) [18], for two major reasons. Firstly, it preserves the physical meaning of the states, rendering the computational overhead of an observer obsolete. Secondly, the obtained linear discrete-time model integrates very well into the model predictive control (MPC) framework. Owing to the linearity of the surrogate model, it is computationally feasible and embeddable [16], even for optimization-based control strategies such as MPC.

The HAVOK method for deriving linear surrogate models of nonlinear systems from measurement data is, in its foundations, a time-delay embedding with a Koopman-theory-motivated linear propagation of right singular vectors over discrete time. It is closely related to the eigensystem realization algorithm (ERA) [20] and the more recent dynamic mode decomposition with delay (DMDd) [3] (Fig. 4).
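The core idea shared by these methods, a linear model acting on time-delayed copies of a measurement, can be illustrated without any linear algebra library. The sketch below is deliberately not HAVOK: it omits the Hankel matrix, its singular value decomposition and the rank truncation, and only fits a two-delay linear predictor by least squares. The test signal and all names are hypothetical.

```python
import math


def fit_delay_model(signal):
    """Least-squares fit of x[k+1] = a * x[k] + b * x[k-1] (2x2 normal equations)."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for k in range(1, len(signal) - 1):
        x0, x1, y = signal[k], signal[k - 1], signal[k + 1]
        s11 += x0 * x0
        s12 += x0 * x1
        s22 += x1 * x1
        r1 += x0 * y
        r2 += x1 * y
    det = s11 * s22 - s12 * s12
    a = (s22 * r1 - s12 * r2) / det
    b = (s11 * r2 - s12 * r1) / det
    return a, b


def predict(a, b, x_prev, x_curr, steps):
    """Propagate the identified linear model forward in discrete time."""
    out = []
    for _ in range(steps):
        x_prev, x_curr = x_curr, a * x_curr + b * x_prev
        out.append(x_curr)
    return out


# Hypothetical measurement: a sampled harmonic oscillation.
OMEGA, DT = 2.0 * math.pi * 0.5, 0.05
measurement = [math.sin(OMEGA * k * DT) for k in range(400)]
a_fit, b_fit = fit_delay_model(measurement)
```

For a pure sinusoid the exact recursion is x[k+1] = 2 cos(ω Δt) x[k] − x[k−1], so the fit recovers those two coefficients; measured, nonlinear data need more delays, which is where the truncated singular value decomposition of the Hankel matrix comes in.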

#### **6 Control**

Figure 5 shows the schematic of how to arrive at a controller based on a linear surrogate model of the identified input/output (i/o) behavior from an exoskeleton's support vector to shoulder torque. The procedure is divided into an open-loop and a closed-loop branch. The open-loop branch serves system identification. While the mechanical mock-up performs a predefined task under trajectory-tracking control, i.e. keeps its hand position on a motion path, we impose force perturbations on the upper arm and read the resulting shoulder torque. This input/output mapping is subsequently used by a linear-regression-based method to create a small linear surrogate model suitable for real-time control. For the closed-loop branch, we have chosen a model predictive control (MPC) strategy, as it seamlessly integrates the discrete-time linear model description obtained from the system identification part. Although MPC is optimization-based and therefore computationally expensive, it is still applicable for real-time control owing to the linear description of the model and performant algorithms optimized for embedded systems [16]. In contrast to frequency-domain methods, MPC control goals can be formulated explicitly as cost functions and state constraints on physical values. Additionally, the discrete-time setting aligns well with the cycle times used in threads of programmable logic controllers.

**Fig. 4** The Hankel Alternative View of Koopman (HAVOK) method [18] for creating linear surrogate models from measurements of nonlinear systems. *(a)* Time-shifted measurements are stacked into a Hankel matrix *H*, which is decomposed into its left singular vectors *U*, right singular vectors *V*, and singular values *S*. *(b)* Only the first *r* right singular vectors *Ṽ*, corresponding to the largest singular values, are stored; the remaining vectors are discarded. *(c)* Dynamic mode decomposition, a linear regression technique, is applied to truncated versions of *Ṽ*, denoted *X* and *X*′. *(d)* The best-fit matrix propagates the right singular vector *ṽ*<sub>k</sub> one time step. *(e)* From the singular value decomposition of the Hankel matrix we have *v* = *S*<sup>−1</sup>*U*<sup>T</sup>*H*. *(f)* The closed-form solution for the propagation of physical states in a time window of length *r* can be stated explicitly as a linear mapping of the truncated versions of *U*, *S*, and the best-fit matrix

**Fig. 5** Cascaded strategy of cobot trajectory control: In an open-loop system identification process (*blue*), the plant follows a given, periodic motion pattern. This movement is perturbed by force signals imposed on the plant, and the resulting shoulder torque is read. From the i/o data, a surrogate model is derived using linear regression techniques such as Hankel Alternative View of Koopman (HAVOK), Eigensystem Realization Algorithm (ERA), and Subspace Identification (SSI). This model (*A*, *B*) then forms the basis for a model predictive control algorithm that closes the loop between measured state *x*<sub>k</sub> and computed input *u*<sub>k</sub>
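How a discrete-time linear model enters a receding-horizon controller can be shown on a scalar example. This is a sketch under strong simplifications: the model x[k+1] = a·x[k] + b·u[k], the weights and the horizon are hypothetical, and constraints are left out, so the finite-horizon optimization has a closed-form solution through a backward Riccati recursion. With state or input constraints, a quadratic program would have to be solved in every cycle instead.

```python
def first_step_gain(a, b, q, r, horizon):
    """Backward Riccati recursion over the horizon; returns the gain of the first step."""
    p = q  # terminal weight, chosen equal to the stage weight
    gain = 0.0
    for _ in range(horizon):
        gain = (b * p * a) / (r + b * p * b)
        p = q + a * p * a - a * p * b * gain
    return gain


def run_receding_horizon(a, b, x0, steps, q=1.0, r=0.1, horizon=10):
    """Re-solve the horizon problem every cycle and apply only the first input."""
    x = x0
    history = [x]
    for _ in range(steps):
        u = -first_step_gain(a, b, q, r, horizon) * x  # computed input u_k
        x = a * x + b * u                              # plant update, measured state x_k
        history.append(x)
    return history


# Hypothetical, slightly unstable surrogate model.
trajectory = run_receding_horizon(a=1.05, b=0.1, x0=1.0, steps=50)
```

Because the model is time-invariant here, the gain is identical in every cycle; it is recomputed only to mirror the structure of MPC, where the optimization is repeated at each sampling instant with the newly measured state.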

#### **7 LiDAR Sensors**

For measuring the planar movement of the shoulder-arm-system, actuated by a serial kinematic robot, a LiDAR multi-sensor system, specifically developed for tracking human motion, is applied. It consists of eight Intel RealSense L515 time-of-flight sensors, each with a 30 Hz frame rate, a depth accuracy of approx. 5 mm, and an integrated RGB camera for color information. The sensors are spatially distributed to capture the scene from different angles, and their individual point clouds are registered into an integrated scan in a postprocessing step based on an extrinsic calibration. Wiring and components are depicted in Fig. 6. The challenge is sensor placement, which is a trade-off between minimizing occlusion effects due to shadowing and minimizing interference between individual sensors, a side effect of their active measurement principle. To account for the interference, the sensors are triggered with temporal delays. The main advantage, and inherent characteristic, of a LiDAR measurement system is its ability to collect surface information of the captured object, which contributes greatly to the classification and identification of movement patterns.
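Registering the individual point clouds with known extrinsics amounts to applying one rigid transformation per sensor and concatenating the results. The sketch below assumes that each extrinsic calibration is available as a rotation matrix and a translation vector that map sensor coordinates into a common frame; the sensor poses and the sample point are hypothetical, and no refinement step such as iterative closest point is included.

```python
import math


def rotation_z(angle_rad):
    """Rotation matrix about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def to_common_frame(points, rotation, translation):
    """Map sensor-frame points into the common frame: p_c = R * p_s + t."""
    return [
        tuple(sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
              for i in range(3))
        for p in points
    ]


def merge_scans(scans, extrinsics):
    """Concatenate all scans after transforming each with its own extrinsics."""
    merged = []
    for points, (rotation, translation) in zip(scans, extrinsics):
        merged.extend(to_common_frame(points, rotation, translation))
    return merged


# Two hypothetical sensors facing each other, 2 m apart, seeing the same point.
extrinsics = [
    (rotation_z(0.0), (0.0, 0.0, 0.0)),
    (rotation_z(math.pi), (2.0, 0.0, 0.0)),
]
scans = [[(1.0, 0.0, 0.5)], [(1.0, 0.0, 0.5)]]
merged = merge_scans(scans, extrinsics)
```

Both sensors report the point at the same local coordinates because they look at it from opposite sides; after the transformation the two measurements coincide in the common frame.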

**Fig. 6** Wiring and components of LiDAR multi-sensor system

#### **8 Results and Discussion**

The double pendulum system described in Sect. 4 was stabilized with a linear-quadratic regulator (LQR) by linearizing about an operating point with torques *M*<sub>1</sub> = −1.2 Nm, *M*<sub>2</sub> = 2.3 Nm, representing the lower hand position of the trajectory of the task of picking up a workpiece and mounting it overhead. Figure 7 shows the damping effect of the controller when exposed to inputs introduced by the supporting structure. The input signal is a normalized measurement of an XSENSOR pressure mat located at the load introduction area of the exoskeleton's arm shell. It was recorded over a full motion path while carrying out the same pick-and-mount task and integrated over the area [24]. The controlled shoulder-arm complex serves as a model for the real behavior of a human arm when exposed to external disturbances.

We planned to apply the HAVOK with control (HAVOKc) method, described in Sect. 5, to create a linear surrogate model for mapping the exoskeleton force input to shoulder torque, but so far we have not succeeded in implementing it. Python code for the model, input data, controller, and the HAVOKc attempt is available at https://bitbucket.org/maxherrmann/havok.

For the LiDAR system, the current state of development allows for capturing point clouds with four sensors measuring simultaneously. Figure 8 shows a sequence of images taken of a person taking a seat in a chair. The accuracy of the system has not yet been evaluated.

**Fig. 7** Simulation of forearm x position and phase plot of forearm (x, z) position. *Blue* represents the free dynamics of the forearm when exposed to small-signal support inputs; *orange* shows the oscillation-attenuated forearm dynamics using LQR control

**Fig. 8** Point cloud sequence of person taking a seat recorded by four mutually registered LiDAR sensors

#### **9 Summary and Outlook**

A mechanical twin of the human shoulder-arm-system, coupled to a serial kinematic robot that introduces a pressure force into the upper arm to support lifting, was set up to answer the question *"Can shoulder torque of a mechanical mock-up be controlled with an appropriately chosen support vector over time based on data-driven linear surrogate models and a LiDAR motion capture system?"*. The mechanical twin is a cybernetic arm, an analog simulator, equipped with rotary and translatory actuators and designed with the same dimensions and mass distribution as a human arm, mimicking its motion. It thus enables real-time reading of the torque in the shoulder and elbow joints and of the reaction forces measured by dedicated force sensors integrated in the fixed bearings of the motor. The collaborative robot simulates the impact of the exoskeleton on the upper arm via a mechanical coupling for pressure force transduction.

A concept of how a HAVOKc-based system identification can be carried out while the simulated shoulder-arm-system is moving on a trajectory-controlled periodic path is outlined. The resulting transfer function from introduced load at the upper arm to shoulder torque is obtained as a linear surrogate model. The surrogate model evaluates faster than the full model while preserving the dominant characteristics, and can thus be incorporated into a trajectory-tracking controller.

Simulations were carried out to validate the mechanical model and to test the performance of a linear-quadratic controller that stabilizes the hand position. For small-signal perturbations imposed by a load vector acting upon the forearm, the resulting oscillations observed at the uncontrolled arm were effectively attenuated.

We are introducing a novel multi-sensor LiDAR system, merging individual sensor measurements into an integral point cloud by mutually registering the data sets. In a subsequent post processing step, features, i.e. segment positions and orientations, are extracted and used as reference signals for downstream control loops.

All the mentioned test stands, sensors, and algorithms are still in the development phase, so this article sketches an outline and a concept of the investigations to come.

In future research, the mechanical mock-up can be replaced by a mapping from kinematics to kinetics. This is usually accomplished by feeding human motion data into a musculoskeletal model and computing the internal load state of a human by means of inverse kinematics and inverse dynamics [10, 11]. Since this approach is computationally expensive and infeasible for real-time control, surrogate models might be a suitable measure for addressing this problem as well.

**Acknowledgements** This research is funded by dtec.bw—Digitalization and Technology Research Center of the Bundeswehr which we gratefully acknowledge [project EVO-MTI].


### **Laboratory-Based Evaluation of Exoskeletons in an Overhead Assembly Task**

Lennart Ralfs, Tobias Peck and Robert Weidner

#### **Abstract**

In recent years, the number of industrial exoskeletons has significantly increased. As a large share of assembly tasks still requires manual work, exoskeletons may help support users and thus reduce physical strain on the human musculoskeletal system. However, empirical evidence of the potential relieving effects of exoskeletons on the human body is still lacking, and they are therefore not yet widely deployed in industrial applications. To investigate the impacts of exoskeletons and promote their future adoption in industry, industrial settings are increasingly modeled as different test scenarios in a laboratory environment. Within this frame, this paper presents a study (n = 4) investigating the effects of both an exemplary passive and an exemplary active exoskeleton in an overhead screwing task. The qualitative and quantitative analysis by means of a questionnaire study and electromyographic investigations reveals significant support potential of exoskeletons for users in assembly tasks.

#### **Keywords**

Exoskeleton · Overhead assembly · Ergonomic assessment · Physical support · Future workplace · Human-Machine interaction

L. Ralfs (\*) · T. Peck · R. Weidner

Institute of Mechatronics, Chair of Production Technology, University of Innsbruck, Innsbruck, Austria e-mail: Lennart.Ralfs@uibk.ac.at

T. Peck e-mail: Tobias.Peck@student.uibk.ac.at

#### **1 Introduction**

Despite the increasing trend toward automation and Industry 4.0 in production systems [1, 2], human operators will remain a central player and factor in industrial factories [1, 3, 4]. It is expected that future-proof jobs in production will be characterized by human–machine interaction [1, 5, 6], hybrid systems consisting of human and robotic operators [2], and a paradigm shift from task-centric to human-centric workplaces [3, 7]. Due to the remaining share of manual work, workers will continue to be exposed to the risk of suffering from musculoskeletal disorders (MSDs), which are the most common reason for sick leave in industrial occupations [8–10]. Concerning assembly tasks, working in particularly stressful and non-ergonomic postures as well as repetitive work processes, such as overhead work, are decisive risk factors for upper extremity MSDs [8–11] and stress the increasing importance of their prevention and an ergonomic work design [5, 7–10]. Support systems such as exoskeletons are one possible remedy, with the potential to relieve users during the execution of their work [3, 4, 6]. However, exoskeletons are not yet widely used in industry, as evidence of the relief effects of exoskeletons, especially in the long term, is scarce [4, 12]. In terms of exoskeletons supporting overhead work, the literature predominantly describes studies with passive exoskeletons since no active shoulder-supporting exoskeletons are currently available on the market. The article starts at this point and presents a laboratory test setup of an overhead assembly task, which allows a combined subjective and objective assessment for both an exemplary passive and an exemplary active exoskeleton. Thus, it operationalizes a station from a test course for industrial exoskeletons [13] and enables a pre-study on the exoskeleton's contributions to user support and future ergonomic workplaces.

#### **2 State of the Art**

For the multicriteria evaluation of the supportive effects of exoskeletons, a multitude of criteria and methods are applied, which are suitable for evaluating different support scenarios to varying degrees but have mainly been applied in laboratory environments so far [4, 14]. Depending on the desired focus, subjective (e.g., Borg scale, observations) and objective (e.g., electromyography, motion capture) evaluation methods are capable of delivering results, of which the examination of the physical relief by means of electromyography (EMG) is the most frequently used method [14]. However, a comprehensive evaluation includes complementary subjective and objective measurement methods [4, 11, 12, 14–17].

R. Weidner

Laboratory of Manufacturing Technology, Helmut-Schmidt-University/University of the Federal Armed Forces Hamburg, Hamburg, Germany

e-mail: Robert.Weidner@uibk.ac.at; Robert.Weidner@hsu-hh.de

Different foci of investigation are set in laboratory settings, relating to the study of either singular tasks at workstations or more complex processes at integrated workplaces or in test courses [13]. A considerable number of laboratory studies have already been conducted to measure muscle activity during overhead assembly tasks. Thus, the unloading effect of exoskeletons during drilling in different directions, force application points, and body postures has already been investigated [18, 19]. In other articles, subjective criteria are examined in addition to muscle activity. Objective measurements during overhead tasks such as drilling, riveting, grinding, or lifting heavy objects are supplemented by surveys on, e.g., perceived discomfort and sense of stress [11, 12, 15, 16]. Overhead assembly tasks are also studied in industrial environments using EMG and questionnaire studies or Borg scales [12, 17].

However, almost exclusively passive shoulder-supporting exoskeletons have been investigated in previous studies. Therefore, a novel aspect of this work is the comparative evaluation of the suitability of both an active and a passive system concerning the support effect for an exemplary application scenario.

#### **3 Materials and Methods**

For evaluating the support effect of exoskeletons, a characteristic overhead assembly task was considered, which the subjects performed with and without exoskeleton support. The test setup followed a proposed approach for laboratory-based modeling of industry-related tasks [13].

#### **3.1 Study Participants**

The study population comprised four right-handed male volunteers, all of whom were physically healthy and reported no current shoulder pain. The subjects were between 21 and 24 years old (mean: 22.5 years), between 174 and 190 cm tall (mean: 180 cm), and weighed between 65 and 83 kg (mean: 75.3 kg).

#### **3.2 Test Setup**

The task consisted of setting and fastening two bolts side by side in a wooden beam (mounted at a reference height of 2.1 m), using an electric screwdriver with a mass of 2.55 kg. The start and end poses of the task were identical, with the screwdriver held in an angled arm position without the tool in reach. During the execution, the two bolts were first set and then fixed in the wooden beam by a vertical upward movement of the arm. The subjects were not given any specific instructions regarding the speed at which to perform the task. However, the execution of the screwing process should be similar in all runs. For high comparability between runs and subjects, the task was performed in a standardized manner. Accordingly, the investigation focused on the screwing as the core activity and excluded, e.g., the gripping of the screwdriver. In addition, the mounting position of the beam was individually adjusted to the subject's height, allowing subjects to consistently perform the task in an upright posture and guide the screwdriver with the dominant hand while the non-dominant hand set the bolts. Furthermore, the lower and upper arms were at right angles to each other during the screwing. Each subject performed the screwing in three scenarios: (A) without exoskeleton support, (B) with support by a passive exoskeleton, and (C) with support by an active exoskeleton. Figure 1 illustrates an excerpt from the task showing the exact pose in the baseline (left) and supported (right) scenario.

**Fig. 1** Assembly task for baseline (left) and supported scenario with Lucy 2.0 (right)

#### **3.3 Used Exoskeletons**

A passive (Skelex 360) and an active (Lucy 2.0) exoskeleton were used as examples to evaluate the support effect of exoskeletons for overhead assembly tasks. The passive exoskeleton Skelex 360 provides a supportive force when lifting the arms, thus counteracting the gravitational force acting on the arm [20]. Two carbon-fiber leaf springs generate the support and compensate for a weight of up to 3.5 kg per arm [20]. The maximum supporting torque equals six Nm [12]. Like Skelex 360, the active exoskeleton Lucy 2.0 mainly supports users performing tasks at or above head level [21]. The main difference lies in the generation of the supporting force. Lucy 2.0 uses rigid shoulder kinematics with inserted pneumatic actuators to create the support effect [21]. With this actuation principle, the level of support can be controlled continuously [21] to generate a maximum torque of approximately 8.5 Nm at an arm bending angle of 85 degrees [15]. Before performing the task with the exoskeletons, the subjects familiarized themselves with the systems.

#### **3.4 Applied Evaluation Methods**

To evaluate the support effects of both exoskeletons comprehensively, the assessment combines a questionnaire survey of the subjects with an electrophysiological measurement of muscular activity. In the closed questionnaire study, (1) the perceived exertion and (2) the perceived support effect provided by the exoskeletons were queried after performing the task. In contrast, EMG tracks the muscular activities of the medial deltoid (shoulder) and the erector spinae (back extensor) during the execution of the task. EMG uses surface electrodes and measures electrical signals in the microvolt range emitted by muscle cells [22]. The EMG sensors were placed on the muscles according to the SENIAM guidelines and in the fiber direction. Wireless surface EMG (Myon, Aktos, 960 Hz) was used during the studies.

#### **3.5 Data Acquisition and Processing**

Before performing the task, the maximum voluntary contraction (MVC) was measured for each subject to determine his peak muscular activity for the later analysis [23]. These MVC measurements formed the basis for the subsequent normalization of the data. Afterward, the muscle signals were recorded during the execution of the task at a frequency of 1000 Hz. However, these raw signals are not sufficient for evaluating the effectiveness of exoskeletons. Obtaining meaningful results requires transforming the EMG amplitude to a relative scale (% MVC) [23, 24]. This requires a four-step procedure: (a) rectification and filtering of the raw signal (to generate positive and filtered signals), (b) MVC normalization (to eliminate technical, anatomical, and physiological influences and to better illustrate and compare stress levels), (c) activity separation (to cut the relevant activity sequences from the entire signal), and (d) time normalization (to tailor and relativize task durations between subjects) [23, 24]. Statistical parametric mapping (SPM) [25] was used to analyze and interpret the EMG data. Within this framework, statistical methods tested hypotheses for region-specific effects [25] between the baseline scenario and the scenarios with exoskeleton support. A nonparametric, unpaired two-sample t-test checked the data for mean differences at a significance level of five percent. By comparing the scenarios with and without an exoskeleton, the effect on the muscular activities was investigated at each point in time. As a result, the movement sequences were determined in which the signals differed significantly and, thus, an effect of the exoskeleton existed. Because each of the four subjects screwed two bolts per scenario, the total data pool doubled to eight measurement sets.
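The four processing steps (a) to (d) can be illustrated with a minimal sketch. It assumes a moving-average filter for the smoothing step (the filter type is not specified here), known sample indices for the activity separation, and a resampling to 101 points for the time normalization; all function names and parameters are hypothetical and do not reproduce the processing chain actually used in the study.

```python
import numpy as np

def rectify_and_smooth(raw, window):
    """Step (a): full-wave rectification followed by a moving-average filter.

    The moving average is an assumed stand-in for the unspecified filter.
    """
    rectified = np.abs(np.asarray(raw, dtype=float))
    kernel = np.ones(window) / window
    return np.convolve(rectified, kernel, mode="same")

def normalize_to_mvc(envelope, mvc_peak):
    """Step (b): express the envelope relative to the MVC peak, in % MVC."""
    return 100.0 * np.asarray(envelope, dtype=float) / mvc_peak

def cut_activity(signal, start, stop):
    """Step (c): keep only the samples belonging to the relevant activity."""
    return signal[start:stop]

def time_normalize(signal, n_points=101):
    """Step (d): resample the signal to 0-100 % of the task duration."""
    old_axis = np.linspace(0.0, 100.0, num=len(signal))
    new_axis = np.linspace(0.0, 100.0, num=n_points)
    return np.interp(new_axis, old_axis, signal)
```

After step (d), every trial has the same length, so trials of different duration can be stacked into one array and compared point by point.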

#### **4 Results**

This section describes the results of the studies conducted. First, the results of the questionnaire study are presented, followed by those of the EMG study.

#### **4.1 Results from Questionnaire Study**

The results from the questionnaire study on the (1) perceived exertion and (2) perceived support effect provided by the exoskeletons are illustrated using the Borg RPE scale (6–no exertion to 20–maximum exertion) [26] and a Likert scale (1–low to 5–high), respectively. The data are presented as boxplots to visualize the median and the spread of the ratings. Additionally, a dot within the boxplot indicates the mean value.

The first question evaluated the rate of perceived exertion (RPE). For this purpose, the subjects assessed their RPE for each of the three executed runs of the task. The left-hand chart in Fig. 2 shows the results of this survey. The three boxplots display the evaluation for the investigated scenarios (A) without exoskeleton support (left plot) as well as with support by (B) Skelex 360 (middle plot) and (C) Lucy 2.0 (right plot). For the baseline scenario, i.e., executing the task without exoskeleton support, a mean RPE value of 10.75 was determined. According to the Borg scale, this corresponds to a light perceived exertion [26]. Performing the task with the support of an exoskeleton resulted in a mean RPE of 8 (Skelex 360) and 7.5 (Lucy 2.0), respectively.

**Fig. 2** Results from study on perceived exertion (left) and perceived support effect (right)

These ratings each correspond to a level of effort perceived as extremely light [26]. However, the width of the boxplots illustrates a broader distribution in the subjects' assessments of exoskeletal support compared to the baseline scenario. Accordingly, there is a higher divergence in evaluating (B) and (C). Nevertheless, the mean RPE value differs notably between the supported scenarios and the non-supported scenario.

The second question evaluated whether the subjects felt a supportive effect of using Skelex 360 and Lucy 2.0. The right-hand chart in Fig. 2 shows the results of this survey. The perceived supportiveness of Skelex 360 (with a mean of 4.5) and Lucy 2.0 (4.75) was rated as high for both exoskeletons. Accordingly, the subjects' ratings indicated a perceived support effect of both Skelex 360 and Lucy 2.0.

For both (1) the perceived exertion and (2) the perceived support effect, the evaluations of the questionnaire study indicate a support effect by Skelex 360 and Lucy 2.0. However, since the results so far are only based on the subjective assessment, the results of an additional objective measurement of muscle relief are described below.

#### **4.2 Results from EMG Study**

This section describes the analysis and evaluation of the EMG investigation. Figure 3 shows the results of evaluating the passive exoskeleton Skelex 360 compared to the baseline scenario. As the lower graph of Fig. 3 shows, the t-value exceeds the reference value from 45% to 78% and from 86% to 94% of the task time at the given significance level (p-value = 0.014). In the context of the task, this means the subjects were supported during large time fractions in the second half of the task execution, in which they screwed overhead with the dominant hand.

**Fig. 3** Analysis of the support effect for Skelex 360 in terms of significance (lower graph) and reduction of activity for medial deltoid muscle (upper graph)

Moreover, the peak in significance around 90% of the task time is striking, where the bolt was sunk into the wooden beam with a slightly increased applied force. Accordingly, the analysis detects significant support for the deltoid muscle by Skelex 360 in the named ranges. On this basis, the curves of the relative muscular activity (in % MVC), shown in the upper graph of the figure, can now be interpreted. For the significant time portion of the support, the relative muscular activity while using Skelex 360 equaled 15% MVC over most of the task execution. Its use resulted in a muscular relief for the medial deltoid of 10.8 percentage points relative to the MVC measurement. Correspondingly, using Skelex 360 revealed a maximum unloading effect of 40.6% during the task fraction of overhead screwing. For the other task fractions, there was no significance according to SPM. Consequently, the curves do not permit any meaningful interpretation there. The same applies to the erector spinae muscle, for which no significant support resulted over the entire course of the task.
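How significant time ranges such as those reported for Skelex 360 can be read off a t-curve is sketched below. This is a simplified illustration, not the SPM procedure itself: it computes a pooled-variance two-sample t-statistic at each time point and takes the critical threshold `t_crit` as a given input, whereas SPM derives that threshold for the whole curve. The function names and the array layout (trials in rows, time-normalized samples in columns) are assumptions.

```python
import numpy as np

def pointwise_t(baseline, supported):
    """Unpaired two-sample t-statistic (pooled variance) at every time point.

    Both inputs have shape (n_trials, n_time_points).
    """
    a = np.asarray(baseline, dtype=float)
    b = np.asarray(supported, dtype=float)
    na, nb = a.shape[0], b.shape[0]
    pooled = ((na - 1) * a.var(axis=0, ddof=1)
              + (nb - 1) * b.var(axis=0, ddof=1)) / (na + nb - 2)
    return (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(pooled * (1 / na + 1 / nb))

def significant_intervals(t_curve, t_crit):
    """Return (first, last) index pairs of contiguous runs where t > t_crit."""
    above = np.asarray(t_curve) > t_crit
    # Pad with False so that runs touching either end still produce two edges.
    padded = np.concatenate(([False], above, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(s), int(e - 1)) for s, e in zip(edges[0::2], edges[1::2])]
```

If the signals were time-normalized to 101 points, the returned indices can be read directly as percentages of the task time.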

Similarly, Fig. 4 shows the analysis results with the active exoskeleton Lucy 2.0 compared to the baseline scenario. As the lower graph in the figure shows, the t-value exceeds the significance threshold over almost the entire task course (p-value = 0.014). Consequently, significant support by Lucy 2.0 was detected for the deltoid muscle over nearly the complete task execution (setting the bolts and screwing overhead), except for the last five percent of the time (lowering the dominant hand holding the screwdriver). Three sections of the movement sequence reached the highest significance: the beginning of elevating the arm to set the bolts, the (first) application of torque during the screwing, and the countersinking of the bolt in the beam. For the significant time portion, the relative muscular activity using Lucy 2.0 was 10% MVC in the first part of the task (setting bolts) and increased to 15% MVC in the second part (screwing overhead). Accordingly, the second part of the task required higher muscular activity. Overall, the use of Lucy 2.0 resulted in a relief of the medial deltoid muscle of 12.2 percentage points relative to the MVC measurement.

**Fig. 4** Analysis of the support effect for Lucy 2.0 in terms of significance (lower graph) and reduction of activity for medial deltoid muscle (upper graph)

The upper graph visualizes a maximum unloading effect of 49.6% for Lucy 2.0. As in the run with Skelex 360, there is no significant unloading of the erector spinae muscle.
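The EMG results are reported in two different units: a relief in percentage points of MVC and a maximum unloading effect in percent. The following sketch shows how the two measures differ, assuming the unloading effect is the reduction relative to the baseline activity. That reading, the function names, and the example values are illustrative assumptions and are not taken from the study data.

```python
def relief_in_points(baseline_pct_mvc, supported_pct_mvc):
    """Absolute relief: difference of two % MVC values, in percentage points."""
    return baseline_pct_mvc - supported_pct_mvc

def relative_unloading(baseline_pct_mvc, supported_pct_mvc):
    """Relative relief: reduction as a percentage of the baseline activity."""
    return 100.0 * (baseline_pct_mvc - supported_pct_mvc) / baseline_pct_mvc

# Hypothetical example: activity drops from 20 % MVC to 10 % MVC.
points = relief_in_points(20.0, 10.0)      # 10 percentage points
percent = relative_unloading(20.0, 10.0)   # 50 % relative to the baseline
```

The same absolute relief therefore corresponds to a larger relative unloading when the baseline activity is low.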

Consequently, the statistical analysis of this task supports the results of the subjective questioning and shows significant support potential regarding the reduction of muscle activity by using both exoskeletons.

#### **5 Discussion**

In this section, the results obtained in the study are abstracted regarding limitations in the study design and results, lessons learned, and implications on future workplaces.

#### **5.1 Limitations of Studies and Results**

First of all, it is crucial to stress that the results are based on the specific test design described in Sect. 3.2 and are only valid in this respect. Accordingly, the obtained results depend on the task, its execution by the respective subjects, and the exoskeletons used. Limitations in the test design include the standardization of the processing and the time required for test persons to become accustomed to using exoskeletons. Both factors influence the execution of the tasks and, thus, the reproducibility of the results, since individual movement behavior and longer familiarization with the exoskeletons might produce different results. In addition to these two aspects, the study was conducted with only four young, male subjects in good physical condition; this is one of several reasons why a larger sample is needed for improved evidence and higher informative value of the results. Regarding the evaluation of the measurement data, the processing and cutting of the measurement signals also play a role [23, 24]. All these aspects influence the validity and especially the reliability of the results.

#### **5.2 Lessons Learned from Studies**

The article indicates that it is not feasible to make a blanket statement about an (unlimited) support effect of an exoskeleton. The results must always be related to individual sections of motion and can only be evaluated against the task. Especially in the example of Skelex 360, the analysis of the results shows the relevance of dividing the complete task into single fractions. The same applies to analyzing the muscular unloading effect caused by exoskeletons. Even if the curve progressions show a different level in terms of relative strain, no meaningful interpretation is valid unless significance is proven. Besides, the relieving effects of exoskeletons can only be compared against each other if the support characteristics and torques induced by the exoskeletons are identical over the course of the angle. As a result, the study stresses the importance of equally considering subjective and objective criteria in the evaluation, as they can provide complementary results. Notwithstanding this, the results of the objective EMG investigation provide better empirical evidence than those of the questionnaire study.

#### **5.3 Implications on Future Workplaces**

The results support exoskeletons as an approach worth considering when designing sustainable and ergonomic industrial workplaces. Particularly against the background of trends such as human-machine interaction [1, 5, 6] and user-centric workplace design [3, 7], exoskeletons can contribute significantly to supporting employees while maintaining their flexibility in manual work processes [1]. Besides, support systems such as exoskeletons offer the opportunity to preserve human skills and abilities [27] and provide physical relief at the same time. Thus, using exoskeletons can constitute an attractive and human-oriented initiative to maintain employees' health.

#### **6 Conclusion and Outlook**

This article describes the modeling of an exemplary overhead assembly task in a laboratory environment and its execution in different test scenarios with and without exoskeleton support. The support effects of Skelex 360 and Lucy 2.0 were evaluated. Plans include expanding the studies to a larger collective of subjects, tasks, and exoskeletons. Additionally, it seems reasonable to investigate the effect of exoskeletons not only in terms of physical but also cognitive support. However, within the framework of this study, the article provides evidence that passive and active exoskeletons can lead to (objectively verifiable) muscular and (subjectively) perceived physical relief in separate movement sequences and tasks and, thus, can become a relevant element of future-oriented, ergonomic, and human-centric industrial workplaces.

**Acknowledgements** Special thanks go to Maité Calisti and Benjamin Reimeir for their profitable contribution of expertise to data collection and analysis during the conduct of the study.

#### **References**


industrial practice. Lessons learned and recommendations (2019). https://www.researchgate.net/publication/337102988\_Human-centered\_factories\_from\_theory\_to\_industrial\_practice\_Lessons\_learned\_and\_recommendations



### **Evaluation of the Ergonomic Potential of Adaptive Assembly Jigs for Large-Scale Products Using Human Modelling**

Sebastian Hogreve, Hannah Wallmeier and Kirsten Tracht

#### **Abstract**

The implementation of plug and fly principles for the assembly of high-lift systems requires the design of new work processes and jigs. Conventional jigs are inflexible and lack the capability to adapt to changes in product and process design. Adaptive jigs provide higher flexibility and allow workers to position the product in accordance with their personal needs. This paper presents a novel wheel-shaped adaptive jig and investigates its influence on the ergonomics of assembling a high-lift system. A CAD program is used to model the high-lift system, the adaptive jig and workers. Based on these models, virtual scenes are generated that can be assessed with the key indicator method. The results show that the adaptive jig design improves ergonomics by eliminating the riskiest tasks during the assembly process. The adaptive jig concept has a high potential for improving production processes because it can respond flexibly to changes in the assembly process as well as to changes in product design.

**Keywords**

Assembly · Ergonomics · Modelling

S. Hogreve (\*) · H. Wallmeier · K. Tracht Bremen, Germany e-mail: hogreve@bime.de

<sup>©</sup> The Author(s) 2023. T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_18

#### **1 Introduction**

The assembly of large-scale flight systems is an essential part of the value-added process in the aviation industry. Fuselages, wings, engines and other components are assembled by hand in cycle lines. In wing outfitting, the assembly of high-lift systems is traditionally done by mounting the individual components directly to the wing box. With a plug and fly assembly concept, the high-lift system can be pre-assembled, adjusted and tested as a stand-alone unit [1]. The ready-to-fly module is then joined to the wing in the final assembly line (FAL) with only a few joints. By outsourcing the assembly of the high-lift system, the cycle time in the wing outfitting can be shortened and the factory production rate increased. In addition, pre-assembly of the high-lift system can improve the ergonomics of assembly because the subassembly offers improved accessibility and can be moved to an ergonomically favourable position and orientation with less effort. For such a modular design of the wing, no reference concepts for the assembly of the high-lift system exist yet. Both the assembly organisation and the required operating equipment must be rethought.

The use of assembly jigs for precise and repeatable assembly of components is widespread in the aerospace industry [2]. They are required to ensure accurate joining operations during the assembly of large aircraft components such as wings and high-lift systems. The jigs must be rigid and precise, and must therefore be matched to the product and the assembly process in question. This results in inflexibility with respect to shape and dimensional changes of the product [3]. Innovative jigs are needed to ensure both high manufacturing accuracy and flexibility. They must be able to position large components easily and be flexible at the same time [3]. To combine high productivity and flexibility, assembly fixtures need a higher degree of automation. They also need to provide greater adaptability and improved interaction with workers [4]. Collaborative robots seem suitable for meeting these requirements. They can relieve humans and protect them from physical overload by taking over heavy and repetitive tasks [5].

The aim of the research work presented here is the development of an adaptive jig that offers a high degree of adaptability with regard to product as well as process changes. The jig should enable the entire assembly process of the high-lift system in one clamping. In addition, it should offer workers the possibility for individual adjustments of the working position in order to improve both physical and cognitive ergonomics. The concept of such an adaptive jig is presented in [6]. The adaptive jig uses collaborative robots to position the components to be assembled. Consideration of physical and cognitive ergonomics has gained importance in the design of systems where humans and robots collaborate [5]. Several research efforts focus on methods to simulate physical and cognitive ergonomics with models and evaluate the acceptance of the human–robot-collaboration before building the real device. Beuß et al. propose an ergonomics study based on a simulation and virtual reality [7]. They describe that the analysis with digital humans is possible. Fritzsche shows the high degree of agreement between real and virtual ergonomics assessments and thus proves their usefulness [8].

This paper investigates the influence of the adaptive jig on physical ergonomics. Using the assembly of a high-lift system as an example, the adaptive jig is compared with a rigid jig. Human modelling in a CAD system is used for a first evaluation of the ergonomic potential. The key indicator method is used to measure the risk of physical overload. The investigation will show whether the adaptive jig is suitable for assembly and whether it offers at least the same ergonomic potential as a rigid jig. Based on the results, a decision can be made for or against building a physical demonstrator and conducting real-world tests.

#### **2 Product and Jig Design**

#### **2.1 Assembly Object**

The investigated product is a high-lift system of a medium-range jet. However, only the assembly of the outboard landing flap with the associated supports is considered. Figure 1 shows a section with the main elements. The basic component is the aero flap support (AFS). It carries the moving components of the high-lift system and at the same time acts as an aerodynamic fairing [1]. The flap lever, actuator and landing flap are connected to the support by means of bolt connections. The high-lift unit is connected to the wing box at three points through the main and forward attachments. All bolt connections are designed to be fail-safe, i.e. they consist of two bolts slid into each other in opposite directions, which are fixed with lock nuts and locking plates.

**Fig. 1** High-lift system with bolt connectors [9]

Since each bolt connection consists of at least six components, depending on the design, the components are summarised in the following under the term assembly kit (AK). It is assumed that the components are provided ready for assembly at the assembly line. Manufacturing operations such as drilling, milling, deburring or surface treatment are not part of the consideration. For example, it is assumed that the main bridge and other metal brackets on the AFS are already joined when it arrives at the pre-assembly line.

To compensate for angular errors and reduce stresses, e.g. during thermal expansion, spherical bearings are integrated in all connection points. Until assembly is completed, all components can therefore move relative to each other in several degrees of freedom and must be supported and held in position by a fixture.

#### **2.2 Assembly Devices**

**Adaptive Jig.** Figure 2 shows a rough design of the adaptive assembly device. The positioning and orientation of the assembly parts are handled by industrial robots, which have suitable end effectors for clamping the components. The industrial robots are arranged on a circular seventh axis. Due to its shape, this adaptive assembly device is also called the assembly wheel [6]. The redundant kinematics give the robots an additional degree of freedom. This can be used to move the robots during assembly into a favourable position that interferes least with the workers' work process. In this way, accessibility to the assembly points can be increased.

The first robot carries the assembly while other robots feed the assembly components and position them for the joining process. Workers then assemble the bolt connectors manually. The robots are able to change the position of the assembly in space so that workers of different heights can comfortably work on the object. Furthermore, changing the orientation can prevent working overhead or while kneeling. To ensure safe operation, the robots must be equipped with functions for human-robot collaboration, such as force sensors and robot skin. During assembly, each support is initially equipped in a separate assembly wheel. Then the assembly wheels with the supports are brought together and the landing flap is added.

**Fig. 2** Adaptive Jig with support (left) and high-lift system (right) [6]

**Fig. 3** Rigid jig holds support and landing flap

**Rigid jig.** Since the Plug and Fly high-lift system is a completely new product for which no reference process exists so far, a rudimentary concept for a rigid jig had to be created for this study. Ergonomic requirements were considered in the design, just as they would be in an industrial design. It is assumed that the construction consists of a rigid frame of welded hollow sections. Functions for adjusting the height or orientation of the assembly are not integrated. In order to provide an approximately optimal working height for all workers, an average working height of 1100 mm was set.

The supports are inserted into a clamping device and fixed therein during assembly. The assembly is carried out in horizontal orientation, which corresponds to the flight orientation. The landing flap is assembled in the extended condition to improve accessibility to the joining points. Figure 3 shows a CAD representation of the concept for the rigid jig. Since only the general contour and the geometric arrangement of the assembly parts in the jig are relevant for determining the influence on an ergonomic working procedure, details such as the clamping devices were not designed.

#### **3 Process Design**

To evaluate the effects on ergonomics, the fixtures must be considered in the context of a work process. For the assembly of the high-lift system, an assembly sequence was determined experimentally in workshops [10]. Based on this assembly sequence, work steps were defined and an allocation of labor in the cycle line was determined.

#### **3.1 Determination of the Work Steps**

The key indicator method (KIM) [11] is used to compare the effects on ergonomics. It is used to evaluate the work processes both when using the adaptive jig and when using the rigid jig. The resulting risk values can then be compared. Depending on the type of load, different forms must be used in the KIM. Each of the assembly processes considered is therefore first broken down into work steps, each of which contains only operations of a uniform load type. For the assessment of the assembly of the high-lift system, the forms for the assessment of Lifting, Holding and Carrying of loads (LHC) as well as for the assessment of Manual Handling Operations (MHO) are sufficient. Methods-Time Measurement (MTM) was used to determine the working times for these work steps. Table 1 shows an overview of the work steps defined for the two work processes.

#### **3.2 Work Scheduling**

Assuming that 63 aircraft are to be produced per month and that 17 shifts of seven hours each are available per week, this results in a maximum cycle time of approximately 3.75 h per wing (i.e. outboard high-lift system). The actual assembly time per high-lift system must be shorter than the cycle time. The assembly time comprises the basic time, the recovery time and the distribution time. The basic time is formed by the sum of the MTM values of all work steps. The recovery time corresponds to legal requirements and the distribution time is estimated based on values from the Federal Ministry of the Interior and Community.

**Table 1** Work steps for the assembly process with adaptive jig

Since some work steps can only be performed by two people, at least two workers must be scheduled to perform the assembly. Detailed work planning shows that at least three people are required to complete all work steps in the required cycle time. This applies to assembly with the adaptive jig as well as to assembly with the rigid jig. For both assembly processes, a task allocation is carried out in which three people are equally occupied. The basic assembly time is then approximately 2.8 h per high-lift system. This leaves sufficient recovery and distribution time within the cycle time. The allocation of the tasks is included in the determination of the key indicators in the following chapter and is primarily represented there by the task durations and the number of repetitive movements.
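The cycle-time estimate in this section can be retraced with a short calculation. The sketch below assumes four working weeks per month and two outboard high-lift systems (one per wing) per aircraft; neither assumption is stated explicitly in the text, and the function names are hypothetical. Under these assumptions the result is about 3.78 h, which is close to the approximately 3.75 h given above.

```python
def max_cycle_time_hours(aircraft_per_month, shifts_per_week, hours_per_shift,
                         weeks_per_month=4, units_per_aircraft=2):
    """Available working hours per month divided by the units to be built.

    weeks_per_month and units_per_aircraft are assumptions, not source values.
    """
    available_hours = shifts_per_week * hours_per_shift * weeks_per_month
    return available_hours / (aircraft_per_month * units_per_aircraft)

def remaining_time_hours(cycle_time, basic_time):
    """Time left within one cycle for recovery and distribution time."""
    return cycle_time - basic_time

cycle = max_cycle_time_hours(63, 17, 7)    # 476 h / 126 units, about 3.78 h
slack = remaining_time_hours(cycle, 2.8)   # about 0.98 h beyond the basic time
```

The positive remainder reflects the statement that the basic assembly time of approximately 2.8 h leaves room for recovery and distribution time within one cycle.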

#### **4 Determination of Key Indicators**

The key indicator method (KIM) has become a standard across companies for evaluating the ergonomics of a working process [12]. In the automotive industry, for example, it is used to evaluate assembly activities. The different key indicator methods are designed as a basic methodology for risk assessment. They describe the most important stress factors (key indicators) on ordinal scales and determine the degree of likelihood of physical overload [11]. The methods are well evaluated, and digital forms are provided for easy execution [13].

To carry out the KIM, the postures of the workers during the assembly processes have to be observed and the assembly times have to be determined. Since the study is conducted before the concept is finalised and the jig is actually built, the study takes place with virtual objects. Both the adaptive and the rigid jig exist as CAD models. The programme Siemens NX 11 is used to integrate human models into the CAD models. Four different human models, representing the 5th and 95th percentile of the male and female German population respectively, are used for the investigation. Each of the previously defined work steps is reproduced in a CAD scenario. In each case, a posture that is characteristic of the assembly step is simulated with the human models. Based on these scenarios (e.g. Fig. 4), the posture can then be evaluated within the scope of the KIM. In addition, the human models are used to check whether there is sufficient visibility of the assembly spot and the workers' own hands. The key indicators that cannot be clearly identified in the virtual model (such as work organisation) are always assumed to be the best possible.

Tables 2 and 3 show the results of the KIM for both assembly jigs. The risk scores are classified in four categories. Risks below 20 correspond to a low load intensity, and risks between 20 and 50 indicate a slightly increased load. Both are acceptable. Risk values between 50 and 100 correspond to substantially increased load intensities and call for a redesign of the workplace, since physical overload is possible for normally resilient persons. Work steps with a risk value above 100 have a high load intensity and will likely cause physical overload for all persons [14].
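The four risk categories can be expressed as a small lookup. The sketch below is only an illustration of the thresholds named above; how scores lying exactly on a boundary (20, 50, 100) are assigned is an assumption, and the function names are hypothetical.

```python
def classify_risk(score):
    """Map a KIM risk score to one of the four load-intensity categories.

    Scores exactly on a boundary are assigned to the higher category,
    which is an assumption made for this sketch.
    """
    if score < 20:
        return "low"
    if score < 50:
        return "slightly increased"
    if score < 100:
        return "substantially increased"
    return "high"

def is_acceptable(score):
    """Low and slightly increased loads are treated as acceptable."""
    return classify_risk(score) in ("low", "slightly increased")
```

A work step is flagged for redesign as soon as `is_acceptable` returns `False` for its score.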

**Fig. 4** Example of posture and view analysis with human models in Siemens NX 11 (Visibility of jig is deactivated.)


**Table 2** Work steps and corresponding risk values for the assembly process with rigid jig

It can be seen that most work steps cause a low or slightly increased physical load. However, the holding processes may cause an increased load, especially for female workers. The rigid jig requires the manual positioning of the fairing within the jig. This work step causes loads that are too high for women, so it needs to be redesigned if the rigid jig is to be used.

**Table 3** Work steps and corresponding risk values for the assembly process with adaptive jig

When using the adaptive jig, the highest risk occurs while providing the actuator to the robot. This step can easily be eliminated by providing the actuator with a carrier. While the values in Tables 2 and 3 only represent the risk of single work steps, Tables 4 and 5 show the cumulated values for a whole working day, considering the working process described in Sect. 3.2. Since workers perform different tasks during their shift, the risk is evenly distributed. Only women have a high risk of physical overload during lifting and holding, since they all have to position the fairing together when using the rigid jig.


**Table 4** Cumulated risk values per working day and person with rigid jig

**Table 5** Cumulated risk values per working day and person with adaptive jig


#### **5 Conclusion and Summary**

It has been shown that the adaptive jig offers the same or even better ergonomic performance compared to a rigid jig. In particular, it provides very good support when positioning heavy objects. The adaptive jig therefore improves the work process. It can be assumed that work processes such as adjustment, electrical equipment or painting also benefit from the adaptive jig. In particular, when several people with different physical constitutions work together and when integrating people with physical disabilities, the adaptive device offers further potential for improving ergonomics. For example, the assembly object can be brought into an orientation where the assembly points are presented to the employees at different heights.

The adaptive design makes it possible to carry out the entire production process of a high-lift system in a single setup. There is no need for relocation and remeasurement in another fixture. Further potential arises when considering the adaptive jig over its life cycle. The jig can be adapted to changes in the production process or to product changes with little effort. This results in a very long service life. Because not all aspects of work ergonomics could be investigated with the chosen method (e.g. psychological factors), the use of a physical demonstrator is necessary for a complete evaluation.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

**Path Planning 1**

### **Towards Learning by Demonstration for Industrial Assembly Tasks**

#### Victor Hernandez Moreno, Marc G. Carmichael and Jochen Deuse

#### **Abstract**

In recent times, learning by demonstration has seen tremendous progress in robotic assembly operations. One of the most prominent trajectory-level task models applied is Dynamic Movement Primitives (DMP). However, it lacks the ability to tackle complex operations as often encountered in industrial assembly. Augmenting low-level models with a high-level framework in which different movement segments are deliberately parameterised is considered promising for such scenarios. This paper investigates the combination of trajectory-level DMPs with Methods-Time Measurement (MTM). We demonstrate how the MTM-1 system is utilised to establish distinct DMP models for five of its basic elements, paving the way to benefitting from the sophisticated MTM system. The evaluation of the framework is conducted on a generic pick and place operation. Compared to a one-model-fits-all DMP approach for the whole task, the proposed method shows the advantage of appropriate temporal scaling, accuracy levelling and force consideration at adequate times.

#### **Keywords**

Learning by Demonstration • Robotic Assembly • Dynamic Movement Primitives • Methods-Time Measurement

V. Hernandez Moreno (✉) · J. Deuse

University of Technology Sydney, Centre for Advanced Manufacturing (CAM), Ultimo, Australia e-mail: Victor.HernandezMoreno@student.uts.edu.au

#### M. G. Carmichael

University of Technology Sydney, Robotics Institute (UTS:RI), Ultimo, Australia

#### J. Deuse

TU Dortmund University, Institute of Production Systems (IPS), Dortmund, Germany

© The Author(s) 2023

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_19

#### **1 Introduction**

With the shift from mass production to mass customisation [1], combined with a growing labour shortage driven by unfavourable demographic change [2], the competitiveness of tomorrow's assembly industry is dictated by flexible and easy-to-program automation systems. A solution is promised by the concept of learning by demonstration, which endows a robotic system with the ability to be programmed through intuitive demonstration methods [3]. In recent years, task models based on trajectory-level approaches, including Dynamic Movement Primitives (DMP), have prevailed in successfully reproducing assembly-related movements based on human demonstration [4].

As Dynamic Movement Primitives minimise the teaching time through one-shot learning and are capable of reproducing accurate trajectories with temporal and spatial scalability [5], they are considered to satisfy key requirements for competitiveness in an industrial environment. However, handling complex tasks is still a major bottleneck of DMP and other trajectory-level task models [4].

In this work, the promising concept of embedding trajectory-level models within high-level symbolic task representations to tackle complex tasks is further investigated [3, 6]. Compared to other approaches, in which often unsophisticated and limited frameworks were considered, the proposed optimised DMP framework utilises the industry-established Methods-Time Measurement (MTM) system, which provides a comprehensive and elaborate structure for assembly tasks. Hence, the two fundamentally proven methods of DMP-based learning by demonstration and assembly task analysis according to MTM are combined to create a solution to the situation outlined above.

The remainder of the paper is organised as follows. Section 2 outlines the background and state-of-the-art for Dynamic Movement Primitives and Methods-Time Measurement. Our conceptual framework towards an industry-oriented MTM-1 based optimised DMP framework is depicted in Sect. 3. Section 4 provides an experimental validation of the framework on a generic pick and place operation, followed by a conclusion and discussion of future work in Sect. 5.

#### **2 Background**

This section summarises the theoretical background of Dynamic Movement Primitives and Methods-Time Measurement and the state-of-the-art relevant to the proposed framework.

#### **2.1 Dynamic Movement Primitives**

Dynamic Movement Primitives were initially introduced by Schaal et al. [7] in 2003. The revised formulation by Saveriano et al. [8] is considered the state-of-the-art for Cartesian space Dynamic Movement Primitives (CDMP) and was used for the proposed framework. Here, the task space is divided into two *transformation systems* formed as second-order dynamical systems to capture the translational (1) and rotational dimensions (2).

$$\begin{cases} \tau \dot{\mathbf{v}} = \mathbf{K}^{p} \left[ \left( \mathbf{p}_g - \mathbf{p} \right) - \left( \mathbf{p}_g - \mathbf{p}_0 \right) s + \mathbf{f}^{p}(s) \right] - \mathbf{D}^{p} \mathbf{v} + \boldsymbol{\chi}^{p} \\ \tau \dot{\mathbf{p}} = \mathbf{v} \end{cases} \tag{1}$$

$$\begin{cases} \tau \dot{\boldsymbol{\omega}} = \mathbf{K}^{q} \left[ \mathbf{e}_{0}(\mathbf{q}_{g}, \mathbf{q}) - \mathbf{e}_{0}(\mathbf{q}_{g}, \mathbf{q}_{0})\, s + \mathbf{f}^{q}(s) \right] - \mathbf{D}^{q} \boldsymbol{\omega} + \boldsymbol{\chi}^{q} \\ \tau \dot{\mathbf{q}} = \frac{1}{2} \left[ 0, \boldsymbol{\omega}^{T} \right]^{T} \times \mathbf{q} \end{cases} \tag{2}$$

The position, linear velocity, and linear acceleration are denoted by $\mathbf{p}, \mathbf{v}, \dot{\mathbf{v}} \in \mathbb{R}^3$. $\mathbf{q} \in SO(3)$ represents a unit quaternion, with $\boldsymbol{\omega}, \dot{\boldsymbol{\omega}} \in \mathbb{R}^3$ being the angular velocity and acceleration, and $\mathbf{e}_0(\mathbf{q}_i, \mathbf{q}_j)$ is defined as the orientation error between $\mathbf{q}_i$ and $\mathbf{q}_j$. The parameter $\tau$ facilitates the temporal scaling, and the scalar $s$ creates the time independency through the *canonical system*. The constants $\mathbf{p}_0, \mathbf{q}_0$ and $\mathbf{p}_g, \mathbf{q}_g$ stand for the start and goal poses, respectively. The positive definite matrices $\mathbf{K}^i, \mathbf{D}^i$ are stiffness and damping gains. The forcing terms $\mathbf{f}^i(s)$ preserve the non-linear behaviour of a demonstrated trajectory through weighted Radial Basis Functions (RBF) $w_n \psi_n(s)$. The term $\boldsymbol{\chi}^i$ represents any extension to the dynamical system. For an in-depth explanation of DMPs see [5].
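To make the role of the time constant in Eq. (1) concrete, the following Python sketch integrates a single translational dimension with explicit Euler steps. It is a minimal illustration under stated assumptions, not the implementation used in this work: the canonical system is assumed to be τ ṡ = −α_s s with α_s = −ln(0.001), the forcing term and the extension term are set to zero, the damping is chosen as critical for the given stiffness, and the function name `rollout_1d` is hypothetical.

```python
import math


def rollout_1d(p0, p_goal, tau, t_end, K=100.0, forcing=lambda s: 0.0, dt=0.001):
    """Euler-integrate one translational dimension of Eq. (1) with chi = 0.

    Returns the position reached after t_end seconds.
    """
    D = 2.0 * math.sqrt(K)          # critical damping for stiffness K
    alpha_s = -math.log(0.001)      # assumed decay rate of the canonical system
    p, v, s = p0, 0.0, 1.0
    for _ in range(int(round(t_end / dt))):
        # Evaluate all derivatives from the current state before updating it.
        v_dot = (K * ((p_goal - p) - (p_goal - p0) * s + forcing(s)) - D * v) / tau
        p_dot = v / tau
        s_dot = -alpha_s * s / tau  # assumed canonical system
        v += v_dot * dt
        p += p_dot * dt
        s += s_dot * dt
    return p


if __name__ == "__main__":
    # Same start and goal, evaluated after one second for two time constants.
    print("tau = 2.0:", rollout_1d(0.0, 1.0, tau=2.0, t_end=1.0))
    print("tau = 1.0:", rollout_1d(0.0, 1.0, tau=1.0, t_end=1.0))
```

With the smaller time constant the state is close to the goal after one second, whereas with the larger one it is still in transit. This is the temporal scaling behaviour that τ provides.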

In preliminary works on DMPs by Schaal et al. [6] in 1999, a compact state-action-state sequence is shown to be a natural prerequisite for task imitation with movement primitives, expressing states as *aligned*, *in contact*, *near-to*, and actions as *move-to*, *grasp-object*, *move-above*, etc. Such a combination of low- and high-level task representation is still promoted for handling compounded actions [3]. Following the assumption that most human hand movements can be segmented into *reach*, *manipulation* and *withdraw* phases, Mao et al. [9] reproduced a chopping task by identifying grasp/release transitions and key manipulation points. Aein et al. [10] developed a three-level task model architecture based on an action-grammar analogy. The low-level controller possessed arm movement primitives for *position* and *force control* and hand primitives for *open*, *close*, *grasp*, and *ungrasp*. Eiband et al. [11] defined four robot skills, including *gripper open*, *gripper close*, *free movement*, and *haptic exploration*, to establish a tree that describes geometric relationships between consecutive skills. Complex dual-arm household tasks were investigated by Caccavale et al. [12], resulting in a low-level segmentation based on object proximity (*near*/*far*) and explicit human commands (*open*/*close* gripper) combined with a high-level attentional behaviour-based system to structure identified movement primitives.

While reasonable symbolic frameworks have been explored, none is based on a sophisticated, industry-proven structure, which limits their prospects of withstanding realistic industrial assembly operations.


**Table 1** Properties of MTM-1 basic elements after [13]

#### **2.2 Methods-Time Measurement**

Introduced in 1948 by H. Maynard et al. [13], Methods-Time Measurement ranks among the most established predetermined motion time systems in today's industrial market. The MTM-1 variant, designed to analyse short-cycle repetitions, is provably capable of segmenting most manual assembly-related operations and methods. The proposed framework is built on five of its basic elements, namely *reach*, *grasp*, *move*, *position*, and *release*. Their definitions are provided in Table 1.

Besides the intended use for designing workplaces and work methods, MTM proves to be valuable for the field of robot science. Drumwright et al. [14] developed primitive actions for task-level programming of humanoid robots based on the MTM-1 basic elements. With the growing interest in establishing human-robot interaction, the MTM-1 framework was assessed for the analysis of robot incorporated workspaces [15]. Finally, recent research has explored how to automate the classification of handling tasks according to MTM-1 using machine learning techniques [16]. The latter promises to greatly simplify its applicability in the proposed learning by demonstration context.

#### **3 The MTM-based Optimised CDMP Framework**

The proposed framework for tackling complex assembly tasks embeds the trajectory-level CDMP model within the industry-established MTM-1 system as the high-level task representation. Compared to other approaches, it establishes the benefits of a comprehensive and proven structure for industrial assembly tasks and considers distinctive properties of individual subskills. In this section, customised CDMP models are designed to reflect the differentiating properties of the five basic elements of the MTM-1 system. The MTM-based optimised CDMP framework is summarised in Fig. 1 and explained in detail below.

**REACH—**The sequence of subskills typically commences with reaching towards a workpiece, where time efficiency and movement generalisation are essential. Since the covered distance primarily dictates the time efficiency during the *reach* subskill, the temporal scaling property of CDMP models becomes valuable, especially when the task was demonstrated at reduced speed. It is achieved by amending the time constant τ, resulting in an effortless adjustment of the robot's end-effector velocity during reproduction. Since accuracy is considered less important when approaching the workpiece, a low number of RBF is recommended. By doing so, a smoother trajectory is created, removing shaky discrepancies, and the computational costs are reduced. Considering that human demonstrations are often non-optimal for the robot's kinematics, the weights $w_n$ may be further optimised using reinforcement learning [5].
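The smoothing effect of a low RBF count can be illustrated with a simplified stand-in for the forcing-term fit. In the Python sketch below, Gaussian basis functions are placed equidistantly in time (rather than over the phase variable), each weight is a kernel-weighted average of the data, and the signal is reconstructed as a normalised weighted sum. The basis width, the synthetic demonstration signal and all function names are assumptions made for illustration; the sketch fits the trajectory directly and is not the CDMP learning procedure of [8].

```python
import math


def fit_and_reconstruct(times, values, n_rbf):
    """Fit n_rbf Gaussian basis functions to a 1-D signal and evaluate the fit."""
    t0, t1 = times[0], times[-1]
    spacing = (t1 - t0) / (n_rbf - 1)
    centres = [t0 + n * spacing for n in range(n_rbf)]
    inv_two_sigma_sq = 1.0 / (2.0 * spacing ** 2)  # assumed width: one centre spacing

    def psi(t, c):
        return math.exp(-((t - c) ** 2) * inv_two_sigma_sq)

    # Each weight is the kernel-weighted mean of the data around its centre.
    weights = []
    for c in centres:
        acts = [psi(t, c) for t in times]
        weights.append(sum(a * y for a, y in zip(acts, values)) / sum(acts))

    # Reconstruction as a normalised weighted sum of the basis functions.
    fit = []
    for t in times:
        acts = [psi(t, c) for c in centres]
        fit.append(sum(a * w for a, w in zip(acts, weights)) / sum(acts))
    return fit


def roughness(signal):
    """Sum of squared second differences; larger means a shakier signal."""
    return sum((signal[i + 1] - 2 * signal[i] + signal[i - 1]) ** 2
               for i in range(1, len(signal) - 1))


def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def synthetic_demo():
    """Slow arc sampled at 100 Hz with a short 2 Hz shake between 4 s and 5 s."""
    times = [i * 0.01 for i in range(1001)]
    values = [math.sin(math.pi * t / 10.0)
              + (0.05 * math.sin(4.0 * math.pi * t) if 4.0 <= t <= 5.0 else 0.0)
              for t in times]
    return times, values


if __name__ == "__main__":
    times, demo = synthetic_demo()
    for n in (10, 200):
        fit = fit_and_reconstruct(times, demo, n)
        print(n, "RBF: roughness", roughness(fit), "RMS error", rms_error(fit, demo))
```

In this toy setting, the fit with few basis functions is smoother but deviates more from the recorded signal, while the fit with many basis functions follows the recording, including the shake.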

Besides the temporal scaling property of CDMP models, the spatial scaling option creates additional advantageous characteristics for this subskill. While CDMP models can inherently


**Fig. 1** The MTM-optimised CDMP framework *(Remark: The specific parameterisation is subject to the robot's capabilities and the application's requirements)*

cope with deviating starting poses, the goal pose $\mathbf{p}_g, \mathbf{q}_g$ is also adjustable in real-time through a goal switching mechanism as described in [5]. An object recognition method may be applied to detect different workpieces and identify a quantifiable goal pose for the *reach* CDMP model. Finally, the CDMP model of the *reach* subskill generalises further by adjusting its trajectory in case obstacles appear on its path. This can be realised through a CDMP extension for volumetric object avoidance, which was explored in [17].
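One common way to realise a smooth goal switch is to pass the goal through a first-order filter so that the attractor moves continuously towards the new target. The Python sketch below illustrates this idea for one translational dimension. The filter form τ ġ = α_g (g_target − g), the gain α_g, the canonical system and the function name are assumptions for illustration and are not taken from this work.

```python
import math


def rollout_with_goal_switch(p0, goal, new_goal, t_switch, tau, t_end,
                             K=100.0, alpha_g=10.0, dt=0.001):
    """Integrate one translational dimension while the goal changes at t_switch."""
    D = 2.0 * math.sqrt(K)          # critical damping for stiffness K
    alpha_s = -math.log(0.001)      # assumed canonical-system decay rate
    p, v, s, g = p0, 0.0, 1.0, goal
    for step in range(int(round(t_end / dt))):
        target = new_goal if step * dt >= t_switch else goal
        g_dot = alpha_g * (target - g) / tau      # assumed first-order goal filter
        v_dot = (K * ((g - p) - (g - p0) * s) - D * v) / tau
        p_dot = v / tau
        s_dot = -alpha_s * s / tau
        g += g_dot * dt
        v += v_dot * dt
        p += p_dot * dt
        s += s_dot * dt
    return p


if __name__ == "__main__":
    # The goal moves from 1.0 to 1.5 while the motion is under way.
    print(rollout_with_goal_switch(0.0, 1.0, 1.5, t_switch=0.5, tau=1.0, t_end=3.0))
```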

**GRASP—**After reaching the target position close to the workpiece, the *grasp* subskill commences. In contrast to the *reach* subskill, a much shorter distance is to be bridged. However, it does require a higher accuracy as a distinguishing characteristic, which dictates the success of the *grasp* operation.

Based on this requirement, the number of RBF replicating the demonstrated *grasp* subskill is recommended to be chosen high. To reduce the risk of damaging inertia forces or control limitations, a similar or slower reproduction speed than the demonstrated scenario is desirable and realised by increasing the time constant τ in the CDMP model. As far as the hardware setup permits it, additional visual or force feedback may be considered to improve the accuracy further. Finally, the gripper actuation may be reproduced through a simple DMP model under the same canonical system to guarantee correct actuation timing.

**MOVE—**Once grasped and lifted sufficiently to allow free movement, the *move* subskill is initiated to transport the workpiece close to its destination. As this subskill also focuses on a large motion in which the accuracy is considered less relevant, the same efficiency and generalisation ideas as in the *reach* element apply. Nevertheless, the properties of the transported workpiece have to be considered. This includes its weight, dimensions, and fragility.

In accordance with the requirements, the time constant τ is adjusted appropriately but may be decreased to improve time efficiency. A lower number of RBF to reproduce the demonstration trajectory allows smoothing out shaky demonstration motions and reduces computational costs. When considering optimising the weights of the forcing terms $\mathbf{f}^i(s)$ through reinforcement learning, as discussed for the *reach* subskill, the workpiece dimensions must be included.

Regarding generalisation capabilities, the starting pose is provided by the end pose of the preceding *grasp* CDMP model outcome. End pose adjustments may be incorporated in real-time as discussed for the *reach* subskill. Similar to the reinforcement learning augmentation, the workpiece dimensions must be considered when applying object avoidance methods.

**POSITION—**The *position* subskill describes the most challenging aspect of an assembly task. It covers aligning, orienting, and engaging the grasped workpiece with its designated location relative to another object. Similar to the *grasp* subskill, accuracy is a vital factor for the success of this subskill. However, a fundamental characteristic during positioning is the occurrence of contact forces and torques, which can significantly influence the appropriate execution. In order to improve the accuracy, a high RBF density is recommended to replicate the demonstrated motion. Since accurate execution is of more importance than its speed, a suitably high time constant τ may be selected.

Beyond the achievable positional accuracy, the consideration of contact forces and torques promises to enhance the robustness of the *position* subskill. Therefore, these should be incorporated in the CDMP model, which can be realised in different ways [5].

**RELEASE—**The *position* subskill terminates when the workpiece is successfully aligned and oriented, and no interfering forces are recorded. Once this state is reached, the *release* subskill commences by actuating the gripper and ends after a collision-free disengagement from the workpiece. Like the preceding *position* subskill, a continued high accuracy and reduced reproduction speed characterise the *release* CDMP model. An assessment of noticeable forces may be used to guarantee no intervention with the workpiece during disengagement.

#### **4 Experimental Evaluation**

The proposed MTM-1 based optimised CDMP framework was evaluated on a generic pick and place experiment. Here, a toy dice (8 cm × 8 cm × 8 cm) is to be picked up from its initial location and placed onto a stationary assembly jig with a 9 cm × 9 cm × 1 cm recess (see Fig. 2). Based on a human demonstration via kinesthetic teaching, the task was reproduced using the MTM-based optimised CDMP framework and then compared with two one-model-fits-all CDMP models with different accuracy levels.

**Fig. 2** Experimental setup

#### **4.1 Experimental Setup**

The experiment was conducted on a UR5e robot from Universal Robots with an OnRobot RG6 gripper. An ATI Axia80 F/T sensor installed in the assembly jig measured the wrench during positioning (see Fig. 2). The free drive mode of the UR5e was used for demonstration. End-effector Cartesian poses were recorded at 100 Hz while the gripper was actuated manually using the teach pendant. During the transportation of the workpiece, an artificial disturbance was introduced by shaking the end-effector for a short time. The desired transitions between the five subskills were communicated by the human teacher by briefly pausing the movement.
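Segmenting a recording at brief pauses, as described above, can be sketched as a simple speed threshold on the recorded positions. The Python example below is a hypothetical illustration: the speed threshold, the minimum pause duration, the choice of the pause midpoint as the cut, and both function names are assumptions and do not describe the processing used in this work.

```python
import math


def find_pause_boundaries(positions, rate_hz=100.0, speed_thresh=0.005,
                          min_pause_s=0.3):
    """Return sample indices in the middle of each sufficiently long pause.

    positions: sequence of (x, y, z) tuples in metres, sampled at rate_hz.
    """
    dt = 1.0 / rate_hz
    min_len = int(round(min_pause_s * rate_hz))
    n = len(positions)
    boundaries = []
    run_start = None
    for i in range(1, n + 1):
        if i < n:
            speed = math.dist(positions[i], positions[i - 1]) / dt
            slow = speed < speed_thresh
        else:
            slow = False  # sentinel step that closes a trailing slow run
        if slow and run_start is None:
            run_start = i
        elif not slow and run_start is not None:
            # Pauses touching the start or end of the recording are ignored.
            if i - run_start >= min_len and run_start > 1 and i < n:
                boundaries.append((run_start + i - 1) // 2)
            run_start = None
    return boundaries


def split_at(samples, boundaries):
    """Cut a recording into consecutive segments at the given indices."""
    edges = [0, *boundaries, len(samples)]
    return [samples[a:b] for a, b in zip(edges, edges[1:])]


if __name__ == "__main__":
    # Synthetic recording: 1 s of motion, 0.5 s pause, 1 s of motion.
    demo = ([(0.001 * i, 0.0, 0.0) for i in range(100)]
            + [(0.099, 0.0, 0.0)] * 50
            + [(0.099 + 0.001 * (j + 1), 0.0, 0.0) for j in range(100)])
    cuts = find_pause_boundaries(demo)
    print(cuts, [len(seg) for seg in split_at(demo, cuts)])
```

For a demonstration with five subskills, four such pauses would be expected, yielding five segments.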

For the MTM-based CDMP framework, the demonstration data was separated into the subskills and fed to individual CDMP models as described in Sect. 3. In accordance with the proposed framework, the *reach* and *move* CDMP models were simplified with 10 RBF and doubled in speed by halving the time constant τ. In contrast, the *grasp*, *position*, and *release* CDMP models were generated with 200 RBF to improve their accuracy and with the same time constant τ as during demonstration. All other CDMP parameters were kept the same across subskills, including $\mathbf{K}^i$ as 100, $\mathbf{D}^i$ being critically damped, the canonical system's parameter $\alpha_s = -\ln(0.001) \cdot T$, and RBF centres equally distributed in time with a width of 2. As the subskill transitions occurred without velocities, the final merging of the individual CDMP sequences was realised by the approach suggested by Saveriano et al. [8], with the initial poses being the end poses of the prior CDMP subskill. The generalisation of the starting pose was examined by introducing an offset of +3 cm in each translational dimension to the starting position of the demonstration data. For comparison, the one-model-fits-all CDMP approach was used twice with 10 and 200 RBF per subskill, no temporal scaling, and all other CDMP parameters being equivalent. During reproduction, the gripper was actuated manually by the human operator. The offline processing of the demonstration data and CDMP calculation was conducted in MATLAB (the code is accessible at https://github.com/VictorHerMor/2022-mtm-based-dynamic-movement-primitives-mhi).
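The per-subskill parameterisation described above can be summarised as a small lookup table. The Python sketch below is illustrative and not part of the published MATLAB code: the class, field and function names are hypothetical, the `uses_force` flag reflects only that the wrench was measured during positioning, and the demonstrated durations in the example are invented placeholders. The duration estimate assumes that the execution time of a subskill scales linearly with its time constant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubskillParams:
    n_rbf: int        # number of radial basis functions
    tau_scale: float  # factor applied to the demonstrated time constant
    uses_force: bool  # whether a wrench measurement accompanies the subskill


# RBF counts and speed factors as stated for the experiment.
MTM_CDMP_PARAMS = {
    "reach": SubskillParams(n_rbf=10, tau_scale=0.5, uses_force=False),
    "grasp": SubskillParams(n_rbf=200, tau_scale=1.0, uses_force=False),
    "move": SubskillParams(n_rbf=10, tau_scale=0.5, uses_force=False),
    "position": SubskillParams(n_rbf=200, tau_scale=1.0, uses_force=True),
    "release": SubskillParams(n_rbf=200, tau_scale=1.0, uses_force=False),
}


def reproduction_duration(demo_durations):
    """Estimate the total reproduction time from demonstrated durations.

    Assumption: each subskill's duration scales linearly with its tau_scale.
    """
    return sum(duration * MTM_CDMP_PARAMS[name].tau_scale
               for name, duration in demo_durations.items())


if __name__ == "__main__":
    # Placeholder durations in seconds, not measured values.
    demo = {"reach": 6.0, "grasp": 2.0, "move": 8.0, "position": 4.0, "release": 2.0}
    print(sum(demo.values()), "s demonstrated ->", reproduction_duration(demo), "s")
```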

#### **4.2 Results and Discussion**

Figure 3 shows the translational dimensions of the proposed MTM-based optimised CDMP framework compared to the one-model-fits-all CDMP approaches, from which four essential differences are observed. The *reach* and *move* subskill durations are indicated by the green and blue highlighted areas within the y-dimension graph. A comparison with the demonstration data shows that the desired end pose was reached after half the time of the respective subskill, reducing the whole reproduction duration by approximately 10 s. The introduced 3 cm offsets in each translational dimension were eliminated during the *reach* subskill, demonstrating the capability of coping with differing starting positions (green circles). The artificially introduced disturbance during the *move* subskill (around 30 s, blue circles) was smoothed out in the MTM-based optimised CDMP framework, while the required

**Fig. 3** Translation during demonstration, one-model-fits-all CDMP approach with 10 RBF per subskill, and MTM-based optimised CDMP framework

accuracy during the *grasp*, *position* and *release* subskills was maintained. The discrepancy of the latter feature to the one-model-fits-all CDMP approach with 10 RBF per subskill is highlighted with red circles, where critical dips in the z-dimension appear. Furthermore, a highly accurate one-model-fits-all CDMP approach (200 RBF per subskill) closely matches the demonstration data, including the artificially introduced disturbance during the *move* subskill. However, its computational costs are 36 % higher than for a one-model-fits-all CDMP alternative with only 10 RBF per subskill. In comparison, the MTM-based optimised CDMP framework increases the computational costs by only 6 %.

Figure 4 shows the measured forces in the z-direction during the *position* subskill. Based on a post-assessment of the force profile, the identical end value verifies that the dice was successfully placed on the assembly jig. Furthermore, the forces that occurred during reproduction did not exceed those during the demonstration, suggesting a damage-free task replication.

In summary, the distinction between the five MTM-1 basic elements and the design of characteristic CDMP models bring the decisive benefit of focusing on their unique requirements, paving the way to tackling compounded and complex assembly operations.

**Fig. 4** ATI Axia80 z-force measurement during the *position* subskill (human demonstration and MTM-based optimised CDMP framework)

#### **5 Conclusion and Future Work**

While Dynamic Movement Primitives are considered a promising approach for robotic learning by demonstration, their stand-alone application lacks the ability to handle complex assembly tasks. This paper has presented a method to address this limitation by distinguishing subskills on a symbolic level provided by the industrially well-established MTM-1 framework. By doing so, five unique CDMP models were defined, which are designed to match the individual characteristics of the MTM-1 basic elements *reach*, *grasp*, *move*, *position*, and *release*. The proposed method was evaluated on a pick and place assembly task, showing decisive benefits over the one-model-fits-all CDMP approach. These include appropriate time management, matching accuracy in relevant periods of the assembly task and force monitoring at adequate times.

With the presented experimental results demonstrating its proof-of-concept, the framework's optimisation shows potential for further analysis. While the proposed approach currently relies on the authors' expertise to parameterise the CDMP models, a sophisticated mathematical analysis regarding the design decisions and their implementations will provide more robustness to the system. On the other hand, the proposed method's full potential is yet to be explored, including its analogy to human efficiency with the predetermined motion-time and further abstraction through elaborated subsequent MTM variants. Finally, the transferability and generalisation to other robot systems and applications will be exploited in future work.

#### **References**



### **Visual Programming of Robot Tasks with Product and Process Variety**

Dominik Riedelbauch and Sascha Sucker

#### **Abstract**

In flexible manufacturing settings, automation is shaped by ever changing conditions (e.g. varying part feeding locations, highly customizable products). Quick adaptation of robot systems is mostly achieved by visual end-user robot (re-)programming. In this paper, we discuss the explicit integration of anticipated product and process variety into visually programmed tasks. We contribute a task model which captures a user-defined range of task variants. To this end, parts are specified in terms of approximate locations and generalized parts families. Workspace exploration and combinatorial assignment planning enable online adaptation to unknown environments. Our experiments show that this adaptation capability can increase the economical efficiency of cobot use.

#### **Keywords**

Flexible production • Visual programming • Intelligent robots

D. Riedelbauch (✉) · S. Sucker

S. Sucker e-mail: sascha.sucker@uni-bayreuth.de URL: https://robotics.uni-bayreuth.de

Lehrstuhl für Angewandte Informatik III, Universität Bayreuth, Bayreuth, Germany e-mail: dominik.riedelbauch@uni-bayreuth.de

#### **1 Introduction and Related Work**

In contrast to traditional mass production, manufacturing demands have shifted towards shorter innovation cycles and small-batch production. This has raised the demand for *flexible manufacturing* systems that can quickly be adapted to customized products by domain experts in small and medium enterprises [6]. When additionally considering recent advances in collaborative robotics towards flexible *partial automation*, adaptation of robot programs to various sources of variety is needed [1, 4]: *Product variety* is needed to manufacture different product instances from a product family by assembling parts with varying features (e.g. color) to suit individual customer demands [9]. In this field, we particularly focus on *process-specific variations* [7] that additionally yield *process variety*. Relevant robot task parameters that may change with process-specific variations are e.g. pickup or placement locations, or even the ordering of process steps [1].

**Visual end-user robot programming** is an established approach to cope with such variety [6]. Corresponding approaches [14, 17–19] are mostly based on skill frameworks. Those let users combine skills with human-readable semantics into tasks (e.g. [15]) even for human-robot collaboration [16, 18]. Modularity and intuitive usability support convenient (re-)programming and, in consequence, quick adaptation. In contrast, our contribution seeks to reduce recurrent programming efforts by applying the visual programming paradigm to a task model that intrinsically encodes a subset of feasible variations (e.g. different part types or locations) and adapts online (Fig. 1). We hypothesize that this would further contribute to the economic efficiency of intelligent robot systems.

Corresponding **task models with variety** have also been addressed in literature. Among them, especially precedence graphs and hierarchical AND/OR Trees are frequently used in intelligent robot systems (e.g. [4, 13, 16]). They seek to encode all feasible assembly sequences [10], hence focussing on process variety. Similarly, hierarchical models emphasizing product variety [7, 9, 11], approaches at the intersection of assembly and product family oriented goals [5], and ontologies to exchange production data under variety [8] have been proposed. They commonly decompose products into functional entities [11] until inseparable, constituent components referred to as *primary generic products* [9] or *parts*

**Fig. 1** Visual programming enables frequent end-user robot task adaptation to customer demands in flexible manufacturing (**a**). We seek to reduce programming efforts by online adaptation (**b**). To this end, we propose to explicitly encode different situations with product variety (A1, A2) or process variety (B1, B2) in a single task model

*families* [7] are reached. A group of feasible variants for assigning a part in concrete product instances is associated with each component. Analogously, groups of feasible locations can be expressed with spatial relations [14], or more specifically with areas in the workspace [18]. Taking inspiration from this group notion for feasible part types and locations, we propose end-user programming of assembly task models with skills accepting parts families and partly known locations as input. This way, parameters can be partially left underspecified at modelling time to create a single task model for several instances of the task. Consider e.g. a pick-and-place task that involves fetching five bolts from the imprecise location conveyor and putting them into a box—with our approach, a single task model is sufficient to robustly conduct this kitting task for any positions and orientations of bolts on the conveyor, and for any size of bolts.

Once a skill is executed, one of the physically present entities with precisely known parameters as sensed by the robot must be assigned to the symbolic part description in the task model. Establishing a link between symbolic parts and the world is referred to as the **anchoring problem** [3]. This in particular includes deciding between multiple sensed entities that equally match an ambiguous part description (e.g. bolts of different sizes all being of type bolt). Related approaches perform anchoring with *local decisions* [4, 14, 18]. Ambiguity is here resolved in the scope of a skill without considering subsequent process steps, e.g. by choosing from all matching entities the one closest to the robot [14], or by drawing randomly [18]. However, such decisions can render the overall process infeasible (Fig. 2): Despite being suitable for the currently considered skill, an entity may be strictly required by some subsequent skill with more strongly constrained input parts. Choosing the "wrong" entity will thus lead to an error when trying to anchor this subsequent skill. Therefore, we propose an algorithmic procedure with *global decisions* which considers the constraints of all skills during the anchoring process.
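The difference between local and global anchoring decisions can be illustrated with a toy example in the spirit of Fig. 2. The Python sketch below contrasts a greedy, skill-by-skill assignment with a global assignment computed by augmenting-path bipartite matching. The matching algorithm is a generic stand-in chosen for illustration, not the assignment planner proposed here, and the entity identifiers, part types and function names are hypothetical.

```python
def anchor_locally(skill_accepts, entities):
    """Greedy anchoring: each skill takes the first free matching entity."""
    free = dict(entities)
    result = {}
    for skill, accepted in skill_accepts.items():
        match = next((e for e, t in free.items() if t in accepted), None)
        if match is None:
            return None  # a later skill can no longer be anchored
        result[skill] = match
        del free[match]
    return result


def anchor_globally(skill_accepts, entities):
    """Anchoring with all skill constraints considered (augmenting paths)."""
    owner = {}  # entity id -> skill currently holding it

    def try_assign(skill, visited):
        for entity, part_type in entities.items():
            if part_type in skill_accepts[skill] and entity not in visited:
                visited.add(entity)
                # Take a free entity, or re-anchor its current holder elsewhere.
                if entity not in owner or try_assign(owner[entity], visited):
                    owner[entity] = skill
                    return True
        return False

    for skill in skill_accepts:
        if not try_assign(skill, set()):
            return None
    return {skill: entity for entity, skill in owner.items()}


if __name__ == "__main__":
    any_gear = {"red_gear", "blue_gear", "green_gear"}
    # Skills 1 and 2 accept any gear; skills 3 and 4 are strictly specified.
    skills = {1: any_gear, 2: any_gear, 3: {"red_gear"}, 4: {"green_gear"}}
    # Hypothetical world model; the red gear happens to be listed first.
    world = {"c": "red_gear", "a": "blue_gear", "b": "green_gear", "d": "blue_gear"}
    print("local :", anchor_locally(skills, world))
    print("global:", anchor_globally(skills, world))
```

In this example, the greedy variant hands the only red gear to skill 1 and fails at skill 3, whereas the global variant reserves the red and green gears for the strictly specified skills.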

All in all, our contribution is twofold: (i) We propose a task model and visual programming procedure with robot skills accepting parts families and flexible locations rather than definitely specified, uniquely identified parts as input parameters. (ii) We show a computationally efficient method for anchoring and executing such task models in unknown environments with ambiguous parts.

**Fig. 2** Our task models may be underspecified, e.g. by skills accepting any kind of gear (1 and 2) for adaptation to sensed parts in a world model (a-d). Locally correct anchoring decisions, e.g. assigning red\_gear c to skill 1, can render the process infeasible when subsequent skills have strictly specified input parts (3 and 4)

#### **2 Our Approach**

An overview of our approach is shown by Fig. 3. Users will first use a visual programming task editor to create a precedence graph model (Sect. 2.2) capturing different instances of the task (Sect. 2.3). After that, the robot workspace is prepared by supplying concrete parts. The task model provides partly underspecified information about the types and approximate locations of parts to be expected when executing the task (Sect. 2.1). From this information, a path to explore points of interest in the workspace with a camera attached to the robot hand is calculated. A world model is then built by active vision, i.e. by approaching each point of interest and performing object recognition. The world model enables the computational process of plan instantiation for the perceived situation in the workspace (Sect. 2.4): Detected entities in the world model are assigned to parts referenced in the task model with an *assignment planner* solving the anchoring problem. Together with the task model, the resulting assignment solution is passed to a task sequencer. The sequencer applies a scheduling algorithm to the task model and finishes skill parametrization by replacing underspecified parameters with precise information from the world model. The resulting operation sequence is finally passed to a skill execution engine. After task completion, further materials can be supplied, and the plan instantiation process can be re-iterated starting from the workspace exploration step without manually adapting the task model.

#### **2.1 Part Types and Locations**

We describe parts in terms of their type and location in the workspace. To this end, a *part type* is an entry taken from a tree-shaped part type ontology. This ontology is a required input to the approach. It captures "is-a"-relations between a set of nodes $O = \{o_1, o_2, \ldots, o_{|O|}\}$. Leaf nodes $P \subset O$ denominate *concrete part types* as which parts in the physical world can be classified. We assume a CAD model given for each $o \in O$ for the purpose of grasp and placement planning. When ascending from leaf nodes upwards towards the root node, encountered inner ontology nodes encode increasingly generic part descriptions. The ontology thus encodes parts families with an increasing level of generalization over part types. An example inspired by the benchmark domains used in our experiments is shown by Fig. 4. Here, different gear and conductor leaf part types are summarized under the more general terms gear and conductor. The approach is intuitively adapted to other domains by specifying a corresponding tree with several levels of generalized part types. Formally, the ontology is characterized by the function $\text{is\_a} : O \times O \to \{\text{True}, \text{False}\}$ with $\text{is\_a}(o, o') = \text{True}$ whenever $o = o'$ or $o$ is a child of $o'$. In all other cases, $\text{is\_a}(o, o')$ is False.

**Fig. 3** Our approach adapts generalized task models emerging from a visual programming procedure by means of active workspace exploration, assignment planning, task sequencing, and skill execution
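The ontology function can be illustrated with a minimal sketch. The node names below are hypothetical and only loosely follow Fig. 4, and "child" is read here as "any descendant" so that several levels of generalization work; both points are assumptions rather than details confirmed by the text.

```python
# Hypothetical part type ontology as a child -> parent map (None marks the root).
PARENT = {
    "part": None,
    "gear": "part",
    "conductor": "part",
    "red_gear": "gear",
    "blue_gear": "gear",
    "green_gear": "gear",
    "red_conductor": "conductor",
    "blue_conductor": "conductor",
}


def is_a(o, o_prime):
    """Return True if o equals o_prime or lies below o_prime in the tree."""
    node = o
    while node is not None:
        if node == o_prime:
            return True
        node = PARENT[node]  # climb one level towards the root
    return False


def leaf_types():
    """Concrete part types P: all nodes that are not the parent of another node."""
    inner = {parent for parent in PARENT.values() if parent is not None}
    return set(PARENT) - inner
```

A template of type `gear` then accepts every gear leaf type, while a template of type `red_gear` accepts only that one type.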

Regarding the *part location*, we distinguish two cases: A location can be known precisely and, hence, be specified by a rigid body transform ${}^{\mathrm{w}}T_{\text{part}} \in \mathbb{R}^{4 \times 4}$ indicating the object translation and rotation with respect to some world frame w. This is e.g. the case for object recognition results, for parts provided on workpiece carriers etc. In the second case, a part location is not given precisely, but only within a certain tolerance. These two concepts can be captured by a unified formalization: Let $L = \{l_1, l_2, \ldots, l_{|L|}\}$ denote a set of locations relevant to the task. A location $l_i \in L$ may describe the precise position and orientation of some place where parts are usually located (e.g. the output slot of a parts feeder). Let $L^{\text{prec}} \subseteq L$ denote these precisely known locations, each associated with a rigid body transform $\text{pose}(l_i) \in \mathbb{R}^{4 \times 4}$ ($l_i \in L^{\text{prec}}$). In addition to these precisely known locations, elements of $L$ may also describe a 2-dimensional area on the workbench surface, a 3-dimensional volume defining the interior of a box etc. We will see in Sect. 2.3 how $L$ emerges from the visual programming process. For the planning process (Sect. 2.4), each location $l_i \in L$ is associated with a *location function* $\text{is\_at}_{l_i} : O \times \mathbb{R}^{4 \times 4} \to \{\text{True}, \text{False}\}$. These functions are designed to output $\text{is\_at}_{l_i}(o, {}^{\mathrm{w}}T_{\text{part}}) = \text{True}$ for a part type $o \in O$ and transformation ${}^{\mathrm{w}}T_{\text{part}} \in \mathbb{R}^{4 \times 4}$ if and only if some part of type $o$ with pose described by ${}^{\mathrm{w}}T_{\text{part}}$ is at the location denominated $l_i$. Our system currently supports $\text{is\_at}_{l_i}$ functions for comparing equality of precise positions, and for checking whether parts lie in planar workspace areas considering their axis-aligned bounding boxes $\text{aabb}(o)$ (Fig. 5). The formalism allows for integrating more complex location specifications in future work (e.g. spatial relations between parts).

**Fig. 4** A part ontology tree encodes "is-a"-relations to group different part leaf types into more generic type descriptions represented by inner tree nodes

#### **2.2 Task Models with Degrees of Freedom**

Our goal is programming tasks that can be adapted to product and process variety at execution time. To this end, we first define the notion of part templates which capture boundary conditions that parts used in a task must satisfy. A *part template* $p = (p^{\text{type}}, p^{\text{loc}})$ combines an arbitrary node $p^{\text{type}} \in O$ from the part type ontology with a location $p^{\text{loc}} \in L$. It describes a part with parameters that are possibly only partly known during the visual programming procedure, e.g. a conductor that may be either red, green, or blue and that lies at any position within a larger area on the workbench. Part templates enable task models with a certain degree of generality regarding part types and locations: In our framework, each task $(T, \prec_T)$ is composed of partially ordered *operations* $T = \{\tau_1, \tau_2, \ldots, \tau_{|T|}\}$. The partial order $\prec_T$ defines assembly precedence relations between operations, i.e. some operation $\tau_i \in T$ must be done before $\tau_j \in T$ ($i \neq j$) if and only if $\tau_i \prec_T \tau_j$. This task model is well known from the assembly planning domain [10] and suited for flexible production settings. We further describe each operation with a pair $\tau_i = (p_i, l_i)$ of a part template $p_i$ and a part goal location $l_i \in L$. The model thus covers any sort of operation where a part is transferred to a new location by the robot. This comprises basic pick-and-place actions as well as operations during which the transfer requires more sophisticated robot control (e.g. force-supervised gear meshing, see Sect. 3).

**Fig. 5** Our task editor (left) combines icon-based precedence graph modelling (**a**) with part creation in a virtual workspace (**b**). The modelling process outputs task models with associated operators to compare locations and part types (right)

Task models as defined above are underspecified, and each part template must be anchored to a physical entity when the task is executed (Sect. 1). To this end, the robot builds a *world model* $W = \{\hat{p}_1, \ldots, \hat{p}_{|W|}\}$ containing all entities perceived on camera images. Entities are encoded by part states. In contrast to part templates, *part states* $\hat{p} = (\hat{p}^{\text{type}}, \hat{p}^{\text{loc}})$ combine an ontology leaf node $\hat{p}^{\text{type}} \in P$ and a precise location $\hat{p}^{\text{loc}} \in L^{\text{prec}}$ as detected by object recognition. We say that an operation $\tau_i \in T$ may be applied to a part state $\hat{p} \in W$ if and only if $\hat{p}$ *satisfies* the part template $p_i$. Validation of this connection between part templates and states is achieved with a satisfies-function (Eq. 1).

$$\text{satisfies}(\hat{p}, p) = \begin{cases} \text{True} & \text{if } \text{is\_a}(\hat{p}^{\text{type}}, p^{\text{type}}) \land \text{is\_at}_{p^{\text{loc}}}(\hat{p}^{\text{type}}, \text{pose}(\hat{p}^{\text{loc}})) \\ \text{False} & \text{otherwise} \end{cases} \tag{1}$$
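The following sketch shows how Eq. 1 combines the type check with a location function. It is a simplified illustration under stated assumptions: poses are reduced to planar positions instead of 4×4 transforms, the bounding box is a single hypothetical `half_extent` value, and all names (`PartTemplate`, `PartState`, `area_location`) are invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Hypothetical ontology: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "part": None, "gear": "part", "red_gear": "gear", "blue_gear": "gear",
}

Position = Tuple[float, float]  # simplified planar pose (x, y)
LocationFn = Callable[[str, Position], bool]


def is_a(o: str, o_prime: str) -> bool:
    """True if o equals o_prime or is a descendant of o_prime."""
    node: Optional[str] = o
    while node is not None:
        if node == o_prime:
            return True
        node = PARENT[node]
    return False


def area_location(x_min, y_min, x_max, y_max, half_extent=0.0) -> LocationFn:
    """Build an is_at function for a rectangular workspace area.

    half_extent is a stand-in for aabb(o): the part's square footprint
    must lie completely inside the area.
    """
    def is_at(part_type: str, pos: Position) -> bool:
        x, y = pos
        return (x_min <= x - half_extent and x + half_extent <= x_max
                and y_min <= y - half_extent and y + half_extent <= y_max)
    return is_at


@dataclass(frozen=True)
class PartTemplate:
    part_type: str        # any ontology node
    loc: LocationFn       # location function of the template's location


@dataclass(frozen=True)
class PartState:
    part_type: str        # ontology leaf node
    pos: Position         # precise, sensed position


def satisfies(state: PartState, template: PartTemplate) -> bool:
    """Eq. 1: the type must generalize and the location must match."""
    return (is_a(state.part_type, template.part_type)
            and template.loc(state.part_type, state.pos))
```

For example, a template `PartTemplate("gear", area_location(0.0, 0.0, 1.0, 0.5, half_extent=0.05))` is satisfied by a red gear sensed at `(0.4, 0.2)`, but not by one whose footprint sticks out of the area.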

#### **2.3 Visual Programming**

Users create task models by interacting with a graphical editor shown in Fig. 5. To this end, it is first necessary to specify part templates for each part to be used during the task. A new template can be added by choosing its part type and initial part location. The user is in charge of selecting from the part type ontology appropriately so that the desired level of task generalization is reached. The selection of locations is supported by a virtual representation of the workspace. In the virtual workspace, a *workspace layout* as introduced in our prior work [16] offers pre-defined regions to be chosen as part locations (e.g. *l*<sup>4</sup> in Fig. 5, left). For each area defined by the layout, a location function based on the area corner vertices is instantiated and added to the location set *L* (Sect. 2.1). If the user prefers to specify part poses precisely (*l*1,*l*2,*l*<sup>3</sup> in Fig. 5), additional location functions are defined by corresponding precise poses. Having specified all parts, pick-and-place operations may be added. Finally, the operations are connected with precedence relations using the icon-based editor component. Currently, the system is based on a single pick-and-place skill – suitable control algorithms are derived from annotations to the part type ontology (e.g. force-supervised gear meshing vs. position-controlled placement of our benchmark conductor parts). Yet further classes of skills, e.g. for visual inspection or presentation of parts to the user for collaborative steps, can be added in the future.

#### **2.4 Plan Instantiation**

Having modelled a task with operations $T = \{\tau_1, \ldots, \tau_{|T|}\}$, users need to prepare the workspace by supplying necessary parts to the robot. After an active vision exploration procedure (see [2] for an overview of applicable methods), the robot has all detected parts stored in its world model $W = \{\hat{p}_1, \ldots, \hat{p}_{|W|}\}$. The next step is solving the anchoring problem as introduced in Sect. 1, i.e. mating each part template $p_i$ of operation $\tau_i$ with a part state $\hat{p}_j$ so that $\text{satisfies}(\hat{p}_j, p_i)$ holds. Assuming that the user has provided at least one part for each operation ($|W| \geq |T|$), this means $O(|W|!)$ possible assignments. Enumerating and testing those to find a valid solution, clearly, is a computationally infeasible combinatorial problem even for small $|W|$. However, we can apply efficient combinatorial optimization algorithms to this unbalanced assignment problem, e.g. the well-known Kuhn-Munkres algorithm [12] with $O(|W|^3)$ runtime complexity:

Let $\mathbf{C} = (c_{i,j})$ denote a $|T| \times |W|$ cost matrix with a row for each part template and a column for each part state. Any wrong assignment of $\hat{p}_j$ to $p_i$ is modelled to have infinite costs, whereas a correct assignment has no costs, i.e.

$$c_{i,j} = \begin{cases} 0 & \text{if } \text{satisfies}(\hat{p}_j, p_i) \\ \infty & \text{otherwise} \end{cases}, \quad i \in \{1, \dots, |T|\},\ j \in \{1, \dots, |W|\}. \tag{2}$$

Given $\mathbf{C}$, combinatorial optimization computes an optimal, injective assignment $f : \{1, \ldots, |T|\} \to \{1, \ldots, |W|\}$ which minimizes the total assignment costs $\sum_i c_{i,f(i)}$ ($i \in \{1, \ldots, |T|\}$). In our case, $f$ says that part template $p_i$ of operation $\tau_i$ must be associated with part state $\hat{p}_{f(i)}$ to incur the minimum cost assignment. By construction of $\mathbf{C}$, any solution involving a wrong assignment (cf. Fig. 2) leads to infinite overall costs. This means in practice that the user has not supplied all required parts to the workspace. In this case, our system outputs an error message to inform about missing parts. By contrast, a solution $f$ with 0 overall costs means that each part template was matched with a suitable entity in the workspace. The process can then proceed to the task sequencing step.
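The global decision described above can be sketched without an optimization library. Because all costs are either 0 or infinite, a zero-cost injective assignment exists exactly when every template can be matched to a distinct feasible state. The sketch below therefore uses a plain augmenting-path bipartite matching, which is a simpler substitute for the Kuhn-Munkres algorithm named in the text and not the authors' implementation. The function name `assign` and the string-based `satisfies` in the usage note are hypothetical.

```python
from typing import Callable, List, Optional, Sequence


def assign(templates: Sequence, states: Sequence,
           satisfies: Callable[[object, object], bool]) -> Optional[List[int]]:
    """Return f as a list (f[i] = index of the state for template i), or None."""
    # feasible[i][j] is True exactly where the cost matrix entry c_ij is 0.
    feasible = [[satisfies(s, t) for s in states] for t in templates]
    owner: List[Optional[int]] = [None] * len(states)  # state j -> template i

    def try_place(i: int, seen: List[bool]) -> bool:
        for j, ok in enumerate(feasible[i]):
            if ok and not seen[j]:
                seen[j] = True
                # Take a free state, or re-route the template that holds it.
                if owner[j] is None or try_place(owner[j], seen):
                    owner[j] = i
                    return True
        return False

    for i in range(len(templates)):
        if not try_place(i, [False] * len(states)):
            return None  # corresponds to infinite total assignment costs

    f = [0] * len(templates)
    for j, i in enumerate(owner):
        if i is not None:
            f[i] = j
    return f
```

With templates `["gear", "gear", "red_gear", "blue_gear"]` and sensed states `["red_gear", "blue_gear", "green_gear", "green_gear"]`, a local decision could give the red gear to the first generic skill and block the third one. The matching above re-routes earlier choices instead, so the strictly specified templates receive the red and blue gear and the generic ones receive the green gears.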

The task sequencing procedure prepares a fully specified sequence of operations to be executed by the skill engine. For each operation τ = (*p*,*l*), a suitable input entity matching *p* is known from the above assignment *f* . We further use a grid-based placement planner that determines precise part goal locations whenever the operation goal location *l* is an area. Finally, the precedence graph is transferred into a sequence that complies with all "earlier-later" relations. The fact that we are using a graph structure as task model opens a range of future possibilities here: Aside from searching for an operation sequence that optimizes energy consumption or other secondary criteria, planning of collaborative action with a human-robot scheduler would also be feasible at this point in the process.
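Turning the precedence graph into one admissible operation sequence amounts to a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or newer); the function name `sequence` and the operation identifiers are hypothetical, and the scheduling algorithm actually used by the sequencer is not specified in the text.

```python
from graphlib import TopologicalSorter


def sequence(operations, precedences):
    """Return one operation order that respects all precedence relations.

    operations:  iterable of operation identifiers
    precedences: iterable of pairs (a, b) meaning "a must be done before b"
    """
    sorter = TopologicalSorter({op: set() for op in operations})
    for before, after in precedences:
        sorter.add(after, before)  # 'after' depends on 'before'
    # static_order() raises graphlib.CycleError if the relations are cyclic.
    return list(sorter.static_order())
```

For operations `t1` to `t4` with `t1` and `t2` before `t3`, and `t3` before `t4`, any returned order places `t1` and `t2` ahead of `t3` and ends with `t4`. A scheduler optimizing secondary criteria would choose among these valid orders.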

#### **3 Experimental Validation**

We have modelled four benchmark tasks (Sect. 2.3) which are designed to illustrate specific aspects of product and process variety (Fig. 6a): Product variety is represented by task S1, in which gears of arbitrary types (red, blue, green, cf. Fig. 4) are assembled with force-supervised robot control. Task S2 is a kitting task, where a connector of each type is added to a bundle of three. Tasks S3 and S4 replicate assembly tasks of electrical circuits with a serial/parallel connection. The tasks S2–S4 use region-based initial locations, thus enabling convenient part feeding by the user. Task S2 furthermore allows for the bundle to be placed anywhere within an area. We have executed each task with different workspace configurations (e.g. S1 with different part types, S4 with orderly or arbitrarily placed connectors, cf. Fig. 1). Online adaptation and task execution in these differing settings were achieved successfully.

Moreover, a theoretical comparison of the effort needed for adaptation with our approach versus the traditional re-programming method was conducted. We say that a production cycle consists of executing a task $N$ times, i.e. finishing $N$ instances of a product. By introducing the *flexibility demand ratio* $FD = \frac{1}{N}$, we characterize the manufacturing setting, i.e. traditional mass production with hardly any adaptations for $FD \to 0$, decreasing lot sizes for $FD \to 1$, and one-off products for $FD = 1$. The adaptation effort per cycle of our approach depends on $N$, as each program execution is preceded by exploration and assignment planning. Re-programming effort is not required during a cycle as the task models for S1–S4 have covered all necessary adjustments. During the experiments with our benchmark tasks, an exploration time of about 9 s was measured whereas the planning time was negligible. By our definition, the effort per cycle for adaptation by visual re-programming is independent of $N$ and therefore constant. However, the re-programming time including loading and saving the task model depends on the degree of necessary changes. We have considered three cases where only one operation or corresponding part (minimum effort); half of the involved parts (medium effort); or all parts (maximum effort) need to be adjusted in the task model between consecutive cycles. Representative durations of these three re-programming types have been gathered by observing an expert operate our task editor (min. ≈ 31 s; med. ≈ 80 s; max. ≈ 110 s).

Figure 6b compares the time allocated for adaptation within a production cycle depending on $FD$. In general, our approach achieves better results than manual re-programming in highly flexible domains, i.e. for higher $FD$ values, since task variants are widely encoded in the task model. In particular, it performs better for lot sizes of three or less, even when considering re-programming with minimum effort. In other words, less adaptation effort is needed with our approach compared to manual re-programming for finishing three products, which confirms our hypothesis regarding economic efficiency (Sect. 1). For medium and maximum re-programming effort, this amortization threshold shifts towards larger lot sizes. However, the effort for exploring the workspace before each task iteration renders re-programming more efficient in mass production settings with relatively few changes. These quantitative results must of course be interpreted within the limits of our benchmark tasks. Yet, our analysis illustrates qualitative relationships that are transferable to other scenarios and applications.

**Fig. 6** Our experiments comprise different benchmark tasks S1-S4 (**a**, goal states are rendered transparently). Adaptation time measurements enable a comparison of our approach and manual re-programming for different lot sizes (**b**)
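The amortization threshold can be reproduced with a small calculation from the reported durations. The model below is a simplification made for illustration: our approach is charged 9 s of exploration per task execution with negligible planning time, and manual re-programming is charged its one-off duration per cycle and nothing else. The function names are hypothetical.

```python
EXPLORATION_S = 9.0  # measured exploration time per task execution
REPROGRAMMING_S = {"min": 31.0, "med": 80.0, "max": 110.0}  # per cycle


def adaptation_time_ours(n: int) -> float:
    """Exploration precedes each of the n executions; planning is neglected."""
    return n * EXPLORATION_S


def largest_favourable_lot_size(reprogramming_s: float) -> int:
    """Largest lot size n for which n explorations take less time than
    one manual re-programming session."""
    n = 0
    while adaptation_time_ours(n + 1) < reprogramming_s:
        n += 1
    return n


thresholds = {k: largest_favourable_lot_size(v) for k, v in REPROGRAMMING_S.items()}
```

For minimum re-programming effort, three explorations take 27 s against 31 s, while four take 36 s, which agrees with the lot size of three stated above. Under the same simplified model, the thresholds for medium and maximum effort come out at 8 and 12.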

#### **4 Conclusion and Future Work**

In this paper, we have contributed a visual programming and robot task execution approach that incorporates product and process variety. For this, part templates are specified as input to robot skills in terms of approximate locations and generalized parts families. This leads to partly ambiguous, underspecified task models capturing a set of task variants. Adaptation to concrete parts is achieved online by workspace exploration and combinatorial optimization to anchor ambiguous part templates to perceived concrete parts. Our experiments with a set of characteristic benchmarks show how this approach helps to reduce the (re-)programming effort of robots in flexible manufacturing settings.

We will address several limitations of the approach in future work: Currently, the task structure and number of processed parts are fixed. Further task variety could be achieved by augmenting the task model with constructs such as loops for situation-dependent repetition of operations. Furthermore, we will extend the approach towards human-robot co-working by integrating multi-agent scheduling. Finally, our concept needs a comparison with other visual programming systems to evaluate the impact of generic part and location descriptions on usability.

**Acknowledgements** This work was partly funded by the Deutsche Forschungsgemeinschaft (DFG) under grant agreements He2696/20 (FlexCobot) and He2696/18 (VerbBot).

#### **References**



### **DOE Laser-assisted Turntable Calibration and Workpiece Registration**

Edgar Schmidt, Pascal Ruppert and Dominik Henrich

#### **Abstract**

The ability to review robot programs before they are executed can be used to correct erroneous programming. In complex processes, such a review can only be achieved by integrating the additional peripheral devices and workpieces used into the programming framework. In this work, we present a semi-automatic method for the calibration of a turntable and a workpiece registration based on the turntable calibration utilizing only a DOE laser. The turntable pose is calculated by approaching markers on the turntable on a so-called acquisition plane. Based on the calibration, workpieces are registered using intersection points of the laser beams with the turntable rotary plate. We evaluated our approach in terms of accuracy and the amount of time for execution. The resulting poses of the turntable and workpieces are used to load models into the 3D simulation of the programming framework and thus can be used to review the robot program.

#### **Keywords**

Industrial Robots • Flexible Manufacturing • Small Batch Size • Kinesthetic Programming • Object Localisation

E. Schmidt (B) · P. Ruppert · D. Henrich University of Bayreuth, Bayreuth, Germany e-mail: edgar.schmidt@uni-bayreuth.de URL: http://robotics.uni-bayreuth.de

P. Ruppert e-mail: Pascal.Ruppert@uni-bayreuth.de

D. Henrich e-mail: Dominik.Henrich@uni-bayreuth.de

#### **1 Introduction**

In human-robot collaboration, humans and robots work together on tasks, or a human solves tasks by operating the robot, so that the advantages of both parties are combined. For example, in fibre spraying processes a spray gun is attached to a robot and the robot is programmed through direct guidance, so that the process knowledge of the operator is combined with the robot's precision and repeatability [1]. This approach, called *playback programming* or *kinesthetic programming* [2] is intuitive, since no robotics knowledge is required.

A useful extension to this programming approach is the editing of the robot program after recording the robot motion and before execution [3]. For instance, it is possible to cut out errors in the robot trajectory or to insert repetitions or branches. The editing concept can be extended to simulate the robot path with the goal to review and especially avoid incorrect robot programs. For this, the setup with all relevant objects must be known in the programming framework.

In this work, we present an extension to a kinesthetic programming framework [4], such that it is possible to calibrate an external turntable in relation to the robot and add it into the 3D simulation of the programming framework. Based on the calibration, the user can register workpieces in the 3D simulation of our framework. With this, it is possible to review robot trajectories in the 3D simulation of the framework as shown in Fig. 1.

Section 2 gives an overview of sensors and algorithms that can be used to calibrate peripherals and register objects. In Sect. 3, we describe our approach for the calibration of a turntable. Section 4 describes the method for registration of workpieces. Our extensions are evaluated in terms of accuracy and execution time in Sect. 5. Section 6 summarizes and concludes the paper.

**Fig. 1** Given a setup with a turntable and workpiece (left), our goal is to know this setup in the 3D simulation of our programming framework (right)

#### **2 Related Work**

If a robot has to work with or on objects, it is necessary to localize these objects and to create a virtual representation. For this, different sensors can be used. Multiple colour and depth cameras are commonly used to monitor a work cell [5]. Colour or colour depth cameras attached to a robotic arm are used for pose estimation and motion planning [6], and laser-assisted welding systems already utilize a digital industrial camera combined with a laser [7]. Nevertheless, camera-based methods need to handle changing lighting conditions [8] and motion blur [9]. Ultrasonic sensors are used to detect empty paint cups [10] and in outdoor applications, such as autonomous driving, radar and lidar are commonly used in case of lighting extremes such as day and night [11]. In our application, fibre spraying, only minor lighting changes occur and the process produces air pollution [12]. This results in an additional effort for cleaning and calibrating sensors. Because of this, the sensor used should be as simple as possible.

Most of the current challenges in object localization are not relevant to our application. As mentioned, highly changing lighting conditions do not have to be considered. Also, an approach which can handle a variety of non-stationary objects in a large scene is not required [13]. Additionally, there are no hard real-time requirements, so fast computation is not needed [14]. Consequently, we decided to use an approach which does not face these problems.

Therefore, a semi-automatic process in which a user can calibrate a turntable and register workpieces using a low-cost DOE laser is sufficient. Thus, there is no need for a complex object localization or additional sensors. Further, a DOE laser can be used as a visual aid for playback programming in general. The only drawback of such a method is the accuracy of the procedure, as this depends on the manually performed steps.

#### **3 Turntable Calibration**

We present a semi-automatic procedure with which a user can determine the complete pose of the turntable (TT) in reference to the robot, so that the TT can be included in the virtual simulation. After defining general terminology and functions for the calibration, we explain the calibration procedure in detail. In our case, the tilt of the TT is set manually before calibration, which results in the need for an explicit calculation of the tilt angle. If the tilt could be controlled and set by software, the presented method would be simplified.

#### **3.1 Definitions**

We define the origin of the world coordinate system in the base of the robot and as a right-handed system. A pose $P = (T, O)$ in our world consists of a position $T := (x, y, z)$ with $x, y, z \in \mathbb{R}$ and an orientation $O := \mathbb{R}^{3 \times 3}$. The set of all poses is called $\mathbf{P}$. The extensions .*x*, .*y* and .*z* are used to reference the *x*, *y* or *z* coordinates of a pose. Further, the plane $A := \{(x, y, z_A) \mid x, y \in \mathbb{R} \wedge z_A = \text{const} \in \mathbb{R}\}$ is called *acquisition plane*. The smallest angle between $A$ and the laser beam mounted on the robot is called the *laser angle* $\lambda \in \,]0^\circ, 90^\circ]$. The limitation of $\lambda$ is given by the structure of the used robot cell.

We define a set of *markers* $M := \{m_0, m_1, m_2, m_3, m_4\}$ with $m_i \in T$ on the rotary plate of the TT. The marker $m_4$ is located in the centre of the rotary plate. The remaining markers are set on a circle with a known diameter $D$ around the marker $m_4$ and are indexed clockwise, so that ellipse semi-axes are defined by $[m_0, m_2]$ and $[m_1, m_3]$, as shown in Fig. 2 left. Further, a set of *robot poses* $\mathcal{P} := \{p_0, p_1, p_2, p_3, p_4\}$ with $p_i \in \mathbf{P}$ is given, which are obtained by pointing the laser beam at the markers $m_i$ on a fixed $A$ with a fixed laser angle $\lambda$.

A *turntable configuration* $TTC := (\varphi, T_T, O_s)$ consists of a *tilt angle* $\varphi \in [0^\circ, 90^\circ]$, a *turntable position* $T_T$, in which the *mounting height* $z_T$ is known from the structure of the robot cell, and a *turntable orientation* $O_s := \{\textit{left}, \textit{down}, \textit{right}, \textit{up}\}$. The orientation $O_s$ is simplified, since the TT can only be tilted to the left, down, right and up, which corresponds to rotations of 0°, 90°, 180° and 270° around the *z*-axis of the TT, as shown in Fig. 2 right. The intrinsic parameters of the TT are known. Thus, we can use the function $\tau : \varphi \times O_s \to T_T$, which returns the translation between the centre of the rotary plate and the TT position based on the tilt angle $\varphi$ and the orientation $O_s$.

#### **3.2 Calibration Method**

Before each calibration, a reference run of the TT is performed, which guarantees correct marker positions on the turntable. Then, the user guides the robot by hand so that the DOE laser beam points at the markers $M$ on the rotary plate. The rotational axes as well as the translation in *z* are locked while guiding the robot, defining $A$. The number of markers on the turntable is minimal, as $m_4$ serves as redundancy and can be added optionally. Marker $m_0$ is the highest point of the circle on which the markers are defined and the farthest away from the tilt axis. During the calibration, the markers are approached in ascending order of their indexing (clockwise) and the corresponding robot poses $\mathcal{P}$ on $A$ are stored. Based on the recorded $\mathcal{P}$, the turntable calibration $TTC$ can be calculated as follows.

**Fig. 2** The DOE laser beam is used to point at the markers from robot positions on the acquisition plane (left). Different mounting orientations of the turntable have to be considered (right)

**Orientation:** Due to the structure of the possible orientations, one of the line segments $[p_0, p_2]$ or $[p_1, p_3]$ is perpendicular to the world *x*-axis and can be used to determine the orientation. For this, we first compare the *y*-values of the $p_i$ to choose which line segment is perpendicular to the *x*-axis and then consider the values of the *y*-coordinate (down and up) or the *x*-coordinate (left and right):

$$O_s(\mathcal{P}) := \begin{cases} \textit{down}, & \text{if } |p_0.y - p_2.y| < \varepsilon \wedge p_1.y < p_3.y \\ \textit{up}, & \text{if } |p_0.y - p_2.y| < \varepsilon \wedge p_1.y > p_3.y \\ \textit{left}, & \text{if } |p_1.y - p_3.y| < \varepsilon \wedge p_1.x < p_3.x \\ \textit{right}, & \text{if } |p_1.y - p_3.y| < \varepsilon \wedge p_1.x > p_3.x \end{cases} \tag{1}$$

We allow inaccuracies in approaching the markers with the threshold $\varepsilon \ll 1$. The equation also holds if $p_0$ and $p_2$ are the same point, which occurs if the tilt and the laser angle are both 90°.
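The case distinction of Eq. 1 translates directly into code. In the sketch below, the recorded poses are reduced to their (x, y) coordinates on the acquisition plane, and the default threshold `eps` is an arbitrary assumption since the text gives no numerical value. The function name is hypothetical.

```python
def turntable_orientation(p, eps=1e-3):
    """Classify the turntable orientation O_s from the recorded positions.

    p:   sequence of (x, y) positions p0, p1, p2, p3 (a fifth entry p4 is ignored)
    eps: tolerance for inaccuracies when approaching the markers (assumed value)
    """
    (_, y0), (x1, y1), (_, y2), (x3, y3) = p[:4]
    # Cases are checked in the order in which Eq. 1 lists them.
    if abs(y0 - y2) < eps:
        if y1 < y3:
            return "down"
        if y1 > y3:
            return "up"
    if abs(y1 - y3) < eps:
        if x1 < x3:
            return "left"
        if x1 > x3:
            return "right"
    raise ValueError("recorded positions match none of the four orientations")
```

Raising an error for positions that fit no case is a design choice of this sketch; it makes an imprecise hand-guided approach visible instead of silently returning an arbitrary orientation.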

**Tilt angle:** The tilt angle $\varphi$ is determined with $p_0$ and $p_2$. Different cases are considered according to the given orientation $O_s$. In case of *up* and *down*, $\lambda$ directly influences the distance between $p_0$ and $p_2$. For *left* and *right*, the calculations are independent of $\lambda$.

*down*: The distance $d_x := |p_0.x - p_2.x|$ is projected from $A$ along the laser beams to the rotary plate, as shown in Fig. 3 top. By using $\lambda$, $d_x$ and $D$, a triangle can be constructed. Thus, we can determine $\varphi$ by the ratio of the side $d_x$ and the distance $D$, where $d_x$ depends on both $\varphi$ and $\lambda$:

$$\varphi = 180^\circ - \lambda - \arcsin\left(\frac{d_x \cdot \sin(\lambda)}{D}\right) \tag{2}$$

*up*: Analogous to *down*, a triangle can be constructed, but in this case $\lambda$ does not occur in the triangle directly, as shown in Fig. 3 bottom:

$$\varphi = \lambda - \arcsin\left(\frac{d_x \cdot \sin(180^\circ - \lambda)}{D}\right) \tag{3}$$

*left* and *right*: The markers $m_0$ and $m_2$ are not on a parallel line to the laser beam. However, the distance $d_y := |p_0.y - p_2.y|$ is independent of $\lambda$. As a result, $\varphi$ can be calculated by the ratio of the recorded distance $d_y$ and $D$:

$$\varphi = \arccos\left(\frac{d_y}{D}\right) \tag{4}$$

**Fig. 3** Derivation diagram of the tilt angle calculation for the cases $O_s = \{\textit{down}\}$ (top) and $O_s = \{\textit{up}\}$ (bottom)

**Position:** We move the TT with respect to the emitted laser beams of *P* and the TT mounting height *z<sub>T</sub>*. Let *M* be the centre of the recorded positions *P*. Then the turntable position *T<sub>T</sub>* can be defined with the laser angle λ, the mounting height *z<sub>T</sub>* and the function τ(ϕ, *O<sub>s</sub>*):

$$T_T := M + \frac{\tau(\varphi, O_s) \cdot z - z_T}{\tan(\lambda)} \begin{pmatrix} 1 \\ 0 \\ -\tan(\lambda) \end{pmatrix} + \tau(\varphi, O_s) \tag{5}$$

Thus, the turntable calibration TTC is achieved with Eq. 1 for the orientation, Eqs. 2, 3 or 4 for the tilt angle and Eq. 5 for the position.
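The orientation and tilt computations of Eqs. 1 to 4 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the default value of `eps` and the representation of the recorded positions as indexable coordinate tuples are assumptions, and `D` is simply the distance that appears in Eqs. 2 to 4.

```python
import math


def orientation(p, eps=1e-3):
    """Eq. 1: classify O_s from the recorded positions p[0]..p[3].

    Each position is indexable as (x, y, ...); eps is an assumed default.
    """
    y0, y2 = p[0][1], p[2][1]
    x1, y1 = p[1][0], p[1][1]
    x3, y3 = p[3][0], p[3][1]
    if abs(y0 - y2) < eps and y1 < y3:
        return "down"
    if abs(y0 - y2) < eps and y1 > y3:
        return "up"
    if abs(y1 - y3) < eps and x1 < x3:
        return "left"
    if abs(y1 - y3) < eps and x1 > x3:
        return "right"
    return None  # no case of Eq. 1 applies within the threshold


def tilt_angle_deg(p, o_s, lam_deg, D):
    """Eqs. 2-4: tilt angle in degrees for the laser angle lam_deg (degrees)."""
    lam = math.radians(lam_deg)
    d_x = abs(p[0][0] - p[2][0])
    d_y = abs(p[0][1] - p[2][1])
    if o_s == "down":  # Eq. 2
        return 180.0 - lam_deg - math.degrees(math.asin(d_x * math.sin(lam) / D))
    if o_s == "up":  # Eq. 3
        return lam_deg - math.degrees(math.asin(d_x * math.sin(math.pi - lam) / D))
    if o_s in ("left", "right"):  # Eq. 4
        return math.degrees(math.acos(d_y / D))
    raise ValueError(f"unknown orientation: {o_s!r}")
```

With measured data, the arguments of `asin` and `acos` can leave [−1, 1] because of noise, so a practical version would probably need to clamp or reject such inputs.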

#### **4 Workpiece Registration**

The workpiece registration is also tied to the DOE laser and to moving the robot to specific points. In contrast to the turntable, there are no markers defined on the workpieces from the outset. To deal with this, we use points located on the workpiece surface that rests on the rotary plate. With these points, we define correspondence pairs between the workpiece and the rotary plate.

For each correspondence pair, the user first defines a relevant point on the digital surface shown in the graphical user interface of our framework. Then, the user guides the robot so that the DOE laser points to the corresponding point in the real world. This position is stored, and the system calculates the intersection of the laser beam with the upper surface of the rotary plate from the stored robot pose as the second point of the correspondence pair. There should be at least three correspondence pairs, since they define the contact plane to the rotary plate. After the definition of the correspondence pairs, the transformation between object and TT is calculated with pose estimation using singular value decomposition [15].
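The two geometric steps described above can be illustrated as follows. The sketch uses NumPy and the widely used SVD-based least-squares fit of a rigid transformation (often called the Kabsch method); whether this matches the exact formulation in [15] is an assumption, and all function and parameter names are hypothetical.

```python
import numpy as np


def laser_plate_intersection(origin, direction, plate_point, plate_normal):
    """Intersect the laser beam (origin + s * direction) with the plate plane."""
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    plate_point = np.asarray(plate_point, dtype=float)
    plate_normal = np.asarray(plate_normal, dtype=float)
    denom = plate_normal @ direction
    if abs(denom) < 1e-12:
        raise ValueError("laser beam is parallel to the rotary plate")
    s = plate_normal @ (plate_point - origin) / denom
    return origin + s * direction


def estimate_rigid_transform(src, dst):
    """Least-squares R, t with dst_i ~ R @ src_i + t for N >= 3 point pairs."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centred point sets
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    # Correct a possible reflection so that R is a proper rotation
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t
```

Here `src` would hold the points picked on the digital workpiece surface and `dst` the computed intersections on the rotary plate. The reflection correction matters in this setting, because all correspondence points lie in the contact plane and the third singular direction is therefore only determined up to sign.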

#### **5 Experiments**

We evaluate our approach in terms of accuracy and execution time for both proposed extensions. For this, we use the cell proposed in [1] and extend the programming framework presented in [4]. In terms of intuitiveness, a simple wizard-based UI was designed for the TT calibration (Fig. 4 left). It is also possible to save and load calibrations. For the workpiece registration, we implemented and compared two approaches: a laser-assisted registration method as proposed in Sect. 4, and the definition of correspondence points by clicking on the upper surface of the TT, so that the laser is not required (Fig. 4 right).

#### **5.1 Accuracy of Turntable Calibration**

We performed the calibration at various tilt angles ϕ. In the context of fibre spraying, ϕ = 0° and ϕ = 90° are most relevant. Additionally, we used different ϕ ∈ [0°, 45°] to show the accuracy for general tilt angles. Overall, 30 calibrations were performed: eleven each with *O<sub>s</sub>* = *left* and *O<sub>s</sub>* = *down*, and eight with *O<sub>s</sub>* = *up*. We neglected *O<sub>s</sub>* = *right*, since the calculation is identical to *left*. The ground truth position of the TT was gauged with a measuring tape, and the orientation was measured with a gyroscope sensor with an accuracy of ±0.06°.

**Fig. 4** After turntable calibration in a wizard-based UI dialogue (left), a workpiece is registered via a picking method (right) or a laser-assisted method

**Fig. 5** Results of the turntable calibration experiment. Left: Error in position *x* [cm]; Centre: Error in position *y* [cm]; Right: Error in tilt [°]

Figure 5 shows the results of our evaluation. The error along the *x*-axis is consistently smaller than along the *y*-axis, and both errors scatter to a similar extent, resulting in an error range of up to 1.5 cm when the two outliers are neglected. For most calibrations, the tilt angle error lies in [−0.2°, 1.1°]. The error of the tilt angle ϕ is smallest for *O<sub>s</sub>* = *left*; for *O<sub>s</sub>* = *up*, a similarly good accuracy is achieved, except for one outlier. For *O<sub>s</sub>* = *down*, the errors shift towards negative values, and two calibrations failed completely, with tilt errors of 10.9° and −22.5°. It is not clear what exactly caused these two failed calibrations.

In summary, the test cases for *up* and *left* have, on average, a tilt angle error in the sub-degree range and an average position accuracy in the sub-centimetre range. Despite a slightly larger error for *down*, the overall result is accurate, even though the calibration requires only a minimal number of markers.

**Fig. 6** Classes of workpieces for the registration evaluation: defined corners (left), rotationally symmetric (centre) and workpieces whose contact surface to the rotary plate cannot be targeted with the laser (right)

#### **5.2 Accuracy of Workpiece Registration**

We evaluated the cases ϕ ∈ {0°, 44°, 90°} for three different orientations *O<sub>s</sub>* ∈ {*left*, *up*, *down*}. Further, we distinguish three classes of workpieces: workpieces with defined corners (hexagon, Fig. 6 left), which can easily be clicked and approached; rotationally symmetric workpieces (tube, Fig. 6 centre), which offer no cues such as corners for point definition; and workpieces whose contact surface to the rotary plate cannot be targeted with the laser (shell, Fig. 6 right). We performed the registration 50 times with the two acquisition methods mentioned above (laser-assisted and picking) and used a different number of correspondence points each time. The workpieces are bolted to the centre of the TT, so that the position of the workpiece is exactly known with respect to the TT position. Since the definition of the relevant points is left to the user, a presumably optimal choice of points is assumed.

We measured the error of the centre position of the workpiece, the error of the contact surface, which is the surface containing the relevant points, and the rotation error around the normal vector of the contact surface. Figure 7 shows the accuracy results of all performed registrations. In general, all registrations achieve sub-centimetre accuracy. All workpiece classes have a similar spread in the sub-centimetre range, and the registrations that use more correspondences are more accurate. Regarding the orientation, the contact area often matches perfectly. However, there is a contact area error of up to 1.5° for the shell, independent of the registration method used. The largest outliers in the rotation error are due to the picking method. The remaining data points cluster in a rotation error range of [−2°, 2°].

The registration with picking is more accurate than the laser-assisted approach. One reason for this is that the workpieces are always mounted in the centre of the rotary plate, and it is easier to click the points accurately, as the centre can be found relatively well by the human eye. We also notice a slightly worse accuracy when tilting is present. Considering that our ground truth data is subject to error, the accuracy of the position is independent of the orientation and tilt of the turntable, as well as of the recording method. However, using clearly defined corner points on the workpiece, as well as a larger number of relevant points, improves the accuracy of the method further. The orientation of the registration is most accurate when the laser-assisted method is used. The contact surface rotation is error-free for the tube and the hexagon, and even the contact plane rotation of ≈ 1.5° is still usable for our application, the fibre spraying process.

**Fig. 7** Results of our workpiece registration evaluation. *a*)–*b*) Error in translation [cm]; *c*) Error in contact plane [°]; *d*) Error in rotation [°]. The overall results are shown with subsets characterized by the used registration method

#### **5.3 Execution Time**

We measured the amount of time to perform the turntable calibration and the workpiece registration during the aforementioned experiments. The turntable calibration time is defined as the time span between opening the wizard for the calibration and loading of the turntable 3D model into the simulation with the calculated calibration. In case of the workpiece registration, we measured the time from starting the UI dialogue for registration until the object is shown in the simulation window.

Most of the calibrations are performed in less than 150 s, with a minimum of 87 s, as shown in Fig. 8. The outliers occur for λ = 90°, where approaching by hand guidance is more strenuous. For the registration, the duration differs between the methods used: the laser-assisted method requires an amount of time comparable to the calibration, whereas the picking method takes less time. The wide range of the registration duration results from the varying number of correspondence pairs. On average, each additional correspondence pair adds ≈ 10 s for picking and ≈ 20 s for the laser-assisted method.

**Fig. 8** Measured execution time for calibration and registration [s]

The picking is carried out more efficiently because of the workpiece mounting and because it is performed only in the GUI. Overall, a fully known setup with workpiece and turntable is achieved in less than five minutes. A further reduction of this time is possible through regular use.

#### **6 Conclusion**

We present a method for the calibration of a turntable and the registration of workpieces based on the calibration. In contrast to the current trend in research, our method does not rely on several visual sensors. Instead, we utilize only a DOE laser attached to the robot. The DOE laser is used as a visual aid in approaching individual points, which are used for the determination of the turntable and workpiece poses. For successful calibrations, the translation error scatters in a 1.5 cm interval, while the tilt is in a range of [−3°, 3°] off the measured angle. The workpiece registration accuracy is in the interval of [−1, 1] cm. The rotational error of the registration is in the range of [−6.6°, 3.4°]. For both methods combined, five minutes of effort are needed. The approach can be used in the production of small batches, which in the worst case requires a new registration for each program execution. In future work, the known objects will be used to simulate fibre spraying processes. With this, robot trajectories can be verified for correctness before execution and, if needed, optimized to reduce production costs.

**Acknowledgements** The authors acknowledge Prof. Dr.-Ing. Stefan Schafföner, Georg Puchas and Jonas Winkelbauer from the Chair for Ceramic Materials Engineering (University of Bayreuth) for the excellent cooperation.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

**Path Planning 2**

### **Digital Geometry Recording for Automation of the One-Off Production of Pipes**

Jacques Biltgen, Sascha Lauer, Martin-Christoph Wanner and Wilko Flügge

#### **Abstract**

The manual production in pipeline construction is often related to the fact that either the capacities for automation are lacking or the production is too individual for a simple automation solution. However, automated production would increase productivity and quality, especially in metal processing. The challenge in the manufacturing process of fitting tubes is the batch size of one. Nevertheless, a non-time-consuming programming solution must be found to integrate a robot-based solution economically into the production chain. Offline path planning based on CAD models would be a suitable solution. To ensure that the robot-welded seams comply with the standards, there has to be consistent quality in the seam preparations. For the final quality, the direct integration of the CAD flow would be important. Due to transport and limited space, it is often impossible to use sensors to scan the piping. In case of lacking technical documentation, the pipes are still measured by hand, especially when replacing or modifying pipelines. The geometry survey is done in several steps: first, a rough drawing is made on the vessel; this is then converted into a technical sketch and only later transferred to a CAD programme. There can be several days between these steps and different operators. A method will now be presented to combine these steps with the support of a tablet. For this purpose, software is used on the tablet to digitise the geometries and prepare them for further offline path planning.

J. Biltgen (\*) · S. Lauer · M.-C. Wanner · W. Flügge

Fraunhofer Institute for Large Structures in Production Engineering IGP, Rostock, Germany e-mail: jacques.biltgen@igp.fraunhofer.de

S. Lauer e-mail: sascha.lauer@igp.fraunhofer.de

M.-C. Wanner e-mail: martin-christoph.wanner@igp.fraunhofer.de

W. Flügge e-mail: wilko.fluegge@igp.fraunhofer.de

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_22

#### **Keywords**

Digitalization · Robot welding · Process optimization · Pipe production · Robotics · Automation

#### **1 Introduction**

In pipeline construction, especially in repair, fitting pipes play an essential role. The manufacturing steps are mostly carried out manually. The challenge with robot-assisted automation is the small batch size, with mainly one-offs being involved. Since the selection of standard components is defined and recurring components are used, robot-supported automation is nevertheless possible. To be able to integrate this profitably into the entire manufacturing process, the upstream and downstream processes must be considered in addition to the welding process.

For an economic integration of a robot-based solution, the weld seams have to be produced with a consistent quality. To achieve this, the upstream processes have to be adapted. In this case, it is advisable to collect data of the pipe geometries during the manufacturing process and to integrate it into the path planning. In this way, even complex unique structures can be welded with a robot.

In the best case, the CAD data are integrated directly into the manufacturing process so that the data and the quality can be followed up. The digitalization of pipes is already available on the market in various design software packages [1–3]. These are aimed at the new planning of plants and are therefore unsuitable for integration on the construction site. A direct and intuitive integration of the technical data into the entire production process is not practicable. Moreover, additional measurement equipment is required for a detailed representation of the plants. In the case of pipe rehabilitation on ships, this is unsuitable due to the time-consuming transport, limited accessibility, and varying light conditions. Digital Mockups (DIM) are essential in the steps of planning, construction and accessibility analyses. They are used especially in aircraft construction [4], where a scan of the fuselage is carried out and transferred to CAD. Furthermore, there is the possibility to build up the DIM from CAD data. Another advantage of these mock-ups is the comparison of target geometries (CAD) and actual geometries. Due to the high effort, these procedures are mainly carried out for large batch sizes. Due to the lack of documentation, the long journeys and the very small premises, scanning the pipes is very work-intensive and is therefore not carried out.

A direct integration of the technical data already on site is necessary to improve the quality and the expenditure of time during the entire process chain [5]. Due to these deficits, a methodology was developed which enables the integration of CAD data into the process chain on the construction site.

Currently, a rough hand sketch of the pipe construction is made at the construction site and photos are taken for documentation purposes. After this, a technical sketch is derived from this manual sketch, which is then transferred to the CAD workstation. The CAD design and a list of parts are derived to order the components. Due to the repeated drawing and changing staff members, errors and mismatches can occur, which can lead to deviations in the required fitting tube. Since the CAD model is not directly integrated into the process chain, these deviations are not detected until the end of the process chain. This leads to an increased reworking effort.

#### **2 Approach to Automation**

To realize an automation of the manufacturing process by means of a robot, the entire process chain, the manufacturing process and the material spectrum have to be analyzed. Here, the welding of the pipe segments, which has so far been done manually, is to be carried out by a robot. To achieve the required accuracies for this step, the upstream process steps have to be analyzed and adapted if necessary.

#### **2.1 Analysis of the Manufacturing Process**

In one-off production, as in the production of fitting pipes, an exemplary process chain includes the following successive work steps:


The main part in the process chain is taken up by the programming of the welding robot. Therefore, it is important to implement a fast and intuitive integration of the used components in the postprocess and to counteract time-consuming programming. When integrating standard parts like pipe bends, branches, or flanges, predefined libraries, which can be extended and adapted afterwards, should be used to reduce the programming effort. Due to the complexity of the components, an offline-based program is defined through individual CAD data.

To realize welding automation by means of a robot, the quality of the upstream manufacturing steps has to be improved. These production steps include cutting, prepositioning, seam preparation and tacking. The direct integration and use of CAD data guarantee the required quality.

#### **2.2 Analysis of the Material Spectrum**

For the realization of automation, the material spectrum must be considered. This term includes the geometries and the materials. Figure 1 shows pipe constructions that are used for repair work on ships. Standard components with individual pipe lengths are used.

For the welding process, the material and the weld preparation are equally important. The most common joints are butt joints. In addition to these, there are pipe branches in various configurations. For this, offline path planning is indispensable, as manual programming of path support points is too time-consuming. The use of optical sensors in online programming can lead to problems with triangulation due to the different angles.

#### **2.3 Analysis of the Production Chain**

The manufacturing chain can be divided into three main sections: the upstream processes, the partially automated welding process and the downstream processes.


The upstream processes include the determination and acquisition of the geometry data on site. In the partially automated welding process, the steps described in Sect. 2.1 are considered. The downstream processes include sawing, surface treatment, straightening and quality control.

**Fig. 1** Examples for pipe components

The upstream processes are time-consuming and offer a lot of potential for savings. Likewise, they have a great influence on the final quality.

#### **2.4 Derivation to Automation**

In the previous sections, the current state was explained and the process steps that have potential for optimization were shown. The resulting information is now used to derive a digitalization concept. The digitalization should be designed in such a way that a direct communication between the recording of the geometry data and an automated solution is possible. A functional concept is to be developed from the previously derived requirements.

It has been found that for a robot-based welding concept in a one-off production, offline path planning on CAD data is the most sensible solution due to its high flexibility. However, the generation of CAD data can be error-prone due to multiple processing steps and thus has a major impact on automation. To avoid these multiple steps, the geometric data should be recorded in final form on the construction site. Since the components used for fitting pipes are standardized, the use of a mobile operating device with a database of the standard components is the obvious choice. Figure 2 shows a comparison of the conventional process and the automated process. Specially developed software allows different processes to be combined on a smartpad.

The smartpad takes over the creation of sketches, the taking of photos and notes, the recording of geometries and the different components with their specific values. The use of a digital solution eliminates the duplication of drawings. Another advantage results from the fact that all elements are already defined on site and misinformation is minimized by simplified sketches and parts lists.

This makes it possible to integrate the CAD workstation directly into the production chain and reduce the process throughput time. By integrating the CAD data, further process steps can also be partially or fully automated. In addition to automated welding, these include cutting and seam preparation of the pipes.

Due to their structure, programs for documentation purposes have programmatic hierarchies. These can be used to create a defined data structure. Through a targeted query of customer data as well as order data within the program structure, the projects can be structured and systematically stored. This systematic approach simplifies the traceability of data and increases quality. An important step for digitization is the definition of the most important information.

The CAD data generated this way is then used for path planning of the robot. A simulation tool is used to generate this data. In this tool, the seams can be selected, and the required paths are generated. Thus, collision control and control of the welding paths are already possible. The next step is the transfer of the generated welding paths to a robot cell.

**Fig. 2** Comparison of the conventional to the automated process chain

#### **3 Implementation of the Developed Process Chain**

The developed concept was integrated into the existing process chain. Two different programs were developed for the integration. The first program is an app on an Android tablet, where the technical data of the designed pipe constructions are entered. Ideally, the app guides the operator systematically through the raw design. This information is used in the production chain to optimize the process. For this purpose, a second program is used, which is integrated in a CAD tool. This program creates 3D models and technical documentation from the entered data. In order to test the application possibilities of the developed software, a field test was carried out and evaluated.

#### **3.1 Operating Concept on the Smartpad**

To implement the digitisation of the pipe constructions already on the construction site without additional devices, software for a mobile operating device was developed. This enables an operator to create a pipe construction from several pipe segments. During the input, the pipe construction is displayed according to DIN EN ISO 6412-2 [6] (see Fig. 3). This type of representation gives the operator a standardised visualisation. A database of standard components is stored in the software. With the help of this database, standardised components plus their technical parameters can be entered for the fitting pipes. The complexity and size of the fitting pipes are not limited by the software.

To create a pipe, the starting point must be entered in the input field for the start coordinates (7) and the end point of the respective pipe segment must be entered in the input field for the end coordinates (8). The pipe diameter and the desired wall thickness must be selected in the control panel (3). Then the weld fittings can be defined in the selection window for weld fittings (flanges, heads, tees, and reducers) (6). The program automatically selects the correct fittings for the selected pipes. Then the segment is added to the current pipe via the "Hinzufügen" ("Add") button (2). The pipe segment is listed in the table (4) and shown in the drawing (5) according to the standards. To enter connected pipe segments more quickly, the corresponding start option can be selected by choosing the input mode (9). There are several options that make more complex entries more user-friendly, e.g. to connect a pipe segment to a T-piece. Additional information, such as surface treatments or pipe shapes, can be specified using the input fields for additional information (10). Additional design options and layout options are available in the option buttons (1). New projects and sub-projects are also created via these buttons. There is also an option for notes and photos. The advantage of the user interface designed in this way is its intuitive usability, which follows the real workflow.

**Fig. 3** User interface for digital geometry input

After all the necessary data has been entered as described above, the software creates a parts list. This can be displayed and edited. The software always selects standard components for creation. However, if other components are needed in the project, they can be added or changed in the parts list. Figure 4 shows the editing of the parts list using the example of the pipe bends. Finally, the parts list is saved as a CSV file. The parts list is intended as a source of information for the following software in the CAD tool and is therefore not usable without further processing.
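The kind of data entered per pipe segment, and the derivation of a parts list that is saved as CSV, can be pictured with the following sketch. It is only an illustration under stated assumptions: the class `PipeSegment`, the column names in `FIELDS` and the aggregation of fittings by name, diameter and wall thickness are hypothetical and do not describe the actual app or its file format.

```python
import csv
import io
import math
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical column layout; the real CSV format of the app is not documented here.
FIELDS = ["type", "diameter_mm", "wall_thickness_mm", "length_mm", "quantity"]


@dataclass
class PipeSegment:
    start: tuple               # start coordinates (x, y, z) in mm
    end: tuple                 # end coordinates (x, y, z) in mm
    diameter_mm: float
    wall_thickness_mm: float
    fittings: list = field(default_factory=list)  # e.g. ["flange", "bend"]

    def length_mm(self):
        return math.dist(self.start, self.end)


def parts_list(segments):
    """One row per pipe segment, followed by the fittings counted by kind."""
    rows = [
        {"type": "pipe", "diameter_mm": s.diameter_mm,
         "wall_thickness_mm": s.wall_thickness_mm,
         "length_mm": round(s.length_mm(), 1), "quantity": 1}
        for s in segments
    ]
    counts = Counter(
        (name, s.diameter_mm, s.wall_thickness_mm)
        for s in segments for name in s.fittings
    )
    for (name, dia, wall), qty in sorted(counts.items()):
        rows.append({"type": name, "diameter_mm": dia,
                     "wall_thickness_mm": wall, "length_mm": "", "quantity": qty})
    return rows


def to_csv(rows):
    """Serialize the parts list rows to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Keeping the parts list as plain rows makes manual edits, such as replacing a standard bend, straightforward before the file is written.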

#### **3.2 Integration into the Process Chain**

The aim of digitalization is to increase productivity and quality by capturing the pipe geometry already at the construction site. After the pipe construction has been created with the mobile operating device, the software creates a project-specific folder in which all images, the CSV file and notes are assigned to the construction. The folders are assigned to defined customers and projects and can also contain sub-projects. The automated creation of the folder structure allows for easy traceability and editing. The prepared CSV file can be loaded into a CAD tool, which then generates and saves a 3D model and technical documentation such as parts lists and material lists with all the required information. Figure 5 shows the 2D visualization in the app and a 3D view in a CAD tool.

**Fig. 4** Display and editing of the parts list

The conversion into a 3D model takes place automatically in the CAD tool. The standard components used are defined by libraries. The lengths of the individual pipe segments are derived from the total lengths entered and the specific weld fittings. After creating the 3D model, the required pipe lengths and a quantity structure are exported for the operator.
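How a pipe length could follow from the entered total length and the attached weld fittings is sketched below. The rule used here, subtracting the installation length of each fitting from the entered total, is an assumption for illustration; the function name and the example dimensions are hypothetical, and the text does not state the exact rule applied by the CAD tool.

```python
def cut_length_mm(total_length_mm, fitting_lengths_mm):
    """Straight pipe length left after subtracting the fittings' installation lengths."""
    length = total_length_mm - sum(fitting_lengths_mm)
    if length < 0:
        raise ValueError("fittings are longer than the entered segment length")
    return length


# Example: 500 mm entered, with a bend (76 mm) at one end and a flange (45 mm) at the other
print(cut_length_mm(500.0, [76.0, 45.0]))
```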

The 3D model can now be integrated into an offline path planning. The MotoSIM tool [7] from Yaskawa was used in the trials. The individual seams can be selected in the tool and the set parameters (torch angle, angle of the positioner, approach and departure paths, etc.) can be checked. Afterwards, the check for collision and reachability can be carried out. An example of such a setup can be seen in Fig. 6.

#### **3.3 Evaluation**

In the methodology described, a digital concept was developed that shortens an existing process chain and improves the data exchange of individual processes. The digital solution has already been tested in field trials as part of a project. Simple pipe connections can be created intuitively and quickly via the tablet.

During the field tests, a reduction in processing time was observed using the software. Especially the representation according to DIN EN ISO 6412-2 [6] showed great potential in the field tests. The pipes to be replaced could already be matched with technical drawings here.

**Fig. 5** Creation of a 3D view and technical documentation from a technical drawing

**Fig. 6** Path planning on a digitalized pipe using offline programming

The field tests were realised up to the CAD workstation. The pipes could be transferred into a 3D model without any incidents. The CAD tool used for the verification was AutoCAD [8]. In AutoCAD, STEP files could be automatically generated and output, which could be integrated and processed in offline path planning programs and in CNC processing programs. In the development phase, only tubes in thin sheet metal were processed. The generation of the 3D models still required a lot of computing power in some cases.

Challenges arose in the representation of pipe nodes. The intersections between the pipes could not be modelled correctly due to complex interdependencies. Due to missing dependencies between individual components in AutoCAD, the pipe cannot be represented isometrically.

All in all, it could be determined during the field test that the upstream work steps can be automated in an economically efficient way.

Integration into the simulation software MotoSIM was carried out under laboratory conditions. During the tests with the simulation software, the complexity of the seams and the pipes became apparent. The paths created by the software had to be checked and adjusted frequently for the appropriate welding position. The problem here was that a two-axis positioner was used instead of a three-axis positioner.

#### **4 Conclusion**

It was shown that the digitalization of the fitting pipe production is already possible on the construction site. This is done without additional measuring equipment. Automation saves time by eliminating additional individual steps and increases quality by improving the traceability of the data. The data generated by this method can be transferred to various CAD tools.

The next steps are the continuation of the digital process chain, see Fig. 3. Currently, the steps up to the technical documentation have been considered. The transfer of seam preparation for automated welding will be a further step. The model data can already be entered into conventional CNC and path planning programs. However, a comparison between the actual and target model is desirable. The pipes are subject to manufacturing tolerances and have a conicity and wall thickness deviation [9]. This results in new challenges for the automated welding process.

An automated planning of the seam preparation for the thick sheet area has not yet been implemented. The integration of an automated seam planning would be a further optimisation possibility. When calculating the individual intersection contours, the different angles of attack, the tube diameters, the material thicknesses, and the seam preparation must be considered [10].

The integration into an offline program was tested. The next step here is the transfer of the welding tracks to a real robot cell.

#### **References**



### **Detection and Handling of Dynamic Scenes During an Active Vision Process for Object Recognition Using a Boundary Representation**

Dorian Rohner, Johannes Hartwig and Dominik Henrich

#### **Abstract**

For robot manipulators, it is nowadays necessary to know their surroundings. This knowledge consists at least of a world representation with recognized objects. During the reconstruction of scene objects from multiple views, changes, like positioning of the objects, or additional unwanted signals, like parts of a human co-worker, may occur. In this paper, we classify the possible changes for a specific type of representation (boundary representation models). Afterwards, we present an approach to detect and handle these changes to maintain a valid world model. To achieve this, we compare what should be visible in the world model reconstructed thus far with the actual information from the current view. The detected change is handled by using object hypotheses as well as geometric information from the world representation. Based on an evaluation, we show a proof of concept and the usefulness of our approach and suggest future work.

#### **Keywords**

Computer Vision • Robotics • Environment Reconstruction

#### **1 Introduction**

D. Rohner · J. Hartwig (B) · D. Henrich Bayreuth, Germany

e-mail: johannes.hartwig@uni-bayreuth.de

Robot manipulators are increasingly used in households and small and medium-sized enterprises (SME) [1]. In such applications, the robot's environment may change over time. The robot uses sensors to detect and understand the scene. This includes information about which work pieces are present and where they are located. A commonly used approach is to use a camera as a sensor and combine it with visual object recognition techniques. However, a single view of the scene is not always sufficient to identify all objects, e.g. due to occlusion or large scenes. The approach of generating new views is known as active vision [2]. Based on the scene recorded thus far, new views gathering additional information about the scene are determined and incorporated into the world representation. During the movement of the robot to the next view, a human co-worker may modify the scene. This invalidates some information in the robot's world representation. Additionally, the sensor may capture an unwanted signal, e.g. an arm of a co-worker. In both cases, it is not known a priori which part of the world representation is still valid or which part of the signal is useful. Therefore, it is necessary to develop and evaluate an approach to handle dynamic scenes as well as unwanted signals.

In this paper we present a novel approach regarding the handling of dynamic scenes using a 3D scene reconstruction. Based on the state of the art (Sect. 2) and our preliminary work (Sect. 3.1), we identify requirements, make necessary assumptions, and present our overall concept (Sect. 3.2). We classify all possible cases of how changes in the environment can occur between two recordings (Sect. 3.3) and describe a method to detect changes for a specific kind of world representation (Sect. 3.4). As in our previous work [3, 4], we use boundary representation models (B-Reps). We present an approach to handle changes and to incorporate them into the world representation to assure the validity after each view (Sect. 3.5). This approach is evaluated by a proof of concept as well as a comparison between our method and the ground truth B-Rep (Sect. 4). Finally, we discuss our contribution and future work (Sect. 5).

#### **2 State of the Art**

The detection and handling of dynamic scenes is encountered in several fields of research, e.g. in autonomous driving, computer vision, and robotics [5]. In these applications, the problem and its solution can be viewed from multiple aspects. On the one hand, the type of internal scene representation is of interest. It ranges from LIDAR sensor data [6] and point clouds [7, 8] to bounding boxes [9] (e.g. from semantic segmentation). The representation of the scene affects how changes can be detected and handled. When using point clouds, either each point must be handled on its own or a segmentation is necessary to group multiple points into clusters. In the case of semantic segmentation, these clusters are complete objects and can be used to identify changes between two frames.

On the other hand, changes can be handled in different ways. One method is to introduce a time component and aging [10, 11] to remove knowledge that has not been validated for a certain amount of time. The basic idea is to attach a certainty to each object instance and decrease it with increasing time and human proximity, as humans can only manipulate objects to which they are spatially close. Whenever an object is visible in the current view, the certainty is reset. Another approach is based solely on the given model: multiple views are compared to reason about which parts of the scene are still present. This can be done with background detection methods [12], in which multiple frames are processed and moving objects are identified in the foreground. Alternatively, two given frames can be compared to detect possible changes. Overall, the goal is to obtain a valid representation of the scene at all times.

Finally, it is of interest how the new poses for the sensor are obtained: e.g. a human operator uses a hand-held camera [10], or a robot system decides autonomously where to move [11]. Depending on the application, the requirements regarding the correctness of the world representation differ. In some cases, it is acceptable if a small movement of an object goes undetected and is therefore not handled. In other applications, it is preferable that every change is detected, even if this results in too many detected and handled changes.

A special case of how new poses are obtained is the explicit tracking of objects [13]. This approach requires that the object of interest is visible in a majority of all views. This assumption is difficult to fulfill in some robotic applications, especially when using an eye-in-hand camera. Another special case is visual servoing [14], in which the sensor follows the movement of the object.

Based on the state of the art and our previous research (see Sect. 3.1), we examine a model-based approach to dynamic scenes. The use of B-Reps for detecting and handling change is of particular interest, as the vision process utilizing B-Reps as a world representation is not well explored [3, 4]. On the one hand, existing aging methods can be applied to model-based approaches as well. On the other hand, it is of interest how B-Reps can handle dynamic scenes based only on their 3D geometry information. Furthermore, we use a robot-mounted sensor whose motion is controlled by an active vision process for object recognition. Therefore, we cannot move the camera to specific poses to detect or further investigate changes. Due to this, tracking methods are not applicable.

#### **3 Our Approach**

#### **3.1 Basic Approach for Static Scenes**

In our previous work we developed an approach for object recognition based on B-Reps [3]. In that work, we create an object database, in which every object is stored as a B-Rep. To obtain a scene representation, we use a robot-mounted depth camera and the resulting point cloud is transformed into a B-Rep [15, 16]. This B-Rep is the input for our object recognition approach. We determine multiple sets of hypotheses between the scene and objects from our database and select the best fitting one. Some objects may not be recognized with the current view; therefore we determine new views by using our active vision approach [4]. At a new camera pose we record another point cloud, transform it into a B-Rep, and merge it into the scene representation. Afterwards we apply our object recognition method once again. This procedure repeats until each object is correctly classified. The problem of dynamic scenes occurs between the capturing of two point clouds from different views.
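The iterative procedure described above can be sketched as a simple loop. This is only an illustration under stated assumptions: all callables (`capture`, `to_brep`, `merge`, `recognize`, `next_view`) are hypothetical placeholders that stand in for the components named in the text, and the stop criterion (no further view is proposed) is an assumed stand-in for "each object is correctly classified".

```python
from typing import Any, Callable, Optional, Tuple


def reconstruct_scene(
    capture: Callable[[Any], Any],                    # pose -> point cloud (hypothetical)
    to_brep: Callable[[Any], Any],                    # point cloud -> B-Rep of this view
    merge: Callable[[Any, Any], Any],                 # (world, view B-Rep) -> updated world
    recognize: Callable[[Any], Any],                  # world -> object hypotheses
    next_view: Callable[[Any, Any], Optional[Any]],   # -> next pose, or None when done
    start_pose: Any,
    max_views: int = 20,
) -> Tuple[Any, Any]:
    """Capture, convert, merge and recognize until no further view is requested."""
    world = None
    hypotheses = None
    pose = start_pose
    for _ in range(max_views):
        view = to_brep(capture(pose))
        # The first view initializes the world model; later views are merged in.
        world = view if world is None else merge(world, view)
        hypotheses = recognize(world)
        pose = next_view(world, hypotheses)
        if pose is None:  # assumed stop criterion
            break
    return world, hypotheses
```

A change in the scene can happen between two calls of `capture`, which is exactly the situation the following sections address.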

#### **3.2 Enhancement for Dynamic Scenes**

In this section we present our overall approach. The reconstruction and object recognition are an iterative process, and we want to ensure a valid scene representation after each view. Therefore, every change has to be incorporated directly. We primarily focus on the faces represented in the B-Rep when handling the dynamic scenes. Faces represent all information stored in B-Reps. If the faces are correctly reconstructed, the vertices and edges contained within a face are correct as well. Furthermore, faces are a robust and high-quality representation of objects, as a face is calculated by averaging over numerous points from a point cloud [15]. Therefore, our overall B-Rep is valid and correct if all faces are correct. Faces are the most important feature for our object recognition approach, as well as the active vision method.

The first step to handle the problem of dynamic scenes is to categorize the possible changes regarding faces. Based on this classification, we have to detect these changes within our representation. We capture a point cloud from a new view, convert it into a B-Rep, and compare what is currently visible and what should be visible based on the current world model from the current position. For each category, we discuss how this comparison can be calculated and how it can be detected within a scene. Finally, we handle the detected change in two ways. If object hypotheses are available, we utilize this information. If no hypotheses are available, we only use the information from the B-Reps given directly by the detected change. As an additional assumption (similar to our previous work [3, 4]) the position and extent of our working surface is known (in our context a table).

#### **3.3 Classifying Possible Changes**

Based on our previous work and the given assumptions, we classify the possible changes. We have to compare the world model reconstructed thus far and the current view in every step to ensure a valid representation. As we do not use a certainty or an aging process, the existence of a face in a scene is binary, meaning it exists or it does not. Now we discuss every possible case regarding the visibility of the face in the world model and the current view:

1. A face is not visible in the world model reconstructed thus far and ...
	- (a) is visible in the current view: *added* face.
	- (b) is also not visible in the current reconstruction: As this face does not exist, this case is not named and can be omitted.
2. A face is visible in the world model reconstructed thus far and ...
	- (a) is also visible in the current view: *validated* face.
	- (b) is not visible in the current reconstruction and ...
		- (i) it should be visible from the current point of view: *removed* face.
		- (ii) it cannot be seen from the current point of view due to occlusions or camera limitations: *occluded* face.
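The case distinction can be written as a small decision function. The sketch below assumes three boolean inputs per face (the names `in_world`, `in_view` and `should_be_visible` are illustrative and not taken from the paper) and returns `None` for the unnamed case.

```python
from enum import Enum
from typing import Optional


class Change(Enum):
    ADDED = "added"
    VALIDATED = "validated"
    REMOVED = "removed"
    OCCLUDED = "occluded"


def classify_face(in_world: bool, in_view: bool,
                  should_be_visible: bool) -> Optional[Change]:
    """Map the visibility of one face to one of the four change categories."""
    if not in_world:
        # Not in the world model: either newly seen or simply non-existent.
        return Change.ADDED if in_view else None
    if in_view:
        return Change.VALIDATED
    # In the world model but not seen now: removed or merely occluded.
    return Change.REMOVED if should_be_visible else Change.OCCLUDED
```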

#### **3.4 Detecting Changes**

To detect every change between two B-Reps, we describe a method for every case mentioned in the previous section. The basic approach for every case is to compare what is currently visible with what should be visible based on the world model. If we find a difference, we know something has changed within the scene and we can categorize this change.

We project the faces of the world model reconstructed thus far onto the 2D image plane of our current pose $T_C \in \mathbb{R}^{4 \times 4}$. The world model is given by the B-Rep $W = (F_W, E_W, V_W, B_W)$ (with faces $F_W$, half-edges $E_W$, vertices $V_W$, and boundaries $B_W$). Since the world model has multiple views incorporated, the 2D projection may contain faces which are not recordable from our current view. Therefore, we have to incorporate the view frustum of our depth sensor and further physical limitations (e.g. angle of incidence). All these limitations are collected in a tuple $L$. A projection can be described as $\varphi(B, T, L) \to \{1, \dots, \|F_B\|\}^{n \times m}$, with a B-Rep $B$ and a pose $T \in \mathbb{R}^{4 \times 4}$. The result stores in each pixel which face is in front (by using an ID). We thus obtain the projection of the B-Rep $W$ as $P_W = \varphi(W, T_C, L)$. We repeat this procedure with the reconstructed B-Rep $C = (F_C, E_C, V_C, B_C)$ from the current view. Therefore, we have two 2D projections: one of the world model reconstructed thus far, and one of the reconstruction from the current view, $P_C = \varphi(C, T_C, L)$. It should be noted that we still have full knowledge of which face in the 3D representation belongs to which pixels in the 2D projection. We can now compare these two projections by looking at every projected face. For each face we can search for a correspondence in the other projection. We can use these projections to determine whether a face $f$ is visible (1) or not (0) from a pose $T$, if another B-Rep $B$ is present, regarding physical limitations $L$, as $\phi(f, B, T, L) \to \{0, 1\}$.
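To make the idea of a per-pixel face ID image concrete, the following sketch rasterizes sampled face points with a pinhole camera and a z-buffer. It is a simplification under several assumptions that are not from the paper: faces are approximated by sample points already expressed in the camera frame, the limitations are reduced to the image bounds and a maximum depth, the intrinsics `fx, fy, cx, cy` are hypothetical, and the ID 0 marks background pixels.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float, float]


def project_ids(face_points: Dict[int, List[Point]],
                fx: float, fy: float, cx: float, cy: float,
                width: int, height: int,
                max_depth: float = float("inf")) -> List[List[int]]:
    """Return an ID image that stores, per pixel, the face closest to the camera."""
    ids = [[0] * width for _ in range(height)]              # 0 = background (assumption)
    depth = [[float("inf")] * width for _ in range(height)]
    for face_id, points in face_points.items():
        for x, y, z in points:
            if z <= 0 or z > max_depth:                     # behind camera or out of range
                continue
            u = int(round(fx * x / z + cx))
            v = int(round(fy * y / z + cy))
            inside = 0 <= u < width and 0 <= v < height     # crude view frustum check
            if inside and z < depth[v][u]:                  # z-buffer: keep nearest face
                depth[v][u] = z
                ids[v][u] = face_id
    return ids


def visible_faces(ids: List[List[int]]) -> Set[int]:
    """Collect all face IDs that own at least one pixel."""
    return {i for row in ids for i in row if i != 0}
```

With such an ID image, a visibility test for a single face reduces to checking whether its ID occurs in the image.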

In addition, correspondences between faces (in the 3D representation) are calculated by a function $\eta(f, g) \to \{0, 1\}$, based on their position, normal vector, and size. In our previous work [4], $\eta$ matches the explained-by function. This function returns for two faces $f$, $g$ whether they correspond (1) or not (0).
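A correspondence test of this kind could look as follows. This is a minimal sketch, not the explained-by function from [4]: the `Face` fields and all three thresholds are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec = Tuple[float, float, float]


@dataclass(frozen=True)
class Face:
    centroid: Vec   # position of the face, here its centroid in meters
    normal: Vec     # unit normal vector
    area: float     # size of the face in square meters


def eta(f: Face, g: Face,
        max_dist: float = 0.02,          # assumed position tolerance
        max_angle_deg: float = 10.0,     # assumed normal tolerance
        max_area_ratio: float = 0.2) -> bool:   # assumed relative size tolerance
    """Return True if two faces agree in position, normal vector and size."""
    dist = math.dist(f.centroid, g.centroid)
    cos_angle = sum(a * b for a, b in zip(f.normal, g.normal))
    cos_angle = max(-1.0, min(1.0, cos_angle))   # guard against rounding errors
    angle = math.degrees(math.acos(cos_angle))
    largest = max(f.area, g.area) or 1.0         # avoid division by zero
    area_diff = abs(f.area - g.area) / largest
    return dist <= max_dist and angle <= max_angle_deg and area_diff <= max_area_ratio
```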

If a face is visible in the current view but not in the world model, we conclude that it was *added*, resulting in $A = \{ g \in F_C \mid \nexists f \in F_W : \eta(g, f) \}$. If a face of the world representation is also visible in the current view, then there is no change but a validation. This set of *validated* faces is determined as $V = \{ f \in F_W \mid \exists g \in F_C : \eta(f, g) \}$. The case where a face is visible in the world model but not in the current view has to be subdivided into two cases. We have to make sure to detect occlusions correctly. If no correspondence is found, but the face should be perceivable from the current pose, we know that this face was *removed*. This can be denoted by $R = \{ f \in F_W \mid (\nexists g \in F_C : \eta(f, g)) \wedge \phi(f, C, T_C, L) \}$. If a face has no correspondence and is also not perceivable from the current pose, it is *occluded*: $O = \{ f \in F_W \mid (\nexists g \in F_C : \eta(f, g)) \wedge \neg\phi(f, C, T_C, L) \}$.
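The four sets translate almost directly into set comprehensions. In the sketch below, the correspondence test and the visibility test are passed in as callables; the parameter names are illustrative, and `should_be_visible` stands for the visibility function evaluated for the current view and pose.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple


def detect_changes(
    world_faces: Iterable[Hashable],
    view_faces: Iterable[Hashable],
    eta: Callable[[Hashable, Hashable], bool],
    should_be_visible: Callable[[Hashable], bool],
) -> Tuple[Set, Set, Set, Set]:
    """Compute the sets of added, validated, removed and occluded faces."""
    world = list(world_faces)
    view = list(view_faces)
    # A: faces of the current view without a correspondence in the world model.
    added = {g for g in view if not any(eta(g, f) for f in world)}
    # V: faces of the world model with a correspondence in the current view.
    validated = {f for f in world if any(eta(f, g) for g in view)}
    unmatched = [f for f in world if f not in validated]
    # R: no correspondence, although the face should be perceivable.
    removed = {f for f in unmatched if should_be_visible(f)}
    # O: no correspondence and not perceivable from the current pose.
    occluded = {f for f in unmatched if not should_be_visible(f)}
    return added, validated, removed, occluded
```

By construction, every face of the world model ends up in exactly one of the sets for validated, removed or occluded faces.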

#### **3.5 Handling Changes**

We now know for each face whether it should exist in the updated world representation. Our goal is to obtain a valid representation of the whole scene after each step. In our domain with complete objects, faces cannot exist on their own, as they always originate from an object. If a single face of an object is missing, the complete object should be removed.

The set *V* of validated faces can be handled directly by B-Rep merging [15]. The same is possible with the added faces *A*. As no decision can be made regarding the occluded faces *O*, we decide that they remain within the world representation. If they have in fact been removed, this will still be captured later in the active vision process.

Therefore, the set of faces to remove *R* remains. As we know the object instances $H = \{h_1, \dots, h_o\}$ (with every $h_i$ containing at least a B-Rep model representing the object), we can handle the detected changes in two separate ways: If the face that should be removed corresponds to an existing object hypothesis, we remove the complete hypothesis. This is done by determining and deleting all faces in the world representation which correspond to the hypothesis. First, we determine the set of faces to delete $D_H^0 = \bigcup_{\{h_i \in H \mid \exists r \in R : r \in f(h_i)\}} f(h_i)$. These are all the faces of hypotheses that correspond to a face to remove. By $f(h_i)$ we obtain all faces from the B-Rep $W$ corresponding to the B-Rep model of hypothesis $h_i$. Furthermore, we must remove all faces directly connected to the hypothesis to ensure a valid world representation (this originates from B-Reps as the underlying data structure). Therefore, the final faces to delete can be determined by $D_H^1 = D_H^0 \cup \{ f \in F_W \mid \exists g \in D_H^0 : \mathrm{neighbor}(f, g) \}$, where neighbor denotes whether two faces $f$ and $g$ are neighbored regarding their half-edges.
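The hypothesis-based deletion can be sketched with plain sets. The data layout is an assumption for illustration: `hypothesis_faces` maps each hypothesis to the world faces it explains, and `neighbors` maps each face to the faces it shares a half-edge with.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple


def faces_to_delete_with_hypotheses(
    removed: Iterable[Hashable],
    hypothesis_faces: Dict[Hashable, Set[Hashable]],
    neighbors: Dict[Hashable, Set[Hashable]],
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Return the faces of all affected hypotheses and their direct neighbors."""
    removed = set(removed)
    # First set: all faces of every hypothesis that contains a removed face.
    d0: Set[Hashable] = set()
    for faces in hypothesis_faces.values():
        if set(faces) & removed:
            d0 |= set(faces)
    # Second set: additionally all faces directly neighbored to the first set.
    d1 = set(d0)
    for g in d0:
        d1 |= set(neighbors.get(g, set()))
    return d0, d1
```

Only one ring of neighbors is added here, in contrast to the transitive expansion used when no hypothesis is available.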

If no hypothesis is available for a face, we remove all neighboring faces (meaning they share an edge). This process is repeated transitively, but the working surface is removed beforehand and therefore the procedure stops there. First, we determine the set of faces without a correspondence as $D_{\bar{H}}^0 = \{ r \in R \mid \nexists h_i \in H : r \in f(h_i) \} = R \setminus D_H^0$. Now, we add the neighboring faces by $D_{\bar{H}}^j = D_{\bar{H}}^{j-1} \cup \{ f \in F_W \mid \exists g \in D_{\bar{H}}^{j-1} : \mathrm{neighbor}(f, g) \}$. On the one hand, we have to delete multiple faces, as we do not know which object may correspond to these faces. On the other hand, we have to ensure the validity of the B-Rep. As a result, too many faces may be removed. However, the faces remaining within the scene can be examined later using the underlying active vision approach.
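The transitive expansion is a breadth-first search over the face adjacency that never enters the working surface. The sketch below uses an assumed data layout (dictionaries of sets) and treats the working surface as a set of blocked faces, which is one possible way to model "removed beforehand".

```python
from collections import deque
from typing import Dict, Hashable, Iterable, Set


def faces_to_delete_without_hypotheses(
    removed: Iterable[Hashable],
    hypothesis_faces: Dict[Hashable, Set[Hashable]],
    neighbors: Dict[Hashable, Set[Hashable]],
    working_surface: Iterable[Hashable],
) -> Set[Hashable]:
    """Expand removed faces without a hypothesis over all connected faces."""
    explained = set().union(*hypothesis_faces.values())
    blocked = set(working_surface)              # the expansion stops at the table
    # Seeds: removed faces that belong to no hypothesis.
    to_delete = (set(removed) - explained) - blocked
    queue = deque(to_delete)
    while queue:
        g = queue.popleft()
        for f in neighbors.get(g, ()):
            if f not in to_delete and f not in blocked:
                to_delete.add(f)
                queue.append(f)
    return to_delete
```

Because the working surface is excluded, faces that are connected to a seed only via the table are not deleted.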

#### **4 Evaluation**

#### **4.1 Setup**

The evaluation is split into two parts. On the one hand, we validate our classification of the different types of change. To do this, for each change type a scene is recorded which contains exactly one change. Furthermore, we validate the usefulness regarding scene-unspecific signals, e.g. a recorded human. On the other hand, we evaluate our approach by comparing the reconstruction of a dynamic scene and a static one. To achieve this, we build a scene and record it with our active vision approach and the handling of dynamic scenes enabled. When the reconstruction is completed, we use only the active vision approach on the now static scene to obtain a ground truth. To compare both reconstructions, we use these criteria: First, we count the number of faces. Second, we remove all faces which are in both reconstructions. To determine to which faces this applies, we use the definition of η, which determines whether two faces correspond to each other. If the number of unexplained faces is low, the two reconstructed scenes are similar. Finally, we delete all faces which are explained by manually validated hypotheses. This is necessary because some faces may be impossible to view in the static reconstruction due to occlusion. Additionally, our goal is a correct recognition of all objects, and not a complete reconstruction of the scene. If any faces remain afterwards, an error occurred during the reconstruction or during the handling of dynamic scenes and should be investigated further.
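The comparison criteria can be summarized in a short function. This is a sketch under assumptions: faces are arbitrary objects, `eta` is the correspondence test, and `explained_by_hypothesis` is a hypothetical predicate that encodes the manually validated hypotheses.

```python
from typing import Callable, Dict, Hashable, Iterable, List


def compare_reconstructions(
    dynamic_faces: Iterable[Hashable],
    static_faces: Iterable[Hashable],
    eta: Callable[[Hashable, Hashable], bool],
    explained_by_hypothesis: Callable[[Hashable], bool],
) -> Dict[str, object]:
    """Count faces, drop mutual correspondences, then drop explained faces."""
    dynamic = list(dynamic_faces)
    static = list(static_faces)
    # Faces without a correspondence in the other reconstruction.
    unmatched_dynamic = [f for f in dynamic if not any(eta(f, g) for g in static)]
    unmatched_static = [g for g in static if not any(eta(g, f) for f in dynamic)]
    # Faces that are still unexplained after using the validated hypotheses.
    remaining: List[Hashable] = [
        f for f in unmatched_dynamic + unmatched_static
        if not explained_by_hypothesis(f)
    ]
    return {
        "n_dynamic": len(dynamic),
        "n_static": len(static),
        "unmatched_dynamic": len(unmatched_dynamic),
        "unmatched_static": len(unmatched_static),
        "remaining": remaining,
    }
```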

As a hardware setup we use a KUKA LWR 4 robot with a hand/eye calibrated ENSENSO N10 depth camera. To ensure high-quality point clouds, we average over multiple point clouds recorded from one view to reduce the impact of noise. We utilize an object database with 25 instances [3], which consists of objects from different domains and complexity levels considering the number of faces, symmetry, and convexity.

#### **4.2 Results**

In the validation, we start with the removal of scene unspecific signals, as seen in Fig. 1. In a first scene, a human arm is reconstructed by multiple, planar segments. With the arm removed and another B-Rep incorporated into the scene, the segments are identified as *removed* and deleted from the reconstruction. Only one patch remains, since it is too small. Furthermore, more objects are classified correctly, as the arm occludes some of these.

Regarding the validation of every possible change case, the *removed* one can be seen in Fig. 2. From an initial pose two objects are reconstructed and identified correctly. The robot manipulator moves to a new pose to validate the hypotheses. One of the objects is removed in between and the corresponding faces are deleted in the resulting reconstruction. Additionally, the resulting gap in the table is closed. The next case is the *occluded* one, as shown in Fig. 3. First, one object is visible. In front of it, two more objects are placed, which occlude the first one. Furthermore, the now occluded object is removed. In the resulting representation, the new objects are added and the previous one is not deleted, due to the occlusion. As we cannot be sure whether this object is still there or not, it should remain inside the representation. Finally, another image is taken from a different view (from which the first object should be visible) and the object is removed from the representation, as we can be sure that it is not there anymore. The remaining two cases *added* and *validated* were evaluated as well but are not shown with figures here because the handling is done with existing and already evaluated methods.

**Fig. 1** The reconstruction of a scene with an irrelevant signal (left, surrounded in green) and the resulting reconstruction after another image from the same pose was incorporated (right). The B-Rep is drawn in gray; the hypotheses are the red wire frame models. The arrows indicate possible next views for the active vision process. The coordinate system indicates the base of the robot

**Fig. 2** Removing an object from the initial reconstruction (left) and the resulting representation (right). Camera frustum projection on the table is drawn in black

**Fig. 3** Removing and adding multiple objects in one step. An additional view is necessary to delete the removed object due to occlusion

For our evaluation, we used five scenes with different objects and overall complexity. Each scene was modified by multiple changes (moving, removing, and adding multiple objects and generating unwanted signals). One scene is visible in Fig. 4. Accumulated over all scenes we gathered 126 faces for the dynamic case and 120 for the static one. 97 faces from the dynamic reconstruction had a match in the static one, and 98 faces the other way around, meaning one face was explained by two others. This occurs e.g. when the complete face was not captured by the sensor and two patches were reconstructed instead of one complete face. Furthermore, 29 faces had no correspondence to the static reconstruction (28 in the other case). However, only 1 face was left after deleting all faces with hypothesis correspondences. The high number of faces without a correspondence originates primarily from occlusion during the static reconstruction. Furthermore, a few small faces are impossible to look at directly using active vision, due to collision prevention mechanisms (e.g. if the face is close to the working area). Therefore, in some reconstructions a face may be present because it was captured together with a neighboring face (which may not be the case in another reconstruction). The remaining face occurs because an old and a new face have properties that are too similar. An object was removed from the scene, and another object was placed there instead. One face of the new object has the same properties, here face area and normal, regarding the function η. Therefore, the old face was not deleted. Depending on the face, it may be removed if the camera takes a direct look at it and the algorithm is able to differentiate between the old and the new one.

Based on these results, we can conclude the usefulness of our approach: On the one hand, we can successfully tackle the problem of unwanted signals. If any scene-unrelated part is captured, it is investigated further by the active vision method and therefore deleted as soon as the disrupting object is removed. On the other hand, we can handle changes that occur during the robot movement as seen in our validation and evaluation.

**Fig. 4** One example scene of the evaluation. On the left the final world representation and pose of the dynamic evaluation is visible, and on the right the one for the static

#### **5 Conclusion**

We present a novel approach to detect and handle dynamic scenes for a special type of representation. Our approach uses a categorization of all possible types of change for B-Reps. To detect these changes, we compare the world model reconstructed thus far and the current view. To determine the type of change, we project both the world representation and the current view onto a 2D plane and compare what we see with what we should see. Afterwards, the detected change is handled, either by utilizing existing object hypotheses or by using only the geometric information from the scene. With an evaluation we conclude the usefulness of our approach for using B-Reps in object recognition.

Future work may include the usage of a time-based component to delete world model entries which were not validated within a certain period. Furthermore, different techniques of how much of the scene should be deleted if a face is missing can be implemented and evaluated. Finally, other representations than B-Reps are of interest, which are easier to keep valid when deleting faces.

**Acknowledgements** This work has partly been supported by the Deutsche Forschungsgemeinschaft (DFG) under grant agreement He2696/21 SeLaVi.

#### **References**



### **An Integrated Approach for Hand Motion Segmentation and Robot Skills Representation**

Shuang Lu, Julia Berger and Johannes Schilp

#### **Abstract**

In this work, an approach for robot skill learning from voice commands and hand movement sequences is proposed. The motion is recorded by a 3D camera. The proposed framework consists of three elements. Firstly, a hand detector is applied to each frame to extract key points, which are represented by 21 landmarks. The trajectories of the index fingertip are then taken as hand motion for further processing. Secondly, the trajectories are divided into five segments by voice commands and finger moving velocities. These five segments are *reach*, *grasp*, *move*, *position* and *release*, which are considered as skills in this work. The required voice commands are grasp and release, as they have a short duration and can be viewed as discrete events. Finally, dynamic movement primitives are learned to represent *reach*, *move* and *position*. In order to show the result of the approach, a human demonstration of a pick-and-place task is recorded and evaluated.

#### **Keywords**

Robot programming by demonstration • Skill representation • Motion segmentation

S. Lu (B) · J. Berger · J. Schilp
Fraunhofer IGCV, Augsburg, Germany

e-mail: shuang.lu@igcv.fraunhofer.de

J. Schilp Chair of Digital Manufacturing, Augsburg University, Augsburg, Germany

#### **1 Introduction**

The demand for customized products has been increasing rapidly in the last decades. The manufacturing process should be adjustable to individual requests. Collaborative robots can work hand-in-hand with human workers on assembly tasks, which can improve the flexibility of task execution. However, the application of hybrid systems is still in its infancy. One obstacle is the complex robot programming process. Another one is the expertise required from the worker for each specific type of robot. Moreover, the tasks have to be re-programmed each time a new request is received from the factory. This is time-consuming and causes higher production costs.

Learning from demonstration is a promising programming paradigm for non-experts. Kinesthetic teaching has been widely explored in the last decades for data collection [1]. However, the process can be tedious for a human worker, especially for multi-step tasks. Instead of guiding the robot directly by hand, visual observation has recently gained more attention, thanks to developments in the field of computer vision. Hand movement can be tracked and recorded by optical sensors. The trajectories from a demonstration are then segmented into elementary action sequences such as picking and placing objects, which are also known as skills. A task model is then defined as a sequence of skills [2]. The basic motions (*reach*, *grasp*, *move*, *position* and *release*) in methods-time measurement (MTM) [3] are considered as skills in this work, such that a learned task model can be optimized in a more flexible way during execution. For instance, the *move* motion can be optimized while *reach* and *grasp* remain unchanged. The representation is also beneficial for integrating natural language as voice commands such as *grasp* and *release*, since they can be considered as discrete events both for human speaking and for robot execution. The aim of this work is to develop a framework in which the robot is able to learn a task from integrated natural language instruction and video demonstration. The main contributions of this work are:


#### **2 Related Work**

This section provides a summary of recent literature on robot learning from visual observation. Ding et al. developed a learning strategy for assembly tasks, in which the continuous human hand movement is tracked by a 3D camera [4]. Finn et al. presented a visual imitation learning method that enables a robot to learn new skills from raw pixel input [5]. It allows the robot to acquire task knowledge from a single demonstration. However, the training process is time-consuming and the learned model is sensitive to environment changes. Qiu et al. presented a system which observes human demonstrations with a camera [6]. A human worker demonstrates an object handling task while wearing a glove. The hand pose is estimated based on a deep learning model trained on 3D input data [7]. The human demonstration is segmented by Hidden Markov Models (HMM) into motion primitives, so-called skills. The skills are represented by Dynamic Movement Primitives (DMPs), which allow the generalization to new goal positions. However, there are no rules for defining the semantics of skills in the existing works. *Pick up*, *place* and *locate* are considered as skills by Qiu et al. [6], whereas Kyrarini et al. define them as *start arm moving*, *object grasp*, *object release* [8]. This causes difficulty when comparing the performance of different approaches. Shao et al. developed a framework which allows a robot to learn manipulation concepts from human visual demonstrations and natural language instructions [9]. By manipulation concepts they mean, for instance, "put [something] behind/into/in front of [something]". The model's inputs are a natural language instruction and an RGB image of the initial scene. The outputs are the parameters of a motion trajectory to accomplish the task in the given environment. Task policies are trained by an integrated reinforcement and supervised learning algorithm.
Instead of classifying all possible actions in video demonstration, the focus of this work is to extract motion trajectories from each video.

#### **3 Motion Segmentation**

In this section, the methods for extracting hand motion trajectories and segmentation are described.

#### **3.1 Data Collection**

This work aims to extract human motion from video sequences, which consist of both color and depth information of the hand motion. Given the recorded motion data, a pipeline consisting of the following three steps is proposed. Firstly, objects that are more than one meter away from the camera origin are removed. Since the depth and color streams have different viewpoints, alignment is necessary before further processing. In the second step, the depth frame is therefore aligned to the color frame. The resulting frames have the same shape as the color image. Thirdly, the hands are detected by the MediaPipe framework [10] in the recorded color image sequences. The output of the hand detector is a set of 21 3D hand-knuckle coordinates inside the detected hand regions. Figure 1(a) shows an example. Each landmark is represented by its *x*-, *y*- and *z*-coordinates. *x* and *y* are normalized to [0.0, 1.0] by the image width and height, respectively, and *z* represents the landmark depth with the wrist as the origin. An illustration of the landmarks on the hand can be found on the MediaPipe website<sup>1</sup>.

<sup>1</sup> https://google.github.io/mediapipe/solutions/hands.html

If the hand is not detected, the time stamp is excluded from the output time sequences. Otherwise, the key points in pixel coordinates are transformed to the camera coordinate system shown in Fig. 3(a). A detailed illustration of the hand landmarks in the world coordinate system with the wrist as the origin is given in Fig. 1(b). A flowchart of the proposed pipeline is shown in Fig. 2.
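The depth clipping and the transformation from normalized image coordinates to camera coordinates can be illustrated with a minimal sketch. It assumes a pinhole camera model with intrinsics `fx`, `fy`, `cx`, `cy` and a depth image that is already aligned to the color frame and given in meters. The function name, the parameters and the convention that a zero depth marks a missing measurement are illustrative assumptions and are not taken from the original implementation.

```python
from typing import List, Optional, Tuple

MAX_DEPTH_M = 1.0  # points farther than one meter from the camera are discarded


def landmark_to_camera(
    x_norm: float,
    y_norm: float,
    depth_m: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Optional[Tuple[float, float, float]]:
    """Map one normalized landmark to camera coordinates (assumed pinhole model)."""
    height = len(depth_m)
    width = len(depth_m[0])
    # Normalized [0, 1] image coordinates -> pixel indices, clamped to the image.
    u = min(max(int(x_norm * width), 0), width - 1)
    v = min(max(int(y_norm * height), 0), height - 1)
    z = depth_m[v][u]
    # Assumption: zero depth means "no measurement"; far points are clipped.
    if z <= 0.0 or z > MAX_DEPTH_M:
        return None
    # Pinhole deprojection of pixel (u, v) at depth z.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)
```

Returning `None` for invalid depth mirrors the rule that time stamps without a usable detection are excluded from the output sequence.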

#### **3.2 Motion Representation**

The goal of motion segmentation is to split the recorded time series into five basic motions: *reach*, *grasp*, *move*, *position* and *release* [11]. Together they form the motion cycle of pick-and-place in multi-step manipulation tasks. The trajectories can be represented by *P*(*x*, *y*, *z*) = [*p*(*t*<sub>0</sub>), *p*(*t*<sub>1</sub>), …, *p*(*t<sub>i</sub>*), …, *p*(*t<sub>n</sub>*)], where *t<sub>i</sub>* represents the temporal information. The segmentation task is to define the start and end time of each motion. *Grasp* and *release* are two basic motions with short duration. By recognizing voice commands from the human, the time stamp of the voice input can be mapped to the hand motion trajectories. The *move* and *position* motions are segmented by hand moving speed. This is based on the assumption that the speed of the *position* motion decreases monotonically. The results of the segmentation are outlined in Table 1.
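The segmentation rule described above can be sketched as follows. The sketch assumes that the voice commands are available as two time stamps `t_grasp` and `t_release`, and it interprets the monotonicity assumption as "the *position* motion is the final run of samples before *release* in which the speed does not increase". Both the function names and this concrete interpretation are assumptions made for illustration.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]


def nearest_index(t: Sequence[float], t_event: float) -> int:
    """Index of the trajectory sample closest in time to a voice command."""
    return min(range(len(t)), key=lambda i: abs(t[i] - t_event))


def segment(
    t: Sequence[float], p: Sequence[Point], t_grasp: float, t_release: float
) -> Dict[str, Tuple[int, int]]:
    """Return inclusive sample-index ranges for the five basic motions."""
    i_grasp = nearest_index(t, t_grasp)
    i_release = nearest_index(t, t_release)
    # speed[k] is the mean speed on the interval between samples k and k + 1.
    speed = [
        math.dist(p[k], p[k + 1]) / (t[k + 1] - t[k]) for k in range(len(t) - 1)
    ]
    # Walk backwards from the release while the speed keeps decreasing
    # towards the release; this run is taken as the "position" motion.
    k = i_release - 1
    while k - 1 >= i_grasp and speed[k - 1] >= speed[k]:
        k -= 1
    i_split = max(k, i_grasp)
    return {
        "reach": (0, i_grasp),
        "grasp": (i_grasp, i_grasp),  # discrete event from the voice command
        "move": (i_grasp, i_split),
        "position": (i_split, i_release),
        "release": (i_release, i_release),  # discrete event from the voice command
    }
```

With noisy speed estimates, a smoothing step before the backward walk would likely be needed; it is omitted here to keep the rule visible.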

#### **4 DMP for Skills Representation**

A Dynamic Movement Primitive (DMP) is a way to learn motor actions, which are formalized as stable nonlinear attractor systems [12]. There are many variations of DMPs. As summarized by Fabisch [13], they have in common that


**Fig. 2** Flowchart of the process for generating hand motion trajectories

**Table 1** Representation of skills


The canonical system uses the phase variable *z* which replaces explicit timing in DMPs. The values are generated by the function:

$$
\tau \dot{z} = -\alpha z \tag{1}
$$

where *z* starts from 1 and approaches 0, τ is the duration of the movement primitive, and α is a constant that has to be set such that *z* approaches 0 sufficiently fast. The transformation system is a spring-damper system that generates a goal-directed motion, which is controlled by the phase variable *z* and modified by a forcing term *f*.

$$\begin{aligned} \tau \dot{v} &= K(g - y) - Dv - K(g - y_0)z + Kf(z) \\ \tau \dot{y} &= v \end{aligned} \tag{2}$$

The variables *y*, *ẏ*, *ÿ* are interpreted as the desired position, velocity and acceleration for a control system, *y*<sub>0</sub> is the start and *g* is the goal of the movement. The forcing term *f* can be chosen, for example, as

$$f(z) = \frac{\sum_{i=1}^{N} \psi_i(z) w_i}{\sum_{i=1}^{N} \psi_i(z)}\tag{3}$$

with parameters *w<sub>i</sub>* that control the shape of the trajectory. The influence of the forcing term decays as the phase variable approaches 0. ψ<sub>*i*</sub>(*z*) = exp(−(*d<sub>i</sub>*/2)(*z* − *c<sub>i</sub>*)<sup>2</sup>) are radial basis functions with constants *d<sub>i</sub>* (widths) and *c<sub>i</sub>* (centers). The DMP formulation presented in [14] is considered in this work, such that the desired velocity can be incorporated.
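To make the interplay of Eqs. (1) to (3) concrete, the following minimal sketch integrates a one-dimensional DMP with the explicit Euler method. The gains `K` and `D`, the value of `alpha`, the placement of the basis function centers and the width heuristic are assumptions chosen for illustration; the velocity-goal extension of [14] and the weight learning step are not included.

```python
import math
from typing import List, Sequence


def forcing(z: float, w: Sequence[float], c: Sequence[float], d: Sequence[float]) -> float:
    """Normalized radial-basis forcing term, Eq. (3)."""
    psi = [math.exp(-0.5 * d_i * (z - c_i) ** 2) for c_i, d_i in zip(c, d)]
    total = sum(psi)
    if total < 1e-12:
        return 0.0
    return sum(p_i * w_i for p_i, w_i in zip(psi, w)) / total


def rollout(
    y0: float,
    g: float,
    weights: Sequence[float],
    tau: float = 1.0,
    dt: float = 0.001,
    n_steps: int = 2000,
    K: float = 100.0,
    D: float = 20.0,
    alpha: float = 4.6,
) -> List[float]:
    """Integrate Eqs. (1) and (2) and return the position trajectory."""
    n = len(weights)
    # Assumed heuristic: centers equally spaced in time, widths from their gaps.
    centers = [math.exp(-alpha * i / max(n - 1, 1)) for i in range(n)]
    widths = []
    for i in range(n):
        j = min(i, n - 2)
        gap = centers[j] - centers[j + 1] if n > 1 else 1.0
        widths.append(1.0 / gap ** 2)
    y, v, z = y0, 0.0, 1.0
    path = [y]
    for _ in range(n_steps):
        f = forcing(z, weights, centers, widths)
        v_dot = (K * (g - y) - D * v - K * (g - y0) * z + K * f) / tau  # Eq. (2)
        y_dot = v / tau
        z_dot = -alpha * z / tau  # Eq. (1)
        v += v_dot * dt
        y += y_dot * dt
        z += z_dot * dt
        path.append(y)
    return path
```

Calling `rollout(0.0, 1.0, [0.0] * 10)` and `rollout(0.0, 2.0, [0.0] * 10)` shows the goal adaptation in its simplest form: the same primitive is replayed towards a different goal without relearning.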

#### **5 Experimental Setup**

The setup is illustrated in Fig. 3(a). The Intel® RealSense™ L515 3D camera is mounted on the robot to record hand movements. The working space of the robot is captured by the camera, as shown in Fig. 3(b). As a LiDAR camera, it projects an infrared laser at 860 nm wavelength as an active light source. 3D data is obtained by evaluating the time required for the projected signal to bounce off the objects in the scene and return to the camera [15]. The color images recorded by the L515 have a size of 1280 × 720 and the depth images a size of 640 × 480. Intel RealSense Viewer is used to record the video sequences. The natural language text for segmentation is manually inserted into the recorded sequences.

#### **6 Experimental Results**

To validate the proposed approach, a task demonstration is recorded with the setup in Fig. 3, in which a human demonstrates a pick-and-place task. The methods proposed in Sects. 3 and 4 are applied to the recorded sequence. The results are discussed in the following.

#### **6.1 Segmentation**

The index fingertip trajectory in *X* is outlined in Fig. 4. It shows that some data are missing due to depth errors in the data collection process. The segmentation result based on the voice commands *grasp* and *release* can also be found in Fig. 4. In the next step, the sequence between *grasp* and *release* is segmented into *move* and *position*, as illustrated in Fig. 5. This is achieved by defining the temporal information of the voice command input.

**Fig. 3** (**a**) Demonstration setup; (**b**) Color image; (**c**) Depth image

**Fig. 4** Hand motion segmentation based on voice command

**Fig. 5** Segmentation into *move* and *position*

**Fig. 6** Learned DMP for representing *reach*

#### **6.2 Learning DMPs**

Before learning the DMPs, linear interpolation is applied to both time and trajectory data. Additionally, the time series are shifted such that every trajectory starts at time zero. Three DMPs are learned to represent the trajectories in *X*, *Y* and *Z*. The implementation by Fabisch [13] is used to learn the DMPs<sup>2</sup>. For the sake of simplicity, only *X* and *Y* are shown in Fig. 6 and Fig. 7. The learned model can be adapted to new goal positions such as 0, 1 or 2.

<sup>2</sup> https://github.com/dfki-ric/movement\_primitive
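The preprocessing before learning, i.e., shifting the time axis to zero and linearly interpolating the trajectory, might look like the sketch below for one coordinate. Resampling onto a uniform time grid with `n_samples` points is an assumption; the source only states that linear interpolation and a time shift are applied.

```python
from typing import List, Sequence, Tuple


def resample(
    t: Sequence[float], y: Sequence[float], n_samples: int
) -> Tuple[List[float], List[float]]:
    """Shift time to start at zero and interpolate y linearly on a uniform grid."""
    t0 = t[0]
    ts = [t_i - t0 for t_i in t]
    step = ts[-1] / (n_samples - 1)
    grid = [k * step for k in range(n_samples)]
    values = []
    j = 0
    for t_g in grid:
        # Advance to the recorded interval [ts[j], ts[j + 1]] that contains t_g.
        while j < len(ts) - 2 and ts[j + 1] < t_g:
            j += 1
        w = (t_g - ts[j]) / (ts[j + 1] - ts[j])
        values.append(y[j] + w * (y[j + 1] - y[j]))
    return grid, values
```

Because excluded time stamps simply appear as longer intervals between recorded samples, the same routine also bridges the gaps left by missing depth data.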

**Fig. 7** Learned DMP for representing *position*

**Fig. 8** Learned DMP for representing *move*

To represent the *move* motion, it is essential that the DMPs can be adapted to different final velocities. The trajectories and velocities in Fig. 8 show that the learned DMPs can be adapted to different final velocities such as 0, 1 and 2, where *xd* and *yd* represent the goal velocity. Furthermore, the figure shows that the recorded hand trajectories are not smooth and cannot be applied to the robot directly. The smoothness can be improved by the learned DMPs.

#### **7 Conclusion**

To reduce the complexity of robot programming, an integrated approach is introduced in this work for robot learning of skills from voice commands and a single video demonstration. The index finger trajectories extracted from the video are first segmented into five basic motions: *reach*, *grasp*, *move*, *position* and *release*. This is realized by voice input of *grasp* and *release* during video recording, followed by segmenting *move* and *position* by hand moving velocity. DMPs are then learned to represent *reach*, *move* and *position*. They are adaptable to new goal positions and velocities. The experimental results show the feasibility of the proposed approach. In future work, the missing data problem caused by depth errors should be addressed. Furthermore, the learned DMPs should be evaluated on a real robot.

#### **References**



**Vacuum Gripper**

### **Flow Modeling for Vacuum Pressure-Based Handling of Porous Electrodes of Lithium-Ion Batteries**

Robert Schimanek, Muhammed Aydemir, Alexander Müller and Franz Dietrich

#### **Abstract**

In lithium-ion battery (LIB) production, limp electrodes are handled gently by vacuum pressure-based handling and transport systems, which generate a fluid flow that propagates through the porous electrode coating during handling. To investigate the limits and material-damaging behavior of vacuum pressure-based handling, it is necessary to understand how process parameters and electrode qualities affect fluid flow. Questions on how fluid flow reduces electrode quality are insufficiently addressed or modeled. Modeling the interaction between electrode and handling system requires knowledge of the effective surface geometry and the volumetric flow rate caused by the pressure difference. In this article, flow through porous electrode coatings during handling is modeled. Experiments demonstrate a flow behavior according to the generalized Darcy's law. Thus, modeling fluid flow through the electrode with Darcy's law improves the exploration of the limits and the design of vacuum pressure-based handling and transport of electrodes in LIB production.

#### **Keywords**

Lithium-ion battery • Electrode coating • Vacuum suction cup • Modeling • Darcy's law

© The Author(s) 2023

R. Schimanek (✉) · A. Müller · F. Dietrich

Institute of Machine Tools and Factory Management, TU Berlin, Berlin, Germany e-mail: r.schimanek@tu-berlin.de

M. Aydemir Turkish-German University, Beykoz, Turkey

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_25

#### **1 Introduction**

It is cost-intensive to optimize the production of lithium-ion batteries (LIB) through time-consuming experiments. To increase development efficiency, modeling methods are used that provide adequate results for a given quality. This article further increases development efficiency by simplified modeling.

LIBs are composed of electrochemical cells consisting of cathodes and anodes, both called electrodes, with or without separators and any electrolyte. The cell assembly produces the electrode-separator compound (ESC), a stack, a Z-fold, or a roll. Cell production includes handling and transport of the electrodes between their manufacturing and the sealing of the ESC. Electrode and LIB quality is affected by handling and transport parameters. Vacuum suction-based adhesion is a gentle way to handle and transport electrodes quickly. The background of this article is the modeling of the mechanical stress and damage that occur in the electrode during handling and transport. This article aims to model the interrelationships between the electrode active material (AM) properties, i.e., porosity, and vacuum suction-based adhesion parameters, i.e., pressure and volumetric flow rate during contact, to enable local stress and damage modeling of the AM.

The following describes the structure of the article. This section begins with a brief overview of the field of study. Section 2 provides fundamentals of suction-based handling and transport in cell assembly. It summarizes the modeling of electrode handling and transport. It motivates a flow model for suction-based adhesion of electrodes. Section 3 illustrates the physical problem and presents modeling based on Darcy's law for vacuum suction cups and effective vacuum surfaces. In Sect. 4, several electrodes are experimentally examined to evaluate the model. Finally, Sect. 5 concludes how the model enhances the design of handling and transport solutions for LIB assembly.

#### **2 Scope and Motivation of Fluid-Dynamic Electrode Model**

Electrodes must be gripped securely, fixed reliably, and not damaged during handling and transport [1]. Handling and transport can be carried out through adhesion using vacuum suction within material- and application-specific limits. Compelling examples in industry and research are introduced in the following.

In widely used modeling methods, adhesion is modeled as a uniform pressure distribution to account for loading, and flow through the electrode is neglected. Methods for local damage modeling that would benefit from improved flow modeling are considered below.

Section 2.1 introduces applications for vacuum suction-based handling and transport of electrodes. Section 2.2 analyses methods for modeling stress during electrode transport. Section 2.3 motivates a modeling approach for adhesion during electrode transport to increase development and commissioning efficiency.

#### **2.1 Relevance of Vacuum Suction-Based Handling and Transport of Electrodes in Battery Production**

In cell assembly, vacuum suction cups and area grippers are means for adhesion during handling [2]. These grippers adhere the electrode force-locked by a negative pressure difference between the inner and the ambient pressure. This gripping principle is popular for handling air-impermeable materials. However, electrode surfaces are air-permeable, which this article models and demonstrates by experiment. In the following, applications for electrode handling and transport are introduced and illustrated as per Fig. 1.

**Vacuum suction cups** are relatively inexpensive and stand out with high accuracy. The downsides include an increased risk of AM abrasion, electrode sheet absorption, and marks. For vacuum suction cups, the resulting flow through AM is modeled and experimentally evaluated hereafter.

**Vacuum effective surfaces** have multiple openings at low pressure to distribute the load resulting from the motion of electrodes. In the following, example applications such as *vacuum area grippers, vacuum deflection rolls, vacuum draw-off rolls,* and *vacuum conveyor belts* are described.

*Vacuum area grippers* have a high lateral force absorption and a high deposition accuracy. They reduce the risk of damage by distributing the gripping force across the entire electrode [2]. The model presented in the article covers the flow and pressure resulting from the multiple openings of these grippers.

*Vacuum draw-off rolls* were used to separate electrodes from a pile in the projects KontiBAT and HoLiB at the research group of the authors. An effective vacuum area in the roll (adjustable with suction insert) provides adhesion to orient and accelerate electrodes [3, 4]. This article models the flow resulting from the suction inserts.

*Vacuum deflection rolls* implemented in the KontiBAT project adhere, deflect and guide the electrodes while maintaining a constant material velocity. Vacuum deflection rolls have also been used to ensure a constant process velocity of continuous web-based electrodes [5]. The model developed in this article is applicable to the flow caused by the adhesion area of vacuum deflection rolls.

**Fig. 1** Vacuum suction-based handling and transport systems for LIB electrodes

*Vacuum conveyor belts* have been used to transport electrodes [6]. During transport, a lower pressure under the perforated belt ensures the fixation of the electrodes. The model developed in this article applies to vacuum conveyor belts.

In summary, vacuum suction is used in various applications to adhere and guide electrodes in space. The following discusses how the interaction of the applications and electrodes is modeled.

#### **2.2 Earlier Work in Modeling Handling and Transport of Electrodes**

In addition to modeling the kinematics of the electrodes, there are approaches capable of modeling effects on the electrode material to derive measures for the design of handling processes. Some approaches are subsequently evaluated.

**Finite element method** (FEM) has been used for the characterization of mechanical stresses, and their occurrence during handling processes on the electrode surfaces [3, 4]. FEM relies on continuous macroscopic models. The external loads on the electrodes are modeled as uniform surface stresses to represent, e.g., the suction pressure [3]. FEM cannot model the load from volumetric flow through the electrodes, which can be done with the model in this article.

**Computational fluid dynamics** (CFD)-FEM simulations are used to map the movement of the electrode foils in air-filled space and to model the use of different operating materials. This is particularly suitable for mapping macroscopic processes and determining the effects of forces on the contact points of the electrodes [4]. Modeling the volume flow through the electrode material at the contact point has not been done with CFD-FEM to the authors' knowledge. This article models the volumetric flow through the electrode AM.

**Discrete element method** (DEM) models have aimed at reproducing mechanical tests such as nanoindentation and the mechanical behavior due to contact with a handling system [3]. Mechanical stresses on the electrode at the microscopic level from vacuum suction-based handling and transport have not been modeled with DEM. The modeling in this article enables it.

In summary, none of the existing approaches modeled flow through the electrodes to the authors' knowledge. If the flow through the electrode material and its effects can be better modeled, the attempts to parameterize the systems could be reduced. This reduction serves the goal of development efficiency and the scope of the article. As a result, the focus of this article is on modeling the flow through the electrode material.

#### **2.3 Discussion**

Research aims to increase electrode dimensions for the application in e-mobility. However, this increases the stress at the AM that interacts with the handling and transport system during cell assembly. The industry looks for AM that is easy to handle and has a high energy density because the handling parameters play a decisive role in the competitiveness of production. In the development of new materials, only a few handling properties (e.g., electrode strengths) are taken into account since other interrelationships are missing.

During ramp-up or changeover of production, the equipment is parameterized to electrodes, the speed is increased, the resulting quality is validated, and then the production speed is further increased. In addition, the time the equipment runs continuously is increased to identify long-term effects on the materials. These processes are time-consuming and inefficient.

Knowledge of the interrelationship between the handling parameters and the effects on electrode quality, or the ability to estimate them, would save many resources during development and commissioning. Up to now, the interrelationship between electrode properties, such as porosity, and process parameters, such as pressure difference and volumetric flow rate, of typical vacuum suction-based handling and transport operations in LIB production has not been sufficiently investigated. This article is the first step towards modeling the effects of macroscopic handling parameters on the quality of the electrode microstructure. The approach begins by modeling the adhesion at the macroscale in order to derive flow properties from there, which in turn allow modeling the effects on the morphology of the electrodes at the microscale.

#### **3 A Model for Vacuum Suction Flow Through Electrodes**

A model for vacuum suction flow through electrodes based on the generalized form of Darcy's law is presented in this chapter. The prerequisites and principles for the flow description in porous media are described in the following subsections. Section 3.1 models the flow through electrodes for a vacuum suction cup. Section 3.2 models the flow through electrodes for effective vacuum surfaces.

#### **3.1 A Flow Model for Vacuum Suction Cup Gripper**

Generalized Darcy's law can be used to model the flow, i.e., the pressure and velocity distribution through the electrode caused by a vacuum suction cup gripper. Darcy's law relates the volumetric flow rate *Q* to the pressure difference Δ*p* via the porosity φ and permeability *K* of the porous medium [7]. One assumption is made: the flow through the electrode is two-dimensional, with a channel height *h* that is small compared to the inner radius *r<sub>i</sub>*, *h* ≪ *r<sub>i</sub>*, creating a radial plane flow channel as per Fig. 2.

The radial flow channel is characterized by the boundary conditions for the pressures *p<sub>i</sub>* and *p*<sub>0</sub> at the vacuum suction cup's inner and outer radii *r<sub>i</sub>* and *r*<sub>0</sub>. Integrating generalized Darcy's law and identifying the pressure difference Δ*p* = *p*<sub>0</sub> − *p<sub>i</sub>* yields the pressure distribution through the electrode

$$p(r) = p_i + \frac{\Delta p}{\log(r_0/r_i)} \log(r/r_i). \tag{1}$$

**Fig. 2** Flow channel (*left*), parameter study of pressure and velocity (*middle*), determination relation for electrode permeability (*right*) for vacuum suction cups

For a given pressure difference, Eq. (1) yields a model of the pressure through the electrode, as shown for three different vacuum suction cups in Fig. 2. It can be seen that the pressure increase varies slightly with the geometry of the vacuum suction cups. From Eq. (1) and generalized Darcy's law, one obtains the radial velocity

$$
u_r(r) = -\frac{K}{\phi \cdot \eta \cdot r} \frac{\Delta p}{\log(r_0/r_i)}.\tag{2}
$$

A parameter study of Eq. (2) for different vacuum suction cups (see Fig. 2) illustrates the velocity of the flow. The flow velocity is highest at *r<sub>i</sub>* and lowest at *r*<sub>0</sub>. Also, small vacuum suction cups have a higher average local radial velocity than bigger ones. For evaluation of the model, one can measure the velocity resulting from the pressure difference. Since the volumetric flow rate *Q* is easier to measure, it is convenient to integrate Eq. (2) to

$$Q = \frac{2 \cdot \pi \cdot K \cdot h}{\eta} \frac{\Delta p}{\log(r_0/r_i)}. \tag{3}$$
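The three relations for the suction cup can be evaluated directly. The sketch below implements Eqs. (1) to (3) with SI units; the function names are illustrative, and any numerical values used with them (radii, coating thickness, porosity) are assumptions and not measured data from the experiments.

```python
import math


def pressure(r: float, p_i: float, dp: float, r_i: float, r_0: float) -> float:
    """Pressure at radius r, Eq. (1); dp = p_0 - p_i."""
    return p_i + dp / math.log(r_0 / r_i) * math.log(r / r_i)


def radial_velocity(
    r: float, K: float, phi: float, eta: float, dp: float, r_i: float, r_0: float
) -> float:
    """Radial velocity at radius r, Eq. (2); negative values point inwards."""
    return -K / (phi * eta * r) * dp / math.log(r_0 / r_i)


def flow_rate(K: float, h: float, eta: float, dp: float, r_i: float, r_0: float) -> float:
    """Volumetric flow rate through the radial channel of height h, Eq. (3)."""
    return 2.0 * math.pi * K * h / eta * dp / math.log(r_0 / r_i)
```

A quick consistency check is that `flow_rate(...)` equals `2 * math.pi * r * h * phi * abs(radial_velocity(r, ...))` for any radius `r` between `r_i` and `r_0`, which is the integration step that leads from Eq. (2) to Eq. (3).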

#### **3.2 A Flow Model for Vacuum Effective Surfaces**

The flow model through electrodes for effective vacuum surfaces uses potential analysis of flow. Potential flow models can be applied to viscous flows between closely spaced plates, which applies to vacuum adhering of electrodes. Moreover, a constant fluid density ρ<sub>F</sub> is assumed. The electrode plane is understood as the complex plane with *z* = *x* + i*y* ∈ ℂ (as illustrated in Fig. 3). In the plane, the potential is represented as the real part Φ of a holomorphic function *f*(*z*) = Φ(*x*, *y*) + iΨ(*x*, *y*), and the imaginary part Ψ gives the resulting flow direction. The real and imaginary parts of the complex velocity potential satisfy the Laplace equation in the plane.

**Fig. 3** Illustration of flow channel (*left*), parameter study of pressure and velocity through electrode from a vacuum effective surface with 30 circular openings (*right*)

From *f*, assuming a homogeneous permeability *K* of the electrode, one gets a relation for the pressure distribution of multiple openings *p<sub>n</sub>*(*z<sub>n</sub>* = *x<sub>n</sub>* + i*y<sub>n</sub>*) for any vacuum effective surface with circular suction areas. Each opening, into which a quantity *Q<sub>s,n</sub>* of fluid flows per unit AM thickness per unit time, contributes to the pressure distribution [7]. From these assumptions one can determine the velocity distribution in the *x* and *y* directions, *u<sub>x</sub>* and *u<sub>y</sub>*, as well as its average value *u* in the flow direction.

$$
u = -\frac{1}{\phi \cdot 2\pi} \sum_{n} Q_{s,n} \frac{1}{|z - z_n|} \tag{4}
$$

If all *Q<sub>s,n</sub>* are known, Eq. (4) allows modeling *u* through the electrode AM with porosity φ. From a practical point of view, it is interesting to model the values of the fluxes *Q<sub>s,n</sub>* with generalized Darcy's law and solve a linear system of equations for known opening pressures *p<sub>n</sub>*. Since the pressure difference Δ*p* is a design parameter of vacuum handling and transport systems, and the separate openings are often connected to the same vacuum-pressure reservoir, all openings are considered to have the pressure *p<sub>i</sub>*.
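For known fluxes, Eq. (4) is a plain superposition over the openings and can be evaluated pointwise. The sketch below represents positions as Python complex numbers, following the complex-plane description above; the function name and the example values are illustrative assumptions, and the expression is only meaningful for points that do not coincide with an opening center.

```python
import math
from typing import Sequence


def average_velocity(
    z: complex, openings: Sequence[complex], fluxes: Sequence[float], phi: float
) -> float:
    """Average velocity u at point z from all openings, Eq. (4)."""
    total = 0.0
    for z_n, q_n in zip(openings, fluxes):
        # Each opening contributes with the inverse distance to its center.
        total += q_n / abs(z - z_n)
    return -total / (phi * 2.0 * math.pi)
```

For a single opening at the origin, the result reduces to `-Q / (2 * math.pi * phi * r)` at distance `r`, which has the same 1/*r* decay as the radial velocity of the suction cup model.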

For the pressure model, it is convenient to introduce a Green's function *G*, which is defined as a solution of Laplace's equation that is symmetrical in the two points (*x*, *y*) and (*x*′, *y*′) (sometimes called mirror charges), possesses a logarithmic singularity when (*x*, *y*) = (*x*′, *y*′) and vanishes when (*x*, *y*) is a point on the boundary ∂*S* of the region in question [7]. When *G* is found, the pressure distribution for one circular vacuum-pressure region on a rectangular region can be calculated as

$$p(x, y) = -\frac{1}{2\pi} \int_{\partial S} p_b(x', y') \frac{\partial G}{\partial n'}(x, y, x', y') \, ds. \tag{5}$$

In Eq. (5), *p<sub>b</sub>* is the value of *p* at the region's boundary. The line integral elements are denoted by *ds*, *n*′ is the exterior normal to *ds*, and the integral extends over the whole boundary ∂*S*. Since the Laplace equation is linear and therefore allows superposition of its solutions, one can solve Eq. (5) for multiple circular openings of an effective vacuum surface. For that, one creates a linear combination of all pressure distributions, *p*<sub>all</sub> = Σ<sub>*j*</sub> *c<sub>j</sub>* · *p<sub>j</sub>*(*x*, *y*), and adjusts the constants of each term according to the inner and ambient pressure at the boundaries. With the resulting pressure distribution *p*<sub>all</sub>, one can then derive a formulation for the respective velocity distribution *u*<sub>all</sub> over the surface. One can model the velocity values with a known permeability *K* of the electrode material.

An example for *p*<sub>all</sub> and *u*<sub>all</sub>/*K* has been calculated for 30 circular areas of suction, which is illustrated in Fig. 3 in a boundary of 5 to 6 cm, representing the surface of a suction insert that could be used in a draw-off roll similar to that in KontiBAT or HoLiB [3, 4]. The pressure near the 30 suction areas with diameters of 3.5 mm is almost equal to the assumed inner pressure and increases faster with a shorter distance to the boundary.

#### **4 Evaluation of the Model**

An experiment is conducted to evaluate whether vacuum-based handling of porous electrode AM follows generalized Darcy's law. In addition, the measured data yield the permeabilities of the reference electrodes and are used to tune the presented flow model. Section 4.1 introduces the experimental setup and the measures to reduce the recorded data. Section 4.2 discusses the results of the experiment.

#### **4.1 Experimental Setup and Data Reduction**

Within the scope of the investigation, the influence of the gripping parameters, i.e., the pressure difference Δ*p* and the suction surface geometry *r<sub>i</sub>*, *r*<sub>0</sub>, on the surface quality was examined. In the experiment, the proposed model was examined according to Eq. (3). For this purpose, samples of the reference electrodes were placed on the vacuum suction cup under different Δ*p* while measuring the volumetric flow rate *Q*. For three different vacuum suction cups, Δ*p* values of ≈ 30 mbar and ≈ 200 mbar were chosen, according to the resolvable range of the sensors described below.

The thermal flow sensor *Festo SFAH* measures the volumetric flow rate *Q*. The differential pressure sensor module *Beckhoff AEM3712* detects the pressure difference Δ*p* between the pressure of the fluid *p<sub>i</sub>* and the ambient pressure *p*<sub>0</sub>.

Four reference anodes and four reference cathodes were each cut into six square samples with an edge length of 2 · *r*<sub>0</sub> of the vacuum suction cup. Each sample was placed on a vacuum suction cup at the test rig. A pressure difference was applied, and the volumetric flow rate was recorded. Afterward, the thicknesses of the samples and current collectors were measured with a micrometer screw gauge. The measured thicknesses were used to determine the thickness of the electrode material *h*, which is necessary for the proposed modeling (Eq. (3)) and sketched in Fig. 2.

At the beginning of the suction, the available volume of the vacuum suction cup's movable circular bellows is emptied, as shown by the volumetric flow rate measurements (illustrated for a cathode sample in Fig. 4). By identifying the asymptotic volumetric flow rate (*Q̇* ≈ 0), the average volumetric flow through the electrode is determined. Since Δ*p* in the regions of *Q̇* ≈ 0 is considered to be almost constant, the pressure difference Δ*p*(*Q̇* ≈ 0) is also averaged, and its standard deviation is calculated.
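The data reduction step can be sketched as follows. The sketch assumes that the asymptotic region is the trailing run of samples in which the finite-difference estimate of *Q̇* stays below a threshold `tol`; the threshold, the function name and this way of detecting the region are assumptions, since the source does not state how the region was identified.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def asymptotic_averages(
    t: Sequence[float], q: Sequence[float], dp: Sequence[float], tol: float
) -> Tuple[float, float, float]:
    """Average Q and dp over the trailing region where |dQ/dt| < tol.

    Returns (mean flow rate, mean pressure difference, std of pressure difference).
    """
    start = len(t) - 1
    # Walk backwards while the flow rate is (nearly) constant.
    while start > 0 and abs((q[start] - q[start - 1]) / (t[start] - t[start - 1])) < tol:
        start -= 1
    q_sel = list(q[start:])
    dp_sel = list(dp[start:])
    if len(q_sel) < 2:
        raise ValueError("no asymptotic region found for the given tolerance")
    return mean(q_sel), mean(dp_sel), stdev(dp_sel)
```

Each measurement run then contributes one averaged pair of flow rate and pressure difference to the subsequent regression.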

Subsequently, the averaged asymptotic volumetric flow rates are plotted over the averaged pressure difference as per Fig. 4. A linear least-squares regression is performed for each vacuum suction cup. The average dynamic viscosity on the measurement day was (18.2 ± 0.1) · 10<sup>−6</sup> Ns/m<sup>2</sup>. The magnitude of the electrode permeability *K* is calculated from the gradient of the fits, as per Eq. (3). In that case, the cathode's permeability is *K* ≈ 1773 ± 230 mD and the anode's permeability is *K* ≈ 2320 ± 613 mD, neglecting the values for the ESS-20 vacuum suction cup.
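The step from the regression gradient to the permeability follows from solving Eq. (3) for *K*. The sketch below assumes an ordinary least-squares line with intercept and SI input units (Pa, m³/s, m, Pa·s), and converts the result to millidarcy using 1 D ≈ 9.869 · 10<sup>−13</sup> m². Whether the original fit included an intercept is not stated, so this choice is an assumption, as are the function names.

```python
import math
from typing import Sequence

DARCY_IN_M2 = 9.869233e-13  # one darcy expressed in square meters


def least_squares_slope(dp: Sequence[float], q: Sequence[float]) -> float:
    """Slope b of the ordinary least-squares line q = a + b * dp."""
    n = len(dp)
    mean_dp = sum(dp) / n
    mean_q = sum(q) / n
    s_xy = sum((x - mean_dp) * (y - mean_q) for x, y in zip(dp, q))
    s_xx = sum((x - mean_dp) ** 2 for x in dp)
    return s_xy / s_xx


def permeability_md(slope: float, h: float, eta: float, r_i: float, r_0: float) -> float:
    """Permeability in millidarcy from the slope dQ/d(dp), inverting Eq. (3)."""
    k_m2 = slope * eta * math.log(r_0 / r_i) / (2.0 * math.pi * h)
    return k_m2 / DARCY_IN_M2 * 1000.0
```

Because the slope depends on *h* and on log(*r*<sub>0</sub>/*r<sub>i</sub>*), each suction cup geometry needs its own fit before the resulting permeabilities can be compared.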

#### **4.2 Results and Discussion**

In an experiment, three vacuum suction cups with different geometries were applied at different pressure differences to electrode samples while simultaneously measuring the volumetric flow rate. The calculated regressions of the measured volumetric flow over the pressure difference (as shown for the anode in Fig. 4) increase with the pressure difference. It is noticeable that, contrary to the expected course of the permeability determination curve (Fig. 2), the measured curve of the smallest vacuum suction cup (ESS-20) is significantly steeper. This is attributed to the discontinuities on the vacuum suction cup's contact surface, which increase the distance to the contact surface. The slopes of VASB-40 and VASB-55 are positioned relative to each other as expected as per Eq. (2). The asymptotic volumetric flow rates are often above the regression curve in the low-pressure range. This may be related to the differential pressure sensor's low measuring accuracy at low measured values.

**Fig. 4** Pressure difference and volumetric flow rate over time (*left*), volumetric flow rate over pressure difference of reference anode (*right*)

The characteristic regression curves support the modeling approach based on Darcy's law. As the sensors are comparably inexpensive and part of industrial practice, they are a practical means of determining the permeabilities of electrodes in order to model the flow for various handling geometries.

With the model, one can determine *K* of a reference electrode from relatively simple geometry, e.g., a vacuum suction cup, based on Eq. (3). From there, using the approach for modeling vacuum effective surfaces in Sect. 3.2, one can estimate the average local velocities and pressures in similar electrodes resulting from vacuum effective surface geometry, like a surface area gripper, a deflection roll, a conveyor belt, or a draw-off roll.

#### **5 Conclusion**

The authors began manual development and validation to improve cell assembly processes. In order to identify, e.g., handling limits, parameters such as pressure difference were varied, and their impact on the electrode surface and LIB quality was investigated. Identification of sources of damage to electrodes resulting from vacuum-effective surfaces is laborious and expensive. Modeling approaches offer the possibility of reducing the time for experiments and increasing development efficiency.

For numerical modeling of damage from the interaction of the electrode with the handling and transport system, knowledge of the effective surface's geometry and of the volumetric flow rate resulting from the applied pressure difference is required.

In this article, a pressure and velocity distribution model for the flow through electrodes during handling and transport was developed for vacuum suction cups and vacuum-area effective surfaces. It models pressure and velocity based on the porosity and permeability of the electrode, the pressure difference, and the handling system's geometry. The models in this article enable the creation of a simulation model to represent the interaction of electrode materials and fluids.

The presented model improves the development of electrode handling and transport processes at several levels. The model maps local effects on the electrode from the vacuum effective surface geometry. Thus, the model allows identifying critical areas of the electrodes for effect characterization (e.g., electromagnetic or electrochemical), which avoids experiments of little value. In addition, the model can be used to derive the effects of handling and transport on electrodes via measurements of porosity and permeability as early as the design of the electrodes in the laboratory. Furthermore, the model can be used to check whether a leakage from a non-optimal contact situation is present during handling and transport.

Our experience shows that modeling local stress and damage of the AM from handling and transport is possible. The presented approach makes it possible to model the local aerodynamic and adhesion load on the AM particles in conjunction with other techniques, but the process must be optimized. The local stress and damage modeling remains to be discussed as it is beyond the scope of this article.

**Acknowledgements** The German Federal Ministry of Education and Research (BMBF) funded this research under the program VIP+ and "Forschungsfabrik Batterie". This article is related to the project KontiBAT (project ID 03VP01480) and HoLiB (project ID 03XP0236B) of the competence cluster for battery cell production (ProZell). The authors acknowledge the support of Julian Marscheider, Sezer Solmaz, Adrian Porazynski, and Julia Kowal.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Empirically Adapted Model for the Handling of Variable Geometries with Vacuum-Based Granulate Grippers**

Christian Wacker, Niklas Dierks, Illgen Joana, Arno Kwade and Klaus Dröder

#### **Abstract**

Current industrial trends show an increasing demand for individualized products, which require highly flexible yet automated production systems. Universal handling systems offer an efficient solution for the flexible and safe handling of differing component geometries and shapes. An innovative gripper for form-flexible handling combining vacuum systems with the flexibility of granulate grippers was established in previous research and has continued to prove its flexibility by gripping wide varieties of object geometries. The current challenge of modelling and predicting gripping forces for this new gripper is addressed in this research. Multiple object geometries are selected and examined, with the parameters affecting the air permeability being the most important influence on the gripping forces. Along with an overview of influencing factors and parameters, a framework for a linear model enabling the prediction of gripping forces for different object shapes is developed. The basis for automated prediction of gripping strengths for different types of objects is established with this research and could be adapted with other, non-analytical models such as machine learning in the future.

#### **Keywords**

Handling · Granulate grippers · Flexible production

C. Wacker (\*) · I. Joana · K. Dröder
Institute of Machine Tools and Production Technology, Technische Universität Braunschweig, Braunschweig, Germany
e-mail: c.wacker@tu-braunschweig.de

N. Dierks · A. Kwade
Institute for Particle Technology, Technische Universität Braunschweig, Braunschweig, Germany

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_26

#### **1 Introduction—Challenges for Universal Handling**

The automation of handling processes is one of the key features and challenges for modern production and assembly. In large-scale automation processes with many identical objects, handling tools are adapted or specified for individual components. As an example, this can be done by choosing a suitable vacuum gripper size for the accessible object surface or adapting gripper fingers of mechanical grippers to the object contours [1]. For industrial applications with shortening development cycles of new product variants and sometimes even overlapping transitions between product generations, solutions for a multifunctional, flexible handling or gripping of different components are required [2]. According to Hesse [3], three different types of flexibility for automated handling processes can be defined:


Especially for smaller batches of varying or even individualized products, the necessary versatility and range of grippable objects can be achieved with universal grippers, which enable adaptable and flexible automation of these handling processes. In order to assess the suitability of a universal gripper for a spectrum of grippable objects, the respective effective gripping force and success frequency have to be examined in order to validate a gripper's applicability [4]. The main focus of this publication is the development of such an assessment for a universal gripper created at the Technische Universität Braunschweig [5] and the establishment of an analytical model to predict the respective gripping forces for different types of object contours.

#### **2 State of the Art**

The current state of the art describes multiple solutions for flexible gripping of different kinds of objects. One possibility for universal gripping is the combination of multiple different grippers into one multi-effector system, which is able to choose a suitable gripper from a selection of pre-installed specialized grippers. This can lead to high weights as well as somewhat bulky effector setups, but allows effective handling of multiple preselected types of objects (Fig. 1a). Another possible solution is the use of effector changing systems. The applications are similar to multi-effector systems, as the spectrum of objects

**Fig. 1** Solutions for gripping different objects [6–9]

also has to be exactly determined beforehand, as a suitable gripper has to be available for the required tasks (Fig. 1b).

Truly adaptable grippers (Fig. 1c) suitable for multiple shapes and object types can often be assigned to the field of soft robotics. Typical examples are adaptable surfaces, such as FinRays [9–11], often in combination with other gripping principles such as mechanical and electrostatic mechanisms [12].

One of these soft robotic applications that has been gaining traction in the past few years is the robotic gripper based on the jamming of granular materials [13, 14]. These grippers use airtight cushions filled with different granular materials. When a vacuum is applied to these cushions, the granular material compacts and jams, which enables a gripping force to be exerted on different kinds of objects through friction and interlocking with the respective surfaces. For these grippers, influences such as the stiffness of the cushion membrane material, the granulate material, the object enclosure and the conditions of the granulate material have been examined [15–18].

#### **Previous Works**

Based on a combination of the granulate grippers with a vacuum gripper, an innovative handling principle for form-flexible handling was developed at Technische Universität Braunschweig. The key difference to standard granulate grippers is a porous area in the gripping cushion, which allows an additional vacuum force to build up (Fig. 2).

Previous research has shown that this combination of gripping principles can achieve a high adaptability for gripping mechanisms and therefore a capability for handling large varieties of objects [20–22]. This previous research also examined influences of the airflow rate and vacuum on the state of the granulate material, with a certain solidity of the granulate being reached for the best vacuum seal and thus the highest vacuum. In this state the gripper combines the two gripping principles most effectively and the highest gripping forces are reached [5]. The key advantage compared to previous research on granulate grippers is the combination of the ability to grip flat objects [14] with the large increase in adaptability achieved with the granulate gripping principles.

**Fig. 2** Innovative vacuum-based granulate gripper **a** Example for this gripper [19], **b** Schematic and structure of the gripper

#### **3 Analytical Model Frameworks for Gripping Forces**

The resulting gripping force is influenced by different factors, which are described in Fig. 3. The influences are divided into three categories, originating from the applied gripper, the grasped object as well as the influences of the gripping strategy.

Under optimum conditions for all of these influencing factors, the porous area in the gripper cushion is fully sealed with the grasped object and a maximum vacuum gripping force can be applied. The theoretical maximum of this vacuum-based gripping force is calculated with the following formula:

$$F = \Delta p \cdot A \tag{1}$$

The gripping force *F* results from the applied vacuum *Δp* and the covered surface area *A* of the grasped object. For this research, the influences of the grasped objects are the main focus.


**Fig. 3** Influencing factors for the achievable gripping strength with the examined vacuum-based granulate gripper [5, 22, 23]

#### **Experimental Setup**

In alignment with preliminary experiments, a cylindrical gripper with a diameter of 150 mm and a height of 60 mm was chosen (see Fig. 2). The membrane consists of 1.25 mm polyurethane; the porous area of ~4500 mm² extends to a maximum diameter of 95 mm and is arranged symmetrically around the center axis. As granulate material, ~4100 ABS (Acrylonitrile–Butadiene–Styrene) beads with a diameter of 6 mm were used, filling 66% of the maximum cushion volume. The gripper is mounted via a K6D40 force sensor (maximum force in Z-direction of 500 N) to a Kuka LBR iiwa 14 R820 and coupled with an adjustable vacuum pump Variair Unit SV 201/2. The maximum compressor power of 4 kW results in a maximum pressure difference of up to 0.42 bar. The pressure difference is measured by a VS VP8 SA M8-4 with a range of −1 to +8 bar mounted close to a valve between the vacuum pump and the gripper.

As the gripper shows similarities to approaches with standard vacuum grippers, a similar gripping and motion sequence is applied in this research; the measured forces and pressure differences are shown in Fig. 4. After positioning the gripper directly above the clamped test object, the gripper is moved perpendicularly to the object surface until a previously defined initial contact force is reached. No manual positioning or external influencing of the gripper is applied, even though such manual intervention has achieved a form fit for standard granulate grippers in the past [14]. With the gripper being positioned directly on the test object, a valve to the air compressor is opened and a vacuum is generated in the gripper. After a delay of 2.5 s, the maximum vacuum with the applied compressor power is reached and the gripper is pulled vertically upwards at a defined pull-off speed. The maximum force at which the gripper detaches from the object's surface is used in the following sections for the analysis of influencing parameters.

#### **Experimental analysis of influencing parameters**

The main goal is to model the influences of different objects and geometries on the possible gripping forces using the specified gripper. For an optimal setup of objects, the influence of the object material, surface roughness and air permeability as well as the initial contact force has to be quantified. For this, the type of material as well as the surface roughness and the initial contact force were examined (see Fig. 5). Five different convex cylinders with a diameter of 242 mm made from realistic materials such as aluminum as a representation for milled parts, polyurethane and paper for parcels and packaging, and PLA (polylactic acid) for synthetic components were used. Fifty experiments with a rising contact force between 20 and 280 N and a constant compressor power of 50% were carried out on the curved surfaces of the convex cylinders.

**Fig. 4** Experimental procedure for an initial contact force of 80 N and 50% compressor power

Resulting from these experiments, initial contact forces between 50 and 200 N prove to be most applicable, as the maximum gripping force falls off for initial contact forces below 50 N and scatters broadly above 200 N. In this range of initial contact forces, the mean and variance values of the different objects are quite comparable, with the highest deviation between the mean values being under 5%. As a result of these experiments, the objects used for the further research were manufactured additively with the settings achieving the surface quality of the "smooth PLA" (Fig. 5), as this enables a time-efficient and precise design for complex geometries.

As a secondary examination, the air permeability of the objects as well as the influence of the compressor power and the resulting pressure difference is analyzed (Fig. 6). For this, air-permeable, rotationally symmetrically perforated flat surfaces with a gripping area of 60, 80 and 90% were prepared and compared to a 100% airtight surface. A constant initial contact force of 50 N was used. Due to air flow effects of differently sized porous openings, not all air-permeable surfaces will show exactly the same resulting gripping forces. Therefore, these experiments serve as a reference for the assessment of the influence of air permeability.

As seen in Fig. 6, compressor power under 40% results in a low pressure difference for all test objects. For higher levels of compressor power, the test surfaces with a gripping area of 60 and 80% fail to achieve a pressure difference of over 0.1 bar, and the maximum gripping forces remain below 25 N. As the experiments for over 90% are able to achieve a higher pressure of

**Fig. 5** Resulting maximum gripping forces for convex cylinders with a diameter of 242 mm with different surfaces, materials and initial contact forces. The mean and variance values in the marked area between 50 and 200 N are shown in the table

**Fig. 6** Four steps of equally distributed relative coverage of the porous areas by flat surfaces. **a** Relative compressor power over pressure difference. **b** Maximum achievable gripping force over pressure difference

over 0.1 bar as well as a gripping force of over 50 N, this range of porosity is defined as a minimum requirement for the grippability of objects. For the highest achieved pressure differences, a comparably large spread of the data points is observed. This is presumed to be a result of a squeeze-out effect pushing the granular material through the outer membrane, creating a structured surface with reduced contact to the object surface.

After analyzing the influence of material and air permeability, a multitude of objects made from "smooth PLA" are used in order to examine the influences of different geometries. The surfaces of these example geometries are airtight. Experiments resulting in gripping forces below 50 N are considered failures and can be classified as the gripper not being able to achieve a sealing of over 90% with the surface. An exemplary extract for flat surfaces, convex edges and concave cylinders is shown in Fig. 7. A seemingly linear correlation of the maximum gripping forces with the pressure difference is visible, as well as a clear difference in the slopes for the different objects, resulting in different maximum gripping forces for high pressure differences.

**Fig. 7** Maximum gripping force over pressure difference for three example objects

#### **Empirically adapted analytical model**

Using the linear dependence of resulting vacuum forces on the pressure difference shown in formula 1 as well as the surface of the porous area of the gripper cushion of *Apor* = 4500 mm², a theoretical achievable maximum vacuum gripping force of 189 N at a maximum pressure difference *Δpmax* of 0.42 bar is calculated. However, experiments have shown a larger mean gripping force of 250 N for an airtight flat surface with this pressure difference. Influences from the granulate gripping principles are not applicable, as previous research has shown no effect on flat surfaces for purely granulate grippers. As another influencing factor, the pressure difference at the porous surface will most likely differ somewhat, as the vacuum sensor cannot be feasibly located there. However, the used vacuum pump is not able to create a pressure difference of over 0.42 bar and the sensor is able to measure this maximum value, so this difference in force is most likely not only a result of a deviation from the measured pressure difference. Therefore, this discrepancy between the theoretical and measured gripping forces is empirically approximated by a larger surface area being under the effect of the pressure difference than the actual porous area of the gripper (see Fig. 8).

A theoretical effective surface area of ~6000 mm² can be calculated with formula 1, which translates to an effective circular area with a diameter of 87 mm. The theoretical maximum gripping force *Ftmax* for a greatest possible affected area *Atmax* (~17,668 mm²) with the maximum diameter of the gripper of 150 mm and *Δpmax* is 742 N. For more complex objects and geometries, the gripping forces resulting from the granulate gripping principles such as friction have to be considered. However, specific forces resulting purely from the granulate gripping principle cannot be distinguished, as the vacuum gripping force cannot be avoided. Therefore, a combined correction parameter *Ccombined* for the influence of the object geometry on granulate as well as vacuum-based gripping forces is introduced in formula 2. This correction parameter is defined as a value between 0 and 1 (see formula 3).

$$F = C_{combined} \cdot \Delta p \cdot A_{tmax} \tag{2}$$

$$C_{combined} = \frac{S_{object}}{S_{max}} \tag{3}$$

**Fig. 8** Approximation for the empirical area. **a** Model for only *Apor* being in effect. **b** Model for effective area

*Sobject* is calculated as the slope of the linear approximation (*Fi*/*Δpi*, see Fig. 7) of the achievable gripping forces over the pressure difference. *Smax* is the theoretical maximum achievable slope calculated with *Ftmax*. As an example, this results in a *Ccombined* of 0.330 for the airtight flat surface previously examined.
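The arithmetic behind these values can be retraced with a short script. The following is a minimal sketch and not the authors' evaluation code: all variable and function names are assumptions made for illustration. Because the individual measurement points are not reproduced here, the slope for the flat surface is approximated from the single mean value of 250 N at 0.42 bar rather than from the regression over all experiments, which is why the sketch yields roughly 0.34 instead of the regression-based 0.330.

```python
import math

BAR_TO_PA = 1.0e5    # 1 bar = 100 000 Pa
MM2_TO_M2 = 1.0e-6   # 1 mm^2 = 1e-6 m^2


def vacuum_force(dp_bar, area_mm2):
    """Formula 1, F = dp * A, returned in newtons."""
    return dp_bar * BAR_TO_PA * area_mm2 * MM2_TO_M2


dp_max = 0.42        # bar, maximum pressure difference from the text
a_por = 4500.0       # mm^2, porous area of the gripper cushion
f_measured = 250.0   # N, mean measured force on the airtight flat surface

# Theoretical force if only the porous area were effective (about 189 N).
f_por = vacuum_force(dp_max, a_por)

# Effective area implied by the measured force (formula 1 solved for A)
# and the diameter of the equivalent circle.
a_eff = f_measured / (dp_max * BAR_TO_PA) / MM2_TO_M2   # mm^2, about 5950
d_eff = math.sqrt(4.0 * a_eff / math.pi)                # mm, about 87

# Upper bound: the full gripper cross-section (diameter 150 mm) is effective.
a_tmax = math.pi * (150.0 / 2.0) ** 2                   # mm^2
f_tmax = vacuum_force(dp_max, a_tmax)                   # N, about 742

# Formula 3 with slopes taken as force per pressure difference (N/bar).
# Assumption: single-point slope instead of the regression slope.
s_max = f_tmax / dp_max
s_flat = f_measured / dp_max
c_flat = s_flat / s_max                                 # about 0.34

print(f"F_por = {f_por:.0f} N, A_eff = {a_eff:.0f} mm^2, d_eff = {d_eff:.0f} mm")
print(f"F_tmax = {f_tmax:.0f} N, C_combined (flat, single point) = {c_flat:.3f}")
```

With a fitted slope in N/bar for any other object, the same last step gives its correction parameter, and formula 2 then predicts the gripping force at a given pressure difference.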

In an ideal setup of an airtight flat surface, *Ccombined* represents a factor for the effective pressure difference as well as the effective area, since this setup is only affected by the vacuum gripping principles. For 3D-objects, *Ccombined* represents the influences of the granular as well as vacuum gripping principles. For more complex objects, some parts of the porous area will not be perpendicular to the gripping trajectory, which reduces the effective gripping area. However, no direct correlation between the perpendicular area and *Ccombined* is observed, as the minimum requirements for shapes are not identical for convex and concave surfaces. Therefore, formula 3 continues to use *Atmax*. Table 1 gives an overview of the resulting values for *Ccombined* for different object shapes, approximated over 30 experiments with a linear distribution of compressor power between 33 and 100% and an initial contact force of 80 N. The selected shapes should provide a broad overview of the most common geometries, and the table also lists the specific requirements for grippability. The requirements all result in a sealing of the gripper with the surface of over 90% (see Fig. 6) and thus a gripping force of over 50 N. The grippability of concave cylinders, spheres and cones is mostly influenced by whether the object's opening is large enough for the base frame of the gripper to fit, so this is not shown further. The correction parameters shown in Table 1 show a variety of gripping strengths for different objects with a regression quality of over 0.9, which is evidence for a good approximation.

#### **Influence of Scale and Diameter**

The scale of the object has a remarkably low influence. Starting at the specified minimum requirement shown in the last column, the correction parameter rises only slightly until a sufficient similarity to a flat surface is reached (see Fig. 9). A cylinder diameter of 200 mm, more than double the initial value of 90 mm, has a slightly higher slope and thus a slightly increased *Ccombined*. However, up to a diameter of 160 mm almost no difference is visible. A merged calculation of *Ccombined* for the five shown convex cylinders results in a value of 0.311 with an R<sup>2</sup> of 0.907, which differs less than 5% from the calculated *Ccombined* for a diameter of 90 mm.



**Fig. 9** Gripping forces over pressure difference for different diameters of convex cylinders

#### **4 Conclusion and Outlook**

The main goal of this research was to show a dependence of gripping forces on object geometries and to formulate an analytical model for calculating the maximum possible forces for these geometries. This is done to gain an understanding of the applicability of this flexible gripping solution. These goals were achieved and a linear approximation showed high regression quality for a spectrum of different objects. Minimum requirements for the usage of this model are the defined geometric characteristics, which result in a sealing of the gripper with the surface of the object of more than 90%. This enables further applications in combination with approximations of objects with similar shapes and geometric features. This could be done by comparing objects to previously tested data sets and interpolating *Ccombined* or through a utilization of machine learning with a 3D-camera, which would enable gripping force prediction for unknown objects on the basis of this research. Further expansion of the formula is possible for variations in the gripper configuration: previous research has shown some influences of parameters such as granulate size and membrane material, which will be examined in further research. Additionally, the current approach is limited to a basic vertical gripping strategy. Other, more complex trajectories might result in a better sealing of the gripper cushion with the object surface and thus a higher possible gripping force.

**Acknowledgements** The support of the German National Science Foundation (Deutsche Forschungsgemeinschaft DFG) through the funding of the research project "ModPro" (450839725) is gratefully acknowledged.

#### **References**



### **Automated Stack Singulation for Technical Textiles Using Sensor Supervised Low Pressure Suction Grippers**

Benjamin Wirth, Tizian Schwind, Marco Friedmann and Jürgen Fleischer

#### **Abstract**

Automated handling of technical textiles poses major challenges for modern handling systems. Previous research has shown that using suction grippers is feasible for handling processes involving textiles. However, separating individual sheets of air-permeable materials from a stack using such grippers is a nontrivial task. This paper details an automated stack singulation process using low pressure suction grippers, leveraging online data from differential pressure sensors to control the singulation process. Subsystems are analyzed to derive a governing model representation of the process. This model is deployed on a robotic test rig, validating the process in experimental analysis. Using this approach, a controlled singulation process for stacked carbon fiber mats has been achieved with a success rate exceeding 99%, showing the practicality of controlling the internal suction pressure for advanced handling processes using low pressure suction grippers. Further improvement could be achievable by incorporating and fusing multiple sensor principles.

#### **Keywords**

Process monitoring · Gripping technology · Textile handling

B. Wirth (\*) · T. Schwind · M. Friedmann · J. Fleischer Karlsruhe, Germany e-mail: Benjamin.wirth@kit.edu

M. Friedmann e-mail: marco.friedmann@kit.edu

J. Fleischer e-mail: juergen.fleischer@kit.edu

#### **1 Introduction and Related Work**

Fiber reinforced plastics (FRP) are, due to their good weight-specific mechanical properties, used in high performance applications with minimum weight requirements [1, 2]. However, the production of FRP parts is labor intensive, with few automated processes in place [3]. In particular, the low bending stiffness of woven fiber mats and their fragile nature lead to challenges in robotic handling.

For efficient processes, it would be preferable to start a production process from a stack of raw fabric plies, as this allows the cutting and preparation in bulk [4]. When using such a stack, the first and foremost task is to separate single sheets of fabric material from this stack for further handling and manipulation. While processes such as pinch gripping or freeze gripping have been shown to be capable of singulating single plies from a stack, they can lead to undesirable alterations on the gripped ply or the remaining stack underneath [5].

In contrast, low pressure suction grippers are capable of handling even fragile fabrics without damage [6]. However, the porous nature of most textile materials makes stack singulation very difficult because the second or third ply is still affected by the gripper's vacuum suction [7].

Cubric et al. have successfully used vacuum grippers for grasping textile materials, however, they concluded: *'It has also been found that the application of this vacuum gripper is not suitable for taking one layer of fabric from material bundle'* [8].

In further research, it was shown that parameters such as area mass density, gripper position, suction cup geometry, and supply pressure have a major influence on the successful handling of non-woven textiles. However, no attempt has been made to quantify or model any of these influences [9].

The author in [6] has successfully used a vacuum suction gripper for separating single plies of woven carbon fiber mats from a stack, by controlling the electrical contact resistance of the carbon fiber pressing against the suction cup. The main deficit of this approach is its dependence on the conductivity of the handled materials.

In this paper, a robotic gripping system capable of reliable single ply separation for woven technical textiles is presented, tackling the aforementioned deficits. To achieve this task, the gripping system is split up into relevant subsystems and their influences on the gripping process are modeled, allowing the generation of a model capable of predicting the suction pressure inside the gripper depending on chosen process parameters. This model in turn enables:


#### **2 Gripping System Overview**

The main end effector used in this research consists of four Schmalz SCG 1xE100 low pressure suction grippers. Every individual gripper element (see Fig. 1) consists of:

• A vacuum generating Coandă ejector.

High pressure air supplied to the pressure inlet (1) is accelerated through a small slit. The high velocity airstream adheres to the outlet walls of the ejector curving away from the suction chamber. This high speed airstream leads to a low pressure zone inside the chamber (4).


#### **3 Subsystem Modelling**

Since the pressure in the suction chamber is the only process parameter that can be measured in this gripper setup, an attempt is made to model the suction pressure as a function of other effective process parameters, as shown by the Ishikawa diagram in Fig. 2.

For a detailed analysis of the gripping related subsystems, the system is divided into 4 subsystems affecting the generated suction pressure: vacuum generation, perforated plate, material properties and load-case dependent leakage.

**Fig. 1** *Left* Stylized 3D Cut through an SCG Gripper, *Right* Gripper cross section

**Fig. 2** Ishikawa diagram of major influences on the suction pressure inside the gripper

#### **3.1 Vacuum Generation**

The vacuum generated by the gripper is mainly dependent on the input pressure supplied to it. This input pressure is controllable from 0 to 4 bar by adjusting a voltage-controlled pressure control valve at the input port. Another major influence on the vacuum generated by the Coandă ejector is the flow restriction acting on the incoming suction airstream.

In this section the generation of a predictive model is discussed, which allows the determination of the suction pressure inside the gripper *pU* at any given time dependent on the input pressure *pN* and the volumetric flow rate of the suction airstream *QV*.

To determine this vacuum generation characteristic, the volumetric flow rate of air *QV* entering the gripper is recorded, as well as the suction pressure *pU*, while varying the supply pressure and placing different flow resistances in the suction path (see Fig. 3 left). To ensure a smooth, almost laminar airflow at the location where the volumetric flow rate is measured, a 30 cm long pipe is added to the bottom of the gripper-restriction assembly.

This generates a three dimensional dataset of 200 specific operating points. The linear regression model fitted to this data set yields a characteristic equation (see Fig. 3 right):

$$p\_U(\mathcal{Q}\_V, p\_N) = p\_N \cdot a\_{10} - \mathcal{Q}\_V \cdot a\_{01} - p\_N^2 \cdot a\_{20} - p\_N \cdot \mathcal{Q}\_V \cdot a\_{11} - \mathcal{Q}\_V^2 \cdot a\_{02}$$

$$\text{with } a\_{10} = 4279\,\frac{\mathrm{Pa}}{\mathrm{bar}}; \; a\_{01} = 11.28\,\frac{\mathrm{Pa\,min}}{\mathrm{l}}; \; a\_{20} = 286.6\,\frac{\mathrm{Pa}}{\mathrm{bar}^2};$$

$$a\_{11} = 0.3824\,\frac{\mathrm{Pa\,min}}{\mathrm{bar\,l}}; \; a\_{02} = 0.01701\,\frac{\mathrm{Pa\,min}^2}{\mathrm{l}^2}$$

The adjusted coefficient of determination is *R*<sup>2</sup><sub>adj</sub> = 0.9299.
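The characteristic equation can be evaluated directly for any operating point. The following is a minimal sketch, assuming that the coefficients carry the units Pa, bar and l/min (so that *p*<sub>N</sub> is entered in bar, *Q*<sub>V</sub> in l/min and the result is in Pa); the function and constant names are chosen here for illustration only.

```python
# Coefficients of the fitted characteristic as reported in the text.
# The units are an assumption based on the reading Pa, bar and l/min.
A10 = 4279.0    # Pa / bar
A01 = 11.28     # Pa * min / l
A20 = 286.6     # Pa / bar^2
A11 = 0.3824    # Pa * min / (bar * l)
A02 = 0.01701   # Pa * min^2 / l^2


def suction_pressure(q_v: float, p_n: float) -> float:
    """Return p_U in Pa for a flow rate q_v (l/min) and a supply pressure p_n (bar)."""
    return (p_n * A10
            - q_v * A01
            - p_n ** 2 * A20
            - p_n * q_v * A11
            - q_v ** 2 * A02)


if __name__ == "__main__":
    # Tabulate a few operating points inside the measured range (0 to 4 bar).
    for p_n in (1.0, 2.0, 4.0):
        for q_v in (0.0, 100.0, 300.0):
            p_u = suction_pressure(q_v, p_n)
            print(f"p_N = {p_n:.1f} bar, Q_V = {q_v:5.1f} l/min -> p_U = {p_u:8.1f} Pa")
```

Since the model is a regression fit, it should only be trusted inside the range of the 200 recorded operating points.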

**Fig. 3** *Left* Cross-section through the experimental setup for the measurement of vacuum generation characteristics, *Right* Interpolated vacuum generation characteristic model

**Fig. 4** *Left* Perforated plate geometries used in this study, *Right* Accompanying characteristics. All subplots are equally scaled: *p*<sub>U</sub> = 0 to 10000 Pa and *Q*<sub>V</sub> = 0 to 600 l/min

#### **3.2 Perforated Plates**

The suction chamber of these grippers is enclosed by a perforated plate, which also acts as the interface to the gripped textile. The design of this interface, however, is not predetermined, allowing for adjustments specific to any use case. To determine the influence of these suction plates, the experiment described above is repeated with the 9 different suction plate designs shown in Fig. 4. These designs vary in the size of the individual orifices *D*<sub>H</sub> as well as in the combined cross-sectional area of the orifices.

By 3D printing these test specimens and measuring the volumetric airflow through them at varying suction pressures, a characteristic resistance graph can be determined for every suction plate, as shown in Fig. 4 right.

The literature on airflows through perforated plates [10, 11] states that the pressure loss of laminar flow through such a plate is:

$$p\_{Ambient} - p\_S = \frac{1}{2} \, \mathrm{Eu} \, \rho V^2$$

with

*p*<sub>S</sub> vacuum pressure in the suction chamber


With *Q*<sub>V</sub><sup>2</sup> being proportional to *V*<sup>2</sup> and *p*<sub>U</sub> denoting the pressure differential on the left-hand side, the equation can be restated with a lumped pressure loss coefficient ξ, under the assumption of constant fluid density (which nearly holds apart from changes in environmental conditions), as follows:

$$p\_U = \xi Q\_V^2$$
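The step from the velocity form to the flow rate form can be made explicit. As a sketch, assume that *V* is the mean velocity through a reference cross section *A* (a symbol introduced here only for illustration), so that *Q*<sub>V</sub> = *V* · *A*:

$$p\_U = \frac{1}{2} \, \mathrm{Eu} \, \rho \left(\frac{Q\_V}{A}\right)^2 = \underbrace{\frac{\mathrm{Eu} \, \rho}{2 A^2}}\_{\xi} \, Q\_V^2$$

Under this assumption, ξ collects the Euler number, the fluid density and the reference area into one plate-specific constant, which is why it can be identified from measured pairs of flow rate and suction pressure.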

While an increase in plate porosity clearly coincides with an increase in airflow and consequently a reduction in suction pressure, the influence of the hole diameter is not monotonic.
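For a single suction plate, the coefficient ξ can be estimated from the measured characteristic. The sketch below uses a least-squares fit through the origin, which is one plausible choice and not necessarily the procedure used in this study; the sample values are synthetic.

```python
def fit_xi(q_v, p_u):
    """Least-squares estimate of xi in p_U = xi * Q_V**2 (fit through the origin).

    Minimising sum((p_i - xi * q_i**2)**2) gives
    xi = sum(p_i * q_i**2) / sum(q_i**4).
    """
    if len(q_v) != len(p_u) or not q_v:
        raise ValueError("q_v and p_u must be non-empty and of equal length")
    numerator = sum(p * q ** 2 for q, p in zip(q_v, p_u))
    denominator = sum(q ** 4 for q in q_v)
    if denominator == 0.0:
        raise ValueError("at least one non-zero flow rate is required")
    return numerator / denominator


if __name__ == "__main__":
    # Synthetic example data (not measurements from the study):
    flow = [100.0, 200.0, 300.0, 400.0]          # l/min
    pressure = [0.02 * q ** 2 for q in flow]     # Pa, generated with xi = 0.02
    print(f"xi = {fit_xi(flow, pressure):.4f} Pa*min^2/l^2")
```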

#### **3.3 Air-Permeable Textiles**

An airstream through a porous medium, such as a woven fabric, leads to a pressure drop, which can be described by the following expression [12]:

$$
\Delta p = C\_1 \frac{\mathcal{Q}\_V}{A} + C\_2 \left(\frac{\mathcal{Q}\_V}{A}\right)^2
$$

with

Δ*p* the pressure loss across the medium

*C*<sub>1</sub> the linear loss coefficient

*C*<sub>2</sub> the quadratic loss coefficient

At low flow rates *Q*<sub>V</sub> the quadratic term can be dropped, leading to the equation used in the standard DIN EN ISO 9237 [13].

For any material used, the air permeability *R* can be determined as described in the standard [13] by measuring it on a certified test stand. This value *R* is the area-normalized airflow through a textile at a pressure differential of 200 Pa. Therefore, the value of *C*<sub>1</sub> can be determined and substituted into the pressure differential equation above:

$$
C\_1 = \frac{200Pa}{R}, \qquad \Delta p = \frac{200Pa}{R} \cdot \frac{\mathcal{Q}\_V}{A\_F}
$$

Combining this equation with the pressure drop across the suction-plate yields:

$$
\Delta p = \frac{200Pa}{R} \cdot \frac{\mathcal{Q}\_V}{A\_F} + \xi \mathcal{Q}\_V^2
$$

This model is compared with the measured data in Fig. 5. While the theoretically derived model shows a general correlation with the overall characteristics of the gripper, it does not capture every interaction. A reliable prediction of the suction pressure inside the gripper is therefore not achievable without experimentally determining the gripper characteristics for the specific use case.
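One way to turn the submodels of this section into a predicted suction pressure is to look for the flow rate at which the vacuum generated by the ejector equals the pressure drop across textile and suction plate. The sketch below does this by bisection. It assumes that Δ*p* can be identified with *p*<sub>U</sub>, that *R* is given in l/(min·m²) and *A*<sub>F</sub> in m², and it uses hypothetical values for *R*, *A*<sub>F</sub> and ξ; it is an illustration and not the authors' implementation.

```python
# Coefficients of the vacuum generation characteristic (Sect. 3.1).
A10, A01, A20, A11, A02 = 4279.0, 11.28, 286.6, 0.3824, 0.01701


def generated_pressure(q_v, p_n):
    """Suction pressure provided by the ejector in Pa (q_v in l/min, p_n in bar)."""
    return p_n * A10 - q_v * A01 - p_n ** 2 * A20 - p_n * q_v * A11 - q_v ** 2 * A02


def load_pressure(q_v, r, a_f, xi):
    """Pressure drop across textile and suction plate in Pa.

    r   : air permeability at 200 Pa in l/(min*m^2)  (assumed unit)
    a_f : textile area exposed to the flow in m^2    (assumed unit)
    xi  : plate loss coefficient in Pa*min^2/l^2
    """
    return 200.0 / r * q_v / a_f + xi * q_v ** 2


def operating_point(p_n, r, a_f, xi, tol=1e-6):
    """Return (q_v, p_u) where generated and required pressure balance."""
    def residual(q_v):
        return generated_pressure(q_v, p_n) - load_pressure(q_v, r, a_f, xi)

    if residual(0.0) <= 0.0:
        raise ValueError("no vacuum is generated at this supply pressure")
    lo, hi = 0.0, 1.0
    while residual(hi) > 0.0:      # expand the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:           # the residual decreases monotonically in q_v
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    q_v = 0.5 * (lo + hi)
    return q_v, generated_pressure(q_v, p_n)


if __name__ == "__main__":
    # Hypothetical textile and plate parameters, for illustration only.
    q_v, p_u = operating_point(p_n=2.0, r=6000.0, a_f=0.005, xi=0.02)
    print(f"Q_V = {q_v:.1f} l/min, p_U = {p_u:.0f} Pa")
```

Bisection is sufficient here because the generated pressure falls and the required pressure rises with increasing flow rate, so the two curves intersect at most once for positive flow rates.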

#### **3.4 Leakage Currents**

As shown in Fig. 1 (right) deformations in the fabric can lead to air streams not completely passing through the fabric. These stray air currents will not apply holding forces on the textile and therefore play a huge role in grip security and the separation of textiles from the gripper at low supply pressures. This subject is addressed by introducing a scalar load equivalent term which will be obtained by a simple static mechanical simulation.

For the generation of such a scalar load equivalent value 6 limp materials are selected and their mechanical properties needed for simulation are determined, as shown in Table 1:

Figure 5 (right) shows the good match of simulated cantilever test with the real one for one example material.

To quantify the deformation of a single textile sheet with a single scalar value, a circular path around the gripper is defined. The deformation of the simulated textile *d*(*s*) is evaluated along this path, where *s* is the normalized path length. This allows the calculation of the load equivalent term *L*:

$$L = \int\_{0}^{1} d(s)^2 \, ds$$
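When the simulated deformation is available only at discrete points along the path, the integral has to be approximated numerically. A minimal sketch using the trapezoidal rule on equally spaced samples follows; the sampling scheme and the example profile are assumptions made for illustration.

```python
import math


def load_equivalent(deformation):
    """Approximate L = integral of d(s)^2 over s in [0, 1] with the trapezoidal rule.

    deformation: samples d(s_i) at equally spaced s_i including both end points
    (for a closed path the first and the last sample coincide).
    """
    n = len(deformation)
    if n < 2:
        raise ValueError("at least two samples are required")
    step = 1.0 / (n - 1)
    squared = [d * d for d in deformation]
    return step * (0.5 * squared[0] + sum(squared[1:-1]) + 0.5 * squared[-1])


if __name__ == "__main__":
    # Hypothetical deformation profile: one sine wave around the circumference.
    samples = [math.sin(2.0 * math.pi * i / 100) for i in range(101)]
    print(f"L = {load_equivalent(samples):.4f}")
```

Squaring the deformation means that upward and downward deflections both increase *L*, and that large local deflections are weighted more strongly than small ones.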


**Table 1** Material properties and method of acquisition

**Fig. 5** *Left* Suction pressure over supply pressure for 9 different suction plates. *Orange line* calculated model, Blue line: smoothed measurement data. *Right* Cantilever test for the determination of simulation parameters and simulated deformations in good agreement (local Von-Mises-Stress shown in false color mapping)

**Fig. 6** *Left* Correlation of Load-Equivalent L and separation-pressure *s*10, *Right* Five of our considered example load cases

While these deformations have little effect on the suction pressure inside the gripper when the textile is securely gripped, they strongly influence the separation of layers from the gripper. To verify this load equivalent, 8 different load cases were defined with varying material geometries and gripper configurations. Five of these variations are shown in Fig. 6 (right). After grasping the material, the operating pressure is reduced until the ply drops off. When the supply pressure at which separation occurs, *s*10, is recorded, a correlation between *s*10 and the load equivalent term *L* can clearly be seen in Fig. 6 (left).

While the specific linearity factor depends on the selected combination of suction plate and material, it can be determined experimentally and later be used to estimate the minimal pressure at which a single ply is still gripped successfully. This is a good starting point for the minimal supply pressure for stack singulation.
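The experimental determination of the linearity factor can be sketched as an ordinary least-squares line through recorded pairs of load equivalent and separation pressure. The data below are synthetic, and the intercept and the safety margin are assumptions added for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0.0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def minimal_supply_pressure(load_equivalent, slope, intercept, margin=0.0):
    """Predicted separation pressure plus an optional safety margin (bar)."""
    return slope * load_equivalent + intercept + margin


if __name__ == "__main__":
    # Synthetic pairs (L, separation pressure in bar), not measured values.
    loads = [1.0, 2.0, 3.0, 4.0]
    separation = [0.7, 0.9, 1.1, 1.3]
    k, c = fit_line(loads, separation)
    print(f"start pressure for L = 2.5: {minimal_supply_pressure(2.5, k, c, 0.1):.2f} bar")
```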

#### **4 Robotic Workstation**

As mentioned above, the goal is to prove the concept of robotic single-ply separation from a given textile stack. Therefore, a robotic test stand is built, consisting of a UR5e collaborative robot and a gripper assembly containing 4 low pressure suction grippers, each equipped with a −1 to 1 bar differential pressure sensor, as shown in Fig. 7 (left).

#### **4.1 Control Structure**

To control the test stand, an OPC UA server running on a Beckhoff PLC is used, together with a connected personal computer that runs the singulation routine as MATLAB control code and serves as the control backbone. The control architecture can be seen in Fig. 7.

Before programming the routine, the experiments described in Sect. 3 are conducted to generate:


These experimental results can be used in the following control strategy for stack separation:

**Fig. 7** Control structure for the robotic workstation


#### **4.2 Experimental Validation**

By performing the control process described above, it is possible to repeat the stack separation process indefinitely by moving a stack to a new location and back again.

In a test with a sample size of 500, it was possible to successfully detect and separate a single material layer from the stack 99.6% of the time. Both errors were due to the material sticking to the stack when lifting the gripper at step 4. This error is recognized at step 6 and could have been corrected in a production process.
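The reported figures are consistent with each other: two failures in 500 trials correspond to a success rate of 99.6%. To give a feeling for the statistical uncertainty of such a rate, the sketch below adds a Wilson score interval. This interval is not part of the original evaluation and is shown only as one common way to qualify a success rate.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95 %)."""
    p = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p + z * z / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials + z * z / (4.0 * trials ** 2)) / denom
    return centre - half, centre + half


TRIALS = 500
FAILURES = 2                      # "both errors" reported in the text
SUCCESSES = TRIALS - FAILURES

if __name__ == "__main__":
    low, high = wilson_interval(SUCCESSES, TRIALS)
    print(f"success rate: {SUCCESSES / TRIALS:.1%}")
    print(f"approx. 95 % interval: {low:.2%} to {high:.2%}")
```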

The same series of tests for separating two layers of material together was successful in 66% of the cases. Problems observed when going for multiple layer handling were mainly:


#### **5 Conclusion and Outlook**

In this work, attempts at modeling the governing interactions in a low pressure suction gripper when handling porous materials are presented.

While it was not possible to generate a fully parametric model for predicting the pressure inside a suction gripper at any given time, the approach nonetheless allows a process for automated stack separation to be derived, with a success rate of more than 99% for single plies.

This level of confidence would even allow this process to be used in real production environments. Multiple challenges came up in multi-layer handling operations. While these might not be as important for production facilities as single-ply separation, a better understanding of those processes remains an interesting field of research.

Incorporating and fusing other sensor principles, such as direct force measurement at the gripper-textile interface or optical feedback, could further improve the success rate and error tolerance for such delicate handling processes.

Furthermore, incorporating artificial intelligence into the control structure could improve results and might allow for a dynamic reaction to slightly altered environmental parameters.

**Acknowledgements** This research was financially supported by the German Research Foundation (DFG) (funding no. 397485737).

#### **References**



**Gripping Technology and Industry 4.0**

### **Accuracy Examination of a Flexible Pin Gripper for Forging Applications**

Caner-Veli Ince, Jan Geggier and Annika Raatz

#### **Abstract**

Cost reduction in manufacturing is becoming increasingly relevant. One way to achieve it is to use universal handling systems that adapt to changing objects and various geometries. They reduce the number of handling systems and the set-up times, which saves costs. In forging, the objects change their shape several times during the manufacturing process. In addition, the temperature can rise up to 1200 °C during the different steps of the forging process. Current flexible handling systems cannot handle those temperatures, mainly because of the material they consist of, primarily elastic polymers. Hence, there is a need for a handling system that closes the gap between form-flexible and high-temperature handling. For this purpose, we developed such a handling system in our previous work, consisting of two jaws with pins in a matrix arrangement. Each pin can move in the longitudinal direction and adapt to different shapes. The pins are made of a material that withstands the high temperatures involved. This paper presents the actuation and control of the developed handling system. The system is actuated by pressurised air, which is continuously controlled to counteract the thermal expansion of the air caused by the high temperatures. To this end, we integrate intelligent valves for automation and control. Finally, we evaluate the accuracy of our system and optimise the valve control.

#### **Keywords**

Form flexible gripping • Automation • Forging • Handling

C.-V. Ince (B) · J. Geggier · A. Raatz

Institute of Assembly Technology, Leibniz University Hannover, Garbsen, Germany e-mail: ince@match.uni-hannover.de

© The Author(s) 2023

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_28

#### **1 Introduction**

Form-variable handling systems can adapt to different geometries of the handling objects and fulfil various tasks and operations. Shintake et al. reviewed the so-called soft grippers, which consist of polymer material [1]. The polymer material is advantageous because of its elastic behaviour. In combination with physical effects like granular jamming [2] or the Fin Ray Effect [3], a universally usable handling system is realised. Despite the benefits of the elastic polymer material, its maximum operating temperature of 300 °C limits the usability of a soft gripper [4]. For most use cases, this temperature is not exceeded. In this work, the application in the forging sector is considered. Here, the objects undergo massive changes in geometry and reach temperatures of up to 1200 °C. The workpiece of a bevel gear, for example, has a simple cylindrical geometry. After forging, the bevel gear is conically shaped with teeth on the cylinder surface and without parallel surfaces for grasping. A handling system that can adapt to the varying shape of the handling object would be beneficial for the automation of these processes. Additionally, using one gripper instead of several reduces the number of necessary grippers, resulting in a cost reduction. Further, tooling times are saved.

The Tailored Forming Process can, for instance, be considered such a procedure. It is a novel forging process investigated in the Collaborative Research Centre (CRC) 1153 at the Leibniz University Hannover [5]. The main focus is to develop a process chain to manufacture tailored hybrid components consisting of multiple materials. The joining process is the main difference between the Tailored Forming Process and other hybrid manufacturing processes. Conventional processes place the joining process at the end of the manufacturing process, at which stage the components have almost their final shape. This fact limits the possible geometries for the hybrid components. In contrast, the joining is located at the beginning of the process chain in the Tailored Forming Process. Here, the materials are merged into semi-finished workpieces with simple shapes, followed by the forging.

To match the forging properties of the combined materials, the workpiece has to be heated up. Depending on the different materials, a temperature gradient is necessary and is set by induction heating. Currently, the combinations steel-steel and steel-aluminium are being investigated. The steel-steel pairing requires temperatures up to 1200 °C, which defines the maximum process temperature.

Several demonstrator components are investigated in the CRC 1153. Their shapes vary from cylindrical shafts over a conical bevel gear to a wishbone and are depicted in [6]. The bevel gear, for example, has a cylindrical workpiece and undergoes a massive change in shape, which is challenging to handle with the same handling system.

Furthermore, the accuracy of the handling system is essential for the Tailored Forming Process quality. Induction heating is used to prepare the workpieces for forging. They are placed on the induction coil, which requires high precision. Contact between the induction coil and the workpiece could damage the coil. In addition, the workpieces have to be placed in an exact position in the forging die. Otherwise, the forging could fail.

The brief overview of the Tailored Forming Process indicates the need for a handling system which withstands high temperatures of up to 1200 °C, adapts to changing geometries and fulfils the accuracy requirements.

In this work, a concept for a previously developed form variable pin gripper for use in forging environments is investigated further [6]. Therefore, the Tailored Forming Process has been introduced to define the boundary conditions. In the following, the functionality of the automation of different pin grippers is briefly presented to outline the state of the art. Afterwards, the prototype of the previously developed handling system is presented and the experimental validation results are discussed. A summary and an outlook complete this paper.

#### **2 Related Work**

This section gives an overview of pin grippers and their grasping process. At the end of this section, the gap between the currently available pin grippers and the boundary condition of the Tailored Forming Process is pointed out.

The first gripper of this kind is the Omnigripper by Scott [7]. It has an arrangement of 8 × 16 pins on two slightly separated plates with the same orientation. The pins can move independently in the vertical direction when they come into contact with an object. When the Omnigripper is lowered over an object, the pins in contact retract, so that the pin arrangement adopts the negative of the object's shape. The plates then move together and the pins clamp the object. The pins are telescopically designed. When a pin retracts on contact, electrical contact is established between the inner and outer tubes, activating a switch. With this sensing, the Omnigripper can acquire 3D data from the pin positions. A host computer starts the gripping process. The Omnigripper is attached to a robot, and the host computer orders it to grasp, release or reorient the object. Afterwards, control is given back to the robot to perform the movement. Scott proved the ability to handle a wide range of objects but did not mention high temperatures or the accuracy reached.

Similar to Scott, Mo also developed a pin gripper [8]. Mo's gripper has pins that move independently in the vertical direction, and the shape is adapted by lowering the gripper over the object to be handled. In contrast to Scott's design, Mo's pins are arranged on one plate, and the grasp is realised by active rotational actuation of the pins. Some pins have an elliptical shape, whereby the object is clamped. Both of these grippers need an active actuation.

Meintrup developed a system based on pins that consists of two opposing pin jaws [9]. The system is primarily utilised as a manual clamping device without automation. The pins on each jaw are close-packed and in contact with each other, and they can move independently in the horizontal direction. The jaws can be brought together to grasp an object. Thereby, a mechanical system is activated that presses the pins together. The resulting friction between the pins blocks their horizontal movement and the pin position is fixed. The activation of the mechanical system depends on the position of the jaws. The jaws move over a ramp, whereby the clamping is realised. This passive actuation is advantageous for high temperatures but not adaptable to changing diameters of the handling objects.

Kim [10] developed a similar gripper to [6] with two opposing pin jaws. The pins are actuated with pressurised air, whereby the jaws are electrically driven. The stroke of the pins regulates the process. A resistive foil is attached inside the gripper. If the pin touches the resistive foil, its resistance changes and the process is considered completed.

The systems shown are adaptable to different shapes and diameters. However, none of them is designed for the operation in conditions like the Tailored Forming Process or other forging processes. Pin grippers do not need to be made of polymer material to achieve their shape variability. In addition, the gripper's jaws can be configured and arranged differently allowing further adaptation to the handling task. Electrical grippers or sensors cannot be utilised to detect the pin position at high temperatures and have to be adapted. Due to those facts, the pin system is investigated further to realise a handling system for hot forging workpieces.

#### **3 High Temperature Flexible Handling System**

As described before, the boundary conditions of the Tailored Forming Process are exceptional. Thus, standard parts like parallel grippers cannot be utilised. Therefore, a flexible handling system consisting of two pin jaws and a grasping device was developed. The jaws, Fig. 1b, are the shape variable part of the system and were part of the previous work [6]. The grasping device, Fig. 2, aligns and moves the jaws and is designed in this work for the use case of the forging sector.

#### **3.1 The Pin Jaws**

In the case of the Tailored Forming Process, the high temperatures are problematic for the handling task. For that reason, the system developed in [6] is actuated by pressurised air instead of electric motors or hydraulics. Electrical units are not heat resistant, which is why they are not considered. Hydraulic systems have to be sealed, and the seals on the pin can be damaged by the heat the pin reaches. If a seal breaks, the system will fail, and the fluid will contaminate the environment. Pressurised air also benefits from sealing but can work without it. In the case of a leak, only air is released, which is uncritical for the environment. The high temperature also affects the pressurised air and causes it to expand, which the control circuit can compensate.

In order to use the pressurised air as actuation for the pins, each pin is integrated into a cylinder like a piston-cylinder system, depicted in Fig. 1a. The pin consists of the piston and a screwable head. The subdivision is required for assembly purposes and allows the pinheads to be changed for different tasks. For example, if the gripper handles objects with polished surfaces, metallic heads could damage them. A seal is attached at the other side of the piston to prevent pressure loss. The seal is critical because it consists of polymer material, like the elastic grippers mentioned above. Therefore, a thermal simulation was carried out to investigate the influence of different settings for material data and contact time between object and gripper. The results indicate temperatures in an allowed range for a high temperature stainless steel with a low conduction coefficient. The cylinders with the inserted pin pistons are assembled on a base plate, which produces the matrix arrangement.

**Fig. 1 a** Construction of the cylinder-piston-system. **b** Pin gripper holding a bearing bushing

#### **3.2 Grasping Device**

The jaws have to be aligned and moved, which requires a grasping device that can withstand the heat as well as the water from an additional cooling unit that maintains the handling object's heat distribution, described in more detail in [6]. Due to the temperatures, seals could be harmed, and the vaporised water can damage the electrics of the grasping device. To overcome these difficulties, a dedicated grasping device is designed within this work as follows: The jaws are mounted on a linear guiding system and can move independently. The linear guiding system also has a clamping system that can be activated at every position to fix the jaw. Additional double-acting cylinders actuate the jaws by pressurised air. To protect these jaw cylinders against heat and the cooling fluid, they are mounted on the backside of a plate, while the linear guidance systems and the jaws are on the front side. The grasping device is depicted in Fig. 2.

**Fig. 2** Components of the pin gripper divided into grasping device and pin jaws

#### **4 Controlling Concept**

After the presentation of the mechanical part of the handling system, the introduction of the developed control unit follows. The gripper's control has a significant influence on the accuracy and reliability of the grip. Therefore, this section addresses the basics of the underlying control concept and the gripping routine carried out.

#### **4.1 Concept**

For the realisation of the control, the setup in Fig. 3 is chosen: A programmable logic controller (PLC) is utilised as the central control unit. Sensors are attached to the pin gripper and connected to the PLC. A Hall sensor on each jaw actuation cylinder measures the jaw's position. Furthermore, there are laser distance sensors under the jaws. Their task is to detect the position of the handling object and transmit the data to the PLC. They operate as placeholders in the concept as their final application is still being investigated. Due to the thermal radiation of the hot object, their use is not possible without further considerations. The PLC processes position deviations of the object and corrects them by adjusting the jaw positions and the pressure in the pin cylinders, which improves the gripping accuracy. Preliminary grasping experiments showed that, depending on the handling object, the pins of one jaw press in the pins of the other. One assumption is that production tolerances cause varying friction forces in the cylinder-piston system, which leads to a force imbalance between the pins. The imbalance is compensated by adapting the pressure in the jaws and their position. As seen in Fig. 3, the pressures are set by a valve system that is connected to the PLC. The valve system measures and adapts the pressures to compensate for the air expansion caused by heat.
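The need for continuous pressure regulation can be illustrated with the ideal gas law. The sketch below assumes a closed volume of air that is heated at constant volume and a simple dead-band rule for the valve; the control law of the actual valve system is not described in this paper, and all numbers and names are hypothetical.

```python
def isochoric_pressure(p_abs_bar, t_start_k, t_end_k):
    """Absolute pressure of an ideal gas after heating at constant volume."""
    return p_abs_bar * t_end_k / t_start_k


def valve_action(p_measured_bar, p_setpoint_bar, deadband_bar=0.05):
    """Return 'vent', 'fill' or 'hold' for a simple dead-band regulator (assumed)."""
    error = p_measured_bar - p_setpoint_bar
    if error > deadband_bar:
        return "vent"
    if error < -deadband_bar:
        return "fill"
    return "hold"


if __name__ == "__main__":
    setpoint = 3.0  # bar (absolute), hypothetical pin pressure
    # Hypothetical warming of the enclosed air from 20 degC to about 79 degC.
    heated = isochoric_pressure(setpoint, 293.15, 351.78)
    print(f"pressure after heating: {heated:.2f} bar -> {valve_action(heated, setpoint)}")
```

In this example a temperature rise of about 20% in absolute terms raises the pressure by the same fraction, which the regulator answers by venting.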

#### **4.2 Gripping Routine**

A three-phase gripping routine is performed to grip a workpiece securely. In the **first phase**, both jaws are moved to the extended position, using the Hall sensors on the cylinders to check the position. Then the pins are brought back to their initial position by briefly applying pressure, which resets previously set contours.

In the **second phase** of the gripping cycle, the gripper is closed. The cylinders first close the jaws. The closing speed of the jaws is adjusted by the exhaust air throttling that controls the airflow out of the jaw cylinder. The throttle is integrated in the pressure regulating valve and can be adjusted continuously. When the pins come into contact with the workpiece during the closing process, the pins retract and the pin matrix maps the profile of the workpiece surface. The pneumatic clamping system is then activated, and the pins are re-pressurised for a secure grip.

In the **third phase** of the routine, the grip is released. The cylinders are pressurised again in the opening direction and the pins are vented. The clamping elements are deactivated, so the grip is released abruptly as the cylinders have already been pressurised. All components are vented when the defined extended position is reached again. With this, the handling and control systems have been presented. Next, the gripping accuracy is validated to examine whether the required handling precision is reached.
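The three phases of the routine described in this section can be summarised as a fixed sequence of actions. The following sketch models only the order of the steps; the class `LoggingGripper` and all action names are hypothetical stand-ins for the PLC interface and do not reflect the actual implementation.

```python
class LoggingGripper:
    """Stand-in for the PLC interface; it only records the requested actions."""

    def __init__(self):
        self.log = []

    def do(self, action):
        self.log.append(action)

    def jaws_extended(self):
        # A real system would evaluate the Hall sensors on the jaw cylinders here.
        return True


def phase_reset(gripper):
    """Phase 1: open the jaws, verify the position and reset the pin contour."""
    gripper.do("open jaws")
    if not gripper.jaws_extended():
        raise RuntimeError("jaws did not reach the extended position")
    gripper.do("pulse pin pressure")


def phase_grip(gripper, exhaust_throttle_percent):
    """Phase 2: close the jaws at a throttled speed, clamp, re-pressurise pins."""
    gripper.do(f"close jaws (exhaust throttle {exhaust_throttle_percent} %)")
    gripper.do("activate clamping")
    gripper.do("pressurise pins")


def phase_release(gripper):
    """Phase 3: preload the opening direction, vent pins, release the clamping."""
    gripper.do("pressurise jaw cylinders in opening direction")
    gripper.do("vent pins")
    gripper.do("deactivate clamping")
    if not gripper.jaws_extended():
        raise RuntimeError("jaws did not reach the extended position")
    gripper.do("vent all components")


def gripping_routine(gripper, exhaust_throttle_percent, handle=None):
    phase_reset(gripper)
    phase_grip(gripper, exhaust_throttle_percent)
    if handle is not None:
        handle()                      # e.g. the robot motion with the gripped part
    phase_release(gripper)


if __name__ == "__main__":
    demo = LoggingGripper()
    gripping_routine(demo, exhaust_throttle_percent=1.6)
    print("\n".join(demo.log))
```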

#### **5 Validation**

The designed system is analysed experimentally to verify the accuracy and provide fundamental knowledge for subsequent system optimisation. This section describes the experimental setup first and is followed by the results.

#### **5.1 Experiments**

The gripping routine presented in Sect. 4.2 is performed to investigate the gripping process. The aim is to measure the deviation of the object caused by the handling system and to minimise it. The output variable by which the results are evaluated is the relative position change of the workpiece during gripping. In all experiments, a hybrid hollow cylinder workpiece with an outer diameter of 62 mm and a height of 84 mm is gripped. Two laser distance sensors measure the x- and y-position of the cylinder. The x- and y-directions of the handling system are shown in Fig. 3, and the setup of the testbed is depicted in Fig. 2. During preliminary tests, the closing velocity of the jaws was identified as the main parameter. The velocity is determined by the inflow and outflow of air from the jaw cylinder. Therefore, the air inlet throttling and the exhaust air throttling are considered. For the exhaust, a maximum value of 2% is chosen, because higher values make the jaw move too fast. For the air inlet, the range from 0.01% to 100% is tested. Initially, no mechanical synchronisation of the jaw cylinders is used. Afterwards, a synchronisation is integrated because asynchrony was observed in the closing movement. The jaw cylinders are mechanically coupled through a lever. When the piston of one cylinder moves, the other is automatically set in motion, whereby synchronisation of the jaws is achieved. In total, an experimental design with 20 parameter settings is carried out. Each experiment is repeated three times in randomised order.
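A randomised run order for such a design can be generated as sketched below. The split into four sets with five exhaust throttle levels each is an assumption that reproduces the 20 settings; the levels 0.8% and 1.2% in particular are hypothetical, since only 0.4%, 1.6% and 2% are named in this paper.

```python
import random
from collections import Counter

# Exhaust air throttle levels in percent; 0.8 and 1.2 are assumed values.
EXHAUST_THROTTLE = (0.4, 0.8, 1.2, 1.6, 2.0)
# (air inlet throttle in percent, jaws synchronised?)
SETS = ((0.01, False), (50.0, False), (100.0, False), (50.0, True))
REPEATS = 3


def run_order(seed=0):
    """Return all runs (inlet, exhaust, synchronised) in a reproducible random order."""
    settings = [(inlet, exhaust, sync)
                for inlet, sync in SETS
                for exhaust in EXHAUST_THROTTLE]
    runs = settings * REPEATS
    random.Random(seed).shuffle(runs)
    return runs


if __name__ == "__main__":
    order = run_order(seed=1)
    print(f"{len(order)} runs, {len(Counter(order))} distinct settings")
    for number, (inlet, exhaust, sync) in enumerate(order[:5], start=1):
        print(f"run {number}: inlet {inlet} %, exhaust {exhaust} %, synchronised: {sync}")
```

Fixing the seed keeps the order reproducible, so that a measurement log can later be matched to the planned sequence.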

#### **5.2 Results**

The results are depicted in a box plot in Fig. 4. Here, only the x-direction is evaluated because it shows a larger deviation than the y-direction due to asynchrony. The results without synchronisation are presented first, followed by the synchronised results, including the explanation of the mechanical jaw coupling. The deviations for air inlet throttle values of 0.01%, 50% and 100% are high, and precise handling could not be realised. An exhaust air throttle of 0.4% gives the worst results in every uncoupled case. Furthermore, exhaust values of 1.6% and 2% give the best repeatability for all uncoupled settings. Based on the results, one can conclude that the exhaust air throttle has a larger effect on the accuracy than the air inlet throttle. During the experiments, observations were made that explain the inaccuracies. Depending on the exhaust air throttle, there was a delay in the closing movement between the jaws, whereby one jaw reached the object earlier and forced it out of the centre. The accuracy achieved with non-synchronised jaw cylinders is insufficient for the Tailored Forming Process.

The coupling reduced the deviation drastically, as seen for the coupled air inlet throttle of 50% (synchronised) in Fig. 4. The best results are achieved for the exhaust air throttle of 1.6%, with a deviation between 0.2 mm and 0.7 mm. The deviation has to be reduced further to operate successfully in the Tailored Forming Process.
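The evaluation behind such a box plot can be sketched as follows: the relative position change is the difference between the laser readings after and before gripping, summarised per parameter setting. The readings below are invented for illustration and are not measured values.

```python
import statistics


def relative_position_change(before_mm, after_mm):
    """Displacement of the workpiece caused by the grip, in mm."""
    return after_mm - before_mm


def summarise(deviations_mm):
    """Minimum, median and maximum of the repeated runs of one setting."""
    return {
        "min": min(deviations_mm),
        "median": statistics.median(deviations_mm),
        "max": max(deviations_mm),
    }


if __name__ == "__main__":
    # Hypothetical x-positions in mm, (before, after), for three repetitions.
    readings = [(10.0, 10.25), (10.0, 10.5), (10.0, 10.75)]
    deviations = [relative_position_change(b, a) for b, a in readings]
    print(summarise(deviations))
```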

**Fig. 4** Box plot of the experimental investigation. The influence of the parameters on the accuracy is tested. Three sets without synchronised jaw cylinders and one set with synchronised jaw cylinders

The validation provides the potential for further mechanical improvements. Some pins do not retract when they come into contact as assumed, whereby the object is moved. Investigations show that the manufacturing tolerances do not match every pin. The coaxiality of the cylinder drilling and the base plate drilling must be chosen more accurately to prevent pin clamping.

#### **6 Summary**

This work points out the advantages of shape variable grippers and their ability to adapt to varying geometries, as well as their benefit for the forging sector. Current shape variable grippers cannot be utilised in the forging sector because they are made of a polymer material that has a limited operating temperature of 300 °C. The temperature of the forging object exceeds this limit, reaching up to 1200 °C.

Therefore, a pin gripper that is variable in shape as well as resistant to high forging temperatures was developed. The pin gripper's construction and the control to achieve precise handling were presented. The controlling is necessary for the automation of the handling process and to carry out the gripping routine.

After implementing the routine, experiments were carried out to verify the accuracy of the handling system and provide fundamental knowledge for optimisations. For the experiments, two parameters were investigated. First, the exhaust air throttle and second, the air inlet throttle. The exhaust air throttle impacts the system more than the air inlet throttle. The overall accuracy was insufficient, which is why a mechanical coupling of the jaws was installed. This had a positive impact on the accuracy. The experiments showed other critical points that must be investigated further, like the manufacturing tolerances.

#### **7 Outlook**

The observation during the experiments showed optimisation potential for the jaw's construction. The current jaw consists of a base plate with mounted pin cylinders where the experiments identified an error source. The base plate and the pin cylinders can be manufactured as one part to eliminate the error. Consequently, the drillings achieve a better coaxiality and the risk of clamping during retraction decreases.

Furthermore, the routine could be enhanced by involving the sensors for the jaw's positioning in the closing process. Currently, the jaws close until they are in contact with the object and cannot move further. In future, the jaws could be stopped before contact occurs, followed by pushing out the pins. This would add an electronic coupling to the mechanical one.

For experiments under forging conditions, secure and precise handling must be ensured beforehand. The experiments have shown where improvements are required, which means that the adjustments can be completed and the influence of temperature examined in future.

**Acknowledgements** The results presented in this paper were obtained within the Collaborative Research Centre 1153 'Process chain to produce hybrid high-performance components by Tailored Forming'—252662854 in subproject C7. The authors would like to thank the German Research Foundation (DFG) for the financial and organisational support of this project.

#### **References**



### **Use of Autonomous UAVs for Material Supply in Terminal Strip Assembly**

#### Marius Boshoff, Michael Miro, Martin Sudhoff and Bernd Kuhlenkötter

#### **Abstract**

The multi-variant small part assembly of terminal strips requires innovative approaches for automated picking and meeting increased product variability demands with increasing process flexibility. Unmanned Aerial Vehicles (UAVs) are likely to be used in material supply for small parts and, therefore, replace manual picking of parts that are rarely needed in assembly. An autonomous material supply in the 3rd dimension could break up fixed assembly processes, reduce picking time and raise the production flexibility. In this article, the share of manual picking time in a real terminal strip assembly line is determined, and UAVs as a potential transport solution for terminals and jumpers are presented.

#### **Keywords**

UAV · Intralogistics · Material supply · Picking automation

#### **1 Terminal Strip Assembly: The Need for Improving Picking Automation**

Today's manufacturing processes are characterized by an increasing customer-specific individualization of products and require new process strategies for batch size 1 to remain profitable with increasing process complexity. This market trend is expressed in *mass customization* and represents a cross-industry challenge for manufacturing companies [1]. However, the desired mass customization of products prohibits the use of inflexible standard automation approaches and requires flexible, economically sensible and technically feasible automation systems. Permanently installed production lines, consisting of fixed conveyor technology, automatic machines, robots and safety equipment, do not offer the production flexibility that mass customization needs for a low-effort redesign of assembly lines and for meeting increased customer demands [2].

M. Boshoff (\*) · M. Miro · M. Sudhoff · B. Kuhlenkötter

Chair of Production Systems (LPS), Ruhr-University Bochum, Bochum, Germany e-mail: boshoff@lps.ruhr-uni-bochum.de

URL: https://www.lps.ruhr-uni-bochum.de/

T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_29

Like many other industries, switch cabinet manufacturing is subject to the described phenomenon of mass customization, as a wide variety of products can be assembled in different ways to meet customer-specific requirements [3]. Switch cabinets are elementary machine and plant components, as they are used to distribute, regulate and control power using switchgear and other components (see, for example, [6, 7]). Phoenix Contact GmbH & Co. KG and the Chair of Production Systems (LPS) have therefore been working together closely since 2016 to develop solutions for smart assembly in switch cabinet production, including the use of crucial Industry 4.0 technologies. As a part of the research cooperation, an assembly system was set up in the LPS learning and research factory (LFF). Actual assembly orders are carried out according to the customer's needs, and a dynamic technology transfer of new automation concepts from science to industry occurs.

The required small electrical parts have a size of a few centimeters and a weight of 5 to 50 g and are manually picked from a decentral storage rack. Improving manual picking efficiency is one of the most critical issues in the current logistics industry [2]. New warehouse automation concepts must be tailored to their particular needs to reduce manual picking time [4]. Therefore, the terminal strip assembly will be presented as an exemplary field of application for UAVs in automated picking. In a first step, the need for automated picking will be examined based on the proportion of manual activities for providing material like terminals and jumpers. Subsequently, a concept for UAVs in small parts assembly will be presented and discussed. It should be noted that one aim of the described method is general applicability in different production domains. In this context, the terminal strip assembly serves as an exemplary application.

#### **1.1 Material Supply of Terminals and Jumpers**

The assembly of terminal strips is carried out in an assembly line. Specific tasks must be carried out at each workstation to finalize the assembly process of terminal strips (see Fig. 1). Relevant in the scope of this paper are picking processes for workstation 1, the terminal assembly, and workstation 3, the assembly of jumpers. Picking parts and preparing workstation 1 is done according to production order in advance. After the number of required parts has been prepared, the terminal strips are assembled directly. Parts at workstation 3, on the other hand, are not picked in advance but are procured from the storage rack as soon as a container of parts is empty.

**Fig. 1** Overview of the assembly line [8]

In [5], the optimization of workstation 1 was discussed, and the implementation of a workstation concept was presented for the addressed assembly line. Having the material ready and in place for production supports the assembly workflow, as the employee works without interruption and loss of concentration. This, in turn, affects the workstation's design, as it must offer enough capacity to store material and be minimized to save production area at the same time. Nevertheless, there is still demand for parts needed in comparatively small quantities that are not stored at workstation 1 but in a separate storage rack. The current production process is first examined to determine the share of picking time for the terminal strip assembly and thus the loss of value-adding assembly time. The first step is to analyze the share of picking time in relation to the total production time, divided into the respective work steps of workstation 1 and workstation 3. Therefore, actual production data from real orders will be investigated. The methodology of data acquisition is examined in Sudhoff et al. 2020 [6, 7].

Based on the production data, an ABC analysis is then carried out. Therefore, components with a considerable proportion of the turnover are assigned to class A, whereas less needed components are assigned to class B or C. As the scope of this paper is to determine the distribution of parts in manual assembly and derive commissioning times for each component, the ABC analysis is conducted in terms of frequency of use. For class A, a range of 0 to 80% is applied. Class B reaches from 80 to 95% and C from 95 to 100%.
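The classification rule can be sketched in a few lines of Python. The usage counts below are hypothetical placeholders and not the production data of this study. The function name `abc_classify` and the convention of assigning a part according to its cumulative share after the part itself has been added are assumptions made for illustration.

```python
def abc_classify(demand, a_limit=0.80, b_limit=0.95):
    """Assign each part to class A, B or C by cumulative frequency of use.

    Assumption: a part belongs to the class in which its cumulative
    share (including the part itself) falls.
    """
    total = sum(demand.values())
    classes = {}
    cumulative = 0
    # Process the most frequently used parts first
    for part, count in sorted(demand.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += count
        share = cumulative / total
        if share <= a_limit:
            classes[part] = "A"
        elif share <= b_limit:
            classes[part] = "B"
        else:
            classes[part] = "C"
    return classes


# Hypothetical usage counts per part variant (illustration only)
demand = {"variant_1": 500, "variant_2": 250, "variant_3": 120,
          "variant_4": 90, "variant_5": 40}
classes = abc_classify(demand)
print(classes)
```

With these placeholder counts, the two most frequently used variants fall into class A, the third into class B and the remaining two into class C.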

Consequently, parts that are frequently used to assemble terminal strips (class A) are usually stored within the worker's reach. Parts that are not commonly used (class B and C), on the other hand, are typically stored further away because of space requirements. In the underlying example, class B parts are stored at a distance of 8 m, and class C parts are located on a shelf 10 m from the assembly station. It can be assumed that this example rather understates the distance for commissioning B and C parts in most cases, as demonstrated in [9].

#### **1.2 Time-Consuming Picking Process for Assembly of Terminal Strips**

The assembly process analysis is based on evaluated data taken from January until November of 2021. In this period, 8,944 actual production orders were carried out. The complete assembly process of terminal strips can be divided into the operating activities of *production*, *preparation*, *administration*, *rework*, and *logistics*. The activities production and preparation account for most of the time spent on the assembly processes and are therefore of interest for further investigation. *Production* describes only work steps like clamping on electric parts, wiring, labelling, or executing quality testing. *Preparation* refers to order picking of required parts, like terminals and jumpers. The necessary parts are picked from a storage rack, transferred to the respective workstation, and placed for production. Some components, like jumpers, sometimes need a pre-assembly step. Pre-assembly steps and printing of labelling markers are carried out within the process of preparation as well.

Fig. 2(a) shows the relative proportion of the work steps from the overall assembly process. In total, a period of 997.2 h was analyzed, which is equal to 124.7 working days of eight hours.

With 78.4%, the most time-consuming assembly share is the production. However, with 17.7%, preparation takes the second-biggest share. Concerning the period of 997.2 h, the time spent for preparation is equal to 176.5 h or 22.1 working days. Logistics and rework usually hold a minor share, while administration takes a little more time.

A detailed analysis of the preparation time for single tasks is presented in Fig. 2(b). With 43.1%, preprinting of labelling markers takes the most significant share, while 39.1% are spent on picking terminals. 14.3% of the time spent is used for preparing shipping material like cardboard, 1.6% for picking jumpers, and 1.4% for preparing end terminals in an automated robotic clamping application. It may be noted that the observed time spent for the picking of jumpers is unusually low. One reason was that the orders carried out required a smaller variety of jumpers, resulting in less time spent picking those parts.

**Fig. 2** In (**a**) time allocation in percent for the work steps, evaluated data taken from January until November of 2021 and in (**b**) time spent in percent for picking of specific parts or preparing an assembly step

**Fig. 3** Amount of assembled terminals in 2021

The resulting values are used to calculate the timeshare for picking terminals and jumpers in the given period. The evaluation showed that 71.84 h were spent for the picking of terminals and jumpers, which is equal to 7.2% of the entire assembly time. If it is possible to reduce this share of lost effort with a new supply concept, the time saved may be shifted to a value-adding activity.
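The figure of 71.84 h follows directly from the shares reported above. The short calculation below retraces it; the variable names are chosen for illustration only.

```python
total_hours = 997.2          # analyzed assembly time in hours
preparation_share = 0.177    # share of preparation in the total time
terminal_share = 0.391       # share of terminal picking within preparation
jumper_share = 0.016         # share of jumper picking within preparation

# Hours spent on preparation, then the part of it spent on picking
preparation_hours = total_hours * preparation_share
picking_hours = preparation_hours * (terminal_share + jumper_share)
picking_share_of_total = picking_hours / total_hours

print(f"preparation: {preparation_hours:.1f} h")
print(f"picking terminals and jumpers: {picking_hours:.2f} h")
print(f"share of total assembly time: {picking_share_of_total:.1%}")
```

The picking share of the total time is simply the product of the preparation share and the combined picking share within preparation, 0.177 × 0.407 ≈ 0.072.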

#### **1.3 Demand of Terminals**

A wide variety of terminal variants were used to assemble terminal strips, but their actual demand varies strongly, depending on the order situation. Figure 3 shows the demand for terminals in the given period. In total, 56 different terminal variants were used, but only some can be listed here. There were 237,245 terminals of the variant *PT 1,5/S-TWIN/1P* assembled, but only 600 of the variant *USLKG 5*, for example.

The cumulated amount of different terminal variants is presented within an ABC analysis in Fig. 4. Based on a standardized ABC analysis, class A ranges from 0 to 80%, class B from above 80% to 95%, and class C from above 95% to 100%.

Only 7 of the 56 terminal variants account for nearly 80% of the used parts. Conversely, 49 variants must be kept in stock in separate storage and accessed when needed. The same analysis procedure was also used for the supply of jumpers in the same period. The demand for each specific variant is shown in Fig. 5, and the corresponding ABC analysis is presented in Fig. 6.

**Fig. 4** Cumulated amount of terminals by their variant, divided into ABC-categories

**Fig. 5** Amount of assembled jumpers in 2021

In total, 24,217 jumpers of 25 different variants were assembled. Again, a considerable gap in demand between particular jumper variants can be observed, which means a comparable stocking situation and, therefore, a similar storage situation as for the terminals. In contrast to the ABC analysis of the terminals, however, a comparatively larger number of jumper variants, 9 out of 25, fall into class A. As jumpers are smaller than terminals, the space requirements for storing A-parts remain similar. But there are still 16 variants of jumpers that are assigned to classes B and C.

**Fig. 6** Cumulated amount of jumpers by their variant, divided into ABC-categories

#### **2 Innovative Picking Automation with UAVs**

New picking concepts and automated production units are needed to reduce the effort of providing terminals and jumpers sustainably. In this context, possibilities of material provision with UAVs are discussed, which are now being subjected to a feasibility study in the chair's learning factory. Integrating an automated UAV-based supply into the assembly line might open possibilities for assembly concepts like just-in-time delivery. These concepts might break up the assembly line into a more flexible structure and even lead to reduced production effort. For this purpose, new supply concepts have to be shaped, and new technologies must be used.

#### **2.1 Automation Concepts for the Material Supply of Terminal Strip Assembly**

For the assembly line considered, the use of Automated Guided Vehicles (AGVs) is unsuitable for the provision of small parts since the AGVs would have to constantly avoid employees in a highly flexible working environment or block the narrow paths of the assembly line for the passage of employees [10]. Regarding the flexible nature of terminal strip assembly, fixed conveyor technology is not considered due to the high effort of reconfiguration at every order change [11].

Earlier studies showed UAVs to be an underestimated option for indoor material supply [10, 12]. However, the use of UAVs as transportation units has been neither tested nor documented for an actual material supply scenario [2, 13]. With a UAV-based supply, manual picking effort could be decreased to a minimum. This approach is innovative for small-part assembly and meets the need for increasing product variability with increasing process flexibility.

#### **2.2 UAVs for Material Delivery Tasks**

The term Unmanned Aerial Vehicle (UAV) describes small self-flying vehicles without any pilot controlling the aircraft. In the field of computer science and artificial intelligence, mostly the terms UAV, UAS (Unmanned Aerial System), VTOL UAV (Vertical Take-Off and Landing UAV) or Multirotor UAVs are used [14]. In most cases, four rotors lift the device, enabling the UAV to become a VTOL unit. Besides the Quadcopter UAV, there are also Helicopter UAVs and Fixed-wing UAVs. All of them come with their own strengths and weaknesses, as stated in [15]. For production environments with limited space and high demands regarding safety and reliability, quadcopters are the preferred choice as they are most likely to meet the requirements.

In intralogistics, UAVs are not yet much discussed, although their potential for industrial applications might be huge [10]. UAVs promise to be faster, more flexible, space-saving and more cost-effective than, for example, the material supply with mobile robots [12, 16]. The automated supply of workstations transforms the conveyor line with fixed routes into a highly flexible, multidimensional material supply. A conceptually configured assembly line for automated picking by UAVs is visualized in Fig. 7.

The image shows workstations 1 and 3 being supplemented by a loading station (1), in which UAVs are equipped with the respective order material. The loading station holds the parts of classes B and C that were analyzed in Sect. 1.3, and an industrial robot picks the parts from a storage rack to hand them over to the UAV. As material provisioning is carried out automatically by the UAVs, the loading of UAVs should also be automated. In a first attempt, an industrial robot could fulfill the task by picking material from a storage rack and placing it in a delivery container attached to the UAVs. Although attaching a delivery container or weight to a UAV has been reported in literature [17], UAV-loading concepts must be evaluated in real scenarios. Simulations or preliminary considerations cannot provide a reliable result.

The overall system is integrated into the assembly system's existing architecture, involving CLIP PROJECT for task planning and management. The employee could be provided with an interface for monitoring and controlling the system. With the help of a localization system, for example, an ultra-wideband system (UWB system), the position of the flying robots could be localized and transmitted to the control system. Localization is a crucial aspect regarding indoor UAVs due to their inability to use GPS. Although indoor localization has been a topic for a long time, it is still an active field of research, as can be seen in [18]. For indoor localization, UWB systems have already proved to be a reliable localization technology for aerial robotics in numerous studies. Besides UWB, motion capture technology is also a reported option in many UAV applications [19].
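To illustrate how range measurements from a UWB system could yield a UAV position, the sketch below estimates a 3D position from the distances to four fixed anchors by linearizing the sphere equations. The text does not specify a localization algorithm, so the method, the anchor coordinates, the function names and the noise-free ranges are all assumptions for illustration. A real system would have to handle measurement noise and would typically use more anchors with a least-squares fit.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def locate(anchors, ranges):
    """Estimate (x, y, z) from four non-coplanar anchors and their ranges.

    Subtracting the sphere equation of anchor 0 from the other three
    gives a linear system A p = b, solved here with Cramer's rule.
    """
    x0, y0, z0 = anchors[0]
    r0 = ranges[0]
    A, b = [], []
    for (xi, yi, zi), ri in zip(anchors[1:], ranges[1:]):
        A.append([2 * (xi - x0), 2 * (yi - y0), 2 * (zi - z0)])
        b.append(r0**2 - ri**2
                 + xi**2 + yi**2 + zi**2 - x0**2 - y0**2 - z0**2)
    d = det3(A)
    if abs(d) < 1e-9:
        raise ValueError("anchors must not be coplanar")
    position = []
    for col in range(3):
        # Replace one column of A by b (Cramer's rule)
        Ai = [row[:] for row in A]
        for row, value in zip(Ai, b):
            row[col] = value
        position.append(det3(Ai) / d)
    return tuple(position)


# Hypothetical anchor positions in metres and a hypothetical UAV position
anchors = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 8.0, 0.0), (0.0, 0.0, 3.0)]
true_position = (4.0, 3.0, 1.5)
ranges = [math.dist(a, true_position) for a in anchors]  # ideal, noise-free

estimate = locate(anchors, ranges)
print([round(c, 3) for c in estimate])
```

Because the ranges here are generated without noise, the estimate reproduces the assumed position up to floating-point error.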

#### **3 Conclusion and Next Steps**

As the picking time analysis for terminal strip assembly showed, small and lightweight products with a great demand for manual picking processes are predestined to be supplied by UAVs. Production data of 8,944 terminal strips were evaluated, and the share of picking time for terminals and jumpers was close to 7.2%. Within an ABC analysis, it could be shown that many different parts of class B and C cannot be stored directly at the assembly station. These parts must be stored in a separate material rack with a walkway of about 10 m. Manual picking of parts by an employee results in a direct loss in value, as the time spent picking might be used for a value-adding activity in production.

**Fig. 7** Supply of terminal strip parts in an automated picking process. (1) Industrial Robot ABB IRB 120 in a loading station. (2) Assembly stations for terminals and jumpers. (3) UAVs. (4) Dummy for Localization system. (5) Storage rack

Due to their low weight and size, terminals and jumpers might be carried by a UAV, and the traditional supply of these parts can therefore be automated. Thus, the first approach for a system structure was presented, and a workflow for automated picking by UAVs was introduced. Afterwards, existing challenges and barriers for implementation were discussed, and research questions were derived. To evaluate the presented approach, the system structure will be implemented in the near future. Because production takes place for actual customer orders, an isolated test field with the discussed configuration will be set up. After the first successful flights, the fulfilment of safety guidelines will be addressed.

#### **References**



### **Integration of an IoT Communication Infrastructure in Distributed Production Systems in Industry 4.0**

Jiahang Chen and Jürgen Roßmann

#### **Abstract**

The term Internet of Things (IoT) denotes a communication network, where various Things are interconnected using novel scenario-specific Internet technologies and predefined customizable semantics. Industry 4.0 aims at enabling a globally networked production, with the use of IoT as a crucial concept. Since production systems tend to be technically and organizationally heterogeneous, distributed at different locations, and associated with large amounts of data, communication and information processing platforms that provide confidentiality, integrity, availability (CIA rules), access control, and privacy are needed. In this contribution, we introduce a concept for designing and operating heterogeneous and spatially distributed industrial systems with Digital Twins, connected via an IoT communication infrastructure, the Smart Systems Service Infrastructure (S3I). By demonstrating an industrial use case with our concept, it is proven that the S3I can be used as a cross-domain solution for the interconnection of devices in a distributed production scenario.

#### **Keywords**

Internet of Things • Industry 4.0 • Communication • Smart Systems Service Infrastructure

J. Chen (\*) · J. Roßmann

Institute for Man-Machine Interaction, RWTH Aachen University, Aachen, Germany e-mail: Chen@mmi.rwth-aachen.de

J. Roßmann e-mail: rossmann@mmi.rwth-aachen.de URL: https://www.mmi.rwth-aachen.de

© The Author(s) 2023 T. Schüppstuhl et al. (eds.), *Annals of Scientific Society for Assembly, Handling and Industrial Robotics 2022*, https://doi.org/10.1007/978-3-031-10071-0\_30

#### **1 Introduction**

Industry 4.0 is concerned with the digital transformation of industries and is known worldwide, especially in the manufacturing sector. In this context, traditional industries are going to be combined with novel technologies such as Cyber-Physical Systems, the Internet of Things, Cloud Computing, and Big Data to enable a globally networked, personalized, and goal-oriented Smart Production [14]. The term Internet of Things (IoT), a crucial component at the forefront of Industry 4.0, aggregates various everyday objects to collect, exchange, process, and visualize data through the integration of scenario-specific Internet technologies and predefined customizable semantics to enable the situation-specific choreography in different domains.

The introduction of Industry 4.0 will inevitably lead to changes in the supply chain [2] to respond more flexibly to the adaptation of various technologies. In the automotive industry, for example, a car may consist of more than 30,000 different components produced from different raw materials and by various manufacturing processes. In this regard, with increasing demand for transparency and flexibility in today's production systems, traditional manufacturing is confronted with the transition from a centralized, production-based manufacturing model to a distributed, small-scale, and loosely coupled model.

Distributed production [20], a new form of localized manufacturing, eliminates the need for companies to forecast demand and maintain large inventories, and also enables the flexibility to reconfigure production structures [15]. An important aspect of distributed production is interconnectivity among distributed systems and their devices. In this context, how to deal with technical and organizational heterogeneity and ensure the confidentiality, integrity, and availability (CIA rules) of communication and information processing is becoming a primary concern.

In this paper, we focus on distributed production systems and contribute a concept to network heterogeneous and spatially distributed production systems with Digital Twins [13], connected via the Smart Systems Service Infrastructure (S3I, depicted in Fig. 1) [3, 17]. The S3I was initially developed as an IoT communication infrastructure to interconnect and orchestrate the so-called *Forestry 4.0 Things* [3, 5, 19]. The remainder of this paper is structured as follows: Sect. 2 summarizes the state-of-the-art communication architectures in distributed manufacturing. A general concept including the requirements is illustrated in Sect. 3. Its implementation in a simulation-based application is introduced in Sect. 4. In Sect. 5, the paper is concluded.

**Fig. 1** The Smart Systems Service Infrastructure as IoT communication infrastructure provides various services to interconnect decentralized Forestry 4.0 Things

#### **2 State of the Art**

In this section, we summarize some state-of-the-art communication architecture solutions for industrial distributed production systems.

#### **2.1 Centralized ERP**

Enterprise Resource Planning (ERP) refers to a comprehensive software solution for the central management of companies' resources. As proposed by Thomas Andre [16], the integration of an ERP system into the process flow helps the decision-making to be hierarchically broadcast from the upper levels to the lower levels, which is managed in a distributed autonomous way to dispatch the decisions explicitly to the respective executor. Similarly, George L. Kovacs contributes a web-based solution [9] for ERP systems as flow management solutions to manage scalable, multi-agent, multi-company production.

#### **2.2 Cloud-based Solutions**

Cloud computing uses IT and its associated technologies to drive the digital transformation of the manufacturing industry towards on-demand computing services. Following this structure, Xu proposes a layered architecture of a cloud manufacturing system [21]. This proposal incorporates a resource layer to deal with static and dynamic resources of software and hardware, a virtual service layer in charge of identifying manufacturing resources, a global service layer collaborated with cloud technologies, and an application layer dealing with user interactions. Rimal contributes architectural requirements [12] for cloud providers, enterprises, and cloud users, respectively. These can be summarised as general requirements for cloud system design.

#### **2.3 AAS-based Networking**

The Asset Administration Shell (AAS), a concept associated with RAMI4.0 [7] and regarded as the I4.0 equivalent of Digital Twins, can be combined with its asset (e.g. device, machine, equipment, etc.) to form a Component to represent all relevant data with a uniform interface [18]. As a middleware for Industry 4.0, BaSys 4.0 is concerned with 1) decentralized connection of AASs, 2) Virtual Automation Bus [10] as an implementation of end-to-end communication, and 3) service-oriented process control. Using BaSys 4.0, Antonino et al. developed an automatic pallet transport system that bundles high-level control and monitoring of the system status [1]. Perzylo et al. [11] introduce a concept that adopts capability-based semantic annotations of existing information models to enrich device models aiming at the orchestration of high-level skills from the perspective of BaSys 4.0.

#### **2.4 Summary**

Despite their successful applications in various industrial fields, the reference architectures described above still have several considerable limitations and debatable aspects. Firstly, as interpreted by Sun [4], the failure rate of the implementation of ERP systems ranged from 40 to 60%. Furthermore, ERP systems focus on interaction mainly at the upper levels. Hence, they are not able to deal with the events triggered at the lower levels of production [16]. Besides, the heterogeneity of production systems and the lack of semantic interoperability make the interconnection even more difficult [8]. The introduction of cloud-based technology into the industry raises concerns about sensitive manufacturing information. Meanwhile, not all cloud users want to store their data in the cloud and accept the security mechanism provided by the cloud provider. The architecture of AAS-based networking covers the basic requirements for the RAMI4.0 framework, but its openness and security are still open topics for globally networked production systems.

The use of the proposed concept brings various benefits to industrial distributed production systems. First, S3I combines several standard protocols to ensure the access security of everything connected to the infrastructure. S3I's distributed concept allows all Things to be managed without the need for centralized integration and without limiting the storage and management of resources centrally. In addition, S3I accommodates technical and organizational heterogeneity and ensures transparent, mutually understandable interactions through customizable semantics.

#### **3 Concepts**

Faced with the shortcomings of the current industrial communication architectures introduced in Sect. 2, we propose in this section our concept to interconnect heterogeneous and spatially distributed production systems with Digital Twins, focusing on the aspects of secured communication and interoperability by means of the proposed semantics.

#### **3.1 Requirements**

Our concept is presented under consideration of the following requirements: **Authentication** denotes that the identity of all participants in the IoT must be verified either decentrally or centrally before they are connected to the IoT. **Confidentiality** emphasizes that only authorized users have the right to access protected resources, especially during the exchange of data. **Integrity** refers not only to the completeness but also to the accuracy and truthfulness of the exchanged data. Data integrity can be ensured by adopting, e.g., a symmetric or asymmetric data encryption approach. **Heterogeneity** is related both to technical and organizational aspects originating from large and time-varying value-added networks with different actors. **Interoperability** refers to the capability of transparent interconnection between all communication participants, for example through a Semantic Data Model [6], a common language "spoken" by all participants, or a tool to depict the content of Things at the meta-level.
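As one possible illustration of a symmetric approach to the integrity requirement, the sketch below attaches a keyed hash (HMAC-SHA256) to a message so that the receiver can detect modification. This is an assumed example based on Python's standard library and not the mechanism used by the S3I. The key and the payload are placeholders.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = b"placeholder-shared-secret"   # hypothetical pre-shared key
payload = b'{"rpm": 1500}'                  # hypothetical message
tag = sign(shared_key, payload)

unmodified_ok = verify(shared_key, payload, tag)
tampered_ok = verify(shared_key, b'{"rpm": 9999}', tag)
print(unmodified_ok, tampered_ok)
```

A keyed hash of this kind protects integrity only. Confidentiality would additionally require encrypting the payload.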

#### **3.2 Digital Twin**

The definition of Digital Twin varies slightly with the emphasis of each field. In general, Digital Twins are agreed to be 1-to-1 replicas of the real world. In this context, Digital Twins are continuously updated during their entire life cycle through the internal and digital connection to their represented Assets. The interconnection between Digital Twins requires a capacity to extract valuable insights from large amounts of data originating from diverse devices, services, processes, systems, etc. Hence, semantic modeling, which is used to illustrate the relationships between values of data, is gradually taken into consideration and incorporated into our concept, which lets interconnected Digital Twins understand each other.

#### **3.3 From Digital Twin to I4.0 Things**

We define the combination of an asset and its Digital Twin as an Industry 4.0 Component (I4.0 Component). Together with Human-Machine Interfaces (HMIs) and software services, I4.0 Components are termed *Industry 4.0 Things* (I4.0 Things), which can be seen as nodes of the IoT in charge of collecting, exchanging, processing and visualizing data while being networked with others. An I4.0 Thing is globally uniquely identifiable, has predefined properties and interfaces, and supports standardized services. It can be connected to a goal-oriented Industry 4.0 System (I4.0 System) that consists of various I4.0 Things. Integrating Digital Twins into the IoT provides a standardized interface for everything connected to the IoT and makes Things accessible nodes. Digital Twins can also be considered software runtime environments that provide a virtual space for data processing and simulation.

Figure 2 illustrates a simplified Semantic Data Model of I4.0 Things in our approach, which uniformly defines the structure as well as the properties and callable functions provided by I4.0 Things. According to the data model, each Thing has a unique identity managed in a central identity management service. Furthermore, each Thing can restrict access by others and define an access policy, i.e. who can access it with which permissions. It also exposes endpoints to the external world, through which it can be reached to provide values (via Property) and service functions (via Functionality). Each Thing can be partitioned into smaller but independent Things (via hasSubThings), just as a car is composed of an engine, four tires, etc. An engine can be modeled as an independent SubThing of a car and provides e.g. rpm value and temperature. Furthermore, diverse I4.0 Things can be associated with one another to enable situation-specific choreography, forming an I4.0 System.

**Fig. 2** UML class diagram illustrating the Semantic Data Model applied to model and implement Industry 4.0 Things
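The structure described for the Semantic Data Model can be sketched as a small Python data class. This is a simplified illustration and not the model shown in Fig. 2: the class name `Thing`, its field names, the permission names `read` and `execute`, and all example values are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Thing:
    """Simplified stand-in for an I4.0 Thing; all field names are assumptions."""

    identifier: str  # globally unique identity
    name: str
    endpoints: List[str] = field(default_factory=list)
    properties: Dict[str, float] = field(default_factory=dict)
    functionalities: Dict[str, Callable[[], str]] = field(default_factory=dict)
    # Access policy: requester identifier -> set of granted permissions.
    access_policy: Dict[str, Set[str]] = field(default_factory=dict)
    sub_things: List["Thing"] = field(default_factory=list)

    def _require(self, requester_id: str, permission: str) -> None:
        """Raise PermissionError unless the requester holds the permission."""
        if permission not in self.access_policy.get(requester_id, set()):
            raise PermissionError(
                f"{requester_id} lacks '{permission}' on {self.identifier}"
            )

    def read_property(self, requester_id: str, key: str) -> float:
        self._require(requester_id, "read")
        return self.properties[key]

    def call(self, requester_id: str, function_name: str) -> str:
        self._require(requester_id, "execute")
        return self.functionalities[function_name]()


# Example values are invented for illustration.
engine = Thing(
    identifier="urn:example:engine-1",
    name="Engine",
    endpoints=["queue/engine-1"],
    properties={"rpm": 2400.0, "temperature": 88.5},
    functionalities={"self_test": lambda: "ok"},
    access_policy={"urn:example:car-1": {"read", "execute"}},
)
car = Thing(identifier="urn:example:car-1", name="Car", sub_things=[engine])
```

In this sketch the engine is an independent Thing that is merely referenced by the car, so it keeps its own identity, endpoints, and access policy, mirroring the car and engine example in the text.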

#### **3.4 Platform**

As I4.0 Things are defined as "worldwide identifiable participants" [18] that are able to communicate and may be distributed over large areas, a central infrastructure with a few essential software services is required to realize a decentralized interconnection of those Things. These services enable I4.0 Things to authenticate themselves, to store and retrieve their properties and features in a database, and to communicate with each other end-to-end in compliance with the given permissions. The S3I, as an IoT infrastructure, provides a directory service (via S3I Directory), OAuth 2.0 authentication (via S3I Identity Provider), optional message-based asynchronous communication (via S3I Broker using AMQP), and optional cloud storage (via S3I Repository). The S3I is domain-independent; it addresses the shortcomings enumerated in Sect. 2.4 and meets the requirements listed in Sect. 3.1.
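To make the OAuth 2.0 step more concrete, the following Python sketch prepares (but does not send) a token request using the client credentials grant defined in the OAuth 2.0 specification (RFC 6749). Which grant type the S3I Identity Provider actually uses is not stated in this section, so the grant type is an assumption; the URL, client identifier, and secret are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical token endpoint; not the address of any real identity provider.
TOKEN_URL = "https://idp.example.org/token"


def build_token_request(client_id: str, client_secret: str) -> Request:
    """Build an OAuth 2.0 client-credentials token request without sending it."""
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        }
    ).encode("ascii")
    return Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder credentials for illustration only.
token_request = build_token_request("factory-n", "not-a-real-secret")
```

Passing the prepared request to `urllib.request.urlopen` would perform the exchange against a real identity provider; the access token in its response would then accompany subsequent calls to the other services.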

#### **4 Application**

In this section, we implement the concept mentioned above in a simulation-based scenario to demonstrate the communication between distributed production systems, including their I4.0 components, services, and HMIs.

The S3I provides factories that are networked over large areas with integrated high-level communication, while the interaction at the system level is centrally managed by the S3I. The example in Fig. 3 illustrates how the I4.0 Things are networked with the S3I. In our application, Factory n attempts to get the current production status of Factory 1, see Fig. 4. All Things appearing in this scenario are modeled as I4.0 Things using the semantic model presented in Sect. 3.3. As an example, Fig. 5 depicts the meta information of Factory 1 in JSON format. Using the standardized REST API of the S3I Directory and a valid access token issued by the S3I Identity Provider, Factory n retrieves the endpoint and the interface provided by Factory 1. Subsequently, Factory n composes an encrypted and signed message containing the concrete request and sends it to Factory 1 via the S3I Broker. Because Factory n has previously been granted access to Factory 1, Factory 1 is allowed to return an appropriate response. As a result, Factory n obtains the status of the field devices in Factory 1, likewise via the S3I Broker.

To sum up, the participants in the network need not be aware of how other Things are implemented, how their internal logic works, how communication proceeds, or which programming language is used. They only need the corresponding access rights to the interfaces to acquire information and invoke service functions, because they understand each other by means of the predefined semantics. More importantly, the S3I ensures the security of data communication and resources based on the CIA principles, since OAuth 2.0 and a role-based authorization policy are used in the central services.

**Fig. 3** Various Industry 4.0 Things in distributed production systems interconnect with each other using the S3I and its provided services

**Fig. 4** Sequence diagram illustrating how Factory n retrieves the current production status of Factory 1 using the authentication service of the S3I Identity Provider, the directory service provided by the S3I Directory and an AMQP message exchange provided by the S3I Broker

**Fig. 5** JSON-based meta information of Factory 1 that is based on the Semantic Data Model and centrally stored in the S3I Directory
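The exchange between Factory n and Factory 1 described in this section can be mimicked with in-memory stand-ins. The sketch below is not the S3I API: the classes `IdentityProvider`, `Directory`, and `Broker`, the message fields, the queue names, and the returned status value are all assumptions, and encryption and signing of messages are omitted for brevity.

```python
import json
import secrets
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable


class IdentityProvider:
    """Stand-in for the authentication service: issues opaque tokens."""

    def __init__(self) -> None:
        self._tokens: Dict[str, str] = {}  # token -> thing identifier

    def issue_token(self, thing_id: str) -> str:
        token = secrets.token_hex(8)
        self._tokens[token] = thing_id
        return token

    def who_is(self, token: str) -> str:
        return self._tokens[token]  # KeyError for unknown tokens


class Directory:
    """Stand-in for the directory service: endpoint lookup with a policy check."""

    def __init__(self, idp: IdentityProvider) -> None:
        self._idp = idp
        self._entries: Dict[str, dict] = {}

    def register(self, thing_id: str, endpoint: str, readers: Iterable[str]) -> None:
        self._entries[thing_id] = {"endpoint": endpoint, "readers": set(readers)}

    def lookup(self, token: str, thing_id: str) -> str:
        requester = self._idp.who_is(token)
        entry = self._entries[thing_id]
        if requester not in entry["readers"]:
            raise PermissionError(f"{requester} may not access {thing_id}")
        return entry["endpoint"]


class Broker:
    """Stand-in for the message broker: one FIFO queue per endpoint."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[str]] = defaultdict(deque)

    def send(self, endpoint: str, message: dict) -> None:
        self._queues[endpoint].append(json.dumps(message))

    def receive(self, endpoint: str) -> dict:
        return json.loads(self._queues[endpoint].popleft())


def request_status(idp: IdentityProvider, directory: Directory, broker: Broker) -> dict:
    token = idp.issue_token("factory-n")             # 1. authenticate
    endpoint = directory.lookup(token, "factory-1")  # 2. resolve the endpoint
    broker.send(endpoint, {"sender": "factory-n",    # 3. send the request
                           "replyTo": "queue/factory-n",
                           "request": "production_status"})
    request = broker.receive(endpoint)               # 4. Factory 1 reads it
    broker.send(request["replyTo"],                  # 5. and answers (invented value)
                {"sender": "factory-1", "status": "running"})
    return broker.receive("queue/factory-n")         # 6. Factory n reads the reply


idp = IdentityProvider()
directory = Directory(idp)
broker = Broker()
directory.register("factory-1", "queue/factory-1", readers={"factory-n"})
response = request_status(idp, directory, broker)
```

Factory n never touches Factory 1 directly in this sketch; it only needs a token, a directory entry it is permitted to read, and the agreed message structure, which is the point made in the summary above.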

#### **5 Conclusion**

Given the heterogeneous nature of production systems, their spatial distribution, and the trend toward large amounts of data, a centralized infrastructure is needed to connect everything decentrally. In this paper, we propose the concept of integrating an IoT communication infrastructure into production systems using the domain-independent S3I, which was originally developed for Forestry 4.0 Things. The simulation-based example demonstrates the application and shows how interoperability can be ensured in a heterogeneous production network while taking the CIA security aspects into account. Moreover, the S3I does not require the resources of I4.0 Things to be hosted centrally in the services provided by the infrastructure; they can remain decentralized. From this perspective, the S3I can be scaled to any size as long as the servers hosting its central services allow. The demonstrated application also illustrates that the S3I is generally applicable as an IoT solution, regardless of the domain. Consequently, the S3I can be regarded as a promising solution for an enlarged and secured IoT. Future work will focus on specific, classic security issues such as DoS, injection, and man-in-the-middle attacks, and will analyze the vulnerability and reliability of the S3I under these attacks. Additionally, lightweight data communication needs to be considered at the level of both communication protocols and semantics.

**Acknowledgements** This project has received funding from the European Union's Regional Development Fund.

#### **References**



### **Author Index**

#### **A**

Adelsbach, Jan, 15 Afifi, Nehal Atef, 3 Aydemir, Muhammed, 305

#### **B**

Bach, Paul, 167 Berger, Julia, 291 Biltgen, Jacques, 267 Bliznyuk, Artem, 141 Boshoff, Marius, 355 Brand, Michael, 79, 91

#### **C**

Carmichael, Marc G., 229 Chen, Jiahang, 367

#### **D**

Deuse, Jochen, 65, 229 Dierks, Niklas, 317 Dietrich, Franz, 305 Dröder, Klaus, 317 Durst, Lukas, 117

#### **E**

Ebenhoch, Christoph, 191

#### **F**

Fleischer, Jürgen, 329 Flügge, Wilko, 267 Franke, Jörg, 167 Friedmann, Marco, 329 Fürst, Jens, 167

#### **G**

Gabriel, Julia, 65 Geggier, Jan, 343 Göppert, Amon, 41 Gravert, Marvin, 79 Gunaseelan, Balaji, 53

#### **H**

Hartwig, Johannes, 279 Henrich, Dominik, 253, 279 Hentschel, Jakob, 155 Herbert, Meike, 167 Hernandez Moreno, Victor, 229 Herrmann, Max, 191 Hogreve, Sebastian, 27, 215

#### **I**

Ince, Caner-Veli, 343

#### **J**

Jabs, Tim, 65 Jansing, Steffen, 65 Joana, Illgen, 317

#### **K**

Kanso, Ali, 3 Karkowski, Martin, 15 Kaven, Lea, 41 Kluge-Wilkes, Aline, 53 Kolditz, Torge, 155 Kuhlenkötter, Bernd, 103, 355 Kuhn, Dominik, 15 Kutschinski, Jürgen, 103 Kwade, Arno, 317

#### **L**

Lauer, Sascha, 267 Lechler, Armin, 177 Lieret, Markus, 167 Lu, Shuang, 291

#### **M**

Maqbool, Osama, 129 Mesmer, Patrick, 177 Miro, Michael, 355 Müller, Alexander, 305 Müller, Rainer, 3, 15

#### **P**

Papenberg, Björn, 27 Peck, Tobias, 203

#### **R**

Raatz, Annika, 155, 343 Rachner, Jonas, 41 Ralfs, Lennart, 203 Riedel, Patrick, 177 Riedelbauch, Dominik, 241 Rieger, Christoph, 65 Rieger, Monika, 65 Rohner, Dorian, 279 Roßmann, Jürgen, 129, 141, 367 Ruppert, Pascal, 253

#### **S**

Schilp, Johannes, 291 Schimanek, Robert, 305 Schluse, Michael, 141 Schmidt, Edgar, 253 Schmitt, Robert H., 41, 53 Schneider, Marco, 3 Schüppstuhl, Thorsten, 79, 91 Schwind, Tizian, 329 Seibt, Robert, 65 Seim, Patrick, 103 Spieler, Judith, 65 Steinhilber, Benjamin, 65 Sucker, Sascha, 241 Sudhoff, Martin, 355

#### **T**

Theren, Benedict, 103 Tracht, Kirsten, 27, 215

#### **V**

Verl, Alexander, 177 Villotti, Samuel, 117 Voet, Florian, 41 von der Wense, Jens, 191

#### **W**

Wacker, Christian, 317 Wagenblast, Florestan, 65 Wallmeier, Hannah, 215 Wanner, Martin-Christoph, 267 Weidner, Robert, 117, 191, 203 Wirth, Benjamin, 329 Wulff, Lukas Antonio, 79, 91