# **Limited Information Shared Control and its Applications to Large Vehicle Manipulators**

Karlsruher Beiträge zur Regelungs- und Steuerungstechnik Karlsruher Institut für Technologie

Volume 23

by Bálint Varga

Karlsruher Institut für Technologie Institut für Regelungs- und Steuerungssysteme

Dissertation approved by the KIT Department of Electrical Engineering and Information Technology of the Karlsruhe Institute of Technology (KIT) for the attainment of the academic degree of Doktor der Ingenieurwissenschaften (Doctor of Engineering)

by Bálint Varga, M.Sc.

Date of the oral examination: March 21, 2023

First reviewer: Prof. Dr.-Ing. Sören Hohmann

Second reviewer: Prof. Dr. Takahiro Wada

**Imprint**

Karlsruher Institut für Technologie (KIT) KIT Scientific Publishing Straße am Forum 2 D-76131 Karlsruhe

KIT Scientific Publishing is a registered trademark of Karlsruhe Institute of Technology. Reprint using the book cover is not allowed.

www.ksp.kit.edu

*This document – excluding parts marked otherwise, the cover, pictures and graphs – is licensed under a Creative Commons Attribution-Share Alike 4.0 International License (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/deed.en*

*The cover page is licensed under a Creative Commons Attribution-No Derivatives 4.0 International License (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/deed.en*

Print on Demand 2023 – Printed on FSC-certified paper

ISSN 2511-6312 ISBN 978-3-7315-1325-4 DOI 10.5445/KSP/1000162759

# **Preface**

This work marks the end of a special journey over the last five years as a research assistant at the FZI Research Center for Information Technology and at the Institute of Control Systems (Institut für Regelungs- und Steuerungssysteme - IRS) of the Karlsruhe Institute of Technology (KIT). I am very grateful for the support and guidance of numerous people, without whom this thesis would not have been finished.

First and foremost, I express my sincere gratitude to my "Doktorvater," Professor Hohmann. I really prefer the German term, which emphasizes the opportunities I had at IRS, including numerous enriching discussions and engaging conversations that significantly contributed to the success of my thesis. I am grateful for these opportunities. I am also very thankful to Professor Wada for his interest in my work and for the assessment of this thesis. I really appreciate your commitment and time.

Furthermore, I am really thankful to my teams at FZI and IRS. Special thanks to my group leaders, Tobias, Stefan, Jairo and Simon, who supported me beyond technical issues in my work as a research assistant. I am thankful to my colleagues at the FZI and the KIT: The friendly atmosphere and the opportunities for scientific discussions made both facilities special. I will never forget this time. Many thanks go to Selina, Arash, Yannick and Da, who significantly contributed to my research. It was wonderful to work with you! Furthermore, thank you Florian, Christian, Esther, Philipp, Julian, Simon, Max and Lukas for the valuable discussions and feedback during the writing of the thesis.

Special thanks go to Thomas, Tobias and Christopher. You were the anchor and the greatest help at the beginning of my time at FZI and during the change to IRS. I am sure that this journey could have been very different without you and your support in difficult times.

Last, but not least, I thank my wife and my children for the time spent playing at the playgrounds, which gave me balance in life, and for the support in the last months to finish this thesis. Without you, I would not have been able to finish this exciting PhD journey.

My little family: Thank you for being there for me. You are the most wonderful gift I could have received from God. Thank you for helping me reach the end of this bumpy but exciting doctoral journey!

Karlsruhe, March 2023

*For Gyöngyi and the children*

# **Summary**

This dissertation deals with the cooperative control of a mobile working machine, which consists of a commercial vehicle and one or more hydraulic manipulators. Such machines are used for road maintenance tasks. The working environment of the manipulator is unstructured, which makes determining a reference trajectory difficult or impossible. Therefore, this work proposes an approach that automates only the vehicle, while the human operator remains part of the system and controls the manipulator. Such partial automation of the overall system leads to a special class of human-machine interactions that has not yet been investigated in the literature: cooperative control between two subsystems, in which the automation has no information about the subsystem controlled by the human. This work therefore presents a systematic approach to cooperative control with limited information, which can support the human operator without measuring the references or the system states of the manipulator. In addition, a systematic design concept for cooperative control with limited information is presented. For this design method, two new subclasses of the so-called potential games are introduced, which enable a systematic calculation of the parameters of the developed cooperative controller without manual tuning. Finally, the developed concept is applied to the example of a large mobile working machine in order to identify and evaluate its advantages. Following the analysis in simulations, the practical applicability of the method is examined in three experiments with human test subjects on a simulator. The results show the superiority of the developed cooperative control concept over manual control and non-cooperative control with regard to both objective performance and the subjective assessment of the test subjects. This dissertation thus shows that the cooperative control of mobile working machines using the developed theoretical concepts is both helpful and practically applicable.

# **Abstract**

This thesis focuses on the shared control of a large vehicle manipulator, which consists of a mid-size heavy-duty vehicle and one or more large hydraulic manipulators. Such machines are used for road maintenance work. The working environment of the manipulator is unstructured, making the measurement of a reference trajectory challenging or impossible. Therefore, this thesis proposes an approach that automates the vehicle only, while the human operator remains part of the system and controls the manipulator. Such a shared control setup leads to a particular class of human-machine interactions, which has not been studied in the literature: limited information shared control between two subsystems, in which the automation has no information about the human-controlled subsystem. Therefore, this thesis contributes a systematic approach to limited information shared control, which can support the human operator without measuring the references or the system states of the manipulator. In addition, a systematic design concept for limited information shared control is presented. For this design method, two new subclasses of the so-called potential games are introduced, which enable a systematic calculation of the parameters of the limited information shared controller without manual tuning. Finally, in order to investigate and assess the benefits of the proposed shared control concept, it is applied to the example of a large vehicle manipulator. After the analysis in simulations, the practical applicability of the methods is investigated in three experiments with human test subjects using a simulator. The results show the superiority of the developed shared control concept over manual control and non-cooperative control in terms of both objective performance and subjective evaluation by the subjects. Thus, this thesis shows that the shared control of large vehicle manipulators with the developed theoretical concepts is both helpful and practically applicable.

# **1 Introduction**

The German road network covers around 229,720 km [Deu22, p. 102], including highways, motorways, and other roads, all of which need to be maintained regularly. Roadside maintenance work on motorways and highways is very time- and labor-intensive and is therefore associated with a high cost burden [Rec16]. Furthermore, the German Federal Ministry for Digital and Transport expects these costs to rise within the next few years [FP21, Hör21]. The maintenance expenses can be estimated based on the investigations of the State Court of Audit of Baden-Württemberg [Rec16]: With a reduction of approximately 10% in working time, the possible savings can amount to 5,000,000 € of the yearly state budget in Baden-Württemberg alone. Furthermore, the introduction of novel systems can have a social impact: Quicker and more precise road maintenance work has less impact on traffic, which increases the safety of road users, see [Roo06]. Similar cost increases and potential savings can be assumed for other European countries [For06].

For road maintenance and verge mowing works, mobile vehicle manipulators with one or more working attachments are used primarily, see e. g. [MUL20, Due20]. Each working attachment is mounted on a hydraulically actuated manipulator to move it to the desired position. An exemplary vehicle setup with three manipulators is depicted in Figure 1.1. Their current state-of-the-art operation mode is manual control, which is complex and demanding because the operator must perform two tasks simultaneously: Firstly, the operator has to control the manipulator in such a way that the roadside maintenance work (e. g. mowing, cutting, cleaning) is performed successfully. In particular, damage-free and time-efficient work is required here. Secondly, the operator must participate in regular road traffic with the vehicle.

**Figure 1.1:** An exemplary image of a large vehicle manipulator with three working attachments.

A common solution to relieve the operator is to divide the tasks between two persons: One person is responsible for driving the vehicle, and a second person solely controls the manipulators. The two people fulfill both tasks cooperatively, and the communication between them is mainly verbal. The manipulator operator tells the driver how fast the vehicle should move and how far from the roadside it should be. Meanwhile, the driver provides feedback on the actual traffic and the limitations of the vehicle's motion. This cooperation is a *continuous human-human interaction* [Bec19, KKW<sup>+</sup>21], which is central to the quality of the work [Ost20, Chapter 4]. However, employing two people on one vehicle is not a preferred solution, as it leads to even higher costs. Furthermore, due to the shortage of skilled employees, the two-operator solution will not be feasible in the near future [Gmb16], [Ost20, Chapter 2].

A possible solution to the aforementioned problem could be the full automation of large vehicle manipulators, in which the automation both drives the vehicle and controls the manipulator [Ass94, Chapter 3]. However, the manipulator is usually operated in an unstructured environment, in which the reference of the manipulator is hard to measure or estimate with state-of-the-art sensor systems. Furthermore, in the case of hydraulic manipulators, the "*human operation cannot be predictably replaced by full robotic autonomy*" [XC18]. These two facts hinder the full automation of the manipulator, which makes full automation of the overall system according to SAE Level 5 [ORA20, Chapter 1] unrealistic under these conditions.

Therefore, an alternative is the automation of the vehicle only, which seems to be a promising solution owing to the available results in the field of autonomous vehicles. With an automated vehicle, the single operator is relieved of a heavy workload and can concentrate on the roadside maintenance work only. This creates the potential for significantly greater profitability for the operating companies and the manufacturers of mobile machinery, as well as further savings for the state budget. These facts encourage increased research on the automation of the vehicles of such systems. Consequently, this thesis focuses on a novel automation solution for the vehicle itself, in which the operator remains part of the system and controls the manipulator.

# **1.1 Research Objectives and Contributions**

Since the cooperative behavior between the vehicle and the manipulator is crucial for the quality of the work, the automation of the vehicle has to take the task of the manipulator into account as well. This is challenging for the automation because of the unstructured working environment and the lack of a reference trajectory for the hydraulic manipulator. The automation therefore has only limited information on the manipulator. The current state of the art offers neither practical nor methodological solutions for task sharing in such a limited information shared control setup. Thus, this thesis attempts to close both the methodological and the practical research gap.

Firstly, the current literature does not provide control algorithms for vehicle manipulators that include a human operator, because its focus is the control of small-sized, indoor vehicle manipulators, see e. g. [WBK07, SK08b, MASD14, TSA<sup>+</sup>22]. Furthermore, large manipulators are studied only with a static base [HS10, Rud18, WWXS22]. Secondly, existing shared control concepts require full information on all system states and references, see e. g. [FFV16]. Due to limitations in the perception of the manipulator, these requirements are not fulfilled for the system under consideration, and thus existing shared control concepts cannot be applied.

The lack of both practical and conceptual solutions for the aforementioned shared control application with limited information necessitates the development of novel shared control methods. Therefore, this thesis focuses on *limited information shared control* of systems with a structure similar to that of large vehicle manipulators: The states and references of the human-controlled part of the system are not available to the automation. In other words, the automation has limited information.

Consequently, the research reported in this thesis attempts to answer the following three questions:


To answer the first question, this thesis proposes limited information shared control with a modified model structure that uses the so-called *cooperation state*. The proposed formal definition enables the sharing of the control task despite limited information about the system states or reference trajectories.

To answer the second question, a systematic automation design is presented, which enables transferability to further applications. The design procedure computes the parameters of the cooperation state and the necessary feedback gains of the controller.

To answer the third question, a simulator was developed in the course of the thesis. Using this simulator, three experiments were conducted to investigate the proposed limited information shared control concept and the design method.

# **1.2 Outline**

The outline of this thesis is depicted in Figure 1.2: Chapter 2 presents the state of the research relevant to this thesis in the fields of 1) vehicle manipulators, 2) large hydraulic manipulators, and 3) shared control concepts and methods for human-machine interactions. The limitations of the existing concepts are discussed and the necessity for a novel approach is shown.

In Chapter 3, the fundamentals of game theory and shared control designs are introduced, which are followed by the novel concept of limited information shared control and the overview of the systematic design procedure.

Chapter 4 handles the systematic design procedure of limited information shared control in detail, for which the two novel subclasses of potential games are introduced. This is followed by the presentation of the systematic calculation of the cooperation state and the stability analysis of the limited information shared controller.

In Chapter 5, the concept of the limited information shared controller is applied to the lateral and longitudinal control problem of a large vehicle manipulator. Moreover, the methods and concepts for the developed potential games are applied and analyzed in simulations. Then, the limited information shared controller is verified and analyzed in simulations. Furthermore, the developed test bench is introduced, which includes the complex, real-time model of the vehicle manipulator.

Chapter 6 presents three experiments demonstrating the benefits of limited information shared control and the usability of the design process by validating both on the test bench with human test subjects. In these experiments, the performance is evaluated from objective and subjective points of view.

Finally, the thesis is summarized and concluded in Chapter 7.

**Figure 1.2:** The outline of the thesis

# **2 Related Work and Research Gap**

The first part of this chapter presents the concepts of small and large ground vehicle manipulators. The emerging challenges are discussed and the need for considering the human-automation interactions in the context of vehicle manipulators is specified. Afterward, the control concepts and models of large hydraulic manipulators are reviewed. The diverse control modes and technical hurdles of their full automation in the near future are highlighted. The subsequent part deals with the concepts and approaches to human-automation interactions. The review of the related works reveals the shortcomings of the existing methods, and hence, the necessity of new methods is highlighted. From the identified research gaps, the contributions of this thesis are formulated for the shared control of large vehicle manipulators at the end of the chapter.

# **2.1 Modeling and Control of Vehicle Manipulators**

This section provides a brief overview of the specific applications, the models, and the control approaches from the literature, which helps in comprehending the challenges of vehicle manipulators<sup>1</sup>.

A vehicle manipulator is a robotic system consisting of a mobile base (a vehicle) and a robotic manipulator, which can carry out various tasks [AKO06]. The literature of the '80s and '90s already provides models and control methods for various vehicle manipulator systems, see e. g. [DV89, LL90, HD91a, MYL91, CB97, ATS98]. Since the 2000s, these development efforts have intensified: Numerous practical applications of vehicle manipulators have been established and theoretical methods have been proposed, see e. g. [WBK07, ZANS06, FDGS09, FGP14]. Such applications involve different domains: aerial, underwater, and ground vehicle manipulators, see [BK16, Chapters 51 and 52]. Some applications from the literature worth mentioning are the following:

• In the aerial domain, there are numerous special applications that require active intervention instead of passive monitoring or remote inspection, see [RLO18]. Such aerial vehicle manipulators have the benefit of being able to operate under hazardous circumstances and thereby protect human workers, see e. g. [BK16, Chapter 26]. The literature addresses mechanical design problems [NGK15], controller synthesis, and the realization of real-world applications [KHS<sup>+</sup>14]. Due to the short flying time, such systems currently have only a few practical applications, see [OTS<sup>+</sup>22]. For further information, the reader is referred to the review articles [RLO18, MHH20, OTS<sup>+</sup>22].

• Examples in the underwater domain include both remotely operated and autonomous underwater vehicles [BK16, Chapter 25]. Their fields of application have a long history (e. g. [Ege91, MOM94, CODP98, ACC04]): industrial undersea oil and gas exploration, military activities, and scientific and environmental missions, see e. g. [SP01, MCY09, SLO<sup>+</sup>22]. Underwater vehicle manipulators pose further challenges regarding the actuators, sensors, and mechanical structure due to the high pressure [Hya11], and the communication delay hampers teleoperation, see e. g. [KSWK20]. For in-depth reviews, the reader is referred to [BK16, Chapter 25] and [SCO<sup>+</sup>18].

<sup>1</sup> In the literature, there are different terminologies for vehicle manipulators: In the 2000s, the term "vehicle manipulator" was common, while in recent years, the literature has more often employed the term "mobile manipulator", reflecting the growing number of Industry 4.0 applications. Both terms describe a robotic system that includes a mobile base and at least one robotic arm. In this thesis, the term "vehicle manipulator" is used consistently.

As this thesis focuses on the application of ground vehicle manipulators, see Chapter 1, the following part presents this domain in particular. To provide an overview of ground vehicle manipulators, Figure 2.1 supports the discussion of the application categories. In each category, the limitations of existing methods are addressed and the necessity of new concepts is explained. Then, the models and control approaches of these applications are discussed. Finally, the necessity of introducing systematic definitions of dual task and task prioritization in the context of vehicle manipulators is explained.

The application domains of ground vehicle manipulators are categorized into A) Industrial, B) Assistive, and C) Outdoor/Off-road vehicle manipulators<sup>2</sup>, see Figure 2.1.

Industrial vehicle manipulators possess the mobility and flexibility to carry out various tasks such as assembly [KMAM18, HOD<sup>+</sup>22], surface treatment [DYM<sup>+</sup>22], welding [LWH<sup>+</sup>22, XWCH22], or transportation (pick and place) of heavy objects within warehouses [WFK<sup>+</sup>16, DKK<sup>+</sup>17, ARTG19, ZSS<sup>+</sup>20]. In these industrial applications, the joints of the manipulators are actuated by electric motors. Therefore, they can move faster and more precisely than a large vehicle manipulator, which is hydraulically actuated, see Chapter 1. Furthermore, industrial vehicle manipulators are usually employed in factories or warehouses, where stationary localization systems can be installed [WFK<sup>+</sup>16, DKK<sup>+</sup>17]. Such localization systems provide the exact pose of the vehicle manipulator and a precise, collision-free reference path for its control. They are not available for outdoor large vehicle manipulators, which makes their control more challenging. Moreover, the existing control concepts for industrial vehicle manipulators focus only on full automation and do not include a human operator in the control task. However, large vehicle manipulators cannot be fully automated for legal reasons, and at least a supervisory human operator has to be present, see [Bun21]. The supervisory role of the operator easily leads to the "out-of-the-loop" phenomenon, which can cause dangerous situations, see [Bai83, Kom08]. Therefore, in contrast to the trend for industrial vehicle manipulators, keeping the human operator in the control loop of large vehicle manipulators seems beneficial.

<sup>2</sup> There are review articles that arrange the ground vehicle manipulators in different categories. The categories chosen here help to emphasize more precisely the shortcomings of the existing methods with respect to large vehicle manipulators for road maintenance work.

**Figure 2.1:** The various application domains of vehicle manipulators.

Due to the lack of stationary localization systems, the control of large vehicle manipulators cannot be solved with the concepts of industrial vehicle manipulators. Thus, novel methods including the human operator are necessary for large vehicle manipulators.

Assistive vehicle manipulators can ease the lives of disabled or elderly people and can, e. g., provide support for patients in hospitals [KKP<sup>+</sup>09]. Further applications include automated wheelchairs with a manipulator [MASD14, DLYL<sup>+</sup>20], lightweight robots for household tasks (cooking, cleaning) [BGJ<sup>+</sup>05, KECM22], or self-care robots for daily activities [HGC<sup>+</sup>14]. Common household tasks addressed in the development of assistive vehicle manipulators are opening and closing doors or picking up and handing over objects, see e. g. [KECM22, KCF<sup>+</sup>12]. In hospitals, assistive vehicle manipulators can be used for the disinfection of surfaces [HZL<sup>+</sup>20], therapeutic assistance [YTO<sup>+</sup>19], or accompanying patients who need walking support [SRH22]. The applications from the literature show that assistive vehicle manipulators imply interactions with humans. However, their interaction concepts cannot be transferred to large vehicle manipulators, since the nature of the interactions is different: Assistive vehicle manipulators support humans in everyday activities, in which the automation and the human do not control a technical system together. In a shared control setup<sup>3</sup>, on the other hand, the operator and the automation control the technical system together. Therefore, the concepts of assistive vehicle manipulators cannot be adapted to large vehicle manipulators.

Outdoor and off-road vehicle manipulators can be found on construction sites, in forestry, or in agricultural applications, see [OMS21b, CKH08, SSK17, ETKA<sup>+</sup>19] or [BK16, Chapter 56]. Their control concepts cannot be applied to the problem from Chapter 1, because existing outdoor and off-road vehicle manipulators have special mechanical structures or customized control algorithms. They are usually smaller than a large vehicle manipulator and are often designed for one specific task only: pesticide spraying, weed killing, or harvesting on agricultural fields [MZR<sup>+</sup>14, BHCI20, DDT<sup>+</sup>22]. Other works do take a human operator into account, but only for large manipulators with a stationary base, see e. g. [HS10, MKK19, KMPK21].

<sup>3</sup> For a formal definition of *shared control*, see Definition 2.4 in Section 2.3.

Independently of the operating domain, all the applications of these three categories have one feature in common: The two subsystems – the vehicle and the manipulator – have different kinematic and dynamic characteristics. Their combination facilitates solving novel and challenging tasks, as it increases the manipulability and flexibility of the end effector (EE), which is beneficial in many applications. In the general case, the vehicle manipulator possesses at least nine degrees of freedom (DOF): The vehicle has three and the manipulator has six DOF. This generates a system redundancy, which must be handled during the controller design. This redundancy is defined in Definition 2.1.

#### **Definition 2.1 (Vehicle Manipulator Redundancy)**

*Let a vehicle manipulator be given with the system states*

$$\boldsymbol{x} = \left[ \boldsymbol{x}_{\rm v}, \boldsymbol{x}_{\rm m} \right],$$

*where* $\boldsymbol{x}_{\rm v} = [x_{\rm v}, y_{\rm v}, \theta_{\rm v}]$ *and* $\boldsymbol{x}_{\rm m} = [q_1, q_2, \ldots, q_n]$ *represent the states of the vehicle and the generalized coordinates of the manipulator, respectively. The states* $x_{\rm v}$, $y_{\rm v}$ *and* $\theta_{\rm v}$ *represent the vehicle position and orientation. The generalized coordinates* $q_i$ *are usually the joint angles, which fully describe the pose of the end effector. The vehicle manipulator redundancy is the redundancy of the end effector pose that arises from the combination of the motions of the two subsystems, the vehicle and the manipulator.*

#### **Note:**

A redundancy problem can also arise from the structure of the robotic manipulator itself (see e. g. robots with seven or more DOF). However, this redundancy is not explicitly addressed in this thesis, because the manipulator is human-controlled and because large vehicle manipulators usually have manipulators with four or five DOF. For further information about manipulator redundancy, see [BK16, Chapter 10].

The vehicle manipulator redundancy allows a target point to be reached in various ways: For instance, the target point can be reached by moving the manipulator only or by coordinating the vehicle and the manipulator. Such coordination is often essential for complex tasks to achieve time- or energy-optimal operation, see [Anc17, ZSS<sup>+</sup>20].
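The redundancy described here can be made concrete with a small numerical sketch. It is only an illustration under simplifying assumptions that are not taken from this thesis: the manipulator is a planar two-link arm mounted at the vehicle origin, the link lengths are arbitrary, and the vehicle can be placed freely in the plane (motion constraints of the vehicle are ignored). The function names `end_effector_position` and `vehicle_position_for_target` are hypothetical.

```python
import math


def end_effector_position(vehicle_pose, joint_angles, link_lengths=(2.0, 1.5)):
    """Planar forward kinematics of a vehicle carrying a serial arm.

    vehicle_pose = (x_v, y_v, theta_v); joint_angles = (q_1, ..., q_n).
    The link lengths are illustrative values, not data from the thesis.
    """
    x, y, angle = vehicle_pose
    for q_i, l_i in zip(joint_angles, link_lengths):
        angle += q_i                    # accumulate the joint angles
        x += l_i * math.cos(angle)      # advance along the current link
        y += l_i * math.sin(angle)
    return x, y


def vehicle_position_for_target(target, theta_v, joint_angles, link_lengths=(2.0, 1.5)):
    """Vehicle position that puts the end effector on the target for a given arm posture."""
    off_x, off_y = end_effector_position((0.0, 0.0, theta_v), joint_angles, link_lengths)
    return target[0] - off_x, target[1] - off_y


target = (0.0, 3.5)

# Configuration A: the vehicle stays at the origin, the arm is fully stretched.
pose_a, joints_a = (0.0, 0.0, 0.0), (math.pi / 2, 0.0)

# Configuration B: the arm is bent and the vehicle position compensates for it.
joints_b = (math.pi / 2, -math.pi / 2)
x_b, y_b = vehicle_position_for_target(target, 0.0, joints_b)
pose_b = (x_b, y_b, 0.0)

ee_a = end_effector_position(pose_a, joints_a)
ee_b = end_effector_position(pose_b, joints_b)
print(pose_a, ee_a)
print(pose_b, ee_b)
```

Both configurations place the end effector on the same target point although the vehicle pose and the joint angles differ, which is the redundancy that a controller for the overall system has to resolve.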

A common feature of vehicle manipulators is the existence of *two linked tasks* for the two subsystems due to their physical constellation<sup>4</sup>. A typical example of two linked tasks is the tracking of two trajectories: one for the vehicle and another for the manipulator, see e. g. [FAD10, TMK<sup>+</sup>11, ZKHE13, NY96, CCL10, MGF<sup>+</sup>21]. Figure 2.2 illustrates such *two linked tasks* for a planar vehicle manipulator: the simultaneous minimization of the deviations from the two references. These deviations are illustrated by the orange distances $d_{\rm v}$ and $d_{\rm m}$, which are defined between the reference points of the vehicle manipulator ($P_{\rm v}$, $P_{\rm m}$) and the two references $\Gamma_{\rm v}$ and $\Gamma_{\rm m}$. The two tasks are *linked* since they influence each other. However, there is no general definition for such two linked tasks in the robotic domain.

<sup>4</sup> Note that the issue of dealing with two tasks simultaneously is widespread. For example, the control and coordination of multiple unmanned aerial vehicles also includes two or more separate tasks, see e. g. [GBGKH18, CCS<sup>+</sup>18, ZYZS19].

**Figure 2.2:** A planar vehicle manipulator with two references, which leads to a dual task: the simultaneous minimization of the deviations from the references of the vehicle and the manipulator ($\Gamma_{\rm v}$ and $\Gamma_{\rm m}$). This dual-task setting can lead to a trajectory prioritization problem.
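To make the link between the two tasks tangible, the following sketch computes the two deviations for a simple planar case. It rests on assumptions that are not part of this thesis: both references are straight polylines, each deviation is the shortest Euclidean distance from the reference point to the polyline, and the manipulator reference point is the vehicle reference point plus a fixed arm offset. All names and numbers are hypothetical.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest Euclidean distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_path(p, path):
    """Shortest distance from point p to a polyline given as a list of points."""
    return min(distance_to_segment(p, a, b) for a, b in zip(path, path[1:]))


def dual_task_deviations(p_v, arm_offset, gamma_v, gamma_m):
    """Deviations (d_v, d_m) for a vehicle point p_v and an arm held at a fixed offset."""
    p_m = (p_v[0] + arm_offset[0], p_v[1] + arm_offset[1])
    return distance_to_path(p_v, gamma_v), distance_to_path(p_m, gamma_m)


gamma_v = [(0.0, 0.0), (10.0, 0.0)]  # assumed vehicle reference (lane center)
gamma_m = [(0.0, 3.0), (10.0, 3.0)]  # assumed manipulator reference (roadside)
arm_offset = (0.0, 2.7)              # arm posture kept fixed in this example

d_v_before, d_m_before = dual_task_deviations((2.0, 0.5), arm_offset, gamma_v, gamma_m)
d_v_after, d_m_after = dual_task_deviations((2.0, 0.0), arm_offset, gamma_v, gamma_m)
print(f"{d_v_before:.2f} {d_m_before:.2f}")  # prints "0.50 0.20"
print(f"{d_v_after:.2f} {d_m_after:.2f}")    # prints "0.00 0.30"
```

Moving the vehicle onto its own reference removes the vehicle deviation but, with the arm posture unchanged, increases the manipulator deviation. In this simplified setting, that is what it means for the two tasks to influence each other.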

In order to find a suitable definition for the dual-task problem, other research disciplines can also be considered. For instance, the challenges of dual tasks are also a subject of psychological research. There are works focusing on the mental state of the human during a dual task. One of the main focuses is the analysis of different modalities (visual, auditory, manual) and their compatibility effects, see [WKK20]. Other works deal with the predictability of dual-task performance [BEd<sup>+</sup>21], the investigation of brain activity [KLPK07], cross effects of the dual task [JPHK14], or "*cognitive demands in learning situations*" in dual-task settings [WHEB<sup>+</sup>18]. However, the variety of definitions and combinations of the different tasks hinders a general treatment of these research works: "[...], *the current state of research indicates a clear lack of standardization of dual-task paradigms over study settings and task procedures*" [EB21]. Thus, the literature provides no formal definition of the dual task in either robotics or psychological research. Therefore, a definition for the scope of this thesis is given in the following.

#### **Definition 2.2 (Dual Task)**

*If a control system has to fulfill two linked tasks with two separate goals on a vehicle manipulator, these two tasks are called a dual task. Both of the linked tasks have adjustable priorities, and neither of them is considered a disturbance to the other.*

#### **Note 1:**

A dual task can consist of two low-level tasks (e.g. following two different trajectories [MAD16]) or a high-level and a low-level task (e. g. following one trajectory and analyzing the recorded camera pictures [SLO<sup>+</sup> 22]). The two trajectories of a low-level dual task are called **dual trajectories**, see blue reference trajectories in Figure 2.2.

#### **Note 2:**

Note that Definition 2.2 also characterizes the dual tasks which are investigated in psychological studies or robotic control problems. Furthermore, from the control engineering perspective, a control system can be a human as well as an automation system, see e. g. [Fla16].

After the presentation of the application domains and the definitions of vehicle manipulator redundancy and dual task, the following paragraphs of this section focus on the control concepts of vehicle manipulators to reveal the need for new control methods that include a human operator.

To solve the control challenges of dual tasks, different approaches can be found in literature. In general, two main concepts can be distinguished: *unified* and *modular* controllers, see [TSA<sup>+</sup> 22]. A unified controller is one overall control system, which governs both the vehicle and the manipulator. When the two subsystems each have a controller, it is a modular control structure. These two separate controllers can still implement a coordinated movement of subsystems if there is some communication/cooperation between them. Therefore, the main challenge is managing the vehicle manipulator redundancy for a dual task, see Definition 2.1 and Definition 2.2.

Practical concepts from literature (e. g. [HTS00, OMA<sup>+</sup> 18]) have also demonstrated that solving a dual task requires the coordination of the vehicle and the manipulator. The vehicle can help the manipulator to carry out the task more efficiently, e. g. door opening [CCL10, MGF<sup>+</sup> 21] or the teleoperation of an underwater vehicle with a visual primary task [KSWK20, SLO<sup>+</sup> 22].

In contrast, theoretical models and control methods of vehicle manipulators do not address the dual task problem explicitly. Examples of such works are flatness-based controllers [TK01, TMK<sup>+</sup> 08], singularity-free modeling methods [FDP<sup>+</sup> 10], the motion optimization of the vehicle base [PF14, RXY<sup>+</sup> 17], the coordinate-free geometric approach [ZZZ14] and coordinated optimal path planning algorithms [RXY<sup>+</sup> 17, TFSG18, LHCY19, PLM<sup>+</sup> 22]. Moreover, such coordination between the vehicle and the manipulator must provide a redundancy resolution, for which solutions are discussed in e. g. [WBK07, Anc17].

The applications, concepts, and control methods presented above are useful for analyzing the challenges of coordinating and controlling a vehicle manipulator. However, none of the methods and approaches from literature defines the coordinated prioritization of the dual task for such systems. Therefore, such a notion is elaborated in Definition 2.3.

# **Definition 2.3 (Task Prioritization Problem of Vehicle Manipulators)**

*Let a vehicle manipulator with a dual task according to Definition 2.2 be considered. If there is a configuration or a situation in which the overall system cannot (optimally) carry out the dual task, the vehicle manipulator has a task prioritization problem. If the dual task contains purely tracking tasks (e. g. tracking two trajectories) that cannot be followed simultaneously due to kinematic or dynamic constraints of the vehicle manipulator, the arising problem is called a trajectory prioritization problem.*

# **Note:**

A task prioritization problem is not present if both subsystems can carry out the dual task optimally.

An illustrative example of this task prioritization problem is when the vehicle manipulator cannot track both references simultaneously, see blue trajectories in Figure 2.2. In some works from literature, the use of coordination controllers was proposed, which could address and handle the trajectory prioritization problem, see [CVH98, MAD14, TLX<sup>+</sup> 17]. However, no general solution was provided. Moreover, the prioritization in these works is two-valued: Either the vehicle or the manipulator follows its reference, but a continuous transition of the leading role is not possible.
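The difference between a two-valued and a continuous prioritization can be made concrete with a deliberately simplified sketch. It assumes a one-dimensional setting that is not taken from the cited works: the vehicle position and the manipulator reach are scalars, the reach is bounded by a hypothetical `reach_max`, and a weight `alpha` in [0, 1] trades the squared deviations d<sub>v</sub> and d<sub>m</sub> against each other. The names `resolve_dual_task` and `DualTaskResult` are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class DualTaskResult:
    x_vehicle: float  # commanded vehicle position
    reach: float      # commanded manipulator extension relative to the vehicle
    d_v: float        # deviation of the vehicle from its reference
    d_m: float        # deviation of the manipulator tip from its reference


def resolve_dual_task(x_v_ref: float, x_m_ref: float,
                      reach_max: float, alpha: float) -> DualTaskResult:
    """Minimize alpha*d_v**2 + (1 - alpha)*d_m**2 subject to |reach| <= reach_max.

    alpha = 1 gives the vehicle full priority, alpha = 0 the manipulator;
    values in between yield a continuous transition (illustrative assumption).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    gap = x_m_ref - x_v_ref
    if abs(gap) <= reach_max:
        # Both references can be met at once: no prioritization problem.
        x_v, reach = x_v_ref, gap
    else:
        # The reach saturates; the vehicle position is the weighted compromise.
        reach = reach_max if gap > 0 else -reach_max
        x_v = alpha * x_v_ref + (1.0 - alpha) * (x_m_ref - reach)
    return DualTaskResult(x_v, reach,
                          abs(x_v - x_v_ref),
                          abs(x_v + reach - x_m_ref))


if __name__ == "__main__":
    # The references are 3 apart, but the manipulator can only reach 2.
    for alpha in (1.0, 0.5, 0.0):
        print(alpha, resolve_dual_task(0.0, 3.0, 2.0, alpha))
```

In this toy setting, `alpha = 1` and `alpha = 0` reproduce the two-valued behavior described above, whereas intermediate values distribute the unavoidable deviation between the two tasks.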

While underwater and aerial vehicle manipulators are usually teleoperated or partly human-controlled (e. g. [KSWK20, SSK<sup>+</sup> 22]), ground vehicle manipulators are usually fully automated. There is no concept in literature that handles the dual task of ground vehicle manipulators in a continuous human-machine interaction. Furthermore, no concept addresses the problem of limited information, that is, the design of the automation when the reference trajectories of the human operator are not measurable or observable.

The practical motivation for sharing the dual task of large vehicle manipulators is the fact that a human operator in at least a supervisory role is crucial for the applications of large vehicle manipulators due to legal reasons [Bun21]. Therefore, it is also advantageous to keep the human operator in the control loop of a large vehicle manipulator resulting in a continuous human-machine interaction. Since the manipulator's environment is unstructured, its reference trajectory is difficult for the automation to measure and therefore the automation has limited information.

This thesis proposes a shared control concept, which can handle the challenges of limited information as well as the dual task of large vehicle manipulators with a human operator.

# **2.2 Modeling and Control of Hydraulic Manipulators**

The following section introduces modeling and control approaches for large, hydraulically actuated working machines. The models and control concepts without a human operator are provided first, followed by a discussion of semi-automated and assisted control concepts of large, hydraulically actuated manipulators, since vehicle manipulators for road maintenance require the presence of a human operator, cf. Chapter 1. This section discusses the models of hydraulic systems in detail because they are used for the development of the simulator applied for the analysis of the proposed concepts in this thesis.

Large, hydraulically actuated manipulators can be found in forestry applications [MWLH<sup>+</sup> 11, KBV14, FVF15, OMS21b], construction machines [BK16, Chapter 56], [YVF17] or agricultural usage [OMS21a, ASTZ21]. Regarding the automation of such mobile hydraulic systems, there are two main requirements to consider (see e. g. [Ste11, HA12, MKCS17, XC18, MG21]):

- good dynamics
- energy efficiency


To address these two requirements, appropriate models are important to enable model-based control synthesis and further analysis. In recent years, considerable effort has been made to develop models of hydraulic actuators and to use them for model-based controller design, see [Rud17, Rud18, PR18] or [VGJ19, Chapter 2]. Depending on the model accuracy, these are used for optimizing the fuel consumption [GLWL20] or for robust controller design [Rud18]. In this thesis, the two models of *Ruderman* are applied for the simulation and control model derivation due to the degrees of detail and the compact formulation provided by these models [Rud17, Rud18].

Table 2.1 summarizes the works related to this thesis. One of the most important challenges in the automation of hydraulic manipulators is the parameter identification of complex models [RAA<sup>+</sup> 17, SOC21]. These systems are also characterized by significant latency and dead time, which depend strongly on the temperature of the working oil and on other factors, such as operation load, wear, or the manipulator's configuration, see [AT20]. This makes full automation more difficult. The implementation of model-based controllers is also challenging due to the nonlinearities of the overall system and the flow limitation of the working oil [LNM20, SS20]. For the teleoperation of these systems, the sensation of the load on the hydraulic working tool is essential. Such a load sensation implies further challenges, see [WWXS22]. One example of large hydraulic manipulators with a mobile base can be found in [KBV17]. The test environment of the vehicle manipulator was a large, flat area with no obstacles. Furthermore, the pose was estimated with a high-precision Real-time Kinematic Positioning Global Positioning System [Jef10, Chapter 4], which can provide centimeter-level accuracy. However, such working environments and measurement systems are rare in practical scenarios. Thus, the full automation of large vehicle manipulators is not entirely solved using state-of-the-art methods.

Additional crucial drawbacks of fully automated hydraulic manipulators are that they require a wide range of sensors to measure all necessary system states<sup>5</sup> and a precise perception of the environment. Due to the dust and dirt in their typical working environments, high-fidelity sensors are not robust and reliable enough for the required accuracy. Therefore, in [LHMWS09, MWLH<sup>+</sup> 11, KBV14], open-loop experiments were conducted, meaning that the references were predefined. Furthermore, in [MKCS17], it is pointed out that the "[...] *control of hydraulic actuators is still a challenge and it has not yet reached a commercial off-the-shelf level of maturity*". Consequently, full automation of large hydraulic manipulators is not forthcoming in the near future and requires further research [XC18]. Therefore, such full automation is not considered in this thesis. If a human operator is required to be present on a large vehicle manipulator due to safety and legal issues, cf. Chapter 1, it is logical that the human operator undertakes the control task of the manipulator, which results in a specific setup.

In the case of manual control of a large hydraulic manipulator, the operator uses one or two joysticks to control the velocity of the single joints. Therefore, this control mode is referred to as single rate control (SRC), [Fra04, Chapter 5], [YM10, HS10]. SRC is the state of the art and can be very complex for manipulators with multiple joints. Therefore, long training for novice operators is inevitable, and even experienced operators can tire quickly. To support the human operator, there are concepts of *advanced manual modes* in literature, see [HS10, FVF16]. As a result, the human operators have a shorter training time and reduced mental fatigue, which helps to lower the working stress [TPF21]. Therefore, these concepts are often more practicable and promising, because the human operator remains part of the control loop. Through the application of such supported manual control modes, the two main requirements (good dynamics and energy efficiency) can also be reached, see e. g. [Ene10].

<sup>5</sup> Such system states are the flow and pressure of the working oil or the position and velocity of the manipulator. Furthermore, the hydraulic system is a closed one, which makes it hard to install sensors.

Advanced manual modes of large manipulators can be divided into two categories:

- operation mode modifications (CRC or CPC)
- semi-automation concepts with haptic feedback


**Table 2.1:** Overview of the full- and semi-automated large hydraulic manipulators with focus on the operation modes and application area. Abbreviations: CRC - coordinated rate control, CPC - coordinated position control, HF - haptic feedback

Using operation mode modifications (CRC or CPC), the operability of the large manipulator is improved, see e. g. [YM10, OSBE11]. With coordinated rate control (CRC), the joystick controls the EE's velocity directly instead of controlling the single joints, leading to more intuitive control. CRC does not require expensive sensory systems for the position estimation of the joints and of the EE, as it still has an open-loop structure. For example, in [YM10], the velocities of the EE are given in Cartesian coordinates, enabling independent horizontal and vertical motions. The human operators could work with CRC more efficiently compared to SRC. Other works from literature suggest the coordinated position control (CPC) mode, see e. g. [WEB15, MRC19]. In this mode, the human operator controls the position of the EE directly with the joystick. CPC can facilitate more precise manipulation. Various manipulator motions can be realized due to the complex haptic industrial interfaces. In [WTES87], CRC and CPC were compared in different scenarios and with different machines. It has been shown that CPC is more accurate, generating smaller tracking errors than CRC, except for large robotic arms with slow dynamics. These results were confirmed in a comparative study [MRC19]. Most large hydraulic manipulators belong to this exceptional class, which makes CRC more beneficial compared to CPC, see e. g. [EB11, XC18].
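The difference between SRC and CRC can be sketched for a strongly simplified case. The sketch assumes a planar arm with two revolute joints and unit link lengths, which is not the kinematics of any manipulator in the cited works, and it treats the joystick deflection directly as a velocity command. Evaluating the Jacobian requires the joint angles, so the sketch additionally assumes that they are available; the cited works may realize CRC differently. All function names are hypothetical.

```python
import math


def jacobian(q1, q2, l1=1.0, l2=1.0):
    """Jacobian of the end-effector position of a planar two-link arm."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return ((-l1 * s1 - l2 * s12, -l2 * s12),
            (l1 * c1 + l2 * c12, l2 * c12))


def src_command(joystick):
    """SRC: each joystick axis sets the rate of one joint directly."""
    return joystick


def crc_command(joystick, q1, q2, l1=1.0, l2=1.0):
    """CRC: the joystick sets the Cartesian EE velocity (vx, vy).

    The joint rates follow from the inverse of the 2x2 Jacobian.
    """
    (a, b), (c, d) = jacobian(q1, q2, l1, l2)
    det = a * d - b * c  # equals l1 * l2 * sin(q2) for this arm
    if abs(det) < 1e-9:
        raise ValueError("arm is at or near a singular configuration")
    vx, vy = joystick
    return ((d * vx - b * vy) / det, (-c * vx + a * vy) / det)


if __name__ == "__main__":
    # A purely horizontal EE motion needs coordinated rates of both joints.
    print(crc_command((0.1, 0.0), q1=0.3, q2=0.8))
```

With SRC, the operator has to perform this coordination mentally, whereas with CRC a single joystick axis produces a straight horizontal or vertical EE motion.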

Semi-automation concepts with haptic feedback in literature provide support based on the optimal reference trajectory of the manipulator [YM10, HS10]. This includes the computation of the geometrical distances between the optimal trajectory and the actual pose of the manipulator, from which the haptic feedback is calculated. The optimal trajectories were computed beforehand or reconstructed from measurement data, see "trajectory-based" and "data-driven" in Table 2.1. No models of the human or the hydraulic manipulator were taken into account for these optimizations, see [YM10, HS10].
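A minimal sketch of such a distance-based support is given below. It assumes that the optimal trajectory is stored as a planar polyline with at least two points and that the feedback is a virtual spring pulling towards the closest point of the trajectory. The spring law and the `stiffness` parameter are illustrative assumptions and are not taken from [YM10, HS10].

```python
import math


def closest_point_on_segment(p, a, b):
    """Return the point on the segment a-b that is closest to p."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return a  # degenerate segment
    # Projection parameter, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def guidance_force(p, trajectory, stiffness=1.0):
    """Distance to a polyline trajectory and a spring-like force towards it."""
    candidates = (closest_point_on_segment(p, a, b)
                  for a, b in zip(trajectory, trajectory[1:]))
    best = min(candidates, key=lambda c: math.dist(p, c))
    distance = math.dist(p, best)
    force = (stiffness * (best[0] - p[0]), stiffness * (best[1] - p[1]))
    return distance, force


if __name__ == "__main__":
    reference = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
    print(guidance_force((1.0, 1.0), reference, stiffness=2.0))
```

The sketch is purely geometric, which mirrors the observation above that neither a model of the human nor of the hydraulic manipulator enters these concepts.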

Inspired by the state of the art of large hydraulic manipulators, this thesis also uses an open-loop structure, in which the human controls the manipulator with improved operability by the CRC of the EE. Furthermore, no haptic feedback is provided for the human operator by the shared control proposed in this thesis. From Table 2.1, it can be seen that operation mode modifications (CRC, CPC) and semi-automated concepts are widely utilized in construction applications. Their design approaches do not include models of either the human or the hydraulic manipulator. On the other hand, the approaches to the full automation of hydraulic manipulators include a model, and these concepts are mostly used in forestry applications. Thus, there is no semi-automated or shared control concept that uses a model of the human or the system and takes advantage of the motion of the vehicle base. This research gap is addressed in this thesis.

# **2.3 Concepts of Shared Control for Human-Machine Interactions**

This section presents the fundamental concepts of modeling and controller synthesis for *shared control* in literature. A definition of *shared control* is given, which is necessary because of the differing terminologies in literature.

# **2.3.1 General Concepts of Shared Control**

There is much ongoing research to describe, model, and understand interactions between automation and human in control tasks. General concepts provide hierarchical frameworks to characterize human-automation cooperation or interaction. They usually have a structure with *decision layers* for the different task levels<sup>6</sup> , see e. g. [Fla16, PF16, FAI<sup>+</sup> 16, ACM<sup>+</sup> 18, RWIH20]. They focus on applications where one human and one automation interact with each other.

The state of the art of ground vehicle manipulators and large hydraulic manipulators showed that these two application domains usually have low-level control tasks (trajectory tracking, point-to-point motions), see Section 2.1 and Section 2.2. Therefore, the following presents the shared control methods focusing on the execution/operation layer of the frameworks mentioned above. This operational layer with a coupled control structure usually corresponds to shared control in the state of the art. However, in [ACM<sup>+</sup> 18], the following is stated: "There is no single definition for shared control that is used across application domains". Several works from literature use the term shared control in different contexts. Therefore, in the following, further definitions of the term shared control for low-level control tasks, taken from the literature of the last decades and from different domains, are discussed, and the definition applied in this thesis is provided.

The core idea of shared control was initially used for haptic support in teleoperation (see e. g. [Ros93, MS94, YAF<sup>+</sup> 99, WLA<sup>+</sup> 02]), and its use as haptic assistance in executing dynamic tasks became widespread later, see e. g. [Tah01, GG05]. In [SG01], it was shown for the first time that haptic interaction between human and automation should be characterized by a *bi-directional information transfer*, meaning that haptic feedback can be used for the communication between human and automation. One of the very first overviews was given in [Tah01], in which the previous applications of shared control are summarized and compared. The common features are also identified, from which the general requirements for the shared control design are derived in subsequent works. Another literature review was provided in [OGGL06], which categorized haptic support into three groups: a) passive assistance for performance enhancement, b) passive assistance for training: record/replay, and c) shared control: performance enhancement and training. Additionally, the authors compared passive assistance for performance enhancement with shared control in two studies. In [EB10a], a framework with six different classes of shared control was presented: 1) Traded Control, 2) Indirect Shared Control Through Cues, 3) Coordinated Control, 4) Collaborative Control, 5) Virtual Constraint and 6) Continuous or Blended Shared Control. The proposed concept of blended shared control had a continuous effect on the system, which was provided via a human interface device, see [EB10a]. In [PO12], a novel taxonomy for implementing shared control was suggested. The authors used their shared control terminology in the context of training novice operators and not with the focus on the "out-of-loop" problem, which is one of the key aspects of automated systems, see [WC80].

<sup>6</sup> In [FAI<sup>+</sup> 19], these layers are the Strategic, Tactical, and Operational Layers. On the other hand, [RWIH20] provides a slightly different arrangement of these layers: Decomposition, Decision, Trajectory and Action Layers. Both models provide the possibility to investigate human-automation or human-human interactions on higher decision layers. Because the number of layers is determined based on technical considerations rather than on the observation of human behavior, a proper choice of the layer number is always application-specific.

Further robotic applications also had different definitions. For instance, in [LY02], shared control was defined as the allocation of the global and local tasks between human and automation. This allocation is justified by the fact that humans can solve global tasks more efficiently while the automation is superior in local tasks. Other works define shared cooperative control as a situation in which human and machine can carry out the dedicated task(s) together and alone, see [Fla16, Lud21]. In [NMG05], on the other hand, shared control is specified as a control structure in which one of the partners (either human or automation) is not able to carry out the task alone properly. Their application focus was the shared control of minimally invasive telesurgical training. In [LZD<sup>+</sup> 17], the term shared control was used for an additional reliability element of wheelchair control systems, interacting with the human only in unsafe situations. This definition was used similarly in other works [TKM18, DPCB19]. In [Wad19, SSK<sup>+</sup> 22], shared control was defined as "[...] *one of the cooperative control schemes, in which humans and machines interact congruently in a perception-action cycle to jointly perform a dynamic task*".

Application-oriented examples from literature are the teleoperated shared control of unmanned underwater vehicles, see [KSWK20, SLO<sup>+</sup> 22]. These applications have similar characteristics to the one presented in Chapter 1: The automation supports the human by carrying out one of the two tasks (control of the underwater vehicle) while the human can better concentrate on the other task (inspection with the camera). Thus, the supporting automation does not fully replace the human operator, and the visual inspection of the objects remains the task of the operator only. However, these applications are different: The vehicle is a submarine and does not have non-holonomic motion constraints. The manipulators of such underwater vehicle manipulators have different structures compared to large vehicle manipulators. Moreover, teleoperated underwater systems have to deal with high pressure and communication delay. Therefore, their concepts cannot be applied to the challenges of large vehicle manipulators.

From this brief overview, it can be seen that the definitions of *shared control* cannot handle the specific problem of large vehicle manipulators from either a theoretical or a practical point of view. Furthermore, the large number of definitions hinders the comparability of the applications and the discussion of researchers. Therefore, recent works attempt to provide a standardized and more generalized framework and coherent definitions for the concepts of shared control, see [ACM<sup>+</sup> 18, FAI<sup>+</sup> 19, RWIH20]. These frameworks enable the analysis of the relationships between the various interpretations, such as indirect/direct cooperative, collaborative, or traded shared control.

Due to the generality of the *shared control* definition from [ACM<sup>+</sup> 18], this thesis adapts it to the problem of vehicle manipulators with a human operator as follows.

### **Definition 2.4 (Full Information Shared Control)**

*Consider at least one human and one automation continuously interacting in a "perception-action cycle". Their goal(s) is (are) to accomplish (an) operational task(s) in a dynamic environment. Under perfect conditions, either the human or the robot can carry out this task alone, because both have full information about the system states. Such a continuously interacting human-automation configuration is called Full Information Shared Control.*

### **Note:**

Definition 2.4 is an extension of the definition of [ACM<sup>+</sup> 18]. It emphasizes that the works from literature assume that the system states and reference trajectories are fully measurable for the automation, which is generally not the case for large vehicle manipulators.

Furthermore, both fully automated and purely manually controlled systems are excluded by Definition 2.4, because at least one human and one automation have to be present to enable an *interacting cycle*. Negotiation or decision-making models of e. g. [PF16, ACM<sup>+</sup> 18, FAI<sup>+</sup> 19, RWIH20] are omitted from Definition 2.4 because a *continuous human-automation interaction* is not given in the case of these models. Meanwhile, Definition 2.4 suits some earlier definitions from the state of the art well (e. g. traded, indirect, coordinated, collaborative, blended, cooperative), see [EB10a].

Figure 2.3 presents the shared control closed-loop system structure, where one human and one automation act on one system based on the actual system states and the input of the other.

In this thesis, the terms "partners" and "players" are used interchangeably for both the human and the machine in the *continuous shared control interaction*<sup>7</sup>. For the sake of simplicity, exactly one human and one machine player are taken into consideration for the scope of this thesis. Note that this assumption poses no restriction regarding the generalizability of the concepts. As suggested in [Fla16], an aggregation of the N human and M automation players into two players is possible, and this does not influence the controller design. It is only necessary that at least one human and at least one automation interact with each other.

**Figure 2.3:** The structure of a general shared control setup, where the human and the automation control the system together over one (or more) control interface(s). Inspired by [Fla16].

<sup>7</sup> In the context of optimization and game theory, the term "player" is often used. In the shared control technical terminology, the term interacting "partners" is widespread. Since this thesis applies game-theoretic methods for shared control applications, both terms are used interchangeably.

From Definition 2.4 and the structure of shared control, see Figure 2.3, the shortcomings of the existing methods can be emphasized: The shared control of a large vehicle manipulator with a human operator needs an adaptation of the existing methods for the following reasons.

The first reason is that a *dual task* setup was not investigated in literature. The effects of a secondary task were investigated in the context of highly automated vehicles [GG05, BM15, LWW<sup>+</sup> 21]. However, in these studies, the secondary task was considered a disturbance or a distraction and not a task with equivalent priority. Therefore, the design and usage of shared control for dual tasks with trajectory prioritization is still an open question. The second reason is the usual assumption that both partners can control both subsystems. This assumption does not hold for many practical applications:


This thesis extends Definition 2.4 for dual tasks with limited information and provides a systematic solution for the design of the so-called *limited information shared control* problems.

### **2.3.2 Shared Control Design**

This section provides an overview of general shared control design methods. First, the *model-based* methods from literature are presented, followed by the *data-driven* approaches. Table 2.2 summarizes the design methods from the state of the art.

Model-based shared controller designs are approaches in which an explicit, white-box model of the human partner is used for the calculation of the controller parameters. Through the design, the behavior of the human is taken into account inherently. Modeling human behavior in shared control applications has a long history: The very first works date back to the 1990s with the application of an assistive wheelchair navigation system with adaptation, see [BLK<sup>+</sup> 94, SLB<sup>+</sup> 98]. Even more applications were developed in the field of human driver models in the 2000s and early 2010s: In [CPO06], a linear-quadratic optimal control model for the human driver was proposed. In [PC08], a model predictive controller is used to characterize the human's behavior. A two-level driver model was presented in [PE07] and an adaptation of the preview time of the human's behavior was suggested in [UP05]. In [BAH<sup>+</sup> 13, SRGFS19], shared control concepts with haptic feedback were developed to support the human operator in carrying out operations in a remote environment. They showed that active haptic feedback can improve task performance and accuracy. The focus of these works was the generation of haptic feedback. Due to the dual trajectories, the generation of haptic feedback for the human operator of the large vehicle manipulator is not the research focus of this thesis. Thus, these shared control design methods do not suit the problem presented in Chapter 1.

Based on these models, the theory of non-cooperative differential games was used for the synthesis of shared control applications, see e. g. [TAT11, NC13, FOSH14a, LSC<sup>+</sup> 19, JYN<sup>+</sup> 19]. Further examples with applications to semi-automated driving can be found in [NSP17, FFH17, SNBP19], where the authors used a mathematical model of the driver to design an assistive shared control system.
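To give an impression of what such a game-theoretic synthesis involves, the following sketch computes a feedback Nash equilibrium for the simplest possible case. It assumes a scalar linear system dx/dt = a·x + b₁·u₁ + b₂·u₂ and two players (e. g. human and automation) who each minimize their own quadratic cost ½∫(qᵢ·x² + rᵢ·uᵢ²)dt. The resulting coupled Riccati equations are solved by a naive best-response iteration, whose convergence is not guaranteed in general. The setup, the iteration scheme and all names are illustrative assumptions and do not reproduce the method of any cited work.

```python
import math


def scalar_feedback_nash(a, b, q, r, iterations=200):
    """Best-response iteration for a scalar two-player LQ differential game.

    Requires b[i] != 0, q[i] >= 0 and r[i] > 0. Returns the Riccati
    solutions p and the feedback gains k, with u_i = -k[i] * x.
    """
    s = [b[i] ** 2 / r[i] for i in range(2)]
    p = [0.0, 0.0]
    for _ in range(iterations):
        for i in range(2):
            j = 1 - i
            # Dynamics seen by player i when player j keeps its current gain.
            a_i = a - s[j] * p[j]
            # Stabilizing root of the scalar Riccati equation of player i.
            p[i] = (a_i + math.sqrt(a_i ** 2 + s[i] * q[i])) / s[i]
    gains = [b[i] * p[i] / r[i] for i in range(2)]
    return p, gains


def coupled_riccati_residual(a, b, q, r, p):
    """Residuals of the coupled scalar Riccati equations; zero at equilibrium."""
    s = [b[i] ** 2 / r[i] for i in range(2)]
    return [2 * a * p[i] + q[i] - s[i] * p[i] ** 2 - 2 * s[1 - i] * p[i] * p[1 - i]
            for i in range(2)]


if __name__ == "__main__":
    p, k = scalar_feedback_nash(a=-1.0, b=(1.0, 0.5), q=(1.0, 2.0), r=(1.0, 1.0))
    print("gains:", k)
    print("residuals:", coupled_riccati_residual(-1.0, (1.0, 0.5), (1.0, 2.0), (1.0, 1.0), p))
```

The coupling term in the residual shows why the two controllers cannot be designed independently: the optimal gain of each player depends on the gain of the other.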

Due to the complexity of human behavior, the use of complex models is widespread, which impairs manageable parameterization and the general usability of model-based methods. To handle the complexity of human behavior, learning the shared controllers directly from measurements and interaction with the human partner is a promising solution. Machine learning methods have been on the rise due to their good applicability and generality for different applications, like the design of shared control. These methods are called data-driven shared control and use general models (e. g. neural networks) or simplified system dynamics<sup>8</sup>, see e. g. [EB10b, DP14, BAMA20, YZC22], to reproduce the behavior that is learned from the human. That way, the physical relationships do not have to be determined by modeling the interaction or the system dynamics. The literature includes promising examples in the robotic domain. In [DS13], one of the first studies was conducted showing the benefit of the adaptation and learning of the automation in a human-machine interaction. Adapting the automation can also be used for dynamically adjusting human authority, see e. g. [GJA17, BAMA20]. This was followed by further works, where the system models [BMA17] or the other partner [KTFH20] are determined using learning methods. These *data-driven shared control* concepts and approaches could be useful in general. However, they are often not suitable for the specific application of vehicles in regular traffic due to issues with testing and admission, see [Bac18, Chapter 7] or [Kön22, Chapter 5]. Therefore, the *shared control* design of this thesis is based on *model-based* approaches.

**Table 2.2:** Overview of the existing shared control design concepts with focus on the number of the input devices and availability of the system information.

<sup>8</sup> Simplified dynamics mean that the choice of the system states is inspired by the physics, while the dynamics are learned from the measurement data directly, see e. g. [BAMA20]. The reason for this procedure is that system dynamics are often complex or not known for the controller design.

In order to illustrate the shortcomings of the general state-of-the-art shared control methods, Table 2.2 is taken into account: In its first line, the characteristics of the large vehicle manipulator are given. There are usually two input devices: A joystick and a steering wheel. Furthermore, there are two reference trajectories and the automation of the vehicle has only *limited information* about the reference trajectory of the manipulator.

As can be seen from Table 2.2, general frameworks are not suitable for the limited information problem of large vehicle manipulators, since the automation needs full information on the system state, which is not given in this case. Consequently, such a specific setup of large vehicle manipulators was not handled in literature: Neither theoretical frameworks nor practical applications exist that would be suitable for their shared control. Therefore, novel methods are necessary for the application of large vehicle manipulators, which are addressed in this thesis: The *shared control* of systems with multiple input devices and multiple references, in which the automation has limited information only.

# **2.4 Research Gaps and Contributions of the Thesis**

From the presentation of the state of the art, it is ascertained that large vehicle manipulators represent a particular system class due to the shared control and the dual task setting, which is, however, not addressed in literature: There exist neither practical approaches nor theoretical methods for this system class. Therefore, the extension and application of shared control methods to problems with a dual task are necessary. Furthermore, large vehicle manipulators with moving bases have not yet been treated in depth in literature. Consequently, a novel shared control model is essential for the dual task problem in the context of large vehicle manipulators, enabling a model-based controller design.

An additional research gap is the lack of systematic handling of limited information in shared control applications. In the state of the art, it is assumed that the system states and the reference trajectories are either fully measurable or observable. However, the special setup of the large vehicle manipulators reveals a real-world application, which violates these assumptions. No shared control synthesis exists which can address the problem of states and references being unavailable in the context of shared control.

Finally, to enable the testing and comparison of the proposed concepts with state-of-the-art methods, a test bench is required on which the elaborated assistance system can be tested and its usability can be demonstrated. Such a simulator with an open architecture does not exist, neither as an open-source project nor as a commercial product. These gaps in the state of the art lead to the following three research questions and the respective contributions of this thesis.

#### **Research Question 1**

*How can a human-machine shared control for dual task problems be modeled, in which the automation has limited information about the human's goals and references?*

#### **Contribution 1**

First, this thesis introduces the concept of *limited information shared control*, which provides a novel model of a continuous human-machine interaction by introducing the so-called *cooperation state*. Using the cooperation state leads to a system representation that can characterize the shared control setup between the human operator and the automation of the vehicle without information about the human-controlled subsystem. The proposed modeling approach differs from the existing methods in the state of the art by the assumption that a subset of the system states is not available to the automation. The core idea of the limited information shared controller is that the human actions are used to construct the cooperation state. Thereby, the automation design by means of this novel model leads to a *feedback controller* that can facilitate the handling of large vehicle manipulator systems in which the human operator controls the manipulator and the automation drives the vehicle. Furthermore, due to the general formulation, the concept of limited information shared control is transferable to further applications with a similar information structure.

#### **Research Question 2**

*How should the control synthesis of the limited information shared control be formulated to facilitate a constructive, model-based design method?*

#### **Contribution 2**

The second contribution of this thesis is the development of a systematic automation design for the proposed limited information shared control providing the parameters of the cooperation state and the feedback controller. The design includes four steps: The first step is the design of the corresponding *full information shared control*, cf. Figure 2.4. The second step is the application of *potential games* to model the human-automation shared control setup, which provides a more compact, substituting model of the shared control setup than full information shared control. Since none of the current subclasses of potential games are suitable to model human-machine interaction generally, two novel subclasses, *near potential differential games* and *ordinal potential differential games*, are introduced. Methods to compute these two subclasses are developed to find a potential game for a given differential game. The third step is the calculation of the parameters of the cooperation state based on the optimality condition. In the fourth step, the controller parameters are determined through optimization using behavior matching. The use of the proposed automation design procedure enables the generalization of limited information shared control.

#### **Research Question 3**

*Are there any practical benefits of the limited information shared controller and its design procedure with respect to Research Question 2 with the developed model of Research Question 1 in a real application of large vehicle manipulators compared to the current technical solutions?*

#### **Contribution 3**

In the course of the research for this thesis, a simulator<sup>9</sup> was developed for the evaluation of the proposed control concepts. Three experiments were conducted involving human test subjects, which provided the first promising indications that the limited information shared control concept supports the human operator and reduces their mental load in real-world applications.

The first experiment was motivated by the question of whether the limited information shared controller can support the lateral motion of the vehicle manipulator as effectively as the full information shared controller. In addition, the limited information shared controller is compared to a non-cooperative lateral controller, which is considered a possible solution from the current state of the art.

The second experiment was conducted to compare the proposed limited information shared controller with the state-of-the-art fully manual control of the large vehicle manipulators for the lateral motion. An important difference in this experiment was that no explicit trajectory is given to the test subjects in advance, so they had to determine the ideal trajectories online during the task execution.

On the other hand, in the third experiment, the concept was applied to the longitudinal guidance of large vehicle manipulators, which represents an agricultural application. In the case of a longitudinal limited information shared controller, the human operator controls the manipulator mainly parallel to the driving direction, while the automation controls the velocity of the vehicle. The proposed limited information shared controller was compared with a longitudinal non-cooperative controller of the vehicle. The experimental results also imply the transferability of limited information shared control to other robotic applications.

<sup>9</sup> Note that the expressions "test bench" and "simulator" are used interchangeably.

The contributions of the upcoming chapters and their relations to each other are depicted in Figure 2.4. Chapter 3 introduces the concept of limited information shared control. In Chapter 4, the steps of the systematic controller design using the two novel subclasses of potential differential games are introduced in detail. Chapter 5 presents the application of the limited information shared control and full information shared control to a large vehicle manipulator in simulations. Furthermore, the qualitative analysis of the developed simulation models and the hardware and software components of the simulator are presented. In Chapter 6, the three experiments with human test subjects are introduced and their results are discussed.

**Figure 2.4:** The contributions of the thesis are given, with the corresponding chapters providing an overview of the basic ideas and their relations.

# **3 Limited Information Shared Control**

The development of *limited information shared control* (LISC) is based on the cooperative shared control design of [FOSH14a], which is introduced in the first part of this chapter. Non-cooperative differential games have been used to model shared controls in numerous applications [TAT11, NC13, FOSH14b, LSC<sup>+</sup> 19, JYN<sup>+</sup> 19] and studies have shown the validity of this modeling approach [BOW09, LSB21, NC22]. Therefore, this thesis uses the same notion for modeling shared control.

Firstly, the preliminaries on game theory are presented. Then, the shared control concept of [FOSH14a] is introduced, which establishes the notation and the definitions for the later derivations. This shared controller is referred to as *full information shared control* (FISC), since its use requires full state information of the controlled system, cf. Section 2.3.

The second part of this chapter introduces and formalizes the novel concept of the proposed LISC to answer the first research question elaborated on in the previous chapter. In contrast to [FOSH14a], this work considers more realistic scenarios, in which the automation has no system state information on the human-controlled subsystem. In the following, such a scenario is referred to as the *limited information* case. Systems with a non-measurable subsystem can especially benefit from the use of LISC: It enables the sharing of the control task between human and automation, even in the case of limited information. Finally, the motivation and the overview of the design procedure, which includes four design steps, are presented.

# **3.1 Shared Control Method with Full Information**

As FISC is based on game-theoretic modeling of the human-automation system, the fundamentals of game theory are introduced, which are crucial for the subsequent presentation of the novel concepts in Section 3.2. Afterward, the important special case of a linear quadratic FISC is summarized as a basis for the introduction of LISC.

# **3.1.1 Preliminaries on Game Theory**

Game theory is a mathematical discipline that describes and analyzes situations that require strategic decision-making (called *games*) with rational decision-makers (called *players*). In contrast to classical optimization theory, game theory considers multiple, individual decision-makers, so that the outcome for each decision-maker results not only from their own decisions but also from the decisions of the other decision-makers. This characteristic can lead to complex decision-making processes, see e. g. [Eng05, Chapter 7]. More detailed introductions can be found in [BO98, Eng05, Bar11, Tad13].

This thesis considers a *strategic game* ([Tad13, Chapter 3], or [LCS16, Chapter 1]) as follows:

**Definition 3.1 (Strategic Game [LCS16])** *A strategic game* Γ *is defined by the tuple*

$$
\Gamma = \{\mathcal{P}, \mathcal{J}, \mathcal{U}\},
\tag{3.1}
$$

*where* $N$ *independent players denoted by the set* $\mathcal{P} := \{1, \dots, N\}$ *choose a strategy* $u^{(i)} \in \mathcal{U}^{(i)}$*. The action set* $\mathcal{U}^{(i)}$ *includes all possible actions which can be chosen by player* $i$*. The choice of the applied action is based on the optimization of a cost function*

$$J^{(i)} : \mathcal{U} \rightarrow \mathbb{R},$$

*where* $\mathcal{U} = \mathcal{U}^{(1)} \times \dots \times \mathcal{U}^{(N)}$ *symbolizes the combined strategy set of the players. The set of the players' cost functions is denoted by* $\mathcal{J} = \{J^{(1)}, \dots, J^{(N)}\}$.

#### **Note:**

In the literature, $J^{(i)}$ is also referred to as a utility function, which is to be minimized or maximized. In this thesis, the problems are formulated without loss of generality such that the optimizations are always minimization problems.

The goal of each player is to choose a strategy that corresponds to the optimum of their own cost function. As this decision-making depends on the other players as well, a strategic game leads to a coupled optimization such that

$$\min\_{u^{(i)}} J^{(i)}\left(u^{(i)}, u^{(\neg i)}\right), \;\forall i \in \mathcal{P},\tag{3.2}$$

where ¬i denotes all the players excluding player i. If players compute their strategies and act only once, the game is *static*. On the other hand, if the game is repeated more than once and the players can update their strategy, the game is *dynamic*. Engineering applications often feature dynamical characteristics, which are modeled by dynamic systems.

#### **Definition 3.2 (Dynamic System [BHZ16])**

*A dynamic system is completely described by an ordinary differential equation and its initial value*

$$\dot{x}(t) = f\left(t, x(t), u^{(1)}(t), \dots, u^{(N)}(t)\right), \ t \in [0, \tau\_{\text{end}}] \tag{3.3a}$$

$$x(0) = x\_0,\tag{3.3b}$$

*where the system function* $f : \mathbb{R}^{+} \times \mathbb{R}^{n} \times \mathbb{R}^{p_1} \times \dots \times \mathbb{R}^{p_N} \rightarrow \mathbb{R}^{n}$*. The state vector and the inputs of the players are represented by* $x(t) \in \mathbb{R}^{n}$ *and* $u^{(i)}(t) \in \mathbb{R}^{p_i}$, $i \in \mathcal{P}$*, respectively. The function* $f$ *is continuous over* $t \in [0, \tau_{\text{end}}]$ *and Lipschitz continuous in* $u^{(i)}(t)$, $i \in \mathcal{P}$, *and* $x(t)$*. The differential equation (3.3) is called the state space representation of a dynamic system.*

If the players interact with a time-discrete dynamic system (3.3) and are only able to update their inputs in discrete time steps, the game is called a *difference game* and in case of continuous updates, the strategic game is called a *differential game*.

The preferences of the players are modeled by the cost function

$$\begin{split} J^{(i)}\left(x(t), u^{(1)}(t), \dots, u^{(N)}(t), \tau_{\text{end}}\right) &= V^{(i)}_{\tau_{\text{end}}} \\ &+ \int_0^{\tau_{\text{end}}} h^{(i)}\left(t, x(t), u^{(1)}(t), \dots, u^{(N)}(t)\right) \, \mathrm{d}t, \end{split} \tag{3.4}$$

which is defined over the duration of the game $[0, \tau_{\text{end}}]$ with $V^{(i)}_{\tau_{\text{end}}}$ and $h^{(i)}$ denoting the terminal and the instantaneous cost, respectively. Using the strategic game of Definition 3.1 and the dynamic system of Definition 3.2, differential games can be defined as follows:

# **Definition 3.3 (Differential Game [BHZ16])**

*A Differential Game* Γ<sup>d</sup> *is defined as a tuple of*


There are two main classes of games in which the optimal inputs $u^{(i)^*}$ can be sought: cooperative and non-cooperative games. In a cooperative game, the players can explicitly communicate with each other before the decision-making and therefore cooperation (collaboration) is possible. On the other hand, in a non-cooperative game, there is no such communication. As there is no time for explicit communication in a low-level shared control setup, using cooperative games to model these situations is impractical. Therefore, the following focuses on non-cooperative game-theoretic methods only. Further explanations for choosing non-cooperative games are presented in Appendix A.

The information structure $\gamma^{(i)}$ of a game can be open-loop or closed-loop. The differences are depicted in Figure 3.1: In an open-loop structure, the actions of the players depend on their cost functions and the initial value of the game

$$u^{(i)}(t) = \gamma^{(i)}(t, x\_0).$$

In a closed-loop information structure, the players update their control signals based on the actual system states

$$u^{(i)}(t) = \gamma^{(i)}(t, x(t)),$$

which depends on the actual system states and their cost functions, but not on the initial condition. Thus, a closed-loop information structure enables the handling of model uncertainties.

**Figure 3.1:** Open-loop (a) and closed-loop (b) structures of differential games

This thesis assumes a closed-loop information structure, which is more robust against model uncertainties and disturbances. For further information on open-loop problems, the reader is referred to [Eng98] or [Eng05, Chapter 7].
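The robustness argument can be illustrated with a minimal sketch that is not taken from the thesis. It assumes a hypothetical scalar plant $\dot{x} = a x + u$ with a single player, a nominal parameter `a_nominal` used for the design, and a slightly different true parameter `a_true`. The open-loop law $\gamma(t, x_0)$ replays the input that would be optimal on the nominal trajectory, whereas the closed-loop law $\gamma(t, x(t))$ uses the measured state. All names and numerical values are illustrative assumptions.

```python
import math


def simulate(a_true, control, x0=1.0, t_end=5.0, dt=1e-3):
    """Integrate x' = a_true * x + u with explicit Euler and return x(t_end)."""
    x, t = x0, 0.0
    for _ in range(int(round(t_end / dt))):
        u = control(t, x)
        x += dt * (a_true * x + u)
        t += dt
    return x


# Assumed design data: nominal plant parameter, feedback gain, initial state.
a_nominal, k, x0 = 1.0, 3.0, 1.0


def open_loop(t, x):
    # gamma(t, x0): input precomputed along the nominal trajectory
    # x_nom(t) = x0 * exp((a_nominal - k) * t); the measured state x is ignored.
    return -k * x0 * math.exp((a_nominal - k) * t)


def closed_loop(t, x):
    # gamma(t, x(t)): input computed from the current state.
    return -k * x


a_true = 1.2  # assumed model mismatch between design and plant
x_ol = simulate(a_true, open_loop)
x_cl = simulate(a_true, closed_loop)
print(f"open-loop   |x(t_end)| = {abs(x_ol):.3f}")
print(f"closed-loop |x(t_end)| = {abs(x_cl):.6f}")
```

In this toy setting the precomputed input no longer cancels the unstable plant mode once the model is wrong, so the open-loop state grows, while the feedback law keeps the state close to zero.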

To find a solution to a differential game, the coupled optimization problem of (3.2) needs to be solved. A widely used solution strategy is the so-called *Nash equilibrium*, in which the players act simultaneously without explicit communication<sup>10</sup>.

#### **Definition 3.4 (Nash Equilibrium in Differential Games [BO98])**

*The game is in a Nash equilibrium (NE), if the players cannot deviate from their actual strategies without increasing their costs*

$$J^{(i)}\left(u^{(i)^{*}}, u^{(\neg i)^{*}}\right) \leq J^{(i)}\left(u^{(i)}, u^{(\neg i)^{*}}\right) \quad \forall i \in \mathcal{P}.$$
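The inequality of Definition 3.4 can be checked numerically on a small static game. The following sketch is an illustration only: the two quadratic cost functions `cost_1` and `cost_2` are hypothetical and not taken from the thesis. Since each cost is strictly convex in the player's own input, the candidate equilibrium follows from the two first-order conditions, and the Nash property is then tested by unilateral deviations.

```python
def cost_1(u1, u2):
    """Hypothetical cost of player 1; strictly convex in u1."""
    return (u1 - 1.0) ** 2 + u1 * u2


def cost_2(u1, u2):
    """Hypothetical cost of player 2; strictly convex in u2."""
    return (u2 - 2.0) ** 2 + u1 * u2


def nash_from_first_order_conditions():
    # dJ1/du1 = 2*u1 + u2 - 2 = 0 and dJ2/du2 = u1 + 2*u2 - 4 = 0,
    # solved as a 2x2 linear system with Cramer's rule.
    a11, a12, b1 = 2.0, 1.0, 2.0
    a21, a22, b2 = 1.0, 2.0, 4.0
    det = a11 * a22 - a12 * a21
    u1 = (b1 * a22 - a12 * b2) / det
    u2 = (a11 * b2 - a21 * b1) / det
    return u1, u2


def is_nash(u1, u2, deviations, tol=1e-12):
    """Check that no unilateral deviation lowers the deviating player's cost."""
    base_1, base_2 = cost_1(u1, u2), cost_2(u1, u2)
    return all(
        cost_1(u1 + d, u2) >= base_1 - tol and cost_2(u1, u2 + d) >= base_2 - tol
        for d in deviations
    )


u1_star, u2_star = nash_from_first_order_conditions()
deviations = [0.01 * step for step in range(-300, 301)]
print("candidate:", u1_star, u2_star)
print("Nash property holds:", is_nash(u1_star, u2_star, deviations))
print("Nash property at (1, 2):", is_nash(1.0, 2.0, deviations))
```

The check only samples deviations on a finite grid, so it supports the Nash property for this example rather than proving it.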

For differential games, the NE is the combination of the admissible strategies of all players and the system state trajectories of the game. This combination implies that the costs of the players are the function of the time-dependent system and input trajectories

$$J^{(i)}\left(u^{(i)}(t,x(t)),u^{(\neg i)^{\*}}(t,x(t))\right).$$

Computing the NE of differential games requires the coupled optimization of $J^{(i)}$ with respect to the system dynamics, leading to the optimal control strategies of the players

$$u^{(i)^{*}}(t) = \underset{u^{(i)}}{\arg\min}\, J^{(i)}\left(x(t), u^{(i)}(t), u^{(\neg i)^{*}}(t)\right), \ \forall i \in \mathcal{P} \tag{3.5a}$$

$$\text{w.r.t. } \dot{x}(t) = f\left(t, x(t), u^{(1)}(t), \dots, u^{(N)}(t)\right), \ t \in [0, \tau_{\text{end}}] \tag{3.5b}$$

$$x(0) = x_0.$$

<sup>10</sup> Further *solution concepts* (also called *equilibrium concepts*) are the so-called *Stackelberg* equilibrium [vS+52] and the *Pareto* optimum [Par14]. The Pareto optimum of a game is a cooperative solution, in which the partners communicate to reach a common goal (e. g. social efficiency, overall welfare). In the Stackelberg equilibrium, the partners act sequentially: There is a leader and a follower, who act one after the other. The sequence of these interactions leads to the Stackelberg equilibrium. These two solution concepts are not the focus of the thesis and are therefore presented in Appendix A. For more detailed information about them, the reader is referred to [HKZ12, Chapter 2].

As discussed in Section 2.3 and in the introduction of this chapter, non-cooperative differential games are widely used in the modeling of shared controls, cf. [TAT11, NC13, FOSH14a, LSC<sup>+</sup> 19, JYN<sup>+</sup> 19]. Strong evidence for this modeling approach was presented in [BOW09], where it has been shown that a haptic interaction between two humans (and consequently between a human and an automation, too) can be modeled as a non-cooperative differential game, and the resulting motion corresponds to a NE of this game. This finding has been validated in recent applications, see [LSB21, NC22]. These practical findings prompt the application of the mathematical tools of game theory for the design of shared control systems.

### **3.1.2 General FISC Approach**

A model-based design of a shared control system requires a model of the human partner, see Section 2.3. Therefore, the following presents the model-based shared control design based on [Fla16, Chapter 3]. A widespread hypothesis for the modeling of human actions is that humans use a cost function, which models their preferences, to compute their actions. Therefore, an optimal controller is commonly used as a human model in the literature, see [Hap92, TJ02, Tod04, Av10, WDZ<sup>+</sup> 20]. Following these research results, this thesis assumes that human movements are the result of the dynamic optimization problem

$$u^{(h)^{*}}(t) = \underset{u^{(h)}}{\arg\min}\, J^{(h)}\left(x(t), \tau_{\text{end}}, u^{(h)}(t)\right),\tag{3.6a}$$

$$\begin{aligned} \text{w.r.t. } &\dot{x}(t) = f\left(t, x(t), u^{(h)}(t)\right), \ t \in [0, \tau_{\text{end}}],\\ &x(0) = x_0, \end{aligned} \tag{3.6b}$$

where $f(t, x(t), u^{(h)}(t))$ represents the dynamics of the system with which the human interacts. The cost function $J^{(h)}(\cdot)$ can model different behaviors of the human, e. g. time and/or energy optimality, effort minimization, or other comfort objectives, see e. g. [TJ02].

Following [Fla16], the design of a shared control means that for a high-level requirement and a given human model, the overall behavior of the control loop is adjusted by an appropriate design of the automation. The shared control setup of the automation and the human is modeled as a non-cooperative differential game<sup>11</sup>. To provide a mathematical description for the design, these high-level requirements are condensed in a *global objective function*

$$J^{(g)}\left(\mathbf{x}(t), \tau\_{\rm end}, \mathbf{u}(t)\right),\tag{3.7}$$

which has an optimum corresponding to the global preferences, see [Fla16, Chapter 3]. In (3.7), $u(t) = [u^{(a)}(t), u^{(h)}(t)]$ denotes the overall system inputs comprising the inputs of both the automation and the human. Using the optimal control law of the human (3.6) and the high-level requirements (3.7), a systematic design of a shared controller is possible: Firstly, the designed automation strives for a NE of the game with the human. Secondly, the design allows choosing the NE that corresponds to the optimal value of $J^{(g)}$.

<sup>11</sup> Such requirements can originate from higher management levels, from customers or from users in businesslike, real applications. Determining the most suitable function for practical problems can be challenging, because a suitable conceptualization of these high-level requirements is not easy, especially for large, multi-variable systems with multiple adjustment possibilities. Consequently, the optimum of the *global objective function* may not correspond to the desired preferences. For more details, see [Fla16, Chapter 3].

To this end, the behavior of the automation is modeled by the minimization of a cost function

$$u^{(a)^{\star}}(t) = \underset{u^{(a)}(t)}{\arg\min}\, J^{(a)}\left(x(t), \tau_{\text{end}}, u^{(a)}(t), u^{(h)^{\star}}(t)\right),\tag{3.8a}$$

$$\begin{aligned} \text{w.r.t. } &\dot{x}(t) = f\left(t, x(t), u^{(a)}(t), u^{(h)^{\star}}(t)\right), \ t \in [0, \tau_{\text{end}}],\\ &x(0) = x_0, \end{aligned} \tag{3.8b}$$

where the actions of the human are taken into account in the cost function. From a mathematical point of view, the interaction of the human (3.6) and the automation (3.8) forms a differential game with two players. The NE of this game can be computed according to (3.5).

The computation of control inputs of the automation happens with a coupled optimization

$$u^{(a)^{\star}}(t) = \underset{u^{(a)}(t)}{\arg\min}\, J^{(g)}\left(t, \tau_{\text{end}}, x(t), u^{(a)}(t), u^{(h)^{\star}}(t)\right),\tag{3.9a}$$

$$\begin{aligned} \text{w.r.t. } &\dot{x}(t) = f\left(t, x(t), u^{(a)}(t), u^{(h)^{\star}}(t)\right), \ t \in [0, \tau_{\text{end}}],\\ &x(0) = x_0,\\ \text{w.r.t. } &u^{(h)^{\star}}(t) = \underset{u^{(h)}(t)}{\arg\min}\, J^{(h)}\left(t, \tau_{\text{end}}, x(t), u^{(a)}(t), u^{(h)}(t)\right),\\ \text{w.r.t. } &\dot{x}(t) = f\left(t, x(t), u^{(a)}(t), u^{(h)}(t)\right), \ t \in [0, \tau_{\text{end}}],\\ &x(0) = x_0, \end{aligned}$$

which leads to the optimum of the high-level requirements given in (3.7) and the NE of the game. The computation of (3.9) requires a solution of the coupled dynamic optimization of $J^{(g)}$ with respect to the optimal solution of the human. Thus, an efficient computation of (3.9) is often unavailable for practical engineering applications, see [BHZ16] or [Fla16, Chapter 3].

For practical applications, a specific assumption of the cost function's structure

$$J^{(a)}\left(\cdot, \theta^{(a)}\right) = \int_0^{\tau_{\text{end}}} \theta^{(a)} \cdot \Phi\left(t, x(t), u^{(a)}, u^{(h)}\right) \mathrm{d}t \tag{3.10}$$

can reduce computational complexity<sup>12</sup>. In (3.10), $\theta^{(a)}$ is the time-invariant parameter vector and $\Phi$ is the vector of basis functions, see e. g. [MTL10, JKL<sup>+</sup> 19, IBM<sup>+</sup> 19]. In this case, (3.9a) can be simplified to a static parameter optimization. Computing the solution to this optimization problem requires less time but provides only an approximate solution of the NE due to the approximation (3.10).

<sup>12</sup> Using basis functions to compose an objective function to reduce computational complexity is a common procedure in machine learning applications as well, see e. g. [KVML18].

The optimization problem

$$\theta^{(a)^{*}} = \underset{\theta^{(a)}}{\arg\min}\, J^{(g)}\left(t, \tau_{\text{end}}, x(t), u^{(h)^{*}}(t), u^{(a)^{*}}(t), \theta^{(a)}\right),\tag{3.11a}$$

$$\begin{aligned} \text{w.r.t. } &u^{(h)^{*}}(t) = \underset{u^{(h)}(t)}{\arg\min}\, J^{(h)}\left(t, \tau_{\text{end}}, x(t), u^{(a)^{*}}(t), u^{(h)}(t)\right),\\ \text{w.r.t. } &\dot{x}(t) = f\left(t, x(t), u^{(a)^{*}}(t), u^{(h)}(t)\right), \ t \in [0, \tau_{\text{end}}],\\ &x(0) = x_0,\\ \text{and } &u^{(a)^{*}}(t) = \underset{u^{(a)}(t)}{\arg\min}\, J^{(a)}\left(t, \tau_{\text{end}}, x(t), u^{(a)}(t), u^{(h)^{*}}(t), \theta^{(a)}\right),\\ \text{w.r.t. } &\dot{x}(t) = f\left(t, x(t), u^{(a)}(t), u^{(h)^{*}}(t)\right), \ t \in [0, \tau_{\text{end}}],\\ &x(0) = x_0, \end{aligned} \tag{3.11b}$$

provides the parameters of the shared control $\theta^{(a)}$, which approximates the corresponding NE of the game. For general non-linear systems, the computation of the closed-loop NE is not straightforward: There may be no solution, a unique solution, or many solutions, which in some cases cannot be calculated [Eng05, Chapter 8]. Thus, the design of a general FISC raises further challenges. However, a linear system model and quadratic cost functions of the players can model many engineering applications sufficiently well, which is addressed in the next section.
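The nested structure of (3.11), an outer static search over $\theta^{(a)}$ wrapped around an inner equilibrium computation, can be sketched on a deliberately simplified stand-in. The example below is an assumption-laden illustration and not the design procedure of [Fla16]: the differential game is replaced by a static game with scalar inputs, the cost functions and the candidate set are hypothetical, and the automation cost uses a single weighted basis function (the squared tracking error) plus a fixed effort term.

```python
def inner_equilibrium(theta, sweeps=200):
    """Best-response iteration for the static stand-in game (inner problem)."""
    u_h, u_a = 0.0, 0.0
    for _ in range(sweeps):
        # Human: argmin over u_h of (u_h + u_a - 1)^2 + u_h^2
        u_h = (1.0 - u_a) / 2.0
        # Automation: argmin over u_a of theta * (u_h + u_a - 1)^2 + u_a^2
        u_a = theta * (1.0 - u_h) / (1.0 + theta)
    return u_h, u_a


def global_cost(u_h, u_a):
    """Hypothetical global objective: tracking error plus weighted efforts."""
    return (u_h + u_a - 1.0) ** 2 + 4.0 * u_h ** 2 + 0.25 * u_a ** 2


def design_automation(candidates):
    """Outer problem: pick the parameter whose equilibrium minimizes J_g."""
    best = None
    for theta in candidates:
        u_h, u_a = inner_equilibrium(theta)
        cost = global_cost(u_h, u_a)
        if best is None or cost < best[1]:
            best = (theta, cost, u_h, u_a)
    return best


candidates = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
theta_star, cost_star, u_h_star, u_a_star = design_automation(candidates)
print(f"theta* = {theta_star}, J_g = {cost_star:.4f}")
print(f"equilibrium inputs: u_h = {u_h_star:.4f}, u_a = {u_a_star:.4f}")
```

The grid search stands in for any static optimizer. Neither the largest nor the smallest weight wins in this toy example, because the global objective also penalizes the automation's effort.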

### **3.1.3 Linear Quadratic FISC**

This subsection presents a specific case of FISC, in which the problem is modeled as a linear-quadratic (LQ) differential game<sup>13</sup>. This approach provides an analytical solution of FISC. The computation of the control law for the LQ case facilitates a real-time solution of this game-theoretic design of a shared control system. Furthermore, practical engineering problems can often be characterized sufficiently well with such LQ models, see e. g. [NC13, FOSH14a].

First, a linear, time-invariant system dynamic

$$f(t) = \mathbf{A}x(t) + \sum_{i \in \mathcal{P}} \mathbf{B}^{(i)} u^{(i)}(t) \tag{3.12}$$

is assumed for (3.3), where $\mathbf{A}$ and $\mathbf{B}^{(i)}$ are the system state and the input matrices, respectively. The global objective function is formulated as a quadratic function of the system states and inputs

$$J^{(g)} = \frac{1}{2} \int_0^{\tau_{\text{end}}} x(t)^{\mathsf{T}} \mathbf{Q}^{(g)} x(t) + \sum_{j \in \mathcal{P}} u^{(j)}(t)^{\mathsf{T}} \mathbf{R}^{(gj)} u^{(j)}(t) \,\mathrm{d}t,\tag{3.13}$$

<sup>13</sup> This subsection is based on [Fla16, Chapter 4].

where $\mathbf{Q}^{(g)}$ and $\mathbf{R}^{(gj)}$ are the penalty matrices for the system states and system inputs, respectively, which are derived from the high-level requirements. The matrices of the objective function (3.13) are assumed to be diagonal<sup>14</sup>, $\mathbf{Q}^{(g)} = \mathrm{diag}[q^{(g)}_1, q^{(g)}_2, \dots, q^{(g)}_n]$, $\mathbf{R}^{(g)} = \mathrm{diag}[r^{(g)}_1, r^{(g)}_2, \dots, r^{(g)}_{Np}]$. Analogously, the cost functions of the automation and the human ($i \in \mathcal{P} = \{a, h\}$) are also modeled as quadratic functions

$$J^{(i)} = \frac{1}{2} \int\_0^{\tau\_{\rm end}} x(t)^\mathsf{T} \mathbf{Q}^{(i)} x(t) + \sum\_{j \in \mathcal{P}} u^{(j)}(t)^\mathsf{T} \mathbf{R}^{(ij)} u^{(j)}(t) \,\mathrm{d}t, \,\, i \in \mathcal{P}, \tag{3.14}$$

where $\mathbf{Q}^{(i)}$ and $\mathbf{R}^{(ij)}$ represent the penalty matrices for the system states and system inputs of player $i$. It is assumed that the matrices of the cost functions have a diagonal structure $\mathbf{Q}^{(i)} = \mathrm{diag}[q^{(i)}_1, q^{(i)}_2, \dots, q^{(i)}_n]$, $\mathbf{R}^{(ij)} = \mathrm{diag}[r^{(ij)}_1, r^{(ij)}_2, \dots, r^{(ij)}_{p_i}]$. These matrices are positive semidefinite and positive definite, respectively. To formulate the necessary optimality conditions, the Hamilton function is defined, which can be derived from the variational principle (see [Eng05, Chapter 8] for more details). In LQ games, the Hamiltonians of the players are

$$H^{(i)} = \frac{1}{2} x(t)^{\mathsf{T}} \mathbf{Q}^{(i)} x(t) + \frac{1}{2} \sum_{j \in \mathcal{P}} u^{(j)}(t)^{\mathsf{T}} \mathbf{R}^{(ij)} u^{(j)}(t) + \lambda^{(i)\mathsf{T}}(t) f(t), \tag{3.15}$$

where $\lambda^{(i)}(t)$ is the costate variable. Assuming an infinite time horizon, $\tau_{\text{end}} \rightarrow \infty$, the optimality condition for player $i$ is formulated as

$$\frac{\partial H^{(i)}(t)}{\partial u^{(i)}(t)} = \mathbf{0},\tag{3.16a}$$

$$\frac{\partial H^{(i)}(t)}{\partial x(t)} = -\dot{\lambda}^{(i)}(t),\tag{3.16b}$$

$$
\dot{x}(t) = f(t). \tag{3.16c}
$$

Substituting the Hamiltonian (3.15) and the linear system dynamics (3.12) into (3.16) leads to

$$\mathbf{0} = \mathbf{R}^{(ii)} u^{(i)}(t) + \mathbf{B}^{(i)\mathsf{T}} \lambda^{(i)}(t),\tag{3.17a}$$

$$
\dot{\lambda}^{(i)}(t) = -\mathbf{Q}^{(i)}x(t) - \mathbf{A}^{\mathsf{T}}\lambda^{(i)}(t),\tag{3.17b}
$$

$$
\dot{x}(t) = \mathbf{A}x(t) + \sum\_{i \in \mathcal{P}} \mathbf{B}^{(i)} u^{(i)}(t), \tag{3.17c}
$$

which hold $\forall i \in \mathcal{P}$. Solving for $u^{(i)}$ in (3.17a) leads to

$$u^{(i)}(t) = -{\mathbf{R}^{(ii)}}^{-1}\mathbf{B}^{(i)\mathsf{T}}\lambda^{(i)}(t),\tag{3.18}$$

which can be inserted in (3.17c)

$$\dot{x}(t) = \mathbf{A}x(t) - \sum\_{i \in \mathcal{P}} \mathbf{B}^{(i)} \left( \mathbf{R}^{(ii)}{}^{-1} \mathbf{B}^{(i)T} \boldsymbol{\lambda}^{(i)}(t) \right). \tag{3.19}$$

<sup>14</sup> This is a common procedure in optimal control theory, since matrix elements outside the diagonal represent mixed terms in the cost function that are in general not interpretable, see [BH18].

The equations (3.17b) and (3.19) form a boundary value problem. Due to the fact that the second order necessary condition is automatically fulfilled

$$\frac{\partial^2 H^{(i)}(t)}{\partial {u^{(i)}}^2} = \mathbf{R}^{(ii)} > \mathbf{0},$$

the solution is the minimum of the cost function $J^{(i)}$. To solve the boundary value problem, a linear solution

$$\lambda^{(i)}(t) = \mathbf{P}^{(i)}x(t)$$

is assumed, where $\mathbf{P}^{(i)} \in \mathbb{R}^{n \times n}$. This leads to

$$\begin{aligned} \mathbf{0} = \Bigg(&\mathbf{A}^{\top}\mathbf{P}^{(i)} + \mathbf{P}^{(i)}\mathbf{A} + \mathbf{Q}^{(i)} - \sum_{j \in \mathcal{P}} \mathbf{P}^{(i)}\mathbf{S}^{(j)}\mathbf{P}^{(j)} \\ &- \sum_{j \in \mathcal{P}} \mathbf{P}^{(j)}\mathbf{S}^{(j)}\mathbf{P}^{(i)} + \sum_{j \in \mathcal{P}} \mathbf{P}^{(j)}\mathbf{S}^{(ij)}\mathbf{P}^{(j)}\Bigg)\, x(t), \ \forall i \in \mathcal{P}, \end{aligned} \tag{3.20}$$

$$\text{where } \mathbf{S}^{(j)} = \mathbf{B}^{(j)}{\mathbf{R}^{(jj)}}^{-1}\mathbf{B}^{(j)\top}, \ j \in \mathcal{P},$$

$$\mathbf{S}^{(ij)} = \mathbf{B}^{(j)}{\mathbf{R}^{(jj)}}^{-1}\mathbf{R}^{(ij)}{\mathbf{R}^{(jj)}}^{-1}\mathbf{B}^{(j)\top}, \ j \in \mathcal{P}, \ i \neq j.$$

The equation (3.20) is called the coupled algebraic Riccati equation, which can be solved more efficiently than the general case, cf. (3.9). With the resulting $\mathbf{P}^{(i)}$, the feedback gains of the players can be computed as

$$\mathbf{K}^{(i)} = {\mathbf{R}^{(ii)}}^{-1} \mathbf{B}^{(i)\mathsf{T}} \mathbf{P}^{(i)},\tag{3.21}$$

which leads to the feedback control of the players

$$
\mathbf{u}^{(i)}(t) = -\mathbf{K}^{(i)}\mathbf{x}(t). \tag{3.22}
$$

The NE of the differential game is associated with the feedback gains $\mathbf{K}^{(i)}$ of the players, which lead to the optimal trajectories. This is in contrast to the open-loop case, in which the optimal control trajectories of the players are explicitly given as a function of the initial system states, $\mathbf{u}^{(i)}(t, \mathbf{x}_0)$, see [Eng05, Chapter 7, 8]. The feedback control law (3.22) enables the support of the human, see [NC13, FOSH14a].
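To make the step from (3.20) to the gains (3.21) concrete, the following minimal sketch solves the coupled Riccati equations for a scalar two-player game. All numerical values, the function names and the Gauss-Seidel iteration are illustrative assumptions and are not taken from the thesis. The sketch further assumes $\mathbf{R}^{(ij)} = \mathbf{0}$ for $i \neq j$, so that each equation becomes a quadratic in $p_i$ once the other player's solution is fixed.

```python
import math

def solve_scalar_coupled_are(a, s, q, iterations=200):
    """Gauss-Seidel iteration for the scalar two-player version of (3.20).

    Assuming R^(ij) = 0 for i != j, each equation reduces to
        0 = 2*a*p_i + q_i - s_i*p_i**2 - 2*p_i*s_j*p_j,  j != i.
    """
    p = [0.0, 0.0]
    for _ in range(iterations):
        for i in (0, 1):
            j = 1 - i
            c = a - s[j] * p[j]  # dynamics seen by player i
            # positive root of s_i*p**2 - 2*c*p - q_i = 0
            p[i] = (c + math.sqrt(c * c + s[i] * q[i])) / s[i]
    return p

def residuals(a, s, q, p):
    """Left-over of the two coupled equations; zero at a solution."""
    return [2 * a * p[i] + q[i] - s[i] * p[i] ** 2
            - 2 * p[i] * s[1 - i] * p[1 - i] for i in (0, 1)]

# Hypothetical data: x' = a*x + b_1*u_1 + b_2*u_2 with s_i = b_i**2 / r_ii
a, b, r, q = -1.0, [1.0, 1.0], [1.0, 1.0], [1.0, 2.0]
s = [b[i] ** 2 / r[i] for i in (0, 1)]
p = solve_scalar_coupled_are(a, s, q)
gains = [b[i] * p[i] / r[i] for i in (0, 1)]  # scalar form of (3.21)
print("P:", p, "K:", gains, "residuals:", residuals(a, s, q, p))
```

For matrix-valued problems the same fixed-point idea is usually applied with a Riccati or Lyapunov solver in the inner step; convergence is not guaranteed in general and should be checked via the residuals.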

For practical applications, it is sufficient to compute an approximation of the NE of the differential game including a predefined structure of the cost functions. Therefore, it is a feasible assumption that the global cost function is quadratic, cf. (3.13). Thus, the parameter vector of the automation is chosen as

$$\boldsymbol{\theta}^{(\mathrm{a})} = \left[ q_1^{(\mathrm{a})}, q_2^{(\mathrm{a})}, \dots, q_n^{(\mathrm{a})}, r_1^{(\mathrm{a}1)}, r_2^{(\mathrm{a}1)}, \dots, r_{p_1}^{(\mathrm{a}1)}, r_1^{(\mathrm{a}2)}, \dots, r_{p_N}^{(\mathrm{a}N)} \right],\tag{3.23}$$

which is to be determined. The optimal parameter vector of the automation is computed by the nested optimization

$$\boldsymbol{\theta}^{(\mathrm{a})^{*}} = \underset{\boldsymbol{\theta}^{(\mathrm{a})}}{\arg\min} \, J^{(\mathrm{g})}\left(t, \tau_{\mathrm{end}}, \boldsymbol{x}(t), \boldsymbol{u}^{(\mathrm{h})^{*}}(t), \boldsymbol{u}^{(\mathrm{a})^{*}}(t), \boldsymbol{\theta}^{(\mathrm{a})}\right),\tag{3.24a}$$

$$\begin{aligned} \textbf{w.r.t.} \,\forall i \in \{\textbf{h}, \textbf{a}\} \\ \textbf{0} = \textbf{A}^{\top}\textbf{P}^{(i)} + \textbf{P}^{(i)}\textbf{A} + \textbf{Q}^{(i)} - \sum\_{j \in \mathcal{P}} \textbf{P}^{(i)}\textbf{S}^{(j)}\textbf{P}^{(j)} \\ - \sum\_{j \in \mathcal{P}} \textbf{P}^{(j)}\textbf{S}^{(j)}\textbf{P}^{(i)} + \sum\_{j \in \mathcal{P}} \textbf{P}^{(j)}\textbf{S}^{(ij)}\textbf{P}^{(j)}, \end{aligned} \tag{3.24b}$$

which is solved iteratively. There are computationally efficient implementations of solvers for (3.24), e. g. in Matlab [BH18] and in C++ [GJ+10].
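The nested structure of (3.24) can be illustrated with a deliberately small sketch: an outer search over a single automation weight and an inner solution of the coupled Riccati equations. The scalar plant, all weights, the candidate grid and the infinite-horizon evaluation of the global cost are assumptions chosen for illustration only. They replace the finite horizon $\tau_{\mathrm{end}}$ and the full parameter vector (3.23) of the thesis.

```python
import math

A, S_H, S_A = -1.0, 1.0, 1.0  # hypothetical scalar plant, S = b**2 / r
Q_H = 2.0                     # hypothetical (identified) human state weight
Q_G, R_G = 1.0, 0.5           # hypothetical weights of the global cost

def nash_solution(q_a, iterations=200):
    """Inner problem: scalar coupled Riccati equations, cf. (3.24b)."""
    p_h = p_a = 0.0
    for _ in range(iterations):
        c = A - S_A * p_a
        p_h = (c + math.sqrt(c * c + S_H * Q_H)) / S_H
        c = A - S_H * p_h
        p_a = (c + math.sqrt(c * c + S_A * q_a)) / S_A
    return p_h, p_a  # equal to the feedback gains because b = r = 1 here

def global_cost(q_a, x0=1.0):
    """Quadratic global cost of the closed loop x' = a_cl * x on [0, inf)."""
    k_h, k_a = nash_solution(q_a)
    a_cl = A - k_h - k_a
    return x0 ** 2 * (Q_G + R_G * (k_h ** 2 + k_a ** 2)) / (-2.0 * a_cl)

# Outer problem: pick the automation weight with the lowest global cost
candidates = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
costs = {q_a: global_cost(q_a) for q_a in candidates}
best_q_a = min(costs, key=costs.get)
print("best automation weight:", best_q_a, "global cost:", costs[best_q_a])
```

A grid search is used only to keep the sketch short; a gradient-free or gradient-based optimizer would take its place for a parameter vector with more than one entry.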

# **3.2 The Concept of Limited Information Shared Control**

After the presentation of FISC, the concept of LISC is introduced. In several practical applications, FISC cannot be utilized due to the following two reasons:


These practical challenges limit the use of the shared control approaches presented in Section 3.1, for which LISC provides a remedy.

Before presenting LISC, it needs to be noted that the state of the art contains numerous stochastic models and control approaches for cooperative systems that use similar terms, e. g. *limited* or *incomplete* information [RSV04, MASM05, She18, KTFH20]. Other works in the literature, e. g. [BMA17, BAMA20], attempt to design shared controls for unknown system dynamics, parameters, or unidentified human behavior, which is not the focus of this thesis. It is essential to note that the proposed model is deterministic and assumed to be known or identified in advance. The term *limited information* means that a subset of the references and system states is available only to the human, but not to the automation. Such a setup has not been taken into account by the works using the terms limited and incomplete, see e. g. [RSV04, MASM05, She18, KTFH20]. Note that the terms *measurable* and *non-measurable* are always considered with respect to the automation. The concepts and results in the following subsections have been published in two research papers [VSL+20, VIH22].

<sup>15</sup> More generally, the term *goal* would be more reasonable because the human defines a goal trajectory or path, which he/she tries to reach. The term goal can precisely point out that these references are determined by the human online during the task execution. However, for the sake of consistency, the term *reference* is used.

### **3.2.1 Modeling Assumptions**

First, general assumptions are stated, which are necessary for applying LISC.

#### **Assumption 3.2.1**

First, it is assumed that the system model is available and input-affine. Furthermore, its parameters are identified in advance. It has the general form

$$\dot{x}(t) = \mathbf{f}\left(t, x(t)\right) + \sum\_{i \in \mathcal{P}} \mathbf{g}^{(i)}\left(t, x(t)\right) u^{(i)}(t),\tag{3.25}$$

where the unforced state dynamics and the nonlinear input vector are $\mathbf{f}: \mathbb{R}^{+} \times \mathbb{R}^{n} \to \mathbb{R}^{n}$ and $\mathbf{g}^{(i)}: \mathbb{R}^{+} \times \mathbb{R}^{p_i} \to \mathbb{R}^{n}$. Furthermore, $x \in \mathbb{R}^{n}$ and $u^{(i)} \in \mathbb{R}^{p_i}$ with $i \in \mathcal{P} = \{\mathrm{a}, \mathrm{h}\}$ are the system states and the inputs of the automation and the human, respectively.

#### **Assumption 3.2.2**

It is assumed that the dynamic system can be split into an automation-controlled part ($x_{\mathrm{m}}(t) \in \mathbb{R}^{n-k}$, measurable for the automation) and a human-controlled part ($x_{\mathrm{nm}}(t) \in \mathbb{R}^{k}$, non-measurable for the automation<sup>16</sup>) for LISC, such that

$$\mathbf{x}(t) = \begin{bmatrix} \mathbf{x}\_{\mathrm{m}}^{\mathrm{T}}(t), \; \mathbf{x}\_{\mathrm{nm}}^{\mathrm{T}}(t) \end{bmatrix}^{\mathrm{T}}.\tag{3.26}$$

It is additionally assumed that the dynamic system is characterized by *unidirectionally coupled dynamics*<sup>17</sup>. This means that the human-controlled, non-measurable states xnm have no influence on the automation-controlled states xm. On the other hand, the measurable system part has an influence on the non-measurable system part. Therefore, the system states can be split into

$$
\dot{x}_{\mathrm{m}}(t) = f_{\mathrm{m}}\left(t, x_{\mathrm{m}}(t)\right) + g_{\mathrm{m}}^{(\mathrm{a})}\left(t, x_{\mathrm{m}}(t)\right) u^{(\mathrm{a})}(t), \tag{3.27a}
$$

$$
\begin{aligned}
\dot{x}_{\mathrm{nm}}(t) = {} & f_{\mathrm{nm}}\left(t, x_{\mathrm{nm}}(t), x_{\mathrm{m}}(t)\right) \\
& + g_{\mathrm{nm}}^{(\mathrm{h})}\left(t, x_{\mathrm{nm}}(t), x_{\mathrm{m}}(t)\right) u^{(\mathrm{h})}(t) + g_{\mathrm{nm}}^{(\mathrm{a})}\left(t, x_{\mathrm{nm}}(t), x_{\mathrm{m}}(t)\right) u^{(\mathrm{a})}(t),
\end{aligned} \tag{3.27b}
$$

$$
y(t) = x_{\mathrm{m}}(t). \tag{3.27c}
$$

Works in literature utilize the assumption of unidirectionally coupled dynamics, too, meaning that some system states have an influence on other system states but not conversely, see e. g. leader-follower tracking with limited information [YYF22], battery energy storage management with limited information [ZZLW19] or general synthesis for distributed control systems with asymmetric information structure [LL11, MMRY12, LL15].

<sup>16</sup> Note that the terms *human-controlled* and *non-measurable* can be used interchangeably since they have the same meaning in this thesis: they refer to the subsystem that cannot be measured by the automation and is controlled by the human.

<sup>17</sup> In [VSL+20], the terminology *unidirectionally coupled motion* was first used, which was introduced explicitly for the application of a vehicle manipulator. In the course of this thesis, this novel term is utilized, emphasizing the suitability of the characterization for general shared, cooperative as well as distributed control systems.

#### **Assumption 3.2.3**

Finally, the continuous human-automation interaction forms a LISC, which is defined as follows.

#### **Definition 3.5 (Limited Information Shared Control [VSL+20])**

*Consider full information shared control between two partners as introduced in Definition 2.4. If one of the partners has no information about*


*the shared control is called limited information shared control.*

Note that this definition does not fulfill the shared control definition (Definition 2.4), because the partner with limited information is not able to carry out the task without the support of the other partner. Thus, both partners are necessary to accomplish the task, in contrast to a general FISC, where both partners can complete the task individually. The following definition extends LISC to problems with dual tasks.

#### **Definition 3.6 (Limited Information Shared Control for Dual Tasks)**

*Let limited information shared control and dual task according to Definition 3.5 and Definition 2.2 be given. If the task is shared between the interaction entities meaning that one of the tasks is carried out by one entity and the other task by the other entity, the setup is called limited information shared control for dual tasks.*

The remaining question is how to design a shared control for such setups with limited information, which is addressed in the next subsection.

### **3.2.2 System Structure with Cooperation State**

In the following, the core idea of modeling with the so-called *cooperation state* (CS) is proposed, which is used for the design of LISC with dual tasks. It is assumed that the design procedure is carried out in a full information setup, in which the controller parameters of LISC are obtained. Afterwards, LISC can also be used in setups with limited information. Figure 3.2 illustrates the design procedure and the usage of the controllers under these assumptions. The assumption stems from practice: LISC is designed in an artificial environment (e. g. a testing area or a simulation environment), where the use of FISC is possible as well, since all system states and references are available. After the design, LISC can be used in real working environments with limited information, where FISC is not applicable.

**Figure 3.2:** The design idea of LISC using FISC

The core notion of LISC is that the CS models the *mutual efforts* of the human and the automation towards their own goals. If this mutual effort is zero, neither the interacting human nor the automation strives to change their inputs. The CS is defined as follows.

#### **Definition 3.7 (Cooperation State)**

*Let a LISC according to Definition 3.5 be given. In that case, the mutual efforts of human and automation can be described with an artificial system state vector*

$$x\_{\kappa}(t) = \xi\left(u^{\text{(h)}}(t), u^{\text{(a)}}(t)\right),\tag{3.28}$$

*which characterizes the interaction between human and automation. This system state is called cooperation state* xκ*.*

#### **Note 1:**

Since (3.28) encapsulates the inputs of the human, it carries information about the non-measurable system part.

#### **Note 2:**

The function ξ should characterize the result of the *shared* effort between automation and human. Definition 3.7 does not include any restrictions on the structure of CS and does not provide calculation procedures. Due to the fact that the notion of CS is application specific, its structure depends on the characteristics of the differential game.

In order to make Definition 3.7 suitable for the design of LISC, restrictions on the structure of CS need to be specified. For this specification, the human's control law is assumed to be a linear feedback controller<sup>18</sup>, as supported by the literature [Hap92, TJ02, WDZ+20]. Example 3.1 illustrates that the concept of CS can be applied to an arbitrary non-linear system with linear control laws of the human and the automation.

<sup>18</sup> This is a reasonable assumption to reduce the complexity of the human's behavior. A linear control law reduces the complexity, but still provides a suitable reconstruction of the human behavior on the action level of their motions, as shown in the literature. Human models on the decision level (e. g. [FAI+19, RWIH20]) require a more complex characterization. However, such models on the decision level are not the focus of this thesis.

#### **Example 3.1:**

*Let the following nonlinear system with two states be given*

$$
\dot{x}_{\mathrm{m}}(t) = -2 \cdot t \cdot x_{\mathrm{m}}(t) + u^{(\mathrm{a})}(t), \tag{3.29a}
$$

$$
\dot{x}_{\mathrm{nm}}(t) = -t \cdot x_{\mathrm{m}}^2(t) + 2 \cdot x_{\mathrm{nm}}(t) \cdot x_{\mathrm{m}}(t) + x_{\mathrm{nm}}(t) + u^{(\mathrm{h})}(t),\tag{3.29b}
$$

*which is a non-linear, time-dependent input-affine system. It is assumed that* xm(t) *is measurable for the automation, while* xnm(t) *is not. The feedback control laws of the human and automation are assumed to be linear*

$$u^{(\mathrm{a})}(t) = -\mathbf{K}^{(\mathrm{a})}x(t),\tag{3.30a}$$

$$u^{(\mathrm{h})}(t) = -\mathbf{K}^{(\mathrm{h})}x(t),\tag{3.30b}$$

*where* x(t) = [xm(t), xnm(t)] *and the feedback gains are*

$$\mathbf{K}^{(\mathrm{a})} = \begin{bmatrix} 0.5, 0 \end{bmatrix} \text{ and } \mathbf{K}^{(\mathrm{h})} = \begin{bmatrix} 0.75, 2.5 \end{bmatrix}, \tag{3.31}$$

*which stabilize the system. Note, these feedback gains can be assumed to be the result of FISC design, which entails the cost functions of the human and the automation, see Section 3.1.2. For the sake of brevity and to keep the focus of this example on the proposed CS, the design steps of FISC are omitted and the values in (3.31) lead to a stable control loop.*

*Assuming that the state* xnm(t) *is non-measurable, the automation cannot take into account* xnm(t) *in its control actions. Therefore, the automation cannot support the human directly. A possible solution is the use of the CS, which can provide a substitution for* xnm(t)*. The structure of CS is assumed to be as follows:*

$$x_{\kappa}(t) = \Xi_1 \cdot {u^{(\mathrm{a})}}^{2}(t) + \Xi_2 \cdot u^{(\mathrm{a})}(t) \cdot u^{(\mathrm{h})}(t) + \Xi_3 \cdot u^{(\mathrm{h})}(t). \tag{3.32}$$

*This CS is supported by the structure of (3.29b). Due to the linear control law (3.30), the correspondence of $x_{\kappa}(t)$ is formulated as*

$$x_{\kappa}(t) = \Xi_1 \cdot \underbrace{{u^{(\mathrm{a})}}^{2}(t)}_{\Longrightarrow\, x_{\mathrm{m}}^{2}(t)} + \Xi_2 \cdot \underbrace{u^{(\mathrm{a})}(t) \cdot u^{(\mathrm{h})}(t)}_{\Longrightarrow\, x_{\mathrm{nm}}(t) \cdot x_{\mathrm{m}}(t)} + \Xi_3 \cdot \underbrace{u^{(\mathrm{h})}(t)}_{\Longrightarrow\, x_{\mathrm{nm}}(t)},$$

*where $\Longrightarrow$ symbolizes the implication of the corresponding elements. The reasons for these implications are the following:*


*These similarities are used for reconstructing $x_{\mathrm{nm}}(t)$. The parameters $\Xi_i$ are manually tuned using simulations, where the initial values of the simulation are chosen as $x_0 = [0.75, -0.25]$. The goal of the tuning was to minimize the error between $x_{\kappa}(t)$ and $x_{\mathrm{nm}}(t)$. This manual tuning was possible due to the small number of parameters. The obtained parameters are*

$$
\Xi\_1 = -1.45, \ \Xi\_2 = 0.25 \text{ and } \Xi\_3 = -0.4.
$$

*The results are given in Figure 3.3, which shows that $x_{\kappa}$ can reproduce the trajectory of $x_{\mathrm{nm}}$. This way, after the design procedure, the automation can use $x_{\kappa}$ in situations in which $x_{\mathrm{nm}}$ is not available. In principle, $x_{\kappa}$ could also be applied in situations in which $x_{\mathrm{nm}}$ is available for the automation. However, in that case, using $x_{\mathrm{nm}}$ directly is more reasonable and there is no need for $x_{\kappa}$.*

**Figure 3.3:** Example of the cooperation state trajectory
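The simulation behind Example 3.1 can be sketched as follows. The forward-Euler integration, the step size and the horizon are assumptions made for this sketch, as is the reading of (3.31) in which the first gain vector belongs to the automation and the second one to the human. The script only reports the largest gap between $x_{\kappa}$ and $x_{\mathrm{nm}}$; it is not meant to reproduce Figure 3.3 exactly.

```python
K_A = (0.5, 0.0)          # automation gain, first vector in (3.31)
K_H = (0.75, 2.5)         # human gain, second vector in (3.31) (assumed)
XI = (-1.45, 0.25, -0.4)  # hand-tuned parameters reported in the example

def inputs(x_m, x_nm):
    """Linear feedback laws (3.30) of automation and human."""
    u_a = -(K_A[0] * x_m + K_A[1] * x_nm)
    u_h = -(K_H[0] * x_m + K_H[1] * x_nm)
    return u_a, u_h

def cooperation_state(u_a, u_h):
    """Cooperation state with the structure of (3.32)."""
    return XI[0] * u_a ** 2 + XI[1] * u_a * u_h + XI[2] * u_h

def simulate(t_end=8.0, dt=1e-3, x0=(0.75, -0.25)):
    """Forward-Euler simulation of (3.29); returns (t, x_m, x_nm, x_kappa)."""
    x_m, x_nm = x0
    log = []
    for k in range(int(round(t_end / dt)) + 1):
        t = k * dt
        u_a, u_h = inputs(x_m, x_nm)
        log.append((t, x_m, x_nm, cooperation_state(u_a, u_h)))
        dx_m = -2.0 * t * x_m + u_a
        dx_nm = -t * x_m ** 2 + 2.0 * x_nm * x_m + x_nm + u_h
        x_m += dt * dx_m
        x_nm += dt * dx_nm
    return log

log = simulate()
max_gap = max(abs(x_kappa - x_nm) for _, _, x_nm, x_kappa in log)
print(f"max |x_kappa - x_nm| over the run: {max_gap:.3f}")
```

Note that `cooperation_state` only needs the two inputs, which is the property that makes $x_{\kappa}$ usable when $x_{\mathrm{nm}}$ cannot be measured.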

In Example 3.1, the CS is chosen based on three assumptions:


Due to the simplicity of the example, this manual tuning worked well. Furthermore, the CS is used to reconstruct the non-measurable system states, which correspond to the mutual effort of the human and the automation in the shared control setup<sup>19</sup>. In order to generalize the procedure, the following idea is used.

<sup>19</sup> In the case of the large vehicle manipulator from Chapter 1, this mutual effort can be explained intuitively: Both the operator and the automation strive to reach the reference of the manipulator as well as possible.

According to Assumption 3.2.2, the human-controlled system part has no impact on the automation-controlled subsystem. Therefore, the control laws of the human and the automation are used for such a systematic determination of CS. It can be assumed that the control inputs of the automation and the human are determined by a function<sup>20</sup> of the current system states

$$\left[\boldsymbol{u}^{\left(\mathbf{h}\right)}(t),\ \boldsymbol{u}^{\left(\mathbf{a}\right)}(t)\right] = \mathcal{K}\left(\boldsymbol{x}\left(t\right)\right),\tag{3.33}$$

where $\mathcal{K}$ is a general feedback control law of the shared control setup including the automation and the human. The essential characteristic is that there is a deterministic relation between $u^{(\mathrm{h})}(t)$, $u^{(\mathrm{a})}(t)$ and $x(t)$. Applying the generalized inverse of (3.33) leads to

$$x(t) = \mathcal{K}^{(\mathrm{h})^{-}}\left(u^{(\mathrm{h})}(t), u^{(\mathrm{a})}(t)\right),\tag{3.34}$$

where $\square^{-}$ symbolizes the generalized inverse; for more details, see [EH13] or [KR15, Chapter 2]. Note that the function $\mathcal{K}^{(\mathrm{h})}$ is not necessarily bijective, meaning that a unique inverse does not exist in general. Therefore, if (3.34) is not unique, $\mathcal{K}^{(\mathrm{h})^{-}}$ can be calculated only approximately.

As can be seen, (3.28) and (3.34) have similar structures. Moreover, $x(t)$ includes $x_{\mathrm{nm}}(t)$; thus, a subset of $\mathcal{K}^{(\mathrm{h})^{-}}$ has to correspond to $\xi$. Clearly, both (3.28) and (3.34) provide algebraic, approximate mappings. Thus, the corresponding subset of $\mathcal{K}^{(\mathrm{h})^{-}}$ can be used to determine the structure and parameters of CS, omitting manual tuning. Then, if the function $\xi$ is obtained from $\mathcal{K}^{(\mathrm{h})^{-}}$ and $x_{\mathrm{nm}}$ is reconstructed by $x_{\kappa}$, the application of classical control methods is possible. The idea of the CS is general and can be applied to systems without non-measurable parts. However, primarily, systems with limited information and unidirectionally coupled dynamics benefit from LISC, cf. Example 3.1.
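For linear feedback laws, the inversion in (3.34) can be carried out explicitly. The sketch below stacks the two gain vectors of Example 3.1 into a square matrix and inverts it, which yields a linear map from the inputs to $x_{\mathrm{nm}}$. It assumes that the first gain in (3.31) belongs to the automation and the second to the human, and that the stacked matrix is invertible; the helper `invert_2x2` is introduced here for illustration only. In the general nonlinear or non-square case a generalized inverse would be required instead.

```python
K_H = [0.75, 2.5]  # human gain, assumed to be the second vector in (3.31)
K_A = [0.5, 0.0]   # automation gain, first vector in (3.31)

def invert_2x2(m):
    """Plain inverse of a 2x2 matrix; fails if the stacked gain is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("singular gain matrix: a generalized inverse is needed")
    return [[d / det, -b / det], [-c / det, a / det]]

# [u_h, u_a]^T = -K x with K = [K_H; K_A]  =>  x = -K^{-1} [u_h, u_a]^T
K_inv = invert_2x2([K_H, K_A])
xi_h, xi_a = (-value for value in K_inv[1])  # row that reconstructs x_nm

def reconstruct_x_nm(u_h, u_a):
    return xi_h * u_h + xi_a * u_a

# Check at the initial state x_0 = [0.75, -0.25] used in Example 3.1
x_m, x_nm = 0.75, -0.25
u_h = -(K_H[0] * x_m + K_H[1] * x_nm)
u_a = -(K_A[0] * x_m + K_A[1] * x_nm)
print("xi_h:", xi_h, "xi_a:", xi_a, "x_nm:", reconstruct_x_nm(u_h, u_a))
```

Under these assumptions the coefficient on $u^{(\mathrm{h})}$ evaluates to $-0.4$, which coincides with the hand-tuned $\Xi_3$ of Example 3.1.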

Using Definition 3.7, the measurable system part is extended with the CS such that

$$\boldsymbol{x}\_{e}(t) = \begin{bmatrix} \boldsymbol{x}\_{\mathrm{m}}(t) \ \boldsymbol{u}^{\mathrm{(a)}}(t) \ \boldsymbol{x}\_{\kappa}(t) \end{bmatrix}^{\mathsf{T}} \in \mathbb{R}^{n+p\_{h}}.\tag{3.35}$$

Note that $u^{(\mathrm{a})}(t)$ is included in the extended system state and the new input of the automation is chosen as $\dot{u}^{(\mathrm{a})}(t)$. This integral extension is necessary to avoid algebraic loops and to enable a systematic treatment of LISC<sup>21</sup>. The obtained extended system is

$$
\dot{x}\_e(t) = f\_e(t, x\_e(t)) + g\_e^{\text{(a)}} \left( \dot{u}^{\text{(a)}}(t) \right) + g\_e^{\text{(h)}} \left( u^{\text{(h)}}(t) \right) \tag{3.36a}
$$

$$\mathbf{x}\_e(0) = \mathbf{x}\_{e0},\tag{3.36b}$$

which only depends on the system dynamics and the inputs of the human and the automation. In (3.36), $f_e$, $g_e^{(\mathrm{a})}$ and $g_e^{(\mathrm{h})}$ are the system dynamics and the input functions of the automation and the human adapted for the extended system formulation, respectively. The formulation in (3.36) enables the use of standard control methods. An apparent solution is designing an optimal controller for (3.36), which leads to the following optimal control inputs of the automation

$$\left(\dot{u}^{(\mathrm{a})}\right)^{*}(t) = \underset{\dot{u}^{(\mathrm{a})}}{\arg\min}\ J_{\mathrm{LISC}}^{(\mathrm{a})}\left(t, x_e(t), \dot{u}^{(\mathrm{a})}(t)\right) \tag{3.37a}$$

$$\text{w.r.t. } (3.36), \tag{3.37b}$$

where $^{*}$ symbolizes the optimal solution. Example 3.1 shows how the automation can still have information about the non-measurable state, and (3.37) provides an optimization for the design of LISC. Thus, the concept can be generally used for various problems in which one of the subsystems has limited information on a part of the system. Such examples could be e. g. adaptive cruise control systems [PSv+11, Sd13] or distributed control systems [LL15].

<sup>20</sup> $\mathcal{K}$ is the general feedback control of the human and the automation in the shared control setup. It is generally not limited to linear control laws. The core idea and the corresponding steps would also suit a nonlinear model of the human behavior or nonlinear system dynamics.

<sup>21</sup> Note that dynamic feedback extension is a common procedure in the control engineering literature, see e. g. [Gil69, Loh91, AWN12].

However, choosing an appropriate structure may be difficult, and an unsuitable choice can lead to a useless CS. To overcome this challenge and reduce the computation complexity, the following subsection presents the LQ formulation and the design steps of LISC.

### **3.2.3 LQ Limited Information Shared Control**

This subsection presents the application of LISC for linear systems. It is assumed that the system matrix $\mathbf{A}$ and the input matrices $\mathbf{B}^{(i)}$, $i \in \{\mathrm{a}, \mathrm{h}\}$, are known. The cost functions of the human and the automation are quadratic, cf. (3.14). Formulating (3.27) with linear dynamics yields

$$
\begin{bmatrix}
\dot{x}\_{\rm m}(t) \\
\dot{x}\_{\rm nm}(t)
\end{bmatrix} = \begin{bmatrix}
\mathbf{A}\_{\rm m} & \mathbf{0} \\
\mathbf{A}\_{\rm m-\rm nm} & \mathbf{A}\_{\rm nm}
\end{bmatrix} \begin{bmatrix}
x\_{\rm m}(t) \\
x\_{\rm nm}(t)
\end{bmatrix} + \begin{bmatrix}
\mathbf{B}\_{\rm m}^{\rm (a)} \\
\mathbf{B}\_{\rm nm}^{\rm (a)}
\end{bmatrix} u^{\rm (a)}(t) + \begin{bmatrix}
\mathbf{0} \\
\mathbf{B}\_{\rm nm}^{\rm (h)}
\end{bmatrix} u^{\rm (h)}(t) \tag{3.38a}
$$

$$
y(t) = x\_{\rm m} \tag{3.38b}
$$

with the assumption that $x_{\mathrm{nm}}$ has no impact on $x_{\mathrm{m}}$, cf. Assumption 3.2.2. For such linear systems, CS is defined as follows.

#### **Definition 3.8 (Cooperation State for Linear Systems, [VIH22])**

*The cooperation state for linear systems is defined as the linear combination of the inputs of the automation and the human such that*

$$x\_{\kappa}(t) = \Xi^{\text{(a)}} u^{\text{(a)}}(t) + \Xi^{\text{(h)}} u^{\text{(h)}}(t) \in \mathbb{R}^k,\tag{3.39}$$

*where the matrices $\Xi^{(\mathrm{a})} \in \mathbb{R}^{k \times p_{\mathrm{a}}}$ and $\Xi^{(\mathrm{h})} \in \mathbb{R}^{k \times p_{\mathrm{h}}}$ are design parameters.*

#### **Note:**

Using Definition 3.8 always leads to a CS that enables the reconstruction of the non-measurable system parts of (3.38a) with a linear control feedback of the human. Due to the linearity of the system dynamics and of the feedback, the CS needs to be chosen as a linear combination of the inputs of the human and the automation.

A further benefit of Definition 3.8 is that the CS has the same dimension as the non-measurable states $x_{\mathrm{nm}} \in \mathbb{R}^{k}$ of the original system. Furthermore, the specification of $x_{\kappa}$ is reduced to the identification of the parameters of $\Xi^{(\mathrm{a})}$ and $\Xi^{(\mathrm{h})}$. With (3.39), the following extended system dynamics are obtained

$$
\begin{bmatrix}
\dot{x}_{\mathrm{m}}(t) \\
\dot{u}^{(\mathrm{a})}(t) \\
\dot{x}_{\kappa}(t)
\end{bmatrix} = \begin{bmatrix}
\mathbf{A}_{\mathrm{m}} & \mathbf{B}_{\mathrm{m}}^{(\mathrm{a})} & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{0}
\end{bmatrix} \begin{bmatrix}
x_{\mathrm{m}}(t) \\
u^{(\mathrm{a})}(t) \\
x_{\kappa}(t)
\end{bmatrix} + \begin{bmatrix}
\mathbf{0} \\
\mathbf{1} \\
\Xi^{(\mathrm{a})}
\end{bmatrix} \dot{u}^{(\mathrm{a})}(t) + \begin{bmatrix}
\mathbf{0} \\
\mathbf{0} \\
\Xi^{(\mathrm{h})}
\end{bmatrix} \dot{u}^{(\mathrm{h})}(t), \tag{3.40}
$$

where the derivative of the original system input $\dot{u}^{(\mathrm{a})}$ is taken into account for the design procedure of LISC only. The structure of the extended system matrix implies that $x_{\kappa}$ has an effect on $x_{\mathrm{m}}$ and $u^{(\mathrm{a})}$, if a feedback controller is designed such that

$$
\dot{\boldsymbol{u}}\_{\rm LISC}^{\rm (a)}(t) = -\mathbf{K}\_{\rm LISC}^{\rm (a)} \cdot \boldsymbol{x}\_{\rm e}(t). \tag{3.41}
$$
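The block structure of (3.40) and the evaluation of (3.41) can be sketched with plain nested lists. The helper names, the example matrices and the gain passed to `feedback_rate` are hypothetical and serve only to show how the extended matrices are assembled from $\mathbf{A}_{\mathrm{m}}$, $\mathbf{B}_{\mathrm{m}}^{(\mathrm{a})}$, $\Xi^{(\mathrm{a})}$ and $\Xi^{(\mathrm{h})}$.

```python
def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def hstack(*blocks):
    """Concatenate blocks with equal row counts side by side."""
    return [sum((list(b[r]) for b in blocks), []) for r in range(len(blocks[0]))]

def vstack(*blocks):
    """Stack blocks with equal column counts on top of each other."""
    return [list(row) for b in blocks for row in b]

def extended_system(A_m, B_m_a, Xi_a, Xi_h):
    """Assemble the matrices of (3.40) for x_e = [x_m, u_a, x_kappa]."""
    n_m, p_a = len(A_m), len(B_m_a[0])
    k, p_h = len(Xi_h), len(Xi_h[0])
    n_e = n_m + p_a + k
    A_e = vstack(hstack(A_m, B_m_a, zeros(n_m, k)), zeros(p_a + k, n_e))
    B_e_a = vstack(zeros(n_m, p_a), identity(p_a), Xi_a)  # driven by du_a/dt
    B_e_h = vstack(zeros(n_m + p_a, p_h), Xi_h)           # driven by du_h/dt
    return A_e, B_e_a, B_e_h

def feedback_rate(K_lisc, x_e):
    """Evaluate (3.41): du_a/dt = -K_LISC * x_e."""
    return [-sum(k * x for k, x in zip(row, x_e)) for row in K_lisc]

# Hypothetical measurable subsystem with two states and one automation input
A_e, B_e_a, B_e_h = extended_system(
    A_m=[[0.0, 1.0], [0.0, -1.0]], B_m_a=[[0.0], [1.0]],
    Xi_a=[[0.6]], Xi_h=[[-0.4]])
print(A_e, B_e_a, B_e_h)
```

Only the first block row of the extended system matrix is non-zero, which reflects that $u^{(\mathrm{a})}$ and $x_{\kappa}$ are driven purely by the input derivatives.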

To compute the feedback law, an LQR problem is formulated, excluding the non-measurable states xnm from the design. For the LQR design, the model (3.40) and the cost function

$$J_{\mathrm{LISC}}^{(\mathrm{a})} = \int_0^{\tau_{\mathrm{end}}} x_e(t)^\top \mathbf{Q}_{\mathrm{LISC}}^{(\mathrm{a})} x_e(t) + \dot{u}^{(\mathrm{a})\top}(t)\, \mathbf{R}_{\mathrm{LISC}}^{(\mathrm{a})}\, \dot{u}^{(\mathrm{a})}(t) \,\mathrm{d}t \tag{3.42}$$

are used. The optimum is computed by the dynamic optimization

$$
\dot{u}_{\mathrm{LISC}}^{(\mathrm{a})^{*}}(t) = \underset{\dot{u}^{(\mathrm{a})}}{\arg\min} \, J_{\mathrm{LISC}}^{(\mathrm{a})}\left(t, x_{e}(t), \dot{u}^{(\mathrm{a})}(t), u^{(\mathrm{h})}(t)\right), \tag{3.43a}
$$

$$\text{w.r.t. } (3.40), \tag{3.43b}$$

which leads to the feedback gains $\mathbf{K}_{\mathrm{LISC}}^{(\mathrm{a})}$ in (3.41). The computation of (3.43) is carried out with classical LQ optimization, see e. g. [PLB15]. To obtain the original system input, the integral

$$
\hat{u}_{\mathrm{LISC}}^{(\mathrm{a})}(t) = \int_0^t \dot{u}_{\mathrm{LISC}}^{(\mathrm{a})}(\tau^\star) \,\mathrm{d}\tau^\star \tag{3.44}
$$

is computed. In (3.40), the initial value of the original system input is assumed to be $u^{(\mathrm{a})}(0) = 0$, meaning that the original system input signal is zero at the beginning. This is a plausible assumption, since the controller can be initialized to zero without loss of generality.
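In a sampled implementation, the integral (3.44) has to be approximated numerically. The following sketch uses a forward-Euler accumulator that starts from $u^{(\mathrm{a})}(0) = 0$; the class name, the step size and the constant test signal are assumptions made for illustration.

```python
class IntegratedInput:
    """Forward-Euler approximation of (3.44), initialized with u(0) = 0."""

    def __init__(self, size):
        self.u = [0.0] * size

    def step(self, u_dot, dt):
        """Accumulate one sample of the input derivative and return u."""
        self.u = [u + dt * du for u, du in zip(self.u, u_dot)]
        return list(self.u)

# Hypothetical usage: a constant derivative of 2.0 applied for one second
integrator = IntegratedInput(size=1)
for _ in range(1000):
    u = integrator.step([2.0], dt=1e-3)
print(u)
```

In a control loop, `u_dot` would be the output of the feedback law (3.41) at each sample, and the returned value would be applied to the plant.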

# **3.3 Overview of the LISC Design Procedure**

After the presentation of LISC, two important questions remained open:


To answer these two open questions, a practical procedure was proposed in [VSB+20]: The parameters were directly configured by the test subjects. They had the possibility to tune two controller parameters with practical meaning: One was the "gain of the support" and the other was the "speed or quickness of the support". They tuned these parameters until the preferred behavior was reached. Although this seems a practically promising strategy, this heuristic controller design still impedes the general application of LISC, see heuristic controller design in Figure 3.4.

Therefore, to enable the general transferability of LISC, a systematic framework is essential. The design scheme is visualized in Figure 3.4. This systematic solution was proposed in two research publications [VIH21, VIH22]. The core notion is the following: The starting point is a designed FISC including all information of the shared control setup. Then, a more compact, substituting representation of the differential game model is sought, which enables the constructive calculation of the parameters of the CS. Finally, the feedback gains of the controller are computed such that it fulfills the high-level requirements.

**Figure 3.4:** The constructive design procedure of LISC is shown using four steps with the corresponding sections: 1) FISC design, 2) the class of potential games, 3) the idea of the cooperation state and 4) an optimization to compute the feedback gains of the LISC.

The subsequent elaborations provide this systematic design which consists of the following steps:


# **3.4 Summary of the Chapter**

This chapter introduces the concept of LISC and answers the first research question from Chapter 2 by introducing a novel representation of the shared control setup. The first section of this chapter presents the preliminaries on game theory and a systematic, model-based shared control design of [FOSH14a], which is referred to as FISC, because all the system states need to be available for the automation. However, this assumption does not always hold for practical applications. Therefore, the main contribution of this chapter is the proposition of LISC, which can overcome the problem of the limited information of the automation in shared control setups. The concept of LISC is formalized, which enables a broad usage of the concept. Then, the proposed LISC is analyzed for linear-quadratic games in detail, for which a mathematical definition of the CS is elaborated. The CS can model the mutual effort of human and automation in shared control setups. Finally, a systematic controller design is proposed with four design steps. The overview of this design procedure is followed by an in-depth discussion of the design steps in the next chapter.

# **4 Systematic Design of LISC: A Potential Game Approach**

After the presentation of the general concept of LISC, the steps of the proposed systematic design procedure are presented in detail for the LQ case in this chapter in order to answer the second research question. The first step formulates a FISC based on the procedure discussed in Section 3.1.3. In the second design step, the differential game is modeled by means of a potential game. Therefore, the chapter begins with the preliminaries on potential games. These preliminaries are followed by a discussion of the shortcomings and limitations of the existing classes of potential games, which motivates the necessity of an extension to enable their broader usage. Thereafter, Section 4.2 and Section 4.3 present two novel subclasses of potential games, which are the so-called *ordinal potential differential games* (OPDG) and *near potential differential games* (NPDG), respectively. The necessary and sufficient conditions for the existence of NPDGs and OPDGs are given. Furthermore, algorithms are developed, which can determine these novel subclasses of potential differential games. Section 4.5 begins with the constructive computation of the CS, which is based on the optimality principle of the control problem. Then, two methods are presented to compute the feedback gains of the proposed LISC. Finally, the stability analysis of LISC is discussed at the end of this chapter.

# **4.1 Preliminaries on Potential Games**

The following subsection presents the fundamentals of potential games, which are necessary to highlight the shortcomings of the existing subclasses. Furthermore, the notations essential for the presentation of the two novel subclasses in Section 4.2 and 4.3 are introduced.

# **4.1.1 General Concept of Potential Games**

The very first idea of a fictitious function replacing the original structure of a non-cooperative strategic game with $N$ players was given by *Rosenthal* [Ros73], in which the class of strategic games with (at least) one NE is identified. Based on this idea, the formal definitions of potential games were first introduced by *Monderer and Shapley* [MS96].

The general idea of potential games is presented visually in Figure 4.1: The original non-cooperative strategic game<sup>22</sup> with $N$ players and with their cost functions $J^{(i)}$, $i = 1 \dots N$, is replaced with one single potential function. This potential function provides a single mapping of the strategy space $\mathcal{U} = \mathcal{U}^{(1)} \times \dots \times \mathcal{U}^{(N)}$ of the original game to the real numbers

$$J^{(p)}: \mathcal{U} \to \mathbb{R},\tag{4.1}$$

instead of $N$ mappings of the combined strategy set of the players to the set of real numbers

$$J^{(i)}: \mathcal{U} \to \mathbb{R}, \forall i \in \mathcal{P}. \tag{4.2}$$

<sup>22</sup> Note it is assumed that the original game is given, therefore the term *given game* is used interchangeably to emphasize this property of the original game.

In other words, the potential function is a substituting model of the original game without losing essential information. Intuitively, it is easier to compute the NE of the original non-cooperative strategic game using (4.1) than (4.2). An important property of potential games is that they possess (at least) one NE. If in addition, the potential function is bounded and strictly concave, the NE is unique. These properties make them attractive for the analysis of strategic games [GH16], [LCS16, Section 2.2].

The optimum of this potential function corresponds to the NE of the original game, see Figure 4.1. Moreover, the decision dynamics of a potential static game converge<sup>23</sup> towards the NE of the game. However, it is not easy to find the potential function for a given original game (4.1) in certain applications. Potential games are a subclass of strategic games, thus not all strategic games can be modeled as potential games [LCS16, Chapter 2].

**Figure 4.1:** The general idea of potential games is illustrated, in which the original game is replaced by a (fictitious) potential function. The optimum of this potential function provides the NE of the original game.

There are different subclasses of potential static games: *exact, weighted, ordinal, generalized ordinal, best-response and pseudo* potential games. Not all of them are examined in this thesis; for more details on these subclasses, the reader is referred to [MS96, DHZ06, Voo00, GH16] or [LCS16, Chapter 2].

The class of potential games is receiving increasing attention in the research community. In recent years, they have been applied in various engineering applications, e. g. cooperative drone control [NTM<sup>+</sup>20, KYM21], mixed traffic intersection scenarios [FG20, LTF<sup>+</sup>22], power network management [ZMZ<sup>+</sup>21], vehicle routing control [BYM22] or cooperative learning methods [Tat18, SMK20].

<sup>23</sup> Note that the terms *dynamics* and *convergence* do not relate to the dynamics in the context of differential games. In case of static games, they mean the dynamics and convergence of the decision-making process, which leads to the NE of the game.

### **4.1.2 Exact Potential Static Games**

Exact potential static games are the most restrictive subclass of potential games. They are non-cooperative games that admit a potential function and fulfill Definition 4.1. The optimum of the potential function provides the NE of the original non-cooperative game.

#### **Definition 4.1 (Exact Potential Static Game)**

*The strategic static game* $\Gamma_{\text{es}}$ *according to Definition 3.1 is an exact potential static game if and only if there is a cost function* $J^{(p)}$ *such that*

$$\begin{split} J^{(i)}\left(u_1^{(i)},u_1^{(\neg i)}\right) - J^{(i)}\left(u_2^{(i)},u_1^{(\neg i)}\right) \\ = J^{(p)}\left(u_1^{(i)},u_1^{(\neg i)}\right) - J^{(p)}\left(u_2^{(i)},u_1^{(\neg i)}\right) \quad \forall i \in \mathcal{P} \end{split} \tag{4.3}$$

*holds, where* $J^{(i)}$ *are the cost functions of the players of the original game. The two different input strategies of player* $i$ *are* $u_1^{(i)}$ *and* $u_2^{(i)}$*. The vector* $u_1^{(\neg i)}$ *denotes the fixed inputs of all the other players.*

#### **Lemma 4.1 (Monderer and Shapley)**

*Let a strategic game* $\Gamma_{\text{es}}$ *be given. Furthermore, let an exact potential function* $J^{(p)}$ *according to Definition 4.1 be given. The equilibrium set* $\mathcal{J}$ *of the strategic game* $\Gamma_{\text{es}}(\mathcal{J})$ *corresponds with the equilibrium set of* $\Gamma_{\text{es}}(\mathcal{J}^{(p)})$*, where* $\mathcal{J}^{(p)} = \{J^{(p)}, \dots, J^{(p)}\}$ *denotes the strategy set of the potential functions.*

*The strategy set* $\left(u^{(i)\ast}, u^{(\neg i)\ast}\right)$ *is an equilibrium of* $\Gamma_{\text{es}}$ *if and only if*

$$J^{(p)}\left(u^{(i)\ast}, u^{(\neg i)\ast}\right) \le J^{(p)}\left(u^{(i)}, u^{(\neg i)\ast}\right) \tag{4.4}$$

*holds. Consequently, the optimum of* $J^{(p)}$ *corresponds to the NE of* $\Gamma_{\text{es}}$*.*

#### **Proof:**

See the proof of Theorem 2.2 in [LCS16].

To illustrate and explain the basic idea of potential games in a simple, comprehensible way, the example of the Prisoner's Dilemma is presented which can be characterized as an exact potential static game.

#### **Example 4.1:**

*In the Prisoner's Dilemma, two criminals are accused of committing a crime together. They will be questioned individually, without any communication. If they both deny the crime, they receive short imprisonments (each gets one year of imprisonment). If they both confess, they receive a higher penalty, but not the maximum penalty because of their confession (each gets three years of imprisonment). If only one of the two prisoners confesses, he will go unpunished as a witness and the other will receive the maximum penalty as a convicted but unconfessed offender (the witness gets no imprisonment, the offender gets four years of imprisonment).*

*This problem can be formulated as a non-cooperative strategic game* $\Gamma$*, which has*

• *the strategy space, in which* $u_1^{(i)}$ *denotes denying and* $u_2^{(i)}$ *denotes confessing,*

$$\mathcal{U} = \mathcal{U}^{(1)} \times \mathcal{U}^{(2)} = \left\{ \left(u_1^{(1)}, u_1^{(2)}\right), \left(u_2^{(1)}, u_1^{(2)}\right), \left(u_1^{(1)}, u_2^{(2)}\right), \left(u_2^{(1)}, u_2^{(2)}\right) \right\}$$

• *the cost functions of the players*

$$J^{(1)}: \mathcal{U} \to \{1, 0, 4, 3\} \tag{4.5}$$

$$J^{(2)}: \mathcal{U} \to \{1, 4, 0, 3\} \tag{4.6}$$

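*The game can also be formulated in matrix form, which provides the payoffs as a matrix for the combinations of the strategies. With the entries* $\left(J^{(1)}, J^{(2)}\right)$ *taken from* (4.5) *and* (4.6)*, it reads*

$$\begin{array}{c|cc} & u_1^{(2)} & u_2^{(2)} \\ \hline u_1^{(1)} & (1, 1) & (4, 0) \\ u_2^{(1)} & (0, 4) & (3, 3) \end{array}$$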


*As this strategic game is non-cooperative, it ends in an equilibrium, which is the NE of the game. From the viewpoint of player 1, it is rational to choose the confession* $\left(u_2^{(1)}\right)$ *because he gets a lower punishment independently of the choice of player 2: no imprisonment instead of one year if player 2 denies* $\left(u_1^{(2)}\right)$*, or three years instead of four years if player 2 confesses* $\left(u_2^{(2)}\right)$*. Due to the game's symmetry, player 2 has the same decision process. This means that in this non-cooperative game situation, both players choose to confess, leading to the equilibrium point* $\left(u_2^{(1)}, u_2^{(2)}\right)$*, which is the NE of the game.*

*It can be seen that finding the NE even in this simple game is not straightforward. Using the idea of potential games, a fictitious function can be introduced, which is a substituting model of the original game including all its characteristics. A possible potential function for the Prisoner's Dilemma can be defined such that the cost functions of the players* $J^{(1)}$ *and* $J^{(2)}$ *are replaced by*

$$J^{(p)}: \mathcal{U} \to \{2, 1, 1, 0\}. \tag{4.7}$$

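*This potential function in matrix form, with the entries taken from* (4.7)*, is given by*

$$\begin{array}{c|cc} & u_1^{(2)} & u_2^{(2)} \\ \hline u_1^{(1)} & 2 & 1 \\ u_2^{(1)} & 1 & 0 \end{array}$$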


*Clearly, this is an exact potential static game since*

$$J^{(1)}\left(u_1^{(1)}, u_1^{(2)}\right) - J^{(1)}\left(u_2^{(1)}, u_1^{(2)}\right) = J^{(p)}\left(u_1^{(1)}, u_1^{(2)}\right) - J^{(p)}\left(u_2^{(1)}, u_1^{(2)}\right) = 1 \tag{4.8a}$$

$$J^{(2)}\left(u_1^{(1)}, u_1^{(2)}\right) - J^{(2)}\left(u_1^{(1)}, u_2^{(2)}\right) = J^{(p)}\left(u_1^{(1)}, u_1^{(2)}\right) - J^{(p)}\left(u_1^{(1)}, u_2^{(2)}\right) = 1 \tag{4.8b}$$

*hold for both players. Note that the potential function* $J^{(p)}$ *is not unique: there are infinitely many potential functions which fulfill* (4.8)*, e. g.* $\tilde{J}^{(p)} = J^{(p)} + c$ *with an arbitrary constant* $c \in \mathbb{R}$*. With the help of* $J^{(p)}$*, the NE can be determined more easily: indeed, the minimum of* $J^{(p)}$ *corresponds to the NE of the original game.*
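The reasoning of Example 4.1 can be checked numerically. The following is a minimal sketch, assuming the cost values of (4.5)-(4.7) in the ordering of the strategy space given above; the function names and the index encoding (0 for denying, 1 for confessing) are illustrative choices and not part of the original formulation.

```python
from itertools import product

# Strategy indices: 0 = deny (u_1), 1 = confess (u_2); keys are (player 1, player 2).
J1 = {(0, 0): 1, (1, 0): 0, (0, 1): 4, (1, 1): 3}  # cost of player 1, cf. (4.5)
J2 = {(0, 0): 1, (1, 0): 4, (0, 1): 0, (1, 1): 3}  # cost of player 2, cf. (4.6)
Jp = {(0, 0): 2, (1, 0): 1, (0, 1): 1, (1, 1): 0}  # potential function, cf. (4.7)
COSTS = (J1, J2)
STRATEGIES = (0, 1)


def deviate(profile, player, new_strategy):
    """Return the profile in which only `player` switches its strategy."""
    changed = list(profile)
    changed[player] = new_strategy
    return tuple(changed)


def is_exact_potential(costs, potential):
    """Check condition (4.3) for every unilateral deviation of every player."""
    for profile in product(STRATEGIES, repeat=len(costs)):
        for player, cost in enumerate(costs):
            for s in STRATEGIES:
                other = deviate(profile, player, s)
                if cost[profile] - cost[other] != potential[profile] - potential[other]:
                    return False
    return True


def nash_equilibria(costs):
    """Profiles in which no player can lower its own cost unilaterally."""
    return [
        profile
        for profile in product(STRATEGIES, repeat=len(costs))
        if all(
            cost[profile] <= cost[deviate(profile, player, s)]
            for player, cost in enumerate(costs)
            for s in STRATEGIES
        )
    ]


# The minimizer of the single potential function replaces the two-sided NE search.
potential_minimizer = min(Jp, key=Jp.get)
```

In this sketch, the exhaustive NE search over both cost functions and the minimization of the single potential function return the same strategy combination, in which both players confess.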

An obvious extension to potential static games is the inclusion of underlying system dynamics. Dynamic games can describe numerous practical engineering applications, see e. g. [FFH17, ESP21b, NC22]. In literature, systems without time dependencies and with potential structure are also addressed, see e. g. [Mar12]. However, the application of this thesis considers systems whose underlying dynamics are time-dependent. Therefore, exact potential differential games are addressed in the next subsection.

### **4.1.3 Exact Potential Differential Game**

The concept of potential games is extended to differential games (cf. Definition 3.3), [DLLP09]. In [GH16, FH18], overviews of exact potential differential games are given. An exact potential differential game differs from the static case in such a way that the underlying system dynamics has to be taken into account and through the potential function, an optimal control problem is obtained. In the following and throughout this thesis, the LQ case is discussed. An exact potential differential game is defined as follows [FH18].

#### **Definition 4.2 (LQ Exact Potential Differential Games)**

*Let an LQ differential game* Γed *with system dynamics*

$$
\dot{x}(t) = \mathbf{A}x(t) + \sum\_{i \in \mathcal{P}} \mathbf{B}^{(i)} u^{(i)}(t), \tag{4.9}
$$

*be given as defined in* (3.12)*. Furthermore, let the quadratic cost functions* (3.14) *and Hamiltonian functions* (3.15) *of the players be given. Assume that the aggregated inputs of the players and the aggregated input matrices are defined such that*

$$u^{(p)}(t) = \left[{u^{(1)}}^{\top}(t), {u^{(2)}}^{\top}(t), \dots, {u^{(N)}}^{\top}(t)\right]^{\top} \tag{4.10}$$

*and*

$$\mathbf{B}^{(p)} = \left[ \mathbf{B}^{(1)}, \mathbf{B}^{(2)}, \dots, \mathbf{B}^{(N)} \right],$$

*respectively. Furthermore, consider an LQ optimal control problem over an infinite time horizon* $\tau_{\text{end}} \to \infty$ *with the cost function*

$$J^{(p)} = \frac{1}{2} \int_0^{\tau_{\text{end}}} x^\top(t) \mathbf{Q}^{(p)} x(t) + {u^{(p)}}^\top(t) \mathbf{R}^{(p)} u^{(p)}(t) \,\mathrm{d}t \tag{4.11}$$

*as well as the Hamilton function*

$$H^{(p)}(t) = \frac{1}{2}x^{\top}(t) \mathbf{Q}^{(p)}x(t) + \frac{1}{2}{u^{(p)}}^{\top}(t)\mathbf{R}^{(p)}u^{(p)}(t) + {\lambda^{(p)}}^{\top} f(t),\tag{4.12}$$

*where the matrices* $\mathbf{Q}^{(p)}$ *and* $\mathbf{R}^{(p)}$ *are positive semi-definite and positive definite, respectively. If*

$$\frac{\partial H^{(p)}(t)}{\partial u^{(i)}(t)} = \frac{\partial H^{(i)}(t)}{\partial u^{(i)}(t)}\tag{4.13}$$

*holds* $\forall i \in \mathcal{P}$*, the LQ differential game* $\Gamma_{\text{ed}}$ *is an LQ exact potential differential game, which has the potential function* $J^{(p)}$*.*

The necessary and sufficient condition of exact potential differential games is presented in the following for the general case.

#### **Lemma 4.2 (Fonseca-Morales and Hernández-Lerma [FH18])**

*Let a differential game* $\Gamma_{\text{ed}}$ *according to Definition 3.3 be given. Let the instantaneous cost* $h^{(i)}$ *of player* $i$ *be given, cf.* (3.4)*, and defined such that*

$$h^{(i)}\left(t,x,u^{(p)}\right) = x^{\top} \mathbf{Q}^{(p)} x + u^{(p)}{}^{\top} \mathbf{R}^{(p)} u^{(p)},$$

*and* τend → ∞ *holds. Assume that*

$$\mathbf{x} = \begin{bmatrix} x\_1, \, x\_2, \, \dots, \, x\_n \end{bmatrix}^\mathsf{T} \text{ and } \mathbf{u}^{(i)} = \begin{bmatrix} u\_1^{(i)}, \, u\_2^{(i)}, \, \dots, \, u\_n^{(i)} \end{bmatrix}, \,\forall i \in \mathcal{P}.$$

*Furthermore, assume that one of the following conditions (i)-(iii) holds* $\forall i \in \mathcal{P}$*:*

• *(i) The instantaneous cost of player* $i$ *fulfills*

$$\frac{\partial h^{(1)}}{\partial x_k} = \frac{\partial h^{(2)}}{\partial x_k} = \dots = \frac{\partial h^{(i)}}{\partial x_k} = \dots = \frac{\partial h^{(n)}}{\partial x_k} \quad \forall k = 1, \dots, n. \tag{4.14}$$

• *(ii) The instantaneous cost of player* $i$ *fulfills*

$$\frac{\partial h^{(i)}}{\partial x_1} = \frac{\partial h^{(i)}}{\partial x_k} \quad \forall k = 1, \dots, n, \tag{4.15}$$

*and the system dynamics is decoupled such that*

$$\mathbf{f} = \left[ f^{(1)} \left( x_1, u^{(1)} \right), f^{(2)} \left( x_2, u^{(2)} \right), \dots, f^{(n)} \left( x_n, u^{(n)} \right) \right]^\top,$$

*where* $f^{(i)}$ *depends only on* $x_i$ *and* $u^{(i)}$ $\forall i \in \mathcal{P}$ *and*

$$\frac{\partial f^{(1)}}{\partial x_1} = \frac{\partial f^{(k)}}{\partial x_k}, \quad \forall k = 1, \dots, n. \tag{4.16}$$

• *(iii) The instantaneous cost* $h^{(i)}$ *of player* $i$ *depends on* $x_i$ *only, instead of the full state vector* $x$*,*

$$h^{(i)}\left(t,x,u^{(p)}\right) = \tilde{h}^{(i)}\left(t,x_i,u^{(p)}\right),$$

*and*

$$\frac{\partial f^{(1)}}{\partial x_1} = \frac{\partial f^{(k)}}{\partial x_k}, \quad \forall k = 1, \dots, n \tag{4.17}$$

*hold.*

*If, in addition to Assumptions (i), (ii) or (iii), there is a function* $J^{(p)} = \int_0^{\infty} h^{(p)}\left(t,x,u^{(p)}\right) \mathrm{d}t$ *such that*

$$\frac{\partial h^{(p)}}{\partial u^{(i)}} = \frac{\partial h^{(i)}}{\partial u^{(i)}} \quad \forall i \in \mathcal{P} \tag{4.18}$$

*and*

$$\frac{\partial h^{(p)}}{\partial x_i} = \frac{\partial h^{(i)}}{\partial x_i}, \quad \forall i \in \mathcal{P}, \tag{4.19}$$

*then the game* $\Gamma_{\text{ed}}$ *is an exact potential differential game with the potential function* $J^{(p)}$*.*

#### **Proof:**

See the proof of Lemma 1 in [FH18].

Since a quadratic potential function $J^{(p)}$, cf. (4.11), with linear underlying system dynamics (4.9) is assumed, the linear optimal control law

$$\mathbf{u}^{(p)}(t) = -\mathbf{K}^{(p)}\mathbf{x}(t)\tag{4.20}$$

is computed, which is associated with the NE of the original game. The feedback gain is computed by

$$\mathbf{K}^{(p)} = {\mathbf{R}^{(p)}}^{-1} {\mathbf{B}^{(p)}}^{\top} \mathbf{P}^{(p)},$$

where $\mathbf{P}^{(p)}$ is the solution of the algebraic Riccati equation

$$\mathbf{0} = \mathbf{A}^{\top}\mathbf{P}^{(p)} + \mathbf{P}^{(p)}\mathbf{A} + \mathbf{Q}^{(p)} - \mathbf{P}^{(p)}\mathbf{B}^{(p)}{\mathbf{R}^{(p)}}^{-1}{\mathbf{B}^{(p)}}^{\top}\mathbf{P}^{(p)}.\tag{4.21}$$
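For a scalar state and a diagonal $\mathbf{R}^{(p)}$, the Riccati equation (4.21) reduces to a quadratic equation in $p$ with a closed-form stabilizing root. The following is a minimal sketch of this special case only; the function name and all numerical values are hypothetical and serve purely as an illustration. For matrix-valued problems, a numerical Riccati solver would be required instead.

```python
import math


def scalar_potential_gain(a, b, r, q):
    """Solve the scalar-state version of (4.21) and return (p, K).

    a: system coefficient, b: input gains (b_1, ..., b_N),
    r: diagonal entries of R^(p) (all > 0), q: state weight (> 0).
    """
    s = sum(bi * bi / ri for bi, ri in zip(b, r))   # B R^-1 B^T for a diagonal R
    p = (a + math.sqrt(a * a + q * s)) / s          # positive (stabilizing) root
    gains = [bi * p / ri for bi, ri in zip(b, r)]   # K = R^-1 B^T p, cf. (4.20)
    return p, gains


# Hypothetical two-player example with one state.
a, b, r, q = 1.0, (1.0, 2.0), (1.0, 4.0), 3.0
p, K = scalar_potential_gain(a, b, r, q)

# Residual of 0 = 2 a p + q - p^2 * sum(b_i^2 / r_i), the scalar form of (4.21).
residual = 2 * a * p + q - p * p * sum(bi * bi / ri for bi, ri in zip(b, r))

# Closed-loop coefficient a - B K; a negative value means a stable closed loop.
closed_loop = a - sum(bi * ki for bi, ki in zip(b, K))
```

The positive root is selected because it is the one that renders the closed loop $a - \mathbf{B}^{(p)}\mathbf{K}^{(p)}$ stable in this scalar setting.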

The necessary and sufficient conditions for the existence of a potential function in accordance with Definition 4.2 hold for problems with special structure only, see Assumptions (i)-(iii). Therefore, the next subsection presents a less restrictive subclass of potential games.

### **4.1.4 Ordinal Potential Static Games**

Ordinal potential static games were first introduced for static problems in literature [MS96], and their characteristics were studied in further works [VN97, Kuk99]. An extension to differential games does not exist in literature. To illustrate the limitations of exact potential static games, the following example presents a modified version of the Prisoner's Dilemma, cf. Example 4.1, with asymmetric payoffs.

**Example 4.2:**

*Let Example 4.1 have a modified payoff matrix such that*


*The NE of this game is also* $\left(u_2^{(1)}, u_2^{(2)}\right)$*. A potential function for this modified Prisoner's Dilemma is:*


*whose minimum corresponds with the NE of the original game. However, Definition 4.1 does not hold, meaning that the game is not an exact potential static game. Still, the following holds*

$$J^{(1)}\left(u_1^{(1)}, u_1^{(2)}\right) - J^{(1)}\left(u_2^{(1)}, u_1^{(2)}\right) > 0 \iff J^{(p)}\left(u_1^{(1)}, u_1^{(2)}\right) - J^{(p)}\left(u_2^{(1)}, u_1^{(2)}\right) > 0 \tag{4.22a}$$

*and*

$$J^{(2)}\left(u_1^{(1)}, u_1^{(2)}\right) - J^{(2)}\left(u_1^{(1)}, u_2^{(2)}\right) > 0 \iff J^{(p)}\left(u_1^{(1)}, u_1^{(2)}\right) - J^{(p)}\left(u_1^{(1)}, u_2^{(2)}\right) > 0 \tag{4.22b}$$

*which enables the computation of the NE by minimizing the same potential function as introduced before. Hence, even in the given situation, where the strict condition on exact potential static games is not satisfied, the original game can still be replaced by a potential function.*

Using the idea of Example 4.2, the exactness requirement of Definition 4.1 is relaxed and a less restrictive subclass, ordinal potential static games, is defined, which was first introduced in [MS96].

#### **Definition 4.3 (Ordinal Potential Static Game [MS96])**

*The strategic static game* $\Gamma_{\text{os}}$ *according to Definition 3.1 is an ordinal potential static game if a potential function* $J^{(p)}$ *exists such that*

$$\begin{aligned} \text{sgn}\left(J^{(i)}\left(u_1^{(i)}, u_1^{(\neg i)}\right) - J^{(i)}\left(u_2^{(i)}, u_1^{(\neg i)}\right)\right) &= \\ \text{sgn}\left(J^{(p)}\left(u_1^{(i)}, u_1^{(\neg i)}\right) - J^{(p)}\left(u_2^{(i)}, u_1^{(\neg i)}\right)\right) \quad \forall i \in \mathcal{P} \end{aligned} \tag{4.23}$$

*holds, where* $J^{(i)}$ *are the cost functions of the players* $i \in \mathcal{P}$ *of the original game.*

*For games with continuous strategy sets (if* $J^{(i)}$ *is continuous in* $u^{(i)}$*), the definition can be reformulated to*

$$\text{sgn}\left(\frac{\partial J^{(i)}}{\partial u^{(i)}}\right) = \text{sgn}\left(\frac{\partial J^{(p)}}{\partial u^{(i)}}\right), \forall i \in \mathcal{P}.\tag{4.24}$$

#### **Note:**

Definition 4.3 is formulated for scalar inputs only. In literature, ordinal potential static games were only formulated for problems with scalar inputs. For more details and examples see [NP01, DHQS08, HS19, Ewe20].

The formal condition for the existence of an ordinal potential static game and its proof are given by Lemma 2.1 in [MS96].
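The difference between Definitions 4.1 and 4.3 can be illustrated with a finite game. The following is a minimal sketch with hypothetical asymmetric costs, which are an assumption made for illustration and are not the payoff matrix of Example 4.2. The candidate potential function is the one from (4.7); the function names are illustrative.

```python
from itertools import product

STRATEGIES = (0, 1)  # 0 = deny (u_1), 1 = confess (u_2)

# Hypothetical asymmetric costs, keys are (player 1, player 2).
# These values are assumptions and NOT the payoff matrix of Example 4.2.
J1 = {(0, 0): 1, (1, 0): 0, (0, 1): 6, (1, 1): 3}
J2 = {(0, 0): 1, (1, 0): 4, (0, 1): 0, (1, 1): 3}
Jp = {(0, 0): 2, (1, 0): 1, (0, 1): 1, (1, 1): 0}  # candidate potential, cf. (4.7)


def sgn(value):
    """Sign function returning -1, 0 or 1."""
    return (value > 0) - (value < 0)


def unilateral_differences(costs, potential):
    """Yield (cost difference, potential difference) for each unilateral deviation."""
    for profile in product(STRATEGIES, repeat=len(costs)):
        for player, cost in enumerate(costs):
            for s in STRATEGIES:
                other = list(profile)
                other[player] = s
                other = tuple(other)
                yield cost[profile] - cost[other], potential[profile] - potential[other]


def is_ordinal_potential(costs, potential):
    """Sign condition (4.23): only the direction of each change has to agree."""
    return all(sgn(dc) == sgn(dp) for dc, dp in unilateral_differences(costs, potential))


def is_exact_potential(costs, potential):
    """Equality condition (4.3): the size of each change has to agree as well."""
    return all(dc == dp for dc, dp in unilateral_differences(costs, potential))
```

For these assumed costs, the candidate function satisfies the sign condition but not the equality condition, and its minimizer is still the strategy combination in which both players confess.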

### **4.1.5 Discussion on State-of-the-Art Potential Games**

As presented in Section 4.1.1, potential games have appealing properties: A potential game is a more compact representation of a given original game that has (at least) one NE. In addition, this NE is unique for a convex and bounded potential function. These properties make the use of potential static games attractive [LCS16, Chapter 2]. However, in the case of differential games, the use of exact potential differential games is limited to systems with special structure, see Assumptions (i)-(iii) in Lemma 4.2, which are necessary to fulfill the equality condition (4.13). Consequently, exact potential differential games can only model games in which the system structure or the utility functions of the players satisfy Assumptions (i)-(iii) in Lemma 4.2. This fact is substantiated by state-of-the-art examples: For instance, restrictions on the relationship between the system states and the inputs of the players are assumed in Definition 8 and Proposition 9 in [DLLP09]. A further illustration is Example 4 in [GH16], which shows that exact potential differential games can solve only such problems in which the cost functions of the players are identical or have a special structure. An example of such a special structure can be found in [FH18], where it is shown that if the players have quadratic cost functions (3.14) with the penalty matrices

$$\mathbf{Q}^{(i)} = \begin{bmatrix} a & a_i \\ c - a_i & b \end{bmatrix} \text{ and } \mathbf{R}^{(i)} = \begin{bmatrix} r_{i1} & 0 \\ 0 & r_{i2} \end{bmatrix}, \ i \in \{1, 2\},$$

then the game is an exact potential differential game, since

$$x^\top \mathbf{Q}^{(1)} x = x^\top \mathbf{Q}^{(2)} x$$

holds, meaning that Assumption (i) in Lemma 4.2 is fulfilled. Furthermore, in the problems from literature, it is assumed that the number of inputs is equal to the number of state variables. On the other hand, Example 4.2 in [DLLP09] also illustrates that not all games can be modeled by means of exact potential differential games. Thus, as the examples from literature illustrate, condition (4.13) hinders the general use of exact potential differential games. Therefore, novel subclasses are necessary to enable a broader application.

The core notion is that the subclass of ordinal potential static games is less restrictive than that of exact potential static games; thus, an extension of ordinal potential static games to differential games could probably model more practical applications. However, current literature does not include such an extension of ordinal potential static games to differential games. To this end, the subclass of OPDGs for LQ problems is introduced in the next section.

# **4.2 Ordinal Potential Differential Games**

This section introduces the novel subclass of OPDGs. First, the formal definition of an OPDG is given, which is followed by the necessary and sufficient conditions for the existence of an OPDG. In the second part of the section, the computation of a potential game for the original game is presented: For a given differential game the ordinal potential function is identified.

### **4.2.1 Definition of OPDG**

For the subsequent analysis of OPDGs, the following is assumed:

**Assumption 4.3.1** The game is restricted to two players with scalar inputs, and the weighting matrices in their cost functions (3.14) are

$$\mathbf{Q}^{(i)} = \text{diag}\left(q_1^{(i)}, q_2^{(i)}, \dots, q_n^{(i)}\right) \text{ and }$$

$$\mathbf{R}^{(i)} = \text{diag}\left(r_1^{(i)}, r_2^{(i)}\right), \quad i \in \mathcal{P} = \{1, 2\}.$$

**Assumption 4.3.2** A quadratic potential function is assumed, cf. (4.12), in which the weighting matrices are symmetric

$$\mathbf{Q}^{(p)} = \begin{bmatrix} q\_{11}^{(p)} & \cdots & q\_{1n}^{(p)} \\ \vdots & \ddots & \vdots \\ q\_{1n}^{(p)} & \cdots & q\_{nn}^{(p)} \end{bmatrix} \text{ and }$$

$$\mathbf{R}^{(p)} = \begin{bmatrix} r\_{11}^{(p)} & r\_{21}^{(p)} \\ r\_{21}^{(p)} & r\_{22}^{(p)} \end{bmatrix}.$$

#### **Definition 4.4 (LQ Ordinal Potential Differential Games)**

*Let an LQ differential game* $\Gamma_{\text{od}}$ *with system dynamics* (4.9) *be given. Furthermore, let the quadratic cost functions* (3.14) *and Hamilton functions* (3.15) *of the players be given. Let Assumptions 4.3.1-2 hold. Let* $J^{(p)}$ *according to* (4.11) *and the Hamiltonian* $H^{(p)}$ *according to* (4.12) *be given. If*

$$\text{sgn}\left(\frac{\partial H^{(p)}(t)}{\partial u^{(i)}(t)}\right) = \text{sgn}\left(\frac{\partial H^{(i)}(t)}{\partial u^{(i)}(t)}\right) \tag{4.25}$$

*holds* $\forall i \in \mathcal{P}$*, the LQ differential game* $\Gamma_{\text{od}}$ *is defined as an LQ ordinal potential differential game, which has the ordinal potential function* $J^{(p)}$*.*

#### **Note:**

Under Assumptions 4.3.1-2, $\frac{\partial H^{(p)}(t)}{\partial u^{(i)}(t)}$ and $\frac{\partial H^{(i)}(t)}{\partial u^{(i)}(t)}$ are scalars.

### **4.2.2 Necessary and Sufficient Existence Condition of OPDG**

#### **Lemma 4.3 (Necessary and Sufficient Condition of an OPDG)**

*Let an LQ differential game* $\Gamma_{\text{od}}$ *described by Definition 3.3 be given. Furthermore, let Assumptions 4.3.1-2 hold.*

*If, for the game* $\Gamma_{\text{od}}$*,*

$$\left({\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(p)}x(t)\right)\left({\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(i)}x(t)\right) \ge 0\tag{4.26}$$

*holds* $\forall i \in \mathcal{P}$ *and* $\forall x(t)$*, then* $\Gamma_{\text{od}}$ *is an LQ OPDG and can be represented through a potential function as described by Definition 4.4.*

#### **Proof:**

For the proof, Definition 4.4 is used:

$$\operatorname{sgn}\left(\frac{\partial H^{(p)}(t)}{\partial u^{(i)}(t)}\right) = \operatorname{sgn}\left(\frac{\partial H^{(i)}(t)}{\partial u^{(i)}(t)}\right) \forall i \in \mathcal{P}.\tag{4.27}$$

Assuming a quadratic potential function and linear system dynamics, cf. Definition 4.2,

$$\frac{\partial H^{(p)}(t)}{\partial u^{(p)}(t)} = \mathbf{R}^{(p)} u^{(p)}(t) + {\mathbf{B}^{(p)}}^{\top} \lambda^{(p)}(t) \tag{4.28}$$

holds, where $\lambda^{(p)}(t) = \mathbf{P}^{(p)}x(t)$ can be applied, with $\mathbf{P}^{(p)}$ obtained from the solution of (4.21). The control law of the potential game is also obtained from the solution of (4.21), but a small perturbation of the optimal solution is applied in the following. The reason is that an optimal control law means that

$$\frac{\partial H^{(p)}(t)}{\partial u^{(p)}(t)} = 0,$$

which is not suitable for the analysis of the existence of a potential game, since in that case (4.27) yields $0 = 0$. The perturbed optimal control law of the potential game is

$$u^{(p)}(t) = -\left(1 + \varepsilon_c^{(p)}(x)\right) {\mathbf{R}^{(p)}}^{-1} {\mathbf{B}^{(p)}}^{\top} \mathbf{P}^{(p)} x(t),\tag{4.29}$$

in which $0 < \varepsilon_c^{(p)}(x) \ll 1$ is an arbitrarily small scalar variation function. With $\varepsilon_c^{(p)}(x) \to 0$, the optimal control law is obtained. Substituting (4.29) in (4.28) gives

$$\frac{\partial H^{(p)}(t)}{\partial u^{(p)}(t)} = -\mathbf{R}^{(p)} \left(1 + \varepsilon_c^{(p)}(x)\right) {\mathbf{R}^{(p)}}^{-1} {\mathbf{B}^{(p)}}^{\top} \mathbf{P}^{(p)} x(t) + {\mathbf{B}^{(p)}}^{\top} \mathbf{P}^{(p)} x(t), \tag{4.30}$$

in which $\mathbf{R}^{(p)}$ cancels and which, due to (4.10), can be rewritten as

$$\frac{\partial H^{(p)}(t)}{\partial u^{(i)}(t)} = -\varepsilon_c^{(p)}(x) {\mathbf{B}^{(i)}}^{\top} \mathbf{P}^{(p)} x(t), \quad \forall i \in \mathcal{P}. \tag{4.31}$$

For the players of the original game, the derivatives of the Hamiltonians are expressed as

$$\frac{\partial H^{(i)}(t)}{\partial u^{(i)}(t)} = \mathbf{R}^{(i)} u^{(i)}(t) + {\mathbf{B}^{(i)}}^{\top} \lambda^{(i)}(t). \tag{4.32}$$

Furthermore, $\lambda^{(i)}(t) = \mathbf{P}^{(i)}x(t)$ can be substituted into (4.32). Analogously to the potential function calculation, a small perturbation of the control law is applied for each player of the original game. These inputs are computed by

$$u^{(i)}(t) = -\left(1 + \varepsilon_c^{(i)}(x)\right) {\mathbf{R}^{(i)}}^{-1} {\mathbf{B}^{(i)}}^{\top} \mathbf{P}^{(i)} x(t),\tag{4.33}$$

where

$$0 < \varepsilon_c^{(i)}(x) \ll 1, \quad \forall i \in \mathcal{P}$$

are arbitrarily small scalar variation functions. The control law (4.33) yields the behavior of the players around the optimal solution. Substituting (4.33) in (4.32) gives

$$\frac{\partial H^{(i)}(t)}{\partial u^{(i)}(t)} = -\mathbf{R}^{(i)} \left(1 + \varepsilon_c^{(i)}(x)\right) {\mathbf{R}^{(i)}}^{-1} {\mathbf{B}^{(i)}}^{\top} \mathbf{P}^{(i)} x(t) + {\mathbf{B}^{(i)}}^{\top} \mathbf{P}^{(i)} x(t), \tag{4.34}$$

which can be simplified to

$$\frac{\partial H^{(i)}(t)}{\partial u^{(i)}(t)} = -\varepsilon_c^{(i)}(x) {\mathbf{B}^{(i)}}^{\top} \mathbf{P}^{(i)} x(t). \tag{4.35}$$

Substituting (4.31) and (4.35) in (4.27) yields

$$\operatorname{sgn}\left(\varepsilon_c^{(p)}(x){\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(p)}x(t)\right) = \operatorname{sgn}\left(\varepsilon_c^{(i)}(x){\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(i)}x(t)\right), \quad \forall i \in \mathcal{P},\tag{4.36}$$

which can be rewritten to

$$\operatorname{sgn}\left(\varepsilon_c^{(p)}(x)\right)\operatorname{sgn}\left({\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(p)}x(t)\right) = \operatorname{sgn}\left(\varepsilon_c^{(i)}(x)\right)\operatorname{sgn}\left({\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(i)}x(t)\right), \quad \forall i \in \mathcal{P}.\tag{4.37}$$

As both

$$\begin{aligned} \text{sgn}\left(\varepsilon_c^{(p)}(x)\right) &> 0, \\ \text{sgn}\left(\varepsilon_c^{(i)}(x)\right) &> 0, \quad \forall i \in \mathcal{P}, \end{aligned}$$

hold,

$$\text{sgn}\left({\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(p)}x(t)\right) = \text{sgn}\left({\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(i)}x(t)\right), \quad \forall i \in \mathcal{P} \tag{4.38}$$

is obtained.

Since both terms ${\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(p)}x(t)$ and ${\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(i)}x(t)$ are scalars, the equality of the two sign functions can be reformulated as a product such that

$$\left({\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(p)}x(t)\right)\cdot\left({\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(i)}x(t)\right)\geq 0,\quad \forall i\in\mathcal{P},\tag{4.39}$$

which proves the lemma.
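Condition (4.39) can be evaluated numerically on sampled states. The following is a minimal sketch for two states and two players with scalar inputs; all matrices are hypothetical values chosen for illustration, and the function names are not taken from the original work. A check on a finite number of samples can only falsify the condition; it does not prove that the condition holds $\forall x(t)$.

```python
import random


def bt_p_x(b, P, x):
    """Scalar b^T P x for a column b, a square matrix P and a state x (lists)."""
    Px = [sum(P[r][c] * x[c] for c in range(len(x))) for r in range(len(P))]
    return sum(b[r] * Px[r] for r in range(len(b)))


def satisfies_sign_condition(B, P_players, P_pot, samples):
    """True if (B_i^T P_pot x)(B_i^T P_i x) >= 0 for all players and sampled x."""
    return all(
        bt_p_x(b, P_pot, x) * bt_p_x(b, P_i, x) >= 0.0
        for x in samples
        for b, P_i in zip(B, P_players)
    )


# Hypothetical data: input vectors B^(1), B^(2) and Riccati solutions.
B = [[1.0, 0.0], [0.0, 1.0]]
P_pot = [[2.0, 0.5], [0.5, 1.0]]
P_ok = [[[4.0, 1.0], [1.0, 3.0]],      # first row equals 2x the first row of P_pot
        [[5.0, 1.0], [1.0, 2.0]]]      # second row equals 2x the second row of P_pot
P_bad = [[[2.0, -0.5], [-0.5, 1.0]], P_ok[1]]  # player 1 violates the condition

rng = random.Random(0)
samples = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(1000)]
samples.append([0.0, 1.0])  # a state for which P_bad yields a negative product
```

In this sketch, the first set of matrices passes the sampled test because the relevant rows are positive multiples of each other, whereas the second set fails for the state $[0, 1]^{\top}$.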

#### **Remark:**

The limitation of Lemma 4.3 is that it provides a non-constructive condition for the existence of OPDGs. However, this condition is still beneficial for the LISC design since a substituting model of a differential game is found. In the case of LQ differential games, the challenge is not the computation of the NE<sup>24</sup> but finding a replacement of the given original game which still retains all its necessary details. Consequently, Lemma 4.3 proves that the obtained OPDG can be used as a substituting model for the constructive LISC design without loss of crucial information.

### **4.2.3 Two Methods for Computing an OPDG**

The previous subsection answered the question of under what conditions an OPDG exists. This subsection provides computation methods which yield the potential function for a given differential game using the conditions derived in Section 4.2.2. In the LQ case, finding the potential function for a given differential game reduces to determining the parameters $\mathbf{Q}^{(p)}$ and $\mathbf{R}^{(p)}$ of the potential function $J^{(p)}$.

#### **Computation by means of the Input Trajectories**

The first method uses the deviation of the input trajectories $u^{(p)}$ of an ordinal potential differential game from the NE of the given original differential game. The deviation is defined as

$$e\_u(t) = u^{(p)}(t, x(t), \mathbf{Q}^{(p)}, \mathbf{R}^{(p)}) - u^\*(t, x(t))\tag{4.40}$$

where $u^{\ast} = [u^{(1)\ast}, u^{(2)\ast}]$ denotes the concatenated optimal inputs of the players in the NE of the original differential game. It is assumed that $\mathbf{Q}^{(i)}$, $\mathbf{R}^{(i)}$ and $u^{(i)}$, $\forall i \in \mathcal{P}$, are given. The goal is to find the quadratic potential function $J^{(p)}$ as given in (4.11). With the help of the known system dynamics (4.9), $u^{(p)}$ is obtained from the solution of the Riccati equation (4.21). Then, the deviation (4.40) is minimized. Such an optimization can be solved with a sequential quadratic programming (SQP) algorithm [NW00, Chapter 14, 18], which stops when the error is smaller than a predefined optimization threshold $\epsilon_{\text{optim}}$. The parameters of the potential function are collected in the parameter vector

$$\boldsymbol{\theta}^{(p)} = \left[ \text{vech}(\mathbf{Q}^{(p)}), \text{vech}(\mathbf{R}^{(p)}) \right],$$

where vech is the half-vectorization of a matrix, which is defined as

$$\text{vech}(\mathbf{A}) = \left[ a_{11}, a_{21}, \dots, a_{n1}, a_{22}, a_{32}, \dots, a_{n2}, \dots, a_{n-1,n-1}, a_{n,n-1}, a_{n,n} \right] \in \mathbb{R}^{1 \times \frac{1}{2}n(n+1)}.$$

<sup>24</sup> In literature, one of the reasons for using potential static games is that the computation of the static NE can be complex. Thus, the use of potential games can reduce the complexity of such static games.
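The half-vectorization and the resulting parameter vector can be sketched as follows. The helper `unvech` and the numerical values of the weighting matrices are assumptions added for illustration; they show how an optimizer that operates on $\boldsymbol{\theta}^{(p)}$ could recover the symmetric matrices again.

```python
def vech(matrix):
    """Half-vectorization: stack the lower-triangular part column by column."""
    n = len(matrix)
    return [matrix[row][col] for col in range(n) for row in range(col, n)]


def unvech(values, n):
    """Rebuild the symmetric n x n matrix from its half-vectorization."""
    matrix = [[0.0] * n for _ in range(n)]
    it = iter(values)
    for col in range(n):
        for row in range(col, n):
            matrix[row][col] = matrix[col][row] = next(it)
    return matrix


# Hypothetical symmetric weighting matrices of a potential function.
Q_p = [[2.0, 0.5], [0.5, 1.0]]
R_p = [[1.0, 0.1], [0.1, 3.0]]

# Parameter vector theta^(p) = [vech(Q^(p)), vech(R^(p))].
theta_p = vech(Q_p) + vech(R_p)
```

Because both matrices are symmetric by Assumption 4.3.2, the half-vectorization contains every free parameter exactly once, which avoids redundant optimization variables.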

The optimization of (4.40) starts with the initial value $\boldsymbol{\theta}_0^{(p)}$ and is carried out by the following optimization problem:

$$\hat{\mathbf{Q}}^{(p)}, \hat{\mathbf{R}}^{(p)}, \hat{\mathbf{P}}^{(p)} = \underset{\mathbf{Q}^{(p)}, \mathbf{R}^{(p)}, \mathbf{P}^{(p)}}{\operatorname{argmin}} \int_{0}^{\tau_{\text{end}}} \left| e_{u}(t) \right|^{2} \,\mathrm{d}t \tag{4.41a}$$

$$\text{s.t. } \mathbf{A}^{\top} \mathbf{P}^{(p)} + \mathbf{P}^{(p)} \mathbf{A} + \mathbf{Q}^{(p)} - \mathbf{P}^{(p)} \mathbf{B}^{(p)} {\mathbf{R}^{(p)}}^{-1} {\mathbf{B}^{(p)}}^{\top} \mathbf{P}^{(p)} = \mathbf{0},\tag{4.41b}$$

$$\left({\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(p)}x(t)\right)\cdot\left({\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(i)}x(t)\right)\geq 0,\quad \forall i\in\mathcal{P}.\tag{4.41c}$$

The minimization of (4.41a) ensures that $u^{(p)} = [u^{(1)}, u^{(2)}]$ holds<sup>25</sup>. Constraint (4.41b) ensures the minimization of the potential function $J^{(p)}$, meaning that $u^{(p)}$ is provided by the LQ optimization of (4.11). Constraint (4.41c) guarantees the necessary and sufficient condition of an OPDG. The procedure to compute the potential function is given in Algorithm 1.

#### **Algorithm 1** Input-Trajectory-Dependent Computation of an OPDG

```
Input: θ_0^(p), A, B^(i), Q^(i), R^(i), u^(i) ∀i ∈ P
Output: Q^(p), R^(p)
i ← 1
θ_1^(p) ← θ_0^(p)
while e_u > ϵ_optim do
   compute P^(p) from (4.41b) and θ_i^(p)
   verify (4.41c)
   u^(p) = −(R^(p))^−1 (B^(p))^T P^(p) x
   compute e_u from (4.40)
   i = i + 1
   update θ_i^(p) with SQP
end while
```
One drawback of this method is that it depends on the trajectories x of the game, which makes the algorithm slow for games with a longer time horizon. Therefore, an alternative approach is presented in [VHH23], in which the potential function is directly computed from the cost functions of the original game.

#### **Trajectory-free Optimization**

The computation method presented in this section is referred to as *trajectory-free optimization*. The trajectory-free optimization identifies an LQ OPDG by solving a problem constructed as a linear matrix inequality (LMI) problem based on the idea from [PCC<sup>+</sup>15].

<sup>25</sup> Note that, interestingly, $\int \left| e_u(t) \right|^2 \mathrm{d}t$ itself can also be interpreted as a potential function; thus, its minimum yields the NE of the original game.

According to Lemma 4.3,

$$\left({\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(p)}x(t)\right)\cdot\left({\mathbf{B}^{(i)}}^{\top}\mathbf{P}^{(i)}x(t)\right)\geq 0,\quad \forall i\in\mathcal{P},\tag{4.42}$$

must hold for an OPDG. To evaluate (4.42), the trajectories $x$ of the game are required, so an optimization based directly on (4.42) is not trajectory-free. Therefore, the elimination of this trajectory dependency is discussed, enabling a faster determination of the potential function. First, the following notation is introduced:

$$\boldsymbol{v}^{(p)} := {\mathbf{B}^{(i)}}^{\top} \mathbf{P}^{(p)} \text{ and } \boldsymbol{v}^{(i)} := {\mathbf{B}^{(i)}}^{\top} \mathbf{P}^{(i)}, \quad \forall i \in \mathcal{P}.$$

Thus, (4.42) can be rewritten as

$$\left(\boldsymbol{v}^{(p)}x(t)\right)\cdot\left(\boldsymbol{v}^{(i)}x(t)\right)\geq 0,\quad \forall i\in\mathcal{P}.\tag{4.43}$$

In order to drop $x$, (4.43) must hold $\forall x$, which means that $\boldsymbol{v}^{(p)}$ and $\boldsymbol{v}^{(i)}$ should point in the same direction, or $\boldsymbol{v}^{(p)}$ and $\boldsymbol{v}^{(i)}$ have to be parallel to $x$, such that (4.43) holds. This way, both terms in (4.43) have the same sign regardless of the actual $x$, which leads to the new condition

$$
\omega^{(i)}\boldsymbol{v}^{(p)} - \boldsymbol{v}^{(i)} = \mathbf{0},\tag{4.44}
$$

where ω<sup>(i)</sup> > 0 is a scaling factor; see the three-dimensional example in Figure 4.2, where the scaling factor is a single scalar. Whether ω<sup>(i)</sup> is a matrix or a scalar depends on the ratio of the dimensions of the system states and inputs. Figure 4.2 shows the system state vector x at two different time instants t<sub>1</sub> and t<sub>2</sub>. Since v<sup>(p)</sup> and v<sup>(i)</sup> point in the same direction, condition (4.44) is fulfilled for all time instants. Intuitively, if x has a lower dimension than v<sup>(p)</sup> and v<sup>(i)</sup>, there are more possible combinations of v<sup>(p)</sup> and v<sup>(i)</sup> that fulfill condition (4.44). For instance, if x is scalar and coincides with e<sub>1</sub>, then any arbitrary vectors v<sup>(p)</sup> and v<sup>(i)</sup> in the e<sub>2</sub>-e<sub>3</sub> plane fulfill condition (4.44).
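The sign argument behind (4.44) can be checked numerically. The following minimal sketch assumes scalar inputs, so that v<sup>(p)</sup> and v<sup>(i)</sup> are row vectors; the numerical values of `v_p`, `omega` and `v_bad` are hypothetical and serve only as an illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical row vector v^(p) = B^(i)T P^(p) for a three-state system
v_p = np.array([0.8, -0.3, 1.5])
omega = 2.5                       # hypothetical positive scaling factor
v_i = omega * v_p                 # positive multiple, as required by (4.44)

# Evaluate the product in (4.43) for many random states x
states = rng.normal(size=(1000, 3))
products = (states @ v_p) * (states @ v_i)

# With v_i = omega * v_p the product equals omega * (v_p x)^2 >= 0
all_nonnegative = bool(np.all(products >= 0.0))

# A vector that is not a positive multiple of v_p violates (4.43) for some x
v_bad = np.array([0.8, 0.3, -1.5])
bad_products = (states @ v_p) * (states @ v_bad)
some_negative = bool(np.any(bad_products < 0.0))
```

Because the product reduces to ω<sup>(i)</sup>(v<sup>(p)</sup>x)<sup>2</sup>, its sign does not depend on the sampled state, which is why the trajectory is no longer needed.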

Using (4.44), a trajectory-free computation of an OPDG can be constructed. Let it be assumed that a stabilizing feedback control K<sup>(p)</sup> and the system dynamics (4.9) are given or can be identified from measurements. It is well known that inverse optimal control problems are ill-posed (e.g. due to scaling ambiguity) [PCC<sup>+</sup>15, ER19, MIF<sup>+</sup>20, ICH21]. Therefore, the minimization of the condition number of a concatenated matrix consisting of the sought parameters

$$\mathbf{E}_{(n+p_1+p_2)} \le \begin{bmatrix} \mathbf{Q}^{(p)} & \mathbf{0} \\ \mathbf{0} & \mathbf{R}^{(p)} \end{bmatrix} \le \beta \mathbf{E}_{(n+p_1+p_2)} \tag{4.45}$$

is proposed, where E<sub>(n+p<sub>1</sub>+p<sub>2</sub>)</sub> is an identity matrix of size (n + p<sub>1</sub> + p<sub>2</sub>) × (n + p<sub>1</sub> + p<sub>2</sub>). Furthermore, the lower bound is used to exclude the trivial solution. The extension compared to other inverse optimal control problems ([PCC<sup>+</sup>15, ER19, MIF<sup>+</sup>20, ICH21]) is that (4.43) must additionally hold. Thus, the optimization variables (outputs) are Q<sup>(p)</sup>, R<sup>(p)</sup>, P<sup>(p)</sup>, β and ω<sup>(i)</sup>, and the inputs of the optimization are A, B<sup>(i)</sup>, Q<sup>(i)</sup>, R<sup>(i)</sup>, ∀i ∈ P, and K<sup>(p)</sup>, which are assumed to be given. It is assumed that K<sup>(p)</sup> is either computed as

$$\mathbf{K}^{(p)} = \left[\mathbf{K}^{(1)}, \mathbf{K}^{(2)}\right] \tag{4.46}$$

or estimated from measurements directly.

**Figure 4.2:** A schematic representation of the trajectory independence of the optimization in a three-dimensional space (e<sub>1</sub>, e<sub>2</sub>, e<sub>3</sub>): v<sup>(p)</sup> and v<sup>(i)</sup> point in the same direction and are linearly dependent; therefore, (4.43) is fulfilled at t<sub>1</sub> as well as at any other time t<sub>2</sub>.

Finally, the following optimization problem results<sup>26</sup>

$$\hat{\mathbf{Q}}^{(p)}, \hat{\mathbf{R}}^{(p)}, \hat{\mathbf{P}}^{(p)}, \hat{\beta}, \hat{\omega}^{(i)} = \underset{\mathbf{Q}^{(p)}, \mathbf{R}^{(p)}, \mathbf{P}^{(p)}, \beta, \omega^{(i)}}{\operatorname{argmin}} \; \beta^{2} \tag{4.47a}$$

$$\text{s.t.} \quad \mathbf{A}^{\top}\mathbf{P}^{(p)} + \mathbf{P}^{(p)}\mathbf{A} - \mathbf{P}^{(p)}\mathbf{B}^{(p)}\mathbf{K}^{(p)} + \mathbf{Q}^{(p)} = \mathbf{0} \tag{4.47b}$$

$$\mathbf{B}^{(p)^{\top}}\mathbf{P}^{(p)} - \mathbf{R}^{(p)}\mathbf{K}^{(p)} = \mathbf{0} \tag{4.47c}$$

$$
\omega^{(i)}\mathbf{B}^{(i)^{\top}}\mathbf{P}^{(i)} - \mathbf{B}^{(i)^{\top}}\mathbf{P}^{(p)} = \mathbf{0}, \; \forall i \in \mathcal{P} \tag{4.47d}
$$

$$\mathbf{E} \le \begin{bmatrix} \mathbf{Q}^{(p)} & \mathbf{0} \\ \mathbf{0} & \mathbf{R}^{(p)} \end{bmatrix} \le \beta \mathbf{E} \tag{4.47e}$$

$$\mathbf{P}^{(p)} \ge \mathbf{0}.\tag{4.47f}$$

The constraints (4.47b), (4.47c) and (4.47f) are necessary to guarantee that K<sup>(p)</sup> is optimal with respect to the identified quadratic potential function. Constraint (4.47e) ensures the uniqueness of the solution. Constraint (4.47d) restricts the identified parameters Q̂<sup>(p)</sup>, R̂<sup>(p)</sup> to the set of parameters constituting the potential function of the OPDG.
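The role of the constraints (4.47b), (4.47c) and (4.47f) can be illustrated with a small residual check. The sketch below is not the LMI solver itself; it only evaluates the constraints for a hypothetical scalar system whose Riccati solution can be verified by hand. The function name `lq_optimality_residuals` and all numerical values are assumptions made for illustration.

```python
import numpy as np

def lq_optimality_residuals(A, B, Q, R, P, K):
    """Return the left-hand sides of constraints (4.47b) and (4.47c)."""
    riccati_residual = A.T @ P + P @ A - P @ B @ K + Q   # (4.47b)
    gain_residual = B.T @ P - R @ K                      # (4.47c)
    return riccati_residual, gain_residual

# Hypothetical scalar system dx/dt = x + u with weights q = 3 and r = 1
A = np.array([[1.0]])
B = np.array([[1.0]])
Q = np.array([[3.0]])
R = np.array([[1.0]])

# P = 3 solves 2*P - P**2 + 3 = 0, and K = R^{-1} B^T P = 3
P = np.array([[3.0]])
K = np.array([[3.0]])

res_b, res_c = lq_optimality_residuals(A, B, Q, R, P, K)

# (4.47f): P must be positive semidefinite
is_psd = bool(np.all(np.linalg.eigvalsh(P) >= 0.0))
```

Both residuals vanish for this example, so the given gain is optimal for the given weights. In (4.47) the same equations are imposed with Q<sup>(p)</sup>, R<sup>(p)</sup> and P<sup>(p)</sup> as unknowns and K<sup>(p)</sup> as data.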

If the optimization (4.47) has a solution, it is called feasible. However, (4.47) cannot always provide a solution. In this case, the optimization is called infeasible. Such a feasibility analysis of the LMI is addressed in [PCC<sup>+</sup>15]. However, (4.47) is more restrictive than the problem analyzed in [PCC<sup>+</sup>15] due to constraint (4.47d). Therefore, in the following analysis, a necessary condition for a feasible solution is proposed, which shows how the additional constraint (4.47d) reduces the feasibility of optimization (4.47).

The trajectory-free optimization (4.47) can only provide feasible solutions if the conditions of Lemma 4.4 are satisfied.

<sup>26</sup> Note that for the sake of readability, the dimension specifications of the unity matrix (n + p1 + p2) are omitted.

#### **Lemma 4.4 (Necessary Condition for the feasibility of the trajectory-free computation of OPDGs)**

*The trajectory-free optimization (4.47) for two players can be feasible only if*

$$\frac{1}{2}(1+n) - \left(p_1 + p_2\right) > 0,\tag{4.48}$$

*where* p<sub>1</sub> *and* p<sub>2</sub> *are the dimensions of the input vectors of player 1 and 2, respectively, with the input matrices* B<sup>(1)</sup> ∈ ℝ<sup>n×p<sub>1</sub></sup> *and* B<sup>(2)</sup> ∈ ℝ<sup>n×p<sub>2</sub></sup>*.*

#### **Proof:**

To prove the conditions, constraint (4.47d) is rewritten as

$$
\omega^{(i)}\boldsymbol{v}^{(i)} - \mathbf{B}^{(i)^{\mathsf{T}}}\mathbf{P}^{(p)} = \mathbf{0}, \forall i \in \mathcal{P},
$$

which can be vectorized such that

$$\operatorname{vec}\left(\omega^{(i)}\boldsymbol{v}^{(i)}\right) - \operatorname{vec}\left(\mathbf{B}^{(i)^{\top}}\mathbf{P}^{(p)}\right) = \mathbf{0}, \forall i \in \mathcal{P},\tag{4.49}$$

and rearranged to

$$\underbrace{\left(\mathbf{E}_n \otimes \mathbf{B}^{(i)}\right)^{\top}}_{\tilde{\mathbf{A}}} \underbrace{\operatorname{vec}\left(\mathbf{P}^{(p)}\right)}_{\tilde{\boldsymbol{x}}} = \underbrace{\operatorname{vec}\left(\omega^{(i)} \boldsymbol{v}^{(i)}\right)}_{\tilde{\boldsymbol{b}}}, \forall i \in \mathcal{P}, \tag{4.50}$$

where vec(⋅) represents the column vectorization of a matrix and ⊗ is the Kronecker product of two matrices. In (4.50), the classical form of a linear system of equations Ãx̃ = b̃ is indicated by the underbraces.
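The step from (4.49) to (4.50) relies on the identity vec(B<sup>T</sup>P) = (E<sub>n</sub> ⊗ B)<sup>T</sup> vec(P). A minimal numerical check is sketched below; the matrices `B1` and `P_p` are hypothetical values chosen for illustration, and `vec` is a helper introduced here.

```python
import numpy as np

def vec(M):
    """Column-wise vectorization of a matrix."""
    return M.reshape(-1, order="F")

n = 2
B1 = np.array([[1.0], [1.0]])             # hypothetical input matrix, n x p_1
P_p = np.array([[2.0, 0.5],                # hypothetical symmetric P^(p)
                [0.5, 1.0]])

# Coefficient matrix (E_n kron B)^T of the linear system in (4.50)
A_tilde = np.kron(np.eye(n), B1).T

lhs = A_tilde @ vec(P_p)                   # left-hand side of (4.50)
rhs = vec(B1.T @ P_p)                      # vec(B^T P^(p)) from (4.49)
```

Here `A_tilde` has n·p<sub>1</sub> rows and n·n columns, which is the row and column count used in the rank argument of the proof.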

Condition A is necessary for the consistency of the system of equations, for which

$$\operatorname{rank}\left(\mathbf{E}_n \otimes \mathbf{B}^{(i)^{\top}}\right) = \operatorname{rank}\left(\left[\, \mathbf{E}_n \otimes \mathbf{B}^{(i)^{\top}} \;\big|\; \operatorname{vec}\left(\omega^{(i)}\boldsymbol{v}^{(i)}\right) \right]\right), \; \forall i \in \mathcal{P}$$

must hold, since an inconsistent system (4.47d) leads to an infeasible LMI.

Condition B is necessary for the following reasons. If (4.50) provides a unique solution for a given vec(ω<sup>(i)</sup>v<sup>(i)</sup>), then vec(P<sup>(p)</sup>) is completely specified by (4.50). Thus, P<sup>(p)</sup> cannot be modified to fulfill (4.47b) and (4.47c) and, consequently, (4.47) cannot be feasible. On the other hand, if (4.50) has multiple solutions, the constraints of optimization (4.47) have additional degrees of freedom. This requires a rank analysis (see [BR14, Chapter 5]). Due to the fact that the columns of the input matrix B<sup>(i)</sup>, ∀i ∈ P, are linearly independent,

$$\operatorname{rank}\left(\mathbf{E}_n \otimes \mathbf{B}^{(i)}\right) < \dim\left(\operatorname{vec}\left(\mathbf{P}^{(p)}\right)\right), \,\forall i \in \mathcal{P}$$

must hold<sup>27</sup>, meaning that the number of rows is smaller than the number of columns, cf. [War05, BR14]. For two players, cf. Assumptions 4.3.1-2, in (4.50)

$$\tilde{\mathbf{A}} = \begin{bmatrix} \mathbf{E} \otimes \mathbf{B}^{(1)^{\top}} \\ \mathbf{E} \otimes \mathbf{B}^{(2)^{\top}} \end{bmatrix} \tag{4.51}$$

holds. The size of Ã is n(p<sub>1</sub> + p<sub>2</sub>) × n ⋅ n, where p<sub>1</sub> and p<sub>2</sub> are the numbers of columns of the input matrices B<sup>(1)</sup> and B<sup>(2)</sup>. If P<sup>(p)</sup> were not a symmetric matrix, the condition for a manifold of solutions would be n ⋅ n > n(p<sub>1</sub> + p<sub>2</sub>). Due to the symmetric structure of P<sup>(p)</sup>, the number of degrees of freedom of vec(P<sup>(p)</sup>) is reduced to ½(1 + n)n, see [BR14, Chapter 14]. Thus, the condition is changed to

$$
\frac{1}{2}(1+n) > (p_1 + p_2),
\tag{4.52}
$$

which proves the lemma.
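The counting argument of the proof can be written down in a few lines. The sketch below evaluates the necessary condition (4.48) and the two quantities it compares; the function names and the sample dimensions are assumptions introduced for illustration only.

```python
def free_parameters(n: int) -> int:
    """Number of independent entries of a symmetric n x n matrix P^(p)."""
    return n * (n + 1) // 2

def equation_count(n: int, p1: int, p2: int) -> int:
    """Number of scalar equations in (4.50) for two players."""
    return n * (p1 + p2)

def satisfies_necessary_condition(n: int, p1: int, p2: int) -> bool:
    """Evaluate (4.48): 0.5 * (1 + n) - (p1 + p2) > 0."""
    return 0.5 * (1 + n) - (p1 + p2) > 0

# Hypothetical dimensions: four states and two scalar inputs
holds_for_four_states = satisfies_necessary_condition(4, 1, 1)

# Hypothetical dimensions: three states and two scalar inputs
holds_for_three_states = satisfies_necessary_condition(3, 1, 1)
```

For n = 4 and scalar inputs there are 10 free parameters and 8 equations, so the condition holds; for n = 3 both counts equal 6 and the strict inequality fails. The condition is only necessary, so satisfying it does not guarantee feasibility of (4.47).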

### **4.2.4 Application of the Computation Methods**

In this section, the methods are applied to an illustrative example and a comparison between the input-trajectory-dependent and trajectory-free computation methods is provided.

#### **Example 4.3:**

*Consider an infinite time-horizon LQ differential game according to Definition 3.3 with two players. The system is given by*

$$
\dot{x}(t) = \begin{bmatrix} 1 & 0 \\ -5 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 1 \\ 1 \end{bmatrix} u^{(1)}(t) + \begin{bmatrix} 1 \\ -2 \end{bmatrix} u^{(2)}(t), \tag{4.53}
$$

*where the initial values are chosen as* x<sub>0</sub> = [−1.2, 1]<sup>T</sup>*. The cost functions of the players are quadratic* (3.14)*, in which the matrices are*

$$\mathbf{Q}^{(1)} = \text{diag}\{3, 0\}, \; \mathbf{R}^{(1)} = \text{diag}\{1, 0.5\}, \; \mathbf{Q}^{(2)} = \text{diag}\{2, 1\}, \; \mathbf{R}^{(2)} = \text{diag}\{0.2, 1\}.$$

*The feedback gains of the players are*

$$\mathbf{K}^{(1)} = \begin{bmatrix} 1.418, \ 0.135 \end{bmatrix}^{\mathsf{T}} \text{ and } \mathbf{K}^{(2)} = \begin{bmatrix} 2.233, \ -0.997 \end{bmatrix}^{\mathsf{T}},$$

*which correspond to the NE of this differential game. The solutions of the coupled Riccati equations are*

$$\mathbf{P}^{(1)} = \begin{bmatrix} 1.408 & 0.010 \\ 0.010 & 0.125 \end{bmatrix} \text{ and } \mathbf{P}^{(2)} = \begin{bmatrix} 1.619 & -0.307 \\ -0.307 & 0.345 \end{bmatrix}.$$

*Note that the assumptions from Lemma 4.2 do not hold for the Hamiltonians of the players* H<sup>(i)</sup>, i ∈ {1, 2}*. Therefore, this game cannot be modeled as an exact potential differential game.*
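The reported gains can be cross-checked against the reported Riccati solutions. The sketch below assumes that player i weights its own input with the i-th diagonal entry of R<sup>(i)</sup>, an interpretation that is not stated explicitly in the example; under this assumption the gains follow from K<sup>(i)</sup> = B<sup>(i)T</sup>P<sup>(i)</sup> divided by that entry.

```python
import numpy as np

# System (4.53) and the reported Riccati solutions of Example 4.3
A = np.array([[1.0, 0.0], [-5.0, 0.0]])
B1 = np.array([[1.0], [1.0]])
B2 = np.array([[1.0], [-2.0]])
P1 = np.array([[1.408, 0.010], [0.010, 0.125]])
P2 = np.array([[1.619, -0.307], [-0.307, 0.345]])

# Assumption: own-input weights are the i-th diagonal entries of R^(i)
r11 = 1.0   # first entry of R^(1) = diag(1, 0.5)
r22 = 1.0   # second entry of R^(2) = diag(0.2, 1)

K1 = (B1.T @ P1) / r11          # row vector, gain of player 1
K2 = (B2.T @ P2) / r22          # row vector, gain of player 2

# Closed-loop matrix of the original game and its eigenvalues
A_c = A - B1 @ K1 - B2 @ K2
eigenvalues = np.linalg.eigvals(A_c)
is_stable = bool(np.all(eigenvalues.real < 0.0))
```

With the rounded values above, the recomputed gains agree with the reported ones to three decimals, and the closed-loop matrix has eigenvalues with negative real parts.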

<sup>27</sup> The dimension of a vector is denoted by dim.

*However, finding a substituting OPDG is possible. First, the input-trajectory-dependent (ITD) method is applied, which uses the inputs of the players from the original game to compute the inputs of the potential games and obtain the parameters. The initial values of the optimization are chosen*

$$\left[ \text{vech}\left( Q_0^{(p)} \right), \text{vech}\left( R_0^{(p)} \right) \right] = \left[ 3, 0.5, 1, 2, 0.5, 0.5 \right],$$

*where* vech *is the half-vectorization of a matrix. The resulting matrices are*

$$\mathbf{Q}_{\rm ITD}^{(p)} = \begin{bmatrix} 1.848 & 0.181 \\ 0.181 & 1.455 \end{bmatrix} \text{ and } \mathbf{R}_{\rm ITD}^{(p)} = \begin{bmatrix} 1.480 & 0.027 \\ 0.027 & 1.429 \end{bmatrix}.$$

*The necessary time for the calculation was* 26.9 *s. Algorithm 1 was carried out with the Matlab implementation of an SQP optimizer, see [NW00, Chapter 18]. Second, the trajectory-free method is applied, which identified the following matrices*

$$\mathbf{Q}_{\rm TF}^{(p)} = \begin{bmatrix} 1.234 & 0.089 \\ 0.089 & 1.034 \end{bmatrix} \text{ and } \mathbf{R}_{\rm TF}^{(p)} = \begin{bmatrix} 1.061 & 0.014 \\ 0.014 & 1.009 \end{bmatrix}.$$

*The computation time of the trajectory-free method was* 0.15 *s. The LMI is solved with SeDuMi (Self-Dual-Minimization, version 1.3), an open-source software package for Matlab. For more details, see [Stu99]. The resulting trajectories are given in Figure 4.3. It can be seen that, despite the difference between the identified potential functions, the system trajectories are similar.*

**Figure 4.3:** Comparison of the trajectories of the given original differential game (ODG) with the trajectories of the identified potential games. The trajectories of the ODG are given in blue, those of the input-trajectory-dependent (ITD) method in red and those of the trajectory-free (TF) method in yellow.

*Next, the dynamics of the Hamiltonians are compared. The changes of the Hamiltonians are given in Figure 4.4. It can be seen that the two computation methods provide similar changes of the Hamiltonians despite the differences in the matrices* Q<sup>(p)</sup> *and* R<sup>(p)</sup>*. For both methods, the zero crossings occur at the same time, meaning that condition* (4.26) *is fulfilled, see* t ≈ 1 *s in subplot a). For player 1, the dynamics of the Hamiltonians are not the same, cf. subplot a); for player 2, the TF optimization yields a resulting trajectory that coincides with the trajectory of the original differential game, cf. subplot b).*

**Figure 4.4:** The changes of the Hamiltonians H(1) , H(2) and H(p) and the comparison of the two computation methods - the original differential game (ODG) is compared with the input-trajectory-dependent (ITD) and the trajectory-free (TF) computation methods.

To summarize the application, the benefits of OPDGs are clear: a potential function can be obtained even for systems that do not fulfill the condition of exact potential differential games. Thus, OPDGs broaden the range of applications that can be modeled by means of potential games.

However, the concept of OPDGs provides an extension of the subclass of ordinal potential static games only under certain assumptions, cf. Assumptions 4.3.1-2. The input-trajectory-dependent computation method can determine the potential function of the original game without any restrictions on the system structure. Due to the optimizer, though, the method can be slow in the case of a longer time horizon or complex systems with a large number of system states. Therefore, a computationally efficient LMI formulation is proposed, which is restricted to a specific system structure, cf. Lemma 4.4. If the proposed LMI is not feasible, the optimization (4.41) can still provide the potential function due to its generality. Assumptions 4.3.1-2 restrict the application of OPDGs; the existence of OPDGs is proven only for the case of two players with scalar inputs. Therefore, in the next section, the subclass of NPDGs is introduced, which provides a less restrictive applicability of differential potential games.

# **4.3 Near Potential Differential Games**

The previous section presented the novel subclass of OPDGs, which uses exact mathematical relationships. However, these relationships only hold under specific assumptions, which limit the application of the proposed concept. Therefore, this section presents the concept of NPDGs and the analysis of their dynamics. NPDGs are less restrictive compared to exact potential differential games and OPDGs. Using NPDGs, a more general way to identify the CS of LISC is possible.

The core idea is the usage of a distance metric between two differential games. In that way, the required exactness of the exact potential differential games, cf. Definition 4.1, is transformed into a less restrictive condition, which permits a small remaining difference between the two games. NPDGs are extended from the static to the dynamic case for the first time in the course of this thesis. The concept of near potential static games is introduced in [COP10, COP13]. Based on the intuitive idea that, if two games are "close" in terms of the properties of the players' strategy sets, their properties in terms of NE should be similar as well, a systematic framework for static games was developed in [COP10]. It was shown that a near potential static game has a similar convergence of the strategies<sup>28</sup> compared to an exact potential static game. A similar convergence of the strategies means that similar changes in the input strategies lead to similar changes in the payoffs in the game. Furthermore, it was also shown that the meaning of "close" can be quantified in the developed framework, see [COP10]. In the following, an extension of the concept of near potential static games to differential games is proposed.

# **4.3.1 Distance between two LQ Differential Games**

Similar to the static case [COP13], a distance measure between two differential games is introduced.

### **Definition 4.5 (Differential Distance)**

*Let an exact potential differential game* Γ<sup>(p)</sup><sub>ed</sub> *with the potential function* J<sup>(p)</sup> *be given. Furthermore, let an arbitrary differential game* Γ<sub>nd</sub> *according to Definition 3.3 be given. The differential distance (DD) between* Γ<sup>(p)</sup><sub>ed</sub> *and* Γ<sub>nd</sub> *is defined as*

$$\sigma_d^{(i)}(t) \coloneqq \left\| \frac{\partial H^{(p)}(t)}{\partial u^{(i)}(t)} - \frac{\partial H^{(i)}(t)}{\partial u^{(i)}(t)} \right\|_{2}, \; i \in \mathcal{P}. \tag{4.54}$$

<sup>28</sup> Note that the convergence of static games means the convergence of the decision-making process, which leads to one of the NEs of the game. The term *dynamics* has no relation to the dynamics of the system states in the context of differential games.

#### **Note:**

Definition 4.5 defines a vector space in which two games can be compared and their "closeness" can be quantified. It is the intuitive extension of Definition 4.2 because, for an exact potential differential game,

$$
\sigma_d^{(i)}(t) = 0, \forall t \in [0, \tau_{\text{end}}]
$$

holds, meaning that Γ<sub>nd</sub> has the same characteristics as Γ<sup>(p)</sup><sub>ed</sub>. Relaxing the condition σ<sub>d</sub><sup>(i)</sup>(t) = 0 enables a broader use.

Using Definition 4.5, the subclass of NPDGs is formally defined.

#### **Definition 4.6 (Near Potential Differential Game)**

*A differential game* Γ<sub>nd</sub> *is said to be an NPDG if the DD between* Γ<sub>nd</sub> *and an arbitrary exact potential differential game* Γ<sup>(p)</sup><sub>ed</sub> *satisfies*

$$\max_{i} \left\| \sigma_{d}^{(i)}(t) \right\|_{2} < \Delta, \ i \in \mathcal{P}, \tag{4.55}$$

*where* ∆ ≥ 0 *is a small constant.*

#### **Note 1:**

Definition 4.6 does not exclude the subclass of exact potential differential games as ∆ = 0 is possible. Thus, the set of exact potential differential games is a subset of NPDGs.

#### **Note 2:**

The maximum DD is the measure of the likeness between the games. As the maximum DD increases, the differences between the state and input trajectories of the NPDG and those of the exact potential differential game gradually become larger. Thus, the main question is how large the perturbation of the state and input dynamics between Γ<sub>nd</sub> and Γ<sup>(p)</sup><sub>ed</sub> can be for a given upper bound ∆. Therefore, this perturbation is quantitatively characterized for LQ games in the following.

Using the Definition 4.6, the necessary and sufficient condition of the NPDGs is given for the LQ case.

#### **Lemma 4.5 (LQ Near Potential Differential Game)**

*Let an LQ exact potential differential game* Γ<sup>(p)</sup><sub>ed</sub> *with its state trajectories* x<sup>(p)</sup>(t) *in its NE be given. Furthermore, let an arbitrary LQ differential game* Γ<sub>nd</sub> *according to Definition 3.3 with its state trajectories* x<sup>∗</sup>(t) *in the NE of* Γ<sub>nd</sub> *be given. If*

$$\max_{i} \left\lVert \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(p)} - \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)} \right\rVert_{2} < \Delta^{*} \tag{4.56}$$

*holds, then* Γnd *is an LQ NPDG in accordance with Definition 4.6.*

#### **Proof:**

The derivative of H(i) is expressed as

$$\frac{\partial H^{(i)}(t)}{\partial u^{(i)}(t)} = \mathbf{R}^{(i)}u^{(i)}(t) + \mathbf{B}^{(i)^{\top}}\boldsymbol{\lambda}^{(i)}(t),\tag{4.57}$$

which holds for i ∈ P. Based on the proof of Lemma 4.3, the derivatives of the Hamiltonian of player i can be rewritten as

$$\frac{\partial H^{(i)}(t)}{\partial u^{(i)}(t)} = -\varepsilon_c^{(i)}(x)\mathbf{B}^{(i)^{\top}}\mathbf{P}^{(i)}\boldsymbol{x}^{*}(t),\tag{4.58}$$

and for the derivatives of the Hamiltonian of the potential function

$$\frac{\partial H^{(p)}(t)}{\partial u^{(i)}(t)} = -\varepsilon_c^{(p)}(x) \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(p)} \boldsymbol{x}^{(p)}(t) \tag{4.59}$$

are obtained, see steps (4.31) and (4.35), where ε<sub>c</sub><sup>(p)</sup>(x) ≪ 1 and ε<sub>c</sub><sup>(i)</sup>(x) ≪ 1 are scalar perturbation functions. Substituting the derivatives into (4.54), the DD is stated as

$$\sigma_d^{(i)}(t) = \left\| \varepsilon_c^{(p)}(x) \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(p)} \boldsymbol{x}^{(p)}(t) - \varepsilon_c^{(i)}(x) \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)} \boldsymbol{x}^{*}(t) \right\|_2.$$

Introducing an upper bound of the variation ε<sub>c</sub> ∶= max(ε<sub>c</sub><sup>(p)</sup>(x), ε<sub>c</sub><sup>(i)</sup>(x)), the DD is rewritten as

$$\begin{aligned} \sigma_d^{(i)}(t) &= \left\| \varepsilon_c \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(p)} \boldsymbol{x}^{(p)}(t) - \varepsilon_c \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)} \boldsymbol{x}^{*}(t) \right\|_2 \\ &\leq |\varepsilon_c| \left\| \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(p)} \boldsymbol{x}^{(p)}(t) - \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)} \boldsymbol{x}^{*}(t) \right\|_2. \end{aligned} \tag{4.60}$$

In the following, it is assumed that there is a ∆x<sup>(p)</sup>(t) ≤ 0 such that

$$\boldsymbol{x}^{(p)}(t) = \boldsymbol{x}^{*}(t) + \Delta \boldsymbol{x}^{(p)}(t) \text{ or } \tag{4.61}$$

$$\boldsymbol{x}^{(p)}(t) = \boldsymbol{x}^{*}(t) - \Delta \boldsymbol{x}^{(p)}(t) \tag{4.62}$$

holds ∀t ∈ [0, τ<sub>end</sub>]. On the one hand, if (4.61) holds, the upper bound of σ<sub>d</sub><sup>(i)</sup>(t) is rewritten to

$$\begin{split} \sigma_d^{(i)}(t) &\leq \left| \varepsilon_c \right| \left\| \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(p)} \boldsymbol{x}^{(p)}(t) - \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)} \boldsymbol{x}^{(p)}(t) + \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)} \Delta \boldsymbol{x}^{(p)}(t) \right\|_{2} \\ &\leq \left| \varepsilon_c \right| \left\| \left( \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(p)} - \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)} \right) \boldsymbol{x}^{(p)}(t) \right\|_{2} + \underbrace{\left| \varepsilon_c \right| \left\| \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)} \Delta \boldsymbol{x}^{(p)}(t) \right\|_{2}}_{\approx 0 \text{ as } \varepsilon_c \cdot \Delta \boldsymbol{x}^{(p)} \ll 1} \\ &\leq \left| \varepsilon_c \right| \left\| \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(p)} - \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)} \right\|_{2} \left\| \boldsymbol{x}^{(p)}(t) \right\|_{2}, \; i \in \mathcal{P}. \end{split} \tag{4.63}$$

On the other hand, if (4.62) holds, the upper bound of σ<sub>d</sub><sup>(i)</sup>(t) is

$$\sigma_d^{(i)}(t) \le \left| \varepsilon_c \right| \left\| \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)} - \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(p)} \right\|_2 \left\| \boldsymbol{x}^{*}(t) \right\|_2, \; i \in \mathcal{P}. \tag{4.64}$$

Introducing the notation for the maximum magnitude of the state vectors

$$x_{\max} \coloneqq \max\left( \left\| \boldsymbol{x}^{*}(t) \right\|_2, \left\| \boldsymbol{x}^{(p)}(t) \right\|_2 \right),$$

the estimations (4.63) and (4.64) can be combined into

$$\sigma_d^{(i)}(t) \le |\varepsilon_c| \left\| \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(p)} - \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)} \right\|_2 x_{\max}, \; i \in \mathcal{P}.$$

Introducing ∆<sup>∗</sup> = ∆ / (∣ε<sub>c</sub>∣ ⋅ x<sub>max</sub>) leads to the upper bound of σ<sub>d</sub><sup>(i)</sup>,

$$\max_{i} \left\| \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(p)} - \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)} \right\|_{2} < \Delta^{*},$$

proving that Γ<sub>nd</sub> is an NPDG with an upper bound of ∆<sup>∗</sup>.

If the upper bound of the DD σ<sub>d</sub> between the NPDG and the exact potential differential game is sufficiently *small*, *similar* conclusions can be drawn about the closed-loop characteristics of the two games. In the case of differential games, the system state trajectories are analyzed<sup>29</sup>. The terms *small* and *similar* are described more precisely in the next subsection.
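The quantity on the left-hand side of (4.56) can be evaluated once P<sup>(p)</sup> is available. The sketch below uses the data of Example 4.3 with the matrices identified by the trajectory-free method. Since P<sup>(p)</sup> is not reported there, it is recomputed here from the algebraic Riccati equation with a Hamiltonian eigenvector method, which is a technique chosen for this illustration and not necessarily the one used in the thesis; the helper `solve_are` is hypothetical, and no numerical result is claimed.

```python
import numpy as np

def solve_are(A, B, Q, R):
    """Stabilizing solution of A'P + PA - P B R^{-1} B' P + Q = 0,
    obtained from the stable invariant subspace of the Hamiltonian matrix."""
    n = A.shape[0]
    S = B @ np.linalg.solve(R, B.T)
    H = np.block([[A, -S], [-Q, -A.T]])
    eigvals, eigvecs = np.linalg.eig(H)
    stable = eigvecs[:, eigvals.real < 0.0]
    if stable.shape[1] != n:
        raise ValueError("Hamiltonian has no n-dimensional stable subspace")
    X1, X2 = stable[:n, :], stable[n:, :]
    P = np.real(X2 @ np.linalg.inv(X1))
    return 0.5 * (P + P.T)              # symmetrize against round-off

# Data of Example 4.3
A = np.array([[1.0, 0.0], [-5.0, 0.0]])
B_players = [np.array([[1.0], [1.0]]), np.array([[1.0], [-2.0]])]
P_players = [np.array([[1.408, 0.010], [0.010, 0.125]]),
             np.array([[1.619, -0.307], [-0.307, 0.345]])]

# Potential function identified by the trajectory-free method
Q_p = np.array([[1.234, 0.089], [0.089, 1.034]])
R_p = np.array([[1.061, 0.014], [0.014, 1.009]])
B_p = np.hstack(B_players)              # B^(p) = [B^(1), B^(2)]

P_p = solve_are(A, B_p, Q_p, R_p)
are_residual = (A.T @ P_p + P_p @ A
                - P_p @ B_p @ np.linalg.solve(R_p, B_p.T @ P_p) + Q_p)

# Left-hand side of (4.56): max_i || B^(i)T P^(p) - B^(i)T P^(i) ||_2
distances = [np.linalg.norm(B.T @ P_p - B.T @ P, 2)
             for B, P in zip(B_players, P_players)]
max_distance = max(distances)
```

Comparing `max_distance` with a chosen bound ∆<sup>∗</sup> then indicates whether the game qualifies as an LQ NPDG in the sense of Lemma 4.5.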

### **4.3.2 Dynamics of LQ NPDGs**

The dynamics of the system and input trajectories are analyzed in order to provide an estimation of the differences between two LQ differential games. Let it be assumed for the LQ differential game Γnd that the control laws of the players i ∈ P are obtained from the solution of the coupled Riccati equations (3.20) over an infinite time horizon, which leads to the closed-loop system dynamics

$$
\dot{x}(t) = \mathbf{A}_c^{*} x(t), \ x(t_0) = x_0,\tag{4.65}
$$

where

$$\mathbf{A}_c^{*} = \mathbf{A} - \sum_{i \in \mathcal{P}} \mathbf{B}^{(i)} \mathbf{R}^{(i)^{-1}} \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)}$$

and

$$x^{*}(t) = e^{\mathbf{A}_c^{*} \cdot t} x_0 \tag{4.66}$$

is the unique solution of (4.65).

For the LQ exact potential differential game Γ<sup>(p)</sup><sub>ed</sub>, the control law K<sup>(p)</sup> = R<sup>(p)</sup><sup>−1</sup> B<sup>(p)</sup><sup>T</sup> P<sup>(p)</sup> is obtained from the optimization of the potential function (4.11), which is used to compute the feedback system dynamics

$$
\dot{\mathbf{x}}^{(p)}(t) = \mathbf{A}_c^{(p)} \mathbf{x}^{(p)}(t), \ \mathbf{x}^{(p)}(t_0) = \mathbf{x}_0^{(p)},\tag{4.67}
$$

<sup>29</sup> In the static case, the decision procedure to find the NE is the focus of the analysis. For a given distance between two static games, an approximate NE with an ϵ limit is obtained, which is called the ϵ-NE of the game. For more information on the near potential static game and the concept of ϵ-Nash Equilibrium, it is referred to [COP13] or [Nis07, Chapter 19].

where A<sub>c</sub><sup>(p)</sup> is calculated such that

$$\mathbf{A}_c^{(p)} = \mathbf{A} - \mathbf{B}^{(p)} \mathbf{R}^{(p)^{-1}} \mathbf{B}^{(p)^{\top}} \mathbf{P}^{(p)}.$$

The solution of (4.67) is

$$\mathbf{x}^{(p)}(t) = e^{\mathbf{A}_c^{(p)} \cdot t} \mathbf{x}_0^{(p)}.\tag{4.68}$$

From the state trajectories x<sup>(p)</sup>(t) and x<sup>∗</sup>(t), an upper bound η of the errors is provided for a given ∆ between two games. For this, a notion of the difference between two closed-loop system behaviors is introduced in Definition 4.7.

#### **Definition 4.7 (Closed-Loop System Matrix Error)**

*Consider an LQ exact potential differential game* Γ<sup>(p)</sup><sub>ed</sub> *with the system trajectories* (4.68)*. Furthermore, assume that an arbitrary LQ differential game* Γ<sub>nd</sub> *is an NPDG with the system trajectories* (4.66)*. Then, the closed-loop system matrix error between* Γ<sup>(p)</sup><sub>ed</sub> *and* Γ<sub>nd</sub> *is defined as*

$$
\Delta \mathbf{K} \coloneqq \mathbf{A}_c^{*} - \mathbf{A}_c^{(p)}.\tag{4.69}
$$

#### **Note:**

Two differential games are *similar* if the closed-loop system matrix error is small and, consequently, the system trajectories x<sup>∗</sup>(t) and x<sup>(p)</sup>(t) of these two games are *close* to each other. In this case, Γ<sub>nd</sub> is an NPDG. This closeness between an NPDG and an LQ exact potential differential game is quantified in Lemma 4.6.

#### **Lemma 4.6 (Boundedness of NPDGs)**

*Let an LQ NPDG* Γ<sub>nd</sub> *(see Lemma 4.5) and an exact potential differential game* Γ<sup>(p)</sup><sub>ed</sub> *be given. Let the system state trajectories of the two games* Γ<sup>(p)</sup><sub>ed</sub> *and* Γ<sub>nd</sub> *be* x<sup>(p)</sup>(t) *and* x<sup>∗</sup>(t)*, respectively. Moreover, let*

$$\mathbf{x}^{(p)}(t_0) = \mathbf{x}^{*}(t_0) = \mathbf{x}_0 \tag{4.70}$$

*hold for the initial values.*

*Then, the error between the system state trajectories of* Γ<sub>nd</sub> *and* Γ<sup>(p)</sup><sub>ed</sub> *is bounded by the function* η(∆) *over an arbitrary time interval* [t<sub>0</sub>, t<sub>1</sub>]*, such that*

$$\left\|\mathbf{x}^{(p)}(t) - \mathbf{x}^{*}(t)\right\|_2 \le \eta(\Delta), \ \forall t \in [t_0, t_1]. \tag{4.71}$$

#### **Proof:**

From the solution of the differential equations (4.65) and (4.67),

$$\left\| \boldsymbol{x}^{*}(t) - \boldsymbol{x}^{(p)}(t) \right\|_2 = \left\| e^{\mathbf{A}_c^{*} \cdot t} \boldsymbol{x}_0 - e^{\mathbf{A}_c^{(p)} \cdot t} \boldsymbol{x}_0 \right\|_2$$

is obtained. As (4.70) holds, using Definition 4.7 and [Ber09, Theorem 11.16.7] leads to

$$\left\|\boldsymbol{x}^{*}(t) - \boldsymbol{x}^{(p)}(t)\right\|_2 \le \left\|\Delta \mathbf{K} \cdot t\right\|_2 e^{\max\left\{\left\|\mathbf{A}_c^{(p)} \cdot t\right\|_2; \left\|\mathbf{A}_c^{*} \cdot t\right\|_2\right\}} \left\|\boldsymbol{x}_0\right\|_2. \tag{4.72}$$
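The bound (4.72) can be checked numerically for a pair of closed-loop matrices. In the sketch below, `A_star` and `A_p` are hypothetical stable matrices that differ by a small perturbation, and `expm_taylor` is a simple helper written for this illustration (scaling and squaring with a truncated Taylor series), not a reference implementation of the matrix exponential.

```python
import numpy as np

def expm_taylor(M, terms=40):
    """Matrix exponential via scaling and squaring with a Taylor series."""
    norm = np.linalg.norm(M, 2)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    scaled = M / (2 ** squarings)
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ scaled / k        # scaled^k / k!
        result = result + term
    for _ in range(squarings):
        result = result @ result        # undo the scaling
    return result

# Hypothetical closed-loop matrices of the two games
A_star = np.array([[-2.0, 1.0], [-1.0, -2.0]])
A_p = A_star + np.array([[0.05, -0.02], [0.01, 0.03]])
delta_K = A_star - A_p                  # closed-loop matrix error (4.69)
x0 = np.array([-1.2, 1.0])              # common initial state, cf. (4.70)

bound_holds = True
for t in np.linspace(0.0, 3.0, 31):
    error = np.linalg.norm(expm_taylor(A_star * t) @ x0
                           - expm_taylor(A_p * t) @ x0)
    bound = (np.linalg.norm(delta_K * t, 2)
             * np.exp(max(np.linalg.norm(A_p * t, 2),
                          np.linalg.norm(A_star * t, 2)))
             * np.linalg.norm(x0))
    bound_holds = bool(bound_holds and error <= bound + 1e-12)
```

The bound grows exponentially with t, so it is conservative for stable closed-loop systems; its value lies in linking the trajectory error to ‖∆K‖<sub>2</sub>.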

In the following, an upper bound of ∆K is sought. Let the notation

$$\mathbf{P}_{\Sigma\mathcal{P}} = \begin{bmatrix} \mathbf{P}_{\Sigma\mathcal{P}}^{(1)} \\ \mathbf{P}_{\Sigma\mathcal{P}}^{(2)} \\ \vdots \\ \mathbf{P}_{\Sigma\mathcal{P}}^{(i)} \\ \vdots \\ \mathbf{P}_{\Sigma\mathcal{P}}^{(N)} \end{bmatrix} = \begin{bmatrix} \mathbf{R}^{(1)^{-1}}\mathbf{B}^{(1)^{\top}}\mathbf{P}^{(1)} \\ \mathbf{R}^{(2)^{-1}}\mathbf{B}^{(2)^{\top}}\mathbf{P}^{(2)} \\ \vdots \\ \mathbf{R}^{(i)^{-1}}\mathbf{B}^{(i)^{\top}}\mathbf{P}^{(i)} \\ \vdots \\ \mathbf{R}^{(N)^{-1}}\mathbf{B}^{(N)^{\top}}\mathbf{P}^{(N)} \end{bmatrix} \tag{4.73}$$

be introduced. Substituting (4.66), (4.67) and (4.73) in (4.72), the upper bound

$$\begin{aligned} \|\Delta \mathbf{K}\|_{2} &= \left\| \mathbf{B}^{(p)} \mathbf{R}^{(p)^{-1}} \mathbf{B}^{(p)^{\top}} \mathbf{P}^{(p)} - \mathbf{B}^{(p)} \sum_{i \in \mathcal{P}} \mathbf{R}^{(i)^{-1}} \mathbf{B}^{(i)^{\top}} \mathbf{P}^{(i)} \right\|_{2} \\ &= \left\| \mathbf{B}^{(p)} \left( \mathbf{R}^{(p)^{-1}} \mathbf{B}^{(p)^{\top}} \mathbf{P}^{(p)} - \mathbf{P}_{\Sigma\mathcal{P}} \right) \right\|_{2} \\ &= \left\| \mathbf{B}^{(p)} \mathbf{R}^{(p)^{-1}} \left( \mathbf{B}^{(p)^{\top}} \mathbf{P}^{(p)} - \mathbf{R}^{(p)} \mathbf{P}_{\Sigma\mathcal{P}} \right) \right\|_{2} \end{aligned} \tag{4.74}$$

is obtained. In addition, let the matrix

$$\mathbf{R}^{(p)} = \begin{bmatrix} \mathbf{R}_1^{(p)}, \mathbf{R}_2^{(p)}, \dots, \mathbf{R}_i^{(p)}, \dots, \mathbf{R}_N^{(p)} \end{bmatrix}^\top \tag{4.75}$$

be defined, where R<sub>i</sub><sup>(p)</sup> is the submatrix for the inputs u<sup>(i)</sup> of player i, for which

$$\mathbf{R}^{(p)}\mathbf{P}_{\Sigma\mathcal{P}} = \sum_{i \in \mathcal{P}} \mathbf{R}_i^{(p)} \mathbf{P}_{\Sigma\mathcal{P}}^{(i)}$$

holds. Thus, (4.74) can be reformulated to

$$\begin{split} \|\Delta \mathbf{K}\|_{2} &= \left\| \mathbf{B}^{(p)} \mathbf{R}^{(p)^{-1}} \left( \mathbf{B}^{(p)^{\top}} \mathbf{P}^{(p)} - \sum_{i \in \mathcal{P}} \mathbf{R}_{i}^{(p)} \mathbf{P}_{\Sigma\mathcal{P}}^{(i)} \right) \right\|_{2} \\ &\leq \left\| \mathbf{B}^{(p)} \right\|_{2} \left\| \mathbf{R}^{(p)^{-1}} \right\|_{2} \left\| \mathbf{B}^{(p)^{\top}} \mathbf{P}^{(p)} - \sum_{i \in \mathcal{P}} \mathbf{R}_{i}^{(p)} \mathbf{P}_{\Sigma\mathcal{P}}^{(i)} \right\|_{2}. \end{split} \tag{4.76}$$

Due to the well-known scaling ambiguity, there is a manifold of potential functions (4.11) that result in an identical feedback gain matrix. Thus, a scaling factor κ<sup>p</sup> > 0, κ<sup>p</sup> ∈ ℝ, can be chosen such that J̃<sup>(p)</sup> = κ<sup>p</sup> ⋅ J<sup>(p)</sup> and

$$\left\|\mathbf{R}^{(p)}\right\|_{2} > 1\tag{4.77}$$

holds. Assuming a suitable scaling, (4.76) leads to

$$\|\Delta \mathbf{K}\|_{2} \le \left\|\mathbf{B}^{(p)}\right\|_{2} \left\|\mathbf{B}^{(p)^{\top}} \mathbf{P}^{(p)} - \sum_{i \in \mathcal{P}} \mathbf{R}_{i}^{(p)} \mathbf{P}_{\Sigma\mathcal{P}}^{(i)}\right\|_{2}.$$

Then, let the following matrix be introduced

$$
\tilde{\mathbf{F}} = \begin{bmatrix}
\mathbf{B}^{(1)^{\top}} \mathbf{P}^{(p)} - \mathbf{R}_1^{(p)} \mathbf{P}_{\Sigma\mathcal{P}}^{(1)} \\
\vdots \\
\mathbf{B}^{(i)^{\top}} \mathbf{P}^{(p)} - \mathbf{R}_i^{(p)} \mathbf{P}_{\Sigma\mathcal{P}}^{(i)} \\
\vdots \\
\mathbf{B}^{(N)^{\top}} \mathbf{P}^{(p)} - \mathbf{R}_N^{(p)} \mathbf{P}_{\Sigma\mathcal{P}}^{(N)} \\
\end{bmatrix} = \mathbf{B}^{(p)^{\top}} \mathbf{P}^{(p)} - \sum_{i \in \mathcal{P}} \mathbf{R}_i^{(p)} \mathbf{P}_{\Sigma\mathcal{P}}^{(i)}.\tag{4.78}
$$

The so-called Frobenius norm is defined as the entry-wise Euclidean norm of a matrix (see [BLR21]), for which

$$\left\|\tilde{\mathbf{F}}\right\|_{2} \leq \left\|\tilde{\mathbf{F}}\right\|_{F} \tag{4.79}$$

holds (see [HJ17, Chapter 5] or [Ber09, Section 9.8.12]). Applying the definition of the Frobenius norm to (4.78),

$$\left\|\tilde{\mathbf{F}}\right\|_{F} = N \cdot \max_{i} \left( \left\|\mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(p)} - \mathbf{R}_{i}^{(p)} \mathbf{P}^{(i)}_{\Sigma^{\mathcal{P}}} \right\|_{2} \right), \ i \in \mathcal{P} \tag{4.80}$$

is obtained.
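The relation (4.79) between the spectral norm and the Frobenius norm can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `spectral_and_frobenius`, the block sizes and the random data are assumptions chosen purely for illustration and do not stem from the thesis. It stacks row blocks in the same way as $\tilde{\mathbf{F}}$ in (4.78) and compares both norms.

```python
import numpy as np


def spectral_and_frobenius(blocks):
    """Stack row blocks as in (4.78) and return (||F||_2, ||F||_F)."""
    F = np.vstack(blocks)
    spectral = np.linalg.norm(F, 2)        # largest singular value of F
    frobenius = np.linalg.norm(F, "fro")   # entry-wise Euclidean norm of F
    return spectral, frobenius


# Illustrative data only (assumed): N = 2 players, one input each, n = 3 states.
rng = np.random.default_rng(0)
blocks = [rng.standard_normal((1, 3)) for _ in range(2)]
spec, frob = spectral_and_frobenius(blocks)
print(f"||F||_2 = {spec:.4f}, ||F||_F = {frob:.4f}, (4.79) holds: {spec <= frob}")
```

Since the Frobenius norm collects all singular values while the spectral norm keeps only the largest one, the comparison holds for any choice of blocks.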

Using property (4.79) and (4.80) leads to an upper bound

$$\begin{split} \left\lVert \Delta \mathbf{K} \right\rVert_{2} &\leq \left\lVert \mathbf{B}^{(p)} \right\rVert_{2} \left\lVert \mathbf{B}^{(p)^{\mathsf{T}}} \mathbf{P}^{(p)} - \sum_{i \in \mathcal{P}} \mathbf{R}_{i}^{(p)} \mathbf{P}^{(i)}_{\Sigma^{\mathcal{P}}} \right\rVert_{2} \\ &\leq \left\lVert \mathbf{B}^{(p)} \right\rVert_{2} N \cdot \max_{i} \left\lVert \mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(p)} - \mathbf{R}_{i}^{(p)} \mathbf{P}^{(i)}_{\Sigma^{\mathcal{P}}} \right\rVert_{2} . \end{split}$$

Due to the scaling ambiguity, $\tilde{J}^{(i)} = \kappa^{i} \cdot J^{(i)}$ with $\kappa^{i} \in \mathbb{R}$, $\kappa^{i} > 0$, holds, and $\kappa^{i}$ and $\kappa^{p}$ can be modified to obtain $\mathbf{R}^{(i)}$ and $\mathbf{R}^{(p)}$ such that

$$\left\| \mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(p)} - \mathbf{R}_{i}^{(p)} \mathbf{R}^{(i)^{-1}} \mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(i)} \right\|_{2} \leq \left\| \mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(p)} - \mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(i)} \right\|_{2}$$

holds, for which

$$\left\| \mathbf{R}^{(p)} \mathbf{R}^{(i)^{-1}} \mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(i)} \right\|_{2} \geq \left\| \mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(i)} \right\|_{2} \tag{4.81}$$

is sufficient (see [Ber09, Section 9.9.42]). This leads to

$$\|\Delta \mathbf{K}\|_{2} \le \left\|\mathbf{B}^{(p)}\right\|_{2} N \cdot \max_{i} \left\|\mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(p)} - \mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(i)}\right\|_{2} = \left\|\mathbf{B}^{(p)}\right\|_{2} N \cdot \Delta. \tag{4.82}$$

Replacing the upper bound of ∆K in (4.72) by (4.82) leads to an upper bound for the trajectory error

$$\eta(\Delta) = \left\| \mathbf{B}^{(p)} \right\|_2 N \cdot \Delta \cdot t \cdot e^{\max\left\{ \left\| \mathbf{A}_c^{(p)} \cdot t \right\|_2 ; \left\| \mathbf{A}_c^{*} \cdot t \right\|_2 \right\}} \left\| x_0 \right\|_2 \tag{4.83}$$

such that

$$\left\|x^{(p)}(t) - x^{*}(t)\right\|_2 \le C_{\eta, \mathrm{NPDG}}(t) \cdot \Delta \tag{4.84}$$

holds, which proves the lemma.
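To get a feeling for the size of the bound, (4.83) can be evaluated numerically. The sketch below assumes NumPy; the function name `eta_bound` and all matrices and numbers are hypothetical values chosen for illustration only, not data from the thesis.

```python
import numpy as np


def eta_bound(B_p, A_c_p, A_c_star, x0, delta, t, n_players):
    """Evaluate the trajectory-error bound eta(Delta) of (4.83) at time t."""
    # Exponent: the larger of the two spectral norms ||A_c^(p) t||_2 and ||A_c^* t||_2.
    growth = max(np.linalg.norm(A_c_p * t, 2), np.linalg.norm(A_c_star * t, 2))
    return (np.linalg.norm(B_p, 2) * n_players * delta * t
            * np.exp(growth) * np.linalg.norm(x0, 2))


# Hypothetical closed-loop matrices of a two-player game with two states.
A_c_p = np.array([[0.0, 1.0], [-2.0, -3.0]])
A_c_star = np.array([[0.0, 1.0], [-2.1, -3.1]])
B_p = np.array([[0.0, 0.0], [1.0, 1.0]])
x0 = np.array([1.0, 0.0])

eta = eta_bound(B_p, A_c_p, A_c_star, x0, delta=0.15, t=1.0, n_players=2)
print(f"eta(0.15) at t = 1: {eta:.3f}")
```

The bound is linear in $\Delta$ for a fixed time $t$, so halving the admissible distance halves the guaranteed trajectory error.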

#### **Remark 1:**

From (4.84), it can be seen that the upper bound of the DD governs the maximal admissible error between the trajectories, where the function η(∆) depends only on the initial value, the system structure and the time interval [t0, t1].

#### **Remark 2:**

In (4.83), $C_{\eta,\mathrm{NPDG}}(t)$ is bounded in the time interval $[t_0, t_1]$. Thus, Lemma 4.6 holds for $t \in [t_0, t_1]$ only. However, $\Delta$ can be defined piecewise as

$$
\Delta \coloneqq \begin{cases}
\Delta_1 & \forall t \in [t_0, t_1] \\
\Delta_2 & \forall t \in [t_1, t_2] \\
\vdots \\
\Delta_N & \forall t \in [t_{N-1}, t_N] \\
\vdots
\end{cases}
$$

In the case of asymptotically stable system state trajectories $x^{(p)}(t)$ and $x^{*}(t)$, a monotonically decreasing sequence, $\Delta_{N} \le \Delta_{N-1}$, can be assumed, which prevents $C_{\eta,\mathrm{NPDG}}(t)$ from growing exponentially for $t \to \infty$. Consequently, Lemma 4.6 also holds for $t \to \infty$.

#### **Remark 3:**

Note that Lemma 4.6 differs from the estimation of the distance between solutions of two general initial value problems: The upper bound between two general initial value problems is given as a function of the Lipschitz constant and is usually proved with the Gronwall-Bellman inequality, see e. g. [Kha02, Theorem 3.4]. On the other hand, Lemma 4.6 provides the link between the upper bound $\eta(\Delta)$ and the DD of the two games $\Delta$, which differs from general initial value problems. Thus, Lemma 4.6 is a special case of [Kha02, Theorem 3.4].

#### **Remark 4:**

The function $C_{\eta,\mathrm{NPDG}}(t)$ also depends on the number of players, which seems to be a restriction. However, in human-automation interactions, there are generally only two players. For games with $N \gg 1$, the players need to be grouped into subsets before the concept of NPDGs can be applied.

### **4.3.3 Computation of an NPDG**

For practical applications of the proposed NPDG, a computation method is necessary to find a potential function for a given differential game. Similarly to the trajectory-free optimization of OPDGs in Section 4.2.3, an LMI is formulated to find the parameters of the quadratic potential function. For this purpose, the LMI method from [PCC<sup>+</sup>15] is adapted to the computation of NPDGs. For the formulation of the optimization, it is assumed that $\mathbf{A}$, $\mathbf{B}^{(i)}$, $\mathbf{Q}^{(i)}$, $\mathbf{R}^{(i)}$, $\forall i \in \mathcal{P}$, and $\mathbf{K}^{(p)}$ are given. The feedback gain $\mathbf{K}^{(p)}$ is either directly estimated from measurements or computed as

$$\mathbf{K}^{(p)} = \left[ \mathbf{K}^{(1)}, \mathbf{K}^{(2)}, \dots, \mathbf{K}^{(N)} \right]. \tag{4.85}$$

The condition number of the concatenated matrix

$$\mathbf{E}_{n+\sum_{i \in \mathcal{P}} p_i} \le \begin{bmatrix} \mathbf{Q}^{(p)} & \mathbf{0} \\ \mathbf{0} & \mathbf{R}^{(p)} \end{bmatrix} \le \beta_{\mathbf{Q},\mathbf{R}} \mathbf{E}_{n+\sum_{i \in \mathcal{P}} p_i} \tag{4.86}$$

is minimized, where $\mathbf{E}_n$ is the identity matrix of size $n \times n$. To find an NPDG, $\Delta$ must be specified in advance. On the one hand, choosing $\Delta$ too large weakens the usefulness of Lemma 4.5. On the other hand, for too small values of $\Delta$, (4.56) cannot be fulfilled and the LMI has no solution. Therefore, (4.56) is modified to

$$\max_{i} \left\| \mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(p)} - \mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(i)} \right\|_{2} < \beta_{\Delta} \cdot \Delta,\tag{4.87}$$

where $\beta_{\Delta}$ is an additional optimization variable, which is minimized through the LMI. This way, only an initial maximum distance $\Delta$ needs to be chosen, and the LMI can reduce $\beta_{\Delta}$, leading to a stricter condition. Furthermore, (4.81) and (4.77) also have to be fulfilled. The optimization variables (the outputs) of the LMI are $\mathbf{Q}^{(p)}$, $\mathbf{R}^{(p)}$, $\mathbf{P}^{(p)}$, $\beta_{\mathbf{Q},\mathbf{R}}$ and $\beta_{\Delta}$, while the inputs of the optimization are $\mathbf{A}$, $\mathbf{B}^{(i)}$, $\mathbf{Q}^{(i)}$, $\mathbf{R}^{(i)}$, $\forall i \in \mathcal{P}$, and $\mathbf{K}^{(p)}$. The LMI minimizes the sum of the squared condition numbers $\beta_{\mathbf{Q},\mathbf{R}}$ and $\beta_{\Delta}$, subject to constraints that provide a unique solution of the optimization. The resulting LMI problem is written as follows:

$$
\hat{\mathbf{Q}}^{(p)}, \hat{\mathbf{R}}^{(p)}, \hat{\mathbf{P}}^{(p)}, \hat{\beta}_{\mathbf{Q},\mathbf{R}}, \hat{\beta}_{\Delta} = \underset{\mathbf{Q}^{(p)}, \mathbf{R}^{(p)}, \mathbf{P}^{(p)}, \beta_{\mathbf{Q},\mathbf{R}}, \beta_{\Delta}}{\operatorname{argmin}} \ \beta_{\mathbf{Q},\mathbf{R}}^2 + \beta_{\Delta}^2 \tag{4.88a}
$$

$$\text{s.t.} \quad \mathbf{A}^{\mathsf{T}} \mathbf{P}^{(p)} + \mathbf{P}^{(p)} \mathbf{A} + \mathbf{Q}^{(p)} - \mathbf{P}^{(p)} \mathbf{B} \mathbf{K}^{(p)} = \mathbf{0},\tag{4.88b}$$

$$\mathbf{B}^{(p)^{\mathsf{T}}} \mathbf{P}^{(p)} - \mathbf{R}^{(p)}\mathbf{K}^{(p)} = \mathbf{0},\tag{4.88c}$$

$$\mathbf{E} \le \begin{bmatrix} \mathbf{Q}^{(p)} & \mathbf{0} \\ \mathbf{0} & \mathbf{R}^{(p)} \end{bmatrix} \le \beta_{\mathbf{Q},\mathbf{R}} \mathbf{E},\tag{4.88d}$$

$$\max_{i} \left\| \mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(p)} - \mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(i)} \right\|_{2} < \beta_{\Delta} \cdot \Delta,\tag{4.88e}$$

$$\left\|\mathbf{R}^{(p)} \mathbf{R}^{(i)^{-1}} \mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(i)}\right\|_{2} \geq \left\|\mathbf{B}^{(i)^{\mathsf{T}}} \mathbf{P}^{(i)}\right\|_{2}, \ \forall i \in \mathcal{P},\tag{4.88f}$$

$$\left\|\mathbf{R}^{(p)}\right\|_{2} > 1.\tag{4.88g}$$

The minimization of $\beta_{\mathbf{Q},\mathbf{R}}^2 + \beta_{\Delta}^2$ ensures that the result of the LMI is unique and leads to the smallest admissible upper bound of the DD. The optimality condition of the potential function $J^{(p)}$ is guaranteed by conditions (4.88b) and (4.88c). Constraint (4.88d) is necessary to obtain a unique solution. Constraint (4.88e) results from Lemma 4.5 and restricts the identified potential function to that of an NPDG. Using the additional condition number $\beta_{\Delta}$, the upper bound of the DD is reduced by the LMI, providing a stricter condition. Constraints (4.88f) and (4.88g) result from the boundedness condition of NPDGs from Lemma 4.6.

The limitation of the optimization algorithm is that $\Delta$ has to be chosen in advance, which can render the LMI infeasible. In this case, $\Delta$ has to be increased until a solution of (4.88) is obtained, and can then be refined iteratively.
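The iterative choice of $\Delta$ described above can be sketched as a simple search loop. The following is only an illustration under stated assumptions: `is_feasible` is a hypothetical callback that would wrap an LMI solver call for (4.88) and report whether a solution exists, and feasibility is assumed to be monotone in $\Delta$ (every value above some threshold is feasible). The growth factor, the tolerance and the toy oracle are likewise assumptions and not part of the thesis.

```python
def refine_delta(is_feasible, delta0, grow=2.0, tol=1e-3, max_iter=100):
    """Search for a small Delta for which the (hypothetical) LMI oracle is feasible.

    Phase 1 enlarges Delta until the oracle reports feasibility.
    Phase 2 bisects between the last infeasible and the first feasible value.
    """
    lo, hi = 0.0, delta0
    for _ in range(max_iter):
        if is_feasible(hi):
            break
        lo, hi = hi, hi * grow          # infeasible: increase Delta
    else:
        raise RuntimeError("no feasible Delta found within max_iter enlargements")

    while hi - lo > tol:                # refine: hi always stays feasible
        mid = 0.5 * (lo + hi)
        if is_feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi


# Toy stand-in for the LMI solver: feasible for Delta >= 0.1 (made-up threshold).
delta_found = refine_delta(lambda d: d >= 0.1, delta0=0.01)
print(f"smallest feasible Delta (up to tol): {delta_found:.4f}")
```

In an actual implementation, the callback would set up (4.88) for the given $\Delta$ and return the solver's feasibility status.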

# **4.4 Comparison and Discussion**

### **4.4.1 Simulation Comparison of OPDG and NPDG**

In this subsection, the proposed concept of NPDGs is applied to the illustrative Example 4.4. After the example, the three computation methods are compared to each other. The general notion of NPDGs is that for a given differential game the nearest exact potential differential game is sought such that the resulting NE trajectories are "near" to the ones of the original game.

#### **Example 4.4:**

*Consider the differential game from Example 4.3. Similarly to the OPDG case, it is possible to find a substituting NPDG. Therefore, LMI* (4.88) *is applied and* ∆ *is chosen as 0.15. The necessary computation time was* 0.18 *s. The LMI* (4.88) *is solved with the same software as* (4.47) *(SeDuMi, version 1.3).*

*The resulting matrices are*

$$\mathbf{Q}_{\mathrm{NPDG}}^{(p)} = \begin{bmatrix} 0.958 & 0.039 \\ 0.039 & 1.005 \end{bmatrix},$$

$$\mathbf{R}_{\mathrm{NPDG}}^{(p)} = \begin{bmatrix} 0.980 & 0.024 \\ 0.024 & 0.997 \end{bmatrix}.$$

*The solution of the Riccati equation* (4.21) *is*

$$\mathbf{P}_{\mathrm{NPDG}}^{(p)} = \begin{bmatrix} 1.713 & -0.263 \\ -0.263 & 0.365 \end{bmatrix}.$$

*The obtained condition numbers are*

$$\beta_{\Delta} = 0.243 \text{ and } \beta_{\mathbf{Q},\mathbf{R}} = 1.030.$$

*The resulting trajectories are given in Figure 4.5. It can be seen that the obtained NPDG can provide trajectories similar to the original LQ differential game.*

**Figure 4.5:** The system state trajectories of the given original differential game (ODG) and the resulting NPDG are shown. It can be seen that the resulting NPDG can reproduce the trajectories of the ODG.

*The dynamics of the Hamiltonians are given in Figure 4.6, which shows that the original differential game is not an exact potential differential game, thus* (4.13) *does not hold. However, the dynamics of the Hamiltonians are close to each other and the differences always remain within the defined* ∆*. The maximal DDs are*

$$
\max \sigma_d^{(1)}(t) = 0.139 \text{ and } \max \sigma_d^{(2)}(t) = 0.093,
$$

*for which* (4.55) *holds showing that the differential game is an NPDG.*

**Figure 4.6:** The dynamics of the Hamiltonians $H^{(1)}$, $H^{(2)}$ and $H^{(p)}$: The given original differential game (ODG) is compared to the NPDG found by the LMI.

To compare the three computation methods, a normalized error measure (see e. g. [IC21]) is defined as

$$e_{\boldsymbol{x}} = \max\left\{e_{x_1}, e_{x_2}, \dots, e_{x_n}\right\} \tag{4.89a}$$

$$e_{x_{j}} = \max \left\| \frac{x_{j}^{*}(t)}{\max \left\| x_{j}^{*}(t) \right\|_{2}} - \frac{\hat{x}_{j}^{(p)}(t)}{\max \left\| \hat{x}_{j}^{(p)}(t) \right\|_{2}} \right\|_{2}, \ \forall j \in \{1, 2, \ldots, n\}, \tag{4.89b}$$

where $x^{(p)} = [\hat{x}_1^{(p)}, \hat{x}_2^{(p)}, \dots, \hat{x}_n^{(p)}]$ are the system state trajectories obtained from one of the potential games (OPDG or NPDG) and $x^{*} = [x_1^{*}, x_2^{*}, \dots, x_n^{*}]$ are the state trajectories of the NE of the original differential game. In Table 4.1, the resulting error measures are given. It can be seen that, on the one hand, NPDGs can be applied more generally. On the other hand, an NPDG leads to a larger normalized error measure compared to OPDG-TF. The reason for this is the additional conditions, which restrict the set of possible solutions. A limitation of OPDG-ITD is that it requires an initial parameter vector, which can have an impact on the optimization.
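The error measure (4.89) can be computed directly from sampled trajectories. The sketch below assumes NumPy and trajectories stored as arrays of shape (samples, states) on a common time grid, with no state that is identically zero; the function name `normalized_error` and the test signals are made up for illustration and are not the data behind Table 4.1.

```python
import numpy as np


def normalized_error(x_star, x_hat):
    """Normalized error measure of (4.89).

    x_star, x_hat: arrays of shape (n_samples, n_states) holding the sampled
    trajectories of the original game and of the potential game.
    Returns (e_x, e_per_state).
    """
    x_star = np.asarray(x_star, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    # Normalize every state by its largest magnitude over time.
    star_n = x_star / np.max(np.abs(x_star), axis=0)
    hat_n = x_hat / np.max(np.abs(x_hat), axis=0)
    e_per_state = np.max(np.abs(star_n - hat_n), axis=0)   # (4.89b)
    return float(np.max(e_per_state)), e_per_state          # (4.89a)


# Made-up trajectories: the first state decays slightly faster in the substitute game.
t = np.linspace(0.0, 5.0, 101)
x_star = np.column_stack([np.exp(-t), -np.exp(-2.0 * t)])
x_hat = np.column_stack([np.exp(-1.1 * t), -np.exp(-2.0 * t)])
e_x, e_states = normalized_error(x_star, x_hat)
print(f"e_x = {e_x:.4f}")
```

Because each state is normalized by its own peak value, the measure is insensitive to a pure rescaling of a trajectory and only captures differences in shape.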

**Table 4.1:** Comparison of normalized error measures of the NPDG with OPDG


### **4.4.2 Limitations of OPDGs and NPDGs**

The two novel subclasses, proposed in this thesis, offer a compact representation of the original game opening up potential applications beyond designing LISC. However, it is important to acknowledge the limitations associated with the current methods presented in this thesis, which are addressed as follows.

First, the input-trajectory-dependent computation of an OPDG can be time-consuming and impractical for some applications due to its dependence on the system state trajectories x. On the other hand, the trajectory-free optimization to find an OPDG can be applied only for special system structures. Moreover, no necessary and sufficient conditions are provided for the existence of an OPDG. In the case of NPDGs, the applicability is only shown for two-player games, necessitating further research to address solution concepts for N-player games.

Despite these limitations, this thesis marks the pioneering analysis and investigation of OPDGs and NPDGs, initiating new research directions. As such, these two novel subclasses hold significant relevance and importance for the research community in the field of differential game theory.

# **4.5 Systematic Calculation of the Cooperation State**

For a short reminder, the proposed design procedure of LISC has the following steps, see Figure 3.4 in Section 3.3:


Sections 4.2 and 4.3 handle the design step 2: The novel subclasses of potential games provide a more compact substituting model. In the following, the steps 3 and 4 are presented in detail.

With the two novel subclasses of potential games, a substituting model of the differential game is obtained, from which the CS of LISC can be derived systematically. The restriction of OPDG to two-player problems does not lead to further limitations because human-automation interactions can always be reduced to a two-player differential game [Fla16].

In the following, the derivation of the CS is presented based on the optimality principle of the identified potential game. First, let it be assumed that the quadratic potential function $J^{(p)}$, cf. (4.12), is computed with the developed methods. The NE of the original game corresponds to the optimum of the potential function. Therefore, using this optimum is reasonable to determine the CS. To find the optimum of the potential function, the dynamic optimization problem constituted by the potential function (4.12) and the linear system dynamics (4.9) has to be solved:

$$\frac{\partial H^{(p)}(t)}{\partial u^{(p)}(t)} = \mathbf{R}^{(p)} u^{(p)}(t) + \mathbf{B}^{(p)^{\mathsf{T}}} \lambda^{(p)}(t) = \mathbf{0},\tag{4.90a}$$

$$
\dot{x}(t) = \frac{\partial H^{(p)}(t)}{\partial \lambda^{(p)}(t)} = \mathbf{A}x(t) + \mathbf{B}^{(p)}u^{(p)}(t), \tag{4.90b}
$$

$$\dot{\lambda}^{(p)}(t) = -\frac{\partial H^{(p)}(t)}{\partial x(t)} = -\mathbf{Q}^{(p)}x(t) - \mathbf{A}^{\top}\lambda^{(p)}(t). \tag{4.90c}$$

The resulting optimum of the potential game, which corresponds to the NE of the original LQ differential game, is computed with standard dynamic optimization methods, see e. g. [PLB15, Chapter 12]. Furthermore, using conditions (4.90), a distinct relationship between the measured and non-measured states and the control inputs of all players is derived. By assuming a linear relationship between co-states and states

$$
\lambda^{(p)}(t) = \mathbf{P}^{(p)} x(t), \tag{4.91}
$$

and considering an infinite time horizon with a time-independent matrix $\mathbf{P}^{(p)}$, the time derivative of the co-state is

$$
\dot{\boldsymbol{\lambda}}^{(p)}(t) = \mathbf{P}^{(p)} \dot{\boldsymbol{x}}(t). \tag{4.92}
$$

Substituting (4.91) and (4.92) in (4.90c) leads to

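$$\mathbf{P}^{(p)}\dot{x}(t) = -\mathbf{Q}^{(p)}x(t) - \mathbf{A}^{\top}\mathbf{P}^{(p)}x(t).\tag{4.93}$$

Applying the system dynamics (4.90b),

$$\mathbf{P}^{(p)}\left(\mathbf{A}x(t) + \mathbf{B}u^{(p)}(t)\right) = -\mathbf{Q}^{(p)}x(t) - \mathbf{A}^{\top}\mathbf{P}^{(p)}x(t)$$

results. Collecting the terms in $x(t)$ gives $\left[\mathbf{P}^{(p)}\mathbf{A} + \mathbf{Q}^{(p)} + \mathbf{A}^{\top}\mathbf{P}^{(p)}\right] x(t) = -\mathbf{P}^{(p)}\mathbf{B}u^{(p)}(t)$, which can be rewritten as a function of the input $u^{(p)}(t)$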

$$\mathbf{x}(t) = -\left[\mathbf{P}^{(p)}\mathbf{A} + \mathbf{Q}^{(p)} + \mathbf{A}^{\top}\mathbf{P}^{(p)}\right]^{\dagger}\mathbf{P}^{(p)}\mathbf{B}u^{(p)}(t),\tag{4.94}$$

where the index † denotes the Moore-Penrose inverse<sup>30</sup>. Using (4.94), the weights of the linear CS in (3.39) can be derived systematically, which aims at a better understanding of the CS. Note that using the Moore-Penrose inverse leads to an approximate mapping between the inputs and the system states. In the general case, $x$ is determined by the solution of the differential equation of the dynamic system, cf. Definition 4.2.

Assuming two players (the automation and the human) and a division of the states into measured and non-measured ones leads to

$$
\begin{bmatrix} x_{\mathrm{m}}(t) \\ x_{\mathrm{nm}}(t) \end{bmatrix} = \underbrace{-\left[\mathbf{P}^{(p)}\mathbf{A} + \mathbf{Q}^{(p)} + \mathbf{A}^{\top}\mathbf{P}^{(p)}\right]^{\dagger}\mathbf{P}^{(p)}\mathbf{B}}_{=\Sigma} \begin{bmatrix} u^{\mathrm{(a)}}(t) \\ u^{\mathrm{(h)}}(t) \end{bmatrix},\tag{4.95}
$$

which can be rewritten to

$$
\begin{bmatrix} x_{\mathrm{m}}(t) \\ x_{\mathrm{nm}}(t) \end{bmatrix} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \begin{bmatrix} u^{\mathrm{(a)}}(t) \\ u^{\mathrm{(h)}}(t) \end{bmatrix}. \tag{4.96}
$$

The CS for linear systems is defined as

$$x_{\mathrm{nm}}(t) = \Xi^{\mathrm{(a)}} u^{\mathrm{(a)}}(t) + \Xi^{\mathrm{(h)}} u^{\mathrm{(h)}}(t),\tag{4.97}$$

for which the weighting matrices are computed from (4.96). If $x_{\kappa}(t) := x_{\mathrm{nm}}(t)$ is assumed, then the second block row of (4.96) yields the matrices $\Xi^{\mathrm{(a)}}$ and $\Xi^{\mathrm{(h)}}$ as

$$
\Xi^{\mathrm{(a)}} \coloneqq \Sigma_{21} \text{ and } \Xi^{\mathrm{(h)}} \coloneqq \Sigma_{22}.
$$
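The mapping (4.94) and its partition into the CS weights can be sketched numerically as follows. The code assumes NumPy, that the measured states are the first `n_m` entries of $x$ and that the automation inputs are the first `p_a` entries of the stacked input. Following the block structure of (4.96), in which $u^{\mathrm{(h)}}$ multiplies $\Sigma_{22}$ in the row of $x_{\mathrm{nm}}$, the human weight is taken from $\Sigma_{22}$. The function name and all matrices are arbitrary illustrative values; in particular, the chosen $\mathbf{P}^{(p)}$ is not the solution of a Riccati equation.

```python
import numpy as np


def cooperation_state_weights(A, B, Q_p, P_p, n_m, p_a):
    """Return (Sigma, Xi_a, Xi_h) following (4.94)-(4.97).

    n_m: number of measured states (assumed to be the first entries of x)
    p_a: number of automation inputs (assumed to be the first entries of u)
    """
    M = P_p @ A + Q_p + A.T @ P_p
    Sigma = -np.linalg.pinv(M) @ P_p @ B   # Moore-Penrose inverse as in (4.94)
    Xi_a = Sigma[n_m:, :p_a]               # block Sigma_21 (row of x_nm, column of u^(a))
    Xi_h = Sigma[n_m:, p_a:]               # block Sigma_22 (row of x_nm, column of u^(h))
    return Sigma, Xi_a, Xi_h


# Arbitrary illustrative numbers: two states, one input per player.
A = np.array([[0.0, 1.0], [-1.0, -1.0]])
B = np.array([[0.0, 0.0], [1.0, 0.5]])
Q_p = np.eye(2)
P_p = np.eye(2)

Sigma, Xi_a, Xi_h = cooperation_state_weights(A, B, Q_p, P_p, n_m=1, p_a=1)
print("Sigma =\n", Sigma)
print("Xi_a =", Xi_a, " Xi_h =", Xi_h)
```

`np.linalg.pinv` is used instead of an explicit inverse so that the sketch also runs when the bracketed matrix in (4.94) is singular, in which case the mapping is only approximate, as noted above.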

# **4.6 Design of the Feedback Gains**

After the computation of the parameters of CS (Step 3), the computation of the feedback gains is presented in the following section (Step 4), cf. Figure 3.4.

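<sup>30</sup> Note that, for a matrix $\mathbf{G}$ with full column rank, the Moore-Penrose inverse is computed as $\mathbf{G}^{\dagger} = (\mathbf{G}^{\mathsf{T}}\mathbf{G})^{-1}\mathbf{G}^{\mathsf{T}}$. The Moore-Penrose inverse itself always exists regardless of the dimension of the matrix.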

#### **Design with simulated human inputs**

In the first method to determine the parameters $\mathbf{Q}^{\mathrm{(a)}}_{\mathrm{LISC}}$ and $\mathbf{R}^{\mathrm{(a)}}_{\mathrm{LISC}}$, the shared control setup is simulated with FISC. Then, the simulated input signal of FISC, $u^{\mathrm{(a)}}_{\mathrm{FISC}}$, is used for the parameter computation, which is carried out with the nested optimization

$$\mathbf{Q}_{\mathrm{LISC}}^{\mathrm{(a)}}, \mathbf{R}_{\mathrm{LISC}}^{\mathrm{(a)}} = \underset{\mathbf{Q}_{\mathrm{LISC}}^{\mathrm{(a)}}, \mathbf{R}_{\mathrm{LISC}}^{\mathrm{(a)}}}{\arg\min} \left\| u_{\mathrm{FISC}}^{\mathrm{(a)}}(t) - u_{\mathrm{LISC}}^{\mathrm{(a)}}\left(t, \mathbf{Q}_{\mathrm{LISC}}^{\mathrm{(a)}}, \mathbf{R}_{\mathrm{LISC}}^{\mathrm{(a)}}\right) \right\|_{2} \tag{4.98a}$$

$$\text{s.t.} \tag{4.98b}$$

$$
\dot{u}_{\mathrm{LISC}}^{\mathrm{(a)}}(t) = \underset{\dot{u}^{\mathrm{(a)}}}{\arg\min} \; J_{\mathrm{LISC}}^{\mathrm{(a)}}\left(t, x(t), u^{\mathrm{(a)}}(t), u^{\mathrm{(h)}}(t)\right) \tag{4.98c}
$$

$$\text{w.r.t.}\quad \text{(3.40) and (3.44)}.\tag{4.98d}$$

Optimization (4.98) has two steps. The inner optimization ensures the optimality of $J^{\mathrm{(a)}}_{\mathrm{LISC}}$, leading to an optimal controller. The outer optimization ensures that the inputs of LISC and FISC are as similar as possible.

#### **Design with measured human inputs**

The second method uses measurement data of a human operator carrying out the task with FISC. These data can be used for individualizing LISC, as presented in [VIH21]. Such an individualization process is also practically feasible, as the human can first control the system in an artificial setting (e.g. test area, simulation), in which the use of FISC is possible because all the system states and references are available. Using FISC, the desired input-output behavior of the system is sought. For that, an FISC is designed according to (3.24), which provides the feedback gain $\mathbf{K}_{\mathrm{FISC}}$. In the next step, $\mathbf{K}_{\mathrm{FISC}}$ is applied together with the human controlling the manipulator. In this setup, the resulting automation inputs $u^{\mathrm{(a)}}[k]$, the inputs of the human operator $u^{\mathrm{(h)}}[k]$ and the system states $x_e[k]$ are measured. In this way, three stacks consisting of $M_k$ data points of the signals are obtained. Finally, $\mathbf{K}^{\mathrm{(a)}}_{\mathrm{LISC}}$ is computed with a least squares estimation using these measurements:

$$\hat{\mathbf{K}}_{\mathrm{LISC}}^{\mathrm{(a)}} = \underset{\mathbf{K}_{\mathrm{LISC}}^{\mathrm{(a)}}}{\arg\min} \sum_{k=1}^{M_k} \left( \dot{u}_{\mathrm{FISC}}^{\mathrm{(a)}}[k] + \mathbf{K}_{\mathrm{LISC}}^{\mathrm{(a)}} x_e[k] \right)^2. \tag{4.99}$$

Then, the individualized LISC can be applied in situations, in which the references or the system states are not available for the automation. Note that the computed feedback gains are not necessarily the optimum of the cost function (3.42). The schematic illustration of the controller design can be seen in Figure 4.7.
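The least squares estimation (4.99) can be sketched with a standard linear solver. The example below assumes NumPy and that the recorded signals are stacked row-wise, one row per sample $k$. The function name `estimate_gain`, the synthetic "true" gain and the random data are hypothetical stand-ins for measurements; no measured data from [VIH21] are reproduced here.

```python
import numpy as np


def estimate_gain(u_fisc, x_e):
    """Least-squares estimate of K in  u[k] = -K x[k], cf. (4.99).

    u_fisc: array (M_k, p) with the recorded FISC automation signal of (4.99)
    x_e:    array (M_k, n) with the recorded states
    """
    # Minimizing sum_k ||u[k] + K x[k]||^2 is the linear problem x_e @ K.T = -u_fisc.
    K_T, *_ = np.linalg.lstsq(x_e, -u_fisc, rcond=None)
    return K_T.T


# Synthetic data (assumed): 50 samples, 3 states, 1 automation input, small noise.
rng = np.random.default_rng(1)
K_true = np.array([[1.0, -0.5, 0.2]])
x_e = rng.standard_normal((50, 3))
u_fisc = -x_e @ K_true.T + 1e-3 * rng.standard_normal((50, 1))

K_hat = estimate_gain(u_fisc, x_e)
print("estimated gain:", np.round(K_hat, 3))
```

With sufficiently exciting data, the estimate approaches the gain that generated the inputs. With poorly excited data the regression becomes ill-conditioned, and `np.linalg.lstsq` then returns the minimum-norm solution.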

To summarize the procedure, the three steps of the proposed personalized design are


The advantage of this second method is that the feedback gains can be personalized for the individual operators. However, further analysis showed that this parameterization is less robust compared to (4.98), for more details see [VIH21].

**Figure 4.7:** The design procedure of LISC through input matching with FISC, which yields the desired parameters, [VIH21] ©2021 IEEE

# **4.7 Stability Analysis**

To ensure the practical applicability and reliability of the proposed design procedure, its stability analysis is necessary. The system equation (3.40) is used for this analysis. The stability of LISC was examined by way of example in [VSL<sup>+</sup>20]. The main challenge is that the mutual influence between the human-controlled and the automation-controlled system parts is taken into account through the CS. The question to be answered is whether the stability of the overall system is still ensured by the LISC structure with the CS. Therefore, this section provides a general analysis, which takes both the human-controlled and the automation-controlled parts (shortly the overall system) into account.

A schematic illustration of the relationship between the system equations is given in Figure 4.8. It shows that the overall system includes two autonomous system parts: The automation-controlled part ($G_{\mathrm{m}}$) and the human-controlled part ($G_{\mathrm{nm}}$). For the stability analysis, the two connections ($y_{\mathrm{m}}$ and $u^{\mathrm{(h)}}$) marked in Figure 4.8 are important. Note that in accordance with Assumption 3.2.2, $G_{\mathrm{nm}}$ has no direct impact on $G_{\mathrm{m}}$. The main question is how the connection of these two subsystems influences the stability of the overall system.

During the control design, the measurable system part has to be taken into account: The stability of $G_{\mathrm{m}}$ is ensured by the automation. Thus, first, the closed-loop behavior of $G_{\mathrm{m}}$ is analyzed.

**Figure 4.8:** The schematic illustration of the overall control system is shown. The stability analysis includes two steps: First, the stability of the subsystem $G_{\mathrm{m}}$ has to be ensured (orange + green blocks). Then, the combination of $G_{\mathrm{m}}$ and $G_{\mathrm{nm}}$ (blue blocks) needs to be analyzed. Note that $G_{\mathrm{nm}}$ does not have a direct impact on $G_{\mathrm{m}}$, cf. (3.27).

#### **Lemma 4.7 (Stability of the LISC-controlled system part)**

*Substituting (3.41) in (3.40) leads to the closed-loop structure of* $G_{\mathrm{m}}$*, which is defined such that*

$$
\begin{bmatrix} \dot{x}_{\mathrm{m}}(t) \\ \dot{u}^{\mathrm{(a)}}(t) \\ \dot{x}_{\kappa}(t) \end{bmatrix} = \underbrace{\begin{bmatrix} \mathbf{A}_{\mathrm{m}} & \mathbf{B}^{\mathrm{(a)}} & \mathbf{0} \\ -\mathbf{K}_{\mathrm{LISC},1}^{\mathrm{(a)}} & -\mathbf{K}_{\mathrm{LISC},2}^{\mathrm{(a)}} & -\mathbf{K}_{\mathrm{LISC},3}^{\mathrm{(a)}} \\ -\Xi^{\mathrm{(a)}}\mathbf{K}_{\mathrm{LISC},1}^{\mathrm{(a)}} & -\Xi^{\mathrm{(a)}}\mathbf{K}_{\mathrm{LISC},2}^{\mathrm{(a)}} & -\Xi^{\mathrm{(a)}}\mathbf{K}_{\mathrm{LISC},3}^{\mathrm{(a)}} \end{bmatrix}}_{\mathbf{A}_{\mathrm{m,c}}} \begin{bmatrix} x_{\mathrm{m}}(t) \\ u^{\mathrm{(a)}}(t) \\ x_{\kappa}(t) \end{bmatrix} + \begin{bmatrix} \mathbf{0} \\ \mathbf{0} \\ \Xi^{\mathrm{(h)}} \end{bmatrix} \dot{u}^{\mathrm{(h)}}(t), \tag{4.100}
$$

*where* $\mathbf{K}_{\mathrm{LISC},i}^{\mathrm{(a)}}$, $i \in \{1, 2, 3\}$, *are the feedback gains of LISC. Furthermore, let it be assumed that* $\dot{u}^{\mathrm{(h)}}(t)$ *is a bounded and Lipschitz continuous function. Moreover,* $\dot{u}^{\mathrm{(h)}}(t) \to 0$ *as* $t \to \infty$*. If* $\mathbf{A}_{\mathrm{m,c}}$ *has only eigenvalues with negative real part, then* (4.100) *is asymptotically stable.*

#### **Proof:**

The proof is obtained from the stability of linear systems, see e. g. [Kha15, Lemma 3.2].

To enable the analysis of the overall system behavior, the non-measurable system part with the human control actions is taken into account in the following. An adequate assumption for the human-controlled system part is that $G_{\mathrm{nm}}$ is stable as long as $y_{\mathrm{m}}$ is bounded. In that case, $y_{\mathrm{m}}$ is an external disturbance for $G_{\mathrm{nm}}$, which is compensated by the human. The practical reason for this assumption is that the human is able to control the subsystem as they are trained to do. Adding $G_{\mathrm{m}}$ to $G_{\mathrm{nm}}$ and extending it with the CS, the overall system is obtained

$$
\tilde{\mathbf{E}}_{\mathrm{o}} \dot{x}_{\mathrm{o}}(t) = \mathbf{A}_{\mathrm{o}} x_{\mathrm{o}}(t) + \mathbf{B}_{\mathrm{o}}^{\mathrm{(h)}} \dot{u}^{\mathrm{(h)}}(t) + \mathbf{B}_{\mathrm{o}}^{\mathrm{(a)}} \dot{u}^{\mathrm{(a)}}(t), \tag{4.101}
$$

which is a differential algebraic system, where the matrix $\tilde{\mathbf{E}}_{\mathrm{o}}$ is given as

$$
\tilde{\mathbf{E}}_{\mathrm{o}} = \begin{bmatrix}
\mathbf{E}_{n-k} & 0 & 0 & 0 & 0 \\
0 & \mathbf{E}_{p_n} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \mathbf{E}_k & 0 \\
0 & 0 & 0 & 0 & \mathbf{E}_{p_h}
\end{bmatrix}
$$

where $\mathbf{E}_n$ is an identity matrix of size $n \times n$. The state vector, the system matrix and the input matrices are

$$x_{\mathrm{o}}(t) = \begin{bmatrix} x_{\mathrm{m}}(t) & u^{\mathrm{(a)}}(t) & x_{\kappa}(t) & x_{\mathrm{nm}}(t) & u^{\mathrm{(h)}}(t) \end{bmatrix}^{\top},$$

$$\mathbf{A}_{\mathrm{o}} = \begin{bmatrix} \mathbf{A}_{\mathrm{m}} & \mathbf{B}^{\mathrm{(a)}} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \Xi^{\mathrm{(a)}} & -\mathbf{E}_k & \mathbf{0} & \Xi^{\mathrm{(h)}} \\ \mathbf{A}_{\mathrm{m-nm}} & \mathbf{0} & \mathbf{0} & \mathbf{A}_{\mathrm{nm}} & \mathbf{B}^{\mathrm{(h)}} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \end{bmatrix}, \tag{4.102}$$

$$\mathbf{B}_{\mathrm{o}}^{\mathrm{(a)}} = \begin{bmatrix} \mathbf{0} & \mathbf{1} & \mathbf{0} & \mathbf{0} & \mathbf{0} \end{bmatrix}^{\top}, \tag{4.103}$$

$$\mathbf{B}_{\mathrm{o}}^{\mathrm{(h)}} = \begin{bmatrix} \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{1} \end{bmatrix}^{\top}. \tag{4.104}$$

To enable an analysis of the overall system, a human control law has to be assumed, otherwise, its impact cannot be taken into account. The following control law of the human is assumed<sup>31</sup>

$$
\dot{u}^{\mathrm{(h)}}(t) = -\mathbf{K}_x^{\mathrm{(h)}} \cdot x(t) - \mathbf{K}_u^{\mathrm{(h)}} \cdot u^{\mathrm{(h)}}(t), \tag{4.105}
$$

from which the resulting human input is computed as $u^{\mathrm{(h)}}(t) = \int_0^t \dot{u}^{\mathrm{(h)}}(\tau)\, \mathrm{d}\tau$, which corresponds to a standard PI controller. Modeling the human behavior as a PI or a PID controller can be found in [HJ06, HMV19] or [Ort20, Chapter 2].

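#### **Lemma 4.8 (Stability of the LISC-controlled overall system)**

*Let it be assumed that* (4.105) *holds. Furthermore, let the matrix*

$$
\mathbf{A}_{\mathrm{stab}} = \begin{bmatrix}
\mathbf{A}_{\mathrm{m}} & \mathbf{B}^{\mathrm{(a)}} & \mathbf{0} & \mathbf{0} \\
-\mathbf{K}_{\mathrm{LISC},1}^{\mathrm{(a)}} & -\mathbf{K}_{\mathrm{LISC},2}^{\mathrm{(a)}} - \mathbf{K}_{\mathrm{LISC},3}^{\mathrm{(a)}}\Xi^{\mathrm{(a)}} & \mathbf{0} & -\mathbf{K}_{\mathrm{LISC},3}^{\mathrm{(a)}}\Xi^{\mathrm{(h)}} \\
\mathbf{A}_{\mathrm{m-nm}} & \mathbf{0} & \mathbf{A}_{\mathrm{nm}} & \mathbf{B}^{\mathrm{(h)}} \\
\mathbf{0} & \mathbf{0} & \mathbf{K}_x^{\mathrm{(h)}} & \mathbf{K}_u^{\mathrm{(h)}}
\end{bmatrix} \tag{4.106}
$$

*be defined as the stability matrix of the overall system. Then, the system* (4.101) *is stable if and only if* $\operatorname{Re}\left\{\Lambda_i(\mathbf{A}_{\mathrm{stab}})\right\} < 0, \ \forall i = 1, \dots, n$*.*

<sup>31</sup> The assumption of a preliminary knowledge of the human control law is feasible, as the FISC and LISC designs require it, cf. Section 3.2.2.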

#### **Proof:**

Substituting the control law (3.41) and (4.105) in (4.101) yields an autonomous closed-loop system. This is a differential algebraic system, where the system matrix is

$$\mathbf{A}_{\mathrm{o,c}} = \begin{bmatrix} \mathbf{A}_{\mathrm{m}} & \mathbf{B}^{\mathrm{(a)}} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ -\mathbf{K}_{\mathrm{LISC},1}^{\mathrm{(a)}} & -\mathbf{K}_{\mathrm{LISC},2}^{\mathrm{(a)}} & -\mathbf{K}_{\mathrm{LISC},3}^{\mathrm{(a)}} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & -\Xi^{\mathrm{(a)}} & \mathbf{E}_{k} & \mathbf{0} & -\Xi^{\mathrm{(h)}} \\ \mathbf{A}_{\mathrm{m-nm}} & \mathbf{0} & \mathbf{0} & \mathbf{A}_{\mathrm{nm}} & \mathbf{B}^{\mathrm{(h)}} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{K}_{x}^{\mathrm{(h)}} & \mathbf{K}_{u}^{\mathrm{(h)}} \end{bmatrix}.$$

Reordering the state vector to $[x_{\mathrm{m}}(t), u^{\mathrm{(a)}}(t), x_{\mathrm{nm}}(t), u^{\mathrm{(h)}}(t), x_{\kappa}(t)]$, the closed-loop matrix of the overall system results as

$$
\tilde{\mathbf{A}}_{\mathrm{o,c}} = \left[\begin{array}{cccc:c}
\mathbf{A}_{\mathrm{m}} & \mathbf{B}^{\mathrm{(a)}} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\
-\mathbf{K}_{\mathrm{LISC},1}^{\mathrm{(a)}} & -\mathbf{K}_{\mathrm{LISC},2}^{\mathrm{(a)}} & \mathbf{0} & \mathbf{0} & -\mathbf{K}_{\mathrm{LISC},3}^{\mathrm{(a)}} \\
\mathbf{A}_{\mathrm{m-nm}} & \mathbf{0} & \mathbf{A}_{\mathrm{nm}} & \mathbf{B}^{\mathrm{(h)}} & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{K}_{x}^{\mathrm{(h)}} & \mathbf{K}_{u}^{\mathrm{(h)}} & \mathbf{0} \\
\hdashline
\mathbf{0} & -\Xi^{\mathrm{(a)}} & \mathbf{0} & -\Xi^{\mathrm{(h)}} & \mathbf{E}_{k}
\end{array}\right],\tag{4.107}
$$

where the differential and the algebraic equations are separated by dashed lines. To enable the stability analysis of the overall system, an index reduction of (4.107) is carried out, see e. g. [LMT13, Chapter 1, 2] or [Lun16]. An index reduction of (4.107) is always possible, since $\mathbf{E}_k$ is an identity matrix and thus invertible.

Then, using the definition of the cooperation state for the automation control law

$$\begin{split} \dot{u}^{\mathrm{(a)}}(t) &= -\mathbf{K}_{\mathrm{LISC},1}^{\mathrm{(a)}} \cdot x_{\mathrm{m}}(t) - \mathbf{K}_{\mathrm{LISC},2}^{\mathrm{(a)}} \cdot u^{\mathrm{(a)}}(t) - \mathbf{K}_{\mathrm{LISC},3}^{\mathrm{(a)}} x_{\kappa}(t) \\ &= -\mathbf{K}_{\mathrm{LISC},1}^{\mathrm{(a)}} \cdot x_{\mathrm{m}}(t) - \mathbf{K}_{\mathrm{LISC},2}^{\mathrm{(a)}} \cdot u^{\mathrm{(a)}}(t) - \mathbf{K}_{\mathrm{LISC},3}^{\mathrm{(a)}} \left(\Xi^{\mathrm{(a)}} \cdot u^{\mathrm{(a)}}(t) + \Xi^{\mathrm{(h)}} \cdot u^{\mathrm{(h)}}(t)\right) \\ &= -\mathbf{K}_{\mathrm{LISC},1}^{\mathrm{(a)}} \cdot x_{\mathrm{m}}(t) - \left(\mathbf{K}_{\mathrm{LISC},2}^{\mathrm{(a)}} + \mathbf{K}_{\mathrm{LISC},3}^{\mathrm{(a)}} \Xi^{\mathrm{(a)}}\right) \cdot u^{\mathrm{(a)}}(t) - \mathbf{K}_{\mathrm{LISC},3}^{\mathrm{(a)}} \Xi^{\mathrm{(h)}} \cdot u^{\mathrm{(h)}}(t), \end{split} \tag{4.108}$$

the explicit dependency on the cooperation state is eliminated. Substituting (4.108) into (4.101) leads to the autonomous system

$$
\begin{bmatrix}
\dot{\boldsymbol{x}}\_{\text{m}}(t) \\
\dot{\boldsymbol{u}}^{\text{(a)}}(t) \\
\dot{\boldsymbol{x}}\_{\text{nm}}(t) \\
\dot{\boldsymbol{u}}^{\text{(h)}}(t)
\end{bmatrix} = \underbrace{\begin{bmatrix}
\mathbf{A}\_{\text{m}} & \mathbf{B}^{\text{(a)}} & \mathbf{0} & \mathbf{0} \\
-\mathbf{K}\_{\text{LISC},1}^{\text{(a)}} & -\left(\mathbf{K}\_{\text{LISC},2}^{\text{(a)}} + \mathbf{K}\_{\text{LISC},3}^{\text{(a)}} \Xi^{\text{(a)}}\right) & \mathbf{0} & \mathbf{K}\_{\text{LISC},3}^{\text{(a)}} \Xi^{\text{(h)}} \\
\mathbf{A}\_{\text{m-nm}} & \mathbf{0} & \mathbf{A}\_{\text{nm}} & \mathbf{B}^{\text{(h)}} \\
\mathbf{0} & \mathbf{0} & \mathbf{K}\_{\text{x}}^{\text{(h)}} & \mathbf{K}\_{\text{u}}^{\text{(h)}}
\end{bmatrix}}\_{\mathbf{A}\_{\text{stab}}} \begin{bmatrix}
\boldsymbol{x}\_{\text{m}}(t) \\
\boldsymbol{u}^{\text{(a)}}(t) \\
\boldsymbol{x}\_{\text{nm}}(t) \\
\boldsymbol{u}^{\text{(h)}}(t)
\end{bmatrix}.\tag{4.109}$$

If all eigenvalues of the matrix $\mathbf{A}\_{\text{stab}}$ have negative real parts, the autonomous system (4.109) is stable [Kha15, Lemma 3.2]. Consequently, the overall system (3.43) controlled by the designed limited information shared controller is stable, which proves the lemma.

The procedure of the stability analysis could raise the question of why the overall LISC design is not performed using this reduced system, cf. (4.109). Note that the use of the CS facilitates the systematic design of LISC. It has a practical interpretation, cf. Example 3.1, and provides a measure of the mutual effort of the human and the automation.

# **4.8 Summary of the Chapter**

This chapter reports a solution to the second research question by providing a systematic automation design of the LISC. The preliminaries on the theory of potential games for both static and dynamic cases are presented, followed by a discussion on the limitations of potential games and the necessity of novel concepts. To this end, the subclasses of ordinal potential static games are extended for differential games. Afterwards, the necessary and sufficient conditions of the existence of OPDGs are presented. In addition, two computation methods are developed, which provide a potential function of the original differential game. The first one requires the state trajectories and the cost functions of the players to identify the potential function for the original differential game. For the second method, in contrast, the computation is formulated as a linear matrix inequality (LMI) problem and solely the cost functions of the players are required. Thus, this second method can determine the potential function more efficiently than the first method.

The concept of OPDGs works under some specific assumptions only. Therefore, the subclass of NPDGs is introduced. The necessary and sufficient conditions of the existence of an NPDG and a computation algorithm are presented. The computation is also formulated as an LMI, which provides the potential function of the NPDG. The two novel subclasses are compared to each other, and the simulation results showed that both subclasses are able to provide a substituting model of the original differential game.

The remaining sections of this chapter elaborate on the third and fourth design steps of LISC: The systematic calculation of the cooperation state is based on the optimality principle. The computation of the feedback gain is realized by two methods using a) simulations or b) measurements. In the final part of the chapter, the stability analysis of the proposed LISC is presented. In the next chapter, this design procedure is applied to a large vehicle manipulator.

# **5 Application to Vehicle Manipulators**

In the previous chapters, the concept of the LISC and its design procedure using the CS and potential games were presented. In this chapter, they are applied to a large vehicle manipulator; for an overview of the design procedure, see Figure 3.4 in Section 3.3. Thus, the third research question of this thesis is answered in the following: The practical benefits of LISC and its design procedure are analyzed and compared to the current state-of-the-art technical solutions. This application and the analyses take place in three successive stages: First, the general applicability of the design procedure is analyzed using simulations in Section 5.2 and Section 5.3. Second, LISC is tested in a complex simulation environment with a simulated human behavior in order to analyze the impact of large roll and pitch angles. This stage is referred to as *qualitative analysis* in the following. Third, three experiments with human test subjects are conducted, providing stronger indications of the practical usability of LISC. These are the subject of Chapter 6 and are called *experiments* or *experimental analysis* in the following.

First, this chapter introduces the novel design models of the vehicle manipulator, which are published in two research papers [VMSH19, VH22]. Afterwards, the proposed design procedure including four steps is applied: In the first step, a FISC is designed for the lateral and the longitudinal shared control. Then, the second step is applied: Using the shared control setup modeled by a given differential game, the corresponding OPDGs and NPDGs are computed for the lateral and the longitudinal cases. From the potential differential games, the parameters of the CS are computed in the third design step, which is followed by the computation of the feedback gains of LISC by means of (4.98). Finally, the stability and the usability of LISC are analyzed in simulations. In order to enable an analysis under more realistic conditions, a complex simulation model is developed in Section 5.4. Using simulations, the qualitative analysis of this complex simulation model and LISC closes the chapter.

# **5.1 Design Models of a Vehicle Manipulator**

For the model-based control design, the longitudinal and lateral models of a large vehicle manipulator are presented in this section. To set up the models, the basic idea from Chapter 1 is used: the human operator controls the manipulator and the automation controls the vehicle. This division serves the consistency of the following presentation and does not imply any limitation from the control engineering point of view, since both the human and the automation can be modeled as controllers of the system.

### **5.1.1 Lateral Model**

In the following, the principal geometrical relationships and the linear lateral model of the large vehicle manipulator are presented. The detailed derivation of the non-linear equations of motion is given in Appendix B.1.

The vehicle manipulator consists of a vehicle and a manipulator subsystem, both of which are modeled in the Frénet Frame<sup>32</sup> relative to their given reference paths, Γveh and Γman. The vehicle is characterized by means of the kinematic bicycle model, see Figure 5.1. It is assumed that the velocity vveh of the vehicle is quasi-constant and the steering angle

$$u^{(a)} = \delta \tag{5.1}$$

is the input variable of the automation. The vehicle can be described through its position Pveh and orientation θveh in the global frame $(i, j)\_{O}$. From this global coordinate system O, the states of the vehicle are transformed into its Frénet Frame $(i\_{\text{rv}}, j\_{\text{rv}})\_{P\_{\text{rv}}}$. Here, the vehicle system part contains two state variables: the lateral distance dveh to the reference path between the points Pveh and Prv, and the orientation error ∆θveh = θveh − θrv, see Figure 5.1.

The manipulator is modeled as a planar robotic arm relative to its reference Γman, in its Frénet Frame $(i\_{\text{rm}}, j\_{\text{rm}})\_{P\_{\text{rm}}}$. In order to describe the manipulator subsystem, two variables are necessary: the lateral error dman and the orientation error of the manipulator $\Delta\alpha\_{\text{man}} = \alpha - \alpha\_r$, where $a\_r$ is the reference length of the manipulator resulting from the given $\alpha\_r$ and the position of Pveh, see Figure 5.1. The detailed derivation of the non-linear model of the manipulator can be found in Appendix B.1. Inspired by models of hydraulic manipulators from literature [Rud18, PR18], two input variables of the human-controlled manipulator are assumed

$$\mathbf{u}^{\text{(h)}} = \begin{bmatrix} \dot{\mathbf{a}}\_{\text{des}}, \alpha\_{\text{des}} \end{bmatrix}, \tag{5.2}$$

**Figure 5.1:** The lateral design model for the FISC and LISC design of the large vehicle manipulator [VMSH19]. ©2020 IEEE

<sup>32</sup> For the modeling of wheeled robots in the Frénet Frame, see [BK16, Section 49.2].

which are the desired velocity and position of a and α, respectively. Using the geometrical relations between the global frame and Frénet Frames of the vehicle manipulator, a non-linear model is obtained, see Appendix B.1. The linearization leads to a linear model of the vehicle manipulator with four system states

$$\mathbf{x} = \begin{bmatrix} d\_{\text{man}}, \ \Delta \alpha\_{\text{man}}, \ d\_{\text{veh}}, \ \Delta \theta\_{\text{veh}} \end{bmatrix}^{\mathsf{T}}. \tag{5.3}$$

For a constant velocity vveh, an LTI system is obtained with the system and input matrices

$$\mathbf{A} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & v\_{\text{veh}} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \tag{5.4a}$$

$$\mathbf{B}^{(\text{h})} = \begin{bmatrix} \sin \alpha\_r & \mathbf{a}\_r \cos \alpha\_r \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad \mathbf{B}^{(\text{a})} = \begin{bmatrix} L \cdot v\_{\text{veh}} \\ 0 \\ 0 \\ v\_{\text{veh}} \end{bmatrix}. \tag{5.4b}$$

### **5.1.2 Longitudinal Model**

To apply LISC to the longitudinal control, the longitudinal design model<sup>33</sup> of the large vehicle manipulator is presented in the following. The system is also described in the Frénet Frame, meaning that the references $s\_{\text{ref,v}}$, $\dot{s}\_{\text{ref,v}}$ and $s\_{\text{ref,m}}$ are additionally given. Using a general approach, the vehicle is modeled as a double integrator along its reference path Γveh in longitudinal direction, see e. g. [LP17, Chapter 13]. As suggested in [Rud18], the manipulator is modeled as an integrator along its reference path Γman. The system state vector of the vehicle manipulator is

$$\boldsymbol{x}\_{\rm lon} = \begin{bmatrix} \Delta s\_{\rm veh}, \ \Delta \dot{s}\_{\rm veh}, \ \Delta s\_{\rm man} \end{bmatrix}^{\rm T}.$$

The first system state describes the longitudinal position error of the vehicle $\Delta s\_{\rm veh} = s\_{\rm veh} - s\_{\text{ref,v}}$, which is followed by its velocity error $\Delta \dot{s}\_{\rm veh} = \dot{s}\_{\rm veh} - \dot{s}\_{\text{ref,v}}$. The third state is the longitudinal position error of the manipulator $\Delta s\_{\rm man} = s\_{\rm man} - s\_{\text{ref,m}}$, see Figure 5.2 representing the longitudinal design model. The inputs of the longitudinal model

$$u\_{\rm lon}^{(h)} = \varphi\_{\rm joy} \text{ and } u\_{\rm lon}^{(a)} = \ddot{s}\_{\rm veh,des}$$

are the desired longitudinal speed of the manipulator and the desired acceleration of the vehicle, respectively. The system and input matrices are

$$\mathbf{A}\_{\rm lon} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \quad \mathbf{B}\_{\rm lon}^{(a)} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad \mathbf{B}\_{\rm lon}^{(h)} = \begin{bmatrix} 0 \\ 0 \\ K\_{\rm joy} \end{bmatrix}, \tag{5.5}$$

where $K\_{\rm joy}$ is the gain factor of the human's input. This model is used to design the FISC and LISC for a longitudinal sharing of the operator's task, which are computed as presented in Section 5.3.
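The longitudinal design model (5.5) can be explored numerically with a few lines of code. The following is only a minimal sketch, not the simulation setup used in this work: the value `K_JOY = 1.0`, the forward Euler integration and the function names are assumptions made for illustration.

```python
# Longitudinal design model (5.5); states: [ds_veh, ds_veh_dot, ds_man].
K_JOY = 1.0  # assumed joystick gain, the actual value is not given here

A_LON = [[0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0]]
B_A = [0.0, 1.0, 0.0]    # automation input: desired vehicle acceleration
B_H = [0.0, 0.0, K_JOY]  # human input: joystick command


def derivative(x, u_h, u_a):
    """Return x_dot = A x + B_h u_h + B_a u_a for the model (5.5)."""
    return [
        sum(a * xi for a, xi in zip(row, x)) + bh * u_h + ba * u_a
        for row, bh, ba in zip(A_LON, B_H, B_A)
    ]


def simulate(x0, u_h, u_a, t_end, dt=1e-3):
    """Integrate the model with forward Euler for constant inputs."""
    x = list(x0)
    for _ in range(round(t_end / dt)):
        dx = derivative(x, u_h, u_a)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


if __name__ == "__main__":
    # Constant vehicle acceleration, no joystick input.
    print(simulate([0.0, 0.0, 0.0], u_h=0.0, u_a=1.0, t_end=1.0))
```

Because the third row of the system matrix equals the first one, a pure vehicle acceleration moves the manipulator error by the same amount as the vehicle position error, which reflects the mechanical coupling of the two subsystems in this model.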

<sup>33</sup> Note that, for the sake of brevity, the matrices and vectors carry the index lon only in the case of the longitudinal design model. No index is used for the lateral model.

**Figure 5.2:** Illustration of the longitudinal design model in the Frénet Frame for the LISC design [VH22]

# **5.2 Lateral LISC Design for a Vehicle Manipulator**

The first step of the proposed procedure is the design of FISC, followed by the computation of a corresponding near potential differential game and the calculation of the parameters of the CS. Finally, a simulation demonstrates the principal usability of the designed LISC.

### **5.2.1 FISC Design for a Dual Task**

This subsection provides the first application of the FISC for a dual task in accordance with Definition 2.2. Note that in [Fla16], the adjustment of the controller parameters for a dual task was not taken into account, since both the automation and the human can carry out the task individually. However, in the case of a dual task, the modification of the global objective function J(g) has a stronger impact on the overall motion. In the case of a task prioritization problem of the vehicle manipulator, cf. Definition 2.3, an inadequate design of FISC therefore has a stronger impact on the overall performance.

The parameters of the lateral linear design model are given in Table B.1. The velocity is assumed to be constant, vveh = 1.2 m/s, which is a typical operation velocity of large vehicle manipulators, see e. g. [MUL20].

To compute the optimal FISC, the operator model is used as given in Section 3.1.2. The human inputs are computed by (3.6) and the matrices of the cost function (3.14) are chosen as

$$\mathbf{Q}^{(\mathbf{h})} = \text{diag}\left\{ 4.5, 1, 0.5, 0.5 \right\} \tag{5.6a}$$

$$\mathbf{R}^{(\text{h})} = \text{diag}\left\{ 0, 1.05, 0.9 \right\}. \tag{5.6b}$$

The numerical values of Q(h) and R(h) model a general behavior of a human operator: The lateral error of the manipulator is more important than the other three system states. The ratios of these numerical values were determined empirically. One of the main challenges is properly choosing the global objective function J(g) for a system with a dual task. The operator requires smooth motions for both trajectories. Due to the mechanical coupling, the vehicle influences the manipulator. Thus, rough motions of the vehicle lead to undesired manipulator motion.

The global objective function is quadratic, cf. (3.24) with matrices

$$\mathbf{Q}^{\text{(g)}} = \text{diag}\left(\mathbf{Q}\_{d\_{\text{man}}}^{\text{(g)}}, \mathbf{Q}\_{\Delta\alpha\_{\text{man}}}^{\text{(g)}}, \mathbf{Q}\_{d\_{\text{veh}}}^{\text{(g)}}, \mathbf{Q}\_{\Delta\theta\_{\text{veh}}}^{\text{(g)}}\right) = \text{diag}\left( 5.5, 0.5, 1.25, 0.85 \right)\tag{5.7a}$$

$$\mathbf{R}^{\text{(g)}} = \text{diag}\left(\mathbf{R}^{\text{(g)}}\_{\delta}, \mathbf{R}^{\text{(g)}}\_{\dot{a}\_{\text{des}}}, \mathbf{R}^{\text{(g)}}\_{\alpha\_{\text{des}}}\right) = \text{diag}\left(1, 1.45, 1.35\right). \tag{5.7b}$$

The choice of J(g) includes the following considerations: First, the penalty of the lateral error of the manipulator ($\mathbf{Q}^{\text{(g)}}\_{d\_{\text{man}}}$) should be larger than the penalty for the vehicle ($\mathbf{Q}^{\text{(g)}}\_{d\_{\text{veh}}}$), because reaching the reference of the manipulator constitutes the *mutual effort* of the shared control setup and consequently has a higher priority. Through the ratio of these two values, the task prioritization problem (see Definition 2.3) can be solved. Second, due to the mechanical coupling of the two subsystems, changes of the vehicle's orientation can be frustrating and counter-intuitive for the operator, in contrast to the orientation error of the manipulator. Therefore, $\mathbf{Q}^{\text{(g)}}\_{\Delta\theta\_{\text{veh}}}$ is chosen to be larger than $\mathbf{Q}^{\text{(g)}}\_{\Delta\alpha\_{\text{man}}}$.
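The effect of the weight ratio in (5.7) on the task prioritization can be illustrated with a short calculation. The sketch below assumes that the integrand of J(g) has the standard quadratic form x<sup>T</sup>Qx + u<sup>T</sup>Ru, which is not restated in this section; the function name `stage_cost` is chosen for illustration only.

```python
# Diagonal weights of the global objective function from (5.7).
Q_G = [5.5, 0.5, 1.25, 0.85]  # d_man, dalpha_man, d_veh, dtheta_veh
R_G = [1.0, 1.45, 1.35]       # delta, a_des_dot, alpha_des


def stage_cost(x, u):
    """Quadratic stage cost x^T Q x + u^T R u for diagonal Q and R.

    The quadratic form is an assumption; only the weights stem from (5.7).
    """
    state_part = sum(q * xi * xi for q, xi in zip(Q_G, x))
    input_part = sum(r * ui * ui for r, ui in zip(R_G, u))
    return state_part + input_part


NO_INPUT = [0.0, 0.0, 0.0]
COST_MAN = stage_cost([0.1, 0.0, 0.0, 0.0], NO_INPUT)  # 0.1 m manipulator error
COST_VEH = stage_cost([0.0, 0.0, 0.1, 0.0], NO_INPUT)  # 0.1 m vehicle error

if __name__ == "__main__":
    print(COST_MAN / COST_VEH)  # ratio of the two lateral error penalties
```

Under this assumption, equal lateral errors of the manipulator and the vehicle are penalized with the ratio 5.5/1.25 = 4.4, which quantifies the higher priority of the manipulator reference.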

Using the linear model (5.4), the parameters of the FISC can be computed by (3.24). The parameters are assembled into the vector $\theta^{(a)} = \left(\mathrm{vec}(\mathbf{Q}^{(a)}), \mathrm{vec}(\mathbf{R}^{(a)})\right)$. The obtained cost function of FISC has the following matrices:

$$\mathbf{Q}\_{\text{FISC}}^{\text{(a)}} = \text{diag}\left\{ 4.21, 3.37, 1.32, 0.35 \right\}\tag{5.8a}$$

$$\mathbf{R}\_{\text{FISC}}^{\text{(a)}} = \text{diag}\left\{ 1, 0, 0 \right\}. \tag{5.8b}$$

The initial values of the optimization (3.24) are chosen as $\theta^{(a)}\_0 = [5, 0.1, 1, 0.9, 0.1, 0, 0]$. Thus, the FISC design leads to a differential game, in which the first player is the human and the second one is the designed FISC. From this differential game, the feedback gains of the players (K(a) and K(h)) are computed and the system trajectories can be simulated, which are necessary for the computation of the differential potential games. The initial value of the simulation was chosen as $x\_0 = [0.65, -0.75, -1.5, 0.25]$.

### **5.2.2 Computation of the NPDG**

The lateral model has one scalar input u(a) and one vector input u(h). Thus, the methods of OPDGs cannot be applied, cf. Assumption 4.3.1-2 in Section 4.2. Therefore, an NPDG is sought, which can model this shared control setup. From the linear system (5.4) in Section 5.1 with the quadratic cost functions of the human (5.6) and the FISC (5.8), a quadratic potential function (4.11) is computed.

#### **Computation with Perfect Feedback Gains**

For the computation of NPDG with the proposed LMI (4.88), K(p) is either estimated directly from the measurements of the system states and inputs or computed from the differential game. Here, it is assumed that

$$\mathbf{K}^{(p)} = \left[\mathbf{K}^{(h)}, \mathbf{K}^{(a)}\right] \tag{5.9}$$

holds, where K(h) and K(a) are obtained from the solution of the human-automation differential game, cf. Section 3.1.3. Using (5.8) and (5.6), P(a) and P(h) can be computed by means of (3.20). Finally, the upper estimate of the DD is chosen as ∆ = 0.1. This specifies all the necessary input variables of the proposed LMI optimization (4.88), so that an NPDG can be computed.

The obtained matrices of the quadratic potential function J(p) (4.11) are:

$$\mathbf{Q}^{(p)} = \begin{bmatrix} 3.346 & -0.064 & 0.081 & 0.021 \\ -0.064 & 1.210 & -0.149 & -0.198 \\ 0.081 & -0.149 & 1.379 & -0.409 \\ 0.021 & -0.198 & -0.409 & 0.995 \end{bmatrix} \text{and } \mathbf{R}^{(p)} = \begin{bmatrix} 1.008 & 0.011 & -0.006 \\ 0.011 & 1.275 & -0.059 \\ -0.006 & -0.059 & 0.987 \end{bmatrix}.$$

The optimal values of the optimization variables are $\beta\_{Q,R} = 3.351$ and $\beta\_{\Delta} = 0.860$. The solution of the Riccati equation for the NPDG is

$$\mathbf{P}^{(p)} = \begin{bmatrix} 0.743 & -0.171 & -0.381 & -0.944 \\ -0.171 & 0.565 & 0.079 & 0.348 \\ -0.381 & 0.079 & 2.555 & 1.765 \\ -0.944 & 0.348 & 1.765 & 3.779 \end{bmatrix}.$$

Figure 5.3 shows the trajectories of the original differential game and the identified NPDG, which can reproduce the motion of the vehicle manipulator controlled by the shared control of human and automation. The error measure (4.89) is $e\_x = 0.015$ and the maximum DD is $\max\_i \sigma\_d^{(i)} = 0.055 < \Delta$, meaning that the game is an NPDG. This property can be observed in Figure 5.4, in which the dynamics of the three Hamiltonians are given.

#### **Computation with Estimated Feedback Gains**

In the following, it is assumed that K(i), i ∈ {h, a} in (5.9) are estimated from measurements of the state and the control trajectories of the NE with a least-squares estimation. For the estimation, a finite sequence of sampling times $t\_k$, $k \in [1, \dots, M\_k]$ is given for measuring the trajectories. Then, the feedback gain of player i is estimated as

$$\hat{\mathbf{K}}^{(i)} = \operatorname\*{arg\,min}\_{\mathbf{K}^{(i)}} \sum\_{k=1}^{M\_k} \left\| \mathbf{u}^{(i)}[k] - \mathbf{K}^{(i)} x^\*[k] \right\|\_2^2,\tag{5.10}$$

where $\mathbf{u}^{(i)}[k]$ and $x^\*[k]$ denote the measurements at time $t\_k$. Note that during the design, x is assumed to be given, cf. Figure 3.2 in Section 3.2. In the case of real measurements, the signals are usually corrupted by measurement noise. In order to analyze the robustness of the NPDG against noise, white Gaussian noise is added to the NE trajectories of the original differential game

$$
\bar{x}^\*(t) = x^\*(t) + \varrho(t),\tag{5.11}
$$

**Figure 5.3:** The resulting noise-free trajectories of the original differential game (ODG) with solid lines and the NPDG with dashed lines are shown.

**(a)** The dynamics of the first Hamiltonian function **(b)** The dynamics of the second Hamiltonian function **(c)** The dynamics of the third Hamiltonian function

**Figure 5.4:** The dynamics of Hamiltonian functions in the noise-free case are shown, where the blue solid lines are the right side of (4.13) obtained from the original differential game (ODG) and the red dashed lines are the left side of (4.13) being the result of NPDG.

which is used to analyze the properties of the NPDG as a function of the noise level, defined through the signal-to-noise ratio (SNR). To measure the deviation of the NPDG from the original differential game, the maximum DD $\max \sigma\_d^{(i)}(t)$, the maximal trajectory error $\max\left(\|x^{(p)}(t) - x^\*(t)\|\_2\right)$ and the error measure (4.89) are used. The results are given in Table 5.1. Moreover, the maximal values of ∆ are shown which ensure the feasibility of (4.88). It shows that the smaller the SNR value, the greater the DD and the ∆ which is necessary to ensure the feasibility of (4.88). Roughly speaking, this implies that the identified NPDG becomes less similar to the original differential game with increasing noise.

**Figure 5.5:** The resulting trajectories of the original differential game (ODG) with solid lines and the NPDG with dashed lines for SNR= 20 dB

Still, the proposed algorithm can provide trajectories with the NPDG that are similar to those of the original differential game while maintaining a small distance, see Figure 5.6, which shows the dynamics of the Hamiltonians for SNR = 20 dB. It can be seen that all derivatives $\frac{\partial H^{(i)}(t)}{\partial u^{(i)}(t)}$ and $\frac{\partial H^{(p)}(t)}{\partial u^{(i)}(t)}$, i ∈ {h, a}, have similar trajectories and the distance between them is small, fulfilling (4.55) from Definition 4.6. Consequently, the original differential game is an NPDG. As a result, the computed cost function with the parameters Q(p) and R(p) can fully replace the original game without losing essential information.
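The least-squares estimation (5.10) and the noise injection (5.11) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results above: the normal equations are solved by plain Gaussian elimination, the noise variance is derived from the mean signal power over all state channels, and the gain `k_true` as well as all function names are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


def estimate_gain(xs, us):
    """Least-squares estimate of K in u[k] = K x[k], cf. (5.10).

    xs and us are lists of state and input samples (lists of floats).
    Each row of K follows from the normal equations (X^T X) k = X^T u.
    """
    n = len(xs[0])
    gram = [[sum(x[i] * x[j] for x in xs) for j in range(n)] for i in range(n)]
    gain = []
    for row in range(len(us[0])):
        rhs = [sum(x[i] * u[row] for x, u in zip(xs, us)) for i in range(n)]
        gain.append(solve(gram, rhs))
    return gain


def add_noise(xs, snr_db, rng):
    """Add white Gaussian noise with a given SNR in dB, cf. (5.11).

    Assumption: the SNR refers to the mean power over all state channels.
    """
    count = len(xs) * len(xs[0])
    power = sum(v * v for x in xs for v in x) / count
    sigma = math.sqrt(power / 10.0 ** (snr_db / 10.0))
    return [[v + rng.gauss(0.0, sigma) for v in x] for x in xs]


if __name__ == "__main__":
    rng = random.Random(0)
    k_true = [0.4, -1.2, 0.7]  # hypothetical gain, for illustration only
    xs = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(200)]
    us = [[sum(k * v for k, v in zip(k_true, x))] for x in xs]
    print(estimate_gain(xs, us))                        # noise-free states
    print(estimate_gain(add_noise(xs, 20.0, rng), us))  # noisy states
```

With noise-free samples, the estimate reproduces the generating gain up to numerical precision. With noisy state samples, the estimate generally deviates from it, which is in line with the growing deviation of the identified NPDG for decreasing SNR.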


**Table 5.1:** Results of NPDG computation with different white Gaussian noise levels, SNR=∞ is the noise-free case

**(a)** The dynamics of the first Hamiltonian function **(b)** The dynamics of the second Hamiltonian function **(c)** The dynamics of the third Hamiltonian function

**Figure 5.6:** The dynamics of Hamiltonian functions, the blue solid lines are the right side of (4.13) with SNR= 20 dB and the red dashed lines are the left side of (4.13).

### **5.2.3 Design of the LISC and Analysis in Simulations**

From the identified, noise-free NPDG, the matrices Ξ(a) and Ξ(h) of the CS can be computed as presented in Section 4.5. The obtained matrices of the CS are

$$
\Xi^{\text{(a)}} = \begin{bmatrix} 0.423 \\ -0.119 \end{bmatrix}
$$

$$
\Xi^{\text{(h)}} = \begin{bmatrix} 0.143 & 0.309 \\ -0.030 & 0.349 \end{bmatrix}.
$$

Using these identified parameters Ξ(h) and Ξ(a), the extended state vector of LISC, cf. (3.35), is set up for the lateral control of the vehicle manipulator based on Section 3.2.2 such that

$$x\_e(t) = \begin{bmatrix} d\_{\rm veh}(t), \ \Delta\theta\_{\rm veh}(t), \ \delta(t), \ x\_{\kappa1}(t), \ x\_{\kappa2}(t) \end{bmatrix}. \tag{5.12}$$

To compute the feedback gains of LISC, optimization (4.98) is used. The obtained optimal feedback gain is

$$\mathbf{K}\_{\rm LISC}^{(a)} = \begin{bmatrix} 38.62 & 76.91 & 51.46 & -46.06 & -14.71 \end{bmatrix},\tag{5.13}$$

which enables the shared control of the vehicle manipulator with limited information. To analyze the stability of the vehicle manipulator system, the numerical results are substituted into $\mathbf{A}\_{\text{stab}}$, cf. (4.109), and the eigenvalues $\Lambda(\mathbf{A}\_{\text{stab}})$ are computed, which are

$$\Lambda\left(\mathbf{A}\_{\text{stab}}\right) = \{-30.81, -1.43 \pm 0.49i, -0.44 \pm 1.09i, -0.45 \pm 0.66i\},$$

proving the stability of the overall system, since their real parts are negative.
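The stability criterion can be checked directly on the reported eigenvalues. The following sketch only evaluates the numbers listed above; it does not recompute them from the system matrices, and the function names are chosen for illustration.

```python
# Eigenvalues of A_stab as reported for the lateral LISC design.
EIGENVALUES = [
    complex(-30.81, 0.0),
    complex(-1.43, 0.49), complex(-1.43, -0.49),
    complex(-0.44, 1.09), complex(-0.44, -1.09),
    complex(-0.45, 0.66), complex(-0.45, -0.66),
]


def is_hurwitz(eigenvalues):
    """Return True if every eigenvalue lies in the open left half-plane."""
    return all(ev.real < 0.0 for ev in eigenvalues)


def stability_margin(eigenvalues):
    """Distance of the slowest eigenvalue from the imaginary axis."""
    return -max(ev.real for ev in eigenvalues)


if __name__ == "__main__":
    print(is_hurwitz(EIGENVALUES))
    print(stability_margin(EIGENVALUES))
```

The seven eigenvalues match the dimension of the reduced system (two measurable states, one automation input, two non-measurable states and two human inputs). The slowest pair has a real part of −0.44, so the closed loop is stable with a comparatively small margin.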

After the design and the stability analysis of LISC, its effectiveness is demonstrated in a simulation, in which both LISC and FISC are applied. To obtain the simulation model in this first setup, the lateral design model of the vehicle manipulator (5.4) is extended with a time lag of 0.15 s for the human input. This time lag models the human perception lag and is supported by literature, see e. g. [Heg08, FTCZ10]. References of the vehicle and the manipulator are defined leading to the system

$$
\dot{x}(t) = \mathbf{A}x(t) + \mathbf{B}^{(\text{h})}u^{(\text{h})}(t) + \mathbf{B}^{(\text{a})}u^{(\text{a})}(t) + r\_{\text{veh}}(t) + r\_{\text{man}}(t). \tag{5.14}
$$

The reference paths rveh and rman are shown in Figure 5.7. The FISC uses every element of the state vector (5.3). In contrast, LISC requires only (5.12), which excludes the non-measurable system states.

The results are illustrated in Figure 5.7. It can be seen that the trajectories of the two controllers are close to each other, indicating that LISC can provide a similar support. The support of the shared controllers can be observed in two cases: First, the vehicle does not track its reference as accurately as it could when it supports the human in reaching the reference of the manipulator faster, see e. g. at x ≈ 5 m or x ≈ 15 m. Second, the vehicle leaves its reference to assist the human with tracking the reference of the manipulator more accurately, see e. g. x ≈ 10 m and x ≈ 37 m. This behavior of the automation is the result of the prioritization of the dual task, which is achieved by the suitable choice of J(g).

**Figure 5.7:** Comparison of the trajectories obtained from the lateral control of the vehicle manipulator using LISC and FISC in simulation

In order to evaluate the designed LISC, the performance index is defined as the average tracking error

$$d\_{\text{avg},\text{m}} = \sqrt{\int\_{t\_0}^{\tau\_{\text{end}}} d\_{\text{man}}(t) \,\text{d}t},\tag{5.15}$$

since the mutual effort of the automation and the operator is a precise tracking of the reference of the manipulator.

The resulting trajectories of FISC and LISC look very similar, which raises the question of whether LISC always works that well. The first reason for the good correspondence is the use of the design model of the vehicle manipulator as the system plant in the simulation. The second reason is that the applied model of the human operator J(h) is fixed and does not vary during the simulation. Thus, the determined CS can represent the mutual effort perfectly. The average tracking errors of LISC and FISC are

$$d\_{\text{avg,m,LISC}} = 0.14 \,\text{m}$$

$$d\_{\text{avg,m,FISC}} = 0.16 \,\text{m},$$

which also indicate that the LISC can provide a support similar to FISC.

Further analyses require an extension with a more complex simulation model and a replacement of the human model, both of which are addressed in the following: Section 5.4.2 presents analyses with a complex model of the vehicle manipulator, and the experiments in Chapter 6 include human test subjects. That way, both discussed limitations are addressed. Still, the simulations of this section provide first indications that a lateral LISC is suitable for replacing FISC.

# **5.3 Longitudinal Control of a Vehicle Manipulator with LISC**

The realization of the longitudinal shared control of a large vehicle manipulator is presented in the following. The controller design procedure has the steps as given in Section 4.5. For the design, the longitudinal model of the large vehicle manipulator from Section 5.1 is used.

### **5.3.1 Full Information Shared Controller**

First, an FISC is designed using the global cost function J(g), which is quadratic, cf. (3.13), and its weight matrices are

$$\mathbf{Q}\_{\rm lon}^{(g)} = \text{diag}[1, 1, 10] \text{ and } \mathbf{R}\_{\rm lon}^{(g)} = \text{diag}[0.5, 1],$$

which models that the deviation of the manipulator from its reference has a higher priority compared to the velocity and position errors of the vehicle. Furthermore, based on Section 3.1.3, quadratic cost functions of the automation and the human are assumed. Based on preliminary analysis, a nominal human model is used for the FISC design with the following weight matrices

$$\mathbf{Q}\_{\rm lon}^{(h)} = \text{diag}\left\{1, 1, 5\right\} \text{ and } \mathbf{R}\_{\rm lon}^{(h)} = \text{diag}\left\{0.25, 1\right\},$$

modeling that the primary goal of the human is the tracking of the manipulator's reference $s\_{\text{ref,m}}$. The weighting matrices of FISC are designed by the optimization (3.24) yielding

$$\begin{aligned} \mathbf{Q}\_{\text{lon}, \text{FISC}}^{\text{(a)}} &= \text{diag}\left\{ 0.345, \ 0.076, \ 1.409 \right\}, \\ \mathbf{R}\_{\text{lon}, \text{FISC}}^{\text{(a)}} &= \text{diag}\left\{ 1.00, \ 0.190 \right\}, \end{aligned}$$

from which the feedback control laws of the human and the automation are computed by the coupled optimization of (3.20) leading to the following result

$$\mathbf{K}\_{\text{lon,FISC}}^{(h)} = \begin{bmatrix} -0.787 \\ 0.258 \\ 1.430 \end{bmatrix}, \quad \mathbf{K}\_{\text{lon,FISC}}^{(a)} = \begin{bmatrix} 0.422 \\ 1.592 \\ 0.830 \end{bmatrix}. \tag{5.16}$$

### **5.3.2 Computation of the Potential Games**

As the inputs are scalar, an OPDG can be computed, for which Algorithm 1 is used. In the case of ideal feedback gains of the players, K(i) are obtained from the original game, and the quadratic cost function of the identified OPDG has the following penalty matrices

$$\mathbf{Q}\_{\text{lon,OPDG}}^{\text{(p)}} = \begin{bmatrix} 1.199 & 0.000 & 0.000 \\ 0.000 & 2.140 & 0.000 \\ 0.000 & 0.000 & 3.588 \end{bmatrix},$$

$$\mathbf{R}\_{\text{lon,OPDG}}^{\text{(p)}} = \begin{bmatrix} 1.000 & 0.052 \\ 0.052 & 3.584 \end{bmatrix},$$

which leads to

$$\mathbf{P}\_{\text{lon,OPDG}}^{\text{(p)}} = \begin{bmatrix} 7.742 & 1.470 & -5.479 \\ 1.470 & 6.008 & 2.529 \\ -5.479 & 2.529 & 10.067 \end{bmatrix}.$$

The figures visualizing the results of noise-free OPDG are given in Appendix C.1.

Using the measurements of a shared control setup should intuitively provide a better model than using simulated trajectories. The inputs $u^{(a)}\_{\rm lon}$, $u^{(h)}\_{\rm lon}$ and the state trajectories $x\_{\rm lon}$ are obtained from measurements, so they could be used directly in Algorithm 1 to identify the OPDG, similarly to the NPDG case in Section 5.2.2. However, measurements are noisy in general and, consequently, condition (4.41c) in Algorithm 1 cannot be fulfilled around zero.

A possible solution is to use the measured trajectories for the computation of the input errors $e\_u$, see (4.41a). Then, a data preprocessing step uses the measurements to estimate the feedback gains of the players by means of (5.10), from which the system state trajectories are reconstructed through solving

$$\hat{\mathbf{x}}(t) = \exp\{ (\mathbf{A}\_{\rm{lon}} - \mathbf{B}\_{\rm{lon}}^{\rm{(a)}} \hat{\mathbf{K}}\_{\rm{lon, FISC}}^{\rm{(a)}} - \mathbf{B}\_{\rm{lon}}^{\rm{(h)}} \hat{\mathbf{K}}\_{\rm{lon, FISC}}^{\rm{(h)}}) \cdot t\} \cdot \mathbf{x}\_0. \tag{5.17}$$

**Figure 5.8:** Preprocessing of the measurement signals for the input-trajectory-dependent computation of an OPDG

Those reconstructed trajectories are noise-free and can be used for condition (4.41c) in Algorithm 1 solving the problem of the noisy system state trajectories. The notion of this data-preprocessing is illustrated in Figure 5.8.
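The reconstruction step (5.17) amounts to evaluating a matrix exponential of the closed-loop system matrix. The sketch below illustrates this with a simple scaling-and-squaring Taylor approximation in plain Python. It is an illustration only: the gains are the noise-free values from (5.16) interpreted as row vectors instead of estimates obtained with (5.10), and `K_JOY = 1.0` as well as the initial state `X0` are assumptions.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(m, terms=20):
    """Matrix exponential by scaling and squaring with a Taylor series."""
    n = len(m)
    norm = max(sum(abs(v) for v in row) for row in m)
    squarings = 0
    while norm > 0.5:
        norm /= 2.0
        squarings += 1
    scaled = [[v / 2.0 ** squarings for v in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


def closed_loop(a, b_a, k_a, b_h, k_h):
    """Return A - B_a K_a - B_h K_h for two scalar inputs."""
    n = len(a)
    return [[a[i][j] - b_a[i] * k_a[j] - b_h[i] * k_h[j] for j in range(n)]
            for i in range(n)]


def reconstruct(a_cl, x0, t):
    """Noise-free state x_hat(t) = expm(A_cl t) x0, cf. (5.17)."""
    phi = expm([[v * t for v in row] for row in a_cl])
    return [sum(p * x for p, x in zip(row, x0)) for row in phi]


K_JOY = 1.0  # assumed joystick gain, not taken from the text
A_LON = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
A_CL = closed_loop(A_LON, [0.0, 1.0, 0.0], [0.422, 1.592, 0.830],
                   [0.0, 0.0, K_JOY], [-0.787, 0.258, 1.430])
X0 = [1.0, 0.0, -0.5]  # hypothetical initial error

if __name__ == "__main__":
    for t in (0.0, 1.0, 5.0):
        print(t, reconstruct(A_CL, X0, t))
```

Since the reconstructed trajectory depends only on the estimated gains and the initial state, the measurement noise enters solely through the gain estimates, which is what makes the trajectories usable for condition (4.41c).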

Then, for the analysis of the robustness, white Gaussian noise is added to the NE trajectories of the original game as in (5.11). The error measure (4.89) and the computation time of Algorithm 1 are used to evaluate the input-trajectory-dependent computation method of the OPDG. The resulting trajectories of the system states are presented in Figure 5.9 with SNR= 20 dB, and Figure 5.10 illustrates the dynamics of the Hamiltonians. The results of the error measures and the computation times are given in Table 5.2, showing that noise increases the computation time and the error measure. However, as the system state trajectories indicate, the original differential game can still be reproduced despite using noisy signals.

Note that the trajectory-free optimization for the OPDG computation is robust against noisy measurements since it uses only the definition of the original differential game, but no input or state trajectories to identify the OPDG. Consequently, no robustness analysis needs to be provided.


**Table 5.2:** Results of the input-trajectory-dependent computation method of OPDG with different white Gaussian noise levels are shown, where SNR=∞ is the noise-free case

**Figure 5.9:** The resulting system state trajectories of the longitudinal vehicle manipulator with noisy signals (SNR= 10 dB) of the original differential game (ODG) and the identified OPDG

**Figure 5.10:** The dynamics of Hamiltonian functions, the blue solid lines are the right side of (4.13) with SNR= 20 dB and the red dashed lines are the left side of (4.13).

### **5.3.3 Design of the LISC**

In the following, the longitudinal LISC is derived from the identified OPDG. For the calculation, the noise-free case is used. First, the parameters of the CS

$$x\_{\kappa,\text{lon}}(t) = \xi^{(h)} \cdot \varphi\_{\text{joy}}(t) + \xi^{(a)} \cdot \ddot{s}\_{\text{veh,des}}^{(a)}(t) \tag{5.18}$$

are computed using the method from Section 4.5. The obtained parameters are ξ<sup>(h)</sup> = −0.547 and ξ<sup>(a)</sup> = 0.312. Using the CS, an extended system state is formulated as

$$\mathbf{x}\_{\rm lon,e}(t) = \left[\Delta s\_{\rm veh}(t), \,\Delta \dot{s}\_{\rm veh}(t), \,\ddot{s}\_{\rm veh}(t), \, x\_{\kappa,\rm lon}(t)\right],\tag{5.19}$$

which is used by LISC. Thus, a controller with limited information can be designed, which enables a shared behavior similar to that of a controller with full information. Its benefit is that no information about the manipulator reference is needed. The input of this extended system is computed by

$$\dddot{s}\_{\text{veh}} = -\mathbf{K}\_{\text{lon,LISC}}^{(a)} \cdot \mathbf{x}\_{\text{lon,e}},\tag{5.20}$$

where the feedback law is

$$\mathbf{K}\_{\text{lon,LISC}}^{(a)} = \begin{bmatrix} 19.899 & 14.183 & 25.288 & -9.862 \end{bmatrix},$$

obtained from (4.98). To prove the stability of the overall system, its eigenvalues Λ(Alon,stab) are computed in accordance with (4.109). The obtained values

Λ(Alon,stab) = {−21.595, −0.386 ± 0.814i, −0.137, −1.135}

prove the stability of the overall system controlled with LISC.
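The relations (5.18) to (5.20) and the stability check can be illustrated with the numerical values given above. The following is a minimal sketch: the function names are hypothetical, the eigenvalues are copied from the text rather than recomputed (the matrix Alon,stab is not reproduced here), and stability is checked only through the sign of the real parts.

```python
XI_H = -0.547  # weight of the joystick angle in (5.18)
XI_A = 0.312   # weight of the desired vehicle acceleration in (5.18)
K_LON_LISC = [19.899, 14.183, 25.288, -9.862]

# Eigenvalues as printed in the text, with the complex pair written out.
EIGENVALUES = [
    complex(-21.595, 0.0),
    complex(-0.386, 0.814),
    complex(-0.386, -0.814),
    complex(-0.137, 0.0),
    complex(-1.135, 0.0),
]


def cs_value(phi_joy, acc_veh_des):
    """Scalar CS according to (5.18)."""
    return XI_H * phi_joy + XI_A * acc_veh_des


def lisc_jerk_input(delta_s, delta_v, acc_veh, x_kappa):
    """Vehicle jerk from the linear feedback law (5.20) on the extended state (5.19)."""
    state = [delta_s, delta_v, acc_veh, x_kappa]
    return -sum(k * x for k, x in zip(K_LON_LISC, state))


def is_asymptotically_stable(eigenvalues):
    """A linear time-invariant system is stable if all real parts are negative."""
    return all(ev.real < 0.0 for ev in eigenvalues)


STABLE = is_asymptotically_stable(EIGENVALUES)
SLOWEST_TIME_CONSTANT = 1.0 / min(abs(ev.real) for ev in EIGENVALUES)
```

The eigenvalue closest to the imaginary axis, −0.137, dominates the settling behavior and corresponds to a time constant of roughly 7.3 s.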

Finally, the efficiency of the longitudinal LISC is demonstrated in a simplified simulation. To this end, the reference longitudinal positions of the vehicle and the manipulator are defined as

$$\begin{aligned} s\_{\text{ref,v}}(t) &= v\_{\text{ref0,v}} \cdot t + \int\_{t\_0}^t v\_{\text{var,v}}(\tau') \, \mathrm{d}\tau', \\ s\_{\text{ref,m}}(t) &= v\_{\text{ref0,v}} \cdot t + s\_{\text{ref0,m}} + s\_{\text{var,m}}(t), \end{aligned}$$

where vref0,v and sref0,m are the constant reference velocity of the vehicle and the initial distance between the vehicle and the manipulator, respectively, which are chosen as vref0,v = 1 m/s and sref0,m = 3.75 m. The functions svar,m and vvar,v are variations of the manipulator's position and the vehicle's velocity, which need to be compensated by the controllers. These variations can be seen in Figure 5.11 and Figure 5.12. To evaluate the designed LISC, the performance index is defined as the average longitudinal tracking error of the manipulator

$$
\Delta s\_{\text{avg,m}} = \sqrt{\int\_{t\_0}^{\tau\_{\text{end}}} \Delta s\_{\text{man}}(t) \text{d}t}.\tag{5.21}
$$

Figure 5.11 presents the trajectories relative to the vehicle, where the results using LISC and FISC are compared. In Figure 5.12, the reference velocity and the velocities reached by FISC and LISC are analyzed. The supporting behavior of both controllers can be observed at t ≈ 7 s, t ≈ 17 s and t ≈ 20 s, where the vehicle leaves its reference position by decelerating or accelerating, which can be seen in Figure 5.12. At t ≈ 30 s, the vehicle follows its reference more slowly than at t ≈ 3 s. The reason for this reaction is that the automated vehicle helps the human-controlled manipulator to reach its reference more accurately. The results of (5.21) are

$$
\Delta s\_{\text{avg,m, FISC}} = 0.11 \,\text{m},
$$

$$
\Delta s\_{\text{avg,m, LISC}} = 0.13 \,\text{m},
$$

which are close to each other, indicating that the support provided by LISC and FISC is similar as well.

To summarize the section, the LISC design procedure using the novel subclasses of potential games also works for the longitudinal guidance of a large vehicle manipulator. Furthermore, the simulations are carried out for both the lateral and the longitudinal shared control of the vehicle manipulator, and their results indicate the general suitability of LISC.

**Figure 5.11:** Comparison of the trajectories obtained from the longitudinal control of the vehicle manipulator using LISC and FISC

**Figure 5.12:** Comparison of the trajectories obtained from the longitudinal control of the vehicle manipulator using LISC and FISC

# **5.4 Detailed Simulation Models of the Vehicle Manipulator**

After the development of the design models for a large vehicle manipulator and the application of LISC to them, this section presents the detailed simulation model, which can characterize the motion of a large vehicle manipulator more realistically than the simulations from Sections 5.2.3 and 5.3.3, enabling deeper analyses. The complex simulation model of the vehicle manipulator is necessary due to the large pitch and roll angles of the vehicle.

# **5.4.1 Simulation Model**

An off-the-shelf overall simulator or simulation model that would be suitable for the qualitative analyses and experiments of the proposed shared control concepts does not exist in the literature. Still, there are models of the subsystems in the state of the art, which can be combined into an overall simulation: Heavy-duty vehicles and large hydraulic manipulators were considered in the literature [LH05, MHW<sup>+</sup> 15, HB16, RAA<sup>+</sup> 17]. However, their combination is novel and raises further challenges. Furthermore, the real-time implementation<sup>34</sup> requires additional examinations. Parts of the implementation were done in the course of two bachelor theses [Bur19, Bou19] and a master thesis [Mai18]. Furthermore, parts of this section were published in two research publications [VMSH19, VMH22]. Figure 5.13 shows the overall system including a mid-sized heavy-duty vehicle and a large hydraulic manipulator.

**Figure 5.13:** The schematic illustration of the simulation model containing a tractor-like vehicle and a hydraulic manipulator with four joints [VMSH19] ©2019 IEEE

<sup>34</sup> A real-time implementation is necessary to enable the interaction between the human and simulation of the system dynamics in the experiments, see Chapter 6. *Real-time* means soft real-time in the following.

#### **Model of the Vehicle**

The vehicle's simulation model contains


As the vehicle's motion does not have high dynamics, a linear tire model is sufficient. On the other hand, large vehicle manipulators have larger roll and pitch angles than passenger cars, thus a non-linear vehicle body model is necessary. This is an essential difference compared to other vehicle models from literature. Therefore, the vehicle body motion model from [KNH14] is implemented in the course of this thesis, which can model larger roll and pitch angles accurately.

To enable a real-time simulation, a fixed-step discrete solver is necessary, which, however, can lead to numerically unstable wheel dynamics around 0 m/s. A suggested solution is the introduction of a numerical threshold for the slip computation, see [Ril11, Chapter 4.]. Therefore, the slip of the wheel is computed as

$$s = \frac{v\_{\text{veh}} - \omega\_{\text{whe}} \cdot r\_{\text{whe}}}{\max\left(v\_{\text{veh}}, \omega\_{\text{whe}} \cdot r\_{\text{whe}}, v\_{\text{N}}\right)},\tag{5.22}$$

where vveh, ωwhe and rwhe are the vehicle velocity, the rotational speed of the wheel and the radius of the wheel, respectively. The parameter v<sup>N</sup> is chosen to maintain the numerical stability at low speeds. The vehicle body and suspension models are presented in detail in [KNH14]. For more details on the tire model, see [Ril11, Chapter 3.]. The vehicle is modeled in a global frame, and the positions of the subsystems are given relative to the vehicle body. Further equations of motion of the vehicle are given in Appendix B.2.
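The effect of the threshold in (5.22) can be seen in a short numerical sketch. The value of the threshold and the function name are assumptions chosen for illustration; the formula is evaluated as printed.

```python
def wheel_slip(v_veh, omega_whe, r_whe, v_n=0.1):
    """Wheel slip according to (5.22) with the numerical threshold v_n in m/s."""
    wheel_speed = omega_whe * r_whe
    return (v_veh - wheel_speed) / max(v_veh, wheel_speed, v_n)


# At standstill the denominator equals v_n instead of zero.
SLIP_STANDSTILL = wheel_slip(0.0, 0.0, 0.5)
# At normal speed the threshold is inactive.
SLIP_DRIVING = wheel_slip(10.0, 18.0, 0.5)
# Just above standstill the slip stays bounded by the threshold.
SLIP_CREEPING = wheel_slip(0.01, 0.0, 0.5)
```

Without the threshold, the creeping case would yield a slip of 1 from a velocity of only 0.01 m/s, and the standstill case would divide by zero; both are typical sources of numerical chatter with fixed-step solvers.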

#### **Model of the Manipulator**

The hydraulic manipulator consists of


see Figure 5.13. The heavy arm segments and hydraulic actuators are modeled based on [FH15, Chapter 6.] [Rud17, RAA<sup>+</sup> 17]. Additionally, a velocity-based inverse Jacobian is added. This means that the operator does not directly control the joints of the manipulator with the joystick, but the velocities of the end effector (EE), see Figure 5.14. The general equations of motion are given relative to the vehicle body as

$$\begin{split} \mathbf{M}\ddot{\boldsymbol{\phi}}(t) &= \mathbf{T}\_{\text{hyd}}\left(\mathbf{p}\_{\text{hyd}}, \mathbf{x}\_{\text{hyd}}(t), \boldsymbol{u}^{\text{(h)}}(t, \boldsymbol{\phi})\right) \\ &+ \mathbf{T}\_{\text{fric}}\left(\dot{\boldsymbol{\phi}}(t), \boldsymbol{\phi}(t)\right) + \mathbf{T}\_{\text{mech}}\left(\mathbf{p}\_{\text{geo}}, \mathbf{x}\_{\text{veh},\text{sim}}(t)\right), \end{split} \tag{5.23}$$

where M is the inertia matrix. The terms on the right-hand side of (5.23) are the driving torques of the

**Figure 5.14:** The control structure of the hydraulic manipulator with coordinated rate control is shown, cf. Section 2.2. The manipulator is not automated due to the unstructured environment. The orange color symbolizes that a model of the human operator is used in the qualitative analysis. On the other hand, the green color shows that the models are used in both the qualitative analysis and the experiments.

hydraulic actuator, the linear and non-linear friction torques and the external mechanical influences, respectively. The driving torques of the hydraulic actuator depend on the hydraulic parameters phyd and on the input of the human u<sup>(h)</sup>(ϕ). The mapping of the human's input to the coordinated desired angles of the manipulator is non-linear due to the inverse Jacobian, cf. Figure 5.14.
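The principle of the velocity-based inverse Jacobian can be illustrated with a planar arm with two joints. This is a deliberately reduced sketch: the manipulator in this thesis has four joints, and the link lengths, joint angles and function names below are hypothetical.

```python
import math


def jacobian(q1, q2, l1, l2):
    """Jacobian of the end effector position of a planar two-link arm."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def joint_rates(q1, q2, vx, vy, l1, l2):
    """Joint velocities that realize the desired end effector velocity (vx, vy)."""
    (a, b), (c, d) = jacobian(q1, q2, l1, l2)
    det = a * d - b * c
    if abs(det) < 1e-9:
        raise ValueError("singular configuration, inverse Jacobian undefined")
    return ((d * vx - b * vy) / det, (a * vy - c * vx) / det)


L1, L2 = 3.0, 2.5     # hypothetical link lengths in m
V_DES = (0.2, -0.1)   # desired end effector velocity in m/s (joystick command)
RATES_A = joint_rates(0.3, 0.8, V_DES[0], V_DES[1], L1, L2)
RATES_B = joint_rates(1.0, 1.2, V_DES[0], V_DES[1], L1, L2)
```

The same joystick command leads to different joint velocities in the two poses, which is the pose dependence that makes the mapping non-linear.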

The internal states of the hydraulic actuator xhyd(t) are the oil pressure, the oil flow and the state of the electric motors which drive the valves for the hydraulic cylinders. The modeled friction includes a linear viscous component and a Stribeck curve using the LuGre friction model, which is a common approach for hydraulic systems, see e. g. [THY12]. The function Tmech includes the torques and the forces caused by the motion of the vehicle. In (5.23), xveh,sim denotes the vehicle state of the complex simulation model<sup>35</sup> and pgeo includes the geometrical parameters relevant for motion generation of the manipulator. The parameters of the hydraulic manipulator are obtained from the literature [ZGSF09, Ala12]. The detailed equations of motion and further information on the hydraulic manipulator are given in Appendix B.2.2.
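For orientation, the LuGre model in its common textbook form can be sketched as follows. The parameter values are made up and are not those of the simulation model; the sketch only shows how the internal bristle state, the Stribeck curve and the viscous term combine, and that the simulated friction converges to the analytic steady-state value at constant velocity.

```python
import math

# Hypothetical parameters: bristle stiffness, bristle damping, viscous
# coefficient, Coulomb level, stiction level and Stribeck velocity.
PARAMS = {"sigma0": 1.0e4, "sigma1": 10.0, "sigma2": 0.4,
          "f_c": 1.0, "f_s": 1.5, "v_s": 0.01}


def stribeck(v, p):
    """Stribeck curve g(v): decays from the stiction to the Coulomb level."""
    return p["f_c"] + (p["f_s"] - p["f_c"]) * math.exp(-(v / p["v_s"]) ** 2)


def lugre_step(z, v, dt, p):
    """One explicit Euler step of the bristle state z; returns (z_next, friction)."""
    z_dot = v - p["sigma0"] * abs(v) / stribeck(v, p) * z
    friction = p["sigma0"] * z + p["sigma1"] * z_dot + p["sigma2"] * v
    return z + dt * z_dot, friction


def steady_state_friction(v, p):
    """Analytic friction for constant velocity: Stribeck term plus viscous term."""
    return math.copysign(stribeck(v, p), v) + p["sigma2"] * v


def simulate_constant_velocity(v, dt, n_steps, p):
    """Friction after n_steps at constant velocity, starting from z = 0."""
    z, friction = 0.0, 0.0
    for _ in range(n_steps):
        z, friction = lugre_step(z, v, dt, p)
    return friction


FRICTION_SIM = simulate_constant_velocity(0.05, 1.0e-5, 5000, PARAMS)
FRICTION_SS = steady_state_friction(0.05, PARAMS)
```

The bristle dynamics are fast compared to the joint motion, so an explicit integration needs a small step size, which is consistent with the high update rate of the overall simulation.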

#### **Implementation and Limitations**

The simulation models of the vehicle and the manipulator were implemented in Simulink, see the right green block in Figure 5.15. The overall simulation model runs at an update rate of 2 kHz with a Runge-Kutta solver. The solver is chosen to ensure the numerical stability of the overall system: The oil model of the hydraulic cylinders exhibits high-frequency numerical oscillations with solvers of lower order.
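Why the solver order matters at a fixed step size can be illustrated with a toy example. The sketch assumes the classical fourth-order Runge-Kutta scheme (the text only states that a Runge-Kutta solver is used) and replaces the oil model by a hypothetical undamped oscillator with a stiff mode at 2000 rad/s.

```python
import math

OMEGA = 2000.0     # rad/s, hypothetical stiff mode standing in for the oil dynamics
DT = 1.0 / 2000.0  # fixed step of the 2 kHz simulation


def oscillator(t, x):
    """Undamped oscillator with x[0] as position and x[1] as velocity."""
    return [x[1], -OMEGA ** 2 * x[0]]


def euler_step(f, t, x, dt):
    """Explicit Euler step (first order)."""
    return [xi + dt * di for xi, di in zip(x, f(t, x))]


def rk4_step(f, t, x, dt):
    """Classical Runge-Kutta step (fourth order)."""
    k1 = f(t, x)
    k2 = f(t + dt / 2, [xi + dt / 2 * ki for xi, ki in zip(x, k1)])
    k3 = f(t + dt / 2, [xi + dt / 2 * ki for xi, ki in zip(x, k2)])
    k4 = f(t + dt, [xi + dt * ki for xi, ki in zip(x, k3)])
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def amplitude(x):
    """Oscillation amplitude, constant for the exact solution."""
    return math.hypot(x[0], x[1] / OMEGA)


def simulate(step, n_steps):
    """Amplitude after n_steps, starting from unit amplitude."""
    x, t = [1.0, 0.0], 0.0
    for _ in range(n_steps):
        x = step(oscillator, t, x, DT)
        t += DT
    return amplitude(x)


EULER_AMPLITUDE = simulate(euler_step, 200)
RK4_AMPLITUDE = simulate(rk4_step, 200)
```

In this toy case the first-order solver amplifies the oscillation without bound within 0.1 s of simulated time, whereas the fourth-order solver keeps it bounded and slightly damps it.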

The simulation model can reproduce the motion of a large vehicle manipulator, but it has some limitations. These are as follows:

<sup>35</sup> Note that the vehicle's design model, presented in Section 5.1, includes fewer states than the simulation model presented here.

1. There is no detailed model of the combustion engine that is actually used in such machines, and the simulation contains no thermodynamic model. The engine is merely replaced by a first-order transfer function, as suggested in the literature, see e. g. [Hil16] or [HB16, Chapter 6.].
2. The tires are modeled only in the linear slip range, and the model does not contain the non-linear part of the µ-slip curve, see e. g. [Ril11, Chapter 3.].
3. The brake system is modeled as an additional external torque on the wheels.
4. The hydraulic cylinders are not parameterized individually, and each cylinder is modeled with its own oil tank. No common tank is implemented.

These modeling assumptions are reasonable and do not restrict the significance of the simulations and experiments, since the main focus of the analyses was the testing of the LISC. Furthermore, the simulations and the experiments do not have the following goals:


### **5.4.2 Analysis of the Simulation Models and the Designed LISC**

After setting up the models, they are first analyzed with critical scenarios, which demonstrate the benefits and the necessity of the three-dimensional and non-linear vehicle and manipulator models. Second, the general usability of the proposed LISC is verified.

The test environment for the qualitative analysis includes four subsystems, cf. Figure 5.15. Environment models provide the references for the simulated human operator and for the controller to be able to control the manipulator and the vehicle, respectively. The colors green and orange have the following purpose: The two orange blocks are only used in the qualitative analysis. In the experiments, the human model is replaced by test subjects and the environment model by a graphical interface, see Section 6.1. On the other hand, the two green blocks are used in the experiments as they are. The complex simulation model of the large vehicle manipulator remains the same for both qualitative analysis and experiments.

This test environment enables the qualitative analysis of the proposed LISC. For comparison, a non-cooperative controller (NCC) is implemented, which omits the supporting behavior of a shared controller. Since the NCC does not take the manipulator into account, the system states are reduced to xveh = [dveh, ∆θveh]. The system and input matrices for the controller design are

$$\mathbf{A}\_{\rm veh} = \begin{bmatrix} 0 & \upsilon\_{\rm veh} \\ 0 & 0 \end{bmatrix} \text{ and } \mathbf{B}\_{\rm veh} = \begin{bmatrix} 0 \\ \upsilon \end{bmatrix},$$

respectively. This is a common model for a simple vehicle representation, see e. g. [BK16]. The NCC is an LQR, which is designed by means of a quadratic cost function with the penalty matrices

QNCC = diag ([1.25, 8.25]) and RNCC = 1.

The control law obtained from the LQR design is

$$\mathbf{K}\_{\mathrm{NCC}} = [1.12, 3.24].$$
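The printed gain can be cross-checked with a closed-form solution of the algebraic Riccati equation for this two-state model. The sketch rests on one assumption that should be stated clearly: the non-zero entry of the input matrix is read as the vehicle velocity vveh. The value vveh = 1 m/s and the function name are hypothetical.

```python
import math


def lqr_gain_kinematic(v_veh, b, q1, q2, r):
    """LQR gain for A = [[0, v_veh], [0, 0]], B = [0, b]^T, Q = diag(q1, q2), R = r.

    Returns the gain (k1, k2) and the residuals of the three independent
    entries of the algebraic Riccati equation for P = [[p1, p2], [p2, p3]].
    """
    p2 = math.sqrt(q1 * r) / b
    p3 = math.sqrt(r * (q2 + 2.0 * v_veh * p2)) / b
    p1 = b * b * p2 * p3 / (r * v_veh)
    gain = (b * p2 / r, b * p3 / r)  # K = R^-1 B^T P
    residual = (
        q1 - (b * p2) ** 2 / r,                     # entry (1,1)
        v_veh * p1 - b * b * p2 * p3 / r,           # entry (1,2)
        2.0 * v_veh * p2 + q2 - (b * p3) ** 2 / r,  # entry (2,2)
    )
    return gain, residual


V_VEH = 1.0  # hypothetical vehicle velocity in m/s
K_NCC, ARE_RESIDUAL = lqr_gain_kinematic(V_VEH, V_VEH, 1.25, 8.25, 1.0)
K_NCC_FAST, _ = lqr_gain_kinematic(5.0, 5.0, 1.25, 8.25, 1.0)
```

Under this reading, the gain evaluates to approximately [1.118, 3.238], which agrees with the printed control law to two decimals and does not depend on the vehicle velocity.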

For the testing and analysis of the vehicle manipulator model and the LISC, two scenarios are used: The first one includes a reference with a sudden step followed by a smooth curve, see Figure 5.16. In the second scenario, the reference trajectory of the manipulator is the combination of a smoother curve and a sudden step. This subsection presents the results of the first scenario. The second scenario and further details on the simulation results are given in Appendix C.2.

In the simulation, the human model is used as given in Section 5.2.1. Figure 5.16 shows the schematic representation of the test scenario and the two reference paths of the vehicle (yellow path) and the manipulator (blue path). First, the vehicle has a small correction in its reference at x ≈ 20 m. Such a correction is common due to changes in the traffic flow, which require an adjustment, e. g. for a sufficient safety distance between vehicles. At x ≈ 45 m, the reference of the manipulator has a sudden step. This models a hidden obstacle (e. g. a large stone or a piece of metal, see the red block in the figure), which could damage the manipulator. The operator must react to this. After that, the vehicle manipulator enters a smooth curve.

The results of the qualitative analysis show that the sudden step of the manipulator has an impact on the vehicle motion, which can be seen in its rotation angles, see Figure 5.17. Thus, the motion of the vehicle is not typical for state-of-the-art trucks or tractors: The vehicle is tilted around its longitudinal axis due to the mass of the manipulator. Figure 5.17 shows that the roll angle of the vehicle is larger than 0.09 rad<sup>36</sup>, see e. g. [For16]. Therefore, a linear simulation model of the vehicle is not adequate and the implementation of the proposed non-linear model is reasonable, see Section 5.4.1. Figure 5.17 shows the influence of the manipulator: At t ≈ 45 s, the manipulator is stretched out, thus the torques on the vehicle are larger, which leads to a change in the roll angle. After t ≈ 100 s, the manipulator moves back to the earlier position and the vehicle rolls back. A small change in the pitch angle can also be observed: The vehicle is tilted during the maneuver between 60 s and 100 s.
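The size of the linearization error behind the 0.09 rad threshold can be quantified with a few lines of arithmetic. The sketch compares the small-angle approximations sin θ ≈ θ and cos θ ≈ 1 at the threshold and at the roll angles of 0.2 rad and 0.4 rad reported for Figure 5.17; the function name is hypothetical.

```python
import math


def linearization_errors(theta):
    """Relative error of sin(theta) ~ theta and absolute error of cos(theta) ~ 1."""
    sin_error = abs(theta - math.sin(theta)) / abs(math.sin(theta))
    cos_error = abs(1.0 - math.cos(theta))
    return sin_error, cos_error


ERRORS = {theta: linearization_errors(theta) for theta in (0.09, 0.2, 0.4)}
THRESHOLD_DEG = math.degrees(0.09)
```

At 0.09 rad both errors stay below half a percent, whereas at 0.4 rad the sine error is close to 3 % and the cosine error close to 8 %, which supports the use of the non-linear vehicle body model.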

Further figures showing the vehicle motion (e. g. suspension and wheel dynamics) are given in Appendix C.2. Discussions with professional operators confirmed that a roll angle of approximately 22 deg is usual in challenging situations. The motion of hydraulically driven manipulators is characterized by a slow and delayed motion, cf. [RAA<sup>+</sup> 17]. The hydraulic actuators can provide large torques, but have slow dynamics compared to electric motors. As mentioned in Section 5.4.1, the operator uses a coordinated rate control, meaning that the desired velocities of the EE, ẋ<sub>man,d</sub> and ẏ<sub>man,d</sub>, are set (cf. Figure 5.14). From these desired velocities, the desired angular velocities of the manipulator are calculated with a numerical inverse kinematics algorithm. These desired angular velocities are then set by the low-level controller of the hydraulic actuators, cf. Figure 5.14. For more details on the low-level controller, see Appendix B.2.

The desired and actual angular velocities of the manipulator during the test scenario are given in Figure 5.18. The characteristic slow motion of the manipulator is similar to the available results from the literature, indicating the suitability of the simulation model for the experiments. To analyze the modeled hydraulic system, Figure 5.19 shows the simulated oil flow (Q̇oil) and the

**Figure 5.16:** Schematic illustration of the test scenario with the two reference paths of the vehicle (yellow path) and the manipulator (blue path). Note that the x and y axes are scaled differently for a better visualization of the scenario.

<sup>36</sup> It is a common engineering practice that the sine and the cosine of an angle can be linearized if the angle is smaller than 0.09 rad, which corresponds to around 5°.

**Figure 5.17:** The pitch and roll angles of the vehicle. The influence of the manipulator's motion can be seen at t ≈ 50 s. The figure confirms the necessity of the non-linear vehicle model: The roll angle varies between 0.2 and 0.4 rad, which is larger than 0.09 rad meaning that the linearization error would be too large.

**Figure 5.18:** The desired and actual angular velocities of the four joints. Their motions are slower compared to electric actuators. Results from literature have similar characteristics.

pressure (Poil) in the hydraulic actuator of the third joint. Their characteristics and magnitudes also correlate with the results available from the literature. It can be seen that during the sudden maneuver at t ≈ 45 s, the pressure and the oil flow are increased to provide the necessary torque and power to move the joint. In the smooth curve at t ≈ 100 s, the pressure returns to an approximately constant oil pressure level. As a result, the oil flow is approximately zero. Such a behavior can be observed in state-of-the-art works [FFV16, KZM19].

To summarize, the simulation model is suitable to characterize the complex motion of the vehicle manipulator. For a qualitative comparison with results from the state of the art, see [LHOM15, Rud17, YVF17].

**Figure 5.19:** The hydraulic flow and pressure of the third actuator of the manipulator.

The second goal of the considered simulation is the testing and analysis of the proposed LISC. The goal is that the simulated operator can carry out the dedicated task, the path tracking, better with LISC than with an NCC. The NCC, implemented for comparison, corresponds to the most obvious control scheme of the system: The vehicle is automated, while the operator can devote themselves fully to the tracking task with the manipulator. In the simulation, the mental load and intuition of the human operator or similar aspects are not taken into account.

Sharing the task means that the vehicle moves together with the manipulator to minimize the error of the manipulator in situations, where the manipulator cannot reach the desired goal. On the other hand, this sharing should not be active in situations, which the operator can handle. Both situations are represented in the simulation scenario: A sudden step and a smooth curve, see Figure 5.16.

Such a sharing of the task between the automated vehicle and the human-controlled manipulator is beneficial for both the quality of the work and the expenditure of time. The results of the simulation with LISC and NCC are given in Figure 5.20. During the change of the vehicle's reference at x ≈ 20 m, the NCC does not consider that the motion of the vehicle has an impact on the motion of the manipulator. Therefore, a more significant correction by the operator is necessary. In addition, it can be seen that at the sudden step at x ≈ 45 m, the manipulator cannot reach the reference with the NCC as fast as it can with LISC. With LISC, the vehicle leaves its own reference to help the operator reach the goal with the manipulator. This maneuver is not possible with the NCC, which would probably lead to damage to the manipulator.

On the other hand, at x ≈ 120 m, the large curve begins, in which the operator needs no support. Therefore, LISC does not provide support, and there is no noticeable difference between the trajectories of NCC and LISC.

**Figure 5.20:** Comparison of the system trajectories using LISC and NCC. It can be seen that the manipulator can track the reference path more precisely using LISC than using NCC.

# **5.5 Summary of the Chapter**

This chapter presents the application of the LISC design procedure to a large vehicle manipulator. First, two design models of the vehicle manipulator, for the lateral and the longitudinal motions, are developed, enabling the usage of the concepts presented in Chapters 3 and 4. Then, the chapter elaborates on the design of the FISC for problems with dual goals, where the emerging challenges are addressed. In the next steps of the lateral LISC design, an NPDG is computed and its robustness against noise is investigated. In the case of the longitudinal LISC, an OPDG is computed. In the case of noisy signals, the limitation of the OPDG is solved by the preprocessing of the system state trajectories, which enables the use of the input-trajectory-dependent computation methods, cf. Section 4.2.3. Then, LISC is compared to FISC in two simulations: The first includes the lateral motions of the vehicle manipulator, and in the second one, the longitudinal shared controllers are analyzed. The results show similar system state trajectories, which partly answer the third research question and indicate the general applicability of LISC.

Section 5.4 describes the implementation of the simulation models and their *qualitative testing and analyses*. The challenges and the proposed solutions for the real-time implementation of the models are addressed as well. The qualitative analysis is performed by means of a test scenario. The results demonstrate the general usability of the simulation models and further explanations are given as to why the non-linear simulation models are necessary. Finally, the fundamental functionality of the proposed LISC in a complex simulation is demonstrated by comparing it with a non-cooperative controller.

# **6 Experiments**

In this chapter, three different experiments are conducted to demonstrate the benefits of LISC and its applicability in real-world applications with human test subjects. This way, the third research question, on the application of the control concept and the design procedure, is completely answered, see Section 2.4. These experiments are carried out on the developed simulator using different setups of the human control interfaces. They focus on the following aspects:


# **6.1 Test Bench and Measures of the Experimental Evaluation**

Because of the special application of the vehicle manipulator, the following paragraphs first present the real-time test bench with its hardware and software structure. They then discuss the experimental evaluation measures, which is necessary due to the lack of systematic evaluation and performance criteria for vehicle manipulators in road maintenance [Rec16].

# **6.1.1 Hardware and Software Components of the Test Bench**

To enable experiments of LISC with human test subjects and to provide indications for the practicability of the proposed concepts, a simulator of a large vehicle manipulator is necessary. Unfortunately, there are neither open-source simulators nor other professional software available that would provide a real-time capability, an easy accessibility of the software interfaces and a graphical user interface, which are necessary for human-in-the-loop experiments. Therefore, a simulator has been built in the course of this thesis, which is suitable for the experiments that demonstrate the usability of the proposed LISC with the vehicle manipulator application.

Figure 6.1 shows the test bench used for the experiments. It consists of a simulation computer, a joystick, steering wheel and the graphical user interface (GUI). The *CLS-E* Brunner Jet joystick model from the manufacturer Brunner AG [Bru22] was used as joystick and the Logitech G29 Driving Force [Log22] was used as steering wheel.

The test bench utilizes the commonly used *Robot Operating System* (ROS) middleware, which provides a well-organized software structure. The software and hardware components of the test bench are given in Figure 6.2. The models of the vehicle manipulator were implemented in Matlab/Simulink, from which C++ ROS nodes were generated. Further details of the implementation are given in Appendix B. The generated C++ ROS nodes were able to run in soft real time on the simulation computer (Intel(R) Core(TM) i7-5930K CPU @ 3.50GHz, operating system: Ubuntu 18.04, ROS distribution: Melodic). The generated ROS nodes provided updated signals from the simulation model at 50 Hz<sup>37</sup>.
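The relation between the 2 kHz simulation rate and the 50 Hz signal rate can be made explicit with a short sketch. It shows one plausible way of decimating the simulation output; how the generated ROS nodes handle the rate transition internally is not described here, so the scheme and the function name are assumptions.

```python
SIM_RATE_HZ = 2000     # update rate of the simulation model
PUBLISH_RATE_HZ = 50   # rate of the signals provided by the ROS nodes


def publish_indices(n_steps, sim_rate_hz, publish_rate_hz):
    """Indices of the simulation steps whose outputs are published."""
    if sim_rate_hz % publish_rate_hz != 0:
        raise ValueError("publish rate must divide the simulation rate")
    stride = sim_rate_hz // publish_rate_hz
    return list(range(0, n_steps, stride))


STRIDE = SIM_RATE_HZ // PUBLISH_RATE_HZ
ONE_SECOND = publish_indices(SIM_RATE_HZ, SIM_RATE_HZ, PUBLISH_RATE_HZ)
```

With these rates, every 40th simulation step is published, so one second of simulated time yields 50 samples.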

The GUI was implemented with the program package *pygame* [Shi11]. The GUI was a simple emulation of the complex working environment of the vehicle manipulators. No sensor models were implemented, because their influence was not in the focus of the experiments. Using a simple GUI was beneficial because the perception of the human operator was not biased by other factors. Therefore, the experiments were suitable to compare the control concepts without having biased perception of the test subjects. Furthermore, a simple representation helped to reduce the learning and training time in the experiments. The references and the

**Figure 6.1:** Picture of the test bench showing the vehicle manipulator. The GUI on this image belongs to the second experiment, see Section 6.3 [VRH22]. ©2022 IEEE

<sup>37</sup> The choice of 50 Hz corresponds to the standard update frequency of the CAN on large vehicle manipulators utilized in road maintenance.

**Figure 6.2:** The general software and hardware structure of the simulator enabling human-in-the-loop experiments. The controller, the graphical user interface and the steering wheel are marked with orange color symbolizing that these components are modified for each experiment.

goals during the experiments were displayed in this GUI. Additional pictograms were used to visualize the longitudinal motion of the vehicle manipulator. The GUI was updated with 25 Hz, which can ensure a smooth visual motion perception of the test subject.

The steering wheel has a standard USB driver for reading and controlling the steering angle. It has a dual-motor force feedback and a helical gearing, which enables a high precision. The maximum control torque is limited to 2.5 Nm, ensuring the safety of the test subjects. The maximum wheel rotation is ±5π, however, the use of ±π was sufficient for the experiments. The measurements of the steering angle are logged at 100 Hz. The joystick has a CAN interface with a manufacturer-specific protocol. The manufacturer did not provide a driver for Ubuntu systems, therefore, a new driver had to be implemented to enable a ROS connection, cf. Figure 6.2. The maximum pitch and roll angles are 20.5 deg and the peak torque is 4.2 Nm. The human inputs, which are the measurement signals of the two joystick angles, are updated at 100 Hz.

As the three experiments have different purposes, the hardware and software setups are slightly different for them. The components, which are adjusted for each experiment, are marked with the color orange in Figure 6.2. These differences are explained in the corresponding sections in detail. This thesis does not focus on haptic interactions between human and automation through the steering wheel or joystick. However, it is possible to provide active haptic feedback through these devices. Such concepts to provide haptic support for the operator are investigated in two research works [VBSH20, VSB<sup>+</sup> 20]. For more details on these practical applications, it is referred to these publications.

# **6.1.2 Experimental Evaluation Measures**

Before presenting the experiment, the overall evaluation goals of the proposed LISC are discussed. Due to the operational complexity and the lack of systematic evaluation and performance criteria for vehicle manipulators in road maintenance [Rec16], three general goals are stated in the following:


In contrast to the experiments of this chapter, in [VSSH19], a practical shared control approach was presented for the problem of the vehicle manipulators, which was tested with human test subjects in [VSB<sup>+</sup> 20]. However, these works used a heuristic tuning of the parameters based on the model of the vehicle manipulator. Such a manual design complicates the general applicability of the concept developed in [VSSH19]. Therefore, the subsequent experiments and the results also provide the first indication of the generality of the LISC design procedure.

<sup>38</sup> The occupancy rate is defined by the time in which the vehicle manipulator is in usage relative to overall working time, see e. g. [FP21].

# **6.2 Comparing Full and Limited Information Controllers**

The first experiment has three goals:


The FISC is used as a baseline and theoretical *optimum*<sup>39</sup> of the problem. In this experiment, the desired goal is that LISC reaches an overall system performance similar to FISC. The experiment can provide the first indication of the practical applicability of LISC. An NCC is taken into consideration as a state-of-the-art solution and a comparison between LISC and NCC is provided. The NCC controls the vehicle without taking into consideration the tasks and motions of the manipulator.

# **6.2.1 The Experiment Setup**

In this experiment, the steering wheel was not used by the test subjects, because a full automation of the vehicle is assumed: The orange-colored steering wheel in Figure 6.2 was not controlled by the human operator. Consequently, the test subjects had no direct control over the vehicle motion. They solely controlled the EE of the manipulator with the joystick, which set the desired velocities of the manipulator in the x and the y directions. The GUI used in this experiment includes two predefined references for the dual task: The red and the grey paths were the references of the manipulator and the vehicle, respectively, see Figure 6.3. The two controllers, LISC and FISC, were designed as given in Sections 5.2.1 and 5.2.3. The NCC was applied as presented in Section 5.4.2.

For the controller design, the reference human model was assumed as given in Section 5.2.1. The parameters of the model were estimated beforehand based on the quantitative assessments. Identifying the human operator without the shared control is not reasonable, as the human operator verifiably adapts their behavior to the shared control setup [IEFH18]. Therefore, the design suggested in [Fla16, Chapter 6], which requires a prior identification of the human's cost function, is not reasonable here, and a nominal human model was used for these experiments.

<sup>39</sup> Note that in real applications, the use of the FISC is not possible, see Section 2.3.2. However, in case of full information about system state and references, FISC is the ideal (best) solution that can be achieved for the problem. Assuming that the human is perfectly identified and the global objective function J<sup>(g)</sup> represents the system requirements perfectly, there is no solution which could outperform FISC.

# **6.2.2 Experiment Procedure**

Sixteen test subjects (4 female and 12 male, age 27.8 ± 3.0 years) took part in the experiment. It was a *within-subject experiment*, meaning that all test subjects tested all controllers. The test subjects had the task of keeping the manipulator on its reference (red line) as well as possible. They used the joystick to control the speed of the EE of the hydraulically actuated manipulator relative to the vehicle. Their goal was to keep the error as small as possible. Furthermore, they had to evaluate the three controllers (LISC, FISC and NCC) by answering a questionnaire. The test subjects were undergraduate students and research assistants at the Karlsruhe Institute of Technology. They had no experience with real large vehicle manipulators or with similar systems. None of them had participated in an experiment with the test bench.

**Figure 6.3:** Graphical User Interface of the scenario with the vehicle manipulator in the first experiment. The manipulator reference (red line) was used only by FISC and for the evaluation. LISC and NCC do not require the manipulator reference [VIH22]. ©2022 IEEE

The controllers were tested in a random order and the test subjects were unaware of which controller they were currently testing. The experimental protocol began with a familiarization process: The test subjects had the opportunity to become familiar with the control of the manipulator. This part took approximately 250 s. The test subjects were allowed to do anything to learn to control the manipulator. The instructions also emphasized that the vehicle manipulator was simulated with non-linear models. This meant that the manipulator had limitations that could be reached with unskilled input<sup>40</sup>. It was also explained that the motion of the manipulator depends on its manipulability. The optimal position of the manipulator was also demonstrated by the instructor of the study.

The test run and these instructions were followed by the actual scenario, which included two typical types of references (sudden step forms and smoother V-forms) with the velocity of the vehicle set to v = 1.2 m/s, which is a common speed for roadside work with a large vehicle manipulator, see e.g. [Fie20, MUL20]. The independent variable was the choice of the controller (FISC, NCC, LISC). The runs of the actual scenario took approximately 700 s. Between these runs, the test subjects were given the possibility to take notes about the controllers. Finally, they had to evaluate the controllers by answering three questions, see Section 6.2.3.

# **6.2.3 Hypotheses and Evaluation Criteria**

This experiment investigated two hypotheses:


Hypothesis H1E1 assumes that the use of a well-designed controller provides additional help for the human operator that leads to an increased overall task performance. This comparison has an important practical implication: It is possible to improve system performance without the need for additional sensors or a special perception system on the vehicle manipulator. Thus, the proposed LISC and NCC have the same hardware configuration and no sensors for the references of the manipulator. On the other hand, H2E1 has a theoretical purpose: Using the proposed design of LISC leads to a controller which can provide a support similar to FISC without the need for additional sensors or perception systems on the vehicle.

<sup>40</sup> The three-dimensional configuration of the manipulator is not visible. Thus, it cannot be seen if the manipulator is fully stretched out and cannot be moved further. The limitation was demonstrated in the familiarization process by instructing the test subjects to reach this boundary point.

In order to evaluate the two hypotheses, objective and subjective measures are defined for both. For the objective assessment, a stack consisting of M<sub>k</sub> data points was collected from the measurements at 25 Hz. Based on these, the performance, defined as the average of the root-mean-square error of the manipulator

$$d\_{\text{avg},m} = \sqrt{\frac{1}{M\_k} \sum\_{k=1}^{M\_k} d\_m^2 \left[k\right]},\tag{6.1}$$

is computed for each test subject. The motivation for (6.1) is that the mutual effort of the test subjects and the automation is a precise tracking of the reference of the manipulator. The subjective evaluation of the controllers is carried out by means of three questions using an 11-point Likert scale ([Lik32] or [Alb18, Chapter 2.]):


The first question about the assessment of the mental load of the operators is inspired by the NASA-TLX questionnaires, see [HS88]. The second and the third questions have practical inspirations: The second question measures how intuitively the operator could use the system. An intuitively supporting controller is important to enhance the acceptance and reduce the training phase with the assistance system. The third question is the subjective perception of the test subjects about how they performed the task. It measures the self-confidence of the operator with the system. The more self-confidence the operator has, the higher the satisfaction of the operator is.

Note that the comparisons are not classical pairwise ones: H1E1 is a difference test between LISC and NCC, while H2E1 is an equivalence test, which requires other test methods than the hypothesis tests for difference, see [CGA04] and [MC12]. More details on equivalence testing are given in Appendix D.1.
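The objective measure (6.1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rms_error` and the sample values are hypothetical and are not taken from the measurements of the experiment.

```python
import math


def rms_error(deviations):
    """Root-mean-square of the sampled deviations d[k], k = 1..M_k, as in (6.1)."""
    if not deviations:
        raise ValueError("at least one sample is required")
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


# Hypothetical manipulator deviations in metres, sampled at 25 Hz
# (illustrative values only, not measured data).
samples = [0.3, -0.4, 0.0, 0.5]
d_avg_m = rms_error(samples)
```

Squaring the deviations makes the measure independent of the sign of the error and penalizes large excursions from the reference more strongly than small ones.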

### **6.2.4 Results and Discussion**

From each test subject, two runs are obtained, which are used for the analysis of H1E1 and H2E1. For the testing of H1E1 and H2E1, the significance level is uniformly chosen as α<sub>Exp,1</sub> = 0.05. For more details, see Appendix D.3.

#### **Objective Results**

The means and the standard deviations of the average errors of the manipulator d<sub>avg,m</sub> are shown in Table 6.1. It can be seen that NCC has the largest average error, resulting in the weakest performance. The measurements of (6.1) are then tested for normality with the *Shapiro-Wilk* test, see [SW65]. This shows that the measurement sets of the three controllers are not normally distributed, see the results in Appendix D.3. Therefore, the *Wilcoxon Signed-rank* test is used to statistically compare NCC and LISC, see e. g. [Dal08, Chapter 4.] or [Bra14]. Because the same data is also used for the analysis of H2E1, a Bonferroni correction is applied, and the corrected significance level of H1E1 is α̃<sub>Exp,1</sub> = α<sub>Exp,1</sub>/2 = 0.025. The p-value obtained from the Wilcoxon Signed-rank test is

$$p\_{\text{LISC-NCC}} = 1.9 \cdot 10^{-6},$$

so that p<sub>LISC-NCC</sub> < α̃<sub>Exp,1</sub> holds. Thus, the proposed LISC is significantly better than NCC, and H1E1 is accepted.
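The test logic behind this decision can be sketched as follows. The sketch assumes a one-sided test with the exact null distribution of the rank sum. The text does not state which variant or software was used, so this is an illustration rather than the procedure of the thesis. The function name `signed_rank_p_value` and the paired error values are hypothetical.

```python
def signed_rank_p_value(x, y):
    """One-sided exact Wilcoxon signed-rank test for paired samples.

    Alternative hypothesis: the values in x tend to be smaller than those in y.
    Zero differences are dropped; tied |differences| receive average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    if not diffs:
        return 1.0
    abs_sorted = sorted(abs(d) for d in diffs)

    # Doubled average ranks keep all rank sums integer-valued, even with ties.
    doubled_rank = {}
    i = 0
    while i < len(abs_sorted):
        j = i
        while j < len(abs_sorted) and abs_sorted[j] == abs_sorted[i]:
            j += 1
        doubled_rank[abs_sorted[i]] = (i + 1) + j  # 2 * mean of ranks i+1 .. j
        i = j

    ranks = [doubled_rank[abs(d)] for d in diffs]
    observed = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # count[s]: number of sign assignments whose negative-rank sum equals s.
    total = sum(ranks)
    count = [0] * (total + 1)
    count[0] = 1
    for r in ranks:
        for s in range(total, r - 1, -1):
            count[s] += count[s - r]

    # Under the null hypothesis, every sign pattern is equally likely.
    return sum(count[observed:]) / 2 ** len(ranks)


ALPHA = 0.05                 # significance level of the first experiment
ALPHA_CORRECTED = ALPHA / 2  # Bonferroni correction for two tests on the same data

# Hypothetical paired errors in metres (not the measured data).
lisc = [0.21, 0.18, 0.25, 0.30, 0.19, 0.22, 0.27, 0.24]
ncc = [0.35, 0.29, 0.31, 0.42, 0.33, 0.28, 0.40, 0.37]
p = signed_rank_p_value(lisc, ncc)
significant = p < ALPHA_CORRECTED
```

The Bonferroni step simply divides the significance level by the number of tests carried out on the same data, here two (H1E1 and H2E1).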

The testing of H2E1 is conducted by two one-sided sign tests (TOST) with a 90% confidence interval for the difference. The equivalence margin was chosen as ±0.05. In the case of statistical equivalence, the mean of the difference variable ∆d<sub>avg,m</sub> = d<sub>LISC,avg,m</sub> − d<sub>FISC,avg,m</sub> lies within the equivalence margin. The TOST provides one p-value for each side: If both values are less than the significance level α<sub>Exp,1</sub>, the two measurements are statistically equivalent and there is no statistical difference between the median values of FISC and LISC. The p-values obtained are

$$p\_{\text{FISC-LISC}} = \left[0.006, \ 3.1 \cdot 10^{-4}\right].$$

Because p<sub>FISC-LISC</sub> < α̃<sub>Exp,1</sub> holds for both values, H2E1 is also accepted. Further details on the objective results of the first experiment are given in Appendix D.3.
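The idea of the equivalence test can be illustrated with a minimal sketch of two one-sided sign tests using the exact binomial null distribution. The exact procedure of the thesis is given in Appendix D.1 and may differ in detail. The function names and the difference values are hypothetical.

```python
from math import comb


def binomial_upper_tail(successes, n):
    """P(X >= successes) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


def tost_sign_test(differences, margin):
    """Two one-sided sign tests for equivalence within (-margin, +margin).

    Returns (p_lower, p_upper). p_lower belongs to H0: median <= -margin,
    p_upper to H0: median >= +margin. Equivalence is claimed only if both
    p-values are below the significance level.
    """
    # Differences exactly on a margin are discarded, as zeros are in a sign test.
    lower = [d for d in differences if d != -margin]
    upper = [d for d in differences if d != margin]
    above_lower = sum(d > -margin for d in lower)
    below_upper = sum(d < margin for d in upper)
    p_lower = binomial_upper_tail(above_lower, len(lower))
    p_upper = binomial_upper_tail(below_upper, len(upper))
    return p_lower, p_upper


# Hypothetical differences d_LISC - d_FISC (not the measured data).
delta = [0.01, -0.02, 0.00, 0.03, -0.01, 0.02, -0.03, 0.01]
p_lower, p_upper = tost_sign_test(delta, margin=0.05)
equivalent = max(p_lower, p_upper) < 0.05
```

In contrast to a difference test, the null hypotheses here state that the difference lies outside the margin, so small p-values on both sides support equivalence.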

**Table 6.1:** The means and corresponding standard deviations of the average errors of the manipulator d<sub>avg,m</sub>


#### **Subjective Results**

The results of the questionnaire are given in Table 6.2. They reinforce the quantitative results. Table 6.2 shows the means of the assessments for the corresponding questions and controllers. It can be seen that LISC has better results than NCC in all three questions. FISC has the highest scores in all three questions. Since the answers are given on an 11-point scale, the Wilcoxon Signed-rank test is applied<sup>41</sup>.

**Table 6.2:** Mean values and the standard deviations of the personal questionnaire results


<sup>41</sup> The measurement data is classified into categories with a rank order (0,1,...,9,10). Such data is called ordinal, for which non-parametric statistical tests need to be used. For more detail, see Appendix D.2 or [Nor10].

Testing H1E1 includes the comparison of the results from LISC and NCC. The p-values obtained from the one-sided Wilcoxon Signed-rank test are

$$\begin{aligned} p\_{\text{LISC-NCC}}^{Q1} &= 0.012, \\ p\_{\text{LISC-NCC}}^{Q2} &= 1.8 \cdot 10^{-4}, \\ p\_{\text{LISC-NCC}}^{Q3} &= 6.7 \cdot 10^{-4}, \end{aligned}$$

which are less than α̃<sub>Exp,1</sub>, showing that LISC is significantly better than NCC. The proposed novel shared controller with the systematic design leads to a better subjective assessment of the task performance. Furthermore, it is rated to be more intuitive and more useful compared to NCC.

To test H2E1, three TOSTs are carried out, for which the equivalence margin was set to ±2 steps of the answer scale. This choice was supported by the observation that the test subjects were not able to clearly distinguish a step less than 1.5 − 2 points. The resulting p-value pairs of the TOSTs are

$$\begin{aligned} p\_{\text{LISC-FISC}}^{Q1} &= \left[0.019, \, 1.2 \cdot 10^{-4}\right] \\ p\_{\text{LISC-FISC}}^{Q2} &= \left[0.004, \, 7.0 \cdot 10^{-4}\right] \\ p\_{\text{LISC-FISC}}^{Q3} &= \left[0.026, \, 1.8 \cdot 10^{-4}\right]. \end{aligned}$$

For the first and second questions, α<sub>Exp,1</sub> > max {p<sup>Qi</sup><sub>LISC-FISC</sub>} holds, meaning that there is no significant difference between FISC and LISC. On the other hand, for the third question, the test subjects did not find the support of LISC and FISC similar because α̃<sub>Exp,1</sub> < max {p<sup>Q3</sup><sub>LISC-FISC</sub>} holds. The results of the subjective assessments of the controllers are illustrated with box plots in Figure 6.4, where the inferiority of NCC compared to LISC and FISC is clearly visible.

**Figure 6.4:** The box plots show the subjective results of the three lateral controllers: NCC, FISC and LISC. The red circle is the mean value of the results.

The verbal feedback of the test subjects supports these results: The difference between FISC and LISC was hardly noticeable for the test subjects. The third test subject expressed his personal point of view by saying "Controller 1 [LISC] is more aggressively configured than controller 2 [FISC]" and test subject 11 said: "Controller 3 [FISC] helps more in small curves than controller 1 [LISC], but not in large curves". On the other hand, all the test subjects were able to distinguish NCC from FISC and LISC. The analysis of the subjective assessment underpins the acceptance of H1E1 and H2E1.
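The equivalence decisions for the three questions can be retraced from the reported p-value pairs. The following sketch assumes that the Bonferroni-corrected level of 0.025 is applied to all three questions. The helper name `is_equivalent` is hypothetical.

```python
ALPHA_CORRECTED = 0.05 / 2  # Bonferroni-corrected significance level, 0.025

# p-value pairs of the three TOSTs as reported above (LISC vs. FISC).
tost_p_values = {
    "Q1": (0.019, 1.2e-4),
    "Q2": (0.004, 7.0e-4),
    "Q3": (0.026, 1.8e-4),
}


def is_equivalent(p_pair, alpha):
    """Equivalence requires both one-sided tests to reject, i.e. max(p) < alpha."""
    return max(p_pair) < alpha


decisions = {q: is_equivalent(p, ALPHA_CORRECTED) for q, p in tost_p_values.items()}
# decisions == {"Q1": True, "Q2": True, "Q3": False}
```

Only the larger of the two p-values matters for the decision, which is why the third question narrowly misses the corrected level with 0.026.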

#### **Discussion**

The resulting trajectories of the different concepts are presented for exemplary test subjects in the following figures. Figure 6.5 shows a scenario with smooth V-formed references, in which the trajectories of test subject number 6 using NCC and LISC are compared. It can be seen that LISC enables a better tracking of the manipulator reference. At x ≈ 30 m, the vehicle with LISC does not follow its reference as precisely as with NCC. This way, LISC helps the operator to reach the reference of the manipulator and only afterwards returns to the vehicle reference, so that the human operator can track the reference more efficiently. A similar motion of the vehicle can be observed at x ≈ 100 m, where the vehicle with LISC leaves its reference to help the operator to follow the reference of the manipulator more accurately. Furthermore, the changes in the reference of the vehicle are not followed as accurately by LISC as by NCC, see x ≈ 55 m and x ≈ 82.5 m. This is a compromise: Tracking both references is not possible, so a decision has to be made about which one has the higher priority. As tracking the manipulator's reference is defined as the mutual effort of the shared control setup, it is more important. Thus, the vehicle leaves its own reference and supports the human operator.

**Figure 6.5:** Illustrative comparison of the overall performance (test subject 6) to track the references (thick lines) of vehicle and manipulator using a controller with no cooperative support (thin line) and a LISC (dashed)

It is important to note that one possible way to handle such a challenging situation would be to slow down the vehicle, carry out the challenging maneuver with the manipulator and then accelerate the vehicle to the reference speed, since the references of the vehicle manipulator can be followed at lower speeds. However, this solution would increase the working time and worsen the occupancy rate of the vehicle manipulator. Thus, using LISC facilitates the same quality of work in a shorter time.

Figure 6.6 comprises the resulting trajectories of FISC and LISC from test subject 12, where the references have sudden steps. The supporting motions of FISC and LISC are apparent: The vehicle leaves its reference (see x ≈ 27 m, x ≈ 75 m and x ≈ 100 m) to help the operator to follow the reference of the manipulator. The assistive behavior of the automation, which shares the effort to reach the reference of the manipulator, can be seen for both controllers; the trajectories are similar despite the limited information.

In the first subfigure of Figure 6.7, the references of the vehicle and the manipulator are given, which explains the changes in the two other subfigures more clearly. Then, the inputs of test subject number 12 controlling the manipulator (second subfigure) and the corresponding lateral deviation from the reference (third subfigure) are compared. At x ≈ 27 m and x ≈ 38 m, the reference of the manipulator has sudden steps, which have to be followed by the manipulator. To compensate for this, NCC needs the most time, while both FISC and LISC can reduce the deviation faster.

**Figure 6.6:** Comparison of the overall performance (test subject 12) to track the references (thick lines) of vehicle and manipulator using a controller with FISC (thin line) and LISC (dashed)

There are minor differences between the resulting trajectories of LISC and FISC, in contrast to the simulations with the design models, see Chapter 5, where the trajectories are closer to each other. The reason for these differences is that a simulated human model generates the inputs in the simulations in Sections 5.2 and 5.3, whereas in this first experiment human test subjects controlled the manipulator. The behavior of the test subjects varied in different situations during the experiment. This indicates that the resulting trajectories of such an interaction depend on both the shared control and the human qualities (human objective function). Thus, further research is desirable to answer the question of how the adaptation of shared control to human variation can be managed from both a practical and a theoretical perspective.

Still, even in this current setup, the benefit of the novel LISC is clear from Figure 6.6: Despite the limited information about the references, LISC can provide a support that does not differ strongly from that of FISC.

**Figure 6.7:** Comparison of the input signals from the human-controlled joystick using the three controllers (LISC, FISC and NCC). The measurements are taken from test subject 12. The upper subplot shows the reference trajectories of the vehicle and the manipulator. The middle subplot presents the normalized inputs of the joystick, while the lower subplot presents the deviations of the manipulator in m.

# **6.3 Lateral Experiment without References**

This experiment had the goal to compare LISC with fully manual control, which is the current state of the art for large vehicle manipulators. The contents of this section are presented in publication [VRH22].

# **6.3.1 The Scenario and the Controllers**

The main difference from the first experiment (cf. Section 6.2) is that no reference path was explicitly given for the manipulator: The goal of the test subjects was changed from following a reference path to collecting blue<sup>42</sup> boxes. The setup is used as given in Figure 6.1, which shows a screenshot of the GUI modified for this experiment. The vehicle reference trajectory was given because it is necessary for the automation of the vehicle. In addition, NCC was included in the experiment, enabling further comparisons.

In this experiment, both controllers, LISC and NCC, were designed as given in Section 5.2.3 and Section 5.4.2, respectively. These two controllers were chosen because neither of them requires the reference or the system states of the manipulator. Therefore, they could be used straightforwardly. If the lateral controllers were active, the test subjects only had to control the manipulator and the vehicle guidance was autonomous. In the case of manual control (MC), in contrast, the test subjects had to both control the vehicle with the steering wheel and operate the manipulator with the joystick.

The vehicle traveled at a constant speed. The reference path of the vehicle had only minor variation because of the real-world applicability: Motorways have smooth curvatures and no sudden variations. The boxes were placed in such a way that, in specific situations, it was necessary for the vehicle to cooperate with the manipulator in order to reach the boxes. These were situations in which the test subjects had to prioritize whether to stay on the reference with the vehicle or collect the blue box with the manipulator.

# **6.3.2 Procedure and Hypotheses of the Experiment**

Fourteen test subjects (2 female and 12 male, age 27.6 ± 3.1 years) conducted this experiment. The test subjects had the task of collecting as many blue boxes as possible and keeping the vehicle on its reference as well as they could. Furthermore, the same additional instructions were given as in the first experiment, see Section 6.2.2.

During the automated modes (LISC, NCC), they were not allowed to touch the steering wheel. This was necessary for safety reasons: The setup with the steering wheel did not include any interaction behavior, so sudden and unexpected movements could have led to physical injuries. In MC, they were responsible for both the manipulator control and the vehicle control using the steering wheel. In those cases where the boxes were not easy to reach, the test subjects were told to leave the reference with the vehicle to reach the boxes. This coordination between the vehicle and manipulator was explicitly instructed. They were also told that they were only allowed to leave the reference to collect the box and had to return to the vehicle's reference as quickly as possible. Finally, they were informed that they needed to fill out a questionnaire to evaluate and compare the controllers after the study.

<sup>42</sup> Choosing the color blue had practical reasons: It ensures good visibility, whereby a variation due to the different perception of the test subject can be minimized.

After these instructions, there was an approximately 300 s long familiarization run with MC. This familiarization run was intended to give the test subjects relevant experience so that they could perform the task well<sup>43</sup>. The familiarization part is not included in the evaluation.

Then, the test subjects performed the three experimental runs, each with one of the three modes: the two controllers and MC. They were not told which of the two controllers (LISC or NCC) was active. The order of the experimental runs was randomized for each test subject. There was an *intermediate questionnaire* after each run, which was used to enhance the test subjects' awareness of and reflection on the different controller concepts. The results of this intermediate questionnaire were not used for the final evaluation. After completing the three experimental runs, they were asked to answer the *evaluation questions*, which were then used for the subjective evaluation of the controllers.

The central question of this experiment was whether LISC can provide an improvement of the measures compared to MC. Furthermore, the question of whether the proposed LISC is less demanding for the operator than MC or NCC was addressed. A direct comparison between NCC and LISC was not the primary goal, as that was the subject of the first experiment, see Section 6.2. Nevertheless, the absence of the reference can strongly influence the experience of the test subjects. Therefore, the subjective assessments were compared between all three setups. For these goals, two hypotheses and two measures were defined. The hypotheses to be analyzed are:


As an objective measure of the operator's performance, the number of collected boxes was chosen. Additionally, the root-mean-square error of the deviation from the vehicle's reference

$$d\_{\text{avg}, \text{veh}} = \sqrt{\frac{1}{M\_k} \sum\_{k=1}^{M\_k} d\_{\text{veh}}^2 \left[k\right]},\tag{6.2}$$

was used for the evaluation. The subjective evaluation of the controllers was performed by the evaluation questions using a 7-point Likert scale.


<sup>43</sup> This procedure was necessary because the test subjects were not professionals: They did not have any previous experience with similar systems or with the simulator.

### **6.3.3 Results and Discussion**

#### **Objective Results**

In Table 6.3, the means and the standard deviations of the numbers of collected boxes (*Box Score* - BS) with the three controllers are given. Besides these, the vehicle's average deviations from its reference (d<sub>avg,veh</sub>) can be found in Table 6.3. Figure 6.8a graphically illustrates the distribution of the collected boxes. It shows that most of the test subjects collected the most boxes with LISC, and a few of them achieved the same results with MC. Nevertheless, the standard deviations with LISC are smaller than with MC, indicating that the performance is well-balanced regardless of the skills of the test subjects. Additionally, the box plots of the average deviations from the vehicle's reference (6.2) are given in Figure 6.8b, which shows that all the test subjects have a smaller average deviation with LISC compared to MC. This means that collecting more boxes did not negatively influence the tracking of the vehicle's reference.

In order to statistically test the two hypotheses, Shapiro-Wilk tests were conducted to verify the normality condition of the measurements. This showed that not all measurements have a normal distribution, see Appendix D.4. Therefore, for the test of H1E2, a Wilcoxon Signed-rank test was applied. The significance level is chosen as α<sub>Exp,2</sub> = 0.01.

Applying the Wilcoxon Signed-rank test shows that these differences (BS and d<sub>avg,veh</sub>) are statistically significant. The p-values are

$$p\_{\rm BS} = 3.88 \cdot 10^{-4} \text{ and } p\_{d\_{\rm avg, veh}} = 7.47 \cdot 10^{-6},$$

which are both less than α<sub>Exp,2</sub>. Thus, H1E2 is accepted, meaning that LISC provides statistically significantly better performance than MC in terms of the number of boxes collected and the average lateral error. Furthermore, the result from the first experiment (see Section 6.2) could be additionally confirmed here: Using NCC, the test subjects collected fewer boxes compared to LISC. Applying the Wilcoxon Signed-rank test to the LISC-NCC comparison, the resulting p-value is

$$p\_{\rm BS} = 7.193 \cdot 10^{-6},$$

for which p<sub>BS</sub> < α<sub>Exp,2</sub> holds, confirming the result of the first experiment that this difference is significant.

**Table 6.3:** The mean values and the standard deviations of the collected box numbers with the two controllers and with the manual mode.


**(a)** The number of the collected boxes for the three different control modes (LISC, NCC, MC) pictured in a box plot. It can be seen that LISC outperforms both NCC and MC.

**(b)** Box plot of the vehicle's average lateral errors for the three different control modes (LISC, NCC, MC). It can be seen that LISC outperforms MC. The NCC has a smaller deviation; however, this result also impairs collecting the boxes.

**Figure 6.8:** Box plots of the second experiment, where the mean values of the results are symbolized by the circles.


**Table 6.4:** The mean values (and standard deviations) of the personal questionnaire.

#### **Subjective Assessment**

The subjective evaluations of the control modes are given in Table 6.4. LISC yielded better results compared to NCC and MC in all three questions. The standard deviations of the results are the smallest for LISC. For the statistical test of H2E2, three samples were compared. Therefore, the *Kruskal-Wallis* test was chosen for the analysis, see e. g. [Dal08, Chapter 6.]. The degrees of freedom of this test were df = 2. Its null hypothesis is that there is no difference between the three controllers. This hypothesis is rejected if H ≥ χ<sup>2</sup><sub>df,α<sub>Exp,2</sub></sub> holds, where

$$\chi\_{df=2,\alpha=0.01}^{2} = 9.21.$$

The null hypothesis of the test is rejected regarding the intuition (H<sub>Q1</sub> = 23.53), the mental strain of the test subjects (H<sub>Q2</sub> = 11.95) and also the sense of control (H<sub>Q3</sub> = 9.92). Since H<sub>Qi</sub> ≥ χ<sup>2</sup><sub>df,α<sub>Exp,2</sub></sub> ∀i = 1, 2, 3, LISC provides significantly better results in all three aspects and H2E2 is confirmed.
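The test decision above can be retraced with a short sketch. It computes the Kruskal-Wallis statistic with average ranks and the usual tie correction, which is an assumption, since the text does not state whether a tie correction was applied. It obtains the critical value from the closed form that holds for df = 2. The function names are hypothetical; the H values are the ones reported above.

```python
import math


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with average ranks and tie correction."""
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)

    # Average rank for every distinct value, and the tie-correction term.
    avg_rank = {}
    tie_sum = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        avg_rank[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_sum += t ** 3 - t
        i = j

    h = 12.0 / (n_total * (n_total + 1)) * sum(
        sum(avg_rank[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3.0 * (n_total + 1)
    correction = 1.0 - tie_sum / (n_total ** 3 - n_total)
    if correction == 0.0:  # all observations identical: no evidence of a difference
        return 0.0
    return h / correction


def chi2_critical_df2(alpha):
    """Upper-tail critical value of the chi-square distribution with df = 2.

    For df = 2 the survival function is exp(-x / 2), so the quantile is -2 ln(alpha).
    """
    return -2.0 * math.log(alpha)


critical = chi2_critical_df2(0.01)  # approximately 9.21
reported_h = {"Q1": 23.53, "Q2": 11.95, "Q3": 9.92}  # values reported above
rejected = {q: h >= critical for q, h in reported_h.items()}
```

The closed form is specific to two degrees of freedom, i.e. to the comparison of three groups; for other numbers of groups, a table or a statistics library would be needed.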

#### **Discussion**

This experiment raises interesting aspects that are discussed in this subsection. The solid evidence that LISC yielded better results than MC is promising for practical applications: It can ease the workload, enabling the operators to concentrate better and fulfill their task more efficiently compared to the current state of the art. A further interesting observation was that the test subjects collected fewer boxes with NCC than with MC, while their subjective evaluation showed no significant difference between NCC and MC, see Table 6.3 and Table 6.4. These results indicate that a simple, "classical" automation of the vehicle does not enhance the performance of the overall system with one operator. Thus, the use of NCC is not advisable for the application of vehicle manipulators.

Figure 6.9 compares the trajectories generated with LISC to the trajectories of MC. It can be seen that the test subject followed the vehicle's reference less successfully with MC. Furthermore, some boxes were missed with MC, whereas more boxes were collected with LISC. Similar results can be observed for other test subjects; for additional figures, see Appendix D.4.

A further interesting finding is that the trajectories of the vehicle are similar for LISC and MC, as the exemplary set of trajectories from test subject number 5 shows, even though the proposed LISC does not aim to imitate the trajectories of MC. The goal instead is to relieve the human operator in challenging situations by additional movements of the vehicle. How these movements are achieved is not exactly specified, e. g. no learning from the human-driven trajectories of the vehicle has been applied to imitate the inputs of the operator. Still, the resulting trajectories are similar, cf. Figure 6.9. These similarities indicate that LISC is potentially intuitive to the human operator, which is achieved by the proposed systematic design. A limitation of this study is that the operators' perception of the environment differs in a real application, which can influence the controller's performance.

**Figure 6.9:** Trajectories of test subject number 5 with MC and with the proposed LISC.

Nonetheless, this simulator experiment demonstrates the usability of the LISC concept and provides strong indications that LISC can be considered in practical future projects by manufacturers of large vehicle manipulators.

# **6.3.4 Perspectives of the real-world Applicability**

Thanks to the Autobahn GmbH, five professional operators visited the cooperative lab at the Institute of Control Systems and conducted the experiment during the course of this thesis. All professional operators work in Baden-Württemberg in the roadside maintenance divisions of the Autobahn GmbH. Figure 6.10 depicts the professional test subject number 2 carrying out the experiment.

They work with such systems every day and use special equipment for verge mowing, ditching or verge cleaning. They are familiar with the challenging circumstances that arise during this work. During the discussions after the experiment, three of them said that they also have experience with agricultural machines, which often exhibit a similar setup and whose operation includes similar challenging situations, too. Therefore, these invited professionals were capable of assessing LISC and providing recommendations for further development work in order to reach a real-world use in different applications.

The feedback of the professional operators on the GUI confirmed its suitability for testing the proposed LISC. In general, the simple representation of the simulator does not impair the recognition and the imitation of the challenging situations of the real world. However, according to their feedback, the simulator is not suitable for the training of novice operators.

The professional operators found that the experiment includes realistic situations. The resulting trajectories of professional test subject number 2 using the different concepts are presented in Figure 6.11. However, in reality, the challenging situations included in the experiment occur less often. Their core assessment of the assistance system was positive: They stated that the use of such an assistance system could help them in their work. They pointed out three main limitations, which should be addressed: First, they were not allowed to hold the steering wheel and override the inputs of the automation. The professionals explained that oversteering is essential to make small corrections. Secondly, they found the constant longitudinal speed of the vehicle inconvenient<sup>44</sup>. Thirdly, the movements of the vehicle need to be limited: Sudden maneuvers with the manipulator should not always imply a sudden support from the vehicle, which needs to be taken into account for a real-world realization.

<sup>44</sup> For safety reasons, the possibilities of oversteering and speed adjustment are indispensable in a real-world application of the LISC. Their restriction in the experiment was justified by the minimization of the variance in the experimental setup. Obviously, a real-world implementation has to lift this restriction to ensure the safety of the other road users.

**Figure 6.10:** Test subject number 2 carrying out the experiment.

**Figure 6.11:** Exemplary trajectories of professional test subject number 2 with MC and with the proposed LISC.

The overall feedback of the professional operators was that the system could be useful in numerous scenarios, but a key remaining question is where exactly such automation will be used: There are different challenges on the motorways compared to other types of roads. The different road types (highway, freeway, common road etc.) have different crash-guard systems (e. g. "Super-Rail" or just reflector posts), which makes the work with the manipulator diverse. Consequently, the adaptation of LISC for each scenario is necessary. The acceptable maximal lateral motion of the vehicle depends on the road type as well. Furthermore, they mentioned that the preferences of the operators may vary depending on their mental state, road or lighting conditions. Due to the systematic design, such requirements can be formulated in the global objective function, which automatically provides the parameters of LISC. Different preferences of the human operator could possibly be handled by identifying the operator, which can lead to broader acceptance. Summarizing the visit of the professionals, the general applicability of LISC is demonstrated. The feedback facilitates further progress towards an actual realization of the proposed LISC.

# **6.4 Experiments of the Longitudinal Shared Control**

The third experiment took an agricultural application into account. In the experiment, a harvester (1) and a bankout wagon (2) are to move in a coordinated manner, see Figure 6.12. The bankout wagon gathers the grains and travels at a constant speed. On the harvester, an operator controls the unloading tube to transport the corn from the harvester to the bankout wagon. It is assumed that the bankout wagon is automated. To enable the optimal distribution of the harvested grain in the bankout wagon, the manipulator of the harvester must be positioned accordingly.

The goal of the experiment is to study the applicability of LISC for the longitudinal guidance of the large vehicle manipulator: The longitudinal speed of the harvester is adapted by LISC according to the task with the human-controlled manipulator. This way, the work time can be reduced. The longitudinal LISC requires the longitudinal model as presented in Section 5.1. The main question of this experiment is whether the proposed longitudinal LISC has advantages or disadvantages compared to NCC. The NCC is taken into account as the currently used state-of-the-art technology because its realization is straightforward<sup>45</sup>. Therefore, a comparison between the NCC and the LISC is given and no manual control is taken into account. Parts of this section were published in the research article [VH22].

**Figure 6.12:** Agricultural application to demonstrate LISC for the longitudinal shared control of a large vehicle manipulator. The application includes a harvester (1) and a bankout wagon (2) [Mov22, VH22]

<sup>45</sup> The actual state-of-the-art large vehicle manipulators have also been using a low-level (so-called hydrostatic drive) longitudinal cruise control.

### **6.4.1 The Setup and the Controllers**

The GUI of the simulator was modified for this experiment and the focus was the analysis of different longitudinal controllers. The harvester traveled at constant velocity along a reference path having only small lateral variations. The lateral controller was a non-cooperative one: The actions of the human operator did not influence the lateral motion of the vehicle, in contrast to the earlier experiments. The lateral controller only compensated for the lateral error of the vehicle, which was caused by the reaction forces and torques of the outstretched manipulator on the vehicle.

The longitudinal controllers (LISC and NCC) were implemented as given in Section 5.3. Both controllers provided the guidance speed of the vehicle, and the required driving torques of the wheels were set by a low-level PI controller. At this low level, there is no difference between LISC and NCC. For more details on the low-level controller, see Appendix B.2.
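The cascade described above, in which a high-level controller outputs a guidance speed and a low-level PI controller converts the speed error into a driving torque, can be illustrated with a minimal sketch. The gains, the sampling time, the torque limit and the point-mass vehicle model are hypothetical placeholders chosen for illustration; they are not the values or the structure given in Appendix B.2.

```python
from dataclasses import dataclass


@dataclass
class PISpeedController:
    """Discrete PI controller mapping a speed error to a driving torque.

    All numeric defaults are hypothetical and only serve the illustration.
    """
    kp: float = 4000.0           # proportional gain (assumed)
    ki: float = 2000.0           # integral gain (assumed)
    dt: float = 0.01             # sampling time in s (assumed)
    torque_max: float = 10000.0  # torque saturation in N m (assumed)
    integral: float = 0.0        # integrated speed error

    def step(self, v_guidance: float, v_measured: float) -> float:
        """Return the torque command for one sampling step."""
        error = v_guidance - v_measured
        candidate = self.integral + error * self.dt
        torque = self.kp * error + self.ki * candidate
        # Simple anti-windup: keep the new integral only if unsaturated.
        if abs(torque) <= self.torque_max:
            self.integral = candidate
            return torque
        return max(-self.torque_max, min(self.torque_max, torque))


def simulate(v_guidance: float, steps: int = 3000) -> float:
    """Drive an assumed point-mass vehicle and return its final speed."""
    mass = 8000.0       # vehicle mass in kg (assumed)
    wheel_radius = 0.5  # wheel radius in m (assumed)
    controller = PISpeedController()
    v = 0.0
    for _ in range(steps):
        torque = controller.step(v_guidance, v)
        v += (torque / wheel_radius) / mass * controller.dt
    return v
```

In this sketch, the guidance speed `v_guidance` would be supplied by either LISC or NCC, which reflects the statement that the two concepts do not differ at the low level.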

The longitudinal experiments required an adjusted task: It consisted of five red boxes which had to be reached one after the other by the manipulator. These five boxes symbolized five sections within the bankout wagon, which had to be filled up. The filling process was imitated by remaining on the red boxes for 2.5 s. After hovering over the boxes, they became green, meaning that this part of the bankout wagon was full, and the operator had to move the manipulator, since otherwise the distribution of harvested grain would not be optimal. The five boxes disappeared when all of them had been reached, simulating that the unloading was complete. Afterwards, five new boxes appeared immediately at a different position, which also had to be filled. It was important that the boxes were not allowed to be "left out". Test subjects breaking this rule more than 5 times were excluded from the experiment. In a real application, this would have meant that the bankout wagon was not filled optimally.
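The task logic described above can be sketched as a small state tracker. The class and attribute names are hypothetical, and the sketch assumes that the hover time is accumulated per box and not reset when the manipulator leaves a box early; the text does not specify this detail.

```python
from typing import List, Optional

DWELL_TIME_S = 2.5  # filling time per box, as stated in the text
N_BOXES = 5         # boxes per unloading cycle, as stated in the text


class FillingTask:
    """Tracks the filling state of the boxes of one unloading cycle.

    Assumption: hover time is accumulated per box and is not reset when
    the manipulator leaves a box before it is filled.
    """

    def __init__(self) -> None:
        self.hover_time: List[float] = [0.0] * N_BOXES

    def is_filled(self, box: int) -> bool:
        """A box counts as filled (green) after the dwell time."""
        return self.hover_time[box] >= DWELL_TIME_S

    def update(self, hovered_box: Optional[int], dt: float) -> None:
        """Advance by dt seconds; hovered_box is None if no box is hovered."""
        if hovered_box is not None and not self.is_filled(hovered_box):
            self.hover_time[hovered_box] += dt

    def cycle_complete(self) -> bool:
        """True once all boxes of the cycle are filled."""
        return all(self.is_filled(b) for b in range(N_BOXES))
```

A simulation loop would call `update` once per frame and spawn a new `FillingTask` whenever `cycle_complete()` returns true.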

After filling the bankout wagon four times, the simulation stopped, and the next run with the other controller could be started by the test subject. At the top of the GUI, the test subjects were able to see which controller was active, cf. Aut1 and Aut2 in Figure 6.13. However, they did not have a mapping from Aut1 and Aut2 to LISC and NCC. Thus, they did not know which controller they tested. The running order of the controllers was fixed: Test subjects with even session numbers started with NCC and finished with LISC. Meanwhile, test subjects with odd identification numbers started with LISC and finished with NCC.

### **6.4.2 Experiment Procedure**

Seventeen test subjects (3 female, 14 male, average age of 28.2 ± 2.7 years) took part in the experiment. They were given the goal to finish the overall task as fast as they could. The experiment consisted of the following parts:


**Figure 6.13:** The image of the GUI used for the longitudinal experiment. Two boxes are already filled up and the operator moves the manipulator to the next box.

The familiarization part and the intermediate questions were not used for the later assessment of the experiment. Their purpose was to help the test subjects evaluate the controllers. The experiment materials including the questions and the instructions are given in Appendix D.5.

# **6.4.3 Measures of the Experiment**

The aim of the experiment is to show that the use of LISC also has advantages for the longitudinal guidance of the vehicle manipulator. On the one hand, the adaptive motion of the vehicle can help the operator to reduce the task execution time. On the other hand, the interacting behavior can also lead to additional stress and be less intuitive<sup>46</sup>.

Two hypotheses are formulated for the evaluation:

H1E3 Using LISC significantly reduces the working time compared to NCC.

H2E3 The operator does not have an increased mental load or a less intuitive operation of the vehicle manipulator using LISC compared to NCC. Furthermore, LISC is not less helpful than NCC.

For the evaluation of H1E3, the overall time tOA is taken into account, which describes the necessary time to finish the task.

<sup>46</sup> For a better understanding, the reader may think of adaptive cruise control by passenger cars, whereby the desired distances cannot be set intuitively. Thus, the system can easily lead to frustration and the driver does not want to use the assistant system.

The extended input of LISC is the jerk of the vehicle, ∂<sup>3</sup>s<sub>v</sub>/∂t<sup>3</sup>, which can possibly lead to more sudden motions of the vehicle. Furthermore, the input of the operator is used by the LISC to adapt the speed of the vehicle. This combination can lead to an increased sensitivity to the inputs of the operator. Such sensitivity possibly increases the mental load of the operator. The main question of H2E3 is whether a systematic controller design of LISC can avoid such an increase of the mental load. To assess H2E3, the test subjects answered the following final questions on a seven-point Likert scale:


### **6.4.4 Results and Discussion**

Two of the test subjects had to be excluded: One test subject did not follow the instructions precisely and left out the green boxes 7 times. For another test subject, the data logging failed due to an unexpected stopping of Ubuntu services on the simulator computer. Therefore, their objective and subjective results are not used in the assessment of the controllers. The data from the fifteen remaining test subjects are used for the evaluation. The significance level for the third experiment is chosen as α<sub>Exp,3</sub> = 0.05.

#### **Objective Results**

The performance measure of the experiment is the overall time to finish the task. The means and the standard deviations of the resulting overall times using LISC and NCC are given in Table 6.5, which shows that using LISC reduced the necessary time compared to NCC. To choose a suitable statistical test for H1E3, a Shapiro-Wilk test and an F-test were conducted. They showed that the two data sets are not both normally distributed, see Appendix D.5. Therefore, the Wilcoxon signed-rank test is applied. The p-value of the Wilcoxon signed-rank test is

$$p_{\mathrm{H1}} = 2.13 \cdot 10^{-6},$$

which is less than the significance level α<sub>Exp,3</sub>, meaning that the test subjects carried out the task with LISC significantly faster than with NCC. For a visual representation of the results, the box plots of the two controllers are given in Figure 6.14, which reinforces the conclusion that the test subjects were able to perform the task faster using LISC. Therefore, H1E3 is accepted.
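For readers unfamiliar with the paired test used for H1E3, the following sketch computes an exact two-sided Wilcoxon signed-rank p-value in plain Python. The completion times `T_NCC` and `T_LISC` are invented for illustration and are not the measurements of this experiment. The sketch further assumes that there are no zero differences and no ties among the absolute differences; a statistics library such as SciPy (`scipy.stats.wilcoxon`) handles those cases.

```python
from typing import List, Sequence, Tuple


def wilcoxon_signed_rank(x: Sequence[float],
                         y: Sequence[float]) -> Tuple[int, float]:
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Assumes no zero differences and no tied absolute differences.
    Returns (W_plus, p_value), where W_plus is the rank sum of the
    positive differences x - y.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Null distribution: counts[s] = number of sign assignments whose
    # positive-rank sum equals s (subset-sum dynamic program).
    max_sum = n * (n + 1) // 2
    counts: List[int] = [0] * (max_sum + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(max_sum, r - 1, -1):
            counts[s] += counts[s - r]

    w_min = min(w_plus, max_sum - w_plus)
    p_one_sided = sum(counts[: w_min + 1]) / 2 ** n
    return w_plus, min(1.0, 2.0 * p_one_sided)


# Hypothetical completion times in seconds for 15 subjects (not thesis data).
T_NCC = [120.0, 118.0, 125.0, 130.0, 122.0, 119.0, 127.0, 121.0,
         124.0, 126.0, 123.0, 128.0, 117.0, 129.0, 131.0]
T_LISC = [t - 1.5 * (i + 1) for i, t in enumerate(T_NCC)]

W_PLUS, P_VALUE = wilcoxon_signed_rank(T_LISC, T_NCC)
print(f"W+ = {W_PLUS}, p = {P_VALUE:.3e}")
```

The null hypothesis of equal medians is rejected if the returned p-value is below the chosen significance level.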


**Table 6.5:** Mean values (standard deviations) of the overall times for completing the task.

#### **Subjective Results**

The questions for a subjective assessment of the controllers have the following goals:


The mean values and the standard deviations of the answers are given in Table 6.6. It can be seen that LISC leads to better average results. For a more illustrative representation of the results, the box plots of the subjective results are given in Figure 6.15.

**Figure 6.14:** The box plot of the finishing times of the task with NCC and with LISC. Most of the test subjects were able to finish the task in less time with LISC than with NCC.

**Table 6.6:** Mean values with the corresponding standard deviations of the personal questionnaire

To evaluate H2E3, a non-inferiority statistical test needs to be applied, which can answer the question of whether a novel method is not worse than standard methods. For that, a one-sided sign test is used to evaluate the non-inferiority of LISC. The inferiority bound was chosen as 1.5 because the test subjects were able to distinguish approximately two steps on the Likert scale. For more details on non-inferiority statistical tests, see [Lak17] or Appendix D.1. The obtained p-values are:

$$\begin{aligned} p_{\text{LISC-NCC}}^{Q1} &= 0.012, \\ p_{\text{LISC-NCC}}^{Q2} &= 6.8 \cdot 10^{-4}, \\ p_{\text{LISC-NCC}}^{Q3} &= 6.7 \cdot 10^{-4}, \end{aligned}$$

which are all less than α<sub>Exp,3</sub>. Thus, the results indicate that the test subjects did not perceive LISC as inferior to NCC in any of the three aspects. This means that the test subjects did not find NCC better than LISC. Hence, the non-inferiority of LISC is indicated and H2E3 is accepted.
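One common way to set up a one-sided sign test for non-inferiority is sketched below; the exact formulation used in Appendix D.1 may differ. The sketch assumes paired ratings where a higher value is better, shifts each paired difference by the margin, and tests whether the shifted differences are predominantly positive. The ratings are invented for illustration and are not the questionnaire data of this experiment.

```python
from math import comb
from typing import Sequence


def sign_test_noninferiority(new: Sequence[float],
                             ref: Sequence[float],
                             margin: float) -> float:
    """One-sided sign test of H0: median(new - ref) <= -margin.

    A small p-value indicates that `new` is not worse than `ref` by more
    than `margin`. Assumes that higher ratings are better.
    """
    shifted = [a - b + margin for a, b in zip(new, ref)]
    shifted = [d for d in shifted if d != 0.0]  # drop exact zeros
    n = len(shifted)
    k = sum(1 for d in shifted if d > 0)        # number of positive signs
    # P(K >= k) for K ~ Binomial(n, 0.5)
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


# Hypothetical seven-point Likert ratings of 15 subjects (not thesis data).
RATING_LISC = [6, 5, 6, 7, 5, 6, 4, 6, 5, 7, 6, 5, 6, 3, 6]
RATING_NCC = [5, 5, 6, 6, 5, 4, 5, 6, 6, 6, 5, 5, 5, 5, 6]

P_NONINF = sign_test_noninferiority(RATING_LISC, RATING_NCC, margin=1.5)
print(f"p = {P_NONINF:.3e}")
```

With a margin of 1.5 and integer ratings, a subject only counts against non-inferiority if the new method was rated at least two steps worse, which matches the reasoning given for the choice of the bound.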

**Figure 6.15:** The box plots show the subjective results of the two longitudinal controllers, LISC and NCC. The red circle is the mean value of the results.

#### **Discussion**

The results show that the test subjects were able to complete the same task faster without increasing their mental load. For further discussion, the following two figures show the relative longitudinal distance and velocity of the manipulator, respectively.

Figure 6.16 shows an exemplary set of trajectories of test subject number 4, in which the test subject needs to extend the manipulator less with LISC than with NCC. An optimal configuration of the vehicle and the manipulator includes a lateral distance of approximately 4 meters between them. This allows a good manipulability of the manipulator. The spans between the maximums and minimums of the relative distance are smaller for LISC than for NCC. Thus, the operator can react to changes of the goals faster, which can be seen in Figure 6.17 after t ≈ 25 s: Using LISC, the test subject could start the new maneuver at t ≈ 25 s. In contrast, the new maneuver started at t ≈ 30 s using NCC. This observation also supports the quantitative results that LISC increases the efficiency of the task execution.

The task used in the experiment is designed to keep the workload with NCC low. The study aims to determine whether the use of LISC results in a higher mental load and less intuitive usability<sup>47</sup>. Analyzing the questionnaire, no significant changes in the workload can be reported, while the task execution time was reduced. How disruptive LISC can be in real applications is still an open question. Since the study only took around 30 minutes and did not last for days, the long-term adaptation of the test subjects could not be observed. However, this adaptation is also an essential factor for the acceptance of LISC. The results were achieved using a simulator and the subjects perceived the accelerations and velocities only visually, which may also have had a further impact.

Still, the results of the study show that LISC was applied effectively and provides the first promising indications that a cooperative longitudinal control of large vehicle manipulators is beneficial. Therefore, LISC can be considered as a promising solution for real-world implementations.

**Figure 6.16:** The relative position of the manipulator with LISC (blue solid line) and with NCC (red dashed line) for the 4th test subject

<sup>47</sup> Increased task complexity is caused by the additional motion of the vehicle through the use of LISC. Using NCC, this additional motion of the vehicle is not present.

**Figure 6.17:** The relative longitudinal velocity of the manipulator with LISC (blue solid line) and with NCC (red dashed line) for the 4th test subject

# **6.5 Discussion and Summary of the Chapter**

First, this chapter presents the test bench including the simulation models of the vehicle manipulator and the further hardware and software components. That is followed by three experiments, which demonstrate the usability and benefits of the proposed LISC by comparing it with state-of-the-art controllers. The results from all three experiments provide promising indications that the designed LISC outperforms the state-of-the-art technical solutions and other control concepts. Thus, the experiments with LISC are accomplished and the third research question is answered.

In the first experiment, LISC was compared to FISC and to an NCC. The results show that LISC and FISC are statistically equivalent from both objective and subjective points of view. The test subjects were not able to distinguish between these two shared control concepts. In addition, NCC had a significantly inferior performance compared to LISC and FISC. These results indicate that a controller, which actively supports the human operator in their task, would be beneficial in real-world applications.

The second experiment included a comparison of manual control with both the proposed LISC as well as with NCC. The main finding of this experiment is that LISC outperforms manual control significantly with respect to the experiment measures. A further noteworthy result is that manual control surpasses NCC. This is promising evidence that the automation of a large vehicle manipulator requires the coordination between the automated vehicle and the human-driven manipulator, which can enhance the acceptance of the automation of such systems.

In contrast to the first two experiments, which addressed the lateral shared control, in the third experiment, LISC was applied to the longitudinal control of a large vehicle manipulator. The results showed that using LISC could shorten the working time with the large vehicle manipulator compared to the current state-of-the-art technical solutions. This was possible through the adaptation of the speed of the vehicle, which however increased the complexity of the task for the operator. Therefore, the experiment also analyzed whether the complexity of LISC leads to an inferior subjective assessment compared to NCC. The results indicate that the test subjects were able to carry out the experiment significantly faster without perceiving an inferiority of the proposed LISC compared to the NCC regarding all subjective aspects. Thus, the usage of LISC can be considered in real-world applications to reduce work time with such large vehicle manipulators.

Although the experiments provide strong indications of the advantages offered by the developed method compared to existing technical solutions, it is essential to acknowledge the limitations inherent in these experiments. Firstly, the definition of performance indices may vary in real-world applications. Furthermore, in many cases, the measurement and quantification of these indices may not be currently feasible from a technical standpoint. Secondly, the efficacy of the developed approach is heavily reliant on the operational environment. The extent to which the assistance system can actively support the operator remains uncertain, which may result in lower benefits than indicated by the studies conducted thus far. Finally, the subjective perception and acceptance of professional users with the developed simulator is difficult to estimate. Therefore, further studies should be conducted using a more realistic simulator.

To summarize the chapter, strong indications are derived from these experiments showing that the use of the LISC can have benefits in real-world applications for both manufacturers and customers.

# **7 Conclusion**

This thesis focuses on *continuous human-machine interactions*, which characterize the operation of large vehicle manipulators used for road maintenance works. Although vehicle manipulators are the focus of intensive research, the systematic treatment of human-machine interactions with *limited information* has not been addressed in the literature. Throughout this thesis, *limited information* means that the automation cannot measure or observe a subset of the system state, which arises for robotic systems with unstructured working environments.

In order to enable the shared control of such systems, the first contribution of this thesis is the concept of the *limited information shared control*. It is assumed that a human controls the non-measurable system states and the automation strives to support the human to improve the performance on their task. The core idea is the introduction of the so-called *cooperation state*, which can model the *mutual effort* of the human and the automation. Moreover, the cooperation state can also serve as a substitute for the system states that are non-measurable for the automation. The limited information shared control is designed by matching it to a *full information shared control*, which is based on a shared control design from the literature and requires all the system states to be measurable. In contrast, the limited information shared control can operate with fewer measurements and can provide similar support for the human operator. These advantageous characteristics of the limited information shared control make specific real-world applications feasible for the first time. The remaining challenge is calculating the parameters of the cooperation state in such a way that they can characterize this specific *continuous human-machine interaction*.

To solve this challenge, the development of a systematic control design for the limited information shared controller is presented, which is the second contribution of this thesis. The core idea is that the so-called potential games are appropriate for a compact substitute representation of the shared control setup. However, the existing subclasses of potential games have restrictive properties. Therefore, an extension is required to enable broader use of potential differential games. This thesis closes this gap and introduces two novel subclasses: the *near potential differential games* and the *ordinal potential differential games*. They are suitable for the modeling of continuous shared control setups and enable the systematic calculation of the parameters of the cooperation state. Finally, the feedback control law of the limited information shared controller is computed ensuring the optimality of the mutual effort. Thus, the proposed systematic design of the limited information shared controller enables the modeling and the control of continuous human-machine interactions with limited information.

To enable the testing of the proposed limited information shared control and its comparison with other control concepts, a simulator was developed in the course of this thesis. First, the fundamental usability of the proposed concept was proved in simulation with simulated human behavior. These first analyses included scenarios in which the manipulator faced challenging situations. The results show that the use of the proposed limited information shared control can help the operator carry out the tracking task faster and more precisely compared to the state-of-the-art solutions.

In order to evaluate the impact of the human behavior on the designed limited information shared controller, three different experiments were conducted including human test subjects. In the first one, the proposed limited information shared controller was compared to the full information shared controller and a non-cooperative controller in the case of the lateral control of large vehicle manipulators. The results showed that the proposed shared control with the novel design method does not have significantly different results compared to the full information shared controller despite the limited information.

In the second experiment, manual control of the large vehicle manipulator was compared to the limited information shared control and to the non-cooperative controller. Manual control is the current state-of-the-art method. The main difference to the first experiment was that no reference to the manipulator was given to the test subjects in advance: The task was to reach defined goals with the manipulator, which corresponds to a more realistic scenario. The results strongly indicate that the use of the proposed limited information shared control is suitable for the applications of the large vehicle manipulator: It outperformed both manual control and the non-cooperative controller in each aspect.

The first two experiments addressed the lateral control of the large vehicle manipulator. The third experiment presented the application of the limited information shared control for the longitudinal guidance. A comparison between the limited information shared controller and a non-cooperative controller was carried out. The results showed that significantly faster work was possible with the proposed limited information shared controller compared to the non-cooperative controller.

The three experiments provide strong indications that the proposed limited information shared control and its design procedure have practical benefits and are applicable to both longitudinal and lateral control of large vehicle manipulators.

This thesis closes the research gap regarding shared control systems with limited information by introducing novel modeling with the cooperation state, presenting a systematic design procedure, and providing the first propositions for the practical use of the concept in improving the performance of road maintenance works.

# **A Solution Concepts of Games**

In game theory, there are three main solution concepts: the Nash equilibrium, the Stackelberg equilibrium and the Pareto optimum [vS<sup>+</sup>52, Par14]. In this section, the latter two are presented and the relevant differences between them are discussed. Furthermore, the choice of the Nash equilibrium is motivated, which can model human-automation shared control interactions, see [BOW09, LSB21, NC22]. In Section 3.1, the solution concept of the Nash equilibrium is presented. However, to solve the coupled optimization

$$\min_{u^{(i)}} J^{(i)}\left(u^{(i)}, u^{(\neg i)}\right), \;\forall i \in \mathcal{P},\tag{A.1}$$

there are further concepts in the literature.

# **A.1 Stackelberg Equilibrium**

Stackelberg proposed an alternative equilibrium solution concept [vS<sup>+</sup>52]. The Stackelberg solution concept assumes that the players determine their strategy in an ordered, sequential manner. A leading player sets his strategy first. Once the strategy of the first player is set, it cannot be changed and all the other players can take it into account when choosing their strategies. Then, the second player defines his strategy. The second player assumes that the following players will react to his strategy by optimizing their objective functions. This procedure continues until the last player sets his strategy. Thus, the Stackelberg equilibrium is defined as follows.

#### **Definition A.1 (Stackelberg)**

*The solution strategy of a differential game* u<sup>s∗</sup> = [u<sup>(1)s∗</sup>, u<sup>(2)s∗</sup>, ..., u<sup>(N)s∗</sup>] *is called the Stackelberg equilibrium and is defined by the sequence of dynamic optimizations, such that*

$$\mathbf{u}^{(1)^{s*}}(t) = \operatorname*{arg\,min}_{\mathbf{u}^{(1)}(t)} J^{(1)}\left(\mathbf{x}(t), \mathbf{u}^{(1)}(t), \mathbf{u}^{(2)}(t), \dots, \mathbf{u}^{(N)}(t)\right), \tag{A.2}$$

$$\text{s.t. } \dot{\mathbf{x}}(t) = \mathbf{f}\left(t, \mathbf{x}(t), \mathbf{u}^{(1)}(t), \mathbf{u}^{(2)}(t), \dots, \mathbf{u}^{(N)}(t)\right),$$

$$\mathbf{x}(0) = \mathbf{x}_0,$$

$$\mathbf{u}^{(2)^{s*}}(t) = \operatorname*{arg\,min}_{\mathbf{u}^{(2)}(t)} J^{(2)}\left(\mathbf{x}(t), \mathbf{u}^{(1)^{s*}}(t), \mathbf{u}^{(2)}(t), \dots, \mathbf{u}^{(N)}(t)\right), \tag{A.3}$$

$$\text{s.t. } \dot{\mathbf{x}}(t) = \mathbf{f}\left(t, \mathbf{x}(t), \mathbf{u}^{(1)^{s*}}(t), \mathbf{u}^{(2)}(t), \dots, \mathbf{u}^{(N)}(t)\right),$$

$$\mathbf{x}(0) = \mathbf{x}_0,$$

$$\vdots$$

$$\mathbf{u}^{(N)^{s*}}(t) = \operatorname*{arg\,min}_{\mathbf{u}^{(N)}(t)} J^{(N)}\left(\mathbf{x}(t), \mathbf{u}^{(1)^{s*}}(t), \mathbf{u}^{(2)^{s*}}(t), \dots, \mathbf{u}^{(N-1)^{s*}}(t), \mathbf{u}^{(N)}(t)\right), \tag{A.4}$$

$$\text{s.t. } \dot{\mathbf{x}}(t) = \mathbf{f}\left(t, \mathbf{x}(t), \mathbf{u}^{(1)^{s*}}(t), \mathbf{u}^{(2)^{s*}}(t), \dots, \mathbf{u}^{(N-1)^{s*}}(t), \mathbf{u}^{(N)}(t)\right),$$

$$\mathbf{x}(0) = \mathbf{x}_0,$$

*where* u<sup>(i)s∗</sup>(t), ∀i ∈ {1, 2, ..., N}, *are the fixed, stacked-optimal inputs of the players, which cannot be modified through the optimization of* J<sup>(j)</sup> *if* j > i.

The order of the players has an impact on the optimum of the cost functions J<sup>(i)</sup> and consequently on the resulting equilibrium of the game. Games with biased or asymmetric information patterns (e. g. markets with dominant companies) are modeled by means of the Stackelberg strategy. For more details, see [BB81, BCS15, MB18].
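The effect of the player order can be made concrete with a small static example, which is deliberately much simpler than the differential games of Definition A.1. The two quadratic cost functions below are invented for illustration. The best responses follow from setting the partial derivatives to zero, and the leader's problem is solved numerically with a ternary search, which is one possible choice for a convex scalar function.

```python
from typing import Callable, Tuple


def cost_leader(u1: float, u2: float) -> float:
    """Hypothetical cost of player 1 (leader)."""
    return (u1 + u2 - 2.0) ** 2 + u1 ** 2


def cost_follower(u1: float, u2: float) -> float:
    """Hypothetical cost of player 2 (follower)."""
    return (u1 + u2 - 1.0) ** 2 + u2 ** 2


def best_response_leader(u2: float) -> float:
    """Minimizer of cost_leader over u1 for fixed u2 (dJ1/du1 = 0)."""
    return (2.0 - u2) / 2.0


def best_response_follower(u1: float) -> float:
    """Minimizer of cost_follower over u2 for fixed u1 (dJ2/du2 = 0)."""
    return (1.0 - u1) / 2.0


def argmin_convex(f: Callable[[float], float],
                  lo: float = -10.0, hi: float = 10.0,
                  iters: int = 200) -> float:
    """Ternary search for the minimizer of a convex scalar function."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def stackelberg() -> Tuple[float, float]:
    """Leader optimizes while anticipating the follower's best response."""
    u1 = argmin_convex(lambda u: cost_leader(u, best_response_follower(u)))
    return u1, best_response_follower(u1)


def nash(iters: int = 50) -> Tuple[float, float]:
    """Nash equilibrium via best-response iteration."""
    u1, u2 = 0.0, 0.0
    for _ in range(iters):
        u1 = best_response_leader(u2)
        u2 = best_response_follower(u1)
    return u1, u2
```

In this example, the leader obtains a lower cost in the Stackelberg solution than in the Nash equilibrium, because the leader exploits the knowledge of how the follower will react.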

# **A.2 Pareto Optimum**

Nash and Stackelberg equilibria are the solution concepts of non-cooperative games in which the players do not collaborate in reaching the optimum of their own cost functions. Thus, players strive to improve their own cost functions only. An alternative solution is the so-called Pareto optimum [Par14], in which the players do not optimize their own objectives only but also take into account the objectives of the other players when computing the control actions. Using the Pareto solution, the players collaborate with each other in order to reach the so-called Pareto optimum, which is also referred to as the Pareto front. Due to this cooperation, these are called cooperative games. The Pareto solution is defined as follows [Eng05, Definition 6.1].

#### **Definition A.2 (Pareto Optimum of Differential Game)**

*The solution strategy of a differential game* u<sup>p∗</sup> = [u<sup>(1)p∗</sup>, u<sup>(2)p∗</sup>, ..., u<sup>(N)p∗</sup>] *is called the Pareto optimum, if no other permissible strategy* u = [u<sup>(1)</sup>, u<sup>(2)</sup>, ..., u<sup>(N)</sup>] *exists such that*

$$J^{(i)}\left(\mathbf{u}\right) < J^{(i)}\left(\mathbf{u}^{p*}\right) \tag{A.5}$$

*holds for at least one player* i ∈ P *and*

$$J^{(j)}\left(\mathbf{u}\right) \le J^{(j)}\left(\mathbf{u}^{p*}\right), \quad \forall j \in \mathcal{P},\ j \neq i. \tag{A.6}$$

The strategy is a Pareto optimum if no other feasible strategy exists that would lead to a better outcome for at least one of the players but would not make the outcome of any other player worse. Consequently, players in a cooperative game act differently compared to the players in a non-cooperative game. In a non-cooperative game, a player deviates from the Pareto optimum if this leads to a lower value of his/her cost function, regardless of the resulting disadvantages for the other players. Computation methods of the Pareto optimum of a differential game are presented in [RE14, LZ18].
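The dominance condition of Definition A.2 can be checked directly for a finite set of candidate strategies. The following sketch operates on cost vectors, with one entry per player and lower values being better; the candidate values are invented for illustration. It is a brute-force filter over a finite set and not one of the computation methods for differential games referenced above.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if cost vector a is no worse than b for every player and
    strictly better for at least one player (lower cost is better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(costs: Sequence[Sequence[float]]) -> List[int]:
    """Indices of the cost vectors that no other vector dominates."""
    return [
        i for i, c in enumerate(costs)
        if not any(dominates(other, c)
                   for j, other in enumerate(costs) if j != i)
    ]


# Hypothetical cost pairs (J1, J2) of four candidate strategies.
CANDIDATES = [(2.0, 0.0), (1.8, 0.08), (3.0, 3.0), (1.0, 1.0)]
FRONT = pareto_front(CANDIDATES)
print(FRONT)
```

Here, the third candidate is dominated by the fourth, whereas the remaining candidates trade the cost of one player against that of the other and are therefore all Pareto optimal within this set.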

Both the Stackelberg and the Pareto solution are widespread in the literature of game theory. However, they are not suitable for the modeling of continuous human-machine interactions. In a shared control setup in accordance with Definition 2.4, a direct and formal coordination between the human and automation is not feasible. In a shared control setup, the interaction between the partners happens continuously through the state and the inputs of the controlled system. Thus, an agreement happens implicitly, in which all players rationally choose their inputs with respect to their individual objectives and continuously adapt to the other partner.

As a result of these considerations, the Stackelberg and the Pareto solution concepts are not suitable for the modeling of continuous shared control problems. This conclusion was reinforced through experiments, see [BOW09, LSB21, NC22], in which it has been shown that a continuous human-automation interaction can be modeled by means of Nash equilibria of a non-cooperative game. Therefore, the concept of Nash equilibria is used for the modeling of shared control in this thesis.

# **B The Novel Models of the Vehicle Manipulator**

In this chapter, the detailed models of the vehicle manipulators are presented to provide additional information to Chapter 5. First, the nonlinear control model is presented, which is followed by the detailed simulation model of the large vehicle manipulator.

# **B.1 Derivation of the Nonlinear Control Model**

This section presents the novel control model of the vehicle manipulator, which can be used to control the dual trajectories of such a system. This derivation is presented in the publication [VMSH19].

The controllers presented in this thesis control the vehicle manipulator in the plane and therefore require a design model of the vehicle manipulator system. It is characterized through the following coordinate systems<sup>48</sup> (Fig. B.1):


Additionally, the endpoint of the manipulator P<sub>man</sub> is used for the derivation of the system dynamics. The equation of motion of the vehicle can be derived through the use of the velocity of the reference points P<sub>v</sub> and P<sub>m</sub>. The derivation for car-like vehicles is presented in [SK08a, Chapter 49.2]. The vehicle manipulator is controlled by the steering angle of the vehicle δ, the length a of the robotic arm projected into the plane, and the orientation α of the robotic arm. For such a system, a generalization can be introduced by defining the variables s, d and ∆θ for the vehicle (index v) and for the manipulator (index m):


<sup>48</sup> Note that the unit vectors are marked as i and j instead of x and y to avoid confusion with the latter system states.


The bicycle model is given according to [SK08a, Chapter 49.2]:

$$\begin{split} \dot{s}\_{\text{veh}} &= \frac{v\_{\text{veh}}}{1 - \kappa\_{\text{rv}} \cdot d\_{\text{veh}}} \cos(\Delta\theta\_{\text{veh}}), \\ \dot{d}\_{\text{veh}} &= v\_{\text{veh}} \sin(\Delta\theta\_{\text{veh}}), \\ \Delta\dot{\theta}\_{\text{veh}} &= v\_{\text{veh}} \frac{\tan(\delta)}{L} - \dot{s}\_{\text{veh}} \kappa\_{\text{rv}}. \end{split} \tag{B.1}$$
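The right-hand side of (B.1) can be sketched in a few lines of Python. This is only an illustration, not the implementation used in the thesis: the function name and argument names are hypothetical, and the sketch assumes that the curvature term of the third equation is scaled by the path velocity, so that all terms carry the unit rad/s.

```python
import math


def bicycle_frenet_rhs(d_veh, dtheta_veh, v_veh, delta, kappa_rv, wheelbase):
    """Time derivatives (s_dot, d_dot, dtheta_dot) of the path-relative bicycle model."""
    # Progress along the reference path of the vehicle.
    s_dot = v_veh * math.cos(dtheta_veh) / (1.0 - kappa_rv * d_veh)
    # Lateral deviation from the reference path.
    d_dot = v_veh * math.sin(dtheta_veh)
    # Assumption: the path curvature is multiplied by s_dot (unit rad/s).
    dtheta_dot = v_veh * math.tan(delta) / wheelbase - s_dot * kappa_rv
    return s_dot, d_dot, dtheta_dot


# Hypothetical numbers: the vehicle is on the path and steers with the path curvature.
kappa = 0.05                      # reference curvature in 1/m
L = 3.5                           # wheelbase in m
delta_matched = math.atan(L * kappa)
print(bicycle_frenet_rhs(0.0, 0.0, 2.0, delta_matched, kappa, L))
```

With a matched steering angle and zero initial deviation, the lateral and angular rates vanish, which is the expected stationary behavior on a path of constant curvature.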

The idea of joint-dependent variables (a and α) can be found in some earlier works [MAD14, MAD16], but only for small indoor robots in the global frame for task priority redundancy resolution or dual-trajectory control.

In the following, the manipulator dynamics in the Frenet frame are derived. The curvature of, e. g., Γman at the point Prm is defined as κrm = ∂θrm/∂sm. Substituting this definition into the time derivative of the manipulator's transformation angle yields

$$
\Delta \dot{\theta}\_{\text{man}} = \dot{\theta}\_{\text{veh}} - \dot{\theta}\_{\text{rm}} = \dot{\theta}\_{\text{veh}} - \kappa\_{\text{rm}} \dot{s}\_{\text{rm}}.\tag{B.2}
$$

Secondly, the position of Pman in the global frame O is required:

$$
\overrightarrow{\text{OP}}\_{\text{man}} = \overrightarrow{\text{OP}}\_{\text{veh}} + \left( \{ L + l \} + a \cos \alpha \right) \mathbf{i}\_{\text{veh}} + a \sin \alpha \mathbf{j}\_{\text{veh}}.\tag{B.3}
$$

**Figure B.1:** The detailed lateral control model of the large vehicle manipulator for the derivation of the equations of motion and the applications of the FISC and the LISC design [VMSH19]. ©2020 IEEE

With respect to the reference path of the manipulator, $\overrightarrow{\text{OP}}\_{\text{man}}$ is alternatively computed as

$$
\overrightarrow{\text{OP}}\_{\text{man}} = \overrightarrow{\text{OP}}\_{\text{rm}} + d\_{\text{man}} \mathbf{j}\_{\text{rm}} \tag{B.4}
$$

and the time derivative of (B.3) is

$$\begin{split} \frac{\partial \overrightarrow{\text{OP}}\_{\text{man}}}{\partial t} &= \frac{\partial \overrightarrow{\text{OP}}\_{\text{veh}}}{\partial t} + \{L + l\} \frac{\partial \mathbf{i}\_{\text{veh}}}{\partial t} + \{\dot{a}\cos\alpha - a\dot{\alpha}\sin\alpha\} \mathbf{i}\_{\text{veh}} \\ &+ a\cos\alpha \frac{\partial \mathbf{i}\_{\text{veh}}}{\partial t} + \{\dot{a}\sin\alpha + a\dot{\alpha}\cos\alpha\} \mathbf{j}\_{\text{veh}} \\ &+ a\sin\alpha \frac{\partial \mathbf{j}\_{\text{veh}}}{\partial t} . \end{split} \tag{B.5}$$

With the substitution of

$$\frac{\partial \mathbf{i}\_{\text{veh}}}{\partial t} = \dot{\theta}\_{\text{veh}} \mathbf{j}\_{\text{veh}}, \quad \frac{\partial \mathbf{j}\_{\text{veh}}}{\partial t} = -\dot{\theta}\_{\text{veh}} \mathbf{i}\_{\text{veh}} \quad \text{and} \quad \frac{\partial \overrightarrow{\text{OP}}\_{\text{veh}}}{\partial t} = v \mathbf{i}\_{\text{veh}},$$

the equation of motion (B.5) is

$$\begin{split} \frac{\partial \overrightarrow{\text{OP}}\_{\text{man}}}{\partial t} &= \left( v + \dot{a} \cos \alpha - a \{ \dot{\theta}\_{\text{veh}} + \dot{\alpha} \} \sin \alpha \right) \mathbf{i}\_{\text{veh}} \\ &+ \left( \{ L + l \} \dot{\theta}\_{\text{veh}} + \dot{a} \sin \alpha + a \cos \alpha (\dot{\theta}\_{\text{veh}} + \dot{\alpha}) \right) \mathbf{j}\_{\text{veh}}. \end{split} \tag{B.6}$$
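The step from (B.3) to (B.6) can be checked numerically. The sketch below evaluates the position (B.3) for a simple test motion, differentiates it with central differences and compares the result, projected onto the vehicle axes, with the closed-form components of (B.6). All numerical values and function names are hypothetical and chosen only for this check; a constant speed, a constant yaw rate and linearly changing a and α are assumed.

```python
import math

L_PLUS_L = 4.0                 # hypothetical value of L + l in m
V = 1.5                        # constant vehicle speed v in m/s
OMEGA = 0.3                    # constant yaw rate of the vehicle in rad/s
A0, A_DOT = 2.0, 0.5           # a(t) = A0 + A_DOT * t
ALPHA0, ALPHA_DOT = 0.2, 0.4   # alpha(t) = ALPHA0 + ALPHA_DOT * t


def p_man(t):
    """Global position of the manipulator endpoint according to (B.3)."""
    theta = OMEGA * t
    a = A0 + A_DOT * t
    alpha = ALPHA0 + ALPHA_DOT * t
    # Vehicle reference point moving with velocity V * i_veh (circular arc).
    x_veh = V * math.sin(theta) / OMEGA
    y_veh = V * (1.0 - math.cos(theta)) / OMEGA
    lon = L_PLUS_L + a * math.cos(alpha)   # component along i_veh
    lat = a * math.sin(alpha)              # component along j_veh
    x = x_veh + lon * math.cos(theta) - lat * math.sin(theta)
    y = y_veh + lon * math.sin(theta) + lat * math.cos(theta)
    return x, y


def velocity_b6(t):
    """Components of the endpoint velocity along i_veh and j_veh from (B.6)."""
    a = A0 + A_DOT * t
    alpha = ALPHA0 + ALPHA_DOT * t
    along_i = V + A_DOT * math.cos(alpha) - a * (OMEGA + ALPHA_DOT) * math.sin(alpha)
    along_j = (L_PLUS_L * OMEGA + A_DOT * math.sin(alpha)
               + a * math.cos(alpha) * (OMEGA + ALPHA_DOT))
    return along_i, along_j


def velocity_numeric(t, h=1e-6):
    """Central-difference velocity of p_man, projected onto i_veh and j_veh."""
    (x1, y1), (x2, y2) = p_man(t - h), p_man(t + h)
    vx, vy = (x2 - x1) / (2.0 * h), (y2 - y1) / (2.0 * h)
    theta = OMEGA * t
    along_i = vx * math.cos(theta) + vy * math.sin(theta)
    along_j = -vx * math.sin(theta) + vy * math.cos(theta)
    return along_i, along_j


print(velocity_b6(0.7), velocity_numeric(0.7))
```

Both evaluations agree up to the discretization error of the finite difference, which supports the grouping of the terms in (B.6).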

The same velocity is computed from the reference path of the manipulator in (B.4)

$$\begin{split} \frac{\partial \overrightarrow{\text{OP}}\_{\text{man}}}{\partial t} &= \frac{\partial \overrightarrow{\text{OP}}\_{\text{rm}}}{\partial t} + \frac{\partial}{\partial t} \left( d\_{\text{man}} \mathbf{j}\_{\text{rm}} \right) \\ &= \dot{s}\_{\text{man}} \{ 1 - d\_{\text{man}} \kappa\_{\text{rm}} \} \mathbf{i}\_{\text{rm}} + \dot{d}\_{\text{man}} \mathbf{j}\_{\text{rm}} . \end{split} \tag{B.7}$$

Assuming dman ⋅ κrm ≪ 1 and transforming the manipulator's path coordinate system to the vehicle coordinate system, the dynamics of the manipulator are obtained by comparing the terms of equation (B.7) and equation (B.6) as follows:

$$\begin{aligned} \dot{s}\_{\text{man}} &= \cos \Delta \theta\_{\text{man}} \left( v + \dot{a} \cos \alpha - a \dot{\alpha} \sin \alpha - a \dot{\theta}\_{\text{veh}} \sin \alpha \right) \\ &- \sin \Delta \theta\_{\text{man}} \left( (L+l) \dot{\theta}\_{\text{veh}} + a \dot{\theta}\_{\text{veh}} \cos \alpha + \dot{a} \sin \alpha + a \dot{\alpha} \cos \alpha \right) \end{aligned} \tag{B.8}$$

and

$$\begin{split} \dot{d}\_{\text{man}} &= \sin \Delta \theta\_{\text{man}} \Big( v + \dot{a} \cos \alpha - a \dot{\alpha} \sin \alpha - a \dot{\theta}\_{\text{veh}} \sin \alpha \Big) \\ &+ \cos \Delta \theta\_{\text{man}} \Big( (L+l) \dot{\theta}\_{\text{veh}} + a \dot{\theta}\_{\text{veh}} \cos \alpha + \dot{a} \sin \alpha + a \dot{\alpha} \cos \alpha \Big). \end{split} \tag{B.9}$$

Substituting (B.8) into (B.2), the time derivative of the manipulator's transformation angle is

$$\begin{split} \Delta\dot{\theta}\_{\text{man}} = \dot{\theta}\_{\text{veh}} - \kappa\_{\text{rm}} \Big[ &\cos\Delta\theta\_{\text{man}} \left( v + \dot{a}\cos\alpha - a\dot{\alpha}\sin\alpha - a\dot{\theta}\_{\text{veh}}\sin\alpha \right) \\ &- \sin\Delta\theta\_{\text{man}} \left( (L+l) \dot{\theta}\_{\text{veh}} + a \dot{\theta}\_{\text{veh}} \cos \alpha + \dot{a} \sin \alpha + a \dot{\alpha} \cos \alpha \right) \Big]. \end{split} \tag{B.10}$$

Equations (B.1), (B.8), (B.9) and (B.2) are used as the state-space equations of the dynamic system.

To obtain a linear control model of the vehicle manipulator, the *vehicle's curvature* κ<sub>v</sub> = tan(δ)/L is introduced as an input from (B.1). Thereby the input vector is

$$\mathbf{u}(t) = \begin{bmatrix} \kappa\_{\text{v}}, \ \dot{a}\_{\text{des}}, \ \dot{\alpha}\_{\text{des}} \end{bmatrix}^{T},$$

where a˙des and α˙des are the desired rates of change of the length and the angle of the manipulator. The state vector is chosen as

$$\mathbf{x}(t) = \begin{bmatrix} d\_{\rm m}, \Delta \alpha, d\_{\rm v}, \Delta \theta\_{\rm veh} \end{bmatrix}^T,$$

where ∆α = α − α<sub>r</sub> and α<sub>r</sub> is the desired reference orientation of the manipulator. Note that including (B.10) in the state vector is not necessary. However, to formulate a model predictive controller with constraints on ∆θ˙man, the system state vector should be extended; for more details, see [VMSH19]. The changes of the trajectories of the vehicle and the manipulator constitute the external disturbance vector

$$\mathbf{z}(t) = \begin{bmatrix} \kappa\_{\text{rv}}, \ \kappa\_{\text{rm}} \end{bmatrix}^T \text{ .}$$

Thus, a linear control model can be formulated

$$
\dot{\mathbf{x}}(t) = \mathbf{A}(t)\Delta\mathbf{x}(t) + \mathbf{B}(t)\Delta\mathbf{u}(t) + \mathbf{Z}\Delta\mathbf{z}(t),\tag{B.11}
$$

where the time variance of the system matrices is caused by the variation of the longitudinal velocity of the vehicle. For constant speed, the model (B.11) is time-invariant, as is assumed in Chapter 5.

The parameters of the control models are given in Table B.1.


**Table B.1:** Parameters of the linear control model

# **B.2 Detailed Simulation Models of Vehicle Manipulator**

The following section presents the simulation models of the vehicle and the manipulator followed by further information on the low-level controller of the simulation models.

# **B.2.1 Simulation Model of the Vehicle**

In this section, the detailed three-dimensional model of the vehicle is presented, which is developed in the course of this thesis. Parts of the implementation were conducted in the course of the master's thesis [Mai18]. The results are published in research articles [VMSH19, VMH22].

In order to be able to investigate the effects of the hydraulic manipulator on the vehicle, a nonlinear three-dimensional vehicle model is selected. Depending on the application, large forces and torques can act from the robot arm on the vehicle. These can lead to rotational angles of the vehicle chassis that are not negligible. All simulation models of the vehicle manipulator are derived with the Newton-Euler formalism (see e. g. [WW08, Section 3.4]), in which two laws of classical mechanics are applied: the conservation of linear and of angular momentum, which can be formulated as

$$\mathbf{F} = m \cdot \ddot{\mathbf{r}}\tag{\text{B.12}}$$

$$\mathbf{T} = \Theta \dot{\omega} + \omega \times \Theta \omega,\tag{\text{B.13}}$$

where


Figure B.2 illustrates the model of the vehicle including


These subsystems are connected with spring and damper elements as given in Figure B.2. For each subsystem, the kinematic relations are set up, leading to the transformation matrices between them. Furthermore, the Newton-Euler equations of motion (B.12) are formulated for the subsystems and their motions are computed.

**Figure B.2:** The simulation model of heavy duty vehicle including seven subsystems (one vehicle, two suspension systems and four wheels) modeled as rigid bodies

The orientation of rigid bodies is represented with the Euler angles roll, pitch and yaw (αEu, βEu, γEu), which consist of three chained rotations and describe the orientation of a rigid body in a fixed frame OGlob. Their use is intuitive and easy to apply. However, the order of the chained rotations has an impact on the resulting final orientation of the rigid body. To compute the angular velocities of a rigid body for (B.12), in practice, the assumption is made that at least two of the Euler angles are small and therefore the angular velocity is computed such that

$$
\begin{bmatrix}
\omega\_1\\\omega\_2\\\omega\_3\end{bmatrix}\_{\text{Global}} = \begin{bmatrix}
\dot{\alpha}\_{\text{Eu}}\\\dot{\beta}\_{\text{Eu}}\\\dot{\gamma}\_{\text{Eu}}
\end{bmatrix}\_{\text{Global}}.\tag{B.14}
$$

In the case of passenger cars, a common assumption is that the roll and pitch angles are small and the yaw angle is large, which is valid for most of the test scenarios. However, in the case of heavy duty vehicles, the assumption behind (B.14) is not feasible<sup>49</sup>, see [PBLB06, LYLC10]. Furthermore, the inertia of the rigid body Θ is usually estimated for the center of mass of heavy duty vehicles. Due to the complex mechanical structure of heavy duty vehicles, a transformation of (B.12) into an arbitrary point, solving the problem of large Euler angles, is not practicable in the general case.

<sup>49</sup> In literature, there are further works, which decouple the three motions and use planar models only, see e. g. [LH05, GB21] or [Ril11, Chapter 7-9]. However, this thesis attempts to build a more general model enabling realistic simulations.

Due to these reasons, the notion from [KNH14] is applied to overcome the aforementioned challenges. The fundamental idea is the use of a rotated coordinate system ORot, in which the angular velocities can be given with

$$
\begin{bmatrix}
\omega\_1\\ \omega\_2\\ \omega\_3
\end{bmatrix}\_{\text{Global}} = \begin{bmatrix}
\omega\_{1\text{Rot}}\\ 0\\ 0
\end{bmatrix}\_{\text{Rot}}.\tag{B.15}
$$

Using (B.15), the three-dimensional rotation is reduced to one single rotation, fulfilling φ˙ = ω1Rot, see Figure B.3. This procedure includes three steps:


These steps are numerically computed for each time step. The resulting differential equations are solved with a Runge-Kutta method and a time step of 0.5 ms. For the detailed mathematical derivation of these steps, the reader is referred to [KNH14].

The dynamics of the steering system are modeled by means of a PT1 element with a time constant of 0.2 s, leading to the following subsystem dynamics

$$
\dot{\delta}\_{\rm act} = -5\delta\_{\rm act} + \delta\_{\rm des}.\tag{B.16}
$$
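A first-order lag such as (B.16) can be integrated with a fixed step of 0.5 ms as sketched below. The sketch is an illustration only: it assumes the classical fourth-order Runge-Kutta scheme (the text names only "Runge-Kutta"), takes the coefficients a = 5 and b = 1 as printed in (B.16), and uses hypothetical function names.

```python
import math


def pt1_rhs(delta_act, delta_des, a=5.0, b=1.0):
    """Right-hand side of the first-order lag; a = 1 / (0.2 s), b as printed in (B.16)."""
    return -a * delta_act + b * delta_des


def rk4_step(x, u, dt, rhs=pt1_rhs):
    """One classical fourth-order Runge-Kutta step with a constant input u."""
    k1 = rhs(x, u)
    k2 = rhs(x + 0.5 * dt * k1, u)
    k3 = rhs(x + 0.5 * dt * k2, u)
    k4 = rhs(x + dt * k3, u)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def simulate_step(delta_des, t_end, dt=0.5e-3):
    """Response to a constant desired steering angle, starting from zero."""
    x = 0.0
    for _ in range(round(t_end / dt)):
        x = rk4_step(x, delta_des, dt)
    return x


# Value after one time constant (0.2 s) for a hypothetical input of 0.1 rad.
print(simulate_step(0.1, 0.2))
```

After one time constant, the response has covered a fraction of 1 − e⁻¹ (about 63 %) of its stationary value, which for the coefficients above is b/a times the input.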

The longitudinal model of the vehicle manipulator includes a combustion engine model consisting of a static rotation speed-driving torque map and a PT1 transfer function, which is inspired by [Ril11, GF12]. Table B.2 provides the parameters of the vehicle model, which are estimated based on literature, see [LYLC10, PBLB06].

**Figure B.3:** Illustration of the angular velocity based local coordinate system to solve the computation problem of the Euler angles


**Table B.2:** Parameters of the vehicle simulation model

### **B.2.2 Manipulator Model**

For the manipulator model, parts of the implementations were carried out in two bachelor's theses [Bou19, Bur19]. The nonlinear hydraulic model of the manipulator is presented in this section, see Fig. B.4. The model of the manipulator includes


The mechanical models are derived with the Newton-Euler formalism, implemented in a Simulink model and solved numerically. The hydraulic models are implemented based on the works of [Rud17, Rud18]. The system states of the hydraulic model are the angular velocity of the joint ϕ˙ and the load oil pressure of the cylinder PL. The nonlinear dynamical model is formulated as follows

$$\dot{P}\_L = \frac{4E\_{\text{hyd}}}{V\_t} \cdot \left(Q\_L - \overline{A} \cdot \dot{\phi}\right) \tag{B.17}$$

$$\ddot{\phi} = \frac{1}{m\_{man,i}} \left( P\_L \cdot A\_{\text{hyd}} - f(\dot{\phi}) \right), \tag{B.18}$$

**Figure B.4:** The simulation model of the large manipulator including the four rigid body segments and the four hydraulic cylinders

where f(ϕ˙) is the nonlinear, velocity-dependent Stribeck friction model and m<sub>man,i</sub> is the mass of the corresponding manipulator segment. The input of the cylinder is uhyd, which is a function of the desired angular velocity of the joint ϕ˙des. The nonlinear input equation is

$$Q\_L = z\_{\rm hyd} K\_{\rm hyd} \sqrt{\frac{1}{2} \left( P\_R - \text{sign} \{ z\_{\rm hyd} \} P\_L \right)},\tag{B.19}$$

where P<sub>R</sub> is the reservoir pressure. The servo valve governing the oil flow Q<sub>L</sub> is approximated by a second-order system, where the position of the spool is the output. Its dynamics are given as

$$
\ddot{\nu} + 2\zeta\_{\rm hyd}\omega\_{\rm hyd}\dot{\nu} + \omega\_{\rm hyd}^2 \nu = \omega\_{\rm hyd}^2 u\_{\rm hyd} \tag{B.20}
$$

where ζhyd is the damping ratio and ωhyd is the natural frequency of the servo valve. The spool has a dead zone, which is modeled by

$$z\_{\rm hyd} = h(\nu) = \begin{cases} \alpha\_{\rm hyd} \cdot \text{sign}(\nu), & \text{if } |\nu| \ge \alpha\_{\rm hyd} + \beta\_{\rm hyd}, \\ 0, & \text{if } |\nu| < \beta\_{\rm hyd}, \\ \nu - \beta\_{\rm hyd} \cdot \text{sign}(\nu), & \text{otherwise}, \end{cases} \tag{B.21}$$

where the parameter αhyd is the saturation of the valve and βhyd is the width of the dead zone. The sign(ν) functions are replaced by arctan(K<sub>f</sub> ⋅ ν) functions with K<sub>f</sub> = 50, which reduces the numerical oscillations around ϕ˙ = 0 rad. Using (B.17)-(B.21), the motion of the hydraulic cylinder can be reproduced accurately. The low-level control loop with the models presented above is illustrated in Figure B.5: The orange-colored box symbolizes the system part that is replaced by the human operator in the experiments. In contrast, the green subsystems within the dashed-line box remain the same in both simulation and experiment. The feedback gain of the low-level hydraulic controller is chosen as Khyd = 0.0225. The further parameters of the manipulator and their numerical values are given in Table B.3; they are estimated based on data from [Rud17, VGJ19, WWXS22].
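The dead zone with saturation in (B.21) can be illustrated as a small Python function. This is a sketch, not the Simulink implementation of the thesis: the function names and the numerical values are hypothetical, the exact sign function is used instead of the arctan smoothing, and the saturation branch is assumed to apply for |ν| ≥ αhyd + βhyd, which makes the map continuous.

```python
def sign(x):
    """Sign function returning -1, 0 or 1."""
    return (x > 0) - (x < 0)


def spool_dead_zone(nu, alpha_hyd, beta_hyd):
    """Map the spool position nu to the effective valve opening z_hyd."""
    if abs(nu) < beta_hyd:
        # Inside the dead zone: no oil flow.
        return 0.0
    if abs(nu) >= alpha_hyd + beta_hyd:
        # Saturation of the valve opening.
        return alpha_hyd * sign(nu)
    # Linear range, shifted by the width of the dead zone.
    return nu - beta_hyd * sign(nu)


# Hypothetical parameters: saturation 1.0, dead-zone width 0.1.
for nu in (-2.0, -0.6, 0.05, 0.6, 2.0):
    print(nu, spool_dead_zone(nu, 1.0, 0.1))
```

The three branches meet without jumps at |ν| = βhyd and at |ν| = αhyd + βhyd, which is the usual requirement for such a characteristic curve.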

**Figure B.5:** The structure of the hydraulic system with the corresponding models and the low-level control loop is shown. The green dashed line defines the components, which are used in both simulations and experiments unmodified.


**Table B.3:** Parameters of the manipulator simulation model

# **C Additional Simulation Results**

This chapter provides additional simulation results for Chapter 5. The first part includes additional figures for Section 5.3. In the second part, the results of an additional scenario validating LISC in Section 5.4.2 are given.

# **C.1 Simulation Results of the Potential Games**

In this section, additional figures for Section 5.3.2 are provided. The resulting noise-free trajectories of the identified OPDG are given in Figure C.1 and Figure C.2 presents the dynamics of the Hamiltonians. It can be seen that in the noise-free case, the input-trajectory-dependent identification of OPDG generates similar Hamiltonian dynamics compared to the noisy case in Section 5.3.2. Thus, the robustness of the input-trajectory-dependent identification of OPDG is ensured through this preprocessing making the method suitable for practical application even with noisy measurements.

**Figure C.1:** The resulting noise-free system state trajectories of the longitudinal vehicle manipulator and the identified OPDG

**Figure C.2:** The dynamics of Hamiltonian functions comparing the results of the original differential game (ODG) and the ordinal potential differential game (OPDG)

# **C.2 Additional Validation Scenario with the Vehicle Manipulator**

This section presents an additional simulation scenario, in which the reference path of the manipulator is the combination of a smooth curve and a sudden step. The resulting trajectories are given in Figure C.3, comparing the proposed limited information shared controller to the non-cooperative controller. As can be seen, after entering the smooth curve, both the limited information shared controller and the non-cooperative controller maintain the reference similarly. During the sudden step, the limited information shared controller can help the operator to maintain the reference of the manipulator. In contrast, the non-cooperative controller does not support the operator, thus the tracking of the manipulator's reference is less precise. The desired and actual angles of the manipulator are given for both cases: using LISC, see Figure C.4, and using NCC, see Figure C.5. Comparing these two figures, it can be seen that LISC eases the operation: smaller desired angular velocities are set, which can be followed more accurately, see the second and fourth joints. Consequently, LISC is also beneficial in this additional qualitative validation scenario.

**Figure C.3:** Additional scenario for the qualitative validation: The combination of a larger curve and a sudden step

**Figure C.4:** Additional scenario for the qualitative validation: The desired and set angular velocities of the manipulator angles using LISC

**Figure C.5:** Additional scenario for the qualitative validation: The desired and set angular velocities of the manipulator angles using NCC

# **D Supplements of the Experiments**

# **D.1 Equivalence Testing**

This section provides a short introduction to equivalence testing and its emerging challenges. These challenges are increasingly being brought into focus by research communities. For instance, in [Lak17], it is stated that "Currently, researchers often incorrectly conclude an effect is absent based a non-significant result". Furthermore, the misconceptions about equivalence testing were addressed in [GLM02], which points out that many textbooks fail to handle this subject area correctly. Equivalence cannot be concluded from the non-significant result of a statistical test for superiority: "The absence of evidence is not evidence of absence" [Ald04]. Therefore, in such cases, so-called equivalence or non-inferiority tests are needed, which can verify the equivalence or non-inferiority of a new method compared to a recognized standard method. Such equivalence or non-inferiority tests are widely used in medicine and pharmacotherapy research, in which the goal of a study is not to test the superiority of a therapy<sup>50</sup>, but to verify its equivalence or non-inferiority (e. g. the new therapy has fewer side effects or is cheaper, but still has an equivalent effect).

In the literature, the commonly used statistical hypothesis tests attempt to show that there exists a statistically significant difference between two or more data sets. Figure D.1 shows an exemplary illustration: H<sub>0</sub> and H<sub>1</sub> are presented as a function of the difference ∆M = M<sub>1</sub> − M<sub>2</sub>. The x-axis shows ∆M between the means of the new method M<sub>1</sub> and the standard method M<sub>2</sub>. The null hypothesis H<sub>0</sub> is that the means of the two methods are statistically identical. The alternative hypothesis H<sub>1</sub> is that there is a difference, which is characterized statistically. The mathematical formulation of these two hypotheses is

$$\mathbf{H}\_0: M\_1 = M\_2,\tag{\text{D.1a}}$$

$$\mathbf{H}\_1: M\_1 \neq M\_2,\tag{\text{D.1b}}$$

where *one-sided* tests provide evidence that M<sub>1</sub> < M<sub>2</sub> (left-tailed test) or M<sub>1</sub> > M<sub>2</sub> (right-tailed test). If the difference is analyzed in both directions at the same time, M<sub>1</sub> > M<sub>2</sub> and M<sub>1</sub> < M<sub>2</sub>, the test is *two-sided*. Using a statistical test, a probability value (p-value) is obtained, which quantifies the probability of observing a test statistic at least as extreme as the measured one, provided that H<sub>0</sub> is true, such that

$$p = \Pr\left\{ T \le t \, \middle| \, \mathcal{H}\_0 \right\}, \text{ in case of left-tailed tests,} \tag{D.2}$$

$$p = \Pr\left\{ T \ge t \, \middle| \, \mathcal{H}\_0 \right\}, \text{ in case of right-tailed tests,} \tag{D.3}$$

$$p = 2 \cdot \min\left\{ \Pr\left\{ T \le t \, \middle| \, \mathcal{H}\_0 \right\}, \Pr\left\{ T \ge t \, \middle| \, \mathcal{H}\_0 \right\} \right\} \text{ in case of two-sided tests},\tag{D.4}$$

<sup>50</sup> Note that the usage of equivalence or non-inferiority testing is widespread in medical research and in clinical trials, therefore the majority of the publications from the literature use the terminology "therapy" or "treatment" instead of "method". However, due to the technical focus of this thesis, the term "method" is used consistently in the following.

**Figure D.1:** Classical hypothesis testing, analyzing the difference between M<sub>1</sub> and M<sub>2</sub>.

where t is the observed value of the test statistic T, see e. g. [HKR15, Chapter 9]. Depending on the data distribution, different statistical tests are suitable for testing H<sub>1</sub>. Overviews and guidelines are given in [VB99], [VA19, Chapter 1].
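The three p-value definitions (D.2)-(D.4) can be illustrated with a short Python sketch. As an assumption made only for this example, the test statistic T is taken to follow a standard normal distribution under the null hypothesis; the function name is hypothetical and any other reference distribution with a `cdf` method could be passed instead.

```python
from statistics import NormalDist


def p_values(t_obs, null_dist=NormalDist()):
    """Left-tailed, right-tailed and two-sided p-values of an observed statistic."""
    left = null_dist.cdf(t_obs)            # Pr{T <= t | H0}, cf. (D.2)
    right = 1.0 - left                     # Pr{T >= t | H0}, cf. (D.3)
    two_sided = 2.0 * min(left, right)     # cf. (D.4)
    return left, right, two_sided


# Hypothetical observed statistic of 1.96.
print(p_values(1.96))
```

For a continuous reference distribution, the left-tailed and right-tailed p-values add up to one, and the two-sided p-value is twice the smaller of the two.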

On the other hand, the absence of such a difference in (D.1) does not automatically imply the equivalence of M<sub>1</sub> and M<sub>2</sub> [LSI18]. Therefore, a new formulation of the statistical test is necessary, for which an equivalence margin is defined by a lower equivalence limit ∆lower and an upper equivalence limit ∆upper. The null and the alternative hypotheses are formulated as follows:

$$\mathbf{H}\_0: M\_1 - M\_2 \le -\Delta\_{\text{lower}} \text{ or } M\_1 - M\_2 \ge \Delta\_{\text{upper}},\tag{\text{D.5a}}$$

$$\mathbf{H}\_1: -\Delta\_{\text{lower}} < M\_1 - M\_2 < \Delta\_{\text{upper}}.\tag{\text{D.5b}}$$

Figure D.2a presents the equivalence testing procedure graphically. The null hypothesis H<sub>0</sub> states that the difference between M<sub>1</sub> and M<sub>2</sub> is outside the equivalence interval. The alternative hypothesis H<sub>1</sub> is that ∆M is located inside the equivalence interval, which can be symmetric (∆lower = ∆upper) as well as non-symmetric (∆lower ≠ ∆upper). The choices of ∆lower and ∆upper are always application-specific. In [CGA04, WN11], guidelines and best practices are presented, pointing out the importance of a carefully considered choice of ∆lower and ∆upper.

In contrast, non-inferiority testing only answers the question whether the novel method M<sub>1</sub> is not worse than the standard method M<sub>2</sub>; thus, a better M<sub>1</sub> is acceptable. Its formal definition is

$$\mathbf{H}\_0: M\_1 - M\_2 \ge -\Delta\_{\text{inf}},\tag{\text{D.6a}}$$

$$\mathbf{H}\_1: M\_1 - M\_2 < -\Delta\_{\text{inf}}.\tag{\text{D.6b}}$$

Figure D.2b illustrates the fundamental idea of the non-inferiority testing.

An intuitive solution concept for the equivalence and non-inferiority tests is the idea of the *two one-sided tests* (TOST)<sup>51</sup> [HA84, RHV93, LW95, LC97, Zha03]. In the TOST procedure, the difference ∆M is tested against the upper and lower bounds, which form two composite null hypotheses, cf. (D.5a). If both one-sided tests of (D.5a) can be statistically rejected, (D.5b) can be concluded. In the case of a non-inferiority test, the procedure is similar. The difference is that (D.6a) includes one one-sided test only. TOSTs based on Student's t-tests are widespread. In [Lak17], it has been shown that the TOST approach can be extended to non-parametric tests, making the TOST more widely applicable.

<sup>51</sup> In the literature, the abbreviation "TOST" is often limited to "two one-sided t-tests". However, in [Lak17], it has been shown that the t-test can be replaced by non-parametric methods, e. g. the Wilcoxon signed rank test, enabling a more general use of the TOST procedure.

An alternative solution for non-parametric equivalence hypothesis testing is the so-called *signed rank equivalence test*, see [Wel10, Chapter 5], which does not include two explicit statistical tests against ∆lower and ∆upper. Instead of testing these bounds, the so-called rejection probability value Ccrit is computed, which is compared with a calculated rank statistic Crnk. The computation of the critical value depends on the equivalence distribution range (q′, q′′) of the data. This equivalence distribution range can be obtained from a numerical solution of a density function<sup>52</sup> f○ and the defined significance level α. The null hypothesis of the test states that there is no equivalence between the two data sets. The alternative hypothesis states that the two data sets are equivalent. If Ccrit > Crnk, the null hypothesis is rejected and the alternative hypothesis is accepted: The two data sets are equivalent. The test is also referred to as *Wellek's Signed Rank Paired-Sample Test for Equivalence* or Wellek's equivalence test, cf. [MC12]. For more details and the mathematical basis of Wellek's equivalence test, the reader is referred to [Wel10, Section 5.4].

Since Wellek's equivalence test is less practical and less adaptable for engineering applications, the TOST methods are applied in this thesis for the equivalence and non-inferiority tests of the hypotheses from the experiments.
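The TOST procedure for the equivalence hypotheses (D.5) can be sketched in Python with the standard library only. This is a minimal illustration under stated assumptions and not the Matlab implementation of the thesis: it uses a paired-sample Student's t-test for both one-sided tests, the helper `t_cdf` integrates the t-density numerically with Simpson's rule, and all function names and data values are hypothetical.

```python
import math
from statistics import mean, stdev


def t_cdf(t, dof, steps=2000):
    """CDF of Student's t distribution via Simpson integration of its density."""
    c = math.exp(math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2))
    c /= math.sqrt(dof * math.pi)

    def pdf(x):
        return c * (1.0 + x * x / dof) ** (-(dof + 1) / 2)

    upper = abs(t)
    h = upper / steps                      # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(upper)
    for k in range(1, steps):
        area += (4 if k % 2 else 2) * pdf(k * h)
    area *= h / 3.0
    return 0.5 + area if t >= 0 else 0.5 - area


def tost_paired(x1, x2, delta_lower, delta_upper, alpha=0.05):
    """Paired TOST: returns the larger one-sided p-value and the decision."""
    diffs = [a - b for a, b in zip(x1, x2)]
    n = len(diffs)
    m = mean(diffs)
    se = stdev(diffs) / math.sqrt(n)
    dof = n - 1
    # Test 1 rejects "M1 - M2 <= -delta_lower" (right-tailed).
    p_lower = 1.0 - t_cdf((m + delta_lower) / se, dof)
    # Test 2 rejects "M1 - M2 >= delta_upper" (left-tailed).
    p_upper = t_cdf((m - delta_upper) / se, dof)
    p = max(p_lower, p_upper)
    # Equivalence is concluded only if both one-sided tests reject.
    return p, p < alpha


# Hypothetical paired measurements of two methods.
method_1 = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, 0.53, 0.47]
method_2 = [0.51, 0.50, 0.49, 0.50, 0.50, 0.49, 0.52, 0.48]
print(tost_paired(method_1, method_2, 0.05, 0.05))
```

Reporting the larger of the two one-sided p-values is a common convention, since equivalence is only concluded when both one-sided null hypotheses are rejected. For a non-parametric variant, the two t-tests would be replaced by rank-based tests, as mentioned for [Lak17].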

In the programming languages Matlab and Python, equivalence test methods have limited availability. On the other hand, the programming language R provides more open-source libraries including the implementations of various equivalence test methods, see [Cal22]. In the course of this thesis, Matlab versions of these equivalence tests were implemented<sup>53</sup>.

**(a)** Equivalence testing with the equivalence interval, which is defined by the upper and lower margins ∆lower and ∆upper.

**(b)** Non-inferiority testing with the effect limit ∆inf

**Figure D.2:** The illustration of equivalence and non-inferiority tests. The main difference is that the non-inferiority test allows the superiority of M<sub>1</sub> compared to M<sub>2</sub>. On the other hand, an equivalence test necessitates a strict limitation of ∆M = M<sub>1</sub> − M<sub>2</sub> to the equivalence interval.

<sup>52</sup> The derivation of this density function is presented in [Wel10, Section 5.4]. Furthermore, it has to be computed numerically by means of a bisection method. Therefore, in [Wel10, Section 5.4], a tabular overview of the value pairs that are commonly used in medical research is given.

<sup>53</sup> Hypothesis testing for equivalence with Matlab, https://github.com/vargabalint92/Hypothesistesting-for-equivalence-with-matlab

# **D.2 Choosing the Proper Statistical Test Methods**

This section provides a short overview of the different test approaches and categories, in order to provide clarification of the chosen test methods in the experiments. This section is based on [VB99, DMI03] and [VA19, Chapter 1].

Statistical variables can be categorized into two main groups, *quantitative variables* and *categorical variables*:

- *Quantitative variables*
	- a) *Continuous* variables take values from a continuous (uncountable) range and are obtained by measurements, e. g. error to a reference, velocity, time or age
	- b) *Discrete* variables are obtained by counting, e. g. scores in an exam or goals in a soccer match
- *Categorical variables*
	- a) *Ordinal* variables have comparable values (a ranked order), e. g. Likert scale or age groups (0-18, 18-65, 65+)
	- b) *Nominal* variables are assigned to groups without a natural order, e. g. days of a week, names

In this thesis, the *independent variables* (also called manipulated variables) are the types of the controllers. They are nominal categorical variables. The *dependent variables* (also called output variables) are the results of the defined measures, which are specified in the corresponding sections. In the case of the subjective assessment, the controllers are evaluated by Likert scales, leading to ordinal dependent variables. As a consequence, the subjective measures need to be analyzed with non-parametric test methods, see [VB99]. In the case of the objective measures, the dependent variables are quantitative: In the first experiment, the distance from the reference is measured and in the third experiment the overall time necessary to finish the task. These are continuous variables. In contrast, the number of the collected boxes in the second experiment is a discrete dependent variable. First, the objective measures have to be tested for normality and for the homogeneity of their variances. These tests are carried out with the so-called *Shapiro-Wilk* test and the *F*-test.

The Shapiro-Wilk test is a statistical significance test of the hypothesis that the underlying measurement has a normal distribution. Its null hypothesis assumes that the data is normally distributed with unspecified mean and variance; the test is carried out with the significance level α<sup>SW</sup>. The alternative hypothesis is that the data does not have a normal distribution. If p<sup>SW</sup> < α<sup>SW</sup> holds, the null hypothesis is rejected and a normal distribution cannot be assumed. On the other hand, if the p-value of the Shapiro-Wilk test is larger than the significance level, the null hypothesis is not rejected and a normal distribution is assumed for the data. These tests are necessary to decide whether parametric or non-parametric statistical tests should be applied.

An F-test verifies whether two data sets with normal distributions have the same variance. Its null hypothesis is that the two data sets are normally distributed and have the same variance. The alternative hypothesis is that the two data sets are normally distributed and have different variances; the test is carried out with the significance level α<sup>F</sup>. If p<sup>F</sup> < α<sup>F</sup> holds, the null hypothesis is rejected and the equality of the variances cannot be assumed. On the other hand, if the p-value is larger than the significance level, the null hypothesis is not rejected and the equality of the variances of the two data sets is assumed.

In the evaluation, the Matlab (2021b) implementations of the tests are used<sup>54</sup>. In the following sections, the measurement data of the experiments are given to provide further explanations and to ensure the traceability of the results.

<sup>54</sup> For more details, the following links lead to the corresponding descriptions of the functions, last accessed on January 12th 2023:

Shapiro-Wilk: https://de.mathworks.com/matlabcentral/fileexchange/13964-shapiro-wilkand-shapiro-francia-normality-tests

F-test: https://de.mathworks.com/help/stats/vartest2.html

# **D.3 Experiment with Reference Trajectories**

### **D.3.1 Results**

The average deviations of the manipulator using the three controllers are given in Table D.1. They were tested with the Shapiro-Wilk test for the normality condition. The significance level of the Shapiro-Wilk test was chosen as α<sup>SW</sup> = 0.05. The obtained p-values are

$$p\_{\rm LISC}^{\rm SW} = 0.006, \ p\_{\rm NCC}^{\rm SW} = 8.3 \cdot 10^{-4} \text{ and } p\_{\rm FISC}^{\rm SW} = 7.3 \cdot 10^{-5},$$

which are all smaller than the significance level. Therefore, a normal distribution cannot be assumed. An additional graphical illustration of the objective results is given in Figure D.3. It can be seen that the results of the test subjects using LISC and FISC are similar in contrast to NCC.
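The decision that follows from these p-values can be written down as a small Python sketch. The p-values and the significance level are taken from the text above; the function names and the returned labels are hypothetical and serve only to illustrate the decision rule described in Section D.2.

```python
ALPHA_SW = 0.05
# Shapiro-Wilk p-values of the first experiment, as reported above.
p_sw = {"LISC": 0.006, "NCC": 8.3e-4, "FISC": 7.3e-5}


def normality_rejected(p_value, alpha=ALPHA_SW):
    """True if the null hypothesis of normally distributed data is rejected."""
    return p_value < alpha


def choose_test_family(p_values, alpha=ALPHA_SW):
    """Use non-parametric tests as soon as normality is rejected for one data set."""
    if any(normality_rejected(p, alpha) for p in p_values.values()):
        return "non-parametric"
    return "parametric"


print({name: normality_rejected(p) for name, p in p_sw.items()})
print(choose_test_family(p_sw))
```

For the reported values, normality is rejected for all three controllers, so that non-parametric tests are the appropriate choice for this data.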


**Table D.1:** The average errors of the manipulator in the first experiment in m

The subjective assessments, obtained by analyzing the three questions posed to the test subjects, are given in Table D.2.

**Figure D.3:** Box plot of the average error from the reference of the manipulator in the first experiment


**Table D.2:** The subjective results of the first experiment including three questions

### **D.3.2 Additional Trajectory Results**

Additional trajectories from further test subjects are given in the following. Figure D.4 shows the similarities between FISC and LISC, and Figure D.5 illustrates the benefits of LISC compared to NCC.

**Figure D.4:** Comparison of the trajectories of test subject 7 using FISC and LISC

**Figure D.5:** Comparison of the trajectories of test subject 8 using NCC and LISC

# **D.4 Lateral Trajectory-free Experiment**

This section documents the results of the second experiment, allowing the reader to retrace the analysis. First, the instruction pages and the questions of the second experiment are provided.

### **D.4.1 Instructions**

Figure D.6 and Figure D.7 show the explanation of the experiment and the instructions for the test subjects. Figure D.8 shows the intermediate questions.

**Figure D.6:** The instructions for the test subjects for the second experiment, page 1

**Figure D.7:** The instructions for the test subjects for the second experiment, page 2


**Figure D.8:** Intermediate questions of the second experiment


### **D.4.2 Results**

In the following, the results are given. Table D.3 presents the number of hit boxes and the average deviation of the vehicle from its reference for each test subject.

To test the normality condition of the data sets, Shapiro-Wilk tests were applied. The results are given in Table D.4 and show that a normal distribution of the data can be assumed. Afterwards, F-tests were applied to the results of LISC and MC. The obtained p-values for the comparison of LISC and MC are

$$p\_{\rm BS}^{\rm F} = 8.65 \cdot 10^{-8} \text{ and } p\_{d\_{\rm veh}}^{\rm F} = 2.2 \cdot 10^{-10},$$

which show that the variances of LISC and MC are not the same. Therefore, a Wilcoxon signed-rank test was applied to compare the objective measures of LISC with those of MC. The subjective evaluations of the controllers are given in Table D.5.
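The idea of the Wilcoxon signed-rank test for paired samples can be sketched as follows. This is a minimal illustration and not the implementation used in the evaluation: the function name `wilcoxon_signed_rank` and the paired values are hypothetical, zero differences are discarded, tied absolute differences receive average ranks, and the two-sided p-value is computed exactly by counting sign assignments.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W_plus, exact two-sided p-value) for the paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # discard zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences; ranks are stored doubled so that
    # average ranks of tied values remain integers.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks2 = [0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks2[order[k]] = i + j + 2  # twice the average of ranks i+1 .. j+1
        i = j + 1

    w_plus2 = sum(r for r, d in zip(ranks2, diffs) if d > 0)

    # counts[s]: number of sign assignments whose positive doubled-rank sum equals s
    total2 = sum(ranks2)
    counts = [0] * (total2 + 1)
    counts[0] = 1
    for r in ranks2:
        for s in range(total2, r - 1, -1):
            counts[s] += counts[s - r]

    n_assignments = 2 ** n
    p_low = sum(counts[: w_plus2 + 1]) / n_assignments
    p_high = sum(counts[w_plus2:]) / n_assignments
    return w_plus2 / 2.0, min(1.0, 2.0 * min(p_low, p_high))


# Hypothetical paired scores of five test subjects, not data from Table D.3
score_mc = [3, 5, 4, 6, 2]
score_lisc = [4, 7, 7, 10, 7]
w_plus, p_w = wilcoxon_signed_rank(score_lisc, score_mc)
print(f"W+ = {w_plus}, p = {p_w:.4f}")
```

In this hypothetical example all five differences are positive, so the rank sum is 15 and the exact two-sided p-value is 2/32 = 0.0625. Since the test only uses the ranks of the paired differences, it requires neither normality nor equal variances.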


**Table D.3:** The number of hit boxes (Box Score) and the average errors of the vehicle ($d\_{\rm veh}$) in the second experiment

**Table D.4:** Shapiro-Wilk tests of the hit boxes (Box Score) and the average errors of the vehicle ($d\_{\rm veh}$) testing the normality condition of the data



**Table D.5:** The results of the questions in the second experiment for the subjective assessment

### **D.4.3 Additional Trajectory Results**

Additional trajectories from further test subjects, which show the benefits of the proposed LISC, are given in Figure D.10 and Figure D.11.

**Figure D.10:** Comparison of the trajectories of test subject 6 using MC and LISC

**Figure D.11:** Comparison of the trajectories of test subject 13 using MC and LISC

# **D.5 Longitudinal Experiment**

This section presents the supplementary information for the longitudinal experiment with the LISC. The instructions were analogous to those of the second experiment. The overall times necessary to finish the task of the third experiment are presented in Table D.6. Table D.7 presents the results of the Shapiro-Wilk test for the normality condition of LISC and NCC. The subjective assessments of the controllers are given in Table D.8. Additional exemplary comparisons of the resulting trajectories of test subject 7 are given: Figure D.12 compares the relative positions of the manipulator and Figure D.13 compares the relative velocities of the manipulator using NCC and LISC.


**Table D.6:** The overall time in s necessary to finish the task of the third experiment for the two runs

**Table D.7:** The $p^{\rm SW}$ values of the Shapiro-Wilk test for the normality condition



**Table D.8:** The subjective assessment of the experiment with the longitudinal shared control.

**Figure D.12:** The relative position of the manipulator with LISC (blue solid line) and with NCC (red dashed line), obtained from the 7th test subject

**Figure D.13:** The relative longitudinal velocity of the manipulator with LISC (blue solid line) and with NCC (red dashed line), obtained from the 7th test subject


This work focuses on shared control for large vehicle manipulators used in unstructured environments such as road maintenance. The proposed approach automates the vehicle while allowing human operators to control the manipulator. This shared control setup explores limited information sharing between the two subsystems, where the automation lacks information about the manipulator. Furthermore, the book introduces a systematic design method for the concept of limited information shared control, utilizing two new subclasses of potential games for the parameter calculation. The shared control concept is applied and tested on a large vehicle manipulator through simulations and human-in-the-loop experiments, which demonstrate superior performance over manual and non-cooperative control setups. Thus, the practical applicability and the benefits of shared control for large vehicle manipulators are highlighted in this book.
