Matthias Bentert

# **Elements of Dynamic and 2-SAT Programming: Paths, Trees, and Cuts**


The scientific series *Foundations of computing* of the Technische Universität Berlin is edited by: Prof. Dr. Stephan Kreutzer, Prof. Dr. Uwe Nestmann, Prof. Dr. Rolf Niedermeier

Foundations of computing | 14


Universitätsverlag der TU Berlin

### **Bibliographic information published by the Deutsche Nationalbibliothek**

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the internet at http://dnb.dnb.de.

### **Universitätsverlag der TU Berlin, 2021**

http://www.verlag.tu-berlin.de

Fasanenstr. 88, 10623 Berlin Tel.: +49 (0)30 314 76131 / Fax: -76133 E-Mail: publikationen@ub.tu-berlin.de

Also a dissertation, Technische Universität Berlin, 2020. Reviewers: Prof. Dr. Rolf Niedermeier (TU Berlin), Prof. Dr. Cristina Bazgan (Université Paris Dauphine), Prof. Dr. Thore Husfeldt (Lund University & ITU Copenhagen). The thesis was successfully defended on 17 December 2020 at Faculty IV under the chairmanship of Prof. Dr. Stephan Kreutzer.

This work – except for quotes, figures and where otherwise noted – is licensed under the Creative Commons Licence CC BY 4.0 http://creativecommons.org/licenses/by/4.0

Cover image: *Aurora borealis* by Christine Bentert CC BY 4.0 | https://creativecommons.org/licenses/by/4.0

Print: docupoint GmbH Layout/Typesetting: Matthias Bentert

**ISBN 978-3-7983-3209-6 (print) ISBN 978-3-7983-3210-2 (online)**

**ISSN 2199-5249 (print) ISSN 2199-5257 (online)**

Published online on the institutional repository of the Technische Universität Berlin: DOI 10.14279/depositonce-11462 http://dx.doi.org/10.14279/depositonce-11462

## **Summary**

In this thesis, we develop faster exact algorithms (faster with respect to worst-case running time) for special cases of graph problems. These algorithms are mostly based on *dynamic programming* and *2-SAT programming*. Dynamic programming describes the procedure of recursively breaking down a problem into subproblems such that these subproblems share common subsubproblems. Once these subproblems have been solved optimally, the dynamic program combines their solutions into an optimal solution of the original problem. 2-SAT programming refers to the process of expressing a problem by a set of 2-SAT formulas (propositional formulas in conjunctive normal form in which each clause consists of at most two literals). Here, satisfying truth assignments for a subset of the 2-SAT formulas must correspond to a solution of the original problem. If a 2-SAT formula is satisfiable, then a satisfying truth assignment can be computed in time linear in the length of the formula. Hence, if suitable 2-SAT formulas can be constructed in time polynomial in the input size of the original problem, then the original problem can be solved in polynomial time. In the following, we describe the main results of the thesis.

The Diameter problem asks for the largest distance between any two vertices in a given undirected graph. The result (the diameter of the input graph) is among the most important parameters in graph analysis. In this thesis, we obtain both positive and negative results for Diameter. We focus on parameterized algorithms for parameter combinations that are small in many practical applications and on parameters that measure a *distance from triviality*.

The problem Length-Bounded Cut asks whether there is an edge set of bounded size in an input graph such that removing these edges increases the distance between two given vertices to a given minimum. In this thesis, we confirm a conjecture from the scientific literature that Length-Bounded Cut can be solved in time polynomial in the input size on unit interval graphs (interval graphs in which every interval has the same length). The algorithm is based on dynamic programming.

*k*-Disjoint Shortest Paths is the problem of finding vertex-disjoint paths between *k* given vertex pairs such that each of the *k* paths is a shortest path between its respective end vertices. We describe a dynamic program with a running time of *n*<sup>*O*((*k*+1)!)</sup> for this problem, where *n* is the number of vertices in the input graph. This shows that *k*-Disjoint Shortest Paths can be solved in polynomial time for every constant *k*, which had been an open problem in algorithmic graph theory for more than 20 years.

The problem Tree Containment asks whether a given phylogenetic tree *T* is contained in a given phylogenetic network *N*. A phylogenetic network (respectively, a phylogenetic tree) is a directed acyclic graph (respectively, a directed tree) with exactly one source in which each vertex has at most one outgoing or at most one incoming edge and each leaf carries a label. The problem stems from bioinformatics, from the area of the *search for the tree of life* (the history of speciation). We introduce a new variant of the problem, which we call Soft Tree Containment and which takes certain factors of uncertainty into account. Using 2-SAT programming, we show that Soft Tree Containment can be solved in polynomial time if *N* is a phylogenetic tree in which at most two leaves carry the same label. We complement this result with a proof that Soft Tree Containment is *NP*-hard even if *N* is restricted to phylogenetic trees in which at most three leaves carry the same label.

Finally, we consider the problem Reachable Object. Here, we search for a sequence of rational trades between agents such that a particular agent receives a particular object. A trade is rational if both agents involved prefer their new object over their respective old object. Reachable Object is a generalization of the well-known and well-studied problem Housing Market. Here, the agents are arranged in a graph and only neighboring agents can trade objects with each other. We show that Reachable Object is *NP*-hard even if each agent prefers at most three objects over its initial object, and that Reachable Object is polynomial-time solvable if each agent prefers at most two objects over its initial object. We also give a polynomial-time algorithm for the special case in which the graph of agents is a cycle. This polynomial-time algorithm is based on 2-SAT programming.

### **Abstract**

This thesis presents faster (in terms of worst-case running times) exact algorithms for special cases of graph problems through dynamic programming and 2-SAT programming. Dynamic programming describes the procedure of breaking down a problem recursively into overlapping subproblems, that is, subproblems with common subsubproblems. Given optimal solutions to these subproblems, the dynamic program then combines them into an optimal solution for the original problem. 2-SAT programming refers to the procedure of reducing a problem to a set of 2-SAT formulas, that is, Boolean formulas in conjunctive normal form in which each clause contains at most two literals. Computing whether such a formula is satisfiable (and computing a satisfying truth assignment, if one exists) takes linear time in the formula length. Hence, when satisfying truth assignments to some 2-SAT formulas correspond to a solution of the original problem and all formulas can be computed efficiently, that is, in polynomial time in the input size of the original problem, then the original problem can be solved in polynomial time. We next describe our main results.

Diameter asks for the maximal distance between any two vertices in a given undirected graph. It is arguably among the most fundamental graph parameters. We provide both positive and negative parameterized results for *distance-from-triviality*-type parameters and parameter combinations that were observed to be small in real-world applications.

In Length-Bounded Cut, we search for a bounded-size set of edges that intersects all paths between two given vertices of at most some given length. We confirm a conjecture from the literature by providing a polynomial-time algorithm for proper interval graphs which is based on dynamic programming.

*k*-Disjoint Shortest Paths is the problem of finding (vertex-)disjoint paths between given vertex terminals such that each of these paths is a shortest path between the respective terminals. Its complexity for constant *k* ≥ 3 has been an open problem for over 20 years. Using dynamic programming, we show that *k*-Disjoint Shortest Paths can be solved in polynomial time for each constant *k*.

The problem Tree Containment asks whether a phylogenetic tree *T* is contained in a phylogenetic network *N*. A phylogenetic network (or tree) is a leaf-labeled single-source directed acyclic graph (or tree) in which each vertex has in-degree at most one or out-degree at most one. The problem stems from computational biology in the context of the *tree of life* (the history of speciation). We introduce a particular variant that models certain types of uncertainty in the input. We show that if each leaf label occurs at most twice in a phylogenetic tree *N*, then the problem can be solved in polynomial time and if labels can occur up to three times, then the problem becomes *NP*-hard.

Lastly, Reachable Object is the problem of deciding whether there is a sequence of rational trades of objects among agents such that a given agent can obtain a certain object. A rational trade is a swap of objects between two agents where both agents profit from the swap, that is, they receive objects they prefer over the objects they trade away. This problem can be seen as a natural generalization of the well-known and well-studied Housing Market problem where the agents are arranged in a graph and only neighboring agents can trade objects. We prove a dichotomy result that states that the problem is polynomial-time solvable if each agent prefers at most two objects over its initially held object and it is *NP*-hard if each agent prefers at most three objects over its initially held object. We also provide a polynomial-time 2-SAT program for the case where the graph of agents is a cycle.

### **Preface**

This thesis contains some of the results of my research at the Technische Universität Berlin in the Algorithmics and Computational Complexity group headed by Prof. Rolf Niedermeier from January 2017 to September 2020. The presented findings are partially based on published papers and partially based on papers that are so far only available in the arXiv repository. Many of these results were prepared in close collaboration with my coauthors. These are (in alphabetical order) Jiehua Chen, Vincent Froese, Klaus Heeger, Dušan Knop, Josef Malík, André Nichterlein, Malte Renken, Mathias Weller, Gerhard J. Woeginger, and Philipp Zschoche.

In the following, I sketch the story behind the research projects corresponding to the different chapters and briefly state my respective contributions.

**Chapter 3.** After I finished my master's thesis in late 2016 in the young field of *FPT in P* and started my PhD program in 2017, André Nichterlein (TU Berlin) suggested exploring this field further. He asked me to choose between either Diameter or Maximum Flow to work on next and I chose Diameter. Most of the results featured in our conference paper ([BN19]), which I presented at the *11th International Conference on Algorithms and Complexity (CIAC '19)* in Rome, Italy, are based on my ideas and André Nichterlein helped polish both the results and the paper as a whole. An extended version featuring more details and all proofs is available in the arXiv repository and has been submitted to a journal.

**Chapter 4.** From September 2018 to September 2019 Dušan Knop (Czech Technical University in Prague) had a postdoctoral position in our group. He suggested studying the problem Length-Bounded Cut. Initially, he was interested in certain *W[1]*-hardness results and started working on it with Klaus Heeger (TU Berlin). I joined the project soon after. During our research we found that the computational complexity of solving Length-Bounded Cut on interval graphs and proper interval graphs was stated as an open problem in the literature. We showed that the problem is polynomial-time solvable on proper interval graphs and we also proved the *W[1]*-hardness results that we were initially looking for, that is, for the feedback vertex number and the combined parameter pathwidth plus maximum degree. The polynomial-time algorithm was mostly my contribution while the *W[1]*-hardness results are mostly due to Klaus Heeger. The corresponding paper ([BHK20]) was presented by Klaus Heeger at the *31st International Symposium on Algorithms and Computation (ISAAC '20)*, which was held virtually in December 2020. An extended version is available in the arXiv repository and has been submitted to a journal.

**Chapter 5.** Our group holds a research retreat each year. In September 2019 at the retreat in Schloss Neuhausen (Brandenburg, Germany), André Nichterlein suggested studying a problem variant of Disjoint Paths and Anne-Sophie Himmel, Malte Renken, André Nichterlein, Philipp Zschoche (all TU Berlin), and I started working on it there. During the retreat, we studied different versions of Disjoint Paths and decided that we wanted to tackle the version Disjoint Shortest Paths after the retreat. It was known from the literature that this problem is *NP*-hard when the number *k* of shortest paths in the solution is part of the input and it had been an open problem for over twenty years whether there exists a polynomial-time algorithm for constant values of *k*. For *k* = 2 an *O*(*n*<sup>8</sup>)-time algorithm was known, where *n* is the number of vertices in the input graph. Between December 2019 and January 2020, William Lochet (University of Bergen) and we independently answered the open question in the affirmative. William Lochet was the first to publish his paper at the *32nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '21)* [Loc21]. While his algorithm has a running time of *n*<sup>*O*(*k*<sup>5<sup>*k*</sup></sup>)</sup>, where the Landau notation hides a constant 9<sup>55</sup> in the exponent, we have since worked on improving the running time of our algorithm to *O*(*k* · *n*<sup>16*k*·*k*!+*k*+1</sup>). We also proved *W[1]*-hardness for Disjoint Shortest Paths with respect to *k*. All coauthors except for Anne-Sophie Himmel, who left academia shortly after the retreat and has withdrawn her authorship of the corresponding paper, have worked roughly equally on all parts of the paper. I was less involved in the *W[1]*-hardness result and instead designed a dynamic program for Disjoint Shortest Paths on directed acyclic graphs which is used as a subroutine in our main algorithm. I presented the corresponding paper at the *48th International Colloquium on Automata, Languages, and Programming (ICALP '21)* [Ben+21]. An extended version of the paper is available in the arXiv repository.

**Chapter 6.** At the retreat in April 2017 near Boiensdorf (Mecklenburg-Vorpommern, Germany) Mathias Weller (University of Paris-Est) presented a problem called Tree Containment that stems from computational biology. Josef Malík (Czech Technical University in Prague), Mathias Weller, and I started working on it. Unfortunately, we had only limited success during the retreat but Mathias Weller and I wanted to continue working on it after the retreat. Josef Malík also wanted to participate further but did not have the required time to do so. For this reason, most of the results were achieved in equal parts by Mathias Weller and me in close collaboration. I presented the corresponding paper ([BMW18]) at the *16th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT '18)* in June 2018 in Malmö, Sweden. An extended version is available in the HAL repository and has been accepted for publication in the *Journal of Graph Algorithms and Applications*.

**Chapter 7.** Rolf Niedermeier presented a paper on Reachable Object at the retreat in Darlingerode (Saxony-Anhalt, Germany) in March 2018. Jiehua Chen (TU Vienna), Vincent Froese (TU Berlin), Gerhard J. Woeginger (RWTH Aachen University), and I chose this problem to work on during the retreat. We achieved a few hardness results as well as a polynomial-time algorithm for short preference lists of all agents in close collaboration during the retreat. However, there was an intriguing open problem left when the input graph is a path that was described in the literature to be "at the frontier of tractability, despite its simplicity". Later that year, I resolved this case by providing a polynomial-time algorithm. A very similar algorithm was in the meantime developed independently by Sen Huang and Mingyu Xiao. We contacted the authors and invited them to merge the two papers but they declined because of Chinese regulations. Their paper was presented at the *33rd AAAI Conference on Artificial Intelligence (AAAI '19)* and is published in *Autonomous Agents and Multi-Agent Systems* [HX20]. We have since improved our algorithm to also work for cycles but so far the paper is only available in the arXiv repository [Ben+19a].

**Acknowledgement.** I am grateful to Rolf Niedermeier for giving me the opportunity to pursue my PhD in his group, for the many valuable lessons he taught me, and for the guidance he provided.

I am thankful to all my former and current colleagues and co-authors for countless hours of discussions and for providing an atmosphere in which working rarely felt like working: René van Bevern, Niclas Boehmer, Robert Bredereck, Markus Brill, Jiehua Chen, Alexander Dittmann, Till Fluschnik, Vincent Froese, Anne-Marie George, Roman Haag, Klaus Heeger, Christian Hofer, Anne-Sophie Himmel, Jonas Israel, Andrzej Kaczmarczyk, Leon Kellerhals, Dušan Knop, Tomohiro Koana, Junjie Luo, Josef Malík, Marcello G. Millani, Hendrik Molter, Marco Morik, Luis Müller, André Nichterlein, Rolf Niedermeier, Malte Renken, René Saitenmacher, Ulrike Schmidt-Kraepelin, Piotr Skowron, Manuel Sorge, Christline Thielcke, Mathias Weller, Gerhard J. Woeginger, and Philipp Zschoche.

Finally, I want to express my sincere gratitude towards my family, my friends, and my choirs who gave me the strength to endure the few bad days and joy and companionship on so many other days.



# **Chapter 1**

### **Introduction**

When confronted with a new problem, one of the first choices we face is to select a set of tools to tackle the problem with. Sometimes, none of the tools we know is useful for the task and we give up or we come up with a new (specialized) tool. Most often, however, (some of) the tools we already know are useful and our task becomes much easier once we have figured out the correct tool for the job. So how do we choose the correct tool? Do we need to try every possible tool? Of course not. Is it up to experience to decide on the correct tool? While experience definitely helps, there are oftentimes rules or heuristics we can follow that guide us to the correct tool. Finding these rules and heuristics is important as it helps us and others to save the time and effort of trying the wrong tools and of collecting years of experience before becoming efficient problem solvers.

While the above holds in general, we want to focus on algorithmic problems and exact algorithms in this thesis. The tools available for such tasks are numerous. When considering computationally easy problems (problems in *P*), we often start with *greedy* algorithms but also tools like *divide and conquer*, *dynamic programming*, or *modeling with a flow network* come to mind. When considering computationally hard (*NP*-hard) problems, we can use some of the previous tools or we can refer to tools like *branch and bound*, *backtracking*, *integer linear programming (ILP)*, *modeling as a SAT problem*, *color-coding*, or *data reduction*. All of the mentioned tools are very well understood and for each of them we know at least some rules of when to apply it. Divide and conquer and dynamic programming, for example, are the first choices when a problem can be decomposed into smaller instances of the same problem. This is not to say, however, that we know *everything* about these tools already. In this thesis, we investigate two tools in more depth. These are *dynamic programming* and *2-SAT programming*. 2-SAT programming is a tool that has been used in the literature much less than dynamic programming. For dynamic programming we will not find additional rules for *when* to use it but rather some rules of *how* to apply it. For 2-SAT programming, we will investigate where and how it was used so far, develop our own experiences by applying it to two problems, and conclude with two heuristics of when 2-SAT programming might be a good fit.

One might ask why we chose exactly these two tools. On the one hand, this is to a certain degree up to pure chance. These tools just happened to work well for the problems we studied in the past. On the other hand, since we worked with these tools quite successfully, we feel confident that we can add something to the topic.

In the following, we give an introduction to *dynamic programming* and *2-SAT programming*. We conclude this chapter with an overview of the results in this thesis.

### **1.1 Dynamic Programming**

Dynamic programming describes the procedure of recursively breaking down a problem into smaller *overlapping* subproblems and computing an optimal solution from solutions for these subproblems. Subproblems overlap if they have common subsubproblems. The analogous technique for non-overlapping subproblems is called *divide and conquer*. An example of divide and conquer is merge-sort, where in each step the array of numbers to sort is partitioned into parts that are sorted independently.
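To make the distinction concrete, the following minimal Python sketch contrasts the two techniques. It is an illustration only and not taken from the thesis: `merge_sort` splits its input into two independent halves, whereas `lis_length` (a hypothetical helper for Longest Increasing Subsequence) reuses the same subproblem `best_ending_at(j)` for many larger indices, which is why its results are cached.

```python
from functools import lru_cache


def merge_sort(values):
    """Divide and conquer: the two halves are independent subproblems."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    left, right = merge_sort(values[:mid]), merge_sort(values[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


def lis_length(values):
    """Dynamic programming: length of a longest strictly increasing subsequence."""

    @lru_cache(maxsize=None)  # each overlapping subproblem is solved only once
    def best_ending_at(i):
        # Longest increasing subsequence that ends exactly at position i.
        return 1 + max(
            (best_ending_at(j) for j in range(i) if values[j] < values[i]),
            default=0,
        )

    return max((best_ending_at(i) for i in range(len(values))), default=0)
```

Without the cache, `best_ending_at` would recompute the same entries exponentially often; with it, each of the entries is computed once, which is the table-filling view used in the remainder of this section.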

The term dynamic programming was coined around 1952 by Richard Bellman [Bel52]. Dynamic programming has since then become a staple of computer science which is taught in countless classes and books on algorithms, applied to computational problems such as Longest Common Subsequence, Longest Increasing Subsequence, Maximum Weight Independent Set on trees, and Approximate String Matching [Cor+09, Ski20]. It has been used in numerous fields including machine learning [BNK20, BSW89], computer vision [AWJ90], computational biology [Che+01, FT97, San00], and computational chemistry [Ari00, Gro+19]. It also had a large impact on parameterized algorithmics as *the* go-to tool for algorithms on tree decompositions of graphs [Bod88, LZ20, Mar20].

In the following, we will first give a general structure of how to apply dynamic programming. We then work through a standard example for dynamic programming using our general structure and finally describe how this structure will guide us throughout the first part of this thesis. Dynamic programming is most often achieved by filling a table where each entry stores the solution to some subproblem. There are four main questions that one should answer when developing a dynamic program:

1. What does a table entry represent?
2. What dimension shall the table have?
3. How to compute each table entry?
4. How can the solution of the original problem be computed once the table is completely filled?


We present a standard dynamic program for the problem Knapsack. For this problem, we are given a set *X* of *n* objects each with a positive integer weight (denoted by *w*) and a positive integer value (denoted by *v*) and two integers *B* and *k*. The question then is whether there is a subset of objects whose total weight is at most *B* and whose total value is at least *k*. Knapsack is known to be *NP*-complete but it allows for a pseudo-polynomial-time (polynomial if all numbers are encoded in unary) algorithm [TM90].

Without loss of generality, let *X* = {*o*<sub>1</sub>, *o*<sub>2</sub>, …, *o*<sub>*n*</sub>}. By answering the four questions above one by one, we explain how an existing *O*(*n* · *B* · *k*)-time dynamic program for Knapsack works [TM90].

1. What does a table entry represent?

Each entry in the table *T* represents a subproblem which is defined by a subset of objects and two bounds 1 ≤ *B*′ ≤ *B* and 1 ≤ *k*′ ≤ *k*. The value in each entry in *T* is a binary value storing whether the respective subproblem is a yes- or a no-instance.

2. What dimension shall the table have?

Let *X*<sub>*i*</sub> := {*o*<sub>1</sub>, *o*<sub>2</sub>, …, *o*<sub>*i*</sub>} be the set of the first *i* objects. The table *T* has entries for each subset *X*<sub>*i*</sub>, where *i* ∈ [*n*] and [*n*] := {1, 2, …, *n*}. Moreover, it has an entry for each combination of a subset *X*<sub>*i*</sub> and values 1 ≤ *B*′ ≤ *B* and 1 ≤ *k*′ ≤ *k*. The dimension or type of *T* is therefore

$$T \colon [n] \times [B] \times [k] \to \{\text{true}, \text{false}\}.$$

3. How to compute each table entry?

Initially, we set

$$T[1, B', k'] := \begin{cases} \text{true}, & \text{if } B' \ge w(o_1) \text{ and } k' \le v(o_1), \text{ and} \\ \text{false}, & \text{otherwise}. \end{cases}$$

Once all entries *T*[*i*, *B*′, *k*′] for a specific *i* are computed, we can compute an entry *t* := *T*[*i* + 1, *B*′, *k*′]. We do so by distinguishing between the three cases *B*′ < *w*(*o*<sub>*i*+1</sub>), *B*′ = *w*(*o*<sub>*i*+1</sub>), and *B*′ > *w*(*o*<sub>*i*+1</sub>). If *B*′ < *w*(*o*<sub>*i*+1</sub>), then

$$t := T[i, B', k'].$$

If *B*′ = *w*(*o*<sub>*i*+1</sub>), then

$$t := \begin{cases} \text{true}, & \text{if } k' \le v(o_{i+1}), \text{ and} \\ T[i, B', k'], & \text{otherwise}. \end{cases}$$

Finally, if *B*′ > *w*(*o*<sub>*i*+1</sub>), then

$$t := \begin{cases} \text{true}, & \text{if } k' \le v(o_{i+1}), \text{ and} \\ T[i, B', k'] \vee T[i, B' - w(o_{i+1}), k' - v(o_{i+1})], & \text{otherwise}. \end{cases}$$

The idea is the following. If there is already a solution of total weight at most *B*′ and total value at least *k*′ using only the first *i* objects, then this solution is also a solution for the instance corresponding to *T*[*i* + 1, *B*′, *k*′]. If no such solution exists, then the "new" object *o*<sub>*i*+1</sub> has to be part of every solution. If *B*′ < *w*(*o*<sub>*i*+1</sub>), then no solution exists in this case. If *B*′ = *w*(*o*<sub>*i*+1</sub>), then there is a solution (the set {*o*<sub>*i*+1</sub>}) if and only if *k*′ ≤ *v*(*o*<sub>*i*+1</sub>). If *B*′ > *w*(*o*<sub>*i*+1</sub>), then either the set {*o*<sub>*i*+1</sub>} is a solution (if *k*′ ≤ *v*(*o*<sub>*i*+1</sub>)) or any solution set *S* contains *o*<sub>*i*+1</sub> and *S*′ := *S* \ {*o*<sub>*i*+1</sub>} ≠ ∅ (if *k*′ > *v*(*o*<sub>*i*+1</sub>)) such that *S*′ is a solution for the problem corresponding to *T*[*i*, *B*′ − *w*(*o*<sub>*i*+1</sub>), *k*′ − *v*(*o*<sub>*i*+1</sub>)].

4. How can the solution of the original problem be computed once the table is completely filled?

If each table entry is computed correctly, then the original instance is by definition a yes-instance if and only if *T*[*n*, *B*, *k*] = true.

We skip the formal proof of correctness and the analysis of the running time [TM90].
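The four answers above can be sketched in a few lines of Python. This is a minimal illustration and not code from [TM90]: the function name `knapsack_decision` and the list-based input format are assumptions, and the sketch presumes *n*, *B*, *k* ≥ 1 and positive integer weights and values, as in the problem definition.

```python
def knapsack_decision(weights, values, B, k):
    """Return True iff some subset has total weight <= B and total value >= k.

    Assumes len(weights) == len(values) >= 1, B >= 1, k >= 1, and positive
    integer weights and values (hypothetical input format for illustration).
    """
    n = len(weights)
    # T[i][b][c] answers the subproblem on the first i objects with bounds
    # B' = b and k' = c; index 0 is unused to keep the 1-based indices of the text.
    T = [[[False] * (k + 1) for _ in range(B + 1)] for _ in range(n + 1)]

    # Initialization: only object o_1 is available.
    for b in range(1, B + 1):
        for c in range(1, k + 1):
            T[1][b][c] = b >= weights[0] and c <= values[0]

    # Compute row i + 1 from row i.
    for i in range(1, n):
        w, v = weights[i], values[i]  # weight and value of object o_{i+1}
        for b in range(1, B + 1):
            for c in range(1, k + 1):
                if b < w:
                    t = T[i][b][c]            # o_{i+1} does not fit
                elif c <= v:
                    t = True                  # {o_{i+1}} alone is a solution
                elif b == w:
                    t = T[i][b][c]            # nothing can be added to o_{i+1}
                else:
                    t = T[i][b][c] or T[i][b - w][c - v]
                T[i + 1][b][c] = t

    # The answer to the original instance is a single table lookup.
    return T[n][B][k]
```

For example, with weights `[2, 3, 4]`, values `[3, 4, 5]`, and *B* = 5, the first two objects reach value 7, so the call with *k* = 7 returns `True` while the call with *k* = 8 returns `False`. The three nested loops make the *O*(*n* · *B* · *k*) running time directly visible.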

The first part of this thesis is about dynamic programming. Therein, we study the problems Diameter, Length-Bounded Cut, and *k*-Disjoint Shortest Paths. The Diameter problem asks for the longest shortest path between two vertices in a given graph. In Length-Bounded Cut, we are given an undirected graph, two terminal vertices *s* and *t*, and two integers *k* and *ℓ*. The question is whether there is a set of at most *k* edges such that removing those edges yields a graph in which the distance between *s* and *t* is larger than *ℓ*. For the problem *k*-Disjoint Shortest Paths, we are given an undirected graph and *k* terminal pairs (*s*<sub>*i*</sub>, *t*<sub>*i*</sub>) and the question is whether there are *k* disjoint paths *P*<sub>*i*</sub> such that *P*<sub>*i*</sub> is a shortest path between *s*<sub>*i*</sub> and *t*<sub>*i*</sub>.

These problems are in some sense very similar as all of the problems deal with shortest paths in a given undirected graph but are also quite different from one another: On the one hand, Diameter is polynomial-time solvable while Length-Bounded Cut and *k*-Disjoint Shortest Paths are *NP*-hard. On the other hand, Length-Bounded Cut is about removing (cutting) parts from the graph while Diameter and *k*-Disjoint Shortest Paths are more about routing (finding specific shortest paths in a graph). As similar and yet different as the problems are, so are the algorithms we develop for each of them. The algorithms have in common that they are dynamic programs but they differ in which of our four guiding questions is hardest to answer for them. These respective questions therefore deserve additional consideration and these considerations will guide us through the first part of the thesis. In Chapter 3, we will study Diameter and we will encounter a dynamic program in which the dimension of the table is unusual as it partially depends on the optimal solution and can therefore not be determined a priori. In Chapter 4, we study Length-Bounded Cut. The dynamic program we develop there does not allow looking up the final answer in a specific table entry. Instead, the final answer is computed by iterating over a few specific table entries. Finally, in Chapter 5 we study *k*-Disjoint Shortest Paths and develop a dynamic program for it. The question of how to compute each table entry seems very easy to answer at first glance but it will turn out that considering it some more and ignoring some information given to us allows for a much faster algorithm.

## **1.2 2-SAT Programming**

2-SAT programming<sup>1</sup> refers to the procedure of efficiently reducing a problem to a set of Boolean formulas in 2-CNF (conjunctive normal form with at most two literals per clause) such that the solution for the original problem can be constructed from the solutions for the 2-SAT formulas (satisfying truth assignments or the fact that formulas are unsatisfiable). The technique has been used in a wide range of contexts, e.g., subgraph detection [HL00, Jan17], graph transformation [HHW03], matrix partitioning [Bul+16], computational biology [EHK03, GW09], resource allocation [HX20, MB20], and cartography [WW95]. However, we could not find many more examples of it being used and, to the best of our knowledge, 2-SAT programming has never been systematically analyzed as a general technique to solve computational problems. We start such an analysis by comparing how and when this technique was used in the literature so far. Before we do so, we first begin with an example of how to use 2-SAT programming.

We show how to solve the following special case of Independent Set in linear time in the input size.

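Boolean Multicolored Independent Set

**Input:** An undirected graph *G* = (*V*, *E*) in which each vertex has a color and each color is assigned to exactly two vertices.

**Question:** Is there a set of pairwise non-adjacent vertices that contains exactly one vertex of each color?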


Let *G* = (*V, E*) be an instance of Boolean Multicolored Independent Set where *u<sub>i</sub>* and *v<sub>i</sub>* are the two vertices of the *i*th color in *G*. Our constructed 2-SAT program consists only of a single formula Φ which contains a variable *x<sub>i</sub>* for each color. Setting *x<sub>i</sub>* to true corresponds to picking *u<sub>i</sub>* into the solution and setting *x<sub>i</sub>* to false corresponds to picking *v<sub>i</sub>* into the solution. The formula Φ consists of one clause for each edge {*y, z*} in *G*, and this clause evaluates to false if and only if both *y* and *z* are picked into the solution. Let *y* have the *i*th color and let *z* have the *j*th color, and let *i* ≠ *j* without loss of generality (an edge between *u<sub>i</sub>* and *v<sub>i</sub>* can be ignored since exactly one of these two vertices is picked). We distinguish between the four possible cases (i) *y* = *u<sub>i</sub>* and *z* = *u<sub>j</sub>*, (ii) *y* = *u<sub>i</sub>* and *z* = *v<sub>j</sub>*, (iii) *y* = *v<sub>i</sub>* and *z* = *u<sub>j</sub>*, and (iv) *y* = *v<sub>i</sub>* and *z* = *v<sub>j</sub>*. In the first case, the clause shall evaluate to false if and only if *x<sub>i</sub>* = true and *x<sub>j</sub>* = true. This is achieved by the clause ¬(*x<sub>i</sub>* ∧ *x<sub>j</sub>*) ≡ (¬*x<sub>i</sub>* ∨ ¬*x<sub>j</sub>*). Analogously, the clauses for the other three cases are (¬*x<sub>i</sub>* ∨ *x<sub>j</sub>*), (*x<sub>i</sub>* ∨ ¬*x<sub>j</sub>*), and (*x<sub>i</sub>* ∨ *x<sub>j</sub>*), respectively. An example of this construction is given in Figure 1.1. Note that the 2-SAT formula Φ can be computed in time linear in the size of *G*. Since Φ can be checked for a satisfying truth assignment in linear time (in the length of Φ, which is linear in the size of *G*) [APT79], the total running time is linear in the input size. It is easy to verify that a truth assignment satisfies Φ if and only if the vertices corresponding to this truth assignment are pairwise non-adjacent in *G*. Since each such set contains exactly one vertex of each color, each satisfying truth assignment of Φ corresponds to a solution of Boolean Multicolored Independent Set.

<sup>1</sup>We mention that this method of problem solving has been used only a few times in the literature before. The name *2-SAT programming* is not established in the literature.

**Figure 1.1:** An example instance of Boolean Multicolored Independent Set and the 2-SAT formula Φ constructed by our 2-SAT program. The encircled vertices form a solution and the edges are enumerated to allow easier verification of Φ. The first edge corresponds to the first clause in Φ, the second edge to the second clause in Φ, and so on. Note that the encircled vertices correspond to the truth assignment *x*<sub>1</sub> = true, *x*<sub>2</sub> = false, and *x*<sub>3</sub> = true. It is easy to verify that this is a satisfying truth assignment to Φ. The encircled solution indeed is the only solution for the given instance and the described truth assignment is the only satisfying truth assignment for Φ.
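The construction above can be sketched in code. The following is a minimal illustration and not the implementation used in the thesis: the vertex encoding `(color, side)`, the function names, and the choice of a Kosaraju-style strongly-connected-components routine as the linear-time 2-SAT solver are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

Vertex = Tuple[int, int]  # (color i in 1..k, side): side 0 is u_i, side 1 is v_i


def solve_two_sat(num_vars: int, clauses: List[Tuple[int, int]]) -> Optional[List[bool]]:
    """Literal +i stands for x_i and -i for its negation (variables are 1-based)."""
    def node(lit: int) -> int:
        return 2 * (abs(lit) - 1) + (1 if lit < 0 else 0)

    n = 2 * num_vars
    graph: List[List[int]] = [[] for _ in range(n)]
    reverse: List[List[int]] = [[] for _ in range(n)]
    for a, b in clauses:
        # (a or b) is equivalent to (not a -> b) and (not b -> a).
        for src, dst in ((node(-a), node(b)), (node(-b), node(a))):
            graph[src].append(dst)
            reverse[dst].append(src)

    # Pass 1: iterative depth-first search that records the finishing order.
    order: List[int] = []
    seen = [False] * n
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack = [(start, 0)]
        while stack:
            v, idx = stack.pop()
            if idx < len(graph[v]):
                stack.append((v, idx + 1))
                w = graph[v][idx]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(v)

    # Pass 2: label components on the reversed graph; the labels follow a
    # topological order of the components of the implication graph.
    comp = [-1] * n
    label = 0
    for start in reversed(order):
        if comp[start] != -1:
            continue
        comp[start] = label
        todo = [start]
        while todo:
            v = todo.pop()
            for w in reverse[v]:
                if comp[w] == -1:
                    comp[w] = label
                    todo.append(w)
        label += 1

    assignment: List[bool] = []
    for i in range(num_vars):
        if comp[2 * i] == comp[2 * i + 1]:
            return None  # x_i and its negation imply each other
        assignment.append(comp[2 * i] > comp[2 * i + 1])
    return assignment


def multicolored_independent_set(
    num_colors: int, edges: List[Tuple[Vertex, Vertex]]
) -> Optional[List[Vertex]]:
    clauses = []
    for (i, side_y), (j, side_z) in edges:
        if i == j:
            continue  # u_i and v_i are never picked together
        # u_i is picked iff x_i is true, so forbidding u_i needs literal "not x_i".
        clauses.append((-i if side_y == 0 else i, -j if side_z == 0 else j))
    assignment = solve_two_sat(num_colors, clauses)
    if assignment is None:
        return None
    return [(i + 1, 0 if value else 1) for i, value in enumerate(assignment)]
```

Each edge contributes exactly one clause, matching the four cases above, so both the formula and the implication graph have size linear in the size of *G*.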

We conclude this introduction to 2-SAT programming with an analysis of how and when 2-SAT programming was used in the literature before and how our two new results fit into this picture. The majority of results that we could find that used 2-SAT programming ([Bul+16, EHK03, GW09, HHW03, HL00, WW95]) used it as follows. Variables describe whether or not to pick some element into a solution set and the clauses in each 2-SAT formula prevent conflicting elements from being chosen into the same solution. In the remaining examples ([HX20, Jan17, MB20]), variables did not represent whether or not to pick some element into a solution but rather which element is picked into a solution (as in our example above). By exploring 2-SAT programming more in-depth in the thesis, we hope to find some indications of when 2-SAT programming should be considered for new (algorithmic) problems. Indeed, we will conclude that 2-SAT programming is promising when the considered problem is (thought to be) polynomial-time solvable and has some *independence structure*. By that, we mean that a solution consists of some elements that can


In the example above, the elements were the vertices of the graph, the partition was achieved by the colors of vertices, and a set of vertices forms a solution only if they are pairwise non-adjacent.

We present two new examples of 2-SAT programming in the second part of the thesis. These examples follow the distinction of 2-SAT programs stated above. In Chapter 6, we study a problem from computational biology called Tree Containment that asks whether a specific subtree exists in a given directed graph. Roughly speaking, our approach is to introduce a variable for each vertex in the input network that is set to true if the vertex belongs to the sought subtree and to false otherwise. In Chapter 7, we investigate Reachable Object, a problem stemming from the field of resource allocation. Therein, agents initially own objects, have different preferences over the objects, and are arranged in a social network. They may swap objects with one another under certain conditions, including that they must be adjacent in the social network. The question is then whether a specific agent can obtain a given target object. In a cycle, each object is given from the agent that initially holds it to one of its two possible neighbors. The variables in the 2-SAT program we develop in Chapter 7 then represent, for each object, to which of the two respective agents it is swapped.

We conclude this introduction to 2-SAT programming by describing a similarity and a dissimilarity between the two 2-SAT programs we develop in the thesis. They have in common that they are designed for very sparse graphs (trees in Chapter 6 and cycles in Chapter 7) and they differ in how they generalize to "*k*-SAT programs". While the algorithm for Tree Containment generalizes naturally, for any constant *k*, to a correct algorithm for a meaningful problem, the same cannot be said about the algorithm for Reachable Object.

### **1.3 Results**

In this thesis, we design and analyze algorithms for (mostly *NP*-hard) graph problems. We achieve a wide range of different results, among others, parameterized hardness and algorithms, polynomial-time algorithms for special cases, and results within *FPT in P*, that is, parameterized algorithms and hardness results for problems in *P* [GMN17]. We also resolve some open problems from the literature.

In Chapter 3, we study the Diameter problem, which asks for the length of a longest shortest path between any two vertices in a given graph. This parameter was observed to be very small in many different real-world applications [LH08, Mil67, New03] and it is often used in network analysis [AJB99, WF94]. This has led to a wide spectrum of algorithms computing the diameter faster than the naïve algorithm (see Zwick [Zwi01]). We add to this spectrum by providing new parameterized algorithms for computing the diameter. On the one hand, we study *distance-from-triviality*-like parameters [GHN04] and show that graphs with small modulators to cographs, that is, small sets of vertices whose removal yields a cograph, allow for faster diameter computations while graphs with small modulators to bipartite graphs do not. On the other hand, we study parameter combinations that are expected to be small in real-world applications. Here, we show that the combined parameter *h*-index plus diameter allows for positive *FPT-in-P* results whilst similar combinations under standard complexity assumptions do not. The algorithms for graphs with small modulators to cographs and for the combined parameter *h*-index plus diameter are both based on dynamic programming.

In Chapter 4, we study the problem Length-Bounded Cut which arises from the field of network flows. Given an undirected graph, two terminal vertices *s* and *t*, and two integers *k* and *ℓ*, the question is whether there is a set of at most *k* edges such that removing these edges yields a graph in which the distance between *s* and *t* is larger than *ℓ*. We prove a conjecture by Bazgan et al. [Baz+19] by providing a polynomial-time algorithm for Length-Bounded Cut on proper interval graphs which is based on dynamic programming. We also briefly investigate interval graphs and show limitations of our approach for proper interval graphs.

In Chapter 5, we look at a long-standing open question regarding the complexity of *k*-Disjoint Shortest Paths for constant *k* [Eil98, Fom+19]. In *k*-Disjoint Shortest Paths, we are given an undirected graph and *k* terminal pairs (*s<sub>i</sub>*, *t<sub>i</sub>*), and the question is whether there are *k* disjoint paths *P<sub>i</sub>* such that *P<sub>i</sub>* is a shortest path between *s<sub>i</sub>* and *t<sub>i</sub>*. We present an algorithm whose running time is polynomial in the input size for each constant *k*. The algorithm is based on dynamic programming and a geometric representation of the problem that is quite intuitive yet, to the best of our knowledge, novel.

In Chapter 6, we investigate a problem variant of Tree Containment which stems from computational biology. Given a leaf-labeled directed acyclic graph *N* (called a phylogenetic network) and a leaf-labeled directed tree *T*, the question in Tree Containment is whether *N displays T*. This is the case if *N* contains a subdivision of *T* as a subgraph that respects leaf-labels [ISS10]. A version of Tree Containment where *N* is a tree is used in the quest for finding the "tree of life", that is, given the current knowledge of speciation (modeled as a directed tree *N*) and some new data (modeled as another (possibly smaller) directed tree *T*), the question is whether *N* and *T* are consistent. We call the version we investigate Soft Tree Containment. It is motivated by soft polytomies, that is, multiple speciation events whose order is unknown. Another kind of uncertainty can be modeled by allowing *N* to have multiple leaves with the same label. Our main contribution is a dichotomy result regarding the maximal number of occurrences of a label in *N*. On the one hand, using 2-SAT programming, we show that Soft Tree Containment is polynomial-time solvable if *N* is a tree in which each leaf-label occurs at most twice. On the other hand, we show that Soft Tree Containment remains *NP*-hard when restricted to trees in which each leaf-label occurs at most thrice.

In Chapter 7, we study a problem called Reachable Object. Therein, one is given a set of agents, a set of objects, a specific agent *I*, and a specific object *x*. Each agent has strict preferences over the objects and initially owns exactly one object. Additionally, the agents are arranged in a graph (social network) representing which agents know each other. The question is then whether there is a sequence of *rational swaps* such that agent *I* owns object *x* in the end [GLW17]. A rational swap is a trade between two agents that know each other such that both agents receive an object they prefer over the object they give away. Our contribution is twofold. First, we present a dichotomy result regarding the number of objects each agent prefers over its initially held object. If each agent prefers at most two objects over the one it initially holds, then Reachable Object can be solved in polynomial time using dynamic programming. The problem remains *NP*-hard even if each agent prefers at most three objects over its initially held object. Second, using 2-SAT programming, we provide a polynomial-time algorithm for Reachable Object on cycles which is a generalization of a previous algorithm for Reachable Object on paths [HX20]. The original algorithm for paths answered an open problem from the literature [GLW17].

Finally, we summarize our main results and give a broader overview of possible avenues for further research regarding dynamic programming and 2-SAT programming in Chapter 8.

# **Chapter 2**

### **Preliminaries**

In this chapter, we describe our notation and some general tools that will be used in the following chapters. If a specific notion is only used in a single chapter, then it will be introduced there. We assume familiarity with the basics of set theory, calculus, and the description and analysis of algorithms.

### **2.1 Number Theory**

We denote by ℤ := {…, −1, 0, 1, …} the set of all integers, by ℕ := {0, 1, 2, …} the set of all non-negative integers, and by ℕ<sup>+</sup> := ℕ \ {0} the set of all positive integers. We use ℚ := {*p*/*q* | *p* ∈ ℤ ∧ *q* ∈ ℕ<sup>+</sup>} to denote the set of all rational numbers and ℚ<sup>+</sup><sub>0</sub> := {*p*/*q* | *p, q* ∈ ℕ<sup>+</sup>} to denote the set of all positive rational numbers. The set of all real numbers is denoted by ℝ.

For two integers *a, b* ∈ ℤ, we denote by [*a, b*] the integer *interval* between *a* and *b*, that is, [*a, b*] := {*i* ∈ ℤ | *a* ≤ *i* ≤ *b*}. Analogously, we denote the rational *interval* between *a* and *b* by [*a, b*]<sub>ℚ</sub> := {*i* ∈ ℚ | *a* ≤ *i* ≤ *b*}. For *a* > *b*, let [*a, b*] := [*a, b*]<sub>ℚ</sub> := ∅. Finally, for a positive integer *ℓ* ∈ ℕ<sup>+</sup>, we use [*ℓ*] as an abbreviation for [1, *ℓ*] = {1, 2, …, *ℓ*}.

### **2.2 Graph Theory**

An undirected graph *G* is a tuple (*V, E*) where *V* is the set of vertices or nodes and *E* ⊆ $\binom{V}{2}$ (the set of all two-element subsets of *V*) is the set of edges. A directed graph is a tuple (*V, A*) where *V* is again the set of vertices or nodes and *A* ⊆ {(*u, v*) | *u* ≠ *v* ∧ *u, v* ∈ *V*} is the set of arcs. We will use *n* := |*V*| to denote the number of vertices, *m* := |*E*| (*m* := |*A*|) to denote the number of edges or arcs, and |*G*| := *n* + *m* to denote the *size* of *G*. All graphs in this thesis are undirected unless explicitly stated otherwise.

For a vertex subset *V*′ ⊆ *V*, we denote by *G*[*V*′] the graph induced by *V*′, that is, the graph *G*[*V*′] := (*V*′, *E*′ := {{*u, v*} ∈ *E* | *u, v* ∈ *V*′}) if *G* is an undirected graph and *G*[*V*′] := (*V*′, *A*′ := {(*u, v*) ∈ *A* | *u, v* ∈ *V*′}) if *G* is directed. We abbreviate *G* − *V*′ := *G*[*V* \ *V*′]. A path *P* := (*v*<sub>0</sub>, …, *v<sub>ℓ</sub>*) in a directed graph is a graph with a set *V*(*P*) := {*v*<sub>0</sub>, …, *v<sub>ℓ</sub>*} of vertices and arc set *A*(*P*) := {(*v<sub>i</sub>*, *v*<sub>*i*+1</sub>) | 0 ≤ *i* < *ℓ*}. A path *P* := (*v*<sub>0</sub>, …, *v<sub>ℓ</sub>*) in an undirected graph is a graph with a set *V*(*P*) := {*v*<sub>0</sub>, …, *v<sub>ℓ</sub>*} of vertices and a set *E*(*P*) := {{*v<sub>i</sub>*, *v*<sub>*i*+1</sub>} | 0 ≤ *i* < *ℓ*} of edges. We say that *ℓ* is the *length* of *P* and a *shortest path* between two vertices is a path of minimum length. We define *A*(*P*) to be the set of arcs {(*v*<sub>*i*−1</sub>, *v<sub>i</sub>*) | *i* ∈ [*ℓ*]} and *A*<sup>−1</sup>(*P*) to be the set of arcs {(*v<sub>i</sub>*, *v*<sub>*i*−1</sub>) | *i* ∈ [*ℓ*]}. Intuitively, *A*(*P*) and *A*<sup>−1</sup>(*P*) describe the two directed versions of *P* in an undirected graph. The vertices *v*<sub>0</sub> and *v<sub>ℓ</sub>* are called the *end vertices* or *ends* of *P* and are denoted by start(*P*) and end(*P*). We also say that *P* is a path *from v*<sub>0</sub> *to v<sub>ℓ</sub>*, a path *between v*<sub>0</sub> and *v<sub>ℓ</sub>*, or a *v*<sub>0</sub>-*v<sub>ℓ</sub>*-path. When no ambiguity arises, we do not distinguish between a path and its set of vertices. We identify specific paths by just some of their vertices, e. g. we use the name *a-b-c-path* to denote a path that starts in *a*, then continues by some shortest *a*-*b*-path, and ends with some shortest *b*-*c*-path.

Let *v, w* be two vertices in a path *P*. We denote by *P*[*v, w*] the subpath of *P* with end vertices *v* and *w*. For two paths *P*<sub>1</sub> := (*v*<sub>0</sub>, …, *v<sub>a</sub>*) and *P*<sub>2</sub> := (*v*′<sub>0</sub>, …, *v*′<sub>*b*</sub>) with *v*′<sub>0</sub> = *v<sub>a</sub>* or {*v<sub>a</sub>*, *v*′<sub>0</sub>} ∈ *E* ((*v<sub>a</sub>*, *v*′<sub>0</sub>) ∈ *A*), we define *P*<sub>1</sub> • *P*<sub>2</sub> := (*v*<sub>0</sub>, …, *v<sub>a</sub>*, *v*′<sub>1</sub>, …, *v*′<sub>*b*</sub>) or *P*<sub>1</sub> • *P*<sub>2</sub> := (*v*<sub>0</sub>, …, *v<sub>a</sub>*, *v*′<sub>0</sub>, …, *v*′<sub>*b*</sub>), respectively. For two vertices *u, v* ∈ *V*, we denote by dist<sub>*G*</sub>(*u, v*) the distance between *u* and *v* in *G*, that is, the number of edges in a shortest path between *u* and *v*. If *G* is clear from the context, then we omit the subscript. A *connected component C* ⊆ *V* in a graph *G* is a maximal set of vertices such that there is a path between each pair of vertices in *C*.

The *degree* deg<sub>*G*</sub>(*v*) of a vertex *v* ∈ *V* in an undirected graph *G* is the number of edges that contain *v*. The *in-degree* of a vertex *v* ∈ *V* in a directed graph is the number of arcs of the form (*u, v*) for *u* ∈ *V*. A vertex with in-degree zero is called a *source*. The *out-degree* of a vertex *v* ∈ *V* in a directed graph is the number of arcs of the form (*v, w*) for *w* ∈ *V*. A vertex with out-degree zero is called a *sink*. The degree of a vertex *v* ∈ *V* in a directed graph is the sum of its in-degree and its out-degree. The *neighborhood N<sub>G</sub>*(*v*) of a vertex is the set of all vertices that share an edge (or arc) with *v* in *G* and we use *N<sub>G</sub>*[*v*] to denote *N<sub>G</sub>*(*v*) ∪ {*v*}. Again, if *G* is clear from the context, then we omit the subscript. *Suppressing* a degree-two vertex *v* ∈ *V* in an undirected graph *G* refers to the action of removing the vertex *v* from *G* and adding the edge between *v*'s two neighbors *u, w* if it is not already contained in *G*. Suppressing a vertex *v* ∈ *V* in a directed graph *G* = (*V, A*) with in-degree one and out-degree one refers to the action of removing the vertex *v* from *G* and adding the arc (*u, w*) where (*u, v*), (*v, w*) ∈ *A*. Again, if this arc was already present in *A*, then we just remove *v* with its two incident arcs. *Subdividing* an edge {*u, w*} in an undirected graph refers to the action of removing {*u, w*} and adding a new vertex *v* and new edges {*u, v*} and {*v, w*}. Subdividing an arc (*u, w*) in a directed graph refers to the action of removing the respective arc and adding a new vertex *v* and new arcs (*u, v*) and (*v, w*).
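The two operations for undirected graphs can be illustrated by a short sketch. The adjacency-set representation (a dictionary mapping each vertex to the set of its neighbors) and the function names are assumptions made for this example only.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # vertex -> set of neighbors


def subdivide(graph: Graph, u: Hashable, w: Hashable, new_vertex: Hashable) -> None:
    """Replace the edge {u, w} by the two edges {u, new_vertex} and {new_vertex, w}."""
    if w not in graph[u]:
        raise ValueError("the edge to subdivide is not in the graph")
    if new_vertex in graph:
        raise ValueError("the new vertex already exists")
    graph[u].remove(w)
    graph[w].remove(u)
    graph[new_vertex] = {u, w}
    graph[u].add(new_vertex)
    graph[w].add(new_vertex)


def suppress(graph: Graph, v: Hashable) -> None:
    """Remove the degree-two vertex v and join its two neighbors by an edge."""
    if len(graph[v]) != 2:
        raise ValueError("only degree-two vertices can be suppressed")
    u, w = graph.pop(v)
    graph[u].discard(v)
    graph[w].discard(v)
    # Sets ignore duplicates, so nothing changes if {u, w} is already present.
    graph[u].add(w)
    graph[w].add(u)
```

Suppressing the vertex created by a subdivision restores the original graph, whereas subdividing after suppressing may differ from the original if the edge {*u, w*} was already present.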

We continue with some notation for *directed acyclic graphs (DAGs)*. We call a vertex *d* in a DAG *G* a *descendant* of another vertex *a* if there is an *a*-*d*-path in *G*. Moreover, we call *a* an *ancestor* of *d*. For a DAG *G*, let <<sub>*G*</sub> be a relation between vertices in *G* such that *v* <<sub>*G*</sub> *u* if and only if *u* is an ancestor of *v*. Moreover, let *u* ≤<sub>*G*</sub> *v* if and only if *u* <<sub>*G*</sub> *v* or *u* = *v*. We define *G<sub>v</sub>* to be the subgraph of *G* induced by {*u* | *u* ≤<sub>*G*</sub> *v*}. The set of *least common ancestors* LCA<sub>*G*</sub>(*X*) of a set *X* of vertices contains all minima with respect to ≤<sub>*G*</sub> among all vertices *u* of *G* with *v* ≤<sub>*G*</sub> *u* for all *v* ∈ *X*. In particular, if *G* is a tree, then LCA<sub>*G*</sub>(*X*) contains a single vertex. If *G* is clear from the context, then we may drop the subscript.

Two undirected graphs *G* := (*V<sub>G</sub>*, *E<sub>G</sub>*) and *H* := (*V<sub>H</sub>*, *E<sub>H</sub>*) are *isomorphic* if there is a bijection *f* between *V<sub>G</sub>* and *V<sub>H</sub>* such that for any two vertices *u, v* ∈ *V<sub>G</sub>* it holds that {*u, v*} ∈ *E<sub>G</sub>* if and only if {*f*(*u*), *f*(*v*)} ∈ *E<sub>H</sub>*. Analogously, two directed graphs *G* := (*V<sub>G</sub>*, *A<sub>G</sub>*) and *H* := (*V<sub>H</sub>*, *A<sub>H</sub>*) are isomorphic if there is a bijection *f* between *V<sub>G</sub>* and *V<sub>H</sub>* such that for any two vertices *u, v* ∈ *V<sub>G</sub>* it holds that (*u, v*) ∈ *A<sub>G</sub>* if and only if (*f*(*u*), *f*(*v*)) ∈ *A<sub>H</sub>*. We call *f* the *mapping* between *G* and *H*.

### **2.2.1 Graph Classes**

A *tree* is a connected acyclic (directed or undirected) graph, that is, a graph in which each pair of vertices is connected by a unique path. A *rooted tree* is a tree *T* with a designated vertex *r* called the *root* of *T*. The *depth* of a vertex *v* in a rooted tree is the distance between *v* and *r*. The *height* of a vertex *v* in a rooted tree is the maximum distance between *v* and a leaf *ℓ* in *T* such that *v* is contained in a shortest *r*-*ℓ*-path. A *forest* is a graph in which each connected component is a tree.

**Figure 2.2:** An example of a generalized caterpillar with hair length two. The topmost vertices form the central path and the paths below are the *hairs*.

A *clique* is a graph *G* = (*V, E*) with *E* = {{*u, v*} | *u, v* ∈ *V*, *u* ≠ *v*}. A graph is *bipartite* if its vertex set can be partitioned into two sets *V*<sub>1</sub>, *V*<sub>2</sub> such that for each edge {*u, v*} ∈ *E* it holds that *u* ∈ *V*<sub>1</sub> and *v* ∈ *V*<sub>2</sub> (or *u* ∈ *V*<sub>2</sub> and *v* ∈ *V*<sub>1</sub>). Analogously, a graph is *k-partite* if its vertex set can be partitioned into *k* sets *V*<sub>1</sub>, *V*<sub>2</sub>, …, *V<sub>k</sub>* such that it holds for each edge {*u, v*} ∈ *E* that *u* and *v* are not contained in the same vertex set *V<sub>i</sub>*.

An *interval graph* is a graph *G* = (*V, E*) such that each vertex *v* can be represented by a rational interval [*b<sub>v</sub>*, *f<sub>v</sub>*]<sub>ℚ</sub> such that two vertices *u, w* are adjacent in *G* if and only if [*b<sub>u</sub>*, *f<sub>u</sub>*]<sub>ℚ</sub> ∩ [*b<sub>w</sub>*, *f<sub>w</sub>*]<sub>ℚ</sub> ≠ ∅. A *proper interval graph* is an interval graph such that there are no two vertices *v* and *w* such that [*b<sub>v</sub>*, *f<sub>v</sub>*]<sub>ℚ</sub> ⊂ [*b<sub>w</sub>*, *f<sub>w</sub>*]<sub>ℚ</sub>. Equivalently, a proper interval graph can be defined as an interval graph where each interval has length one, i. e., *b<sub>v</sub>* + 1 = *f<sub>v</sub>* for each vertex *v* (see e. g. [BLS99]). An example of an interval graph and its interval representation is given in Figure 2.1. A *cograph* is a graph that does not contain a *P*<sub>4</sub> (a path of four vertices and three edges) as an induced subgraph. A *caterpillar* is a tree such that removing all leaves yields a path (i. e., all vertices are within distance at most one of a "central path"). A *generalized caterpillar* with hairs of length at most *h* ≥ 1 is a tree such that removing paths of length at most *h* yields a path. A generalized caterpillar with hairs of length two is shown in Figure 2.2.
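To make the definition of interval graphs concrete, the following sketch builds the edge set from a given interval representation and checks whether that representation is proper. The dictionary-based input format and the function names are assumptions for illustration. Note that the second function only inspects the given representation and does not decide whether the graph admits some proper representation.

```python
from itertools import combinations, permutations
from typing import Dict, FrozenSet, Hashable, Set, Tuple

Intervals = Dict[Hashable, Tuple[float, float]]  # vertex -> (b_v, f_v) with b_v <= f_v


def interval_graph_edges(intervals: Intervals) -> Set[FrozenSet[Hashable]]:
    """Two vertices are adjacent iff their closed intervals intersect."""
    edges = set()
    for (u, (bu, fu)), (w, (bw, fw)) in combinations(intervals.items(), 2):
        if max(bu, bw) <= min(fu, fw):
            edges.add(frozenset((u, w)))
    return edges


def is_proper_representation(intervals: Intervals) -> bool:
    """True iff no interval is strictly contained in another one."""
    for (_, (bu, fu)), (_, (bw, fw)) in permutations(intervals.items(), 2):
        if bw <= bu and fu <= fw and (bu, fu) != (bw, fw):
            return False
    return True
```

For instance, the intervals [0, 2], [1, 3], and [4, 5] yield a graph with a single edge between the first two vertices.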

### **2.2.2 Graph Parameters**

The *maximum degree* of a graph *G* = (*V, E*) is the maximum number of incident edges to any single vertex in the graph, that is, max{deg(*v*) | *v* ∈ *V*}. Analogously, the *minimum degree* is defined as min{deg(*v*) | *v* ∈ *V*} and the *average degree* of a graph is 2*m*/*n*. We denote by *d*(*G*) the *diameter* of *G*, that is, the length of the longest shortest path in *G*. The *h-index* of a graph *G* is the maximum number *h* such that the graph contains at least *h* vertices of degree at least *h*.
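As an illustration of the last two parameters, the following sketch computes the diameter with the naïve approach (one breadth-first search per vertex) and the *h*-index by sorting the degrees. The adjacency-dictionary input and the decision to reject disconnected graphs are assumptions of this example, not conventions taken from the thesis.

```python
from collections import deque
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # vertex -> set of neighbors


def bfs_distances(adj: Graph, source: Hashable) -> Dict[Hashable, int]:
    """Distances from source to all vertices reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def diameter(adj: Graph) -> int:
    """Naive O(n * (n + m))-time computation; assumes a connected graph."""
    best = 0
    for v in adj:
        dist = bfs_distances(adj, v)
        if len(dist) != len(adj):
            raise ValueError("graph is not connected")
        best = max(best, max(dist.values()))
    return best


def h_index(adj: Graph) -> int:
    """Largest h such that at least h vertices have degree at least h."""
    degrees = sorted((len(neighbors) for neighbors in adj.values()), reverse=True)
    h = 0
    for rank, degree in enumerate(degrees, start=1):
        if degree < rank:
            break
        h = rank
    return h
```

On a path with four vertices, the diameter is three and the *h*-index is two, since exactly two vertices have degree at least two.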

One of the most famous graph parameters is the *treewidth*. It is defined through *tree decompositions*. A tree decomposition of a graph *G* = (*V, E*) is a tree *T* = (𝒳, *E*′), where each *X<sub>i</sub>* ∈ 𝒳 = {*X*<sub>1</sub>, *X*<sub>2</sub>, …, *X<sub>ℓ</sub>*} is a subset of *V* and the following three properties hold. First, each vertex *v* ∈ *V* is contained in at least one *X<sub>i</sub>* ∈ 𝒳. Second, for each vertex *v* ∈ *V*, the set of all *X<sub>i</sub>* with *v* ∈ *X<sub>i</sub>* induces a connected subgraph in *T*. Third, for every edge {*u, v*} ∈ *E*, there is a subset *X<sub>i</sub>* that contains both *u* and *v*. The *width* of a tree decomposition is max{|*X*| | *X* ∈ 𝒳} − 1 and the treewidth of *G* is the minimum width among all possible tree decompositions of *G*. The *pathwidth* of a graph is defined similarly to its treewidth, but instead of tree decompositions only path decompositions are considered, that is, the tree *T* is required to be a path.
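The three properties can be checked mechanically. The sketch below verifies a candidate tree decomposition and computes its width. The representation (bags as a dictionary from bag identifiers to vertex sets, the tree as a list of pairs of bag identifiers) and all names are assumptions for illustration. The code only verifies a given decomposition and does not compute the treewidth.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def _connected(nodes: Set[Hashable], adj: Dict[Hashable, Set[Hashable]]) -> bool:
    """True iff the non-empty set `nodes` induces a connected subgraph."""
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y in nodes and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == nodes


def is_tree_decomposition(
    vertices: Iterable[Hashable],
    edges: Iterable[Tuple[Hashable, Hashable]],
    bags: Dict[Hashable, Set[Hashable]],
    tree_edges: List[Tuple[Hashable, Hashable]],
) -> bool:
    adj: Dict[Hashable, Set[Hashable]] = {b: set() for b in bags}
    for a, b in tree_edges:
        adj[a].add(b)
        adj[b].add(a)
    # The bags together with tree_edges must form a tree.
    if len(tree_edges) != len(bags) - 1 or not _connected(set(bags), adj):
        return False
    # Properties one and two: the bags containing v are non-empty and connected.
    for v in vertices:
        holders = {b for b, bag in bags.items() if v in bag}
        if not _connected(holders, adj):
            return False
    # Property three: every edge is contained in some bag.
    return all(any(u in bag and v in bag for bag in bags.values()) for u, v in edges)


def width(bags: Dict[Hashable, Set[Hashable]]) -> int:
    return max(len(bag) for bag in bags.values()) - 1
```

For a path on three vertices, the two bags containing its two edges form a tree decomposition of width one.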

The *girth* of a graph is the size of a smallest induced cycle in the graph (or ∞ if the graph is a forest). The *bisection width* of a graph *G* = (*V, E*) is defined as the size of a smallest set *E*′ of edges such that *V* can be partitioned into two sets *V*<sub>1</sub>, *V*<sub>2</sub> with |*V*<sub>1</sub>| = |*V*<sub>2</sub>| (or |*V*<sub>1</sub>| = |*V*<sub>2</sub>| + 1 if |*V*| is odd) such that all edges with one end in *V*<sub>1</sub> and one end in *V*<sub>2</sub> are contained in *E*′. Bisection width is illustrated in Figure 2.3. A *dominating set* in a graph is a set *K* of vertices such that each vertex in the graph is contained in *K* or has at least one neighbor in *K*. The *domination number* of a graph is the size of a minimum dominating set in it. The *acyclic chromatic number* of a graph is the minimum number of colors needed to color each vertex with one of the given colors such that each subgraph induced by all vertices of one color is an independent set and each subgraph induced by all vertices of two colors is acyclic.

Lastly, for some graph class Π, the *distance to* Π is the size of a minimum set of vertices such that the graph resulting from deleting this set of vertices is in Π. In this thesis, we will consider the *distance to cographs*, the *distance to bipartite graphs*, and the *distance to forests*. The distance to bipartite graphs is known as *odd cycle transversal number* and the distance to forests is known as *feedback vertex number* in the literature. We will hence use these names. The edge-deletion distance to forests, that is, the size of a smallest set of edges such that removing them yields a forest, is known as the *feedback edge number*.

**Figure 2.3:** An example of the bisection width of a graph. The edges between the two parts are drawn using dashed lines and the bisection width is two (the number of dashed edges).

### **2.3 Complexity Classes and Hypotheses**

We assume familiarity with the basics of Turing machines and Random Access Machines. Otherwise, we refer to Papadimitriou [Pap94]. In this thesis, we will always analyze the running time of an algorithm in terms of Random Access Machines. However, complexity classes are classically defined using Turing machines.

The class *P* contains all decision problems (or languages) that can be decided in polynomial time by deterministic Turing machines. The class *NP* contains all decision problems that can be decided in polynomial time by non-deterministic Turing machines.

A *parameterization* for a problem *L* is formally a pair of functions (*f, g*) such that *f* maps each possible input *I* for *L* to some object *f*(*I*) and *g* maps each such object to a non-negative integer. We use the *treewidth* of a graph as an example. Here, *f* maps each graph *G* to a *tree decomposition* of *G* (of minimum width) and *g* measures the *width* of the tree decomposition, that is, the maximum number of vertices in any bag of the tree decomposition (minus one). A *parameter* is then the resulting non-negative integer *g*(*f*(*I*)) of a parameterization. A *parameterized problem* is a tuple (*L, κ*), where *L* is a language (an unparameterized decision problem) and *κ* is a *parameter*. An instance of (*L, κ*) is a pair (*x, k*) where *k* = *g*(*f*(*x*)) for some parameterization (*f, g*). For a parameterized problem L = (*L, κ*), the language L̂ = {*x* ∈ Σ<sup>∗</sup> | ∃*k* : (*x, k*) ∈ L} is called the *unparameterized problem* associated to L. For a broader introduction into parameterized complexity theory, we refer the reader to the books by Cygan et al. [Cyg+15], Downey and Fellows [DF13], Flum and Grohe [FG06], and Niedermeier [Nie06].

A problem *L* is *fixed-parameter tractable* with respect to some parameter *κ* if there is an algorithm deciding whether (*x, k*) ∈ (*L, κ*) (or equivalently *x* ∈ *L*) in *f*(*k*) · |*x*|<sup>*O*(1)</sup> time, where |*x*| denotes the size of *x* and *f* is some computable function depending only on *k*. The class *FPT* contains all parameterized problems (*L, κ*) where *L* is fixed-parameter tractable with respect to *κ*. The class *XP* contains all parameterized problems (*L, κ*) such that there is an algorithm deciding whether (*x, k*) ∈ (*L, κ*) in |*x*|<sup>*f*(*k*)</sup> time, where *f* is again some computable function only depending on *k*. The class *W[1]* contains all parameterized problems (*L, κ*), where every instance (*x, k*) of (*L, κ*) can be transformed in *f*(*k*) · |*x*|<sup>*O*(1)</sup> time to a combinatorial circuit that has weft at most one and constant depth for all instances, such that (*x, k*) ∈ (*L, κ*) if and only if there is a satisfying truth assignment to the input circuit that assigns true to exactly *k* inputs. The *weft* of a combinatorial circuit is the largest number of logical units with unbounded fan-in on any path from an input to the output. The *depth* of a combinatorial circuit is the largest number of logical units on any path from an input to the output. Similarly to the assumption that *P* ≠ *NP*, the assumption *FPT* ≠ *W[1]* is widely believed and is used to exclude *FPT*-results. A few years ago, the topic of *FPT in P* [GMN17] emerged from parameterized complexity theory. Therein, instead of designing *f*(*k*) · |*x*|<sup>*O*(1)</sup>-time algorithms for *NP*-hard problems where *f* is some superpolynomial function, one is interested in *f*(*k*) · |*x*|<sup>*c*</sup>-time algorithms for problems in *P*, where no *O*(|*x*|<sup>*c*</sup>)-time algorithm is known for the unparameterized problem associated to it.

The problem *k*-SAT is a generalization of 2-SAT and defned as follows.

*k*-SAT

**Input:** A Boolean formula Φ in conjunctive normal form where each clause in Φ contains at most *k* literals.

**Question:** Is Φ satisfiable?

Analogously to 2-SAT programs, we use the term *k*-SAT *program* to refer to an algorithm that solves a problem by constructing and solving *k*-SAT instances (formulas in conjunctive normal form where each clause contains at most *k* literals).

The Exponential-Time Hypothesis (ETH) of Impagliazzo and Paturi [IP01] postulates that there is no 2<sup>*o*(*m*)</sup>-time algorithm solving the Satisfiability problem, where *m* is the number of clauses. It is formalized as follows.

**Hypothesis 2.1** (Exponential-Time Hypothesis (ETH))**.** *There is some constant δ >* 0 *such that* 3-SAT *cannot be solved in O*(2<sup>*δn*</sup>) *time, where n is the number of variables in the input formula.*

It is worth noting that assuming the ETH, there is no *f*(*k*) · *n*<sup>*o*(*k*)</sup>-time algorithm solving the Multicolored Clique problem [Che+05], where *f* is a computable function and *k* is the solution size.

Multicolored Clique

**Input:** An integer *k* and a *k*-partite undirected graph *G* := (*V, E*) with *V* := ⨄<sub>*i*=1</sub><sup>*k*</sup> *V<sub>i</sub>* and |*V<sub>i</sub>*| = *n*/*k* for all *i* ∈ [*k*].

**Question:** Is there an induced clique of size at least *k* in *G*?

A stronger version of the ETH is the so-called Strong Exponential-Time Hypothesis (SETH) [IP01]. It states the following.

**Hypothesis 2.2** (Strong Exponential-Time Hypothesis (SETH))**.** *For each δ <* 1 *there is an integer k such that k*-SAT *cannot be solved in O*(2<sup>*δn*</sup>) *time.*

Let Φ be a Boolean input formula for Satisfiability. We remark that if the SETH is true, then there is no |Φ| 2−*ε* · |Φ| *<sup>O</sup>*(1)-time algorithm solving Satisfiability [IP01].

### **2.4 Reductions Between Problems**

A *(many-one) reduction* is a function *R*: Σ<sup>∗</sup> → Σ<sup>∗</sup> that transforms an instance *x* of some problem *L* to an equivalent instance *y* of a problem *L*′, that is, *y* ∈ *L*′ ⇐⇒ *x* ∈ *L*. A *polynomial-time (many-one) reduction* is a reduction that can be computed in time polynomial in the input size |*x*|. To show that a problem *A* is presumably not in *P*, one can reduce an *NP*-hard problem *B* to *A* (written as *B* ≤<sub>*p*</sub> *A*). Unless *P* = *NP*, this shows that *A* ∉ *P*. A problem *B* is *NP*-hard if for all problems in *NP* there is a polynomial-time reduction to *B*. Famous examples of *NP*-hard problems are Multicolored Clique and *k*-SAT [Kar75]. If *B* ≤<sub>*p*</sub> *A* for an *NP*-hard problem *B*, then *A* is also *NP*-hard.

A *parameterized reduction* is a reduction *R*: Σ<sup>∗</sup> × ℕ → Σ<sup>∗</sup> × ℕ that transforms a parameterized problem L to a parameterized problem L′ in *FPT*-time, that is, for each instance (*x, k*) of L it produces an instance (*y, ℓ*) of L′ such that

1. (*y, ℓ*) can be computed in *f*(*k*) · |*x*|<sup>*O*(1)</sup> time for some computable function *f*,
2. (*x, k*) ∈ L if and only if (*y, ℓ*) ∈ L′, and
3. *ℓ* ≤ *g*(*k*) for some computable function *g*.


To show that some parameterized problem is *presumably not* in *FPT*, one regularly uses the standard complexity assumption that *FPT* ≠ *W[1]* and shows that a problem is *W[1]*-hard. To show *W[1]*-hardness for some parameterized problem L, we use parameterized reductions from *W[1]*-hard problems similar to the unparameterized setting. Probably the most famous example of a *W[1]*-hard problem is Multicolored Clique parameterized by the solution size *k*.

Concerning *FPT-in-P* studies, we use the notion of General-Problem-hardness which formalizes the types of reduction that allow us to exclude certain parameterized algorithms for problems in *P*. In a nutshell, we want to upper-bound the parameter in the constructed instance by some constant *ℓ* without increasing the running time or the instance size by too much. Since it holds for each computable function *f* that *f*(*ℓ*) is some constant, we can then hide any dependency on *ℓ* in the Landau notation.

**Definition 2.1** ([Ben+19b, Definition 3.1])**.** Let L ⊆ Σ<sup>∗</sup> × ℕ be a parameterized problem, let L̂ ⊆ Σ<sup>∗</sup> be the unparameterized decision problem associated to L, and let *g* : ℕ → ℕ be a polynomial. We call L *ℓ-General-Problem-hard*(*g*) *(ℓ-GP-hard*(*g*)*)* if there exists an algorithm A transforming any input instance *x* of L̂ into a new instance (*y, k*) of L such that


We call L *General-Problem-hard*(*g*) *(GP-hard*(*g*)*)* if there exists an integer *ℓ* such that L is *ℓ*-GP-hard(*g*). We omit the running time and call L *ℓ-General-Problem-hard (ℓ-GP-hard)* if *g* is a linear function.

Showing GP-hardness for some parameter *κ* allows us to lift algorithms for the parameterized problem to the unparameterized setting as stated next. The idea behind this statement is as follows. If a parameterized problem is both *ℓ*-GP-hard(*n*<sup>*c*</sup>) and solvable in *O*(*n*<sup>*c*</sup> · *f*(*k*)) time for some computable function *f* and some constant *c*, then we can solve the unparameterized problem associated to it in *O*(*n*<sup>*c*</sup>) time by first reducing an arbitrary input instance to an equivalent instance in which the parameter is at most *ℓ* and then using the parameterized algorithm, where we can hide the dependency on the parameter in the Landau notation.

**Lemma 2.3** ([Ben+19b, Lemma 3.2])**.** *Let g* : N → N *be a polynomial, and let* L ⊆ Σ <sup>∗</sup> × N *be a parameterized problem which is GP-hard*(*g*)*. Let* Lˆ ⊆ Σ ∗ *be the unparameterized decision problem associated to* L*. If there is an algorithm solving each instance* (*y, k*) *of* L *in f*(*k*) · *g*(|*y*|) *time, then there is an algorithm solving each instance x of* Lˆ *in O*(*g*(|*x*|)) *time.*
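The following short calculation is a sketch of why the lemma is plausible, not a replacement for the proof in [Ben+19b]. It assumes, in line with the informal description above, that the transformation runs in $O(g(|x|))$ time and produces an instance $(y, k)$ with $k \le \ell$ and $|y| \le c \cdot |x|$ for some constant $c$ (the symbol $c$ is introduced here only for illustration). Running the transformation and then the parameterized algorithm takes

$$O(g(|x|)) + f(k) \cdot g(|y|) \;\le\; O(g(|x|)) + \max_{k' \le \ell} f(k') \cdot g(c \cdot |x|) \;=\; O(g(|x|))$$

time, where the last step uses that $\max_{k' \le \ell} f(k')$ is a constant and that $g(c \cdot |x|) \in O(g(|x|))$ for every polynomial $g$.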

We conclude this chapter with a simple example to illustrate Lemma 2.3. Consider the problem of detecting whether a given undirected graph contains a clique of size five and the parameter bisection width. By simply copying the graph such that the resulting graph has twice as many vertices and edges, the bisection width becomes zero as there is no edge between the two copies of the original graph and both copies contain the same number of vertices. Moreover, the resulting graph contains a clique of size five if and only if the original graph contains a clique of size five. Now assume that there was an *O*((*n* + *m*) · *f*(*k*))-time algorithm for this problem where *k* is the bisection width and *f* is some computable function. Then, this would imply that we could first construct an equivalent instance in linear time where *k* = 0 as described above, and then solve this equivalent problem in *f*(*k*) · (*n* + *m*) ∈ *O*(*n* + *m*) time.
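The copying step in this example can be sketched in a few lines of Python. This is only an illustration under assumptions made here: vertices are numbered `0..n-1`, edges are pairs, and the names `double_graph` and `has_clique` are hypothetical helpers. The clique test is a brute-force check meant for tiny inputs only.

```python
from itertools import combinations

def double_graph(n, edges):
    """Disjoint union of two copies of a graph on vertices 0..n-1.

    The second copy uses the vertices n..2n-1. No edge connects the two
    copies, so splitting along the copies is a bisection of width zero.
    """
    shifted = [(u + n, v + n) for (u, v) in edges]
    return 2 * n, list(edges) + shifted

def has_clique(n, edges, size=5):
    """Brute-force test for a clique of the given size (illustration only)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return any(
        all(b in adj[a] for a, b in combinations(candidate, 2))
        for candidate in combinations(range(n), size)
    )
```

The doubled graph contains a clique of size five exactly when the original graph does, since every clique lies entirely within one copy.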

# **Part I Dynamic Programming**

# **Chapter 3**

### **Diameter**

In this chapter, we study the problem Diameter which asks for the maximum distance between any two vertices in a given undirected graph. Regarding dynamic programming, this chapter features a dynamic program that is noteworthy as the solution for Diameter will not be related to any table entry but to the final size of the table. To the best of our knowledge, this is the first time the dimension of a dynamic program was used in this way.

Concerning the Diameter problem, many consider the diameter of a graph among the most fundamental graph parameters [Bac+18, New03, WF94]. Most known algorithms for determining the diameter first compute the shortest path between each pair of vertices (All-Pairs Shortest Paths) and then return the maximum [AVW16]. However, several more efficient algorithms have been proposed for special cases [AVW16, BHM20, Cor+01, FP80, Gaw+21] or for approximating the diameter [Ain+99, Bac+18, RW13, WY16].

In this chapter, we follow the *FPT-in-P* approach [AVW16, Ben+20, GMN17], that is, we propose parameterized algorithms for Diameter that run faster than known unparameterized algorithms when specific parameters are very small or show that such algorithms refute popular complexity assumptions. In Section 3.2, we follow the *distance-from-triviality-parameterization paradigm* [GHN04] aiming to augment a folklore algorithm for Diameter on cographs such that it also works for graphs with small modulators to cographs, that is, graphs with small sets of vertices whose removal yields a cograph. We also analyze graphs with small modulators to bipartite graphs. For the parameter distance *k* to cographs, we provide a $2^{O(k)}(n+m)$-time algorithm. For the parameter odd cycle transversal number *k* (the distance to bipartite graphs), we use our recently introduced notion of *General-Problem-hardness* [Ben+19b] to show that Diameter parameterized by *k* is "as hard" as the unparameterized Diameter problem. In Section 3.3, we investigate parameter combinations that are motivated by properties of social networks. Social networks often have special characteristics, including the *small-world* property (small diameter) and a *power-law degree distribution* (small average degree and small *h*-index) [LH08, Mil67, New03, New10, NP03]. Since social networks often have small diameter and small *h*-index, we investigate combinations of parameters closely related to the diameter and parameters closely related to the *h*-index.

The domination number *d* is a parameter that upper-bounds the diameter, and the acyclic chromatic number *a* upper-bounds the average degree and is upper-bounded by the *h*-index. Hence, the standard $O(n \cdot m)$-time algorithm runs in $O(n^2 \cdot a)$ time. We will show that this is essentially the best one can hope for as, assuming the SETH, we can exclude $f(a, d) \cdot (n + m)^{2-\varepsilon}$-time algorithms for each $\varepsilon > 0$. Our result is based on a reduction by Roditty and Williams [RW13] which is modified such that the acyclic chromatic number and the domination number in the resulting graph are five and four, respectively. It is known that a $k^{O(1)}(n + m)^{2-\varepsilon}$-time algorithm where *k* is the combined parameter diameter plus maximum degree would refute the SETH [BN19]. Complementing this lower bound, we provide an $f(k)(n + m)$-time algorithm where *k* is the combined parameter diameter plus *h*-index. Note that the maximum degree upper-bounds the *h*-index.

### **3.1 Problem Definition and Related Work**

Diameter asks for the maximum distance between any two vertices in a given undirected and connected input graph. It is formally defined as follows and an example is given in Figure 3.1. Recall that $\text{dist}_G(v, w)$ is the length of a shortest path between *v* and *w* in *G*.

Diameter

**Input:** An undirected and connected graph *G* . .= (*V, E*).

**Task:** Compute the length of a longest shortest path in *G*, that is, max{dist*G*(*u, v*) | *u, v* ∈ *V* }.

Due to its importance, Diameter is extensively studied. Concerning worst-case analysis, the theoretically fastest algorithms (in terms of dependence on the number *n* of vertices) are based on matrix multiplication and run in $O(n^{2.373})$ time [Sei95]. In terms of the dependence on the input size $n + m$, the currently fastest algorithms for All-Pairs Shortest Paths run in $O(n^3 / 2^{\Omega(\sqrt{\log n})})$ time in dense graphs [CW21] and in $O(nm)$ time in sparse graphs, respectively.

**Figure 3.1:** An undirected connected graph *G* = ({*s, t, u, v, w*}*, E*). The diameter of *G* is 3 as dist*G*(*u, v*) = 3 and dist*G*(*x, y*) ≤ 3 for all *x, y* ∈ {*s, t, u, v, w*}.

The *O*(*nm*)-time algorithm performs a breadth-first search from each vertex and algorithms for Diameter employed in practice are usually based on this approach. See e. g. Borassi et al. [Bor+15] for a recent example of such an algorithm which also yields good performance bounds using average-case analysis [BCT17].
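The breadth-first-search approach described above can be sketched as follows. This is a minimal illustration, assuming the graph is connected and given as a dictionary mapping each vertex to an iterable of its neighbors; the function names are chosen here and do not stem from the cited works.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return a dict with the distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def diameter(adj):
    """Diameter of a connected graph: one BFS per vertex, O(n * (n + m)) overall."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)
```

Each search takes time linear in the size of the graph, so running one search per vertex gives the stated bound for connected graphs.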

Concerning special graph classes, Gawrychowski et al. [Gaw+21] showed how to solve Diameter on planar graphs in $\tilde{O}(n^{5/3})$ time. Other special cases include linear-time algorithms for outerplanar graphs [FP80] and chordal graphs [Cor+01].

In this chapter, we follow the line of *FPT in P* [GMN17]. Starting *FPT in P* for Diameter, Abboud et al. [AVW16] observed that, unless the SETH fails, there is no $k^{O(1)} \cdot (n + m)^{2-\varepsilon}$-time algorithm for Diameter for any $\varepsilon > 0$ if *k* is the treewidth of the graph. Their corresponding reduction also shows the same hardness result for the combined parameter *h*-index plus domination number and the parameter vertex cover number. Moreover, the reduction also implies that the SETH is refuted by any $f(k)(n + m)^{2-\varepsilon}$-time algorithm for Diameter for any computable function *f* and any $\varepsilon > 0$ when *k* is the distance to chordal graphs. Evald and Dahlgaard [ED16] adapted the reduction to prove the same for the parameter maximum degree.

Complementing the lower bound for the parameter treewidth by Abboud et al. [AVW16], Bringmann et al. [BHM20] showed that Diameter can be solved in $2^{O(k)} n^{1+o(1)}$ time where *k* is the treewidth of the graph. In the paper on which this chapter is based, we systematically explored the parameter space looking for parameters that allow for $k^{O(1)} \cdot (n + m)^{2-\varepsilon}$-time algorithms [BN19]. Figure 3.2 gives an overview of the parameterized results for Diameter, and we will present some selected results in this chapter. The following results on approximating Diameter are known. A simple breadth-first search yields a linear-time 2-approximation. Aingworth et al. [Ain+99] improved the approximation factor to 3/2 at the expense of the higher running time of $O(n^2 \log n + m\sqrt{n \log n})$. Roditty and Williams [RW13] showed that approximating Diameter within a factor of $3/2 - \delta$ in $O(n^{2-\varepsilon})$ time for any $\delta, \varepsilon > 0$ refutes the SETH. Moreover, for any $\varepsilon, \delta > 0$ a $(3/2 - \delta)$-approximation in $O(m^{2-\varepsilon})$ time or a $(5/3 - \delta)$-approximation in $O(m^{3/2-\varepsilon})$ time also refutes the SETH [Bac+18, CGR16]. On planar graphs, there is an approximation scheme with near-linear running time [WY16].

**Figure 3.2:** Overview of the relation between the structural parameters and the respective results for Diameter. An edge from a parameter $\alpha$ to a parameter $\beta$ below $\alpha$ means that $\beta$ can be upper-bounded by a polynomial (usually linear) function in $\alpha$ (see also the work by Schröder [Sch19]). The three small boxes below each parameter indicate whether there exists (from left to right) an algorithm running in $f(k) n^2$, $f(k)(n + m)^{1+\varepsilon}$, or $k^{O(1)}(n + m)^{1+\varepsilon}$ time, respectively. If a small box is green (lighter), then a corresponding algorithm exists and the box to the left is also green. Similarly, a red (darker) box indicates that a corresponding algorithm would be a breakthrough. More precisely, if a middle box (right box) is red, then an algorithm running in $f(k) \cdot (n + m)^{2-\varepsilon}$ (or $k^{O(1)} \cdot (n + m)^{2-\varepsilon}$) time refutes the SETH. If a left box is red, then an algorithm with running time $f(k) n^2$ implies an $O(n^2)$-time algorithm for Diameter in general. Hardness results for a parameter $\alpha$ imply the same hardness results for the parameters below $\alpha$. Similarly, algorithms for a parameter $\beta$ imply algorithms for the parameters above $\beta$. White boxes indicate open problems.

### **3.2 Parameters Motivated by Graph Classes**

In this section, we investigate parameterizations that measure the distance to special graph classes. We study the odd cycle transversal number (the distance to bipartite graphs) and the distance to cographs. Roditty and Williams [RW13] state that when computing the diameter of a graph, distinguishing between diameter two and diameter three is among the most difficult cases. Nonetheless, detecting cographs (a subclass of graphs with diameter two) is easier than computing the diameter. Moreover, if the graph contains only few pairs of vertices of distance at least three, then the distance to cographs is often small. Thus, an efficient algorithm for Diameter parameterized by the distance to cographs might help dealing with the hard case of Diameter stated above. Besides cographs, we study bipartite graphs as these are among the most fundamental graph classes. Note that the lower bound of Abboud et al. [AVW16] for the parameter vertex cover number (distance to edgeless graphs) already implies that there is no $k^{O(1)} \cdot (n + m)^{2-\varepsilon}$-time algorithm for *k* being either of the two considered parameters as both are upper-bounded by the vertex cover number (see Figure 3.2).

**Odd Cycle Transversal.** We will show that, assuming the SETH, there is no $f(k) \cdot (n + m)^{2-\varepsilon}$-time algorithm for the odd cycle transversal number *k* for any computable function *f*. We do so by showing that Diameter is 4-GP-hard<sup>1</sup> with respect to the combined parameter odd cycle transversal number plus girth. Recall that the girth of a graph is the size of a smallest induced cycle in the graph and that Diameter is 4-GP-hard with respect to the parameter odd cycle transversal number plus girth if the following holds. Each instance $(G, k)$ of the decision version of Diameter can be transformed in $O(|G|)$ time into an instance $(G', k')$ such that the odd cycle transversal number plus girth of $G'$ is at most four and $G'$ has diameter $k'$ if and only if $G$ has diameter $k$. Recall further that if Diameter is 4-GP-hard with respect to some parameter $\ell$, then Lemma 2.3 states the following. If there is an algorithm solving each instance $(G, k, \ell)$ of the decision version of Diameter in $O(f(\ell) \cdot g(|G|))$ time for some computable function $f$ and some polynomial $g$, then there is an algorithm that solves each instance $(G', k')$ of the unparameterized decision version of Diameter in $O(g(|G'|))$ time. This then yields the following two results. First, any $f(k) \cdot n^{2.3}$-time algorithm can be transformed into an $O(n^{2.3})$-time algorithm for Diameter (which is faster than any known unparameterized algorithm). Second, any $f(k) \cdot (n + m)^{2-\varepsilon}$-time algorithm would refute the SETH.

<sup>1</sup>We remark that Definition 2.1 and Lemma 2.3 are stated for decision problems while Diameter is not a decision problem. However, the problem of deciding whether a given undirected connected graph has diameter *exactly k* for some given *k* is a decision problem and every algorithm for Diameter can be used to solve this decision problem with constant overhead. We call Diameter GP-hard with respect to some parameter when this decision version of Diameter is GP-hard with respect to that parameter.

**Figure 3.3:** Example for the construction in the proof of Proposition 3.1. The input graph given on the left side has diameter two and the constructed graph on the right side has diameter three. In each graph one longest shortest path is highlighted.

**Proposition 3.1.** Diameter *is* 4*-GP-hard with respect to the combined parameter* odd cycle transversal plus girth*.*

*Proof.* Let *G* = (*V, E*) be an arbitrary undirected connected input graph with *V* = {*v*1*, v*2*, . . . , vn*}. We construct a new bipartite graph *G*′ = (*V* ′ *, E*′ ), where

$$V' := \{ u_i, w_i \mid v_i \in V \}, \text{ and}$$

$$E' := \{ \{ u_i, w_j \}, \{ u_j, w_i \} \mid \{ v_i, v_j \} \in E \} \cup \{ \{ u_i, w_i \} \mid v_i \in V \}.$$

An example of this construction can be seen in Figure 3.3. We will now prove that all properties of Definition 2.1 hold. It is easy to verify that the construction can be computed in linear time and therefore the resulting instance is of linear size as well. Observe that $\{u_i \mid v_i \in V\}$ and $\{w_i \mid v_i \in V\}$ are both independent sets and therefore $G'$ is bipartite. Notice further that for any edge $\{v_i, v_j\} \in E$ there is an induced cycle in $G'$ containing the vertices $u_i$, $w_i$, $u_j$, and $w_j$. Since $G'$ is bipartite, there is no induced cycle of length three in $G'$ and thus the girth of $G'$ is four.

Lastly, we show that the diameter of $G'$ is exactly one larger than the diameter of $G$. We do so by proving for each pair $(v_i, v_j)$ of vertices in $G$ that if $\text{dist}(v_i, v_j)$ is odd, then

$$\text{dist}(u_i, w_j) = \text{dist}(v_i, v_j) \text{ and } \text{dist}(u_i, u_j) = \text{dist}(v_i, v_j) + 1,$$

and if $\text{dist}(v_i, v_j)$ is even, then

$$\text{dist}(u_i, u_j) = \text{dist}(v_i, v_j) \text{ and } \text{dist}(u_i, w_j) = \text{dist}(v_i, v_j) + 1.$$

Since $\text{dist}(u_i, w_i) = 1$ and $\text{dist}(u_i, w_j) = \text{dist}(u_j, w_i)$, this will conclude the proof.
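Before turning to the proof of these equalities, the construction and the claimed distance relations can be checked experimentally on small graphs. The following sketch assumes a connected input graph given as a dictionary of neighbor sets. The names `bipartite_copy` and `check_distance_claim` are hypothetical and only serve this illustration.

```python
from collections import deque

def bfs(adj, source):
    """Distances from source to all vertices of a connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def bipartite_copy(adj):
    """Build G': vertices ('u', v) and ('w', v); u_i is adjacent to w_i and to w_j for each edge {v_i, v_j}."""
    new_adj = {}
    for v, neighbors in adj.items():
        new_adj[("u", v)] = {("w", v)} | {("w", x) for x in neighbors}
        new_adj[("w", v)] = {("u", v)} | {("u", x) for x in neighbors}
    return new_adj

def check_distance_claim(adj):
    """Check the odd/even distance relations stated in the proof for all pairs."""
    new_adj = bipartite_copy(adj)
    for i in adj:
        d_old = bfs(adj, i)
        d_new = bfs(new_adj, ("u", i))
        for j in adj:
            c = d_old[j]
            observed = (d_new[("u", j)], d_new[("w", j)])
            expected = (c + 1, c) if c % 2 == 1 else (c, c + 1)
            if observed != expected:
                return False
    return True
```

Such a check is no substitute for the proof, but it makes the parity argument tangible: an odd distance is preserved between $u_i$ and $w_j$, an even one between $u_i$ and $u_j$, and the respective other distance is larger by exactly one.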

In order to show that the diameter of $G'$ is exactly one larger than the diameter of $G$, let $c = \text{dist}(v_i, v_j)$ be odd and let $P = (v_{a_0}, v_{a_1}, \dots, v_{a_c})$ be a shortest path from $v_i$ to $v_j$ where $v_{a_0} = v_i$ and $v_{a_c} = v_j$. Let

$$P' = (u_{a_0}, w_{a_1}, u_{a_2}, \dots, w_{a_c})$$

be a path in $G'$. Clearly $P'$ has length $c$ and hence $\text{dist}(u_i, w_j) \le c = \text{dist}(v_i, v_j)$. It also holds that $\text{dist}(u_i, w_j) \ge c$. To verify this, assume towards a contradiction that there is a path $P'' = (u_{b_0}, w_{b_1}, u_{b_2}, \dots, w_{b_{c'}})$ with $u_{b_0} = u_i$, $w_{b_{c'}} = w_j$, and $c' < c$. Then there is a path $P''' = (v_{b_0}, v_{b_1}, \dots, v_{b_{c'}})$ between $v_i$ and $v_j$. Note that if $v_{b_g} = v_{b_h}$ for some $g < h$, then $P'''$ can be replaced by a shorter path where the subpath from $v_{b_{g+1}}$ to $v_{b_h}$ is removed. Thus, the distance between $v_i$ and $v_j$ is shorter than $c$, a contradiction.

Concerning $\text{dist}(u_i, u_j)$, observe that $G'$ is bipartite and hence $\text{dist}(u_i, u_j)$ is even. It holds that $\text{dist}(u_i, u_j) > c$ as $\text{dist}(u_i, u_j) \ge c$ for the same reason as $\text{dist}(u_i, w_j) \ge c$ and $\text{dist}(u_i, u_j) \ne c$ as $c$ is odd. Finally, since

$$P' \bullet (u_j) = (u_{a_0}, w_{a_1}, u_{a_2}, \dots, w_{a_c}, u_{a_c})$$

is a path of length $c + 1$ between $u_i$ and $u_{a_c} = u_j$, it holds that $\text{dist}(u_i, u_j) = c + 1$.

It remains to analyze the case where the distance $c = \text{dist}(v_i, v_j)$ between two vertices in $G$ is even. Let again $P = (v_{a_0}, v_{a_1}, \dots, v_{a_c})$ be a shortest path from $v_i$ to $v_j$ where $v_{a_0} = v_i$ and $v_{a_c} = v_j$. This time, let

$$P' = (u_{a_0}, w_{a_1}, u_{a_2}, \dots, u_{a_c})$$

be a path in $G'$. This shows that $\text{dist}(u_i, u_j) \le \text{dist}(v_i, v_j)$. It again holds that $\text{dist}(u_i, u_j) \ge c$ as if there were a path $P'' = (u_{b_0}, w_{b_1}, u_{b_2}, \dots, u_{b_{c'}})$ with $u_{b_0} = u_i$, $u_{b_{c'}} = u_j$, and $c' < c$, then there would also be a shorter path $P''' = (v_{b_0}, v_{b_1}, \dots, v_{b_{c'}})$ between $v_i$ and $v_j$. Observe that $\text{dist}(u_i, w_j)$ is odd as $G'$ is bipartite. Thus, $\text{dist}(u_i, w_j) > c$ as $\text{dist}(u_i, w_j) < c$ again implies $\text{dist}(v_i, v_j) < c$ and $\text{dist}(u_i, w_j) \ne c$ as one is odd and the other one is even. Finally,

$$P' \bullet (w_j) = (u_{a_0}, w_{a_1}, u_{a_2}, \dots, u_{a_c}, w_{a_c})$$

proves that $\text{dist}(u_i, w_j) = c + 1 = \text{dist}(v_i, v_j) + 1$.

**Distance to cographs.** We continue with the distance to cographs. A graph is a cograph if and only if it does not contain a $P_4$ as an induced subgraph, where a $P_4$ is a path on four vertices. Providing an algorithm that matches the lower bound of Abboud et al. [AVW16], we will show that Diameter parameterized by distance *k* to cographs can be solved in $O(k \cdot (n + m) + 2^{O(k)})$ time. We will use the following lemma.

**Lemma 3.2.** *Let* $G = (V, E)$ *be a graph and let* $K \subseteq V$ *be a vertex subset such that each connected component in* $G - K$ *has diameter at most two. Then, the diameter of* $G$ *can be computed in* $O(|K| \cdot (n + m + 2^{4|K|}))$ *time.*

*Proof.* Let $G = (V, E)$ be the input graph, let $K = \{x_1, x_2, \dots, x_k\} \subseteq V$ be a set of vertices such that each connected component in $G - K$ has diameter at most two, and let $G' := G - K$. We first compute the set of all connected components of $G'$ and their respective diameter in linear time and store for each vertex the information in which connected component it is contained. Note that we only need to check for each connected component $C$ whether $C$ induces a clique in $G'$, as otherwise $C$'s diameter is by assumption two. In a second step, we perform from each vertex $x_i \in K$ a breadth-first search in $G$ and store the distance between $x_i$ and each other vertex $v$ in a table. Since a single breadth-first search takes $O(n + m)$ time, this takes overall $O(k \cdot (n + m))$ time.

Next we introduce some notation. The *type* of a vertex $v \in V \setminus K$ is a vector of length $k$ where the $i$th entry describes the distance from $v$ to $x_i$ with the addition that any value above three is set to four. An example is given in Figure 3.4. We say that a type is *non-empty* if there is at least one vertex of this type. We compute for each vertex $v \in V \setminus K$ its type. Additionally, we store for each non-empty type the vertices of this type. Moreover, if all vertices of this type are in the same connected component, then we store this information, and otherwise we store that there are at least two different connected components containing a vertex of that type. This takes $O(n \cdot k)$ time and there are at most $4^k$ different types.

**Figure 3.4:** An example for *types*. The set $K$ contains the two vertices $x_1$ and $x_2$ and the connected components in $G - K$ are depicted. The type of $r$ is $(1, 3)$, the type of $s$ is $(2, 4)$, the type of $t$ is $(3, 4)$, the type of $u$ is $(1, 2)$, the type of $v$ is $(1, 1)$, and the type of $w$ is $(2, 2)$.

Lastly, we iterate over all of the at most $4^{2k}$ pairs $(t_1, t_2)$ of non-empty types (including the pairs where both types are the same) and compute the largest distance between vertices of these types. Let $y, z$ be two vertices with $\text{type}(y) = t_1$ and $\text{type}(z) = t_2$ that have maximum pairwise distance. We will first discuss how to find $y$ and $z$ and then show how to correctly compute their distance in $O(k)$ time. Once we iterated over all pairs of types and reported the maximum distance found, the diameter is either this or the largest distance from a vertex $x_i \in K$. Since we stored all of the latter distances in a table, we can also store the maximum with only constant overhead.

To compute $y$ and $z$, we consider the following two cases. If both types only appear in the same connected component, then the distance between any two vertices of these types is at most two. Hence, we can discard this case (one can check in linear time whether the diameter of $G$ is at least two). If two types appear in different connected components, then a longest shortest path between vertices of the respective types contains at least one vertex in $K$. Observe that since each connected component has diameter at most two, at least one of any four consecutive vertices on a shortest path must be in $K$. Thus a shortest $y$-$z$ path contains at least one vertex $x_i \in K$ with $\text{dist}(x_i, y) \le 3$. By definition, each vertex with the same type as $y$ has the same distance to $x_i$ and therefore the same distance to $z$ unless there is no shortest path from it to $z$ that passes through $x_i$, that is, it is in the same connected component as $z$. Hence, we can choose two arbitrary vertices of the respective types in different connected components. Observe that we already precomputed for each type its vertices and whether it is represented in multiple connected components or not. Thus, checking whether there are two vertices of the respective type in different connected components is just a table lookup. We can compute the distance between $y$ and $z$ in $O(k)$ time by computing $\min_{x \in K}\{\text{dist}(y, x) + \text{dist}(x, z)\}$. Observe that the shortest path from $y$ to $z$ contains $x_i$ and therefore $\text{dist}(y, x_i) + \text{dist}(x_i, z) = \text{dist}(y, z)$. In this way, we can compute the diameter of $G$ in $O(k \cdot (n + m + 2^{4k}))$ time.
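The steps of this proof can be sketched in Python as follows. This is an illustrative sketch under several assumptions, not an optimized implementation. The graph is assumed to be connected and given as a dictionary of neighbor *sets*, the promise on $G - K$ is assumed to hold and is not verified, and the name `diameter_with_modulator` is hypothetical. For simplicity, the sketch stores one representative vertex per type and connected component instead of the single flag used in the proof.

```python
from collections import deque
from itertools import combinations_with_replacement

def bfs(adj, source, banned=frozenset()):
    """Distances from source, never entering vertices in banned."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist and w not in banned:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def diameter_with_modulator(adj, K):
    """Diameter of G, assuming each component of G - K has diameter at most two."""
    K = list(K)
    kset = set(K)
    rest = [v for v in adj if v not in kset]
    best = 0

    # Step 1: components of G - K. Inside a component the distance is 1
    # between adjacent vertices and 2 otherwise (by the promise).
    comp = {}
    for v in rest:
        if v in comp:
            continue
        members = set(bfs(adj, v, banned=kset))
        for x in members:
            comp[x] = v
        if len(members) > 1:
            is_clique = all(len(adj[x] & members) == len(members) - 1 for x in members)
            best = max(best, 1 if is_clique else 2)

    # Step 2: one breadth-first search in G from each vertex of K.
    from_k = [bfs(adj, x) for x in K]
    for dist in from_k:
        best = max(best, max(dist.values()))

    # Step 3: types, with one representative per (type, component).
    by_type = {}
    for v in rest:
        t = tuple(min(dist[v], 4) for dist in from_k)
        by_type.setdefault(t, {}).setdefault(comp[v], v)

    # Step 4: for each pair of types, pick representatives in different
    # components and route through the best vertex of K.
    for t1, t2 in combinations_with_replacement(list(by_type), 2):
        pair = next(((y, z) for cy, y in by_type[t1].items()
                     for cz, z in by_type[t2].items() if cy != cz), None)
        if pair is not None:
            y, z = pair
            best = max(best, min(dist[y] + dist[z] for dist in from_k))
    return best
```

Note that Step 4 uses the exact distances from the breadth-first searches, while the capped values are only used to group vertices into types.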

Note that the algorithm described in the proof above does not verify whether $K$ is a vertex set such that each connected component in $G - K$ has diameter at most two. Indeed, even distinguishing between diameter two and three in $O(n^{2-\varepsilon})$ time for any $\varepsilon > 0$ would refute the SETH [AVW16]. Thus, the above algorithm cannot efficiently verify whether the input meets the stated conditions. Hence, when using Lemma 3.2, we need a way to ensure that each connected component in $G - K$ has diameter at most two. In cographs each connected component has diameter at most two and hence we can show the following.

**Proposition 3.3.** Diameter *can be solved in* $O(k \cdot (n + m + 2^{16k}))$ *time when parameterized by the* distance *k* to cographs*.*

*Proof.* Recall that a cograph does not contain a $P_4$ as an induced subgraph. Thus, any cograph has diameter at most two (but not every diameter-two graph is a cograph, consider e. g. a cycle on five vertices). Moreover, given a graph $G$, one can determine in linear time whether $G$ is a cograph and can return an induced $P_4$ if this is not the case [Bre+08, CPS85]. Iteratively searching for an induced $P_4$, adding all four vertices of a returned $P_4$ to a set $K$, and deleting those vertices from $G$ until it is $P_4$-free hence computes a set $K \subseteq V$ with $|K| \le 4k$ such that $G - K$ is a cograph. The running time for computing $K$ is in $O(k \cdot (n + m))$. Applying Lemma 3.2 to this set $K$ then yields a running time of $O(|K| \cdot (n + m + 2^{4|K|})) \subseteq O(k \cdot (n + m + 2^{16k}))$ for computing the diameter.

Observe that when a minimum deletion set $K$ to cographs is given, then we can solve Diameter parameterized by the distance $k$ to cographs in $O(k \cdot (n + m + 2^{4k}))$ time. We remark that computing the distance to cographs exactly is *NP*-complete [LY80].

### **3.3 Parameters Motivated by Properties of Social Networks**

In this section, we study Diameter with respect to parameters that are expected to be small in social networks. It was observed that social networks have the *small-world* property and a *power-law degree distribution* [LH08, Mil67, New03, New10, NP03]. The small-world property directly transfers to the diameter. The power-law degree distribution is often captured by the *h*-index as only few high-degree vertices exist in the network. Thus, we investigate parameters related to the diameter and to the *h*-index. We start with some degree-based parameters that are upper-bounded by the *h*-index and then continue with parameter combinations.

Evald and Dahlgaard [ED16] showed that any $f(k)(n + m)^{2-\varepsilon}$-time algorithm for Diameter parameterized by the maximum degree *k* for any computable function *f* refutes the SETH. Observe that $2m = n \cdot a$, where *a* is the average degree, and therefore the standard algorithm (run a breadth-first search from each vertex) takes $O(n \cdot (n + m)) = O(a \cdot n^2)$ time. Since the average degree is at most the maximum degree, this algorithm already matches the given lower bound.

**Observation 3.4.** Diameter *parameterized by* average degree *a is solvable in* $O(a \cdot n^2)$ *time.*
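For completeness, the running time bound can be spelled out as follows. The calculation uses $2m = n \cdot a$ and, as an additional assumption made explicit here, that the input graph is connected with $n \ge 2$ vertices, so that $m \ge n - 1$ and hence $a \ge 1$:

$$n \cdot (n + m) = n^2 + n \cdot \frac{a \cdot n}{2} = \left(1 + \frac{a}{2}\right) \cdot n^2 \le \frac{3}{2} \cdot a \cdot n^2 \in O(a \cdot n^2).$$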

We next investigate the parameter minimum degree and check whether the average degree can be replaced by the minimum degree. Unsurprisingly, it cannot. We show that Diameter is 2-GP-hard with respect to the combined parameter bisection width plus minimum degree. In other words, if there is an $f(b) \cdot n^2$-time algorithm, where *b* is the value of the combined parameter, then there is also an $O(n^2)$-time algorithm for Diameter. The bisection width of a graph *G* is the minimum number of edges to delete from *G* in order to partition *G* into two connected components whose numbers of vertices differ by at most one. Computing the bisection width of a graph is known to be *NP*-hard [Bui+87].

**Figure 3.5:** Example for the construction in the proof of Proposition 3.5. The input graph given on the left side has diameter two and the constructed graph on the right side has diameter 2 + 4 = 6. The respective longest shortest paths are highlighted.

**Proposition 3.5.** Diameter *is* 2*-GP-hard with respect to the combined parameter* bisection width plus minimum degree*.*

*Proof.* Let *G* = (*V, E*) be an arbitrary undirected connected input graph with *V* = {*v*1*, v*2*, . . . , vn*} and let *d* be the diameter of *G*. We construct a new graph *G*′ = (*V* ′ *, E*′ ) with diameter *d* + 4 as follows. Let

$$\begin{aligned} V' &:= \{ s_i, t_i, u_i \mid i \in [n] \} \cup \{ w_i \mid i \in [3n] \}, \text{ and} \\ E' &:= T \cup W \cup E'', \text{ where} \\ T &:= \{ \{ s_i, t_i \}, \{ t_i, u_i \} \mid i \in [n] \}, \\ W &:= \{ \{ u_1, w_1 \} \} \cup \{ \{ w_1, w_i \} \mid i \in [3n] \setminus \{ 1 \} \}, \text{ and} \\ E'' &:= \{ \{ u_i, u_j \} \mid \{ v_i, v_j \} \in E \}. \end{aligned}$$

An example of this construction can be seen in Figure 3.5. We will now prove that all properties of Definition 2.1 hold. It is easy to verify that the graph $G'$ contains $6n$ vertices and $5n + m$ edges, and that $G'$ can be computed in linear time. Notice that $\{s_i, t_i, u_i \mid i \in [n]\}$ and $\{w_i \mid i \in [3n]\}$ are both of size $3n$ and that there is only one edge ($\{u_1, w_1\}$) between these two sets of vertices. The bisection width of $G'$ is therefore one and the minimum degree is also one as $s_1$ has only $t_1$ as neighbor.

It remains to show that $G'$ has diameter $d + 4$. First, notice that the subgraph of $G'$ induced by $\{u_i \mid i \in [n]\}$ is isomorphic to $G$. Second, $\text{dist}(s_i, u_i) = 2$ for all $i \in [n]$ and thus $\text{dist}(s_i, s_j) = \text{dist}(u_i, u_j) + 4 = \text{dist}(v_i, v_j) + 4$ for all $s_i \ne s_j$. Hence, the diameter of $G'$ is at least $d + 4$. Third, note that it holds for all vertices $x \in V' \setminus \{s_i\}$ that $\text{dist}(s_i, x) > \text{dist}(t_i, x)$. Lastly, observe that for all $i \in [3n]$ and all vertices $x \in V'$ it holds that $\text{dist}(w_i, x) \le \max\{\text{dist}(s_1, x), 4\}$. Thus the longest shortest path in $G'$ is between two vertices $s_i$ and $s_j$ and it is of length $\text{dist}(u_i, u_j) + 4 = \text{dist}(v_i, v_j) + 4 \le d + 4$.
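The construction from this proof can be sketched as follows, together with a brute-force diameter computation for checking small examples. The sketch assumes that the input vertices are numbered `1..n` and that edges are given as pairs. The names `pad_graph` and `diameter` are hypothetical and only used for this illustration.

```python
from collections import deque

def pad_graph(n, edges):
    """Build G' from a graph on vertices 1..n: paths s_i - t_i - u_i, the
    original edges on the u_i, and a star with center w_1 and 3n - 1 leaves
    that is attached to u_1."""
    adj = {}

    def add(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    for i in range(1, n + 1):
        add(("s", i), ("t", i))
        add(("t", i), ("u", i))
    add(("u", 1), ("w", 1))
    for i in range(2, 3 * n + 1):
        add(("w", 1), ("w", i))
    for i, j in edges:
        add(("u", i), ("u", j))
    return adj

def diameter(adj):
    """Brute-force diameter of a connected graph via one BFS per vertex."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best
```

On a cycle with five vertices (diameter two), the constructed graph has 30 vertices, 30 edges, minimum degree one, and diameter six, in line with the counts and the bound $d + 4$ stated in the proof.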

We mention in passing that the constructed graph in the proof of Proposition 3.5 contains the original graph as an induced subgraph and if the original graph is bipartite, then so is the constructed graph. Thus, first applying the construction in the proof of Proposition 3.1 (see also Figure 3.3) and then the construction in the proof of Proposition 3.5 (see also Figure 3.5) shows that Diameter is GP-hard even when parameterized by the sum of girth, bisection width, minimum degree, and odd cycle transversal number.

**Corollary 3.6.** Diameter *is* 6*-GP-hard with respect to the combined parameter* odd cycle traversal number plus girth plus bisection width plus minimum degree*.*

*h***-index and diameter.** We next investigate the combined parameter *h*-index plus diameter. The reduction by Roditty and Williams [RW13] produces instances with constant domination number and logarithmic vertex cover number (in the input size). Since the diameter *d* is linearly upper-bounded by the domination number and the *h*-index is linearly upper-bounded by the vertex cover number, any algorithm that solves Diameter parameterized by the combined parameter (*d* + *h*) in (*d* + *h*)<sup>*O*(1)</sup> · (*n* + *m*)<sup>2−*ϵ*</sup> time disproves the SETH. We next present the main result in this chapter, that is, an algorithm for Diameter parameterized by *h*-index plus diameter that almost matches the lower bound. We say that the running time almost matches the lower bound since its dependence on the parameter is roughly *O*(*h<sup>d</sup>* + *d<sup>h</sup>*) = *O*(2<sup>*d* log *h* + *h* log *d*</sup>). Hence, it remains open whether an algorithm with a running time of (*n* + *m*) · 2<sup>*O*(*d*+*h*)</sup> exists. We consider the following algorithm our main result of this chapter for two reasons. First, its running time almost matches the lower bound for the relevant special case where the input graph has similar properties to social networks (namely small diameter and small *h*-index). Second, the dynamic program we develop here is unusual in the sense that the solution of the problem is not related to some table entry but rather to the *size* of the table.

**Theorem 3.7.** Diameter *parameterized by* diameter *d* plus *h*-index *h is solvable in O*((*n* + *m*) · *h* · (*h<sup>d</sup>* + *d<sup>h</sup>*)) *time.*

*Proof.* Let *G* = (*V, E*) be an input graph for Diameter and let *H* = {*x*1*, . . . , xh*} be a set of *h* vertices with highest degree in *G*. Clearly, *H* can be computed in linear time. Notice that all vertices in *V* \ *H* have degree at most *h* in *G*.
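
The choice of *H* can be illustrated with a short Python sketch. It is not taken from the thesis: the function name `h_index_set` and the adjacency-dictionary input format are assumptions made here, and bucketing the vertices by degree is just one way to stay within linear time.

```python
def h_index_set(adj):
    """Return (h, H) for a simple graph given as {vertex: set_of_neighbors}.

    h is the h-index (the largest h such that at least h vertices have
    degree at least h) and H is a list of h vertices of highest degree.
    """
    n = len(adj)
    # Bucket the vertices by degree; degrees in a simple graph are at most n - 1.
    buckets = [[] for _ in range(n)]
    for v, neighbors in adj.items():
        buckets[len(neighbors)].append(v)
    # Vertices in order of non-increasing degree.
    order = [v for degree in range(n - 1, -1, -1) for v in buckets[degree]]
    h = 0
    while h < n and len(adj[order[h]]) >= h + 1:
        h += 1
    return h, order[:h]


# Hypothetical example: a star with center 0 and four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
h, H = h_index_set(star)
```

If some vertex outside the returned list had degree at least *h* + 1, then the first *h* + 1 vertices of `order` would all have degree at least *h* + 1, contradicting the maximality of *h*. This is the degree bound used in the proof.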

We will describe a two-phase algorithm based on the following idea. In the first phase, it performs a breadth-first search from each vertex *x<sub>i</sub>* ∈ *H*, stores the distance to each other vertex, and uses this to compute the *type* of each vertex, that is, a vector containing the distances to each vertex in *H*. In the second phase, the algorithm iteratively increases a value *e* and verifies whether there is a pair of vertices of distance at least *e* + 1 using dynamic programming. If at any point no such pair is found, then the diameter of *G* is *e*.

The first phase is fairly straightforward. The algorithm performs a breadth-first search from each vertex *x<sub>i</sub>* ∈ *H* and stores the distance from *x<sub>i</sub>* to each vertex *v* in a table. We denote the maximum entry in this table by *a*. It then iterates over each vertex *v* ∈ *V* \ *H* and computes a vector of length *h* with the *i*th entry representing the distance from *v* to *x<sub>i</sub>*. An example of types is depicted in Figure 3.6. The algorithm also stores the number of vertices of each type (if there is at least one such vertex). Since the distance to any vertex is at most *d*, there are at most *d<sup>h</sup>* different types. Let T be the set of all (non-empty) types and for some *t* ∈ T let #<sub>*t*</sub> be the total number of vertices of type *t*.
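
The first phase can be sketched as follows. This is a minimal illustration under assumptions made here (the names `bfs_distances` and `compute_types`, the adjacency-dictionary format, and a connected input graph with non-empty *H*); it is not code from the thesis.

```python
from collections import Counter, deque


def bfs_distances(adj, source):
    """Distances from source to every reachable vertex of an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def compute_types(adj, H):
    """One BFS per vertex of H, then the type of every vertex outside H.

    Returns (types, counts, a): types[v] is the vector of distances from v
    to the vertices of H (in the order of H), counts[t] is the number of
    vertices of type t, and a is the largest distance found by any search.
    """
    dist_from = [bfs_distances(adj, x) for x in H]
    a = max(max(dist.values()) for dist in dist_from)
    in_H = set(H)
    types = {v: tuple(dist[v] for dist in dist_from) for v in adj if v not in in_H}
    counts = Counter(types.values())
    return types, counts, a


# Hypothetical example: the path 0 - 1 - 2 - 3 - 4 with H = [1, 3].
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
types, counts, a = compute_types(path, [1, 3])
```

Storing types as tuples makes them hashable, so counting the vertices of each type is a single `Counter` call.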

For the second phase, we deploy a dynamic program that uses two tables *N* and *T*. The table *N* : *V* × ℕ → 2<sup>*V*</sup> keeps track, for each vertex *v* and each possible distance *e*, of all vertices that have distance exactly *e* from *v* in *G*′ = *G* − *H*. The table *T* : *V* × ℕ × T → ℕ stores for each vertex *v*, each distance *e*, and each type *t* the number of vertices of type *t* that have distance at most *e* from *v* in *G*. Initially *e* = 1, *N*[*v*, 0] = {*v*}, and *N*[*v*, 1] = *N*(*v*) for each vertex *v*, where *N*(*v*) denotes the neighborhood of *v* in *G*′. Before we show how to initialize *T*, we explain the main idea behind it. Note that a shortest path between *v* and a vertex *w* of type *t* either contains a vertex in *H* or it is completely contained in *G*′. If it contains a vertex *x<sub>i</sub>* ∈ *H*, then the distance between *v* and *w* is dist(*v*, *x<sub>i</sub>*) + dist(*x<sub>i</sub>*, *w*). Hence, assuming that a shortest path contains a vertex in *H*, the distance between *v* and *w* is the minimum entry in type(*v*) + type(*w*) = type(*v*) + *t*. We denote this minimum entry by mt(*v*, *t*).

**Figure 3.6:** An example of types. Each entry in the table on the right side displays the distance between the two respective vertices. Each column is computed by a breadth-first search from the respective vertex *x<sub>i</sub>* and each row is the type of the respective vertex. The last row states, for example, that the distances between *w* and *x*<sub>1</sub>, *x*<sub>2</sub>, and *x*<sub>3</sub> are 2, 1, and 1, respectively. Thus, the type of *w* is (2, 1, 1)<sup>*T*</sup>.

Since a path of length zero or one between two vertices *v, w* ∈ *V* \ *H* cannot contain a vertex in *H*, the table can be initialized by

$$T[v,0,t] = \begin{cases} 1, & \text{if } \text{type}(v) = t, \\ 0, & \text{otherwise, and} \end{cases}$$

$$T[v,1,t] = T[v,0,t] + |\{u \in N[v,1] \mid \text{type}(u) = t\}|.$$

The algorithm now iteratively increases *e* and computes *N*[*v*, *e*] and *T*[*v*, *e*, *t*] for each *v* ∈ *V* \ *H* and each *t* ∈ T until in one iteration *T*[*v*, *e*, *t*] = #<sub>*t*</sub> for all *v* and all *t*. Once this is the case, all vertices in *V* \ *H* have pairwise distance at most *e*. Since we already computed the distance from each vertex *x<sub>i</sub>* ∈ *H* to each other vertex, the maximum over all these distances and *e* is the diameter of *G*. The recursive formulas for *N* and *T* are as follows.

$$N[v,e] = \Big(\bigcup\_{u \in N[v,e-1]} N(u)\Big) \setminus \big(N[v,e-1] \cup N[v,e-2]\big) \text{ and}$$

$$T[v,e,t] = \begin{cases} \#\_t, & \text{if } \text{mt}(v,t) \le e, \text{ and} \\ T[v,e-1,t] + |\{u \in N[v,e] \mid \text{type}(u) = t\}| & \text{otherwise}. \end{cases}$$

If at some iteration it holds that *T*[*v*, *e*, *t*] = #<sub>*t*</sub> for all vertices *v* and all types *t*, then the algorithm terminates and returns max{*e*, *a*}. Observe that *e* is equal to the number of table entries in *T* divided by |*V* \ *H*| · |T|. Thus, the solution returned by the dynamic program does not depend on any value stored within *T* but rather on the number of table entries in (the *size* of) *T*.

There are at most *d* iterations in which *e* is increased and table entries of *N* and *T* are computed. Note that all values of the function mt can be precomputed in *O*(|T|<sup>2</sup> · *h*) ⊆ *O*(*d<sup>h</sup>* · *h* · *n*) time as |T| ≤ *d<sup>h</sup>* and |T| ≤ *n*. Note that the computation of *N* closely resembles a breadth-first search in *G*′ and since the maximum degree in *G*′ is *h* and the maximum depth is *d*, computing all entries of *N* for a single vertex takes *O*(*h<sup>d</sup>*) time. To compute all entries *T*[*v*, *e*, *t*] for all *t* ∈ T simultaneously, we iterate over each vertex *w* ∈ *N*[*v*, *e*] and increase the entry *T*[*v*, *e*, type(*w*)] by one. This takes *O*(∑<sub>*e*=1</sub><sup>*d*</sup> |*N*[*v*, *e*]|) ⊆ *O*(*h<sup>d</sup>*) time for each vertex. The running time of our algorithm is *O*(*h* · (*n* + *m*)) for the first phase and *O*(*d<sup>h</sup>* · *h* · *n* + *n* · *h<sup>d</sup>*) for the second phase. This yields an overall running time of

$$O(h \cdot (n+m) + d^h \cdot h \cdot n + n \cdot h^d) \subseteq O((n+m)\cdot h \cdot (h^d + d^h)).\tag{7}$$
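
To make the two phases concrete, the following Python sketch follows the description above. It is an illustration under assumptions made here, not the implementation from the thesis: the function names are hypothetical, the input is a connected graph with at least two vertices given as an adjacency dictionary, the vertices are sorted by degree with a comparison sort (so the first step is not linear-time), and the loop already tests the termination condition for *e* = 0. Only the last two layers of *N* and the cumulative type counts are kept, and the returned value depends on how far *e* had to grow rather than on a stored entry.

```python
from collections import Counter, deque


def bfs_distances(adj, source):
    """Distances from source to every reachable vertex of an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter_h_index(adj):
    """Diameter of a connected graph {vertex: set_of_neighbors} with >= 2 vertices."""
    # H: h vertices of highest degree, where h is the h-index.
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    h = 0
    while h < len(order) and len(adj[order[h]]) >= h + 1:
        h += 1
    H = order[:h]
    in_H = set(H)
    rest = [v for v in adj if v not in in_H]

    # Phase 1: BFS from every vertex in H, types, and the maximum distance a.
    dist_from = [bfs_distances(adj, x) for x in H]
    a = max(max(dist.values()) for dist in dist_from)
    vtype = {v: tuple(dist[v] for dist in dist_from) for v in rest}
    total = Counter(vtype.values())  # number of vertices of each type
    types = list(total)
    # mt for every pair of types: length of a shortest walk through H.
    mt = {(s, t): min(x + y for x, y in zip(s, t)) for s in types for t in types}

    # Phase 2: grow e, maintaining the BFS layers N[v, e] in G' = G - H.
    prev_layer = {v: set() for v in rest}             # N[v, e - 1]
    layer = {v: {v} for v in rest}                    # N[v, e], starting at e = 0
    seen = {v: Counter([vtype[v]]) for v in rest}     # per-type counts within G'
    e = 0
    while True:
        if all(mt[vtype[v], t] <= e or seen[v][t] == total[t]
               for v in rest for t in types):
            return max(e, a)
        e += 1
        for v in rest:
            reached = {w for u in layer[v] for w in adj[u] if w not in in_H}
            new_layer = reached - layer[v] - prev_layer[v]
            prev_layer[v], layer[v] = layer[v], new_layer
            seen[v].update(vtype[w] for w in new_layer)
```

For small inputs, the result can be compared against the maximum over one breadth-first search per vertex.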

**Acyclic chromatic number and domination number.** Finally, we analyze the parameterized complexity of Diameter parameterized by the acyclic chromatic number *a* plus domination number *d*. Note that this combined parameter is incomparable with the combined parameter *h*-index plus diameter as the *h*-index upper-bounds the acyclic chromatic number but the domination number upper-bounds the diameter. Recall that the acyclic chromatic number of a graph *G* is the smallest number *a* such that the vertices of *G* can be partitioned into *a* independent sets such that the induced subgraph of each combination of two of these independent sets is acyclic. We provide a SETH-based lower bound, adapting a reduction from Satisfiability to Diameter by Roditty and Williams [RW13].

**Proposition 3.8.** *There is no* *f*(*a*, *d*) · (*n* + *m*)<sup>2−*ϵ*</sup>*-time algorithm for any computable function* *f* *that solves* Diameter *parameterized by* acyclic chromatic number *a* plus domination number *d* *unless the SETH is false.*

*Proof.* We provide a reduction from Satisfiability to Diameter where the resulting instance has constant acyclic chromatic number and constant domination number and such that an *O*((*n* + *m*)<sup>2−*ε*</sup>)-time algorithm refutes the SETH. We note that the reduction is an extension of the construction by Roditty and Williams [RW13, Theorem 9]. Let *ϕ* be a Satisfiability instance with variable set *W* and clause set *C*. Assume without loss of generality that |*W*| is even. We construct an instance graph *G* = (*V*, *E*) for Diameter as follows.

Randomly partition *W* into two sets *W*<sub>1</sub> and *W*<sub>2</sub> of equal size. Add three sets *V*<sub>1</sub>, *V*<sub>2</sub>, and *B* of vertices to *G*, where each vertex in *V*<sub>1</sub> (in *V*<sub>2</sub>) represents one of the 2<sup>|*W*|/2</sup> possible truth assignments of the variables in *W*<sub>1</sub> (in *W*<sub>2</sub>) and each vertex in *B* represents a clause in *C*. Clearly, |*V*<sub>1</sub>| + |*V*<sub>2</sub>| = 2 · 2<sup>|*W*|/2</sup> and |*B*| = |*C*|. For each *v<sub>i</sub>* ∈ *V*<sub>1</sub> and each *u<sub>j</sub>* ∈ *B*, if the truth assignment corresponding to *v<sub>i</sub>* does *not* satisfy the clause corresponding to *u<sub>j</sub>*, then we add a new vertex *s<sub>ij</sub>* and the two edges {*v<sub>i</sub>*, *s<sub>ij</sub>*} and {*u<sub>j</sub>*, *s<sub>ij</sub>*} to *G*. We call the set of all these newly introduced vertices *S*<sub>1</sub>. Now repeat the process for all vertices *w<sub>i</sub>* ∈ *V*<sub>2</sub> and all *u<sub>j</sub>* ∈ *B* and call the newly introduced vertices *q<sub>ij</sub>*. Let *S*<sub>2</sub> be the set of all *q<sub>ij</sub>*. Finally, we add four new vertices *t*<sub>1</sub>, *t*<sub>2</sub>, *t*<sub>3</sub>, *t*<sub>4</sub> and the sets

$$\begin{aligned} \{ \{ t\_1, v \} \mid v \in V\_1 \}, \\ \{ \{ t\_2, s \} \mid s \in S\_1 \}, \\ \{ \{ t\_3, q \} \mid q \in S\_2 \}, \\ \{ \{ t\_4, w \} \mid w \in V\_2 \}, \\ \{ \{ t\_2, b \} , \{ t\_3, b \} \mid b \in B \}, \text{ and } \\ \{ \{ t\_1, t\_2 \} , \{ t\_2, t\_3 \} , \{ t\_3, t\_4 \} \} \end{aligned}$$

of edges to *G*. See Figure 3.7 for a schematic illustration of the construction.
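
The construction can be reproduced for tiny formulas with the following Python sketch. The encoding is an assumption made here for illustration (clauses as lists of non-zero integers, where literal *k* stands for variable *k* and −*k* for its negation, the first half of the variables as *W*<sub>1</sub>, and tuple-valued vertex names); the helper names `build_diameter_instance` and `diameter` are hypothetical.

```python
from collections import deque
from itertools import product


def build_diameter_instance(num_vars, clauses):
    """Build the graph of the reduction as an adjacency dictionary."""
    assert num_vars % 2 == 0, "the construction assumes an even number of variables"
    half = num_vars // 2
    adj = {}

    def add_edge(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def satisfies(assignment, clause):
        # A partial assignment satisfies a clause if one of its own literals is true.
        return any(abs(lit) in assignment and assignment[abs(lit)] == (lit > 0)
                   for lit in clause)

    sides = (("V1", "S1", range(1, half + 1), "t1", "t2"),
             ("V2", "S2", range(half + 1, num_vars + 1), "t4", "t3"))
    for side, sub_side, variables, hub, sub_hub in sides:
        variables = list(variables)
        for bits in product([False, True], repeat=len(variables)):
            assignment = dict(zip(variables, bits))
            vertex = (side, bits)
            add_edge(hub, vertex)
            for j, clause in enumerate(clauses):
                if not satisfies(assignment, clause):
                    sub = (sub_side, bits, j)       # subdivision vertex s_ij or q_ij
                    add_edge(vertex, sub)
                    add_edge(sub, ("B", j))
                    add_edge(sub_hub, sub)
    for j in range(len(clauses)):
        add_edge("t2", ("B", j))
        add_edge("t3", ("B", j))
    for u, v in (("t1", "t2"), ("t2", "t3"), ("t3", "t4")):
        add_edge(u, v)
    return adj


def diameter(adj):
    """Diameter by one BFS per vertex (assumes a connected graph)."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best
```

On a satisfiable formula such as (*x*<sub>1</sub> ∨ *x*<sub>2</sub>) the resulting graph should have diameter five, and on an unsatisfiable one such as *x*<sub>1</sub> ∧ ¬*x*<sub>1</sub> diameter four, in line with the equivalence shown in the remainder of the proof.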

We will first show that *ϕ* is satisfiable if and only if *G* has diameter five and then show that the domination number and acyclic chromatic number of *G* are at most four and five, respectively. Observe that the diameter of *G* is at most five since each vertex is adjacent to some vertex in {*t*<sub>1</sub>, *t*<sub>2</sub>, *t*<sub>3</sub>, *t*<sub>4</sub>} and these four vertices are of pairwise distance at most three. First assume that *ϕ* is satisfiable. Then, there exists some truth assignment *β* of the variables such that all clauses are satisfied, that is, the two partial truth assignments of *β* with respect to the variables in *W*<sub>1</sub> and *W*<sub>2</sub> satisfy all clauses. Let *v*<sub>1</sub> ∈ *V*<sub>1</sub> and *v*<sub>2</sub> ∈ *V*<sub>2</sub> be the vertices corresponding to *β*. Thus, for each *b* ∈ *B* we have dist(*v*<sub>1</sub>, *b*) + dist(*v*<sub>2</sub>, *b*) ≥ 5. Observe that all paths from a vertex in *V*<sub>1</sub> to a vertex in *V*<sub>2</sub> that do not pass a vertex in *B* pass through *t*<sub>2</sub> and *t*<sub>3</sub> and are hence of length at least five. Thus, the diameter of *G* is dist(*v*<sub>1</sub>, *v*<sub>2</sub>) = 5.

For the reverse direction, assume that there is no satisfying truth assignment for *ϕ*. Then for each pair of vertices *v*<sub>1</sub> ∈ *V*<sub>1</sub> and *v*<sub>2</sub> ∈ *V*<sub>2</sub> it holds that there is some clause in *ϕ* that is not satisfied by either of the two partial truth assignments corresponding to *v*<sub>1</sub> and *v*<sub>2</sub>. Hence, the vertex *u<sub>j</sub>* corresponding to this clause guarantees that dist(*v*<sub>1</sub>, *v*<sub>2</sub>) ≤ dist(*v*<sub>1</sub>, *u<sub>j</sub>*) + dist(*u<sub>j</sub>*, *v*<sub>2</sub>) = 4. Next, observe that each pair (*v*<sub>1</sub>, *v*<sub>2</sub>) of vertices for which neither *v*<sub>1</sub> ∈ *V*<sub>1</sub> and *v*<sub>2</sub> ∈ *V*<sub>2</sub> nor *v*<sub>1</sub> ∈ *V*<sub>2</sub> and *v*<sub>2</sub> ∈ *V*<sub>1</sub> holds is of distance at most four as guaranteed by the vertices *t*<sub>1</sub>, *t*<sub>2</sub>, *t*<sub>3</sub>, and *t*<sub>4</sub>. Thus, the diameter of *G* is four.

**Figure 3.7:** A schematic illustration of the construction in the proof of Proposition 3.8. A vertex *s<sub>ij</sub>* is only connected to *v<sub>i</sub>* and *u<sub>j</sub>* and *q<sub>ij</sub>* is only connected to *w<sub>i</sub>* and *u<sub>j</sub>*. Note that the resulting graph has acyclic chromatic number at most five (the five independent sets are *V*<sub>1</sub> ∪ *V*<sub>2</sub>, *B*, *S*<sub>1</sub> ∪ *S*<sub>2</sub> ∪ {*t*<sub>1</sub>, *t*<sub>4</sub>}, {*t*<sub>2</sub>}, and {*t*<sub>3</sub>} and are also represented by colors). Moreover, the domination number of the graph is at most four as {*t*<sub>1</sub>, *t*<sub>2</sub>, *t*<sub>3</sub>, *t*<sub>4</sub>} is a dominating set.

The domination number of *G* is at most four since {*t*<sub>1</sub>, *t*<sub>2</sub>, *t*<sub>3</sub>, *t*<sub>4</sub>} is a dominating set. The acyclic chromatic number of *G* is at most five as *V*<sub>1</sub> ∪ *V*<sub>2</sub>, {*t*<sub>2</sub>}, *B*, {*t*<sub>3</sub>}, and *S*<sub>1</sub> ∪ *S*<sub>2</sub> ∪ {*t*<sub>1</sub>, *t*<sub>4</sub>} are each an independent set and each combination of two of them not including *S*<sub>1</sub> ∪ *S*<sub>2</sub> ∪ {*t*<sub>1</sub>, *t*<sub>4</sub>} only induces independent sets or stars. Moreover, note that *S*<sub>1</sub> ∪ *S*<sub>2</sub> ∪ {*t*<sub>1</sub>, *t*<sub>4</sub>} ∪ {*t*<sub>2</sub>} and *S*<sub>1</sub> ∪ *S*<sub>2</sub> ∪ {*t*<sub>1</sub>, *t*<sub>4</sub>} ∪ {*t*<sub>3</sub>} each only induce a star and an independent set. Lastly, *S*<sub>1</sub> ∪ *S*<sub>2</sub> ∪ {*t*<sub>1</sub>, *t*<sub>4</sub>} ∪ *V*<sub>1</sub> ∪ *V*<sub>2</sub> induces two trees of depth 2 (where *t*<sub>1</sub> and *t*<sub>4</sub> are the roots and the vertices in *S*<sub>1</sub> and *S*<sub>2</sub> are the leaves) and *S*<sub>1</sub> ∪ *S*<sub>2</sub> ∪ {*t*<sub>1</sub>, *t*<sub>4</sub>} ∪ *B* induces a disjoint union of stars and isolated vertices as each vertex in *S*<sub>1</sub> ∪ *S*<sub>2</sub> ∪ {*t*<sub>1</sub>, *t*<sub>4</sub>} has degree at most one in *G*[*B* ∪ *S*<sub>1</sub> ∪ *S*<sub>2</sub> ∪ {*t*<sub>1</sub>, *t*<sub>4</sub>}].

Now assume that there is an *f*(*k*) · (*n* + *m*)<sup>2−*ε*</sup>-time algorithm for Diameter parameterized by domination number plus acyclic chromatic number for some computable function *f* and some *ε* > 0, where *k* denotes the combined parameter. The constructed graph has *O*(2<sup>|*W*|/2</sup> · |*C*|) vertices and edges and its combined parameter is at most 4 + 5 = 9. Since *f*(9) is some constant, this implies an algorithm with running time

$$\begin{aligned} &f(9) \cdot (2^{|W|/2} \cdot |C|)^{2-\varepsilon} \\ &\in O(2^{(|W|/2)(2-\varepsilon)} \cdot |C|^{(2-\varepsilon)}) \\ &= O(2^{|W|(1-\varepsilon/2)} \cdot |C|^{(2-\varepsilon)}) \\ &= 2^{|W|(1-\varepsilon')} \cdot (|C|+|W|)^{O(1)} \text{ for some } \varepsilon' > 0. \end{aligned}$$

Such an algorithm for Diameter would refute the SETH [RW13].

### **3.4 Concluding Remarks**

We conclude this chapter with some possible avenues for further research regarding Diameter. We believe that a broader reflection on the techniques we used (e. g. dynamic programming) is better deferred to the concluding chapter of this thesis, where we can compare the different dynamic programs we develop in this thesis. Concerning the complexity landscape shown in Figure 3.2, only a few open cases remain. Perhaps most interesting among them are the following two questions. Is there a *k*<sup>*O*(1)</sup> · (*n* + *m*)<sup>1+*ε*</sup>-time algorithm for Diameter parameterized by the distance *k* to interval graphs, and is there an *f*(*d*) · *n*<sup>2</sup>-time algorithm for Diameter parameterized by the diameter *d*? Our algorithms working with parameter combinations are probably not competitive with state-of-the-art unparameterized algorithms due to their exponential dependency on the parameter value(s), even in graphs with properties similar to social networks and even though they cannot be improved by much unless the SETH breaks. So the question remains whether there are parameters *k*<sub>1</sub>, …, *k<sub>ℓ</sub>* (that are possibly not displayed in Figure 3.2) that are small in real-world applications and that allow for practically relevant running times like ∏<sub>*i*=1</sub><sup>*ℓ*</sup> *k<sub>i</sub>* · (*n* + *m*) or even (*n* + *m*) · ∑<sub>*i*=1</sub><sup>*ℓ*</sup> *k<sub>i</sub>*. A parameter capturing the special community structures of social networks [GN02] might be a good candidate to be included in such a parameter combination.

# **Chapter 4**

### **Length-Bounded Cuts**

In this chapter, we investigate, on a conceptual level, a peculiar case of how to compute the solution for a problem, once the table of a dynamic program is completely filled. Similarly to Chapter 3, the first important question we have to answer is what a table entry should represent. Answering this question requires some structural observations and is by far the most complicated part of this chapter. However, once we have answered this question, determining the table dimension and computing each table entry are fairly straightforward while computing the solution from the filled table is not.

The problem we study in this chapter stems from the area of network flows. The study of network flows and, in particular, of the Edge-Disjoint Paths problem began in the 1950s with the work of Ford and Fulkerson [FF56] and has since then constituted a prominent research area in graph algorithms. In the Edge-Disjoint Paths problem, we are given an undirected graph *G*, two vertices *s* and *t*, called the *source* and the *target*, and a positive integer *β*. The question is whether there is a collection of at least *β* edge-disjoint *s*-*t*-paths in *G*. It is worth pointing out that nowadays there are many more efficient algorithms than the one by Ford and Fulkerson [FF56] for finding *β* edge-disjoint *s*-*t*-paths in a given graph (see e. g. the work by Dinitz [Din06]).

A natural counterpart of Edge-Disjoint Paths is Edge Cut. Therein, the question is whether there is a set *F* of at most *β* edges such that there is no *s*-*t*-path in the graph after removing the edges in *F*. There is a strong dual relationship between Edge-Disjoint Paths and Edge Cut in the sense that, if both problems admit a solution for a given *β*, then the value of *β* is optimal, that is, it is not possible to find *β* + 1 edge-disjoint *s*-*t*-paths and the removal of any set of *β* − 1 edges leaves *s* and *t* in the same connected component. Consequently, since Edge Cut can be solved in polynomial time, so can Edge-Disjoint Paths. Quite naturally, there are many variants of the above-described network flow/cut problems such as multicommodity flows, unsplittable flows, and the related cut problems (e. g. Schrijver [Sch03] provides further examples and formal definitions). Unlike Edge-Disjoint Paths and Edge Cut, it is not always the case that the respective flow and the cut problem belong to the same complexity class. We investigate a variant of Edge Cut called Length-Bounded Cut. It originates from network design and telecommunications and Gouveia et al. [GPS08], Huygens and Ridha Mahjoub [HR07], and Huygens et al. [Huy+07] describe further applications. Length-Bounded Cut is an example where the cut problem is harder than the respective flow problem. Length-Bounded Cut is *NP*-hard [Bai+10] while the respective flow problem is polynomial-time solvable [MM10].

Our main contribution in this chapter is a dynamic-programming-based polynomial-time algorithm for Length-Bounded Cut on proper interval graphs. This confirms a conjecture by Bazgan et al. [Baz+19]. We conclude this chapter by showing some limitations of our approach when trying to adapt it for interval graphs. The existence of a polynomial-time algorithm for Length-Bounded Cut on interval graphs was also posed as an open problem by Bazgan et al. [Baz+19].

# **4.1 Problem Definition and Related Work**

In this chapter, we study Length-Bounded Cut, which is the cut problem related to the variant of Edge-Disjoint Paths where an additional bound *λ* is given and the sought collection of *s*-*t*-paths can only contain paths of length at most *λ*. This problem has been introduced by Adámek and Koubek [AK71] and is formally defined as follows.

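Length-Bounded Cut

**Input:** An undirected graph *G* = (*V*, *E*), two vertices *s*, *t* ∈ *V*, and two integers *β* and *λ*.

**Question:** Is there a set *F* ⊆ *E* of at most *β* edges such that the distance between *s* and *t* in *G* − *F* is at least *λ* + 1?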

An example of Length-Bounded Cut is given in Figure 4.1. For *λ* = |*V*| one is left with the original problem Edge Cut, which is polynomial-time solvable. Length-Bounded Cut is also solvable in polynomial time if *λ* ≤ 3 [MM10]. However, Baier et al. [Bai+10] showed that Length-Bounded Cut is NP-hard for *λ* = 4. The related flow problem Length-Bounded Flow, where we restrict the flow to paths of length at most *λ*, can be solved in polynomial time via a reduction to linear programming [Bai+10, KS06, MM10].

**Figure 4.1:** An example graph. The dashed edges form a solution for Length-Bounded Cut with *β* = 2 and *λ* = 3.
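
For intuition, the problem can be checked by brute force on very small graphs. The sketch below is only an illustration with hypothetical names (`st_distance`, `has_length_bounded_cut`) and an example graph chosen here, not the graph of Figure 4.1; it tries every set of at most *β* edges and therefore takes exponential time.

```python
from collections import deque
from itertools import combinations
import math


def st_distance(edges, s, t):
    """Length of a shortest s-t-path using only the given edges (inf if none)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist.get(t, math.inf)


def has_length_bounded_cut(edges, s, t, beta, lam):
    """Is there a set of at most beta edges whose removal leaves no
    s-t-path of length at most lam? Assumes pairwise distinct edges."""
    edges = list(edges)
    for size in range(min(beta, len(edges)) + 1):
        for removed in combinations(edges, size):
            remaining = [edge for edge in edges if edge not in removed]
            if st_distance(remaining, s, t) > lam:
                return True
    return False


# Hypothetical example: a short path s-a-t and a longer path s-c-d-t.
edges = [("s", "a"), ("a", "t"), ("s", "c"), ("c", "d"), ("d", "t")]
```

In this example a single edge suffices for *λ* = 2, because only the short path has to be destroyed, whereas for *λ* = 5 (at least the number of vertices) the problem coincides with Edge Cut and two edges are needed.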

We note that the result of Baier et al. [Bai+10] in fact gives NP-hardness for Length-Bounded Cut for each constant *λ* ≥ 4. Thus, in order to obtain tractability results, one presumably has to either consider a different parameterization or combine *λ* with some other parameter. Golovach and Thilikos [GT11] first studied Length-Bounded Cut from the viewpoint of parameterized complexity. They showed that Length-Bounded Cut is fixed-parameter tractable for the combined parameter *β* + *λ*. It is worth noting that the parameter *β* alone gives *W[1]*-hardness [GT11]. Later, Fluschnik et al. [Flu+18] proved that it is unlikely that a polynomial kernel in *β* + *λ* exists. Dvořák and Knop [DK18] considered structural parameters for Length-Bounded Cut. They showed that it is *W[1]*-hard when parameterized by the pathwidth of the input graph while it is fixed-parameter tractable when parameterized by the treedepth of the input graph. Kolman [Kol18] gave an *O*(*λ<sup>τ</sup>* · (*n* + *m*))-time algorithm for Length-Bounded Cut, where *τ* is the treewidth of *G*. Furthermore, Length-Bounded Cut is fixed-parameter tractable for the parameter *λ* if *G* is planar [Kol18] (it remains NP-complete on planar graphs [Flu+18]). Bazgan et al. [Baz+19] studied both restrictions on special graph classes as well as structural parameterizations for Length-Bounded Cut. They provided an *XP*-algorithm for the maximum degree of the input graph *G* and fixed-parameter tractability for the feedback edge number. Furthermore, they presented a polynomial-time algorithm for co-graphs while showing *NP*-completeness even if the input is restricted to bipartite graphs or split graphs. Finally, Length-Bounded Cut is *W[1]*-hard with respect to the combined parameter pathwidth and maximum degree and with respect to the feedback vertex number [BHK20].

# **4.2 Polynomial-Time Algorithm for Proper Interval Graphs**

In this section, we present a polynomial-time algorithm for Length-Bounded Cut on proper interval graphs. To this end, for each vertex *v* ∉ {*s*, *t*}, we define a set of vertices that contains *v* and *t*. The algorithm for Length-Bounded Cut on proper interval graphs is then a dynamic program that stores for each vertex *v* and each possible distance *d* (2 ≤ *d* ≤ *λ*) the minimum size of a cut that makes each vertex in the described set have distance at least *d* from *s*.

Recall that each vertex *v* in a proper interval graph can be represented by an interval [*b<sub>v</sub>*, *f<sub>v</sub>*]<sub>ℚ</sub> such that two vertices *u*, *w* are adjacent in *G* if and only if [*b<sub>u</sub>*, *f<sub>u</sub>*]<sub>ℚ</sub> ∩ [*b<sub>w</sub>*, *f<sub>w</sub>*]<sub>ℚ</sub> ≠ ∅ and no interval representing a vertex is properly contained in the interval representing another vertex. Observe that we can assume without loss of generality that *b<sub>s</sub>* ≤ *b<sub>t</sub>* as we can otherwise "mirror" the graph by setting *b<sub>v</sub>* = −*f<sub>v</sub>* and *f<sub>v</sub>* = −*b<sub>v</sub>* for each vertex *v* ∈ *V*. It is folklore that one can assume that |{*b<sub>v</sub>* | *v* ∈ *V*}| = |*V*|. We further assume that the vertices in *V* \ {*s*, *t*} are named *v*<sub>1</sub>, *v*<sub>2</sub>, …, *v*<sub>*n*−2</sub> such that *b*<sub>*v<sub>i</sub>*</sub> < *b*<sub>*v<sub>j</sub>*</sub> for all *i* < *j*. We first show that we can safely ignore all vertices *v* with *f<sub>v</sub>* < *b<sub>s</sub>* or *f<sub>t</sub>* < *b<sub>v</sub>*. It is worth noting that the following lemma holds for interval graphs and not only for proper interval graphs.
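
The interval representation and the mirroring step can be made concrete with a small Python sketch. The data layout (a dictionary mapping each vertex to its pair of endpoints) and the function names are assumptions made here for illustration only.

```python
def interval_graph(intervals):
    """Adjacency dictionary of the interval graph of {vertex: (b, f)}."""
    adj = {v: set() for v in intervals}
    for u, (bu, fu) in intervals.items():
        for w, (bw, fw) in intervals.items():
            # Two closed intervals intersect iff the larger left endpoint
            # does not exceed the smaller right endpoint.
            if u != w and max(bu, bw) <= min(fu, fw):
                adj[u].add(w)
    return adj


def is_proper(intervals):
    """True if no interval is properly contained in another one."""
    spans = list(intervals.values())
    return not any(
        b1 <= b2 and f2 <= f1 and (b1, f1) != (b2, f2)
        for b1, f1 in spans
        for b2, f2 in spans
    )


def mirror(intervals):
    """Reflect every interval at the origin: (b, f) becomes (-f, -b)."""
    return {v: (-f, -b) for v, (b, f) in intervals.items()}


# Hypothetical example with pairwise distinct left endpoints.
intervals = {"s": (0, 2), "v1": (1, 3), "v2": (3, 5), "t": (4, 6)}
graph = interval_graph(intervals)
order = sorted(intervals, key=lambda v: intervals[v][0])  # by left endpoint
```

Mirroring leaves the graph unchanged because it preserves which intervals intersect; it only reverses the order of the left endpoints.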

**Lemma 4.1.** *Let* *I* = (*G* = (*V*, *E*), *s*, *t*, *β*, *λ*) *be an instance of* Length-Bounded Cut *where* *G* *is an interval graph and* *b<sub>s</sub>* < *b<sub>t</sub>* *in the interval representation. Let* *L* := {*u* ∈ *V* | *f<sub>u</sub>* < *b<sub>s</sub>*} *and* *R* := {*u* ∈ *V* | *f<sub>t</sub>* < *b<sub>u</sub>*}. *Then,* *I*′ := (*G* − (*L* ∪ *R*), *s*, *t*, *β*, *λ*) *is an equivalent instance of* Length-Bounded Cut.

*Proof.* Let *I*, *I*′, *G*, *s*, *t*, *β*, *λ*, *L*, and *R* be as defined above. We first show that *I<sub>L</sub>* = (*G* − *R*, *s*, *t*, *β*, *λ*) is an equivalent instance. Note that *s*, *t* ∉ *L* ∪ *R* and hence *I<sub>L</sub>* and *I*′ are instances of Length-Bounded Cut. Note further that deleting vertices from any input graph cannot decrease the distance between any pair of remaining vertices and hence if *I* is a yes-instance, then so are *I<sub>L</sub>* and *I*′. Hence it remains to show that if *I<sub>L</sub>* is a yes-instance, then so is *I*.

Assume towards a contradiction that *I<sub>L</sub>* is a yes-instance and *I* is a no-instance. Then there is a set *F<sub>L</sub>* of *β* edges in *G* − *R* such that the distance between *s* and *t* in *G<sub>L</sub>* := (*V* \ *R*, *E* \ (*F<sub>L</sub>* ∪ {{*u*, *v*} ∈ *E* | *u* ∈ *R*})) is at least *λ* + 1. Since *I* is a no-instance, there is a path *P* of length at most *λ* between *s* and *t* in *G*<sup>∗</sup> := (*V*, *E* \ *F<sub>L</sub>*). As *G<sub>L</sub>* and *G*<sup>∗</sup> only differ in *R*, each path of length at most *λ* between *s* and *t* in *G*<sup>∗</sup> contains at least one vertex from *R*. We will show that deg<sub>*G*</sub>(*t*) ≤ |*F<sub>L</sub>*| and hence there is an *s*-*t*-cut of size at most *β* in *G* and thus *I* is a yes-instance. This contradicts the assumption that *I* is a no-instance and hence finishes the proof that *I<sub>L</sub>* is equivalent to *I*.

We start by giving some basic notation for the proof to come. We use sets of vertices that have a certain distance from *s* in some subgraph *H* of *G*. To this end, we define *X*<sub>*H*</sub><sup>*p*</sup> := {*u* ∈ *V* | dist<sub>*H*</sub>(*s*, *u*) = *p*} for each distance *p*. Analogously, we define *X*<sub>*H*</sub><sup>≤*p*</sup> := {*u* ∈ *V* | dist<sub>*H*</sub>(*s*, *u*) ≤ *p*} and *X*<sub>*H*</sub><sup>≥*p*</sup> := {*u* ∈ *V* | dist<sub>*H*</sub>(*s*, *u*) ≥ *p*}.

Let *d* := dist<sub>*G*<sup>∗</sup></sub>(*s*, *t*) and let *t*′ be the vertex in *P* with maximum *b*<sub>*t*′</sub>. Since *P* contains a vertex from *R*, it holds that *b*<sub>*t*′</sub> > *f<sub>t</sub>* and hence *t*′ ∉ *N<sub>G</sub>*(*t*). Since *t*′ is on a shortest *s*-*t*-path in *G*<sup>∗</sup> and *t*′ ∉ *N<sub>G</sub>*(*t*), it holds that *t*′ ∈ *X*<sub>*G*<sup>∗</sup></sub><sup>≤*d*−2</sup>. Now consider the set *K* of vertices that are part of a shortest *s*-*t*′-path in *G*<sup>∗</sup> and that are neighbors of *t* in *G*. By construction *K* ⊆ *X*<sub>*G*<sup>∗</sup></sub><sup>≤*d*−3</sup> and for each *y* ∈ [*b<sub>t</sub>*, *f<sub>t</sub>*]<sub>ℚ</sub> there is a vertex *v* ∈ *K* with *y* ∈ [*b<sub>v</sub>*, *f<sub>v</sub>*]<sub>ℚ</sub>. We next show that |*F<sub>L</sub>*| ≥ deg<sub>*G*</sub>(*t*). To this end, consider any vertex *u* ∈ *N<sub>G</sub>*(*t*). If *u* ∈ *X*<sub>*G*<sup>∗</sup></sub><sup>≤*d*−2</sup>, then it holds that {*u*, *t*} ∈ *F<sub>L</sub>*. Otherwise *u* ∈ *X*<sub>*G*<sup>∗</sup></sub><sup>≥*d*−1</sup>. Observe that for each *u* ∈ *N<sub>G</sub>*(*t*) it holds by definition that there is a *y* ∈ [*b<sub>u</sub>*, *f<sub>u</sub>*]<sub>ℚ</sub> ∩ [*b<sub>t</sub>*, *f<sub>t</sub>*]<sub>ℚ</sub> and hence there is a vertex *v* ∈ *K* with *y* ∈ [*b<sub>v</sub>*, *f<sub>v</sub>*]<sub>ℚ</sub> and hence {*u*, *v*} ∈ *E*. Note that *u* ∈ *X*<sub>*G*<sup>∗</sup></sub><sup>≥*d*−1</sup> and *v* ∈ *K* ⊆ *X*<sub>*G*<sup>∗</sup></sub><sup>≤*d*−3</sup>. Since

$$\text{dist}\_{G^\ast}(s, u) \ge d - 1 > d - 3 + 1 \ge \text{dist}\_{G^\ast}(s, v) + 1,$$

it holds that {*u*, *v*} ∈ *F<sub>L</sub>*. Since {*v*, *t*} ∈ *F<sub>L</sub>* for all *v* ∈ *N<sub>G</sub>*(*t*) ∩ *X*<sub>*G*<sup>∗</sup></sub><sup>≤*d*−2</sup>, since for all *v* ∈ *N<sub>G</sub>*(*t*) ∩ *X*<sub>*G*<sup>∗</sup></sub><sup>≥*d*−1</sup> there is some *w* ∈ *K* such that {*v*, *w*} ∈ *F<sub>L</sub>*, and since *K* ∩ *X*<sub>*G*<sup>∗</sup></sub><sup>≥*d*−1</sup> = ∅, there is a unique edge in *F<sub>L</sub>* for each *v* ∈ *N<sub>G</sub>*(*t*). Hence, *β* = |*F<sub>L</sub>*| ≥ deg<sub>*G*</sub>(*t*) and thus there is a trivial *s*-*t*-cut of size at most *β* in *G* that contains all edges incident to *t*. Thus, *I* is a yes-instance.

We conclude the proof by showing that *I*′ is equivalent to *I<sub>L</sub>*. Note that we consider undirected graphs and hence we can exchange the roles of *s* and *t* and *mirror* the graph by setting *b*′<sub>*v*</sub> := −*f<sub>v</sub>* and *f*′<sub>*v*</sub> := −*b<sub>v</sub>* for each vertex *v* ∈ *V*. Note that all vertices in *L* (originally fulfilling *f<sub>u</sub>* < *b<sub>s</sub>*) now satisfy *b*′<sub>*u*</sub> > *f*′<sub>*s*</sub>. Hence, if we interchange the names of *s* and *t*, then they satisfy the condition of *R* and hence we can use the argument above to show that *I*′ and *I<sub>L</sub>* are equivalent.
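
As a preprocessing step, Lemma 4.1 amounts to a simple filter on the interval representation. The following sketch uses a hypothetical function name and example; it assumes intervals given as a dictionary of endpoint pairs with *b<sub>s</sub>* < *b<sub>t</sub>* and only computes the reduced vertex set, leaving *β* and *λ* untouched.

```python
def trim_instance(intervals, s, t):
    """Remove L = {u : f_u < b_s} and R = {u : f_t < b_u} from {vertex: (b, f)}.

    Returns the remaining intervals together with the removed sets L and R.
    """
    b_s, _ = intervals[s]
    _, f_t = intervals[t]
    L = {u for u, (_, f_u) in intervals.items() if f_u < b_s}
    R = {u for u, (b_u, _) in intervals.items() if f_t < b_u}
    kept = {u: span for u, span in intervals.items() if u not in L | R}
    return kept, L, R


# Hypothetical example: x lies entirely left of s, z entirely right of t.
intervals = {"x": (-3, -1), "s": (0, 2), "a": (1, 3),
             "t": (2, 4), "y": (3, 5), "z": (5, 7)}
kept, L, R = trim_instance(intervals, "s", "t")
```

Note that the vertex `y` is kept although its interval extends beyond that of *t*: the lemma only removes vertices whose intervals lie completely to the left of *s* or completely to the right of *t*.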

Using Lemma 4.1, we can always assume that there is no vertex *v* with *f<sub>v</sub>* < *b<sub>s</sub>* or *b<sub>v</sub>* > *f<sub>t</sub>*. We next show that if there is a solution, then there is also a solution in which the distance from *s* to *v<sub>j</sub>* is non-decreasing in *j*.

**Lemma 4.2.** *Let* *G* = (*V*, *E*) *be a proper interval graph where no vertex* *v* *satisfies* *f<sub>t</sub>* < *b<sub>v</sub>* *or* *b<sub>s</sub>* > *f<sub>v</sub>* *and let* *F* *be a set of edges. Let* *d* *be the distance from* *s* *to* *t* *in* *G*′ := (*V*, *E* \ *F*). *Then, there is a set* *F*′ *of edges with* |*F*′| ≤ |*F*| *such that* dist<sub>*G*′′</sub>(*s*, *t*) ≥ *d* *in* *G*′′ := (*V*, *E* \ *F*′) *and* dist<sub>*G*′′</sub>(*s*, *v<sub>i</sub>*) ≤ dist<sub>*G*′′</sub>(*s*, *v<sub>j</sub>*) *for each* *v<sub>i</sub>*, *v<sub>j</sub>* ∈ *V* \ {*s*, *t*} *with* *b*<sub>*v<sub>i</sub>*</sub> < *b*<sub>*v<sub>j</sub>*</sub>.

*Proof.* Let *G*, *s*, *t*, *F*, *G*′, and *d* be as defined above. The main idea of this proof is to construct a sequence of graphs which starts with the graph *G*′ and ends with the sought graph *G*′′. To this end, we define for each vertex *v* ∈ *V* in a graph *H* = (*V*, *E<sub>H</sub>*) a specific distance *D<sub>H</sub>*(*v*). We define *D<sub>H</sub>*(*v*) to be the length of a shortest path *P* = (*s* = *u*<sub>0</sub>, *u*<sub>1</sub>, *u*<sub>2</sub>, …, *u<sub>α</sub>* = *v*) from *s* to *v* in *H* such that for all *γ* ∈ [*α* − 1] it holds that *b*<sub>*u<sub>γ</sub>*</sub> < *b*<sub>*u*<sub>*γ*+1</sub></sub>. As a special case, if *u<sub>α</sub>* = *t*, then we only require that for all *γ* ∈ [*α* − 2] it holds that *b*<sub>*u<sub>γ</sub>*</sub> < *b*<sub>*u*<sub>*γ*+1</sub></sub>. We call such paths monotone, and if no monotone *s*-*v*-path exists, then we set *D<sub>H</sub>*(*v*) := ∞. Observe that for each graph *H* it holds that *D<sub>H</sub>*(*s*) = 0 and *D<sub>H</sub>*(*v*) ≥ dist<sub>*H*</sub>(*s*, *v*). Let G := {*G*<sup>∗</sup> = (*V*, *E*<sup>∗</sup>) | *E*<sup>∗</sup> ⊆ *E* ∧ |*E*<sup>∗</sup>| ≥ |*E* \ *F*|}. We present a sequence of graphs (*G*′ = *G*<sub>1</sub>, *G*<sub>2</sub>, …, *G<sub>k</sub>*) such that


**Claim 4.3.** *If such a sequence of graphs exists, then* *F<sub>k</sub>* = *E* \ *E<sub>k</sub>* *and* *G*′′ := *G<sub>k</sub>* *satisfy Lemma 4.2.*

*Proof of Claim 4.3.* First, we show that dist<sub>*G<sub>k</sub>*</sub>(*s*, *v*) = *D*<sub>*G<sub>k</sub>*</sub>(*v*) for all *v*. Assume towards a contradiction that there is a vertex *v* ≠ *t* with dist<sub>*G<sub>k</sub>*</sub>(*s*, *v*) ≠ *D*<sub>*G<sub>k</sub>*</sub>(*v*). Consider any shortest *s*-*v*-path *P* in *G<sub>k</sub>*. Let *w* be the first vertex on *P* with dist<sub>*G<sub>k</sub>*</sub>(*s*, *w*) ≠ *D*<sub>*G<sub>k</sub>*</sub>(*w*) and let *w*′ be its predecessor. By this definition, it holds that dist<sub>*G<sub>k</sub>*</sub>(*s*, *w*′) = *D*<sub>*G<sub>k</sub>*</sub>(*w*′) and *b<sub>w</sub>* < *b*<sub>*w*′</sub> as otherwise dist<sub>*G<sub>k</sub>*</sub>(*s*, *w*) = *D*<sub>*G<sub>k</sub>*</sub>(*w*). Since *D*<sub>*G<sub>k</sub>*</sub>(*w*) ≥ dist<sub>*G<sub>k</sub>*</sub>(*s*, *w*) and *D*<sub>*G<sub>k</sub>*</sub>(*w*) ≠ dist<sub>*G<sub>k</sub>*</sub>(*s*, *w*), it follows that

$$D\_{G\_k}(w) > \text{dist}\_{G\_k}(s, w) = \text{dist}\_{G\_k}(s, w') + 1 = D\_{G\_k}(w') + 1,$$

a contradiction to (3) and *b<sub>w</sub>* < *b*<sub>*w*′</sub>.

Now assume that dist<sub>*G<sub>k</sub>*</sub>(*s*, *t*) ≠ *D*<sub>*G<sub>k</sub>*</sub>(*t*). Let *P* be a shortest *s*-*t*-path in *G<sub>k</sub>*. Let *v* be the predecessor of *t* in *P*. We have shown that dist<sub>*G<sub>k</sub>*</sub>(*s*, *v*) = *D*<sub>*G<sub>k</sub>*</sub>(*v*) and hence dist<sub>*G<sub>k</sub>*</sub>(*s*, *t*) = dist<sub>*G<sub>k</sub>*</sub>(*s*, *v*) + 1 = *D*<sub>*G<sub>k</sub>*</sub>(*v*) + 1 = *D*<sub>*G<sub>k</sub>*</sub>(*t*), a contradiction. The last step follows from *D*<sub>*G<sub>k</sub>*</sub>(*t*) ≥ dist<sub>*G<sub>k</sub>*</sub>(*s*, *t*) and the fact that *D*<sub>*G<sub>k</sub>*</sub>(*t*) ≤ *D*<sub>*G<sub>k</sub>*</sub>(*v*) + 1 as *v* is a neighbor of *t* in *G<sub>k</sub>* and the special case in the definition of *D* allows us to ignore *b<sub>t</sub>*.

The claim now easily follows. Note that (1) ensures that $G\_k \in \mathcal{G}$ and hence $|F\_k| \leq |F|$. It follows from (2) that

$$\text{dist}\_{G\_k}(s, t) = D\_{G\_k}(t) \ge D\_{G\_{k-1}}(t) \ge \dots \ge D\_{G\_1}(t) = D\_{G'}(t) \ge \text{dist}\_{G'}(s, t) \ge d.$$

Finally, (3) states that for all $v, w \in V \setminus \{s, t\}$ with $b\_v < b\_w$ it holds that

$$\text{dist}\_{G^{\prime\prime}}(s,v) = D\_{G^{\prime\prime}}(v) \le D\_{G^{\prime\prime}}(w) = \text{dist}\_{G^{\prime\prime}}(s,w). \tag{8}$$

We now describe how to obtain the sequence $(G' = G\_1, G\_2, \ldots, G\_k)$ of graphs. To this end, we need a rather technical order over the graphs in $\mathcal{G}$. We say that $(V, E\_\alpha) = G\_\alpha <\_\Delta G\_\gamma = (V, E\_\gamma)$ for $G\_\alpha, G\_\gamma \in \mathcal{G}$ if and only if


$$\bullet \quad |E\_{\alpha}| = |E\_{\gamma}|, \; D\_{G\_{\alpha}}(v) = D\_{G\_{\gamma}}(v) \text{ for all } v \in V \; \backslash \; \{t\}, \; \text{and } D\_{G\_{\alpha}}(t) < D\_{G\_{\gamma}}(t).$$

Notice that $<\_\Delta$ defines a total preorder on $\mathcal{G}$, that is, the order $<\_\Delta$ is transitive, reflexive, and for each two graphs $G\_\alpha, G\_\beta \in \mathcal{G}$ with $G\_\alpha \neq G\_\beta$ it holds that $G\_\alpha <\_\Delta G\_\beta$ or $G\_\beta <\_\Delta G\_\alpha$.

Let $G\_\ell$ be a graph in the sequence. We will guarantee that each graph in the sequence fulfills (1) and (2). Consequently, if $G\_\ell$ satisfies (3), then we have found the last graph in the sequence. Otherwise, we describe how to obtain another graph $G\_{\ell+1} \in \mathcal{G}$ such that (2) holds for $G\_\ell$ and $G\_{\ell+1}$ and $G\_{\ell+1} <\_\Delta G\_\ell$. Since $<\_\Delta$ is a total preorder on the finite set $\mathcal{G}$, we can only build a finite sequence and hence at some point a graph has to satisfy (3).

Since $G\_\ell = (V, E\_\ell)$ does not satisfy (3), there is some minimum $j$ such that $D\_{G\_\ell}(v\_j) > D\_{G\_\ell}(v\_{j+1})$. Let

$$X := \{ x \in N\_G(v\_{j+1}) \mid (b\_x < b\_{v\_j} \lor x = s) \land \{v\_j, x\} \in E \setminus E\_\ell \land \{v\_{j+1}, x\} \in E\_\ell\},$$

$$Y := \{ y \in N\_G(v\_j) \mid (b\_y > b\_{v\_{j+1}} \lor y = t) \land \{v\_{j+1}, y\} \in E \setminus E\_\ell \land \{v\_j, y\} \in E\_\ell\}.$$

See Figure 4.2 for an example of *X* and *Y* . We distinguish between the two cases |*X*| ≥ |*Y* | and |*X*| *<* |*Y* |.

**Figure 4.2:** An example for $X$ and $Y$. Red (vertical) edges are contained in $E \setminus E\_\ell$. For the sake of readability we do not depict edges in $E\_\ell$. Note that $X = \{v\_{j-2}\}$ and $Y = \{v\_{j+3}\}$.

**Case 1 ($|X| \geq |Y|$):** Let

$$E\_{\ell+1} := (E\_{\ell} \backslash \{ \{ v\_j, y \} \mid y \in Y \}) \cup \{ \{ v\_j, x \} \mid x \in X \},$$

and $G\_{\ell+1} := (V, E\_{\ell+1})$. Since $|X| \geq |Y|$, $X \cap Y = \emptyset$, and $G\_\ell \in \mathcal{G}$, it holds that $|E\_{\ell+1}| \geq |E\_\ell| \geq |E \setminus F|$ and thus $G\_{\ell+1} \in \mathcal{G}$. Clearly, for all $v \in V \setminus \{t\}$ with $b\_v < b\_{v\_j}$, we have $D\_{G\_{\ell+1}}(v) = D\_{G\_\ell}(v)$ as $E\_{\ell+1}$ and $E\_\ell$ only differ in edges incident to $v\_j$. Let $w$ be the predecessor of $v\_{j+1}$ in a shortest monotone $s$-$v\_{j+1}$-path in $G\_\ell$. Since $D\_{G\_\ell}(v\_j) > D\_{G\_\ell}(v\_{j+1})$ it holds that $w \neq v\_j$ and hence $b\_w < b\_{v\_j} < b\_{v\_{j+1}} \leq f\_w$. Moreover, vertex $w$ is contained in $X$ as otherwise $D\_{G\_\ell}(v\_j) \leq D\_{G\_\ell}(w) + 1 = D\_{G\_\ell}(v\_{j+1})$. Thus,

$$D\_{G\_{\ell+1}}(v\_j) = D\_{G\_{\ell+1}}(w) + 1 = D\_{G\_{\ell}}(w) + 1 = D\_{G\_{\ell}}(v\_{j+1}) < D\_{G\_{\ell}}(v\_j)$$

and combined with $|E\_{\ell+1}| \geq |E\_\ell|$ this yields $G\_{\ell+1} <\_\Delta G\_\ell$.

It remains to show that $D\_{G\_\ell}(t) \leq D\_{G\_{\ell+1}}(t)$. Consider a shortest monotone $s$-$t$-path $P$ in $G\_{\ell+1}$. If $P$ does not pass through $v\_j$, then it is also a monotone $s$-$t$-path in $G\_\ell$ and hence $D\_{G\_\ell}(t) \leq D\_{G\_{\ell+1}}(t)$. If $P$ passes through $v\_j$, then let $z$ be the successor of $v\_j$ in $P$. Note that if $z \in Y$, then $\{v\_{j+1}, z\} \in E\_\ell$, and if $z \notin Y$, then $z = v\_{j+1}$ or $\{v\_{j+1}, z\} \in E\_\ell$ as $b\_{v\_j} \leq b\_z$. Hence it holds that

$$D\_{G\_\ell}(z) \le D\_{G\_\ell}(v\_{j+1}) + 1 = D\_{G\_{\ell+1}}(v\_j) + 1 = D\_{G\_{\ell+1}}(z).$$

Finally, let $P'$ be a shortest monotone $s$-$z$-path in $G\_\ell$ and let $P'' = P' \bullet P[z, t]$. Note that $P''$ is a monotone $s$-$t$-path of length at most $D\_{G\_{\ell+1}}(t)$ in $G\_\ell$ and thus $D\_{G\_\ell}(t) \leq D\_{G\_{\ell+1}}(t)$.

**Case 2 ($|X| < |Y|$):** We set

$$E\_{\ell+1} := \left( E\_{\ell} \setminus \{ \{ v\_{j+1}, x \} \mid x \in X \} \right) \cup \{ \{ v\_{j+1}, y \} \mid y \in Y \}.$$

Since $|X| < |Y|$, $X \cap Y = \emptyset$, and $G\_\ell \in \mathcal{G}$, it holds that $|E\_{\ell+1}| > |E\_\ell|$ and therefore $G\_{\ell+1} \in \mathcal{G}$ and $G\_{\ell+1} <\_\Delta G\_\ell$. It remains to show $D\_{G\_\ell}(t) \leq D\_{G\_{\ell+1}}(t)$.

Since $E\_\ell$ and $E\_{\ell+1}$ only differ in edges incident to $v\_{j+1}$, for all $v$ with $b\_v < b\_{v\_j}$ it holds that $D\_{G\_{\ell+1}}(v) = D\_{G\_\ell}(v)$. Let $P$ be a shortest monotone $s$-$v\_{j+1}$-path in $G\_{\ell+1}$ and let $w$ be the predecessor of $v\_{j+1}$ in $P$. By definition of $E\_{\ell+1}$ it holds that $w = v\_j$ or $\{w, v\_j\} \in E\_\ell$ and hence $D\_{G\_\ell}(v\_j) \leq D\_{G\_{\ell+1}}(v\_{j+1})$. Let $P'$ be a shortest monotone $s$-$t$-path in $G\_{\ell+1}$. If $P'$ does not pass through $v\_{j+1}$, then it is also a monotone $s$-$t$-path in $G\_\ell$ as $E\_\ell$ and $E\_{\ell+1}$ only differ in edges incident to $v\_{j+1}$ and hence $D\_{G\_\ell}(t) \leq D\_{G\_{\ell+1}}(t)$. If $P'$ passes through $v\_{j+1}$, then let $z$ be the successor of $v\_{j+1}$ in $P'$. Since only edges between $v\_{j+1}$ and the vertices in $Y$ are contained in $E\_{\ell+1}$ but not in $E\_\ell$, it holds that $z \in Y$. Hence, it holds that $\{v\_j, z\} \in E\_\ell$ and thus

$$D\_{G\_\ell}(z) \le D\_{G\_\ell}(v\_j) + 1 \le D\_{G\_{\ell+1}}(v\_{j+1}) + 1 = D\_{G\_{\ell+1}}(z).$$

Finally, let $P''$ be a shortest monotone $s$-$z$-path in $G\_\ell$ and let $P''' = P'' \bullet P'[z, t]$. Since $P'''$ is a monotone $s$-$t$-path of length at most $D\_{G\_{\ell+1}}(t)$ in $G\_\ell$, we obtain $D\_{G\_{\ell+1}}(t) \geq D\_{G\_\ell}(t)$.

This concludes the proof as we have shown that the sought sequence of graphs is finite and how to obtain each graph in it from the previous one.
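
The monotone distance $D\_H$ used throughout the proof above can be made concrete with a small Python sketch. It is not part of the original proof. It assumes, for illustration only, that the graph is given as an adjacency dictionary `adj`, that `b` maps every vertex other than $s$ and $t$ to a distinct left endpoint, and that the hypothetical function name `monotone_distances` is acceptable. Processing the vertices by increasing $b$-value guarantees that all admissible predecessors are final when a vertex is handled.

```python
import math


def monotone_distances(b, adj, s, t):
    """Return a dict with the monotone distance D_H(v) for every vertex v.

    b   : dict mapping each vertex other than s and t to its left endpoint
          (assumed pairwise distinct in this sketch)
    adj : dict mapping each vertex to the set of its neighbours in H
    """
    D = {v: math.inf for v in adj}
    D[s] = 0
    inner = sorted((v for v in adj if v not in (s, t)), key=lambda v: b[v])
    for v in inner:
        for u in adj[v]:
            # A predecessor of v on a monotone path is either s itself
            # or an inner vertex with a strictly smaller b-value.
            if u == s or (u != t and b[u] < b[v]):
                D[v] = min(D[v], D[u] + 1)
    # Special case from the definition: the last step into t ignores b_t.
    D[t] = min((D[u] + 1 for u in adj[t]), default=math.inf)
    return D
```

On the path $s, c, a, t$ with $b\_a < b\_c$, for example, the ordinary distance from $s$ to $t$ is 3, while the sketch reports an infinite monotone distance because the step from $c$ to $a$ decreases the $b$-value.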

Using Lemma 4.2, we now provide the main result of this chapter, that is, a dynamic program that solves Length-Bounded Cut on proper interval graphs in polynomial time. The dynamic program stores for each vertex $v \in V \setminus \{t\}$ and each possible distance $d$ the minimum size of a cut that makes each vertex $u$ with $b\_u \geq b\_v$ or $u = t$ have distance at least $d$ from $s$.

**Theorem 4.4.** Length-Bounded Cut *can be solved in $O(n^2 \cdot m)$ time if the input graph is a proper interval graph.*

*Proof.* We prove the statement by developing a dynamic program. We first state some general observations and derive from them the main idea behind the dynamic program. We then show how the entries of the table of the dynamic program are computed and how to compute the solution for Length-Bounded Cut from the filled table. We continue with proving the correctness of our algorithm and conclude with analyzing its running time.

We assume that, in the input graph $G := (V, E)$, there is no $s$-$t$-cut of size at most $\beta$ as such a cut can be detected in $O(n \cdot m)$ time [FF56] and the answer for Length-Bounded Cut is then always yes. Thus, $\deg\_G(s) > \beta$ and $\deg\_G(t) > \beta$ as otherwise the set of edges incident to $s$ or the set of edges incident to $t$ is an $s$-$t$-cut of size at most $\beta$. Furthermore, by Lemma 4.1, we can assume that there is no vertex $v$ with $f\_v < b\_s$ or $b\_v > f\_t$. By Lemma 4.2, we can assume that we search for a solution in which for all $v\_i, v\_j \in V \setminus \{s, t\}$ with $i < j$ it holds that $\text{dist}(s, v\_i) \leq \text{dist}(s, v\_j)$. Hence, we construct a table $T \colon V \times \mathbb{N} \to \mathbb{N}$ which stores for each vertex $v\_i \in V \setminus \{s, t\}$ and each possible distance $2 \leq d \leq \lambda$ the minimum number of edges that have to be deleted from $G' := (V', E') := G - \{t\}$ to ensure the following. First, $\text{dist}(s, v\_k) \leq \text{dist}(s, v\_\ell)$ for all $k \leq \ell \leq i$. Second, each vertex $v\_j \in V \setminus \{s, t\}$ with $j \geq i$ has distance at least $d$ from $s$.

We start with showing how to initialize the table $T$. Note that $v\_1, v\_2, \ldots, v\_{\deg(s)}$ are neighbors of $s$ and $v\_{\deg(s)+1}, v\_{\deg(s)+2}, \ldots, v\_{n-2}$ are not. Hence, to increase the distance of each $v\_j$ with $j \geq i$ for some given $i$ to at least two, one has to delete all edges between $s$ and the vertices in $\{v\_j \mid i \leq j \leq \deg(s)\}$. Thus, the table is initialized with $T[v\_i, 2] = 0$ for all $i > \deg(s)$ and $T[v\_i, 2] = \deg(s) - i + 1$ for all $i \leq \deg(s)$. We further initialize $T[v\_1, d] = \deg(s)$ for all $d \geq 3$.

We next show how to compute the solution to Length-Bounded Cut once the table $T$ is completely filled. Since we seek a solution $F$ such that in $H := (V, E \setminus F)$ it holds that $\text{dist}\_H(s, t) > \lambda$, each vertex $u \in N\_H(t)$ has to satisfy $\text{dist}\_H(s, u) \geq \lambda$. Note that $\deg\_G(t) > \beta$ and hence there is at least one vertex $v \in N\_H(t)$. Thus, to compute the solution for Length-Bounded Cut, we iterate over $v \in N\_G(t) \setminus \{s\}$ and compute $T[v, \lambda] + |\{u \in N(t) \mid u = s \lor b\_u < b\_v\}|$. Note that this corresponds to the statement that each neighbor $u$ of $t$ in $G$ has distance at least $\lambda$ from $s$ or the edge $\{u, t\}$ was removed. Hence, the distance between $s$ and $t$ in the resulting graph is at least $\lambda + 1$. Further, if we take the minimum value over all iterations and compare it to $\beta$, then this solves Length-Bounded Cut.
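
The final step just described can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation behind the theorem: the table is assumed to be a dictionary `T` keyed by pairs `(i, d)` of a vertex index and a distance, `t_neighbors` is assumed to list the indices of the neighbors of $t$ other than $s$, and the function name `minimum_cut_size` is hypothetical.

```python
import math


def minimum_cut_size(T, lam, t_neighbors, s_adjacent_to_t):
    """Combine the filled table T with the edges at t that have to be removed.

    T               : dict mapping (i, d) to the table value T[v_i, d]
    lam             : the length bound lambda
    t_neighbors     : indices i of the vertices v_i (other than s) adjacent to t
    s_adjacent_to_t : whether the edge {s, t} exists in the input graph
    """
    extra = 1 if s_adjacent_to_t else 0
    best = math.inf
    for rank, i in enumerate(sorted(t_neighbors)):
        # rank = number of neighbours v_j of t with j < i; their edges to t
        # (and the edge {s, t}, if present) are removed.
        best = min(best, T[i, lam] + rank + extra)
    return best
```

The instance is then a yes-instance exactly if the returned value is at most $\beta$.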

It remains to present the recursive formula for $T$, to prove the correctness of our dynamic program, and to analyze its running time. We start with showing how to compute $T$. For the sake of simplicity, we also store, for each table entry $T[v\_i, d]$ with $d \geq 2$, in a second table $S[v\_i, d]$ the vertex $v\_j \in V \setminus \{s, t\}$ with minimum $j$ such that $v\_j$ has distance $d - 1$ from $s$ in some solution corresponding to $T[v\_i, d]$. We initialize $S[v\_i, 2] = v\_1$ for all $v\_i$ and $S[v\_1, d] = v\_1$ for all $d \geq 3$. Note that $S[v\_i, d] = v\_1$ might not represent what we claimed if we seek to remove the edge $\{s, v\_1\}$. However, in this case there is no solution as we assume that $\deg(s) > \beta$. For increasing values of $d \geq 3$, we iterate over $2 \leq i \leq n - 2$ and compute

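$$\begin{aligned} T[v\_i, d] &= \min\_{j < i} \big( T[v\_j, d-1] + |C[S[v\_j, d-1], v\_j, v\_i]| \big), \\ S[v\_i, d] &= v\_{j^{\ast}}, \text{ where } j^{\ast} < i \text{ is the smallest index attaining this minimum,} \end{aligned}$$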

where $C[v\_h, v\_j, v\_i]$ is a function that represents for each triple $(v\_h, v\_j, v\_i)$ of vertices with $h < j < i$ the set of edges between a vertex $v\_\ell$ with $h \leq \ell < j$ and a vertex $v\_r$ with $r \geq i$. For technical reasons we exclude $s$ and $t$ here and hence the formal definition is

$$C[v\_h, v\_j, v\_i] := \{ \{v\_\ell, v\_r\} \in E \mid h \le \ell < j < i \le r \}.$$

The vertex *v<sup>h</sup>* is used to avoid double counting.
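
The initialization and the recurrence described in the text can be sketched as follows. This is a minimal illustration, not the thesis' implementation: it assumes the inner vertices are numbered $1, \ldots, n-2$ by increasing $b$-value, that `deg_s` counts the inner neighbors of $s$, that `edges` contains only pairs `(l, r)` with `l < r` between inner vertices, and it counts $|C[v\_h, v\_j, v\_i]|$ naively instead of using a precomputed table. The names `fill_tables` and `cut_size` are hypothetical.

```python
def fill_tables(num_inner, deg_s, edges, lam):
    """Fill T and S for inner vertices v_1..v_num_inner and distances 2..lam."""

    def cut_size(h, j, i):
        # |C[v_h, v_j, v_i]|: edges {v_l, v_r} with h <= l < j < i <= r.
        return sum(1 for (l, r) in edges if h <= l < j < i <= r)

    T, S = {}, {}
    for i in range(1, num_inner + 1):
        # Distance at least two: delete the edges from s to v_i, ..., v_deg(s).
        T[i, 2] = 0 if i > deg_s else deg_s - i + 1
        S[i, 2] = 1
    for d in range(3, lam + 1):
        T[1, d], S[1, d] = deg_s, 1
        for i in range(2, num_inner + 1):

            def cost(j):
                return T[j, d - 1] + cut_size(S[j, d - 1], j, i)

            # min returns the first minimizer, i.e. the smallest such index j.
            best_j = min(range(1, i), key=cost)
            T[i, d], S[i, d] = cost(best_j), best_j
    return T, S
```

For the path $s, v\_1, v\_2, v\_3$, calling `fill_tables(3, 1, {(1, 2), (2, 3)}, 3)` yields `T[3, 3] == 0`, since $v\_3$ already has distance 3 from $s$, and `T[2, 3] == 1`, since one edge has to be removed to push $v\_2$ to distance at least 3.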

We continue by proving that $S$ and $T$ store exactly what they are supposed to. Assume towards a contradiction that there is a vertex $v\_i$ and a distance $d \geq 2$ such that $S[v\_i, d]$ or $T[v\_i, d]$ was computed incorrectly. Then there is also a smallest $d$ such that there is a vertex $v\_i$ for which $S[v\_i, d]$ or $T[v\_i, d]$ is computed incorrectly and we assume that $v\_i$ is the vertex with the smallest index $i$ such that $S[v\_i, d]$ or $T[v\_i, d]$ is computed incorrectly. Since we have already shown that the initialization for $d = 2$ is correct, we focus on the case $d > 2$ and distinguish between the three cases that $S[v\_i, d]$ was computed incorrectly, that $T[v\_i, d] > c$, or that $T[v\_i, d] < c$, where $c$ is the correct value of $T[v\_i, d]$.

If $T[v\_i, d] < c$, then let $v\_j$ be a vertex with $j < i$ that minimizes the sum $T[v\_j, d-1] + |C[S[v\_j, d-1], v\_j, v\_i]|$. Since we assume that $T[v\_j, d-1]$ is computed correctly (recall that $d$ was chosen to be the minimum value for which $S$ or $T$ was computed incorrectly), there is a set $F\_1$ of $T[v\_j, d-1]$ edges such that in the graph $H' := (V', E' \setminus F\_1)$ it holds that $\text{dist}\_{H'}(s, v\_r) \geq d - 1$ for all $r \geq j$ and $\text{dist}\_{H'}(s, v\_\ell) \leq \text{dist}\_{H'}(s, v\_k)$ for all $\ell \leq k \leq j$. Let $v\_h = S[v\_j, d-1]$. Since $S[v\_j, d-1]$ is by assumption computed correctly, it holds for all $v\_\ell$ with $\ell < h$ that $\text{dist}\_{H'}(s, v\_\ell) \leq d - 3$. Thus, $F\_1$ contains all edges between vertices in $\{v\_\ell \mid \ell < h\}$ and $\{v\_r \mid r \geq i\}$. Since $C[v\_h, v\_j, v\_i]$ is the set of all edges between vertices $v\_{\ell'}$ with $h \leq \ell' < j$ and vertices $v\_r$ with $r \geq i$, there is no edge between a vertex of distance at most $d - 2$ from $s$ and a vertex $v\_r$ with $r \geq i$ in $H := (V', E' \setminus (F\_1 \cup C[v\_h, v\_j, v\_i]))$. Hence each such vertex $v\_r$ is of distance at least $d$ from $s$ in $H$. It remains to show that $\text{dist}\_H(s, v\_\ell) \leq \text{dist}\_H(s, v\_k)$ for all $\ell \leq k \leq i$. Note that $H$ and $H'$ only differ in edges in $C[v\_h, v\_j, v\_i]$, that is, in edges between vertices $v\_\ell$ and $v\_r$ with $h \leq \ell < j$ and $r \geq i$. Since those $v\_\ell$ have distance $d - 2$ and those $v\_r$ have distance at least $d - 1$ from $s$ in $H$, it holds that $\text{dist}\_H(s, v\_\ell) = \text{dist}\_{H'}(s, v\_\ell)$ for all $\ell \leq i$ and thus also $H$ fulfills $\text{dist}\_H(s, v\_\ell) \leq \text{dist}\_H(s, v\_k)$ for all $\ell \leq k \leq i$. Since $T[v\_i, d] = |F\_1 \cup C[v\_h, v\_j, v\_i]|$ and $F\_1 \cap C[v\_h, v\_j, v\_i] = \emptyset$, it holds that

$$T[v\_i, d] = |F\_1| + |C[v\_h, v\_j, v\_i]| = T[v\_j, d-1] + |C[S[v\_j, d-1], v\_j, v\_i]|,$$

and thus $T[v\_i, d] \geq c$, a contradiction.

If $T[v\_i, d] > c$, then there is a cut $F'$ that contains fewer than $T[v\_i, d]$ edges such that in the respective graph $H' := (V', E' \setminus F')$ all vertices $v\_r$ with $r \geq i$ have distance at least $d$ from $s$ and $\text{dist}\_{H'}(s, v\_k) \leq \text{dist}\_{H'}(s, v\_\ell)$ for all $k \leq \ell \leq i$. Then, there is a vertex $v\_j$ such that $v\_j$ and all vertices $v\_r$ with $r \geq j$ have distance at least $d - 1$ from $s$ in $H'$ and all vertices $v\_\ell$ with $\ell < j$ have distance at most $d - 2$ from $s$ in $H'$. Hence, $F'$ has to contain all edges in $F'' := \{\{v\_\ell, v\_r\} \in E \mid \ell < j < i \leq r\}$ as otherwise a vertex $v\_r$ with $r \geq i$ would have distance at most $d - 1$ from $s$ in $H'$. Let $v\_h := S[v\_j, d-1]$. We partition the set $F''$ into two disjoint sets $F''\_L := \{\{v\_\ell, v\_r\} \in E \mid \ell < h < i \leq r\}$ and $F''\_R := \{\{v\_\ell, v\_r\} \in E \mid h \leq \ell < j < i \leq r\}$. Let $H := (V', E' \setminus (F' \setminus F''\_R))$. Notice that $H$ and $H'$ only differ in edges in $F''\_R$, that is, in edges between vertices $v\_\ell$ and $v\_r$ with $h \leq \ell < j$ and $r \geq i$. Since those $v\_\ell$ have distance $d - 2$ and those $v\_r$ have distance at least $d - 1$ from $s$ in $H$, the distance between $s$ and vertices $v\_\ell$ with $\ell < j$ is the same in $H'$ and $H$. Hence, $\text{dist}\_H(s, v\_k) \leq \text{dist}\_H(s, v\_\ell)$ for all $k \leq \ell \leq j$. Thus, it holds that $T[v\_j, d-1] \leq |F' \setminus F''\_R|$ as $T[v\_j, d-1]$ was computed correctly by assumption. Note further that $F''\_R$ is by definition equal to $C[S[v\_j, d-1], v\_j, v\_i]$ and

$$T[v\_j, d-1] + |C[v\_h, v\_j, v\_i]| \geq \min\_{j' < i} \{ T[v\_{j'}, d-1] + |C[S[v\_{j'}, d-1], v\_{j'}, v\_i]| \}.$$

Thus, $c = |F'| = |F' \setminus F''\_R| + |F''\_R| \geq T[v\_j, d-1] + |C[v\_h, v\_j, v\_i]| \geq T[v\_i, d]$, a contradiction.

Finally, assume towards a contradiction that $S[v\_i, d]$ is computed incorrectly but $T[v\_i, d]$ is computed correctly. Since $T[v\_i, d]$ is computed correctly, there is a set $F'$ of $T[v\_i, d]$ edges such that in $H = (V', E' \setminus F')$ it holds for all $k \leq \ell \leq i$ that $\text{dist}\_H(s, v\_k) \leq \text{dist}\_H(s, v\_\ell)$ and for all $r \geq i$ that $\text{dist}\_H(s, v\_r) \geq d$. Then, there is a vertex $v\_j$ such that $\text{dist}\_H(s, v\_\ell) \leq d - 2$ for all $\ell < j$ and $\text{dist}\_H(s, v\_r) \geq d - 1$ for all $r \geq j$. Let, without loss of generality, $F'$ be a set of edges such that there is no edge set $F''$ with the same property as described above where the respective vertex $v\_{j'}$ satisfies $j' < j$. Let $S[v\_i, d] = v\_h$. We show that $h = j$. If $j < h$, then $T[v\_h, d-1] + |C[S[v\_h, d-1], v\_h, v\_i]| < T[v\_i, d]$ as otherwise $h$ would have been chosen smaller. This, however, contradicts the assumption that $T[v\_i, d]$ is computed correctly. If $j > h$, then by definition, $T[v\_i, d] = T[v\_h, d-1] + |C[S[v\_h, d-1], v\_h, v\_i]|$. Thus, there is a set $F''$ such that $|F''| = T[v\_i, d] = |F'|$ and in $H' := (V', E' \setminus F'')$ it holds for all $k \leq \ell \leq i$ that $\text{dist}\_{H'}(s, v\_k) \leq \text{dist}\_{H'}(s, v\_\ell)$. Moreover, it holds for all $r \geq i$ that $\text{dist}\_{H'}(s, v\_r) \geq d$, for all $\ell < h$ that $\text{dist}\_{H'}(s, v\_\ell) \leq d - 2$, and for all $r \geq h$ that $\text{dist}\_{H'}(s, v\_r) \geq d - 1$. This contradicts the definition of $F'$.

We conclude this proof by analyzing the running time of our algorithm. We first show how to compute $|C[v\_h, v\_j, v\_i]|$ for all triples $(v\_h, v\_j, v\_i)$ of vertices in $O(n^2 \cdot m)$ time. To this end, we first compute a table $A[v\_j, v\_i]$, where

$$A[v\_j, v\_i] := |\{ \{v\_\ell, v\_r\} \in E \mid \ell < j < i \le r\}|.$$

Note that $A$ can be computed in $O(n^2 \cdot m)$ time by iterating over all edges $\{v\_\ell, v\_r\}$ (we assume $\ell < r$) and all entries $A[v\_j, v\_i]$ and incrementing the entry whenever $\ell < j < i \leq r$. Once $A$ is computed, we obtain $|C[v\_h, v\_j, v\_i]| = A[v\_j, v\_i] - A[v\_h, v\_i]$ in constant time per table entry. Since there are $O(n^3)$ table entries, the overall running time for this preprocessing is $O(n^2 \cdot m)$ (note that the input graph is a connected interval graph and hence $O(n) \subseteq O(m)$).
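
The preprocessing can be illustrated by the following Python sketch, which is an assumption-laden example rather than the original implementation. It assumes inner vertices numbered $1, \ldots, n-2$, edges given as pairs `(l, r)` with `l < r`, and uses the hypothetical names `build_A` and `cut_size`. Instead of scanning all table entries per edge, it only visits the entries that the edge actually contributes to, which stays within the same $O(n^2 \cdot m)$ bound.

```python
def build_A(num_inner, edges):
    """A[j][i] = number of edges {v_l, v_r} with l < j < i <= r."""
    A = [[0] * (num_inner + 2) for _ in range(num_inner + 2)]
    for (l, r) in edges:
        for j in range(l + 1, r):          # l < j
            for i in range(j + 1, r + 1):  # j < i <= r
                A[j][i] += 1
    return A


def cut_size(A, h, j, i):
    """|C[v_h, v_j, v_i]| for h <= j < i, as a difference of two A-entries."""
    return A[j][i] - A[h][i]
```

The difference is correct because the edges counted by `A[j][i]` but not by `A[h][i]` are exactly those whose left endpoint $v\_\ell$ satisfies $h \leq \ell < j$.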

Each table entry $S[v\_i, d]$ and $T[v\_i, d]$ can be computed in $O(n)$ time by iterating over at most $n$ vertices and computing the sum of a table entry in $T$ and the size of a table entry in $C$, thereby keeping track of the minimum value and which iteration led to this minimum. Since there are $O(n \cdot \lambda)$ table entries, the overall running time is $O(n^2 \cdot \lambda)$. As we may assume that $\lambda < n$ (each path has length at most $n$), the running time is bounded by $O(n^3)$. Lastly, computing the solution takes $O(n)$ time as we have to iterate over up to $n$ neighbors $v\_i$ of $t$ and for each we have to compute $|\{v\_\ell \mid \ell < i \land \{v\_\ell, t\} \in E\}|$. This computation takes constant time as we can compute the smallest index $j$ of a vertex that is adjacent to $t$ in $G$ and then compute $i - j + 1$. Thus, the overall running time for our algorithm is $O(n^2 \cdot m)$.

The main point in the proof of Theorem 4.4 where we need to assume that the input graph is a proper interval graph and not an interval graph is the application of Lemma 4.2. In the following section, we will investigate problems arising when trying to adapt Lemma 4.2 for interval graphs. Concluding this section, we want to emphasize the way the solution is computed in the proof of Theorem 4.4 once the tables $T$ and $S$ are completely filled. Rather than looking at a single entry or taking the maximum or minimum entry in a given column, we iterate over all entries $T[v\_i, d]$ with $d = \lambda$ and add to each the number of neighbors $v\_j$ of $t$ with $j < i$. The solution then corresponds to the minimum such sum. This way of finding a solution goes to show that each of the four guiding questions (even the one that looks the simplest) can have a surprising or non-trivial answer.

## **4.3 Falsifying Assumptions for Interval Graphs**

In this section, we discuss some problems that arise when trying to adapt the algorithm behind Theorem 4.4 for interval graphs. The only difference between interval graphs and proper interval graphs is that in interval graphs there can be pairs $(v, w)$ of vertices such that $N[v] \subset N[w]$. Intuitively, it does not seem to make sense to remove an edge $\{u, v\}$ while leaving an edge $\{u, w\}$ in the solution graph as each shortest $s$-$t$-path containing $v$ and using the edge $\{u, v\}$ can then be replaced by a path containing $w$ and $\{u, w\}$. This leads to the following conjecture.

**Conjecture 4.5.** *Let $G = (V, E)$ be an interval graph and let $F$ be a set of edges. Let $d$ be the distance from $s$ to $t$ in $G' = (V, E \setminus F)$. Then, there is a set $F'$ of edges with $|F'| \leq |F|$ such that for $G'' := (V, E \setminus F')$ it holds that $\text{dist}\_{G''}(s, t) \geq d$ and for each $v, w \in V \setminus \{s, t\}$ with $N[v] \subset N[w]$ it holds that if $\{u, v\} \in F'$ for some $u \in V$, then also $\{u, w\} \in F'$.*

Conjecture 4.5 would be helpful to show that Lemma 4.2 also holds for interval graphs. Unfortunately, Conjecture 4.5 is false as shown in the example in Figure 4.3. Therein, the only solution for removing three edges deletes some edges incident to *w* and one edge incident to *v* such that the only remaining path between *s* and *t* passes through both *v* and *w*. A next natural conjecture could be that a similar approach to the dynamic program behind Theorem 4.4 could still work, where we order the vertices by their *b*- or their *f*-values.

**Conjecture 4.6.** *Let $G = (V, E)$ be an interval graph and let $F$ be a set of edges. Let $d$ be the distance from $s$ to $t$ in $G' := (V, E \setminus F)$. Let $F\_b$ and $F\_f$ be sets of minimum size such that $G\_b := (V, E \setminus F\_b)$ and $G\_f := (V, E \setminus F\_f)$ fulfill $\text{dist}\_{G\_b}(s, t) \geq d$, $\text{dist}\_{G\_f}(s, t) \geq d$, and the following. For each $v, w \in V \setminus \{s, t\}$ it holds that if $b\_v < b\_w$, then $\text{dist}\_{G\_b}(s, v) \leq \text{dist}\_{G\_b}(s, w)$. Moreover, if $f\_v < f\_w$, then $\text{dist}\_{G\_f}(s, v) \leq \text{dist}\_{G\_f}(s, w)$. Then, $|F\_b| \leq |F|$ or $|F\_f| \leq |F|$.*

**Figure 4.3:** An interval graph and its interval representation. The dashed edges show the unique solution for Length-Bounded Cut with *β* = 3 and *λ* = 5. Note that *N*[*v*] ⊂ *N*[*w*], that the edge {*u, v*} is dashed, and that the edge {*u, w*} is not. Since the dashed edges are the only solution, Conjecture 4.5 is false.

**Figure 4.4:** An interval graph and its interval representation. The dashed edge is the unique solution for Length-Bounded Cut with $\beta = 1$ and $\lambda = 5$. Let $G'$ denote the graph without the dashed edge. Note that it holds in $G'$ that $\text{dist}\_{G'}(s, u) < \text{dist}\_{G'}(s, x) = \text{dist}\_{G'}(s, y) < \text{dist}\_{G'}(s, v)$ but $b\_v < b\_y$ and $f\_u < f\_x$. Note further that the dashed edge is the only edge whose removal increases the distance between $s$ and $t$ and hence Conjecture 4.6 is false.

Again, Conjecture 4.6 is false as shown in Figure 4.4. The idea behind this counterexample is to include short intervals with only two neighbors that prevent any reordering of their neighbors after the removal of some edges. This shows that the basic idea of our algorithm for proper interval graphs cannot work for interval graphs as we cannot order the vertices by their $b$- or their $f$-values for the dynamic program. Moreover, note that the dashed edge in Figure 4.4 is the only solution and that after removing this edge, the resulting graph contains a $C\_4$ induced by the vertices $u$, $x$, $v$, and $y$. Thus, we cannot even assume that removing a solution from the input interval graph yields an interval graph. This is in contrast to Theorem 4.4 where the graph resulting from removing a solution from the input proper interval graph is again a proper interval graph.
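
Claims of the form "this edge set is the only solution" on small instances such as the counterexamples above can be checked by exhaustive search. The following Python sketch is one way to do so and is not taken from the thesis: it tries all edge sets of size at most $\beta$ and tests whether the distance from $s$ to $t$ exceeds $\lambda$ afterwards. The function names and the edge-list input format are assumptions made for illustration, and the running time is exponential in $\beta$.

```python
from collections import deque
from itertools import combinations
import math


def distance(edges, s, t):
    """Breadth-first-search distance from s to t (math.inf if unreachable)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return dist[u]
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return math.inf


def brute_force_length_bounded_cut(edges, s, t, beta, lam):
    """Return a set of at most beta edges whose removal makes dist(s, t) > lam,
    or None if no such set exists."""
    edges = list(edges)
    for size in range(beta + 1):
        for removed in combinations(edges, size):
            remaining = [e for e in edges if e not in removed]
            if distance(remaining, s, t) > lam:
                return set(removed)
    return None
```

Collecting all successful edge sets instead of returning the first one would allow checking uniqueness of a solution in the same way.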

## **4.4 Concluding Remarks**

In this chapter, we studied Length-Bounded Cut in the special case where the input graph is a proper interval graph and showed polynomial-time solvability. This confirms a conjecture by Bazgan et al. [Baz+19]. A natural next step is to investigate interval graphs. We showed some limitations for adapting our approach from proper interval graphs to interval graphs. We still conjecture that Length-Bounded Cut on interval graphs should allow for a polynomial-time algorithm.

Bazgan et al. [Baz+19] provide a hierarchy of parameters with known results and open problems for Length-Bounded Cut. In the paper on which this chapter is based, we solved some of their open problems [BHK20]. Tackling the remaining ones is left as a challenge for future research.

# **Chapter 5**

### **Disjoint Shortest Paths**

This is the final chapter in the dynamic-programming part of the thesis. With regard to content, our main contribution in this chapter is an *XP*-algorithm for the *NP*-hard *k*-Disjoint Shortest Paths problem. This is a variant of the fundamental and well-studied combinatorial problem *k*-Disjoint Path. On a conceptual level, we will complete our journey through the intricacies of dynamic programming. For the dynamic program we develop in this chapter, there is a very simple way of computing each table entry. However, when we consider a natural generalization of our problem, this way is no longer feasible. Further investigating the generalization yields another way of computing each table entry that will turn out to be much more efficient even for the special case we are mainly interested in.

*k*-Disjoint Path describes the question of whether there are $k$ pairwise disjoint<sup>1</sup> paths between vertex terminal pairs $(s\_i, t\_i)\_{i \in [k]}$ in a given undirected graph $G$. Karp [Kar75] showed that the problem is *NP*-hard when $k$ is part of the input. On the positive side, Robertson and Seymour [RS95] provided an algorithm running in $O(n^3)$ time for any constant $k$. Later, Kawarabayashi et al. [KKR12] improved the running time to $O(n^2)$, again for fixed $k$. On directed graphs, in contrast, the problem is *NP*-hard even for $k = 2$ [FHW80]. However, on directed acyclic graphs (DAGs), the problem becomes again polynomial-time solvable for constant $k$ [FHW80].

<sup>1</sup>Here and in the following this means vertex-disjoint.

We study a variant called *k*-Disjoint Shortest Paths. Therein, all paths in a sought solution have to be shortest paths between the respective terminal pairs. This problem has applications in transportation networks, circuit layout, and circuit routing (see, e.g., the work by Kawarabayashi et al. [KKR12] and references therein) and its complexity for constant $k$ has been a long-standing open problem [Eil98, Fom+19]. Very recently, Lochet [Loc21] settled this question by showing that *k*-Disjoint Shortest Paths can be solved in $n^{O(k^{5^k})}$ time, that is, polynomial time for every constant $k$. We provide a new approach with a novel geometric perspective that simplifies many arguments and leads to an overall streamlined algorithm with a running time of $O(k \cdot n^{16k \cdot k! + k + 1})$. Notably, *k*-Disjoint Shortest Paths is *W[1]*-hard with respect to $k$ and, assuming the ETH, there is no $f(k) \cdot n^{o(k)}$-time algorithm for *k*-Disjoint Shortest Paths [Ben+21]. The asymptotic gap between the lower bound of $n^{o(k)}$ and our upper bound of $n^{O((k+1)!)}$ is, however, still quite large.
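
As a brief sanity check (not spelled out in the text), the exponent $16k \cdot k! + k + 1$ can be bounded in terms of $(k+1)!$. Using $k \cdot k! \leq (k+1) \cdot k! = (k+1)!$ and $k + 1 \leq (k+1)!$ for every $k \geq 1$, one obtains

$$16k \cdot k! + k + 1 \leq 16 (k+1)! + (k+1)! = 17 (k+1)!.$$

Under the additional assumption that $k \leq n$, the leading factor $k$ adds at most one to the exponent, so that $k \cdot n^{16k \cdot k! + k + 1} \leq n^{17(k+1)! + 1}$, which is of the form $n^{O((k+1)!)}$.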

We formalize our novel geometric view for *k*-Disjoint Shortest Paths and provide some structural observations regarding solutions to *k*-Disjoint Shortest Paths in Section 5.2. In Section 5.3, we present a dynamic-programming-based approach to solve a special case of *k*-Disjoint Shortest Paths. Afterwards, we provide our algorithm for the general problem that uses the dynamic program as a subprocedure and prove our main theorem.

### **5.1 Problem Definition and Related Work**

*k*-Disjoint Shortest Paths is defined as follows.

*k*-Disjoint Shortest Paths

**Input:** An undirected graph $G = (V, E)$ and $k$ pairs $(s\_i, t\_i)\_{i \in [k]}$ of vertices.

**Question:** Are there $k$ disjoint paths $P\_i$ such that, for each $i \in [k]$, $P\_i$ is a shortest $s\_i$-$t\_i$-path?
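
To make the problem definition concrete, the following Python sketch checks whether a given list of paths is a solution for an instance. It only verifies a candidate solution and is not an algorithm for the problem itself. The adjacency-dictionary input format and the names `bfs_distances` and `is_solution` are assumptions made for this illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Distances from source to all reachable vertices in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def is_solution(adj, terminals, paths):
    """Check that paths[i] is a shortest s_i-t_i-path and that all paths
    are pairwise vertex-disjoint."""
    if len(paths) != len(terminals):
        return False
    used = set()
    for (s, t), path in zip(terminals, paths):
        if not path or path[0] != s or path[-1] != t:
            return False
        # Consecutive vertices must be adjacent.
        if any(v not in adj[u] for u, v in zip(path, path[1:])):
            return False
        # The path must have length exactly dist(s_i, t_i).
        if bfs_distances(adj, s).get(t) != len(path) - 1:
            return False
        # No vertex may be repeated or shared with an earlier path.
        if len(set(path)) != len(path) or used & set(path):
            return False
        used |= set(path)
    return True
```

On a cycle with four vertices $a, b, c, d$ and terminal pairs $(a, b)$ and $(d, c)$, the two edges $\{a, b\}$ and $\{d, c\}$ form a solution, whereas routing the first pair along $a, d, c, b$ does not, since that path is not a shortest path.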

Eilam-Tzoreff [Eil98] introduced this variant of *k*-Disjoint Path, showed that it is *NP*-hard when $k$ is part of the input, and provided a dynamic-programming-based $O(n^8)$-time algorithm for 2-Disjoint Shortest Paths. This was later improved to an $O(n^2 m)$-time algorithm [Ben+21]. The $O(n^8)$-time algorithm for 2-Disjoint Shortest Paths works for positive edge lengths and, recently, Gottschau et al. [GKW19] and Kobayashi and Sako [KS19] independently extended this result by providing polynomial-time algorithms for the case where the edge lengths are non-negative. Concerning directed graphs, Bérczi and Kobayashi [BK17] provided a polynomial-time algorithm for positive edge lengths for 2-Disjoint Shortest Paths. Note that setting all edge lengths to zero results in 2-Disjoint Path on directed graphs, which is *NP*-hard [FHW80]. Extending the problem to finding two disjoint $s\_i$-$t\_i$-paths of minimal total length (in undirected graphs), Björklund and Husfeldt [BH19] provided an $O(n^{11})$-time randomized algorithm. Finally, Tragoudas and Varol [TV96] showed that it is *NP*-hard to decide whether the number of solutions of an instance of 2-Disjoint Paths is at least some given threshold.

### **5.2 A Geometric View on Shortest Paths**

In this section, we delineate our geometric perspective on *k*-Disjoint Shortest Paths, make some structural observations, and give a characterization of solutions with regard to their geometry. In the following sections, we will then use these observations to design a dynamic-programming-based algorithm for *k*-Disjoint Shortest Paths. We start with some basic intuition and a small example. In Subsection 5.2.1 we then formalize the geometry-based ideas and provide a characterization of solutions for 2-Disjoint Shortest Paths. In Subsection 5.2.2 we then generalize this characterization to solutions of *k*-Disjoint Shortest Paths.

For the geometric representation, we define a *k*-dimensional vector<sup>2</sup> $\vec{v}$ for each vertex *v*. The *i*th coordinate of this vector is the distance between $s_i$ and *v*. An example of this vector representation is given in Figure 5.1. Note that there can be multiple vertices with the same vector. The geometric perspective is based on the following two observations. First, since each path $P_i = (v^i_0, v^i_1, \dots, v^i_{d_i})$ in the sought solution is a shortest $s_i$-$t_i$-path, it holds for each $j \in [d_i]$ that $\text{dist}(s_i, v^i_j) = \text{dist}(s_i, v^i_{j-1}) + 1$, that is, the path $P_i$ is strictly monotone in the *i*th coordinate. We say that paths which are strictly monotone in the *i*th coordinate have *color i*. Second, for each vertex $v^i_j$ in $P_i$, it holds that $\text{dist}(s_i, t_i) = \text{dist}(s_i, v^i_j) + \text{dist}(v^i_j, t_i)$. Thus, any vertex *w* with $\text{dist}(s_i, t_i) \neq \text{dist}(s_i, w) + \text{dist}(w, t_i)$ cannot be part of a shortest $s_i$-$t_i$-path. We can formulate this into a necessary (but not sufficient) condition in terms of our geometric perspective as follows. Consider the *k*-dimensional hyperrectangle that has the vectors of $s_i$ and $t_i$ as two corners and whose sides form an angle of 45° with the coordinate axes. We say that this hyperrectangle is *spanned* by $s_i$ and $t_i$. The (hyper)rectangle spanned by $s_1$ and $t_1$ in the right-hand side of Figure 5.1 is highlighted in gray. We will prove that any vertex whose vector is not within the area of this hyperrectangle cannot be part of a shortest $s_i$-$t_i$-path. Moreover, if we consider any vertex $v^i_j$ in $P_i$ and the two hyperrectangles spanned by $v^i_j$ and either $s_i$ or $t_i$, then it holds that the vector of each vertex in $P_i$ is contained in the area of these two hyperrectangles.

<sup>2</sup>We use the term vector interchangeably with the point the vector is pointing to.

**Figure 5.1:** *Left side:* A simple, undirected graph with four distinguished vertices $s_1$, $s_2$, $t_1$, and $t_2$. The vectors with the distances to each $s_i$ are written next to the vertices. Two disjoint shortest paths are highlighted.

*Right side:* A two-dimensional coordinate system. Each vertex is represented at its vector and edges are drawn as lines between their respective end points. When multiple vertices share the same vector, then vertices are depicted close to their actual vector. The rectangle spanned by $s_1$ and $t_1$ is drawn in gray. The two disjoint shortest paths are again depicted. Note that the $s_1$-$t_1$-path is strictly monotone going to the right and the $s_2$-$t_2$-path is strictly monotone going down.

We use these two observations as follows. Assume that there is a solution, that is, a set $(P_i)_{i \in [k]}$ of pairwise disjoint shortest $s_i$-$t_i$-paths. We will show that each path $P_i$ can be split into $\ell_i$ subpaths $P_i^1, P_i^2, \dots, P_i^{\ell_i}$ and that the set of all these subpaths can be partitioned such that

	- subpaths in the same part of the partition share a common color and
	- for two subpaths $P_i^j$ and $P_p^q$ in different parts of the partition, it holds that the areas of the hyperrectangles spanned by the end vertices of $P_i^j$ and $P_p^q$, respectively, are disjoint.

Our algorithm works in two phases. In the first phase, we *guess*<sup>3</sup> the end vertices of each of the described subpaths (we call the end vertices *marbles*). In the second phase, we compute the described partition and solve *k*-Disjoint Shortest Paths independently for each part of the partition. For each part of the partition, there is a color *c* that all subpaths in this part have. We assume that each subpath in this part is strictly increasing in its *c*-coordinates as we can otherwise swap the two endpoints. We then ignore all edges that are not monotone in *c* (the two endpoints of the edge have the same *c*-coordinate) and direct the edges so that they are pointing towards the higher *c*-coordinate. Note that the resulting graph is a DAG and since each subpath is strictly increasing in its *c*-coordinates, the directed version of each subpath is still contained in the constructed DAG. Hence, we can use the algorithm of Fortune et al. [FHW80] for *k*-Disjoint Shortest Paths on DAGs to find pairwise disjoint shortest subpaths. We then also present our own dynamic program for *k*-Disjoint Shortest Paths on DAGs and, as Fortune et al. [FHW80] only state $n^{O(k)}$ time, provide a precise running-time analysis.
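The orientation step of the second phase can be illustrated with a small sketch. It assumes an unweighted, undirected graph given as an adjacency dictionary; the function names `bfs_distances` and `orient_by_color` are hypothetical and not taken from the thesis, and the sketch covers only the construction of the DAG, not the subsequent disjoint-paths computation.

```python
from collections import deque


def bfs_distances(adj, source):
    """Breadth-first search distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def orient_by_color(adj, s_c):
    """Keep only edges that are monotone in the c-coordinate dist(s_c, .)
    and direct each of them towards the higher c-coordinate."""
    coord = bfs_distances(adj, s_c)
    arcs = []
    for u in adj:
        for w in adj[u]:
            # Edges whose endpoints share the same c-coordinate are dropped.
            if u in coord and w in coord and coord[w] == coord[u] + 1:
                arcs.append((u, w))
    return arcs


# Hypothetical example graph with edges a-b, a-c, b-c, b-d, c-d.
adj = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b", "d"], "d": ["b", "c"]}
print(sorted(orient_by_color(adj, "a")))
# [('a', 'b'), ('a', 'c'), ('b', 'd'), ('c', 'd')]
```

Every arc increases the *c*-coordinate by exactly one, so the result cannot contain a directed cycle. The edge between `b` and `c` disappears because both endpoints have the same coordinate.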

We remark that Lochet [Loc21] used the same two-step approach, but how these steps are achieved is different. In particular, he does not use a geometric view on shortest paths (as we do). As a result, even for $k = 2$ he can only upper-bound the number of vertices his algorithm has to guess to ensure that no two parts can intersect by $9^{91}$ ([Loc21, Lemma 13]) while our approach produces at most five parts. Moreover, our geometric view allows us to use a more efficient way of splitting the paths for general *k* (in $O((k + 1)!)$ parts instead of $O(k^{5^k})$ as done by Lochet).

We continue with some intuition for the described subpaths and the partition of them. We start with the two-dimensional case and distinguish between four cases. Figure 5.2 gives an overview over these cases and the vertices (the marbles) we guess in each case. It is easy to see that only in the case in the top right-hand corner the areas of two subpaths intersect (the dashed line). However, in this area both paths are strictly monotone in both coordinates. Thus, the depicted marbles ensure in each case that a partition as described exists.

We continue with the case where $k > 2$. The basic idea is to recursively partition the paths by considering two-dimensional projections of the respective hyperrectangles. Note that if these projections are disjoint, then also the areas of the respective hyperrectangles are disjoint. For each pair $(P_i, P_j)$ of paths, we start with the orthogonal projection to the coordinates *i* and *j*. This yields a set of marbles for $P_i$ and $P_j$ such that the respective subpaths either have colors *i* and *j* or cannot interfere with the respective other path. Assume that we guessed for each two-dimensional (*i*, *j*)-projection the intersection of $P_i$ and $P_j$ in this projection. Unfortunately, we cannot partition the respective subpaths as stated above. Instead, we store for each subpath $P_i'$ of $P_i$ the set Φ of all colors that $P_i'$ has and recursively refine these subpaths until a partition as stated is possible. Roughly speaking, we check for each pair of subpaths whether the areas of their respective hyperrectangles intersect and if they do, then we find a two-dimensional projection and use this to find new marbles. The resulting subpaths are then either disjoint from the respective other path or have an additional color. We continue this procedure until the areas of any two subpaths with different colors are disjoint. These subpaths are then partitioned by their respective sets of colors. Note that by construction different subpaths in one part of the partition share a common color and the areas of the hyperrectangles spanned by the end vertices of two subpaths in different parts of the partition are disjoint.

<sup>3</sup>Whenever we pretend to guess something, we mean that we iterate over all possible choices and consider for the explanation or proof the respective correct iteration.

**Figure 5.2:** The four cases for the two-dimensional projection of two paths $P_1$ and $P_2$. The thick black lines represent $P_1$ and $P_2$ and the colored rectangles are the ones spanned by the respective terminals and marbles. For easier distinction, we colored everything related to the $s_1$-$t_1$-path red and everything related to the $s_2$-$t_2$-path blue (except for the respective paths themselves). A black square represents a vector on which we guessed a marble on both paths. The dashed line represents the subpaths of $P_1$ and $P_2$ that have a common color.

(top left-hand corner): The lines cross in one point with non-integer coordinates.

(top right-hand corner): The lines cross in at least one point with integer coordinates.

(bottom left-hand corner): The rectangles defined by $s_i$ and $t_i$ intersect (in the gray (darker) area), but the lines do not.

(bottom right-hand corner): The rectangles defined by $s_i$ and $t_i$ do not intersect.

In Subsection 5.2.1 we formalize the geometry-based ideas and provide a characterization of solutions for 2-Disjoint Shortest Paths. In Subsection 5.2.2 we generalize this characterization to solutions of *k*-Disjoint Shortest Paths.

### **5.2.1 Two Shortest Paths**

We now formalize and generalize the idea behind the geometric view (visualized in Figures 5.1 and 5.2). We start with some notation for projections. For any $\emptyset \subset I \subseteq [k]$ and any vector $x \in \mathbb{R}^k$, we denote with $x^I \in \mathbb{R}^{|I|}$ the orthogonal projection of *x* to the coordinates in *I*. That is, $x^I$ is the $|I|$-dimensional vector obtained by deleting all dimensions in *x* that are not in *I*. We usually drop the brackets in the exponent, thus writing e.g. $(5, 6, 7, 8, 9)^{1,3,4} := (5, 7, 8)$ or $(5, 6, 7)^{2} := (6)$. Similarly, for $R \subseteq \mathbb{R}^k$ we define $R^I := \{x^I \mid x \in R\} \subseteq \mathbb{R}^{|I|}$.
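The projection notation can be mirrored in a few lines of code. The following is a minimal sketch, assuming vectors are stored as tuples and index sets use the 1-based coordinates of the text; the helper names `project` and `project_set` are hypothetical.

```python
def project(x, index_set):
    """Orthogonal projection of the vector x to the (1-based) coordinates in index_set."""
    return tuple(x[i - 1] for i in sorted(index_set))


def project_set(points, index_set):
    """Projection of a set of vectors, corresponding to R^I in the text."""
    return {project(x, index_set) for x in points}


print(project((5, 6, 7, 8, 9), {1, 3, 4}))  # (5, 7, 8)
print(project((5, 6, 7), {2}))              # (6,)
```

Two different vectors may project to the same point, so `project_set` can return fewer elements than it was given.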

We associate with each vertex $v \in V$ a vector in the *k*-dimensional Euclidean vector space. Formally, $\vec{v} := (\vec{v}^{i})_{i \in [k]} := (\text{dist}(s_i, v))_{i \in [k]} \in \mathbb{N}^k$ and for $U \subseteq V$ we use $\vec{U} := \{\vec{u} \mid u \in U\}$ to denote the set of all vectors of vertices in *U*. For a given instance of *k*-Disjoint Shortest Paths, one can compute the vector of each vertex in $O(km)$ time by performing a breadth-first search from each vertex $s_i$.
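As an illustration of this preprocessing step, the sketch below computes the vector of every vertex with one breadth-first search per source. It assumes a connected, unweighted graph given as an adjacency dictionary; the names `bfs_distances` and `distance_vectors` are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Breadth-first search distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def distance_vectors(adj, sources):
    """Map each vertex v to (dist(s_1, v), ..., dist(s_k, v)).

    Assumes the graph is connected, so every distance is defined."""
    per_source = [bfs_distances(adj, s) for s in sources]
    return {v: tuple(d[v] for d in per_source) for v in adj}


# Hypothetical example: the path 1-2-3-4 with s_1 = 1 and s_2 = 4.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(distance_vectors(adj, [1, 4]))
# {1: (0, 3), 2: (1, 2), 3: (2, 1), 4: (3, 0)}
```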

We use the following notations for any non-empty index set *I* ⊆ [*k*] in order to compare vectors of vertices *v, w* or sets *V, W* of vertices:

$$\begin{aligned} v \simeq^I w &\iff \forall c \in I.\ \vec{v}^{c} \simeq \vec{w}^{c} &&\text{for } {\simeq} \in \{<, \le, =, \ge, >\}, \text{ and} \\ V \simeq^I W &\iff \{\vec{v}^{I} \mid v \in V\} \simeq \{\vec{w}^{I} \mid w \in W\} &&\text{for } {\simeq} \in \{\subset, \subseteq, =, \supseteq, \supset\}. \end{aligned}$$

We further write $x \in^I X$ if there is an $x' \in X$ with $x' =^I x$ and $x \notin^I X$ otherwise.

**Lemma 5.1.** *For any pair of vertices $v, w \in V$, we have $\|\vec{v} - \vec{w}\|_\infty \le \text{dist}(v, w)$.*

*Proof.* Let *P* be a shortest *v*-*w*-path. Each edge $\{p, q\}$ in *P* fulfills $\|\vec{p} - \vec{q}\|_\infty \le 1$ as $|\text{dist}(s_i, p) - \text{dist}(s_i, q)| \le 1$ for each vertex $s_i$. Thus, by the triangle inequality, $\|\vec{v} - \vec{w}\|_\infty \le \sum_{a \in A(P)} 1 = \text{dist}(v, w)$.

For two vertices *u, w* ∈ *V* , let

$$u \diamond w := \{ v \in V \mid \text{dist}(u, v) + \text{dist}(v, w) = \text{dist}(u, w) \}$$

be the set of all vertices that lie on a shortest *u*-*w*-path. Similarly, for any $x, y \in \mathbb{N}^k$, let

$$x \diamond y := \{ z \in \mathbb{R}^k \mid \|x - z\|_{\infty} + \|z - y\|_{\infty} = \|x - y\|_{\infty} \}$$

be the hyperrectangle spanned by *x* and *y*, whose sides form an angle of 45° with the coordinate axes (see Figure 5.1). We continue with a formal definition of *colors*.
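Membership in the spanned hyperrectangle is a direct check of the defining equation. The following minimal sketch assumes integer vectors stored as tuples (so that the comparison is exact); the names `linf` and `in_span` are hypothetical.

```python
def linf(x, y):
    """Maximum norm of x - y."""
    return max(abs(a - b) for a, b in zip(x, y))


def in_span(z, x, y):
    """Check whether z lies in the hyperrectangle spanned by x and y,
    that is, whether ||x - z|| + ||z - y|| = ||x - y|| in the maximum norm."""
    return linf(x, z) + linf(z, y) == linf(x, y)


x, y = (0, 0), (4, 2)
print(in_span((2, 1), x, y))  # True
print(in_span((3, 3), x, y))  # True, a corner of the rotated rectangle
print(in_span((2, 3), x, y))  # False
```

In the example, the point (3, 3) lies outside the axis-parallel box between *x* and *y* but inside the rectangle whose sides are at 45° to the axes.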

**Definition 5.1.** Let *s*, *t* be two vertices and let *P* be a shortest *s*-*t*-path. The pair (*s*, *t*) and the path *P* are *colored* if $\text{dist}(s, t) = \|\vec{s} - \vec{t}\|_\infty$. Let

$$C(P) := C(s, t) := \{ c \in [k] \mid |\vec{s}^{c} - \vec{t}^{c}| = \|\vec{s} - \vec{t}\|_{\infty} \}$$

be the set of all colors of *P*. The pair (*s*, *t*) and the path *P* are *c-colored* for each $c \in C(s, t)$.
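To make the definition concrete, the sketch below computes the color set of a pair from its two vectors and tests whether the pair is colored. It assumes that the vectors and the graph distance are already known; the names `colors` and `is_colored` are hypothetical, and colors are reported 1-based as in the text.

```python
def colors(s_vec, t_vec):
    """Coordinates (1-based) in which |s^c - t^c| attains the maximum norm of s - t."""
    diffs = [abs(a - b) for a, b in zip(s_vec, t_vec)]
    top = max(diffs)
    return {c + 1 for c, d in enumerate(diffs) if d == top}


def is_colored(s_vec, t_vec, dist_st):
    """A pair is colored if its graph distance equals the maximum norm of s - t."""
    return dist_st == max(abs(a - b) for a, b in zip(s_vec, t_vec))


# Vectors of the end vertices of the path 1-2-3-4 with s_1 = 1 and s_2 = 4.
print(colors((0, 3), (3, 0)))         # {1, 2}
print(is_colored((0, 3), (3, 0), 3))  # True
print(colors((1, 2), (3, 3)))         # {1}
```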

Note that this definition of a *c*-colored path is equivalent to saying that *P* is strictly monotone in its *c*-coordinates. Note further that for arbitrary $u, w \in V$ we do *not* always have $\overrightarrow{u \diamond w} \subseteq \vec{u} \diamond \vec{w}$, that is, the vectors of all vertices on a shortest *u*-*w*-path are not necessarily contained in the set of vectors "spanned" by $\vec{u}$ and $\vec{w}$. However, this inclusion holds for colored vertex pairs as shown next.

**Lemma 5.2.** *Let $v, w \in V$ be a b-colored pair. Then, $\overrightarrow{v \diamond w} \subseteq \vec{v} \diamond \vec{w}$.*

*Proof.* Without loss of generality $v \le^b w$. Let *u* be an arbitrary vertex in $v \diamond w$. Then, $\text{dist}(v, w) = \text{dist}(v, u) + \text{dist}(u, w)$. Definition 5.1 and Lemma 5.1 yield

$$
\vec{w}^{b} = \vec{v}^{b} + \text{dist}(v, w) = \vec{v}^{b} + \text{dist}(v, u) + \text{dist}(u, w) \ge \vec{u}^{b} + \text{dist}(u, w) \ge \vec{w}^{b}.
$$

Hence, $\vec{u}^{b} = \vec{v}^{b} + \text{dist}(v, u)$ and $\vec{w}^{b} = \vec{u}^{b} + \text{dist}(u, w)$. Lemma 5.1 then states that $\text{dist}(v, u) = \|\vec{v} - \vec{u}\|_\infty$ and $\text{dist}(u, w) = \|\vec{u} - \vec{w}\|_\infty$. Hence,

$$\|\vec{v} - \vec{w}\|_{\infty} = \text{dist}(v, w) = \text{dist}(v, u) + \text{dist}(u, w) = \|\vec{v} - \vec{u}\|_{\infty} + \|\vec{u} - \vec{w}\|_{\infty}.$$

This leads to $\vec{u} \in \vec{v} \diamond \vec{w}$ and thus $\overrightarrow{v \diamond w} \subseteq \vec{v} \diamond \vec{w}$.

Throughout this chapter, we will be particularly interested in two-dimensional projections of areas $\vec{v} \diamond \vec{w}$ for some vertices *v* and *w*. Note in this context that $(\vec{v} \diamond \vec{w})^I = \vec{v}^I \diamond \vec{w}^I$. Recall that the area defined by $x \diamond y$ for $x, y \in \mathbb{N}^2$ is a rectangle in the plane whose sides form an angle of 45° to the coordinate axes. The following lemma lists necessary and sufficient conditions for those rectangles to intersect.

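**Lemma 5.3.** *Let $x, y, \hat{x}, \hat{y} \in \mathbb{N}^2$. Then $x \diamond y \cap \hat{x} \diamond \hat{y} \neq \emptyset$ if and only if all of the following hold:*

$$\begin{aligned}
\text{(i)}\;& \min \{ x^1 - x^2, y^1 - y^2 \} \le \max \{ \hat{x}^1 - \hat{x}^2, \hat{y}^1 - \hat{y}^2 \}, \\
\text{(ii)}\;& \min \{ \hat{x}^1 - \hat{x}^2, \hat{y}^1 - \hat{y}^2 \} \le \max \{ x^1 - x^2, y^1 - y^2 \}, \\
\text{(iii)}\;& \min \{ x^1 + x^2, y^1 + y^2 \} \le \max \{ \hat{x}^1 + \hat{x}^2, \hat{y}^1 + \hat{y}^2 \}, \\
\text{(iv)}\;& \min \{ \hat{x}^1 + \hat{x}^2, \hat{y}^1 + \hat{y}^2 \} \le \max \{ x^1 + x^2, y^1 + y^2 \}.
\end{aligned}$$

*Proof.* Let $R_1, R_2 \subseteq \mathbb{R}^2$ be two axis-parallel rectangles defined by the opposite corners $q, r \in \mathbb{R}^2$ and $\hat{q}, \hat{r} \in \mathbb{R}^2$. It is easy to see that $R_1$ and $R_2$ intersect if and only if

$$\min \{ q^i, r^i \} \le \max \{ \hat{q}^i, \hat{r}^i \} \quad \text{and} \quad \min \{ \hat{q}^i, \hat{r}^i \} \le \max \{ q^i, r^i \} \quad \text{for each } i \in \{1, 2\}.$$

Since the intersection of two rectangles is invariant under rotation and scaling, we simply rotate $x \diamond y$ and $\hat{x} \diamond \hat{y}$ by 45° (and scale them by factor $\sqrt{2}$) by multiplying all vectors with the matrix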

$$R = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}.$$

Now the above characterization for axis-parallel rectangles translates into the conditions stated in the lemma.
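The rotation argument translates into a short intersection test. The sketch below is an illustration only: it assumes that the conditions of Lemma 5.3 are the interval-overlap conditions on the rotated coordinates $x^1 - x^2$ and $x^1 + x^2$ obtained from the matrix *R* above, and the names `rotate`, `intervals_overlap`, and `spans_intersect` are hypothetical.

```python
def rotate(p):
    """Multiply the two-dimensional point p with the matrix R = [[1, -1], [1, 1]]."""
    return (p[0] - p[1], p[0] + p[1])


def intervals_overlap(a1, a2, b1, b2):
    """Do the closed intervals with end points {a1, a2} and {b1, b2} intersect?"""
    return min(a1, a2) <= max(b1, b2) and min(b1, b2) <= max(a1, a2)


def spans_intersect(x, y, x_hat, y_hat):
    """Test whether the rectangles spanned by (x, y) and by (x_hat, y_hat) intersect.

    After rotating by 45 degrees both rectangles are axis-parallel, so they
    intersect if and only if their extents overlap in both rotated coordinates."""
    rx, ry, rxh, ryh = (rotate(p) for p in (x, y, x_hat, y_hat))
    return all(intervals_overlap(rx[i], ry[i], rxh[i], ryh[i]) for i in range(2))


print(spans_intersect((0, 0), (4, 2), (2, 1), (5, 1)))  # True, both contain (2, 1)
print(spans_intersect((0, 0), (2, 0), (5, 0), (7, 0)))  # False
```

Working with the rotated integer coordinates avoids the scaling factor $\sqrt{2}$ and any floating-point arithmetic.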

**Figure 5.3:** The rectangle $x \diamond y$ spanned by two points *x* and *y* in two dimensions and a point $z \in x \diamond y$. Note that $\|y - x\|_\infty$ is the vertical distance between *x* and *y*. Lemma 5.4 states that $d_1 \ge d_2$ and $d_3 \ge d_4$.

The next lemma states that the distance between *x* and any $z \in x \diamond y$, where *x* and *y* have distance $|y^c - x^c|$, is at most $|z^c - x^c|$. Intuitively, this is clear as $x \diamond y$ is a hyperrectangle whose sides form an angle of 45° with the coordinate axes and hence half of its borders exactly define all points *z* whose distance to *x* is exactly $|z^c - x^c|$. See Figure 5.3 for an illustration.

**Lemma 5.4.** *Let $b, c \in [k]$ and let $x, y \in \mathbb{N}^k$ with $\|y - x\|_\infty = y^c - x^c$. Then, for all $z \in x \diamond y$ it holds that $z^c - x^c \ge |z^b - x^b| \ge 0$ and $y^c - z^c \ge |y^b - z^b| \ge 0$.*

*Proof.* By assumption and the definition of $x \diamond y$, it holds that

$$\begin{aligned} y^c - x^c = \|y - x\|_\infty &= \|y - z\|_\infty + \|z - x\|_\infty \\ &\ge |y^c - z^c| + |z^c - x^c| \ge (y^c - z^c) + (z^c - x^c) \\ &= y^c - x^c. \end{aligned}$$

Thus, we have equality everywhere, in particular $y^c - z^c = \|y - z\|_\infty \ge |y^b - z^b|$ and $z^c - x^c = \|z - x\|_\infty \ge |z^b - x^b|$ (as shown by the equality between the last term in the first row and the first term in the second row).

We next formalize the lines we used in Figure 5.1 to connect the vectors of vertices in a path. To this end, for any path $P = (v_1, v_2, \dots, v_i)$ we define $\zeta(P) \subset \mathbb{R}^k$ as the piecewise linear curve connecting the points of $\vec{P}$ in the order given by *P*. Recall that $C(P)$ denotes the set of all colors *a* such that *P* is *a*-colored. The next observation states that $\zeta(P)^{C(P)}$ is a straight line, which is equivalent to the statement that *P* is strictly monotone in each coordinate in $C(P)$.

**Observation 5.5.** *Let P be a colored path. Then $\zeta(P)^{C(P)}$ is a straight line segment.*

*Proof.* Let $\ell := \|\overrightarrow{s_P} - \overrightarrow{t_P}\|_\infty$ and $k' := |C(P)|$. The path *P* contains exactly $\ell$ edges, each of which has a Euclidean length of at most $\sqrt{k'}$ in the projection $\zeta(P)^{C(P)}$. Thus the length of $\zeta(P)^{C(P)}$ is at most $\ell \cdot \sqrt{k'}$, which is exactly the Euclidean distance between $\overrightarrow{s_P}^{C(P)}$ and $\overrightarrow{t_P}^{C(P)}$. Since a straight line segment is the unique shortest curve between two points, the claim follows.

As a consequence of Observation 5.5, the intersection of two paths *P*, *Q* in the $(C(P) \cup C(Q))$-projection is also a straight line segment with an angle of 45° to the coordinate axes as shown in Figure 5.1 (right side) and Figure 5.2 (top right).

**Lemma 5.6.** *Let P and Q be two colored paths, and $C \subseteq C(P) \cup C(Q)$. Then $\zeta(P)^C \cap \zeta(Q)^C$ is a (possibly empty) straight line segment.*

*Proof.* For the sake of notation, we assume that $C = [|C|]$. Note that $\zeta(P)$ and $\zeta(Q)$ are piecewise linear curves. Moreover, according to Lemma 5.2 for any two points $x, y \in \zeta(P)$, it holds that $\|x - y\|_\infty = |x^c - y^c|$ for all $c \in C(P)$ and $\|x' - y'\|_\infty = |{x'}^b - {y'}^b|$ for any two points $x', y' \in \zeta(Q)$ and all $b \in C(Q)$. So, for $x, y \in \mathbb{R}^k$ with $x^C, y^C \in \zeta(P)^C \cap \zeta(Q)^C$, it follows that

$$|x^c - y^c| = \|x - y\|_\infty = |x^b - y^b|$$

for all $c \in C \cap C(P)$ and all $b \in C \cap C(Q)$. Thus, $C(x, y) \supseteq C$ and the claim follows from Observation 5.5.

Note that even if $\zeta(P)^{C(P) \cup C(Q)} \cap \zeta(Q)^{C(P) \cup C(Q)}$ is non-empty, it does not need to contain points from $\mathbb{N}^{|C(P) \cup C(Q)|}$, as can be seen in the top left example in Figure 5.2.

The following definition starts to formalize the notion of marbles, that is, the special vertices in the different cases in Figure 5.2. We start with the two cases in which the lines of *P* and *Q* cross (the upper two).

**Definition 5.2.** Let *P*, *Q* be two colored paths, let $b \in C(P)$, and let $c \in C(Q)$. The paths *P* and *Q* are *b, c-crossing* if the intersection

$$X := \zeta(P)^{b,c} \cap \zeta(Q)^{b,c}$$

is non-empty and they are *b, c-non-crossing* otherwise.

If $\vec{P}^{b,c} \cap X \neq \emptyset$, then we define $\alpha_P^{b,c}(Q)$ and $\omega_P^{b,c}(Q)$ to be the first and last vertex *v* of *P* with $\vec{v}^{b,c} \in X$. If $\vec{P}^{b,c} \cap X$ only contains non-integer coordinates, then $\alpha_P^{b,c}(Q) := \omega_P^{b,c}(Q) := \bot$. We further define $\partial_P^{b,c}(Q)$ and $\varpi_P^{b,c}(Q)$ to be the last vertex before and the first vertex after that intersection. If $\vec{P}^{b,c} \cap X = \emptyset$, then $\alpha_P^{b,c}(Q) := \omega_P^{b,c}(Q) := \partial_P^{b,c}(Q) := \varpi_P^{b,c}(Q) := \bot$.

Regarding notation, we will use $\alpha_P$ instead of $\alpha_P^{b,c}(Q)$ (and the same for $\omega$, $\partial$, and $\varpi$) if *b*, *c*, and *Q* are clear from the context. Note that the subpaths between the respective $\alpha$- and $\omega$-vertices are by Lemma 5.6 straight lines and $\{b, c\}$-colored.

**Observation 5.7.** *If P, Q are two paths with $\alpha_P^{b,c}(Q) \neq \bot$, then*

$$P[\alpha_P^{b,c}(Q), \omega_P^{b,c}(Q)] =^{b,c} Q[\alpha_Q^{b,c}(P), \omega_Q^{b,c}(P)].$$

*In particular, both of these subpaths are b, c-colored.*

It remains to consider the subpaths between *s*- and $\alpha$-vertices and between $\omega$- and *t*-vertices. By Lemma 5.2 these have to lie in the rectangle areas

$$\overrightarrow{s_P} \diamond \overrightarrow{\partial_P^{b,c}(Q)},\quad \overrightarrow{\varpi_P^{b,c}(Q)} \diamond \overrightarrow{t_P},\quad \overrightarrow{s_Q} \diamond \overrightarrow{\partial_Q^{b,c}(P)}, \text{ and } \overrightarrow{\varpi_Q^{b,c}(P)} \diamond \overrightarrow{t_Q}.\tag{5.1}$$

Figure 5.2 (top left-hand corner and top right-hand corner) suggests that these areas are pairwise disjoint. We will show that this is indeed the case and, to this end, we show the following two observations. The first one states that the $\partial$-vertex on *P* has a *b*-coordinate that is at most the *b*-coordinate of the $\partial$-vertex on *Q*, where *b* is the "original" color of *P*. Note that this $\partial$-vertex is right before the respective $\alpha$-vertex or before the single crossing point with non-integer coordinates. Since *P* is strictly *b*-monotone, the path *Q* can at most increase or decrease as fast as *P* from the point of intersection.

**Observation 5.8.** *Let P, Q be two b, c-crossing paths with $\partial_P^{b,c}(Q) \neq \bot$. If P is a subpath of a shortest $s_b$-$t_b$-path and Q is a subpath of a shortest $s_c$-$t_c$-path, then $\overrightarrow{\partial_P^{b,c}(Q)}^{b} \le \overrightarrow{\partial_Q^{b,c}(P)}^{b}$ and $\overrightarrow{\partial_Q^{b,c}(P)}^{c} \le \overrightarrow{\partial_P^{b,c}(Q)}^{c}$.*

*Proof.* Let $z \in \zeta(P)^{b,c} \cap \zeta(Q)^{b,c}$ have minimal *b*-coordinate. Note that

$$\left\| z - \overrightarrow{\partial_P(Q)}^{b,c} \right\|_{\infty} = \left\| z - \overrightarrow{\partial_Q(P)}^{b,c} \right\|_{\infty},$$

and since $\zeta(P)$ is strictly increasing in its *b*-coordinate, we can infer that

$$z^b - \overrightarrow{\partial_P(Q)}^{b} = \left\| z - \overrightarrow{\partial_P(Q)}^{b,c} \right\|_{\infty} = \left\| z - \overrightarrow{\partial_Q(P)}^{b,c} \right\|_{\infty} \ge z^b - \overrightarrow{\partial_Q(P)}^{b}.$$

This yields $\overrightarrow{\partial_P(Q)}^{b} \le \overrightarrow{\partial_Q(P)}^{b}$ and the second inequality follows analogously.

The second observation is a simple but useful restatement of Lemma 5.1.

**Observation 5.9.** *Let $b, c \in [k]$ and let $(v, w)$ be a b-colored pair of vertices with $v <^b w$. Then $\vec{w}^{b} - \vec{w}^{c} \ge \vec{v}^{b} - \vec{v}^{c}$.*

*Proof.* By Lemma 5.1, $\vec{w}^{c} - \vec{v}^{c} \le \text{dist}(v, w) = \vec{w}^{b} - \vec{v}^{b}$. A simple arithmetic reformulation yields $\vec{w}^{c} - \vec{w}^{b} \le \vec{v}^{c} - \vec{v}^{b}$ and multiplying both sides with −1 completes the proof.

We are now in the position to prove the statement that the four areas defned in Term (5.1) are pairwise disjoint.

**Lemma 5.10.** *Let P and Q be two b, c-crossing paths. The sets*

$$\left(\overrightarrow{s_P} \diamond \overrightarrow{\partial_P^{b,c}(Q)}\right)^{b,c}, \left(\overrightarrow{\varpi_P^{b,c}(Q)} \diamond \overrightarrow{t_P}\right)^{b,c}, \left(\overrightarrow{s_Q} \diamond \overrightarrow{\partial_Q^{b,c}(P)}\right)^{b,c}, \text{and } \left(\overrightarrow{\varpi_Q^{b,c}(P)} \diamond \overrightarrow{t_Q}\right)^{b,c}$$

*are pairwise disjoint (or undefined).*

*Proof.* Without loss of generality, let *P* be *b*-colored, *Q* be *c*-colored, $s_P <^b t_P$, and $s_Q <^c t_Q$. Recall that $s_P$ and $t_P$ are the start and end vertices of *P*, respectively. We further assume that all above sets are defined, that is, none of the described end points is $\bot$. By Lemma 5.4, for any $x \in \overrightarrow{s_P} \diamond \overrightarrow{\partial_P}$ and $y \in \overrightarrow{\varpi_P} \diamond \overrightarrow{t_P}$ it holds that $x^b \le \overrightarrow{\partial_P}^{b} < \overrightarrow{\varpi_P}^{b} \le y^b$, and thus $(\overrightarrow{s_P} \diamond \overrightarrow{\partial_P})^{b,c} \cap (\overrightarrow{\varpi_P} \diamond \overrightarrow{t_P})^{b,c} = \emptyset$. An analogous argument holds for $(\overrightarrow{s_Q} \diamond \overrightarrow{\partial_Q})^{b,c}$ and $(\overrightarrow{\varpi_Q} \diamond \overrightarrow{t_Q})^{b,c}$.

We will now use Lemma 5.3 to show that $(\overrightarrow{s_P} \diamond \overrightarrow{\partial_P})^{b,c} \cap (\overrightarrow{s_Q} \diamond \overrightarrow{\partial_Q})^{b,c} = \emptyset$. Since all other remaining cases are analogous, this will conclude the proof. By Observation 5.9, it holds that

$$\max \{ \overrightarrow{s_P}^{b} - \overrightarrow{s_P}^{c}, \overrightarrow{\partial_P}^{b} - \overrightarrow{\partial_P}^{c} \} = \overrightarrow{\partial_P}^{b} - \overrightarrow{\partial_P}^{c} \text{ and }$$

$$\min \{ \overrightarrow{s_Q}^{b} - \overrightarrow{s_Q}^{c}, \overrightarrow{\partial_Q}^{b} - \overrightarrow{\partial_Q}^{c} \} = \overrightarrow{\partial_Q}^{b} - \overrightarrow{\partial_Q}^{c}.$$

Observe that $\overrightarrow{\partial_P}^{b,c} \neq \overrightarrow{\partial_Q}^{b,c}$ as otherwise $\partial_P$ would lie on the intersection of *P* and *Q*, a contradiction. Hence $\overrightarrow{\partial_P}^{b} \neq \overrightarrow{\partial_Q}^{b}$ or $\overrightarrow{\partial_P}^{c} \neq \overrightarrow{\partial_Q}^{c}$. In the former case, Observation 5.8 states that $\partial_Q >^b \partial_P$ and in the latter case it states $\partial_P >^c \partial_Q$. Hence, $\overrightarrow{\partial_P}^{c} + \overrightarrow{\partial_Q}^{b} > \overrightarrow{\partial_P}^{b} + \overrightarrow{\partial_Q}^{c}$ and thus $\overrightarrow{\partial_P}^{b} - \overrightarrow{\partial_P}^{c} < \overrightarrow{\partial_Q}^{b} - \overrightarrow{\partial_Q}^{c}$.

Setting $x = \overrightarrow{s_P}^{b,c}$, $y = \overrightarrow{\partial_P}^{b,c}$, $\hat{x} = \overrightarrow{s_Q}^{b,c}$, and $\hat{y} = \overrightarrow{\partial_Q}^{b,c}$ in Lemma 5.3 then violates condition (ii) and hence $(\overrightarrow{s_P} \diamond \overrightarrow{\partial_P})^{b,c} \cap (\overrightarrow{s_Q} \diamond \overrightarrow{\partial_Q})^{b,c} = \emptyset$.

We continue with the definition of marbles for the cases where the two paths *P* and *Q* are *b, c*-non-crossing. Figure 5.2 shows that even in this case $s_P \diamond t_P$ and $s_Q \diamond t_Q$ in general are not disjoint (bottom left-hand corner). Since the two bottom cases are distinguished by the intersection of $s_P \diamond t_P$ and $s_Q \diamond t_Q$, we start with a definition of this intersection.

**Definition 5.3.** Let $b, c \in [k]$. Let *P* be a *b*-colored path and let *Q* be a *c*-colored path. The *common b, c-area* of *P* and *Q* is

$$\Delta^{b,c}(P,Q) := (\overrightarrow{s_P} \diamond \overrightarrow{t_P})^{b,c} \cap (\overrightarrow{s_Q} \diamond \overrightarrow{t_Q})^{b,c}.$$

Note that if $\Delta^{b,c}(P, Q) = \emptyset$, then by Lemma 5.2, *P* and *Q* do not share vertices with common vectors and hence they do not share common vertices. If $\Delta^{b,c}(P, Q) \neq \emptyset$ and *P* and *Q* are *b, c*-crossing, then we can use Definition 5.2 to define the marbles. It hence remains to study the case where $\Delta^{b,c}(P, Q) \neq \emptyset$ and *P* and *Q* are *b, c*-non-crossing. In this case we need at most one marble per path and this marble is defined as follows.

**Definition 5.4.** Let *P* be a *b*-colored path, *Q* be a *c*-colored path, and assume without loss of generality that $s_Q <^b t_Q$. Define

$$B := \{ v \in V \mid v =^b s_Q \land v <^c s_Q \} \cup \{ v \in V \mid v =^b t_Q \land v >^c t_Q \}.$$

If $P \cap B \neq \emptyset$, then $\delta_P^{b,c}(Q)$ is the unique vertex in $P \cap B$. If $P \cap B = \emptyset$, then $\delta_P^{b,c}(Q) = \bot$.

**Observation 5.11.** *$\delta_P^{b,c}(Q)$ is well-defined.*

*Proof.* Since *P* is strictly monotone in the *b*-coordinate, it can clearly contain at most one point from $B_1 := \{v \in V \mid v =^b s_Q \land v <^c s_Q\}$ and one from $B_2 := \{v \in V \mid v =^b t_Q \land v >^c t_Q\}$. It remains to show that it cannot intersect both sets. To this end, observe that Lemma 5.4 states for any $b_1 \in B_1$ and $b_2 \in B_2$ that $|b_1^c - b_2^c| > |s_Q^c - t_Q^c| \ge |s_Q^b - t_Q^b| = |b_1^b - b_2^b|$, and therefore the pair $(b_1, b_2)$ is not *b*-colored and thus *P* cannot contain both $b_1$ and $b_2$.
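Definition 5.4 can be read as a simple filter over the vertices of *P*. The sketch below is a hypothetical illustration: it assumes that the vector of every vertex is available in a dictionary `vec`, uses 0-based coordinate indices `b` and `c`, and returns `None` in place of $\bot$. The vertex names and vectors in the example are made up for illustration and do not stem from a figure of the thesis.

```python
def delta_marble(path, vec, s_q, t_q, b, c):
    """Return the vertex of `path` that lies in the set B of Definition 5.4,
    or None if there is no such vertex."""
    sb, sc = vec[s_q][b], vec[s_q][c]
    tb, tc = vec[t_q][b], vec[t_q][c]
    hits = [
        v for v in path
        if (vec[v][b] == sb and vec[v][c] < sc)   # same b-coordinate as s_Q, smaller c
        or (vec[v][b] == tb and vec[v][c] > tc)   # same b-coordinate as t_Q, larger c
    ]
    # By Observation 5.11 a b-colored path contains at most one such vertex.
    assert len(hits) <= 1
    return hits[0] if hits else None


# Hypothetical vectors (b-coordinate, c-coordinate).
vec = {
    "p0": (0, 3), "p1": (1, 3), "p2": (2, 4), "p3": (3, 4),  # b-colored path P
    "sQ": (1, 1), "tQ": (2, 3),                              # end vertices of Q
}
print(delta_marble(["p0", "p1", "p2", "p3"], vec, "sQ", "tQ", b=0, c=1))  # p2
```

In the example, `p2` has the same *b*-coordinate as `tQ` and a strictly larger *c*-coordinate, so it is the marble; `p1` shares its *b*-coordinate with `sQ` but has a larger *c*-coordinate and therefore does not qualify.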

We next show that if $\Delta^{b,c}(P, Q) \neq \emptyset$ and *P* and *Q* are *b, c*-non-crossing, then at least one of the vertices $\delta_P^{b,c}(Q)$ or $\delta_Q^{b,c}(P)$ exists. Afterwards we will show that these vertices guarantee that the respective new areas are disjoint.

**Lemma 5.12.** *Let P, Q be two b, c-non-crossing paths with $\Delta^{b,c}(P, Q) \neq \emptyset$. Then, $\delta_P^{b,c}(Q) \neq \bot$ or $\delta_Q^{b,c}(P) \neq \bot$.*

*Proof.* Note that the definition of $\Delta^{b,c}(P, Q)$ requires that without loss of generality *P* is *b*-colored and *Q* is *c*-colored. We further assume without loss of generality that $\{b, c\} = [2]$ and that $s_P <^b t_P$ and $s_Q <^c t_Q$.

Suppose towards a contradiction that $\delta_P^{b,c}(Q) = \delta_Q^{b,c}(P) = \bot$. Since *P*, *Q* are *b, c*-non-crossing and $\delta_Q^{b,c}(P) = \bot$, the path *Q* cannot cross the curve

$$\{x \in \mathbb{N}^2 \mid x^c = s_P^c \land x^b < s_P^b\} \cup \zeta(P)^{b,c} \cup \{x \in \mathbb{N}^2 \mid x^c = t_P^c \land x^b > t_P^b\}.$$

Since *Q* cannot cross this curve, it is located completely on one side of it. Assume without loss of generality that *Q* (and thus in particular $t_Q$) is located on the side containing (0, 0), that is, for all *v* in *P* and *w* in *Q* with $v^b = w^b$ it holds that $v^c > w^c$.

Then there are three possible cases: $t_Q^b < s_P^b$, $t_Q^b \in [s_P^b, t_P^b]$, or $t_Q^b > t_P^b$. Note that in the first case by Lemma 5.4 it holds for any $z \in \Delta^{b,c}(P, Q)$ that

$$z^c - t_Q^c \ge z^b - t_Q^b > z^b - s_P^b \ge z^c - s_P^c,$$

a contradiction to $z \in (\overrightarrow{s_Q} \diamond \overrightarrow{t_Q})^{b,c}$. In the last case, a similar argument holds with

$$s_P^b < t_Q^b - z^b \le t_Q^c - z^c < s_P^c - z^c,$$

which is a contradiction to $z \in (\overrightarrow{s_P} \diamond \overrightarrow{t_P})^{b,c}$. It remains to analyze the case where $t_Q^b \in [s_P^b, t_P^b]$. Note that in this case there is a vertex *p* in *P* with $p^b = t_Q^b$. Hence $t_Q^c < p^c$, and $p \in \{v \in V \mid v =^b t_Q \land v >^c t_Q\}$. Thus $\delta_P^{b,c}(Q) = p \neq \bot$, a contradiction.

The next lemma shows that if *P* and *Q* are *b, c*-non-crossing, but they have a common area ∆*b,c* and *δ b,c P* (*Q*) ̸= ⊥, then the area between *s<sup>Q</sup>* and *t<sup>Q</sup>* is disjoint from the two areas between *s<sup>P</sup>* and *δ b,c P* (*Q*) and between *δ b,c P* (*Q*) and *t<sup>P</sup>* . Hence *δ b,c P* (*Q*) is the last type of marble needed.

**Lemma 5.13.** *Let P be a b-colored path and let Q be a c-colored path such that* $\Delta^{b,c}(P, Q) \neq \bot$ *and* $\delta\_P^{b,c}(Q) \neq \bot$*. Then,*

$$\left(\overrightarrow{s\_Q} \diamond \overrightarrow{t\_Q}\right)^{b,c} \text{ is disjoint from } \left(\overrightarrow{s\_P} \diamond \overrightarrow{\delta\_P^{b,c}(Q)}\right)^{b,c} \cup \left(\overrightarrow{\delta\_P^{b,c}(Q)} \diamond \overrightarrow{t\_P}\right)^{b,c}.$$

*Proof.* For the sake of readability, we use $\delta := \delta\_P^{b,c}(Q)$. We will show that

$$\left(\overrightarrow{s\_Q} \diamond \overrightarrow{t\_Q}\right)^{b,c} \cap \left(\overrightarrow{s\_P} \diamond \overrightarrow{\delta}\right)^{b,c} = \emptyset.$$

The proof for $(\overrightarrow{\delta} \diamond \overrightarrow{t\_P})^{b,c}$ is then completely analogous. Assume without loss of generality that $\delta =^b t\_Q$ and $\delta >^c t\_Q$. Notice that

$$\begin{split} \max \{ \overrightarrow{s\_P}^b - \overrightarrow{s\_P}^c, \overrightarrow{\delta}^b - \overrightarrow{\delta}^c \} &\overset{\text{Obs. 5.9}}{=} \overrightarrow{\delta}^b - \overrightarrow{\delta}^c \\ &< \overrightarrow{t\_Q}^b - \overrightarrow{t\_Q}^c \overset{\text{Obs. 5.9}}{=} \min \{ \overrightarrow{s\_Q}^b - \overrightarrow{s\_Q}^c, \overrightarrow{t\_Q}^b - \overrightarrow{t\_Q}^c \}. \end{split}$$

Setting $x := \overrightarrow{s\_P}^{b,c}$, $y := \overrightarrow{\delta}^{b,c}$, $\hat{x} := \overrightarrow{s\_Q}^{b,c}$, and $\hat{y} := \overrightarrow{t\_Q}^{b,c}$ in Lemma 5.3 then yields that condition (ii) is violated and thus $(\overrightarrow{s\_P} \diamond \overrightarrow{\delta})^{b,c} \cap (\overrightarrow{s\_Q} \diamond \overrightarrow{t\_Q})^{b,c} = \emptyset$.

We are finally in the position to define the set of marbles for a pair of paths. Afterwards we conclude this subsection with the main proposition that states that marbles uniquely classify solutions.

**Definition 5.5.** Let *P* be a *b*-colored $s\_P$-$t\_P$-path and *Q* be a *c*-colored $s\_Q$-$t\_Q$-path. The set of {*b, c*}*-marbles* of *P* with respect to *Q* is

$$\mathcal{M}\_P^{b,c}(Q) := \{ s\_P, t\_P, \mu\_P^{b,c}(Q) \mid \mu \in \{ \alpha, \omega, \partial, \varpi, \delta \} \} \backslash \{ \perp \}.$$
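As a purely illustrative sketch of Definition 5.5, the marble set can be assembled by collecting the two ends and the up to five special vertices and discarding ⊥. The function name `marble_set`, the dictionary keys, and the use of `None` for ⊥ are assumptions made for this example. How the special vertices themselves are computed is not shown.

```python
from typing import Dict, Hashable, Optional, Set

# Illustrative keys for the five special vertices named in Definition 5.5.
SPECIAL = ("alpha", "omega", "partial", "varpi", "delta")


def marble_set(s: Hashable, t: Hashable,
               special: Dict[str, Optional[Hashable]]) -> Set[Hashable]:
    """Collect s_P, t_P and every special vertex that is not ⊥ (None here)."""
    marbles = {s, t}
    for name in SPECIAL:
        vertex = special.get(name)  # a missing key is treated like ⊥
        if vertex is not None:
            marbles.add(vertex)
    return marbles
```

Since there are two ends and five special vertices, a set built this way has at most seven elements.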

The next proposition states that if there are two different solutions and in particular two pairs (*P, Q*) and (*P*′*, Q*′) of solution paths with the same marbles, then *P* and *Q* share exactly the same vectors as *P*′ and *Q*′ do. Recall that the shared vectors are a straight line segment and that once their ends are fixed, we can use the dynamic program by Fortune et al. [FHW80] to find disjoint paths between these ends.

**Proposition 5.14.** *Let P and P*′ *be b-colored* $s\_P$-$t\_P$*-paths, and let Q and Q*′ *be c-colored* $s\_Q$-$t\_Q$*-paths. If* $\mathcal{M}\_P^{b,c}(Q) \subseteq P'$ *and* $\mathcal{M}\_Q^{b,c}(P) \subseteq Q'$*, then*

$$\{v \in P' \mid v \in^{b,c} Q'\} =^{b,c} \{v \in P \mid v \in^{b,c} Q\}.$$

*Proof.* Let *R* be the subpath of *P* that starts at $\alpha\_P^{b,c}(Q)$ and ends at $\omega\_P^{b,c}(Q)$ (or $R = \emptyset$ if $\alpha\_P = \omega\_P = \bot$). From the definition of *α* and *ω* and Lemma 5.6, it follows that $\{v \in P \mid v \in^{b,c} Q\} =^{b,c} R$. We now distinguish two cases, depending on whether or not *P* and *Q* are *b, c*-crossing.

If *P* and *Q* are *b, c*-crossing, then by definition $\partial\_P \neq \bot$ and $\varpi\_P \neq \bot$. It follows from Lemma 5.2 that the subpaths of *P*′ from $s\_P$ to $\partial\_P$ and from $\varpi\_P$ to $t\_P$ use only vectors from $\overrightarrow{s\_P} \diamond \overrightarrow{\partial\_P}$ and $\overrightarrow{\varpi\_P} \diamond \overrightarrow{t\_P}$, respectively. As the analogous statement holds for the corresponding subpaths of *Q*′, it follows from Lemma 5.10 that all these subpaths do not intersect in the projection to the *b*-*c*-plane. It remains to consider the subpath from $\alpha\_P$ to $\omega\_P$. If $\alpha\_P = \omega\_P = \bot$, then

$$\{v \in P' \mid v \in^{b,c} Q'\} = \emptyset = R = \{v \in P \mid v \in^{b,c} Q\}.$$

Otherwise, $\overrightarrow{R}^{b,c}$ is by Lemma 5.6 a straight diagonal line and by Observation 5.5 so is $\{v \in P' \mid v \in^{b,c} Q'\}$. Since those two straight line segments have the same ends, they are the same and thus $\{v \in P' \mid v \in^{b,c} Q'\} =^{b,c} R$.

If *P* and *Q* are *b, c*-non-crossing, then $\{v \in P' \mid v \in^{b,c} Q'\}^{b,c} \subseteq \Delta^{b,c}(P, Q)$ and $\emptyset = R$. We consider the two cases $\Delta^{b,c}(P, Q) = \emptyset$ and $\Delta^{b,c}(P, Q) \neq \emptyset$. In the former case it holds that

$$\{v \in P \mid v \in^{b,c} Q\} = R = \emptyset = \Delta^{b,c}(P, Q) = \Delta^{b,c}(P', Q') = \{v \in P' \mid v \in^{b,c} Q'\}.$$

In the latter case, by Lemma 5.12 it holds that $\delta\_P^{b,c}(Q) \neq \bot$ or $\delta\_Q^{b,c}(P) \neq \bot$. Without loss of generality, assume that $\delta\_P^{b,c}(Q) \neq \bot$. Then, $\delta\_P^{b,c}(Q) \in \mathcal{M}\_P^{b,c}(Q) \subseteq P'$ and by Lemma 5.13

$$\begin{split} \{v \in P' \mid v \in^{b,c} Q'\}^{b,c} &\subseteq \left(\overrightarrow{s\_{Q'}} \diamond \overrightarrow{t\_{Q'}}\right)^{b,c} \cap \left( \left(\overrightarrow{s\_{P'}} \diamond \overrightarrow{\delta\_P^{b,c}(Q)}\right)^{b,c} \cup \left(\overrightarrow{\delta\_P^{b,c}(Q)} \diamond \overrightarrow{t\_{P'}}\right)^{b,c} \right) \\ &= \left(\overrightarrow{s\_Q} \diamond \overrightarrow{t\_Q}\right)^{b,c} \cap \left( \left(\overrightarrow{s\_P} \diamond \overrightarrow{\delta\_P^{b,c}(Q)}\right)^{b,c} \cup \left(\overrightarrow{\delta\_P^{b,c}(Q)} \diamond \overrightarrow{t\_P}\right)^{b,c} \right) \\ &= \emptyset. \end{split}$$

Thus, $\{v \in P' \mid v \in^{b,c} Q'\} = \emptyset = R = \{v \in P \mid v \in^{b,c} Q\}$.

### **5.2.2 More than Two Shortest Paths**

In the previous subsection, we looked at two shortest paths *P* and *Q* from $s\_P$ to $t\_P$ and from $s\_Q$ to $t\_Q$, respectively. We showed that selecting at most ten vertices from *P* and *Q* (five per path; see Definition 5.5) is sufficient to ensure that each pair (*P*′*, Q*′) of shortest $s\_P$-$t\_P$- and $s\_Q$-$t\_Q$-paths that also contain these vertices ($\mathcal{M}\_P^{b,c}(Q)$ and $\mathcal{M}\_Q^{b,c}(P)$) "behave" like *P* and *Q* in the sense that *P*′ and *Q*′ intersect in the same vectors as *P* and *Q* do (see Proposition 5.14). In this subsection, we define a set $\mathcal{C}$ with $|\mathcal{C}| \in O(k \cdot k!)$ that basically ensures the same properties for *k* paths. To formalize our goal for this subsection, we first introduce the concept of *avoiding* paths, which is a generalization of a slightly modified version of *b, c*-non-crossing paths. The modification is to ignore the ends of *P* and *Q* to ensure that we can split paths at certain vertices and still ensure that these different parts are avoiding.

**Definition 5.6** (*I*-avoiding)**.** Let ∅ ⊂ *I* ⊆ [*k*]. Two paths *P* and *Q* are *I*-*avoiding* if $p \notin^I Q$ for each inner vertex *p* of *P* and $q \notin^I P$ for each inner vertex *q* of *Q*. Two vertex pairs $(s\_p, t\_p)$ and $(s\_q, t\_q)$ are *I*-avoiding if

$$(\overrightarrow{s\_p}^{\bullet I} \diamond \overrightarrow{t\_p}^{\bullet I}) \cap (\overrightarrow{s\_q}^{\bullet I} \diamond \overrightarrow{t\_q}^{\bullet I}) \subseteq \{\overrightarrow{s\_p}^{\bullet I}, \overrightarrow{t\_p}^{\bullet I}\} \cap \{\overrightarrow{s\_q}^{\bullet I}, \overrightarrow{t\_q}^{\bullet I}\}.$$

Note that being *I*-avoiding implies being *I*′-avoiding for all *I*′ ⊇ *I*. We use *avoiding* as a shorthand for [*k*]-avoiding. Two paths $P\_1$ and $P\_2$ are *internally vertex-disjoint* if neither of them contains an inner vertex of the other path. Avoiding paths are clearly internally vertex-disjoint.
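The two path properties just introduced can be checked mechanically. The following is a minimal sketch, assuming that every vertex comes with a coordinate vector and that $v \in^I Q$ means that some vertex of *Q* agrees with *v* in all coordinates in *I* (this reading is an assumption taken from the description of intersecting projections). The names `project`, `in_projection`, `are_avoiding`, and `internally_vertex_disjoint` are hypothetical, and colors are 0-based indices here.

```python
from typing import Collection, Dict, Hashable, Sequence, Tuple

Vertex = Hashable
Vectors = Dict[Vertex, Tuple[int, ...]]


def project(vec: Vectors, v: Vertex, colors: Collection[int]) -> Tuple[int, ...]:
    """Restrict the vector of v to the given colors (0-based indices)."""
    return tuple(vec[v][c] for c in sorted(colors))


def in_projection(vec: Vectors, v: Vertex, path: Sequence[Vertex],
                  colors: Collection[int]) -> bool:
    """Assumed reading of 'v in^I path': some path vertex has the same projection."""
    target = project(vec, v, colors)
    return any(project(vec, q, colors) == target for q in path)


def are_avoiding(vec: Vectors, p: Sequence[Vertex], q: Sequence[Vertex],
                 colors: Collection[int]) -> bool:
    """I-avoiding for paths: no inner vertex of one path lies in^I the other."""
    return (not any(in_projection(vec, v, q, colors) for v in p[1:-1])
            and not any(in_projection(vec, v, p, colors) for v in q[1:-1]))


def internally_vertex_disjoint(p: Sequence[Vertex], q: Sequence[Vertex]) -> bool:
    """Neither path contains an inner vertex of the other path."""
    return set(p[1:-1]).isdisjoint(q) and set(q[1:-1]).isdisjoint(p)
```

Under this reading, enlarging the color set can only make the projection test harder to satisfy, which matches the remark that *I*-avoiding implies *I*′-avoiding for every *I*′ ⊇ *I*.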

**Observation 5.15.** *Let P, Q be two avoiding paths. Then P is internally vertex-disjoint from Q.*

Moreover, for each pair of avoiding vertex pairs (*s, t*) and (*u, w*), the shortest *s*-*t*- and *u*-*w*-paths are internally vertex-disjoint.

**Lemma 5.16.** *Let* (*s, t*) *and* (*u, w*) *be two colored pairs of vertices. If the pairs* (*s, t*) *and* (*u, w*) *are avoiding, then each shortest s-t-path is internally disjoint from each shortest u-w-path.*

*Proof.* If (*s, t*) and (*u, w*) are avoiding, then by definition and Lemma 5.2

$$(\overrightarrow{s\_p \diamond t\_p}^{I} \cap \overrightarrow{s\_q \diamond t\_q}^{I}) \subseteq (\overrightarrow{s\_p}^{\bullet I} \diamond \overrightarrow{t\_p}^{\bullet I}) \cap (\overrightarrow{s\_q}^{\bullet I} \diamond \overrightarrow{t\_q}^{\bullet I}) \subseteq \{\overrightarrow{s\_p}^{\bullet I}, \overrightarrow{t\_p}^{\bullet I}\} \cap \{\overrightarrow{s\_q}^{\bullet I}, \overrightarrow{t\_q}^{\bullet I}\},$$

and thus each shortest *s*-*t*-path and each shortest *u*-*w*-path only intersect in {*s, t*} ∩ {*u, w*} and are therefore internally vertex-disjoint.

With the notation of avoiding pairs, we can formulate our goal for this subsection. To this end, fix a solution $\mathcal{P} = (P\_i)\_{i \in [k]}$ for a given instance $(G, (s\_i, t\_i)\_{i \in [k]})$ of *k*-Disjoint Shortest Paths, that is, $P\_i$ is the $s\_i$-$t\_i$-path in the solution. Essentially, we want to partition the paths in $\mathcal{P}$ into subpaths and assign a set Φ of *labels* to each subpath (Φ ⊆ [*k*]) such that the following two conditions are satisfied.


Note that (2.) will be the central argument in our algorithm for *k*-Disjoint Shortest Paths. The algorithm guesses the endpoints of these subpaths and based on (2.) the algorithm can then compute the inner vertices of subpaths with different label sets independently.

Note that for *k* = 2 the partition of $P\_1$ and $P\_2$ along the sets $\mathcal{M}\_P^{b,c}(Q)$ and $\mathcal{M}\_Q^{b,c}(P)$ satisfies the above. Each subpath of $P\_i$, *i* ∈ [2], has label *i*. Moreover, the subpaths between the *α*- and *ω*-vertices have both labels 1 and 2. Hence, (1.) above is satisfied. Furthermore, (2.) follows from Proposition 5.14.

We now generalize this to arbitrary constant *k*. The basic idea behind defining a respective set $\mathcal{C}$ of marbles is depicted in Figure 5.4. Initially, each path $P\_i$ has label *i*. Whenever two paths $P\_i$ and $P\_j$ in the solution intersect in the (*i, j*)-projection (that is, the respective *α*- and *ω*-vertices are not ⊥), then the subpaths $P'\_i$ and $P'\_j$ in the intersection get both labels *i* and *j*. If a third path *P*′ also intersects with $P'\_j$, then we try to use the intersections to move the label *i* via path $P\_j$ to some subpath of *P*′. Generalizing this, we consider for each $\sigma = (\ell\_1, \ell\_2, \dots, \ell\_h)$ whether label $\ell\_1$ could be "transported" from $P\_{\ell\_1}$ to $P\_{\ell\_2}$, from $P\_{\ell\_2}$ to $P\_{\ell\_3}$, and so on until from $P\_{\ell\_{h-1}}$ to $P\_{\ell\_h}$. While the idea of transporting labels would also work with triples (transport label *a* via path $P\_b$ to path $P\_c$), we do not have any bound on the number of resulting subpaths (as for each triple there might be many such subpaths). The reason for using sequences is that we will show that for each $\sigma = (\ell\_1, \ell\_2, \dots, \ell\_h)$ at most one subpath of $P\_{\ell\_h}$ can receive label $\ell\_1$ via *σ*.

**Figure 5.4:** The three main lines represent three paths $P\_1$, $P\_2$, and $P\_3$. The small black rectangles represent marbles on the respective path $P\_i$ and the *j*-colored lines above a path indicate that $P\_i$ and $P\_j$ intersect in the (*i, j*)-projection, that is, they contain vertices with vectors that are identical when projected in the (*i, j*)-plane. The paths $P\_1$ and $P\_2$ intersect in the (1*,* 2)-projection, $P\_2$ and $P\_3$ intersect in the (2*,* 3)-projection, but $P\_1$ and $P\_3$ do not intersect in the (1*,* 3)-projection. The subpaths of $P\_1$ and $P\_3$ where they intersect with $P\_2$ in the (1*,* 2*,* 3)-projection are depicted by $\alpha'\_1, \omega'\_1, \alpha'\_3,$ and $\omega'\_3$. The colors above each subpath (and also the first number therein) represent the labels of the respective subpath and the number (or sequence of numbers) display the sequence that led to the respective marbles (end vertices) of this subpath.

In the following, we use $\operatorname{set}(\tau) := \{\ell\_1, \dots, \ell\_h\}$ to denote the set of all entries in a sequence $\tau = (\ell\_1, \dots, \ell\_h)$. We next define the *crossing set* $\mathcal{C}$ recursively for each Φ ⊆ [*k*]. This should be seen as the set of marbles of a solution. We will then show a result similar to Proposition 5.14 for arbitrary *k* that then allows us to find the desired partition of paths.

**Definition 5.7.** Let $(G, (s\_i, t\_i)\_{i \in [k]})$ be an instance of *k*-Disjoint Shortest Paths and let $\mathcal{P} = (P\_i)\_{i \in [k]}$ be a solution to this instance, that is, $P\_i$ is the path between $s\_i$ and $t\_i$ in the solution. For each Φ ⊆ [*k*] and each permutation $\sigma = (\ell\_1, \dots, \ell\_{|\Phi|})$ of Φ, we define the crossing set $\mathcal{C}^{\sigma}$ and the endpoints $\mathcal{T}(\sigma)$ of intersections as follows.


$$\mathcal{T}(\sigma) := \{ \alpha\_{P\_j}^{i,j}(P\_i), \omega\_{P\_j}^{i,j}(P\_i) \} \text{ and } \mathcal{C}^{\sigma} := \mathcal{M}\_{P\_j}^{i,j}(P\_i) \backslash \{\perp\}.$$

• If |Φ| ≥ 3, then let $\sigma\_{\text{start}} := (\ell\_1, \dots, \ell\_{|\Phi|-1})$ and $\sigma\_{\text{end}} := (\ell\_2, \dots, \ell\_{|\Phi|})$. We denote by *Q* the maximum common subpath of $P\_{\ell\_{|\Phi|-1}}[\mathcal{T}(\sigma\_{\text{start}})]$ and $P\_{\ell\_{|\Phi|-1}}[\mathcal{T}((\ell\_{|\Phi|}, \ell\_{|\Phi|-1}))]$. If $\mathcal{T}(\sigma\_{\text{start}}) = \{\bot\}$, $\mathcal{T}(\sigma\_{\text{end}}) = \{\bot\}$, or $V(Q) = \emptyset$, then let $\mathcal{T}(\sigma) := \mathcal{C}^{\sigma} := \{\bot\}$. Otherwise, let

$$\begin{aligned} P &:= P\_{\ell\_{|\Phi|}}[\mathcal{T}(\sigma\_{\text{end}})], \\ \mathcal{T}(\sigma) &:= \{ \alpha\_P^{\ell\_1,\ell\_{|\Phi|}}(Q), \omega\_P^{\ell\_1,\ell\_{|\Phi|}}(Q) \}, \text{ and} \\ \mathcal{C}^{\sigma} &:= (\mathcal{M}\_P^{\ell\_1,\ell\_{|\Phi|}}(Q) \cup \mathcal{M}\_Q^{\ell\_1,\ell\_{|\Phi|}}(P)) \backslash \{\bot\}. \end{aligned}$$

The set $\mathcal{C} := \bigcup\_{\sigma} \mathcal{C}^{\sigma}$ is the *crossing set* of $\mathcal{P}$.
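Definition 5.7 ranges over all permutations of all subsets Φ ⊆ [*k*]. The following sketch enumerates these sequences (skipping the empty one) and splits a sequence into its two shortened versions used in the recursion. The names `sequences` and `split` are illustrative assumptions, and the marbles contributed by each sequence are not modeled.

```python
from itertools import combinations, permutations
from typing import Iterator, Tuple

Sigma = Tuple[int, ...]


def sequences(k: int) -> Iterator[Sigma]:
    """Yield every permutation of every non-empty subset of {1, ..., k}."""
    for size in range(1, k + 1):
        for subset in combinations(range(1, k + 1), size):
            yield from permutations(subset)


def split(sigma: Sigma) -> Tuple[Sigma, Sigma]:
    """Return (sigma_start, sigma_end): sigma without its last / first entry."""
    return sigma[:-1], sigma[1:]
```

The number of sequences produced for a given *k* is $\sum\_{h=1}^{k} k!/(k-h)!$, which is smaller than $e \cdot k!$, so the union defining the crossing set ranges over fewer than $e \cdot k!$ sets.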

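**Observation 5.17.** *Let* $\sigma := (\ell\_1, \dots, \ell\_{|\Phi|})$ *be any permutation of any* Φ ⊆ [*k*]*. If* $\mathcal{T}(\sigma) \neq \{\bot\}$*, then*

*(i)* $\mathcal{T}(\sigma) \subseteq P\_{\ell\_{|\Phi|}}$ *and*

*(ii)* $\mathcal{T}(\sigma)$ *is c-colored for each c* ∈ Φ*.*

*In particular, crossing sets and endpoints are well-defined.*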

*Proof.* We prove both claims by induction over |Φ|. For |Φ| = 1, note that $\mathcal{T}(\sigma) = \{s\_{\ell\_1}, t\_{\ell\_1}\}$. Clearly $\{s\_{\ell\_1}, t\_{\ell\_1}\} \subseteq P\_{\ell\_1}$ as these are the ends of $P\_{\ell\_1}$ and the pair $\{s\_{\ell\_1}, t\_{\ell\_1}\}$ is by definition $\ell\_1$-colored.

Now assume that both claims hold for all Φ′ with |Φ′| < |Φ|. Since $\mathcal{T}(\sigma) \neq \{\bot\}$, it holds that $\mathcal{T}(\sigma) = \{\alpha\_P^{\ell\_1,\ell\_{|\Phi|}}(Q), \omega\_P^{\ell\_1,\ell\_{|\Phi|}}(Q)\}$, where

$$Q = P\_{\ell\_{|\Phi|-1}}[\mathcal{T}((\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|-1}))] \cap P\_{\ell\_{|\Phi|-1}}[\mathcal{T}((\ell\_{|\Phi|}, \ell\_{|\Phi|-1}))]$$

if |Φ| ≥ 3 and $Q = P\_{\ell\_1}$ if |Φ| = 2. Note that $V(Q) \neq \emptyset$ and hence if |Φ| ≥ 3, then by induction hypothesis $Q \subseteq P\_{\ell\_{|\Phi|-1}}$ and *Q* is *c*-colored for each $c \in \Phi \setminus \{\ell\_{|\Phi|}\}$. If |Φ| = 2, then $Q = P\_{\ell\_1} = P\_{\ell\_{|\Phi|-1}}$ and *Q* is by definition $\ell\_1$-colored. Thus,

$$\mathcal{T}(\sigma) = \{ \alpha\_P^{\ell\_1, \ell\_{|\Phi|}}(Q), \omega\_P^{\ell\_1, \ell\_{|\Phi|}}(Q) \}$$

is well-defined and hence $\mathcal{T}(\sigma) \subseteq P\_{\ell\_{|\Phi|}}$. Moreover, by Observation 5.7, it holds that $\mathcal{T}(\sigma)$ is *c*-colored for each $c \in (\Phi \setminus \{\ell\_{|\Phi|}\}) \cup \{\ell\_{|\Phi|}\} = \Phi$.

Note that Observation 5.17 states that, for each sequence $\sigma := (\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|})$, the set $\mathcal{T}(\sigma)$ describes a pair of vertices in $P\_{\ell\_{|\Phi|}}$. The next lemma states that for any sequence $\sigma' = (\ell\_i, \ell\_{i+1}, \dots, \ell\_{|\Phi|})$ with *i* ≥ 1 it holds that the subpath of $P\_{\ell\_{|\Phi|}}$ between the two vertices in $\mathcal{T}(\sigma)$ is a subpath of the one between the two vertices in $\mathcal{T}(\sigma')$, that is, if we add more entries to the front of *σ*′, then we get smaller and smaller paths.

**Lemma 5.18.** *Let* $\sigma := (\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|})$ *be any permutation of any* Φ ⊆ [*k*] *with* |Φ| ≥ 2*. If* $\mathcal{T}(\sigma) \neq \{\bot\}$*, then* $P\_{\ell\_{|\Phi|}}[\mathcal{T}(\sigma)] \subseteq P\_{\ell\_{|\Phi|}}[\mathcal{T}((\ell\_2, \ell\_3, \dots, \ell\_{|\Phi|}))]$*.*

*Proof.* We prove the statement by a case distinction over |Φ|. If |Φ| = 2, then the statement is trivial as $\mathcal{T}(\sigma) \subseteq P\_{\ell\_{|\Phi|}}$ by Observation 5.17 and by Definition 5.7 $\mathcal{T}((\ell\_{|\Phi|})) = \{s\_{\ell\_{|\Phi|}}, t\_{\ell\_{|\Phi|}}\}$. If |Φ| ≥ 3, then note that $P\_{\ell\_{|\Phi|}}[\mathcal{T}(\sigma)]$ is by definition of *α*- and *ω*-vertices the maximal subpath *P* of $P\_{\ell\_{|\Phi|}}$ such that there is a subpath

$$Q \subseteq P\_{\ell\_{|\Phi|-1}}[\mathcal{T}((\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|-1}))] \text{ with } P =^{\{\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|}\}} Q.$$

Analogously, $P\_{\ell\_{|\Phi|}}[\mathcal{T}((\ell\_2, \ell\_3, \dots, \ell\_{|\Phi|}))]$ is the maximal subpath *P*′ of $P\_{\ell\_{|\Phi|}}$ such that there is a subpath

$$Q' \subseteq P\_{\ell\_{|\Phi|-1}}[\mathcal{T}((\ell\_2, \ell\_3, \dots, \ell\_{|\Phi|-1}))] \text{ with } P' =^{\{\ell\_2, \ell\_3, \dots, \ell\_{|\Phi|}\}} Q'.$$

Since *Q* ⊆ *Q*′ by Definition 5.7, it also holds that *P* ⊆ *P*′.
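Applying Lemma 5.18 repeatedly yields a chain of nested subpaths. The following is only a sketch. It assumes, as read off from Definition 5.7, that $\mathcal{T}(\sigma) \neq \{\bot\}$ forces $\mathcal{T}(\sigma\_{\text{end}}) \neq \{\bot\}$ whenever |Φ| ≥ 3, so that the lemma applies to every suffix of *σ*. Writing *h* := |Φ|, this gives

$$P\_{\ell\_h}[\mathcal{T}((\ell\_1, \dots, \ell\_h))] \subseteq P\_{\ell\_h}[\mathcal{T}((\ell\_2, \dots, \ell\_h))] \subseteq \dots \subseteq P\_{\ell\_h}[\mathcal{T}((\ell\_{h-1}, \ell\_h))] \subseteq P\_{\ell\_h}[\mathcal{T}((\ell\_h))] = P\_{\ell\_h},$$

where the last equality uses that $\mathcal{T}((\ell\_h))$ consists of the two ends of $P\_{\ell\_h}$.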

The next lemma states that when "transporting" the labels via a permutation $\sigma = (\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|})$, then the intersecting subpath *P* in the target path $P\_{\ell\_{|\Phi|}}$ "agrees" in all coordinates in set(*σ*) with the subpath *Q* of $P\_{\ell\_{|\Phi|-1}}$ where the label is transported from, that is, $P =^{\operatorname{set}(\sigma)} Q$.

**Lemma 5.19.** *Let* Φ ⊆ [*k*] *with* |Φ| ≥ 2*. Let* $\sigma := (\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|})$ *be any permutation of* Φ*. If* $\mathcal{T}(\sigma) \neq \{\bot\}$*, then* $P\_{\ell\_{|\Phi|}}[\mathcal{T}(\sigma)] =^{\Phi} Q'$ *for some subpath Q*′ *of* $Q := P\_{\ell\_{|\Phi|-1}}[\mathcal{T}((\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|-1}))] \cap P\_{\ell\_{|\Phi|-1}}[\mathcal{T}((\ell\_{|\Phi|}, \ell\_{|\Phi|-1}))]$*.*

*Proof.* We will again use induction over |Φ| to prove the claim. For |Φ| = 2, the claim follows from Observation 5.7. For |Φ| ≥ 3, let $\sigma\_{\text{start}} := (\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|-1})$ and $\sigma\_{\text{end}} := (\ell\_2, \ell\_3, \dots, \ell\_{|\Phi|})$. By Lemma 5.18, $P\_{\ell\_{|\Phi|}}[\mathcal{T}(\sigma)] \subseteq P\_{\ell\_{|\Phi|}}[\mathcal{T}(\sigma\_{\text{end}})]$ and hence there is by induction hypothesis a subpath

$$R' \subseteq P\_{\ell\_{|\Phi|-1}}[\mathcal{T}(\sigma\_{\text{end}})] \cap P\_{\ell\_{|\Phi|-1}}[\mathcal{T}((\ell\_{|\Phi|}, \ell\_{|\Phi|-1}))]$$

with $P\_{\ell\_{|\Phi|}}[\mathcal{T}(\sigma)] =^{\operatorname{set}(\sigma\_{\text{end}})} R'$. Furthermore, by Definition 5.7 $\mathcal{T}((\ell\_1, \ell\_{|\Phi|})) \neq \bot$ and hence by induction hypothesis there is some subpath

$$Q' \subseteq Q \text{ with } P\_{\ell\_{|\Phi|}}[\mathcal{T}(\sigma)] =^{\ell\_1, \ell\_{|\Phi|}} Q'.$$

Note that $R' =^{\ell\_{|\Phi|}} P\_{\ell\_{|\Phi|}}[\mathcal{T}(\sigma)] =^{\ell\_{|\Phi|}} Q'$ and that *R*′ and *Q*′ are both subpaths of *Q*. Finally, since $Q \subseteq P\_{\ell\_{|\Phi|-1}}[\mathcal{T}((\ell\_{|\Phi|}, \ell\_{|\Phi|-1}))]$ is by Observation 5.17 $\ell\_{|\Phi|}$-colored and $R' =^{\ell\_{|\Phi|}} Q'$, it holds that *R*′ = *Q*′. Thus, $Q' =^{\operatorname{set}(\sigma)} P\_{\ell\_{|\Phi|}}[\mathcal{T}(\sigma)]$, which proves the claim.

In Subsection 5.2.1, we defined marbles, that is, specific vertices of two paths *P, Q* such that when splitting *P* and *Q* at these vertices, for each pair of resulting subpaths *P*′ of *P* and *Q*′ of *Q*, either $P' =^{b,c} Q'$ or *P*′ and *Q*′ are avoiding. In this subsection, we generalized the notion of marbles to more than two paths at the expense of restricting them to solution paths. We conclude this subsection with the notion of *marble paths*, the final link between marbles and crossing sets that will allow us to guess marbles and then compute shortest paths between them almost independently. By that, we mean that we will define labels for each subpath between marbles such that paths with different labels are avoiding and paths with the same labels have a common color. Afterwards, we will show in Section 5.3 how to compute disjoint paths between marble pairs with a common color.

**Definition 5.8.** An *i-marble path T* is a set of vertices such that $\{s\_i, t\_i\} \subseteq T$ and for each *u, v* ∈ *T* the pair (*u, v*) is *i*-colored. A *segment S* of an *i*-marble path *T* is a subset of *T* containing two vertices denoted by start(*S*) and end(*S*) and all vertices *v* ∈ *T* with $\operatorname{start}(S) <^i v <^i \operatorname{end}(S)$. A segment is *minimal* if it contains exactly two vertices, and it is *j-colored* if (start(*S*)*,* end(*S*)) is *j*-colored. A path *P follows S* if *P* is *i*-colored, has end vertices start(*S*) and end(*S*), and *S* ⊆ *V*(*P*). Two segments *S* and *S*′ are *avoiding* if each path *P* that follows *S* and each path *P*′ that follows *S*′ are pairwise avoiding. Two marble paths are *avoiding* if all their segments are pairwise avoiding.
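To make the combinatorial part of Definition 5.8 concrete, the following sketch extracts segments from a marble path. It assumes a hypothetical function `coord` that maps each vertex to its *i*-th coordinate and that these values are pairwise distinct on the marble path, so that they realize the order $<^i$. The names `segment` and `minimal_segments` are illustrative, and the coloring conditions of the definition are not checked.

```python
from typing import Callable, Hashable, Iterable, List, Tuple

Vertex = Hashable


def segment(marble_path: Iterable[Vertex], coord: Callable[[Vertex], int],
            start: Vertex, end: Vertex) -> List[Vertex]:
    """Return start, end, and all marble-path vertices strictly between them."""
    lo, hi = coord(start), coord(end)
    ordered = sorted(marble_path, key=coord)
    return [v for v in ordered if v == start or v == end or lo < coord(v) < hi]


def minimal_segments(marble_path: Iterable[Vertex],
                     coord: Callable[[Vertex], int]) -> List[Tuple[Vertex, Vertex]]:
    """Return all minimal segments, that is, pairs of consecutive vertices."""
    ordered = sorted(marble_path, key=coord)
    return list(zip(ordered, ordered[1:]))
```

Every segment is the union of the minimal segments between its start and end vertex, which is why the arguments about avoiding segments can be restricted to minimal ones.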

Before we state the main result of this section, we will prove a series of lemmata that involve minimal segments of marble paths. The first one states that adding more vertices to avoiding segments still results in avoiding segments.

**Lemma 5.20.** *Let S be a segment of an i-marble path and let U be a segment of a j-marble path such that S and U are avoiding. Let S* ′ ⊇ *S and U* ′ ⊇ *U be two segments with*

$$\text{start}(S') = \text{start}(S), \ \text{end}(S') = \text{end}(S), \ \text{start}(U') = \text{start}(U), \text{ and } \text{end}(U') = \text{end}(U).$$

*The segments S* ′ *and U* ′ *are avoiding.*

*Proof.* Assume towards a contradiction that *S*′ and *U*′ are not avoiding, that is, there are minimal subsegments $S^\*$ of *S*′ and $U^\*$ of *U*′ and paths *P* and *Q* such that *P* follows $S^\*$ and *Q* follows $U^\*$ and *P* and *Q* are not avoiding. Let *S*′′ be the minimal segment in *S* with

$$\text{start}(S'') \le^i \text{start}(S^\*) <^i \text{end}(S^\*) \le^i \text{end}(S'').$$

Note that $S^\* \subseteq S''$. Analogously, let *U*′′ be the minimal segment in *U* with

$$\text{start}(U'') \le^j \text{start}(U^\*) <^j \text{end}(U^\*) \le^j \text{end}(U'').$$

Since *P* follows $S^\*$, it is *i*-colored and contains all vertices in $S^\*$. Hence it contains all vertices in $S'' \subseteq S^\*$ and thus follows *S*′′. Analogously, *Q* follows *U*′′. Hence *S*′′ and *U*′′ are not avoiding and thus *S* and *U* are by definition not avoiding, a contradiction.

The next two lemmata state that segments of marble paths *P* and *Q* defined by vertices in $\mathcal{M}$ are avoiding unless the ends of the segment are between the respective *α*- and *ω*-vertices. The first lemma states that if $\alpha\_P^{a,b}(Q) = \bot$, then the two marble paths are completely avoiding.

**Lemma 5.21.** *Let* $(s\_P, t\_P)$ *be an a-colored pair and let* $(s\_Q, t\_Q)$ *be a b-colored pair. Let P be an a-colored* $s\_P$-$t\_P$*-path and let Q be a b-colored* $s\_Q$-$t\_Q$*-path. If* $\alpha\_P^{a,b}(Q) = \bot$*, then the marble paths* $\mathcal{M}\_P^{a,b}(Q)$ *and* $\mathcal{M}\_Q^{a,b}(P)$ *are avoiding.*

*Proof.* Note that since $\alpha\_P^{a,b}(Q) = \bot$, it follows that $\{v \in P \mid v \in^{a,b} Q\} = \emptyset$. Assume towards a contradiction that there are segments *S* of $\mathcal{M}\_P^{a,b}(Q)$ and *S*′ of $\mathcal{M}\_Q^{a,b}(P)$ that are not avoiding. If *S* and *S*′ are not minimal, then by definition they contain minimal subsegments that are not avoiding. Hence we can assume without loss of generality that *S* and *S*′ are minimal.

Let *P* ′ be an *a*-colored path that follows *S* and let *Q*′ be a *b*-colored path that follows *S* ′ such that *P* ′ and *Q*′ are not avoiding. Let further

$$P'' := P[s\_P, s\_{P'}] \bullet P' \bullet P[t\_{P'}, t\_P] \text{ and } Q'' := Q[s\_Q, s\_{Q'}] \bullet Q' \bullet Q[t\_{Q'}, t\_Q].$$

Note that *P*′′ follows $\mathcal{M}\_P^{a,b}(Q)$ and therefore $\mathcal{M}\_P^{a,b}(Q) \subseteq P''$. Analogously, *Q*′′ follows $\mathcal{M}\_Q^{a,b}(P)$ and hence $\mathcal{M}\_Q^{a,b}(P) \subseteq Q''$. By Proposition 5.14, it holds that

$$\{v \in P^{\prime\prime} \mid v \in ^{a,b} Q^{\prime\prime}\} = \{v \in P \mid v \in ^{a,b} Q\} = \emptyset,$$

that is, *P* ′′ and *Q*′′ are avoiding. Since *P* ′′ and *Q*′′ are avoiding, so are all subpaths of *P* ′′ and *Q*′′. Thus *P* ′ and *Q*′ are avoiding, a contradiction.

The next lemma deals with the case where $\alpha\_P^{a,b}(Q) \neq \bot$. Recall that in this case we only consider segments that do not contain $\alpha\_P^{a,b}(Q)$ or $\omega\_P^{a,b}(Q)$.

**Lemma 5.22.** *Let* $(s\_P, t\_P)$ *be an a-colored pair and let* $(s\_Q, t\_Q)$ *be a b-colored pair. Let P be an a-colored* $s\_P$-$t\_P$*-path and let Q be a b-colored* $s\_Q$-$t\_Q$*-path. If* $\alpha\_P^{a,b}(Q) \neq \bot$*, then let* $S\_1$ *and* $S\_2$ *be segments of the marble path* $\mathcal{M}\_P^{a,b}(Q)$ *with* $\operatorname{start}(S\_1) = s\_P$*,* $\operatorname{end}(S\_1) = \alpha\_P^{a,b}(Q)$*,* $\operatorname{start}(S\_2) = \omega\_P^{a,b}(Q)$*, and* $\operatorname{end}(S\_2) = t\_P$*. Let further S*′ *be a segment of the marble path* $\mathcal{M}\_Q^{a,b}(P)$*. Then* $S\_1$ *and S*′ *are avoiding and so are* $S\_2$ *and S*′*.*

*Proof.* Note that, since $\alpha\_P^{a,b}(Q) \neq \bot$, Observation 5.7 and Lemma 5.10 imply that

$$\{v \in P \mid v \in ^{a,b}Q\} = P[\alpha\_P^{a,b}(Q), \omega\_P^{a,b}(Q)].$$

Assume towards a contradiction that $S\_1$ and *S*′ are not avoiding or $S\_2$ and *S*′ are not avoiding. Then there are paths $P\_1$ that follows $S\_1$, $P\_2$ that follows $S\_2$, and *Q*′ that follows *S*′ such that $P\_1$ and *Q*′ are not avoiding or $P\_2$ and *Q*′ are not avoiding. Hence there are vertices *v* in *Q*′ and *w* in $P\_1$ or $P\_2$ that are inner vertices with $v =^{a,b} w$. Let

$$P^\* := P\_1 \bullet P[\alpha\_P^{a,b}(Q), \omega\_P^{a,b}(Q)] \bullet P\_2 \text{ and } Q^\* := Q[s\_Q, s\_{Q'}] \bullet Q' \bullet Q[t\_{Q'}, t\_Q].$$

Note that $P^\*$ is *a*-colored as each of its subpaths is *a*-colored. Moreover, $P^\*$ follows $\mathcal{M}\_P^{a,b}(Q)$ as it contains $s\_P$, $\alpha\_P^{a,b}(Q)$, $\omega\_P^{a,b}(Q)$, and $t\_P$. Analogously, $Q^\*$ follows $\mathcal{M}\_Q^{a,b}(P)$. Proposition 5.14 then states that

$$\{v \in P^\* \mid v \in^{a,b} Q^\*\} = \{v \in P \mid v \in^{a,b} Q\} = P[\alpha\_P^{a,b}(Q), \omega\_P^{a,b}(Q)].$$

Thus, $w \in P[\alpha\_P^{a,b}(Q), \omega\_P^{a,b}(Q)]$, which is a contradiction to the assumption that *w* is an inner vertex of one of the *a*-colored paths $P\_1$ or $P\_2$.

The final lemma generalizes the two previous ones from $\mathcal{M}$ (comparison of two paths) to $\mathcal{C}$ (sequences of paths). Unfortunately, it contains a lot of rather tedious case distinctions. We remark that solving the respective cases is not particularly difficult or interesting.

**Lemma 5.23.** *Let* $(G, (s\_i, t\_i)\_{i \in [k]})$ *be an instance of k-*Disjoint Shortest Paths*, let* $\mathcal{P} := (P\_i)\_{i \in [k]}$ *be a solution to this instance, and let* Φ ⊆ [*k*]*. Let* $\sigma := (\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|})$ *be a permutation of* Φ*, let* $\sigma\_{\text{start}} := (\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|-1})$*, and let* $\mathcal{C}$ *be the crossing set of* $\mathcal{P}$*. Let* $g := \ell\_1$*,* $i := \ell\_{|\Phi|-1}$*, and* $j := \ell\_{|\Phi|}$*.*

*(i) If* T (*σ*) = {⊥} *and* T (*σ*start) ̸= {⊥}*, then the two marble paths*

$$V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C} \text{ and } V(P\_j) \cap \mathcal{C}$$

*are avoiding.*

*(ii) If* $\mathcal{T}(\sigma) = \{u, v\} \neq \{\bot\}$ *with* $u <^j v$*, then the two marble paths*

$$V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C} \text{ and } V(P\_j[s\_j, u]) \cap \mathcal{C}$$

*are avoiding and so are*

$$V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C} \text{ and } V(P\_j[v, t\_j]) \cap \mathcal{C}.$$

*Proof.* We will prove both claims by induction over |Φ|. *Base case:* Let |Φ| = 2 and hence *g* = *i* and $P\_i[\mathcal{T}(\sigma\_{\text{start}})] = P\_i$.

(i) Since $\mathcal{T}(\sigma) = \{\bot\}$, it holds that $\alpha\_{P\_j}^{i,j}(P\_i) = \bot$. By Definition 5.7, it holds that $\mathcal{M}\_{P\_i}^{i,j}(P\_j) \subseteq V(P\_i) \cap \mathcal{C}$ and $\mathcal{M}\_{P\_j}^{i,j}(P\_i) \subseteq V(P\_j) \cap \mathcal{C}$. By Lemma 5.21, $\mathcal{M}\_{P\_i}^{i,j}(P\_j)$ and $\mathcal{M}\_{P\_j}^{i,j}(P\_i)$ are avoiding. Lemma 5.20 then implies that $V(P\_i) \cap \mathcal{C}$ and $V(P\_j) \cap \mathcal{C}$ are avoiding since

$$\begin{aligned} \text{start}(V(P\_i) \cap \mathcal{C}) &= s\_i = \text{start}(\mathcal{M}\_{P\_i}^{i,j}(P\_j)), \\ \text{end}(V(P\_i) \cap \mathcal{C}) &= t\_i = \text{end}(\mathcal{M}\_{P\_i}^{i,j}(P\_j)), \\ \text{start}(V(P\_j) \cap \mathcal{C}) &= s\_j = \text{start}(\mathcal{M}\_{P\_j}^{i,j}(P\_i)), \text{ and} \\ \text{end}(V(P\_j) \cap \mathcal{C}) &= t\_j = \text{end}(\mathcal{M}\_{P\_j}^{i,j}(P\_i)). \end{aligned}$$

(ii) Since *σ* = (*i, j*), it holds that $u = \alpha\_{P\_j}^{i,j}(P\_i)$ and $v = \omega\_{P\_j}^{i,j}(P\_i)$. By Definition 5.7, it holds that $\mathcal{M}\_{P\_i}^{i,j}(P\_j) \subseteq V(P\_i) \cap \mathcal{C}$ and $\mathcal{M}\_{P\_j}^{i,j}(P\_i) \subseteq V(P\_j) \cap \mathcal{C}$. By Lemma 5.22, $\mathcal{M}\_{P\_i}^{i,j}(P\_j)$ and $\{s\_j, u\}$ are avoiding and so are $\mathcal{M}\_{P\_i}^{i,j}(P\_j)$ and $\{v, t\_j\}$. Thus the claim again follows from Lemma 5.20 and

$$\begin{aligned} \text{start}(V(P\_i) \cap \mathcal{C}) &= s\_i = \text{start}(\mathcal{M}\_{P\_i}^{i,j}(P\_j)), \\ \text{end}(V(P\_i) \cap \mathcal{C}) &= t\_i = \text{end}(\mathcal{M}\_{P\_i}^{i,j}(P\_j)), \\ \text{start}(V(P\_j[s\_j, u]) \cap \mathcal{C}) &= s\_j = \text{start}(\{s\_j, u\}), \\ \text{end}(V(P\_j[s\_j, u]) \cap \mathcal{C}) &= u = \text{end}(\{s\_j, u\}), \\ \text{start}(V(P\_j[v, t\_j]) \cap \mathcal{C}) &= v = \text{start}(\{v, t\_j\}), \text{ and } \\ \text{end}(V(P\_j[v, t\_j]) \cap \mathcal{C}) &= t\_j = \text{end}(\{v, t\_j\}). \end{aligned}$$

*Induction step:* Let |Φ| ≥ 3 and assume that the statement holds for all Φ′ ⊆ [*k*] with 2 ≤ |Φ′| < |Φ|. Let $\sigma\_{\text{end}} := (\ell\_2, \ell\_3, \dots, \ell\_{|\Phi|})$ and $\sigma' := (\ell\_2, \ell\_3, \dots, \ell\_{|\Phi|-1})$.

(i) Since $\mathcal{T}(\sigma) = \{\bot\}$ and $\mathcal{T}(\sigma\_{\text{start}}) \neq \{\bot\}$, by Definition 5.7, there are three possible cases:

$$\begin{aligned} \mathcal{T}(\sigma\_{\text{end}}) &= \{\bot\}, \\ V(Q) = V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap V(P\_i[\mathcal{T}((j, i))]) &= \emptyset, \text{ or} \\ \alpha\_{P\_j[\mathcal{T}(\sigma\_{\text{end}})]}^{g, j}(Q) &= \bot. \end{aligned}$$

We will show that $V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C}$ and $V(P\_j) \cap \mathcal{C}$ are avoiding in each of the three cases.

(1) We start with the case where $\mathcal{T}(\sigma\_{\text{end}}) = \{\bot\}$. Since $\mathcal{T}(\sigma\_{\text{start}}) \neq \{\bot\}$, it holds by Lemma 5.18 that

$$\emptyset \ne V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \subseteq V(P\_i[\mathcal{T}(\sigma')])$$

and in particular, $\mathcal{T}(\sigma') \neq \{\bot\}$. Since $\mathcal{T}(\sigma\_{\text{end}}) = \{\bot\}$, $\mathcal{T}(\sigma') \neq \{\bot\}$, and $\sigma\_{\text{end}}$ is a permutation of a set Φ′ with |Φ′| < |Φ|, the induction hypothesis states that

$$V(P\_i[\mathcal{T}(\sigma')]) \cap \mathcal{C} \text{ and } V(P\_j) \cap \mathcal{C}$$

are avoiding. By definition, each subsegment of $V(P\_i[\mathcal{T}(\sigma')]) \cap \mathcal{C}$ also avoids $V(P\_j) \cap \mathcal{C}$, and since $V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \subseteq V(P\_i[\mathcal{T}(\sigma')])$ and $\mathcal{T}(\sigma\_{\text{start}}) \subseteq \mathcal{C}$, it holds that $V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C}$ is a subsegment of $V(P\_i[\mathcal{T}(\sigma')]) \cap \mathcal{C}$.

(2) We continue with the case where

$$V(Q) := V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap V(P\_i[\mathcal{T}((j, i))]) = \emptyset.$$

We consider the sequence (*j, i*) and the two cases $\mathcal{T}((j, i)) = \{\bot\}$ and $\mathcal{T}((j, i)) \neq \{\bot\}$. Since |{*j, i*}| = 2 < |Φ|, the induction hypothesis states that if $\mathcal{T}((j, i)) = \{\bot\}$, then

*V* (*P<sup>j</sup>* ) ∩ C and *V* (*Pi*) ∩ C are avoiding

and if T ((*j, i*)) ̸= {⊥}, then

$$V(P\_j) \cap \mathcal{C} \text{ avoids both } V(P\_i[s\_i, \alpha\_{P\_i}^{i,j}(P\_j)]) \cap \mathcal{C} \text{ and } V(P\_i[\omega\_{P\_i}^{i,j}(P\_j), t\_i]) \cap \mathcal{C}.$$

In the former case, since $V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C}$ is a segment of $V(P\_i) \cap \mathcal{C}$, it holds by definition that $V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C}$ and $V(P\_j) \cap \mathcal{C}$ are avoiding. In the latter case, since $V(Q) = \emptyset$, it holds that

$$V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \subseteq V(P\_i[s\_i, \alpha\_{P\_i}^{i,j}(P\_j)]) \text{ or}$$

$$V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \subseteq V(P\_i[\omega\_{P\_i}^{i,j}(P\_j), t\_i]).$$

Since the two cases are analogous, we assume without loss of generality the former, that is, $V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \subseteq V(P\_i[s\_i, \alpha\_{P\_i}^{i,j}(P\_j)])$.

Since $V(P\_i[s\_i, \alpha\_{P\_i}^{i,j}(P\_j)]) \cap \mathcal{C}$ and $V(P\_j) \cap \mathcal{C}$ are avoiding and since

$$V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C} \subseteq V(P\_i[s\_i, \alpha\_{P\_i}^{i,j}(P\_j)]) \cap \mathcal{C},$$

by definition $V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C}$ and $V(P\_j) \cap \mathcal{C}$ are also avoiding.

(3) It remains to analyze the case where $\alpha\_{P\_j[\mathcal{T}(\sigma\_{\text{end}})]}^{g,j}(Q) = \bot$. We assume that $\mathcal{T}(\sigma\_{\text{end}}) \neq \{\bot\}$ and $V(Q) \neq \emptyset$ as we can otherwise use the proofs above. Assume towards a contradiction that $V(P\_j) \cap \mathcal{C}$ and $V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C}$ are not avoiding. Then there are minimal segments $S\_i \subseteq V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C}$ and $S\_j \subseteq V(P\_j) \cap \mathcal{C}$ that are not avoiding. We consider the two cases

$$\begin{aligned} S\_i &\subseteq V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap V(P\_i[\mathcal{T}((j,i))]) = V(Q) \text{ and} \\ S\_i &\subseteq V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \backslash V(P\_i[\mathcal{T}((j,i))]). \end{aligned}$$

Note that $\mathcal{T}((j, i)) \subseteq \mathcal{C}$ and that $S\_i$ is minimal and hence this case distinction is complete. In the latter case, note that since $S\_i$ and $S\_j$ are not avoiding, there is a path $R\_i$ that follows $S\_i$ and a path $R\_j$ that follows $S\_j$ such that $\{v \in R\_i \mid v \in^{i,j} R\_j\} \backslash S\_i \neq \emptyset$. Then, it holds by Observation 5.7 that $\{v \in P\_i \mid v \in^{i,j} P\_j\} \subseteq V(P\_i[\mathcal{T}((j, i))])$. Moreover, by Proposition 5.14, $\{v \in R\_i \mid v \in^{i,j} R\_j\} \subseteq^{i,j} V(P\_i[\mathcal{T}((j, i))])$. Hence

$$\{v \in R\_i \mid v \in ^{i,j} R\_j\} \subseteq ^{a,b} S\_i,$$

a contradiction.

If *S<sup>i</sup>* ⊆ *V* (*P<sup>i</sup>* [T (*σ*start)])∩*V* (*P<sup>i</sup>* [T ((*j, i*))]) = *V* (*Q*), then we distinguish between the two cases

$$S\_j \subseteq V(P\_j[\mathcal{T}(\sigma\_{\text{end}})]) \cap \mathcal{C} \text{ and } S\_j \text{ } \backslash V(P\_j[\mathcal{T}(\sigma\_{\text{end}})]) \neq \emptyset.$$

In the former case, it holds by Lemma 5.21 that $\mathcal{M}\_{P\_j[\mathcal{T}(\sigma\_{\text{end}})]}^{g,j}(Q)$ and $\mathcal{M}\_Q^{g,j}(P\_j[\mathcal{T}(\sigma\_{\text{end}})])$ are avoiding. Since

$$\begin{aligned} \mathcal{M}\_{P\_j[\mathcal{T}(\sigma\_{\text{end}})]}^{g,j}(Q) &\subseteq V(P\_j[\mathcal{T}(\sigma\_{\text{end}})]) \cap \mathcal{C} \text{ and} \\ \mathcal{M}\_Q^{g,j}(P\_j[\mathcal{T}(\sigma\_{\text{end}})]) &\subseteq V(Q) \cap \mathcal{C}, \end{aligned}$$

it holds that *S<sup>i</sup>* and *S<sup>j</sup>* are avoiding, a contradiction.

Finally, it remains to analyze the case where *S<sup>j</sup>* \ *V* (*P<sup>j</sup>* [T (*σ*end)]) ̸= ∅. Since T (*σ*start) ̸= {⊥} it holds by Lemma 5.18 that

$$\emptyset \ne V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \subseteq V(P\_i[\mathcal{T}(\sigma')])$$

and in particular, T (*σ* ′ ) ̸= {⊥}. Since by assumption T (*σ*end) ̸= {⊥}, the induction hypothesis states that

$$V(P\_i[\mathcal{T}(\sigma')]) \cap \mathcal{C} \text{ and } V(P\_j[s\_j, \text{start}(\mathcal{T}(\sigma\_{\text{end}}))]) \cap \mathcal{C}$$

are avoiding and so are

$$V(P\_i[\mathcal{T}(\sigma')]) \cap \mathcal{C} \text{ and } V(P\_j[\text{end}(\mathcal{T}(\sigma\_{\text{end}})), t\_j]) \cap \mathcal{C}.$$

Since, by Lemma 5.18,

$$S\_i \subseteq V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C} \subseteq V(P\_i[\mathcal{T}(\sigma')]) \cap \mathcal{C}$$

and since

$$\begin{aligned} S\_j &\subseteq V(P\_j[s\_j, \text{start}(\mathcal{T}(\sigma\_{\text{end}}))]) \cap \mathcal{C} \text{ or} \\ S\_j &\subseteq V(P\_j[\text{end}(\mathcal{T}(\sigma\_{\text{end}})), t\_j]) \cap \mathcal{C} \end{aligned}$$

it follows that *S<sup>i</sup>* and *S<sup>j</sup>* are avoiding, a contradiction.

(ii) In this case it holds that T (*σ*) = {*u, v*} ̸= {⊥} with *u <<sup>j</sup> v* and it remains to show that

$$V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C} \text{ and } V(P\_j[s\_j, u]) \cap \mathcal{C}$$

are avoiding and so are

$$V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C} \text{ and } V(P\_j[v, t\_j]) \cap \mathcal{C}.$$

Since both cases are analogous, we will only show that *V* (*P<sup>i</sup>* [T (*σ*start)]) ∩ C and *V* (*P<sup>j</sup>* [*s<sup>j</sup> , u*]) ∩ C are avoiding. To this end, assume towards a contradiction that there are minimal segments

$$S\_i \subseteq V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C} \text{ and } S\_j \subseteq V(P\_j[s\_j, u]) \cap \mathcal{C}$$

that are not avoiding.

We consider the two cases $S\_i \backslash V(P\_i[\mathcal{T}((j, i))]) \neq \emptyset$ and $S\_i \subseteq V(P\_i[\mathcal{T}((j, i))])$. In the former case, note that if $\{w, x\} := \mathcal{T}((j, i)) \neq \{\bot\}$ with $w <^i x$, then it holds by Lemma 5.22 that $V(P\_i[s\_i, w]) \cap \mathcal{C}$ and $V(P\_j) \cap \mathcal{C}$ are avoiding. If $\mathcal{T}((j, i)) = \{\bot\}$, then it holds by Lemma 5.21 that $V(P\_i) \cap \mathcal{C}$ and $V(P\_j) \cap \mathcal{C}$ are avoiding. Hence in both cases $S\_i$ and $S\_j$ are avoiding, a contradiction.

Now assume that *S<sup>i</sup>* ⊆ *V* (*P<sup>i</sup>* [T ((*j, i*))]). Since *S<sup>i</sup>* ⊆ *V* (*P<sup>i</sup>* [T (*σ*start)]), it holds by Lemma 5.18 that ∅ ̸= *V* (*P<sup>i</sup>* [T (*σ*start)]) ⊆ *V* (*P<sup>i</sup>* [T (*σ* ′ )]) and that

$$S\_i \subseteq V(Q) = V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap V(P\_i[\mathcal{T}((j, i))]).$$

Note that if in this case T (*σ*end) = {⊥}, then by induction hypothesis *V* (*P<sup>i</sup>* [T (*σ* ′ )]) ∩ C and *V* (*P<sup>j</sup>* ) ∩ C are avoiding, and thus so are

$$V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C} \supseteq S\_i \text{ and } V(P\_j) \cap \mathcal{C} \supseteq S\_j,$$

a contradiction.

It remains to analyze the case where $\{y, z\} := \mathcal{T}(\sigma\_{\text{end}}) \neq \{\bot\}$. We resolve this case with a final case distinction:

$$S\_j \backslash V(P\_j[\mathcal{T}(\sigma\_{\text{end}})]) \neq \emptyset \text{ or } S\_j \subseteq V(P\_j[\mathcal{T}(\sigma\_{\text{end}})]).$$

In the former case *S<sup>j</sup>* ⊆ *V* (*P<sup>j</sup>* [*s<sup>j</sup> , y*]) or *S<sup>j</sup>* ⊆ *V* (*P<sup>j</sup>* [*z, t<sup>j</sup>* ]). By Lemma 5.18, it holds that *S<sup>i</sup>* ⊆ *V* (*P<sup>i</sup>* [T (*σ*start)]) ⊆ *V* (*P<sup>i</sup>* [T (*σ* ′ )]). Since by Lemma 5.22

$$\begin{aligned} V(P\_j[s\_j, y]) \cap \mathcal{C} &\text{ and } V(P\_i[\mathcal{T}(\sigma')]) \cap \mathcal{C} \text{ as well as} \\ V(P\_j[z, t\_j]) \cap \mathcal{C} &\text{ and } V(P\_i[\mathcal{T}(\sigma')]) \cap \mathcal{C} \end{aligned}$$

are avoiding, we conclude that *S<sup>i</sup>* and *S<sup>j</sup>* are avoiding, a contradiction.

Finally, if *S<sup>j</sup>* ⊆ *V* (*P<sup>j</sup>* [T (*σ*end)]), then it holds by induction hypothesis and Lemma 5.18 that

$$S\_i \subseteq V(P\_i[\mathcal{T}(\sigma\_{\text{start}})]) \cap \mathcal{C} \subseteq V(P\_i[\mathcal{T}(\sigma')]) \cap \mathcal{C} \text{ and } S\_j \subseteq V(P\_j[\mathcal{T}(\sigma\_{\text{end}})]) \cap \mathcal{C}$$

are avoiding, a contradiction.

We conclude this section with the definition of labels of segments and the proof that they guarantee that paths following two segments either have a common color or are avoiding. To this end, let $S = \{u, v\}$ be a segment of an $i$-marble path with $u <^i v$. The set of *labels* of $S$, denoted labels[$S$], is defined as

$$\{a \mid \exists \sigma := (\ell\_1 = a, \ell\_2, \dots, \ell\_{|\sigma|} = i). \ \{\alpha, \omega\} := \mathcal{T}(\sigma) \neq \{\bot\} \land \alpha \le^i u <^i v \le^i \omega\}.$$
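To make the definition concrete, the following minimal Python sketch computes the labels of a segment from a table of guessed ends. The data layout is an assumption made only for illustration: `path_order` maps each vertex of $P\_i$ to its position on the path (encoding $\leq^i$), and `ends` maps a sequence $\sigma$ (a tuple of colors) to the pair $\mathcal{T}(\sigma)$ or to `None` for $\{\bot\}$.

```python
def labels(segment, path_order, ends, i):
    """Return the set of labels of a segment {u, v} of an i-marble path.

    Hypothetical data layout (not from the text):
      path_order: dict vertex -> position along P_i
      ends:       dict sequence (tuple of colors) -> (alpha, omega) or None
    """
    u, v = sorted(segment, key=path_order.__getitem__)
    result = set()
    for sigma, pair in ends.items():
        # only sequences that end in color i and have non-trivial ends count
        if sigma[-1] != i or pair is None:
            continue
        alpha, omega = sorted(pair, key=path_order.__getitem__)
        # condition alpha <=^i u <^i v <=^i omega from the definition
        if path_order[alpha] <= path_order[u] < path_order[v] <= path_order[omega]:
            result.add(sigma[0])  # the label is the first entry of sigma
    return result
```

For example, with `path_order = {'a': 0, 'b': 1, 'c': 2, 'd': 3}` and ends for the sequences `(1, 3)` and `(2, 3)` covering `a..c` and `b..d`, the segment `{b, c}` receives both labels, whereas `{c, d}` only receives label 2.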

**Proposition 5.24.** *Let* (*G,*(*s<sup>i</sup> , ti*)*i*∈[*k*]) *be an instance of k-*Disjoint Shortest Paths *and let* P = (*Pi*)*i*∈[*k*] *be a solution to this instance. Let i, j* ∈ [*k*] *and let T<sup>i</sup>* = *V* (*Pi*) ∩ C *be an i-marble path and T<sup>j</sup>* ⊆ *V* (*P<sup>j</sup>* ) ∩ C *be a j-marble path. Let S<sup>i</sup>* ⊆ *T<sup>i</sup> and S<sup>j</sup>* ⊆ *T<sup>j</sup> be two minimal segments. If* labels[*S<sup>i</sup>* ] ̸= labels[*S<sup>j</sup>* ]*, then S<sup>i</sup> and S<sup>j</sup> are avoiding.*

*Proof.* We start with the case where $i \notin \text{labels}[S\_j]$. Then either $\mathcal{T}((i, j)) = \{\bot\}$ or $S\_j \cap T\_j[u, v] = \emptyset$, where $\{u, v\} := \mathcal{T}((i, j)) \neq \{\bot\}$. In both cases, $S\_j$ and $S\_i$ are avoiding by Lemma 5.23. The case where $j \notin \text{labels}[S\_i]$ is analogous.

It remains to consider the case where

$$i, j \in \text{labels}[S\_i] \cap \text{labels}[S\_j].$$

Assume without loss of generality that there is some $d \in \text{labels}[S\_i] \backslash \text{labels}[S\_j]$. By definition of labels, there is a set $\Phi = \{\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|}\}$ and a permutation $\sigma = (\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|})$ of $\Phi$ such that $\ell\_1 = d$, $\ell\_{|\Phi|} = i$, $\mathcal{T}(\sigma) = \{\alpha, \omega\} \neq \{\bot\}$, and $S\_i \subseteq P\_i[\alpha, \omega] \cap \mathcal{C}$. We consider the two cases $j \notin \Phi$ and $j \in \Phi$.

If $j \notin \Phi$, then let $\sigma' := (d = \ell\_1, \ell\_2, \dots, \ell\_{|\Phi|} = i, j)$ and we distinguish between the two cases $\mathcal{T}(\sigma') \neq \{\bot\}$ and $\mathcal{T}(\sigma') = \{\bot\}$. If $\mathcal{T}(\sigma') \neq \{\bot\}$, then by definition of labels and since $d \notin \text{labels}[S\_j]$, it holds that $S\_j \backslash V(P\_j[\mathcal{T}(\sigma')]) \neq \emptyset$. Lemma 5.23 states that $S\_j$ and each minimal segment of $V(P\_i[\mathcal{T}(\sigma)]) \cap \mathcal{C}$ are avoiding. Hence, $S\_i$ and $S\_j$ are avoiding as $S\_i \subseteq V(P\_i[\mathcal{T}(\sigma)]) \cap \mathcal{C}$. If $\mathcal{T}(\sigma') = \{\bot\}$, then, by Lemma 5.23, it holds that $V(P\_i[\mathcal{T}(\sigma)]) \cap \mathcal{C}$ and $T\_j := V(P\_j) \cap \mathcal{C}$ are avoiding. Thus by definition $S\_i$ and $S\_j$ are avoiding.

It remains to consider the case where $j \in \Phi = \{\ell\_1, \ell\_2, \dots, \ell\_{|\Phi|}\}$. In this case let $x \in [2, |\Phi| - 1]$ such that $j = \ell\_x$ and let $\sigma\_h := (d = \ell\_1, \ell\_2, \dots, \ell\_h)$ for all $h \in [x, |\Phi|]$ (that is, $x \leq h \leq |\Phi|$). Note that $\sigma\_x = (d = \ell\_1, \ell\_2, \dots, \ell\_x = j)$ and $\sigma\_{|\Phi|} = \sigma$. Since $S\_i \subseteq V(P\_i[\mathcal{T}(\sigma)]) \cap \mathcal{C}$, it follows that $\mathcal{T}(\sigma) \neq \{\bot\}$ and, by definition of $\mathcal{T}$, it holds for each $h \in [x, |\Phi|]$ that $\mathcal{T}(\sigma\_h) \neq \{\bot\}$. Lemma 5.19 then states that for each $h \in [x, |\Phi|]$ and each subpath $Q\_{|\Phi|}$ of $P\_i[\mathcal{T}(\sigma\_{|\Phi|})] = P\_i[\mathcal{T}(\sigma)]$ there is some subpath $Q\_h$ of $P\_{\ell\_h}[\mathcal{T}(\sigma\_h)]$ such that $Q\_h =^{\text{set}(\sigma\_h)} Q\_{h-1}$. Let $Q\_{|\Phi|}$ be such a path with $\{s\_{Q\_{|\Phi|}}, t\_{Q\_{|\Phi|}}\} = S\_i$. Thus,

$$Q\_{|\Phi|} =^j Q\_{|\Phi|-1} =^j \dots =^j Q\_x$$

and in particular $\{\text{start}(S\_i), \text{end}(S\_i)\} \subseteq^j V(P\_j[\mathcal{T}(\sigma\_x)])$.

Since *S<sup>i</sup>* ⊆ *V* (*P<sup>i</sup>* [T (*σ*)]) ∩ C, it holds by Lemma 5.18 that

$$S\_i \subseteq V(P\_i[\mathcal{T}(\sigma)]) \cap \mathcal{C} \subseteq V(P\_i[\mathcal{T}((\ell\_x, \ell\_{x+1}, \dots, \ell\_{|\Phi|}))]) \cap \mathcal{C}.$$

Hence, it holds that $\{\text{start}(S\_i), \text{end}(S\_i)\}$ is $j$-colored and thus it holds for each path $Q\_i$ that follows $S\_i$ that $Q\_i \subseteq^j V(P\_j[\mathcal{T}(\sigma\_x)])$. Since $d \in \text{labels}[S'\_j]$ for each $S'\_j \subseteq V(P\_j[\mathcal{T}(\sigma\_x)]) \cap \mathcal{C}$, $d \notin \text{labels}[S\_j]$, and $S\_j$ is minimal, it follows that

$$S\_j \cap (V(P\_j[\mathcal{T}(\sigma\_x)]) \cap \mathcal{C}) \subseteq \{\text{start}(S\_i), \text{end}(S\_i)\} \cap \mathcal{T}(\sigma\_x).$$

Moreover, since $P\_j$ is strictly increasing in the $j$th coordinate, it follows for each path $Q\_i$ that follows $S\_i$ and each path $Q\_j$ that follows $S\_j$ that

$$\overline{V(Q\_i)}^j \cap \overline{V(Q\_j)}^j \subseteq \overline{V(Q\_i)}^j \cap \overline{V(P\_j[\mathcal{T}(\sigma\_x)])}^j \subseteq (\{\overline{s\_{Q\_i}}, \overline{t\_{Q\_i}}\} \cap \{\overline{s\_{Q\_j}}, \overline{t\_{Q\_j}}\})^j.$$

Hence, each such pair of paths is avoiding (no two inner vertices share the same vector) and thus it holds by definition that $S\_i$ and $S\_j$ are avoiding.

# **5.3 An** *XP***-Algorithm for** *k***-Disjoint Shortest Paths**

In this section, we present our main theorem, that is, an *XP*-algorithm for $k$-Disjoint Shortest Paths with respect to the number $k$ of terminal pairs. In a nutshell, we first guess all marble paths $T\_i$ and the respective ends $\mathcal{T}$ corresponding to the crossing set $\mathcal{C}$ of some solution (if one exists). We then compute all minimal segments of each marble path $T\_i$, compute their respective labels, and partition the segments such that all minimal segments in the same part of the partition are strictly monotone in a common coordinate and two minimal segments in distinct parts of the partition are avoiding. The crucial improvement over the algorithm by Lochet [Loc21] is that our partition is much smaller. Afterwards, we find via dynamic programming, for all segments in one part of the partition, disjoint paths that follow the respective segments.

To this end, we introduce $c$-layered DAGs and the problem $p$-Disjoint Paths on $c$-layered DAGs. For a graph $G$ with vectors $\vec{v}$ for all $v \in V$ (as defined in Subsection 5.2.1), the *$c$-layered DAG* $D\_c$ of $G$ is the directed graph $D\_c = (V, A)$, where $A = \{(x, y) \mid \{x, y\} \in E(G) \land \vec{y}^c - \vec{x}^c = 1\}$. Notice that a path $P = (v\_1, v\_2, \dots, v\_p)$ is $c$-colored if and only if $\overrightarrow{v\_{i+1}}^c - \overrightarrow{v\_i}^c = 1$ for all $i \in [p - 1]$ or $\overrightarrow{v\_i}^c - \overrightarrow{v\_{i+1}}^c = 1$ for all $i \in [p - 1]$. Let $P\_m = (v\_p, v\_{p-1}, \dots, v\_1)$ be the mirrored path of $P$. Then, $P$ is $c$-colored if and only if $P\_m$ is and hence if and only if the directed path $(V(P), A(P))$ or the directed path $(V(P), A(P\_m))$ is a path in $D\_c$. Finally, observe that $A(P\_m) = A^{-1}(P)$, that is, $P\_m$ and $P$ have the same vertices but the edges are oppositely directed.

**Observation 5.25.** *A path $P$ in $G$ is $c$-colored if and only if $(V(P), A(P))$ or $(V(P), A^{-1}(P))$ is a path in the $c$-layered DAG $D\_c$ of $G$.*
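The construction of the $c$-layered DAG and the test from Observation 5.25 can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that the $c$-th coordinate of a vertex is its breadth-first-search distance from a fixed source in a connected graph; the actual vectors are the ones defined in Subsection 5.2.1. The function names and the adjacency-dictionary format are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Distances from source in an unweighted, connected graph (adjacency dict)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def layered_dag(adj, coord):
    """Arc set A of D_c: one arc (x, y) per edge {x, y} with coord[y] - coord[x] == 1."""
    return {(x, y) for x in adj for y in adj[x] if coord[y] - coord[x] == 1}


def is_c_colored(path, arcs):
    """A path is c-colored iff it, or its mirrored path, is a directed path in D_c."""
    steps = list(zip(path, path[1:]))
    forward = all((a, b) in arcs for a, b in steps)
    backward = all((b, a) in arcs for a, b in steps)
    return forward or backward
```

Edges between two vertices with the same coordinate produce no arc at all, which is why a path using such an edge is never $c$-colored.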

We continue with a definition of $p$-Disjoint Paths on $c$-layered DAGs. Here, we are given a $c$-layered DAG $D\_c$ and a list $(s\_i, t\_i)\_{i \in [p]}$ of (possibly intersecting) terminal pairs. We then ask whether there are pairwise internally vertex-disjoint $s\_i$-$t\_i$-paths in $D\_c$. Formally, it is defined as follows.

*p*-Disjoint Paths on *c*-layered DAGs

**Input:** A $c$-layered DAG $D\_c$ and $p$ pairs $(s\_i, t\_i)\_{i \in [p]}$ of vertices.

**Question:** Are there $p$ internally vertex-disjoint paths $P\_i$ in $D\_c$ such that $P\_i$ is a shortest $s\_i$-$t\_i$-path for each $i \in [p]$?

With these definitions, we can state our algorithm. Algorithm 5.1 provides pseudo-code.

**Algorithm 5.1:** Our algorithm for $k$-Disjoint Shortest Paths.

**1** **function** solve($G, (s\_i, t\_i)\_{i \in [k]}$)

**2** &emsp;**foreach** guess $(T\_i)\_{i \in [k]}$, Ends of the crossing set **do**

&emsp;&emsp;/\* We assume subsequently that the guesses are correct, that is, if there is a solution $\mathcal{P} := (P\_i)\_{i \in [k]}$, then $T\_i = V(P\_i) \cap \mathcal{C}$ for all $i \in [k]$ and Ends $= \mathcal{T}$. \*/

**3** &emsp;&emsp;**foreach** $i \in [k]$ **do**

**4** &emsp;&emsp;&emsp;$\mathcal{P}\_i \leftarrow \emptyset$ // $\mathcal{P}\_i$ contains all segments corresponding to $D\_i$

**5** &emsp;&emsp;**foreach** minimal segment $S$ of some $T\_i$ with $i \in [k]$ **do**

**6** &emsp;&emsp;&emsp;marks[$S$] $\leftarrow \emptyset$

**7** &emsp;&emsp;&emsp;**foreach** permutation $\sigma = (\ell\_1, \ell\_2, \dots, i)$ with Ends($\sigma$) $= \{\alpha, \omega\} \neq \{\bot\}$ and $\alpha \leq^i \text{start}(S) <^i \text{end}(S) \leq^i \omega$ **do**

**8** &emsp;&emsp;&emsp;&emsp;marks[$S$] $\leftarrow$ marks[$S$] $\cup$ set($\sigma$)

**9** &emsp;&emsp;&emsp;$j \leftarrow \min$ marks[$S$]

**10** &emsp;&emsp;&emsp;$x \leftarrow \arg\min \{\vec{v}^j \mid v \in \{\text{start}(S), \text{end}(S)\}\}$

**11** &emsp;&emsp;&emsp;$y \leftarrow \arg\max \{\vec{v}^j \mid v \in \{\text{start}(S), \text{end}(S)\}\}$

**12** &emsp;&emsp;&emsp;$\mathcal{P}\_j \leftarrow \mathcal{P}\_j \cup \{(x, y)\}$

**13** &emsp;&emsp;**foreach** $j \in [k]$ **do**

**14** &emsp;&emsp;&emsp;Order $\mathcal{P}\_j = ((x\_1, y\_1), (x\_2, y\_2), \dots)$ such that $\overrightarrow{x\_1}^j \leq \overrightarrow{x\_2}^j \leq \dots$

**15** &emsp;&emsp;**if** all instances $(D\_i, \mathcal{P}\_i)$ of $|\mathcal{P}\_i|$-Disjoint Paths on $i$-layered DAGs are yes-instances and the combined solutions form a solution of $k$-Disjoint Shortest Paths **then**

**16** &emsp;&emsp;&emsp;**return** true

**17** &emsp;**return** false
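Lines 9 to 14 of Algorithm 5.1, which distribute the minimal segments over the $k$ layered DAGs, can be sketched as follows. This is a minimal illustration under assumed data structures, not the thesis implementation: `marks` maps each segment to its non-empty set of marks, and `coord[j][v]` is the $j$-th coordinate of vertex `v`; both names are hypothetical.

```python
def partition_segments(segments, marks, coord):
    """Assign every minimal segment to the part of its smallest mark.

    segments: list of (start, end) vertex pairs
    marks:    dict segment -> non-empty set of colors (Lines 6 to 8)
    coord:    dict color j -> dict vertex -> j-th coordinate
    Returns a dict j -> list of terminal pairs (x, y), sorted as in Line 14.
    """
    parts = {}
    for seg in segments:
        j = min(marks[seg])                              # Line 9
        x, y = sorted(seg, key=lambda v: coord[j][v])    # Lines 10 and 11
        parts.setdefault(j, []).append((x, y))           # Line 12
    for j, pairs in parts.items():
        pairs.sort(key=lambda pair: coord[j][pair[0]])   # Line 14
    return parts
```

Orienting each pair so that `x` has the smaller $j$-th coordinate ensures that the corresponding path runs forward in the $j$-layered DAG. The sketch uses a comparison sort for brevity, whereas the running-time analysis relies on bucket sort.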

Fortune et al. [FHW80] showed that $p$-Disjoint Paths on DAGs can be solved in $n^{O(p)}$ time. Since $c$-layered DAGs are DAGs, we could use their algorithm in Algorithm 5.1 and achieve a running time of $n^{O(k!)}$. However, to drop the Landau notation in the exponent, we show that $p$-Disjoint Paths on $c$-layered DAGs can be solved in $O(n^{p+1})$ time. Afterwards, we show that Algorithm 5.1 is correct and runs in $O(k \cdot n^{16k \cdot k! + k + 1})$ time.

The idea behind the dynamic program for $p$-Disjoint Paths on $c$-layered DAGs is as follows. Given an $i$-layered DAG $D\_i$, a number $p$, and a set of terminal pairs $(s\_j, t\_j), j \in [p]$, where $\overrightarrow{s\_j}^i < \overrightarrow{t\_j}^i$ and $\overrightarrow{s\_j}^i \leq \overrightarrow{s\_\ell}^i$ for all $j < \ell \in [p]$, the dynamic program is a table $T[x\_1, x\_2, \dots, x\_p] \in \{\text{true}, \text{false}\}$ that stores true roughly if the following two criteria are fulfilled.


Thus, $T[t\_1, t\_2, \dots, t\_p] = \text{true}$ if and only if there is a set of internally vertex-disjoint $s\_j$-$t\_j$-paths.

Note that if $s\_i =^c s\_j$ and $t\_i =^c t\_j$ for all $i, j \in [p]$, then there is a fairly straightforward dynamic program for $p$-Disjoint Paths on $c$-layered DAGs. Store for increasing values of $d \in [\overrightarrow{s\_i}^c, \overrightarrow{t\_i}^c]$ and for each tuple $(x\_1, x\_2, \dots, x\_p)$ of vertices with $\overrightarrow{x\_i}^c = d$ for all $i \in [p]$ whether there are pairwise disjoint paths from $s\_i$ to $x\_i$ (the paths may possibly share their end vertices $s\_i$ and/or $t\_i$ if $x\_i = t\_i$). The table corresponding to this dynamic program has $O(n \cdot n^p)$ table entries (at most $n$ values for $d$ and for each $d$ there are at most $n^p$ sequences of $p$ vertices). Each table entry can be computed in $O(n^p)$ time by iterating over all table entries for $d - 1$ and $(x'\_1, x'\_2, \dots, x'\_p)$ and checking whether $(x'\_1, x\_1), (x'\_2, x\_2), \dots, (x'\_p, x\_p) \in A$. This would lead to an overall running time of $O(n^{2p+1})$ for $p$-Disjoint Paths on $c$-layered DAGs. Note further that ensuring that $s\_i =^c s\_j$ and $t\_i =^c t\_j$ is not difficult either. One can simply replace each $s\_i$ and $t\_i$ with new terminals and add paths of appropriate lengths between the new and the old terminal vertices. An example of this roughly outlined construction is given in Figure 5.5. However, there is another dynamic program that is faster ($O(n^{p+1})$ time instead of $O(n^{2p+1})$ time) and that also works for general DAGs. Basically, instead of moving all $x\_i$ from one layer to the next in one step, we order them and move the $x\_i$ that is first in this ordering. This has the advantage that for computing one table entry, we only have to consider $O(n)$ table entries instead of $O(n^p)$.
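The straightforward layer-by-layer dynamic program from the beginning of this paragraph can be sketched in Python as follows. The sketch assumes that all sources lie in one layer and all targets in one layer, and it keeps only the set of tuples that store true for the current layer; the input format (`arcs` as a set of pairs, `layer` as a dictionary of $c$-coordinates) is a hypothetical choice for illustration.

```python
from itertools import product


def layered_disjoint_paths(arcs, layer, sources, targets):
    """Decide p-Disjoint Paths on a c-layered DAG, moving all heads one layer per step.

    Assumes all sources share one layer and all targets share one layer.
    """
    p = len(sources)
    first, last = layer[sources[0]], layer[targets[0]]
    by_layer = {}
    for v, d in layer.items():
        by_layer.setdefault(d, []).append(v)
    table = {tuple(sources)}  # tuples storing true for the previous layer
    for d in range(first + 1, last + 1):
        if d == last:
            # in the last layer the heads must be the targets (which may coincide)
            candidates = [tuple(targets)]
        else:
            # inner vertices of different paths must be pairwise distinct
            candidates = [t for t in product(by_layer.get(d, []), repeat=p)
                          if len(set(t)) == p]
        table = {
            tup for tup in candidates
            if any(all((a, b) in arcs for a, b in zip(prev, tup)) for prev in table)
        }
    return tuple(targets) in table
```

Because layers are pairwise disjoint, requiring distinct heads within every inner layer is enough to guarantee internally vertex-disjoint paths. The nested iteration over candidate tuples and previous tuples mirrors the $O(n^p)$ cost per table entry discussed above.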

**Lemma 5.26.** *An instance of $p$-*Disjoint Paths on DAGs *on a graph with $n$ vertices can be solved in $O(n^{p+1})$ time.*

*Proof.* Let $D = (V, A)$ be a DAG and let $(s\_i, t\_i)\_{i \in [p]}$ be a set of $p$ terminal pairs. We define $V^{\text{end}} := \bigcup\_{i \in [p]} \{s\_i, t\_i\}$ to be the set of all terminals. We also choose an arbitrary topological order of $D$ and denote by $u \prec v$ that $u$ comes before $v$ in this topological order. We assume without loss of generality that $s\_i \preceq s\_j$ for all $i < j \in [p]$. We further assume that $s\_i \preceq t\_i$ for all $i \in [p]$ as otherwise there can be no path from $s\_i$ to $t\_i$ and that $p \leq n$ as we can iterate over all pairs $(s\_i, t\_i)$ and delete those that are connected by an arc $(s\_i, t\_i) \in A$. All remaining paths have at least one inner vertex that has to be from $V \backslash V^{\text{end}}$ and that has to be unique for each path. Hence, if there are at least $n + 1$ pairs remaining, then the instance has no solution.

**Figure 5.5:** *Left-hand side:* An example of 3-Disjoint Paths on $c$-layered DAGs. The $c$-coordinate of vertices is illustrated by their horizontal position. The terminal pairs are $(s\_1, t\_1)$, $(s\_2, t\_2)$, and $(s\_3, t\_3)$ and $s\_2 = s\_3$. A solution is highlighted.

*Right-hand side:* An equivalent instance in which $s\_i =^c s\_j$ for all $i, j \in [p]$. The smaller vertices are the vertices that are newly introduced by the construction. The corresponding solution is again highlighted. Note that since $s\_2 = s\_3$, after adding $s'\_2$ and $s'\_3$, the solution is not internally vertex-disjoint if $s\_2$ was not duplicated.

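We build a table $T[x\_1, x\_2, \dots, x\_p] \in \{\text{true}, \text{false}\}$ that stores true if and only if the following three criteria are fulfilled.

i) $x\_i \in V \backslash (V^{\text{end}} \backslash \{s\_i, t\_i\})$ for all $i \in [p]$,

ii) $s\_i \preceq x\_i \preceq t\_i$ for all $i \in [p]$, and

iii) there are internally vertex-disjoint $s\_i$-$x\_i$-paths for all $i \in [p]$.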


If the table is completely filled, then there is a set of internally vertex-disjoint shortest $s\_j$-$t\_j$-paths if and only if $T[t\_1, t\_2, \dots, t\_p] = \text{true}$ as the first two requirements are trivially fulfilled. We initialize the table with $T[s\_1, s\_2, \dots, s\_p] := \text{true}$ as internally vertex-disjoint $s\_i$-$s\_i$-paths trivially exist. Moreover, for each tuple $(x\_1, \dots, x\_p) \in V^p$, if $x\_i \prec s\_i$ or $t\_i \prec x\_i$ or $x\_i \in V^{\text{end}} \backslash \{s\_i, t\_i\}$ for at least one $i \in [p]$, then we set $T[x\_1, \dots, x\_p] := \text{false}$. Note that there are $n^p$ possible tuples and initializing each entry takes $O(n)$ time.

We next show how to compute the entries of $T$. To this end, for some tuple $(x\_1, x\_2, \dots, x\_p)$, let $x\_\ell$ be a vertex such that $x\_\ell \neq s\_\ell$ and $x\_i \preceq x\_\ell$ for all $x\_i$ with $i \in [p]$ and $x\_i \neq s\_i$. Moreover, let

$$N^\*(x\_i) := \{ v \mid (v, x\_i) \in A \land v \in (V \backslash (V^{\text{end}} \backslash \{s\_i\})) \backslash \{x\_1, \dots, x\_p\} \}.$$

Finally, let

$$T[x\_1, x\_2, \dots, x\_p] := \bigvee\_{x\_\ell' \in N^\star(x\_\ell)} T[x\_1, x\_2, \dots, x\_{\ell-1}, x\_\ell', x\_{\ell+1}, \dots, x\_p].$$

We now show by induction on the sum of positions in the topological order of all $x\_i$ that $T[x\_1, x\_2, \dots, x\_p] = \text{true}$ if and only if the three criteria are fulfilled. In the base case, $T[s\_1, s\_2, \dots, s\_p] = \text{true}$, or there is some $x\_i$ such that $x\_i \prec s\_i$ and therefore $T[x\_1, x\_2, \dots, x\_p] = \text{false}$. Note that there is no $s\_i$-$x\_i$-path in the latter case.

Now to show the statement for some table entry $T[x\_1, x\_2, \dots, x\_p]$, assume that the statement holds for all table entries $T[x'\_1, x'\_2, \dots, x'\_p]$ such that $x'\_i \preceq x\_i$ for all $i \in [p]$ and $x'\_j \prec x\_j$ for at least one $j \in [p]$. To this end, first assume that $T[x\_1, x\_2, \dots, x\_p] = \text{true}$. Since $T[x\_1, x\_2, \dots, x\_p] = \text{true}$, it was not set to false in the initialization and thus i) and ii) are satisfied. By construction, there is an $x'\_\ell \in N^\*(x\_\ell)$ such that $T[x\_1, x\_2, \dots, x\_{\ell-1}, x'\_\ell, x\_{\ell+1}, \dots, x\_p] = \text{true}$. By induction hypothesis, there are internally vertex-disjoint $s\_\ell$-$x'\_\ell$- and $s\_j$-$x\_j$-paths for all $j \in [p] \backslash \{\ell\}$ such that $s\_\ell \preceq x'\_\ell \preceq t\_\ell$ and $x'\_\ell \in V \backslash (V^{\text{end}} \backslash \{s\_\ell, t\_\ell\})$. Since by definition of $x\_\ell$ it holds that $x\_i \prec x\_\ell$ for all $x\_i$ with $i \in [p]$ and $x\_i \neq s\_i$, it holds that $x\_\ell$ is not contained in any of the $s\_i$-$x\_i$-paths for $i \in [p]$. Hence the $s\_\ell$-$x'\_\ell$-path can be extended by the edge $(x'\_\ell, x\_\ell)$ and the resulting path combined with the other $s\_i$-$x\_i$-paths satisfies iii).

To show the other direction assume that $x\_1, x\_2, \dots, x\_p$ satisfy i) to iii). Then consider the $s\_\ell$-$x\_\ell$-path and the predecessor $x'\_\ell$ of $x\_\ell$ on this path. Note that $x'\_\ell$ exists as otherwise $x\_i = s\_i$ for all $i \in [p]$ and hence we are in the base case. By construction, $x'\_\ell \in N^\*(x\_\ell) \subseteq V \backslash (V^{\text{end}} \backslash \{s\_\ell\})$. Note further that $x'\_\ell \prec x\_\ell \preceq t\_\ell$, implying $x'\_\ell \neq t\_\ell$ and hence i) is also satisfied by $x'\_\ell$. Further, since there is an $s\_\ell$-$x'\_\ell$-path (a subpath of the $s\_\ell$-$x\_\ell$-path), it holds that $s\_\ell \preceq x'\_\ell \prec x\_\ell \preceq t\_\ell$ and thus $x'\_\ell$ also satisfies ii). Finally, iii) is also satisfied by the $s\_\ell$-$x'\_\ell$-subpath combined with the other $s\_i$-$x\_i$-paths. The induction hypothesis then states that $T[x\_1, x\_2, \dots, x\_{\ell-1}, x'\_\ell, x\_{\ell+1}, \dots, x\_p] = \text{true}$. Since $x'\_\ell \in N^\*(x\_\ell)$, it holds that $T[x\_1, x\_2, \dots, x\_p] = \text{true}$. Thus, the statement holds for all table entries $T[x\_1, x\_2, \dots, x\_p]$.

It remains to analyze the running time. There are at most $n^p$ possible table entries and computing one takes $O(n)$ time as $V^{\text{end}}$, $\ell$, and $N^\*(x\_\ell)$ can be computed in $O(p + n) \subseteq O(n)$ time and iterating over all neighbors of $x\_\ell$ takes $O(n)$ time. Hence the overall running time is $O(n^{p+1})$.
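The recurrence from this proof can be sketched in Python as a memoized recursion instead of a bottom-up table. To keep the sketch short, it assumes that all $2p$ terminals are pairwise distinct (the text also allows intersecting pairs), and it takes the DAG as a dictionary `pred` from each vertex to its in-neighbors; these choices and all names are illustrative assumptions. The topological order is obtained from `graphlib.TopologicalSorter` of the Python standard library (Python 3.9 or newer).

```python
from functools import lru_cache
from graphlib import TopologicalSorter


def disjoint_paths_on_dag(pred, pairs):
    """Decide whether internally vertex-disjoint s_i-t_i-paths exist in a DAG.

    pred:  dict vertex -> iterable of in-neighbours (every vertex is a key)
    pairs: list of (s_i, t_i); assumed here to be pairwise distinct terminals
    """
    order = TopologicalSorter(pred).static_order()  # predecessors come first
    pos = {v: idx for idx, v in enumerate(order)}
    sources = tuple(s for s, _ in pairs)
    targets = tuple(t for _, t in pairs)
    terminals = set(sources) | set(targets)

    @lru_cache(maxsize=None)
    def table(xs):
        if xs == sources:
            return True  # base case T[s_1, ..., s_p]
        for x, s, t in zip(xs, sources, targets):
            # initialization: x before s, x after t, or x a foreign terminal
            if pos[x] < pos[s] or pos[t] < pos[x] or (x in terminals and x not in (s, t)):
                return False
        # index l of the head that is last in the topological order among non-sources
        l = max((i for i in range(len(xs)) if xs[i] != sources[i]),
                key=lambda i: pos[xs[i]])
        forbidden = (terminals - {sources[l]}) | set(xs)  # complement of N*(x_l)
        return any(table(xs[:l] + (v,) + xs[l + 1:])
                   for v in pred[xs[l]] if v not in forbidden)

    return table(targets)
```

Each call inspects only the in-neighbors of a single head, which corresponds to the $O(n)$ work per table entry in the analysis above. The recursion depth grows with the sum of positions of the heads, so an iterative bottom-up fill would be preferable for larger inputs.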

After showing how to solve the subproblems, it remains to show that Algorithm 5.1 is correct and to analyze its running time. We start with the analysis of the running time.

**Lemma 5.27.** *Algorithm 5.1 runs in $O(k \cdot n^{16k \cdot k! + k + 1})$ time.*

*Proof.* First, observe that there are at most $k \cdot k!$ different permutations of subsets of $k$ objects as there are exactly $k!$ permutations of exactly $k$ objects and each of these can be truncated at $k$ positions to get any permutation of any smaller (non-empty) subset of objects. Second, observe that by Definition 5.7 there are at most eight vertices guessed for each sequence $\sigma$ as if $\delta\_P^{i,j}(Q) \neq \bot$, then $\alpha\_P^{i,j}(Q) = \omega\_P^{i,j}(Q) = \partial\_P^{i,j}(Q) = \varpi\_P^{i,j}(Q) = \bot$. Hence, at most $8k \cdot k!$ vertices need to be guessed, which requires at most $n^{8k \cdot k!}$ attempts.
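The counting bound in the first observation can be checked numerically for small $k$. The following Python sketch counts all non-empty sequences of distinct elements of $[k]$ exactly and compares the count with $k \cdot k!$; the function name is chosen only for illustration.

```python
from math import factorial, perm


def count_sequences(k):
    """Number of non-empty sequences of pairwise distinct elements of [k]."""
    # perm(k, m) = k! / (k - m)! counts the sequences of length m
    return sum(perm(k, m) for m in range(1, k + 1))


# the exact count never exceeds the bound k * k! used in the proof
for k in range(1, 9):
    assert count_sequences(k) <= k * factorial(k)
```

The bound is not tight in general, since truncating different permutations can yield the same shorter sequence.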

Next we analyze the running time of each iteration of the main foreach-loop in Algorithm 5.1. Notice that by Definition 5.7, for each sequence $\sigma$ there are at most four vertices on a marble path $T\_i$ and that each of these vertices increases the number of minimal segments $S$ on $T\_i$ by at most one. Note that for each $\sigma$ the set $\mathcal{C}^\sigma$ contains vertices from at most two paths. Thus, we create at most $8k \cdot k!$ new segments overall. Since we start with $k$ marble paths, there are at most $8k \cdot k! + k$ minimal segments. Thus, there are at most $(8k \cdot k! + k) \cdot (k \cdot k!)$ iterations of the loop in Line 7, each of which takes constant time. Each iteration of Line 14 can be done in $O(n)$ time using bucket sort and hence the overall running time for all iterations is in $O(n \cdot k)$.

Next, there are $k$ instances of $p\_i$-Disjoint Paths on $i$-layered DAGs that are solved using Lemma 5.26, where $p\_i \leq 8k \cdot k! + k$ for all $i \in [k]$. By Lemma 5.26 the running time for solving one instance is $O(n^{8k \cdot k! + k + 1})$ and the running time for solving all instances is hence $O(k \cdot n^{8k \cdot k! + k + 1})$. Lastly, we verify in Algorithm 5.1 that the solutions found can indeed be merged into one solution for $k$-Disjoint Shortest Paths. Note that we only stated the decision version of $p$-Disjoint Paths on $c$-layered DAGs but the actual solution can be found using a very similar algorithm where we do not only store true or false in the table $T$ but also some set of disjoint paths corresponding to each table entry that stores true. Verifying a solution can for example be done in $O(k \cdot n)$ time by iterating over all solution paths and verifying that between each pair of consecutive vertices there is an edge, that all paths are shortest paths, and that all paths are internally vertex-disjoint. The last check can be done by marking all inner vertices of each path; if some vertex is already marked and visited again, then we return false, and otherwise we return true. Thus the overall running time of Algorithm 5.1 is

$$O(n^{8k\cdot k!} \cdot ((8k\cdot k! + k)\cdot (k\cdot k!) + n\cdot k + k\cdot n^{8k\cdot k! + k + 1} + n\cdot k)) \subseteq O(k\cdot n^{16k\cdot k! + k + 1}).\qed$$
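The verification step described in the proof can be sketched in Python as follows. The sketch assumes that the shortest-path distances between the terminal pairs are already known and passed in as `dist`; this parameter, the adjacency-set format, and the function name are illustrative assumptions. As a slightly stricter reading of internally vertex-disjoint, the sketch also rejects solutions in which a terminal appears as an inner vertex of some path.

```python
def verify_solution(adj, pairs, paths, dist):
    """Check that paths[i] is a shortest s_i-t_i-path and all paths are internally disjoint.

    adj:   dict vertex -> set of neighbours
    pairs: list of (s_i, t_i)
    paths: list of vertex lists, one per pair
    dist:  dict (s_i, t_i) -> length of a shortest s_i-t_i-path (assumed precomputed)
    """
    if len(paths) != len(pairs):
        return False
    marked = set()
    for (s, t), path in zip(pairs, paths):
        # correct endpoints and shortest possible length
        if path[0] != s or path[-1] != t or len(path) - 1 != dist[(s, t)]:
            return False
        # consecutive vertices must be joined by an edge
        if any(b not in adj[a] for a, b in zip(path, path[1:])):
            return False
        # mark inner vertices; a second visit violates disjointness
        for v in path[1:-1]:
            if v in marked:
                return False
            marked.add(v)
    terminals = {v for pair in pairs for v in pair}
    return marked.isdisjoint(terminals)
```

Every vertex of every path is touched a constant number of times, which matches the $O(k \cdot n)$ bound stated above as long as membership tests in `adj` and `marked` take constant time.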

For the correctness of Algorithm 5.1, we need to show that each part of the partition of minimal segments can be solved independently. This follows from Proposition 5.24 together with the fact that Algorithm 5.1 exhaustively tries all possibilities for the crossing set $\mathcal{C}$. Together with Lemma 5.27, this implies our main theorem.

**Theorem 5.28.** *$k$-*Disjoint Shortest Paths *is solvable in $O(k \cdot n^{16k \cdot k! + k + 1})$ time.*

*Proof.* We use Algorithm 5.1 and focus on the correctness as the running time is already analyzed in Lemma 5.27. If Algorithm 5.1 returns true, then Line 16 is executed and a solution is verified. It remains to show that if there is some solution, then Algorithm 5.1 returns true. If there is some solution $\mathcal{P} = (P\_i)\_{i \in [k]}$, then let $\mathcal{C}$ be its crossing set (Definition 5.7). Then, there is some iteration of Line 2 where all guesses are correct, that is, Ends $= \mathcal{T}$ and $T\_i = V(P\_i) \cap \mathcal{C}$. We now consider this iteration of Line 2.

Observation 5.17 states that for each sequence *σ* and for each segment *S* with {start(*S*), end(*S*)} = Ends(*σ*) = T(*σ*) the pair {start(*S*), end(*S*)} is *c*-colored for each *c* ∈ set(*σ*). Hence the same also holds for each minimal segment *S*′ ⊆ *S*. By Line 7, there is a solution where the shortest paths between the endpoints of each minimal segment *S* are strictly *c*-monotone for each *c* ∈ labels[*S*]. Note that labels[*S*] = marks[*S*] in this iteration of Line 2. Hence each path following *S* is strictly *c*-increasing for each *c* ∈ marks[*S*] and by Observation 5.25 this shortest path is contained in *D<sub>c</sub>*. Hence we can find *some* solution for each minimal segment using Lemma 5.26 such that all paths for these minimal segments with the same marks are internally vertex-disjoint. Since marks[*S*] = labels[*S*] for all minimal segments, by Proposition 5.24, all shortest paths between endpoints of minimal segments with different marks are internally vertex-disjoint. Hence, the result computed by Algorithm 5.1 is a solution to *k*-Disjoint Shortest Paths and thus the algorithm returns true.

# **5.4 Concluding Remarks**

We provided an improved polynomial-time algorithm for *k*-Disjoint Shortest Paths for constant *k*. However, while the running time of Algorithm 5.1 can certainly be further improved by some case distinctions and a further refined analysis, the algorithm is still far from being practical. We believe that Algorithm 5.1 can be improved to run in *n*<sup>2<sup>*O*(*k*)</sup></sup> time. It is left open whether a running time of *n*<sup>*k*<sup>*O*(1)</sup></sup> is possible.

Concerning generalizations of *k*-Disjoint Shortest Paths, we believe that Algorithm 5.1 can be modified to not only work for unit edge lengths but also for positive integer lengths. However, the case of non-negative edge lengths seems much more difficult as edges with length zero result in overlapping vertices in our geometric representation. Finally, if there are no *k* disjoint shortest paths for some constant *k*, then computing in polynomial time disjoint paths with minimum length is still an open problem (for *k* = 2, Björklund and Husfeldt [BH19] provided an *O*(*n*<sup>11</sup>)-time randomized algorithm).

# **Part II 2-SAT Programming**

# **Chapter 6**

### **Tree Containment**

In this chapter, we investigate a problem from computational biology. Concerning 2-SAT programming, we present a general *k*-SAT program that shows that a relevant special case is polynomial-time solvable as the resulting program only contains 2-SAT formulas. Concerning problem-specific aspects, we introduce a new variant of a well-known problem in computational biology. The new version models a certain uncertainty regarding the history of evolution. We then identify a relevant special case of this new variant and a natural parameter that models the amount of uncertainty. We conclude with an equivalence between the identified special case and *k*-SAT in the sense that there are reductions from and to *k*-SAT, where the value of *k* in both reductions matches the value of our identified parameter. This proves that the special case is polynomial-time solvable for *k* ≤ 2 and *NP*-hard for *k* ≥ 3.

With the dawn of molecular biology also came the realization that evolutionary trees, which have been widely adopted by biologists, are insufficient to describe certain processes that have been observed in nature. In the last decade, the idea of reticulate evolution, supporting gene flow from multiple parent species, arose [CCR13, TR11]. Reticulate evolution is described using "phylogenetic networks" (see the monographs by Gusfield [Gus14] and Huson et al. [HRS10] or the formal definitions in Section 6.1). A central question when dealing with phylogenetic networks is whether or not different phylogenetic networks provide consistent information. The corresponding problem is known as Tree Containment and it has been shown to be *NP*-hard [ISS10, Kan+08].

In real life, we cannot hope for a perfectly precise evolutionary history. In particular, speciation events (a species splitting off from another) occurring in rapid succession (only a few thousand years between speciation events) can often not be reliably placed in the order in which they occurred. Incomplete information about a certain set of successive speciation events is called a soft polytomy and it is modeled by a non-binary vertex (a vertex with more than two parent species) in a phylogenetic network. We consider the information provided by two non-binary phylogenetic networks consistent if we can replace each non-binary vertex by some binary tree such that the resulting binary phylogenetic networks provide consistent information.

In Section 6.2, we present first structural results for Tree Containment with soft polytomies. In Section 6.3, we show that if one input network is a single-labeled phylogenetic tree and the other input network is a multi-labeled tree (for a definition, see Section 6.1), then Tree Containment is polynomial-time solvable if each label occurs at most twice in the multi-labeled phylogenetic tree and *NP*-complete otherwise. The polynomial-time algorithm is based on the results from Section 6.2 and 2-SAT programming.

### **6.1 Problem Definition and Related Work**

A *phylogenetic network* on a set *X* of taxa is a rooted, single-source, directed, and acyclic graph in which all vertices have in-degree at most one or out-degree exactly one and each leaf *v* (a vertex with out-degree zero) is labeled with one taxon *x* ∈ *X*. We also say that *v* has label *x*. By default, no label occurs twice in a phylogenetic network, and we will make exceptions explicit by calling phylogenetic networks *multi-labeled* if a label can occur more than once. We say that it is *ℓ-labeled* if each label occurs at most *ℓ* times and if we want to emphasize that a phylogenetic network is not multi-labeled, then we call it *single-labeled*. Vertices with in-degree at least two (and out-degree one) are called *reticulations* and the other vertices are called *tree vertices*. A phylogenetic network without reticulations is called a *phylogenetic tree* and a phylogenetic network or tree is called binary if each vertex has in-degree and out-degree at most two. Figure 6.1 shows an example of a binary phylogenetic network (left-hand side) and a phylogenetic tree (right-hand side).

An important task in computational biology is to check whether two models of evolution are consistent. A relevant special case therein is whether a given phylogenetic network is consistent with an existing tree model or not [Gam+15]. A phylogenetic network *N* and a phylogenetic tree *T* are considered consistent if *N displays T*. For the definition of displaying, recall that subdividing an arc (*u, v*) in a directed graph refers to removing the arc (*u, v*) and replacing it by a new vertex *w* and two new arcs (*u, w*) and (*w, v*). A *subdivision* of a graph is the result of repeatedly subdividing arcs in it.

**Figure 6.1:** A 2-labeled phylogenetic network *N* (left-hand side) and a phylogenetic tree *T* (right-hand side). The respective topmost vertex is the only source and is called the root. The leaves are each labeled with one element of the set {*a, b, c, d, e, f*}. The parents of the leaves *d* and *e* in the left example are the reticulations in *N* and all other vertices are tree vertices. Removing the three smaller vertices (and all incident arcs) in *N* on the left-hand side and subdividing each dashed arc in *T* on the right-hand side once yields isomorphic<sup>1</sup> trees. Hence, *N* displays *T*.

**Definition 6.1.** Let *N* be a (possibly multi-labeled) phylogenetic network and let *T* be a single-labeled phylogenetic tree. Then, *N firmly displays T* if a subdivision of *N* contains a subdivision of *T* as a subgraph such that leaf-labels are respected, that is, each leaf *v* in *T* with label *x* is mapped to a leaf with label *x* in *N*.

An example for Definition 6.1 is depicted in Figure 6.1. Based on this definition, Tree Containment is defined as follows.

Tree Containment

**Input:** A (possibly multi-labeled) phylogenetic network *N* and a single-labeled phylogenetic tree *T*.

**Question:** Does *N* firmly display *T*?

<sup>1</sup> In this chapter, *isomorphic* always refers to an isomorphism respecting leaf-labels, that is, the isomorphism must map a leaf with some label *λ* in *N* to a leaf with label *λ* in *T*.

Kanj et al. [Kan+08] showed that Tree Containment is *NP*-hard. Due to its importance in the analysis of evolutionary history, there have been several attempts to identify polynomial-time computable special cases [BS16, FKP15, Gam+15, GDZ17, Gun18, ISS10, Kan+08, Wel18] as well as moderately exponential-time algorithms [GLZ16, Wel18]. Since the definitions for the special cases are rather technical and the results are not relevant for this thesis, we do not present definitions here but only refer the reader to the works by Fakcharoenphol et al. [FKP15] and Weller [Wel18] for an overview.

Motivated by the concept of *soft polytomies*, that is, incomplete knowledge about the order of a limited set of speciation events, we consider a notion we call *soft displaying*. The goal is to allow any high-degree vertex to be replaced by any binary tree such that the resulting phylogenetic network firmly displays the resulting phylogenetic tree. To this end, we consider *arc contractions*. Contracting an arc (*u, v*) in a directed graph refers to the process of "merging" *u* and *v* (and all incident arcs). Formally, vertices *u* and *v* are removed and replaced by a new vertex *w*. For each vertex *x* other than *u* or *v*, if (*x, u*) or (*x, v*) existed in the original graph, then the new graph contains an arc (*x, w*) and if (*u, x*) or (*v, x*) existed in the original graph, then the new graph contains an arc (*w, x*). A *contraction* of a phylogenetic network is the result of repeatedly performing arc contractions in it. We call a binary phylogenetic network *B* = (*V<sub>B</sub>*, *A<sub>B</sub>*) a *binary resolution* of a phylogenetic network *N* = (*V<sub>N</sub>*, *A<sub>N</sub>*) if *N* is a contraction of *B*. An example of contractions and binary resolutions is given in Figure 6.2. We call a surjective function *χ*: *V<sub>B</sub>* → *V<sub>N</sub>* a *contraction function* of *B* for *N* if contracting all arcs (*u, v*) in *B* with *χ*(*u*) = *χ*(*v*) results in a graph isomorphic to *N*. The notion of binary resolutions leads to the following definition of soft displaying.

**Definition 6.2.** Let *N* be a (possibly multi-labeled) phylogenetic network and let *T* be a single-labeled phylogenetic tree. Then, *N softly displays T* if there are binary resolutions *N<sub>B</sub>* of *N* and *T<sub>B</sub>* of *T* such that *N<sub>B</sub>* firmly displays *T<sub>B</sub>*.

Note that, since each binary resolution of a binary phylogenetic network *N* is a subdivision of *N*, the concepts of firm and soft displaying coincide for binary phylogenetic networks. The notion of soft displaying naturally leads to the following definition of Soft Tree Containment.

Soft Tree Containment

**Input:** A (possibly multi-labeled) phylogenetic network *N* and a single-labeled phylogenetic tree *T*.

**Question:** Does *N* softly display *T*?

**Figure 6.2:** Two phylogenetic trees *B* (left-hand side) and *T* (right-hand side). The phylogenetic tree *B* is binary. Contracting the arc between the two green vertices in *B* yields the green vertex in *T*. Analogously, exhaustively contracting any arc between two blue vertices in *B* yields the blue vertex in *T*. Since the result of contracting these arcs in *B* is isomorphic to *T*, the phylogenetic tree *B* is a binary resolution of *T*.
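The contraction operation illustrated in Figure 6.2 can be sketched in a few lines. The representation of a directed graph as a set of (tail, head) pairs and the names `contract_arc` and `contract_by_function` are assumptions made for this illustration only. The second function contracts all arcs whose endpoints have the same image under a given contraction function *χ* and names each resulting vertex after that image.

```python
def contract_arc(arcs, u, v, w):
    """Contract the arc (u, v): u and v are replaced by the new vertex w
    and every other arc incident to u or v is redirected to w."""
    assert (u, v) in arcs
    new_arcs = set()
    for x, y in arcs:
        if x in (u, v) and y in (u, v):
            continue  # the contracted arc itself disappears
        new_tail = w if x in (u, v) else x
        new_head = w if y in (u, v) else y
        new_arcs.add((new_tail, new_head))
    return new_arcs


def contract_by_function(arcs, chi):
    """Contract every arc (u, v) with chi[u] == chi[v]; the vertices of the
    result are the images under chi (chi is a dict, an assumption of this sketch)."""
    return {(chi[x], chi[y]) for x, y in arcs if chi[x] != chi[y]}
```

For instance, contracting the single inner arc of a binary tree on three leaves yields a star with three leaves, so the binary tree is a binary resolution of that star.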

An example of Soft Tree Containment is given in Figure 6.3. Throughout this chapter, we will mostly focus on Soft Tree Containment and for the sake of readability, we refer to soft displaying simply as "displaying". To the best of our knowledge, we are the first to study Soft Tree Containment. In this thesis, we focus on the special case where *N* is a multi-labeled phylogenetic tree. This has three main reasons. First, Tree Containment is known to be *NP*-hard even on binary phylogenetic networks and since Tree Containment and Soft Tree Containment coincide for binary phylogenetic networks, Soft Tree Containment is *NP*-hard on binary phylogenetic networks (that is, *N* is not restricted to being a phylogenetic tree). Conversely, Tree Containment is polynomial-time solvable when *N* is a phylogenetic tree [Gam+15] and hence, the computational complexity of Soft Tree Containment on phylogenetic trees remains unclear. Second, reticulation events are comparatively rare, especially when considering phylogenies of animals, and so chances are that the input consists of phylogenetic trees (or phylogenetic networks with few reticulations). Hence, Soft Tree Containment on phylogenetic trees is a relevant special case from a biological perspective. Third, each algorithm for Soft Tree Containment on phylogenetic networks has to decide on a subgraph of *N* that is a phylogenetic tree and then verify that this phylogenetic tree softly displays *T*. Thus, Soft Tree Containment on phylogenetic trees is a relevant special case from an algorithmic perspective.

**Figure 6.3:** An example for Soft Tree Containment. In the top left-hand corner is a multi-labeled tree *N* and in the top right-hand corner is a single-labeled tree *T*. In the bottom right-hand corner is a subdivision of (a binary resolution of) *T* and in the bottom left-hand corner is (a subdivision of) a binary resolution of *N*. The subgraph in the bottom left-hand corner consisting of all vertices except for the two small vertices and all but the two dashed arcs is isomorphic to the phylogenetic tree in the bottom right-hand corner. This shows that *N* softly displays *T*.

We conclude this section with some notation for the remainder of this chapter. In a single-labeled phylogenetic network, we use leaves and labels (taxa) interchangeably. A binary phylogenetic network *B* on three leaves *a*, *b*, and *c* is called a *triplet* and we denote it by *ab*|*c* if *c* is a child of the root of *B*. In Figure 6.1, the subtree rooted in the parent of the leaf labeled with *a* is the triplet *bc*|*a*. We denote by *N<sub>v</sub>* the subnetwork (or subtree) of *N* rooted in *v*, that is, the induced subgraph containing *v* and all its descendants. We denote the set of labels in a subnetwork *N<sub>v</sub>* by L(*N<sub>v</sub>*). Slightly abusing notation, we use *n* as the maximum number of vertices in *N* and *T*.

Recall that we use the notation *v* <<sub>*D*</sub> *u* to denote that a vertex *v* is a descendant of a vertex *u* in a directed acyclic graph (DAG) *D*. We use *v* ≤<sub>*D*</sub> *u* to denote that *v* is a descendant of *u* in *D* or *v* = *u*. Moreover, recall that the least common ancestor(s) (LCA) of a set *V*′ of vertices is a set *L* of vertices such that each vertex in *L* is an ancestor of each vertex in *V*′ and no descendant of a vertex in *L* is an ancestor of each vertex in *V*′. In trees, the LCA of any set of vertices is always a set containing a single vertex and for the sake of readability, we will assume that the LCA in a tree *is* a single vertex.

Let *N* = (*V, A*) be a phylogenetic network. Recall that suppressing a vertex *v* with one incoming arc (*u, v*) and one outgoing arc (*v, w*) refers to the procedure of removing *v* and both incident arcs and adding the arc (*u, w*) to the graph (if it does not already exist). For any subset *U* ⊆ *V* of vertices, we denote the result of removing all vertices *v* that do not have a descendant in *U* by *N*|<sub>*U*</sub>, and *N*||<sub>*U*</sub> is the result of suppressing all degree-two vertices in *N*|<sub>*U*</sub>. Such a phylogenetic network *N*||<sub>*U*</sub> can be computed in *O*(|*U*|) time [Col+00]. Moreover, if *N* is a phylogenetic tree, then *N*|<sub>*L*</sub> is the smallest subtree of *N* containing the vertices in *L* and the root of *N*.
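For phylogenetic trees, the restriction and suppression operations above can be sketched as follows. This is a simple illustration and not the *O*(|*U*|)-time method of [Col+00]: the sketch traverses the whole tree. The dictionary representation (each vertex mapped to a list of its children), the function name, and the assumptions that *U* consists of leaves and that each vertex counts as a descendant of itself are choices made only for this example.

```python
def restrict_and_suppress(children, root, leaves_u):
    """Return the children-dictionary of the tree obtained by keeping only
    vertices with a descendant in leaves_u (a vertex counts as its own
    descendant here) and suppressing non-root vertices with one child."""
    keep = set(leaves_u)
    result = {}

    def build(v):
        # Recursively restrict the subtrees below v and keep the non-empty ones.
        kept = [r for r in (build(c) for c in children.get(v, [])) if r is not None]
        if not kept and v not in keep:
            return None  # nothing below v is in U: remove v
        if len(kept) == 1 and v != root:
            return kept[0]  # in-degree one and out-degree one: suppress v
        result[v] = kept
        return v

    build(root)
    return result
```

The root is never suppressed in this sketch, since it has no incoming arc and therefore does not match the definition of a vertex with one incoming and one outgoing arc.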

If *N* contains a subgraph *S* that is isomorphic to a tree *T* up to subdivision of arcs, then we simply say that *N* contains a subdivision of *T*. Slightly abusing notation, if an isomorphism maps a vertex *v* in *T* to a vertex *u* in *S* (and thus in *N*), then we do not distinguish between *u* and *v* but say that both vertices are the same. Thus, *S* consists of all vertices in *T* and some vertices of in- and out-degree one.

### **6.2 Single-labeled Trees**

In this section, we will develop a characterization of when a single-labeled phylogenetic tree softly displays another single-labeled phylogenetic tree. To this end, all phylogenetic networks are single-labeled in this section. The characterization will then be used in Section 6.3 to design an algorithm for Soft Tree Containment when the input network *N* is a multi-labeled phylogenetic tree.

We start with a series of basic observations regarding the concept of displaying. First, note that a binary phylogenetic tree displays another binary phylogenetic tree if and only if they are isomorphic up to subdivision of arcs. Hence, if a phylogenetic tree *T* displays another phylogenetic tree *T*′ on the same set of taxa, then there exist binary resolutions *B* of *T* and *B*′ of *T*′ such that *B* displays *B*′, that is, *B* and *B*′ are isomorphic up to subdivision of arcs. Since isomorphism is a symmetric relation, *T*′ then also displays *T*.

**Observation 6.1.** *A phylogenetic tree T displays a phylogenetic tree T* ′ *on the same label-set if and only if T* ′ *displays T.*

For binary trees and, in particular, triplets, the concept of firm displaying is well-researched and we will use the following characterization to develop a characterization for when a phylogenetic tree softly displays another phylogenetic tree.

**Lemma 6.2** ([Dre+12, Chapter 9.1])**.** *Let B be a binary phylogenetic tree. Let a, b, c* ∈ L(*B*) *be three distinct labels. Then, B firmly displays the triplet ab*|*c if and only if*

$$\text{LCA}(\{a, b\}) <_B \text{LCA}(\{b, c\}) = \text{LCA}(\{a, c\}).$$

*Indeed, B is uniquely identified (up to subdivision and suppression of degree-two vertices) by the set D of displayed triplets, that is, B is the only binary tree displaying the triplets in D.*

Based on Lemma 6.2, we can now relate the two forms of displaying for triplets in non-binary trees. To this end, recall that in trees the LCA of a set of vertices is uniquely determined. Moreover, it is easy to verify that if it holds for three leaves *a*, *b*, and *c* in a tree *T* that LCA<sub>*T*</sub>({*a, b*}) <<sub>*T*</sub> LCA<sub>*T*</sub>({*a, c*}), then LCA<sub>*T*</sub>({*a, c*}) = LCA<sub>*T*</sub>({*b, c*}). Lemma 6.2 and the definition of soft displaying then immediately imply the following.

**Observation 6.3.** *Let T be a tree and let a, b, c* ∈ L(*T*)*. Then,*

*(a) T firmly displays ab*|*c if and only if*

LCA({*a, b*}) <<sub>*T*</sub> LCA({*a, c*}) = LCA({*b, c*})*.*

*(b) T firmly displays ac*|*b or bc*|*a if and only if T does not softly display ab*|*c.*
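Both parts of Observation 6.3 translate directly into a small test on LCAs. The following sketch assumes a single-labeled tree given by parent pointers (the root has parent `None`), identifies leaves with their labels, and uses a naive LCA computation along ancestor chains; all function names are introduced here for illustration only.

```python
def ancestors_or_self(parent, v):
    """The chain v, parent(v), ..., root."""
    chain = [v]
    while parent.get(chain[-1]) is not None:
        chain.append(parent[chain[-1]])
    return chain


def lca(parent, x, y):
    """Least common ancestor of x and y (naive, walks both ancestor chains)."""
    seen = set(ancestors_or_self(parent, x))
    for v in ancestors_or_self(parent, y):
        if v in seen:
            return v
    raise ValueError("vertices are not in the same tree")


def firmly_displays(parent, a, b, c):
    """Observation 6.3(a): ab|c is firmly displayed iff
    LCA({a, b}) is a proper descendant of LCA({a, c}) = LCA({b, c})."""
    ab, ac, bc = lca(parent, a, b), lca(parent, a, c), lca(parent, b, c)
    return ac == bc and ab != ac and ac in ancestors_or_self(parent, ab)


def softly_displays(parent, a, b, c):
    """Observation 6.3(b): ab|c is softly displayed iff neither ac|b nor
    bc|a is firmly displayed."""
    return not firmly_displays(parent, a, c, b) and not firmly_displays(parent, b, c, a)
```

In a star with three leaves no triplet is firmly displayed, so all three triplets are softly displayed, which matches the intuition that an unresolved vertex leaves every resolution possible.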

The next observation states that, in trees, an arc contraction does not change the ancestor relation. This is important as it allows us to reason about LCAs in binary resolutions.

**Observation 6.4.** *Let T be a tree and let T* ′ *be the result of contracting any arc in T. Let Y and Z be two sets of leaves common to T and T* ′ *. Then,*

*(a)* LCA<sub>*T*</sub>(*Y*) ≤<sub>*T*</sub> LCA<sub>*T*</sub>(*Z*) *if and only if* LCA<sub>*T*′</sub>(*Y*) ≤<sub>*T*′</sub> LCA<sub>*T*′</sub>(*Z*) *and*

*(b) if* LCA<sub>*T*′</sub>(*Y*) <<sub>*T*′</sub> LCA<sub>*T*′</sub>(*Z*)*, then* LCA<sub>*T*</sub>(*Y*) <<sub>*T*</sub> LCA<sub>*T*</sub>(*Z*)*.*

Recall the example in Figure 6.2 and therein consider the contraction of the arc between the two green vertices in *B*. Observation 6.4 then states for *Y* := {*f, g*} and *Z* := {*a, f, g*} that LCA<sub>*B*</sub>({*f, g*}) ≤<sub>*B*</sub> LCA<sub>*B*</sub>({*a, f, g*}) if and only if LCA<sub>*T*</sub>({*f, g*}) ≤<sub>*T*</sub> LCA<sub>*T*</sub>({*a, f, g*}). Note that this is indeed the case as the LCA of {*a, f, g*} is in both phylogenetic trees the root and the LCA of {*f, g*} is the respective (lower) green vertex.

We now give a characterization of when a phylogenetic tree softly displays another phylogenetic tree. It is based on Lemma 6.2 and the following observation. Note that if in a tree *B* it holds that LCA({*a, b*}) <<sub>*B*</sub> LCA({*b, c*}) = LCA({*a, c*}), then there is no vertex *v* such that *a, c* ∈ L(*v*) and *b* ∉ L(*v*) as any ancestor of LCA({*a, c*}) is an ancestor of LCA({*a, b*}) <<sub>*B*</sub> LCA({*a, c*}).

**Lemma 6.5.** *Let N* = (*V<sub>N</sub>*, *A<sub>N</sub>*) *and T* = (*V<sub>T</sub>*, *A<sub>T</sub>*) *be two phylogenetic trees on the same leaf-label set. Then, N softly displays T if and only if, for all u* ∈ *V<sub>T</sub> and v* ∈ *V<sub>N</sub>, it holds that* L(*T<sub>u</sub>*) ⊆ L(*N<sub>v</sub>*)*,* L(*T<sub>u</sub>*) ⊇ L(*N<sub>v</sub>*)*, or* L(*T<sub>u</sub>*) ∩ L(*N<sub>v</sub>*) = ∅*.*

*Proof.* We start by showing that if *N* displays *T*, then for all *u* ∈ *V<sub>T</sub>* and *v* ∈ *V<sub>N</sub>*, it holds that L(*T<sub>u</sub>*) ⊆ L(*N<sub>v</sub>*), L(*T<sub>u</sub>*) ⊇ L(*N<sub>v</sub>*), or L(*T<sub>u</sub>*) ∩ L(*N<sub>v</sub>*) = ∅. Assume towards a contradiction that *N* softly displays *T* but there are *u* ∈ *V<sub>N</sub>* and *v* ∈ *V<sub>T</sub>* such that

L(*N<sub>u</sub>*) ⊈ L(*T<sub>v</sub>*)*,* L(*N<sub>u</sub>*) ⊉ L(*T<sub>v</sub>*)*,* and L(*N<sub>u</sub>*) ∩ L(*T<sub>v</sub>*) ≠ ∅*.*

This is equivalent to the statement that there are three taxa *x*, *y*, and *z* such that

$$x \in \mathcal{L}(N_u) \setminus \mathcal{L}(T_v), \ y \in \mathcal{L}(N_u) \cap \mathcal{L}(T_v), \text{ and } z \in \mathcal{L}(T_v) \setminus \mathcal{L}(N_u).$$

Since each label appears only once in *N* and *T* and *N* softly displays *T*, there are binary resolutions *N<sup>B</sup>* of *N* and *T<sup>B</sup>* of *T* such that *N<sup>B</sup>* and *T<sup>B</sup>* are isomorphic up to subdivision of arcs. Hence, there is a vertex *u*′ in *N<sup>B</sup>* with L(*N<sup>B</sup><sub>u′</sub>*) = L(*N<sub>u</sub>*) and a vertex *v*′ in *T<sup>B</sup>* with L(*T<sup>B</sup><sub>v′</sub>*) = L(*T<sub>v</sub>*). Since *x, y* ∈ L(*N<sub>u</sub>*) = L(*N<sup>B</sup><sub>u′</sub>*) and *z* ∉ L(*N<sub>u</sub>*) = L(*N<sup>B</sup><sub>u′</sub>*), it holds that

$$\text{LCA}(\{x, y\}) \le_{N^B} u' <_{N^B} \text{LCA}(\{y, z\}) = \text{LCA}(\{x, z\}),$$

that is, *N<sup>B</sup>* displays *xy*|*z*. Analogously, *T<sup>B</sup>* displays *yz*|*x*. By Lemma 6.2, this contradicts the fact that *T<sup>B</sup>* and *N<sup>B</sup>* are isomorphic up to subdivision of arcs.

We continue with the other direction, that is, we show that if for all vertices *u* ∈ *V<sub>T</sub>* and *v* ∈ *V<sub>N</sub>* it holds that

$$
\mathcal{L}(T_u) \subseteq \mathcal{L}(N_v), \ \mathcal{L}(T_u) \supseteq \mathcal{L}(N_v), \text{ or } \mathcal{L}(T_u) \cap \mathcal{L}(N_v) = \emptyset,
$$

then *N* displays *T*. Using Lemma 6.2, we will show how to construct binary trees *B<sup>N</sup>* and *B<sup>T</sup>* such that *B<sup>N</sup>* is a binary resolution of *N*, *B<sup>T</sup>* is a binary resolution of *T*, and both display all triplets that are firmly displayed by *N* or *T*. Since the constructions for both trees are analogous, we only focus on *B<sup>N</sup>* here. Consider any vertex *v* ∈ *V<sub>N</sub>* that has out-degree at least three. Then, there are three labels *a*, *b*, and *c* such that LCA<sub>*N*</sub>({*a, b*}) = LCA<sub>*N*</sub>({*a, c*}) = LCA<sub>*N*</sub>({*b, c*}). Let *c<sub>a</sub>*, *c<sub>b</sub>*, and *c<sub>c</sub>* be the three children of *v* in *N* such that *a* ∈ L(*N*<sub>*c<sub>a</sub>*</sub>), *b* ∈ L(*N*<sub>*c<sub>b</sub>*</sub>), and *c* ∈ L(*N*<sub>*c<sub>c</sub>*</sub>). We now consider the two cases whether or not

$$\text{LCA}_T(\{a, c\}) = \text{LCA}_T(\{a, b\}) = \text{LCA}_T(\{b, c\}).$$

If LCA<sub>*T*</sub>({*a, c*}) = LCA<sub>*T*</sub>({*a, b*}) = LCA<sub>*T*</sub>({*b, c*}), then neither *N* nor *T* displays one of the triplets *ab*|*c*, *ac*|*b*, or *bc*|*a*. Hence we arbitrarily replace the arcs (*v, c<sub>b</sub>*) and (*v, c<sub>c</sub>*) by a new vertex *w* and new arcs (*v, w*), (*w, c<sub>b</sub>*), and (*w, c<sub>c</sub>*). Note that the resulting phylogenetic tree firmly displays all triplets that *N* firmly displayed and the triplet *bc*|*a*. Since this procedure reduces the out-degree of one vertex of out-degree at least three and does not introduce new vertices of out-degree at least three, we can repeat this procedure until no vertex has out-degree at least three any more, that is, the resulting phylogenetic tree is binary. Observe further that *B<sup>N</sup>* is trivially a binary resolution of *B<sup>N</sup>* and therefore *N* softly displays *B<sup>N</sup>* by definition. The construction of *B<sup>T</sup>* is analogous and whenever

$$\begin{aligned} \text{LCA}_N(\{a, c\}) &= \text{LCA}_N(\{a, b\}) = \text{LCA}_N(\{b, c\}) \text{ and} \\ \text{LCA}_T(\{a, c\}) &= \text{LCA}_T(\{a, b\}) = \text{LCA}_T(\{b, c\}), \end{aligned}$$

then we construct *B<sup>T</sup>* to display the same triplet as *B<sup>N</sup>*.

Note that since *B<sup>N</sup>* and *B<sup>T</sup>* are binary, they firmly display one of the following three possible triplets *ab*|*c*, *ac*|*b*, or *bc*|*a* for each triple (*a, b, c*) of labels. By Lemma 6.2, *B<sup>N</sup>* and *B<sup>T</sup>* are isomorphic up to subdivision of arcs as binary trees are uniquely defined by their displayed triplets. Hence *B<sup>N</sup>* is a subdivision of a binary resolution of both *N* and *T* and, as *B<sup>N</sup>* is binary, *N* softly displays *T* by definition.
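The condition of Lemma 6.5 can be checked directly by comparing the label sets of all subtrees. The following is a minimal sketch under the assumptions that both trees are single-labeled, share the same leaf-label set, and are given as dictionaries mapping each vertex to a list of its children, with leaves identified with their labels. The function names are hypothetical and the pairwise comparison is not optimized.

```python
def label_sets(children, root):
    """Map every vertex v to the label set of the subtree rooted in v."""
    sets = {}

    def visit(v):
        kids = children.get(v, [])
        if not kids:
            sets[v] = frozenset([v])  # a leaf is identified with its label
        else:
            sets[v] = frozenset().union(*(visit(c) for c in kids))
        return sets[v]

    visit(root)
    return sets


def softly_displays_tree(children_n, root_n, children_t, root_t):
    """Lemma 6.5: N softly displays T iff every pair of label sets, one from
    each tree, is nested or disjoint."""
    clusters_n = set(label_sets(children_n, root_n).values())
    clusters_t = set(label_sets(children_t, root_t).values())
    return all(
        x <= y or x >= y or not (x & y)
        for x in clusters_t
        for y in clusters_n
    )
```

For example, the binary trees displaying *ab*|*c* and *bc*|*a* fail the test because the label sets {*a, b*} and {*b, c*} overlap without being nested, whereas each of them passes against the star on {*a, b, c*}.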

We conclude this section with a helpful lemma that lists some equivalent characterizations of soft displaying in relevant special cases. This lemma will be used to show hardness of Soft Tree Containment in Subsection 6.3.2.

**Lemma 6.6.** *Let T and T* ′ *be phylogenetic trees and let B be a binary phylogenetic tree, all on the same set X of labels.*

*(a) T softly displays the leaf-triplet ab*|*c if and only if*

LCA({*a, b*}) ≤ LCA({*b, c*}) = LCA({*a, c*})*.*


*Proof.* We prove the three statements one after another. To verify statement (a), note that, by definition, *T* softly displays *ab*|*c* if and only if there is a binary resolution *T<sub>B</sub>* of *T* displaying *ab*|*c*. By Lemma 6.2, *T<sub>B</sub>* firmly displays *ab*|*c* if and only if

$$\text{LCA}_{T_B}(\{a, b\}) <_{T_B} \text{LCA}_{T_B}(\{a, c\}) = \text{LCA}_{T_B}(\{b, c\}).$$

Since *T<sub>B</sub>* is binary, this is equivalent to

$$\text{LCA}_{T_B}(\{a, b\}) \le_{T_B} \text{LCA}_{T_B}(\{a, c\}) = \text{LCA}_{T_B}(\{b, c\}),$$

which by Observation 6.4 is equivalent to

$$\text{LCA}_T(\{a, b\}) \le_T \text{LCA}_T(\{a, c\}) = \text{LCA}_T(\{b, c\}).$$

We next prove statement (b). To this end, first assume towards a contradiction that *T* displays *B* but a triplet *ab*|*c* that *B* displays firmly is not displayed softly by *T*. Then, {LCA<sub>*T*</sub>({*a, b*}), LCA<sub>*T*</sub>({*a, c*}), LCA<sub>*T*</sub>({*b, c*})} has a unique minimum *x* with respect to <<sub>*T*</sub> and it holds by statement (a) that *x* ≠ LCA<sub>*T*</sub>({*a, b*}) (as otherwise *T* displays *ab*|*c*). Without loss of generality, let *x* = LCA<sub>*T*</sub>({*a, c*}). Since *T* has a binary resolution that is isomorphic to *B* up to subdivision of arcs, it holds that *T* is a contraction of a subdivision of *B*. Hence, Observation 6.4 states that LCA<sub>*B*</sub>({*a, c*}) <<sub>*B*</sub> LCA<sub>*B*</sub>({*a, b, c*}) and thus *B* displays *ac*|*b*. Note that a binary tree cannot display both *ab*|*c* and *ac*|*b* and thus we reached a contradiction.

Now, assume towards a contradiction that *T* = (*V<sub>T</sub>*, *A<sub>T</sub>*) does not softly display *B* = (*V<sub>B</sub>*, *A<sub>B</sub>*) but displays all triplets that are firmly displayed by *B*. Since *T* does not display *B*, there are by Lemma 6.5 vertices *u* ∈ *V<sub>T</sub>* and *v* ∈ *V<sub>B</sub>* and labels *x*, *y*, and *z* such that *x* ∈ L(*T<sub>u</sub>*) \ L(*B<sub>v</sub>*), *y* ∈ L(*B<sub>v</sub>*) \ L(*T<sub>u</sub>*), and *z* ∈ L(*T<sub>u</sub>*) ∩ L(*B<sub>v</sub>*). Thus,

$$\begin{aligned} \text{LCA}_T(\{x, z\}) \leq_T u &<_T \text{LCA}_T(\{x, y, z\}) \text{ and }\\ \text{LCA}_B(\{y, z\}) \leq_B v &<_B \text{LCA}_B(\{x, y, z\}). \end{aligned}$$

By statement (a), *T* displays *xz*|*y* and *B* displays *yz*|*x*. Since *T* displays all triplets that *B* displays firmly, *T* displays *yz*|*x*. Again by (a), we can conclude that LCA<sub>*T*</sub>({*y, z*}) ≤<sub>*T*</sub> LCA<sub>*T*</sub>({*x, z*}) ≤<sub>*T*</sub> *u*. Thus, *y* ∈ L(*u*), a contradiction.

It remains to show statement (c). By definition, *T* softly displays *T*′ if and only if there are binary resolutions *B* and *B*′ of *T* and *T*′, respectively, such that *B* firmly displays *B*′. If such phylogenetic trees exist, then they are by Lemma 6.2 isomorphic up to subdivision of arcs. Thus, *B* is a binary resolution of a subdivision of *T*′ and the statement follows.

### **6.3 Multi-labeled Trees and** *k***-SAT**

In this section, we study Soft Tree Containment for multi-labeled phylogenetic trees. We will show a strong connection between *k*-SAT and Soft Tree Containment on *k*-labeled phylogenetic trees in the sense that there is a polynomial-time reduction from *k*-SAT to Soft Tree Containment on *k*-labeled phylogenetic trees and a *k*-SAT program for Soft Tree Containment on *k*-labeled phylogenetic trees. This yields the dichotomy result that Soft Tree Containment on *k*-labeled phylogenetic trees is polynomial-time solvable if *k* ≤ 2 and *NP*-hard if *k* ≥ 3. We start with a characterization of when a multi-labeled phylogenetic tree softly displays a single-labeled phylogenetic tree *T*.

**Lemma 6.7.** *Let M be a multi-labeled phylogenetic tree and let T be a single-labeled phylogenetic tree on the same set X of labels. Then, M softly displays T if and only if M contains (as a subgraph) a single-labeled phylogenetic tree S on X that softly displays T.*

*Proof.* We will first show that if *M* := (*V<sub>M</sub>*, *A<sub>M</sub>*) softly displays *T* := (*V<sub>T</sub>*, *A<sub>T</sub>*), then *M* contains a single-labeled phylogenetic tree *S* that softly displays *T*. Note that, by definition, if *M* softly displays *T*, then there are binary resolutions *M<sub>B</sub>* := (*V<sub>B</sub>*, *A<sub>B</sub>*) of *M* and *T<sub>B</sub>* of *T* and subdivisions *M<sub>B</sub><sup>S</sup>* of *M<sub>B</sub>* and *T<sub>B</sub><sup>S</sup>* of *T<sub>B</sub>* such that *M<sub>B</sub><sup>S</sup>* contains *T<sub>B</sub><sup>S</sup>* as a subgraph (respecting leaf labels). Let *S<sub>B</sub><sup>S</sup>* be the subgraph of *M<sub>B</sub><sup>S</sup>* that is isomorphic to *T<sub>B</sub><sup>S</sup>*. Let *S<sub>B</sub>* be the phylogenetic tree that is the result of reverting all subdivisions from *M<sub>B</sub>* to *M<sub>B</sub><sup>S</sup>* in *S<sub>B</sub><sup>S</sup>*, that is, suppressing each vertex *v* that is contained in *S<sub>B</sub><sup>S</sup>* but not in *M<sub>B</sub>*. Note that *S<sub>B</sub><sup>S</sup>* is a subdivision of *S<sub>B</sub>* and *S<sub>B</sub>* is a single-labeled subgraph of *M<sub>B</sub>*. Let *χ*: *V<sub>B</sub>* → *V<sub>M</sub>* be the contraction function of *M<sub>B</sub>* for *M*, that is, the function mapping each vertex *u* in *M<sub>B</sub>* to the vertex *χ*(*u*) in *M* that *u* is contracted to when forming *M*. Moreover, let *S* be the result of contracting each arc (*u, v*) in *S<sub>B</sub>* with *χ*(*u*) = *χ*(*v*). Note that for each vertex *v* in *M<sub>B</sub>* it holds that all vertices *u* with *χ*(*u*) = *χ*(*v*) contract to a single vertex in *M* and hence these vertices form, by definition of contraction functions, a weakly connected component in *M<sub>B</sub>*. Further, since *M<sub>B</sub>* is a tree and *S<sub>B</sub>* is a subtree of *M<sub>B</sub>*, it holds for each vertex *v*′ in *S<sub>B</sub>* that all vertices *u*′ with *χ*(*u*′) = *χ*(*v*′) form a weakly connected component in *S<sub>B</sub>*. Thus, the phylogenetic tree *S* contains no two vertices *u*′ and *v*′ with *χ*(*u*′) = *χ*(*v*′). Since *M* is the result of contracting each arc (*u, v*) with *χ*(*u*) = *χ*(*v*) in *M<sub>B</sub>*, since *S* is the result of contracting each arc (*u, v*) with *χ*(*u*) = *χ*(*v*) in *S<sub>B</sub>*, and since *S<sub>B</sub>* is a subtree of *M<sub>B</sub>*, it holds that *S* is a subtree of *M*. Concluding, *S* is a single-labeled subtree of *M*, *S* has a binary resolution *S<sub>B</sub>*, *S<sub>B</sub>* has a subdivision *S<sub>B</sub><sup>S</sup>*, and *S<sub>B</sub><sup>S</sup>* is by assumption isomorphic to *T<sub>B</sub><sup>S</sup>*. Thus, *S* softly displays *T* by definition.

It remains to show that if *M* contains a single-labeled subtree *S* which softly displays *T*, then *M* softly displays *T*. If *M* contains a single-labeled subtree *S* that softly displays *T*, then there are by definition binary resolutions *S<sup>B</sup>* and *T<sup>B</sup>* of *S* and *T*, respectively, and subdivisions $S_B^S$ of *S<sup>B</sup>* and $T_B^S$ of *T<sup>B</sup>* such that $S_B^S$ and $T_B^S$ are isomorphic. We will show that *M* softly displays *T*, that is, there is a binary resolution *M<sup>B</sup>* of *M* that has a subdivision $M_B^S$ that contains $T_B^S$ as a subgraph. First, to avoid ambiguity, we relabel each leaf that is not contained in *S* such that the resulting tree *M*′ is a single-labeled tree on a set *X*′ ⊇ *X* of labels. This allows us to again refer to leaves of *M*′ in terms of labels. Note that only labels for leaves not contained in *S* are different between *M* and *M*′ and hence *M*′ also contains *S* as a subgraph. Let $M'_B$ be any binary resolution of *M*′ that satisfies the following property. If for three labels *a, b, c* ∈ *X* it holds that LCA(*a, b*) $<_{S_B}$ LCA(*a, c*) = LCA(*b, c*), then LCA(*a, b*) $<_{M'_B}$ LCA(*a, c*) = LCA(*b, c*). Note that $M'_B$ contains a subdivision of *S<sup>B</sup>* as a subtree. Hence, $M'_B$ firmly displays *T<sup>B</sup>*. Finally, let *M<sup>B</sup>* be the multi-labeled phylogenetic tree resulting from replacing the labels in $M'_B$ with their original labels from *X*. Since *M* and *M*′ only differ in these labels, it holds that *M<sup>B</sup>* is a binary resolution of *M*. Further, since *S* does not contain any of the leaves in which *M<sup>B</sup>* and $M'_B$ differ, it holds that *M<sup>B</sup>* contains a subdivision of *S<sup>B</sup>* as a subgraph. Thus, there is a binary resolution *M<sup>B</sup>* of *M* and *M<sup>B</sup>* contains a subdivision of *S<sup>B</sup>* as a subgraph which firmly displays *T*, that is, *M* softly displays *T*.

We will use the characterization shown in Lemma 6.7 to prove both sides of the dichotomy result in this chapter. In Subsection 6.3.1, we present a *k*-SAT program for Soft Tree Containment on *k*-labeled phylogenetic trees. This implies that Soft Tree Containment is polynomial-time solvable for 2-labeled phylogenetic trees. In Subsection 6.3.2, we complement this result with a reduction from *k*-SAT to Soft Tree Containment on *k*-labeled phylogenetic trees. This implies that Soft Tree Containment on *k*-labeled phylogenetic trees is *NP*-hard for each *k* ≥ 3.

### **6.3.1 Reduction to** *k***-SAT**

In this subsection, we present a *k*-SAT program for Soft Tree Containment on *k*-labeled phylogenetic trees. The basic idea is a bottom-up approach that computes for each vertex *u* in the single-labeled phylogenetic tree *T* a set *M*(*u*) of candidates. Each such candidate is a vertex *v* in the *k*-labeled phylogenetic tree *N* such that the subtree *N<sup>v</sup>* of *N* rooted in *v* displays *T<sup>u</sup>* and for no descendant *w* of *v* it holds that *N<sup>w</sup>* displays *Tu*. We will later show that there are at most *k* such candidates for each vertex in *T*. Afterwards, we will show how to compute the set *M*(*u*) for each vertex *u* in *T* in a bottom-up manner using *k*-SAT.

Note that if *N* displays *T*, then, by Lemma 6.7, *N* contains a single-labeled subtree *S* that displays *T*. We call *S canonical* for some vertex *u* in *T* if LCA*S*(L(*Tu*)) ∈ *M*(*u*) and *canonical* for *T* if it is canonical for all vertices in *T*. We start by showing that softly displaying is equivalent to having such a canonical subtree.

**Lemma 6.8.** *A k-labeled tree N softly displays a single-labeled tree T if and only if N has a canonical subtree for T.*

*Proof.* Let *r* be the root of *T*. If *N* := (*V<sup>N</sup>, A<sup>N</sup>*) has a canonical subtree *S* for *T* := (*V<sup>T</sup>, A<sup>T</sup>*), then, by definition, *S* contains a vertex *v* such that *S<sup>v</sup>* displays *T<sup>r</sup>* = *T*. Hence, *N* contains a single-labeled tree *S* that displays *T* and, by Lemma 6.7, this shows that *N* displays *T*.

It remains to show that if *N* displays *T*, then *N* contains a canonical subtree *S* for *T*. If *N* displays *T*, then *N* contains by Lemma 6.7 a single-labeled subtree *S* that displays *T*. Assume towards a contradiction that *S* is not canonical for *T*. Let *u* ∈ *V<sup>T</sup>* be a vertex for which *S* is not canonical but *S* is canonical for all ancestors of *u* in *T*. Note that *u* ≠ *r* as *S* displays *T* = *T<sup>r</sup>* by assumption. Let *p* be the parent of *u* in *T*. Since *S* is canonical for *p*, there is a vertex *y* := LCA<sub>*S*</sub>(L(*T<sup>p</sup>*)) in *S* such that *S<sup>y</sup>* displays *T<sup>p</sup>*. Let $S'_y := S_y|_{\mathcal{L}(T_p)}$, that is, $S'_y$ is the subtree of *S<sup>y</sup>* containing all leaves in *T<sup>p</sup>* and no other. By Lemma 6.6(c), there is a binary single-labeled phylogenetic tree *B* on L(*T<sup>p</sup>*) which is displayed by $S'_y$ and *T<sup>p</sup>*. By Lemma 6.6(b), $S'_y$ displays each triplet which is firmly displayed by *B*. Let *x* := LCA<sub>*S*</sub>(L(*T<sup>u</sup>*)). Since *S* is not canonical for *u*, it holds that *S<sup>x</sup>* does not display *T<sup>u</sup>* or there is a descendant *z* of *x* such that *S<sup>z</sup>* displays *T<sup>u</sup>*. By definition of *x*, for no descendant *z* of *x* the subtree *S<sup>z</sup>* can display *T<sup>u</sup>* as for each such *z* there is a label *ℓ* ∈ L(*T<sup>u</sup>*) \ L(*S<sup>z</sup>*) and therefore no triplet containing *ℓ* can be displayed by *S<sup>z</sup>*. Hence, *S<sup>x</sup>* does not display *T<sup>u</sup>*. Recall that there is a binary phylogenetic tree *B* which is displayed by $S'_y$ and *T<sup>p</sup>*. Let $B' := B|_{\mathcal{L}(T_u)}$ and let *ab*|*c* be any triplet that *B*′ displays firmly. Since *B*′ is a subtree of *B* it holds that *B* firmly displays *ab*|*c*. Hence, $S'_y$ and *T<sup>p</sup>* softly display *ab*|*c*. If *T<sup>u</sup>* does not display *ab*|*c*, then, by Observation 6.3(b), it firmly displays *ac*|*b* or *bc*|*a*. Since *T<sup>u</sup>* is a subtree of *T<sup>p</sup>*, also *T<sup>p</sup>* firmly displays *ac*|*b* or *bc*|*a*. By Observation 6.3(b), *T<sup>p</sup>* then does not display *ab*|*c*, a contradiction. Analogously, if *S<sup>x</sup>* does not display *ab*|*c*, then *S<sup>y</sup>* does not display *ab*|*c*, another contradiction. Thus, both *S<sup>x</sup>* and *T<sup>u</sup>* display all triplets that are displayed by *B*′, and *S<sup>x</sup>* therefore displays *T<sup>u</sup>* by Lemma 6.6, yielding a final contradiction to the assumption that *S* is not canonical for *u*.

As stated above, we compute *M*(*u*) for each vertex *u* in *T* in a bottom-up fashion. We will now show that |*M*(*u*)| ≤ *k* for each *u* ∈ *V<sup>T</sup>* .

**Lemma 6.9.** *Let N be a k-labeled phylogenetic tree and let T* := (*V<sup>T</sup>, A<sup>T</sup>*) *be a single-labeled phylogenetic tree. Then, it holds for each u* ∈ *V<sup>T</sup> that* |*M*(*u*)| ≤ *k.*

*Proof.* We prove the statement by induction over the height of a vertex *u* in *T*. If the height of *u* is 0, that is, *u* is a leaf, then *M*(*u*) contains all leaves in *N* that have the same label as *u*. As *N* is *k*-labeled, each candidate set *M*(*u*) for a leaf *u* is of size at most *k*. If *u* is not a leaf, then let *c* be a child of *u* in *T* and assume that |*M*(*c*)| ≤ *k*. Consider any vertex *v* ∈ *M*(*u*). Since *N<sup>v</sup>* displays *T<sup>u</sup>*, there is by Lemma 6.8 a subtree *S<sup>v</sup>* of *N<sup>v</sup>* that is canonical for *T<sup>u</sup>*. Hence, there is a vertex *w* ∈ *M*(*c*) in *S<sup>v</sup>*, that is, *S<sup>w</sup>* displays *T<sup>c</sup>* and *w* is a candidate for *c*. Note that *v* is the only ancestor of *w* in *M*(*u*) as *M*(*u*) only contains minima. Thus, each vertex in *M*(*u*) can be assigned a vertex in *M*(*c*) that is assigned to no other vertex in *M*(*u*), and since |*M*(*c*)| ≤ *k*, it holds that |*M*(*u*)| ≤ *k*.

**Figure 6.4:** Two phylogenetic trees *N* (left-hand side) and *T* (right-hand side). The vertices in *T* are colored and for each vertex *u* in *T* all vertices in *M*(*u*) in *N* are colored with the same color as *u*. The ascending paths of the two red vertices in *N* are drawn with bold arcs and the ascending paths of the two blue vertices are indicated by dashed arcs.

Note that the proof of Lemma 6.9 also states that for each vertex *u* in *T* that is not a leaf, each child *c* of *u* in *T*, and each *w* ∈ *M*(*c*), there is at most one ancestor *v* of *w* in *N* which is contained in *M*(*u*). We call the unique *v*-*w*-path in *N* the *ascending path* of *w* with respect to *c* and we omit mentioning *c* if it is clear from the context. An example of ascending paths is given in Figure 6.4. We next present a crucial lemma about ascending paths which states that ascending paths with respect to two vertices *c*<sup>1</sup> and *c*<sup>2</sup> are arc-disjoint unless *c*<sup>1</sup> and *c*<sup>2</sup> are siblings in *T*. Afterwards, we present our *k*-SAT program for Soft Tree Containment on *k*-labeled phylogenetic trees using the notions of candidate sets and ascending paths.

**Lemma 6.10.** *Let N be a multi-labeled phylogenetic tree, let T be a single-labeled phylogenetic tree, and let N display T. Let S be a canonical subtree of N for T. Let u and v be two distinct vertices in T such that neither of them is the root of T and u and v are not siblings in T. Let* LCA<sub>*S*</sub>(L(*T<sup>u</sup>*)) *and* LCA<sub>*S*</sub>(L(*T<sup>v</sup>*)) *have ascending paths R and Q with respect to u and v, respectively. Then, R and Q are arc-disjoint.*

*Proof.* To prove the statement, we distinguish between the two cases where *u* and *v* are in an ancestor-descendant relation or not. If *u* and *v* are in an ancestor-descendant relation, then without loss of generality let *u <<sup>T</sup> v*. Let *p* be the parent of *u* in *T*. Note that *p* ≤*<sup>T</sup> v* and hence LCA<sub>*S*</sub>(L(*T<sup>p</sup>*)) ≤*<sup>S</sup>* LCA<sub>*S*</sub>(L(*T<sup>v</sup>*)). Thus, each vertex in the ascending path *R* of *u* is either *v* or a descendant of *v* in *T*. Since the ascending path *Q* of *v* only contains *v* and ancestors of *v* in *T*, it holds that *R* and *Q* share at most one vertex (*v*) and no arcs.

If *u* and *v* are not in an ancestor-descendant relation in *T*, then assume towards a contradiction that the ascending paths *R* and *Q* share an inner vertex *z*. Since *z* is an ancestor of both *u* and *v* in *T*, it holds that L(*Tu*) ∪ L(*Tv*) ⊆ L(*Tz*). As *u* and *v* are not siblings in *T*, one of *u* and *v* has a parent *p* that is not in an ancestor-descendant relation with the other. Assume without loss of generality that *p* is the parent of *u*. Since *v* and *p* are not in an ancestor-descendant relation and since *T* is a single-labeled phylogenetic tree, it holds that

$$
\mathcal{L}(T_p) \cap \mathcal{L}(T_z) \supseteq \mathcal{L}(T_u) \neq \emptyset \text{ and } \mathcal{L}(T_z) \setminus \mathcal{L}(T_p) \supseteq \mathcal{L}(T_v) \neq \emptyset.
$$

Since *S* is canonical, it holds that *y* := LCA<sub>*S*</sub>(L(*T<sup>p</sup>*)) ∈ *M*(*p*) and, thus, the ascending path *R* starts in *y*. As *z* is an inner vertex of *R*, it holds that *z <<sup>S</sup> y*, implying

$$
\mathcal{L}(T_p) \setminus \mathcal{L}(T_z) \neq \emptyset.
$$

Concluding, it holds that

$$
\mathcal{L}(T_p) \cap \mathcal{L}(T_z) \neq \emptyset, \ \mathcal{L}(T_z) \setminus \mathcal{L}(T_p) \neq \emptyset, \ \text{and} \ \mathcal{L}(T_p) \setminus \mathcal{L}(T_z) \neq \emptyset
$$

and, by Lemma 6.5, this contradicts the assumption that *S* softly displays *T*.
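The final step of the proof rests on three label sets being simultaneously non-empty: the intersection and both set differences. The following minimal sketch shows that test on plain Python sets. It assumes, as the proof suggests, that Lemma 6.5 rules out exactly this kind of "crossing" pair of leaf-label sets; the function name `sets_cross` and the toy label sets are hypothetical and not taken from the thesis.

```python
def sets_cross(a: set, b: set) -> bool:
    """Return True if a and b overlap but neither contains the other."""
    return bool(a & b) and bool(a - b) and bool(b - a)


# Hypothetical toy data: labels below p and below z.
labels_p = {"a", "b", "c"}
labels_z = {"b", "c", "d"}

crossing = sets_cross(labels_p, labels_z)          # overlap, no containment
nested = sets_cross({"a", "b"}, {"a", "b", "c"})   # one set contains the other
disjoint = sets_cross({"a"}, {"b"})                # no common label
```

Nested or disjoint label sets pass the check, which matches the intuition that the leaf sets below two vertices of a tree are either nested or disjoint.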

We next present the idea behind the main result in this section. To this end, let *r* be the root of *T*. Clearly, *N* displays *T* if and only if *M*(*r*) ≠ ∅. Hence, it remains to show how to compute *M*(*u*) given *M*(*v*) for all *v* ≠ *u* in *T<sup>u</sup>*. We do so via a reduction to *k*-SAT that checks for each *y* in *N* whether *y* ∈ *M*(*u*). Therein, we have a variable $x_{z \to c}$ for each vertex *c* ≠ *u* in *T<sup>u</sup>* and *z* ∈ *M*(*c*) that represents whether *S<sup>z</sup>* displays *T<sup>c</sup>*, where *S* is the canonical subtree of *N<sup>y</sup>* for *T<sup>u</sup>*. The formula then checks whether these choices are consistent, that is, if $x_{a \to w}$ and $x_{z \to v}$ are set to true and *w* is a descendant of *v* in *T*, then *a* is a descendant of *z* in *N* (or *a* = *z*). Finally, the formula checks whether these choices satisfy Lemma 6.10. After presenting the formula, we will prove that it is correct, that is, $\varphi_{y \to u}$ is satisfiable if and only if *N<sup>y</sup>* displays *T<sup>u</sup>*.

**Construction 6.11.** Construct $\varphi_{y \to u}$ as follows. For each *v* ≠ *u* in *T<sup>u</sup>* and for each *z* ∈ *M*(*v*), introduce a variable $x_{z \to v}$. Moreover, for each *v* ≠ *u* in *T<sup>u</sup>*


Note that the ascending path of *z* or *q* in (4) is not defined if *v* or *w* is a child of *u* in *T<sup>u</sup>* as *M*(*u*) is not defined. In this case we call the unique *y*-*z*-path or the unique *y*-*q*-path the ascending path as we test whether *y* ∈ *M*(*u*). We next show that Construction 6.11 is correct. Since we use the construction to test whether *y* ∈ *M*(*u*) and since *M*(*u*) can, by definition, not contain two vertices that are in an ancestor-descendant relation, we assume that $\varphi_{z \to u}$ is not satisfiable for any descendant *z* of *y* in *N*.
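To make the shape of the formula concrete, the following sketch generates clauses of four kinds. It is one plausible reading, inferred from how the proof of Lemma 6.12 refers to clause types (1) to (4), and not the author's exact formulation: type (1) asks for at least one chosen candidate per vertex, type (2) for at most one, type (3) forbids choices that violate the ancestor order in *N*, and type (4) forbids pairs of choices whose ascending paths conflict. All names (`build_clauses`, `is_ancestor_or_equal`, `paths_conflict`) and the toy data are hypothetical; in particular, the test for conflicting ascending paths is passed in as a predicate rather than implemented.

```python
from itertools import combinations


def is_ancestor_or_equal(z, a, n_parent):
    """True if z equals a or z is an ancestor of a in N (given by parent pointers)."""
    while a is not None:
        if a == z:
            return True
        a = n_parent.get(a)
    return False


def build_clauses(candidates, t_parent, n_parent, paths_conflict):
    """Build clauses as lists of (sign, (z, v)) literals, variable (z, v) meaning x_{z -> v}.

    candidates: maps each vertex v != u of T_u to its candidate list M(v).
    t_parent:   maps v to its parent in T_u (children of u may be omitted).
    """
    clauses = []
    for v, cands in candidates.items():
        # Type (1): at least one candidate is chosen for v (at most k literals).
        clauses.append([(True, (z, v)) for z in cands])
        # Type (2): no two candidates are chosen for v.
        for z1, z2 in combinations(cands, 2):
            clauses.append([(False, (z1, v)), (False, (z2, v))])
    # Type (3): a chosen candidate for v must lie below the one chosen for its parent.
    for v, p in t_parent.items():
        if v not in candidates or p not in candidates:
            continue  # p = u has no candidate set; y plays that role
        for a in candidates[v]:
            for z in candidates[p]:
                if not is_ancestor_or_equal(z, a, n_parent):
                    clauses.append([(False, (a, v)), (False, (z, p))])
    # Type (4): forbid pairs of choices flagged by the (assumed) path-conflict test.
    variables = [(z, v) for v, cands in candidates.items() for z in cands]
    for x1, x2 in combinations(variables, 2):
        if x1[1] != x2[1] and paths_conflict(x1, x2):
            clauses.append([(False, x1), (False, x2)])
    return clauses


# Toy instance: p has one candidate, its child c has two.
n_parent = {"n2": "n1", "n1": "root", "n3": "root"}
candidates = {"p": ["n1"], "c": ["n2", "n3"]}
clauses = build_clauses(candidates, {"c": "p"}, n_parent, lambda x1, x2: False)
```

In the toy instance the only type (3) clause excludes choosing `n3` for `c` together with `n1` for `p`, because `n1` is not an ancestor of `n3`.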

**Lemma 6.12.** *Let u be a vertex in T and let y be a vertex in N such that for each descendant d of y in N it holds that* $\varphi_{d \to u}$ *is not satisfiable. Then,* $\varphi_{y \to u}$ *is satisfiable if and only if N<sup>y</sup> displays T<sup>u</sup>.*

*Proof.* We start by showing that if *N<sup>y</sup>* displays *T<sup>u</sup>*, then $\varphi_{y \to u}$ is satisfiable. To this end, let *S* be a canonical subtree of *N<sup>y</sup>* that displays *T<sup>u</sup>*. Note that *S* exists due to Lemma 6.8. Let *β* be a truth assignment for $\varphi_{y \to u}$ that sets each variable $x_{z \to v}$ to true if and only if *z* = LCA<sub>*S*</sub>(L(*T<sup>v</sup>*)). We will show that all clauses in Construction 6.11 are satisfied by this assignment. Note that for each *v* ≠ *u* in *T* it holds that *S<sup>z</sup>* displays *T<sup>v</sup>* and *z* ∈ *M*(*v*) where *z* := LCA<sub>*S*</sub>(L(*T<sup>v</sup>*)). Hence each clause of type (1) is satisfied by *β*. Moreover, since the LCA in *S* is unique (as *S* is a tree), also all clauses of type (2) are satisfied by *β*.

Assume towards a contradiction that a clause of type (3) is not satisfied. Then, there is some *v* with parent *p* in *T<sup>u</sup>* such that *y* ≰*<sup>N</sup> z* for some *y* ∈ *M*(*v*) and *z* ∈ *M*(*p*) and $\beta(x_{y \to v}) = \beta(x_{z \to p}) = \text{true}$. Since L(*T<sup>p</sup>*) ⊇ L(*T<sup>v</sup>*), it holds that *y* ≤*<sup>S</sup> z*. Moreover, since *S* is a subtree of *N*, it holds that *y* ≤*<sup>N</sup> z*, contradicting *y* ≰*<sup>N</sup> z*. Thus, all clauses of type (3) are satisfied.

Finally, if a clause of type (4) is not satisfied, then there are $x_{y \to v}$ and $x_{z \to w}$ such that


This contradicts Lemma 6.10 and therefore all clauses of type (4) are satisfied. Since each clause of $\varphi_{y \to u}$ is satisfied by *β*, the formula is satisfiable.

We next show that if $\varphi_{y \to u}$ is satisfiable, then *N<sup>y</sup>* displays *T<sup>u</sup>*. To this end, let *β* be a satisfying truth assignment for $\varphi_{y \to u}$. Let *S* be the subtree of *N<sup>y</sup>* that contains *y* and all leaves *z* such that $\beta(x_{z \to v}) = \text{true}$ for some leaf *v* in *T<sup>u</sup>* (and no other leaves except for possibly *y*). We will show that *S* is canonical for *T<sup>u</sup>*. To this end, we first show that *S* contains each vertex *z* such that $\beta(x_{z \to v}) = \text{true}$ for some vertex *v* in *T<sup>u</sup>*. Note that $\varphi_{y \to u}$ contains for each *v* ≠ *u* in *T<sup>u</sup>* at most one vertex *z* such that $\beta(x_{z \to v}) = \text{true}$ as otherwise the respective clause of type (2) would not be satisfied by *β*. It also contains at least one such vertex as otherwise the clause of type (1) would not be satisfied. For the sake of readability, we will denote this unique vertex *z* by *ψ*(*v*) for each vertex *v*. As a special case, we define *ψ*(*u*) := *y*. We will show by induction over the height of *v* that *ψ*(*v*) is contained in *S* and that $S_{\psi(v)}$ displays *T<sup>v</sup>*. The height of a vertex *v* in a tree is the maximum distance between *v* and a descendant of *v*. If *v* is a leaf, then *ψ*(*v*) is by definition contained in *S*, and $S_{\psi(v)}$ displays *T<sup>v</sup>*. If *v* is not a leaf, then let *c* be a child of *v* in *T<sup>u</sup>*. Since *c* has smaller height than *v* in *T<sup>u</sup>*, it holds by induction hypothesis that *ψ*(*c*) is contained in *S*. If *ψ*(*v*) was not contained in *S*, then *ψ*(*v*) is not an ancestor of *ψ*(*c*). This, however, contradicts the clause of type (3). Hence, each vertex *ψ*(*v*) for some vertex *v* ≠ *u* in *T<sup>u</sup>* is contained in *S*. It remains to show that $S_{\psi(v)}$ displays *T<sup>v</sup>*. Assume towards a contradiction that $S_{\psi(v)}$ does not display *T<sup>v</sup>*. By Lemma 6.5, there are vertices *w* in *T<sup>v</sup>* and *q* in $S_{\psi(v)}$ and leaves

$$a \in \mathcal{L}(S_q) \setminus \mathcal{L}(T_w), \ b \in \mathcal{L}(T_w) \setminus \mathcal{L}(S_q), \ \text{and} \ c \in \mathcal{L}(T_w) \cap \mathcal{L}(S_q).$$

On the one hand, note that *a <<sup>S</sup> q* and *c <<sup>S</sup> q* and therefore there is a highest ancestor *α* of *a* in *T* with *ψ*(*α*) ≤*<sup>S</sup> q* and a highest ancestor *γ* of *c* in *T* with *ψ*(*γ*) ≤*<sup>S</sup> q*. By the definitions of *α* and *γ*, there are parents $p_\alpha$ and $p_\gamma$ of *α* and *γ*, respectively, such that $\psi(p_\alpha) \not\leq_S q$ and $\psi(p_\gamma) \not\leq_S q$. Hence, the ascending paths of *ψ*(*α*) and *ψ*(*γ*), respectively, share *q* as an inner vertex and the arc ($p_q$, *q*) where $p_q$ is the parent of *q* in *S*. Note that *q* has a parent as there is a leaf with label *b* that is not contained in *S<sup>q</sup>* but in $S_{\psi(v)}$. On the other hand, note that *α <<sup>T</sup> w* and *γ* ≰*<sup>T</sup> w*, implying that *α* and *γ* are not siblings in *T*, contradicting the assumption that all clauses of type (4) are satisfied by *β*.

We next show our main result in this chapter, that is, a *k*-SAT program for Soft Tree Containment on *k*-labeled phylogenetic trees for each *k* ≥ 2. We mention that the program resulting from Soft Tree Containment on single-labeled phylogenetic trees contains clauses with two literals (the clauses of types (3) and (4) in Construction 6.11). Since 2-SAT formulas are linear-time solvable, the following result proves that Soft Tree Containment on single-labeled phylogenetic trees is polynomial-time solvable. In the paper on which this chapter is based, we also present a linear-time algorithm for Soft Tree Containment on single-labeled phylogenetic trees [BMW18].
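Linear-time 2-SAT solving is used here as a black box. For illustration, the following sketch implements the standard implication-graph approach: each clause (*a* ∨ *b*) yields the implications ¬*a* → *b* and ¬*b* → *a*, and the formula is satisfiable if and only if no variable shares a strongly connected component with its negation. The sketch uses Kosaraju's two-pass algorithm for the components; the function names and the integer encoding of literals are assumptions made for this example and are not taken from the thesis.

```python
def solve_2sat(num_vars, clauses):
    """Solve a 2-SAT formula; literal i means x_i, -i means not x_i (1-based).

    Returns a list of booleans (entry i-1 for x_i) or None if unsatisfiable.
    """
    def node(lit):
        return 2 * (abs(lit) - 1) + (1 if lit < 0 else 0)

    n = 2 * num_vars
    graph = [[] for _ in range(n)]
    rgraph = [[] for _ in range(n)]
    for a, b in clauses:
        # (a or b) is equivalent to (not a -> b) and (not b -> a).
        for src, dst in ((node(-a), node(b)), (node(-b), node(a))):
            graph[src].append(dst)
            rgraph[dst].append(src)

    # First pass: record vertices by finishing time (iterative DFS).
    seen = [False] * n
    order = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack = [(start, 0)]
        while stack:
            v, i = stack.pop()
            if i < len(graph[v]):
                stack.append((v, i + 1))
                w = graph[v][i]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(v)

    # Second pass on the reversed graph: components are numbered in
    # topological order of the implication graph.
    comp = [-1] * n
    count = 0
    for start in reversed(order):
        if comp[start] != -1:
            continue
        comp[start] = count
        stack = [start]
        while stack:
            v = stack.pop()
            for w in rgraph[v]:
                if comp[w] == -1:
                    comp[w] = count
                    stack.append(w)
        count += 1

    assignment = []
    for i in range(num_vars):
        if comp[2 * i] == comp[2 * i + 1]:
            return None  # x_i and not x_i imply each other
        assignment.append(comp[2 * i] > comp[2 * i + 1])
    return assignment


def satisfies(assignment, clauses):
    """Check that every clause has at least one true literal."""
    def value(lit):
        v = assignment[abs(lit) - 1]
        return v if lit > 0 else not v
    return all(value(a) or value(b) for a, b in clauses)


formula = [(1, 2), (-1, 2), (-2, 3)]
solution = solve_2sat(3, formula)
unsat = solve_2sat(1, [(1, 1), (-1, -1)])  # unit clauses written as (a, a)
```

Clauses with a single literal, such as the type (1) clauses for single-labeled trees, can be encoded by repeating the literal, as in the last line.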

**Theorem 6.13.** *For each k* ≥ 2*, one can decide in O*(*n*<sup>5</sup> · *k*<sup>2</sup>) *time whether a k-labeled phylogenetic tree N softly displays a single-labeled phylogenetic tree T using O*(*n*<sup>2</sup>) *queries of size O*(*n*<sup>2</sup> · *k*<sup>2</sup>) *to k*-SAT*.*

*Proof.* The algorithm computes for each vertex *u* in *T* at most *k* vertices *M*(*u*) such that for each *v* ∈ *M*(*u*) the subtree *N<sup>v</sup>* displays *T<sup>u</sup>* and for no descendant *w* of *v* it holds that *N<sup>w</sup>* displays *T<sup>u</sup>*. It computes this set *M*(*u*) bottom-up for each vertex *u* in *T*. The pseudo-code is given in Algorithm 6.1. All possible candidates for vertices in *M*(*u*) that are found by the algorithm are compared in Line 16 and all non-minima are removed. Hence, the set *M*(*u*) computed by the algorithm only contains minima. Hence, it remains to show that if for a vertex *v* in *N* it holds for no descendant *w* of *v* that *N<sup>w</sup>* displays *T<sup>u</sup>*, then *v* ∈ *M*(*u*) if and only if *N<sup>v</sup>* displays *T<sup>u</sup>*. Let *v* be such a vertex. Note that since for no descendant *w* of *v* it holds that *N<sup>w</sup>* displays *T<sup>u</sup>*, it holds by Lemma 6.12 that $\varphi_{w \to u}$ is not satisfiable for any descendant *w* of *v* in *N*. Hence, Lemma 6.12 states that $\varphi_{v \to u}$ is satisfiable if and only if *N<sup>v</sup>* displays *T<sup>u</sup>*. Thus, it remains to show that *v* ∈ *M*(*u*) if and only if $\varphi_{v \to u}$ is satisfiable. To this end, note that since *N<sup>v</sup>* displays *T<sup>u</sup>* it also displays *T<sup>c</sup>* for any descendant *c* of *u* in *T*. Let *c* be the child of *u* chosen in Line 6 and let *v*′ ∈ *M*(*c*) be a descendant of *v* (or *v*′ = *v*). We now consider the iteration of Line 7 where *w* = *v*′. If *v*′ = *v*, then the algorithm adds *v* to *M*(*u*). If *v*′ ≠ *v*, then note that *v*′ is a descendant of *v* and hence $\varphi_{v' \to u}$ is not satisfiable. The algorithm then iteratively tries each ancestor *v*<sup>∗</sup> of *v*′ and checks whether $\varphi_{v^* \to u}$ is satisfiable. The formula $\varphi_{v^* \to u}$ is not satisfiable for each descendant *v*<sup>∗</sup> of *v* and hence eventually $\varphi_{v \to u}$ is tested. By assumption, $\varphi_{v \to u}$ is satisfiable and hence *v* is added to *M*(*u*). Thus, the set *M*(*u*) is computed correctly by Algorithm 6.1 for each vertex *u* in *T*. Finally, observe that *N* displays *T* if and only if *M*(*r*) ≠ ∅ where *r* is the root of *T*.

**Algorithm 6.1:** A *k*-SAT program for Soft Tree Containment on *k*-labeled phylogenetic trees.

**Input:** A *k*-labeled phylogenetic tree *N* and a single-labeled phylogenetic tree *T*.

**Output:** true if *N* displays *T* and false otherwise.

**1** *r* ← root of *T*
**2** **foreach** vertex *u* in *T* **do** // in a bottom-up manner
**3**   *M*(*u*) ← ∅
**4**   **if** *u* is a leaf in *T* **then** *M*(*u*) ← {*v* ∈ *N* | L(*v*) = L(*u*)}
**5**   **else**
**6**     *c* ← any child of *u* in *T* // *c* can be chosen arbitrarily
**7**     **foreach** *w* ∈ *M*(*c*) **do**
**8**       *w*′ ← *w*
**9**       **while** *w*′ ≠ ⊥ **do**
**10**         construct $\varphi_{w' \to u}$
**11**         **if** $\varphi_{w' \to u}$ is satisfiable **then**
**12**           *M*(*u*) ← *M*(*u*) ∪ {*w*′}
**13**           *w*′ ← ⊥
**14**         **else**
**15**           *w*′ ← parent of *w*′ in *N* // if *w*′ is the root of *N*, then *w*′ ← ⊥
**16**     **if** ∃*a, b* ∈ *M*(*u*). $a <_N b$ **then** remove *b* from *M*(*u*)
**17** **if** *M*(*r*) ≠ ∅ **then return** true
**18** **else return** false

It remains to analyze the number and sizes of the constructed formulas and the running time of the algorithm. Note that all clauses of type (1) are of size at most *k* and all other clauses are of size at most 2. Hence for each *k* ≥ 2 the resulting formulas are *k*-SAT formulas. We first analyze the size of each formula. Note that there are *O*(*n*) clauses of type (1), *O*(*n* · *k*<sup>2</sup>) clauses of type (2) and (3), and *O*(*n*<sup>2</sup> · *k*<sup>2</sup>) clauses of type (4). Since only clauses of type (1) are not of constant size, each formula is of size *O*(*n*<sup>2</sup> · *k*<sup>2</sup>).

Note that we construct at most one formula $\varphi_{v \to u}$ for each pair (*v, u*) of vertices where *v* is a vertex of *N* and *u* is a vertex in *T*. Hence, there are at most *n*<sup>2</sup> such formulas. It remains to analyze the running time of the algorithm (excluding the steps to solve the *k*-SAT formulas). The running time is dominated by the time to construct all formulas. Since we construct *O*(*n*<sup>2</sup>) formulas of size at most *O*(*n*<sup>2</sup> · *k*<sup>2</sup>), it remains to analyze the running time to construct each clause. Clauses of type (1) and (2) take constant time per literal. Clauses of type (3) and (4) take *O*(*n*) time to construct. Thus, the overall running time is *O*(*n*<sup>2</sup> · (*n*<sup>2</sup> · *k*<sup>2</sup>) · *n*) = *O*(*n*<sup>5</sup> · *k*<sup>2</sup>).

A direct consequence of Theorem 6.13 is that Soft Tree Containment on 2-labeled phylogenetic trees can be solved in *O*(*n*<sup>5</sup>) time. This is a somewhat surprising application of 2-SAT programming as it is not apparent that the difference between 2-labeled phylogenetic trees and 3-labeled phylogenetic trees and the difference between 3-labeled phylogenetic trees and 4-labeled phylogenetic trees should be very dissimilar.

**Corollary 6.14.** *It can be verified in O*(*n*<sup>5</sup>) *time whether a* 2*-labeled phylogenetic tree N softly displays a single-labeled phylogenetic tree T.*

We remark that this running time is not optimized and a more careful analysis using the amortized running time leads to a cubic running time [BMW18].

### **6.3.2 Reduction from** *k***-SAT**

In this subsection, we supplement the result from the previous subsection by showing that *k*-SAT reduces to Soft Tree Containment on *k*-labeled trees. As a consequence, Soft Tree Containment is *NP*-hard even when restricted to 3-labeled phylogenetic trees. To this end, we make a slight detour and first show a reduction from *k*-SAT to a rather technical-looking version of Independent Set that will turn out to be equivalent to a very natural variant of Colorful Independent Set. From this variant of Colorful Independent Set, we will then show a reduction to Soft Tree Containment on *k*-labeled trees.

The mentioned variant of Independent Set is based on the notion of *A* ⋈ *B* graphs. Therein, *A* and *B* are graph classes and a graph *G* = (*V, E*) is in *A* ⋈ *B* if its edge set *E* can be partitioned into two sets *E*<sup>1</sup> and *E*<sup>2</sup> of edges such that *G*<sup>1</sup> := (*V, E*<sup>1</sup>) is in graph class *A* and *G*<sup>2</sup> := (*V, E*<sup>2</sup>) is in *B* [BBN19]. We are interested in the case where *A* is the disjoint union of *P*<sup>3</sup>'s and *B* is the disjoint union of cliques of size at most *k*. Disjoint unions of cliques are also known as cluster graphs. This leads to the following special case of Independent Set.

*P*<sup>3</sup> ⋈ Cluster Independent Set

**Input:** An integer *ℓ* and a graph *G* := (*V, E*) where *E* = *E*<sup>1</sup> ⊎ *E*<sup>2</sup> such that *G*<sup>1</sup> := (*V, E*<sup>1</sup>) is a collection of disjoint *P*<sup>3</sup>'s and *G*<sup>2</sup> := (*V, E*<sup>2</sup>) is a cluster graph in which each clique has size at most *k*.

**Question:** Does *G* contain an independent set of size *ℓ*?

Van Bevern et al. [Bev+15] showed via a reduction from 3-SAT that Independent Set is *NP*-hard on *A* ⋈ *B* graphs<sup>1</sup> unless *A* and *B* only contain cluster graphs. We modify their reduction to be able to reduce from *k*-SAT. The basic idea is to represent each clause by a clique and each variable by a cycle of even length. The largest independent set can contain at most half of the vertices in each cycle and at most one vertex from each clique and it contains that many vertices if and only if the *k*-SAT formula is satisfiable. In the following, we denote the number of literals in a clause *C* by |*C*|. Note that we can assume without loss of generality that each variable occurs at most once in each clause as otherwise the clause is either trivially satisfied (if one occurrence is positive and the other negative) or one of the literals can be removed (if both occurrences are positive or both are negative). Moreover, we assume that each variable occurs at least twice in the formula as we can otherwise always satisfy the clause in which the variable occurs.

**Construction 6.15.** Consider an instance *φ* of *k*-SAT. Let *φ* have *n* variables $x_1, x_2, \ldots, x_n$ and *m* clauses $C_1, C_2, \ldots, C_m$ such that each variable occurs at least twice in *φ* and at most once in each clause. For each variable $x_i$ let $J_i$ be the list of indices of clauses that contain $x_i$ or $\neg x_i$ and let $J_i[\ell]$ denote the $\ell$th element of this list. Construct a graph $G := (V, E_1 \uplus E_2)$ as follows. For each variable $x_i$ construct a cycle $V_i$ of $2|J_i|$ vertices $u_i^1, \bar{u}_i^1, u_i^2, \bar{u}_i^2, \ldots, u_i^{|J_i|}, \bar{u}_i^{|J_i|}$ such that $\bar{u}_i^k$ is adjacent to $u_i^k$ for each $k \in [|J_i|]$ and to $u_i^{k+1}$ for each $k \in [|J_i| - 1]$, and $u_i^1$ and $\bar{u}_i^{|J_i|}$ are adjacent. We call $V_i$ a variable gadget. For each clause $C_j$ that contains variables $x_{a_1}, x_{a_2}, \ldots, x_{a_{|C_j|}}$ construct a clique that contains vertices $w_j^{a_1}, w_j^{a_2}, \ldots, w_j^{a_{|C_j|}}$. We call this clique a clause gadget. For each variable $x_i$ and each $\ell \in [|J_i|]$, connect $w^i_{J_i[\ell]}$ to $\bar{u}_i^{\ell}$ if $C_{J_i[\ell]}$ contains $x_i$ and to $u_i^{\ell}$ if $C_{J_i[\ell]}$ contains $\neg x_i$. The edge set $E_1$ consists of all edges between two vertices *v* and *w* where *v* is contained in a variable gadget and *w* is contained in a clause gadget. Moreover, $E_1$ contains the edge $\{u_i^k, \bar{u}_i^k\}$ for each variable gadget $V_i$ and each $k \in [|J_i|]$. The edge set $E_2$ contains all constructed edges that are not contained in $E_1$.

<sup>1</sup>We remark that *A* and *B* have to be closed under disjoint union and taking an induced subgraph. Moreover, at least one graph in *A* and one graph in *B* has to contain an edge.

**Figure 6.5:** Illustration of a small extract of the resulting graph of Construction 6.15. The triangle in the middle is the clause gadget for a clause of size three and the two cycles left and right are variable gadgets corresponding to variables that occur in this clause. The thin edges are contained in *E*<sup>1</sup> and the bold edges are contained in *E*<sup>2</sup>.

See Figure 6.5 for an illustration of Construction 6.15. We show that the graph *G*<sup>1</sup> := (*V, E*<sup>1</sup>) consists only of disjoint *P*<sup>3</sup>'s. Note that *E*<sup>1</sup> contains all edges $\{u_i^k, \bar{u}_i^k\}$ and exactly one of the two vertices in $\{u_i^k, \bar{u}_i^k\}$ is adjacent to a vertex in a clause gadget in *G*<sup>1</sup>. Since each vertex in a clause gadget has degree exactly one in *G*<sup>1</sup>, this proves that *G*<sup>1</sup> only consists of disjoint *P*<sup>3</sup>'s. Next, observe that *G*<sup>2</sup> := (*V, E*<sup>2</sup>) consists of disjoint cliques of size at most *k*. In each variable gadget it contains every other edge, that is, a matching (disjoint cliques of size two) and it contains all edges between vertices in clause gadgets which are by definition of size at most *k*. We next show that Construction 6.15 is correct.
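The construction and the two structural claims above can be checked on small formulas. The sketch below builds the two edge sets from a list of clauses, tests that the first edge set forms disjoint paths on three vertices, and searches by brute force for an independent set whose size equals the number of cliques in the second graph. It is an illustration under stated assumptions rather than the thesis's implementation: the literal encoding (signed integers), the vertex naming, the choice of which cycle vertex a positive occurrence attaches to, and all function names are hypothetical, and the exhaustive search is only feasible for toy instances.

```python
from itertools import combinations


def build_graph(clauses):
    """Return (E1, E2, ell) for clauses given as lists of signed integers."""
    occurrences = {}  # variable i -> list J_i of clause indices
    for j, clause in enumerate(clauses):
        for lit in clause:
            occurrences.setdefault(abs(lit), []).append(j)
    e1, e2 = set(), set()
    for i, js in occurrences.items():
        for pos, j in enumerate(js):
            u, ubar = ("u", i, pos), ("ubar", i, pos)
            e1.add(frozenset((u, ubar)))
            # Close the cycle: ubar at position pos is followed by the next u.
            e2.add(frozenset((ubar, ("u", i, (pos + 1) % len(js)))))
            # Assumed convention: a positive occurrence attaches to ubar.
            positive = i in clauses[j]
            e1.add(frozenset((("w", j, i), ubar if positive else u)))
    for j, clause in enumerate(clauses):
        for a, b in combinations(clause, 2):
            e2.add(frozenset((("w", j, abs(a)), ("w", j, abs(b)))))
    # One clique per matching edge in a variable gadget plus one per clause.
    ell = sum(len(js) for js in occurrences.values()) + len(clauses)
    return e1, e2, ell


def is_disjoint_p3s(e1):
    """Check that every connected component has 3 vertices and 2 edges."""
    adj = {}
    for edge in e1:
        a, b = tuple(edge)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen = set()
    for start in adj:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            for nxt in adj[stack.pop()]:
                if nxt not in component:
                    component.add(nxt)
                    stack.append(nxt)
        seen |= component
        if len(component) != 3 or sum(len(adj[v]) for v in component) != 4:
            return False
    return True


def has_independent_set(edges, size):
    """Brute-force search; exponential, for toy instances only."""
    vertices = sorted({v for edge in edges for v in edge})
    return any(
        all(frozenset(pair) not in edges for pair in combinations(cand, 2))
        for cand in combinations(vertices, size)
    )


sat_e1, sat_e2, sat_ell = build_graph([[1, 2], [-1, 2]])  # satisfiable
uns_e1, uns_e2, uns_ell = build_graph([[1], [-1]])        # unsatisfiable
sat_found = has_independent_set(sat_e1 | sat_e2, sat_ell)
uns_found = has_independent_set(uns_e1 | uns_e2, uns_ell)
```

On the two toy formulas, the search succeeds for the satisfiable one and fails for the unsatisfiable one, in line with the equivalence stated in Lemma 6.16.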

**Lemma 6.16.** *Let φ be an instance of k*-SAT *in which each variable occurs at least twice in φ and at most once in each clause. Then, φ is satisfiable if and only if the graph G* := (*V, E*<sup>1</sup> ⊎ *E*<sup>2</sup>) *resulting from Construction 6.15 has an independent set of size ℓ where ℓ is the number of cliques in G*<sup>2</sup> := (*V, E*<sup>2</sup>)*.*

*Proof.* We start by showing that if *G* contains an independent set of size *ℓ*, then *φ* is satisfiable. To this end, let *I* be an independent set of size *ℓ* in *G*. Note that *I* contains exactly one vertex from each clique in *G*<sup>2</sup> and therefore for each variable gadget *V<sup>i</sup>* it contains either $u_i^1$ or $\bar{u}_i^1$. By construction of *V<sup>i</sup>*, it holds that if $u_i^1 \in I$, then $u_i^{\ell} \in I$ for all $\ell \in [|J_i|]$. Analogously, if $\bar{u}_i^1 \in I$, then $\bar{u}_i^{\ell} \in I$ for all $\ell \in [|J_i|]$. We now describe how to construct a satisfying truth assignment for *φ*. For each variable *x<sup>i</sup>*, we set *β*(*x<sup>i</sup>*) := true if $u_i^1 \in I$ and *β*(*x<sup>i</sup>*) := false if $\bar{u}_i^1 \in I$. It remains to show that this truth assignment *β* satisfies all clauses in *φ*. To this end, consider any clause *C<sup>j</sup>*. Since *I* contains exactly one vertex from each clause gadget (each such gadget induces a clique in *G*<sup>2</sup>), it holds that $w_j^i \in I$ for some *i* ∈ [|*C<sup>j</sup>*|]. By construction, the variable *x<sup>i</sup>* occurs in *C<sup>j</sup>* (exactly once). If *C<sup>j</sup>* contains the literal ¬*x<sup>i</sup>*, then $w_j^i$ is adjacent to $u_i^h$ for some $h \in [|J_i|]$ and, since *I* is an independent set, *I* does not contain $u_i^h$. Thus, $\bar{u}_i^1 \in I$ and therefore *β*(*x<sup>i</sup>*) = false and *C<sup>j</sup>* is satisfied by *β*. If *C<sup>j</sup>* contains the literal *x<sup>i</sup>*, then $w_j^i$ is adjacent to $\bar{u}_i^h$ for some $h \in [|J_i|]$ and analogously $u_i^1 \in I$. Thus, *C<sup>j</sup>* is satisfied by *β* as *β*(*x<sup>i</sup>*) = true. Since each clause is satisfied by *β*, this concludes the first direction of the proof.

It remains to show that if $\varphi$ is satisfiable, then $G$ contains an independent set of size $\ell$. Let $\beta$ be a satisfying truth assignment for $\varphi$. We construct an independent set $I$ of size $\ell$ for $G$ as follows. For each variable $x_i$, if $\beta(x_i) = \text{true}$, then $I$ contains all vertices $u_i^h$ for $h \in [|J_i|]$ and if $\beta(x_i) = \text{false}$, then $I$ contains all vertices $\bar{u}_i^h$ for $h \in [|J_i|]$. For each clause $C_j$, let $x_i$ be a variable that satisfies $C_j$ under assignment $\beta$ and let $I$ contain $w_j^i$. Observe that $I$ is of size $\ell$ as it contains exactly one vertex of each clique in $G_2$. It remains to show that $I$ is indeed an independent set. Assume towards a contradiction that $I$ is not an independent set. Then it contains two adjacent vertices. Note that it does not contain two adjacent vertices from variable gadgets as it contains every second vertex from the respective cycle. It does not contain two adjacent vertices from clause gadgets either as it contains exactly one vertex from each clause gadget and vertices from different clause gadgets are not adjacent in $G$. Hence, $I$ contains a vertex $w_j^i$ from a clause gadget and a vertex $v$ from a variable gadget such that $v$ and $w_j^i$ are adjacent. If $w_j^i \in I$, then $x_i$ satisfies $C_j$ by construction, that is, $\beta(x_i) = \text{true}$ if $C_j$ contains the literal $x_i$ and $\beta(x_i) = \text{false}$ if $C_j$ contains the literal $\neg x_i$. We distinguish between the two cases where $C_j$ contains the literal $x_i$ or the literal $\neg x_i$. If $C_j$ contains the literal $x_i$, then by construction $w_j^i$ is only adjacent to vertices in the clause gadget for $C_j$ and to $\bar{u}_i^h$ for some $h \in [|J_i|]$. Since $\beta(x_i) = \text{true}$, it holds that $u_i^h \in I$ and $\bar{u}_i^h \notin I$, a contradiction to the assumption that $w_j^i$ has a neighbor in $I$. If $C_j$ contains $\neg x_i$, then by construction $w_j^i$ is only adjacent to vertices in the clause gadget for $C_j$ and to $u_i^h$ for some $h \in [|J_i|]$. Since $\beta(x_i) = \text{false}$, it holds that $u_i^h \notin I$, which is again a contradiction to the assumption that $w_j^i$ has a neighbor in $I$. Thus, $I$ is an independent set, which concludes the proof.

**Figure 6.6:** A gadget for a $P_3$ in Construction 6.17 where the inner vertex has a green color (the two upper leaves) and the end vertices have red (bottom left) and yellow color (bottom right), respectively. The triangles and squares represent leaves of different labels. There are six different labels in this phylogenetic tree (which are represented by a red square, a red triangle, a yellow square, a yellow triangle, a green square, and a green triangle, respectively).

Note that, by construction, the independent set has to contain exactly one vertex from each clique in *G*2. This is equivalent to giving each vertex in *G*<sup>1</sup> a color that represents in which clique in *G*<sup>2</sup> the vertex is contained and asking for a colorful independent set, that is, an independent set which contains exactly one vertex of each color. Hence, Lemma 6.16 implies that *k*-SAT reduces to Colorful Independent Set on disjoint *P*3's where each color appears at most *k* times and no *P*<sup>3</sup> contains two vertices of the same color. We next reduce this variant of Colorful Independent Set to Soft Tree Containment on *k*-labeled trees. The basic idea is to construct a gadget as shown in Figure 6.6 for each *P*<sup>3</sup> in the input graph and connect all of these gadgets by an arbitrary binary tree whose leaves are the roots of the respective gadgets. By Lemma 6.8, if this phylogenetic tree *N* displays *T*, then it contains a single-labeled phylogenetic tree *S* that displays *T*. We will show that *S* can contain either a leaf that represents the inner vertex in the respective *P*<sup>3</sup> or only leaves that represent end vertices of the respective *P*3. Hence, we can use *S* to construct an independent set.
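The intermediate problem named above, Colorful Independent Set on disjoint $P_3$'s, can be made concrete with a small brute-force checker. This is only an illustrative sketch and not part of the reduction: the function name, the triple encoding of a $P_3$, and the color dictionary are assumptions made for this example, and the running time is exponential in the number of colors.

```python
from itertools import product


def colorful_independent_set(paths, colors):
    """Brute-force search for a colorful independent set.

    paths  -- list of P3's, each a triple (u, v, w) with edges {u, v} and {v, w}
    colors -- dict mapping every vertex to its color (hypothetical encoding)
    Returns a set with exactly one vertex per color and no two adjacent
    vertices, or None if no such set exists.
    """
    edges = set()
    for u, v, w in paths:
        edges.add(frozenset((u, v)))
        edges.add(frozenset((v, w)))

    # Group the vertices by color; a colorful set picks one vertex per group.
    by_color = {}
    for vertex, color in colors.items():
        by_color.setdefault(color, []).append(vertex)

    for choice in product(*by_color.values()):
        independent = all(
            frozenset((a, b)) not in edges
            for k, a in enumerate(choice)
            for b in choice[k + 1:]
        )
        if independent:
            return set(choice)
    return None


# Two disjoint P3's; each color occurs twice and no P3 repeats a color.
example_paths = [("a", "b", "c"), ("d", "e", "f")]
example_colors = {"a": "red", "b": "green", "c": "yellow",
                  "d": "green", "e": "red", "f": "yellow"}
print(colorful_independent_set(example_paths, example_colors))
```

In the example, picking the two end vertices of the first path forces the green vertex to come from the second path, which mirrors the "inner vertex or end vertices" choice that the gadgets encode.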

**Figure 6.7:** Illustration of Construction 6.17.

**Left:** The initial instance of Colorful Independent Set on disjoint $P_3$'s with 4 colors (red, blue, green, and yellow) where each color occurs at most thrice. The encircled vertices represent a solution.

**Middle:** The binary 3-labeled phylogenetic tree $N$ resulting from Construction 6.17. The highlighted edges represent the single-labeled subtree $S$ of $N$ that displays $T$ and that corresponds to the marked solution on the left-hand side.

**Right:** The single-labeled phylogenetic tree $T$ resulting from Construction 6.17.

**Construction 6.17.** Given a vertex-colored collection $G := (V, E)$ of $P_3$'s where each color occurs at most $k$ times, we construct a $k$-labeled phylogenetic tree $N$ and a single-labeled phylogenetic tree $T$ as follows. Both phylogenetic trees contain two different labels $i_1$ and $i_2$ for each color $i$ in $G$.

Construct $T$ by first creating a star that has exactly one leaf of each color occurring in $G$. Then, for each leaf $x$ with color $i$, attach two new leaves to $x$, labeled with $i_1$ and $i_2$, respectively. Since $x$ is not a leaf any more, its label is removed.

The *k*-labeled phylogenetic tree *N* is constructed as follows. We start with a gadget as shown in Figure 6.6 for each *P*<sup>3</sup> = (*u, v, w*) in the input graph where red, green, and yellow denote the colors of *u*, *v*, and *w*, respectively. Therein, a triangle of color *i* represents a leaf labeled with *i*<sup>1</sup> and a square of color *i* represents a leaf labeled with *i*2. Finally, add an arbitrary binary tree that has a leaf for each *P*<sup>3</sup> in *G* and identify each such leaf with the root of the respective constructed gadgets.
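The single-labeled tree $T$ of the construction above has a very regular shape, which the following sketch makes explicit. It builds only $T$ (the gadgets of $N$ depend on Figure 6.6 and are not modeled here); the child-list dictionary, the vertex names `"root"` and `("inner", c)`, and the label encoding `(c, 1)`, `(c, 2)` are assumptions chosen for this illustration.

```python
def build_T(colors):
    """Build the single-labeled tree T as a dict: vertex -> list of children.

    colors -- iterable of the colors occurring in the input collection G
    The star center is called "root"; each former star leaf of color c becomes
    the unlabeled inner vertex ("inner", c) with the two leaves (c, 1) and
    (c, 2), standing for the labels c_1 and c_2.
    """
    children = {"root": []}
    for c in sorted(set(colors)):
        inner = ("inner", c)
        children["root"].append(inner)
        children[inner] = [(c, 1), (c, 2)]
        children[(c, 1)] = []
        children[(c, 2)] = []
    return children


T = build_T(["red", "green", "yellow", "red"])
leaves = [v for v, ch in T.items() if not ch]
print(sorted(leaves))
```

Every pair of leaves $c_1$, $c_2$ forms a cherry below its own inner vertex, so the only triplets that $T$ displays firmly are of the form $c_1 c_2 | d$ for a label $d$ of a different color.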

An example of Construction 6.17 is given in Figure 6.7. We conclude this subsection with the proof that *k*-SAT reduces to Soft Tree Containment on *k*-labeled phylogenetic trees and a simple corollary that states *NP*-hardness.

**Proposition 6.18.** *k*-SAT *reduces for each k to* Soft Tree Containment *on binary k-labeled phylogenetic trees.*

*Proof.* Note that $k$-SAT reduces by Lemma 6.16 to Colorful Independent Set on disjoint $P_3$'s where each color appears at most $k$ times. We then apply Construction 6.17 to the constructed instance of Colorful Independent Set. Since the resulting phylogenetic tree from Construction 6.17 is $k$-labeled and binary, it remains to show that this construction is correct, that is, $N$ displays $T$ if and only if the given collection $G := (V, E)$ of $P_3$'s has a colorful independent set.

We first show that if $N$ displays $T$, then there is a colorful independent set in $G$. If $N$ displays $T$, then, by Lemma 6.8, $N$ contains a single-labeled phylogenetic tree $S$ that displays $T$. By Observation 6.1, this is equivalent to $T$ displaying $S$. Let $Q$ be the set of vertices in $G$ such that for each vertex $v \in Q$ of color $c$, $S$ contains a leaf with label $c_1$ in the gadget for the respective $P_3$ that $v$ is in. Since $S$ displays $T$, it contains a leaf with label $c_1$ and a leaf with label $c_2$ for each color $c$. Moreover, since $S$ is single-labeled, it contains exactly one leaf with label $c_1$ for each color $c$ and therefore $Q$ is colorful, that is, it contains exactly one vertex of each of the colors in $G$. Hence, it remains to show that $Q$ is an independent set in $G$. Assume towards a contradiction that $Q$ is not an independent set, that is, it contains two adjacent vertices $u$ and $w$. Let without loss of generality $u$ be an inner vertex in a $P_3$ and let $c$ and $d$ be the colors of $u$ and $w$, respectively. Since $u, w \in Q$, it holds that $S$ contains the leaf with label $c_1$ and the leaf with label $d_1$ in the same gadget. By construction, $S$ displays the triplet $c_1 d_1 | c_2$ and $T$ displays the triplet $c_1 c_2 | d_1$ firmly. By Lemma 6.6(b), this contradicts the fact that $T$ displays $S$.

We conclude the proof by showing that if *G* contains a colorful independent set *I*, then *N* displays *T*. To this end, let *I* be a colorful independent set in *G*. We will show that there is a single-labeled subtree *S* of *N* that displays *T*. This implies by Lemma 6.8 that *N* displays *T*. For each vertex *v* ∈ *I* of color *c*, let *S* contain the two leaves with labels *c*<sup>1</sup> and *c*<sup>2</sup> in the gadget for the respective *P*<sup>3</sup> that *v* is in. Since *I* is colorful, *S* contains exactly one leaf of each label and it therefore remains to show that *S* displays *T*.

Assume towards a contradiction that $S$ does not display $T$. This is, by Observation 6.1, equivalent to $T$ not displaying $S$. In this case, there is, by Lemma 6.6, a triplet $xy|z$ that is firmly displayed by $S$ but not softly displayed by $T$. By Observation 6.3(b), $T$ then displays one of the triplets $xz|y$ or $yz|x$ firmly. Let $T$ without loss of generality display $yz|x$ firmly. By construction of $T$, it holds that $y := c_1$ and $z := c_2$ (or $y := c_2$ and $z := c_1$) for some color $c$. By construction of $S$, it holds that the two leaves labeled with $c_1$ and $c_2$ in $S$ are in the same gadget. Hence, $c_1$ and $c_2$ correspond to an inner vertex in the respective $P_3$ as otherwise there is no label $x$ such that $S$ displays the triplet $c_1 x | c_2$ (or $c_2 x | c_1$). By construction, $I$ contains the inner vertex $v$ of color $c$ in the respective $P_3$. Moreover, it holds that the leaf with label $x$ is also contained in the same gadget and thus $I$ contains one of the two end vertices in the same $P_3$ as $v$, a contradiction to the fact that $I$ is an independent set.

Since *k*-SAT is *NP*-hard for each *k* ≥ 3, it holds that Soft Tree Containment is *NP*-hard on binary *k*-labeled phylogenetic trees and, in particular, when restricted to 3-labeled phylogenetic trees.

**Corollary 6.19.** Soft Tree Containment *is NP-hard, even if the input network N is a binary* 3*-labeled phylogenetic tree.*

## **6.4 Concluding Remarks**

We initiated research into a practically relevant variant of Tree Containment handling soft polytomies. We again defer the discussion on 2-SAT programming as a technique to the concluding chapter of this thesis and focus on Soft Tree Containment here. We laid the mathematical foundations for dealing with soft polytomies and showed the dichotomy result that Soft Tree Containment on $k$-labeled phylogenetic trees is polynomial-time solvable if $k \le 2$ and *NP*-hard if $k \ge 3$. Further improving the running time of the polynomial-time algorithm for 2-labeled phylogenetic trees (e. g. within the context of *FPT in P* as done in Chapter 3) and empirically evaluating it on real-world data sets are clear avenues for further research.

Motivated by our hardness result, the search for parameterized or approximation algorithms is another logical next step. Previous work for Tree Containment [GLZ16, Wel18] might lend promising ideas and parameterizations to this effort.

# **Chapter 7**

### **Reachable Objects**

In this chapter, we will investigate a problem from the widely-studied field of resource allocation under preferences, having applications in areas such as artificial intelligence and economics. Conceptually, we will develop a 2-SAT program where the truth assignment of a variable does *not* represent picking some element into a solution or not. It rather represents which of two elements is picked into a solution. These types of 2-SAT programs are so far very rare in the literature. We mention that the 2-SAT program we develop in this chapter does not meaningfully generalize to a $k$-SAT program and therefore the 2-SAT program is not a special case of a reduction to $k$-SAT. In Chapter 8, we will analyze the structure of the problem we study here and observe which structural elements enable 2-SAT programming. This will lead us to a rule of when 2-SAT programming can be a promising tool for solving algorithmic problems.

Regarding resource allocation under preferences, we will investigate the Reachable Object problem which generalizes the well-known Housing Market problem [SS74]. In Reachable Object, agents are organized in a graph and two agents can only swap resources if they share an edge in the graph. This restriction models the situation where not all agents are able to communicate and swap with each other. We start with a dichotomy result regarding the number of objects each agent prefers over its initially held object and continue with investigating the special case where each agent has at most two neighbors in the graph. Using 2-SAT programming, we will show that this special case is polynomial-time solvable. The problem remains *NP*-hard for the case where each agent has at most four neighbors in the graph [SW18].

Resource allocation under preferences is a major topic in society and technology [Wal15]. It has also proven to be a key issue in a world of limited resources and allocating indivisible resources is well-studied in the context of multiagent systems [BCM16]. It has numerous applications e. g. in contexts of food-banks, when sharing charitable donations between cities or communities, or when allocating physical to virtual resources in virtualization technologies [BKN18]. There are several versions studied in the literature that try to optimize for different criteria such as Pareto optimality, fairness, or social welfare [Abr+05, Rot82, SU10].

In the field of resource allocation under preferences, one is interested in distributing a set of (divisible or indivisible) objects among a set of agents who value the objects differently. We focus entirely on indivisible objects here and consider the special case where each agent initially holds exactly one object. While a large body of research in the literature takes a *centralized* approach that globally controls and reallocates an object to each agent, we pursue a *decentralized* strategy where any pair of agents may locally *swap* objects as long as this leads to an improvement for both of them, that is, they both value the object they get over the one they give away [DBC15]. We are then interested in whether there is a sequence of such *rational trades* that leads to a situation where a given agent obtains a given object. Other examples of recently studied problems regarding allocations of indivisible resources under social network constraints are envy-free allocations [Bey+19, BKN18], Pareto-optimal allocations [IP19], and stable matchings [ABH17, AV09].

The main contribution of this chapter is a polynomial-time algorithm for Reachable Object on cycles and the following dichotomy result. If each agent prefers at most two other objects over the object it initially holds, then the problem is linear-time solvable. If some agents prefer more than two objects over their initially held object, then the problem is *NP*-hard. The polynomial-time reduction in the hardness result also shows that the problem remains *NP*-hard if the underlying graph is a clique, that is, all agents can pairwise swap with one another. It might be tempting to think that the hardness then stems from the density of the underlying graph as cycles are very sparse. This assumption is, however, false as the problem is known to be *NP*-hard even when the input graph is a tree [GLW17].

Section 7.2 is dedicated to the dichotomy result and in Section 7.3 we will present our 2-SAT-programming-based polynomial-time algorithm for Reachable Object on cycles. We mention that the positive result in the dichotomy part is based on dynamic programming.

### **7.1 Problem Definition and Related Work**

Let $V := \{1, 2, \ldots, n\}$ be a set of $n$ agents and let $X := \{x_1, x_2, \ldots, x_n\}$ be a set of $n$ objects. Each agent $i \in V$ has a *preference list* over the objects in $X$, which is a strict linear order on $X$. This list is denoted as $\succ_i$ and we omit the subscript $i$ if the agent is clear from the context. For two objects $x_j, x_\ell$, the notation $x_j \succ_i x_\ell$ means that agent $i$ *prefers* $x_j$ *over* $x_\ell$. A *preference profile* $\mathcal{P}$ is a collection $(\succ_i)_{i \in V}$ of preference lists of the agents in $V$. An *assignment* is a bijection $\sigma \colon V \to X$, where each agent $i$ is assigned exactly one object $\sigma(i) \in X$. Since assignments are bijections, we will also use $\sigma^{-1}(x_i)$ to denote the agent that holds $x_i$ in assignment $\sigma$.

Let $G := (V, E)$ be a graph where the set $V$ of agents is the set of vertices. An edge in this graph models that two agents know and trust each other enough to swap objects. We say that an assignment $\sigma$ admits *a rational trade for two agents* $i$ *and* $j$, denoted as $\tau = \{(i, \sigma(i)), (j, \sigma(j))\}$, if the vertices corresponding to $i$ and $j$ are adjacent in the graph ($\{i, j\} \in E$) and each of the two agents prefers the other's assigned object over its own object ($\sigma(j) \succ_i \sigma(i)$ and $\sigma(i) \succ_j \sigma(j)$). After performing the swap specified by $\tau$, agent $i$ holds object $\sigma(j)$, agent $j$ holds object $\sigma(i)$, and the other agents keep their objects. To describe this move, we say that *objects* $\sigma(i)$ *and* $\sigma(j)$ *are swapped over edge* $\{i, j\}$. Sometimes, we also say that object $\sigma(i)$ (or $\sigma(j)$) *passes through* edge $\{i, j\}$ or *moves* from agent $i$ to $j$.

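A *sequence of swaps* is a sequence $(\sigma_0, \sigma_1, \ldots, \sigma_t)$ of assignments where for each index $k \in \{0, 1, \ldots, t-1\}$ there are two agents $i, j \in V$ for which $\sigma_k$ admits a swap $\tau = \{(i, \sigma_k(i)), (j, \sigma_k(j))\}$ such that

1. $\sigma_{k+1}(i) = \sigma_k(j)$ and $\sigma_{k+1}(j) = \sigma_k(i)$, and

2. $\sigma_{k+1}(z) = \sigma_k(z)$ for each agent $z \in V \setminus \{i, j\}$.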


We call an *assignment* $\sigma'$ *reachable from another assignment* $\sigma$ if there is a sequence $(\sigma_0, \sigma_1, \ldots, \sigma_t)$ of swaps such that $\sigma_0 = \sigma$ and $\sigma_t = \sigma'$. We say that *an object* $x \in X$ *is reachable for an agent* $i$ *from a given initial assignment* $\sigma_0$ if there is an assignment $\sigma$ which is reachable from $\sigma_0$ with $\sigma(i) = x$.
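The definitions of rational trades and reachability can be checked directly on tiny instances by exploring all assignments reachable from $\sigma_0$. The sketch below is a plain breadth-first search over assignments and is meant only to illustrate the definitions; it takes exponential time in general and is not one of the algorithms developed in this chapter. The function name, the list-based encoding of preference lists, and the example instance are assumptions made for this illustration.

```python
from collections import deque


def is_reachable(prefs, edges, sigma0, target_agent, target_object):
    """Decide reachability by exhaustive search over assignments.

    prefs  -- dict: agent -> list of ALL objects, most preferred first
    edges  -- iterable of agent pairs (the graph G)
    sigma0 -- dict: agent -> initially held object
    """
    agents = sorted(sigma0)
    pos = {agent: k for k, agent in enumerate(agents)}
    rank = {i: {x: r for r, x in enumerate(prefs[i])} for i in agents}

    start = tuple(sigma0[i] for i in agents)
    seen = {start}
    queue = deque([start])
    while queue:
        sigma = queue.popleft()
        if sigma[pos[target_agent]] == target_object:
            return True
        for i, j in edges:
            xi, xj = sigma[pos[i]], sigma[pos[j]]
            # Rational trade: both agents strictly prefer the other's object.
            if rank[i][xj] < rank[i][xi] and rank[j][xi] < rank[j][xj]:
                nxt = list(sigma)
                nxt[pos[i]], nxt[pos[j]] = xj, xi
                nxt = tuple(nxt)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False


# Hypothetical instance on the path 1 - 2 - 3.
example_prefs = {1: ["x3", "x1", "x2"],
                 2: ["x1", "x3", "x2"],
                 3: ["x2", "x3", "x1"]}
example_edges = [(1, 2), (2, 3)]
example_sigma0 = {1: "x1", 2: "x2", 3: "x3"}
print(is_reachable(example_prefs, example_edges, example_sigma0, 1, "x3"))
```

In this instance, object `x3` first moves over edge $\{2, 3\}$ in exchange for `x2` and then over edge $\{1, 2\}$ in exchange for `x1`, so it is reachable for agent 1, whereas `x2` is not because agent 1 never prefers it over an object it holds.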

With these definitions at hand, we can now define the problem Reachable Object introduced by Gourvès et al. [GLW17] which we study in this chapter.

$$\begin{array}{l} 1: x\_3 \succ x\_4 \succ \overline{\{x\_1\}} \\ 3: x\_1 \succ x\_2 \succ \overline{x\_4} \succ \overline{\{x\_3\}} \\ 5: x\_6 \succ x\_3 \succ \overline{\{x\_5\}} \end{array} \begin{array}{l} 2: x\_1 \succ x\_3 \succ x\_4 \succ \overline{\{x\_2\}} \\ \overline{x\_3} \prec x\_3 \succ x\_3 \succ \overline{\{x\_4\}} \\ 6: x\_4 \succ x\_3 \succ \overline{\{x\_6\}} \end{array}$$

**Figure 7.1:** An example of Reachable Object. The six agents and the graph of agents are depicted on the left-hand side. The preference lists are depicted on the right and the initial assignment $\sigma_0$ is illustrated by the boxes in the preference lists (each agent initially holds the object that is drawn in a box in the agent's preference list). Since no agent will agree on receiving an object in any swap that it does not prefer over its initially held object, these objects are for the sake of readability not depicted in the preference lists. The agent $I$ is agent 1 and $x$ is the object $x_3$. If the underlying graph was complete, then object $x_3$ would be reachable for agent 1 within one swap. However, if the graph is a cycle as shown, then to reach agent 1 object $x_3$ has to be swapped along $\{2, 3\}$ with object $x_2$ first and then along $\{1, 2\}$ with object $x_1$. Note that at both edges both incident agents agree to the swap as agent 3 prefers $x_2$ over $x_3$ and agent 2 prefers $x_3$ over $x_2$ and $x_1$ over $x_3$. Finally, agent 1 prefers $x_3$ over $x_1$.

### Reachable Object

**Input:** A set $V$ of agents, a set $X$ of objects with $|X| = |V|$, a preference profile $\mathcal{P}$, an initial assignment $\sigma_0$, a graph $G := (V, E)$, an agent $I \in V$, and an object $x \in X$.

**Question:** Is *x* reachable for *I* from *σ*0?

An example of Reachable Object is given in Figure 7.1. Note that an agent *i* that gives away a certain object *x<sup>j</sup>* during a sequence of swaps, obtains an object it prefers over *x<sup>j</sup>* and hence agent *i* will not accept object *x<sup>j</sup>* in the future.

**Observation 7.1.** *Let* $\phi := (\sigma_0, \sigma_1, \ldots, \sigma_s)$ *be a sequence of swaps, let* $i$ *be an agent and let* $x_j$ *be an object. If* $\sigma_r(i) = x_j$ *and* $\sigma_{r+1}(i) \neq x_j$ *for some* $r \in [s-1]$*, then* $\sigma_{r'}(i) \neq x_j$ *for all* $r' > r$*.*

Concerning related work, Gourvès et al. [GLW17] introduced Reachable Object and showed that it is *NP*-hard on trees. Moreover, they showed polynomial-time solvability on stars and for a special case on paths, namely when testing whether an object is reachable for an agent positioned on an end vertex of the path. Huang and Xiao [HX20] generalized this special case and showed that Reachable Object on paths is polynomial-time solvable independently of where the target agent $I$ is located on the path. They also considered a version where agents can value different objects equally (that is, the preference lists are not strict) and showed that in this case Reachable Object remains *NP*-hard on paths. Saffidine and Wilczynski [SW18] studied the parameterized complexity of Reachable Object with respect to parameters such as the maximum degree of the input graph or the overall number of swaps allowed in a sequence. They showed that Reachable Object remains *NP*-hard even on graphs with maximum degree at most four. Further, they showed that Reachable Object is *W[1]*-hard when parameterized by the length of the minimum sequence of swaps that leads to agent $I$ obtaining object $x$. Finally, Reachable Object is *NP*-complete on generalized caterpillars where each hair has length at most two and only one vertex has degree larger than two [Ben+19a].

### **7.2 Length of Preference Lists**

In this section, we will show a complexity dichotomy result with regard to the maximum length of a preference list. Notice that each agent initially holds one object and it will never obtain any object that it does not prefer over its initially held object. Thus, we describe the preference list of an agent only up to its initially held object. The length of the preference list of an agent is then defined as the number of objects the agent likes at least as much as its initially held object and the maximum length of a preference list is the length of a longest preference list of any agent.
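As a small illustration of this parameter, the following sketch computes the length of each preference list by cutting it off at the initially held object. The function names and the list-based encoding (most preferred object first) are assumptions for this example only.

```python
def preference_list_length(pref, held):
    """Number of objects the agent likes at least as much as `held`.

    pref -- list of objects, most preferred first (must contain `held`)
    """
    return pref.index(held) + 1


def max_preference_list_length(prefs, sigma0):
    """Maximum length of a preference list over all agents."""
    return max(preference_list_length(prefs[i], sigma0[i]) for i in prefs)


# Hypothetical instance with three agents; agent i initially holds x_i.
demo_prefs = {1: ["x3", "x1", "x2"],
              2: ["x1", "x3", "x2"],
              3: ["x2", "x3", "x1"]}
demo_sigma0 = {1: "x1", 2: "x2", 3: "x3"}
print(max_preference_list_length(demo_prefs, demo_sigma0))
```

Here agents 1 and 3 have lists of length two and agent 2 has a list of length three, so the instance falls into the case with maximum length at most three.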

The parameter maximum length of a preference list is mainly motivated by the following two scenarios. In many applications each agent only knows some of the objects (e. g. potential buyers usually only visited five to ten houses and do not like all of them or when ranking movies each participant has only seen some of the available movies) and in other applications even when all alternatives are known only a few of them are appealing (e. g. when applying for a job or when choosing food). Notably, Saffidine and Wilczynski [SW18] suggested to study Reachable Object with restrictions on the preference lists.

We will show in Subsection 7.2.1 that instances in which the maximum length of a preference list is at most three can be solved in linear time. We complement this result in Subsection 7.2.2 by showing that Reachable Object is *NP*-hard even if restricted to cases where the maximum length of a preference list is at most four and where the underlying graph is a clique.

### **7.2.1 Maximum Length at Most Three**

In this subsection, we provide a linear-time algorithm for Reachable Object when the maximum length of a preference list is at most three. The main idea is to reduce Reachable Object to computing an $s$-$t$-path in a directed graph. Throughout this subsection, we assume that each agent $i$ initially holds object $x_i$. Consider all agents that hold the given target object $x$ during a sequence of swaps that leads to agent $I$ obtaining object $x$. All those agents except for the agent $a$ that initially holds $x$ and agent $I$ must swap their initially held object to receive $x$ and then receive their most preferred object for giving $x$ away. We call those agents $x$-*forwarders*. Concerning agent $I$, it might swap its initially held object $x_I$ in order to receive $x$ or it might first receive an object $x_w$ and then swap $x_w$ away in order to receive $x$. Note that in this case the preference list of agent $I$ is $x \succ x_w \succ x_I$. Since each preference list is of length at most three, the object $x_w$ is unique. Note that all agents that hold object $x_w$ in the mentioned sequence of swaps except for agents $I$ and $w$ must swap their initially held object to receive $x_w$ and then receive their most preferred object for giving $x_w$ away. Analogously to $x$-forwarders, we call such agents $x_w$-*forwarders*. Hence, we basically just consider the case distinction whether agent $I$ is an $x_w$-forwarder or not and which objects agents $a$ and $w$ receive in exchange for their initially held objects. We remark that there is a special case where $a$ is an $x_w$-forwarder and agent $w$ is an $x$-forwarder. Figure 7.2 gives an example of Reachable Object where the maximum length of preference lists is at most three.

Let $(\sigma_0, \sigma_1, \ldots, \sigma_t)$ be a sequence of swaps. To ease the reasoning, we define $\tau_i$ to be the swap that transforms $\sigma_{i-1}$ into $\sigma_i$. Formally, $\tau_i = \{(j, x_p), (k, x_q)\}$ such that

1. $\sigma_{i-1}(j) = \sigma_i(k) = x_p$, and

2. $\sigma_{i-1}(k) = \sigma_i(j) = x_q$.

Using this notation, we first prove a property which allows us to exclusively focus on the objects $x_w$ and $x$. Roughly speaking, any solution can be partitioned into two sequences of swaps. In the first sequence, the object $x_w$, which agent $I$ swaps in exchange for object $x$, is swapped between each two consecutive assignments. In the second sequence, object $x$ is swapped between each two consecutive assignments. More specifically, the following lemma states that the sequence of swaps resulting from performing all swaps that involve $x_w$ and no other swaps leads to agent $I$ obtaining $x_w$.

**Figure 7.2:** An example of Reachable Object that is a slight modification of the example in Figure 7.1. Initially held objects are again drawn in boxes and the question is still whether $x_3$ is reachable for agent 1. Then, our algorithm finds the following swap sequence for object $x_3$ to reach agent 1: 4 ↔ 3, 3 ↔ 2, 2 ↔ 1, 4 ↔ 5, 5 ↔ 6, 6 ↔ 1, where "$i$ ↔ $j$" means that agents $i$ and $j$ swap the objects they currently hold. In this example each agent in $\{1, 2, 3\}$ is an $x_4$-forwarder and each agent in $\{4, 5, 6\}$ is an $x_3$-forwarder.

**Lemma 7.2.** *Let*

$$(V := \{1, 2, \ldots, n\}, X := \{x_1, x_2, \ldots, x_n\}, \mathcal{P}, \sigma_0, G := (V, E), I, x)$$

*be an instance of* Reachable Object *where* $\sigma_0(i) = x_i$ *for all* $i \in [n]$ *and where the* maximum length of preference lists *is at most three. Let* $\phi := (\sigma_0, \sigma_1, \ldots, \sigma_t)$ *be a sequence of swaps such that* $\sigma_t(I) = x$*. Consider two objects* $x_p$ *and* $x_q$ *such that there is a swap* $\tau_r$ *with* $\tau_r = \{(I, x_p), (j, x_q)\}$*, that is, agent* $I$ *obtains object* $x_q$ *in exchange for* $x_p$ *during* $\phi$*. Let* $T = \{\tau_i \mid \tau_i = \{(j, x'_p), (k, x_q)\} \wedge i \le r\}$ *be the set of all swaps between assignments in* $\phi$ *up to assignment* $\sigma_r$ *that involve swapping* $x_q$*. We denote the elements of* $T$ *by* $\tau'_1, \tau'_2, \ldots, \tau'_{|T|}$ *such that swap* $\tau'_i$ *occurs before swap* $\tau'_j$ *in* $\phi$ *for each* $i < j$*. Let* $\phi_{\text{start}} := (\sigma'_0, \sigma'_1, \ldots, \sigma'_s)$ *be the sequence of assignments such that* $\sigma'_0 := \sigma_0$ *and* $\sigma'_i$ *is the result of performing swap* $\tau'_i$ *in assignment* $\sigma'_{i-1}$*. Let* $\tau'_i := \{(a_{i-1}, x_q), (a_i, x_{b_i})\}$ *for each* $i \in [s]$*. Then,*

*(i)* $\tau'_s = \tau_r$*,*

*(ii)* $a_0 = q$ *and agent* $q$ *prefers* $x_{b_1}$ *over* $x_q$*,*


*Proof.* We prove the individual statements one after another and we start with statement (i). To this end, note that by definition, the last swap $\tau_r$ between $\sigma_{r-1}$ and $\sigma_r$ contains $(j, x_q)$ for some agent $j$. Since the swaps between consecutive assignments in $\phi_{\text{start}}$ are exactly those that involve swapping object $x_q$, it follows that the swap between $\sigma'_{s-1}$ and $\sigma'_s$ must be $\tau_r$ and thus $\tau'_s = \tau_r$ and statement (i) holds.

We continue with statement (ii). By definition, $\tau'_1 := \{(a_0, x_q), (a_1, x_{b_1})\}$ and agent $q$ initially holds object $x_q$. By definition of rational swaps, it holds that agent $a_0$ initially holds object $x_q$ and prefers $x_{b_1}$ over $x_q$. Since each object is unique, since each object is held by only one agent at a time, and since both agents $a_0$ and $q$ initially hold $x_q$, it holds that $a_0 = q$ and thus statement (ii) holds.

To show statement (iii), recall that $\tau_r = \tau'_s := \{(a_{s-1}, x_q), (a_s, x_{b_s})\}$ and thus $a_s = I$ and, since $\tau_r$ is a rational swap, agent $a_s$ has to prefer $x_q$ over $x_p$. Since agent $a_s$ holds $x_p$ during $\phi$, it holds that $x_p = x_I$ or that agent $a_s$ prefers $x_p$ over $x_I$. In both cases it prefers $x$ over $x_I$ and thus statement (iii) holds.

We next prove statement (iv). Assume towards a contradiction that there is a minimum $z \in [s-1]$ such that $a_z$ does not have preference list $x_{b_{z+1}} \succ x_q \succ x_{b_z}$ or that it is not an $x_q$-forwarder. Observe that if $z = 1$, then by statement (ii) $a_{z-1} = a_0 = q$ and agent $a_{z-1}$ initially holds $x_q$ and otherwise, since $z$ is minimum, it holds that after swap $\tau'_{z-1}$ agent $a_{z-1}$ holds object $x_q$. Since $\tau'_z = \tau_k$ for some $k$, it holds that $x_q \succ_{a_z} x_{b_z}$ and agent $a_z$ swaps $x_{b_z}$ away for $x_q$. By definition of $\tau'_{z+1}$, agents $a_z$ and $a_{z+1}$ then swap $x_q$ and $x_{b_{z+1}}$ and thus $x_{b_{z+1}} \succ_{a_z} x_q$ and $a_z$ is an $x_q$-forwarder, a contradiction.

We next show statement (v). Note that if agent $I$ prefers object $x$ over $x_q$, then $x \neq x_q$. Assume towards a contradiction that some agent $a_z$ with $z \in [s]$ prefers $x$ over $x_q$. Note that in this case $x_q \neq x_{b_z}$ as $x_{b_z}$ is the object that $a_z$ initially holds. Then by statement (iv), it holds that $a_z$ only prefers $x_q$ and $x_{b_{z+1}}$ over $x_{b_z}$ and that $a_z$ obtains $x_{b_{z+1}}$ during $\phi$ before agent $I$ obtains object $x$. Since $x_q \neq x$, it holds that $x = x_{b_{z+1}}$ and since $a_z$ holds its most preferred object $x$, it will never trade this object away. Thus, object $x$ cannot be obtained by agent $I$ during $\phi$, a contradiction.

Finally, to show statement (vi), notice that by statement (i) and the definition of *τ*, it holds that *σ′<sub>s</sub>*(*I*) = *x<sub>q</sub>* and hence it remains to show that *ϕ*<sub>start</sub> is a sequence of swaps. Assume towards a contradiction that *ϕ*<sub>start</sub> is not a sequence of swaps, that is, there are two consecutive assignments *σ′<sub>i−1</sub>* and *σ′<sub>i</sub>* such that *τ′<sub>i</sub>* is not a rational swap. Since by definition *τ′<sub>i</sub>* = *τ<sub>k</sub>* for some *k*, it holds that if *τ′<sub>i</sub>* := {(*a<sub>i−1</sub>*, *x<sub>q</sub>*), (*a<sub>i</sub>*, *x<sub>b<sub>i</sub></sub>*)} is possible in the sense that *a<sub>i−1</sub>* holds *x<sub>q</sub>* and agent *a<sub>i</sub>* holds *x<sub>b<sub>i</sub></sub>*, then *τ′<sub>i</sub>* is a rational swap. Assume without loss of generality that *τ′<sub>i</sub>* is the first swap between consecutive assignments in *ϕ*<sub>start</sub> that is not possible. Then, *i* = 1 or *τ′<sub>i−1</sub>* is possible. By statement (ii), in both cases agent *a<sub>i−1</sub>* holds *x<sub>q</sub>* in *σ′<sub>i</sub>*. If *i* < *s*, then by statement (iv) agent *a<sub>i</sub>* can only trade *x<sub>b<sub>i</sub></sub>* away in order to obtain *x<sub>q</sub>* in any trade *τ<sub>k</sub>* between two assignments *σ<sub>k−1</sub>* and *σ<sub>k</sub>* in *ϕ*. Since *τ′<sub>i</sub>* = *τ<sub>k</sub>* for some *k* ∈ [*r*], it holds that *τ′<sub>i</sub>* is possible, a contradiction. If *i* = *s*, then by statement (iii) agent *a<sub>s</sub>* is agent *I*, which by assumption initially holds *x<sub>p</sub>* = *x<sub>b<sub>i</sub></sub>*, and hence *τ′<sub>i</sub>* is possible, a contradiction.

We next present the main algorithm of this subsection and prove that it solves Reachable Object when the maximum length of preference lists is at most three. Pseudo-code is given in Algorithm 7.1. The idea therein is to model possible swaps that involve *x* as arcs in a directed graph. Each arc (*i*, *j*) in this graph represents the fact that if agent *i* obtains object *x*, then it can swap it to agent *j* in exchange for object *x<sub>j</sub>*. A directed path from the agent that initially holds *x* to agent *I* then corresponds to a sequence of swaps such that agent *I* obtains object *x* in the end. We then consider the third object *x<sub>w</sub>* ∉ {*x<sub>I</sub>*, *x*} which appears in the preference list of *I* and build a similar directed graph for *x<sub>w</sub>*. The directed paths from agent *w* to agent *I* in it again correspond to sequences of swaps such that agent *I* obtains object *x<sub>w</sub>*.

**Proposition 7.3.** Reachable Object *can be solved in linear time when the* maximum length of preference lists *is at most three.*

**Algorithm 7.1:** Algorithm for Reachable Object when the maximum length of preference lists is at most three.

**Input:** A set *V* of agents, preference lists (≻<sub>*i*</sub>)<sub>*i*∈*V*</sub> of length at most three, and a graph (*V*, *E*).

**Output:** true if agent *I* can receive object *x* that is initially held by agent *a* and false otherwise.

$$\begin{array}{rl}
1 & F \leftarrow \{(i, j) \mid \{i, j\} \in E \land x\_j \succ\_i x \land x \succ\_j x\_j\} \quad \text{// If agent } i \text{ obtains } x \text{, then it can swap it to agent } j \text{ for } x\_j. \\
2 & D \leftarrow (V, F) \\
3 & \textbf{if } D \text{ admits a directed path } P \text{ from } a \text{ to } I \textbf{ then return } \text{true} \\
4 & \textbf{if } x \succ\_I x\_w \succ\_I x\_I \text{ for some } x\_w \neq x \textbf{ then} \\
5 & \quad F\_1 \leftarrow \{(i, j) \mid \{i, j\} \in E \land x\_j \succ\_i x\_w \land x\_w \succ\_j x\_j\} \\
6 & \quad F\_2 \leftarrow \{(i, j) \mid j \neq I \land \{i, j\} \in E \land x\_j \succ\_i x \land x \succ\_j x\_j\} \\
7 & \quad F\_3 \leftarrow \{(i, I) \mid \{i, I\} \in E \land x\_w \succ\_i x\} \quad \text{// If } i \text{ obtains object } x \text{, then it swaps it to agent } I \text{ for } x\_w. \\
8 & \quad D\_1 \leftarrow (V, F\_1) \\
9 & \quad D\_2 \leftarrow (V, F\_2 \cup F\_3) \\
10 & \quad \textbf{if } D\_1 \text{ admits a directed path from } w \text{ to } I \text{ and } (w, a) \in F\_1 \textbf{ then} \quad \text{// Object } x \text{ is held by agent } w \text{ after the first swap} \\
11 & \quad \quad \textbf{if } D\_2 \text{ admits a directed path from } w \text{ to } I \textbf{ then return } \text{true} \\
12 & \quad \textbf{if } D\_1 - \{a\} \text{ admits a directed path from } w \text{ to } I \textbf{ then} \\
13 & \quad \quad \textbf{if } D\_2 \text{ admits a directed path from } a \text{ to } I \textbf{ then return } \text{true} \\
14 & \textbf{return } \text{false}
\end{array}$$

*Proof.* Let

$$(V := \{1, 2, \ldots, n\}, X := \{x\_1, x\_2, \ldots, x\_n\}, \mathcal{P}, \sigma\_0, G := (V, E), I, x)$$

be an instance of Reachable Object where *σ*<sub>0</sub>(*i*) = *x<sub>i</sub>* for all *i* ∈ [*n*] and where the maximum length of preference lists is at most three. Let *a* be the agent that initially holds object *x*. We use Algorithm 7.1 to prove this proposition. We start with showing that if object *x* is reachable for agent *I*, then Algorithm 7.1 returns true. To this end, assume that there exists a sequence *ϕ* = (*σ*<sub>0</sub>, *σ*<sub>1</sub>, . . . , *σ<sub>t</sub>*) of swaps such that *σ<sub>t</sub>*(*I*) = *x*. We assume without loss of generality that *σ<sub>t−1</sub>*(*I*) := *x<sub>b</sub>* ≠ *x*. We then distinguish between the two cases *x<sub>b</sub>* = *x<sub>I</sub>* and *x<sub>b</sub>* ≠ *x<sub>I</sub>*. If *x<sub>b</sub>* = *x<sub>I</sub>*, then using *x<sub>p</sub>* = *x<sub>I</sub>* and *x<sub>q</sub>* = *x*, the sequence *ϕ′* = (*σ′*<sub>0</sub>, *σ′*<sub>1</sub>, . . . , *σ′<sub>s</sub>*) as defined in Lemma 7.2 is a sequence of swaps such that *σ′*<sub>0</sub> = *σ*<sub>0</sub> and *σ′<sub>s</sub>*(*I*) = *x*. By Lemma 7.2(ii) to (iv), graph *D* as constructed in Line 2 must contain a path from *a* to *I*. Thus, Algorithm 7.1 returns true in Line 3.

If *x<sub>b</sub>* ≠ *x<sub>I</sub>*, then the preference list of agent *I* is *x* ≻ *x<sub>w</sub>* ≻ *x<sub>I</sub>* and *x<sub>b</sub>* = *x<sub>w</sub>*. Moreover, agent *I* obtains *x<sub>w</sub>* during *ϕ* and thus there are *σ<sub>r−1</sub>* and *σ<sub>r</sub>* such that *τ<sub>r</sub>* = {{*I*, *x<sub>I</sub>*}, {*k*, *x<sub>w</sub>*}} for some agent *k*. By Lemma 7.2 (using *x<sub>p</sub>* = *x<sub>I</sub>* and *x<sub>q</sub>* = *x<sub>w</sub>*), the sequence *ϕ′* = (*σ′*<sub>0</sub>, *σ′*<sub>1</sub>, . . . , *σ′<sub>s</sub>*) as defined in Lemma 7.2 is a sequence of swaps such that *σ′*<sub>0</sub> = *σ*<sub>0</sub> and *σ′<sub>s</sub>*(*I*) = *x<sub>w</sub>*. Let *a*<sub>1</sub>, *a*<sub>2</sub>, . . . , *a<sub>s</sub>* be the agents that hold object *x<sub>w</sub>* during *ϕ′*. It follows from Lemma 7.2(iv) and the definition of *D*<sub>1</sub> in Line 8 that *ϕ′* defines a directed path (*a*<sub>0</sub>, *a*<sub>1</sub>, *a*<sub>2</sub>, . . . , *a<sub>s</sub>*) with *a*<sub>0</sub> = *w* and *a<sub>s</sub>* = *I* in *D*<sub>1</sub>. By Lemma 7.2(v), no agent in {*a*<sub>2</sub>, *a*<sub>3</sub>, . . . , *a<sub>s−1</sub>*} prefers *x* over its initially held object. By Lemma 7.2(ii) and (iv), it holds that *τ′*<sub>1</sub> = {{*w*, *x<sub>w</sub>*}, {*a*<sub>1</sub>, *x<sub>a<sub>1</sub></sub>*}}. We then distinguish between the two cases *a*<sub>1</sub> = *a* and *a*<sub>1</sub> ≠ *a*. If *a*<sub>1</sub> = *a*, then none of the agents in {*a*<sub>2</sub>, *a*<sub>3</sub>, . . . , *a<sub>s−1</sub>*} can be involved in a swap where *x* is traded. Observe that in this case considering the initial assignment *σ″*<sub>0</sub> with *σ″*<sub>0</sub>(*I*) = *x<sub>w</sub>*, *σ″<sub>a</sub>* = *x<sub>I</sub>*, *σ″<sub>a<sub>1</sub></sub>* = *x*, and *σ″*<sub>0</sub>(*i*) = *σ*<sub>0</sub>(*i*) for all other agents *i* is equivalent to the original instance. Using Lemma 7.2 with *x<sub>p</sub>* = *x<sub>w</sub>* and *x<sub>q</sub>* = *x* then states that there is a sequence *ϕ*<sup>∗</sup> = (*σ*<sup>∗</sup><sub>0</sub>, *σ*<sup>∗</sup><sub>1</sub>, . . . , *σ*<sup>∗</sup><sub>*s′*</sub>) of swaps with *σ*<sup>∗</sup><sub>0</sub> = *σ″*<sub>0</sub> and *σ*<sup>∗</sup><sub>*s′*</sub>(*I*) = *x*. It follows from Lemma 7.2(iv) and the definition of *D*<sub>2</sub> in Line 9 that *ϕ*<sup>∗</sup> defines a directed path (*b*<sub>0</sub>, *b*<sub>1</sub>, *b*<sub>2</sub>, . . . , *b<sub>s′</sub>*) with *b*<sub>0</sub> = *w* and *b<sub>s′</sub>* = *I* in *D*<sub>2</sub>. Thus, Algorithm 7.1 returns true in Line 11.

If *a*<sup>1</sup> ̸= *a*, then none of the agents in {*a*0*, a*1*, . . . , as*−1} can be involved in a swap where *x* is traded and agent *a* cannot receive *x<sup>w</sup>* during *ϕ*. Hence, *D*1−{*a*} contains a directed path from *w* to *I* and *D*<sup>2</sup> contains a directed path from *a* to *I*. Thus, Algorithm 7.1 returns true in Line 13.

We next show that if the algorithm returns true, then there exists a sequence *ϕ* = (*σ*<sub>0</sub>, *σ*<sub>1</sub>, . . . , *σ<sub>t</sub>*) of swaps such that *σ<sub>t</sub>*(*I*) = *x*. If the algorithm returns true, it does so either in Line 3, in Line 11, or in Line 13. If the algorithm returns true in Line 3, then let (*a*<sub>0</sub> := *a*, *a*<sub>1</sub>, . . . , *a<sub>t</sub>* := *I*) be a directed path in *D*. For each *i* ∈ [*t*], let *σ<sub>i</sub>* be the assignment with *σ<sub>i</sub>*(*a<sub>i</sub>*) := *σ<sub>i−1</sub>*(*a<sub>i−1</sub>*), *σ<sub>i</sub>*(*a<sub>i−1</sub>*) := *σ<sub>i−1</sub>*(*a<sub>i</sub>*), and *σ<sub>i</sub>*(*j*) := *σ<sub>i−1</sub>*(*j*) for all other agents *j*. By definition of rational trades and *D*, it holds that (*σ*<sub>0</sub>, *σ*<sub>1</sub>, . . . , *σ<sub>t</sub>*) is a sequence of swaps. Note that by construction *σ<sub>i</sub>*(*a<sub>i</sub>*) = *x* for all *i* ∈ [*t*] and thus *σ<sub>t</sub>*(*a<sub>t</sub>*) = *σ<sub>t</sub>*(*I*) = *x*.
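
The step from a directed path to a sequence of assignments can be replayed mechanically. The following Python sketch is an illustration only and not part of the original proof. It assumes that agents are integers, that agent `i` initially holds object `i`, and that `prefs[i]` lists the objects acceptable to agent `i` from most to least preferred; the function name `swaps_along_path` is hypothetical.

```python
def swaps_along_path(path, prefs):
    """Swap the object of path[0] along the path and return all assignments.

    Assumptions of this sketch: agent i initially holds object i, and
    prefs[i] lists objects from most to least preferred (ending with i).
    Raises ValueError if one of the swaps is not rational.
    """
    holding = {agent: agent for agent in prefs}    # sigma_0
    sequence = [dict(holding)]
    for prev, cur in zip(path, path[1:]):
        # prev hands the travelling object to cur and receives cur's object.
        for agent, new, old in ((prev, holding[cur], holding[prev]),
                                (cur, holding[prev], holding[cur])):
            ranking = prefs[agent]
            if new not in ranking or ranking.index(new) >= ranking.index(old):
                raise ValueError(f"swap between {prev} and {cur} is not rational")
        holding[prev], holding[cur] = holding[cur], holding[prev]
        sequence.append(dict(holding))             # sigma_i
    return sequence


# Toy instance: object 0 plays the role of x, agent 2 plays the role of I.
prefs = {0: [1, 0], 1: [2, 0, 1], 2: [0, 2]}
sigmas = swaps_along_path([0, 1, 2], prefs)
```

In the toy instance, the object of agent 0 is held by the *i*-th agent of the path in the *i*-th assignment, which mirrors the observation *σ<sub>i</sub>*(*a<sub>i</sub>*) = *x*.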

If the algorithm returns true in Line 11, then let (*a*<sub>0</sub> := *w*, *a*<sub>1</sub> := *a*, . . . , *a<sub>s</sub>* := *I*) be a directed path in *D*<sub>1</sub> and let (*b*<sub>0</sub> := *w*, *b*<sub>1</sub>, . . . , *b<sub>t</sub>* := *I*) be a directed path in *D*<sub>2</sub>. Let *σ<sub>i</sub>* be an assignment such that

$$1.\ \sigma\_i(a\_i) := \sigma\_{i-1}(a\_{i-1}) \text{ and } \sigma\_i(a\_{i-1}) := \sigma\_{i-1}(a\_i) \text{ for all } i \in [s],$$

$$\text{2. } \sigma\_{s+i}(b\_i) := \sigma\_{s+i-1}(b\_{i-1}) \text{ and } \sigma\_{s+i}(b\_{i-1}) := \sigma\_{s+i-1}(b\_i) \text{ for all } i \in [t], \text{ and}$$

3. *σi*(*j*) = *σi*−1(*j*) for all *i* ∈ [*s* + *t*] and all agents *j* that are not assigned objects by the above.

By definition of rational trades and of *D*<sub>1</sub> and *D*<sub>2</sub>, it holds that (*σ*<sub>0</sub>, *σ*<sub>1</sub>, . . . , *σ<sub>s+t</sub>*) is a sequence of swaps. Note that by construction *σ<sub>s+i</sub>*(*b<sub>i</sub>*) = *x* for all *i* ∈ [*t*] and thus *σ<sub>s+t</sub>*(*b<sub>t</sub>*) = *σ<sub>s+t</sub>*(*I*) = *x*.

Finally, the case where the algorithm returns true in Line 13 is completely analogous for the two directed paths

$$(a\_0 := w, a\_1, \ldots, a\_s := I) \text{ and } (b\_0 := a, b\_1, \ldots, b\_t := I).$$

It remains to analyze the running time. We start with constructing *D* := (*V*, *F*) in *O*(*n* + *m*) time; the constructions of *D*<sub>1</sub> and *D*<sub>2</sub> are analogous. To construct *D*, we go through each edge {*i*, *j*} in the input graph and check in constant time whether agent *i* prefers *x<sub>j</sub>* over *x* and whether agent *j* prefers *x* over *x<sub>j</sub>*. All remaining steps are searches for directed paths in graphs. Using dynamic programming on the topological orders of the constructed DAGs, each of these steps can be computed in *O*(*n* + *m*) time. Thus, the overall running time is *O*(*n* + *m*).
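
For illustration, Algorithm 7.1 can be transcribed into Python roughly as follows. This is a sketch under stated assumptions, not the author's implementation. Agents are integers, agent `i` initially holds object `i` (so object *x* is the object of agent `a`), and `prefs[i]` lists objects from most to least preferred, ending with `i`. Path searches use breadth-first search instead of the dynamic program mentioned in the proof, and the arc test before the second path search is read as (*w*, *a*) ∈ *F*<sub>1</sub>, that is, as the swap of *x<sub>w</sub>* and *x* between *w* and *a*. All function names are hypothetical, and the code is not tuned to be strictly linear-time.

```python
from collections import deque


def prefers(prefs, agent, better, worse):
    """True if agent strictly prefers object `better` over object `worse`."""
    ranking = prefs[agent]
    return (better in ranking and worse in ranking
            and ranking.index(better) < ranking.index(worse))


def has_path(arcs, source, target, removed=()):
    """Breadth-first search for a directed path that avoids `removed`."""
    if source in removed or target in removed:
        return False
    succ = {}
    for u, v in arcs:
        if u not in removed and v not in removed:
            succ.setdefault(u, []).append(v)
    seen, queue = {source}, deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return True
        for v in succ.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False


def swap_arcs(edges, prefs, obj, skip_target=None):
    """Arcs (i, j): if i holds obj, it can swap it to j for j's own object."""
    arcs = set()
    for u, v in edges:
        for i, j in ((u, v), (v, u)):
            if (j != skip_target and prefers(prefs, i, j, obj)
                    and prefers(prefs, j, obj, j)):
                arcs.add((i, j))
    return arcs


def reachable_object(edges, prefs, I, a):
    x = a                                          # object x is the object of a
    F = swap_arcs(edges, prefs, x)                 # Lines 1-2
    if has_path(F, a, I):                          # Line 3
        return True
    ranking = prefs[I]
    if len(ranking) == 3 and ranking[0] == x and ranking[2] == I:   # Line 4
        w = ranking[1]
        F1 = swap_arcs(edges, prefs, w)
        F2 = swap_arcs(edges, prefs, x, skip_target=I)
        F3 = {(i, I) for u, v in edges for i, j in ((u, v), (v, u))
              if j == I and prefers(prefs, i, w, x)}
        D2 = F2 | F3
        if has_path(F1, w, I) and (w, a) in F1 and has_path(D2, w, I):
            return True                            # Lines 10-11
        if has_path(F1, w, I, removed={a}) and has_path(D2, a, I):
            return True                            # Lines 12-13
    return False
```

For example, with `prefs = {0: [2, 0], 1: [1], 2: [3, 2], 3: [0, 2, 3]}` and edges `[(2, 3), (0, 3)]`, agent 3 first obtains the object of agent 2 and then trades it to agent 0 for the object of agent 0, so `reachable_object([(2, 3), (0, 3)], prefs, 3, 0)` takes the branch that corresponds to Lines 12 and 13.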

### **7.2.2 Maximum Length at Most Four**

Complementing the result from the previous subsection, we next show that Reachable Object is already *NP*-hard when the maximum length of preference lists is four even if the input graph is restricted to be a clique. The hardness of cliques implies that the computational hardness of the problem does not stem from restricting the possible swaps between agents by an underlying social network. To show *NP*-hardness, we reduce from a restricted variant of 3-SAT. In this variant, called 2P1N-SAT (two-positive-and-one-negative SAT), each clause has either two or three literals, and each variable appears once as a negative literal and either once or twice as a positive literal. 2P1N-SAT is known to be *NP*-complete [Tov84].

We start with some intuition. Let

$$\Phi := (\mathcal{V} := \{v\_1, \dots, v\_n\}, \mathcal{C} := \{C\_1, \dots, C\_m\})$$

be an instance of 2P1N-SAT. The general idea of the reduction is to have a set of agents for each variable and one agent for each literal in a clause in Φ. The agents representing variables can then pass objects to agents representing an occurrence of this variable in a clause such that


Before we formally describe our construction, we first introduce some notation. For each variable *v<sub>i</sub>* ∈ V, let occ(*i*) be the number of occurrences of variable *v<sub>i</sub>* (note that occ(*i*) ∈ {2, 3}), let *ν*(*i*) denote the index of the clause that contains the negative literal ¬*v<sub>i</sub>*, and let *π*<sub>1</sub>(*i*) and *π*<sub>2</sub>(*i*) with *π*<sub>1</sub>(*i*) < *π*<sub>2</sub>(*i*) be the indices of the clauses that contain the positive literal *v<sub>i</sub>*. If occ(*i*) = 2, then we simply neglect *π*<sub>2</sub>(*i*). For a clause *C<sub>j</sub>*, we denote by |*C<sub>j</sub>*| the number of literals that *C<sub>j</sub>* contains. For each clause *C<sub>j</sub>* ∈ C, we use an arbitrary but fixed order of the literals in *C<sub>j</sub>* to define a bijective function *f<sub>j</sub>* : *C<sub>j</sub>* → {1, . . . , |*C<sub>j</sub>*|}, which assigns to each literal contained in *C<sub>j</sub>* a distinct number from {1, 2, . . . , |*C<sub>j</sub>*|}.
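
As a small illustration of this notation, the following Python sketch computes occ, *ν*, *π*<sub>1</sub>, *π*<sub>2</sub>, and the functions *f<sub>j</sub>* for the formula (*v*<sub>2</sub> ∨ *v*<sub>3</sub>) ∧ (*v*<sub>1</sub> ∨ ¬*v*<sub>2</sub> ∨ ¬*v*<sub>3</sub>) ∧ (¬*v*<sub>1</sub> ∨ *v*<sub>2</sub> ∨ *v*<sub>3</sub>). The encoding is an assumption of the sketch: a literal is a non-zero integer (`+i` for *v<sub>i</sub>*, `-i` for ¬*v<sub>i</sub>*), clauses are numbered from 1, and *f<sub>j</sub>* numbers the literals of a clause by their position. The function name `notation` is hypothetical.

```python
def notation(clauses):
    """Return occ, nu, pi and f for a 2P1N-SAT formula.

    occ[i]  number of occurrences of variable i
    nu[i]   index of the clause containing the negative literal of i
    pi[i]   sorted list of clause indices containing the positive literal
    f[j]    dictionary mapping each literal of clause j to its position
    """
    occ, nu, pi, f = {}, {}, {}, {}
    for j, clause in enumerate(clauses, start=1):
        f[j] = {lit: z for z, lit in enumerate(clause, start=1)}
        for lit in clause:
            i = abs(lit)
            occ[i] = occ.get(i, 0) + 1
            if lit < 0:
                nu[i] = j
            else:
                pi.setdefault(i, []).append(j)
    return occ, nu, pi, f


occ, nu, pi, f = notation([[2, 3], [1, -2, -3], [-1, 2, 3]])
```

Here `pi[i][0]` and `pi[i][1]` play the roles of *π*<sub>1</sub>(*i*) and *π*<sub>2</sub>(*i*); for a variable with two occurrences the list has a single entry.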

We next give the formal description of the construction. Afterwards, we show how these definitions match the intuition we gave earlier and show an example of the construction. Finally, we formally prove the correctness of the construction.

**Construction 7.4.** Let Φ = (V := {*v*<sub>1</sub>, . . . , *v<sub>n</sub>*}, C := {*C*<sub>1</sub>, . . . , *C<sub>m</sub>*}) be an instance of 2P1N-SAT. We construct an instance of Reachable Object as follows.

**Agents and objects.** For each variable *v<sub>i</sub>* ∈ V, we define occ(*i*) − 1 *variable agents* *U<sub>i</sub><sup>1</sup>* (and *U<sub>i</sub><sup>2</sup>* if occ(*i*) = 3) and occ(*i*) − 1 objects *x<sub>i</sub><sup>1</sup>* (and *x<sub>i</sub><sup>2</sup>* if occ(*i*) = 3). For each clause *C<sub>j</sub>* ∈ C, we define 2|*C<sub>j</sub>*| + 1 *clause agents* *A<sub>j</sub>*, *B<sub>j</sub><sup>z</sup>*, and *D<sub>j</sub><sup>z</sup>*, where *z* ∈ [|*C<sub>j</sub>*|]. Moreover, we define 2|*C<sub>j</sub>*| + 1 objects

$$a\_j, b\_j^1, b\_j^2, \dots, b\_j^{|C\_j|}, d\_j^1, d\_j^2, \dots, d\_j^{|C\_j|}.$$

Finally, there is a special agent *I* and a special object *x*.

**Initial assignment and graph.** For each *i* ∈ [*n*] and each *z* ∈ [occ(*i*) − 1], agent *U<sub>i</sub><sup>z</sup>* initially holds object *x<sub>i</sub><sup>occ(i)−z</sup>*. For each *j* ∈ [*m*] and each *z* ∈ [|*C<sub>j</sub>*|], agent *B<sub>j</sub><sup>z</sup>* initially holds object *b<sub>j</sub><sup>z</sup>* and agent *D<sub>j</sub><sup>z</sup>* initially holds object *d<sub>j</sub><sup>z</sup>*. Finally, for each *j* ∈ [*m* − 1], agent *A<sub>j+1</sub>* initially holds object *a<sub>j</sub>*, agent *A*<sub>1</sub> initially holds *x*, and agent *I* initially holds *a<sub>m</sub>*.

The graph *G* := (*V*, *E*) is complete, that is,

$$E := \binom{V}{2}.$$

**Preference lists.** We next describe the preference list of each agent. Therein, we only specify the relevant part, that is, the preference list up to the object that the agent initially holds. We again mark the initially held object with a box. For a given variable *v<sub>i</sub>* ∈ V, let *j* = *ν*(*i*), *j′* = *π*<sub>1</sub>(*i*), and if occ(*i*) = 3, then *j″* = *π*<sub>2</sub>(*i*). If occ(*i*) = 2, then the preference list of *U<sub>i</sub><sup>1</sup>* is

$$d\_{j'}^{f\_{j'}(v\_i)} \succ d\_j^{f\_j(\neg v\_i)} \succ \boxed{x\_i^1},$$

and if occ(*i*) = 3, then the preference lists of *U<sub>i</sub><sup>1</sup>* and *U<sub>i</sub><sup>2</sup>* are

$$d\_{j'}^{f\_{j'}(v\_i)} \succ x\_i^1 \succ d\_j^{f\_j(\neg v\_i)} \succ \boxed{x\_i^2} \text{ and}$$

$$d\_{j''}^{f\_{j''}(v\_i)} \succ x\_i^2 \succ \boxed{x\_i^1}, \text{ respectively.}$$

For *j* ∈ [2*, m*], the preference list of *A<sup>j</sup>* is

$$b\_j^1 \succ \dots \succ b\_j^{|C\_j|} \succ \boxed{a\_{j-1}}.$$

For each *z* ∈ [|*C<sub>j</sub>*|], let *ℓ<sub>z</sub>* be the literal such that *f<sub>j</sub>*(*ℓ<sub>z</sub>*) = *z*. The preference list of *B<sub>j</sub><sup>z</sup>* is then

$$
\tau(C\_j, \ell\_z) \succ x \succ a\_{j-1} \succ \boxed{b\_j^z},
$$

where

$$\tau(C\_j, \ell) := \begin{cases} x\_i^1, & \text{if } \textsf{occ}(i) = 2 \text{ and } \ell = \neg v\_i \text{ for some variable } v\_i, \\ x\_i^2, & \text{if } \textsf{occ}(i) = 3 \text{ and } \ell = \neg v\_i \text{ for some variable } v\_i, \\ x\_i^1, & \text{if } \ell = v\_i \text{ and } j = \pi\_1(i) \text{ for some variable } v\_i, \text{ and} \\ x\_i^2, & \text{if } \ell = v\_i \text{ and } j = \pi\_2(i) \text{ for some variable } v\_i. \end{cases}$$

The preference list of *D<sup>z</sup> j* is

$$a\_j \succ x \succ \tau(C\_j, \ell\_z) \succ \boxed{d\_j^z}.$$

The preference list of *A*<sup>1</sup> is

$$b\_1^1 \succ b\_1^2 \succ \dots \succ b\_1^{|C\_1|} \succ \boxed{x}.$$

For each *z* ∈ [|*C*<sub>1</sub>|], let *ℓ<sub>z</sub>* be the literal such that *f*<sub>1</sub>(*ℓ<sub>z</sub>*) = *z*. The preference lists of *B*<sub>1</sub><sup>*z*</sup> and *D*<sub>1</sub><sup>*z*</sup> are

$$\begin{aligned} \tau(C\_1, \ell\_z) \succ x &\succ \boxed{b\_1^z} \text{ and} \\ a\_1 \succ x \succ \tau(C\_1, \ell\_z) &\succ \boxed{d\_1^z}, \text{ respectively.} \end{aligned}$$

Finally, the preference list of agent *I* is *x* ≻ *a<sup>m</sup>* .
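
The preference lists of Construction 7.4 can be generated programmatically. The following Python sketch is an illustration under stated assumptions and not part of the original construction: literals are non-zero integers (`+i` for *v<sub>i</sub>*, `-i` for ¬*v<sub>i</sub>*), *f<sub>j</sub>* numbers the literals of a clause by their position, objects and agents are represented by strings such as `"x_2^1"` or `"B_1^2"`, and the input is assumed to be a valid 2P1N-SAT formula. The function name `build_preferences` is hypothetical.

```python
def build_preferences(clauses):
    """Preference lists of Construction 7.4, most preferred object first.

    The last entry of each list is the initially held object.
    """
    occ, nu, pi = {}, {}, {}
    for j, clause in enumerate(clauses, start=1):
        for lit in clause:
            i = abs(lit)
            occ[i] = occ.get(i, 0) + 1
            if lit < 0:
                nu[i] = j
            else:
                pi.setdefault(i, []).append(j)

    def pos(j, lit):                       # f_j(lit)
        return clauses[j - 1].index(lit) + 1

    def tau(j, lit):                       # object tau(C_j, lit)
        i = abs(lit)
        if lit < 0:
            return f"x_{i}^{occ[i] - 1}"
        return f"x_{i}^{pi[i].index(j) + 1}"

    prefs = {}
    for i in sorted(occ):                  # variable agents
        d_neg = f"d_{nu[i]}^{pos(nu[i], -i)}"
        d_pos = [f"d_{j}^{pos(j, i)}" for j in pi[i]]
        if occ[i] == 2:
            prefs[f"U_{i}^1"] = [d_pos[0], d_neg, f"x_{i}^1"]
        else:
            prefs[f"U_{i}^1"] = [d_pos[0], f"x_{i}^1", d_neg, f"x_{i}^2"]
            prefs[f"U_{i}^2"] = [d_pos[1], f"x_{i}^2", f"x_{i}^1"]
    for j, clause in enumerate(clauses, start=1):   # clause agents
        held = "x" if j == 1 else f"a_{j - 1}"
        prefs[f"A_{j}"] = [f"b_{j}^{z}" for z in range(1, len(clause) + 1)] + [held]
        for z, lit in enumerate(clause, start=1):
            middle = [] if j == 1 else [held]
            prefs[f"B_{j}^{z}"] = [tau(j, lit), "x"] + middle + [f"b_{j}^{z}"]
            prefs[f"D_{j}^{z}"] = [f"a_{j}", "x", tau(j, lit), f"d_{j}^{z}"]
    prefs["I"] = ["x", f"a_{len(clauses)}"]
    return prefs


# (v2 or v3) and (v1 or not v2 or not v3) and (not v1 or v2 or v3)
prefs = build_preferences([[2, 3], [1, -2, -3], [-1, 2, 3]])
```

For this formula the sketch produces 25 agents, each with a preference list of length at most four, for example `prefs["U_2^1"] == ["d_1^1", "x_2^1", "d_2^2", "x_2^2"]`.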

We next explain how these formal definitions follow the general idea we started with. To this end, note that if for two agents *i* and *j* there are no two objects *x<sub>k</sub>* and *x<sub>ℓ</sub>* such that *x<sub>k</sub>* ≻<sub>*i*</sub> *x<sub>ℓ</sub>* ≽<sub>*i*</sub> *x<sub>i</sub>* and *x<sub>ℓ</sub>* ≻<sub>*j*</sub> *x<sub>k</sub>* ≽<sub>*j*</sub> *x<sub>j</sub>*, then by definition of rational trades agents *i* and *j* will never swap objects. In this case, we say that the edge {*i*, *j*} is *irrelevant* and ignore such edges henceforth. All other edges are *relevant*, and a careful examination of the preference lists of all agents shows that there are only the following relevant edges.

(1) Relevant edges between clause agents representing one clause *C<sup>j</sup>* are for each *z* ∈ [|*C<sup>j</sup>* |]

$$\{A\_j, B\_j^z\}, \{B\_j^z, D\_j^z\},$$

that is, all clause agents for one clause form a subdivided star.

(2) Relevant edges between clause agents representing two consecutive clauses *C<sup>j</sup>* and *C<sup>j</sup>*+1 are for each *z* ∈ [|*C<sup>j</sup>* |] and *z* ′ ∈ [|*C<sup>j</sup>*+1|]

$$\{D\_j^z, B\_{j+1}^{z'}\},$$

that is, the two vertex sets {*D<sub>j</sub><sup>z</sup>* | 1 ≤ *z* ≤ |*C<sub>j</sub>*|} and {*B<sub>j+1</sub><sup>z′</sup>* | 1 ≤ *z′* ≤ |*C<sub>j+1</sub>*|} form a complete bipartite graph.


**Table 7.1:** The preference lists of all agents for the instance V := {*v*<sub>1</sub>, *v*<sub>2</sub>, *v*<sub>3</sub>} and C := {*C*<sub>1</sub> := (*v*<sub>2</sub> ∨ *v*<sub>3</sub>), *C*<sub>2</sub> := (*v*<sub>1</sub> ∨ ¬*v*<sub>2</sub> ∨ ¬*v*<sub>3</sub>), *C*<sub>3</sub> := (¬*v*<sub>1</sub> ∨ *v*<sub>2</sub> ∨ *v*<sub>3</sub>)}. The indices follow Construction 7.4, with the literals of each clause numbered from left to right.

$$\begin{array}{lll}
A\_1 : b\_1^1 \succ b\_1^2 \succ \boxed{x} & A\_2 : b\_2^1 \succ b\_2^2 \succ b\_2^3 \succ \boxed{a\_1} & A\_3 : b\_3^1 \succ b\_3^2 \succ b\_3^3 \succ \boxed{a\_2} \\
B\_1^1 : x\_2^1 \succ x \succ \boxed{b\_1^1} & B\_2^1 : x\_1^1 \succ x \succ a\_1 \succ \boxed{b\_2^1} & B\_3^1 : x\_1^1 \succ x \succ a\_2 \succ \boxed{b\_3^1} \\
B\_1^2 : x\_3^1 \succ x \succ \boxed{b\_1^2} & B\_2^2 : x\_2^2 \succ x \succ a\_1 \succ \boxed{b\_2^2} & B\_3^2 : x\_2^2 \succ x \succ a\_2 \succ \boxed{b\_3^2} \\
 & B\_2^3 : x\_3^2 \succ x \succ a\_1 \succ \boxed{b\_2^3} & B\_3^3 : x\_3^2 \succ x \succ a\_2 \succ \boxed{b\_3^3} \\
D\_1^1 : a\_1 \succ x \succ x\_2^1 \succ \boxed{d\_1^1} & D\_2^1 : a\_2 \succ x \succ x\_1^1 \succ \boxed{d\_2^1} & D\_3^1 : a\_3 \succ x \succ x\_1^1 \succ \boxed{d\_3^1} \\
D\_1^2 : a\_1 \succ x \succ x\_3^1 \succ \boxed{d\_1^2} & D\_2^2 : a\_2 \succ x \succ x\_2^2 \succ \boxed{d\_2^2} & D\_3^2 : a\_3 \succ x \succ x\_2^2 \succ \boxed{d\_3^2} \\
 & D\_2^3 : a\_2 \succ x \succ x\_3^2 \succ \boxed{d\_2^3} & D\_3^3 : a\_3 \succ x \succ x\_3^2 \succ \boxed{d\_3^3} \\
I : x \succ \boxed{a\_3} & & \\
U\_1^1 : d\_2^1 \succ d\_3^1 \succ \boxed{x\_1^1} & U\_2^1 : d\_1^1 \succ x\_2^1 \succ d\_2^2 \succ \boxed{x\_2^2} & U\_3^1 : d\_1^2 \succ x\_3^1 \succ d\_2^3 \succ \boxed{x\_3^2} \\
 & U\_2^2 : d\_3^2 \succ x\_2^2 \succ \boxed{x\_2^1} & U\_3^2 : d\_3^3 \succ x\_3^2 \succ \boxed{x\_3^1}
\end{array}$$

(a) If occ(*i*) = 2, then the relevant edges between the variable agent *U<sub>i</sub><sup>1</sup>* representing *v<sub>i</sub>* and clause agents are

$$\{U\_i^1, D\_{\pi\_1(i)}^{f\_{\pi\_1(i)}(v\_i)}\} \text{ and } \{U\_i^1, D\_{\nu(i)}^{f\_{\nu(i)}(\neg v\_i)}\}.$$

(b) If occ(*i*) = 3, then the relevant edges between the variable agents *U<sub>i</sub><sup>1</sup>* and *U<sub>i</sub><sup>2</sup>* representing *v<sub>i</sub>* and clause agents are

$$\{U\_i^1, D\_{\pi\_1(i)}^{f\_{\pi\_1(i)}(v\_i)}\}, \{U\_i^1, D\_{\nu(i)}^{f\_{\nu(i)}(\neg v\_i)}\}, \text{ and } \{U\_i^2, D\_{\pi\_2(i)}^{f\_{\pi\_2(i)}(v\_i)}\}.$$
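
Whether an edge is relevant can be checked directly from the two preference lists. The following Python sketch is illustrative only. It assumes that each list is given up to and including the initially held object, most preferred object first, so that every listed object is at least as good as the agent's own one. The function name `is_relevant` is hypothetical, and the example lists are instantiated from Construction 7.4 for the clauses *C*<sub>1</sub> = (*v*<sub>2</sub> ∨ *v*<sub>3</sub>) and *C*<sub>2</sub> = (*v*<sub>1</sub> ∨ ¬*v*<sub>2</sub> ∨ ¬*v*<sub>3</sub>) with literals numbered from left to right.

```python
from itertools import combinations


def is_relevant(list_i, list_j):
    """True if some pair of objects is ranked in opposite order by i and j."""
    shared = [obj for obj in list_i if obj in list_j]   # in the order of agent i
    # Agent i ranks `first` above `second`; the edge is relevant as soon as
    # agent j ranks `second` above `first`.
    return any(list_j.index(second) < list_j.index(first)
               for first, second in combinations(shared, 2))


A_1 = ["b_1^1", "b_1^2", "x"]
B_1_1 = ["x_2^1", "x", "b_1^1"]
D_1_1 = ["a_1", "x", "x_2^1", "d_1^1"]
B_2_1 = ["x_1^1", "x", "a_1", "b_2^1"]

relevant_pairs = {
    "A_1-B_1^1": is_relevant(A_1, B_1_1),
    "B_1^1-D_1^1": is_relevant(B_1_1, D_1_1),
    "D_1^1-B_2^1": is_relevant(D_1_1, B_2_1),
    "A_1-D_1^1": is_relevant(A_1, D_1_1),
}
```

The first three pairs correspond to the relevant edges inside a clause gadget and between consecutive clause gadgets, whereas *A*<sub>1</sub> and *D*<sub>1</sub><sup>1</sup> share only the object *x* and therefore never swap.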

We now briefly describe how a solution in the constructed instance corresponds to a satisfying truth assignment of the original formula. Afterwards, we present the formal proof. Consider Table 7.1 and Figure 7.3 for an example of Construction 7.4, relevant edges, and how a satisfying truth assignment to the original formula corresponds to a solution for the constructed Reachable Object instance. Note that only agents *B<sub>j</sub><sup>z</sup>* and *D<sub>j</sub><sup>z</sup>* for *j* ∈ [*m*] and *z* ∈ [|*C<sub>j</sub>*|] as well as agents *I* and *A*<sub>1</sub> prefer object *x* at least as much as their initially held object. By the analysis of relevant edges above, one can easily verify that agent *I* can only receive object *x* if for each clause *C<sub>j</sub>* at least one agent *B<sub>j</sub><sup>z</sup>* and *D<sub>j</sub><sup>z′</sup>* for *z*, *z′* ∈ [|*C<sub>j</sub>*|] held object *x* before. For agent *D<sub>j</sub><sup>z′</sup>* to pass object *x* to agent *B<sub>j+1</sub><sup>z</sup>*, it has to receive *a<sub>j</sub>* in return. Since this object is initially held by agent *A<sub>j+1</sub>* and since this agent only shares relevant edges with agents *B<sub>j+1</sub><sup>z</sup>* for *z* ∈ [|*C<sub>j+1</sub>*|], this works as a selection gadget of which literal in clause *C<sub>j+1</sub>* should be satisfied. For agent *B<sub>j+1</sub><sup>z</sup>* to then trade object *x* to agent *D<sub>j+1</sub><sup>z</sup>*, it has to receive an object representing the corresponding variable in return. For *D<sub>j+1</sub><sup>z</sup>* to receive such an object, it has to receive this from an agent in the corresponding variable gadget. If this variable only occurs twice, then the respective object can only be given to one *D*-agent and this agent will never give it away as it is its most preferred object. If the variable occurs thrice, then the agents' preference lists are constructed in a way that if either of the "positive occurrences" is given an object representing this variable, then the "negative occurrence" cannot be satisfied.

**Figure 7.3:** An example of the agents and relevant edges resulting from Construction 7.4 for the instance V := {*v*<sub>1</sub>, *v*<sub>2</sub>, *v*<sub>3</sub>} and C := {*C*<sub>1</sub> := (*v*<sub>2</sub> ∨ *v*<sub>3</sub>), *C*<sub>2</sub> := (*v*<sub>1</sub> ∨ ¬*v*<sub>2</sub> ∨ ¬*v*<sub>3</sub>), *C*<sub>3</sub> := (¬*v*<sub>1</sub> ∨ *v*<sub>2</sub> ∨ *v*<sub>3</sub>)}. The boxes with solid lines indicate the three clause gadgets and the three boxes with dashed lines display the three variable gadgets. Relevant edges between variable agents and clause agents are drawn in red only for easier distinction. The preference lists are listed in Table 7.1. Notice that setting *v*<sub>1</sub> and *v*<sub>2</sub> to false and *v*<sub>3</sub> to true is a satisfying truth assignment. This corresponds to the following sequence of swaps that lets agent *I* obtain object *x*. Setting a variable that occurs thrice to true (*v*<sub>3</sub> in our example) is represented by the swap of the initially held objects of the two respective variable agents (in our case *U*<sub>3</sub><sup>1</sup> and *U*<sub>3</sub><sup>2</sup> swap *x*<sub>3</sub><sup>2</sup> and *x*<sub>3</sub><sup>1</sup>). Afterwards, we choose for each clause one literal that satisfies it. Since in our example *C*<sub>1</sub> and *C*<sub>2</sub> are only satisfied by one literal, we can only choose between ¬*v*<sub>1</sub> and *v*<sub>3</sub> in *C*<sub>3</sub>. Let us choose *v*<sub>3</sub> in *C*<sub>3</sub>. Next, for each clause *C<sub>j</sub>* the agent *A<sub>j</sub>* swaps with the chosen *B*-vertex and the chosen *D*-vertex swaps with the respective variable agent. In our case, *A*<sub>1</sub> swaps with *B*<sub>1</sub><sup>2</sup>, *A*<sub>2</sub> swaps with *B*<sub>2</sub><sup>2</sup>, *A*<sub>3</sub> swaps with *B*<sub>3</sub><sup>3</sup>, *D*<sub>1</sub><sup>2</sup> swaps with *U*<sub>3</sub><sup>1</sup>, *D*<sub>2</sub><sup>2</sup> swaps with *U*<sub>2</sub><sup>1</sup>, and *D*<sub>3</sub><sup>3</sup> swaps with *U*<sub>3</sub><sup>2</sup>. If all clauses are satisfied, then object *x* can be swapped "through all clauses". In the sequence of swaps we described, object *x* is successively held by agents *A*<sub>1</sub>, *B*<sub>1</sub><sup>2</sup>, *D*<sub>1</sub><sup>2</sup>, *B*<sub>2</sub><sup>2</sup>, *D*<sub>2</sub><sup>2</sup>, *B*<sub>3</sub><sup>3</sup>, *D*<sub>3</sub><sup>3</sup>, and *I*.

It remains to formally prove that Construction 7.4 is correct. This leads to the main result of this subsection which states that Reachable Object remains *NP*-hard when each preference list has length at most four and which complements Proposition 7.3.

**Proposition 7.5.** Reachable Object *is NP-hard even if the* maximum length of preference lists *is four and the input graph is restricted to complete graphs.*

*Proof.* Since each step in Construction 7.4 is polynomial-time computable, since all preference lists have by construction length at most four, and since the graph is complete, we will focus on showing that the constructed instance is equivalent to the original instance. To this end, let Φ := (V, C) be an instance of 2P1N-SAT and consider the instance of Reachable Object resulting from Construction 7.4.

We will first show that if Φ is satisfiable, then there is a sequence of swaps such that object *x* reaches *I*. Let *β* : V → {true, false} be a satisfying truth assignment for Φ. First, for each variable *v<sub>i</sub>* ∈ V, if occ(*i*) = 3 and *β*(*v<sub>i</sub>*) = true, then let agents *U<sub>i</sub><sup>1</sup>* and *U<sub>i</sub><sup>2</sup>* swap their initially held objects (so that *U<sub>i</sub><sup>1</sup>* and *U<sub>i</sub><sup>2</sup>* hold *x<sub>i</sub><sup>1</sup>* and *x<sub>i</sub><sup>2</sup>*, respectively). Second, identify for each clause *C<sub>j</sub>* one literal *ℓ<sub>j</sub>* that satisfies *C<sub>j</sub>* under assignment *β*. Then, perform the following swaps.

(a) if *ℓ<sub>j</sub>* = ¬*v<sub>i</sub>*, then *z* = 1 (note that in this case agent *U<sub>i</sub><sup>1</sup>* is holding object *x<sub>i</sub><sup>occ(i)−1</sup>*),

(b) if *ℓ<sub>j</sub>* = *v<sub>i</sub>* and *j* = *π*<sub>1</sub>(*i*), then *z* = 1 (note that in this case agent *U<sub>i</sub><sup>1</sup>* is holding object *x<sub>i</sub><sup>1</sup>*), and

(c) if *ℓ<sub>j</sub>* = *v<sub>i</sub>* and *j* = *π*<sub>2</sub>(*i*), then *z* = 2 (note that in this case agent *U<sub>i</sub><sup>2</sup>* is holding object *x<sub>i</sub><sup>2</sup>*).

After these swaps, agent *B<sub>1</sub><sup>f<sub>1</sub>(ℓ<sub>1</sub>)</sup>* holds object *x* and agent *B<sub>j</sub><sup>f<sub>j</sub>(ℓ<sub>j</sub>)</sup>* holds object *a<sub>j−1</sub>* for each *j* ∈ [2, *m*]. Moreover, for each *j* ∈ [*m*], agent *D<sub>j</sub><sup>f<sub>j</sub>(ℓ<sub>j</sub>)</sup>* holds object *τ*(*C<sub>j</sub>*, *ℓ<sub>j</sub>*). Third, for each *j* ∈ [*m* − 1] iteratively perform the following swaps.


After these swaps, agent *B<sub>m</sub><sup>f<sub>m</sub>(ℓ<sub>m</sub>)</sup>* holds object *x*. Finally, agent *B<sub>m</sub><sup>f<sub>m</sub>(ℓ<sub>m</sub>)</sup>* can swap object *x* in exchange for object *τ*(*C<sub>m</sub>*, *ℓ<sub>m</sub>*) with agent *D<sub>m</sub><sup>f<sub>m</sub>(ℓ<sub>m</sub>)</sup>*, who can then swap *x* in exchange for object *a<sub>m</sub>* with agent *I*. Thus, object *x* is reachable for agent *I* and the constructed instance is a yes-instance.

For the other direction, assume that there is a sequence (*σ*0*, σ*1*, . . . , σs*) of swaps such that *σs*(*I*) = *x*. We show how to construct a satisfying truth assignment for Φ using the following claim that formalizes the idea that object *x* has to pass "through all clauses".

**Claim 7.6.** *For each clause C<sup>j</sup>* ∈ C*, there exist assignments σ<sup>r</sup> and σ<sup>r</sup>*+1 *and a literal ℓ<sup>j</sup>* ∈ *C<sup>j</sup> such that*

$$\begin{aligned} 1. \ \sigma\_r(B\_j^{f\_j(\ell\_j)}) &= x, \\ 2. \ \sigma\_r(D\_j^{f\_j(\ell\_j)}) &= \tau(C\_j, \ell\_j), \\ 3. \ \sigma\_{r+1}(B\_j^{f\_j(\ell\_j)}) &= \tau(C\_j, \ell\_j), \text{ and} \\ 4. \ \sigma\_{r+1}(D\_j^{f\_j(\ell\_j)}) &= x. \end{aligned}$$

*Proof of Claim 7.6.* We prove the claim by induction over *j*, starting with *j* = *m*. In the initial assignment *σ*<sub>0</sub>, agent *I* holds object *a<sub>m</sub>*. Note that *I* prefers only *x* over its initially held object *a<sub>m</sub>* and only the agents *D<sub>m</sub><sup>z</sup>* with *z* ∈ [|*C<sub>m</sub>*|] prefer *a<sub>m</sub>* over *x*. Hence, *I* has to swap with one of these agents to obtain *x*. Let *ℓ<sub>m</sub>* be the literal with *f<sub>m</sub>*(*ℓ<sub>m</sub>*) = *z*. In order for agent *D<sub>m</sub><sup>z</sup>* to obtain object *x* to swap it to *I*, it must hold object *τ*(*C<sub>m</sub>*, *ℓ<sub>m</sub>*) and swap it for *x* since no agent will trade *x* for *d<sub>m</sub><sup>z</sup>*. Observe that agent *B<sub>m</sub><sup>z</sup>* is the only agent that prefers *τ*(*C<sub>m</sub>*, *ℓ<sub>m</sub>*) over *x* and *B<sub>m</sub><sup>z</sup>* must therefore trade object *x* to *D<sub>m</sub><sup>z</sup>* in exchange for object *τ*(*C<sub>m</sub>*, *ℓ<sub>m</sub>*). Thus, there are assignments *σ<sub>r</sub>* and *σ<sub>r+1</sub>* such that

$$\begin{aligned} \text{1. } &\sigma\_r(B\_m^{f\_m(\ell\_m)}) = x, \\ \text{2. } &\sigma\_r(D\_m^{f\_m(\ell\_m)}) = \tau(C\_m, \ell\_m), \\ \text{3. } &\sigma\_{r+1}(B\_m^{f\_m(\ell\_m)}) = \tau(C\_m, \ell\_m), \text{and} \\ \text{4. } &\sigma\_{r+1}(D\_m^{f\_m(\ell\_m)}) = x. \end{aligned}$$

We now show that if for some *j* ∈ [*m* − 1] there are assignments *σ<sub>r′</sub>* and *σ<sub>r′+1</sub>* and a literal *ℓ<sub>j+1</sub>* that fulfill the claim for clause *C<sub>j+1</sub>*, then there are also assignments *σ<sub>r</sub>* and *σ<sub>r+1</sub>* and a literal *ℓ<sub>j</sub>* that fulfill the claim for clause *C<sub>j</sub>*. By definition, agent *B<sub>j+1</sub><sup>f<sub>j+1</sub>(ℓ<sub>j+1</sub>)</sup>* must have obtained object *x* at some point. Since it prefers *x* only over objects *a<sub>j</sub>* and *b<sub>j+1</sub><sup>f<sub>j+1</sub>(ℓ<sub>j+1</sub>)</sup>* and since no agent prefers *b<sub>j+1</sub><sup>f<sub>j+1</sub>(ℓ<sub>j+1</sub>)</sup>* over *x*, it follows that agent *B<sub>j+1</sub><sup>f<sub>j+1</sub>(ℓ<sub>j+1</sub>)</sup>* must have swapped object *a<sub>j</sub>* with some other agent for *x*. Since only agents from {*D<sub>j</sub><sup>z</sup>* | *z* ∈ [|*C<sub>j</sub>*|]} prefer *a<sub>j</sub>* over *x*, it follows that *B<sub>j+1</sub><sup>f<sub>j+1</sub>(ℓ<sub>j+1</sub>)</sup>* must have swapped with some agent *D<sub>j</sub><sup>z′</sup>* with *z′* ∈ [|*C<sub>j</sub>*|] to obtain object *x*. Let *ℓ<sub>j</sub>* be the literal such that *f<sub>j</sub>*(*ℓ<sub>j</sub>*) = *z′*. Now consider how agent *D<sub>j</sub><sup>z′</sup>* can obtain object *x*. Similarly to the case with agent *D<sub>m</sub><sup>z</sup>*, agent *D<sub>j</sub><sup>z′</sup>* must swap object *τ*(*C<sub>j</sub>*, *ℓ<sub>j</sub>*) with agent *B<sub>j</sub><sup>z′</sup>* to obtain *x* as no agent prefers *d<sub>j</sub><sup>z′</sup>* over *x*. Thus, there are assignments *σ<sub>r</sub>* and *σ<sub>r+1</sub>* such that

$$\begin{aligned} \text{1. } &\sigma\_r(B\_j^{f\_j(\ell\_j)}) = x, \\ \text{2. } &\sigma\_r(D\_j^{f\_j(\ell\_j)}) = \tau(C\_j, \ell\_j), \\ \text{3. } &\sigma\_{r+1}(B\_j^{f\_j(\ell\_j)}) = \tau(C\_j, \ell\_j), \text{and} \\ \text{4. } &\sigma\_{r+1}(D\_j^{f\_j(\ell\_j)}) = x. \end{aligned}$$

We conclude the proof by constructing a truth assignment *β* and showing, using Claim 7.6, that it satisfies Φ. For each variable *v<sub>i</sub>* ∈ V, let

$$\beta(v\_i) := \begin{cases} \text{false}, & \text{if } D^{f\_{\nu(i)}(\neg v\_i)}\_{\nu(i)} \text{ swapped object } x \text{ with } B^{f\_{\nu(i)}(\neg v\_i)}\_{\nu(i)} \\ \text{true}, & \text{otherwise}. \end{cases}$$

Assume towards a contradiction that *β* does not satisfy Φ, that is, there is some clause *C<sub>j</sub>* ∈ C that is not satisfied by *β*. By Claim 7.6, let *ℓ<sub>j</sub>* ∈ *C<sub>j</sub>* be a literal such that *D<sub>j</sub><sup>f<sub>j</sub>(ℓ<sub>j</sub>)</sup>* swapped object *x* with *B<sub>j</sub><sup>f<sub>j</sub>(ℓ<sub>j</sub>)</sup>* for object *τ*(*C<sub>j</sub>*, *ℓ<sub>j</sub>*). Observe that *ℓ<sub>j</sub>* ∈ {*v<sub>i</sub>*, ¬*v<sub>i</sub>*} for some *v<sub>i</sub>* ∈ V. We now distinguish between the two cases *ℓ<sub>j</sub>* = *v<sub>i</sub>* and *ℓ<sub>j</sub>* = ¬*v<sub>i</sub>*. If *ℓ<sub>j</sub>* = ¬*v<sub>i</sub>*, then notice that *j* = *ν*(*i*) since each variable occurs exactly once as a negative literal. Thus, *D<sub>ν(i)</sub><sup>f<sub>ν(i)</sub>(¬v<sub>i</sub>)</sup>* swapped object *x* with *B<sub>ν(i)</sub><sup>f<sub>ν(i)</sub>(¬v<sub>i</sub>)</sup>*. By construction, *β*(*v<sub>i</sub>*) = false and thus *C<sub>j</sub>* is satisfied, a contradiction. If *ℓ<sub>j</sub>* = *v<sub>i</sub>*, then *v<sub>i</sub>* ∈ *C<sub>j</sub>*. Since *C<sub>j</sub>* is not satisfied by *β*, it holds that ¬*v<sub>i</sub>* ∉ *C<sub>j</sub>* and *β*(*v<sub>i</sub>*) = false. It then follows from the construction of *β* that *D<sub>ν(i)</sub><sup>f<sub>ν(i)</sub>(¬v<sub>i</sub>)</sup>* swapped object *x* with *B<sub>ν(i)</sub><sup>f<sub>ν(i)</sub>(¬v<sub>i</sub>)</sup>*. Note that, by the construction of the preference lists, *B<sub>ν(i)</sub><sup>f<sub>ν(i)</sub>(¬v<sub>i</sub>)</sup>* can only have received object *τ*(*C<sub>ν(i)</sub>*, ¬*v<sub>i</sub>*) in exchange for *x* in this trade. We will show that this contradicts the assumption that *D<sub>j</sub><sup>f<sub>j</sub>(ℓ<sub>j</sub>)</sup>* swapped object *x* with *B<sub>j</sub><sup>f<sub>j</sub>(ℓ<sub>j</sub>)</sup>* for object *τ*(*C<sub>j</sub>*, *ℓ<sub>j</sub>*) using a case distinction over occ(*i*).

If occ(*i*) = 2, then the definition of *τ* yields $\tau(C\_j, v\_i) = \tau(C\_{j'}, \neg v\_i) = x\_i^1$. Since both agents $B\_j^{f\_j(v\_i)}$ and $B\_{\nu(i)}^{f\_{\nu(i)}(\neg v\_i)}$ prefer object $x\_i^1$ the most, once one of the two agents has received it, the object cannot be swapped on to the respective other agent. Thus, not both of the constructed swaps can happen during the sequence of swaps, a contradiction.

If occ(*i*) = 3, then by the definition of *τ* it holds that $\tau(C\_{\nu(i)}, \neg v\_i) = x\_i^2$. Since agent $D\_{\nu(i)}^{f\_{\nu(i)}(\neg v\_i)}$ received object $x\_i^2$, it must have received it from agent $U\_i^1$ (as $U\_i^2$ and $D\_{\nu(i)}^{f\_{\nu(i)}(\neg v\_i)}$ share no relevant edge). Thus agents $U\_i^1$ and $U\_i^2$ did not swap their initially held objects. We make a final case distinction on whether $j = \pi\_1(\ell\_j)$ or $j = \pi\_2(\ell\_j)$. If $j = \pi\_1(\ell\_j)$, then agent $D\_j^{f\_j(\ell\_j)}$ must have received object $x\_i^1$ from agent $U\_i^1$. If $j = \pi\_2(\ell\_j)$, then agent $D\_j^{f\_j(\ell\_j)}$ must have received object $x\_i^2$ from agent $U\_i^2$. In both cases agents $U\_i^1$ and $U\_i^2$ swapped their initially held objects, a contradiction.

This concludes the dichotomy result for the maximum length *ℓ* of preference lists: Proposition 7.3 states that Reachable Object is linear-time solvable if *ℓ* ≤ 3, and Proposition 7.5 complements this result by showing that Reachable Object remains *NP*-hard for *ℓ* = 4. Note that if we replace the complete graph in Construction 7.4 by the graph that only contains the relevant edges, then, for each *ℓ* > 4, we can simply add a new agent with an arbitrary preference list of length *ℓ* that is only adjacent to agent *I*. This agent can never swap its initially held object *o* since *I* does not prefer *o* over its initially held object. This implies *NP*-hardness of Reachable Object with respect to the maximum length *ℓ* of preference lists for each *ℓ* > 4.

# **7.3 Cycles**

In this section, we prove that Reachable Object on *n*-vertex cycles is solvable in *O*(*n*<sup>4</sup>) time. This generalizes an *O*(*n*<sup>4</sup>)-time algorithm for Reachable Object on paths by Huang and Xiao [HX20]. The main difference between our algorithm and the algorithm by Huang and Xiao is the fact that two objects can only be swapped once in a path but up to twice in a cycle. The main ingredient to overcome this obstacle is a structural observation which states that for each solution there is a constant *c* such that the following holds. For all pairs (*x<sup>i</sup>*, *x<sup>j</sup>*) of objects that are swapped twice in the solution, the two edges over which *x<sup>i</sup>* and *x<sup>j</sup>* are swapped have distance *c* in the input cycle. Since this constant *c* is the same for all pairs of objects, we will first determine the value of *c* and then use it to check for each pair of objects whether they can be swapped twice in a solution.

Note that we can ignore all connected components in the input instance of Reachable Object that do not contain *I* and hence we may assume that the input graph is connected. Note further that any connected graph with maximum degree two is either a path or a cycle. Thus, our algorithm for cycles and the algorithm for paths by Huang and Xiao [HX20] prove that Reachable Object is polynomial-time solvable for graphs of maximum degree two. Saffidine and Wilczynski [SW18] showed that Reachable Object remains *NP*-hard on graphs of maximum degree four. The general idea for our algorithm is as follows. Note that Observation 7.1 implies that there are only two possible paths of agents in the graph that can hold the target object *x* before the target agent *I* can obtain it. We will then guess<sup>1</sup> the path of agents that hold *x* during a solution (a sequence of swaps such that agent *I* obtains *x*). This will allow us to represent a solution by selecting one object to be swapped with *x* over each edge in the guessed path. An example of this is given in Figure 7.4. In Subsection 7.3.1, we show that for each edge in this path there are at most two candidate objects that can be swapped with *x* over the respective edge. Finally, in Subsection 7.3.2, we show how to partition the edges in the path such that

<sup>1</sup>As in Chapter 5, guessing refers to the procedure of iterating over all possibilities and considering "the correct" iteration for the proof.

[Figure 7.4: a cycle on the six agents 0 to 5, each annotated with its preference list.]

**Figure 7.4:** An example of Reachable Object on a cycle. Initially held objects are drawn in boxes and the question is whether $x\_4$ is reachable for agent 0. Note that agent 5 does not accept object $x\_4$ and hence object $x\_4$ has to pass the edges {3, 4}, {2, 3}, {1, 2}, and {0, 1} before agent 0 can obtain it. Considering the preference lists of the agents, it is easy to verify that only object $x\_3$ can be swapped with $x\_4$ over the edges {3, 4} and {0, 1}. Analogously, only object $x\_2$ can be swapped with $x\_4$ over the edge {2, 3} and objects $x\_1$ and $x\_5$ are candidates for being swapped with $x\_4$ over the edge {1, 2}. Observe that for object $x\_5$ to be swapped with $x\_4$ over the edge {1, 2}, it has to be swapped over the edge {0, 1} which is impossible as agent 0 does not prefer any object over $x\_5$. Hence, the only solution selects objects $x\_1$, $x\_2$, and $x\_3$ to move clockwise and objects $x\_0$, $x\_4$, and $x\_5$ to move counter-clockwise. Note that the sequence of swaps resulting from 3 ↔ 4, 4 ↔ 5, 5 ↔ 0, 3 ↔ 2, 2 ↔ 1, 1 ↔ 0 leads to agent *I* obtaining $x\_4$. Therein, "*i* ↔ *j*" means that agents *i* and *j* swap the objects they currently hold.


We conclude with the main theorem that states that Reachable Object on cycles can be solved in *O*(*n*<sup>4</sup>) time. The respective algorithm is a 2-SAT program with a variable for each part of the described partition. The truth value of this variable represents the choice of candidates for each edge in the respective part. The clauses will guarantee that no two incompatible candidates are chosen.
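The general shape of such a 2-SAT program can be sketched in Python. The solver below is the textbook implication-graph approach (strongly connected components via Kosaraju's algorithm) and is not the concrete construction developed in this section; the names `solve_2sat` and `forbid`, the literal encoding, and the toy clauses are assumptions made purely for illustration.

```python
def solve_2sat(num_vars, clauses):
    """Solve a 2-SAT formula.

    A literal is a pair (variable index, is_positive); a clause is a pair of
    literals. Returns a list of booleans or None if unsatisfiable.
    """
    size = 2 * num_vars
    node = lambda lit: 2 * lit[0] + (0 if lit[1] else 1)  # node ^ 1 is the negation
    adj = [[] for _ in range(size)]
    radj = [[] for _ in range(size)]
    for a, b in clauses:
        na, nb = node(a), node(b)
        # (a or b) is equivalent to (not a -> b) and (not b -> a)
        for src, dst in ((na ^ 1, nb), (nb ^ 1, na)):
            adj[src].append(dst)
            radj[dst].append(src)

    # Pass 1: iterative DFS, recording vertices by finishing time.
    order, seen = [], [False] * size
    for start in range(size):
        if seen[start]:
            continue
        seen[start] = True
        stack = [(start, 0)]
        while stack:
            u, i = stack.pop()
            if i < len(adj[u]):
                stack.append((u, i + 1))
                w = adj[u][i]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(u)

    # Pass 2: components on the reversed graph, numbered in topological order.
    comp, count = [-1] * size, 0
    for start in reversed(order):
        if comp[start] != -1:
            continue
        comp[start] = count
        stack = [start]
        while stack:
            u = stack.pop()
            for w in radj[u]:
                if comp[w] == -1:
                    comp[w] = count
                    stack.append(w)
        count += 1

    assignment = []
    for v in range(num_vars):
        if comp[2 * v] == comp[2 * v + 1]:
            return None  # v and (not v) imply each other
        assignment.append(comp[2 * v] > comp[2 * v + 1])
    return assignment


def forbid(p, choice_p, q, choice_q):
    """Clause stating that parts p and q must not take these two choices together."""
    return ((p, not choice_p), (q, not choice_q))
```

With one variable per part, where true and false stand for the two candidate choices of that part, each incompatible pair of choices contributes one clause via `forbid`; a satisfying assignment then selects pairwise compatible candidates.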

**Figure 7.5:** A cycle with six vertices. The part $[\![2, 4]\!]$ is colored violet (darker) and $[\![4, 2]\!]$ is colored yellow (brighter).

We start with some notation for this section. For the sake of readability, we assume that the graph is

$$G := \left(V := \{0\} \cup [n-1], E := \{\{i-1, i\} \mid i \in [n-1] \} \cup \{\{0, n-1\} \} \right).$$

Furthermore, if we refer to some agent *j* with $j \notin \{0\} \cup [n-1]$, then we mean agent $j'$ with $j' \equiv j \pmod{n}$. For each object $x\_i$, we denote by $A(x\_i)$ the agent that initially holds $x\_i$, that is, $\sigma\_0^{-1}(x\_i)$. We assume without loss of generality that $I := 0$, refer to the target object as *x*, and define $k := A(x)$.

We use $[\![i, j]\!]$ to denote the set $\{i, i+1 \bmod n, \ldots, j \bmod n\}$, that is,

$$[\![i,j]\!] := \begin{cases} [i,j], & \text{if } j \ge i, \text{ and} \\ [0,j] \cup [i,n-1], & \text{if } j < i. \end{cases}$$

See Figure 7.5 for an example. Finally, we say that an object *x<sup>i</sup> moves clockwise* if it is swapped from some agent *j* to agent *j* + 1. Analogously, we say that *x<sup>i</sup> moves counter-clockwise* if it is swapped from some agent *j* to agent *j* − 1. By Observation 7.1, an object that moves clockwise (or counter-clockwise) once will only move clockwise (or counter-clockwise) in the future.
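As a small illustration, the cyclic interval from *i* to *j* on an *n*-vertex cycle can be computed as follows. The function name `cyclic_interval` is hypothetical; the two calls in the usage note correspond to the two parts shown in Figure 7.5.

```python
def cyclic_interval(i, j, n):
    """Agents i, i+1, ..., j (indices taken modulo n), listed in clockwise order."""
    i, j = i % n, j % n
    if j >= i:
        return list(range(i, j + 1))
    # The interval wraps around agent 0.
    return list(range(i, n)) + list(range(0, j + 1))
```

For the six-vertex cycle, `cyclic_interval(2, 4, 6)` yields the agents 2, 3, 4 and `cyclic_interval(4, 2, 6)` yields the agents 4, 5, 0, 1, 2.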

**Observation 7.7.** *Let ϕ be a sequence of swaps and let x<sup>i</sup> be an object. If x<sup>i</sup> is swapped during ϕ, then it either only moves clockwise or only moves counter-clockwise during ϕ.*

Note that the object *x* has to move clockwise or counter-clockwise in a solution. Our main algorithm just tries out both possibilities one after another and since these two cases work analogously, we will only present the case where *x* moves counter-clockwise here. Since *x* moves counter-clockwise and is initially held by agent *k* := A(*x*), if there is a solution (a sequence of swaps such that agent *I* obtains object *x*), then *x* is swapped over each edge in {{*i* − 1*, i*} | *i* ∈ [*k*]}. Moreover, we can assume that *x* is swapped over the edge {0*,* 1} in the last swap of the solution as all swaps afterwards are irrelevant. Our algorithm guesses the object *z* with which object *x* is swapped in this last swap. Note that there are two possibilities for the direction in which *x* moves (clockwise or counter-clockwise) and at most *n* possibilities for choosing *z*. Hence, there are *O*(*n*) iterations in our main algorithm and we can assume that *x* moves counter-clockwise and that object *z* is known. We will use this assumption throughout this section.
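Whether a given sequence of swaps is a solution in this sense can be checked mechanically. The following sketch replays a sequence of swaps on a cycle and records the direction in which each object moves. It assumes the usual swap rule of Reachable Object (two adjacent agents swap only if each strictly prefers the object it receives) and that agent *a* initially holds object *a*; the function name `run_swaps` and the four-agent instance in the usage note are hypothetical.

```python
def run_swaps(n, prefs, swaps):
    """Replay swaps on an n-cycle (n >= 3) and return (holding, directions).

    prefs[a] lists the objects agent a accepts, most preferred first, and
    includes the object a holds initially. Assumption: agent a starts with
    object a. Raises ValueError on a non-adjacent or non-improving swap.
    """
    holding = list(range(n))
    directions = {}
    for a, b in swaps:
        if n < 3 or (a - b) % n not in (1, n - 1):
            raise ValueError(f"agents {a} and {b} are not adjacent")
        xa, xb = holding[a], holding[b]
        for agent, new, old in ((a, xb, xa), (b, xa, xb)):
            plist = prefs[agent]
            if new not in plist or plist.index(new) >= plist.index(old):
                raise ValueError(f"agent {agent} does not prefer {new} over {old}")
        for obj, src, dst in ((xa, a, b), (xb, b, a)):
            step = "clockwise" if (dst - src) % n == 1 else "counter-clockwise"
            directions.setdefault(obj, set()).add(step)
        holding[a], holding[b] = xb, xa
    return holding, directions
```

For example, with `prefs = [[2, 0], [0, 2, 1], [1, 2], [3]]` and the swaps `[(1, 2), (0, 1)]`, agent 0 ends up holding object 2, and every swapped object is recorded with exactly one direction, in line with Observation 7.7.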

**Assumption 7.8.** *Let* $\mathcal{I} = (V, X, \mathcal{P}, \sigma\_0, G, 0, x)$ *with*

$$G := \left( V := \{ 0 \} \cup [n - 1], E := \{ \{ i - 1, i \} \mid i \in [n - 1] \} \cup \{ \{ 0, n - 1 \} \} \right)$$

*be an instance of* Reachable Object *on cycles. If x is reachable for agent* 0*, then there is a solution in which x moves counter-clockwise and in the last swap of the solution it is swapped with object z over the edge* {0*,* 1}*.*

We continue with an analysis of how often objects can be swapped in a cycle. To this end, we first show a helpful lemma which, for two objects *x<sup>i</sup>* and *x<sup>j</sup>* that are swapped in a sequence of swaps, characterizes which other objects are swapped with either of them before *x<sup>i</sup>* and *x<sup>j</sup>* can be swapped.

**Lemma 7.9.** *Let $x\_h$, $x\_i$, and $x\_j$ be three distinct objects. Let $\phi = (\sigma\_0, \sigma\_1, \ldots, \sigma\_t)$ be a sequence of swaps such that $x\_i$ and $x\_j$ are swapped between $\sigma\_{t-1}$ and $\sigma\_t$ and $x\_i \neq x$ moves clockwise in $\phi$. Let $r < t - 1$ such that $x\_i$ and $x\_j$ are not swapped between $\sigma\_{s-1}$ and $\sigma\_s$ for any $s \in [r+1, t-1]$. Then, object $x\_h$ is swapped with either $x\_i$ or $x\_j$ between $\sigma\_{s-1}$ and $\sigma\_s$ for some $s \in [r+1, t-1]$ if and only if $\sigma\_r^{-1}(x\_h) \in [\![\sigma\_r^{-1}(x\_i), \sigma\_r^{-1}(x\_j)]\!]$.*

*Proof.* Note that since $x\_i$ and $x\_j$ are swapped in $\phi$ and since $x\_i$ moves clockwise, it holds that $x\_j$ moves counter-clockwise in $\phi$. We prove the claim by induction over $|[\![\sigma\_r^{-1}(x\_i), \sigma\_r^{-1}(x\_j)]\!]|$. If $|[\![\sigma\_r^{-1}(x\_i), \sigma\_r^{-1}(x\_j)]\!]| = 2$, then $x\_i$ and $x\_j$ can only be swapped over the edge $\{\sigma\_r^{-1}(x\_i), \sigma\_r^{-1}(x\_j)\}$ and hence no other object can be swapped with either object before $x\_i$ and $x\_j$ are swapped. Since no object $x\_h$ other than $x\_i$ and $x\_j$ fulfills $\sigma\_r^{-1}(x\_h) \in [\![\sigma\_r^{-1}(x\_i), \sigma\_r^{-1}(x\_j)]\!]$, this concludes the base case.

Now assume the statement holds for all objects $x\_{i'}$ and $x\_{j'}$ such that $x\_{i'}$ moves clockwise, $x\_{j'}$ moves counter-clockwise, and $|[\![A(x\_{i'}), A(x\_{j'})]\!]| < |[\![A(x\_i), A(x\_j)]\!]|$. Take any object $x\_\ell \notin \{x\_i, x\_j\}$ such that $A(x\_\ell) \in [\![A(x\_i), A(x\_j)]\!]$. We assume without loss of generality that $x\_\ell$ moves counter-clockwise in $\phi$ as the other case is analogous. Note that $x\_i$ and $x\_\ell$ are swapped in $\phi$ as otherwise $x\_\ell$ would always stay "between" $x\_i$ and $x\_j$ and hence $x\_i$ and $x\_j$ could not be swapped in $\phi$. By induction hypothesis, if $x\_i$ and $x\_\ell$ are swapped between $\sigma\_{s-1}$ and $\sigma\_s$ for some $s \ge t$, then $x\_i$ and $x\_j$ are not swapped between $\sigma\_{t-1}$ and $\sigma\_t$, a contradiction. Hence $x\_i$ and $x\_\ell$ are swapped before $x\_i$ and $x\_j$ are swapped, that is, there is some $s \in [r+1, t-1]$ such that $x\_i$ and $x\_\ell$ are swapped between $\sigma\_{s-1}$ and $\sigma\_s$.

It remains to show that no object $x\_h$ with $A(x\_h) \notin [\![A(x\_i), A(x\_j)]\!]$ is swapped with $x\_i$ or $x\_j$ before $x\_i$ and $x\_j$ are swapped. This follows from a simple counting argument. There are $|[\![\sigma\_r^{-1}(x\_i), \sigma\_r^{-1}(x\_j)]\!]| - 1$ edges between $\sigma\_r^{-1}(x\_i)$ and $\sigma\_r^{-1}(x\_j)$. The two objects $x\_i$ and $x\_j$ are swapped over one of these edges. Over each of the other edges exactly one of the objects is swapped before $x\_i$ and $x\_j$ are swapped. Thus, $|[\![\sigma\_r^{-1}(x\_i), \sigma\_r^{-1}(x\_j)]\!]| - 2$ objects are swapped with either $x\_i$ or $x\_j$ before $x\_i$ and $x\_j$ are swapped. As shown above, each object $x\_h$ with $A(x\_h) \in [\![\sigma\_r^{-1}(x\_i), \sigma\_r^{-1}(x\_j)]\!]$ and $x\_h \notin \{x\_i, x\_j\}$ is swapped with $x\_i$ or $x\_j$ before $x\_i$ and $x\_j$ are swapped. The counting argument is then completed by observing that there are $|[\![\sigma\_r^{-1}(x\_i), \sigma\_r^{-1}(x\_j)]\!]| - 2$ such objects.

For an example of Lemma 7.9, recall Figure 7.4. Therein, object $x\_3$ moves clockwise, object $x\_0$ moves counter-clockwise, and objects $x\_4$ and $x\_5$ are initially held by agents in $[\![3, 0]\!]$. Lemma 7.9 states that objects $x\_4$ and $x\_5$ are swapped with $x\_0$ or $x\_3$ before objects $x\_0$ and $x\_3$ are swapped and that objects $x\_1$ and $x\_2$ are not swapped with $x\_0$ or $x\_3$ before $x\_0$ and $x\_3$ are swapped.

Lemma 7.9 has three interesting implications. First, it implies that each pair of objects is swapped at most twice. Note that the example in Figure 7.4 shows that two objects ($x\_3$ and $x\_4$ in the example) can be swapped twice in a cycle. To verify that each pair of objects is swapped at most twice, consider two objects $x\_i$ and $x\_j$ and the assignment $\sigma\_r$ after $x\_i$ and $x\_j$ are swapped for the first time over an edge $\{\ell, \ell + 1 \bmod n\}$. Lemma 7.9 then states that each object $x\_h$ (except for $x\_i$ and $x\_j$) has to be swapped with either $x\_i$ or $x\_j$ before $x\_i$ and $x\_j$ can be swapped for a second time. Thus, each agent has to hold $x\_i$ or $x\_j$ between the two swaps of $x\_i$ and $x\_j$, as for each of the $n - 2$ objects that are swapped with $x\_i$ or $x\_j$ a new agent holds $x\_i$ or $x\_j$. Since after the second swap of $x\_i$ and $x\_j$ agent $A(x\_i)$ has held $x\_i$ and $x\_j$, it will by Assumption 7.8 and Observation 7.1 not accept either of the objects again, so $x\_i$ and $x\_j$ cannot be swapped thrice in a cycle.

**Corollary 7.10.** *Each pair of objects can be swapped at most twice in a cycle.*
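The set of objects that Lemma 7.9 forces to meet one of the two objects first can be read off directly from the current assignment. The sketch below does this for the example from Figure 7.4, under the assumption (as in that figure) that agent *i* currently holds object $x\_i$; the function name `must_meet_first` and the dictionary `holder_of` are hypothetical.

```python
def must_meet_first(holder_of, xi, xj, n):
    """Objects other than xi and xj that are held in the cyclic interval
    from the holder of xi (moving clockwise) to the holder of xj
    (moving counter-clockwise) on an n-cycle."""
    start, end = holder_of[xi], holder_of[xj]
    length = (end - start) % n
    inside = {(start + d) % n for d in range(length + 1)}
    return {obj for obj, agent in holder_of.items()
            if agent in inside and obj not in (xi, xj)}


# Initial assignment of Figure 7.4: agent i holds object x_i.
holder_of = {i: i for i in range(6)}
between = must_meet_first(holder_of, 3, 0, 6)
```

Here `between` contains exactly the objects 4 and 5, matching the statement that $x\_4$ and $x\_5$ are swapped with $x\_0$ or $x\_3$ before $x\_0$ and $x\_3$ are swapped, while $x\_1$ and $x\_2$ are not.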

The second interesting implication of Lemma 7.9 is that if $A(z) \in [\![k, I]\!]$, then each object $x\_i$ with $A(x\_i) \in [\![k, A(z)]\!]$ except for *x* and *z* is not swapped with *x* or *z* before *x* and *z* are swapped. Moreover, no such object can be swapped with another object that is then swapped with *x* or *z*. Notice that in this case agent *I* holds object *z* before *x* and *z* are swapped and thus *x* and *z* are by Observation 7.1 only swapped once. Hence, an object $x\_i$ with $A(x\_i) \in [\![k, A(z)]\!]$ does not have to be swapped at all and thus no two objects have to be swapped over the edge {*k, k* + 1}. We can therefore remove the edge from the cycle to obtain a path and use the algorithm by Huang and Xiao [HX20]. For the remainder of this section, we will therefore assume the following.

**Assumption 7.11.** *$A(z) \in [I + 1, k]$.*

The third implication of Lemma 7.9 concerns how often an object is swapped with *x* or *z*. Note that each object moving clockwise is swapped with *x* (as all objects are swapped with *x* or *z* between the first and second swap of *x* and *z* and an object moving clockwise can never be swapped with *z*). Analogously, each object moving counter-clockwise is swapped with *z*. Thus, Lemma 7.9 implies the following.

**Observation 7.12.** *Each object $x\_i$ with $A(x\_i) \in [\![A(z), k]\!]$ is swapped exactly twice with x or z and each object $x\_j$ with $A(x\_j) \notin [\![A(z), k]\!]$ is swapped exactly once with x or z.*

Note that for each edge *e* ∈ {{*i* − 1*, i*} | *i* ∈ [*k*]}, there is exactly one object that is swapped with *x* over *e* and this object moves clockwise. Moreover, by Observation 7.12 each object moving clockwise is swapped with *x* over one of these edges. Thus, we can characterize a solution by choosing for each edge *e* ∈ {{*i* − 1*, i*} | *i* ∈ [*k*]} one object to move clockwise and be swapped with *x* over *e*.

### **7.3.1 Limited Number of Candidates**

In this subsection, we will show that once object *z* is fixed, there are for each edge *e* ∈ {{*i* − 1*, i*} | *i* ∈ [*k*]} at most two candidate objects $c\_1, c\_2$ such that *x* is swapped with either $c\_1$ or $c\_2$ over *e*. We start with a series of helpful lemmata that will be used often throughout this subsection. The first lemma states that for each pair $(x\_i, x\_j)$ of objects and each agent *ℓ*, the edge where $x\_i$ and $x\_j$ are swapped for the first time is the same in all sequences of swaps where $x\_i$ moves clockwise, $x\_j$ moves counter-clockwise, and agent *ℓ* holds both $x\_i$ and $x\_j$ during the sequence of swaps.

**Lemma 7.13.** *Let $x\_i$ and $x\_j$ be two objects and let $\ell \in [\![A(x\_i), A(x\_j)]\!]$ be an agent. There is an edge e such that for each sequence of swaps $\phi$ such that $x\_i$ moves clockwise in $\phi$, object $x\_j$ moves counter-clockwise in $\phi$, and agent $\ell$ holds both $x\_i$ and $x\_j$ during $\phi$, it holds that $x\_i$ and $x\_j$ are swapped over e during $\phi$. Deciding whether such an edge exists and computing it if it exists takes $O(n)$ time after an $O(n^2)$-time preprocessing step.*

*Proof.* We distinguish between the two cases $x\_j \succ\_\ell x\_i$ and $x\_i \succ\_\ell x\_j$. Since both cases are completely analogous, we only show the proof for the former case. Note that agent $\ell$ must then first hold object $x\_i$ before it holds object $x\_j$ as it would otherwise not accept object $x\_i$ after already holding $x\_j$ or an object it prefers over $x\_j$. Thus, objects $x\_i$ and $x\_j$ must be swapped between two agents in $[\![\ell, A(x\_j)]\!]$. Now iteratively consider the preference list of an agent $\ell' \in [\![\ell, A(x\_j)]\!]$ (starting with agent $\ell + 1 \bmod n$). If agent $\ell'$ also prefers $x\_j$ over $x\_i$, then $x\_i$ and $x\_j$ cannot be swapped over the edge $\{\ell' - 1, \ell'\}$. Hence, agent $\ell'$ must also hold object $x\_i$ before it holds $x\_j$ and we can continue the argumentation until we either find an agent who prefers $x\_i$ over $x\_j$ or we reach agent $A(x\_j)$ and $A(x\_j)$ also prefers $x\_j$ over $x\_i$. If we reach agent $A(x\_j)$ and $A(x\_j)$ also prefers $x\_j$ over $x\_i$, then all agents in $[\![\ell, A(x\_j)]\!]$ prefer $x\_j$ over $x\_i$ and hence these two objects cannot be swapped between two such agents. If agent $\ell'$ prefers $x\_i$ over $x\_j$, then these two objects can only be swapped over the edge $\{\ell' - 1, \ell'\}$ as shown next. Assume towards a contradiction that $x\_i$ and $x\_j$ were swapped over another edge $\{h - 1, h\}$ where $h \in [\![\ell' + 1, A(x\_j)]\!]$. Then, agent $\ell'$ has to pass object $x\_i$ towards agent $h$ before agent $\ell'$ holds object $x\_j$. This means, however, that agent $\ell'$ will not accept object $x\_j$ as it prefers $x\_i$ over $x\_j$. Thus, object $x\_j$ cannot be passed to agent $\ell$, a contradiction.

It remains to analyze the running time of the algorithm. We first describe a simple preprocessing step that makes it easy to decide which of two objects an agent prefers. We define $\text{pos}(i, x\_j)$ as the position of object $x\_j$ in the preference list of agent *i*. Note that pos can be precomputed once in $O(n^2)$ time by iterating over the preference list of each agent. Once the preprocessing is done, we have to (in the worst case) check for each agent $\ell' \in [\![\ell, A(x\_j)]\!]$ whether it prefers $x\_i$ over $x\_j$ or not. Since this is only a check whether $\text{pos}(\ell', x\_i) < \text{pos}(\ell', x\_j)$ or not, the whole procedure takes $O(n)$ time in total after preprocessing. Note that the preprocessing can be done once and then be reused for each application of Lemma 7.13.
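The procedure from the proof of Lemma 7.13 can be sketched as follows. The names `build_pos` and `first_edge` are hypothetical, as is the convention that an object missing from a preference list gets position infinity. The proof only spells out the case in which agent *ℓ* prefers $x\_j$ over $x\_i$; the second branch below is the mirrored scan that the "completely analogous" case suggests and should be read as an assumption.

```python
import math


def build_pos(prefs, num_objects):
    """pos[a][x] is the rank of object x in agent a's preference list
    (0 = most preferred, math.inf = not in the list)."""
    pos = [[math.inf] * num_objects for _ in prefs]
    for agent, plist in enumerate(prefs):
        for rank, obj in enumerate(plist):
            pos[agent][obj] = rank
    return pos


def first_edge(pos, n, holder, xi, xj, l):
    """Edge over which xi (moving clockwise) and xj (moving counter-clockwise)
    must be swapped if agent l holds both, or None if no such edge exists.
    holder[x] is the agent initially holding object x."""
    if pos[l][xj] < pos[l][xi]:
        # l holds xi first: scan clockwise from l + 1 up to the holder of xj.
        for d in range(1, (holder[xj] - l) % n + 1):
            cur = (l + d) % n
            if pos[cur][xi] < pos[cur][xj]:
                return ((cur - 1) % n, cur)
        return None
    # Mirrored case: l holds xj first, scan counter-clockwise down to the holder of xi.
    for d in range(1, (l - holder[xi]) % n + 1):
        cur = (l - d) % n
        if pos[cur][xj] < pos[cur][xi]:
            return (cur, (cur + 1) % n)
    return None
```

Building `pos` touches every preference-list entry once, and each call to `first_edge` afterwards performs at most *n* constant-time comparisons, in line with the bounds stated in the lemma.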

Based on Lemma 7.13, we define the set of edges where two objects $x\_i$ and $x\_j$ can be swapped.

**Definition 7.1.** Let $x\_i$ and $x\_j$ be two objects and let *a, b* be two agents such that $a \in [\![A(x\_i), A(x\_j)]\!]$ and $b \notin [\![A(x\_i), A(x\_j)]\!]$. The *first edge* $\mathsf{fe}\_a(x\_i, x\_j)$ is the edge computed by Lemma 7.13 for $x\_i$, $x\_j$, and *a*. If this edge does not exist, then $\mathsf{fe}\_a(x\_i, x\_j) := \bot$. Let $\{s, s + 1 \bmod n\} := \mathsf{fe}\_{A(x\_i)}(x\_i, x\_j)$. The *second edge* $\mathsf{se}\_b(x\_i, x\_j)$ is the edge computed by Lemma 7.13 for $x\_i$, $x\_j$, and *b* after $x\_i$ and $x\_j$ have been swapped over $\{s, s + 1 \bmod n\}$, that is, when agent *s* initially holds $x\_j$ and agent $s + 1 \bmod n$ initially holds object $x\_i$. If this edge does not exist, then $\mathsf{se}\_b(x\_i, x\_j) := \bot$.

The second lemma states that an object can never "overtake" another object that moves in the same direction in the cycle.

**Lemma 7.14.** *Let $x\_h$, $x\_i$, and $x\_j$ be three objects such that $A(x\_i) \in [\![A(x\_h), A(x\_j)]\!]$. Let $\phi = (\sigma\_0, \sigma\_1, \ldots, \sigma\_s)$ be a sequence of swaps in which the three objects move in the same direction. For each $p \in [s]$, it holds that $\sigma\_p^{-1}(x\_i) \in [\![\sigma\_p^{-1}(x\_h), \sigma\_p^{-1}(x\_j)]\!]$.*

*Proof.* We assume that $x\_h$, $x\_i$, and $x\_j$ are distinct as otherwise the statement trivially holds. Assume towards a contradiction that there is some minimal $p \in [s]$ such that $\sigma\_p^{-1}(x\_i) \notin [\![\sigma\_p^{-1}(x\_h), \sigma\_p^{-1}(x\_j)]\!]$. Since *p* is minimal, it holds that $\sigma\_{p-1}^{-1}(x\_i) \in [\![\sigma\_{p-1}^{-1}(x\_h), \sigma\_{p-1}^{-1}(x\_j)]\!]$. We now consider the swap $\{(a, x\_c), (b, x\_d)\}$ between $\sigma\_{p-1}$ and $\sigma\_p$. Since $x\_c$ and $x\_d$ are swapped in $\phi$, they move in different directions. Let without loss of generality $x\_c$ be the object that moves in the same direction as $x\_i$, $x\_j$, and $x\_h$. If $\{x\_i, x\_j, x\_h\} \cap \{x\_c\} = \emptyset$, then

$$
\sigma\_p^{-1}(x\_i) = \sigma\_{p-1}^{-1}(x\_i) \in [\![\sigma\_{p-1}^{-1}(x\_h), \sigma\_{p-1}^{-1}(x\_j)]\!] = [\![\sigma\_p^{-1}(x\_h), \sigma\_p^{-1}(x\_j)]\!],
$$

a contradiction. Hence, $x\_c \in \{x\_i, x\_j, x\_h\}$. We distinguish between the two cases $x\_c = x\_i$ and $x\_c \in \{x\_j, x\_h\}$. If $x\_c = x\_i$, then

$$[\![\sigma\_{p-1}^{-1}(x\_h), \sigma\_{p-1}^{-1}(x\_j)]\!] = [\![\sigma\_p^{-1}(x\_h), \sigma\_p^{-1}(x\_j)]\!].$$

Since $x\_i$, $x\_j$, and $x\_h$ are distinct objects, it holds that

$$
\sigma\_{p-1}^{-1}(x\_i) \in [\![\sigma\_{p-1}^{-1}(x\_h), \sigma\_{p-1}^{-1}(x\_j)]\!] \setminus \left\{\sigma\_{p-1}^{-1}(x\_h), \sigma\_{p-1}^{-1}(x\_j)\right\}.
$$

Thus, since each object can only move one position in each swap, it holds that

$$
\sigma\_p^{-1}(x\_i) \in [\![\sigma\_{p-1}^{-1}(x\_h), \sigma\_{p-1}^{-1}(x\_j)]\!] = [\![\sigma\_p^{-1}(x\_h), \sigma\_p^{-1}(x\_j)]\!],
$$

which is again a contradiction. Finally, if $x\_c \in \{x\_j, x\_h\}$, then note that $x\_d \neq x\_i$ as they move in different directions. Hence,

$$
\sigma\_p^{-1}(x\_i) = \sigma\_{p-1}^{-1}(x\_i) \in [\![\sigma\_{p-1}^{-1}(x\_h), \sigma\_{p-1}^{-1}(x\_j)]\!].
$$

Again, since the three objects are distinct, $x\_i$ is held by an agent

$$
\sigma\_{p-1}^{-1}(x\_i) \in [\![\sigma\_{p-1}^{-1}(x\_h), \sigma\_{p-1}^{-1}(x\_j)]\!] \setminus \left\{\sigma\_{p-1}^{-1}(x\_h), \sigma\_{p-1}^{-1}(x\_j)\right\},
$$

and since in each step this interval can shrink by at most one, it follows that

$$
\sigma\_p^{-1}(x\_i) = \sigma\_{p-1}^{-1}(x\_i) \in [\![\sigma\_p^{-1}(x\_h), \sigma\_p^{-1}(x\_j)]\!],
$$

a contradiction.

Finally, the third lemma states that there is a constant *c* such that for each object *x<sup>i</sup>* which is swapped twice with *x*, the two edges over which *x<sup>i</sup>* is swapped with *x* have distance *c* in the input graph.

**Lemma 7.15.** *Let $\phi$ be a sequence of swaps such that agent I swaps z for x in the last swap of $\phi$. Let $x\_i$ be an object that moves clockwise in $\phi$ such that $A(x\_i) \in [\![A(x), A(z)]\!]$. Then, $\mathsf{se}\_I(x\_i, x) \neq \bot \neq \mathsf{se}\_I(z, x)$. Let*

$$\begin{aligned} \{s\_1, s\_1+1\} &:= \mathsf{fe}\_k(x\_i, x), & \{t\_1, t\_1+1\} &:= \mathsf{se}\_I(x\_i, x), \\ \{s\_2, s\_2+1\} &:= \mathsf{fe}\_k(z, x), \quad \text{and} & \{t\_2, t\_2+1\} &:= \mathsf{se}\_I(z, x). \end{aligned}$$

*Then, it holds that $s\_1 - t\_1 = s\_2 - t\_2$.*

*Proof.* This lemma follows almost directly from Lemmata 7.9 and 7.14. Assume towards a contradiction that there is some $x\_i$ that moves clockwise and is swapped twice with *x* in $\phi$ such that $s\_1 - t\_1 \neq s\_2 - t\_2$. Consider the set *Y* of objects $x\_j$ that move clockwise in $\phi$ and such that $A(x\_j) \in [\![A(z), A(x\_i)]\!]$ (excluding $x\_i$ and *z*). By Lemma 7.9, $s\_1 - t\_1 = |Y|$. Now consider the assignment $\sigma\_r$ in $\phi$ after the first swap of $x\_i$ and *x* and the set $Y'$ of objects $x\_j$ that move clockwise in $\phi$ with $\sigma\_r^{-1}(x\_j) \in [\![\sigma\_r^{-1}(z), \sigma\_r^{-1}(x\_i)]\!]$. By Lemma 7.9 it holds that $s\_2 - t\_2 = |Y'|$ and by Lemma 7.14 that $Y = Y'$. Thus, $s\_1 - t\_1 = |Y| = |Y'| = s\_2 - t\_2$, a contradiction.

We continue with the central definition of this subsection: the *type* of an object. The type of an object is represented by the index of the edge where the object can possibly be swapped with *x* for the first time. The idea behind types is the following. We will develop a 2-SAT program to determine which objects move clockwise in a solution. We will show that there are at most two objects of each type and, roughly speaking, we will introduce a variable for each type that represents which of the two objects of a type moves clockwise. We will use Lemma 7.13 to define the type of an object $x\_j$. It only remains to find an agent which holds each of $x\_j$ and *x* at some point in time. If $A(x\_j) \in [I, k]$, then object *x* has to pass agent $A(x\_j)$ and hence we can use this agent. If $A(x\_j) \notin [I, k]$, then $I \in [\![A(x\_j), k]\!]$ and hence agent *I* has to first hold object $x\_j$ before it can receive object *z* and hence we can use agent *I* in Lemma 7.13.

**Definition 7.2.** The *index* of an edge {*t* − 1*, t*} with *t* ∈ [*k*] is *t*. For each object $x\_j$ with $A(x\_j) \in [I, k]$, the *type* of $x\_j$ is the index of $\mathsf{fe}\_{A(x\_j)}(x\_j, x)$. For each object $x\_j$ with $A(x\_j) \notin [I, k]$, the *type* of $x\_j$ is the index of $\mathsf{fe}\_I(x\_j, x)$. If the respective value is ⊥, then the type of $x\_j$ is 0. The *candidate set* $\mathcal{C}\_\alpha$ *for α* contains all objects of type *α*.

Figure 7.6 shows an example of types. We continue by showing that exactly one object of each type moves clockwise in any solution. We use $t\_z$ to denote the type of *z*. Note that *x* and *z* have to be swapped for the first time over the edge $\{t\_z - 1, t\_z\}$. By Lemma 7.9, for each edge

$$\{t\_z, t\_z + 1\}, \ \{t\_z + 1, t\_z + 2\}, \dots, \{k - 1, k\}$$

one object $x_i$ with $\mathcal{A}(x_i) \in \llbracket \mathcal{A}(z), k \rrbracket$ has to move clockwise and be swapped with $x$ over the respective edge. By Definition 7.2, these objects have types

$$t\_z, t\_z + 1, \dots, k$$

and, by Observation 7.12 and Lemma 7.15, these $k - t_z + 1$ objects are swapped a second time with $x$ over the edges $\{0, 1\}, \{1, 2\}, \dots, \{k - t_z, k - t_z + 1\}$. Hence for each edge $\{k - t_z + 1, k - t_z + 2\}, \{k - t_z + 2, k - t_z + 3\}, \dots, \{t_z - 2, t_z - 1\}$ there is an object that is swapped once with $x$. Since the number of such objects is $(t_z - 1) - (k - t_z + 1) = 2t_z - k - 2$, there are $(2t_z - k - 2) + (k - t_z + 1) = t_z - 1$ objects that move clockwise in total in each solution where $x$ moves counter-clockwise and $z$ moves clockwise. By definition of types, these have to have types $\alpha \in [k - t_z + 1, k]$.
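The count in the preceding paragraph can be collected into one chain of equalities. The numeric instance ($k = 7$, $t_z = 5$) is an assumption chosen purely for illustration and does not appear in the original text:

$$\underbrace{(k - t_z + 1)}_{\text{swapped twice with } x} + \underbrace{(t_z - 1) - (k - t_z + 1)}_{\text{swapped once with } x} = (k - t_z + 1) + (2t_z - k - 2) = t_z - 1.$$

For $k = 7$ and $t_z = 5$ this gives $3 + 1 = 4 = t_z - 1$ objects that move clockwise.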

**Figure 7.6:** An example for types. The objects $a$, $b$, $c$, $d$, $e$, and $x$ are placed next to the agents that initially hold them. The numbers next to edges between $I$ and $k$ depict the index of the respective edge. Objects $c$ and $e$ can never be swapped with $x$ before $x$ reaches agent $I$. Thus they both have type 0. The type of object $b$ is 2 as objects $b$ and $x$ can only be swapped over the edge with index 2 for the first time if $x$ moves counter-clockwise. The type of $a$ and $d$ is 1 since if $x$ moves counter-clockwise and is swapped with either $a$ or $d$ over a different edge than $\{0, 1\}$, then agent 1 holds the respective object before it holds $x$. Since agent 1 prefers either object over object $x$, it would not accept $x$ and hence there is no solution in which $a$ or $d$ is swapped with $x$ over a different edge than $\{0, 1\}$.

**Observation 7.16.** *Let $\varphi := (\sigma_0, \sigma_1, \dots, \sigma_t)$ be a sequence of swaps which satisfies Assumptions 7.8 and 7.11 and such that $\sigma_t(I) = x$.*

*For each type $\alpha \in [k - t_z + 1, k]$ there is exactly one object $x_\alpha$ of type $\alpha$ that moves clockwise in $\varphi$. All objects whose type is not in $[k - t_z + 1, k]$ move counter-clockwise in $\varphi$. For $\alpha \in [k - t_z + 1, t_z - 1]$, it holds that $\mathcal{A}(x_\alpha) \notin \llbracket \mathcal{A}(z), k \rrbracket$. For $\alpha \in [t_z, k]$, it holds that $\mathcal{A}(x_\alpha) \in \llbracket \mathcal{A}(z), k \rrbracket$.*

Using Observation 7.16, we can now formalize *selections*. Selections are an equivalent way of thinking about Reachable Object on cycles. They characterize which objects move clockwise and which objects move counter-clockwise.

**Definition 7.3.** Let $\lambda \subseteq [k]$. A set $\iota$ of objects is a *selection for* $\lambda$ if it contains exactly one object of each type in $\lambda$ and no other objects. A set $\iota$ is a *selection* if it is a selection for $[k - t_z + 1, k]$.
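Definition 7.3 translates almost literally into a membership test. The following is a minimal sketch; the helper names `is_selection_for` and `is_selection`, the dictionary `type_of`, and the sample values are assumptions introduced only for this illustration.

```python
from collections import Counter


def is_selection_for(iota, lam, type_of):
    """True iff iota contains exactly one object of each type in lam and no other objects."""
    counts = Counter(type_of[obj] for obj in iota)
    return set(counts) == set(lam) and all(c == 1 for c in counts.values())


def is_selection(iota, type_of, k, t_z):
    """True iff iota is a selection for the interval [k - t_z + 1, k]."""
    return is_selection_for(iota, range(k - t_z + 1, k + 1), type_of)


# Hypothetical instance: two objects of type 3, one each of types 4 and 5.
type_of = {"a": 3, "b": 3, "c": 4, "d": 5}
```

With these values, `{"a", "c"}` is a selection for `{3, 4}`, whereas `{"a", "b", "c"}` is not because it contains two objects of type 3.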

We will show in Subsection 7.3.2 how to test whether a given selection leads to a solution, that is, a sequence of swaps such that agent *I* obtains object *x*. In the remainder of this subsection, we focus on eliminating possible selections.

Observe that if the type of an object is 0, then it cannot be swapped with $x$ and hence it has to be moved counter-clockwise. We will slightly misuse the definition of types and relabel the type of any object $x_j$ to 0 if we know for some reason that $x_j$ has to move counter-clockwise. Lemma 7.15 states a first rule that can be used to relabel the type of an object to 0. Hence, we assume that each object of type $\alpha \neq 0$ fulfills the conditions of Lemma 7.15. We conclude this subsection with a proposition that identifies at most two "relevant" objects of each type and allows us to relabel the type of all other objects of this type to 0 (because they cannot be moved clockwise in a solution). To this end, we define the *subtypes* of an object. Roughly speaking, the subtype of an object $x_i$ encodes whether $x_i$ is "closer" (counted in clockwise steps) to $z$ than the other object that can be considered or whether $x_i$ is "further away". We distinguish between objects that are possibly swapped once with $x$ and objects that are possibly swapped twice with $x$. The main idea is that if $x_i$ is not selected (it moves counter-clockwise), then it has to be swapped with $z$. Thus we can use Lemma 7.13 to compute the edge where $x_i$ and $z$ can be swapped. We can then check whether another object of the same type $\alpha$ as $x_i$ has to move clockwise in order for $x_i$ to reach the respective edge where $x_i$ and $z$ can be swapped. We say that $x_i$ has subtype $f$ if another object of type $\alpha$ between $z$ and $x_i$ has to move clockwise in order for $x_i$ to reach the specified edge. Otherwise, we say that $x_i$ has subtype $c$. This characteristic is captured by the following definition.

**Definition 7.4.** Let $x_i$ be an object of type $\alpha \in [k - t_z + 1, t_z - 1]$, let $x_j$ be an object of type $\beta \in [t_z + 1, k]$, and let $t_z$ be the type of $z$. If $\mathcal{A}(x_i) \in [I, \mathcal{A}(z) - 1]$, then let $e := \mathrm{fe}_I(z, x_i)$ and if $\mathcal{A}(x_i) \in [k, n - 1]$, then let $e := \mathrm{fe}_{\mathcal{A}(x_i)}(z, x_i)$. If $\{a - 1, a\} := e \neq \bot$, then $h := |\llbracket a, \mathcal{A}(x_i) \rrbracket| - 1$ is the *distance between $x_i$ and $z$*. If $\alpha > h$, then the *subtype* of $x_i$ is $c$ (for closer) and if $\alpha \le h$, then the *subtype* of $x_i$ is $f$ (for further).

Let $e_1 := \mathrm{fe}_{\mathcal{A}(x_j)}(z, x_j)$ and $e_2 := \mathrm{se}_I(z, x_j)$. If $\{b - 1, b\} := e_1 \neq \bot$ and $\{c - 1, c\} := e_2 \neq \bot$, then $h := |\llbracket b, \mathcal{A}(x_j) \rrbracket| - 1$ is the *distance between $x_j$ and $z$* and let $h' := |\llbracket c, b - 1 \rrbracket| - 1$. If $\beta > t_z + h$ and $h' = t_z - 2$, then the *subtype* of $x_j$ is $c$ and if $\beta \le t_z + h$ and $h' = t_z - 2$, then the *subtype* of $x_j$ is $f$.

See Figure 7.7 for an illustration of subtypes. If $e = \bot$, then $x_i$ cannot move counter-clockwise and hence we can relabel the type of all other objects of type $\alpha$ to 0. Analogously, if $\{e_1, e_2\} \cap \{\bot\} \neq \emptyset$, then $x_j$ cannot move counter-clockwise and hence we relabel the type of all other objects of type $\beta$ to 0.

**Figure 7.7:** An example that illustrates the main idea behind Definition 7.4 and Proposition 7.18. Some objects are depicted next to the agents that initially hold them. Let the type of $y$ be $\beta = 6$ and assume that object $y$ moves counter-clockwise. It therefore has to be swapped with $z$ at some point and since $z$ has to pass agent $\mathcal{A}(y)$ to reach agent $I$, we can use Lemma 7.13 to compute the edge $e := \mathrm{fe}_{\mathcal{A}(y)}(z, y)$ where $y$ and $z$ swap for the first time. Let $e = \{12, 13\}$ (the red edge). The distance $h$ computed in Definition 7.4 is then $h := |\llbracket 13, \mathcal{A}(y) \rrbracket| - 1 = |\llbracket 13, 17 \rrbracket| - 1 = 4$ and describes the number of edges that object $y$ has to pass before it can be swapped with $z$. Note that for each type $\alpha \ge t_z$, there is an object of type $\alpha$ that is swapped with $x$ twice (first in the violet (bottom right) region between agents $t_z - 1$ and $k$ and, by Lemma 7.15, a second time in the orange region (top right) between agents $I$ and $k - t_z + 1$). All of these objects have to be swapped with $y$ and, by Lemma 7.14, this has to be in the yellow (left) region as $z$ is swapped with $y$ over the red edge and all other objects have to be swapped with $y$ on consecutive edges. Similarly, the object that is swapped with $y$ over the edge $\{15, 16\}$ is swapped with $x$ over the edge $\{3, 4\}$, the object that is swapped with $y$ over the edge $\{16, 17\}$ is swapped with $x$ over the edge $\{4, 5\}$, and so on. Hence, the object that is swapped with $y$ over the edge $\{16, 17\}$ (the first edge of $y$ in counter-clockwise direction) is swapped with $x$ over the edge $\{h, h + 1\} = \{4, 5\}$ (green) and is therefore of type $h + 1 = 5$. Since $\beta > h$, the subtype of $y$ is $f$ and $y$ is not swapped with an object of type $\beta$ that is initially held by an agent in $\llbracket \mathcal{A}(z), \mathcal{A}(y) \rrbracket$.

Before we show the main proposition of this section, we prove a lemma that characterizes the types of all objects an object moving counter-clockwise is swapped with.

**Lemma 7.17.** *Let x<sup>i</sup> be an object of type α and let h be the distance between x<sup>i</sup> and z. If x<sup>i</sup> moves counter-clockwise, then*


*Proof.* Let *x<sup>i</sup>* be an object of type *α* that moves counter-clockwise. We consider the two cases A(*xi*) ∈ [*I,* A(*z*) − 1] and A(*xi*) ∈ [A(*z*)*, n* − 1].

If A(*xi*) ∈ [*I,* A(*z*) − 1], then note that agent *I* has to hold object *x<sup>i</sup>* before it can obtain *z*. Hence, by Lemma 7.13, objects *x<sup>i</sup>* and *z* are swapped over the edge fe*<sup>I</sup>* (*z, xi*). If A(*xi*) ∈ [A(*z*)*, n* − 1], then note that agent A(*xi*) has to hold object *z* before agent *I* can hold object *z*. Hence, by Lemma 7.13, objects *x<sup>i</sup>* and *z* are swapped over the edge feA(*xi*)(*z, xi*).

If the respective edge where $x_i$ and $z$ can swap for the first time exists, then we denote it by $\{a - 1, a\}$. Note that the distance $h$ between $x_i$ and $z$ exactly describes the number of edges in the path between $\mathcal{A}(x_i)$ and $a$ that $x_i$ has to pass before it can be swapped with $z$. Note that each object with which $x_i$ is swapped moves clockwise. By Lemmata 7.9 and 7.15, the object that is swapped with $x_i$ over the edge $\{a, a + 1\}$ is swapped with $x$ over the edge $\{t_z, t_z + 1\}$ and it therefore has type $t_z + 1$. Repeating this argument, the object that is swapped with $x_i$ over the edge $\{a + 1, a + 2\}$ is of type $t_z + 2$ and so on until type $k$ is reached (after $k - t_z$ iterations). Thus, if $h \le k - t_z$, then $x_i$ is swapped with an object of each type $\beta \in [t_z, t_z + h]$. If $h > k - t_z$, then $x_i$ is swapped with an object of each type $\beta \in [t_z, k]$ in the first $k - t_z$ iterations. The next type after $k$ has then to be $k - t_z + 1$ as the object of type $k$ is swapped with $x$ also over the edge $\{k - t_z - 2, k - t_z - 1\}$. Thus, if $h > k - t_z$, then $x_i$ is also swapped with an object of each type $\beta \in [k - t_z + 1, h]$.
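The case distinction at the end of this proof can be sketched as a small function. This is only an illustration of the two cases as they are worded in the proof; the function name `types_swapped_with` and the sample values $k = 7$, $t_z = 5$ are assumptions and not taken from the original text.

```python
def types_swapped_with(h, k, t_z):
    """Types of the objects that a counter-clockwise object at distance h is swapped with.

    Follows the two cases in the proof:
      h <= k - t_z: every type in [t_z, t_z + h]
      h >  k - t_z: every type in [t_z, k], followed by every type in [k - t_z + 1, h]
    """
    if h <= k - t_z:
        return list(range(t_z, t_z + h + 1))
    return list(range(t_z, k + 1)) + list(range(k - t_z + 1, h + 1))


# Hypothetical parameters: k = 7 and t_z = 5.
short = types_swapped_with(1, 7, 5)
wrapped = types_swapped_with(4, 7, 5)
```

In the second call the types wrap around after type $k$, which is the situation in which an object of small type $\alpha \le h$ is reached.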

We conclude with the main result of this subsection. This allows us to relabel the type of all except for two objects of some type $\alpha \neq 0$ to 0. If two objects of type $\alpha \neq 0$ remain afterwards, then they have different subtypes.

**Proposition 7.18.** *Given objects $x$ and $z$, there is an $O(n^2)$-time preprocessing that excludes all but at most two objects of each type $\alpha \ge k - t_z + 2$ as potential candidates for being swapped with $x$. Afterwards $|\mathcal{C}_\alpha| \le 2$ for all $\alpha \in [k - t_z + 1, k]$.*

*Proof.* Consider a type $\alpha \ge k - t_z + 2$ and all objects of type $\alpha$. Compute the subtype of each of these objects. Exactly one of them is moved clockwise and all others have to be swapped with $z$ at some point. Let $x_i$ be an object of type $\alpha$. We now consider the two cases $\alpha \in [k - t_z + 1, t_z - 1]$ and $\alpha \in [t_z + 1, k]$. We start with the case where $\alpha \in [k - t_z + 1, t_z]$. Note that in this case if $\mathcal{A}(x_i) \in \llbracket \mathcal{A}(z), k \rrbracket$ and $x_i$ moves clockwise, then, by Lemma 7.9, $x_i$ is swapped with $x$ before $x$ and $z$ are swapped. Hence, $x_i$ cannot be swapped with $x$ for the first time over the edge $\{\alpha - 1, \alpha\}$. Thus, $x_i$ cannot move clockwise and its type can be relabeled to 0. We therefore assume that $\mathcal{A}(x_i) \notin \llbracket \mathcal{A}(z), k \rrbracket$. We consider the two cases $\mathcal{A}(x_i) \in [I, \mathcal{A}(z) - 1]$ and $\mathcal{A}(x_i) \in [k, n - 1]$. If $\mathcal{A}(x_i) \in [I, \mathcal{A}(z) - 1]$ and $x_i$ moves counter-clockwise, then note that agent $I$ has to hold object $x_i$ before it can obtain $x$. Hence, by Lemma 7.13, objects $x_i$ and $z$ are swapped over the edge $\mathrm{fe}_I(z, x_i)$. If $\mathcal{A}(x_i) \in [k, n - 1]$, then note that agent $\mathcal{A}(x_i)$ has to hold object $z$ before agent $I$ can hold object $z$. Hence, by Lemma 7.13, objects $x_i$ and $z$ are swapped over the edge $\mathrm{fe}_{\mathcal{A}(x_i)}(z, x_i)$. Note that if the respective edge does not exist (the respective value is $\bot$), then $x_i$ cannot move counter-clockwise and thus all other objects of type $\alpha$ have to move counter-clockwise and we can therefore relabel their type to 0. Note further that there is no solution if such an edge does not exist for multiple objects of the same type.

If $\mathcal{A}(x_i) \in [t_z + 1, k]$ and $x_i$ moves clockwise, then note that $\mathcal{A}(x_i) \in \llbracket \mathcal{A}(z), k \rrbracket$ or object $x_i$ cannot be swapped twice with $x$ before $x$ and $z$ are swapped twice. Hence, $\mathcal{A}(x_i)$ holds $z$ during a solution and we can use Lemma 7.13 to compute the edge $\mathrm{fe}_{\mathcal{A}(x_i)}(z, x_i)$ over which $x_i$ and $z$ are swapped for the first time. Again, if the respective edge does not exist (the respective value is $\bot$), then $x_i$ cannot move counter-clockwise and thus all other objects of type $\alpha$ have to move counter-clockwise and we can therefore relabel their type to 0. Moreover, $x_i$ and $z$ have to be swapped a second time over the edge $e' := \mathrm{se}_I(z, x_j)$. If $\{b - 1, b\} := e \neq \bot$ and $\{c - 1, c\} := e' \neq \bot$, then let $h' := |\llbracket c, b - 1 \rrbracket| - 1$. Note that if $h \neq t_z - 1$, then $x_i$ and $z$ cannot be swapped over $\{b - 1, b\}$ and $\{c - 1, c\}$ as there are, by Observation 7.16, exactly $t_z - 1$ objects that move clockwise in any solution. Hence, in this case $x_i$ has to move clockwise and we can relabel the type of all other objects to 0.

By Lemma 7.17, object *x<sup>i</sup>* is swapped with an object of type *α* if and only if *h* ≥ *α*, that is, if the subtype of *x<sup>i</sup>* is *f*.

We now show that there are at most two candidates for each type $\alpha$. To this end, we iterate over all agents, starting with $\mathcal{A}(z)$ and iterating clockwise. If the object that is initially held by the agent is of type $\alpha$, then we add its subtype to the end of an initially empty sequence. We then distinguish whether the sequence is *sorted*, that is, it is $(c, c, \dots, c, f, f, \dots, f)$, or not. If the sequence is not sorted, then for each object $x_i$ of type $\alpha$, there exists an object of type $\alpha$ and subtype $f$ that starts in $\llbracket \mathcal{A}(z), \mathcal{A}(x_i) \rrbracket$ or an object of type $\alpha$ and subtype $c$ that starts in $\llbracket \mathcal{A}(x_i), \mathcal{A}(z) \rrbracket$. Thus, if $x_i$ moves clockwise, then the number of counter-clockwise steps of some other object of type $\alpha$ does not match the number of swaps needed to reach the edge where the objects can be swapped with $z$. Hence, no object of type $\alpha$ can be moved clockwise and therefore there is no solution.

Now consider the case where the objects are sorted by their subtype. By the same argument as above there are only two possible objects of type $\alpha$ that can possibly be moved clockwise: the "last" object of subtype $c$ and the "first" object of subtype $f$. We can therefore set the type of all other objects of type $\alpha$ to 0.
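The sortedness test and the choice of the two remaining candidates can be sketched as follows. The input format (a list of `(name, subtype)` pairs for one fixed type, already ordered clockwise starting at $\mathcal{A}(z)$) and the function name `relevant_candidates` are assumptions made for this illustration; computing the subtypes themselves is not shown.

```python
def relevant_candidates(objects_clockwise_from_z):
    """Return the at most two candidates of one type, or None if there is no solution.

    objects_clockwise_from_z: list of (name, subtype) pairs with subtype in {"c", "f"},
    ordered by initial position clockwise starting at the agent holding z.
    """
    subtypes = [s for _, s in objects_clockwise_from_z]
    # Sorted means (c, ..., c, f, ..., f); "c" < "f" in lexicographic order.
    if subtypes != sorted(subtypes):
        return None
    last_c, first_f = None, None
    for name, subtype in objects_clockwise_from_z:
        if subtype == "c":
            last_c = name          # keep overwriting: ends at the last "c"
        elif first_f is None:
            first_f = name         # remember only the first "f"
    return [obj for obj in (last_c, first_f) if obj is not None]


# Hypothetical objects of one type, in clockwise order from A(z).
kept = relevant_candidates([("p", "c"), ("q", "c"), ("r", "f"), ("s", "f")])
```

All objects of the type that are not returned would have their type relabeled to 0; a return value of `None` corresponds to the unsorted case in which no solution exists.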

It remains to analyze the running time. Let $n_\alpha$ be the number of objects of type $\alpha$. Since the subtype for each object of type $\alpha$ can be computed in $O(n)$ time, we obtain that the described preprocessing takes $O(n_\alpha \cdot n)$ time for type $\alpha$. After having computed the subtype of each object of type $\alpha$, we iterate over all these objects and find in $O(n)$ time the two specified objects or determine that the objects are not ordered by their subtype. Hence, the overall running time is in $O(\sum_{\alpha > t_z} (n_\alpha \cdot n))$. Note that each object (except for $x$) has exactly one type and hence $\sum_{\alpha > t_z} n_\alpha < n$. Thus, the overall running time is bounded by $O(\sum_{\alpha > t_z} (n_\alpha \cdot n)) \subseteq O(n^2)$.

Proposition 7.18 shows that there are at most two objects of each type. In the following, we will partition types into blocks where we will observe that for each block there are at most two possible choices of which objects of the respective types to move clockwise. These choices will then be used to develop a 2-SAT program. Note that the proof of Proposition 7.18 also states that if there are two objects of some type $\alpha \neq 0$, then one of them has subtype $c$ and one has subtype $f$. Hence, we can uniquely identify any object $x_i$ which does not have type 0 by its type-subtype combination. For the sake of readability, we will denote the unique object of type $\alpha$ and subtype $c$ by $\alpha_c$. Analogously, $\beta_f$ is the unique object of type $\beta$ and subtype $f$. If an object $x_i$ is the only object of some type $\alpha \neq 0$, then we say that $\alpha_c = \alpha_f = x_i$.

### **7.3.2 Compatibility of Solutions**

So far, we have shown how to compute a set of at most two candidates to be swapped with $x$ over each edge which $x$ has to pass. Recall that the main idea of our algorithm for Reachable Object on cycles is as follows. We first partition the edges which $x$ has to pass such that


We then develop a 2-SAT program with a variable for each part of the described partition and use it to compute a set of pairwise compatible candidates for each edge. We next show how to partition types (these represent all edges that $x$ has to pass) such that there are only two possible selections for all types in one part of the partition (we will call those parts *blocks*). Afterwards, we prove that selections for different blocks can be picked almost independently such that a set of pairwise compatible selections for each block can be computed by a 2-SAT program.
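To make the last step concrete, the following is a generic textbook 2-SAT solver based on strongly connected components of the implication graph (Kosaraju's algorithm), not the specific formula constructed in this chapter. The function name `solve_2sat`, the literal encoding (`v` for "variable `v` is true", `-v` for "variable `v` is false"), and the reading of a variable as "block `v` uses the first of its two selections" are assumptions made for this sketch.

```python
def solve_2sat(num_vars, clauses):
    """Return {var: bool} satisfying all 2-literal clauses, or None if unsatisfiable."""
    def node(lit):
        # Variable v maps to node 2(v-1); its negation maps to node 2(v-1)+1.
        return 2 * (abs(lit) - 1) + (1 if lit < 0 else 0)

    size = 2 * num_vars
    graph = [[] for _ in range(size)]
    rgraph = [[] for _ in range(size)]
    for a, b in clauses:
        # (a or b) is equivalent to (not a -> b) and (not b -> a).
        for u, v in ((node(-a), node(b)), (node(-b), node(a))):
            graph[u].append(v)
            rgraph[v].append(u)

    # Pass 1: iterative DFS on the graph, recording vertices by finishing time.
    visited = [False] * size
    order = []
    for start in range(size):
        if visited[start]:
            continue
        visited[start] = True
        stack = [(start, 0)]
        while stack:
            u, i = stack.pop()
            if i < len(graph[u]):
                stack.append((u, i + 1))
                v = graph[u][i]
                if not visited[v]:
                    visited[v] = True
                    stack.append((v, 0))
            else:
                order.append(u)

    # Pass 2: DFS on the reversed graph in decreasing finishing time.
    # Components are numbered in topological order of the implication graph.
    comp = [-1] * size
    current = 0
    for start in reversed(order):
        if comp[start] != -1:
            continue
        comp[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in rgraph[u]:
                if comp[v] == -1:
                    comp[v] = current
                    stack.append(v)
        current += 1

    assignment = {}
    for var in range(1, num_vars + 1):
        pos, neg = comp[node(var)], comp[node(-var)]
        if pos == neg:
            return None  # var and its negation imply each other
        assignment[var] = pos > neg
    return assignment


# Hypothetical compatibility constraints between three blocks.
clauses = [(1, 2), (-1, 2), (-2, 3)]
result = solve_2sat(3, clauses)
```

In such an encoding, a clause like `(-1, 2)` would forbid the combination "block 1 uses its first selection and block 2 uses its second selection"; which pairs of selections are actually incompatible is what the remainder of this subsection determines.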

Before we provide the formal definition of blocks, we first focus on objects of type 0 and show that no object of type 0 can initially be held by an agent "between" the two agents that initially hold the two objects of some type $\alpha \neq 0$. Note that Proposition 7.18 states that there are at most two objects of type $\alpha$ (one of subtype $c$ and one of subtype $f$).

**Lemma 7.19.** *For each object $x_h$ of type $0$ and each type $\alpha \neq 0$, if there are two objects $x_i$ and $x_j$ of type $\alpha \neq 0$, then both of them are initially held by agents in $\llbracket \mathcal{A}(x_h), \mathcal{A}(z) \rrbracket$, both of them initially start in $\llbracket \mathcal{A}(z), \mathcal{A}(x_h) \rrbracket$, or the type of one of these two objects can be relabeled to $0$.*

*Proof.* Assume that there is an object $x_h$ of type 0 such that there are two objects $x_i$ and $x_j$ such that $\mathcal{A}(x_i) \in \llbracket \mathcal{A}(x_h), \mathcal{A}(z) \rrbracket$, $\mathcal{A}(x_j) \in \llbracket \mathcal{A}(z), \mathcal{A}(x_h) \rrbracket$, and both $x_i$ and $x_j$ have type $\alpha \neq 0$. Let $d$ be the distance between $\mathcal{A}(x_h)$ and $\mathcal{A}(z)$. By Lemma 7.17, we can compute whether $x_i$ swaps with an object of type $\alpha$ before it is swapped with $z$. If so, then $x_i$ cannot move clockwise and hence its type can be relabeled to 0. If not, then $x_j$ cannot move clockwise and hence its type can be relabeled to 0.

**Figure 7.8:** An example of *blocks*. Only a subpath of the input cycle with the objects initially held by the agents is depicted. The object 0 represents an object of type 0. The blocks in this example are $\{1\}$, $\{2, 3, 4\}$, and $\{5\}$. The boxes indicate all objects of types corresponding to each block. Note that $5_c$ and $5_f$ are adjacent and hence $\{5\}$ is a block as blocks are minimal. Note that $\{2, 3\}$ is not a block as $4_c$ is initially held by an agent in $\llbracket \mathcal{A}(2_c), \mathcal{A}(3_f) \rrbracket$. Since $z$ is the only object of type 1, it always holds that $\{1\}$ is a block.

We assume that Lemma 7.19 has been exhaustively applied to relabel the type of objects to 0. This will help us to define *blocks*. Intuitively, blocks are sets of consecutive types $\alpha, \alpha + 1, \dots, \beta$ such that all objects of those types start on a (connected) subpath of the input graph.

**Definition 7.5.** A *block* is a minimal subset $B \subseteq [k - t_z + 1, k]$ of types such that there are two agents $a$ and $b$ and all objects whose type is in $B$ are initially held by agents in $\llbracket a, b \rrbracket$ and all objects that are initially held by agents in $\llbracket a, b \rrbracket$ have a type in $B$.

Figure 7.8 depicts an example of blocks. Based on blocks we can state a new rule to relabel the type of an object to 0.

**Lemma 7.20.** *Let $A = [\alpha, \beta]$ be a block and let $\gamma \in [\alpha, \beta]$ be a type. If for some $\delta \in [\gamma + 1, \beta]$ it holds that $\mathcal{A}(\delta_c) \in \llbracket \mathcal{A}(z), \mathcal{A}(\gamma_c) \rrbracket$, then there is no solution in which $\delta_c$ moves clockwise. If $\mathcal{A}(\epsilon_f) \in \llbracket \mathcal{A}(z), \mathcal{A}(\gamma_f) \rrbracket$ for some $\epsilon \in [\alpha, \gamma - 1]$, then there is no solution in which $\gamma_f$ moves clockwise.*

*Proof.* First, assume towards a contradiction that $\mathcal{A}(\delta_c) \in \llbracket \mathcal{A}(z), \mathcal{A}(\gamma_c) \rrbracket$ and that there is a solution in which $\delta_c$ moves clockwise. Note that by Proposition 7.18 and the definition of subtypes, it holds that $\mathcal{A}(\eta_c) \in \llbracket \mathcal{A}(z), \mathcal{A}(\eta_f) \rrbracket$ for each $\eta \neq 0$. Hence, there is no object of type $\gamma$ that is initially held by an agent in $\llbracket \mathcal{A}(z), \mathcal{A}(\delta_c) \rrbracket$. Consider the solution where $\delta_c$ moves clockwise up to the assignment $\sigma_r$ after the swap of $\delta_c$ and $x$ over the edge $\{\delta - 1, \delta\}$. By Lemma 7.9, no object of type $\gamma$ is held by an agent in $\llbracket \sigma_r^{-1}(z), \sigma_r^{-1}(x) \rrbracket$. Thus, $x$ cannot be swapped over the edge $\{\gamma - 1, \gamma\}$, a contradiction to the assumption that we considered a solution.

Second, assume towards a contradiction that $\mathcal{A}(\epsilon_f) \in \llbracket \mathcal{A}(z), \mathcal{A}(\gamma_f) \rrbracket$ and there is a solution in which $\gamma_f$ moves clockwise. Since $\mathcal{A}(\epsilon_c) \in \llbracket \mathcal{A}(z), \mathcal{A}(\epsilon_f) \rrbracket$, all objects of type $\epsilon$ are initially held by agents in $\llbracket \mathcal{A}(z), \mathcal{A}(\gamma_f) \rrbracket$. Consider the solution where $\gamma_f$ moves clockwise and the assignment after $\gamma_f$ and $x$ are swapped over the edge $\{\gamma - 1, \gamma\}$. By Lemma 7.9 object $x$ was not swapped with any object of type $\epsilon$ before it was swapped with $\gamma_f$. Hence, $x$ cannot have passed the edge $\{\epsilon - 1, \epsilon\}$, a contradiction to Assumption 7.8.

We henceforth assume that the type of objects satisfying Lemma 7.20 is relabeled to 0. We will show that blocks are the partitions we are looking for, that is, for each block $A$ there are only two possible selections for $A$ and selections for different blocks can be chosen almost independently. We start with a lemma that states that blocks define a partition of types.

**Lemma 7.21.** *Each type η* ≥ *k* − *t<sup>z</sup>* + 2 *is contained in exactly one block and all blocks can be computed in linear time.*

*Proof.* We first show that $\mathcal{A}(z)$ and $\mathcal{A}(x)$ divide the types into two intervals. Observe that all objects of type $\alpha \in [t_z + 1, k]$ have to start in $[\mathcal{A}(z), \mathcal{A}(x)]$ and all objects of type $\alpha \in [k - t_z + 2, t_z - 1]$ have to be initially held by an agent in $\llbracket \mathcal{A}(x), \mathcal{A}(z) \rrbracket$. Since $z$ is the only object of type $t_z$, the interval $[t_z, t_z]$ is a block and no other block can contain the type $t_z$. We now show that blocks are a partition of types that can be computed in linear time. We first focus on all other types starting with types in $[t_z + 1, k]$. Consider the object $(t_z + 1)_c$. This has to be the initially "closest" non-type-0 object to agent $\mathcal{A}(z)$. If $(t_z + 1)_c = (t_z + 1)_f$, then $[t_z + 1, t_z + 1]$ is a block. Otherwise, we know by Lemma 7.20 that the next object (in clockwise steps) has to be either $(t_z + 2)_c$ or $(t_z + 1)_f$. If the object $\ell_f$ is found, where $\ell$ is the largest type that is so far considered in the block, then the block $[t_z + 1, \ell]$ is found. Notice that $\ell \le k$ and if a block is found, then we can redo the whole process starting with the object $(\ell + 1)_c$ until the block $[\ell', k]$ is found for some $\ell'$. Starting then from agent $k$, we can search for object $(k - t_z + 2)_c$ and repeat the whole argumentation until a block $[\ell', t_z - 1]$ is found. At this point, each type is contained in exactly one block. Since we need only a constant amount of computation time for each object, all blocks can be computed in linear time.
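The linear-time sweep can be illustrated with a simplified sketch. It is not the exact procedure from the proof: instead of tracking subtypes, it scans a list of types along a subpath and closes a block as soon as every type seen so far has no further objects ahead. The function name `compute_blocks`, the input format, and the concrete ordering of the objects (chosen to be consistent with the blocks named in the caption of Figure 7.8) are assumptions.

```python
from collections import Counter


def compute_blocks(types_along_path):
    """Partition the non-zero types into blocks by one left-to-right sweep.

    types_along_path: types of the objects in the order of the agents holding them;
    0 marks an object of type 0. Assumes Lemma 7.19 has been applied, so no
    type-0 object lies between two objects of the same non-zero type.
    """
    remaining = Counter(t for t in types_along_path if t != 0)
    blocks, current, open_types = [], set(), set()
    for t in types_along_path:
        if t == 0:
            if open_types:
                raise ValueError("type-0 object inside a block")
            continue
        current.add(t)
        remaining[t] -= 1
        if remaining[t] > 0:
            open_types.add(t)      # another object of type t is still ahead
        else:
            open_types.discard(t)  # all objects of type t have been seen
        if not open_types:
            blocks.append(sorted(current))
            current = set()
    return blocks


# Hypothetical ordering: 1, 2c, 3c, 2f, 4c, 3f, 4f, (type 0), 5c, 5f.
blocks = compute_blocks([1, 2, 3, 2, 4, 3, 4, 0, 5, 5])
```

Each object is touched once, which matches the constant work per object claimed in the proof.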

In order to prove that there are only two possible selections for each block that can lead to a solution, we first show an intermediate lemma.

**Lemma 7.22.** *Let $A = [\alpha, \beta]$ be a block and let $\iota_A$ be a selection for $A$. Let $\gamma \in A$ be a type. If $\gamma_f \in \iota_A$, then $\delta_f \in \iota_A$ for each $\delta \in [\gamma, \beta]$ or $\iota_A$ cannot be part of a selection that corresponds to a solution in which $x$ reaches $I$.*

*Proof.* We prove the statement by induction on $\gamma$. Note that if $\gamma = \beta$, then the statement trivially holds. Now assume that $\gamma < \beta$, $\gamma_f \in \iota_A$, and if $(\gamma + 1)_f \in \iota_A$, then $\delta_f \in \iota_A$ for each $\delta \in [\gamma + 1, \beta]$. By Lemma 7.20, it holds that $(\gamma + 1)_c$ is initially held by an agent in $\llbracket \mathcal{A}(\alpha_c), \mathcal{A}(\gamma_f) \rrbracket$ and $(\gamma + 1)_c$ cannot move clockwise since $\gamma_f$ moves clockwise and is swapped with $x$ over the edge $\{\gamma - 1, \gamma\}$. This holds true as if $(\gamma + 1)_c$ moved clockwise, then it holds by Lemma 7.9 that $x$ is swapped with $\gamma_f$ before it is swapped with $(\gamma + 1)_c$. Hence, $x$ cannot be swapped with $(\gamma + 1)_c$ over the edge $\{\gamma, \gamma + 1\}$ which is the type of $(\gamma + 1)_f$. Thus, object $(\gamma + 1)_c$ moves counter-clockwise and $(\gamma + 1)_f$ moves clockwise. By induction hypothesis, $\delta_f \in \iota_A$ for each $\delta \in [\gamma, \beta]$.

Based on Lemmata 7.20 and 7.22, we can now prove that there are only two possible selections for each block that can lead to a solution.

**Lemma 7.23.** *Let A* = [*α, β*] *be a block. There are at most two selections for A that can be part of a selection that corresponds to a solution in which x reaches I. These selections can be computed in O*(*n* · |*A*|) *time.*

*Proof.* We will construct two selections $\iota_1$ and $\iota_2$ for $A$ that can be part of a selection that corresponds to a solution in which $x$ reaches $I$. We start with $\alpha_c \in \iota_1$ and $\alpha_f \in \iota_2$. Note that by Lemma 7.22 $\iota_2 = \{\alpha_f, (\alpha + 1)_f, \dots, \beta_f\}$. Thus it remains to show that $\iota_1$ is unique.

If $\alpha_c$ moves clockwise, then $\alpha_f$ has to move counter-clockwise. Using Lemma 7.17, we can compute the number $h$ of objects $x_i$ that are initially held by $\mathcal{A}(x_i) \in \llbracket \mathcal{A}(\alpha_c), \mathcal{A}(\alpha_f) \rrbracket$, that have types $\alpha + 1, \alpha + 2, \dots, \alpha + h$, and that have to move clockwise. We now switch to an arbitrary type $\gamma$ as we will use the statement iteratively (starting with $\gamma = \alpha$). If $\gamma = \beta$, then $\iota_1 = \{\alpha_c, (\alpha + 1)_c, \dots, \beta_c\}$. We therefore assume that $\gamma < \beta$, that $\gamma_c$ moves clockwise, and that $h$ objects $x_i$ initially held by $\mathcal{A}(x_i) \in \llbracket \mathcal{A}(\gamma_c), \mathcal{A}(\gamma_f) \rrbracket$ and of types $\gamma + 1, \gamma + 2, \dots, \gamma + h$ move clockwise. We consider the two cases $h = 0$ and $h > 0$. If $h = 0$, then note that, by Lemma 7.20, $\mathcal{A}((\gamma + 1)_c) \in \llbracket \mathcal{A}(\gamma_c), \mathcal{A}(\gamma_f) \rrbracket$. Hence, $(\gamma + 1)_c$ moves counter-clockwise and $(\gamma + 1)_f$ moves clockwise. By Lemma 7.22, it holds for each $\delta \in [\gamma + 1, \beta]$ that $\delta_f \in \iota_A$ and thus

$$\iota\_1 = \{ \alpha\_c, (\alpha+1)\_c, \dots, \gamma\_c, (\gamma+1)\_f, (\gamma+2)\_f, \dots, \beta\_f \}.$$

If $h > 0$, then we will show that $(\gamma + 1)_c$ moves clockwise and hence we can repeat the argument. Assume towards a contradiction that $(\gamma + 1)_c$ moves counter-clockwise. Then $(\gamma + 1)_f$ moves clockwise and, by Lemma 7.22, so does $\delta_f$ for each $\delta \in [\gamma + 1, \beta]$. Hence, no object $\delta_c$ moves clockwise for $\delta \in [\gamma + 1, \beta]$. Note that by definition of blocks it holds for each object $x_i$ which is initially held by $\mathcal{A}(x_i) \in \llbracket \mathcal{A}(\gamma_c), \mathcal{A}(\gamma_f) \rrbracket$ that $x_i = \delta_c$ for some $\delta \in [\gamma, \beta]$ or that $x_i = \eta_f$ for some $\eta \in [\alpha, \gamma]$. If $\eta_f$ moved clockwise for some $\eta \in [\alpha, \gamma]$, then, by Lemma 7.22, object $\gamma_f$ also moved clockwise, a contradiction. Thus, if $(\gamma + 1)_c$ moves counter-clockwise, then no object $x_i$ initially held by $\mathcal{A}(x_i) \in \llbracket \mathcal{A}(\gamma_c), \mathcal{A}(\gamma_f) \rrbracket$ moves clockwise. Thus, $\gamma_c$ and $\gamma_f$ cannot be swapped and $x$ cannot reach $I$.

It remains to analyze the running time. Computing *ι*<sup>2</sup> takes *O*(*n*) time. Computing *ι*<sup>1</sup> takes *O*(*n*) time for each *γ<sup>c</sup>* with *γ* ∈ [*α, β*], that is, *O*(*n*·|*A*|) time in total.
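The shape of the two selections from this proof can be written down directly once the switch point $\gamma$ is known. The sketch below only builds the two sets; how $\gamma$ is determined (via the values $h$ obtained from Lemma 7.17) is not modeled, and the function name `block_selections` and the representation of objects as `(type, subtype)` pairs are assumptions.

```python
def block_selections(alpha, beta, gamma):
    """Return (iota_1, iota_2) for the block [alpha, beta].

    iota_2 picks subtype f for every type in the block.
    iota_1 picks subtype c for types alpha..gamma and subtype f for gamma+1..beta;
    gamma = beta yields the all-c selection.
    """
    iota_2 = [(t, "f") for t in range(alpha, beta + 1)]
    iota_1 = [(t, "c") for t in range(alpha, gamma + 1)]
    iota_1 += [(t, "f") for t in range(gamma + 1, beta + 1)]
    return iota_1, iota_2


# Hypothetical block [2, 4] whose first selection switches to subtype f after type 3.
iota_1, iota_2 = block_selections(2, 4, 3)
```

Since each block contributes exactly these two alternatives, one Boolean variable per block suffices to record which of them is used.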

It is finally time to explain how to check whether a selection leads to a solution, that is, a sequence of swaps such that agent $I$ obtains object $x$. Note that once a selection $\iota$ is fixed, Observation 7.12 states which objects are swapped how often with $x$ or $z$. We assume that no object moves after it is swapped with $x$ or $z$ for the final time as these swaps are not necessary for $x$ reaching $I$. Thus, we can compute the final position of each object and also the path $P^\iota_{x_i}$ of agents that hold each object $x_i$ during a solution corresponding to $\iota$. Gourvès et al. [GLW17] observed that once the path $P_{x_i}$ of each object $x_i$ is fixed, then the order in which objects are swapped is irrelevant as long as all objects "follow" their respective paths. Thus, there is a unique set of edges where two objects $x_i$ and $x_j$ swap in each solution in which the objects in $\iota$ move clockwise and all other objects move counter-clockwise. We denote this set by $e^\iota_{x_i, x_j}$. An example of $e^\iota_{x_i, x_j}$ and $P_{x_i}$ is given in Figure 7.9. It remains to show how to compute $e^\iota_{x_i, x_j}$ and how to find a selection where each pair of objects can be swapped at the respective edge. To this end, we show how to compute $e^\iota_{x_i, x_j}$ from partial selections, that is, from selections for some subset $\lambda$ of types.

**Lemma 7.24.** *Let $x_i$ be an object of type $\alpha \in [k - t_z + 1, k]$, let $x_j$ be an object of type $\beta \in [k - t_z + 1, k]$, and let $x_h$ be an object of type $0$. Let $A$ and $B$ be two blocks with $\alpha \in A$ and $\beta \in B$. Given a selection $\iota_A$ for $A$, the set $e^{\iota}_{x_i,x_h}$ is the same for each selection $\iota \supseteq \iota_A$ and can be computed in $O(n)$ time. Given selections $\iota_A$ for $A$ and $\iota_B$ for $B$, the set $e^{\iota}_{x_i,x_j}$ is the same for each selection $\iota \supseteq \iota_A \cup \iota_B$ and can be computed in $O(n)$ time.*

*Proof.* We start with determining which pairs of objects are not swapped, which pairs are swapped once, and which are swapped twice. Let $x_p$ be an object moving clockwise and let $x_q$ be an object moving counter-clockwise. We consider the two cases $\mathcal{A}(x_p) \in \llbracket \mathcal{A}(z), k \rrbracket$ and $\mathcal{A}(x_p) \notin \llbracket \mathcal{A}(z), k \rrbracket$. If $\mathcal{A}(x_p) \in \llbracket \mathcal{A}(z), k \rrbracket$ and $\mathcal{A}(x_q) \in \llbracket \mathcal{A}(x_p), k \rrbracket$, then $x_p$ and $x_q$ are, by Lemma 7.9, swapped once before $x$ and $z$ are swapped for the first time and once afterwards. Thus, by Corollary 7.10, they are swapped exactly twice. If $\mathcal{A}(x_p) \in \llbracket \mathcal{A}(z), k \rrbracket$ and $\mathcal{A}(x_q) \notin \llbracket \mathcal{A}(x_p), k \rrbracket$, then $x_p$ and $x_q$ are swapped once as $z$ and $x_q$ are, by Observation 7.12, swapped exactly once and we assume that no object moves after it is swapped with $x$ or $z$ for the final time.

**Figure 7.9:** An example of Reachable Object on cycles. The objects initially held by the agents are depicted outside the respective vertices. If objects $z$ and $c$ move clockwise ($\iota = \{c, z\}$ is the selection) and all other objects move counter-clockwise, then objects $a$ and $z$ swap over the edge $\{4, 5\}$ as $z$ swaps with objects $x$ and $d$ over the edges $\{2, 3\}$ and $\{3, 4\}$, respectively, and object $a$ swaps with $c$ over the edge $\{0, 5\}$. The path $P_c$ for this solution is $(4, 5, 0, 1, 2)$ as $c$ is initially held by agent 4, moves clockwise, and is swapped with $x$ over the edge $\{1, 2\}$.

If $\mathcal{A}(x_p) \notin \llbracket \mathcal{A}(z), k \rrbracket$, then we distinguish between the three cases

$$\mathcal{A}(x_q) \in \llbracket \mathcal{A}(z), k \rrbracket, \quad \mathcal{A}(x_q) \in \llbracket k+1, \mathcal{A}(x_p) \rrbracket, \quad \text{and} \quad \mathcal{A}(x_q) \in \llbracket \mathcal{A}(x_p), \mathcal{A}(z) \rrbracket.$$

In the first case, $x_q$ is swapped twice with $z$ and since, by Lemma 7.9, it is not swapped with $x_p$ before it is swapped with $z$ for the first time, it is swapped with $x_p$ once. In the second case, $x_q$ is swapped once with $z$ and since, by Lemma 7.9, it is not swapped with $x_p$ before it is swapped with $z$ for the first time, it is not swapped with $x_p$. In the third case, $x_q$ is swapped once with $z$ and, by Lemma 7.9, it is swapped with $x_p$ before it is swapped with $z$. Thus, in this case $x_p$ and $x_q$ are swapped once.
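The case distinction so far can be collected in one display. The name $s(x_p, x_q)$ is introduced only for this summary and is not notation used elsewhere in the text; it denotes the number of times an object $x_p$ moving clockwise and an object $x_q$ moving counter-clockwise are swapped.

$$
s(x_p, x_q) =
\begin{cases}
2 & \text{if } \mathcal{A}(x_p) \in \llbracket \mathcal{A}(z), k \rrbracket \text{ and } \mathcal{A}(x_q) \in \llbracket \mathcal{A}(x_p), k \rrbracket,\\
1 & \text{if } \mathcal{A}(x_p) \in \llbracket \mathcal{A}(z), k \rrbracket \text{ and } \mathcal{A}(x_q) \notin \llbracket \mathcal{A}(x_p), k \rrbracket,\\
1 & \text{if } \mathcal{A}(x_p) \notin \llbracket \mathcal{A}(z), k \rrbracket \text{ and } \mathcal{A}(x_q) \in \llbracket \mathcal{A}(z), k \rrbracket,\\
0 & \text{if } \mathcal{A}(x_p) \notin \llbracket \mathcal{A}(z), k \rrbracket \text{ and } \mathcal{A}(x_q) \in \llbracket k+1, \mathcal{A}(x_p) \rrbracket,\\
1 & \text{if } \mathcal{A}(x_p) \notin \llbracket \mathcal{A}(z), k \rrbracket \text{ and } \mathcal{A}(x_q) \in \llbracket \mathcal{A}(x_p), \mathcal{A}(z) \rrbracket.
\end{cases}
$$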

We now show how to compute the set of edges where two objects can be swapped. Note that it is enough to compute the first edge where two objects can be swapped: if two objects $x_p$ and $x_q$ are swapped twice, then the object moving counter-clockwise is, by Lemma 7.9, swapped with all objects moving clockwise before $x_p$ and $x_q$ are swapped again, and there are always $t_z$ objects moving clockwise. By definition of blocks and by Lemma 7.20, each object $x_a$ with a type in $A := [\gamma, \delta]$ is initially held by agent $\mathcal{A}(x_a) \in \llbracket \mathcal{A}(\gamma_c), \mathcal{A}(\delta_f) \rrbracket$. By Lemma 7.19, it holds for each type $\mu \in [k - t_z + 1, k]$ that $\mathcal{A}(x_h) \notin \llbracket \mathcal{A}(\mu_c), \mathcal{A}(\mu_f) \rrbracket$ and thus there is some type $\mu'$ such that $\mathcal{A}(x_h) \in \llbracket \mathcal{A}(\mu'_f), \mathcal{A}((\mu' + 1)_c) \rrbracket$. We now compute the first edge where $x_i$ and $x_h$ can be swapped if $x_i$ moves clockwise. Note that if $x_i$ moves counter-clockwise, then it is not swapped with $x_h$. If $\alpha \le \mu'$, then $x_h$ is swapped once with an object of each type in $[\alpha, \mu'] \setminus \{\alpha\}$ before it is swapped with $x_i$. Hence $\{\mathcal{A}(x_h) - |\alpha - \mu'|, \mathcal{A}(x_h) - |\alpha - \mu'| + 1\} \in e^{\iota}_{x_i,x_h}$ is the first edge where $x_i$ and $x_h$ can be swapped. If $\alpha > \mu'$, then $x_h$ is first swapped once with an object of each type $\mu', \mu' - 1, \ldots, k - t_z + 1, k, k - 1, \ldots, \alpha + 1$ before it is swapped with $x_i$. Note that these are $t_z - |\{i \mid \mu' < i < \alpha\}|$ objects and therefore in this case $\{\mathcal{A}(x_h) - t_z + \alpha - \mu' - 1, \mathcal{A}(x_h) - t_z + \alpha - \mu'\} \in e^{\iota}_{x_i,x_h}$ is the first edge where $x_i$ and $x_h$ can be swapped.

It remains to analyze the possible edges for $x_i$ and $x_j$. We assume without loss of generality that $x_i$ moves clockwise and $x_j$ moves counter-clockwise, that is, $x_i \in \iota_A$ and $x_j \notin \iota_B$. We distinguish between the two cases $A = B$ and $A \neq B$. If $A = B$, then $\iota_A = \iota_B$ and the direction of each object initially held by agents in $\llbracket \mathcal{A}(\gamma_c), \mathcal{A}(\delta_f) \rrbracket$ is known. Since the number of objects moving clockwise is constant (and equal to $t_z$), the number $c$ of objects moving clockwise in $\llbracket \mathcal{A}(x_i), \mathcal{A}(x_j) \rrbracket$ is known and the first edge where $x_i$ and $x_j$ can be swapped is $\{\mathcal{A}(x_j) - c - 1, \mathcal{A}(x_j) - c\}$. If $A \neq B$, then let $B := [\psi, \chi]$. Since $\iota_B$ is given, the number of objects in $\llbracket \mathcal{A}(\psi_c), \mathcal{A}(x_j) \rrbracket$ moving clockwise is known. We can then use the same argument as for $x_h$, where $\mu' = \psi - 1$ (or $\mu' = k$ if $\psi = k - t_z + 1$). Note that computing the unique set $e^{\iota}_{x_i,x_j}$ (or $e^{\iota}_{x_i,x_h}$) takes $O(n)$ time as we only compute the type of certain objects and the number of objects of certain types moving clockwise.

An example of the set of edges computed in Lemma 7.24 is given in Figure 7.10. A selection $\iota$ leads to a solution if and only if for each pair $(x_i, x_j)$ of objects such that $x_i \in \iota$ and $x_j \notin \iota$ and each edge $e \in e^{\iota}_{x_i,x_j}$, the agents incident to $e$ can agree on swapping $x_i$ and $x_j$. Hence, to check for a given selection $\iota$ whether it leads to a solution, we iterate over all pairs $(x_i, x_j)$ of objects such that $x_i \in \iota$ and $x_j \notin \iota$ and distinguish between the following three cases.

• Either $x_j$ does not have a type in $[k - t_z + 1, k]$, that is, $x_j = x$ or the type of $x_j$ is $0$,

• or $x_i$ and $x_j$ have types in the same block,

• or $x_i$ and $x_j$ have types in different blocks.

**Figure 7.10:** An example to illustrate Lemma 7.24. Depicted is a set of agents that are arranged in a cycle. Some objects are depicted next to the vertices of the agents which initially hold them. Notice that $A = [2, 3, 4]$ is a block and let $\iota_A := \{2_c, 3_f, 4_f\}$ be a selection for $A$. We show how to compute the edges $e^{\iota}_{2_c,4_c}$ for each selection $\iota \supset \iota_A$. Since objects $3_c$ and $2_f$ move counter-clockwise (they are not contained in $\iota_A$), the two objects $2_c$ and $4_c$ next to them have to be swapped over edge $\{5, 6\}$. Afterwards, by Lemma 7.9, object $4_c$ is swapped with each other object moving clockwise before it is swapped with $2_c$ for a second time. Since $t_z - 1 = 6$ objects move clockwise, object $4_c$ moves over five edges after the first swap before it is swapped with $2_c$ for a second time. Thus, $2_c$ and $4_c$ are swapped for a second time over the edge $\{0, 19\}$.


The first two cases give rise to the notion of *consistent selections*. A selection $\iota_A$ for a block $A$ is consistent if each object in $\iota_A$ can be swapped with each object $x_j$ that has type $0$ or a type in $A$ over the respective edges $e \in e^{\iota}_{x_i,x_j}$. Therein $\iota$ is any selection that generalizes $\iota_A$, that is, $\iota \supseteq \iota_A$. For the sake of readability, we use $\mathcal{C}_A := \bigcup_{\alpha \in A} \mathcal{C}_\alpha$ to denote the set of all objects of a type in $A$.

**Definition 7.6.** Let $A$ be a block and let $\iota_A$ be a selection for $A$. Let $x_i \in \iota_A$ be an object. Then, $\iota_A$ is *consistent* if


the agents incident to $e$ can agree on swapping $x_i$ and $x_j$.

We call a selection *inconsistent* if it is not consistent. Note that a given selection $\iota_A$ for a block $A$ can be checked for consistency in $O(n^2 \cdot |A|)$ time by iterating over all $x_i \in \iota_A$ and all $x_j \in X$, computing the set $e^{\iota}_{x_i,x_j}$ in $O(n)$ time, and checking whether the incident agents can agree on the swap in constant time using the preprocessed pos-values.

The third requirement (checking whether two objects of types in different blocks can be swapped) gives rise to the notion of *compatible* selections. We say that two selections $\iota_A$ and $\iota_B$ for blocks $A$ and $B$ are compatible if all pairs of objects of types in $A$ and $B$, respectively, can be swapped at their respective edges.

**Definition 7.7.** Let $A = [\alpha, \beta]$ and $B = [\gamma, \delta]$ be two blocks. Let $\iota_A$ and $\iota_B$ be two selections for $A$ and $B$, respectively. The selections are *compatible* if


and each edge $e \in e^{\iota}_{x_i,x_j}$ for some $\iota \supseteq \iota_A \cup \iota_B$, the agents incident to $e$ can agree on swapping $x_i$ and $x_j$.

We say that two selections $\iota_A$ and $\iota_B$ are *incompatible* if they are not compatible. Observe that given two selections $\iota_A$ and $\iota_B$ for blocks $A$ and $B$, we can check them for compatibility in $O(|A| \cdot |B| \cdot n)$ time by iterating over all $x_i \in \iota_A$ and all $x_j \in \mathcal{C}_B \setminus \iota_B$ and computing the respective set of edges in $O(n)$ time using Lemma 7.24. Afterwards, we iterate over all $x_i \in \mathcal{C}_A \setminus \iota_A$ and all $x_j \in \iota_B$ and compute the respective set of edges. Checking whether the two agents incident to each of the at most two edges can agree on swapping $x_i$ and $x_j$ takes constant time as the pos-values are precomputed. It remains to find consistent selections for each block that are pairwise compatible. We solve this problem using 2-SAT programming. Therein, we have a variable for each block. The truth value of each variable represents which selection for the respective block is chosen and the clauses guarantee that no two incompatible selections are chosen.
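The compatibility check described above can be sketched in Python. This is only an illustration under assumptions: `swap_edges` is a hypothetical stand-in for the edge computation of Lemma 7.24, `can_agree` is a hypothetical stand-in for the constant-time lookup in the pos-values, and the set-based representation of selections is chosen here for brevity.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple

Obj = Hashable
Edge = Tuple[int, int]


def are_compatible(
    sel_a: Set[Obj],
    objs_a: Set[Obj],
    sel_b: Set[Obj],
    objs_b: Set[Obj],
    swap_edges: Callable[[Obj, Obj], Iterable[Edge]],
    can_agree: Callable[[Edge, Obj, Obj], bool],
) -> bool:
    """Check two selections for compatibility.

    sel_a / sel_b: objects of blocks A / B that move clockwise.
    objs_a / objs_b: all objects with a type in A / B.
    swap_edges(cw, ccw): hypothetical helper returning the edges where the
        clockwise object cw and the counter-clockwise object ccw swap.
    can_agree(e, cw, ccw): hypothetical helper telling whether the agents
        incident to e can agree on swapping cw and ccw.
    """
    # Clockwise objects of A against counter-clockwise objects of B ...
    pairs = [(xi, xj) for xi in sel_a for xj in objs_b - sel_b]
    # ... and clockwise objects of B against counter-clockwise objects of A.
    pairs += [(xj, xi) for xj in sel_b for xi in objs_a - sel_a]
    return all(
        can_agree(e, cw, ccw)
        for cw, ccw in pairs
        for e in swap_edges(cw, ccw)
    )
```

The two helpers are passed in as callables so that the sketch stays independent of how types, subtypes, and preference lists are stored.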

#### **Theorem 7.25.** Reachable Object *on cycles can be solved in $O(n^4)$ time.*

*Proof.* We start the proof with an overview of the algorithm and an analysis of its running time. Afterwards, we show that the algorithm is correct. For each possible choice of object $z$ (there are at most $n$ possibilities) and each of the two possible directions in which we assume $x$ to move, we do the following. First, compute the type and subtype of each object. Second, compute all blocks and use Lemma 7.23 to compute two possible selections $\iota^1_A, \iota^2_A$ for each block $A$ in overall $O(n^2)$ time. Afterwards, check each of these selections for consistency in overall $O(n^3)$ time and discard all inconsistent selections. Next, compute for each pair of consistent selections for different blocks whether they are compatible. This takes overall $O(n^3)$ time. Finally, check whether there are pairwise compatible selections for each block using the 2-SAT program below and return true if so.

Before we present the 2-SAT program, we first show a small preprocessing step. If for some block $A$ there is only one consistent selection $\iota_A$ for $A$, then we discard all selections that are not compatible with $\iota_A$ as there is no set of pairwise compatible and consistent selections for all blocks that does not contain $\iota_A$. Since all remaining selections are compatible with $\iota_A$, we can ignore $\iota_A$ from now on. If this rule discards any selection, then the respective other selection for this block is the only consistent selection for this block and hence, we repeat the process. After at most $n$ rounds, each of which only takes $O(n)$ time, we arrive at a situation where there are exactly two consistent selections for each block and the task is to find a set of pairwise compatible selections that includes a selection for each block. We finally reduce this problem to a 2-SAT formula.
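The preprocessing rule can be sketched as a small propagation loop. The data layout is an assumption made for this illustration: `options` maps each block to the identifiers of its consistent selections, and `incompatible` is a set of unordered pairs of (block, selection) entries. The sketch favors readability and does not attempt to match the per-round running time stated above.

```python
from collections import deque


def propagate_forced(options, incompatible):
    """Apply the single-selection rule exhaustively.

    options: dict mapping a block to the set of its consistent selection ids.
    incompatible: set of frozensets {(block_a, sel_a), (block_b, sel_b)}.
    Returns (forced, remaining), where forced maps blocks to their only
    possible selection and remaining holds the still undecided blocks,
    or None if some block is left without any selection.
    """
    options = {b: set(s) for b, s in options.items()}  # do not modify the input
    if any(not sels for sels in options.values()):
        return None
    forced = {}
    queue = deque(b for b, sels in options.items() if len(sels) == 1)
    while queue:
        a = queue.popleft()
        (sel_a,) = options[a]  # the only consistent selection left for block a
        forced[a] = sel_a
        for b, sels in options.items():
            if b == a or b in forced:
                continue
            # Discard every selection of b that is incompatible with sel_a.
            bad = {s for s in sels
                   if frozenset({(a, sel_a), (b, s)}) in incompatible}
            if bad:
                sels -= bad
                if not sels:
                    return None  # block b cannot be served at all
                if len(sels) == 1:
                    queue.append(b)  # b is now forced as well: repeat the rule
    remaining = {b: sels for b, sels in options.items() if b not in forced}
    return forced, remaining
```

After the loop, every block in `remaining` still has all of its original selections, so with two candidate selections per block the situation described above (exactly two consistent selections for each remaining block) is reached.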

We start with a variable $v_A$ for each block $A$, which is set to true if we select $\iota^1_A$ and set to false if we select $\iota^2_A$. For each pair $(\iota_A, \iota_B)$ of incompatible selections for different blocks $A$ and $B$, do the following. For the sake of simplicity, we use $u$ and $w$ to denote the literals representing the selections for blocks $A$ and $B$ that are incompatible. Formally, if $\iota_A = \iota^1_A$, then $u := v_A$ and otherwise $u := \neg v_A$. Analogously, if $\iota_B = \iota^1_B$, then $w := v_B$ and otherwise $w := \neg v_B$. Since we cannot select $\iota_A$ and $\iota_B$ at the same time (the formula cannot satisfy $u$ and $w$ at the same time), we add the clause $(\neg u \lor \neg w)$ to our 2-SAT formula.
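The construction can be made concrete with a short Python sketch. It builds one clause $(\neg u \lor \neg w)$ per incompatible pair and solves the formula with the standard implication-graph approach (strongly connected components via Kosaraju's algorithm). The function names, the encoding of literals as `(variable, is_positive)` pairs, and the input format are assumptions made for this illustration.

```python
def solve_2sat(num_vars, clauses):
    """Solve 2-SAT. A literal is (var, is_positive); a clause is (l1, l2),
    meaning l1 OR l2. Returns a list of booleans or None if unsatisfiable."""
    def node(lit):
        var, positive = lit
        return 2 * var + (0 if positive else 1)  # node ^ 1 is the negation

    n = 2 * num_vars
    graph = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for l1, l2 in clauses:
        a, b = node(l1), node(l2)
        # (a OR b) is equivalent to (NOT a -> b) and (NOT b -> a).
        for src, dst in ((a ^ 1, b), (b ^ 1, a)):
            graph[src].append(dst)
            rev[dst].append(src)

    # Pass 1: iterative DFS on the implication graph, recording finish order.
    order, seen = [], [False] * n
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack = [(start, 0)]
        while stack:
            v, i = stack.pop()
            if i < len(graph[v]):
                stack.append((v, i + 1))
                w = graph[v][i]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(v)

    # Pass 2: DFS on the reversed graph in decreasing finish time. Components
    # are numbered in topological order of the implication graph.
    comp, count = [-1] * n, 0
    for start in reversed(order):
        if comp[start] != -1:
            continue
        comp[start] = count
        stack = [start]
        while stack:
            v = stack.pop()
            for w in rev[v]:
                if comp[w] == -1:
                    comp[w] = count
                    stack.append(w)
        count += 1

    assignment = []
    for var in range(num_vars):
        if comp[2 * var] == comp[2 * var + 1]:
            return None  # a variable is equivalent to its own negation
        # A literal is set to true if its component comes later topologically.
        assignment.append(comp[2 * var] > comp[2 * var + 1])
    return assignment


def choose_selections(blocks, incompatible_pairs):
    """blocks: list of block names, each with candidate selections 1 and 2.
    incompatible_pairs: iterable of ((block_a, sel_a), (block_b, sel_b)).
    Returns a dict block -> chosen selection (1 or 2), or None."""
    index = {b: i for i, b in enumerate(blocks)}
    clauses = []
    for (a, sel_a), (b, sel_b) in incompatible_pairs:
        # u stands for "block a uses sel_a": u = v_a if sel_a == 1, else NOT v_a.
        not_u = (index[a], sel_a != 1)
        not_w = (index[b], sel_b != 1)
        clauses.append((not_u, not_w))
    assignment = solve_2sat(len(blocks), clauses)
    if assignment is None:
        return None
    return {b: 1 if assignment[index[b]] else 2 for b in blocks}
```

For instance, `choose_selections(["A", "B"], [(("A", 1), ("B", 1))])` returns some choice in which the two blocks do not both use their first selection.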

Observe that if there is a set of pairwise compatible selections for each block, then the 2-SAT formula is satisfied by the corresponding truth assignment of the variables. Conversely, if the formula is satisfiable, then the selections for each block corresponding to a satisfying truth assignment specify a direction for each object. Any sequence of swaps that follows these directions will eventually lead to agent $I$ obtaining object $x$. Since 2-SAT can be solved in linear time [APT79] and the constructed formula has $O(n^2)$ clauses of constant size, the overall running time for each possible choice of $z$ and the direction of $x$ is in $O(n^3)$. Since there are $O(n)$ possible choices for combinations of $z$ and the direction of $x$, the overall running time for all iterations is in $O(n^4)$.
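The bound can be retraced by adding up the costs stated in the proof. The symbols $T_{\text{iter}}$ and $T_{\text{total}}$ are introduced here purely for bookkeeping and are not notation from the text; the factor $2n$ reflects at most $n$ choices of $z$ and two directions of $x$.

$$
\begin{aligned}
T_{\text{iter}} &= \underbrace{O(n^2)}_{\text{selections (Lemma 7.23)}} + \underbrace{O(n^3)}_{\text{consistency checks}} + \underbrace{O(n^3)}_{\text{compatibility checks}} + \underbrace{n \cdot O(n)}_{\text{preprocessing rounds}} + \underbrace{O(n^2)}_{\text{2-SAT}} = O(n^3),\\
T_{\text{total}} &= 2n \cdot T_{\text{iter}} = O(n^4).
\end{aligned}
$$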

# **7.4 Concluding Remarks**

We investigated the computational complexity of Reachable Object with respect to restrictions on the maximum degree of the input graph and the maximum length of preference lists. Our work narrows the gap between known tractable and intractable cases, leading to a more comprehensive understanding of the computational complexity of Reachable Object. In particular, we showed a dichotomy result regarding the length of the preference lists of the agents and showed polynomial-time solvability for Reachable Object on graphs with maximum degree at most two (note that a graph of maximum degree two is the disjoint union of paths and cycles and Huang and Xiao [HX20] resolved the case of paths). Saffidine and Wilczynski [SW18, Theorem 4] showed *NP*-hardness of Reachable Object on graphs of maximum degree at most four. Hence, the computational complexity of Reachable Object on graphs of maximum degree three remains the only open case towards a dichotomy result with respect to the parameter maximum degree. We conjecture that this case is *NP*-hard. Other interesting questions regarding the maximum degree of a graph are whether Reachable Object is polynomial-time solvable on trees if the maximum degree is some constant and whether our running-time bound of $O(n^4)$ for graphs of maximum degree two is tight, that is, can it be improved to, e.g., $O(n^3 \log n)$ or is there some (e.g., ETH-based) lower bound?

Note that in a cycle each object can take one of two paths towards its target object and these two paths translate to assigning each variable in our 2-SAT program one of the two possible truth values true or false. Jansen [Jan17] used a variant of 2-SAT where each variable can have one of *N* values (where *N* is some constant) to show containment in *P* for a variant of Hitting Set. It would be interesting to see whether there are graph classes in which each object can take one of constantly many paths to its target object where this generalization of 2-SAT can be used to show polynomial-time solvability.

Regarding modifications and generalizations of Reachable Object, note that in the Housing Market problem the agents can swap not only in pairs but also in trading cycles. Trading cycles quite naturally translate into hyperedges in the input graph of Reachable Object. A set of agents can swap their currently held objects along a trading cycle only if they share a hyperedge. This generalization of Reachable Object seems to be a quite natural link between Housing Market and Reachable Object and has not been studied so far.

# **Chapter 8**

### **Outlook**

In this thesis, we shed some light on the computational complexity of special cases of different graph problems using mostly dynamic and 2-SAT programming. We conclude the thesis with a summary of our main results and a broader reflection on what we observed and how our work can be continued. Since we already provided directions for further research related to the problems studied in the respective chapters, we will focus on dynamic and 2-SAT programming here.

We start with a summary of our main results. For Diameter, we presented results within the field of *FPT in P*. In particular, using dynamic programming, we showed that Diameter is solvable in $O((n + m) \cdot h \cdot (d^h + h^d))$ time when parameterized by the $h$-index and diameter $d$. We further presented an $O(n^2 \cdot m)$-time algorithm for Length-Bounded Cut on proper interval graphs and proved that $k$-Disjoint Shortest Paths is solvable in $n^{O((k+1)!)}$ time.

Using 2-SAT programming, we showed that Soft Tree Containment is solvable in $O(n^5)$ time when the input network is a 2-labeled phylogenetic tree. Complementing this result, we showed that Soft Tree Containment remains *NP*-hard when restricted to binary 3-labeled phylogenetic trees. Finally, we showed how to solve Reachable Object in $O(n^4)$ time on cycles and proved a dichotomy result on arbitrary graphs parameterized by the length of the longest preference list of an agent. If all preference lists are of length at most three, then the problem can be solved in linear time and for lists of length at most four it remains *NP*-hard.

We continue with some concluding thoughts on dynamic and 2-SAT programming. Concerning dynamic programming, note that the three problems we studied in the first part of this thesis were all related to shortest paths in graphs. All three respective dynamic programs used the length of solution paths to some extent in the representation of a table entry. For Diameter, one of the dimensions of the respective table measures the length of a longest shortest path in the input graph. For Length-Bounded Cut, each table entry represents the minimum number of edges to delete in order to increase the length of a shortest path between the terminal $s$ and each vertex in a given subset of vertices including $t$ to some given threshold. Finally, for $k$-Disjoint Shortest Paths, each table entry represents whether there are disjoint paths between terminal pairs in a directed acyclic graph. We iterate over a topological order of this graph and hence allow for longer and longer disjoint paths. Concluding, we observed the following heuristic for how to use dynamic programming: *If the problem is about paths in graphs, then try to design a dynamic program that iteratively allows for longer and longer paths.*
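A minimal illustration of this heuristic is the textbook Bellman-Ford-style recurrence, in which one table dimension is the number of edges a path may use. The sketch below is not one of the dynamic programs from this thesis; the function name and the edge-list input format are assumptions made for the example.

```python
import math


def shortest_paths_by_length(n, edges, source):
    """Return a table where table[l][v] is the minimum weight of a walk from
    source to v that uses at most l edges (math.inf if there is none).

    n: number of vertices, named 0..n-1.
    edges: list of directed edges (u, v, weight).
    """
    table = [[math.inf] * n]
    table[0][source] = 0  # with zero edges only the source itself is reachable
    for _ in range(n - 1):
        prev = table[-1]
        cur = prev[:]  # walks with fewer edges remain allowed
        for u, v, w in edges:
            # Extend a walk with at most l-1 edges ending in u by the edge (u, v).
            if prev[u] + w < cur[v]:
                cur[v] = prev[u] + w
        table.append(cur)
    return table
```

Each row of the table is computed only from the previous row, so the program allows for longer and longer paths one edge at a time.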

Finally, we reflect on 2-SAT programming and answer the question we started this thesis with: When is 2-SAT programming a promising tool for solving algorithmic problems? Let us begin with revisiting how we and other authors used 2-SAT programming. All 2-SAT programs (including the ones from the literature) had in common that they, to some extent, compute a solution consisting of a set of pairwise compatible elements. In this thesis, these elements were either canonical vertices (in Chapter 6) or selections for blocks (in Chapter 7). Notice that in both cases exactly one out of at most two alternatives was chosen in the solution. The same holds true for all 2-SAT programs that we could find in the literature with one exception. Jansen [Jan17] used a version of 2-SAT where each variable can have one of $N$ possible truth values in $[N]$ (where $N$ is some constant) and a literal expresses that the truth value of a certain variable is at least or at most some given threshold. This version of 2-SAT is known to be polynomial-time solvable [BHM00]. Combining these insights, we present two heuristics of when to try applying 2-SAT programming to a new problem.

	- be partitioned into constant-size parts and at most one element from each part is picked into the solution and
	- a set of elements forms a solution if each pair of elements in this set can be contained in the same solution.

With these heuristics at hand, we remark that we could not find any application of 2-SAT programming in the context of scheduling. This is surprising as scheduling is fundamentally about finding pairwise non-conflicting assignments of jobs to machines and time slots. We therefore conjecture that 2-SAT programming should be applicable in the context of scheduling quite often.

The importance of 2-SAT programming is not limited to exact polynomial-time algorithms either. 2-SAT programming has already been used in an approximation algorithm [WW95] and we believe that it might have further applications, for instance in heuristics or data reductions. It might even be useful to reduce some *NP*-hard problem to an exponential number of 2-SAT formulas (or one formula of exponential size) to achieve faster exponential-time algorithms.

Finally, there are other special cases of Satisfiability that are polynomial-time solvable. Most notably, XOR-SAT (clauses consist of exclusive-or operations and clauses are connected by and operations) is also linear-time solvable [Sch78]. We could only find a single reference ([Rad+07]) where a problem was reduced to XOR-SAT but there it was not used for a polynomial-time algorithm. We believe that *XOR-SAT programming* is also worth investigating as a potential tool for exact polynomial-time algorithms and we believe it to be most useful for problems in which the parity of numbers is important.
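To make the XOR-SAT setting concrete, the following Python sketch treats each clause as a linear equation over GF(2) and solves the system by Gaussian elimination. This is the textbook polynomial-time approach and is not claimed to match the running time cited above; the function name and the clause format (a list of variable indices together with a required parity) are assumptions made for this illustration.

```python
def solve_xorsat(num_vars, clauses):
    """Solve a conjunction of XOR clauses.

    clauses: list of (variables, parity), meaning that the XOR of the listed
        variables (indices in 0..num_vars-1) must equal parity (0 or 1).
    Returns a satisfying assignment as a list of 0/1 values, or None.
    """
    mask = (1 << num_vars) - 1
    pivot_rows = {}  # pivot variable -> fully reduced equation (as a bitmask)
    for variables, parity in clauses:
        # Bits 0..num_vars-1 hold the variables, bit num_vars holds the parity.
        row = parity << num_vars
        for v in variables:
            row ^= 1 << v  # a variable listed twice cancels out
        # Remove all pivot variables that occur in the new equation.
        for var, prow in pivot_rows.items():
            if (row >> var) & 1:
                row ^= prow
        low = row & mask
        if low == 0:
            if row:
                return None  # the equation reduced to 0 = 1
            continue  # the equation is redundant
        pivot = (low & -low).bit_length() - 1  # lowest remaining variable
        # Keep the system fully reduced (Gauss-Jordan elimination).
        for var in pivot_rows:
            if (pivot_rows[var] >> pivot) & 1:
                pivot_rows[var] ^= row
        pivot_rows[pivot] = row
    # Free variables are set to 0; each pivot then equals its right-hand side.
    assignment = [0] * num_vars
    for var, row in pivot_rows.items():
        assignment[var] = (row >> num_vars) & 1
    return assignment
```

For example, the clauses $x_0 \oplus x_1 = 1$, $x_1 \oplus x_2 = 1$, and $x_0 \oplus x_2 = 0$ are jointly satisfiable, whereas adding $x_0 \oplus x_1 = 0$ makes the system contradictory.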

# **Bibliography**




[Zwi01] Uri Zwick. "Exact and approximate distances in graphs – a survey". In: *Proceedings of the 9th Annual European Symposium on Algorithms (ESA '01)*. Springer, 2001, pp. 33–48 (cited on p. 9).


### **Elements of Dynamic and 2-SAT Programming: Paths, Trees, and Cuts**

This thesis presents faster (in terms of worst-case running times and compared to the fastest previously known) exact algorithms for special cases of graph problems through dynamic programming and 2-SAT programming. Dynamic programming describes the procedure of breaking down a problem recursively into overlapping subproblems and then combining optimal solutions for these subproblems to an optimal solution for the larger problem. 2-SAT programming refers to the procedure of reducing a problem to a set of 2-SAT formulas, that is, Boolean formulas in conjunctive normal form in which each clause contains at most two literals. Computing a satisfying truth assignment (if one exists) of a 2-SAT formula takes linear time in the formula length. Hence, when satisfying truth assignments to some 2-SAT formulas correspond to a solution of the original problem and all formulas can be computed in polynomial time, then the original problem can be solved in polynomial time. Our main results are polynomial-time algorithms for special graph classes and parameterized algorithms.
