**Patricia Bouyer · Lutz Schröder (Eds.)**

# **Foundations of Software Science and Computation Structures**

**25th International Conference, FOSSACS 2022 Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022 Munich, Germany, April 2–7, 2022 Proceedings**

## Lecture Notes in Computer Science 13242

Founding Editors

Gerhard Goos, Germany; Juris Hartmanis, USA

### Editorial Board Members

Elisa Bertino, USA; Wen Gao, China; Bernhard Steffen, Germany; Gerhard Woeginger, Germany; Moti Yung, USA

### Advanced Research in Computing and Software Science Subline of Lecture Notes in Computer Science

Subline Series Editors

Giorgio Ausiello, University of Rome 'La Sapienza', Italy; Vladimiro Sassone, University of Southampton, UK

Subline Advisory Board

Susanne Albers, TU Munich, Germany; Benjamin C. Pierce, University of Pennsylvania, USA; Bernhard Steffen, University of Dortmund, Germany; Deng Xiaotie, Peking University, Beijing, China; Jeannette M. Wing, Microsoft Research, Redmond, WA, USA

More information about this series at https://link.springer.com/bookseries/558

Editors Patricia Bouyer Université Paris-Saclay, CNRS, ENS Paris-Saclay Gif-sur-Yvette, France

Lutz Schröder Friedrich-Alexander-Universität Erlangen Erlangen, Germany

ISSN 0302-9743 ISSN 1611-3349 (electronic) Lecture Notes in Computer Science ISBN 978-3-030-99252-1 ISBN 978-3-030-99253-8 (eBook) https://doi.org/10.1007/978-3-030-99253-8

© The Editor(s) (if applicable) and The Author(s) 2022. This book is an open access publication.

Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

### ETAPS Foreword

Welcome to the 25th ETAPS! ETAPS 2022 took place in Munich, the beautiful capital of Bavaria, in Germany.

ETAPS 2022 is the 25th instance of the European Joint Conferences on Theory and Practice of Software. ETAPS is an annual federated conference established in 1998, and consists of four conferences: ESOP, FASE, FoSSaCS, and TACAS. Each conference has its own Program Committee (PC) and its own Steering Committee (SC). The conferences cover various aspects of software systems, ranging from theoretical computer science to foundations of programming languages, analysis tools, and formal approaches to software engineering. Organizing these conferences in a coherent, highly synchronized conference program enables researchers to participate in an exciting event, to meet many colleagues working in different directions in the field, and to easily attend talks of different conferences. On the weekend before the main conference, numerous satellite workshops took place, attracting many researchers from all over the globe.

ETAPS 2022 received 362 submissions in total, 111 of which were accepted, yielding an overall acceptance rate of 30.7%. I thank all the authors for their interest in ETAPS, all the reviewers for their reviewing efforts, the PC members for their contributions, and in particular the PC (co-)chairs for their hard work in running this entire intensive process. Last but not least, my congratulations to all authors of the accepted papers!

ETAPS 2022 featured the unifying invited speakers Alexandra Silva (University College London, UK, and Cornell University, USA) and Tomáš Vojnar (Brno University of Technology, Czech Republic) and the conference-specific invited speakers Nathalie Bertrand (Inria Rennes, France) for FoSSaCS and Lenore Zuck (University of Illinois at Chicago, USA) for TACAS. Invited tutorials were provided by Stacey Jeffery (CWI and QuSoft, The Netherlands) on quantum computing and Nicholas Lane (University of Cambridge and Samsung AI Lab, UK) on federated learning.

As this event was the 25th edition of ETAPS, part of the program was a special celebration where we looked back on the achievements of ETAPS and its constituting conferences in the past, but we also looked into the future, and discussed the challenges ahead for research in software science. This edition also reinstated the ETAPS mentoring workshop for PhD students.

ETAPS 2022 took place in Munich, Germany, and was organized jointly by the Technical University of Munich (TUM) and the LMU Munich. The former was founded in 1868, and the latter in 1472 as the 6th oldest German university still running today. Together, they have 100,000 enrolled students, regularly rank among the top 100 universities worldwide (with TUM's computer-science department ranked #1 in the European Union), and their researchers and alumni include 60 Nobel laureates. The local organization team consisted of Jan Křetínský (general chair), Dirk Beyer (general, financial, and workshop chair), Julia Eisentraut (organization chair), and Alexandros Evangelidis (local proceedings chair).

ETAPS 2022 was further supported by the following associations and societies: ETAPS e.V., EATCS (European Association for Theoretical Computer Science), EAPLS (European Association for Programming Languages and Systems), and EASST (European Association of Software Science and Technology).

The ETAPS Steering Committee consists of an Executive Board, and representatives of the individual ETAPS conferences, as well as representatives of EATCS, EAPLS, and EASST. The Executive Board consists of Holger Hermanns (Saarbrücken), Marieke Huisman (Twente, chair), Jan Kofroň (Prague), Barbara König (Duisburg), Thomas Noll (Aachen), Caterina Urban (Paris), Tarmo Uustalu (Reykjavik and Tallinn), and Lenore Zuck (Chicago).

Other members of the Steering Committee are Patricia Bouyer (Paris), Einar Broch Johnsen (Oslo), Dana Fisman (Be'er Sheva), Reiko Heckel (Leicester), Joost-Pieter Katoen (Aachen and Twente), Fabrice Kordon (Paris), Jan Křetínský (Munich), Orna Kupferman (Jerusalem), Leen Lambers (Cottbus), Tiziana Margaria (Limerick), Andrew M. Pitts (Cambridge), Elizabeth Polgreen (Edinburgh), Grigore Roşu (Illinois), Peter Ryan (Luxembourg), Sriram Sankaranarayanan (Boulder), Don Sannella (Edinburgh), Lutz Schröder (Erlangen), Ilya Sergey (Singapore), Natasha Sharygina (Lugano), Pawel Sobocinski (Tallinn), Peter Thiemann (Freiburg), Sebastián Uchitel (London and Buenos Aires), Jan Vitek (Prague), Andrzej Wasowski (Copenhagen), Thomas Wies (New York), Anton Wijs (Eindhoven), and Manuel Wimmer (Linz).

I'd like to take this opportunity to thank all authors, attendees, organizers of the satellite workshops, and Springer-Verlag GmbH for their support. I hope you all enjoyed ETAPS 2022.

Finally, a big thanks to Jan, Julia, Dirk, and their local organization team for all their enormous efforts to make ETAPS a fantastic event.

February 2022

Marieke Huisman, ETAPS SC Chair, ETAPS e.V. President

### Preface

This volume contains the papers presented at the 25th International Conference on Foundations of Software Science and Computation Structures (FoSSaCS 2022), which was held during April 4–6, 2022, in Munich, Germany. The conference is dedicated to foundational research with a clear significance for software science and brings together research on theories and methods to support the analysis, integration, synthesis, transformation, and verification of programs and software systems.

In addition to an invited talk by Nathalie Bertrand (Université de Rennes, Inria, CNRS, and IRISA, France) on "Parameterized verification to the rescue of distributed algorithms", the program consisted of 23 contributed papers, selected from among 77 submissions. Each submission was assessed by three or more Program Committee members. The conference management system EasyChair was used to handle the submissions, to conduct the electronic Program Committee discussions, and to assist with the assembly of the proceedings.

We wish to thank all the authors who submitted papers for consideration, the members of the Program Committee for their conscientious work, and all additional reviewers who assisted the Program Committee in the evaluation process. Finally, we would like to thank the ETAPS organization for providing an excellent environment for FoSSaCS, other conferences, and workshops.

January 2022

Patricia Bouyer, Lutz Schröder

### Organization

### Program Committee

| Member | Affiliation |
| --- | --- |
| C. Aiswarya | Chennai Mathematical Institute, India |
| S. Akshay | Indian Institute of Technology Bombay, India |
| Carlos Areces | Universidad Nacional de Córdoba, Argentina |
| Filippo Bonchi | Università di Pisa, Italy |
| Patricia Bouyer (Chair) | CNRS, LMF, France |
| Michaël Cadilhac | DePaul University, USA |
| Ankush Das | Amazon Web Services, USA |
| Maribel Fernandez | King's College London, UK |
| Santiago Figueira | Universidad de Buenos Aires, Argentina |
| Hongfei Fu | Shanghai Jiao Tong University, China |
| Patricia Johann | Appalachian State University, USA |
| Ohad Kammar | University of Edinburgh, UK |
| Shin-ya Katsumata | National Institute of Informatics, Japan |
| Aleks Kissinger | University of Oxford, UK |
| Naoki Kobayashi | University of Tokyo, Japan |
| Orna Kupferman | Hebrew University, Israel |
| Alexander Kurz | Chapman University, USA |
| Sławomir Lasota | University of Warsaw, Poland |
| Annabelle McIver | Macquarie University, Australia |
| Daniela Petrisan | Université de Paris, IRIF, France |
| Elaine Pimentel | Universidade Federal do Rio Grande do Norte, Brazil |
| Jean-Francois Raskin | Université Libre de Bruxelles, Belgium |
| Jurriaan Rot | Radboud University, The Netherlands |
| Lutz Schröder (Chair) | Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany |
| Pawel Sobocinski | Tallinn University of Technology, Estonia |
| Ana Sokolova | Universität Salzburg, Austria |
| Jiri Srba | Aalborg University, Denmark |
| James Worrell | University of Oxford, UK |

### Additional Reviewers

Abriola, Sergio Allais, Guillaume Alvarez-Picallo, Mario Atkey, Robert Baillot, Patrick Balabonski, Thibaut

Balasubramanian, A. R. Bansal, Suguman Barloy, Corentin Blondin, Michael Bodlaender, Hans L. Boker, Udi

Bollig, Benedikt Bonomo, Flavia Bork, Alexander Bønneland, Frederik M. Carai, Luca Carbone, Marco Chen, Zhenbang Clemente, Lorenzo Comfort, Cole Crubillé, Raphaëlle Czerwiński, Wojciech D'Argenio, Pedro R. Dal Lago, Ugo Della Penna, Giuseppe Delzanno, Giorgio Demri, Stéphane Devillers, Raymond DeYoung, Henry Domínguez, Martín Ariel Doyen, Laurent Exibard, Léo Fervari, Raul Figueira, Diego Finkel, Alain Garner, Richard Gastin, Paul Gay, Simon Genest, Blaise Gocht, Stephan Goncharov, Sergey Grochau Azzi, Guilherme Grädel, Erich Hadzihasanovic, Amar Hague, Matthew Hedges, Jules Ho, Hsi-Ming Hodkinson, Ian Junges, Sebastian Kahn, David Karimov, Toghrul Kauffman, Sean Kiefer, Stefan Klin, Bartek Koutny, Maciej Kura, Satoshi Kuznetsov, Stepan

Lange, Martin Lewis, Marco Lorber, Florian López Franco, Ignacio Maarand, Hendrik Maderbacher, Benedikt Mamouras, Konstantinos Martens, Wim Martinez, Maria Vanina Mathieson, Luke Matsushita, Yusuke Meggendorfer, Tobias Mikulski, Lukasz Mikučionis, Marius Moerman, Joshua Muniz, Marco Nakazawa, Koji Nester, Chad Ockerlund, Kyle Oualhadj, Youssouf Padhi, Saswat Paperman, Charles Perez, Guillermo Piedeleu, Robin Piróg, Maciej Poças, Diogo Praveen, M. Puglisi, Simon Reynier, Pierre-Alain Román, Mario Sacerdoti Coen, Claudio Saivasan, Prakash Sangnier, Arnaud Sankur, Ocan Sarkar, Saptarshi Schmid, Todd Schou, Morten Konggaard Sharma, Vaibhav Steinberg, Florian Sterling, Jonathan Thejaswini, K. S. Trotta, Davide Tull, Sean Tzevelekos, Nikos Ulidowski, Irek van Dijk, Tom

van Glabbeek, Rob van Heerdt, Gerco Veltri, Niccolò Voorneveld, Niels Vortmeier, Nils Wagemaker, Jana Wagner, Dominik Wang, Di

Wang, Weiyou Wojtczak, Dominik Yamakami, Tomoyuki Yang, Qizhe Ying, Mingsheng Ziliani, Beta Zimmermann, Martin Žikelić, Djordje

## Parameterized Verification to the Rescue of Distributed Algorithms (Abstract of Invited Talk)

Nathalie Bertrand

Univ Rennes, Inria, CNRS, IRISA, France nathalie.bertrand@inria.fr

Abstract. Distributed computing is everywhere in our daily lives and in advanced technological applications. Bugs in distributed algorithms can have huge consequences, so that already in 2006, Lamport advised: "Model-checking algorithms prior to submitting them for publication should become the norm" [4]. Formal verification techniques indeed avoid tedious and error-prone manual correctness proofs.

Developing formal verification techniques for distributed algorithms is a real challenge, since correctness should typically hold independently of the number of participants. The latter often can be considered, or are by design, anonymous, forming a crowd of identical copies. Since the seminal work of German and Sistla establishing the decidability of parameterized verification for crowds of finite-state machines interacting via rendez-vous [3], the model checking community has been focusing on specific classes of distributed algorithms, and has proposed appropriate crowds models with a decidable parameterized verification problem [1, 2].

In this talk, we will report on recent contributions to the parameterized verification of distributed algorithms.

Keywords: Model checking · Parameterized verification · Distributed algorithms

### References



## Representing Regular Languages of Infinite Words Using Mod 2 Multiplicity Automata

Dana Angluin<sup>1</sup>, Timos Antonopoulos<sup>1</sup>, Dana Fisman<sup>2</sup>, and Nevin George<sup>1</sup>

> <sup>1</sup> Yale University, New Haven, CT, USA timos.antonopoulos@yale.edu <sup>2</sup> Ben-Gurion University, Beer-Sheva, Israel

Abstract. We explore the suitability of mod 2 multiplicity automata (M2MAs) as a representation for regular languages of infinite words. M2MAs are a deterministic representation that is known to be learnable in polynomial time with membership and equivalence queries, in contrast to many other representations. Another advantage of M2MAs compared to non-deterministic automata is that their equivalence can be decided in polynomial time and complementation incurs only an additive constant size increase. Because learning time is parameterized by the size of the representation, particular attention is focused on the relative succinctness of alternate representations, in particular, LTL formulas and Büchi automata of the types: deterministic, non-deterministic and strongly unambiguous. We supplement the theoretical results of worst case upper and lower bounds with experimental results computed for randomly generated automata and specific families of LTL formulas.

Keywords: Multiplicity Automata · Regular Omega Languages · Büchi Automata · Linear Temporal Logic · Conciseness

### 1 Introduction

Regular languages of infinite words (or ω-words) play an important role in verification of reactive systems. The question of whether a system S satisfies a specification given by a temporal logic formula ϕ can be reduced to the question of whether L(S) ∩ L(¬ϕ) is empty, where L(S) is the set of ω-words representing the computation paths of the system S and L(¬ϕ) is the set of ω-words representing computations that violate ϕ. Automata are a useful machinery for performing operations on languages such as complementation and intersection, and for deciding properties such as emptiness and equivalence. Many verification tools are implemented using reductions to automata [20].

Regular ω-languages can be represented using various types of automata (e.g. Büchi, Rabin, Parity, etc.). Different automata types differ in their succinctness and in the complexity of performing operations of interest. Non-deterministic Büchi automata (NBAs) are one of the most popular acceptor types for regular ω-languages, mainly due to their simplicity, succinctness, and good complexity for the emptiness problem. An issue with Büchi automata is that their deterministic version (DBAs) is strictly less expressive: while NBAs accept all regular ω-languages, DBAs recognize only a strict subset thereof. Another issue is that complementation of NBAs is hard; it has a 2<sup>Ω(n log n)</sup> lower bound (where n is the number of states) [16]. This motivated the introduction of complete unambiguous Büchi automata (CUBA) by Carton and Michel, who showed that every regular ω-language can be represented by a CUBA, i.e. there is a way to limit the nondeterminism without losing expressiveness [8]. Bousquet and Löding proposed strongly unambiguous Büchi automata (SUBA), a slight relaxation of CUBA for which they have shown that equivalence can be decided in polynomial time [6].

The SUBA model was also shown useful in terms of learnability of regular ω-languages — Angluin, Antonopoulos and Fisman have shown that SUBAs are polynomially predictable using membership queries (while NBAs, under plausible cryptographic assumptions, are not) [1]. Their proof makes use of a model of automata called Mod 2 Multiplicity Automata (M2MA). Informally, multiplicity automata are an algebraic variant of automata that compute functions from finite words to a field K [4,5], and M2MAs are multiplicity automata that work over the field GF(2) = {0, 1} where sum and product are computed modulo 2.

In this paper we look at questions concerning the adequacy of M2MAs for representing regular ω-languages. We note that M2MAs operate on finite words, and their use for representing regular ω-languages follows a reduction, by Calbrix, Nivat and Podelski from a regular ω-language L to a regular language (L)\$ of finite words [7]. We thus start by reviewing the succinctness of M2MAs with respect to automata on finite words, particularly of types non-deterministic (NFAs), deterministic (DFAs), and unambiguous (UFAs). We show that M2MAs are more succinct than DFAs and UFAs, whereas with respect to NFAs there are in the worst case exponential gaps in going from M2MAs to NFAs and vice versa.

We also study the complexity of performing basic operations on M2MAs; complementation can be done with an additive constant increase in size, and union and intersection with the product of sizes. There is a known cubic algorithm to minimize a weighted automaton [10,19], which applies to an M2MA and also implies cubic procedures for determining emptiness and equivalence.

We then investigate the succinctness of M2MAs in representing regular ω-languages, by comparing translations from linear temporal logic (LTL) formulas and Büchi automata (deterministic, non-deterministic and strongly unambiguous) into M2MAs, DFAs, UFAs, SUBAs and NBAs (where the former three use the (L)\$ representation). The results are summarized in Fig. 3.

To complement the theoretical bounds, we implemented procedures to transform SUBAs to UFAs and UFAs to M2MAs, and to minimize and learn M2MAs, and report estimates of the average size increases in transforming random SUBAs, DBAs, and NBAs to M2MAs. We also determine the minimum dimensions of M2MAs and minimum sizes of DFAs for a few members of three specific families of LTL formulas and compare them with the respective ω-automaton sizes.

### 2 Preliminaries

For nonnegative integers k and ℓ, [k..ℓ] is the set of nonnegative integers n such that k ≤ n ≤ ℓ. Given a finite alphabet Σ, Σ<sup>∗</sup> is the set of finite words over Σ. The length of a word x is |x| and the empty word is ε. Σ<sup>n</sup> = {x ∈ Σ<sup>∗</sup> | |x| = n}. The reverse of a word x is x<sup>r</sup>. A language L is any subset of Σ<sup>∗</sup>. The reverse of L, denoted L<sup>r</sup>, is {x<sup>r</sup> | x ∈ L}. The Hankel matrix of a language L is the infinite matrix whose rows and columns are indexed by elements of Σ<sup>∗</sup>, where the entry for row x and column y is 1 if xy ∈ L and 0 if xy ∉ L.

The set of infinite words (or ω-words) over Σ is the set of all maps from the positive integers to Σ and is denoted Σ<sup>ω</sup>. An ω-language is any subset of Σ<sup>ω</sup>. For a finite or infinite word w, w[i] denotes the symbol at position i, with indices starting at 1. Concatenation of a finite word x with a finite or infinite word y is denoted xy. The word x is a prefix of xy and the word y is a suffix of xy. The suffix of w starting at position i is denoted w[i :]. If x ∈ Σ<sup>∗</sup> and k is a nonnegative integer, x<sup>k</sup> denotes the concatenation of k copies of x, and x<sup>ω</sup> denotes the concatenation of x with itself infinitely many times. An ω-word is ultimately periodic if it can be written in the form u(v)<sup>ω</sup> for u, v ∈ Σ<sup>∗</sup> with |v| > 0. If A<sub>1</sub> and A<sub>2</sub> are sets and S ⊆ A<sub>1</sub> × A<sub>2</sub>, then we define the projection π<sub>1</sub>(S) = {a<sub>1</sub> | (∃a<sub>2</sub>)(a<sub>1</sub>, a<sub>2</sub>) ∈ S} and analogously for the projection π<sub>2</sub>.

#### 2.1 NFAs, UFAs, DFAs, NBAs, UBAs, SUBAs, and DBAs

A (nondeterministic) finite-state automaton A is a tuple (Σ, Q, I, ∆, F) consisting of a finite alphabet Σ, a finite set Q of states, a set I ⊆ Q of initial states, a set F ⊆ Q of final states, and a transition relation ∆ ⊆ Q × Σ × Q. The transition relation ∆ is deterministic if for every state q ∈ Q and every symbol σ ∈ Σ, there is at most one state q′ ∈ Q such that (q, σ, q′) ∈ ∆. The size of a finite-state automaton is |Q|.

For a word w, a run of A on w is a sequence of states q<sub>0</sub>, q<sub>1</sub>, . . . such that for each i that indexes a symbol in w, (q<sub>i−1</sub>, w[i], q<sub>i</sub>) ∈ ∆. Thus, for w ∈ Σ<sup>∗</sup> a run on w is a sequence of length |w| + 1, and for w ∈ Σ<sup>ω</sup>, a run on w is an infinite sequence of states. A run on w is initial if q<sub>0</sub> ∈ I. A finite run is final if q<sub>|w|</sub> ∈ F, and an infinite run is final if there are infinitely many values of i for which q<sub>i</sub> ∈ F. Acceptors of languages and ω-languages may be defined using finite-state automata, as follows. In each case, the language of words accepted by an acceptor A is denoted L(A).

A nondeterministic finite acceptor (NFA) is a finite-state automaton A that accepts a word w ∈ Σ<sup>∗</sup> if there exists a run of A on w that is both initial and final. An NFA A is an unambiguous finite acceptor (UFA) if for every word w ∈ L(A) there is exactly one run of A on w that is initial and final. An NFA A is a deterministic finite acceptor (DFA) if there is exactly one initial state (|I| = 1) and the transition relation ∆ is deterministic. The languages over Σ that are accepted by NFAs, UFAs, or DFAs are precisely the regular languages over Σ.
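The definitions of acceptance and unambiguity can be made concrete by counting runs. The following is a minimal sketch, not taken from the paper: the function names and the example automaton (words over {a, b} whose second-to-last symbol is a) are illustrative assumptions, and the unambiguity check only inspects words up to a given length, so it is a bounded test rather than a proof.

```python
from itertools import product

def count_accepting_runs(initial, delta, final, word):
    """Number of runs on `word` that are both initial and final."""
    # counts[q] = number of initial runs on the prefix read so far ending in q
    counts = {q: 1 for q in initial}
    for sym in word:
        nxt = {}
        for (p, s, q) in delta:
            if s == sym and p in counts:
                nxt[q] = nxt.get(q, 0) + counts[p]
        counts = nxt
    return sum(n for q, n in counts.items() if q in final)

def accepts(initial, delta, final, word):
    """NFA acceptance: at least one initial and final run exists."""
    return count_accepting_runs(initial, delta, final, word) > 0

def unambiguous_up_to(alphabet, initial, delta, final, max_len):
    """Bounded check: no word of length <= max_len has two accepting runs."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            if count_accepting_runs(initial, delta, final, letters) > 1:
                return False
    return True

# Hypothetical example: second-to-last symbol is 'a' (state 0 guesses the position).
INITIAL, FINAL = {0}, {2}
DELTA = {(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "a", 2), (1, "b", 2)}

# Adding loops on the final state makes the automaton ambiguous
# ("aab" then has two accepting runs).
DELTA_AMBIGUOUS = DELTA | {(2, "a", 2), (2, "b", 2)}
```

With these definitions, `accepts(INITIAL, DELTA, FINAL, "ab")` holds while `"ba"` is rejected, and the bounded check separates the two transition relations.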

A nondeterministic Büchi acceptor (NBA) is a finite-state automaton A that accepts a word w ∈ Σ<sup>ω</sup> if there exists a run of A on w that is both initial and final. An NBA is an unambiguous Büchi acceptor (UBA) if for every w ∈ L(A), there exists exactly one run of A on w that is initial and final. Bousquet and Löding [6] introduced the concept of a strongly unambiguous Büchi acceptor (SUBA), which is an NBA such that for every w ∈ Σ<sup>ω</sup>, there is at most one final run of the acceptor on w (note that the condition of being initial is dropped). Thus, every SUBA is a UBA. The ω-languages over Σ that are accepted by NBAs, UBAs, or SUBAs are precisely the regular ω-languages. An NBA is a deterministic Büchi acceptor (DBA) if there is exactly one initial state (|I| = 1) and the transition relation ∆ is deterministic. Every DBA is a UBA, but is not necessarily a SUBA. The ω-languages that are accepted by DBAs are a proper subclass of the class of all regular ω-languages.

For Büchi acceptors, we also consider a generalized version, GNBA, in which the acceptance condition is specified not by a single set of final states, but by a collection 𝓕 of sets of final states. For a GNBA, a run q<sub>0</sub>, q<sub>1</sub>, . . . is final iff for each F ∈ 𝓕, there exist infinitely many indices i such that q<sub>i</sub> ∈ F. Applying this generalization to a SUBA yields a GSUBA. There is a standard translation of a GNBA of size n with k sets of final states into an NBA of size kn, in which there are k copies of the GNBA automaton. However, applying this construction to a GSUBA does not in general yield a SUBA.
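The paper does not spell out the translation from a GNBA to an NBA. A common textbook version is a round-robin counter construction, sketched below on the assumption that this is the intended one: copy i waits for a visit to the i-th final set and then hands over to copy i + 1 (mod k). The function name, the data layout, and the two-state example are hypothetical, and at least one final set is assumed.

```python
def gnba_to_nba(states, initial, delta, final_sets):
    """Round-robin translation: k copies of the GNBA, state space Q x [0..k-1].

    Assumes k = len(final_sets) >= 1. A run of the result visits
    final_sets[0] x {0} infinitely often iff the underlying GNBA run
    visits every set in final_sets infinitely often.
    """
    k = len(final_sets)
    new_states = {(q, i) for q in states for i in range(k)}
    new_initial = {(q, 0) for q in initial}
    new_final = {(q, 0) for q in final_sets[0]}
    new_delta = set()
    for (p, sym, q) in delta:
        for i in range(k):
            # Advance the counter only when leaving a state of the awaited set.
            j = (i + 1) % k if p in final_sets[i] else i
            new_delta.add(((p, i), sym, (q, j)))
    return new_states, new_initial, new_delta, new_final

# Hypothetical GNBA: state "A" means "just read a", "B" means "just read b".
# Final sets {A} and {B} require infinitely many a's and infinitely many b's.
STATES = {"A", "B"}
DELTA = {(p, "a", "A") for p in STATES} | {(p, "b", "B") for p in STATES}
NBA_STATES, NBA_INITIAL, NBA_DELTA, NBA_FINAL = gnba_to_nba(
    STATES, {"A"}, DELTA, [{"A"}, {"B"}]
)
```

The result has kn = 4 states here, matching the size bound stated above.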

### 2.2 LTL formulas

The syntax of linear temporal logic (LTL) [18] over a set AP of atomic propositions is given by the following grammar: ϕ ::= p | ¬ϕ | ϕ<sub>1</sub> ∧ ϕ<sub>2</sub> | ○ϕ | (ϕ<sub>1</sub> U ϕ<sub>2</sub>), where p ∈ AP is an atomic proposition and ○ is the next-time connective.

The semantics of LTL relates ω-words over 2<sup>AP</sup> to formulas in the usual way (recall that indexing of words starts at 1). Additional Boolean and temporal connectives are defined in the usual way. In particular ⊤ (true) is defined as p ∨ ¬p, ♦ϕ (eventually ϕ) is defined as (⊤ U ϕ) and □ϕ (always ϕ) is defined as ¬♦(¬ϕ).
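The paper's own semantics table did not survive extraction. For reference, the standard LTL semantics can be written as follows, using the 1-based indexing and the suffix notation w[i :] introduced in the preliminaries. This is the textbook definition, assumed (not confirmed) to coincide with the one the authors display; the next-time connective is written ○ here.

```latex
\begin{align*}
w \models p &\iff p \in w[1] \\
w \models \neg\varphi &\iff w \not\models \varphi \\
w \models \varphi_1 \wedge \varphi_2 &\iff w \models \varphi_1 \text{ and } w \models \varphi_2 \\
w \models \bigcirc\varphi &\iff w[2:] \models \varphi \\
w \models (\varphi_1 \mathbin{\mathrm{U}} \varphi_2) &\iff \exists i \ge 1 :\;
  w[i:] \models \varphi_2 \text{ and } \forall\, 1 \le j < i :\; w[j:] \models \varphi_1
\end{align*}
```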

The ω-language of an LTL formula ϕ, denoted L(ϕ), is the set of ω-words for which it is true. The size of an LTL formula ϕ is the number of distinct subformulas it contains. Every LTL formula represents a regular ω-language (see Section 5). However, not every regular ω-language can be represented by an LTL formula; in particular, the regular ω-languages that can be represented by LTL formulas are noncounting [9].

### 2.3 M2MAs

A multiplicity automaton represents a function mapping Σ<sup>∗</sup> to elements of a field K. We focus on the case where K = {0, 1} and product and sum are computed modulo 2. A mod 2 multiplicity acceptor (M2MA) of dimension d is a tuple A = (Σ, v<sub>I</sub>, {µ<sub>σ</sub>}<sub>σ∈Σ</sub>, v<sub>F</sub>), where Σ is the input alphabet, v<sub>I</sub> ∈ K<sup>d</sup> is the initial vector, v<sub>F</sub> ∈ K<sup>d</sup> is the final vector, and for each σ ∈ Σ, µ<sub>σ</sub> is a d × d transition matrix over K, that is, an element of K<sup>d×d</sup>.

The vectors v<sub>I</sub> and v<sub>F</sub> are interpreted as d × 1 column vectors. The transpose operation is denoted by <sup>⊤</sup>, and the inner product of two column vectors v, w ∈ K<sup>d</sup> is denoted v<sup>⊤</sup>w.

To define L(A) we inductively define the matrix µ<sub>x</sub> for all x ∈ Σ<sup>∗</sup>. If x = ε, then µ<sub>x</sub> is the d × d identity matrix. If x = σy for some σ ∈ Σ and y ∈ Σ<sup>∗</sup> then µ<sub>x</sub> = µ<sub>σ</sub>µ<sub>y</sub>. The function f<sub>A</sub> : Σ<sup>∗</sup> → K computed by A is defined by f<sub>A</sub>(x) = v<sub>I</sub><sup>⊤</sup>µ<sub>x</sub>v<sub>F</sub>. A word x is accepted by A if f<sub>A</sub>(x) = 1.

We refer to column vectors v ∈ K<sup>d</sup> as states or co-states of A. A state v is reachable iff there exists a word x ∈ Σ<sup>∗</sup> such that v = (v<sub>I</sub><sup>⊤</sup>µ<sub>x</sub>)<sup>⊤</sup>. A co-state w is co-reachable iff there exists a word x ∈ Σ<sup>∗</sup> such that w = µ<sub>x</sub>v<sub>F</sub>. For any state v, L<sub>v</sub>(A) denotes the language of words accepted by A with its initial vector replaced by v.

We assume standard results from finite dimensional vector spaces. If U is a vector space of dimension k over the field {0, 1} then |U| = 2<sup>k</sup>. If U is a vector subspace of the vector space V, then the orthogonal complement of U is the set U<sup>⊥</sup> = {v | v<sup>⊤</sup>u = 0 ∀u ∈ U}; U<sup>⊥</sup> is a vector subspace of V which is disjoint from U except for the zero vector, and the dimensions of U and U<sup>⊥</sup> sum to the dimension of V.
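The definition of the function computed by an M2MA translates directly into a left-to-right vector-matrix product modulo 2. The sketch below is illustrative only: the function name and the dimension-2 example (words over {a, b} with an odd number of a's) are assumptions, not objects from the paper.

```python
def m2ma_value(v_init, mu, v_final, word):
    """Compute f_A(word) = v_I^T * mu_word * v_F over GF(2).

    `mu` maps each symbol to a d x d 0/1 matrix (list of rows).
    """
    d = len(v_init)
    row = list(v_init)  # invariant: row == v_I^T * mu_x for the prefix x read so far
    for sym in word:
        m = mu[sym]
        row = [sum(row[i] * m[i][j] for i in range(d)) % 2 for j in range(d)]
    return sum(r * f for r, f in zip(row, v_final)) % 2

# Hypothetical M2MA of dimension 2 accepting words with an odd number of a's:
# mu_a swaps the two coordinates, mu_b is the identity.
ODD_A = {
    "v_init": [1, 0],
    "mu": {"a": [[0, 1], [1, 0]], "b": [[1, 0], [0, 1]]},
    "v_final": [0, 1],
}
```

For instance, `m2ma_value(**ODD_A, word="bab")` evaluates to 1, while the empty word and `"aba"` evaluate to 0.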

The following simple lemmas relate M2MAs to UFAs and DFAs, and show that M2MAs accept exactly the regular languages.

Lemma 1. [Beimel et al. [4]] Let L ⊆ Σ<sup>∗</sup> . If L is accepted by a UFA of size n, it is also accepted by an M2MA of dimension n.
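A sketch of the idea behind Lemma 1, under the assumption that the usual construction is meant: take the indicator vectors of the initial and final states and the 0/1 adjacency matrix of each symbol. The value f<sub>A</sub>(x) is then the number of initial and final runs on x, taken modulo 2, which for a UFA is 1 exactly on the accepted words. The names and both example automata are hypothetical.

```python
def automaton_to_m2ma(states, alphabet, initial, delta, final):
    """Indicator vectors and adjacency matrices of a finite-state automaton."""
    index = {q: i for i, q in enumerate(states)}
    d = len(states)
    v_init = [1 if q in initial else 0 for q in states]
    v_final = [1 if q in final else 0 for q in states]
    mu = {s: [[0] * d for _ in range(d)] for s in alphabet}
    for (p, s, q) in delta:
        mu[s][index[p]][index[q]] = 1
    return v_init, mu, v_final

def m2ma_value(v_init, mu, v_final, word):
    """f_A(word) over GF(2): parity of the number of initial and final runs."""
    d = len(v_init)
    row = list(v_init)
    for sym in word:
        m = mu[sym]
        row = [sum(row[i] * m[i][j] for i in range(d)) % 2 for j in range(d)]
    return sum(r * f for r, f in zip(row, v_final)) % 2

# Hypothetical UFA: words over {a, b} whose second-to-last symbol is 'a'.
UFA_DELTA = {(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "a", 2), (1, "b", 2)}
UFA_M2MA = automaton_to_m2ma([0, 1, 2], "ab", {0}, UFA_DELTA, {2})

# Loops on the final state make the automaton ambiguous; the same construction
# then computes the parity of the run count, which no longer equals membership.
AMBIGUOUS_M2MA = automaton_to_m2ma(
    [0, 1, 2], "ab", {0}, UFA_DELTA | {(2, "a", 2), (2, "b", 2)}, {2}
)
```

The ambiguous variant shows why unambiguity matters: the word `"aab"` has two accepting runs there, so the resulting M2MA assigns it the value 0 even though the NFA accepts it.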

Lemma 2. Let L ⊆ Σ<sup>∗</sup>. If L is accepted by an M2MA of dimension d with R reachable states, then L is also accepted by a DFA of R states. Clearly, R ≤ 2<sup>d</sup>.
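Lemma 2 can be illustrated by exploring the reachable states of an M2MA breadth-first and using them as DFA states, with a state v accepting iff v<sup>⊤</sup>v<sub>F</sub> = 1. This is a sketch under that reading of the lemma. The helper names are hypothetical, and the dimension-3 example is the M2MA obtained from a three-state UFA for "second-to-last symbol is a", an automaton chosen here for illustration.

```python
from collections import deque
from itertools import product

def step(row, m):
    """One transition: row vector times matrix over GF(2)."""
    d = len(row)
    return tuple(sum(row[i] * m[i][j] for i in range(d)) % 2 for j in range(d))

def m2ma_value(v_init, mu, v_final, word):
    row = tuple(v_init)
    for sym in word:
        row = step(row, mu[sym])
    return sum(r * f for r, f in zip(row, v_final)) % 2

def m2ma_to_dfa(v_init, mu, v_final):
    """DFA whose states are the reachable state vectors of the M2MA."""
    start = tuple(v_init)
    trans, seen, queue = {}, {start}, deque([start])
    while queue:
        v = queue.popleft()
        for sym, m in mu.items():
            w = step(v, m)
            trans[(v, sym)] = w
            if w not in seen:
                seen.add(w)
                queue.append(w)
    accepting = {v for v in seen if sum(a * b for a, b in zip(v, v_final)) % 2 == 1}
    return start, trans, accepting

def dfa_accepts(start, trans, accepting, word):
    state = start
    for sym in word:
        state = trans[(state, sym)]
    return state in accepting

# Hypothetical M2MA of dimension 3 (second-to-last symbol is 'a').
V_INIT, V_FINAL = [1, 0, 0], [0, 0, 1]
MU = {
    "a": [[1, 1, 0], [0, 0, 1], [0, 0, 0]],
    "b": [[1, 0, 0], [0, 0, 1], [0, 0, 0]],
}
START, TRANS, ACCEPTING = m2ma_to_dfa(V_INIT, MU, V_FINAL)
DFA_STATES = {v for (v, _) in TRANS}
```

In this example only 4 of the 2<sup>3</sup> = 8 vectors are reachable, so the resulting DFA has 4 states.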

Beimel et al. [4] have shown that there is a polynomial time algorithm to learn an unknown M2MA using equivalence and membership queries.

### 2.4 Size lower bounds for DFAs, M2MAs and NFAs

Given a language L ⊆ Σ<sup>∗</sup>, we define an observation table for L as an ℓ × m matrix T of 0's and 1's where each row i is associated with a finite word x<sub>i</sub> and each column j is associated with a finite word y<sub>j</sub>, and the entry T<sub>i,j</sub> is 1 if and only if x<sub>i</sub>y<sub>j</sub> ∈ L. This terminology is derived from its use in algorithms to learn DFAs. An observation table for L is thus a finite submatrix of its Hankel matrix.
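An observation table is easy to build from a membership test. The sketch below uses the language of strings over {a, b, c} containing neither ba nor cb, which the paper uses as its example at the end of this section. The choice of row and column words is an assumption made here and need not match the table in the paper's Fig. 1.

```python
def in_language(word):
    """Membership test: no occurrence of the substrings 'ba' or 'cb'."""
    return "ba" not in word and "cb" not in word

def observation_table(member, rows, cols):
    """T[i][j] = 1 iff rows[i] + cols[j] belongs to the language."""
    return [[1 if member(x + y) else 0 for y in cols] for x in rows]

# Hypothetical choice of row and column words ("" is the empty word).
ROWS = ["", "b", "c", "ba"]
COLS = ["", "a", "b"]
TABLE = observation_table(in_language, ROWS, COLS)
```

Each row of `TABLE` records how one prefix behaves against every listed suffix, which is exactly the finite submatrix of the Hankel matrix selected by `ROWS` and `COLS`.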

Certain properties of observation tables for a language L yield lower bounds on acceptors recognizing L. Recall that the rank of a matrix is the number of linearly independent rows (or columns) it contains.

Lemma 3. Let T be an observation table for the regular language L with rows associated with finite words x<sub>i</sub> for i ∈ [1..ℓ] and columns associated with finite words y<sub>j</sub> for j ∈ [1..m]. Assume T has n distinct rows and rank d over the field {0, 1}. Then any DFA to accept L must have at least n states, and any M2MA to accept L must have dimension at least d.

Proof. Let D be a DFA accepting L. If the rows for x<sub>i</sub> and x<sub>k</sub> are distinct, then there is a column j on which they differ, that is, x<sub>i</sub>y<sub>j</sub> ∈ L iff x<sub>k</sub>y<sub>j</sub> ∉ L. Thus, the states of D reached from the initial state on the words x<sub>i</sub> and x<sub>k</sub> must be different and D has at least n states.

Let M be an M2MA accepting L. Following the argument of Beimel et al. [4], the observation table is a submatrix of the Hankel matrix of the language L, and its rank (modulo 2) is a lower bound for the rank (modulo 2) of the Hankel matrix, which is a lower bound for the size of any M2MA accepting L. □
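The two quantities in Lemma 3 are straightforward to compute for a concrete table. The sketch below counts distinct rows and computes the rank over GF(2) by XOR-based elimination on rows encoded as integers. The table is a hypothetical one for the language with no ba or cb (rows ε, b, c, ba and columns ε, a, b), chosen for illustration and not copied from the paper.

```python
def distinct_rows(table):
    """Number of distinct rows: a lower bound on DFA states by Lemma 3."""
    return len({tuple(row) for row in table})

def gf2_rank(table):
    """Rank over GF(2): a lower bound on M2MA dimension by Lemma 3.

    Rows are non-empty lists of 0/1 entries, encoded as integers.
    min(x, x ^ b) clears the leading bit of b in x whenever it is set,
    so every stored basis element keeps its own leading bit.
    """
    basis = []
    for row in table:
        x = int("".join(str(bit) for bit in row), 2)
        for b in basis:
            x = min(x, x ^ b)
        if x:
            basis.append(x)
    return len(basis)

# Hypothetical observation table: rows "", "b", "c", "ba"; columns "", "a", "b".
TABLE = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 0],
]
```

For this table the sketch reports 4 distinct rows and GF(2) rank 3, so under Lemma 3 any DFA needs at least 4 states and any M2MA needs dimension at least 3.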

For lower bounds for NFAs, we use the concept of covering the observation table by 1-monochromatic rectangles. If R and C are subsets of the indices of the rows and columns (respectively) of a matrix M, then the (R, C)-rectangle of M is the matrix obtained from M by deleting those rows whose indices are not in R and those columns whose indices are not in C. The (R, C)-rectangle of a matrix M is v-monochromatic iff all of its entries are equal to the value v.

Let M be a matrix of 0 and 1 values. A 1-rectangle cover of M is a set {(Rs, Cs) | s ∈ [1..t]}, of 1-monochromatic rectangles (Rs, Cs) of M such that for every i and j, if Mi,j = 1 then there exists some s ∈ [1..t] such that i ∈ R<sup>s</sup> and j ∈ Cs. A minimum 1-rectangle cover of M is a 1-rectangle cover of M of minimum possible cardinality t.

Lemma 4. Let T be an `×m observation table for the regular language L. Any NFA M recognizing L must have at least as many states as the cardinality of the minimum 1-rectangle cover of T.

This is implied by Theorem 5.2.4.10 and Exercise 5.2.5.14 of Hromkovič [12]. For completeness we provide a simple direct proof.

Proof. Let the strings indexing the rows of T be x<sub>i</sub> for i ∈ [1..ℓ] and the strings indexing the columns of T be y<sub>j</sub> for j ∈ [1..m]. For each state q of M, let R<sub>q</sub> be the set of all i ∈ [1..ℓ] such that x<sub>i</sub> reaches q from an initial state of M, and let C<sub>q</sub> be the set of all j ∈ [1..m] such that y<sub>j</sub> reaches a final state of M from q.

Clearly (R<sub>q</sub>, C<sub>q</sub>) must be a 1-monochromatic rectangle of T, because if i ∈ R<sub>q</sub> and j ∈ C<sub>q</sub> then x<sub>i</sub>y<sub>j</sub> is accepted by M and the entry T<sub>i,j</sub> must be 1. Also, if T<sub>i,j</sub> = 1, then x<sub>i</sub>y<sub>j</sub> must be accepted by M, so there must exist a state q of M such that x<sub>i</sub> reaches q from an initial state of M and y<sub>j</sub> reaches a final state of M from q, that is, i ∈ R<sub>q</sub> and j ∈ C<sub>q</sub>. Thus, the rectangles (R<sub>q</sub>, C<sub>q</sub>) for all states q of M form a 1-rectangle cover of T, and the number of states of M is greater than or equal to the cardinality of the minimum 1-rectangle cover of T. □
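The construction in this proof can be checked mechanically on a small instance. The sketch below is only an illustration: the three-state automaton, the row words and the column words are our own choices (not taken from the paper), and the automaton is a DFA without its dead state, viewed as an NFA. It computes one pair (R<sub>q</sub>, C<sub>q</sub>) per state and verifies that the pairs form a 1-rectangle cover.

```python
def reach(delta, starts, word):
    """Set of states reachable from `starts` by reading `word`."""
    current = set(starts)
    for symbol in word:
        current = {t for s in current for t in delta.get((s, symbol), ())}
    return current

# Hypothetical automaton for the strings over {a, b, c} that contain
# neither "ba" nor "cb" (state names are illustrative).
DELTA = {
    ("A", "a"): {"A"}, ("A", "b"): {"B"}, ("A", "c"): {"C"},
    ("B", "b"): {"B"}, ("B", "c"): {"C"},
    ("C", "c"): {"C"}, ("C", "a"): {"A"},
}
STATES, INITIAL, FINAL = {"A", "B", "C"}, {"A"}, {"A", "B", "C"}

ROWS = ["", "b", "c", "ba"]  # row words x_i, chosen for illustration
COLS = ["", "a", "b", "c"]   # column words y_j, chosen for illustration

def accepts(word):
    return bool(reach(DELTA, INITIAL, word) & FINAL)

TABLE = [[int(accepts(x + y)) for y in COLS] for x in ROWS]

def rectangles():
    """One pair (R_q, C_q) of index sets per state q, as in the proof."""
    rects = {}
    for q in STATES:
        r_q = {i for i, x in enumerate(ROWS) if q in reach(DELTA, INITIAL, x)}
        c_q = {j for j, y in enumerate(COLS) if reach(DELTA, {q}, y) & FINAL}
        rects[q] = (r_q, c_q)
    return rects

def is_one_rectangle_cover(table, rects):
    """Check that every rectangle is all ones and that every 1 is covered."""
    cells = {(i, j) for r, c in rects.values() for i in r for j in c}
    ones = {(i, j) for i, row in enumerate(table)
            for j, value in enumerate(row) if value}
    return all(table[i][j] == 1 for i, j in cells) and ones <= cells
```

Calling `is_one_rectangle_cover(TABLE, rectangles())` checks both halves of the argument: each rectangle is 1-monochromatic, and every 1 entry lies in some rectangle. The cover obtained this way has one rectangle per state, so it need not be minimum.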

Corollary 1. If L is a regular language with an n × n observation table T that has exactly one 1 in every row and column, then any DFA, M2MA, or NFA to recognize L must have at least n states.

As an example of the use of these results, let L be the regular language over {a, b, c} consisting of those strings that do not contain any occurrences of the substrings ba or cb, with the observation table for L in Fig. 1. There are 4 different rows, so any DFA accepting L must have at least 4 states. The mod 2 rank of the table is 3 (the first three rows are a row basis), so any M2MA accepting L must have dimension at least 3. The observation table with rows c and b, and columns a and b, is the 2 × 2 identity matrix, so any NFA accepting L must have at least 2 states. In fact, there is a DFA of 4 states, an M2MA of dimension 3, and an NFA of 2 states accepting L, so for this example, the lower bounds are tight.

Fig. 1: Observation table with rank 3.
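The quantities used in this example (number of distinct rows, rank modulo 2) are easy to compute for any finite table. The following sketch does so for an observation table of the same language; the row and column words are our own choice and need not coincide with those of Fig. 1, which is not reproduced here.

```python
def in_language(word):
    """Membership in L: no occurrence of 'ba' or 'cb'."""
    return "ba" not in word and "cb" not in word

def observation_table(rows, cols):
    """Entry (i, j) is 1 iff the concatenation rows[i] + cols[j] is in L."""
    return [[int(in_language(x + y)) for y in cols] for x in rows]

def distinct_rows(table):
    return len({tuple(row) for row in table})

def rank_gf2(table):
    """Rank over the two-element field, by elimination on bitmasks."""
    vectors = [int("".join(map(str, row)), 2) for row in table]
    rank = 0
    while vectors:
        pivot = vectors.pop()
        if pivot == 0:
            continue
        rank += 1
        high = pivot.bit_length() - 1
        # Clear the pivot's leading bit from every remaining vector.
        vectors = [v ^ pivot if (v >> high) & 1 else v for v in vectors]
    return rank

# Illustrative choice of row and column words (an assumption of this sketch).
TABLE = observation_table(["", "b", "c", "ba"], ["", "a", "b", "c"])
IDENTITY_PART = observation_table(["c", "b"], ["a", "b"])
```

For this choice of words the table has 4 distinct rows and rank 3 over {0, 1}, and `IDENTITY_PART` is the 2 × 2 identity matrix mentioned in the text.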

### 3 M2MAs as representations of regular languages

We consider the computational cost and size implications of some common operations and decision questions using M2MAs to represent regular languages.

### 3.1 M2MAs: procedures for operations and properties

Reverse Given an M2MA A accepting a regular language L, an M2MA A<sup>r</sup> accepting the reverse language L<sup>r</sup> may be obtained from A by exchanging the initial and final vectors, and transposing each of the transition matrices. Thus, the minimum dimension of an M2MA accepting L is equal to the minimum dimension of an M2MA accepting L<sup>r</sup>. Reversing is similarly easy for UFAs and NFAs, but may incur an exponential increase in size for a DFA.

Sum If for i = 1, 2, M<sub>i</sub> is a multiplicity automaton of dimension d<sub>i</sub> computing the function f<sub>i</sub> : Σ<sup>∗</sup> → K, then the sum f<sub>1</sub> + f<sub>2</sub> is computed by a multiplicity automaton M of dimension d<sub>1</sub> + d<sub>2</sub> constructed as the direct product of M<sub>1</sub> and M<sub>2</sub> as follows. State vectors of M are the concatenation of state vectors of M<sub>1</sub> and M<sub>2</sub>, including the initial and final vectors. For each σ ∈ Σ, the transition matrix µ<sub>σ</sub> is a (d<sub>1</sub> + d<sub>2</sub>) × (d<sub>1</sub> + d<sub>2</sub>) matrix obtained by putting (µ<sub>1</sub>)<sub>σ</sub> in the upper left, (µ<sub>2</sub>)<sub>σ</sub> in the lower right, and setting the remaining entries to 0. This ensures that the state updates of M<sub>1</sub> and M<sub>2</sub> are done in parallel for each symbol, and the output is the sum of the outputs for M<sub>1</sub> and M<sub>2</sub>.

Boolean operations For M2MAs, complementation follows directly from the sum construction. If A is an M2MA of dimension d and C is the M2MA of dimension 1 that outputs 1 on every string, then the sum construction with A and C yields an M2MA of dimension d + 1 that accepts the regular language Σ<sup>∗</sup> \ L(A). For DFAs, complementation is size-preserving, while for NFAs, complementation may incur an exponential increase in size.
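The sum and complement constructions can be sketched in a few lines. The representation below is an assumption of this sketch: an M2MA is a triple `(alpha, mu, gamma)` of an initial row vector, a dictionary of transition matrices, and a final vector, and its output on a word is `alpha · mu[w1] ··· mu[wk] · gamma` modulo 2. The example automaton `ODD_A` is hypothetical.

```python
def evaluate(m2ma, word):
    """Output (0 or 1) of an M2MA on `word`; all arithmetic is modulo 2."""
    alpha, mu, gamma = m2ma
    state = list(alpha)
    for symbol in word:
        matrix = mu[symbol]
        state = [sum(state[i] * matrix[i][j] for i in range(len(state))) % 2
                 for j in range(len(state))]
    return sum(s * g for s, g in zip(state, gamma)) % 2

def direct_sum(m1, m2):
    """Block-diagonal construction computing f1 + f2 (modulo 2)."""
    (a1, mu1, g1), (a2, mu2, g2) = m1, m2
    d1, d2 = len(a1), len(a2)
    mu = {}
    for symbol in mu1:
        upper = [row + [0] * d2 for row in mu1[symbol]]   # (mu1 | 0)
        lower = [[0] * d1 + row for row in mu2[symbol]]   # (0 | mu2)
        mu[symbol] = upper + lower
    return (a1 + a2, mu, g1 + g2)

def complement(m2ma, alphabet):
    """Sum with the 1-dimensional M2MA that outputs 1 on every string."""
    ones = ([1], {symbol: [[1]] for symbol in alphabet}, [1])
    return direct_sum(m2ma, ones)

# Hypothetical example: strings over {a, b} with an odd number of a's.
ODD_A = ([1, 0],
         {"a": [[0, 1], [1, 0]], "b": [[1, 0], [0, 1]]},
         [0, 1])
EVEN_A = complement(ODD_A, "ab")
```

Here `EVEN_A` has dimension 3 = 2 + 1 and flips the output of `ODD_A` on every word, since adding 1 modulo 2 exchanges 0 and 1.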

Given M2MAs A<sub>i</sub> of dimension d<sub>i</sub> for i = 1, 2, the intersection language L(A<sub>1</sub>) ∩ L(A<sub>2</sub>) is accepted by an M2MA of dimension d<sub>1</sub> · d<sub>2</sub> obtained from A<sub>1</sub> and A<sub>2</sub> using the Kronecker product of matrices.<sup>3</sup> Union can then be obtained from complementation and intersection.
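A minimal sketch of the intersection construction follows. As an assumption of this sketch, an M2MA is represented as a triple `(alpha, mu, gamma)` whose output on a word is `alpha · mu[w1] ··· mu[wk] · gamma` modulo 2; the two example automata are hypothetical. The product automaton computes the product of the two outputs, which for values in {0, 1} is their conjunction.

```python
def evaluate(m2ma, word):
    """Output (0 or 1) of an M2MA on `word`; all arithmetic is modulo 2."""
    alpha, mu, gamma = m2ma
    state = list(alpha)
    for symbol in word:
        matrix = mu[symbol]
        state = [sum(state[i] * matrix[i][j] for i in range(len(state))) % 2
                 for j in range(len(state))]
    return sum(s * g for s, g in zip(state, gamma)) % 2

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def kron_vector(u, v):
    """Kronecker product of two vectors, indexed like the rows of kron."""
    return [x * y for x in u for y in v]

def intersection(m1, m2):
    """M2MA of dimension d1 * d2 whose output is the product f1 * f2."""
    (a1, mu1, g1), (a2, mu2, g2) = m1, m2
    mu = {symbol: kron(mu1[symbol], mu2[symbol]) for symbol in mu1}
    return (kron_vector(a1, a2), mu, kron_vector(g1, g2))

SWAP, KEEP = [[0, 1], [1, 0]], [[1, 0], [0, 1]]
ODD_A = ([1, 0], {"a": SWAP, "b": KEEP}, [0, 1])  # odd number of a's
ODD_B = ([1, 0], {"a": KEEP, "b": SWAP}, [0, 1])  # odd number of b's
BOTH_ODD = intersection(ODD_A, ODD_B)
```

The correctness of the construction rests on the mixed-product property (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD), which lets the two runs proceed independently inside the product.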

Minimization, Equivalence, and Emptiness Sakarovitch [10,19] describes a cubic-time algorithm to minimize a weighted automaton with weights from a skew field, which has the following corollary.

Corollary 2 (of Theorem 5.20 in [10]). Given an M2MA A of dimension d, an M2MA A′ of the minimum possible dimension accepting L(A) may be found in time O(|Σ|d<sup>3</sup>).

An M2MA recognizes the empty language iff it has dimension 0 when minimized, and the equivalence of two M2MAs may be tested by determining if their sum is the empty language.

### 3.2 Conciseness comparisons for regular languages

We summarize known results comparing the conciseness of M2MAs with that of DFAs, UFAs and NFAs as representations of regular languages in Fig. 2. The entry for row A and column B is "−" if the representation A is an instance of the representation B; otherwise, it answers the question: starting with a machine of size n in the representation A, how large must an equivalent machine in the representation B be in the worst case? The entry 2<sup>Θ(n)</sup> means that there is a lower bound of 2<sup>cn</sup> and an upper bound of 2<sup>dn</sup> for positive constants c and d.

We briefly explain the entries in the table. A DFA is also a UFA and an NFA, and a UFA is also an NFA. A DFA or UFA of size n can be converted to an equivalent M2MA of dimension n (Lemma 1). The subset construction to determinize an NFA of size n yields a DFA (and therefore also a UFA or M2MA) of size at most 2<sup>n</sup>. An M2MA of dimension n can be converted to a DFA (or UFA or NFA) of size at most 2<sup>n</sup> (Lemma 2). The language B<sub>n</sub> = Σ<sup>∗</sup> · 1 · Σ<sup>n</sup>, for Σ = {0, 1}, consisting of binary strings with a 1 located n + 1 symbols before the end, is accepted by a UFA of size n + 2 (and therefore also an NFA of size n + 2 and an M2MA of dimension n + 2), but requires at least 2<sup>n+1</sup> states for any DFA that accepts it.

Fig. 2: Worst-case size bounds for representations of regular languages.
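The DFA lower bound for B<sub>n</sub> can be confirmed for small n with the distinct-rows criterion of Lemma 3. The sketch below is an illustration with our own choice of table: rows are all binary words of length n + 1 and columns are the words 0<sup>j</sup> for j ∈ [0..n], so that column j reads off the j-th symbol of the row word.

```python
from itertools import product

def in_b(word, n):
    """Membership in B_n: the symbol n + 1 positions before the end is 1."""
    return len(word) >= n + 1 and word[-(n + 1)] == "1"

def distinct_rows(n):
    """Number of distinct rows in the illustrative observation table."""
    rows = ["".join(bits) for bits in product("01", repeat=n + 1)]
    cols = ["0" * j for j in range(n + 1)]
    return len({tuple(in_b(x + y, n) for y in cols) for x in rows})
```

For each small n, `distinct_rows(n)` equals 2<sup>n+1</sup>, matching the stated bound on the number of DFA states.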

For the problem of converting an NFA to an M2MA, Kaznatcheev and Panangaden [13] consider the language L<sub>n</sub> = Σ<sup>∗</sup>((0Σ<sup>n−1</sup>1) + (1Σ<sup>n−1</sup>0))Σ<sup>∗</sup> for Σ = {0, 1}, and show that L<sub>n</sub> is recognized by an NFA of 2n + 2 states, but that any M2MA recognizing L<sub>n</sub> must have dimension at least 2<sup>n</sup>. By Lemma 1, this lower bound applies also to UFAs.

For the problem of converting an M2MA to an NFA, Kaznatcheev and Panangaden [13] give a family of languages {L<sub>n</sub>} such that L<sub>n</sub> is recognized by an M2MA of dimension n + 2, and prove that any NFA recognizing L<sub>n</sub> must have at least 2<sup>n/2</sup> − 2 states. Here we provide a simpler proof of a stronger lower bound. Let L<sub>n</sub> be the language recognized by the M2MA given in Fig. 1 of the paper of Kaznatcheev and Panangaden. This M2MA accepts a word w iff the number of indices i such that both w[i] and w[i + n] are 1 is odd.

<sup>3</sup> If A is an m × n matrix and B is a p × q matrix, then the Kronecker product A ⊗ B is the pm × qn block matrix, with blocks the size of B, where the block at position (i, j) is a<sub>ij</sub>B [17, Def 1.2.1].

Lemma 5. Any NFA recognizing L<sub>n</sub> must have at least 2<sup>n</sup> − 1 states.

Proof. The language L<sub>n</sub> has an observation table T<sub>n</sub> of dimension 2<sup>n</sup> × 2<sup>n</sup>, in which the rows and columns are indexed by strings x, y ∈ {0, 1}<sup>n</sup>. We view strings in {0, 1}<sup>n</sup> as vectors of length n over the field {0, 1}, so that the entry corresponding to the pair (x, y) is the inner product of the vectors x and y, that is, x<sup>⊤</sup>y. Note that the inner product x<sup>⊤</sup>y is 1 iff the number of indices i such that both xy[i] and xy[i + n] are 1 is odd. The lower bound of 2<sup>n</sup> − 1 then follows from Lemma 4 and the following lemma. □

Lemma 6. The minimum 1-rectangle cover of the observation table T<sub>n</sub> just defined has cardinality 2<sup>n</sup> − 1.

Proof. For the upper bound it suffices to consider a 1-rectangle covering of T<sup>n</sup> consisting of pairs (R, C) where R is the singleton index of a nonzero row and C consists of the indices of the occurrences of 1 in that row.

If x ∈ {0, 1}<sup>n</sup> is the zero vector, then x<sup>⊤</sup>y is 0 for all vectors y; otherwise, x<sup>⊤</sup>y = 1 for exactly half the vectors y, that is, for 2<sup>n−1</sup> columns of T<sub>n</sub>. Hence, T<sub>n</sub> contains exactly 2<sup>n−1</sup>(2<sup>n</sup> − 1) entries of value 1. We now show that any 1-monochromatic rectangle (R, C) of T<sub>n</sub> has at most 2<sup>n−1</sup> entries of 1, which shows that a minimum 1-rectangle cover of T<sub>n</sub> must have cardinality at least 2<sup>n</sup> − 1.

Let (R, C) be any 1-monochromatic rectangle of T<sub>n</sub>. Let U be the vector subspace spanned by the vectors x corresponding to indices in R, and let B be a basis for U whose indices are drawn from R. Let k = |B|, so that |U| = 2<sup>k</sup>. Every element of U is a sum of elements of B, but a sum of an even number of elements of B will be 0 in all the columns with indices in C, so R can contain the indices of at most half the elements of U, that is, |R| ≤ 2<sup>k−1</sup>.

Let S = {v | u<sup>⊤</sup>v = 1 ∀u ∈ B}, the set of vectors whose inner product with all elements of B is 1; clearly, |C| ≤ |S|. We use inclusion/exclusion to find the cardinality of S, as follows.

$$\begin{aligned} |S| &= 2^n - \Bigl|\bigcup_{u \in B} \{v \mid u^\top v = 0\}\Bigr| \\ &= 2^n - \Bigl|\bigcup_{C \subseteq B} C^\perp\Bigr| \\ &= 2^n - k 2^{n-1} + \binom{k}{2} 2^{n-2} - \dots + (-1)^k 2^{n-k} \\ &= 2^n \cdot \left(1 - \frac{1}{2}\right)^k \\ &= 2^{n-k} \end{aligned}$$

Thus, |C| ≤ 2<sup>n−k</sup>. Then |R × C| ≤ 2<sup>k−1</sup> · 2<sup>n−k</sup> = 2<sup>n−1</sup>, concluding the proof. □
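The two counting facts used in this proof can be confirmed by exhaustive search for very small n. The sketch below is a brute-force check and not part of the proof: it builds the inner-product table, counts its 1 entries, and finds the largest 1-monochromatic rectangle by trying every nonempty set of rows together with all columns that are 1 on that set.

```python
from itertools import product

def inner_product_table(n):
    """Table T_n: entry (x, y) is the inner product of x and y modulo 2."""
    vectors = list(product((0, 1), repeat=n))
    return [[sum(a * b for a, b in zip(x, y)) % 2 for y in vectors]
            for x in vectors]

def count_ones(table):
    return sum(sum(row) for row in table)

def max_one_rectangle_area(table):
    """Largest |R| * |C| over all 1-monochromatic rectangles (exhaustive)."""
    size = len(table)
    best = 0
    for mask in range(1, 1 << size):
        rows = [i for i in range(size) if (mask >> i) & 1]
        # For a fixed row set, the best column set is every all-ones column.
        cols = [j for j in range(size) if all(table[i][j] for i in rows)]
        best = max(best, len(rows) * len(cols))
    return best
```

The search is exponential in the number of rows, so it is only practical for n up to about 4. For n = 3 it finds 28 entries of value 1 and a maximum rectangle area of 4, which gives the bound 28 / 4 = 7 = 2<sup>3</sup> − 1.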

### 4 Representing regular omega-languages using regular languages

In the preliminaries we discussed NBAs, SUBAs and DBAs, and LTL formulas as representations of regular ω-languages. Here we explain that M2MAs and other automata over finite words can also be used to represent regular ω-languages.

A regular ω-language is uniquely determined by the set of ultimately periodic ω-words it contains. Let L be a regular ω-language and let \$ be a symbol not in the alphabet of L. To represent the set of ultimately periodic words in L, Calbrix, Nivat and Podelski [7] introduced the related language of finite words L\$ = {u\$v | u(v)<sup>ω</sup> ∈ L} and proved that it is regular.
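Membership in L\$ reduces to deciding whether an acceptor for L accepts the ultimately periodic word u(v)<sup>ω</sup>. The sketch below handles the deterministic case only, under assumptions of our own: the DBA is given as a partial transition dictionary, a single initial state and a set of final states, and the example automaton is hypothetical. It reads u, then reads copies of v until the state at a period boundary repeats, and accepts iff a final state is visited inside the repeating part.

```python
def dba_accepts_lasso(delta, initial, finals, u, v):
    """Does the deterministic Büchi automaton accept the ω-word u(v)^ω?"""
    if not v:
        raise ValueError("the period v must be nonempty")
    state = initial
    for symbol in u:
        state = delta.get((state, symbol))
        if state is None:            # missing transition: the run dies
            return False
    first_seen = {}   # state at a period boundary -> index of that period
    saw_final = []    # saw_final[k]: did period k visit a final state?
    while state not in first_seen:
        first_seen[state] = len(saw_final)
        hit = False
        for symbol in v:
            state = delta.get((state, symbol))
            if state is None:
                return False
            hit = hit or state in finals
        saw_final.append(hit)
    # Periods from first_seen[state] onwards repeat forever.
    return any(saw_final[first_seen[state]:])

def in_dollar_language(delta, initial, finals, word):
    """Membership of a finite word in L_$ = {u$v | u(v)^ω in L}."""
    if word.count("$") != 1:
        return False
    u, v = word.split("$")
    return bool(v) and dba_accepts_lasso(delta, initial, finals, u, v)

# Hypothetical DBA over {a, b} for "infinitely many a's".
INF_A = {(0, "a"): 1, (1, "a"): 1, (0, "b"): 0, (1, "b"): 0}
```

For example, `in_dollar_language(INF_A, 0, {1}, "b$ab")` holds because b(ab)<sup>ω</sup> contains infinitely many a's, while `"a$b"` is rejected. Nondeterministic acceptors need a different procedure, which this sketch does not cover.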

Thus a regular ω-language L can be represented by an acceptor for the regular language L\$, for example, a DFA, UFA, NFA or M2MA. The representation of L\$ by an M2MA was used by Angluin, Antonopoulos, and Fisman [1] in showing that regular ω-languages are polynomially predictable with membership queries as a function of the size of the smallest SUBA accepting the language.

We note that if for i = 1, 2, A<sub>i</sub> is an M2MA of dimension d<sub>i</sub> accepting (L<sub>i</sub>)\$ for the regular ω-language L<sub>i</sub>, then there is an M2MA of dimension d<sub>1</sub> · d<sub>2</sub> accepting (L<sub>1</sub> ∩ L<sub>2</sub>)\$, and an M2MA of dimension d<sub>1</sub> + 3 accepting (Σ<sup>ω</sup> \ L<sub>1</sub>)\$. The former follows by the intersection result for M2MAs, and the latter follows by the sum result applied to A<sub>1</sub> and the dimension 3 M2MA that accepts the set {u\$v | u ∈ Σ<sup>∗</sup>, v ∈ Σ<sup>+</sup>}.

### 5 Conciseness comparisons for regular omega-languages

We present known and new results comparing the conciseness of M2MAs with that of several other representations of regular ω-languages, summarized in Fig. 3. The entry for row A and column B gives upper (above) and lower (below) bounds on the worst case increase in size for a representation of type A of size or dimension n to an equivalent representation of type B. The entry is "−" if a representation of type A is an instance of a representation of type B. The entries for the columns for DFA, UFA, M2MA, and NFA are for the language L\$. An arrow indicates that the (lower or upper) bound is derived from a related (lower or upper) bound in the table. For example, the upper bound for the row DBA and columns UFA, M2MA and NFA are derived from the upper bound for the row DBA and column DFA. We now discuss the entries.

### 5.1 Size increases for LTL formulas

### Upper bounds

There is a "classic" algorithm, described by Baier and Katoen [3, Chapter 5], to translate an LTL formula of size n into a GNBA of size 2<sup>n</sup> with at most n sets of final states, which then yields an NBA of size at most n · 2<sup>n</sup>. This shows that every LTL formula represents a regular ω-language, and gives an upper bound for translating an LTL formula to an NBA. Another algorithm to translate LTL formulas into NBAs is given by Gerth, Peled, Vardi and Wolper [11].


Fig. 3: Worst-case size bounds for representations of regular ω-languages.

Concerning the classic translation algorithm, Bousquet and Löding [6] give a brief argument and state that "Hence the automaton that is constructed in this standard way is strongly unambiguous." Wilke [21] states that "Every temporal formula with n subformulas can be translated into an equivalent backwards deterministic generalized Büchi automaton with at most 2<sup>n</sup> states and as many Büchi sets as there are subformulas with leading temporal operator F (eventually) or U (until)." To clarify these earlier statements, we reformulate them in our terminology. This gives an upper bound for transforming an LTL formula to a GSUBA.

Proposition 1. Let φ be an LTL formula of size n with temporal operators next and until, with m until subformulas. Applying the classic translation algorithm to φ yields a GSUBA of size 2<sup>n</sup> with m sets of final states.

Proof. Baier and Katoen [3] show that the algorithm yields a GNBA M of the given size in which each state corresponds to an assignment of true or false to every subformula of φ. Moreover, if the ω-word w is accepted from a state q, then q assigns true to each subformula ψ of φ iff ψ is true for w. Hence there is at most one state of M from which the ω-word w is accepted, and thus M is also a GSUBA. □

To get an upper bound for translation of LTL formulas to UFAs, M2MAs, and NFAs, we would like to use the property of being strongly unambiguous. However, if the resulting GSUBA has more than one set of final states, transforming it in the usual way into an NBA does not in general yield a SUBA. Instead, we generalize to GSUBAs the method of Bousquet and Löding [6] for transforming a SUBA accepting L into a UFA accepting L\$.

Theorem 1. There is an algorithm to transform a GSUBA of size n with m sets of final states accepting L into a UFA of size 2<sup>m</sup>n<sup>2</sup> + n accepting L\$. It runs in time polynomial in n and 2<sup>m</sup>.

Proof. Let L be accepted by the GSUBA M = (Σ, Q, I, ∆, F) with n = |Q| and m = |F|. We index the elements of F as F<sub>i</sub> for i ∈ [1..m]. Bousquet and Löding [6, Lemma 1] show that u(v)<sup>ω</sup> is accepted by a SUBA iff there exists a state q reachable from an initial state on reading u, such that on the word v there is a computation path that loops from q back to q while passing through an accepting state. For the GSUBA M, the condition is that the computation path that loops from q back to q must pass through at least one state from each F<sub>i</sub> for i ∈ [1..m].

We define an NFA M′ = (Σ′, Q′, I′, ∆′, F′) as follows. The alphabet is Σ′ = Σ ∪ {\$}. The state set is Q′ = Q ∪ Q<sub>1</sub>, where Q<sub>1</sub> = {(q<sub>1</sub>, q<sub>2</sub>, S) | q<sub>1</sub>, q<sub>2</sub> ∈ Q, S ⊆ [1..m]}. The initial states are I′ = I. The transition relation is ∆′ = ∆ ∪ ∆<sub>1</sub> ∪ ∆<sub>2</sub>, where ∆<sub>1</sub> is the set of all triples ((q<sub>1</sub>, q<sub>2</sub>, S), σ, (q′<sub>1</sub>, q′<sub>2</sub>, S′)) such that q′<sub>1</sub> = q<sub>1</sub>, (q<sub>2</sub>, σ, q′<sub>2</sub>) ∈ ∆, and S′ = S ∪ T, where T = {i ∈ [1..m] | q′<sub>2</sub> ∈ F<sub>i</sub>}, and ∆<sub>2</sub> contains all triples (q, \$, (q, q, ∅)) such that q ∈ Q. The set of final states F′ is the set of triples (q<sub>1</sub>, q<sub>2</sub>, S) such that S = [1..m] and q<sub>1</sub> = q<sub>2</sub>.

Then M′ has 2<sup>m</sup>n<sup>2</sup> + n states, and can be constructed in time polynomial in n and 2<sup>m</sup> given the GSUBA M. On an input u\$v, the NFA M′ behaves like M on the word u, reaching some state q. Then on the symbol \$, M′ transitions to the state (q, q, ∅), recording the state q reached after reading u. As M′ continues reading v, the first component remembers q while the second component transitions as in M. The third component, S, records the set of indices of those final sets F<sub>i</sub> that have been visited in the processing of v. The input u\$v is accepted by M′ iff there is a state q of M reachable from a state of I on input u such that there exists a computation path in M on input v from q to q that visits at least one state in F<sub>i</sub> for every i ∈ [1..m]. Thus M′ accepts L\$. (Note that the set S generalizes the single bit used in Bousquet and Löding's construction.)

To see that M′ is a UFA, we note that if there are two different accepting computations in M′ for u\$v, then these may be used to construct two different accepting computations in M for u(v)<sup>ω</sup>, contradicting the fact that M is a GSUBA. □
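The construction in this proof translates directly into code. The sketch below follows the definition of the NFA for L\$ under representation choices of our own: transitions are a set of triples, final sets are indexed from 0 rather than 1, and the two-state example automaton for {(ab)<sup>ω</sup>} is hypothetical. It builds all states, including unreachable ones, so that the state count can be compared with 2<sup>m</sup>n<sup>2</sup> + n.

```python
from itertools import combinations

def subsets(items):
    """All subsets of `items`, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def gsuba_to_nfa(states, initial, delta, final_sets):
    """Build the NFA for L_$; `delta` is a set of (q, symbol, q') triples."""
    m = len(final_sets)
    everything = frozenset(range(m))
    triples = {(q1, q2, s) for q1 in states for q2 in states
               for s in subsets(range(m))}
    new_delta = set(delta)
    for (q1, q2, s) in triples:
        # Delta_1: keep q1, move the second component as the GSUBA does,
        # and record which final sets the new state belongs to.
        for (p, symbol, p_next) in delta:
            if p == q2:
                hit = frozenset(i for i in range(m) if p_next in final_sets[i])
                new_delta.add(((q1, q2, s), symbol, (q1, p_next, s | hit)))
    for q in states:
        # Delta_2: on $, remember the state reached after reading u.
        new_delta.add((q, "$", (q, q, frozenset())))
    new_final = {(q1, q2, s) for (q1, q2, s) in triples
                 if q1 == q2 and s == everything}
    return set(states) | triples, set(initial), new_delta, new_final

def nfa_accepts(initial, delta, final, word):
    current = set(initial)
    for symbol in word:
        current = {t for (s, a, t) in delta if s in current and a == symbol}
    return bool(current & final)

# Hypothetical GSUBA for the single word (ab)^ω, with two final sets.
Q, I, D, F = gsuba_to_nfa({0, 1}, {0}, {(0, "a", 1), (1, "b", 0)}, [{0}, {1}])
```

With n = 2 and m = 2 the result has 2<sup>2</sup> · 2<sup>2</sup> + 2 = 18 states. It accepts `"$ab"` and `"a$ba"` and rejects `"$a"`, as expected for L = {(ab)<sup>ω</sup>}.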

The entry in Fig. 3 for row LTL and column UFA is then justified by the following.

Corollary 3. Let φ be an LTL formula of size n with temporal operators next and until, with m until subformulas. Then there is a UFA of size 2<sup>2n+m</sup> + 2<sup>n</sup> accepting L(φ)\$.

For transforming LTL to DFA, we have only the doubly-exponential bound for transforming an LTL formula to a UFA and the UFA to DFA.

### Lower bounds

We first generalize Lemma 3 to DBAs and Lemma 4 to NBAs. An observation table for an ω-language L is a matrix T ∈ {0, 1}<sup>ℓ×m</sup> with rows indexed by finite words x<sub>i</sub> for i ∈ [1..ℓ] and columns indexed by ω-words y<sub>j</sub> for j ∈ [1..m] such that T<sub>i,j</sub> = 1 iff x<sub>i</sub>y<sub>j</sub> ∈ L. Then we have the following, proved analogously to Lemma 3 and Lemma 4.

Lemma 7. Let T be an observation table for the ω-language L. If T has n distinct rows, then any DBA accepting L has at least n states.

Lemma 8. Let T be an observation table for the ω-language L. If the minimum 1-cover of T has cardinality n, then any NBA to recognize L has at least n states.

Baier and Katoen [3, Theorem 5.4.2] give a lower bound for a family of LTL formulas φ<sup>n</sup> of size poly(n) for which equivalent NBAs must have at least 2<sup>n</sup> states. Below we give a simplified and slightly strengthened version of their lower bound, which also applies to M2MAs or NFAs for L\$.

Theorem 2. For every positive integer n there exists an LTL formula ψ<sub>n</sub> of size at most 2n + 6 such that any NBA accepting L(ψ<sub>n</sub>) must have size at least 2<sup>n</sup>. Any NFA or M2MA accepting L(ψ<sub>n</sub>)\$ must have size or dimension at least 2<sup>n</sup>.

Proof. Let p be a propositional variable. For any positive integer n we define the LTL formula ψ<sub>n</sub> = □((p → ○<sup>n</sup>(p)) ∧ (¬p → ○<sup>n</sup>(¬p))). We use ○<sup>n</sup> to represent the composition of ○ with itself n times, so ○<sup>3</sup>(p) abbreviates ○(○(○(p))). The formula ψ<sub>n</sub> has size 2n + 6. Let the symbols 0 and 1 represent the assignment of false and true to p. Then L(ψ<sub>n</sub>) is the language of ω-words w over Σ = {0, 1} such that for some x ∈ Σ<sup>n</sup>, w = x<sup>ω</sup>.

For L(ψ<sub>n</sub>)\$, let x<sub>1</sub>, x<sub>2</sub>, . . . , x<sub>2<sup>n</sup></sub> be any total ordering of all the elements of {0, 1}<sup>n</sup>, and consider the observation table T with rows corresponding to x<sub>i</sub> and columns corresponding to \$x<sub>i</sub> for i ∈ [1..2<sup>n</sup>]. Clearly, there is exactly one 1 in row x<sub>i</sub>, in the column \$x<sub>i</sub>, so this observation table is the 2<sup>n</sup> × 2<sup>n</sup> identity matrix, which has rank 2<sup>n</sup>, and any NFA or M2MA accepting L(ψ<sub>n</sub>)\$ must have size or dimension at least 2<sup>n</sup> by Corollary 1.

For the lower bound on NBAs, we observe that if we instead index the columns of T with (x<sub>i</sub>)<sup>ω</sup>, it becomes an observation table for the ω-language L(ψ<sub>n</sub>), and remains the 2<sup>n</sup> × 2<sup>n</sup> identity matrix, which implies that any NBA accepting L(ψ<sub>n</sub>) must have at least 2<sup>n</sup> states, by Lemma 8. □
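The identity-matrix claim can be checked directly for small n. The sketch below does not evaluate the LTL formula itself; as an assumption, it uses the characterization of the language given in the proof (the ω-word equals x<sup>ω</sup> for some x of length n), which holds exactly when every position agrees with the position n steps later.

```python
from itertools import product

def has_period(u, v, n):
    """Is the ω-word u(v)^ω equal to x^ω for some word x of length n?"""
    def letter(i):
        return u[i] if i < len(u) else v[(i - len(u)) % len(v)]
    # Beyond position |u| the word repeats with period |v|, so it suffices
    # to compare positions i and i + n for i < |u| + |v|.
    return all(letter(i) == letter(i + n) for i in range(len(u) + len(v)))

def observation_table(n):
    """Row x_i, column $x_j: the entry is 1 iff x_i (x_j)^ω has period n."""
    words = ["".join(bits) for bits in product("01", repeat=n)]
    return [[int(has_period(x, y, n)) for y in words] for x in words]
```

For n = 3, `observation_table(3)` is the 8 × 8 identity matrix, in line with the argument above.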

### 5.2 Size increases for DBAs, NBAs, SUBAs

### Upper bounds

For an NBA of n states accepting L, Calbrix, Nivat and Podelski [7] show that there is a DFA of 2<sup>n</sup> + 2<sup>2n<sup>2</sup>+n</sup> states accepting L\$. Kuperberg, Pinault and Pous [14] give a more concise construction that yields for L\$ an NFA of size n + n3<sup>n<sup>2</sup></sup> and a DFA of size 2<sup>n</sup> + 2<sup>n</sup>3<sup>n<sup>2</sup></sup>. For the conversion of an NBA of n states to a SUBA, Carton and Michel provide the upper bound of (12n)<sup>n</sup> [8]. Starting with a DBA instead of an NBA, the NFA construction of Kuperberg, Pinault and Pous is fully deterministic, so the upper bound of n + n3<sup>n<sup>2</sup></sup> holds for transforming a DBA into a DFA. Bousquet and Löding [6] show that a SUBA of n states accepting the ω-language L may be transformed into a UFA of 2n<sup>2</sup> + n states accepting L\$.

### Lower bounds

For transforming a DBA for L into a DFA for L\$, Angluin and Fisman [2] prove that for every n there is a DBA of n+ 2 states accepting a language L such that no DFA of fewer than n! states accepts L\$. For transforming a DBA into a UFA, M2MA or NFA, we prove the following result.

Theorem 3. For every even positive integer n there is an ω-language L<sub>n</sub> that is accepted by a DBA of n + 5 states such that any UFA, NFA or M2MA accepting (L<sub>n</sub>)\$ must have size or dimension at least $\binom{n}{n/2}$, which is ∼ $2^n/\sqrt{\pi n/2}$.
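To get a feel for the bound in Theorem 3, the following small sketch compares the central binomial coefficient with its asymptotic estimate. The function names are our own.

```python
from math import comb, pi, sqrt

def central_binomial(n):
    """The bound of Theorem 3: the number of subsets of size n/2 of an n-set."""
    return comb(n, n // 2)

def estimate(n):
    """The asymptotic estimate 2^n / sqrt(pi * n / 2)."""
    return 2 ** n / sqrt(pi * n / 2)

def ratio(n):
    """Exact value divided by the estimate; tends to 1 as n grows."""
    return central_binomial(n) / estimate(n)
```

For instance, `central_binomial(4)` is 6, and `ratio(100)` is within one percent of 1, so the bound grows essentially like 2<sup>n</sup> up to a factor of order √n.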

Proof (Sketch). The proof uses a modification of the DBAs in the construction by Angluin and Fisman [2]. Here we sketch the main idea and give an example. Let n = 2k for some nonnegative integer k, let Σ<sub>2k</sub> = {σ<sub>1</sub>, . . . , σ<sub>2k</sub>} and let Σ be Σ<sub>2k</sub> ∪ {0, L, E, F}. Consider the regular ω-language defined by the ω-regular expression (⋃<sub>σ∈Σ\{0}</sub> σ · (Σ \ {σ})<sup>∗</sup> · σ)<sup>ω</sup>, which is accepted by a DBA with 2k + 5 states. Given two subsets C and D of Σ<sub>2k</sub>, each of size k, we define words u<sub>C</sub> and v<sub>D</sub> such that (u<sub>C</sub> · v<sub>D</sub>)<sup>ω</sup> is in the language if and only if C = D. The main idea behind the construction is that v<sub>D</sub> forces each symbol σ<sub>D</sub> in Σ<sub>2k</sub> \ D to be followed by the character 0. Thus, if the string preceding (and including) an occurrence of such a symbol σ<sub>D</sub> is described by the (unambiguous) regular expression (⋃<sub>σ∈Σ\{0}</sub> σ · (Σ \ {σ})<sup>∗</sup> · σ)<sup>∗</sup>, then the symbol 0 that follows cannot be properly consumed, resulting in the ω-word being not in the language. We construct the words u<sub>C</sub> and v<sub>D</sub> in such a way that this can happen if and only if such a symbol σ<sub>D</sub> ∈ Σ<sub>2k</sub> \ D is also in C. Since C and D are subsets of Σ<sub>2k</sub>, each of size k, this happens exactly when C ≠ D. There is therefore an observation table with rows indexed by \$u<sub>C</sub> for all subsets C of size k and whose columns are indexed by v<sub>D</sub> for all subsets D of size k, and where each entry, corresponding to row and column subsets C and D respectively, is 1 if and only if C = D. By Corollary 1, the result follows. □

Example. Let Σ<sub>2k</sub> = {1, 2, 3, 4}, let Σ be Σ<sub>2k</sub> ∪ {0, L, E, F}, let C = {2, 3}, and let D = {2, 4}. Then u<sub>C</sub>, v<sub>C</sub> and v<sub>D</sub> are defined as follows:

u<sub>C</sub> = F · 2 · F · 2 · 3 · 2 · 3 · L · 3

v<sub>C</sub> = L · E · 1 · 0 · 4 · 0 · E

v<sub>D</sub> = L · E · 1 · 0 · 3 · 0 · E

Then (u<sub>C</sub> · v<sub>C</sub>)<sup>ω</sup> is in the language, whereas (u<sub>C</sub> · v<sub>D</sub>)<sup>ω</sup> is not (since C ≠ D).
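The example can be replayed by simulating a deterministic Büchi automaton on the periodic word. The automaton below is our own reading of the ω-regular expression in the proof sketch and is an assumption of this sketch: a word is accepted iff it splits into infinitely many blocks, each opened by a symbol σ other than 0, continued with symbols different from σ, and closed by σ. It has one accepting state `READY`, one state per block-opening symbol, and a sink.

```python
READY, SINK = "ready", "sink"   # READY is both initial and accepting

def accepts_periodic(period):
    """Run the block-structured DBA on the ω-word period^ω."""
    state, position, step = READY, 0, 0
    seen = {}           # configuration (state, position) -> step index
    ready_steps = []    # steps at which a block was closed (READY entered)
    while (state, position) not in seen:
        seen[(state, position)] = step
        symbol = period[position]
        if state == READY:
            # A block cannot be opened by 0; any other symbol opens one.
            state = SINK if symbol == "0" else ("open", symbol)
        elif state != SINK and state[1] == symbol:
            state = READY               # the block is closed by its own symbol
            ready_steps.append(step)
        position = (position + 1) % len(period)
        step += 1
    loop_start = seen[(state, position)]
    # Accept iff READY is entered inside the repeating part of the run.
    return any(s >= loop_start for s in ready_steps)

U_C = "F2F2323L3"   # u_C for C = {2, 3}
V_C = "LE1040E"     # v_C for C = {2, 3}
V_D = "LE1030E"     # v_D for D = {2, 4}
```

Under this reading, `accepts_periodic(U_C + V_C)` holds and `accepts_periodic(U_C + V_D)` fails: in the second word the block opened by 3 is closed by the 3 inside v<sub>D</sub>, and the 0 that follows cannot open a new block.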

For the lower bound on transforming a DBA into a SUBA, Bousquet and Löding [6] show that for every positive integer n there exists an ω-language that is accepted by a DBA with n + 1 states, and cannot be accepted by a SUBA with fewer than 2<sup>n−1</sup> states.

For transforming a SUBA into a DFA, Angluin, Antonopoulos and Fisman [1, Theorem 5] give a family of ω-languages such that L<sub>n</sub> is accepted by a SUBA of size 4n + 5, but any DFA accepting (L<sub>n</sub>)\$ or its reverse must have size at least 2<sup>n</sup>. For transforming a SUBA into a UFA, M2MA or NFA, we prove the following asymptotically tight lower bound.

Theorem 4. For every positive integer m greater than 3, there is an ω-language L that is accepted by a SUBA with m states, but no M2MA of dimension less than 2m<sup>2</sup> − m + 2 or NFA or UFA of size less than 2m<sup>2</sup> − m + 2 accepts (L)\$.

Proof (Sketch). For every n ∈ **N** we define L<sub>n</sub> to be the regular ω-language over Σ = {a, b, c} given by the expression ((cc · b<sup>n</sup>)<sup>∗</sup> · aa · b<sup>n</sup>)<sup>ω</sup>. This language is accepted by a SUBA S<sub>n</sub> with m = n + 3 states. We construct a specific observation table M for the language (L<sub>n</sub>)\$. We then show that any 1-rectangle cover of M is of size at least 2m<sup>2</sup> − m + 2, which implies by Lemma 4 that the number of states of any NFA (or UFA) for the language (L<sub>n</sub>)\$ is at least 2m<sup>2</sup> − m + 2. We further show that the rank of M is 2m<sup>2</sup> − m + 2, and by Lemma 3, obtain that the dimension of any M2MA for this language is also at least 2m<sup>2</sup> − m + 2. □

### 6 Empirical results

We report typical size increases in going from a random SUBA, DBA or NBA acceptor for a regular ω-language L to a minimized M2MA (and DFA, in the case of a SUBA) for L\$. We also report computed sizes of minimized M2MAs and DFAs for L(φn)\$ for members of particular families {φn} of LTL formulas. Code is available in the GitHub repository:

https://github.com/nevingeorge/Learning Automata.

For the generation of random SUBAs, DBAs or NBAs, our procedure is as follows. Given parameters n, f, and t we generate a transition relation on n states (random reverse-deterministic for a SUBA, random deterministic for a DBA, and all possible transitions for an NBA), select f of the n states at random to be final, and randomly remove t of the transitions. The resulting transition relation is trimmed to remove non-live states and their transitions. The trimmed acceptor may have fewer than n states.

If the goal is a SUBA, using the criterion of Wilke [21], we check that there do not exist two different states q<sub>1</sub> and q<sub>2</sub> and a nonempty finite word v such that for i = 1, 2, there is a loop on v from q<sub>i</sub> to q<sub>i</sub> that passes through a final state. If the acceptor fails this test, it is rejected, and the procedure is repeated until a SUBA is successfully generated.

### 6.1 SUBAs to minimized M2MAs and DFAs

For random SUBAs to minimized M2MAs, we first generate a random SUBA with Σ = {a, b, c}, n ∈ {5, 10, 15}, t ∈ {[1, 5], [2, 10], [18, 22]} (resp.), and f = 2 or f = 3 with equal probability. We then convert it into a UFA using the algorithm of Bousquet and Löding [6], and minimize the equivalent M2MA.

Fig. 6: Random SUBAs to minimized DFAs

We performed the above process on approximately 220,000 randomly generated SUBAs.

Fig. 4 is a plot of the average minimized M2MA dimension for each trimmed SUBA size from 1 to 10. Upon performing quadratic regression, we obtain the orange curve 1.212n<sup>2</sup> − 0.2248n, and the blue curve is the theoretical upper bound of 2n<sup>2</sup> + n given in Fig. 3. The quadratic fit has an R<sup>2</sup> of 0.9996 while a linear fit has an R<sup>2</sup> of 0.9370, suggesting that the growth is indeed quadratic. This curve satisfies the theoretical upper bound of 2n<sup>2</sup> + n, and suggests that the lower bound of Ω(n<sup>2</sup>) holds on average.

For random SUBAs to minimized DFAs, we also calculated the number of reachable states of each minimized M2MA. This is the number of states in the equivalent minimized DFA, by a property of the minimization algorithm of Corollary 2. From Fig. 3, the lower bound in going from a SUBA to a DFA is 2<sup>Ω(n)</sup>, and the upper bound is 2<sup>n</sup> + 2<sup>n</sup>3<sup>n<sup>2</sup></sup>.

In the left graph in Fig. 6, the blue data points representing the results of the SUBA to DFA experiment grow much more sharply than the results of the SUBA to M2MA experiment, so it is clear that a SUBA can be represented more concisely as an M2MA than as a DFA on average. Upon taking the log (base 2), we obtain a roughly linear fit as seen in the right graph, with equation 0.7196n + 1.738 and an R<sup>2</sup> of 0.9841, suggesting that on average the growth is exponential. The standard deviation and range of converted DFA sizes were large for this conversion, making it difficult to make firm claims about the growth. However, the data suggest that the exponential lower bound likely holds on average, and that in general the upper bound of 2<sup>n</sup> + 2<sup>n</sup>3<sup>n<sup>2</sup></sup> is a severe overestimate.

### 6.2 NBAs and DBAs to minimized M2MAs

For NBAs and DBAs, a minimized M2MA is computed using the M2MA learning algorithm of Beimel et al. [4], which makes membership and equivalence queries to the NBA or DBA. Instead of exact equivalence queries, we use approximate equivalence queries, implemented by testing membership agreement on a sample of randomly generated ultimately periodic words. Thus, the dimension of the learned M2MA may be an underestimate of the true minimum dimension of an M2MA for L\$.

For the NBA/DBA to M2MA experiments, we generated approximately 1000 random NBAs/DBAs with Σ = {a, b, c}, n ∈ {5, . . . , 10}, t ∈ [0, n] for DBAs and t in ranges within [90, 680] for NBAs, and f = 2 or f = 3 with equal probability. For the approximate equivalence queries, we tested 1000 random ultimately periodic words of length at most 25. The results of the experiments can be seen in Fig. 5. The fitted NBA and DBA curves are quadratic with equations 1.096n<sup>2</sup> − 0.8947n and 1.318n<sup>2</sup> − 1.392n, respectively. The quadratic fits for the NBA and DBA results have an R<sup>2</sup> of 0.9954 and 0.9961, respectively, while linear fits have an R<sup>2</sup> of 0.9227 and 0.9118, respectively. These experiments have limitations: the use of approximate equivalence queries, the small sample size (because of the time requirements of the learning algorithm), and the large standard deviation and range of converted M2MA sizes. However, the results from all three conversions are very similar, suggesting that under these conditions, SUBAs, NBAs, and DBAs do not vary significantly on average with respect to their equivalent M2MA representations.
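An approximate equivalence query of the kind described here can be sketched as follows. This is not the code from the authors' repository: the function names, the sampling distribution, and the reading of "length at most 25" as a bound on |u| + |v| are all assumptions of this sketch. The two oracles are arbitrary functions that decide membership of u(v)<sup>ω</sup>.

```python
import random

def random_lasso(alphabet, max_length, rng):
    """A random pair (u, v) with v nonempty and |u| + |v| <= max_length."""
    total = rng.randint(1, max_length)
    v_length = rng.randint(1, total)
    u = "".join(rng.choice(alphabet) for _ in range(total - v_length))
    v = "".join(rng.choice(alphabet) for _ in range(v_length))
    return u, v

def approximate_equivalence(target, hypothesis, alphabet,
                            samples=1000, max_length=25, seed=0):
    """Return a sampled (u, v) on which the oracles disagree, or None."""
    rng = random.Random(seed)
    for _ in range(samples):
        u, v = random_lasso(alphabet, max_length, rng)
        if target(u, v) != hypothesis(u, v):
            return u, v      # counterexample for the learner
    return None              # no disagreement found in the sample
```

A return value of `None` only means that no disagreement was sampled, which is why a hypothesis accepted this way may have a smaller dimension than the true minimum.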

### 6.3 LTL formulas to minimized M2MAs

Random LTL formulas seem not to provide much insight, so we consider specific families of LTL formulas: bounded request/grant formulas and two families based on the hierarchy of Manna and Pnueli [15], namely obligation and reactivity formulas. Empirically, for each of the first few members of each family we calculate the minimum dimension of an M2MA and the minimum size of a DFA accepting the corresponding L\$ language, and use the online tool provided by the Spot website (https://spot.lrde.epita.fr/) to find an ω-language acceptor for the corresponding L. (Omitted Spot entries exceeded the limit on calculation time.)

The canonical request/grant formula is of the form (p → ♦(q)), which asserts that whenever a request (p) is made, it is eventually granted (q). In the bounded version, a number of steps n is specified, and the assertion is that the request is granted within n steps. Thus, for each natural number n, we have a formula R<sup>n</sup> = (p → (q ∨ (q) ∨ <sup>2</sup> (q) ∨ . . . ∨ <sup>n</sup>(q))). The table in Fig. 7a gives the resulting sizes and dimensions for n from 0 to 5. It is reasonable to conjecture n + 1 for the size of a DBA, n <sup>2</sup> + 3n + 3 for the minimum dimension of an M2MA, and 2n <sup>2</sup> + 3n + 4 for the minimum size of a DFA representing Rn.

The family of obligation formulas we consider is $F_n = \bigwedge_{i=1}^{n} (\Box p_i \lor \Diamond q_i)$. Using conjunction and minimization, we calculate the minimum dimension M2MA (and minimum size DFA) for L\$ for these formulas for $n$ up to 5. The table in Fig. 7b shows the results. It is reasonable to conjecture $3^n$ for the size of a DBA, $2 \cdot 3^n + 1$ for the minimum dimension of an M2MA, and $2 \cdot 3^n + 2^n + 1$ for the minimum size of a DFA to represent $F_n$.

Fig. 7: Size or dimension of acceptors for families of LTL formulas.

The family of reactivity formulas we consider is $G_n = \bigwedge_{i=1}^{n} (\Box\Diamond p_i \lor \Diamond\Box q_i)$. We proceed as for the obligation formulas, with the results shown in the table in Fig. 7c. Note that these formulas cannot be represented by DBAs, but are instead represented by GNBAs, which may have multiple sets of final states. For example, the entry (10, 2) indicates a GNBA with 10 states and 2 sets of final states. A reasonable conjecture in this case is $(3^n + 1, n)$ for the size of a GNBA, $3^n + 2$ for the minimum dimension of an M2MA, and $3^n + 3$ for the minimum size of a DFA representing $G_n$.

In these cases, the minimum dimension of an M2MA (and size of a DFA) appears to grow at most as a polynomial in the size of an ω-language acceptor, quadratically for the bounded request/grant family, and linearly for the obligation and reactivity families.
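The growth rates claimed above can be checked by tabulating the conjectured formulas. The following Python sketch is only an illustration: the function names are hypothetical, the values are the conjectures stated in the text (not the measured entries of Fig. 7), and the obligation dimension is read as $2 \cdot 3^n + 1$, which is the reading consistent with the dimension not exceeding the conjectured DFA size.

```python
# Conjectured sizes from Section 6.3 as functions of the family index n.
# These are the conjectures stated in the text, not measured values.

def request_grant(n):
    """Bounded request/grant R_n: (DBA size, M2MA dimension, DFA size)."""
    return n + 1, n**2 + 3 * n + 3, 2 * n**2 + 3 * n + 4

def obligation(n):
    """Obligation F_n: (DBA size, M2MA dimension, DFA size)."""
    return 3**n, 2 * 3**n + 1, 2 * 3**n + 2**n + 1

def reactivity(n):
    """Reactivity G_n: (GNBA states, M2MA dimension, DFA size)."""
    return 3**n + 1, 3**n + 2, 3**n + 3

families = [("R_n", request_grant, 0), ("F_n", obligation, 1), ("G_n", reactivity, 1)]
for name, family, start in families:
    print(name)
    for n in range(start, 6):
        acceptor, m2ma, dfa = family(n)
        print(f"  n={n}: acceptor={acceptor}, M2MA dim={m2ma}, DFA size={dfa}")
```

Writing $s$ for the size of the ω-language acceptor, the conjectured M2MA dimensions are $s^2 + s + 1$ for the request/grant family, $2s + 1$ for the obligation family, and $s + 1$ for the reactivity family, which matches the quadratic and linear growth described above.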

### 7 Summary and conclusions

We provide a survey of size relations of M2MAs as a representation of regular languages and regular ω-languages, as well as empirical results for several of these relations. New theoretical results include an improvement of the lower bound for transforming an M2MA to an NFA, an upper bound of $2^{O(n)}$ for the translation of an LTL formula of size n to a UFA, NFA, or M2MA, a lower bound of $2^{\Omega(n)}$ for the translation of a DBA of n states to an M2MA or NFA, and an asymptotically optimal lower bound of $2n^2 - n + 2$ for the translation of a SUBA of n states to an M2MA or NFA.

M2MAs have many advantages as a representation for regular ω-languages: determinism, succinct complementation, and polynomial time algorithms for minimization, equivalence testing, and learning with membership and equivalence queries. M2MAs are as succinct as DFAs, sometimes exponentially more so, and deserve further study.

Acknowledgements We would like to thank the anonymous reviewers for their insightful feedback. This work was supported in part by ONR Grant N00014-17-1-2787, by NSF awards CCF-2106845, CCF-2131476, by BSF grant 2016239 and by ISF Grant 2507/21.

### References





### Limits and difficulties in the design of under-approximation abstract domains

Flavio Ascari, Roberto Bruni, and Roberta Gori

Dipartimento di Informatica, Università di Pisa, Largo B. Pontecorvo 3, Pisa, Italy, flavio.ascari@phd.unipi.it, {roberto.bruni,roberta.gori}@unipi.it

Abstract. Static analyses are mostly designed to show the absence of bugs: if the analysis reports no alarms then the program won't exhibit any unwanted behaviours. To this aim they manipulate over-approximations of program semantics and, inevitably, they often report some false alarms. Recently, O'Hearn proposed Incorrectness Logic, which is based on under-approximations, as a formal method to find bugs that only reports true alarms. In this paper we aim to answer one important question raised by O'Hearn, namely which role Abstract Interpretation can play in the development of under-approximate tools for bug catching. In principle, Abstract Interpretation based static analyses can be defined for computing over-approximations as well as under-approximations, but in practice, most techniques exploited the former while few attempts developed the latter. To show why it is difficult to design effective under-approximation abstract domains, we first propose the new definitions of non emptying functions and highly surjective function family, and then we formally prove the limits of under-approximation analysis by showing the non existence of abstract domains able to approximate such functions in a non trivial way. Our results outline the limits of under-approximation Abstract Interpretation and clarify, for the first time, why over- and under-approximation analyzers exhibited such a different development.

Keywords: Abstract Interpretation, Under-approximation, Abstract domains, Impossibility results

### 1 Introduction

Static program analyses are techniques used to infer properties of programs directly from their source code, without executing them. They have been studied and successfully applied for over 50 years [12,3,13,1,10,17,18,22,23,4] to produce effective methods and tools to support the development of correct software. For all these years, the main focus of static analysis was to prove the absence of bugs by computing over-approximations (supersets of all possible behaviours) of the semantics of programs: the absence of unwanted behaviour in the over-approximation guarantees the correctness of the program. However, over-approximations cannot be used to expose bugs, since any alert raised by the analyser may be caused by the over-approximation rather than by the program, i.e. it can be a so-called false alarm. From the point of view of a software developer, false alarms are undesirable because they undermine the credibility and usefulness of the analysis. In principle, there is a symmetrical approach to static analysis, that is to compute an under-approximation of the semantics, i.e., a subset of all possible behaviours of a program. Dually to over-approximations, under-approximations can then expose defects in the code, while they are unable to show their absence.

Research supported by MIUR PRIN Project 201784YSZ5 ASPRA–Analysis of Program Analyses.

Early works on static analysis, like Hoare logic [13], focused on over-approximation to prove the absence of errors, and maybe their influence directed the focus toward over-approximations. Recently O'Hearn argued for the relevance of bug catching with respect to correctness proofs and proposed Incorrectness Logic [19], a dual version of Hoare logic designed from the ground up for under-approximation. He also advocates a similar change of perspective in the approach to static analysis.

For instance, consider the simple code

`for (i = 0; i < 5; i++) sum += 1000 / (2 * i) + 100 / (2 * i - 5);`

An abstract analysis based on the domain Int of intervals allows us to over-approximate the set of possible values each variable can take as the smallest interval that contains such values. When applied to the above program, the analysis may detect that the value of variable i is between 0 and 4 within the body of the loop, so that the arithmetic expression 2 \* i is then over-approximated by the interval [0, 8] while 2 \* i - 5 by the interval [−5, 3]. This raises two warnings for possible division by zero, since it seems that both arithmetic expressions may assume the value 0. It is worth noting that while the warning on the first expression is a true alarm, the warning on the second one is a false alarm. On the contrary, an analysis based on under-approximation will never raise a warning for the second expression, since no value of i can cause an error in this case. However, not all under-approximations will detect the problem with 2 \* i, because any subset of {0, 2, 4, 6, 8} is a valid under-approximation, including e.g. {2, 4, 6, 8}.
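The two warnings can be reproduced with a few lines of Python. This is a minimal sketch under simplifying assumptions: intervals are plain pairs `(lo, hi)`, only affine expressions `a * i + b` are handled, and the helper names `interval_affine` and `may_be_zero` are hypothetical.

```python
def interval_affine(lo, hi, a, b):
    """Interval of a * x + b for x ranging over [lo, hi] (a, b integers)."""
    ends = (a * lo + b, a * hi + b)
    return min(ends), max(ends)

def may_be_zero(interval):
    """An interval analysis warns whenever 0 lies inside the interval."""
    lo, hi = interval
    return lo <= 0 <= hi

i_range = (0, 4)                           # over-approximation of i in the loop body
den1 = interval_affine(*i_range, 2, 0)     # 2 * i
den2 = interval_affine(*i_range, 2, -5)    # 2 * i - 5

# Exact sets of values taken by the two denominators.
concrete1 = {2 * i for i in range(5)}
concrete2 = {2 * i - 5 for i in range(5)}

print(den1, may_be_zero(den1), 0 in concrete1)   # warning, and 0 is really reached
print(den2, may_be_zero(den2), 0 in concrete2)   # warning, but 0 is never reached

# A valid under-approximation of the first denominator that misses the bug.
under = {2, 4, 6, 8}
print(under <= concrete1, 0 in under)
```

The second warning is a false alarm because the interval [−5, 3] contains 0 while the concrete set {−5, −3, −1, 1, 3} does not.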

The Problem: Abstract Interpretation [6,22,4] is a general framework to define sound analyses based on constructive approximations that found its way through many aspects of modern computer science, such as verification, optimization, security and program transformation. Given its broad applicability, in his paper on Incorrectness Logic [19], O'Hearn leaves as an open question whether Abstract Interpretation could "eventually play a guiding and explanatory role for a wide range of static and dynamic under-approximate tools for bug catching, similar to what it already does for over-approximate analyses". The goal of this work is to investigate this topic. The results we have achieved will establish that under-approximation based Abstract Interpretation analyses have serious intrinsic limitations, and therefore our contribution can be read as a negative answer, even if we will then discuss how to overcome some limits.

Related Work: In their first works on Abstract Interpretation [6], Cousot and Cousot introduced the formal theory that could be used to define either over- or under-approximations. However, while the former has been extensively studied, there have been only sparse studies on the latter. Bourdoncle [2] proposed abstract debugging using over-approximation domains, but acknowledged that under-approximation ones could be better suited. Lev-Ami et al. [14] proposed to use complements of over-approximation domains to infer sufficient preconditions for program correctness. For the same goal, Miné [15] directly used over-approximation domains, giving up the best abstraction and handling the choice of a maximal one with heuristics. To infer necessary conditions for incorrectness, a problem similar to O'Hearn's but studied for a different goal, Cousot et al. [9,8] use Abstract Interpretation techniques but on boolean formulas, hence bypassing the issue of defining an abstract domain. Schmidt [24] uses higher-order domains, defining abstract states with meaning "there exists a value satisfying this over-approximation property", hence giving rise to an under-approximation of over-approximations. In conclusion, all the above approaches design under-approximation domains starting from over-approximation ones, and, to the best of our knowledge, there are no abstract domains designed from the ground up for under-approximation. So the question whether it is possible to design an abstract domain for computing under-approximations naturally arises.

Contributions: We believe the absence of under-approximation abstract domains to be caused by intrinsic difficulties in their design. In this article, we determine and explain the reasons behind these difficulties. In the following we point out some intuitive asymmetries that suggest why under-approximations are not as immediate to use as over-approximations for program analysis.

While over- and under-approximation can be thought of as dual theories, they have a deep asymmetry when dealing with the semantics of basic constructs of the language, the so-called basic transfer functions. For instance, given an over-approximation abstract domain, we can define an under-approximation domain by taking the opposite interpretation of abstract elements: the idea is that an abstract element represents all concrete elements that may not be present in the set of possible values. As a consequence of being an under-approximation, this means that all the other concrete elements (the complement of the set) must be actual values. Considering the abstract domain of (complemented) intervals, it happens, e.g., that an arithmetic expression such as a sum of variables is often under-approximated as the whole Z. It is also worth noticing that, while basic transfer functions are the same, over-approximation abstract domains are closed under intersection, while under-approximation abstract domains are closed under union and can grow large very easily.

Another asymmetry we point out is the handling of divergence. Divergence is represented in over- and under-approximation by the same abstract element ⊥, but note that ⊥ as an under-approximation also represents the absence of information (dually to ⊤ in over-approximations). This becomes a problem since many concrete functions are strict, that is, when applied to a non-terminating expression, they also fail to terminate (they return ⊥ if one argument is ⊥), and, to be a correct under-approximation, the corresponding abstract function also needs to be strict. This implies that whenever the analysis can't determine any meaningful information at some program point, it has to propagate this absence of information along all program paths, at least until a join in the control flow is found. So "recovery" from ⊥, that is, producing a result different from ⊥ once we start with it, is very hard in an under-approximation. Note that, on the contrary, "recovery" from ⊤ in an over-approximation is much easier, e.g. by a constant assignment.

The previous arguments are substantiated by formal impossibility results for building meaningful under-approximation abstract domains. First, we introduce the new definition of non emptying function, describing functions that do not hamper the analysis, and we prove that no abstract domain for integers can be constructed that makes all sums non emptying. Second, we propose two generalizations (one local and one global) of the result for integer domains to arbitrary concrete domains and function families, by introducing the notion of highly surjective function family, of which sums are an instance. The local condition applies to each function in the family, while the global condition is a property of the whole family. Finally, we study hypotheses for the existence of abstract domains making all functions in a family non emptying, to show first that the hypothesis of high surjectivity is tight, and then that further conditions on the function family must hold.

Structure of the paper: In Section 2 we introduce the notation used in the rest of the paper and recall the basics of Abstract Interpretation for over- and under-approximations. In Section 3 we apply our idea to the concrete domain of integers to show that, under some simple conditions, no under-approximation abstract domain can exist. In Section 4 we extend the result obtained for integers to arbitrary concrete domains and function families. In Section 5 we show that the hypothesis of high surjectivity is needed and explore other requirements for the function family. Section 6 contains some concluding remarks and an outline of future research directions. Due to space limitations, only informal proof sketches are included in these proceedings.

### 2 Background

Notation. We let P(S) denote the powerset of the set S and $\mathrm{id}_S : S \to S$ be the identity function on a set S. We omit subscripts when obvious from the context. If f : S → T is a function, then we overload the symbol f to denote also its additive extension f : P(S) → P(T) defined as f(X) = {f(x) | x ∈ X} for any X ⊆ S. We say a function f : S → S is acyclic if, for any element x ∈ S and any n > 0, we have $f^n(x) \neq x$, where $f^n$ denotes composition of f with itself n times. In ordered structures, such as posets and lattices, we usually denote the ordering with ⪯, least upper bounds (lubs) with ⊔, greatest lower bounds (glbs) with ⊓, least element with ⊥, greatest element with ⊤. If ⪯ is an order relation, ⪰ is the opposite relation, defined as s ⪰ t if and only if t ⪯ s. We write just S for the poset (S, ⪯) whenever the order relation is known from the context and we use $S^{op}$ to denote the opposite poset (S, ⪰): hence $S^{op}$ denotes the same set as S, but comes equipped with the opposite ordering relation ⪰. Given a poset T and two functions f, g : S → T, the notation f ⪯ g means that, for all s ∈ S, f(s) ⪯ g(s). Any powerset is a complete lattice with ordering given by the inclusion relation. In this case, we use standard symbols ⊆, ∪, etc.

Abstract Interpretation. Abstract Interpretation [6,7,16] is a general framework to define sound-by-construction static analyses, with the main idea of approximating the program semantics on some abstract domain A instead of working on the concrete domain C. The main tool used to study Abstract Interpretation is the Galois connection. Given two complete lattices C and A, a pair of monotone functions α : C → A and γ : A → C defines a Galois connection (GC) when

$$\forall c \in C, a \in A. \quad \alpha(c) \preceq a \iff c \preceq \gamma(a).$$

and we denote it with $\langle C \underset{\alpha}{\overset{\gamma}{\leftrightarrows}} A \rangle$. We call C and A, respectively, the concrete and the abstract domain, α is the abstraction function and γ is the concretization function. In any GC, $\mathrm{id}_C \preceq \gamma \circ \alpha$, $\alpha \circ \gamma \preceq \mathrm{id}_A$, γ preserves glbs and α preserves lubs. In particular, this means that $\gamma(\top_A) = \top_C$ and dually $\alpha(\bot_C) = \bot_A$.

A GC in which $\alpha \circ \gamma = \mathrm{id}_A$ is called a Galois insertion (GI); in this case α is onto and γ is injective. By this last property, there is a bijection between A and γ(A), and using this isomorphism, whenever we consider a GI we identify A with its γ-image, so that A becomes a subset of C and $\gamma = \mathrm{id}_A$, written as $\langle C \underset{\alpha}{\leftrightarrows} A \rangle$. A GI is said to be trivial if A is the concrete domain or it only contains $\top_C$.

Given a monotone function f : C → C and a GC $\langle C \underset{\alpha}{\overset{\gamma}{\leftrightarrows}} A \rangle$, a function $f^\sharp : A \to A$ is a correct (or sound) approximation of f if $\alpha \circ f \preceq f^\sharp \circ \alpha$. Its best correct approximation (bca) is $f^A = \alpha \circ f \circ \gamma$, and it is the most precise of all the correct approximations of f.

As an example, let C = P(Z) be the powerset of integers and A = Int be the abstract domain of intervals [6]. Elements of Int are finite intervals [n, m] with n ≤ m, or infinite intervals of the form [−∞, m] or [n, ∞], together with the empty interval ⊥. The top element is [−∞, ∞]. Intervals are ordered by inclusion, the concretisation function γ is defined as usual, while the abstraction function α maps a set of integers to the smallest interval that contains it. If f(x) = |x| is the absolute value function, one of its sound abstractions is $f^\sharp([n, m]) = [0, \max(|n|, |m|)]$, because the interval [0, max(|n|, |m|)] always contains the entire set f(S) when n = min(S) and m = max(S). However, this is not the best possible abstraction: for instance on S = {1} this yields [0, 1] while f(S) = {1}. Actually, the best correct abstraction $f^A$ is computed as

$$f^A([n,m]) = \alpha \circ f \circ \gamma([n,m]) = \begin{cases} [0, \max(|n|, |m|)] & \text{if } n \le 0 \le m \\ [n, m] & \text{if } 0 < n \\ [-m, -n] & \text{if } m < 0 \end{cases}$$
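The difference between the sound abstraction and the bca of the absolute value can be observed on small intervals. The sketch below is illustrative only: it is restricted to finite intervals, represents ⊥ as `None`, and computes the bca literally as α ∘ f ∘ γ by enumeration; all function names are hypothetical.

```python
def alpha(values):
    """Over-approximating abstraction: smallest interval containing a finite set."""
    return (min(values), max(values)) if values else None

def gamma(interval):
    """Concretisation of a finite interval as a set of integers."""
    if interval is None:
        return set()
    lo, hi = interval
    return set(range(lo, hi + 1))

def abs_sound(interval):
    """The sound but imprecise abstraction [0, max(|n|, |m|)]."""
    if interval is None:
        return None
    n, m = interval
    return (0, max(abs(n), abs(m)))

def abs_bca(interval):
    """Best correct approximation, computed as alpha . f . gamma."""
    return alpha({abs(x) for x in gamma(interval)})

def leq(a, b):
    """Interval inclusion, with None (bottom) as the least element."""
    return a is None or (b is not None and b[0] <= a[0] and a[1] <= b[1])

for interval in [(1, 1), (-3, 2), (2, 5), (-5, -2)]:
    print(interval, abs_sound(interval), abs_bca(interval))
```

On [1, 1] the sound abstraction returns [0, 1] while the bca returns [1, 1]; on every interval the bca is included in the sound abstraction, as expected from its definition.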

### 2.1 Under-approximation Galois Connections

The definition of GC is not symmetric in γ and α: it favours over-approximation, and is not suited to describe under-approximations. This can be more easily seen from the property $\mathrm{id}_C \preceq \gamma \circ \alpha$, which means that the abstraction γ(α(c)) of a concrete element c is greater than (i.e. an over-approximation of) c itself. For this reason we introduce the notion of under-approximation Galois connection (UGC). Formally, a UGC is just a GC between A and C, in the reverse order, or equivalently a GC in which we replaced C and A with $C^{op}$ and $A^{op}$. However, we believe this definition allows a better notation, helping the reader's intuition. Given two complete lattices C and A, a pair of monotone functions α : C → A, γ : A → C defines a UGC between C and A when

$$\forall c \in C, a \in A. \quad a \preceq \alpha(c) \iff \gamma(a) \preceq c$$

and we denote such a UGC with $\langle C \underset{\gamma}{\overset{\alpha}{\rightleftarrows}} A \rangle$. Note the different positions of arrows and their super/subscripts when compared with a GC $\langle C \underset{\alpha}{\overset{\gamma}{\leftrightarrows}} A \rangle$. The difference between a GC and a UGC is sketched in Figure 1: in the GC (on the left) γ is above and α below, while in the UGC (on the right) the two are reversed. Using the duality observed above, from standard properties of GCs we get, reversing inequalities, that $\gamma \circ \alpha \preceq \mathrm{id}_C$, $\mathrm{id}_A \preceq \alpha \circ \gamma$, γ preserves lubs and α preserves glbs. Moreover, an under-approximation Galois insertion (UGI) is a UGC in which $\alpha \circ \gamma = \mathrm{id}_A$, and has the properties of α being onto and γ being injective, making the same identification of A with γ(A) possible, written as $\langle C \overset{\alpha}{\rightleftarrows} A \rangle$. In particular, this means that in a UGI on a concrete powerset $\langle P(C) \overset{\alpha}{\rightleftarrows} A \rangle$, for all $a, a' \in A$, $\gamma(a \cup a') = a \cup a'$, that is, A is closed under union.

Fig. 1: Sketches of GC and UGC

Dually to standard, over-approximation GCs, given a monotone function f : C → C and a UGC $\langle C \underset{\gamma}{\overset{\alpha}{\rightleftarrows}} A \rangle$, a function $f^\flat : A \to A$ is a correct (or sound) abstraction of f if $\alpha \circ f \succeq f^\flat \circ \alpha$. Again, $f^A = \alpha \circ f \circ \gamma$ is the best correct approximation of f.

As an example, let us take again C = P(Z) and let A = Int<sub>0</sub> be the set of integer intervals around 0, i.e. Int<sub>0</sub> = {I ∈ Int | 0 ∈ I} ∪ {⊥}. This is an under-approximation abstract domain because it contains ⊥ and is closed under union: the union of intersecting intervals is an interval too, and all elements of Int<sub>0</sub> intersect at 0. If again f(x) = |x| is the absolute value function, its bca $f^A$ is $f^A([n, m]) = [0, \max(|n|, |m|)]$, since it is always the case that n ≤ 0 ≤ m.
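The claims about intervals around 0 can be verified by brute force on a small window of Z. The following sketch assumes the finite universe {−2, …, 2} (a restriction made only to allow enumeration), identifies each abstract element with its concretisation as in a UGI, and uses hypothetical names throughout.

```python
from itertools import combinations

U = range(-2, 3)  # finite window of Z, an assumption that makes brute force possible

# Intervals around 0 restricted to U: the empty set (bottom) plus every interval containing 0.
INT0 = {frozenset()} | {
    frozenset(range(lo, hi + 1)) for lo in U for hi in U if lo <= 0 <= hi
}

def alpha(c):
    """Largest abstract element included in c: the union of all included elements."""
    return frozenset().union(*(a for a in INT0 if a <= c))

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

# Closure under union, and the UGC condition a <= alpha(c) iff gamma(a) <= c (gamma = id).
closed_under_union = all(a | b in INT0 for a in INT0 for b in INT0)
is_ugc = all((a <= alpha(c)) == (a <= c) for c in subsets(U) for a in INT0)

def abs_bca(a):
    """bca of the absolute value: alpha . f . gamma with gamma = id."""
    return alpha(frozenset(abs(x) for x in a))

print(closed_under_union, is_ugc)
print(sorted(abs_bca(frozenset({-2, -1, 0, 1}))))   # [-2, 1] is mapped to [0, 2]
```

A concrete set that does not contain 0, such as {−2, −1, 1, 2}, is abstracted to ⊥, which is the loss of information discussed in the Introduction.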

### 3 Integer Domains

In this section we focus on under-approximations of integer domains and prove that any under-approximation abstract domain will mostly return trivial analyses for programs that include sums inside arithmetic expressions.

To this aim, we introduce the concept of non emptying function.

Definition 1 (Non emptying function). Let $\langle C \underset{\gamma}{\overset{\alpha}{\rightleftarrows}} A \rangle$ be a UGC, f : C → C a monotone function and $f^A = \alpha \circ f \circ \gamma$ its bca. We say that f is non emptying (in A) if, for any concrete value c, $\alpha(c) \neq \bot$ and $\alpha(f(c)) \neq \bot$ imply $f^A(\alpha(c)) \neq \bot$.

Remember that ⊥ does not give any interesting information in the under-approximation setting, because it can mean divergence as well as complete loss of precision. On the contrary, any abstract element different from ⊥ means "something" interesting. The rationale behind the definition of non emptying function is that if the analysis starts from something ($\alpha(c) \neq \bot$) and it can find something ($\alpha(f(c)) \neq \bot$) then it will find at least one of the possible results ($f^A(\alpha(c)) \neq \bot$), thus not falling to ⊥ and avoiding the issues discussed in the Introduction. The meaning of Definition 1 is illustrated by the following toy example.

Example 2. Consider the simple imperative fragment

if (x ≠ 0) then { while (x < 10) { y := 7 / x; x := x + 1; } }

where a careless programmer used the condition x ≠ 0 instead of the expected x > 0: on any initial state where x is negative the program incurs a division by 0 error.

For the analysis, suppose x is an integer value and consider the domain Int<sub>01</sub> = {I ∈ Int | 0 ∈ I ∨ 1 ∈ I} ∪ {⊥}, a variation of Int<sub>0</sub> such that each interval in Int<sub>01</sub> must contain at least one of 0 and 1. By an argument similar to that for Int<sub>0</sub> it can be shown that Int<sub>01</sub> is closed under union (since 0 and 1 are consecutive values in the integer domain), and thus is an under-approximation domain.

Assume we start the analysis in this domain with the initial condition [−1; 10] for variable x: remember that, this being an under-approximation analysis, the abstract state [−1; 10] means that x may assume all the values in that interval at the beginning of the code fragment. In the concrete execution, the filter x ≠ 0 then produces the concrete set of values c = {−1, 1, 2, …, 10}, but the abstract interpreter must abstract this to its largest subset that is an interval containing 0 or 1, that is [1; 10]. The abstract analysis of the cycle then proceeds straightforwardly, finding ⊥ after one iteration of the loop body (since after the increment the set of values for x is {2, 3, …, 11}, which is abstracted to ⊥ because it contains neither 0 nor 1), and so the abstract fixpoint of the loop is [1; 10]. This yields no error, even though the concrete execution starting at x = −1 does indeed fail after one iteration. The issue here is that the semantics f of the increment x := x + 1 is not non emptying in Int<sub>01</sub>: on the concrete value c = {−1, 1, 2, …, 10}, its input in this program, we have $\alpha(f(c)) = \alpha(\{0, 2, 3, \dots, 11\}) = [0] \neq \bot$ but $f^A(\alpha(c)) = f^A([1; 10]) = \alpha(f(\gamma([1; 10]))) = \alpha(\{2, 3, \dots, 11\}) = \bot$.
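The computation in Example 2 can be replayed mechanically. The sketch below is a minimal illustration, assuming finite sets of integers, intervals encoded as pairs `(lo, hi)` and ⊥ encoded as `None`; the names `alpha01`, `increment` and `increment_bca` are hypothetical.

```python
def alpha01(values):
    """Under-approximating abstraction into intervals containing 0 or 1.

    Returns the largest run of consecutive integers inside `values` that
    contains 0 or 1, as a pair (lo, hi), or None (bottom) if there is none.
    """
    values = set(values)
    anchors = [a for a in (0, 1) if a in values]
    if not anchors:
        return None
    lo = hi = anchors[0]
    while lo - 1 in values:
        lo -= 1
    while hi + 1 in values:
        hi += 1
    return (lo, hi)

def gamma(interval):
    """Concretisation of an interval (or of bottom) as a set of integers."""
    if interval is None:
        return set()
    lo, hi = interval
    return set(range(lo, hi + 1))

def increment(values):
    """Concrete semantics of x := x + 1, lifted to sets."""
    return {x + 1 for x in values}

def increment_bca(interval):
    """bca of the increment: alpha . f . gamma."""
    return alpha01(increment(gamma(interval)))

c = {x for x in range(-1, 11) if x != 0}   # values of x after the filter x != 0

print(alpha01(c))                  # abstraction of the filtered set
print(alpha01(increment(c)))       # alpha(f(c)): not bottom
print(increment_bca(alpha01(c)))   # f^A(alpha(c)): bottom
```

Both α(c) and α(f(c)) differ from ⊥ while the bca applied to α(c) returns ⊥, so the increment violates Definition 1 in this domain.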

For the remainder of the paper we assume a set of concrete values C, a UGI $\langle P(C) \overset{\alpha}{\rightleftarrows} A \rangle$ with concrete domain P(C), and we say an element S ∈ P(C) is representable if it belongs to A, or equivalently if α(S) = S.

Definition 3. Let S ⊆ C be a subset of C. We say that d ∈ C is representable with S if S ∪ {d} is representable. We call R(S) the set of elements of C representable with S, i.e.

$$R(S) = \{ d \in C \mid \alpha(\{d\} \cup S) = \{d\} \cup S \}.$$

For the sake of brevity, we shall write R for R(∅), the set of representable values of C, and R(c) for R({c}) where c ∈ C is any concrete value. The following is a technical lemma valid for non emptying functions, that explains the role played by Definition 1 in proving all our negative results (Propositions 7, 10 and Theorems 12, 15).

Lemma 4. Let f : C → C be non emptying, c ∈ R and the pair $\{c, \bar{c}\}$ be not representable, i.e. $\bar{c} \notin R(c)$. If $f(\bar{c}) \in R$ then also $f(c) \in R$.
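Lemma 4 can be read contrapositively on the domain of intervals containing 0 or 1 from Example 2: if its hypotheses on c and c̄ hold but f(c) is not representable, then f cannot be non emptying. The Python sketch below checks this for the increment, assuming a finite window of Z for enumeration; the names `representable` and `R`, and the choice c = 1, c̄ = −1, are illustrative.

```python
WINDOW = range(-5, 6)   # finite window of Z, an assumption made for enumeration

def representable(s):
    """Membership in the domain: empty, or a run of consecutive integers containing 0 or 1."""
    s = set(s)
    if not s:
        return True
    is_run = max(s) - min(s) + 1 == len(s)
    return is_run and (0 in s or 1 in s)

def R(s=()):
    """Elements d of the window such that s together with d is representable."""
    return {d for d in WINDOW if representable(set(s) | {d})}

def f(x):
    """The increment x + 1, on single values."""
    return x + 1

c, c_bar = 1, -1

print(sorted(R()))        # representable values
print(sorted(R({c})))     # values representable with c
print(c in R(), c_bar not in R({c}), f(c_bar) in R(), f(c) in R())
```

Here c ∈ R, c̄ ∉ R(c) and f(c̄) = 0 ∈ R, yet f(c) = 2 ∉ R, so by Lemma 4 the increment is not non emptying in this domain, in agreement with Example 2.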

The main proof line of all our impossibility results is the same, and exploits this lemma. All our results require the size of the abstract domain to be comparable with that of the set of concrete values C (whose powerset P(C) is the concrete domain), and this in turn implies that representable elements are few. Then, assuming that all functions in a certain family are non emptying, we repeatedly use Lemma 4 to get many new representable elements, thus finding a contradiction. The key issues in the proofs are two: first, it must be possible to apply Lemma 4; second, all the new representable elements obtained by applying it must be different from one another. In the following, we present some sets of conditions that are able to guarantee these two points, hence obtaining hypotheses for the non existence of under-approximation abstract domains.

### 3.1 Infinite Integer Domain

As a first example, we consider the infinite domain P(Z) of integers.

Assumption 5 We assume that an abstract domain A, to be feasible for analyses, must be at most countable.

We make this assumption because we want to represent abstract elements with a number of bits comparable with that of concrete values, so that the analysis has a complexity comparable with a single concrete execution of the program and not exponentially larger. Thus, we require the size of the abstract domain to be that of Z, the set of values handled by the program, and not that of the concrete domain P(Z). Many abstract domains satisfy it, for instance intervals, octagons and polyhedra with at most n edges, for any n; some, such as general polyhedra, don't, but they also exhibit a worst case exponential cost.

Based on Assumption 5, we prove a simple cardinality estimate that is used, as anticipated before, to prove that there are few representable elements.

Lemma 6. For any fixed subset S ⊆ Z, R(S) is finite.

The result for integers now shows that no under-approximation abstract domain makes all sums non emptying. The idea of the proof is to define an infinite sequence of representable elements, which contradicts the previous lemma stating that R is finite. In order to define such a sequence, we want to use Lemma 4: we start from an initial representable $n_0$ and from a value $\bar{n}$ not representable with it, then find a non-emptying f that maps $\bar{n}$ into $n_0$, so that $f(\bar{n})$ is representable and we can then apply the lemma to get the new representable element $f(n_0)$. We then iterate this procedure, changing f, to build the infinite sequence. We believe the hypothesis that there exists an initial representable value is not very restrictive, since initializations like x = 0 must be abstracted to ⊥ if 0 is not representable.

Proposition 7. Let $\langle P(\mathbb{Z}) \overset{\alpha}{\rightleftarrows} A \rangle$ be a UGI, and assume that there is an integer $n_0$ that is representable. Then it can't be the case that all the functions of the form $f_n(x) = x + n$ are non emptying in A.

The meaning of this proposition for program analysis is that a domain small enough (by Assumption 5) is probably unable to deduce meaningful information on an integer domain: if it doesn't contain representable singletons it must abstract any variable initialization to ⊥, and otherwise it can't be non emptying for all sums, hence getting ⊥ when values are manipulated using this operation. In both cases, because of strictness, the abstract ⊥ is propagated along program paths, yielding it as the final result of the analysis, which means exactly that it can't determine any information. This issue is not bound to manifest for all programs, but for any domain there exist programs for which it does.

### 3.2 Finite Integer Domain

An analogous result can be obtained for a finite integer domain P([−N; N]), where N is some big integer. This concrete domain models machine integers, that are constrained within an interval, so we assume that operations are performed in machine arithmetic, that is wrapping around in case of overflows. This is modelled working modulo 2N + 1, the length of the interval, and taking the unique representative of each congruence class in the interval [−N, N] of interest. It is worth noting that the interval is taken symmetric around 0 to simplify notation, but there is no conceptual difficulty in using an asymmetric one.
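The wrap-around arithmetic described above can be sketched as follows. This is an illustration only: the helper names `wrap` and `add` are hypothetical, and N = 3 is chosen just to keep the example small.

```python
def wrap(x, N):
    """Unique representative of x modulo 2N + 1 in the interval [-N, N]."""
    return (x + N) % (2 * N + 1) - N

def add(x, n, N):
    """The sum x + n in machine arithmetic on [-N, N]."""
    return wrap(x + n, N)

N = 3
values = range(-N, N + 1)

print(add(3, 1, N))                          # overflow wraps around to -N
print(sorted(add(x, 2, N) for x in values))  # each shift permutes [-N, N]
```

Each shift is a bijection on [−N, N], so the modular sums remain injective, as the sums over Z are.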

Assumption 8 We assume that an abstract domain A, to be feasible, must have a cardinality that is polynomial in N.

This assumption guarantees that the number of bits required to represent an abstract element is linear in that for concrete elements so that, again, the cost of the analysis is polynomial and not exponential in that of a concrete execution.

In the following we'll use asymptotic notation for some quantities. For this to be completely formal we should define a sequence of abstract domains $A_N$, each one for the concrete domain P([−N, N]), then define a sequence of values for each quantity we want to estimate, and take the limit of this sequence for N going to infinity. However, we believe all these formal details would clutter notation, making it hard to get insight. For this reason, we avoid all this, just (ab)using the intuitive meaning associated with the notation.

The next lemma is analogous to Lemma 6 in proving that some sets are small under Assumption 8 on the cardinality of A.

Lemma 9. For any fixed subset S ⊆ Z, |R(S)| = O(log(N)).

The following proposition uses the same proof line as Proposition 7 above: we define a sequence of representable elements, and prove that they are too many since, by the previous lemma, R is quite small.

Proposition 10. Let $\langle P([-N, N]) \overset{\alpha}{\rightleftarrows} A \rangle$ be an under-approximation Galois insertion, and assume that there is an integer $n_0$ that is representable. Then it can't be the case that all the functions of the form $f_n(x) = x + n$ (modulo 2N + 1) are non emptying in A.

### 4 Arbitrary domains

The definition of non emptying function is fully general and not limited to the concrete integer domain, hence we use it to propose conditions that are independent of the concrete domain. In this section, we deal with an infinite set C of concrete values, and a UGI $\langle P(C) \overset{\alpha}{\rightleftarrows} A \rangle$. Again, we adopt Assumption 5 on the size of A. Under this assumption we can prove again Lemma 6, which doesn't depend on the specific integer domain considered in the previous section.

All conditions we propose in this section are mainly on the family of functions considered and not on the abstract domain. The reason for this is that first we fix a function family, corresponding to a program, and then we look for a domain well suited to analyse the specific family at hand. In other words, the family is given by the applicative context, while the domain can be adapted to it.

Definition 11 (Highly surjective function family). Given a family F of functions from C to itself and an element c ∈ C, let

$$P(c) = \{ d \in C \mid \exists f \in F. \ f(d) = c \}$$

be the set of preimages of c, elements of C that can be mapped to c by a function in F. We say that the family F is highly surjective if P(c) is infinite for any possible choice of c ∈ C.
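As a small illustration of Definition 11, the following Python sketch computes preimage sets restricted to a finite window of integers. The helper names `preimages` and `sums`, the window bound `N` and the truncation itself are assumptions made for illustration only, since an infinite set can't be enumerated. For integer translations the number of preimages found grows with the window, while for a family of just two translations it stays constant.

```python
def preimages(c, family, window):
    """Elements d of `window` that some function in `family` maps to c."""
    return {d for d in window if any(f(d) == c for f in family)}

def sums(N):
    """Translations x -> x + n with |n| <= 2N (hypothetical truncation of the sum family)."""
    return [lambda x, n=n: x + n for n in range(-2 * N, 2 * N + 1)]

# Sum family: every point of the window is a preimage of 0.
for N in (5, 10, 20):
    window = range(-N, N + 1)
    print(N, len(preimages(0, sums(N), window)))

# A family with only two functions: the preimage set does not grow.
two = [lambda x: x - 1, lambda x: x - 2]
print(preimages(0, two, range(-20, 21)))
```

The growth with N in the first loop is only suggestive of infiniteness; it is not a proof of high surjectivity.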

This property is needed together with Lemma 6 to apply Lemma 4 and get a new representable element: since there are infinitely many preimages of c but R(c) is finite, there are elements c̄ ∈ P(c) not in R(c); then by definition of P(c) there is an f such that f(c̄) = c ∈ R, so we can apply the lemma to get f(c) ∈ R. The reason for requiring f(c̄) = c instead of just f(c̄) ∈ R is that, at the beginning of the proof, we only assume R to contain one element, hence the two conditions are equivalent. Starting from this basic idea, we present two sets of sufficient conditions to prove the non existence of any under-approximation abstract domain.

### 4.1 Local Requirements for Impossibility

The first set of conditions we propose is in a sense more "local", in that it imposes conditions on each function in the family F independently of the others.

Theorem 12. Let F be a highly surjective function family from C to itself such that all functions f ∈ F are either injective or acyclic. Assume also that R isn't empty. Then A can't be non emptying for all f ∈ F.

In the previous section we developed an ad hoc proof for the family of sums over integers, but the same result can also be obtained as an application of this theorem: if C = Z and F = {λx. x + n | n ∈ Z}, the family is highly surjective (actually P(c) = Z for all c) and all these functions are injective, so it meets the hypotheses of the theorem. Another example is given by rational or real numbers, with sums or products.

Example 13. Take C = Q \ {0} and F = {λx.x · q | q ∈ Q \ {0}}. The family is highly surjective since P(c) = Q \ {0} for all c, and all these functions are invertible, hence injective.

A possibly more interesting example of application is to floating-point numbers as described by the IEEE Standard.

Example 14. Take C = F \ {0}, the set of non-zero floating-point numbers that can be represented with a fixed number of significant digits, say t bits, but with an arbitrary precision exponent. We choose infinite precision exponents and a finite number of significant digits in order to have an infinite domain, as required by the theorem, while preserving the characteristics of floating-point arithmetic.

Let · and ⊙ denote respectively the real product and its floating-point approximation, and consider the function family F = {λx. x ⊙ y | y ∈ C}. The function family is highly surjective, e.g. considering that all numbers with the same significant digits as a floating-point x but different exponent can be mapped into x by multiplying them by 1 times the difference of exponents. For the second condition, if y = ±1 the function λx. x ⊙ y is invertible, hence injective. Otherwise, assume without loss of generality that y > 1 (other cases are analogous), and assume by contradiction that it has a cycle f<sup>n</sup>(x<sub>0</sub>) = x<sub>0</sub>. By monotonicity of ⊙ we have f(x) = x ⊙ y ≥ x ⊙ 1 = x, hence x<sub>0</sub> ≤ f(x<sub>0</sub>) ≤ f<sup>2</sup>(x<sub>0</sub>) ≤ ··· ≤ f<sup>n</sup>(x<sub>0</sub>) = x<sub>0</sub>, so all the elements of the cycle are equal, in particular f(x<sub>0</sub>) = x<sub>0</sub>. However, if y ≠ 1, the product x ⊙ y is never equal to x, which is a contradiction. Hence the function is acyclic. This means F meets the hypotheses of Theorem 12, hence no abstract domain on floating-point numbers can be non emptying for all multiplications.
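The key step of Example 14, namely that a rounded product by y ≠ 1 never returns its argument, can be checked exhaustively on a toy model. The sketch below is an assumption-laden illustration: it models positive numbers only, uses a significand width `T` of 4 bits, rounds to nearest with ties to even, and represents floats as exact `Fraction` values so that the exponent is unbounded. The names `fl`, `fmul` and `FLOATS` are introduced here and do not come from the paper.

```python
from fractions import Fraction

T = 4  # significand width in bits; kept small so the check is exhaustive

def fl(q):
    """Round a positive rational to T significant bits (ties to even), unbounded exponent."""
    e = 0
    while q / Fraction(2) ** e >= 2 ** T:
        e += 1
    while q / Fraction(2) ** e < 2 ** (T - 1):
        e -= 1
    # q / 2^e now lies in [2^(T-1), 2^T); round it to an integer significand.
    return round(q / Fraction(2) ** e) * Fraction(2) ** e

def fmul(x, y):
    """Floating-point product: exact rational product followed by rounding."""
    return fl(x * y)

# Positive floats with T-bit significands and exponents in a small range.
FLOATS = [Fraction(m) * Fraction(2) ** e
          for m in range(2 ** (T - 1), 2 ** T) for e in range(-6, 7)]

# Pairs (x, y) with y != 1 for which multiplication by y leaves x unchanged.
fixed_points = [(x, y) for x in FLOATS for y in FLOATS
                if y != 1 and fmul(x, y) == x]
print(len(fixed_points))
```

In this model the list of fixed points is empty, which matches the claim used to derive the contradiction; the check says nothing about formats with bounded exponents, where overflow and subnormal numbers behave differently.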

### 4.2 Global Requirements for Impossibility

The second set of conditions we propose is "global", in the sense that it requires the family F to satisfy a property as a whole.

Theorem 15. Let F be a highly surjective function family from C to itself such that


Assume also that R isn't empty. Then A can't be non emptying for all f ∈ F.

Again this result can be used to prove the impossibility of building an abstract domain for integers that is non emptying for all sums, or for floating-point numbers.

Example 16. Take C = F \ {0}, the set of non-zero floating-point numbers with t-bit significands and arbitrary precision exponents, and F = {λx. x ⊙ y | y ∈ F \ {0}}. As observed in Example 14 this family is highly surjective. Fixing now two floating-point numbers x, y, and letting u be the machine precision of floating-point arithmetic, we have that y = f(x) = x ⊙ z only if

$$
\left| \frac{y - (x \cdot z)}{x \cdot z} \right| < \mathbf{u}
$$

that is

$$\left|\frac{y}{x}\right| \frac{1}{1+\mathbf{u}} < |z| < \left|\frac{y}{x}\right| \frac{1}{1-\mathbf{u}}.$$

This is a bounded interval since x ≠ 0, and hence contains only a finite number of floating-point numbers. Analogously, fixing a floating-point y and a function f(x) = x ⊙ z, we have that y = x ⊙ z only if |x| belongs to a bounded interval, which contains a finite number of floating-point numbers. So, by means of Theorem 15 above, we proved again that no abstract domain on floating-point numbers can be non emptying for all multiplications.
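The step between the two displayed inequalities of Example 16 can be spelled out as follows. This is a routine expansion, assuming 0 < u < 1 and x · z ≠ 0, and it uses the reverse triangle inequality for the first implication:

$$
|y - x \cdot z| < \mathbf{u}\,|x \cdot z|
\;\Longrightarrow\;
(1-\mathbf{u})\,|x \cdot z| < |y| < (1+\mathbf{u})\,|x \cdot z|
\;\Longrightarrow\;
\left|\frac{y}{x}\right| \frac{1}{1+\mathbf{u}} < |z| < \left|\frac{y}{x}\right| \frac{1}{1-\mathbf{u}}.
$$

Both bounds are strictly positive and finite, so the admissible values of |z| stay away from both 0 and infinity.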

### 5 On the necessity of the high surjectivity hypothesis

Both sets of conditions we proposed in the previous section require the function family to be highly surjective. This turns out to be necessary in order to prove that no under-approximation abstract domain exists:

Proposition 17. For any fixed family F of functions from C to itself that is not highly surjective, there exists an abstract domain A<sub>F</sub> for P(C) such that


Moreover, the proof of this proposition is constructive, and we present an example of such construction in the following.

Example 18. Fix the pair of functions f(x) = x − 1 and g(x) = x − 2 on Z. The family F = {f, g} is clearly not highly surjective, so we build an under-approximation abstract domain for which these functions are non emptying. First, take an integer n<sub>0</sub> such that P(n<sub>0</sub>) (computed with respect to F) is finite. With this F, any integer is fine, so let us fix n<sub>0</sub> = 0.

The set of preimages of 0 is P(0) = {1, 2}. We define the abstract domain A<sub>F</sub> as

$$A_F = \{ \emptyset \} \cup \{ X \cup \{ 0 \} \mid X \subseteq P(0) \} = \{ \emptyset, \{ 0 \}, \{ 0, 1 \}, \{ 0, 2 \}, \{ 0, 1, 2 \} \}$$

In this abstract domain, a set is abstracted to ∅ if and only if it doesn't contain 0, since all elements of A<sub>F</sub> but ∅ contain 0 and the abstraction of a set must be a subset of that set.

To check that f is non emptying in A<sub>F</sub>, fix a set S ⊆ Z. If α(S) = ∅ the non emptying condition is vacuously true, so assume this is not the case, which is equivalent to 0 ∈ S. Analogously, if α(f(S)) = ∅ the condition is true, so assume 0 ∈ f(S) or, equivalently, 1 ∈ S. Using these two we get


The check for g is analogous.
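The check of Example 18 can also be carried out mechanically on a finite window of Z. The sketch below relies on an assumed reading of the non emptying condition, inferred from the case analysis above: whenever α(S) and α(f(S)) are both non-empty, α(f(α(S))) must be non-empty too. The window bounds and the helper names `alpha`, `image` and `non_emptying` are hypothetical and serve only this illustration.

```python
from itertools import combinations

P0 = {1, 2}  # preimages of 0 under f and g

def alpha(S):
    """Abstraction into A_F: the largest element of A_F contained in S."""
    return frozenset(S & (P0 | {0})) if 0 in S else frozenset()

def image(f, S):
    """Additive extension of f to sets."""
    return {f(x) for x in S}

def non_emptying(f, universe, abstraction):
    """Assumed reading: if alpha(S) and alpha(f(S)) are non-empty,
    then alpha(f(alpha(S))) must be non-empty as well."""
    elems = list(universe)
    for k in range(len(elems) + 1):
        for S in map(set, combinations(elems, k)):
            a = abstraction(S)
            if a and abstraction(image(f, S)) and not abstraction(image(f, a)):
                return False
    return True

def f(x):
    return x - 1

def g(x):
    return x - 2

window = range(-3, 4)  # finite window of Z, an assumption needed for enumeration

print(non_emptying(f, window, alpha), non_emptying(g, window, alpha))

# For contrast, the smaller domain {∅, {0}} fails for f under the same reading.
def alpha_small(S):
    return frozenset({0}) if 0 in S else frozenset()

print(non_emptying(f, window, alpha_small))
```

The contrast case shows why the construction adds the preimages of 0 to the domain: with S = {0, 1} the smaller domain keeps only {0}, whose image under f no longer contains 0.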

Even though this proposition defines an under-approximation abstract domain, it shouldn't be interpreted as a positive result since the resulting domain is almost a power set and hence too large to be feasible in practice. Instead, the proposition should be regarded as a way to show that one of the hypotheses required in the previous theorems is tight and can't be weakened. In particular, since these kinds of results need high surjectivity, they are ill suited when the focus is on a single function.

This proposition can be generalized to consider sets S ⊆ C whose preimages are finite, but a little care is needed when lifting the definition of preimages to sets of values: a preimage is a set for which there exists a function that maps it to S, not the union of the preimages of elements in S:

$$P(S) = \{ T \subseteq C \mid \exists f \in F. f(T) = S \}$$

Using this definition, the proposition generalizes straightforwardly:

Proposition 19. Let F be a family of functions from C to itself, and assume there is a set S<sub>0</sub> ⊆ C such that P(S<sub>0</sub>) is finite. Then there exists a finite abstract domain A<sub>F</sub> for P(C) such that all functions f ∈ F are non emptying in A<sub>F</sub>.

This proposition may for instance be applied to the concrete domain of finite lists to show that a natural function family to consider can't be used to prove non existence of under-approximation domains using non emptying functions.

Example 20. Fix the concrete domain C as the set of all lists of finite length over a finite, non-empty alphabet Γ, i.e. C = Γ<sup>∗</sup>. For a finite string α ∈ Γ<sup>∗</sup>, let

$$\text{concat}_{\alpha}(\beta) = \alpha \beta$$

be the function that prefixes α to its argument. The family

$$F = \{ \text{concat}_{\alpha} \mid \alpha \in \Gamma^* \}$$

is not highly surjective because, for a fixed string γ, only its suffixes can be mapped into it by a function in F, and there are finitely many of them. Hence we can define an under-approximation abstract domain for which all these functions are non emptying by means of Proposition 19. Such domains are defined with a construction similar to that of Example 18, and in particular, if ε is the empty list, considering the set S<sub>0</sub> = {ε}, whose only preimage is S<sub>0</sub> itself, the construction yields

$$A_F = \{ \emptyset, \{\epsilon\} \}$$

It's easy to check that all functions concat<sub>α</sub> are non emptying in this abstract domain.
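The finiteness of preimages under the concat family can be observed directly on short strings. The following Python sketch enumerates, for a fixed γ, the strings β such that αβ = γ for some α. The alphabet `GAMMA`, the length bound and the helper names are assumptions introduced for illustration.

```python
from itertools import product

GAMMA = "ab"  # example alphabet (assumption)

def strings(max_len):
    """All strings over GAMMA of length at most max_len."""
    return ["".join(p) for n in range(max_len + 1)
            for p in product(GAMMA, repeat=n)]

def concat(alpha):
    """The function concat_alpha, which prefixes alpha to its argument."""
    return lambda beta: alpha + beta

def preimages(gamma, max_len):
    """Strings beta with concat_alpha(beta) == gamma for some alpha."""
    pool = strings(max_len)
    return {beta for beta in pool for alpha in pool
            if concat(alpha)(beta) == gamma}

print(sorted(preimages("aba", 4)))
```

Raising the length bound beyond the length of γ does not add new preimages, so a string γ has at most |γ| + 1 of them.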

The previous proposition focuses on preimages, stating that if there is a concrete element that has finitely many of them then it is possible to define an under-approximation domain. A natural dual of this proposition can be formulated in terms of images. For a subset S ⊆ C, the set of its images is

$$I(S) = \{ f(S) \mid f \in F \}$$

This definition is exactly dual to that of preimages, and can actually be used to formulate a similar result.

Proposition 21. Let F be a family of total functions (i.e. if S ≠ ∅ then f(S) ≠ ∅) from P(C) to itself, and assume there is a non empty set S<sub>0</sub> ⊆ C such that I(S<sub>0</sub>) is finite. Then there exists a finite abstract domain A<sub>F</sub> such that all functions f ∈ F are non emptying in A<sub>F</sub>.

Even though this proposition introduces the technical hypothesis that all f ∈ F are total, we don't believe this to be very restrictive because these theorems are intended to be applied when F is a family of basic transfer functions, that seldom introduce divergence: in programming languages this is often caused by control-flow constructs. An application of this proposition is again on lists, to rule out another natural function family.

Example 22. Fix again C = Γ<sup>∗</sup>, and consider the functions drop<sub>n</sub> : Γ<sup>∗</sup> → Γ<sup>∗</sup> that, given a list, drop its first n elements and return the resulting list. If the input list is shorter than n, the output of drop<sub>n</sub> is the empty list ε. The function family

$$F = \{ \text{drop}_n \mid n \in \mathbb{N} \}$$

is highly surjective since, for any fixed list α ∈ Γ<sup>∗</sup> and any n, we can prepend any n characters to α and map the resulting list back to α with drop<sub>n</sub>. However, images through this function family are finite:

$$I(\alpha) = \{ \text{drop}_n(\alpha) \mid n \in \mathbb{N} \}$$

is finite since it's the set of all tails of α. Hence by Proposition 21 we can define an under-approximation abstract domain such that all functions drop<sub>n</sub> are non emptying. Again, these domains are constructed from sets S<sub>0</sub> with finitely many images, and considering S<sub>0</sub> = {ε}, which satisfies I(S<sub>0</sub>) = {ε}, it yields

$$A_F = \{ \emptyset, \{\epsilon\} \}$$

Again it can be easily checked that all functions drop<sub>n</sub> are non emptying in A<sub>F</sub>.
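The two claims of Example 22 can be tested on short strings: the images of a list under the drop family are exactly its tails, and the drop functions are non emptying in the two-element domain. As before, this is a sketch under an assumed reading of the non emptying condition (if α(S) and α(f(S)) are non-empty, so is α(f(α(S)))); the alphabet, the length bound and all helper names are hypothetical.

```python
from itertools import combinations, product

GAMMA = "ab"  # example alphabet (assumption)
EPS = ""      # the empty list, written ε in the text

def drop(n):
    """The function drop_n; slicing past the end yields the empty string."""
    return lambda s: s[n:]

def images(s):
    """All drop_n(s); any n beyond len(s) only yields the empty string again."""
    return {drop(n)(s) for n in range(len(s) + 2)}

def alpha(S):
    """Abstraction into A_F = {∅, {ε}}."""
    return {EPS} if EPS in S else set()

def non_emptying(f, pool):
    """Assumed reading of the condition, checked on all subsets of `pool`."""
    for k in range(len(pool) + 1):
        for S in map(set, combinations(pool, k)):
            a = alpha(S)
            if a and alpha({f(x) for x in S}) and not alpha({f(x) for x in a}):
                return False
    return True

# All strings of length at most 2 over GAMMA.
pool = ["".join(p) for n in range(3) for p in product(GAMMA, repeat=n)]

print(sorted(images("abb")))
print(all(non_emptying(drop(n), pool) for n in range(4)))
print(non_emptying(lambda s: "a" + s, pool))  # a concat function, for comparison
```

The last line checks one concat function in the same domain: its image never contains ε unless the prefix is empty, so the condition holds vacuously.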

These two propositions consider opposite situations in which it is possible to define an under-approximation domain: the former requires being able to go backward using F in infinitely many ways, while the latter requires being able to go forward. This often isn't the case in the presence of "boundaries" in the concrete domain, which are points that functions tend to either walk up to or walk away from: for instance, ε is such a point for finite strings because concat functions go away from it while drop functions go towards it. Another example of such a boundary is 0 in the domain of integers Z with respect to multiplications and (rounded) divisions: the former increase the absolute value, moving away from 0 (even though 0 itself is never a preimage), while the latter decrease it. Considering a function family made of both kinds of functions doesn't work either: a slight adaptation of the constructions for the two propositions above shows that, if F can be partitioned into two subfamilies, each satisfying the hypotheses of one of the two propositions, then there exists an under-approximation abstract domain. An example of this is in the set of finite lists, taking as F both concat and drop functions. The construction then yields exactly A<sub>F</sub> = {∅, {ε}}, for which all these functions are non emptying, as shown in Examples 20 and 22. In light of these observations, in order to effectively apply the definition of non emptying function to prove non existence of abstract domains, for every possible boundary there must be a function that is able to both enter and exit it. This happens for integers, since there is no boundary, but doesn't for finite lists, with {ε} often being either a sink or a source for many functions on lists.

### 6 Conclusions and Future Works

Until recently, the focus of formal static analyses has been on over-approximation to prove program correctness, but many tools based on this theory are instead deployed to catch bugs [23,10]. Incorrectness Logic promoted the study of a theory for under-approximation to give a formal basis to a new class of tools. This has seldom been done in the last few decades, especially in the framework of Abstract Interpretation. In our work, we point out some asymmetries between over- and under-approximation in Abstract Interpretation, and why those are an obstacle to the design of abstract domains. We have identified functions as the main difference, because they remain the same in both over- and under-approximation, thus preventing one theory from being obtained simply as a dual of the other. Handling of divergence is another critical issue. Building on those ideas, we have proposed the new (to the best of our knowledge) definition of non emptying function and studied how it can be used to prove non existence of under-approximation abstract domains. We have presented some general results, and applied them to integer and floating-point domains to conclude that, under some assumptions, there are no useful under-approximation domains. Then, we have found conditions under which there do exist under-approximation abstract domains, showing that some of the hypotheses required in our theorems are very tight. However, because of the scarcity of works in this direction, we believe there are many possible subjects for future research.

Under-approximation abstract domains must be closed under union, but known abstract domains are rarely such. However, disjunctive completion [11], a known domain transformer, refines any abstract domain into a union-closed one. This has been studied for over-approximation in order to improve precision at the expense of increased complexity. A solution to keep the analysis feasible is to use heuristics to prune disjunctions, trading back complexity for precision, but making the analysis possible for under-approximations. Moreover, practical tools based on the theory of Incorrectness Logic already use heuristics to drop logical disjunctions [19], so taking inspiration from them may be effective also for Abstract Interpretation.

In their recent work, Raad et al. [20] study incorrectness separation logic, the join of separation logic [21] and Incorrectness Logic. They notice that the original separation logic doesn't distinguish a pointer known to be dangling from one about which it has no information, and they introduce a new kind of heap assertion for dangling pointers. This issue is reminiscent of the difference between divergence and no information that we run into in Abstract Interpretation. This may suggest the introduction of a similar distinction also in under-approximation domains, but a new point different from ⊥ describing divergence needs a concretization, and no such element exists in a power set other than ∅. However, in Abstract Interpretation it happens at times that more general concrete domains allow more flexibility in the abstraction (e.g. as proposed for higher-order functional languages [5]), so it may be worth investigating the possibility of changing the concrete domain to account for this new point.

All our results depend on the existence of a representable value. This assumption is motivated by the analysis performed, but is not a requirement of Abstract Interpretation itself. A way to remove this hypothesis may be to consider representable sets of minimal cardinality because functions defined as additive extensions don't increase cardinality, so they might take the place of singletons. The technical issue is if and how Lemma 4 can be generalized, but we believe it may be possible to relax that hypothesis about singletons.

We have discussed the finite domain of integers at the end of Section 3, but all our general results deal with infinite concrete domains. Both theorems rely on cardinality estimates essentially based on the fact that arbitrary combinations of finite numbers are still finite, hence less than the cardinality of the concrete domain. However, with a finite concrete domain those would be replaced by combinations of logarithmic factors, which may become equal to the size of the concrete domain. For finite domains we can prove a result reminiscent of Theorem 15, but this topic requires thorough investigation to understand the new issues and possibilities they open up.

Acknowledgements. We thank the anonymous reviewers for their helpful comments.

### References


24. Schmidt, D.A.: A calculus of logical relations for over- and underapproximating static analyses. Sci. Comput. Program. 64(1), 29–53 (2007). https://doi.org/10.1016/j.scico.2006.03.008


## On probability-raising causality in Markov decision processes<sup>⋆</sup>

Christel Baier, Florian Funke, Jakob Piribauer, and Robin Ziemek

Technische Universität Dresden {christel.baier, florian.funke, jakob.piribauer, robin.ziemek}@tu-dresden.de

Abstract. The purpose of this paper is to introduce a notion of causality in Markov decision processes based on the probability-raising principle and to analyze its algorithmic properties. The latter includes algorithms for checking cause-effect relationships and the existence of probability-raising causes for given effect scenarios. Inspired by concepts of statistical analysis, we study quality measures (recall, coverage ratio and f-score) for causes and develop algorithms for their computation. Finally, the computational complexity for finding optimal causes with respect to these measures is analyzed.

### 1 Introduction

As modern software systems control more and more aspects of our everyday lives, they grow increasingly complex. Even small changes to a system might cause undesired or even disastrous behavior. Therefore, the goal of modern computer science does not only lie in the development of powerful and versatile systems, but also in providing comprehensive techniques to understand these systems. In the area of formal verification, counterexamples, invariants and related certificates are often used to provide a verifiable justification that a system does or does not behave according to a specification (see e.g., [30,16,32]). These, however, provide only elementary insights on the system behavior. Thus, there is a growing demand for a deeper understanding of *why* a system satisfies or violates a specification and *how* different components influence the performance. The analysis of causal relations between events occurring during the execution of a system can lead to such understanding. The majority of prior work in this direction relies on causality notions based on Lewis' counterfactual principle [29], stating that the effect would not have occurred if the cause had not happened. A prominent formalization of the counterfactual principle is given by Halpern and Pearl [21] via structural equation models. This inspired formal definitions of causality and related notions of blameworthiness and responsibility in Kripke and game structures (see, e.g., [15,11,14,40,19,41,7]).

<sup>⋆</sup> This work was funded by DFG grant 389792660 as part of TRR 248, the Cluster of Excellence EXC 2050/1 (CeTI, project ID 390696704, as part of Germany's Excellence Strategy), DFG projects BA-1679/11-1 and BA-1679/12-1, and the RTG QuantLA (GRK 1763).

P. Bouyer and L. Schröder (Eds.): FoSSaCS 2022, LNCS 13242, pp. 40–60, 2022. https://doi.org/10.1007/978-3-030-99253-8_3

Table 1. Complexity results for MDPs and Markov chains (MC) with fixed effect set

In this work, we approach the concept of causality in a probabilistic setting, where we focus on the widely accepted *probability-raising principle* which has its roots in philosophy [38,39,18,22] and has been refined by Pearl [35] for causal and probabilistic reasoning in intelligent systems. The different notions of probability-raising cause-effect relations discussed in the literature share the following two main principles:


Despite the huge amount of work on probabilistic causation in other disciplines, research on probability-raising causes in the context of formal methods is comparatively rare and has concentrated on Markov chains (see, e.g., [24,25,6] and the discussion of related work in Section 3.2). To the best of our knowledge, probabilistic causation for probabilistic operational models with nondeterminism has not been studied before.

We formalize the principles (C1) and (C2) for Markov decision processes (MDPs), a standard operational model combining probabilistic and non-deterministic behavior, and concentrate on reachability properties where both cause and effect are given as sets of states. Condition (C1) can be interpreted in two natural ways in this setting: On one hand, the probability-raising property can be locally required for each element of the cause. Such causes are called *strict probability-raising (SPR) causes* in our framework. This interpretation is especially suited when the task is to identify system states that have to be avoided for lowering the effect probability. On the other hand, one might want to treat the cause set globally as a unit in (C1), leading to the notion of *global probability-raising (GPR) cause*. Considering the cause set as a whole is better suited when further constraints are imposed on the candidates for cause sets. This might apply, e.g., when the set of non-terminal states of the given MDP is partitioned into sets of states S<sub>i</sub> under the control of an agent i, 1 ≤ i ≤ k. For the task of identifying which agent's decisions cause the effect, only the subsets of S<sub>1</sub>, ..., S<sub>k</sub> are candidates for causes. Furthermore, global causes are more appropriate when causes are used for monitoring purposes under partial observability constraints as then the cause candidates are sets of indistinguishable states.

Different causes for an effect according to our definition can differ substantially regarding how well they predict the effect and how well the executions exhibiting the cause cover the executions showing the effect. Taking inspiration from measures used in statistical analysis (see, e.g., [36]), we introduce quality measures that allow us to compare causes and to look for optimal causes: The *recall* captures the probability that the effect is indeed preceded by the cause. The *coverage-ratio* quantifies the fraction of the probability that cause and effect are observed and the probability that the effect but not the cause is observed. Finally, the *f-score*, a widely used quality measure for binary classifiers, is the harmonic mean of recall and precision, i.e., the probability that the cause is followed by the effect.

Contributions. The goal of this work is to lay the mathematical and algorithmic foundations of probabilistic causation in MDPs based on (C1) and (C2). We introduce strict and global probability-raising causes in MDPs (Section 3). Algorithms are provided to check whether given cause and effect sets satisfy (one of) the probability-raising conditions (Sections 4.1 and 4.2) and to check the existence of causes for a given effect (Section 4.1). In order to evaluate the coverage properties of a cause, we subsequently introduce the above-mentioned quality measures (Section 5.1). We give algorithms for computing these values for given cause-effect relations (Section 5.2) and characterize the computational complexity of finding optimal causes with respect to the different measures (Section 5.3). Table 1 summarizes our complexity results. An extended version of this paper containing the omitted proofs can be found in [8].

### 2 Preliminaries

Throughout the paper, we will assume some familiarity with basic concepts of Markov decision processes. Here, we only present a brief summary of the notations used in the paper. For more details, we refer to [37,9,23].

A *Markov decision process (MDP)* is a tuple M = (S, *Act*, P, init) where S is a finite set of states, *Act* a finite set of actions, init ∈ S the initial state and P : S × *Act* × S → [0, 1] the transition probability function such that Σ<sub>t∈S</sub> P(s, α, t) ∈ {0, 1} for all states s ∈ S and actions α ∈ *Act*. An action α is *enabled* in state s ∈ S if Σ<sub>t∈S</sub> P(s, α, t) = 1. We define *Act*(s) = {α | α is enabled in s}. A state t is *terminal* if *Act*(t) = ∅. A Markov chain (MC) is a special case of an MDP where *Act* is a singleton (we then write P(s, u) rather than P(s, α, u)). A *path* in an MDP M is a (finite or infinite) alternating sequence π = s<sub>0</sub> α<sub>0</sub> s<sub>1</sub> α<sub>1</sub> s<sub>2</sub> ··· ∈ (S × *Act*)<sup>∗</sup> ∪ (S × *Act*)<sup>ω</sup> such that P(s<sub>i</sub>, α<sub>i</sub>, s<sub>i+1</sub>) > 0 for all indices i. A path is called maximal if it is infinite or finite and ends in a terminal state. An MDP can be interpreted as a Kripke structure in which transitions go from states to probability distributions over states.
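To make the definition concrete, here is a minimal Python sketch of an MDP as a data structure. The class name `MDP`, the dictionary encoding of P and the tiny example model are assumptions made for illustration; exact `Fraction` arithmetic is used so that the "sums to 0 or 1" condition can be tested without rounding issues.

```python
from fractions import Fraction

class MDP:
    def __init__(self, states, actions, P, init):
        # P maps (s, action, t) to a probability; missing triples count as 0.
        self.states, self.actions, self.P, self.init = states, actions, P, init
        for s in states:
            for a in actions:
                total = sum(P.get((s, a, t), 0) for t in states)
                if total not in (0, 1):
                    raise ValueError(f"probabilities of ({s}, {a}) sum to {total}")

    def enabled(self, s):
        """Act(s): actions whose outgoing probabilities from s sum to 1."""
        return {a for a in self.actions
                if sum(self.P.get((s, a, t), 0) for t in self.states) == 1}

    def is_terminal(self, s):
        """A state is terminal if no action is enabled in it."""
        return not self.enabled(s)

half = Fraction(1, 2)
m = MDP(states={"init", "s", "goal"}, actions={"a", "b"},
        P={("init", "a", "s"): half, ("init", "a", "goal"): half,
           ("init", "b", "goal"): 1,
           ("s", "a", "init"): 1},
        init="init")
print(sorted(m.enabled("init")), sorted(m.enabled("s")), m.is_terminal("goal"))
```

In this example both actions are enabled in `init`, only `a` is enabled in `s`, and `goal` is terminal. A Markov chain is obtained by passing a single action.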

A *(randomized) scheduler* S is a function that maps each finite non-maximal path s<sub>0</sub> α<sub>0</sub> ... α<sub>n−1</sub> s<sub>n</sub> to a distribution over *Act*(s<sub>n</sub>). S is called deterministic if S(π) is a Dirac distribution for all finite non-maximal paths π. If the chosen action only depends on the last state of the path, S is called *memoryless*. We write MR for the class of memoryless (randomized) and MD for the class of memoryless deterministic schedulers. *Finite-memory* schedulers are those that are representable by a finite-state automaton.

The scheduler S of M induces a (possibly infinite) Markov chain. We write Pr<sup>S</sup><sub>M,s</sub> for the standard probability measure on measurable sets of maximal paths in the Markov chain induced by S with initial state s. If ϕ is a measurable set of maximal paths, then Pr<sup>max</sup><sub>M,s</sub>(ϕ) and Pr<sup>min</sup><sub>M,s</sub>(ϕ) denote the supremum resp. infimum of the probabilities for ϕ under all schedulers. We use the abbreviation Pr<sup>S</sup><sub>M</sub> = Pr<sup>S</sup><sub>M,init</sub> and the notations Pr<sup>max</sup><sub>M</sub> and Pr<sup>min</sup><sub>M</sub> for extremal probabilities. Analogous notations will be used for expectations. So, if f is a random variable, then, e.g., E<sup>S</sup><sub>M</sub>(f) denotes the expectation of f under S and E<sup>max</sup><sub>M</sub>(f) its supremum over all schedulers. We use LTL-like temporal modalities such as ♦ (eventually) and U (until) to denote path properties. For X, T ⊆ S the formula X U T is satisfied by paths π = s<sub>0</sub> s<sub>1</sub> ... such that there exists j ≥ 0 with s<sub>j</sub> ∈ T and s<sub>i</sub> ∈ X for all i < j, and ♦T = S U T. It is well-known that Pr<sup>min</sup><sub>M</sub>(X U T) and Pr<sup>max</sup><sub>M</sub>(X U T) and corresponding optimal MD-schedulers are computable in polynomial time.
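As an illustration of extremal reachability probabilities, the sketch below approximates the maximal probability of ♦T by value iteration. Note that this is a swapped-in technique chosen for brevity: it only approximates the values and is not the polynomial-time method the text refers to. The function name `max_reach`, the encoding of the model and the example are assumptions.

```python
def max_reach(states, enabled, P, target, sweeps=1000):
    """Approximate the maximal probability of reaching `target` from every state.

    enabled[s] lists the actions available in s; P[(s, a)] is a dict {t: prob}.
    """
    v = {s: 1.0 if s in target else 0.0 for s in states}
    for _ in range(sweeps):
        # One Bellman update: best expected value over the enabled actions.
        v = {s: 1.0 if s in target else
                max((sum(p * v[t] for t, p in P[(s, a)].items())
                     for a in enabled[s]), default=0.0)
             for s in states}
    return v

states = ["init", "s", "goal", "fail"]
enabled = {"init": ["a", "b"], "s": ["a"], "goal": [], "fail": []}
P = {("init", "a"): {"goal": 0.5, "fail": 0.5},
     ("init", "b"): {"s": 1.0},
     ("s", "a"): {"goal": 0.9, "fail": 0.1}}

v = max_reach(states, enabled, P, target={"goal"})
print(v["init"], v["s"])
```

In the example, choosing action `b` in `init` is optimal, since it leads to `s`, from which the goal is reached with probability 0.9 rather than 0.5.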

If s ∈ S and α ∈ *Act*(s), then (s, α) is said to be a state-action pair of M. An *end component* (EC) of an MDP M is a strongly connected sub-MDP containing at least one state-action pair. ECs will be often identified with the set of their state-action pairs. An EC E is called maximal (abbreviated MEC) if there is no proper superset E′ of (the set of state-action pairs of) E which is an EC.

### 3 Strict and global probability-raising causes

We now provide formal definitions for cause-effect relations in MDPs which rely on the probability-raising (PR) principle as stated by (C1) and (C2) in the introduction. We focus on the case where both causes and effects are state properties, i.e., sets of states.

In the sequel, let M = (S,*Act*,P,init) be an MDP and Eff ⊆ S\ {init} a nonempty set of terminal states. (As the effect set is fixed, for the analysis of cause-effect relationships in M it suffices to assume all effect states are terminal by (C2).) Furthermore, we may assume that every state s ∈ S is reachable from init.

We consider here two variants of the probability-raising condition: the global setting treats the set Cause as a unit, while the strict view requires the probability-raising condition for all states in Cause individually.

Definition 1 (Global and strict probability-raising cause (GPR/SPR cause)). *Let* M *and* Eff *be as above and* Cause *a nonempty subset of* S \Eff*. Then,* Cause *is said to be a* GPR cause *for* Eff *iff the following two conditions (G) and (M) hold:*

*(G) For each scheduler* S *where* Pr<sup>S</sup><sub>M</sub>(♦Cause) > 0*:*

$$\Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Eff} \mid \Diamond\mathsf{Cause}) > \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Eff}). \tag{GPR}$$

*(M) For each* c ∈ Cause*, there is a scheduler* S *with* Pr<sup>S</sup><sub>M</sub>((¬Cause) U c) > 0*.*

Cause *is called an* SPR cause *for* Eff *iff (M) and the following condition (S) hold:*

*(S) For each state* c ∈ Cause *and each scheduler* S *where* Pr<sup>S</sup><sub>M</sub>((¬Cause) U c) > 0*:*

$$\Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Eff} \mid (\neg\mathsf{Cause})\,\mathsf{U}\,c) > \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Eff}). \tag{SPR}$$

Condition (M) can be seen as a minimality requirement as states c ∈ Cause which are not accessible from init without traversing other states in Cause could be omitted without affecting the true positives (events where an effect state is reached after visiting a cause state, "covered effects") or false negatives (events where an effect state is reached without visiting a cause state before, "uncovered effect"). More concretely, whenever a set C ⊆ S \ Eff satisfies conditions (G) or (S) then the set Cause of states c ∈ C where M has a path from init satisfying (¬C)Uc is a GPR resp. an SPR cause.

### 3.1 Examples and simple properties of probability-raising causes

We first observe that SPR/GPR causes cannot contain the initial state init, since otherwise an equality instead of an inequality would hold in (GPR) and (SPR). Furthermore as a direct consequence of the definitions and using the equivalence of the LTL formulas ♦Cause and (¬Cause)UCause we obtain:

Lemma 1 (Singleton PR causes). *If* Cause *is a singleton then* Cause *is an SPR cause for* Eff *if and only if* Cause *is a GPR cause for* Eff*.*

As the event ♦Cause is a disjoint union of all events (¬Cause)Uc with c ∈ Cause, the probability for covered effects Pr<sup>S</sup><sub>M</sub>(♦Eff | ♦Cause) is a weighted average of the probabilities Pr<sup>S</sup><sub>M</sub>(♦Eff | (¬Cause)Uc) for c ∈ Cause. This yields:

Lemma 2 (Strict implies global). *Every SPR cause for* Eff *is a GPR cause for* Eff*.*
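The weighted-average argument behind Lemma 2 can be spelled out as follows. This is only a sketch for a fixed scheduler S with Pr<sup>S</sup><sub>M</sub>(♦Cause) > 0, where the sums range over those c ∈ Cause with Pr<sup>S</sup><sub>M</sub>((¬Cause)Uc) > 0 and the strict inequality uses condition (S) for each such c:

$$\Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Eff} \mid \Diamond\mathsf{Cause}) = \sum_{c} \frac{\Pr_{\mathcal{M}}^{\mathfrak{S}}((\neg\mathsf{Cause})\,\mathrm{U}\,c)}{\Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Cause})} \cdot \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Eff} \mid (\neg\mathsf{Cause})\,\mathrm{U}\,c) > \sum_{c} \frac{\Pr_{\mathcal{M}}^{\mathfrak{S}}((\neg\mathsf{Cause})\,\mathrm{U}\,c)}{\Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Cause})} \cdot \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Eff}) = \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Eff}).$$

The weights are positive and sum to 1 because the events (¬Cause)Uc are pairwise disjoint and their union is ♦Cause.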

*Example 1 (Non-strict GPR cause).* Consider the Markov chain M depicted below where the nodes represent states and the directed edges represent transitions labeled with their respective probabilities. Let Eff = {eff}. Then, Pr<sub>M</sub>(♦Eff) = 1/3 + 1/3 · 1/4 + 1/12 = 1/2, Pr<sub>M</sub>(♦Eff | ♦c<sub>1</sub>) = Pr<sub>M,c<sub>1</sub></sub>(♦eff) = 1 and Pr<sub>M</sub>(♦Eff | ♦c<sub>2</sub>) = Pr<sub>M,c<sub>2</sub></sub>(♦eff) = 1/4. Thus, {c<sub>1</sub>} is both an SPR and a GPR cause for Eff, while {c<sub>2</sub>} is not. The set Cause = {c<sub>1</sub>, c<sub>2</sub>} is a non-strict GPR cause for Eff as:

$$\Pr_{\mathcal{M}}(\Diamond\mathsf{Eff} \mid \Diamond\mathsf{Cause}) = \frac{\frac{1}{3} + \frac{1}{3} \cdot \frac{1}{4}}{\frac{1}{3} + \frac{1}{3}} = \frac{5/12}{2/3} = \frac{5}{8} > \frac{1}{2} = \Pr_{\mathcal{M}}(\Diamond\mathsf{Eff}).$$

The second condition (M) is obviously fulfilled. Non-strictness follows from the fact that the SPR condition does not hold for state c<sub>2</sub>. ◁
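The numbers of Example 1 can be re-derived mechanically. The sketch below assumes a concrete Markov chain, since the figure is not reproduced here: the transition probabilities in `P` (including the state name `noeff`) are hypothetical and were chosen only so that they reproduce the values stated in the example.

```python
from fractions import Fraction as F

# Hypothetical transition probabilities, chosen to match the numbers in Example 1.
P = {
    "init": {"c1": F(1, 3), "c2": F(1, 3), "eff": F(1, 12), "noeff": F(1, 4)},
    "c1": {"eff": F(1)},
    "c2": {"eff": F(1, 4), "noeff": F(3, 4)},
}

def reach_eff(state):
    """Probability of eventually reaching eff from `state` (the chain is acyclic)."""
    if state == "eff":
        return F(1)
    return sum(p * reach_eff(t) for t, p in P.get(state, {}).items())

pr_eff = reach_eff("init")

# In this chain the cause states are entered directly from init, so
# Pr((not Cause) U c) = P(init, c) and Pr(Eff | (not Cause) U c) = reach_eff(c).
hit = {c: P["init"][c] for c in ("c1", "c2")}
pr_cause = sum(hit.values())
pr_eff_given_cause = sum(hit[c] * reach_eff(c) for c in hit) / pr_cause

gpr = pr_eff_given_cause > pr_eff                 # global condition for {c1, c2}
spr = all(reach_eff(c) > pr_eff for c in hit)     # strict condition, state by state
print(pr_eff, pr_eff_given_cause, gpr, spr)
```

Exact rational arithmetic is used so that the strict inequalities are not blurred by floating-point rounding.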

*Example 2 (Probability-raising causes might not exist).* PR causes might not exist, even if M is a Markov chain. This applies, e.g., to the Markov chain M with two states init and eff where P(init, eff) = 1 and the effect set Eff = {eff}. The only cause candidate is the singleton {init}. However, the strict inequality in (GPR) or (SPR) does not hold for Cause = {init}. The same phenomenon occurs if all non-terminal states of a Markov chain reach the effect states with the same probability. In such cases, however, the nonexistence of PR causes is well justified as the events ♦Eff and ♦Cause are stochastically independent for every set Cause ⊆ S \ Eff. ◁

*Remark 1 (Memory needed for refuting PR condition).* Let M be the MDP in Figure 1, where the notation is similar to Example 1 with the addition of actions α, β and γ. Let Cause = {c} and Eff = {eff}. Only state s has a nondeterministic choice. Cause is not a PR cause. To see this, consider the deterministic scheduler T that schedules β only for the first visit of s and α for the second visit of s. Then:

$$\Pr_{\mathcal{M}}^{\mathfrak{T}}(\Diamond\mathsf{eff}) = \frac{1}{2} \cdot \frac{1}{2} + \frac{1}{2} \cdot \frac{1}{2} \cdot 1 \cdot \frac{1}{4} = \frac{5}{16} > \frac{1}{4} = \Pr_{\mathcal{M}}^{\mathfrak{T}}(\Diamond\mathsf{eff} \mid \Diamond\mathsf{c}).$$

Fig. 1. MDP M from Remark 1

Fig. 2. MDP M from Remark 2

Denote the MR schedulers reaching c with positive probability as S<sub>λ</sub> with S<sub>λ</sub>(s)(α) = λ and S<sub>λ</sub>(s)(β) = 1−λ for some λ ∈ [0, 1[. Then, Pr<sup>S<sub>λ</sub></sup><sub>M,s</sub>(♦eff) > 0 and:

$$\Pr\_{\mathcal{M}}^{\mathfrak{S}\_{\lambda}}(\diamond \text{eff}) = \frac{1}{2} \cdot \Pr\_{\mathcal{M},s}^{\mathfrak{S}\_{\lambda}}(\diamond \text{eff}) < \Pr\_{\mathcal{M},s}^{\mathfrak{S}\_{\lambda}}(\diamond \text{eff}) = \Pr\_{\mathcal{M},\mathfrak{c}}^{\mathfrak{S}\_{\lambda}}(\diamond \text{eff}) = \Pr\_{\mathcal{M}}^{\mathfrak{S}\_{\lambda}}(\diamond \text{eff}|\diamond \text{c})$$

Thus, the SPR/GPR condition holds for Cause and Eff under all memoryless schedulers reaching Cause with positive probability, although Cause is not a PR cause. ◁

*Remark 2 (Randomization needed for refuting PR condition).* Consider the MDP M of Figure 2. Let Eff = {eff<sub>unc</sub>, eff<sub>cov</sub>} and Cause = {c}. The two MD-schedulers S<sub>α</sub> and S<sub>β</sub> that select α resp. β for the initial state init are the only deterministic schedulers. As S<sub>α</sub> does not reach c, it is irrelevant for the SPR or GPR condition. S<sub>β</sub> satisfies (SPR) and (GPR) as Pr<sup>S<sub>β</sub></sup><sub>M</sub>(♦Eff | ♦c) = 1/2 > 1/4 = Pr<sup>S<sub>β</sub></sup><sub>M</sub>(♦Eff). The MR scheduler T which selects α and β with probability 1/2 each in init reaches c with positive probability and violates (SPR) and (GPR) as Pr<sup>T</sup><sub>M</sub>(♦Eff | ♦c) = 1/2 < 5/8 = 1/2 + 1/2 · 1/2 · 1/2 = Pr<sup>T</sup><sub>M</sub>(♦Eff). ◁
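The effect of randomization in Remark 2 can be explored numerically. The following sketch assumes a concrete shape for the MDP of Figure 2, which is not reproduced here: the transitions described in the docstring are a reconstruction from the probabilities quoted in the remark, and the function names are hypothetical.

```python
from fractions import Fraction as F

def effect_probs(p_beta):
    """Return (Pr(Eff | reach c), Pr(Eff)) when init picks beta with probability p_beta.

    Assumed shape of the MDP (reconstructed from the numbers in Remark 2):
      alpha: init -> eff_unc with probability 1
      beta:  init -> c with probability 1/2, otherwise a terminal non-effect state
      c:     -> eff_cov with probability 1/2, otherwise a terminal non-effect state
    """
    pr_c = p_beta * F(1, 2)
    pr_eff = (1 - p_beta) + pr_c * F(1, 2)
    # The conditional probability is only meaningful if pr_c > 0.
    return F(1, 2), pr_eff

def pr_condition_holds(p_beta):
    """Check the probability-raising inequality for the scheduler given by p_beta > 0."""
    conditional, total = effect_probs(p_beta)
    return conditional > total

print(pr_condition_holds(F(1)))     # deterministic scheduler S_beta
print(pr_condition_holds(F(1, 2)))  # randomized scheduler T
```

Under these assumptions the inequality fails exactly for 0 < p_beta ≤ 2/3, so every witness of the violation has to randomize in init.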

*Remark 3 (Cause-effect relations for regular classes of schedulers).* The definitions of PR causes in MDPs impose constraints for all schedulers reaching a cause state. This condition is fairly strong and might lead to the phenomenon that no PR cause exists. However, replacing M with an MDP resulting from the synchronous parallel composition of M with a deterministic finite automaton representing a regular constraint on the scheduled state-action sequences (e.g., "alternate between actions α and β in state s" or "take α on every third visit to state s and actions β or γ otherwise") leads to a weaker notion of PR causality. This can be useful to obtain more detailed information on cause-effect relationships in special scenarios, for example at design time, where multiple scenarios (regular classes of schedulers) are considered, or for a post-hoc analysis. For the latter, one seeks causes of an occurred effect, and the information about the scheduled actions is either extractable from log files or gathered by a monitor. ◁

*Remark 4 (Action causality and other forms of PR causality).* Our notions of PR causes are purely state-based with conditions that compare probabilities under the same scheduler. However, in combination with model transformations, the proposed notions are also applicable for reasoning about other forms of PR causality.

Suppose the task is to check whether taking action α in state s raises the effect probabilities compared to never scheduling α in state s. Let M<sub>0</sub> and M<sub>1</sub> be copies of M with the following modifications: In M<sub>0</sub>, the only enabled action of state s is α, while in M<sub>1</sub> the enabled actions of state s are the elements of *Act*<sub>M</sub>(s) \ {α}. Let now N be the MDP whose initial state has a single enabled action and moves with probability 1/2 each to M<sub>0</sub> and M<sub>1</sub>. Then, action α raises the effect probability in M iff the initial state of M<sub>0</sub> constitutes an SPR cause in N. This idea can be generalized to check whether scheduler classes satisfying a regular constraint have higher effect probability compared to all other schedulers. In this case, we can deal with an MDP N as above where M<sub>0</sub> and M<sub>1</sub> are defined as the synchronous product of deterministic finite automata and M. ◁
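The construction of N from the two restricted copies can be sketched in a few lines. The encoding of an MDP as a nested dictionary (state → action → successor → probability) and the names `start`, `go` and `action_causality_mdp` are assumptions made for illustration. The sketch also assumes that α is enabled in s and that s has at least one other action, since otherwise s becomes terminal in one of the copies.

```python
from fractions import Fraction as F

def action_causality_mdp(mdp, init, s, alpha):
    """Build N from two tagged copies of `mdp` with restricted actions in state s."""
    def tagged_copy(tag, keep_at_s):
        # Copy every state as (tag, state); in state s keep only the selected actions.
        return {
            (tag, u): {
                a: {(tag, t): p for t, p in dist.items()}
                for a, dist in acts.items()
                if u != s or keep_at_s(a)
            }
            for u, acts in mdp.items()
        }

    n = tagged_copy(0, lambda a: a == alpha)        # M_0: only alpha enabled in s
    n.update(tagged_copy(1, lambda a: a != alpha))  # M_1: alpha removed in s
    # Fresh initial state with a single action that moves to either copy.
    n["start"] = {"go": {(0, init): F(1, 2), (1, init): F(1, 2)}}
    return n

# Tiny hypothetical MDP: in s, action a reaches eff and action b reaches noeff.
m = {"s": {"a": {"eff": F(1)}, "b": {"noeff": F(1)}}, "eff": {}, "noeff": {}}
n = action_causality_mdp(m, "s", "s", "a")
print(sorted(n[(0, "s")]), sorted(n[(1, "s")]))
```

The effect set of N then consists of the copies of the effect states in both components, and the cause candidate is the singleton containing `(0, init)`.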

### 3.2 Related work

Previous work in the direction of probabilistic causation in stochastic operational models has mainly concentrated on Markov chains. Kleinberg [24,25] introduced *prima facie causes* in finite Markov chains where both causes and effects are formalized as PCTL state formulae, and thus they can be seen as sets of states as in our approach. The correspondence of Kleinberg's PCTL constraints for prima facie causes and the strict probability-raising condition formalized using conditional probabilities has been worked out in the survey article [5]. Our notion of SPR causes corresponds to Kleinberg's prima facie causes, except for the minimality condition (M). Ábrahám et al. [1] introduce a hyperlogic for Markov chains and give a formalization of probabilistic causation in Markov chains as a hyperproperty, which is consistent with Kleinberg's prima facie causes, and with SPR causes up to minimality. Cause-effect relations in Markov chains where effects are ω-regular properties have been introduced in [6]. The notion relies on the strict probability-raising condition, but requires completeness in the sense that every path where the effect occurs has a prefix in the cause set. The paper [6] permits a non-strict inequality in the SPR condition with the consequence that causes always exist, which is not the case for our notions.

The survey article [5] introduces notions of global probability-raising causes for Markov chains where causes and effects can be path properties. The notion of *reachability causes* in Markov chains from [5] directly corresponds to our notion of GPR causes, the only difference being that [5] deals with a relaxed minimality condition and requires that the cause set is reachable without visiting an effect state before. The latter is inherent in our approach as we suppose that all states are reachable and the effect states are terminal.

To the best of our knowledge, probabilistic causation in MDPs has not been studied before. The only work in this direction we are aware of is the recent paper by Dimitrova et al. [17] on a hyperlogic, called PHL, for MDPs. While the paper focuses on the foundation of PHL, it contains an example illustrating how action causality can be formalized as a PHL formula. Roughly, the presented formula expresses that taking a specific action α increases the probability for reaching effect states. Thus, it also relies on the probability-raising principle, but compares the "effect probabilities" under different schedulers (which either schedule α or not) rather than comparing probabilities under the same scheduler as in our PR condition. However, as Remark 4 argues, to some extent our notions of PR causes can reason about action causality as well.

There has also been work on causality-based explanations of counterexamples in probabilistic models [27,28]. The underlying causality notion of this work, however, relies on the non-probabilistic counterfactual principle rather than the probability-raising condition. The same applies to the notions of forward and backward responsibility in stochastic games in extensive form introduced in the recent work [7].

### 4 Checking the existence of PR causes and the PR conditions

We now turn to algorithms for checking whether a given set Cause is an SPR or GPR cause for Eff. As condition (M) of SPR and GPR causes is verifiable by standard model checking techniques in polynomial time, we concentrate on checking the probability-raising conditions (SPR) and (GPR). For Markov chains, both (SPR) and (GPR) can be checked in polynomial time by computing the corresponding probabilities. So, the interesting case is checking the PR conditions in MDPs.

We start by stating that for the SPR and GPR condition, it suffices to consider schedulers minimizing the probability to reach an effect state from every cause state.

Notation 1 (MDP with minimal effect probabilities from cause candidates). If C ⊆ S then we write M[C] for the MDP resulting from M by removing all enabled actions of the states in C. Instead, M[C] has a new action γ that is enabled exactly in the states s ∈ C with the transition probabilities P<sub>M[C]</sub>(s, γ, eff) = Pr<sup>min</sup><sub>M,s</sub>(♦Eff) and P<sub>M[C]</sub>(s, γ, noeff) = 1 − Pr<sup>min</sup><sub>M,s</sub>(♦Eff). Here, eff is a fixed state in Eff and noeff a (possibly fresh) terminal state not in Eff. We write M[c] if C = {c} is a singleton.

Lemma 3. *Let* M = (S,*Act*,P,init) *be an MDP and* Eff ⊆ S *a set of terminal states. Let* Cause ⊆ S \Eff*. Then,* Cause *is an SPR cause (resp. a GPR cause) for* Eff *in* M *if and only if* Cause *is an SPR cause (resp. a GPR cause) for* Eff *in* M[Cause] *.*

### 4.1 Checking the strict probability-raising condition and the existence of causes

The basis for both checking the existence of PR causes and checking the SPR condition for a given cause candidate is the following polynomial-time algorithm, which checks whether the SPR condition holds in a given state c of M for all schedulers S with Pr<sup>S</sup><sub>M</sub>(♦c) > 0:

Algorithm 2. Input: state c ∈ S, set of terminal states Eff ⊆ S.

Task: Decide whether (SPR) holds in c for all schedulers S.

Compute w<sub>c</sub> = Pr<sup>min</sup><sub>M,c</sub>(♦Eff) and q<sub>s</sub> = Pr<sup>max</sup><sub>M[c],s</sub>(♦Eff) for each state s in M[c].


3.1 If c is reachable from init in M<sup>max</sup><sub>[c]</sub>, then return "no, (SPR) does not hold for c".

3.2 If c is not reachable from init in M<sup>max</sup><sub>[c]</sub>, then return "yes, (SPR) holds for c".
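A minimal executable sketch of this check is given below. Several points are assumptions rather than part of the algorithm as stated: the case split on q<sub>init</sub> versus w<sub>c</sub> is read off the soundness proof of Lemma 4, M<sup>max</sup><sub>[c]</sub> is taken to be the sub-MDP of M[c] that keeps only the actions attaining q<sub>s</sub>, and reachability probabilities are computed by a bounded value iteration that is exact only for acyclic MDPs (a general implementation would use linear programming). The dictionary encoding and the names `reach`, `spr_holds` and `noeff_fresh` are hypothetical.

```python
from fractions import Fraction as F

def reach(mdp, targets, opt):
    """Value iteration for min/max reachability; exact only for acyclic MDPs (assumption)."""
    v = {s: F(int(s in targets)) for s in mdp}
    for _ in range(len(mdp)):
        v = {
            s: v[s] if s in targets or not mdp[s] else
            opt(sum(p * v[t] for t, p in dist.items()) for dist in mdp[s].values())
            for s in mdp
        }
    return v

def spr_holds(mdp, init, c, eff):
    """Decide (SPR) for state c; `mdp` maps state -> action -> successor -> probability."""
    w_c = reach(mdp, eff, min)[c]
    # M[c]: c keeps a single action gamma reaching an effect state with probability w_c.
    n = {s: dict(acts) for s, acts in mdp.items()}
    n["noeff_fresh"] = {}
    n[c] = {"gamma": {sorted(eff)[0]: w_c, "noeff_fresh": 1 - w_c}}
    q = reach(n, eff, max)
    if q[init] != w_c:
        return q[init] < w_c  # cases 1 and 2
    # Case 3 (q_init = w_c): follow only actions that attain q_s, then test
    # whether c is reachable from init in the resulting sub-MDP.
    seen, stack = {init}, [init]
    while stack:
        s = stack.pop()
        for dist in n[s].values():
            if sum(p * q[t] for t, p in dist.items()) == q[s]:
                for t, p in dist.items():
                    if p > 0 and t not in seen:
                        seen.add(t)
                        stack.append(t)
    return c not in seen

# Hypothetical Markov chain matching the numbers of Example 1.
mc = {
    "init": {"a": {"c1": F(1, 3), "c2": F(1, 3), "eff": F(1, 12), "noeff": F(1, 4)}},
    "c1": {"a": {"eff": F(1)}},
    "c2": {"a": {"eff": F(1, 4), "noeff": F(3, 4)}},
    "eff": {}, "noeff": {},
}
print(spr_holds(mc, "init", "c1", {"eff"}), spr_holds(mc, "init", "c2", {"eff"}))
```

The comparison of q<sub>init</sub> and w<sub>c</sub> relies on exact rational arithmetic. With floating-point values, the equality case would need a tolerance or a symbolic method.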

Lemma 4. *Algorithm 2 is sound and runs in polynomial time.*

*Soundness.* Let N = M[c]. Soundness is obvious in case 1. For case 2, consider a real number λ with 1 > λ > w<sub>c</sub>/q<sub>init</sub> and MD-schedulers T and S with Pr<sup>T</sup><sub>N,s</sub>(♦Eff) = q<sub>s</sub> for all states s and Pr<sup>S</sup><sub>N</sub>(♦c) > 0. We can combine T and S to a new MR-scheduler U with the property that Pr<sup>U</sup><sub>N</sub>(♦t) = λ Pr<sup>T</sup><sub>N</sub>(♦t) + (1−λ) Pr<sup>S</sup><sub>N</sub>(♦t) for all terminal states t and for t = c. Then, U witnesses a violation of (SPR). For case 3.1, consider an MD-scheduler S of M<sup>max</sup><sub>[c]</sub> where c is reachable from init via an S-path and Pr<sup>S</sup><sub>N,s</sub>(♦Eff) = q<sub>s</sub> for all states s. Then, (SPR) does not hold for c under the scheduler S. In case 3.2, we have Pr<sup>S</sup><sub>N</sub>(♦c) = 0 for all schedulers S for N with Pr<sup>S</sup><sub>N</sub>(♦Eff) = q<sub>init</sub> = w<sub>c</sub>. But then Pr<sup>S</sup><sub>N</sub>(♦c) > 0 implies Pr<sup>S</sup><sub>N</sub>(♦Eff) < w<sub>c</sub> as required in (SPR). □

By applying Algorithm 2 to all states c ∈ Cause and standard algorithms to check the existence of a path satisfying (¬Cause)Uc for every state c ∈ Cause, we obtain:

Theorem 3 (Checking SPR causes). *The problem "given* M*,* Cause *and* Eff*, check whether* Cause *is an SPR cause for* Eff *in* M*" is solvable in polynomial time.*

*Remark 5 (Memory requirements for refuting the SPR property).* As the soundness proof for Algorithm 2 shows: If Cause does not satisfy the SPR condition, then there is an MR-scheduler S for M[Cause] witnessing the violation of (SPR). Scheduler S corresponds to a finite-memory (randomized) scheduler T with two memory cells for M: "before Cause" (where T behaves as S) and "after Cause" (where T behaves as an MD-scheduler minimizing the effect probability from every state). ◁

Lemma 5 (Criterion for the existence of probability-raising causes). *Let* M *be an MDP and* Eff *a nonempty set of states. Then* Eff *has an SPR cause in* M *iff* Eff *has a GPR cause in* M *iff there is a state* c<sub>0</sub> ∈ S \ Eff *such that the singleton* {c<sub>0</sub>} *is an SPR cause (and therefore a GPR cause) for* Eff *in* M*. In particular, the existence of SPR/GPR causes can be checked with Algorithm 2 in polynomial time.*

### 4.2 Checking the global probability-raising condition

Theorem 4. *The problem "given* M*,* Cause *and* Eff*, check whether* Cause *is a GPR cause for* Eff *in* M*" is solvable in polynomial space.*

In order to provide an algorithm, we perform a model transformation after which the violation of (GPR) by a scheduler S can be expressed solely in terms of the expected frequencies of the state-action pairs of the transformed MDP under S. This allows us to express the existence of a scheduler witnessing the non-causality of Cause in terms of the satisfiability of a quadratic constraint system. Then we can restrict the quantification in (G) to MR-schedulers in the transformed model. We trace back the memory requirements to M[Cause] and to the original MDP M yielding the second main result. Still, memory can be necessary to witness non-causality (Remark 1).

Theorem 5. *Let* M *be an MDP with effect set* Eff *as before and* Cause *a set of non-effect states such that condition (M) holds. If* Cause *is not a GPR cause for* Eff*, then there is an MR-scheduler for* M[Cause] *refuting the GPR condition for* Cause *in* M[Cause] *and a finite-memory scheduler for* M *with two memory cells refuting the GPR condition for* Cause *in* M*.*

The remainder of this section is concerned with the proofs of Theorem 4 and Theorem 5. We suppose that both the effect set Eff and the cause candidate Cause are fixed disjoint subsets of the state space of the MDP M and that Cause satisfies (M).

Checking the GPR condition (Proof of Theorem 4). The first step is a polynomial-time model transformation which permits us to make the following assumptions when checking the GPR condition of Cause for Eff.


Intuitively, eff<sub>cov</sub> stands for covered effects ("Eff after Cause") and can be seen as a true positive, while eff<sub>unc</sub> represents the uncovered effects ("Eff without preceding Cause") and corresponds to a false negative. Let S be a scheduler in M. Note that Pr<sup>S</sup><sub>M</sub>((¬Cause)UEff) = Pr<sup>S</sup><sub>M</sub>(♦eff<sub>unc</sub>) and Pr<sup>S</sup><sub>M</sub>(♦(Cause ∧ ♦Eff)) = Pr<sup>S</sup><sub>M</sub>(♦eff<sub>cov</sub>). As the cause states cannot reach each other, we also have Pr<sup>S</sup><sub>M</sub>((¬Cause)Uc) = Pr<sup>S</sup><sub>M</sub>(♦c) for each c ∈ Cause. The intuitive meaning of noeff<sub>fp</sub> is a false positive ("no effect after Cause"), while noeff<sub>tn</sub> stands for true negatives where neither the effect nor the cause is observed. Note that Pr<sup>S</sup><sub>M</sub>(♦(Cause ∧ ¬♦Eff)) = Pr<sup>S</sup><sub>M</sub>(♦noeff<sub>fp</sub>) and Pr<sup>S</sup><sub>M</sub>(¬♦Cause ∧ ¬♦Eff) = Pr<sup>S</sup><sub>M</sub>(♦noeff<sub>tn</sub>).

*Justification of assumptions (A1)-(A3):* The assumptions are justified because M can be transformed into a new MDP of the same asymptotic size that satisfies them. Thanks to Lemma 3, we may suppose that M = M[Cause] (see Notation 1) without changing the satisfaction of the GPR condition. We then may rename the effect state eff and the non-effect state noeff reachable from Cause into eff<sub>cov</sub> and noeff<sub>fp</sub>, respectively. Furthermore, we collapse all other effect states into a single state eff<sub>unc</sub> and all true negative states into noeff<sub>tn</sub>. Similarly, by renaming and possibly duplicating terminal states we also suppose that noeff<sub>fp</sub> has no other incoming transitions than the γ-transitions from the states in Cause. This ensures (A1) and (A2). For (A3) consider the set T of terminal states in the MDP obtained so far. We remove all end components by switching to the MEC-quotient [2], i.e., we collapse all states that belong to the same MEC E into a single state s<sub>E</sub> while ignoring the actions inside E. Additionally, we add a fresh τ-transition from the states s<sub>E</sub> to noeff<sub>tn</sub> (i.e., P(s<sub>E</sub>, τ, noeff<sub>tn</sub>) = 1). The τ-transitions from states s<sub>E</sub> to noeff<sub>tn</sub> mimic cases where schedulers of the original MDP eventually enter an end component and stay there forever with positive probability.

With assumptions (A1)-(A3), the GPR condition can be reformulated as follows:

Lemma 6. *Under assumptions (A1)-(A3),* Cause *satisfies the GPR condition if and only if for each scheduler* S *with* Pr<sup>S</sup><sub>M</sub>(♦Cause) > 0 *the following condition holds:*

$$\operatorname{Pr}\_{\mathcal{M}}^{\mathfrak{S}}(\diamond \mathtt{Cause}) \cdot \operatorname{Pr}\_{\mathcal{M}}^{\mathfrak{S}}(\diamond \mathtt{eff}\_{\mathtt{unc}}) < \left(1 - \operatorname{Pr}\_{\mathcal{M}}^{\mathfrak{S}}(\diamond \mathtt{Cause})\right) \cdot \sum\_{\mathfrak{c} \in \mathtt{Cause}} \operatorname{Pr}\_{\mathcal{M}}^{\mathfrak{S}}(\diamond \mathtt{c}) \cdot \operatorname{w}\_{\mathfrak{c}} \quad (\operatorname{GPR-1})$$
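The step from (GPR) to (GPR-1) can be sketched as follows for a fixed scheduler S with Pr<sup>S</sup><sub>M</sub>(♦Cause) > 0. The sketch uses only the identities noted above, namely that covered effects end in eff<sub>cov</sub>, uncovered effects end in eff<sub>unc</sub>, and each cause state c moves to eff<sub>cov</sub> with probability w<sub>c</sub>:

$$\begin{aligned} \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Eff} \mid \Diamond\mathsf{Cause}) > \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Eff}) &\iff \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{eff}_{\mathsf{cov}}) > \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Cause}) \cdot \bigl(\Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{eff}_{\mathsf{unc}}) + \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{eff}_{\mathsf{cov}})\bigr) \\ &\iff \bigl(1 - \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Cause})\bigr) \cdot \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{eff}_{\mathsf{cov}}) > \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Cause}) \cdot \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{eff}_{\mathsf{unc}}), \end{aligned}$$

where the first equivalence multiplies both sides by Pr<sup>S</sup><sub>M</sub>(♦Cause) > 0, and (GPR-1) follows by substituting

$$\Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{eff}_{\mathsf{cov}}) = \sum_{c \in \mathsf{Cause}} \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond c) \cdot w_{c}.$$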

With assumptions (A1)-(A3), a terminal state of M is reached almost surely under any scheduler after finitely many steps in expectation. Given a scheduler S for M, the expected frequencies (i.e., expected number of occurrences in maximal paths) of state-action pairs (s,α), states s ∈ S and state-sets T ⊆ S under S are defined by:

$$\begin{aligned} &\mathit{freq}_{\mathfrak{S}}(s, \alpha) \stackrel{\text{def}}{=} \mathrm{E}_{\mathcal{M}}^{\mathfrak{S}}(\text{number of visits to } s \text{ in which } \alpha \text{ is taken}) \\ &\mathit{freq}_{\mathfrak{S}}(s) \stackrel{\text{def}}{=} \sum_{\alpha \in \mathit{Act}(s)} \mathit{freq}_{\mathfrak{S}}(s, \alpha), \qquad \mathit{freq}_{\mathfrak{S}}(T) \stackrel{\text{def}}{=} \sum_{s \in T} \mathit{freq}_{\mathfrak{S}}(s). \end{aligned}$$

Let T be one of the sets {eff<sub>cov</sub>}, {eff<sub>unc</sub>}, Cause, or a singleton {c} with c ∈ Cause. As T is visited at most once during each run of M (assumptions (A1) and (A2)), we have Pr<sup>S</sup><sub>M</sub>(♦T) = *freq*<sub>S</sub>(T) for each scheduler S. This allows us to express the violation of the GPR condition in terms of a quadratic constraint system over variables for the expected frequencies of state-action pairs in the following way:

Let *StAct* denote the set of state-action pairs in M. We consider the following constraint system over the variables x<sub>s,α</sub> for each (s,α) ∈ *StAct* where we use the short form notation x<sub>s</sub> = Σ<sub>α∈*Act*(s)</sub> x<sub>s,α</sub>:

$$x_{s,\alpha} \geqslant 0 \qquad \text{for all } (s, \alpha) \in \mathit{StAct} \tag{1}$$

$$x_{\mathsf{init}} = 1 + \sum_{(t,\alpha) \in \mathit{StAct}} x_{t,\alpha} \cdot P(t, \alpha, \mathsf{init}) \tag{2}$$

$$x_{s} = \sum_{(t,\alpha) \in \mathit{StAct}} x_{t,\alpha} \cdot P(t, \alpha, s) \qquad \text{for all } s \in S \setminus \{\mathsf{init}\} \tag{3}$$

Using well-known results for MDPs without ECs (see, e.g., [23, Theorem 9.16]), a vector x ∈ ℝ<sup>*StAct*</sup> is a solution to (1) and the balance equations (2) and (3) if and only if there is a (possibly history-dependent) scheduler S for M with x<sub>s,α</sub> = *freq*<sub>S</sub>(s,α) for all (s,α) ∈ *StAct* if and only if there is an MR-scheduler S for M with x<sub>s,α</sub> = *freq*<sub>S</sub>(s,α) for all (s,α) ∈ *StAct*.

The violation of (GPR-1) in Lemma 6 and the condition Pr<sup>S</sup><sub>M</sub>(♦Cause) > 0 can be reformulated in terms of the frequency-variables as follows where x<sub>Cause</sub> is an abbreviation for Σ<sub>c∈Cause</sub> x<sub>c</sub>:

$$x_{\mathsf{Cause}} \cdot x_{\mathsf{eff}_{\mathsf{unc}}} \geqslant (1 - x_{\mathsf{Cause}}) \cdot \sum_{c \in \mathsf{Cause}} x_{c} \cdot w_{c} \tag{4}$$

$$x_{\mathsf{Cause}} > 0 \tag{5}$$

Lemma 7. *Under assumptions (A1)-(A3), the set* Cause *is not a GPR cause for* Eff *in* M *iff the constructed quadratic system of inequalities (1)-(5) has a solution.*
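Checking whether a candidate frequency vector satisfies (1)-(5) is a direct computation, sketched below. The encoding and the function names are hypothetical. Since terminal states have no enabled actions, the sketch reads their frequency (such as the one of eff<sub>unc</sub>) off the inflow on the right-hand side of (3); this reading is an assumption. The example MDP is a reconstruction in the spirit of Remark 2, not a transcription of Figure 2.

```python
from fractions import Fraction as F

def refutes_gpr(x, P, init, cause, eff_unc, w):
    """Check constraints (1)-(5) for a candidate frequency vector x.

    x: {(state, action): frequency}, P: {(state, action): {successor: probability}}.
    Assumption: terminal states have no actions, so their frequency is their inflow.
    """
    def inflow(s):
        return sum(x[sa] * P[sa].get(s, 0) for sa in P)

    def outflow(s):
        return sum(v for (t, _), v in x.items() if t == s)

    nonterminal = {s for s, _ in P}
    balanced = (
        all(v >= 0 for v in x.values())                                 # (1)
        and outflow(init) == 1 + inflow(init)                           # (2)
        and all(outflow(s) == inflow(s) for s in nonterminal - {init})  # (3)
    )
    x_cause = sum(outflow(c) for c in cause)
    lhs = x_cause * inflow(eff_unc)
    rhs = (1 - x_cause) * sum(outflow(c) * w[c] for c in cause)
    return balanced and lhs >= rhs and x_cause > 0                      # (4), (5)

# Hypothetical MDP in the spirit of Remark 2 (shape reconstructed from its numbers).
P = {
    ("init", "alpha"): {"eff_unc": F(1)},
    ("init", "beta"): {"c": F(1, 2), "noeff_tn": F(1, 2)},
    ("c", "gamma"): {"eff_cov": F(1, 2), "noeff_fp": F(1, 2)},
}
w = {"c": F(1, 2)}

def frequencies(p_beta):
    """Expected frequencies of the MR scheduler choosing beta with probability p_beta."""
    return {("init", "alpha"): 1 - p_beta, ("init", "beta"): p_beta,
            ("c", "gamma"): p_beta / 2}

print(refutes_gpr(frequencies(F(1, 2)), P, "init", {"c"}, "eff_unc", w))
print(refutes_gpr(frequencies(F(1)), P, "init", {"c"}, "eff_unc", w))
```

The sketch only verifies a given vector. Deciding whether some solution exists is the satisfiability problem for the quadratic system and is not addressed by it.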

*Proof of Theorem 4.* The existence of a solution to the quadratic system of inequalities (Lemma 7) can straightforwardly be formulated as a sentence in the language of the existential theory of the reals. The system of inequalities can be constructed from M, Cause, and Eff in polynomial time. Its solvability is decidable in polynomial space as the decision problem of the existential theory of the reals is in PSPACE [13]. □

Memory requirements of schedulers in the original MDP (Proof of Theorem 5). As stated above, every solution to the linear system of inequalities (1), (2), and (3) corresponds to expected frequencies of state-action pairs of an MR-scheduler in the transformed model satisfying (A1)-(A3). Hence:

Corollary 1. *Under assumptions (A1)-(A3),* Cause *is no GPR cause for* Eff *iff there exists an MR-scheduler* T *with* Pr<sup>T</sup><sub>M</sub>(♦Cause) > 0 *violating the GPR condition.*

The model transformation we used for assumptions (A1)-(A3), however, does affect the memory requirements of schedulers. We may further restrict the MR-schedulers necessary to witness non-causality under assumptions (A1)-(A3). For the following lemma, recall that τ is the action of the MEC quotient used for the extra transition from states representing MECs to a new trap state (see also assumption (A3)).

Lemma 8. *Assume (A1)-(A3). Given an MR-scheduler* U *with* Pr<sup>U</sup><sub>M</sub>(♦Cause) > 0 *that violates* (GPR)*, an MR-scheduler* T *with* T(s)(τ) ∈ {0, 1} *for each state* s *with* τ ∈ *Act*(s) *that satisfies* Pr<sup>T</sup><sub>M</sub>(♦Cause) > 0 *and violates* (GPR) *is computable in polynomial time.*

The condition that τ only has to be scheduled with probability 0 or 1 in each state is the key to transferring the sufficiency of MR-schedulers to the MDP M[Cause]. This fact is of general interest as well and stated in the following theorem where τ again is the action added to move from a state s<sub>E</sub> to the new trap state in the MEC-quotient.

Theorem 6. *Let* M *be an MDP with pairwise disjoint action sets for all states. Then, for each MR-scheduler* S *for the MEC-quotient of* M *with* S(s<sub>E</sub>)(τ) ∈ {0, 1} *for each MEC* E *of* M *there is an MR-scheduler* T *for* M *such that every action* α *of* M *that does not belong to an MEC of* M *has the same expected frequency under* S *and* T*.*

*Proof sketch.* The crux lies in the cases where S(s<sub>E</sub>)(τ) = 0, which require traversing the MEC E of M in a memoryless way such that all actions leaving E have the same expected frequency under T and S. First, we construct a finite-memory scheduler T′ that always leaves each such end component according to the distribution given by S(s<sub>E</sub>). By [23, Theorem 9.16], we then conclude that there is an MR-scheduler T under which the expected frequencies of all state-action pairs are the same as under T′. □

*Proof of Theorem 5.* The model transformation establishing assumptions (A1)-(A3) results in the MEC-quotient of M[Cause] up to the renaming and collapsing of terminal states. By Corollary 1 and Theorem 6, we conclude that Cause is not a GPR cause for Eff in M iff there is an MR-scheduler S for M[Cause] with Pr<sup>S</sup><sub>M[Cause]</sub>(♦Cause) > 0 that violates (GPR). As in Remark 5, S can be extended to a finite-memory randomized scheduler T for M with two memory cells. □

*Remark 6 (On lower bounds on GPR checking).* Solving systems of quadratic inequalities with linear side constraints is NP-hard in general (see, e.g., [20]). For convex problems, in which the associated symmetric matrix in the quadratic inequality has only non-negative eigenvalues, the problem is, however, solvable in polynomial time [26]. Unfortunately, the quadratic constraint system given by (1)-(5) is not of this form. Even if Cause is a singleton {c} and the variable x<sub>eff<sub>unc</sub></sub> is forced to take a constant value y by (1)-(3), i.e., by the structure of the MDP, the inequality (4) takes the form:

$$
x_{c} \cdot w_{c} - x_{c}^{2} \cdot (w_{c} + y) \leqslant 0 \tag{*}
$$

Here, the 1 × 1-matrix (−w<sub>c</sub> − y) has a negative eigenvalue. Although it is not ruled out that (1)-(5) belongs to another class of efficiently solvable constraint systems, the NP-hardness result in [33] for the solvability of quadratic inequalities of the form (\*) with linear side constraints might be an indication for the computational difficulty. ◁

### 5 Quality and optimality of causes

The goal of this section is to identify notions that measure how "good" causes are and to present algorithms to determine good causes according to the proposed quality measures. We have seen so far that small (singleton) causes are easy to determine (see Section 4.1). Moreover, it is easy to see that the proposed existence-checking algorithm can be transformed such that it returns a singleton (strict or global) probability-raising cause {c<sub>0</sub>} with maximal *precision*, i.e., a state c<sub>0</sub> where inf<sub>S</sub> Pr<sup>S</sup><sub>M</sub>(♦Eff | ♦c<sub>0</sub>) = Pr<sup>min</sup><sub>M,c<sub>0</sub></sub>(♦Eff) is maximal. On the other hand, singleton or small cause sets might have poor coverage in the sense that the probability of paths which reach an effect state without visiting a cause state before ("uncovered effects") can be large. This motivates the consideration of quality notions for causes that incorporate how well effect scenarios are covered. We take inspiration from quality measures that are considered in statistical analysis (see, e.g., [36]). This includes the *recall* as a measure for the relative coverage (proportion of covered effects among all effect scenarios), the *coverage ratio* (quotient of covered and uncovered effects) as well as the *f-score*. The f-score is a standard measure for classifiers defined by the harmonic mean of precision and recall. It can be seen as a compromise to achieve both good precision and good recall.

Throughout this section, we assume as before an MDP M = (S,*Act*,P,init) and a set Eff ⊆ S are given where all effect states are terminal. Furthermore, we suppose that all states s ∈ S are reachable from init.

### 5.1 Quality measures for causes

In statistical analysis, the precision of a classifier with binary outcomes ("positive" or "negative") is defined as the ratio of all true positives among all positively classified elements, while its recall is defined as the ratio of all true positives among all actual positive elements. Translated to our setting, we consider classifiers induced by a given cause set Cause that return "positive" for sample paths in case that a cause state is visited and "negative" otherwise. The intuitive meaning of true positives and false negatives is as explained after Definition 1. The meaning of true negatives and false positives is analogous. We use tp<sup>S</sup> for the probability for true positives under S. The notations fp<sup>S</sup>, fn<sup>S</sup>, tn<sup>S</sup> have analogous meanings.

With this interpretation of causes as binary classifiers in mind, the recall, precision and coverage ratio of a cause set Cause *under a scheduler* S are defined as follows (assuming Pr<sup>S</sup><sub>M</sub>(♦Eff) > 0 resp. Pr<sup>S</sup><sub>M</sub>(♦Cause) > 0 resp. Pr<sup>S</sup><sub>M</sub>((¬Cause)UEff) > 0):

$$\begin{array}{rcl} \mathit{precision}^{\mathfrak{S}}(\mathsf{Cause}) &=& \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Eff} \mid \Diamond\mathsf{Cause}) = \dfrac{\mathsf{tp}^{\mathfrak{S}}}{\mathsf{tp}^{\mathfrak{S}} + \mathsf{fp}^{\mathfrak{S}}} \\[1ex] \mathit{recall}^{\mathfrak{S}}(\mathsf{Cause}) &=& \Pr_{\mathcal{M}}^{\mathfrak{S}}(\Diamond\mathsf{Cause} \mid \Diamond\mathsf{Eff}) = \dfrac{\mathsf{tp}^{\mathfrak{S}}}{\mathsf{tp}^{\mathfrak{S}} + \mathsf{fn}^{\mathfrak{S}}} \end{array}$$

$$\mathit{covrat}^{\mathfrak{S}}(\mathsf{Cause}) = \frac{\Pr_{\mathcal{M}}^{\mathfrak{S}}\bigl(\Diamond(\mathsf{Cause} \land \Diamond\mathsf{Eff})\bigr)}{\Pr_{\mathcal{M}}^{\mathfrak{S}}\bigl((\neg\mathsf{Cause})\,\mathrm{U}\,\mathsf{Eff}\bigr)} = \frac{\mathsf{tp}^{\mathfrak{S}}}{\mathsf{fn}^{\mathfrak{S}}}.$$

For the coverage ratio, if Pr<sup>S</sup><sub>M</sub>((¬Cause)UEff) = 0 and Pr<sup>S</sup><sub>M</sub>(♦Cause) > 0 we define *covrat*<sup>S</sup>(Cause) = +∞. Finally, the f-score of Cause *under a scheduler* S is defined as the harmonic mean of the precision and recall (assuming Pr<sup>S</sup><sub>M</sub>(♦Cause) > 0, which implies Pr<sup>S</sup><sub>M</sub>(♦Eff) > 0 as Cause is a PR cause):

$$\mathit{fscore}^{\mathfrak{S}}(\mathsf{Cause}) \stackrel{\text{def}}{=} 2 \cdot \frac{\mathit{precision}^{\mathfrak{S}}(\mathsf{Cause}) \cdot \mathit{recall}^{\mathfrak{S}}(\mathsf{Cause})}{\mathit{precision}^{\mathfrak{S}}(\mathsf{Cause}) + \mathit{recall}^{\mathfrak{S}}(\mathsf{Cause})}$$

If, however, Pr<sup>S</sup><sub>M</sub>(♦Eff) > 0 and Pr<sup>S</sup><sub>M</sub>(♦Cause) = 0 we define *fscore*<sup>S</sup>(Cause) = 0.
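The four per-scheduler measures can be computed directly from the probabilities of true positives, false positives and false negatives. The sketch below is illustrative only: the function name `quality` and the use of `None` for undefined values are choices made here, and the sample numbers are the ones obtained for Cause = {c<sub>1</sub>, c<sub>2</sub>} in Example 1 under the hypothetical assumption that the remaining probability mass of init leads to a non-effect state.

```python
from fractions import Fraction as F

def quality(tp, fp, fn):
    """Return (precision, recall, covrat, fscore) from outcome probabilities.

    A measure is None when it is undefined, except for the two conventions
    stated in the text: covrat = +infinity and fscore = 0.
    """
    precision = tp / (tp + fp) if tp + fp > 0 else None   # needs Pr(reach Cause) > 0
    recall = tp / (tp + fn) if tp + fn > 0 else None      # needs Pr(reach Eff) > 0
    if fn > 0:
        covrat = tp / fn
    else:
        covrat = float("inf") if tp + fp > 0 else None
    if tp + fp > 0 and tp > 0:
        fscore = 2 * precision * recall / (precision + recall)
    elif tp + fp == 0 and fn > 0:
        fscore = F(0)
    else:
        fscore = None
    return precision, recall, covrat, fscore

# Example 1 with Cause = {c1, c2}: tp = 1/3 + 1/3 * 1/4, fp = 1/3 * 3/4, fn = 1/12.
print(quality(F(5, 12), F(1, 4), F(1, 12)))
```

For these numbers the harmonic mean agrees with the equivalent closed form 2·tp / (2·tp + fp + fn), which is a convenient cross-check.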

Quality measures for cause sets. Let Cause be a PR cause. The recall of Cause measures the relative coverage in terms of the worst-case conditional probability for covered effects (true positives) among all scenarios where the effect occurs.

$$\operatorname{recall}(\mathsf{Cause}) = \inf_{\mathfrak{S}} \operatorname{recall}^{\mathfrak{S}}(\mathsf{Cause}) = \Pr_{\mathcal{M}}^{\min}(\lozenge\mathsf{Cause} \mid \lozenge\mathsf{Eff})$$

when ranging over all schedulers $\mathfrak{S}$ with $\Pr^{\mathfrak{S}}_{\mathcal{M}}(\lozenge\mathsf{Eff}) > 0$. Likewise, the coverage ratio and f-score of Cause are defined by the worst-case coverage ratio resp. f-score (when ranging over schedulers for which $\operatorname{covrat}^{\mathfrak{S}}(\mathsf{Cause})$ resp. $\operatorname{fscore}^{\mathfrak{S}}(\mathsf{Cause})$ is defined):

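$$\operatorname{covrat}(\mathsf{Cause}) = \inf_{\mathfrak{S}} \operatorname{covrat}^{\mathfrak{S}}(\mathsf{Cause}), \qquad \operatorname{fscore}(\mathsf{Cause}) = \inf_{\mathfrak{S}} \operatorname{fscore}^{\mathfrak{S}}(\mathsf{Cause})$$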

### 5.2 Computation schemes for the quality measures for fixed cause set

For this section, we assume that a fixed PR cause Cause is given and address the problem of computing its quality values. Since all quality measures are preserved by the switch from $\mathcal{M}$ to $\mathcal{M}_{[\mathsf{Cause}]}$ as well as by the transformations of $\mathcal{M}_{[\mathsf{Cause}]}$ to an MDP that satisfies conditions (A1)-(A3) of Section 4.2, we may assume that $\mathcal{M}$ satisfies (A1)-(A3).

While efficient computation methods for $\operatorname{recall}(\mathsf{Cause})$ are known from the literature (see [10,31] for polynomial-time algorithms to compute conditional reachability probabilities), we are not aware of known concepts that are applicable for computing the coverage ratio or the f-score. Nevertheless, both are efficiently computable:

Theorem 7. *The values $\operatorname{covrat}(\mathsf{Cause})$ and $\operatorname{fscore}(\mathsf{Cause})$ and corresponding worst-case schedulers are computable in polynomial time.*

By definition, the value *covrat*(Cause) is the infimum over a quotient of reachability probabilities for disjoint sets of terminal states. While this is not the case for the f-score, we can express *fscore*(Cause) in terms of the supremum of such a quotient. More precisely, under assumptions (A1)-(A3) and assuming *fscore*(Cause) > 0, we have:

$$\operatorname{fscore}(\mathsf{Cause}) = \frac{2}{X+2} \quad \text{where} \quad X = \sup_{\mathfrak{S}} \frac{\Pr_{\mathcal{M}}^{\mathfrak{S}}(\lozenge \mathsf{noeff}_{\mathsf{fp}}) + \Pr_{\mathcal{M}}^{\mathfrak{S}}(\lozenge \mathsf{eff}_{\mathsf{unc}})}{\Pr_{\mathcal{M}}^{\mathfrak{S}}(\lozenge \mathsf{eff}_{\mathsf{cov}})}$$

where $\mathfrak{S}$ ranges over all schedulers with $\Pr^{\mathfrak{S}}_{\mathcal{M}}(\lozenge\mathsf{eff}_{\mathsf{cov}}) > 0$. Furthermore, we have $\operatorname{fscore}(\mathsf{Cause}) = 0$ if and only if $\operatorname{recall}(\mathsf{Cause}) = 0$ if and only if there exists a scheduler $\mathfrak{S}$ satisfying $\Pr^{\mathfrak{S}}_{\mathcal{M}}(\lozenge\mathsf{Eff}) > 0$ and $\Pr^{\mathfrak{S}}_{\mathcal{M}}(\lozenge\mathsf{Cause}) = 0$.
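The closed form can be checked directly from the definition of the f-score as a harmonic mean. The following derivation is a sketch for one fixed scheduler $\mathfrak{S}$ with $\mathsf{tp}^{\mathfrak{S}} > 0$. It assumes (this is not spelled out in the passage above) that under (A1)-(A3) the probabilities of reaching $\mathsf{eff}_{\mathsf{cov}}$, $\mathsf{noeff}_{\mathsf{fp}}$ and $\mathsf{eff}_{\mathsf{unc}}$ coincide with $\mathsf{tp}^{\mathfrak{S}}$, $\mathsf{fp}^{\mathfrak{S}}$ and $\mathsf{fn}^{\mathfrak{S}}$, respectively:

$$\operatorname{fscore}^{\mathfrak{S}}(\mathsf{Cause}) = 2 \cdot \frac{\frac{\mathsf{tp}^{\mathfrak{S}}}{\mathsf{tp}^{\mathfrak{S}}+\mathsf{fp}^{\mathfrak{S}}} \cdot \frac{\mathsf{tp}^{\mathfrak{S}}}{\mathsf{tp}^{\mathfrak{S}}+\mathsf{fn}^{\mathfrak{S}}}}{\frac{\mathsf{tp}^{\mathfrak{S}}}{\mathsf{tp}^{\mathfrak{S}}+\mathsf{fp}^{\mathfrak{S}}} + \frac{\mathsf{tp}^{\mathfrak{S}}}{\mathsf{tp}^{\mathfrak{S}}+\mathsf{fn}^{\mathfrak{S}}}} = \frac{2\,\mathsf{tp}^{\mathfrak{S}}}{2\,\mathsf{tp}^{\mathfrak{S}} + \mathsf{fp}^{\mathfrak{S}} + \mathsf{fn}^{\mathfrak{S}}} = \frac{2}{\dfrac{\mathsf{fp}^{\mathfrak{S}} + \mathsf{fn}^{\mathfrak{S}}}{\mathsf{tp}^{\mathfrak{S}}} + 2}$$

Since $x \mapsto 2/(x+2)$ is strictly decreasing for $x \geq 0$, minimizing the f-score over schedulers amounts to maximizing the quotient $(\mathsf{fp}^{\mathfrak{S}} + \mathsf{fn}^{\mathfrak{S}})/\mathsf{tp}^{\mathfrak{S}}$, which is the supremum $X$.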

So, what remains to prove Theorem 7 is a generally applicable technique for computing extremal ratios of reachability probabilities in MDPs without ECs.

Max/min ratios of reachability probabilities for disjoint sets of terminal states. Suppose we are given an MDP $\mathcal{M} = (S, \mathit{Act}, P, \mathsf{init})$ without ECs and disjoint subsets $U, V \subseteq S$ of terminal states. Given a scheduler $\mathfrak{S}$ with $\Pr^{\mathfrak{S}}_{\mathcal{M}}(\lozenge V) > 0$ we define:

$$\operatorname{ratio}_{\mathcal{M}}^{\mathfrak{S}}(U, V) = \Pr_{\mathcal{M}}^{\mathfrak{S}}(\lozenge U) \,/\, \Pr_{\mathcal{M}}^{\mathfrak{S}}(\lozenge V)$$

The goal is to compute the extremal values $\operatorname{ratio}^{\min}_{\mathcal{M}}(U,V) = \inf_{\mathfrak{S}} \operatorname{ratio}^{\mathfrak{S}}_{\mathcal{M}}(U,V)$ and $\operatorname{ratio}^{\max}_{\mathcal{M}}(U,V) = \sup_{\mathfrak{S}} \operatorname{ratio}^{\mathfrak{S}}_{\mathcal{M}}(U,V)$, where $\mathfrak{S}$ ranges over all schedulers such that $\Pr^{\mathfrak{S}}_{\mathcal{M}}(\lozenge V) > 0$. For their computation, we rely on a polynomial reduction to the classical *stochastic shortest path problem* [12]. For this, consider the MDP $\mathcal{N}$ arising from $\mathcal{M}$ by adding reset transitions from all terminal states $t \in S \setminus V$ to init. Thus, exactly the $V$-states are terminal in $\mathcal{N}$. The MDP $\mathcal{N}$ might contain ECs, which, however, do not intersect with $V$. We equip $\mathcal{N}$ with the weight function that assigns 1 to all states in $U$ and 0 to all other states. For a scheduler $\mathfrak{T}$ with $\Pr^{\mathfrak{T}}_{\mathcal{N}}(\lozenge V) = 1$, let $\mathrm{E}^{\mathfrak{T}}_{\mathcal{N}}(V)$ be the expected accumulated weight until reaching $V$ under $\mathfrak{T}$. Let $\mathrm{E}^{\min}_{\mathcal{N}}(V) = \inf_{\mathfrak{T}} \mathrm{E}^{\mathfrak{T}}_{\mathcal{N}}(V)$ and $\mathrm{E}^{\max}_{\mathcal{N}}(V) = \sup_{\mathfrak{T}} \mathrm{E}^{\mathfrak{T}}_{\mathcal{N}}(V)$, where $\mathfrak{T}$ ranges over all schedulers with $\Pr^{\mathfrak{T}}_{\mathcal{N}}(\lozenge V) = 1$.

We can rely on known results [12,3,4] to obtain that both $\mathrm{E}^{\min}_{\mathcal{N}}(V)$ and $\mathrm{E}^{\max}_{\mathcal{N}}(V)$ are computable in polynomial time. As $\mathcal{N}$ has only non-negative weights, $\mathrm{E}^{\min}_{\mathcal{N}}(V)$ is finite and a corresponding MD-scheduler with minimal expectation exists. If $\mathcal{N}$ has an EC containing at least one $U$-state, which is the case iff $\mathcal{M}$ has a scheduler $\mathfrak{S}$ with $\Pr^{\mathfrak{S}}_{\mathcal{M}}(\lozenge U) > 0$ and $\Pr^{\mathfrak{S}}_{\mathcal{M}}(\lozenge V) = 0$, then $\mathrm{E}^{\max}_{\mathcal{N}}(V) = +\infty$. Otherwise, $\mathrm{E}^{\max}_{\mathcal{N}}(V)$ is finite and the maximum is achieved by an MD-scheduler as well.

Theorem 8. *Let $\mathcal{M}$ be an MDP without ECs and $U, V$ disjoint sets of terminal states in $\mathcal{M}$, and let $\mathcal{N}$ be the MDP constructed as above. Then, $\operatorname{ratio}^{\min}_{\mathcal{M}}(U,V) = \mathrm{E}^{\min}_{\mathcal{N}}(V)$ and $\operatorname{ratio}^{\max}_{\mathcal{M}}(U,V) = \mathrm{E}^{\max}_{\mathcal{N}}(V)$. Thus, both values are computable in polynomial time, and there is an MD-scheduler minimizing $\operatorname{ratio}^{\mathfrak{S}}_{\mathcal{M}}(U,V)$, and an MD-scheduler maximizing $\operatorname{ratio}^{\mathfrak{S}}_{\mathcal{M}}(U,V)$ if $\operatorname{ratio}^{\max}_{\mathcal{M}}(U,V)$ is finite.*
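To make the reset construction concrete, the identity of Theorem 8 can be checked numerically in the special case of a Markov chain, where there is only one scheduler and the infimum and supremum coincide. The sketch below uses a hypothetical five-state chain (the states `s`, `u`, `v`, `w` and all probabilities are made up for illustration) and plain value iteration rather than the polynomial-time methods cited in the text.

```python
from fractions import Fraction as F

# Hypothetical acyclic Markov chain M; states without an entry are terminal.
P = {
    "init": {"s": F(1, 2), "u": F(1, 2)},
    "s": {"v": F(1, 2), "w": F(1, 2)},
}
STATES = {"init", "s", "u", "v", "w"}
U, V = {"u"}, {"v"}


def reach(state, target):
    """Probability of eventually reaching `target` from `state` in M."""
    if state in target:
        return F(1)
    return sum((p * reach(t, target) for t, p in P.get(state, {}).items()), F(0))


# Ratio of reachability probabilities in M.
ratio = reach("init", U) / reach("init", V)

# N: every terminal state outside V gets a reset transition back to init.
P_N = dict(P)
for t in STATES - set(P) - V:
    P_N[t] = {"init": F(1)}


def expected_weight(iterations=2000):
    """Value iteration for the expected number of U-visits before V is reached in N."""
    x = {s: 0.0 for s in STATES}
    for _ in range(iterations):
        new_x = {}
        for s in STATES:
            if s in V:
                new_x[s] = 0.0  # target states accumulate nothing further
            else:
                weight = 1.0 if s in U else 0.0
                new_x[s] = weight + sum(float(p) * x[t] for t, p in P_N[s].items())
        x = new_x
    return x["init"]
```

In this chain, `u` is reached with probability 1/2 and `v` with probability 1/4, so `ratio` is 2, and the value iteration for the expected accumulated weight in the reset chain converges to the same value.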

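*Proof of Theorem 7.* Using assumptions (A1)-(A3), we obtain that $\operatorname{covrat}(\mathsf{Cause}) = \operatorname{ratio}^{\min}_{\mathcal{M}}(U,V)$ where $U = \{\mathsf{eff}_{\mathsf{cov}}\}$, $V = \{\mathsf{eff}_{\mathsf{unc}}\}$. Similarly, with $U = \{\mathsf{noeff}_{\mathsf{fp}}, \mathsf{eff}_{\mathsf{unc}}\}$, $V = \{\mathsf{eff}_{\mathsf{cov}}\}$, we get $\operatorname{fscore}(\mathsf{Cause}) = 0$ if $\operatorname{ratio}^{\max}_{\mathcal{M}}(U,V) = +\infty$ and $\operatorname{fscore}(\mathsf{Cause}) = 2/(\operatorname{ratio}^{\max}_{\mathcal{M}}(U,V) + 2)$ otherwise. Thus, the claim follows from Theorem 8. □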

### 5.3 Quality-optimal probability-raising causes

An SPR cause Cause is called *recall-optimal* if $\operatorname{recall}(\mathsf{Cause}) = \max_{C} \operatorname{recall}(C)$ where $C$ ranges over all SPR causes. Likewise, *ratio-optimality* resp. *f-score-optimality* of Cause means maximality of $\operatorname{covrat}(\mathsf{Cause})$ resp. $\operatorname{fscore}(\mathsf{Cause})$ among all SPR causes. Recall-, ratio- and f-score-optimality for GPR causes are defined accordingly.

Lemma 9. *Let* Cause *be an SPR or a GPR cause. Then,* Cause *is recall-optimal if and only if* Cause *is ratio-optimal.*

Recall- and ratio-optimal SPR causes. The techniques of Section 4.1 yield an algorithm for generating a canonical SPR cause with optimal recall and ratio. To see this, let $C$ denote the set of states that constitute a singleton SPR cause. The canonical cause CanCause is defined as the set of states $c \in C$ such that there is a scheduler $\mathfrak{S}$ with $\Pr^{\mathfrak{S}}_{\mathcal{M}}((\neg C)\,\mathsf{U}\,c) > 0$. Obviously, $C$ and CanCause are computable in polynomial time.

Theorem 9. *If $C \neq \emptyset$ then* CanCause *is a ratio- and recall-optimal SPR cause.*

This is not true for the f-score. To see this, consider the Markov chain on the right-hand side. We have $\mathsf{CanCause} = \{s_1\}$, which has $\operatorname{precision}(\mathsf{CanCause}) = \frac{3}{4}$ and $\operatorname{recall}(\mathsf{CanCause}) = \frac{3}{8} / (\frac{1}{4} + \frac{3}{8}) = \frac{3}{5}$. But the SPR cause $\{s_2\}$ has a better f-score, as its precision is 1 and it has the same recall as CanCause.
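For completeness, the two f-scores in this example can be computed from the stated precision and recall values alone (in a Markov chain there is a single scheduler, so no infimum needs to be taken):

$$\operatorname{fscore}(\mathsf{CanCause}) = 2 \cdot \frac{\frac{3}{4} \cdot \frac{3}{5}}{\frac{3}{4} + \frac{3}{5}} = 2 \cdot \frac{9/20}{27/20} = \frac{2}{3}, \qquad \operatorname{fscore}(\{s_2\}) = 2 \cdot \frac{1 \cdot \frac{3}{5}}{1 + \frac{3}{5}} = \frac{6/5}{8/5} = \frac{3}{4}.$$

Hence $\{s_2\}$ indeed has the strictly larger f-score.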

F-score-optimal SPR cause. From Section 5.2, we see that f-score-optimal SPR causes in MDPs can be computed in polynomial space by computing the f-score for all potential SPR causes one by one in polynomial time (Theorem 7). As the space can be reused after each computation, this results in polynomial space. For Markov chains, we can do better and compute an f-score-optimal SPR cause in polynomial time via a polynomial reduction to the stochastic shortest path problem:

Theorem 10. *In Markov chains that have SPR causes, an f-score-optimal SPR cause can be computed in polynomial time.*

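*Proof.* We regard the given Markov chain $\mathcal{M}$ as an MDP with a singleton action set $\mathit{Act} = \{\alpha\}$. As $\mathcal{M}$ has SPR causes, the set $C$ of states that constitute a singleton SPR cause is nonempty. We may assume that $\mathcal{M}$ has no non-trivial (i.e., cyclic) bottom strongly connected components as we may collapse them. Let $w_c = \Pr_{\mathcal{M},c}(\lozenge\mathsf{Eff})$. We switch from $\mathcal{M}$ to a new MDP $\mathcal{K}$ with state space $S_{\mathcal{K}} = S \cup \{\mathsf{eff}_{\mathsf{cov}}, \mathsf{noeff}_{\mathsf{fp}}\}$ with fresh states $\mathsf{eff}_{\mathsf{cov}}$ and $\mathsf{noeff}_{\mathsf{fp}}$ and the action set $\mathit{Act}_{\mathcal{K}} = \{\alpha, \gamma\}$. The MDP $\mathcal{K}$ arises from $\mathcal{M}$ by adding (i) for each state $c \in C$ a fresh state-action pair $(c, \gamma)$ with $P_{\mathcal{K}}(c, \gamma, \mathsf{eff}_{\mathsf{cov}}) = w_c$ and $P_{\mathcal{K}}(c, \gamma, \mathsf{noeff}_{\mathsf{fp}}) = 1 - w_c$ and (ii) reset transitions to init with action label $\alpha$ from the new state $\mathsf{noeff}_{\mathsf{fp}}$ and all terminal states of $\mathcal{M}$, i.e., $P_{\mathcal{K}}(\mathsf{noeff}_{\mathsf{fp}}, \alpha, \mathsf{init}) = 1$ and $P_{\mathcal{K}}(s, \alpha, \mathsf{init}) = 1$ for $s \in \mathsf{Eff}$ or if $s$ is a terminal non-effect state of $\mathcal{M}$. So, exactly $\mathsf{eff}_{\mathsf{cov}}$ is terminal in $\mathcal{K}$, and $\mathit{Act}_{\mathcal{K}}(c) = \{\alpha, \gamma\}$ for $c \in C$, while $\mathit{Act}_{\mathcal{K}}(s) = \{\alpha\}$ for all other states $s$. Intuitively, taking action $\gamma$ in state $c \in C$ selects $c$ to be a cause state. The states in Eff represent uncovered effects in $\mathcal{K}$, while $\mathsf{eff}_{\mathsf{cov}}$ stands for covered effects.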

We assign weight 1 to all states in $U = \mathsf{Eff} \cup \{\mathsf{noeff}_{\mathsf{fp}}\}$ and weight 0 to all other states of $\mathcal{K}$. Let $V = \{\mathsf{eff}_{\mathsf{cov}}\}$. Then, $f = \mathrm{E}^{\min}_{\mathcal{K}}(V)$ and an MD-scheduler $\mathfrak{S}$ for $\mathcal{K}$ such that $\mathrm{E}^{\mathfrak{S}}_{\mathcal{K}}(V) = f$ are computable in polynomial time. Let $C_\gamma$ denote the set of states $c \in C$ where $\mathfrak{S}(c) = \gamma$ and let Cause be the set of states $c \in C_\gamma$ where $\mathcal{M}$ has a path satisfying $(\neg C_\gamma)\,\mathsf{U}\,c$. Then, Cause is an SPR cause of $\mathcal{M}$. With arguments as in Section 5.2 we obtain $\operatorname{fscore}(\mathsf{Cause}) = 2/(f+2)$. It remains to show that Cause is f-score-optimal. Let $C'$ be an arbitrary SPR cause. Then, $C' \subseteq C$. Let $\mathfrak{T}$ be the MD-scheduler for $\mathcal{K}$ that schedules $\gamma$ in $C'$ and $\alpha$ for all other states of $\mathcal{K}$. Then, $\operatorname{fscore}(C') = 2/(f^{\mathfrak{T}}+2)$ where $f^{\mathfrak{T}} = \mathrm{E}^{\mathfrak{T}}_{\mathcal{K}}(V)$. Hence, $f \leq f^{\mathfrak{T}}$, which yields $\operatorname{fscore}(\mathsf{Cause}) \geq \operatorname{fscore}(C')$. □

The naïve adaptation of the construction presented in the proof of Theorem 10 to MDPs would yield a stochastic game structure where the objective of one player is to minimize the expected accumulated weight until reaching a target state. Although algorithms for *stochastic shortest path (SSP) games* are known [34], they rely on assumptions on the game structure which would not be satisfied here. However, for the threshold problem *SPR-f-score*, where the inputs are an MDP $\mathcal{M}$, Eff and $\vartheta \in \mathbb{Q}_{>0}$ and the task is to decide the existence of an SPR cause whose f-score exceeds $\vartheta$, we can establish a polynomial reduction to SSP games, which yields an NP∩coNP upper bound:

Theorem 11. *The decision problem SPR-f-score is in* NP∩coNP*.*

*Proof sketch.* Given an MDP M, an effect set Eff, and ϑ ∈ Q, we construct an SSP game [34] after a series of model transformations ensuring (i) that terminal states are reached almost surely and (ii) that Eff is reached with positive probability under all schedulers. Condition (i) is established by a standard MEC-quotient construction. To establish condition (ii), we provide a construction that forces schedulers to leave an initial sub-MDP in which the minimal probability to reach Eff is 0. This construction – unlike the MEC-quotient – affects the possible combinations of probability values with which terminal states and potential cause states can be reached, but the existence of an SPR cause satisfying the f-score-threshold condition is not affected.

The underlying idea of the construction of the game shares similarities with the MDP constructed in the proof of Theorem 10: Player 0 takes the role to select potential cause states while player 1 takes the role of a scheduler in the transformed MDP. Using the observation that for each cause C, *fscore*(C) > ϑ iff

$$2(1 - \vartheta) \Pr_{\mathcal{M}}^{\mathfrak{S}}(\lozenge C \land \lozenge \mathsf{Eff}) - \vartheta \Pr_{\mathcal{M}}^{\mathfrak{S}}(\neg \lozenge C \land \lozenge \mathsf{Eff}) - \vartheta \Pr_{\mathcal{M}}^{\mathfrak{S}}(\lozenge C \land \neg \lozenge \mathsf{Eff}) > 0 \qquad (\times)$$

for all schedulers $\mathfrak{S}$ for $\mathcal{M}$ with $\Pr^{\mathfrak{S}}_{\mathcal{M}}(\lozenge\mathsf{Eff}) > 0$, weights are assigned to Eff-states and other terminal states depending on whether player 0 has chosen to include a state in the cause beforehand. In the resulting SSP game, both players have optimal MD-strategies [34]. Given such strategies $\zeta$ for player 0 and $\mathfrak{S}$ for player 1, the resulting expected accumulated weight agrees with the left-hand side of (×) when considering $\mathfrak{S}$ as a scheduler for the transformed MDP and the cause $C$ induced by the states that $\zeta$ chooses to belong to the cause. Thus, player 0 wins the constructed game iff an SPR cause with f-score above the threshold $\vartheta$ exists. The existence of optimal MD-strategies for both players allows us to decide this threshold problem in NP and coNP. □
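The inequality (×) used in this proof sketch can be recovered from the definition of the f-score. As a sketch for one fixed scheduler $\mathfrak{S}$ with $\Pr^{\mathfrak{S}}_{\mathcal{M}}(\lozenge\mathsf{Eff}) > 0$, abbreviate $\mathsf{tp} = \Pr^{\mathfrak{S}}_{\mathcal{M}}(\lozenge C \land \lozenge\mathsf{Eff})$, $\mathsf{fn} = \Pr^{\mathfrak{S}}_{\mathcal{M}}(\neg\lozenge C \land \lozenge\mathsf{Eff})$ and $\mathsf{fp} = \Pr^{\mathfrak{S}}_{\mathcal{M}}(\lozenge C \land \neg\lozenge\mathsf{Eff})$ (these abbreviations are introduced here for illustration). The harmonic mean of precision and recall equals $2\,\mathsf{tp}/(2\,\mathsf{tp} + \mathsf{fp} + \mathsf{fn})$, and this expression also yields the value 0 in the case $\Pr^{\mathfrak{S}}_{\mathcal{M}}(\lozenge C) = 0$. As the denominator is positive:

$$\frac{2\,\mathsf{tp}}{2\,\mathsf{tp} + \mathsf{fp} + \mathsf{fn}} > \vartheta \iff 2\,\mathsf{tp} > \vartheta\,(2\,\mathsf{tp} + \mathsf{fp} + \mathsf{fn}) \iff 2(1-\vartheta)\,\mathsf{tp} - \vartheta\,\mathsf{fn} - \vartheta\,\mathsf{fp} > 0.$$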

Optimality and threshold constraints for GPR causes. Computing optimal GPR causes for either quality measure can be done in polynomial space by considering all cause candidates, checking the GPR condition in polynomial space (Theorem 4) and computing the corresponding quality measure in polynomial time (Section 5.2). However, we show that no polynomial-time algorithms can be expected as the corresponding threshold problems are NP-hard. Let GPR-covratio (resp. GPR-recall, GPR-f-score) denote the following decision problem: given $\mathcal{M}$, Eff and $\vartheta \in \mathbb{Q}$, decide whether there exists a GPR cause with coverage ratio (resp. recall, f-score) at least $\vartheta$.

Theorem 12. *The problems GPR-covratio, GPR-recall and GPR-f-score are NP-hard and belong to PSPACE. For Markov chains, all three problems are NP-complete. NP-hardness even holds for tree-like Markov chains.*

*Proof sketch.* NP-hardness is established via a polynomial reduction from the knapsack problem. Membership in NP for Markov chains resp. in PSPACE = NPSPACE for MDPs is obvious, as we can nondeterministically guess a cause candidate and then check (i) the GPR condition in polynomial time (Markov chains) resp. polynomial space (MDPs) and (ii) the threshold condition in polynomial time (see Section 5.2). □

### 6 Conclusion

The goal of the paper was to formalize the probability-raising principle in MDPs and related quality notions for PR causes, and to study fundamental algorithmic problems for them. We considered the strict (local) and the global view. Our results indicate that GPR causes are more general and leave more flexibility to achieve better accuracy, while algorithmic reasoning about SPR causes is simpler.

*Existential definition of SPR/GPR causes.* The proposed definition of PR causes relies on a universal quantification over all relevant schedulers. However, another approach could be via existential quantification, i.e. there is a scheduler $\mathfrak{S}$ such that (GPR) resp. (SPR) holds. The resulting notion of causality yields largely the same results (up to $\Pr^{\max}_{\mathcal{M},c}(\lozenge\mathsf{Eff})$ instead of $\Pr^{\min}_{\mathcal{M},c}(\lozenge\mathsf{Eff})$, etc.). A canonical existential SPR cause can be defined in analogy to the universal case and shown to be recall- and ratio-optimal (cf. Theorem 9). The problem of finding an existential f-score-optimal SPR cause is even simpler and solvable in polynomial time, as the construction presented in the proof of Theorem 10 can be adapted for MDPs (thanks to the simpler nature of $\max_{C} \sup_{\mathfrak{S}} \operatorname{fscore}^{\mathfrak{S}}(C)$ compared to $\max_{C} \inf_{\mathfrak{S}} \operatorname{fscore}^{\mathfrak{S}}(C)$). However, NP-hardness for the existence of GPR causes with threshold constraints for the quality carries over to the existential definition (as NP-hardness holds for Markov chains, Theorem 12).

*Non-strict inequality in the PR conditions.* Our notions of PR causes are in line with the classical approach of probability-raising causality in literature with strict inequality in the PR condition. This has the consequence that causes might not exist (see Example 2). The switch to a relaxed definition of PR causes with non-strict inequality seems to be a minor change that identifies more sets as causes. Indeed, the proposed algorithms for checking the SPR and GPR condition (Section 4) can easily be modified for the relaxed definition. While this leads to a questionable notion of causality (e.g., {init} would always be a recall- and ratio-optimal SPR cause under the relaxed definition), it could be useful in combination with other side constraints. E.g., requiring the relaxed PR condition for all schedulers which reach a cause state with positive probability and requiring the existence of a scheduler where the PR condition with strict inequality holds might be a useful alternative definition that agrees with Def. 1 for Markov chains.

*Relaxing the minimality condition (M).* As many causality notions of the literature include some minimality constraint, we included condition (M). However, (M) could be dropped without affecting the algorithmic results presented here. This can be useful when the task is to identify components or agents that are responsible for the occurrences of undesired effects. In these cases the cause candidates are fixed (e.g., for each agent i, the set of states controlled by agent i), but some of them might violate (M).

*Future directions* include PR causality when causes and effects are path properties and the investigation of other quality measures for PR causes inspired by other indices for binary classifiers used in machine learning or customized for applications of cause-effect reasoning in MDPs. More sophisticated notions of probabilistic backward causality and considerations on PR causality with external interventions as in Pearl's do-calculus [35] are left for future work.

Acknowledgments. We would like to thank Simon Jantsch and Clemens Dubslaff for their helpful comments and feedback on the topic of causality in MDPs.

### References



## Parameterized Analysis of Reconfigurable Broadcast Networks

A. R. Balasubramanian<sup>1</sup>, Lucie Guillou<sup>2</sup>, and Chana Weil-Kennedy<sup>3</sup>

<sup>1</sup> Technical University of Munich, bala.ayikudi@tum.de

<sup>2</sup> ENS Rennes, lucie.guillou@ens-rennes.fr

<sup>3</sup> Technical University of Munich, chana.weilkennedy@in.tum.de

Abstract. Reconfigurable broadcast networks (RBN) are a model of distributed computation in which agents can broadcast messages to other agents using some underlying communication topology which can change arbitrarily over the course of executions. In this paper, we conduct parameterized analysis of RBN. We consider cubes, (infinite) sets of configurations in the form of lower and upper bounds on the number of agents in each state, and we show that we can evaluate boolean combinations over cubes and reachability sets of cubes in PSPACE. In particular, reachability from a cube to another cube is a PSPACE-complete problem. To prove the upper bound for this parameterized analysis, we prove some structural properties about the reachability sets and the symbolic graph abstraction of RBN, which might be of independent interest. We justify this claim by providing two applications of these results. First, we show that the almost-sure coverability problem is PSPACE-complete for RBN, thereby closing a complexity gap from a previous paper [3]. Second, we define a computation model using RBN, à la population protocols, called RBN protocols. We characterize precisely the set of predicates that can be computed by such protocols.

Keywords: Broadcast networks · Parameterized reachability · Almost-sure coverability · Asynchronous shared-memory systems

### 1 Introduction

Reconfigurable broadcast networks (RBN) [8,10] are a formalism for modelling distributed systems in which a set of anonymous, finite-state agents execute the same underlying protocol and broadcast messages to their neighbors according to an underlying communication topology. The communication topology is reconfigurable, meaning that the set of neighbors of an agent can change arbitrarily over the course of an execution. Parameterized verification of these networks concerns itself with proving that a given property is correct, irrespective of the number of participating agents. Dually, it can be viewed as the problem of finding an execution of some number of agents which violates a given property. Ever since their introduction within this context [10], RBN have been studied extensively, with various results on (parameterized) reachability and coverability [8,10,3,7], along with various extensions using probabilities and clocks [5,4].

This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme under grant agreement No 787367 (PaVeS).

© The Author(s) 2022. P. Bouyer and L. Schröder (Eds.): FoSSaCS 2022, LNCS 13242, pp. 61–80, 2022. https://doi.org/10.1007/978-3-030-99253-8\_4

In this paper, we first consider the cube-reachability problem for RBN, in which we are given two (possibly infinite) sets of configurations $\mathcal{C}$ and $\mathcal{C}'$ (called cubes), each of them defined by lower and upper bounds on the number of agents in each state, and we must decide if there is a configuration in $\mathcal{C}$ which can reach some configuration in $\mathcal{C}'$. The cube-reachability question covers parameterized reachability and coverability problems, and as explained in [3], also covers the parameterized reachability problem for a generalized model of RBN called RBN with leaders. Moreover, a sub-problem of cube-reachability has already been studied for RBN in [8]. The authors show that this sub-problem is PSPACE-complete. One of the results in our paper is that the entire cube-reachability problem is PSPACE-complete, hence extending the sub-problem considered in [8], while still retaining the same complexity upper bound.

In fact, our main result, which we call the PSPACE Theorem, is a more general result. It subsumes the above result for cube-reachability and allows for more complex parameterized analysis of RBN. The PSPACE Theorem roughly states that any boolean combination of atoms can be evaluated in PSPACE, where an atom is a finite union of cubes or the reachability set of a finite union of cubes (i.e. $\mathit{post}^{\ast}$ or $\mathit{pre}^{\ast}$). To prove the PSPACE Theorem, we first consider the so-called symbolic graph of an RBN ([8], Section 5). We prove some structural properties about these graphs, using results from [8]. Next, using these structural properties, we show that the set of reachable configurations of a cube $\mathcal{C}$ can be expressed as a finite union of cubes, each having a norm exponentially bounded in the size of the given RBN and $\mathcal{C}$. This result then allows us to give an on-the-fly exploration algorithm for proving the PSPACE Theorem.

We believe that the PSPACE Theorem and the results leading to it that we have proven in this paper have further applications to problems concerning RBN. To justify this claim, we provide two applications. First, we show that the almost-sure coverability problem for RBN is PSPACE-complete, thereby closing a complexity gap from a previous paper ([3], Section 5.3). Second, we define a computation model using RBN, called RBN protocols, which is similar in spirit to the population protocols model [1,2]. We characterize precisely the set of predicates that can be computed using RBN protocols. This result generalizes the corresponding result for IO protocols, which are a sub-class of population protocols that can be simulated by RBN protocols, as shown in ([3], Section 6.2).

Finally, by the reduction given in ([3], Section 4.2), our results on cube-reachability and almost-sure coverability can be transferred to another model of distributed computation called asynchronous shared memory systems (ASMS), giving a PSPACE-completeness result for both of these problems. This solves an open problem from ([6], Section 6).

To summarize, we have shown that many important parameterized problems of RBN can be solved in PSPACE, that the sub-problem of the cube-reachability problem defined in [8] can be generalized while retaining the same upper bounds, and that the almost-sure coverability problems for RBN and ASMS are PSPACE-complete, thereby solving open problems from [3,6]. We believe that our other results might be of independent interest, and we provide an application by introducing RBN protocols and characterizing the set of predicates that they can compute.

The paper is organized as follows. Section 2 contains preliminaries, including the definition of RBN. Section 3 defines the symbolic graph of a RBN, and proves the properties of this graph needed to derive our main result. Section 4 contains the main result that a host of parameterized problems over cubes, including cube-reachability, is PSPACE-complete for RBN. Finally, Sections 5 and 6 give applications of our main results: Section 5 solves the complexity gap for the almost-sure coverability problem, and Section 6 introduces RBN protocols and characterizes their expressive power. Due to lack of space, full proofs of some of the results can be found in the long version.

### 2 Preliminaries

The definitions and notations in this section are taken from [3].

### 2.1 Multisets

A multiset on a finite set $E$ is a mapping $C : E \to \mathbb{N}$, i.e. for any $e \in E$, $C(e)$ denotes the number of occurrences of element $e$ in $C$. We let $\mathcal{M}(E)$ denote the set of all multisets on $E$. Let $⟅e_1, \ldots, e_n⟆$ denote the multiset $C$ such that $C(e) = |\{j \mid e_j = e\}|$. We sometimes write multisets using set-like notation. For example, $⟅2 \cdot a, b⟆$ and $⟅a, a, b⟆$ denote the same multiset. Given $e \in E$, we denote by $\mathbf{e}$ the multiset consisting of one occurrence of element $e$, that is $⟅e⟆$. Operations on $\mathbb{N}$ like addition or comparison are extended to multisets by defining them component-wise on each element of $E$. Subtraction is allowed as long as each component stays non-negative. We call $|C| \stackrel{\text{def}}{=} \sum_{e \in E} C(e)$ the size of $C$.
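These multiset operations map directly onto a standard library data structure. The following minimal sketch uses Python's `collections.Counter`; the helper names `multiset`, `leq`, `sub` and `size` are hypothetical and only serve to mirror the definitions above.

```python
from collections import Counter


def multiset(*elems):
    """Build the multiset containing the listed elements with multiplicity."""
    return Counter(elems)


def leq(c1, c2):
    """Component-wise comparison c1 <= c2."""
    return all(c2[e] >= n for e, n in c1.items())


def sub(c1, c2):
    """c1 - c2, allowed only if every component stays non-negative."""
    if not leq(c2, c1):
        raise ValueError("subtraction would make a component negative")
    return c1 - c2


def size(c):
    """|C|: the total number of occurrences."""
    return sum(c.values())


m = multiset("a", "a", "b")      # the multiset written 2*a, b in the text
total = m + multiset("b")        # component-wise addition
```

Here `m` equals `Counter({"a": 2, "b": 1})`, its size is 3, and `sub(m, multiset("a"))` yields the multiset with one `a` and one `b`.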

### 2.2 Reconfigurable Broadcast Networks

Reconfigurable broadcast networks (RBN) are networks consisting of finite-state, anonymous agents and a communication topology which specifies for every pair of processes, whether or not there is a communication link between them. During a single step, a single agent can broadcast a message which is received by all of its neighbors, after which both the agent and its neighbors change their state according to some transition relation. Further, in between two steps, the communication topology can change in an arbitrary manner. For the problems that we consider in this paper, it is easier to forget the communication topology and define the semantics of an RBN directly in terms of collections of agents.

Definition 1. A reconfigurable broadcast network is a tuple R = (Q, Σ, δ) where Q is a finite set of states, Σ is a finite alphabet and δ ⊆ Q×{!a, ?a | a ∈ Σ}× Q is the transition relation.

If $(p, !a, q)$ (resp. $(p, ?a, q)$) is a transition in $\delta$, we will denote it by $p \xrightarrow{!a} q$ (resp. $p \xrightarrow{?a} q$). A configuration $C$ of an RBN $\mathcal{R}$ is a multiset over $Q$, which intuitively counts the number of processes in each state. Given a letter $a \in \Sigma$ and two configurations $C$ and $C'$, we say that there is a step $C \xrightarrow{a} C'$ if there exists a multiset $⟅t, t_1, \ldots, t_k⟆$ of $\delta$ for some $k \geq 0$ satisfying the following: $t = p \xrightarrow{!a} q$, each $t_i = p_i \xrightarrow{?a} q_i$, $C \geq \mathbf{p} + \sum_i \mathbf{p_i}$, and $C' = C - \mathbf{p} - \sum_i \mathbf{p_i} + \mathbf{q} + \sum_i \mathbf{q_i}$. We sometimes write this as $C \xrightarrow{t + t_1, \ldots, t_k} C'$ or $C \xrightarrow{a} C'$. Intuitively it means that a process at the state $p$ broadcasts the message $a$ and moves to $q$, and for each $1 \leq i \leq k$, there is a process at the state $p_i$ which receives this message and moves to $q_i$. We denote by $\xrightarrow{\ast}$ the reflexive and transitive closure of the step relation. A run is then a sequence of steps.

Fig. 1. An RBN R with three states.

Let $\mathcal{R} = (Q, \Sigma, \delta)$ be an RBN. Given configurations $C$ and $C'$, we say $C'$ is reachable from $C$ if $C \xrightarrow{\ast} C'$. We say $C'$ is coverable from $C$ if there exists $C''$ such that $C \xrightarrow{\ast} C''$ and $C'' \geq C'$. The reachability problem consists of deciding, given an RBN $\mathcal{R}$ and configurations $C, C'$, whether $C'$ is reachable from $C$ in $\mathcal{R}$. The coverability problem consists of deciding, given an RBN $\mathcal{R}$ and configurations $C, C'$, whether $C'$ is coverable from $C$ in $\mathcal{R}$. Let $S$ be a set of configurations. The predecessor set of $S$ is $\mathit{pre}^{\ast}(S) \stackrel{\text{def}}{=} \{C' \mid \exists C \in S \,.\, C' \xrightarrow{\ast} C\}$, and the successor set of $S$ is $\mathit{post}^{\ast}(S) \stackrel{\text{def}}{=} \{C \mid \exists C' \in S \,.\, C' \xrightarrow{\ast} C\}$.

Example 1. Figure 1 illustrates an RBN $\mathcal{R} = (Q, \Sigma, \delta)$ with $Q = \{q_1, q_2, q_3\}$. Configuration $⟅3 \cdot q_1⟆$ can reach $⟅2 \cdot q_1, q_3⟆$ in two steps. First, a process broadcasts $a$, the two other processes receive it and move to $q_2$. Then, one of the processes in $q_2$ broadcasts $b$ and moves to $q_1$, while the other one receives $b$ and moves to $q_3$. Notice that $⟅q_3⟆$ is only coverable from a configuration $⟅k \cdot q_1⟆$ if $k \geq 3$.
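The step semantics can be replayed on this example. The sketch below is illustrative only: Figure 1 is not reproduced here, so the transition list `DELTA` is an assumption inferred from the two-step run described in Example 1 (the actual figure may contain further transitions), and the function name `step` is hypothetical.

```python
from collections import Counter

# Assumed transitions, inferred from the run in Example 1.
DELTA = [
    ("q1", "!a", "q1"),  # broadcaster of a stays in q1
    ("q1", "?a", "q2"),  # receivers of a move to q2
    ("q2", "!b", "q1"),  # broadcaster of b moves to q1
    ("q2", "?b", "q3"),  # receivers of b move to q3
]


def step(config, broadcast, receives):
    """Fire one broadcast transition t together with receive transitions t_1..t_k."""
    p, label, q = broadcast
    if not label.startswith("!"):
        raise ValueError("first transition must be a broadcast")
    letter = label[1:]
    if any(lab != "?" + letter for _, lab, _ in receives):
        raise ValueError("all receives must match the broadcast letter")
    consumed = Counter([p] + [pi for pi, _, _ in receives])
    produced = Counter([q] + [qi for _, _, qi in receives])
    # side condition C >= p + sum_i p_i
    if any(config[s] < n for s, n in consumed.items()):
        raise ValueError("not enough agents in the source states")
    # C' = C - p - sum_i p_i + q + sum_i q_i
    return config - consumed + produced


c0 = Counter({"q1": 3})
c1 = step(c0, DELTA[0], [DELTA[1], DELTA[1]])  # one agent broadcasts a, two receive
c2 = step(c1, DELTA[2], [DELTA[3]])            # one agent broadcasts b, one receives
```

Under these assumed transitions, `c1` has one agent in `q1` and two in `q2`, and `c2` has two agents in `q1` and one in `q3`, matching the run in the example. Note that receivers are optional: a step with an empty `receives` list models a broadcast that no neighbor hears.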

### 2.3 Cubes and Counting Sets

Given a finite set $Q$, a cube $\mathcal{C}$ is a subset of $\mathcal{M}(Q)$ described by a lower bound $L : Q \to \mathbb{N}$ and an upper bound $U : Q \to \mathbb{N} \cup \{\infty\}$ such that $\mathcal{C} = \{C : L \leq C \leq U\}$. Abusing notation, we identify the set $\mathcal{C}$ with the pair $(L, U)$. Notice that since $U(q)$ can be $\infty$ for some state $q$, a cube can contain an infinite number of configurations. All the results in this paper are true irrespective of whether the constants in a given input cube are encoded in unary or binary.

A finite union of cubes $\bigcup_{i=1}^{m} (L_i, U_i)$ is called a counting constraint and the set of configurations $\bigcup_{i=1}^{m} \mathcal{C}_i$ it describes is called a counting set. Notice that two different counting constraints may describe the same counting set. For example, let $Q = \{q\}$ and let $(L, U) = (1, 3)$, $(L', U') = (2, 4)$, $(L'', U'') = (1, 4)$. The counting constraints $(L, U) \cup (L', U')$ and $(L'', U'')$ define the same counting set. It is easy to show (see also Proposition 2 of [11]) that counting constraints and counting sets are closed under Boolean operations.
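Membership in a cube or in a counting constraint is a simple component-wise check. The following sketch (with hypothetical helper names `in_cube` and `in_constraint`, and dictionaries as an assumed representation of bounds and configurations) replays the one-state example above on a finite sample of configurations.

```python
import math


def in_cube(config, cube):
    """config: dict state -> count; cube: (lower, upper) dicts over the same states."""
    lower, upper = cube
    return all(lower[q] <= config.get(q, 0) <= upper[q] for q in lower)


def in_constraint(config, cubes):
    """A counting constraint is a finite union of cubes."""
    return any(in_cube(config, cube) for cube in cubes)


# The example from the text, with Q = {q}.
gamma_1 = [({"q": 1}, {"q": 3}), ({"q": 2}, {"q": 4})]
gamma_2 = [({"q": 1}, {"q": 4})]

# Compare both constraints on the configurations with 0..9 agents in q.
same = all(
    in_constraint({"q": n}, gamma_1) == in_constraint({"q": n}, gamma_2)
    for n in range(10)
)

# An upper bound of infinity yields a cube with infinitely many configurations.
unbounded = ({"q": 2}, {"q": math.inf})
```

Both constraints accept exactly the configurations with one to four agents in `q`. The comparison over a finite sample is only an illustration, not a general equivalence check for counting constraints.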

Norms. Let $\mathcal{C} = (L, U)$ be a cube. Let $\|\mathcal{C}\|_l$ be the sum of the components of $L$. Let $\|\mathcal{C}\|_u$ be the sum of the finite components of $U$ if there are any, and 0 otherwise. The norm of $\mathcal{C}$ is the maximum of $\|\mathcal{C}\|_l$ and $\|\mathcal{C}\|_u$, denoted by $\|\mathcal{C}\|$. We define the norm of a counting constraint $\Gamma = \bigcup_{i=1}^{m} \mathcal{C}_i$ as $\|\Gamma\| \stackrel{\text{def}}{=} \max_{i \in [1,m]} \{\|\mathcal{C}_i\|\}$. The norm of a counting set $S$ is the smallest norm of a counting constraint representing $S$, that is, $\|S\| \stackrel{\text{def}}{=} \min_{S = [\![\Gamma]\!]} \{\|\Gamma\|\}$. Proposition 5 of [11] entails the following results for the norms of the union, intersection and complement.

Proposition 1. Let $S_1, S_2$ be counting sets. The norms of the union, intersection and complement satisfy: $\|S_1 \cup S_2\| \le \max\{\|S_1\|, \|S_2\|\}$, $\|S_1 \cap S_2\| \le \|S_1\| + \|S_2\|$, and $\|\overline{S_1}\| \le |Q| \cdot \|S_1\| + |Q|$.

Reachability. The reachability problem can be generalized to the cube-reachability problem, which consists of deciding, given an RBN $\mathcal{R}$ and two cubes $\mathcal{C}, \mathcal{C}'$, whether there exist configurations $C \in \mathcal{C}$ and $C' \in \mathcal{C}'$ such that $C'$ is reachable from $C$ in $\mathcal{R}$. If this is the case, we say $\mathcal{C}'$ is reachable from $\mathcal{C}$. The counting set-reachability problem asks, given an RBN $\mathcal{R}$ and two counting sets $S, S'$, whether there exist cubes $\mathcal{C} \in S$ and $\mathcal{C}' \in S'$ such that $\mathcal{C}'$ is reachable from $\mathcal{C}$ in $\mathcal{R}$. We define cube-coverability and counting set-coverability in an analogous way.

Remark 1. In the paper [8], the authors define a sub-class of the cube-reachability problem, which is called the unbounded initial cube-reachability problem in [3]. More precisely, the sub-class considered in [8] is the following: We are given an RBN and two cubes $\mathcal{C} = (L, U)$ and $\mathcal{C}' = (L', U')$ with the special property that $L(q) = 0$ and $U(q) \in \{0, \infty\}$ for every state $q$. We then have to decide if $\mathcal{C}$ can reach $\mathcal{C}'$. This problem was shown to be PSPACE-complete ([8], Theorem 5.5) whenever the numbers in the input are given in unary. As we shall show later in this paper, the cube-reachability problem itself is in PSPACE, even when the input numbers are encoded in binary, thereby generalizing the upper bound results given in that paper.

### 3 Reachability sets of counting sets

In this section, we set the stage for proving the main result of this paper. This main result is given in two stages: First, we show that given an RBN with state set $Q$ and a counting set $S$, the set $post^*(S)$ is also a counting set and $\|post^*(S)\| \le 2^{p(\|S\| \cdot |Q|)}$ where $p$ is some fixed polynomial. Using this, we then prove that a host of cube-parameterized problems for RBN can be solved in PSPACE.

The rest of this section is organized as follows: To prove the first result, we recall the notion of a symbolic graph of a RBN from [8]. In the symbolic graph, each node is a symbolic configuration of the RBN, which intuitively represents an infinite set of configurations in which the number of agents is fixed in some states, and arbitrarily big in the others. Next, by exploiting the special structure of the symbolic graph, we prove some properties which allow us to show that whenever two nodes in this graph are reachable, they are reachable by a path having a special structure. Finally, using these properties and the connection between symbolic configurations and configurations of the RBN, we prove the desired first result. Once we have shown the first result, we then show how the PSPACE Theorem can be obtained from it.

Throughout this section, we fix an RBN R = (Q, Σ, δ).

#### 3.1 Symbolic graph

In this subsection, we recall the notion of a symbolic graph of an RBN from [8]. Here, for the sake of convenience, we define it in a slightly different way, but the underlying notion is the same as [8]. Throughout this subsection and the next, we fix a number k ∈ N.

The symbolic graph of index $k$ associated with the RBN $\mathcal{R}$ is an edge-labelled graph $\mathcal{G}_k = (N, E, L)$ where $N = \mathcal{M}_k(Q) \times 2^Q$ is the set of nodes. Here $\mathcal{M}_k(Q)$ denotes the set of multisets on $Q$ of size at most $k$. $E$ is the set of edges and $L \colon E \to \Sigma$ is the labelling function. Each node of $\mathcal{G}_k$ is also called a symbolic configuration. Intuitively, in each symbolic configuration $(v, S)$, the multiset $v$ (called the concrete part) is used to keep track of a fixed set of at most $k$ agents, and the subset $S$ (called the abstract part) is used to keep track of the support of the remaining agents.

Let $\theta = (v, S)$ and $\theta' = (v', S')$ be two symbolic configurations. There is an edge labelled by $a$ between $\theta$ and $\theta'$ if and only if the following is satisfied: There exists a transition $(q, !a, q') \in \delta$ such that at least one of the following two conditions holds

- If $q_s \in S \setminus S'$ then there exists $q'_s \in S'$ and $(q_s, ?a, q'_s) \in R$.

An edge labelled by $a$ between $\theta$ and $\theta'$ is denoted by $\theta \overset{a}{\rightsquigarrow}_{\mathcal{G}_k} \theta'$. The relation $\overset{*}{\rightsquigarrow}_{\mathcal{G}_k}$ is the reflexive and transitive closure of $\rightsquigarrow_{\mathcal{G}_k} := \bigcup_{a \in \Sigma} \overset{a}{\rightsquigarrow}_{\mathcal{G}_k}$. Whenever the index $k$ is clear, we will drop the subscript $\mathcal{G}_k$ from these notations.

Remark 2. Let $\theta = (v, S)$, $\theta' = (v', S')$ be two symbolic configurations. By construction, $\theta$ can only reach $\theta'$ if $|v| = |v'|$.

To give an intuition behind the edges in $\mathcal{G}_k$, recall the intuition that in a symbolic configuration, the concrete part is used to keep track of a fixed set of at most $k$ processes and the abstract part is used to keep track of the support of the remaining processes. The first condition for the existence of an edge asserts the following: 1) In the concrete part, some process broadcasts the message $a$ and some subset of processes receive $a$, 2) In the abstract part, any new state added or any old state deleted comes because of receiving $a$. The second condition asserts exactly the same, except we now require the process broadcasting the message $a$ to be from the abstract part.

The symbolic graph of index $k$ can be thought of as an abstraction of the set of configurations of $\mathcal{R}$, where only a fixed number of processes are explicitly represented and the rest are abstracted by means of their support alone. To formalize this, given a symbolic configuration $\theta = (v, S)$, we let $[[\theta]]$ denote the following (infinite) set of configurations: $C \in [[\theta]]$ if and only if $C(q) = v(q)$ for $q \notin S$ and $C(q) \ge v(q)$ for $q \in S$.

Fig. 2. Symbolic graph $\mathcal{G}_0$ of index 0 of the RBN of Example 1.

Example 2. The symbolic graph $\mathcal{G}_0$ of index 0 of the RBN of Example 1 is illustrated in Figure 2. At this index, the graph only keeps track of a subset $S \subseteq Q$, and the edges correspond to broadcasts from $S$. Consider the edges from $\{q_1\}$. The self-loop corresponds to a broadcast of $a$ that is not received. The edge to $\{q_1, q_2\}$ corresponds to a broadcast of $a$ received by at least one process in $q_1$. There is no edge from $\{q_3\}$ because there is no broadcast transition from $q_3$.

We then have the following lemma, which asserts that runs between two configurations in an RBN induce corresponding runs in the symbolic graph. The proof of the lemma is easily obtained from the definition of the symbolic graph.

Lemma 1. Let $C, C'$ be two configurations of $\mathcal{R}$ such that $C \xrightarrow{a} C'$. Then, for every $\theta$ such that $C \in [[\theta]]$, there exists $\theta'$ such that $C' \in [[\theta']]$ and $\theta \overset{a}{\rightsquigarrow} \theta'$.

### 3.2 Properties of the symbolic graph

In this subsection, we prove some properties of the symbolic graph (of any index k). The first two properties that we prove exhibit some structural properties on the paths of the symbolic graph. The next two properties relate paths over the symbolic graph to runs over the configurations of the given RBN. These four properties will ultimately lead us to prove our two main contributions in the next section.

First property: Monotonicity. Let $k \in \mathbb{N}$ and let $\mathcal{G}_k$ be the symbolic graph of index $k$ associated with $\mathcal{R}$. The first key property of $\mathcal{G}_k$ is the following property, which we call monotonicity.

Proposition 2. Let $\theta = (v, S)$ and $\theta' = (v', S')$ be symbolic configurations of $\mathcal{G}_k$. Then the following are true:


Proof. The two points follow immediately from the definition of $\overset{a}{\rightsquigarrow}$.

Second property: Normal Form. To state the second property, we first need a small definition.

Definition 2. Let $(v_0, S_0) \rightsquigarrow \cdots \rightsquigarrow (v_m, S_m)$ be a path in $\mathcal{G}_k$. A pair of indices $0 \le i < j \le m$ is called a bad pair if $(S_i \setminus S_{i+1}) \cap S_j \ne \emptyset$. A path is said to be in normal form if it contains no bad pairs, i.e., for all $0 \le i < m$ and any $j > i$, $(S_i \setminus S_{i+1}) \cap S_j = \emptyset$.

Intuitively, a path is in normal form if during each step, the states that disappear from the abstract part never reappear again. The following lemma asserts that whenever there is a path between two symbolic configurations, then there is a path between them that is in normal form.
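The normal-form condition depends only on the sequence of abstract parts along a path, so it can be checked without the rest of the graph. The following sketch lists the bad pairs of such a sequence; the input format (a list of Python sets $S_0, \ldots, S_m$) and the function names are assumptions for illustration.

```python
def bad_pairs(abstract_parts):
    """Return all pairs (i, j), i < j, where a state dropped at step i is in S_j."""
    m = len(abstract_parts) - 1
    pairs = []
    for i in range(m):
        dropped = abstract_parts[i] - abstract_parts[i + 1]  # S_i \ S_{i+1}
        for j in range(i + 1, m + 1):
            if dropped & abstract_parts[j]:
                pairs.append((i, j))
    return pairs


def in_normal_form(abstract_parts):
    """A path is in normal form iff it has no bad pairs."""
    return not bad_pairs(abstract_parts)
```

For instance, the sequence `[{"p", "q"}, {"p"}, {"p", "q"}]` has the single bad pair `(0, 2)`, since `q` disappears in the first step and reappears at the end.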

Lemma 2. Let $\theta, \theta'$ be symbolic configurations of $\mathcal{G}_k$ such that there is a path between $\theta$ and $\theta'$ of length $m$. Then, there is a path in normal form between $\theta$ and $\theta'$ of length $m$.

Proof Sketch. Let $\theta = \theta_0 \rightsquigarrow \theta_1 \rightsquigarrow \theta_2 \cdots \theta_{m-1} \rightsquigarrow \theta_m = \theta'$ be the path between $\theta$ and $\theta'$. We proceed by induction on $m$. The claim is clearly true for $m = 0$. Suppose $m > 0$ and the claim is true for $m - 1$. By induction hypothesis, we can assume that the path $\theta_0 \rightsquigarrow \theta_1 \cdots \theta_{m-1}$ is already in normal form.

Let each $\theta_i = (v_i, S_i)$. Let $l$ be the number of bad pairs in the path between $\theta_0$ and $\theta_m$. If $l = 0$, then the path is already in normal form and we are done. Suppose $l > 0$ and let $(w, w')$ be a bad pair. Since the path between $\theta_0$ and $\theta_{m-1}$ is already in normal form, it has to be the case that $w' = m$. Hence, we have $Z := (S_w \setminus S_{w+1}) \cap S_m \ne \emptyset$.

By Proposition 2, the following is a valid path: $(v_w, S_w) \rightsquigarrow (v_{w+1}, S_{w+1} \cup Z) \rightsquigarrow (v_{w+2}, S_{w+2} \cup Z) \cdots (v_{m-1}, S_{m-1} \cup Z) \rightsquigarrow (v_m, S_m \cup Z) = (v_m, S_m)$. Let $\theta'_j := \theta_j$ if $j \le w$ and $(v_j, S_j \cup Z)$ otherwise. Hence, we get a path $\theta'_0 \rightsquigarrow \theta'_1 \cdots \theta'_{m-1} \rightsquigarrow \theta'_m$.

Let each $\theta'_e = (v'_e, S'_e)$ and let $0 \le i < j \le m - 1$. By a case analysis on where $i$ and $j$ are relative to the index $w$, we can prove that $(S'_i \setminus S'_{i+1}) \cap S'_j = \emptyset$. Having proved this, it is then clear by construction that this new path from $\theta'_0 := \theta_0$ to $\theta'_m := \theta_m$ has at most $l - 1$ bad pairs. Hence, we now have a path from $\theta_0$ to $\theta_m$ such that the prefix of length $m - 1$ is in normal form and the number of bad pairs has been strictly reduced to $l - 1$. Repeatedly applying this procedure leads to a path in normal form between $\theta_0$ and $\theta_m$.

Third property: Refinement. Before we state the third property, we need a small definition. Recall that, given a symbolic configuration $\theta = (v, S)$, the set $[[\theta]]$ denotes the set of configurations $C$ such that $C(q) = v(q)$ if $q \notin S$ and $C(q) \ge v(q)$ otherwise. The following definition refines the set $[[\theta]]$.

Definition 3. Given a symbolic configuration $\theta = (v, S)$ and a number $N \in \mathbb{N}$, let $[[\theta]]_N$ denote the set of configurations $C$ such that $C(q) = v(q)$ if $q \notin S$ and $C(q) \ge v(q) + N$ otherwise. Note that $[[\theta]] = [[\theta]]_0$.
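Membership in $[[\theta]]_N$ is a componentwise test, as the following sketch shows. Configurations and concrete parts are assumed to be dictionaries from state names to counts, the abstract part a Python set, and the name `in_denotation` is hypothetical; the default `N=0` gives membership in $[[\theta]]$.

```python
def in_denotation(config, v, S, states, N=0):
    """True iff config lies in the refined denotation of theta = (v, S)."""
    for q in states:
        count, base = config.get(q, 0), v.get(q, 0)
        if q in S:
            if count < base + N:   # abstract states need at least v(q) + N processes
                return False
        elif count != base:        # other states must match v exactly
            return False
    return True
```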

This definition along with the above two properties now enable us to prove the third property. It roughly states that if a symbolic configuration $\theta'$ can be reached from another symbolic configuration $\theta$, then there is a "small" $N$ such that any configuration in $[[\theta']]_N$ can be reached from some configuration in $[[\theta]]$.

Theorem 1. Let $\theta, \theta'$ be symbolic configurations of $\mathcal{G}_k$ such that $\theta \overset{*}{\rightsquigarrow} \theta'$. Then there exists $N \le k \times (2k)^{|Q|} \times (|Q| + 1)^{|Q|+1} + 1$ such that for all $C' \in [[\theta']]_N$, there exists $C \in [[\theta]]$ such that $C \xrightarrow{*} C'$.

Proof Sketch. Suppose $\theta \overset{*}{\rightsquigarrow} \theta'$. If the length of the path is 0, then there is nothing to prove. Hence, we restrict ourselves to the case when the length of the path is bigger than 0. By Lemma 2, there is a path in normal form from $\theta$ to $\theta'$, say $\theta = \theta_0 \rightsquigarrow \theta_1 \rightsquigarrow \theta_2 \cdots \theta_{m-1} \rightsquigarrow \theta_m = \theta'$ with each $\theta_i := (v_i, S_i)$.

Let $N_0 = 0$ and let $N_i = (N_{i-1} + 1) \cdot (|S_{i-1} \setminus S_i| + 1)$ for every $1 \le i \le m$. In Lemma 5.3 of [8] (more precisely in its proof, in Lemma 6 of the long version [9]), the following fact has been proved:

For every $1 \le i \le m$ and for every $C' \in [[\theta_i]]_{N_i + 1}$, there exists $C \in [[\theta_{i-1}]]_{N_{i-1} + 1}$ such that $C \xrightarrow{*} C'$.

This immediately proves that for all $C' \in [[\theta']]_{N_m + 1}$, there exists $C \in [[\theta]]$ such that $C \xrightarrow{*} C'$. If we prove $N_m \le k \times (2k)^{|Q|} \times (|Q| + 1)^{|Q|+1}$, then the proof of the theorem will be complete.

Notice that if $(v, \emptyset) \rightsquigarrow (v', S')$ is an edge in $\mathcal{G}_k$ then $S' = \emptyset$. This fact, along with the definition of a path in normal form, allows us to easily conclude that the number of indices $i$ such that $|S_{i-1} \setminus S_i| > 0$ is at most $|Q|$. It then follows that except for at most $|Q|$ indices, each index $N_i$ is obtained from $N_{i-1}$ by simply adding 1 and in the remaining indices, $N_i$ is obtained from $N_{i-1}$ by adding 1 and then multiplying by a number which is at most $|Q| + 1$. Using this, we can deduce that the maximum value for $N_m$ is at most $(m - |Q| + 1)|Q|(|Q| + 1)^{|Q|}$. Since $m$ is itself the length of the path between $\theta_0$ and $\theta_m$, $m$ is upper bounded by the number of symbolic configurations in $\mathcal{G}_k$, which is at most $k \times k^{|Q|} \times 2^{|Q|}$. Overall we get that $N_m \le k \times (2k)^{|Q|} \times (|Q| + 1)^{|Q|+1}$.
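To see how the sequence $N_0, \ldots, N_m$ grows, the recurrence can be evaluated along the abstract parts of a path. The sketch below does this for a hypothetical sequence of sets and also evaluates the closed-form bound stated in Theorem 1; the function names and the sample input are assumptions for illustration, and the sample sequence is not claimed to come from an actual symbolic graph.

```python
def refinement_offsets(abstract_parts):
    """Compute N_0, ..., N_m with N_i = (N_{i-1} + 1) * (|S_{i-1} \\ S_i| + 1)."""
    offsets = [0]  # N_0 = 0
    for prev, cur in zip(abstract_parts, abstract_parts[1:]):
        offsets.append((offsets[-1] + 1) * (len(prev - cur) + 1))
    return offsets


def theorem1_bound(k, num_states):
    """The bound k * (2k)^|Q| * (|Q| + 1)^(|Q| + 1) + 1 from Theorem 1."""
    return k * (2 * k) ** num_states * (num_states + 1) ** (num_states + 1) + 1
```

On the sequence `[{"p", "q"}, {"p", "q"}, {"p"}, {"p"}]`, only the second step drops a state, so only that step multiplies; the other steps just add 1.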

Remark 3. A similar result was proved in Lemma 5.3 of [8], but there it was just stated that there exists an N satisfying this property. Moreover from the proof of that lemma, only a doubly exponential bound on N could be inferred.

Fourth property: Compatibility. To describe the fourth property, we need the following notion of order on configurations, relative to a given symbolic configuration.

Definition 4. Let $\theta = (v, S)$ be a symbolic configuration, and let $C, C'$ be two configurations of $\mathcal{R}$. We define an order $\preceq_\theta$ such that $C \preceq_\theta C'$ if and only if $C, C' \in [[\theta]]$, and $\forall q \in S$, $C(q) \le C'(q)$.
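The order of Definition 4 can be sketched as a predicate on pairs of configurations. As before, the dictionary representation and the name `leq_theta` are assumptions for illustration.

```python
def leq_theta(c1, c2, v, S, states):
    """True iff c1 and c2 both lie in the denotation of (v, S) and c1 <= c2 on S."""

    def member(c):
        # Exact match outside S, at least v(q) inside S.
        return all(
            c.get(q, 0) >= v.get(q, 0) if q in S else c.get(q, 0) == v.get(q, 0)
            for q in states
        )

    return member(c1) and member(c2) and all(c1.get(q, 0) <= c2.get(q, 0) for q in S)
```

Two configurations that differ outside $S$ cannot both belong to $[[\theta]]$, so they are never comparable under this order.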

This definition enables us to state our next property, which we dub compatibility. It intuitively says that the order that we have defined is, in some sense, compatible with the edges of the symbolic configurations.

Lemma 3. Let $\theta$ be a symbolic configuration of $\mathcal{G}_k$, and let $C, C'$ be two configurations of $\mathcal{R}$. If $C \in [[\theta]]$ and $C \xrightarrow{*} C'$, then there exists a symbolic configuration $\theta'$ such that 1) $C' \in [[\theta']]$, 2) $\theta \overset{*}{\rightsquigarrow} \theta'$ and 3) for all $C'_1$ such that $C'_1 \succeq_{\theta'} C'$, there exists $C_1 \in [[\theta]]$ such that $C_1 \xrightarrow{*} C'_1$.

Proof. Let $\theta$ be a symbolic configuration and $C, C'$ be configurations such that $C \in [[\theta]]$ and $C \xrightarrow{*} C'$. Let $C = C_0 \to \cdots \to C_{m-1} \to C_m = C'$ denote the run between $C$ and $C'$. We prove the property by induction on $m$. For $m = 0$, we have $C = C'$. The property is easily seen to hold with $\theta' = \theta$.

Suppose now that $m \ge 1$, and that the property holds for all $n < m$. By induction hypothesis, for the configuration $C_{m-1}$, there exists a symbolic configuration $\theta_{m-1}$ satisfying the property, in particular $\theta \overset{*}{\rightsquigarrow} \theta_{m-1}$. Since $C_{m-1} \xrightarrow{a} C_m$ for some $a \in \Sigma$, by Lemma 1, there exists a symbolic configuration $\theta_m$ such that $C_m \in [[\theta_m]]$, and $\theta_{m-1} \overset{a}{\rightsquigarrow} \theta_m$. Using $\theta \overset{*}{\rightsquigarrow} \theta_{m-1}$, we obtain that $\theta \overset{*}{\rightsquigarrow} \theta_m$.

Let $\theta_{m-1} = (v_{m-1}, S_{m-1})$ and $\theta_m = (v_m, S_m)$. Let $C'_m \in [[\theta_m]]$ be such that $C'_m \succeq_{\theta_m} C_m$. We will construct a configuration $C'_{m-1} \in [[\theta_{m-1}]]$ such that $C'_{m-1} \succeq_{\theta_{m-1}} C_{m-1}$ and $C'_{m-1} \xrightarrow{*} C'_m$. If we construct such a configuration, then by induction hypothesis, there is a $C_1 \in [[\theta]]$ such that $C_1 \xrightarrow{*} C'_{m-1} \xrightarrow{*} C'_m$, which will conclude the proof.

Let $C'_{m-1}(q) = C_{m-1}(q)$ for all $q \notin S_{m-1}$. To define $C'_{m-1}$ on $S_{m-1}$, we first define a mapping $pred$ from states in $S_m$ to states of $S_{m-1} \cup \overline{S_{m-1}} = Q$ as follows. Given $q' \in S_m$:


By definition, $C'_m(q) = C_m(q)$ for all $q \notin S_m$. For all $q \in S_m$, let $n_q = C'_m(q) - C_m(q)$. Intuitively, we want to place these $n_q$ processes in the right places of $C'_{m-1}$ so that $C'_{m-1} \to C'_m$. For all $q \in S_{m-1}$, let $C'_{m-1}(q) = C_{m-1}(q) + \sum_{q' \in S_m,\, pred(q') = q} n_{q'}$. By definition, $C'_{m-1} \succeq_{\theta_{m-1}} C_{m-1}$. So all that remains is to prove that $C'_{m-1} \xrightarrow{*} C'_m$.

Let $C_{m-1} \xrightarrow{t + t_1, \ldots, t_n} C_m$ where $t = (p, !a, p')$ and each $t_i = (p_i, ?a, p'_i)$. If we let $S_m \setminus S_{m-1} = \{q'_1, \ldots, q'_w\}$, then by definition there is a transition $t'_i := (pred(q'_i), ?a, q'_i)$ for each $i$. Additionally, $C'_{m-1}(pred(q'_i)) \ge C_{m-1}(pred(q'_i)) + n_{q'_i}$. This allows us to do $C'_{m-1} \xrightarrow{t + t_1, \ldots, t_n,\, n_{q'_1} \cdot t'_1,\, n_{q'_2} \cdot t'_2, \ldots, n_{q'_w} \cdot t'_w} C'_m$, which concludes the proof.

### 4 The PSPACE Theorem

In this section, we prove our two main contributions. First, we show that given a cube C, post<sup>∗</sup> (C) is a counting set of bounded size. Using this, we show our main result: any boolean combination of atoms can be evaluated in PSPACE, where an atom is a counting set or the reachability set of a counting set. We call this the PSPACE Theorem. The intuition behind the PSPACE Theorem is that the norms of the counting sets obtained by such combinations are "small", and so we only need to examine small configurations to verify them, thus yielding a PSPACE algorithm for checking correctness. In particular, the PSPACE Theorem will show that the cube-reachability problem is in PSPACE. We fix an arbitrary RBN R = (Q, Σ, δ) for the rest of the section.

We start by drawing links between cubes and symbolic configurations.


Notice that the set $\Delta_{\mathcal{C}}$ is included in the symbolic graph of index $2\|\mathcal{C}\|$. Indeed, if $\mathcal{C} = (L, U)$ and $(v, S) \in \Delta_{\mathcal{C}}$, then $|v| \le |L| + |U_f|$ where $U_f(q) = 0$ if $U(q) = \infty$ and $U_f(q) = U(q)$ otherwise. Since $\|\mathcal{C}\| = \max(|L|, |U_f|)$, we have the desired result. By Remark 2, we know that symbolic configurations in the graph of index $2\|\mathcal{C}\|$ can only reach symbolic configurations which are also in the graph of index $2\|\mathcal{C}\|$.

Lemma 4. Given a cube $\mathcal{C}$, the sets $\Delta_{\mathcal{C}}$ and $post^*(\Delta_{\mathcal{C}})$ are included in the symbolic graph of index $2\|\mathcal{C}\|$.

There are only a finite number of symbolic configurations in the graph of a given index. Therefore $post^*(\Delta_{\mathcal{C}})$ is a finite set of symbolic configurations $\theta$. It follows that $[[post^*(\Delta_{\mathcal{C}})]]$ is the finite union of the cubes $\mathcal{C}_\theta$, and thus a counting set.

Unfortunately, it is in general not the case that $post^*(\mathcal{C}) = [[post^*(\Delta_{\mathcal{C}})]]$, which would close our argument. However, we will show that for each symbolic configuration $\theta$ in $post^*(\Delta_{\mathcal{C}})$, there is a counting set $S_\theta \subseteq [[\theta]]$ such that the finite union of these counting sets is equal to $post^*(\mathcal{C})$. This will then show our first important result, namely that the reachability set of a counting set is also a counting set with "small" norm.

Theorem 2. Let $\mathcal{C}$ be a cube. Then $post^*(\mathcal{C})$ is a counting set and

$$\|post^*(\mathcal{C})\| \in O\left((\|\mathcal{C}\| \cdot |Q|)^{|Q|+2}\right).$$

The same holds for pre<sup>∗</sup> by using the given RBN with reversed transitions.

Proof. We start by defining a counting set $\mathcal{M}$ of configurations, which we will then prove to be equal to $post^*(\mathcal{C})$. Given a symbolic configuration $\theta$ of $post^*(\Delta_{\mathcal{C}})$, we define the set $\min(\theta, \mathcal{C})$ to be the set of configurations $C \in [[\theta]]$ such that $C$ is minimal for the order $\preceq_\theta$ over the configurations of $post^*(\mathcal{C})$, i.e.

$$\min(\theta, \mathcal{C}) = \min_{\preceq_{\theta}} \left\{ C \in [[\theta]] \mid C \in post^*(\mathcal{C}) \right\}.$$

We can now define M to be the following set

$$\mathcal{M} = \bigcup_{\theta \in post^*(\Delta_{\mathcal{C}})} \ \bigcup_{C \in \min(\theta, \mathcal{C})} \mathcal{C}_C^{\theta},$$

where $\mathcal{C}_C^{\theta}$ is the cube $\mathcal{C}_{(C,S)}$ for $S$ such that $\theta = (v, S)$. Since $\mathcal{M}$ is a finite union of cubes, it is a counting set.

We show that $post^*(\mathcal{C}) \subseteq \mathcal{M}$. Let $C \in post^*(\mathcal{C})$. There exists $C_0 \in \mathcal{C}$ such that $C_0 \xrightarrow{*} C$, and there exists $\theta_0 \in \Delta_{\mathcal{C}}$ such that $C_0 \in [[\theta_0]]$. Applying Lemma 1, we obtain the existence of $\theta \in post^*(\theta_0) \subseteq post^*(\Delta_{\mathcal{C}})$ such that $C \in [[\theta]]$. Now, there exists a configuration $C' \in \min(\theta, \mathcal{C})$ such that $C' \preceq_\theta C$. By definition of $\mathcal{C}_{C'}^{\theta}$, $C$ is in $\mathcal{C}_{C'}^{\theta}$ and thus in $\mathcal{M}$.

Now we show that $\mathcal{M} \subseteq post^*(\mathcal{C})$. Let $C \in \mathcal{M}$. By definition, there must be a symbolic configuration $\theta \in post^*(\Delta_{\mathcal{C}})$ and a configuration $C' \in post^*(\mathcal{C})$ such that $C' \preceq_\theta C$. By the Compatibility Lemma (Lemma 3), $C$ is in $post^*(\mathcal{C})$ as well.

All that remains is to bound the norm of $\mathcal{M}$. To do this, let $\theta = (v, S) \in post^*(\Delta_{\mathcal{C}})$ and let $C \in \min(\theta, \mathcal{C})$. If we bound the norm of $\mathcal{C}_C^{\theta}$ by the desired quantity, then the proof will be complete. Noticing that $\|\mathcal{C}_C^{\theta}\| = |C|$, it suffices to bound $|C|$ by the desired quantity, which is what we shall do now.

By Theorem 1 and Lemma 4, there exists an $N \le 2\|\mathcal{C}\| \times (4\|\mathcal{C}\|)^{|Q|} \times (|Q| + 1)^{|Q|+1}$ such that $[[post^*(\Delta_{\mathcal{C}})]]_N \subseteq post^*([[\Delta_{\mathcal{C}}]]) = post^*(\mathcal{C})$. By definition of $C$, there must be a smallest $N'$ such that $C(q) \le v(q) + N'$ for every state $q$. If $N' > N$, then let $C_N$ be the configuration given by $C_N(q) = \min(C(q), v(q) + N)$. We get that $C_N \in [[\theta]]_N \subseteq [[post^*(\Delta_{\mathcal{C}})]]_N \subseteq post^*(\mathcal{C})$, and so $C_N \preceq_\theta C$ and $C_N \in post^*(\mathcal{C})$, which is a contradiction to the minimality of $C$. Hence $N' \le N$ and so $|C| \le |v| + |Q| \cdot N$. Since $\theta = (v, S)$ is in $post^*(\Delta_{\mathcal{C}})$, by Lemma 4, we have that $|v| \le 2\|\mathcal{C}\|$. Substituting the upper bounds for $|v|$ and $N$ in the inequality $|C| \le |v| + |Q| \cdot N$ then gives the required upper bound for $|C|$, thereby finishing the proof.
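For concreteness, the substitution in the last step can be written out as follows, using the bound on $|v|$ from Lemma 4 and the bound on $N$ exactly as stated in this proof. This only spells out the inequality and does not carry out the simplification into the asymptotic form of Theorem 2.

$$|C| \;\le\; |v| + |Q| \cdot N \;\le\; 2\|\mathcal{C}\| + |Q| \cdot 2\|\mathcal{C}\| \cdot (4\|\mathcal{C}\|)^{|Q|} \cdot (|Q| + 1)^{|Q|+1}.$$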

This result also holds for $pre^*(\mathcal{C})$. If $\mathcal{R} = (Q, \Sigma, R)$ is the given RBN, consider the "reverse" RBN $\mathcal{R}_r$, defined as $\mathcal{R}_r = (Q, \Sigma, R_r)$ where $R_r$ has a transition $(q, \star a, q')$ for $\star \in \{!, ?\}$ iff $R$ has a transition $(q', \star a, q)$. Notice that $\mathcal{R}_r$ is still an RBN and that $post^*(\mathcal{C})$ in $\mathcal{R}$ is equal to $pre^*(\mathcal{C})$ in $\mathcal{R}_r$.
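The reversal construction is purely syntactic, as the following sketch illustrates. Transitions are assumed to be 4-tuples `(source, kind, message, target)` with `kind` being `"!"` or `"?"`; this encoding and the name `reverse_rbn` are hypothetical.

```python
def reverse_rbn(transitions):
    """Swap source and target of every transition, keeping kind and message."""
    return {(target, kind, msg, source) for (source, kind, msg, target) in transitions}


# Sample transition relation used only to exercise the function.
sample = {
    ("q1", "!", "a", "q1"),
    ("q1", "?", "a", "q2"),
    ("q2", "!", "b", "q1"),
    ("q2", "?", "b", "q3"),
}
```

Reversing twice gives back the original transition relation, which matches the symmetry between $post^*$ and $pre^*$ used above.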

Recall that counting sets are closed under boolean operations. With the above theorem, plus the fact that counting sets are finite unions of cubes, we obtain the following closure result.

Corollary 1 (Closure). Counting sets are closed under $post^*$, $pre^*$ and boolean operations.

We are now ready to show our main result, the PSPACE Theorem. We show that there exist PSPACE algorithms to evaluate boolean combinations over counting sets and reachability set of counting sets. This result and its proof are adapted from a similar result for population protocols in [12].

Given a counting constraint Γ, we let [Γ] denote the counting set described by Γ. To state our result, we first define some "nice" expressions.

Definition 5. A nice expression is any expression that is constructed by the following syntax:

> $E := \Gamma \mid post^*(\Gamma) \mid pre^*(\Gamma) \mid E \cap E \mid E \cup E \mid \overline{E}$

where Γ is any counting constraint.

If E is a nice expression, then the size of E, denoted by |E|, is defined as follows:

- If $E = \Gamma$ or $post^*(\Gamma)$ or $pre^*(\Gamma)$, then $|E| = 1$;
- If $E = E_1 \cup E_2$ or $E = E_1 \cap E_2$, then $|E| = |E_1| + |E_2|$;
- If $E = \overline{E_1}$, then $|E| = |E_1| + 1$.
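The size of a nice expression follows the syntax tree. In the sketch below, expressions are assumed to be nested tuples whose first component is one of the hypothetical tags `"gamma"`, `"post"`, `"pre"`, `"union"`, `"inter"` and `"compl"`; counting constraints themselves are treated as opaque leaves.

```python
def size(expr):
    """Size of a nice expression given as a nested tuple."""
    tag = expr[0]
    if tag in ("gamma", "post", "pre"):
        return 1                                  # atoms have size 1
    if tag in ("union", "inter"):
        return size(expr[1]) + size(expr[2])      # binary operators add sizes
    if tag == "compl":
        return size(expr[1]) + 1                  # complement adds 1
    raise ValueError(f"unknown expression tag: {tag!r}")


# post*(G1) intersected with the complement of pre*(G2)
example = ("inter", ("post", "G1"), ("compl", ("pre", "G2")))
```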

The set of configurations that is described by a nice expression E can be defined in a straightforward manner, and is denoted as [E].

Notice that any nice expression E is a counting constraint, and [E] is a counting set, by the Closure Corollary 1.

Theorem 3 (PSPACE Theorem). Let E be a nice expression and let N be the maximum norm of the counting constraints appearing in E. Then [E] is a counting set of norm at most exponential in N, |E| and |Q|. Further, the membership and emptiness problems for [E] are in PSPACE.

Proof. Recall that [E] is a counting set, by the Closure Corollary (Corollary 1). The exponential bounds for the norms follow immediately from Proposition 1 and Theorem 2. The membership complexity for union, intersection and complement is easy to see. Without loss of generality it suffices to prove that membership in $post^*(\Gamma)$ is in PSPACE, where $\Gamma$ is a counting constraint.

By Savitch's Theorem NPSPACE=PSPACE, so we provide a nondeterministic algorithm. Given $(C, \Gamma)$, we want to decide whether $C \in post^*(\Gamma)$. The algorithm first guesses a configuration $C_0 \in \Gamma$ of the same size as $C$, verifies that $C_0$ belongs to $\Gamma$, and then simply guesses an execution starting at $C_0$, step by step. The algorithm stops if either the configuration reached at some step is $C$, or if it has guessed more steps than the number of configurations of size $|C|$. This concludes the discussion regarding the membership complexity.
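The membership check can be illustrated by a deterministic variant that explores all configurations of the same size as $C$. This is a sketch only: unlike the nondeterministic algorithm above, it stores the set of visited configurations and therefore does not run in polynomial space. The functions `in_gamma` and `successors` are assumed to be supplied by the caller, and `toy_successors` is a made-up step relation used purely to exercise the search.

```python
from itertools import combinations_with_replacement


def in_post_star(target, in_gamma, successors, states):
    """Decide whether `target` is reachable from some start accepted by in_gamma.

    Configurations are sorted tuples of states; steps preserve their length,
    so only start configurations of the same size as `target` are considered.
    """
    target = tuple(sorted(target))
    starts = [
        c for c in combinations_with_replacement(sorted(states), len(target))
        if in_gamma(c)
    ]
    seen = set(starts)
    stack = list(starts)
    while stack:
        current = stack.pop()
        if current == target:
            return True
        for nxt in successors(current):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def toy_successors(config):
    """Hypothetical step relation: one process moves from 'x' to 'y'."""
    return {
        tuple(sorted(config[:i] + ("y",) + config[i + 1:]))
        for i, state in enumerate(config)
        if state == "x"
    }
```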

To see that checking emptiness of $E$ is in PSPACE, notice that if $E$ is nonempty, then it has an element of size at most $\|E\|$. We can guess such an element $C$ in polynomial space (by representing each coefficient in binary), and verify that $C$ is indeed in $E$ by means of the PSPACE membership algorithm.

This result is a powerful tool which can be used to prove that a host of problems are in PSPACE for RBN. For instance, the cube-reachability problem for cubes $\mathcal{C}$ and $\mathcal{C}'$ amounts to checking whether $post^*(\mathcal{C}) \cap \mathcal{C}'$ is nonempty, which by the PSPACE Theorem can be done in PSPACE. Combining this with Remark 1, we obtain the following result.

### Theorem 4. Cube-reachability is PSPACE-complete for RBN.

By the reduction given in Section 4.2 of [3], this result also proves that cube-reachability is PSPACE-complete for asynchronous shared-memory systems (ASMS), which is another model of distributed computation where agents communicate by a shared register. Due to lack of space, we defer a discussion of this result to the appendix.

We will demonstrate further applications of the PSPACE Theorem in the next section.

### 5 Application 1: Almost-sure coverability

Having presented our PSPACE Theorem and the closure property for reachability sets of counting sets, we now provide two applications. For the first one, we consider the almost-sure coverability problem for RBN. Using our new results, we prove that this problem is PSPACE-complete.

The rest of the section is as follows: We first recall the definition of the almost-sure coverability problem, give a characterization of it in terms of counting sets and then prove PSPACE-completeness. Throughout this section, we fix an RBN $\mathcal{R} = (Q, \Sigma, \delta)$ with two special states $\mathit{init}, \mathit{fin} \in Q$, which will respectively be called the initial and final states.

### 5.1 The almost-sure coverability problem

Let $\uparrow \mathit{fin}$ denote the set of all configurations $C$ of $\mathcal{R}$ such that $C(\mathit{fin}) \ge 1$. For any $k \ge 1$, we say that the configuration $⟅k \cdot \mathit{init}⟆$ almost-surely covers $\mathit{fin}$ if and only if $post^*(⟅k \cdot \mathit{init}⟆) \subseteq pre^*(\uparrow \mathit{fin})$. The reason behind calling this the almost-sure coverability relation is that the definition given here is equivalent to covering the state $\mathit{fin}$ from $⟅k \cdot \mathit{init}⟆$ with probability 1 under a probabilistic scheduler which picks agents uniformly at random at each step.

The number $k$ is called a cut-off if one of the following is true: Either, 1) for all $h \ge k$, the configuration $⟅h \cdot \mathit{init}⟆$ almost-surely covers $\mathit{fin}$, in which case $k$ is called a positive cut-off; or, 2) for all $h \ge k$, the configuration $⟅h \cdot \mathit{init}⟆$ does not almost-surely cover $\mathit{fin}$, in which case $k$ is called a negative cut-off. The following was proved in Theorem 9 of [3].

Theorem 5. Given an RBN with two states $\mathit{init}, \mathit{fin}$, a cut-off always exists. Whether the cut-off is positive or negative can be decided in EXPSPACE.

Our main result of this section is the following.

Theorem 6. Deciding whether the cut-off of a given RBN is positive or negative is PSPACE-complete. Moreover, a given RBN always has a cut-off which is at most exponential in its number of states.

#### 5.2 A characterization of almost-sure coverability

We now rewrite the definition of almost-sure coverability in terms of counting sets. Let $[\mathit{init}]$ be the cube such that $L(q) = U(q) = 0$ if $q \ne \mathit{init}$ and $L(\mathit{init}) = 0$, $U(\mathit{init}) = \infty$. Notice that by definition, $\uparrow \mathit{fin}$ is a cube. We now consider the set of configurations defined by $S := post^*([\mathit{init}]) \cap \overline{pre^*(\uparrow \mathit{fin})}$. By our PSPACE Theorem 3, $S$ is a counting set such that the norm of $S$ is at most $2^{p(|Q|)}$ for some fixed polynomial $p$. We now claim the following.

Theorem 7. $\mathcal{R}$ has a positive cut-off if and only if $S$ is finite. Moreover, $|Q| \cdot \|S\|$ is an upper bound on the size of the cut-off for $\mathcal{R}$ and so $\mathcal{R}$ has a cut-off which is exponential in its number of states.

Proof. Let $N$ be the norm of $S$. Suppose $S$ is finite. If $C \in S$, then $\sum_{q \in Q} C(q) \le |Q| \cdot N$. So, if $C$ is any configuration of size $h > |Q| \cdot N$ such that $C \in post^*(⟅h \cdot \mathit{init}⟆)$ then $C \in pre^*(\uparrow \mathit{fin})$. Hence, $|Q| \cdot N$ is a positive cut-off for $\mathcal{R}$.

Suppose $S$ is infinite, and let $\bigcup_i \mathcal{C}_i$ be a counting constraint for $S$ whose norm is $N$. Then there must exist an index $i$ with $\mathcal{C}_i := (L, U)$ and a state $p$ such that $U(p) = \infty$. For each $h \ge N$, consider the configuration $C_h$ given by $C_h(q) = L(q)$ if $q \ne p$ and $C_h(p) = h$. Notice that $C_h \in S$ and so $C_h \in post^*([\mathit{init}]) \cap \overline{pre^*(\uparrow \mathit{fin})}$. Hence, for every $h \ge |Q| \cdot N$, we have exhibited a configuration of size $h$, reachable from $⟅h \cdot \mathit{init}⟆$ but from which $\mathit{fin}$ is not coverable. Thus $N$ is a negative cut-off for $\mathcal{R}$.

Remark 4. Notice that we have shown that if S is finite, then R has a positive cut-off and if S is infinite, then R has a negative cut-off. This gives an alternative proof of the fact that a cut-off always exists for a given RBN.

### 5.3 PSPACE-completeness of the almost-sure coverability problem

Because of Theorem 7, we now have the following result.

Lemma 5. Deciding whether the cut-off of a given RBN is positive or negative can be done in PSPACE.

Proof Sketch. By Theorem 7, it follows that a given RBN has a negative cut-off iff S = post<sup>∗</sup>([init]) ∩ $\overline{\mathit{pre}^{*}(\uparrow \mathit{fin})}$ is infinite, where the overline denotes complement. We have already seen that S is a counting set such that the norm of S is at most N := 2<sup>p(|Q|)</sup> for some fixed polynomial p.

Let ⋃<sub>i</sub> C<sub>i</sub> be a counting constraint for S which minimizes its norm and let each C<sub>i</sub> = (L<sub>i</sub>, U<sub>i</sub>). Hence, L<sub>i</sub>(q) ≤ N for every state q. Further, S is infinite iff there is an index i and a state q such that U<sub>i</sub>(q) = ∞. Using these two facts, we can then show that S is infinite iff there is a state q and a configuration C ∈ S such that C(q′) ≤ N for every q′ ≠ q and C(q) = N + 1.

Hence, to check if S is infinite, we just have to guess a state q and a configuration C such that C(q′) ≤ N for every q′ ≠ q and C(q) = N + 1 and check if C ∈ S. Since guessing C can be done in polynomial space (by representing every number in binary), by the PSPACE Theorem (Theorem 3), we can check if C ∈ S in polynomial space as well, which concludes the proof of the lemma.

We also have the accompanying hardness result.

Lemma 6. Deciding whether the cut-off of a given RBN is positive or negative is PSPACE-hard.

Similar to the cube-reachability problem, our result on almost-sure coverability also applies to the related model of ASMS. This solves an open problem from [6]. For lack of space, we once again defer this discussion to the appendix.

### 6 Application 2: Computation by RBN

In this section we give another application of our results. We introduce a model of computation using RBN called RBN protocols. We take inspiration from the extensively-studied model of population protocols [1,2,12]. The reader can consult the above references for more details on population protocols.

In our model, reconfigurable networks of identical, anonymous agents interact to compute a predicate ϕ : N<sup>k</sup> → {0, 1}. We show that RBN protocols compute exactly the threshold predicates, which we will define more formally below.

### 6.1 RBN Protocols

We introduce our computation model. The notation mimics that of [13].

Definition 6. An RBN protocol is a tuple P = (Q, Σ, δ, I, O) where (Q, Σ, δ) is an RBN, I = {q1, . . . , qk} is a set of input states, and O : Q → {0, 1} is an output function.

Configurations and runs of P are the same as those of the underlying RBN. A configuration C is called a 0-consensus (respectively a 1-consensus) if C(q) > 0 implies O(q) = 0 (respectively O(q) = 1). For b ∈ {0, 1}, a b-consensus C is stable if every configuration reachable from C is also a b-consensus. A run C<sub>0</sub> −→ C<sub>1</sub> −→ C<sub>2</sub> · · · of P is fair if it is finite and cannot be extended by any step, or if it is infinite and the following condition holds for all configurations C, C′: if C −→ C′ and C = C<sub>i</sub> for infinitely many i ≥ 0, then the step C −→ C′ appears infinitely often along the run. In other words, if a fair run reaches a configuration infinitely often, then all the configurations reachable in a step from that configuration will be reached infinitely often from it.

A fair run C<sub>0</sub> −→ C<sub>1</sub> −→ . . . converges to b if there is i ≥ 0 such that C<sub>j</sub> is a b-consensus for every j ≥ i. For every v ∈ N<sup>k</sup>, let C<sub>v</sub> be the configuration given by C<sub>v</sub>(q<sub>i</sub>) = v<sub>i</sub> for every q<sub>i</sub> ∈ I, and C<sub>v</sub>(q) = 0 for every q ∈ Q \ I. We call C<sub>v</sub> the initial configuration for input v. The protocol P computes the predicate ϕ : N<sup>k</sup> → {0, 1} if for every v ∈ N<sup>k</sup>, every fair run starting at C<sub>v</sub> converges to ϕ(v).
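The notions of initial configuration and b-consensus can be made concrete with a small sketch. The Python below is only an illustration under assumptions: configurations are modelled as dictionaries from state names to counts, and the three-state protocol data (`STATES`, `INPUT_STATES`, `OUTPUT`) is hypothetical. Stability and fairness, which quantify over reachable configurations and runs, are not modelled.

```python
def initial_configuration(states, input_states, v):
    """Return C_v: v[i] agents in the i-th input state, 0 agents elsewhere."""
    config = {q: 0 for q in states}
    for q, count in zip(input_states, v):
        config[q] = count
    return config


def is_consensus(config, output, b):
    """A configuration is a b-consensus if every populated state has output b."""
    return all(output[q] == b for q, n in config.items() if n > 0)


# Hypothetical protocol data, assumed for illustration only.
STATES = ["q1", "q2", "q3"]
INPUT_STATES = ["q1"]
OUTPUT = {"q1": 0, "q2": 0, "q3": 1}

c = initial_configuration(STATES, INPUT_STATES, (4,))
mixed = {"q1": 1, "q2": 0, "q3": 2}
```

Here `c` is a 0-consensus, while `mixed` is neither a 0-consensus nor a 1-consensus, since it populates states with different outputs.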

Fig. 3. An RBN protocol P.

Example 3. Adding the dashed line transitions to the RBN of Example 1 yields the RBN protocol P = (Q, Σ, δ, I, O) illustrated in Figure 3. The initial state is q1, i.e. I = {q1}, and the output function is defined such that O(q1) = O(q2) = 0 and O(q3) = 1. If there is a process in q3, it can "attract" the rest of the processes there using the new dashed transitions. As with the RBN of Example 1, a process can be put in q3 starting from the initial configuration ⟅k · q1⟆ if and only if k ≥ 3. This RBN protocol computes the predicate x ≥ 3: if there are fewer than 3 processes originally in q1 then they stay in states with output 0, and if there are at least 3, then in a fair run a process eventually enters q3, and eventually the others follow, thus converging to 1.

### 6.2 Expressivity

In this section, we show that RBN protocols compute exactly the predicates definable by counting sets. A predicate ϕ : N<sup>k</sup> → {0, 1} is definable by counting sets if for every b ∈ {0, 1}, the set {v | ϕ(v) = b} is a counting set.

For b ∈ {0, 1}, define the following sets of configurations:


The next lemma states that every predicate computed by a protocol is definable by counting sets.

Lemma 7. Let P be an RBN protocol that computes the predicate ϕ : N<sup>k</sup> → {0, 1}. Then for every b ∈ {0, 1}, the sets I<sub>b</sub>, C<sub>b</sub> and ST<sub>b</sub> are all counting sets. This entails that ϕ is definable by counting sets.

Proof Sketch. Fix a b ∈ {0, 1}. It is easy to see that C<sub>b</sub> is a cube. Unraveling the definitions of I<sub>b</sub> and ST<sub>b</sub>, we can express them in terms of C<sub>b</sub> by using boolean operations and pre<sup>∗</sup>. By the Closure Corollary (Corollary 1), they are counting sets. The set {v | ϕ(v) = b} is simply I<sub>b</sub> restricted to I, and so we are done.

The next lemma states the converse result. It essentially uses the fact that there is a sub-class of population protocols called IO protocols which compute exactly the predicates definable by counting sets (Theorem 7 and Theorem 39 of [2,13]), and that IO protocols are a sub-class of RBN (Section 6.2 of [3]).

Lemma 8. Let ϕ : N<sup>k</sup> → {0, 1} be a predicate definable by counting sets. Then there exists an RBN protocol computing ϕ.

By Lemma 7 and Lemma 8, we get our result.

Theorem 8. RBN protocols compute exactly the predicates definable by counting sets.

### Acknowledgements

We thank Nathalie Bertrand and Javier Esparza for many helpful discussions.

### References



### Separators in Continuous Petri Nets<sup>⋆</sup>

Michael Blondin<sup>1</sup> and Javier Esparza<sup>2</sup>

<sup>1</sup> Université de Sherbrooke, Sherbrooke, Canada michael.blondin@usherbrooke.ca <sup>2</sup> Technical University of Munich, Munich, Germany esparza@in.tum.de

Abstract. Leroux has proved that unreachability in Petri nets can be witnessed by a Presburger separator, i.e. if a marking msrc cannot reach a marking mtgt, then there is a formula ϕ of Presburger arithmetic such that: ϕ(msrc) holds; ϕ is forward invariant, i.e., ϕ(m) and m → m′ imply ϕ(m′); and ¬ϕ(mtgt) holds. While these separators could be used as explanations and as formal certificates of unreachability, this has not yet been the case due to their (super-)Ackermannian worst-case size and the (super-)exponential complexity of checking that a formula is a separator. We show that, in continuous Petri nets, these two problems can be overcome. We introduce locally closed separators, and prove that: (a) unreachability can be witnessed by a locally closed separator computable in polynomial time; (b) checking whether a formula is a locally closed separator is in NC (so, simpler than unreachability, which is P-complete).

Keywords: Petri net · continuous reachability · separators · certificates.

### 1 Introduction

Petri nets form a widespread formalism of concurrency with several applications ranging from the verification of concurrent programs to the analysis of chemical systems. The reachability problem, which asks whether a marking msrc can reach another marking mtgt, is fundamental, as a plethora of problems, such as verifying safety properties, reduce to it (e.g. [13,11,2]).

Leroux has shown that unreachability in Petri nets can be witnessed by a Presburger separator, i.e., if a marking msrc cannot reach a marking mtgt, then there exists a formula ϕ of Presburger arithmetic such that: ϕ(msrc) holds; ϕ is forward invariant, i.e., ϕ(m) and m → m′ imply ϕ(m′); and ϕ(mtgt) does not hold [14]. Intuitively, ϕ "separates" mtgt from the set of markings reachable from msrc. Leroux's result leads to a very simple algorithm to decide the Petri net reachability problem, consisting of two semi-algorithms; the first one explores the markings reachable from msrc, and halts if and when it hits mtgt, while the second enumerates formulas from Presburger arithmetic, and halts if and when it hits a separator.

<sup>⋆</sup> M. Blondin was supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (NSERC), and by the Fonds de recherche du Québec – Nature et technologies (FRQNT). J. Esparza was supported by an ERC Advanced Grant (787367: PaVeS).

Separators can be used as explanations and as formal certificates. Verifying a safety property can be reduced to proving that a target marking (or set of markings) is not reachable from a source marking, and a separator is an invariant of the system that explains why the property holds. Further, if a reachability tool produces separators, then the user can check that the properties of a separator indeed hold, and so trust the result even if they do not trust the tool (e.g., because it has not been verified, or is executed on a remote faster machine). Yet, in order to be useful as explanations and certificates, separators have to satisfy two requirements: (1) they should not be too large, and (2) checking that a formula is a separator should have low complexity, and in particular lower complexity than deciding reachability. This does not hold, at least in the worst-case, for the separators of [14]: In the worst case, the separator has super-Ackermannian size in the Petri net size (a consequence of the fact that the reachability problem is Ackermann-complete [16,15,7]) and the complexity of the check is super-exponential.

In this paper, we show that, unlike the above, continuous Petri nets do have separators satisfying properties (1) and (2). Continuous Petri nets are a relaxation of the standard Petri net model, called discrete in the following, in which transitions are allowed to fire "fluidly": instead of firing once, consuming i<sub>p</sub> tokens from each input place p and adding o<sub>q</sub> tokens to each output place q, a transition can fire α times for any nonnegative real number α, consuming and adding α·i<sub>p</sub> and α·o<sub>q</sub> tokens, respectively. Continuous Petri nets are interesting in their own right [8], and moreover as an overapproximation of the discrete model. In particular, if mtgt is not reachable from msrc under the continuous semantics, then it is not reachable under the discrete one either. As reachability in continuous Petri nets is P-complete [12], and so drastically more tractable than discrete reachability, this approximation is used in many tools for the verification of discrete Petri nets, VAS, or multiset rewriting systems (e.g. [5,4,10]).

It is easy to see that unreachability in continuous Petri nets can be witnessed by separators expressible in linear arithmetic (the first-order theory of the reals with addition and order). Indeed, Blondin et al. show in [5] that the continuous reachability relation is expressible by an existential formula reach(m, m′) of linear arithmetic, from which we can obtain a separator for any pair of unreachable markings. To wit, for all markings msrc and mtgt, if mtgt is not reachable from msrc, then the formula sep<sub>msrc</sub>(m) := reach(msrc, m) is a separator: it holds for msrc, it is forward invariant, and it does not hold for mtgt. Further, reach(m, m′) has only linear size. However, these separators do not satisfy property (2) unless P = NP. Indeed, while the reachability problem for continuous Petri nets is P-complete [12], checking if a formula of linear arithmetic is a separator is coNP-hard, even for quantifier-free formulas in disjunctive normal form, a very small fragment. So, the separators arising from [5] cannot be directly used as certificates.

In this paper, we overcome this problem. We identify a class of locally closed separators, satisfying the following properties: unreachability can always be witnessed by locally closed separators; locally closed separators can be constructed in polynomial time; and checking whether a formula is a locally closed separator is computationally easier than deciding unreachability. Let us examine the last claim in more detail. While the reachability problem for continuous Petri nets is decidable in polynomial time, it is still time consuming for larger models, which can have tens of thousands of nodes. Indeed, for a Petri net with n places and m transitions, the algorithm of [12] requires solving O(m<sup>2</sup>) linear programming problems in n variables, each of them with up to m constraints. Moreover, since the problem is P-complete, it is unlikely that a parallel computer can significantly improve performance. We prove that, on the contrary, checking if a formula is a locally closed separator is in NC rather than P-complete, and so efficiently parallelizable. Further, the checking algorithm only requires solving linear programming problems in a single variable.

The paper is organized as follows. Section 2 introduces terminology, and defines separators (actually, a slightly different notion called bi-separators). Section 3 recalls the characterization of the reachability relation given by Fraca and Haddad in [12], and derives a characterization of unreachability suitable for finding bi-separators. Section 4 shows that checking the separators derivable from [5] is coNP-hard, and introduces locally closed bi-separators. Sections 5 and 6 show that locally closed bi-separators satisfy the aforementioned properties (1) and (2). Finally, Section 7 shows that all our results can be extended to separators that separate two sets of markings instead of singletons.

### 2 Preliminaries

Numbers, vectors and relations. We write N, R and R<sub>+</sub> to denote the naturals (including 0), reals, and non-negative reals (including 0). Let S be a finite set. We write e<sub>s</sub> to denote the unit vector e<sub>s</sub> ∈ R<sup>S</sup> such that e<sub>s</sub>(s) = 1 and e<sub>s</sub>(t) = 0 for all s, t ∈ S such that t ≠ s. Given x, y ∈ R<sup>S</sup>, we write x ∼<sub>S</sub> y to indicate that x(s) ∼ y(s) for all s ∈ S, where ∼ is a total order such as ≤. We define the support of a vector x ∈ R<sup>S</sup> as supp(x) := {s ∈ S : x(s) > 0}. We write x(S) := Σ<sub>s∈S</sub> x(s). The transpose of a binary relation R is R<sup>T</sup> := {(y, x) : (x, y) ∈ R}.

Petri nets. A Petri net<sup>3</sup> is a tuple N = (P, T, F) where P and T are disjoint finite sets, whose elements are respectively called places and transitions, and where F = (F<sub>−</sub>, F<sub>+</sub>) with F<sub>−</sub>, F<sub>+</sub> : P × T → N. For every t ∈ T, vectors ∆<sub>t</sub><sup>−</sup>, ∆<sub>t</sub><sup>+</sup> ∈ N<sup>P</sup> are respectively defined as the columns of F<sub>−</sub> and F<sub>+</sub> associated to t, i.e. ∆<sub>t</sub><sup>−</sup> := F<sub>−</sub> · e<sub>t</sub> and ∆<sub>t</sub><sup>+</sup> := F<sub>+</sub> · e<sub>t</sub>. A marking is a vector m ∈ R<sub>+</sub><sup>P</sup>. We say that transition t is α-enabled if m ≥ α∆<sub>t</sub><sup>−</sup> holds. If this is the case, then t can be α-fired from m, which leads to marking m′ := m − α∆<sub>t</sub><sup>−</sup> + α∆<sub>t</sub><sup>+</sup>, which we denote m <sup>αt</sup>−→ m′. A transition is enabled if it is α-enabled for some real number α > 0. We define F := F<sub>+</sub> − F<sub>−</sub> and ∆<sub>t</sub> := F · e<sub>t</sub>. In particular, m <sup>αt</sup>−→ m′ implies m′ = m + α∆<sub>t</sub>. For example, for the Petri net of Figure 1:

<sup>3</sup> In this work, "Petri nets" stands for "continuous Petri nets". In other words, we will consider standard Petri nets, but equipped with a continuous reachability relation. We will work over the reals, but note that it is known that working over the rationals is equivalent. For decidability issues, we will assume input numbers to be rationals.

$$\{p_1 \mapsto 2, p_2 \mapsto 0, p_3 \mapsto 0, p_4 \mapsto 0\} \xrightarrow{(1/2)t_1} \{p_1 \mapsto 3/2, p_2 \mapsto 1/2, p_3 \mapsto 0, p_4 \mapsto 0\}.$$

Moreover, w.r.t. the orderings p<sub>1</sub> < · · · < p<sub>4</sub> (rows) and t<sub>1</sub> < · · · < t<sub>4</sub> (columns):

$$\mathbf{F}_{-}=\begin{bmatrix}1&2&2&0\\0&0&1&0\\0&0&0&1\\0&1&0&0\end{bmatrix},\ \mathbf{F}_{+}=\begin{bmatrix}0&0&1&0\\1&0&0&0\\0&1&1&0\\0&1&0&1\end{bmatrix}\quad\text{and}\ \mathbf{F}=\begin{bmatrix}-1&-2&-1&0\\1&0&-1&0\\0&1&1&-1\\0&0&0&1\end{bmatrix}.$$

Fig. 1. A Petri net and two markings msrc = {p<sub>1</sub> ↦ 2, p<sub>2</sub> ↦ 0, p<sub>3</sub> ↦ 0, p<sub>4</sub> ↦ 0} (black circles) and mtgt = {p<sub>1</sub> ↦ 0, p<sub>2</sub> ↦ 0, p<sub>3</sub> ↦ 0, p<sub>4</sub> ↦ 1} (colored squares).

A sequence σ = α<sub>1</sub>t<sub>1</sub> · · · α<sub>n</sub>t<sub>n</sub> is a firing sequence from msrc to mtgt if there are markings m<sub>0</sub>, . . . , m<sub>n</sub> satisfying msrc = m<sub>0</sub> <sup>α<sub>1</sub>t<sub>1</sub></sup>−→ m<sub>1</sub> · · · <sup>α<sub>n</sub>t<sub>n</sub></sup>−→ m<sub>n</sub> = mtgt. We write m<sub>0</sub> <sup>σ</sup>−→ m<sub>n</sub>. We say that msrc enables σ, and that mtgt enables σ backwards, or backward-enables σ. The support of σ is the set {t<sub>1</sub>, . . . , t<sub>n</sub>}. For example, for the Petri net of Figure 1, we have msrc <sup>σ</sup>−→ mtgt where

$$\begin{aligned} \mathbf{m}_{\text{src}} &= \{ p_1 \mapsto 2, p_2 \mapsto 0, p_3 \mapsto 0, p_4 \mapsto 0 \}, \\ \mathbf{m}_{\text{tgt}} &= \{ p_1 \mapsto 0, p_2 \mapsto 0, p_3 \mapsto 0, p_4 \mapsto 1 \}, \\ \sigma &= (1/2)t_1 \ (1/2)t_3 \ (1/2)t_4 \ (1/2)t_2 \ (1/2)t_4. \end{aligned}$$
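The firing rule can be replayed mechanically on this example. The following Python sketch is only an illustration: the list-of-lists encoding of the matrices of Figure 1, the 0-based transition indices and the names `fire` and `run` are assumptions of the sketch. Exact rationals are used so that the enabledness test is not affected by rounding.

```python
from fractions import Fraction

# Pre and post matrices of the net of Figure 1 (rows p1..p4, columns t1..t4).
F_MINUS = [[1, 2, 2, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 1, 0, 0]]
F_PLUS = [[0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 1]]


def fire(marking, alpha, t):
    """Fire transition index t by amount alpha, or raise if it is not alpha-enabled."""
    for p, tokens in enumerate(marking):
        if tokens < alpha * F_MINUS[p][t]:
            raise ValueError(f"t{t + 1} is not {alpha}-enabled at {marking}")
    return [
        tokens - alpha * F_MINUS[p][t] + alpha * F_PLUS[p][t]
        for p, tokens in enumerate(marking)
    ]


def run(marking, sequence):
    """Fire a sequence of (alpha, transition index) pairs in order."""
    for alpha, t in sequence:
        marking = fire(marking, alpha, t)
    return marking


half = Fraction(1, 2)
m_src = [Fraction(2), Fraction(0), Fraction(0), Fraction(0)]
# sigma = (1/2)t1 (1/2)t3 (1/2)t4 (1/2)t2 (1/2)t4, with 0-based indices.
sigma = [(half, 0), (half, 2), (half, 3), (half, 1), (half, 3)]
m_tgt = run(m_src, sigma)
```

Running the sequence step by step passes through the markings (3/2, 1/2, 0, 0), (1, 0, 1/2, 0), (1, 0, 0, 1/2) and (0, 0, 1/2, 1/2) before ending in (0, 0, 0, 1).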

Let U ⊆ T. We write m −→<sup>U</sup> m′ to denote that m <sup>αt</sup>−→ m′ for some α > 0 and t ∈ U, and −→<sup>U∗</sup> for the transitive and reflexive closure of −→<sup>U</sup>. We simply write −→ and −→<sup>∗</sup> when U = T. The Petri net N<sub>U</sub> is obtained by removing transitions T \ U from N. In particular, m −→<sup>U∗</sup> m′ holds in N iff m −→<sup>∗</sup> m′ holds in N<sub>U</sub>.

The transpose of N = (P, T, (F<sub>−</sub>, F<sub>+</sub>)) is N<sup>T</sup> := (P, T, (F<sub>+</sub>, F<sub>−</sub>)). We have msrc <sup>σ</sup>−→ mtgt in N iff mtgt <sup>τ</sup>−→ msrc in N<sup>T</sup>, where τ is the reverse of σ. For U ⊆ T, we write U<sup>T</sup> to denote U in the context of N<sup>T</sup>. This way, when we write, e.g., −→<sup>U</sup> and −→<sup>U<sup>T</sup></sup>, it is clear that we respectively refer to N and N<sup>T</sup>.

Linear arithmetic and Farkas' lemma. An atomic proposition is a linear inequality of the form ax ≤ b or ax < b, where b and the components of a are over R. Such a proposition is homogeneous if b = 0. A linear formula is a first-order formula over atomic propositions with variables ranging over R<sub>+</sub> (the classical definition uses R, but in our context variables will encode markings). The solutions of a linear formula ϕ, denoted ⟦ϕ⟧, are the assignments to the free variables of ϕ that satisfy ϕ. A linear formula is homogeneous if all of its atomic propositions are homogeneous. For every formula ϕ(x, y) where x and y have the same arity, we write ϕ<sup>T</sup> to denote the formula that syntactically swaps x and y, so that ⟦ϕ<sup>T</sup>⟧ = ⟦ϕ⟧<sup>T</sup>. Throughout the paper, we will use Farkas' lemma, a fundamental result of linear arithmetic that rephrases the absence of a solution to a system as the existence of one for another system:

Lemma 1 (Farkas' lemma). Let A ∈ R<sup>m×n</sup> and b ∈ R<sup>m</sup>. The formula Ax ≤ b has no solution iff A<sup>T</sup>y = 0 ∧ b<sup>T</sup>y < 0 ∧ y ≥ 0 has a solution.
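To see the lemma at work on a tiny instance, the sketch below checks a candidate certificate y against the three conditions A<sup>T</sup>y = 0, b<sup>T</sup>y < 0 and y ≥ 0. The system (x ≤ −1 together with −x ≤ 0), the vector y and the function name are assumptions chosen for illustration; the code verifies a given certificate and does not search for one.

```python
def is_farkas_certificate(A, b, y):
    """Check A^T y = 0, b^T y < 0 and y >= 0 for an m x n matrix A."""
    m, n = len(A), len(A[0])
    At_y = [sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]
    bt_y = sum(b[i] * y[i] for i in range(m))
    return all(v == 0 for v in At_y) and bt_y < 0 and all(yi >= 0 for yi in y)


# Assumed example: the system x <= -1 and -x <= 0 has no solution.
A = [[1], [-1]]
b = [-1, 0]
y = [1, 1]
```

Intuitively, adding the two inequalities with weights y yields 0 ≤ −1, which is why no x can satisfy both. For the feasible system x ≤ 1, −x ≤ 0 the same y is not a certificate, since b<sup>T</sup>y = 1.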

### 2.1 Separators and bi-separators

Let us fix a Petri net N = (P, T, F) and two markings msrc, mtgt ∈ R<sub>+</sub><sup>P</sup>.

Definition 1. A separator for (msrc, mtgt) is a linear formula ϕ over R<sub>+</sub><sup>P</sup> such that: (1) msrc ∈ ⟦ϕ⟧; (2) ϕ is forward invariant, i.e., m ∈ ⟦ϕ⟧ and m −→ m′ implies m′ ∈ ⟦ϕ⟧; and (3) mtgt ∉ ⟦ϕ⟧.

It follows immediately from the definition that if there exists a separator ϕ for (msrc, mtgt), then msrc ↛<sup>∗</sup> mtgt. Thus, in order to show that msrc ↛<sup>∗</sup> mtgt in N, we can either give a separator for (msrc, mtgt) w.r.t. N, or a separator for (mtgt, msrc) w.r.t. N<sup>T</sup>. Let us call them forward and backward separators. Loosely speaking, a forward separator shows that mtgt is not among the markings reachable from msrc, and a backward separator shows that msrc is not among the markings backward-reachable from mtgt. Bi-separators are formulas from which we can easily obtain forward and backward separators. The symmetry w.r.t. forward and backward reachability makes them easier to handle.

Definition 2. A linear formula ϕ over (R<sub>+</sub><sup>P</sup>)<sup>2</sup> is forward invariant if (m, m′) ∈ ⟦ϕ⟧ and m′ −→ m″ imply (m, m″) ∈ ⟦ϕ⟧; backward invariant if (m′, m″) ∈ ⟦ϕ⟧ and m −→ m′ imply (m, m″) ∈ ⟦ϕ⟧; and bi-invariant if it is forward and backward invariant. A bi-separator for (msrc, mtgt) is a bi-invariant linear formula ϕ s.t. (msrc, msrc) ∈ ⟦ϕ⟧, (mtgt, mtgt) ∈ ⟦ϕ⟧ and (msrc, mtgt) ∉ ⟦ϕ⟧.

The following proposition shows how to obtain separators from bi-separators.

Proposition 1. Let ϕ be a bi-separator for (msrc,mtgt). The following holds:


Proof. It suffices to prove the first statement, the second is symmetric.

It is the case that msrc ∈ ⟦ψ⟧ and mtgt ∉ ⟦ψ⟧ as (msrc, msrc) ∈ ⟦ϕ⟧ and (msrc, mtgt) ∉ ⟦ϕ⟧. It remains to show that ψ is forward invariant. Let m ∈ ⟦ψ⟧ and m <sup>αt</sup>−→ m′. Since (msrc, m) ∈ ⟦ϕ⟧ and ϕ is forward invariant, it is the case that (msrc, m′) ∈ ⟦ϕ⟧. Hence, m′ ∈ ⟦ψ⟧ as desired. □

### 3 A characterization of unreachability

In [12], Fraca and Haddad gave the following characterization of the reachability relation in continuous Petri nets:

Theorem 1 ([12]). Let N = (P, T, F) be a Petri net, let U ⊆ T, and let msrc, mtgt ∈ R<sub>+</sub><sup>P</sup>. It is the case that msrc −→<sup>U∗</sup> mtgt iff there exists S ⊆ U such that the following conditions hold:

1. some vector x ∈ R<sub>+</sub><sup>T</sup> with support S satisfies msrc + Fx = mtgt,


Furthermore, these conditions can be checked in polynomial time.
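As an illustration of condition 1 only (conditions 2 and 3 are not checked here), take the net and the markings of Figure 1 with S = T. Summing, for each transition, the amounts by which it is fired in the sequence σ given for Figure 1 yields the candidate vector x = (1/2, 1/2, 1/2, 1); this choice of x is ours and serves only as a worked instance:

$$
\mathbf{m}_{\text{src}} + \mathbf{F}\mathbf{x} = \begin{bmatrix}2\\0\\0\\0\end{bmatrix} + \begin{bmatrix}-1&-2&-1&0\\1&0&-1&0\\0&1&1&-1\\0&0&0&1\end{bmatrix}\begin{bmatrix}1/2\\1/2\\1/2\\1\end{bmatrix} = \begin{bmatrix}2\\0\\0\\0\end{bmatrix} + \begin{bmatrix}-2\\0\\0\\1\end{bmatrix} = \begin{bmatrix}0\\0\\0\\1\end{bmatrix} = \mathbf{m}_{\text{tgt}}.
$$

Every component of x is positive, so its support is indeed T.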

Theorem 1 has the following form, where P<sub>1</sub>, P<sub>2</sub> and P<sub>3</sub> stand for conditions 1., 2., and 3.:

$$
\mathbf{m}_{\text{src}} \to^{U*} \mathbf{m}_{\text{tgt}} \iff \exists S \subseteq U : (\exists x \colon P_1(S, x)) \land (\exists \sigma \colon P_2(S, \sigma)) \land (\exists \tau \colon P_3(S, \tau)).
$$

Therefore, msrc ↛<sup>U∗</sup> mtgt holds iff

$$
\forall S \subseteq U : (\forall x \colon \neg P\_1(S, x)) \lor (\forall \sigma \colon \neg P\_2(S, \sigma)) \lor (\forall \tau \colon \neg P\_3(S, \tau)).
$$

To obtain a witness of unreachability for a given S ⊆ U, we replace each universally quantified disjunct by an existentially quantified equivalent one. For conditions 2. and 3., the solution (implicitly given in [12]) is formulated in Proposition 2. Given a set of places X, let •X (resp. X•) be the set of transitions t such that F<sub>+</sub>(p, t) > 0 (resp. F<sub>−</sub>(p, t) > 0) for some p ∈ X. A siphon of N is a subset Q of places such that •Q ⊆ Q•. A trap is a subset R of places such that R• ⊆ •R. Informally, empty siphons remain empty, and marked traps remain marked. Formally, if m −→ m′, then m(Q) = 0 implies m′(Q) = 0, and m(R) > 0 implies m′(R) > 0. We have:

Proposition 2 ([12]). Let N = (P, T, F) be a Petri net, let S ⊆ T, and let m ∈ R<sub>+</sub><sup>P</sup>. The following statements hold:


So the universal statements "no firing sequence . . . is enabled/backward-enabled . . . " are replaced by existential statements "there exists a siphon/trap . . . ". The if-direction of the proposition is easy to prove. A siphon Q of N<sub>S</sub> satisfies Q• ⊆ S. Since Q is empty at m, if we only fire transitions from S then Q remains empty, and so no transition of Q• ever becomes enabled. So transitions of Q• can only fire after transitions that do not belong to S have fired first. But no such firing sequence has support S, and we are done. The case of traps is analogous. For the only-if direction we refer the reader to [12].
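The siphon and trap conditions are purely structural and easy to test mechanically. The following Python sketch, offered only as an illustration, checks them on the net of Figure 1; the matrix encoding, the 0-based indices and the function names are assumptions of this sketch and are not taken from [12].

```python
# Pre and post matrices of the net of Figure 1 (rows p1..p4, columns t1..t4).
F_MINUS = [[1, 2, 2, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 1, 0, 0]]
F_PLUS = [[0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 1]]
TRANSITIONS = range(4)


def preset(places):
    """Transitions that add tokens to some place of the given set."""
    return {t for t in TRANSITIONS if any(F_PLUS[p][t] > 0 for p in places)}


def postset(places):
    """Transitions that remove tokens from some place of the given set."""
    return {t for t in TRANSITIONS if any(F_MINUS[p][t] > 0 for p in places)}


def is_siphon(places):
    """Q is a siphon if every transition feeding Q also consumes from Q."""
    return preset(places) <= postset(places)


def is_trap(places):
    """R is a trap if every transition consuming from R also feeds R."""
    return postset(places) <= preset(places)
```

Under this encoding, {p<sub>1</sub>} (index set `{0}`) is a siphon but not a trap, while {p<sub>4</sub>} (index set `{3}`) is a trap but not a siphon.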

For condition 1. of Theorem 1, we obtain a solution in terms of exclusion functions.

Definition 3. Let N = (P, T, F) be a Petri net, let msrc, mtgt ∈ R<sub>+</sub><sup>P</sup> and let S ⊆ S′ ⊆ T. An exclusion function for (S, S′) is a function f : R<sub>+</sub><sup>P</sup> → R s.t.


An exclusion function for S is an exclusion function for (S, S).

An exclusion function for S excludes the existence of a firing sequence from msrc to mtgt with support S, i.e., witnesses that condition 1 of Theorem 1 fails. To see why, call f(m) the value of m. By definition of f, either mtgt has lower value than msrc but no transition of S decreases it, or msrc and mtgt have the same value but no transition of S decreases it, and at least one increases it. So it is impossible to reach mtgt from msrc by firing all and only the transitions of S. Let us apply exclusion functions and Proposition 2 to an example.

Example 1. Consider the Petri net of Figure 1, but with mtgt := {p<sub>1</sub> ↦ 0, p<sub>2</sub> ↦ 0, p<sub>3</sub> ↦ 1, p<sub>4</sub> ↦ 0} as target. We prove msrc ↛<sup>∗</sup> mtgt. For the sake of contradiction, assume msrc −→<sup>U∗</sup> mtgt for some U ⊆ T. We proceed in several steps:


By the claims, U = ∅, hence we reach the contradiction msrc = mtgt. □

Proposition 4 below shows that condition 1. of Theorem 1 fails if and only if there is an exclusion function for S (actually, a slightly more general result). We need the following consequence of Farkas' lemma:

Proposition 3. The system ∃x ≥ 0 : Ax = b ∧ S ⊆ supp(x) ⊆ S′ has no solution iff the following system has a solution: ∃y : A<sup>T</sup>y ≥<sub>S′</sub> 0 ∧ b<sup>T</sup>y ≤ 0 ∧ b<sup>T</sup>y < Σ<sub>s∈S</sub> (A<sup>T</sup>y)<sub>s</sub>.
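A certificate y for the second system can be verified with a few dot products. The sketch below is a hedged illustration, not the construction used in the paper: it takes A to be the matrix F of Figure 1, b := mtgt − msrc for the target of Example 1, S = S′ = T, and a hand-picked y = e<sub>p4</sub>. The function name and the 0-based index sets are assumptions of the sketch.

```python
def is_exclusion_certificate(A, b, y, S, S_prime):
    """Check A^T y >= 0 on S', b^T y <= 0, and b^T y < sum over S of (A^T y)_s."""
    rows, cols = len(A), len(A[0])
    At_y = [sum(A[i][j] * y[i] for i in range(rows)) for j in range(cols)]
    bt_y = sum(bi * yi for bi, yi in zip(b, y))
    return (
        all(At_y[j] >= 0 for j in S_prime)
        and bt_y <= 0
        and bt_y < sum(At_y[j] for j in S)
    )


# Matrix F of Figure 1 (rows p1..p4, columns t1..t4).
F = [[-1, -2, -1, 0], [1, 0, -1, 0], [0, 1, 1, -1], [0, 0, 0, 1]]
m_src = [2, 0, 0, 0]
m_tgt = [0, 0, 1, 0]  # target of Example 1
b = [t - s for s, t in zip(m_src, m_tgt)]
y = [0, 0, 0, 1]      # hand-picked candidate: the number of tokens in p4
ALL = {0, 1, 2, 3}
```

With this y, the value y<sup>T</sup>k of a marking k is its number of tokens in p<sub>4</sub>: source and target have the same value, no transition decreases it, and t<sub>4</sub> increases it, so no x with support T can satisfy msrc + Fx = mtgt. The same y is not a certificate for S = S′ = {t<sub>1</sub>, t<sub>2</sub>, t<sub>3</sub>}, consistent with the fact that x = (1/2, 1/2, 1/2, 0) solves the equation with that support.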

Proposition 4. Let N = (P, T, F) be a Petri net, let msrc, mtgt ∈ R<sub>+</sub><sup>P</sup>, and let S ⊆ S′ ⊆ T. No vector x ∈ R<sub>+</sub><sup>T</sup> satisfies S ⊆ supp(x) ⊆ S′ and msrc + Fx = mtgt iff there exists a linear exclusion function for (S, S′).

Proof. Assume no such x ∈ R<sub>+</sub><sup>T</sup> exists. Let b := mtgt − msrc. By Proposition 3, there exists y ∈ R<sup>P</sup> such that: F<sup>T</sup>y ≥<sub>S′</sub> 0 ∧ b<sup>T</sup>y ≤ 0 ∧ b<sup>T</sup>y < Σ<sub>s∈S</sub> (F<sup>T</sup>y)<sub>s</sub>. We show that f(k) := y<sup>T</sup>k is a linear exclusion function for (S, S′).


Putting together Proposition 4 with Theorem 1 and Proposition 2, we obtain the following characterization of unreachability.

Proposition 5. Let N = (P, T, F) be a Petri net, let U ⊆ T, and msrc, mtgt ∈ R<sub>+</sub><sup>P</sup>. It is the case that msrc ↛<sup>U∗</sup> mtgt iff for every S ⊆ U:


This proposition shows that, for all supports S, we can produce a witness of unreachability as an exclusion function, a siphon, or a trap. In the next section, we transform these witnesses into separators useful as certificates.

### 4 Separators as certificates

Let N = (P, T, F) be a Petri net and let msrc, mtgt ∈ R<sub>+</sub><sup>P</sup> be two markings of N. From [5], one can easily show that if msrc ↛<sup>∗</sup> mtgt, then there is a separator for (msrc, mtgt). Indeed, [5, Prop. 3.2] shows that there exists an existential formula ψ of linear arithmetic such that m −→<sup>∗</sup> m′ iff (m, m′) ∈ ⟦ψ⟧. Thus, the formula ϕ(m) := ψ(msrc, m) is a separator.

However, ϕ is not adequate as a certificate of unreachability. Indeed, checking a certificate for msrc ↛<sup>∗</sup> mtgt should have smaller complexity than deciding whether msrc −→<sup>∗</sup> mtgt. This is not the case for existential linear formulas, because msrc −→<sup>∗</sup> mtgt can be decided in polynomial time, but checking that an existential linear formula is a separator is coNP-hard.

Proposition 6. The problem of determining whether an existential linear formula ϕ is a separator for (msrc,mtgt) is coNP-hard, even if ϕ is a quantifier-free formula in DNF and homogeneous.

In the rest of the section, we introduce locally closed bi-separators, and then, in Sections 5 and 6, we respectively prove that they satisfy the following:


### 4.1 Locally closed bi-separators

The most difficult part of checking that a formula ϕ is a bi-separator consists of checking that it is forward and backward invariant. Let us focus on forward invariance, backward invariance being symmetric.

Recall the definition: for all markings m, m′, m″ and every transition t: if (m, m′) ∈ ⟦ϕ⟧ and m′ <sup>αt</sup>−→ m″ then (m, m″) ∈ ⟦ϕ⟧. Assume now that ϕ is in DNF, i.e., a disjunction of clauses ϕ = ϕ<sub>1</sub> ∨ · · · ∨ ϕ<sub>n</sub>. The forward invariance check can be decomposed into n smaller checks, one for each i ∈ [1..n], of the form: if (m, m′) ∈ ⟦ϕ<sub>i</sub>⟧ and m′ <sup>αt</sup>−→ m″, then (m, m″) ∈ ⟦ϕ⟧. However, in general the check cannot be decomposed into local checks of the form: there exists j ∈ [1..n] such that (m, m′) ∈ ⟦ϕ<sub>i</sub>⟧ and m′ <sup>αt</sup>−→ m″ imply (m, m″) ∈ ⟦ϕ<sub>j</sub>⟧. Indeed, while this property is sufficient for forward invariance, it is not necessary. Intuitively, locally closed bi-separators are separators where invariance can be established by local checks.

For the formal definition, we need to introduce some notation. Given a transition $t$ and atomic propositions $\psi, \psi'$, we say that $\psi$ $t$-implies $\psi'$, written $\psi \leadsto_t \psi'$, if $(\mathbf{m}, \mathbf{m}') \in [\![\psi]\!]$ and $\mathbf{m}' \xrightarrow{\alpha t} \mathbf{m}''$ imply $(\mathbf{m}, \mathbf{m}'') \in [\![\psi']\!]$. We further say that a clause $\psi = \psi_1 \wedge \cdots \wedge \psi_m$ $t$-implies a clause $\psi' = \psi'_1 \wedge \cdots \wedge \psi'_n$, written $\psi \leadsto_t \psi'$, if for every $j \in [1..n]$, there exists $i \in [1..m]$ such that $\psi_i \leadsto_t \psi'_j$.

Definition 4. A linear formula ϕ is locally closed w.r.t. N = (P, T, F) if:


Note that the definition is semantic. We make the straightforward but crucial observation that:

#### Proposition 7. Locally closed formulas are bi-invariant.

Proof. Let $\varphi = \varphi_1 \vee \cdots \vee \varphi_n$ be a locally closed formula. We only consider the forward case; the other case is symmetric. Let $(\mathbf{m}, \mathbf{m}') \in [\![\varphi]\!]$ and $\mathbf{m}' \xrightarrow{\alpha t} \mathbf{m}''$. Let $i \in [1..n]$ be such that $(\mathbf{m}, \mathbf{m}') \in [\![\varphi_i]\!]$. Since $\varphi$ is locally closed, there exists $j \in [1..n]$ such that $\varphi_i \leadsto_t \varphi_j$. Hence, for every atomic proposition $\psi'$ of $\varphi_j$, there exists an atomic proposition $\psi$ of $\varphi_i$ such that $\psi \leadsto_t \psi'$. Since each atomic proposition of $\varphi_i$ is satisfied by $(\mathbf{m}, \mathbf{m}')$, we obtain $(\mathbf{m}, \mathbf{m}'') \in [\![\varphi_j]\!] \subseteq [\![\varphi]\!]$. □

Proposition 7 justifies the following definition:

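Definition 5. A locally closed bi-separator for $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{tgt}})$ is a locally closed formula $\varphi$ s.t. $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{src}}) \in [\![\varphi]\!]$, $(\mathbf{m}_{\mathrm{tgt}}, \mathbf{m}_{\mathrm{tgt}}) \in [\![\varphi]\!]$ and $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{tgt}}) \notin [\![\varphi]\!]$.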

Indeed, by Proposition 7, a locally closed bi-separator is a bi-separator, as the bi-invariance condition of Definition 2 follows from local closedness.

### 5 Constructing locally closed bi-separators

In this section, we prove that unreachability can always be witnessed by locally closed bi-separators of polynomial size and computable in polynomial time. The proof uses the results of Section 3.

Theorem 2. If $\mathbf{m}_{\mathrm{src}} \not\xrightarrow{U}{}^{*} \mathbf{m}_{\mathrm{tgt}}$, then there is a locally closed bi-separator $\varphi$ for $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{tgt}})$ w.r.t. $\mathcal{N}_U$. Further, $\varphi = \bigvee_{1 \le i \le n} \varphi_i$, where $n \le 2|U| + 1$ and each $\varphi_i$ contains at most $2|U| + 1$ atomic propositions. Moreover, $\varphi$ is computable in polynomial time.

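Proof. We proceed by induction on $|U|$. First consider $U = \emptyset$. Let $p \in P$ be such that $\mathbf{m}_{\mathrm{src}}(p) \ne \mathbf{m}_{\mathrm{tgt}}(p)$. Take $\varphi(\mathbf{m}, \mathbf{m}') := (\mathbf{e}_p \mathbf{m} \le \mathbf{e}_p \mathbf{m}')$ if $\mathbf{m}_{\mathrm{src}}(p) > \mathbf{m}_{\mathrm{tgt}}(p)$, and $\varphi(\mathbf{m}, \mathbf{m}') := (-\mathbf{e}_p \mathbf{m} \le -\mathbf{e}_p \mathbf{m}')$ otherwise.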
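The base case can be illustrated with a minimal Python sketch. The markings below and the helper names `base_case_atom` and `holds` are hypothetical, chosen only for illustration; the choice of sign follows the one used in Algorithm 1. Since $U = \emptyset$, no transition can fire, so invariance is vacuous and only the three membership conditions need to be checked.

```python
from fractions import Fraction

def base_case_atom(m_src, m_tgt):
    """Pick a place p where the markings differ and return (p, sign),
    standing for the atom  sign * m(p) <= sign * m'(p)."""
    for p in sorted(m_src):
        if m_src[p] != m_tgt[p]:
            sign = 1 if m_src[p] > m_tgt[p] else -1
            return p, sign
    raise ValueError("m_src == m_tgt: nothing to separate")

def holds(atom, m, m_prime):
    """Evaluate the atom on the pair (m, m')."""
    p, sign = atom
    return sign * m[p] <= sign * m_prime[p]

# Hypothetical markings over two places (not taken from the paper).
m_src = {"p1": Fraction(2), "p2": Fraction(0)}
m_tgt = {"p1": Fraction(0), "p2": Fraction(1)}
atom = base_case_atom(m_src, m_tgt)
```

The pairs $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{src}})$ and $(\mathbf{m}_{\mathrm{tgt}}, \mathbf{m}_{\mathrm{tgt}})$ satisfy the atom trivially, while the sign is chosen precisely so that $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{tgt}})$ does not.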

Now, assume that $U \ne \emptyset$. Consider the system $\exists \mathbf{x} \in \mathbb{R}_{+}^{T} : \mathbf{m}_{\mathrm{src}} + F\mathbf{x} = \mathbf{m}_{\mathrm{tgt}} \wedge \mathrm{supp}(\mathbf{x}) \subseteq U$. Suppose first that the system has no solution. By Proposition 4, taking $S = \emptyset$ and $S' = U$, there is a linear exclusion function for $(\emptyset, U)$, i.e. a linear function $f$ satisfying:

1. $f(\mathbf{m}_{\mathrm{src}}) > f(\mathbf{m}_{\mathrm{tgt}})$,
2. $\mathbf{m} \xrightarrow{u} \mathbf{m}'$ implies $f(\mathbf{m}) \le f(\mathbf{m}')$ for all $u \in U$.

(The first item holds due to Item 2 of Definition 3 and $S = \emptyset$.) So we can take $\varphi(\mathbf{m}, \mathbf{m}') := (f(\mathbf{m}) \le f(\mathbf{m}'))$.

Suppose now that the system has a solution $\mathbf{x} \in \mathbb{R}_{+}^{U}$. By convexity, we can suppose that $\mathrm{supp}(\mathbf{x}) \subseteq U$ is maximal. Indeed, if $\mathbf{x}'$ and $\mathbf{x}''$ are solutions, then $(1/2)\mathbf{x}' + (1/2)\mathbf{x}''$ is a solution with support $\mathrm{supp}(\mathbf{x}') \cup \mathrm{supp}(\mathbf{x}'')$. Let $U' := \mathrm{supp}(\mathbf{x})$. For every $t \in U \setminus U'$, consider the system of Proposition 4 with $S = \{t\}$ and $S' = U$. By maximality of $U' \subseteq U$, none of these systems has a solution. Consequently, for each $t \in U \setminus U'$, Proposition 4 yields a linear exclusion function for $(\{t\}, U)$, i.e. a linear function $f_t$ that satisfies:


If $f_t(\mathbf{m}_{\mathrm{src}}) > f_t(\mathbf{m}_{\mathrm{tgt}})$ holds for some $t \in U \setminus U'$, then we are done by taking $\varphi(\mathbf{m}, \mathbf{m}') := (f_t(\mathbf{m}) \le f_t(\mathbf{m}'))$, as Item 4 ensures that $\varphi \leadsto_u \varphi$ for every $u \in U$. So assume it does not hold for any $t \in U \setminus U'$, i.e. assume that $f_t(\mathbf{m}_{\mathrm{src}}) = f_t(\mathbf{m}_{\mathrm{tgt}})$ holds, and the second disjunct of Item 5 holds for all $t \in U \setminus U'$. This is the most involved case. Let

$$\varphi_{\text{inv}}(\mathbf{m}, \mathbf{m}') := \bigwedge_{t \in U \setminus U'} (f_t(\mathbf{m}) \le f_t(\mathbf{m}')) \quad \text{and} \quad \varphi_t(\mathbf{m}, \mathbf{m}') := (f_t(\mathbf{m}) < f_t(\mathbf{m}')).$$

Let $Q, R \subseteq P$ be respectively the maximal siphon and trap of $\mathcal{N}_{U'}$ such that $\mathbf{m}_{\mathrm{src}}(Q) = 0$ and $\mathbf{m}_{\mathrm{tgt}}(R) = 0$ (well-defined by closure under union). Let $U'' := U' \setminus (Q^{\bullet} \cup {}^{\bullet}R)$. By Theorem 1 and Proposition 2, $Q^{\bullet} \cup {}^{\bullet}R \ne \emptyset$. Thus, $U''$ is a strict subset of $U'$, and, by induction hypothesis, there is a locally closed bi-separator w.r.t. $\mathcal{N}_{U''}$ of the form $\psi = \bigvee_{1 \le i \le m} \psi_i$ that satisfies the claim for the set $U''$. Let

$$\varphi(\mathbf{m}, \mathbf{m}') := \bigvee_{t \in U \setminus U'} \varphi_t(\mathbf{m}, \mathbf{m}') \;\vee\; [\varphi_{\text{inv}}(\mathbf{m}, \mathbf{m}') \wedge \mathbf{m}(Q) + \mathbf{m}'(R) > 0] \;\vee\; \bigvee_{1 \le i \le m} [\varphi_{\text{inv}}(\mathbf{m}, \mathbf{m}') \wedge \mathbf{m}(R) + \mathbf{m}'(Q) \le 0 \wedge \psi_i(\mathbf{m}, \mathbf{m}')].$$

As $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{src}}) \in [\![\varphi_{\text{inv}}]\!]$ and $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{src}}) \in [\![\psi]\!]$, we have $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{src}}) \in [\![\varphi]\!]$. Similarly, $(\mathbf{m}_{\mathrm{tgt}}, \mathbf{m}_{\mathrm{tgt}}) \in [\![\varphi]\!]$. By Item 3, $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{tgt}}) \notin [\![\bigvee_{t \in U \setminus U'} \varphi_t(\mathbf{m}, \mathbf{m}')]\!]$. Further, $\mathbf{m}_{\mathrm{src}}(Q) + \mathbf{m}_{\mathrm{tgt}}(R) = 0$ and $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{tgt}}) \notin [\![\psi]\!]$. So, $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{tgt}}) \notin [\![\varphi]\!]$.

The number of disjuncts of $\varphi$ is $|U \setminus U'| + 1 + m$ and hence at most

$$\begin{aligned} |U \setminus U'| + 1 + 2|U''| + 1 &\le |U| - |U''| + 1 + 2|U''| + 1 \\ &= |U| + |U''| + 2 \le |U| + (|U| - 1) + 2 = 2|U| + 1. \end{aligned}$$

The same bound holds for the number of atomic propositions per disjunct.
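As a sanity check of the arithmetic above, the following small Python sketch enumerates all admissible sizes and confirms the bound. It assumes, as in this case of the proof, that $U'$ is non-empty and that $U''$ is a strict subset of $U' \subseteq U$; the function name `max_disjuncts` is introduced here only for illustration.

```python
def max_disjuncts(u):
    """Largest |U minus U'| + 1 + (2|U''| + 1) over sizes with
    1 <= |U'| <= |U| = u and 0 <= |U''| < |U'|."""
    return max(
        (u - u1) + 1 + (2 * u2 + 1)
        for u1 in range(1, u + 1)      # u1 = |U'|
        for u2 in range(0, u1)         # u2 = |U''|, a strict subset of U'
    )

# The inductive bound 2|U| + 1 is never exceeded for the sizes tried here.
bound_holds = all(max_disjuncts(u) <= 2 * u + 1 for u in range(1, 60))
```

This brute-force enumeration is of course no substitute for the two-line derivation; it merely guards against off-by-one slips.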

It remains to show that $\varphi(\mathbf{m}, \mathbf{m}')$ is locally closed w.r.t. $\mathcal{N}_U$. We only consider the forward case, as the backward case is symmetric. Let $(\mathbf{m}, \mathbf{m}') \in [\![\varphi]\!]$ and $\mathbf{m}' \xrightarrow{u} \mathbf{m}''$ for some $u \in U$. By Item 4, $\varphi_t \leadsto_u \varphi_t$ holds for each $\varphi_t$. Indeed, $f_t(\mathbf{m}) < f_t(\mathbf{m}')$ and $\mathbf{m}' \xrightarrow{u} \mathbf{m}''$ imply $f_t(\mathbf{m}) < f_t(\mathbf{m}') \le f_t(\mathbf{m}'')$, and hence $f_t(\mathbf{m}) < f_t(\mathbf{m}'')$. To handle the other clauses, we make a case distinction on $u$.

	- Case $u \in {}^{\bullet}R$. We have $\theta' \leadsto_u (\mathbf{m}(Q) + \mathbf{m}'(R) > 0)$ for any atomic proposition $\theta'$, since $\mathbf{m}' \xrightarrow{u} \mathbf{m}''$ implies $\mathbf{m}''(R) > 0$ (regardless of $\theta'$).
	- Case $u \in Q^{\bullet}$. If $\mathbf{m}'(Q) \le 0$, then $u$ is disabled in $\mathbf{m}'$. Thus, it only remains to handle $\theta_{>0} := (\mathbf{m}(Q) + \mathbf{m}'(R) > 0)$. Since $R$ is a trap of $\mathcal{N}_{U'}$, firing $u$ from $\mathbf{m}'$ does not empty $R$, and hence $\theta_{>0} \leadsto_u \theta_{>0}$.
	- Case $u \in U''$. Let $\theta_{\le 0} := (\mathbf{m}(R) + \mathbf{m}'(Q) \le 0)$ and $\theta_{>0} := (\mathbf{m}(Q) + \mathbf{m}'(R) > 0)$. Since $Q$ and $R$ are respectively a siphon and a trap of $\mathcal{N}_{U'}$, we have $\theta_{\le 0} \leadsto_u \theta_{\le 0}$ and $\theta_{>0} \leadsto_u \theta_{>0}$. Moreover, by induction hypothesis, for every $i \in [1..m]$, there exists $j \in [1..m]$ such that $\psi_i \leadsto_u \psi_j$.

We conclude the proof by observing that it is constructive and can be turned into Algorithm 1. The procedure works in polynomial time. Indeed, there are at most $|U|$ recursive calls. Moreover, each set can be obtained in polynomial time via either linear programming or maximal siphon/trap computations [9]. □

Example 2. Let us apply the construction of Theorem 2 to the Petri net and the markings of Example 1: $\mathbf{m}_{\mathrm{src}} := \{p_1 \mapsto 2, p_2 \mapsto 0, p_3 \mapsto 0, p_4 \mapsto 0\}$ and $\mathbf{m}_{\mathrm{tgt}} := \{p_1 \mapsto 0, p_2 \mapsto 0, p_3 \mapsto 1, p_4 \mapsto 0\}$. The locally closed bi-separator is the formula $\varphi$ below; in the original layout, colored arrows between its clauses depict the relations $\leadsto_{t_1}, \ldots, \leadsto_{t_4}$:

$$\begin{aligned} \varphi(\mathbf{m}, \mathbf{m}') := {} & [\mathbf{m}(p_4) < \mathbf{m}'(p_4)] \;\vee \\ & [\mathbf{m}(p_4) \le \mathbf{m}'(p_4) \wedge \mathbf{m}(p_4) + \mathbf{m}'(p_4) > 0] \;\vee \\ & [\mathbf{m}(p_4) \le \mathbf{m}'(p_4) \wedge \mathbf{m}'(p_1) + \mathbf{m}'(p_2) > 0] \;\vee \\ & [\mathbf{m}(p_4) \le \mathbf{m}'(p_4) \wedge \mathbf{m}(p_1) + \mathbf{m}(p_2) \le 0 \wedge -\mathbf{m}(p_3) \le -\mathbf{m}'(p_3)]. \end{aligned}$$

The forward separator $\psi(\mathbf{m}) := \varphi(\mathbf{m}_{\mathrm{src}}, \mathbf{m})$ is, after simplifications, given by

$$\psi(\mathbf{m}) \equiv \mathbf{m}(p_1) + \mathbf{m}(p_2) > 0 \lor \mathbf{m}(p_4) > 0.$$

Similarly, we obtain this backward separator $\psi'(\mathbf{m}) := \varphi(\mathbf{m}, \mathbf{m}_{\mathrm{tgt}})$:

$$\psi'(\mathbf{m}) \equiv \mathbf{m}(p_1) + \mathbf{m}(p_2) = 0 \land \mathbf{m}(p_3) \ge 1 \land \mathbf{m}(p_4) = 0.$$

The backward separator $\psi'$ provides a much simpler proof of $\mathbf{m}_{\mathrm{src}} \not\to^{*} \mathbf{m}_{\mathrm{tgt}}$ than the one of Example 1. The proof goes as follows: $\psi'$ is trivially backward invariant, because markings that only mark $p_3$ do not backward-enable any transition. In particular, since $\mathbf{m}_{\mathrm{tgt}}$ only marks $p_3$, it can only be reached from $\mathbf{m}_{\mathrm{tgt}}$. □
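The membership conditions of Definition 5 and the two simplifications of Example 2 can be checked mechanically. The sketch below encodes the four clauses as read from the formula of Example 2 (places are indexed 1 to 4) and compares $\varphi(\mathbf{m}_{\mathrm{src}}, \cdot)$ and $\varphi(\cdot, \mathbf{m}_{\mathrm{tgt}})$ with the stated separators on a small grid of non-negative rational markings. The grid and the helper names are assumptions made for illustration; agreement on a grid is a sanity check, not a proof of equivalence, and local closedness is not tested since the net of Example 1 is not reproduced here.

```python
from fractions import Fraction
from itertools import product

def phi(m, mp):
    """The four clauses of Example 2 on the pair (m, m')."""
    c1 = m[4] < mp[4]
    c2 = m[4] <= mp[4] and m[4] + mp[4] > 0
    c3 = m[4] <= mp[4] and mp[1] + mp[2] > 0
    c4 = m[4] <= mp[4] and m[1] + m[2] <= 0 and -m[3] <= -mp[3]
    return c1 or c2 or c3 or c4

PLACES = (1, 2, 3, 4)
m_src = {1: 2, 2: 0, 3: 0, 4: 0}
m_tgt = {1: 0, 2: 0, 3: 1, 4: 0}

def psi_forward(m):
    return m[1] + m[2] > 0 or m[4] > 0

def psi_backward(m):
    return m[1] + m[2] == 0 and m[3] >= 1 and m[4] == 0

# Sample grid of non-negative rational markings (an arbitrary choice).
values = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(2)]
samples = [dict(zip(PLACES, v)) for v in product(values, repeat=4)]
forward_ok = all(phi(m_src, m) == psi_forward(m) for m in samples)
backward_ok = all(phi(m, m_tgt) == psi_backward(m) for m in samples)
```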

Algorithm 1: Construction of a locally closed bi-separator for (msrc, mtgt).

    Input:  N = (P, T, F), U ⊆ T and msrc, mtgt ∈ Q₊^P s.t. msrc does not reach mtgt in N_U
    Output: A locally closed bi-separator w.r.t. N_U

    bi-separator(U):
      if U = ∅ then
        pick p ∈ P such that msrc(p) ≠ mtgt(p)
        return (a·m ≤ a·m′) where a := sign(msrc(p) − mtgt(p)) · e_p
      else
        b := mtgt − msrc
        X := {x ∈ R₊^T : Fx = b, supp(x) ⊆ U}
        Y_S := {y ∈ R^P : Fᵀy ≥_U 0, bᵀy ≤ 0, bᵀy < Σ_{s∈S} (Fᵀy)_s}
        if X = ∅ then
          pick y ∈ Y_∅ and return (yᵀm ≤ yᵀm′)
        else
          U′ := {u ∈ U : x(u) > 0 for some x ∈ X}
          for t ∈ U \ U′ do
            pick y_t ∈ Y_{t};  f_t(m) := y_tᵀ m
            if f_t(msrc) > f_t(mtgt) then return (f_t(m) ≤ f_t(m′))
          Q := largest siphon of N_U′ such that msrc(Q) = 0
          R := largest trap of N_U′ such that mtgt(R) = 0
          ϕ_inv := ⋀_{t ∈ U\U′} (f_t(m) ≤ f_t(m′))
          ψ_1 ∨ ··· ∨ ψ_m := bi-separator(U′ \ (Q• ∪ •R))
          return ⋁_{t ∈ U\U′} ϕ_t(m, m′)
                 ∨ [ϕ_inv(m, m′) ∧ m(Q) + m′(R) > 0]
                 ∨ ⋁_{1≤i≤m} [ϕ_inv(m, m′) ∧ m(R) + m′(Q) ≤ 0 ∧ ψ_i(m, m′)]

### 6 Checking locally closed bi-separators is in NC

We show that the problem of deciding whether a given linear formula is a locally closed bi-separator is in NC. To do so, we provide a characterization of $\psi \leadsto_t \psi'$ for homogeneous atomic propositions $\psi$ and $\psi'$. We only focus on forward firability, as backward firability can be expressed as forward firability in the transpose Petri net. Recall that $\psi \leadsto_t \psi'$ holds iff the following holds:

$$(\mathbf{m}, \mathbf{m}') \in [\![\psi]\!] \text{ and } \mathbf{m}' \xrightarrow{\alpha t} \mathbf{m}'' \text{ imply } (\mathbf{m}, \mathbf{m}'') \in [\![\psi']\!]. \tag{$*$}$$

Property ($*$) can be rephrased as:

$$(\mathbf{m}, \mathbf{m}') \in [\![\psi]\!] \text{ and } \mathbf{m}' \ge \alpha \cdot \Delta_t^- \text{ imply } (\mathbf{m}, \mathbf{m}' + \alpha \cdot \Delta_t) \in [\![\psi']\!].$$

As we will see towards the end of the section, due to homogeneity, it actually suffices to consider the case where $\alpha = 1$, which yields this reformulation:

$$\underbrace{\{(\mathbf{m}, \mathbf{m}') \in [\![\psi]\!] : \mathbf{m}' \ge \Delta_t^-\}}_{X} \;\subseteq\; \underbrace{\{(\mathbf{m}, \mathbf{m}') : (\mathbf{m}, \mathbf{m}' + \Delta_t) \in [\![\psi']\!]\}}_{Y}.$$

Therefore, testing $\psi \leadsto_t \psi'$ amounts to the inclusion check $X \subseteq Y$. Of course, if $X = \emptyset$, then this is trivial. Hence, we will suppose that $X \ne \emptyset$, assuming for now that it can somehow be tested efficiently. In the forthcoming Propositions 8 and 9, we will provide necessary and sufficient conditions for $X \subseteq Y$ to hold. In Proposition 10, we will show that these conditions are testable in NC. Then, in Proposition 11, we will explain how to check whether $X \ne \emptyset$ actually holds.

For X ⊆ Y , we can characterize the case of atomic propositions ψ that use "≤" (rather than "<") with a generalization of Farkas' lemma:

Proposition 8. Let $a, a', l \in \mathbb{R}^n$ and $b' \in \mathbb{R}$. Let $X := \{x \in \mathbb{R}^n : ax \le 0 \wedge x \ge l\}$ and $Y := \{x \in \mathbb{R}^n : a'x \le b'\}$ be such that $X \ne \emptyset$. It is the case that $X \subseteq Y$ iff there exists $\lambda \ge 0$ such that $\lambda a \ge a'$ and $-b' \le (\lambda a - a')l$.

We now give the conditions for all four combinations of "≤" and "<":

Proposition 9. Let $a, a' \in \mathbb{R}^n$, $b' \in \mathbb{R}$, $l \ge \mathbf{0}$ and $\sim, \sim' \in \{\le, <\}$. Let $X_{\sim} := \{x \ge l : ax \sim 0\}$ and $Y_{\sim'} := \{x \in \mathbb{R}^n : a'x \sim' b'\}$ be such that $X_{\sim} \ne \emptyset$. It holds that $X_{\sim} \subseteq Y_{\sim'}$ iff there exists $\lambda \ge 0$ s.t. $\lambda a \ge a'$ and one of the following holds:

1. $\sim'$ is $\le$, and $-b' \le (\lambda a - a')l$;
2. $\sim$ is $\le$, $\sim'$ is $<$, and $-b' < (\lambda a - a')l$;
3. $\sim$ is $<$, $\sim'$ is $<$, and either $-b' < (\lambda a - a')l$, or $-b' = (\lambda a - a')l \wedge \lambda > 0$.
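The three items of Proposition 9 translate directly into a check that a given candidate $\lambda$ witnesses the inclusion. The following Python sketch does exactly that; it does not search for $\lambda$ and does not test $X_{\sim} \ne \emptyset$, which is assumed. The function names and the small instances in the test data are hypothetical.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def certifies_inclusion(a, a2, b2, l, rel, rel2, lam):
    """Check whether lam witnesses X_rel ⊆ Y_rel2 in the sense of Proposition 9.
    a2, b2 stand for a', b'; rel and rel2 are "<=" or "<".
    X_rel is assumed to be non-empty (not checked here)."""
    if lam < 0:
        return False
    diff = [lam * x - y for x, y in zip(a, a2)]    # lam * a - a'
    if any(d < 0 for d in diff):                   # need lam * a >= a'
        return False
    rhs = dot(diff, l)                             # (lam * a - a') l
    if rel2 == "<=":                               # item 1
        return -b2 <= rhs
    if rel == "<=":                                # item 2 (rel2 is "<")
        return -b2 < rhs
    return -b2 < rhs or (-b2 == rhs and lam > 0)   # item 3

# Hypothetical instance: X = {x >= 0 : x1 ~ x2}, Y = {x : x1 ~' x2}.
a, a2, b2, l = (1, -1), (1, -1), 0, (0, 0)
le_le = certifies_inclusion(a, a2, b2, l, "<=", "<=", 1)
le_lt = certifies_inclusion(a, a2, b2, l, "<=", "<", 1)
lt_lt = certifies_inclusion(a, a2, b2, l, "<", "<", 1)
```

In this instance, $\lambda = 1$ is the only value with $\lambda a \ge a'$, so the failure of the check for the pair ($\le$, $<$) reflects that $\{x_1 \le x_2\} \not\subseteq \{x_1 < x_2\}$, whereas the pair ($<$, $<$) is accepted through the equality branch of item 3.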

Proof.

1. If $\sim$ is $\le$, then it follows immediately from Proposition 8. Thus, assume $\sim$ is $<$. We claim that $X_{<} \subseteq Y_{\le}$ iff $X_{\le} \subseteq Y_{\le}$. The validity of this claim concludes the proof of this case, as we have handled the case where $\sim$ is $\le$ and as $X_{\le} \supseteq X_{<} \ne \emptyset$.

Let us show the claim. It is clear that $X_{<} \subseteq Y_{\le}$ is implied by $X_{\le} \subseteq Y_{\le}$. So, we only have to show the direction from left to right. For the sake of contradiction, suppose that $X_{<} \subseteq Y_{\le}$ and $X_{\le} \not\subseteq Y_{\le}$. Let $X_{=} := X_{\le} \setminus X_{<}$. Note that $X_{=} \ne \emptyset$. Let $x \in X_{<}$ and $x' \in X_{=} \setminus Y_{\le}$. We have $x, x' \ge l$, $ax < 0$, $ax' = 0$, $a'x = c \le b'$ and $a'x' = c' > b'$ for some $c, c' \in \mathbb{R}$. In particular, $b' \in [c, c')$. Let $\epsilon \in (0, 1]$ be such that $b' < \epsilon c + (1 - \epsilon)c'$. Let $x'' := \epsilon x + (1 - \epsilon)x'$. Observe that $x'' \ge l$. Moreover, we have:

$$\begin{aligned} ax'' &= \epsilon ax + (1 - \epsilon)ax' = \epsilon ax < 0, \\ a'x'' &= \epsilon a'x + (1 - \epsilon)a'x' = \epsilon c + (1 - \epsilon)c' > b'. \end{aligned}$$

Therefore, we have $x'' \in X_{<}$ and $x'' \notin Y_{\le}$, which is a contradiction.

2. ⇒) Since $X_{\le} \subseteq Y_{<}$, the system $\exists x : x \ge l \wedge ax \le 0 \wedge a'x \ge b'$ has no solution. In matrix notation, the system corresponds to $\exists x : Ax \le c$ where

$$A := \begin{bmatrix} -I \\ a \\ -a' \end{bmatrix} \text{ and } c := \begin{pmatrix} -l \\ 0 \\ -b' \end{pmatrix}.$$

By Farkas' lemma (Lemma 1), $A^{\mathsf{T}}y = \mathbf{0}$ and $c^{\mathsf{T}}y < 0$ for some $y \ge \mathbf{0}$. In other words,

$$\exists z \ge \mathbf{0},\ \lambda, \lambda' \ge 0 : \lambda a - \lambda' a' = z \;\wedge\; -\lambda' b' < zl.$$

Since $z \ge \mathbf{0}$, we have $\lambda a \ge \lambda' a' \wedge -\lambda' b' < (\lambda a - \lambda' a')l$. If $\lambda' > 0$, then we are done by dividing all terms by $\lambda'$. For the sake of contradiction, suppose that $\lambda' = 0$. This means that $\lambda a \ge \mathbf{0}$ and $0 < \lambda al$. We necessarily have $\lambda > 0$ and $al > 0$. Let $x \in X_{\le}$. We have $0 \ge ax \ge al > 0$, which is a contradiction.

⇐) Let $x \in X_{\le}$. We have $a'x < b'$ and hence $x \in Y_{<}$ as desired, since:

$$\begin{aligned} -b' &< (\lambda a - a')l \\ &\le (\lambda a - a')x && \text{(by } \lambda a - a' \ge \mathbf{0} \text{ and } x \ge l \ge \mathbf{0}\text{)} \\ &= \lambda ax - a'x \\ &\le -a'x && \text{(by } \lambda \ge 0 \text{ and } ax \le 0\text{)}. \end{aligned}$$

3. The proof is similar albeit slightly more complicated. □

The conditions arising from Proposition 9 involve solving linear programs with one variable λ. It is easy to see that this problem is in NC:

Proposition 10. Given $a, b \in \mathbb{Q}^n$ and $\sim\, \in \{\le, <\}^n$, testing $\exists \lambda \ge 0 : a\lambda \sim b$ is in NC.
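To see why a single variable makes the problem easy, note that each constraint $a_i \lambda \sim_i b_i$ yields either an upper bound, a lower bound, or a constant condition on $\lambda$, and the system is feasible iff the tightest lower bound lies below the tightest upper bound (with care for strictness). The sketch below is a plain sequential Python rendering of this idea using exact rationals; it is not a parallel (NC) implementation, and the function name is an assumption. The bounds are computed independently of one another, which is what makes the test amenable to parallel evaluation.

```python
from fractions import Fraction

def exists_lambda(a, b, rels):
    """Decide whether some lam >= 0 satisfies a[i] * lam rels[i] b[i] for all i,
    where rels[i] is "<=" or "<"."""
    lo, lo_strict = Fraction(0), False      # lam >= 0
    hi, hi_strict = None, False             # no upper bound yet
    for ai, bi, rel in zip(a, b, rels):
        ai, bi = Fraction(ai), Fraction(bi)
        strict = rel == "<"
        if ai == 0:
            # Constant condition 0 ~ b[i], independent of lam.
            if not (0 < bi or (bi == 0 and not strict)):
                return False
        elif ai > 0:
            # Upper bound: lam ~ b[i] / a[i].
            bound = bi / ai
            if hi is None or bound < hi or (bound == hi and strict):
                hi, hi_strict = bound, strict or (bound == hi and hi_strict)
        else:
            # Dividing by a[i] < 0 flips the inequality: lower bound on lam.
            bound = bi / ai
            if bound > lo or (bound == lo and strict):
                lo, lo_strict = bound, strict or (bound == lo and lo_strict)
    if hi is None or lo < hi:
        return True
    return lo == hi and not lo_strict and not hi_strict
```

For instance, the constraints $\lambda \le 2$ and $-\lambda \le -1$ are jointly satisfiable, while $\lambda < 1$ and $-\lambda \le -1$ are not.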

Recall that at the beginning of the section we made the assumption that some pair $(\mathbf{m}, \mathbf{m}') \in [\![\psi]\!]$ is such that $\mathbf{m}'$ enables a transition $t$. Checking whether this is actually true has a cost. Fortunately, we provide a simple characterization of enabledness which can be checked in NC. Formally, we say that $\varphi$ enables $t$ if there exists $(\mathbf{m}, \mathbf{m}') \in [\![\varphi]\!]$ such that $\mathbf{m}'$ $\alpha$-enables $t$ for some $\alpha > 0$. We have:

Proposition 11. Let $\varphi_{\sim}(\mathbf{m}, \mathbf{m}') := \mathbf{a}\mathbf{m} \sim \mathbf{b}\mathbf{m}'$ where $\mathbf{a}, \mathbf{b} \in \mathbb{R}^{P}$. The following holds:


Proof.

1. ⇒) Since $\varphi_{<}$ enables $u$, we have $[\![\varphi_{<}]\!] \ne \emptyset$. Let $(\mathbf{m}, \mathbf{m}') \in [\![\varphi_{<}]\!]$. We have $\mathbf{a}\mathbf{m} < \mathbf{b}\mathbf{m}'$. It cannot be that $\mathbf{a} \ge \mathbf{0}$ and $\mathbf{b} \le \mathbf{0}$, as otherwise $\mathbf{a}\mathbf{m} \ge 0 \ge \mathbf{b}\mathbf{m}'$.

⇐) It suffices to give a pair $(\mathbf{m}, \mathbf{m}') \in [\![\varphi_{<}]\!]$ such that $\mathbf{m}' \ge \Delta_u^-$. Informally, if $\mathbf{a}$ has a negative value (resp. $\mathbf{b}$ has a positive value), then we can consider the pair $(\mathbf{0}, \Delta_u^-)$ and "fix" the value on the left-hand side (resp. right-hand side) so that $\varphi_{<}$ is satisfied. More formally, if $\mathbf{a}(p) < 0$, then $(k\mathbf{e}_p, \Delta_u^-) \in [\![\varphi_{<}]\!]$ with $k := (|\mathbf{b}\Delta_u^-| + 1)/|\mathbf{a}(p)|$; if $\mathbf{b}(p) > 0$, then $(\mathbf{0}, \Delta_u^- + k\mathbf{e}_p) \in [\![\varphi_{<}]\!]$ with $k := (|\mathbf{b}\Delta_u^-| + 1)/\mathbf{b}(p)$.

2. The proof is similar albeit slightly more complicated. □
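The "⇐" direction of the first item is constructive, and the witness pair can be computed directly. Below is a minimal Python sketch of that construction for the strict atom $\mathbf{a}\mathbf{m} < \mathbf{b}\mathbf{m}'$; vectors are plain lists indexed by place, and the function name and the sample data are hypothetical. It returns `None` exactly when $\mathbf{a} \ge \mathbf{0}$ and $\mathbf{b} \le \mathbf{0}$, the case excluded in the "⇒" direction.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def enabling_witness(a, b, delta_minus):
    """Return a pair (m, m') with a.m < b.m' and m' >= delta_minus,
    following the proof of Proposition 11, or None if a >= 0 and b <= 0."""
    n = len(a)
    base = Fraction(abs(dot(b, delta_minus)) + 1)   # |b . delta_minus| + 1
    for p in range(n):
        if a[p] < 0:
            k = base / abs(a[p])
            m = [k if q == p else Fraction(0) for q in range(n)]
            return m, [Fraction(x) for x in delta_minus]
    for p in range(n):
        if b[p] > 0:
            k = base / b[p]
            m_prime = [Fraction(delta_minus[q]) + (k if q == p else 0)
                       for q in range(n)]
            return [Fraction(0)] * n, m_prime
    return None

# Hypothetical data: two places, a transition consuming one token from place 0.
witness = enabling_witness([1, 0], [0, 1], [1, 0])
```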

We can finally show that testing $\psi \leadsto_t \psi'$ can be done in NC, for atomic propositions $\psi$ and $\psi'$. In turn, this allows us to show that we can test in NC whether a linear formula is a locally closed bi-separator.

Proposition 12. Given a Petri net $\mathcal{N}$, a transition $t$ and homogeneous atomic propositions $\psi$ and $\psi'$, testing whether $\psi \leadsto_t \psi'$ can be done in NC.

Proof. Recall that addition, subtraction, multiplication, division and comparison can be done in NC. Note that, by Proposition 11, we can check whether $\psi$ enables $t$ in NC. If it does, then we must test whether $(\mathbf{m}, \mathbf{m}') \in [\![\psi]\!]$ and $\mathbf{m}' \xrightarrow{\alpha t} \mathbf{m}''$ imply $(\mathbf{m}, \mathbf{m}'') \in [\![\psi']\!]$. We claim that this amounts to testing $X \subseteq Y$, where:

$$\begin{aligned} X &:= \{(\mathbf{m}, \mathbf{m}') \in \mathbb{R}_+^P \times \mathbb{R}_+^P : (\mathbf{m}, \mathbf{m}') \in [\![\psi]\!] \text{ and } (\mathbf{m}, \mathbf{m}') \ge (\mathbf{0}, \Delta_t^-)\}, \\ Y &:= \{(\mathbf{m}, \mathbf{m}') \in \mathbb{R}_+^P \times \mathbb{R}_+^P : (\mathbf{m}, \mathbf{m}' + \Delta_t) \in [\![\psi']\!]\}. \end{aligned}$$

Let us prove this claim.

⇒) Let $(\mathbf{m}, \mathbf{m}') \in X$. We have $(\mathbf{m}, \mathbf{m}') \in [\![\psi]\!]$ and $(\mathbf{m}, \mathbf{m}') \ge (\mathbf{0}, \Delta_t^-)$. Thus $\mathbf{m}' \xrightarrow{t} \mathbf{m}' + \Delta_t$. By assumption, $(\mathbf{m}, \mathbf{m}' + \Delta_t) \in [\![\psi']\!]$, and hence $(\mathbf{m}, \mathbf{m}') \in Y$.

⇐) Let $(\mathbf{m}, \mathbf{m}') \in [\![\psi]\!]$ and $\mathbf{m}' \xrightarrow{\alpha t} \mathbf{m}''$. We have $\mathbf{m}' \ge \alpha\Delta_t^-$ and $\mathbf{m}'' = \mathbf{m}' + \alpha\Delta_t$. Let $\mathbf{k} := \mathbf{m}/\alpha$, $\mathbf{k}' := \mathbf{m}'/\alpha$ and $\mathbf{k}'' := \mathbf{m}''/\alpha$. As $\alpha > 0$ and $\psi$ is homogeneous, we have $(\mathbf{k}, \mathbf{k}') \in [\![\psi]\!]$, $(\mathbf{k}, \mathbf{k}') \ge (\mathbf{0}, \Delta_t^-)$ and $\mathbf{k}'' = \mathbf{k}' + \Delta_t$. Thus, $(\mathbf{k}, \mathbf{k}') \in X \subseteq Y$. By definition of $Y$, this means that $(\mathbf{k}, \mathbf{k}'') \in [\![\psi']\!]$. By homogeneity, we conclude that $(\mathbf{m}, \mathbf{m}'') \in [\![\psi']\!]$.

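Now that we have shown the claim, let us explain how to check whether $X \subseteq Y$ in NC. Note that $X \ne \emptyset$ since $\psi$ enables $t$. Thus, by Proposition 9, testing $X \subseteq Y$ amounts to solving a linear program in one variable. For example, suppose that $\psi = (\mathbf{a} \cdot (\mathbf{m}, \mathbf{m}') \le 0)$ and $\psi' = (\mathbf{a}' \cdot (\mathbf{m}, \mathbf{m}') < 0)$. Then $Y$ consists of the pairs $(\mathbf{m}, \mathbf{m}')$ with $\mathbf{a}' \cdot (\mathbf{m}, \mathbf{m}') < -\mathbf{a}' \cdot (\mathbf{0}, \Delta_t)$, so Proposition 9 applies with $l = (\mathbf{0}, \Delta_t^-)$ and $b' = -\mathbf{a}' \cdot (\mathbf{0}, \Delta_t)$, and we must check whether this system has a solution:

$$
\exists \lambda \ge 0 : \lambda \mathbf{a} \ge \mathbf{a}' \land \mathbf{a}' \cdot (\mathbf{0}, \Delta_t) < (\lambda \mathbf{a} - \mathbf{a}') \cdot (\mathbf{0}, \Delta_t^-).
$$

Thus, by Proposition 10, testing $X \subseteq Y$ can be done in NC. □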
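As an illustration of how such a one-variable system is instantiated, consider a small worked instance. The net is hypothetical and not taken from the paper: two places $p_1, p_2$ and a single transition $t$ that moves one token from $p_1$ to $p_2$, so that $\Delta_t^- = (1, 0)$ and $\Delta_t = (-1, 1)$. Take $\psi = \psi' = (\mathbf{m}(p_1) + \mathbf{m}(p_2) \le \mathbf{m}'(p_1) + \mathbf{m}'(p_2))$, i.e. $\mathbf{a} = \mathbf{a}' = (1, 1, -1, -1)$. Since both propositions use "$\le$", item 1 of Proposition 9 applies:

$$
\begin{aligned}
l &= (\mathbf{0}, \Delta_t^-) = (0, 0, 1, 0), \qquad b' = -\mathbf{a}' \cdot (\mathbf{0}, \Delta_t) = -\big((-1)(-1) + (-1)(1)\big) = 0, \\
\lambda \mathbf{a} \ge \mathbf{a}' &\iff \lambda \ge 1 \;\wedge\; -\lambda \ge -1 \iff \lambda = 1, \\
-b' = 0 &\le (\lambda \mathbf{a} - \mathbf{a}') \cdot l = \mathbf{0} \cdot l = 0 \qquad \text{for } \lambda = 1.
\end{aligned}
$$

Hence $\lambda = 1$ is a solution and $\psi \leadsto_t \psi$ holds in this instance, which matches the intuition that $t$ preserves the total number of tokens. Note that $X \ne \emptyset$ here, as witnessed by $\mathbf{m} = (0, 0)$ and $\mathbf{m}' = (1, 0)$.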

Theorem 3. Given $\mathcal{N} = (P, T, F)$, $\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{tgt}} \in \mathbb{Q}_+^P$ and a formula $\varphi$, testing whether $\varphi$ is a locally closed bi-separator for $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{tgt}})$ can be done in NC.

Proof. Recall that $\varphi = \varphi_1 \vee \cdots \vee \varphi_n$ must be in DNF with homogeneous atomic propositions. As arithmetic is in NC and $\varphi$ is in DNF, we can test whether $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{src}}) \in [\![\varphi]\!]$, $(\mathbf{m}_{\mathrm{tgt}}, \mathbf{m}_{\mathrm{tgt}}) \in [\![\varphi]\!]$ and $(\mathbf{m}_{\mathrm{src}}, \mathbf{m}_{\mathrm{tgt}}) \notin [\![\varphi]\!]$ in NC by evaluating $\varphi$ in parallel. We can further test whether $\varphi$ is locally closed by checking the following (which is simply the definition of "locally closed"):

$$\left[\bigwedge_{\substack{t \in T \\ i \in [1..n]}} \bigvee_{j \in [1..n]} \bigwedge_{\psi' \in \varphi_j} \bigvee_{\psi \in \varphi_i} \psi \leadsto_t \psi'\right] \wedge \left[\bigwedge_{\substack{t \in T^{\mathsf{T}} \\ i \in [1..n]}} \bigvee_{j \in [1..n]} \bigwedge_{\psi' \in \varphi_j} \bigvee_{\psi \in \varphi_i} \psi^{\mathsf{T}} \leadsto_t \psi'^{\mathsf{T}}\right].$$

By Proposition 12, each test $\psi \leadsto_t \psi'$ can be carried out in NC. Therefore, we can perform all of them in parallel. Note that we do not have to explicitly compute the transpose of transitions and formulas; we can simply swap arguments. □

Remark 1. Testing whether $\varphi$ is locally closed is even simpler if the tester is also given annotations indicating, for every clause $\varphi_i$ and transition $t$, which clause $\varphi_j$ is supposed to satisfy $\varphi_i \leadsto_t \varphi_j$. This mapping is a byproduct of the procedure to compute a locally closed bi-separator, and so comes at no cost. □
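The combinatorial structure of the forward check, with and without the annotations of Remark 1, can be sketched as follows. This Python sketch treats the atomic test $\psi \leadsto_t \psi'$ as a black-box predicate `atom_implies` (in the paper it is decided via Proposition 12); the clause representation, the function names and the toy oracle are assumptions made for illustration. The backward check is analogous, with the arguments swapped.

```python
def clause_implies(clause_i, clause_j, t, atom_implies):
    """clause_i t-implies clause_j: every atom of clause_j is t-implied
    by some atom of clause_i."""
    return all(
        any(atom_implies(psi, psi2, t) for psi in clause_i)
        for psi2 in clause_j
    )

def forward_locally_closed(clauses, transitions, atom_implies, annotation=None):
    """Forward half of the local closedness check. If `annotation` maps
    (i, t) to the index j of the target clause, only that j is tried."""
    for t in transitions:
        for i, clause_i in enumerate(clauses):
            if annotation is not None:
                candidates = [annotation[(i, t)]]
            else:
                candidates = range(len(clauses))
            if not any(
                clause_implies(clause_i, clauses[j], t, atom_implies)
                for j in candidates
            ):
                return False
    return True

# Toy oracle (hypothetical): atoms are labels, t-implication is a lookup table.
table = {("A", "A", "t"), ("B", "B", "t")}
oracle = lambda psi, psi2, t: (psi, psi2, t) in table
clauses = [["A"], ["A", "B"]]
closed = forward_locally_closed(clauses, ["t"], oracle)
annotated = forward_locally_closed(
    clauses, ["t"], oracle, annotation={(0, "t"): 0, (1, "t"): 0}
)
```

Without annotations, every pair of clauses may have to be tried for each transition; with them, a single clause-level test per pair $(i, t)$ suffices, which is the simplification Remark 1 points out.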

### 7 Bi-separators for set-to-set unreachability

In most applications, one does not have to prove unreachability of one marking, but rather of a set of markings, usually defined by means of some simple linear constraints. We show that our approach can be extended to "set-to-set reachability", i.e. queries of the form $\exists \mathbf{m}_{\mathrm{src}} \in A, \mathbf{m}_{\mathrm{tgt}} \in B : \mathbf{m}_{\mathrm{src}} \to^{*} \mathbf{m}_{\mathrm{tgt}}$, which we denote by $A \to^{*} B$. We focus on the case where sets $A$ and $B$ are described by conjunctions of atomic propositions; in other words, $A$ and $B$ are convex polytopes defined as intersections of half-spaces. In particular, this includes "coverability" queries, which are important in practice, i.e. where $A$ is a singleton and $B$ is of the form $\{\mathbf{m} : \mathbf{m} \ge \mathbf{b}\}$. More generally, our approach can directly be adapted to convex linear Horn constraints, which form a fragment of linear arithmetic that extends linear programs and that captures the expressiveness of continuous Petri nets [6].

As shown in [6, Lem. 3.7], given an atomic proposition $\psi = (\mathbf{a}\mathbf{x} \sim b)$, one can construct (in logarithmic space) a Petri net $\mathcal{N}_{\psi}$ and some $\mathbf{y} \in \{0, 1\}^{5}$ such that $\psi(\mathbf{x})$ holds iff $(\mathbf{x}, \mathbf{y}) \to^{*} (\mathbf{0}, \mathbf{0})$ in $\mathcal{N}_{\psi}$. The idea, depicted in Figure 2 (which is adapted from [6, Fig. 1]), is simply to cancel out positive and negative coefficients of $\psi$. It is straightforward to adapt this construction to a conjunction $\bigwedge_{1 \le i \le k} \psi_i(\mathbf{x})$ of atomic propositions. Indeed, it suffices to make $k$ copies of the gadget, but where places $\{p_1, \ldots, p_n\}$ and transitions $\{t_1, \ldots, t_n\}$ are shared. In this more general setting, $t_i$ consumes from $p_i$ and simultaneously spawns the respective coefficient to each copy. In summary, the following holds:

Fig. 2. Petri net for $\psi(\mathbf{x}) = (a_1 \cdot x_1 + \cdots + a_n \cdot x_n > c)$ where $a_1, a_2, c > 0$ and $a_n < 0$.

Proposition 13. Given a conjunction of $k$ atomic propositions $\varphi$, it is possible to construct, in logarithmic space, a Petri net $\mathcal{N}_{\varphi}$ and $\mathbf{y} \in \{0, 1\}^{5k}$ such that $\varphi(\mathbf{x})$ holds iff $(\mathbf{x}, \mathbf{y}) \to^{*} (\mathbf{0}, \mathbf{0})$ in $\mathcal{N}_{\varphi}$.

With the previous construction in mind, we can reformulate any set-to-set reachability query into a standard ("marking-to-marking") reachability query.

Proposition 14. Given a Petri net $\mathcal{N}$ and convex polytopes $A$ and $B$ described as conjunctions of atomic propositions, one can construct, in logarithmic space, a Petri net $\mathcal{N}'$ and markings $\mathbf{m}_{\mathrm{src}}$ and $\mathbf{m}_{\mathrm{tgt}}$ s.t. $A \to^{*} B$ in $\mathcal{N}$ iff $\mathbf{m}_{\mathrm{src}} \to^{*} \mathbf{m}_{\mathrm{tgt}}$ in $\mathcal{N}'$.

Proof. Let $\mathcal{N} = (P, T, F_-, F_+)$ where $P = \{p_1, \ldots, p_n\}$. Let us describe $\mathcal{N}' = (P', T', F'_-, F'_+)$ with the help of Figure 3. The Petri net $\mathcal{N}'$ extends $\mathcal{N}$ as follows:


Fig. 3. Reduction from set-to-set reachability to (marking-to-marking) reachability.

The Petri net $\mathcal{N}'$ is intended to work sequentially as follows: (1) guess the initial marking $\mathbf{m}$ of $\mathcal{N}$; (2) execute $\mathcal{N}$ on $\mathbf{m}$ and reach a marking $\mathbf{m}'$; and (3) test whether $\mathbf{m} \in A$ and $\mathbf{m}' \in B$. If $\mathcal{N}'$ follows this order, then it is straightforward to see that $A \to^{*} B$ in $\mathcal{N}$ iff $(\mathbf{0}, \mathbf{0}, \mathbf{y}, \mathbf{y}') \to^{*} (\mathbf{0}, \mathbf{0}, \mathbf{0}, \mathbf{0})$ in $\mathcal{N}'$, where $\mathbf{y}$ and $\mathbf{y}'$ are obtained from Proposition 13. However, $\mathcal{N}'$ may interleave the different phases.<sup>4</sup> Nonetheless, this is not problematic, as any run of $\mathcal{N}'$ can be reordered in such a way that all three phases are consecutive. Indeed, phase (1) only produces tokens in $P \cup P'$, and phase (3) only consumes tokens from $P \cup P'$. □

As a consequence of Proposition 14, combined with Theorems 2 and 3, we obtain the following corollary:

Corollary 1. A negative answer to a convex polytope query $A \to^{*} B$ is witnessed by a locally closed bi-separator computable in polynomial time and checkable in NC.

### 8 Conclusion

We have shown that continuous Petri nets admit locally closed bi-separators that can be efficiently computed. These separators are succinct and very efficiently checkable certificates of unreachability. In particular, checking that a linear formula is a locally closed bi-separator is in NC, and only requires solving linear inequalities in one variable over the nonnegative reals.

Verification tools that have not been formally verified, or rely (as is usually the case) on external packages for linear arithmetic, can apply our results to provide certificates for their output. Further, our separators can be used as explanations of why a certain marking is unreachable. Obtaining minimal explanations is an interesting research avenue.

From a logical point of view, separators are very closely related to interpolants for linear arithmetic, which are widely used in formal verification to refine abstractions in the CEGAR approach [3,17,18,1]. We intend to explore whether they can constitute the basis of a CEGAR approach for the verification of continuous Petri nets.

Acknowledgments. We thank the anonymous referees for their comments, and in particular for suggesting a more intuitive definition of bi-separator.

### References

1. Althaus, E., Beber, B., Kupilas, J., Scholl, C.: Improving interpolants for linear arithmetic. In: Proc. 13th International Symposium on Automated Technology for Verification and Analysis (ATVA). pp. 48–63 (2015). https://doi.org/10.1007/978-3-319-24953-7_5

<sup>4</sup> It is tempting to implement a lock, but this only works under discrete semantics.


18. Scholl, C., Pigorsch, F., Disch, S., Althaus, E.: Simple interpolants for linear arithmetic. In: Proc. Conference & Exhibition on Design, Automation & Test in Europe (DATE). pp. 1–6 (2014). https://doi.org/10.7873/DATE.2014.128


### Graphical Piecewise-Linear Algebra

Guillaume Boisseau<sup>1</sup>  and Robin Piedeleu<sup>2</sup>

<sup>1</sup> University of Oxford, Oxford, UK, guillaume.boisseau@cs.ox.ac.uk

<sup>2</sup> University College London, London, UK, r.piedeleu@ucl.ac.uk

Abstract. Graphical (Linear) Algebra is a family of diagrammatic languages that allow one to reason compositionally about different kinds of subsets of vector spaces. It has been used to model various application domains, from signal-flow graphs to Petri nets and electrical circuits. In this paper, we introduce to the family its most expressive member to date: Graphical Piecewise-Linear Algebra, a new language to specify piecewise-linear subsets of vector spaces.

Like the previous members of the family, it comes with a complete axiomatisation, which means it can be used to reason about the corresponding semantic domain purely equationally, forgetting the set-theoretic interpretation. We show completeness using a single axiom on top of Graphical Polyhedral Algebra, and show that this extension is the smallest that can capture a variety of relevant constructs.

Finally, we showcase its use by modelling the behaviour of stateless electronic circuits of ideal elements, a domain that had remained outside the remit of previous diagrammatic languages.

Keywords: string diagrams · piecewise-linear · prop · axiomatisation

### 1 Introduction

Functional thinking underpins most scientific models. Nature, however, does not distinguish inputs and outputs—physical systems are governed by laws that merely express relations between their observable variables. While influential scientists, like the famous control theorist J. Willems, have pointed out the blind spots of functional thinking [11], it has remained the dominant paradigm in science and engineering. Arguably, our mathematical practice, especially the foundational emphasis on sets and functions, and the limitations of standard algebraic syntax, are partially to blame for the persistence of this status quo. But there are also alternative approaches, that take relations seriously as the primitive building blocks of our mathematical models. Category theory in particular is agnostic about what constitutes a morphism and can accommodate relations as easily as functions.

Relations, with their usual composition and the cartesian product of sets, form a monoidal category—a category in which morphisms can be composed in two different ways. As a result, they admit a natural two-dimensional syntax of string diagrams. This notation has several advantages when it comes to reasoning about open and interconnected systems [1]: string diagrams naturally keep track of structural properties, such as interconnectivity; they factor out irrelevant topological information that standard algebraic syntax needs to keep explicit; variable-sharing—the relational form of composition for systems—is depicted simply by wiring different components together.

As a result, a wealth of recent developments in computer science and beyond have adopted relations and their diagrammatic notation as a unifying language to reason about a broad range of systems, from electrical circuits to Petri nets [2,6,5]. Many of these follow the same methodology. 1) Given a class of systems, find a set of diagrammatic generators from which any system can be specified, using the two available forms of composition. 2) Interpret each of them as a relation between the observable variables of the system that they describe. This defines a structure-preserving mapping—a monoidal functor—from the diagrammatic syntax to the semantics, from the two-dimensional representation of a system to its behaviour. 3) Finally, identify a convenient set of equations between diagrams, from which any semantic equality between the behaviour of the corresponding systems may be derived.

Graphical linear algebra (GLA) is a paradigmatic example of this approach. It provides a diagrammatic syntax to reason compositionally about different types of linear dynamical systems (including for instance traditional signal flow graphs) and prove their behavioural equivalence purely diagrammatically. The syntax of GLA is generated by the following primitive components:

[Diagrams of the generators of GLA: the black nodes, the white nodes, and a scalar generator labelled x, for x ∈ K.]

As relations, the black nodes force all of their ports to share the same value; the white nodes constrain their left ports and the right ports to sum to the same value (or to zero when there are no left/right ports); the final generator, parameterised by an element of the chosen field K, behaves as an amplifier: its right value is x times the left value. Following point 3) of the methodology sketched above, GLA enjoys a sound and complete equational theory for the specified semantics, called the theory of Interacting Hopf Algebras (IH). In summary, string diagrams with n ports on the left and m ports on the right, quotiented by the axioms of IH, are precisely linear relations, i.e., linear subspaces of K<sup>n</sup> × K<sup>m</sup>.

GLA was the starting point of different extensions, two of which play a prominent role in this paper. First, Graphical Affine Algebra, which adds to the syntax a generator for the constant 1. This allows it to express affine relations, i.e. affine subspaces of K<sup>n</sup> × K<sup>m</sup>. A corresponding complete equational theory was presented in [6]. Then, Graphical Polyhedral Algebra (GPA), which assumes that K is an ordered field and adds a generator ≥ for this order. The resulting graphical calculus can express all polyhedral relations, i.e., polyhedra<sup>3</sup> in K<sup>n</sup> × K<sup>m</sup>, and also comes with its own complete axiomatisation.

In this paper, we define the most expressive member of the GLA family tree to date: Graphical Piecewise-Linear Algebra (GPLA) is a hybrid of symbolic and diagrammatic syntax for piecewise-linear (pl) relations (finite unions of polyhedra in K<sup>n</sup> × K<sup>m</sup>) and a corresponding complete equational theory. We argue below that the proposed language strikes a convincing balance between structure and expressiveness. It is a simple extension of GPA [4], yet for K = R, it is sufficiently powerful to approximate any submanifold of R<sup>n</sup> arbitrarily closely.

<sup>3</sup> For the case of R, these include the usual polytopes, which are bounded subsets of R<sup>n</sup> × R<sup>m</sup>, as well as proper polyhedra, which may have unbounded faces.

Furthermore, this extension completes a research program initiated in parallel with the birth of GLA [2,6,3]: its chief purpose was to give the informal graphical notation for electrical circuits a formal, compositional interpretation, with a corresponding equational theory.

Until now however, the category-theoretic setting could only accommodate components with a linear (more precisely, affine) behaviour, such as resistors, inductors, capacitors, voltage and current sources. GPLA finally makes it possible to reason equationally about electronic components, such as ideal diodes and transistors. Even when the idealised physical behaviour of these components is not necessarily piecewise-linear, GPLA is theoretically expressive enough to approximate it as closely as necessary. Indeed, piecewise-linear approximations of transistor behaviour have been proposed to bypass the unavoidable abstraction leaks of purely digital circuits [9]. In this context, GPLA can serve as a form of abstract interpretation for electronic circuits, with adjustable precision to allow for the intended semantics to be as physically realistic as desired. Of course, in practice, working with large diagrams can be prohibitive. But this is a limitation shared by all members of the Graphical Algebra family, and developing convenient tools and techniques for diagrammatic reasoning is an active research area. Our main thrust is that piecewise-linearity provides the appropriate level of structure, where general relations are too flexible to come with a useful equational theory, and linear relations are too rigid to accommodate diodes and other electronic components.

Finally, a remark about syntax. While it is possible to make the language purely diagrammatic, we found that what one gains in purity one loses in complexity. Ultimately, the hybrid syntax of union and diagrams is more convenient to manipulate and intuitive to read. In fact, this is not the first time that sums of diagrams appear in the literature [8]. Nevertheless, one of our central technical contributions is the rigorous definition of a syntax blending diagrams and binary joins, and the corresponding notion of equational theory.

Outline. In Section 2 we recall the necessary mathematical background, the fundamentals of diagrammatic syntax, and the language of Graphical Polyhedral Algebra (GPA). In Section 3, we extend the diagrammatic syntax with unions and define the notion of symmetric monoidal semi-lattice theory. From there, in Section 4, we extend GPA with unions, to capture piecewise-linear relations, and give this new language a theory that we prove is complete (Theorem 2). This is our main technical contribution. In Section 5, we explore alternative languages for piecewise-linear relations, and show that they are all equally expressive. Finally, in Section 6, we extend the compositional re-interpretation of electrical circuits from [3] to include electronic components, namely diodes and transistors.

### 2 Preliminaries

Informally, our starting point is a simple diagrammatic language of circuits built from the following generators:

$$[\text{diagrams of the generators: the black nodes, the white nodes, the constant } 1 \text{, the order } \geq \text{, and the scalars } x] \quad (x \in \mathbb{K}) \tag{1}$$

We will explain how these basic components can be wired together and give them a formal interpretation.

### 2.1 Props and Symmetric Monoidal Theories

The mathematical backbone of our approach is the notion of product and permutations category (prop), a structure which generalises standard algebraic theories [7]. Formally, a prop is a strict symmetric monoidal category (SMC) whose objects are the natural numbers and where the monoidal product ⊕ on objects is given by addition. Equivalently, it is a strict SMC whose objects are all monoidal products of a single generating object. Prop morphisms are strict symmetric monoidal functors that act as the identity on objects.

Following an established methodology, we will define two props: Syn and Sem, for the syntax and semantics respectively. To guarantee a compositional interpretation, we require ⟦·⟧ : Syn → Sem, the mapping of terms to their intended semantics, to be a prop morphism.

Typically, the syntactic prop Syn is freely generated from a monoidal signature Σ, i.e. a set of arrows g : m → n. In this case, we use the notation PΣ and Syn interchangeably. Morphisms of PΣ are terms of an (N, N)-sorted syntax, whose constants are elements of Σ and whose operations are the usual composition (−); (−) : Syn(n, m) × Syn(m, l) → Syn(n, l) and the monoidal product (−) ⊕ (−) : Syn(n<sub>1</sub>, m<sub>1</sub>) × Syn(n<sub>2</sub>, m<sub>2</sub>) → Syn(n<sub>1</sub> + n<sub>2</sub>, m<sub>1</sub> + m<sub>2</sub>), quotiented by the laws of SMCs. But this quotient is cumbersome and unintuitive to work with.

This is why we will prefer a different representation. With their two forms of composition, monoidal categories admit a natural two-dimensional graphical notation of string diagrams. The idea is that an arrow c : n → m of PΣ is better represented as a box with n ordered wires on the left and m on the right. We can compose these diagrams in two different ways: horizontally, by connecting the right wires of one diagram to the left wires of another, and vertically by juxtaposing two diagrams:

$$c ; d = [\text{diagram: } c \text{ and } d \text{ side by side, joined along their } m \text{ shared wires}] \qquad d\_1 \oplus d\_2 = [\text{diagram: } d\_1 \text{ drawn above } d\_2]$$

where a wire labelled n is syntactic sugar for a stack of n wires. The identity id<sub>1</sub> : 1 → 1 is denoted as a plain wire, the unit for ⊕, id<sub>0</sub> : 0 → 0, as the empty diagram, and when the category is symmetric, the symmetry σ<sub>1,1</sub> : 2 → 2 is denoted as a wire crossing. With this representation the laws of SMCs become diagrammatic tautologies.

Once we have defined ⟦·⟧ : Syn → Sem, it is natural to look for equations to reason about semantic equality directly on the diagrams themselves. Given a set of equations E, i.e., a set containing pairs of arrows of the same type, we write <sup>E</sup>= for the smallest congruence wrt the two composition operations ; and ⊕. We say that <sup>E</sup>= is sound if c <sup>E</sup>= d implies ⟦c⟧ = ⟦d⟧. It is moreover complete when the converse implication holds. We call a pair (Σ, E) a symmetric monoidal theory (SMT) and we can form the prop PΣ/E obtained by quotienting PΣ by <sup>E</sup>=. There is then a prop morphism q : PΣ → PΣ/E witnessing this quotient.

We may also wonder what the expressive power of our diagrammatic language is. In terms of props we look to characterise precisely the image Im(⟦·⟧) of the syntax via ⟦·⟧.

The situation for a sound and complete SMT is summarised in the commutative diagram below right.

Soundness simply means that ⟦·⟧ factors as s ◦ q through PΣ/E and completeness means that s is a faithful prop morphism.

Typically, our semantic prop Sem will be (a subcategory of) the category of sets and relations.

### Definition 1. Let K be a field. Rel<sub>K</sub> is the prop

– whose arrows n → m are relations R ⊆ K<sup>n</sup> × K<sup>m</sup>;

– with composition, for R : n → m and S : m → l, given by

$$R ; S = \left\{ (x, z) \mid \exists y \in \mathbb{K}^m.\ (x, y) \in R \land (y, z) \in S \right\};$$

– with monoidal product, for R<sub>1</sub> : n<sub>1</sub> → m<sub>1</sub> and R<sub>2</sub> : n<sub>2</sub> → m<sub>2</sub>, given by

$$R\_1 \oplus R\_2 = \left\{ \left( \begin{pmatrix} x\_1 \\ x\_2 \end{pmatrix}, \begin{pmatrix} y\_1 \\ y\_2 \end{pmatrix} \right) \mid (x\_1, y\_1) \in R\_1 \land (x\_2, y\_2) \in R\_2 \right\};$$

– with symmetry n + m → m + n the relation

$$\left\{ \left( \begin{pmatrix} x \\ y \end{pmatrix}, \begin{pmatrix} y \\ x \end{pmatrix} \right) \mid (x, y) \in \mathbb{K}^n \times \mathbb{K}^m \right\}.$$
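The operations of a prop of relations can be tried out concretely when the field is finite. The following is a minimal sketch, assuming the prime field Z/3Z as K so that every relation is a finite set; the helper names (`space`, `compose`, `tensor`, `symmetry`, `identity`) and the two example relations are hypothetical and chosen only for illustration.

```python
from itertools import product

P = 3  # illustrative assumption: K is the prime field Z/3Z, so K^n is finite


def space(n):
    """All vectors of K^n, represented as tuples of length n."""
    return list(product(range(P), repeat=n))


def compose(r, s):
    """Relational composition R ; S = {(x, z) | exists y. (x, y) in R and (y, z) in S}."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}


def tensor(r1, r2):
    """Monoidal product: concatenate the left vectors and the right vectors."""
    return {(x1 + x2, y1 + y2) for (x1, y1) in r1 for (x2, y2) in r2}


def symmetry(n, m):
    """The symmetry n + m -> m + n, swapping a block of n with a block of m."""
    return {(x + y, y + x) for x in space(n) for y in space(m)}


def identity(n):
    """The identity relation on K^n."""
    return {(x, x) for x in space(n)}


# Two example relations: multiplication by 2 (1 -> 1) and addition (2 -> 1).
double = {((x,), ((2 * x) % P,)) for x in range(P)}
add = {((x, y), ((x + y) % P,)) for x in range(P) for y in range(P)}
```

In this sketch, composing `double` with itself gives the identity (since 2 · 2 = 1 in Z/3Z), and precomposing `add` with `symmetry(1, 1)` leaves it unchanged, which reflects the commutativity of addition.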

### 2.2 Ordered Props and Symmetric Monoidal Inequality Theories

Our semantic prop, Rel<sub>K</sub>, carries additional structure that we wish to lift to the syntax: as subsets of K<sup>n</sup> × K<sup>m</sup>, relations n → m can be ordered by inclusion. The corresponding structure is that of an ordered prop, a prop enriched over the category of posets, whose composition and monoidal product are monotone maps.

If props can be presented by SMTs, ordered props can be presented by symmetric monoidal inequality theories (SMIT). Formally, the data of a SMIT is the same as that of a SMT: a signature Σ and a set I of pairs c, d : n → m of PΣ-arrows of the same type, that we now read as inequalities c ≤ d.

As for plain props, we can construct an ordered prop from a SMIT by building the free prop PΣ and passing to a quotient PΣ/I . First, we build a preorder on each homset by closing I under ⊕ and taking the reflexive and transitive closure of the resulting relation. Then, we obtain the free ordered prop PΣ/I by quotienting the resulting preorder by imposing anti-symmetry.

SMITs subsume SMTs, since every SMT can be presented as a SMIT, by splitting each equation into two inequalities. We will refer to both simply as theories and their defining inequalities as axioms. When referring to a sound and complete theory, we will also use the term axiomatisation, as is standard in the literature.

### 2.3 Graphical Polyhedral Algebra

We now assume that K is an ordered field, that is, a field equipped with a total order ≥ compatible with the field operations in the following sense: for all x, y, z ∈ K, i) if x ≥ y then x + z ≥ y + z, and ii) if x ≥ 0 and y ≥ 0 then xy ≥ 0.

Following [4], from the generators in (1), we define a prop, give it a semantics in Rel<sub>K</sub>, characterise the image of the semantic functor, and describe an axiomatisation for the specified semantics.

– For the signature Σ<sup>+</sup><sub>≥</sub>, which consists of the generators in (1) with the scalars r ranging over K, define ⟦·⟧ : PΣ<sup>+</sup><sub>≥</sub> → Rel<sub>K</sub> to be the prop morphism given by

$$\begin{aligned}
[\![ \text{black node } 1 \to 2 ]\!] &:= \left\{ \left( x, \begin{pmatrix} x \\ x \end{pmatrix} \right) \mid x \in \mathbb{K} \right\} & [\![ \text{black node } 1 \to 0 ]\!] &:= \{ (x, \bullet) \mid x \in \mathbb{K} \} \\
[\![ \text{black node } 2 \to 1 ]\!] &:= \left\{ \left( \begin{pmatrix} x \\ x \end{pmatrix}, x \right) \mid x \in \mathbb{K} \right\} & [\![ \text{black node } 0 \to 1 ]\!] &:= \{ (\bullet, x) \mid x \in \mathbb{K} \} \\
[\![ \text{white node } 1 \to 2 ]\!] &:= \left\{ \left( x + y, \begin{pmatrix} x \\ y \end{pmatrix} \right) \mid x, y \in \mathbb{K} \right\} & [\![ \text{white node } 1 \to 0 ]\!] &:= \{ (0, \bullet) \} \\
[\![ \text{white node } 2 \to 1 ]\!] &:= \left\{ \left( \begin{pmatrix} x \\ y \end{pmatrix}, x + y \right) \mid x, y \in \mathbb{K} \right\} & [\![ \text{white node } 0 \to 1 ]\!] &:= \{ (\bullet, 0) \} \\
[\![ \text{scalar } k ]\!] &:= \{ (x, k \cdot x) \mid x \in \mathbb{K} \} \text{ for } k \in \mathbb{K} & [\![ \geq ]\!] &:= \{ (x, y) \in \mathbb{K} \times \mathbb{K} \mid x \geq y \} \\
[\![ \text{constant } 1 ]\!] &:= \{ (\bullet, 1) \}
\end{aligned}$$

– The image of PΣ<sup>+</sup><sub>≥</sub> by ⟦·⟧ is the prop whose arrows n → m are finitely generated polyhedra of K<sup>n</sup> × K<sup>m</sup>, i.e., subsets of the form

$$\left\{ (x, y) \in \mathbb{K}^n \times \mathbb{K}^m \mid A \begin{pmatrix} x \\ y \end{pmatrix} + b \ge 0 \right\}$$

for some matrix A and some vector b (see [4] for more details, in particular the appendix for the proof that these form a prop).

– IH<sup>+</sup><sub>≥</sub> provides an axiomatisation of polyhedral relations [4, Corollary 25]; it can be found in the first four blocks of Fig. 1.
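To make the description of polyhedral relations as sets of the form A(x; y) + b ≥ 0 more tangible, here is a small sketch of a membership test. It assumes the rationals (via Python's `Fraction`) as the ordered field K; the function `in_polyhedron` and the three encodings `GEQ`, `TIMES2` and `ONE` are hypothetical names introduced for this illustration and are not taken from the paper.

```python
from fractions import Fraction as F


def in_polyhedron(A, b, x, y):
    """Check A (x; y) + b >= 0 componentwise, over the ordered field Q."""
    v = list(x) + list(y)  # stack the left vector on top of the right vector
    return all(
        sum(F(a) * F(c) for a, c in zip(row, v)) + F(bi) >= 0
        for row, bi in zip(A, b)
    )


# The order generator as a polyhedron of K x K: x - y >= 0.
GEQ = ([[1, -1]], [0])

# The scalar 2 as a polyhedron: y = 2x, encoded as two opposite inequalities.
TIMES2 = ([[-2, 1], [2, -1]], [0, 0])

# The constant 1 (type 0 -> 1): y = 1, again as two opposite inequalities.
ONE = ([[1], [-1]], [-1, 1])
```

For instance, `in_polyhedron(*GEQ, [3], [2])` holds because 3 ≥ 2, while `in_polyhedron(*ONE, [], [2])` fails because 2 ≠ 1. Equalities are expressed by pairing an inequality with its negation, which is why affine relations are a special case of polyhedral ones.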

Example 1 (Duality). Two diagrams play a special role in this paper: the half turns and , called cup and cap, respectively. Using these and , we can build cups and caps for any number <sup>n</sup> of wires: <sup>n</sup> and n .

They allow us to associate a dual d op : n → m to any diagram d : m → n by turning its left ports into right ports and vice-versa:

$$d^{\mathrm{op}} := [\text{diagram: } d \text{ with its left and right wires bent around using caps and cups, giving a diagram } n \to m] \tag{2}$$

Correspondingly, ⟦d<sup>op</sup>⟧ is the opposite relation, i.e. ⟦d<sup>op</sup>⟧ = {(y, x) | (x, y) ∈ ⟦d⟧}. We will use a suggestive mirror notation to denote the dual of a given generator: the mirror image of the scalar r denotes (r)<sup>op</sup>, likewise for the other generators, and in particular ≤ := ≥<sup>op</sup>.

### 3 Symmetric Monoidal Semi-Lattice Theories

There are several routes to describe piecewise-linear subsets of K<sup>n</sup>. In this paper we choose to equip our syntax with a primitive operation of join, in order to describe piecewise-linear sets as (finite) unions of polyhedra. In the same way that we moved from simple props to ordered props in Section 2.2, we now move to the setting of semi-lattice-enriched props.

A ∪-prop is a prop enriched over the monoidal category of semi-lattices (partially-ordered sets with least upper bounds for any finite subset) and join-preserving maps, with the Cartesian product as monoidal product. In other words a ∪-prop is a prop whose homsets are semi-lattices, with composition and monoidal product themselves join-preserving. The paradigmatic example is Rel<sub>K</sub> which is a ∪-prop with the union of relations as join.

As we would like to incorporate binary joins into our syntax, we need a new description of the free ∪-prop P∪Σ over a given signature Σ.


We now define a corresponding notion of theory for ∪-props. A symmetric monoidal (semi-)lattice theory (SMLT) is the data of a signature Σ and a set E of equations: formally the latter is a set of pairs (C, D) of arrows C, D : n → m from P∪Σ. We will write the elements of E as equations of the form ⋃<sub>c∈C</sub> c = ⋃<sub>d∈D</sub> d. We now explain how to define a ∪-prop P∪Σ/E from the data of an SMLT (Σ, E). As for SMTs, we can build the smallest congruence <sup>E</sup>= wrt ; and ⊕, which equates the pairs in E. Then define P∪Σ/E to be the quotient of P∪Σ by <sup>E</sup>=. That this is a well-defined ∪-prop follows again from the distributivity of the composition and monoidal product over unions.

Note that the semi-lattice structure allows us to define an order over the homsets of any ∪-prop, making it into an ordered prop: we write C ⊆ D as a shorthand for C ∪ D = D. We will also use C <sup>E</sup>⊆ D for C ∪ D <sup>E</sup>= D in P∪Σ/E. (We prefer this notation to avoid the confusion with the order ≥ on K itself.)

Remark 1 (Reasoning in ∪-props). The reader familiar with string diagrams and equational reasoning might be surprised by certain features of derivations that combine diagrammatic and traditional syntax (joins, in this case). We would like to clarify one particular point. When we want to use an equality of the form d = d<sub>1</sub> ∪ d<sub>2</sub> inside a term of the form c<sub>1</sub> ∪ c<sub>2</sub> ∪ c, we need to identify a linear context C[−] (i.e. the hole occurs exactly once in C) common to c<sub>1</sub> and c<sub>2</sub> such that c<sub>1</sub> = C[d<sub>1</sub>] and c<sub>2</sub> = C[d<sub>2</sub>]. Then we are allowed to use the fact that C[d] = C[d<sub>1</sub>] ∪ C[d<sub>2</sub>] to conclude that c<sub>1</sub> ∪ c<sub>2</sub> ∪ c = C[d] ∪ c. An example of this form of reasoning can be found in the proof of Lemma 2, which we reproduce here: we apply the equality (total) = ≤ ∪ ≥ in

Note that, to clarify the common context to the reader, we will often use the intermediate notation C[d<sub>1</sub> ∪ d<sub>2</sub>], as we did in the first step above.

### 4 The Theory of Piecewise-Linear Relations

### 4.1 Syntax and Semantics

For piecewise-linear relations we retain the same signature Σ<sup>+</sup><sub>≥</sub> and consider P∪(Σ<sup>+</sup><sub>≥</sub>), the free ∪-prop over it. As we saw, its morphisms are nonempty finite sets of diagrams of PΣ<sup>+</sup><sub>≥</sub>. This is our syntax.

On the semantic side, we now need to extend the functor ⟦·⟧ to have P∪(Σ<sup>+</sup><sub>≥</sub>) as domain, retaining Rel<sub>K</sub> as codomain. Concretely, since we already know how to assign a relation to each diagram of PΣ<sup>+</sup><sub>≥</sub>, we only need to specify how to interpret finite sets of such diagrams: unsurprisingly, we set

$$[\![ \{ d\_1, \ldots, d\_n \} ]\!] := [\![ d\_1 ]\!] \cup \cdots \cup [\![ d\_n ]\!]$$

This is join-preserving by construction, and remains monoidal and functorial.
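The interpretation of a finite set of diagrams as a union can be sketched as follows. This is only an illustration under stated assumptions: K is taken to be Q (via `Fraction`), each polyhedron is given directly by a hypothetical pair `(A, b)` meaning A·p + b ≥ 0 rather than by a diagram, and the example relation (the graph of the absolute value) is our own choice, not one from the paper.

```python
from fractions import Fraction as F


def in_polyhedron(poly, point):
    """poly = (A, b); check A . point + b >= 0 componentwise over Q."""
    A, b = poly
    return all(
        sum(F(a) * F(c) for a, c in zip(row, point)) + F(bi) >= 0
        for row, bi in zip(A, b)
    )


def in_pl_relation(polys, point):
    """Semantics of a finite set of polyhedra: the union of their semantics."""
    return any(in_polyhedron(p, point) for p in polys)


# The graph of y = |x| in K x K, with points written (x, y).
RIGHT = ([[1, 0], [-1, 1], [1, -1]], [0, 0, 0])   # x >= 0 and y = x
LEFT = ([[-1, 0], [1, 1], [-1, -1]], [0, 0, 0])   # x <= 0 and y = -x
ABS = [RIGHT, LEFT]
```

The points (−1, 1) and (1, 1) both belong to `ABS`, but their midpoint (0, 1) does not. The relation is therefore not convex, so it cannot be a single polyhedron, although it is a union of two.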

By definition, we call piecewise-linear (pl) any relation in the image of this functor, i.e., any relation that is a finite union of polyhedral relations. As far as we know, this is the first time that this notion appears in print. However, it does capture our intuitive notion of piecewise-linearity as submanifolds of K<sup>n</sup> that can be subdivided into linear subspaces.

### 4.2 Equational Theory

IH<sub>PL</sub>, the SMLT of pl relations, is presented in Fig. 1. The first block is the theory of matrices/linear maps; the second block, IH, axiomatises all linear relations; the third block axiomatises the behaviour of the order ≥; the fourth deals with the affine fragment of the theory, axiomatising the behaviour of the constant generator. Taken together, those four blocks constitute IH<sup>+</sup><sub>≥</sub>, an axiomatisation of polyhedral relations; we refer the reader to [4] for more details on this fragment.

The key addition of IH<sub>PL</sub> is the last block, containing the axiom of totality, which states that any element of K belongs to the non-negative or to the non-positive fragment of K. Remarkably, this simple axiom is the only one we need to add to IH<sup>+</sup><sub>≥</sub> to obtain a complete theory for pl relations. Its soundness is simply a consequence of the definition of an ordered field: the order is assumed to be total in the sense that, for any x, y ∈ K we have x ≤ y or y ≤ x. Take y = 0 to recover the last axiom of IH<sub>PL</sub>.

Remark 2. As a consequence of the Frobenius laws (•-fr) and of (co)unitality (•-un)-(•-coun), the diagrams <sup>n</sup> and n satisfy

$$[\text{diagram: the two ``snake'' equations, each stating that a cap followed by a cup on } n \text{ wires equals, in } \mathbb{IH}\_{PL}\text{, the identity on } n \text{ wires}] \tag{3}$$

for any n, the defining equations of a compact closed category. Intuitively, these allow us to forget the direction of wires. In addition, compactness implies the following proposition.

#### Proposition 1. C <sup>IH<sub>PL</sub></sup>⊆ D iff C<sup>op</sup> <sup>IH<sub>PL</sub></sup>⊆ D<sup>op</sup>.

Another important property of compact closed categories, which we will exploit to simplify the completeness proof, is stated in the following proposition. It is an immediate consequence of (3).

Proposition 2. Given C, D : m → n,

$$C \stackrel{\mathbb{IH}\_{PL}}{\subseteq} D \quad \text{iff} \quad [\,C \text{ with its } n \text{ right wires bent to the left}\,] \stackrel{\mathbb{IH}\_{PL}}{\subseteq} [\,D \text{ with its } n \text{ right wires bent to the left}\,].$$

### 4.3 Completeness Theorem

As we stated above, the axioms in Fig. 1 form a complete theory for pl relations. We will prove that claim in this section. Without loss of generality, using Proposition 2, we restrict to n → 0 diagrams.

We start by defining appropriate normal forms for polyhedral and pl relations, and then show that every diagram can be reduced to normal form.

Fig. 1. Axioms of GPLA.

Definition 2. We call hyperplane a nonzero affine map H : n → 1 which we write <sup>H</sup> . A given hyperplane H defines two half-spaces <sup>H</sup> ≥ and <sup>H</sup> ≤ , as well as an affine subspace <sup>H</sup> . Since inequality is not strict, the half-spaces include the affine subspace.

In [4, Theorem 14], polyhedral relations have a normal form given by a set of inequations of the form A<sub>i</sub>x + b<sub>i</sub> ≥ 0. In other words, the normal form is given by an intersection of half-spaces. For our purposes we define a related but slightly different normal form.

Definition 3. A PΣ<sup>+</sup><sub>≥</sub>-diagram d : n → 0 is in polyhedral normal form if there are hyperplanes H<sub>i</sub> and diagrams d<sub>i</sub> ∈ { , ≥ , ≤ } such that:

Where the d<sub>i</sub> are minimal in the following sense: fixing the set of hyperplanes H<sub>i</sub>, we consider all choices of d<sub>i</sub> that give d when composed as above. We then require the d<sub>i</sub> in the normal form to be minimal (wrt the order of IH<sup>+</sup><sub>≥</sub>) among those. We call the set of the d<sub>i</sub> a valuation for d relative to the hyperplanes H<sub>i</sub>.

Definition 4. We say that a morphism D of P∪Σ<sup>+</sup><sub>≥</sub> is in pl normal form if it is written as a non-empty union of diagrams d<sub>i</sub> each in the language of PΣ<sup>+</sup><sub>≥</sub> (i.e. without unions), the d<sub>i</sub> are in the normal form defined in Definition 3, and all the normal forms use the same set of hyperplanes.

Lemma 1. Every d : n → 0 in PΣ<sup>+</sup><sub>≥</sub> has a polyhedral normal form.

Proof. The normal form from [4, Theorem 14] already has the right shape. We only need to find a minimal valuation. Observe that the intersection of two valuations for d is again a valuation for d: let v and v′ be two valuations for d relative to the hyperplanes H<sub>i</sub>. If we write A := (H<sub>0</sub>, …, H<sub>k</sub>) then

Therefore v ∩ v′ is again a valuation for d. Since there are finitely many valuations, we construct the minimal one by intersecting them all. □
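The intersection argument can be checked on a toy instance. The sketch below is a semantic illustration only, under several assumptions of ours: K is Q, a valuation is encoded as a tuple of constraint tags `"=0"`, `">=0"`, `"<=0"` (one per hyperplane), and the hyperplanes H<sub>0</sub>(x) = x and H<sub>1</sub>(x) = −x on a single wire are an example we chose. All names are hypothetical.

```python
from fractions import Fraction as F

ZERO, GEQ, LEQ = "=0", ">=0", "<=0"


def allows(d, value):
    """Does the constraint d accept the value H_i(x)?"""
    return {ZERO: value == 0, GEQ: value >= 0, LEQ: value <= 0}[d]


def meet(d1, d2):
    """Intersection of two constraints: equal ones are kept, distinct ones leave only 0."""
    return d1 if d1 == d2 else ZERO


def satisfies(hyperplanes, valuation, x):
    """Is x in the set described by the hyperplanes and the valuation?"""
    return all(allows(d, h(x)) for h, d in zip(hyperplanes, valuation))


# Example: the single point {0} in Q, relative to H_0(x) = x and H_1(x) = -x.
HS = [lambda x: x, lambda x: -x]
V1 = (GEQ, GEQ)    # x >= 0 and -x >= 0
V2 = (ZERO, GEQ)   # x = 0 and -x >= 0
V12 = tuple(meet(a, b) for a, b in zip(V1, V2))

SAMPLES = [F(n, 2) for n in range(-6, 7)]
```

Both `V1` and `V2` describe the set {0}, and so does their componentwise intersection `V12`. Intersecting all valuations of this example would give `("=0", "=0")`, the minimal one.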

Lemma 2. If a morphism D of P∪Σ<sup>+</sup><sub>≥</sub> is in pl normal form and H is a hyperplane, there exists C in pl normal form such that D <sup>IH<sub>PL</sub></sup>= C and Hyperplanes(C) = Hyperplanes(D) ∪ {H}.

Proof. We write the normal form of D as D = ⋃<sub>i</sub> d<sub>i</sub>. Define C to be the following morphism:

$$C := \bigcup\_i \Big( [\,d\_i \text{ with the extra constraint } H \geq 0\,] \;\cup\; [\,d\_i \text{ with the extra constraint } H \leq 0\,] \Big)$$

We transform C into C′ by reducing all the terms in the union to polyhedral normal form. This puts C′ in pl normal form. Since we add the same hyperplane H to all d<sub>i</sub>, Hyperplanes(C′) = Hyperplanes(D) ∪ {H}.

Moreover:

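$$C' \;=\; C \;=\; \bigcup\_i \big[\, d\_i \text{ with } H \text{ constrained by } {\geq} \cup {\leq} \,\big] \;\stackrel{\text{(total)}}{=}\; \big[\, D \text{ with } H \text{ unconstrained} \,\big] \;=\; D$$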
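Semantically, Lemma 2 says that cutting every polyhedron of a union along an extra hyperplane does not change the set being described. The following sketch checks this on sample points. It assumes K = Q, represents a polyhedron as a list of rows (a<sub>1</sub>, …, a<sub>n</sub>, b) meaning a·x + b ≥ 0, and uses a unit square and the diagonal hyperplane as an example of our own; the names `holds`, `member` and `split` are hypothetical.

```python
from fractions import Fraction as F
from itertools import product


def holds(poly, point):
    """poly is a list of rows (a_1, ..., a_n, b), each meaning a . x + b >= 0."""
    return all(
        sum(F(a) * p for a, p in zip(row, point)) + F(row[-1]) >= 0
        for row in poly
    )


def member(polys, point):
    """Membership in a finite union of polyhedra."""
    return any(holds(p, point) for p in polys)


def split(polys, h):
    """Refine every polyhedron by the hyperplane h = (a_1, ..., a_n, b):
    one copy is constrained by h >= 0, the other by h <= 0 (that is, -h >= 0)."""
    neg = tuple(-c for c in h)
    return [p + [tuple(h)] for p in polys] + [p + [neg] for p in polys]


# D: the unit square 0 <= x, y <= 1 in Q^2, as a union with a single member.
D = [[(1, 0, 0), (-1, 0, 1), (0, 1, 0), (0, -1, 1)]]
H = (1, -1, 0)  # the hyperplane x - y = 0
C = split(D, H)

# Sample points on a grid of step 1/4 covering the square and its surroundings.
GRID = [(F(i, 4), F(j, 4)) for i, j in product(range(-2, 7), repeat=2)]
```

Here `C` consists of the two triangles on either side of the diagonal, and `member(C, p)` agrees with `member(D, p)` on every grid point. This agreement is an instance of totality: every point satisfies x − y ≥ 0 or x − y ≤ 0.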

Theorem 1. Every morphism of P∪Σ<sup>+</sup><sub>≥</sub> has a pl normal form.

Proof. Let D be an n → 0 morphism of P∪Σ<sup>+</sup><sub>≥</sub>. First, using distributivity of the union over sequential and parallel composition, we move all the uses of the union to the top-level.

Thus D is written ⋃<sub>i</sub> d<sub>i</sub> where each d<sub>i</sub> doesn't use the union, i.e. is in the language of PΣ<sup>+</sup><sub>≥</sub>. We then rewrite each d<sub>i</sub> into polyhedral normal form using Lemma 1.

Each d<sub>i</sub> is thus also individually in pl normal form, so we can use Lemma 2 to add to each d<sub>i</sub> all the hyperplanes of the other d<sub>j</sub>. For each i we get a new diagram d′<sub>i</sub> <sup>IH<sub>PL</sub></sup>= d<sub>i</sub> in pl normal form, and all the d′<sub>i</sub> use the same set of hyperplanes. So ⋃<sub>i</sub> d′<sub>i</sub> is a pl normal form for D. □

Before we can prove completeness, we need a final notion: the interior of a polyhedral relation, which is the set of its points that don't touch any of its faces.

Definition 5. Let d be a morphism in polyhedral normal form. We define Int(d) to be the set of points x ∈ ⟦d⟧ for which H<sub>i</sub>(x) ≠ 0 whenever d<sub>i</sub> is ≥ or ≤. In other words, H<sub>i</sub>(x) is nonzero for all hyperplanes where it can be nonzero without x leaving ⟦d⟧.

Note that we define Int only on polyhedral normal form diagrams. Int appears to be representation-independent at least when $K = \mathbb{R}$, but we won't try to prove it in the general case as we don't need this here.

Remark 3. This is not the usual topological notion of interior. In particular, this notion is independent of the dimension of the surrounding space: a polyhedron of dimension $0 < k < n$ within $\mathbb{R}^n$ has an empty topological interior but a nonempty Int, as we'll see in the next theorem. $\mathrm{Int}(d)$ instead coincides with the interior of $d$ with the topology of the smallest containing affine space.

Lemma 3. Let $d$ be a diagram in polyhedral normal form. If $\llbracket d \rrbracket$ is nonempty, then $\mathrm{Int}(d)$ is nonempty.

Proof. First, write d in polyhedral normal form:

[String diagram: $d$ drawn as the hyperplanes $H_0, \dots, H_k$, each followed by its constraint $d_0, \dots, d_k$.]

Up to negating some of the $H_i$, we can assume that none of the $d_i$ are $\leq$. If every $d_i$ is the equality constraint, then by definition $\mathrm{Int}(d) = \llbracket d \rrbracket$, which is nonempty, so we're done. Assume then that $d_i = {\geq}$ for at least some $i$. For each such $i$, by minimality of the $d_i$ in the normal form there must be an $x_i \in \llbracket d \rrbracket$ such that $H_i(x_i) > 0$. We pick such an $x_i$ for each such $i$, and define $x := \frac{1}{p} \sum_i x_i$ to be their average, where $p$ is the number of points picked. By convexity, $x \in \llbracket d \rrbracket$. $H_i$ is an affine map, hence is concave, thus if we had picked an $x_i$ then $H_i(x) \geq \frac{1}{p} \sum_j H_i(x_j) \geq \frac{1}{p} H_i(x_i) > 0$, where the second inequality holds because $H_i(x_j) \geq 0$ for every $x_j \in \llbracket d \rrbracket$. Then for each $i$ either $d_i$ is the equality constraint or $H_i(x) > 0$, hence $x \in \mathrm{Int}(d)$. □
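The averaging step of this proof can be illustrated on a concrete case. The following Python sketch is only an illustration under stated assumptions: the triangle, the list `H` of affine maps and all helper names are hypothetical and do not come from the paper. It picks one witness per hyperplane and checks that their average lies strictly on the positive side of every hyperplane.

```python
from fractions import Fraction

# Hypothetical example: the triangle x >= 0, y >= 0, 1 - x - y >= 0.
# Each H_i is an affine map; the polyhedron is {p | H_i(p) >= 0 for all i}.
H = [
    lambda p: p[0],
    lambda p: p[1],
    lambda p: 1 - p[0] - p[1],
]


def in_polyhedron(p):
    """True if p satisfies every constraint H_i(p) >= 0."""
    return all(h(p) >= 0 for h in H)


def average(points):
    """Exact coordinate-wise average of a list of points in the plane."""
    n = len(points)
    return tuple(sum(Fraction(p[k]) for p in points) / n for k in range(2))


# One witness x_i per hyperplane, chosen so that H_i(x_i) > 0.
# Each witness is a vertex, so it lies on some other face of the triangle.
witnesses = [(1, 0), (0, 1), (0, 0)]

# The average is strictly positive for every H_i, i.e. it avoids all faces.
x = average(witnesses)
```

Here no single witness is in the interior, since each vertex sits on two faces, but their average is.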

Theorem 2 (Completeness). $\llbracket D \rrbracket \subseteq \llbracket C \rrbracket \implies D \overset{\mathsf{IH}_{PL}}{\subseteq} C$

Proof. Using Proposition 2 we can without loss of generality assume that $D$ and $C$ have $n$ inputs and 0 outputs. Using Theorem 1, we reduce $D$ and $C$ into pl normal form. Using Lemma 2, we add each other's hyperplanes to $D$ and $C$ so that they both use the exact same set. So $D = \bigcup_i d_i$ and $C = \bigcup_i c_i$, where the $d_i$ and $c_i$ are in polyhedral normal form and use the same set of hyperplanes $\{H_i\}_i$. Pick one of the $d_i$ in $D$.

If $d_i$ is the empty polyhedron, we have $\llbracket d_i \rrbracket = \emptyset \subseteq \llbracket c_0 \rrbracket$, so by completeness of $\mathsf{IH}^{+}_{\geq}$ we get $d_i \overset{\mathsf{IH}^{+}_{\geq}}{\subseteq} c_0$. Thus $d_i \overset{\mathsf{IH}_{PL}}{\subseteq} c_0 \overset{\mathsf{IH}_{PL}}{\subseteq} C$.

Otherwise $d_i$ is nonempty, and using Lemma 3 we pick $x \in \mathrm{Int}(d_i)$. Then:

$$x \in \mathrm{Int}(d_i) \subseteq \llbracket d_i \rrbracket \subseteq \llbracket D \rrbracket \subseteq \llbracket C \rrbracket = \Bigl\llbracket \bigcup_j c_j \Bigr\rrbracket = \bigcup_j \llbracket c_j \rrbracket$$

Thus there is a $j$ such that $x \in \llbracket c_j \rrbracket$. Now pick a $k$. If $d_{ik}$ is the equality constraint, then $d_{ik} \overset{\mathsf{IH}_{PL}}{\subseteq} c_{jk}$ regardless of $c_{jk}$. If $d_{ik} = {\geq}$, then by definition of $\mathrm{Int}(d_i)$, we have $H_k(x) > 0$. Since moreover $x \in \llbracket c_j \rrbracket$, $c_{jk}$ must be $\geq$. If $d_{ik} = {\leq}$, similarly $c_{jk}$ must be $\leq$. In all three cases, $d_{ik} \overset{\mathsf{IH}_{PL}}{\subseteq} c_{jk}$. This is the case for every $k$, so:

[String-diagram chain: $d_i$, expanded as the hyperplanes $H_k$ followed by the constraints $d_{ik}$, is included in the same hyperplanes followed by the constraints $c_{jk}$, which equals $c_j$, which is in turn included in $C$.]

Finally, since we have $d_i \overset{\mathsf{IH}_{PL}}{\subseteq} C$ for all $i$, we derive $D = \bigcup_i d_i \overset{\mathsf{IH}_{PL}}{\subseteq} C$. □

### 5 Generating Piecewise-Linear Relations

Piecewise-linear subsets of vector spaces give us a rather wide semantic space to explore. One might suspect that there exist useful structured relations that live strictly between the linear and piecewise-linear worlds.

Formally, we're interested in finding sub-props of $\mathsf{Rel}_K$ that contain not only linear or polyhedral relations, but some selected non-convex relations that would be useful for particular applications. It turns out that for many sensible choices, the resulting image will coincide with pl relations, a somewhat surprising fact. Note that we are interested in generating sub-props of $\mathsf{Rel}_K$ here, not ∪-props, since the ∪-prop generated by the image of $\mathsf{P}_{\cup}\Sigma^{+}_{\geq}$ under $\llbracket \cdot \rrbracket$ already contains all pl relations.

We will go through a few natural choices, each time defining them as a term of $\mathsf{P}_{\cup}\Sigma^{+}_{\geq}$, a shortcut which makes reasoning about them much easier than with their set-theoretic semantics. Of course, their semantics in $\mathsf{Rel}_K$ can be recovered via $\llbracket \cdot \rrbracket$.

### 5.1 The n-Fold Union Generators

We first show that the main difference between polyhedral and pl relations, namely the unions, can be bridged. Indeed, it is not obvious that we can build arbitrary unions of diagrams without having access to the syntax of an SMLT. For this we introduce a family of diagrams we call the $n$-fold union generators, defined for a given $n$ as:

These generators suffice to reproduce the behaviour of the syntactic union:

Theorem 3. The image of the free prop generated by $\Sigma^{+}_{\geq}$ and the $n$-fold union generators for all $n$ is the prop of pl relations.

Proof. If $C$ and $D$ are non-empty $n \to 0$ diagrams,

[String-diagram equation: a composite of the union generator with $C$ and $D$ equals $C \cup D$.]

Since every pl relation can be written as a finite union of diagrams in $\mathsf{P}\Sigma^{+}_{\geq}$, and we can easily avoid diagrams denoting the empty relation, this generates all of pl relations. □

This means that we didn't formally need to introduce the notion of an SMLT after all: we could have defined an equivalent SMIT by adding these generators. However, this is for most purposes a much less convenient syntax, and the corresponding equational theory would be more difficult to calculate with. This is also the case for the examples that follow.

### 5.2 The Simplest Non-Convex Diagram

The following is one of the simplest diagrams that captures a non-convex relation:

[String diagram defining the generator $+$.]

It is named after its semantics: the union of the x and y axes in the plane, corresponding to the simple equation x = 0 ∨ y = 0. Despite its simplicity, it suffices to generate all of pl relations.
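The non-convexity of this relation is easy to check on its semantics. The sketch below is an illustration only: the function names `in_cross` and `midpoint` are hypothetical, and the code works directly with the equation x = 0 ∨ y = 0 rather than with diagrams.

```python
def in_cross(x, y):
    """Membership in the union of the two axes: x = 0 or y = 0."""
    return x == 0 or y == 0


def midpoint(p, q):
    """Midpoint of the segment between two points of the plane."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


# Two points of the relation, one on each axis.
p, q = (1, 0), (0, 1)

# Their midpoint (0.5, 0.5) lies on neither axis, so the set is not convex.
m = midpoint(p, q)
```

A convex set would contain the whole segment between `p` and `q`, so a single missing midpoint is enough to rule convexity out.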

Theorem 4. The image of the free prop generated by $\Sigma^{+}_{\geq}$ and $+$ is the prop of pl relations.

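This diagram has the interesting property of duplicating black and white units:

[String-diagram equations defining $\mathrm{dup}$ for the black and the white unit.]

We can chain it to build $\mathrm{dup}_{n+1}$ from $\mathrm{dup}_n$ and $\mathrm{dup}$, for any $n$. Then, we let $+_{nn}$ be the diagram built from $\mathrm{dup}_n$, $\mathrm{dup}_n^{op}$ and $+$, which equals a union of two diagrams on $n$ wires.

This allows us to build:

[String-diagram derivation rewriting a composite of $+_{nn}$ with $n$-wire diagrams as a union.] □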

### 5.3 The Semantics of a Diode

Most basic electrical circuit components can be modelled with an affine semantics. The first exception is the (ideal) diode: the idealised current-voltage semantics across a diode is that the current can be negative and the voltage difference positive but not both at the same time.

[Figure: the allowed (current $I$, voltage difference $U$) pairs of an ideal diode.]

On a graph, the allowed (current, voltage difference) pairs are depicted above. Not only is this not affine, it is not even convex. The corresponding diagram, ≤ ∪ ≤ , is outside of both affine and polyhedral algebra.

We will see how to model electrical circuits with diodes in more detail in the next section. We will focus here on the following fact: adding a generator with this semantics is once again enough to recover all pl relations. In fact we can even build the ≥ relation from the diode, so we can start from affine algebra (without requiring the generality of polyhedral algebra).

For convenience, we define a new generator whose semantics is the mirror image of the diode's graph:

[String diagram: $L$ is defined as the union of a diagram built from $\geq$ and a diagram built from $\leq$.]

Theorem 5. Recall that $\Sigma^{+}$ is $\Sigma^{+}_{\geq}$ without $\geq$. The image of the free prop generated by $\Sigma^{+}$ and $L$ is the prop of pl relations.

Proof. First, we can construct the ≥ generator from L:

[String-diagram derivation: a composite involving $L$ equals a union of a $\geq$ term and a $\leq$ term, which simplifies to $\geq$.]

So we generate all polyhedral relations. Then we can also recover the + generator from the previous section, which is enough to generate all of pl relations:

### 5.4 Alternative Generators: max, ReLu and abs

Three of the most basic piecewise-linear functions one might come across are abs, max and ReLu. We define them diagrammatically as follows:


While the reader will certainly be familiar with the first two, ReLu has acquired significant fame as one of the basic building blocks of neural networks. In fact, all neural networks whose activation function is ReLu can be represented in GPLA. This opens up the exciting possibility of applying equational reasoning to neural networks, a possibility that we leave for future work.

Once again, adding any one of them to the syntax for affine algebra suffices to construct any pl relation.

Theorem 6. The image of the free prop generated by $\Sigma^{+}$ and any of max, abs or ReLu is the prop of pl relations.

Proof. First, we notice that the three functions are inter-definable. abs and ReLu were already defined in terms of max, and we can complete the cycle:

$$\max(x, y) = x + \max(0, y - x) = x + \operatorname{ReLu}(y - x)$$

$$\operatorname{ReLu}(x) = \max(0, x) = (x + \operatorname{abs}(x)) / 2$$

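So we only need to show the result for one of them. Let's pick max. We recover $L$, which we know suffices by Theorem 5. First, max decomposes as a union of two diagrams:

[String-diagram equation: max as a union of two diagrams, each containing $\leq$.]

Thus:

[String-diagram derivation: max composed with two $-1$ scalars equals a union of a $\geq$ diagram and a $\leq$ diagram, which is $L$.] □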
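The inter-definability identities used at the start of this proof can be checked on sample values. The following Python sketch is a sanity check, not a proof: the function names are hypothetical, and `abs_via_max` assumes the usual definition abs(x) = max(x, −x), since the paper's diagrammatic definitions are not reproduced in this text.

```python
from fractions import Fraction
from itertools import product


def relu(x):
    """ReLu defined from max, as in the text: ReLu(x) = max(0, x)."""
    return max(0, x)


def abs_via_max(x):
    """Assumed definition of abs from max: abs(x) = max(x, -x)."""
    return max(x, -x)


def max_via_relu(x, y):
    """First identity: max(x, y) = x + ReLu(y - x)."""
    return x + relu(y - x)


def relu_via_abs(x):
    """Second identity: ReLu(x) = (x + abs(x)) / 2."""
    return (x + abs(x)) / 2


# Exact rational sample points in [-3, 3] with step 1/2.
SAMPLES = [Fraction(n, 2) for n in range(-6, 7)]


def identities_hold():
    """Check all three identities on every sample point or pair."""
    unary = all(
        abs_via_max(x) == abs(x) and relu_via_abs(x) == relu(x)
        for x in SAMPLES
    )
    binary = all(
        max_via_relu(x, y) == max(x, y)
        for x, y in product(SAMPLES, repeat=2)
    )
    return unary and binary
```

Exact rationals are used instead of floats so that the division by 2 introduces no rounding.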

Remark 4. It is standard that max together with linear maps generates all continuous pl functions. Our result can be seen as a generalization of this fact to the relational setting.

### 5.5 Conclusion

These examples justify the generality of pl relations: they constitute the minimal extension of polyhedral algebra (and in some cases affine algebra) that can express any of the very useful relations above. This is interesting because pl relations form a nearly universal domain: they can approximate any smooth manifold over a bounded domain.

Despite our compelling examples, there could still be interesting props between polyhedral and pl relations. In particular, determining the prop generated by $\Sigma^{+}_{\geq}$ together with ∪ is currently an open problem.

### 6 Case Study: Electronic Circuits

To illustrate how one would use this theory in a concrete case, we turn to the study of electronic circuits. We build on the work done in [3]. The syntax mimics the usual circuits drawn by electrical engineers, by generating a free two-colored prop from basic elements and wires. The blue wires are electrical wires, and the black wires carry information; for details see [3].

The corresponding physical model imposes constraints between two quantities: current and voltage. To express this, we map an electrical wire into two GPLA wires, the top one for voltage and the bottom one for current. We then give to each generator a semantics in GPLA that expresses the relevant physical equations. For example:

[String diagrams: the GPLA semantics of the basic circuit generators.]

The core of this approach is the fact that composition of constraints in GPLA gives the behaviour of the corresponding composite electrical circuit. We can thus define the semantics of a whole circuit compositionally, and get the physically expected result.

So far this follows exactly [3]. Our contribution is the ability to express the behaviour of diodes:

Remark 5. We cannot include capacitors and inductors, because they require semantics in $\mathsf{IH}^{+}_{\mathbb{R}(x)}$, and $\mathbb{R}(x)$ cannot be ordered in a way that would be consistent with the physics. Finding diagrammatic semantics that can accommodate both capacitors and diodes is an important open problem.

This extension allows us to model electronic circuits! As hinted in the previous section, diodes by themselves can be used to build many things. For example, we can model a simple idealized transistor as follows [10, Fig. 59.1]:

That said, it is impractical to prove the equality of two non-trivial electronic circuits explicitly, as the number of alternatives grows exponentially in the number of diodes. As in standard mathematical practice, making this practical will require finding appropriate techniques and approximations, which we leave for future work.

Acknowledgements. The authors would like to thank the various Twitter and Zulip users who contributed to the genesis and development of the theory contained in this paper, notably Jules Hedges, Cole Comfort and Reid Barton. Reid Barton in particular contributed significantly to the proof of completeness.

The first author is funded by the EPSRC under grant OUCS/GB/1034913. The second author acknowledges support from EPSRC grant EP/V002376/1.

### References

1. Baez, J.C., Coya, B., Rebro, F.: Props in network theory. Theory and Applications of Categories 33(25), 727–783 (2018)



## Token Games and History-Deterministic Quantitative Automata

Udi Boker<sup>1</sup> and Karoliina Lehtinen<sup>2</sup>

<sup>1</sup> Reichman University, Herzliya, Israel, udiboker@idc.ac.il

<sup>2</sup> CNRS, Marseille-Aix Université, Université de Toulon, LIS, Marseille, France, lehtinen@lis-lab.fr

Abstract. A nondeterministic automaton is history-deterministic if its nondeterminism can be resolved by only considering the prefix of the word read so far. Due to their good compositional properties, history-deterministic automata are useful in solving games and synthesis problems. Deciding whether a given nondeterministic automaton is history-deterministic (the HDness problem) is generally a difficult task, which might involve an exponential procedure, or even be undecidable, for example for pushdown automata. Token games provide a PTime solution to the HDness problem of Büchi and coBüchi automata, and it is conjectured that 2-token games characterise HDness for all ω-regular automata. We extend token games to the quantitative setting and analyze their potential to help decide HDness for quantitative automata. In particular, we show that 1-token games characterise HDness for all quantitative (and Boolean) automata on finite words, as well as discounted-sum (DSum) automata on infinite words, and that 2-token games characterise HDness of LimInf and LimSup automata. Using these characterisations, we provide solutions to the HDness problem of Inf and Sup automata on finite words in PTime, for DSum automata on finite and infinite words in NP∩co-NP, for LimSup automata in quasipolynomial time, and for LimInf automata in exponential time, where the latter two are only polynomial for automata with a logarithmic number of weights.

Keywords: Automata, History-determinism, Token games, Synthesis

### 1 Introduction

History-determinism. A nondeterministic [quantitative] automaton is history-deterministic (HD) [11,8] if its nondeterministic choices can be resolved by only considering the word read so far, uniformly across possible suffixes (see Fig. 2 for examples of HD and non-HD automata). More precisely, there should be a function (strategy), sometimes called a resolver, that maps the finite prefixes of a word to the transition to be taken at the last letter. The run built in this way must, in the Boolean setting, be accepting whenever the word is in the language of the automaton, and in the more general, quantitative, setting, attain the value of the automaton on the word (i.e., the supremum of all its runs' values).

History-determinism lies in between determinism and nondeterminism, enjoying in some aspects the best of both worlds: HD automata are, like deterministic ones, useful for solving games and reactive synthesis [16,11,17,18,12,15,8], yet can sometimes be more expressive and/or succinct. For example, HD coBüchi and LimInf automata can be exponentially more succinct than deterministic ones [19], and HD pushdown automata are both more expressive and at least exponentially more succinct than deterministic ones [20,15]. In the (ω-)regular setting, history-determinism coincides with good-for-gameness [7], while in the quantitative setting it is stronger [8]. The problem of deciding whether a nondeterministic automaton is HD is interreducible with deciding the best-value synthesis problem of a deterministic automaton [14,8]. In this quantitative version of the reactive synthesis problem, the system must guarantee a behaviour that matches the value of any global behaviour compatible with the environment's actions. The witness of HDness corresponds exactly to the solution system of this synthesis problem, providing another motivation for this line of research.

Deciding history-determinism – a difficult task. History-determinism is formally defined by a letter game played on the automaton A between Adam and Eve, where Adam produces an input word w, letter by letter, and Eve tries to resolve the nondeterminism in A so that the resulting run attains A's value on w. Then A is HD if Eve has a winning strategy in the letter game on it. The difficulty of deciding who wins the letter game stems from its complicated winning condition: Eve wins if her run has the value of the supremum over all runs of A on w.

The naive solution is to determinise A into an automaton D, and consider a game equivalent to the letter game that has a simple winning condition and whose arena is the product of A and D [16]. The downside with this approach, however, is that it requires the determinisation of A, which often involves a procedure exponential in the size of A and sometimes is even impossible due to an expressiveness gap. Note that deciding whether an automaton is good-for-games, which is closely related to whether it is HD [7,8], is also difficult, as it requires reasoning about composition with all possible games.

Token games – a possible aid. In [3], Bagnol and Kuperberg introduced token games on ω-regular automata, which are closely related to the letter game, but easier to decide. In a k-token game on an automaton A, denoted by $G_k(A)$, like in the letter game, Adam generates a word w letter by letter, and Eve builds a run on w by resolving the nondeterminism. In addition, Adam also has to resolve the nondeterminism of A to build k runs letter-by-letter over w. The winning condition for Eve in these games is that either all runs built by Adam are rejecting, or Eve's run is accepting. Such games, as they compare concrete runs, are easier to solve than the letter game.

Then, to decide HDness for a class of automata, one can attempt to show that the letter game always has the same winner as a k-token game, for some k, and solve the k-token game. (If Eve wins the letter game then she wins the k-token game, for every k, by using the same strategy, ignoring Adam's runs. However, it might be that she wins a k-token game, taking advantage of her knowledge of how Adam resolves the nondeterminism, but loses the letter game.)

Bagnol and Kuperberg showed in [3] that on Büchi automata, the letter game and the 2-token game always have the same winner, and in [6], Boker, Kuperberg, Lehtinen and Skrzypczak extended this result to coBüchi automata. In both cases, this allows for a PTime procedure for deciding HDness. Furthermore, Bagnol and Kuperberg suggested in [3, Conclusion] that 2-token games might characterise HDness also for parity automata (and therefore for all ω-regular automata); a conjecture (termed later the $G_2$ conjecture) that is still open.

Our contribution. We extend token games to the quantitative setting, and use them to decide HDness of some quantitative automata. We define a k-token game on a quantitative automaton exactly as on a Boolean one, except that Eve wins if her run has a value at least as high as all of Adam's runs.

We show first, in Section 4, that the 1-token game, in which Adam just has one run to build, characterises history-determinism for all quantitative (and Boolean) automata on finite words, and for discounted-sum (DSum) automata on infinite words. This results in a PTime decision procedure for checking HDness of Inf and Sup automata on finite words, and an NP∩co-NP procedure for DSum automata on finite and infinite words. Note that the complexity for DSum automata on finite words was already known [14], but on infinite words it was erroneously believed to be NP-hard [17, Theorem 6].

Towards getting the above results, we analyse key properties of value functions of quantitative automata, and show that the 1-token game characterises HDness for every Val automaton, such that Val is present-focused (Definition 3), which is in particular the case for all Val automata on finite words [8, Lemma 16], as well as DSum automata on infinite words [8, Lemma 22].

We then show, in Section 5, that the 2-token game, in which Adam builds two runs, characterises history-determinism for both LimSup and LimInf automata. The approach here is more involved: it decomposes the quantitative automaton into a collection of Büchi or coBüchi automata such that if Eve wins the 2-token game on the original automaton, she also wins in the component automata. Since the 2-token game characterises HD for Büchi and coBüchi automata, the component automata are then HD and the witness strategies can be combined with the 2-token strategy of the original automaton to build a letter-game strategy for Eve. The general flow of our approach is illustrated in Fig. 1.

We further present, in Section 5.1, algorithms to decide the winner of the two-token games on LimInf and LimSup automata via reductions to solving parity games. The complexity of the procedure for a LimSup automaton A is the same as that of solving a parity game of size polynomial in the size of A with twice as many priorities as there are weights in A, which is in quasipolynomial time. For LimInf automata the procedure is in exponential time. In both cases, it is only in polynomial time if the number of weights is logarithmic in the automaton size.

For some variants of the synthesis problem, the complexity of the witness of history-determinism is also of particular interest (while for other variants it is not), as it corresponds to the complexity of the implementation of the solution system [8, Section 5]. We give an exponential upper bound to the complexity of the witness for LimSup and LimInf automata, which, for LimInf, is tight. As a corollary, we obtain that HD LimSup automata are as expressive as deterministic LimSup automata and at most exponentially more succinct.

Related work. In the ω-regular setting (where HDness coincides with good-for-gameness), [16, Section 4] provides an exponential scheme for checking HDness of all ω-regular automata, based on determinisation and checking fair simulation. HDness of Büchi automata is resolved, as mentioned above, in PTime, using 2-token games [3]. The coBüchi case is also resolved in PTime, originally via an indirect usage of "joker games" [19], and later by using 2-token games [6].

In the quantitative setting, deciding HDness coincides with best-value partial domain synthesis [14], 0-regret synthesis [18] and, for some value functions, 0-regret determinisation [13,8]. There are procedures to decide HDness (which is sometimes called good-for-gameness due to erroneously assuming them equivalent) of Sum, Avg, and DSum automata on finite words, as follows.

For Sum and Avg automata on finite words, a PTime solution combines [1, Theorem 4.1], which provides a PTime algorithm for checking whether such an automaton is "determinisable by pruning", and [8, Theorem 21], which shows that such an automaton is HD if and only if it is determinisable by pruning.

Proposition 1. Deciding whether a Sum or Avg automaton on finite words is history-deterministic is in PTime.

For DSum automata on finite words, [14, Theorem 23] provides an NP∩co-NP solution, using a game that is quite similar to the one-token game, differing from it in a few aspects (for example, Adam is asked to either copy Eve with his token or move into a second phase where he plays transitions first), and uses a characterisation of HD strategies resembling our notion of cautious strategies (Definition 2) specialised to DSum automata.

### 2 Preliminaries

Words. An alphabet $\Sigma$ is a finite nonempty set of letters. A finite (resp. infinite) word $u = \sigma_0 \dots \sigma_k \in \Sigma^*$ (resp. $w = \sigma_0 \sigma_1 \dots \in \Sigma^\omega$) is a finite (resp. infinite) sequence of letters from $\Sigma$; $\varepsilon$ is the empty word. We write $\Sigma^\infty$ for $\Sigma^* \cup \Sigma^\omega$. We use $[i..j]$ to denote a set $\{i, \dots, j\}$ of integers, $[i]$ for $[i..i]$, $[..j]$ for $[0..j]$, and $[i..]$ for integers equal to or larger than $i$. We write $w[i..j]$, $w[..j]$, and $w[i..]$ for the infix $\sigma_i \dots \sigma_j$, prefix $\sigma_0 \dots \sigma_j$, and suffix $\sigma_i \dots$ of $w$. A language is a set of words.

Games. We consider a variety of turn-based zero-sum games between Adam (A) and Eve (E). Formally, a game is played on an arena of which the positions are partitioned between the two players. A play is a maximal (finite or infinite) path. The winning condition partitions plays into those that are winning for each player. In some of the technical developments we use parity games, in which moves are coloured with integer priorities and a play is winning for Eve if the maximal priority that occurs infinitely often along the play is even.

A strategy for a player $P \in \{A, E\}$ maps partial plays ending in a position belonging to $P$ to a successor position. A (partial) play $\pi$ agrees with a strategy $s_P$ of $P$, written $\pi \in s_P$, if whenever its prefix $p$ ends in a position of $P$, the next move is $s_P(p)$. A strategy of $P$ is winning from a position $v$ if all plays starting at $v$ that agree with it are winning for $P$. A strategy is positional if it maps all plays that end in the same position to the same successor. A game is determined if for every position, one of the players has a winning strategy.
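The parity winning condition from the Games paragraph can be evaluated directly on ultimately periodic plays. The sketch below assumes a play given as a finite prefix followed by a cycle repeated forever (a "lasso"), each represented by its list of move priorities; this representation and the function names are illustrative assumptions, not notation from the paper.

```python
def max_recurring_priority(cycle_priorities):
    """Largest priority seen infinitely often in a lasso play.

    Exactly the priorities on the cycle occur infinitely often.
    """
    if not cycle_priorities:
        raise ValueError("the cycle of a lasso play must be nonempty")
    return max(cycle_priorities)


def eve_wins_parity(prefix_priorities, cycle_priorities):
    """Eve wins if the maximal priority occurring infinitely often is even.

    Priorities on the prefix occur only finitely often, so they are ignored.
    """
    return max_recurring_priority(cycle_priorities) % 2 == 0
```

For instance, `eve_wins_parity([5, 7], [1, 2])` holds because only priorities 1 and 2 recur, whereas a cycle with priorities 3 and 2 is losing for Eve.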

Quantitative Automata. A nondeterministic quantitative<sup>3</sup> automaton (or just automaton from here on) on words is a tuple $A = (\Sigma, Q, \iota, \delta)$, where $\Sigma$ is an alphabet; $Q$ is a finite nonempty set of states; $\iota \in Q$ is an initial state; and $\delta : Q \times \Sigma \to 2^{(\mathbb{Q} \times Q)}$ is a transition function over weight-state pairs.

A transition is a tuple $(q, \sigma, x, q') \in Q \times \Sigma \times \mathbb{Q} \times Q$, also written $q \xrightarrow{\sigma : x} q'$. (There might be several transitions with different weights over the same letter between the same states.) We write $\gamma(t) = x$ for the weight of a transition $t = (q, \sigma, x, q')$. $A$ is deterministic if for all $q \in Q$ and $a \in \Sigma$, $\delta(q, a)$ is a singleton. We require that the automaton $A$ is total, namely that for every state $q \in Q$ and letter $\sigma \in \Sigma$, there is at least one state $q'$ and a transition $q \xrightarrow{\sigma : x} q'$.

A run of $A$ on a word $w$ is a sequence $\rho = q_0 \xrightarrow{w[0] : x_0} q_1 \xrightarrow{w[1] : x_1} q_2 \dots$ of transitions where $q_0 = \iota$ and $(x_i, q_{i+1}) \in \delta(q_i, w[i])$. As each transition $t_i$ carries a weight $\gamma(t_i) \in \mathbb{Q}$, the sequence $\rho$ provides a weight sequence $\gamma(\rho) = \gamma(t_0)\gamma(t_1)\dots$. A $\mathsf{Val}$ (e.g., $\mathsf{Sum}$) automaton is one equipped with a value function $\mathsf{Val} : \mathbb{Q}^* \to \mathbb{R}$ or $\mathsf{Val} : \mathbb{Q}^\omega \to \mathbb{R}$, which assigns real values to runs of $A$. The value of a run $\rho$ is $\mathsf{Val}(\gamma(\rho))$. The value of $A$ on a word $w$ is the supremum of $\mathsf{Val}(\rho)$ over all runs $\rho$ of $A$ on $w$. Two automata $A$ and $A'$ are equivalent if they realise the same function. The size of an automaton is the maximum of the sizes of its alphabet, state-space, and transition-space.
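To make these definitions concrete, here is a minimal Python sketch of a quantitative automaton on finite words. The two-state automaton, the dictionary encoding of δ and the names `delta`, `weight_sequences` and `value` are all hypothetical choices made for illustration. On finite words there are finitely many runs, so the supremum is a maximum.

```python
from fractions import Fraction

# Hypothetical total automaton over the alphabet {a, b}.
# delta maps (state, letter) to a set of (weight, successor) pairs.
delta = {
    ("q0", "a"): {(Fraction(1), "q0"), (Fraction(3), "q1")},
    ("q0", "b"): {(Fraction(0), "q0")},
    ("q1", "a"): {(Fraction(0), "q1")},
    ("q1", "b"): {(Fraction(2), "q1")},
}
initial = "q0"


def weight_sequences(word, state=initial):
    """Yield the weight sequence of every run on `word` starting in `state`."""
    if not word:
        yield ()
        return
    for weight, successor in delta[(state, word[0])]:
        for rest in weight_sequences(word[1:], successor):
            yield (weight,) + rest


def value(word, val=sum):
    """Value of the automaton on `word`: the best `val` over all runs.

    `val` is the value function applied to a weight sequence; the default
    `sum` gives a Sum automaton.
    """
    return max(val(ws) for ws in weight_sequences(word))
```

With this automaton, `value("ab")` compares the run staying in `q0` (weights 1, 0) with the run moving to `q1` (weights 3, 2) and returns the larger sum. Enumerating all runs is exponential in the word length, so this is only suitable for small examples.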

Value functions.

For finite sequences $v_0 v_1 \dots v_{n-1}$ of rational weights:

- $\mathsf{Sum}(v) = \sum_{i=0}^{n-1} v_i$
- $\mathsf{Avg}(v) = \frac{1}{n} \sum_{i=0}^{n-1} v_i$

For finite and infinite sequences $v_0 v_1 \dots$ of rational weights:

- $\mathsf{Inf}(v) = \inf \{ v_n \mid n \ge 0 \}$
- $\mathsf{Sup}(v) = \sup \{ v_n \mid n \ge 0 \}$
- For a discount factor $\lambda \in \mathbb{Q} \cap (0, 1)$, $\lambda\text{-}\mathsf{DSum}(v) = \sum_{i \ge 0} \lambda^i v_i$

For infinite sequences $v_0 v_1 \dots$ of rational weights:

- $\mathsf{LimInf}(v) = \lim_{n \to \infty} \inf \{ v_i \mid i \ge n \}$
- $\mathsf{LimSup}(v) = \lim_{n \to \infty} \sup \{ v_i \mid i \ge n \}$

<sup>3</sup> We speak of "quantitative" rather than "weighted" automata, following the distinction made in [5] between the two.
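The value functions above translate directly into code. The following Python sketch is illustrative: the function names are hypothetical, and infinite sequences are assumed to be ultimately periodic, given as a finite `prefix` followed by a nonempty `cycle` repeated forever. Under that assumption the discounted sum of the periodic part is a geometric series.

```python
from fractions import Fraction


def val_sum(v):
    return sum(v)


def val_avg(v):
    # Defined for nonempty finite sequences only.
    return Fraction(sum(v)) / len(v)


def val_inf(v):
    return min(v)


def val_sup(v):
    return max(v)


def val_dsum(v, lam):
    """Discounted sum of a finite sequence with discount factor lam."""
    return sum(lam ** i * x for i, x in enumerate(v))


def val_liminf(prefix, cycle):
    """LimInf of prefix . cycle^omega: only the cycle's weights recur."""
    return min(cycle)


def val_limsup(prefix, cycle):
    """LimSup of prefix . cycle^omega: only the cycle's weights recur."""
    return max(cycle)


def val_dsum_lasso(prefix, cycle, lam):
    """Discounted sum of prefix . cycle^omega for 0 < lam < 1.

    The cycle contributes dsum(cycle) * (1 + lam^n + lam^(2n) + ...),
    with n = len(cycle), which sums to dsum(cycle) / (1 - lam^n).
    """
    periodic = val_dsum(cycle, lam) / (1 - lam ** len(cycle))
    return val_dsum(prefix, lam) + lam ** len(prefix) * periodic
```

Passing `lam` as a `Fraction` keeps every result exact, which matches the requirement that the discount factor be rational.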

ω-regular automata (with acceptance on transitions) can be viewed as special cases of quantitative automata. In particular, a Büchi (resp. coBüchi) automaton can be seen as a quantitative one, in which a rejecting transition has weight 0, an accepting transition has weight 1, and whose value function is 1 if the sequence of weights has infinitely many 1's and 0 otherwise (resp. 1 if the sequence of weights has finitely many 0's and 0 otherwise). See more on ω-regular automata, e.g., in [4].

History-determinism. Intuitively, an automaton is history-deterministic if there is a strategy to resolve its nondeterminism according to the word read so far such that for every word, the value of the resulting run is the value of the word.

Definition 1 (History-determinism [11,8]). A $\mathsf{Val}$ automaton $A$ is history-deterministic (HD) if Eve wins the following win-lose letter game, in which Adam chooses the next letter and Eve resolves the nondeterminism, aiming to construct a run whose value is equal to the generated word's value.

The game starts in the initial state $q_0 = \iota$. In round $i \ge 0$, from state $q_i$:

- Adam picks a letter $\sigma_i$ from $\Sigma$, and
- Eve chooses a transition $t_i = q_i \xrightarrow{\sigma_i : x_i} q_{i+1}$.

In the limit, a play consists of an infinite word $w$ that is derived from the concatenation of $\sigma_0, \sigma_1, \dots$, as well as an infinite sequence $\pi = t_0, t_1, \dots$ of transitions. For $A$ over infinite words, Eve wins a play in the letter game if $\mathsf{Val}(\pi) \ge A(w)$. For $A$ over finite words, Eve wins if for all $i \in \mathbb{N}$, $\mathsf{Val}(\pi[0..i]) \ge A(w[0..i])$.
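For automata on finite words, the letter game can be explored by brute force on small instances. The sketch below is an illustration under stated assumptions: the three-state Sum automaton `DELTA`, the resolver signature `(state, prefix, letter)` and all function names are hypothetical. It searches for a shortest word on which a given resolver's run falls below the automaton's value. Because every prefix of an enumerated word is itself enumerated, checking whole words also covers the prefix condition of the definition.

```python
from itertools import product

ALPHABET = "ab"
INITIAL = "q0"

# Hypothetical Sum automaton: (state, letter) -> list of (weight, successor).
# On the first 'a', Eve must guess whether later a's (q1) or b's (q2) pay off.
DELTA = {
    ("q0", "a"): [(0, "q1"), (0, "q2")],
    ("q0", "b"): [(0, "q0")],
    ("q1", "a"): [(1, "q1")],
    ("q1", "b"): [(0, "q1")],
    ("q2", "a"): [(0, "q2")],
    ("q2", "b"): [(1, "q2")],
}


def automaton_value(word):
    """A(word): the best Sum over all runs, by dynamic programming."""
    best = {INITIAL: 0}
    for letter in word:
        nxt = {}
        for state, total in best.items():
            for weight, succ in DELTA[(state, letter)]:
                candidate = total + weight
                nxt[succ] = max(nxt.get(succ, candidate), candidate)
        best = nxt
    return max(best.values())


def eve_run_value(word, resolver):
    """Sum of the run Eve builds when she answers each letter via `resolver`."""
    state, total = INITIAL, 0
    for i, letter in enumerate(word):
        weight, state = resolver(state, word[:i], letter)
        total += weight
    return total


def first_losing_word(resolver, max_len):
    """Shortest nonempty word (up to max_len) on which Eve loses, or None."""
    for n in range(1, max_len + 1):
        for letters in product(ALPHABET, repeat=n):
            word = "".join(letters)
            if eve_run_value(word, resolver) < automaton_value(word):
                return word
    return None


def always_first(state, prefix, letter):
    """Resolver that always takes the first listed transition."""
    return DELTA[(state, letter)][0]


def always_last(state, prefix, letter):
    """Resolver that always takes the last listed transition."""
    return DELTA[(state, letter)][-1]
```

In this hypothetical automaton both resolvers are refuted: whichever successor Eve picks on the first `a`, Adam can continue with the letter that rewards the other branch, so the automaton is not history-deterministic. A bounded search of this kind can only refute a given resolver; it does not by itself establish HDness.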

Consider for example the LimSup automaton A in Fig. 2. Eve loses the letter game on A: Adam can start with the letter a; then if Eve goes from s<sup>0</sup> to s1, Adam continues to choose a forever, generating the word a <sup>ω</sup>, where A(a <sup>ω</sup>) = 3, while Eve's run has the value 2. If, on the other hand, Eve chooses on her frst move to go from s<sup>0</sup> to s2, Adam continues with choosing b forever, generating the word ab<sup>ω</sup>, where A(ab<sup>ω</sup>) = 2, while Eve's run has the value 1.

Families of value functions. We will provide some of our results with respect to a family of Val automata based on properties of the value function Val.

We first define cautious strategies for Eve in both the letter game and token games (Section 3), which we use to define present-focused value functions. Intuitively, a strategy is cautious if it avoids mistakes: it only builds run prefixes that can achieve the maximal value of any continuation of the current word prefix.

Definition 2 (Cautious strategies [8]). Consider the letter game on a Val automaton A, in which Eve builds a run of A transition by transition. A move (transition) $t = q \xrightarrow{\sigma : x} q'$ of Eve, played after some run ρ ending in a state q, is non-cautious if for some word w, there is a run π′ from q over σw such that Val(ρπ′) is strictly greater than Val(ρπ) for every run π over σw starting with t.

A strategy is cautious if it makes no non-cautious moves.

A winning strategy for Eve in the letter game must of course be cautious; whether all cautious strategies are winning depends on the value function. We call a value function present-focused if, morally, it depends on the prefixes of the value sequence, formalised by winning the letter game via cautious strategies.

Definition 3 (Present-focused value functions [8]). A value function Val, on finite or infinite sequences, is present-focused if for all automata A with value function Val, every cautious strategy for Eve in the letter game on A is also a winning strategy in that game.

Value functions on finite sequences are present-focused, as they can only depend on prefixes, while value functions on infinite sequences are not necessarily present-focused [8, Remark 17]; LimInf and LimSup, for example, are not.

Proposition 2 ([8, Lemma 16]). Every value function Val on finite sequences of rational values is present-focused.

Proposition 3 ([8, Lemma 22]). For every λ ∈ Q ∩ (0, 1), λ-DSum on infinite sequences of rational values is a present-focused value function.

### 3 Token Games

Token games were introduced by Bagnol and Kuperberg [3] in the scope of resolving the HDness problem of Büchi automata. In the k-token game, known as G<sub>k</sub>, the players proceed as in the letter game, except that now Adam has k tokens that he must move after Eve has made her move, thus building k runs. For Adam to win, at least one of these must be better than Eve's run. In the Boolean setting, this run must be accepting, thus witnessing that the word is in the language of the automaton. Intuitively, the more tokens Adam has, the less information he is giving Eve about the future of the word he is building.

We generalise token games to the quantitative setting, defining that the maximal value produced by Adam's runs witnesses a lower bound on the value of the word, and Eve's task is to match or surpass this value on her run.

In the Boolean setting, G<sub>2</sub> has the same winner as the letter game for Büchi [3, Corollary 21] and coBüchi [6, Theorem 28] automata (the case of parity and more powerful automata is open). Since G<sub>2</sub> is solvable in polynomial time for Büchi and coBüchi acceptance conditions, this gives a PTime algorithm for deciding HDness, which avoids the determinisation used to solve the letter game directly. In the following sections we study how different token games can be used to decide HDness for different quantitative automata.

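Definition 4 (k-token games). Consider a Val automaton A = (Σ, Q, ι, δ). A configuration of the game G<sub>k</sub>(A) for k ≥ 1 is a tuple (q, p<sub>1</sub>, …, p<sub>k</sub>) ∈ Q<sup>k+1</sup> of states. A play consists of an infinite sequence of configurations (ι, ι, …, ι) = (q<sub>0</sub>, p<sub>1,0</sub>, …, p<sub>k,0</sub>), (q<sub>1</sub>, p<sub>1,1</sub>, …, p<sub>k,1</sub>), …. In a configuration (q<sub>i</sub>, p<sub>1,i</sub>, …, p<sub>k,i</sub>), the game proceeds to the next configuration as follows.

– Adam picks a letter σ<sub>i</sub> from Σ,
– Eve chooses a transition $q\_i \xrightarrow{\sigma\_i : x\_i} q\_{i+1}$, and
– Adam chooses, for each of his k tokens, a transition $p\_{j,i} \xrightarrow{\sigma\_i : x\_{j,i}} p\_{j,i+1}$.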

In the limit, a play consists of an infinite word w that is derived from the concatenation of σ<sub>0</sub>, σ<sub>1</sub>, …, as well as k + 1 infinite sequences π (picked by Eve) and π<sub>1</sub>, …, π<sub>k</sub> (picked by Adam) of transitions over w. Eve wins the play if Val(π) ≥ max(Val(π<sub>1</sub>), …, Val(π<sub>k</sub>)).

On finite words, G<sub>k</sub> is defined as above, except that the winning condition is a safety condition for Eve: for all finite prefixes of a play, it must be the case that the value of Eve's run is at least the value of each of Adam's runs.

Cautious strategies (Definition 2) immediately extend to Eve's strategies in G<sub>k</sub>(A). Note that unlike in the letter game, a winning strategy in G<sub>k</sub>(A) need not be cautious, since Adam's run prefixes might not allow him to build an optimal run over the word witnessing that Eve's move was non-cautious.

### 4 Deciding History-Determinism via One-Token Games

Bagnol and Kuperberg showed that the one-token game G<sub>1</sub> does not suffice to characterise HDness for Büchi automata [3, Lemma 8]. However, it turns out that G<sub>1</sub> does characterise HDness for all quantitative (and Boolean) automata on finite words and some quantitative automata on infinite words.

We can then use G<sub>1</sub> to decide history-determinism of those automata over which the G<sub>1</sub> game is simple to decide: in particular, Inf and Sup automata on finite words and DSum automata on finite and infinite words.

Theorem 1. Given a nondeterministic automaton A with a present-focused value function Val over finite or infinite words, Eve wins G<sub>1</sub>(A) if and only if A is HD. Furthermore, a winning strategy for Eve in G<sub>1</sub>(A) induces an HD strategy with the same memory.

Proof. One direction is easy: if A is HD, Eve can use her HD strategy to win G<sub>1</sub> by ignoring Adam's token. For the other direction, assume that Eve wins G<sub>1</sub>.

We consider the following family of copycat strategies for Adam in G<sub>1</sub>: a copycat strategy is one where Adam moves his token in the same way as Eve until she makes a non-cautious move $t = q \xrightarrow{\sigma : x} q'$ after building a run ρ; that is, there is some word w and run π′ from q on σw, such that for every run π on σw starting with t, we have Val(ρπ′) > Val(ρπ). Then the copycat strategy stops copying and directs Adam's token along the run π′ and plays the word w. If Eve plays a non-cautious move in G<sub>1</sub> against a copycat strategy, she loses. Then, if Eve wins G<sub>1</sub> with a strategy s, she wins in particular against all copycat strategies and therefore s never makes a non-cautious move against such a strategy.

Eve can then play in the letter game over A with a strategy s′ that moves her token as s would in G<sub>1</sub>(A) assuming Adam uses a copycat strategy. Then, s′ never makes a non-cautious move and is therefore a cautious strategy. Since Val is present-focused, any cautious strategy, and in particular s′, is winning in the letter game, so A is HD. Note that s′ requires no more memory than s. ⊓⊔

Corollary 1. Given a nondeterministic automaton A over finite words, Eve wins G<sub>1</sub>(A) if and only if A is HD, and winning strategies in G<sub>1</sub>(A) induce HD strategies for A of the same complexity.

Proof. A direct consequence of Proposition 2 and Theorem 1. ⊓⊔

Solving token games. For resolving the HDness problem of Val automata where Val is present-focused, it then remains to study for which of them the corresponding G<sub>1</sub> game is simple to decide.

Theorem 2. Deciding whether an Inf or Sup automaton on finite words is HD is in PTime, namely in O(|Σ|n<sup>2</sup>k) for Sup and O(|Σ|n<sup>2</sup>k<sup>2</sup>) for Inf, where Σ is the automaton's alphabet, k the number of weights and n the number of states.

Proof. Given a Sup automaton A = (Σ, Q, ι, δ) with weights W, G<sub>1</sub>(A) reduces to solving a safety game, whose positions (σ, q, q′, x<sub>E</sub>, t) ∈ (Σ ∪ {ε}) × Q<sup>2</sup> × W × {L, E, A} consist of a possibly empty letter σ representing the last letter played, a pair of states (q, q′), one for Eve and one for Adam, which keep track of the end of the current run built by each player, a weight x<sub>E</sub> from W, which keeps track of the maximal weight seen on Eve's run so far, and a turn variable t ∈ {L, E, A} indicating whether it is Adam's turn to give a letter (L), Eve's turn to choose a transition (E), or Adam's turn to choose a transition (A). The initial position is (ε, ι, ι, m, L) where m is the minimal weight of A. The moves and position ownership encode the permitted moves in G<sub>1</sub>(A) and update x<sub>E</sub> to reflect the maximal value of Eve's run. The winning condition for Eve is a safety condition: Adam wins if he picks a move with a weight higher than x<sub>E</sub>, the maximal weight on Eve's run. Then plays in this game are in bijection with plays of G<sub>1</sub>(A), and Eve wins if and only if she can avoid Adam choosing a transition with a larger weight than x<sub>E</sub>, that is, if she can win G<sub>1</sub>(A).

Then, solving G<sub>1</sub>(A) reduces to solving this safety game, which can be done in time linear in the number of positions of the arena, which is 3|Σ|n<sup>2</sup>k.

The case of Inf automata is similar, except that instead of keeping Eve's maximal value along her run, we need to keep the minimal value along Adam's run in some variable x<sub>A</sub>, and the safety condition for Eve is that her current value must always be at least as big as x<sub>A</sub> and Adam's next move. Since Adam plays after Eve in each round of the game, we also need to keep Eve's last value, thus having 3|Σ|n<sup>2</sup>k<sup>2</sup> positions. ⊓⊔
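The safety game for the Sup case can be sketched in a few lines of Python. This is a minimal illustration under several assumptions that are not from the paper: the automaton is encoded as a dictionary `delta` from (state, letter) pairs to lists of (weight, successor) pairs, it is assumed to be total, the three turns of a round are folded into one step, and the safe region is computed by a naive greatest-fixpoint iteration rather than the linear-time algorithm mentioned in the proof. The two example automata `guess` and `greedy` are hypothetical.

```python
from itertools import product

def eve_wins_g1_sup(alphabet, states, init, delta):
    """Decide G1 on a Sup automaton over finite words (illustrative sketch).

    `delta` maps (state, letter) to a list of (weight, successor) pairs and
    is assumed to be total; this encoding is chosen for the example only.
    """
    weights = sorted({w for moves in delta.values() for w, _ in moves})
    # Positions at Adam's letter turn: (Eve's state, Adam's state, x_E).
    safe = set(product(states, states, weights))

    def eve_survives_round(q, p, x_e):
        for a in alphabet:                      # Adam picks a letter
            if not any(                         # Eve picks a transition
                all(                            # Adam picks a transition
                    w_a <= max(x_e, w_e) and (q2, p2, max(x_e, w_e)) in safe
                    for w_a, p2 in delta[(p, a)]
                )
                for w_e, q2 in delta[(q, a)]
            ):
                return False
        return True

    # Greatest fixpoint: discard positions from which Adam forces a violation.
    while True:
        losing = {pos for pos in safe if not eve_survives_round(*pos)}
        if not losing:
            break
        safe -= losing
    return (init, init, weights[0]) in safe

# Eve has to guess whether later a's or later b's will carry weight 1.
guess = {
    ("s0", "a"): [(0, "s1"), (0, "s2")], ("s0", "b"): [(0, "s0")],
    ("s1", "a"): [(1, "s1")], ("s1", "b"): [(0, "s1")],
    ("s2", "a"): [(0, "s2")], ("s2", "b"): [(1, "s2")],
}
# Eve can simply take the heavier transition first.
greedy = {
    ("s0", "a"): [(2, "s1"), (1, "s2")],
    ("s1", "a"): [(0, "s1")],
    ("s2", "a"): [(0, "s2")],
}
```

On `guess`, Adam moves his token to the state Eve did not choose and then plays the letter that pays off there, so Eve loses; on `greedy`, taking the weight-2 transition keeps her safe forever.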

Next, we show that solving G<sub>1</sub> is in NP∩co-NP for DSum automata.

Theorem 3. For every λ ∈ Q ∩ (0, 1), deciding whether a λ-DSum automaton A, on finite or infinite words, is HD is in NP∩co-NP.<sup>4</sup>

<sup>4</sup> It was already known for finite words [14]. It is perhaps surprising for infinite words, given the NP-hardness result in [17, Theorem 6]. In consultation with the authors, we have confirmed that there is an error in the hardness proof.

Proof. Consider a λ-DSum automaton A = (Σ, Q, ι, δ), where the weight of a transition t is denoted by γ(t). From Propositions 2 and 3 and Theorem 1, Eve wins G<sub>1</sub>(A) if and only if A is HD. It therefore suffices to show that solving G<sub>1</sub>(A) is in NP∩co-NP. We achieve this by reducing solving G<sub>1</sub>(A) to solving a discounted-sum threshold game, which Eve wins if the DSum of a play is non-negative. It is enough to consider infinite games, as they also encode finite games, by allowing Adam to move to a forever-zero-position in each of his turns.

The reduction follows the same pattern as that in the proof of Theorem 2: we represent the arena of the game G<sub>1</sub>(A) as a finite arena, and encode its winning condition, which requires the difference between the DSum of two runs to be non-negative, as a threshold DSum winning condition. Note first that the difference between the λ-DSum of the two sequences x<sub>0</sub>x<sub>1</sub>… and x′<sub>0</sub>x′<sub>1</sub>… of weights is equal to the λ-DSum of the sequence of differences d<sub>0</sub> = (x<sub>0</sub> − x′<sub>0</sub>), d<sub>1</sub> = (x<sub>1</sub> − x′<sub>1</sub>), …, as follows:

$$\Big(\sum\_{i=0}^{\infty} \lambda^i x\_i\Big) - \sum\_{i=0}^{\infty} \lambda^i x'\_i = \sum\_{i=0}^{\infty} \lambda^i (x\_i - x'\_i).$$

We now describe the DSum arena G in which Eve wins with a non-strict 0-threshold objective if and only if she wins G<sub>1</sub>(A). The arena has positions (σ, q, q′, t, m) ∈ (Σ ∪ {ε}) × Q<sup>2</sup> × (δ ∪ {ε}) × {L, E, A}, where σ is the potentially empty last played letter, starting with ε, the states q, q′ represent the positions of Eve's and Adam's tokens, t is the transition just played by Eve if m = A and ε otherwise, and m denotes the move type, having L for Adam choosing a letter, E for Eve choosing a transition and A for Adam choosing a transition.

A move of Adam that chooses a transition $t' = q' \xrightarrow{\sigma : x} q''$, namely a move (σ, q, q′, t, A) → (σ, q, q′′, ε, L), is given weight γ(t) − γ(t′), that is, the difference between the weights of the transitions chosen by both players. Other transitions are given weight 0. Observe that we need to compensate for the fact that only one edge in three is weighted. One option is to take a discount factor λ′ = λ<sup>1/3</sup> for the DSum game G. Yet, λ′ can then be irrational, which somewhat complicates things. Another option is to consider discounted-sum games with multiple discount factors [2] and choose three rational discount factors λ′, λ′′, λ′′′ ∈ Q ∩ (0, 1), such that λ′ · λ′′ · λ′′′ = λ. Since the first two weights in every triple are 0, only the product of the three discount factors toward the third weight matters. For λ = p/q, where p < q are positive integers, one can choose λ′ = 4p/(4p+1), λ′′ = (4p+1)/(4p+2), and λ′′′ = (2p+1)/(2q).
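The choice of the three rational discount factors can be checked with exact arithmetic. The sketch below is only a sanity check; the function name `split_discount` is introduced here for illustration.

```python
from fractions import Fraction

def split_discount(p, q):
    """Return three rational factors in (0, 1) whose product is p/q.

    Uses the choice from the text: 4p/(4p+1), (4p+1)/(4p+2), (2p+1)/(2q),
    for positive integers p < q.
    """
    if not 0 < p < q:
        raise ValueError("expected positive integers with p < q")
    first = Fraction(4 * p, 4 * p + 1)
    second = Fraction(4 * p + 1, 4 * p + 2)
    third = Fraction(2 * p + 1, 2 * q)
    return first, second, third

def check(p, q):
    """True if all three factors lie strictly in (0, 1) and multiply to p/q."""
    factors = split_discount(p, q)
    in_range = all(0 < f < 1 for f in factors)
    return in_range and factors[0] * factors[1] * factors[2] == Fraction(p, q)
```

The third factor is below 1 because p < q implies 2p + 1 < 2q for integers, and the first two telescope to 4p/(4p+2) = 2p/(2p+1), which cancels against the numerator of the third.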

Plays in G<sub>1</sub>(A) and in G are in bijection. It now suffices to argue that the winning condition of G, namely that the (λ′, λ′′, λ′′′)-DSum of the play is non-negative, correctly encodes the winning condition of G<sub>1</sub>(A), meaning that the difference between the λ-DSum of Eve's run and of Adam's run is non-negative.

Let d<sub>0</sub>d<sub>1</sub> … be the sequence of weight differences between the transitions played by both players in G<sub>1</sub>(A), and let λ<sub>0</sub>, λ<sub>1</sub>, … and w<sub>0</sub>, w<sub>1</sub>, … be the corresponding sequences of discount factors and weights in the (λ′, λ′′, λ′′′)-DSum game, respectively, where for every i ≡ 0 (mod 3), we have w<sub>i</sub> = 0 and λ<sub>i</sub> = λ′, for every i ≡ 1 (mod 3), we have w<sub>i</sub> = 0 and λ<sub>i</sub> = λ′′, and for every i ≡ 2 (mod 3), we have w<sub>i</sub> = d<sub>(i−2)/3</sub> and λ<sub>i</sub> = λ′′′. Then the value of the (λ′, λ′′, λ′′′)-DSum sequence is equal to the required DSum sequence multiplied by λ′ · λ′′:

$$(\lambda', \lambda'', \lambda''') \text{-DSum} = \sum\_{i=0}^{\infty} (0 \cdot \prod\_{j=0}^{3i-1} \lambda\_j + 0 \cdot \prod\_{j=0}^{3i} \lambda\_j + w\_{3i+2} \cdot \prod\_{j=0}^{3i+1} \lambda\_j) = \lambda' \cdot \lambda'' \cdot \sum\_{i=0}^{\infty} \lambda^i d\_i$$
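The identity can be checked on finite prefixes with exact rationals. In the sketch below, the weight at position n is multiplied by the product of the discount factors at positions 0 to n − 1, as in the displayed formula; the helper names `multi_discount_sum` and `interleave` are assumptions made for this example.

```python
from fractions import Fraction
from itertools import cycle

def multi_discount_sum(weights, discounts):
    """Sum of weights[n] times the product of the first n discount factors.

    `discounts` is a finite tuple that is cycled through, one factor per step.
    """
    total, factor = Fraction(0), Fraction(1)
    for weight, lam in zip(weights, cycle(discounts)):
        total += factor * weight
        factor *= lam
    return total

def interleave(diffs):
    """Place each difference d_i at position 3i + 2, after two zero weights."""
    sequence = []
    for d in diffs:
        sequence += [0, 0, d]
    return sequence

def plain_dsum(diffs, lam):
    """Ordinary discounted sum of a finite sequence."""
    return sum(lam ** i * d for i, d in enumerate(diffs))
```

For example, with λ = 1/2 split as (4/5, 5/6, 3/4), `multi_discount_sum(interleave(d), (l1, l2, l3))` equals `l1 * l2 * plain_dsum(d, l1 * l2 * l3)` for any finite list `d`, so both sums have the same sign.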

Hence Eve wins the game G<sub>1</sub>(A) if and only if she wins the 0-threshold (λ′, λ′′, λ′′′)-DSum game over G. As G has a state space polynomial in the state space of A and solving DSum games is in NP∩co-NP [2], solving G<sub>1</sub>(A), and therefore deciding whether A is HD, is also in NP∩co-NP. ⊓⊔

DSum games are positionally determined [22,23,2] so this algorithm also computes a finite-memory witness of HDness for A that is of polynomial size in the state space of A. However, a positional witness also exists [17, Section 5].

### 5 Deciding History-Determinism via Two Token Games

In this section we solve the HDness problem of LimSup and LimInf automata via two-token games. As is the case with Büchi and coBüchi automata, one-token games do not characterise HDness of LimSup and LimInf automata. For LimInf, a possible alternative approach is to try to solve the letter game directly: we can use an equivalent deterministic LimInf automaton to track the value of a word, and the winning condition of the letter game corresponds to comparing Eve's run to the one of the deterministic automaton. Unfortunately, determinising LimInf automata is exponential in the number of states [10, Theorem 13], so the new game is large, and, in addition, its winning condition, which compares the LimInf value of two runs, is non-standard and needs additional work to be encoded into a parity game. For LimSup automata the situation is even worse, as they are not necessarily equivalent to deterministic LimSup automata, so it is not obvious whether the winner of the letter game is decidable at all.

Here we show that the 2-token-game approach, used to resolve HDness of Büchi and coBüchi automata, can be generalised to LimSup and LimInf automata. While the proof that G<sub>2</sub> has the same winner as the letter game is quite different for the Büchi and coBüchi cases, our proofs for the LimSup and LimInf cases follow the same structure, while relying on the Büchi and coBüchi results respectively. However, the argument that G<sub>2</sub>(A) is solvable differs according to whether A is a LimSup or LimInf automaton. In particular, perhaps surprisingly (since the naive approach to solving the letter game seems harder for LimSup), we show that G<sub>2</sub> is solvable in quasipolynomial time for LimSup while for LimInf our algorithm is exponential in the number of weights (but not in the number of states).

Without loss of generality, we assume the weights to be {1, 2, . . .}.

We start, in Section 5.1, with analysing the 2-token game on LimSup and LimInf automata, and show, in Section 5.2, that it characterises their HDness.

### 5.1 G<sub>2</sub> on LimSup and LimInf Automata

We first observe that G<sub>2</sub>(A), for both a LimSup and a LimInf automaton A, can be solved via a reduction to a parity game. The G<sub>2</sub> winning condition for LimSup automata can be encoded by adding carefully chosen priorities to the arena of G<sub>2</sub>(A), while for LimInf the encoding requires additional positions.

Lemma 1. Given a nondeterministic LimSup automaton A of size n with k weights, the game G<sub>2</sub>(A) can be solved in time quasipolynomial in n, and if k is in O(log n), in time polynomial in n.

Proof. We encode the game G<sub>2</sub>(A), for a LimSup automaton A = (Σ, Q, ι, δ), into a parity game as follows. The arena is simply the arena of G<sub>2</sub>(A), seen as a product of the alphabet and three copies of A, to reflect the current letter and the current position of each of the three runs (one for Eve, two for Adam).

Adam's letter-picking moves are labelled with priority 0, Eve's choices of transition $q \xrightarrow{\sigma : x} q'$ are labelled with priority 2x and Adam's choices of transition $q \xrightarrow{\sigma : x} q'$ are labelled with priority 2x − 1.

We claim that Eve wins this parity game if and only if she wins G<sub>2</sub>(A), that is, the priorities correctly encode the winner of G<sub>2</sub>(A). Observe that the even priorities seen infinitely often in a play of the parity game are exactly priorities 2x, where x is a weight seen infinitely often in Eve's run in the corresponding play in G<sub>2</sub>(A). The odd priorities seen infinitely often on the other hand are 2x − 1, where x > 0 occurs infinitely often on one of Adam's runs in the corresponding play of G<sub>2</sub>(A). Hence, Eve can match the maximal value of Adam's runs in G<sub>2</sub>(A) if and only if she can win the parity game that encodes G<sub>2</sub>(A).

The number of positions in this game is polynomial in the size n of A; the maximal priority is linear in the number of weights. It can be solved in quasipolynomial time, or in polynomial time if the number of weights is in O(log n), using the reader's favourite state-of-the-art parity game algorithm, for instance [9]. ⊓⊔
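The priority assignment in this proof can be tested on ultimately periodic plays, where only the rounds on the repeated cycle matter. The sketch below assumes the max-parity convention (Eve wins if the highest priority seen infinitely often is even) and represents a round as a pair of Eve's weight and Adam's two weights; both the representation and the function names are assumptions made for this example.

```python
from itertools import product

def limsup_round_priorities(eve_weight, adam_weights):
    """Priorities of one round: letter move, Eve's move, Adam's two moves."""
    return [0, 2 * eve_weight] + [2 * w - 1 for w in adam_weights]

def eve_wins_parity(cycle):
    """Max-parity winner when the rounds in `cycle` repeat forever."""
    top = max(p for e, ws in cycle for p in limsup_round_priorities(e, ws))
    return top % 2 == 0

def eve_wins_g2_limsup(cycle):
    """Direct comparison of the LimSup values of the three runs."""
    eve = max(e for e, _ in cycle)
    adam = max(w for _, ws in cycle for w in ws)
    return eve >= adam

def encodings_agree(k, length):
    """Compare both winners on every cycle of `length` rounds over weights 1..k."""
    rounds = [(e, (a, b)) for e, a, b in product(range(1, k + 1), repeat=3)]
    return all(
        eve_wins_parity(c) == eve_wins_g2_limsup(c)
        for c in product(rounds, repeat=length)
    )
```

The check only exercises the winning condition on individual plays; solving the game itself still requires a parity game algorithm such as the one cited in the proof.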

Lemma 2. Given a nondeterministic LimInf automaton A of size n with k weights, the game G<sub>2</sub>(A) can be solved in time exponential in n, and if k is in O(log n), in time polynomial in n.

Proof. As in the proof of Lemma 1, we can represent G<sub>2</sub>(A) as a game on an arena that is the product of three copies of A, one for Eve and two for Adam. The winning condition for Eve is that the smallest weight seen infinitely often on the run built on her copy of A should be at least as large as both of the minimal weights seen infinitely often on the runs built on Adam's copies. We will encode this winning condition as a parity condition, but, unlike in the LimSup case, we will need to use an additional memory structure, which we describe now.

Intuitively, the weights on Eve's run will be encoded by odd priorities, with smaller weights corresponding to higher priorities, as in LimInf the lowest weight seen infinitely often is the one that matters, while weights on Adam's runs will be encoded by even priorities, but only once both of Adam's runs have seen the corresponding weight or a lower one. This is the role of the memory structure, which encodes which of Adam's runs has seen which weight recently.

More precisely, let k be the number of weights in A. Moves corresponding to Eve choosing a transition of weight i have priority 2(k − i + 1) − 1, that is, an odd priority that is larger the smaller i is. Further, for each weight i, we use a three-valued variable x<sub>i</sub> ∈ {0, 1, 2}, initialised to 0, which gets updated as follows: if x<sub>i</sub> = 0 and the game takes a transition with a weight w ≤ i on one of Adam's runs, x<sub>i</sub> is updated to 1 or 2 according to which of Adam's runs saw this weight; if x<sub>i</sub> = 1 (resp. 2) and Adam's second (resp. first) run takes a transition with weight w ≤ i, then x<sub>i</sub> is reset to 0. Transitions that reset variables to 0 have priority 2(k − i + 1) for the minimal i such that the transition resets x<sub>i</sub> to 0; Adam's other transitions have priority 1. All remaining moves do not affect the variables x<sub>i</sub> and have priority 1.

We now argue that the highest priority seen infinitely often along a play is even if and only if the LimInf value of Eve's run is at least as high as that of both of Adam's runs. Indeed, the maximal odd priority seen infinitely often on a play is 2(k − i + 1) − 1 such that i is the minimal weight seen on Eve's run infinitely often, and the maximal even priority seen infinitely often is 2(k − j + 1) where j is the minimal weight such that both of Adam's runs see j or a smaller weight infinitely often. In particular, 2(k − i + 1) − 1 < 2(k − j + 1) if and only if i ≥ j, that is, if Eve wins G<sub>2</sub>(A).

This parity game is of size exponential in k due to the memory structure ({0, 1, 2}<sup>k</sup>) and has 2k priorities. As the number of priorities is logarithmic in the size of the game, it can be solved in polynomial time [9]. If the number of weights is in O(log n), then the algorithm is polynomial in the size n of A. ⊓⊔
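The memory structure and priorities for the LimInf case can be simulated on a play that repeats a finite cycle of rounds forever. The sketch below is an illustration under assumptions not taken from the paper: weights are 1..k, a round is a pair of Eve's weight and Adam's two weights, Adam's two token moves are processed one after the other, and the priorities seen infinitely often are found by repeating the cycle until the memory at the start of an iteration recurs.

```python
from itertools import product

def liminf_recurring_priorities(cycle, k):
    """Set of priorities seen infinitely often when `cycle` repeats forever."""
    x = [0] * (k + 1)                 # x[i] for weights 1..k; index 0 unused
    first_seen = {}                   # memory at iteration start -> index
    log = []                          # priorities of each iteration
    while tuple(x) not in first_seen:
        first_seen[tuple(x)] = len(log)
        priorities = []
        for eve_weight, adam_weights in cycle:
            priorities.append(1)                              # letter move
            priorities.append(2 * (k - eve_weight + 1) - 1)   # Eve's move
            for run, w in enumerate(adam_weights, start=1):
                # x[i] is reset if the *other* run marked it earlier.
                reset = [i for i in range(w, k + 1) if x[i] not in (0, run)]
                for i in range(w, k + 1):
                    x[i] = 0 if i in reset else run
                priorities.append(2 * (k - reset[0] + 1) if reset else 1)
        log.append(priorities)
    recurring = log[first_seen[tuple(x)]:]
    return {p for iteration in recurring for p in iteration}

def eve_wins_parity_liminf(cycle, k):
    """Max-parity winner of the encoded play."""
    return max(liminf_recurring_priorities(cycle, k)) % 2 == 0

def eve_wins_g2_liminf(cycle):
    """Direct comparison of the LimInf values of the three runs."""
    eve = min(e for e, _ in cycle)
    adam = max(min(ws[j] for _, ws in cycle) for j in (0, 1))
    return eve >= adam
```

For instance, on the one-round cycle where Eve sees weight 2 and Adam's runs see weights 1 and 2, the variable x<sub>2</sub> is reset in every round, the highest recurring priority is 2, and Eve wins, matching the direct comparison 2 ≥ max(1, 2).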

### 5.2 G<sub>2</sub> Characterises HDness for LimSup and LimInf Automata

The rest of the section is dedicated to proving that a LimSup or LimInf automaton is HD if and only if Eve wins the 2-token game on it. In both cases, the structure of the argument is similar. One direction is immediate: if an automaton A is HD, then Eve can use the letter-game strategy to win in G<sub>2</sub>(A), ignoring Adam's tokens. The other direction requires more work. We use an additional notion, that of k-HDness, also known as the width of an automaton [21], which generalises HDness, in the sense that Eve maintains k runs, rather than only one, and needs at least one of them to be optimal. We will then show that if Eve wins G<sub>2</sub>(A), then A is k-HD for a finite k (namely, the number of weights in A minus one). Finally, we will show that for automata that are k-HD, for any finite k, a strategy for Eve in G<sub>2</sub>(A) can be combined with the k-HD strategy to obtain a strategy for her in the letter game.

Many of the tools used in this proof are familiar from the ω-regular setting [3,6]. The main novelty in the argument is the decomposition of the LimSup (LimInf) automaton A with k weights into k − 1 Büchi (coBüchi) automata A<sub>2</sub>, …, A<sub>k</sub> that are HD whenever Eve wins G<sub>2</sub>(A). (The converse does not hold, namely A<sub>2</sub>, …, A<sub>k</sub> can be HD even if Eve loses G<sub>2</sub>(A) – see Fig. 2.) The HD strategies for A<sub>2</sub>, …, A<sub>k</sub> can then be combined to prove the k-HDness of A.

Fig. 1 illustrates the flow of our arguments.

We first generalise to quantitative automata Bagnol and Kuperberg's key insight that if Eve wins G<sub>2</sub>, then she also wins G<sub>k</sub> for all k [3, Thm 14].

Fig. 1. The flow of arguments for showing that G<sub>2</sub>(A) ⟹ HD(A) for a LimInf or LimSup automaton A.

Fig. 2. A LimSup automaton A and corresponding Büchi automata A<sub>2</sub> and A<sub>3</sub>, as per Lemma 3. (Accepting transitions in A<sub>2</sub> and A<sub>3</sub> are marked with double lines.) Observe that A is not HD and Eve loses the two-token game on A, while both A<sub>2</sub> and A<sub>3</sub> are HD. (In A, if Eve goes from s<sub>0</sub> to s<sub>1</sub>, Adam goes from s<sub>0</sub> to s<sub>2</sub> and continues with an a, and if she goes from s<sub>0</sub> to s<sub>2</sub>, Adam goes from s<sub>0</sub> to s<sub>1</sub> and continues with a b. In A<sub>2</sub> Eve goes from s<sub>0</sub> to s<sub>1</sub> and in A<sub>3</sub> from s<sub>0</sub> to s<sub>2</sub>.)

Theorem 4. Given a quantitative automaton A, if Eve wins G<sub>2</sub>(A) then she also wins G<sub>k</sub>(A) for any k ∈ N \ {0}. Furthermore, if her winning strategy in G<sub>2</sub>(A) has memory of size m and A has n states, then she has a winning strategy in G<sub>k</sub>(A) with memory of size n<sup>k−1</sup> · m<sup>k</sup>.

Proof. This is the generalisation of [3, Thm 14]. The proof is similar to Bagnol and Kuperberg's original proof, but without assuming positional strategies for Eve in G<sub>k</sub>(A). If Eve wins G<sub>2</sub>(A) then she obviously wins G<sub>1</sub>(A), using her G<sub>2</sub> strategy with respect to two copies of Adam's single token in G<sub>1</sub>. We thus consider below G<sub>k</sub>(A) for every k ∈ N \ {0, 1, 2}.

Let s<sub>2</sub> be a winning strategy for Eve in G<sub>2</sub>(A). We inductively show that Eve has a winning strategy s<sub>i</sub> in G<sub>i</sub>(A) for each finite i. To do so, we assume a winning strategy s<sub>i−1</sub> in G<sub>i−1</sub>(A). The strategy s<sub>i</sub> uses some additional (not necessarily finite) memory that tracks the position of one virtual token in A, a position in the (not necessarily finite) memory structure of s<sub>i−1</sub>, and a position in the (not necessarily finite) memory structure of s<sub>2</sub>. The virtual token is initially at the initial state of A. The strategy s<sub>i</sub> then plays as follows: at each turn, after Adam has moved his i tokens and played a letter (or, at the first turn, just played a letter), it first updates the s<sub>i−1</sub> memory structure, by ignoring the last of Adam's tokens, and, treating the position of the virtual token as Eve's token in G<sub>i−1</sub>(A), it updates the position of the virtual token according to the strategy s<sub>i−1</sub>; it then updates the s<sub>2</sub> memory structure by treating Adam's last token and the virtual token as Adam's 2 tokens in G<sub>2</sub>(A), and finally outputs the transition to be played according to s<sub>2</sub>.

We now argue that this strategy is indeed winning in G<sub>i</sub>(A). Since s<sub>i−1</sub> is a winning strategy in G<sub>i−1</sub>(A), the virtual token traces a run of which the value is at least as large as the value of any of the runs built by the first i − 1 tokens of Adam. Since s<sub>2</sub> is also winning, the value of the run built by Eve's token is at least as large as the values of the runs built by the virtual token and by Adam's last token. Hence, Eve is guaranteed to achieve at least the supremum value of Adam's i runs, making this a winning strategy in G<sub>i</sub>(A).

As for the memory size of a winning strategy for Eve in G<sub>k</sub>(A), let m be the memory size of her winning strategy in G<sub>2</sub>(A) and n the number of states in A. Then, by the above construction of her strategy in G<sub>i</sub>(A), the memory of her strategy in G<sub>3</sub>(A) is n for the virtual token times m for the copy of her memory in G<sub>2</sub>(A) times m for the copy of her memory in G<sub>i−1</sub>(A) = G<sub>2</sub>(A), namely n · m · m = n · m<sup>2</sup>. Then for G<sub>4</sub>(A) it is n · m · (n · m<sup>2</sup>) = n<sup>2</sup> · m<sup>3</sup>; for G<sub>5</sub>(A) it is n · m · (n<sup>2</sup> · m<sup>3</sup>) = n<sup>3</sup> · m<sup>4</sup>, and in general for G<sub>k</sub>(A) it is n<sup>k−2</sup> · m<sup>k−1</sup>, which is within the stated bound of n<sup>k−1</sup> · m<sup>k</sup>. ⊓⊔
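The recurrence used in this memory count can be unrolled mechanically. The following sketch follows the construction step by step (one virtual token, one copy of the G<sub>2</sub> memory, one copy of the memory for one token fewer); the function name `memory_size` is an assumption made for this example.

```python
def memory_size(k, n, m):
    """Memory of the strategy built for G_k from a G_2 strategy of memory m.

    Each step from G_{i-1} to G_i multiplies the memory by n (virtual token)
    and by m (copy of the G_2 strategy's memory). Requires k >= 2.
    """
    if k < 2:
        raise ValueError("the construction starts from G_2")
    size = m                      # memory of the given strategy in G_2
    for _ in range(3, k + 1):
        size = n * m * size
    return size
```

Unrolling gives n · m<sup>2</sup> for k = 3, n<sup>2</sup> · m<sup>3</sup> for k = 4 and n<sup>3</sup> · m<sup>4</sup> for k = 5, in line with the values computed in the proof, and each of these is bounded by n<sup>k−1</sup> · m<sup>k</sup>.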

We proceed with the definition of k-HDness, also known as width [21], based on the k-runs letter game (not to be confused with G<sub>k</sub>, the k-token game), which generalises the letter game.

Definition 5 (k-HD and k-runs letter game). A configuration of the game on a LimSup (LimInf) automaton A = (Σ, Q, ι, δ) is a tuple q<sup>k</sup> ∈ Q<sup>k</sup> of states of A, initialised to ι<sup>k</sup>.

In a configuration (q<sub>i,1</sub>, …, q<sub>i,k</sub>), the game proceeds to the next configuration (q<sub>i+1,1</sub>, …, q<sub>i+1,k</sub>) as follows.

– Adam picks a letter σ<sub>i</sub> ∈ Σ, then

– Eve chooses for each q<sub>i,j</sub> a transition $q\_{i,j} \xrightarrow{\sigma\_i : x\_{i,j}} q\_{i+1,j}$.

In the limit, a play consists of an infinite word w that is derived from the concatenation of σ<sub>0</sub>, σ<sub>1</sub>, …, as well as of k infinite sequences ρ<sub>1</sub>, …, ρ<sub>k</sub> of transitions. Eve wins the play if max<sub>j∈{1…k}</sub> Val(ρ<sub>j</sub>) = A(w).

If Eve has a winning strategy, we say that A is k-HD, or that HD<sub>k</sub>(A) holds.

Notice that the standard letter game (Definition 1) is a 1-run letter game and standard HD (Definition 1) is 1-HD.

Next, we use HD<sub>k</sub>(A) to show that G<sub>2</sub> characterises HDness.

Proposition 4 ([3]). Given a quantitative automaton A, if HD<sub>k</sub>(A) holds for some k ∈ N, and Eve wins G<sub>k</sub>(A), then A is HD.

Proof. The argument is identical to the one used in [3], which we summarise here. The strategy τ for Eve in HD<sub>k</sub>(A) provides a way of playing k tokens that guarantees that one of the k runs formed achieves the automaton's value on the word w played by Adam. If Eve moreover wins G<sub>k</sub>(A) with some strategy s<sub>k</sub>, she can, in order to win in the letter game, play s<sub>k</sub> against Adam's letters and k virtual tokens that she moves according to τ. The winning strategy τ guarantees that one of the k runs built by the k virtual tokens achieves Val(w); then her strategy s<sub>k</sub> guarantees that her run also achieves Val(w). ⊓⊔

It remains to prove that if Eve wins G<sub>2</sub>(A), then HD<sub>k</sub>(A) holds for some k.

Given a LimSup automaton A, with weights {1, …, k}, we define k − 1 auxiliary Büchi automata A<sub>2</sub>, …, A<sub>k</sub> with acceptance on transitions, such that each A<sub>x</sub> is a copy of A, where a transition is accepting if its weight i in A is at least x. Each A<sub>x</sub> recognises the set of words w such that A(w) ≥ x. (See Fig. 2.)

Given a LimInf automaton A, we similarly define auxiliary coBüchi automata: A<sub>x</sub> is a copy of A where transitions with weights smaller than x are rejecting, while those with weights x or larger are accepting. Again, A<sub>x</sub> recognises the set of words w such that A(w) ≥ x.

We now use these auxiliary automata to argue that if Eve wins G<sub>2</sub>(A) then HD<sub>k−1</sub>(A) holds.
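The construction of the auxiliary automata amounts to relabelling transitions. The sketch below assumes a hypothetical dictionary encoding in which `delta` maps (state, letter) pairs to lists of (weight, successor) pairs with weights in {1, …, k}; in the result each weight is replaced by a Boolean flag that is true for accepting transitions. The same relabelling serves for both cases, read with a Büchi condition for LimSup and a coBüchi condition for LimInf.

```python
def auxiliary_automaton(delta, x):
    """Copy of the automaton in which a transition is accepting iff weight >= x."""
    return {
        key: [(weight >= x, successor) for weight, successor in moves]
        for key, moves in delta.items()
    }

def auxiliary_family(delta):
    """The automata A_2, ..., A_k, indexed by the threshold x."""
    k = max(weight for moves in delta.values() for weight, _ in moves)
    return {x: auxiliary_automaton(delta, x) for x in range(2, k + 1)}
```

No automaton is built for x = 1, since every transition has weight at least 1 and the condition A(w) ≥ 1 is trivial.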

Lemma 3. Given a LimSup or LimInf automaton A with weights {1, . . . , k}, if Eve wins G2(A), then for all x ∈ {2, . . . , k}, Eve also wins G2(Ax).

Proof. Since A<sup>x</sup> is identical to A except for the acceptance condition or value function, Eve can use in G2(Ax) her winning strategy in G2(A). For the LimSup case, if one of Adam's runs sees an accepting transition infnitely often, the underlying transition of A visited infnitely often has weight at least x. Then, Eve's strategy guarantees that her run also sees infnitely often a value at least as large as x, corresponding to an accepting transition in G2(Ax).

Similarly, for the LimInf case, if one of Adam's runs avoids seeing a rejecting transition infnitely often in Ax, then this run's value in A is at least x, and Eve's strategy guarantees that her run's value in A is at least x, meaning that it avoids seeing a rejecting transition in A<sup>x</sup> infnitely often, and accepts. ⊓⊔

Lemma 4. Given a LimSup or LimInf automaton A with weights {1, . . . , k}, if Eve wins G2(Ax) for all x ∈ {2, . . . , k} then HDk−1(A) holds.

Proof. From Lemma 3, if Eve wins G2(A), then for all x ∈ {2, . . . , k}, Eve also wins G2(Ax). Since each A<sup>x</sup> is a B¨uchi or coB¨uchi automaton, this implies that for all x ∈ {2, . . . , k}, the automaton A<sup>x</sup> is HD [3,6], that is, there is a winning strategy s<sup>x</sup> for Eve in the letter game on each Ax. Now, in the (k −1)-run letter game on A, Eve can use each s<sup>x</sup> to move one token. Then, if Adam plays a word w with some value Val(w) = i, this word is accepted by A<sup>i</sup> , and therefore the strategy s<sup>i</sup> guarantees that the run of the i th token achieves at least the value i, corresponding to seeing accepting transitions of A<sup>i</sup> infnitely often for the LimSup case, or eventually avoiding rejecting transitions in the LimInf case. ⊓⊔

Finally, we combine the G<sup>2</sup> and HDk−<sup>1</sup> strategies in A to show that A is HD.

Theorem 5. A nondeterministic LimSup or LimInf automaton A is HD if and only if Eve wins G2(A).

Proof. If A is HD then Eve can use the letter-game strategy to win in G2(A), ignoring Adam's moves. If Eve wins G2(A) then by Lemma 3 and Lemma 4 she wins HDk−1(A), where k is the number of weights in A. By Theorem 4 she also wins Gk−1(A) and, finally, by Proposition 4 we get that A is HD. ⊓⊔

Theorem 6. Given a nondeterministic LimSup (resp. LimInf) automaton A of size n with k weights, the HDness problem of A can be solved in time quasipolynomial (resp. exponential) in n. In both cases, if k is in O(log n), it can be solved in time polynomial in n.

Proof. It directly follows from Theorem 5 and Lemmas 1 and 2; the former reducing the HDness problem to solving G2(A), and the latter two showing that G2(A) can be solved in the stated complexity. ⊓⊔

In contrast to the cases considered in Section 4, where strategies in G<sup>1</sup> immediately induce HD strategies of the same complexity, for Büchi and coBüchi automata, a winning G<sup>2</sup> strategy does not necessarily induce an HD strategy (even though it implies the existence of such a strategy). We now analyse the size of the HD strategies which our proofs show exist whenever Eve wins G2, and discuss the implications for the determinisability of HD LimSup automata.

Corollary 2. Given an HD LimSup or LimInf automaton A of size n, there is an HD strategy for A with memory exponential in n. If A is a LimSup automaton with O(log n) weights then the memory is only polynomial in n.

Proof. Let n be the size of A and k + 1 the number of weights. We construct an HD strategy for A, by combining an HD<sup>k</sup> strategy and a G<sup>k</sup> strategy for it.

The HD<sup>k</sup> strategy, which, like the HD strategy, is hard to compute directly, combines the HD strategies of the k auxiliary Büchi or coBüchi automata for A, as constructed in Lemma 3. For HD Büchi automata, which are equivalent to deterministic automata of quadratic size [19], there always exists a polynomial resolver: indeed, the letter game can be represented as a polynomial parity game, in which a positional strategy for Eve corresponds to a resolver. For HD coBüchi automata on the other hand, these auxiliary strategies might have exponential memory in the number of states of A [19].

The G<sup>k</sup> strategy on the other hand is positional for LimSup, since it can be encoded as a parity game directly on the Gk(A) arena, similarly to the reduction in Lemma 1; the size of the Gk(A) arena is O(n<sup>k+1</sup>). The overall HD strategy for LimSup therefore needs memory exponential in the number of weights.

For LimInf on the other hand, by Lemma 2 and Theorem 4, the G<sup>k</sup> strategy can do with memory of size n k−1 · 3 k 2 . The overall HD strategy therefore has memory exponential in the size of A. ⊓⊔

We leave open whether this can be improved upon. Already for coBüchi automata, it is known that deciding whether an automaton is HD is polynomial despite there being automata for which the optimal HD strategy is exponential. Hence, at least for the LimInf case, we cannot expect to do much better. For the LimSup case, however, polynomial or even positional HD strategies might suffice, although positionality is already open for the Büchi case.

Our proof does however imply that if a LimSup automaton A is HD, then there is a finite-memory HD strategy, which implies that A is determinisable, without increasing the number of weights, by taking a product of A with the finite-memory HD strategy. (Recall that every LimInf automaton can be determinised, while not every LimSup automaton can.)

Corollary 3. Every HD LimSup automaton is equivalent to a deterministic one with at most an exponential number of states and the same set of weights.

### 6 Conclusions

We have extended the token-game approach to characterising history-determinism from the Boolean (ω-regular) to the quantitative setting. Already 1-token games turn out to be useful for characterising history-determinism for some quantitative automata. For LimSup and LimInf automata, one token is not enough, but the 2-token game does the trick. Given the correspondence between deciding history-determinism and the best-value synthesis problem, our results also directly provide algorithms both to decide whether the synthesis problem is realisable and to compute a solution strategy.

This application further motivates understanding the limits of these techniques. Whether the 2-token game G<sup>2</sup> characterises more general Boolean classes of automata beyond Büchi and coBüchi automata is already an open question. Similarly, we leave open whether the G<sup>2</sup> game also characterises history-determinism for limit-average automata and other quantitative automata. At the moment we are not aware of examples of automata of any kind (quantitative, pushdown, register, timed, . . . ) for which Eve could win G<sup>2</sup> despite the automaton not being history-deterministic, yet even for parity automata, a proof of characterisation remains elusive.

### Acknowledgments

We thank Guillermo A. Pérez for discussing history-determinism of discounted-sum and limit-average automata.

### References

1. Benjamin Aminof, Orna Kupferman, and Robby Lampert. Reasoning about online algorithms with weighted automata. ACM Trans. Algorithms, 6(2):28:1–28:36, 2010.



### On the Translation of Automata to Linear Temporal Logic<sup>⋆</sup>

Udi Boker<sup>1</sup>, Karoliina Lehtinen<sup>2</sup>, and Salomon Sickert<sup>3,⋆⋆</sup>

<sup>1</sup> Reichman University, Herzliya, Israel, udiboker@idc.ac.il

<sup>2</sup> CNRS, Aix-Marseille University and University of Toulon, LIS, Marseille, France, lehtinen@lis-lab.fr

<sup>3</sup> The Hebrew University, Jerusalem, Israel, salomon.sickert@mail.huji.ac.il

Abstract. While the complexity of translating future linear temporal logic (LTL) into automata on infinite words is well-understood, the size increase involved in turning automata back to LTL is not. In particular, there is no known elementary bound on the complexity of translating deterministic ω-regular automata to LTL.

Our first contribution consists of tight bounds for LTL over a unary alphabet: alternating, nondeterministic and deterministic automata can be exactly exponentially, quadratically and linearly more succinct, respectively, than any equivalent LTL formula. Our main contribution consists of a translation of general counter-free deterministic ω-regular automata into LTL formulas of double exponential temporal-nesting depth and triple exponential length, using an intermediate Krohn-Rhodes cascade decomposition of the automaton. To our knowledge, this is the first elementary bound on this translation. Furthermore, our translation preserves the acceptance condition of the automaton in the sense that it turns a looping, weak, Büchi, coBüchi or Muller automaton into a formula that belongs to the matching class of the syntactic future hierarchy. In particular, it can be used to translate an LTL formula recognising a safety language to a formula belonging to the safety fragment of LTL (over both finite and infinite words).

Keywords: Linear temporal logic · Automata · Cascade decomposition

### 1 Introduction

Linear Temporal Logic with only future temporal operators (from here on LTL) and ω-regular automata, whether deterministic, nondeterministic or alternating, are both well-established formalisms to describe properties of infinite-word languages. LTL is popular in formal verification and synthesis due to its simple syntax and semantics. Yet, while properties might be convenient to define in LTL, most verification and synthesis algorithms eventually compile LTL formulas into ω-regular automata. The expressiveness of both these key formalisms, as well as translations from LTL to automata of various types, are well understood. Here, we consider the converse translations, which, in comparison, have received less attention: up till now, no elementary upper bound on the size blow-up of going from automata to LTL was known.

<sup>⋆</sup> The omitted proofs of this chapter can be found in the full version [5].

<sup>⋆⋆</sup> Salomon Sickert is supported by the Deutsche Forschungsgemeinschaft (DFG) under project number 436811179.

© The Author(s) 2022

P. Bouyer and L. Schröder (Eds.): FoSSaCS 2022, LNCS 13242, pp. 140–160, 2022. https://doi.org/10.1007/978-3-030-99253-8\_8

Regarding expressive power, deterministic Muller automata, nondeterministic Büchi automata, and weak alternating automata recognise all ω-regular languages [21,40]. LTL-definable languages (surveyed in [13]) are a strict subset thereof, also defined by first-order logic, star-free regular expressions, aperiodic monoids, counter-free automata, and very weak alternating automata. As for succinctness, nondeterministic and alternating automata can be exponentially and double-exponentially more succinct than deterministic automata, respectively. Determinisation in particular has precise bounds [32,35,24,36,12,3].

The succinctness of various representations of LTL-definable languages is less clear: effective translations between the different models are far from straightforward, and their complexity is sometimes uncertain. In particular, to the best of our knowledge, up to now there has been no elementary bound even on the translation of deterministic counter-free automata, arguably the simplest automata model for this class of languages, into LTL formulas. (Considering LTL with both future and past temporal operators, there is a double-exponential upper bound on the length of the formula [26]<sup>4</sup>.) The complexity of obtaining a deterministic counter-free automaton from a nondeterministic one is also, to the best of our knowledge, open.

We study the complexity of translating automata to LTL (equivalently, to very weak alternating automata), considering formula length, size, and nesting depth of temporal operators.

We begin (Section 3), as a warm-up, with the unary alphabet case on finite words. We show that the size blow-up involved in translating deterministic, nondeterministic and alternating automata to LTL, when possible, is linear, quadratic and exponential, respectively, and these bounds are tight. In contrast, going from LTL to alternating, nondeterministic and deterministic automata is linear, exponential and double-exponential, respectively [33,41,19].

The case of non-unary alphabets is much more difficult. We provide a translation of counter-free deterministic ω-regular automata (with any acceptance condition) into LTL formulas with double exponential depth and triple exponential length. Our translation uses an intermediate Krohn-Rhodes reset cascade decomposition (wreath product) of deterministic automata, which is a deterministic automaton built from simple components.

Our main technical contribution consists of a translation of a reset cascade into an LTL formula of depth linear and length singly exponential in the number of cascade configurations. Combining this with Eilenberg's Holonomy translation of a semigroup into a cascade [14, Corollary II.7.2] and Pnueli and Maler's adaptation of it to automata [26, Theorem 3] (see Remark 1), we obtain a translation of counter-free deterministic ω-regular automata into LTL formulas of double exponential depth and triple exponential length. Our construction preserves the acceptance condition of the automaton in the sense that it turns a Büchi-looping, coBüchi-looping, weak, Büchi or coBüchi automaton into a formula that belongs to the matching class of the syntactic future hierarchy (see Definition 1 and [8]).

<sup>4</sup> See Remark 1 on whether the upper bound in [26] is single or double exponential.

### Related work

Finite words. While LTL is usually interpreted over infinite words, it also admits finite-word semantics that coincide with the finite word version of the other equivalent formalisms. The equivalence between FO and star-free languages on finite words is due to McNaughton and Papert [31]. Cohen, Perrin and Pin [10] used the Krohn-Rhodes decomposition to characterise the expressive power of LTL with only X and F (eventually), but do not provide bounds on the size trade-off between the different models. Wilke [42] gives a double-exponential translation from counter-free DFA to LTL. More recently, Bojańczyk provided an algebraically flavoured adaptation of Wilke's proof [2, Section 2.2.2].

Infinite words. With substantial effort over several decades, the above techniques have been extended to infinite words using intricate tools with opaque complexities. Ladner [22] and Thomas [38,39] for example extended the equivalence of star-free regular expressions and FO to infinite words, while the ω-extension of the equivalence with aperiodic languages is due to Perrin [34]. The correspondence with LTL is due to Kamp [18] and Gabbay, Pnueli, Shelah and Stavi [16]. Diekert and Gastin's survey [13] provides an algebraic translation into LTL via ω-monoids while Cohen-Chesnot gives a direct algebraic proof of the equivalence of star-free ω-regular expressions and LTL [11]. Wilke takes an automata-theoretic approach, using backward deterministic automata [43,44]. However, none of the above address the complexity of the transformations. Zuck's dissertation [46] gives a translation of star-free regular expressions into LTL, with at least nonelementary complexity. Subsequently, Chang, Manna and Pnueli [8] use Zuck's results to show that the levels of their hierarchy of future temporal properties coincide with syntactic fragments of LTL. Sickert and Esparza [37] gave an exponential translation of any LTL formula into level ∆<sub>2</sub> of this hierarchy.

### 2 Preliminaries

Languages. An alphabet Σ, of size |Σ|, is a finite set of letters. Σ<sup>∗</sup>, Σ<sup>+</sup>, and Σ<sup>ω</sup> denote the sets of finite, nonempty finite, and infinite words over Σ, respectively. A language of finite or infinite words is a subset of Σ<sup>∗</sup> or Σ<sup>ω</sup>, respectively. We write [i..j] and [i..j), with integers i ≤ j, for the sets {i, i + 1, . . . , j} and {i, i + 1, . . . , j − 1}, respectively. For a word w = σ<sub>0</sub> · σ<sub>1</sub> · · · , we write |w| for its length (∞ if w is infinite), w[i] for σ<sub>i</sub>, w[i..j] and w[i..j) for its corresponding infixes (w[i..i) is the empty word), and w[i..] for its (finite or infinite) suffix σ<sub>i</sub> · σ<sub>i+1</sub> · · · .

Linear Temporal Logic (LTL). Let AP be a finite set of atomic propositions. LTL formulas are constructed from the constant true, atomic propositions a ∈ AP, the connectives ¬ (negation) and ∧ (and), and the temporal operators U (until) and X (next). Their semantics are given by a satisfiability relation |= between finite or infinite words w ∈ (2<sup>AP</sup>)<sup>+</sup> ∪ (2<sup>AP</sup>)<sup>ω</sup> and a formula ϕ, inductively as follows:

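$$
\begin{array}{lcl}
w \models \mathbf{true} & & \\
w \models a & \text{iff} & a \in w[0] \\
w \models \neg\varphi & \text{iff} & w \not\models \varphi \\
w \models \varphi \wedge \psi & \text{iff} & w \models \varphi \text{ and } w \models \psi \\
w \models \mathbf{X}\varphi & \text{iff} & |w| > 1 \text{ and } w[1..] \models \varphi \\
w \models \varphi\,\mathbf{U}\,\psi & \text{iff} & \exists i \in [0..|w|).\ w[i..] \models \psi \text{ and } \forall j \in [0..i).\ w[j..] \models \varphi
\end{array}
$$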

We also use the common shortcuts false := ¬true, ϕ ∨ ψ := ¬((¬ϕ) ∧ (¬ψ)), Fϕ := true U ϕ, Gϕ := ¬F¬ϕ, and ψ<sub>1</sub> R ψ<sub>2</sub> := ¬((¬ψ<sub>1</sub>) U (¬ψ<sub>2</sub>)). The language of finite words of ϕ is L<sup>&lt;ω</sup>(ϕ) := {w ∈ (2<sup>AP</sup>)<sup>+</sup> | w |= ϕ}, and the language of infinite words is L(ϕ) := {w ∈ (2<sup>AP</sup>)<sup>ω</sup> | w |= ϕ}. Note that we omit the "&lt;ω" superscript if it is clear from the context which set is used.

The length |ϕ| of ϕ is the number of nodes in its syntax tree, the size of ϕ is the number of nodes in a DAG representing this syntax tree, and its temporal nesting depth, denoted by depth(ϕ), is defined by: depth(true) = 0; depth(a) = 0 for an atomic proposition a ∈ AP; depth(¬ψ) = depth(ψ); depth(ψ<sub>1</sub> ∧ ψ<sub>2</sub>) = max(depth(ψ<sub>1</sub>), depth(ψ<sub>2</sub>)); depth(Xψ) = depth(ψ) + 1; and depth(ψ<sub>1</sub> U ψ<sub>2</sub>) = max(depth(ψ<sub>1</sub>), depth(ψ<sub>2</sub>)) + 1.

Chang, Manna, and Pnueli define in [8] a syntactic hierarchy for LTL formulas (over infinite words):

#### Definition 1 (LTL Syntactic future hierarchy [8]<sup>5</sup>).


Σ<sub>1</sub> is referred to as syntactic co-safety formulas, Π<sub>1</sub> as syntactic safety formulas.
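The finite-word semantics of LTL and the temporal nesting depth defined in this section can be sketched directly as recursive functions. The following is a minimal illustration, not an implementation from the paper: the class names (`TrueF`, `Atom`, `Not`, `And`, `Next`, `Until`) and the functions `holds` and `depth` are hypothetical, and a word is assumed to be a nonempty Python sequence of sets of atomic propositions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Union


@dataclass(frozen=True)
class TrueF:
    pass


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Not:
    sub: "Formula"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Next:
    sub: "Formula"


@dataclass(frozen=True)
class Until:
    left: "Formula"
    right: "Formula"


Formula = Union[TrueF, Atom, Not, And, Next, Until]
Word = Sequence[FrozenSet[str]]  # a nonempty finite word over 2^AP


def holds(w: Word, phi: Formula, i: int = 0) -> bool:
    """Decide whether the suffix w[i..] satisfies phi (finite-word semantics)."""
    if isinstance(phi, TrueF):
        return True
    if isinstance(phi, Atom):
        return phi.name in w[i]
    if isinstance(phi, Not):
        return not holds(w, phi.sub, i)
    if isinstance(phi, And):
        return holds(w, phi.left, i) and holds(w, phi.right, i)
    if isinstance(phi, Next):
        # |w[i..]| > 1 means that position i + 1 exists.
        return i + 1 < len(w) and holds(w, phi.sub, i + 1)
    if isinstance(phi, Until):
        for k in range(i, len(w)):
            if holds(w, phi.right, k):
                return True
            if not holds(w, phi.left, k):
                return False
        return False
    raise TypeError(f"unknown formula: {phi!r}")


def depth(phi: Formula) -> int:
    """Temporal nesting depth: only X and U increase it."""
    if isinstance(phi, (TrueF, Atom)):
        return 0
    if isinstance(phi, Not):
        return depth(phi.sub)
    if isinstance(phi, And):
        return max(depth(phi.left), depth(phi.right))
    if isinstance(phi, Next):
        return depth(phi.sub) + 1
    return max(depth(phi.left), depth(phi.right)) + 1


w = [frozenset(), frozenset({"a"})]
eventually_a = Until(TrueF(), Atom("a"))  # F a as the shortcut true U a
print(holds(w, eventually_a), depth(Until(TrueF(), Next(Atom("a")))))
```

The evaluator is deliberately naive and may revisit the same subformula at the same position many times; a practical tool would memoise on pairs of subformula and position.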

Automata. A deterministic semiautomaton is a tuple D = (Σ, Q, δ), where Σ is an alphabet; Q is a finite nonempty set of states; and δ : Q × Σ → Q is a transition function, which we extend to finite words in the usual way. A path of D on a word w = σ<sub>0</sub> · σ<sub>1</sub> · · · is a sequence of states q<sub>0</sub>, q<sub>1</sub>, . . ., such that for every i < |w|, we have δ(q<sub>i</sub>, σ<sub>i</sub>) = q<sub>i+1</sub>.

It is a reset semiautomaton if for every letter σ ∈ Σ, either i) for every state q ∈ Q we have δ(q, σ) = q, or ii) there exists a state q′ ∈ Q, such that for every state q ∈ Q we have δ(q, σ) = q′.

<sup>5</sup> This extends [6,37] with negation, which can be removed via negation normal form.

It is counter free if for every state q ∈ Q, finite word u ∈ Σ<sup>+</sup>, and number n ∈ ℕ \ {0}, there is a self loop of q on u<sup>n</sup> iff there is a self loop of q on u.
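Both properties can be checked mechanically on small examples. The sketch below is an illustration under stated assumptions: the transition function is given as a Python dictionary from (state, letter) pairs to states, and the names `is_reset` and `is_counter_free` are hypothetical. The counter-free check enumerates all state transformations induced by nonempty words, which can take exponential time in the number of states, so it is only meant for toy inputs.

```python
from typing import Dict, Hashable, Sequence, Tuple

State = Hashable
Delta = Dict[Tuple[State, str], State]


def is_reset(states: Sequence[State], alphabet: Sequence[str], delta: Delta) -> bool:
    """Every letter acts either as the identity or as a constant map."""
    for a in alphabet:
        identity = all(delta[q, a] == q for q in states)
        constant = len({delta[q, a] for q in states}) == 1
        if not (identity or constant):
            return False
    return True


def is_counter_free(states: Sequence[State], alphabet: Sequence[str], delta: Delta) -> bool:
    """Brute-force check over the transformations induced by nonempty words."""
    index = {q: i for i, q in enumerate(states)}
    letters = [tuple(delta[q, a] for q in states) for a in alphabet]
    seen = set(letters)
    frontier = list(letters)
    while frontier:
        t = frontier.pop()
        for l in letters:
            # Transformation of the word u.a: apply t (for u), then l (for a).
            comp = tuple(l[index[t[i]]] for i in range(len(states)))
            if comp not in seen:
                seen.add(comp)
                frontier.append(comp)
    for t in seen:
        for i, q in enumerate(states):
            # Follow q under u, u^2, ...; a return to q must already occur after u.
            cur = t[i]
            for _ in range(len(states)):
                if cur == q:
                    break
                cur = t[index[cur]]
            if cur == q and t[i] != q:
                return False
    return True


# A modulo-2 counter: reading "a" flips the state, so it counts and is not reset.
flip = {(0, "a"): 1, (1, "a"): 0}
# "a" is the identity, "b" resets every state to 1.
reset = {(0, "a"): 0, (1, "a"): 1, (0, "b"): 1, (1, "b"): 1}
print(is_counter_free([0, 1], ["a"], flip), is_reset([0, 1], ["a", "b"], reset))
```

The inner loop relies on the fact that, if some power of u returns to q, the first such power is at most the number of states.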

A deterministic automaton is a tuple D = (Σ, Q, ι, δ, α), where (Σ, Q, δ) is a deterministic semiautomaton, ι ∈ Q is an initial state; and α is some acceptance condition, as detailed below. A run of D on a word w is a path of D on w that starts in ι. It is a reset or counter-free automaton if its semiautomaton is.

The acceptance condition of an automaton on finite words is a set F ⊆ Q; a run is accepting if it ends in a state q ∈ F. The acceptance condition of an ω-regular automaton, on infinite words, is defined with respect to the set inf(r) of states visited infinitely often along a run r. We define below several acceptance conditions that we use in the sequel; for other conditions, see, for example, [3].

The Muller condition is a set α = {M<sub>1</sub>, . . . , M<sub>k</sub>} of sets M<sub>i</sub> ⊆ Q of states, and a run r is accepting if there exists a set M<sub>i</sub> such that M<sub>i</sub> = inf(r). The Rabin condition is a set α = {(G<sub>1</sub>, B<sub>1</sub>), . . . , (G<sub>k</sub>, B<sub>k</sub>)} of pairs of sets of states, and r is accepting if there exists a pair (G<sub>i</sub>, B<sub>i</sub>) such that G<sub>i</sub> ∩ inf(r) ≠ ∅ and B<sub>i</sub> ∩ inf(r) = ∅. The Büchi (resp. coBüchi) condition is a set α ⊆ Q of states, and r is accepting if α ∩ inf(r) ≠ ∅ (resp. α ∩ inf(r) = ∅). A weak automaton is a Büchi automaton, in which every strongly connected component (SCC) contains only states in α or only states out of α. A looping automaton is a Büchi or coBüchi automaton, where all states are in α, except for a single sink state.
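Since each of these conditions only depends on the set inf(r), they can be written as simple predicates on that set. The sketch below assumes inf(r) is already known (for instance, as the set of states on the cycle of an ultimately periodic run); the function names are hypothetical.

```python
from typing import Hashable, Iterable, Set, Tuple

State = Hashable


def muller_accepts(inf: Set[State], alpha: Iterable[Set[State]]) -> bool:
    """Accept iff inf(r) equals one of the sets M_i."""
    return any(inf == m for m in alpha)


def rabin_accepts(inf: Set[State],
                  alpha: Iterable[Tuple[Set[State], Set[State]]]) -> bool:
    """Accept iff for some pair (G_i, B_i), inf(r) meets G_i and avoids B_i."""
    return any(bool(g & inf) and not (b & inf) for g, b in alpha)


def buchi_accepts(inf: Set[State], alpha: Set[State]) -> bool:
    """Accept iff some state of alpha is visited infinitely often."""
    return bool(alpha & inf)


def cobuchi_accepts(inf: Set[State], alpha: Set[State]) -> bool:
    """Accept iff no state of alpha is visited infinitely often."""
    return not (alpha & inf)


inf_r = {"q1", "q2"}  # states visited infinitely often by some run r
print(muller_accepts(inf_r, [{"q1"}, {"q1", "q2"}]))
print(rabin_accepts(inf_r, [({"q1"}, {"q3"})]))
print(buchi_accepts(inf_r, {"q2"}), cobuchi_accepts(inf_r, {"q2"}))
```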

Deterministic automata of the above types correspond to the hierarchy of temporal properties [28]: Looping-Büchi, looping-coBüchi, weak, Büchi, coBüchi, and Rabin/Muller deterministic automata define respectively safety, guarantee (co-safety), obligation, recurrence, persistence, and reactivity languages. If the language is also LTL-definable, then there exists an equivalent LTL formula in Π<sub>1</sub>, Σ<sub>1</sub>, ∆<sub>1</sub>, Π<sub>2</sub>, Σ<sub>2</sub>, and ∆<sub>2</sub>, respectively [8]. Every deterministic ω-regular automaton is equivalent to deterministic Muller and Rabin automata, where the Muller (but not always Rabin) one can be defined on the same semiautomaton.

Nondeterministic and alternating automata (to which we only refer in Section 3, on finite words over a unary alphabet) extend deterministic automata by having a transition function δ : Q × Σ → 2<sup>Q</sup> and δ : Q × Σ → (positive Boolean formulas over Q), respectively. (See, for example, [7] for formal definitions.)

### 3 Unary Alphabet

Kupferman, Ta-Shma and Vardi [20] compared the succinctness of different automata models when counting, that is, recognising the singleton language {a<sup>k</sup>} for some k over the singleton alphabet {a}. For the succinctness gap between automata and LTL, we study the task of recognising arbitrary languages over the unary alphabet, which can be seen as sets of integers, rather than a single integer.

For a unary alphabet, since there is only one infinite word, only languages on finite words are interesting. We thus consider LTL formulas over (no) atomic propositions AP = ∅, and automata on finite unary words over the corresponding alphabet Σ = 2<sup>AP</sup> = {∅}, where we use the shorthand a = ∅. The size of a deterministic automaton is the number of its states, of a nondeterministic automaton the number of its transitions, and of an alternating automaton the number of subformulas in its transition function.

We show that the size blow-up involved in translating deterministic, nondeterministic, and alternating automata to LTL, when possible, is linear, quadratic, and exponential, respectively.

In our analysis, we shall use the following folklore theorem, which extends Wolper's Theorem [45].

Proposition 1 (Extended Wolper's theorem, Folklore). Consider an LTL formula ϕ with depth(ϕ) = n over the atomic propositions AP, and let Σ = 2<sup>AP</sup>. Then for all words u ∈ Σ<sup>∗</sup>, v ∈ Σ<sup>+</sup> and t ∈ Σ<sup>ω</sup>, and numbers i, j > n, ϕ has the same truth value on the words uv<sup>i</sup>t and uv<sup>j</sup>t.
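A small worked instance, chosen here for illustration and not taken from the paper, shows how the threshold interacts with the depth. Assume AP = {p}, the formula ϕ = XXp of depth 2, the empty word u, the one-letter word v = ∅, and t = {p}<sup>ω</sup>. The formula only inspects the letter at position 2, so:

```latex
\[
  \emptyset^{\,i}\,\{p\}^{\omega} \models \mathbf{X}\mathbf{X}\,p
  \iff \text{the letter at position } 2 \text{ is } \{p\}
  \iff i \le 2 .
\]
```

Hence ϕ is false on uv<sup>i</sup>t for every i > 2, as the proposition predicts, while the words for i = 2 and i = 3 are distinguished, so for this formula the bound i, j > n cannot be relaxed to i, j ≥ n.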

We use this to establish that unary LTL describes only finite and co-finite properties, and that there is a tight relation between the depth of LTL formulas and the length of words above which they are all in or all out of the language.

Proposition 2. Given an LTL formula ϕ with depth(ϕ) = n on finite words over the unary alphabet {a}, a<sup>i</sup> ∈ L(ϕ) for all i > n or a<sup>i</sup> ∉ L(ϕ) for all i > n.

Proposition 3. Consider a language L ⊆ {a}<sup>+</sup> that agrees on all words of length over n, that is, has the same truth value on all such words. Then there is an LTL formula of size in O(n) with language L.
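One possible way to obtain such a formula is sketched below; the proof in the full version may proceed differently. The symbols introduced here are assumptions of this sketch: c ∈ {true, false} is the common truth value of all words longer than n, each b<sub>i</sub> ∈ {true, false} records whether a<sup>i</sup> ∈ L for 1 ≤ i ≤ n, and end abbreviates the formula that holds exactly on words of length 1.

```latex
\begin{align*}
  \mathit{end}  &:= \neg \mathbf{X}\,\mathbf{true}, \\
  \theta_{n+1}  &:= c, \\
  \theta_{i}    &:= (\mathit{end} \wedge b_i) \vee \mathbf{X}\,\theta_{i+1}
                   \qquad \text{for } i = n, n-1, \ldots, 1 .
\end{align*}
```

On a word a<sup>r</sup> with r = 1, the formula θ<sub>i</sub> evaluates to b<sub>i</sub>, since end holds and the X-disjunct fails; with r > 1 it evaluates to the value of θ<sub>i+1</sub> on a<sup>r−1</sup>. Unfolding from θ<sub>1</sub> on a<sup>m</sup> therefore yields b<sub>m</sub> when m ≤ n and c when m > n, so θ<sub>1</sub> has language L. Each level adds a constant number of nodes, giving size in O(n).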

We now establish the trade-off between LTL and alternating automata (AFA) over unary alphabets. AFA are closed under (linear) complementation, so we use a pumping argument to bound the length after which all words have the same truth value, giving an upper bound on the LTL formula.

Lemma 1. Every alternating automaton with n states that recognises an LTL-expressible language L ⊆ {a}<sup>+</sup> is equivalent to an LTL formula of size in O(2<sup>n</sup>).

We show next that this upper bound is tight. Consider the language {a 2 n−1 }, which, according to Proposition 2, is only recognised by LTL formulas of size at least 2<sup>n</sup>−<sup>1</sup> . It is recognised by a weak alternating automaton with 2n states and size in O(n), using an automaton based on Leiss's construction [23]. Intuitively, the alternating automaton represents an n-bit up-counter with two states for each bit, one for 1 and one for 0 (see Fig. 1), where the universal transitions enforce that nondeterministic transitions correctly update the counter.

Lemma 2 (Adaptation of [23, proof of Theorem 1]). For every n ∈ N \ {0}, there is a weak alternating automaton with 2n states and transition function of size in O(n) recognising the language {a 2 n−1 }.

We continue to nondeterministic automata (NFAs), for which the arguments are more involved as they do not allow for linear complementation.

Figure 1. An alternating automaton of size in O(n) recognising {a 2 n−1 }; here with n = 3, where the initial configuration is q1,<sup>0</sup> ∧ q2,<sup>0</sup> ∧ q3,0.

Lemma 3. Every nondeterministic automaton with n states recognising an LTL-expressible language L ⊆ {a}<sup>+</sup> is equivalent to an LTL formula of size in O(n<sup>2</sup>).

Proof sketch. For finite L, by a pumping argument, A only accepts words up to length n, and by Proposition 3 we are done. We now consider a co-finite L.

We use 2-way deterministic automata, which are deterministic automata that process words of the form ⊢w⊣, where ⊢ and ⊣ are start- and end-of-word markers respectively, and where transitions specify whether to read the letter to the right or to the left of the current position. They accept by reaching an end state, and reject by reaching a rejecting state or by failing to terminate [17], and every unary NFA A can be turned into a 2-way DFA D of size O(n<sup>2</sup>) [9].

We construct from an NFA A a 2-way DFA D, and then a 2-way DFA D′ of the same size that recognises a<sup>∗</sup> \ {a<sup>k</sup>}, where a<sup>k</sup> is the longest word not in L. We use the fact that a 2-way DFA of size m can be complemented into one of size 4m [17] to complement D′ into D″ that recognises {a<sup>k</sup>} and must therefore be of size at least k + 2 [1], so k, and by Proposition 2, an LTL formula for L, is in O(n<sup>2</sup>).

We now show that this upper bound is tight. The previous lower bound ideas do not work with nondeterminism, since we need n states to recognise {a<sup>n</sup>} [20]. Yet, we need not count exactly to n for achieving a lower bound. We can use a variant of a language used in [4, pages 10–11]: For every positive integer k, define the set of positive integers S<sub>k</sub> = {m > 0 | ∃i, j ∈ ℕ. m = ik + j(k + 1)}, and the language V<sub>k</sub> = {a<sup>m</sup> | m ∈ S<sub>k</sub>} ⊆ {a}<sup>∗</sup>.

Proposition 4 (Folklore, [4, Theorem 3]). For every k ∈ ℕ the number k<sup>2</sup> − k − 1 is the maximal number not in S<sub>k</sub>.
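The claim can be checked by brute force for small k. The sketch below assumes k ≥ 2 (for k = 1 every positive integer is in the set, so there is no positive gap to report) and searches a finite window up to k(k + 1); the window size and the function names `in_S` and `largest_gap` are assumptions of this example.

```python
def in_S(m: int, k: int) -> bool:
    """Is m > 0 of the form i*k + j*(k+1) with natural numbers i and j?"""
    if m <= 0:
        return False
    # Try every admissible j; the remainder must be a nonnegative multiple of k.
    return any((m - j * (k + 1)) % k == 0 for j in range(m // (k + 1) + 1))


def largest_gap(k: int) -> int:
    """Largest positive integer not in S_k, searched in the window [1, k*(k+1)]."""
    return max(m for m in range(1, k * (k + 1) + 1) if not in_S(m, k))


for k in range(2, 9):
    # The two printed columns after k coincide.
    print(k, largest_gap(k), k * k - k - 1)
```

The window suffices because once k consecutive integers belong to the set, adding multiples of k shows that all larger integers do as well.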

Proposition 5 ([4, proof of Theorem 4]). For every n ∈ ℕ, there is an NFA of size in O(n) recognising a co-finite language L ⊆ {a}<sup>∗</sup>, such that a<sup>k<sup>2</sup>−k−1</sup> is not in L, while for every t ≥ k<sup>2</sup> − k, we have that a<sup>t</sup> ∈ L.

Theorem 1. The size blow-up involved in translating deterministic, nondeterministic, and alternating automata on finite unary words to LTL, when possible, is Θ(n), Θ(n<sup>2</sup>), and Θ(2<sup>n</sup>), respectively.

### 4 General Alphabet

In this section we consider the more challenging task of turning counter-free ω-regular automata over arbitrary alphabets into LTL. We use the fact that these automata can be turned into reset cascade automata (Krohn-Rhodes-Holonomy decomposition), which we describe in Section 4.1. Our technical contribution is then the translation of reset cascade automata into LTL.

In brief, we build, in Section 4.2, a parameterised LTL formula that is satisfied by a word w iff the run of the cascade on w, starting in the parameter configuration S, reaches a parameter configuration T, such that the remaining suffix of w satisfies a parameter LTL formula τ . We then use this formula, in Section 4.4, to describe the automaton's acceptance condition.

When encoding the behavior of a cascade by an LTL formula, we need to overcome two major challenges: First, the cascade is a formalism that looks at the past, namely at the word read so far, to determine the next configuration, while an LTL formula obtains its value only from the future. Second, the cascade has an internal state, while an LTL formula does not. Our reachability formulas are therefore quite involved, built inductively over the number of levels in the cascade, and implicitly allowing to track the internal configuration of the cascade.

In Section 4.3 we analyse the length and depth of the resulting formulas.

### 4.1 Cascaded Automata

Cascades. A cascaded semiautomaton (analogous to the algebraic wreath product) over an alphabet Σ is a semiautomaton that can be described as a sequence of simple semiautomata, such that the alphabet of each of them is Σ together with the current state of each of the preceding semiautomata in the sequence. It is a reset cascade if it is a sequence of reset semiautomata. Formally, a cascaded semiautomaton, or just cascade, over alphabet Σ with n levels is a tuple A = ⟨Σ, A<sub>1</sub>, A<sub>2</sub>, . . . , A<sub>n</sub>⟩, such that A<sub>i</sub> = (Σ<sub>i</sub>, Q<sub>i</sub>, δ<sub>i</sub>) is a semiautomaton for each level i, where Σ<sub>i</sub> = Σ × Q<sub>1</sub> × · · · × Q<sub>i−1</sub>. (So Σ<sub>1</sub> = Σ, Σ<sub>2</sub> = Σ × Q<sub>1</sub>, etc.) It is a reset cascade if all A<sub>i</sub>'s are reset semiautomata.

An i-configuration S of A is a tuple ⟨q<sub>1</sub>, q<sub>2</sub>, . . . , q<sub>i</sub>⟩ ∈ Q<sub>1</sub> × · · · × Q<sub>i</sub>. If q<sub>i+1</sub> ∈ Q<sub>i+1</sub> is a state of level i + 1, we write ⟨S, q<sub>i+1</sub>⟩ for the (i + 1)-configuration ⟨q<sub>1</sub>, . . . , q<sub>i</sub>, q<sub>i+1</sub>⟩. Note that the 0-configuration is the empty tuple ⟨⟩. Further, we derive the transition relation for configurations by point-wise application of the respective δ<sub>i</sub>'s. We define δ<sub>≤i</sub>(⟨q<sub>1</sub>, q<sub>2</sub>, . . . , q<sub>i</sub>⟩, σ) as ⟨δ<sub>1</sub>(q<sub>1</sub>, ⟨σ⟩), δ<sub>2</sub>(q<sub>2</sub>, ⟨σ, q<sub>1</sub>⟩), . . .⟩. Note that we will omit the "≤ i"-subscript if it is clear from context, and by just writing "configuration", we mean an n-configuration.

Notice that A describes a standard semiautomaton D<sub>A</sub> over Σ, whose states are the configurations of A of level n, and its transition function is δ<sub>≤n</sub>. If there are up to j states in each level of A, there are up to j<sup>n</sup> states in D<sub>A</sub>. Observe that when A is a reset cascade, it can be translated to an equivalent reset cascade with up to n log j levels, and 2 states in each level [14, Ex. I.10.2].
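The point-wise transition on configurations and the induced semiautomaton on n-configurations can be sketched as follows. This is an illustration under assumptions: each level is given as a Python function taking its own state, the input letter, and the tuple of current states of the preceding levels; the names `step`, `run`, `flatten`, `d1` and `d2`, as well as the two-level example cascade, are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Hashable, Sequence, Tuple

State = Hashable
Config = Tuple[State, ...]
# Level i reads the input letter together with the states of levels 1..i-1.
LevelDelta = Callable[[State, str, Config], State]


def step(deltas: Sequence[LevelDelta], config: Config, letter: str) -> Config:
    """Point-wise application of the level transition functions.

    Every level sees the states of the preceding levels *before* the step.
    """
    return tuple(d(config[i], letter, config[:i]) for i, d in enumerate(deltas))


def run(deltas: Sequence[LevelDelta], config: Config, word: str) -> Config:
    for letter in word:
        config = step(deltas, config, letter)
    return config


def flatten(level_states: Sequence[Sequence[State]], alphabet: Sequence[str],
            deltas: Sequence[LevelDelta]) -> Dict[Tuple[Config, str], Config]:
    """Transition table of the semiautomaton whose states are the n-configurations."""
    return {(c, a): step(deltas, c, a)
            for c in product(*level_states) for a in alphabet}


def d1(q: State, sigma: str, below: Config) -> State:
    # Level 1: "a" resets to 1, "b" resets to 0, any other letter is the identity.
    return {"a": 1, "b": 0}.get(sigma, q)


def d2(q: State, sigma: str, below: Config) -> State:
    # Level 2: the combined letter ("a", level-1 state 1) resets to 1,
    # every combined letter with "b" resets to 0, all others are the identity.
    if sigma == "a" and below == (1,):
        return 1
    if sigma == "b":
        return 0
    return q


deltas = [d1, d2]
table = flatten([[0, 1], [0, 1]], "abc", deltas)
print(run(deltas, (0, 0), "aa"), len(table))
```

With two states per level and two levels, the flattened table has one entry per pair of configuration and letter, matching the bound of j<sup>n</sup> states stated above.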

For a state q ∈ Q<sub>i</sub> of level i of a reset cascade, we denote by Enter(q), Stay(q), and Leave(q) ⊆ Σ × Q<sub>1</sub> × · · · × Q<sub>i−1</sub> the sets of (combined) letters that enter q, stay in it, and leave it, respectively. These are sets of pairs ⟨σ, S⟩, where S is an (i−1)-configuration and σ ∈ Σ. Notice that Enter(q) ⊆ Stay(q), and that Leave(q) is the complement of Stay(q) (w.r.t. the relevant (combined) letters).

A semiautomaton (Σ, Q, δ) is homomorphic to a cascade ⟨Σ, A<sub>1</sub>, . . . , A<sub>n</sub>⟩ if there exists a partial surjective function ϕ: Q<sub>1</sub> × · · · × Q<sub>n</sub> → Q, such that for every σ ∈ Σ and S ∈ Q<sub>1</sub> × · · · × Q<sub>n</sub>, we have δ(ϕ(S), σ) = ϕ(δ<sub>≤n</sub>(S, σ)).

Proposition 6 (Part of the Krohn-Rhodes-Holonomy Decomposition [14, Corollary II.7.2], [26, Theorem 3]). Every counter-free deterministic semiautomaton D with n states is homomorphic to a reset cascade A with up to 2<sup>n</sup> levels and 2<sup>n</sup> states in each level.

Remark 1. The Krohn-Rhodes and Holonomy decomposition theorems consider also more general cascades and give results with respect to arbitrary semiautomata. The Holonomy decomposition in [14], as opposed to many other proofs of the Krohn-Rhodes decomposition, guarantees up to 2<sup>n</sup> levels with up to 2<sup>n</sup> states in each level. Yet, it shows that A covers D, allowing A to operate over an alphabet different from that of D. In [26,27,25], the algebraic proof of [14] is translated to an automata-theoretic one, providing the stated homomorphism. It is also stated in [26, Theorem 3.1], [27, Corollary 20], and [25, Corollary 2] that the number of configurations in A is singly exponential in n, but to the best of our understanding they do not provide an explicit proof for it.

Cascades with acceptance conditions. As a cascade A describes a standard semiautomaton (whose states are the configurations of A), we can add to it an initial configuration and an acceptance condition to make it a standard deterministic automaton. We show below that the homomorphism between an automaton and a cascade can be extended to also transfer the same acceptance condition.

Proposition 7. Let D be a deterministic Büchi, coBüchi or Rabin automaton, with a semiautomaton homomorphic to a cascade A. There is respectively a deterministic Büchi, coBüchi or Rabin automaton D′ equivalent to D with semiautomaton A. For Rabin, D and D′ have the same number of acceptance pairs.

Proposition 8. Consider a deterministic Muller automaton D with n states, whose semiautomaton is homomorphic to a reset cascade A with m configurations. Then there is a deterministic Muller automaton D′ equivalent to D, whose semiautomaton is A and whose Muller condition has up to 2<sup>O(mn)</sup> acceptance sets.

### 4.2 Encoding Reachability within Reset Cascades by LTL Formulas

For the rest of this section, let us fix a set of atomic propositions AP, an alphabet Σ = 2<sup>AP</sup>, and a reset cascade A = ⟨Σ, A<sub>1</sub>, A<sub>2</sub>, …, A<sub>n</sub>⟩.

The main reachability formula. For every level i of A, three configurations S, B and T of level i, and two LTL formulas β and τ, we will define the LTL formula $S \leadsto_{\cancel{B(\beta)}} T(\tau)$ with the intended semantics that it holds on a word w ∈ Σ<sup>ω</sup> iff A goes from the 'starting' configuration S to the 'target' configuration T along some prefix u of w, such that the suffix of w after u satisfies τ and the path along u avoids the 'bad' configuration B with a suffix satisfying β.

Auxiliary reachability formulas. We will formally define the main reachability formula by induction on the level i of the involved configurations, and using four auxiliary formulas, whose intended semantics is described in Table 1. These formulas distinguish between the case that the top-level state is unchanged along the reachability path, denoted with a solid arrow $\longrightarrow$, and the case that it is changed, denoted by a dashed arrow $\dashrightarrow$. They also have dual, weak, versions.

Observe that intuitively $S \leadsto_{\cancel{B(\beta)}} T(\tau)$ is an extended Until operator, while its dual $S \overset{\mathsf{weak}}{\leadsto}_{\cancel{B(\beta)}} T(\tau) = \neg\bigl(S \leadsto_{\cancel{T(\tau)}} B(\beta)\bigr)$ is an extended Weak until (or Release) operator. We build the formulas so that for appropriate choices of β and τ, the (strong) reachability formulas 1, 3, and 5 (as numbered in Table 1) are syntactic co-safety and the weak formulas 2 and 4 are syntactic safety formulas.

Formulas 1 and 2. The main formula is simply defined as the disjunction of two auxiliary formulas, corresponding to whether or not the top-level state changes, and its weak version is defined to be its dual.

$$S \leadsto_{\cancel{B(\beta)}} T(\tau) := \begin{cases} (\neg\beta)\,\mathbf{U}\,\tau & \text{if } S = \langle\rangle \\ S \xrightarrow[\cancel{B(\beta)}]{} T(\tau) \;\lor\; S \underset{\cancel{B(\beta)}}{\dashrightarrow} T(\tau) & \text{otherwise.} \end{cases}$$
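As a sanity check on the duality, consider level 0, where all configurations are empty and the main formula is (¬β) U τ. Unfolding the definition of the weak version with the standard LTL identities ¬(a U b) ≡ (¬a) R (¬b) and a R b ≡ b W (a ∧ b) gives the following derivation (the crossed-out label under the arrow is one way to render the paper's arrow notation in LaTeX and assumes the `cancel` package):

```latex
\begin{align*}
S \overset{\mathsf{weak}}{\underset{\cancel{B(\beta)}}{\leadsto}} T(\tau)
  &= \neg\bigl(S \underset{\cancel{T(\tau)}}{\leadsto} B(\beta)\bigr) \\
  &= \neg\bigl((\neg\tau)\,\mathbf{U}\,\beta\bigr) \\
  &\equiv \tau\,\mathbf{R}\,(\neg\beta) \\
  &\equiv (\neg\beta)\,\mathbf{W}\,(\tau \land \neg\beta)
\end{align*}
```

So at level 0 the weak formula requires ¬β to hold up to and including the first position where τ holds, or forever if τ never holds, which matches the reading of the weak version as an extended Weak until (or Release).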

Formula 3. Since the formula should ensure that the top-level state s is unchanged, we first distinguish between four cases, depending on which of the source configuration ⟨S, s⟩, bad configuration ⟨B, b⟩, and target configuration ⟨T, t⟩ are equal. The definitions of the four cases only differ in whether or not each of β and τ is satisfied in the first position of the word.

We define them using an intermediate common formula that is indifferent to the first position, which we mark by "> 0" on top of the arrow. We then define the "> 0" formula by using the main reachability formula with respect to a lower level, namely with respect to the configurations S and T instead of ⟨S, s⟩ and ⟨T, t⟩, and having corresponding disjunctions and conjunctions on all the combined letters of the top level that belong to Stay(s) and Leave(s).


Table 1. The intended semantics of reachability formulas. Orange subformulas show the difference between the auxiliary formulas and the first or second (main) formula.

$$\langle S,s\rangle \xrightarrow[\cancel{\langle B,b\rangle(\beta)}]{} \langle T,t\rangle(\tau) := \begin{cases} \langle S,s\rangle \xrightarrow[\cancel{\langle B,b\rangle(\beta)}]{>0} \langle T,t\rangle(\tau) & \text{if } \langle S,s\rangle \neq \langle B,b\rangle \text{ and } \langle S,s\rangle \neq \langle T,t\rangle \\ \langle S,s\rangle \xrightarrow[\cancel{\langle B,b\rangle(\beta)}]{>0} \langle T,t\rangle(\tau) \lor \tau & \text{if } \langle S,s\rangle \neq \langle B,b\rangle \text{ and } \langle S,s\rangle = \langle T,t\rangle \\ \langle S,s\rangle \xrightarrow[\cancel{\langle B,b\rangle(\beta)}]{>0} \langle T,t\rangle(\tau) \land \neg\beta & \text{if } \langle S,s\rangle = \langle B,b\rangle \text{ and } \langle S,s\rangle \neq \langle T,t\rangle \\ \Bigl(\langle S,s\rangle \xrightarrow[\cancel{\langle B,b\rangle(\beta)}]{>0} \langle T,t\rangle(\tau) \land \neg\beta\Bigr) \lor \tau & \text{if } \langle S,s\rangle = \langle B,b\rangle \text{ and } \langle S,s\rangle = \langle T,t\rangle \end{cases}$$

where

$$\langle S,s\rangle \xrightarrow[\cancel{\langle B,b\rangle(\beta)}]{>0} \langle T,t\rangle(\tau) := \bigvee_{\substack{\langle \sigma, T'\rangle \in \mathsf{Stay}(s) \text{ s.t.} \\ \langle T', s\rangle \xrightarrow{\sigma} \langle T, t\rangle}} \Biggl( \bigwedge_{\langle \eta, L\rangle \in \mathsf{Leave}(s)} S \leadsto_{\cancel{L(\eta)}} T'(\sigma \land \mathbf{X}\tau) \;\land \bigwedge_{\substack{\langle \rho, B'\rangle \in \mathsf{Stay}(s) \text{ s.t.} \\ \langle B', s\rangle \xrightarrow{\rho} \langle B, b\rangle}} S \leadsto_{\cancel{B'(\rho \land \mathbf{X}\beta)}} T'(\sigma \land \mathbf{X}\tau) \Biggr)$$

Formula 4. Its intended semantics is also that the top-level state s is unchanged, but we weaken Formula 3 by not enforcing that the target configuration ⟨T, t⟩ is reached and τ is satisfied. Thus, as long as the top-level state s stays unchanged and the bad configuration ⟨B, b⟩ is not reached while satisfying β, Formula 4 is also satisfied. Note that since both Formula 3 and Formula 4 need to ensure that the top-level state s is unchanged, they cannot simply be defined as the dual of each other. However, they share the same construction principle:

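$$\langle S,s\rangle \xrightarrow[\cancel{\langle B,b\rangle(\beta)}]{\mathsf{weak}} \langle T,t\rangle(\tau) := \begin{cases} \langle S,s\rangle \xrightarrow[\cancel{\langle B,b\rangle(\beta)}]{\mathsf{weak},>0} \langle T,t\rangle(\tau) & \text{if } \langle S,s\rangle \neq \langle B,b\rangle \text{ and } \langle S,s\rangle \neq \langle T,t\rangle \\ \langle S,s\rangle \xrightarrow[\cancel{\langle B,b\rangle(\beta)}]{\mathsf{weak},>0} \langle T,t\rangle(\tau) \lor \tau & \text{if } \langle S,s\rangle \neq \langle B,b\rangle \text{ and } \langle S,s\rangle = \langle T,t\rangle \\ \langle S,s\rangle \xrightarrow[\cancel{\langle B,b\rangle(\beta)}]{\mathsf{weak},>0} \langle T,t\rangle(\tau) \land \neg\beta & \text{if } \langle S,s\rangle = \langle B,b\rangle \text{ and } \langle S,s\rangle \neq \langle T,t\rangle \\ \Bigl(\langle S,s\rangle \xrightarrow[\cancel{\langle B,b\rangle(\beta)}]{\mathsf{weak},>0} \langle T,t\rangle(\tau) \lor \tau\Bigr) \land \neg\beta & \text{if } \langle S,s\rangle = \langle B,b\rangle \text{ and } \langle S,s\rangle = \langle T,t\rangle \end{cases}$$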

where

$$\begin{split} \left< S, s \right> \xrightarrow[\left(\mathsf{Z}\mathsf{Y}\mathsf{A}\right)\sharp\middleleftarrow\left<\tau, t\right>\left(\tau\right):=\\ \bigvee\_{\begin{subarray}{c}(\sigma,T')\in\mathsf{Sup}(s)\\ (\sigma,L')\in\mathsf{T}(T,t)\end{subarray}} \left(\bigwedge\_{\begin{subarray}{c}(\eta,L)\in\mathsf{L}\mathsf{W}(s)\\ (\eta,L)\in\mathsf{L}\mathsf{W}(s)\end{subarray}} S \xrightarrow{\mathsf{weak}} T' \left(\sigma\wedge\mathsf{X}\tau\right) \wedge \bigwedge\_{\begin{subarray}{c}(\rho,B')\in\mathsf{Sup}(s)\\ (\rho,B')\in\mathsf{Sup}(s)\end{subarray}} S \xrightarrow{\mathsf{weak}} T' \left(\sigma\wedge\mathsf{X}\tau\right) \right) \quad (1) \\ \bigvee\_{\begin{subarray}{c}(\tau,L)\in\mathsf{L}\mathsf{W}(s)\end{subarray}} \left(\bigwedge\_{\begin{subarray}{c}(\sigma,B')\in\mathsf{Sup}(s)\\ (\rho,B')\in\mathsf{Sup}(s)\end{subarray}} S \xrightarrow{\mathsf{weak}} S \xrightarrow{\mathsf{weak}} S \xrightarrow{\mathsf{weak}} S \xrightarrow{\mathsf{weak}} S \begin{subarray}{c} (\mathsf{F}\mathsf{L}\mathsf{L}\mathsf{L})\mathsf{L}\mathsf{R} \end{subarray}} S \xrightarrow{\mathsf{weak}} S \xrightarrow{\mathsf{weak}} S \begin{subarray}{c} (\mathsf{F}\mathsf{L}\mathsf{L}\mathsf{L}\mathsf{L})\mathsf{L}\mathsf{R} \end{subarray}} \left(\begin{subarray}{c} (\mathsf{F}\mathsf{L}\mathsf{L}\mathsf{L})\mathsf{L}\mathsf{L}\mathsf{L}\mathsf{L}\mathsf{L}\mathsf{L}\mathsf{L} \end{subarray}\right) \quad (1) \\ \bigvee\_{\begin{subarray}{c}(\sigma,L)\in\mathsf{L}\mathsf{L}\mathsf{W}(s)\end{subarray}} S \xrightarrow{\mathsf{weak}} \begin{subarray}{c} (\mathsf{F}\mathsf{L}$$

Formula 5. The definition of the last reachability formula is the most challenging, since the top-level state changes (s ≠ t), which prevents the direct usage of lower level configurations.

Intuitively, before reaching the target configuration ⟨T, t⟩, the run must see a combined letter ⟨σ, T′⟩ ∈ Enter(t), after which the top-level state t is preserved and the bad situation ⟨B, b⟩(β) is avoided. This is line (1) of the definition.

The run must also not see ⟨B, b⟩(β) before reaching T′, which is handled in line (2), whose difference from line (1) is the additional constraint on the path from S to T′. (Line (1) is required for the case that Enter(b) is empty.) We use Formula 4 for that constraint, rather than Formula 3, which could also be used, in order to ensure that Formula 5 can be a syntactic co-safety formula.

Lastly, line (3) ensures that the top-level state is indeed changed.

$$\begin{aligned} \langle S, s \rangle \xrightarrow[\langle \mathfrak{B}, \mathfrak{A} \rangle \downarrow \downarrow \downarrow \downarrow]{} \langle T, t \rangle \langle \tau \rangle &:= \\ \bigvee\_{\begin{subarray}{c} (\sigma, T') \in \\ \mathsf{char}(t) \end{subarray}} \left( S \underset{\begin{subarray}{c} \mathfrak{T} \mathsf{F} \mathsf{A} \mathsf{A} \mathsf{e} \downarrow \\ \mathsf{F} \mathsf{F} \mathsf{A} \mathsf{b} \downarrow \downarrow \end{subarray}} T' \left( \sigma \wedge \mathbf{X} \Big( \delta \,(\langle T', \cdot \rangle, \sigma) \xrightarrow[\widetilde{\mathsf{T} \mathsf{B} \mathsf{A} \mathsf{b} \downarrow \sharp \downarrow}]{} \langle T, t \rangle \,(\tau) \right) \right) \wedge & \tag{1} \end{aligned} \tag{1}$$

$$\bigwedge\_{\substack{(\eta,R)\in\operatorname{\mathbb{E}}\\ \operatorname{\bf{Extar}}(b)}} S \xrightarrow[R(\eta\wedge\mathbf{X}(\delta(\overline{\langle R,\gamma\rangle})\wedge\bigvee\_{\substack{\mathbf{W}\in\operatorname{\mathbf{W}}\mathbf{Z}\subseteq\mathbf{W}}\mathbf{Z}\mathbf{0})\rightarrow}(B,b)(\beta))\right] \quad(2)$$

$$\land \bigvee_{\langle \sigma, L \rangle \in \mathsf{Leave}(s)} \langle S, s \rangle \xrightarrow[\cancel{\langle B, b\rangle(\beta)}]{} \langle L, s \rangle \left( \sigma \land \begin{cases} \neg\beta & \text{if } \langle L, s \rangle = \langle B, b \rangle \\ \mathsf{true} & \text{otherwise.} \end{cases} \right) \tag{3}$$

We prove the correctness of the above definitions with respect to the intended meaning of Table 1 by induction on the level of the involved configurations.

Lemma 4. The intended semantics of Table 1 hold for all infinite words w ∈ Σ<sup>ω</sup> = (2<sup>AP</sup>)<sup>ω</sup>, configurations S, B, T of level m ≤ n, states s, b, t in level m + 1 (when m < n), and LTL formulas β and τ over AP.

Using the same induction principle we prove that the reachability formulas stay within certain classes of the syntactic future hierarchy (Definition 1). We use $S \leadsto_{\cancel{B(X)}} T(Y) \in Z$ as a shorthand for saying that for all formulas β ∈ X and τ ∈ Y, the formula $S \leadsto_{\cancel{B(\beta)}} T(\tau)$ is in Z.

Lemma 5. Let S, B, T be configurations of level m ≤ n, and let s, b, t be states in level m + 1 (when m < n). Then for i ≥ 1 it holds that:

$$\begin{array}{ll} - & S \leadsto_{\cancel{B(\Pi_i)}} T(\Sigma_i), \quad \langle S,s\rangle \xrightarrow[\cancel{\langle B,b\rangle(\Pi_i)}]{} \langle T,t\rangle(\Sigma_i), \quad \langle S,s\rangle \underset{\cancel{\langle B,b\rangle(\Pi_i)}}{\dashrightarrow} \langle T,t\rangle(\Sigma_i) \;\in\; \Sigma_i \\ - & S \overset{\mathsf{weak}}{\leadsto}_{\cancel{B(\Sigma_i)}} T(\Pi_i), \quad \langle S,s\rangle \xrightarrow[\cancel{\langle B,b\rangle(\Sigma_i)}]{\mathsf{weak}} \langle T,t\rangle(\Pi_i) \;\in\; \Pi_i \end{array}$$

#### 4.3 Depth and Length Analysis

We analyze the length and temporal-nesting depth of the LTL reachability formulas defined in Section 4.2. Notice that both measures are of independent interest, as there might be a non-elementary gap between the depth and length of LTL formulas [15, Theorem 6]. Since we provide upper bounds, the bound on the length of formulas obviously gives also a bound on their size.

We consider a reset cascade A with n levels, as in Section 4.2, and further assume for the length and depth analysis that it has up to n states in each level. (This assumption holds in the reset cascades that result from the Krohn-Rhodes decomposition as per Proposition 6.)

We define for each of the five reachability formulas a depth function D<sub>x</sub>(i, d) and a length function L<sub>x</sub>(i, l), where x refers to the number of the reachability formula, to bound the depth and length of the formulas. These depend on the level i of the input configurations S, B and T, and on the maximal depth d and length l of the input formulas β and τ. For the main (first) reachability formula, we also use D and L, standing for D<sub>1</sub> and L<sub>1</sub>. For example, the length of the first formula $S \leadsto_{\cancel{B(\beta)}} T(\tau)$ over configurations S, B and T of level 7 and formulas β and τ of length up to 77 is bounded by the value of L<sub>1</sub>(7, 77).

For simplicity, we consider the LTL representation of an alphabet letter σ ∈ Σ to be of length 1, while its actual length is 3 log<sub>2</sub> |Σ|. This increase is due to the need to encode an alphabet letter σ ∈ Σ = 2<sup>AP</sup> as a conjunction of atomic propositions in AP. The representation length can be multiplied by the total length of the final relevant formula (e.g., a formula equivalent to the entire reset cascade), since it remains constant along all steps of our inductive computation.

We provide in Table 2 upper bounds on the depth and length functions, relative to values of other depth and length functions with respect to configurations of the same or lower-by-one level. The table is constructed by following the syntactic definitions of the reachability formulas, and applying basic simplifications to the resulting expressions. For example, L<sub>1</sub>(0, l) = 2 + 2l, standing for the length of (¬β) U τ. In Lemma 6 we will use Table 2 to bound the absolute depth and length of the main reachability formula.

Depth Analysis. The temporal nesting depth of the main reachability formula $S \leadsto_{\cancel{B(\beta)}} T(\tau)$ is intuitively exponential in the number n of levels of the reset cascade (linear in the number of configurations), since it is defined inductively along these levels, and the depth of a level-(i + 1) formula is about twice the depth of a level-i formula. The parameters of the reachability formula are both the configurations S, B and T of level i, and the formulas β and τ; yet, the depth of the reachability formula depends only linearly on the depth of β and τ.

Length Analysis. Intuitively, the overall length of the main reachability formula $S \leadsto_{\cancel{B(\beta)}} T(\tau)$ with respect to configurations of the top level is doubly exponential in the number n of levels of the reset cascade (and thus singly exponential in the number of configurations), since the formula is defined inductively along these levels, and the length L(i, l) is roughly L(i−1, l) · L(i−1, l). More precisely, L(i, l) = l · f(i) for some doubly exponential function f(i).

Now, why is L(i, l) roughly equal to L(i−1, l) · L(i−1, l)? The dominant component of the level-i reachability formula is line (2) in the definition of $\langle S,s\rangle \underset{\cancel{\langle B,b\rangle(\beta)}}{\dashrightarrow} \langle T,t\rangle(\tau)$. It is a level-(i−1) reachability formula whose formula parameters are themselves auxiliary reachability formulas of level i with formula parameters of length l. The length of an auxiliary reachability formula of level i is roughly that of the main reachability formula of level i−1, implying that L<sub>i</sub>(l) is roughly L<sub>i−1</sub>(L<sub>i−1</sub>(l)). By the inductive proof that L<sub>i−1</sub>(l) = l · f(i−1), we get that L<sub>i</sub>(l) = L<sub>i−1</sub>(L<sub>i−1</sub>(l)) = L<sub>i−1</sub>(l) · f(i−1) = l · f(i−1) · f(i−1).
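The squaring behaviour of this recurrence can be seen on a toy instance. The sketch below is only an illustration under simplifying assumptions: it takes the level-0 length to be exactly l times a hypothetical constant `base_factor` and the level-i length to be exactly the self-composition of the level-(i−1) length, ignoring the lower-order terms that Table 2 accounts for.

```python
def compose_length(i, l, base_factor):
    """Toy recurrence: L_0(l) = l * base_factor and L_i(l) = L_{i-1}(L_{i-1}(l))."""
    if i == 0:
        return l * base_factor
    inner = compose_length(i - 1, l, base_factor)
    return compose_length(i - 1, inner, base_factor)


# The multiplicative factor f(i) = L_i(1) squares at each level,
# so f(i) = base_factor ** (2 ** i), which is doubly exponential in i.
for level in range(5):
    print(level, compose_length(level, 1, 3))
```

Under these assumptions the length stays linear in l while the factor f(i) is squared at every level, which is the source of the doubly exponential growth described above.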

As for the many disjunctions and conjunctions that appear in the formulas, observe that the number of disjuncts and conjuncts does not depend on the formula parameters β and τ, but only on the level i of the configurations S, B, and T. Hence, they do not dominate the growth rate of the overall formula length.

Table 2. The relative depths and lengths of the reachability formulas over configurations of level i, and LTL formulas β and τ of depth at most d and length at most l. For the first two reachability formulas, we consider i ≥ 0 and for the other formulas i ≥ 1.

Lemma 6. Consider a reset cascade A with n levels and up to n states in each level, and a formula $\zeta = S \leadsto_{\cancel{B(\beta)}} T(\tau)$ with configurations S, B and T of A of level i ≤ n. Let d = max(depth(β), depth(τ)) and let l = max(|β|, |τ|). Then:

> (a) depth(ζ) ≤ d + 3<sup>i</sup> and (b) |ζ| ≤ l · (10|Σ|<sup>2</sup>n)<sup>4<sup>i</sup></sup>

Lemma 6 is proven by induction on i and the details of this proof can be found in the full version [5].
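To get a feeling for how quickly these bounds grow, they can be evaluated directly. This is a minimal sketch, reading part (b) as l · (10|Σ|²n)^(4^i), which is the form used later in the proof of Theorem 2; the function names are illustrative, and the code only evaluates the stated upper bounds rather than computing actual formula sizes.

```python
def depth_bound(d, i):
    """Upper bound of Lemma 6(a): d + 3^i."""
    return d + 3 ** i


def length_bound(l, alphabet_size, n, i):
    """Upper bound of Lemma 6(b): l * (10 * |Sigma|^2 * n)^(4^i)."""
    return l * (10 * alphabet_size ** 2 * n) ** (4 ** i)


# Number of decimal digits of the length bound for a small cascade
# (|Sigma| = 2, n = 2, l = 1) at levels 0, 1 and 2.
for level in range(3):
    bound = length_bound(1, 2, 2, level)
    print(level, depth_bound(0, level), len(str(bound)))
```

Python integers are unbounded, so the doubly exponential length bound can be evaluated exactly for small levels; beyond that, only its number of digits is practical to inspect.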

#### 4.4 Translating Deterministic Counter-Free Automata to LTL

We use the reachability formulas of Section 4.2 to translate a reset cascade A to an equivalent LTL formula. Our LTL formulation of A's acceptance condition is based on an LTL formulation of "C is visited finitely/infinitely often along a run of A on a word w", for a given configuration C of A. It thus applies to every ω-regular acceptance condition and by Propositions 6 and 8 to every deterministic counter-free ω-regular automaton. We introduce two shorthands to the main reachability formula: the first is satisfied if we reach T from S without any side constraints (which is always satisfied in the case that S = T), and the second requires that we reach it along a nonempty prefix.

$$S \leadsto T := S \leadsto_{\cancel{S(\mathsf{false})}} T(\mathsf{true}) \qquad S \stackrel{>0}{\leadsto} T := \bigvee_{\sigma \in \Sigma} \bigl( \sigma \land \mathbf{X}(\delta(S, \sigma) \leadsto T) \bigr)$$

With Lemmas 4 and 5 we then obtain (the proof can be found in [5]):

Lemma 7. Consider a reset cascade A = ⟨2<sup>AP</sup>, A<sub>1</sub>, …, A<sub>n</sub>⟩ together with an initial configuration ι and some configuration C. Then for a word w ∈ (2<sup>AP</sup>)<sup>ω</sup>, the run of A on w starting in ι visits C finitely often iff w satisfies the formula $\mathit{Fin}(C) := \neg(\iota \leadsto C) \lor \iota \leadsto C\bigl(\neg(C \stackrel{>0}{\leadsto} C)\bigr)$. Furthermore, Fin(C) ∈ Σ<sub>2</sub>.
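The property that Fin(C) expresses can be tested directly on ultimately periodic words. The sketch below does not build or evaluate the LTL formula; it only decides the semantic side of the lemma (whether the run visits a given state finitely often), assuming a deterministic transition dictionary and a word of the form prefix · loop<sup>ω</sup>. The function name and the two-state example are illustrative.

```python
def visits_finitely_often(delta, initial, prefix, loop, target):
    """Decide whether the run on prefix . loop^omega visits `target` finitely often."""
    if not loop:
        raise ValueError("the loop of an ultimately periodic word must be nonempty")
    state = initial
    for letter in prefix:
        state = delta[(state, letter)]
    # Record the state at the start of each loop iteration until one repeats.
    seen_at_boundary = {}
    boundary_states = []
    while state not in seen_at_boundary:
        seen_at_boundary[state] = len(boundary_states)
        boundary_states.append(state)
        for letter in loop:
            state = delta[(state, letter)]
    # Boundary states from this index on recur forever.
    start = seen_at_boundary[state]
    for boundary in boundary_states[start:]:
        current = boundary
        for letter in loop:
            if current == target:
                return False
            current = delta[(current, letter)]
    return True


# Toy semiautomaton: "a" resets to state "p", "b" resets to state "q".
delta = {("p", "a"): "p", ("q", "a"): "p", ("p", "b"): "q", ("q", "b"): "q"}

print(visits_finitely_often(delta, "p", ["b"], ["a"], "q"))
print(visits_finitely_often(delta, "p", [], ["a", "b"], "q"))
```

On the word b · a<sup>ω</sup> the state q is visited once and never again, while on (ab)<sup>ω</sup> it recurs forever; the two calls distinguish exactly these situations.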

We are now in position to give our main result.

Theorem 2. Every counter-free deterministic ω-regular automaton D over alphabet 2<sup>AP</sup> with n states (and any acceptance condition) is equivalent to an LTL formula ϕ over atomic propositions AP of double-exponential temporal-nesting depth (in $O(2^{2^n})$) and triple-exponential length (in $2^{2^{O(2^n)}}$). If D is a looping-Büchi, looping-coBüchi, weak, Büchi, coBüchi, or Muller automaton then ϕ is respectively in the Π<sub>1</sub>, Σ<sub>1</sub>, Δ<sub>1</sub>, Π<sub>2</sub>, Σ<sub>2</sub>, or Δ<sub>2</sub> syntactic fragment of LTL.

Proof. We first prove the general result, w.r.t. an arbitrary counter-free deterministic automaton D, and then take into account D's acceptance condition, to establish the last part of the theorem.

Consider a counter-free deterministic ω-regular automaton D with some acceptance condition and n states. Recall that there is a Muller automaton D′ equivalent to D over the semiautomaton of D. By Propositions 6 and 8, D′ is equivalent to a deterministic Muller automaton D″ that is described by a reset cascade A with up to m = 2<sup>n</sup> levels and m states in each level (and thus up to m<sup>m</sup> configurations), and whose acceptance condition has up to $k \in 2^{O(m^m n)} = 2^{O(m^m)}$ acceptance sets. An LTL formula ϕ equivalent to D can be defined by formulating the acceptance condition of D′ along Lemma 7.

Recall that the Muller condition is a k-element disjunction, where each disjunct M is a conjunction of requirements to visit infinitely often every configuration from some set G and finitely often every configuration not in G. Observe that M can be formulated as a conjunction over all the configurations in D″ (at most m<sup>m</sup>), having for each configuration C the LTL formula Fin(C) or ¬Fin(C), as defined in Lemma 7, depending on whether or not C ∈ G. Hence, the overall formula ϕ is a combination of disjunctions and conjunctions of up to k · m<sup>m</sup> subformulas of the form Fin(C) or ¬Fin(C). Therefore, the depth of ϕ is the same as that of Fin(C), while $|\varphi| \in O(k\, m^m\, |\mathit{Fin}(C)|) \le 2^{O(m^m)} \cdot |\mathit{Fin}(C)|$. For calculating depth(Fin(C)) and |Fin(C)|, we use Lemma 6 bottom up over the subformulas of Fin(C).

Depth.

$$\begin{aligned} \mathsf{depth}(\iota \leadsto C) &\le 3^m \; ; \; \mathsf{depth}(C \stackrel{>0}{\leadsto} C) \le 3^m + 1 \\ \mathsf{depth}(\iota \leadsto C(\neg(C \stackrel{>0}{\leadsto} C))) &\le 2 \cdot 3^m + 1 \\ \mathsf{depth}(\mathit{Fin}(C)) &= \max(3^m, 2 \cdot 3^m + 1) \in O(3^m) = O(2^{2^n}), \\ \text{implying } \mathsf{depth}(\varphi) &\in O(2^{2^n}). \end{aligned}$$

Length.

$$\begin{split} |\iota \leadsto C| &\le (10|\Sigma|^2 m)^{4^m} \; ; \; |C \stackrel{>0}{\leadsto} C| \le (4|\Sigma|) \cdot (10|\Sigma|^2 m)^{4^m} \\ |\iota \leadsto C(\neg(C \stackrel{>0}{\leadsto} C))| &\le (4|\Sigma|(10|\Sigma|^2 m)^{4^m} + 1)(10|\Sigma|^2 m)^{4^m} \in \left( |\Sigma| m \right)^{2^{O(m)}} \\ |\mathit{Fin}(C)| &\in 2 + (10|\Sigma|^2 m)^{4^m} + \left( |\Sigma|m \right)^{2^{O(m)}} \in \left( |\Sigma|m \right)^{2^{O(m)}}. \end{split}$$

Therefore, $|\varphi| \in 2^{O(m^m)} \cdot m^m \cdot (|\Sigma| m)^{2^{O(m)}} = |\Sigma|^{2^{O(m)}}$.

Expressing the length of ϕ with respect to the number n of states in the automaton D, and taking into account the fact that the alphabet Σ has at most n<sup>n</sup> different letters (any additional letter must have the same behavior as another letter), we have: $|\varphi| \in |\Sigma|^{2^{O(2^n)}} \le (n^n)^{2^{O(2^n)}} = 2^{2^{O(2^n)}}$.
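The Boolean shape of ϕ described in the proof (one disjunct per Muller set G, and within it one Fin(C) or ¬Fin(C) subformula per configuration) can be sketched as follows. This is an illustrative sketch only: `muller_formula`, the textual `Fin(...)` placeholders and the operator symbols `!`, `&` and `|` are assumptions for this example, and the placeholders stand for the actual formulas of Lemma 7.

```python
def muller_formula(configurations, acceptance_sets):
    """Build the Boolean skeleton of the formula for a Muller condition.

    For each acceptance set G, configurations in G must be visited infinitely
    often (!Fin) and all other configurations finitely often (Fin).
    """
    disjuncts = []
    for good in acceptance_sets:
        literals = [
            f"!Fin({c})" if c in good else f"Fin({c})"
            for c in configurations
        ]
        disjuncts.append("(" + " & ".join(literals) + ")")
    return " | ".join(disjuncts) if disjuncts else "false"


configurations = ["C0", "C1"]
acceptance_sets = [{"C0"}, {"C0", "C1"}]
print(muller_formula(configurations, acceptance_sets))
```

The skeleton contains one Fin-literal per pair of acceptance set and configuration, which matches the count of up to k · m<sup>m</sup> subformulas used in the length estimate.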

We now sketch the second part of the theorem connecting the syntactic hierarchy and the different acceptance conditions of D. We only consider the cases in which D is either a Muller or a coBüchi automaton. The complete analysis is given in the full version [5]. If D is a Muller automaton, then the overall formula ϕ is in Δ<sub>2</sub>, since it is a Boolean combination of Fin(C) formulas, which by Lemma 7 belong to Σ<sub>2</sub>. If D is a coBüchi automaton, then we construct the formula ϕ directly from the coBüchi condition α: ϕ is a conjunction of Fin(C) formulas over all configurations C that are mapped to states in α. As Fin(C) belongs to Σ<sub>2</sub>, so does ϕ.

Observe that by Theorem 2, we get the following result, extending the result of [39, Theorem 3.2] that only considers Rabin automata.

Corollary 1. Every counter-free deterministic ω-regular automaton (with any acceptance condition) recognises an LTL-definable language.

Proof. Recall that every deterministic ω-regular automaton is equivalent to a deterministic Muller automaton over the same semiautomaton (see, e.g., [3]). The claim is then a direct consequence of Theorem 2.

Remark 2. Theorem 2 can be adapted to the finite-word setting. While on infinite words, the neXt operator is self-dual, i.e., ¬Xψ is equivalent to X¬ψ, over finite words, this equivalence does not hold on words of length 1. Thus X gains a dual weak next, defined as X̃ψ := ¬X¬ψ. In the finite-word case, syntactic cosafety (safety) formulas are constructed from true, false, a, ¬a, ∨, ∧, and the temporal operators U and X (R and X̃). Observe that X and X̃ differ only on words of length 1, and thus the only required change in our translation scheme is to replace some Xs with X̃s in the reachability formula 4. For finite words a translation of a counter-free DFA to an LTL formula with only a double exponential size blow-up is known [42]; however, unlike our translation, it does not guarantee syntactic safety (cosafety) formulas for safety (cosafety) languages.
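The difference between the two next operators over finite words is easy to see in code. The following is a minimal sketch, assuming a finite word given as a Python list and a formula given as a predicate on (word, position); the function names `strong_next` and `weak_next` are illustrative.

```python
def strong_next(word, i, holds):
    """X psi at position i: a next position exists and psi holds there."""
    return i + 1 < len(word) and holds(word, i + 1)


def weak_next(word, i, holds):
    """Weak next at position i: if a next position exists, psi holds there."""
    return i + 1 >= len(word) or holds(word, i + 1)


def always_true(word, j):
    return True


single_letter_word = ["a"]
# On a word of length 1 there is no next position, so the two operators differ.
print(strong_next(single_letter_word, 0, always_true))
print(weak_next(single_letter_word, 0, always_true))
```

With these definitions, the weak next of ψ coincides with the negation of the strong next of ¬ψ at every position, which is the duality stated in the remark.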

Lastly, we provide a corollary on looping automata, using Theorem 2 and the following known result.

Proposition 9 (Rephrased Theorem 13 from [29]). Let D be a deterministic looping-Büchi automaton with n states that recognises an LTL-definable language. Then there exists an equivalent counter-free deterministic looping-Büchi automaton D′ with at most n states.

Corollary 2. Every deterministic looping-Büchi (looping-coBüchi) automaton with n states that recognises an LTL-definable language is equivalent to an LTL formula ϕ ∈ Π<sub>1</sub> (Σ<sub>1</sub>) of temporal nesting depth in $O(2^{2^n})$ and length in $2^{2^{O(2^n)}}$.

This is an elementary upper bound for two constructions for which either the upper bound was unknown or non-elementary: the liveness-safety decomposition of LTL [29] and the translation of semantic safety LTL to syntactic safety LTL.

### 5 Conclusions

We have studied the size trade-offs between LTL and automata. Over a unary alphabet, the situation is straightforward and we provided tight complexity bounds. The general case of infinite words over an arbitrary alphabet is more complex. We gave, to our knowledge, the first elementary complexity bound on the translation of counter-free deterministic ω-regular automata into LTL formulas.

Every ω-regular automaton recognising an LTL-definable language can be translated to a counter-free deterministic automaton [39, Theorem 3.2]. Yet, we are unaware of a bound on the size blow-up involved in such a translation. Once established, it can be combined with our translation to get a general bound on the translation of automata to LTL. It will also provide a (currently unknown<sup>6</sup>) elementary upper bound on the translation of LTL with both future and past operators to LTL with only future operators (which is the version of LTL that we have considered), as (both versions of) LTL can be translated to nondeterministic Büchi automata with a single exponential size blow-up [41, Theorem 2.1].

While going from non-elementary to double-exponential depth and triple-exponential length is an improvement, these upper bounds might not be tight: there is currently no known non-linear lower bound! Closing this gap is a challenging open problem, which might require new lower bound techniques for alternating automata, as LTL formulas are an inherently alternating model.

Acknowledgements. We thank Moshe Vardi and Orna Kupferman for suggesting studying the succinctness gap between semantic and syntactic safe formulas, and Mikołaj Bojańczyk for answering our questions on algebraic automata theory.

<sup>6</sup> In consultation with the author of [30], we have confirmed that while the lower bound provided in that paper holds, the stated upper bound is erroneous.

### References



### Categorical composable cryptography<sup>⋆</sup>

Anne Broadbent and Martti Karvonen

Department of Mathematics and Statistics, University of Ottawa, Ottawa, Canada {abroadbe,martti.karvonen}@uottawa.ca

Abstract. We formalize the simulation paradigm of cryptography in terms of category theory and show that protocols secure against abstract attacks form a symmetric monoidal category, thus giving an abstract model of composable security definitions in cryptography. Our model is able to incorporate computational security, set-up assumptions and various attack models such as colluding or independently acting subsets of adversaries in a modular, flexible fashion. We conclude by using string diagrams to rederive the security of the one-time pad and no-go results concerning the limits of bipartite and tripartite cryptography, ruling out e.g., composable commitments and broadcasting.

Keywords: Cryptography · composable security · quantum cryptography · category theory

### 1 Introduction

Modern cryptographic protocols are complicated algorithmic entities, and their security analyses are often no simpler than the protocols themselves. Given this complexity, it would be highly desirable to be able to design protocols and reason about them compositionally, i.e., by breaking them down into smaller constituent parts. In particular, one would hope that combining protocols proven secure results in a secure protocol without need for further security proofs. However, this is not the case for stand-alone security notions that are common in cryptography. To illustrate such failures of composability, let us consider the history of quantum key distribution (QKD), as recounted in [60]: QKD was originally proposed in the 80s [7]. The first security proofs against unbounded adversaries followed a decade later [8, 49, 50, 64]. However, since composability was originally not a concern, it was later realized that the original security definitions did not provide a good enough level of security [42]: they didn't guarantee security if the keys were to be actually used, since even a partial leak of the key would compromise the rest. The story ends on a positive note, as eventually a new security criterion was proposed, together with stronger proofs [5, 62].

© The Author(s) 2022

<sup>⋆</sup> This work was supported by the Air Force Office of Scientific Research under award number FA9550-20-1-0375, Canada's NFRF and NSERC, an Ontario ERA, and the University of Ottawa's Research Chairs program.

P. Bouyer and L. Schröder (Eds.): FoSSaCS 2022, LNCS 13242, pp. 161–183, 2022. https://doi.org/10.1007/978-3-030-99253-8\_9

In this work we initiate a categorical study of composable security definitions in cryptography. In the viewpoint developed here one thinks of cryptography as a resource theory: cryptographic functionalities (e.g. secure communication channels) are viewed as resources and cryptographic protocols let one transform some starting resources to others. For instance, one can view the one-time-pad as a protocol that transforms an authenticated channel and a shared secret key into a secure channel. For a given protocol, one can then study whether it is secure against some (set of) attack model(s), and protocols secure against a fixed set of models can always be composed sequentially and in parallel.
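For readers unfamiliar with the one-time pad mentioned above, here is a minimal sketch of the protocol itself, assuming messages are byte strings and the shared key is uniformly random and as long as the message. The function names are illustrative, and the code says nothing about the categorical model developed in this paper.

```python
import secrets


def otp_encrypt(message: bytes, key: bytes) -> bytes:
    """XOR the message with a key of the same length."""
    if len(key) != len(message):
        raise ValueError("the key must be exactly as long as the message")
    return bytes(m ^ k for m, k in zip(message, key))


# Decryption is the same XOR, since (m ^ k) ^ k == m.
otp_decrypt = otp_encrypt

message = b"attack at dawn"
shared_key = secrets.token_bytes(len(message))  # the shared secret key resource
ciphertext = otp_encrypt(message, shared_key)   # sent over the authenticated channel
print(otp_decrypt(ciphertext, shared_key) == message)
```

In the resource-theoretic reading, `shared_key` plays the role of the shared secret key, the transmission of `ciphertext` uses the authenticated channel, and the composite behaves like a secure channel for a single message of that length; the key must not be reused.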

This is in fact the viewpoint taken in constructive cryptography [47], which also develops the one-time pad example above in more detail. However, [47] does not make a formal connection to resource theories as usually understood, whether as in quantum physics [16, 39], or more generally as defined in order-theoretic [32] or categorical [20] terms. Instead, constructive cryptography is usually combined with abstract cryptography [48], which is formalized in terms of a novel algebraic theory of systems [46].

Our work can be seen as a particular formalization of the ideas behind constructive cryptography, or alternatively as giving a categorical account of the real-world-ideal-world paradigm (also known as the simulation paradigm [34]), which underlies more concrete frameworks for composable security, such as universally composable cryptography [13] and others [2, 3, 38, 43, 44, 51, 58]. We will discuss these approaches, as well as abstract and constructive cryptography, in more detail in Section 1.1.

Our long-term goal is to enable cryptographers to reason about composable security at the same level of formality as stand-alone security, without having to fix all the details of a machine model and without having to master category theory. Indeed, our current results already let one define multipartite protocols and security against arbitrary subsets of malicious adversaries in any symmetric monoidal category $\mathbf{C}$. Thus, as long as one's model of interactive computation results in a symmetric monoidal category, or more informally, as long as one is willing to use pictures such as fig. 1d to depict connections between computational processes without further specifying the order in which the picture was drawn, one can use the simulation paradigm to reason about multipartite security against malicious participants composably. Specifying finer details of the computational model is only needed to the extent that it affects the validity of one's argument. Moreover, as our attack models and composition theorems are fairly general, we hope that more refined models of adversaries can be incorporated.

We now highlight our contributions to cryptography:

- We show how to adapt resource theories as categorically formulated [20] in order to reason abstractly about secure transformations between resources. This is done in Section 3 by formalizing the simulation paradigm in terms of an abstract attack model (Definition 1), designed to be general enough to capture standard attack models of interest (and more) while still structured enough to guarantee composability. This section culminates in Corollary 1, which shows that for any fixed set of attack models, the class of protocols secure against each of them forms a symmetric monoidal category. In Theorem 3 we observe that under suitable conditions, images of secure protocols under monoidal functors remain secure, which gives an abstract variant of the lifting theorem [68, Theorem 15] stating that perfectly UC-secure protocols are quantum UC-secure.
- We adapt this framework to model computational security in two ways: either by replacing equations with an equivalence relation, abstracting the idea of computational indistinguishability, as is done in Section 4, or by working with a notion of distance, deferred to a full version. In the case of a distance, one can then either explicitly bound the distance between desired and actually achieved behavior, or work with sequences of protocols that converge to the target in the limit: the former models working in the finite-key regime [67] and the latter models the kinds of asymptotic security and complexity statements that are common in cryptography.
- Finally, we apply the framework developed to study bipartite and tripartite cryptography. We first prove pictorially the security of the one-time pad. We then reprove the no-go theorems of [46, 48, 61] concerning two-party commitments (resp. three-party broadcasting) in this setting, and reinterpret them as limits on what can be achieved securely in any compact closed category (resp. symmetric monoidal category). The key steps of the proof are done graphically, thus opening the door for cryptographers to use such pictorial representations as rigorous tools rather than merely as illustrations.

Moreover, we discuss some categorical constructions capturing aspects of resource theories appearing in the physics literature. These contributions may be of independent interest for further categorical studies on resource theories. In [20] it is observed that many resource theories arise from an inclusion $\mathbf{C}_F \hookrightarrow \mathbf{C}$ of free transformations into a larger monoidal category, by taking the resource theory of states. We observe that this amounts to applying the monoidal Grothendieck construction [53] to the functor $\mathbf{C}_F \to \mathbf{C} \xrightarrow{\hom(I,-)} \mathbf{Set}$. This suggests applying this construction more generally to the composite of monoidal functors $F\colon \mathbf{D} \to \mathbf{C}$ and $R\colon \mathbf{C} \to \mathbf{Set}$. In Example 1 we note that choosing $F$ to be the $n$-fold monoidal product $\mathbf{C}^n \to \mathbf{C}$ captures resources shared by $n$ parties and $n$-partite transformations between them. In the extended version, we model categorically situations where there is a notion of distance between resources, and instead of exact resource conversions one either studies approximate transformations or sequences of transformations that succeed in the limit. In the extended version, we also discuss a variant of a construction on monoidal categories, used in special cases in [31] and discussed in more detail in [23, 33], that allows one to declare some resources free and thus enlarge the set of possible resource conversions.

#### 1.1 Related work

We have already mentioned that cryptographers have developed a plethora of frameworks for composable security, such as universally composable cryptography [13], reactive simulatability [2, 3, 58] and others [38, 43, 44, 51]. Moreover, some of these frameworks have been adapted to the quantum setting [6, 54, 68]. One might hence be tempted to think that the problem of composability in cryptography has been solved. However, it is fair to say that most mainstream cryptography is not formulated composably and that composable cryptography has yet to realize its full potential. Moreover, this proliferation of frameworks should be taken as evidence of the continued importance of the issue, which is in fact reflected by the existence of a recent Dagstuhl seminar on this matter [12]. Indeed, the aforementioned frameworks mostly consist of setting up fairly detailed models of interacting machines, which as an approach suffers from two drawbacks. First, in order to be more realistic, the detailed models are often complicated, both to reason in terms of and to define, thus making practicing cryptographers less willing to use them. Perhaps more importantly, it is not always clear whether the results proven in a particular model apply more generally to other kinds of machines, whether those of a competing framework or those in the real world. It is true that the choice of a concrete machine model does affect what can be securely achieved: for instance, quantum cryptography differs from classical cryptography, and similarly classical cryptography behaves differently in synchronous and asynchronous settings [4, 40]. Nevertheless, one might hope that composable cryptography could be done at a similar level of formality as complexity theory, where one rarely worries about the number of tapes in a Turing machine or about other low-level details of machine models. Second, changing the model slightly (e.g., to model different kinds of adversaries or to incorporate a different notion of efficiency) often requires reproving the "composition theorems" of the framework, or at least checking that the existing proof is not broken by the modification.

In contrast to frameworks based on detailed machine models, there are two closely related top-down approaches to cryptography: constructive cryptography [47] and its cousin abstract cryptography [48]. We are indebted to both of these approaches, and indeed our framework could be seen as formalizing the key idea of constructive cryptography, namely cryptography as a resource theory, and thus as occupying a similar space as abstract cryptography. A key difference is that constructive cryptography is usually instantiated in terms of abstract cryptography [48], which in turn is based on a novel algebraic theory of systems [46]. However, our work is not merely a translation from this theory to categorical language, as there are important differences and benefits that stem from formalizing cryptography in terms of a well-established and well-studied algebraic theory of systems, that of (symmetric) monoidal categories:

- The fact that cryptographers wish to compose their protocols sequentially and in parallel strongly suggests using monoidal categories, which have these composition operations as primitives. In our framework, protocols secure against a fixed set of attack models form a symmetric monoidal category. In contrast, the algebraic theory of systems [46] on which abstract cryptography is based takes parallel composition and internal wiring as its primitives. This design choice results in some technical kinks and tangles that are natural with any novel theory but have already been smoothed out in the case of category theory. For instance, in the algebraic theory of systems of [46] the parallel composition is a partial operation, and in particular the parallel composite of a system with itself is never defined<sup>1</sup>, and the set of wires coming out of a system is fixed once and for all<sup>2</sup>. In contrast, in a monoidal category parallel composition is a total operation, and whether one draws a box with $n$ output wires of types $A_1, \dots, A_n$ or a single output wire of type $\bigotimes_{i=1}^{n} A_i$ is a matter of convenience. Technical differences such as these make a direct formal comparison or translation between the frameworks difficult, even if informally and superficially there are similarities.

- We do not abstract away from an attacker model, but rather make it an explicit part of the formalism that can be modified without worrying about composability. This makes it possible to consider and combine different security properties very easily, and in particular paves the way to modeling attackers with limited powers such as honest-but-curious adversaries. In our framework, one can first fix a protocol transforming some resource to another one, and then discuss whether this transformation is secure against different attack models. In contrast, in abstract cryptography a cryptographic resource is a tuple of functionalities, one for each set of dishonest parties, and thus has no prior existence before fixing the attack model. This makes the question "what attack models is this protocol secure against?" difficult to formalize.

- As category theory is de facto the lingua franca between several subfields of mathematics and computer science, elucidating the categorical structures present in cryptography opens the door to further connections between cryptography and other fields. For instance, game semantics readily gives models of interactive, asynchronous and probabilistic (or quantum) computation [18, 19, 69] in which our theory can be instantiated, and thus further paves the way for programming language theory to inform cryptographic models of concurrency.

- Category theory comes with existing theory, results and tools that can readily be applied to questions of cryptographic interest. In particular, the graphical calculi of symmetric monoidal and compact closed categories [63] enable one to rederive impossibility results shown in [46, 48, 61] purely pictorially. In fact, such pictures were already often used as heuristic devices that illuminate the official proofs, and viewing these pictures categorically lets us promote them from mere illustrations to rigorous yet intuitive proofs. Indeed, in [48, Footnote 27] the authors suggest moving from a 1-dimensional symbolic presentation to a 2-dimensional one, and this is exactly what the graphical calculus achieves.

The approaches above result in a framework where security is defined so as to guarantee composability. In contrast, approaches based on various protocol logics [25–30] aim to characterize situations where composition can be done securely, even if one does not use composable security definitions throughout. As these approaches are based on process calculi, they are categorical under the hood [52, 55] even if not overtly so. There is also earlier work explicitly discussing category theory in the context of cryptography [9, 10, 21, 22, 35–37, 41, 56, 57, 65, 66], but it concerns stand-alone security of particular cryptographic protocols, rather than categorical aspects of composable security definitions.

<sup>1</sup> While the suggested fix is to assume that one has "copies" of the same system with disjoint wire labels, it is unclear how one recognizes or even defines in terms of the system algebra that two distinct systems are copies of each other.

<sup>2</sup> Indeed, while [59] manages to bundle and unbundle ports along isomorphism when convenient, it seems like the chosen technical foundation makes this more of a struggle than it should be.

### 2 Resource theories

We briefly review the categorical viewpoint on resource theories of [20]. Roughly speaking, a resource theory can be seen as a symmetric monoidal category (SMC), but the change in terminology corresponds to a change in viewpoint: usually in category theory one studies global properties of a category, such as the existence of (co)limits, relationships to other categories, etc. In contrast, when one views a particular SMC $\mathbf{C}$ as a resource theory, one is interested in local questions. One thinks of objects of $\mathbf{C}$ as resources, and of morphisms as processes that transform one resource into another. From this point of view, one mostly wishes to understand whether $\hom_{\mathbf{C}}(X, Y)$ is empty or not for resources $X$ and $Y$ of interest. Thus from the resource-theoretic point of view, most of the interesting information in $\mathbf{C}$ is already present in its preorder collapse. As concrete examples of resource-theoretic questions, one might wonder if (i) some noisy channels can simulate an (almost) noiseless channel [20, Example 3.13], (ii) there is a protocol that uses only local quantum operations and classical communication and transforms a particular quantum state to another one [17], (iii) some non-classical statistical behavior can be used to simulate other such behavior [1]. In [20] the authors show how many familiar resource theories arise in a uniform fashion: starting from an SMC $\mathbf{C}$ of processes equipped with a wide sub-SMC $\mathbf{C}_F$, the morphisms of which correspond to "free" processes, they build several resource theories (=SMCs). Perhaps the most important of these constructions is the resource theory of states: given $\mathbf{C}_F \hookrightarrow \mathbf{C}$, the corresponding resource theory of states can be explicitly constructed by taking the objects of this resource theory to be states of $\mathbf{C}$, i.e., maps $r\colon I \to A$ for some $A$, and maps $r \to s$ to be maps $f\colon A \to B$ in $\mathbf{C}_F$ that transform $r$ to $s$ as in fig. 1a.

We now turn our attention towards cryptography. As contemporary cryptography is both broad and complex in scope, any faithful model of it is likely to be complicated as well. A benefit of the categorical idiom is that we can build up to more complicated models in stages, which is what we will do in the sequel. We phrase our constructions in terms of an arbitrary SMC $\mathbf{C}$, but in order to model actual cryptographic protocols, the morphisms of $\mathbf{C}$ should represent interactive computational machines with open "ports", with composition then amounting to connecting such machines together. Different choices of $\mathbf{C}$ set the background for different kinds of cryptography, so that quantum cryptographers want $\mathbf{C}$ to include quantum systems, whereas in classical cryptography it is sufficient that these computational machines are probabilistic. Constructing such categories $\mathbf{C}$ in detail is not trivial and is outside our scope; we will discuss this in more detail in Section 6.

Our first observation is that there is no reason to restrict to inclusions $\mathbf{C}_F \hookrightarrow \mathbf{C}$ in order to construct a resource theory of states. Indeed, while it is straightforward to verify explicitly that the resource theory of states is a symmetric monoidal category, it is instructive to understand more abstractly why this is so: in effect, the constructed category is the category of elements of the composite functor $\mathbf{C}_F \to \mathbf{C} \xrightarrow{\hom(I,-)} \mathbf{Set}$. As this composite is a (lax) symmetric monoidal functor, the resulting category is automatically symmetric monoidal, as observed in [53]. Thus this construction goes through for any symmetric (lax) monoidal functors $\mathbf{D} \xrightarrow{F} \mathbf{C} \xrightarrow{R} \mathbf{Set}$. Here we may think of $F$ as interpreting free processes into an ambient category of all processes, and of $R\colon \mathbf{C} \to \mathbf{Set}$ as an operation that gives for each object $A$ of $\mathbf{C}$ the set $R(A)$ of resources of type $A$.

Explicitly, given symmetric monoidal functors $\mathbf{D} \xrightarrow{F} \mathbf{C} \xrightarrow{R} \mathbf{Set}$, the category of elements $\int RF$ has as its objects pairs $(r, A)$ where $A$ is an object of $\mathbf{D}$ and $r \in RF(A)$, the intuition being that $r$ is a resource of type $F(A)$. A morphism $(r, A) \to (s, B)$ is given by a morphism $f\colon A \to B$ in $\mathbf{D}$ that takes $r$ to $s$, i.e., satisfies $RF(f)(r) = s$. The symmetric monoidal structure comes from the symmetric monoidal structures of $\mathbf{D}$, $\mathbf{Set}$ and $RF$. Somewhat more explicitly, $(r, A) \otimes (s, B)$ is defined by $(r \otimes s, A \otimes B)$, where $r \otimes s$ is the image of $(r, s)$ under the function $RF(A) \times RF(B) \to RF(A \otimes B)$ that is part of the monoidal structure on $RF$, and on morphisms of $\int RF$ the monoidal product is defined from that of $\mathbf{D}$.
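As a minimal sketch of this construction, assume that D = C is the category of finite sets and functions, that F is the identity, and that R = hom(1, −), so that R(A) can be identified with the set A itself. The names `Resource`, `is_morphism` and `tensor` are illustrative only and do not come from [20] or [53].

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Hashable


@dataclass(frozen=True)
class Resource:
    """An object (r, A) of the category of elements: a point r of a finite set A."""
    point: Hashable
    carrier: FrozenSet

    def __post_init__(self):
        if self.point not in self.carrier:
            raise ValueError("r must be an element of A")


def is_morphism(f: Callable, source: Resource, target: Resource) -> bool:
    """f : A -> B is a map (r, A) -> (s, B) exactly when it lands in B and f(r) = s."""
    lands_in_target = all(f(a) in target.carrier for a in source.carrier)
    return lands_in_target and f(source.point) == target.point


def tensor(x: Resource, y: Resource) -> Resource:
    """(r, A) ⊗ (s, B) = ((r, s), A × B), using the cartesian product of sets."""
    carrier = frozenset((a, b) for a in x.carrier for b in y.carrier)
    return Resource((x.point, y.point), carrier)


bits = frozenset({0, 1})
zero, one = Resource(0, bits), Resource(1, bits)
flip = lambda b: 1 - b  # a function bits -> bits
```

In this toy setting `is_morphism(flip, zero, one)` holds while `is_morphism(flip, zero, zero)` does not, which is the condition RF(f)(r) = s in its simplest form.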

From now on we will assume that F is strong monoidal, and while R = hom(I, −) captures our main examples of interest, we will phrase our results for an arbitrary lax monoidal R. This relaxation allows us to capture the n-partite structure often used when studying cryptography, as shown next.

Example 1. Consider the resource theory induced by $\mathbf{C}^n \xrightarrow{\otimes} \mathbf{C} \xrightarrow{\hom(I,-)} \mathbf{Set}$, where we write $\otimes$ for the $n$-fold monoidal product<sup>3</sup>. The resulting resource theory has a natural interpretation in terms of $n$ agents trying to transform resources to others: an object of this resource theory corresponds to a pair $((A_i)_{i=1}^{n}, r\colon I \to \bigotimes_i A_i)$, and can be thought of as an $n$-partite state, depicted in fig. 1b, where the $i$th agent has access to a port of type $A_i$. A morphism $\bar f = (f_1, \dots, f_n)\colon ((A_i)_{i=1}^{n}, r) \to ((B_i)_{i=1}^{n}, s)$ between such resources then amounts to a protocol that prescribes, for each agent $i$, a process $f_i$ that they should perform so that $r$ gets transformed to $s$ as in fig. 1c.

In this resource theory, all of the agents are equally powerful and can perform all processes allowed by $\mathbf{C}$, and this might be unrealistic: first of all, $\mathbf{C}$ might include computational processes that are too powerful or expensive for us to use in our cryptographic protocols. Moreover, having agents with different computational powers is important to model e.g. blind quantum computing [11], where a client with access only to limited, if any, quantum computation tries to securely delegate computations to a server with a powerful quantum computer. This limitation is easily remedied: we could take the $i$th agent to be able to implement computations in some sub-SMC $\mathbf{C}_i$ of $\mathbf{C}$, and then consider $\prod_{i=1}^{n} \mathbf{C}_i \to \mathbf{C}$.

<sup>3</sup> As C is symmetric, the functor ⊗ is strong monoidal.

Fig. 1: (a) A map f in the resource theory of states. (b) An n-partite state. (c) An n-partite transformation. (d) Factorization of an attack on f ⊗ g.

A more serious limitation is that such transformations have no security guarantees: they only work if each agent performs $f_i$ as prescribed by the protocol. We fix this next.

### 3 Cryptography as a resource theory

Fig. 2: Attacks and security constraints. (a) Attack by the parties k + 1, . . . , n. (b) Security against the parties k + 1, . . . , n. (c) Security against the initial attack.

In order for a protocol $\bar f = (f_1, \dots, f_n)\colon ((A_i)_{i=1}^{n}, r) \to ((B_i)_{i=1}^{n}, s)$ to be secure, we should have some guarantees about what happens if, as a result of an attack on the protocol, something other than $(f_1, \dots, f_n)$ happens. For instance, some subset of the parties might deviate from the protocol and do something else instead. In the simulation paradigm [34], security is then defined by saying that anything that could happen when running the real protocol, i.e., $\bar f$ with $r$, could also happen in the ideal world, i.e., with $s$. A given protocol might be secure against some kinds of attacks and insecure against others, so we define security against an abstract attack model. This abstract notion of an attack model is one of the main definitions of our paper. It isolates the conditions needed for the composition theorem (Theorem 1). It also captures the key examples that we use to illustrate the definition after giving it. Note that most proofs are deferred to an extended version.

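Definition 1. An attack model $\mathcal{A}$ on an SMC $\mathbf{C}$ consists of giving for each morphism $f$ of $\mathbf{C}$ a class $\mathcal{A}(f)$ of morphisms of $\mathbf{C}$ such that

(i) $f \in \mathcal{A}(f)$ for every $f$;

(ii) for any $f\colon A \to B$ and $g\colon B \to C$, and any composable $g' \in \mathcal{A}(g)$ and $f' \in \mathcal{A}(f)$, we have $g' \circ f' \in \mathcal{A}(g \circ f)$; moreover, any $h \in \mathcal{A}(g \circ f)$ factorizes as $g' \circ f'$ with $g' \in \mathcal{A}(g)$ and $f' \in \mathcal{A}(f)$;

(iii) for any $g\colon A \to B$ and $f\colon C \to D$, and any $g' \in \mathcal{A}(g)$ and $f' \in \mathcal{A}(f)$, we have $g' \otimes f' \in \mathcal{A}(g \otimes f)$; moreover, any $h \in \mathcal{A}(g \otimes f)$ factorizes as $h' \circ (g' \otimes f')$ with $g' \in \mathcal{A}(g)$, $f' \in \mathcal{A}(f)$ and $h' \in \mathcal{A}(\mathrm{id}_{B \otimes D})$.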

Let $f\colon (A, r) \to (B, s)$ define a morphism in the resource theory $\int RF$ induced by $F\colon \mathbf{D} \to \mathbf{C}$ and $R\colon \mathbf{C} \to \mathbf{Set}$. We say that $f$ is secure against an attack model $\mathcal{A}$ on $\mathbf{C}$ (or $\mathcal{A}$-secure) if for any $f' \in \mathcal{A}(F(f))$ with $\mathrm{dom}(f') = F(A)$ there is $b \in \mathcal{A}(\mathrm{id}_{F(B)})$ with $\mathrm{dom}(b) = F(B)$ such that $R(f')r = R(b)s$.

The above definition of security asks for perfect equality and corresponds to information-theoretic security in cryptography. This is often too much to hope for, and we will replace this by an equivalence relation in Section 4 and by a notion of distance in an extended version.

The intuition is that $\mathcal{A}$ gives, for each process in $\mathbf{C}$, the set of behaviors that the attackers could force to happen instead of honest behavior. In particular, $\mathcal{A}(\mathrm{id}_B)$ gives the set of behaviors that is available to attackers given access to a system of type $B$. Then property (i) amounts to the assumption that the adversaries could behave honestly. The first halves of properties (ii) and (iii) say that, given an attack on $g$ and one on $f$, both attacks could happen when composing $g$ and $f$ sequentially or in parallel. The second halves say that attacks on composite processes can be understood as composites of attacks. However, note that (iii) does not say that an attack on a product has to be a product of attacks: the factorization says that any $h \in \mathcal{A}(g \otimes f)$ factorizes as in fig. 1d with $g' \in \mathcal{A}(g)$, $f' \in \mathcal{A}(f)$ and $h' \in \mathcal{A}(\mathrm{id}_{B \otimes D})$. The intuition is that an attacker does not have to attack two parallel protocols independently of each other, but might play the protocols against each other in complicated ways. This intuition also explains why we do not require that all morphisms in $\mathcal{A}(f)$ have $F(A)$ as their domain, despite the definition of $\mathcal{A}$-security quantifying only over those: when factoring $h \in \mathcal{A}(g \circ f)$ as $g' \circ f'$ with $g' \in \mathcal{A}(g)$ and $f' \in \mathcal{A}(f)$, we can no longer guarantee that $F(B)$ is the domain of $g'$, since the attackers may take us elsewhere when they perform $f'$.

If one thinks of $F\colon \mathbf{D} \to \mathbf{C}$ as representing the inclusion of free processes into general processes, one also gets an explanation of why we do not insist that free processes and attacks live in the same category, i.e., that $F = \mathrm{id}_{\mathbf{C}}$. This is simply because we might wish to prove that some protocols are secure against attackers that can use more resources than we wish to, or can, use in the protocols.

Example 2. For any SMC $\mathbf{C}$ there are two trivial attack models: the minimal one defined by $\mathcal{A}(f) = \{f\}$ and the maximal one sending $f$ to the class of all morphisms of $\mathbf{C}$. We interpret the minimal attack model as representing honest behavior, and the maximal one as representing arbitrary malicious behavior.

Proposition 1. If $\mathcal{A}_1, \dots, \mathcal{A}_n$ are attack models on SMCs $\mathbf{C}_1, \dots, \mathbf{C}_n$ respectively, then there is a product attack model $\prod_{i=1}^{n} \mathcal{A}_i$ on $\prod_{i=1}^{n} \mathbf{C}_i$ defined by $\bigl(\prod_{i=1}^{n} \mathcal{A}_i\bigr)(f_1, \dots, f_n) = \prod_{i=1}^{n} \mathcal{A}_i(f_i)$.
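The two trivial attack models and their product can be sketched as membership predicates. This is only an illustration under simplifying assumptions: morphisms are represented by opaque labels, the function names are hypothetical, and the closure conditions of Definition 1 are not checked by the code.

```python
from typing import Callable, Sequence

# An attack model is represented by a predicate: is `candidate` in A(f)?
AttackModel = Callable[[object, object], bool]


def minimal(f, candidate) -> bool:
    """Minimal attack model: A(f) = {f}, i.e. honest behavior only."""
    return candidate == f


def maximal(f, candidate) -> bool:
    """Maximal attack model: A(f) contains every morphism."""
    return True


def product_model(models: Sequence[AttackModel]) -> AttackModel:
    """Product attack model: (f'_1, ..., f'_n) attacks (f_1, ..., f_n)
    exactly when each f'_i lies in A_i(f_i)."""
    def member(fs, candidates) -> bool:
        return (len(fs) == len(candidates) == len(models)
                and all(m(f, c) for m, f, c in zip(models, fs, candidates)))
    return member


# Three parties: the first two are honest, the third is malicious.
model = product_model([minimal, minimal, maximal])
```

With this choice, `model(("f1", "f2", "f3"), ("f1", "f2", "anything"))` is true, while any tuple that changes the first or second component is rejected.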

This proposition, together with the minimal and maximal attack models, is already expressive enough to model multi-party computation where some subset of the parties might behave in an arbitrary malicious way. Indeed, consider the $n$-partite resource theory induced by $\mathbf{C}^n \xrightarrow{\otimes} \mathbf{C} \xrightarrow{\hom(I,-)} \mathbf{Set}$. Let us first model a situation where the first $n - 1$ participants are honest and the last participant is dishonest. In this case we can set $\mathcal{A} = \prod_{i=1}^{n} \mathcal{A}_i$, where each of $\mathcal{A}_1, \dots, \mathcal{A}_{n-1}$ is the minimal attack model on $\mathbf{C}$ and $\mathcal{A}_n$ is the maximal attack model. Then an attack on $\bar f = (f_1, \dots, f_n)\colon ((A_i)_{i=1}^{n}, r) \to ((B_i)_{i=1}^{n}, s)$ can be represented by the first $n - 1$ parties obeying the protocol and the $n$-th party doing an arbitrary computation $a$, as depicted in the two pictures of fig. 2a, where $[n] := \{1, \dots, n\}$, $(k, n] := \{k+1, \dots, n\}$, $\bar f|_{[k]} := \bigotimes_{i=1}^{k} f_i$, and here $k = n - 1$. The latter representation will be used when we do not need to emphasize pictorially the fact that the honest parties are each performing their own individual computations.

If instead of just one attacker there are several independently acting adversaries, we can take $\mathcal{A} = \prod_{i=1}^{n} \mathcal{A}_i$, where $\mathcal{A}_i$ is the minimal or maximal attack structure depending on whether the $i$th participant is honest or not. If the set of dishonest parties can collude and communicate arbitrarily during the process, we need the flexibility given in Definition 1 and have the attack structure live in a different category than where our protocols live. For simplicity of notation, assume that the first $k$ agents are honest but the remaining parties are malicious and might do arbitrary (joint) processes in $\mathbf{C}$. In particular, the action done by the dishonest parties $k + 1, \dots, n$ need not be describable as a product $\bigotimes_{i=k+1}^{n} a_i$ of individual actions. In that case we define $\mathcal{A}$ as follows: we first consider our resource theory as arising from $\mathbf{C}^n \xrightarrow{\mathrm{id}^k \times \otimes} \mathbf{C}^k \times \mathbf{C} \xrightarrow{\otimes} \mathbf{C} \xrightarrow{\hom(I,-)} \mathbf{Set}$, and define $\mathcal{A}$ on $\mathbf{C}^k \times \mathbf{C}$ as the product of the minimal attack model on $\mathbf{C}^k$ and the maximal one on $\mathbf{C}$. Concretely, this means that the first $k$ agents always obey the protocol, but the remaining agents can choose to perform arbitrary joint behaviors in $\mathbf{C}$. Then a generic attack on a protocol $\bar f$ can be represented exactly as before in fig. 2a, except that we no longer insist that $k = n - 1$. Now a protocol $\bar f$ is $\mathcal{A}$-secure if for any $a$ with $\mathrm{dom}(a) = (A_i)_{i=k+1}^{n}$ there is a $b$ with $\mathrm{dom}(b) = (B_i)_{i=k+1}^{n}$ satisfying the equation of fig. 2b.
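The equation of fig. 2b can also be written symbolically. The following is our reading of the security condition $R(f')r = R(b)s$ for the attack model just described, with $f' = (\bar f|_{[k]}, a)$ and the ideal-world attack acting as the identity on the honest parties' ports; it is a sketch rather than a reproduction of the figure.

```latex
\[
  \bigl(\bar f|_{[k]} \otimes a\bigr) \circ r
  \;=\;
  \Bigl(\mathrm{id}_{\bigotimes_{i=1}^{k} B_i} \otimes b\Bigr) \circ s .
\]
```

Here $a$ and $b$ are as in the preceding paragraph: $a$ is an arbitrary joint process of the dishonest parties on their real-world ports, and $b$ is the simulating process on their ideal-world ports.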

If one is willing to draw more wire crossings, one can easily depict and define security against an arbitrary subset of the parties behaving maliciously, and from now on this is the attack model we have in mind when we say that some $n$-partite protocol is secure against some subset of the parties. Moreover, for any subset $J$ of dishonest agents, one could consider more limited kinds of attacks: for instance, the agents might have limited computational power or limited abilities to perform joint computations. As long as the attack model satisfies the conditions of Definition 1, one automatically gets a composable notion of secure protocols by Theorem 1 below.

Theorem 1. Given symmetric monoidal functors $F\colon \mathbf{D} \to \mathbf{C}$, $R\colon \mathbf{C} \to \mathbf{Set}$ with $F$ strong monoidal and $R$ lax monoidal, and an attack model $\mathcal{A}$ on $\mathbf{C}$, the class of $\mathcal{A}$-secure maps forms a wide sub-SMC of the resource theory $\int RF$ induced by $RF$.

So far we have discussed security only against a single, fixed subset of dishonest parties, while in multi-party computation it is common to consider security against any subset containing e.g. at most $n/3$ or $n/2$ of the parties. However, as monoidal subcategories are closed under intersection, we immediately obtain composability against multiple attack models.

Corollary 1. Given a non-empty family of functors $(\mathbf{D} \xrightarrow{F_i} \mathbf{C}_i \xrightarrow{R_i} \mathbf{Set})_{i \in I}$ with $R_i F_i = R_j F_j =: R$ for all $i, j \in I$ and attack models $\mathcal{A}_i$ on $\mathbf{C}_i$ for each $i$, the class of maps in $\int R$ that are secure against each $\mathcal{A}_i$ is a sub-SMC of $\int R$.

Using Corollary 1 one readily obtains composability of protocols that are simultaneously secure against different attack models $\mathcal{A}_i$. Thus one could, in principle, consider composable cryptography in an $n$-party setting where some subsets are honest-but-curious, some might be outright malicious but have limited computational power, and some subsets might be outright malicious but not willing or able to coordinate with each other, without reproving any composition theorems.

While the security definition of $f$ quantifies over $\mathcal{A}(f)$, which may be infinite, under suitable conditions it is sufficient to check security only on a subset of $\mathcal{A}(f)$, so that whether $f$ is $\mathcal{A}$-secure often reduces to finitely many equations.

Definition 2. Given $f\colon A \to B$, a subset $X$ of $\mathcal{A}(f)$ is said to be initial if any $f' \in \mathcal{A}(f)$ with $\mathrm{dom}(f') = A$ can be factorized as $b \circ a$ with $a \in X$ and $b \in \mathcal{A}(\mathrm{id}_B)$.

Theorem 2. Let $f\colon (A, r) \to (B, s)$ define a morphism in the resource theory induced by $F\colon \mathbf{D} \to \mathbf{C}$ and $R\colon \mathbf{C} \to \mathbf{Set}$, and let $\mathcal{A}$ be an attack model on $\mathbf{C}$. If $X \subset \mathcal{A}(F(f))$ is initial, then $f$ is $\mathcal{A}$-secure if and only if the security condition holds against attacks in $X$, i.e., if for any $f' \in X$ with $\mathrm{dom}(f') = F(A)$ there is $b \in \mathcal{A}(\mathrm{id}_{F(B)})$ such that $R(f')r = R(b)s$.

Let us return to the example of $\mathbf{C}^n \to \mathbf{C}$ with the first $k$ agents being honest and the final $n - k$ dishonest and collaborating. Then we can take a singleton as our initial subset of attacks on $\bar f$, and this is given by $\bar f|_{[k]} \otimes \bigl(\bigotimes_{i=k+1}^{n} \mathrm{id}\bigr)$. Intuitively, this represents a situation where the dishonest parties $k + 1, \dots, n$ merely stand by and forward messages between the environment and the functionality, so that initiality can be seen as explaining "completeness of the dummy adversary" [13, Claim 11] in UC-security. In this case the security condition can be equivalently phrased by saying that there exists $b \in \mathcal{A}(\mathrm{id}_{F(B)})$, where $B = (B_i)_{i=1}^{n}$, satisfying the equation of fig. 2c, which reproduces the pictures of [51]. Similarly, for classical honest-but-curious adversaries one usually only considers the initial such adversary, who follows the protocol except that they keep track of the protocol transcript.
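The reduction to a single initial attack makes security checkable by brute force in small models. The sketch below assumes a toy instance that is not taken from the paper: C is the category of finite sets and relations, so a bipartite state is a subset of A1 × A2, party 1 is honest and party 2 is malicious. All function names are hypothetical. It searches for a relation b with (f1 ⊗ id) r = (id ⊗ b) s, where s = (f1 ⊗ f2) r.

```python
from itertools import chain, combinations, product


def all_relations(dom, cod):
    """Enumerate every relation from dom to cod as a frozenset of pairs."""
    pairs = list(product(dom, cod))
    subsets = chain.from_iterable(
        combinations(pairs, k) for k in range(len(pairs) + 1))
    return (frozenset(s) for s in subsets)


def identity(xs):
    return frozenset((x, x) for x in xs)


def apply_pair(state, g1, g2):
    """Apply g1 ⊗ g2 to a bipartite state, i.e. a subset of A1 × A2."""
    return frozenset((u, v) for (x, y) in state
                     for (x1, u) in g1 if x1 == x
                     for (y1, v) in g2 if y1 == y)


def secure_against_party_2(r, f1, f2, A2, B1, B2):
    """Initial-attack condition: is there a relation b from B2 to A2 with
    (f1 ⊗ id) r == (id ⊗ b) s, where s = (f1 ⊗ f2) r?"""
    s = apply_pair(r, f1, f2)
    real = apply_pair(r, f1, identity(A2))
    return any(apply_pair(s, identity(B1), b) == real
               for b in all_relations(B2, A2))


bits = [0, 1]
shared_bit = frozenset({(0, 0), (1, 1)})   # two perfectly correlated bits
keep = identity(bits)
negate = frozenset({(0, 1), (1, 0)})
forget = frozenset({(0, "*"), (1, "*")})    # party 2 discards its bit
```

In this toy model, the protocol `(keep, negate)` passes the check because the simulator can undo the negation, while `(keep, forget)` fails: a dishonest party 2 could retain a bit correlated with party 1's output, which nothing built from the target resource can reproduce.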

Theorem 3. In the resource theory of $n$-partite states, if $(f_1, \dots, f_n)$ is secure against some subset $J$ of $[n]$ and $F$ is a strong monoidal functor, then $(F f_1, \dots, F f_n)$ is secure against $J$ as well.

For instance, if the inclusion of classical interactive computations into quantum ones is strong monoidal, i.e., respects sequential and parallel composition (up to isomorphism), then unconditionally secure classical protocols are also secure in the quantum setting, as shown in the context of UC-security in [68, Theorem 15]. More generally, this result implies that the construction of the category of $n$-partite transformations secure against any fixed subset of $[n]$ is functorial in $\mathbf{C}$, and this is in fact also true for any family of subsets of $[n]$ by Corollary 1.

### 4 Computational security

The discussion above has been focused on perfect security, so that the equations defining security hold exactly. This is often too high a standard for security to hope for, and consequently cryptographers routinely work with computational or approximate security. We model this in two ways. The first approach replaces equations with an equivalence relation, abstracting from the idea that the end results are "computationally indistinguishable" rather than strictly equal. The second approach amounts to working in terms of a (pseudo)metric quantifying how close we are to the ideal resource, and is needed to model statements in finite-key cryptography [67]. The typical metric is given by "distinguisher advantage for polynomial-time environments", enabling one to use computational complexity theory. In a nutshell, this amounts to working with sequences of protocols and defining security by saying "for any ϵ > 0, for sufficiently large n, for any attack on the nth protocol there is an attack on the target resource such that the end results are within ϵ". The first approach is mathematically straightforward and we discuss it next, while the second approach is relegated to an extended version.
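The quoted asymptotic condition can be written out with explicit quantifiers. The notation below is an assumption made for illustration, since the distance-based definitions are deferred to the extended version: $(f_n)_n$ denotes a sequence of protocols with $f_n\colon (A_n, r_n) \to (B, s)$, and $d$ denotes a (pseudo)metric on resources of the same type.

```latex
\[
  \forall \epsilon > 0 \;\; \exists N \;\; \forall n \ge N \;\;
  \forall f' \in \mathcal{A}(F(f_n)) \;\;
  \exists b \in \mathcal{A}(\mathrm{id}_{F(B)}) : \quad
  d\bigl(R(f')\, r_n,\; R(b)\, s\bigr) < \epsilon .
\]
```

Replacing the limit by a fixed bound $d(R(f')r, R(b)s) \le \epsilon$ for a single protocol would correspond to the finite-key style of statement mentioned above.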

Replacing strict equations with equivalence relations is easy to describe on an abstract level as an instance of the theory so far: one just assumes that C has a monoidal congruence ≈ and then works with the resource theory induced by C<sup>n</sup> <sup>⊗</sup>→ C/≈ <sup>hom(I,−)</sup>→ Set with similar attack models as above. More explicitly, as long as each hom-set of C is equipped with an equivalence relation ≈ that respects ⊗ and ◦ in that f ≈ f′ and g ≈ g′ imply gf ≈ g′f′ (whenever defined) and g ⊗ f ≈ g′ ⊗ f′, then working with C<sup>n</sup> <sup>⊗</sup>→ C/≈ <sup>hom(I,−)</sup>→ Set results in security conditions that replace = in C with ≈ throughout. If C describes (interactive) computational processes and ≈ represents computational indistinguishability (inability for any "efficient" process to distinguish between the two), one might need to replace C (and consequently functionalities, protocols and attacks on them) with the subcategory of C of efficient processes so that ≈ indeed results in a congruence.

### 5 Applications

We will now explore how the one-time pad (OTP) fits into our framework, paralleling the discussion of OTP in [47]. We will start from the category FinStoch of finite sets and stochastic maps between them, with ⊗ given by cartesian product of sets. This is sufficient for OTP, even if more complicated and interactive cryptographic protocols will need a different starting category. However, the actual category C we work in is built from FinStoch, essentially by a tripartite variant of the "resource theory of universally-combinable processes" of [20, Section 3.4]. We will defer the detailed construction of C to an extended version and work in it more heuristically, allowing us to focus on the OTP.

Roughly speaking, a "basic object" of C consists of finite sets A<sup>i</sup>, B<sup>i</sup>, E<sup>i</sup> for i = 1, 2, and of a map f : A<sup>1</sup> ⊗ B<sup>1</sup> ⊗ E<sup>1</sup> → A<sup>2</sup> ⊗ B<sup>2</sup> ⊗ E<sup>2</sup> in FinStoch, depicted in fig. 3a. The intuition is that ⟨(A<sup>i</sup>, B<sup>i</sup>, E<sup>i</sup>)<sub>i∈{1,2}</sub>, f⟩ represents a box shared by Alice, Bob and Eve, with Alice's inputs and outputs ranging over A<sup>1</sup> and A<sup>2</sup> respectively, and similarly for Bob and Eve. We will often label the ports just by the party who controls them, and omit labeling trivial ports. For example, if fig. 4a depicts the copy map X → X ⊗ X for some set X in FinStoch, then fig. 4b denotes an object of C representing Alice copying data privately, whereas fig. 4c denotes an object of C that sends Alice's input unchanged to Bob and to Eve, which we view as an insecure (but authenticated) channel from Alice to Bob.

Fig. 3: Some resources and protocols. (a) Box shared by Alice, Bob and Eve (b) The OTP protocol (c) A secure PRNG (d) Secure channel

Fig. 4: Variants of the copy map

A general object of C then consists of a list of such basic objects, representing a list of such resources shared between Alice, Bob and Eve. A morphism of C is roughly speaking a way of using the starting resources and local computation by the three parties to produce the target resources: a more formal description will be given in an extended version. In our attack model Alice and Bob are honest but Eve is dishonest, so she might do arbitrary local computation instead of whatever our protocols might prescribe.

In the version of the OTP we discuss, our starting resources consist of an insecure but authenticated channel<sup>4</sup> from Alice to Bob as in fig. 4c and (i.e., ⊗) of a random key over the same message space, shared by Alice and Bob (fig. 4d). The goal is to build a secure channel from Alice to Bob (fig. 3d) from these.

The local ingredients of OTP and the axioms they obey are depicted in fig. 5 and correspond to a Hopf algebra with an integral in an SMC. Any finite group gives rise to such a structure in FinStoch, with the integral given by the uniform distribution. Concretely, this means that Alice and Bob must agree on a group structure on the message space, and the fact that this multiplication forms a group and that the key is random can be captured by the equations of fig. 5.

Fig. 5: Local ingredients of OTP and the axioms they obey

The OTP protocol is then depicted in fig. 3b, i.e., Alice adds the key to her message and broadcasts the result to Eve and Bob. Eve deletes her part and Bob adds the inverse of the key to the ciphertext to recover the message.
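The protocol steps can be sketched concretely for one choice of finite group. The sketch below assumes the cyclic group Z<sub>n</sub> under addition as the message space; the function names (`keygen`, `encrypt`, `decrypt`, `run_otp`) are hypothetical and only mirror the roles in the text, not the categorical construction.

```python
import secrets


def keygen(n: int) -> int:
    """Sample a uniformly random key from Z_n (the shared random key)."""
    return secrets.randbelow(n)


def encrypt(message: int, key: int, n: int) -> int:
    """Alice adds the key to her message using the group operation of Z_n."""
    return (message + key) % n


def decrypt(ciphertext: int, key: int, n: int) -> int:
    """Bob adds the inverse of the key to the ciphertext."""
    return (ciphertext - key) % n


def run_otp(message: int, n: int):
    """Run one round: returns what Eve sees and what Bob outputs."""
    key = keygen(n)                       # shared by Alice and Bob only
    ciphertext = encrypt(message, key, n)
    eve_view = ciphertext                 # the channel copies the ciphertext to Eve
    bob_output = decrypt(ciphertext, key, n)
    return eve_view, bob_output
```

For example, `run_otp(7, 26)` returns a pair whose second component is always 7, while the first component varies with the sampled key.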

To show that the protocol is secure, note that Eve has an initial attack given by just reading the ciphertext. The pictorial security proof is depicted in fig. 6. The first equation is the interaction between multiplication and copying, the second uses (co)associativity, the third one properties of inverses, the fourth and last one use unitality, and the fifth one follows from the key being random. Taken together, these show that Eve's initial attack is equal to her just producing a random message herself with Alice and Bob sharing the target resource. The correctness of the protocol can be proven similarly. Thus OTP gives a map shared key ⊗ authenticated channel → secure channel that is secure against Eve.
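The conclusion that Eve's view is just a random message can be checked numerically for a small group. The following sketch assumes the group Z<sub>n</sub> and computes the exact ciphertext distribution by enumerating all keys; it illustrates the statement and does not replace the diagrammatic proof. The function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def ciphertext_distribution(message: int, n: int) -> dict:
    """Exact distribution of (message + key) mod n for a uniform key in Z_n."""
    dist = Counter()
    for key in range(n):
        dist[(message + key) % n] += Fraction(1, n)
    return dict(dist)


def eve_learns_nothing(n: int) -> bool:
    """True if the ciphertext is uniform on Z_n for every possible message."""
    uniform = {c: Fraction(1, n) for c in range(n)}
    return all(ciphertext_distribution(m, n) == uniform for m in range(n))
```

Because the distribution is the same for every message, Eve could have sampled it herself without access to the channel, which is the content of the simulation argument.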

Fig. 6: Security proof of OTP

<sup>4</sup> If the insecure channel allows Eve to tamper with the message, the analysis changes.

We now use this example to illustrate the use of the composition theorems. A major drawback of OTP, despite its perfect security, is the fact that one needs a key that is as long as the message. In practice, Alice and Bob might only share a short key and wish to promote it to a long key. If they agree on a pseudorandom number generator (PRNG) with their key as the seed, they can map the short key to a longer key. If the PRNG is computationally secure, then the end result is (computationally) indistinguishable from a long key, depicted in fig. 3c, where ≈ stands for computational indistinguishability. We envision the computational security of the chosen PRNG to be proven "the usual way" and not graphically; after all, we believe that our framework is there to supplement ordinary cryptographic reasoning and not to replace it. The PRNG then results in a (computationally) secure way of promoting a short shared key into a long shared key, and then the composition theorems guarantee that these protocols can be composed, resulting in the security of the stream cipher.
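The shape of this composition (expand the short key, then run OTP with the expanded key) can be sketched as follows. The sketch assumes byte strings under XOR as the group, and uses SHAKE-256 from Python's `hashlib` purely as a stand-in expander: the source does not name any generator, and nothing here establishes that this choice satisfies the indistinguishability condition of fig. 3c. All function names are hypothetical.

```python
import hashlib


def expand_key(short_key: bytes, length: int) -> bytes:
    """Stretch a short shared key into `length` bytes (stand-in PRNG, an assumption)."""
    return hashlib.shake_256(short_key).digest(length)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Group operation on byte strings of equal length: bytewise XOR."""
    return bytes(x ^ y for x, y in zip(a, b))


def stream_encrypt(short_key: bytes, message: bytes) -> bytes:
    """Compose the two protocols: key expansion followed by OTP."""
    long_key = expand_key(short_key, len(message))
    return xor_bytes(message, long_key)


def stream_decrypt(short_key: bytes, ciphertext: bytes) -> bytes:
    """XOR is self-inverse, so decryption repeats the same two steps."""
    long_key = expand_key(short_key, len(ciphertext))
    return xor_bytes(ciphertext, long_key)
```

The point of the composition theorems is that the security of `stream_encrypt` need not be argued from scratch: it follows from the security of the two components.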

Composable security is a stronger constraint than stand-alone security, and indeed many cryptographic functionalities are known to be impossible to achieve "in the plain model", i.e., without set-up assumptions. A case in point is bit commitment, which was shown to be impossible in the UC-framework in [14]. This result was later generalized in [61] to show that any two-party functionality that can be realized in the plain UC-framework is "splittable". While the authors of [61] remark that their result applies more generally than just to the UC-framework, this wasn't made precise until [48]<sup>5</sup>. We present a categorical proof of this result in our framework, which promotes the pictures "illustrating the proof" in [61] into a full proof. The main difference is that in [61] the pictures explicitly keep track of an environment trying to distinguish between different functionalities, whereas we prove our result in the case of perfect security and then deduce the asymptotic claim.

We now assume that C, our ambient category of interactive computations, is compact closed<sup>6</sup>. As we are in the 2-party setting, we take our free computations to be given by C<sup>2</sup>, and we consider two attack models: one where Alice cheats and Bob is honest, and one where Bob cheats and Alice is honest. We think of as representing a two-way communication channel, but this interpretation is not needed for the formal result.

<sup>5</sup> Except that in their framework the 2-party case seems to require security constraints also when both parties cheat.

<sup>6</sup> We do not view this as overly restrictive, as many theoretical models of concurrent interactive (probabilistic/quantum) computation are compact closed [18, 19, 69].

Theorem 4. For Alice and Bob (one of whom might cheat), if a bipartite functionality r can be securely realized from a communication channel between them, i.e., from , then there is a g such that

[Equation (∗): string diagram, not recoverable from the extracted text]

Proof. If a protocol (f<sub>A</sub>, f<sub>B</sub>) achieves this, security constraints give us s<sub>A</sub>, s<sub>B</sub>

Corollary 2. Given a compact closed C modeling computation in which wires model communication channels, (composable) bit commitment and oblivious transfer are impossible in that model without setup, even asymptotically in terms of distinguisher advantage.

Proof. If r represents bit commitment from Alice to Bob, it does not satisfy the equation required by Theorem 4 for any g, and the two sides of (∗) can be distinguished efficiently with probability at least 1/2. Indeed, take any f and let us compare the two sides of (∗): if the distinguisher commits to a random bit b, then Bob gets a notification of this on the left-hand side, so that f has to commit to a bit on the right side of (∗) to avoid being distinguished from the left side. But this bit coincides with b with probability at most 1/2, so that the difference becomes apparent at the reveal stage. The case of OT is similar.

We now discuss a similar result in the tripartite case, which rules out building a broadcasting channel from pairwise channels securely against any single party cheating. In [46] comparable pictures are used to illustrate the official, symbolically rather involved, proof, whereas in our framework the pictures are the proof. Another key difference is that [46] rules out broadcasting directly, whereas we show that any tripartite functionality realizable from pairwise channels satisfies some equations, and then use these equations to rule out broadcasting.

Formally, we are working with the resource theory given by C<sup>3</sup> <sup>⊗</sup>→ C <sup>hom(I,−)</sup>→ Set where C is an SMC, and reason about protocols that are secure against three kinds of attacks: one for each party behaving dishonestly while the rest obey the protocol. Note that we do not need to assume compact closure for this result, and the result goes through for any state on A ⊗ A shared between each pair of parties: we will denote such a state by by convention.

Theorem 5. If a tripartite functionality r can be realized from each pair of parties sharing a state , securely against any single party, then there are simulators s<sub>A</sub>, s<sub>B</sub>, s<sub>C</sub> such that

Proof. Any tripartite protocol building on top of each pair of parties sharing can be drawn as in the left side of

Consider now the morphism in C depicted on the right: it can be seen as the result of three different attacks on the protocol (f<sub>A</sub>, f<sub>B</sub>, f<sub>C</sub>) in C<sup>3</sup>: one where Alice cheats and performs f<sub>A</sub> and f<sub>B</sub> (and the wire connecting them), one where Bob performs f<sub>B</sub> twice, and one where Charlie performs f<sub>B</sub> and f<sub>C</sub>. The security of (f<sub>A</sub>, f<sub>B</sub>, f<sub>C</sub>) against each of these gives the required simulators.

Corollary 3. Given an SMC C modeling interactive computation, and a state on A ⊗ A modeling pairwise communication, it is impossible to build broadcasting channels securely (even asymptotically in terms of distinguisher advantage) from pairwise channels.

Proof. We show that a channel r that enables Bob to broadcast an input bit to Alice and Charlie never satisfies the required equations for any s<sub>A</sub>, s<sub>B</sub>, s<sub>C</sub>. Indeed, assume otherwise and let the environment plug "broadcast 0" and "broadcast 1" to the two wires in the middle. The leftmost picture then says that Charlie receives 1, the rightmost picture implies that Alice gets 0 and the middle picture that Alice and Bob get the same output (if anything at all), a contradiction. Indeed, one cannot satisfy all of these simultaneously with high probability, which rules out an asymptotic transformation.

### 6 Outlook

We have presented a categorical framework providing a general, flexible and mathematically robust way of reasoning about composability in cryptography. Besides contributing a further approach to composable cryptography and potentially helping with cross-talk and comparisons between existing approaches [12], we believe that the current work opens the door for several further questions.

First, due to the generality of our approach we hope that one can, besides honest and malicious participants, reason about more refined kinds of adversaries composably. Indeed, we expect that Definition 1 is general enough to capture e.g., honest-but-curious adversaries<sup>7</sup>. It would also be interesting to see if this captures even more general attacks, e.g., situations where the sets of participants and dishonest parties can change during the protocol. This might require understanding our axiomatization of attack models more structurally and perhaps generalizing it. Does this structure (or a variant thereof) already arise in category theory? While we define an attack model on a category, perhaps one could define an attack model on a (strong) monoidal functor F, the current definition being recovered when F = id.

Second, we expect that rephrasing cryptographic questions categorically would enable more cross-talk between cryptography and other fields already using category theory as an organizing principle. For instance, many existing approaches to composable cryptography develop their own models of concurrent, asynchronous, probabilistic and interactive computations. As categorical models of such computation exist in the context of game semantics [18,19,69], one is left wondering whether the models of the semanticists could be used to study and answer cryptographic questions, or conversely if the models developed by cryptographers contain valuable insights for programming language semantics.

Besides working inside concrete models (which ultimately blends into "just doing composable cryptography"), one could study axiomatically how properties of a category relate to cryptographic properties in it. As a specific conjecture in this direction, one might hope to talk about honest-but-curious adversaries at an abstract level using environment structures [21], which axiomatize the idea of deleting a system. Similarly, having agents purify their actions is an important tool in quantum cryptography [45]: can categorical accounts of purification [15, 21, 24] elucidate this?

Finally, we hope to get more mileage out of the tools brought in with the categorical viewpoint. For instance, can one prove further no-go results pictorially? More specifically, given the impossibility results for two and three parties, one wonders if the "only topology matters" approach of string diagrams can be used to derive general impossibility results for n parties sharing pairwise channels. Similarly, while diagrammatic languages have been used to reason about positive cryptographic results in the stand-alone setting [9,10,41], can one push such approaches further now that composable security definitions have a clear categorical meaning? Besides the graphical methods, thinking of cryptography as a resource theory suggests using resource-theoretic tools such as monotones. While monotones have already been applied in cryptography [70], a full understanding of cryptographically relevant monotones is still lacking.

<sup>7</sup> Heuristically speaking this is the case: an honest-but-curious attack on g◦f should be factorizable as one on g and one on f, and similarly an honest-but-curious attack on g ⊗f should be factorisable into ones on g and f that then forward their transcripts to an attack on id ⊗ id.

### References



## DyNetKAT: An Algebra of Dynamic Networks <sup>⋆</sup>

Georgiana Caltais<sup>1</sup>, Hossein Hojjat<sup>2</sup>, Mohammad Reza Mousavi<sup>3</sup>, and Hünkar Can Tunç<sup>4</sup>

<sup>1</sup> University of Konstanz, Germany & University of Twente, The Netherlands g.g.c.caltais@utwente.nl <sup>2</sup> TeIAS, Khatam University & University of Tehran, Iran hojjat@ut.ac.ir <sup>3</sup> King's College London, UK mohammad.mousavi@kcl.ac.uk <sup>4</sup> University of Konstanz, Germany & Aarhus University, Denmark tunc@cs.au.dk

Abstract. We introduce a formal language for specifying dynamic updates for Software Defined Networks. Our language builds upon Network Kleene Algebra with Tests (NetKAT) and adds constructs for synchronisations and multi-packet behaviour to capture the interaction between the control- and data-plane in dynamic updates. We provide a sound and ground-complete axiomatisation of our language. We exploit the equational theory and provide an efficient method for reasoning about safety properties. We implement our equational theory in DyNetiKAT – a tool prototype, based on the Maude Rewriting Logic and the NetKAT tool, and apply it to a case study. We show that we can analyse the case study for networks with hundreds of switches using our tool prototype.

Keywords: Software Defined Networks · Dynamic Updates · Dynamic Network Reconfiguration · NetKAT · Process Algebra · Equational Reasoning.

### 1 Introduction

Software-Defined Networking (SDN) is an approach to networking that enables the network to be centrally programmed. There is a spectrum of mathematically inspired network programming languages that varies between those with a small number of language constructs and those with expressive language design which allow them to support more networking features. Flowlog [16] and Kinetic [12] are points on the more expressive side of the spectrum, which provide support for formal reasoning based on SAT-solving and model checking, respectively.

<sup>⋆</sup> The work of Georgiana Caltais and Hünkar Can Tunç was supported by the DFG project "CRENKAT", proj. no. 398056821. The work of Mohammad Reza Mousavi was supported by the UKRI Trustworthy Autonomous Systems Node in Verifiability, Grant Award Reference EP/V026801/1. The authors would like to thank Alexandra Silva and Tobias Kappé for their useful insight into the NetKAT framework.

NetKAT [3,10] is an example of a minimalist language based on Kleene algebra with tests that has a sound and complete equational theory. While the core of the language is very simple with a small number of operators, the language has been extended in various ways to support different aspects of networking such as congestion control [9], history-based routing [6] and higher-order functions [20].

Our starting point is NetKAT, because it provides a clean and analysable framework for specifying SDNs. The minimalist design of NetKAT does not cater for some common (failure) patterns in SDNs, particularly those arising from dynamic reconfiguration and the interaction between the data- and control-plane flows. In [13], the authors have proposed an extension to NetKAT to support stateful network updates. The extension embraces the notion of mutable state which is in contrast to the pure functional nature of the language. The purpose of this paper is to propose an extension of NetKAT to support dynamic and stateful behaviours. On the one hand, we preserve the big-step denotational semantics of NetKAT-specific constructs enabling, for instance, handling flow table updates atomically, in the spirit of [17]. On the other hand, we extend NetKAT in a modular fashion, to integrate concurrent SDN behaviours such as dynamic updates, defined via a small-step operational semantics. To this end, we pledge to keep the minimalistic design of NetKAT by adding only a few new operators. Furthermore, our extension does not contradict the nature of the language. DyNetKAT is a conservative extension [2] of NetKAT that enables reusing in a modular fashion frameworks previously developed for NetKAT, such as the NetKAT axiomatisation in [3].

A number of concurrent extensions of NetKAT have been introduced to date [11,18,21]. These extensions made different design decisions from the present paper and a comparison of their approaches with ours is provided in Section 2; however, the most important difference lies in the fact that, inspired by earlier abstractions in this domain [17], we were committed to creating different layers for data-plane flows and dynamic updates such that every data-plane packet observes a single set of flow tables throughout its flight through the network. This allowed us, unlike the earlier approaches, to build a layer on top of NetKAT without modifying its semantics. Although our presentation in this paper is based on NetKAT, we envisage that our concurrency layer can be modularly (in the sense of Modular SOS [14]) used for other network programming languages in the above-mentioned spectrum. We leave a more careful investigation of the modularity on other network languages for future work.

Running Example. To illustrate our language concepts, we focus on modelling with DyNetKAT an example of a stateful firewall that involves dynamically updating the flow table. The example is overly simplified for the purpose of presentation. Towards the end of this paper and also in the extended version [7], we treat more complex and larger-scale case studies to evaluate the applicability and analysability of our language.

A firewall is supposed to protect the intranet of an organisation from unauthorised access from the Internet. However, due to certain requests from the intranet, it should be able to open up connections from the Internet to intranet.

Fig. 1: Stateful Firewall

An example is when a user within the intranet requests a secure connection to a node on the Internet; in that case, the response from the node should be allowed to enter the intranet. The behaviour of updating the flow tables with respect to some events in the network such as receiving a specific packet is a challenging phenomenon for languages such as NetKAT.

Figure 1 shows a simplified version of the stateful firewall network. Note that we are not interested in the flow of packets but interested in the flow update. In this version, the Switch does not allow any packet from the port ext to int at the beginning. When the Host sends a request to the Switch it opens up the connection.

Our Contributions. The contributions of this paper are summarised as follows. (a) We define the syntax and operational semantics of a dynamic extension of NetKAT that allows for modelling and reasoning about control-plane updates and their interaction with data-plane flows (Sections 2.3, 2.4). (b) We give a sound and ground-complete axiomatisation of our language (Section 3). (c) We devise analysis methods for reasoning about flow properties using our axiomatisation, apply them on examples from the domain and gather and analyse evidence of applicability and efficiency for our approach (Sections 4, 5, 6).

### 2 Language Design

In what follows, we provide a brief overview of the NetKAT syntax and semantics [3]. Then, we motivate our language design decisions, we introduce the syntax of DyNetKAT and its underlying semantics, and provide the corresponding encoding of our running example.

### 2.1 Brief Overview of NetKAT

We proceed by first introducing some basic notions used throughout the paper.

Definition 1 (Network Packets). Let F = {f<sub>1</sub>, …, f<sub>n</sub>} be a set of field names f<sub>i</sub> with i ∈ {1, …, n}. We call network packet a partial function in F → ℕ that maps field names in F to values in ℕ. We use σ, σ′ to range over network packets. We write, for instance, σ(f<sub>i</sub>) = v<sub>i</sub> to denote a test checking whether the value of f<sub>i</sub> in σ is v<sub>i</sub>. Furthermore, we write σ[f<sub>i</sub> := v<sub>i</sub>] to denote the assignment of v<sub>i</sub> to f<sub>i</sub> in σ. A (possibly empty) list of packets is defined as a partial function from natural numbers to packets, where the natural number in the domain denotes the position of the packet in the list such that the domain of the function forms an interval starting from 0. The empty list is denoted by ⟨⟩ and is defined as the empty function (the function with the empty set as its domain). Let σ be a packet and l be a list, then σ :: l is the list l′ in which σ is at position 0 in l′, i.e., l′(0) = σ, and l′(i + 1) = l(i), for all i in the domain of l.

NetKAT Syntax:

$$
\begin{aligned}
Pr &::= 0 \mid 1 \mid f = n \mid Pr + Pr \mid Pr \cdot Pr \mid \neg Pr \\
N &::= Pr \mid f \leftarrow n \mid N + N \mid N \cdot N \mid N^{*} \mid \mathbf{dup}
\end{aligned}
$$

NetKAT Semantics:

$$
\begin{aligned}
[\![1]\!](h) &\triangleq \{h\} \\
[\![0]\!](h) &\triangleq \{\} \\
[\![f = n]\!](\sigma{::}h) &\triangleq \begin{cases} \{\sigma{::}h\} & \text{if } \sigma(f) = n \\ \{\} & \text{otherwise} \end{cases} \\
[\![\neg a]\!](h) &\triangleq \{h\} \setminus [\![a]\!](h) \\
[\![f \leftarrow n]\!](\sigma{::}h) &\triangleq \{\sigma[f := n]{::}h\} \\
[\![p + q]\!](h) &\triangleq [\![p]\!](h) \cup [\![q]\!](h) \\
[\![p \cdot q]\!](h) &\triangleq ([\![p]\!] \bullet [\![q]\!])(h) \\
[\![p^{*}]\!](h) &\triangleq \bigcup_{i \in \mathbb{N}} F^{i}(h) \\
F^{0}(h) &\triangleq \{h\} \\
F^{i+1}(h) &\triangleq ([\![p]\!] \bullet F^{i})(h) \\
(f \bullet g)(x) &\triangleq \bigcup \{ g(y) \mid y \in f(x) \} \\
[\![\mathbf{dup}]\!](\sigma{::}h) &\triangleq \{\sigma{::}(\sigma{::}h)\}
\end{aligned}
$$

Fig. 2: NetKAT: Syntax and Semantics [3]

In Figure 2, we recall the NetKAT syntax and semantics [3]. The predicate for dropping a packet is denoted by 0, while passing on a packet (without any modification) is denoted by 1. The predicate checking whether the field f of a packet has value n is denoted by (f = n); if the predicate fails on the current packet it results in dropping the packet, otherwise it will pass the packet on. Disjunction and conjunction between predicates are denoted by Pr + Pr and Pr · Pr, respectively. Negation is denoted by ¬Pr. Predicates are the basic building blocks of NetKAT policies and hence, a predicate is a policy by definition. The policy that modifies the field f of the current packet to take value n is denoted by (f ← n). A multicast behaviour of policies is denoted by N + N, while sequencing policies (to be applied on the same packet) are denoted by N · N. The repeated application of a policy is encoded as N<sup>∗</sup>. The construct dup simply makes a copy of the current network packet.

In [3], lists of packets are referred to as histories. Let H stand for the set of packet histories, and P(H) denote the powerset of H. More formally, the denotational semantics of NetKAT policies is inductively defined via the semantic map ⟦−⟧ : N → (H → P(H)) in Figure 2, where N stands for the set of NetKAT policies, h ∈ H is a packet history, a ∈ Pr denotes a NetKAT predicate and σ ∈ F → ℕ is a network packet.

For a reminder, the equational axioms of NetKAT include the Kleene Algebra axioms, Boolean Algebra axioms and the so-called Packet Algebra axioms that handle NetKAT networking-specific constructs such as field assignments and dup. In this paper, we write E<sub>NK</sub> to denote the NetKAT axiomatisation [3].
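The semantic map of Figure 2 can be read as a recursive evaluator. The sketch below is one possible transcription, not the implementation of any NetKAT tool: policies are encoded as nested tuples with hypothetical tags (`"id"`, `"drop"`, `"test"`, `"neg"`, `"mod"`, `"plus"`, `"seq"`, `"star"`, `"dup"`), packets as sorted tuples of (field, value) pairs so that they are hashable, and histories as non-empty tuples of packets with the current packet first.

```python
def packet(**fields):
    """Build a hashable packet from keyword arguments, e.g. packet(port=1)."""
    return tuple(sorted(fields.items()))


def eval_policy(p, h):
    """Return the set of histories [[p]](h) for a non-empty history h."""
    tag = p[0]
    head, tail = h[0], h[1:]
    if tag == "id":                       # [[1]](h) = {h}
        return frozenset({h})
    if tag == "drop":                     # [[0]](h) = {}
        return frozenset()
    if tag == "test":                     # [[f = n]]
        _, f, n = p
        return frozenset({h}) if dict(head).get(f) == n else frozenset()
    if tag == "neg":                      # [[not a]](h) = {h} \ [[a]](h)
        return frozenset({h}) - eval_policy(p[1], h)
    if tag == "mod":                      # [[f <- n]]
        _, f, n = p
        new = dict(head)
        new[f] = n
        return frozenset({(tuple(sorted(new.items())),) + tail})
    if tag == "plus":                     # union of both branches
        return eval_policy(p[1], h) | eval_policy(p[2], h)
    if tag == "seq":                      # Kleisli composition
        return frozenset(h2 for h1 in eval_policy(p[1], h)
                         for h2 in eval_policy(p[2], h1))
    if tag == "dup":                      # copy the current packet
        return frozenset({(head,) + h})
    if tag == "star":                     # union of all F^i(h)
        # Terminates only if finitely many histories are reachable,
        # e.g. for dup-free bodies over finite field values.
        seen, frontier = {h}, {h}
        while frontier:
            step = set()
            for x in frontier:
                step |= eval_policy(p[1], x)
            frontier = step - seen
            seen |= frontier
        return frozenset(seen)
    raise ValueError(f"unknown policy tag: {tag!r}")
```

As a usage example, evaluating `("seq", ("test", "port", 1), ("mod", "port", 2))` on the history `(packet(port=1),)` gives the single history `(packet(port=2),)`.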

#### 2.2 Design Decisions

Our main motivation behind DyNetKAT is to have a minimalist language that can model control-plane and data-plane network traffic and their interaction. Our choice for a minimal language is motivated by our desire to use our language as a basis for scalable analysis. We would like to be able to compile major practical languages into ours. Our minimal design helps us reuse much of the well-known scalable analysis techniques. Regarding its modelling capabilities, we are interested in modelling the stateful and dynamic behaviour of networks emerging from these interactions. We would like to be able to model control messages, connections between controllers and switches, data packets, links among switches, and model and analyse their interaction in a seamless manner.

Based on these motivations, we start off with NetKAT as a fundamental and minimal network programming language, which allows us to model the basic policies governing the network traffic. The choice of NetKAT, in addition to its minimalist nature, is motivated by its rigorous semantics and equational theory, and the existing techniques and tools for its analysis. This motivates our next design constraint, namely, to build upon NetKAT in a hierarchical manner and without redefining its semantics. This constraint should not be taken lightly, as the challenges in the recent concurrent extensions of NetKAT demonstrate [11, 18, 21]. We will elaborate on this point in the presentation of our syntax and semantics. We can achieve this thanks to the abstractions introduced in the domain [17] that allow for a neat layering of data-plane and control-plane flows such that every data-plane flow sees one set of flow tables in its flight through the network.

We introduce a few extensions and modifications to cater for the phenomena we desire to model in our extension regarding control-plane and dynamic and stateful behaviour, as follows.

(a) Parallel composition and synchronisation: we introduce a basic mechanism for parallel composition based on handshake synchronisation with the possibility of communicating a network program (a flow table). The point of adding parallel composition is to have parallel controllers and switches as separate syntactic entities: controllers trigger reconfigurations and switches accept different types of reconfiguration and change their continuation accordingly.

(b) Guarded recursion: we introduce the concept of recursion to model the (persistent) dynamic changes that result from control messages and stateful behaviour. In other words, recursion is used to model the new state of the flow tables. An alternative modelling construct could have been using "global" variables and guards, but we prefer recursion due to its neat algebraic representation. We restrict the use of recursion to guarded recursion, that is, a policy should be applied before changing state to a new recursive definition, in order to remain within a decidable and analysable realm. A natural extension of our framework could introduce formal parameters and parameterised recursive variables; this future extension is orthogonal to our existing extensions and in this paper, we go for a minimal extension in which the parameters are coded in variable names.

(c) Multi-packet semantics: we introduce the semantics of treating a list of packets, which is essential for studying the interaction between control- and data-plane packets. This is in contrast with NetKAT where a single-packet semantics is introduced. The introduction of multi-packet semantics also called for a new operator to denote the end of applying a flow table to the current packet and proceeding with the next packet (possibly with the modified flow table in place). This is our new sequential composition operator, denoted by ";". Inspired by the abstractions in the software defined networking community [17], we assume each packet is processed either using the configuration in place prior to the update, or the configuration in place after the update, but never a mixture of the two.

### 2.3 DyNetKAT Syntax

As already mentioned, NetKAT provides the possibility of recording the individual "hops" that packets take as they go through the network by using the so-called dup construct. The latter keeps track of the state of the packet at each intermediate hop. As a brief reminder of the approach in [3]: assume a NetKAT switch policy p and a topology t, together with an ingress in and an egress out. Checking whether out is reachable from in reduces to checking: in · dup · (p · t · dup)<sup>∗</sup> · out ≢ 0 (see Definition 2 and Theorem 4 in [3]). Furthermore, as shown in [10], dup plays a crucial role in devising the NetKAT language semantics in a coalgebraic fashion, via Brzozowski-like derivatives on top of NetKAT coalgebras (or NetKAT automata) corresponding to NetKAT expressions.

We decided to depart from NetKAT in this respect, due to our important constraint not to redefine the NetKAT semantics: the dup expression allows for observable intermediate steps that result from incomplete application of flow tables and, in concurrency scenarios, the same data packet may become subject to more than one flow table due to the concurrent interactions with the control plane. For this semantics to be compositional, one needs to define a small-step operational semantics in such a way that the small steps in predicate evaluation also become visible (see our past work on compositionality of SOS with data on such constraints [15]). This would first break our constraint of building upon the NetKAT semantics and secondly, due to the huge number of possible interleavings, make the resulting state space intractable for analysis.

In addition to the argumentation above, note that similarly to the approach in [3], we work with packet fields ranging over finite domains. Consequently, our analyses can be formulated in terms of reachability properties, further verifiable by means of dup-free expressions of shape: in · (p · t)<sup>∗</sup> · out ≢ 0. Hence, we chose to define DyNetKAT synchronisation, guarded recursion and multi-packet semantics on top of the dup-free fragment of NetKAT, denoted by NetKAT<sup>−dup</sup>.
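To make the dup-free reachability check concrete, the following Python sketch treats a NetKAT<sup>−dup</sup> policy as a function from a packet to a set of packets and evaluates in · (p · t)<sup>∗</sup> · out by iterating to a fixed point, which terminates because fields range over finite domains. The packet encoding, the helper names (`match_field`, `assign`, `seq`, `plus`, `star`, `reachable`) and the three-switch line topology are illustrative assumptions, not part of any NetKAT or DyNetKAT tool.

```python
# A packet is a sorted tuple of (field, value) pairs, so that it is hashable.
def pkt(**fields):
    return tuple(sorted(fields.items()))

def match_field(field, value):          # test  f = v
    return lambda p: frozenset({p}) if dict(p)[field] == value else frozenset()

def assign(field, value):               # assignment  f <- v
    return lambda p: frozenset({pkt(**{**dict(p), field: value})})

def seq(*policies):                     # sequential composition  p . q
    def run(p):
        current = frozenset({p})
        for pol in policies:
            current = frozenset(q for x in current for q in pol(x))
        return current
    return run

def plus(*policies):                    # union  p + q
    return lambda p: frozenset(q for pol in policies for q in pol(p))

def star(pol):                          # Kleene star as a least fixed point
    def run(p):
        seen, frontier = {p}, {p}
        while frontier:
            frontier = {q for x in frontier for q in pol(x)} - seen
            seen |= frontier
        return frozenset(seen)
    return run

def reachable(inp, p, t, out, packets):
    """True iff  inp . (p . t)* . out  is not equivalent to 0 on `packets`."""
    check = seq(inp, star(seq(p, t)), out)
    return any(check(x) for x in packets)

# Hypothetical example: switches 1 -> 2 -> 3 on a line, identity switch policy.
packets = [pkt(sw=i) for i in (1, 2, 3)]
ident = lambda p: frozenset({p})
topo = plus(seq(match_field("sw", 1), assign("sw", 2)),
            seq(match_field("sw", 2), assign("sw", 3)))
forward = reachable(match_field("sw", 1), ident, topo, match_field("sw", 3), packets)
backward = reachable(match_field("sw", 3), ident, topo, match_field("sw", 1), packets)
```

In this toy instance switch 3 is reachable from switch 1 but not the other way round, which is the kind of question the dup-free expression answers.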

The syntax of DyNetKAT is defined on top of the dup-free fragment of NetKAT as:

$$\begin{array}{l} N ::= \text{NetKAT}^{-\text{dup}} \\ D ::= \bot \mid N ; D \mid x?N ; D \mid x!N ; D \mid D \parallel D \mid D \oplus D \mid X \\ X \triangleq D \end{array} \tag{1}$$

We write p ∈ NetKAT, p ∈ NetKAT<sup>−dup</sup> or, respectively, p ∈ DyNetKAT in order to refer to a NetKAT, NetKAT<sup>−dup</sup> or, respectively, DyNetKAT policy p.

The DyNetKAT-specific constructs are as follows. By ⊥ we denote a dummy policy without behaviour. Our new sequential composition operator, denoted by N ; D, specifies when the NetKAT<sup>−dup</sup> policy N applicable to the current packet has come to a successful end and, thus, the packet can be transmitted further and the next packet can be fetched for processing according to the rest of the policy D.

Communication in DyNetKAT, encoded via x!N ; D and x?N ; D, consists of two steps. In the first place, sending and receiving NetKAT<sup>−dup</sup> policies through channel x are denoted by x!N and x?N. In an expression such as x?N ; P<sub>N</sub>, the combination of the channel name x and the update type N determines the continuation process P<sub>N</sub>; considering N as a placeholder in P<sub>N</sub> enables defining compositional and compact parameterised DyNetKAT specifications. Secondly, as soon as the sending or receiving messages are successfully communicated, a new packet is fetched and processed according to D. The parallel composition of two DyNetKAT policies (to enable synchronisation) is denoted by D || D.

As will become clearer in Section 2.4, communication in DyNetKAT guarantees preservation of well-defined behaviours when transitioning between network configurations. This corresponds to the so-called per-packet consistency in [17], and it guarantees that every packet traversing the network is processed according to exactly one NetKAT<sup>−dup</sup> policy.

Non-deterministic choice of DyNetKAT policies is denoted by D ⊕ D. For a non-deterministic choice over a finite domain P, we use the syntactic sugar ⊕<sub>p∈P</sub> P′, where p appears as a "bound variable" in P′; this is interpreted as a finite sum obtained by replacing the variable p with each of its possible values in P.

Finally, one can use recursive variables X in the specification of DyNetKAT policies, where each recursive variable should have a unique defining equation X ≜ D. For the simplicity of notation, we do not explicitly specify the trailing "; ⊥" in our policy specifications, whenever clear from the context.

In Figure 3 we provide the DyNetKAT formalisation of the firewall in Example 1. In the DyNetKAT encoding, we use the message channel secConReq to open up the connection and secConEnd to close it. We model the behaviour of the switch using the two programs Switch and Switch′.

$$\begin{array}{rcl} Host & \triangleq & secConReq!1; Host \ \oplus\ secConEnd!1; Host \\ Switch & \triangleq & ((port = int) \cdot (port \leftarrow ext)); Switch \ \oplus \\ & & ((port = ext) \cdot \mathbf{0}); Switch \ \oplus \\ & & secConReq?1; Switch' \\ Switch' & \triangleq & ((port = int) \cdot (port \leftarrow ext)); Switch' \ \oplus \\ & & ((port = ext) \cdot (port \leftarrow int)); Switch' \ \oplus \\ & & secConEnd?1; Switch \\ Init & \triangleq & Host \parallel Switch \end{array}$$

Fig. 3: Stateful Firewall in DyNetKAT

### 2.4 DyNetKAT Semantics

The operational semantics of DyNetKAT in Figure 4 is provided over configurations of shape (d, H, H′), where d stands for the current DyNetKAT policy, H is the list of packets to be processed by the network according to d and H′ is the list of packets handled successfully by the network. The rule labels γ range over pairs of packets (σ, σ′) or communication/reconfiguration-like actions of shape x!q, x?q or rcfg(x, q), depending on the context.

γ ::= (σ, σ′) | x!q | x?q | rcfg(x, q)

Fig. 4: DyNetKAT: Operational Semantics (relevant excerpt)

Note that the DyNetKAT semantics is devised in a "layered" fashion. Rule (cpol✓ ; ) in Figure 4 is the base rule that makes the transition between the NetKAT denotations and DyNetKAT operations. More precisely, whenever σ′ is a packet resulting from the successful evaluation of a NetKAT policy p on σ, a (σ, σ′)-labelled step is observed at the level of DyNetKAT. This transition applies whenever the current configuration encapsulates a DyNetKAT policy of shape p; q and a list of packets to be processed starting with σ. The resulting configuration continues with evaluating q on the next packet in the list, while σ′ is marked as successfully handled by the network.

The remaining rules in Figure 4 define non-deterministic choice ⊕, synchronisation || and recursion X in the standard fashion. Note that synchronisations leave the packet lists unchanged. Moreover, we choose not to hide the channel x and the policy p being communicated (as is usually the case in ACP), but rather keep this information visible outside the SDN being modelled, by means of the label rcfg(x, p). Due to space limitations, we omit the explicit definitions of the symmetric cases for ⊕ and ||. The full semantics is provided in [7].

In Figure 5 we depict a labelled transition system (LTS) encoding a possible behaviour of the stateful firewall in Example 1. We assume the list of network packets to be processed consists of a "safe" packet σ<sub>i</sub> travelling from int to ext (i.e., σ<sub>i</sub>(port) = int) followed by a potentially "dangerous" packet σ<sub>e</sub> travelling from ext to int (i.e., σ<sub>e</sub>(port) = ext). For the simplicity of notation, in Figure 5 we write H for Host, S for Switch, S′ for Switch′, SCR for secConReq and SCE for secConEnd. Note that σ<sub>e</sub> can enter the network only if a secure connection request was received. More precisely, the transition labelled (σ<sub>e</sub>, σ<sub>i</sub>) is preceded by a transition labelled SCR?1 or rcfg(SCR, 1):

$$n_2 \xrightarrow{SCR?1,\ \mathbf{rcfg}(SCR,1)} n_3 \xrightarrow{(\sigma_e, \sigma_i)} n_4.$$

Fig. 5: Stateful Firewall LTS
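The behaviour described above can be replayed with a small Python sketch of the step relation over configurations (d, H, H′). It is a simplification under stated assumptions: packets are reduced to the value of their port field, the two switch states are plain strings, and the host is assumed to be always ready to send on secConReq or secConEnd, so every synchronisation appears directly as an rcfg label. The names `packet_steps` and `steps` are hypothetical.

```python
def packet_steps(state, sigma):
    """Output packets of the NetKAT part of Switch / Switch' on packet sigma."""
    if sigma == "int":
        return ["ext"]                   # (port = int) . (port <- ext)
    if sigma == "ext" and state == "Switch'":
        return ["int"]                   # (port = ext) . (port <- int)
    return []                            # (port = ext) . 0 produces no packet

def steps(config):
    """All labelled steps from configuration (state, H, H')."""
    state, todo, done = config
    result = []
    if todo:
        head, rest = todo[0], todo[1:]
        for out in packet_steps(state, head):
            # A successful evaluation consumes the head of H and records
            # the output packet at the front of H'.
            result.append(((head, out), (state, rest, (out,) + done)))
    # Reconfigurations leave both packet lists unchanged.
    if state == "Switch":
        result.append(("rcfg(secConReq,1)", ("Switch'", todo, done)))
    else:
        result.append(("rcfg(secConEnd,1)", ("Switch", todo, done)))
    return result

# The packet list of the running example: sigma_i followed by sigma_e.
init = ("Switch", ("int", "ext"), ())
after_safe = dict(steps(init))[("int", "ext")]
blocked = [label for label, _ in steps(after_safe)]
opened = dict(steps(after_safe))["rcfg(secConReq,1)"]
```

After the safe packet has been handled, the only step available in state Switch is the reconfiguration; the dangerous packet can be processed only from `opened`, mirroring the path n<sub>2</sub>, n<sub>3</sub>, n<sub>4</sub> of the LTS.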

### 3 Semantic Results

In this section we define bisimilarity of DyNetKAT policies and provide a corresponding sound and ground-complete axiomatization. We start with strong bisimilarity because it lends itself to a neat theory. Once we establish a theory for strong bisimilarity, a theory for other notions of equivalence in the linear-time and branching-time spectrum can be obtained by adding a specific set of axioms following a standard recipe for each notion. We use this approach to reason about safety properties that are about traces.

Bisimilarity of DyNetKAT terms is defined in the standard fashion:

Definition 2 (Bisimilarity (∼)) A symmetric relation R over DyNetKAT policies is a bisimulation whenever for (p, q) ∈ R the following holds:

$$\text{if } (p, H_0, H_1) \xrightarrow{\gamma} (p', H'_0, H'_1) \text{ then there exists } q' \text{ s.t. } (q, H_0, H_1) \xrightarrow{\gamma} (q', H'_0, H'_1) \text{ and } (p', q') \in R,$$

with γ ::= (σ, σ′) | x?r | x!r | rcfg(x, r).

We call bisimilarity the largest bisimulation relation. Two policies p and q are bisimilar (p ∼ q) if there is a bisimulation relation R such that (p, q) ∈ R.
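As an illustration of "largest bisimulation", the following Python sketch computes bisimilarity on a finite LTS by starting from the full relation and repeatedly discarding pairs that violate the transfer condition. It abstracts from DyNetKAT configurations: states are opaque names and labels are strings, so the packet lists are assumed to be already folded into the states. The function name `bisimilar` and the example LTS are hypothetical.

```python
def bisimilar(lts, s, t):
    """lts maps each state to a list of (label, successor) pairs."""
    rel = {(a, b) for a in lts for b in lts}

    def transfer(a, b):
        # Every step of a must be matched by an equally labelled step of b
        # into a related pair, and vice versa.
        forth = all(any(l2 == l1 and (a2, b2) in rel for l2, b2 in lts[b])
                    for l1, a2 in lts[a])
        back = all(any(l1 == l2 and (a2, b2) in rel for l1, a2 in lts[a])
                   for l2, b2 in lts[b])
        return forth and back

    changed = True
    while changed:
        bad = {pair for pair in rel if not transfer(*pair)}
        rel -= bad
        changed = bool(bad)
    return (s, t) in rel

# a;(b + c) versus a;b + a;c: trace equivalent, but not bisimilar.
LTS = {
    "p0": [("a", "p1")],
    "p1": [("b", "end"), ("c", "end")],
    "q0": [("a", "q1"), ("a", "q2")],
    "q1": [("b", "end")],
    "q2": [("c", "end")],
    "r0": [("a", "p1")],
    "end": [],
}
```

Here `bisimilar(LTS, "p0", "r0")` holds while `bisimilar(LTS, "p0", "q0")` does not, since after the a-step the right-hand process has already committed to b or c.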

Semantic equivalence of NetKAT<sup>−dup</sup> policies is preserved by DyNetKAT.

Proposition 1 (Semantic Layering). Let p and q be NetKAT<sup>−dup</sup> policies. The following holds: ⟦p⟧ = ⟦q⟧ if (p; d) ∼ (q; d) for any DyNetKAT policy d.

Proof sketch. This follows from the definition of ∼ and rule (cpol✓ ; ) in Figure 4. ■

Fig. 6: The axiom system E<sub>DNK</sub> (including E<sub>NK</sub>)

We further provide some additional ingredients needed to introduce the DyNetKAT axiomatisation in Figure 6. First, note that our notion of bisimilarity identifies synchronisation steps as in (cpol♣♠) in Figure 4. At the axiomatisation level, this requires introducing corresponding constants rcfg<sub>x,z</sub> defined as:

$$\boxed{(\mathbf{rcfg}_{x,z};p,H_0,H_1) \xrightarrow{\mathbf{rcfg}(x,z)} (p,H_0,H_1)}$$

In accordance with standard approaches to process algebra (see, e.g., [1, 4]) we consider the restriction operator δ<sub>L</sub>(−) with L a set of forbidden actions ranging over x?z and x!z as in (1). In practice, we use the restriction operator to force synchronous communication. We also define a projection operator π<sub>n</sub>(−) that, intuitively, captures the first n steps of a DyNetKAT policy. π<sub>n</sub>(−) is crucial for defining the so-called "Approximation Induction Principle" that enables reasoning about equivalence of recursive DyNetKAT specifications. Last, but not least, in our axiomatisation we employ the left-merge operator (⌊⌊) and the communication-merge operator (|) utilised for axiomatising parallel composition. Intuitively, a process of shape p ⌊⌊ q behaves like p as a first step, and then continues as the parallel composition between the remaining behaviour of p and q. A process of shape p | q forces the synchronous communication between p and q in a first step, and then continues as the parallel composition between the remaining behaviours of p and q. The full description of these auxiliary operators is provided in [7].

From this point onward, we denote by DyNetKAT the extension with the operators δ<sub>L</sub>(−), π<sub>n</sub>(−) and rcfg<sub>x,z</sub>:

$$\begin{array}{rcl} N & ::= & \text{NetKAT}^{-\text{dup}} \\ D_e & ::= & \bot \mid N ; D_e \mid x?N ; D_e \mid x!N ; D_e \mid \mathbf{rcfg}_{x,N} ; D_e \mid D_e \parallel D_e \mid D_e \oplus D_e \mid \\ & & \delta_{\mathcal{L}}(D_e) \mid \pi_n(D_e) \mid D_e \mathbin{\lfloor\!\lfloor} D_e \mid D_e \mathbin{|} D_e \mid X \\ & & X \triangleq D_e,\ n \in \mathbb{N},\ \mathcal{L} = \{c \mid c ::= x?N \mid x!N\} \end{array} \tag{2}$$

Bisimilarity is defined for DyNetKAT terms as in (2) in the natural fashion.

Lemma 3 For DyNetKAT, bisimilarity is a congruence.

Proof sketch. The result follows from the fact that the semantic rules defined in this paper comply with the congruence formats proposed in [15]; the notion of bisimilarity used in our paper coincides with the notion of stateless bisimilarity in [15] and hence, the lemma follows. ■

In Figure 6, we introduce E<sub>DNK</sub>, the axiom system of DyNetKAT, including the NetKAT axiomatisation E<sub>NK</sub>. Most of the axioms in Figure 6 comply with the standard axioms of parallel and communicating processes [4], where, intuitively, ⊕ plays the role of non-deterministic choice, ; resembles sequential composition and ⊥ is a process that deadlocks. An interesting axiom is (A7) : p || ⊥ ≡ p which, intuitively, states that if one network component fails, then the whole system continues with the behaviour of the remaining components. This is a departure from the approach in [11], where recovery is not possible in case of a component's failure; i.e., e || 0 ≡ 0. Additionally, (A12) "pin-points" a communication step via the newly introduced constants of form rcfg<sub>x,z</sub>. Axiom (A0) states that if the current packet is dropped as a result of the unsuccessful evaluation of a NetKAT policy, then the continuation is deadlocked. (A1) enables mapping the non-deterministic choice at the level of NetKAT to the setting of DyNetKAT.

The axioms encoding the restriction operator δ<sub>L</sub>(−) and the projection operator π<sub>n</sub>(−) are defined in the standard fashion, on top of DyNetKAT normal forms defined later in this section. Intuitively, normal forms are defined inductively, as sums of complete tests and complete assignments α · π, or communication steps x?q, x!q and rcfg<sub>x,q</sub>, followed by arbitrary DyNetKAT policies. Complete tests (typically denoted by α) and complete assignments (typically denoted by π) were originally introduced in [3]. In short: let F = {f<sub>1</sub>, . . . , f<sub>n</sub>} be a set of field names with values in V<sub>i</sub>, for i ∈ {1, . . . , n}. We call complete test (resp., complete assignment) an expression f<sub>1</sub> = v<sub>1</sub> · . . . · f<sub>n</sub> = v<sub>n</sub> (resp., f<sub>1</sub> ← v<sub>1</sub> · . . . · f<sub>n</sub> ← v<sub>n</sub>), with v<sub>i</sub> ∈ V<sub>i</sub>, for i ∈ {1, . . . , n}. Last, but not least, axiom (AIP) corresponds to the so-called "Approximation Induction Principle", and it provides a mechanism for reasoning about the equivalence of recursive behaviours, up to a certain limit denoted by n.
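Because every field ranges over a finite domain, the complete tests (and, symmetrically, the complete assignments) can be enumerated explicitly. The Python sketch below does so with a Cartesian product; the field names and values in `domains` are hypothetical, and `complete_tests` and `show` are illustrative helper names.

```python
from itertools import product

def complete_tests(domains):
    """Enumerate all complete tests over `domains` (field -> finite list of values).

    Each result fixes one value per field, in a fixed field order f1, ..., fn.
    """
    fields = sorted(domains)
    return [tuple(zip(fields, values))
            for values in product(*(domains[f] for f in fields))]

def show(term, op="="):
    """Render a complete test (op '=') or complete assignment (op '←')."""
    return " · ".join(f"{f} {op} {v}" for f, v in term)

domains = {"port": ["int", "ext"], "type": ["ssh", "other"]}  # assumed fields
alphas = complete_tests(domains)
```

With two binary fields there are four complete tests; the same tuples, rendered with `op="←"`, serve as the complete assignments.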

In what follows, we show that the axiom system E<sub>DNK</sub> is sound and ground-complete with respect to DyNetKAT bisimilarity.

Lemma 4 (NetKAT<sup>−dup</sup> Normal Forms) We call a NetKAT<sup>−dup</sup> policy q in normal form (n.f.) whenever q is of shape Σ<sub>α·π∈A</sub> α · π with A = {α<sub>i</sub> · π<sub>i</sub> | i ∈ I}. E<sub>NK</sub> is normalising for NetKAT<sup>−dup</sup>.

Proof sketch. The result follows from Lemma 4 in [3] stating that the standard semantics of every NetKAT expression is equal to the union of its minimal nonzero terms. In the context of NetKAT<sup>−dup</sup> and packet values drawn from finite domains (as is the case in [3]), this union can be equivalently expressed as a sum of complete tests and complete assignments. I.e., ⊢ r ≡ Σ<sub>i∈I</sub> α<sub>i</sub> · π<sub>i</sub> for every NetKAT<sup>−dup</sup> expression r. ■

Definition 5 (DyNetKAT Normal Forms) We call a DyNetKAT policy in normal form (n.f.) if it is of shape

$$\Sigma^{\oplus}_{i \in I} (\alpha_i \cdot \pi_i); d_i \ \oplus\ \Sigma^{\oplus}_{j \in J} c_j; d_j \ (\oplus \bot)$$

where d<sub>i</sub>, d<sub>j</sub> range over DyNetKAT policies and c<sub>j</sub> ::= x?q | x!q | rcfg<sub>x,q</sub> with q denoting terms in NetKAT<sup>−dup</sup>.

Definition 6 (Guardedness) A DyNetKAT policy p is guarded if and only if all occurrences of all variables X in p are guarded. An occurrence of a variable X in a policy p is guarded if and only if (i) p has a subterm of shape p′; t such that either p′ is variable-free, or all the occurrences of variables Y in p′ are guarded, and X occurs in t, or (ii) p is of shape y?X; t, y!X; t or rcfg<sub>X,t</sub>.

Note that guarded DyNetKAT policies are finitely branching. In what follows, we assume DyNetKAT policies are guarded.

Lemma 7 (DyNetKAT Normalisation) E<sub>DNK</sub> is normalising for DyNetKAT.

Proof sketch. The proof follows from Lemma 4 and (A1), by structural induction. Base cases: p ≜ ⊥ trivially holds; p ≜ q; d with q a NetKAT<sup>−dup</sup> term holds by Lemma 4 and (A1); p ≜ c; d with c ::= x?q | x!q | rcfg<sub>x,q</sub> trivially holds. Induction step, cases: p ≜ X is discarded, as p is not guarded; p ≜ p<sub>1</sub> ⊕ p<sub>2</sub>; p ≜ p<sub>1</sub> ⌊⌊ p<sub>2</sub>; p ≜ π<sub>n</sub>(p′); p ≜ p<sub>1</sub> | p<sub>2</sub>; p ≜ δ<sub>L</sub>(p′) and, eventually, p ≜ p<sub>1</sub> || p<sub>2</sub>. All items before follow by the axiom system E<sub>DNK</sub> and the induction hypothesis, under the assumption that p<sub>1</sub>, p<sub>2</sub> and p′ are guarded. ■

Lemma 8 (Soundness of E<sub>DyNetKAT\AIP</sub>) Let E<sub>DyNetKAT\AIP</sub> stand for the axiom system E<sub>DNK</sub> in Figure 6, without the axiom (AIP). E<sub>DyNetKAT\AIP</sub> is sound for DyNetKAT bisimilarity.

Proof sketch. This is proven in a standard fashion, by case analysis on transitions of shape (p, H<sub>0</sub>, H′<sub>0</sub>) −<sup>γ</sup>→ (q, H<sub>1</sub>, H′<sub>1</sub>) with γ ::= (σ, σ′) | x?n | x!n | rcfg(x, n), according to the semantic rules of the DyNetKAT operators in (2). Take (A0) for instance. The left-hand side 0; p can only evolve according to (cpol✓ ; ) in Fig. 4 which, in turn, has an empty premise as ⟦0⟧(σ :: ⟨⟩) = {} for all σ. Thus, (cpol✓ ; ) does not entail any step for this case. Symmetrically, there is no semantic transition for ⊥ in Fig. 4. In other words, neither the left-hand nor the right-hand side of (A0) displays any behaviour, therefore the axiom is sound. ■

Lemma 9 (Soundness of AIP) The Approximation Induction Principle (AIP) is sound for DyNetKAT bisimilarity.

Proof sketch. The proof is close to the one of Theorem 2.5.8 in [4] and uses the finite-branching property of guarded DyNetKAT policies. ■

Theorem 1 (Soundness & Completeness). E<sub>DNK</sub> is sound and ground-complete for DyNetKAT bisimilarity.

Proof. Soundness: if E<sub>DNK</sub> ⊢ p ≡ q then p ∼ q, follows from Lemma 8 and Lemma 9. Completeness: if p ∼ q then E<sub>DNK</sub> ⊢ p ≡ q, is shown as follows. Without loss of generality, assume p and q are in n.f., according to Lemma 7. We want to show that p ≡ q ⊕ p and q ≡ p ⊕ q which, by ACI of ⊕, implies p ≡ q. This reduces to showing that every summand of p is a summand of q and vice versa. We first argue that every summand of p is a summand of q. The reasoning is by structural induction.

Base case: if p ≜ ⊥ then, by the hypothesis p ∼ q, also q ≜ ⊥.

Induction step. Case p ≜ ((α · π); p′) ⊕ p′′: then, (p, σ<sub>α</sub> :: H, H′) −<sup>(σ<sub>α</sub>,σ<sub>π</sub>)</sup>→ (p′, H, σ<sub>π</sub> :: H′) implies by the hypothesis p ∼ q that (q, σ<sub>α</sub> :: H, H′) −<sup>(σ<sub>α</sub>,σ<sub>π</sub>)</sup>→ (q′, H, σ<sub>π</sub> :: H′) and p′ ∼ q′. Recall that q is in n.f.; hence, by the shape of the semantic rules in Figure 4 it holds that q ≜ ((α · π); q′) ⊕ q′′. By the induction hypothesis, it holds that p′ ≡ q′; hence, (α · π); p′ is a summand of q as well. Cases p ≜ (c; p′) ⊕ p′′ with c ::= x?n | x!n | rcfg<sub>x,n</sub> follow in a similar fashion. Hence, p ≡ q ⊕ p holds. The symmetric case q ≡ p ⊕ q follows the same reasoning.

We refer to [7] for the complete proofs and additional details.

### 4 A Framework for Safety

In this section we provide a language for specifying safety properties for networks characterized by DyNetKAT, together with a procedure for reasoning about safety in an equational fashion. Intuitively, safety properties enable specifying the absence of undesired network behaviours.

Definition 10 (Safety Properties - Syntax) Let A be an alphabet over letters of shape α · π and rcfg<sub>x,p</sub>, with α and π ranging over complete tests and assignments, and rcfg<sub>x,p</sub> ranging over reconfiguration actions. Safety properties are defined in the following fashion:

```
act    ::= α · π | rcfg_{x,p}        (α · π, rcfg_{x,p} ∈ A)
regexp ::= true | act | ¬act | regexp + regexp | regexp · regexp | (regexp)^n    (with n ≥ 1)
prop   ::= [regexp]false
```

A safety property specification prop is satisfied whenever the behaviour encoded by regexp is not observed within the network. Regular expressions regexp are defined with respect to actions act: a flow of shape α · π is the observable behaviour of a (NetKAT<sup>−dup</sup>) policy transforming a packet encoded by α into απ, whereas rcfg<sub>x,p</sub> corresponds to a reconfiguration step in a network. Recursively, a sum of regular expressions regexp<sub>1</sub> + regexp<sub>2</sub> encodes the union of the two behaviours, and a concatenation of regular expressions regexp<sub>1</sub> · regexp<sub>2</sub> encodes the behaviour of regexp<sub>1</sub> followed by the behaviour of regexp<sub>2</sub>. A property of shape [¬a]false, with a ∈ A, states that the system cannot do anything apart from a as a first step. The property [true]false states that no action can be observed in the network, whereas [r<sup>n</sup>]false encodes the repeated application of r for n times.

Note that true, negated expressions ¬a and repetitions r<sup>n</sup> are mere syntactic sugar for equivalent expressions free of these operations. Not surprisingly, "desugaring" (ds(−)) is defined as:

$$\begin{array}{ll} ds(true) \triangleq \Sigma_{a \in \mathcal{A}}\, a & \\ ds(\neg a) \triangleq \Sigma_{a_i \in \mathcal{A},\ a_i \neq a}\, a_i & \\ ds(r^n) \triangleq ds(\underbrace{r \cdot r \cdot \ldots \cdot r}_{n \text{ times}}) & \\ ds(r_1 \cdot r_2) \triangleq ds(r_1) \cdot ds(r_2) & \text{if } r_1 \cdot r_2 \text{ not de-sugared} \\ ds(r_1 + r_2) \triangleq ds(r_1) + ds(r_2) & \text{if } r_1 + r_2 \text{ not de-sugared} \\ ds(r) \triangleq r & \text{[owise]} \end{array}$$

The complete formal definition of the de-sugaring function is provided in [7].
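The desugaring equations can be read as a recursive function. The following Python sketch is one possible rendering, assuming regular expressions are encoded as nested tuples (`("act", a)`, `("not", a)`, `("true",)`, `("sum", r1, r2)`, `("cat", r1, r2)`, `("pow", r, n)`) and that the alphabet contains at least two letters, so that the sums are non-empty. This encoding is an illustrative choice, not the representation used by the DyNetiKAT tool.

```python
from functools import reduce

def big_sum(actions):
    """Left-nested sum a1 + a2 + ... over a non-empty list of letters."""
    return reduce(lambda acc, a: ("sum", acc, ("act", a)),
                  actions[1:], ("act", actions[0]))

def ds(r, alphabet):
    """Remove true, negation and repetition from the regular expression r."""
    tag = r[0]
    if tag == "true":                       # ds(true) = sum of all letters
        return big_sum(alphabet)
    if tag == "not":                        # ds(¬a) = sum of all letters but a
        return big_sum([a for a in alphabet if a != r[1]])
    if tag == "pow":                        # ds(r^n) = ds(r · r · ... · r), n >= 1
        body = ds(r[1], alphabet)
        return reduce(lambda acc, _: ("cat", acc, body), range(r[2] - 1), body)
    if tag in ("sum", "cat"):               # push ds through + and ·
        return (tag, ds(r[1], alphabet), ds(r[2], alphabet))
    return r                                # plain letters are left unchanged

alphabet = ["a", "b", "c"]                  # hypothetical three-letter alphabet
```

For instance, `ds(("not", "a"), alphabet)` yields the sum b + c, and `ds(("pow", ("not", "a"), 2), alphabet)` yields (b + c) · (b + c).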

Definition 11 (Safety Properties - Semantics) Let A be an alphabet over letters of shape α · π and rcfg(x, p), with α and π ranging over complete tests and assignments, and rcfg(x, p) ranging over reconfiguration actions. We write w, w′ for (non-empty) words with letters in A (i.e., w, w′ ∈ A<sup>∗</sup>) and | w | for the length of w. We write w′ ⪯ w whenever w′ is a prefix of w (including w).

Let r be a de-sugared regular expression (regexp) as in Definition 10. We call head normal form (h.n.f.) of r, denoted by hnf(r), the sum of words as above obtained by left-/right-distributing · over + in r, in the standard fashion. Note that such a h.n.f. always exists for r. Let Prop stand for the set of all properties as in Definition 10, in h.n.f.

The semantic map ⟦−⟧ : Prop → DyNetKAT associates to each safety property in Prop a DyNetKAT expression as follows. Let Θ be the DyNetKAT policy (in normal form) encoding all possible behaviours over A: Θ ≜ Σ<sup>⊕</sup><sub>a∈A</sub> (a; ⊥ ⊕ a; Θ). Then:

$$\begin{array}{ccccc} \mathbb{I}\left[\Sigma\_{i}\in I\right]w\_{i}\left[false\right] \triangleq \Sigma^{\oplus} & \begin{array}{c} w \in \mathcal{A} \\ w \in \mathcal{A}^{\*} \end{array}\left[\begin{array}{c} \overline{w};\bot \\ \mid\,w\mid $$

such that M is the length of the longest word w<sub>i</sub>, with i ∈ I, and w̄ is a DyNetKAT-compatible term obtained from w where all letters have been separated by ";" and inductively defined in the obvious way. Namely, ā ≜ a for a ∈ A, and the term associated with a · w is a; w̄, for a ∈ A and w ∈ A<sup>∗</sup>. The semantic map ⟦−⟧ is defined following the intuition provided earlier in this section. For instance, as shown in (3), if none of the sequences of steps w<sub>i</sub> can be observed in the system, then the associated DyNetKAT term prevents the immediate execution of all w<sub>i</sub>.

Typically, safety analysis is reduced to reachability. In our context, a safety property is violated whenever the network system under analysis displays a (finite) execution that is not in the behaviour of the property. Thus, the aforementioned semantic map is based on traces (or words in A<sup>∗</sup>) and is not sensitive to branching. This paves the way to reasoning about safety properties in an equational fashion.
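The head normal form used in this definition, a sum of words obtained by distributing · over +, can be sketched in Python as the set of words denoted by a de-sugared expression. The tuple encoding (`("act", a)`, `("sum", r1, r2)`, `("cat", r1, r2)`) and the names `hnf_words` and `to_term` are assumptions made for this illustration.

```python
def hnf_words(r):
    """Set of words (tuples of letters) in the head normal form of r."""
    tag = r[0]
    if tag == "act":
        return {(r[1],)}
    if tag == "sum":                        # union of the two behaviours
        return hnf_words(r[1]) | hnf_words(r[2])
    if tag == "cat":                        # distribute · over +
        return {u + v for u in hnf_words(r[1]) for v in hnf_words(r[2])}
    raise ValueError("expression is not de-sugared")

def to_term(word):
    """Render a word as letters separated by ';' (the overlined term)."""
    return "; ".join(word)

# (b + c) · a  has head normal form  b · a + c · a
r = ("cat", ("sum", ("act", "b"), ("act", "c")), ("act", "a"))
words = hnf_words(r)
longest = max(len(w) for w in words)        # the bound M of the definition
```

Representing the sum as a Python set builds in associativity, commutativity and idempotence of +, which is harmless here because only the words themselves matter.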

Definition 12 (Safe Network Systems) Let E<sup>tr</sup><sub>DNK</sub> stand for the equational axioms in Figure 6, including the additional axiom that enables switching from the context of bisimilarity to trace equivalence of DyNetKAT policies, namely: p; (q ⊕ r) ≡ p; q ⊕ p; r. Assume a specification given as the safety formula s and a network system implemented as the DyNetKAT policy i. We say that the network is safe whenever the following holds: E<sup>tr</sup><sub>DNK</sub> ⊢ ⟦s⟧ ⊕ i ≡ ⟦s⟧. In words: checking whether i satisfies s reduces to checking whether the trace behaviour of i is included in that of s.

For an example, consider the firewall in Figure 1 and the corresponding encoding in Figure 3. Recall that reaching int from ext without observing a secure connection request is a faulty behaviour. This entails the safety formula s<sub>n</sub> defined as [(¬rcfg<sub>secConReq,1</sub>)<sup>n</sup> · (α · π)]false, for n ∈ ℕ, α ≜ (port = ext) and π ≜ (port ← int). Therefore, checking whether the network is safe reduces to checking, for all n ∈ ℕ: E<sup>tr</sup><sub>DNK</sub> ⊢ ⟦s<sub>n</sub>⟧ ⊕ Init ≡ ⟦s<sub>n</sub>⟧. Note that, for a fixed n, the verification procedure resembles bounded model checking [5].
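The trace-based reading of s<sub>n</sub> can be illustrated by brute force on a small LTS. The Python sketch below enumerates all traces of length n + 1 of an abstract firewall and looks for one whose first n letters avoid rcfg(secConReq,1) and whose last letter is the ext-to-int flow. It is not the equational procedure of Definition 12: the LTS, the letter names and the functions `traces` and `violates` are assumptions that stand in for the observable steps of the system.

```python
SCR = "rcfg(secConReq,1)"
SCE = "rcfg(secConEnd,1)"

# Abstract firewall: letters are flows (as strings) and reconfigurations.
FIREWALL = {
    "Switch":  [("int->ext", "Switch"), (SCR, "Switch'")],
    "Switch'": [("int->ext", "Switch'"), ("ext->int", "Switch'"), (SCE, "Switch")],
}

def traces(lts, start, length):
    """All label sequences of exactly `length` steps from `start`."""
    frontier = [((), start)]
    for _ in range(length):
        frontier = [(w + (lab,), nxt)
                    for w, st in frontier for lab, nxt in lts[st]]
    return {w for w, _ in frontier}

def violates(lts, start, n, excluded, bad):
    """Is there a trace matching (¬excluded)^n · bad ?"""
    return any(excluded not in w[:n] and w[n] == bad
               for w in traces(lts, start, n + 1))

# A faulty variant in which the closed switch lets ext traffic through.
BROKEN = dict(FIREWALL, Switch=FIREWALL["Switch"] + [("ext->int", "Switch")])

safe_up_to_5 = all(not violates(FIREWALL, "Switch", n, SCR, "ext->int")
                   for n in range(6))
broken_at_0 = violates(BROKEN, "Switch", 0, SCR, "ext->int")
```

As with bounded model checking, the check only covers the values of n that are explored; here the firewall passes for n up to 5, while the faulty variant already fails for n = 0.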

### 5 Implementation

In this section, we describe our implementation for formal reasoning about dynamic networks. Our prototype tool, called DyNetiKAT (available at https://github.com/hcantunc/DyNetiKAT), is based on Maude [8], the NetKAT decision procedure [10], and Python [19] as a glue language. Our modular extension of NetKAT allows for reusing the NetKAT tools in our framework. In our prototype, we focus on checking reachability and waypointing in a dynamic setting. We build upon the methods for checking reachability and waypointing properties in NetKAT [3]. For a reminder, in NetKAT, reachability and waypointing properties are characterised as follows: for reachability properties, an egress point out is reachable from an ingress point in, in the context of a switch policy p and topology t, whenever the following NetKAT inequivalence holds: in · (p · t)<sup>∗</sup> · out ≢ 0. For waypointing properties, an intermediate point w between in and out is considered a waypoint from in to out if all the packets from in to out go through w. Such a property is satisfied if the following equivalence holds:

$$\begin{aligned} \left( in \cdot (p \cdot t)^{*} \cdot out + in \cdot (\neg out \cdot p \cdot t)^{*} \cdot w \cdot (\neg in \cdot p \cdot t)^{*} \cdot out \right) \\ \equiv in \cdot (\neg out \cdot p \cdot t)^{*} \cdot w \cdot (\neg in \cdot p \cdot t)^{*} \cdot out \end{aligned}$$

In order to utilise the NetKAT decision procedure for property checking we represent the properties given as regular expressions (as described in Section 4). To this end, we introduce the operators head(D) and tail(D, R), where D is a DyNetKAT term and R is a set of terms of shape rcfg<sub>X,N</sub>. Intuitively, the operator head(D) returns a NetKAT policy representing the current configuration in D, and tail(D, R) returns a DyNetKAT policy which is the sum of policies in D that appear after the synchronisation events in R. We utilise these operators as follows: for a given DyNetKAT term we apply our equational reasoning framework to unfold the expression and rewrite it into normal form. Then, we extract the desired configurations by using the head and tail operators. After this step, the resulting expression is a NetKAT term and we use the NetKAT decision procedure for checking properties.

For example, consider the safety property [(true)<sup>n</sup> · (α · π)]false as in Definition 10, and a network SDN. Note that for a given complete assignment there exists a corresponding complete test with the same values, e.g., the corresponding complete test for the complete assignment f<sub>0</sub> ← v<sub>0</sub> . . . f<sub>n</sub> ← v<sub>n</sub> is f<sub>0</sub> = v<sub>0</sub> . . . f<sub>n</sub> = v<sub>n</sub>. Henceforth, we write α<sub>π</sub> to represent the complete test corresponding to π. The property [(true)<sup>n</sup> · (α · π)]false can be encoded in the style of NetKAT as follows:

$$
\alpha \cdot head(\pi_n(SDN)) \cdot \alpha_\pi \equiv \mathbf{0} \tag{4}
$$

$$
\alpha \cdot head(tail(\pi_n(SDN), R)) \cdot \alpha_\pi \equiv \mathbf{0} \tag{5}
$$

where R is the set of all synchronisation events in the network and π<sub>n</sub>(−) is the projection operator equationally defined in Figure 6. In our technical report [7] we provide the corresponding correctness specification of the stateful example discussed in Section 1. Note that in practice the parameter n in π<sub>n</sub> is a fixed value specified by the user. Intuitively, (4) expresses that the initial configuration of the network is not able to transform the packets satisfying the predicate α such that they satisfy the predicate α<sub>π</sub>, and (5) expresses that this transformation is still not possible in the configurations after any sequence of synchronisation events. Formally, the operators head and tail are defined as follows:

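$$\begin{array}{ll} head(\bot) = \mathbf{0} & tail(\bot, R) = \bot \\ head(N; D) = N + head(D) & tail(N; D, R) = tail(D, R) \\ head(D \oplus Q) = head(D) + head(Q) & tail(D \oplus Q, R) = tail(D, R) \oplus tail(Q, R) \\ head(\mathbf{rcfg}_{X,N}; D) = \mathbf{0} & tail(\mathbf{rcfg}_{X,N}; D, R) = D \oplus tail(D, R) \quad \text{if } \mathbf{rcfg}_{X,N} \in R \\ & tail(\mathbf{rcfg}_{X,N}; D, R) = \bot \quad \text{if } \mathbf{rcfg}_{X,N} \notin R \end{array}$$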
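The defining equations of head and tail translate directly into two recursive functions. The Python sketch below assumes finite, recursion-free terms (as obtained after projection and normalisation) encoded as tuples: `("bot",)`, `("seq", N, D)`, `("choice", D, Q)` and `("rcfg", X, N, D)`. NetKAT policies are opaque strings and a NetKAT sum is represented as a frozenset of summands, with the empty set playing the role of 0. These encodings are illustrative assumptions and not the data structures of the DyNetiKAT tool.

```python
BOT = ("bot",)

def head(d):
    """NetKAT sum (as a frozenset of summands) for the current configuration."""
    tag = d[0]
    if tag == "bot":
        return frozenset()                       # head(⊥) = 0
    if tag == "seq":
        return frozenset({d[1]}) | head(d[2])    # head(N; D) = N + head(D)
    if tag == "choice":
        return head(d[1]) | head(d[2])
    if tag == "rcfg":
        return frozenset()                       # head(rcfg; D) = 0
    raise ValueError(f"unexpected term {d!r}")

def tail(d, R):
    """DyNetKAT term collecting what follows the synchronisation events in R."""
    tag = d[0]
    if tag == "bot":
        return BOT
    if tag == "seq":
        return tail(d[2], R)
    if tag == "choice":
        return ("choice", tail(d[1], R), tail(d[2], R))
    if tag == "rcfg":
        _, x, n, cont = d
        return ("choice", cont, tail(cont, R)) if (x, n) in R else BOT
    raise ValueError(f"unexpected term {d!r}")

# Hypothetical term:  p; ⊥  ⊕  rcfg_{secConReq,1}; (q; ⊥)
term = ("choice", ("seq", "p", BOT), ("rcfg", "secConReq", "1", ("seq", "q", BOT)))
R = {("secConReq", "1")}
before = head(term)
after = head(tail(term, R))
```

In this example `before` contains only the policy p that is in place initially, whereas `after` contains the policy q installed by the reconfiguration, which is the shape of the expressions used in (4) and (5).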

Note that we assume the DyNetKAT terms given as input to the operators head and tail do not contain terms of shape x?q and x!q. This can be ensured by applying the restriction operator δ on the input terms.

Observe that the safety properties of Definition 10 are designed to capture unsafe flows. Similarly, one can also define the syntax ⟨regexp⟩true to express that a certain safe flow is possible and reason about it. For an example, consider the stateful firewall example and the property ⟨(rcfg<sub>secConReq,1</sub>)<sup>n</sup> · (α · π)⟩true where α ≜ (port = ext) and π ≜ (port ← int). This property expresses that the flow from port ext to port int is possible after the event rcfg<sub>secConReq,1</sub>. This property can be encoded in the NetKAT style as α · head(tail(π<sub>n</sub>(Init), R)) · α<sub>π</sub> ≢ 0 where R = {rcfg<sub>secConReq,1</sub>}.

Fig. 7: A FatTree Topology

### 6 Experimental Evaluation

In this section we evaluate the applicability of our implementation based on a FatTree [22] topology case. FatTrees are hierarchical topologies commonly used in data centers. Figure 7 illustrates a FatTree with 3 levels: core, aggregation and top-of-rack (ToR). The switches at each level contain a number of redundant links to the upper level. The groups of ToR switches and their corresponding aggregation switches are called pods. For our experiments, we generated 6 FatTrees that grow in size and achieve a maximum size of 1344 switches. For these networks we computed a shortest path forwarding policy between all pairs of ToR switches. The number of switches in the ToR layer is set to k<sup>3</sup>/4 where k is the number of pods in the network.

We check dynamic properties on these networks and assess the time performance of our tool. We consider a scenario involving two ToR switches T<sub>a</sub> and T<sub>b</sub> that reside in different pods. Initially, all packets from T<sub>a</sub> to T<sub>b</sub> traverse a firewall A<sub>x</sub> in the aggregation layer which filters SSH packets. The controller then decides to shift the firewall from A<sub>x</sub> to another switch A<sub>x′</sub> in the aggregation layer. For this purpose, the controller updates the corresponding aggregation and core layer switches, resulting in 4 updates. The checked properties are as follows: (i) At any point while the controller is performing the updates, non-SSH packets from T<sub>a</sub> can always reach T<sub>b</sub>. (ii) At any point while the controller is performing the updates, SSH packets from T<sub>a</sub> can never reach T<sub>b</sub>. (iii) After all the updates are performed, A<sub>x′</sub> is a waypoint between T<sub>a</sub> and T<sub>b</sub>.

We conducted the experiments on a machine running Ubuntu 20.04 LTS with a 16-core 2.4 GHz Intel i9-9980HK processor and 64 GB of RAM. The results are depicted in Figure 8. We report the preprocessing time, the time taken for checking properties (i), (ii), and (iii) individually (referred to as Reachability-I, Reachability-II, and Waypointing, respectively), and the time taken to check all the properties in parallel (referred to as All Properties). The reported times are the average of 10 runs.

The results indicate that the preprocessing step is a non-negligible factor that contributes to the overall time. However, preprocessing is independent of the property that is being checked and this procedure only needs to be done once for a given network. After the preprocessing step, the individual properties can be checked in less than 2 seconds for networks with fewer than 100 switches. For larger networks with sizes up to 931 and 1344 switches, the individual properties can be checked in a maximum of 5 minutes and 11 minutes, respectively. Checking property (iii) takes more than twice as much time as checking properties (i) and (ii). In the experiments where we check all properties in parallel, we allocated one thread for each property. In this setting, checking all properties introduced a 24% overhead on average. After preprocessing, on average 87% of the running time is spent in the NetKAT decision procedure, and this step becomes the bottleneck in analysing larger networks.

Fig. 8: Results of FatTree experiments. Light-coloured areas indicate the time spent in the NetKAT tool and solid coloured areas indicate the time spent in our tool.

### 7 Conclusions

We develop the language DyNetKAT for modelling and reasoning about dynamic reconfigurations in Software Defined Networks. Our language builds upon the concepts, syntax, and semantics of NetKAT and hence provides a modular extension and makes it possible to reuse the theory and tools of NetKAT. We define a formal semantics for our language and provide a sound and ground-complete axiomatisation. We exploit our axiomatisation to analyse reachability properties of dynamic networks and show that our approach scales to networks with hundreds of switches. We assume that each data plane packet sees one set of flow tables throughout its flight in the network [17]. As future work, we plan to investigate small-step semantics in which the control plane updates can have a finer interleaving with in-flight packets. Another natural direction for future work is devising compilation schemes enabling the translation of DyNetKAT programs into real running code.

### References



### A new criterion for M, N-adhesivity, with an application to hierarchical graphs<sup>⋆</sup>

Davide Castelnovo<sup>1</sup>, Fabio Gadducci<sup>2</sup>, and Marino Miculan<sup>1</sup>

<sup>1</sup> Department of Mathematics, Computer Science and Physics, University of Udine, Udine, Italy.

davide.castelnovo@uniud.it, marino.miculan@uniud.it

<sup>2</sup> Department of Computer Science, University of Pisa, Pisa, Italy. fabio.gadducci@unipi.it

Abstract. Adhesive categories provide an abstract framework for the algebraic approach to rewriting theory, where many general results can be recast and uniformly proved. However, checking that a model satisfies the adhesivity properties is sometimes far from immediate. In this paper we present a new criterion giving a sufficient condition for M, N -adhesivity, a generalisation of the original notion of adhesivity. We apply it to several existing categories, and in particular to hierarchical graphs, a formalism that is notoriously difficult to fit in the mould of algebraic approaches to rewriting and for which various alternative definitions float around.

### 1 Introduction

The introduction of adhesive categories marked a watershed moment for the algebraic approaches to the rewriting of graph-like structures [16,9]. Until then, key results of the approaches on e.g. parallelism and confluence had to be proven over and over again for each different formalism at hand, despite the obvious similarity of the procedure. Differently from previous solutions to such problems, as the one witnessed by the butterfly lemma for graph rewriting [8, Lemma 3.9.1], the introduction of adhesive categories provided such a disparate set of formalisms with a common abstract framework where many of these general results could be recast and uniformly proved once and for all.

Despite the elegance and effectiveness of the framework, proving that a given category satisfies the conditions for being adhesive can be a daunting task. For this reason, we look for simpler general criteria implying adhesivity for a class of categories. Similar criteria have been already provided for the core framework of adhesive categories; e.g., every elementary topos is adhesive [17], and a category is (quasi)adhesive if and only if it can be suitably embedded in a topos [15,12]. This covers many useful categories such as sets, graphs, etc.; on the other hand, there are many categories of interest which are not (quasi)adhesive, such as directed graphs, posets, and many of their subcategories. In these cases we can try to prove the more general M, N-adhesivity for suitable M, N; however, so far this has been achieved only by means of ad hoc arguments. To this end, one of the main contributions of this paper is a new criterion for M, N-adhesivity, based on the verification of some properties of functors connecting the category of interest to a family of suitable adhesive categories. This criterion allows us to prove in a uniform and systematic way some previous results about the adhesivity of categories built by products, exponents, and comma construction.

<sup>⋆</sup> Work supported by the Italian MIUR project PRIN 2017FTXR7S "IT-MaTTerS".

Moreover, it is well-known that categorical properties are often prescriptive, indicating abstractly the presence of some good behaviour of the modelled system. Adhesivity is one such property, as it is highly sought after when it comes to rewriting theories. Thus, our criterion for proving M, N-adhesivity can be seen also as a "litmus test" for the given category. This is useful in situations that are not completely settled, and for which different settings have been proposed. An important example is that of hierarchical graphs, for which we can roughly find two alternative proposals: on the one hand, algebraic formalisms where the edges have some algebraic structures, so that the nesting is a side effect of the term construction; on the other hand, combinatorial approaches where the topology of a standard graph is enriched by some partial order, either on the nodes or on the edges, where the order relation indicates the presence of nesting. By applying our criterion, we can show that the latter approach indeed yields an M, N-adhesive category, confirming and overcoming the limitations of some previous approaches to hierarchical graphs [21,23,24], which we briefly recall next.

The more straightforward proposal is by Palacz [24], using a poset of edges instead of just a set; however, the class of rules has to be restricted in order to apply the approach, which in any case predates the introduction of adhesive categories. Our work allows us to rephrase Palacz's proposal in terms of adhesive properties and to generalise it, dropping his constraint on rules. Another attempt is Mylonakis and Orejas' graphs with layers [21], for which M-adhesivity is proved for a class of monomorphisms in the category of symbolic graphs; however, nodes between edges at different layers cannot be shared. Padberg [23] goes for a coalgebraic presentation via a peculiar "superpower set" functor; this gives immediately M-adhesivity provided that this superpower set functor is well-behaved with respect to limits. However, this approach is rather ad hoc, not modular and not very natural for actual modelling.

Summarising, the main contributions of this work are: (a) a new general criterion for assessing M, N-adhesivity; (b) new proofs of M, N-adhesivity for some relevant categories, systematising previously known proofs; (c) the first proof that a category of hierarchical graphs is M, N-adhesive.

Synopsis. After having recalled some basic notions, in Section 2 we introduce the new criterion for M, N -adhesivity; using it, we show M, N -adhesivity of several constructions, such as products and comma categories. In Section 3 we apply this theory to various example categories, such as directed (acyclic) graphs, trees and term graphs. We show also the adhesivity of several categories obtained by combining adhesive ones, and in particular of the elusive category of hierarchical graphs. Conclusions and directions for future work are in Section 4. An extended version of this paper is available at [6].

### 2 M, N-adhesivity via creation of (co)limits

In this section we recall some definitions and results about M, N -adhesive categories and provide a new criterion to prove this property.

### 2.1 M, N-adhesive categories

Intuitively, an adhesive category is one in which pushouts of monomorphisms exist and "behave more or less as they do in the category of sets" [16]. Formally, we require pushouts of monomorphisms to be Van Kampen colimits.

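Definition 2.1. A Van Kampen square in a category A is a pushout square

$$\begin{array}{ccc} A & \xrightarrow{f} & B \\ {\scriptstyle m}\downarrow & & \downarrow{\scriptstyle n} \\ C & \xrightarrow[g]{} & D \end{array}$$

such that for any commutative cube having this square as its bottom face and whose back faces are pullbacks, the top face is a pushout if and only if the front faces are pullbacks. Pushout squares which enjoy the "if" of this condition are called stable.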
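To make the notion of pushout concrete in the simplest case, the following is a minimal Python sketch of how a pushout of finite sets can be computed: take the disjoint union of B and C and identify f(a) with m(a) for every a in A. The function name `pushout`, the dictionary encoding of functions and the tagging scheme are illustrative assumptions, not part of the paper, and the sketch does not check the Van Kampen condition itself.

```python
def pushout(A, f, m, B, C):
    """Pushout of B <-f- A -m-> C in finite sets (illustrative sketch).

    f and m are dicts encoding functions A -> B and A -> C.
    Returns (D, n, g): D is a set of equivalence classes, n : B -> D, g : C -> D.
    """
    # Tag elements so that B and C stay disjoint inside the union.
    elems = [("B", b) for b in B] + [("C", c) for c in C]
    parent = {x: x for x in elems}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Identify f(a) with m(a) for every a in A.
    for a in A:
        parent[find(("B", f[a]))] = find(("C", m[a]))

    # Collect the equivalence classes.
    members = {}
    for x in elems:
        members.setdefault(find(x), set()).add(x)
    cls = {root: frozenset(xs) for root, xs in members.items()}

    D = set(cls.values())
    n = {b: cls[find(("B", b))] for b in B}
    g = {c: cls[find(("C", c))] for c in C}
    return D, n, g


# Gluing {0, 1} and {0, 2} along the common element 0.
D, n, g = pushout({0}, {0: 0}, {0: 0}, {0, 1}, {0, 2})
```

In the small example, the square commutes because `n[f[a]]` and `g[m[a]]` are the same class for the single element of A.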

Given a category A we will denote by Mor(A), Mono(A), Reg(A) respectively the classes of morphisms, monomorphisms and regular monomorphisms of A.

Definition 2.2. Let A be a category and A ⊆ Mor(A). Then we say that A is


Remark 2.1. Clearly, "decomposition" corresponds to "left cancellation", but we prefer to stick to the name commonly used in literature (see e.g. [14]).

We are now ready to give the definition of M, N -adhesive category [14,25].

Definition 2.3. Let A be a category and M ⊆ Mono(A), N ⊆ Mor(A) where


Then we say that A is M, N -adhesive if


Remark 2.2. M-adhesivity as defined in [2] coincides with M, Mor(A)-adhesivity, while adhesivity and quasiadhesivity [16,12] coincide with Mono(A)-adhesivity and Reg(A)-adhesivity, respectively. Notice that, in the M-adhesive case, stability under pushouts of M derives from properties (a)–(c) of Definition 2.3, while closure under decomposition follows from stability under pullbacks in any category, so there is no need to prove it independently.

Other authors have introduced weaker notions of M-adhesivity; see, e.g., [9,11,28], where our M-adhesive categories are called adhesive HLR categories.

In general, proving that a given category is M, N-adhesive by verifying the conditions of Definition 2.3 may be long and tedious; hence, we seek criteria which are sufficient for adhesivity, and simpler to prove. A prominent example is the following result due to Lack and Sobociński.

Theorem 2.1 ([17], Thm. 26). Any elementary topos is an adhesive category.

In particular the category Set of sets and any presheaf category are adhesive. However, there are many important categories for (graph) rewriting which are not toposes, hence the need for more general criteria.

### 2.2 A new criterion for M, N-adhesivity

In this section we present our main result, i.e., that M, N -adhesivity is guaranteed by the existence of a family of functors with sufficiently nice properties. We will adapt some definitions from [1].

Definition 2.4. Let I : I → C be a diagram and J a set. We say that a family F = {F<sub>j</sub>}<sub>j∈J</sub> of functors F<sub>j</sub> : C → D<sub>j</sub>


Remark 2.3. Joint preservation, reflection, lifting or creation of (co)limits of F = {F<sub>j</sub> : A → B<sub>j</sub>}<sub>j∈J</sub> is equivalent to the usual preservation, reflection, lifting or creation of (co)limits for the functor A → ∏<sub>j∈J</sub> B<sub>j</sub> induced by F. Notice that our notion of creation follows [22], which is more lax than, e.g., [19, Def. V.1].

Theorem 2.2. Let A be a category, M ⊂ Mono(A), N ⊂ Mor(A) satisfying conditions (i)–(iii) of Definition 2.3, and F a non-empty family of functors F<sub>j</sub> : A → B<sub>j</sub> such that B<sub>j</sub> is M<sub>j</sub>, N<sub>j</sub>-adhesive.

1. If every F<sub>j</sub> preserves pullbacks, F<sub>j</sub>(M) ⊂ M<sub>j</sub> and F<sub>j</sub>(N) ⊂ N<sub>j</sub> for every j ∈ J, F jointly preserves M, N-pushouts, and jointly reflects pushout squares

$$\begin{array}{ccc} F\_j(A) & \xrightarrow{F\_j(f)} & F\_j(B) \\ {\scriptstyle F\_j(m)}\downarrow & & \downarrow{\scriptstyle F\_j(n)} \\ F\_j(C) & \xrightarrow[F\_j(g)]{} & F\_j(D) \end{array}$$

with m, n ∈ M and f ∈ N, then M, N-pushouts in A are stable. Moreover, if in addition F jointly reflects M-pullbacks and N-pullbacks, then M, N-pushouts are Van Kampen squares.


$$\begin{aligned} \mathcal{M}\_F &:= \{ m \in \text{Mor}(\mathbf{A}) \mid F\_j(m) \in \mathcal{M}\_j \text{ for every } j \in J \}, \\ \mathcal{N}\_F &:= \{ n \in \text{Mor}(\mathbf{A}) \mid F\_j(n) \in \mathcal{N}\_j \text{ for every } j \in J \} \end{aligned}$$

Proof. (1.) Take a cube in which the bottom face is an M, N-pushout and all the vertical faces are pullbacks (below, left). Applying any F<sub>j</sub> ∈ F we get another cube in B<sub>j</sub> (below, right) in which the bottom face is an M<sub>j</sub>, N<sub>j</sub>-pushout (because F<sub>j</sub>(m) ∈ M<sub>j</sub> and F<sub>j</sub>(n) ∈ N<sub>j</sub>) and the vertical faces are pullbacks, thus the top face of the second cube is a pushout for every j ∈ J.

[Diagram: on the left, a cube in A with bottom face on A, B, C, D and top face on A′, B′, C′, D′; on the right, its image under F<sub>j</sub> in B<sub>j</sub>.]

Now m′, f′ ∈ M and n′ ∈ N since they are the pullbacks of m, f and n, and thus we can conclude.

Suppose now that F jointly reflects M-pullbacks and N-pullbacks; we have to show that the front faces of the first cube above are pullbacks if the top one is a pushout. In the second cube, the bottom and top faces are M<sub>j</sub>, N<sub>j</sub>-pushouts and the back faces are pullbacks, so the front faces are pullbacks too by M<sub>j</sub>, N<sub>j</sub>-adhesivity. Now, notice that f ∈ M and g ∈ N (since M and N are closed under pushouts) and thus we can conclude since F jointly reflects pullbacks along arrows in M or in N.

(2.) Let us show properties (a), (b), (c) defining M, N -adhesivity.


(3.) By the previous point it is enough to show that M<sup>F</sup> and N<sup>F</sup> satisfy conditions (i)–(iii) of Definition 2.3.


$$\begin{array}{ccc} A & \xrightarrow{f} & B \\ {\scriptstyle m}\downarrow & & \downarrow{\scriptstyle n} \\ C & \xrightarrow[g]{} & D \end{array}$$

and suppose that it is a pullback with n ∈ M<sub>F</sub> (N<sub>F</sub>), then applying any F<sub>j</sub> ∈ F we get that F<sub>j</sub>(m) is the pullback of F<sub>j</sub>(n) along F<sub>j</sub>(g), since F<sub>j</sub>(n) is in M<sub>j</sub> (in N<sub>j</sub>), which implies that F<sub>j</sub>(m) ∈ M<sub>j</sub> (N<sub>j</sub>). This is true for every j ∈ J, from which the thesis follows. Stability under pushouts is proved applying the same argument to m. □

Applying the previous theorem to the families given by, respectively, projections, evaluations and the inclusion, we immediately get the following three corollaries (cf. also [9, Thm. 4.15]).

Corollary 2.1. Let {A<sub>i</sub>}<sub>i∈I</sub> be a family of categories such that each A<sub>i</sub> is M<sub>i</sub>, N<sub>i</sub>-adhesive. Then the product category ∏<sub>i∈I</sub> A<sub>i</sub> is ∏<sub>i∈I</sub> M<sub>i</sub>, ∏<sub>i∈I</sub> N<sub>i</sub>-adhesive, where

$$\prod\_{i \in I} \mathcal{M}\_i := \{ (m\_i)\_{i \in I} \in \mathsf{Mor}(\prod\_{i \in I} \mathbf{A}\_i) \mid m\_i \in \mathcal{M}\_i \text{ for every } i \in I \}$$

$$\prod\_{i \in I} \mathcal{N}\_i := \{ (n\_i)\_{i \in I} \in \mathsf{Mor}(\prod\_{i \in I} \mathbf{A}\_i) \mid n\_i \in \mathcal{N}\_i \text{ for every } i \in I \}$$

Corollary 2.2. Let A be an M, N-adhesive category. Then for every other category C, the category of functors A<sup>C</sup> is M<sup>C</sup>, N<sup>C</sup>-adhesive, where

$$\begin{aligned} \mathcal{M}^{\mathbf{C}} &:= \{ \eta \in \mathsf{Mor}(\mathbf{A}^{\mathbf{C}}) \mid \eta\_C \in \mathcal{M} \text{ for every object } C \text{ of } \mathbf{C} \} \\ \mathcal{N}^{\mathbf{C}} &:= \{ \eta \in \mathsf{Mor}(\mathbf{A}^{\mathbf{C}}) \mid \eta\_C \in \mathcal{N} \text{ for every object } C \text{ of } \mathbf{C} \} \end{aligned}$$

Corollary 2.3. Let A be a full subcategory of an M, N-adhesive category B and M′ ⊂ Mono(A), N′ ⊂ Mor(A) satisfying the first three conditions of Definition 2.3 such that M′ ⊂ M, N′ ⊂ N and A is closed in B under pullbacks and M′, N′-pushouts. Then A is M′, N′-adhesive.

#### 2.3 Comma categories

In this section we show how to apply Theorem 2.2 to the comma construction [19] in order to guarantee some adhesivity properties under suitable hypotheses.

Definition 2.5. For any two functors L : A → C, R : B → C, the comma category L↓R is the category in which


$$\begin{array}{ccc} L(A) & \xrightarrow{L(h)} & L(A') \\ {\scriptstyle f}\downarrow & & \downarrow{\scriptstyle g} \\ R(C) & \xrightarrow[R(k)]{} & R(C') \end{array}$$

We have two obvious forgetful functors

$$\begin{aligned} U\_L: L \downarrow R &\to \mathbf{A} & \quad U\_R: L \downarrow R &\to \mathbf{B} \\ (A, B, f) &\longmapsto A & \quad (A, B, f) &\longmapsto B \\ (h, k) \downarrow & & \downarrow h & \quad (h, k) \downarrow & \quad \downarrow k \\ (A', B', g) &\longmapsto A' & \quad (A', B', g) &\longmapsto B' \end{aligned}$$

Example 2.1. Graph is equivalent to the comma category made from the identity functor on Set and the product functor sending X to X × X.
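As an illustration of this example, here is a small Python sketch, for finite graphs only, of how a graph (E, V, s, t) can be repackaged as an object of the comma category, i.e. a triple (E, V, f) with f : E → V × V, and of how the commuting square defining an arrow (h, k) can be checked. All function names and the dictionary encoding of functions are hypothetical choices made for this sketch.

```python
def to_comma(E, V, s, t):
    """Package (E, V, s, t) as (E, V, f) with f(e) = (s(e), t(e))."""
    return E, V, {e: (s[e], t[e]) for e in E}


def from_comma(E, V, f):
    """Recover source and target maps from f : E -> V x V."""
    s = {e: f[e][0] for e in E}
    t = {e: f[e][1] for e in E}
    return E, V, s, t


def is_comma_morphism(h, k, src, tgt):
    """(h, k) is an arrow iff g(h(e)) = (k x k)(f(e)) for every edge e."""
    E, _, f = src
    _, _, g = tgt
    return all(g[h[e]] == (k[f[e][0]], k[f[e][1]]) for e in E)


# A graph with two edges a : 1 -> 2 and b : 2 -> 2.
E, V = {"a", "b"}, {1, 2}
s, t = {"a": 1, "b": 2}, {"a": 2, "b": 2}
obj = to_comma(E, V, s, t)
```

The round trip `from_comma(*to_comma(E, V, s, t))` gives back the original data, which is the object part of the equivalence mentioned in the example.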

We have a classic result relating limits and colimits in the comma category with those preserved by L or R.

Lemma 2.1. Let I : I → L↓R be a diagram such that L preserves the colimit (if it exists) of U<sub>L</sub> ◦ I. Then the family {U<sub>L</sub>, U<sub>R</sub>} jointly creates colimits of I.

Corollary 2.4. The family {U<sub>L</sub>, U<sub>R</sub>} jointly creates limits along every diagram I : I → L↓R such that R preserves the limit of U<sub>R</sub> ◦ I.

Proof. Apply the previous lemma to R<sup>op</sup> ↓ L<sup>op</sup>, which is equivalent to (L↓R)<sup>op</sup>.

We are now able to deduce the following result from Theorem 2.2.

Theorem 2.3. Let A and B be respectively M, N-adhesive and M′, N′-adhesive categories, L : A → C a functor that preserves M, N-pushouts, and R : B → C a pullback preserving one. Then L↓R is M↓M′, N↓N′-adhesive, where

$$\begin{aligned} \mathcal{M}\downarrow\mathcal{M}' &:= \{ (h,k) \in \mathsf{Mor}(L\downarrow R) \mid h \in \mathcal{M}, k \in \mathcal{M}' \}, \\ \mathcal{N}\downarrow\mathcal{N}' &:= \{ (h,k) \in \mathsf{Mor}(L\downarrow R) \mid h \in \mathcal{N}, k \in \mathcal{N}' \}. \end{aligned}$$

### 3 Some paradigmatic examples

In this section we apply the results provided in Section 2 to some important categories, such as directed (acyclic) graphs, hierarchical (hyper)graphs, directed (acyclic) hypergraphs, and term graphs. These examples have been chosen for their importance in graph rewriting, and because we can recover their M, N-adhesivity in a uniform and systematic way. In fact, in the case of hierarchical (hyper)graphs we give the first proof of M, N-adhesivity, to our knowledge.

### 3.1 Directed (acyclic) graphs

Among visual formalisms, directed (also known as "simple") graphs represent one of the most-used paradigms, since they adhere to the classical view of graphs as relations included in the cartesian product of vertices. It is also well-known that directed graphs are not quasiadhesive [15], not even in their acyclic variant. In this section we are going to exploit Corollary 2.3 to show that these categories of (acyclic) graphs have nevertheless adhesivity properties.

Definition 3.1. A directed multigraph is a 4-tuple (E, V, s, t) where E and V are sets, called the set of edges and nodes respectively, and s, t : E → V are functions, called source and target. An edge e is between v and w if s(e) = v and t(e) = w; E(v, w) is the set of edges between v and w. A morphism (E, V, s, t) → (F, W, s′, t′) is a pair (f, g) of functions f : E → F, g : V → W such that the following diagrams commute

$$\begin{array}{ccc} E & \xrightarrow{s} & V \\ {\scriptstyle f}\downarrow & & \downarrow{\scriptstyle g} \\ F & \xrightarrow[s']{} & W \end{array} \qquad \begin{array}{ccc} E & \xrightarrow{t} & V \\ {\scriptstyle f}\downarrow & & \downarrow{\scriptstyle g} \\ F & \xrightarrow[t']{} & W \end{array}$$

We will denote by Graph the category so defined. A directed graph is a directed multigraph in which there is at most one edge between two nodes; DGraph is the full subcategory of Graph given by directed graphs.

A path [e<sub>i</sub>]<sup>n</sup><sub>i=1</sub> in a directed multigraph is a finite list of edges such that t(e<sub>i</sub>) = s(e<sub>i+1</sub>) for all 1 ≤ i ≤ n − 1. A path is called a cycle if s(e<sub>1</sub>) = t(e<sub>n</sub>). A directed acyclic graph is a directed graph without cycles; directed acyclic graphs form a full subcategory DAG of DGraph and Graph.
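The three notions just introduced (directed multigraph, directed graph, directed acyclic graph) can be told apart mechanically on finite examples. The following Python sketch, whose function names and dictionary encoding of s and t are assumptions made for illustration, checks the "at most one edge between two nodes" condition and detects cycles with an iterative depth-first search.

```python
from collections import Counter


def is_directed_graph(E, s, t):
    """True iff there is at most one edge between any ordered pair of nodes."""
    counts = Counter((s[e], t[e]) for e in E)
    return all(c <= 1 for c in counts.values())


def has_cycle(E, V, s, t):
    """True iff some path [e_1, ..., e_n] satisfies s(e_1) = t(e_n)."""
    succ = {v: [] for v in V}
    for e in E:
        succ[s[e]].append(t[e])
    done = object()   # sentinel: no more successors to visit
    state = {}        # node -> "open" (on the DFS stack) or "closed"
    for root in V:
        if root in state:
            continue
        state[root] = "open"
        stack = [(root, iter(succ[root]))]
        while stack:
            v, successors = stack[-1]
            w = next(successors, done)
            if w is done:
                state[v] = "closed"
                stack.pop()
            elif state.get(w) == "open":
                return True   # back edge: a cycle through w
            elif w not in state:
                state[w] = "open"
                stack.append((w, iter(succ[w])))
    return False


def is_dag(E, V, s, t):
    """Objects of DAG: directed graphs without cycles."""
    return is_directed_graph(E, s, t) and not has_cycle(E, V, s, t)
```

Note that a loop e with s(e) = t(e) counts as a cycle of length one, in agreement with the definition above.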

Remark 3.1. Graph is equivalent to the category of presheaves on • ⇒ •, the category with just two objects and only two parallel arrows between them (besides the identities), thus it is a topos and as such adhesive. Notice that this also implies that limits and colimits are computed component-wise and that an arrow in Graph is mono if and only if both its underlying functions are injective.

Remark 3.2. Notice that if (f, g) : (E, V, s, t) → (F, W, s′, t′) is an arrow in DGraph with g injective, then f is injective too.

We will state now two categorical properties of DGraph that will be useful in the following.

Proposition 3.1. The following properties hold


Remark 3.3. Notice that, since L does not modify the vertices part of a graph, Remark 3.2 implies that L preserves monomorphisms.

Example 3.1. In [15] it is shown that DGraph is not quasiadhesive. Take the cube

By the results of Proposition 3.1 the top and bottom faces are pushouts along regular monos and the back faces are pullbacks, but the front one is not, contradicting the Van Kampen property. The same example shows that even DAG is not quasiadhesive.

Definition 3.2. A monomorphism (f, g) : (E, V, s, t) → (F, W, s′, t′) in Graph is said to be downward closed if, for all e ∈ F, e ∈ f(E) whenever t′(e) ∈ g(V) (in particular this implies that s′(e) ∈ g(V) too). We denote by dclosed, dclosed<sub>d</sub> and dclosed<sub>da</sub> the classes of downward closed morphisms in Graph, DGraph and DAG respectively.
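For finite graphs the downward closure condition is easy to test directly. The sketch below is a hypothetical helper (names and encoding are assumptions, and it does not check that (f, g) is a graph morphism): it verifies injectivity of both components and then that every edge of the codomain whose target lies in the image of g is itself in the image of f.

```python
def is_injective(fn, domain):
    """True iff the dict-encoded function fn is injective on domain."""
    return len({fn[x] for x in domain}) == len(set(domain))


def is_downward_closed(f, g, E, V, F, t_prime):
    """Check Definition 3.2 for (f, g) : (E, V, s, t) -> (F, W, s', t')."""
    if not (is_injective(f, E) and is_injective(g, V)):
        return False  # not a monomorphism in Graph
    image_E = {f[e] for e in E}
    image_V = {g[v] for v in V}
    # Every edge arriving in the image of g must come from an edge of E.
    return all(e in image_E for e in F if t_prime[e] in image_V)


# Codomain: a single edge x : 1 -> 2.
F, t_prime = {"x"}, {"x": 2}
```

For instance, including the one-node graph on the node 1 into this codomain is downward closed (no edge arrives at 1), whereas including the one-node graph on the node 2 is not, since the edge x arrives at 2 but has no preimage.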

Remark 3.4. The functor L of Proposition 3.1 sends downward closed morphisms to downward closed morphisms.

Remark 3.5. By Proposition 3.1 it is clear that any downward closed morphism is regular. The converse does not hold: a counterexample is given by

Lemma 3.1. DGraph and DAG are closed in Graph under pullbacks. Moreover, DGraph is closed under Reg(DGraph), Mono(DGraph)-pushouts and DAG under dclosed<sub>da</sub>, Mono(DAG)-pushouts.

Theorem 3.1. The category DGraph is Reg(DGraph), Mono(DGraph)- and Mono(DGraph), Reg(DGraph)-adhesive, while DAG is dclosed<sub>da</sub>, Mono(DAG)-adhesive.

### 3.2 Tree Orders

In this section we present trees as partial orders and show that the resulting category is actually a topos of presheaves, hence adhesive. This fact will be exploited in Section 3.3 to construct a category of hierarchical graphs, where the hierarchy between edges is modelled by trees.

Definition 3.3. A tree order is a partial order (E, ≤) such that for every e ∈ E, ↓e is a finite set totally ordered by the restriction of ≤. Since ↓e is a finite chain we can define the immediate predecessor function

$$i\_E: E \to E \sqcup \{ \* \} \qquad e \mapsto \begin{cases} \max(\downarrow e \smallsetminus \{e\}) & \downarrow e \neq \{e\} \\ \* & \downarrow e = \{e\} \end{cases}$$

Let i<sup>0</sup><sub>E</sub> be the inclusion E → E ⊔ {∗}; then, for any k ∈ N<sub>+</sub>, the k-th predecessor function i<sup>k</sup><sub>E</sub> : E → E ⊔ {∗} is defined by induction as follows:

$$e \mapsto \begin{cases} i\_E(i\_E^{k-1}(e)) & i\_E^{k-1}(e) \in E \\ \* & i\_E^{k-1}(e) = \* \end{cases}$$

Let f : (E, ≤) → (F, ≤) be a monotone map and f<sub>∗</sub> : E ⊔ {∗} → F ⊔ {∗} be its extension sending ∗ to ∗. We say that f is strict if the following diagram commutes

$$\begin{array}{ccc} E & \xrightarrow{i\_E} & E \sqcup \{ \* \} \\ {\scriptstyle f}\downarrow & & \downarrow{\scriptstyle f\_\*} \\ F & \xrightarrow[i\_F]{} & F \sqcup \{ \* \} \end{array}$$

We define the category Tree as the subcategory of Poset given by tree orders and strict morphisms.

Example 3.2. A strict morphism is simply a monotone function that preserves immediate predecessors (and thus every predecessor). For instance, the function {0} → {0, 1} sending 0 to 1, where we endow the codomain with the order 0 ≤ 1, is not a strict morphism.
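The example can be replayed on finite tree orders with a few lines of Python. In this sketch the order is passed as a predicate `leq`, the extra element ∗ is modelled by a sentinel object `BOTTOM`, and the helper names are all assumptions introduced for illustration; `is_strict` checks monotonicity together with the commuting square of Definition 3.3.

```python
BOTTOM = object()  # stands for the extra element * of E ⊔ {*}


def down(E, leq, e):
    """The down-set of e: all x in E with x <= e."""
    return {x for x in E if leq(x, e)}


def is_tree_order(E, leq):
    """Every down-set is a chain (leq is assumed to be a partial order)."""
    return all(leq(x, y) or leq(y, x)
               for e in E
               for x in down(E, leq, e)
               for y in down(E, leq, e))


def pred(E, leq, e):
    """Immediate predecessor i_E(e): the maximum of the strict down-set, or *."""
    strict = down(E, leq, e) - {e}
    if not strict:
        return BOTTOM
    # In a chain, the maximum is the element with the largest down-set.
    return max(strict, key=lambda x: len(down(E, leq, x)))


def is_strict(f, E, leq_E, F, leq_F):
    """f is monotone and f_*(i_E(e)) = i_F(f(e)) for every e in E."""
    def f_star(x):
        return BOTTOM if x is BOTTOM else f[x]
    monotone = all(leq_F(f[x], f[y]) for x in E for y in E if leq_E(x, y))
    commutes = all(f_star(pred(E, leq_E, e)) == pred(F, leq_F, f[e]) for e in E)
    return monotone and commutes


leq = lambda x, y: x <= y
# The map of Example 3.2, sending 0 to 1, versus the map sending 0 to 0.
not_strict = is_strict({0: 1}, {0}, leq, {0, 1}, leq)
strict = is_strict({0: 0}, {0}, leq, {0, 1}, leq)
```

The first map fails because 0 has no predecessor in the domain, while its image 1 has immediate predecessor 0 in the codomain.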

Remark 3.6. Clearly i<sup>1</sup><sub>E</sub> = i<sub>E</sub> and it holds that i<sup>k</sup><sub>E</sub>(e) = ∗ if and only if |↓e| ≤ k. Otherwise, an easy induction shows that |↓i<sup>k</sup><sub>E</sub>(e)| = |↓e| − k.

Remark 3.7. We have an obvious forgetful functor

$$\begin{aligned} |{-}| &: \mathbf{Tree} \to \mathbf{Set} \\ (E, \leq) &\longmapsto E \\ f \downarrow & \quad \downarrow f \\ (F, \leq) &\longmapsto F \end{aligned}$$

Remark 3.8. Let (E, ≤) be an object of Tree and ω the first infinite ordinal, then we can define its associated presheaf Ê : ω<sup>op</sup> → Set sending n to the set

$$\{e \in E \mid |\downarrow e \smallsetminus \{e\}| = n\}$$

If n ≤ m in ω, we can define a function

$$\iota\_{n,m}^E: \widehat{E}(m) \to \widehat{E}(n) \qquad e \mapsto i\_E^{m-n}(e),$$

which is well defined since |↓e| > m − n, so

$$\left| \downarrow i\_E^{m-n}(e) \right| = \left| \downarrow e \right| - m + n = m + 1 - m + n = n + 1.$$

Notice that if m = n, i<sup>m−n</sup><sub>E</sub> is the identity, while for any k ≤ n ≤ m we have

$$\iota\_{k,n}^E(\iota\_{n,m}^E(e)) = i\_E^{n-k}(i\_E^{m-n}(e)) = i\_E^{n-k+m-n}(e) = i\_E^{m-k}(e) = \iota\_{k,m}^E(e)$$

so Ê is really a presheaf on ω.
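The construction of the presheaf associated with a tree order can be tried out on a small finite example. The Python sketch below is only illustrative: the helper names, the sentinel `BOTTOM` and the choice of strings ordered by the prefix relation as a sample tree order are all assumptions. It computes the sets Ê(n) and the restriction maps ι<sub>n,m</sub> by iterating the immediate predecessor function.

```python
BOTTOM = object()  # stands for the extra element *


def down(E, leq, e):
    """The down-set of e."""
    return {x for x in E if leq(x, e)}


def pred(E, leq, e):
    """Immediate predecessor of e, or BOTTOM if e is minimal."""
    strict = down(E, leq, e) - {e}
    if not strict:
        return BOTTOM
    return max(strict, key=lambda x: len(down(E, leq, x)))


def presheaf(E, leq):
    """Map each n to the set of elements with exactly n strict predecessors."""
    levels = {}
    for e in E:
        levels.setdefault(len(down(E, leq, e)) - 1, set()).add(e)
    return levels


def restrict(E, leq, n, m, e):
    """iota_{n,m}(e): apply the immediate predecessor m - n times (n <= m)."""
    for _ in range(m - n):
        e = pred(E, leq, e)
    return e


# Sample tree order: strings ordered by the prefix relation.
E = {"r", "ra", "rb", "rac"}
leq = lambda x, y: y.startswith(x)
levels = presheaf(E, leq)
```

On this sample, `restrict(E, leq, 0, 2, "rac")` can be computed either directly or by going through level 1 first, which is the functoriality equation displayed above.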

Theorem 3.2. There exists an equivalence of categories (−)̂ : Tree → Set<sup>ω<sup>op</sup></sup> sending (E, ≤) to Ê.

Corollary 3.1. Tree is adhesive and the forgetful functor |−| : Tree → Set preserves all colimits.

### 3.3 Various kinds of hierarchical graphs

In this section we construct several categories of hierarchical graphs combining sufficiently adhesive categories of preorders or graphs (modelling the hierarchy between the edges) and the wanted structure on the nodes. For each of them we can readily prove suitable adhesivity properties, leveraging the modularity provided by Theorem 2.2. Besides hypergraphs and interfaces, this methodology can be applied to other settings such as Petri nets (see [10]).

Hierarchical graphs. We can use trees to produce a category of hierarchical graphs [24], which, in addition, can be equipped with an interface, modelled by a function into the set of nodes.

Definition 3.4. The category HIGraph of hierarchical graphs with interface has as objects 6-tuples ((E, ≤), V, X, f, s, t) where (E, ≤) is a tree order, f is a function X → V and s, t are functions E → V, and as arrows triples (h, k, l) : ((E, ≤), V, X, f, s, t) → ((F, ≤), W, Y, g, s′, t′) with h : (E, ≤) → (F, ≤) in Tree, k : V → W and l : X → Y in Set such that the following squares commute

$$\begin{array}{ccc} E & \xrightarrow{s} & V \\ {\scriptstyle h}\downarrow & & \downarrow{\scriptstyle k} \\ F & \xrightarrow[s']{} & W \end{array} \qquad \begin{array}{ccc} E & \xrightarrow{t} & V \\ {\scriptstyle h}\downarrow & & \downarrow{\scriptstyle k} \\ F & \xrightarrow[t']{} & W \end{array} \qquad \begin{array}{ccc} X & \xrightarrow{f} & V \\ {\scriptstyle l}\downarrow & & \downarrow{\scriptstyle k} \\ Y & \xrightarrow[g]{} & W \end{array}$$

We can realise HIGraph as a comma category: as L we take the functor |−| : Tree → Set of Remark 3.7, while as R we take the composition of cod : Set<sup>2</sup> → Set, sending an arrow to its codomain, with the functor Set → Set that sends a set X to X × X. Notice that cod preserves limits since it coincides with the forgetful functor from id<sub>Set</sub>↓id<sub>Set</sub>, so we can apply Theorem 2.3 to get the following.

Theorem 3.3. HIGraph is an adhesive category.

The next step is to move to hypergraphs, using the Kleene star (−)<sup>⋆</sup> : Set → Set (the monoid monad) instead of the product functor. This step is not trivial: it relies on the fact that the monoid monad preserves all connected limits (such monads are called cartesian), which in turn rests upon the fact that the theory of monoids is a strongly regular theory (see [5, Sec. 3] and [18, Ch. 4] for details).

Hierarchical hypergraphs. A variation on the previous example is obtained by allowing an edge to be mapped to an arbitrary subset of nodes. In this way, we obtain a category of hypergraphs whose edges form a tree order, corresponding to Milner's (pure) bigraphs [20], with possibly infinite edges<sup>3</sup>.

Definition 3.5. The category HHGraph of hierarchical hypergraphs with interface has as objects 5-tuples ((E, ≤), V, X, f, e) where (E, ≤) is a tree order and f : X → V, e : E → V<sup>⋆</sup> are two functions; arrows are triples (h, k, l) : ((E, ≤), V, X, f, e) → ((F, ≤), W, Y, g, e′) with h : (E, ≤) → (F, ≤) in Tree, k : V → W and l : X → Y in Set such that the following squares commute

$$\begin{array}{ccc} E & \xrightarrow{e} & V^{\star} \\ {\scriptstyle h}\downarrow & & \downarrow{\scriptstyle k^{\star}} \\ F & \xrightarrow[e']{} & W^{\star} \end{array} \qquad \begin{array}{ccc} X & \xrightarrow{f} & V \\ {\scriptstyle l}\downarrow & & \downarrow{\scriptstyle k} \\ Y & \xrightarrow[g]{} & W \end{array}$$

<sup>3</sup> In bigraph terminology, "controls" and "edges" correspond to our edges and nodes.

Fig. 1. A DAG-hypergraph (left) and a DGraph-hypergraph corresponding to the CCS process P = a(x).b(xy).P (right). Relation between edges is depicted in red.

Even in this case HHGraph is a comma category: on the left side we take |−| as before, on the right side we take the composition of cod with the Kleene star, so even in this case we can deduce adhesivity.
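The role of the Kleene star in the commuting squares can be illustrated on finite data. In the following Python sketch, elements of V<sup>⋆</sup> are encoded as tuples, `star(k)` applies k letter by letter, and `is_hh_morphism` checks the two squares of Definition 3.5; these names and the dictionary-based encoding are assumptions of the sketch, and the strictness of h with respect to the tree orders is deliberately not checked here.

```python
def star(k):
    """Action of the Kleene star on a function k: apply k to each letter."""
    return lambda word: tuple(k[v] for v in word)


def is_hh_morphism(h, k, l, src, tgt):
    """Check the two commuting squares for (h, k, l) : src -> tgt.

    src and tgt are dicts with keys "E" (edges), "X" (interface),
    "f" (interface map into nodes) and "e" (edge map into words of nodes).
    """
    k_star = star(k)
    # Left square: e'(h(x)) = k*(e(x)) for every edge x.
    edges_ok = all(tgt["e"][h[x]] == k_star(src["e"][x]) for x in src["E"])
    # Right square: g(l(x)) = k(f(x)) for every interface element x.
    iface_ok = all(tgt["f"][l[x]] == k[src["f"][x]] for x in src["X"])
    return edges_ok and iface_ok


src = {"E": {"p"}, "X": {"x"}, "f": {"x": 1}, "e": {"p": (1, 2)}}
tgt = {"E": {"q"}, "X": {"y"}, "f": {"y": 10}, "e": {"q": (10, 20)}}
ok = is_hh_morphism({"p": "q"}, {1: 10, 2: 20}, {"x": "y"}, src, tgt)
```

Replacing the node map by one that sends both 1 and 2 to 10 breaks the left square, because the word (1, 2) is then sent to (10, 10) rather than (10, 20).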

Theorem 3.4. HHGraph is adhesive.

DGraph- and DAG-hypergraphs. We can consider more general relations between edges, besides tree orders. An interesting case is when edges form a directed acyclic graph, yielding the category of DAG-hypergraphs; this corresponds to (possibly infinite) bigraphs with sharing, where an edge can have more than one parent, as in [27] (see also Fig. 1, left). Even more generally, we can consider any relation between edges, i.e., the edges form a generic directed graph possibly with cycles, yielding the category of DGraph-hypergraphs. These can be seen as "recursive bigraphs", i.e., bigraphs which allow for cyclic dependencies between controls, like in recursive processes; an example is in Fig. 1 (right).

Definition 3.6. We define the category DHGraph of DGraph-hypergraphs (respectively, DAGHGraph of DAG-hypergraphs) with interface as the one whose objects are 5-tuples ((E, T, s, t), V, X, f, e) where (E, T, s, t) is in DGraph (in DAG), f is a function X → V, and e is a function T → V<sup>⋆</sup>, and whose arrows are triples ((h<sub>1</sub>, h<sub>2</sub>), k, l) : ((E, T, s, t), V, X, f, e) → ((F, T′, s′, t′), W, Y, g, e′) with (h<sub>1</sub>, h<sub>2</sub>) : (E, T, s, t) → (F, T′, s′, t′) in DGraph (in DAG), k : V → W and l : X → Y in Set such that the following squares commute

$$\begin{array}{ccc} T & \xrightarrow{e} & V^\star \\ {\scriptstyle h\_2}\downarrow & & \downarrow{\scriptstyle k^\star} \\ T' & \xrightarrow{e'} & W^\star \end{array} \qquad \begin{array}{ccc} X & \xrightarrow{f} & V \\ {\scriptstyle l}\downarrow & & \downarrow{\scriptstyle k} \\ Y & \xrightarrow{g} & W \end{array}$$

We can also realise DHGraph and DAGHGraph as comma categories: it is enough to take, respectively, the forgetful functors DGraph → Set and DAG → Set on one side and again the composition of the Kleene star with cod on the other.

Theorem 3.5. DHGraph is adhesive with respect to the classes

$$\begin{aligned} \{ ((h\_1, h\_2), k, l) \in \mathsf{Mor}(\mathbf{DHGraph}) \mid (h\_1, h\_2) \in \mathsf{Reg}(\mathbf{DGraph}), k, l \in \mathsf{Mono}(\mathbf{Set}) \} \\ \{ ((h\_1, h\_2), k, l) \in \mathsf{Mor}(\mathbf{DHGraph}) \mid (h\_1, h\_2) \in \mathsf{Mono}(\mathbf{DGraph}) \} \end{aligned}$$

while DAGHGraph is adhesive with respect to the classes

$$\begin{aligned} \{ ((h\_1, h\_2), k, l) \in \mathsf{Mor}(\mathbf{DAGHGraph}) \mid (h\_1, h\_2) \in \mathsf{dclosed}\_{\mathsf{da}}, k, l \in \mathsf{Mono}(\mathbf{Set}) \} \\ \{ ((h\_1, h\_2), k, l) \in \mathsf{Mor}(\mathbf{DAGHGraph}) \mid (h\_1, h\_2) \in \mathsf{Mono}(\mathbf{DAG}) \} \end{aligned}$$

### 3.4 Term graphs

The use of term graphs has been advocated as a tool for the optimal implementation of terms, with the intuition that the graphical counterpart of trees can allow for the sharing of sub-terms [26]. A brute-force proof of quasi-adhesivity of the category of term graphs was given in [7]. In this section we recover that result by exploiting our new criterion for adhesivity.

Definition 3.7. Let Σ = (O, ar) be an algebraic signature (O is a set and ar : O → N a function called arity function). A term graph over Σ is a triple (V, l, s) where V is a set, l : V ⇀ O, s : V ⇀ V<sup>⋆</sup> are partial functions such that


Elements of V are called nodes, a node v not in dom(l) is called empty. A morphism (V, l, s) → (W, t, r) is a function f : V → W such that

$$t(f(v)) = l(v) \qquad r(f(v)) = f^\star(s(v))$$

for every v ∈ dom(l). We will denote by TG<sup>Σ</sup> the category of term graphs over Σ and their morphisms. We will use U to denote the forgetful functor TG<sup>Σ</sup> → Set sending a term graph to the set of its nodes and that is the identity on arrows.
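To make the morphism condition concrete, here is a minimal Python sketch. It stores the partial functions l and s as dictionaries and checks the two equations above for every labelled node. The names `TermGraph` and `is_morphism` and the small signature are illustrative only, and the sketch assumes (without checking) that every labelled node also has a successor word.

```python
from dataclasses import dataclass, field

@dataclass
class TermGraph:
    """A term graph (V, l, s); l and s are partial functions stored as dicts."""
    nodes: set
    label: dict = field(default_factory=dict)  # l : V -> O (partial)
    succ: dict = field(default_factory=dict)   # s : V -> V* (partial), values are tuples

def is_morphism(f, src, tgt):
    """Check t(f(v)) = l(v) and r(f(v)) = f*(s(v)) for every v in dom(l)."""
    if any(f.get(v) not in tgt.nodes for v in src.nodes):
        return False  # f must be a total function from V to W
    for v in src.label:
        w = f[v]
        if tgt.label.get(w) != src.label[v]:
            return False  # label not preserved
        if tgt.succ.get(w) != tuple(f[x] for x in src.succ[v]):
            return False  # successors not preserved
    return True

# Hypothetical signature: a binary operation "g" and a constant "c".
G = TermGraph({1, 2, 3}, {1: "g", 2: "c"}, {1: (2, 3), 2: ()})  # node 3 is empty
H = TermGraph({"p", "q"}, {"p": "g", "q": "c"}, {"p": ("q", "q"), "q": ()})

ok = is_morphism({1: "p", 2: "q", 3: "q"}, G, H)   # empty node 3 may hit a labelled node
bad = is_morphism({1: "q", 2: "q", 3: "q"}, G, H)  # the label of node 1 is not preserved
print(ok, bad)
```

Note that the condition is only imposed on dom(l), so an empty node can be sent to a labelled one; this is why preservation of empty nodes appears as a separate requirement later in the section.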

Definition 3.8. We define a functor ∆ : Set → TG<sup>Σ</sup> putting

$$\begin{array}{c} X \longmapsto (X, e\_1, e\_2) \\ f \downarrow \\ Y \longmapsto (Y, e\_1', e\_2') \end{array}$$

where the domains of the structural functions e<sub>1</sub>, e<sub>2</sub> of ∆(X) are the empty set.

Lemma 3.2. The following properties hold


Remark 3.9. Right adjoints preserve monomorphisms, so, by the first point of Lemma 3.2, if f : (V, l, s) → (W, t, r) is a monomorphism then its underlying function is injective. On the other hand U is faithful and thus reflects monomorphisms, i.e. the other implication also holds.

Remark 3.10. TG<sup>Σ</sup> in general does not have terminal objects. Since U preserves limits, if a terminal object exists it must have the singleton as set of nodes. Now take as signature the one given by two operations {a, b} both of arity 0, then we have three term graphs with only one node v: ∆({v}), ({v}, l, s) and ({v}, t, s) where l(v) = a, t(v) = b and s sends v to the empty word. Clearly there are no morphisms between the last two and from the last two to the first one, and thus neither of them can be terminal.

Remark 3.11. TG<sup>Σ</sup> is not an adhesive category. In particular it does not have pushouts along all monomorphisms. Take the signature of the previous remark, then we can use the identity {v} → {v} to form a span

$$(\{v\}, l, s) \xleftarrow{i} \Delta(\{v\}) \xrightarrow{i'} (\{v\}, t, s).$$

This span cannot be completed to a commutative square: if

$$\begin{array}{ccc} \Delta(\{v\}) & \xrightarrow{i'} & (\{v\},t,s) \\ {\scriptstyle i}\downarrow & & \downarrow{\scriptstyle g} \\ (\{v\},l,s) & \xrightarrow{f} & (V,p,r) \end{array}$$

is commutative then f(v) = g(v); therefore

$$a = l(v) = p(f(v)) = p(g(v)) = t(v) = b$$

and this is absurd.

Remark 3.12. It is worth spelling out the explicit construction of equalizers in TG<sup>Σ</sup>. Given two arrows f, g : (V, l, s) → (W, t, r), let

$$E = \{ v \in V \mid f(v) = g(v) \}$$

be the equalizer of U(f) and U(g) in Set. We have a partial function p : E ⇀ O given by the restriction of l to E. Moreover, if v ∈ E ∩ dom(s) then

$$f^\star(s(v)) = r(f(v)) = r(g(v)) = g^\star(s(v)).$$

hence s(v) ∈ E<sup>⋆</sup> (which is the equalizer of f<sup>⋆</sup> and g<sup>⋆</sup>, see [5]), thus we can restrict s to q : E ⇀ E<sup>⋆</sup>. In this way we get a term graph (E, p, q) with an arrow into (V, l, s) which clearly equalizes f and g.

On the other hand, if k : (U, a, b) → (V, l, s) is such that

$$g \circ k = f \circ k$$

then the induced function k̄ : U → E is a morphism of TG<sup>Σ</sup>.
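The construction of Remark 3.12 can be sketched in Python as follows. The function name `equalizer` and the example data (a single unary operation "u") are hypothetical; the sketch assumes that f and g are already term graph morphisms, which is exactly what guarantees that the restricted successor function lands in E<sup>⋆</sup>.

```python
def equalizer(f, g, V, l, s):
    """Equalizer of f, g : (V, l, s) -> (W, t, r), following Remark 3.12.

    f and g are dicts on V; l and s are dicts encoding partial functions.
    Assumes f and g are term graph morphisms.
    """
    E = {v for v in V if f[v] == g[v]}
    p = {v: l[v] for v in E if v in l}  # restriction of l to E
    q = {v: s[v] for v in E if v in s}  # restriction of s to E
    # Sanity check: every successor of a node of E is again in E.
    assert all(x in E for word in q.values() for x in word)
    return E, p, q

# Source graph: node 1 is labelled "u" with successor 2; nodes 2 and 3 are empty.
V = {1, 2, 3}
l = {1: "u"}
s = {1: (2,)}
# Two morphisms into a graph with nodes a, b, c where a is labelled "u" with successor b.
f = {1: "a", 2: "b", 3: "b"}
g = {1: "a", 2: "b", 3: "c"}

E, p, q = equalizer(f, g, V, l, s)
print(E, p, q)
```

In the example f and g agree on node 1, so they are forced to agree on its successor 2 as well; only node 3 is discarded.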

Remark 3.13. Lemma 3.2 implies that TG<sup>Σ</sup> has pullbacks. In the following we will need their explicit description. The pullback of a cospan

$$(V, l, s) \xrightarrow{f} (W, t, r) \xleftarrow{g} (U, a, b)$$

is given by (P, p, q) where

$$P = \{(v, u) \in V \times U \mid f(v) = g(u)\}$$

is the pullback of f along g in Set and

$$p: P \rightharpoonup O \qquad (v, u) \mapsto \begin{cases} l(v) & v \in \text{dom}(l),\ u \in \text{dom}(a) \\ \text{undefined} & \text{otherwise} \end{cases}$$

$$q: P \rightharpoonup P^\star \qquad (v, u) \mapsto \begin{cases} [(s(v)\_i, b(u)\_i)]\_{i=1}^{\text{ar}(l(v))} & v \in \text{dom}(l),\ u \in \text{dom}(a) \\ \text{undefined} & \text{otherwise} \end{cases}$$

where, given x ∈ X<sup>⋆</sup>, x<sub>i</sub> denotes its i-th letter and, given x<sub>1</sub>, . . . , x<sub>n</sub> ∈ X, [x<sub>i</sub>]<sub>i=1</sub><sup>n</sup> denotes the element of X<sup>⋆</sup> such that ([x<sub>i</sub>]<sub>i=1</sub><sup>n</sup>)<sub>i</sub> is exactly x<sub>i</sub>.
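The pullback just described can be sketched in Python, with f : V → W and g : U → W given as dictionaries. The sketch reads the side condition as "both components of the pair are labelled"; the function name `pullback` and the example graphs are hypothetical.

```python
def pullback(f, g, V, l, s, U, a, b):
    """Pullback of f : (V, l, s) -> (W, t, r) and g : (U, a, b) -> (W, t, r)."""
    # Underlying set: the pullback of f and g in Set.
    P = {(v, u) for v in V for u in U if f[v] == g[u]}
    # p is defined only where both components are labelled.
    p = {(v, u): l[v] for (v, u) in P if v in l and u in a}
    # q pairs the two successor words letter by letter.
    q = {(v, u): tuple(zip(s[v], b[u])) for (v, u) in p}
    return P, p, q

# Target nodes: w0 (labelled by a unary operation, successor w1) and w1 (empty).
V, l, s = {1, 2}, {1: "u"}, {1: (2,)}
f = {1: "w0", 2: "w1"}
U, a, b = {"x", "y", "z"}, {"x": "u"}, {"x": ("y",)}
g = {"x": "w0", "y": "w1", "z": "w1"}

P, p, q = pullback(f, g, V, l, s, U, a, b)
print(sorted(P), p, q)
```

The successor word of the pair (1, x) consists of the single pair (2, y), which again belongs to P because f and g are morphisms.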

Now, notice that q is the unique partial function P ⇀ P<sup>⋆</sup> that makes the projections arrows of TG<sup>Σ</sup>. Moreover, p also has a uniqueness property: it is the unique partial function P ⇀ O such that the projections are arrows of TG<sup>Σ</sup> and p(x) is undefined if and only if at least one of the images of x under the projections is an empty node. In particular this implies the following result.

Proposition 3.2. U creates pullbacks along arrows which preserve empty nodes.

This is especially useful when paired with the following result from [7].

Proposition 3.3 ([7], Prop. 4.3). An arrow f : (V, l, s) → (W, t, r) in TG<sup>Σ</sup> is a regular mono if and only if f is injective and preserves empty nodes.

Proof. (⇒) Follows by the construction of equalizers given in Remark 3.12. (⇐) Consider (U, a, b) where U = W ⊔ (W ∖ f(V)). Let i<sub>1</sub> and i<sub>2</sub> be the inclusions of W and W ∖ f(V) into U; we can define

$$a: U \rightharpoonup O \qquad u \mapsto \begin{cases} t(w) & u = i\_1(w),\ w \in \text{dom}(t) \\ t(w) & u = i\_2(w),\ w \in (W \setminus f(V)) \cap \text{dom}(t) \\ \text{undefined} & \text{otherwise} \end{cases}$$

while for b : U ⇀ U<sup>⋆</sup>, we put b(u) = r(w) if u = i<sub>1</sub>(w) with w ∈ dom(r), while if u = i<sub>2</sub>(w) with w ∈ dom(r) we define b(u) = [u<sub>i</sub>]<sub>i=1</sub><sup>ar(a(u))</sup> where

$$u\_i = \begin{cases} i\_2(r(w)\_i) & r(w)\_i \in W \setminus f(V) \\ i\_1(r(w)\_i) & r(w)\_i \in f(V) \end{cases}$$

We have two functions (W, t, r) → (U, a, b): one is just i<sub>1</sub>, while the other one is given by

$$g: W \to U \qquad w \mapsto \begin{cases} i\_1(w) & w \in f(V) \\ i\_2(w) & w \notin f(V) \end{cases}$$

Now, i<sub>1</sub> ◦ f and g ◦ f both send v to i<sub>1</sub>(f(v)), therefore

$$i\_1 \circ f = g \circ f$$

Suppose that h : (P, p, q) → (W, t, r) equalizes i<sub>1</sub> and g, thus h(x) ∈ f(V) for every x ∈ P, and we have a unique function h′ : P → V such that f ◦ h′ = h. For every x ∈ dom(p), t(h(x)) = p(x), thus h(x) = f(h′(x)) ∈ dom(t). Since f preserves the empty nodes, h′(x) belongs to dom(l), so:

$$p(x) = t(h(x)) = t(f(h'(x))) = l(h'(x)).$$

Preservation of successors follows at once, while uniqueness follows from the uniqueness of the function h′ in Set. □

Lemma 3.3. U preserves and lifts pushouts along regular monomorphisms, moreover it reflects all pushout squares

$$\mathcal{U}(P,p,q) \xrightarrow{\mathcal{U}(f)} \mathcal{U}(W,t,r)$$

$$\mathcal{U}(m) \downarrow \mathcal{U}(g) \xrightarrow{\mathcal{U}(W,t,r)} \mathcal{U}(U,a,b)$$

in which n is regular. In addition Reg(TGΣ) is closed under pushouts.

We can now use the first point of Theorem 2.2 to get half of the following result.

Theorem 3.6 ([7, Thm. 4.2]). The category TG<sup>Σ</sup> is quasi-adhesive.

Proof. We already know by Lemmas 3.2 and 3.3 and Theorem 2.2 that pushouts along regular monos are stable. So, let us take a cube

$$\underbrace{\begin{array}{ccc} (V',l',s') & \xrightarrow{n'} & (W',t',r') \\ {\scriptstyle m'}\downarrow & & \downarrow{\scriptstyle f'} \\ (T',c',d') & \xrightarrow{g'} & (U',a',b') \end{array}}\_{\text{top face}} \qquad \underbrace{\begin{array}{ccc} (V,l,s) & \xrightarrow{n} & (W,t,r) \\ {\scriptstyle m}\downarrow & & \downarrow{\scriptstyle f} \\ (T,c,d) & \xrightarrow{g} & (U,a,b) \end{array}}\_{\text{bottom face}}$$

in which m is regular, the top and bottom faces are pushouts and the back faces are pullbacks. Applying U we get another cube, in Set, with pushouts along monos as top and bottom faces and pullbacks as vertical ones. By Proposition 3.2, U creates pullbacks along regular monos and f ∈ Reg(TG<sup>Σ</sup>), so we can conclude that the front right face of the starting cube is a pullback as well. We have to show that the front left face of the starting cube is a pullback too. Suppose it is not; then, by the explicit description of pullbacks, there must be a node t ∈ T′ which is empty in (T′, c′, d′) and such that g′(t) and c(t) are non-empty. By the computation of pushouts along regular monos we can deduce that g′(t) ∈ dom(a′) implies the existence of v ∈ V′, necessarily empty, such that m′(v) = t and f′(n′(v)) = g′(t), thus n′(v) is non-empty since f′ is regular. Moreover, c(m′(v)) = m(a(v)) and the left-hand side is non-empty, therefore a(v) is also non-empty by the regularity of m, but this contradicts the hypothesis that the back right face is a pullback. □

### 4 Conclusions

In this paper we have introduced a new criterion for M, N-adhesivity, based on the verification of some properties of functors connecting the category of interest to a family of suitably adhesive categories. This criterion can be seen as a distilled abstraction of many ad hoc proofs of adhesivity found in the literature, and it allows us to prove in a uniform and systematic way some previous results about the adhesivity of categories built by products, exponents, and comma constructions. We have applied the criterion to several significant examples, such as term graphs and directed (acyclic) graphs; moreover, using the modularity of our approach, we have readily proved suitable adhesivity properties for categories constructed by combining simpler ones. In particular, we have been able to tackle the adhesivity problem for several categories of hierarchical (hyper)graphs, including Milner's bigraphs, bigraphs with sharing, and a new version of bigraphs with recursion.

As future work, we plan to analyse other categories of graph-like objects using our criterion; an interesting case is that of directed bigraphs [13,3,4]. Moreover, it is worth verifying whether the M, N-adhesivity that we obtain from the results of this paper is suited for modelling specific rewriting systems, e.g. based on the DPO approach. As an example, TG<sup>Σ</sup> is quasi-adhesive but this does not suffice in most applications, because the rules are often spans of monomorphisms, and not of regular monos [7].

### References



### Quantifier elimination for counting extensions of Presburger arithmetic

Dmitry Chistikov<sup>1</sup>, Christoph Haase<sup>2</sup>, and Alessio Mansutti<sup>2</sup>

<sup>1</sup> Centre for Discrete Mathematics and its Applications (DIMAP) & Department of Computer Science, University of Warwick, Coventry, UK

d.chistikov@warwick.ac.uk

<sup>2</sup> Department of Computer Science, University of Oxford, Oxford, UK {christoph.haase,alessio.mansutti}@cs.ox.ac.uk

Abstract. We give a new quantifier elimination procedure for Presburger arithmetic extended with a unary counting quantifier ∃<sup>=x</sup>y Φ that binds to the variable x the number of different y satisfying Φ. While our procedure runs in non-elementary time in general, we show that it yields nearly optimal elementary complexity results for expressive counting extensions of Presburger arithmetic, such as the threshold counting quantifier ∃<sup>≥c</sup>y Φ that requires that the number of different y satisfying Φ be at least c ∈ N, where c can succinctly be defined by a Presburger formula. Our results are cast in terms of what we call the monadically-guarded fragment of Presburger arithmetic with unary counting quantifiers, for which we develop a 2ExpSpace decision procedure.

### 1 Introduction

Counting the number of solutions to an equation, or the number of elements in a set subject to constraints, is a fundamental and often computationally challenging problem studied in logic, mathematics and computer science. In discrete geometry, counting the number of integral points in a polyhedron is a canonical #P-complete problem. Barvinok's celebrated algorithm solves this problem in polynomial time when the dimension is fixed [2]. In this paper, we investigate a generalization of this problem and study algorithmic aspects of counting the number of models of formulae of Presburger arithmetic, the first-order theory of the integers with addition and order, and more generally, extensions of this logic with counting quantifiers.

Counting quantifiers such as the Härtig quantifier, which allows one to assert equal-cardinality constraints on the sets of satisfying assignments of two given first-order formulae, have long been studied in first-order logic [6]. In first-order theories of integer arithmetic, it is compelling to consider variants of counting quantifiers that bind the number of satisfying assignments of a formula to a first-order variable. Apelt [1] and Schweikardt [10] studied the decidability of Presburger arithmetic enriched with the unary counting quantifier ∃<sup>=x</sup>y with the following semantics: given an assignment of integers to the first-order variables x, z<sub>1</sub>, . . . , z<sub>n</sub>, a formula ∃<sup>=x</sup>y Φ(x, y, z<sub>1</sub>, . . . , z<sub>n</sub>) evaluates to true whenever the number of different y satisfying Φ(x, y, z<sub>1</sub>, . . . , z<sub>n</sub>) is exactly x. In both [1] and [10], decidability is shown by developing a quantifier elimination procedure for this extension of Presburger arithmetic which eliminates a counting quantifier by translating it into an equivalent quantified formula of Presburger arithmetic, i.e., one that only uses standard first-order quantifiers. This immediately gives decidability of Presburger arithmetic extended with the unary counting quantifier ∃<sup>=x</sup>y since Presburger arithmetic is decidable in 2ExpSpace [9,3,12]. Unfortunately, the quantifier elimination procedures in [1,10] do not yield a similar elementary upper bound for the extended theory, as the elimination of a single quantifier ∃<sup>=x</sup>y results in an exponential blow-up of the formula size and introduces nested first-order quantifiers. It is a wide open problem whether there is a decision procedure for Presburger arithmetic extended with the counting quantifier ∃<sup>=x</sup>y with elementary running time, or whether this theory admits a significantly stronger lower bound than standard Presburger arithmetic.

To shed more light on the complexity of Presburger arithmetic extended with the aforementioned unary counting quantifier, Habermehl and Kuske gave a quantifier elimination procedure for Presburger arithmetic extended with a unary modulo counting quantifier ∃<sup>(r,q)</sup>y, where r and q are positive natural numbers [4]. Here, ∃<sup>(r,q)</sup>y Ψ(y, z<sub>1</sub>, . . . , z<sub>n</sub>) holds whenever the number of different y satisfying Ψ(y, z<sub>1</sub>, . . . , z<sub>n</sub>) is congruent to r modulo q. An analysis of the growth of the constants and coefficients occurring in their procedure then enables them to derive a 2ExpSpace upper bound for the logic, matching the complexity of Presburger arithmetic on deterministic machines. This noteworthy result shows that there is still room to extend Presburger arithmetic with non-trivial counting quantifiers without increasing the computational cost of deciding the logic.

Note that in order to keep the logic decidable, the counting quantifiers considered in the literature must be unary. Indeed, consider a binary counting quantifier ∃<sup>=x</sup>(y<sub>1</sub>, y<sub>2</sub>) counting the number of different pairs (y<sub>1</sub>, y<sub>2</sub>) satisfying a formula. Then, Φ(x, z) = ∃<sup>=x</sup>(y<sub>1</sub>, y<sub>2</sub>)(0 ≤ y<sub>1</sub>, y<sub>2</sub> < z) holds for x = z<sup>2</sup>, which in turn allows defining multiplication, leading to undecidability of the resulting theory.
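The squaring argument is easy to check by brute force for small non-negative z. The following Python sketch simply enumerates the pairs; the helper name `count_pairs` is illustrative.

```python
def count_pairs(z):
    """Number of pairs (y1, y2) of integers with 0 <= y1 < z and 0 <= y2 < z."""
    return sum(1 for y1 in range(z) for y2 in range(z))

# For every z >= 0 the unique x with Phi(x, z) is z squared.
for z in range(8):
    assert count_pairs(z) == z * z

print([count_pairs(z) for z in range(5)])
```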

Our contribution. Following the lines of [4] while trying to avoid the limitations of the procedures in [1,10], our goal is to study decision procedures for Presburger arithmetic enriched with variants of counting quantifiers that do not increase the complexity of Presburger arithmetic. To begin with, we develop a new quantifier elimination procedure for Presburger arithmetic with unary counting quantifiers ∃<sup>=x</sup>y that, in contrast to [1,10], does not require the introduction of first-order quantifiers. While the procedure still runs in non-elementary time, avoiding first-order quantification allows us not only to derive exponentially better bounds on the size of the formula obtained after eliminating a single ∃<sup>=x</sup>y, but also to identify the sources of non-elementary growth. We exploit those observations to extend the range of counting quantifiers that can be added to Presburger arithmetic without increasing the complexity of the resulting logic.

The first type of counting quantifiers we consider is a threshold counting quantifier ∃<sup>≥c</sup>y for some integer c. A formula ∃<sup>≥c</sup>y Ψ(y, z<sub>1</sub>, . . . , z<sub>n</sub>) evaluates to true whenever there are at least c different values of y satisfying Ψ(y, z<sub>1</sub>, . . . , z<sub>n</sub>). We show that Presburger arithmetic enriched with threshold counting quantifiers can be decided in 2ExpSpace, even when the threshold c itself is succinctly given as the unique solution of a Presburger arithmetic formula. This is surprising since in Presburger arithmetic one can define numbers that are triply exponential in the size of the formula used to encode them [7, pp. 151–152]. Furthermore, we show that if we restrict c to be at most doubly exponential in the size of its encoding then Presburger arithmetic with threshold counting quantifiers is decidable in STA(∗, 2<sup>2<sup>n<sup>O(1)</sup></sup></sup>, O(n)), matching the complexity of Presburger arithmetic [3]. Here, STA(s(n), t(n), a(n)) is the class of all decision problems in which inputs of length n can be decided by an alternating Turing machine in space s(n) and time t(n) using a(n) alternations, where "∗" stands for unbounded availability of a certain resource.

Our results on the quantifier ∃<sup>≥c</sup>x arise from studying a more general extension of Presburger arithmetic that relies on the notion of monadic decomposition put forward by Veanes et al. in [11] and studied by Hague et al. [5] in the context of integer linear arithmetic. Briefly, a formula Φ(x, y<sub>1</sub>, . . . , y<sub>n</sub>) is said to be monadically decomposable on the variable x whenever it is equivalent to a formula of the form ⋁<sub>i∈I</sub> ∆<sub>i</sub>(x) ∧ Ψ<sub>i</sub>(y<sub>1</sub>, . . . , y<sub>n</sub>), i.e., a formula where the satisfaction of constraints on x does not depend on the values of y<sub>1</sub>, . . . , y<sub>n</sub>. Based on this definition, we extend Presburger arithmetic by allowing the general unary counting quantifiers ∃<sup>=x</sup>y to appear with guards of the form ∃x(Ψ ∧ ∃<sup>=x</sup>y Φ), where Ψ is monadically decomposable on the variable x. The resulting logic is very powerful, as it generalizes not only the quantifiers ∃<sup>≥c</sup>x but also the modulo counting quantifiers ∃<sup>(r,q)</sup>y from [4]. We establish two further results for this monadically-guarded fragment of Presburger arithmetic with counting quantifiers. First, we develop a 3ExpTime quantifier elimination procedure for the logic, matching the complexity of the best possible quantifier elimination procedures for Presburger arithmetic. Second, we exploit this procedure to obtain a quantifier relativization argument showing that the logic is decidable in 2ExpSpace.

### 2 Presburger arithmetic with counting quantifiers

General notation. The symbols Z, N and N<sup>+</sup> denote the set of integers, natural numbers including zero, and natural numbers without zero, respectively. We usually use a, b, c, . . . for integers, which we assume to be encoded in binary. Given n ∈ N, we write [n] ≝ {0, . . . , n − 1}, and #A for the cardinality of a set A. If A is infinite, then #A = ∞, and we postulate n ≤ ∞ for all n ∈ Z.

Structure. We consider the structure Z = ⟨Z, (c)<sub>c∈Z</sub>, +, <, (≡<sub>q</sub>)<sub>q∈N<sup>+</sup></sub>⟩ of Presburger arithmetic, where (c)<sub>c∈Z</sub> are constant symbols that shall be interpreted as their homographic integer numbers, the binary function symbol + is interpreted as addition on Z, the binary relation < is interpreted as "less than", and ≡<sub>q</sub> is interpreted as the modulo relation, i.e., a ≡<sub>q</sub> b if and only if q divides a − b.

Basic syntax. Let X = {x, y, z, . . . } be a countable set of first-order variables. Linear terms, usually denoted by t, t<sub>1</sub>, t<sub>2</sub>, etc., are expressions of the form a<sub>1</sub>x<sub>1</sub> + · · · + a<sub>d</sub>x<sub>d</sub> + c where x<sub>1</sub>, . . . , x<sub>d</sub> ∈ X and a<sub>1</sub>, . . . , a<sub>d</sub>, c ∈ Z. The integer a<sub>i</sub> is the coefficient of the variable x<sub>i</sub>. Variables not appearing in the linear term are tacitly assumed to have a 0 coefficient. A term t is said to be x-free if the coefficient of the variable x in t is 0. The integer c is the constant of the linear term. Linear terms with constant 0 are said to be homogeneous.

Given a term t, the lexeme t < 0 is understood as a linear inequality, and t ≡<sub>q</sub> 0 is a modulo constraint. Syntactically, Presburger arithmetic (PA) is the closure of linear inequalities and modulo constraints under the Boolean connectives ∧ and ¬ (i.e., conjunction and negation, respectively) and the first-order quantifier ∃y. Presburger arithmetic with counting quantifiers (PAC) extends PA with the (unary) counting quantifier ∃<sup>=x</sup>y, where x and y are two syntactically distinct variables from X. Formulae of PAC are denoted by Φ, Ψ, Γ, etc.

We write vars(Φ) and fv(Φ) for the set of variables and free variables of Φ, respectively, with fv(∃<sup>=x</sup>y Φ) ≝ {x} ∪ (fv(Φ) ∖ {y}). A sentence is a formula Φ with fv(Φ) = ∅. We sometimes write Φ(x<sub>1</sub>, . . . , x<sub>k</sub>) or Φ(x), with x = (x<sub>1</sub>, . . . , x<sub>k</sub>) a tuple of variables, for a formula Φ with fv(Φ) = {x<sub>1</sub>, . . . , x<sub>k</sub>}. We say that Φ is z-free if z ∈ X does not occur in Φ. Given terms t and t′, Φ[t′/t] stands for the formula obtained from Φ by syntactically replacing every occurrence of t by t′. Given Φ(x<sub>1</sub>, . . . , x<sub>k</sub>) and terms t<sub>1</sub>, . . . , t<sub>k</sub>, Φ(t<sub>1</sub>, . . . , t<sub>k</sub>) stands for Φ[t<sub>1</sub>/x<sub>1</sub>] . . . [t<sub>k</sub>/x<sub>k</sub>].

Semantics. An assignment is a function ν : X → Z assigning an integer value to every variable. As usual, we extend ν in the standard way to a function that maps every term to an element of Z. For instance, ν(x + 3y + 2) = ν(x) + 3ν(y) + 2. Given a variable x and an integer n, we write ν[n/x] for the assignment obtained from ν by updating the value of x to n, i.e. ν[n/x](x) = n, and for all variables y distinct from x, ν[n/x](y) = ν(y). Given a formula Φ of PAC and an assignment ν, the satisfaction relation ν |= Φ is defined as usual for linear inequalities, modulo constraints, Boolean connectives and the existential quantifier ranging over Z. For the counting quantifier, we define

$$\nu \models \exists^{=x} y \,\Phi \text{ if and only if } \#\{n \in \mathbb{Z} \mid \nu[n/y] \models \Phi\} = \nu(x).$$

Informally, ∃<sup>=x</sup>y Φ is satisfied by ν if there are exactly ν(x) distinct values for the variable y that make Φ true. A formula Φ of PAC is satisfiable (resp. valid) if ν |= Φ holds for an assignment (resp. every assignment) ν. A formula Φ entails a formula Ψ, written Φ |= Ψ, whenever every assignment satisfying Φ also satisfies Ψ. We write Φ ⇔ Ψ to denote that Φ and Ψ are equivalent, i.e. Φ |= Ψ and Ψ |= Φ.
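The semantics of the counting quantifier can be illustrated by a bounded brute-force check in Python. This is only a sketch: the real quantifier ranges over all of Z, whereas the code counts witnesses inside a finite window, so its answer is meaningful only when all witnesses lie in that window. The names `counts_exactly` and `window`, and the example formula, are assumptions made for illustration.

```python
def counts_exactly(phi, nu, x, y, window=range(-1000, 1001)):
    """Bounded check of nu |= E^{=x} y. phi.

    phi is a Python predicate on assignments (dicts from variable names to ints).
    Counts the n in `window` with nu[n/y] |= phi and compares with nu(x).
    """
    witnesses = sum(1 for n in window if phi({**nu, y: n}))
    return witnesses == nu[x]

# phi(y, z) := 0 <= y  and  y < z  and  y is even
phi = lambda nu: 0 <= nu["y"] < nu["z"] and nu["y"] % 2 == 0

five = counts_exactly(phi, {"x": 5, "z": 10}, "x", "y")  # witnesses: 0, 2, 4, 6, 8
four = counts_exactly(phi, {"x": 4, "z": 10}, "x", "y")
print(five, four)
```

The expression `{**nu, y: n}` plays the role of the updated assignment ν[n/y].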

Syntactic abbreviations. We define ⊥ ≝ 0 < 0 and ⊤ ≝ ¬⊥. The Boolean connectives ∨, → and ↔ and the universal first-order quantifier ∀ are derived as usual, and so are the (in)equalities <, ≤, =, ≥, and >, between terms. For instance, t<sub>1</sub> < t<sub>2</sub> corresponds to t<sub>1</sub> − t<sub>2</sub> < 0, where we tacitly manipulate t<sub>1</sub> − t<sub>2</sub> with standard operations of linear arithmetic to obtain an equivalent term. Similarly, t<sub>1</sub> ≡<sub>q</sub> t<sub>2</sub> is short for t<sub>1</sub> − t<sub>2</sub> ≡<sub>q</sub> 0, whereas |t<sub>1</sub>| + t<sub>2</sub> < 0 is short for (t<sub>1</sub> < 0 → t<sub>2</sub> − t<sub>1</sub> < 0) ∧ (t<sub>1</sub> ≥ 0 → t<sub>1</sub> + t<sub>2</sub> < 0). For a variable x ∈ X and r ∈ [q], we call x ≡<sub>q</sub> r a simple modulo constraint. All modulo constraints introduced by our quantifier elimination procedure given in Section 3 are simple.
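As a quick sanity check, the expansion of the absolute-value abbreviation can be compared against Python's built-in `abs` on a range of integer values. The function name `abbreviation` is illustrative.

```python
def abbreviation(t1, t2):
    """Evaluate (t1 < 0 -> t2 - t1 < 0) and (t1 >= 0 -> t1 + t2 < 0)."""
    implies = lambda p, q: (not p) or q
    return implies(t1 < 0, t2 - t1 < 0) and implies(t1 >= 0, t1 + t2 < 0)

# The expansion agrees with |t1| + t2 < 0 on every tested pair of values.
assert all(abbreviation(t1, t2) == (abs(t1) + t2 < 0)
           for t1 in range(-20, 21) for t2 in range(-20, 21))
```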

The counting quantifier ∃<sup>≥x</sup>y. Historically [1,10], the quantifier ∃<sup>=x</sup>y has been the unary counting quantifier of choice when it comes to PAC. However, a priori one could define PAC as the extension of PA featuring counting quantifiers ∃<sup>≥x</sup>y, where ν |= ∃<sup>≥x</sup>y Φ holds for an assignment ν whenever there are at least ν(x) values n ∈ Z for y such that ν[n/y] |= Φ. Notice that the counting quantifier ∃<sup>=x</sup>y can be expressed using ∃<sup>≥x</sup>y, and vice versa:

– ∃<sup>=x</sup>y Φ ⇔ ∃<sup>≥x</sup>y Φ ∧ ∃x′ : x′ = x + 1 ∧ ¬∃<sup>≥x′</sup>y Φ; and

– ∃<sup>≥x</sup>y Φ ⇔ (∀z ∃y : |z| ≤ |y| ∧ Φ) ∨ ∃x′ : x′ ≥ x ∧ ∃<sup>=x′</sup>y Φ.

Two comments are in order: first, translating a PAC formula by swapping the type of counting quantifiers using the equivalences above has the unpleasant effect of increasing the size of the formula, exponentially if the nesting depth of quantifiers is unbounded. Second, the subformula ∀z ∃y : |z| ≤ |y| ∧ Φ used in the last equivalence states that there are infinitely many values for y that make the formula Φ true. This formula highlights the main difference between ∃<sup>=x</sup>y and ∃<sup>≥x</sup>y quantifiers: the latter is true in the presence of infinitely many values for y, whereas the former is false. Throughout the paper, we focus on the quantifier ∃<sup>=x</sup>y, as done in [1,10], but use this observation to argue that our results can be readily adapted to the counting quantifier ∃<sup>≥x</sup>y. Full details of this adaptation are given in the full version of the paper.

Parameters of formulae. To analyze quantifier-elimination procedures, following [8,12], we introduce a number of parameters for formulae of PAC:


Given a vector v = (v<sub>1</sub>, . . . , v<sub>d</sub>) ∈ Z<sup>d</sup>, we write ||v|| = max{|v<sub>i</sub>| : 1 ≤ i ≤ d} for the infinity norm of v. Similarly, for a linear term t, we write ||t|| for the maximum absolute value of a coefficient or constant appearing in t. Given a finite set of vectors or a finite set of terms A, we define ||A|| = max{||a|| : a ∈ A}. Given a matrix A ∈ Z<sup>n×d</sup>, its infinity norm is the maximal infinity norm of its column vectors. Notice that ||lin(Φ)|| = ||hom(Φ) ∪ const(Φ)||. For a formula Φ, we define ||Φ|| ≝ ||lin(Φ) ∪ mod(Φ)||.
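The norms on vectors, terms, sets and matrices can be computed directly. The following Python sketch uses hypothetical helper names and represents a linear term as a pair (list of coefficients, constant); it does not attempt to compute the formula parameters lin, hom, const or mod.

```python
def norm_vec(v):
    """Infinity norm of an integer vector."""
    return max(abs(c) for c in v)

def norm_term(coeffs, const):
    """||t||: largest absolute value of a coefficient or of the constant of t."""
    return max([abs(const)] + [abs(a) for a in coeffs])

def norm_matrix(rows):
    """Maximal infinity norm of the columns of a matrix given as a list of rows."""
    return max(norm_vec(col) for col in zip(*rows))

t = ([3, -7], 2)                        # the term 3x - 7y + 2
terms = [([3, -7], 2), ([1, 0], -12)]   # the set {3x - 7y + 2, x - 12}

n_t = norm_term(*t)
n_set = max(norm_term(a, c) for a, c in terms)  # ||A|| for a finite set A of terms
n_mat = norm_matrix([[1, -4], [3, 2]])
print(n_t, n_set, n_mat)
```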

Complexity remarks. The proposition below characterizes the complexity of PA.

Proposition 1 ([3]). Presburger arithmetic is STA(∗, 2<sup>2<sup>n<sup>O(1)</sup></sup></sup>, O(n))-complete.

To be more precise, the number of alternations required to decide the validity or satisfiability of a formula Φ from Presburger arithmetic is linear in nr(Φ). Notice that 2NExpTime ⊆ STA(∗, 2<sup>2<sup>n<sup>O(1)</sup></sup></sup>, O(n)) ⊆ 2ExpSpace.

### 3 A quantifier elimination procedure for PAC

In this section, we develop a new quantifier elimination procedure (QE procedure) for the counting quantifier ∃<sup>=x</sup>y:

Proposition 2. Let Φ be quantifier-free. Then ∃<sup>=x</sup>y Φ is equivalent to a Boolean combination of linear inequalities and simple modulo constraints.

We quantify the growth of parameters in the formula in Section 4. Upper bounds on this growth are at the core of our results. Without any bounds (as stated), Proposition 2 is known and can be obtained by chaining the quantifier elimination procedure developed by Schweikardt [10] together with the standard quantifier elimination procedure for Presburger arithmetic. An advantage of our QE procedure for the quantifier ∃<sup>=x</sup>y is that, when eliminating a counting quantifier, it avoids the introduction of the additional ∃- and ∀-quantifiers on which Schweikardt's procedure relies. More precisely, given a formula ∃<sup>=x</sup>y Φ where Φ is quantifier-free (q.f. in short), the QE procedure in [10] requires a full transformation of Φ into disjunctive normal form, and eliminates the quantifier ∃<sup>=x</sup>y by introducing first-order quantifiers, producing an equivalent formula Ψ of Presburger arithmetic. This strategy comes at a cost: the size of the q.f. formula obtained after removing the quantifiers from Ψ is doubly exponential in the size of ∃<sup>=x</sup>y Φ. By avoiding the introduction of first-order quantifiers, our QE procedure already exponentially improves upon Schweikardt's procedure.

Our QE procedure performs a series of formula manipulations, divided into five steps. At the end of the i-th step, the procedure produces a formula Φ<sub>i</sub> equivalent to the original formula ∃<sup>=x</sup>y Φ. Ultimately, Φ<sub>5</sub> is a Boolean combination of inequalities and simple modulo constraints allowing us to establish Proposition 2. In this section, we present the procedure and briefly discuss its correctness, leaving the computational analysis of parameters lin(Φ<sub>5</sub>), hom(Φ<sub>5</sub>), const(Φ<sub>5</sub>) and mod(Φ<sub>5</sub>) to subsequent sections.

Step I: Normalise the coefficients of y. Given the input formula Φ<sub>0</sub> = ∃<sup>=x</sup>y Φ, with Φ q.f., the first step of the procedure is a standard step for QE procedures for Presburger arithmetic. It produces an equivalent formula Φ<sub>1</sub> in which all non-zero coefficients of y appearing in a linear term are normalized to 1 or −1. For simplicity, we first translate every modulo constraint in Φ into simple modulo constraints, by relying on the lemma below.

Lemma 1. Every constraint t ≡<sub>q</sub> 0 is equivalent to a Boolean combination Ψ of simple modulo constraints such that vars(Ψ) ⊆ vars(t ≡<sub>q</sub> 0) and mod(Ψ) = {q}.

The first step of our QE procedure is as follows:

- ay + t < 0 ⟶ ky + (k/a) · t < 0, if a > 0,
- ay + t < 0 ⟶ −ky − (k/a) · t < 0, if a < 0, and
- y ≡<sub>q</sub> r ⟶ ky ≡<sub>kq</sub> kr,

where t is a term, q ≥ 1 and r ∈ [q].

4 Define Φ<sub>1</sub> ≝ ∃<sup>=x</sup>y (y ≡<sub>k</sub> 0 ∧ Φ′[y/ky]).

Claim 1. Φ<sub>0</sub> ⇔ Φ<sub>1</sub>, and in Φ<sub>1</sub>, all non-zero coefficients of y are either 1 or −1.
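The rewriting behind Claim 1 can be checked on a toy instance. The sketch below is only an illustration under simplifying assumptions: each inequality a·y + t < 0 is given as a pair `(a, t)` where t is a ground integer standing in for a linear term, modulo constraints are left out, and k is taken to be the least common multiple of the absolute values of the coefficients of y. The helper names are hypothetical.

```python
from math import lcm

def normalise(constraints):
    """constraints: list of (a, t) encoding a*y + t < 0 with a != 0.
    Returns k and the rewritten constraints s*y' + c < 0 with s in {1, -1},
    where y' stands for k*y (assumption: k = lcm of the |a|)."""
    k = lcm(*(abs(a) for a, _ in constraints))
    # Multiplying a*y + t < 0 by the positive number k/|a| keeps the inequality
    # and turns the coefficient of y into k or -k.
    scaled = [(1 if a > 0 else -1, (k // abs(a)) * t) for a, t in constraints]
    return k, scaled

def count_original(constraints, bound):
    """Number of y in [-bound, bound] satisfying all original inequalities."""
    return sum(all(a * y + t < 0 for a, t in constraints)
               for y in range(-bound, bound + 1))

def count_normalised(k, scaled, bound):
    """Number of y' in [-k*bound, k*bound] with y' divisible by k
    that satisfy all rewritten inequalities."""
    return sum(y % k == 0 and all(s * y + c < 0 for s, c in scaled)
               for y in range(-k * bound, k * bound + 1))

constraints = [(2, -7), (-3, -5)]   # 2y - 7 < 0 and -3y - 5 < 0
k, scaled = normalise(constraints)
print(k, scaled, count_original(constraints, 20), count_normalised(k, scaled, 20))
```

The two counts agree because y ↦ k·y is a bijection between the solutions of the original constraints and the solutions of the rewritten ones that are divisible by k, which is the role of the conjunct y ≡<sub>k</sub> 0.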

Step II: Subdivide the formula according to term orderings and residue classes. We define an ordering for a set of linear terms T to be a formula of the form

$$(t\_1 \lhd\_1 t\_2) \land (t\_2 \lhd\_2 t\_3) \land \dots \land (t\_{n-1} \lhd\_{n-1} t\_n),\tag{1}$$

where {t<sub>1</sub>, …, t<sub>n</sub>} = T and {◁<sub>1</sub>, …, ◁<sub>n−1</sub>} ⊆ {<, =}.

Lemma 2. There is an algorithm that, given a set T of n linear terms over d variables, computes in time n<sup>O(d)</sup> log ||T||<sup>O(1)</sup> a set {O<sub>1</sub>, …, O<sub>o</sub>} of orderings for T s.t. (1) o = O(n<sup>2d</sup>), (2) ⊤ ⇔ ⋁<sub>i=1</sub><sup>o</sup> O<sub>i</sub>, (3) ⊥ ⇔ O<sub>i</sub> ∧ O<sub>j</sub> whenever i ≠ j.

Lemma 2 is proven analogously to [13, Proposition 5.1].

The second step of our QE procedure is as follows:


Claim 2. Φ<sub>1</sub> ⇔ Φ<sub>2</sub>.

In Steps III to V of the procedure, we focus on each disjunct of Φ<sub>2</sub> separately, iterating over all i ∈ [1, o], hence over all orderings, and all r : Z → [m], i.e., functions assigning residue classes modulo m to the variables in Z.

Step III: Split the range of y into segments. Recall that Φ<sub>1</sub> = ∃<sup>=x</sup>y Ψ, where Ψ is some Boolean combination of inequalities and modulo constraints with variables from vars(Φ) in which the non-zero coefficients of y are either 1 or −1. Let T|<sub>O<sub>i</sub></sub> ≝ (t′<sub>1</sub>, …, t′<sub>ℓ</sub>) be the tuple of all the terms in T that the formula O<sub>i</sub> asserts pairwise non-equal, taken in ascending order. In other words, we obtain t′<sub>1</sub>, …, t′<sub>ℓ</sub> by removing from the sequence t<sub>1</sub>, …, t<sub>n</sub> in Equation (1) all terms t<sub>j+1</sub> for which ◁<sub>j</sub> is =. Let seg(y, O<sub>i</sub>) be the set of formulae

$$\{ y < t\_1', \ y = t\_1', \ (t\_{i-1}' < y \land y < t\_i'), \ y = t\_i', \ t\_\ell' < y \; : \ i \in [2, \ell] \}.$$

We have #seg(y, O<sub>i</sub>) = 2ℓ + 1. Given κ ∈ seg(y, O<sub>i</sub>), the formula O<sub>i</sub> ∧ κ imparts a linear ordering on the terms T ∪ {y}. This enables us to "almost evaluate" Ψ:

Lemma 3. For every κ ∈ seg(y, O<sub>i</sub>), there is a Boolean combination Ψ<sup>i,r</sup><sub>κ</sub> of simple modulo constraints such that vars(Ψ<sup>i,r</sup><sub>κ</sub>) = {y}, mod(Ψ<sup>i,r</sup><sub>κ</sub>) ⊆ mod(Ψ) and

$$
\Gamma\_{i,r} \wedge \kappa \wedge \Psi \iff \Gamma\_{i,r} \wedge \kappa \wedge \Psi\_{\kappa}^{i,r} .
$$
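To make the shape of seg(y, O<sub>i</sub>) concrete, the following sketch builds the 2ℓ + 1 segments for a fixed valuation, that is, assuming the terms t′<sub>1</sub> < … < t′<sub>ℓ</sub> have already been evaluated to integers. The function `segments` and its string labels are hypothetical and serve only as an illustration.

```python
def segments(ts):
    """ts: strictly increasing integers standing for t'_1 < ... < t'_l.
    Returns (label, predicate) pairs in the order used for seg(y, O_i)."""
    segs = [("y < t'_1", lambda y, t=ts[0]: y < t),
            ("y = t'_1", lambda y, t=ts[0]: y == t)]
    for idx, (lo, hi) in enumerate(zip(ts, ts[1:]), start=2):
        # open interval between two consecutive terms, then the point itself
        segs.append((f"t'_{idx - 1} < y < t'_{idx}",
                     lambda y, lo=lo, hi=hi: lo < y < hi))
        segs.append((f"y = t'_{idx}", lambda y, hi=hi: y == hi))
    segs.append((f"t'_{len(ts)} < y", lambda y, t=ts[-1]: t < y))
    return segs

ts = [-2, 0, 5]
segs = segments(ts)
for label, _ in segs:
    print(label)
```

Every integer value of y satisfies exactly one of the returned predicates, so the segments partition the range of y once the ordering of the terms is fixed.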

Our QE procedure manipulates Φ<sub>2</sub> as follows:

10 For every i ∈ [1, o] and every r : Z → [m]:

11 Let seg(y, O<sub>i</sub>) = {κ<sub>0</sub>, …, κ<sub>2ℓ</sub>}.

12 For every j ∈ [0, 2ℓ], consider the formula Ψ<sup>i,r</sup><sub>κ<sub>j</sub></sub> from Lemma 3.

$$13 \qquad \text{Let } \Phi\_3^{i,r} = \exists x\_0 \dots \exists x\_{2\ell} \left( x = \sum\_{j=0}^{2\ell} x\_j \wedge \bigwedge\_{j=0}^{2\ell} \exists^{= x\_j} y(\kappa\_j \wedge \Psi\_{\kappa\_j}^{i,r}) \right).$$

14 Define Φ<sub>3</sub> ≝ ⋁<sub>i=1</sub><sup>o</sup> ⋁<sub>r : Z→[m]</sub> (Γ<sub>i,r</sub> ∧ Φ<sup>i,r</sup><sub>3</sub>).

Claim 3. Φ<sub>2</sub> ⇔ Φ<sub>3</sub>.

Step IV: Compute the number of solutions for each segment. We next aim at eliminating the counting quantifiers introduced in Step III in the sub-formulae ∃<sup>=x<sub>j</sub></sup>y (κ<sub>j</sub> ∧ Ψ<sup>i,r</sup><sub>κ<sub>j</sub></sub>). We go over each κ ∈ seg(y, O<sub>i</sub>), and consider three cases depending on whether it specifies (syntactically) an infinite interval, a finite segment, or a single value for y.

Notice that r is in fact an assignment to variables, so r(t) ∈ ℤ is well-defined for every term t with free variables Z. For all i ∈ [1, o] and r : Z → [m], given T|<sub>O<sub>i</sub></sub> = (t′<sub>1</sub>, …, t′<sub>ℓ</sub>) the procedure computes the following numbers c<sub>1</sub>, …, c<sub>ℓ</sub>, p<sub>2</sub>, …, p<sub>ℓ</sub> and r<sub>2</sub>, …, r<sub>ℓ</sub>.

15 For every j ∈ [1, ℓ]:

16 If Ψ<sup>i,r</sup><sub>κ</sub>[r(t′<sub>j</sub>)/y] is true, where κ = (y = t′<sub>j</sub>), then let c<sub>j</sub> ≝ 1, else let c<sub>j</sub> ≝ 0.


Lemma 4. Given a formula Ψ<sup>i,r</sup><sub>κ</sub> and m, u<sub>j</sub>, u<sub>j</sub>, the numbers p<sub>j</sub> and r′<sub>j</sub> can be computed in #P, or by a deterministic algorithm with running time O(m · |Ψ<sup>i,r</sup><sub>κ</sub>|).

The numbers c<sub>j</sub>, p<sub>j</sub>, r<sub>j</sub> determine, for each κ ∈ seg(y, O<sub>i</sub>), how many assignments to the variable y satisfy the formula Ψ<sup>i,r</sup><sub>κ</sub> in the conjunction Γ<sub>i,r</sub> ∧ κ ∧ Ψ<sup>i,r</sup><sub>κ</sub>. Intuitively, this is c<sub>j</sub> for κ of the form y = t′<sub>j</sub>, and (p<sub>j</sub>(t′<sub>j</sub> − t′<sub>j−1</sub>) + r<sub>j</sub>)/m for κ of the form t′<sub>j−1</sub> < y ∧ y < t′<sub>j</sub>. We say "intuitively" here, because in the latter case the expression above depends on other variables, so it is not, strictly speaking, a number. The following claims formalise this intuition:

Claim 4. Let κ ∈ {y < t′<sub>1</sub>, t′<sub>ℓ</sub> < y}. If Ψ<sup>i,r</sup><sub>κ</sub>(y) is satisfiable, then Φ<sup>i,r</sup><sub>3</sub> ⇔ ⊥.

Claim 5. Let j ∈ [1, ℓ], κ = (y = t′<sub>j</sub>), z ∈ X. Then, ∃<sup>=z</sup>y (κ ∧ Ψ<sup>i,r</sup><sub>κ</sub>) ⇔ z = c<sub>j</sub>.

Claim 6. Let κ = (t′<sub>j−1</sub> < y ∧ y < t′<sub>j</sub>) for some j ∈ [2, ℓ] and let z be a fresh variable. Then, Γ<sub>i,r</sub> ∧ ∃<sup>=z</sup>y (κ ∧ Ψ<sup>i,r</sup><sub>κ</sub>) ⇔ Γ<sub>i,r</sub> ∧ mz = p<sub>j</sub>(t′<sub>j</sub> − t′<sub>j−1</sub>) + r<sub>j</sub>.
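The linear shape m·z = p·(b − a) + r in Claim 6 can be illustrated numerically. The sketch below is not the paper's way of computing p<sub>j</sub> and r<sub>j</sub>; it is a brute-force stand-in under the assumption that the two endpoints a < b are integers whose residues modulo m are known and that the constraint on y depends only on y mod m. The helper names are hypothetical.

```python
def count_between(a, b, m, sat):
    """Number of integers y with a < y < b whose residue y mod m satisfies `sat`."""
    return sum(1 for y in range(a + 1, b) if sat(y % m))

def slope_and_offset(res_a, res_b, m, sat):
    """Hypothetical helper: returns (p, r) such that m * count = p * (b - a) + r
    for all a < b with a ≡ res_a and b ≡ res_b modulo m."""
    p = sum(1 for res in range(m) if sat(res))   # solutions in one full period
    lo, hi = res_a, res_b + m                     # representatives with lo < hi
    r = m * count_between(lo, hi, m, sat) - p * (hi - lo)
    return p, r

def agrees(a, b, m, sat):
    """Check the linear formula against a direct count for one pair a < b."""
    p, r = slope_and_offset(a % m, b % m, m, sat)
    return m * count_between(a, b, m, sat) == p * (b - a) + r

m = 6
sat = lambda res: res in (1, 4)   # stands in for a Boolean combination of simple modulo constraints
print(slope_and_offset(0, 1, m, sat), agrees(3, 29, m, sat))
```

The offset r depends only on the residues of the endpoints, which is why the residue assignment r : Z → [m] is fixed before the counting step.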

The procedure manipulates the formula Φ<sub>3</sub> as follows:

23 For every i ∈ [1, o] and every r : Z → [m]:

24 If Ψ<sup>i,r</sup><sub>κ</sub>(y) is satisfiable for some κ ∈ {y < t′<sub>1</sub>, t′<sub>ℓ</sub> < y}, then let Φ<sup>i,r</sup><sub>4</sub> ≝ ⊥,

$$25 \qquad \text{else } \Phi\_4^{i,r} \stackrel{\text{def}}{=} \exists x\_2 \dots \exists x\_{\ell} \left( x = \sum\_{j=2}^{\ell} x\_j + \sum\_{j=1}^{\ell} c\_j \wedge \bigwedge\_{j=2}^{\ell} m x\_j = p\_j(t\_j' - t\_{j-1}') + r\_j \right).$$

26 Define Φ<sub>4</sub> ≝ ⋁<sub>i=1</sub><sup>o</sup> ⋁<sub>r : Z→[m]</sub> (Γ<sub>i,r</sub> ∧ Φ<sup>i,r</sup><sub>4</sub>).

Claim 7. Φ<sub>3</sub> ⇔ Φ<sub>4</sub>.

Step V: Sum up the solutions. It remains to get rid of the variables x<sub>i</sub> introduced earlier. For each disjunct Γ<sub>i,r</sub> ∧ Φ<sup>i,r</sup><sub>4</sub> of Φ<sub>4</sub>, we use the notation from Step IV.

27 For every i ∈ [1, o] and every r : Z → [m]:

28 If Φ<sup>i,r</sup><sub>4</sub> = ⊥, then let Φ<sup>i,r</sup><sub>5</sub> ≝ ⊥,

$$29 \qquad \text{else let } \Phi\_5^{i,r} \stackrel{\text{def}}{=} \left( mx = \sum\_{j=2}^{\ell} (p\_j(t\_j' - t\_{j-1}') + r\_j) + m \cdot \sum\_{j=1}^{\ell} c\_j \right).$$

30 Let Φ<sub>5</sub> ≝ ⋁<sub>i=1</sub><sup>o</sup> ⋁<sub>r : Z→[m]</sub> (Γ<sub>i,r</sub> ∧ Φ<sup>i,r</sup><sub>5</sub>).

The procedure outputs Φ<sub>5</sub>. The following claim implies Proposition 2.

Claim 8. Φ<sub>4</sub> ⇔ Φ<sub>5</sub>. The formula Φ<sub>5</sub> is quantifier-free.

### 4 Discussion, summary of results and roadmap

The QE procedure for a single counting quantifier ∃<sup>=x</sup>y from Section 3 forms the basis of our results. In this section we discuss its use and lay out its applications.

Analysis of the procedure. The next lemma establishes the growth of the formulae and their parameters in our quantifier elimination procedure.

Lemma 5. Let Φ<sub>5</sub> be obtained from applying the QE procedure of Section 3 to a formula ∃<sup>=x</sup>y Φ, where Φ is quantifier-free and #vars(Φ) = d. Then:

$$\begin{aligned} \text{mod}(\Phi\_5) &= \{ m \} \quad \text{with } m = k \cdot \text{lcm}(\text{mod}(\Phi)) \text{ and } k \le \| \text{hom}(\Phi) \|^{\#\text{hom}(\Phi)},\\ \#\text{lin}(\Phi\_5) &\le N^{O(d)}, \qquad \|\text{lin}(\Phi\_5)\| \le \mathcal{O}(N) \cdot \|\text{lin}(\Phi)\|,\\ \#\text{hom}(\Phi\_5) &\le N^{O(d)}, \quad \|\text{hom}(\Phi\_5)\| \le \mathcal{O}(N) \cdot \|\text{hom}(\Phi)\|, \quad \text{with } N = m^2 \cdot \#\text{lin}(\Phi). \end{aligned}$$

Remark 1. With minor changes to our procedure, one can obtain a QE procedure for the quantifier ∃<sup>≥x</sup>y. In particular, since ∃<sup>≥x</sup>y Φ is true if there are infinitely many values for y that satisfy Φ, Claim 4 needs to be updated so that Φ<sup>i,r</sup><sub>3</sub> ⇔ ⊤ is deduced, instead of Φ<sup>i,r</sup><sub>3</sub> ⇔ ⊥. Other minor adaptations are required, e.g. equalities "x = …" and counting quantifiers ∃<sup>=x<sub>j</sub></sup>y appearing in Line 13 must be updated to "x ≤ …" and ∃<sup>≥x<sub>j</sub></sup>y. The resulting QE procedure for ∃<sup>≥x</sup>y still adheres to the bounds in Lemma 5.

A consequence of Lemma 5 is that our QE procedure gives an algorithm for deciding a formula Φ from PAC featuring multiple counting quantifiers ∃<sup>=x</sup>y in time bounded by a tower of exponentials 2<sup>2<sup>⋰<sup>2</sup></sup></sup>, where the height of the tower is linear in the quantifier rank of Φ. Indeed, in view of the upper bounds and equations given by Lemma 5 for #hom(Φ<sub>5</sub>), N, m, and k, we observe that the upper bound for #hom(Φ<sub>5</sub>) is exponential in #hom(Φ). This means that more fine-grained bounds are necessary for decision procedures with elementary complexity, i.e., with a running time bounded from above by a k-fold exponential in the size of the input formula.

Elementary decision procedures. In view of this growth of the parameters, it is natural to ask whether our QE procedure is perhaps naïvely disregarding important properties of the underlying arithmetic theory that could lead to better bounds. A good test in this direction is to check whether improved bounds can be achieved when the procedure runs on restricted forms of counting quantifiers. In the remainder of the paper we show that this is the case, and explain how the growth of parameters can be countered for restricted quantifiers, obtaining 3ExpTime quantifier elimination procedures as well as 2ExpSpace decision procedures for extensions of PA with a variety of counting quantifiers.

As an example, let us consider Presburger arithmetic enriched with threshold quantifiers ∃<sup>≥c</sup>y Φ, where c ∈ N is written in binary. These are satisfied whenever there are at least c distinct values for the variable y that make the formula Φ true. Notice that the threshold counting quantifiers ∃<sup>≥c</sup>y are a syntactic generalisation of the first-order quantifiers, as ∃<sup>≥1</sup>y Φ ⇔ ∃y Φ. Interestingly enough, one can translate threshold quantifiers into standard Presburger arithmetic with just a polynomial increase in the size of the formula. For simplicity, assume that the threshold c is a power of 2. Then, the quantifier ∃<sup>≥c</sup>y can be internalised in PA by relying on the equivalence

$$\exists^{\geq 2g} y \,\Phi(y,\mathbf{z}) \Leftrightarrow \exists u \,\forall v \,\exists^{\geq g} y \,:\,(v = 0 \leftrightarrow y < u) \land \Phi(y,\mathbf{z})$$

as well as ∃<sup>≥1</sup>y Φ ⇔ ∃y Φ. However, in terms of decision procedures, this is an inadequate solution, as it comes at the cost of introducing 2 log<sub>2</sub> c many quantifier alternations. Building upon the QE procedure from Section 3, we show how to directly eliminate threshold quantifiers. This proves that the increase in alternation depth that depends on the threshold c is unnecessary.

Theorem 1. The validity of a formula Φ from Presburger arithmetic with threshold counting quantifiers can be decided in STA(∗, 2<sup>2<sup>|Φ|<sup>O(1)</sup></sup></sup>, O(fd(Φ))).
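The halving equivalence for ∃<sup>≥2g</sup>y can be sanity-checked by brute force on finite solution sets. The sketch below assumes that the set of values of y satisfying Φ is given explicitly as a finite set of integers and that the witness u is searched in a supplied candidate range; both assumptions and the helper names are for illustration only.

```python
from itertools import combinations

def threshold(solutions, c):
    """Direct semantics of the threshold quantifier: at least c solutions."""
    return len(solutions) >= c

def halved(solutions, g, candidates):
    """Right-hand side of the equivalence: some u splits the solutions so that
    at least g lie below u (the case v = 0) and at least g lie at or above u
    (the case v != 0)."""
    return any(sum(1 for y in solutions if y < u) >= g and
               sum(1 for y in solutions if y >= u) >= g
               for u in candidates)

def all_subsets(domain):
    """All subsets of a finite domain, as tuples."""
    return [s for size in range(len(domain) + 1)
            for s in combinations(domain, size)]

domain = range(6)
print(all(threshold(s, 2 * g) == halved(s, g, range(-1, 8))
          for s in all_subsets(domain) for g in (1, 2, 3)))
```

Each unfolding halves the threshold and costs one ∃u and one ∀v, which is where the logarithmic number of additional alternations comes from.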

This result matches the complexity of deciding standard PA in the case of unbounded alternation depth. Thus, PA can be enriched with threshold quantifiers with almost no computational overhead. Note that a slight increase in number of alternations is still required, and goes from O(nr(Φ)) for PA to O(fd(Φ)) for PA with threshold counting quantifiers.

We further strengthen Theorem 1, extending it to the case where the threshold c is encoded even more succinctly, as the unique solution of a PA formula Φ(x) as long as this solution is bounded doubly-exponentially in |Φ|. An example of such a formula is Φ(x) = ∃z : z = 1 ∧ Ψn(x, z), where

$$\begin{aligned} \Psi\_0(x, z) & \stackrel{\text{def}}{=} x = 2z, \\ \Psi\_{n+1}(x, z) & \stackrel{\text{def}}{=} \exists y \forall a \forall b : (a = x \land b = y) \lor (a = y \land b = z) \to \Psi\_n(a, b), \end{aligned}$$

and the only solution is given by x = 2<sup>2<sup>n</sup></sup> [7, Lecture 23], whilst |Φ| = O(n). The crux of our results lies in the identification of a fragment of PAC that we call monadically-guarded, for which the following theorem can be established.

Theorem 2. Monadically-guarded PAC is decidable in 2ExpSpace.

In the next section, we introduce the monadically-guarded fragment of PAC and discuss extensions of PA that can be captured by this fragment. In Section 6, by adding post-processing to the procedure from Section 3, we show how to deal with any monadically-guarded counting quantifiers in 3ExpTime. In Section 7 we establish Theorem 2 by designing a quantifier relativization argument, continuing the direction of research due to [12]. In Section 8 we prove Theorem 1.
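As a side illustration of the succinct encoding Ψ<sub>n</sub> used earlier in this section: the universally quantified implication in Ψ<sub>n+1</sub>(x, z) forces both Ψ<sub>n</sub>(x, y) and Ψ<sub>n</sub>(y, z) while mentioning Ψ<sub>n</sub> only once, so Ψ<sub>n</sub>(x, z) holds exactly when x = k·z for a multiplier k that squares at each level. The sketch below computes this multiplier; the function name is hypothetical.

```python
def psi_multiplier(n):
    """Return k such that Psi_n(x, z) holds exactly when x = k * z."""
    k = 2                  # Psi_0(x, z) is x = 2z
    for _ in range(n):
        # Psi_{n+1}(x, z) is equivalent to: exists y with Psi_n(x, y) and Psi_n(y, z),
        # i.e. x = k*y and y = k*z, hence x = k*k*z
        k = k * k
    return k

for n in range(4):
    print(n, psi_multiplier(n))
```

With z = 1 this gives the unique solution x = 2<sup>2<sup>n</sup></sup>, although the formula itself has size linear in n.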

### 5 The monadically-guarded fragment of PAC

Fix a logic L. A formula Φ(x, z) from L, where z is a tuple of variables not including x, is said to be monadically decomposable on the variable x whenever

$$
\Phi \Leftrightarrow \Psi,\text{ for some } \Psi \stackrel{\text{def}}{=} \bigvee\_{i \in I} (\Delta\_i(x) \wedge \Gamma\_i(\mathbf{z})),
$$

where ∆<sub>i</sub> and Γ<sub>i</sub> are formulae from L. In this case, Ψ is said to be a monadic decomposition of Φ on the variable x.

The notion of monadic decomposition has been put forward by Veanes et al. in [11], as a general simplification technique that improves the performance of solvers. Here, our interest lies in studying whether the notion of monadic decomposability can bring complexity advantages for Presburger arithmetic with counting quantifiers. With this in mind, we consider formulae of PAC that we call monadically-guarded: those in which the quantifiers ∃<sup>=x</sup>y only appear in subformulae of the form ∃x (Ψ ∧ ∃<sup>=x</sup>y Φ), where Φ and Ψ are themselves from the monadically-guarded fragment of PAC, x does not occur in Φ, and Ψ is monadically decomposable on the variable x. The monadically-guarded fragment of PAC is understood as the set of all formulae from PAC that are monadically-guarded. This fragment captures several interesting extensions of PA:

– It can express that the number of different y satisfying Φ(y, z) lies in an arithmetic progression b, b + p, b + 2 · p, …, b + i · p, …, with b, p ∈ N. That is,

$$\exists x (x \ge b \land x \equiv\_p b \land \exists^{=x} y \, \Phi(y, z)).$$

This type of monadically-guarded formulae extends the modulo counting quantifiers studied by Habermehl and Kuske [4]. Modulo counting quantifiers are written as ∃<sup>(r,q)</sup>y Φ and hold whenever the number of different y satisfying Φ is congruent to r modulo q. Hence, ∃<sup>(r,q)</sup>y Φ ⇔ ∃x (x ≡<sub>q</sub> r ∧ ∃<sup>=x</sup>y Φ).

Moreover, in the monadically-guarded fragment, we can replace the integer r with an arbitrary linear term t with variables from z, since the modulo constraint x ≡<sub>p</sub> t can be monadically decomposed into ⋁<sub>r∈[p]</sub> (x ≡<sub>p</sub> r ∧ t ≡<sub>p</sub> r).

– As we recalled in the previous section with the formula Ψ<sub>n</sub>(x, z), it is known that PA allows one to succinctly encode numbers that are doubly or triply exponentially large with respect to the size of the formula. For instance, one can define a formula L<sub>n</sub>(x), again of size polynomial in n, that is true whenever x is the product of all primes in the interval [2, 2<sup>2<sup>n</sup></sup>] (see [7, Lecture 24]). In this case, x ≥ 2<sup>c·2<sup>2<sup>n</sup></sup></sup> for some fixed c > 0. The monadically-guarded fragment of PAC allows one to use these succinct representations as guards of counting quantifiers. For instance, ∃x (L<sub>n</sub>(x) ∧ ∃<sup>=x</sup>y Ψ(y, z)) is true whenever the number of y satisfying Ψ(y, z) is the product of all primes in [2, 2<sup>2<sup>n</sup></sup>].
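The monadic decomposition of the guard x ≡<sub>p</sub> t mentioned in the first item can be checked by enumeration. In the sketch below, t is an already evaluated integer standing in for a linear term over z, and [p] is taken to be the set {0, …, p − 1}; the function names are hypothetical.

```python
def guard(x, t, p):
    """The modulo constraint x ≡_p t, which mixes x with the value t of a term over z."""
    return (x - t) % p == 0

def decomposed(x, t, p):
    """Its monadic decomposition: a disjunction over residues r of
    (x ≡_p r) and (t ≡_p r), where each conjunct mentions x or z but not both."""
    return any(x % p == r and t % p == r for r in range(p))

print(all(guard(x, t, p) == decomposed(x, t, p)
          for p in range(1, 7)
          for x in range(-12, 13)
          for t in range(-12, 13)))
```

The decomposition has p disjuncts, one per residue class, which matches the index set I of the definition above.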

Hague et al. [5] proved that constructing the monadic decomposition of a quantifier-free formula can be done in exponential time. More precisely, given a q.f. formula Φ(x, y) from PA that is monadically decomposable on x, in [5] it is shown that there is a natural number B of magnitude exponential in |Φ| that makes the following formula ΨB(x, y) a monadic decomposition of Φ on x:

$$\begin{aligned} \Psi\_B \stackrel{\text{def}}{=} \bigvee\_{c=0}^{m-1} \left( \left( x \ge B \land x \equiv\_m c \land \Phi(B+c, y) \right) \vee \left( x \le -B \land x \equiv\_m c \land \Phi(-B-c, y) \right) \right) \\ \qquad \lor \bigvee\_{c=-B+1}^{B-1} (x = c \land \Phi(c, y)), \end{aligned}$$

where m = lcm(mod(Φ)). We study the arguments presented in [5] and refine the bound B, tracking dependencies on several formula parameters separately. We find that B is polynomial in ||Φ||; it is only exponential in #mod(Φ) and in the number of variables of the tuple y.

Proposition 3. Let Φ(x, y) be a q.f. formula from PA, where y = (y<sub>1</sub>, …, y<sub>d</sub>). Let m = lcm(mod(Φ)) and B = 2<sup>48d<sup>2</sup></sup> (m · ||lin(Φ)||)<sup>6d</sup> + 1. If Φ is monadically decomposable on x, then the formula Ψ<sub>B</sub> is such a decomposition.

Together with our QE procedure, Proposition 3 shows that it is decidable to check whether a formula of PAC is monadically decomposable (on a certain variable). Due to Theorem 2, this problem is in 2ExpSpace for formulae of the monadically-guarded fragment of PAC. Besides, notice that all formulae having one free variable are monadic decompositions of themselves.

Our QE procedure for the monadically-guarded fragment of PAC, outlined below, makes use of the sharper bound obtained in Proposition 3.

### 6 Eliminating monadically-guarded counting quantifiers

Consider a formula Φ<sub>0</sub> = ∃x (Ψ ∧ ∃<sup>=x</sup>y Φ), where Φ and Ψ are quantifier-free formulae, x does not occur in Φ, and Ψ is monadically decomposable on x. By relying on the QE procedure introduced in Section 3, we show how to obtain a quantifier-free formula equivalent to Φ<sub>0</sub>. W.l.o.g., we assume that all free variables distinct from x and y and occurring in Φ and Ψ come from the tuple of variables z.

Below, let Ψ′ = ⋁<sub>k∈K</sub> ∆<sub>k</sub>(x) ∧ Ψ<sub>k</sub>(z) be the monadic decomposition of Ψ on the variable x computed according to Proposition 3. Recall that this means that each ∆<sub>k</sub> is a formula having one among the following three forms:

$$x \ge B \land x \equiv\_q c; \qquad \qquad x \le -B \land x \equiv\_q c; \text{ or } \qquad \qquad x = r,$$

where q ≝ lcm(mod(Ψ)), c ∈ [q], r ∈ [−B + 1, B − 1] and B is a fixed natural number. Let us also consider the formula Φ<sub>5</sub> obtained from performing the QE procedure for the ∃<sup>=x</sup>y counting quantifier on ∃<sup>=x</sup>y Φ, so that Φ<sub>0</sub> ⇔ ∃x (Ψ′ ∧ Φ<sub>5</sub>). In particular, recall that Φ<sub>5</sub> ≝ ⋁<sub>i=1</sub><sup>o</sup> ⋁<sub>r : Z→[m]</sub> (Γ<sub>i,r</sub> ∧ Φ<sup>i,r</sup><sub>5</sub>), where Z is the set of variables appearing in z, m = lcm(mod(Φ)) and Γ<sub>i,r</sub> = O<sub>i</sub> ∧ (⋀<sub>w∈Z</sub> w ≡<sub>m</sub> r(w)) is a conjunction of an ordering O<sub>i</sub> and simple modulo constraints with variables from Z. Hence, Γ<sub>i,r</sub> is x-free. Moreover, Φ<sup>i,r</sup><sub>5</sub> is either ⊥ or a formula of the form

$$mx = \sum\_{j=2}^{\ell} (p\_j(t\_j' - t\_{j-1}') + r\_j) + m \cdot \sum\_{j=1}^{\ell} c\_j. \tag{2}$$

where the terms t′<sub>1</sub>, …, t′<sub>ℓ</sub> are from T (where T is defined as in Step II of Section 3), and hence x-free. Therefore, the following property holds.

Claim 9. In Φ<sub>5</sub>, x only appears on the left-hand side of equalities of the form (2).

This inconspicuous claim, together with the shape of ∆<sub>k</sub>, is at the heart of our QE procedure eliminating x from the formula ∃x (Ψ′ ∧ Φ<sub>5</sub>). Indeed, after distributing the existential quantifier ∃x and all conjunctions over disjunctions of Ψ′ ∧ Φ<sub>5</sub>, we end up with a disjunction of formulae of the form ∃x : ∆<sub>k</sub>(x) ∧ Ψ<sub>k</sub>(z) ∧ Γ<sub>i,r</sub> ∧ Φ<sup>i,r</sup><sub>5</sub>. Let us consider one such disjunct with ∆<sub>k</sub>(x) = (x ≥ B ∧ x ≡<sub>q</sub> c) and Φ<sup>i,r</sup><sub>5</sub> as in Equation (2). The variable x can be eliminated with a simple substitution, rewriting ∆<sub>k</sub>(x) ∧ Φ<sup>i,r</sup><sub>5</sub> as the new formula t̃ ≥ m · B ∧ t̃ ≡<sub>m·q</sub> m · c, where t̃ is the right-hand side of Equation (2). The correctness of this rewrite step follows simply from the equivalences x ≥ B ⇔ m · x ≥ m · B and x ≡<sub>q</sub> c ⇔ m · x ≡<sub>m·q</sub> m · c, with m ≥ 1. In a similar way, we can treat all possible cases for the different forms of ∆<sub>k</sub>(x) and Φ<sup>i,r</sup><sub>5</sub>. We obtain a formula

$$
\Psi\_k(\mathbf{z}) \land \Gamma\_{i,r} \land \widetilde{t} \ge m \cdot B \land \widetilde{t} \equiv\_{m \cdot q} m \cdot c. \tag{3}
$$
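The substitution step that leads to Equation (3) can be checked by brute force for the case ∆<sub>k</sub>(x) = (x ≥ B ∧ x ≡<sub>q</sub> c). The sketch below treats the right-hand side of Equation (2) as an integer value `t` and searches x in a bounded window; the concrete numbers and the function names are assumptions made only for the illustration.

```python
def with_quantifier(t, m, B, q, c, search):
    """Exists x in [-search, search] with x >= B, x ≡_q c and m*x = t."""
    return any(x >= B and (x - c) % q == 0 and m * x == t
               for x in range(-search, search + 1))

def eliminated(t, m, B, q, c):
    """The x-free rewriting: t >= m*B and t ≡_{m*q} m*c."""
    return t >= m * B and (t - m * c) % (m * q) == 0

m, B, q, c = 3, 4, 5, 2
print(all(with_quantifier(t, m, B, q, c, search=200) == eliminated(t, m, B, q, c)
          for t in range(-60, 200)))
```

The congruence modulo m·q also guarantees that t is divisible by m, so no separate divisibility constraint is needed after the substitution.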

The number of homogeneous terms across all such disjuncts is still prohibitive, as it was in Φ<sub>5</sub>. Now comes the key simplification step; we deal with the inequality t̃ ≥ m · B and with the modulo constraint t̃ ≡<sub>m·q</sub> m · c.

Consider the former first. By definition, all the coefficients p<sub>j</sub> of Equation (2) are non-negative, and thanks to the ordering O<sub>i</sub> appearing in Γ<sub>i,r</sub>, in every valuation ν satisfying the formula in Equation (3) we have ν(t′<sub>j</sub> − t′<sub>j−1</sub>) ≥ 0. Therefore, the inequality t̃ ≥ m · B can be translated into a formula of the form ⋁<sub>g∈G</sub> ⋀<sub>j=2</sub><sup>ℓ</sup> (t′<sub>j</sub> − t′<sub>j−1</sub> ≥ d<sub>g,j</sub>), where each d<sub>g,j</sub> is non-negative and, for every g ∈ G, the sum Σ<sub>j=2</sub><sup>ℓ</sup> p<sub>j</sub>d<sub>g,j</sub> is at least e ≝ m(B − Σ<sub>j=1</sub><sup>ℓ</sup> c<sub>j</sub>) − Σ<sub>j=2</sub><sup>ℓ</sup> r<sub>j</sub>. To compute this formula efficiently, we appeal to Lemma 2, with respect to the set of terms {t′<sub>j</sub> − t′<sub>j−1</sub> | j ∈ [2, ℓ]} ∪ [0, e].

Lemma 6. Let d = |fv(O<sub>i</sub> ∧ t̃ ≥ m · B)|. In time (e + ℓ)<sup>O(d)</sup> log(B · ||O<sub>i</sub>||)<sup>O(1)</sup> one can compute a formula Θ = ⋁<sub>g∈G</sub> ⋀<sub>j=2</sub><sup>ℓ</sup> (t′<sub>j</sub> − t′<sub>j−1</sub> ≥ d<sub>g,j</sub>) s.t. (1) d<sub>g,j</sub> ∈ [0, e + 1], (2) #G ≤ O((e + ℓ)<sup>2d</sup>), and (3) O<sub>i</sub> ∧ t̃ ≥ m · B ⇔ O<sub>i</sub> ∧ Θ.

A similar simplification can be done for the modulo constraint t̃ ≡<sub>m·q</sub> m · c: we guess residue classes of variables in t̃ modulo m · q, rewriting t̃ ≡<sub>m·q</sub> m · c into ⋁<sub>s : Z→[m·q]</sub> (t̃ ≡<sub>m·q</sub> m · c ∧ ⋀<sub>z∈Z</sub> z ≡<sub>m·q</sub> s(z)), and then replace, in each disjunct, t̃ ≡<sub>m·q</sub> m · c by ⊤ or ⊥, according to the satisfaction of s(t̃) ≡<sub>m·q</sub> m · c.
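The residue-guessing step works because the truth of a linear congruence depends only on the residues of its variables. The sketch below illustrates this for a hypothetical linear term over two variables and a hypothetical modulus M playing the role of m · q; none of the concrete numbers come from the procedure itself.

```python
from itertools import product

def term(values, coeffs, const):
    """Value of the linear term sum(a_i * z_i) + const under the given values."""
    return sum(a * v for a, v in zip(coeffs, values)) + const

def congruent(values, coeffs, const, target, M):
    """Does the term take a value congruent to `target` modulo M?"""
    return (term(values, coeffs, const) - target) % M == 0

coeffs, const, target, M = (2, -3), 1, 4, 6   # hypothetical term 2*z1 - 3*z2 + 1 and modulus

# Residue assignments s for which the congruence is replaced by "true".
true_classes = [s for s in product(range(M), repeat=len(coeffs))
                if congruent(s, coeffs, const, target, M)]
print(len(true_classes), "of", M ** len(coeffs), "residue assignments keep the disjunct")
```

For any integer values of the variables, evaluating the congruence directly gives the same answer as looking up the residues of those values in `true_classes`.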

The steps just discussed form the post-processing phase of our QE procedure for the monadically-guarded fragment of PAC. Thanks to Lemma 6, we can show that the set of homogeneous terms of the resulting quantifier-free formula Φ′, equivalent to Φ<sub>0</sub>, is the set of homogeneous terms in the monadic decomposition Ψ′, together with terms of the form t − t′ where t and t′ belong to the set T defined in Line 5. But #hom(Ψ′) = O(#hom(Φ<sub>0</sub>)), and thus:

Lemma 7. #hom(Φ′) ≤ O(#hom(Φ<sub>0</sub>)<sup>2</sup>).

Running time. Lemma 7 is the key to obtaining an elementary QE procedure. In particular, this improvement over the exponential dependence of #hom(Φ5) on #hom(Φ) from our "baseline" Lemma 5 leads to the following bounds on the elimination of an arbitrary number of monadically-guarded quantifiers.

Lemma 8. Let Ω be a formula from the monadically-guarded fragment of PAC, with quantifier rank d. There is an equivalent quantifier-free formula Υ such that

– #hom(Υ) ≤ |Ω|<sup>2<sup>O(d)</sup></sup> and #mod(Υ) ≤ O(|Ω|);

– #lin(Υ), ||const(Υ)||, ||hom(Υ)|| and ||mod(Υ)|| are at most 2<sup>|Ω|<sup>2<sup>O(d)</sup></sup></sup>.

Proof idea. In a nutshell, the bounds of Lemma 8 are obtained by first iterating Lemma 7 across all quantifier elimination rounds. This results in the doubly exponential bound |Ω|<sup>2<sup>O(d)</sup></sup> on the cardinality of the set of homogeneous terms throughout the entire procedure. With this bound in hand, exponentiation on the right-hand side of the inequalities of Section 3 does not blow the parameters above triple exponential.
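The effect of iterating a quadratic bound of the kind given by Lemma 7 can be seen in a small computation. The sketch below assumes a recurrence h ↦ C·h² with a hypothetical constant C hidden in the O-notation and compares the result after d rounds with its closed form; it illustrates the growth rate only and is not part of the proof.

```python
def iterate_squaring(h0, C, rounds):
    """Apply the bound h -> C * h^2 once per quantifier elimination round."""
    h = h0
    for _ in range(rounds):
        h = C * h * h
    return h

def closed_form(h0, C, rounds):
    """Closed form of the recurrence: C^(2^d - 1) * h0^(2^d)."""
    return C ** (2 ** rounds - 1) * h0 ** (2 ** rounds)

h0, C, d = 5, 3, 4   # hypothetical starting size, constant and quantifier rank
print(iterate_squaring(h0, C, d) == closed_form(h0, C, d))
```

Since the closed form is at most (C·h0)<sup>2<sup>d</sup></sup>, squaring d times yields a bound that is doubly exponential in d, in line with the shape |Ω|<sup>2<sup>O(d)</sup></sup>.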

Subsequent analysis leads to the following result.

Theorem 3. There is a 3ExpTime quantifier elimination procedure for the monadically-guarded fragment of PAC.

Theorem 3 follows by combining Lemma 8 with upper bounds on the running time of a single quantifier elimination round. These upper bounds are all subsumed by the size of the obtained formulae, except possibly for the subdivision procedure of Step II (Lemma 2), the model counting procedure of Step IV (Lemma 4), and the further subdivision performed by Lemma 6. For Lemmas 2 and 6, the running time is only exponential in the size of the original formula, and thus polynomial time in the size of the obtained formula, as long as the latter has at least exponential size. For Lemma 4, observe that m ≤ ||mod(Υ)||, where Υ is the quantifier-free formula of Lemma 8. Therefore, the bounds of Lemma 8 suffice for a triply exponential time overall.

Remark 2. Only small updates are necessary to treat monadically-guarded formulae of the form ∃x (Ψ(x, z) ∧ ∃<sup>≥x</sup>y Φ(y, z)). Again, these updates deal with the fact that, contrary to ∃<sup>=x</sup>y Φ, the formula ∃<sup>≥x</sup>y Φ is true whenever there are infinitely many y satisfying Φ, or alternatively when x corresponds to a non-positive number. Then, Lemma 8 can be established for formulae of PAC containing both monadically-guarded quantifiers ∃<sup>=x</sup> and ∃<sup>≥x</sup>.

### 7 The monadically-guarded fragment is in doubly exponential space

In this section, we prove Theorem 2. Theorem 3 shows that our QE procedure has the same asymptotic running time as the standard QE procedures for PA. Historically, bounds obtained from the latter lead to computationally optimal decision procedures based on quantifier relativisation [12,4]. More precisely, given a formula Φ from PA, the QE procedures allow us to conclude that there is a bound C, of bitsize at most doubly exponential in |Φ|, such that ∃x Φ ⇔ ∃x : −C ≤ x ≤ C ∧ Φ holds (a small-model property). Then, a quantifier relativisation procedure follows the semantics of the formula and naïvely tries all the possible assignments to x in [−C, C] whenever a quantifier ∃x is encountered. With some bookkeeping, this procedure runs in 2ExpSpace. In this section, we show that this is also the case for our QE procedure, leading to a 2ExpSpace relativisation procedure for the monadically-guarded fragment of PAC, proving Theorem 2.

First of all, we need to recall a folklore result regarding the existence of infinitely many solutions of a quantifier-free Presburger formula.

Lemma 9. Let ν be an assignment and Φ(y, z) be a q.f. formula of PA, where z has d variables. Let C def = ||Φ|| · d · max{1, |ν(z)| : z is in z} + ||Φ||#mod(Φ) + 1.


Together with Lemma 8, this result leads to the relativisation of first-order quantifiers in the context of PAC.

Lemma 10. There is a constant c with the following property. Let ν be an assignment, Φ(y, z) be a monadically-guarded formula of PAC, where z has d variables, and let C ≝ 2<sup>|Ψ|<sup>2<sup>c·d</sup></sup></sup> · max{1, |ν(z)| : z is in z}. Then, ν ⊨ ∃y Φ if and only if ν[n/y] ⊨ Φ holds for some n ∈ ℤ with |n| ≤ 3 · C.

We want to derive a similar lemma for monadically-guarded counting quantifiers. First of all, we consider a formula Φ = ∃<sup>=x</sup>y Ψ(y, z) where Ψ is a monadically-guarded formula. Recall that Φ is satisfied by an assignment ν whenever the number of distinct values n ∈ ℤ such that ν[n/y] ⊨ Ψ is finite and equal to ν(x). By relying on Lemmas 8 and 9, we show the following lemma.

Lemma 11. There is a constant c with the following property. Let ν be an assignment, and consider a formula Φ = ∃<sup>=x</sup>y Ψ(y, z) such that Ψ is a monadically-guarded formula of quantifier rank d. Let C ≝ 2<sup>|Ψ|<sup>2<sup>c·d</sup></sup></sup> · max{1, |ν(z)| : z is in z}. Then, ν ⊨ Φ iff (i) ν[n/y] ⊭ Ψ, for every n ∈ ℤ with C < |n| ≤ 3 · C; and (ii) #{n ∈ ℤ : |n| ≤ C and ν[n/y] ⊨ Ψ} = ν(x).
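The two conditions of Lemma 11 translate directly into a bounded search. The sketch below is a toy rendering: Ψ is a Python predicate on y with the other variables already fixed by ν, and `C` is a small hypothetical bound chosen by hand to be adequate for the toy predicates, whereas the bound of the lemma is doubly exponential.

```python
def counting_quantifier_holds(psi, x_value, C):
    """Bounded evaluation of the counting quantifier in the style of Lemma 11.
    psi: predicate on integers; x_value: the value nu(x); C: relativisation bound."""
    # (i) no solution in the outer band C < |n| <= 3C (rules out infinitely many solutions)
    if any(psi(n) for n in range(-3 * C, 3 * C + 1) if abs(n) > C):
        return False
    # (ii) exactly x_value solutions with |n| <= C
    return sum(1 for n in range(-C, C + 1) if psi(n)) == x_value

finite = lambda n: 0 <= n < 5     # five solutions
infinite = lambda n: n > 3        # infinitely many solutions

print(counting_quantifier_holds(finite, 5, 10),
      counting_quantifier_holds(finite, 4, 10),
      counting_quantifier_holds(infinite, 6, 10))
```

Condition (i) is what distinguishes the finite case from the infinite one without inspecting all integers: a solution in the outer band witnesses infinitely many solutions, so the quantifier fails for every value of x.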

We now consider the outermost quantifier x of a monadically-guarded formula Θ = ∃x (Ψ(x, z) ∧ ∃<sup>=x</sup>y Φ(y, z)), and aim at finding relativisation bounds for the variable x. Notice that the subformula Ψ(x, z) ∧ ∃<sup>=x</sup>y Φ(y, z) is not, strictly speaking, in the monadically-guarded fragment of PAC. However, we can first apply Lemma 8 and obtain quantifier-free formulae Ψ̂ and Φ′ equivalent to Ψ and Φ, respectively. Then, we apply the QE procedure of Section 3 on input ∃<sup>=x</sup>y Φ′, producing an equivalent quantifier-free formula Φ̂. We have Θ ⇔ ∃x (Ψ̂ ∧ Φ̂), where Ψ̂ ∧ Φ̂ is quantifier-free. Similarly to Lemma 10, we can now obtain relativisation bounds from ∃x (Ψ̂ ∧ Φ̂) by relying on Lemma 9:

Lemma 12. There is a constant c with the following property. Let ν be an assignment, and let Θ = ∃x (Ψ(x, z) ∧ ∃<sup>=x</sup>y Φ(y, z)) be a monadically-guarded formula of quantifier rank d. Define C ≝ 2<sup>|Θ|<sup>2<sup>c·d</sup></sup></sup> · max{1, |ν(z)| : z is in z}. Then, ν ⊨ Θ if and only if there is n ∈ N s.t. n ≤ C and ν[n/x] ⊨ Ψ ∧ ∃<sup>=x</sup>y Φ.

Lemmas 10 to 12 allow us to evaluate the truth of a sentence of the monadically-guarded fragment of PAC by recursively evaluating the truth of its subformulae, and iterating over a finite set of values when considering first-order and counting quantifiers. As all the considered values admit a binary encoding that is doubly exponential in the size of the input formula, this proves Theorem 2.

### 8 A complexity characterisation

By Theorem 2, for deterministic machines, the monadically-guarded fragment of PAC is no harder than standard Presburger arithmetic, and the same is true when considering monadically-guarded quantifiers ∃<sup>≥x</sup>y (Remark 2). However, by Proposition 1, PA is not complete for 2ExpSpace, but rather for the complexity class STA(∗, 2<sup>2<sup>n<sup>O(1)</sup></sup></sup>, O(n)). This leads to the natural question of whether the monadically-guarded fragment of PAC is also complete for the same STA class. While we leave this question open, in this section we show a completeness result in the restricted case where all monadically-guarded quantifiers appear in the form ∃x (Ψ(x) ∧ ∃<sup>≥x</sup>y Φ), where Ψ(x) is any formula from PAC having all models bounded by 2<sup>2<sup>|Ψ|</sup></sup>, in absolute value. For brevity, let us denote this fragment by F. As F extends PA, proving the following upper bound suffices.

Theorem 4. The validity of a sentence Φ in F can be decided by an alternating Turing machine with runtime 2<sup>2<sup>|Φ|<sup>O(1)</sup></sup></sup> and performing O(fd(Φ)) alternations.

Since the equivalence ∃<sup>≥c</sup>y Φ ⇔ ∃x : x = c ∧ ∃<sup>≥x</sup>y Φ, where c ∈ ℤ is written in binary, shows that F contains PA enriched with threshold counting quantifiers, this result implies Theorem 1. To establish Theorem 4, the first step is to rely on Lemmas 8 and 9 and adapt the proof of Lemma 11 to obtain a quantifier relativisation argument for the counting quantifier ∃<sup>≥x</sup>y.

Fig. 1. Deciding whether a formula Φ from F is satisfied by all assignments in V.

Lemma 13. There is a constant c with the following property. Let ν be an assignment, and consider a formula Φ = ∃<sup>≥x</sup>y Ψ(y, z) such that Ψ is a monadically-guarded formula of quantifier rank d. Let C ≝ 2<sup>|Ψ|<sup>2<sup>c·d</sup></sup></sup> · max{1, |ν(z)| : z is in z}. Then, ν ⊨ Φ iff (i) there is n ∈ ℤ s.t. ν[n/y] ⊨ Ψ and C < |n| ≤ 3 · C, or (ii) #{n ∈ ℤ : |n| ≤ C and ν[n/y] ⊨ Ψ} ≥ ν(x).

With Lemma 13 at hand, designing an algorithm that can be implemented as an alternating Turing machine with resources bounded as in Theorem 4 is simple. The function check(·, ·) given in Figure 1 provides such an algorithm.

Lemma 14. check(V, Φ) returns ⊤ if and only if for all ν ∈ V, ν ⊨ Φ.

When Φ is a sentence, i.e. fv(Φ) = ∅, this lemma implies that Φ is valid if and only if check({ν}, Φ) = ⊤, where ν is an arbitrary assignment. Then, Theorem 4 follows by establishing that check(·, ·) can be implemented with an alternating Turing machine that, on input ({ν}, Φ) where Φ is a sentence in F, runs in time 2<sup>2<sup>|Φ|<sup>O(1)</sup></sup></sup> and performs O(fd(Φ)) many alternations. We see the existential quantifications on V<sub>1</sub>, V<sub>2</sub>, ν and (W<sub>ν</sub>)<sub>ν∈V</sub> in Lines 2, 4 and 5 as guesses made by the alternating Turing machine. The computation in Line 1 is done deterministically in time polynomial in the encoding of V and t < 0. In Line 2, the alternating Turing machine decides which branch among check(V<sub>1</sub>, Φ<sub>1</sub>) and check(V<sub>2</sub>, Φ<sub>2</sub>) must be evaluated, at the cost of one alternation. In this way, alternations occur only in the case of check(V, Φ<sub>1</sub> ∨ Φ<sub>2</sub>) and check(V, ¬Ψ), as the latter returns the negation of the assertion "∃ν ∈ V : check({ν}, Ψ) = ⊤". This leads to O(fd(Φ)) many alternations overall.

Let us now discuss the runtime of check(·, ·), again on alternating Turing machines. Assume that, after a certain number of recursive calls including at most r ≤ qr(Φ) calls to Line 5, the algorithm evaluates the input (V′, Ψ). Then, the number of assignments in V′ is bounded by 2<sup>r·2<sup>|Φ|</sup></sup> (this corresponds to the case where each W<sub>ν</sub> in Line 7 contains the maximum number of assignments, according to k<sup>(ν)</sup>), and following the bounds on the numbers n<sup>(ν)</sup> and n<sub>i</sub><sup>(ν)</sup> in Lines 10 and 11 and by Lemma 13, all these assignments map each variable to an integer that is, in absolute value, bounded by 2<sup>r·|Φ|<sup>2<sup>c·d</sup></sup></sup>, where c is the constant of Lemma 13 and d is the number of variables in Φ. So, as the number of recursive calls to check(·, ·) is bounded by |Φ|, no more than |Φ| · 2<sup>qr(Φ)·2<sup>|Φ|</sup></sup> · log<sub>2</sub>(2<sup>qr(Φ)·|Φ|<sup>2<sup>c·d</sup></sup></sup>) ≤ 2<sup>2<sup>(c+3)|Φ|</sup></sup> space is required to represent all possible sets of assignments that are generated throughout the evaluation of check(·, ·). All the assignments are guessed by the alternating Turing machine and thus, when also accounting for the computation done in Line 1, we conclude that check({ν}, Φ) runs in time 2<sup>2<sup>|Φ|<sup>O(1)</sup></sup></sup>.
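The space bound |Φ| · 2^(qr(Φ)·2^|Φ|) · log₂(2^(qr(Φ)·|Φ|^(2^(c·d)))) ≤ 2^(2^((c+3)|Φ|)) can be checked factor by factor. The estimate below is only a sketch: it assumes c ≥ 1, |Φ| ≥ 1, and that both qr(Φ) and d are at most |Φ|, none of which is stated explicitly in the text.

$$\begin{aligned}
|\Phi| &\le 2^{|\Phi|}, \\
2^{\mathrm{qr}(\Phi) \cdot 2^{|\Phi|}} &\le 2^{|\Phi| \cdot 2^{|\Phi|}} \le 2^{2^{2|\Phi|}}, \\
\log\_2\big(2^{\mathrm{qr}(\Phi) \cdot |\Phi|^{2^{c \cdot d}}}\big) &= \mathrm{qr}(\Phi) \cdot |\Phi|^{2^{c \cdot d}} \le 2^{|\Phi|} \cdot 2^{|\Phi| \cdot 2^{c|\Phi|}} \le 2^{|\Phi|} \cdot 2^{2^{(c+1)|\Phi|}}.
\end{aligned}$$

Multiplying the three bounds and comparing exponents of 2 gives

$$2|\Phi| + 2^{2|\Phi|} + 2^{(c+1)|\Phi|} \le 4 \cdot 2^{(c+1)|\Phi|} = 2^{(c+1)|\Phi| + 2} \le 2^{(c+3)|\Phi|},$$

where the last step uses |Φ| ≥ 1.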

### 9 Conclusion

We developed a new quantifier elimination procedure for Presburger arithmetic extended with the unary counting quantifiers (PAC), and adapted it for its monadically-guarded fragment. While the existence of an algorithm for PAC running in elementary time is wide open, our procedure runs in 3ExpTime on the monadically-guarded fragment and leads to the small-model property and relativisation argument, which show that this logic is decidable in 2ExpSpace. When it comes to deterministic algorithms, this matches the complexity of deciding standard Presburger arithmetic. However, fully settling the complexity of the monadically-guarded fragment of Presburger arithmetic seems to require a generalisation of the STA complexity framework to capture counting mechanisms, which we leave as an avenue for further investigation. In this direction, we have shown that Presburger arithmetic is still STA(∗, 2<sup>2<sup>n<sup>O(1)</sup></sup></sup>, O(n))-complete when enriched with threshold quantifiers ∃<sup>≥c</sup>y, for the case of c written in binary but also even for the case of c represented succinctly as a solution of a Presburger formula Φ, characterising a number that may be doubly exponential in |Φ|.

With respect to our QE procedure for (general) unary counting quantifiers ∃<sup>=x</sup>y, we have pinpointed precisely where the non-elementary growth occurs. It remains to be seen whether our procedure can be further improved, or if, possibly based on insights obtained from it, a non-elementary lower bound for Presburger arithmetic extended with the ∃<sup>=x</sup>y quantifier can be established.

Acknowledgments. We are grateful to our reviewers for drawing our attention to the translation of threshold counting into standard PA. This work is part of a project that has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant agreement No. 852769, ARiAT).




### A first-order logic characterisation of safety and co-safety languages

Alessandro Cimatti<sup>1</sup>, Luca Geatti<sup>3</sup>, Nicola Gigante<sup>3</sup>, Angelo Montanari<sup>2</sup>, and Stefano Tonetta<sup>1</sup>

> <sup>1</sup> Fondazione Bruno Kessler, Trento, Italy {cimatti,tonettas}@fbk.eu <sup>2</sup> University of Udine, Italy angelo.montanari@uniud.it <sup>3</sup> Free University of Bozen-Bolzano, Italy {geatti,gigante}@inf.unibz.it

Abstract. Linear Temporal Logic (LTL) is one of the most popular temporal logics, and it comes into play in a variety of branches of computer science. Its widespread use is also due to its strong foundational properties. One of them is Kamp's theorem, showing that LTL and the first-order theory of one successor (S1S[FO]) are expressively equivalent. Safety and co-safety languages, where a finite prefix suffices to establish whether a word does not or does belong to the language, respectively, play a crucial role in lowering the complexity of problems like model checking and reactive synthesis for LTL. Safety-LTL (resp., coSafety-LTL) is a fragment of LTL where only universal (resp., existential) temporal modalities are allowed, that recognises safety (resp., co-safety) languages only. In this paper, we introduce a fragment of S1S[FO], called Safety-FO, and its dual coSafety-FO, which are expressively complete with regard to the LTL-definable safety languages. In particular, we prove that they respectively characterise exactly Safety-LTL and coSafety-LTL, a result that joins Kamp's theorem, and provides a clearer view of the characterisations of (fragments of) LTL in terms of first-order languages. In addition, it gives a direct, compact, and self-contained proof that any safety language definable in LTL is definable in Safety-LTL as well. As a by-product, we obtain some interesting results on the expressive power of the weak tomorrow operator of Safety-LTL interpreted over finite and infinite traces.

### 1 Introduction

Linear Temporal Logic (LTL) is the de-facto standard logic for system specifications [14]. It is a modal logic that is usually interpreted over infinite state sequences, but the finite-trace semantics has recently gained attention as well [6,7]. The widespread use of LTL is due to its simple syntax and semantics, and to its strong foundational properties. Among them, we would like to mention the seminal work by Kamp [10] and Gabbay et al. [8], on its expressive completeness, i.e., LTL-definable languages are exactly those definable in the first-order fragment of the monadic second-order theory of one successor [3] (S1S[FO] for short).

In formal verification, an important class of specifications is that of safety languages. They are languages of infinite words where a finite prefix suffices to tell whether a word does not belong to the language. As an example, the set of all and only those infinite sequences where some particular bad event never happens can be regarded as a safety language. In their duals, co-safety languages (sometimes called guarantee languages), a finite prefix is sufficient to tell whether a word belongs to the language, e.g., when some desired event is mandated to eventually happen. Safety and co-safety languages are important for verification, model-checking, monitoring, and automated synthesis because they capture a variety of real-world requirements while being much simpler to deal with algorithmically [1, 11, 20].

Safety-LTL is the fragment of LTL where only universal temporal modalities are allowed. Similarly, its dual coSafety-LTL is obtained by only allowing existential modalities. It has been proved by Chang et al. [5] that Safety-LTL and coSafety-LTL define exactly the safety and co-safety languages that are definable in LTL, respectively.

In this paper, we provide a novel characterization of LTL-definable safety languages, and of their duals, in terms of a fragment of S1S[FO], called Safety-FO, and its dual coSafety-FO. The presented fragments have a very natural syntax, and we prove they are expressively complete with regard to LTL-definable safety and co-safety languages. We prove the correspondence between coSafety-FO and coSafety-LTL, which extends naturally to their duals and can be considered as a version of Kamp's theorem [10] specialized for safety and co-safety properties, helping to create a clearer picture of the correspondence between (fragments of) temporal and first-order logics. We exploit such a result to prove the correspondence between co-safety languages definable in LTL and coSafety-FO, thus establishing also the equivalence between the former and coSafety-LTL. This provides a proof of the fact that Safety-LTL captures exactly the set of LTL-definable safety languages [5], which can be regarded as another contribution of the paper. The interest of our proof is twofold: on the one hand, the original proof by Chang et al. [5] is only sketched and it relies on two non-trivial translations scattered across different sources [16, 21]; on the other hand, such an equivalence result seems not to be widely known, as some authors presented the problem as open as recently as 2017 [20].<sup>4</sup> Thus, a compact and self-contained proof of the result seems to be a useful contribution for the community. It is worth noting that both proofs build on the fact that safety/co-safety languages can be captured by formulas of the form Gα/Fα with α pure-past, but after that, the two proofs significantly diverge. Finally, as a by-product of this proof, we provide some results that assess the expressive power of the weak tomorrow operator of Safety-LTL when interpreted over finite vs. infinite traces.

The paper is organized as follows. After recalling necessary background knowledge in Section 2, Section 3 introduces Safety-FO and coSafety-FO and proves their correspondence with Safety-LTL and coSafety-LTL. Then, Section 4 proves their correspondence with the set of safety and co-safety languages definable in LTL, thus providing a compact and self-contained proof of the equivalence between Safety-LTL and LTL-definable safety languages. Some properties of the weak tomorrow operator are outlined as well. Finally, Section 5 concludes the paper with some final considerations and a discussion of future work.

<sup>4</sup> As a matter of fact, we learned of Chang et al. [5] after setting up the proof shown in this paper.

### 2 Preliminaries

Let A be a finite alphabet. We denote as A<sup>∗</sup> and A<sup>ω</sup> the set of all finite and infinite words, respectively, over A. We let A<sup>+</sup> = A<sup>∗</sup> \ {ε}, where ε is the empty word. Given a word σ ∈ A<sup>∗</sup> we denote as |σ| the length of σ. For an infinite word σ ∈ A<sup>ω</sup>, |σ| = ω. For a (finite or infinite) word σ, we denote as σ<sub>i</sub> ∈ A, for 0 ≤ i < |σ|, the letter at the i-th position of the word. With σ<sub>[i,j]</sub>, for 0 ≤ i ≤ j < |σ|, we denote the subword that goes from the i-th to the j-th letter of the word, extrema included. With σ<sub>[i,∞]</sub> we denote the suffix of σ starting from the i-th letter. Given a word σ ∈ A<sup>∗</sup> and σ′ ∈ A<sup>∗</sup> ∪ A<sup>ω</sup>, we denote the concatenation of the two words as σ · σ′, or simply σσ′. A language L, either L ⊆ A<sup>∗</sup> or L ⊆ A<sup>ω</sup>, is a set of words. Given two languages L and L′ with L ⊆ A<sup>∗</sup> and either L′ ⊆ A<sup>∗</sup> or L′ ⊆ A<sup>ω</sup>, we define L · L′ = {σ · σ′ | σ ∈ L and σ′ ∈ L′}. For a finite word σ = σ<sub>0</sub> . . . σ<sub>k</sub> let σ<sup>r</sup> = σ<sub>k</sub> . . . σ<sub>0</sub> be the reverse of σ, and for a language of finite words L let L<sup>r</sup> = {σ<sup>r</sup> | σ ∈ L}. We can now define safety and co-safety languages.

Definition 1 (Safety language [11, 19]). Let L ⊆ A<sup>ω</sup>. We say that L is a safety language if and only if for all the words σ ∈ A<sup>ω</sup> it holds that, if σ ∉ L, then there exists an i ∈ ℕ such that, for all σ′ ∈ A<sup>ω</sup>, σ<sub>[0,i]</sub> · σ′ ∉ L. The class of safety languages is denoted as SAFETY.

Definition 2 (Co-safety language [11, 19]). Let L ⊆ A<sup>ω</sup>. We say that L is a co-safety language if and only if for all the words σ ∈ A<sup>ω</sup> it holds that, if σ ∈ L, then there exists an i ∈ ℕ such that, for all σ′ ∈ A<sup>ω</sup>, σ<sub>[0,i]</sub> · σ′ ∈ L. The class of co-safety languages is denoted as coSAFETY.

Linear Temporal Logic with Past (LTL+P) is a modal logic interpreted over infinite or finite words. Given a set Σ of proposition variables, the syntax of an LTL+P formula ϕ is generated by the following grammar:
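The operators discussed in the next paragraph suggest the following shape for the grammar. This is a reconstruction offered as a sketch; the grouping and ordering of the productions are assumptions rather than the authors' exact display.

$$\begin{aligned}
\phi := {} & p \mid \neg\phi\_1 \mid \phi\_1 \lor \phi\_2 \mid \phi\_1 \land \phi\_2 && \text{Boolean connectives} \\
\mid {} & \mathsf{X}\phi\_1 \mid \widetilde{\mathsf{X}}\phi\_1 \mid \phi\_1 \mathbin{\mathcal{U}} \phi\_2 \mid \phi\_1 \mathbin{\mathcal{R}} \phi\_2 && \text{future modalities} \\
\mid {} & \mathsf{Y}\phi\_1 \mid \mathsf{Z}\phi\_1 \mid \phi\_1 \mathbin{\mathcal{S}} \phi\_2 \mid \phi\_1 \mathbin{\mathcal{T}} \phi\_2 && \text{past modalities}
\end{aligned}$$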


where ϕ<sub>1</sub> and ϕ<sub>2</sub> are LTL+P formulas and p ∈ Σ. An LTL+P formula is a pure future formula if it does not make use of past modalities, and it is pure past if it does not make use of future modalities. We denote with LTL the set of pure future formulas, and with LTL<sub>P</sub> the set of pure past formulas. Most of the temporal operators of the language can be defined in terms of a small number of basic ones. In particular, conjunction can be defined in terms of disjunction (ϕ<sub>1</sub> ∧ ϕ<sub>2</sub> ≡ ¬(¬ϕ<sub>1</sub> ∨ ¬ϕ<sub>2</sub>)), the release operator can be defined in terms of the until operator (ϕ<sub>1</sub> R ϕ<sub>2</sub> ≡ ¬(¬ϕ<sub>1</sub> U ¬ϕ<sub>2</sub>)), and the triggered operator can be defined in terms of the since operator (ϕ<sub>1</sub> T ϕ<sub>2</sub> ≡ ¬(¬ϕ<sub>1</sub> S ¬ϕ<sub>2</sub>)). Nevertheless, we consider all these connectives and operators as primitive in order to be able to put any formula in negated normal form (NNF), i.e., a form where negations are only applied to proposition letters. Note that the syntax includes both a tomorrow (Xϕ) and a weak tomorrow (X̃ϕ) operator, as well as a yesterday (Yϕ) and a weak yesterday (Zϕ) operator, for the same reason. Moreover, standard shortcut operators are available, such as the eventually (Fϕ ≡ ⊤ U ϕ) and always (Gϕ ≡ ¬F¬ϕ) future operators, and the once (Oϕ ≡ ⊤ S ϕ) and historically (Hϕ ≡ ¬O¬ϕ) past operators.

LTL+P is interpreted over state sequences, which are finite or infinite words over 2<sup>Σ</sup>. Given a state sequence σ ∈ (2<sup>Σ</sup>)<sup>+</sup> or σ ∈ (2<sup>Σ</sup>)<sup>ω</sup>, the satisfaction of a formula ϕ by σ at a time point i ≥ 0, denoted as σ, i ⊨ ϕ, is defined as follows:


We say that a state sequence σ satisfies ϕ, written σ ⊨ ϕ, if σ, 0 ⊨ ϕ. Note that, when interpreted over an infinite word, the tomorrow and weak tomorrow operators have the same semantics. The language of ϕ, denoted as ℒ(ϕ), is the set of words σ ∈ (2<sup>Σ</sup>)<sup>ω</sup> such that σ ⊨ ϕ. The language of finite words of ϕ, denoted as ℒ<sup><ω</sup>(ϕ), is the set of finite words σ ∈ (2<sup>Σ</sup>)<sup>+</sup> such that σ ⊨ ϕ. Given a logic L (e.g., LTL), we denote as ⟦L⟧ the set of languages ℒ(ϕ) with ϕ ∈ L, and as ⟦L⟧<sup><ω</sup> the set of languages of finite words ℒ<sup><ω</sup>(ϕ) with ϕ ∈ L. Note that ⟦LTL⟧<sup><ω</sup> is usually called LTLf in the literature [6].
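To make the difference between the tomorrow and weak tomorrow operators on finite words concrete, here is a minimal Python sketch of a satisfaction check for the pure-future operators. It assumes the usual finite-trace reading of each operator; the tuple encoding of formulas and the names `holds`, `"WX"` and `"ap"` are hypothetical choices made for this illustration only.

```python
from typing import FrozenSet, Sequence

# Hypothetical encoding: formulas are nested tuples such as
#   ("true",), ("false",), ("ap", "p"), ("not", f), ("or", f, g), ("and", f, g),
#   ("X", f) tomorrow, ("WX", f) weak tomorrow, ("U", f, g) until, ("R", f, g) release.
Trace = Sequence[FrozenSet[str]]


def holds(trace: Trace, i: int, f: tuple) -> bool:
    """Return True iff position i of the finite trace satisfies f."""
    op, n = f[0], len(trace)
    if op == "true":
        return True
    if op == "false":
        return False
    if op == "ap":
        return f[1] in trace[i]
    if op == "not":
        return not holds(trace, i, f[1])
    if op == "or":
        return holds(trace, i, f[1]) or holds(trace, i, f[2])
    if op == "and":
        return holds(trace, i, f[1]) and holds(trace, i, f[2])
    if op == "X":   # tomorrow: a next position must exist
        return i + 1 < n and holds(trace, i + 1, f[1])
    if op == "WX":  # weak tomorrow: vacuously true at the last position
        return i + 1 >= n or holds(trace, i + 1, f[1])
    if op == "U":   # f[2] eventually holds, with f[1] holding strictly before
        return any(
            holds(trace, j, f[2]) and all(holds(trace, k, f[1]) for k in range(i, j))
            for j in range(i, n)
        )
    if op == "R":   # f[2] holds until (and including) a position where f[1] holds
        for j in range(i, n):
            if not holds(trace, j, f[2]):
                return False
            if holds(trace, j, f[1]):
                return True
        return True
    raise ValueError(f"unknown operator: {op}")


trace = [frozenset({"p"}), frozenset({"p"}), frozenset({"q"})]
at_last_position = ("WX", ("false",))
last_flags = [holds(trace, i, at_last_position) for i in range(len(trace))]
```

On this three-state trace, `last_flags` marks only the final position, since the weak tomorrow of a false formula can hold only where no successor exists.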

We now define the two fragments of LTL that are the subject of this paper.

Definition 3 (Safety-LTL and coSafety-LTL [17]). The logic Safety-LTL (resp. coSafety-LTL) is the fragment of LTL where, for formulas in negated normal form, only the tomorrow, weak tomorrow and release (resp. until) temporal operators are allowed.

We also define the logic coSafety-LTL(−X̃) as the logic coSafety-LTL devoid of the weak tomorrow operator (this logic will play a central role in our proofs).

In the next section we present two fragments of the first-order theory of one successor [2, 3], namely S1S[FO], or simply FO in the following. Given an alphabet Σ, FO is a first-order language with equality over the signature ⟨<, {P}<sub>p∈Σ</sub>⟩, and is interpreted over structures M = ⟨D<sup>M</sup>, <<sup>M</sup>, {P<sup>M</sup>}<sub>p∈Σ</sub>⟩ where D<sup>M</sup>, for our goals, is either the set ℕ of natural numbers or a prefix {0, . . . , n} thereof, and <<sup>M</sup> is the usual ordering relation between natural numbers. Given an FO formula ϕ(x<sub>0</sub>, . . . , x<sub>m</sub>) with m + 1 free variables, the satisfaction of ϕ by a first-order structure M when x<sub>0</sub> = n<sub>0</sub>, . . . , x<sub>m</sub> = n<sub>m</sub>, denoted as M, n<sub>0</sub>, . . . , n<sub>m</sub> ⊨ ϕ(x<sub>0</sub>, . . . , x<sub>m</sub>), is defined following the standard first-order semantics. State sequences over Σ map naturally into such structures. Given a word σ ∈ (2<sup>Σ</sup>)<sup>∗</sup> or σ ∈ (2<sup>Σ</sup>)<sup>ω</sup>, we denote as (σ)<sup>s</sup> the corresponding first-order structure. Given a formula ϕ(x) with exactly one free variable, the language of ϕ, denoted as ℒ(ϕ), is the set of words σ ∈ (2<sup>Σ</sup>)<sup>ω</sup> such that (σ)<sup>s</sup>, 0 ⊨ ϕ. Similarly, the language of finite words of ϕ, denoted as ℒ<sup><ω</sup>(ϕ), is the set of finite words σ ∈ (2<sup>Σ</sup>)<sup>+</sup> such that (σ)<sup>s</sup> ⊨ ϕ. We denote as ⟦FO⟧ and ⟦FO⟧<sup><ω</sup> the set of languages of infinite and finite words, respectively, definable by an FO formula.

Given a class of languages of finite words ⟦L⟧<sup><ω</sup>, we denote as ⟦L⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> the set of languages {L · (2<sup>Σ</sup>)<sup>ω</sup> | L ∈ ⟦L⟧<sup><ω</sup>}. We recall now some known results.

Proposition 1 (Kamp [10] and Gabbay [8]). ⟦LTL⟧ = ⟦FO⟧ and ⟦LTL⟧<sup><ω</sup> = ⟦FO⟧<sup><ω</sup>.

Finally, we state a normal form for LTL-definable safety/co-safety languages.

Proposition 2 (Chang et al. [5], Thomas [19]). A language L ∈ ⟦LTL⟧ is safety (resp. co-safety) if and only if it is the language of a formula of the form Gα (resp. Fα), where α ∈ LTL<sub>P</sub>.

### 3 Safety-FO and coSafety-FO

In this section we introduce the core contribution of the paper, i.e., two fragments of FO that precisely capture Safety-LTL and coSafety-LTL, respectively, and we prove this relationship. A summary of the results provided by the paper is given in Fig. 1.

Definition 4 (Safety-FO). The logic Safety-FO is generated by the following grammar:

$$\begin{aligned} \text{atomic} &:= x < y \mid x = y \mid x \neq y \mid P(x) \mid \neg P(x) \\ \phi &:= \text{atomic} \mid \phi\_1 \lor \phi\_2 \mid \phi\_1 \land \phi\_2 \mid \exists y (x < y < z \land \phi\_1) \mid \forall y (x < y \to \phi\_1) \end{aligned}$$

where x, y, and z are first-order variables, P is a unary predicate, and ϕ<sub>1</sub> and ϕ<sub>2</sub> are Safety-FO formulas.

Fig. 1. Summary of the results of the paper, about languages over infinite words on the left, and over finite words on the right. Solid arrows are own results. Dashed arrows are known from literature.

Definition 5 (coSafety-FO). The logic coSafety-FO is generated by the following grammar:

$$\begin{aligned} \textit{atomic} &:= x < y \mid x = y \mid x \neq y \mid P(x) \mid \neg P(x) \\ \phi &:= \textit{atomic} \mid \phi\_1 \lor \phi\_2 \mid \phi\_1 \land \phi\_2 \mid \exists y (x < y \land \phi\_1) \mid \forall y (x < y < z \to \phi\_1) \end{aligned}$$

where x, y, and z are first-order variables, P is a unary predicate, and ϕ<sub>1</sub> and ϕ<sub>2</sub> are coSafety-FO formulas.

We need to make a few observations on the syntax of the two fragments. First of all, note how any formula of Safety-FO is the negation of a formula of coSafety-FO and vice versa. Then, note that the two fragments are defined in negated normal form, i.e., negation only appears on atomic formulas. The particular kind of existential and universal quantification allowed is the distinguishing feature of these fragments. In particular, Safety-FO restricts any existentially quantified variable to be bounded between two already quantified variables. The same applies to universal quantification in coSafety-FO. Moreover, Safety-FO and coSafety-FO formulas are future formulas, i.e., the quantifiers can only range over values greater than already quantified variables. These two features are essential to precisely capture Safety-LTL and coSafety-LTL. Finally, note that the comparisons in the guards of the quantifiers are strict, but non-strict comparisons can be used as well. In particular, ∃y(x ≤ y ∧ ϕ) can be rewritten as ϕ[y/x] ∨ ∃y(x < y ∧ ϕ), where ϕ[y/x] is the formula obtained by replacing all occurrences of y with x. Similarly, ∀z(x ≤ z ≤ y → ϕ) can be rewritten as ϕ[z/x] ∧ ϕ[z/y] ∧ ∀z(x < z < y → ϕ).
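The two guard rewritings can be sanity-checked by brute force on a small domain. The Python sketch below is an illustration under simplifying assumptions: ϕ is modelled as a unary property of the quantified variable (the set `holds_at` of positions where it is true), the domain is the prefix {0, ..., 4}, and all function names are hypothetical. The universal rewriting is compared only for x ≤ y; for x > y the non-strict guard is vacuous, so the two sides may differ.

```python
from itertools import combinations

DOMAIN = range(5)  # a finite prefix {0, ..., 4} of the naturals


def subsets(domain):
    """Yield every subset of the domain as a frozenset."""
    items = list(domain)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def exists_nonstrict(holds_at, x):
    # ∃y (x ≤ y ∧ φ(y))
    return any(x <= y and y in holds_at for y in DOMAIN)


def exists_strict_rewrite(holds_at, x):
    # φ[y/x] ∨ ∃y (x < y ∧ φ(y))
    return x in holds_at or any(x < y and y in holds_at for y in DOMAIN)


def forall_nonstrict(holds_at, x, y):
    # ∀z (x ≤ z ≤ y → φ(z))
    return all(z in holds_at for z in DOMAIN if x <= z <= y)


def forall_strict_rewrite(holds_at, x, y):
    # φ[z/x] ∧ φ[z/y] ∧ ∀z (x < z < y → φ(z))
    return (x in holds_at and y in holds_at
            and all(z in holds_at for z in DOMAIN if x < z < y))


def rewritings_agree():
    """Compare both rewritings on every property and every x (and y ≥ x)."""
    for holds_at in subsets(DOMAIN):
        for x in DOMAIN:
            if exists_nonstrict(holds_at, x) != exists_strict_rewrite(holds_at, x):
                return False
            for y in DOMAIN:
                if x <= y and (forall_nonstrict(holds_at, x, y)
                               != forall_strict_rewrite(holds_at, x, y)):
                    return False
    return True
```

Calling `rewritings_agree()` exhausts all 32 properties over the domain, which is enough to catch an off-by-one in either guard.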

To prove the relationship between Safety-LTL, coSafety-LTL, and these fragments, we focus now on coSafety-FO. By duality, all the results transfer to Safety-FO. We focus on coSafety-FO because the unbounded quantification is existential, and it is easier to reason about the existence of prefixes than on all the prefixes at once. We start by observing that, since the weak tomorrow operator, over infinite words, coincides with the tomorrow operator, the following holds.

Observation 1. ⟦coSafety-LTL⟧ = ⟦coSafety-LTL(−X̃)⟧

When reasoning over finite words, the weak tomorrow operator plays a crucial role, since it can be used to recognize when we are at the last position of a word. In fact, σ, i ⊨ X̃⊥ holds if and only if i = |σ| − 1, for any σ ∈ (2<sup>Σ</sup>)<sup>∗</sup>.

Now, let us note that, thanks to the absence of the weak tomorrow operator, we can in some sense reduce ourselves to reasoning over finite words.

Lemma 1. ⟦coSafety-LTL(−X̃)⟧ = ⟦coSafety-LTL(−X̃)⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup>

Proof. We have to prove that, for each formula ϕ ∈ coSafety-LTL(−X̃), it holds that:

$$\mathcal{L}(\phi) = \mathcal{L}^{<\omega}(\phi) \cdot (2^{\Sigma})^{\omega}$$

We proceed by induction on the structure of ϕ. For the base case, consider ϕ ≡ p ∈ Σ. The case for ϕ ≡ ¬p is similar. Let σ ∈ ℒ(p). It holds that σ<sub>0</sub> ⊨ p and σ<sub>0</sub> · σ′ ⊨ p, for all σ′ ∈ (2<sup>Σ</sup>)<sup>ω</sup>, and in particular for σ′ = σ<sub>[1,∞]</sub>. This is equivalent to saying that σ ∈ ℒ<sup><ω</sup>(ϕ) · (2<sup>Σ</sup>)<sup>ω</sup>. For the inductive step:


The same property applies to coSafety-FO as well.

Lemma 2. ⟦coSafety-FO⟧ = ⟦coSafety-FO⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup>

Proof. We have to prove that, for each formula ψ ∈ coSafety-FO with one free variable, it holds that ℒ(ψ) = ℒ<sup><ω</sup>(ψ) · (2<sup>Σ</sup>)<sup>ω</sup>. We proceed by induction, but with a more general statement. Let ϕ(x<sub>1</sub>, . . . , x<sub>k</sub>) have k free variables. We prove by induction on ϕ that for any infinite state sequence σ such that (σ)<sup>s</sup>, n<sub>1</sub>, . . . , n<sub>k</sub> ⊨ ϕ(x<sub>1</sub>, . . . , x<sub>k</sub>), there exists a prefix σ<sub>[0,i]</sub> of σ such that for all σ′ ∈ (2<sup>Σ</sup>)<sup>ω</sup>, (σ<sub>[0,i]</sub>σ′)<sup>s</sup>, n<sub>1</sub>, . . . , n<sub>k</sub> ⊨ ϕ(x<sub>1</sub>, . . . , x<sub>k</sub>). The base case considers the four kinds of atomic formulas. If (σ)<sup>s</sup>, n<sub>1</sub>, n<sub>2</sub> ⊨ x<sub>1</sub> < x<sub>2</sub>, then n<sub>1</sub> < n<sub>2</sub> and we know that (σ<sub>[0,n<sub>2</sub>]</sub>σ′)<sup>s</sup>, n<sub>1</sub>, n<sub>2</sub> ⊨ x<sub>1</sub> < x<sub>2</sub> for all σ′ ∈ (2<sup>Σ</sup>)<sup>∗</sup>. The case of x<sub>1</sub> = x<sub>2</sub> is similar. Now, if (σ)<sup>s</sup>, n<sub>1</sub> ⊨ P(x<sub>1</sub>), then p ∈ σ<sub>n<sub>1</sub></sub> and we know that (σ<sub>[0,n<sub>1</sub>]</sub>σ′)<sup>s</sup>, n<sub>1</sub> ⊨ P(x<sub>1</sub>) for all σ′ ∈ (2<sup>Σ</sup>)<sup>∗</sup>. The case for ¬P(x<sub>1</sub>) is similar. For the inductive step:


$$(\sigma\_{[0,n\_\*]} \sigma')^s, n\_1, \dots, n\_k \models \forall x\_{k+1} (x\_u < x\_{k+1} < x\_v \to \phi\_1(x\_1, \dots, x\_{k+1}))$$

Now, let ψ(x) be a coSafety-FO formula with exactly one free variable x. Thanks to the above induction we can conclude that each infinite state sequence σ such that (σ)<sup>s</sup>, 0 ⊨ ψ(x) is of the form σ<sub>[0,i]</sub> · σ′, where (σ<sub>[0,i]</sub>)<sup>s</sup> ⊨ ψ(x), and this implies that ℒ(ψ) = ℒ<sup><ω</sup>(ψ) · (2<sup>Σ</sup>)<sup>ω</sup>.

It is worth noting that Lemmas 1 and 2 show that coSafety-LTL(−X̃) and coSafety-FO are insensitive to infiniteness as defined by De Giacomo et al. [9].

Then, we can focus on coSafety-LTL(−X̃) and coSafety-FO on finite words. If we can prove that ⟦coSafety-LTL(−X̃)⟧<sup><ω</sup> = ⟦coSafety-FO⟧<sup><ω</sup>, we are done. First, we show how to encode coSafety-LTL(−X̃) formulas into coSafety-FO.

Lemma 3. ⟦coSafety-LTL(−X̃)⟧<sup><ω</sup> ⊆ ⟦coSafety-FO⟧<sup><ω</sup>

Proof. Let L ∈ ⟦coSafety-LTL(−X̃)⟧<sup><ω</sup>, and let ϕ ∈ coSafety-LTL(−X̃) such that L = ℒ<sup><ω</sup>(ϕ). By following the semantics of the operators in ϕ, we can obtain an equivalent coSafety-FO formula ϕ<sub>FO</sub>. We inductively define the formula FO(ϕ, x), where x is a variable, as follows:


For each ϕ ∈ coSafety-LTL(−X̃), the formula FO(ϕ, x) has exactly one free variable x. It is easy to see that for all finite state sequences σ ∈ (2<sup>Σ</sup>)<sup>∗</sup>, it holds that σ ⊨ ϕ if and only if (σ)<sup>s</sup>, 0 ⊨ FO(ϕ, x), and FO(ϕ, x) ∈ coSafety-FO. Therefore, L ∈ ⟦coSafety-FO⟧<sup><ω</sup>.
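A translation of this kind can be sketched in Python as a function that produces the first-order formula as a string. The clauses below simply follow the usual semantics of each operator and use only strict guards, so that every output fits the coSafety-FO grammar; they are an assumption of this sketch and not necessarily the clauses used by the authors. The tuple encoding, the `v0, v1, ...` variable names, and the use of `z < z` as a false atom are all hypothetical choices.

```python
from itertools import count


def to_fo(phi, x, fresh=None):
    """Translate a formula built from literals, or/and, X and U into an FO string.

    Hypothetical encoding: ("ap", p), ("nap", p) for a negated letter,
    ("or", f, g), ("and", f, g), ("X", f), ("U", f, g).
    """
    fresh = fresh if fresh is not None else count()
    op = phi[0]
    if op == "ap":
        return f"P_{phi[1]}({x})"
    if op == "nap":
        return f"¬P_{phi[1]}({x})"
    if op in ("or", "and"):
        symbol = "∨" if op == "or" else "∧"
        left = to_fo(phi[1], x, fresh)
        right = to_fo(phi[2], x, fresh)
        return f"({left} {symbol} {right})"
    if op == "X":
        # y is the immediate successor of x: nothing lies strictly in between.
        y, z = f"v{next(fresh)}", f"v{next(fresh)}"
        body = to_fo(phi[1], y, fresh)
        return f"∃{y}({x} < {y} ∧ ∀{z}({x} < {z} < {y} → {z} < {z}) ∧ {body})"
    if op == "U":
        # Either the goal holds now, or it holds at a later y and the
        # left operand holds at x and everywhere strictly between x and y.
        y, z = f"v{next(fresh)}", f"v{next(fresh)}"
        now = to_fo(phi[2], x, fresh)
        later = to_fo(phi[2], y, fresh)
        hold_x = to_fo(phi[1], x, fresh)
        hold_z = to_fo(phi[1], z, fresh)
        return (f"({now} ∨ ∃{y}({x} < {y} ∧ {later} ∧ {hold_x} ∧ "
                f"∀{z}({x} < {z} < {y} → {hold_z})))")
    raise ValueError(f"unsupported operator: {op}")
```

For instance, `to_fo(("U", ("ap", "p"), ("ap", "q")), "x")` yields a disjunction whose second disjunct has an unbounded existential quantifier and a bounded universal one, as the grammar of Definition 5 requires.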

It is time to show the opposite direction, i.e., that any coSafety-FO formula can be translated into a coSafety-LTL(−X̃) formula which is equivalent over finite words. To prove this fact we adapt a proof of Kamp's theorem by Rabinovich [15]. Kamp's theorem is one of the fundamental results about temporal logics, which states that LTL corresponds to FO in terms of expressiveness. Here, we prove a similar result in the context of co-safety languages. The proof goes by introducing a normal form for FO formulas, and showing that (i) any coSafety-FO formula can be translated into such normal form and (ii) any formula in normal form can be straightforwardly translated into a coSafety-LTL(−X̃) formula. We start by introducing such a normal form.

Definition 6 (∃∀-formulas). An ∃∀-formula ϕ(z<sub>0</sub>, . . . , z<sub>m</sub>) with m + 1 free variables is a formula of this form:

$$\begin{aligned} \phi(z\_0, \dots, z\_m) := \exists x\_0 \dots \exists x\_n \big( & x\_0 < x\_1 < \dots < x\_n && \text{ordering constraints} \\ &\land z\_0 = x\_0 \land \bigwedge\_{k=1}^m (z\_k = x\_{i\_k}) && \text{binding constraints} \\ &\land \bigwedge\_{j=0}^n \alpha\_j(x\_j) && \text{punctual constraints} \\ &\land \bigwedge\_{j=1}^n \forall y (x\_{j-1} < y < x\_j \to \beta\_j(y)) \big) && \text{interval constraints} \end{aligned}$$

where i<sub>k</sub> ∈ {0, . . . , n} for each 1 ≤ k ≤ m, and α<sub>j</sub>, for each 0 ≤ j ≤ n, and β<sub>j</sub>, for each 1 ≤ j ≤ n, are quantifier-free formulas with exactly one free variable.

Some explanations are due. Each ∃∀-formula states a number of requirements for its free variables and for its quantified variables. Through the binding constraints, the free variables are identified with a subset of the quantified variables in order to uniformly state the punctual and interval constraints, and the ordering constraints, which sort all the variables in a total order. Note that there is no relationship between n and m: there might be more quantified variables than free variables, or fewer. Note as well that the binding constraint z<sub>0</sub> = x<sub>0</sub> is always present, i.e., at least one free variable has to be the minimal element of the ordering. This ensures that ∃∀-formulas are always future formulas.

We say that a formula of coSafety-FO is in normal form if and only if it is a disjunction of ∃∀-formulas. To see how formulas in normal form make sense, let us immediately show how to translate them into coSafety-LTL(−X̃) formulas.

Lemma 4. For any formula ϕ(z) ∈ coSafety-FO in normal form, with a single free variable, there exists a formula ψ ∈ coSafety-LTL(−X̃) such that ℒ<sup><ω</sup>(ϕ(z)) = ℒ<sup><ω</sup>(ψ).

Proof. We show how any ∃∀-formula is equivalent to a coSafety-LTL(−X̃) formula, over finite words. Since each formula in normal form is a disjunction of ∃∀-formulas, and since coSafety-LTL(−X̃) is closed under disjunction, this implies the claim. Let ϕ(z) be an ∃∀-formula with a single free variable. Having only one free variable, ϕ(z) is of the form:

$$\begin{aligned} \exists x\_0 \dots \exists x\_n \big( & x\_0 < \dots < x\_n \land z = x\_0 \\ & \land \bigwedge\_{j=0}^n \alpha\_j(x\_j) \land \bigwedge\_{j=1}^n \forall y (x\_{j-1} < y < x\_j \to \beta\_j(y)) \big) \end{aligned}$$

Now, let A<sub>i</sub> be the temporal formulas corresponding to α<sub>i</sub> and B<sub>i</sub> be the ones corresponding to β<sub>i</sub>. Recall that α<sub>i</sub> and β<sub>i</sub> are quantifier-free with only one free variable, hence this correspondence is trivial. Since z is the first time point of the ordering mandated by the formula, we only need future temporal operators to encode ϕ into a coSafety-LTL(−X̃) formula ψ defined as follows:

$$\psi \equiv A\_0 \land \mathsf{X}(B\_1 \mathbin{\mathcal{U}} (A\_1 \land \mathsf{X}(B\_2 \mathbin{\mathcal{U}} (A\_2 \land \dots \mathsf{X}(B\_n \mathbin{\mathcal{U}} A\_n) \dots ))))$$

It can be seen that σ, k ⊨ ψ if and only if (σ)<sup>s</sup>, k ⊨ ϕ(z), for each σ ∈ (2<sup>Σ</sup>)<sup>+</sup> and each k ≥ 0. Thus, ℒ<sup><ω</sup>(ϕ(z)) = ℒ<sup><ω</sup>(ψ).
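The nesting of ψ is easiest to see when it is built from the innermost point outwards. The following Python sketch does exactly that on plain strings; the function name and the string rendering are hypothetical, `alphas` lists the formulas for the points x<sub>0</sub>, ..., x<sub>n</sub>, and `betas` lists the formulas for the n open intervals between consecutive points.

```python
from typing import List


def normal_form_to_ltl(alphas: List[str], betas: List[str]) -> str:
    """Build the nested tomorrow/until formula for one ∃∀-formula with one free variable.

    alphas[j] is the temporal formula for point x_j (j = 0..n);
    betas[j - 1] is the temporal formula for the interval (x_{j-1}, x_j) (j = 1..n).
    """
    if len(alphas) != len(betas) + 1:
        raise ValueError("expected one more point constraint than interval constraints")
    # Start from the last point x_n and wrap one "X(... U ...)" layer per interval.
    formula = alphas[-1]
    for point, interval in zip(reversed(alphas[:-1]), reversed(betas)):
        formula = f"{point} ∧ X({interval} U ({formula}))"
    return formula
```

With a single point the result is just the point constraint, and each additional point adds one tomorrow and one until, so the temporal depth grows linearly with n.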

Two differences between our ∃∀-formulas and those used by Rabinovich [15] are crucial: first, we do not have unbounded universal requirements, but all interval constraints use bounded quantifications, hence we do not need the always operator to encode them; second, our ∃∀-formulas are future formulas, hence we only need future operators to encode them.

We now show that any coSafety-FO formula can be translated into normal form, that is, into a disjunction of ∃∀-formulas.

Lemma 5. Any coSafety-FO formula is equivalent to a disjunction of ∃∀-formulas.

Proof. Let ϕ be a coSafety-FO formula. We proceed by structural induction on ϕ. For the base case, for each atomic formula ϕ(z0, z1) we provide an equivalent ∃∀-formula ψ(z0, z1):

1. if ϕ ≡ z<sub>0</sub> < z<sub>1</sub>, then ψ ≡ ∃x<sub>0</sub>∃x<sub>1</sub>(z<sub>0</sub> = x<sub>0</sub> ∧ z<sub>1</sub> = x<sub>1</sub> ∧ x<sub>0</sub> < x<sub>1</sub>);
2. if ϕ ≡ z<sub>0</sub> = z<sub>1</sub>, then ψ ≡ ∃x<sub>0</sub>(z<sub>0</sub> = x<sub>0</sub> ∧ z<sub>1</sub> = x<sub>0</sub>).


For the inductive step:


$$\begin{aligned} \psi\_1 &\equiv \exists x\_0 \dots \exists x\_n \left( x\_0 < \dots < x\_n \land z\_0 = x\_0 \land \dots \right) \\ \psi\_2 &\equiv \exists x\_{n+1} \dots \exists x\_m (x\_{n+1} < \dots < x\_m \land z\_0 = x\_{n+1} \land \dots ) \end{aligned}$$

Since the set of quantified variables in ψ<sub>1</sub> is disjoint from the set of quantified variables in ψ<sub>2</sub>, we can distribute the existential quantifiers over the conjunction ψ<sub>1</sub> ∧ ψ<sub>2</sub>, obtaining:

$$\begin{aligned} \psi\_1 \wedge \psi\_2 &\equiv \exists x\_0 \dots \exists x\_n \exists x\_{n+1} \dots \exists x\_m \\ &\quad \left(x\_0 < \dots < x\_n \land x\_{n+1} < \dots < x\_m \land z\_0 = x\_0 \land z\_0 = x\_{n+1} \land \dots \right) \end{aligned}$$

Note that we can identify x<sub>0</sub> and x<sub>n+1</sub>, since both are required to be equal to z<sub>0</sub>, obtaining:

$$\begin{aligned} \psi\_1 \wedge \psi\_2 &\equiv \exists x\_0 \dots \exists x\_n \exists x\_{n+2} \dots \exists x\_m \\ & \quad \Big(x\_0 < \dots < x\_n \land x\_0 < x\_{n+2} < \dots < x\_m \land \\ & \quad z\_0 = x\_0 \land \bigwedge\_{i=1}^k (z\_i = x\_{j\_i^{\prime\prime}}) \land \bigwedge\_{i=0, i \neq n+1}^m \alpha\_i(x\_i) \land \\ & \quad \bigwedge\_{\begin{subarray}{c} i=1, i \neq n+1 \\ i \neq n+2 \end{subarray}}^{m} \forall y (x\_{i-1} < y < x\_i \to \beta\_i(y)) \land \forall y (x\_0 < y < x\_{n+2} \to \beta\_{n+2}(y))\Big) \end{aligned}$$

Now, to turn this formula into a disjunction of ∃∀-formulas, we consider all the possible interleavings of the variables that respect the two imposed orderings, and expand the formula into a disjunction with one disjunct for each such interleaving. Let X = {x<sub>0</sub>, …, x<sub>n</sub>, x<sub>n+2</sub>, …, x<sub>m</sub>} and let Π be the set of all the permutations of X compatible with the orderings x<sub>0</sub> < ⋯ < x<sub>n</sub> and x<sub>0</sub> < x<sub>n+2</sub> < ⋯ < x<sub>m</sub>. For any π ∈ Π, π(0) = 0. Now, ψ<sub>1</sub> ∧ ψ<sub>2</sub> becomes the disjunction of a set of ∃∀-formulas ψ<sub>π</sub>, one for each π ∈ Π, defined as:

$$\begin{aligned} \psi\_{\pi} &\equiv \exists x\_{\pi(0)} \dots \exists x\_{\pi(m)} \\ &\quad \Big(x\_{\pi(0)} < \dots < x\_{\pi(m)} \land \\ &\quad z\_0 = x\_0 \land \bigwedge\_{i=1}^k (z\_i = x\_{\pi(j\_i^{\prime\prime})}) \land \bigwedge\_{i=0}^m \alpha\_i(x\_i) \land \\ &\quad \bigwedge\_{i=1}^m \forall y(x\_{\pi(i-1)} < y < x\_{\pi(i)} \to \beta\_i^\*(y))\Big) \end{aligned}$$

where β<sup>∗</sup><sub>i</sub> suitably combines the formulas β according to the interleaving of the orderings of the original variables, and is defined as follows:

$$\beta\_i^\* = \begin{cases} \beta\_{\pi(i)} & \text{if both } \pi(i), \pi(i-1) \le n \text{ or both } \pi(i), \pi(i-1) > n\\ \beta\_{\pi(i)} \land \beta\_{\pi(i-1)} & \text{if } \pi(i) \le n \text{ and } \pi(i-1) > n \text{ or } vice \text{ } versa \end{cases}$$

Then we have that ψ<sub>1</sub> ∧ ψ<sub>2</sub> ≡ ⋁<sub>π∈Π</sub> ψ<sub>π</sub>, which is a disjunction of ∃∀-formulas.

3. Let ϕ(z<sub>0</sub>, …, z<sub>m</sub>) ≡ ∃z<sub>m+1</sub> . (z<sub>i</sub> < z<sub>m+1</sub> ∧ ϕ<sub>1</sub>(z<sub>0</sub>, …, z<sub>m</sub>, z<sub>m+1</sub>)), for some 0 ≤ i ≤ m. By the inductive hypothesis, this is equivalent to the formula ∃z<sub>m+1</sub>(z<sub>i</sub> < z<sub>m+1</sub> ∧ ⋁<sup>j</sup><sub>k=0</sub> ψ<sub>k</sub>(z<sub>0</sub>, …, z<sub>m</sub>, z<sub>m+1</sub>)), where ψ<sub>k</sub>(z<sub>0</sub>, …, z<sub>m</sub>, z<sub>m+1</sub>) is an ∃∀-formula, for each 0 ≤ k ≤ j, that is:
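To make the set Π of interleavings concrete, here is a minimal Python sketch that enumerates all merges of two chains sharing their first element. The function name `interleavings`, the string labels, and the choice n = 2, m = 5 are illustrative assumptions, not notation from the proof; like the construction above, the sketch only lists strict interleavings.

```python
from math import comb

def interleavings(left, right):
    """Yield every merge of `left` and `right` that keeps each list's internal order."""
    if not left:
        yield list(right)
        return
    if not right:
        yield list(left)
        return
    # Either the next variable comes from the first chain ...
    for rest in interleavings(left[1:], right):
        yield [left[0]] + rest
    # ... or from the second chain.
    for rest in interleavings(left, right[1:]):
        yield [right[0]] + rest

# Hypothetical instance with n = 2 and m = 5: the chains are
# x0 < x1 < x2 and x0 < x4 < x5 (x3 has been identified with x0).
chain1 = ["x1", "x2"]
chain2 = ["x4", "x5"]
orders = [["x0"] + tail for tail in interleavings(chain1, chain2)]

for order in orders:
    print(" < ".join(order))

# The number of interleavings is a binomial coefficient.
assert len(orders) == comb(len(chain1) + len(chain2), len(chain1))
```

Each printed ordering corresponds to one disjunct ψ<sub>π</sub>; the number of disjuncts therefore grows as a binomial coefficient in the lengths of the two chains.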

$$\exists z\_{m+1} \, . \,(z\_i < z\_{m+1} \land \bigvee\_{k=0}^{j} (\exists x\_0 \dots \exists x\_{n\_k} \psi'\_k(z\_0, \dots, z\_{m+1}, x\_0, \dots, x\_{n\_k}))) $$

By distributing the conjunction over the disjunction, we obtain:

$$\exists z\_{m+1} . \left( \bigvee\_{k=0}^{j} \left( (z\_i < z\_{m+1}) \land \exists x\_0 \dots \exists x\_{n\_k} \psi'\_k(z\_0, \dots, z\_{m+1}, x\_0, \dots, x\_{n\_k}) \right) \right)$$

and by distributing the existential quantifier over the disjunction, we have:

$$\bigvee\_{k=0}^{j} (\exists z\_{m+1} ((z\_i < z\_{m+1}) \land \exists x\_0 \dots \exists x\_{n\_k} \psi'\_k(z\_0, \dots, z\_{m+1}, x\_0, \dots, x\_{n\_k})))$$

Since the subformula z<sub>i</sub> < z<sub>m+1</sub> does not contain the variables x<sub>0</sub>, …, x<sub>n<sub>k</sub></sub>, we can push it inside the existential quantification, obtaining:

$$\bigvee\_{k=0}^{j} (\exists z\_{m+1} \, . \, \exists x\_0 \dots \exists x\_{n\_k} \, . \, ((z\_i < z\_{m+1}) \wedge \psi'\_k(z\_0, \dots, z\_{m+1}, x\_0, \dots, x\_{n\_k})))$$

We now distinguish three cases:

(a) Suppose that the formula ψ′<sub>k</sub>(z<sub>0</sub>, …, z<sub>m+1</sub>, x<sub>0</sub>, …, x<sub>n<sub>k</sub></sub>) contains the conjuncts z<sub>i</sub> = x<sub>l<sub>i</sub></sub> and z<sub>m+1</sub> = x<sub>l<sub>m+1</sub></sub>, with l<sub>i</sub> = l<sub>m+1</sub>. These conjuncts contradict the formula z<sub>i</sub> < z<sub>m+1</sub>, that is:

$$(z\_i < z\_{m+1}) \land (z\_i = x\_{l\_i}) \land (z\_{m+1} = x\_{l\_{m+1}}) \equiv \bot$$

Therefore, the disjunct (z<sub>i</sub> < z<sub>m+1</sub>) ∧ ψ′<sub>k</sub>(z<sub>0</sub>, …, z<sub>m+1</sub>, x<sub>0</sub>, …, x<sub>n<sub>k</sub></sub>) is equivalent to ⊥, and thus can be safely removed from the disjunction.

(b) Suppose that the formula ψ′<sub>k</sub>(z<sub>0</sub>, …, z<sub>m+1</sub>, x<sub>0</sub>, …, x<sub>n<sub>k</sub></sub>) contains the conjuncts z<sub>i</sub> = x<sub>l<sub>i</sub></sub>, z<sub>m+1</sub> = x<sub>l<sub>m+1</sub></sub> (with l<sub>i</sub> ≠ l<sub>m+1</sub>), and x<sub>l<sub>m+1</sub></sub> < ⋯ < x<sub>l<sub>i</sub></sub>. As in the previous case, it holds that:

$$(z\_i < z\_{m+1}) \land (z\_i = x\_{l\_i}) \land (z\_{m+1} = x\_{l\_{m+1}}) \land (x\_{l\_{m+1}} < \dots < x\_{l\_i}) \equiv \bot$$

Thus, also in this case, this disjunct can be safely removed from the disjunction.

(c) Otherwise, the formula ψ′<sub>k</sub>(z<sub>0</sub>, …, z<sub>m+1</sub>, x<sub>0</sub>, …, x<sub>n<sub>k</sub></sub>) contains the conjuncts z<sub>i</sub> = x<sub>l<sub>i</sub></sub>, z<sub>m+1</sub> = x<sub>l<sub>m+1</sub></sub> (with l<sub>i</sub> ≠ l<sub>m+1</sub>), and x<sub>l<sub>i</sub></sub> < ⋯ < x<sub>l<sub>m+1</sub></sub>. Therefore, the conjunct z<sub>i</sub> < z<sub>m+1</sub> is redundant, and can be safely dropped from the disjunct. The resulting formula is an ∃∀-formula.

After the previous transformation, we obtain:

$$\bigvee\_{k=0}^{j'} (\exists z\_{m+1} \, . \, \exists x\_0 \dots \exists x\_{n\_k} \, . \, \psi\_k''(z\_0, \dots, z\_{m+1}, x\_0, \dots, x\_{n\_k})) $$

Finally, since each formula ψ″<sub>k</sub>(z<sub>0</sub>, …, z<sub>m+1</sub>, x<sub>0</sub>, …, x<sub>n<sub>k</sub></sub>) contains the conjunct z<sub>m+1</sub> = x<sub>l<sub>m+1</sub></sub>, we can safely remove the quantifier ∃z<sub>m+1</sub>, replacing z<sub>m+1</sub> with x<sub>l<sub>m+1</sub></sub>. We obtain the formula:

$$\bigvee\_{k=0}^{j'} (\exists x\_0 \dots \exists x\_{n\_k} \, . \,\psi\_k^{\prime\prime} (z\_0, \dots, z\_m, x\_0, \dots, x\_{n\_k})) $$

which is a disjunction of ∃∀-formulas.

4. Let ϕ(z<sub>0</sub>, …, z<sub>m</sub>) ≡ ∀z<sub>m+1</sub>(z<sub>i</sub> < z<sub>m+1</sub> < z<sub>j</sub> → ϕ<sub>1</sub>(z<sub>0</sub>, …, z<sub>m</sub>, z<sub>m+1</sub>)), for some 0 ≤ i, j ≤ m. By the induction hypothesis we know that ϕ<sub>1</sub> is equivalent to a disjunction ⋁<sub>k</sub> ψ<sub>k</sub> where the ψ<sub>k</sub> are ∃∀-formulas, i.e., each ψ<sub>k</sub> is of the form:

$$\begin{aligned} \psi\_k \equiv \exists x\_0, \dots, x\_n \Big( & x\_0 < \dots < x\_n \land z\_0 = x\_0 \land \bigwedge\_{l=1}^{m+1} \left( z\_l = x\_{u\_l} \right) \land {} \\ & \bigwedge\_{l=0}^n \alpha\_l(x\_l) \land \bigwedge\_{l=1}^n \forall y (x\_{l-1} < y < x\_l \to \beta\_l(y)) \Big) \end{aligned}$$

We now note that we can suppose w.l.o.g. that the ordering constraint and the binding constraint of ψ<sub>k</sub> imply that z<sub>i</sub>, z<sub>m+1</sub> and z<sub>j</sub> are ordered consecutively, i.e., z<sub>i</sub> < z<sub>m+1</sub> < z<sub>j</sub> with no other variable in between. Otherwise, the constraints would be in conflict with the guard of the universal quantification and the disjunct could be removed from the disjunction. Take for example a disjunct ψ<sub>k</sub> with an ordering constraint of the type z<sub>i</sub> < z<sub>h</sub> < z<sub>m+1</sub>, for some h. The existence of such a z<sub>h</sub> is not guaranteed for each z<sub>m+1</sub> between z<sub>i</sub> and z<sub>j</sub>, because when z<sub>m+1</sub> = z<sub>i</sub> + 1 there is no value between z<sub>i</sub> and z<sub>i</sub> + 1 (we work over discrete-time models). That said, we can now isolate all the parts of ψ<sub>k</sub> that talk about z<sub>m+1</sub>, bringing them out of the existential quantification, obtaining ψ<sub>k</sub> ≡ θ<sub>k</sub> ∧ η<sub>k</sub>, where:

$$\begin{aligned} \theta\_k &\equiv z\_i < z\_{m+1} < z\_j \\ &\land \alpha(z\_{m+1}) \land \forall y (z\_i < y < z\_{m+1} \to \beta(y)) \land \forall y (z\_{m+1} < y < z\_j \to \beta'(y)) \end{aligned}$$

$$\begin{aligned} \eta\_k \equiv \exists x\_0, \dots, x\_n \, \Big(x\_0 < \dots < x\_n \land z\_0 = x\_0 \land \bigwedge\_{l=1}^m (z\_l = x\_{u\_l}) \land \\ \bigwedge\_{\substack{l=0 \\ l \neq u\_{m+1}}}^n \alpha\_l(x\_l) \land \bigwedge\_{\substack{l=1 \\ l \neq u\_j}}^n \forall y (x\_{l-1} < y < x\_l \to \beta\_l(y))\Big) \end{aligned}$$

Now, we have ϕ ≡ ∀z<sub>m+1</sub>(z<sub>i</sub> < z<sub>m+1</sub> < z<sub>j</sub> → ⋁<sub>k</sub>(θ<sub>k</sub> ∧ η<sub>k</sub>)). We can distribute the head of the implication over the disjunction, and then over the conjunction, obtaining:

$$\phi \equiv \forall z\_{m+1} \left( \bigvee\_{k} ((z\_i < z\_{m+1} < z\_j \to \theta\_k) \land (z\_i < z\_{m+1} < z\_j \to \eta\_k)) \right)$$

In order to simplify the exposition, we now show how to proceed in the case of two disjuncts, which is easily generalizable. So suppose we have:

$$\phi \equiv \forall z\_{m+1} \left( \begin{aligned} & \left( z\_i < z\_{m+1} < z\_j \rightarrow \theta\_1 \right) \land \left( z\_i < z\_{m+1} < z\_j \rightarrow \eta\_1 \right) \\ \lor {} & \left( z\_i < z\_{m+1} < z\_j \rightarrow \theta\_2 \right) \land \left( z\_i < z\_{m+1} < z\_j \rightarrow \eta\_2 \right) \end{aligned} \right)$$

Now we can a) distribute the disjunction over the conjunction (i.e., convert to conjunctive normal form in the case of multiple disjuncts), b) factor out the head of the implications, and c) distribute the universal quantification over the conjunction, obtaining:

$$\phi \equiv \begin{pmatrix} \forall z\_{m+1} (z\_i < z\_{m+1} < z\_j \to \theta\_1 \lor \theta\_2) \\ \land \forall z\_{m+1} (z\_i < z\_{m+1} < z\_j \to \theta\_1 \lor \eta\_2) \\ \land \forall z\_{m+1} (z\_i < z\_{m+1} < z\_j \to \eta\_1 \lor \theta\_2) \\ \land \forall z\_{m+1} (z\_i < z\_{m+1} < z\_j \to \eta\_1 \lor \eta\_2) \end{pmatrix}.$$

Now, note that η<sub>1</sub> and η<sub>2</sub> do not contain z<sub>m+1</sub> as a free variable, because we previously factored out all the parts mentioning z<sub>m+1</sub> into θ<sub>1</sub> and θ<sub>2</sub>. Therefore we can pull them out of the universal quantifications, obtaining:

$$\phi \equiv \begin{pmatrix} \forall z\_{m+1} (z\_i < z\_{m+1} < z\_j \to \theta\_1 \lor \theta\_2) \\ \land \forall z\_{m+1} (z\_i < z\_{m+1} < z\_j \to \theta\_1) \lor \eta\_2 \\ \land \forall z\_{m+1} (z\_i < z\_{m+1} < z\_j \to \theta\_2) \lor \eta\_1 \\ \land \neg \exists z\_{m+1} (z\_i < z\_{m+1} < z\_j) \lor \eta\_1 \lor \eta\_2 \end{pmatrix}$$

Now, note that ¬∃z<sub>m+1</sub>(z<sub>i</sub> < z<sub>m+1</sub> < z<sub>j</sub>) is equivalent to z<sub>i</sub> = z<sub>j</sub> ∨ z<sub>j</sub> = z<sub>i</sub> + 1, which is the disjunction of two formulas that can be turned into ∃∀-formulas. Since both η<sub>1</sub> and η<sub>2</sub> are already ∃∀-formulas and since we already know how to deal with conjunctions and disjunctions of ∃∀-formulas, it remains to show that the universal quantifications in the formula above can be turned into ∃∀-formulas. Take ∀z<sub>m+1</sub>(z<sub>i</sub> < z<sub>m+1</sub> < z<sub>j</sub> → θ<sub>1</sub>), i.e.:

$$\forall z\_{m+1} \left( z\_i < z\_{m+1} < z\_j \rightarrow \begin{matrix} z\_i < z\_{m+1} < z\_j \\ \land \alpha(z\_{m+1}) \\ \land \forall y (z\_i < y < z\_{m+1} \rightarrow \beta(y)) \\ \land \forall y (z\_{m+1} < y < z\_j \rightarrow \beta'(y)) \end{matrix} \right)$$

Note that the first conjunct of the consequent can be removed, since it is redundant. Now, this formula requests β(y) for all y between z<sub>i</sub> and z<sub>m+1</sub>, with z<sub>m+1</sub> ranging between z<sub>i</sub> and z<sub>j</sub> − 1, hence effectively requesting β(y) to hold between z<sub>i</sub> and z<sub>j</sub>. Similarly for β′(y), which has to hold for all y between z<sub>i</sub> + 1 and z<sub>j</sub>. Hence, it is equivalent to:

$$\begin{aligned} & z\_i = z\_j\\ & \lor z\_j = z\_i + 1\\ & \lor \exists x\_{i+1} (z\_i < x\_{i+1} \land x\_{i+1} = z\_i + 1 \land z\_j = x\_{i+1} + 1 \land \alpha(x\_{i+1}))\\ & \lor \exists x\_i \exists x\_{i+1} \exists x\_{j-1} \exists x\_j \begin{pmatrix} x\_i < x\_{i+1} < x\_{j-1} < x\_j\\ \land z\_i = x\_i \land z\_j = x\_j\\ \land \alpha(x\_{i+1}) \land \alpha(x\_{j-1})\\ \land \forall y (x\_i < y < x\_{i+1} \to \bot)\\ \land \forall y (x\_{j-1} < y < x\_j \to \bot)\\ \land \forall y (x\_i < y < x\_{j-1} \to \alpha(y) \land \beta(y))\\ \land \forall y (x\_{i+1} < y < x\_j \to \alpha(y) \land \beta'(y)) \end{pmatrix} \end{aligned}$$

which is a disjunction of an ∃∀-formula and others that can be turned into disjunctions of ∃∀-formulas. The reasoning is entirely similar for ∀z<sub>m+1</sub>(z<sub>i</sub> < z<sub>m+1</sub> < z<sub>j</sub> → θ<sub>1</sub> ∨ θ<sub>2</sub>).

Any coSafety-FO formula can be translated into a disjunction of ∃∀-formulas by Lemma 5, and then into a coSafety-LTL(−X̃) formula by Lemma 4. Together with Lemma 3, we obtain the following.

Corollary 1. ⟦coSafety-FO⟧<sup><ω</sup> = ⟦coSafety-LTL(−X̃)⟧<sup><ω</sup>

We are now ready to state the main result of this section.

Theorem 1. ⟦coSafety-LTL⟧ = ⟦coSafety-FO⟧

Proof. We know that ⟦coSafety-LTL⟧ = ⟦coSafety-LTL(−X̃)⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> by Observation 1 and Lemma 1. Since ⟦coSafety-LTL(−X̃)⟧<sup><ω</sup> = ⟦coSafety-FO⟧<sup><ω</sup> by Corollary 1, we have that ⟦coSafety-LTL(−X̃)⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> = ⟦coSafety-FO⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup>. Then, by Lemma 2 we have that ⟦coSafety-FO⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> = ⟦coSafety-FO⟧, hence ⟦coSafety-LTL⟧ = ⟦coSafety-FO⟧.

Corollary 2. ⟦Safety-LTL⟧ = ⟦Safety-FO⟧

### 4 Safety-FO captures LTL-definable safety languages

In this section, we prove that coSafety-FO captures LTL-definable co-safety languages. By duality, we have that Safety-FO captures LTL-definable safety languages, and by the equivalence shown in the previous section, this provides a novel proof of the fact that Safety-LTL captures LTL-definable safety languages. We start by characterizing co-safety languages in terms of LTL over finite words.

Lemma 6. ⟦LTL⟧ ∩ coSAFETY = ⟦LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup>

Proof. (⊆) By Proposition 2 we know that each language L ∈ ⟦LTL⟧ ∩ coSAFETY is definable by a formula of the form Fα where α ∈ LTL<sub>P</sub>. Hence for each σ ∈ L there exists an n such that σ, n ⊨ α, hence σ<sub>[0,n]</sub>, n ⊨ α. Note that σ<sub>[n+1,∞]</sub> is unconstrained. By replacing all the since/yesterday/weak yesterday operators in α with until/tomorrow/weak tomorrow operators, we obtain an LTL formula α<sup>r</sup> such that (σ<sub>[0,n]</sub>)<sup>r</sup>, 0 ⊨ α<sup>r</sup> (where σ<sup>r</sup> is the reverse of σ). Since LTL captures star-free languages [12] and star-free languages are closed under reversal, there is also an LTL formula β such that σ<sub>[0,n]</sub>, 0 ⊨ β. Hence L = L<sup><ω</sup>(β) · (2<sup>Σ</sup>)<sup>ω</sup>, and we have proved that ⟦LTL⟧ ∩ coSAFETY ⊆ ⟦LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup>.

(⊇) Given L ∈ ⟦LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup>, we know L = L<sup><ω</sup>(β) · (2<sup>Σ</sup>)<sup>ω</sup> for some LTL formula β. Hence, for each σ ∈ L there is an n such that σ<sub>[0,n]</sub>, 0 ⊨ β. Since LTL captures star-free languages and star-free languages are closed under reversal, there is an LTL formula α<sup>r</sup> such that (σ<sub>[0,n]</sub>)<sup>r</sup>, 0 ⊨ α<sup>r</sup>. Now, by replacing all the until/tomorrow/weak tomorrow operators in α<sup>r</sup> with since/yesterday/weak yesterday operators, we obtain an LTL<sub>P</sub> formula α such that σ<sub>[0,n]</sub>, n ⊨ α. Hence, for σ there is an n such that σ, n ⊨ α, i.e., σ ⊨ Fα. Therefore, by Proposition 2, L ∈ ⟦LTL⟧ ∩ coSAFETY, and this in turn implies that ⟦LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> ⊆ ⟦LTL⟧ ∩ coSAFETY.

Now, we show that, over finite words, universal temporal operators are unneeded.

Lemma 7. ⟦LTL⟧<sup><ω</sup> = ⟦Safety-LTL⟧<sup><ω</sup> = ⟦coSafety-LTL⟧<sup><ω</sup>

Proof. Since Safety-LTL and coSafety-LTL are fragments of LTL, we only need to show one direction, i.e., that ⟦LTL⟧<sup><ω</sup> ⊆ ⟦Safety-LTL⟧<sup><ω</sup> and ⟦LTL⟧<sup><ω</sup> ⊆ ⟦coSafety-LTL⟧<sup><ω</sup>. First, we show that universal temporal operators are not needed over finite words. For each LTL formula ϕ, we can build an equivalent coSafety-LTL formula with only existential temporal operators. The globally operator can be replaced by an until operator whose existential part always refers to the last position of the word. In turn, the last position can be identified with the formula X̃⊥, which is true only at the final position:

$$\mathbf{G}\phi \equiv \phi \mathcal{U} \left(\phi \land \widetilde{\mathbf{X}} \bot\right)$$

Similarly, the release operator can be expressed by means of a globally operator in disjunction with an until operator:

$$\phi\_1 \mathcal{R} \,\phi\_2 \equiv \mathbf{G} \phi\_2 \vee (\phi\_2 \mathcal{U} \,(\phi\_1 \wedge \phi\_2)) \equiv \left(\phi\_2 \mathcal{U} \,(\phi\_2 \wedge \widetilde{\mathbf{X}} \bot)\right) \vee \left(\phi\_2 \mathcal{U} \,(\phi\_1 \wedge \phi\_2)\right)$$

Hence ⟦LTL⟧<sup><ω</sup> = ⟦coSafety-LTL⟧<sup><ω</sup>. Now, if we exploit the duality between the eventually/until and the globally/release operators, we obtain:

$$\begin{aligned} \mathsf{F}\phi &\equiv \phi \,\, \mathcal{R} \,(\phi \lor \mathsf{X} \top) \\ \phi\_1 \,\, \mathcal{U} \,\phi\_2 &\equiv \phi\_2 \,\, \mathcal{R} \,(\phi\_2 \lor \mathsf{X} \top) \land \phi\_2 \,\, \mathcal{R} \,(\phi\_1 \lor \phi\_2) \end{aligned}$$

Hence ⟦LTL⟧<sup><ω</sup> = ⟦Safety-LTL⟧<sup><ω</sup>.
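The four identities used in the proof of Lemma 7 can be sanity-checked by brute force. The following Python sketch evaluates LTL over non-empty finite words and compares both sides of each identity on every word of length at most 4 over two propositions. The tuple encoding of formulas, the operator tags (with `wX` standing for the weak tomorrow X̃), and the function name `holds` are assumptions made for this illustration; a bounded check of this kind is evidence, not a proof.

```python
from itertools import product

def holds(f, word, i=0):
    """Evaluate formula `f` (a nested tuple) at position i of a non-empty finite word."""
    op = f[0]
    if op == "true":
        return True
    if op == "false":
        return False
    if op == "ap":
        return f[1] in word[i]
    if op == "and":
        return holds(f[1], word, i) and holds(f[2], word, i)
    if op == "or":
        return holds(f[1], word, i) or holds(f[2], word, i)
    if op == "X":    # strong tomorrow: a next position must exist
        return i + 1 < len(word) and holds(f[1], word, i + 1)
    if op == "wX":   # weak tomorrow: vacuously true at the last position
        return i + 1 >= len(word) or holds(f[1], word, i + 1)
    if op == "G":
        return all(holds(f[1], word, j) for j in range(i, len(word)))
    if op == "F":
        return any(holds(f[1], word, j) for j in range(i, len(word)))
    if op == "U":    # f[2] eventually holds, and f[1] holds strictly before that
        return any(holds(f[2], word, j)
                   and all(holds(f[1], word, k) for k in range(i, j))
                   for j in range(i, len(word)))
    if op == "R":    # f[2] holds at j unless f[1] held strictly before j
        return all(holds(f[2], word, j)
                   or any(holds(f[1], word, k) for k in range(i, j))
                   for j in range(i, len(word)))
    raise ValueError(f"unknown operator: {op}")

p, q = ("ap", "p"), ("ap", "q")
last = ("wX", ("false",))      # true exactly at the final position
not_last = ("X", ("true",))    # true exactly at non-final positions

identities = [
    (("G", p), ("U", p, ("and", p, last))),
    (("R", p, q), ("or", ("U", q, ("and", q, last)), ("U", q, ("and", p, q)))),
    (("F", p), ("R", p, ("or", p, not_last))),
    (("U", p, q), ("and", ("R", q, ("or", q, not_last)), ("R", q, ("or", p, q)))),
]

letters = [frozenset(s) for s in ([], ["p"], ["q"], ["p", "q"])]
for lhs, rhs in identities:
    for n in range(1, 5):
        for word in product(letters, repeat=n):
            for i in range(n):
                assert holds(lhs, word, i) == holds(rhs, word, i)
print("all identities agree on every word of length at most 4")
```

Replacing `last` by the strong version `("X", ("false",))` makes the first identity fail, since strong tomorrow applied to ⊥ is never satisfied.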

Then, we relate coSafety-LTL on finite words and coSafety-FO.

Lemma 8. ⟦coSafety-LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> = ⟦coSafety-FO⟧

Proof. (⊆) We have that ⟦coSafety-LTL⟧<sup><ω</sup> = ⟦LTL⟧<sup><ω</sup> by Lemma 7, and this implies that ⟦coSafety-LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> = ⟦LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup>, and ⟦coSafety-LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> = ⟦FO⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> by Proposition 1. Now, let ϕ ∈ FO, and suppose w.l.o.g. that ϕ is in negation normal form. We define the formula ϕ′(x, y), where x and y are two fresh variables that do not occur in ϕ, as the formula obtained from ϕ by a) replacing each subformula of ϕ of type ∃zϕ<sub>1</sub> with ∃z(x ≤ z ∧ ϕ<sub>1</sub>), and b) replacing each subformula of ϕ of type ∀zϕ<sub>1</sub> with ∀z(x ≤ z < y → ϕ<sub>1</sub>). Now, consider the formula ψ ≡ ∃y(x ≤ y ∧ ϕ′(x, y)). Note that ψ is a coSafety-FO formula. When interpreted over infinite words, the models of ψ are exactly those containing a prefix that belongs to L<sup><ω</sup>(ϕ), with the remaining suffix unconstrained, that is, L(ψ) = L<sup><ω</sup>(ϕ) · (2<sup>Σ</sup>)<sup>ω</sup>. Hence ⟦FO⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> ⊆ ⟦coSafety-FO⟧, and this implies that ⟦coSafety-LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> ⊆ ⟦coSafety-FO⟧.

(⊇) We know by Lemma 2 that ⟦coSafety-FO⟧ = ⟦coSafety-FO⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup>. Since coSafety-FO formulas are also FO formulas, we have ⟦coSafety-FO⟧ ⊆ ⟦FO⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup>. By Proposition 1 and Lemma 7, we obtain that ⟦coSafety-FO⟧ ⊆ ⟦coSafety-LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup>.
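The syntactic rewriting from ϕ to ϕ′(x, y) used in the (⊆) direction of the proof above can be sketched as a recursive function over formula trees. The class names, the opaque string atoms, and the helper `show` are assumptions introduced for this illustration; the sketch applies the two replacement rules literally as they are stated in the proof and does not check that the input is in negation normal form.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    text: str        # e.g. "z < w" or "Q(w)"; atoms stay opaque in this sketch

@dataclass(frozen=True)
class Not:
    sub: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Or:
    left: object
    right: object

@dataclass(frozen=True)
class Implies:
    left: object
    right: object

@dataclass(frozen=True)
class Exists:
    var: str
    body: object

@dataclass(frozen=True)
class Forall:
    var: str
    body: object

def relativize(phi, x="x", y="y"):
    """Build phi'(x, y) by guarding every quantifier as described in the proof."""
    if isinstance(phi, Atom):
        return phi
    if isinstance(phi, Not):
        return Not(relativize(phi.sub, x, y))
    if isinstance(phi, (And, Or)):
        return type(phi)(relativize(phi.left, x, y), relativize(phi.right, x, y))
    if isinstance(phi, Exists):   # rule a): exists z (x <= z and phi1)
        return Exists(phi.var, And(Atom(f"{x} <= {phi.var}"), relativize(phi.body, x, y)))
    if isinstance(phi, Forall):   # rule b): forall z (x <= z < y -> phi1)
        return Forall(phi.var, Implies(Atom(f"{x} <= {phi.var} < {y}"), relativize(phi.body, x, y)))
    raise TypeError(f"unsupported node: {phi!r}")

def show(phi):
    """Render a formula tree as plain text."""
    if isinstance(phi, Atom):
        return phi.text
    if isinstance(phi, Not):
        return f"not {show(phi.sub)}"
    if isinstance(phi, And):
        return f"({show(phi.left)} and {show(phi.right)})"
    if isinstance(phi, Or):
        return f"({show(phi.left)} or {show(phi.right)})"
    if isinstance(phi, Implies):
        return f"({show(phi.left)} -> {show(phi.right)})"
    if isinstance(phi, Exists):
        return f"exists {phi.var}. {show(phi.body)}"
    return f"forall {phi.var}. {show(phi.body)}"

# Hypothetical input: every position is followed by a later position satisfying Q.
phi = Forall("z", Exists("w", And(Atom("z < w"), Atom("Q(w)"))))
psi = Exists("y", And(Atom("x <= y"), relativize(phi)))
print(show(psi))
```

The universal quantifier becomes bounded by the fresh variable y, which is what allows the resulting formula ψ to fit the coSafety-FO syntax.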

We are ready now to state the main result.

Theorem 2. ⟦LTL⟧ ∩ coSAFETY = ⟦coSafety-FO⟧

Proof. We know that ⟦LTL⟧ ∩ coSAFETY = ⟦LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> by Lemma 6. Then, by Lemma 7 we know that ⟦LTL⟧<sup><ω</sup> = ⟦coSafety-LTL⟧<sup><ω</sup>, and this in turn implies that ⟦LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> = ⟦coSafety-LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup>. Since ⟦coSafety-LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> = ⟦coSafety-FO⟧ by Lemma 8, we conclude that ⟦LTL⟧ ∩ coSAFETY = ⟦coSafety-FO⟧.

This result, together with Theorem 1, allows us to conclude the following.

Theorem 3. ⟦Safety-LTL⟧ = ⟦LTL⟧ ∩ SAFETY

Note that by Observation 1 and Lemma 1 on one hand, and by Lemmas 6 and 7 on the other, the question of whether ⟦Safety-LTL⟧ = ⟦LTL⟧ ∩ SAFETY can be reduced to whether ⟦coSafety-LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> = ⟦coSafety-LTL(−X̃)⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup>. If coSafety-LTL and coSafety-LTL(−X̃) were equivalent over finite words, this would already prove Theorem 3. However, we can prove this is not the case.

Theorem 4. ⟦coSafety-LTL⟧<sup><ω</sup> ≠ ⟦coSafety-LTL(−X̃)⟧<sup><ω</sup>

Proof. Note that in coSafety-LTL(−X̃) we only have existential temporal modalities and we cannot hook the final position of the word without the weak tomorrow operator. For these reasons, given a coSafety-LTL(−X̃) formula ϕ, with a simple structural induction we can prove that for each σ ∈ (2<sup>Σ</sup>)<sup>+</sup> such that σ ⊨ ϕ, it holds that σσ′ ⊨ ϕ for any σ′ ∈ (2<sup>Σ</sup>)<sup>+</sup>, i.e., all the extensions of σ satisfy ϕ as well. This implies that L<sup><ω</sup>(ϕ) is either empty (i.e., if ϕ is unsatisfiable) or infinite. Instead, by using the weak tomorrow operator to hook the last position of the word, we can describe a finite non-empty language, for example with the formula ϕ ≡ a ∧ X(a ∧ X̃⊥). The language of ϕ is L(ϕ) = {aa}, consisting of exactly one word, hence L(ϕ) cannot be described without the weak tomorrow operator.

Note that Theorem 4 does not contradict Theorem 3, that is, it does not imply that ⟦coSafety-LTL⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup> ≠ ⟦coSafety-LTL(−X̃)⟧<sup><ω</sup> · (2<sup>Σ</sup>)<sup>ω</sup>. For example, consider again the formula a ∧ X(a ∧ X̃⊥). It cannot be expressed without the weak tomorrow operator, yet it holds that L<sup><ω</sup>(a ∧ X(a ∧ X̃⊥)) · (2<sup>Σ</sup>)<sup>ω</sup> = L<sup><ω</sup>(a ∧ Xa) · (2<sup>Σ</sup>)<sup>ω</sup>.
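The example can be replayed on bounded word lengths. In the Python sketch below, the two predicates hand-code the finite-word semantics of the two formulas over the single proposition a; the names `sat_weak`, `sat_plain` and the bound `MAX` are assumptions made for the illustration, and the checks only cover words of length at most `MAX`.

```python
from itertools import product

SIGMA = [frozenset(), frozenset({"a"})]   # the two letters over the single proposition a

def sat_weak(word):
    """a and X(a and wX false): a at positions 0 and 1, and position 1 is the last one."""
    return len(word) == 2 and "a" in word[0] and "a" in word[1]

def sat_plain(word):
    """a and X a: a at positions 0 and 1; anything may follow."""
    return len(word) >= 2 and "a" in word[0] and "a" in word[1]

def words(max_len):
    """All non-empty words of length at most max_len."""
    for n in range(1, max_len + 1):
        yield from product(SIGMA, repeat=n)

MAX = 6
lang_weak = {w for w in words(MAX) if sat_weak(w)}
lang_plain = {w for w in words(MAX) if sat_plain(w)}

# With weak tomorrow, the language is finite and non-empty: it is the single word aa.
assert lang_weak == {(frozenset({"a"}), frozenset({"a"}))}

# Without weak tomorrow, the language is closed under extension (checked below MAX).
assert all(w + (c,) in lang_plain
           for w in lang_plain if len(w) < MAX
           for c in SIGMA)

# A word has a prefix in lang_weak exactly when it is in lang_plain, so appending
# arbitrary infinite suffixes to either language yields the same infinite words.
with_prefix = {w for w in words(MAX)
               if any(w[:k] in lang_weak for k in range(1, len(w) + 1))}
assert with_prefix == lang_plain
print(len(lang_weak), len(lang_plain))
```

The last check is the finite shadow of the equality stated above: both languages have the same extension closure, so they induce the same language of infinite words.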

### 5 Conclusions

In this paper, we gave a first-order characterization of safety and co-safety languages, by means of two fragments of first-order logic, Safety-FO and coSafety-FO. These fragments of S1S[FO] provide a very natural syntax and are expressively complete with respect to LTL-definable safety and co-safety languages.

The core theorem establishes a correspondence between Safety-FO (resp., coSafety-FO) and Safety-LTL (resp., coSafety-LTL), and thus it can be viewed as a special version of Kamp's theorem for safety (resp., co-safety) properties. Thanks to these new fragments, we were able to provide a novel, compact, and self-contained proof of the fact that Safety-LTL captures LTL-definable safety languages. Such a result was previously proved by Chang et al. [5], but in terms of the properties of a non-trivial transformation from star-free languages to LTL by Zuck [21]. As a by-product, we provided a number of results that relate the considered languages when interpreted over finite and infinite words. In particular, we highlighted the expressive power of the weak tomorrow temporal modality, showing it to be essential in coSafety-LTL over finite words.

Different equivalent characterizations of LTL are known, in terms of (i) first-order logic, (ii) regular expressions, (iii) automata, and (iv) monoids (see the summary by Thomas in [19]). This work focuses on the first item, but for LTL-definable safety languages. A natural follow-up would be to investigate the other items, looking for what kind of automata (resp., regular expressions, monoids) captures exactly safety and co-safety LTL-definable languages. While on finite traces simple characterizations in terms of automata and syntactic monoids exist, the infinite-traces scenario is more complex: there exists a characterization of LTL in terms of counter-free automata [13] and the one for safety ω-regular languages seems not to be difficult (see, e.g., terminal automata [4, 18]), but their combination requires a canonical (minimal) representation of a (Muller/Rabin/Streett) automaton corresponding to any ω-regular language.

### References



### First-order separation over countable ordinals<sup>⋆</sup>

Thomas Colcombet<sup>1</sup>, Sam van Gool<sup>1</sup>, and Rémi Morvan<sup>2</sup>

<sup>1</sup> IRIF, Université de Paris & CNRS, Paris, France {thomas.colcombet,vangool}@irif.fr

<sup>2</sup> École normale supérieure Paris-Saclay, Gif-sur-Yvette, France fistname.lastname@ens-paris-saclay.fr

Abstract. We show that the existence of a first-order formula separating two monadic second-order formulas over countable ordinal words is decidable. This extends the work of Henckell and Almeida on finite words, and of Place and Zeitoun on ω-words. For this, we develop the algebraic concept of monoid (resp. ω-semigroup, resp. ordinal monoid) with aperiodic merge, an extension of monoids (resp. ω-semigroups, resp. ordinal monoids) that explicitly includes a new operation capturing the loss of precision induced by first-order indistinguishability. We also show the computability of FO-pointlike sets, and the decidability of the covering problem for first-order logic on countable ordinal words.

Keywords: Regular languages · Separation · Pointlike sets · Countable ordinals · First-order logic · Monadic second-order logic

A full version of this paper can be found on arXiv. This document contains internal hyperlinks, and is best read on an electronic device.

### 1 Introduction

In this paper, we establish the decidability of FO-separability over countable ordinal words:

Theorem 1. There is an algorithm which, given two regular languages of countable ordinal words K, L, either:


<sup>⋆</sup> This work was supported by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (ERC DuaLL, grant agreement No. 670624), and by the DeLTA ANR project (ANR-16-CE40-0007).

P. Bouyer and L. Schröder (Eds.): FoSSaCS 2022, LNCS 13242, pp. 264–284, 2022. https://doi.org/10.1007/978-3-030-99253-8\_14

The decidability of FO-separability was previously only known for finite words [19,2,25,17] and for words of length ω [25]. Countable ordinal words are sequences of letters that are indexed by a countable total well-ordering, i.e., up to isomorphism, by a countable ordinal. There is a natural notion of regular languages over these objects which can be equivalently described in terms of logic (either monadic second-order logic or weak monadic second-order logic), automata (Büchi introduced a notion of automata for countable ordinal words [13], which was studied in more detail by Wojciechowski [39] and which generalises Choueka's automata [15] for words of length at most ω<sup>n</sup>; the fact that Choueka's automata can be seen as a restriction of Büchi's automata for countable ordinals was proven by Bedon [5]), rational expressions (introduced by Wojciechowski [40]), or algebra (recognisable by finite ordinal monoids, introduced by Bedon and Carton [8]). A detailed survey of the equivalence between all these notions can be found in Bedon's thesis [6].

Our algorithm follows the approach initiated by Henckell, and constructs the FO-pointlike sets in an ordinal monoid that recognises the two input languages simultaneously. FO-pointlike sets are subsets of a monoid whose elements are inherently indistinguishable by first-order logic. Our completeness proof for the algorithm follows a scheme similar to the one followed by Place and Zeitoun in the context of finite and ω-words [25], which was inspired by Wilke's characterisation of FO-definable languages [38]. We had to make several substantial changes to this approach for the proofs to generalize from finite and ω-words to the setting of countable ordinal words. A seemingly slight modification of the notion of saturation (Definition 8) allows for a careful redesign of several of the core lemmas in the proof of completeness, and in particular the construction of an FO-approximant in Section 5 below.

Related work This work lies in a line of research that aims to obtain a decidable understanding of the expressive power of subclasses of the class of regular languages. The seminal work in this area is the Schützenberger-McNaughton-Papert theorem [34,22], which effectively characterizes the languages of finite words definable in first-order logic as the ones which have an aperiodic syntactic monoid. This theorem was at the origin of a large body of work that studies classes of languages through the corresponding classes of monoids, including for instance Simon's result characterising piecewise-testable languages via J-trivial monoids [36]. FO-pointlike sets are also known in the literature as aperiodic pointlike sets, and were first studied and shown to be computable by Henckell [19], in the context of the Krohn-Rhodes semigroup complexity problem. The computability of pointlike sets was shown to be equivalent to the decidability of the covering problem by Almeida [2]. Alternative proofs of separation and covering problems for FO were given recently in [25,17], and, ever since Henckell's work, the computability of FO-pointlike sets was also extended to pointlike sets for other varieties, for example [4] for the variety of finite groups, [3] for the variety of J-trivial finite semigroups and [18] for varieties of finite semigroups determined by a variety of finite groups; also see [18] for further references. Place and Zeitoun recently used pointlike sets, in the form of covering problems [27], to resolve long-standing open membership problems for the lower levels of the dot-depth and of the Straubing-Thérien hierarchies [26,28,29].

Another, orthogonal, line of research consists in the extension of the notions of regularity (logic/automata/rational expressions/algebra) to models beyond finite words. This is the case for finite or infinite trees [30]. In this paper, we are concerned with words that go beyond finite, such as words of length ω [12,37,24], of countable ordinal length [6,5], of countable scattered<sup>3</sup> length [31,32], or of general countable length [30,35,14].

These two branches have also been studied jointly, and first-order logic was characterised on words of length ω [23], of countable ordinal length [7], of countable scattered length [10] (and in [9] for first-order logic augmented with quantifiers over Dedekind cuts), and for words of countable length [16] (as well as other logics [16,21,1]). Prior to the current work, the questions of computing the FO-pointlike sets and deciding FO-separation for languages of infinite words had only been investigated for words of length ω [25].

Structure of the document In Section 2, we introduce important definitions for manipulating infinite words in algebraic terms (ordinal monoids and their powerset), and in logical terms (first-order logic and first-order definable maps). In Section 3, we describe the algorithm, and in particular its core, a saturation construction. The correctness of the algorithm is then proved in Section 4, and the completeness in Section 5. In Section 6, we show two stronger results that arise from the same technique: the decidability of the covering problem and the computability of pointlikes. Section 7 concludes.

### 2 Preliminaries

### 2.1 Ordinals

A linear ordering is a set equipped with a total order. It is countable (resp. finite) if the underlying set is countable (resp. finite). Let α and β be two linear orderings. A morphism from α to β is a monotonic function, and an isomorphism between α and β is a bijective morphism. The (ordered) sum of two linear orders α and β is denoted by α + β and is defined, as usual, on the disjoint union of the linear orders α and β, by further postulating that every element of α is below every element of β. The product of two linear orders is denoted by α · β and is defined to be the right-to-left lexicographic ordering on the Cartesian product of the two orders, i.e., (x, y) ⩽ (x′, y′) if y < y′, or y = y′ and x ⩽ x′. The n-fold product of α with itself is denoted by α<sup>n</sup>. A linear ordering is well-founded when it does not contain an infinite strictly decreasing sequence. An ordinal is a well-founded linear ordering, considered only up to isomorphism of linear orderings. The empty linear ordering, the linear ordering with a single element and the linear ordering of natural numbers are all ordinals, and are denoted 0, 1 and ω, respectively. The class of all ordinals is itself totally ordered by the embedding

<sup>3</sup> A linear ordering is scattered if it does not contain a dense subordering.

relation: α ≼ β means that there exists an injective monotonic function from α to β. The relation ≺ denotes the strict ordering associated with ≼. An ordinal is a successor ordinal if it has a maximum, and a limit ordinal otherwise.
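The ordered sum and the successor/limit distinction can be made concrete on a small fragment of the ordinals. The following is a minimal sketch, assuming we only care about ordinals below ω · ω, written as ω · c + n for natural numbers c and n; the class name `SmallOrdinal` and its fields are illustrative choices, not notation from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class SmallOrdinal:
    """The ordinal omega * limit_part + finite_part (both natural numbers).

    The generated ordering compares (limit_part, finite_part)
    lexicographically, which agrees with the order on these ordinals.
    """
    limit_part: int   # number of copies of omega
    finite_part: int  # length of the finite tail

    def __add__(self, other):
        # Ordered sum: all of `self` is placed below all of `other`.
        if other.limit_part > 0:
            # The finite tail of `self` is absorbed by the first omega of `other`.
            return SmallOrdinal(self.limit_part + other.limit_part,
                                other.finite_part)
        return SmallOrdinal(self.limit_part,
                            self.finite_part + other.finite_part)

    def is_successor(self):
        # A successor ordinal has a maximum, i.e. a non-empty finite tail.
        return self.finite_part > 0

    def is_limit(self):
        # Following the text, every ordinal without a maximum is a limit
        # ordinal (so 0 counts as a limit ordinal here).
        return not self.is_successor()


ZERO = SmallOrdinal(0, 0)
ONE = SmallOrdinal(0, 1)
OMEGA = SmallOrdinal(1, 0)
```

In this sketch `ONE + OMEGA` equals `OMEGA` while `OMEGA + ONE` is strictly larger, which illustrates that the ordered sum is not commutative.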

### 2.2 Ordinal words

Given a set X, a word w over X is a map from some linear ordering to X. The linear ordering is called the domain of w, and denoted dom(w). A word is countable (resp. finite, resp. scattered, resp. an ω-word) if its domain is countable (resp. finite, resp. scattered, resp. ω). In this paper, a countable ordinal word is a word that has a countable and ordinal domain (hence, the countability assumption is silently assumed throughout the paper). The set of all finite words over X is denoted by X<sup>∗</sup>, and the collection of all countable ordinal words over X is denoted by X<sup>ord</sup>. Similarly, the set of finite non-empty words is denoted by X<sup>+</sup> and the collection of non-empty countable ordinal words is denoted by X<sup>ord+</sup>. The concatenation of two countable ordinal words u and v over X is the word u · v : dom(u) + dom(v) → X defined by (u · v)<sub>ι</sub> := u<sub>ι</sub> if ι ∈ dom(u) and (u · v)<sub>ι</sub> := v<sub>ι</sub> if ι ∈ dom(v). If w is a countable ordinal word, we define its omega iteration, denoted by w<sup>ω</sup>, as the word with domain dom(w) · ω defined by (w<sup>ω</sup>)<sub>(ι,n)</sub> := w<sub>ι</sub> for every ι ∈ dom(w) and n ∈ ω. For example, if a, b ∈ X, then the omega iteration (ab)<sup>ω</sup> of the two-letter word ab is the word ababab · · · with domain 2 · ω = ω.

### 2.3 Ordinal monoids

A semigroup is a set S equipped with an associative binary product, denoted by ·. A monoid is a semigroup with a distinguished neutral element for the product, denoted as 1. An element x ∈ S is called idempotent if x<sup>2</sup> = x. In a finite semigroup S, every element x ∈ S has a unique idempotent power, denoted by<sup>4</sup> x<sup>idem</sup>, which we recall is the limit of the ultimately constant sequence n ↦ x<sup>n!</sup>. We also denote by x<sup>idem+k</sup>, for k an integer, the limit of the ultimately constant sequence n ↦ x<sup>n!+k</sup>. Note that x<sup>idem</sup> is the identity element of the unique maximal group inside the subsemigroup generated by x. A finite semigroup is aperiodic (we equivalently write group-trivial) if a<sup>idem</sup> = a<sup>idem+1</sup> for all of its elements a.
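As an illustration of the idempotent power and of aperiodicity, here is a minimal sketch assuming the finite semigroup is given by a Python function `mul` for its product. The helper names and the two example semigroups (integers modulo 10 under multiplication, and {0, 1, 2, 3} under `min`) are illustrative choices, not taken from the paper.

```python
def idempotent_power(x, mul):
    """Return the unique idempotent among x, x^2, x^3, ... (finite semigroup)."""
    power = x
    # The powers of x are ultimately periodic and the cycle contains exactly
    # one idempotent, so walking through the powers reaches it.
    while mul(power, power) != power:
        power = mul(power, x)
    return power


def is_aperiodic(elements, mul):
    """Check x^idem = x^(idem+1) for every element x."""
    return all(
        mul(idempotent_power(x, mul), x) == idempotent_power(x, mul)
        for x in elements
    )


def mul_mod10(x, y):
    """Product of the multiplicative semigroup of integers modulo 10."""
    return (x * y) % 10
```

With these definitions, the powers of 2 modulo 10 cycle through 2, 4, 8, 6, and 6 is the idempotent of that cycle; since 6 · 2 = 2 ≠ 6 modulo 10, this semigroup is not aperiodic, whereas a semigroup in which every element is idempotent (such as `min` on a finite chain) is.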

We now extend the notion of monoid to obtain an algebraic structure in which one can evaluate a product indexed by any countable ordinal. Let Σ be any set, and α a countable ordinal. For any word (w<sub>ι</sub>)<sub>ι<α</sub> over the set Σ<sup>ord</sup> of countable ordinal words (i.e. (w<sub>ι</sub>)<sub>ι<α</sub> is a word whose letters are words over Σ), we define flat(w<sub>ι</sub> | ι < α) to be the word over Σ with domain ∑<sub>ι<α</sub> dom(w<sub>ι</sub>), which has the letter (w<sub>ι</sub>)<sub>κ</sub> ∈ Σ at position (ι, κ), for every ι ∈ α and κ ∈ dom(w<sub>ι</sub>).

<sup>4</sup> The standard notation is x<sup>ω</sup>, but this notation conflicts with the linear ordering ω. It is sometimes denoted x<sup>π</sup> or x<sup>!</sup> when in the context of infinite words. We find the notation x<sup>idem</sup> more self-explanatory.

Definition 2. An ordinal monoid<sup>5</sup> is a pair M = (M, π) where M is a set and π : M<sup>ord</sup> → M is a function, called generalised product, such that:

$$\begin{array}{l}
-\ \pi(x) = x \text{ for every } x \in M, \text{ and} \\
-\ \pi((\pi(u\_{\iota}))\_{\iota < \alpha}) = \pi(\text{flat}((u\_{\iota})\_{\iota < \alpha})) \text{ for every word } (u\_{\iota})\_{\iota < \alpha} \in (M^{\text{ord}})^{\text{ord}}.
\end{array}$$

The second axiom is called generalised associativity. An ordinal monoid morphism is a map between ordinal monoids preserving the generalised product. An ordinal monoid is ordered if it is equipped with an order ⩽ that makes π monotonic, i.e. such that u ⩽ v implies π(u) ⩽ π(v), in which ⩽ is extended letter-by-letter to words in M<sup>ord</sup>.

Given a set Σ (the alphabet), an ordinal monoid M = (M, π), a letter-to-letter map σ : Σ → M extended to σ<sup>ord</sup> : Σ<sup>ord</sup> → M<sup>ord</sup>, and F ⊆ M, the language L ⊆ Σ<sup>ord</sup> recognised by (M, σ, F) is

$$L = \{ u \in \Sigma^{\text{ord}} \, : \, \pi(\sigma^{\text{ord}}(u)) \in F \},$$

and a language L ⊆ Σ<sup>ord</sup> is called recognisable if it is recognised by some such tuple (M, σ, F). We recall that recognisable languages of ordinal words coincide with the ones definable in monadic second-order logic, or definable by suitable automata. These languages are called regular. Example 9 below will illustrate this concept.

We now recall a finite presentation of finite ordinal monoids (originally for ordinal semigroups), first given by Bedon [6] by extending a similar result established by Perrin and Pin [24, prop II.5.2] for ω-semigroups<sup>6</sup>. Let (S, π) be an ordinal monoid. We define the constant 1 and two functions · : S × S → S and −<sup>ω</sup> : S → S by

$$\underline{1} := \pi(\varepsilon), \qquad x \cdot y := \pi(xy), \qquad \text{and} \qquad x^{\underline{\omega}} := \pi(x^{\omega}) = \pi(xxx \cdots).$$

The following proposition lets us interchangeably regard an ordinal monoid M as either a pair (M, π) or as a quadruple (M, 1, ·, −<sup>ω</sup>), which we refer to as its presentation.

Proposition 3 ([6, Thm. 3.5.6], originally for ordinal semigroups). In a finite ordinal monoid the generalised product is uniquely determined by the operations 1, · and −<sup>ω</sup>.

An important construction on which our proof relies is the power ordinal monoid: given an ordinal monoid (M, π), we equip the powerset P(M) of M with a generalised product π : P(M)<sup>ord</sup> → P(M) defined by

$$\begin{aligned} \pi((X\_{\iota})\_{\iota < \kappa}) := \{ \pi((x\_{\iota})\_{\iota < \kappa}) \mid x\_{\iota} \in X\_{\iota} \text{ for all } \iota < \kappa \} \\ \text{for all words } (X\_{\iota})\_{\iota < \kappa} \in (\mathcal{P}(M))^{\text{ord}}. \end{aligned}$$

<sup>5</sup> The object should probably be called a 'countable ordinal monoid' since its intent is to model countable ordinal words. However the naming becomes clumsy for 'finite countable ordinal monoids'...

<sup>6</sup> The finitary representation of ω-semigroups is usually called a Wilke algebra, which is the algebraic structure introduced by Wilke in [37] to recognise regular ω-languages.

Observe that if M is a finite ordinal monoid, then so is P(M). We can compute a finite representation of the power ordinal monoid P(M) of M from a finite representation of M. Indeed,

$$\underline{1} = \{\underline{1}\}, \quad X \cdot Y = \{x \cdot y \mid x \in X, \, y \in Y\}, \quad \text{and} \quad X^{\underline{\omega}} = \{u \cdot v^{\underline{\omega}} \mid u, v \in X^{+}\}$$

for all X, Y ∈ P(M). The first two properties are trivial, while the third one can be proven using the infinite Ramsey theorem; this is a classical argument used to give finite representations of infinite structures, see e.g. [24, Theorem II.2.1]. Note that this power ordinal monoid is indeed an ordinal monoid. It is even an ordered ordinal monoid when equipped with the inclusion ordering.

### 2.4 First-order logic

Over a fixed (finite) alphabet Σ, we define the set of first-order logic formulæ, or FO-formulæ for short, by the grammar:

$$\varphi ::= \exists x. \varphi \quad \mid \quad \forall x. \varphi \quad \mid \quad \varphi \land \varphi \quad \mid \quad \varphi \lor \varphi \quad \mid \quad \neg \varphi \quad \mid \quad x < y \quad \mid \quad a(x)$$

where x, y range over some fixed infinite set of variables, and a over Σ. Free variables are defined as usual, and an FO-sentence is a formula with no free variables. In our setting, a model is a countable ordinal word, and a valuation over this model is a total map from variables to the domain of the word. We define, for any word w and any valuation ν, the semantic relation w, ν ⊨ φ of first-order logic on countable ordinal words by structural induction on the FO-formula φ, by interpreting variables as positions in the word and propositions of the form a(x) as "the letter at position x is an a". If φ is an FO-sentence, then the semantics of φ over a word w does not depend on the valuation, and thus we write w ⊨ φ or w ⊭ φ. When w ⊨ φ we say that w satisfies φ, or also that φ accepts w.

A language L ⊆ Σ<sup>ord</sup> is said to be FO-definable if L = {w ∈ Σ<sup>ord</sup> | w ⊨ φ} for some FO-sentence φ. For example, the language of words over the alphabet {a, b, c} such that every 'a' is at a finite distance from a 'b' is defined by the FO-sentence ∀x. a(x) → ∃y. b(y) ∧ finite(x, y), where:

$$\begin{aligned} \text{isSuccessor}(z) &::= \exists y. y < z \land (\forall x. x < z \to x \leqslant y) \\ \text{finite}(x, y) &::= \forall z. (x < z \leqslant y \lor y < z \leqslant x) \to \text{isSuccessor}(z) \; . \end{aligned}$$

Bedon [7] extended the Schützenberger-McNaughton-Papert theorem [34,22] to countable ordinal words.

Proposition 4 (Bedon's theorem [7, Theorem 3.4]). A language of countable ordinal words is FO-definable if and only if it is recognised by a finite aperiodic ordinal monoid.

Let L ⊆ Σ<sup>ord</sup>. A function f : L → X whose codomain X is a finite set is said to be FO-definable when every preimage f<sup>−1</sup>[x], with x ∈ X, is an FO-definable language. Note that if f is FO-definable, then its domain L is necessarily an FO-definable language.

For example, the function Σ<sup>∗</sup> → Z/2Z, sending a word w ∈ Σ<sup>∗</sup> to its length modulo 2, is not FO-definable. On the other hand, for a fixed letter a ∈ Σ, the total function sending a word w ∈ Σ<sup>ord+</sup> to ⊤ if w contains the letter 'a' and to ⊥ otherwise is FO-definable.

A useful tool to manipulate words is the notion of condensation; see, e.g., [33, §4] for an introduction to the subject. A condensation of a countable ordinal α is an equivalence relation ∼ over α whose equivalence classes are convex. Note that the quotient of a countable ordinal by a condensation is still a countable ordinal.

A condensation formula φ(x, y) is a formula which is interpreted as a condensation of the domain over all countable ordinal words, i.e. for every word w ∈ Σ<sup>ord</sup>, the relation defined on dom(w) by ι ∼<sub>φ</sub> κ if and only if w, [x ↦ ι, y ↦ κ] ⊨ φ(x, y) is a condensation. A condensation formula φ(x, y) induces a map:

$$
\hat{\varphi} \colon \Sigma^{\mathrm{ord}} \to (\Sigma^{\mathrm{ord}+})^{\mathrm{ord}}.
$$

where for every u ∈ Σ<sup>ord</sup>, φ̂(u) is a word whose domain is dom(u)/∼<sub>φ</sub>, and such that for every class I ∈ dom(u)/∼<sub>φ</sub>, the I-th letter of φ̂(u) is the word (u<sub>ι</sub>)<sub>ι∈I</sub>; hence flat(φ̂(u)) = u.

For example, the formula finite(x, y) is a condensation formula, called finite condensation. The function φ̂<sub>finite</sub> : Σ<sup>ord</sup> → (Σ<sup>ord</sup>)<sup>ord</sup> that it induces sends the word ababab · · · cdcdcd · · · abc ∈ Σ<sup>ord</sup> of length ω · 2 + 3 to the 3-letter word (ababab · · ·)(cdcdcd · · ·)(abc). Observe that for every word w ∈ Σ<sup>ord</sup>, every letter of φ̂<sub>finite</sub>(w) is a word of length ω, except possibly for the last letter (if the word has one), which can be finite.

Given two FO-definable functions, one that describes "local transformations" and another that describes how to glue these local transformations together, the following lemma allows us to build a new FO-definable function. It is one of the key ingredients in our proof of Theorem 1.

Lemma 5. Let A, B, C be finite sets. Let φ(x, y) be a condensation FO-formula over A, and let f : A<sup>ord+</sup> → B and g : B<sup>ord</sup> → C be FO-definable functions. Then, the map

$$g \circ\_{\varphi} f \colon A^{\text{ord}} \to C, \qquad u \mapsto g \left( \prod\_{i \in \operatorname{dom}(\hat{\varphi}(u))} f(\hat{\varphi}(u)\_i) \right)$$

is FO-definable.

### 3 The algorithm

In this section we describe the algorithm behind Theorem 1. We first introduce the key notion of saturation in Section 3.1, and formalise the algorithm in Section 3.2.

### 3.1 The saturation construction

Until the end of Section 3.1, we fix a finite ordinal monoid M = (M, ·, 1, −<sup>ω</sup>).

The saturation construction is at the heart of the algorithm, both in this paper and in previous work. We introduce the necessary definitions. Note however that in our case, we do not close the definition under subsets as is usually done. This change, which may look minor, is in fact key for our proof to go through in the case of countable ordinals, and we find that it also simplifies some points in the setting of finite words. We first recall an essential operation on P(M) that we denote −<sup>grp</sup>. Applied to a set X ⊆ M, it computes the union of all the elements that belong to the maximal group in the subsemigroup of P(M) generated by X.

Definition 6. Let X ⊆ M. Define

$$X^{\mathrm{grp}} = \bigcup\_{k \in \mathbb{N}} X^{\mathrm{idem} + k} \overset{\star}{=} \bigcap\_{n \in \mathbb{N}} \bigcup\_{m \ge n} X^m.$$

Note that the ⋆ equality holds: the left-to-right inclusion comes from the fact that X<sup>idem+k</sup> = X<sup>m</sup> holds for infinitely many values of m, while the other inclusion stems from the fact that X<sup>m</sup> can be written as X<sup>idem+k</sup> for some k whenever m is sufficiently large.

Some important properties of this operation are the following.

Lemma 7. The operation −<sup>grp</sup> is monotonic, and for all A, B ⊆ M, and all integers k,

$$A^{\text{idem}+k} \subseteq A^{\text{grp}}, \qquad (A \cdot B)^{\text{grp}} = A \cdot (B \cdot A)^{\text{grp}} \cdot B, \qquad \text{and} \qquad A^{\text{grp}} \cdot A^{\text{grp}} = (A^{\text{grp}})^{\text{grp}} = A^{\text{grp}}.$$
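The operation −<sup>grp</sup> and the identities above can be explored by brute force on a small example. The sketch below assumes the underlying finite semigroup is given by a Python function `mul`; the function names (`grp`, `limsup_of_powers`, `check_lemma7`) and the choice of the multiplicative monoid of integers modulo 6 in the usage note are illustrative, not taken from the paper. The set X<sup>grp</sup> is computed as the union of the sets lying on the cycle of the sequence of powers X, X<sup>2</sup>, X<sup>3</sup>, ..., which are exactly the sets X<sup>idem+k</sup>.

```python
from itertools import combinations


def set_product(X, Y, mul):
    """Product in the power monoid: all products x * y."""
    return frozenset(mul(x, y) for x in X for y in Y)


def grp(X, mul):
    """Union of the powers of X that lie on the cycle of X, X^2, X^3, ..."""
    X = frozenset(X)
    powers = [X]   # powers[i] is X^(i+1)
    seen = {X: 0}  # first index at which each power occurs
    while True:
        nxt = set_product(powers[-1], X, mul)
        if nxt in seen:
            # Everything from the first occurrence onwards is the cycle.
            return frozenset().union(*powers[seen[nxt]:])
        seen[nxt] = len(powers)
        powers.append(nxt)


def limsup_of_powers(X, mul, horizon):
    """Union of X^m for horizon <= m <= 2 * horizon.

    This equals the right-hand side of the starred equality as soon as
    `horizon` is at least the number of subsets of the semigroup.
    """
    X = frozenset(X)
    power, tail = X, frozenset()
    for m in range(1, 2 * horizon + 1):
        if m >= horizon:
            tail |= power
        power = set_product(power, X, mul)
    return tail


def subsets(elements):
    elements = list(elements)
    return [frozenset(c) for r in range(len(elements) + 1)
            for c in combinations(elements, r)]


def check_lemma7(elements, mul):
    """Brute-force check of monotonicity and of the identities on grp."""
    subs = subsets(elements)
    for A in subs:
        gA = grp(A, mul)
        if set_product(gA, gA, mul) != gA or grp(gA, mul) != gA:
            return False
        for B in subs:
            if A <= B and not gA <= grp(B, mul):
                return False
            AB, BA = set_product(A, B, mul), set_product(B, A, mul)
            rhs = set_product(set_product(A, grp(BA, mul), mul), B, mul)
            if grp(AB, mul) != rhs:
                return False
    return True
```

For instance, with `mul6 = lambda x, y: (x * y) % 6`, the call `grp({5}, mul6)` returns `{1, 5}`, and `check_lemma7(range(6), mul6)` runs through all pairs of subsets. Such a check is only a sanity test on one example, not a proof.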

The core of the algorithm computes the closure under −grp and all the operations of the algebra of the images of the letters.

Definition 8. Let A ⊆ P(M). The set ⟨A⟩<sup>grp,ord</sup> ⊆ P(M) is defined to be the least set containing A, {1}, and closed under ·, <sup>grp</sup> and <sup>ω</sup>.<sup>7</sup>

This definition is close in spirit to what is called saturation in previous works, with the difference that we do not take the downward closure, and that we close under the operation −<sup>ω</sup>. Despite this difference, we sometimes call ⟨A⟩<sup>grp,ord</sup> the saturation.

Observe that the ordinal monoid M is aperiodic if and only if

⟨{{x} | x ∈ M}⟩<sup>grp,ord</sup> = {{x} | x ∈ M}.

<sup>7</sup> Recall that we showed that in a power ordinal monoid, the operation −<sup>ω</sup> is computable.

### 3.2 The algorithm

We are now ready to describe the core of the algorithm that is claimed to exist in Theorem 1. Let K and L be two regular languages of countable ordinal words over the alphabet Σ. The algorithm is:


⟨{{a}}⟩<sup>grp,ord</sup> = {{1}, {a}, {aa}, {a, aa}, {a<sup>ω</sup>}, {a<sup>ω</sup>a}, {a<sup>ω</sup>aa}, {a<sup>ω</sup>a, a<sup>ω</sup>aa}}

Fig. 1. Egg-box diagram of a finite ordinal monoid M recognising J, K and L (left), multiplication table and ω-iteration of M (right) and saturation (bottom).

Example 9. We illustrate the saturation construction and the algorithm on the following three languages over the singleton alphabet {a}:

J = {infinite words whose longest finite suffix has even length}, K = {infinite words whose longest finite suffix has odd length}, and L = {words that do not have a last letter}.

It is classical that J and K are not FO-definable, while L is defined by the formula ∀x. ∃y. y > x. We can build a finite ordinal monoid M recognising all three languages: it has six elements, 1, a, aa, a<sup>ω</sup>, a<sup>ω</sup>a and a<sup>ω</sup>aa. Its presentation is described in Figure 1. Naturally, the letter a is mapped to σ(a) = a. Then J, K and L are recognised by F<sub>J</sub> := {a<sup>ω</sup>, a<sup>ω</sup>aa}, by F<sub>K</sub> := {a<sup>ω</sup>a} and by F<sub>L</sub> := {1, a<sup>ω</sup>}, respectively.

The languages K and L are FO-separable: in fact L is an FO-separator of K and L. On the other hand, J and K are not FO-separable, as witnessed by the saturation algorithm. Indeed, the saturation ⟨{{σ(a)} | a ∈ Σ}⟩<sup>grp,ord</sup> contains all singletons, and furthermore {a, aa} = {a}<sup>grp</sup>. As a consequence, it also contains {a<sup>ω</sup>a, a<sup>ω</sup>aa} = {a}<sup>ω</sup> · {a, aa}. This last set intersects both F<sub>J</sub> and F<sub>K</sub>.
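The computation in this example can be replayed mechanically. The sketch below is written under explicit assumptions: the six elements of M are encoded as pairs `(infinite, r)` (a hypothetical encoding, since the multiplication table of Figure 1 is not reproduced in the text), the product and ω-iteration are reconstructed from the intended meaning of the elements, and the decision step follows the way the algorithm is used in Sections 4 and 5 (answer 'no' exactly when some set of the saturation meets both accepting sets).

```python
# Hypothetical encoding of the elements of M as (infinite?, r), where r
# describes the longest finite suffix: 0 = empty, 1 = odd, 2 = even non-empty.
ONE, A, AA = (False, 0), (False, 1), (False, 2)
W, WA, WAA = (True, 0), (True, 1), (True, 2)   # a^w, a^w a, a^w aa


def mul(x, y):
    (x_inf, x_r), (y_inf, y_r) = x, y
    if y_inf:                      # an infinite right factor decides everything
        return y
    if x_r == 0 or y_r == 0:       # an empty finite suffix is neutral
        return (x_inf, x_r or y_r)
    return (x_inf, 2 if x_r == y_r else 1)   # parity of the total length


def omega(x):
    return ONE if x == ONE else W


def p_mul(X, Y):
    return frozenset(mul(x, y) for x in X for y in Y)


def subsemigroup(X):
    """The set X^+ of all finite non-empty products of elements of X."""
    S = set(X)
    while True:
        new = {mul(s, x) for s in S for x in X} - S
        if not new:
            return S
        S |= new


def p_omega(X):
    S = subsemigroup(X)
    return frozenset(mul(u, omega(v)) for u in S for v in S)


def p_grp(X):
    powers, seen = [X], {X: 0}
    while True:
        nxt = p_mul(powers[-1], X)
        if nxt in seen:
            return frozenset().union(*powers[seen[nxt]:])
        seen[nxt] = len(powers)
        powers.append(nxt)


def saturation(generators):
    """Least set containing the generators and {1}, closed under ., grp, omega."""
    sat = set(generators) | {frozenset({ONE})}
    while True:
        new = {p_mul(X, Y) for X in sat for Y in sat}
        new |= {p_grp(X) for X in sat} | {p_omega(X) for X in sat}
        if new <= sat:
            return sat
        sat |= new


def separable(sat, F1, F2):
    """'yes' iff no set of the saturation meets both accepting sets."""
    return not any(X & F1 and X & F2 for X in sat)


SAT = saturation({frozenset({A})})
F_J, F_K, F_L = frozenset({W, WAA}), frozenset({WA}), frozenset({ONE, W})
```

Under this encoding, `SAT` consists of the eight sets listed with Figure 1, `separable(SAT, F_J, F_K)` is false because of the set {a<sup>ω</sup>a, a<sup>ω</sup>aa}, and `separable(SAT, F_K, F_L)` is true.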

The rest of the paper is dedicated to establishing the validity of this approach. In Section 4, we prove Proposition 12 stating that if the algorithm answers 'no', then the languages cannot be separated, as described in Theorem 1. In Section 5, we prove Corollary 16 stating that if the algorithm answers 'yes', then it is possible to construct an FO-separator sentence as described in Theorem 1. In Section 6, we shall package the results of Sections 4 and 5 differently, concluding that we have in fact computed the pointlike sets, and that we can also decide the more general covering problem.

### 4 When the algorithm says 'no'

In this section, we establish the correctness of the algorithm, i.e., when the algorithm answers 'no', we have to prove that the two input languages cannot be separated by an FO-definable language, and that we can produce a witness function. This is established in Proposition 12. The proof follows standard arguments.

The quantifier depth, a.k.a. quantifier rank, of an FO-formula is the maximal number of nested quantifiers in the formula. Two words u, v ∈ Σ<sup>ord</sup> are said to be FO<sub>k</sub>-equivalent, denoted by u ≡<sub>FO<sub>k</sub></sub> v, if every FO-sentence of quantifier depth at most k accepts u if and only if it accepts v.

Proposition 10. Let k ∈ N.


This can be proved, for example, by using Ehrenfeucht-Fraïssé games; see e.g. [33, Lemma 6.5 & Corollary 6.9] for a proof of the first and third items; the proof of the second item is similar<sup>8</sup>. Note that the first two items are also immediate corollaries of the Feferman-Vaught theorem [20, Theorem 1.3]. Note that the third property can be used to prove that every FO-definable language is recognised by an aperiodic finite ordinal monoid; this is the easy direction of Bedon's theorem [7].

Throughout the rest of this section, we fix K and L, two regular languages of countable ordinal words over an alphabet Σ. Recall that the algorithm computes the subset Sat := ⟨{{σ(a)} | a ∈ Σ}⟩<sup>grp,ord</sup> of P(M), where M is a finite ordinal monoid recognizing both K and L.

<sup>8</sup> Moreover, note that the first item can be deduced from the second item by taking u<sub>n</sub> = v<sub>n</sub> = ε for n ≥ 2.

We begin with a lemma which states that witnesses of indistinguishability can be effectively associated to all sets that belong to Sat (we shall see in Proposition 30 that what we have proved is that the elements in Sat are pointlike sets).

Lemma 11. There exists a computable function which takes as input a number k ∈ N and an element X ∈ Sat, and produces an X-indexed sequence of ordinal words (u<sub>x</sub>)<sub>x∈X</sub> ∈ (Σ<sup>ord</sup>)<sup>X</sup> such that,

$$\begin{array}{l} -\ \pi(\sigma^{\text{ord}}(u\_x)) = x \ for \ all \ x \in X, \ and \\\ -\ \ u\_x \equiv\_{\text{FO}\_k} u\_{x'} \ for \ all \ x, x' \in X. \end{array}$$

The proof is by structural induction on the definition of Sat, making use of the first two items of Proposition 10 for composing witnesses, and furthermore of the third item for treating the −<sup>grp</sup> operation.

From the above lemma, one can easily deduce that when the algorithm answers 'no', there is indeed an obstruction to the fact that K and L can be FO-separated.

Proposition 12. Assume that the algorithm answers 'no' when run with input languages K and L. Then there is a witness function which computes, for any FO-sentence φ, a pair of words (u, u′) ∈ K × L such that u ⊨ φ if and only if u′ ⊨ φ. In particular, K and L cannot be FO-separated.

Proof. Since the algorithm answered 'no', pick a pair (x, x′) ∈ F<sub>K</sub> × F<sub>L</sub> such that x, x′ ∈ X for some X ∈ Sat. Now, for any FO-sentence φ, using the function of Lemma 11 with k the quantifier depth of φ, we can compute a sequence (u<sub>x</sub>)<sub>x∈X</sub> of ordinal words. Now define u := u<sub>x</sub> and u′ := u<sub>x′</sub>. Then u ≡<sub>FO<sub>k</sub></sub> u′, so that u ⊨ φ if and only if u′ ⊨ φ. Also, π(σ<sup>ord</sup>(u)) = x ∈ F<sub>K</sub> and π(σ<sup>ord</sup>(u′)) = x′ ∈ F<sub>L</sub>, so u ∈ K and u′ ∈ L.

Example 13 (Continuing Example 9). Recall that J and K are not FO-separable. Because of the set {a<sup>ω</sup>a, a<sup>ω</sup>aa} ∈ ⟨{{σ(a)} | a ∈ Σ}⟩<sup>grp,ord</sup>, the algorithm outputs 'no', and can return, to witness the FO-inseparability of the two languages, the computable map φ ↦ (a<sup>ω</sup>a<sup>2<sup>k</sup>+1</sup>, a<sup>ω</sup>a<sup>2<sup>k</sup>+2</sup>) ∈ J × K, where k denotes the quantifier depth of φ. To prove that a<sup>ω</sup>a<sup>2<sup>k</sup>+1</sup> ≡<sub>FO<sub>k</sub></sub> a<sup>ω</sup>a<sup>2<sup>k</sup>+2</sup>, one can simply use the first and third items of Proposition 10.

### 5 When the algorithm says 'yes'

We now establish the completeness part of the proof of the main theorem, Theorem 1. The goal of this proof is to establish that if the algorithm answers 'yes', it is indeed possible to produce an FO-separator (Corollary 16).

This is the part of the proof that differs most substantially from previous works on separation. In Section 5.1, we abstract the question with the notion of ordinal monoids with merge, and we introduce the notion of FO-approximants, which are FO-definable over-approximations of the product. The key result, Lemma 15, states their existence for all finite ordinal monoids with merge. Corollary 16 follows immediately. The proof of Lemma 15 is then established in Section 5.2 for words of finite or ω length. Building on these simpler cases, the general case is the subject of Section 5.3.

### 5.1 Merge operators and FO-approximants

We abstract in this section the power ordinal monoid P(M) equipped with the −<sup>grp</sup> operator into a new algebraic structure. A finite ordinal monoid with merge M = (M, 1, ⩽, ·, <sup>ω</sup>, <sup>grp</sup>) consists of:


$$\begin{aligned} a^{\text{idem}+k} &\leqslant a^{\text{grp}}, & (a^{\text{idem}})^{\text{grp}} &= a^{\text{idem}},\\ a^{\text{grp}} \cdot a^{\text{grp}} &= (a^{\text{grp}})^{\text{grp}} = a^{\text{grp}}, & \text{and} \quad (a \cdot b)^{\text{grp}} &= a \cdot (b \cdot a)^{\text{grp}} \cdot b \;. \end{aligned}$$

The following lemma is an immediate consequence of Lemma 7.

Lemma 14. Both (P(M), {1}, ⊆, ·, <sup>ω</sup>, <sup>grp</sup>) and (Sat, {1}, ⊆, ·, <sup>ω</sup>, <sup>grp</sup>) are ordinal monoids with merge.

The idea behind ordinal monoids with merge is that not only is there a product operation, as for every ordinal monoid, but also an FO-definable over-approximation of it. This is the concept of FO-approximant that we introduce now. Given an FO-definable language L ⊆ M<sup>ord</sup>, an FO-approximant of π over L is an FO-definable map ρ: L → M such that:

$$
\pi(u) \leqslant \rho(u), \qquad \text{for all } u \in L.
$$

The key result concerning ordinal monoids with merge is the existence of a total FO-approximant:

Lemma 15. There is an FO-approximant ρ over M<sup>ord</sup> for all ordinal monoids with merge M.

An example of an FO-approximant can be found in Example 26. Before establishing Lemma 15, let us explain why it is sufficient for concluding the proof of Theorem 1 in the case the algorithm answers 'yes'.

Corollary 16. If the algorithm answers 'yes', there exists an FO-separator.

Proof. By Lemmas 14 and 15, there exists an FO-approximant ρ : A<sup>ord</sup> → ⟨A⟩<sup>grp,ord</sup> over the power ordinal monoid P(M), where A = {{σ(a)} | a ∈ Σ}. Now define the language

$$S := \{ u \in \Sigma^{\mathrm{ord}} \mid \rho(\tilde{\sigma}^{\mathrm{ord}}(u)) \cap F\_K \neq \emptyset \}$$
 
$$\text{where } \tilde{\sigma}^{\mathrm{ord}}(u) := (\{\sigma(u\_i)\})\_{i \in \text{dom}(u)} \in A^{\mathrm{ord}} \text{ for all } u \in \Sigma^{\mathrm{ord}}.$$

Note first that since ρ is FO-definable, this language is FO-definable. Let us show that it separates K from L.

For every u ∈ K, F<sub>K</sub> ∋ π(σ<sup>ord</sup>(u)) ∈ ρ(σ̃<sup>ord</sup>(u)), and as a consequence ρ(σ̃<sup>ord</sup>(u)) ∩ F<sub>K</sub> ≠ ∅. We have proved K ⊆ S.

Conversely, consider some u ∈ L. We have F<sub>L</sub> ∋ π(σ<sup>ord</sup>(u)) ∈ ρ(σ̃<sup>ord</sup>(u)) ∈ ⟨A⟩<sup>grp,ord</sup>, and thus ρ(σ̃<sup>ord</sup>(u)) ∩ F<sub>L</sub> ≠ ∅. Since the algorithm returns 'yes', there is no set in ⟨A⟩<sup>grp,ord</sup> that intersects both F<sub>K</sub> and F<sub>L</sub>. In our case, this means that ρ(σ̃<sup>ord</sup>(u)) ∩ F<sub>K</sub> = ∅, proving that u ∉ S. We have proved L ∩ S = ∅.

Overall, S is an FO-separator for K and L.

Remark 17. Notice how the "difficult" implication of Bedon's theorem (Proposition 4) can be easily deduced from Lemma 15<sup>9</sup>: recall that this implication consists in showing that a regular language L ⊆ Σ<sup>ord</sup>, recognised by some triple (M, σ, F) with M aperiodic, is definable in first-order logic. Indeed, by aperiodicity of M, the operation <sup>grp</sup> applied to a singleton {a} yields the singleton {a<sup>idem</sup>}. Hence, the set ⟨{{σ(a)} | a ∈ Σ}⟩<sup>grp,ord</sup> = {{π ∘ σ<sup>ord</sup>(u)} | u ∈ Σ<sup>ord</sup>} consists only of singletons, and as a consequence, all FO-approximants ρ (and in particular the one constructed in Lemma 15) map a word u to π(u). Hence, π is an FO-definable map, and thus L is an FO-definable language.

The rest of this section is devoted to establishing Lemma 15. The construction is based on subresults showing the existence of FO-approximants over subsets of M<sup>ord</sup>: first for finite and ω-words in Section 5.2, and finally for words of any countable ordinal length in Section 5.3. But beforehand, we shall introduce some more definitions and elementary results.

In what follows we use the notation ⟨−⟩<sup>grp,ord</sup> from Definition 8, interpreted in a generic ordinal monoid with merge, as well as some variants. Let A ⊆ M. We define ⟨A⟩<sup>+</sup> as the closure of A under ·, ⟨A⟩<sup>grp+</sup> as the closure of A under · and −<sup>grp</sup>, and ⟨A⟩<sup>grp∗</sup> as ⟨A⟩<sup>grp+</sup> ∪ {1}. We define ⟨A⟩<sup>grp,ord+</sup> as the closure of A under ·, <sup>grp</sup> and <sup>ω</sup>. Note that thanks to the identities of ordinal monoids with merge, we have ⟨A⟩<sup>grp,ord</sup> = ⟨A⟩<sup>grp,ord+</sup> ∪ {1}. Moreover, we have the following identities<sup>10</sup>:

Proposition 18. Let M be an ordinal monoid with merge. For every A ⊆ M,

⟨A⟩<sup>grp+</sup> = A⟨A⟩<sup>grp∗</sup> = ⟨A⟩<sup>grp∗</sup>A and ⟨A⟩<sup>grp,ord+</sup> = A⟨A⟩<sup>grp,ord</sup>.

Proof. Note, by definition, that ⟨A⟩<sup>grp∗</sup> = ⟨A⟩<sup>grp+</sup> ∪ {1}, so

$$A\langle A\rangle^{\text{grp}\*} = A\langle A\rangle^{\text{grp}+} \cup A \subseteq \langle A\rangle^{\text{grp}+}.$$

The converse inclusion ⟨A⟩<sup>grp+</sup> ⊆ A⟨A⟩<sup>grp∗</sup> is obtained by induction. Let b ∈ ⟨A⟩<sup>grp+</sup>. If b ∈ A, then b ∈ A⟨A⟩<sup>grp∗</sup> since 1 ∈ ⟨A⟩<sup>grp∗</sup>. If b = cd with c, d ∈

<sup>9</sup> Similarly, for finite words, the Schützenberger-McNaughton-Papert theorem is a consequence of Henckell's algorithm for aperiodic pointlikes; see e.g. [25, Corollary 4.8].

<sup>10</sup> Notice the similarity with the (trivial) identities A<sup>+</sup> = AA<sup>∗</sup> = A<sup>∗</sup>A and A<sup>ord+</sup> = AA<sup>ord</sup>.

⟨A⟩<sup>grp+</sup>, then, by induction, c = ac′ for some a ∈ A and c′ ∈ ⟨A⟩<sup>grp∗</sup>, thus b = a(c′d) ∈ A⟨A⟩<sup>grp∗</sup> since a ∈ A and c′d ∈ ⟨A⟩<sup>grp∗</sup>. Finally, if b = c<sup>grp</sup>, then, again by induction, c = ac′ for some a ∈ A and c′ ∈ ⟨A⟩<sup>grp∗</sup>, and thus b = c<sup>grp</sup> = c · c<sup>grp</sup> = a(c′c<sup>grp</sup>) ∈ A⟨A⟩<sup>grp∗</sup>.

The equality ⟨A⟩<sup>grp+</sup> = ⟨A⟩<sup>grp∗</sup>A is symmetric.

The identity ⟨A⟩<sup>grp,ord+</sup> = A⟨A⟩<sup>grp,ord</sup> is similar. The new case in the induction is when some b ∈ ⟨A⟩<sup>grp,ord+</sup> is of the form c<sup>ω</sup>; then, by induction hypothesis, c = ac′ for some a ∈ A and c′ ∈ ⟨A⟩<sup>grp,ord</sup>, and thus b = c<sup>ω</sup> = c · c<sup>ω</sup> = a(c′c<sup>ω</sup>) ∈ A⟨A⟩<sup>grp,ord</sup>.

Proposition 19. If there are FO-approximants over K and L respectively, then one can effectively construct FO-approximants over K ∪ L and KL.

### 5.2 Construction of FO-approximants for words of finite and ω-length

First, we show how to construct FO-approximants for finite words. This serves at the same time as a building block for more complex cases, as a way to show the proof mechanisms in simpler cases, and as an occasion to comment on differences with previous works.

Lemma 20. Let A ⊆ M, then either

– a · ⟨A⟩<sup>grp+</sup> ⊊ ⟨A⟩<sup>grp+</sup>, for some a ∈ A,

– ⟨A⟩<sup>grp+</sup> · a ⊊ ⟨A⟩<sup>grp+</sup>, for some a ∈ A, or

– ⟨A⟩<sup>grp+</sup> has a maximum.

Proof. Assume the first two items do not hold. Since the first item fails, the map x ↦ a · x is surjective on ⟨A⟩<sup>grp+</sup>, for all a ∈ A. Since ⟨A⟩<sup>grp+</sup> is finite, this means that it is bijective on ⟨A⟩<sup>grp+</sup>. Hence it is also bijective on ⟨A⟩<sup>+</sup>. The negation of the second item has a symmetric consequence. Together we get that ⟨A⟩<sup>+</sup> is a group. Let I be its neutral element. Note first that for all x ∈ ⟨A⟩<sup>+</sup>, I = x<sup>k</sup> for some k, and hence, I ⩽ x<sup>grp</sup>. Set now a<sub>1</sub>, . . . , a<sub>n</sub> to be the elements in A, and define: M = (a<sub>1</sub><sup>grp</sup> · a<sub>2</sub><sup>grp</sup> · · · a<sub>n</sub><sup>grp</sup>)<sup>grp</sup>.

By the above remark a<sub>i</sub> = I<sup>i−1</sup> · a<sub>i</sub> · I<sup>n−i</sup> ⩽ a<sub>1</sub><sup>grp</sup> · a<sub>2</sub><sup>grp</sup> · · · a<sub>n</sub><sup>grp</sup> ⩽ M for all i. Since furthermore for all x, y ⩽ M, x · y ⩽ M and x<sup>grp</sup> ⩽ M, it follows that z ⩽ M for all z ∈ ⟨A⟩<sup>grp+</sup>.

A similar lemma is used in [25], but concludes with the existence of a pseudogroup as the third item.

Lemma 21. For all a ∈ M there exists an FO-approximant from a<sup>+</sup> to ⟨{a}⟩<sup>grp+</sup>.

Construction. Let k be such that a<sup>idem</sup> = a<sup>k</sup>. Define

$$\rho(\overbrace{a\cdots a}^{\text{length }n}) = \begin{cases} a^n & \text{if } n < k, \\ a^{\text{grp}} & \text{otherwise.} \end{cases} \tag{7}$$
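The construction can be tried out in a concrete monoid with merge. The sketch below assumes the merge monoid is a power monoid P(M) ordered by inclusion, with M given by a Python function `mul`, so that the element a is itself a set; the function names and the example (the singleton {1} in the powerset of the integers modulo 4 under addition) are illustrative assumptions, not taken from the paper.

```python
def set_product(X, Y, mul):
    """Product in the power monoid: all products x * y."""
    return frozenset(mul(x, y) for x in X for y in Y)


def power(X, n, mul):
    """The n-th power of X in the power monoid, for n >= 1."""
    result = X
    for _ in range(n - 1):
        result = set_product(result, X, mul)
    return result


def grp(X, mul):
    """Union of the powers of X that lie on the cycle of X, X^2, X^3, ..."""
    powers, seen = [X], {X: 0}
    while True:
        nxt = set_product(powers[-1], X, mul)
        if nxt in seen:
            return frozenset().union(*powers[seen[nxt]:])
        seen[nxt] = len(powers)
        powers.append(nxt)


def lemma21_approximant(a, mul):
    """Return (k, rho), where rho(n) approximates the product of the word a^n."""
    # Least k such that a^k is idempotent, i.e. a^k = a^idem.
    k, a_k = 1, a
    while set_product(a_k, a_k, mul) != a_k:
        a_k = set_product(a_k, a, mul)
        k += 1
    a_grp = grp(a, mul)

    def rho(n):
        if n < 1:
            raise ValueError("rho is only defined on non-empty words")
        # Short words are evaluated exactly; long words are merged into a^grp.
        return power(a, n, mul) if n < k else a_grp

    return k, rho
```

For example, with `add4 = lambda x, y: (x + y) % 4` and `a = frozenset({1})`, one gets k = 4, `rho(3) == {3}` and `rho(n) == {0, 1, 2, 3}` for n ≥ 4, so the exact product of a<sup>n</sup> is always included in `rho(n)`. The preimages of `rho` are the languages "length exactly n" for n < k and "length at least k", each of which is FO-definable.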

We can now use this for proving the finite word case.

Lemma 22. For all A ⊆ M there exists an FO-approximant from A<sup>+</sup> to ⟨A⟩<sup>grp+</sup>.

Proof. We use a double induction on |⟨A⟩<sup>grp+</sup>| and |A|. The induction is guided by Lemma 20. The base case is A = ∅, and the nowhere defined FO-approximant proves it.

First case: a · ⟨A⟩<sup>grp+</sup> ⊊ ⟨A⟩<sup>grp+</sup> for some a ∈ A. This part of the proof is similar to [25, Lemma 6.7]. Let B := A ∖ {a}.

We first construct an FO-approximant from a<sup>+</sup>B<sup>+</sup> to a · ⟨A⟩<sup>grp+</sup>. Indeed, we know by Lemma 21 that there is an FO-approximant from a<sup>+</sup> to ⟨{a}⟩<sup>grp+</sup> ⊆ a · ⟨A⟩<sup>grp∗</sup>. We also know by induction<sup>11</sup> that there is an FO-approximant from B<sup>+</sup> to ⟨B⟩<sup>grp+</sup> ⊆ ⟨A⟩<sup>grp+</sup>. Thus by Proposition 19, one can effectively construct an FO-approximant τ from a<sup>+</sup>B<sup>+</sup> to a · ⟨A⟩<sup>grp∗</sup> · ⟨A⟩<sup>grp+</sup> ⊆ a · ⟨A⟩<sup>grp+</sup>.

We now provide an FO-approximant for (a<sup>+</sup>B<sup>+</sup>)<sup>+</sup> (which is FO-definable), and for this, define the condensation FO-formula φ(x, y) that expresses that "two positions x and y are equivalent if the subword on the interval [x, y] belongs to a<sup>∗</sup>B<sup>∗</sup>" (this can be expressed in first-order logic). Over a word u ∈ (a<sup>+</sup>B<sup>+</sup>)<sup>+</sup>, each of the condensation classes belongs to a<sup>+</sup>B<sup>+</sup> and its image under τ belongs to a · ⟨A⟩<sup>grp+</sup>. Furthermore, still by induction hypothesis<sup>12</sup>, there is an FO-approximant from (a · ⟨A⟩<sup>grp+</sup>)<sup>+</sup> to ⟨A⟩<sup>grp+</sup>. By Lemma 5, we thus obtain an FO-definable map from (a<sup>+</sup>B<sup>+</sup>)<sup>+</sup> to ⟨A⟩<sup>grp+</sup>. It is an FO-approximant by construction.

Using the above case and Proposition 19, it can be easily extended to an FO-approximant from A<sup>+</sup> = AB<sup>∗</sup> (a <sup>+</sup>B+) ∗a ∗ to ⟨A⟩ grp+.

Second case: ⟨A⟩<sup>grp+</sup> · a ⊊ ⟨A⟩<sup>grp+</sup>. This case is symmetric to the first case.

Third case: ⟨A⟩ grp+ has a maximum M. Then the constant map that sends every word u ∈ A<sup>∗</sup> to M is an FO-approximant over A<sup>∗</sup> .

Following similar ideas, we can treat the case of ω-words. Here we define ⟨A⟩ grp,ω as the set {a · b<sup>ω</sup> | a, b ∈ ⟨A⟩ grp+}; equivalently, ⟨A⟩ grp,ω = (⟨A⟩ grp+)<sup>ω</sup>.

Lemma 23. Let M be an ordinal monoid with merge. For all A ⊆ M, there exists an FO-approximant from A<sup>ω</sup> to ⟨A⟩ grp,ω.

### 5.3 Construction of FO-approximants for countable ordinal words

As for the finite case, the proof revolves around a carefully designed case distinction. This one is more complex to establish, and makes use of Green's relations and a precise understanding of the properties of ordinal monoids with merge.

Lemma 24 (Trichotomy principle). Let M be a finite ordinal monoid with merge and A ⊆ M, then either

<sup>11</sup> Indeed, |B| < |A|.

<sup>12</sup> This time, we can use the induction hypothesis because |⟨(a · ⟨A⟩ grp+) +⟩ grp+| < |⟨A⟩ grp+|. Indeed, by Proposition 18, ⟨(a · ⟨A⟩ grp+) +⟩ grp+ ⊆ (a · ⟨A⟩ grp+) <sup>+</sup>⟨(a · ⟨A⟩ grp+) +⟩ grp<sup>∗</sup> ⊆ a · ⟨A⟩ grp+ ⊊ ⟨A⟩ grp+.


The above lemma is key in the proof of the existence of an FO-approximant.

Lemma 25. For all a ∈ M, there exists an FO-approximant over a<sup>ord</sup>.

The proof follows a structure similar to that of Lemma 22 for the finite case. This time, Lemma 24 is the key argument that makes the induction progress, playing the same role as Lemma 20 in the finite case. Note, however, that the second items in Lemmas 20 and 24 are very different in structure, and this entails a different argument for constructing the FO-approximant. It is based on performing in one step the condensation of all the maximal factors of order type ω.

Example 26 (Continuing Example 13). An FO-approximant ρ of π over a<sup>ord</sup> in the ordinal monoid defined in Example 9 can be defined for all u ∈ {a}<sup>ord</sup> as:

$$\rho(u) := \begin{cases} \{1\} & \text{if } \text{dom}(u) \text{ is empty,} \\ \{a, aa\} & \text{if } \text{dom}(u) \text{ is finite and non-empty,} \\ \{a^{\omega}\} & \text{if } \text{dom}(u) \text{ is a non-zero limit ordinal,} \\ \{a^{\omega}a, a^{\omega}aa\} & \text{if } \text{dom}(u) \text{ is an infinite successor ordinal.} \end{cases}$$

Lemma 27. For all A ⊆ M, there exists an FO-approximant from Aord+ to ⟨A⟩ grp,ord+.

Proof. We prove the result by induction on |⟨A⟩ grp,ord+| and |Aord+|. The base case A = ∅ is trivial. If A is non-empty, following Lemma 24, there are three cases to treat.

First case: There exists a ∈ A such that a · ⟨A⟩ grp,ord+ ⊊ ⟨A⟩ grp,ord+. This case is as in the proof for finite words, Lemma 22, using Lemma 25 in place of Lemma 21. The key reason why the proof remains valid is that the hypothesis a · ⟨A⟩ grp,ord+ ⊊ ⟨A⟩ grp,ord+ implies |⟨(a · ⟨A⟩ grp,ord+) ord+⟩ grp,ord+| < |⟨A⟩ grp,ord+| by Proposition 18<sup>13</sup>.

Second case<sup>14</sup>: ⟨⟨A⟩ grp,ω⟩ grp,ord+ ⊊ ⟨A⟩ grp,ord+. By Lemma 23, there is an FO-approximant from A<sup>ω</sup> to ⟨A⟩ grp,ω. By induction hypothesis<sup>15</sup>, we have an FO-approximant from (⟨A⟩ grp,ω) ord+ to ⟨⟨A⟩ grp,ω⟩ grp,ord+ ⊆ ⟨A⟩ grp,ord+. Since

<sup>13</sup> More precisely, we are using the property ⟨B⟩ grp,ord+ = B⟨B⟩ grp,ord of Proposition 18. By thinking of elements of ⟨B⟩ grp,ord+ as "countable ordinal words with merge", this property simply says that every "countable ordinal word with merge" has a first letter. However, countable ordinal words need not have a last letter: this is what makes a hypothesis of the form ⟨A⟩ grp,ord+ · a ⊊ ⟨A⟩ grp,ord+ unusable, and this is the motivation behind the trichotomy principle (Lemma 24).

<sup>14</sup> Note here that it is different from the second case in the proof of Lemma 22.

<sup>15</sup> Indeed, ⟨⟨A⟩ grp,ω⟩ grp,ord+ ⊊ ⟨A⟩ grp,ord+.

the formula finite(x, y) is a condensation FO-formula, we obtain by Lemma 5 an FO-approximant from (A<sup>ω</sup>)<sup>ord+</sup> to ⟨A⟩ grp,ord+. Using Proposition 19 and Lemma 22, we easily extend it to an FO-approximant from Aord+ = A(A<sup>ω</sup>)<sup>ord</sup>A<sup>∗</sup> to ⟨A⟩ grp,ord+.

Third case: x · y = y and x<sup>ω</sup> = y<sup>ω</sup>, for all x, y ∈ ⟨A⟩ grp,ord+. Then the product over A sends a countable ordinal word u ∈ Aord+ to its last letter if the word has a last letter, and to the unique omega power of ⟨A⟩ grp,ord+ if the word has no last letter. Since the languages of the form Aord+a where a ∈ A and {u ∈ Aord+ | dom(u) is a limit ordinal} are all FO-definable, it follows that the product over A is FO-definable.

### 6 Related problems

In this section, we solve two related problems: the decidability of the covering problem (Proposition 28), and the computability of pointlike sets (Proposition 30). Both are direct applications of the key lemmas presented above.

The FO-covering problem asks, given regular languages L, K<sub>1</sub>, …, K<sub>n</sub> (in our case, of countable ordinal words), to determine whether there exist FO-definable languages C<sub>1</sub>, …, C<sub>n</sub> such that L ⊆ ∪<sub>i</sub> C<sub>i</sub> and C<sub>i</sub> ∩ K<sub>i</sub> = ∅ for all i (see [27] for more details). In general, separation problems trivially reduce to covering problems, since L and K are separable if and only if there is a solution to the covering problem for the instance (L, K). In the other direction, there is no known example of a variety with decidable separation problem but undecidable covering problem. We show that a further consequence of the above results is that the FO-pointlike sets in a finite ordinal monoid (see Definition 29) are computable, from which we deduce:

Proposition 28. The FO-covering problem for countable ordinal words is decidable.

Let us now introduce, and explain, the relation with pointlike sets. The FO<sub>k</sub>-closure of a word u is the set [u]<sub>FO<sub>k</sub></sub>, which contains all words that are FO<sub>k</sub>-equivalent to u.

Definition 29. Given a finite ordinal monoid M, the FO-pointlike sets of a map σ : Σ → M are defined by

$$\mathrm{Pl}\_{\mathrm{FO}}(\sigma) ::= \bigcap\_{k \in \mathbb{N}} \downarrow \left\{ \pi(\sigma^{\mathrm{ord}}([u]\_{\mathrm{FO}\_k})) \mid u \in \Sigma^{\mathrm{ord}} \right\},$$

where ↓ X denotes the downward closure of X.

The definition of pointlike sets is in fact more general<sup>16</sup>: given a variety of finite semigroups V one can define a notion of pointlike sets with respect to this

<sup>16</sup> In the following discussion, we focus on finite words, but the notion of variety (of algebras, or of languages) can be extended to countable ordinal words [8] and many other settings [11, §4].

variety. Almeida observed that the separation problem for the variety V (given two regular languages, can they be separated by a V-recognisable language?) is decidable if and only if the V-pointlikes of size 2 of every morphism are computable [2, Prop. 3.4]. The covering problem also has an algebraic counterpart: it is decidable for the variety V if and only if, for every morphism, the collection of all V-pointlike sets of this morphism is computable [2, Prop. 3.6]<sup>17</sup>. Hence, the fact that FO-covering and FO-separation are decidable for finite words is simply a corollary of Henckell's theorem on aperiodic pointlikes [19, Fact 3.7 & Fact 5.31], stating that they are computable. Place & Zeitoun's simpler proof of the decidability of FO-covering for finite words and for ω-words [25] relies on the same principle.<sup>18</sup> Unsurprisingly, our result can be interpreted in the same way: we are implicitly showing the following property, from which one can immediately deduce the computability of Pl<sub>FO</sub>(σ).

Proposition 30. Given a finite ordinal monoid M and σ : Σ → M,

PlFO(σ) = ↓ ⟨{{σ(a)} | a ∈ Σ}⟩ grp,ord .

### 7 Conclusion

In this paper, we have studied the problem of FO-separation over words of countable ordinal length. Our proof is based on the work of Place and Zeitoun over words of length ω [25]. We build an FO-approximant using essentially the same technique as Place and Zeitoun. However, a key difference is that for finite words and ω-words, the proof relies on a case distinction (Lemma 20) which is conceptually similar to the characterisation of groups as semigroups whose translations are bijective. This is no longer sufficient for countable ordinal words because of ω-iterations. In this situation, our new case distinction (Lemma 24) captures the subtle interaction of ω-iteration with groups in finite ordinal monoids. In particular, a difference with previously known algorithms is that we do not close the saturation under subset. This a priori innocuous difference has significant consequences for the proof of completeness: it yields some simplifications in the finite and ω-cases, and it is necessary for the proof to extend to all ordinals.

Of course, the next step is to go to longer words, in particular scattered countable words, or even better to all countable words. Here, there are conceptual difficulties. Let us also stress that, starting from scattered countable words, first-order logic and first-order logic with access to Dedekind cuts begin to differ in expressiveness. Thus several notions of separation have to be studied.

### References

1. Adsul, B., Sarkar, S., Sreejith, A.V.: First-order logic and its infinitary quantifier extensions over countable words (2021)

<sup>17</sup> Beware: there is a typo in the statement of the first item of the proposition.

<sup>18</sup> There is a difference in terminology: they refer to Pl<sub>FO</sub>(φ) as the "optimal imprint with respect to FO on φ".


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## A Faithful and Quantitative Notion of Distant Reduction for Generalized Applications

José Espírito Santo<sup>1</sup>, Delia Kesner<sup>2,3</sup>, and Loïc Peyrot<sup>2</sup>

> <sup>1</sup> Centro de Matemática, Universidade do Minho, Portugal jes@math.uminho.pt
> <sup>2</sup> Université de Paris, CNRS, IRIF, Paris, France {kesner,lpeyrot}@irif.fr
> <sup>3</sup> Institut Universitaire de France (IUF), France

Abstract. We introduce a call-by-name lambda-calculus λJ with generalized applications which integrates a notion of distant reduction that allows β-redexes to be unblocked without resorting to the permutative conversions of generalized applications. We show strong normalization of simply typed terms, and we then fully characterize strong normalization by means of a quantitative typing system. This characterization uses a non-trivial inductive definition of strong normalization (which we relate to others in the literature), based on a weak-head normalizing strategy. Our calculus relates to explicit substitution calculi by means of a translation between the two formalisms which is faithful, in the sense that it preserves strong normalization. We show that our calculus λJ and the well-known calculus ΛJ determine equivalent notions of strong normalization. As a consequence, ΛJ inherits a faithful translation into explicit substitutions, and its strong normalization can be characterized by the quantitative typing system designed for λJ, despite the fact that quantitative subject reduction fails for permutative conversions.

Keywords: Lambda-calculus · Generalized applications · Quantitative types

### 1 Introduction

(Pure) functional programming can be understood by means of a universal model of computation known as the λ-calculus, which is in tight correspondence, by means of the so-called Curry-Howard isomorphism, with propositional intuitionistic logic in Gentzen's natural deduction style. The Curry-Howard isomorphism emphasizes the fact that proof systems on one hand, and programming languages on the other, are two mathematical and computational facets of the same object. The λ-calculus with generalized applications (ΛJ), introduced by Joachimski and Matthes [8], is an extension of the λ-calculus which can be seen as the Curry-Howard counterpart of von Plato's natural deduction with generalized elimination rules [11].

A generalized application in ΛJ is written t(u, y.r). It intuitively means that t is applied to u in the context of the substitution { /y}r. The conversion of the β-redex (λx.t)(u, y.r) then produces two (nested) substitutions {{u/x}t/y}r. But some β-redexes can be blocked by the syntax, e.g. in the term t(u, y.r)(u′, y′.r′), where the (potential) application of r = λx.s to u′ remains hidden. An iterated generalized application t(u, y.r)(u′, y′.r′) may be rearranged as t(u, y.r(u′, y′.r′)) by a permutative conversion called π. Rule π is then an unblocker of stuck β-redexes: the contractum t(u, y.(λx.s)(u′, y′.r′)) unveils the desired application of r to u′. Rule π, together with rule β, allows natural deduction proofs to be brought to a "fully normal" form [11] enjoying the subformula property. Computationally, ΛJ defines a call-by-name operational semantics; a call-by-value variant has been proposed in [5], but this is out of the scope of this paper.

Strong normalization w.r.t. the two rules β and π has been characterized by typability with (idempotent) intersection types by Matthes [10]: a term is typable if and only if it is strongly normalizing. However, this characterization is just qualitative. A different flavor of intersection types, called non-idempotent, offers a more powerful quantitative characterization of strong normalization, in the sense that the length of the longest reduction sequence to normal form starting at a typable term t is bounded by the size of its type derivation. However, quantitative types were never used in the framework of generalized applications, and it is our purpose to propose and study one such typing system.

Quantitative types allow for simple combinatorial proofs of strong normalization, without any need to use reducibility or computability arguments. More remarkably, they also provide a refined tool to understand permutative rules. For instance, in ΛJ, rule π is not quantitatively sound (i.e. π does not enjoy quantitative subject reduction), although π becomes valid in an idempotent framework. Hence, a good question is: how can we unblock redexes to reach normal forms in a quantitative model of computation based on generalized applications?

Our solution is to adopt the paradigm of distant reduction [2] coming from explicit substitution (ES) calculi, which extends the key concept of β-redex, so that we may find the λ-abstraction hidden under a sequence of nested generalized applications. This is essentially similar to adopting a different permutation rule, converting t(u, y.λz.s) to λz.t(u, y.s). However, the permutation rule is mostly a way to overcome syntactical limitations, while distant β is a way to put emphasis on the computational behavior of the calculus: it is at the β-step that resources are consumed, not during the permutations.

The syntax of the ΛJ-calculus will thus be equipped with an operational call-by-name semantics given by distant β, but without π. The resulting calculus is called λJ. As a major contribution, we prove a characterization of strong normalization in terms of typability in our quantitative system. In this proof, the soundness result (typability implies strong normalization) is obtained by combinatorial arguments, with the size of typing derivations decreasing at each step. For the completeness result (strong normalization implies typability) we need an inductive characterization of the terms that are strongly normalizing for distant β: this is a non-trivial technical contribution of the paper.

As mentioned above, we draw inspiration for our distant β rule from calculi with explicit substitutions, having in mind the usual translation of t(u, y.r) to the explicit substitution [tu/y]r (a let-binding of tu over y in r). As such, we expect the dynamic behavior of our calculus to be faithful to explicit substitutions. Such a translation, however, does not in general preserve strong normalization. Indeed, in a β-redex (λx.t)(u, y.r), the interaction of λx.t with the argument u is materialized by the internal substitution in the contractum term {{u/x}t/y}r, as mentioned before. But such interaction is elusive: if the external substitution is vacuous (that is, if y is not free in r), β-reduction will simply throw away the λ-abstraction λx.t and its argument u, whereas (λx.t)u may reduce in the context of the explicit substitution [(λx.t)u/y]r. The different interaction between the abstraction and its argument in the two mentioned models of computation has important consequences. For instance, let δ<sup>◦</sup> := λx.x(x, w.w) be the encoding of δ = λx.xx as a ΛJ-term. Then, if y ∉ fv(r) and r is normal, the only thing the term δ<sup>◦</sup>(δ<sup>◦</sup>, y.r) can do is to reduce to r, whereas δδ may reduce forever in the context of the vacuous explicit substitution [δδ/y]r.
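Spelled out, the only step available to δ<sup>◦</sup>(δ<sup>◦</sup>, y.r) is the following one; it uses only the β-rule for generalized applications recalled above and the hypothesis that y is not free in r:

$$\delta^{\circ}(\delta^{\circ}, y.r) \;\to\_{\beta}\; \{\{\delta^{\circ}/x\}(x(x, w.w))/y\}r \;=\; \{\delta^{\circ}(\delta^{\circ}, w.w)/y\}r \;=\; r$$

The copy δ<sup>◦</sup>(δ<sup>◦</sup>, w.w) produced by the inner substitution is discarded by the vacuous outer one, so no further step is possible when r is normal.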

That is why we propose a new, type-preserving, encoding of terms with generalized applications into terms with explicit substitutions. Using this new encoding and quantitative types, we show that strong normalization of the source term with generalized applications is equivalent to the strong normalization of the target term with explicit substitutions.

As a final contribution, we compare λJ-strong normalization to that of other calculi, including the original ΛJ. We extract new results for the latter, such as a faithful translation to ES and a new normalizing strategy. Moreover, we obtain a quantitative characterization of ΛJ-strong normalization, where the bound for reduction given by the size of type derivations only holds for β (and not for π).

Plan of the paper. Sec. 2 presents our calculus with distant β. Sec. 3 provides an inductive characterization of strongly normalizing terms. Sec. 4 is about non-idempotent intersection types. Sec. 5 shows the faithful translation to ES. Sec. 6 contains the comparisons with other calculi. Sec. 7 concludes. Full proofs are available in [6].

### 2 A Calculus with Generalized Applications

In this section we define our calculus λJ with generalized applications and give some introductory observations on strong normalization in that system.

### 2.1 Syntax and Semantics

We start with some general notations. Given a reduction relation →<sub>R</sub>, we write →<sup>∗</sup><sub>R</sub> (resp. →<sup>+</sup><sub>R</sub>) for the reflexive-transitive (resp. transitive) closure of →<sub>R</sub>. A term t is said to be in R-normal form (written R-nf) iff there is no t′ such that t →<sub>R</sub> t′. A term t is said to be R-strongly normalizing (written t ∈ SN (R)) iff there is no infinite R-sequence starting at t. R is strongly normalizing iff every term is R-strongly normalizing. When R is finitely branching, ||t||<sub>R</sub> denotes the maximal length of an R-reduction sequence to R-nf starting at t, for every t ∈ SN (R).
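The notions of normal form and maximal reduction length can be illustrated for an arbitrary finitely branching relation. The following is a minimal Python sketch, not part of the calculus itself: the function names and the toy relation on integers are hypothetical, and terms are assumed to be hashable so that results can be memoised.

```python
from functools import lru_cache

def is_normal_form(term, successors):
    """A term is in R-normal form iff it has no one-step reduct."""
    return len(successors(term)) == 0

def longest_reduction(term, successors):
    """Length of the longest R-reduction sequence starting at `term`.

    `successors(t)` returns the finite list of one-step reducts of t.
    The recursion terminates only when `term` is R-strongly normalizing.
    """
    @lru_cache(maxsize=None)
    def go(t):
        return max((1 + go(s) for s in successors(t)), default=0)
    return go(term)

# Hypothetical toy relation: n reduces to n-1 and to n-2, while non-negative.
def toy_successors(n):
    return [m for m in (n - 1, n - 2) if m >= 0]
```

For a term that is not strongly normalizing the recursion would not terminate, which matches the fact that the measure is only defined on SN (R).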

The set of terms generated by the following grammar is denoted by T<sup>J</sup> .

$$(\mathbf{Terms})\ t, u, r, s ::= x \mid \lambda x. t \mid t(u, x. r)$$

The term t(u, x.r) is called a generalized application, and the part x.r is sometimes referred to as the continuation of that application. Free variables of terms are defined as usual, notably fv(t(u, x.r)) := fv(t) ∪ fv(u) ∪ (fv(r) \ {x}). We also work modulo α-conversion, denoted =<sub>α</sub>, so that bound variables can be systematically renamed. We use I to denote the identity function λz.z.
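The grammar and the free-variable function can be sketched in Python as follows. The tagged-tuple encoding and the constructor names `Var`, `Lam` and `App` are assumptions made for illustration only; they are not notation from the paper.

```python
def Var(x):
    return ("var", x)

def Lam(x, t):
    return ("lam", x, t)

def App(t, u, x, r):
    """Generalized application t(u, x.r); x is bound in r only."""
    return ("app", t, u, x, r)

def fv(t):
    """Free variables of a term, following the definition in the text."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return fv(t[2]) - {t[1]}
    _, f, u, x, r = t
    return fv(f) | fv(u) | (fv(r) - {x})

I = Lam("z", Var("z"))  # the identity function
```

Note that in `App(Var("y"), Var("u"), "y", Var("y"))` the first occurrence of y is free, since the continuation binder scopes over r only.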

We introduce contexts (terms with one occurrence of the hole ♦) and the special distant contexts:

$$\begin{array}{llll}(\mathbf{Contexts}) & \mathsf{C} ::= \diamondsuit \mid \lambda x. \mathsf{C} \mid \mathsf{C}(u, x.r) \mid t(\mathsf{C}, x.r) \mid t(u, x. \mathsf{C})\\ (\mathbf{Distant} \, \mathbf{Contexts}) & \mathsf{D} ::= \diamondsuit \mid t(u, x. \mathsf{D})\end{array}$$

The term C⟨t⟩ denotes C where ♦ is replaced by t, so that capture of variables may occur. Given a rewriting rule R ⊆ T<sup>J</sup> × T<sup>J</sup>, →<sub>R</sub> denotes the reduction relation generated by the closure of R under all contexts.

We say that t has an abstraction shape iff t = D⟨λx.u⟩. The substitution operation is capture-avoiding and defined as usual, in particular {u/x}(t(s, y.r)) := ({u/x}t)({u/x}s, y.{u/x}r).
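Capture-avoiding substitution can be sketched as follows. This is one possible implementation under stated assumptions: terms are tagged tuples `("var", x)`, `("lam", x, t)` and `("app", t, u, x, r)`, and the helper names `fresh` and `under_binder`, as well as the fresh-name scheme `v0, v1, ...`, are hypothetical.

```python
import itertools

def fv(t):
    """Free variables of a tagged-tuple term."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return fv(t[2]) - {t[1]}
    _, f, u, x, r = t
    return fv(f) | fv(u) | (fv(r) - {x})

def fresh(avoid):
    """Return a name v0, v1, ... that does not occur in `avoid`."""
    return next(n for n in (f"v{i}" for i in itertools.count()) if n not in avoid)

def under_binder(u, x, y, body):
    """Push {u/x} under the binder y of y.body, renaming y if it would capture."""
    if y == x:                      # x is shadowed: nothing to substitute
        return y, body
    if y in fv(u):                  # alpha-rename y so that it cannot capture u
        z = fresh(fv(u) | fv(body) | {x})
        body, y = subst(("var", z), y, body), z
    return y, subst(u, x, body)

def subst(u, x, t):
    """Capture-avoiding substitution {u/x}t."""
    if t[0] == "var":
        return u if t[1] == x else t
    if t[0] == "lam":
        return ("lam",) + under_binder(u, x, t[1], t[2])
    _, f, s, y, r = t
    return ("app", subst(u, x, f), subst(u, x, s)) + under_binder(u, x, y, r)
```

The application case mirrors the clause {u/x}(t(s, y.r)) := ({u/x}t)({u/x}s, y.{u/x}r), with the binder y renamed when needed, as permitted by α-conversion.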

### 2.2 Towards a Call-by-Name Operational Semantics

The T<sup>J</sup> -syntax can be equipped with different rewriting rules, as discussed in the introduction. We use the generic notation T<sup>J</sup> [R] to denote the calculus given by the syntax T<sup>J</sup> equipped with the reduction relation →R.

Now, if we consider t<sub>0</sub> := t(u′, y′.λx.s)(u, y.r) in the calculus T<sup>J</sup>[β], where

$$(\lambda x.s)(u,y.r) \mapsto\_{\beta} \{\{u/x\}s/y\}r$$

we can see that the term t<sub>0</sub> is stuck since the subterm λx.s is not close to u. This is where the following rule π plays the role of an unblocker of β-redexes:

$$t(u, y.r)(u', y'.r') \mapsto\_{\pi} t(u, y.r(u', y'.r'))$$

Indeed, t<sub>0</sub> →<sub>π</sub> t(u′, y′.(λx.s)(u, y.r)) →<sub>β</sub> t(u′, y′.{{u/x}s/y}r). More generally, given t<sub>1</sub> := D⟨λx.s⟩(u, y.r), with D ≠ ♦, a sequence of π-steps reduces the term t<sub>1</sub> to D⟨(λx.s)(u, y.r)⟩. A further β-step produces D⟨{{u/x}s/y}r⟩. So, the original ΛJ-calculus [8], which is exactly T<sup>J</sup>[β, π], has a derived notion of distant β rule, based on π, which can be specified by the following rule:

$$\mathsf{D}\langle\lambda x.s\rangle(u,y.r)\mapsto\mathsf{D}\langle\{\{u/x\}s/y\}r\rangle\tag{1}$$

However, π-reduction is not only about unblocking redexes, as witnessed by D⟨x⟩(u, y.r) →<sup>∗</sup><sub>π</sub> D⟨x(u, y.r)⟩. So it is reasonable to keep terms of the form D⟨x⟩(u, y.r) without reducing them further, as those π-steps do not contribute to unblocking more β-redexes. The absence of terms of the form D⟨λx.s⟩(u, y.r) already gives a reasonable notion of normal form which, in particular, already enjoys the subformula property, as will be seen in Sec. 2.3.

Still, we will not reduce as in (1) because such a rule, as well as π itself, does not admit a quantitative semantics (cf. Sec. 4.3). We then choose to unblock β-redexes with the following rule p<sub>2</sub> instead<sup>4</sup>:

$$t(u', y'.\lambda x.s) \mapsto\_{\mathsf{p}\_2} \lambda x.t(u', y'.s)$$

so that t<sub>1</sub> given above reduces in several p<sub>2</sub>-steps to (λx.D⟨s⟩)(u, y.r), which can now be further reduced with β since it is no longer stuck. If we reduce it, we obtain {{u/x}D⟨s⟩/y}r; and since free variables in u cannot be captured by D, this is equal to {D⟨{u/x}s⟩/y}r. We thus obtain our distant rule:
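As a sanity check, the computation can be carried out for a distant context with a single layer, D = t′(u′, y′.♦), assuming by α-conversion that x is not free in t′ or u′ and that y′ is not free in u:

$$t'(u', y'.\lambda x.s)(u, y.r) \;\to\_{\mathsf{p}\_2}\; (\lambda x.t'(u', y'.s))(u, y.r) \;\to\_{\beta}\; \{\{u/x\}(t'(u', y'.s))/y\}r \;=\; \{t'(u', y'.\{u/x\}s)/y\}r$$

The last equality pushes the substitution {u/x} through t′ and u′, where x does not occur, and under the binder y′, which cannot capture u. The right-hand side is exactly {D⟨{u/x}s⟩/y}r for this choice of D.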

Definition 1. We write λJ for our new calculus T<sup>J</sup>[dβ], where the distant β-rule is defined as follows:

$$\mathsf{D}\langle\lambda x.t\rangle(u,y.r) \mapsto\_{\mathsf{d}\beta} \{\mathsf{D}\langle\{u/x\}t\rangle/y\}r$$

A reduction step t<sub>1</sub> →<sub>dβ</sub> t<sub>2</sub> is said to be erasing iff the reduced dβ-redex in t<sub>1</sub> is of the form D⟨λx.t⟩(u, y.r) with x ∉ fv(t) or y ∉ fv(r).
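The distant β-rule at the root of a term can be sketched in Python as follows. This is an illustrative implementation under assumptions: terms are tagged tuples `("var", x)`, `("lam", x, t)` and `("app", t, u, x, r)`, and the names `contract_inside` and `dbeta_root` are hypothetical. Binders of the distant context D are renamed when they would capture free variables of u, which α-conversion permits.

```python
import itertools

def fv(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return fv(t[2]) - {t[1]}
    _, f, u, x, r = t
    return fv(f) | fv(u) | (fv(r) - {x})

def fresh(avoid):
    return next(n for n in (f"v{i}" for i in itertools.count()) if n not in avoid)

def under_binder(u, x, y, body):
    if y == x:
        return y, body
    if y in fv(u):
        z = fresh(fv(u) | fv(body) | {x})
        body, y = subst(("var", z), y, body), z
    return y, subst(u, x, body)

def subst(u, x, t):
    """Capture-avoiding substitution {u/x}t."""
    if t[0] == "var":
        return u if t[1] == x else t
    if t[0] == "lam":
        return ("lam",) + under_binder(u, x, t[1], t[2])
    _, f, s, y, r = t
    return ("app", subst(u, x, f), subst(u, x, s)) + under_binder(u, x, y, r)

def contract_inside(d, u):
    """If d = D<lam x.t>, return D<{u/x}t>; otherwise return None."""
    if d[0] == "lam":
        return subst(u, d[1], d[2])
    if d[0] == "app":
        _, f, a, y, r = d
        if y in fv(u):              # rename the binder of D so it cannot capture u
            z = fresh(fv(u) | fv(r))
            r, y = subst(("var", z), y, r), z
        inner = contract_inside(r, u)
        return None if inner is None else ("app", f, a, y, inner)
    return None

def dbeta_root(t):
    """Contract t = D<lam x.s>(u, y.r) to {D<{u/x}s>/y}r; None if t is not a redex."""
    if t[0] != "app":
        return None
    _, head, u, y, r = t
    inner = contract_inside(head, u)
    return None if inner is None else subst(inner, y, r)
```

Under this encoding, applying `dbeta_root` to δ<sup>◦</sup>(δ<sup>◦</sup>, y.r), with r a variable different from y, returns r: the step is erasing in the sense of the definition above.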

It is obvious that →<sub>dβ</sub> ⊂ →<sup>+</sup><sub>β,p<sub>2</sub></sub>. Some other variants of the p<sub>2</sub>-rule are possible, like D⟨λx.t⟩(u, y.r) ↦ (λx.D⟨t⟩)(u, y.r) or D⟨λx.t⟩ ↦<sub>p<sub>2</sub></sub> λx.D⟨t⟩, in both cases for D ≠ ♦, but we do not develop them. However, while most of the paper is about λJ, brief comparisons with the calculi ΛJ and T<sup>J</sup>[β, p<sub>2</sub>] are considered in Sec. 6.

### 2.3 Some (Un)typed Properties of λJ

Lemma 1. The grammar m characterizes dβ-normal forms.

$$\mathfrak{m} ::= x \mid \lambda x. \mathfrak{m} \mid \mathfrak{m}\_{\mathsf{var}}(\mathfrak{m}, x. \mathfrak{m}) \qquad \mathfrak{m}\_{\mathsf{var}} ::= x \mid \mathfrak{m}\_{\mathsf{var}}(\mathfrak{m}, x. \mathfrak{m}\_{\mathsf{var}})$$
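The statement of Lemma 1 can be checked on examples with a small Python sketch that implements the two grammars as mutually recursive predicates and compares them with a direct search for dβ-redexes. The tagged-tuple encoding of terms and all function names are assumptions made for illustration.

```python
def is_nf(t):
    """Membership in the grammar m."""
    if t[0] == "var":
        return True
    if t[0] == "lam":
        return is_nf(t[2])
    _, f, u, x, r = t
    return is_nf_var(f) and is_nf(u) and is_nf(r)

def is_nf_var(t):
    """Membership in the grammar m_var."""
    if t[0] != "app":
        return t[0] == "var"
    _, f, u, x, r = t
    return is_nf_var(f) and is_nf(u) and is_nf_var(r)

def has_abstraction_shape(t):
    """True iff t = D<lam x.u> for some distant context D."""
    if t[0] == "lam":
        return True
    return t[0] == "app" and has_abstraction_shape(t[4])

def has_dbeta_redex(t):
    """True iff some subterm of t has the form D<lam x.s>(u, y.r)."""
    if t[0] == "var":
        return False
    if t[0] == "lam":
        return has_dbeta_redex(t[2])
    _, f, u, x, r = t
    return has_abstraction_shape(f) or any(has_dbeta_redex(s) for s in (f, u, r))
```

On every term, `is_nf(t)` is expected to agree with `not has_dbeta_redex(t)`; the sketch only tests this on examples and does not prove it.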

We already saw that, once β is generalized to dβ, π is not needed anymore to unblock β-redexes; the next Lemma says that π preserves dβ-nfs, so it does not bring anything new to dβ-nfs either. The proof uses Lem. 1, and it proceeds by simultaneous induction on m and mvar.

Lemma 2. If t is a dβ-nf, and t →<sub>π</sub> t′, then t′ is a dβ-nf.

Let us discuss now some properties related to (simple) typability for generalized applications [8], a system that we call ST. Recall the following typing rules, where σ, ρ, τ ::= a | σ → ρ, and a belongs to a set of base type variables:

$$\frac{}{\Gamma, x:\sigma \vdash x:\sigma} \quad \frac{\Gamma, x:\sigma \vdash t:\rho}{\Gamma \vdash \lambda x.t:\sigma \to \rho} \quad \frac{\Gamma \vdash t:\rho \to \tau \quad \Gamma \vdash u:\rho \quad \Gamma, y:\tau \vdash r:\sigma}{\Gamma \vdash t(u,y.r):\sigma}$$

We write Γ ⊢<sub>ST</sub> t : σ if there is a type derivation in system ST ending in Γ ⊢ t : σ. In the following result, we refer to simple types as formulas.

<sup>4</sup> Rule p<sub>2</sub> is used in [7,3] along with two other permutation rules p<sub>1</sub> and p<sub>3</sub> to reduce T<sup>J</sup>-terms to a fragment isomorphic to natural deduction.

Lemma 3 (Subformula Property). If Φ = Γ ⊢<sub>ST</sub> m : τ then every formula in the derivation Φ is a subformula of τ or a subformula of some formula in Γ.

Proof. The lemma is proved together with another statement: If Ψ = Γ ⊢<sub>ST</sub> m<sub>var</sub> : τ then every formula in Ψ is a subformula of some formula in Γ. The proof is by simultaneous induction on Φ and Ψ.

We close this section with the following:

Theorem 1. If t is simply typable, i.e. Γ ⊢<sub>ST</sub> t : σ, then t ∈ SN (dβ).

The proof is by a map into the λ-calculus which produces a simulation when the λ-calculus is equipped with the following σ-rules [13]:

$$(\lambda x.M)NN' \mapsto\_{\sigma\_1} (\lambda x.MN')N \qquad (\lambda x.\lambda y.M)N \mapsto\_{\sigma\_2} \lambda y.(\lambda x.M)N$$

### 3 Inductive Characterization of Strong Normalization

In this section we give an inductive characterization of strong normalization (ISN) for λJ and prove it correct. This characterization will be useful to show completeness of the type system that we are going to present in Sec. 4.1, as well as to compare strong normalization of λJ to that of T<sup>J</sup>[β, p<sub>2</sub>] and ΛJ.

### 3.1 ISN in the λ-Calculus Through Weak-Head Contexts

As an introduction, we first look at the case of the ISN for the λ-calculus (ISN (β)), on which our forthcoming definition of ISN (dβ) elaborates. A usual way to define ISN (β) is by the following rules [12], where the general notation t**r** abbreviates (…(t r<sub>1</sub>)…) r<sub>n</sub> for some n ≥ 0.

$$\frac{r\_1, \dots, r\_n \in \mathcal{ISN}(\beta)}{x\mathbf{r} \in \mathcal{ISN}(\beta)} \qquad \frac{t \in \mathcal{ISN}(\beta)}{\lambda x.t \in \mathcal{ISN}(\beta)} \qquad \frac{\{u/x\}t\mathbf{r}, u \in \mathcal{ISN}(\beta)}{(\lambda x.t)u\mathbf{r} \in \mathcal{ISN}(\beta)}$$

One shows that t ∈ SN (β) if and only if t ∈ ISN (β).

The reduction strategy underlying the definition of ISN (β) is the following one: reduce terms to weak-head normal form, and then iterate reduction inside the components of the weak-head normal form, without any need to come back to the head of the term. Weak-head normal terms are of two kinds: (neutral terms) n ::= x | nt and (answers) a ::= λx.t. Neutral terms cannot produce any head β-redex. On the contrary, answers can create a β-redex when given at least one argument. In the case of the λ-calculus, these are only abstractions. If the term is not a weak-head term, a redex can be located with a weak-head context W ::= ♦ | Wt. These concepts allow a different definition of ISN (β).

$$\frac{}{x \in \mathcal{ISN}(\beta)} \quad \frac{\mathtt{n}, t \in \mathcal{ISN}(\beta)}{\mathtt{n}t \in \mathcal{ISN}(\beta)} \quad \frac{t \in \mathcal{ISN}(\beta)}{\lambda x.t \in \mathcal{ISN}(\beta)} \quad \frac{\mathtt{W}\langle\{u/x\}t\rangle, u \in \mathcal{ISN}(\beta)}{\mathtt{W}\langle(\lambda x.t)u\rangle \in \mathcal{ISN}(\beta)}$$

Weak-head contexts are an alternative to the meta-syntactic notation r of vectors of arguments. Notice that there is one rule for each kind of neutral term, one rule for answers and one rule for terms which are not weak-head normal forms.
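The decomposition of a λ-term into a weak-head context and a β-redex can be sketched in Python as follows. The encoding `("var", x)`, `("lam", x, t)`, `("ap", t, u)` of plain λ-terms and the function names are assumptions made for illustration.

```python
def is_neutral(t):
    """Neutral terms: n ::= x | n t."""
    return t[0] == "var" or (t[0] == "ap" and is_neutral(t[1]))

def is_answer(t):
    """Answers of the lambda-calculus are exactly the abstractions."""
    return t[0] == "lam"

def head_redex(t):
    """Return the beta-redex r such that t = W<r> with W ::= hole | W t.

    Returns None when t is weak-head normal (neutral or an answer).
    """
    while t[0] == "ap":
        if t[1][0] == "lam":        # t = (lam x.s) u is the redex in the hole
            return t
        t = t[1]                    # descend along the left spine: W ::= W t
    return None
```

Since the search only follows the left spine of applications, at most one redex is returned, which reflects the fact that exactly one of the four rules applies to each term.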

### 3.2 ISN for dβ

We define ISN (dβ) with the same methodology as before. Hence, we first have to define neutral terms, answers and weak-head contexts.

Definition 2. We consider the following grammars:

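$$\begin{array}{ll}(\textbf{Neutral terms}) & \mathtt{n} ::= x \mid \mathtt{n}(u, x.\mathtt{n})\\ (\textbf{Answers}) & \mathtt{a} ::= \lambda x.t \mid \mathtt{n}(u, x.\mathtt{a})\\ (\textbf{Neutral distant contexts}) & \mathsf{D\_n} ::= \diamondsuit \mid \mathtt{n}(u, x.\mathsf{D\_n})\\ (\textbf{Weak-head contexts}) & \mathtt{W} ::= \diamondsuit \mid \mathtt{W}(u, x.r) \mid \mathtt{n}(u, x.\mathtt{W})\end{array}$$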

Notice that n and a are disjoint and stable under dβ-reduction. Also D<sub>n</sub> ⊊ W.

Example 1 (Decomposition). Let t = x<sub>1</sub>(x<sub>2</sub>, y<sub>1</sub>.I(I, z.I))(x<sub>3</sub>, y.II). Then, there are two decompositions of t in terms of a redex r and a weak-head context W: either W = ♦ and r = t, or W = x<sub>1</sub>(x<sub>2</sub>, y<sub>1</sub>.♦)(x<sub>3</sub>, y.II) and r = I(I, z.I). In both cases t = W⟨r⟩. We will rule out the first possibility by defining next a restriction of the β-rule, securing uniqueness of such kind of decomposition in all cases.

The strategy underlying our definition of ISN (dβ) will be the weak-head strategy →wh, defined as the closure under W of the following restricted β-rule:

$$\mathsf{D\_n}\langle\lambda x.t\rangle(u,y.r) \mapsto \{\mathsf{D\_n}\langle\{u/x\}t\rangle/y\}r$$

The restriction of D to a neutral distant context D<sup>n</sup> is what allows determinism of our forthcoming Def. 3.

Lemma 4. The reduction →wh is deterministic.

As in the case of the λ-calculus, weak-head normal forms are either neutral terms or answers. This time, answers are not only abstractions, but also abstractions under a (neutral) distant context. Because of distance, these terms can also create a dβ-redex when applied to an argument, as seen in the next example.

Example 2. Consider again term t of Ex. 1. If the third form in the grammar of W was disallowed, then it would not be possible to write t as W⟨r⟩, with r a restricted redex. In that case, the reduction strategy associated with ISN (dβ) would consider t as a weak-head normal form, and start reducing the subterms of t, including I(I, z.I). Now, the latter would eventually reach I and suddenly the whole term t′ = x<sub>1</sub>(x<sub>2</sub>, y<sub>1</sub>.I)(x<sub>3</sub>, y.r′) would be a weak-head redex again: the typical separation between an initial weak-head reduction phase and a later internal reduction phase, as it is the case in the λ-calculus, would be lost in our framework. This is a subtle point due to the distant character of rule dβ which explains the complexity of Def. 2.

Lemma 5. Let t ∈ T<sup>J</sup> . Then t is in wh-normal form iff t ∈ n ∪ a.
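The grammars of Def. 2 and the unique weak-head decomposition can be sketched in Python as follows. Terms are assumed to be encoded as tagged tuples `("var", x)`, `("lam", x, t)` and `("app", t, u, x, r)`; the function names are hypothetical. The sketch relies on the observation that the terms of the form D<sub>n</sub>⟨λx.t⟩ are exactly the answers, so a restricted redex is an application whose left component is an answer.

```python
def is_neutral(t):
    """n ::= x | n(u, x.n)"""
    if t[0] == "var":
        return True
    return t[0] == "app" and is_neutral(t[1]) and is_neutral(t[4])

def is_answer(t):
    """a ::= lam x.t | n(u, x.a)"""
    if t[0] == "lam":
        return True
    return t[0] == "app" and is_neutral(t[1]) and is_answer(t[4])

def wh_redex(t):
    """Return the restricted redex r with t = W<r>, or None if t is wh-normal."""
    if t[0] != "app":
        return None
    _, f, u, x, r = t
    if is_answer(f):                # W = hole: t itself is D_n<lam x.s>(u, x.r)
        return t
    if is_neutral(f):               # W = n(u, x.W'): search in the continuation
        return wh_redex(r)
    return wh_redex(f)              # W = W'(u, x.r): search in the left component
```

On the term t of Ex. 1, with II encoded here as I(I, w.w) (an assumption of this sketch), `wh_redex` selects the inner redex I(I, z.I), in line with the second decomposition of that example.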

Our inductive definition of strong normalization follows.

Definition 3 (Inductive Strong Normalization). We consider the following inductive predicate:

$$\frac{}{x \in \mathcal{ISN}(\mathsf{d}\beta)} \qquad \frac{\mathtt{n}, u, r \in \mathcal{ISN}(\mathsf{d}\beta) \quad r \in \mathsf{wh}\text{-}nf}{\mathtt{n}(u, x.r) \in \mathcal{ISN}(\mathsf{d}\beta)}\ (\mathtt{snapp})$$

$$\frac{t \in \mathcal{ISN}(\mathsf{d}\beta)}{\lambda x.t \in \mathcal{ISN}(\mathsf{d}\beta)} \qquad \frac{\mathtt{W}\langle\{\mathsf{D\_n}\langle\{u/x\}t\rangle/y\}r\rangle,\ \mathsf{D\_n}\langle t\rangle,\ u \in \mathcal{ISN}(\mathsf{d}\beta)}{\mathtt{W}\langle\mathsf{D\_n}\langle\lambda x.t\rangle(u, y.r)\rangle \in \mathcal{ISN}(\mathsf{d}\beta)}\ (\mathtt{snbeta})$$

Notice that every term can be written according to the conclusions of the previous rules, so that the grammar t, u, r ::= x | λx.t | n(t, x.r) | W⟨D<sub>n</sub>⟨λx.t⟩(u, y.s)⟩, with r ∈ wh-nf, also defines the syntax T<sup>J</sup>. Moreover, at most one rule in the previous definition applies to each term, i.e. the rules are deterministic. An equivalent, but non-deterministic definition, can be given by removing the side condition "r ∈ wh-nf" in rule (snapp). Indeed, this (weaker) rule would overlap with rule (snbeta) for terms in which the weak-head context lies in the last continuation, as for instance in x(u, y.y)(u′, y′.II). Notice the difference with the λ-calculus: the head of a term with generalized applications can be either on the left of the term (as in the λ-calculus), or recursively on the left in a continuation. We conclude with the following result.

Theorem 2. SN (dβ) = ISN (dβ).

### 4 Quantitative Types Characterize Strong Normalization

We proved that simply typable terms are strongly normalizing in Sec. 2.3. In this section we use non-idempotent intersection types to fully characterize strong normalization, so that strongly normalizing terms are also typable. First we introduce the typing system, next we prove the characterization and finally we study the quantitative behavior of π and give in particular an example of failure.

#### 4.1 The Typing System

We now define our quantitative type system ∩J for T<sup>J</sup> -terms and we show that strong normalization in λJ exactly corresponds to ∩J typability.

Given a countably infinite set BTV of base type variables a, b, c, . . ., we define the following sets of types:

$$\begin{array}{c} (\mathtt{types}) \ \sigma, \tau, \rho ::= a \in BTV \mid \mathcal{M} \to \sigma\\ (\mathtt{multiset} \ \mathtt{types}) \ \mathcal{M}, \mathcal{N} ::= [\sigma\_i]\_{i \in I} \text{ where } I \text{ is a finite set} \end{array}$$

The empty multiset is denoted [ ]. We use |M| to denote the size of the multiset, thus if M = [σ<sub>i</sub>]<sub>i∈I</sub> then |M| = |I|. We introduce a choice operator on multiset types: if M ≠ [ ], then #(M) = M, otherwise #([ ]) = [σ], where σ is an arbitrary type. This operator is used to guarantee that there is always a typing witness for all the subterms of typed terms.
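To make these operations concrete, here is a minimal Python sketch of multiset types, their size, and the choice operator. The encoding (strings for base types, tagged tuples for arrows, sorted tuples for multisets) and all function names are assumptions of this sketch, not notation from the paper; in particular, the arbitrary witness type of the choice operator is passed explicitly by the caller.

```python
# Illustrative encoding (our assumption): a base type is a string, an arrow type
# M -> sigma is the tuple ("arrow", M, sigma), and a multiset type is a tuple
# kept in sorted order, so that equality ignores the order of elements.

def multiset(*types):
    """Build the multiset type [t1, ..., tn]; repetitions are kept."""
    return tuple(sorted(types, key=repr))

def arrow(m, sigma):
    """Build the type M -> sigma."""
    return ("arrow", m, sigma)

def size(m):
    """|M|: the number of elements, counted with multiplicity."""
    return len(m)

def union(m, n):
    """Multiset union: multiplicities are added."""
    return multiset(*m, *n)

def choice(m, witness):
    """#(M) = M if M is non-empty, and [witness] otherwise.

    The witness type is arbitrary in the definition; here the caller supplies it.
    """
    return m if m else multiset(witness)

rho = arrow(multiset(arrow(multiset("a"), "a"), "a"), "a")  # [[a] -> a, a] -> a
print(size(multiset("a", "a")))                  # 2: types are non-idempotent
print(choice(multiset(), rho) == multiset(rho))  # True
```

Keeping repetitions in `multiset` is what makes the types non-idempotent: `[a, a]` and `[a]` are different objects.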

Typing environments (or just environments), written Γ, ∆, Λ, are functions from variables to multiset types assigning the empty multiset to all but a finite set of variables. The domain of Γ is given by dom(Γ) := {x | Γ(x) ≠ [ ]}. The union of environments, written Γ ∧ ∆, is defined by (Γ ∧ ∆)(x) := Γ(x) ⊔ ∆(x), where ⊔ denotes multiset union. This notion is extended to several environments as expected, so that ∧<sub>i∈I</sub>Γ<sub>i</sub> denotes a finite union of environments (∧<sub>i∈I</sub>Γ<sub>i</sub> is to be understood as the empty environment when I = ∅). We write Γ\\ x for the environment such that (Γ\\ x)(y) = Γ(y) if y ≠ x and (Γ\\ x)(x) = [ ]. We write Γ; ∆ for Γ ∧ ∆ when dom(Γ) ∩ dom(∆) = ∅. A sequent has the form Γ ⊢ t : σ, where Γ is an environment, t is a term, and σ is a type.
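The operations on environments can be sketched in the same spirit. The following Python fragment is only an illustration: environments are dictionaries from variable names to `collections.Counter` multisets, types are plain strings, and the helper names `env`, `dom`, `env_union` and `remove` are hypothetical.

```python
from collections import Counter

def env(**bindings):
    """Build an environment, e.g. env(x=["s", "s"], y=["t"]).

    Variables bound to the empty multiset are simply left out.
    """
    return {x: Counter(ts) for x, ts in bindings.items() if ts}

def dom(gamma):
    """dom(gamma): the variables mapped to a non-empty multiset."""
    return {x for x, m in gamma.items() if sum(m.values()) > 0}

def env_union(gamma, delta):
    """(gamma /\\ delta)(x) = gamma(x) joined with delta(x) by multiset union."""
    return {
        x: gamma.get(x, Counter()) + delta.get(x, Counter())
        for x in dom(gamma) | dom(delta)
    }

def remove(gamma, x):
    """gamma \\ x: as gamma, except that x is mapped to the empty multiset."""
    return {y: m for y, m in gamma.items() if y != x}

g = env(x=["s"], y=["t"])
d = env(x=["s"])
u = env_union(g, d)
print(u["x"])               # Counter({'s': 2})
print(dom(remove(u, "x")))  # {'y'}
```

Note that the union adds multiplicities, so typing the same variable twice with the same type is recorded twice.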

The type system ∩J is given by the following typing rules.

$$\frac{}{x : [\sigma] \vdash x : \sigma}\ (\mathsf{var}) \qquad \frac{\Gamma; x : \mathcal{M} \vdash t : \sigma}{\Gamma \vdash \lambda x.t : \mathcal{M} \to \sigma}\ (\mathsf{abs}) \qquad \frac{(\Gamma\_i \vdash t : \sigma\_i)\_{i \in I} \qquad I \neq \emptyset}{\wedge\_{i \in I}\Gamma\_i \vdash t : [\sigma\_i]\_{i \in I}}\ (\mathsf{many})$$

$$\frac{\Gamma \vdash t : \#([\mathcal{M}\_i \to \tau\_i]\_{i \in I}) \qquad \Delta \vdash u : \#(\sqcup\_{i \in I}\mathcal{M}\_i) \qquad \Lambda; x : [\tau\_i]\_{i \in I} \vdash r : \sigma}{(\Gamma \wedge \Delta) \wedge \Lambda \vdash t(u, x.r) : \sigma}\ (\mathsf{app})$$

The use of the choice operator in rule (app) is subtle. If I is empty, then the multiset [M<sub>i</sub> → τ<sub>i</sub>]<sub>i∈I</sub> typing t as well as the multiset ⊔<sub>i∈I</sub>M<sub>i</sub> typing u are both empty, so that the choice operator must be used to type both terms. If I is not empty, then the multiset typing t is non-empty as well. However, the multiset typing u may or may not be empty, e.g. it is empty if [[ ] → α] types t.

System ∩J lacks weakening: it is relevant.

Lemma 6 (Relevance). If Γ ⊩ t : σ, then fv(t) = dom(Γ).

Notice that the typing rules (and the choice operator) force all the subterms of a typed term to be also typed. Moreover, if I = ∅ in rule (app), then the types of t and u are not necessarily related. Indeed, let δ<sup>◦</sup> := λy.y(y, w.w) in t<sub>0</sub> := δ<sup>◦</sup>(δ<sup>◦</sup>, x.z). Then t<sub>0</sub> is dβ-strongly-normalizing so it must be typed in system ∩J. However, since the set I of x : [τ<sub>i</sub>]<sub>i∈I</sub> in the typing of r = z is necessarily empty (cf. Lem. 6), the unrelated types #([M<sub>i</sub> → τ<sub>i</sub>]<sub>i∈I</sub>) and #(⊔<sub>i∈I</sub>M<sub>i</sub>) of the two occurrences of δ<sup>◦</sup> witness the fact that these subterms will never interact during the reduction of t<sub>0</sub>. Indeed, the term t<sub>0</sub> can be typed as follows, where ρ<sub>i</sub> := [[σ<sub>i</sub>] → σ<sub>i</sub>, σ<sub>i</sub>] → σ<sub>i</sub> and τ<sub>i</sub> := [σ<sub>i</sub>] → σ<sub>i</sub>, for i = 1, 2:

$$\frac{\dfrac{\emptyset \vdash \delta^{\circ} : \rho\_1}{\emptyset \vdash \delta^{\circ} : [\rho\_1]}\ (\mathsf{many}) \qquad \dfrac{\emptyset \vdash \delta^{\circ} : \rho\_2}{\emptyset \vdash \delta^{\circ} : [\rho\_2]}\ (\mathsf{many}) \qquad \dfrac{}{z : [\tau]; x : [\,] \vdash z : \tau}\ (\mathsf{var})}{z : [\tau] \vdash \delta^{\circ}(\delta^{\circ}, x.z) : \tau}\ (\mathsf{app})$$

where δ ◦ is typed with ρ<sup>i</sup> as follows:

$$\dfrac{\dfrac{\dfrac{y : [\tau\_i] \vdash y : \tau\_i}{y : [\tau\_i] \vdash y : [\tau\_i]}\ (\mathsf{many}) \qquad \dfrac{y : [\sigma\_i] \vdash y : \sigma\_i}{y : [\sigma\_i] \vdash y : [\sigma\_i]}\ (\mathsf{many}) \qquad \dfrac{}{w : [\sigma\_i] \vdash w : \sigma\_i}\ (\mathsf{var})}{y : [[\sigma\_i] \to \sigma\_i, \sigma\_i] \vdash y(y, w.w) : \sigma\_i}\ (\mathsf{app})}{\emptyset \vdash \lambda y.y(y, w.w) : [[\sigma\_i] \to \sigma\_i, \sigma\_i] \to \sigma\_i}\ (\mathsf{abs})$$

We write Γ ⊩<sub>∩J</sub> t : σ or simply Γ ⊩ t : σ if there is a derivation in system ∩J ending in Γ ⊢ t : σ. For n ≥ 1, we write Γ ⊩<sup>n</sup><sub>∩J</sub> t : σ or simply Γ ⊩<sup>n</sup> t : σ if there is a derivation in system ∩J ending in Γ ⊢ t : σ and containing n occurrences of rules in the set {(var), (abs), (app)}.

### 4.2 The Characterization of dβ-Strong Normalization

The soundness Lem. 9 is based on Lem. 8, based in turn on Lem. 7.

Lemma 7 (Substitution Lemma). Let t, u ∈ T<sup>J</sup> with x ∈ fv(t). If both Γ; x : M ⊩<sup>n</sup> t : σ and ∆ ⊩<sup>m</sup> u : M hold, then Γ ∧ ∆ ⊩<sup>k</sup> {u/x}t : σ where k = n + m − |M|.
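As a sanity check of the counting in Lem. 7 (a small instance worked out here for illustration, not taken from the paper), take t = x. By rule (var), the only possible typing of t is x : [σ] ⊢ x : σ, so Γ is empty, M = [σ] and n = 1. Any derivation typing u with [σ] in environment ∆ ends with rule (many) applied to a single premise ∆ ⊢ u : σ; since (many) is not counted, that premise also has measure m. Hence:

$$\{u/x\}x = u, \qquad \Gamma \wedge \Delta = \Delta, \qquad k = n + m - |\mathcal{M}| = 1 + m - 1 = m.$$

The subtracted |M| thus accounts for the (var) axioms on x that disappear when the derivations of u are plugged in.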

Lemma 8 (Non-Erasing Subject Reduction). Let Γ ⊩<sup>n<sub>1</sub></sup><sub>∩J</sub> t<sub>1</sub> : σ. If t<sub>1</sub> →<sub>dβ</sub> t<sub>2</sub> is a non-erasing step, then Γ ⊩<sup>n<sub>2</sub></sup><sub>∩J</sub> t<sub>2</sub> : σ with n<sub>1</sub> > n<sub>2</sub>.

Lemma 9 (Soundness for λJ). If t is ∩J-typable, then t ∈ SN (dβ).

The completeness Lem. 13 is based on Lem. 10 and Lem. 12, this last based in turn on Lem. 11.

### Lemma 10 (Typing Normal Forms).


Lemma 11 (Anti-Substitution). If Γ ⊩ {u/x}t : σ where x ∈ fv(t), then there exist Γ<sub>t</sub>, Γ<sub>u</sub> and M ≠ [ ] such that Γ<sub>t</sub>; x : M ⊩ t : σ, Γ<sub>u</sub> ⊩ u : M and Γ = Γ<sub>t</sub> ∧ Γ<sub>u</sub>.

Lemma 12 (Non-Erasing Subject Expansion). If Γ ⊩<sub>∩J</sub> t<sub>2</sub> : σ and t<sub>1</sub> →<sub>dβ</sub> t<sub>2</sub> is a non-erasing step, then Γ ⊩<sub>∩J</sub> t<sub>1</sub> : σ.

Lemma 13 (Completeness for λJ). If t ∈ SN (dβ), then t is ∩J-typable.

We finally obtain:

Theorem 3 (Characterization). System ∩J characterizes strong normalization, i.e. t is ∩J-typable if and only if t is →<sub>dβ</sub>-normalizing. Moreover, if Γ ⊩<sup>n</sup> t : σ then the number of reduction steps in any reduction sequence from t to normal form is bounded by n.

Proof. Soundness holds by Lem. 9, while completeness holds by Lem. 13. The bound is given by Thm. 9 in the long version [6].

#### 4.3 Why π Is Not Quantitative

In the introduction we discussed that π is rejected by the quantitative type systems ∩J for CBN. This happens in the critical case when x ∉ fv(r) and y ∈ fv(r′) in t<sub>0</sub> = t(u, x.r)(u′, y.r′) →<sub>π</sub> t(u, x.r(u′, y.r′)) = t<sub>1</sub>. Let us see a concrete example.

Example 3. We take t<sub>1</sub> = x(y, a.z)(w, b.b(b, c.c)) →<sub>π</sub> x(y, a.z(w, b.b(b, c.c))) = t<sub>2</sub>. Let ρ<sub>1</sub> = [σ] → τ and ρ<sub>2</sub> = [σ] → [τ] → τ. For each i ∈ {1, 2} let ∆<sub>i</sub> = x : [σ<sub>1</sub>]; y : [σ<sub>2</sub>]; z : [ρ<sub>i</sub>]. Consider

$$\Psi = \frac{b : [ [\tau] \to \tau ] \vdash b : [ [\tau] \to \tau ] \qquad b : [\tau] \Vdash b : [\tau] \qquad c : [\tau] \vdash c : \tau }{\begin{array}{c} b : [[\tau] \to \tau, \tau] \vdash b(b, c.c) : \tau \end{array}}$$

and the derivation Φ<sup>i</sup> for i ∈ {1, 2}:

$$\Phi\_i = \frac{x : [\sigma\_1] \vdash x : [\sigma\_1] \qquad y : [\sigma\_2] \Vdash y : [\sigma\_2] \qquad \overline{z : [\rho\_i] \vdash z : \rho\_i}}{\Delta\_i \vdash x(y, a.z) : \rho\_i}$$

Then, for the term t1, we have the following derivation:

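$$\dfrac{\dfrac{\Phi\_1 \qquad \Phi\_2}{\Delta\_1 \wedge \Delta\_2 \vdash x(y, a.z) : [\rho\_1, \rho\_2]} \qquad w : [\sigma, \sigma] \Vdash w : [\sigma, \sigma] \qquad \Psi}{\Gamma\_1 \vdash x(y, a.z)(w, b.b(b, c.c)) : \tau}$$

where Γ<sub>1</sub> = z : [ρ<sub>1</sub>, ρ<sub>2</sub>]; w : [σ, σ]; x : [σ<sub>1</sub>, σ<sub>1</sub>]; y : [σ<sub>2</sub>, σ<sub>2</sub>].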

While for the term t2, we have:

$$\frac{x: [\sigma\_1] \vdash x: [\sigma\_1] \qquad y: [\sigma\_2] \vdash y: [\sigma\_2] \qquad \Phi}{\Gamma\_2 \vdash x(y, a.z(w, b.b(b, c.c))): \tau}$$

where

$$\Phi = \frac{z : [\rho\_1, \rho\_2] \Vdash z : [\rho\_1, \rho\_2] \qquad w : [\sigma, \sigma] \Vdash w : [\sigma, \sigma] \qquad \Psi}{\varGamma\_2 \Vdash z (w, b.b(b, c.c)) : \tau}$$

and Γ<sub>2</sub> = z : [ρ<sub>1</sub>, ρ<sub>2</sub>]; w : [σ, σ]; x : [σ<sub>1</sub>]; y : [σ<sub>2</sub>].

Thus, the multiset types of x and y in Γ<sub>1</sub> and Γ<sub>2</sub> resp. are not the same. Despite the fact that the step t<sub>1</sub> →<sub>π</sub> t<sub>2</sub> does not erase any subterm, the typing environment is losing quantitative information.

Notice that by replacing non-idempotent types by idempotent ones, subject reduction (and expansion) would work for π-reduction: by assigning sets to variables instead of multisets, Γ<sub>1</sub> and Γ<sub>2</sub> would now represent the same object.
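The difference between the two readings can be checked mechanically. The sketch below encodes the x and y components of the two environments of Example 3 with `collections.Counter` (type names are plain strings, an assumption of this illustration) and compares them first as multisets, then as sets.

```python
from collections import Counter

# The x and y components of the environments of Example 3;
# the z and w components coincide and are omitted.
gamma1 = {"x": Counter(["sigma1", "sigma1"]), "y": Counter(["sigma2", "sigma2"])}
gamma2 = {"x": Counter(["sigma1"]), "y": Counter(["sigma2"])}

def as_sets(gamma):
    """Forget multiplicities: the idempotent reading of an environment."""
    return {x: set(m) for x, m in gamma.items()}

print(gamma1 == gamma2)                    # False: the multisets differ
print(as_sets(gamma1) == as_sets(gamma2))  # True: the sets coincide
```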

Despite the fact that quantitative subject reduction fails for some π-steps, the following weaker property is sufficient to recover (qualitative) soundness of our typing system ∩J w.r.t. the reduction relation →<sub>β,π</sub>. Soundness will be used later in Sec. 6 to show equivalence between SN (dβ) and SN (β, π).

Lemma 14 (Typing Behavior of π-Reduction). Let Γ ⊩<sup>n<sub>1</sub></sup><sub>∩J</sub> t<sub>1</sub> : σ. If t<sub>1</sub> = t(u, x.r)(u′, y.r′) ↦<sub>π</sub> t<sub>2</sub> = t(u, x.r(u′, y.r′)), then there are n<sub>2</sub> and Σ ⊑ Γ such that Σ ⊩<sup>n<sub>2</sub></sup><sub>∩J</sub> t<sub>2</sub> : σ with n<sub>1</sub> ≥ n<sub>2</sub>.

Lemma 15 (Soundness for ΛJ). If t is ∩J-typable, then t ∈ SN (β, π).

### 5 Faithfulness of the Translation

As discussed in the introduction, the natural translation [4] of generalized applications into ES is not faithful. In this section we define an alternative encoding and prove it faithful: a term in T<sup>J</sup> is dβ-strongly normalizing iff its alternative encoding is strongly normalizing in the ES framework. In a later subsection, we use this connection with ES to establish the equivalence between strong normalization w.r.t. dβ and (β, p<sub>2</sub>).

### 5.1 Explicit Substitutions

We define the syntax and semantics of an ES calculus borrowed from [1] to which we relate λJ. It is a simple calculus where β is implemented in two independent steps: one creating a let-binding, and another one substituting the term bound. Its notion of distance makes it possible to reduce redexes such as ([N/x](λy.M))P →<sub>dB</sub> [N/x][P/y]M, where the ES [N/x] lies between the abstraction and its argument. Terms and list contexts are given by:

$$\begin{array}{rl} (\mathsf{T}\_{ES}) & M, N, P, Q ::= x \mid \lambda x.M \mid MN \mid [N/x]M\\ (\mathsf{List}\ \mathsf{contexts}) & \mathsf{L} ::= \Diamond \mid [N/x]\mathsf{L} \end{array}$$

The calculus λES is defined by TES[dB, s] (closed under all contexts) where:

$$\mathsf{L}\langle\lambda x.M\rangle N \mapsto\_{\mathsf{dB}} \mathsf{L}\langle[N/x]M\rangle \qquad \qquad [N/x]M \mapsto\_{\mathsf{s}} \{N/x\}M$$
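To illustrate how distance works operationally, the following Python sketch implements a dB-step at the root of a term. The tagged-tuple encoding of T<sub>ES</sub> and the function names are assumptions of this sketch; it also relies on the usual α-renaming convention (variables bound by the list context are not free in the argument), so no renaming is performed.

```python
# Terms of T_ES as tagged tuples (an encoding chosen for this sketch):
#   ("var", x) | ("lam", x, M) | ("app", M, N) | ("sub", N, x, M)   # [N/x]M

def split_list_context(term):
    """Peel the explicit substitutions wrapping `term`.

    Returns (subs, body), where subs lists the (N, x) pairs from the
    outermost to the innermost substitution.
    """
    subs = []
    while term[0] == "sub":
        _, n, x, body = term
        subs.append((n, x))
        term = body
    return subs, term

def plug(subs, body):
    """Rebuild L<body> from the list of substitutions."""
    for n, x in reversed(subs):
        body = ("sub", n, x, body)
    return body

def db_root(term):
    """One dB-step at the root, or None if the root is not a dB-redex."""
    if term[0] != "app":
        return None
    _, fun, arg = term
    subs, head = split_list_context(fun)
    if head[0] != "lam":
        return None
    _, x, body = head
    return plug(subs, ("sub", arg, x, body))

# The redex ([N/x](lam y.M)) P, with variables standing for N, M and P
N, M, P = ("var", "n"), ("var", "m"), ("var", "p")
redex = ("app", ("sub", N, "x", ("lam", "y", M)), P)
print(db_root(redex) == ("sub", N, "x", ("sub", P, "y", M)))  # True
```

The explicit substitution [N/x] sitting between the abstraction and its argument is simply carried along by `plug`, which is the point of the distance mechanism.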

Now, consider the (naive) translation from T<sup>J</sup> to TES [4]:

$$x^\star := x \qquad (\lambda x.t)^\star := \lambda x.t^\star \qquad t(u,y.r)^\star := [t^\star u^\star/y]r^\star$$

According to this translation, the notion of distance in λES corresponds to our notion of distance for λJ. For instance, the application t(u, x.·) in the term t(u, x.λy.r)(u′, z.r′) can be seen as a substitution [t<sup>⋆</sup>u<sup>⋆</sup>/x]· inserted between the abstraction λy.r and the argument u′. But how can we now (informally) relate π to the notions of existing permutations for λES? Using the previous translation, we can see that t<sub>0</sub> = t(u, x.r)(u′, y.r′) ↦<sub>π</sub> t(u, x.r(u′, y.r′)) = t<sub>1</sub> is simulated as

$$t\_0^\star = [([t^\star u^\star / x]r^\star)u^{\prime \star} / y]r^{\prime \star} \to [[t^\star u^\star / x](r^\star u^{\prime \star}) / y]r^{\prime \star} \to [t^\star u^\star / x][r^\star u^{\prime \star} / y]r^{\prime \star} = t\_1^\star.$$

The first step is an instance of a rule in ES known as σ<sub>1</sub>: ([u/x]t)v ↦ [u/x](tv), and the second one of a rule we call σ<sub>4</sub>: [[u/x]t/y]v ↦ [u/x][t/y]v. Quantitative types for ES tell us that only rule σ<sub>1</sub>, but not rule σ<sub>4</sub>, is valid for a call-by-name calculus. This is why it is not surprising that π is rejected by our type system, as detailed in Sec. 4.3.
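The two-step simulation above can be replayed on a small instance. In the Python sketch below, terms are tagged tuples and the subterms t, u, r, u′, r′ are taken to be variables; the encoding and the names `star`, `sigma1` and `sigma4` are assumptions of this illustration. The side conditions of the σ-rules (x not free in the term moved under the substitution) hold here because all names are distinct.

```python
# T_J terms:  ("var", x) | ("lam", x, t) | ("gapp", t, u, y, r)            # t(u, y.r)
# T_ES terms: ("var", x) | ("lam", x, M) | ("app", M, N) | ("sub", N, x, M)  # [N/x]M

def star(t):
    """Naive translation from T_J to T_ES."""
    tag = t[0]
    if tag == "var":
        return t
    if tag == "lam":
        return ("lam", t[1], star(t[2]))
    _, fun, arg, y, cont = t
    return ("sub", ("app", star(fun), star(arg)), y, star(cont))

def sigma1(m):
    """([u/x]t)v  |->  [u/x](t v), applied at the root."""
    _, (_, u, x, t), v = m
    return ("sub", u, x, ("app", t, v))

def sigma4(m):
    """[[u/x]t/y]v  |->  [u/x][t/y]v, applied at the root."""
    _, (_, u, x, t), y, v = m
    return ("sub", u, x, ("sub", t, y, v))

def var(name):
    return ("var", name)

# t0 = t(u, x.r)(u', y.r')  and  t1 = t(u, x.r(u', y.r'))
t0 = ("gapp", ("gapp", var("t"), var("u"), "x", var("r")), var("u'"), "y", var("r'"))
t1 = ("gapp", var("t"), var("u"), "x", ("gapp", var("r"), var("u'"), "y", var("r'")))

m0 = star(t0)
_, payload, bound, body = m0
m1 = ("sub", sigma1(payload), bound, body)  # sigma1 inside the substituted term
m2 = sigma4(m1)                             # sigma4 at the root
print(m2 == star(t1))  # True
```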

The alternative encoding we propose is as follows (written <sup>∗</sup> instead of <sup>⋆</sup>):

### Definition 4 (Translation from T<sup>J</sup> to TES).

$$x^\* := x \qquad (\lambda x.t)^\* := \lambda x.t^\* \qquad t(u,x.r)^\* := [t^\*/x^l][u^\*/x^r] \{x^l x^r/x\} r^\*$$

Notice that the above π-reduction t<sub>0</sub> → t<sub>1</sub> is still simulated: t<sub>0</sub><sup>∗</sup> →<sup>2</sup><sub>σ<sub>4</sub></sub> t<sub>1</sub><sup>∗</sup>.

Consider again the counterexample to faithfulness already discussed in the introduction, given by t := δ<sup>◦</sup>(δ<sup>◦</sup>, y.r) with y ∉ fv(r), where δ<sup>◦</sup> = λx.x(x, w.w). The term t is a dβ-redex, whose contraction throws away the two copies of δ<sup>◦</sup>. The naive translation of t gives [δ<sup>◦⋆</sup>δ<sup>◦⋆</sup>/y]r<sup>⋆</sup>, which clearly diverges in λES. The alternative encoding of t is [δ<sup>◦∗</sup>/y<sup>l</sup>][δ<sup>◦∗</sup>/y<sup>r</sup>]{y<sup>l</sup>y<sup>r</sup>/y}r<sup>∗</sup>, which is just [δ<sup>◦∗</sup>/y<sup>l</sup>][δ<sup>◦∗</sup>/y<sup>r</sup>]r<sup>∗</sup>, because y ∉ fv(r<sup>∗</sup>). The only hope to have an interaction between the two copies of δ<sup>◦∗</sup> in the previous term is to execute the ES, but such executions will just throw away those two copies, because y<sup>l</sup>, y<sup>r</sup> ∉ fv(r<sup>∗</sup>). This gives an intuitive idea of the faithfulness of our encoding.
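The contrast between the two encodings on this counterexample can be observed with a small Python sketch. The tuple encoding, the spelling of x<sup>l</sup> and x<sup>r</sup> as `x_l` and `x_r`, and the helper `db_redexes` (which counts dB-redexes, with distance, present in a term) are assumptions of this illustration. Substitution is implemented naively, assuming bound names are distinct from the names involved. Counting redexes is only a static observation about the two translated terms, not a proof of (non-)termination.

```python
# T_J:  ("var", x) | ("lam", x, t) | ("gapp", t, u, y, r)              # t(u, y.r)
# T_ES: ("var", x) | ("lam", x, M) | ("app", M, N) | ("sub", N, x, M)  # [N/x]M

def subst(m, x, n):
    """{n/x}m on T_ES terms; assumes bound names of m avoid x and fv(n)."""
    tag = m[0]
    if tag == "var":
        return n if m[1] == x else m
    if tag == "lam":
        return ("lam", m[1], subst(m[2], x, n))
    if tag == "app":
        return ("app", subst(m[1], x, n), subst(m[2], x, n))
    _, arg, y, body = m
    return ("sub", subst(arg, x, n), y, subst(body, x, n))

def naive(t):
    """Naive translation: t(u, y.r) becomes [t u / y] r."""
    if t[0] == "var":
        return t
    if t[0] == "lam":
        return ("lam", t[1], naive(t[2]))
    _, fun, arg, y, cont = t
    return ("sub", ("app", naive(fun), naive(arg)), y, naive(cont))

def enc(t):
    """Alternative encoding (Def. 4)."""
    if t[0] == "var":
        return t
    if t[0] == "lam":
        return ("lam", t[1], enc(t[2]))
    _, fun, arg, x, cont = t
    xl, xr = x + "_l", x + "_r"
    body = subst(enc(cont), x, ("app", ("var", xl), ("var", xr)))
    return ("sub", enc(fun), xl, ("sub", enc(arg), xr, body))

def db_redexes(m):
    """Count subterms of the form L<lam x.M> N."""
    tag = m[0]
    if tag == "var":
        return 0
    if tag == "lam":
        return db_redexes(m[2])
    if tag == "sub":
        return db_redexes(m[1]) + db_redexes(m[3])
    head = m[1]
    while head[0] == "sub":
        head = head[3]
    return int(head[0] == "lam") + db_redexes(m[1]) + db_redexes(m[2])

delta = ("lam", "x", ("gapp", ("var", "x"), ("var", "x"), "w", ("var", "w")))
t = ("gapp", delta, delta, "y", ("var", "r"))  # y is not free in r

print(db_redexes(naive(t)))  # 1: the two copies of delta are applied to each other
print(db_redexes(enc(t)))    # 0: the two copies are kept apart
```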

### 5.2 Proof of Faithfulness

We need to prove the equivalence between two notions of strong normalization: the one of a term in λJ and the one of its encoding in λES. While this proof can be a bit involved using traditional methods, quantitative types will make it very straightforward. Indeed, since quantitative types correspond exactly to strong normalization, we only have to show that a term t is typable exactly when its encoding is typable, for two appropriate quantitative type systems.

For λES, we will use the following system [9]:

Definition 5 (The Type System ∩ES).

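$$\frac{}{x : [\sigma] \vdash x : \sigma}\ (\mathsf{var}) \qquad \frac{\Gamma; x : \mathcal{M} \vdash M : \sigma}{\Gamma \vdash \lambda x.M : \mathcal{M} \to \sigma}\ (\mathsf{abs}) \qquad \frac{(\Gamma\_i \vdash M : \sigma\_i)\_{i \in I} \qquad I \neq \emptyset}{\wedge\_{i \in I}\Gamma\_i \vdash M : [\sigma\_i]\_{i \in I}}\ (\mathsf{many})$$

$$\frac{\Gamma \vdash M : \mathcal{M} \to \sigma \qquad \Delta \vdash N : \#(\mathcal{M})}{\Gamma \wedge \Delta \vdash MN : \sigma}\ (\mathsf{app}) \qquad \frac{\Gamma; x : \mathcal{M} \vdash M : \sigma \qquad \Delta \vdash N : \#(\mathcal{M})}{\Gamma \wedge \Delta \vdash [N/x]M : \sigma}\ (\mathsf{sub})$$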

Theorem 4. Let M ∈ TES. Then M is typable in ∩ES iff M ∈ SN (dB, s).

A simple induction on the type derivation shows that the encoding is sound.

Lemma 16. Let t ∈ T<sup>J</sup>. Then Γ ⊩<sub>∩J</sub> t : σ ⟹ Γ ⊩<sub>∩ES</sub> t<sup>∗</sup> : σ.

We show completeness by a detour through the encoding of TES to T<sup>J</sup> :

Definition 6 (Translation from TES to T<sup>J</sup> ).

$$\begin{array}{cc} x^{\circ} := x & \text{(} MN\text{)}^{\circ} := M^{\circ}(N^{\circ}, x.x) \\ \left(\lambda x.M\right)^{\circ} := \lambda x.M^{\circ} & \text{(} [N/x]M \text{)}^{\circ} := \text{I}(N^{\circ}, x.M^{\circ}) \end{array}$$
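Definition 6 is a direct structural recursion, as the following Python sketch shows. The tuple encoding is an assumption of this illustration, as is reading I as the identity λi.i; the continuation variable `k` is assumed not to occur in source terms.

```python
# T_ES: ("var", x) | ("lam", x, M) | ("app", M, N) | ("sub", N, x, M)   # [N/x]M
# T_J:  ("var", x) | ("lam", x, t) | ("gapp", t, u, y, r)               # t(u, y.r)

I = ("lam", "i", ("var", "i"))  # the identity (assumed reading of I)

def to_j(m):
    """Translation from T_ES to T_J following Def. 6."""
    tag = m[0]
    if tag == "var":
        return m
    if tag == "lam":
        return ("lam", m[1], to_j(m[2]))
    if tag == "app":
        # M N becomes M(N, k.k): an ordinary application is a generalized
        # application with the identity continuation.
        return ("gapp", to_j(m[1]), to_j(m[2]), "k", ("var", "k"))
    _, n, x, body = m
    # [N/x]M becomes I(N, x.M): the substitution is delayed behind I.
    return ("gapp", I, to_j(n), x, to_j(body))

example = ("sub", ("var", "n"), "x", ("app", ("var", "x"), ("var", "x")))
print(to_j(example))
```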

The two following lemmas, shown by induction on the type derivations, give in particular that t<sup>∗</sup> typable implies t typable.

Lemma 17. Let M ∈ TES. Then Γ ⊩<sub>∩ES</sub> M : σ ⟹ Γ ⊩<sub>∩J</sub> M<sup>◦</sup> : σ.

Lemma 18. Let t ∈ T<sup>J</sup>. Then Γ ⊩<sub>∩J</sub> t<sup>∗◦</sup> : σ ⟹ Γ ⊩<sub>∩J</sub> t : σ.

Putting all together, we get this equivalence:

Corollary 1. Let t ∈ T<sup>J</sup>. Then Γ ⊩<sub>∩J</sub> t : σ ⟺ Γ ⊩<sub>∩ES</sub> t<sup>∗</sup> : σ.

This corollary, together with the two characterization theorems 3 and 4, provides the main result of this section:

Theorem 5 (Faithfulness). Let t ∈ T<sup>J</sup> . Then t ∈ SN (dβ) ⇐⇒ t <sup>∗</sup> ∈ SN (dB, s).

### 6 Equivalent Notions of Strong Normalization

In the previous section, we related strong dβ-normalization with strong normalization of ES. In this section we will compare the various concepts of strong normalization that are induced on T<sup>J</sup> by β, dβ, (β, p<sub>2</sub>) and (β, π). This comparison will make use of several results obtained in the previous sections, and will obtain new results about the original calculus ΛJ.

### 6.1 β-Normalization Is Not Enough

We discussed the unblocking property of π and p<sub>2</sub> in Sec. 2.2. From the point of view of normalization, this means that T<sup>J</sup>[β] has premature normal forms and that SN (dβ) ⊊ SN (β). To illustrate this, we give an example of a T<sup>J</sup>-term which normalizes when only using rule β, but diverges when adding permutation rules or distance. We write Ω for the term δ<sup>◦</sup>(δ<sup>◦</sup>, x.x), where δ<sup>◦</sup> = λy.y(y, z.z), so that Ω →<sub>β</sub> Ω. Now, let us take t := w(u, w′.δ<sup>◦</sup>)(δ<sup>◦</sup>, x.x). Although this term is normal in T<sup>J</sup>[β], the second δ<sup>◦</sup> is actually an argument for the first one, as we can see with a π-permutation:

$$t \to\_{\pi} w(u, w'. \delta^\circ(\delta^\circ, x. x)) = w(u, w'. \Omega) =: t'$$

Thus t →<sub>π</sub> t′ →<sub>β</sub> t′, which implies t ∉ SN (β, π). We can also unblock the redex in t by a p<sub>2</sub>-permutation moving the inner λy up:

$$t \to\_{\mathsf{p}\_2} (\lambda y.w(u, w'.y(y, z.z)))(\delta^\circ, x.x) \to\_\beta t'$$

Thus t →<sub>p<sub>2</sub></sub> →<sub>β</sub> t′ →<sub>β</sub> t′ and thus t ∉ SN (β, p<sub>2</sub>). We get the same thing in a single dβ-step: t →<sub>dβ</sub> t′.

In all three cases, β-strong normalization is not preserved by the permutation rules, as there is a term t ∈ SN (β) such that t ∉ SN (β, π), t ∉ SN (β, p<sub>2</sub>) and t ∉ SN (dβ).
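The unblocking effect of π on this example can be checked with a short Python sketch: the term t contains no β-redex, while its π-reduct contains one, namely Ω. The tuple encoding and the helper names are assumptions of this illustration; `pi_root` applies π at the root only and assumes the bound variable of the inner continuation is not free in the outer argument and continuation.

```python
# T_J terms: ("var", x) | ("lam", x, t) | ("gapp", t, u, y, r)   # t(u, y.r)

def var(x):
    return ("var", x)

def lam(x, t):
    return ("lam", x, t)

def gapp(t, u, y, r):
    return ("gapp", t, u, y, r)

def beta_redexes(t):
    """Number of subterms of the form (lam x.s)(u, y.r)."""
    if t[0] == "var":
        return 0
    if t[0] == "lam":
        return beta_redexes(t[2])
    _, fun, arg, _, cont = t
    here = 1 if fun[0] == "lam" else 0
    return here + beta_redexes(fun) + beta_redexes(arg) + beta_redexes(cont)

def pi_root(t):
    """t(u, x.r)(u', y.r')  ->  t(u, x.r(u', y.r')) at the root, else None."""
    if t[0] != "gapp" or t[1][0] != "gapp":
        return None
    _, (_, head, u1, x, r1), u2, y, r2 = t
    return gapp(head, u1, x, gapp(r1, u2, y, r2))

delta = lam("y", gapp(var("y"), var("y"), "z", var("z")))
omega = gapp(delta, delta, "x", var("x"))
t = gapp(gapp(var("w"), var("u"), "w'", delta), delta, "x", var("x"))
t_prime = pi_root(t)

print(beta_redexes(t))                                       # 0: t is beta-normal
print(t_prime == gapp(var("w"), var("u"), "w'", omega))      # True
print(beta_redexes(t_prime))                                 # 1: omega is now exposed
```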

### 6.2 Comparison with β + p<sub>2</sub>

We now formalize the fact that our calculus T<sup>J</sup>[dβ] is a version with distance of T<sup>J</sup>[β, p<sub>2</sub>], so that they are equivalent from a normalization point of view. For this, we will establish the equivalence between strong normalization w.r.t. dβ and (β, p<sub>2</sub>), through a long chain of equivalences. One of them is Thm. 5, which we proved in the previous section; the other is a result about σ-rules in the λ-calculus, which is why we have to go through the λ-calculus again.

### Definition 7 (Translation from TES to Tλ).

$$x^\sharp := x \quad (\lambda x.M)^\sharp := \lambda x.M^\sharp \quad (MN)^\sharp := M^\sharp N^\sharp \quad ([N/x]M)^\sharp := (\lambda x.M^\sharp)N^\sharp$$
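Read as a program, Definition 7 replaces every explicit substitution by a β-redex. A minimal Python sketch follows; the tuple encoding and the function name `sharp` are assumptions of this illustration.

```python
# T_ES: ("var", x) | ("lam", x, M) | ("app", M, N) | ("sub", N, x, M)   # [N/x]M
# Lambda-terms use the same tags, without "sub".

def sharp(m):
    """Translation from T_ES to lambda-terms following Def. 7."""
    tag = m[0]
    if tag == "var":
        return m
    if tag == "lam":
        return ("lam", m[1], sharp(m[2]))
    if tag == "app":
        return ("app", sharp(m[1]), sharp(m[2]))
    _, n, x, body = m
    # [N/x]M becomes the beta-redex (lam x.M) N
    return ("app", ("lam", x, sharp(body)), sharp(n))

def has_sub(m):
    """True if the term still contains an explicit substitution."""
    if m[0] == "var":
        return False
    if m[0] == "sub":
        return True
    return any(has_sub(part) for part in m[1:] if isinstance(part, tuple))

example = ("sub", ("var", "n"), "x", ("sub", ("var", "p"), "y", ("app", ("var", "x"), ("var", "y"))))
print(sharp(example))
print(has_sub(sharp(example)))  # False
```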

Lemma 19. Let M ∈ TES. Then M ∈ SN (dB, s) ⟹ M<sup>♯</sup> ∈ SN (β).

Proof. For typability in the λ-calculus, we use the type system S′<sub>λ</sub> with choice operators in [9], which we rename here ∩S. It can be seen as a restriction of our system ∩ES to λ-terms. Suppose M ∈ SN (dB, s). By Thm. 4, M is typable in ∩ES, and it is straightforward to show that M<sup>♯</sup> is typable in ∩S. Moreover, M<sup>♯</sup> typable implies that M<sup>♯</sup> ∈ SN (β) [9], which is what we want.

For t ∈ T<sup>J</sup>, let t<sup>□</sup> := t<sup>∗♯</sup>. So, we are just composing the alternative encoding of generalized application into ES with the map into the λ-calculus just introduced. The λ-term t<sup>□</sup> may be given by recursion on t as follows:

$$x^{\square} = x \qquad (\lambda x.t)^{\square} = \lambda x.t^{\square} \qquad t(u,y.r)^{\square} = (\lambda y^r.(\lambda y^l.\{y^l y^r/y\}r^{\square})t^{\square})u^{\square}$$

Lemma 20. t<sup>□</sup> ∈ SN (β, σ<sub>2</sub>) ⟹ t ∈ SN (β, p<sub>2</sub>).

Proof. Because (·)<sup>□</sup> produces a strict simulation from T<sup>J</sup> to Tλ. More precisely: (i) if t<sub>1</sub> →<sub>β</sub> t<sub>2</sub> then t<sub>1</sub><sup>□</sup> →<sup>+</sup><sub>β</sub> t<sub>2</sub><sup>□</sup>; (ii) if t<sub>1</sub> →<sub>p<sub>2</sub></sub> t<sub>2</sub> then t<sub>1</sub><sup>□</sup> →<sup>2</sup><sub>σ<sub>2</sub></sub> t<sub>2</sub><sup>□</sup>.

Theorem 6. Let t ∈ T<sup>J</sup>. Then t ∈ SN (β, p<sub>2</sub>) iff t ∈ SN (dβ).

Proof. We prove that the following conditions are equivalent: 1) t ∈ SN (β, p<sub>2</sub>). 2) t ∈ SN (dβ). 3) t<sup>∗</sup> ∈ SN (dB, s). 4) t<sup>□</sup> ∈ SN (β). 5) t<sup>□</sup> ∈ SN (β, σ<sub>2</sub>). Now, 1) ⟹ 2) holds because →<sub>dβ</sub> ⊂ →<sup>+</sup><sub>β,p<sub>2</sub></sub>. 2) ⟹ 3) is by Thm. 5. 3) ⟹ 4) is by Lem. 19. 4) ⟹ 5) is shown in [13]. 5) ⟹ 1) is by Lem. 20.

### 6.3 Comparison with β + π

We now prove the equivalence between strong normalization for dβ and for (β, π). One of the implications already follows from the properties of the typing system.

Lemma 21. Let t ∈ T<sup>J</sup> . If t ∈ SN (dβ) then t ∈ SN (β, π).

Proof. Follows from the completeness of the typing system (Lem. 13) and soundness of ∩J for (β, π) (Lem. 15).

The proof of the other implication requires more work, organized in 4 parts: 1) A remark about ES. 2) A remark about translations of ES into the ΛJ-calculus. 3) Two new properties of strong normalization for (β, π) in ΛJ. 4) Preservation of strong (β, π)-normalization by a certain map from the set T<sup>J</sup> into itself.

The remark about explicit substitutions is this:

### Lemma 22. For all M ∈ TES, M ∈ SN (dB, s) iff M ∈ SN (B, s).

The translation ◦ in Def. 6 induces a simulation of each s-reduction step on TES into a β-reduction step on T<sup>J</sup> , but cannot simulate the creation of an ES effected by rule B. A solution is to refine the translation ◦ for applications, yielding the following alternative translation:

$$\begin{array}{c} x^{\bullet} := x \\ (MN)^{\bullet} := I(N^{\bullet}, y.M^{\bullet}(y, z.z)) \end{array} \qquad \begin{array}{c} (\lambda x.M)^{\bullet} := \lambda x.M^{\bullet} \\ ([N/x]M)^{\bullet} := I(N^{\bullet}, x.M^{\bullet}) \end{array}$$

Since the clause for ES is not changed, simulation of each s-reduction step by a β-reduction step holds as before. The improvement lies in the simulation of each B-reduction step:

$$((\lambda x.M)N)^\bullet = I(N^\bullet, y.(\lambda x.M^\bullet)(y, z.z)) \rightarrow\_\beta I(N^\bullet, y.\{y/x\}M^\bullet) =\_\alpha ([N/x]M)^\bullet$$

This strict simulation gives immediately:

Lemma 23. For all M ∈ TES, if M<sup>•</sup> ∈ SN (β) then M ∈ SN (B, s).
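The refined translation can be sketched in Python as follows, to see where the β-redex simulating a B-step appears. The tuple encoding, the reading of I as the identity λi.i, and the reserved names `y` and `z` (assumed not to occur in source terms) are assumptions of this illustration.

```python
# T_ES: ("var", x) | ("lam", x, M) | ("app", M, N) | ("sub", N, x, M)   # [N/x]M
# T_J:  ("var", x) | ("lam", x, t) | ("gapp", t, u, y, r)               # t(u, y.r)

I = ("lam", "i", ("var", "i"))  # the identity (assumed reading of I)

def bullet(m):
    """Refined translation from T_ES to T_J."""
    tag = m[0]
    if tag == "var":
        return m
    if tag == "lam":
        return ("lam", m[1], bullet(m[2]))
    if tag == "app":
        # M N becomes I(N, y.M(y, z.z)): the argument is first bound to y,
        # and only then passed to the translation of M.
        inner = ("gapp", bullet(m[1]), ("var", "y"), "z", ("var", "z"))
        return ("gapp", I, bullet(m[2]), "y", inner)
    _, n, x, body = m
    return ("gapp", I, bullet(n), x, bullet(body))

# A B-redex (lam x.x) n and its image
redex = ("app", ("lam", "x", ("var", "x")), ("var", "n"))
image = bullet(redex)
continuation = image[4]
print(continuation[1][0] == "lam")  # True: the continuation holds a beta-redex
```

Contracting that inner β-redex yields a term of the shape I(N, y.{y/x}M), which is the image of the explicit substitution created by rule B, up to renaming of the bound variable.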

We now prove two properties of strong normalization for (β, π) in ΛJ. Following [10], SN (β, π) admits an inductive characterization ISN (β, π), which uses the following inductive generation for T<sup>J</sup> -terms:

$$t, u, r ::= x\vec{S} \mid \lambda x.t \mid (\lambda x.t)S\vec{S} \qquad S ::= (u, y.r)$$

Hence S stands for a generalized argument, while S⃗ denotes a possibly empty list of S's. The definition of ISN (β, π) is given below. Notice that at most one rule applies to a given term, so the rules are deterministic (and thus invertible).

$$\frac{}{x \in \mathcal{ISN}(\beta,\pi)}\ (\mathsf{var}) \qquad \frac{u, r \in \mathcal{ISN}(\beta,\pi)}{x(u, y.r) \in \mathcal{ISN}(\beta,\pi)}\ (\mathsf{hvar}) \qquad \frac{x(u, y.rS)\vec{S} \in \mathcal{ISN}(\beta,\pi)}{x(u, y.r)S\vec{S} \in \mathcal{ISN}(\beta,\pi)}\ (\mathsf{pi})$$

$$\frac{t \in \mathcal{ISN}(\beta,\pi)}{\lambda x.t \in \mathcal{ISN}(\beta,\pi)}\ (\mathsf{lambda}) \qquad \frac{\{\{u/x\}t/y\}r\vec{S} \in \mathcal{ISN}(\beta,\pi) \qquad t, u \in \mathcal{ISN}(\beta,\pi)}{(\lambda x.t)(u, y.r)\vec{S} \in \mathcal{ISN}(\beta,\pi)}\ (\mathsf{beta})$$
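The inductive generation of terms used by these rules amounts to splitting a term into its head and its list of generalized arguments. The Python sketch below illustrates this decomposition; the tuple encoding and the names `spine` and `shape` are assumptions of this illustration.

```python
# T_J terms: ("var", x) | ("lam", x, t) | ("gapp", t, u, y, r)   # t(u, y.r)

def spine(t):
    """Split a term into (head, args).

    `args` is the list of generalized arguments (u, y, r), leftmost first;
    the head is either a variable or an abstraction.
    """
    args = []
    while t[0] == "gapp":
        _, fun, u, y, r = t
        args.append((u, y, r))
        t = fun
    args.reverse()
    return t, args

def shape(t):
    """Name the production of the grammar that generates t."""
    head, args = spine(t)
    if head[0] == "var":
        return "variable applied to a list of arguments"
    if not args:
        return "abstraction"
    return "abstraction applied to a non-empty list of arguments"

x, u, r = ("var", "x"), ("var", "u"), ("var", "r")
t = ("gapp", ("gapp", x, u, "y", r), u, "y2", r)   # x(u, y.r)(u, y2.r)
head, args = spine(t)
print(head, len(args))  # ('var', 'x') 2
print(shape(t))
```

Since every term has exactly one such decomposition, at most one production applies, which matches the determinism remark above.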

A preliminary fact is the following:

Lemma 24. SN (β, π) is closed under prefixing of arbitrary π-reduction steps:

$$\frac{t \to\_{\pi} t' \text{ and } t' \in \mathcal{SN}(\beta, \pi)}{t \in \mathcal{SN}(\beta, \pi)}$$

Given that SN (β, π) = ISN (β, π), the "rule" in Lem. 24, when written with ISN (β, π), is admissible for the predicate ISN (β, π). Now, consider:

$$\frac{u, r \in \mathcal{ISN}(\beta, \pi)}{\{y(u, z.z)/x\} r \in \mathcal{ISN}(\beta, \pi)}\,(I)$$

$$\frac{\{\{\{u/y\} t/z\} r/x\} r \in \mathcal{ISN}(\beta, \pi) \qquad t, u \in \mathcal{ISN}(\beta, \pi) \qquad x \notin \text{fv}(t, u, r)}{\{(\lambda y. t)(u, z.r)/x\} r \in \mathcal{ISN}(\beta, \pi)}\,(II)$$

Notice rule II generalizes rule (beta): just take r = xS⃗, with x ∉ S⃗.

The two new properties of strong normalization for (β, π) in ΛJ are contained in the following Lemma.

### Lemma 25. Rules I and II are admissible rules for the predicate ISN (β, π).

We now move to the fourth part of the ongoing reasoning. Consider the map from T<sup>J</sup> to itself obtained by composing (·)<sup>∗</sup> : T<sup>J</sup> → TES with (·)<sup>•</sup> : TES → T<sup>J</sup>. Let us write t<sup>†</sup> := t<sup>∗•</sup>. A recursive definition is also possible, as follows:

$$x^\dagger = x \qquad (\lambda x.t)^\dagger = \lambda x.t^\dagger \qquad t(u,y.v)^\dagger = I(t^\dagger, y\_1.I(u^\dagger, y\_2.\{y\_1(y\_2,z.z)/y\}v^\dagger))$$

Lemma 26. If t ∈ SN (β, π) then t<sup>†</sup> ∈ SN (β, π).

Proof. Heavy use is made of Lem. 24 and Lem. 25.

All is in place to obtain the desired result:

Theorem 7. Let t ∈ T<sup>J</sup> . t ∈ SN (dβ) iff t ∈ SN (β, π).

Proof. The implication from left to right is Lem. 21. For the converse, suppose t ∈ SN (β, π). By Lem. 26, t<sup>†</sup> ∈ SN (β, π). Trivially, t<sup>†</sup> ∈ SN (β). Since t<sup>†</sup> = t<sup>∗•</sup>, Lem. 23 gives t<sup>∗</sup> ∈ SN (B, s). By Lem. 22, t<sup>∗</sup> ∈ SN (dB, s). By an equivalence in the proof of Thm. 6, t ∈ SN (dβ).

#### 6.4 Consequences for ΛJ

The comparison with λJ gives new results about the original ΛJ (a quantitative typing system characterizing strong normalization, and a faithful translation into ES) as immediate consequences of Thms. 3, 5, and 7.

Corollary 2. Let t ∈ T<sup>J</sup> . (1) t ∈ SN (β, π) iff t is ∩J-typable. (2) t ∈ SN (β, π) iff t <sup>∗</sup> ∈ SN (dB, s).

Beyond strong normalization, ΛJ gains a new normalizing strategy, which reuses the notion of weak-head normal form introduced in Sec. 3.2. We take the definitions of neutral terms, answer and weak-head context W given there for λJ, in order to define a new weak-head strategy and a new predicate ISN for ΛJ. The strategy is defined as the closure under W of rule β and of the particular case of rule π where the redex has the form n(u, x.a)S.<sup>5</sup>

<sup>5</sup> Notice how a redex has the two possible forms (λx.t)S or n(u, x.a)S, that can be written as aS, that is, the form D<sub>n</sub>⟨λx.t⟩S of a weak-head redex in λJ.

Definition 8. Predicate ISN is defined by the rules (snvar), (snapp), (snabs) in Def. 3, together with the following two rules (which replace rule (snbeta)):

$$\frac{\mathtt{W}\langle\mathtt{n}(u, y.\mathtt{a}S)\rangle \in ISN}{\mathtt{W}\langle\mathtt{n}(u, y.\mathtt{a})S\rangle \in ISN} \left(\mathsf{snredex1}\right) \qquad \frac{\mathtt{W}\langle\{\{u/x\}t/y\}r\rangle, t, u \in ISN}{\mathtt{W}\langle(\lambda x.t)(u,y.r)\rangle \in ISN} \left(\mathsf{snredex2}\right)$$

The corresponding normalization strategy is organized as usual: an initial phase obtains a weak-head normal form, whose components are then reduced by internal reduction. Is this new strategy any good? The last theorem of the paper answers positively:

Theorem 8. Let t ∈ T<sup>J</sup> . t ∈ ISN iff t ∈ ISN (β, π).

### 7 Conclusion

Contributions. This paper presents and studies several properties of the call-by-name λJ-calculus, a formalism implementing an appropriate notion of distant reduction to unblock the β-redexes arising in generalized application notation.

Strong normalization of simply typed terms was shown by translating λJ into the λ-calculus. A full characterization of strong normalization was developed by means of a quantitative type system, where the length of dβ-reduction to normal form is bounded by the size of the type derivation of the starting term. An inductive definition of dβ-strong normalization was given and proved correct in order to achieve this characterization. It was also shown how the traditional permutative rule π is rejected by the quantitative system, thus emphasizing the choice of dβ-reduction for a quantitative generalized application framework.

We have also defined a faithful translation from the λJ-calculus into ES. The translation preserves strong normalization, in contrast to the traditional translation to ES e.g. in [4]. Last but not least, we related strong normalization of λJ with that of other calculi, including in particular the original ΛJ. New results for the latter were found by means of the techniques developed for λJ. In particular, a quantitative characterization of strong normalization was developed for ΛJ, where the bound of reduction given by the size of type derivations only holds for β-steps (and not for π-steps).

Future work. Regarding call-by-name for generalized applications, this paper opens new questions. We studied a new calculus λJ, proposed as an alternative to the original ΛJ, but we also mentioned some possible variants in Sec. 2.2, notably a calculus based on rule (1), and β + p<sup>2</sup> (used as a technical tool in Sec. 6). The first option seems to have the flavor of ΛJ whereas the β + p<sup>2</sup> option seems to have the flavor of λJ. It remains to be seen what are the advantages and drawbacks of the latter one with respect to λJ.

Regarding call-by-value, we plan to develop the quantitative semantics in the presence of generalized applications, starting from the calculus proposed in [5]. Further unification between call-by-name and call-by-value with the help of generalized applications could be considered in the setting of the polarized lambda-calculus [4].

### References



### Modal Logics and Local Quantifiers: A Zoo in the Elementary Hierarchy

Raul Fervari<sup>1</sup> and Alessio Mansutti<sup>2</sup>

<sup>1</sup> CONICET and Universidad Nacional de Córdoba, Córdoba, Argentina rfervari@unc.edu.ar

<sup>2</sup> Department of Computer Science, University of Oxford, Oxford, UK alessio.mansutti@cs.ox.ac.uk

Abstract. We study a family of modal logics interpreted on tree-like structures, and featuring local quantifiers ∃<sup>k</sup>p that bind the proposition p to worlds that are accessible from the current one in at most k steps. We consider a first-order and a second-order semantics for the quantifiers, which enables us to relate several well-known formalisms, such as hybrid logics, S5Q and graded modal logic. To better stress these connections, we explore fragments of our logics, called herein round-bounded fragments. Depending on whether first-order or second-order semantics is considered, these fragments populate the hierarchy 2NExp ⊂ 3NExp ⊂ · · · or the hierarchy 2AExp<sub>pol</sub> ⊂ 3AExp<sub>pol</sub> ⊂ · · ·, respectively. For formulae up to modal depth k, the complexity improves by one exponential.

### 1 Introduction

From a traditional perspective, modal logics [10] are formalisms to reason about different modes of truth. However, another view consists of seeing these logics as computationally well-behaved fragments of first-order logic and second-order logic (see e.g., [1] for a discussion). Some examples of well-known modal logics with a good balance between expressivity and computational complexity are graded modal logic (GML) [5,28], whose satisfiability problem is PSpace-complete; and the temporal logics LTL, CTL and CTL<sup>∗</sup>, whose satisfiability problems are complete for PSpace, Exp and 2Exp, respectively [31,19,25].

A family of logics that eludes this nice computational picture is that made of modal logics enriched with first-order or second-order propositional quantifiers ∃p, which update the set of worlds of a Kripke structure that satisfy the propositional symbol p. The literature on modal logics featuring quantification over propositional symbols can be traced back to [12,26,18]. All these works show that, in spite of the simplicity of the principle, propositional quantification leads to undecidability very quickly. One of the few exceptions is the logic S5Q, i.e. S5 enriched with second-order propositional quantifiers, which enjoys an exponential-size small model property, and is thus decidable [22,18]. Here, the success in finding a well-behaved framework for propositional quantification is due to the fact that S5 has a very restricted class of models. In modern literature, the family of hybrid logics [2] is one of the most relevant approaches offering first-order propositional quantification. Most hybrid logics provide the operators ↓i, which binds the current world to the proposition i, and @<sub>i</sub>, which allows one to jump to the world bound to i. This form of quantification is very expressive, and leads to undecidability over standard Kripke structures [3]. To regain decidability, one can restrict the logic to syntactical fragments that avoid the quantification patterns □↓□ and ♦↓♦, or restrict the interpretation to models in which each world has at most two successors [14]. Again, one can also simply consider S5 models: the hybrid logic with ↓ and @ on S5 is known to admit an NExp-complete satisfiability problem [30].

Recent works shed new light on the role of propositional quantifiers. From a model-theoretic perspective, a revision of the different forms of propositional quantification has been put forward in [9]. Novel algebraic insights on S5 with propositional quantification have been discovered in [17]. From a computational perspective, [6] shows that second-order propositional quantification is enough to obtain Tower-complete (hence, non-elementary decidable, [29]) logics on tree-like structures. This last result is of interest, as the second-order logic QCTL<sup>t</sup><sub>X</sub> considered in [6] subsumes several other modal logics with forms of quantification "in disguise", such as the aforementioned GML, as well as modal separation logics [16], ambient logics [13] and team logics [21]. However, when translated into QCTL<sup>t</sup><sub>X</sub>, the good computational properties of these logics are lost, and the Tower-hardness of QCTL<sup>t</sup><sub>X</sub> prevents us from grasping the real capabilities of their (often restricted) forms of propositional quantification.

Contributions. The overall message of [6] is that the computational power of propositional quantification in the context of modal logic deserves to be better understood. Driven by this message, we investigate from a unified perspective a family of logics interpreted on tree-like models, featuring a very intuitive form of propositional quantification: the local quantifier ∃<sup>k</sup>p, with k ≥ 1 an integer, that binds the propositional symbol p to world(s) occurring within distance k from the current point of evaluation. More precisely, we look at two families of modal logics: the family ML(∃<sup>1</sup><sub>FO</sub>), ML(∃<sup>2</sup><sub>FO</sub>), · · · , where ML(∃<sup>k</sup><sub>FO</sub>) extends the basic modal logic ML with the first-order local quantifier ∃<sup>k</sup>p binding p to exactly one world occurring within distance k of the current world; and the family ML(∃<sup>1</sup><sub>SO</sub>), ML(∃<sup>2</sup><sub>SO</sub>), · · · , where ML(∃<sup>k</sup><sub>SO</sub>) extends ML with the second-order local quantifier ∃<sup>k</sup>p binding p to a set of worlds occurring within distance k.

As previously mentioned, in introducing these logics our aim is to better understand the similarities and differences between the various modal logics featuring propositional quantification, especially when it comes to their complexity. This analysis cannot be done using Tower-complete logics like QCTL<sup>t</sup><sub>X</sub>, as finer complexity classes are required. In this sense, it is worth noticing that our framework features the logic ML(∃<sup>∞</sup><sub>SO</sub>), whose quantifier ∃<sup>∞</sup>p binds p to arbitrary worlds reachable from the current one. This is exactly the logic QCTL<sup>t</sup><sub>X</sub>. Because of this connection and of similarities with other frameworks, e.g. [7], we argue that even if we restrict ourselves to quantifiers ∃<sup>k</sup> with small k, the complexity does not improve. In fact, ML(∃<sup>2</sup><sub>FO</sub>) is already Tower-complete, although we defer this result to an extended version of the paper, due to lack of space. Consequently, to pursue our goal of a fine-grained analysis of the computational power of propositional quantification in modal logic, in this paper we focus on a syntactical restriction for ML(∃<sup>k</sup><sub>FO</sub>) and ML(∃<sup>k</sup><sub>SO</sub>) where the local quantifiers are round-bounded (Sec. 2). Roughly speaking, under the round-bounded condition, ML(∃<sup>k</sup><sub>FO</sub>) and ML(∃<sup>k</sup><sub>SO</sub>) formulae can be split into parts having k nested modalities. Quantifiers belonging to one part of the formula do not interact with quantifiers from other parts of the formula. The following results are established.

Theorem 1. The sat. problem for round-bounded ML(∃<sup>k</sup><sub>FO</sub>) is (k+1)NExp-complete. It is kNExp-complete for formulae of ML(∃<sup>k</sup><sub>FO</sub>) of modal depth k.

Theorem 2. The sat. problem for round-bounded ML(∃<sup>k</sup><sub>SO</sub>) is (k+1)AExp<sub>pol</sub>-complete. It is kAExp<sub>pol</sub>-complete for formulae of ML(∃<sup>k</sup><sub>SO</sub>) of modal depth k.

Here and throughout the paper, given natural numbers k, n ≥ 1, we write t for the tetration function inductively defined as t(0, n) def = n and t(k, n) def = 2<sup>t(k−1,n)</sup>. Intuitively, t(k, n) defines a tower of exponentials of height k. Then, kNExp is the class of all problems decidable by a non-deterministic Turing machine running in time t(k, f(n)), for some polynomial f, on each input of length n; whereas kAExp<sub>pol</sub> is the class of all problems decidable with an alternating Turing machine [15] in time t(k, f(n)) and performing at most g(n) alternations, for some polynomials f, g, on each input of length n. For all k ≥ 1, kNExp ⊆ kAExp<sub>pol</sub> ⊆ Tower, where we recall that Tower is the class of all problems decidable with a Turing machine running in time t(g(n), f(n)) for some polynomial f and elementary function g, on each input of length n [29]. The lower bounds of Thms. 1 and 2 are established by reduction from suitable tiling problems (Sec. 3). The upper bounds are established by designing a quantifier elimination procedure that yields a (k + 1)ExpSpace small-model property for round-bounded ML(∃<sup>k</sup><sub>SO</sub>), and a kExpSpace small-model property for the set of formulae of ML(∃<sup>k</sup><sub>SO</sub>) of modal depth k (Sec. 4). The round-bounded condition does not change the set of formulae of ML(∃<sup>1</sup><sub>FO</sub>) and ML(∃<sup>1</sup><sub>SO</sub>), and thus, as a corollary, we characterise the complexity of these logics:

Corollary 1. (I) The sat. problem for ML(∃<sup>1</sup><sub>FO</sub>) is 2NExp-complete. (II) The sat. problem for ML(∃<sup>1</sup><sub>SO</sub>) is 2AExp<sub>pol</sub>-complete.
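To give a feeling for the growth rate behind the classes kNExp and kAExp<sub>pol</sub>, the tetration function t can be sketched in a few lines of Python. This is only an illustration of the inductive definition given above; the function name `t` mirrors the text, and the input validation is our own addition.

```python
def t(k: int, n: int) -> int:
    """Tetration as defined in the text: t(0, n) = n and t(k, n) = 2 ** t(k - 1, n)."""
    if k < 0 or n < 0:
        raise ValueError("k and n must be non-negative")
    value = n
    for _ in range(k):
        # Each iteration adds one level to the tower of exponentials.
        value = 2 ** value
    return value


# A tower of height k over n = 2: t(0, 2), t(1, 2), t(2, 2), t(3, 2).
towers = [t(k, 2) for k in range(4)]
```

Already t(4, 2) = 2<sup>65536</sup> has almost twenty thousand decimal digits, which illustrates why each additional exponential in Thms. 1 and 2 is significant.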

As promised, our framework yields a refined analysis of the power of propositional quantification in modal logic, which we compare to previously known results in Sec. 2. Quite surprisingly, we show that, on tree-like models, modal logic enriched with propositional quantifiers is as expressive as graded modal logic. Moreover, we establish that S5Q is AExp<sub>pol</sub>-complete (refining the previous results from [22,18]), and that hybrid logic with ↓ and @ on trees is Tower-complete.

### 2 Preliminaries

The symbol ℕ (resp. ℕ<sup>+</sup>) denotes the set of natural numbers including (resp. excluding) zero. We write $\overline{\mathbb{N}}$ for the set ℕ ∪ {∞}, where n < ∞, ∞ + n = ∞ and n mod ∞ = n for all n ∈ ℕ, and $\overline{\mathbb{N}}^+$ def = $\overline{\mathbb{N}}$ \ {0}. We write |S| ∈ $\overline{\mathbb{N}}$ for the size of a set S. Finally, let AP = {p, q, r, . . . } be a countable set of atomic propositions.

Kripke structures. A Kripke structure is a triple K = (W, R, V) where W is a non-empty set of worlds, V : AP → 2<sup>W</sup> is a valuation, and R ⊆ W × W is a binary accessibility relation. A Kripke-style forest is a Kripke structure whose accessibility relation R is such that its inverse R<sup>−1</sup> is functional and acyclic. In particular, the graph described by K is a collection of disjoint trees, where R encodes the child relation. We write R(w) for the set of children of w, i.e. {w′ ∈ W : (w, w′) ∈ R}. For i ∈ ℕ, R<sup>i</sup> is the i-th composition of R: R<sup>0</sup> is the identity map on W, and R<sup>i+1</sup> def = {(w, w′) ∈ W × W : (w, w″) ∈ R<sup>i</sup> and (w″, w′) ∈ R, for some w″ ∈ W}. For n, m ∈ $\overline{\mathbb{N}}$, R<sup>[n,m]</sup> def = ⋃<sub>j=n</sub><sup>m</sup> R<sup>j</sup>, and R<sup>∗</sup> def = R<sup>[0,∞]</sup> is the Kleene closure of R. For W′ ⊆ W, V[p ← W′] is the valuation obtained from V by updating to W′ the set assigned to p ∈ AP. A pointed forest (K, w) is a Kripke-style finite forest K together with one of its worlds w.
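The relations R<sup>i</sup> and R<sup>[n,m]</sup> can be illustrated with a small Python sketch. The encoding of a forest as a dictionary from worlds to lists of children, and the names `successors_at` and `successors_within`, are assumptions made for this example only; the sketch handles finite bounds m.

```python
from typing import Dict, List, Set

# Hypothetical encoding: each world is mapped to the list of its children.
Forest = Dict[str, List[str]]


def successors_at(children: Forest, w: str, i: int) -> Set[str]:
    """R^i(w): worlds reachable from w in exactly i steps (R^0 is the identity)."""
    frontier = {w}
    for _ in range(i):
        frontier = {c for v in frontier for c in children.get(v, [])}
    return frontier


def successors_within(children: Forest, w: str, n: int, m: int) -> Set[str]:
    """R^[n,m](w): union of R^j(w) for n <= j <= m, with m finite."""
    result: Set[str] = set()
    for j in range(n, m + 1):
        result |= successors_at(children, w, j)
    return result


# A tree with root r, children a and b, and a grandchild c below a.
forest: Forest = {"r": ["a", "b"], "a": ["c"], "b": [], "c": []}
```

On a finite forest, R<sup>∗</sup>(w) can be obtained from `successors_within` by taking m to be the height of the forest.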

Modal logic with local quantifiers. For k ∈ N<sup>+</sup> written in unary, we introduce the modal logic ML(∃<sup>k</sup>), whose formulae ϕ, ψ, χ, etc., are from the grammar below:

$$
\varphi, \psi \; := \; \top \; | \; p \; | \; \varphi \land \psi \; | \; \neg \varphi \; | \; \Diamond \varphi \; | \; \exists^k p \varphi , \qquad \text{where } p \in \mathsf{AP}.
$$

We call ∃<sup>k</sup>p a local (existential) quantifier. We are interested in two interpretations for the logic ML(∃<sup>k</sup>), one where the local quantifier ∃<sup>k</sup>p performs a first-order quantification, and one where it performs a second-order one. For simplicity, ML(∃<sup>k</sup><sub>FO</sub>) (resp. ML(∃<sup>k</sup><sub>SO</sub>)) stands for ML(∃<sup>k</sup>) interpreted under first-order (resp. second-order) semantics. The basic modal logic ML is obtained by removing the constructor ∃<sup>k</sup>p ϕ from the grammar.

Let (K, w) be a pointed forest, where K = (W, R, V). For formulae of ML(∃<sup>k</sup><sub>FO</sub>), the satisfaction relation |= is defined as follows (Boolean cases are omitted):

$$\begin{aligned}
\mathcal{K}, w &\models p && \Leftrightarrow \; w \in \mathcal{V}(p); \\
\mathcal{K}, w &\models \lozenge \varphi && \Leftrightarrow \; \text{there is } w' \in R(w) \text{ s.t. } \mathcal{K}, w' \models \varphi; \\
\mathcal{K}, w &\models \exists^{k} p \; \varphi && \Leftrightarrow \; \text{there is } w' \in R^{[0,k]}(w) \text{ such that } (\mathcal{W}, R, \mathcal{V}[p \gets \{w'\}]), w \models \varphi.
\end{aligned}$$

An atomic proposition p is said to be a nominal for (K, w) whenever |V(p)| = 1. Additionally, p is i-local whenever V(p) ⊆ R<sup>i</sup>(w). In particular, the first-order quantification ∃<sup>k</sup>p ϕ leads to ϕ being evaluated in a pointed forest where p is an i-local nominal for some i ∈ [0, k]. Given a nominal p, we call w ∈ V(p) the world corresponding to p, and often denote it by w<sub>p</sub>.

For formulae of the second-order logic ML(∃<sup>k</sup><sub>SO</sub>), the interpretation of the ML fragment remains as for ML(∃<sup>k</sup><sub>FO</sub>), whereas we reinterpret the local quantifier as:

$$\mathcal{K}, w \models \exists^k p \,\varphi \;\Leftrightarrow\; \text{there is a set } \mathcal{W}' \subseteq R^{[0,k]}(w) \text{ s.t. } (\mathcal{W}, R, \mathcal{V}[p \gets \mathcal{W}']), w \models \varphi.$$
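The difference between the two semantics can be made concrete with a naive model checker for finite pointed forests. The sketch below is illustrative only: the tuple-based formula encoding (`"top"`, `"prop"`, `"not"`, `"and"`, `"dia"`, `"ex"`), the dictionary encoding of forests and the function names are assumptions of this example, and the second-order case enumerates all subsets, hence takes exponential time.

```python
from itertools import chain, combinations


def reach(children, w, k):
    """R^[0,k](w): worlds within distance at most k from w."""
    seen, frontier = {w}, {w}
    for _ in range(k):
        frontier = {c for v in frontier for c in children.get(v, ())}
        seen |= frontier
    return seen


def candidates(worlds, mode):
    """Sets a local quantifier may assign to p: singletons (FO) or all subsets (SO)."""
    ws = sorted(worlds)
    if mode == "FO":
        return [frozenset({v}) for v in ws]
    subsets = chain.from_iterable(combinations(ws, r) for r in range(len(ws) + 1))
    return [frozenset(s) for s in subsets]


def holds(children, val, w, phi, mode="FO"):
    """Decide K, w |= phi; val maps each proposition to the set of worlds satisfying it."""
    tag = phi[0]
    if tag == "top":
        return True
    if tag == "prop":
        return w in val.get(phi[1], frozenset())
    if tag == "not":
        return not holds(children, val, w, phi[1], mode)
    if tag == "and":
        return (holds(children, val, w, phi[1], mode)
                and holds(children, val, w, phi[2], mode))
    if tag == "dia":
        return any(holds(children, val, c, phi[1], mode) for c in children.get(w, ()))
    if tag == "ex":  # ("ex", k, p, body) stands for the local quantifier over p
        _, k, p, body = phi
        return any(holds(children, {**val, p: s}, w, body, mode)
                   for s in candidates(reach(children, w, k), mode))
    raise ValueError(f"unknown connective: {tag}")


# A root r with two children, and the formula "exists^1 p : (dia p) and not (dia not p)".
two_children = {"r": ["a", "b"]}
all_children_p = ("ex", 1, "p",
                  ("and", ("dia", ("prop", "p")),
                          ("not", ("dia", ("not", ("prop", "p"))))))
```

The example formula requires p to hold on every child of the current world, and on at least one of them. With two children this is impossible when p must be a single world (first-order semantics), but possible when p may be a set (second-order semantics).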

The contradiction ⊥ and connectives ∨, ⇒ and ⇔ are defined as usual. Below, let ϕ and ψ be two formulae of ML(∃<sup>k</sup>). The local universal quantifier ∀<sup>k</sup>p ϕ and the modality □ϕ are defined as ¬∃<sup>k</sup>p ¬ϕ and ¬♦¬ϕ, respectively. We define ♦<sup>0</sup>ϕ def = ϕ and, given i ∈ N, ♦<sup>i+1</sup>ϕ def = ♦<sup>i</sup>♦ϕ. Similarly, □<sup>i</sup>ϕ def = ¬♦<sup>i</sup>¬ϕ. We write @<sup>i</sup><sub>p</sub>ϕ for ♦<sup>i</sup>(p ∧ ϕ). If p is a nominal, the formula @<sup>i</sup><sub>p</sub>ϕ states that p is i-local, and that its corresponding world satisfies ϕ. We define ◈<sup>0</sup>ϕ def = ϕ and ▣<sup>0</sup>ϕ def = ϕ and, given i ∈ N, ◈<sup>i+1</sup>ϕ def = ϕ ∨ ♦◈<sup>i</sup>ϕ and ▣<sup>i+1</sup>ϕ def = ϕ ∧ □▣<sup>i</sup>ϕ. That is, ◈<sup>i</sup>ϕ (resp. ▣<sup>i</sup>ϕ) states that ϕ holds in some (resp. every) world reachable in at most i steps. We use the operator precedence {¬, ♦, □, ∃<sup>k</sup>, ∀<sup>k</sup>, @<sup>i</sup><sub>p</sub>} < {∧, ∨} < {⇒, ⇔}, and sometimes write ":" after a local quantifier with the intuitive meaning that the formula on the right of ":" should be enclosed in brackets, e.g. ∃<sup>2</sup>p : ϕ ∧ ψ abbreviates ∃<sup>2</sup>p (ϕ ∧ ψ). Given i ∈ N, we write ϕ[ψ ←<sup>i</sup> χ] for the formula obtained from ϕ by simultaneously substituting with χ each occurrence of the formula ψ appearing under the scope of exactly i nested modalities.

The length of ϕ, denoted by |ϕ|, is the number of symbols needed to represent ϕ. The modal depth md(ϕ) of ϕ is the maximal number of nested modalities occurring in ϕ. We write bp(ϕ) for the set of bound propositions of ϕ, i.e. propositions p that occur in a quantifier ∃<sup>k</sup>p inside ϕ. We say that ϕ is well-quantified whenever each subformula ∃<sup>k</sup>p ψ of ϕ quantifies on a different p ∈ AP, and every occurrence of p in ψ appears under the scope of at most k modalities. One can translate every formula into a well-quantified one at no cost: atomic propositions can be renamed, and occurrences of a quantified atomic proposition that are under the scope of more than k modalities can be replaced with ⊥.

We write ϕ ≡<sub>FO</sub> ψ (resp. ϕ ≡<sub>SO</sub> ψ) whenever ϕ and ψ are equivalent under their first-order (resp. second-order) semantics, i.e. they are satisfied by the same pointed forests. When clear from the context or true under both semantics, we drop the subscripts and write ϕ ≡ ψ. Notice that ∃<sup>k</sup>p ϕ ≡ ∃<sup>k+1</sup>p (ϕ ∧ □<sup>k+1</sup>¬p), and thus ML(∃<sup>k</sup>) is a syntactical fragment of ML(∃<sup>k+1</sup>), and it is able to express all the local quantifiers ∃<sup>1</sup>p, . . . , ∃<sup>k</sup>p.

Round-bounded fragment. As discussed in Sec. 1, in this paper we focus on a syntactical restriction for ML(∃<sup>k</sup>) where the local quantifiers are round-bounded. The round-bounded formulae of ML(∃<sup>k</sup>) are those generated from the symbol ϕ<sup>k</sup><sub>0</sub> of the grammar below (j ∈ N):

$$
\varphi\_j^k, \psi\_j^k := \top \mid p \; \mid \; \varphi\_j^k \land \psi\_j^k \; \mid \; \neg \varphi\_j^k \mid \; \Diamond \varphi\_{j+1}^k \; \mid \; \exists^{k-(j \bmod k)} p \; \varphi\_j^k, \; \text{where } p \in \mathsf{AP}.
$$

In a round-bounded formula of ML(∃<sup>k</sup>), quantifiers appearing under the scope of j modalities are restricted to ∃<sup>k−(j mod k)</sup>, e.g. ∃<sup>3</sup>p ♦∃<sup>2</sup>q ♦∃<sup>1</sup>r ♦∃<sup>3</sup>p ϕ is a round-bounded formula of ML(∃<sup>3</sup>), provided that ϕ is also in this fragment, whereas ∃<sup>3</sup>p ♦∃<sup>3</sup>q ϕ is not round-bounded. The round-bounded condition does not change the set of formulae of ML(∃<sup>1</sup>) and ML(∃<sup>∞</sup>). Besides, every formula of ML(∃<sup>∞</sup>) of modal depth k is equivalent to a round-bounded formula of ML(∃<sup>k</sup>), of similar size, since given a formula ϕ of ML(∃<sup>∞</sup>), we have ∃<sup>∞</sup>p ϕ ≡ ∃<sup>md(ϕ)</sup>p ϕ.
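Membership in the round-bounded fragment is a purely syntactic check that follows the grammar above. The following Python sketch illustrates it for finite k; the tuple-based encoding of formulae and the name `is_round_bounded` are assumptions made for this example.

```python
def is_round_bounded(phi, k, j=0):
    """Check that every quantifier under j modalities has superscript k - (j mod k).

    Hypothetical encoding: ("top",), ("prop", p), ("not", f), ("and", f, g),
    ("dia", f) and ("ex", bound, p, f) for the local quantifier with that bound.
    """
    tag = phi[0]
    if tag in ("top", "prop"):
        return True
    if tag == "not":
        return is_round_bounded(phi[1], k, j)
    if tag == "and":
        return is_round_bounded(phi[1], k, j) and is_round_bounded(phi[2], k, j)
    if tag == "dia":
        # Entering a modality increases the number of enclosing modalities.
        return is_round_bounded(phi[1], k, j + 1)
    if tag == "ex":
        _, bound, _p, body = phi
        return bound == k - (j % k) and is_round_bounded(body, k, j)
    raise ValueError(f"unknown connective: {tag}")


top = ("top",)
# The two examples from the text, with the innermost formula instantiated by top.
good = ("ex", 3, "p", ("dia", ("ex", 2, "q", ("dia", ("ex", 1, "r",
        ("dia", ("ex", 3, "p", top)))))))
bad = ("ex", 3, "p", ("dia", ("ex", 3, "q", top)))
```

For k = 1 the check accepts every formula that only uses ∃<sup>1</sup>, in line with the remark that the round-bounded condition does not change the set of formulae of ML(∃<sup>1</sup>).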

Our framework of local quantifiers enables us to derive connections with other modal logics featuring some form of quantification, which we now briefly discuss.

Graded modal logic. A logic that has been shown to be related to different forms of quantification is the graded modal logic GML [5], which extends ML with modalities ♦<sub>≥ℓ</sub> (ℓ ∈ N), with semantics: K, w |= ♦<sub>≥ℓ</sub>ϕ ⇔ |{w′ ∈ R(w) | K, w′ |= ϕ}| ≥ ℓ. GML has a tree model property, i.e., each of its satisfiable formulae is satisfied by a pointed forest. Then, by syntactically replacing each ♦<sub>≥ℓ</sub>ϕ occurring in a GML formula by ∃<sup>1</sup>x<sub>1</sub>, . . . , x<sub>ℓ</sub> : (⋀<sub>i=1</sub><sup>ℓ</sup> ⋀<sub>j=i+1</sub><sup>ℓ</sup> @<sup>1</sup><sub>x<sub>i</sub></sub>¬x<sub>j</sub>) ∧ □((⋁<sub>i=1</sub><sup>ℓ</sup> x<sub>i</sub>) ⇒ ϕ), one shows that GML embeds in ML(∃<sup>1</sup><sub>FO</sub>). At this point, it is worth noting that, for all k ∈ N<sup>+</sup>, ML(∃<sup>k</sup><sub>FO</sub>) can be embedded into ML(∃<sup>k</sup><sub>SO</sub>) by replacing, in a well-quantified formula of ML(∃<sup>k</sup><sub>FO</sub>), each occurrence of ∃<sup>k</sup>p ϕ with the ML(∃<sup>k</sup><sub>SO</sub>) formula ∃<sup>k</sup>p : ϕ ∧ uniq<sup>k</sup>(p), where uniq<sup>k</sup>(p) def = ◈<sup>k</sup>p ∧ ∀<sup>k</sup>q : ◈<sup>k</sup>(p ∧ q) ⇒ ▣<sup>k</sup>(p ⇒ q) states that there is at most one world satisfying p that is reachable from the current one in at most k steps. Hence, ML(∃<sup>k</sup><sub>SO</sub>) captures GML, and in fact the converse also holds, as we discover when proving Thm. 2. The corollary below is established.

Corollary 2. For k ∈ N<sup>+</sup>, ML(∃<sup>k</sup><sub>FO</sub>), ML(∃<sup>k</sup><sub>SO</sub>) and GML are equally expressive.

This result is surprising, as it implies that QCTL<sup>t</sup><sub>X</sub> from [6] is as expressive as GML, and that in the context of modal logics, second-order propositional quantifiers do not yield any additional expressive power compared to first-order ones.

Connections with S5Q. The sat. problem of S5Q [18,22] is equireducible to the sat. problem for formulae of ML(∃<sup>1</sup><sub>SO</sub>) of modal depth 1. Briefly, any satisfiable formula of S5Q is satisfied by a Kripke structure (W, R, V) where R = W × W, and S5Q enriches ML with quantifiers ∃p which, by virtue of the relation R, are essentially the quantifiers ∃<sup>1</sup>p from ML(∃<sup>1</sup><sub>SO</sub>). We can simulate the models of S5Q by using a pointed forest (K, w) with accessibility relation R′ such that R′(w) = W. The current world of the S5Q model is simulated with a 1-local nominal x for (K, w). Then, the translation τ from S5Q to ML(∃<sup>1</sup><sub>SO</sub>) is simple: τ(♦ϕ) = ∃<sup>1</sup>x : ♦x ∧ uniq<sup>1</sup>(x) ∧ τ(ϕ), binding the nominal x to a new world; τ(p) = @<sup>1</sup><sub>x</sub>p, and otherwise τ is homomorphic. A similar translation can be given from formulae of ML(∃<sup>1</sup><sub>SO</sub>) with modal depth 1 to S5Q. Following Thm. 2, this allows us to characterise the complexity of S5Q left open in [18].

Corollary 3. The sat. problem for S5Q is AExp<sub>pol</sub>-complete.

Connections with hybrid logics. Hybrid logics [3] are among the most studied modal logics featuring first-order propositional quantification. Given a set of nominals NOM ⊆ AP, the hybrid logic HL(↓, @) extends ML with the binder ↓i and the satisfaction operator @<sub>i</sub> (where i ∈ NOM), having the semantics below:

(W, R, V), w |= ↓i.ϕ ⇔ (W, R, V[i ← {w}]), w |= ϕ;

(W, R, V), w |= @<sub>i</sub>ϕ ⇔ (W, R, V), w<sub>i</sub> |= ϕ, where V(i) = {w<sub>i</sub>}.

ML(∃<sup>k</sup><sub>FO</sub>) embeds in HL(↓, @) by replacing with ↓i.◈<sup>k</sup>↓p.@<sub>i</sub>ϕ each occurrence of ∃<sup>k</sup>p ϕ appearing in an ML(∃<sup>k</sup><sub>FO</sub>) formula. This translation is (only) exponential in k, and so by uniform reduction for all k ∈ N<sup>+</sup>, and by Rabin's theorem [27] for the upper bound, Thm. 1 implies the following result.

Corollary 4. The sat. problem for HL(↓, @) on forests is Tower-complete.

### 3 Lower bounds for ML(∃<sup>k</sup><sub>FO</sub>) and ML(∃<sup>k</sup><sub>SO</sub>)

In this section, we establish the lower bounds of Thms. 1 and 2, which follow by reduction from the k-exp alternating multi-tiling problem. While we will introduce this problem in due time, the main difficulty in establishing the reduction is defining, for all k, n ∈ N<sup>+</sup> given in unary, a formula type(k, n) that, whenever satisfied by a pointed forest (K, w), forces w to have t(k, n) children, each of them encoding a different number in [0, t(k, n) − 1]. To establish Thms. 1 and 2, it is essential that type(k, n) is of size polynomial in k and n, has modal depth k, is in ML(∃<sup>1</sup><sub>FO</sub>) for k = 1, and is in round-bounded ML(∃<sup>k−1</sup><sub>FO</sub>) for all k ≥ 2. The formula type(k, n) is inspired by the homonymous formula defined in [6] to show that QCTL<sup>t</sup><sub>X</sub> is Tower-hard, and later adapted in [7] to modal separation logics. With respect to both these works, our definition of type(k, n) poses two serious challenges. First, [6,7] rely on second-order quantification, whereas we only use first-order. Second, in [6,7] the formula type(k, n) is of size exponential in k, whereas our formula is of polynomial size. To achieve both improvements, we rely on a novel gadget that simulates binary addition with carry.

Fig. 1: Two worlds w and w′ satisfying type(1, 2) and type(k, n), respectively.

Numeric encoding. First of all, let us define how numbers are encoded by worlds of a pointed forest, following the presentation of [6]. Fix n + 1 distinct atomic propositions p<sub>1</sub>, . . . , p<sub>n</sub>, b, and consider a Kripke-style forest K = (W, R, V). Given j ∈ [1, k] and w ∈ W, we write n<sub>j</sub>(w) for the number in [0, t(j, n) − 1] encoded by w. For j = 1, we represent n<sub>1</sub>(w) ∈ [0, 2<sup>n</sup> − 1] by using the truth values of the propositions p<sub>1</sub>, . . . , p<sub>n</sub>, where the proposition p<sub>i</sub> is responsible for the i-th least significant bit of the number. That is, n<sub>1</sub>(w) def = Σ{2<sup>i−1</sup> : i ∈ [1, n] and w ∈ V(p<sub>i</sub>)}. For j > 1, the number n<sub>j</sub>(w) is represented by the binary encoding of the truth values of the atomic proposition b on the children of w, where a child w′ ∈ R(w) with n<sub>j−1</sub>(w′) = i from [0, t(j − 1, n) − 1] is responsible for the (i + 1)-th least significant bit of the number encoded by w. Formally, n<sub>j</sub>(w) def = Σ{2<sup>i</sup> : n<sub>j−1</sub>(w′) = i and w′ ∈ V(b), for some w′ ∈ R(w)}.
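The two cases of the encoding can be sketched as a recursive Python function. The dictionary encoding of the forest and of the valuation, the proposition names `"p1"`, `"p2"`, … and `"b"`, and the function name `encoded_number` are assumptions of this example.

```python
def encoded_number(j, w, children, val, n_bits):
    """Return n_j(w), following the two cases of the definition above."""
    if j == 1:
        # Base case: proposition p_i carries the i-th least significant bit.
        return sum(2 ** (i - 1) for i in range(1, n_bits + 1)
                   if w in val.get(f"p{i}", set()))
    # Inductive case: a child c with n_{j-1}(c) = i that satisfies b contributes 2^i.
    # A set is used because the definition sums over a set of powers of two.
    powers = {2 ** encoded_number(j - 1, c, children, val, n_bits)
              for c in children.get(w, ()) if c in val.get("b", set())}
    return sum(powers)


# World w has four children encoding 0, 1, 2, 3 on n = 2 bits; b holds on c0 and c2.
children = {"w": ["c0", "c1", "c2", "c3"]}
val = {"p1": {"c1", "c3"}, "p2": {"c2", "c3"}, "b": {"c0", "c2"}}
```

In this example the children c0 and c2 are responsible for the first and third least significant bits, so n<sub>2</sub>(w) = 2<sup>0</sup> + 2<sup>2</sup> = 5.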

With respect to this encoding of numbers, the forthcoming formula type(k, n) shall satisfy the specification given by the lemma below, which guarantees that in a pointed forest (K, w) satisfying type(k, n), the numbers encoded by the children of w span the whole interval [0, t(k, n) − 1]. This is illustrated in Fig. 1.

Lemma 1. A pointed forest (K, w), with K = (W, R, V), satisfies type(k, n) iff

1. for all i ∈ [0, t(k, n) − 1] there is exactly one world w′ ∈ R(w) s.t. n<sub>k</sub>(w′) = i;
2. if k > 1, then for every w′ ∈ R(w), K, w′ |= type(k − 1, n).

Fig. 2: Auxiliary formulae used in the definition of type(k, n), where i = k = 1 or i < k.

Addition with carry. In defining type(k, n), the main challenge lies in how to express the condition (1) of Lemma 1. In [6,7], this boils down to the definition of formulae that express (in)equalities between the numbers encoded by distinct w<sub>1</sub>, w<sub>2</sub> ∈ R(w), e.g. n<sub>k</sub>(w<sub>1</sub>) < n<sub>k</sub>(w<sub>2</sub>) or n<sub>k</sub>(w<sub>1</sub>) = n<sub>k</sub>(w<sub>2</sub>) + 1. Unfortunately, these formulae are tree-recursive on k, meaning that multiple (possibly negated) occurrences of the inequalities for the case k − 1 are required to define the inequalities for the case k. Overall, this induces an exponential blow-up on |type(k, n)|. To avoid this blow-up, instead of relying on these inequalities we consider a quaternary relation +<sub>k</sub>(w<sub>1</sub>, w<sub>2</sub>, w<sub>3</sub>, w<sub>4</sub>) that holds whenever n<sub>k</sub>(w<sub>1</sub>) + n<sub>k</sub>(w<sub>2</sub>) = n<sub>k</sub>(w<sub>3</sub>) and n<sub>k</sub>(w<sub>4</sub>) represents the sequence of carries needed to perform n<sub>k</sub>(w<sub>1</sub>) + n<sub>k</sub>(w<sub>2</sub>) in binary, on t(k − 1, n) bits. For instance, for 4-bit numbers n<sub>1</sub>(w<sub>1</sub>) = 3 = (0011)<sub>2</sub>, n<sub>1</sub>(w<sub>2</sub>) = 5 = (0101)<sub>2</sub>, n<sub>1</sub>(w<sub>3</sub>) = 8 = (1000)<sub>2</sub> and n<sub>1</sub>(w<sub>4</sub>) = 14 = (1110)<sub>2</sub>, the tuple (w<sub>1</sub>, w<sub>2</sub>, w<sub>3</sub>, w<sub>4</sub>) is in +<sub>1</sub>, as

$$\begin{array}{rl} 1\;1\;1\;0 & : w\_4 \\ 0\;0\;1\;1 & : w\_1 \\ +\;\; 0\;1\;0\;1 & : w\_2 \\ \hline 1\;0\;0\;0 & : w\_3 \end{array}$$

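corresponds to the table for the binary addition with carry of 3 + 5 = 8. By looking at the elementary algorithm for addition, a direct characterisation of +<sub>k</sub> is as follows. Let n<sub>k</sub>(w<sub>1</sub>) = (x<sub>m</sub> . . . x<sub>1</sub>)<sub>2</sub>, n<sub>k</sub>(w<sub>2</sub>) = (y<sub>m</sub> . . . y<sub>1</sub>)<sub>2</sub>, n<sub>k</sub>(w<sub>3</sub>) = (z<sub>m</sub> . . . z<sub>1</sub>)<sub>2</sub>, n<sub>k</sub>(w<sub>4</sub>) = (c<sub>m</sub> . . . c<sub>1</sub>)<sub>2</sub>, where m = t(k − 1, n), and x<sub>i</sub>, y<sub>i</sub>, z<sub>i</sub> and c<sub>i</sub> are the i-th least significant digits in the binary encoding of n<sub>k</sub>(w<sub>1</sub>), n<sub>k</sub>(w<sub>2</sub>), n<sub>k</sub>(w<sub>3</sub>), n<sub>k</sub>(w<sub>4</sub>), respectively. Then, +<sub>k</sub>(w<sub>1</sub>, w<sub>2</sub>, w<sub>3</sub>, w<sub>4</sub>) holds if and only if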

**A.** $c\_{1} = 0$ and at most one among $c\_{m}$, $x\_{m}$ and $y\_{m}$ is $1$,

**B.** for every $i \in [2, m]$, $c\_{i} = \mathit{maj}(x\_{i-1}, y\_{i-1}, c\_{i-1})$,

**C.** for every $i \in [1, m]$, $z\_{i} = (x\_{i} \oplus y\_{i}) \oplus c\_{i}$,   (†)

where maj(ϕ, ψ, χ) def = (ϕ∧ψ)∨(ϕ∧χ)∨(ψ∧χ) and ϕ⊕ψ def = (ϕ∨ψ)∧¬(ϕ∧ψ) are the standard Boolean functions majority and exclusive or, respectively. When it comes to capturing +<sub>k</sub> with an ML(∃<sup>k</sup><sub>FO</sub>) formula, the key property is that the conditions (A), (B) and (C) can be checked with first-order quantification, by going through the binary encodings of n<sub>k</sub>(w<sub>1</sub>), n<sub>k</sub>(w<sub>2</sub>), n<sub>k</sub>(w<sub>3</sub>) and n<sub>k</sub>(w<sub>4</sub>) bit by bit, as one would do to check if an addition with carry was performed correctly.
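The bit-by-bit reading of the conditions (A), (B) and (C) can be sketched directly on integers. The Python functions below are an illustration on plain numbers rather than on worlds of a forest; the names `bits`, `maj` and `plus`, and the explicit range check, are assumptions of this example.

```python
def bits(value, m):
    """Bits of value, least significant first; index 0 holds digit 1 of the text."""
    return [(value >> i) & 1 for i in range(m)]


def maj(a, b, c):
    """Boolean majority of three bits."""
    return (a & b) | (a & c) | (b & c)


def plus(x, y, z, c, m):
    """Check conditions (A), (B), (C) for m-bit numbers, where c is the carry sequence."""
    if not all(0 <= v < 2 ** m for v in (x, y, z, c)):
        return False
    xs, ys, zs, cs = (bits(v, m) for v in (x, y, z, c))
    # (A): no initial carry, and no overflow out of the most significant bit.
    cond_a = cs[0] == 0 and xs[m - 1] + ys[m - 1] + cs[m - 1] <= 1
    # (B): each carry is the majority of the previous bits and the previous carry.
    cond_b = all(cs[i] == maj(xs[i - 1], ys[i - 1], cs[i - 1]) for i in range(1, m))
    # (C): each result bit is the exclusive or of the two bits and the carry.
    cond_c = all(zs[i] == (xs[i] ^ ys[i]) ^ cs[i] for i in range(m))
    return cond_a and cond_b and cond_c


# The example from the text: 3 + 5 = 8 on 4 bits, with carries (1110)_2 = 14.
example_holds = plus(3, 5, 8, 14, 4)
```

Since (B) and (C) determine the carries and the result from the two summands, and (A) excludes an overflow, some carry sequence c satisfies the three conditions exactly when x + y = z.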

A schema for type(k, n). We move to the definition of type(k, n). In view of its specification given in Lemma 1, the formula is defined recursively on k. For simplicity, we extend type(k, n) to k = 0, and define it as ⊤. To express the condition (1) of Lemma 1, we rely on the auxiliary formulae presented in Fig. 2, which we later define. For k, n ∈ N<sup>+</sup>, we define type(k, n) as:

$$\begin{split} & \Box\, \mathit{type}(k-1,n) \wedge \lozenge \mathit{0}\_{k} \wedge \lozenge \mathit{1}\_{k} \wedge \lozenge \mathit{E}\_{k} \wedge {} \\ & \forall^{1}\mathtt{x}\; \forall^{1}\mathtt{y} \left( \lozenge \mathtt{y} \wedge @^{1}\_{\mathtt{x}} \neg \mathtt{y} \Rightarrow \exists^{1}\mathtt{z}, \mathtt{c} : \lozenge \mathtt{c} \wedge @^{1}\_{\mathtt{z}} \neg \mathit{0}\_{k} \wedge \left( \mathit{add}\_{k}^{1}(\mathtt{x},\mathtt{z},\mathtt{y},\mathtt{c}) \vee \mathit{add}\_{k}^{1}(\mathtt{y},\mathtt{z},\mathtt{x},\mathtt{c}) \right) \right). \end{split}$$

Whereas the first conjunct of type(k, n) clearly encodes the condition (2) of Lemma 1, the remaining part of the formula forces the condition (1) by saying that the current world w has three children encoding the numbers 0, 1 and t(k, n) − 1, respectively, and that for every two children w<sub>x</sub>, w<sub>y</sub> of w, if w<sub>x</sub> ≠ w<sub>y</sub> (subformula ♦y ∧ @<sup>1</sup><sub>x</sub>¬y) then there is a child w<sub>z</sub> of w such that n<sub>k</sub>(w<sub>z</sub>) ≠ 0, and n<sub>k</sub>(w<sub>x</sub>) + n<sub>k</sub>(w<sub>z</sub>) = n<sub>k</sub>(w<sub>y</sub>) or n<sub>k</sub>(w<sub>y</sub>) + n<sub>k</sub>(w<sub>z</sub>) = n<sub>k</sub>(w<sub>x</sub>). Hence, in combination with ♦*0*<sub>k</sub>, ♦*1*<sub>k</sub> and ♦*E*<sub>k</sub>, the last conjunct of type(k, n) not only states that distinct children of w must encode different numbers, but also that every number of [0, t(k, n) − 1] must be encoded by some child of w.

To effectively construct type(k, n), what is left is to define the formulae in Fig. 2. Given how the numbers n<sub>k</sub>(.) are encoded, the definitions of *0*<sub>k</sub>, *1*<sub>k</sub> and *E*<sub>k</sub> are simple. For the case k = 1, we define *0*<sub>1</sub> def = ⋀<sub>j=1</sub><sup>n</sup> ¬p<sub>j</sub>, *1*<sub>1</sub> def = (p<sub>1</sub> ∧ ⋀<sub>j=2</sub><sup>n</sup> ¬p<sub>j</sub>) and *E*<sub>1</sub> def = ⋀<sub>j=1</sub><sup>n</sup> p<sub>j</sub>. For k ≥ 2, we define instead: *0*<sub>k</sub> def = □¬b, *1*<sub>k</sub> def = □(b ⇒ *0*<sub>k−1</sub>), and *E*<sub>k</sub> def = □b. The main difficulty lies in how to define add<sup>i</sup><sub>k</sub>, which requires a recursive definition. Below, we consider three cases. First, we consider the base case i = k = 1 and define add<sup>1</sup><sub>1</sub> by only using the local quantifiers ∃<sup>1</sup>. Afterwards, we consider the case 1 ≤ i < k − 1 and define the formula add<sup>i</sup><sub>k</sub> by using local quantifiers ∃<sup>1</sup>, . . . , ∃<sup>k−1</sup>. This formula relies on the definition of add<sup>i+1</sup><sub>k</sub>, which we assume to be defined by inductive reasoning. Lastly, we consider the only remaining case of i = k − 1, and define add<sup>k−1</sup><sub>k</sub> by using quantifiers ∃<sup>k−1</sup> and ∃<sup>1</sup>, and without relying on the definition of add<sup>1</sup><sub>1</sub>. This case is left for last as it is somewhat more involved than the other two cases, and some ingenuity is required to define add<sup>k−1</sup><sub>k</sub> without relying on the local quantifiers ∃<sup>k</sup>. The ad-hoc treatment of this case is however fundamental, as it leads to type(k, n) being a round-bounded formula of the logic ML(∃<sup>k−1</sup><sub>FO</sub>), for every k ≥ 2.

Case: i = k = 1. Recall that the numbers n<sub>1</sub>(.) are encoded using the truth values of p<sub>1</sub>, . . . , p<sub>n</sub> ∈ AP. Then, add<sup>1</sup><sub>1</sub> simply follows the constraints (†) of +<sub>1</sub>:

$$\mathit{add}\_1^1(\mathtt{x}, \mathtt{y}, \mathtt{z}, \mathtt{c}) \stackrel{\text{def}}{=} @\_{\mathtt{c}}^1 \neg p\_1 \wedge \bigwedge\_{q \in \{\mathtt{x}, \mathtt{y}, \mathtt{c}\}} \Big( @\_{q}^1 p\_n \Rightarrow \bigwedge\_{r \in \{\mathtt{x}, \mathtt{y}, \mathtt{c}\} \setminus \{q\}} @\_{r}^{1} \neg p\_n \Big) \tag{A}$$

$$\wedge \bigwedge\_{i=2}^{n} \left( @\_{\mathtt{c}}^{1} p\_{i} \Leftrightarrow \mathit{maj}(@\_{\mathtt{x}}^{1} p\_{i-1}, @\_{\mathtt{y}}^{1} p\_{i-1}, @\_{\mathtt{c}}^{1} p\_{i-1}) \right) \tag{B}$$

$$\wedge \bigwedge\_{i=1}^{n} \left( @\_{\mathtt{z}}^{1} p\_{i} \Leftrightarrow \left( (@\_{\mathtt{x}}^{1} p\_{i} \oplus @\_{\mathtt{y}}^{1} p\_{i}) \oplus @\_{\mathtt{c}}^{1} p\_{i} \right) \right) \tag{C}$$

Case: 1 ≤ i < k − 1. To define add<sup>i</sup><sub>k</sub>, we assume by inductive reasoning that the formula add<sup>i+1</sup><sub>k</sub> is correctly defined, following its specification in Fig. 2. We specialise add<sup>i+1</sup><sub>k</sub> to define the two auxiliary formulae below:

$$\begin{array}{lcl} \mathit{eq}\_{k}^{i+1}(\mathtt{x},\mathtt{y}) & \stackrel{\text{def}}{=} & \exists^{i+1}\mathtt{z}, \mathtt{c} : \lozenge^{i+1}\mathtt{c} \wedge @^{i+1}\_{\mathtt{z}} \mathit{0}\_{k-i} \wedge \mathit{add}\_{k}^{i+1}(\mathtt{y},\mathtt{z},\mathtt{x},\mathtt{c}); \\ \mathit{succ}\_{k}^{i+1}(\mathtt{x},\mathtt{y}) & \stackrel{\text{def}}{=} & \exists^{i+1}\mathtt{z}, \mathtt{c} : \lozenge^{i+1}\mathtt{c} \wedge @^{i+1}\_{\mathtt{z}} \mathit{1}\_{k-i} \wedge \mathit{add}\_{k}^{i+1}(\mathtt{y},\mathtt{z},\mathtt{x},\mathtt{c}). \end{array}$$

Let x and y be two (i+1)-local nominals for (K, w), with corresponding worlds w<sub>x</sub> and w<sub>y</sub>. If K, w′ |= type(k − i, n) for some w′ ∈ R<sup>i</sup>(w), then:

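– K, w |= eq<sup>i+1</sup><sub>k</sub>(x, y) if and only if n<sub>k−i</sub>(w<sub>x</sub>) = n<sub>k−i</sub>(w<sub>y</sub>);

– K, w |= succ<sup>i+1</sup><sub>k</sub>(x, y) if and only if n<sub>k−i</sub>(w<sub>x</sub>) = n<sub>k−i</sub>(w<sub>y</sub>) + 1.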

Notice that the semantics of succ<sup>i+1</sup><sub>k</sub> and eq<sup>i+1</sup><sub>k</sub> is given under the hypothesis that a world in R<sup>i</sup>(w) satisfies type(k − i, n). This extra hypothesis ensures that the local quantifiers ∃<sup>i+1</sup>z and ∃<sup>i+1</sup>c used to define succ<sup>i+1</sup><sub>k</sub> and eq<sup>i+1</sup><sub>k</sub> quantify over a set of worlds encoding all the numbers in [0, t(k−(i+1), n)−1], so that no possible addition with carry is missing. In defining add<sup>i</sup><sub>k</sub>(x, y, z, c), this hypothesis is clearly satisfied, as the worlds corresponding to the i-local nominals x, y, z and c are assumed to satisfy type(k − i, n).

By relying on succ<sup>i+1</sup><sub>k</sub> and eq<sup>i+1</sup><sub>k</sub>, we define add<sup>i</sup><sub>k</sub>(x, y, z, c) again by following the characterisation (†) of +<sub>k−i+1</sub>, as shown below (where X ≝ {x, y, c}):

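$$
\begin{array}{ll}
 & \forall^{i+1} \mathsf{x}, \mathsf{y}, \mathsf{z}, \mathsf{c}, \mathsf{g} : \big(@^{i}_{\mathbf{x}} \Diamond \mathsf{x} \wedge @^{i}_{\mathbf{y}} \Diamond \mathsf{y} \wedge @^{i}_{\mathbf{z}} \Diamond \mathsf{z} \wedge @^{i}_{\mathbf{c}} (\Diamond \mathsf{c} \wedge \Diamond \mathsf{g})\big) \Rightarrow \\[2pt]
(A): & \Big(@^{i+1}_{\mathsf{c}} (\mathit{0}_{k-i} \Rightarrow \neg b) \wedge \Big(\big(\bigwedge_{q \in X} @^{i+1}_{q} \mathit{E}_{k-i}\big) \Rightarrow \bigwedge_{q \in X} \big(@^{i+1}_{q} b \Rightarrow \bigwedge_{r \in X \setminus \{q\}} @^{i+1}_{r} \neg b\big)\Big) \\[2pt]
(B): & \wedge \Big(\big(\mathit{eq}^{i+1}_{k}(\mathsf{x}, \mathsf{y}) \wedge \mathit{eq}^{i+1}_{k}(\mathsf{y}, \mathsf{c}) \wedge \mathit{succ}^{i+1}_{k}(\mathsf{g}, \mathsf{c})\big) \Rightarrow \big(@^{i+1}_{\mathsf{g}} b \Leftrightarrow \mathit{maj}(@^{i+1}_{\mathsf{x}} b, @^{i+1}_{\mathsf{y}} b, @^{i+1}_{\mathsf{c}} b)\big)\Big) \\[2pt]
(C): & \wedge \Big(\big(\mathit{eq}^{i+1}_{k}(\mathsf{x}, \mathsf{y}) \wedge \mathit{eq}^{i+1}_{k}(\mathsf{y}, \mathsf{z}) \wedge \mathit{eq}^{i+1}_{k}(\mathsf{z}, \mathsf{c})\big) \Rightarrow \big(@^{i+1}_{\mathsf{z}} b \Leftrightarrow ((@^{i+1}_{\mathsf{x}} b \oplus @^{i+1}_{\mathsf{y}} b) \oplus @^{i+1}_{\mathsf{c}} b)\big)\Big)\Big).
\end{array}
$$

In this formula, bold symbols denote the i-local nominals given as arguments of add<sup>i</sup><sub>k</sub>, sans-serif symbols denote the quantified (i+1)-local nominals, and X ranges over the latter. The first line of add<sup>i</sup><sub>k</sub> binds the quantified nominals x, y, z, and both c and g, to children of w<sub>x</sub>, w<sub>y</sub>, w<sub>z</sub> and w<sub>c</sub>, respectively. Afterwards, the formula follows closely the constraints in (†). For instance, the last conjunct characterises the condition (C) by saying that whenever we consider children w′<sub>x</sub>, w′<sub>y</sub>, w′<sub>z</sub> and w′<sub>c</sub> of w<sub>x</sub>, w<sub>y</sub>, w<sub>z</sub> and w<sub>c</sub> respectively, if j = n<sub>k−i</sub>(w′<sub>x</sub>) = n<sub>k−i</sub>(w′<sub>y</sub>) = n<sub>k−i</sub>(w′<sub>z</sub>) = n<sub>k−i</sub>(w′<sub>c</sub>) for some j ∈ ℕ, then n<sub>2</sub>(w<sub>z</sub>)[j] = ((n<sub>2</sub>(w<sub>x</sub>)[j] ⊕ n<sub>2</sub>(w<sub>y</sub>)[j]) ⊕ n<sub>2</sub>(w<sub>c</sub>)[j]), where n<sub>2</sub>(w)[j] is the (j + 1)-th least significant digit of the number encoded by a world w.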

Case: i = k − 1. To complete the definition of add<sup>i</sup><sub>k</sub>, what is left is to define add<sup>k−1</sup><sub>k</sub> by only using quantifiers ∃<sup>k−1</sup> and ∃<sup>1</sup>. Below, the worlds w<sub>x</sub>, w<sub>y</sub>, w<sub>z</sub> and w<sub>c</sub>, corresponding to the (k−1)-local nominals x, y, z and c, satisfy type(1, n), and so, according to n<sub>2</sub>(.), they encode a number by looking at the value of the proposition b in their children, which themselves encode a number n<sub>1</sub>(.). To properly define add<sup>k−1</sup><sub>k</sub>(x, y, z, c), we rely on the fact that these children encode n-bit numbers, with n given in unary. Then, instead of employing a quantifier ∃<sup>k</sup> to refer to one of these children, we can rely on n + 1 local quantifiers ∃<sup>k−1</sup> to copy the values of p<sub>1</sub>, …, p<sub>n</sub> and b of a child directly on its parent. For instance, to check if w<sub>x</sub> and w<sub>y</sub> have children encoding the same numbers and equisatisfying b, one can follow the steps below, also sketched in Fig. 3:


Fig. 3: Steps to check if two children of w<sub>x</sub> and w<sub>y</sub> encoding the same n<sub>1</sub>(.) equisatisfy b.

This idea of copying information about children of w<sub>x</sub>, w<sub>y</sub>, w<sub>z</sub> and w<sub>c</sub> directly in these four worlds is at the base of our definition of add<sup>k−1</sup><sub>k</sub>, which we now formalise. Similarly to n<sub>1</sub>(.), for an n-tuple of symbols r = (r<sub>1</sub>, …, r<sub>n</sub>), n<sub>r</sub>(w) ≝ Σ{2<sup>i−1</sup> : i ∈ [1, n], w ∈ V(r<sub>i</sub>)} stands for the n-bit number encoded by the world w by looking at the truth values of r<sub>1</sub>, …, r<sub>n</sub>. Given a second n-tuple of atomic propositions s = (s<sub>1</sub>, …, s<sub>n</sub>), we introduce the formulae

$$\mathit{succ}(r@x, s@y) \stackrel{\text{def}}{=} \bigvee_{i=1}^{n} \Big(@^{k-1}_{x} r_i \wedge @^{k-1}_{y} \neg s_i \wedge \bigwedge_{j=1}^{i-1} \big(@^{k-1}_{x} \neg r_j \wedge @^{k-1}_{y} s_j\big) \wedge \bigwedge_{j=i+1}^{n} \big(@^{k-1}_{x} r_j \Leftrightarrow @^{k-1}_{y} s_j\big)\Big)$$

$$\mathit{eq}(r@x, s@y) \stackrel{\text{def}}{=} \bigwedge_{i=1}^{n} \big(@^{k-1}_{x} r_i \Leftrightarrow @^{k-1}_{y} s_i\big)$$

having the following semantics:


The correctness of succ(r@x, s@y) follows from standard arithmetical properties: for two n-bit numbers a and b represented as binary bit vectors with most significant digit first, a = b + 1 holds iff a = c1**0** and b = c0**1** hold for a prefix c ∈ {0, 1}<sup>∗</sup> and bit vectors of the same length **0** ∈ {0}<sup>∗</sup> and **1** ∈ {1}<sup>∗</sup>.
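This arithmetical property can be checked exhaustively for small n. The following is a minimal sketch on plain bit strings (most significant digit first), not on the modal encoding; the function name `is_succ_by_pattern` is hypothetical.

```python
def is_succ_by_pattern(a: str, b: str) -> bool:
    """True iff a = c 1 0...0 and b = c 0 1...1 for a common prefix c.

    a and b are bit strings of equal length, most significant digit first.
    """
    if len(a) != len(b):
        return False
    n = len(a)
    for i in range(n):
        m = n - i - 1  # length of the all-zeros / all-ones suffix
        if (a[:i] == b[:i]                 # common prefix c
                and a[i] == "1" and b[i] == "0"
                and a[i + 1:] == "0" * m   # suffix of a is 0...0
                and b[i + 1:] == "1" * m): # suffix of b is 1...1
            return True
    return False

def agrees_with_arithmetic(n: int) -> bool:
    """Compare the pattern test with a = b + 1 on all pairs of n-bit numbers."""
    fmt = f"0{n}b"
    return all(
        is_succ_by_pattern(format(a, fmt), format(b, fmt)) == (a == b + 1)
        for a in range(2 ** n) for b in range(2 ** n)
    )
```

The position i found by the loop plays the role of the index i in the disjunction defining succ(r@x, s@y), up to the reversed bit order.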

The definition of add<sup>k−1</sup><sub>k</sub>(x, y, z, c) is given below, where X ≝ {x, y, c} and, for v ∈ {x, y, z, c, g}, r<sup>v</sup> ≝ (r<sup>v</sup><sub>1</sub>, …, r<sup>v</sup><sub>n</sub>) and ∀<sup>k−1</sup> r<sup>v</sup> is short for ∀<sup>k−1</sup> r<sup>v</sup><sub>1</sub> … ∀<sup>k−1</sup> r<sup>v</sup><sub>n</sub>.

$$
\begin{array}{ll}
 & \forall^{k-1} r^{x}, q_{x}, r^{y}, q_{y}, r^{z}, q_{z}, r^{c}, q_{c}, r^{g}, q_{g} : \Big(\bigwedge_{v \in \{x, y, z, c\}} @^{k-1}_{v} \mathit{copy}(r^{v}, q_{v}) \wedge @^{k-1}_{c} \mathit{copy}(r^{g}, q_{g})\Big) \Rightarrow \\[2pt]
(A): & \Big(@^{k-1}_{c} \Box(\mathit{0}_{1} \Rightarrow \neg b) \wedge \bigwedge_{q \in X} \Big(@^{k-1}_{q} \Diamond(\mathit{E}_{1} \wedge b) \Rightarrow \bigwedge_{r \in X \setminus \{q\}} @^{k-1}_{r} \Box(\mathit{E}_{1} \Rightarrow \neg b)\Big) \\[2pt]
(B): & \wedge \Big(\big(\mathit{eq}(r^{x}@x, r^{y}@y) \wedge \mathit{eq}(r^{y}@y, r^{c}@c) \wedge \mathit{succ}(r^{g}@c, r^{c}@c)\big) \Rightarrow \big(@^{k-1}_{c} q_{g} \Leftrightarrow \mathit{maj}(@^{k-1}_{x} q_{x}, @^{k-1}_{y} q_{y}, @^{k-1}_{c} q_{c})\big)\Big) \\[2pt]
(C): & \wedge \Big(\big(\mathit{eq}(r^{x}@x, r^{y}@y) \wedge \mathit{eq}(r^{y}@y, r^{z}@z) \wedge \mathit{eq}(r^{z}@z, r^{c}@c)\big) \Rightarrow \big(@^{k-1}_{z} q_{z} \Leftrightarrow ((@^{k-1}_{x} q_{x} \oplus @^{k-1}_{y} q_{y}) \oplus @^{k-1}_{c} q_{c})\big)\Big)\Big).
\end{array}
$$

Notice that this formula first quantifies over fresh atomic propositions r<sup>v</sup> and q<sub>v</sub>, with v ∈ {x, y, z, c, g} ⊆ AP, so that the worlds w<sub>x</sub>, w<sub>y</sub>, w<sub>z</sub>, w<sub>c</sub> copy the truth values of p<sub>1</sub>, …, p<sub>n</sub> and b of some of their children w.r.t. the fresh atomic propositions (see subformula ⋀<sub>v∈{x,y,z,c}</sub> @<sup>k−1</sup><sub>v</sub> copy(r<sup>v</sup>, q<sub>v</sub>) ∧ @<sup>k−1</sup><sub>c</sub> copy(r<sup>g</sup>, q<sub>g</sub>)). Afterwards, the formula follows very closely the constraints (†) of +<sub>2</sub>.

By induction on i, we show that add<sup>i</sup><sub>k</sub> respects the specification from Fig. 2.

Lemma 2. Let (K, w) be a pointed forest, and x, y, z, c be four i-local nominals for (K, w), with corresponding worlds w<sub>x</sub>, w<sub>y</sub>, w<sub>z</sub> and w<sub>c</sub>. If K, w<sub>p</sub> |= type(k−i, n) for every p ∈ {x, y, z, c}, then K, w |= add<sup>i</sup><sub>k</sub>(x, y, z, c) iff +<sub>k−i+1</sub>(w<sub>x</sub>, w<sub>y</sub>, w<sub>z</sub>, w<sub>c</sub>).

Making add<sup>i</sup><sub>k</sub> polynomial. At this stage, add<sup>i</sup><sub>k</sub> (i < k − 1) has size exponential in k, as it is recursively defined using multiple occurrences of add<sup>i+1</sup><sub>k</sub> (appearing inside eq<sup>i+1</sup><sub>k</sub> and succ<sup>i+1</sup><sub>k</sub>). However, all these occurrences have the same polarity, i.e. they all appear positively in the antecedents of the implications for the conditions (B) or (C). This property allows us to rely on a recursion trick by Fischer and Rabin [20] to obtain a polynomial size formulation of add<sup>i</sup><sub>k</sub>. In a nutshell, given a first-order formula ϕ(x) free in the tuple of variables x, the trick consists in rewriting ψ ≝ ϕ(y) ∧ ϕ(z) as ∀x : (x = y ∨ x = z) ⇒ ϕ(x), so that the size of ψ becomes only |ϕ(x)| plus a constant, instead of being roughly twice |ϕ(x)|. In a similar way, one can treat arbitrary formulae, as long as all occurrences of ϕ(x) have the same polarity, as is the case for add<sup>i+1</sup><sub>k</sub>. The (simple) manipulation of the formula add<sup>i</sup><sub>k</sub> using this trick directly leads to a definition of type(k, n) of size polynomial in k and n.
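The effect of the trick on formula size can be illustrated with a toy model. In the sketch below, formulae are nested tuples and only their node count matters; the encoding and the names `size`, `naive`, `shared` and `equivalent_on` are assumptions made for the illustration, not the actual definition of add<sup>i</sup><sub>k</sub>.

```python
from itertools import product

def size(f):
    """Number of nodes of a formula represented as nested tuples."""
    return 1 + sum(size(g) for g in f[1:] if isinstance(g, tuple))

def naive(depth):
    """Nest psi = phi(y) AND phi(z) `depth` times: two copies per level."""
    f = ("P", "x")
    for _ in range(depth):
        f = ("and", f, f)
    return f

def shared(depth):
    """Same nesting via forall x: (x = y or x = z) => phi(x): one copy per level."""
    f = ("P", "x")
    for _ in range(depth):
        guard = ("or", ("eq", "x", "y"), ("eq", "x", "z"))
        f = ("forall", "x", ("implies", guard, f))
    return f

def equivalent_on(domain, phi):
    """Check that phi(y) and phi(z) equals forall x: (x = y or x = z) => phi(x)."""
    return all(
        (phi(y) and phi(z)) == all(phi(x) for x in domain if x == y or x == z)
        for y, z in product(domain, repeat=2)
    )
```

In this toy encoding, ten levels of nesting give 2047 nodes for `naive(10)` and 51 nodes for `shared(10)`: the first grows exponentially with the depth, the second linearly.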

Multi-tiling. The definition of type(k, n) provides the key technical step required to show the lower bounds of Thms. 1 and 2. Using this formula, both theorems can be proved by suitable reductions from the k-exp alternating multi-tiling problem (kAMTP), as we now briefly discuss.

A multi-tiling system P is a tuple (T, T<sub>0</sub>, T<sub>acc</sub>, H, V, M, n) where T is a finite set of tile types, T<sub>0</sub>, T<sub>acc</sub> ⊆ T are sets of initial and accepting tiles, respectively, n ∈ ℕ<sup>+</sup> (written in unary) is the dimension of the system, and H, V, M ⊆ T × T are the horizontal, vertical and multi-tiling matching relations, respectively.

Fix k ∈ ℕ<sup>+</sup>. We write Σ̂ for the set of words of length t(k, n) over an alphabet Σ. The initial row I(f) of a map f : [0, t(k, n) − 1]<sup>2</sup> → T is the word f(0, 0), f(0, 1), …, f(0, t(k, n)−1) from T̂. A tiling for the grid [0, t(k, n)−1]<sup>2</sup> is a tuple (f<sub>1</sub>, f<sub>2</sub>, …, f<sub>n</sub>) such that, for all ℓ ∈ [1, n], the following conditions hold:

- **maps.** f<sub>ℓ</sub> : [0, t(k, n) − 1]<sup>2</sup> → T assigns a tile type to each position of the grid;
- **init & acc.** I(f<sub>ℓ</sub>) ∈ T̂<sub>0</sub>, and f<sub>n</sub>(t(k, n) − 1, j) ∈ T<sub>acc</sub> for some 0 ≤ j < t(k, n);
- **hori.** (f<sub>ℓ</sub>(i, j), f<sub>ℓ</sub>(i + 1, j)) ∈ H, for every i ∈ [0, t(k, n) − 2] and 0 ≤ j < t(k, n);
- **vert.** (f<sub>ℓ</sub>(i, j), f<sub>ℓ</sub>(i, j + 1)) ∈ V, for every j ∈ [0, t(k, n) − 2] and 0 ≤ i < t(k, n);
- **multi.** if ℓ < n then (f<sub>ℓ</sub>(i, j), f<sub>ℓ+1</sub>(i, j)) ∈ M for every 0 ≤ i, j < t(k, n).

The kAMTP takes as input P and a quantifier prefix Q = (Q<sub>1</sub>, …, Q<sub>n</sub>) ∈ {∃, ∀}<sup>n</sup>, and accepts whenever the statement "Q<sub>1</sub>w<sub>1</sub> ∈ T̂<sub>0</sub> … Q<sub>n</sub>w<sub>n</sub> ∈ T̂<sub>0</sub> : there is a tiling (f<sub>1</sub>, …, f<sub>n</sub>) of [0, t(k, n) − 1]<sup>2</sup> s.t. I(f<sub>ℓ</sub>) = w<sub>ℓ</sub> for all ℓ ∈ [1, n]" is true.
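The five conditions translate directly into a checker for candidate tilings. The sketch below is illustrative only: the grid side t(k, n) is passed as a plain integer `N`, each map is a dictionary from positions to tile names, the number of maps is taken to be the length of the list `fs`, and the function name `is_multi_tiling` is hypothetical.

```python
def is_multi_tiling(fs, N, T0, Tacc, H, V, M):
    """Check the conditions init & acc, hori, vert and multi.

    fs   : list of maps, each a dict from (i, j) in [0, N-1]^2 to a tile type
    N    : side of the grid (stands for t(k, n) in the text)
    T0   : set of initial tiles; Tacc : set of accepting tiles
    H, V, M : sets of pairs of tiles (matching relations)
    """
    n = len(fs)
    for l, f in enumerate(fs):
        # init: the initial row f(0, 0), ..., f(0, N-1) uses initial tiles only
        if any(f[(0, j)] not in T0 for j in range(N)):
            return False
        for i in range(N):
            for j in range(N):
                # hori: positions (i, j) and (i + 1, j)
                if i + 1 < N and (f[(i, j)], f[(i + 1, j)]) not in H:
                    return False
                # vert: positions (i, j) and (i, j + 1)
                if j + 1 < N and (f[(i, j)], f[(i, j + 1)]) not in V:
                    return False
                # multi: same position in consecutive maps
                if l + 1 < n and (f[(i, j)], fs[l + 1][(i, j)]) not in M:
                    return False
    # acc: the last map has an accepting tile at some position (N - 1, j)
    return any(fs[-1][(N - 1, j)] in Tacc for j in range(N))
```

Such a checker runs in time polynomial in N and the number of maps. The hardness of kAMTP comes from N = t(k, n) being k-exponential in n and from the alternation over initial rows.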

The AExp<sub>pol</sub>-completeness of kAMTP for k = 1 can be traced back to [11]. The proof therein is independent of the size of the grid, and can be easily adapted to show kAExp<sub>pol</sub>-completeness for arbitrary k (see [24] for a self-contained presentation). The problem is kNExp-complete if we fix Q to only contain existential quantifiers. For the lower bound of Thm. 1, we reduce kAMTP on instances with Q ∈ {∃}<sup>n</sup> to the satisfiability problem of ML(∃<sup>k</sup><sub>FO</sub>), so that the translation produces a formula of ML(∃<sup>1</sup><sub>FO</sub>) of modal depth 1 for the case k = 1, and otherwise a round-bounded formula from ML(∃<sup>k−1</sup><sub>FO</sub>) of modal depth k. For Thm. 2 we get a similar reduction, from instances of the kAMTP with arbitrary Q to ML(∃<sup>k</sup><sub>SO</sub>).

The first step is to define an ML(∃<sup>k</sup><sub>FO</sub>) formula grid(k, n) that, when satisfied by a pointed forest (K, w), forces the children of w to encode every position in the grid [0, t(k, n) − 1]<sup>2</sup>, together with a formula tiling(k, P) that characterises the various tiling conditions. Fortunately, both these formulae can be defined as in [7], modulo very minor changes. Briefly, each child w′ of w shall encode a different pair of numbers (n<sup>H</sup><sub>k</sub>(w′), n<sup>V</sup><sub>k</sub>(w′)) representing a position in the grid. The number of bits required to represent n<sup>H</sup><sub>k</sub>(w′) and n<sup>V</sup><sub>k</sub>(w′) is the same as n<sub>k</sub>(.), which allows us to define grid(k, n) by slightly updating type(k, n). In particular, n<sup>H</sup><sub>k</sub>(w′) and n<sup>V</sup><sub>k</sub>(w′) can be encoded requiring w′ to satisfy type(k − 1, n), and by using fresh symbols p<sup>H</sup><sub>1</sub>, …, p<sup>H</sup><sub>n</sub>, b<sup>H</sup> and p<sup>V</sup><sub>1</sub>, …, p<sup>V</sup><sub>n</sub>, b<sup>V</sup> to encode (n<sup>H</sup><sub>k</sub>(w′), n<sup>V</sup><sub>k</sub>(w′)). For k = 1, the horizontal position is n<sup>H</sup><sub>1</sub>(w′) ≝ Σ{2<sup>i−1</sup> : i ∈ [1, n] and w′ ∈ V(p<sup>H</sup><sub>i</sub>)}. For k ≥ 2, n<sup>H</sup><sub>k</sub>(w′) ≝ Σ{2<sup>i</sup> : ∃w″ ∈ R(w′) s.t. n<sub>k−1</sub>(w″) = i and w″ ∈ V(b<sup>H</sup>)}. The vertical position n<sup>V</sup><sub>k</sub>(w′) is defined in a similar way. Notice that, in the case of k ≥ 2, n<sup>H</sup><sub>k</sub>(w′) and n<sup>V</sup><sub>k</sub>(w′) are defined in terms of n<sub>k−1</sub>(w″), and thus using the t(k − 1, n) children of w′. For tiling(k, P), we see each tile type t ∈ T as an atomic proposition, and consider n distinct copies t<sup>(1)</sup>, …, t<sup>(n)</sup> ∈ AP of it, so that the maps f<sub>1</sub>, …, f<sub>n</sub> can be encoded using just the set of worlds forced by grid(k, n). In particular, for every i ∈ [1, n], each child w′ shall satisfy exactly one proposition in {t<sup>(i)</sup> : t ∈ T}, encoding the fact that f<sub>i</sub>(n<sup>H</sup><sub>k</sub>(w′), n<sup>V</sup><sub>k</sub>(w′)) = t.
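The way a single world encodes a pair of positions can be made concrete on an explicit tree. The sketch below is a hypothetical illustration: the class `World`, the string names `p1`, `bH`, `bV` for propositions and the functions `n1` and `nk` are all assumptions, and the recursion follows the sums given above (bits p<sub>i</sub> at the bottom level, and the proposition b on children at the levels above).

```python
from dataclasses import dataclass, field

@dataclass
class World:
    props: set = field(default_factory=set)      # atomic propositions true here
    children: list = field(default_factory=list)  # successor worlds

def n1(w, n, prefix="p"):
    """Number encoded by the truth values of prefix1, ..., prefixn at w."""
    return sum(2 ** (i - 1) for i in range(1, n + 1) if f"{prefix}{i}" in w.props)

def nk(w, k, n, bit="b", prefix="p"):
    """Number encoded by w at level k; `bit` is the proposition read on children.

    For k >= 2 the result is the sum of 2^i over the set of values
    i = n_{k-1}(child) taken by children satisfying `bit`.
    """
    if k == 1:
        return n1(w, n, prefix)
    return sum({2 ** nk(c, k - 1, n) for c in w.children if bit in c.props})
```

With n = 2, a world whose children encode 0, 1 and 3, labelled {bH}, {bV} and {bH, bV} respectively, yields horizontal position 2<sup>0</sup> + 2<sup>3</sup> = 9 and vertical position 2<sup>1</sup> + 2<sup>3</sup> = 10.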

Following the above specification, the toolkit of formulae in Fig. 2 can be easily adapted to express properties of the horizontal and vertical positions encoded by a world, leading to the definition of grid(k, n) and tiling(k, P). For instance, given G ∈ {H, V} and ϕ ∈ {*0*<sub>k</sub>, *1*<sub>k</sub>, *E*<sub>k</sub>} we define the formula ϕ<sup>G</sup> as follows: for k = 1 we set ϕ<sup>G</sup> ≝ ϕ[p<sub>i</sub> ←<sub>0</sub> p<sup>G</sup><sub>i</sub> : i ∈ [1, n]], and for k ≥ 2 we set ϕ<sup>G</sup> ≝ ϕ[b ←<sub>1</sub> b<sup>G</sup>]. Then, w′ satisfies the formula *1*<sup>H</sup><sub>k</sub> ∧ *0*<sup>V</sup><sub>k</sub> whenever (n<sup>H</sup><sub>k</sub>(w′), n<sup>V</sup><sub>k</sub>(w′)) = (1, 0).

Lemma 3. The ML(∃<sup>k</sup><sub>FO</sub>) formula grid(k, n) ∧ tiling(k, P) is satisfiable if and only if kAMTP accepts on input (P, Q), with Q ∈ {∃}<sup>n</sup>.

For the lower bound of Thm. 2, it remains to show how to capture in ML(∃<sup>k</sup><sub>SO</sub>) the arbitrary prefixes of quantification Q = (Q<sub>1</sub>, …, Q<sub>n</sub>) of kAMTP. Compared to [6,7], novel machinery is required to perform this step. As ML(∃<sup>k</sup><sub>SO</sub>) captures ML(∃<sup>k</sup><sub>FO</sub>), we now see grid(k, n) and tiling(k, P) as formulae of ML(∃<sup>k</sup><sub>SO</sub>). For each tile type t ∈ T, we consider an additional set of copies t<sup>(n+1)</sup>, …, t<sup>(2n)</sup> ∈ AP. We also define **t**<sup>(i)</sup> ≝ (t<sup>(i)</sup><sub>1</sub>, …, t<sup>(i)</sup><sub>r</sub>), where T = {t<sub>1</sub>, …, t<sub>r</sub>}. We use the propositions in **t**<sup>(n+i)</sup> to simulate the quantifier Q<sub>i</sub>, which we recall quantifies over the possible initial rows I(f<sub>i</sub>) ∈ T̂<sub>0</sub> of the map f<sub>i</sub>. If Q<sub>i</sub> = ∃, we simulate this form of quantification with the following shortcut, parametric on ϕ:

$$E_i(\varphi) \stackrel{\text{def}}{=} \exists^1 \mathbf{t}^{(n+i)} : \varphi \land \Box\Big(\mathit{0}^{H}_{k} \Rightarrow \bigvee_{t \in \mathcal{T}_0} \big(t^{(n+i)} \land \bigwedge_{s \in \mathcal{T}\setminus\{t\}} \neg s^{(n+i)}\big)\Big).$$

Here, the last conjunct states that each world encoding a position (0, j) of the grid, for some j ∈ [0, t(k, n) − 1], satisfies exactly one proposition t<sup>(n+i)</sup> with t ∈ T<sub>0</sub>. For Q<sub>i</sub> = ∀, we just define A<sub>i</sub>(ϕ) ≝ ¬E<sub>i</sub>(¬ϕ). Then, the prefix of quantification Q is captured by Q(ϕ) ≝ Q<sub>1</sub>(Q<sub>2</sub>(… Q<sub>n</sub>(ϕ))), where Q<sub>i</sub>(ϕ) ≝ E<sub>i</sub>(ϕ) if Q<sub>i</sub> = ∃, else Q<sub>i</sub>(ϕ) ≝ A<sub>i</sub>(ϕ). In deciding whether K, w |= Q(ϕ) holds for a pointed forest (K, w) satisfying grid(k, n), the satisfaction of ϕ is checked w.r.t. a model where each world encoding a position (0, j) of the grid satisfies exactly one t<sup>(n+i)</sup> with t ∈ T<sub>0</sub>, for all i ∈ [1, n]. In terms of tilings, this corresponds to having set the initial row I(f<sub>i</sub>) ∈ T̂<sub>0</sub> of each of the maps f<sub>i</sub>. We now want to tile the remaining part of the grid by finding a suitable instantiation for ϕ. To do so, we quantify over all **t**<sup>(1)</sup>, …, **t**<sup>(n)</sup>, searching for an arrangement of these propositions that satisfies tiling(k, P) and such that, on worlds encoding a position (0, j) of the grid, the satisfaction of propositions in **t**<sup>(i)</sup> mirrors the satisfaction of the corresponding propositions in **t**<sup>(n+i)</sup>. In formula:

$$\overline{\operatorname{tiling}}(k,\mathcal{P}) \stackrel{\text{def}}{=} \exists^1 \mathbf{t}^{(1)}, \ldots, \mathbf{t}^{(n)} : \operatorname{tiling}(k,\mathcal{P}) \land \Box\Big(\mathit{0}^{H}_{k} \Rightarrow \bigwedge_{i=1}^{n} \bigvee_{t \in \mathcal{T}} \big(t^{(i)} \Leftrightarrow t^{(n+i)}\big)\Big).$$

Lemma 4. The ML(∃ k SO) formula grid(k, n)∧Q(tiling(k,P)) is satisfiable if and only if kAMTP accepts on input (P, Q).

Round-boundedness. In defining type(k, n), we made sure to respect the following round-boundedness condition: type(1, n) has modal depth 1 and belongs to ML(∃<sup>1</sup><sub>FO</sub>), whereas for every k ≥ 2, type(k, n) is a round-bounded formula of ML(∃<sup>k−1</sup><sub>FO</sub>) of modal depth k. The same holds for grid(k, n), tiling(k, P) and Q(tiling(k, P)). Then, Lemmas 3 and 4 imply the lower bounds of Thms. 1 and 2.

#### 4 Upper bounds via a small-model property for ML(∃<sup>k</sup><sub>SO</sub>)

In this section, we establish the following small model property.

Proposition 1. Each satisfiable round-bounded formula ϕ in ML(∃<sup>k</sup><sub>SO</sub>) is satisfied by a pointed forest with t(k+1, O(|ϕ|)) worlds. Each satisfiable ϕ in ML(∃<sup>k</sup><sub>SO</sub>) with md(ϕ) ≤ k is satisfied by a pointed forest with t(k, O(|ϕ|<sup>3</sup>)) worlds.

As the logic ML(∃<sup>k</sup><sub>SO</sub>) captures ML(∃<sup>k</sup><sub>FO</sub>), Prop. 1 transfers to the latter logic. With this result at hand, the upper bounds of Thm. 1 and Thm. 2 easily follow. Consider a round-bounded formula ϕ of either ML(∃<sup>k</sup><sub>SO</sub>) or ML(∃<sup>k</sup><sub>FO</sub>) (the arguments for a formula of modal depth k are similar). First, we guess a pointed forest (K, w) with bounds as in Prop. 1. This can be done in (k+1)NExp. Then, we check whether (K, w) satisfies ϕ. For ML(∃<sup>k</sup><sub>SO</sub>), by seeing this logic as a fragment of monadic second-order logic, this can be done in polynomial time in the sizes of (K, w) and ϕ by using an alternating Turing machine that performs |ϕ| many alternations. As (K, w) has (k+1)-exponential size with respect to |ϕ|, the whole algorithm runs in (k+1)AExp<sub>pol</sub>. For ML(∃<sup>k</sup><sub>FO</sub>), we rely on the fact that there is a deterministic algorithm for the model checking problem of first-order logic that runs in time O(|ϕ| · M<sup>|ϕ|</sup>) where M is the size of the model. From the bounds on (K, w) we conclude that the procedure for ML(∃<sup>k</sup><sub>FO</sub>) is in (k+1)NExp.

Prop. 1 is shown through a quantifier elimination (QE) procedure that translates every formula of ML(∃<sup>k</sup><sub>SO</sub>) into an equivalent formula from GML, establishing Cor. 2 as a by-product. Without loss of generality, in this section we extend ML(∃<sup>k</sup><sub>SO</sub>) with graded modalities ♦<sub>≥j</sub>ϕ, with j ∈ ℕ given in unary, and see the modality ♦ as a shortcut for ♦<sub>≥1</sub>. Recall that a GML formula ♦<sub>≥j</sub>ϕ can be represented with an ML(∃<sup>k</sup><sub>SO</sub>) formula of size O(j + |ϕ|) (Sec. 2).

Parameters of a formula. Fig. 4 introduces a set of parameters for an ML(∃<sup>k</sup><sub>SO</sub>) formula ϕ, which we rely on to establish Prop. 1. For instance, for ϕ = (p ∨ ♦<sub>≥3</sub>r) ∧ (q ∨ ♦<sub>≥5</sub>♦<sub>≥2</sub>q) we have ap(1, ϕ) = {r}, gsf(0, ϕ) = {♦<sub>≥3</sub>r, ♦<sub>≥5</sub>♦<sub>≥2</sub>q}, msf(1, ϕ) = {r, ♦<sub>≥2</sub>q}, gsf(1, ϕ) = {♦<sub>≥2</sub>q}, gr(0, ϕ) = 5 and bd(0, ϕ) = 8. Note that every GML formula ϕ is a Boolean combination of formulae from ap(0, ϕ) ∪ gsf(0, ϕ), and for every d ∈ ℕ, bd(d, ϕ) ≤ gr(d, ϕ) · |msf(d + 1, ϕ)|.

For a set of formulae Φ = {ϕ<sub>1</sub>, …, ϕ<sub>n</sub>}, we define C(Φ) to be the set of all complete conjunctions of possibly negated formulae of Φ. Formally, C(Φ) ≝ {γ<sub>1</sub> ∧ · · · ∧ γ<sub>n</sub> : for all i ∈ [1, n], γ<sub>i</sub> ∈ {ϕ<sub>i</sub>, ¬ϕ<sub>i</sub>}}, and we fix C(∅) = {⊤}. Given P ⊆<sub>fin</sub> AP we refer to the formulae in C(P) as ρ<sub>1</sub>, ρ<sub>2</sub>, · · · .

- ap(d, ϕ): set of atomic propositions of ϕ in the scope of exactly d graded modalities.
- gsf(d, ϕ): set of subformulae ♦<sub>≥j</sub>ψ of ϕ, in the scope of exactly d graded modalities.
- msf(d, ϕ): set of maximal subformulae of ϕ in the scope of d graded modalities: msf(0, ϕ) = {ϕ}, and ψ ∈ msf(d + 1, ϕ) iff ♦<sub>≥j</sub>ψ ∈ gsf(d, ϕ) for some j ∈ ℕ.
- gr(d, ϕ): largest j ∈ ℕ such that either j = 0 or ♦<sub>≥j</sub>ψ ∈ gsf(d, ϕ), for some ψ.
- bd(d, ϕ): for d = 0, letting gsf(0, ϕ) = {♦<sub>≥j<sub>1</sub></sub>ψ<sub>1</sub>, …, ♦<sub>≥j<sub>n</sub></sub>ψ<sub>n</sub>}, bd(0, ϕ) ≝ j<sub>1</sub> + · · · + j<sub>n</sub>. For d ≥ 1, bd(d, ϕ) ≝ max{bd(d − 1, ψ) : ψ ∈ msf(1, ϕ)}.

Fig. 4: Parameters of an ML(∃<sup>k</sup>) formula ϕ (d ∈ ℕ).
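The parameters of Fig. 4 can be computed by a single traversal of the syntax tree that tracks the number of enclosing graded modalities. The sketch below covers GML formulae only, and its tuple encoding (`("ap", p)`, `("not", f)`, `("and", f, g)`, `("or", f, g)`, `("dia", j, f)` for ♦<sub>≥j</sub>f) is an assumption made for the example.

```python
def walk(f, depth=0):
    """Yield (d, g) for every subformula g of f under exactly d graded modalities."""
    yield depth, f
    if f[0] == "dia":                      # ("dia", j, g) stands for <>_{>=j} g
        yield from walk(f[2], depth + 1)
    elif f[0] in ("not", "and", "or"):
        for g in f[1:]:
            yield from walk(g, depth)

def ap(d, f):
    return {g[1] for dep, g in walk(f) if dep == d and g[0] == "ap"}

def gsf(d, f):
    return {g for dep, g in walk(f) if dep == d and g[0] == "dia"}

def msf(d, f):
    return {f} if d == 0 else {g[2] for g in gsf(d - 1, f)}

def gr(d, f):
    return max((g[1] for g in gsf(d, f)), default=0)

def bd(d, f):
    if d == 0:
        return sum(g[1] for g in gsf(0, f))
    return max((bd(d - 1, psi) for psi in msf(1, f)), default=0)

# The example formula (p or <>_{>=3} r) and (q or <>_{>=5} <>_{>=2} q).
p, q, r = ("ap", "p"), ("ap", "q"), ("ap", "r")
phi = ("and", ("or", p, ("dia", 3, r)), ("or", q, ("dia", 5, ("dia", 2, q))))
```

On `phi`, these functions reproduce the values listed in the text: `ap(1, phi)` is `{"r"}`, `gr(0, phi)` is 5 and `bd(0, phi)` is 8.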

Normal forms. We introduce a set of normal forms that are used by our QE procedure. An ML(∃<sup>k</sup><sub>SO</sub>) formula ϕ is in prenex normal form if it is of the form Q<sub>1</sub>p<sub>1</sub> Q<sub>2</sub>p<sub>2</sub> … Q<sub>n</sub>p<sub>n</sub> ψ where Q<sub>i</sub> ∈ {∃<sup>k</sup>, ∀<sup>k</sup>} and ψ is in GML. If ψ is instead in ML(∃<sup>k</sup><sub>SO</sub>) but all quantifiers are under the scope of at least k modalities, we say that ϕ is in prenex normal form up to k. An ML(∃<sup>k</sup><sub>SO</sub>) formula ϕ is in prenex round-bounded (p.r.b.) form if ϕ is round-bounded and, for all i ∈ ℕ, all formulae in msf(i · k, ϕ) are in prenex normal form up to k. E.g., given a p.r.b. formula ψ in ML(∃<sup>2</sup><sub>SO</sub>), ∃<sup>2</sup>p ∃<sup>2</sup>q ♦♦∃<sup>2</sup>r ψ is in p.r.b. form, while ∃<sup>2</sup>p ♦∃<sup>1</sup>q ♦∃<sup>2</sup>r ψ is not. Thanks to the equivalences below one can translate each round-bounded formula ϕ of ML(∃<sup>k</sup><sub>SO</sub>) into an equivalent well-quantified p.r.b. formula of size O(|ϕ|):

$$
\Diamond \exists^{k-1} p \,\varphi \equiv \exists^k p \,\Diamond \varphi, \qquad \Box \exists^{k-1} p \,\varphi \equiv_{\mathsf{SO}} \exists^k p \,\Box \varphi, \qquad \text{ for } k \ge 2. \tag{$\ddagger$}
$$

Similarly, every ϕ in ML(∃<sup>k</sup><sub>SO</sub>) of modal depth at most k can be translated into a well-quantified prenex formula of ML(∃<sup>k</sup><sub>SO</sub>) having size O(|ϕ|). Notice that the second equivalence in (‡) only holds on pointed forests and for the logic ML(∃<sup>k</sup><sub>SO</sub>). It does not hold for arbitrary Kripke structures, nor for ML(∃<sup>k</sup><sub>FO</sub>).

Our QE procedure translates each formula of ML(∃<sup>k</sup><sub>SO</sub>) into a GML formula in disjoint normal form (called good formulae in [23, Def. 8.5]) for which it is easy to estimate bounds on the size of the smallest satisfying pointed forest, if any. We say that a set {ϕ<sub>1</sub>, …, ϕ<sub>n</sub>} of formulae in GML is a disjoint set over P ⊆<sub>fin</sub> AP whenever for all i, j ∈ [1, n], we have ϕ<sub>i</sub> = ρ<sub>i</sub> ∧ γ<sub>i</sub> and ϕ<sub>j</sub> = ρ<sub>j</sub> ∧ γ<sub>j</sub>, where ρ<sub>i</sub>, ρ<sub>j</sub> ∈ C(P), ap(0, γ<sub>i</sub>) ∩ P = ap(0, γ<sub>j</sub>) ∩ P = ∅, and either γ<sub>i</sub> ≡ γ<sub>j</sub> or (γ<sub>i</sub> ∧ γ<sub>j</sub>) ≡ ⊥. By taking ρ<sub>i</sub> and ρ<sub>j</sub> up to commutativity and associativity of ∧, a disjoint set over P is also a disjoint set over any P′ ⊂ P. We say that ϕ is in disjoint normal form (DisjNF) if for every d ∈ [0, md(ϕ)], msf(d, ϕ) is a disjoint set over ∅.

Proposition 2 ([23], Lemma 8.7). Each satisfiable GML formula ϕ in DisjNF is satisfied by a pointed forest with at most (max<sub>d∈ℕ</sub>(bd(d, ϕ)) + 1)<sup>md(ϕ)</sup> worlds.

To translate a well-quantified p.r.b. formula ϕ from ML(∃ k SO) into a GML formula in DisjNF, we consider the largest i ∈ N for which msf(i · k, ϕ) is nonempty, and inductively translate, for each j = i, i − 1, · · · , 0, all formulae in msf(j · k, ϕ) into equivalent ones in GML. At each of these i + 1 rounds, the following two steps are applied at most k times:


At the end of the round, msf(j · k, ϕ) solely contains GML formulae in DisjNF, and the next round considers the set msf((j−1) · k, ϕ), that now contains ML(∃<sup>k</sup><sub>SO</sub>) formulae in prenex normal form. The QE procedure has thus three key steps, which we now formalise: (I) manipulating a formula ϕ so that msf(j, ϕ) becomes a disjoint set, (II) eliminating the quantifier ∃<sup>1</sup> obtaining a formula from GML, and (III) reducing the elimination of ∃<sup>ℓ</sup> to the elimination of ∃<sup>ℓ−1</sup> (for ℓ ≥ 2).

Step (I): making a single set disjoint. Let j ∈ N<sup>+</sup> and P ⊆fin AP. We show how to transform a GML formula ϕ into an equivalent formula ψ such that msf(j, ψ) is a disjoint set over P. Two strategies are possible, which will be combined and carefully chosen in order to obtain the bounds required by Prop. 1.

The first strategy considers the set S ≝ C(P ∪ ap(j, ϕ) ∪ gsf(j, ϕ)), which is disjoint over P (and so over ∅), and rewrites ϕ into an equivalent formula ψ with msf(j, ψ) ⊆ S. Consider γ ∈ msf(j, ϕ). By definition of C(.), ⋁<sub>χ∈S</sub> χ is a tautology, and since γ is a Boolean combination of formulae in ap(j, ϕ) ∪ gsf(j, ϕ), for all χ ∈ S the formula γ ∧ χ is equivalent to either ⊥ or χ. Then, γ ≡ ⋁<sub>χ∈T</sub> χ for some T ⊆ S. Notice that γ ∈ msf(j, ϕ) holds if and only if ♦<sub>≥i</sub>γ ∈ gsf(j − 1, ϕ), for some i ∈ ℕ. By relying on the equivalence of GML

$$
\Diamond_{\geq i}(\chi_1 \vee \chi_2) \equiv \bigvee_{i=i_1+i_2} (\Diamond_{\geq i_1} \chi_1 \wedge \Diamond_{\geq i_2} \chi_2), \qquad \text{whenever } \chi_1 \wedge \chi_2 \equiv \bot, \text{ and } \chi_1 \not\models \chi_2
$$

we rewrite ♦<sub>≥i</sub>γ into a Boolean combination of formulae ♦<sub>≥i′</sub>χ with i′ ≤ i and χ ∈ T ⊆ S. These steps are applied to all the formulae in msf(j, ϕ).
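Both ingredients of the first strategy can be tried out on a small scale. The sketch below is a simplified illustration under explicit assumptions: the formulae of Φ are treated as opaque names with independent truth values, a Boolean combination γ is a Python predicate over valuations of Φ, and the graded equivalence is checked on the numbers a and b of children satisfying χ<sub>1</sub> and χ<sub>2</sub>, which are disjoint by hypothesis. All function names are hypothetical.

```python
from itertools import product

def complete_conjunctions(phis):
    """C(Phi): each element fixes, for every formula of Phi, whether it is negated.

    An element is a tuple of (formula, sign) pairs; C of the empty list is [()],
    which plays the role of the single conjunction "true".
    """
    return [tuple(zip(phis, signs))
            for signs in product([True, False], repeat=len(phis))]

def as_disjunction(gamma, phis):
    """Subset T of C(Phi) such that gamma is equivalent to the disjunction of T."""
    return [chi for chi in complete_conjunctions(phis) if gamma(dict(chi))]

def split_holds(a, b, i):
    """Check <>_{>=i}(chi1 or chi2) against its splitting, given child counts a, b."""
    lhs = a + b >= i
    rhs = any(a >= i1 and b >= i - i1 for i1 in range(i + 1))
    return lhs == rhs
```

For example, with Φ = {p, ♦<sub>≥3</sub>r} and γ = p ∨ ¬♦<sub>≥3</sub>r, `as_disjunction` selects three of the four complete conjunctions.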

The second strategy is as follows: for each γ ∈ msf(j, ϕ) and ρ ∈ C(P), let γ<sub>ρ</sub> ≝ γ[p ←<sub>0</sub> v : v ∈ {⊤, ⊥}, p ∈ P, and v = ⊤ iff p occurs positively in ρ]. Notice that ap(0, γ<sub>ρ</sub>) ∩ P = ∅. As ρ gives a polarity to all propositions in P, we have ρ ∧ γ ≡ ρ ∧ γ<sub>ρ</sub>. Set T ≝ C({γ<sub>ρ</sub> : γ ∈ msf(j, ϕ), ρ ∈ C(P)}). Consider S′ ≝ C(P ∪ T), which is a disjoint set over P, and replay the arguments used for S in the first strategy to rewrite ϕ into an equivalent formula ψ with msf(j, ψ) ⊆ S′.

While both strategies keep most of the parameters of Fig. 4 unchanged (one exception being ap(j, ψ) ⊆ ap(j, ϕ) ∪ P), they yield profoundly different bounds on the size of msf(j, ψ). Because of the definition of S, from the first strategy we obtain |msf(j, ψ)| ≤ 2<sup>|P|+|ap(j,ϕ)|+|gsf(j,ϕ)|</sup>, where we highlight the exponential dependence on |gsf(j, ϕ)|, and thus on the number of outermost graded modalities appearing in formulae of msf(j, ϕ). From the definition of S′, the second strategy yields |msf(j, ψ)| ≤ 2<sup>|P|+2<sup>|P|</sup>·|msf(j,ϕ)|</sup>. Here, |msf(j, ψ)| does not depend on gsf(j, ϕ), but it is doubly exponential in |P|. Remarkably, in both strategies gsf(j, ψ) ⊆ gsf(j, ϕ), thus if msf(j+1, ϕ) is a disjoint set over ∅, so is msf(j+1, ψ). This property is essential, as it allows us to bring the full formula in DisjNF.

Step (II): eliminating ∃<sup>1</sup>. Given a well-quantified formula ϕ = ∃<sup>1</sup>p ϕ′, where ϕ′ is in GML and msf(1, ϕ) is a disjoint set over P, and p ∈ P, it is quite easy to eliminate the quantifier ∃<sup>1</sup>p and produce a formula ψ in GML equivalent to ϕ and such that msf(1, ψ) is a disjoint set over P \ {p}. We sketch here the main points. First, from standard axioms of propositional calculus and by distributing ∃<sup>1</sup>p over ∨, we obtain a representation of ϕ as a disjunction of formulae of the form ∃<sup>1</sup>p (ρ ∧ γ) with ρ ∈ C(ap(0, ϕ)) and γ ∈ C(gsf(0, ϕ)). We eliminate the quantifier ∃<sup>1</sup> from every such disjunct ∃<sup>1</sup>p (ρ ∧ γ). Below, let χ be an arbitrary formula with p ∉ ap(0, χ). First, using the equivalences ∃<sup>1</sup>p (p ∧ χ) ≡<sub>SO</sub> ∃<sup>1</sup>p χ and ∃<sup>1</sup>p (¬p ∧ χ) ≡<sub>SO</sub> ∃<sup>1</sup>p χ, we get rid of the occurrences of p in ρ, obtaining a formula ρ′ ∈ C(ap(0, ϕ) \ {p}). Next, we remove p from γ thanks to the equivalences:

$$\begin{array}{rcl} \exists^1 p : \Diamond_{\geq i} (p \wedge \chi) \wedge \Diamond_{\geq j} (\neg p \wedge \chi) & \equiv_{\mathsf{SO}} & \Diamond_{\geq i+j} \chi; \\ \exists^1 p : \neg \Diamond_{\geq i} (p \wedge \chi) \wedge \neg \Diamond_{\geq j} (\neg p \wedge \chi) & \equiv_{\mathsf{SO}} & \neg \Diamond_{\geq i+j-1} \chi. \end{array}$$

We obtain a GML formula γ′ such that ∃<sup>1</sup>p (ρ ∧ γ) ≡<sub>SO</sub> ρ′ ∧ γ′. Size-wise, Step (II) preserves all the parameters of Fig. 4 except gr(0, ψ) ≤ 2 · gr(0, ϕ).
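The two equivalences reduce to a counting argument: since p does not occur in χ, quantifying over p amounts to choosing how many of the children satisfying χ are labelled with p. The brute-force sketch below checks this on small numbers. It assumes grades i, j ≥ 1 and abstracts a model by the number m of children satisfying χ; the names `exists_labelling` and `check` are hypothetical.

```python
def exists_labelling(m, pred):
    """Is there a split of the m chi-children into a with p and m - a without p
    such that pred(a, m - a) holds?"""
    return any(pred(a, m - a) for a in range(m + 1))

def check(max_m=8, max_grade=5):
    """Compare both sides of the two equivalences for all small m, i, j."""
    for m in range(max_m + 1):
        for i in range(1, max_grade + 1):
            for j in range(1, max_grade + 1):
                # exists p: >= i children with p and chi, >= j with not p and chi
                pos = exists_labelling(m, lambda a, b: a >= i and b >= j)
                # exists p: fewer than i with p and chi, fewer than j with not p and chi
                neg = exists_labelling(m, lambda a, b: a < i and b < j)
                if pos != (m >= i + j):
                    return False
                if neg != (not (m >= i + j - 1)):
                    return False
    return True
```

The grades on the right-hand sides are at most i + j, which is in line with the bound gr(0, ψ) ≤ 2 · gr(0, ϕ) stated above.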

Step (III): from ∃<sup>k+1</sup> to ∃<sup>k</sup>. Consider a well-quantified ML(∃<sup>k</sup><sub>SO</sub>) formula ϕ′ having all quantifiers appearing outside the scope of graded modalities, and with the set msf(k + 1, ϕ′) disjoint over P. Given p ∈ P, we translate ϕ ≝ ∃<sup>k+1</sup>p ϕ′ into an equivalent well-quantified ML(∃<sup>k</sup><sub>SO</sub>) formula ψ having all quantifiers outside the scope of graded modalities, and with the set msf(k + 1, ψ) disjoint over P \ {p}. This is done by replacing ∃<sup>k+1</sup>p with multiple ∃<sup>k</sup>. The first step is to single out the occurrences of p under the scope of k+1 modalities by replacing them with a fresh symbol p̃ and splitting ∃<sup>k+1</sup>p into ∃<sup>k</sup>p and ∃<sup>k+1</sup>p̃. We get ϕ ≡<sub>SO</sub> ∃<sup>k</sup>p ∃<sup>k+1</sup>p̃ ϕ″ where ϕ″ = ϕ′[p ←<sub>k+1</sub> p̃]. Let gsf(k, ϕ″) = {♦<sub>≥k<sub>1</sub></sub>χ<sub>1</sub>, …, ♦<sub>≥k<sub>n</sub></sub>χ<sub>n</sub>}. From the properties of ϕ′, no proposition from bp(ϕ″) appears in the GML formulae χ<sub>1</sub>, …, χ<sub>n</sub>. Using fresh propositions q<sub>1</sub>, …, q<sub>n</sub>, we rewrite ϕ as

$$\exists^k p \; \exists^{k+1} \tilde{p} \; \exists^k q_1, \dots, q_n : \varphi''[\Diamond_{\geq k_i} \chi_i \leftarrow_k q_i : 1 \leq i \leq n] \land \Box^k \bigwedge_{i=1}^n (q_i \Leftrightarrow \Diamond_{\geq k_i} \chi_i).$$

Essentially, the subformula □<sup>k</sup>⋀<sup>n</sup><sub>i=1</sub>(q<sub>i</sub> ⇔ ♦<sub>≥k<sub>i</sub></sub>χ<sub>i</sub>) constrains each q<sub>i</sub> to be true in exactly those worlds satisfying ♦<sub>≥k<sub>i</sub></sub>χ<sub>i</sub>. This allows us to replace with q<sub>i</sub> all occurrences of ♦<sub>≥k<sub>i</sub></sub>χ<sub>i</sub> appearing in ϕ″ under the scope of k modalities (first conjunct of the formula above), without changing the semantics of ϕ. By definition, ϕ″[♦<sub>≥k<sub>i</sub></sub>χ<sub>i</sub> ←<sub>k</sub> q<sub>i</sub> : 1 ≤ i ≤ n] has modal depth at most k, and thus the proposition p̃ does not occur in it. We reorder the existential prefix of the formula and, by distributing ∃<sup>k+1</sup>p̃, conclude that ϕ is equivalent to:

$$\exists^k p, q_1, \ldots, q_n : \varphi''[\Diamond_{\geq k_i} \chi_i \leftarrow_k q_i : 1 \leq i \leq n] \land \exists^{k+1} \tilde{p} \, \Box^k \bigwedge_{i=1}^{n} (q_i \Leftrightarrow \Diamond_{\geq k_i} \chi_i).$$

Lastly, we eliminate ∃<sup>k+1</sup>p̃, obtaining the aforementioned ML(∃<sup>k</sup><sub>SO</sub>) formula ψ. Using the second equivalence in (‡), we rewrite ∃<sup>k+1</sup>p̃ □<sup>k</sup>⋀<sup>n</sup><sub>i=1</sub>(q<sub>i</sub> ⇔ ♦<sub>≥k<sub>i</sub></sub>χ<sub>i</sub>) into □<sup>k</sup>∃<sup>1</sup>p̃ ⋀<sup>n</sup><sub>i=1</sub>(q<sub>i</sub> ⇔ ♦<sub>≥k<sub>i</sub></sub>χ<sub>i</sub>). Since {χ<sub>1</sub>, …, χ<sub>n</sub>} is a set of formulae from GML that is disjoint over (P \ {p}) ∪ {p̃}, by applying Step (II) one computes a formula ψ′ in GML equivalent to ∃<sup>1</sup>p̃ ⋀<sup>n</sup><sub>i=1</sub>(q<sub>i</sub> ⇔ ♦<sub>≥k<sub>i</sub></sub>χ<sub>i</sub>) and such that msf(1, ψ′) is a disjoint set over P \ {p}. Then, the (output) formula ψ is defined as follows:

$$\psi \stackrel{\text{def}}{=} \exists^k p, q_1, \dots, q_n : \varphi''[\Diamond_{\geq k_i} \chi_i \leftarrow_k q_i : 1 \leq i \leq n] \land \Box^k \psi'.$$

Down to GML, inductively. The manipulation we just described yield the crucial inductive argument that allows us to translate any well-quantified prenex formula of ML(∃ k SO) into a formula of GML. Inductively on k, consider a wellquantified formula ϕ = Q1p<sup>1</sup> . . . Qnpnϕ <sup>0</sup> where each Q<sup>i</sup> ∈ {∃<sup>k</sup> , ∀ <sup>k</sup>}, the formula ϕ 0 is in GML and msf(k, ϕ) is a disjoint set over {p1, . . . , pn}. If k = 1, we repeatedly apply Step (II) to translate ϕ into a GML formula. If k ≥ 2, starting from p<sup>n</sup> down to p1, we apply Step (III) to translate ϕ into a wellquantified prenex formula χ from ML(∃ k−1 SO ). Afterwards, we rely on the first strategy of Step (I) to make the set msf(k − 1, χ) disjoint over bp(χ), and inductively obtain a GML formula ψ equivalent to ϕ. For a sake of conciseness, let |ϕ|<sup>k</sup> def = max(k, | S i∈[0,k] ap(i, ϕ)|, maxi<k gr(i, ϕ)). Fundamentally, the formula ψ has the same modal depth as ϕ, and for every i ∈ [0, k − 1] it satisfies:

gr(i, ψ) ≤ t(k − 1, 2<sup>8·|ϕ|<sub>k</sub></sup> · |msf(k, ϕ)|); |msf(i, ψ)| ≤ t(k − 1, 2<sup>8·|ϕ|<sub>k</sub></sup> · |msf(k, ϕ)|).

With these bounds at hand, Prop. 1 follows from Steps (I)–(III) and Prop. 2. First, consider the case of a well-quantified prenex formula ϕ in ML(∃<sup>k</sup><sub>SO</sub>) of modal depth k. Using the first strategy from Step (I), we translate ϕ into an equivalent formula ψ such that the set msf(k, ψ) is disjoint over bp(ϕ) and has size exponential in |ϕ|. We apply the inductive argument discussed above, and translate ψ into a GML formula χ in DisjNF with md(χ) ≤ md(ϕ) and bd(d, χ) ≤ gr(d, χ) · |msf(d + 1, χ)| ≤ t(k, O(|ϕ|<sup>2</sup>)) for all d ∈ ℕ. By Prop. 2, whenever satisfiable, ϕ is satisfied by a pointed forest with at most t(k, O(|ϕ|<sup>3</sup>)) worlds. The case of general p.r.b. formulae of ML(∃<sup>k</sup><sub>SO</sub>) is similar, but we need to appeal to the second strategy of Step (I) to stop the chain of exponential blow-ups. For simplicity, let us consider the case of ϕ being a well-quantified p.r.b. formula of modal depth at most 2k. The arguments used for this case can be adapted for formulae of arbitrary modal depth. First, we look at the formulae of msf(k, ϕ), whose modal depth is at most k, and eliminate all local quantifiers from each of these formulae, as described above. In doing so, |gsf(k, ϕ)| witnesses a k-exponential blow-up, but the size of msf(k, ϕ) is unchanged. We consider the quantification prefix of ϕ, and eliminate all its quantifiers over P to produce an equivalent formula from GML. The first step is to make the set msf(k, ϕ) a disjoint set over P. Because of the k-exponential blow-up on gsf(k, ϕ), the first strategy of Step (I) is of no use. We appeal to the second one, which modifies msf(k, ϕ) into a disjoint set of size only doubly-exponential in the size of the original formula ϕ. By relying on the inductive reasoning discussed above, we produce the equivalent GML formula in DisjNF. Because of the doubly-exponential bound on msf(k, ϕ), this elimination is exponentially worse than the one done for formulae of modal depth at most k. Then, appealing to Prop. 2 yields Prop. 1.

### 5 Further connections

In introducing ML(∃<sup>k</sup><sub>FO</sub>) and ML(∃<sup>k</sup><sub>SO</sub>), one of our goals is to provide a common framework to relate several modal logics featuring propositional quantification in disguise. Apart from the relations stated in Sec. 2, in an extended version of this work we aim at establishing connections between ML(∃<sup>1</sup><sub>SO</sub>) and propositional team logics [21], propositional logic of dependence [32] and ambient logics [13], as well as connections between ML(∃<sup>∞</sup><sub>FO</sub>) and sabotage logics [8,4].

Acknowledgments. R. Fervari is supported by CONICET project PIP 11220200100812CO, and by the LIA SINFIN. A. Mansutti is supported by the ERC project ARiAT (Grant agreement No. 852769).

### References



## Temporal Stream Logic modulo Theories<sup>∗</sup>

Bernd Finkbeiner, Philippe Heim, and Noemi Passing

CISPA Helmholtz Center for Information Security, Saarbrücken, Germany {finkbeiner, philippe.heim, noemi.passing}@cispa.de

Abstract. Temporal stream logic (TSL) extends LTL with updates and predicates over arbitrary function terms. This allows for specifying data-intensive systems for which LTL is not expressive enough. In the semantics of TSL, functions and predicates are left uninterpreted. In this paper, we extend TSL with first-order theories, enabling us to specify systems using interpreted functions and predicates such as incrementation or equality. We investigate the satisfiability problem of TSL modulo the standard underlying theory of uninterpreted functions as well as with respect to Presburger arithmetic and the theory of equality: For all three theories, TSL satisfiability is neither semi-decidable nor co-semi-decidable. Nevertheless, we identify three fragments of TSL for which the satisfiability problem is (semi-)decidable in the theory of uninterpreted functions. Despite the undecidability, we present an algorithm, which is not guaranteed to terminate, for checking the satisfiability of a TSL formula in the theory of uninterpreted functions and evaluate it: It scales well and is able to validate assumptions in a real-world system design.

### 1 Introduction

Linear-time temporal logic (LTL) [32] is one of the standard specification languages to describe properties of reactive systems. The success of LTL is largely due to its ability to abstract from the detailed data manipulations and to focus on the change of control over time. In data-intensive applications, such as smartphone apps, LTL is, however, often not expressive enough to capture the relevant properties. When specifying a music player app, for instance, we would like to state that if the user leaves the app, the track that is currently playing will be stored and will resume playing once the user returns to the app.

To specify data-intensive systems, extensions of LTL such as Constraint LTL (CLTL) [6] and, more recently, Temporal Stream Logic (TSL) [15] have been proposed. In CLTL, the atomic propositions of LTL are replaced with atomic constraints over a concrete domain D and an interpretation for relations. Relating variables with the equality relation, such as x = y, denoting that the value of x is equal to the value of y, allows for specifying assignment-like statements. In this paper, however, we focus on the logic TSL to specify data-intensive systems.

<sup>∗</sup>This work was partially supported by the German Research Foundation (DFG) as part of the Collaborative Research Center "Foundations of Perspicuous Software Systems" (TRR 248, 389792660), and by the European Research Council (ERC) Grant OSARES (No. 683300). Philippe Heim and Noemi Passing carried out this work as PhD candidates at Saarland University, Germany.

TSL extends LTL with updates and predicates over arbitrary function terms. An update ⟦x ↢ f(y)⟧ denotes that the result of applying function f to variable y is assigned to variable x. For the music player app, for instance, the update ⟦paused ↢ track(current)⟧ specifies that the track that is currently playing, obtained by applying function track to variable current, is stored in variable paused. Updates are the main characteristic of TSL that differentiates it from other first-order extensions of LTL: They allow for specifying the evolution of variables over time. Thus, programs can be represented in TSL and therefore, for instance, the model checking problem can be encoded.

In the semantics of TSL, functions and predicates are left uninterpreted, i.e., a system satisfies a TSL formula if the formula evaluates to true for all possible interpretations of the function and predicate symbols. This semantics has proven especially useful in the synthesis of reactive programs [15,17], where the synthesis algorithm builds a control structure, while the implementation of the functions and predicates is either done manually or provided by some library. One exemplary success story of TSL-based specification and synthesis of a reactive system is the arcade game Syntroids [17] realized on an FPGA.

In this paper, we define and investigate the satisfiability problem of TSL modulo the standard underlying theory of uninterpreted functions and with respect to other first-order theories such as the theory of equality and Presburger arithmetic. Intuitively, a TSL formula ϕ is satisfiable in a theory T if there is an execution satisfying ϕ that matches the function applications and predicate constraints of an interpretation in T. TSL validity in T is dual: A TSL formula ϕ is valid in a theory T if, and only if, ¬ϕ is unsatisfiable in T.

For LTL, satisfiability is decidable [37] and efficient algorithms for checking the satisfiability of an LTL formula have been implemented in tools like Aalta [25]. Satisfiability checking has numerous applications in the specification and analysis of reactive systems, such as identifying inconsistent system requirements during the design process, comparing different formalizations of the same requirements, and various types of vacuity checking. TSL satisfiability checking extends these applications to data-intensive systems.

We present an algorithm for checking the satisfiability of a TSL formula in the theory of uninterpreted functions. It is based on Büchi stream automata (BSAs), a new kind of ω-automata that we introduce in this paper. BSAs can handle the predicates and updates occurring in TSL formulas. Similar to the relationship between LTL formulas and nondeterministic Büchi automata, BSAs are an automaton representation of TSL formulas, i.e., there exists an equivalent BSA for every TSL formula. Given a TSL formula ϕ, our algorithm constructs an equivalent BSA B<sub>ϕ</sub> and then tries to prove satisfiability and unsatisfiability in parallel: For proving satisfiability, it searches for a lasso that ensures consistency of the function terms in an accepting run of B<sub>ϕ</sub>. If such a lasso is found, ϕ is satisfiable. For proving unsatisfiability, the algorithm discards inconsistent runs of B<sub>ϕ</sub>. If no accepting run is left, ϕ is unsatisfiable.

The algorithm does not always terminate. In fact, we show that TSL satisfiability is neither semi-decidable nor co-semi-decidable in the theory of uninterpreted functions. Thus, no complete algorithm exists. The undecidability result extends to the theory of equality and Presburger arithmetic. There exist, however, (semi-)decidable fragments of TSL in the theory of uninterpreted functions: For satisfiable formulas with a single variable as well as satisfiable reachability formulas, our algorithm is guaranteed to terminate. For slightly more restricted reachability formulas, satisfiability is decidable.

We have implemented the algorithm and evaluated it, clearly illustrating its applicability: It terminates within one second on many randomly generated formulas and scales particularly well for satisfiable formulas. Moreover, it is able to check realistic benchmarks for consistency and to (in-)validate their assumptions. Most notably, we successfully validate the assumptions of a Syntroids module.

A preliminary version of this paper has been published on arXiv [13]. This has already led to further research on TSL modulo theories: Maderbacher and Bloem show that the synthesis problem for TSL modulo theories is undecidable in general and present a synthesis procedure for TSL modulo theories based on a counterexample-guided LTL synthesis loop [27].

Further details and proofs are available in the full version of this paper [14].

### 2 Preliminaries

We assume time to be discrete. A value can be of arbitrary type and we denote the set of all values by 𝒱. The Boolean values are denoted by 𝔹 ⊆ 𝒱. Given n values, an n-ary function f : 𝒱<sup>n</sup> → 𝒱 computes a new value. An n-ary predicate p : 𝒱<sup>n</sup> → 𝔹 determines whether a property over n values is satisfied. The sets of all functions and predicates are denoted by ℱ and 𝒫 ⊆ ℱ, respectively. Constants are both functions of arity zero and values. Starting from 0, we denote the i-th position of an infinite word σ by σ<sub>i</sub> and the i-th component of a tuple t by π<sub>i</sub>(t).

To argue about functions and predicates, we use a term-based notation. Function terms τ<sub>f</sub> are constructed from variables and functions, recursively applied to a set of function terms. Predicate terms τ<sub>p</sub> are constructed by applying a predicate to function terms. The sets of all function and predicate terms are denoted by 𝒯<sub>F</sub> and 𝒯<sub>P</sub> ⊆ 𝒯<sub>F</sub>, respectively. Given sets Σ<sub>F</sub>, Σ<sub>P</sub> of function and predicate symbols with Σ<sub>P</sub> ⊆ Σ<sub>F</sub>, a set V of variables, and a set 𝒱 of values, let ⟨·⟩ : V ∪ Σ<sub>F</sub> → 𝒱 ∪ ℱ be an assignment function assigning a concrete function (predicate) to each function (predicate) symbol and an initial value to each variable. We require ⟨v⟩ ∈ 𝒱, ⟨f⟩ ∈ ℱ, and ⟨p⟩ ∈ 𝒫 for v ∈ V, f ∈ Σ<sub>F</sub>, p ∈ Σ<sub>P</sub>. The evaluation χ<sub>⟨·⟩</sub> : 𝒯<sub>F</sub> → 𝒱 ∪ 𝔹 of function terms is defined by χ<sub>⟨·⟩</sub>(v) := ⟨v⟩ for v ∈ V, and by χ<sub>⟨·⟩</sub>(f(τ<sub>0</sub>, . . . , τ<sub>n</sub>)) := ⟨f⟩(χ<sub>⟨·⟩</sub>(τ<sub>0</sub>), . . . , χ<sub>⟨·⟩</sub>(τ<sub>n</sub>)) for f ∈ Σ<sub>F</sub> ∪ Σ<sub>P</sub>.

Functions and predicates are not tied to a specific interpretation. To restrict the possible interpretations, we utilize first-order theories. A first-order theory T is a tuple (Σ<sub>F</sub>, Σ<sub>P</sub>, A), where Σ<sub>F</sub> and Σ<sub>P</sub> are sets of function and predicate symbols, respectively, and A is a set of closed first-order logic formulas over Σ<sub>F</sub>, Σ<sub>P</sub>, and a set of variables V. For an introduction to first-order logic, we refer to the full version [14]. The elements of A are called the axioms of T and Σ<sub>F</sub> ∪ Σ<sub>P</sub> is called the signature of T. A model M for a theory T = (Σ<sub>F</sub>, Σ<sub>P</sub>, A) is a tuple (𝒱, ⟨·⟩), where 𝒱 is a set of values and ⟨·⟩ is an assignment function as introduced above. Furthermore, (𝒱, ⟨·⟩) is required to entail ϕ<sub>A</sub> for each axiom ϕ<sub>A</sub> ∈ A. The set of all models of a theory T is denoted by Models(T).

In the remainder of this paper, we focus on the following three theories: The theory of uninterpreted functions T<sub>U</sub> is a theory without any axioms, i.e., every symbol is uninterpreted. It allows for arbitrarily many function and predicate symbols. The theory of equality T<sub>E</sub> additionally includes equality, i.e., its axioms enforce the equality symbol = to indeed represent equality. The theory of Presburger arithmetic T<sub>N</sub> implements the idea of numbers. Its axioms define the constants 0 and 1 as well as equality and addition.

### 3 Temporal Stream Logic modulo Theories

In this section, we introduce Temporal Stream Logic modulo theories, an extension of the recently introduced logic Temporal Stream Logic (TSL) [15] with first-order theories. First, we recap the main idea of TSL as well as its syntax and semantics. Afterwards, we extend TSL with first-order theories and define the basic notions of satisfiability and validity for TSL formulas modulo theories.

### 3.1 Temporal Stream Logic

Temporal Stream Logic (TSL) [15] is a temporal logic that separates temporal control and pure data. Data is represented as infinite streams of arbitrary type. TSL allows for checks and manipulations of streams on an abstract level: It focuses on the control flow and abstracts away concrete implementation details. The temporal structure of the data is expressed by temporal operators as in LTL [32]. TSL is especially designed for reactive synthesis and thus distinguishes between uncontrollable input streams and controllable output streams, so-called cells. In this paper, this distinction is not necessary since we consider TSL independent of its usage in synthesis. Thus, we use the notions of streams and cells as synonyms. The finite set of all cells is denoted by C.

In TSL, we use functions f ∈ ℱ to modify cells and predicates p ∈ 𝒫 to perform checks on cells. The cells c ∈ C serve as variables for function terms. The sets of all function and predicate terms over Σ<sub>F</sub>, Σ<sub>P</sub>, and C are denoted by 𝒯<sub>F</sub> and 𝒯<sub>P</sub>. TSL formulas are built according to the following grammar:

$$\varphi, \psi := \mathit{true} \mid \neg \varphi \mid \varphi \land \psi \mid \bigcirc \varphi \mid \varphi \,\mathcal{U}\, \psi \mid \tau\_p \mid [\![\mathsf{c} \leftarrowtail \tau\_f]\!]$$

where c ∈ C, τ<sub>p</sub> ∈ 𝒯<sub>P</sub>, and τ<sub>f</sub> ∈ 𝒯<sub>F</sub>. An update ⟦c ↢ τ<sub>f</sub>⟧ denotes that the value of the function term τ<sub>f</sub> is assigned to cell c. The value of τ<sub>f</sub> may depend on the value of the cells occurring in τ<sub>f</sub>. The temporal operators ◯ϕ and ϕ U ψ are similar to the ones in LTL. We define ◇ϕ = true U ϕ and □ϕ = ¬◇¬ϕ.

Since functions and predicates are represented symbolically, they are not tied to a specific implementation. To assign an interpretation to them, we use an assignment function ⟨·⟩ : C ∪ Σ<sub>F</sub> → 𝒱 ∪ ℱ, where 𝒱 is a set of values. We require ⟨c⟩ ∈ 𝒱, ⟨f⟩ ∈ ℱ, and ⟨p⟩ ∈ 𝒫 for c ∈ C, f ∈ Σ<sub>F</sub>, and p ∈ Σ<sub>P</sub>. Note that ⟨·⟩ also assigns an initial value to each cell. Terms can be compared syntactically with the equivalence relation ≡. The set of all assignments of cells c ∈ C to function terms τ<sub>f</sub> ∈ 𝒯<sub>F</sub> is denoted by 𝒞. A computation ς ∈ 𝒞<sup>ω</sup> is an infinite sequence of assignments of cells to function terms, capturing the behavior of cells over time. The satisfaction of a TSL formula ϕ with respect to ς, a set of values 𝒱, an assignment function ⟨·⟩, and a time step t is defined by:<sup>1</sup>

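$$\begin{aligned}
\varsigma, t \models\_{\mathcal{V},\langle\cdot\rangle} \neg\varphi \;&:\Leftrightarrow\; \varsigma, t \not\models\_{\mathcal{V},\langle\cdot\rangle} \varphi \\
\varsigma, t \models\_{\mathcal{V},\langle\cdot\rangle} \varphi \land \psi \;&:\Leftrightarrow\; \varsigma, t \models\_{\mathcal{V},\langle\cdot\rangle} \varphi \;\land\; \varsigma, t \models\_{\mathcal{V},\langle\cdot\rangle} \psi \\
\varsigma, t \models\_{\mathcal{V},\langle\cdot\rangle} \bigcirc\varphi \;&:\Leftrightarrow\; \varsigma, t + 1 \models\_{\mathcal{V},\langle\cdot\rangle} \varphi \\
\varsigma, t \models\_{\mathcal{V},\langle\cdot\rangle} \varphi \,\mathcal{U}\, \psi \;&:\Leftrightarrow\; \exists t'' \geq t.\ \forall t \leq t' < t''.\ \varsigma, t' \models\_{\mathcal{V},\langle\cdot\rangle} \varphi \;\land\; \varsigma, t'' \models\_{\mathcal{V},\langle\cdot\rangle} \psi \\
\varsigma, t \models\_{\mathcal{V},\langle\cdot\rangle} [\![\mathsf{c} \leftarrowtail \tau]\!] \;&:\Leftrightarrow\; \varsigma\_t(\mathsf{c}) \equiv \tau \\
\varsigma, t \models\_{\mathcal{V},\langle\cdot\rangle} p(\tau\_0, \dots, \tau\_m) \;&:\Leftrightarrow\; \chi\_{\langle\cdot\rangle}(\eta(\varsigma, t, p(\tau\_0, \dots, \tau\_m))),
\end{aligned}$$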

where η : 𝒞<sup>ω</sup> × ℕ × 𝒯<sub>F</sub> → 𝒯<sub>F</sub> is a symbolic evaluation function defined by

$$\eta(\varsigma, t, \mathbf{c}) = \begin{cases} \mathbf{c} & \text{if } t = 0 \\ \eta(\varsigma, t - 1, \varsigma\_{t-1}(\mathbf{c})) & \text{if } t > 0 \end{cases}$$

$$\eta(\varsigma, t, f(\tau\_0, \dots, \tau\_m)) = f(\eta(\varsigma, t, \tau\_0), \dots, \eta(\varsigma, t, \tau\_m))$$

We call (ς, 𝒱, ⟨·⟩) an execution. The satisfaction of a predicate depends on the current and the past steps in the computation. For updates, the satisfaction depends solely on the current step. While updates are only checked syntactically, the satisfaction of predicates depends on the given assignment ⟨·⟩. An execution (ς, 𝒱, ⟨·⟩) satisfies a TSL formula ϕ, denoted ς ⊨<sub>𝒱,⟨·⟩</sub> ϕ, if ς, 0 ⊨<sub>𝒱,⟨·⟩</sub> ϕ holds.

Example 1. Suppose that we have a single cell x, i.e., C = {x}. Consider the computation ς = ({λc.f(x)})<sup>ω</sup>, i.e., f(x) is assigned to cell x in every time step. Let 𝒱 = ℕ be the set of values and let ⟨·⟩ be an assignment function such that the initial value of x is 1, function f corresponds to incrementation, and predicate p determines whether its argument is even (true) or odd (false). Consider the TSL formula ϕ := ⟦x ↢ f(x)⟧ ∧ ¬p(x) ∧ ◯p(x). By the semantics of TSL, we have ς, 0 ⊨<sub>𝒱,⟨·⟩</sub> ϕ if, and only if, (ς<sub>0</sub>(x) ≡ f(x)) ∧ (¬⟨p⟩(⟨x⟩)) ∧ (⟨p⟩(⟨f⟩(⟨x⟩))) holds. The first conjunct clearly holds by construction of ς. Since 1 is odd and 1 + 1 = 2 is even, the other two conjuncts hold as well for the chosen assignment function. Hence, (ς, 𝒱, ⟨·⟩) satisfies ϕ for ς = ({λc.f(x)})<sup>ω</sup>, 𝒱 = ℕ and ⟨·⟩.
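The evaluation in Example 1 can be replayed mechanically. The following Python sketch is only an illustration under stated assumptions: terms are encoded as nested tuples, cells as strings, and the names `eta`, `chi`, `computation`, and `assignment` are hypothetical helpers chosen here, not part of the paper or its implementation.

```python
from typing import Callable, Dict, Union

# Assumed encoding: a term is a cell name (str) or a tuple (symbol, arg_1, ..., arg_n).
Term = Union[str, tuple]
# One step of a computation: the term assigned to each cell.
Step = Dict[str, Term]


def eta(computation: Callable[[int], Step], t: int, term: Term) -> Term:
    """Symbolic evaluation: unfold the cells of `term` through steps 0..t-1."""
    if isinstance(term, str):  # `term` is a cell
        if t == 0:
            return term
        return eta(computation, t - 1, computation(t - 1)[term])
    symbol, *args = term
    return (symbol, *(eta(computation, t, arg) for arg in args))


def chi(assignment: dict, term: Term):
    """Evaluate a term under concrete values for cells and symbols."""
    if isinstance(term, str):
        return assignment[term]
    symbol, *args = term
    return assignment[symbol](*(chi(assignment, arg) for arg in args))


def computation(t: int) -> Step:
    # Example 1: f(x) is assigned to x in every time step.
    return {"x": ("f", "x")}


# Example 1: x starts at 1, f increments, p tests evenness.
assignment = {"x": 1, "f": lambda n: n + 1, "p": lambda n: n % 2 == 0}

p_at_0 = eta(computation, 0, ("p", "x"))  # the term p(x)
p_at_1 = eta(computation, 1, ("p", "x"))  # the term p(f(x))
phi_holds = (
    computation(0)["x"] == ("f", "x")  # the update of x with f(x) at step 0
    and not chi(assignment, p_at_0)    # p(x) is false at step 0
    and chi(assignment, p_at_1)        # p(x) is true in the next step
)
print(p_at_1, phi_holds)
```

The sketch separates the two phases used in the semantics: `eta` builds the whole term symbolically, and `chi` evaluates it only afterwards.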

A computation ς is called finitary with respect to ϕ, denoted fin<sub>ϕ</sub>(ς), if for all cells c ∈ C and for all points in time t, either ς<sub>t</sub>(c) ≡ c holds, or there is an update ⟦c ↢ τ⟧ in ϕ such that ς<sub>t</sub>(c) ≡ τ, i.e., a finitary computation only contains updates occurring in ϕ and self-updates. For ς and ϕ from Example 1, for instance, ς is finitary with respect to ϕ.
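As a small illustration of this definition, the check can be written down for a finite list of steps, for instance the distinct steps of an ultimately periodic computation. This is a sketch under assumptions: terms are nested tuples, cells are strings, and `is_finitary` and `formula_updates` are hypothetical names introduced here.

```python
from typing import Dict, Iterable, Set, Tuple, Union

Term = Union[str, tuple]


def is_finitary(steps: Iterable[Dict[str, Term]],
                formula_updates: Set[Tuple[str, Term]]) -> bool:
    """Every assignment is a self-update or an update occurring in the formula."""
    return all(
        term == cell or (cell, term) in formula_updates
        for step in steps
        for cell, term in step.items()
    )


# Updates occurring in the formula of Example 1: only x is updated with f(x).
updates_in_phi = {("x", ("f", "x"))}
example_1_is_finitary = is_finitary([{"x": ("f", "x")}], updates_in_phi)
```

Because the definition quantifies over all points in time, a finite list only suffices when it covers every distinct step of the computation.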

<sup>1</sup>Note that we use a slightly different, but equivalent, definition than [15]: Instead of evaluating the function and predicate symbols on the fly, we construct the whole term first and then evaluate it recursively using the evaluation function χ<sub>⟨·⟩</sub>.

### 3.2 Extending TSL with Theories

In this paper, we extend TSL with first-order theories. That is, we restrict the possible interpretations of predicate and function symbols to a theory. Hence, we define the notions of satisfiability and validity of a TSL formula modulo a theory T. Intuitively, a TSL formula ϕ is satisfiable in a theory T if there exists an execution satisfying ϕ whose domain and assignment function represent a model in T, i.e., that entail all axioms of T. Formally:

Definition 1 (TSL Satisfiability). Let T = (Σ<sub>F</sub>, Σ<sub>P</sub>, A) be a theory and let ϕ be a TSL formula over Σ<sub>F</sub>, Σ<sub>P</sub>, and C. We call ϕ satisfiable in T if, and only if, there exists an execution (ς, 𝒱, ⟨·⟩), such that ς ⊨<sub>𝒱,⟨·⟩</sub> ϕ and (𝒱, ⟨·⟩) ∈ Models(T) hold. If additionally fin<sub>ϕ</sub>(ς) holds, then ϕ is called finitary satisfiable in T.

Intuitively, a formula ϕ is valid in a theory T, if for all executions and all matching models of the theory the formula is satisfied. Formally:

Definition 2 (TSL Validity). Let T = (Σ<sub>F</sub>, Σ<sub>P</sub>, A) be a theory and let ϕ be a TSL formula over Σ<sub>F</sub>, Σ<sub>P</sub>, and C. The formula ϕ is called valid in T if, and only if, for all executions (ς, 𝒱, ⟨·⟩) with (𝒱, ⟨·⟩) ∈ Models(T), we have ς ⊨<sub>𝒱,⟨·⟩</sub> ϕ. If ς ⊨<sub>𝒱,⟨·⟩</sub> ϕ holds for all executions (ς, 𝒱, ⟨·⟩) with both (𝒱, ⟨·⟩) ∈ Models(T) and fin<sub>ϕ</sub>(ς), then ϕ is called finitary valid in T.

It follows directly from their definitions that (finitary) TSL satisfiability and (finitary) TSL validity are dual. Thus, we focus on TSL satisfiability in the remainder of this paper as the results can easily be extended to TSL validity.

Theorem 1 (Duality of TSL Satisfiability and Validity). Let ϕ be a TSL formula over Σ<sub>F</sub>, Σ<sub>P</sub>, and C and let T = (Σ<sub>F</sub>, Σ<sub>P</sub>, A) be a theory. Then, ϕ is (finitary) satisfiable in T if, and only if, ¬ϕ is not (finitary) valid in T.

### 4 TSL modulo T<sub>U</sub> Satisfiability Checking

In this section, we investigate the satisfiability of TSL modulo the theory of uninterpreted functions T<sub>U</sub>. Since T<sub>U</sub> has no axioms, there are no restrictions on how a model for T<sub>U</sub> evaluates the function and predicate symbols. The only condition is that the evaluated symbols are indeed functions. Therefore, we have (𝒱, ⟨·⟩) ∈ Models(T<sub>U</sub>) for all executions (ς, 𝒱, ⟨·⟩). Thus, finding some execution satisfying a TSL formula ϕ is sufficient for showing that ϕ is satisfiable in T<sub>U</sub>:

Lemma 1. Let ϕ be a TSL formula over Σ<sub>F</sub>, Σ<sub>P</sub>, and C. If there exists an execution (ς, 𝒱, ⟨·⟩) with ς ⊨<sub>𝒱,⟨·⟩</sub> ϕ, then ϕ is satisfiable in T<sub>U</sub>. If additionally fin<sub>ϕ</sub>(ς) holds, then ϕ is finitary satisfiable in T<sub>U</sub>.

In the following, we introduce an (incomplete) algorithm for checking the satisfiability of a TSL formula ϕ in the theory of uninterpreted functions. By Lemma 1, it suffices to find an execution satisfying ϕ to prove its satisfiability in T<sub>U</sub>. To search for such an execution, we introduce Büchi stream automata (BSAs), a new kind of ω-automata that reads executions and allows for dealing with predicates and updates. BSAs are, similar to the connection between LTL and Büchi automata, an automaton representation for TSL. Then, we present the algorithm for checking satisfiability in T<sub>U</sub> based on BSAs.

### 4.1 Büchi Stream Automata

Intuitively, a Büchi stream automaton (BSA) is an ω-automaton with Büchi acceptance condition that reads infinite executions instead of infinite words. Furthermore, it is able to deal with predicates and updates occurring in TSL formulas. To do so, the transitions of a BSA are labeled with guards and update terms. Intuitively, the former define which predicates need to hold when taking the transition. The latter define how the corresponding cells are updated when taking the transition. Formally, a BSA is defined as follows:

Definition 3 (Büchi Stream Automaton). Let Σ<sub>F</sub>, Σ<sub>P</sub> be sets of function and predicate symbols, respectively, and let C be a finite set of cells. A Büchi stream automaton B over Σ<sub>F</sub>, Σ<sub>P</sub>, and C is a tuple (Q, Q<sub>0</sub>, F, •, G, U, δ), where Q is a finite set of states, Q<sub>0</sub> ⊆ Q is a set of initial states, F ⊆ Q is a set of accepting states, • is a fresh term symbol such that • ∉ C ∪ Σ<sub>F</sub> ∪ Σ<sub>P</sub>, G ⊆ 𝒯<sub>P</sub> is a finite set of predicate terms over Σ<sub>F</sub>, Σ<sub>P</sub>, and C, called guards, U ⊆ 𝒯<sub>F</sub> ∪ {•} is a finite set of function terms over Σ<sub>F</sub>, Σ<sub>P</sub>, and C, called update terms, and δ ⊆ Q × (G → 𝔹) × (C → U) × Q is a finite transition relation.

Note that by requiring the update terms U to be a finite set of function terms, not all executions can be read by a BSA: Non-finitary executions contain updates with function terms that do not occur in the given TSL formula. Thus, they may require infinitely many update terms. Therefore, we introduce the fresh term symbol • ∉ C ∪ Σ<sub>F</sub> ∪ Σ<sub>P</sub>. If a transition in a BSA assigns • to a cell c ∈ C, then any function term can be assigned to c. This allows for reading non-finitary executions while maintaining finite representability of BSAs.

Example 2. Consider the three BSAs depicted in Figure 1. If B<sub>1</sub> is in state q<sub>0</sub> and p(x) holds, then cell x is updated with f(x) and B<sub>1</sub> chooses nondeterministically to either stay in q<sub>0</sub> or to move to the accepting state q<sub>1</sub>. In contrast, B<sub>2</sub> is deterministic. Yet, it is incomplete: In both q<sub>0</sub> and q<sub>1</sub>, no guard is satisfied if ¬p(x) holds. Hence, B<sub>2</sub> gets stuck, preventing satisfaction of the Büchi winning condition for any execution containing ¬p(x). The BSA B<sub>3</sub> makes use of the fresh term symbol •: If p(x) holds, any function term can be assigned to x.

Given sets Σ<sub>F</sub>, Σ<sub>P</sub>, C and a BSA B = (Q, Q<sub>0</sub>, F, •, G, U, δ) over Σ<sub>F</sub>, Σ<sub>P</sub>, C, an infinite word c ∈ (Q × (G → 𝔹) × (C → U) × Q)<sup>ω</sup> is called a run of B if, and only if, the first state of c is an initial state, i.e., π<sub>1</sub>(c<sub>0</sub>) ∈ Q<sub>0</sub>, and both c<sub>t</sub> ∈ δ and π<sub>4</sub>(c<sub>t</sub>) = π<sub>1</sub>(c<sub>t+1</sub>) hold for all points in time t ∈ ℕ<sub>0</sub>. Intuitively, a run c is an infinite sequence of tuples (q, g, u, q′) encoding transitions in the BSA: q is the source state, q′ is the target state, g determines which predicate terms hold, and u defines which updates are performed when taking the transition. A run c is called accepting if it contains infinitely many accepting states, i.e., for all points in time t ∈ ℕ<sub>0</sub>, there exists a t′ > t such that π<sub>1</sub>(c<sub>t′</sub>) ∈ F holds.

Fig. 1: Three exemplary Büchi stream automata. Accepting states are marked with double circles. Guards are highlighted in red, update terms in blue.

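Example 3. Let g<sub>1</sub>(p(x)) = true, g<sub>2</sub>(p(x)) = false, and u(x) = f(x). The infinite word c = (q<sub>0</sub>, g<sub>1</sub>, u, q<sub>1</sub>)(q<sub>1</sub>, g<sub>2</sub>, u, q<sub>0</sub>)(q<sub>0</sub>, g<sub>1</sub>, u, q<sub>1</sub>)(q<sub>1</sub>, g<sub>2</sub>, u, q<sub>0</sub>) . . . is a run of BSA B<sub>1</sub> from Figure 1a. It is accepting as it visits q<sub>1</sub> infinitely often.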
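For ultimately periodic runs, the run and acceptance conditions reduce to finite checks, which the following Python sketch illustrates. It rests on several assumptions: a run is given as a finite prefix followed by a loop that repeats forever, transitions are hashable 4-tuples, and the transition set `delta` is a guess consistent with Examples 2 and 3 rather than a transcription of Figure 1a. The function names are hypothetical.

```python
from typing import List, Set, Tuple

# A transition (q, g, u, q'): g and u are frozensets of pairs so that
# transitions are hashable and can be stored in a set.
Transition = Tuple[str, frozenset, frozenset, str]


def is_lasso_run(initial: Set[str], delta: Set[Transition],
                 prefix: List[Transition], loop: List[Transition]) -> bool:
    """Check that prefix . loop^omega starts in an initial state, uses only
    transitions of delta, and that consecutive states match."""
    if not loop:
        return False
    word = prefix + loop + loop[:1]  # repeat the first loop step to check the wrap-around
    if word[0][0] not in initial:
        return False
    if any(step not in delta for step in word):
        return False
    return all(a[3] == b[0] for a, b in zip(word, word[1:]))


def is_accepting_lasso(initial, accepting, delta, prefix, loop) -> bool:
    """A lasso run is accepting iff some source state inside the loop is accepting."""
    return is_lasso_run(initial, delta, prefix, loop) and any(
        step[0] in accepting for step in loop
    )


P_X = ("p", "x")
g1 = frozenset({(P_X, True)})
g2 = frozenset({(P_X, False)})
u = frozenset({("x", ("f", "x"))})

# Hypothetical transition set, guessed from Examples 2 and 3.
delta = {("q0", g1, u, "q0"), ("q0", g1, u, "q1"), ("q1", g2, u, "q0")}
loop = [("q0", g1, u, "q1"), ("q1", g2, u, "q0")]
accepting_run = is_accepting_lasso({"q0"}, {"q1"}, delta, [], loop)
```

Only the source states inside the loop matter for acceptance, since states of the prefix occur finitely often.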

The characteristics of a BSA are its predicates and updates. Thus, it is not sufficient to solely consider accepting runs since the constraints produced by the predicates might be inconsistent. Therefore, we define the execution of a BSA that only permits consistent accepting runs. Intuitively, given a run c of a BSA B, an execution of c consists of a computation ς ∈ 𝒞<sup>ω</sup>, a domain 𝒱, and an assignment ⟨·⟩ such that the updates in ς match the updates in c and such that the recursive evaluation of a predicate term using ⟨·⟩ matches its truth value in ς. To capture the constraints accumulated in ς as well as their truth values, we define the constraint trace ϱ : (τ<sub>p</sub> × 𝔹)<sup>ω</sup> of ς and c: Formally, ϱ for ς and c is defined by ϱ<sub>t</sub> := ∅ if t = 0, and ϱ<sub>t</sub> := ϱ<sub>t−1</sub> ∪ {(η(ς, t−1, τ<sub>p</sub>), π<sub>2</sub>(c<sub>t−1</sub>)(τ<sub>p</sub>)) | τ<sub>p</sub> ∈ G} if t > 0. As an example, reconsider the computation ς from Example 1 and the run c of BSA B<sub>1</sub> from Example 3. The constraint trace of ς and c is given by ϱ = ∅ {(p(x), true)} {(p(x), true), (p(f(x)), false)} . . . . A constraint trace ϱ is called consistent if there is no predicate term τ<sub>p</sub> ∈ 𝒯<sub>P</sub> such that both (τ<sub>p</sub>, true) and (τ<sub>p</sub>, false) occur in ⋃<sub>t∈ℕ<sub>0</sub></sub> ϱ<sub>t</sub>. The trace ϱ from the example above is consistent. Using constraint traces, we now formally define the execution of a BSA:

Definition 4 (Execution of a BSA). Let Σ<sub>F</sub> and Σ<sub>P</sub> be sets of function and predicate symbols, respectively, and let C be a finite set of cells. Let B be a BSA over Σ<sub>F</sub>, Σ<sub>P</sub>, and C and let c be a run of B. Let ς ∈ 𝒞<sup>ω</sup> be an infinite computation and let ⟨·⟩ : C ∪ Σ<sub>F</sub> → 𝒱 ∪ ℱ be an assignment function. Let ϱ be the constraint trace of ς and c. We call (ς, 𝒱, ⟨·⟩) an execution for c if (1) for all points in time t ∈ ℕ<sub>0</sub> and all cells 𝖼 ∈ C, we have either π<sub>3</sub>(c<sub>t</sub>)(𝖼) = ς<sub>t</sub>(𝖼) or π<sub>3</sub>(c<sub>t</sub>)(𝖼) = •, and (2) for all (τ<sub>p</sub>, b) ∈ ⋃<sub>t∈ℕ<sub>0</sub></sub> ϱ<sub>t</sub>, we have χ<sub>⟨·⟩</sub>(τ<sub>p</sub>) = b.

Note that the second requirement can only be fulfilled if the constraint trace is consistent. Consider the computation ς and the assignment function ⟨·⟩ from Example 1, the run c of B<sub>1</sub> from Example 3, and the constraint trace ϱ of ς and c given above. Then, (ς, ℕ, ⟨·⟩) is an execution for c: Since in both ς and c, cell x is always updated with f(x), the updates in ς and c coincide at every point in time. Furthermore, by construction of ⟨·⟩, the constraints of ϱ match the truth values obtained by recursively evaluating ⟨·⟩ for all predicate terms.
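The recursive definition of the constraint trace and the consistency condition can be sketched in Python for a finite number of steps. The encoding is an assumption made for illustration: terms are nested tuples, a step of the computation is a dictionary from cells to terms, and each transition of the run carries its guard valuation as a dictionary. The names `constraint_trace` and `is_consistent` are hypothetical.

```python
from typing import Dict, List, Union

Term = Union[str, tuple]


def eta(steps: List[Dict[str, Term]], t: int, term: Term) -> Term:
    """Symbolic evaluation over a finite list of computation steps."""
    if isinstance(term, str):  # `term` is a cell
        return term if t == 0 else eta(steps, t - 1, steps[t - 1][term])
    symbol, *args = term
    return (symbol, *(eta(steps, t, arg) for arg in args))


def constraint_trace(steps, run, guards, n):
    """Return [rho_0, ..., rho_n], following the recursive definition."""
    trace = [frozenset()]
    for t in range(1, n + 1):
        valuation = run[t - 1][1]  # second component of the (t-1)-th transition
        new = {(eta(steps, t - 1, tp), valuation[tp]) for tp in guards}
        trace.append(trace[-1] | new)
    return trace


def is_consistent(constraints) -> bool:
    """No predicate term is required to be both true and false."""
    true_terms = {tp for tp, value in constraints if value}
    false_terms = {tp for tp, value in constraints if not value}
    return not (true_terms & false_terms)


P_X, F_X = ("p", "x"), ("f", "x")
run = [("q0", {P_X: True}, {"x": F_X}, "q1"),
       ("q1", {P_X: False}, {"x": F_X}, "q0")]
steps = [{"x": F_X}, {"x": F_X}]
trace = constraint_trace(steps, run, [P_X], 2)
```

Replacing the updates by self-updates makes the same guard valuations clash: with `steps = [{"x": "x"}, {"x": "x"}]`, both `(p(x), true)` and `(p(x), false)` are collected, so the trace is inconsistent.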

We define two languages of a BSA B: The symbolic language L(B) is the set of all executions that have a respective accepting run, i.e., (ς, 𝒱, ⟨·⟩) ∈ L(B) if, and only if, there exists an accepting run c such that (ς, 𝒱, ⟨·⟩) is an execution for c. The language L<sub>T</sub>(B) in a theory T is the set of all executions whose domain and assignment function additionally form a model in T, i.e., (ς, 𝒱, ⟨·⟩) ∈ L<sub>T</sub>(B) if, and only if, (ς, 𝒱, ⟨·⟩) ∈ L(B) and (𝒱, ⟨·⟩) ∈ Models(T).

We call a BSA B = (Q, Q<sub>0</sub>, F, •, G, U, δ) finitary if • ∉ U holds. Hence, every run c of a finitary BSA has a unique computation ς and thus a unique constraint trace ϱ. Therefore, for a finite prefix c<sup>p</sup> of c, we can compute its execution effect effect(c<sup>p</sup>) := (λc. η(ς, |c<sup>p</sup>|, c), ϱ<sub>|c<sup>p</sup>|</sub>) from c<sup>p</sup> itself, i.e., without considering ς and ϱ explicitly. Intuitively, the execution effect of c<sup>p</sup> consists of the function terms assigned to the cells during the execution of c<sup>p</sup> as well as the constraints and their truth values on the transitions taken with c<sup>p</sup> in the BSA. The BSAs B<sub>1</sub> and B<sub>2</sub>, depicted in Figure 1, are finitary while B<sub>3</sub> is not. Since B<sub>1</sub> is finitary, consider the prefix c<sup>p</sup> = (q<sub>0</sub>, g<sub>1</sub>, u, q<sub>1</sub>)(q<sub>1</sub>, g<sub>2</sub>, u, q<sub>0</sub>) of the run c of B<sub>1</sub> presented in Example 3. Its execution effect is given by effect(c<sup>p</sup>) = (λc.f(f(x)), {(p(x), true), (p(f(x)), false)}).
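One way to compute the execution effect directly from a prefix is to carry along, for each cell, the term it currently holds, and to substitute these terms into guards and update terms transition by transition. The Python sketch below follows this idea under assumptions: terms are nested tuples, guard valuations and updates are dictionaries, and `substitute` and `effect` are hypothetical names, not the authors' implementation.

```python
from typing import Dict, List, Set, Tuple, Union

Term = Union[str, tuple]


def substitute(term: Term, values: Dict[str, Term]) -> Term:
    """Replace every cell in `term` by the term it currently holds."""
    if isinstance(term, str):
        return values[term]
    symbol, *args = term
    return (symbol, *(substitute(arg, values) for arg in args))


def effect(prefix: List[tuple], cells: List[str]) -> Tuple[Dict[str, Term], Set[tuple]]:
    """Execution effect of a finite run prefix of a finitary BSA."""
    values: Dict[str, Term] = {cell: cell for cell in cells}  # eta(., 0, cell) = cell
    constraints: Set[tuple] = set()
    for _source, guard, update, _target in prefix:
        # Guards are evaluated symbolically before the transition's updates apply.
        constraints |= {(substitute(tp, values), value) for tp, value in guard.items()}
        values = {cell: substitute(update[cell], values) for cell in cells}
    return values, constraints


P_X, F_X = ("p", "x"), ("f", "x")
prefix = [("q0", {P_X: True}, {"x": F_X}, "q1"),
          ("q1", {P_X: False}, {"x": F_X}, "q0")]
cell_terms, constraints = effect(prefix, ["x"])
```

For the prefix from Example 3, `cell_terms` maps x to the encoding of f(f(x)) and `constraints` contains the encodings of (p(x), true) and (p(f(x)), false), in line with the effect stated above.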

An LTL formula ϕ can be translated into a nondeterministic Büchi automaton (NBA) A<sub>ϕ</sub> with L(ϕ) = L(A<sub>ϕ</sub>) [38]. An analogous relation exists between TSL formulas and BSAs: A TSL formula ϕ can be translated into an equivalent BSA B<sub>ϕ</sub>: First, we approximate ϕ by an LTL formula ϕ<sub>LTL</sub>, similarly to the approximation described in [15]. The main idea of the approximation is to represent every function and predicate term as well as every update occurring in ϕ<sub>LTL</sub> by an atomic proposition and to add conjuncts that ensure that exactly one update is performed for every cell in every time step. Second, we build an equivalent NBA A<sub>ϕ<sub>LTL</sub></sub> from ϕ<sub>LTL</sub>. Third, we construct a BSA B<sub>ϕ</sub> from A<sub>ϕ<sub>LTL</sub></sub> by, intuitively, translating the atomic propositions back into predicate terms and updates and by dividing them into guards and update terms, while maintaining the structure of the NBA A<sub>ϕ<sub>LTL</sub></sub>. The full construction of an equivalent BSA B<sub>ϕ</sub> from a TSL formula ϕ is given in the full version [14].

Theorem 2 (TSL to BSA Translation). Given a TSL formula ϕ, there exists an equivalent (finitary) Büchi stream automaton B such that for all theories T, L<sup>T</sup>(B) ≠ ∅ holds if, and only if, ϕ is (finitary) satisfiable in T.

For instance, the TSL formula ϕ<sup>1</sup> := ⟦x ↤ f(x)⟧ ∧ (p(x) ∧ ¬p(x)) is finitary satisfiable in a theory T if, and only if, L<sup>T</sup>(B<sup>1</sup>) ≠ ∅ holds for the BSA B<sup>1</sup> from Figure 1a. Analogously, ϕ<sup>2</sup> := (⟦x ↤ f(x)⟧ ∧ p(x)) ∧ ¬p(f(x)), and ϕ<sup>3</sup> := p(x) correspond to the BSAs B<sup>2</sup> and B<sup>3</sup> from Figure 1b and Figure 1c.

### Algorithm 1: Algorithm for Checking TSL modulo T<sup>U</sup> Satisfiability

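Input: ϕ: TSL formula. Output: SAT, UNSAT.

$$\begin{array}{l}
\mathcal{B} := \text{finitary BSA for } \varphi \text{ as defined in Theorem 2};\\
R := \text{set of runs of } \mathcal{B};\\
\textbf{Function } \mathit{SatSearch}\\
\quad \textbf{for } \mathit{pref}.\mathit{rec}^{\omega} \in \{c \mid c \in R \wedge \mathit{accepting}(c)\} \textbf{ do}\\
\quad\quad (v_p, \_) := \mathit{effect}(\mathit{pref});\\
\quad\quad (v_r, P) := \mathit{effect}(\mathit{pref}.\mathit{rec});\\
\quad\quad \textbf{if } \mathrm{SMT}\Big(\bigwedge_{(t_p, v) \in P} \Big(\begin{cases} t_p & \text{if } v = \mathit{true}\\ \neg t_p & \text{if } v = \mathit{false} \end{cases}\Big) \wedge \bigwedge_{c \in C} v_p(c) = v_r(c)\Big) = \mathrm{SAT} \textbf{ then}\\
\quad\quad\quad \textbf{return } \mathrm{SAT}\\
\textbf{Function } \mathit{UnsatSearch}\\
\quad \textbf{for } n \in \mathbb{N}_0 \textbf{ do}\\
\quad\quad \textbf{for } c \in \{c \mid c \in \mathit{finiteSubwords}(R) \wedge |c| = n\} \textbf{ do}\\
\quad\quad\quad (\_, P) := \mathit{effect}(c);\\
\quad\quad\quad \textbf{if } \exists t_p.\ (t_p, \mathit{true}), (t_p, \mathit{false}) \in P \textbf{ then}\\
\quad\quad\quad\quad R := R \setminus \{c' \mid \exists m \in \mathbb{N}_0.\ \forall 0 \le i < n.\ c'_{i+m} = c_i\}\\
\quad\quad\quad \textbf{if } \{c \mid c \in R \wedge \mathit{accepting}(c)\} = \emptyset \textbf{ then}\\
\quad\quad\quad\quad \textbf{return } \mathrm{UNSAT}\\
\textbf{return } \mathit{parallel}(\mathit{SatSearch}, \mathit{UnsatSearch})
\end{array}$$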

### 4.2 An Algorithm for TSL modulo T<sup>U</sup> Satisfiability Checking

Utilizing BSAs, we present an algorithm for checking the satisfiability of a TSL formula in the theory of uninterpreted functions T<sup>U</sup> in the following. First, recall that finitary computations only perform self-updates or updates that occur in the given TSL formula. Since there are only finitely many cells, the behavior of finitary computations is thus restricted to a finite set of possibilities in each step. Hence, reasoning with finitary computations is easier than reasoning with non-finitary ones. In the algorithm, we make use of the fact that satisfiability can be reduced to finitary satisfiability in the theory of uninterpreted functions, enabling us to focus on finitary computations. The main idea of the reduction is to introduce a new cell for each cell of a given TSL formula. The new cells then capture the values that are constructed by the non-finitary parts of a computation. The proof is given in the full version [14].

Lemma 2. Let ϕ be a TSL formula. Then, there is a TSL formula ϕfin such that ϕ is satisfiable in T<sup>U</sup> if, and only if, ϕ ∧ ϕfin is finitary satisfiable in T<sup>U</sup> .

Algorithm 1 shows the algorithm for checking TSL modulo T<sup>U</sup> satisfiability. It directly works on Büchi stream automata. First, an equivalent BSA B is generated for the input formula ϕ. Then, in parallel, SatSearch tries to prove that ϕ is satisfiable in T<sup>U</sup> while UnsatSearch tries to prove unsatisfiability of ϕ.

SatSearch enumerates all lasso-shaped accepting runs pref.rec<sup>ω</sup> of B, i.e., accepting runs consisting of a finite prefix pref and a finite recurring part rec that is repeated infinitely often. Both pref and rec need to end in the same state of B. Then, the execution effects of pref and pref.rec are computed. SatSearch checks if it is possible to satisfy all predicate constraints induced by pref.rec under the condition that, for each cell, pref and pref.rec construct equal function terms. For this, it utilizes an SMT solver to check the satisfiability of a quantifier-free first-order logic formula, encoding the consistency requirement, in the theory of equality. If the check succeeds, adding rec to pref does not create an inconsistency and hence repeating rec infinitely often is consistent. Therefore, there exists an execution for pref.rec<sup>ω</sup> and thus ϕ is finitary satisfiable in T<sup>U</sup> by Lemma 1.

UnsatSearch computes the execution effect of finite subwords of runs of B and checks whether they are consistent. If a subword is inconsistent, then every run that contains this subword is inconsistent. Hence, there do not exist executions for these runs and therefore they are removed from the set of candidate runs. If there is no accepting candidate run left, then B has an empty symbolic language and thus, by Theorem 2, ϕ is unsatisfiable in T<sup>U</sup> .

Example 4. Consider the finitary BSAs B<sup>1</sup> and B<sup>2</sup> from Figures 1a and 1b as well as their respective TSL formulas ϕ<sup>1</sup> := ⟦x ↤ f(x)⟧ ∧ (p(x) ∧ ¬p(x)) and ϕ<sup>2</sup> := (⟦x ↤ f(x)⟧ ∧ p(x)) ∧ ¬p(f(x)). If we execute Algorithm 1 on ϕ<sup>1</sup>, SatSearch considers the accepting lasso q<sup>0</sup> → q<sup>1</sup> → q<sup>0</sup> in B<sup>1</sup> at some point. Then, pref = ε and rec = (q0, g1, u, q1)(q1, g2, u, q0). Note that pref.rec is the finite prefix c<sup>p</sup> of a run of B<sup>1</sup> from Example 3. Thus, effect(pref.rec) is given by (λc.f(f(x)), {(p(x), true),(p(f(x)), false)}). Since effect(pref) = (λc.c, ∅) holds, SatSearch generates the query p(x) ∧ ¬p(f(x)) ∧ x = f(f(x)), which is satisfiable in T<sup>E</sup>. Hence, we can repeat the lasso q<sup>0</sup> → q<sup>1</sup> → q<sup>0</sup> infinitely often without getting any inconsistent constraints and thus ϕ<sup>1</sup> is satisfiable.
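To see why the query p(x) ∧ ¬p(f(x)) ∧ x = f(f(x)) from Example 4 is satisfiable, one can search for a small finite interpretation of f, p, and x. The Python sketch below does this by brute force. It is a stand-in for illustration only and is not how the implementation proceeds (which uses an SMT solver); the function name `find_model` and the domain bound are assumptions.

```python
from itertools import product


def find_model(max_size=3):
    """Search small finite interpretations for p(x) and not p(f(x)) and x = f(f(x))."""
    for n in range(1, max_size + 1):
        domain = range(n)
        for f in product(domain, repeat=n):              # f[a] is the value of f(a)
            for p in product([False, True], repeat=n):   # p[a] is the truth value of p(a)
                for x in domain:
                    if p[x] and not p[f[x]] and x == f[f[x]]:
                        return {"size": n, "x": x, "f": f, "p": p}
    return None


model = find_model()
print(model)
```

A two-element domain suffices: f swaps the two elements and p holds on exactly one of them, so applying f twice returns to x while p(x) and p(f(x)) differ.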

If we execute Algorithm 1 on ϕ<sup>2</sup>, UnsatSearch checks at some point whether in B<sup>2</sup> the transition sequence q<sup>0</sup> → q<sup>1</sup> followed by the upper self-loop is consistent. This is not the case as it requires p(f(x)) to be true (first transition) and false (second transition): We have ϱ<sup>1</sup> = {(p(x), true),(p(f(x)), false)} and ϱ<sup>2</sup> = ϱ<sup>1</sup> ∪ {(p(f(x)), true),(p(f(f(x))), true)} by definition of the constraint trace. UnsatSearch also checks the transition sequence q<sup>0</sup> → q<sup>1</sup> followed by the lower self-loop, which is also inconsistent. Hence, there is no consistent transition after q<sup>0</sup> → q<sup>1</sup> and thus there is no valid accepting run. Hence, ϕ<sup>2</sup> is unsatisfiable.
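The syntactic consistency check that UnsatSearch performs on a set of constraints can be sketched as follows. The representation of constraints as pairs of a term string and a Boolean, and the name `is_consistent`, are assumptions made for this illustration; the two sets are the ones given in the text above.

```python
def is_consistent(constraints):
    """A constraint set is inconsistent if some predicate term must be both true and false."""
    required_true = {term for term, value in constraints if value}
    required_false = {term for term, value in constraints if not value}
    return not (required_true & required_false)


rho1 = {("p(x)", True), ("p(f(x))", False)}
rho2 = rho1 | {("p(f(x))", True), ("p(f(f(x)))", True)}

print(is_consistent(rho1))
print(is_consistent(rho2))
```

Here the first set passes the check, while the second one fails because p(f(x)) is required to be both true and false.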

Note that the presentation of Algorithm 1 omits implementation details such as the enumeration of accepting loops and the implementation of the infinite set R. A more detailed description addressing these issues is given in [14].

Algorithm 1 is correct. Intuitively, it terminates with SAT if the constraint trace ϱ of the unique computation ς of pref.rec<sup>ω</sup> is consistent. Hence, ϱ defines an assignment ⟨·⟩ such that (ς, V, ⟨·⟩) is an execution of pref.rec<sup>ω</sup>, implying satisfiability of ϕ in T<sup>U</sup>. If the algorithm terminates with UNSAT, then all accepting runs of the BSA are inconsistent and thus no finitary execution satisfying ϕ exists. For the proof, we refer the reader to the full version [14].

Theorem 3 (Correctness of Algorithm 1). Let ϕ be a TSL formula. If Algorithm 1 terminates on ϕ with SAT, then there exists an execution (ς, V, ⟨·⟩) such that both ς ⊨<sub>V,⟨·⟩</sub> ϕ and fin<sub>ϕ</sub>(ς) hold. If Algorithm 1 terminates with UNSAT, then for all executions (ς, V, ⟨·⟩) with fin<sub>ϕ</sub>(ς), ς ⊭<sub>V,⟨·⟩</sub> ϕ holds.

### 5 Undecidability of TSL modulo T<sup>U</sup> Satisfiability

The algorithm for TSL satisfiability checking in T<sup>U</sup> presented in the previous section does not necessarily terminate. In this section, we show that no complete algorithm exists: The satisfiability of a TSL formula in the theory of uninterpreted functions T<sup>U</sup> (TSL-T<sup>U</sup> -SAT) is neither semi-decidable nor co-semi-decidable:

Theorem 4 (Undecidability of TSL-T<sup>U</sup> -SAT). The satisfiability (validity) problem of TSL in T<sup>U</sup> is neither semi-decidable nor co-semi-decidable.

The main intuition behind the undecidability result is that we can encode numbers with TSL in the theory of uninterpreted functions. That is, we are able to encode incrementation, resetting some variable to zero, and equality. We only give the encoding here; for the proof of its correctness, we refer to [14].

Lemma 3. Let f be a unary function, let ≙ be a binary predicate, and let z be a constant. Let f<sup>x</sup>(z) correspond to applying f x-times to z. There exists a TSL formula ϕnum such that every execution entailing ϕnum requires from its models that for all a, b ∈ N0, a = b holds if, and only if, f<sup>a</sup>(z) ≙ f<sup>b</sup>(z) holds.

Proof (Sketch). We construct ϕnum = ϕ<sup>1</sup> ∧ ϕ<sup>2</sup> as follows: The first conjunct is defined by ϕ<sup>1</sup> := ⟦e ↤ z⟧ ∧ (⟦e ↤ f(e)⟧ ∧ e ≙ e). Let

$$\begin{aligned} \varphi\_{eq} &:= (x \widehat{=} b) \to (\{x \leftarrow z\} \wedge \{b \leftarrow f(b)\} \wedge \neg (b \widehat{=} f(b)) \wedge \neg (f(b) \widehat{=} b)) \\ \varphi\_{neq} &:= \neg (x \widehat{=} b) \to (\{x \leftarrow f(x)\} \wedge \{b \leftarrow b\} \wedge \neg (x \widehat{=} f(b)) \wedge \neg (f(b) \widehat{=} x)) \end{aligned}$$

Then, ϕ<sup>2</sup> is defined by ϕ<sup>2</sup> := ⟦x ↤ z⟧ ∧ ⟦b ↤ z⟧ ∧ (ϕeq ∧ ϕneq).

Intuitively, f corresponds to incrementation, z to resetting a variable to zero, and ≙ to equality: ϕ<sup>1</sup> ensures that f<sup>n</sup>(z) ≙ f<sup>n</sup>(z) holds for all n ∈ N0. In contrast, ϕ<sup>2</sup> ensures that if a ≠ b holds, then ¬(f<sup>a</sup>(z) ≙ f<sup>b</sup>(z)): Starting with x = b = z, ϕ<sup>1</sup> ensures that x ≙ b holds initially. Then, ϕeq resets x to z and "increments" b, while ensuring that ¬(f<sup>k</sup>(z) ≙ f<sup>k+1</sup>(z)) holds, where b = f<sup>k</sup>(z). Then, ¬(x ≙ b) holds and thus ϕneq "increments" x until it reaches b = f<sup>k+1</sup>(z), while ensuring that ¬(f<sup>k+1</sup>(z) ≙ f<sup>ℓ</sup>(z)) holds for all ℓ < k + 1.
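The interplay of ϕeq and ϕneq can be replayed with two integer counters that stand for the exponents of f in the cells x and b. The Python sketch below is a simplification under stated assumptions: both implications are taken to be enforced in every step, the predicate ≙ is taken to hold exactly when the exponents agree (which is what the lemma establishes), and only one direction of each symmetric disequality is recorded. The function name `enforced_disequalities` is hypothetical.

```python
def enforced_disequalities(limit):
    """Replay the updates of x and b (as exponents of f over z) until b reaches `limit`.

    Returns the set of pairs (i, j) for which some step demands
    not (f^i(z) ≙ f^j(z)).
    """
    x, b = 0, 0
    pairs = set()
    while b < limit:
        if x == b:
            # phi_eq: reset x to z, increment b, demand not (b ≙ f(b))
            pairs.add((b, b + 1))
            x, b = 0, b + 1
        else:
            # phi_neq: increment x, keep b, demand not (x ≙ f(b))
            pairs.add((x, b + 1))
            x += 1
    return pairs


pairs = enforced_disequalities(3)
print(sorted(pairs))
```

For a limit of 3, the recorded pairs are exactly all (i, j) with i < j ≤ 3, matching the claim that distinct numbers are eventually forced to be unequal under ≙.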

Using this encoding in TSL modulo T<sup>U</sup> , we can construct a TSL formula ϕ<sup>G</sup> for every Goto-program G such that ϕ<sup>G</sup> ∧ ϕnum is satisfiable in T<sup>U</sup> if, and only if, G terminates on every input. Intuitively, ϕ<sup>G</sup> "simulates" G on different inputs by starting with input zero and incrementing the input if the halting location was reached. The temporal operators of TSL allow for requiring that G terminates infinitely often. The construction of ϕ<sup>G</sup> is given in the full version [14]. Since the universal halting problem for Goto programs is neither semi-decidable nor co-semi-decidable, the same undecidability result follows for the satisfiability of a TSL formula modulo T<sup>U</sup> , proving Theorem 4.

Since the theory of Presburger arithmetic T<sup>N</sup> allows for incrementation, resetting a variable to zero, and equality, we can reuse the TSL formula ϕ<sup>G</sup> from above to reduce the universal halting problem for Goto programs to TSL satisfiability modulo T<sup>N</sup> (TSL-TN-SAT), proving undecidability of TSL-TN-SAT. Note that this result holds for other theories that can express incrementation, reset, and equality, for instance Peano Arithmetic, as well.

Theorem 5 (Undecidability of TSL-TN-SAT). The satisfiability (validity) problem of TSL in T<sup>N</sup> is neither semi-decidable nor co-semi-decidable.

Furthermore, equality allows for encoding incrementation and resetting a variable to zero. Hence, similarly to T<sup>U</sup> , there exists a TSL formula ϕenc that, if entailed, enforces a binary function and a constant to behave as incrementation and a reset, respectively. The construction of ϕenc is given in the full version [14]. Thus, the TSL formula ϕ<sup>G</sup> constructed as above for a Goto program G ensures that ϕ<sup>G</sup> ∧ ϕenc is satisfiable in the theory of equality T<sup>E</sup> if, and only if, G terminates on every input. Hence, undecidability of TSL-TE-SAT follows:

Theorem 6 (Undecidability of TSL-TE-SAT). The satisfiability (validity) problem of TSL in T<sup>E</sup> is neither semi-decidable nor co-semi-decidable.

### 6 (Semi-)Decidable Fragments

In Section 5, we showed that TSL satisfiability is undecidable in T<sup>U</sup> . In this section, however, we identify fragments of TSL on which Algorithm 1 terminates for certain inputs. In fact, we present one fragment for which TSL-T<sup>U</sup> -SAT is decidable and two fragments for which TSL-T<sup>U</sup> -SAT is semi-decidable.

First, we consider the TSL reachability fragment, i.e., the fragment of TSL that only permits the next operator and the eventually operator as temporal operators. In our applications, this fragment corresponds to finding counterexamples to safety properties. For satisfiable reachability formulas, Algorithm 1 terminates. The main idea behind the termination is that the BSA of a reachability formula has an accepting lasso-shaped run and since ϕ is satisfiable, this run is consistent. For the proof, we refer to the full version [14].

Lemma 4. Let ϕ be a TSL formula in the reachability fragment. If ϕ is finitary satisfiable in T<sup>U</sup> , then Algorithm 1 terminates on ϕ.

Restricting the reachability fragment further, we consider TSL formulas with updates, predicates, logical operators, next operators, and at most one top-level eventually operator. Such formulas are either completely time-bounded or they are of the form ϕ = ◇ϕ′, where ϕ′ is time-bounded. In the dual validity problem, such formulas correspond to invariants on a fixed time window, a useful property for many applications. Algorithm 1 is guaranteed to terminate for satisfiable and unsatisfiable formulas of the above form if a suitable BSA is constructed. Such a suitable BSA has a single accepting state q indicating that the time-bounded part has been satisfied. Intuitively, a suitable BSA ensures that all runs reaching q are accepting and that only finitely many transition sequences lead to q. Then, if ϕ is unsatisfiable, Algorithm 1 is able to exclude all transition sequences leading to q and thus to terminate. A BSA with infinitely many transition sequences leading to q, in contrast, may cause the algorithm to diverge as it may consider infinitely many consistent subsequences before finding the inconsistent one yielding the exclusion of the sequences leading to q. A suitable BSA exists for every TSL formula in the considered fragment. For the proof, including a more detailed description of suitable BSAs, we refer to the full version [14].

Lemma 5. Let ϕ be a TSL formula with only logical operators, predicates, updates, next operators, and at most one top-level eventually operator. Algorithm 1 terminates on ϕ if it picks a suitable respective BSA.

Note that Algorithm 1 is only a formal decider for this fragment if we ensure that a suitable BSA is always generated. In practice, we experienced that this is usually the case even without posing restrictions on the BSA construction. Lastly, we consider a fragment of TSL that does not restrict the temporal structure of the formula but the number of used cells. For TSL formulas with a single cell, Algorithm 1 always terminates on satisfiable inputs:

Lemma 6. Let ϕ be a TSL formula such that |C| = 1. If ϕ is finitary satisfiable in the theory of uninterpreted functions, then Algorithm 1 terminates on ϕ.

Intuitively, restricting the TSL formula to use only a single cell prevents us from simulating arbitrary computations and thus from reducing from the universal halting problem of Goto programs as in the general undecidability proof. The formal proof, given in the full version [14], however, is unrelated to the above intuition. Combining the three observations, we obtain the following (semi-)decidability results for the satisfiability of fragments of TSL modulo T<sup>U</sup> :

Theorem 7. The satisfiability problem of TSL formulas in T<sup>U</sup> is (1) semi-decidable for the reachability fragment of TSL, (2) decidable for formulas consisting of only logical operators, predicates, updates, next operators, and at most one top-level eventually operator, and (3) semi-decidable for formulas with one cell.

### 7 Evaluation

We implemented the algorithm for checking TSL modulo T<sup>U</sup> satisfiability<sup>2</sup>. We used TSL tools<sup>3</sup> to handle TSL, spot [11] to transform the approximated LTL formulas into NBAs, SyFCo [20] for LTL transformations, and z3 [31] to solve SMT queries. The implementation follows the extended algorithm described in [14]. Since in some cases the default optimizations of spot produce a large overhead in computation time, we first execute it with these and if this does not terminate within 20s, we execute it without optimizations. We evaluated the implementation on three benchmark classes and a machine with an AMD Ryzen 7 processor, using a virtual machine with two logical cores and 6 GB of RAM.

<sup>2</sup>https://github.com/reactive-systems/tsl-satisfiability-modulo-theories

<sup>3</sup>https://github.com/reactive-systems/tsltools

Fig. 2: Execution times in milliseconds of the scalability benchmark series.

Scalability Benchmark Series. We test the scalability of the algorithm with parameterized decidable benchmarks. The timeout is one minute. Note that spot can always perform its optimizations. The satisfiable benchmarks are defined by ϕsat(n) := ( ⟦x ↤ f(x)⟧) ∧ ( ¬p(x)) ∧ ⋀<sup>n</sup><sub>i=0</sub> p(f<sup>i</sup>(x)). The parameter n corresponds to the number of updates that have to be performed to find a satisfiable lasso. By Lemma 6, the algorithm always terminates. The TSL formula ϕunsat(n) := ( (q(x) ↔ ¬q(f<sup>n</sup>(x)))) ∧ ( ⟦x ↤ f(x)⟧) ∧ (q(x) ∧ n q(x)) defines the unsatisfiable benchmarks. The parameter n corresponds to the "distance" in time and number of updates of the conflict causing unsatisfiability. The algorithm always terminates. The results are shown in Figure 2. The algorithm scales particularly well for the satisfiable formulas. However, the experiments indicate an exponential complexity of the algorithm for the unsatisfiable formulas.

Random Benchmark Series. We implemented a random TSL formula generator that uses spot's ltlrand to generate random LTL formulas and then substitutes the atomic propositions with random updates and predicates. The generated TSL formulas have one to three cells, one to three different updates and one to three different predicates. For the LTL formulas generated by ltlrand we use tree sizes from 5 to 95 in steps of five. For each of the tree sizes, we generate 30 formulas; 570 in total. The execution times are shown in Figure 3. On many formulas, the algorithm terminates within one second. The implementation returns SAT for 513 of the 570 formulas. It times out after 30s on 29 formulas. However, the timeouts already occur in the automaton construction, both with and without spot's optimizations. Only 28 formulas are unsatisfiable. For 25 of these unsatisfiable formulas, the intermediate LTL approximation formula is already unsatisfiable, i.e., only for three formulas there is some conflict due to updates and predicate evaluation.
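The counts reported for the random benchmark series can be cross-checked with a few lines of Python. The variable names are chosen for this illustration only; all numbers are taken from the paragraph above.

```python
tree_sizes = list(range(5, 100, 5))       # tree sizes 5, 10, ..., 95
formulas_per_size = 30
total = len(tree_sizes) * formulas_per_size

sat, timeouts, unsat = 513, 29, 28        # outcomes reported for the generated formulas
unsat_in_ltl_approximation = 25           # already unsatisfiable as LTL approximation
unsat_due_to_theory = unsat - unsat_in_ltl_approximation

print(total, sat + timeouts + unsat, unsat_due_to_theory)
```

The three outcome classes add up to the total number of generated formulas, and three unsatisfiable formulas remain whose conflict stems from updates and predicate evaluation.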


Table 1: Execution times in seconds of the application benchmark series.

Applications Benchmark Series. These benchmarks correspond to checking consistency of a specification and validating assumptions of a system. Hence, they illustrate how satisfiability results can aid the system design. The results are presented in Table 1. We introduce two of the benchmarks in more detail here. The other, slightly larger, ones, including different kinds of arbiters, a scheduler, and modules of the Syntroids [17] arcade game, are described in [14].

The Chain benchmark considers a compound system of two chained modules m<sup>1</sup> and m<sup>2</sup> that receive an input value, store it, and forward it to the next system: ϕ<sup>i</sup> := (⟦mem<sub>i</sub> ↤ in<sub>i</sub>⟧ ∧ ⟦in<sub>i+1</sub> ↤ mem<sub>i</sub>⟧) for i ∈ {1, 2}. To simulate the input of the first module, we use an update with an uninterpreted function: ϕinp := ⟦in<sub>1</sub> ↤ f(in<sub>1</sub>)⟧. We require that if some property p holds on the input of m<sup>1</sup>, p also needs to hold on the output of m<sup>2</sup>: ϕspec := (p(in<sub>1</sub>) → p(in<sub>3</sub>)). Our algorithm determines within 8s that (ϕinp ∧ ϕ<sup>1</sup> ∧ ϕ<sup>2</sup>) ∧ ¬ϕspec is satisfiable, detecting an inconsistency: If m<sup>1</sup> stores some value on which p holds, it may overwrite it before m<sup>2</sup> copies it, preventing the value from reaching the output of m<sup>2</sup>.

The Filter benchmark studies a system that "passes through" an input value to a cell if it fulfills a certain property p and holds the previous value otherwise: ϕfilter := ⟦out ↤ d()⟧ ∧ ((p(in) → ⟦out ↤ in⟧) ∧ (¬p(in) → ⟦out ↤ out⟧)), where d is a constant representing an initial default value. The default value fulfills p, i.e., ϕfact := p(d()). As for the chain, ϕinp := ⟦in ↤ f(in)⟧ simulates the input. The filter is valid if p holds on all outputs after the initialization: ϕspec := p(out). Within 400ms, the algorithm confirms that (ϕinp ∧ ϕfact ∧ ϕfilter) ∧ ¬ϕspec is unsatisfiable, validating the filter.
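The behavior that the Filter benchmark validates can be observed on a single concrete interpretation. The Python sketch below fixes p, f, and d arbitrarily (evenness, adding three, and zero; all three are assumptions for illustration) and simplifies the timing of updates. It shows one model only, whereas the algorithm establishes the property for every interpretation in T<sup>U</sup>.

```python
def run_filter(inputs, p, default):
    """Return the contents of `out` after initialisation and after each step."""
    out = default
    outputs = [out]
    for value in inputs:
        out = value if p(value) else out   # pass the input through, or hold the old value
        outputs.append(out)
    return outputs


def is_even(value):
    return value % 2 == 0


def input_stream(start, steps):
    """Inputs produced by repeatedly applying the assumed function f(v) = v + 3."""
    values, current = [], start
    for _ in range(steps):
        values.append(current)
        current += 3
    return values


outputs = run_filter(input_stream(1, 10), is_even, default=0)
print(outputs)
```

In this run every value stored in `out` is even, since odd inputs are never passed through and the default value is even.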

### 8 Related Work

Linear-time temporal logic (LTL) [32] is one of the most popular specification languages for reactive systems. It is based on an underlying assertion logic, such as propositional logic, which is extended with temporal modalities. Satisfiability of propositional LTL has long been known to be decidable [37] and there are efficient tools for LTL satisfiability checking [36,25].

While propositional LTL is very common, especially in hardware verification, LTL with richer assertion logics, such as first-order logic and various theories, has long been used in verification (cf. [28]). Temporal Stream Logic (TSL) [15] was introduced as a new temporal logic for reactive synthesis. In the original TSL semantics, all functions and predicates are uninterpreted. TSL synthesis is undecidable in general, even without inputs or equality, but can be underapproximated by the decidable LTL synthesis problem [15]. TSL has been used to specify and synthesize an arcade game realized on an FPGA [17].

Constraint LTL (CLTL) [6] extends LTL with the possibility of expressing constraints between variables at bounded distance. A constraint system D consists of a concrete domain and an interpretation of relations on the domain. In Constraint LTL over D (CLTL(D)), one can relate variables with relations defined in D. Similar to updates in TSL, CLTL can specify assignment-like statements by utilizing the equality relation. Like for all constraints allowing for a counting mechanism, LTL with Presburger constraints, i.e., CLTL(Z, =, +), is undecidable [6]. However, there exist decidable fragments such as LTL with finite constraint systems [4] and LTL with integer periodicity constraints [5]. Permitting constraints between variables at an unbounded distance leads to undecidability even for constraint systems that only allow equality checks on natural numbers. Restricting such systems to a finite number of constraints yields decidability again [9]. In TSL modulo theories, a theory is given from which a model can be chosen. In CLTL, in contrast, the concrete model is fixed. Therefore, TSL modulo theories cannot be encoded into CLTL in general.

LTL has been extended with the freeze operator [8,7], allowing for storing an input in a register. Then, the stored value can be compared with a current value for equality. Freeze LTL with two registers is undecidable [26,10]. For flat formulas, i.e., formulas where the possible occurrences of the freeze operator are restricted, decidability is regained [10]. Similar to the freeze operator, updates in TSL allow for storing values in cells, and in TSL modulo the theory of equality the equality check can be performed. In TSL, we can perform computations on the stored values, which is not possible in freeze LTL. Hence, freeze LTL can be seen as a special case of TSL. Constraint LTL has been augmented with the freeze operator as well [10]. For an infinite domain equipped with the equality relation, it is undecidable. For flat formulas, decidability is regained [10].

The temporal logic of actions (TLA) [24] is designed to model computer systems. States are assignments of values to variables and actions relate states. Actions can, similar to updates in TSL, describe assignments of variables. A TLA formula may contain state functions and predicates. Actions and state functions are combined with the temporal operators □ and ◇. In contrast to TSL, ○ and U are not permitted. The validity problem for the propositional fragment of TLA, i.e., with uninterpreted functions and predicates, is PSPACE-complete [35].

Similar to temporal logics, dynamic logic [33,19] is an extension of modal logic to reason about computer programs. Dynamic logic allows for stating that after action a, it is necessarily the case that p holds, or it is possible that p holds. Compound actions can be built up from smaller actions. In propositional dynamic logic (PDL) [16], data is omitted, i.e., its terms are actions and propositions. PDL satisfiability is decidable in EXPTIME [34]. First-order dynamic logic (FODL) [18] allows for including data: First-order quantification over a first-order structure, the so-called domain of computation, is allowed. Dynamic logic does not contain temporal operators such as or . Since we consider reactive systems, i.e., systems that continually interact with their environment, temporal logics are better suited than dynamic logics for our setting.

Symbolic automata (see e.g. [2,3]) and register automata [21] are extensions of finite automata that are capable of handling large or infinite alphabets. Register automata have additionally been considered over infinite words in some works (see e.g. [8,22,12]). Similar to BSAs, transitions of symbolic automata are labeled with predicates over a domain of alphabet symbols. Register automata are equipped with a finite amount of registers that, similar to cells in BSAs, can store input values. Symbolic register automata (SRAs) [1] combine the features of both automata models. BSAs have the additional ability to modify the stored values and thus to perform actual computations on them. Moreover, they read infinite instead of finite words. Thus, SRAs can be seen as a special case of BSAs.

More recently, the verification of uninterpreted programs has been investigated [29]. Uninterpreted programs are similar to While-programs with equality and uninterpreted functions and predicates. They are annotated with assumptions. The verification of uninterpreted programs is undecidable in general; for the subclass of coherent uninterpreted programs, however, it is decidable [29]. The verification problem has been extended with theories, i.e., with axioms over the functions and predicates [30]. Adding axioms to coherent uninterpreted programs preserves decidability for some axioms, e.g., idempotence, while it yields undecidability for others, e.g., associativity. The synthesis problem for uninterpreted programs is undecidable in general, but decidable for coherent ones [23].

### 9 Conclusion

We have extended Temporal Stream Logic (TSL) with first-order theories and formalized the satisfiability and validity of a TSL formula in a theory. While we show that TSL satisfiability is neither semi-decidable nor co-semi-decidable in the theory of uninterpreted functions T<sup>U</sup>, the theory of equality T<sup>E</sup>, and Presburger arithmetic T<sup>N</sup>, we identify three fragments for which satisfiability in T<sup>U</sup> is (semi-)decidable: For reachability formulas as well as for formulas with a single cell, TSL satisfiability in T<sup>U</sup> is semi-decidable. For slightly more restricted reachability formulas, it is decidable. Moreover, we have presented an algorithm for checking the satisfiability of a TSL formula in the theory of uninterpreted functions that is based on Büchi stream automata, an automaton representation of TSL formulas introduced in this paper. Satisfiability checking has various applications in the specification and analysis of reactive systems such as identifying inconsistent requirements during the design process. We have implemented the algorithm and evaluated it on three different benchmark series, including consistency checks and assumption validations: The algorithm terminates on many randomly generated formulas within one second and scales particularly well for satisfiable formulas. Moreover, it is able to prove or disprove consistency of realistic benchmarks and to validate or invalidate their assumptions.

### References



### The Different Shades of Infinite Session Types <sup>∗</sup>

Simon J. Gay<sup>1</sup>, Diogo Poças<sup>2</sup>, and Vasco T. Vasconcelos<sup>2</sup>

<sup>1</sup> School of Computing Science, University of Glasgow, UK simon.gay@glasgow.ac.uk

<sup>2</sup> LASIGE, Faculdade de Ciências, Universidade de Lisboa, Portugal {dmpocas,vmvasconcelos}@ciencias.ulisboa.pt

Abstract. Many type systems include infinite types. In session type systems, infinite types are important because they specify communication protocols that are unbounded in time. Usually infinite session types are introduced as simple finite-state expressions rec X.T or by nonparametric equational definitions X ≐ T. Alternatively, some systems of label- or value-dependent session types go beyond simple recursive types. However, leaving dependent types aside, there is a much richer world of infinite session types, ranging through various forms of parametric equational definitions, to arbitrary infinite types in a coinductively defined space. We study infinite session types across a spectrum of shades of grey on the way to the bright light of general infinite types. We identify four points on the spectrum, characterised by different styles of equational definitions, and show that they form a strict hierarchy by establishing bidirectional correspondences with classes of automata: finite-state, 1-counter, pushdown and 2-counter. This allows us to establish decidability and undecidability results for type formation, type equivalence and duality in each class of types. We also consider previous work on context-free session types (and extend it to higher-order) and nested session types, and locate them on our spectrum of infinite types.

Keywords: Infinite types · Recursive types · Session types · Automata and formal language theory

### 1 Introduction

Session types [19,20,23,38] are an established approach to specifying communication protocols, so that protocol implementations can be verified by static typechecking or dynamic monitoring. The simplest protocols are finite: for example, ?int.!bool.end describes a protocol in which an integer is received, then a boolean is sent, and that's all. Most systems of session types, however, include equi-recursive types for greater expressivity. A type that endlessly repeats the simple send-receive protocol is X such that X ≐ ?int.!bool.X, which can also be specified by rec X.?int.!bool.X. More realistic examples usually combine recursion and choice, as in Y such that Y ≐ &{go: ?int.!bool.Y, quit: end}, which offers a choice between go and quit operations, each with its own protocol. A natural observation is that session types look like finite-state automata, but some systems from the literature go beyond the finite-state format: for example, context-free session types [39] and nested session types [9,10], as well as label-dependent session types [40] and value-dependent session types [41].

© The Author(s) 2022

<sup>∗</sup>Supported by EPSRC EP/T014628/1 "Session Types for Reliable Distributed Systems", by FCT PTDC/CCI-CIF/6453/2020 "Safe Concurrent Programming with Session Types", and by the LASIGE Research Unit UIDB/00408/2020 and UIDP/00408/2020. A full version is available at https://arxiv.org/abs/2201.08275 [16].

P. Bouyer and L. Schröder (Eds.): FoSSaCS 2022, LNCS 13242, pp. 347–367, 2022. https://doi.org/10.1007/978-3-030-99253-8\_18

Even without introducing dependent types, a range of definitional formats can be considered for session types, presumably with varying degrees of expressivity, but they have never been systematically studied. That is the aim of the present paper. We consider various forms of parameterised equational definitions, illustrated by six running examples. Because our formal system only has one base type, the terminated channel type end, the running examples simply use end (or skip for context-free session types) as a representative basic message type that could otherwise be bool or int.

Our study of classes of infinite types should be generally applicable; we make it concrete by concentrating on session types, where (potentially) infinite types occur naturally. For the sake of uniformity, all our non-finite session types are introduced by equations, rather than, say, rec-types. Equations may be further parameterised, thus accounting for types that go beyond recursive types. The examples below illustrate the different kinds of parameterised equations we use.

Example 1 (No parameters). Type Tloop is X with equation X ≐ !end.X. Intuitively, Tloop = !end.!end.… continuously outputs values of type end.

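Example 2 (One natural number parameter). Assuming z and s as the natural number constructors and N as a variable over natural numbers, type Tcounter is X⟨z⟩ with equations

$$\begin{aligned} X\langle \mathsf{z} \rangle &\doteq \&\{\mathsf{inc}\colon X\langle \mathsf{s}\,\mathsf{z} \rangle, \mathsf{dump}\colon Y\langle \mathsf{z} \rangle\} & X\langle \mathsf{s}\,N \rangle &\doteq \&\{\mathsf{inc}\colon X\langle \mathsf{s}\,\mathsf{s}\,N \rangle, \mathsf{dump}\colon Y\langle \mathsf{s}\,N \rangle\} \\ Y\langle \mathsf{z} \rangle &\doteq \mathsf{end} & Y\langle \mathsf{s}\,N \rangle &\doteq {!}\mathsf{end}.Y\langle N \rangle \end{aligned}$$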

A sequence of n inc operations followed by a dump triggers a reply of n end output messages.
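The behaviour just described can be sketched in a few lines of Python. This is only an illustration, assuming that X⟨n⟩ offers inc (continuing as X⟨s n⟩) and dump (continuing as Y⟨n⟩), and that Y⟨s n⟩ sends one end message before continuing as Y⟨n⟩; the function name `run_counter` and the string encoding of labels and messages are hypothetical.

```python
def run_counter(labels):
    """Follow a sequence of choice labels starting from X<z>.

    Returns the messages sent in reply to dump, or None if dump
    has not been chosen yet (the protocol is still waiting).
    """
    n = 0  # current parameter of X, encoded as a Python int
    for position, label in enumerate(labels):
        if label == "inc":
            n += 1  # X<n> continues as X<s n>
        elif label == "dump":
            if position != len(labels) - 1:
                raise ValueError("no further choice is offered after dump")
            replies = []
            while n > 0:  # Y<s n> sends end and continues as Y<n>
                replies.append("!end")
                n -= 1
            return replies  # Y<z> is the terminated type end
        else:
            raise ValueError(f"label {label!r} is not offered")
    return None
```

For instance, three inc operations followed by dump yield three `!end` replies, while an immediate dump yields none.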

Example 3 (Context-free types). With type skip used either to finish a session or to move to the next operation, type Ttree is X with equation

$$X \doteq \&\{\mathsf{leaf}\colon \mathsf{skip},\ \mathsf{node}\colon X;\ {?}\mathsf{skip};\ X\}$$

The leaf choice terminates the reception of a binary tree of skip values and the node choice triggers the reception of a (left) tree, followed by ?skip (root), followed by a (right) tree. Even though the development in the rest of the paper considers higher-order types (where messages may convey arbitrary types rather than skip alone), for simplicity our example is first-order.

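Example 4 (One list parameter). Assuming σ and τ as symbols and S as a variable over sequences of symbols (with ε the empty sequence), type Tmeta is X⟨ε⟩ with equations

$$\begin{aligned} X\langle \varepsilon \rangle &\doteq \&\{\mathsf{addOut}\colon X\langle \sigma \rangle, \mathsf{addIn}\colon X\langle \tau \rangle\} \\ X\langle \sigma S \rangle &\doteq \&\{\mathsf{addOut}\colon X\langle \sigma\sigma S \rangle, \mathsf{addIn}\colon X\langle \tau\sigma S \rangle, \mathsf{pop}\colon {!}\mathsf{end}.X\langle S \rangle\} \\ X\langle \tau S \rangle &\doteq \&\{\mathsf{addOut}\colon X\langle \sigma\tau S \rangle, \mathsf{addIn}\colon X\langle \tau\tau S \rangle, \mathsf{pop}\colon {?}\mathsf{end}.X\langle S \rangle\} \end{aligned}$$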

Type Tmeta records simple protocols composed of !end and ?end messages. Symbol σ in a parameter to a type identifier X denotes an output message and symbol τ an input message. The protocol behaves as a stack with two distinct push operations (addOut and addIn). The symbol (σ or τ ) at top of the stack determines whether a pop operation triggers !end or ?end, respectively.
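The stack discipline of Tmeta can be mimicked directly. The sketch below is illustrative only: it assumes the reading given above (addOut pushes σ, addIn pushes τ, pop sends or receives end depending on the top symbol), and the names `run_meta`, `"sigma"` and `"tau"` are hypothetical encodings.

```python
def run_meta(labels):
    """Follow choice labels from X<ε>; return (messages exchanged, final stack)."""
    stack = []   # parameter of X; the top of the stack is the last element
    trace = []   # messages triggered by pop operations
    for label in labels:
        if label == "addOut":
            stack.append("sigma")   # X<S> continues as X<σS>
        elif label == "addIn":
            stack.append("tau")     # X<S> continues as X<τS>
        elif label == "pop":
            if not stack:
                raise ValueError("pop is not offered by X<ε>")
            top = stack.pop()       # continue as X<S>
            trace.append("!end" if top == "sigma" else "?end")
        else:
            raise ValueError(f"label {label!r} is not offered")
    return trace, stack
```

Pushing an output and then an input, followed by two pops, replays the recorded messages in reverse order: first `?end`, then `!end`.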

Example 5 (Nested types). Taking α as a variable over types, type Tnest is X<sub>ε</sub> with equations

$$\begin{aligned} X_{\varepsilon} &\doteq \&\{\mathsf{addOut}\colon X_{\mathsf{out}}\langle X_{\varepsilon} \rangle, \mathsf{addIn}\colon X_{\mathsf{in}}\langle X_{\varepsilon} \rangle\} \\ X_{\mathsf{out}}\langle \alpha \rangle &\doteq \&\{\mathsf{addOut}\colon X_{\mathsf{out}}\langle X_{\mathsf{out}}\langle \alpha \rangle \rangle, \mathsf{addIn}\colon X_{\mathsf{in}}\langle X_{\mathsf{out}}\langle \alpha \rangle \rangle, \mathsf{pop}\colon {!}\mathsf{end}.\alpha\} \\ X_{\mathsf{in}}\langle \alpha \rangle &\doteq \&\{\mathsf{addOut}\colon X_{\mathsf{out}}\langle X_{\mathsf{in}}\langle \alpha \rangle \rangle, \mathsf{addIn}\colon X_{\mathsf{in}}\langle X_{\mathsf{in}}\langle \alpha \rangle \rangle, \mathsf{pop}\colon {?}\mathsf{end}.\alpha\} \end{aligned}$$

Type identifiers such as X<sub>ε</sub>, X<sub>out</sub>, X<sub>in</sub> take an arbitrary but fixed number of arguments. Type Tnest behaves as Tmeta in Example 4. The alignment should be clear if we take, e.g. X<sub>out</sub>⟨X<sub>in</sub>⟨α⟩⟩ for X⟨στS⟩, with σ denoting output and τ denoting input. Type identifiers X<sub>out</sub> and X<sub>in</sub> play the roles of stack symbols (symbols at the top of the stack, σ or τ); type variable α denotes the lower part of the stack (S in Example 4).

Example 6 (Two natural number parameters). Type Titer is X⟨z, z⟩ with


Informally, writing !end<sup>n</sup> for a sequence of n output end messages, these definitions give Titer = ?end.!end<sup>1</sup>.?end.!end<sup>2</sup>.?end.!end<sup>3</sup>.…
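To make the shape of Titer concrete, the following sketch enumerates its messages lazily. It is derived only from the informal description above, not from the defining equations; the generator name `titer_messages` is hypothetical.

```python
from itertools import islice


def titer_messages():
    """Yield the messages of Titer: ?end, !end^1, ?end, !end^2, ..."""
    n = 1
    while True:
        yield "?end"          # one input
        for _ in range(n):    # followed by n outputs
            yield "!end"
        n += 1


# First nine messages: ?end !end ?end !end !end ?end !end !end !end
prefix = list(islice(titer_messages(), 9))
```

The growing runs of outputs are what prevent a finite-state (or single-counter) description: the protocol has to remember both the current round and the position within it.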

It is intuitively clear that Examples 2 and 4 to 6 cannot be expressed without parameters. It is perhaps less clear that each definitional style in Examples 1, 2, 4 and 6 is strictly more expressive than the previous one. This is the main result of the paper. We establish a hierarchy from finite session types all the way up to non-computable types that have no representation at all. The latter certainly exist, because for every infinite binary expansion of a real number between zero and one there is a session type derived by mapping 0 to send and 1 to receive — for cardinality reasons, almost all of these types are non-computable.

Our methodology is to develop the connection between session types and automata, in particular between progressively more expressive definitional styles of types and progressively more powerful classes of automata. We also consider the formal language class corresponding to each class of automata, and the decidability of important properties such as contractiveness, type formation, type equivalence and type duality. Our results are summarised in the table below, establishing a hierarchy of session types in parallel to the Chomsky hierarchy of languages, where by a 1-counter language, we mean a language accepted by a (deterministic) 1-counter automaton and where DCFL abbreviates deterministic context-free languages. The final row of the table emphasises that it is impossible to give an explicit example of a non-computable type or to even state the decision problems.

Context-free and 1-counter types are incomparable. Essentially, both models lie between levels 2 and 3 of the Chomsky hierarchy and correspond to different restrictions of deterministic pushdown automata. Context-free types correspond to constraining automata with a single state, whereas 1-counter types correspond to constraining the stack to have a single symbol.


Our main contributions can be summarized as follows.


<sup>3</sup>Possibly languages accepted by a single-state pushdown automaton with empty stack acceptance.

Fig. 1. Finite and infinite types.

counter session types (Theorem 4). This implies that equivalence for higher-order context-free session types is decidable. The decidability results are not entirely unexpected, given that type equivalence for nested session types was recently shown to be decidable [9], and that these are equivalent to pushdown types. However, our proofs are independent of Das et al. [9].

Organization of the paper In Section 2 we introduce the various classes of session types. In Section 3 we explain how to associate to each given type a labelled infinite tree, as well as a set which we call the language of traces of that type. We also present our results on the strict hierarchy of types and state how previously studied classes of types fit into this hierarchy (Theorem 1). In Section 4 we describe how to convert a type into an automaton accepting its traces. In Section 5 we travel in the converse direction, i.e., from an automaton to the corresponding type, and present a characterisation theorem of the different types in our hierarchy (Theorem 2). We then present our main algorithmic results: type formation, type equivalence and type duality are all decidable up to pushdown types (Theorem 3), and undecidable for 2-counter types (Theorem 4). Due to space constraints, all proofs and additional details can be found in the extended version of our paper at arXiv [16].

### 2 Shades of types

The finite world Finite types are in Fig. 1. The syntax of types is introduced via formation rules, paving the way for infinite types. Session types comprise the terminated type end, the input type ?T.U (input a value of type T and continue as U), the output type !T.U (output a value of type T and continue as U), external choice &{ℓ: T<sub>ℓ</sub>}<sub>ℓ∈L</sub> (receive a label k ∈ L and continue as T<sub>k</sub>) and internal choice ⊕{ℓ: T<sub>ℓ</sub>}<sub>ℓ∈L</sub> (select a label k ∈ L and continue as T<sub>k</sub>). To avoid repeating similar rules, we use the symbol ♯ to denote either ? or !, and the symbol ⋆ to denote either & or ⊕. At this point type equivalence is essentially syntactic equality, but the rule format allows for seamless extensions to infinite settings. Types, type equivalence and duality are all standard [15,20,44]. Note that rule D-Msg defines duality with respect to type equivalence: !T.V and ?U.W are dual types iff the type being exchanged is the same (T ≃ U) and the continuations are dual (V ⊥ W).

Fig. 2. Recursive types. Extends Fig. 1.

For finite types all judgements in Fig. 1 are interpreted inductively. For example, we can show that !(?end.end).!end.end is a type by exhibiting a finite derivation ending with this judgement.

The recursive world Recursive types offer the first glimpse of infinity. The details are in Fig. 2. Recursion is given via equations, rather than µ-types for example, for easier extension. Towards this end, we introduce type identifiers X and equations of the form X ≐ T. The set of type identifiers is finite. We further assume at most one equation for each type identifier, so that there are finitely many type equations. Every valid type T is required to be contractive, that is T contr. Contractiveness ensures that types reveal a type constructor after finitely many unfolds, and excludes undesirable cycles that don't describe any behaviour, e.g. cycles of the form {X ≐ Y, Y ≐ Z, Z ≐ X}. Contractiveness is inductive: we look for finite derivations for T contr judgements. A coinductive interpretation of the rules would allow us to conclude X contr given an equation X ≐ X. In contrast, type formation, type equivalence and duality are now interpreted coinductively.
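The inductive reading of contractiveness amounts to following identifier-to-identifier equations until a type constructor shows up. A minimal sketch, assuming a hypothetical encoding in which identifiers are Python strings and constructors are tuples such as `("end",)` or `("!", T, U)`:

```python
def is_contractive(equations, ty):
    """Return True iff `ty` reveals a type constructor after finitely many unfolds.

    `equations` maps each identifier (a str) to its right-hand side.
    """
    seen = set()
    while isinstance(ty, str):        # ty is still a bare identifier
        if ty in seen or ty not in equations:
            return False              # identifier cycle, or no equation at all
        seen.add(ty)
        ty = equations[ty]            # unfold the equation once
    return True                       # a constructor has been reached


cycle = {"X": "Y", "Y": "Z", "Z": "X"}
loop = {"X": ("!", ("end",), "X")}    # X ≐ !end.X
```

Here `is_contractive(cycle, "X")` fails because the unfolding revisits X, whereas `is_contractive(loop, "X")` succeeds after a single unfold. The check only looks at the head of the type, which mirrors the fact that a finite derivation of T contr never needs to inspect what lies under a constructor.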

For example, no finite derivation would allow showing that Tloop type. Instead we proceed by showing that set {end, !end.X, X} is backward closed [34] for the rules for T type in Fig. 2, given that !end.X, the right-hand side of the equation for X, is contractive.

Natural numbers

$$n ::= \mathsf{z} \mid \mathsf{s}\,n$$

New contractivity rules (ind.), judgement T contr

$$\frac{X\langle \mathsf{z} \rangle \doteq T \quad T \text{ contr}}{X\langle \mathsf{z} \rangle \text{ contr}}\ \text{(C-z)} \qquad \frac{X\langle \mathsf{s}\,N \rangle \doteq T \quad T[n/N] \text{ contr}}{X\langle \mathsf{s}\,n \rangle \text{ contr}}\ \text{(C-s)}$$

New formation rules (coind.), judgement T type

$$\frac{X\langle \mathsf{z} \rangle \doteq T \quad T \text{ contr} \quad T \text{ type}}{X\langle \mathsf{z} \rangle \text{ type}}\ \text{(T-z)} \qquad \frac{X\langle \mathsf{s}\,N \rangle \doteq T \quad T[n/N] \text{ contr} \quad T[n/N] \text{ type}}{X\langle \mathsf{s}\,n \rangle \text{ type}}\ \text{(T-s)}$$

New equivalence rules (coind.), judgement T ≃ T

$$\frac{X\langle \mathsf{z} \rangle \doteq U \quad U \text{ contr} \quad U \simeq T}{X\langle \mathsf{z} \rangle \simeq T}\ \text{(E-zL)} \qquad \frac{X\langle \mathsf{s}\,N \rangle \doteq U \quad U[n/N] \text{ contr} \quad U[n/N] \simeq T}{X\langle \mathsf{s}\,n \rangle \simeq T}\ \text{(E-sL)}$$

Fig. 3. 1-counter types. Extends Fig. 2; removes X; adds X⟨n⟩. Right versions of rules E-zL and E-sL omitted. New rules for duality obtained from those for equivalence by replacing ≃ by ⊥.

The 1-counter world The next step takes us to equations parameterised on natural numbers. The details are in Fig. 3. Natural numbers are built from the nullary constructor z and the unary constructor s. We discuss the changes from the recursive world in Fig. 2. Given a variable N on natural numbers, to each type identifier X we associate at most two equations, X⟨z⟩ ≐ T and X⟨s N⟩ ≐ U. The rules for recursive types are naturally adapted to 1-counter types. Here again, type formation requires a suitable notion of contractiveness to exclude cycles of equations that never reach a type constructor, e.g. cycles of the form {X⟨s N⟩ ≐ Y⟨s s N⟩, Y⟨s N⟩ ≐ X⟨N⟩}. The right-hand side of an equation X⟨s N⟩ ≐ T is not necessarily a type, for it may contain natural number variables (N in particular). However, if n is a natural number, then T[n/N] (that is, T with occurrences of N replaced by n) should be a type (cf. rule T-s). Again, to prove that Tcounter type, we show backward closure of the set {X⟨n⟩, Y⟨n⟩, end, !end.Y⟨n⟩, &{inc: X⟨s n⟩, dump: Y⟨n⟩} | n nat} for the type formation rules.

Higher-order context-free session types A little detour takes us to context-free session types, proposed by Thiemann and Vasconcelos [39] (see also Almeida et al. [1]). Here we follow the distilled presentation of Almeida et al. [2], extending types to the higher-order setting (that is, allowing ?T and !T for an arbitrary type T instead of just basic type skip).

The pushdown world The next extension replaces natural numbers by finite sequences s of symbols σ taken from a given stack alphabet. The details are in Fig. 4. We use ε to denote the empty sequence. The extension from 1-counter is straightforward. Parameters to type identifiers are now sequences of symbols, rather than natural numbers; all the rest remains the same. Once again, to show that Tmeta type, we proceed coinductively.

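Strings

$$s ::= \varepsilon \mid \sigma s$$

New contractivity rules (ind.), judgement T contr

$$\frac{X\langle \varepsilon \rangle \doteq T \quad T \text{ contr}}{X\langle \varepsilon \rangle \text{ contr}}\ \text{(C-z)} \qquad \frac{X\langle \sigma S \rangle \doteq T \quad T[s/S] \text{ contr}}{X\langle \sigma s \rangle \text{ contr}}\ \text{(C-s)}$$

New formation rules (coind.), judgement T type

$$\frac{X\langle \varepsilon \rangle \doteq T \quad T \text{ contr} \quad T \text{ type}}{X\langle \varepsilon \rangle \text{ type}}\ \text{(T-z)} \qquad \frac{X\langle \sigma S \rangle \doteq T \quad T[s/S] \text{ contr} \quad T[s/S] \text{ type}}{X\langle \sigma s \rangle \text{ type}}\ \text{(T-s)}$$

New equivalence rules (coind.), judgement T ≃ T

$$\frac{X\langle \varepsilon \rangle \doteq U \quad U \text{ contr} \quad U \simeq T}{X\langle \varepsilon \rangle \simeq T}\ \text{(E-zL)} \qquad \frac{X\langle \sigma S \rangle \doteq U \quad U[s/S] \text{ contr} \quad U[s/S] \simeq T}{X\langle \sigma s \rangle \simeq T}\ \text{(E-sL)}$$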

Fig. 4. Pushdown types. Extends Fig. 2; removes X; adds X⟨s⟩. Right versions of rules E-zL and E-sL omitted. For duality, proceed as in Fig. 3.

Nested session types A class of types that turns out to be equivalent to pushdown types was recently proposed by Das et al. [9]. The main idea is to have type identifiers that are applied not to natural numbers or to sequences of symbols but to types themselves, and to let type identifiers take a variable (but fixed) number of parameters.

The 2-counter world 2-counter types extend the 1-counter types by introducing equations parameterised on two natural numbers, rather than one. The new rules are a straightforward adaptation of those in Fig. 3 for 1-counter types and are thus omitted. To show that Titer type, we proceed coinductively.

The infinite world The final destination takes us to arbitrary, coinductive, infinite types. The details are in Fig. 1, except that all judgements not explicitly marked are taken coinductively. No equations (of any sort) are needed, just plain infinite types. We also allow choices with an infinite number of branches.

Infinite types arise by interpreting the syntax rules coinductively, which gives rise to potentially infinite chains of interactions. The structure of these arbitrary, coinductively defined, infinite types does not need to follow any pattern (e.g. it does not need to repeat itself), and arguably the best way to think about these objects is as labelled infinite trees (Section 3). Such objects do not, in general, have a finite representation (or finite encoding), which can be shown by a simple cardinality argument. Hence the need to find suitable subclasses of infinite types that can be represented and used in practice.

We can think of a type in two possible ways: as (one of) its representation(s), which is great for practical purposes as we can reason about types by reasoning about their representations; or as the underlying, possibly infinite, coinductive object which is being represented, which is suitable for developing a theory of types, in particular for comparing different models with one another.

### 3 Types, trees and traces

It should be clear that the constructions defined in Section 2 form some sort of type hierarchy; this section studies the hierarchy. In any case, every type lives in the largest universe: that of arbitrary, coinductively defined, infinite types.

To each type one can associate a labelled infinite tree [14,32]. This tree can in turn be expressed by the language of words encoding its paths. Let 𝕃 be the set of labels used in choice types. Following Pierce [32, Definition 21.2.1], a tree is a partial function t ∈ ({d, c} ∪ 𝕃)<sup>∗</sup> → {end, ?, !, &<sub>L</sub>, ⊕<sub>L</sub> | L ⊆ 𝕃} subject to the following constraints (σ ranges over symbols and π over strings of symbols):


The labels d and c are abbreviations for data and continuation, corresponding to the two components of session types for messages.

If all sets L in a tree are finite, the tree is finitely branching. The tree generated by a (finite or infinite) type is coinductively defined as follows.

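$$\begin{aligned} \mathsf{treeof}(\sharp T_{\mathsf{d}}.T_{\mathsf{c}})(\varepsilon) &= \sharp & \mathsf{treeof}(\star\{\ell\colon T_\ell\}_{\ell\in L})(\varepsilon) &= \star_L \\ \mathsf{treeof}(\sharp T_{\mathsf{d}}.T_{\mathsf{c}})(\mathsf{d}\pi) &= \mathsf{treeof}(T_{\mathsf{d}})(\pi) & \mathsf{treeof}(\star\{\ell\colon T_\ell\}_{\ell\in L})(\ell\pi) &= \mathsf{treeof}(T_\ell)(\pi) \\ \mathsf{treeof}(\sharp T_{\mathsf{d}}.T_{\mathsf{c}})(\mathsf{c}\pi) &= \mathsf{treeof}(T_{\mathsf{c}})(\pi) & \mathsf{treeof}(\mathsf{end})(\varepsilon) &= \mathsf{end} \end{aligned}$$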

A path in a tree t is a word obtained by combining the symbols in the domain and the range of t. Given a symbol σ ∈ {?, !, &<sub>L</sub>, ⊕<sub>L</sub> | L ⊆ 𝕃} in the codomain of t (but different from end), and a symbol τ ∈ {d, c} ∪ 𝕃, let ⟨σ, τ⟩ denote the combination of both symbols, viewed as a letter over the alphabet {?, !, &<sub>L</sub>, ⊕<sub>L</sub> | L ⊆ 𝕃} × ({d, c} ∪ 𝕃). For simplicity of exposition, we often drop the angular brackets and the subscript L on the label set, and write, for example, ?c instead of ⟨?, c⟩, ⊕l instead of ⟨⊕<sub>L</sub>, l⟩, etc.

Given a string π in the domain of a tree t, we can define the word path<sub>t</sub>(π) recursively as path<sub>t</sub>(ε) = ε and path<sub>t</sub>(πτ) = path<sub>t</sub>(π) · ⟨t(π), τ⟩. We say that a string π is terminal with respect to t if t(π) = end. For terminal strings, we can further define dpath<sub>t</sub>(π) = path<sub>t</sub>(π) · end.

Finally, we can define the language of (the paths in) a tree t as the set {path<sub>t</sub>(π) | π ∈ dom(t)} ∪ {dpath<sub>t</sub>(π) | π ∈ dom(t), π is terminal with respect to t}. The language of (the traces of) a type T, denoted by L(T), is the language of treeof(T). Note that the traces of types are defined over the following alphabet.

$$
\Sigma = \{?, !, \&_L, \oplus_L \mid L \subseteq \mathbb{L}\} \times (\{\mathsf{d}, \mathsf{c}\} \cup \mathbb{L}) \cup \{\mathsf{end}\} \tag{1}
$$
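For recursive types, a finite fragment of the language of traces can be enumerated directly from the equations. The sketch below is an illustration under several assumptions that are not part of the paper: identifiers are strings, constructors are tuples (`("end",)`, `("!", T, U)`, `("?", T, U)`, and `("&", ((label, T), ...))` or `("+", ...)` for the two choices), letters are written in the abbreviated style such as `"!d"`, and the system is contractive so that `unfold` terminates.

```python
def unfold(eqs, ty):
    """Unfold identifiers until a type constructor is at the head."""
    while isinstance(ty, str):
        ty = eqs[ty]
    return ty


def traces(eqs, ty, depth):
    """Return the words of L(ty) with at most `depth` letters, as tuples."""
    words = {()}                          # the empty path is always a trace
    if depth == 0:
        return words
    ty = unfold(eqs, ty)
    tag = ty[0]
    if tag == "end":
        words.add(("end",))               # dpath: terminal strings get an end letter
        return words
    if tag in ("!", "?"):
        children = {"d": ty[1], "c": ty[2]}   # data and continuation
    else:
        children = dict(ty[1])                # one child per choice label
    for step, child in children.items():
        for word in traces(eqs, child, depth - 1):
            words.add((tag + step,) + word)
    return words


loop = {"X": ("!", ("end",), "X")}        # Tloop: X ≐ !end.X
short_traces = traces(loop, "X", 2)
```

With depth 2, Tloop yields the six words ε, !d, !d·end, !c, !c·!d and !c·!c. Since the enumeration extends paths one letter at a time, every prefix of a returned word is itself returned.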

Fig. 5. The tree and the language of type Tloop.

Fig. 6. The tree and the language of type Tcounter.

Fig. 7. The tree and the language of type Ttree.

Figure 5 depicts (a finite fragment of) the tree corresponding to treeof(Tloop) (Example 1) and (some of the words in) its language L(Tloop). Type Tcounter (Example 2) describes an interaction that keeps track of a counter. Finite fragments of the corresponding tree and language are depicted in Fig. 6. Type Ttree (Example 3) describes the reception of a binary tree of end values. Finite fragments of the corresponding tree and language are depicted in Fig. 7.

In the above examples, the language L(T) is closed under prefixes. This holds for a general type T, since elements of L(T) correspond to paths in treeof(T).

Proposition 1. If w ∈ L(T) and u is a prefix of w then u ∈ L(T).

Another immediate observation is that treeof (resp. L) is an embedding from the class of all types to the class of all trees (resp. all languages).

Proposition 2. Let T and U be two types. The following are equivalent: a) T ≃ U; b) treeof(T) = treeof(U); c) L(T) = L(U).

Proposition 2 tells us that two types are equivalent iff they have the same traces. In general, trace equivalence is a notion weaker than bisimulation [34]. However, both notions coincide for deterministic transition systems. The syntax of (infinite) session types is in fact deterministic (e.g. given a label ℓ for a choice, there can only be one type that continues from &ℓ), which explains our result.
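Because the underlying transition system is deterministic, equivalence of two recursive types can be checked by a standard coinductive comparison that assumes a pair equivalent when it is revisited. The sketch below is an illustration of that general technique, not the algorithm of the paper; the tuple encoding (identifiers as strings, `("end",)`, `("!", T, U)`, `("?", T, U)`, choices as `("&", ((label, T), ...))`) is a hypothetical choice, and contractivity is assumed.

```python
def unfold(eqs, ty):
    """Unfold identifiers until a type constructor is at the head."""
    while isinstance(ty, str):
        ty = eqs[ty]
    return ty


def equivalent(eqs, t, u, assumed=None):
    """Coinductive equivalence check for recursive session types."""
    assumed = set() if assumed is None else assumed
    if (t, u) in assumed:
        return True                       # coinductive hypothesis
    assumed.add((t, u))
    a, b = unfold(eqs, t), unfold(eqs, u)
    if a[0] != b[0]:
        return False                      # different type constructors
    if a[0] == "end":
        return True
    if a[0] in ("!", "?"):
        return (equivalent(eqs, a[1], b[1], assumed)
                and equivalent(eqs, a[2], b[2], assumed))
    left, right = dict(a[1]), dict(b[1])
    if left.keys() != right.keys():
        return False                      # different label sets
    return all(equivalent(eqs, left[l], right[l], assumed) for l in left)


eqs = {
    "X": ("!", ("end",), "X"),                     # X ≐ !end.X
    "Y": ("!", ("end",), ("!", ("end",), "Y")),    # Y ≐ !end.!end.Y
    "A": ("&", (("a", "A"),)),                     # A ≐ &{a: A}
    "B": ("&", (("b", "B"),)),                     # B ≐ &{b: B}
}
```

Here X and Y are equivalent although their equations differ, while A and B are not, even though neither has a terminating path. The check terminates because a finite system of equations has only finitely many pairs of subterms.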

Section 2 introduces eight classes of types. We now distinguish them by means of subscripts: finite types (T type<sub>f</sub>, Fig. 1), recursive types (T type<sub>r</sub>, Fig. 2), 1-counter types (T type<sub>1</sub>, Fig. 3), context-free types (T type<sub>c</sub>), pushdown types (T type<sub>p</sub>, Fig. 4), nested types (T type<sub>n</sub>), 2-counter types (T type<sub>2</sub>) and coinductive, infinite types (T type<sub>∞</sub>, Fig. 1 with rules interpreted coinductively). To each class of types we associate the corresponding class of languages. For example, 𝕋<sub>r</sub> is the set {L(T) | T type<sub>r</sub>}. The strict hierarchy result is as follows:

$$
\mathbb{T}_{\mathsf{f}} \subsetneq \mathbb{T}_{\mathsf{r}} \subsetneq \mathbb{T}_{\mathsf{1}} \subsetneq \mathbb{T}_{\mathsf{p}} \subsetneq \mathbb{T}_{\mathsf{2}} \subsetneq \mathbb{T}_{\infty}
$$

We remark that the last step in the chain of strict inclusions is obtained by a cardinality argument, since the set 𝕋<sub>∞</sub> is uncountable. This shows an even stronger statement: for any finite representation system (including the systems 𝕋<sub>f</sub> to 𝕋<sub>2</sub>, as well as 𝕋<sub>c</sub> and 𝕋<sub>n</sub>), there is an infinite, uncountable set of types that cannot be represented by that system.

We now turn our attention to nested types (T type<sub>n</sub>), which turn out to be equivalent to pushdown types, and further establish equivalent sub-hierarchies inside both classes, parameterised by the 'complexity' of the corresponding representations. For pushdown session types, a natural measure of complexity is the number of type identifiers required to represent a given type. This number can be arbitrarily large, but is always finite. For a given n ∈ ℕ, we let 𝕋<sub>p</sub><sup>n</sup> denote the subset corresponding to those types that can be represented with at most n type identifiers. When n = 0, there are no identifiers, and we can only represent finite types. As n increases, so does the expressivity of our constructions, and we have the infinite chain of inclusions<sup>4</sup>

$$\mathbb{T}_{\mathsf{f}} = \mathbb{T}_{\mathsf{p}}^{0} \subsetneq \mathbb{T}_{\mathsf{p}}^{1} \subseteq \mathbb{T}_{\mathsf{p}}^{2} \subseteq \cdots \subseteq \mathbb{T}_{\mathsf{p}}.$$

Similarly, for nested session types we can define a hierarchy by looking at the arities of the type identifiers used. For a given n ∈ ℕ, we let 𝕋<sub>n</sub><sup>n</sup> denote the subset corresponding to the nested session types whose type identifiers have arity at most n. When n = 0 all type identifiers are constant, and we recover the class of recursive types. As n increases, so does the expressivity, and we also have an infinite chain of inclusions<sup>4</sup>

$$\mathbb{T}_{\mathsf{r}} = \mathbb{T}_{\mathsf{n}}^{0} \subsetneq \mathbb{T}_{\mathsf{n}}^{1} \subseteq \mathbb{T}_{\mathsf{n}}^{2} \subseteq \cdots \subseteq \mathbb{T}_{\mathsf{n}}.$$

<sup>4</sup>Although not proven, we conjecture that all inclusions are strict.

It turns out that these hierarchies are one and the same (with the exception of the bottom level), so that we have<sup>4</sup>

$$\mathbb{T}_{\mathsf{f}} = \mathbb{T}_{\mathsf{p}}^{0} \subsetneq \mathbb{T}_{\mathsf{r}} = \mathbb{T}_{\mathsf{n}}^{0} \subsetneq \mathbb{T}_{\mathsf{p}}^{1} = \mathbb{T}_{\mathsf{n}}^{1} \subseteq \mathbb{T}_{\mathsf{p}}^{2} = \mathbb{T}_{\mathsf{n}}^{2} \subseteq \cdots \subseteq \mathbb{T}_{\mathsf{p}} = \mathbb{T}_{\mathsf{n}}.$$

Higher-order context-free types (denoted by 𝕋<sub>c</sub>) lie between levels 0 and 1 in the sub-hierarchies above, i.e., they can represent recursive types, and can be represented by pushdown session types using at most one type identifier, or equivalently, by nested session types with either constant or unary type identifiers, so that we have

$$
\mathbb{T}_{\mathsf{r}} \subsetneq \mathbb{T}_{\mathsf{c}} \subsetneq \mathbb{T}_{\mathsf{p}}^{1} = \mathbb{T}_{\mathsf{n}}^{1}.
$$

We have a stronger observation than the inclusion 𝕋<sub>c</sub> ⊊ 𝕋<sub>p</sub><sup>1</sup>. Context-free session types are included in pushdown session types which have only one type identifier X, and where the equation X⟨ε⟩ ≐ end accounts for the only occurrence of end. The latter means that the type ends iff the state X⟨ε⟩ is reached, that is, iff the stack is empty. Thus, we can intuitively think of context-free session types as pushdown types with a single identifier and an empty stack acceptance criterion. This observation points to the fact that the qualifier 'context-free' in the so-called context-free session types is a misnomer [9].

The result below summarises the entire hierarchy.

### Theorem 1 (Inclusions).

$$\begin{array}{l} \mathbb{T}_{\mathsf{f}} = \mathbb{T}_{\mathsf{p}}^{0} \subsetneq \mathbb{T}_{\mathsf{r}} = \mathbb{T}_{\mathsf{n}}^{0} \subsetneq \mathbb{T}_{\mathsf{1}} \subsetneq \mathbb{T}_{\mathsf{p}} = \mathbb{T}_{\mathsf{n}} \subsetneq \mathbb{T}_{\mathsf{2}} \subsetneq \mathbb{T}_{\infty} \\[1ex] \mathbb{T}_{\mathsf{r}} \subsetneq \mathbb{T}_{\mathsf{c}} \subsetneq \mathbb{T}_{\mathsf{p}}^{1} = \mathbb{T}_{\mathsf{n}}^{1} \subseteq \mathbb{T}_{\mathsf{p}}^{2} = \mathbb{T}_{\mathsf{n}}^{2} \subseteq \cdots \subseteq \mathbb{T}_{\mathsf{p}} = \mathbb{T}_{\mathsf{n}} \end{array}$$

### 4 From types to automata

This section describes procedures to convert types in different levels of the hierarchy (recursive systems, 1-counter, pushdown and 2-counter) into automata at the same level. All constructions follow the same guiding principles, so we focus on the bottom level of the hierarchy (recursive systems) and then highlight the main differences as we advance in the hierarchy.

All automata that we consider are deterministic and total, i.e., the transition functions are such that any input word has a well-defined, unique computation path. We use the alphabet Σ defined in (1). Standard references in automata theory are Hopcroft and Ullman's book [22] and Valiant's PhD thesis [42].

Recursive types and finite-state automata Following the usual notation, a (deterministic) finite-state automaton is given by a set Q of states, with a specified initial state q<sub>0</sub> ∈ Q, a transition function δ : Q × Σ → Q, and a set A ⊆ Q of accepting states. Given a finite word a<sub>1</sub>a<sub>2</sub>⋯a<sub>n</sub>, its execution by the automaton yields the sequence of states s<sub>0</sub>, s<sub>1</sub>, …, s<sub>n</sub> where s<sub>0</sub> = q<sub>0</sub> and s<sub>i+1</sub> = δ(s<sub>i</sub>, a<sub>i+1</sub>). A word is accepted by the automaton if its execution ends in an accepting state.

The definition of finite-state automata can be augmented into other types of automata. Essentially: in a 1-counter automaton we have access to a counter (with operations for incrementing, decrementing, and checking whether the counter is non-zero), in addition to the current state; in a pushdown automaton we have access to a stack (with operations for pushing a symbol, popping a symbol, and observing the top symbol of the stack); in a 2-counter automaton, we have access to two counters.

Suppose we are given a system of recursive equations {X<sub>i</sub> ≐ T<sub>i</sub>}<sub>i∈I</sub> over a set X = {X<sub>i</sub>}<sub>i∈I</sub> (which may or may not be contractive, i.e., define a type). Our first step is to convert this system into a normal form in which every right-hand side is either an identifier X, or a single application of one of the type constructors end, ?X.Y, !X.Y, &{ℓ: X<sub>ℓ</sub>}<sub>ℓ∈L</sub> or ⊕{ℓ: X<sub>ℓ</sub>}<sub>ℓ∈L</sub>. We can do this by introducing fresh, intermediate identifiers as needed. Essentially, whenever we have an equation X ≐ ?T<sub>1</sub>.T<sub>2</sub> where T<sub>1</sub>, T<sub>2</sub> are not identifiers, we add two new identifiers X′, X″, replace the above equation by X ≐ ?X′.X″, and add two new equations X′ ≐ T<sub>1</sub> and X″ ≐ T<sub>2</sub>. The process is similar for the other type constructors. By doing this repeatedly, we "break down" a long equation into many small equations. The number of new identifiers is linear in the size of the original system of equations.

Given such a system, we construct a finite-state automaton (over the alphabet Σ) as follows. The automaton has a state q<sub>X</sub> for every type identifier X, and two additional states: an 'end' state q<sub>end</sub> and an 'error' state q<sub>error</sub>. The transitions from q<sub>error</sub> are described by q<sub>error</sub> <sup>a</sup>→ q<sub>error</sub> for every symbol a. Similarly, the transitions at q<sub>end</sub> are described by q<sub>end</sub> <sup>a</sup>→ q<sub>error</sub> for every symbol a. The transitions at state q<sub>X</sub> are given by the corresponding equation for identifier X, in the obvious way. Some examples:
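The normal-form conversion can be sketched as a worklist procedure. The encoding is hypothetical (identifiers as strings, `("end",)`, `("!", T, U)`, `("?", T, U)`, choices as `("&", ((label, T), ...))` or `("+", ...)`), and the fresh names `_N0`, `_N1`, … are assumed not to clash with identifiers already in the system.

```python
from itertools import count


def normalise(eqs):
    """Return an equivalent system whose right-hand sides are in normal form."""
    fresh = count()
    result = {}
    pending = list(eqs.items())           # equations still to be processed

    def name_of(ty):
        """Return an identifier standing for ty, adding an equation if needed."""
        if isinstance(ty, str):
            return ty                     # already an identifier
        new = f"_N{next(fresh)}"
        pending.append((new, ty))         # new equation: new ≐ ty
        return new

    while pending:
        ident, rhs = pending.pop()
        if isinstance(rhs, str) or rhs[0] == "end":
            result[ident] = rhs
        elif rhs[0] in ("!", "?"):
            result[ident] = (rhs[0], name_of(rhs[1]), name_of(rhs[2]))
        else:                             # external or internal choice
            result[ident] = (rhs[0], tuple((l, name_of(t)) for l, t in rhs[1]))
    return result


normal_loop = normalise({"X": ("!", ("end",), "X")})
```

For Tloop this produces the two equations X ≐ !_N0.X and _N0 ≐ end. Each constructor occurrence in the input gives rise to at most one fresh identifier, which matches the linear bound stated above.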


We define all states other than q<sub>error</sub> to be accepting states.<sup>5</sup> Notice that the finite-state automaton described above is an automaton with possible ε-moves. Although, by definition, deterministic finite-state automata do not permit ε-moves, in our case paths of ε-moves are uniquely determined and always reach a state without outgoing ε-transitions (they cannot become stuck in a loop, assuming type contractivity). We can convert the given automaton into an equivalent automaton without ε-moves by 'shortcutting' such moves. Formally, suppose a state X has an outgoing ε-transition to Y; by construction, it is X's only outgoing transition. Assuming X and Y are different states, we can change every transition entering X and make it enter Y instead; finally, we can remove state X (hence removing the ε-transition from X).

<sup>5</sup>We need all states to be accepting, since we might need to look at finite traces to distinguish between two types. For example, X ≐ &{a: X} and Y ≐ &{b: Y} define non-equivalent types that have no finite terminating paths.

360 S. Gay et al.

$$X \doteq {!}\mathsf{end}.X \qquad\qquad q_X \xrightarrow{!\mathsf{c}} q_X, \quad q_X \xrightarrow{!\mathsf{d}} q_Y, \quad q_Y \xrightarrow{\mathsf{end}} q_{\mathsf{end}}$$

Fig. 8. An automaton for Tloop with initial state q<sub>X</sub>. All depicted states are accepting.

We show in Fig. 8 the automaton that corresponds to type Tloop (Example 1). Every missing transition points to qerror which is not shown. In our examples, all depicted states are accepting, so we omit the usual double circle notation.
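The construction for recursive systems can be sketched as follows. This is one reading of "in the obvious way" and should be taken as an assumption: an equation X ≐ ♯Y.Z contributes transitions on ♯d to q<sub>Y</sub> and on ♯c to q<sub>Z</sub>, a choice contributes one transition per label, X ≐ end contributes a transition on end to q<sub>end</sub>, and identifier-to-identifier equations are ε-moves that are shortcut. The encoding and the names `build_automaton` and `accepts` are hypothetical; the input is assumed to be normalised and contractive.

```python
def resolve(eqs, ident):
    """Shortcut ε-moves: follow equations of the form X ≐ Y."""
    while isinstance(eqs[ident], str):
        ident = eqs[ident]
    return ident


def build_automaton(eqs):
    """Return the transition table; missing entries stand for q_error."""
    delta = {}
    for ident, rhs in eqs.items():
        if isinstance(rhs, str):
            continue                                  # ε-move, handled by resolve
        if rhs[0] == "end":
            delta[(ident, "end")] = "q_end"
        elif rhs[0] in ("!", "?"):
            delta[(ident, rhs[0] + "d")] = resolve(eqs, rhs[1])
            delta[(ident, rhs[0] + "c")] = resolve(eqs, rhs[2])
        else:
            for label, target in rhs[1]:
                delta[(ident, rhs[0] + label)] = resolve(eqs, target)
    return delta


def accepts(eqs, delta, start, word):
    """Run the automaton; every state except q_error is accepting."""
    state = resolve(eqs, start)
    for letter in word:
        state = delta.get((state, letter), "q_error")
    return state != "q_error"


loop = {"X": ("!", "Y", "X"), "Y": ("end",)}          # normalised Tloop
loop_delta = build_automaton(loop)
```

On the normalised Tloop, the table contains exactly the three transitions of Fig. 8, so the word !c·!c·!d·end is accepted while !d·!c falls into the error state.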

1-counter types For 1-counter systems, the only difference in the above construction is that instead of non-parameterised identifiers our equations now involve terms of the form Xhzi, Xhs zi, XhNi, Xhs Ni, etc. We assume for simplicity that the identifiers appearing in these equations are restricted as follows: if the left-hand side of an equation is of the form Xhzi, then the identifiers appearing in the right-hand side must be of the form X<sup>0</sup> hzi or X<sup>0</sup> hs zi (with X<sup>0</sup> possibly different from X); and if the left-hand side of an equation is of the form Xhs Ni, then the identifiers appearing in the right-hand side must be of the form X<sup>0</sup> hNi, X0 hs Ni or X<sup>0</sup> hss Ni. Any system can be converted into this form by adding finitely many new equations, e.g. Xhzi .<sup>=</sup> <sup>Y</sup> <sup>h</sup>sss <sup>z</sup><sup>i</sup> can be rewritten as

$$X \langle \mathbf{z} \rangle \dot{=} X' \langle \mathbf{s} \mathbf{z} \rangle \qquad \qquad X' \langle \mathbf{s} \, N \rangle \dot{=} X'' \langle \mathbf{s} \, \mathbf{s} \, N \rangle \qquad \qquad X'' \langle \mathbf{s} \, N \rangle \dot{=} Y \langle \mathbf{s} \, \mathbf{s} \, N \rangle$$

and X⟨s N⟩ ≐ Y⟨z⟩ can be rewritten as

$$X \langle \mathsf{s} N \rangle \doteq X' \langle N \rangle \qquad \qquad X' \langle \mathsf{s} N \rangle \dot{=} X' \langle N \rangle \qquad \qquad X' \langle \mathsf{z} \rangle \dot{=} Y \langle \mathsf{z} \rangle.$$

We can convert a 1-counter type into a (deterministic) 1-counter automaton, so that the transition function depends on whether the counter value is zero (corresponding to a left-hand side of the form X⟨z⟩) or positive (corresponding to a left-hand side of the form X⟨s N⟩). Furthermore, the changes in the counter value along the identifiers are incorporated by changes in the counter value along the automaton. For example, take equation X⟨s N⟩ ≐ Y⟨N⟩. The corresponding transition from (q<sub>X</sub>, s, ε) to q<sub>Y</sub> decrements the counter.

For illustration purposes, we show how to construct a 1-counter automaton accepting L(Tcounter) from Example 2. First, we need to convert the equation for Y⟨s N⟩ into normal form. We add an extra identifier Z and write

$$\begin{aligned} X \langle \mathbf{z} \rangle &\doteq \& \{ \text{inc} \colon X \langle \mathbf{s} \, \mathbf{z} \rangle, \text{dump} \colon Y \langle \mathbf{z} \rangle \} & X \langle \mathbf{s} \, N \rangle &\doteq \& \{ \text{inc} \colon X \langle \mathbf{s} \, \mathbf{s} \, N \rangle, \text{dump} \colon Y \langle \mathbf{s} \, N \rangle \} \\ Y \langle \mathbf{z} \rangle &\doteq \text{end} & Y \langle \mathbf{s} \, N \rangle &\doteq {!}Z \langle \mathbf{s} \, N \rangle . Y \langle N \rangle \\ Z \langle \mathbf{z} \rangle &\doteq \text{end} & Z \langle \mathbf{s} \, N \rangle &\doteq \text{end} \end{aligned}$$

The corresponding automaton has states q<sub>X</sub>, q<sub>Y</sub>, q<sub>Z</sub>, one for each type identifier X, Y, Z, as well as an additional state q<sub>end</sub>. The outgoing transitions for state q<sub>X</sub> are the same regardless of the counter value: either read &inc, incrementing the counter and staying in q<sub>X</sub>; or read &dump, keeping the counter value and moving to q<sub>Y</sub>. For state q<sub>Y</sub>, if the counter is zero, we can read end while moving to state q<sub>end</sub>. On the other hand, if the counter is non-zero, we can read !d, keeping the counter value and moving to q<sub>Z</sub>; or read !c, decrementing the counter value and staying in q<sub>Y</sub>. Finally, for state q<sub>Z</sub> we can only read end and move to state q<sub>end</sub>. Whatever we write in the equation for Z⟨z⟩ is irrelevant, as this configuration is unreachable. All of this gives the automaton in Fig. 9.

$$\begin{aligned} X \langle \mathbf{z} \rangle &\doteq \& \{ \text{inc} \colon X \langle \mathbf{s} \, \mathbf{z} \rangle, \text{dump} \colon Y \langle \mathbf{z} \rangle \} & Y \langle \mathbf{z} \rangle &\doteq \text{end} \\ X \langle \mathbf{s} \, N \rangle &\doteq \& \{ \text{inc} \colon X \langle \mathbf{s} \, \mathbf{s} \, N \rangle, \text{dump} \colon Y \langle \mathbf{s} \, N \rangle \} & Y \langle \mathbf{s} \, N \rangle &\doteq {!}\text{end} . Y \langle N \rangle \end{aligned}$$

Fig. 9. A 1-counter automaton for type Tcounter = X⟨z⟩. The initial configuration is (q<sub>X</sub>, 0). Here a transition δ(q, g, a) = (o, q′) is denoted by an arc from q to q′ with label g, a | o, where g ∈ {z, s}, a ∈ {ε} ∪ Σ, and o ∈ {=, +, −}. If both g = z and g = s lead to the same transition, then we use the symbol · to refer to both transitions. All depicted states are accepting, and non-depicted transitions lead to a non-accepting sink state.
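To make the transitions of the automaton for Tcounter concrete, the following sketch encodes them as a table and runs the automaton on a trace. The encoding is our own assumption: trace symbols are the strings `"&inc"`, `"&dump"`, `"!d"`, `"!c"` and `"end"`, guards are `"z"` (counter zero) and `"s"` (counter positive), and a missing table entry stands for the non-accepting sink state.

```python
# (state, guard, symbol) -> (counter update, next state)
DELTA = {}
for guard in ("z", "s"):
    # q_X and q_Z behave the same regardless of the counter value.
    DELTA[("qX", guard, "&inc")] = (+1, "qX")
    DELTA[("qX", guard, "&dump")] = (0, "qY")
    DELTA[("qZ", guard, "end")] = (0, "qend")
DELTA[("qY", "z", "end")] = (0, "qend")  # counter is zero
DELTA[("qY", "s", "!d")] = (0, "qZ")     # counter is non-zero, keep it
DELTA[("qY", "s", "!c")] = (-1, "qY")    # counter is non-zero, decrement it


def accepts(trace, state="qX", counter=0):
    """Run the automaton from configuration (state, counter) on a list of symbols."""
    for symbol in trace:
        guard = "z" if counter == 0 else "s"
        step = DELTA.get((state, guard, symbol))
        if step is None:
            return False  # non-depicted transition: rejecting sink
        update, state = step
        counter += update
    return True  # every depicted state is accepting


ok = accepts(["&inc", "&inc", "&dump", "!c", "!c", "end"])
bad = accepts(["&inc", "&dump", "end"])  # end is not enabled while the counter is 1
```

Because every depicted state is accepting, a trace is rejected only when it leaves the table, which matches the convention that missing transitions lead to the sink.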

Pushdown types Pushdown systems are similar, but now the behaviour of an identifier is specified by |∆| + 1 equations, where ∆ is the stack alphabet; one equation for each possible symbol at the top of the stack, and one equation for the case that the stack is empty. Accordingly, we use a (deterministic) pushdown automaton to simulate the stack contents by means of push and pop operations. The transitions from a state q<sub>X</sub> and a given stack indicator in {ε} ∪ ∆ are once more given by the corresponding equation with X as the type identifier on the left-hand side. Fig. 10 shows a pushdown automaton accepting L(Tmeta).

2-counter types The translation to 2-counter automata is as for the 1-counter case, but now the behaviour is specified by one of four different cases, depending on which of the two counters is zero or non-zero. Accordingly, we use a (deterministic) 2-counter automaton with the appropriate transition function.

$$\begin{aligned} X\langle \varepsilon \rangle & \doteq \& \{ \text{addOut} \colon X\langle \sigma \rangle, \text{addIn} \colon X\langle \tau \rangle \} \\ X\langle \sigma S \rangle & \doteq \& \{ \text{addOut} \colon X\langle \sigma \sigma S \rangle, \text{addIn} \colon X\langle \tau \sigma S \rangle, \text{pop} \colon {!}\text{end}.X\langle S \rangle \} \\ X\langle \tau S \rangle & \doteq \& \{ \text{addOut} \colon X\langle \sigma \tau S \rangle, \text{addIn} \colon X\langle \tau \tau S \rangle, \text{pop} \colon {?}\text{end}.X\langle S \rangle \} \end{aligned}$$

Fig. 10. A pushdown automaton for type Tmeta = X⟨ε⟩. The initial configuration is (q0, ε). A transition δ(q, g, a) = (o, q′) is denoted by an arc from q to q′ with label g, a | o, where g ∈ {ε} ∪ ∆, a ∈ {ε} ∪ Σ, and o ∈ Op. If all choices of g lead to the same transition, we use · to stand for all transitions. All depicted states are accepting.

### 5 From automata to types

The construction in Section 4 explains how we can build an automaton from a system of equations at some level in the hierarchy. If X⟨σ⟩ type<sub>p</sub>, then the language of the type given by X⟨σ⟩ is the language accepted by the automaton with initial configuration (q<sub>X</sub>, σ) (and similarly for recursive, 1-counter, and 2-counter types). Conversely, given an automaton which accepts the language of traces of a type, we can construct the corresponding system of equations that specifies that type. This allows us to obtain a complete correspondence between classes of types and different models of computation based on automata theory. The following result is stronger than previous similar results which only show a forward implication [9]. Recall that a language is said to be regular if it is the set of words accepted by some finite-state automaton. We also say that a tree is regular if it has a finite number of distinct subtrees.

### Theorem 2 (Types, traces and automata).


We can now address the decidability of the key problems of type formation, type equivalence and type duality for our various classes of type languages.

### Theorem 3 (Decidability results).


We are also able to prove that these problems are undecidable for 2-counter types, since Theorem 2 also provides a construction from automata to systems of equations, and the corresponding problems for automata are undecidable.

Theorem 4 (Undecidability results). Problems T type<sub>2</sub>, T ≃<sub>2</sub> U and T ⊥<sub>2</sub> U are all undecidable.

### 6 Related work

The first papers on session types by Honda [19] and Takeuchi et al. [38] feature finite types only. Recursive types were introduced later [20] using µ-notation. Gay and Hole [15] introduce algorithms for deciding duality and subtyping of finite-state session types, based on bisimulation. Much of the literature on session types, surveyed by Hüttel et al. [23], uses the same approach. The natural decision algorithms for duality and subtyping presented by Gay and Hole were shown to be exponential in the size of the types by Lange and Yoshida [27], due to reliance on syntactic unfolding. Our polytime complexity for recursive type equivalence follows from the equivalence algorithm for finite-state automata by Hopcroft and Karp [21], and thus has quadratic complexity in the description size, improving on Gay and Hole. Lange and Yoshida use an automata-based algorithm to also achieve quadratic complexity for checking subtyping.

We use a coinductive formulation of infinite session types. This approach has some connections with the work of Keizer et al. [25] who present session types as states of coalgebras. Their types are restricted to finite-state recursive types, but they do address subtyping and non-linear types, two notions that we do not take into consideration. Our coinductive presentation avoids explicitly building coalgebras, and follows Gay et al. [17], solving problems with duality in the presence of recursive types [5,17,28].

We have not addressed the problem of deciding subtyping, but the outlook is not promising. Subtyping is known to be decidable for recursive types T<sub>r</sub> [15] and undecidable for context-free types T<sub>c</sub> [31] or nested types with arity at most one T<sub>n</sub><sup>1</sup> [10], hence for pushdown types with one type constructor T<sub>p</sub><sup>1</sup> (Theorem 1). The undecidability proof of the subtyping problem for context-free session types reduces from the inclusion problem for simple deterministic languages, which was shown to be undecidable by Friedman [13]. That for nested session types reduces from the inclusion problem for Basic Process Algebra [4], which was shown to be undecidable by Groote and Hüttel [18]. Given that 1-counter types T<sub>1</sub> and pushdown types with one type constructor T<sub>p</sub><sup>1</sup> are incomparable (Theorem 1), the problem of subtyping for 1-counter types remains open.

Dependent session types have been studied for binary session types [40,41], for multi-party session types [12,29,45] and for polymorphic, nested session types [9]. Although our parameterised type definitions have some similarities with definitions in some dependently typed systems, we do not support the connection between values in messages and parameters in types, and we have not yet studied how the types that can be expressed in dependent systems fit into our hierarchy.

Connections between multiparty session types and communicating finite-state automata have been explored by Deniélou and Yoshida [11] but the investigation has not been extended to other classes of automata.

Solomon [37] studies the connection between inductive type equality for nested types and language equality for DPDAs and shows that the equivalence problem for nested types is as hard as the equivalence problem for DPDAs, an open problem at the time. We follow a similar approach but define type equivalence as a bisimulation rather than as language equivalence.

Many of the main results in this paper borrow from the theory of automata, developed in the mid-20th century. Here our standard reference is the book by Hopcroft and Ullman [22], where the notions of finite-state, pushdown, and counter automata can be found. 1-counter automata were studied in detail in Valiant's PhD thesis [42]. To prove the equivalence between types and automata, we need to convert automata into ones satisfying certain properties; similar techniques have appeared in Kao et al. [24] and Valiant and Paterson [43]. Our proofs of decidability of type equivalence make use of the corresponding results for automata [8,21,33,35,36,43]; we specifically mention Sénizergues' impressive result on equivalence of deterministic pushdown automata [36], a work which earned him the Gödel Prize in 2002. Finally, the strict hierarchy results use textbook pumping lemmas for regular languages (due to Rabin and Scott [33]) and context-free languages (due to Bar-Hillel et al. [3] and Kreowski [26]), as well as a somewhat less known result for 1-counter automata (due to Boasson [7]).

### 7 Conclusion

We introduce different classes of session types, some new, others from the literature, under a uniform framework and place them in a hierarchy. We further study different type-related problems (formation, equivalence and duality) and show that these relations are all decidable up to and including pushdown types.

Much remains to be done. From the point of view of programming languages, one should investigate whether decidability results translate into algorithms that may be incorporated in compilers. Even if subtyping is known to be undecidable for most systems "above" that of recursive types, the problem remains open for 1-counter types, an interesting avenue for further investigation. Our study of classes of infinite types may have applications beyond session types. One promising direction is that of non-regular datatypes for functional programming (or polymorphic recursion schemes [30]), such as nested datatypes [6].

We have not addressed the decidability of the type checking problem. Type checking is known to be decidable for finite types, recursive, context-free and nested session types. Given that type checking for nested session types is incorporated in the RAST language [9], a natural first step would be to investigate how to translate 1-counter and pushdown processes into that language.

### References



## Complete and tractable machine-independent characterizations of second-order polytime

Emmanuel Hainry<sup>1</sup>, Bruce M. Kapron<sup>2</sup>, Jean-Yves Marion<sup>1</sup>, and Romain Péchoux<sup>1</sup>

<sup>1</sup> Université de Lorraine, CNRS, Inria, LORIA, F-54000 Nancy, France <sup>2</sup> University of Victoria, Victoria, BC, Canada {hainry,marion,pechoux}@inria.fr bmkapron@uvic.ca

Abstract. The class of Basic Feasible Functionals BFF is the second-order counterpart of the class of first-order functions computable in polynomial time. We present several implicit characterizations of BFF based on a typed programming language of terms. These terms may perform calls to imperative procedures, which are not recursive. The type discipline has two layers: the terms follow a standard simply-typed discipline and the procedures follow a standard tier-based type discipline. BFF consists exactly of the second-order functionals that are computed by typable and terminating programs. The completeness of this characterization surprisingly still holds in the absence of lambda-abstraction. Moreover, the termination requirement can be specified as a completeness-preserving instance, which can be decided in time quadratic in the size of the program. As typing is decidable in polynomial time, we obtain the first tractable (i.e., decidable in polynomial time), sound, complete, and implicit characterization of BFF, thus solving a problem open for more than 20 years.

Keywords: Basic feasible functionals · Type 2 · Second-order · Polynomial time · Tiering · Safe recursion

### 1 Introduction

Motivations. The class of second-order functions computable in polynomial time was introduced and studied by Mehlhorn [27], building on an earlier proposal by Constable [10]. Kapron and Cook characterized this class using oracle Turing machines, giving it the name Basic Feasible Functionals (BFF):

Definition 1 ([19]). A functional F is in BFF, if there are an oracle Turing machine M and a second-order polynomial<sup>3</sup> P such that M computes F in time bounded by P(|f|, |x|), for any oracle f and any input x.<sup>4</sup>

Since then, BFF has been regarded, by consensus, as the natural extension to second-order of the well-known class of (first-order) polynomial time computable functions, FP. Notions of second-order polynomial time, while of intrinsic interest, have also been applied in a range of areas, including structural complexity theory [27], resource-bounded topology [29], complexity of total search problems [5], feasible real analysis [21], and verification [14].

<sup>3</sup> Second-order polynomials are a type-2 analogue of ordinary polynomials.

<sup>4</sup> The size of an oracle f is a first-order function defined by |f|(n) = max<sub>|y|≤n</sub> |f(y)|.
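The size of an oracle used in Definition 1, |f|(n) = max over all words y with |y| ≤ n of |f(y)|, can be illustrated by a brute-force computation. The sketch below is only meant to make the definition tangible: it assumes a binary alphabet and enumerates exponentially many words, so it is not how one would reason about |f| in practice. The name `oracle_size` is hypothetical.

```python
from itertools import product


def oracle_size(f, n, alphabet="01"):
    """|f|(n): the largest output size of f over all input words of size at most n."""
    return max(
        len(f("".join(letters)))
        for k in range(n + 1)                       # word lengths 0..n
        for letters in product(alphabet, repeat=k)  # all words of length k
    )


def double(y):
    """An example oracle: y is mapped to y.y, so |double|(n) = 2n."""
    return y + y


size_at_3 = oracle_size(double, 3)
```

Note that |f| is monotone by construction, even when f itself is not: an oracle returning a long word only on the empty input still has a constant, non-zero size function.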

Starting with Cobham's seminal work [9], there have been several attempts to provide machine-independent characterizations of complexity classes such as (P and) FP, that is, characterizations based on programming languages rather than on machines. Beyond the purely theoretical aspects, the practical interest of such characterizations is to be able to automatically guarantee that a program can be executed efficiently and in a secure environment. For these characterizations to hold, some restrictions are placed on a given programming language. They ensure that a program can be simulated by a Turing machine in polynomial time and, therefore, corresponds to a function in FP. This property is called soundness. Conversely, we would like any function in FP to be computable by a program satisfying the restrictions. This property is called (extensional) completeness. For automation to be possible, it is necessary that the characterizations studied be tractable; that is, decidable in polynomial time. Moreover, they should preferably not require a prior knowledge of the program complexity. One speaks then of implicit characterization insofar as the programmer does not have to know an explicit bound on the complexity of the analyzed programs.

In the first-order setting, different restrictions and techniques have been developed to characterize the complexity class FP. One can think, among others, of the safe recursion and ramified recursion techniques for function algebras [6,24], of interpretation methods for term rewrite systems [8], or of light and soft linear logics typing-discipline for lambda-calculi [15,4,3].

In the second-order setting, a machine-independent characterization of BFF was provided in [16]. This characterization uses the tier-based (i.e., safe/ramified recursion-based) type discipline introduced in [26] on imperative programs for characterizing FP and can be restated as follows:

$$\text{BFF} = \lambda(\llbracket \text{ST} \rrbracket)_2,$$

where ⟦ST⟧ denotes the set of functions computed by typable and terminating programs; λ denotes the lambda closure, that is, for a given set of functionals X, λ(X) is the set of functionals denoted by simply-typed lambda-terms using constants in X; X<sub>2</sub> is the restriction of X to second-order functionals. Type inference for ⟦ST⟧ is fully automatic and can be performed in time cubic in the size of the analyzed program. However, the above characterization has two main weaknesses:


Thus, providing a tractable, implicit, sound, and complete programming language for characterizing second-order polynomial time is still an open problem.

Contributions. Our paper provides the first solution to this problem, open for more than 20 years. To this end, we introduce a higher-order programming language and design a suitable typing discipline that address the two weaknesses described above. The lambda closure requirement for completeness is removed by designing a suitable programming language that consists of a layer of simply-typed terms that can perform calls to a layer of imperative and non-recursive procedures following a tier-based type discipline. This language allows for some restricted forms of procedure composition that are handled by the simply-typed terms and also allows for some restricted forms of oracle composition that are managed through the use of closures, syntactic elements playing the role of first-order abstractions with free variables. The termination criterion is specified as a completeness-preserving instance, called SCP<sub>S</sub>, of a variant of Size Change Termination [23] introduced in [7] that can be checked in time quadratic in the size of the analyzed program. The main contributions of this paper are:


The contributions of the paper are a non-trivial extension of existing works:


– The particular choice of the termination criterion SCP<sub>S</sub> was made to show that termination can be specified as a tractable/feasible criterion while preserving completeness. This is also a new result. SCP<sub>S</sub> may include nested loops (as described in [7]) and can be replaced by any termination criterion capturing the programs of our completeness proof. SCP<sub>S</sub> was chosen not only for its tractability: the SCP criterion of [7] ensures termination by using an error state which breaks the control flow. This control-flow break damages the non-interference property needed for tier-based typing to guarantee time complexity bounds.

Leading example. The program ce of Example 1 will be our leading example, as it computes a function known to be in BFF − ⟦ST⟧ (i.e., it computes a function in BFF and not in ⟦ST⟧, see [20]). This program will be shown to be in SAFE<sub>0</sub> and, consequently, in SAFE and to terminate with SCP<sub>S</sub>.

Example 1 (Program ce). Let W be the set of words. Let the operator ε of arity 0 represent the empty word constant, let the operator != test whether or not its arguments are distinct, and let the operator pred remove the first letter of a word. The binary operator ↾ truncates and pads the size of its first operand to the size of its second operand plus 1. When the boxed variables X and y are fed with the inputs f ∈ W → W and w ∈ W, respectively, program ce calls procedure KS in the term t. Program ce computes |w| (i.e., the size of the word w) bounded iterations of f ◦ f through the iteration of the assignment z := X2(z ↾ w) in procedure KS. The bound on the output size of each iteration is computed by the first assignment w := X1(ε ↾ ε) of KS and is equal to f(1) (that is, f(⟦ε ↾ ε⟧), with ⟦ε ↾ ε⟧ = 1; ⟦e⟧ being the result of evaluating the expression e).

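```
box [X , y] in
     declare
         KS ( X1 , X2 , v ) {
              var w , z;
              w := X1 (ε ↾ ε);
              z := ε;
              while ( v != ε) {
                   v := pred ( v );
                   z := X2 ( z ↾ w )
              }
              return z
         }
  in call KS ({ x → X @ x } , { x → X @ ( X @ x )} , y )
```

In the original figure, braces label the whole program as ce, the declaration of KS as procedure p, the call to KS as term t, and two statements of the body of KS as st and st′.
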
Related work. Several tools providing machine-independent characterizations of distinct complexity classes have been developed in the field of Implicit Computational Complexity (ICC). Most of these tools are restricted to the study of first-order complexity classes. Whereas light logic approaches can deal with programs at higher types, their applications are restricted to first-order complexity classes such as FP [15,4,3]. Interpretation methods were extended to higher-order polynomials in [2] to study FP and adapted in [13] and [17] to characterize BFF. However, these characterizations are not decidable as they require checking of second-order polynomial inequalities. [12] and [18] study characterizations of BFF in terms of a simple imperative programming language that enforces an explicit external bound on the size of oracle outputs within loops. The corresponding restriction is not implicit by nature and is impractical from a programming perspective as the size of oracle outputs cannot be predicted. In this paper, the bound is programmer friendly because it is implicit and it only constrains the size of the oracle input.

### 2 A second-order language with imperative procedures

The syntax and semantics of the programming language designed to capture the complexity class BFF are introduced in this section. Programs of this language consist of second-order terms in which imperative procedures are declared and called. These procedures have no global variables, are not recursive, and their parameters can be of order 1 (oracles) or 0 (local variables). Oracles are in read-only mode: they cannot be declared and, hence, modified inside a procedure. Oracles can only be composed at the term level through the use of closures, first-order abstractions that can be passed as parameters in a procedure call.

Syntax. When we refer to a type-i syntactic element e (a variable, an expression, a statement, ...), for i ∈ N, we implicitly assume that the element e denotes some function of order i over words as basic type. We will sometimes write e<sup>i</sup> in order to make the order explicit. For example, e<sup>0</sup> denotes a word. This notion will be formally defined in Section 3. Let ē denote a (possibly empty) tuple of n elements e<sub>1</sub>, . . . , e<sub>n</sub>, where n is given by the context. Let |ē| denote the length of tuple ē, i.e., |ē| ≜ n. Let π<sub>i</sub>, i ≤ |ē|, denote the projectors on tuples, i.e., π<sub>i</sub>(ē) ≜ e<sub>i</sub>.

Let V be a set of variables that can be split into three disjoint sets V = V<sub>0</sub> ⊎ V<sub>1</sub> ⊎ V<sub>≥2</sub>. The type-0 variables in V<sub>0</sub> will be denoted by lower case letters x, y, . . . and the type-1 variables in V<sub>1</sub> will be denoted by upper case letters X, Y, . . . Variables in V of arbitrary type will be denoted by letters a, b, a<sub>1</sub>, a<sub>2</sub>, . . ..

Let O be a set of (type-1) operators op of fixed arity ar(op) that will be used both in infix and prefix notations for notational convenience and that are always fully applied, i.e., applied to a number ar(op) of operands.

The programs are defined by the grammar of Figure 1. A program is either a term t<sup>0</sup>, a procedure declaration declare p in prog, or the declaration of a boxed variable a, called box, followed by a program: box [a] in prog. Boxed variables will represent the program inputs.

In Figure 1, there are three constructor/destructor pairs for abstraction and application; each of them playing a distinct role:



Fig. 1: Syntax of type-2 programs

For some syntactic element e of the language, let V(e) ⊆ V be the set of all variables occurring in e. A variable is free if it is not under the scope of an abstraction and it is not boxed. A program is closed if it has no free variable.

For a given procedure declaration p = P(X, x){[var y; ] st return x}, define the procedure name of p by n(p) ≜ P. Define also body(P) ≜ st, local(P) ≜ {y}, and param(P) ≜ {X, x}. body(P) is called the body of procedure P. The variables in local(P) are called local variables and the variables in param(P) are called parameters. Finally, define Proc(t) (and Proc(prog)) to be the set of procedure names that are called within the term t (respectively program prog).

Throughout the paper, we will restrict our study to closed programs in normal form. These consist of programs with no free variable that can be written as follows: box [X, x] in declare p in t, for some term t such that the following well-formedness conditions hold: (i) There are no name clashes. (ii) There are no free variables in a given procedure. (iii) Any procedure call has a corresponding procedure declaration. A closed program in normal form of the shape box [X, x] in declare p in t<sup>0</sup>, for some type-0 term t, will compute a type-2 functional. The typing discipline presented in Section 3 will restrict the analysis to such programs.

Operational semantics. Let W = Σ<sup>∗</sup> be the set of words over a finite alphabet Σ such that {0, 1} ⊆ Σ. The symbol ϵ denotes the empty word. The length of a word w is denoted |w|. Given two words w and v in W let v.w denote the concatenation of v and w. For a given symbol a ∈ Σ, let a<sup>n</sup> be defined inductively by a<sup>0</sup> = ϵ and a<sup>n+1</sup> = a.a<sup>n</sup>. Let ⊴ be the sub-word relation over W, which is defined by v ⊴ w, if ∃u, u′ ∈ W, w = u.v.u′.

For a given word w ∈ W and an integer n, let w<sub>↾n</sub> be the word obtained by truncating w to its first min(n, |w|) symbols and then padding with a word of the form 10<sup>k</sup> to obtain a word of size exactly n + 1. For example, 1001<sub>↾0</sub> = 1, 1001<sub>↾1</sub> = 11, 1001<sub>↾2</sub> = 101, and 1001<sub>↾6</sub> = 1001100. Define ∀v, w ∈ W, ⟦↾⟧(v, w) = v<sub>↾|w|</sub>. Padding ensures that |⟦↾⟧(v, w)| = |w| + 1. The syntax of programs enforces that oracle calls are always performed on input data padded by the input bound and, consequently, oracle calls are always performed on input data whose size does not exceed the size of the input bound plus one.
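The truncate-and-pad operation can be sketched in a few lines. Words are represented here as Python strings over {0, 1}; the function names `trunc_pad` and `bounded` are hypothetical, chosen for this illustration only.

```python
def trunc_pad(w: str, n: int) -> str:
    """Truncate w to its first min(n, |w|) symbols, then pad with 1 0^k to size n + 1."""
    prefix = w[:min(n, len(w))]
    k = n - len(prefix)  # number of trailing zeros needed to reach size n + 1
    return prefix + "1" + "0" * k


def bounded(v: str, w: str) -> str:
    """The binary operator: v is truncated and padded to the size of w plus one."""
    return trunc_pad(v, len(w))


examples = [trunc_pad("1001", n) for n in (0, 1, 2, 6)]
```

The list `examples` reproduces the four words given in the text, and the result of `bounded(v, w)` always has size |w| + 1, whatever the size of v.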

A total function ⟦op⟧ : W<sup>ar(op)</sup> → W is associated with each operator op of arity ar(op). Constants may be viewed as operators of arity zero. We define two classes of operators called neutral and positive depending on the total function they compute. This categorization of operators will be used by our type system as the admissible types for operators will depend on their category.

An operator op, computing the total function ⟦op⟧ : W<sup>ar(op)</sup> → W, is neutral if:

	- 1. either ⟦op⟧ is constant, i.e., ar(op) = 0;
	- 2. ⟦op⟧ : W<sup>ar(op)</sup> → {0, 1} is a predicate;
	- 3. or ∀w ∈ W<sup>ar(op)</sup>, ∃i ≤ ar(op), ⟦op⟧(w) ⊴ w<sub>i</sub>;

As neutral operators are always positive, in the sequel, we reserve the name positive for those operators that are positive but not neutral.

In what follows, let f, g, . . . denote total functions in W → W. A store µ consists of the disjoint union of a map µ<sub>0</sub> from V<sub>0</sub> to W and a map µ<sub>1</sub> from V<sub>1</sub> to total functions in W → W. For i ∈ {0, 1}, µ<sub>i</sub> is called a type-i store. Let dom(µ) be the domain of the store µ. Let µ[x ← w] denote the store µ′ satisfying µ′(b) = µ(b), for all b ≠ x, and µ′(x) = w. This notation is extended naturally to type-1 variables µ[X ← f] and to sequences of variables µ[x ← w, X ← f]. Finally, let µ<sub>∅</sub> denote the empty store.

Let ↓ denote the standard big-step call-by-name reduction relation on terms defined by: if t<sub>1</sub> ↓ λa.t and t{t<sub>2</sub>/a} ↓ v then t<sub>1</sub>@t<sub>2</sub> ↓ v, where {t<sub>2</sub>/a} is the standard substitution and where v can be a type-0 variable x, a lambda-abstraction λa.t, a type-1 variable application X@t, or a procedure call call P(c, t<sup>0</sup>).

A continuation is a map φ from V<sub>1</sub> to Closures, i.e., φ(X) = {x → t<sup>0</sup>} for some type-1 variable X, some type-0 variable x, and some type-0 term t<sup>0</sup>. Let X ↦ c with |X| = |c|, be a notation for the continuation mapping each X<sub>i</sub> ∈ V<sub>1</sub> to the closure c<sub>i</sub>.

Given a set of procedures σ, a store µ, and a continuation φ, we define three distinct kinds of judgments: (σ, µ, φ, e) →<sub>exp</sub> w for expressions, (σ, µ, φ, st) →<sub>st</sub> µ′ for statements, and (σ, µ, prog) →<sub>env</sub> w for programs. The big-step operational semantics of the language is described in Figure 2.

A program prog = box [X, x] in declare p in t<sup>0</sup> computes the second-order partial functional ⟦prog⟧ ∈ (W → W)<sup>|X|</sup> → W<sup>|x|</sup> → W, defined by:

$$\llbracket \mathsf{prog} \rrbracket(\overline{f}, \overline{w}) = w \text{ iff } (\emptyset, \mu_{\emptyset}[\overline{x} \leftarrow \overline{w}, \overline{X} \leftarrow \overline{f}], \mathsf{prog}) \rightarrow_{\mathsf{env}} w.$$

In the special case where ⟦prog⟧ is a total function, the program prog is said to be terminating (strongly normalizing). We will denote by SN the set of terminating programs. For a given set of programs S, let ⟦S⟧ denote the set of functions computed by programs in S. Formally, ⟦S⟧ = {⟦prog⟧ | prog ∈ S}.

Example 2. Consider the program ce provided in Example 1, where:

$$\llbracket \varepsilon \rrbracket() = \epsilon \in \mathbb{W}, \quad \llbracket \mathtt{!=} \rrbracket(w, v) = \begin{cases} 1 & \text{if } v \neq w \\ 0 & \text{otherwise}, \end{cases} \quad \llbracket \mathtt{pred} \rrbracket(v) = \begin{cases} \epsilon & \text{if } v = \epsilon \\ u & \text{if } v = a.u \end{cases}$$

Program ce is in normal form and computes the second-order functional F : (W → W) → W → W defined by: ∀f ∈ W → W, ∀w ∈ W, F(f)(w) = F<sub>|w|</sub>(f), where F<sub>n</sub> is defined recursively by F<sub>0</sub>(f) = ϵ and F<sub>n+1</sub>(f) = (f ∘ f)(⟦↾⟧(F<sub>n</sub>(f), f(1))) = (f ∘ f)(F<sub>n</sub>(f)↾<sub>|f(1)|</sub>). That is, F composes the input function f with itself 2|w| times, while restricting its input to the fixed size |f(1)| every other iteration. Indeed, ⟦ε⟧() = ϵ and ⟦↾⟧(ϵ, ϵ) = ϵ↾<sub>|ϵ|</sub> = 1. Consequently, the oracle bound w in the oracle call X<sub>2</sub>(z ↾ w) is bound to the value f(1) in the store by the statement w := X<sub>1</sub>(ε ↾ ε).

Observe that the operators ε, != and pred are all neutral. An example of a positive operator is given by the successor operators defined by ⟦suc<sub>i</sub>⟧(v) = i.v, for i ∈ {0, 1}. These operators are positive since |⟦suc<sub>i</sub>⟧(v)| = |i.v| = |v| + 1.
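The operators of Example 2 and the successor operators can be made concrete with a small sketch. It is illustrative only: words are modelled as Python strings over {0, 1}, the function names `eps`, `neq`, `pred`, `suc`, and `is_subword` are hypothetical, and the sub-word relation is assumed to mean "contiguous factor".

```python
# Words in W are modelled as Python strings over {"0", "1"}; "" is the empty word.
# All names below are illustrative and do not come from the paper.

def eps() -> str:
    """Constant operator (arity 0) returning the empty word."""
    return ""

def neq(w: str, v: str) -> str:
    """Predicate !=: returns "1" if the two words differ, "0" otherwise."""
    return "1" if v != w else "0"

def pred(v: str) -> str:
    """Predecessor: drops the first symbol; the empty word is mapped to itself."""
    return v[1:]

def suc(i: str, v: str) -> str:
    """Successor suc_i: prepends the symbol i, so the size grows by exactly 1."""
    if i not in ("0", "1"):
        raise ValueError("i must be '0' or '1'")
    return i + v

def is_subword(u: str, w: str) -> bool:
    """Assumed reading of the sub-word relation: u is a contiguous factor of w."""
    return u in w

# pred returns a sub-word of its argument, as required of a neutral operator,
# while suc_i strictly increases the size, as expected of a positive operator.
sample = "0110"
pred_is_neutral_on_sample = is_subword(pred(sample), sample)
suc_growth = len(suc("1", sample)) - len(sample)
```

The two final values give a quick sanity check on one input; they do not prove neutrality or positivity in general.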

### 3 Type system

Tiers and typing environments. Let W be the type of words in W. Simple types over W are defined inductively by T, T′, … ::= W | T → T. Let T<sub>W</sub> be the set of simple types over W. The order of a simple type in T<sub>W</sub> is defined inductively by: ord(T) = 0, if T = W, and ord(T) = max(1 + ord(T<sub>1</sub>), ord(T<sub>2</sub>)), if T = T<sub>1</sub> → T<sub>2</sub>.
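The inductive definition of the order can be transcribed directly. The following is a minimal sketch; the class names `W` and `Arrow` and the function name `order` are choices made for this illustration.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class W:
    """The base type of words."""

@dataclass(frozen=True)
class Arrow:
    """The function type src -> dst."""
    src: "SimpleType"
    dst: "SimpleType"

SimpleType = Union[W, Arrow]

def order(t: SimpleType) -> int:
    """ord(W) = 0 and ord(T1 -> T2) = max(1 + ord(T1), ord(T2))."""
    if isinstance(t, W):
        return 0
    return max(1 + order(t.src), order(t.dst))

word_fn = Arrow(W(), W())             # W -> W
functional = Arrow(word_fn, word_fn)  # (W -> W) -> W -> W, since -> associates to the right
```

With these definitions, `order(word_fn)` evaluates to 1 and `order(functional)` to 2, matching the fact that a well-typed program of type (W → W) → W → W computes a second-order functional.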

Tiers are elements of the totally ordered set (N, ⪯, 0, ∨, ∧), where N = {0, 1, 2, …} is the set of natural numbers, ⪯ is the standard ordering on integers, and ∨ and ∧ are the max and min operators over integers. Let ≺ be defined by ≺ := ⪯ ∩ ≠. We use the symbols k, k′, …, k<sub>1</sub>, k<sub>2</sub>, … to denote tier variables. For a finite set of tiers {k<sub>1</sub>, …, k<sub>n</sub>}, let ∨<sup>n</sup><sub>i=1</sub> k<sub>i</sub> (∧<sup>n</sup><sub>i=1</sub> k<sub>i</sub>, respectively) denote k<sub>1</sub> ∨ … ∨ k<sub>n</sub> (k<sub>1</sub> ∧ … ∧ k<sub>n</sub>, respectively). A first-order tier is of the shape k<sub>1</sub> → … → k<sub>n</sub> → k′, with k<sub>i</sub>, k′ ∈ N.

A simple typing environment Γ<sub>W</sub> is a finite partial map from V to T<sub>W</sub>, which assigns simple types to variables.

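(a) Expressions

$$\dfrac{}{(\sigma, \mu, \phi, \mathtt{x}) \rightarrow_{\mathsf{exp}} \mu(\mathtt{x})}\ (\text{Var}) \qquad \dfrac{(\sigma, \mu, \phi, \overline{\mathtt{e}}) \rightarrow_{\mathsf{exp}} \overline{w}}{(\sigma, \mu, \phi, \mathtt{op}(\overline{\mathtt{e}})) \rightarrow_{\mathsf{exp}} \llbracket \mathtt{op} \rrbracket(\overline{w})}\ (\text{Op})$$

$$\dfrac{(\sigma, \mu, \phi, \mathtt{e}_1) \rightarrow_{\mathsf{exp}} v \qquad (\sigma, \mu, \phi, \mathtt{e}_2) \rightarrow_{\mathsf{exp}} u \qquad \phi(\mathtt{X}) = \{\mathtt{x} \to \mathtt{t}^0\} \qquad (\sigma, \mu[\mathtt{x} \leftarrow \llbracket \upharpoonright \rrbracket(v, u)], \mathtt{t}^0) \rightarrow_{\mathsf{env}} w}{(\sigma, \mu, \phi, \mathtt{X}(\mathtt{e}_1 \upharpoonright \mathtt{e}_2)) \rightarrow_{\mathsf{exp}} w}\ (\text{Orc})$$

(b) Statements

$$\dfrac{}{(\sigma, \mu, \phi, \mathtt{skip}) \rightarrow_{\mathsf{st}} \mu}\ (\text{Skip}) \qquad \dfrac{(\sigma, \mu, \phi, \mathtt{st}_1) \rightarrow_{\mathsf{st}} \mu' \qquad (\sigma, \mu', \phi, \mathtt{st}_2) \rightarrow_{\mathsf{st}} \mu''}{(\sigma, \mu, \phi, \mathtt{st}_1; \mathtt{st}_2) \rightarrow_{\mathsf{st}} \mu''}\ (\text{Seq})$$

$$\dfrac{(\sigma, \mu, \phi, \mathtt{e}) \rightarrow_{\mathsf{exp}} w}{(\sigma, \mu, \phi, \mathtt{x} := \mathtt{e}) \rightarrow_{\mathsf{st}} \mu[\mathtt{x} \leftarrow w]}\ (\text{Asg}) \qquad \dfrac{(\sigma, \mu, \phi, \mathtt{e}) \rightarrow_{\mathsf{exp}} w \qquad (\sigma, \mu, \phi, \mathtt{st}_w) \rightarrow_{\mathsf{st}} \mu' \qquad w \in \{0, 1\}}{(\sigma, \mu, \phi, \mathtt{if}(\mathtt{e})\{\mathtt{st}_1\}\ \mathtt{else}\ \{\mathtt{st}_0\}) \rightarrow_{\mathsf{st}} \mu'}\ (\text{Cond})$$

$$\dfrac{(\sigma, \mu, \phi, \mathtt{e}) \rightarrow_{\mathsf{exp}} 0}{(\sigma, \mu, \phi, \mathtt{while}(\mathtt{e})\{\mathtt{st}\}) \rightarrow_{\mathsf{st}} \mu}\ (\text{Wh}_0) \qquad \dfrac{(\sigma, \mu, \phi, \mathtt{e}) \rightarrow_{\mathsf{exp}} 1 \qquad (\sigma, \mu, \phi, \mathtt{st}; \mathtt{while}(\mathtt{e})\{\mathtt{st}\}) \rightarrow_{\mathsf{st}} \mu'}{(\sigma, \mu, \phi, \mathtt{while}(\mathtt{e})\{\mathtt{st}\}) \rightarrow_{\mathsf{st}} \mu'}\ (\text{Wh}_1)$$

(c) Type-0 terms

$$\dfrac{\mathtt{t}^0 \downarrow \mathtt{x}}{(\sigma, \mu, \mathtt{t}^0) \rightarrow_{\mathsf{env}} \mu(\mathtt{x})}\ (\text{TVar}) \qquad \dfrac{\mathtt{t}^0 \downarrow \mathtt{X}@\mathtt{t}^0_1 \qquad (\sigma, \mu, \mathtt{t}^0_1) \rightarrow_{\mathsf{env}} w}{(\sigma, \mu, \mathtt{t}^0) \rightarrow_{\mathsf{env}} \mu(\mathtt{X})(w)}\ (\text{OA})$$

$$\dfrac{\mathtt{t}^0 \downarrow \mathtt{call}\ \mathtt{P}(\overline{\mathtt{c}}, \overline{\mathtt{t}^0}) \qquad (\sigma, \mu, \overline{\mathtt{t}^0}) \rightarrow_{\mathsf{env}} \overline{w} \qquad (\sigma, \mu[\overline{\mathtt{x}} \leftarrow \overline{w}, \overline{\mathtt{y}} \leftarrow \overline{\epsilon}], \overline{\mathtt{X}} \mapsto \overline{\mathtt{c}}, \mathtt{st}) \rightarrow_{\mathsf{st}} \mu'}{(\sigma \cup \{\mathtt{P}(\overline{\mathtt{X}}, \overline{\mathtt{x}})\{\mathtt{var}\ \overline{\mathtt{y}};\ \mathtt{st}\ \mathtt{return}\ \mathtt{z}\}\}, \mu, \mathtt{t}^0) \rightarrow_{\mathsf{env}} \mu'(\mathtt{z})}\ (\text{Call})$$

(d) Programs

$$\dfrac{(\sigma \cup \{\mathtt{p}\}, \mu, \mathtt{prog}) \rightarrow_{\mathsf{env}} w}{(\sigma, \mu, \mathtt{declare}\ \mathtt{p}\ \mathtt{in}\ \mathtt{prog}) \rightarrow_{\mathsf{env}} w}\ (\text{Dec}) \qquad \dfrac{(\sigma, \mu, \mathtt{prog}) \rightarrow_{\mathsf{env}} w \qquad \mathtt{a} \in dom(\mu)}{(\sigma, \mu, \mathtt{box}\ [\mathtt{a}]\ \mathtt{in}\ \mathtt{prog}) \rightarrow_{\mathsf{env}} w}\ (\text{Box})$$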

Fig. 2: Big step operational semantics
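To illustrate how the statement rules of Figure 2 compose, here is a minimal interpreter sketch for the fragment without oracle calls, procedure calls, and lambda terms. The tuple encoding of the syntax, the function names `eval_exp` and `exec_st`, and the sample operators are assumptions made for this example; stores are plain dictionaries.

```python
# Words are Python strings over {"0", "1"}; "" is the empty word.
# Hypothetical tuple encoding of the syntax (not taken from the paper):
#   expressions: ("var", x) | ("op", f, [e1, ..., en])   where f is a Python function
#   statements:  ("skip",) | ("seq", st1, st2) | ("asg", x, e)
#                | ("if", e, st1, st0) | ("while", e, st)

def eval_exp(store, e):
    if e[0] == "var":                                   # rule (Var)
        return store[e[1]]
    if e[0] == "op":                                    # rule (Op)
        _, f, args = e
        return f(*[eval_exp(store, a) for a in args])
    raise ValueError(f"unknown expression {e!r}")

def exec_st(store, st):
    tag = st[0]
    if tag == "skip":                                   # rule (Skip)
        return store
    if tag == "seq":                                    # rule (Seq)
        return exec_st(exec_st(store, st[1]), st[2])
    if tag == "asg":                                    # rule (Asg)
        _, x, e = st
        return {**store, x: eval_exp(store, e)}
    if tag == "if":                                     # rule (Cond)
        _, e, st1, st0 = st
        w = eval_exp(store, e)
        if w not in ("0", "1"):
            raise ValueError("guard must evaluate to 0 or 1")
        return exec_st(store, st1 if w == "1" else st0)
    if tag == "while":                                  # rules (Wh0) and (Wh1)
        _, e, body = st
        while True:
            w = eval_exp(store, e)
            if w == "0":
                return store
            if w != "1":
                raise ValueError("guard must evaluate to 0 or 1")
            store = exec_st(store, body)
    raise ValueError(f"unknown statement {st!r}")

# Sample operators and a sample loop: while(v != eps){ v := pred(v); z := suc_1(z) }
neq = lambda w, v: "1" if w != v else "0"
pred = lambda v: v[1:]
suc1 = lambda v: "1" + v
eps = lambda: ""

loop = ("while", ("op", neq, [("var", "v"), ("op", eps, [])]),
        ("seq", ("asg", "v", ("op", pred, [("var", "v")])),
                ("asg", "z", ("op", suc1, [("var", "z")]))))
final_store = exec_st({"v": "101", "z": ""}, loop)
```

The loop above is only a test input for the interpreter: it uses a positive operator on `z`, so it is not claimed to be typable in the tier system.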

A variable typing environment Γ is a finite partial map from V<sub>0</sub> to N, which assigns single tiers to type-0 variables.

An operator typing environment ∆ is a mapping that associates to some operator op and some tier k ∈ N a set of admissible first-order tiers ∆(op)(k) of the shape k<sub>1</sub> → … → k<sub>ar(op)</sub> → k′.

A procedure typing environment Ω is a mapping that associates to each procedure name P a pair ⟨Γ, k̄⟩ consisting of a variable typing environment Γ and a triplet of tiers k̄. Let Ω<sub>i</sub> ≜ π<sub>i</sub>(Ω), i ∈ {1, 2}.

Let dom(Γ), dom(Γ<sub>W</sub>), dom(∆), and dom(Ω) denote the sets of variables typed by Γ and Γ<sub>W</sub>, the set of operators typed by ∆, and the set of procedures typed by Ω, respectively.

For a procedure typing environment Ω, it will be assumed that for every P ∈ dom(Ω), param(P) ∪ local(P) ⊆ dom(Ω<sub>1</sub>(P)).

While operator and procedure typing environments are global, i.e., defined for the whole program, variable typing environments are local, i.e., relative to the procedure under analysis. In a program typing judgment, the simple typing environment can be viewed as the typing environment for the main program.

Typing judgments and type system. The typing discipline includes two distinct kinds of typing judgments: Procedure typing judgments Γ, ∆ ⊢ o : (k, k<sub>in</sub>, k<sub>out</sub>) and Term typing judgments Γ<sub>W</sub>, Ω, ∆ ⊢ prog : T, with k, k<sub>in</sub>, k<sub>out</sub> ∈ N, o ∈ Expressions ∪ Statements, and T ∈ T<sub>W</sub>.

The meaning of the procedure typing judgment is that the expression tier (or statement tier) is k, the innermost tier is k<sub>in</sub>, and the outermost tier is k<sub>out</sub>. The innermost (resp. outermost) tier is the tier of the innermost (resp. outermost) while loop guard where the expression or statement is located. The meaning of term typing judgments is that the program prog is of simple type T under the operator typing environment ∆, the procedure typing environment Ω and the simple typing environment Γ<sub>W</sub>.

A program prog (or term t) is of type-i, if Γ<sub>W</sub>, Ω, ∆ ⊢ prog : T (Γ<sub>W</sub>, Ω, ∆ ⊢ t : T) can be derived for some typing environments and type T s.t. ord(T) = i.

The type system for the considered programming language is provided in Figure 3. A well-typed program is a program that can be given the type (W → W) → W → W, i.e., the judgment Γ<sub>W</sub>, Ω, ∆ ⊢ prog : (W → W) → W → W can be derived for the environments Γ<sub>W</sub>, Ω, ∆. Consequently, a well-typed program is a type-i program, for some i ≤ 2, computing a functional.

For a given typing judgment <sup>j</sup>, a typing derivation <sup>π</sup> <sup>3</sup> <sup>j</sup> is a tree whose root is the (procedure or term) typing judgment j and whose children are obtained by applications of the typing rules of Figure 3. The name π will be used alone whenever mentioning the root of a typing derivation is not explicitly needed. A typing sub-derivation of a typing derivation π is a subtree of π.

Intuitions. We now give some brief intuition to the reader on the type discipline in the particular case where exactly two tiers, 0 and 1, are involved. The type system splits program variables, expressions, and statements between the two disjoint tiers:

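$$\dfrac{\Gamma(\mathtt{x}) = k}{\Gamma, \Delta \vdash \mathtt{x} : (k, k_{in}, k_{out})}\ (\text{E-VAR}) \qquad \dfrac{k_1 \to \cdots \to k_{|\overline{\mathtt{e}}|} \to k \in \Delta(\mathtt{op})(k_{in}) \qquad \forall i \leq |\overline{\mathtt{e}}|,\ \Gamma, \Delta \vdash \mathtt{e}_i : (k_i, k_{in}, k_{out})}{\Gamma, \Delta \vdash \mathtt{op}(\overline{\mathtt{e}}) : (k, k_{in}, k_{out})}\ (\text{E-OP})$$

$$\dfrac{\Gamma, \Delta \vdash \mathtt{e}_1 : (k, k_{in}, k_{out}) \qquad \Gamma, \Delta \vdash \mathtt{e}_2 : (k_{out}, k_{in}, k_{out}) \qquad k \prec k_{in} \wedge k \preceq k_{out}}{\Gamma, \Delta \vdash \mathtt{X}(\mathtt{e}_1 \upharpoonright \mathtt{e}_2) : (k, k_{in}, k_{out})}\ (\text{E-OR})$$

$$\dfrac{}{\Gamma, \Delta \vdash \mathtt{skip} : (0, k_{in}, k_{out})}\ (\text{S-SK}) \qquad \dfrac{\Gamma, \Delta \vdash \mathtt{st} : (k, k_{in}, k_{out})}{\Gamma, \Delta \vdash \mathtt{st} : (k+1, k_{in}, k_{out})}\ (\text{S-SUB})$$

$$\dfrac{\Gamma, \Delta \vdash \mathtt{st}_1 : (k, k_{in}, k_{out}) \qquad \Gamma, \Delta \vdash \mathtt{st}_2 : (k, k_{in}, k_{out})}{\Gamma, \Delta \vdash \mathtt{st}_1; \mathtt{st}_2 : (k, k_{in}, k_{out})}\ (\text{S-SEQ})$$

$$\dfrac{\Gamma, \Delta \vdash \mathtt{x} : (k_1, k_{in}, k_{out}) \qquad \Gamma, \Delta \vdash \mathtt{e} : (k_2, k_{in}, k_{out}) \qquad k_1 \preceq k_2}{\Gamma, \Delta \vdash \mathtt{x} := \mathtt{e} : (k_1, k_{in}, k_{out})}\ (\text{S-ASG})$$

$$\dfrac{\Gamma, \Delta \vdash \mathtt{e} : (k, k_{in}, k_{out}) \qquad \Gamma, \Delta \vdash \mathtt{st}_1 : (k, k_{in}, k_{out}) \qquad \Gamma, \Delta \vdash \mathtt{st}_0 : (k, k_{in}, k_{out})}{\Gamma, \Delta \vdash \mathtt{if}(\mathtt{e})\{\mathtt{st}_1\}\ \mathtt{else}\ \{\mathtt{st}_0\} : (k, k_{in}, k_{out})}\ (\text{S-CND})$$

$$\dfrac{\Gamma, \Delta \vdash \mathtt{e} : (k, k_{in}, k) \qquad \Gamma, \Delta \vdash \mathtt{st} : (k, k, k) \qquad 1 \preceq k}{\Gamma, \Delta \vdash \mathtt{while}(\mathtt{e})\{\mathtt{st}\} : (k, k_{in}, 0)}\ (\text{S-WINIT})$$

$$\dfrac{\Gamma, \Delta \vdash \mathtt{e} : (k, k_{in}, k_{out}) \qquad \Gamma, \Delta \vdash \mathtt{st} : (k, k, k_{out}) \qquad 1 \preceq k \preceq k_{out}}{\Gamma, \Delta \vdash \mathtt{while}(\mathtt{e})\{\mathtt{st}\} : (k, k_{in}, k_{out})}\ (\text{S-WH})$$

(a) Tier-based typing rules for expressions and statements

$$\dfrac{\Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \overline{\mathtt{X}} : \mathsf{W} \to \mathsf{W} \qquad \Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \overline{\mathtt{x}}, \overline{\mathtt{y}}, \mathtt{x} : \mathsf{W}}{\Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \mathtt{P}(\overline{\mathtt{X}}, \overline{\mathtt{x}})\{[\mathtt{var}\ \overline{\mathtt{y}};]\ \mathtt{st}\ \mathtt{return}\ \mathtt{x}\} : (\mathsf{W} \to \mathsf{W}) \to \mathsf{W} \to \mathsf{W}}\ (\text{PR-DEC})$$

$$\dfrac{\Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \mathtt{P}(\overline{\mathtt{X}}, \overline{\mathtt{x}})\{\ldots\} : (\mathsf{W} \to \mathsf{W}) \to \mathsf{W} \to \mathsf{W} \qquad \Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \overline{\mathtt{c}} : \mathsf{W} \to \mathsf{W} \qquad \Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \overline{\mathtt{t}} : \mathsf{W}}{\Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \mathtt{call}\ \mathtt{P}(\overline{\mathtt{c}}, \overline{\mathtt{t}}) : \mathsf{W}}\ (\text{P-CALL})$$

$$\dfrac{\Gamma_{\mathsf{W}}(\mathtt{a}) = T}{\Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \mathtt{a} : T}\ (\text{P-VAR}) \qquad \dfrac{\Gamma_{\mathsf{W}} \uplus \{\mathtt{a} : T\}, \Omega, \Delta \vdash \mathtt{t} : T'}{\Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \lambda \mathtt{a}.\mathtt{t} : T \to T'}\ (\text{P-ABS})$$

$$\dfrac{\Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \mathtt{t}_1 : T \to T' \qquad \Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \mathtt{t}_2 : T}{\Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \mathtt{t}_1 @ \mathtt{t}_2 : T'}\ (\text{P-APP})$$

$$\dfrac{\Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \mathtt{prog} : T \qquad \Gamma, \Delta \vdash body(n(\mathtt{p})) : (k, k_{in}, k_{out}) \qquad \Omega(n(\mathtt{p})) = \langle \Gamma, (k, k_{in}, k_{out}) \rangle}{\Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \mathtt{declare}\ \mathtt{p}\ \mathtt{in}\ \mathtt{prog} : T}\ (\text{P-DEC})$$

$$\dfrac{\Gamma_{\mathsf{W}} \uplus \{\mathtt{x} : \mathsf{W}\}, \Omega, \Delta \vdash \mathtt{t} : \mathsf{W}}{\Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \{\mathtt{x} \to \mathtt{t}\} : \mathsf{W} \to \mathsf{W}}\ (\text{P-CLOS}) \qquad \dfrac{\Gamma_{\mathsf{W}} \uplus \{\mathtt{a} : T\}, \Omega, \Delta \vdash \mathtt{prog} : T'}{\Gamma_{\mathsf{W}}, \Omega, \Delta \vdash \mathtt{box}\ [\mathtt{a}]\ \mathtt{in}\ \mathtt{prog} : T \to T'}\ (\text{P-BOX})$$

(b) Simple typing rules for procedures, terms, closures and programs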

Fig. 3: Tier-based type system


The type system of Figure 3 is composed of two sub-systems. The typing rules provided in Figure 3b enforce that terms follow a standard simply-typed discipline. The typing rules of Figure 3a implement a standard non-interference type discipline à la Volpano et al. [30] on the expression (and statement) tier, preventing data flows from tier 0 to tier 1. The transition between the two sub-systems is performed in the rule (P-DEC) of Figure 3b, which checks once and for all, at the procedure declaration, that the procedure body follows the tier-based type discipline.

In Figure 3a, as tier 1 data cannot grow (but can decrease) and are the only data driving the program flow, the number of distinct memory configurations on such data for a terminating procedure is polynomial in the size of the program input (i.e., number of symbols). Hence a typable and terminating procedure has a polynomial step count (in the sense of [11]), i.e., on any input, the execution time of a procedure is bounded by a first-order polynomial in the size of its input and the maximal size of any answer returned by an oracle call.

The innermost tier is used to implement a declassification mechanism on operators improving the type-system's expressive power: an operator may be typed differently depending on its calling context (the statement where it is applied). This is the reason why more than 2 tiers can be used in general.

The outermost tier is used to ensure that oracles are only called on inputs of bounded size. This latter restriction on oracle calls enforces a semantic restriction, called finite lookahead revision, introduced in [22,20] and requiring that, during each computation, the number of oracle calls performed on an input of increasing size is bounded by a constant.

Let MPT be the class of second-order functionals computable by an oracle Turing machine with a polynomial step count and a finite lookahead revision. [20] shows that BFF = λ(MPT)<sub>2</sub>. The type system of Figure 3 ensures that each terminating procedure of a well-typed program computes a function in MPT.

Safe programs. In this section, we restrict the set of admissible operators to prevent programs admitting exponential growth from being typable. A program satisfying such a restriction will be called safe.

An operator typing environment ∆ is safe if for each op ∈ dom(∆) such that ar(op) > 0, op is neutral or positive, ⟦op⟧ is a polynomial time computable function, and for each k ∈ N and each k<sub>1</sub> → … → k<sub>ar(op)</sub> → k′ ∈ ∆(op)(k), the two conditions below hold:

1. k′ ⪯ ∧<sup>ar(op)</sup><sub>i=1</sub> k<sub>i</sub> ⪯ ∨<sup>ar(op)</sup><sub>i=1</sub> k<sub>i</sub> ⪯ k,
2. if op is a positive operator then k′ ≺ k.

Example 3. Consider the operators !=, pred, and suc<sub>i</sub> discussed in Example 1 and an operator typing environment ∆ that is safe and such that !=, pred, suc<sub>i</sub> ∈ dom(∆). We can set ∆(!=)(1) ≜ {1 → 1 → 1} ∪ {k → k′ → 0 | k, k′ ⪯ 1}, as != is neutral. However 1 → 0 → 1 ∉ ∆(!=)(1) as it breaks Condition 1) above (i.e., 1 ⋠ 1 ∧ 0).

We can also set ∆(pred)(2) ≜ {2 → k | k ⪯ 2} ∪ {1 → k | k ⪯ 1} ∪ {0 → 0}. We also have ∆(suc<sub>i</sub>)(1) = {1 → 0, 0 → 0}. 1 → 1 ∉ ∆(suc<sub>i</sub>)(1) as suc<sub>i</sub> is a positive operator and, due to Condition 2) above, the operator output tier has to be strictly smaller than 1.
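The two safety conditions on first-order tiers are easy to check mechanically. The sketch below is an illustration, not part of the type inference algorithm of the paper: a first-order tier k<sub>1</sub> → … → k<sub>n</sub> → k′ is encoded as the tuple (k1, …, kn, k′), and the names `is_admissible` and `admissible_tiers` are hypothetical. It only covers the tier conditions, not the requirement that the operator be neutral or positive and polynomial time computable.

```python
from itertools import product

def is_admissible(tier, k, positive):
    """Check Conditions 1 and 2 for a candidate tier in Delta(op)(k).

    tier     -- tuple (k_1, ..., k_n, k_out) with n = ar(op) > 0
    k        -- the tier at which the operator is looked up
    positive -- True if op is a positive (non-neutral) operator
    """
    *args, out = tier
    if not args:
        raise ValueError("the safety conditions only apply when ar(op) > 0")
    cond1 = out <= min(args) <= max(args) <= k   # k' <= /\ k_i <= \/ k_i <= k
    cond2 = out < k if positive else True        # k' < k for positive operators
    return cond1 and cond2

def admissible_tiers(arity, k, positive):
    """Enumerate every tier over {0, ..., k} that passes both conditions."""
    return {t for t in product(range(k + 1), repeat=arity + 1)
            if is_admissible(t, k, positive)}

# The sets of Example 3, written in the tuple encoding.
neq_at_1 = {(1, 1, 1)} | {(a, b, 0) for a in (0, 1) for b in (0, 1)}
suc_at_1 = {(1, 0), (0, 0)}
```

Under this encoding, `admissible_tiers(2, 1, positive=False)` coincides with the set given for ∆(!=)(1), and `admissible_tiers(1, 1, positive=True)` coincides with ∆(suc<sub>i</sub>)(1), so in both cases Example 3 lists exactly the maximal safe choice.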

Given a simple typing environment Γ<sub>W</sub>, a procedure typing environment Ω, and a safe operator typing environment ∆, a program prog is a safe program if it is well-typed for these environments, i.e., Γ<sub>W</sub>, Ω, ∆ ⊢ prog : (W → W) → W → W can be derived. Let SAFE be the set of safe programs.

Example 4. We consider the program ce of Example 1. We define the operator typing environment ∆ by ∆(!=)(2) ≜ {1 → 1 → 1}, ∆(pred)(1) ≜ {1 → 1}, and ∆(ε)(2) ≜ {0, 1}. As the three operators !=, pred, and ε are neutral, the environment ∆ is safe. We define the simple typing environment Γ<sub>W</sub> by Γ<sub>W</sub>(w) ≜ W, Γ<sub>W</sub>(v) ≜ W, Γ<sub>W</sub>(z) ≜ W, Γ<sub>W</sub>(X<sub>1</sub>) ≜ W → W, and Γ<sub>W</sub>(X<sub>2</sub>) ≜ W → W. We define the variable typing environment Γ by Γ(w) ≜ 1, Γ(v) ≜ 1, Γ(z) ≜ 0. Finally, define the procedure typing environment Ω by Ω(KS) ≜ ⟨Γ, (1, 2, 1)⟩. Using the rules of Figure 3, the following typing judgment can be derived: Γ<sub>W</sub>, Ω, ∆ ⊢ ce : (W → W) → W → W. Hence ce ∈ SAFE.

### 4 Characterizations of the class of Basic Feasible Functionals

Safe and terminating programs. In this section, we show that typable (safe) and terminating programs capture exactly the class of basic feasible functionals.

For a given set of functionals S, let S<sub>2</sub> be the restriction of S to second-order functionals and let λ(S) be the set of functions computed by closed simply-typed lambda terms using functions in S as constants. Formally, let λ(S) be the set of functions denoted by the set of closed simply-typed lambda terms generated inductively as follows:


Each lambda term of type τ represents a function of type τ and terms are considered up to β and η equivalences. λ(S)<sub>2</sub> is called the second-order simply-typed lambda closure of S.

Given a simple typing environment Γ<sub>W</sub>, a safe operator typing environment ∆, and a triplet of tiers (k, k<sub>in</sub>, k<sub>out</sub>), a procedure p ≜ P(X̄, x̄){[var ȳ;] st return x} is safe if it is well-typed for these environments, i.e., Γ<sub>W</sub>, ∆ ⊢ st : (k, k<sub>in</sub>, k<sub>out</sub>) can be derived using the rules of Figure 3. P computes a second-order partial functional ⟦P⟧ ∈ (W → W)<sup>|X̄|</sup> → W<sup>|x̄|</sup> → W, defined by ⟦P⟧(f̄, w̄) = w iff ({p}, µ<sub>∅</sub>[x̄ ← w̄, X̄ ← f̄], call P(X̄, x̄)) →<sub>env</sub> w (see Figure 2). If ⟦P⟧ is a total function, then the procedure terminates. Let ST be the set of safe and terminating procedures.

The characterization of BFF in terms of safe and terminating procedures discussed in the introduction can be stated as follows.

## Theorem 1 ([16]). λ(⟦ST⟧)<sub>2</sub> = BFF.

We are now ready to state a first characterization of BFF in terms of safe (SAFE) and terminating (SN) programs, showing that the external simply-typed lambda-closure of Theorem 1 can be removed.

## Theorem 2. ⟦SN ∩ SAFE⟧<sub>2</sub> = BFF.

We want to highlight that the characterization of Theorem 2 is not just "moving" the simply-typed lambda-closure inside the programming language by adding a construct for lambda-abstraction. Indeed, the soundness of this result crucially depends on some choices on the language design that we have enforced: the restricted ability to compose oracles using closures, and the read-only mode of oracles inside a procedure call, implemented through continuations.

Safe and terminating rank-r programs. More importantly, we also show that this characterization is still valid in the absence of lambda-abstraction.

A safe program prog w.r.t. a typing derivation π is a rank-r program if, for any typing sub-derivation π′ of π whose root is a judgment of the form Γ<sub>W</sub>, Ω, ∆ ⊢ λa.t : T, it holds that ord(T) ≤ r. In other words, all lambda-abstractions are at most type-k terms, for k ≤ r. In particular, a rank-(r + 1) program, for r ≥ 1, has variables that are at most type-r variables. Rank-0 and rank-1 programs may have both type-0 and type-1 variables as these variables can still be captured by closures, procedure declarations, or boxes.

For a given set S of well-typed programs, let S<sub>r</sub> be the subset of rank-r programs in S, i.e., S<sub>r</sub> ≜ {prog ∈ S | prog is a rank-r program}. For example, SAFE<sub>r</sub> denotes the set of safe rank-r programs. It trivially holds that SAFE = ∪<sub>r∈N</sub> SAFE<sub>r</sub>. The rank is clearly not uniquely determined for a given program. In particular, any rank-r program is also a rank-(r + 1) program. Consequently, for any set S of well-typed programs and any i ≤ j, it trivially holds that S<sub>i</sub> ⊆ S<sub>j</sub>.

Example 5. Program ce of Example 1 is in SAFE<sub>0</sub>. Indeed, ce ∈ SAFE, cf. Example 4, and ce is a rank-0 program, as it does not use any lambda-abstraction.

Now we revisit the syntax and semantics of safe rank-0 programs in SAFE<sub>0</sub>. The programs are generated by the syntax of Figure 1, where the terms are all of type-0 and redefined by:

$$\text{Terms} \qquad \mathtt{t}^{0}, \mathtt{t}^{0}_{1}, \mathtt{t}^{0}_{2}, \dots \ ::= \ \mathtt{x} \mid \mathtt{X} @ \mathtt{t}^{0} \mid \mathtt{call}\ \mathtt{P}(\overline{\mathtt{c}}, \overline{\mathtt{t}^{0}})$$

Moreover, there is no longer a need for call-by-name reduction in the big step operational semantics. As a consequence, the rules (TVar), (OA), and (Call) of Figure 2c can be replaced by the following simplified rules:

$$\dfrac{}{(\sigma, \mu, \mathtt{x}) \rightarrow_{\mathsf{env}} \mu(\mathtt{x})}\ (\text{TVar}^{0}) \qquad \dfrac{(\sigma, \mu, \mathtt{t}_{1}^{0}) \rightarrow_{\mathsf{env}} w}{(\sigma, \mu, \mathtt{X}@\mathtt{t}_{1}^{0}) \rightarrow_{\mathsf{env}} \mu(\mathtt{X})(w)}\ (\text{OA}^{0})$$

$$\dfrac{(\sigma, \mu, \overline{\mathtt{t}^{0}}) \rightarrow_{\mathsf{env}} \overline{w} \qquad (\sigma, \mu[\overline{\mathtt{x}} \leftarrow \overline{w}, \overline{\mathtt{y}} \leftarrow \overline{\epsilon}], \overline{\mathtt{X}} \mapsto \overline{\mathtt{c}}, \mathtt{st}) \rightarrow_{\mathsf{st}} \mu'}{(\sigma \cup \{\mathtt{P}(\overline{\mathtt{X}}, \overline{\mathtt{x}})\{\mathtt{var}\ \overline{\mathtt{y}};\ \mathtt{st}\ \mathtt{return}\ \mathtt{z}\}\}, \mu, \mathtt{call}\ \mathtt{P}(\overline{\mathtt{c}}, \overline{\mathtt{t}^{0}})) \rightarrow_{\mathsf{env}} \mu'(\mathtt{z})}\ (\text{Call}^{0})$$

We are now ready to characterize BFF in terms of safe and terminating rank-0 programs.

## Theorem 3. ⟦SN ∩ SAFE<sub>0</sub>⟧ = BFF.

Hence the characterization of Theorem 2 is just a conservative extension of Theorem 3: lambda-abstractions, viewed as a construct of the programming language, allow for more expressive power in the programming discipline but do not capture more functions. As lambda-abstraction is fully removed from the programming language, this also shows that the simply-typed lambda closure of Theorem 1 can be simulated through restricted oracle compositions in our programming language (using closures and continuations). Moreover, the full hierarchy of safe and terminating rank-r programs collapses.

Corollary 1. ∀r ∈ N, ⟦SN ∩ SAFE<sub>r</sub>⟧ = BFF.

Tractable type inference. Let the size |prog| of the program prog be the total number of symbols in prog. Type inference is tractable for safe programs.

Theorem 4. Given a program prog and a safe operator typing environment ∆,


Tractability of type inference is a desirable property of the type system. By contrast, showing prog ∈ SN is at least as hard as showing the termination of a first-order program, hence Π<sup>0</sup><sub>2</sub>-hard in the arithmetical hierarchy. Therefore, the characterizations of Theorems 1, 2, and 3 are not decidable, let alone tractable.

### 5 A completeness-preserving termination criterion

In this section, we show that the undecidable termination assumption (SN) can be replaced with a criterion, called SCP<sub>S</sub>, adapted from the Size-Change Termination (SCT) techniques of [23], that is decidable in polynomial time and that preserves the completeness of the characterizations. We first show that studying safe program termination can be reduced to the study of procedure termination.

Lemma 1. For a given prog ∈ SAFE, if every P ∈ Proc(prog) terminates, then prog is terminating.

Hence, ensuring the termination of every procedure of a given safe program is a sufficient condition for the program to terminate. The converse trivially does not hold as, for example, a procedure with an infinite loop may be declared and not be called within a given safe program.

Size-Change Termination. SCT relies on the fact that if all infinite executions imply an infinite descent in a well-founded order, then no infinite execution exists. To apply this fact for proving termination, [23] defines Size-Change Graphs (SCGs) that exhibit decreases in the parameters of function calls and then studies the infinite paths in all possible infinite sequences of calls. If all those infinite sequences have at least one strictly decreasing path, then the program must terminate for all inputs. While SCT is PSpace-complete, [7] develops a more effective technique, called SCP, that is in P. The SCP technique is strong enough for our use case. In the literature, SCT and SCP are applied to pure functional languages. As we shall enforce termination of procedures, we will follow the approach of [1] adapting SCT to imperative programs.

First, we distinguish two kinds of operators that will enforce some (strict) decrease. An operator op is (strictly) decreasing in i, for i ≤ ar(op), if ∀w̄ ∈ W<sup>ar(op)</sup> with w̄ ≠ ϵ̄, |⟦op⟧(w̄)| ≤ |w<sub>i</sub>| (|⟦op⟧(w̄)| < |w<sub>i</sub>|, respectively) and ⟦op⟧(ϵ̄) = ϵ. For operators of arity greater than 2, i may not be unique but will be fixed for each operator in what follows.

For simplicity, we will assume that assignments of the considered programs are flattened, that is, for any assignment x := e, either e = y ∈ V<sub>0</sub>, or e = op(x̄), with x̄ ∈ V<sub>0</sub>, or e = X(y ↾ z), with y, z ∈ V<sub>0</sub> and X ∈ V<sub>1</sub>. Notice that, by using extra type-0 variables, any program can be easily transformed into a program with flattened assignments, while preserving semantics and safety properties.

For each assignment of a procedure P, we design a bipartite graph, called an SCG, whose nodes are the type-0 variables in (local(P) ∪ param(P)) ∩ V<sub>0</sub> and whose arrows indicate decreases or stagnation from the old variable to the new. If a variable may increase, then the new variable will not have an in-arrow.

The bipartite graph is generated for any flattened assignment x := e by:

	- decreasing operator in i, we draw an arrow from x<sup>i</sup> to x.
	- strictly decreasing operator in i, we draw a "down-arrow" from x<sup>i</sup> to x.

In all other cases (neutral and non-decreasing operators, positive operators, oracle calls), we do not draw arrows. We will name this SCG graph G(x := e). Finally, for a set V of variables, G<sup>V</sup> will denote the SCG obtained as a subgraph of G restricted to the variables of V .
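The construction of G(x := e) can be sketched as follows. Because the graphs are fan-in free, an SCG is encoded here as a dictionary mapping each target variable that has an in-arrow to its unique source together with a flag marking down-arrows. Two conventions are assumptions of this sketch, as they are not spelled out in the surviving text: a variable that is not assigned keeps a plain arrow to itself, and a copy x := y draws a plain arrow from y to x. The operator table `DECREASING` and the tuple encoding of expressions are also hypothetical.

```python
from typing import Dict, List, Tuple

# target variable -> (source variable, True if the arrow is a down-arrow)
SCG = Dict[str, Tuple[str, bool]]

# Hypothetical operator table: name -> (0-based index of the decreasing argument, strict?)
DECREASING = {"pred": (0, True), "min": (0, False)}

def scg(variables: List[str], x: str, e: tuple) -> SCG:
    """Build G(x := e) for a flattened assignment.

    e is ("var", y), ("op", name, [x1, ..., xn]) or ("oracle", X, y, z).
    """
    # Assumption: variables other than x are unchanged and keep a plain arrow.
    g: SCG = {v: (v, False) for v in variables if v != x}
    if e[0] == "var":
        g[x] = (e[1], False)          # assumption: a copy yields a plain arrow
    elif e[0] == "op" and e[1] in DECREASING:
        i, strict = DECREASING[e[1]]
        g[x] = (e[2][i], strict)      # arrow (or down-arrow) from x_i to x
    # Any other operator, and every oracle call: x gets no in-arrow.
    return g

V = ["x", "y", "z"]
g_pred = scg(V, "y", ("op", "pred", ["x"]))       # y := pred(x)
g_min = scg(V, "y", ("op", "min", ["x", "y"]))    # y := min(x, y)
g_suc = scg(V, "x", ("op", "suc", ["x"]))         # x := x + 1
g_orc = scg(V, "x", ("oracle", "X", "y", "z"))    # x := X(y, z) with z as the bound
```

A dictionary can hold at most one source per target, so every graph built this way is fan-in free by construction.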

Example 6. Here are the SCGs associated to simple assignments of a procedure with three type-0 variables x, y, z using a strictly decreasing operator in 1 (pred), a decreasing operator (min) in 1, a positive operator (+1), and an oracle call.


The language L(st) of (potentially infinite) sequences of SCG associated with the statement st is defined inductively as an ∞-regular expression.

$$\begin{aligned} \mathcal{L}(\mathtt{x} := \mathtt{e}) & \triangleq G(\mathtt{x} := \mathtt{e}) & \qquad \mathcal{L}(\mathtt{if}(\mathtt{e})\{\mathtt{st}_{1}\}\ \mathtt{else}\ \{\mathtt{st}_{2}\}) & \triangleq \mathcal{L}(\mathtt{st}_{1}) + \mathcal{L}(\mathtt{st}_{2}) \\ \mathcal{L}(\mathtt{st}_{1}; \mathtt{st}_{2}) & \triangleq \mathcal{L}(\mathtt{st}_{1}).\mathcal{L}(\mathtt{st}_{2}) & \qquad \mathcal{L}(\mathtt{while}(\mathtt{e})\{\mathtt{st}_{1}\}) & \triangleq \mathcal{L}(\mathtt{st}_{1})^{\infty} \end{aligned}$$

where, following the standard terminology for automata [28], L(st)<sup>∞</sup> is defined by L(st)<sup>∞</sup> ≜ L(st)<sup>∗</sup> + L(st)<sup>ω</sup>. In the composition of SCGs, we are interested in paths that advance through the whole concatenated graph. Such a path implies that the final value of the destination variable is of size at most equal to the initial value of the source variable. If the path contains a down-arrow, then the size of the corresponding words decreases strictly.

Following the terminology of [7], a (potentially infinite) sequence of SCGs has a down-thread if the associated concatenated graph contains a path spanning every SCG in the sequence and this path includes a down-arrow.

Example 7. Consider the statement st ≜ y := pred(x); y := min(x, y); x := x + 1; x := X(y ↾ z), whose SCGs are described in Example 6. The concatenated graph obtained from the (unique and finite) sequence of SCGs in L(st) is provided below. It contains a down-thread (the path from x to y).

[Figure: concatenated graph of the four SCGs of st over the variables x, y, z.]
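For a finite sequence of fan-in free SCGs, threads can be computed by composing the graphs pairwise. The sketch below reuses the dictionary encoding target → (source, is_down_arrow); the function names `compose` and `down_threads` and the three sample graphs are assumptions of this illustration and are not meant to reproduce the graphs of Example 6.

```python
from functools import reduce
from typing import Dict, List, Set, Tuple

# target variable -> (source variable, True if the arrow is a down-arrow)
SCG = Dict[str, Tuple[str, bool]]

def compose(g1: SCG, g2: SCG) -> SCG:
    """Concatenate g1 then g2, keeping only paths that cross both graphs."""
    out: SCG = {}
    for target, (mid, strict2) in g2.items():
        if mid in g1:                       # the path can be extended backwards
            src, strict1 = g1[mid]
            out[target] = (src, strict1 or strict2)
    return out

def down_threads(graphs: List[SCG]) -> Set[Tuple[str, str]]:
    """Pairs (source, target) joined by a path spanning every SCG of the
    sequence and containing at least one down-arrow."""
    total = reduce(compose, graphs)
    return {(src, tgt) for tgt, (src, strict) in total.items() if strict}

# Hypothetical sequence over x, y, z:  x := pred(x); y := min(x, y); x := x + 1
g1 = {"x": ("x", True), "y": ("y", False), "z": ("z", False)}
g2 = {"x": ("x", False), "y": ("x", False), "z": ("z", False)}
g3 = {"y": ("y", False), "z": ("z", False)}   # x may grow: no in-arrow for x
threads = down_threads([g1, g2, g3])
```

Here the only down-thread goes from x to y: the strict decrease of x in the first graph is transferred to y by the second one, and x loses its in-arrow in the third. Composition preserves fan-in freeness, which is why a single dictionary suffices for the concatenated graph.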

A (potentially infinite) sequence of SCGs is fan-in free if the in-degree of nodes is at most 1. By construction, all the considered SCGs are fan-in free.

Safety and Polynomial Size-Change. Unfortunately, programs with down-threads can loop infinitely in the ϵ state. To prevent this, we restrict the analysis to cases where while loops explicitly break out when the decreasing variable reaches ϵ, that is, procedures with while loops of the shape while(x != ε){st}.

For a given set V of variables, we will say that st satisfies the simple graph property for V if for any while loop while(x != ε){st′} in st all sequences of SCGs G<sub>1</sub><sup>V</sup> G<sub>2</sub><sup>V</sup> … such that G<sub>1</sub> G<sub>2</sub> … ∈ L(st′) are fan-in free and contain a down-thread from x to x. A procedure is in SCP<sub>S</sub> if its statement satisfies the simple graph property for the set of variables in while guards. A program is in SCP<sub>S</sub> if all its procedures are in SCP<sub>S</sub>.

Example 8. The program ce of Example 1 is in SCP<sub>S</sub>. The language L(body(KS)) corresponding to the body of procedure KS is equal to G<sub>1</sub>.G<sub>2</sub>.(G<sub>3</sub>.G<sub>4</sub>)<sup>∞</sup>, where the SCGs G<sub>i</sub> are defined as follows:


First, the procedure body satisfies the syntactic restrictions on programs (flattened expressions and restricted while guards). Moreover, the procedure body satisfies the simple graph property for {v} as there is always a down-thread on the path from v to v in (G<sub>3</sub>.G<sub>4</sub>)<sup>∞</sup> and any corresponding sequence is fan-in free. Consequently, the program ce is in SCP<sub>S</sub> ∩ SAFE<sub>0</sub>, by Example 5.

SCP<sub>S</sub> preserves completeness on safe programs for BFF.

Theorem 5. ⟦SCP<sub>S</sub> ∩ SAFE<sub>0</sub>⟧ = ⟦SCP<sub>S</sub> ∩ SAFE⟧ = BFF.

While in general deciding if a program satisfies the size-change principle is PSpace-complete, SCP<sub>S</sub> can be checked in quadratic time and, consequently, we obtain the following results.

Theorem 6. Given a program prog and a safe operator typing environment,


### 6 Conclusion and future work

We have presented a typing discipline and a termination criterion for a programming language that are sound and complete for the class of second-order polytime computable functionals, BFF. This characterization has three main advantages: 1) it is based on a natural higher-order programming language with imperative procedures; 2) it is pure as it does not rely on extra semantic requirements (such as taking the lambda closure); 3) belonging to the set SCP<sub>S</sub> ∩ SAFE can be decided in polynomial time. The benefit of tractability is that our method can be automated. However, the expressive power of the captured programs is restricted. This drawback is the price to pay for tractability and we claim that the full SCT method, known to be PSpace-complete, could be adapted in a more general way to our programming language in order to capture more programs at the price of a worse complexity. Moreover, any termination criterion based on the absence of infinite data flows with respect to some well-founded order could work and preserve completeness of our characterizations. Another issue of interest is to study whether the presented approach could be extended to characterize BFF in a purely functional language. We leave these open issues as future work.

Acknowledgements. The authors would like to thank the anonymous reviewers for their suggestions and comments. Bruce M. Kapron's work was supported in part by NSERC RGPIN-2021-02481.

### References



### Variable binding and substitution for (nameless) dummies

André Hirschowitz<sup>1</sup>, Tom Hirschowitz<sup>2</sup>, Ambroise Lafont<sup>3</sup>, and Marco Maggesi<sup>4</sup>

<sup>1</sup> Univ. Côte d'Azur, CNRS, LJAD, 06103, Nice, France

<sup>2</sup> Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LAMA, 73000, Chambéry, France

<sup>3</sup> University of New South Wales, Sydney, Australia

<sup>4</sup> Università degli Studi di Firenze, Italy

Abstract. By abstracting over well-known properties of De Bruijn's representation with nameless dummies, we design a new theory of syntax with variable binding and capture-avoiding substitution. We propose it as a simpler alternative to Fiore, Plotkin, and Turi's approach, with which we establish a strong formal link. We also show that our theory easily incorporates simple types and equations between terms.

Keywords: syntax · variable binding · substitution · category theory

### 1 Introduction

There is a standard notion of signature for syntax with variable binding called binding signature. Such a signature consists of a set of operation symbols, together with, for each of them, a binding arity. A binding arity is a list $(n_1, \dots, n_p)$ of natural numbers, whose meaning is that the considered operation has $p$ arguments, with $n_i$ variables bound in the $i$th argument, for all $i \in \{1, \dots, p\}$.

Example 1.


There are several possible representations of the syntax specified by a binding signature, most of them benefiting from good semantical understanding. The traditional, nominal representation has been nicely framed within nominal sets [12]. The representation by De Bruijn levels, a.k.a. nested datatypes [5,1], is well-understood thanks to presheaf models [11], as is higher-order abstract syntax [19]. However, one of the oldest representations, using De Bruijn's idea of modelling variables with nameless dummies, does not benefit from any semantical framework. This may be related to the fact that it is often perceived as low-level and error-prone [4]. Our goal in this paper is to equip De Bruijn's representation with a suitable semantical framework.

Let us start by stressing some of the features of this representation, for some fixed binding signature $S$.

Inductive definition The set $\mathrm{DB}_S$ of terms is the least fixed point of a suitable endofunctor on sets, derived from $S$. In particular, there is a variables map $\nu : \mathbb{N} \to \mathrm{DB}_S$ and, for each operation $o$ in $S$ with binding arity $(n_1, \dots, n_p)$, a map $o_{\mathrm{DB}_S} : \mathrm{DB}_S^p \to \mathrm{DB}_S$.

Substitution $\mathrm{DB}_S$ is equipped with a (parallel) substitution map

$$-[-] : \mathrm{DB}_S \times \mathrm{DB}_S^{\mathbb{N}} \to \mathrm{DB}_S,$$

which satisfies three standard substitution lemmas (associativity, left and right unitality).

Furthermore, substitution is compatible with operations, in the sense that it satisfies the following crucial binding conditions: for each operation $o$ with binding arity $(n_1, \dots, n_p)$, $e_1, \dots, e_p \in \mathrm{DB}_S$, and $f : \mathbb{N} \to \mathrm{DB}_S$,

$$o_{\mathrm{DB}_S}(e_1, \dots, e_p)[f] = o_{\mathrm{DB}_S}(e_1[\Uparrow^{n_1} f], \dots, e_p[\Uparrow^{n_p} f]), \tag{1}$$

where $\Uparrow$ is a unary operation defined on $\mathrm{DB}_S^{\mathbb{N}}$ by

$$\begin{aligned} (\Uparrow \sigma)(0) &= \nu(0) \\ (\Uparrow \sigma)(n+1) &= \sigma(n)[p \mapsto \nu(p+1)]. \end{aligned}$$

In the present work, by abstracting over these properties, we propose a simple theory for syntax with variable binding, which we summarise as follows.


<sup>5</sup> There is a slightly different notion of De Bruijn algebra in the literature, see the related work section.

We thus propose a theory for syntax with substitution, which is an alternative to the mainstream initial-algebra semantics of Fiore et al. [11]. We have experienced the simplicity of our theory by formalising it not only in Coq, but also in HOL Light, which does not support dependent types.

Our theory is similar to the mainstream theory [11], in the following aspects.


### Related work

Abstract frameworks for variable binding One of the mainstream such frameworks is [11]. This has been our main reference and in §5 we establish a strong link between this framework and our proposal. This link could probably be extended to variants such as [17,18,3].

In a more recent work, Allais et al. [1] introduce a universe of syntaxes, which essentially corresponds to a simply-typed version of binding signatures. Their framework is designed to facilitate the definition of so-called traversals, i.e., functions defined by structural induction, "traversing" their argument. We leave for future work the task of adapting our approach to such traversals.

In a similar spirit, let us mention the recent work of Gheri and Popescu [13], which presents a theory of syntax with binding, mechanised in Isabelle/HOL. Potential links with our approach remain unclear to us at the time of writing.

Finally, the categories of well-behaved objects obtained in §4 are technically very close to nominal sets [12]: finite supports appear in the action-based presentation of nominal sets, while pullback preservation appears in their sheaf-based presentation. And indeed, any well-behaved presheaf yields a nominal set, and so does any well-behaved De Bruijn monad. However, these links are not entirely satisfactory, because they do not account for substitution. The reason is that the only categorical theory of substitution that we know of for nominal sets, by Power [24], is operadic rather than monadic, so we do not immediately see how to extend the correspondence.

Proof assistant libraries Allais et al. [1] mechanise their approach in Agda. In the same spirit, the presheaf-based approach was recently formalised [9].

De Bruijn representation benefits from well-developed proof assistant libraries, in particular Autosubst [26,27]. They introduce a notion of De Bruijn algebra, and design a sound and complete decision procedure for their equational theory, which they furthermore implement for Coq.

Our notion of De Bruijn algebra differs from theirs, notably in that their substitutions are finitely generated. Our approach makes the theoretical development significantly simpler, but of course finite generation is crucial for their main purpose, namely decidability.

### General notation

We denote by $A^* = \sum_{n \in \mathbb{N}} A^n$ the set of finite sequences of elements of $A$, for any set $A$. In any category $\mathbf{C}$, we tend to write $[C, D]$ for the hom-set $\mathbf{C}(C, D)$ between any two objects $C$ and $D$. Finally, for any endofunctor $F$, $F\text{-}\mathrm{alg}$ denotes the usual category of $F$-algebras and morphisms between them.

### 2 De Bruijn monads

In this section, we start by introducing De Bruijn monads. Then, we define lifting of assignments, the binding conditions, and the models of a binding signature $S$ in De Bruijn monads, De Bruijn $S$-algebras. Finally, we construct the term De Bruijn $S$-algebra.

### 2.1 Definition of De Bruijn monads

We start by fixing some terminology and notation, and then give the definition.

Definition 1. Given a set $X$, an $X$-assignment is a map $\mathbb{N} \to X$. We sometimes merely use "assignment" when $X$ is clear from context.

Notation 2.1. Consider any map $s : X \times Y^{\mathbb{N}} \to Z$.


$$\begin{aligned} X^{\mathbb{N}} \times Y^{\mathbb{N}} &\to Z^{\mathbb{N}}\\ (f, \mathbf{g}) &\mapsto n \mapsto s(f(n), \mathbf{g}). \end{aligned}$$

We use similar notation for this map, i.e., $f[g](n) := f(n)[g]$.

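Definition 2. A De Bruijn monad is a set $X$, equipped with

– a substitution map $s : X \times X^{\mathbb{N}} \to X$, which takes an element $x \in X$ and an assignment $f : \mathbb{N} \to X$, and returns an element $x[f]$, and
– a variables map $\nu : \mathbb{N} \to X$,

satisfying, for all $x \in X$, and $f, g : \mathbb{N} \to X$:

– associativity: $x[f][g] = x[f[g]]$,
– left unitality: $\nu(n)[f] = f(n)$, and
– right unitality: $x[\nu] = x$.
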
Example 2. The set $\mathbb{N}$ itself is clearly a De Bruijn monad, with variables given by the identity and substitution $\mathbb{N} \times \mathbb{N}^{\mathbb{N}} \to \mathbb{N}$ given by evaluation. This is in fact the initial De Bruijn monad, as should be clear from the development below.

Example 3. The set $\Lambda := \mu A.\,\mathbb{N} + A + A^2$ of $\lambda$-terms forms a De Bruijn monad. The variables map $\mathbb{N} \to \Lambda$ is the obvious one, while the substitution map $\Lambda \times \Lambda^{\mathbb{N}} \to \Lambda$ is less obvious but standard. In Example 5, as an application of Theorem 2, we will characterise this De Bruijn monad by a universal property.

### 2.2 Lifting assignments

Given a De Bruijn monad $M$, we define an operation called lifting on its set of assignments $\mathbb{N} \to M$. It is convenient to stress that only part of the structure of De Bruijn monad is needed for this definition.

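Definition 3. Consider any set $M$, equipped with maps $s : M \times M^{\mathbb{N}} \to M$ and $\nu : \mathbb{N} \to M$. For any assignment $\sigma : \mathbb{N} \to M$, we define the assignment $\Uparrow\sigma : \mathbb{N} \to M$ by

$$\begin{aligned} (\Uparrow \sigma)(0) &= \nu(0) \\ (\Uparrow \sigma)(n+1) &= \sigma(n)[\uparrow], \end{aligned}$$

where $\uparrow : \mathbb{N} \to M$ maps any $n$ to $\nu(n + 1)$.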

Remark 1. Both $\Uparrow$ and $\uparrow$ depend on $M$ and (part of) $(s, \nu)$. Here, and in other similar situations below, we abuse notation and omit such dependencies for readability.

Of course we may iterate lifting:

Definition 4. Let $\Uparrow^0 \sigma = \sigma$, and $\Uparrow^{n+1} \sigma = \Uparrow(\Uparrow^n \sigma)$.
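Lifting and its iteration can be sketched in a few lines of Python. The sketch below is illustrative only: it takes the substitution map and the variables map as explicit arguments (reflecting that only this part of the structure is needed), represents assignments as Python functions, and instantiates everything at the De Bruijn monad of natural numbers from Example 2. The names `lift`, `lift_n`, `s_nat` and `nu_nat` are hypothetical.

```python
def lift(s, nu, sigma):
    """Lifting: 0 goes to nu(0), and n + 1 goes to sigma(n)[up]."""
    up = lambda n: nu(n + 1)          # the assignment written as a single up arrow
    return lambda n: nu(0) if n == 0 else s(sigma(n - 1), up)

def lift_n(s, nu, sigma, k):
    """Iterated lifting: apply lift k times."""
    for _ in range(k):
        sigma = lift(s, nu, sigma)
    return sigma

# Instance: the natural numbers, with substitution given by evaluation.
s_nat = lambda x, f: f(x)
nu_nat = lambda n: n

sigma = lambda n: n + 10
once = lift(s_nat, nu_nat, sigma)
twice = lift_n(s_nat, nu_nat, sigma, 2)
print([once(n) for n in range(4)])   # [0, 11, 12, 13]
print([twice(n) for n in range(4)])  # [0, 1, 12, 13]
```

Each lifting protects one more index at the bottom and shifts the images of the remaining indices by one, which is what a binder needs.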

### 2.3 Binding arities and binding conditions

Our treatment of binding arities reflects the separation between the first-order part of the arity, namely its length, which concerns the syntax, and the binding information, namely the binding numbers, which concerns the compatibility with substitution.

### Definition 5.


Let us now axiomatise what we call an operation of a given binding arity.

Definition 6. Let $a = (n_1, \dots, n_p)$ be any binding arity, $M$ be any set, and $s : M \times M^{\mathbb{N}} \to M$ and $\nu : \mathbb{N} \to M$ be any maps. An operation of binding arity $a$ is a map $o : M^p \to M$ satisfying the following $a$-binding condition w.r.t. $(s, \nu)$:

$$\forall \sigma : \mathbb{N} \to M,\ x_1, \dots, x_p \in M, \quad o(x_1, \dots, x_p)[\sigma] = o(x_1[\Uparrow^{n_1} \sigma], \dots, x_p[\Uparrow^{n_p} \sigma]). \tag{2}$$

Remark 2. Let us emphasise the dependency of this definition on $s$ and $\nu$, which is hidden in the notations for substitution and lifting.

### 2.4 Binding signatures and algebras

In this section, we recall the standard notions of first-order (resp. binding) signatures, and adapt the definition of algebras to our De Bruijn context. Let us first briefly recall the former.

Definition 7. A first-order signature consists of a set $O$ of operations, equipped with an arity map $\mathrm{ar} : O \to \mathbb{N}$.

Definition 8. For any first-order signature $S := (O, \mathrm{ar})$, an $S$-algebra is a set $X$, together with, for each operation $o \in O$, a map $o_X : X^{\mathrm{ar}(o)} \to X$.

Let us now generalise this to binding signatures.

### Definition 9.


Example 4. The binding signature for $\lambda$-calculus has two operations lam and app, of respective arities (1) and (0, 0). The associated first-order signature has two operations lam and app, of respective arities 1 and 2.

Let us now present the notion of De Bruijn $S$-algebra:

Definition 10. For any binding signature $S := (O, \mathrm{ar})$, a De Bruijn $S$-algebra is a De Bruijn monad $(X, s, \nu)$ equipped with an operation $o_X$ of binding arity $\mathrm{ar}(o)$, for all $o \in O$.

In order to state our characterisation of the term model, we associate to any binding signature an endofunctor on sets, as follows.

Definition 11. The endofunctor $\Sigma_S$ associated to a binding signature $S = (O, \mathrm{ar})$ is defined by $\Sigma_S(X) = \sum_{o \in O} X^{|\mathrm{ar}(o)|}$.

Remark 3. The induced endofunctor just depends on the underlying first-order signature.

Remark 4. As is well known, for any binding signature $S$, the initial $(\mathbb{N} + \Sigma_S)$-algebra has as carrier the least fixed point $\mu A.\,\mathbb{N} + \Sigma_S(A)$.

The following theorem defines the term model of a binding signature.

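Theorem 1. Consider any binding signature $S = (O, \mathrm{ar})$, and let $\mathrm{DB}_S$ denote the initial $(\mathbb{N} + \Sigma_S)$-algebra, with structure maps $\nu : \mathbb{N} \to \mathrm{DB}_S$ and $a : \Sigma_S(\mathrm{DB}_S) \to \mathrm{DB}_S$. Then,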

	- for all $o \in O$, the map $o_{\mathrm{DB}_S}$ satisfies the $\mathrm{ar}(o)$-binding condition w.r.t. $(s, \nu)$.

Proof. We have proved the result in both HOL Light [22] and Coq [20].

Remark 5. Point (i) may be viewed as an abstract form of recursive definition for substitution in the term model. The theorem thus allows us to construct the term model of a signature in two steps: first the underlying set, constructed as the inductive datatype $\mu A.\,\mathbb{N} + \Sigma_S(A)$, and then substitution, defined by the binding conditions viewed as recursive equations.
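The two-step construction described in Remark 5 can be illustrated by a small Python sketch, given here under stated assumptions rather than as the authors' Coq or HOL Light development. Terms are nested tuples, assignments are Python functions, and substitution is read off the binding arities of a signature given as a dictionary. The signature entries `lam` and `app` follow Example 4, while `bind2` (binding two variables in its single argument) is a hypothetical operation added only to exercise iterated lifting.

```python
# Binding signature: operation name -> binding arity (a tuple of naturals).
SIGNATURE = {"lam": (1,), "app": (0, 0), "bind2": (2,)}  # "bind2" is hypothetical

def var(n):
    """Step 1a: variables of the term datatype."""
    return ("var", n)

def op(name, *args):
    """Step 1b: apply an operation of the signature to the right number of arguments."""
    assert len(args) == len(SIGNATURE[name])
    return (name,) + args

def lift(sigma):
    """Lifting: 0 goes to var(0), n + 1 goes to sigma(n) with all indices shifted."""
    shift = lambda k: var(k + 1)
    return lambda n: var(0) if n == 0 else subst(sigma(n - 1), shift)

def lift_n(sigma, k):
    """Iterated lifting, one step per bound variable."""
    for _ in range(k):
        sigma = lift(sigma)
    return sigma

def subst(e, sigma):
    """Step 2: substitution, defined by the binding conditions read as recursive equations."""
    if e[0] == "var":
        return sigma(e[1])
    arity = SIGNATURE[e[0]]
    return (e[0],) + tuple(subst(arg, lift_n(sigma, n))
                           for arg, n in zip(e[1:], arity))

sigma = lambda n: var(n + 10)
print(subst(op("lam", var(1)), sigma))
print(subst(op("bind2", var(2)), sigma))
```

Only the dictionary `SIGNATURE` changes from one language to another, which mirrors the point that the binding signature is the only part a reader has to check.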

Remark 6. We hope that our mechanisations [22,20] may be useful for future developments based on De Bruijn representation, to automatically generate the correct syntax and substitution from a suitable signature. This will have the advantage of reducing what needs to be read to make sure that the development actually does what is claimed. Normally, this part includes the whole definition of syntax and substitution, while our framework reduces it to only the binding signature. Our mechanisations may in fact be used for this purpose on existing developments, to certify the syntax and substitution, leaving only the binding signature for the reader to check.

Example 5. For the binding signature of $\lambda$-calculus (Example 4), the carrier of the initial model is $\mu A.\,\mathbb{N} + A + A^2$, and substitution is defined inductively by:

$$\begin{array}{c} \nu(n)[\sigma] = \sigma(n) \\ \lambda(e)[\sigma] = \lambda(e[\Uparrow \sigma]) \\ (e_1 \ e_2)[\sigma] = e_1[\sigma] \ e_2[\sigma]. \end{array}$$
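The three equations of Example 5 translate almost literally into code. The following Python sketch is one possible rendering, assuming terms are encoded as tagged tuples and assignments as functions from indices to terms; the helper names (`var`, `lam`, `app`, `shift`, `lift`, `subst`, `compose`) are chosen for illustration.

```python
def var(n):
    return ("var", n)

def lam(body):
    return ("lam", body)

def app(fun, arg):
    return ("app", fun, arg)

def shift(n):
    """The assignment written as a single up arrow: n goes to var(n + 1)."""
    return var(n + 1)

def lift(sigma):
    """Lifting of an assignment: 0 goes to var(0), n + 1 goes to sigma(n)[shift]."""
    return lambda n: var(0) if n == 0 else subst(sigma(n - 1), shift)

def subst(e, sigma):
    """Parallel substitution e[sigma], one clause per equation of Example 5."""
    if e[0] == "var":
        return sigma(e[1])
    if e[0] == "lam":
        return lam(subst(e[1], lift(sigma)))
    return app(subst(e[1], sigma), subst(e[2], sigma))

def compose(f, g):
    """Pointwise extension f[g](n) := f(n)[g]."""
    return lambda n: subst(f(n), g)

term = lam(app(var(0), var(1)))   # the term "lambda. 0 1"
sigma = lambda n: var(n + 5)
print(subst(term, sigma))         # the bound 0 is kept, the free 1 becomes 6
```

Under the binder, the free index 1 refers to index 0 of the assignment, whose image `var(5)` is shifted to `var(6)`; this is the capture-avoiding behaviour that lifting provides.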

### 3 Initial-algebra semantics of binding signatures in De Bruijn monads

In this section, for any binding signature $S$, we organise De Bruijn $S$-algebras into a category, $S\text{-}\mathrm{DBAlg}$, and prove that the term De Bruijn $S$-algebra is initial therein.

### 3.1 A category of De Bruijn monads

Let us start by organising general De Bruijn monads into a category:

Definition 12. A morphism $(X, s, \nu) \to (Y, t, w)$ between De Bruijn monads is a set-map $f : X \to Y$ commuting with substitution and variables, in the sense that for all $x \in X$ and $g : \mathbb{N} \to X$ we have $f(x[g]) = f(x)[f \circ g]$ and $f \circ \nu = w$.

Remark 7. More explicitly, the first axiom says: $f(s(x, g)) = t(f(x), f \circ g)$.

Notation 3.1. De Bruijn monads and morphisms between them form a category, which we denote by DBMnd.

Let us conclude this subsection by briefly mentioning a categorical point of view on the category of De Bruijn monads for the categorically-minded reader, in terms of relative monads [2].

Proposition 1. The category DBMnd is canonically isomorphic to the category of monads relative to the functor 1 → Set picking **N**.

Remark 8. Canonicity here means that the isomorphism lies over the canonical isomorphism [1, Set] ≅ Set.

According to the theory of [2], this yields:

Corollary 1. The tensor product $X \otimes Y := X \times Y^{\mathbb{N}}$ induces a skew monoidal [28] structure on Set, and DBMnd is precisely the category of monoids therein.

Proof. To see this, let us observe that, by viewing any set $X$, in particular **N**, as a functor 1 → Set, one may compute the left Kan extension of $X$ along **N**, which is a functor $\mathrm{Lan}_{\mathbb{N}}(X) : \mathrm{Set} \to \mathrm{Set}$. By the standard formula for left Kan extensions [21], we have $\mathrm{Lan}_{\mathbb{N}}(X)(Y) \cong X \times Y^{\mathbb{N}} = X \otimes Y$. The result thus follows by [2, Theorems 4 and 5].

### 3.2 Categories of De Bruijn algebras

In this section, for any binding signature $S$, we organise De Bruijn $S$-algebras into a category $S\text{-}\mathrm{DBAlg}$.

Let us start by recalling the category of $S$-algebras for a first-order signature $S$:

Definition 13. For any first-order signature $S = (O, \mathrm{ar})$, a morphism $X \to Y$ of $S$-algebras is a map $f : X \to Y$ between underlying sets commuting with operations, in the sense that for each $o \in O$, letting $p := \mathrm{ar}(o)$, we have $f(o_X(x_1, \dots, x_p)) = o_Y(f(x_1), \dots, f(x_p))$.

We denote by $S\text{-}\mathrm{alg}$ the category of $S$-algebras and morphisms between them.

We now exploit this to define morphisms of De Bruijn $S$-algebras:

Definition 14. For any binding signature $S$, a morphism of De Bruijn $S$-algebras is a map $f : X \to Y$ between underlying sets, which is a morphism both of De Bruijn monads and of $|S|$-algebras. We denote by $S\text{-}\mathrm{DBAlg}$ the category of De Bruijn $S$-algebras and morphisms between them.

Theorem 2. Consider any binding signature $S = (O, \mathrm{ar})$, and let $\mathrm{DB}_S$ denote the initial $(\mathbb{N} + \Sigma_S)$-algebra. Then, the De Bruijn $S$-algebra structure of Theorem 1 on $\mathrm{DB}_S$ makes it initial in $S\text{-}\mathrm{DBAlg}$.

Proof. We have proved the result in both HOL Light [22] and Coq [20].

### 4 Relation to presheaf-based models

The classical initial-algebra semantics introduced in [11] associates in particular to each binding signature $S$ a category, say $\Phi_S\text{-}\mathrm{Mon}$, of models, while we have proposed in §3 an alternative category of models $S\text{-}\mathrm{DBAlg}$. In this section, we are interested in comparing both categories of models.

In fact, we find that both include exotic models, in the sense that we do not see any loss in ruling them out. And when we do so, we obtain equivalent categories.

### 4.1 Trimming down presheaf-based models

First of all, in this subsection, let us recall the mainstream approach we want to relate to, and exclude some exotic objects from it.

Presheaf-based models We start by recalling the presheaf-based approach. The ambient category is the category of functors [**F**, Set], where **F** denotes the category of finite ordinals, and all maps between them. As is well-known, this category is equivalent to the category $[\mathrm{Set}, \mathrm{Set}]_f$ of finitary endofunctors on sets, and inherits from it a substitution monoidal structure. By construction, monoids for this monoidal structure are equivalent to finitary monads on sets.

The idea is then to interpret binding signatures $S$ as endofunctors $\Phi_S$ on [**F**, Set], and to define models as monoids equipped with $\Phi_S$-algebra structure, satisfying a suitable compatibility condition.

The definition of $\Phi_S$ relies on an operation called derivation:

### Definition 15 (Endofunctor associated to a binding signature).


Proposition 2. Through the equivalence with finitary functors, derivation becomes $F'(A) = F(A + 1)$, for any finitary $F : \mathrm{Set} \to \mathrm{Set}$ and $A \in \mathrm{Set}$.

Example 6. For the binding signature $S$ of Example 4 for $\lambda$-calculus we get $\Phi_S(F)(A) = F(A)^2 + F(A + 1)$.

Next, we want to express the relevant compatibility condition between algebra and monoid structure. For this, let us briefly recall the notion of pointed strength, see [11,10] for details.

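Definition 16. A pointed strength on an endofunctor $F : \mathbf{C} \to \mathbf{C}$ on a monoidal category $(\mathbf{C}, \otimes, I, \alpha, \lambda, \rho)$ is a family of morphisms $\mathrm{st}_{A,(B,e)} : F(A) \otimes B \to F(A \otimes B)$, natural in $A \in \mathbf{C}$ and $(B, e : I \to B) \in I/\mathbf{C}$, the coslice category below $I$, satisfying two coherence conditions.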

The next step is to observe that binding signatures generate pointed strong endofunctors.

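Definition 17. The derivation endofunctor $F \mapsto F'$ on [**F**, Set] has a pointed strength, defined through the equivalence with finitary functors by

$$F(G(A) + 1) \xrightarrow{F(G(A) + e_1)} F(G(A) + G(1)) \xrightarrow{F[G(\mathrm{in}_1), G(\mathrm{in}_2)]} F(G(A + 1)),$$

where $e$ denotes the point of $G$ and $\mathrm{in}_1, \mathrm{in}_2$ denote the coproduct injections into $A + 1$.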

Product, coproduct, and composition of endofunctors lift to pointed strong endofunctors, which yields:

Corollary 2 ([11,10]). For all binding signatures $S$, $\Phi_S$ is pointed strong.

At last, we arrive at the definition of models.

Definition 18. For any pointed strong endofunctor $F$ on $\mathbf{C}$, an $F$-monoid is an object $X$ equipped with $F$-algebra and monoid structure, say $a : F(X) \to X$, $s : X \otimes X \to X$, and $\nu : I \to X$, such that the following pentagon commutes.

A morphism of $F$-monoids is a morphism in $\mathbf{C}$ which is a morphism both of $F$-algebras and of monoids. We let $F\text{-}\mathrm{Mon}$ denote the category of $F$-monoids and morphisms between them.

Example 7. For the binding signature $S$ of Example 4, a $\Phi_S$-monoid is an object $X$, equipped with maps $X' \to X$ and $X^2 \to X$, and compatible monoid structure. Compatibility describes how substitution should be pushed down through abstractions and applications.

Well-behaved presheaves The exoticness we want to rule out only concerns the underlying functor of a model, so we just have to define well-behaved functors in [**F**, Set].

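Well-behavedness for a functor $T : \mathbf{F} \to \mathrm{Set}$ is about getting closed terms right. More precisely, for some finite sets $A$ and $B$, an element of $T(A + B)$ which both exists in $T(A)$ and $T(B)$ should also exist in $T(\emptyset)$, and uniquely so. This says exactly that $T$ should preserve the pullback of the two coproduct injections $A \to A + B \leftarrow B$, whose vertex is $\emptyset$.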

Remark 9. The reader might wonder about other, i.e., non-empty pullbacks. But these are automatically preserved, by [29, Proposition 2.1].

### Definition 19.


Example 8. As an example of a non well-behaved finitary monad, consider the monad $L$ of $\lambda$-calculus, but edited so that $L(\emptyset) = \emptyset$.

The important result for comparing the presheaf-based approach with ours is the following.

Proposition 3. The subcategory $\Phi_S\text{-}\mathrm{Mon}_{\mathrm{wb}}$ includes the initial object.

Proof. Roughly, closed terms are isomorphic to terms in two free variables that use neither the first, nor the second.

Remark 10. In most natural situations, all models are in fact well-behaved [16, Proposition 5.17].

### 4.2 Trimming down De Bruijn monads

Let us now turn to well-behaved De Bruijn algebras. Here well-behavedness is about finitariness. However, it may not be immediately clear how to define finitariness of a De Bruijn monad.

Definition 20. A De Bruijn monad $(X, s, \nu)$ is finitary iff each of its elements $x \in X$ has a (finite) support $n \in \mathbb{N}$, in the sense that for all $f : \mathbb{N} \to \mathbb{N}$ fixing the first $n$ numbers, the corresponding renaming $\nu \circ f$ fixes $x$, i.e., $x[\nu \circ f] = x$.

Example 9. By Proposition 4 below, the initial $S$-algebra is finitary, for any binding signature $S$. For a counterexample, consider the greatest fixed point of $A \mapsto \mathbb{N} + \Sigma_S(A)$, for any $S$ with at least one operation with more than one argument. E.g., if $S$ has an operation of binding arity (0, 0), like application in $\lambda$-calculus, then the term $\nu(0)\ (\nu(1)\ (\nu(2) \dots))$ does not have finite support.
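For finite $\lambda$-terms, a support in the sense of Definition 20 can be computed by structural recursion. The Python sketch below is an illustration under the assumption that terms are tagged tuples and assignments are functions; `support` and `rename` are hypothetical helper names. It computes the least bound on the free indices of a term and then checks that a renaming fixing the first numbers below that bound leaves the term unchanged.

```python
def var(n):
    return ("var", n)

def lam(body):
    return ("lam", body)

def app(fun, arg):
    return ("app", fun, arg)

def lift(sigma):
    """Lifting: 0 goes to var(0), n + 1 goes to sigma(n) with indices shifted."""
    return lambda n: var(0) if n == 0 else subst(sigma(n - 1), lambda k: var(k + 1))

def subst(e, sigma):
    if e[0] == "var":
        return sigma(e[1])
    if e[0] == "lam":
        return lam(subst(e[1], lift(sigma)))
    return app(subst(e[1], sigma), subst(e[2], sigma))

def support(e):
    """Least n such that every free index of e is below n."""
    if e[0] == "var":
        return e[1] + 1
    if e[0] == "lam":
        return max(0, support(e[1]) - 1)   # one index is bound by the abstraction
    return max(support(e[1]), support(e[2]))

def rename(e, f):
    """Renaming of e along f : N -> N, i.e. substitution by n -> var(f(n))."""
    return subst(e, lambda n: var(f(n)))

e = lam(app(var(0), var(2)))               # its only free index is 1
n = support(e)
fixing = lambda k: k if k < n else k + 7   # fixes the first n numbers
print(n, rename(e, fixing) == e)
```

The infinite term of Example 9 has no such bound, which is why it lives only in the greatest fixed point and not in this finite datatype.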

Definition 21. For any binding signature $S$, let $S\text{-}\mathrm{DBAlg}_{\mathrm{wb}}$ denote the full subcategory spanning De Bruijn $S$-algebras whose underlying De Bruijn monad is finitary.

Proposition 4. The subcategory $S\text{-}\mathrm{DBAlg}_{\mathrm{wb}}$ includes the initial object.

### 4.3 Bridging the gap

We may at last state the relationship between initial-algebra semantics of binding signatures in presheaves and in De Bruijn monads:

Theorem 3. Consider any binding signature $S$. The subcategories $\Phi_S\text{-}\mathrm{Mon}_{\mathrm{wb}}$ and $S\text{-}\mathrm{DBAlg}_{\mathrm{wb}}$ are equivalent.

Proof. See [16, Appendix A].

Remark 11. The moral of this is that, if one removes exotic objects from both $\Phi_S\text{-}\mathrm{Mon}$ and $S\text{-}\mathrm{DBAlg}$, then one obtains equivalent categories, which both retain the initial object. Thus, the two approaches to initial-algebra semantics of binding signatures differ only marginally.

Restricting attention to well-behaved objects, we may thus benefit from the strengths of both approaches. Typically, in De Bruijn monads, free variables need to be computed explicitly, while presheaves come with intrinsic scoping, as terms are indexed by sets of potential free variables. Conversely, in some settings, observational equivalence may relate programs with different sets of free variables [25]. In such cases, it is useful to have all terms collected in one single set. This needs to be computed (and involves non-trivial quotienting) in presheaves, while it is direct in De Bruijn monads.

### 5 Strength-based interpretation of the binding conditions

In the previous section, we have compared the category $S\text{-}\mathrm{DBAlg}$ of models of a binding signature $S$ in De Bruijn monads with the standard category of $\Phi_S$-monoids [11]. In this section, we establish a different kind of link, by showing that, for any binding signature $S$, both categories $S\text{-}\mathrm{DBAlg}$ and $\Phi_S\text{-}\mathrm{Mon}$ are instances of a common categorical construction. We have seen that the standard category $\Phi_S\text{-}\mathrm{Mon}$ is constructed from the pointed strong endofunctor $\Phi_S$, so we would like a similar construction of $S\text{-}\mathrm{DBAlg}$. However, pointed strong endofunctors live on monoidal categories [11,10], while we have seen in Corollary 1 that **N** and the tensor product only equip Set with skew monoidal structure. In order to bridge this gap, we resort to a generalisation of pointed strengths to skew monoidal categories proposed by Borthelle et al. [6].

We give a condensed account: the interested reader is referred to [16, §6].

The starting point is that the endofunctor $\Sigma_S$ associated to any given binding signature $S$ may be equipped with a family of maps

$$\mathrm{dbs}_S \colon \Sigma_S(X) \otimes Y \to \Sigma_S(X \otimes Y).$$

However, in order for such a map to be well-defined, we need to assume that $Y$ features variables and renaming, i.e., that it is a pointed **N**-module, as we now introduce:

### Definition 22.


Example 10. Any De Bruijn monad $(X, s, \nu)$ (in particular **N** itself) has a canonical structure of pointed **N**-module given by $\nu$ and the action $(x, f) \mapsto x[\nu \circ f]$.

We may now define the map dbs. Lifting of assignments (Definition 3) straightforwardly generalises to pointed **N**-modules. Recalling the definition

$$\Sigma\_S(X) = \sum\_{o \in O} X^{p\_o},$$

where $\mathrm{ar}(o) = (n^o_1, \dots, n^o_{p_o})$ for all $o \in O$, we thus simply have:

Definition 23. For any binding signature $S = (O, \mathrm{ar})$, the De Bruijn strength $\mathrm{dbs}_S$ of the induced endofunctor $\Sigma_S$ is defined by

$$\begin{aligned} \Sigma_S(X) \otimes Y &\to \Sigma_S(X \otimes Y) \\ ((o, (x_1, \dots, x_{p_o})), \sigma) &\mapsto (o, ((x_1, \Uparrow^{n^o_1}\sigma), \dots, (x_{p_o}, \Uparrow^{n^o_{p_o}}\sigma))), \end{aligned}$$

for all sets $X$ and pointed **N**-modules $Y$, with again $\mathrm{ar}(o) = (n^o_1, \dots, n^o_{p_o})$.
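To make the shape of the De Bruijn strength concrete, here is a small Python sketch for the special case where the pointed **N**-module is the set of natural numbers itself, with the module structure of Example 10 (an assumption of this sketch, under which lifting sends 0 to 0 and n + 1 to the successor of the image of n). Elements of the first set are arbitrary Python values, and the signature is that of Example 4; the names `SIGNATURE`, `lift`, `lift_n` and `dbs` are illustrative only.

```python
# Binding signature of lambda-calculus: operation name -> binding arity.
SIGNATURE = {"lam": (1,), "app": (0, 0)}

def lift(sigma):
    """Lifting in the pointed N-module N: 0 -> 0, n + 1 -> sigma(n) + 1."""
    return lambda n: 0 if n == 0 else sigma(n - 1) + 1

def lift_n(sigma, k):
    """Iterated lifting."""
    for _ in range(k):
        sigma = lift(sigma)
    return sigma

def dbs(element, sigma):
    """Send ((o, (x_1, ..., x_p)), sigma) to (o, ((x_1, lift^{n_1} sigma), ...))."""
    o, xs = element
    arity = SIGNATURE[o]
    assert len(xs) == len(arity)
    return (o, tuple((x, lift_n(sigma, n)) for x, n in zip(xs, arity)))

sigma = lambda n: n + 10
o, pairs = dbs(("lam", ("x",)), sigma)
print(o, pairs[0][0], [pairs[0][1](n) for n in range(3)])   # lam x [0, 11, 12]
```

The strength only redistributes the assignment over the arguments, lifting it as many times as the corresponding argument binds variables; no substitution is performed at this stage.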

The fact that any De Bruijn monad is in particular a pointed **N**-module by Example 10 enables the definition of models in the strength-based approach:

Definition 24. For any binding signature $S$, a $\Sigma_S$-monoid is an object $X$, equipped with monoid and $\Sigma_S$-algebra structure, say $s : X \otimes X \to X$, $\nu : \mathbb{N} \to X$, and $a : \Sigma_S(X) \to X$, making the following pentagon commute.

$$\begin{array}{ccccc} \Sigma_S(X) \otimes X & \xrightarrow{\ \mathrm{dbs}_S\ } & \Sigma_S(X \otimes X) & \xrightarrow{\ \Sigma_S(s)\ } & \Sigma_S(X) \\ {\scriptstyle a \otimes X}\downarrow & & & & \downarrow{\scriptstyle a} \\ X \otimes X & & \xrightarrow{\qquad s \qquad} & & X \end{array} \tag{3}$$

A morphism of $\Sigma_S$-monoids is a map which is both a monoid and a $\Sigma_S$-algebra morphism.

Let $\Sigma_S\text{-}\mathrm{Mon}$ denote the category of $\Sigma_S$-monoids and morphisms between them.

Remark 12. In [16], this definition is framed in a more general context, notably emphasising the fact that $\mathrm{dbs}_S$ is in fact a structural strength on the endofunctor $\Sigma_S$.

We may at last relate the initial-algebra semantics of §3 with the strength-based approach:

Proposition 5. For any binding signature $S = (O, \mathrm{ar})$ and De Bruijn monad $(X, s, \nu)$ equipped with a map $o_X : X^p \to X$ for all $o \in O$ with $\mathrm{ar}(o) = (n_1, \dots, n_p)$, the following are equivalent:


Corollary 3. For any binding signature $S$, we have an isomorphism $\Sigma_S\text{-}\mathrm{Mon} \cong S\text{-}\mathrm{DBAlg}$ of categories over Set.

This readily entails the following (bundled) reformulation of Theorems 1 and 2.

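Corollary 4. Consider any binding signature $S = (O, \mathrm{ar})$, and let $\mathrm{DB}_S$ denote the initial $(\mathbb{N} + \Sigma_S)$-algebra, with structure maps $\nu : \mathbb{N} \to \mathrm{DB}_S$ and $a : \Sigma_S(\mathrm{DB}_S) \to \mathrm{DB}_S$. Then:

	- the map $\mathbb{N} \otimes \mathrm{DB}_S \xrightarrow{\nu \otimes \mathrm{DB}_S} \mathrm{DB}_S \otimes \mathrm{DB}_S \xrightarrow{s} \mathrm{DB}_S$ coincides with the left unit of the skew monoidal structure $(n, f) \mapsto f(n)$, and
	- the pentagon (3) (with $\Sigma := \Sigma_S$) commutes.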

Proof. Let Mon(Set) denote the category of monoids in Set for the skew monoidal structure. We have an equality Mon(Set) = DBMnd of categories, and the algebra structure $\Sigma_S(\mathrm{DB}_S) \to \mathrm{DB}_S$ is merely the cotupling of the maps $o_{\mathrm{DB}_S}$ of Theorem 1. This correspondence translates one statement into the other.

Remark 13. This result hints at a potential push-button proof of Theorems 1 and 2 (and Corollary 4). Indeed, it is almost an instance of [6, Theorem 2.15]: the latter is stated for general skew monoidal categories instead of merely Set, but does not directly apply in the present setting, because it assumes that the tensor product is finitary in the second argument.

### 6 Simply-typed extension

In this section, we extend the framework of §2–3, which is untyped, to the simply-typed case. The development essentially follows the same pattern, replacing sets with families.

We fix in the whole section a set **T** of types, and call **T**-sets the objects of $\mathrm{Set}^{\mathbf{T}}$. A morphism $X \to Y$ is a family $(X(\tau) \to Y(\tau))_{\tau \in \mathbf{T}}$ of maps.

### 6.1 De Bruijn **T**-monads

In this subsection, we define the typed analogue of De Bruijn monads.

The role of **N** will be played in the typed context by the following **T**-set.

Definition 25. Let $\mathsf{N} \in \mathrm{Set}^{\mathbf{T}}$ be defined by $\mathsf{N}(\tau) = \mathbb{N}$.

Remark 14. This provides a countable set of variables at each type, which may not quite be what the reader would have called "typed De Bruijn representation". An inconvenience of this representation is that an "erasure" map from typed to untyped terms appears to need to rely on a bijection $\mathbf{T} \times \mathbb{N} \cong \mathbb{N}$ for "renaming" variables. In particular, not all indices can be preserved by such a map.

Definition 26. Given a **T**-set $X$, an $X$-assignment is a morphism $\mathsf{N} \to X$. We sometimes merely use "assignment" when $X$ is clear from context.

The role of the tensor product $X \otimes Y = X \times Y^{\mathbb{N}}$ will be played by $[\mathsf{N}, Y] \cdot X$, i.e., the iterated self-coproduct of $X$, with one copy per $Y$-assignment.

Notation 6.1. For coherence with the untyped case, we tend to write an element of $([\mathsf{N}, Y] \cdot X)(\tau)$ as $(x, f)$, with $x \in X(\tau)$ and $f : \mathsf{N} \to Y$. Furthermore, Notation 2.1 straightforwardly adapts to the typed case.

The definition of De Bruijn monads generalises almost mutatis mutandis:

Definition 27. A De Bruijn **T**-monad is a **T**-set $X$, equipped with

– a substitution morphism $s : [\mathsf{N}, X] \cdot X \to X$, which takes an element $x \in X(\tau)$ and an assignment $f : \mathsf{N} \to X$, and returns an element $x[f]$, and
– a variables morphism $\nu : \mathsf{N} \to X$,

such that for all $x \in X(\tau)$, and $f, g : \mathsf{N} \to X$, we have

$$x[f][g] = x[f[g]] \qquad\qquad \nu(n)[f] = f(n) \qquad\qquad x[\nu] = x.$$

Example 11. The set $\Lambda_{\mathrm{ST}}$ of simply-typed $\lambda$-terms with free variables of type $\tau$ in $\mathbb{N} \times \{\tau\}$, considered equivalent modulo $\alpha$-renaming, forms a De Bruijn monad. Variables $\mathsf{N} \to \Lambda_{\mathrm{ST}}$ are given by mapping, at any $\tau$, any $n \in \mathbb{N}$ to the variable $(n, \tau)$. Substitution $[\mathsf{N}, \Lambda_{\mathrm{ST}}] \cdot \Lambda_{\mathrm{ST}} \to \Lambda_{\mathrm{ST}}$ is standard, capture-avoiding substitution. One main purpose of this section is to characterise $\Lambda_{\mathrm{ST}}$ by a universal property, and reconstruct it categorically.

Morphisms generalise straightforwardly, and we get:

Proposition 6. De Bruijn **T**-monads and morphisms between them form a category DBMnd(**T**).

### 6.2 Initial-algebra semantics

We now adapt the initial-algebra semantics of §3 to the typed case. Let us start by generalising lifting to the typed case. This relies on a typed form of lifting, which acts on all variables of a given type, leaving all other variables untouched.

Definition 28. Let $(X, s, \nu)$ denote any De Bruijn **T**-monad. We first define a typed analogue $\uparrow^{\tau}$ of the $\uparrow$ of Definition 3, as below left, and then the lifting $\Uparrow^{\tau}\sigma$ of any assignment $\sigma : \mathsf{N} \to X$ as below right.

$$\begin{array}{ll} (\uparrow^{\tau})_{\tau}(n) = \nu_{\tau}(n+1) & (\Uparrow^{\tau} \sigma)_{\tau}(0) = \nu_{\tau}(0) \\ (\uparrow^{\tau})_{\tau'}(n) = \nu_{\tau'}(n) \quad (\text{if } \tau \neq \tau') & (\Uparrow^{\tau} \sigma)_{\tau}(n+1) = \sigma_{\tau}(n)[\uparrow^{\tau}] \\ & (\Uparrow^{\tau} \sigma)_{\tau'}(n) = \sigma_{\tau'}(n)[\uparrow^{\tau}] \quad (\text{if } \tau \neq \tau'). \end{array}$$

Finally, for any sequence $\gamma = (\tau_1, \dots, \tau_n)$ of types, we define $\Uparrow^{\gamma}\sigma$ inductively, by $\Uparrow^{\varepsilon}\sigma = \sigma$ and $\Uparrow^{\tau,\gamma}\sigma = \Uparrow^{\tau}(\Uparrow^{\gamma}\sigma)$, where $\varepsilon$ denotes the empty sequence.
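Typed lifting can be sketched in Python as follows. This is an illustration under simplifying assumptions: types are plain strings, terms carry type annotations on variables and binders but are not type-checked, and an assignment is a function taking a type and an index. The names `up`, `lift` and `lift_seq` are hypothetical. The point of the sketch is that lifting at a type shifts the variables of that type everywhere, including inside the images of variables of other types.

```python
def var(tau, n):
    return ("var", tau, n)

def lam(tau, body):
    """Abstraction binding one variable of type tau."""
    return ("lam", tau, body)

def app(fun, arg):
    return ("app", fun, arg)

def subst(e, sigma):
    """Typed parallel substitution; an abstraction lifts sigma at its bound type."""
    if e[0] == "var":
        return sigma(e[1], e[2])
    if e[0] == "lam":
        return lam(e[1], subst(e[2], lift(e[1], sigma)))
    return app(subst(e[1], sigma), subst(e[2], sigma))

def up(tau):
    """Typed shift: variables of type tau move up by one, the others are untouched."""
    return lambda t, n: var(t, n + 1) if t == tau else var(t, n)

def lift(tau, sigma):
    """Typed lifting of sigma at type tau."""
    def lifted(t, n):
        if t == tau:
            return var(tau, 0) if n == 0 else subst(sigma(t, n - 1), up(tau))
        return subst(sigma(t, n), up(tau))
    return lifted

def lift_seq(gamma, sigma):
    """Lifting along a sequence of types, outermost type first."""
    if not gamma:
        return sigma
    return lift(gamma[0], lift_seq(gamma[1:], sigma))

# sigma replaces the variable ("b", 0) by a term mentioning a variable of type "a".
image = app(var("a->b", 0), var("a", 4))
sigma = lambda t, n: image if (t, n) == ("b", 0) else var(t, n)
print(lift("a", sigma)("b", 0))   # the index of type "a" inside the image is shifted
```

Note that the image of a variable of type `"b"` is affected by lifting at type `"a"`, because a term of type `"b"` may well contain free variables of type `"a"`.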

We then generalise first-order and binding arities. The main point is:

Definition 29. A binding arity is an element of $(\mathbf{T}^* \times \mathbf{T})^* \times \mathbf{T}$, i.e., a tuple $(((\gamma_1, \tau_1), \dots, (\gamma_p, \tau_p)), \tau)$, where each $\gamma_i \in \mathbf{T}^*$ is a list of types, and each $\tau_i$, as well as $\tau$, are types, thought of as an inference rule

$$\frac{\gamma_1 \vdash \tau_1 \quad \dots \quad \gamma_p \vdash \tau_p}{\vdash \tau}\cdot$$

Example 12. The binding signature for simply-typed $\lambda$-calculus has two operations $\mathrm{lam}_{\tau,\tau'}$ and $\mathrm{app}_{\tau,\tau'}$ for all types $\tau$ and $\tau'$, of respective arities

$$\frac{\tau \vdash \tau'}{\vdash \tau \to \tau'} \qquad \text{and} \qquad \frac{\vdash \tau \to \tau' \qquad \vdash \tau}{\vdash \tau'}\cdot$$

This allows us to generalise binding conditions, as follows.

Definition 30. Let $a = (((\gamma_1, \tau_1), \dots, (\gamma_p, \tau_p)), \tau)$ be any binding arity, and $M$ be any **T**-set equipped with morphisms $s : [\mathsf{N}, M] \cdot M \to M$ and $\nu : \mathsf{N} \to M$. An operation of binding arity $a$ is a map $o : M(\tau_1) \times \dots \times M(\tau_p) \to M(\tau)$ satisfying the following $a$-binding condition w.r.t. $(s, \nu)$:

$$\begin{array}{l} \forall \sigma : \mathsf{N} \to M,\ (x_1, \dots, x_p) \in M(\tau_1) \times \dots \times M(\tau_p), \\ o(x_1, \dots, x_p)[\sigma] = o(x_1[\Uparrow^{\gamma_1} \sigma], \dots, x_p[\Uparrow^{\gamma_p} \sigma]). \end{array} \tag{4}$$

We may now generalise signatures and their models.

Definition 31. A **T**-binding signature consists of a set $O$ of operations, equipped with an arity map $\mathrm{ar} : O \to (\mathbf{T}^* \times \mathbf{T})^* \times \mathbf{T}$.

Definition 32. Consider any **T**-binding signature $S := (O, \mathrm{ar})$. A De Bruijn $S$-algebra consists of a De Bruijn **T**-monad $(X, s, \nu)$, together with algebra structure on $X$ for the underlying first-order signature $|S|$, in the obvious sense, such that for all $o \in O$ with arity $\mathrm{ar}(o) = (((\gamma_1, \tau_1), \dots, (\gamma_p, \tau_p)), \tau)$, the structural map $o_X : X(\tau_1) \times \dots \times X(\tau_p) \to X(\tau)$ satisfies the $\mathrm{ar}(o)$-binding condition w.r.t. $(s, \nu)$.

We denote by $S\text{-}\mathrm{DBAlg}$ the category of De Bruijn $S$-algebras and (the obvious notion of) morphisms between them.

Finally, following the untyped case, we may associate to each signature $S$ an endofunctor $\Sigma_S$, and we have the following typed extension of the initiality theorem.

Theorem 4. For any **T**-binding signature $S$, let $\mathrm{DB}_S$ denote the initial $(\mathsf{N} + \Sigma_S)$-algebra, with structure maps $\nu : \mathsf{N} \to \mathrm{DB}_S$ and $a : \Sigma_S(\mathrm{DB}_S) \to \mathrm{DB}_S$, inducing maps $o_{\mathrm{DB}_S} : \mathrm{DB}_S(\tau_1) \times \dots \times \mathrm{DB}_S(\tau_p) \to \mathrm{DB}_S(\tau)$ for all $o \in O$ with $\mathrm{ar}(o) = (((\gamma_1, \tau_1), \dots, (\gamma_p, \tau_p)), \tau)$. Then:


Example 13. While we saw in Example 12 that the De Bruijn monad of simply-typed λ-calculus terms admits a simple signature, there is another relevant, related monad, whose elements at any type are values of that type. (Indeed, values are closed under value substitution.) It is relatively straightforward to design a binding signature for this De Bruijn monad, following [15].

### 7 Equations

In this section, we introduce a notion of equational theory for specifying (typed) De Bruijn monads, following ideas from [8].

Definition 33. A De Bruijn equational theory consists of


Example 14. Recalling the binding signature <sup>Λ</sup> for λ-calculus from Example 4, let us define a De Bruijn equational theory for β-equivalence. We take = (1, 0), and for any De Bruijn Λ-algebra ,

– () has as structure map (e<sub>1</sub>, e<sub>2</sub>) ↦ app(lam(e<sub>1</sub>), e<sub>2</sub>) while

– () has as structure map (e<sub>1</sub>, e<sub>2</sub>) ↦ e<sub>1</sub>[e<sub>2</sub> · id]. (Here e<sub>2</sub> · id denotes the assignment 0 ↦ e<sub>2</sub>, n + 1 ↦ ().)
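The two structure maps of this example can be illustrated on untyped De Bruijn terms. The following is a minimal sketch, not the mechanised development cited in the paper: the class names `Var`, `Lam`, `App` and the helpers `lift`, `subst`, `cons_id` and `beta` are hypothetical names chosen for illustration, and assignments are modelled as Python functions from indices to terms.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Var:
    index: int


@dataclass(frozen=True)
class Lam:
    body: "Term"


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]
Subst = Callable[[int], Term]  # an assignment from indices to terms (illustrative type)


def shift(n: int) -> Term:
    """The assignment sending every variable n to variable n + 1."""
    return Var(n + 1)


def lift(sigma: Subst) -> Subst:
    """Lifted assignment: 0 stays 0, and n + 1 goes to sigma(n) with its variables shifted."""
    def lifted(n: int) -> Term:
        return Var(0) if n == 0 else subst(sigma(n - 1), shift)
    return lifted


def subst(t: Term, sigma: Subst) -> Term:
    """Parallel substitution t[sigma]; the Lam case uses the lifted assignment."""
    if isinstance(t, Var):
        return sigma(t.index)
    if isinstance(t, Lam):
        return Lam(subst(t.body, lift(sigma)))
    return App(subst(t.fun, sigma), subst(t.arg, sigma))


def cons_id(e: Term) -> Subst:
    """The assignment e . id: 0 goes to e, and n + 1 goes to variable n."""
    return lambda n: e if n == 0 else Var(n - 1)


def beta(e1: Term, e2: Term) -> Term:
    """Right-hand side of the equation: e1[e2 . id], to be identified with app(lam(e1), e2)."""
    return subst(e1, cons_id(e2))
```

For instance, `beta(Var(0), Var(5))` evaluates to `Var(5)`, while under a binder the substituted term is shifted, so `beta(Lam(App(Var(0), Var(1))), Var(3))` evaluates to `Lam(App(Var(0), Var(4)))`.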

Definition 34. Given an equational theory = (, , , ), a De Bruijn algebra is a De Bruijn -algebra such that () = ().

Let - DBAlg denote the category of -algebras, with morphisms of De Bruijn -algebras between them.

Remark 15. The category - DBAlg is an equaliser of and in CAT.

Let us now turn to characterising the initial De Bruijn -algebra, for any De Bruijn equational theory . For this, we introduce the following relation.

Definition 35. For any De Bruijn equational theory = (, , , ), with = (, ar ) and = ( 0 , ar <sup>0</sup> ), let DB denote the initial (N + Σ)-algebra. We define ∼ to be the smallest equivalence relation on DB satisfying the following rules,

$$\frac{}{\rho\_{L(\text{DB}\_{\text{S}})}'(e\_1, \dots, e\_p) \sim\_E \rho\_{R(\text{DB}\_{\text{S}})}'(e\_1, \dots, e\_p)} \qquad \frac{e\_1 \sim\_E e\_1' \quad \dots \quad e\_q \sim\_E e\_q'}{o\_{\text{DB}\_{\text{S}}}(e\_1, \dots, e\_q) \sim\_E o\_{\text{DB}\_{\text{S}}}(e\_1', \dots, e\_q')}$$

for all , 1, . . . in DB, <sup>0</sup> ∈ <sup>0</sup> with |ar <sup>0</sup> ( 0 )| = , and ∈ with |ar ()| = .

Example 15. For the equational theory of Example 14, the first rule instantiates precisely to the β-rule, while the second enforces congruence.

Theorem 5. For any equational theory = (, , , ), - DBAlg admits an initial object, whose carrier set is the quotient DB/∼ .

Proof. This has been mechanised in Coq [20] and HOL [22].

Example 16. The initial model for the equational theory of Example 14 is the quotient of λ-terms in De Bruijn representation by β-equivalence.

Remark 16. In [16, §9], we mention an equivalent way of defining De Bruijn equational theories in terms of modules.

### 8 Conclusion

We have proposed a simple, set-based theory of syntax with variable binding, which associates a notion of model (or algebra) to each binding signature, and constructs a term model following De Bruijn representation. The notion of model features a substitution operation. We have experienced the simplicity of this theory by implementing it in both Coq and HOL Light.

We have furthermore equipped the construction with an initial-algebra semantics, organising the models of any binding signature into a category, and proving that the term model is initial therein.

We have then studied this initial-algebra semantics in a bit more depth, in two directions. We have first established a formal link with the mainstream, presheaf-based approach [11], proving that well-behaved models (in a suitable sense on each side of the correspondence) agree up to an equivalence of categories. We have then recast the whole initial-algebra semantics into the mainstream, abstract framework of [11,10]. Finally, we have shown that our theory extends easily to a simply-typed setting, and smoothly incorporates equations.

### References



### Uniform Guarded Fragments

Reijo Jaakkola()

Tampere University, Tampere, Finland reijo.jaakkola@tuni.fi https://reijojaakkola.github.io

Abstract. In this paper we prove that the uniform one-dimensional guarded fragment, which is a natural polyadic generalization of guarded two-variable logic, has the Craig interpolation property. We will also prove that the satisfiability problem of the uniform guarded fragment is NExpTime-complete.

Keywords: Guarded fragment · Interpolation · Satisfiability problem.

### 1 Introduction

The guarded fragment GF is a well-studied fragment of first-order logic FO, which was introduced by Andréka, van Benthem and Németi [1] as a generalization of modal logic. Informally speaking, GF is obtained from FO by requiring that all quantification must be relativised by FO-atoms, which is motivated by the observation that "quantification" in modal logics is relativised by accessibility relations. Like modal logic, GF behaves well both computationally and model-theoretically. In particular, it is decidable, it has a (generalized) tree-model property and it satisfies various preservation theorems [1,7].

We say that a logic L has the Craig interpolation property (CIP), if for every two formulas ϕ and ψ of L we have that if ϕ |= ψ, then there exists a third formula χ of L (the interpolant), so that ϕ |= χ, χ |= ψ and χ contains only relation symbols which occur in both ϕ and ψ. CIP is widely regarded as a property that a "nice" logic should have and (for reasonable logics with compactness) it implies several other desirable model-theoretic properties such as Projective Beth Definability and Robinson's consistency theorem [1,4,13,19].

It is well-known that various modal logics have CIP [1,6,19], while GF fails to have it [11]. This is somewhat surprising, given that GF is a very natural generalisation of modal logic, and certainly raises the question of how the syntax of GF should be modified so as to obtain a logic which does have CIP, and which also behaves well both computationally and model-theoretically. One option would be to extend further the expressive power of GF, and in this direction we have the guarded negation fragment, which has CIP, is decidable and shares with GF various desirable model-theoretic properties [2].

The other option (and the one which is more relevant for this paper) is to investigate fragments of GF. In this direction we also have a positive result, namely that GF<sup>2</sup> (the two-variable fragment of GF) has CIP [11]. Given this result, it is natural to ask whether there exists a polyadic extension of GF<sup>2</sup> which would also have CIP, where by a polyadic extension we mean intuitively a logic which contains GF<sup>2</sup> and can express non-trivial properties of polyadic relations. Indeed, it seems rather unlikely that there would not be such an extension, since it is well-known that there are polyadic modal logics which have CIP [1].

P. Bouyer and L. Schröder (Eds.): FoSSaCS 2022, LNCS 13242, pp. 409–427, 2022. https://doi.org/10.1007/978-3-030-99253-8\_21

In [9] the uniform one-dimensional fragment UF<sup>1</sup> was introduced, which is a very natural polyadic extension of the two-variable fragment FO<sup>2</sup> of FO. Roughly speaking, UF<sup>1</sup> is obtained from FO by requiring that each maximal existential (or universal) block of quantifiers leaves at most one variable free and that when forming boolean combinations of formulas with more than one free variable, the formulas need to have exactly the same set of variables. Formulas satisfying the first restriction are called one-dimensional, while formulas satisfying the second restriction are called uniform. In [16] it was proved that UF<sup>1</sup> has the finite model property and the complexity of its satisfiability problem is NExpTime-complete, which is the same as for FO<sup>2</sup> [8]. The research around UF<sup>1</sup> and its variants has been quite active, see for instance [12,14,15,17,18].

Given that UF<sup>1</sup> is a polyadic extension of FO<sup>2</sup> , the guarded UF<sup>1</sup> is a natural candidate for being a polyadic extension of GF<sup>2</sup> with CIP. As the first main result of this paper we will prove that guarded UF<sup>1</sup> does, in fact, have CIP. Our proof follows closely the argument given in [11] for proving that GF<sup>2</sup> has CIP, the main technical difference being that the proof presented in [11] uses crucially the fact that in the case of GF<sup>2</sup> we can assume live sets to have size at most two, while in our case we have to deal with live sets of arbitrary size.

Since the research around modal-like fragments of FO is largely motivated by the fact that their satisfiability problems are often decidable, it is natural to also study the complexity of the satisfiability problem of the guarded UF<sup>1</sup>, which was in fact already done in [15]. More precisely it was proved in [15] that the satisfiability problem of one-dimensional GF is in NExpTime, while it is already NExpTime-hard for guarded UF<sup>1</sup>. These results left open the problem of determining the complexity of uniform GF and as the second main result of this paper we will prove that the satisfiability problem of uniform GF is also in NExpTime (and hence it is NExpTime-complete).

We also emphasize that as a necessary by-product of this second technical result, we isolate the uniformity restriction imposed on formulas of UF<sup>1</sup> as an independent syntactical restriction and provide a formal definition for it (which so far has been missing from the literature).<sup>1</sup> We believe that uniformity is an important and natural syntactical restriction (at least) in the context of fragments of FO. Indeed, in addition to UF<sup>1</sup> there are several known decidable fragments of FO which satisfy this restriction up to some degree, such as the one-binding fragments introduced in [20] and the ordered logic introduced in [10]. We hope that the results presented in this paper provide further motivation for the study of various uniform fragments of FO.

<sup>1</sup> To be precise, we only define what it means for a formula to be uniform in the context of GF; however, it is easy to extend this definition for other logics.

The structure of this paper is as follows. After the preliminaries in Section 2, we define a notion of bisimulation for UGF<sup>1</sup> and establish its basic properties in Section 3. After this we will prove that UGF<sup>1</sup> has CIP in Section 4. In Section 5 we will establish that the satisfiability problem of uniform GF is NExpTime-complete. The final section lists some new problems that the research conducted in this paper raises.

### 2 Preliminaries

### 2.1 Notation

In this paper we will work with vocabularies which do not contain constants and function symbols. We will also assume that there are no relation symbols of arity 0. We will use the Fraktur capital letters to denote structures, and the corresponding Roman letters to denote their domains. Given a model A and C ⊆ A, we will use A ↾ C to denote the restriction of A to the set C. Given two structures A and B, we will use A ≤ B to denote that A is a substructure of B.

Occasionally we will identify tuples a = (a1, . . . , an) with sets {a1, . . . , an}, which allows us to use notations such as b ∈ a and a = X, where X is a set. Given two tuples a and b of the same length, we will use a ↦ b and p : a → b to denote the mapping induced by the relation a<sub>i</sub> ↦ b<sub>i</sub>. Given a tuple a = (a1, . . . , an) and a unary function f, we will use f(a) to denote the tuple (f(a1), . . . , f(an)). Given a positive integer n we will denote [n] = {1, . . . , n}. Finally, if a = (a1, . . . , an) and k ≥ n and µ : [k] → [n] is a surjection, we will use a<sup>µ</sup> to denote the tuple (a<sub>µ(1)</sub>, . . . , a<sub>µ(k)</sub>).
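The tuple notation above can be made concrete with a small sketch. The function names `map_tuple` and `reindex` are hypothetical, and the surjection µ is assumed to be given as a sequence of 1-based positions (µ(1), . . . , µ(k)).

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")
S = TypeVar("S")


def map_tuple(f: Callable[[T], S], a: Sequence[T]) -> Tuple[S, ...]:
    """f(a) = (f(a_1), ..., f(a_n)) for a unary function f."""
    return tuple(f(x) for x in a)


def reindex(a: Sequence[T], mu: Sequence[int]) -> Tuple[T, ...]:
    """a^mu = (a_mu(1), ..., a_mu(k)), where mu lists 1-based positions of a.

    Raises ValueError unless mu is a surjection onto [n] with n = len(a).
    """
    n = len(a)
    if set(mu) != set(range(1, n + 1)):
        raise ValueError("mu must be a surjection onto [n]")
    return tuple(a[i - 1] for i in mu)
```

For example, `reindex(("a", "b"), (1, 2, 1))` gives `("a", "b", "a")`, a tuple of length 3 that still has the same underlying set as the original.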

#### 2.2 Types and Tables

The following definitions are standard in the context of UF<sup>1</sup> and were first introduced in [16]. Let σ be a vocabulary. Given a set X = {x1, . . . , xn} of distinct variables and a k-ary relation R ∈ σ, we say that an atomic formula R(x<sub>i<sub>1</sub></sub>, . . . , x<sub>i<sub>k</sub></sub>) is an X-atom over σ, if X = {x<sub>i<sub>1</sub></sub>, . . . , x<sub>i<sub>k</sub></sub>}. If α is an X-atom, then α and ¬α are both X-literals over σ. A 1-type over σ is a maximal satisfiable set of {x}-literals over σ. We identify 1-types π with conjunctions of their elements

$$\bigwedge \pi(x).$$

A k-table is a tuple ⟨ρ, π1, . . . , πk⟩, where each πℓ is a 1-type over σ, while ρ is a maximal satisfiable set of {x1, . . . , xk}-literals over σ. We identify k-tables ⟨ρ, π1, . . . , πk⟩ with conjunctions

$$\bigwedge \rho(x\_1, \ldots, x\_k) \land \bigwedge\_{1 \le \ell \le k} \pi\_\ell(x\_\ell).$$

Let A be a σ-model. Given a 1-type π over σ, we say that a ∈ A realizes π if π is the unique 1-type so that A |= π[a]; we denote by tp<sup>σ</sup><sub>A</sub>[a] the (unique) 1-type π over σ which is realized by a in A. For distinct elements a1, . . . , ak ∈ A we will use tp<sup>σ</sup><sub>A</sub>[a1, . . . , ak] to denote the (unique) k-table over σ which is realized by the tuple (a1, . . . , ak).
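For a finite structure, the realized 1-types and tables can be computed directly. The sketch below assumes a hypothetical encoding: `rels` maps each relation name to its set of tuples and `arity` maps each name to its arity. Since a maximal satisfiable set of literals is determined by the atoms it contains positively, only the positive atoms are recorded.

```python
from itertools import product


def one_type(rels, arity, a):
    """Names R such that R(a, ..., a) holds; this determines the 1-type of a."""
    return frozenset(R for R, k in arity.items() if (a,) * k in rels[R])


def surjections(m, k):
    """All maps from m positions onto k indices, as 0-based tuples of length m."""
    return [mu for mu in product(range(k), repeat=m) if set(mu) == set(range(k))]


def table(rels, arity, elems):
    """k-table of a tuple of distinct elements, as a pair (rho, 1-types).

    rho records every atom R(x_mu(1), ..., x_mu(m)) that mentions all k
    variables and holds of the tuple; mu is stored as a 0-based tuple.
    """
    k = len(elems)
    if len(set(elems)) != k:
        raise ValueError("tables are defined for tuples of distinct elements")
    rho = frozenset(
        (R, mu)
        for R, m in arity.items()
        for mu in surjections(m, k)
        if tuple(elems[i] for i in mu) in rels[R]
    )
    return rho, tuple(one_type(rels, arity, a) for a in elems)
```

With this encoding, two tuples of distinct elements realize the same table exactly when `table` returns equal values for them, which is the comparison used later for uniform partial isomorphisms.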

### 2.3 Syntax of Uniform Fragments of GF

Given a vocabulary σ, we define GF[σ] to be the smallest set F which satisfies the following requirements.


$$
\exists \overline{y} (\alpha(\overline{x}) \land \psi(\overline{x})) \in \mathcal{F},
$$

where y ⊆ x and α is an atomic formula over σ.

If the vocabulary σ is irrelevant or known from the context, then we will simply use GF to denote GF[σ].

Next we will give formal definitions for the syntactical notions of one-dimensionality and uniformity. We will start by making the technical remark that we will define recursively the set of subformulas Sf(ϕ) of ϕ ∈ GF otherwise in a standard way, except that for formulas of the form ϕ := ∃y(α(x) ∧ ψ(x)), we define Sf(ϕ) to be

$$\{\exists \overline{y} (\alpha(\overline{x}) \land \psi(\overline{x}))\} \cup \text{Sf}((\alpha(\overline{x}) \land \psi(\overline{x}))).$$

In other words, we treat each maximal sequence of existential quantification as a single logical operator.

Definition 1. Let ϕ ∈ GF be a formula. We say that ϕ is one-dimensional, if every subformula of ϕ of the form

$$\exists \overline{y} (\alpha(\overline{x}) \land \psi(\overline{x})) $$

has at most one free variable. In other words each maximal sequence of (guarded) existential quantification leaves at most one variable free.

Next we will define what it means for a formula of GF to be uniform. The precise definition turns out to be somewhat technical, and we will start with the following auxiliary definition.

Definition 2. Let X be a (possibly empty) set of variables and let σ be a vocabulary. A relative X-atom over σ is a formula ψ of GF[σ] which satisfies one of the following conditions.


With the aid of this definition we are in a position where we can define the notion of uniformity formally.

Definition 3. Let ϕ ∈ GF[σ] be a formula. We say that ϕ is uniform, if every subformula ψ of ϕ is a boolean combination of relative X-atoms, where X is the set of free variables of ψ.

Remark 1. Consider a uniform quantifier-free formula ψ(x1, . . . , xk) of GF[σ]. Let A be a σ-model and let (a1, . . . , ak) be a tuple of not necessarily distinct elements. Then whether or not

$$\mathfrak{A} \mid = \psi(a\_1, \dots, a\_k),$$

holds depends only on the table of (c1, . . . , cℓ), where (c1, . . . , cℓ) is an arbitrary enumeration of the set of distinct elements of (a1, . . . , ak).

The definition of uniformity is somewhat technical, but the following examples should clarify the intuition behind it.

Example 1. Let σ = {S, R, P}, where S is a ternary relation symbol, R is a binary relation symbol and P is a unary relation symbol. The formula

∃x∃y(P(x) ∧ R(x, y) ∧ S(x, y, y) ∧ R(y, x) ∧ P(y))

is both uniform and one-dimensional. On the other hand the formula

∃x∃y(∃z(S(x, y, z) ∧ P(z)) ∧ R(x, y) ∧ S(x, y, x))

is uniform but not one-dimensional. Finally, the formula

∃x∃y∃w(R(x, y) ∧ ∃zS(x, w, z))

is neither one-dimensional nor uniform.
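The one-dimensionality claims of Example 1 can be checked mechanically. The following sketch uses a hypothetical formula representation (`Atom`, `Not`, `And`, `Exists`), where an `Exists` node stands for one maximal block of existential quantifiers, and tests Definition 1 only; it checks neither guardedness nor uniformity.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Atom:
    rel: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class Not:
    sub: "Formula"


@dataclass(frozen=True)
class And:
    parts: Tuple["Formula", ...]


@dataclass(frozen=True)
class Exists:
    bound: Tuple[str, ...]  # one maximal block of quantified variables
    body: "Formula"


Formula = Union[Atom, Not, And, Exists]


def free_vars(phi):
    """Set of free variables of a formula."""
    if isinstance(phi, Atom):
        return frozenset(phi.args)
    if isinstance(phi, Not):
        return free_vars(phi.sub)
    if isinstance(phi, And):
        return frozenset().union(*(free_vars(p) for p in phi.parts))
    return free_vars(phi.body) - frozenset(phi.bound)


def is_one_dimensional(phi):
    """True if every quantifier block leaves at most one variable free."""
    if isinstance(phi, Atom):
        return True
    if isinstance(phi, Not):
        return is_one_dimensional(phi.sub)
    if isinstance(phi, And):
        return all(is_one_dimensional(p) for p in phi.parts)
    return len(free_vars(phi)) <= 1 and is_one_dimensional(phi.body)


# The three formulas of Example 1, in the order in which they appear.
first = Exists(("x", "y"), And((
    Atom("P", ("x",)), Atom("R", ("x", "y")), Atom("S", ("x", "y", "y")),
    Atom("R", ("y", "x")), Atom("P", ("y",)))))
second = Exists(("x", "y"), And((
    Exists(("z",), And((Atom("S", ("x", "y", "z")), Atom("P", ("z",))))),
    Atom("R", ("x", "y")), Atom("S", ("x", "y", "x")))))
third = Exists(("x", "y", "w"), And((
    Atom("R", ("x", "y")), Exists(("z",), Atom("S", ("x", "w", "z"))))))
```

Here `is_one_dimensional(first)` holds, while the second and third formulas fail because their inner quantifier blocks leave two variables free.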

Example 2. The standard translation of polyadic modal logic into FO results in formulas of the form

$$\exists x\_1 \dots \exists x\_k (R(x\_0, x\_1, \dots, x\_k) \land \bigwedge\_{1 \le \ell \le k} \psi\_\ell(x\_\ell))$$

which are uniform and one-dimensional [5].

We will use UGF to denote the set of formulas of GF which are uniform and UGF<sup>1</sup> to denote the set of formulas of GF which are both uniform and one-dimensional. Throughout this paper we will use ϕ(x1, . . . , xn), where all the variables in the tuple (x1, . . . , xn) are distinct, to denote a formula of either UGF<sup>1</sup> or UGF such that either {x1, . . . , xn} is precisely the set of free variables of ϕ or ϕ has at most one free variable which belongs to {x1, . . . , xn} or ϕ is of the form x<sub>i</sub> = x<sub>j</sub>, where 1 ≤ i, j ≤ n.

### 2.4 Interpolation

We start by recalling the definition of the Craig interpolation property.

Definition 4. Given a logic L, we say that L has the Craig interpolation property (CIP), if for every ϕ ∈ L[σ] and ψ ∈ L[τ ] we have that ϕ |= ψ implies that there exists an interpolant χ ∈ L[σ ∩ τ ] for this entailment, i.e., a sentence for which ϕ |= χ and χ |= ψ hold.

It is well-known that the full GF fails to have CIP. The known examples of sentences which demonstrate this can be used to make the following observation.

Proposition 1. The one-dimensional GF does not have CIP.

Proof. Consider the following sentences, which are simple variants of the formulas used in [13].

$$\varphi := \exists x \exists y \exists z (G(x, y, z) \land R(x, y) \land R(y, z) \land R(z, x))$$

$$\psi := \forall x \forall y (R(x, y) \to (A(x) \leftrightarrow \neg A(y)))$$

Notice that both of these sentences are one-dimensional. Now one can show, using essentially the same argument as the one used in Example 1 in [13], that there is no interpolant for the implication ϕ |= ¬ψ.

We remark that, in the context of fragments of FO, CIP is usually defined for formulas instead of sentences (as we have defined it). We could have also formulated it for formulas, but we decided to work with sentences for simplicity.

### 3 Bisimulation for UGF<sup>1</sup>

Given two models A and B, and tuples c ∈ A<sup>n</sup> and d ∈ B<sup>n</sup> we will use

$$(\mathfrak{A}, \overline{c}) \equiv\_{\sigma} (\mathfrak{B}, \overline{d})$$

to denote the fact that for every ϕ(x1, . . . , xn) ∈ UGF<sup>1</sup> we have that

$$
\mathfrak{A} \models \varphi(c\_1, \dots, c\_n) \iff \mathfrak{B} \models \varphi(d\_1, \dots, d\_n).
$$

The purpose of this section is to define a corresponding notion of bisimulation for UGF<sup>1</sup> which captures the above equivalence relation. We will start by defining a suitable notion of partial isomorphism.

Definition 5. Let A and B be models, and let X := {a1, . . . , an} ⊆ A and Y ⊆ B. A bijection p : X → Y , is called a uniform partial σ-isomorphism between A and B, if

$$\operatorname{tp}\_{\mathfrak{A}}^{\sigma}[a\_1, \dots, a\_n] = \operatorname{tp}\_{\mathfrak{B}}^{\sigma}[p(a\_1), \dots, p(a\_n)].$$

Quantification in GF over a model A is restricted to live subsets of A, i.e., subsets of A which are either singletons or are contained in a single tuple a ∈ R<sup>A</sup>, for some R ∈ σ. In the case of UGF<sup>1</sup> we will need the following modified version of the notion of live set, which takes into account the requirement that our formulas are uniform.

Definition 6. Let A be a model and let X ⊆ A. We say that X is σ-live, if either |X| ≤ 1 or there exists R ∈ σ and (a1, . . . , an) ∈ R<sup>A</sup> so that X = {a1, . . . , an}.
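The difference between the two notions of live set is easy to see on a finite structure. In this sketch the structure is assumed to be given as a hypothetical dictionary `rels` from the relation names of σ to their sets of tuples; `is_live` follows the description of live sets for GF given above, and `is_sigma_live` follows Definition 6.

```python
def is_live(rels, X):
    """Live in the sense of GF: a singleton, or contained in some tuple of a relation."""
    X = frozenset(X)
    return len(X) == 1 or any(X <= set(t) for ts in rels.values() for t in ts)


def is_sigma_live(rels, X):
    """Definition 6: |X| <= 1, or X is exactly the set of elements of some tuple."""
    X = frozenset(X)
    return len(X) <= 1 or any(X == set(t) for ts in rels.values() for t in ts)
```

With `rels = {"S": {(1, 2, 3)}}`, the set `{1, 2}` is live but not σ-live, because no tuple has exactly the elements 1 and 2, whereas `{1, 2, 3}` is both.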

We are now ready to define the notion of bisimulation for UGF<sup>1</sup>.

Definition 7. Let Z be a non-empty set of uniform partial σ-isomorphisms between two structures A and B. Let c ∈ A<sup>n</sup> and d ∈ B<sup>n</sup> be tuples. We say that Z is a uniform guarded σ-bisimulation between (A, c) and (B, d), if for every p : X → Y ∈ Z the following conditions hold:


(forth) For any a ∈ X and a σ-live set X′ ⊆ A, with a ∈ X′, there exists q : X′ → Y′ ∈ Z so that

$$p(a) = q(a).$$

(back) For any b ∈ Y and a σ-live set Y′ ⊆ B, with b ∈ Y′, there exists q : X′ → Y′ ∈ Z so that

$$p^{-1}(b) = q^{-1}(b).$$

If there exists a guarded σ-bisimulation between (A, c) and (B, d), then we denote this by (A, c) ∼<sub>σ</sub> (B, d).

In what follows we will often refer to uniform guarded bisimulations simply as guarded bisimulations. The following two lemmas establish that our notion of bisimulation is correct, the first of which can be proved in a standard manner by using induction.

Lemma 1. Let A and B be models, and let c ∈ A<sup>n</sup> and d ∈ B<sup>n</sup> be tuples so that (A, c) ∼<sub>σ</sub> (B, d). Then (A, c) ≡<sub>σ</sub> (B, d).

For the proof of the second lemma we need to recall the definition of an ω-saturated model. An elementary n-type over a vocabulary σ is a consistent set of first-order formulas (not necessarily quantifier-free) with free variables in {x1, . . . , xn}. Given a σ-model A, we say that it is ω-saturated, if for every tuple a ∈ A<sup>n</sup> of elements of A we have that each elementary n-type over the extended vocabulary σ ∪ {a1, . . . , an}, where each a<sub>i</sub> denotes a constant to be interpreted as the element a<sub>i</sub>, which is finitely consistent with the FO-theory of (A, a), is realized in (A, a). It is well-known that every σ-model, where σ is finite and relational, has an ω-saturated elementary extension [3].

Lemma 2. Let A and B be two ω-saturated models, and let c ∈ A<sup>n</sup> and d ∈ B<sup>n</sup> be tuples so that (A, c) ≡<sub>σ</sub> (B, d). Then (A, c) ∼<sub>σ</sub> (B, d).

Proof. Consider the following set

$$\mathcal{Z} := \{ p: \overline{a} \to \overline{b} \mid (\mathfrak{A}, \overline{a}) \equiv\_{\sigma} (\mathfrak{B}, \overline{b}) \}.$$

We claim that Z is a guarded σ-bisimulation between (A, c) and (B, d). We first note that by assumption c ↦ d ∈ Z, and hence Z satisfies (cover). Z also clearly consists of uniform partial σ-isomorphisms between A and B. What remains to be proved is that Z also satisfies (forth) and (back). Since these two cases are analogous, we will concentrate on (forth).

Let p : a → b ∈ Z, a ∈ X and X′ := {c1, . . . , cm} ⊆ A be a σ-live set so that a ∈ X′. For simplicity we will assume that a = c1. Consider now the following elementary m-type

$$\Sigma := \{ \varphi(p(a), x\_2, \dots, x\_m) \in \mathcal{U}\mathcal{G}\mathcal{F}\_1[\sigma \cup \{p(a)\}] \mid \mathfrak{A} \models \varphi(a, c\_2, \dots, c\_m) \}.$$

We claim that Σ is realized in (B, p(a)). Since B is ω-saturated, it suffices to show that each finite subset of Σ is realized in (B, p(a)). Let

$$
\psi\_1(p(a), x\_2, \dots, x\_m), \dots, \psi\_r(p(a), x\_2, \dots, x\_m) \in \Sigma.
$$

Since X′ is σ-live, there exists an atomic formula α(x1, . . . , xm) over σ with the property that

$$\mathfrak{A} \models \exists x\_2 \dots \exists x\_m (\alpha(a, x\_2, \dots, x\_m) \land \bigwedge\_{1 \le i \le r} \psi\_i(a, x\_2, \dots, x\_m)).$$

Note that Definition 6 guarantees that this is indeed a formula of UGF<sup>1</sup>[σ]. Since (A, a) ≡<sub>σ</sub> (B, b), we know that

$$\mathfrak{B} \models \exists x\_2 \dots \exists x\_m (\alpha(p(a), x\_2, \dots, x\_m) \land \bigwedge\_{1 \le i \le r} \psi\_i(p(a), x\_2, \dots, x\_m)).$$

Thus {ψ1(p(a), x2, . . . , xm), . . . , ψr(p(a), x2, . . . , xm)} is satisfiable in (B, p(a)), and hence Σ is satisfiable in (B, p(a)), say by the tuple (p(a), d2, . . . , dm). Now c ↦ d ∈ Z is the mapping we were after.

Remark 2. Using the two previous lemmas one can prove in a standard manner that UGF<sup>1</sup> is the maximal fragment of FO which is invariant under uniform guarded bisimulation, see for example [2].

### 4 Proof that UGF<sup>1</sup> has CIP

In this section we will prove that UGF<sup>1</sup> has CIP. We will start with the following lemma.

Lemma 3. Let σ and τ be signatures, and let ϕ ∈ UGF<sup>1</sup>[σ] and ψ ∈ UGF<sup>1</sup>[τ]. Suppose that there is no χ ∈ UGF<sup>1</sup>[σ ∩ τ] with the property that ϕ |= χ and χ |= ψ. Then there is a σ-model A and a τ-model B with the property that A |= ϕ, B ⊭ ψ and A ≡<sub>σ∩τ</sub> B.

Proof. Essentially the same argument as the one used in the proof of Theorem 4.1 in [2] gives the result.

To give a high-level overview of the rest of the proof, suppose that the assumption of Lemma 3 holds for sentences ϕ and ψ, which implies in particular that there are models A and B so that A ∼<sub>σ∩τ</sub> B. Now, what we want to prove is that ϕ ∧ ¬ψ is satisfiable. To do this, we will follow a standard approach in modal logic [2,11] by constructing an amalgam U which has the property that U ∼<sub>σ</sub> A and U ∼<sub>τ</sub> B. In particular, it will be a model of ϕ ∧ ¬ψ, since A |= ϕ and B |= ¬ψ.

Suppose now that A ∼<sub>σ∩τ</sub> B and let Z be a guarded (σ ∩ τ)-bisimulation which witnesses it. Given a pair (a, b) we will use (a, b) ∈ Z to denote the fact that there exists p ∈ Z with the property that a = dom(p) and p(a) = b. In other words the relation a<sub>i</sub> ↦ b<sub>i</sub> induces a uniform partial (σ ∩ τ)-isomorphism which belongs to Z.

Before describing the construction of U, we need to introduce some additional notation. Given two tuples a and b of the same length, we will let (a ⊗ b) denote the following tuple:

$$((a\_1, b\_1), \dots, (a\_n, b\_n))$$

Given (a ⊗ b), we say that it is left-good, if for every 1 ≤ i < j ≤ n we have that if a<sub>i</sub> = a<sub>j</sub>, then b<sub>i</sub> = b<sub>j</sub>.<sup>2</sup> Similarly we say that (a ⊗ b) is right-good, if for every 1 ≤ i < j ≤ n we have that if b<sub>i</sub> = b<sub>j</sub>, then a<sub>i</sub> = a<sub>j</sub>. Finally we say that (a ⊗ b) is good if it is left-good and right-good. Note that if (a ⊗ b) is of length n, k ≥ n and µ : [k] → [n] is a surjection, then we have that if (a ⊗ b) is left-good, then so is (a ⊗ b)<sup>µ</sup>. An analogous observation of course holds for right-good and good.
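The three conditions on (a ⊗ b) are simple to state as predicates on a tuple of pairs. The helper names below (`pair_up`, `is_left_good`, `is_right_good`, `is_good`) are hypothetical and serve only to illustrate the definitions.

```python
def pair_up(a, b):
    """The tuple (a ⊗ b) = ((a_1, b_1), ..., (a_n, b_n))."""
    if len(a) != len(b):
        raise ValueError("tuples must have the same length")
    return tuple(zip(a, b))


def is_left_good(pairs):
    """Whenever a_i = a_j we also have b_i = b_j."""
    seen = {}
    for a, b in pairs:
        # setdefault stores b on the first occurrence of a and returns the stored value
        if seen.setdefault(a, b) != b:
            return False
    return True


def is_right_good(pairs):
    """Whenever b_i = b_j we also have a_i = a_j."""
    return is_left_good(tuple((b, a) for a, b in pairs))


def is_good(pairs):
    """Both left-good and right-good."""
    return is_left_good(pairs) and is_right_good(pairs)
```

For example, `pair_up((1, 1), ("u", "v"))` is right-good but not left-good, and repeating positions of a good tuple, as in (a ⊗ b)<sup>µ</sup>, cannot break either condition because no new pairs are introduced.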

As the domain of the amalgam U we will take the set U = {(a, b) ∈ A × B | (a, b) ∈ Z}, while the interpretations of relation symbols will be defined as follows. First, for every R ∈ σ ∩ τ we define that

$$(\overline{a}\otimes\overline{b})\in R^{\mathfrak{U}}\text{ iff }\overline{a}\in R^{\mathfrak{A}}\text{ and }(\overline{a},\overline{b})\in \mathcal{Z}.$$

Then, for every R ∈ (σ\τ ) we define that (a ⊗ b) ∈ R<sup>U</sup> iff a ∈ R<sup>A</sup> and one of the following conditions holds:

– (a, b) ∈ Z.

– (a ⊗ b) is left-good and a is not (σ ∩ τ)-live.

Similarly, for every R ∈ (τ\σ) we define that (a ⊗ b) ∈ R<sup>U</sup> iff b ∈ R<sup>B</sup> and one of the following conditions holds:

– (a, b) ∈ Z.

– (a ⊗ b) is right-good and b is not (σ ∩ τ)-live.

This concludes the construction of U. This construction is similar to the one given in [11] with the exception that we require tuples that are not (σ ∩ τ )-live to be either right-good or left-good.

<sup>2</sup> In other words, if (a ⊗ b) is left-good, then the projection (a ⊗ b) ↦ a is an injection.

We now define

$$\mathcal{Z}\_1 := \{ (\overline{a} \otimes \overline{b}) \mapsto \overline{a} \mid (\overline{a} \otimes \overline{b}) \text{ is } \sigma\text{-live in } \mathfrak{U} \}$$

and

$$\mathcal{Z}\_2 := \{ (\overline{a} \otimes \overline{b}) \mapsto \overline{b} \mid (\overline{a} \otimes \overline{b}) \text{ is } \tau\text{-live in } \mathfrak{U} \}.$$

Note that if (a ⊗ b) is σ-live, then by construction it is also left-good (and an analogous observation obviously holds for τ -live tuples in U).

Lemma 4. Z<sub>1</sub> consists of uniform partial σ-isomorphisms between U and A, and Z<sub>2</sub> consists of uniform partial τ-isomorphisms between U and B.

Proof. We will only consider the case of Z<sub>1</sub>, since the case of Z<sub>2</sub> is analogous. Let (a ⊗ b) ↦ a ∈ Z<sub>1</sub>, where the length of (a ⊗ b) is n. We will separately check that this mapping preserves 1-types and n-ary atomic formulas.

Let 1 ≤ i ≤ n and suppose that

$$((a\_i, b\_i), \dots, (a\_i, b\_i)) \in R^{\mathfrak{U}},$$

where R ∈ σ. By construction we know that (a<sub>i</sub>, . . . , a<sub>i</sub>) ∈ R<sup>A</sup>. Suppose then that

$$(a\_i, \ldots, a\_i) \in R^{\mathfrak{A}}.$$

Since by definition of U we have that (a<sub>i</sub>, b<sub>i</sub>) ∈ Z, we can conclude that

$$((a\_i, b\_i), \dots, (a\_i, b\_i)) \in R^{\mathfrak{U}}.$$

Thus (a<sub>i</sub>, b<sub>i</sub>) and a<sub>i</sub> have the same 1-types over σ.

We will then verify that the mapping preserves n-ary atomic formulas. Let R ∈ σ be a k-ary relation, where k ≥ n, and let µ : [k] → [n] be a surjection. We need to show that (a ⊗ b)<sup>µ</sup> ∈ R<sup>U</sup> iff a<sup>µ</sup> ∈ R<sup>A</sup>. Again, the left to right direction follows immediately from the definition of U, so we will concentrate on the direction from right to left. First we note that if a is not (σ ∩ τ)-live, then we are done, since then also a<sup>µ</sup> is not (σ ∩ τ)-live.

Thus we can assume that a is (σ ∩ τ)-live. Now, due to the definition of Z<sub>1</sub>, we know that (a ⊗ b) is σ-live in U. Hence, by definition of U, and the fact that a is (σ ∩ τ)-live, we know that (a, b) ∈ Z, which is the same as (a<sup>µ</sup>, b<sup>µ</sup>) ∈ Z. Now we can deduce, due to the definition of U, that (a ⊗ b)<sup>µ</sup> ∈ R<sup>U</sup>. This, together with the fact that (a ⊗ b) ↦ a preserves 1-types over σ, allows us to conclude that tp<sup>σ</sup><sub>U</sub>[a ⊗ b] = tp<sup>σ</sup><sub>A</sub>[a].

Lemma 5. Z<sub>1</sub> is a guarded σ-bisimulation between U and A, and Z<sub>2</sub> is a guarded τ-bisimulation between U and B.

Proof. Again, we will only consider the case of Z<sub>1</sub>, since the case of Z<sub>2</sub> is analogous. Due to Lemma 4 we just need to verify the (back) and (forth) conditions. Let (a ⊗ b) ↦ a ∈ Z<sub>1</sub>, where the length of a and b is n.


that (a<sub>i</sub>, b<sub>i</sub>) ∈ Z. Since Z is a guarded (σ ∩ τ)-bisimulation, there exists a set {d1, . . . , dm} ⊆ B so that (c, d) ∈ Z and (a<sub>i</sub>, b<sub>i</sub>) ∈ (c ⊗ d). In particular (c ⊗ d) is σ-live in U, and hence (c ⊗ d) ↦ c ∈ Z<sub>1</sub>, which is the mapping we were after.

Theorem 1. UGF<sup>1</sup> has the Craig interpolation property.

Proof. Let ϕ ∈ UGF<sup>1</sup>[σ] and ψ ∈ UGF<sup>1</sup>[τ] be sentences so that ϕ |= ψ, but there is no interpolant for this entailment. By Lemma 3 there exists a σ-model A and a τ-model B such that A |= ϕ, B ⊭ ψ and A ≡<sub>σ∩τ</sub> B. Take ω-saturated elementary extensions Â and B̂ of A and B. Since Â ≡<sub>σ∩τ</sub> B̂, by Lemma 2 we have that Â ∼<sub>σ∩τ</sub> B̂. Using the construction presented in this section there exists a (σ ∪ τ)-model U with the property that U ∼<sub>σ</sub> Â and U ∼<sub>τ</sub> B̂. Thus U |= ϕ ∧ ¬ψ, i.e. ϕ ∧ ¬ψ is consistent, which contradicts the assumption that ϕ |= ψ.

### 5 Complexity of uniform GF

In this section we will prove that the satisfiability problem of uniform GF is in NExpTime. Since it was proved in [15] that the satisfiability problem of UGF<sub>1</sub> is NExpTime-hard, this upper bound is sharp.

#### 5.1 Scott normal form

As usual, we will start by arguing that we can restrict our attention to sentences which are in a certain normal form. The normal form that we will use here has a somewhat awkward form, but the proof of Lemma 6 should clarify why we chose to use it.

Definition 8. Let ϕ be a sentence of UGF. We say that ϕ is in normal form, if it has the following shape

$$\bigwedge_{t \in T} \exists z\, \lambda_t(z) \land \bigwedge_{i \in I} \forall \overline{x}(\alpha_i(\overline{x}) \to \exists \overline{y}(\beta_i(\overline{x}, \overline{y}) \land \psi_i(\overline{x}, \overline{y})))$$

$$\land \bigwedge_{j \in J} \forall \overline{x}(\kappa_j(\overline{x}) \to (\theta_j(\overline{x}) \to \forall \overline{y}(\gamma_j(\overline{x}, \overline{y}) \to \psi_j(\overline{x}, \overline{y})))),$$

where T, I, J are non-empty (finite) sets, λ<sub>t</sub>, α<sub>i</sub>, β<sub>i</sub>, κ<sub>j</sub> and γ<sub>j</sub> are atomic formulas and ψ<sub>i</sub>, θ<sub>j</sub> and ψ<sub>j</sub> are quantifier-free formulas.

Remark 3. In the definition of the normal form we do not require that the tuples y are necessarily non-empty, i.e., we allow formulas of the form ∀x(αi(x) → ψi(x)) in our normal forms. However, we do require that the tuples x are nonempty, and hence we do not allow formulas of the form ∃y(βi(y) ∧ ψi(y)), where the length of y is more than one.

If ϕ is a sentence of UGF in normal form, then we refer to its conjuncts of the form

$$\forall \overline{x} (\alpha\_i(\overline{x}) \to \exists \overline{y} (\beta\_i(\overline{x}, \overline{y}) \land \psi\_i(\overline{x}, \overline{y})))$$

as the existential requirements and we will use ϕ<sup>∃</sup><sub>i</sub> to denote them. Given a model A, an existential requirement ϕ<sup>∃</sup><sub>i</sub> and a ∈ α<sub>i</sub><sup>A</sup>, we say that a tuple c is a witness for ϕ<sup>∃</sup><sub>i</sub> and a if

$$\mathfrak{A} \models \beta_i(\overline{a}, \overline{c}) \wedge \psi_i(\overline{a}, \overline{c}).$$

Conjuncts of the form

$$\forall \overline{x} (\kappa\_j(\overline{x}) \to (\theta\_j(\overline{x}) \to \forall \overline{y} (\gamma\_j(\overline{x}, \overline{y}) \to \psi\_j(\overline{x}, \overline{y}))))$$

will be referred to as the universal requirements and we will use ϕ<sup>∀</sup><sub>j</sub> to denote them.

Using standard renaming techniques one can establish the following.

Lemma 6. There is a nondeterministic polynomial-time procedure, taking as its input a sentence ϕ ∈ UGF[σ] and producing a sentence ϕ′ ∈ UGF[σ′] in normal form, where σ′ ⊃ σ, such that


Proof. We will essentially follow the proof of Lemma 1 in [15], with some small technical modifications. Let ϕ ∈ UGF[σ] be a sentence, which w.l.o.g. contains only existential quantification. Let ψ be the innermost formula of ϕ which starts with a block of existential quantifiers. If ψ is a sentence, we will nondeterministically replace it with either ⊥ or ⊤ and add ¬ψ or ψ, respectively (depending on our guess), as a conjunct to the resulting formula. Suppose then that ψ is a formula of the form

$$
\exists \overline{y} (\alpha(\overline{x}, \overline{y}) \land \psi(\overline{x}, \overline{y})) .
$$

Since ϕ was a sentence, ψ occurs in a scope of another formula of the form

$$
\exists \overline{z} (\alpha'(\overline{x}) \land \psi'(\overline{x})),
$$

where z ⊆ x. Let α′ be the guard of the innermost such formula. We will now replace ϕ with the following formula

$$
\varphi[\psi(\overline{x})/R(\overline{x})] \land \forall \overline{x} (R(\overline{x}) \to \exists \overline{y} (\alpha(\overline{x}, \overline{y}) \land \psi(\overline{x}, \overline{y})))
$$

$$
\land \forall \overline{x} (\alpha'(\overline{x}) \to (\neg R(\overline{x}) \to \forall \overline{y} (\alpha(\overline{x}, \overline{y}) \to \neg \psi(\overline{x}, \overline{y})))),
$$

where ϕ[ψ(x)/R(x)] is the sentence obtained from ϕ by replacing the previously mentioned subformula ψ(x) with the atomic formula R(x) which has a fresh relation symbol R. It is straightforward to verify that the resulting sentence is equi-satisfiable with ϕ.

Now one can repeat the above procedure until one is left with a sentence of the form

$$\bigwedge\_{t \in T} \exists \overline{x} (\alpha\_t(\overline{x}) \land \psi\_t(\overline{x})) \land \bigwedge\_{i \in I} \varphi\_i^{\exists} \land \bigwedge\_{j \in J} \varphi\_j^{\forall},$$

where each ϕ<sup>∃</sup><sub>i</sub> is an existential requirement, while each sentence ϕ<sup>∀</sup><sub>j</sub> is a universal requirement. Now one can replace each conjunct ∃x<sub>1</sub> . . . ∃x<sub>n</sub>(α<sub>t</sub>(x<sub>1</sub>, . . . , x<sub>n</sub>) ∧ ψ<sub>t</sub>(x<sub>1</sub>, . . . , x<sub>n</sub>)) with a sentence of the form

$$
\exists x \lambda\_t(x) \land \forall x\_1 (\lambda\_t(x\_1) \to \exists x\_2 \dots \exists x\_n (\alpha\_t(x\_1, \dots, x\_n) \land \psi\_t(x\_1, \dots, x\_n))),
$$

where λ<sub>t</sub> is a fresh unary relation symbol. The resulting sentence is clearly equi-satisfiable with the original sentence and furthermore it is in normal form.
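As a small worked illustration of the procedure, consider the following sentence. The sentence and the relation symbols A, B, E, R and λ are chosen here purely for illustration and are not taken from [15]; we assume that the sentence is admissible as a UGF-sentence.

$$
\varphi = \exists x\,(A(x) \land \exists y\,(E(x,y) \land B(y))).
$$

The innermost existential formula is ∃y(E(x, y) ∧ B(y)), which is not a sentence, and the guard of the enclosing formula is A(x). One renaming step with a fresh unary symbol R yields

$$
\exists x\,(A(x) \land R(x)) \land \forall x\,(R(x) \to \exists y\,(E(x,y) \land B(y))) \land \forall x\,(A(x) \to (\neg R(x) \to \forall y\,(E(x,y) \to \neg B(y)))),
$$

and the final step replaces the first conjunct, using a fresh unary symbol λ, by

$$
\exists x\,\lambda(x) \land \forall x_1\,(\lambda(x_1) \to (A(x_1) \land R(x_1))).
$$

The result consists of one conjunct of the form ∃zλ(z), two existential requirements (one of them with an empty block of existential quantifiers, as allowed by Remark 3) and one universal requirement.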

#### 5.2 Satisfiability Witnesses

A standard technique in proving that the complexity of the satisfiability problem of a given fragment of FO is in NExpTime is to show that each satisfiable sentence of this fragment has a finite model of size at most exponential with respect to the length of the sentence [8,12,15,16]. However, in the case of UGF it seems to be easier to show that we can associate to each of its sentences ϕ a different type of certificate, which is still at most exponential with respect to the length of the sentence, and which can be used to construct a (potentially infinite) model for ϕ.

Definition 9. Let ϕ ∈ UGF[σ] be a sentence in normal form, P be a set of 1-types over σ and π ∈ P. A pair (A, c), where c ∈ A, is called a (P, π)-witness for ϕ, if it satisfies the following requirements.


$$
\mathfrak{A} \models \gamma_j(\overline{a}, \overline{b}) \to \psi_j(\overline{a}, \overline{b}).
$$

Here the intuition is that a (P, π)-witness (A, c) is a local certificate; it certifies that we can provide witnesses for tuples which contain the element c. The main idea now is that if we have a (P, π)-witness for each π ∈ P, then we can use them to construct a proper model for ϕ.

Definition 10. Let ϕ ∈ UGF[σ] be a sentence in normal form. A set of 1-types P over σ is a witness for ϕ, if it satisfies the following two requirements.


The following lemmas prove that the existence of a witness for ϕ is equivalent to the satisfiability of ϕ.

Lemma 7. Let ϕ ∈ UGF be a sentence in normal form. If ϕ is satisfiable, then there exists a witness for it.

Proof. Suppose that A |= ϕ. As the set of 1-types P we can take the set

$$\{\text{tp}\_{\mathfrak{A}}[a] \mid a \in A\}.$$

Clearly for every conjunct ∃zλt(z) there exists a suitable 1-type in P. Towards verifying the second requirement let π ∈ P and let c ∈ A be an element which realizes π. Then (A, c) is clearly a (P, π)-witness for ϕ.

Lemma 8. Let ϕ ∈ UGF be a sentence in normal form. If there exists a witness for ϕ, then it is satisfiable.

Proof. For simplicity we will assume that ϕ contains exactly one conjunct of the form ∃zλ<sub>t</sub>(z). Let P be a witness for ϕ. Thus for every π ∈ P there exists a pair (A<sup>π</sup>, c) which is a (P, π)-witness for ϕ. Our goal is to use these witnesses to construct a sequence of models

$$
\mathfrak{A}\_1 \le \mathfrak{A}\_2 \le \mathfrak{A}\_3 \le \dots
$$

so that their union is a model of ϕ.

Let π ∈ P be a 1-type so that π |= λ<sub>t</sub>. As the model A<sub>1</sub> we will take the model which contains a single element with 1-type π. Suppose then that we have defined A<sub>n</sub> in such a way that each 1-type realized in A<sub>n</sub> belongs to P. To define the model A<sub>n+1</sub> we will proceed as follows. Given a ∈ A<sub>n</sub>, we will use W<sub>a</sub> to denote the set A<sup>π</sup> − {c}, where A<sup>π</sup> refers to the domain of the model in the (P, π)-witness (A<sup>π</sup>, c) for π := tp<sub>A<sub>n</sub></sub>[a]. Without loss of generality we will assume that the sets W<sub>a</sub> are pairwise disjoint. Now we will define A<sub>n+1</sub> as follows.

– The domain of the model is

$$A_n \cup \bigcup_{a \in A_n} W_a.$$

– A<sub>n+1</sub> ↾ A<sub>n</sub> is defined to be isomorphic to A<sub>n</sub>.

– For each a ∈ A<sub>n</sub> and for each {c<sub>1</sub>, . . . , c<sub>m</sub>} ⊆ W<sub>a</sub>, we define that

$$\operatorname{tp}\_{\mathfrak{A}\_{n+1}}[a, c\_1, \dots, c\_m] := \operatorname{tp}\_{\mathfrak{A}^\pi}[c, c\_1, \dots, c\_m],$$

where π is the 1-type of a.

– For every tuple (a<sub>1</sub>, . . . , a<sub>m</sub>) and an m-ary relation R for which we have not yet defined whether (a<sub>1</sub>, . . . , a<sub>m</sub>) belongs to R<sup>A<sub>n+1</sub></sup>, we will simply define that it does not belong to it.

The last step guarantees that if a tuple, which contains more than one element, is live in A<sub>n+1</sub>, then it was already live in one of the models A<sup>π</sup>. It is straightforward to verify that the union of the models (A<sub>n</sub>)<sub>n < ω</sub> is indeed a model of ϕ.
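The chain construction above can be sketched in code for a much simplified setting. The sketch below assumes (this is not from the paper) that the vocabulary consists of unary predicates, summarised by a 1-type label, and a single binary relation E without loops; the function name `unfold` and the sample witnesses are hypothetical.

```python
from itertools import count


def unfold(witnesses, start_type, stages):
    """Build an initial segment of the chain A_1 <= A_2 <= ... from local witnesses.

    Simplifying assumptions (not from the paper): 1-types are plain labels and
    there is a single binary relation E without loops. witnesses[pi] is a
    triple (types, edges, centre): a small digraph whose element `centre`
    has 1-type pi.
    """
    fresh = count()                      # supply of fresh element names
    root = next(fresh)
    types = {root: start_type}           # A_1: a single element of the start type
    edges = set()
    for _ in range(stages - 1):          # build A_{n+1} from A_n
        for a in list(types):            # snapshot: every element a of A_n
            w_types, w_edges, centre = witnesses[types[a]]
            rename = {centre: a}         # glue the witness onto a ...
            for c in w_types:
                if c != centre:          # ... using a fresh, disjoint copy W_a
                    rename[c] = next(fresh)
                    types[rename[c]] = w_types[c]
            # copy the table of (centre, c_1, ..., c_m) to (a, c_1, ..., c_m)
            edges |= {(rename[u], rename[v]) for (u, v) in w_edges}
    # every pair not listed in `edges` is, by default, not in E
    return types, edges


# Hypothetical witnesses: a p-element needs an E-successor of type q, and vice versa.
witnesses = {
    "p": ({0: "p", 1: "q"}, {(0, 1)}, 0),
    "q": ({0: "q", 1: "p"}, {(0, 1)}, 0),
}
types, edges = unfold(witnesses, "p", 3)
```

After three stages every element of A<sub>2</sub> has the successor that its witness demands, while the elements added in the last stage are served only in the next stage; this is why the model of ϕ is obtained as the union of the whole chain.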

#### 5.3 Complexity of UGF

Although the size of a witness for ϕ is clearly only exponential with respect to |ϕ|, we do not yet have any upper bounds on the time it takes to verify that it really is a witness for ϕ. The following lemma gives us such a bound.

Lemma 9. Let ϕ ∈ UGF be a sentence in normal form and let σ denote the vocabulary of ϕ. Let P be a set of 1-types over σ and π ∈ P. If there exists a (P, π)-witness for ϕ, then there exists one in which the size of the model is at most 2<sup>|ϕ|<sup>O(1)</sup></sup>.

Proof. Let (A, c) be a (P, π)-witness for ϕ and let m = max{ar(R) | R ∈ σ}. Note that m ≤ |ϕ|. Our goal is to construct a sequence

$$
\mathfrak{B}\_1 \le \cdots \le \mathfrak{B}\_m
$$

of models so that (B<sub>m</sub>, c) is a (P, π)-witness for ϕ and |B<sub>m</sub>| ≤ 2<sup>|ϕ|<sup>O(1)</sup></sup>. As the model B<sub>1</sub> we will take the model which contains a single element with 1-type π; let e denote this element.

Before moving forward, we will introduce one auxiliary definition. Let a = (a<sub>1</sub>, . . . , a<sub>n</sub>) and b = (b<sub>1</sub>, . . . , b<sub>n</sub>) be tuples of elements from two models A and B. Let {c<sub>1</sub>, . . . , c<sub>m</sub>} denote the set of distinct elements in a. We say that a and b are similar, if the mapping p : a → b induced by a<sub>i</sub> ↦ b<sub>i</sub> is a bijection and furthermore

$$\text{tp}_{\mathfrak{A}}[c_1, \dots, c_m] = \text{tp}_{\mathfrak{B}}[p(c_1), \dots, p(c_m)].$$

Suppose now that we have defined B<sub>k</sub>, where k < m, in such a way that for each σ-live tuple b for which tp<sub>B<sub>k</sub></sub>[b] has been defined, there exists a similar tuple a which consists of elements of A. Given an existential requirement ϕ<sup>∃</sup><sub>i</sub> of ϕ and a tuple b ∈ α<sub>i</sub><sup>B<sub>k</sub></sup> which contains the element e, we say that b is an i-defect if there exists no witness for ϕ<sup>∃</sup><sub>i</sub> and b in the model B<sub>k</sub>. By construction, for each i-defect b we can find a tuple a of elements of A so that b and a are similar. In particular a ∈ α<sub>i</sub><sup>A</sup>, and hence there exists a witness c for ϕ<sup>∃</sup><sub>i</sub> and a in A; let W<sub>b,i</sub> denote the set of elements in c which were not contained in a. Without loss of generality we will assume that the sets W<sub>b,i</sub> are pairwise disjoint. Now we will define B<sub>k+1</sub> as follows.

– The domain of the model is

$$B_k \cup \bigcup_{i \in I} \; \bigcup_{\overline{b} \text{ an } i\text{-defect}} W_{\overline{b},i}$$


$$\text{tp}\_{\mathfrak{B}\_{k+1}}[d\_1, \dots, d\_r, c\_1, \dots, c\_n] = \text{tp}\_{\mathfrak{A}}[p(d\_1), \dots, p(d\_r), c\_1, \dots, c\_n],$$

where (d<sub>1</sub>, . . . , d<sub>r</sub>) enumerates all the elements occurring in b and p : b → a.

– For every tuple (b<sub>1</sub>, . . . , b<sub>n</sub>) and an n-ary relation R for which we have not yet defined whether (b<sub>1</sub>, . . . , b<sub>n</sub>) belongs to R<sup>B<sub>k+1</sub></sup>, we will simply define that it does not belong to it.

This completes the construction of the models B<sub>1</sub>, . . . , B<sub>m</sub>. To bound the size of B<sub>m</sub>, we first note that |B<sub>k+1</sub>| ≤ |B<sub>k</sub>| + |ϕ||D<sub>k</sub>|, where D<sub>k</sub> denotes the number of defects in B<sub>k</sub>. By construction, for every defect (d<sub>1</sub>, . . . , d<sub>r</sub>) of B<sub>k</sub> the set {d<sub>1</sub>, . . . , d<sub>r</sub>} is a σ-live set which is not contained in B<sub>ℓ</sub>, for any ℓ < k. If k = 1, then the number of such σ-live sets is one, and if k > 1, then the number of such σ-live sets is D<sub>k−1</sub>. Since each σ-live set is of size at most |ϕ|, there are at most |ϕ| · |ϕ|<sup>|ϕ|</sup> · D<sub>k−1</sub> = 2<sup>|ϕ|<sup>O(1)</sup></sup>D<sub>k−1</sub> defects in B<sub>k</sub>, i.e., D<sub>k</sub> ≤ 2<sup>|ϕ|<sup>O(1)</sup></sup>D<sub>k−1</sub>. Since m ≤ |ϕ|, we have that D<sub>k</sub> ≤ 2<sup>|ϕ|<sup>O(1)</sup></sup>, for any k < m, and hence |B<sub>m</sub>| ≤ 2<sup>|ϕ|<sup>O(1)</sup></sup>.
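The recurrence can be unrolled explicitly. The following is a sketch in which p is our own name for a polynomial satisfying |ϕ| · |ϕ|<sup>|ϕ|</sup> ≤ 2<sup>p(|ϕ|)</sup>; such a polynomial exists because |ϕ|<sup>|ϕ|+1</sup> = 2<sup>(|ϕ|+1) log<sub>2</sub> |ϕ|</sup>.

$$
D_1 \le 2^{p(|\varphi|)}, \qquad D_k \le 2^{p(|\varphi|)} D_{k-1} \quad\Longrightarrow\quad D_k \le 2^{k\,p(|\varphi|)} \le 2^{|\varphi|\,p(|\varphi|)} \quad (k < m \le |\varphi|),
$$

$$
|B_m| \le |B_1| + |\varphi| \sum_{k=1}^{m-1} D_k \le 1 + |\varphi|^2\, 2^{|\varphi|\,p(|\varphi|)} = 2^{|\varphi|^{O(1)}}.
$$

The bound stays singly exponential only because the recurrence is iterated fewer than m ≤ |ϕ| times.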

Thus what remains to be proven is that (B<sub>m</sub>, e) is a (P, π)-witness for ϕ. Here the only non-trivial requirement that we need to verify is that B<sub>m</sub> satisfies the second item in Definition 9. So, let ϕ<sup>∃</sup><sub>i</sub> be an existential requirement and let b = (b<sub>1</sub>, . . . , b<sub>n</sub>) ∈ α<sub>i</sub><sup>B<sub>m</sub></sup> be a tuple which contains e. We can clearly assume that n < m. It suffices to show that b is contained in B<sub>k</sub>, for some k < m, since then by construction we know that it has a witness in B<sub>m</sub>.

Aiming for a contradiction, suppose that b is contained in B<sub>m</sub>, but it is not contained in B<sub>k</sub> for any k < m. By construction we know that, since b is σ-live, we assigned a table to some tuple (b′<sub>1</sub>, . . . , b′<sub>r</sub>), where (b′<sub>1</sub>, . . . , b′<sub>r</sub>) enumerates the set of distinct elements of (b<sub>1</sub>, . . . , b<sub>n</sub>). Again, by construction we know that we assigned a table to the tuple (b′<sub>1</sub>, . . . , b′<sub>r</sub>), because we wanted to provide a witness for some tuple (d<sub>1</sub>, . . . , d<sub>s</sub>), which contains e and for which {d<sub>1</sub>, . . . , d<sub>s</sub>} is a strict subset of {b′<sub>1</sub>, . . . , b′<sub>r</sub>}.<sup>3</sup>

Now observe that (d<sub>1</sub>, . . . , d<sub>s</sub>) is a σ-live tuple containing e, which is contained in B<sub>m−1</sub> but is not contained in B<sub>k</sub> for any k < m − 1. Indeed, if it were contained in B<sub>k</sub>, for some k < m − 1, then by construction we would have provided a witness for it in the model B<sub>k+1</sub>, i.e., (b<sub>1</sub>, . . . , b<sub>n</sub>) would have been contained in B<sub>k+1</sub>. But now we are in a position which is the same as the one that we started in; in particular, we can repeat the above argument. After repeating the argument (at least) (n − 1) times we would end up with the conclusion that e is contained in some B<sub>k</sub>, where k > 1, but it is not contained in B<sub>1</sub>, which would be an obvious contradiction.

<sup>3</sup> If it were not, there would have been no need to provide a witness for it.

Now we can prove the main theorem of this section.

Theorem 2. The satisfiability problem of UGF is NExpTime-complete.

Proof. The lower bound follows from the proof of Theorem 3 in [15]. We will give an informal description of a nondeterministic procedure running in exponential time which determines whether a given sentence ϕ ∈ UGF is satisfiable. It starts by converting ϕ into an equi-satisfiable sentence ϕ′ ∈ UGF in normal form, after which it guesses a set of 1-types P over the vocabulary of ϕ′ and for each π ∈ P a (P, π)-witness (A, c) for ϕ′, where the size of A is at most 2<sup>|ϕ|<sup>O(1)</sup></sup>. Lemmas 6, 7, 8 and 9 guarantee that this procedure is correct. Since |P| ≤ 2<sup>|ϕ|</sup>, the algorithm runs in exponential time with respect to |ϕ|.

### 6 Conclusions

In this paper we have proved two results of quite distinct flavour on uniform guarded fragments. The first result was that although GF fails to have Craig interpolation, its one-dimensional uniform fragment does have it. The second result was that the satisfiability problem of the uniform guarded fragment is NExpTime-complete. The results presented in this paper suggest several new research questions, but here we will mention just two of them.

The first question is whether or not uniform GF has the Craig interpolation property. While the correctness of the amalgam construction presented in Section 4 rests on the assumption of one-dimensionality, we have not been able to show that uniform GF does not have the Craig interpolation property. This has led the author to conjecture that uniform GF does in fact have the Craig interpolation property.

The second question is whether or not uniform GF has the exponential model property (note that if uniform GF had the exponential model property, then one would obtain Theorem 2 for free). As we saw in the proof of Lemma 9, the requirement of uniformity essentially prevents uniform GF from enforcing long paths, and this seems to suggest that uniform GF can only enforce exponentially long paths (which it can enforce, since it contains standard modal logic with the global diamond). Because of this, the author conjectures that uniform GF has the exponential model property.

### Acknowledgements

The author wishes to thank Bartosz Bednarczyk for several helpful discussions on interpolation and fragments of first-order logic, and for suggesting the problem of determining the complexity of the satisfiability problem of the uniform guarded fragment. The author also wishes to thank Antti Kuusisto for pointing out rather silly mistakes in the original definitions of uniformity and the uniform guarded bisimulation. Finally, the author wishes to thank the anonymous reviewers for their useful remarks which improved the presentation of this paper.

### References



### Sweedler Theory of Monads

Dylan McDermott<sup>1</sup>, Exequiel Rivas<sup>2</sup>, and Tarmo Uustalu<sup>1,2</sup>

<sup>1</sup> Dept. of Computer Science, Reykjavik University, Reykjavik, Iceland

<sup>2</sup> Dept. of Software Science, Tallinn University of Technology, Tallinn, Estonia

dylanm@ru.is, exequiel.rivas@ttu.ee, tarmo@ru.is

Abstract. Monad-comonad interaction laws are a mathematical concept for describing communication protocols between effectful computations and coeffectful environments in the paradigm where notions of effectful computation are modelled by monads and notions of coeffectful environment by comonads. We show that monad-comonad interaction laws are an instance of measuring maps from Sweedler theory for duoidal categories whereby the final interacting comonad for a monad and a residual monad arises as the Sweedler hom and the initial residual monad for a monad and an interacting comonad as the Sweedler copower. We then combine this with a (co)algebraic characterization of monad-comonad interaction laws to derive descriptions of the Sweedler hom and the Sweedler copower in terms of their coalgebras resp. algebras.

Keywords: (co)monads · (co)algebras · interaction laws · runners · duoidal categories · Sweedler operations

### 1 Introduction

The monad-comonad interaction laws of Katsumata et al. [16] are a mathematical concept for formalizing ways in which effectful programs (e.g., programs reading from and writing to a store, programs making nondeterministic choices) can be run. The idea is that effectful programs issue requests to the outside world; they can thus run on machines that can service such requests. Programs denote computations, machines implement environments. Notions of computation are modelled by monads in the manner first explained by Moggi [23], while notions of environment can be modelled by comonads. Interaction laws model protocols of cooperation between computations and environments. Ideally, interaction should result in a return value and a final state. But it may be that some effects cannot be serviced, in which case interaction yields a residual computation of a return value and a final state; another monad is then needed to model the suitable notion of residual computation. A monad-comonad interaction law is therefore given by a monad T, a comonad D and a monad R on a symmetric monoidal category with a family of maps TX ⊗ DY → R(X ⊗ Y) natural in X and Y and agreeing with the (co)units and (co)multiplications. If R = Id, we have a non-residual interaction law.

It is natural to ask for useful methods for recognizing and constructing monad-comonad interaction laws. Specifically, it would be useful to find: a final monad for a given interacting comonad and residual monad; a final interacting comonad for a given monad and residual monad; or an initial residual monad for a given monad and interacting comonad.

In this paper, we show how to find these universal (co)monads, elaborating on some ideas and results from prior work on interaction [16,33]. We emphasize that the most important structural foundation for interaction laws is the duoidal [10,2] interrelationship of the composition and Day convolution monoidal structures in endofunctor categories. It is so significant that some central statements about interaction laws can be made on the level of monoids and comonoids in general symmetric closed duoidal categories, completely suppressing any specifics about monads and comonads. In fact, it turns out that monad-comonad interaction laws are an instance of measuring maps from the Sweedler theory for duoidal categories as developed by López Franco and Vasilakopoulou [20]. The universal (co)monads are instances of the operations studied in this theory. In particular, the final interacting comonad is an instance of the Sweedler hom and the initial residual monad is an instance of the Sweedler copower.

To obtain results about monad-comonad interaction specifically, we combine this general perspective with the characterization of monad-comonad interaction laws by Uustalu and Voorneveld [33] as functors between the categories of (co)algebras of the (co)monads involved. This allows us to describe the Sweedler hom and the Sweedler power via their categories of (co)algebras in terms of what we call stateful and continuation-based runners.

We also discuss an enriched version of monad-comonad interaction laws, of which strong monad-comonad interaction laws are a special case. In this case, both kinds of runners of an enriched monad on a self-enriched category can be viewed as its algebras in another enriched category.

The paper is organized as follows. First, in Sect. 2, we review the basics of monad-comonad interaction laws. In Sect. 3, we show that monad-comonad interaction laws, the universal interacting comonad and the universal residual monad are an instance of measuring maps, the Sweedler hom and the Sweedler copower in symmetric closed duoidal categories. We then review the (co)algebraic perspective on monad-comonad interaction laws in Sect. 4, and apply it to derive (co)algebraic characterizations of the Sweedler hom and the Sweedler copower in Sect. 5. In Sect. 6, we comment on enriched monad-comonad interaction laws. We review some background category theory literature and related semantics work in Sect. 7. New material is primarily in Sects. 5, 6; some statements in Sect. 4 are also new.

We assume that the reader is familiar with the use of (strong) monads in mathematical semantics to model notions of effectful computation, and with the basics of the categorical machinery we need (monads and comonads, symmetric monoidal closed categories, accessibility [21,1], enrichment [17]).

### 2 Monad-Comonad Interaction Laws

We begin by reviewing the basics of monad-comonad interaction laws [16].

Consider a symmetric monoidal closed category (C, I, ⊗,⊸), e.g., a Cartesian monoidal closed category, e.g., Set.

A (residual) functor-functor interaction law is given by endofunctors F, G, H on C together with a family of maps

$$
\phi\_{X,Y}: FX \otimes GY \to H(X \otimes Y),
$$

natural in X, Y. We speak of a non-residual interaction law when H = Id. A map between (residual) functor-functor interaction laws (F, G, H, ϕ) and (F′, G′, H′, ϕ′) is given by natural transformations f : F → F′, g : G′ → G and h : H → H′ satisfying the equation

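$$
\begin{array}{ccccc}
FX \otimes G'Y & \xrightarrow{FX \otimes g_Y} & FX \otimes GY & \xrightarrow{\phi_{X,Y}} & H(X \otimes Y) \\
{\scriptstyle f_X \otimes G'Y} \big\downarrow & & & & \big\downarrow {\scriptstyle h_{X \otimes Y}} \\
F'X \otimes G'Y & & \xrightarrow{\qquad \phi'_{X,Y} \qquad} & & H'(X \otimes Y)
\end{array}
$$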

Functor-functor interaction laws form a category that has a monoidal structure based on endofunctor composition.

A (residual) monad-comonad interaction law is given by a monad T, a comonad D and a monad R on C with a family of maps

$$
\psi\_{X,Y} : TX \otimes DY \to R(X \otimes Y),
$$

natural in X, Y, that additionally satisfies the equations

$$
\begin{aligned}
\psi_{X,Y} \circ (\eta_X \otimes DY) &= \eta^R_{X \otimes Y} \circ (X \otimes \varepsilon_Y) &&: X \otimes DY \to R(X \otimes Y), \\
\psi_{X,Y} \circ (\mu_X \otimes DY) &= \mu^R_{X \otimes Y} \circ R\psi_{X,Y} \circ \psi_{TX,DY} \circ (TTX \otimes \delta_Y) &&: TTX \otimes DY \to R(X \otimes Y).
\end{aligned}
$$

(Every such interaction law gives a functor-functor interaction law (UT, UD, UR, ψ), where U sends (co)monads to their underlying functors.) When R = Id, we speak of a non-residual interaction law. A map between (residual) monad-comonad interaction laws (T, D, R, ψ) and (T′, D′, R′, ψ′) is given by a monad map T → T′, a comonad map D′ → D and a monad map R → R′ that make a map between the underlying functor-functor interaction laws. Monad-comonad interaction laws form a category isomorphic to the category of monoid objects in the category of functor-functor interaction laws.

Example 1. Let C = Set (or any SMCC). Take TX = S ⇒ (S × X) (the state monad) and DX = S′ × (S′ ⇒ X) (the costate comonad). There is a non-residual monad-comonad interaction law of T, D when S = S′ and more generally when S, S′ come with a lens structure get : S′ → S, put : S′ × S → S′; in fact, these laws are in bijection with lenses.
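For the case S = S′ in Set, the interaction law can be sketched concretely. The encoding below (computations as functions returning a pair, environments as pairs of a state and a function) and all function names are our own choices for illustration; the sample data only spot-checks the unit and multiplication equations on one input and is not a proof.

```python
# T X = S -> (S, X)   (state monad),   D Y = (S, S -> Y)   (costate comonad)

def unit(x):
    """eta_X : X -> T X."""
    return lambda s: (s, x)


def join(tt):
    """mu_X : T T X -> T X; run the outer computation, then the inner one."""
    def run(s):
        s1, t = tt(s)
        return t(s1)
    return run


def counit(d):
    """epsilon_Y : D Y -> Y."""
    s, g = d
    return g(s)


def duplicate(d):
    """delta_Y : D Y -> D D Y."""
    s, g = d
    return (s, lambda s1: (s1, g))


def interact(t, d):
    """psi_{X,Y} : T X x D Y -> X x Y (non-residual, R = Id)."""
    s, g = d
    s1, x = t(s)          # run the computation from the environment's state
    return (x, g(s1))     # let the environment observe the final state


# Sample data (hypothetical): S = int.
tt = lambda s: (s + 1, lambda s1: (s1 * 2, f"x{s1}"))
d = (3, lambda s: s + 10)

lhs_unit = interact(unit("a"), d)
rhs_unit = ("a", counit(d))
lhs_mult = interact(join(tt), d)
rhs_mult = interact(*interact(tt, duplicate(d)))
```

On this input both sides of the unit equation and both sides of the multiplication equation agree.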

Let C = Set (or any extensive category that also has the relevant initial algebras and final coalgebras). Take FX = 1 + X<sup>2</sup> and T the free monad on F, so TX ≅ µX′. X + 1 + X′<sup>2</sup> (leaf-labelled nullary-binary trees). The only comonad D that can interact with T non-residually is DY ≅ 0. If we take RZ = 1 + Z, we have an R-residual interaction law of T and D for example for DY ≅ νY′. Y × (2 × Y′) (node-labelled bitstreams), i.e., the cofree comonad for GY = 2 × Y.
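One possible residual law for this pair can be sketched as follows. The particular law is our own illustrative choice (the text only states that some such law exists): the bits of the stream steer the descent through the tree, a leaf returns its label together with the current stream label, and a nullary node yields the extra point of RZ = 1 + Z, represented by `None`. The tuple encoding of trees and the thunk encoding of streams are assumptions of this sketch.

```python
def run(tree, stream):
    """psi : T X x D Y -> 1 + (X x Y), with None as the extra point.

    Trees are ("leaf", x), ("stop",) or ("node", left, right); a stream is a
    thunk returning (label, bit, rest_of_stream).
    """
    while True:
        label, bit, rest = stream()       # unfold one step of the stream
        if tree[0] == "leaf":
            return (tree[1], label)       # value together with the current label
        if tree[0] == "stop":
            return None                   # the effect cannot be serviced
        tree = tree[2] if bit else tree[1]  # the bit picks the right or left subtree
        stream = rest


def constant(label, bit):
    """The stream repeating the same label and bit forever."""
    def s():
        return (label, bit, s)
    return s


def alternating(n=0):
    """Labels count positions; the bit is True at even positions."""
    return lambda: (n, n % 2 == 0, alternating(n + 1))


tree = ("node", ("leaf", "L"), ("node", ("stop",), ("leaf", "R")))
went_left = run(tree, constant("e", False))
went_right = run(tree, constant("e", True))
got_stuck = run(tree, alternating())
```

A leaf does not consume a stream position, which is what makes the law compatible with the unit of T and the counit of D.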

See [16,33] for further examples and their intuitive meaning for semantics.

Some equivalent formulations of interaction laws will be useful. Due to the bijections

$$
\begin{array}{c}
FX \otimes GY \to H(X \otimes Y) \text{ nat. in } X, Y \\ \hline\hline
\mathbb{C}(X \otimes Y, Z) \to \mathbb{C}(FX \otimes GY, HZ) \text{ nat. in } X, Y, Z \\ \hline\hline
\mathbb{C}(X, Y \multimap Z) \to \mathbb{C}(FX, GY \multimap HZ) \text{ nat. in } X, Y, Z \\ \hline\hline
F(Y \multimap Z) \to GY \multimap HZ \text{ nat. in } Y, Z
\end{array}
$$

an H-residual functor-functor interaction law of F, G is the same as a family of maps

$$
\phi\_{Y,Z} : F(Y \multimap Z) \to GY \multimap HZ
$$

natural in Y, Z. Under this view, the equation required of a functor-functor interaction law map (f, g, h) between (F, G, H, ϕ) and (F′, G′, H′, ϕ′) becomes

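$$
\begin{array}{ccc}
F(Y \multimap Z) & \xrightarrow{\phi_{Y,Z}} & GY \multimap HZ \\
{\scriptstyle f_{Y \multimap Z}} \big\downarrow & & \big\downarrow {\scriptstyle g_Y \multimap h_Z} \\
F'(Y \multimap Z) & \xrightarrow{\phi'_{Y,Z}} & G'Y \multimap H'Z
\end{array}
$$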
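In Set, the passage between the uncurried and the curried formulation of an interaction law can be sketched for the non-residual state/costate law with S = S′. The helper names `fmap_T`, `curry_law` and `uncurry_law` are hypothetical, and the round trip is only spot-checked on sample data.

```python
def interact(t, d):
    """phi_{X,Y} : T X x D Y -> X x Y for T X = S -> (S, X), D Y = (S, S -> Y)."""
    s, g = d
    s1, x = t(s)
    return (x, g(s1))


def fmap_T(h, t):
    """Functor action of T X = S -> (S, X)."""
    def run(s):
        s1, x = t(s)
        return (s1, h(x))
    return run


def curry_law(phi):
    """From phi_{X,Y} : T X x D Y -> X x Y to a map T(Y => Z) -> (D Y => Z)."""
    def phi_c(t):
        def on_env(d):
            k, y = phi(t, d)      # the instance at X = (Y => Z)
            return k(y)           # evaluation (Y => Z) x Y -> Z
        return on_env
    return phi_c


def uncurry_law(phi_c):
    """Inverse direction, using the instance at Z = X x Y."""
    def phi(t, d):
        pair_with = lambda x: (lambda y: (x, y))   # X -> (Y => X x Y)
        return phi_c(fmap_T(pair_with, t))(d)
    return phi


# Sample data (hypothetical): S = int.
t = lambda s: (s + 1, s * 2)
t2 = lambda s: (s + 1, lambda y: y + s)
d = (5, lambda s: s - 1)

curried_result = curry_law(interact)(t2)(d)
round_trip = uncurry_law(curry_law(interact))(t, d)
```

Since H = Id here, applying H to the evaluation map is just evaluation; for a general H the curried map would postcompose with H applied to evaluation.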

An R-residual monad-comonad interaction law of T, D is the same as a family of maps

$$
\psi\_{Y,Z} : T(Y \multimap Z) \to DY \multimap RZ
$$

natural in Y , Z satisfying

$$
\begin{aligned}
\psi_{Y,Z} \circ \eta_{Y \multimap Z} &= \varepsilon_Y \multimap \eta^R_Z &&: Y \multimap Z \to DY \multimap RZ, \\
\psi_{Y,Z} \circ \mu_{Y \multimap Z} &= (\delta_Y \multimap \mu^R_Z) \circ \psi_{DY,RZ} \circ T\psi_{Y,Z} &&: TT(Y \multimap Z) \to DY \multimap RZ.
\end{aligned}
$$

Suppose F, G, H : C → C are such that the coends and ends

$$\begin{array}{ll} (F \star G)\, Z = \int^{X,Y} \mathbb{C}(X \otimes Y, Z) \bullet (FX \otimes GY) & = \int^{Y} F(Y \multimap Z) \otimes GY \\ (G \mathbin{-\!\star} H)\, X = \int_{Y,Z} \mathbb{C}(X, Y \multimap Z) \pitchfork (GY \multimap HZ) & = \int_{Y} GY \multimap H(X \otimes Y) \end{array}$$

exist. (F ⋆ G is called the Day convolution.) Then, because of the bijections

$$
\begin{array}{c}
\int^{X,Y} \mathbb{C}(X \otimes Y, Z) \bullet (FX \otimes GY) \to HZ \text{ nat. in } Z \\ \hline\hline
\mathbb{C}(X \otimes Y, Z) \to \mathbb{C}(FX \otimes GY, HZ) \text{ nat. in } X, Y, Z \\ \hline\hline
\mathbb{C}(X, Y \multimap Z) \to \mathbb{C}(FX, GY \multimap HZ) \text{ nat. in } X, Y, Z \\ \hline\hline
FX \to \int_{Y,Z} \mathbb{C}(X, Y \multimap Z) \pitchfork (GY \multimap HZ) \text{ nat. in } X
\end{array}
$$

an H-residual functor-functor interaction law of F, G turns out to be the same as a natural transformation F ⋆ G → H or F → G −⋆ H. An R-residual monad-comonad interaction law of T, D is the same as a natural transformation UT ⋆ UD → UR satisfying certain equations and also, as a particularly concise characterization, the same as a monad map T → D −⋆ R where D −⋆ R is a certain canonical monad with UD −⋆ UR as the underlying functor.

Now, if C is locally presentable and F, G, H are accessible, then F ⋆ G and G −⋆ H are guaranteed to exist and be accessible. Writing [C, C]<sub>a</sub> for the category of accessible endofunctors on C, we obtain functors ⋆ : [C, C]<sub>a</sub> × [C, C]<sub>a</sub> → [C, C]<sub>a</sub> and −⋆ : [C, C]<sub>a</sub><sup>op</sup> × [C, C]<sub>a</sub> → [C, C]<sub>a</sub>. Together with J ∈ [C, C]<sub>a</sub> defined by JZ = C(I, Z) • I, the functor ⋆ equips [C, C]<sub>a</sub> with a symmetric monoidal structure. We also get that − ⋆ G ⊣ G −⋆ −, i.e., this structure is closed.<sup>3</sup> The functor −⋆ : [C, C]<sub>a</sub><sup>op</sup> × [C, C]<sub>a</sub> → [C, C]<sub>a</sub> is lax monoidal wrt. the composition monoidal structure on [C, C]<sub>a</sub>. That UD −⋆ UR carries a monad structure if D is an accessible comonad and R is an accessible monad is a consequence of this.

These observations suggest the possibility of abstraction by switching to a more general setting. Instead of considering [C, C]<sub>a</sub>, we can consider an arbitrary category D equipped with a monoidal structure and a symmetric monoidal structure that suitably agree. The appropriate notion of agreement is duoidality [10,2]. We will next consider this abstraction and see that monad-comonad interaction laws are the measuring maps of an instance of López Franco and Vasilakopoulou's Sweedler theory for duoidal categories [20].

### 3 Sweedler Theory for Duoidal Categories

We review the Sweedler theory for duoidal categories [20] and show that monads provide an instance.

Assume a symmetric duoidal category (D, I, ⋄, J, ⋆), i.e., a symmetric monoidal category in MonCAT<sub>oplax</sub>, that is also closed in the sense that − ⋆ G has a right adjoint G −⋆ − in CAT. Explicitly, this means that we have a category D equipped with a monoidal structure (I, ⋄), a symmetric monoidal closed structure (J, ⋆, −⋆) and structural laws

$$\begin{array}{ccc} J \to I & & J \to J \diamond J \\ I \star I \to I & & (F \diamond G) \star (H \diamond K) \to (F \star H) \diamond (G \star K) \end{array}$$

satisfying appropriate equations witnessing oplax monoidality of J : 1 → D and ⋆ : D × D → D as functors between monoidal categories for the (I, ⋄) monoidal structure on D.

<sup>3</sup> If C is locally κ-presentable with the κ-presentable objects closed under I and ⊗, then the κ-accessible endofunctors on C form a monoidal category with ⋆ as tensor. Garner and López Franco [13, Sect. 8.1] show that this monoidal category is closed, but their closed structure is different from ours. Our G −⋆ H has the property that natural transformations F → G −⋆ H are H-residual functor-functor interaction laws of F, G even if F is not accessible; this is not the case for Garner and López Franco's. This is why we do not restrict to a fixed κ, and instead use all of [C, C]<sub>a</sub>.

The internal hom object F −⋆ I is called the dual of F. Stretching this terminology, the object F −⋆ H can be called the dual of F wrt. H.

We write Mon(D) (respectively Comon(D)) for the categories of monoids (resp. comonoids) in D wrt. the (I, ⋄) monoidal structure.

The composition monoidal and Day convolution symmetric monoidal closed structures (Id, ·) and (J, ⋆, −⋆) on [C, C]<sub>a</sub> yield an example of such a symmetric duoidal category D. The categories Mon([C, C]<sub>a</sub>) and Comon([C, C]<sub>a</sub>) are those of accessible monads and comonads.

The object J has a comonoid structure J → I, J → J ⋄ J, and the functor −⋆ : D<sup>op</sup> × D → D is lax monoidal wrt. the (I, ⋄) monoidal structure. The operations

$$\begin{array}{l} \star : \mathbb{D} \times \mathbb{D} \to \mathbb{D} \\ {-\!\star} : \mathbb{D}^{\mathrm{op}} \times \mathbb{D} \to \mathbb{D} \end{array}$$

lift to

$$\begin{array}{ll} \star : \mathbf{Comon}(\mathbb{D}) \times \mathbf{Comon}(\mathbb{D}) \to \mathbf{Comon}(\mathbb{D}) & \text{tensor of comonoids} \\ {-\!\star} : (\mathbf{Comon}(\mathbb{D}))^{\mathrm{op}} \times \mathbf{Mon}(\mathbb{D}) \to \mathbf{Mon}(\mathbb{D}) & \text{power of a monoid} \end{array}$$

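in the sense that U(D<sub>0</sub> ⋆ D<sub>1</sub>) = UD<sub>0</sub> ⋆ UD<sub>1</sub> and U(D −⋆ R) = UD −⋆ UR for the forgetful functors U to D,

via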

$$\varepsilon \;=\; D_0 \star D_1 \xrightarrow{\varepsilon_0 \star \varepsilon_1} I \star I \longrightarrow I$$

$$\delta \;=\; D_0 \star D_1 \xrightarrow{\delta_0 \star \delta_1} (D_0 \diamond D_0) \star (D_1 \diamond D_1) \longrightarrow (D_0 \star D_1) \diamond (D_0 \star D_1)$$

$$\eta \;=\; I \longrightarrow I \mathbin{-\!\star} I \xrightarrow{\varepsilon \mathbin{-\!\star} \eta^R} D \mathbin{-\!\star} R$$

$$\mu \;=\; (D \mathbin{-\!\star} R) \diamond (D \mathbin{-\!\star} R) \longrightarrow (D \diamond D) \mathbin{-\!\star} (R \diamond R) \xrightarrow{\delta \mathbin{-\!\star} \mu^R} D \mathbin{-\!\star} R$$

Comonoid maps D<sub>0</sub> ⋆ D<sub>1</sub> → D are the same as maps ψ : UD<sub>0</sub> ⋆ UD<sub>1</sub> → UD satisfying

$$\begin{array}{rcl} \varepsilon \circ \psi &=& D_0 \star D_1 \xrightarrow{\varepsilon_0 \star \varepsilon_1} I \star I \longrightarrow I \\ \delta \circ \psi &=& D_0 \star D_1 \xrightarrow{\delta_0 \star \delta_1} (D_0 \diamond D_0) \star (D_1 \diamond D_1) \longrightarrow (D_0 \star D_1) \diamond (D_0 \star D_1) \xrightarrow{\psi \diamond \psi} D \diamond D \end{array}$$

(omitting the Us in the equations). Such maps ψ could be called D-residual comonoid-comonoid interaction laws of D<sub>0</sub>, D<sub>1</sub>.

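Monoid maps T → D −⋆ R are in bijection with maps ψ : UT ⋆ UD → UR that satisfy

$$\begin{array}{rcl} \psi \circ (\eta^T \star D) &=& I \star D \xrightarrow{I \star \varepsilon} I \star I \longrightarrow I \xrightarrow{\eta^R} R \\ \psi \circ (\mu^T \star D) &=& (T \diamond T) \star D \xrightarrow{(T \diamond T) \star \delta} (T \diamond T) \star (D \diamond D) \longrightarrow (T \star D) \diamond (T \star D) \xrightarrow{\psi \diamond \psi} R \diamond R \xrightarrow{\mu^R} R \end{array}$$

(again omitting the Us in the equations), which are known as measuring maps from T to R by D and which we can also call R-residual monoid-comonoid interaction laws of T, D.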

The three Sweedler operations

$$\begin{array}{ll} \mathcal{C} : (\mathbf{Comon}(\mathbb{D}))^{\mathrm{op}} \times \mathbf{Comon}(\mathbb{D}) \to \mathbf{Comon}(\mathbb{D}) & \text{internal hom of comonoids} \\ \rhd : \mathbf{Comon}(\mathbb{D}) \times \mathbf{Mon}(\mathbb{D}) \to \mathbf{Mon}(\mathbb{D}) & \text{Sweedler copower of a monoid} \\ \mathcal{M} : (\mathbf{Mon}(\mathbb{D}))^{\mathrm{op}} \times \mathbf{Mon}(\mathbb{D}) \to \mathbf{Comon}(\mathbb{D}) & \text{Sweedler hom of monoids (univ. measuring comonoid)} \end{array}$$

are everywhere defined by the following adjunctions if the adjoints exist.

$$\begin{array}{lll} \text{left adjoint} & \text{right adjoint} & \\ {-} \star D_1 : \mathbf{Comon}(\mathbb{D}) \to \mathbf{Comon}(\mathbb{D}) & \mathcal{C}(D_1, {-}) : \mathbf{Comon}(\mathbb{D}) \to \mathbf{Comon}(\mathbb{D}) & ({-} \star D_1 \dashv \mathcal{C}(D_1, {-})) \\ D \rhd {-} : \mathbf{Mon}(\mathbb{D}) \to \mathbf{Mon}(\mathbb{D}) & D \mathbin{-\!\star} {-} : \mathbf{Mon}(\mathbb{D}) \to \mathbf{Mon}(\mathbb{D}) & (D \rhd {-} \dashv D \mathbin{-\!\star} {-}) \\ {-} \rhd T : \mathbf{Comon}(\mathbb{D}) \to \mathbf{Mon}(\mathbb{D}) & \mathcal{M}(T, {-}) : \mathbf{Mon}(\mathbb{D}) \to \mathbf{Comon}(\mathbb{D}) & ({-} \rhd T \dashv \mathcal{M}(T, {-})) \end{array}$$

They are defined for specific pairs of (co)monoids if the universal objects specified by the following bijections exist.

$$\begin{array}{c} D_0 \star D_1 \to D \\ \hline\hline D_0 \to \mathcal{C}(D_1, D) \end{array} \qquad\qquad \begin{array}{c} UT \star UD \to UR \text{ meas.} \\ \hline\hline T \to D \mathbin{-\!\star} R \\ \hline\hline D \rhd T \to R \\ \hline\hline D \to \mathcal{M}(T, R) \end{array}$$

The comonoid M(T, I) is called the Sweedler dual of the monoid T.

By definition, the comonoid C(D<sub>1</sub>, D) is the final comonoid interacting with the comonoid D<sub>1</sub> D-residually. The Sweedler hom M(T, R) is the final R-residually interacting comonoid for the monoid T. The Sweedler copower D ▷ T is the initial residual monoid for monoid-comonoid interactions of T and D.

If the Sweedler operations are everywhere defined, for which it suffices that D is locally presentable [20, Thm. 20], then the category (Comon(D), J, ⋆, C) is symmetric monoidal closed and the category (Mon(D), ▷, −⋆, M) is copowered, powered and enriched over (Comon(D), J, ⋆, C). However, local presentability of C is not enough for local presentability (or even accessibility) of [C, C]<sub>a</sub> (for example, [Set, Set]<sub>a</sub> is not accessible). In Sect. 5, we return to the question of everywhere-definedness of the Sweedler operations for [C, C]<sub>a</sub>.

The Sweedler theory perspective allows us to establish some facts about interaction laws of free monads very easily. For example, we can straightforwardly derive a characterization of measuring maps from the free monoid F <sup>∗</sup> on F (assuming it exists).

Proposition 1. Measuring maps U(F<sup>∗</sup>) ⋆ UD → UR are in bijection with maps F ⋆ UD → UR.

Proof. This is witnessed by the following chain of bijections.

$$\begin{array}{c} F \star UD \to UR \\ \hline\hline F \to UD \mathbin{-\!\star} UR \\ \hline\hline F \to U(D \mathbin{-\!\star} R) \\ \hline\hline F^{*} \to D \mathbin{-\!\star} R \\ \hline\hline U(F^{*}) \star UD \to UR \text{ meas.} \end{array}$$
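To make the bijection of Prop. 1 concrete, here is a minimal Python sketch for C = Set, F X = X<sup>2</sup> and R = Id. It extends a one-step law X<sup>2</sup> × DY → X × Y to a law on the free monad F<sup>∗</sup> (leaf-labelled binary trees) using only the counit and comultiplication of D. The choice of D as the store comonad DY = (S → Y) × S with S the natural numbers, the particular one-step law `phi`, and all identifiers are illustrative assumptions, not constructions from the paper.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Leaf:
    value: Any      # label in X


@dataclass(frozen=True)
class Node:
    left: Any       # component 0 of the pair in (F*X)^2
    right: Any      # component 1 of the pair in (F*X)^2


# Assumed comonad: the store comonad D Y = (S -> Y) x S, with S = int.
def extract(d):
    f, s = d
    return f(s)


def duplicate(d):
    f, s = d
    return (lambda s2: (f, s2), s)


# Assumed one-step law phi : X^2 x D Y -> X x Y, natural in X and Y.
# The lowest bit of the position picks a component; the remaining bits
# become the new position.
def phi(pair, d):
    f, s = d
    return (pair[s % 2], f(s // 2))


def extend(phi, extract, duplicate):
    """Extend a one-step law to psi : F*X x D Y -> X x Y (the case R = Id)."""
    def psi(tree, d):
        while isinstance(tree, Node):
            # Apply phi at X := F*X, Y := D Y to the duplicated environment.
            tree, d = phi((tree.left, tree.right), duplicate(d))
        return (tree.value, extract(d))
    return psi
```

With `psi = extend(phi, extract, duplicate)`, a tree is traversed along the binary digits of the store position, least significant bit first, and a leaf is paired with the value of the store at the remaining position.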

Similarly, we can calculate closed-form expressions for the Sweedler hom from a free monoid and the Sweedler copower of a free monoid. Here G† denotes the cofree comonoid on G (if it exists).

Proposition 2. (i) M(F<sup>∗</sup>, R) ≅ (F −⋆ UR)<sup>†</sup>. (ii) D ▷ F<sup>∗</sup> ≅ (F ⋆ UD)<sup>∗</sup>.

Proof. (i) As witnessed by the chain of bijections on the left below, comonoid maps D → M(F<sup>∗</sup>, R) and comonoid maps D → (F −⋆ UR)<sup>†</sup> are in bijection naturally in D. (ii) The chain of bijections on the right below composes to a bijection natural in R between monoid maps D ▷ F<sup>∗</sup> → R and monoid maps (F ⋆ UD)<sup>∗</sup> → R.

$$\begin{array}{c} D \to (F \mathbin{-\!\star} UR)^{\dagger} \\ \hline\hline UD \to F \mathbin{-\!\star} UR \\ \hline\hline F \to UD \mathbin{-\!\star} UR \\ \hline\hline F \to U(D \mathbin{-\!\star} R) \\ \hline\hline F^{*} \to D \mathbin{-\!\star} R \\ \hline\hline D \to \mathcal{M}(F^{*}, R) \end{array} \qquad\qquad \begin{array}{c} (F \star UD)^{*} \to R \\ \hline\hline F \star UD \to UR \\ \hline\hline F \to UD \mathbin{-\!\star} UR \\ \hline\hline F \to U(D \mathbin{-\!\star} R) \\ \hline\hline F^{*} \to D \mathbin{-\!\star} R \\ \hline\hline D \rhd F^{*} \to R \end{array}$$

Example 2. Let C = Set. (i) Take F = 0, then F<sup>∗</sup> ≅ Id. We can calculate F −⋆ UR ≅ 1, therefore M(F<sup>∗</sup>, R) ≅ Id, for any monad R.

Next take FX = X<sup>2</sup>, then F<sup>∗</sup>X ≅ µX′. X + X′<sup>2</sup> (these are leaf-labelled binary trees). We can calculate (F −⋆ UR)Y ≅ R(2 × Y), hence M(F<sup>∗</sup>, R)Y ≅ νY′. Y × R(2 × Y′) (node-labelled streams of bits for R = Id, node-labelled nonempty colists of bits for RZ = 1 + Z).
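For R = Id, the interaction between leaf-labelled binary trees and node-labelled streams of bits can be sketched in Python as follows. The lazy encoding of the final coalgebra νY′. Y × (2 × Y′) as a `Stream` record with a thunk, and the names `interact` and `alternating`, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class Leaf:
    value: Any          # label in X


@dataclass(frozen=True)
class Node:
    left: Any           # subtree taken on bit 0
    right: Any          # subtree taken on bit 1


@dataclass(frozen=True)
class Stream:
    label: Any                                  # the Y-label of this node
    step: Callable[[], Tuple[int, "Stream"]]    # lazily yields (bit, tail)


def interact(tree, stream):
    """Law F*X x DY -> X x Y: each Node consumes one bit of the stream."""
    while isinstance(tree, Node):
        bit, stream = stream.step()
        tree = tree.right if bit else tree.left
    return (tree.value, stream.label)


def alternating(n=0):
    """Stream labelled n, n+1, ...; the bit emitted at a node is the parity of its label."""
    return Stream(label=n, step=lambda: (n % 2, alternating(n + 1)))
```

A leaf interacts without consuming any bit and is paired with the current label, which mirrors the unit and counit condition of an interaction law.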

Finally, take FX = 1 + X<sup>2</sup>, then F<sup>∗</sup>X ≅ µX′. X + 1 + X′<sup>2</sup> (leaf-labelled nullary-binary trees). We calculate (F −⋆ UR)Y ≅ R0 × R(2 × Y), hence M(F<sup>∗</sup>, R)Y ≅ νY′. Y × R0 × R(2 × Y′). For R = Id, and more generally for any R such that R0 ≅ 0, this means that M(F<sup>∗</sup>, R) ≅ 0. For RZ = 1 + Z, we get M(F<sup>∗</sup>, R)Y ≅ νY′. Y × (1 + 2 × Y′) (node-labelled nonempty colists of bits).

(ii) Take F = 0, then F<sup>∗</sup> ≅ Id. We can calculate F ⋆ UD ≅ 0, hence D ▷ F<sup>∗</sup> ≅ Id, for any comonad D.

Take FX = X<sup>2</sup>, then F<sup>∗</sup>X ≅ µX′. X + X′<sup>2</sup>. We can calculate (F ⋆ UD)Z ≅ D(Z<sup>2</sup>), therefore (D ▷ F<sup>∗</sup>)Z ≅ µZ′. Z + D(Z′<sup>2</sup>).

Take FX = 1 + X<sup>2</sup>, then F<sup>∗</sup>X ≅ µX′. X + 1 + X′<sup>2</sup>. We can calculate (F ⋆ UD)Z ≅ D1 + D(Z<sup>2</sup>), therefore (D ▷ F<sup>∗</sup>)Z ≅ µZ′. Z + D1 + D(Z′<sup>2</sup>).
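The two calculations for FX = X<sup>2</sup> can be spelled out as follows. This is a sketch that assumes the (co)end formulas for ⋆ and −⋆ stated in Sect. 6, specialized to V = C = Set, and uses FX = X<sup>2</sup> ≅ Set(2, X) together with the Yoneda and co-Yoneda lemmas.

$$\begin{array}{rcl} (F \star UD)\,Z &\cong& \int^{Y} F(Y \multimap Z) \times DY \;=\; \int^{Y} (Y \multimap Z)^{2} \times DY \;\cong\; \int^{Y} \mathbf{Set}(Y, Z^{2}) \times DY \;\cong\; D(Z^{2}) \\[4pt] (F \mathbin{-\!\star} UR)\,Y &\cong& \int_{X} \mathbf{Set}(FX, R(X \times Y)) \;\cong\; \int_{X} \mathbf{Set}(\mathbf{Set}(2, X), R(X \times Y)) \;\cong\; R(2 \times Y) \end{array}$$

The case FX = 1 + X<sup>2</sup> follows in the same way, since both F ⋆ − and F −⋆ − turn the coproduct in F into a coproduct, respectively a product, of the contributions of the summands.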

These examples generalize to any wellpointed, locally presentable C with exponentials, when R and D are strong.

In exactly the same way as above, comonoid maps D<sub>0</sub> ⋆ D<sub>1</sub> → G<sup>†</sup> are in bijection with maps UD<sub>0</sub> ⋆ UD<sub>1</sub> → G, and C(D<sub>1</sub>, G<sup>†</sup>) ≅ (UD<sub>1</sub> −⋆ G)<sup>†</sup>.

In the rest of this paper, we ignore comonad-comonad interaction laws and the internal hom of comonads since they are not our main focus. But developments similar to those for monad-comonad interaction laws, the Sweedler hom of monads and the Sweedler copower of a monad (in Sects. 4 and 5 below) can be carried out for them as well.

### 4 Monad-comonad Interaction Laws (Co)algebraically

We now return to monad-comonad interaction laws specifically and explain the (co)algebraic perspective developed in [33]. (Props. 4 and 6 did not appear in [33].) First, monad-comonad interaction laws admit the following useful characterization in terms of (co)algebras of the (co)monads involved.

Proposition 3. R-residual monad-comonad interaction laws ψ of T, D are in bijection with functors Ψ : (Coalg(D))<sup>op</sup> × Alg(R) → Alg(T) that internal-hom carriers, i.e., satisfy

$$\begin{array}{ccc} (\mathbf{Coalg}(D))^{\mathrm{op}} \times \mathbf{Alg}(R) & \xrightarrow{\ \Psi\ } & \mathbf{Alg}(T) \\ {\scriptstyle U^{\mathrm{op}} \times U} \big\downarrow & & \big\downarrow {\scriptstyle U} \\ \mathbb{C}^{\mathrm{op}} \times \mathbb{C} & \xrightarrow{\ \multimap\ } & \mathbb{C} \end{array}$$

Proof (sketch). Given an interaction law ψ, the functor Ψ is defined by

$$\Psi((Y,\chi),(Z,\zeta)) = (Y \multimap Z,\ T(Y \multimap Z) \xrightarrow{\psi} DY \multimap RZ \xrightarrow{\chi \multimap \zeta} Y \multimap Z).$$

Conversely, given a functor Ψ, the corresponding interaction law ψ is defined by

$$\psi \;=\; T(Y \multimap Z) \xrightarrow{T(\varepsilon_Y \multimap \eta^R_Z)} T(DY \multimap RZ) \xrightarrow{\xi} DY \multimap RZ$$

where (DY ⊸ RZ, ξ) = Ψ((DY, δ<sub>Y</sub>), (RZ, µ<sup>R</sup><sub>Z</sub>)). ⊓⊔

We remark that such functors Ψ are completely determined by their action on (co)free (co)algebras. To be precise, there is a bijection between these functors and functors Ψ ′ : (CoKl(D))op × Kl(R) → Alg(T) that satisfy

$$\begin{array}{ccc} (\mathbf{CoKl}(D))^{\mathrm{op}} \times \mathbf{Kl}(R) & \xrightarrow{\ \Psi'\ } & \mathbf{Alg}(T) \\ {\scriptstyle K^{\mathrm{op}} \times K} \big\downarrow & & \big\downarrow {\scriptstyle U} \\ \mathbb{C}^{\mathrm{op}} \times \mathbb{C} & \xrightarrow{\ \multimap\ } & \mathbb{C} \end{array}$$

where K : CoKl(D) → C is the left adjoint of the coKleisli adjunction of D and K : Kl(R) → C is the right adjoint of the Kleisli adjunction of R.

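The following reformulations of Prop. 3 enable a smooth derivation of further characterizations of monad-comonad interaction laws in terms of what we call runners, introduced next.

Corollary 1. R-residual interaction laws of T, D are in bijection with functors Ψ : Coalg(D) → [Alg(R), Alg(T)]<sup>op</sup> satisfying

$$\begin{array}{ccccc} \mathbf{Coalg}(D) & & \xrightarrow{\ \Psi\ } & & [\mathbf{Alg}(R), \mathbf{Alg}(T)]^{\mathrm{op}} \\ {\scriptstyle U} \big\downarrow & & & & \big\downarrow {\scriptstyle [\mathbf{Alg}(R), U]^{\mathrm{op}}} \\ \mathbb{C} & \xrightarrow{(Y \mapsto Y \multimap -)^{\mathrm{op}}} & [\mathbb{C}, \mathbb{C}]^{\mathrm{op}} & \xrightarrow{[U, \mathbb{C}]^{\mathrm{op}}} & [\mathbf{Alg}(R), \mathbb{C}]^{\mathrm{op}} \end{array}$$

and also with functors Ψ : Alg(R) → [(Coalg(D))<sup>op</sup>, Alg(T)] satisfying

$$\begin{array}{ccccc} \mathbf{Alg}(R) & & \xrightarrow{\ \Psi\ } & & [(\mathbf{Coalg}(D))^{\mathrm{op}}, \mathbf{Alg}(T)] \\ {\scriptstyle U} \big\downarrow & & & & \big\downarrow {\scriptstyle [(\mathbf{Coalg}(D))^{\mathrm{op}}, U]} \\ \mathbb{C} & \xrightarrow{(Z \mapsto - \multimap Z)} & [\mathbb{C}^{\mathrm{op}}, \mathbb{C}] & \xrightarrow{[U^{\mathrm{op}}, \mathbb{C}]} & [(\mathbf{Coalg}(D))^{\mathrm{op}}, \mathbb{C}] \end{array}$$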

#### Stateful Runners

Say that an R-residual stateful runner of T is an object Y ∈ C together with a family of maps

$$
\theta_X: TX \otimes Y \to R(X \otimes Y),
$$

natural in X satisfying

$$\begin{array}{rcl} \theta_X \circ (\eta^T_X \otimes Y) &=& \eta^R_{X \otimes Y} \\ \theta_X \circ (\mu^T_X \otimes Y) &=& \mu^R_{X \otimes Y} \circ R\theta_X \circ \theta_{TX} \end{array}$$

Maps (Y, θ) → (Y′, θ′) between stateful runners are maps f : Y → Y′ satisfying R(X ⊗ f) ◦ θ<sub>X</sub> = θ′<sub>X</sub> ◦ (TX ⊗ f). Stateful runners form a category SRun<sub>R</sub>(T).

R-residual stateful runners of T with carrier Y are in bijection with monad maps T → St<sup>R</sup><sub>Y</sub>, where St<sup>R</sup><sub>Y</sub> is the R-transformed state monad for state object Y defined by St<sup>R</sup><sub>Y</sub> X = Y ⊸ R(X ⊗ Y).
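As an illustration in C = Set, the following Python sketch gives a residual stateful runner and its curried form as a map into the transformed state monad. All concrete choices are assumptions made for the example: T X = M × X is the writer monad for the monoid of natural numbers under addition, R Z = 1 + Z is encoded with `None` for the extra point, and the carrier Y is the natural numbers read as remaining fuel.

```python
# Assumed encodings: T X = (m, x) with m a natural number; R Z = Z or None.

def t_unit(x):
    """Unit of the writer monad T X = M x X for M = (N, +, 0)."""
    return (0, x)


def t_join(tt):
    """Multiplication of the writer monad: add the two logged amounts."""
    m1, (m2, x) = tt
    return (m1 + m2, x)


def r_bind(r, k):
    """Kleisli extension for R Z = 1 + Z, with None as the extra point."""
    return None if r is None else k(r)


def theta(t, y):
    """Runner theta_X : T X x Y -> R (X x Y): spend m units of fuel, fail if too little is left."""
    m, x = t
    return (x, y - m) if y >= m else None


def to_state_map(theta):
    """Curry theta into the component T X -> (Y -> R (X x Y)) of a map T -> St^R_Y."""
    return lambda t: (lambda y: theta(t, y))
```

The two runner equations can be checked pointwise: running `t_unit(x)` leaves the fuel untouched, and running `t_join(tt)` agrees with running the outer and then the inner computation, with failure propagated by `r_bind`.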

They are also in bijection with functors Θ : Alg(R) → Alg(T) that internal-hom Y with the carrier, i.e., satisfy

$$\begin{array}{ccc} \mathbf{Alg}(R) & \xrightarrow{\ \Theta\ } & \mathbf{Alg}(T) \\ {\scriptstyle U} \big\downarrow & & \big\downarrow {\scriptstyle U} \\ \mathbb{C} & \xrightarrow{\ Y \multimap -\ } & \mathbb{C} \end{array}$$

Proof (sketch). Given a stateful runner θ, the functor Θ is defined by

$$\Theta(Z,\zeta) \;=\; T(Y \multimap Z) \xrightarrow{\theta_{Y \multimap Z}} Y \multimap R((Y \multimap Z) \otimes Y) \xrightarrow{Y \multimap R\,\mathrm{ev}} Y \multimap RZ \xrightarrow{Y \multimap \zeta} Y \multimap Z$$

Conversely, given a functor Θ, the stateful runner θ is

$$\theta_X \;=\; TX \xrightarrow{T\,\mathrm{coev}} T(Y \multimap X \otimes Y) \xrightarrow{T(Y \multimap \eta^R_{X \otimes Y})} T(Y \multimap R(X \otimes Y)) \xrightarrow{\xi} Y \multimap R(X \otimes Y)$$

where (Y ⊸ R(X ⊗ Y), ξ) = Θ(R(X ⊗ Y), µ<sup>R</sup><sub>X⊗Y</sub>). ⊓⊔
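The first construction in this proof can be sketched in Python for C = Set. The example is self-contained and rests on assumed data: the writer monad T X = M × X over the natural numbers under addition, R Z = 1 + Z with `None` as the extra point, a fuel-spending runner `theta` on Y = natural numbers, and the R-algebra `zeta` on the integers that sends the extra point to −1.

```python
def theta(t, y):
    """Assumed runner theta_X : T X x Y -> R (X x Y): spend m units of fuel or fail."""
    m, x = t
    return (x, y - m) if y >= m else None


def zeta(r):
    """Assumed R-algebra on the integers for R Z = 1 + Z: the extra point goes to -1."""
    return -1 if r is None else r


def algebra_from_runner(theta, zeta):
    """T-algebra structure T (Y -> Z) -> (Y -> Z) on the function space Y -> Z."""
    def alg(t):                 # t is an element of T (Y -> Z)
        def run(y):
            r = theta(t, y)     # element of R ((Y -> Z) x Y)
            if r is None:
                return zeta(None)
            f, y2 = r
            return zeta(f(y2))  # R ev followed by zeta
        return run
    return alg
```

Here `algebra_from_runner(theta, zeta)((m, f))` is the function that first spends `m` units of fuel and then applies `f` to what is left, falling back to the default answer of `zeta` when the fuel does not suffice.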

This observation is strengthened by the following proposition that also talks about stateful runner maps.

Proposition 4. The following is a pullback square:

$$\begin{array}{ccccc} \mathbf{SRun}_{R}(T) & & \longrightarrow & & [\mathbf{Alg}(R), \mathbf{Alg}(T)]^{\mathrm{op}} \\ {\scriptstyle U} \big\downarrow & & & & \big\downarrow {\scriptstyle [\mathbf{Alg}(R), U]^{\mathrm{op}}} \\ \mathbb{C} & \xrightarrow{(Y \mapsto Y \multimap -)^{\mathrm{op}}} & [\mathbb{C}, \mathbb{C}]^{\mathrm{op}} & \xrightarrow{[U, \mathbb{C}]^{\mathrm{op}}} & [\mathbf{Alg}(R), \mathbb{C}]^{\mathrm{op}} \end{array}$$

Combining Prop. 4 with Cor. 1, we obtain a characterization of monad-comonad interaction laws in terms of stateful runners.

Proposition 5. R-residual monad-comonad interaction laws of T, D are in bijection with functors Ψ : Coalg(D) → SRun<sub>R</sub>(T) preserving carriers, i.e., satisfying U ∘ Ψ = U for the forgetful functors U to C.
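One direction of this correspondence sends a D-coalgebra (Y, χ) to the runner θ<sub>X</sub> = ψ<sub>X,Y</sub> ∘ (TX ⊗ χ). A minimal Python sketch for C = Set and R = Id follows. The concrete data are assumptions chosen for illustration: T X = M × X is the writer monad and D Y = M → Y the exponent comonad for the monoid M of natural numbers under addition, `psi` is the law that evaluates the environment at the logged amount, and `chi` is the coalgebra on the integers given by addition.

```python
def psi(t, g):
    """Assumed interaction law psi : T X x D Y -> X x Y for T X = M x X, D Y = M -> Y."""
    m, x = t
    return (x, g(m))


def chi(y):
    """Assumed D-coalgebra on the integers: chi(y)(0) == y and chi(y)(m + n) == chi(chi(y)(m))(n)."""
    return lambda m: y + m


def runner_from_coalgebra(psi, chi):
    """Stateful runner theta_X = psi_{X,Y} . (T X x chi) with the same carrier Y."""
    return lambda t, y: psi(t, chi(y))
```

For this choice of data the resulting runner adds the logged amount to the state, so the coalgebra structure is exactly the action of M on Y that the runner uses.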

#### Continuation-Based Runners

A D-fuelled continuation-based runner of T is an object Z ∈ C together with a family of maps

$$\theta_X: D(X \multimap Z) \to TX \multimap Z$$

natural in X satisfying

$$\begin{array}{rcl} (\eta^T_X \multimap Z) \circ \theta_X &=& \varepsilon_{X \multimap Z} \\ (\mu^T_X \multimap Z) \circ \theta_X &=& \theta_{TX} \circ D\theta_X \circ \delta_{X \multimap Z} \end{array}$$

These runners form a category CRun<sub>D</sub>(T).

D-fuelled continuation-based runners of T with carrier Z are in bijection with monad maps T → Cnt<sup>D</sup><sub>Z</sub>, where Cnt<sup>D</sup><sub>Z</sub> is the D-transformed continuation monad for answer object Z defined by Cnt<sup>D</sup><sub>Z</sub> X = D(X ⊸ Z) ⊸ Z.

Continuation-based runners are also in bijection with functors Θ : (Coalg(D))<sup>op</sup> → Alg(T) that internal-hom the carrier with Z, i.e., that satisfy

$$\begin{array}{ccc} (\mathbf{Coalg}(D))^{\mathrm{op}} & \xrightarrow{\ \Theta\ } & \mathbf{Alg}(T) \\ {\scriptstyle U^{\mathrm{op}}} \big\downarrow & & \big\downarrow {\scriptstyle U} \\ \mathbb{C}^{\mathrm{op}} & \xrightarrow{\ - \multimap Z\ } & \mathbb{C} \end{array}$$

Moreover:

Proposition 6. The following is a pullback square:

$$\begin{array}{ccccc} \mathbf{CRun}_{D}(T) & & \longrightarrow & & [(\mathbf{Coalg}(D))^{\mathrm{op}}, \mathbf{Alg}(T)] \\ {\scriptstyle U} \big\downarrow & & & & \big\downarrow {\scriptstyle [(\mathbf{Coalg}(D))^{\mathrm{op}}, U]} \\ \mathbb{C} & \xrightarrow{(Z \mapsto - \multimap Z)} & [\mathbb{C}^{\mathrm{op}}, \mathbb{C}] & \xrightarrow{[U^{\mathrm{op}}, \mathbb{C}]} & [(\mathbf{Coalg}(D))^{\mathrm{op}}, \mathbb{C}] \end{array}$$

Combining this proposition with Cor. 1, we obtain:

Proposition 7. R-residual monad-comonad interaction laws of T, D are in bijection with functors Ψ : Alg(R) → CRun<sub>D</sub>(T) that preserve carriers, i.e., that satisfy U ∘ Ψ = U for the forgetful functors U to C.

### 5 Combining Sweedler Theory and the (Co)algebraic Perspective

We now combine our (co)algebraic observations with Sweedler theory.

#### Sweedler Hom

By definition, the Sweedler hom between monads T, R, if it exists, is the comonad M(T, R) together with a monad-comonad interaction law υ such that, for any other comonad D and monad-comonad interaction law ψ, there exists a unique comonad map g : D → M(T, R) satisfying

$$\psi_{X,Y} \;=\; TX \otimes DY \xrightarrow{TX \otimes g_Y} TX \otimes \mathcal{M}(T, R)Y \xrightarrow{\upsilon_{X,Y}} R(X \otimes Y)$$

Comonad maps D → D′ are in bijection with functors Coalg(D) → Coalg(D′) that preserve carriers. Therefore, by Prop. 5, the Sweedler hom, if it exists, is the comonad M(T, R) together with a carrier-preserving functor Υ : Coalg(M(T, R)) → SRun<sub>R</sub>(T) such that, for any other comonad D and carrier-preserving functor Ψ : Coalg(D) → SRun<sub>R</sub>(T), there exists a unique carrier-preserving functor Γ : Coalg(D) → Coalg(M(T, R)) such that Υ ∘ Γ = Ψ.

It follows that, if (SRun<sub>R</sub>(T), U) is strictly comonadic, then M(T, R) exists and (Coalg(M(T, R)), U) ≅ (SRun<sub>R</sub>(T), U). (Should (SRun<sub>R</sub>(T), U) fail to be strictly comonadic, then M(T, R) may still exist, but with different coalgebras.) Easy calculations show that U strictly creates equalizers of U-split pairs. Hence, by the dual of Beck's monadicity theorem, U is strictly comonadic if it is a left adjoint. Under our assumptions on C, T and R from Sect. 2, all is well.

Theorem 1. If C is locally presentable and T and R are accessible monads on C, then SRun<sub>R</sub>(T) is locally presentable and the forgetful functor U : SRun<sub>R</sub>(T) → C is a left adjoint. Hence the Sweedler hom M(T, R) exists, is accessible, and satisfies (Coalg(M(T, R)), U) ≅ (SRun<sub>R</sub>(T), U).

Proof (sketch). We first show that SRun<sub>R</sub>(T) is locally presentable. The functor U : SRun<sub>R</sub>(T) → C strictly creates colimits by easy calculations, and hence SRun<sub>R</sub>(T) is cocomplete. For local presentability, it therefore remains to show that SRun<sub>R</sub>(T) is accessible, which we do by appealing to the fact that accessible categories are closed under inserters and equifiers. The category of F-coalgebras, for any accessible endofunctor F on C, is an inserter of accessible functors, and is therefore accessible by [1, Thm. 2.72]. For each Y, families of maps θ<sub>X</sub> : TX ⊗ Y → R(X ⊗ Y) natural in X are in bijection with maps χ : Y → (T −⋆ R)Y, so that R-residual stateful runners of T are equivalently coalgebras (Y, χ) of the functor T −⋆ R satisfying two equations. One equation is an equality between two maps Y → (Id −⋆ R)Y, the other between two maps Y → ((T · T) −⋆ R)Y. It follows that SRun<sub>R</sub>(T) is isomorphic to a full subcategory of the category coalg(T −⋆ R) of (T −⋆ R)-coalgebras, and that this full subcategory is the joint equifier of two natural transformations of accessible functors coalg(T −⋆ R) → coalg(Id −⋆ R) and of two natural transformations of accessible functors coalg(T −⋆ R) → coalg((T · T) −⋆ R). Accessible categories are closed under equifiers of natural transformations of accessible functors [1, Lemma 2.76], so SRun<sub>R</sub>(T) is accessible and hence locally presentable.

As a colimit-preserving functor between locally presentable categories, U is a left adjoint by Freyd's special adjoint functor theorem, and is thus strictly comonadic. The induced comonad is the Sweedler hom M(T, R). Accessibility of M(T, R) follows from accessibility of the adjoints (the right adjoint by [1, Prop. 2.23]). ⊓⊔

Example 3. Let C = Set. Take TX = X<sup>S</sup> (the reader monad for state object S). R-residual stateful runners of T are objects Y with families of maps X<sup>S</sup> × Y → R(X × Y) natural in X or, equivalently, maps Y → R(S × Y) constrained by two equations. For R = Id or R = 1 + −, these are in bijection with maps Y → S. The comonad with such structured objects Y as coalgebras, which is the Sweedler hom of T and R, is DY = S × Y (the coreader comonad for S). For a general accessible monad R, the Sweedler hom can be described as a subcomonad of the cofree comonad DY = νY′. Y × R(S × Y′).
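For R = Id, the passage from a map f : Y → S to a stateful runner of the reader monad can be sketched in Python as below, with elements of X<sup>S</sup> represented as Python functions. The function names are hypothetical and chosen for this illustration.

```python
def reader_unit(x):
    """Unit of the reader monad T X = S -> X: the constant function."""
    return lambda s: x


def reader_join(h):
    """Multiplication of the reader monad: read the same s twice."""
    return lambda s: h(s)(s)


def make_runner(f):
    """Runner theta_X : X^S x Y -> X x Y determined by f : Y -> S (the case R = Id)."""
    def theta(g, y):
        # Answer the read request with the S-value that f assigns to the state y.
        return (g(f(y)), y)
    return theta
```

The runner never changes the state y, which matches the description of the Sweedler hom as the coreader comonad: a coalgebra only has to say which element of S each state provides.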

Take T X = X<sup>+</sup> = µX′ .X × (1 + X′ ) (the nonempty list monad with concatenation as multiplication, free semigroup monad). R-residual stateful runners of T are objects Y with families of maps X<sup>+</sup> × Y → R(X × Y ) natural in X satisfying two equations or, equivalently, maps (X × X) × Y → R(X × Y ) constrained by one equation or, equivalently, maps Y → R(Y + Y ) coassociative wrt. the coproduct monoidal structure of Kl(R), i.e., making Y into a cosemigroup. For R = Id, the corresponding comonad is the cofree cosemigroup (wrt. the coproduct monoidal structure on Set) comonad. Its underlying functor is DY ∼= Y × (Y + Y ).

These examples generalize to any wellpointed, locally presentable C with exponentials, when R is a strong monad.

#### Sweedler Copower

The Sweedler copower of a monad T by a comonad D, if it exists, is by definition the monad D ▷ T together with a monad-comonad interaction law υ such that, for any other monad R and monad-comonad interaction law ψ, there exists a unique monad map g : D ▷ T → R satisfying

$$\psi_{X,Y} \;=\; TX \otimes DY \xrightarrow{\upsilon_{X,Y}} (D \rhd T)(X \otimes Y) \xrightarrow{g_{X \otimes Y}} R(X \otimes Y)$$

Monad maps R′ → R are in bijection with functors Alg(R) → Alg(R′) that preserve carriers. Therefore, by Prop. 7, the Sweedler copower, if it exists, is the monad D ▷ T together with a carrier-preserving functor Υ : Alg(D ▷ T) → CRun<sub>D</sub>(T) such that, for any other monad R and carrier-preserving functor Ψ : Alg(R) → CRun<sub>D</sub>(T), there exists a unique carrier-preserving functor Γ : Alg(R) → Alg(D ▷ T) such that Υ ∘ Γ = Ψ.

Consequently, if (CRun<sub>D</sub>(T), U) is strictly monadic, then D ▷ T exists and (Alg(D ▷ T), U) ≅ (CRun<sub>D</sub>(T), U). This is the case as soon as U is a right adjoint by Beck's strict monadicity theorem, because U is easily verified to strictly create U-split coequalizers.

Theorem 2. If C is locally presentable and T and D are accessible, then CRun<sub>D</sub>(T) is locally presentable and the forgetful functor U : CRun<sub>D</sub>(T) → C is a right adjoint. Hence the Sweedler copower D ▷ T exists, is accessible, and satisfies (Alg(D ▷ T), U) ≅ (CRun<sub>D</sub>(T), U).

Proof (sketch). The proof is similar to that of Thm. 1. The functor U strictly creates limits, so CRun<sub>D</sub>(T) is complete. The category CRun<sub>D</sub>(T) is isomorphic to a full subcategory of the category of algebras of the functor D ⋆ T, forming a joint equifier. Categories of algebras of accessible endofunctors on C are inserters of accessible functors, and hence form accessible categories. It follows that CRun<sub>D</sub>(T) is also accessible, and hence locally presentable. The functor U strictly creates κ-filtered colimits, where κ is such that Id ⋆ T, D ⋆ T, and (D · D) ⋆ T are κ-accessible; in particular, U is accessible. Since U also strictly creates limits, it is therefore a right adjoint by [1, Theorem 1.66]. The induced monad is the Sweedler copower D ▷ T, which is accessible because both adjoints are. ⊓⊔

Example 4. Let C = Set. Take TX = M × X where (M, u, ∗) is a monoid (the writer monad) and DY = S × Y (the coreader comonad). D-fuelled continuation-based runners of T are objects Z with families of maps S × Z<sup>X</sup> → Z<sup>M×X</sup> natural in X or, equivalently, maps (S × M) × Z → Z, subject to two equations. The monad with such structured objects Z as algebras, which is the Sweedler copower of T and D, is the writer monad for the free monoid on S × M quotiented by (s, a) ∗ (s, b) = (s, a ∗ b) and u = (s, u).
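The passage from a map (S × M) × Z → Z to a continuation-based runner can be sketched in Python as follows. The concrete instance is an assumption chosen so that the two equations hold: M is the integers under addition, S and Z are the integers, and `act(s, m, z) = z + s * m`, which satisfies `act(s, 0, z) == z` and `act(s, m1 + m2, z) == act(s, m1, act(s, m2, z))`.

```python
def act(s, m, z):
    """Assumed action (S x M) x Z -> Z for M = (int, +, 0) and S = Z = int."""
    return z + s * m


def make_runner(act):
    """Runner theta_X : S x Z^X -> Z^(M x X) for T X = M x X and D Y = S x Y."""
    def theta(d):
        s, k = d                      # an environment value s and a continuation k : X -> Z
        return lambda t: act(s, t[0], k(t[1]))
    return theta
```

In this sketch the unit equation says that running a computation with log 0 simply calls the continuation, and the multiplication equation says that running with log m1 + m2 agrees with running the outer log m1 on the result of running the inner log m2 under the same environment value s.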

### 6 Enriched Interaction Laws

In Sects. 2, 4, 5 above, we worked with (a full subcategory of) the category [C, C] of endofunctors on a SMCC C and natural transformations between them, and abstracted it to a duoidal category D in Sect. 3.

An alternative is to proceed from an SMCC (V, I, ⊗,⊸) (copowered over itself by ⊗ and enriched and powered by ⊸) and another category C that is at least copowered or enriched over V, or possibly both or even powered too. In this setting, a V-enriched functor-functor interaction law is given by V-enriched endofunctors F on V and G and H on C together with either a family of maps ϕX,Y : F X • GY → H(X • Y ) in C that are V-natural in X ∈ V and Y ∈ C or, equivalently, a family of maps ϕY,Z : F(C(Y, Z)) → C(GY, HZ) in V that are V-natural in Y, Z ∈ C.

Two cases are of special interest.


The only case where the enriched setting agrees with the main setting of this paper (Sects. 2–5), i.e., with the concept of interaction law where there are no nonvacuous enrichment requirements and the endofunctors involved are all on the same category, is the intersection of the above two: V = C = Set.

A more general situation in which the two settings do not differ too much is when V = C and C is monoidally wellpointed. Then all functors with codomain C are uniquely C-enriched (but may fail to admit an enrichment) and all natural transformations between C-enriched functors with codomain C are C-enriched.

In the case V = C, which is probably the most interesting case for mathematical semantics applications, the duoidal abstraction of Sect. 3 still applies. We can take D to be (a suitable full subcategory of) C-[C, C], where C-[C, C] is the ordinary category of C-functors C → C (strong endofunctors).

In the case of a general V, the simple duoidal abstraction ceases to apply. We need to switch to an action ⋆ : W × D → D (in MonCAT<sub>oplax</sub>) of a symmetric duoidal category (W, I<sub>W</sub>, ⋄<sub>W</sub>, J<sub>W</sub>, ⋆<sub>W</sub>) on a monoidal category (D, I, ⋄) together with a functor −⋆ : D<sup>op</sup> × D → W such that − ⋆ G ⊣ G −⋆ − (in CAT). Crucially, the action ⋆ comes with structural laws

$$I_{\mathcal{W}} \star I \to I \qquad (F \diamond_{\mathcal{W}} G) \star (H \diamond K) \to (F \star H) \diamond (G \star K)$$

witnessing oplaxity of ⋆. Similarly to the simple duoidal situation, we get that ⋆ and −⋆ lift to functors ⋆ : Comon(W) × Comon(D) → Comon(D) and −⋆ : (Comon(D))<sup>op</sup> × Mon(D) → Mon(W) and can then define measuring maps and Sweedler-like operations and ask if they are everywhere defined.

The instantiation is given by (suitable full subcategories of) W = V-[V, V], D = V-[C, C] and

$$\begin{array}{ll} (F \star G)\, Z = \int^{X,Y} \mathbb{C}(X \bullet Y, Z) \bullet (FX \bullet GY) & = \int^{Y} F(\mathbb{C}(Y, Z)) \bullet GY\\ (G \mathbin{-\!\star} H)\, X = \int_{Y,Z} (X \multimap \mathbb{C}(Y, Z)) \multimap \mathbb{C}(GY, HZ) & = \int_{Y} \mathbb{C}(GY, H(X \bullet Y)) \end{array}$$

where the integral signs now stand for V-enriched coends and ends.

#### Runners as Generalized Algebras

Enriched monad-comonad interaction laws can be characterized as enriched functors between categories of (co)algebras analogously to Props. 3, 5, 7. But one pleasant feature of the enriched setting is that enriched versions of both stateful and continuation-based runners of T can be described as algebras of T in a generalized sense.

Suppose we are given an SMCC V (copowered over itself by ⊗ and enriched and powered by ⊸) and a V-enriched monad T on V. For a category K that is enriched and powered over V, we define an algebra of T in K to be an object Y of K together with a family of maps χ<sub>X</sub> : X ⋔ Y → TX ⋔ Y in K that is V-enriched natural in X ∈ V and satisfies the equations

$$\begin{array}{rcl} (\eta_X \pitchfork Y) \circ \chi_X &=& \mathrm{id}_{X \pitchfork Y} \\ (\mu_X \pitchfork Y) \circ \chi_X &=& \chi_{TX} \circ \chi_X \end{array}$$

If V has enough limits, then these form a V-category Alg(T, K), and there is a forgetful V-functor U : Alg(T, K) → K. (The limits are required to carve out the object of algebra maps (Y, χ) → (Y′, χ′) from the hom-object K(Y, Y′).)

An algebra like this is equivalently an object Y ∈ K together with a V-enriched monad map T → Knt<sub>Y</sub> where Knt<sub>Y</sub> X = K(X ⋔ Y, Y). If V = K, an algebra of T in this sense is the same as an algebra in the standard sense. In this case, we have Knt<sub>Y</sub> X = (X ⊸ Y) ⊸ Y.
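For V = K = Set, where X ⋔ Y is the function space X → Y, the correspondence between algebras in the generalized sense and standard algebras can be sketched in Python. The helper names and the running instance (the list monad with the integers under summation as algebra) are assumptions made for illustration.

```python
def list_fmap(k, xs):
    """Functor action of the list monad T X = list of X."""
    return [k(x) for x in xs]


def from_standard(alg, fmap):
    """Generalized structure chi_X : (X -> Y) -> (T X -> Y), chi_X(k) = alg . T k."""
    return lambda k: (lambda t: alg(fmap(k, t)))


def to_standard(chi):
    """Standard structure T Y -> Y recovered as chi_Y applied to the identity."""
    return chi(lambda y: y)
```

For example, `from_standard(sum, list_fmap)` sends a function k : X → int to the function that sums k over a list, and `to_standard` recovers `sum` on lists of integers.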

Enriched runners of T turn out to be algebras of T in this generalized sense. Given a category C enriched and copowered over V and a V-enriched monad R on C, a V-enriched R-residual stateful runner of T is an object Y ∈ C together with a family of maps θ<sub>X</sub> : TX • Y → R(X • Y) in C that is V-natural in X ∈ V and satisfies two equations. Enriched stateful runners of T are in bijection with algebras of T in (Kl(R))<sup>op</sup>.

Proof (sketch). The statement is well-formed since, as soon as C is V-enriched and copowered by a functor • : V ⊗ C → C, we have that Kl(R) is V-enriched and copowered by a functor V ⊗ Kl(R) → Kl(R) that agrees with • on objects. Therefore (Kl(R))<sup>op</sup> is V-enriched and powered by the opposite of that functor. We have the following chain of bijections:

$$\begin{array}{c} TX \bullet Y \to R(X \bullet Y) \text{ in } \mathbb{C},\ \mathbb{V}\text{-nat. in } X \\ \hline\hline TX \bullet Y \to X \bullet Y \text{ in } \mathbf{Kl}(R),\ \mathbb{V}\text{-nat. in } X \\ \hline\hline X \pitchfork Y \to TX \pitchfork Y \text{ in } (\mathbf{Kl}(R))^{\mathrm{op}},\ \mathbb{V}\text{-nat. in } X \end{array}$$

⊓⊔

The statement about the category of enriched stateful runners is:

Proposition 8. If Alg(T,(Kl(R))op) exists as a V-category, then so does V-SRunR(T), and the following is a pullback square (in V-CAT).

$$\begin{array}{ccc} \mathbb{V}\text{-}\mathbf{SRun}_{R}(T) & \longrightarrow & (\mathbf{Alg}(T,(\mathbf{Kl}(R))^{\text{op}}))^{\text{op}} \\ \downarrow & & \downarrow {\scriptstyle U^{\text{op}}} \\ \mathbb{C} & \xrightarrow{\ J\ } & \mathbf{Kl}(R) \end{array}$$

In the special case when V = C and R = Id<sub>C</sub>, we get (Coalg(M<sup>C</sup>(T, Id)), U) ≅ ((Alg(T, C<sup>op</sup>))<sup>op</sup>, U<sup>op</sup>) ("coalgebras" of the C-monad T).

By the same token, given a V-enriched and powered category C and a V-enriched comonad D on C, we can define what a V-enriched D-fuelled continuation-based runner of T is: an object Z ∈ C together with a family of maps θ<sub>X</sub> : D(X ⋔ Z) → T X ⋔ Z in C that is V-natural in X ∈ V and satisfies two equations. Enriched continuation-based runners of T are in bijection with algebras of T in the coKleisli category of D. Moreover:

Proposition 9. If Alg(T, CoKl(D)) exists as a V-category, then so does V-CRunD(T), and the following is a pullback square:

$$\begin{array}{ccc} \mathbb{V}\text{-}\mathbf{CRun}_{D}(T) & \longrightarrow & \mathbf{Alg}(T, \mathbf{CoKl}(D)) \\ {\scriptstyle U} \downarrow & & \downarrow {\scriptstyle U} \\ \mathbb{C} & \xrightarrow{\ J\ } & \mathbf{CoKl}(D) \end{array}$$

### 7 Related Work

In semantics work, the use of monads as notions of computation was pioneered by Moggi [23], but the first to study comonads (or algebraic theories comodelled) as notions of environment (not under that name) were Shkaravska and Power [29]. This work was developed further by Plotkin and Power [24] and then Møgelberg and Staton [22] (who considered the enriched setting). Stateful runners appeared in a paper by Uustalu [32], who noticed that nonresidual stateful runners of a set monad induced by an algebraic theory are in bijection with coalgebras of the comonad induced by the same theory (comodels). The concept of monad-comonad interaction law was distilled by Katsumata et al. [16], who also noticed that the universal interacting comonad of a monad is an instance of the Sweedler hom from Sweedler theory for duoidal categories; they calculated the dual and Sweedler dual for a number of cases. Uustalu and Voorneveld [33] noticed the bijection between monad-comonad interaction laws and suitable functors between categories of (co)algebras and that, in addition to stateful runners, monad-comonad interaction laws relate to continuation-based runners. Garner [12,11] further developed this thread. In particular, he gave a formula for the Sweedler duals of polynomial monads, and demonstrated properties of the dual/Sweedler dual (costructure/cosemantics) adjunction for accessible Set-(co)monads, such as its idempotency. He also pointed out that, when T and R are accessible Set-monads, the coalgebras of the Sweedler hom M(T, R) are algebras of T in (Kl(R))<sup>op</sup> with, as maps between them, maps in Set that J : Set → Kl(R) sends to algebra maps.

Independently, and earlier than in the semantics community, monad-comonad interaction laws were discovered among functional programmers by Kmett [19] and Freeman [8].

There is a disconnected and more mature thread of work in universal algebra started by Freyd [9] (or even Kan [15]), and continued by Tall and Wraith [31,34] and Bergman and Hausknecht [5], studying functors from coalgebras of a covariety to algebras (like those of our Prop. 3) in the case V = Set, R = Id<sup>C</sup> of our enriched setting. (There are also textbook expositions, by Popescu and Popescu [26, Ch. 3] and Bergman [4, Ch. 10].) Strangely, this thread seems to have never been picked up in semantics work. It was not cited in the work by Power and coauthors [29,24], and the later authors (except Garner) have been unaware of it.

Sweedler's original work [30] was for (co)algebras over a field. Anel and Joyal [3] studied the Sweedler theory in great detail for dg-(co)algebras. It was abstracted for (co)monoids in symmetric monoidal closed categories by Porst and Street [28] and Hyland et al. [14] (the internal hom of comonoids is older and goes back to Porst [27]) and then generalized for duoidal categories by López Franco and Vasilakopoulou [20]. A typical example of a duoidal structure on a functor category is given by the Day convolution and pointwise tensor. Garner and López Franco [13] considered the example of composition and the Day convolution of endofunctors (κ-accessible for a fixed κ).

We do not know the earliest reference to generalized algebras of a monad, in particular, coalgebras of a monad. The latter were considered by Poinsot and Porst [25] (and models of algebraic theories elsewhere than Set are standard).

### 8 Conclusion and Future Work

We have studied universal (co)monads for monad-comonad interactions. We have shown that an elegant setting for such a study on a more general level is provided by Sweedler theory for general duoidal categories as developed by López Franco and Vasilakopoulou [20]. But for results about monad-comonad interaction specifically it is fruitful to combine it with the (co)algebraic perspective on monad-comonad interaction laws [33]. This makes it possible to characterize the universal (co)monads defined by Sweedler operations via their categories of (co)algebras in terms of different flavors of runners.

We have witnessed that there is the choice of whether to work with ordinary monad-comonad interaction laws or with the enriched version. It remains to be seen which option yields a richer or more useful theory. An issue with the enriched option is that we know little about accessibility for enriched categories, although some studies exist (e.g., [18,6,7]).

We refrained from discussing it in this paper altogether, but of course one can specifically study interaction laws of monads and comonads specified by algebraic theories. We intend to do this in a sequel paper. We also plan to explain properly the significance for semantics of the constructions of this paper by describing in detail how they work on semantics-motivated examples and what this means.

Acknowledgements We thank Niels Voorneveld for many useful discussions. Richard Garner's work is an endless source of inspiration. D.M. and T.U. were supported by the Icelandic Research Fund project grant no. 196323-053, T.U. also by the Estonian Research Council team grant no. PRG1210. E.R. was supported by the Estonian Research Council personal grant no. PSG659.

### References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Model Checking Temporal Properties of Recursive Probabilistic Programs

Tobias Winkler , Christina Gehnen , and Joost-Pieter Katoen

RWTH Aachen University, Aachen, Germany {tobias.winkler,katoen}@cs.rwth-aachen.de christina.gehnen@rwth-aachen.de

Abstract. Probabilistic pushdown automata (pPDA) are a standard operational model for programming languages involving discrete random choices, procedures, and returns. Temporal properties are useful for gaining insight into the chronological order of events during program execution. Existing approaches in the literature have focused mostly on ω-regular and LTL properties. In this paper, we study the model checking problem of pPDA against ω-visibly pushdown languages that can be described by specification logics such as CaRet and are strictly more expressive than ω-regular properties. With these logical formulae, it is possible to specify properties that explicitly take the structured computations arising from procedural programs into account. For example, CaRet is able to match procedure calls with their corresponding future returns, and thus allows one to express fundamental program properties like total and partial correctness.

Keywords: Probabilistic Recursive Programs · Model Checking · Probabilistic Pushdown Automata · Visibly Pushdown Languages · CaRet.

### 1 Introduction

Probabilistic programs extend traditional programs with the ability to flip coins or, more generally, sample values from probability distributions. These programs can be used to encode randomized algorithms and randomized mechanisms in security [7] in a natural way. The interest in probabilistic programs has significantly increased in recent years. To a large extent, this is due to the search in AI for more expressive and succinct languages than probabilistic graphical models for Bayesian inference [17]. Probabilistic programs have many applications [24]. They are used in, amongst others, machine learning, systems biology, security, planning and control, quantum computing, and software-defined networks. Probabilistic variants of many programming languages exist.

Procedural programs allow for the declaration of procedures (small independent code blocks) and the ability to call procedures from one another, possibly in a recursive fashion. Most common programming languages such as C, Python, or Java support procedures. It is thus not surprising that recursion is a key ingredient in many modern probabilistic programming languages (PPL). In fact, many early approaches to extend Bayesian networks focused on incorporating recursion [26,19,11,27]. Randomized algorithms such as Hoare's quicksort with random pivot selection can be straightforwardly programmed using recursion. Recursion is also a first-class citizen in modeling rule-based dependencies between molecules or populations in systems biology (e.g., modeling reproduction).

Fig. 1. Recursive probabilistic program modeling the outbreak of an infectious disease. uniform(a, b) stands for the discrete uniform distribution on [a, b].

This work is supported by the DFG research training group 2236 UnRAVeL and the ERC advanced research grant 787914 FRAPPANT.

This paper studies the automated verification of probabilistic pushdown automata [14] (pPDA) as an explicit-state operational model of procedural probabilistic programs against temporal specifications. As a motivating example, let us consider a simple epidemiological model for the outbreak of an infectious disease in a large population where the number of susceptible individuals can be assumed to be infinite. Our example model distinguishes young and elderly persons. Each affected individual infects a uniformly distributed number of others, with varying rates (expected values) according to the age groups (Figure 2). The fatality rate for infected elderly and young persons is 1% and 0%, respectively. Initially, we assume there is a single infected young person, i.e., the overall program is started by calling infectYoung(). It is an easy task for any working programmer to specify this model as a discrete probabilistic program with mutually recursive procedures (Figure 1). Note that this program can be easily amended to more realistic models involving, e.g., more age or gender groups, other distributions, hospitalization rate, etc.

|   | Y   | E |
|---|-----|---|
| Y | 1.5 | 1 |
| E | 0.5 | 2 |

Fig. 2. Example infection rates by age groups.

The operational behavior of programs such as the one in Figure 1 can be naturally described by pPDA. The technical details of such a translation are beyond the scope of this paper but let us provide some intuition (more details can be found e.g. in [2]). Roughly, the local states of the procedures—the valuation of the local variables and the position of the program counter—constitute both the state space and the stack alphabet of the automaton. Procedure calls correspond to push transitions in the automaton in such a way that the program's procedure stack is simulated by the automaton's pushdown stack, i.e., the current local state is saved on top of the stack. Accordingly, returning from a procedure corresponds to taking a pop transition in order to restore the local state of the caller. Returning a value can be handled similarly. Clearly, if the reachable local state spaces of the involved procedures are finite, then the resulting automaton will be finite as well.

A number of relevant questions such as "Will the virus eventually become extinct?" (termination probability) or "What is the expected number of fatalities?" (expected costs) can be decided on finite pPDA (see [9] for a survey). In this work, we focus on temporal properties, e.g., questions that involve reasoning about the chronological order of certain events of interest during the epidemic. Chains of infection are one example. For instance, we might ask:

What is the probability that eventually a young person with only young persons in their chain of infection passes the virus on to an elderly person who then dies?

On the level of the program in Figure 1, this corresponds to the probability of reaching a global program configuration where the call stack only contains infectYoung() invocations and during execution of the current infectYoung(), the local variable f is eventually set to true. This requires reasoning about the nestings of calls and returns of a computation. In fact, in order to decide if f = true in the current procedure, we must "skip" over all calls within it and only consider their local return values. This requirement and many others can be rather naturally expressed in the logic CaRet [3], an extension of LTL:

$$
\Diamond^g \left( \Box^- p_Y \wedge p_Y \wedge \Diamond^a f \right)
$$

Here, p<sub>Y</sub> is an atomic proposition that holds at states which correspond to being in procedure infectYoung, and f indicates that f = true. Intuitively, the above formula states that eventually (outer ♦<sup>g</sup>), the computation reaches a (global) state where only infectYoung is on the call stack and the current procedure is infectYoung as well (□<sup>−</sup>p<sub>Y</sub> ∧ p<sub>Y</sub>), and moreover the local (aka abstract) path within the current procedure reaches a state where f is true (♦<sup>a</sup>f). Such properties are in general context-free but not always regular and thus cannot be expressed in LTL [3].

Technical Contribution. We are given a (finite) pPDA ∆ and a CaRet formula ϕ and we are interested in determining the probability that a random trajectory of ∆ satisfies ϕ. In order for this problem to be decidable [13], we need to impose a mild visibility restriction on ∆, yielding a probabilistic visibly pushdown automaton (pVPA). Just like several previous works on model checking pPDA against ω-regular specifications [14,10,21], we follow the automata-based approach (see Figure 3). More specifically, we first translate ϕ into an equivalent non-deterministic Büchi visibly pushdown automaton [4] (VPA) A and then determinize it using a result of [22]. The resulting DVPA D uses a so-called stair-parity [22] acceptance condition, which makes it strictly more expressive than DVPA with standard parity or Muller conditions [4]. Stair-parity differs from usual parity in that it only considers certain positions of an infinite word, called steps [22], where the stack height never decreases again. We then construct a standard product ∆ × D. Here, the visibility conditions ensure that the automata synchronize their stack actions, yielding a product automaton that uses a single stack instead of two independent ones, which would lead to undecidability [13]. Finally, we are left with computing a stair-parity acceptance probability in the product, which is itself a pPDA. This is achieved by constructing a specific finite Markov chain associated to ∆ × D, called step chain in this paper. Intuitively, the step chain jumps from one step of a run to the next, and therefore we only need to evaluate standard parity rather than stair-parity on the step chain. The idea of step chains is due to [14] where they were used to show decidability of model checking against deterministic non-pushdown Büchi automata. For constructing the step chain, certain termination probabilities of the pPDA need to be computed. These are in general algebraic numbers that cannot always be expressed by radicals [16], let alone by rationals. However, the relevant problems are still decidable via an encoding in the existential fragment of the FO-theory of the reals (ETR) [21].

Fig. 3. Chain of reductions used in this paper. ETR stands for existential theory of the reals, i.e., the existentially quantified fragment of the FO-theory over (R, +, ·, ≤).
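To give a feeling for such termination probabilities, the following Python sketch approximates them numerically for the epidemic model of the introduction. It is an illustration only and not the ETR-based procedure of this paper. Several ingredients are assumptions: a young person is taken to infect uniform(0, 3) young and uniform(0, 2) elderly persons, an elderly person uniform(0, 1) young and uniform(0, 4) elderly persons (matching expected values 1.5, 1, 0.5, and 2), all draws are independent, and the names `pgf_uniform` and `extinction_probabilities` are hypothetical. Under these assumptions the program terminates exactly when the infection dies out, and the termination probabilities are the least non-negative solution of a polynomial equation system, which the sketch approaches by fixed-point iteration from zero.

```python
def pgf_uniform(n, x):
    """Probability generating function of uniform(0, n), evaluated at x."""
    return sum(x ** k for k in range(n + 1)) / (n + 1)


def extinction_probabilities(iterations=200):
    """Approximate the least solution of the (assumed) equation system

        x_y = G_3(x_y) * G_2(x_e)
        x_e = G_1(x_y) * G_4(x_e)

    where G_n is the generating function of uniform(0, n), and x_y (x_e) is
    the probability that an outbreak started by one young (elderly) person
    dies out. Iterating from (0, 0) yields an increasing sequence of
    under-approximations of the least solution.
    """
    x_y, x_e = 0.0, 0.0
    for _ in range(iterations):
        x_y, x_e = (
            pgf_uniform(3, x_y) * pgf_uniform(2, x_e),
            pgf_uniform(1, x_y) * pgf_uniform(4, x_e),
        )
    return x_y, x_e


x_y, x_e = extinction_probabilities()
print(f"termination probability, young start:   {x_y:.4f}")
print(f"termination probability, elderly start: {x_e:.4f}")
```

The iteration only produces floating-point approximations. Exact statements, such as comparing a termination probability with a rational threshold, are what the ETR encoding is needed for.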

The resulting main contributions of this paper are complexity results, summarized in Figure 4, and algorithms for quantitative model checking of pPDA against ω-VPL given in terms of either deterministic automata, non-deterministic automata, or as CaRet formulae. As common in the literature, we consider the special case of qualitative, or almost-sure (a.-s.), model checking separately. To the best of our knowledge, none of these problems was known to be decidable before. The work of [13] proved decidability of model checking against deterministic Muller VPA which capture a strict subset of the CaRet-definable languages [4]. As a lemma of independent interest, we show that the step chain can be used for checking all kinds of measurable properties defined on steps, even beyond parity.

Related work. We have already mentioned various works on recursion in probabilistic graphical models (and PPL) as well as on verifying pPDA and the equivalent model of recursive Markov chains [16]. The analysis of these models focuses on reachability probabilities, ω-regular properties or (fragments of) probabilistic CTL, expected costs, and termination probabilities. The computation of termination probabilities in recursive Markov chains and variations thereof with non-determinism is supported by the software tool PReMo [29]. Our paper can be seen as a natural extension from checking pPDA against ω-regular properties to ω-visibly pushdown languages. In contrast to these algorithmic approaches, various deductive reasoning methods have been developed for recursive probabilistic programs. Proof rules for recursion were first provided in [20], and later extended to proof rules in a weakest-precondition reasoning style [23,25]. Olmedo et al. [25] also address the connection to pPDA and provide proof rules for expected run-time analysis. A mechanized method for proving properties of randomized algorithms, including recursive ones, for the Coq proof assistant is presented in [5]. The Coq approach is based on higher-order logic using a monadic interpretation of programs as probability distributions.

Fig. 4. Complexity results of this paper.

Organization. We review the basics about VPA and CaRet in Section 2. Section 3 introduces probabilistic visibly pushdown automata (pVPA). The stair-parity DVPA model checking procedure is presented in Section 4, and the results for Büchi VPA and CaRet in Section 5. We conclude the paper in Section 6.

### 2 Visibly Pushdown Languages

We fix some general notation for words first. Given a non-empty alphabet Σ, let Σ<sup>∗</sup> be the set of finite words (this includes the empty word ε), and let Σ<sup>ω</sup> be the set of infinite words over Σ. For i ≥ 0, the i-th symbol of a word w ∈ Σ<sup>∗</sup> ∪ Σ<sup>ω</sup> is denoted w(i) if it exists. |w| denotes the length of w.

### 2.1 Visibly Pushdown Automata

A finite alphabet Σ is called a pushdown alphabet if it is equipped with a partition Σ = Σ<sub>call</sub> ⊎ Σ<sub>int</sub> ⊎ Σ<sub>ret</sub> into three (possibly empty) subsets of call, internal, and return symbols. A visibly pushdown automaton [4] (VPA) over Σ is like a standard pushdown automaton with the additional syntactic restriction that reading a call or return symbol triggers a push or a pop transition, respectively. Reading an internal symbol, on the other hand, does not affect the stack at all.

Definition 1 (VPA [4]). Let Σ be a pushdown alphabet. A visibly pushdown automaton (VPA) over Σ is a tuple A = (S, s<sub>0</sub>, Γ, ⊥, δ, Σ) with S a finite set of states, s<sub>0</sub> ∈ S an initial state, Γ a finite stack alphabet, ⊥ ∈ Γ a special bottom-of-stack symbol, and δ = (δ<sub>call</sub>, δ<sub>int</sub>, δ<sub>ret</sub>) a triple of relations

δ<sub>call</sub> ⊆ (S × Σ<sub>call</sub>) × (S × Γ<sub>−⊥</sub>) , δ<sub>int</sub> ⊆ (S × Σ<sub>int</sub>) × S , δ<sub>ret</sub> ⊆ (S × Σ<sub>ret</sub> × Γ) × S

where Γ<sub>−⊥</sub> = Γ \ { ⊥ }. For s, t ∈ S, Z ∈ Γ, and a ∈ Σ, we use the shorthand notations s <sup>a</sup>−→ tZ, s <sup>a</sup>−→ t, sZ <sup>a</sup>−→ t to indicate that there exist transitions (s, a, t, Z) ∈ δ<sub>call</sub>, (s, a, t) ∈ δ<sub>int</sub>, (s, a, Z, t) ∈ δ<sub>ret</sub>, respectively. Note that e.g. s <sup>a</sup>−→ tZ implies implicitly that a ∈ Σ<sub>call</sub> and Z ≠ ⊥, and similarly for internal and return transitions. Intuitively, call transitions push a new symbol Z onto the stack, internal transitions ignore the stack, and return transitions pop the topmost symbol Z from the stack (unless Z = ⊥, in which case nothing is popped). A configuration of VPA A is a tuple (s, γ) ∈ S × Γ<sup>∗</sup>, written more succinctly as sγ in the sequel. Let w ∈ Σ<sup>ω</sup> be an infinite input word. An infinite sequence ρ = s<sub>0</sub>γ<sub>0</sub>, s<sub>1</sub>γ<sub>1</sub>, . . . of configurations is called a run of A on w if s<sub>0</sub>γ<sub>0</sub> = s<sub>0</sub>⊥ and for all i ≥ 0, exactly one of the following cases applies:

$$\begin{aligned} & -\ w(i) \in \Sigma_{\text{call}} \text{ and } \gamma_{i+1} = \gamma_i Z \text{ for some } Z \in \Gamma_{-\bot} \text{ such that } s_i \xrightarrow{w(i)} s_{i+1} Z; \text{ or} \\ & -\ w(i) \in \Sigma_{\text{int}} \text{ and } \gamma_{i+1} = \gamma_i \text{ and } s_i \xrightarrow{w(i)} s_{i+1}; \text{ or} \\ & -\ w(i) \in \Sigma_{\text{ret}} \text{ and either } \gamma_{i+1} Z = \gamma_i \text{ for some } Z \in \Gamma_{-\bot} \text{ such that } s_i Z \xrightarrow{w(i)} s_{i+1}, \\ & \phantom{-\ } \text{or } \gamma_i = \gamma_{i+1} = \bot \text{ and } s_i \bot \xrightarrow{w(i)} s_{i+1}. \end{aligned}$$
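The case distinction above can be read as a successor function on configurations. The following Python sketch is meant as an illustration only: the dictionary layout, the name `successors`, and the one-state example automaton are hypothetical and not taken from the paper.

```python
BOTTOM = "bot"  # stands for the bottom-of-stack symbol


def successors(vpa, state, stack, symbol):
    """Return all configurations reachable from (state, stack) by reading symbol.

    stack is a tuple whose first entry is BOTTOM and whose last entry is the top.
    """
    kind = vpa["kind"][symbol]
    result = []
    if kind == "call":
        for (s, a, t, z) in vpa["delta_call"]:
            if (s, a) == (state, symbol):
                result.append((t, stack + (z,)))  # push z
    elif kind == "int":
        for (s, a, t) in vpa["delta_int"]:
            if (s, a) == (state, symbol):
                result.append((t, stack))  # stack is left untouched
    else:  # return symbol
        top = stack[-1]
        for (s, a, z, t) in vpa["delta_ret"]:
            if (s, a, z) == (state, symbol, top):
                # pop the top symbol, except that BOTTOM is never popped
                result.append((t, stack if top == BOTTOM else stack[:-1]))
    return result


# Hypothetical one-state VPA over calls "c", internals "i", and returns "r".
example = {
    "kind": {"c": "call", "i": "int", "r": "ret"},
    "delta_call": {("s", "c", "s", "Z")},
    "delta_int": {("s", "i", "s")},
    "delta_ret": {("s", "r", "Z", "s"), ("s", "r", BOTTOM, "s")},
}

config = ("s", (BOTTOM,))
run = [config]
for symbol in "crr":
    (config,) = successors(example, *config, symbol)  # deterministic here
    run.append(config)
print(run)
```

In this run prefix, the second return is read with only the bottom symbol on the stack, so the stack stays unchanged, as in the last case of the definition.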

A Büchi acceptance condition for A is a subset F ⊆ S. A VPA equipped with a Büchi condition is called a Büchi VPA. An infinite word w ∈ Σ<sup>ω</sup> is accepted by a Büchi VPA if there exists a run s<sub>0</sub>γ<sub>0</sub>, s<sub>1</sub>γ<sub>1</sub>, . . . of A on w such that s<sub>i</sub> ∈ F for infinitely many i ≥ 0. The ω-language of words accepted by a Büchi VPA A is denoted L(A) ⊆ Σ<sup>ω</sup>.

Definition 2 (ω-VPL [4]). Let Σ be a pushdown alphabet. L ⊆ Σ<sup>ω</sup> is an ω-visibly pushdown language (ω-VPL) if L = L(A) for a Büchi VPA A over Σ.

A VPA is deterministic (DVPA) if it has exactly one run on each input word. In this case, δ<sub>call</sub>, δ<sub>int</sub>, and δ<sub>ret</sub> can be viewed as (total) functions. As for standard NBA, the class of languages recognized by Büchi DVPA is a strict subset of the languages recognized by non-deterministic Büchi VPA. Unlike in the non-pushdown case, DVPA with Muller or parity conditions are also strictly less expressive than non-deterministic Büchi VPA [4]. A deterministic automaton model for ω-VPL was given in [22]. It uses a so-called stair-parity acceptance condition which is the topic of the next subsection.

### 2.2 Steps and Stair-parity Conditions

Let us fix a pushdown alphabet Σ and a VPA A over Σ. Consider a run ρ = s<sub>0</sub>γ<sub>0</sub>, s<sub>1</sub>γ<sub>1</sub>, . . . of A on an infinite word w ∈ Σ<sup>ω</sup>. We define the stack height of the i-th configuration as sh(ρ(i)) = |γ<sub>i</sub>| − 1 (the bottom symbol ⊥ does not count towards the stack height). The stair-parity condition relies on the notion of steps:

Definition 3 (Step). Let ρ be a run of A. Position i ≥ 0 is a step of ρ if

$$\forall n \ge i \colon \quad sh(\rho(n)) \ge \; sh(\rho(i)) \; .$$

Fig. 5. Left: An example VPA (in fact, a DVPA) with Γ = {Z, ⊥} over input alphabet Σ = { c } ⊎ { τ } ⊎ { r }. Transitions labeled c, Z are call transitions which push Z on the stack, the transitions labeled with τ are internal ones that ignore the stack, and those labeled Z, r and ⊥, r are return transitions that are only enabled if Z (⊥, resp.) is on top of the stack; when executing Z, r we also pop Z from the stack. However, the special bottom-of-stack symbol ⊥ can never be popped (see e.g. pos. 1). Right: The unique run of the DVPA on input word τ r c τ τ c r c<sup>2</sup> r<sup>2</sup> c<sup>3</sup> r<sup>3</sup> . . . Steps are underlined.

Abusing terminology, we may also refer to the configurations at the step positions of a run as steps.

Example 1. Figure 5 depicts a DVPA and the initial fragment of its unique run ρ on the input word τ r c τ τ c r c<sup>2</sup> r<sup>2</sup> c<sup>3</sup> r<sup>3</sup> . . . The step positions are underlined, i.e., positions 0-5, 7, 11, and 17 are steps. Note that if ρ(i) = s⊥ for some s ∈ S then i is a step, i.e., bottom configurations are always steps.
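The step positions of Example 1 can be recomputed from the input word alone, since the stack height only depends on which symbols are calls and returns. The Python sketch below does this for a finite prefix; the function names are hypothetical, and "t" is used as an ASCII stand-in for τ. A finite prefix only determines steps under the assumption that the stack height never drops below its final value later on, which holds for the word of Example 1 because every later block of calls is followed by equally many returns.

```python
def stack_heights(word, calls, returns):
    """Stack height before each symbol of word, plus the height after the last one."""
    heights = [0]
    for symbol in word:
        h = heights[-1]
        if symbol in calls:
            h += 1
        elif symbol in returns:
            h = max(h - 1, 0)  # the bottom symbol is never popped
        heights.append(h)
    return heights


def steps_of_prefix(heights):
    """Positions whose height is never undercut at any later position of the prefix."""
    steps = []
    lowest_later = float("inf")
    for i in range(len(heights) - 1, -1, -1):
        if heights[i] <= lowest_later:
            steps.append(i)
        lowest_later = min(lowest_later, heights[i])
    return steps[::-1]


prefix = "trcttcrccrrcccrrr"  # t r c t t c r c^2 r^2 c^3 r^3
heights = stack_heights(prefix, calls={"c"}, returns={"r"})
print(steps_of_prefix(heights))  # [0, 1, 2, 3, 4, 5, 7, 11, 17]
```

The printed positions agree with the steps listed in Example 1.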

Steps play a central role in the rest of the paper. We therefore explain some of their fundamental properties.


Remark 1. One can also define the steps of a word w ∈ Σ<sup>ω</sup> as the positions where a run of any arbitrary VPA on w has a step. Due to the visibility restriction, the actual behaviour of the VPA does not influence the step positions [22]. In other words, the step positions are predetermined by the input word. Thus, we can also speak of the stack height sh(w(i)) of word w at position i.

We need one last notion before defining stair-parity. The footprint of an infinite run ρ = s<sub>0</sub>γ<sub>0</sub>, s<sub>1</sub>γ<sub>1</sub>, . . . is the infinite sequence ρ↓<sub>Steps</sub> = s<sub>n<sub>0</sub></sub> s<sub>n<sub>1</sub></sub> . . . ∈ S<sup>ω</sup> where for all i ≥ 0 the position n<sub>i</sub> is the i-th step of ρ. Phrased differently, ρ↓<sub>Steps</sub> is the projection of the run ρ onto the states occurring at its steps. For the example run in Figure 5 (right), ρ↓<sub>Steps</sub> = s<sub>0</sub>s<sub>1</sub>s<sub>1</sub>s<sub>0</sub>s<sub>1</sub><sup>ω</sup>.

Definition 4 (Stair-parity [22]). Let A be a VPA over pushdown alphabet Σ. A stair-parity acceptance condition for A is defined in terms of a priority function Ω : S → N<sub>0</sub>. A word w ∈ Σ<sup>ω</sup> is accepted if A has a run ρ on w such that

$$\min \left\{ k \in \mathbb{N}_0 \mid \exists^{\infty} i \colon \Omega(\rho{\downarrow}_{Steps}(i)) = k \right\}$$

is even. The language accepted by A is denoted L(A).

Example 2. The DVPA in Figure 5 with Ω(s<sub>0</sub>) = 1 and Ω(s<sub>1</sub>) = 2 accepts

$$\mathcal{L}_{repbdd} = \left\{ w \in \Sigma^{\omega} \mid \exists B \ge 0 \ \exists^{\infty} i \ge 0 \colon sh(w(i)) \le B \right\},$$

the language of repeatedly bounded words [22], i.e., words whose stack height (cf. Remark 1) is infinitely often at most a constant B. It is known that L<sub>repbdd</sub> is not expressible by DVPA with usual parity conditions [4].
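For a footprint that is ultimately periodic, the stair-parity condition can be evaluated directly. The Python sketch below assumes the usual parity reading of the condition (the least priority that occurs infinitely often in the footprint must be even) and represents the footprint as a finite prefix followed by a cycle that repeats forever. The function name and this representation are hypothetical. The first call uses the footprint of the example run, which starts with s0 s1 s1 s0 and then stays in s1, together with the priorities of Example 2.

```python
def stair_parity_accepts(prefix, cycle, priority):
    """Evaluate stair-parity on the footprint prefix · cycle · cycle · ...

    Only the states in cycle occur infinitely often, so prefix does not matter.
    """
    if not cycle:
        raise ValueError("cycle must be non-empty")
    return min(priority[state] for state in cycle) % 2 == 0


priority = {"s0": 1, "s1": 2}

# Footprint of the example run: only s1 recurs, its priority 2 is even.
print(stair_parity_accepts(["s0", "s1", "s1", "s0"], ["s1"], priority))  # True

# A footprint that visits s0 at infinitely many steps is rejected.
print(stair_parity_accepts([], ["s0", "s1"], priority))  # False
```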

Theorem 1 ([22, Thm. 1]). For every non-deterministic Büchi VPA A there exists a stair-parity DVPA D with 2<sup>O(|S|<sup>2</sup>)</sup> states such that L(A) = L(D). Moreover, D can be constructed in exponential time in the size of A.

It was also shown in [22] that stair-parity DVPA characterize exactly the class of ω-VPL (and are thus not more expressive than non-deterministic Büchi VPA).

### 2.3 CaRet, a Temporal Logic of Calls and Returns

Specifying requirements directly in terms of automata is tedious in practice. CaRet [3] is an extension of Linear Temporal Logic (LTL) that can be used to describe ω-VPL. Its syntax is defined as follows:

Definition 5 (CaRet [3]). Let AP be a finite set of atomic propositions. The logic CaRet adheres to the grammar

$$\varphi ::= p \mid \varphi \lor \varphi \mid \neg \varphi \mid \bigcirc^g \varphi \mid \varphi \,\mathcal{U}^g \varphi \mid \bigcirc^a \varphi \mid \varphi \,\mathcal{U}^a \varphi \mid \bigcirc^- \varphi \mid \varphi \,\mathcal{U}^- \varphi$$

where p ∈ AP ∪ { call, int,ret }.

Fig. 6. CaRet's various next modalities applied to the initial fragment of an example word. Call, internal, and return positions are depicted as boxes, circles, and rhombs, resp. Note that ◯<sup>a</sup> of position 3 is undefined because ◯<sup>g</sup> is a return.

Other common modalities such as ♦<sup>b</sup> and □<sup>b</sup> for b ∈ { g, a, − } are defined as usual via ♦<sup>b</sup>ϕ = true U<sup>b</sup> ϕ and □<sup>b</sup>ϕ = ¬♦<sup>b</sup>¬ϕ. We briefly explain the semantics of CaRet; the formal definition can be found in [3] or the full version [28]. We assume familiarity with LTL. CaRet formulae are interpreted over infinite words from the pushdown alphabet Σ = 2<sup>AP</sup> × { call, int, ret }. ◯<sup>g</sup> and U<sup>g</sup> are the standard next and until modalities from LTL (called global next and until in CaRet). CaRet extends LTL by two key operators, the caller modality ◯<sup>−</sup> and the abstract successor ◯<sup>a</sup>, see Figure 6. The former is a past modality that refers to the position of the last pending call. For internal and return symbols, the abstract successor ◯<sup>a</sup> behaves like ◯<sup>g</sup> unless the latter is a return, in which case ◯<sup>a</sup> is undefined (e.g. pos. 3 in the example). On the other hand, the abstract successor of a call symbol is its matching return if it exists, or undefined otherwise. The until modalities U<sup>−</sup> and U<sup>a</sup> are defined over the paths induced by the callers and abstract successors, respectively. Note that the caller path is always finite and the abstract path can be either finite or infinite. A prime application of CaRet is to state Hoare-like total correctness of a procedure F [3]:

$$\begin{array}{rcl} \varphi\_{total} & = & \Box^g \left( \mathsf{call} \wedge p \wedge p\_F \to \Box^a q \right) \end{array}$$

where p and q are atomic propositions that hold at the states where the pre- and post-condition is satisfied, respectively, and p<sub>F</sub> is an atomic proposition marking the calls to F. Another example is the language of repeatedly bounded words from Example 2; it is L<sub>repbdd</sub> = L(♦<sup>g</sup>□<sup>g</sup>(call → ◯<sup>a</sup> ret)). Further examples are given in [3]. The language defined by a CaRet formula ϕ is denoted L(ϕ).
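The informal description of the abstract successor given earlier in this subsection can be turned into a small computation on a finite word prefix. The Python sketch below is an illustration under assumptions: the function name is hypothetical, positions are represented only by their call/int/ret tag, and `None` means that the abstract successor is either undefined or not determined by the prefix (for instance, a call whose matching return lies beyond the prefix).

```python
def abstract_successors(tags):
    """Abstract successor of every position of a finite prefix.

    tags is a list over {"call", "int", "ret"}.
    """
    # Match every call with its return, if that return occurs in the prefix.
    matching_return = {}
    pending_calls = []
    for i, tag in enumerate(tags):
        if tag == "call":
            pending_calls.append(i)
        elif tag == "ret" and pending_calls:
            matching_return[pending_calls.pop()] = i

    result = []
    for i, tag in enumerate(tags):
        if tag == "call":
            # a call jumps to its matching return
            result.append(matching_return.get(i))
        elif i + 1 < len(tags) and tags[i + 1] != "ret":
            # internal and return positions behave like the global successor ...
            result.append(i + 1)
        else:
            # ... unless the next position is a return (or lies beyond the prefix)
            result.append(None)
    return result


print(abstract_successors(["call", "int", "ret", "int", "ret"]))
# [2, None, 3, None, None]
```

In the printed example, the call at position 0 skips over the body of the called procedure to its return at position 2, while position 1 has no abstract successor because the next symbol is a return.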

Theorem 2 ([1, Thm. 5.1]). CaRet-definable languages are ω-VPL: For each CaRet formula ϕ there exists a (non-deterministic) Büchi VPA A such that L(ϕ) = L(A), and A can be constructed in time 2<sup>O(|ϕ|)</sup>.

The above theorem is well-known in the literature [1,2] even though it is usually stated for Nested Word Automata (NWA) which are equivalent to VPA, and it is more common to state a space bound on A rather than a time bound for the construction. The theorem also applies to more expressive extensions of CaRet [1] which we do not consider here for the sake of simplicity.

### 3 Probabilistic Visibly Pushdown Automata

As explained in the introductory section, we employ probabilistic pushdown automata [14] (pPDA) as an operational model for procedural probabilistic programs. pPDA thus play a fundamentally different role in this paper than VPA (cf. Definition 1): While the former are used to model the system, the latter encode the specification. Consequently, our pPDA do not read an input word like VPA do, but instead take their transitions randomly, according to fixed probability distributions. In this way, they define a probability space over their possible traces, i.e., runs projected on their labeling sequence. These traces constitute the input words of the VPA. In order for the model checking problems to be decidable [13], a syntactic visibility restriction related—but not exactly analogous—to the one required by VPA needs to be imposed on pPDA. In a nutshell, the condition is that each state only has outgoing transitions of one type, i.e., push, internal, or pop. This means that the stack operation is visible in the states (recall that for VPA, the stack operation is visible in the input symbol). This restriction is not severe in the context of modeling programs (see Remark 2 further below) and leads to our notion of probabilistic visibly pushdown automata (pVPA) which we now define formally.

Given a finite set X, we write D(X) = { f : X → [0, 1] | Σ<sub>a∈X</sub> f(a) = 1 } for the set of probability distributions on X.

Definition 6 (pVPA). A probabilistic visibly pushdown automaton (pVPA) is a tuple ∆ = (Q, q0, Γ, ⊥, P, Σ, λ) where Q is a finite set of states partitioned into Q = Qcall ⊎ Qint ⊎ Qret, q0 ∈ Q is an initial state, Γ is a finite stack alphabet, ⊥ ∈ Γ is a special bottom-of-stack symbol, P = (Pcall, Pint, Pret) is a triple of functions with signature

$$P\_{\mathsf{call}} : Q\_{\mathsf{call}} \to \mathcal{D}(Q \times \Gamma\_{-\perp}), \qquad P\_{\mathsf{int}} : Q\_{\mathsf{int}} \to \mathcal{D}(Q), \qquad P\_{\mathsf{ret}} : Q\_{\mathsf{ret}} \times \Gamma \to \mathcal{D}(Q),$$

Σ = Σcall ⊎ Σint ⊎ Σret is a pushdown alphabet, and λ: Q → Σ is a state labeling function consistent with the visibility condition, i.e., for all type ∈ {call, int, ret} and all q ∈ Q, we have that q ∈ Qtype iff λ(q) ∈ Σtype.

Intuitively, the behavior of a pVPA ∆ is as follows. If the current state q is a call state, then the probability distribution Pcall(q) determines a random successor state and stack symbol to be pushed on the stack (⊥ cannot be pushed). Similarly, if the current state is internal, then Pint(q) is the distribution over possible successor states and the stack is ignored completely. Lastly, if the current state is a return state and symbol Z ∈ Γ is on top of the stack, then Pret(q, Z) determines the probability distribution of successor states, and additionally Z is removed from the stack. As for VPA, the bottom symbol ⊥ is the only exception to this rule: it can never be removed. Note that pVPA generalize labeled Markov chains, which correspond to the special case Q = Qint.

We now define the semantics of pVPA more formally. For q, r ∈ Q, Z ∈ Γ and p > 0 we use the shorthand notations q →<sup>p</sup> rZ, q →<sup>p</sup> r, and qZ →<sup>p</sup> r to indicate that Pcall(q)(r, Z) = p, Pint(q)(r) = p, and Pret(q, Z)(r) = p, respectively. As for VPA, a configuration of a pVPA is an element qγ ∈ Q × Γ<sup>∗</sup>. An (infinite) run of a pVPA is a sequence of configurations ρ = q0γ0, q1γ1, . . . such that q0γ0 = q0⊥ and for all i ≥ 0 we have that either

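$$1. \ q\_i \in Q\_{\mathsf{call}}, \ \gamma\_{i+1} = \gamma\_i Z \text{ for some } Z \in \Gamma\_{-\perp} \text{ and } q\_i \xrightarrow{p} q\_{i+1} Z;$$

$$2. \ q\_i \in Q\_{\mathsf{int}}, \ \gamma\_{i+1} = \gamma\_i \text{ and } q\_i \xrightarrow{p} q\_{i+1}; \text{ or}$$

$$3. \ q\_i \in Q\_{\mathsf{ret}} \text{ and either } \gamma\_i = \gamma\_{i+1} Z \text{ for some } Z \in \Gamma\_{-\perp} \text{ and } q\_i Z \xrightarrow{p} q\_{i+1}, \text{ or } \gamma\_i = \gamma\_{i+1} = \perp \text{ and } q\_i \perp \xrightarrow{p} q\_{i+1}.$$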
Note that our pVPA only produce infinite runs and do not simply "terminate" upon reaching the empty stack as in, e.g., [14]. In fact, in our case the stack cannot be empty due to the special bottom symbol ⊥ that can never be popped. We have chosen to avoid finite pVPA runs for compatibility with CaRet, which describes ω-languages by definition. Nonetheless, terminating behavior can be easily simulated in our framework by moving to a dedicated sink state once the pVPA attempts to pop ⊥ for the first time.
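To make the run semantics concrete, the following is a minimal sketch that samples a finite prefix of a run. The class name `PVPA`, the function `run_prefix`, the dictionary encoding of Pcall, Pint, and Pret, and the toy automaton are all illustrative assumptions and do not come from the paper.

```python
import random
from dataclasses import dataclass

BOTTOM = "⊥"  # bottom-of-stack symbol: never pushed, never popped


@dataclass
class PVPA:
    """Unlabeled pVPA; the state type is implied by which table contains the state."""
    q0: str
    p_call: dict  # q -> {(r, Z): probability}
    p_int: dict   # q -> {r: probability}
    p_ret: dict   # (q, Z) -> {r: probability}


def sample(dist, rng):
    """Draw one outcome from a finite distribution given as {outcome: probability}."""
    outcomes = list(dist)
    return rng.choices(outcomes, weights=[dist[o] for o in outcomes], k=1)[0]


def run_prefix(pvpa, length, rng):
    """Return the first length + 1 configurations (state, stack) of a random run."""
    q, stack = pvpa.q0, [BOTTOM]
    prefix = [(q, tuple(stack))]
    for _ in range(length):
        if q in pvpa.p_call:            # call state: push a symbol
            q, z = sample(pvpa.p_call[q], rng)
            stack.append(z)
        elif q in pvpa.p_int:           # internal state: stack is ignored
            q = sample(pvpa.p_int[q], rng)
        else:                           # return state: read the top symbol
            top = stack[-1]
            q = sample(pvpa.p_ret[(q, top)], rng)
            if top != BOTTOM:           # the bottom symbol is never removed
                stack.pop()
        prefix.append((q, tuple(stack)))
    return prefix


# Hypothetical toy automaton: call state "c", internal state "i", return state "r".
toy = PVPA(
    q0="c",
    p_call={"c": {("c", "Z"): 0.5, ("i", "Z"): 0.5}},
    p_int={"i": {"r": 1.0}},
    p_ret={("r", "Z"): {"r": 1.0}, ("r", BOTTOM): {"c": 1.0}},
)
```

Calling `run_prefix(toy, 50, random.Random(0))` yields 51 configurations, each with ⊥ at the bottom of the stack. Since runs are infinite, any simulation can only ever inspect such a finite prefix.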

The set of all runs of a pVPA ∆ is denoted Runs∆. We extend ∆'s labeling function λ to runs ρ ∈ Runs∆ by applying it to each state along ρ individually, yielding a word λ(ρ) ∈ Σ<sup>ω</sup>. Steps of pVPA runs are defined as in Definition 3. An example pVPA and its possible runs are depicted in Figure 7.

We can view the set of all configurations Q × Γ <sup>∗</sup> as the (infinite) state space of a discrete-time Markov chain. In this way, we obtain a probability space (Runs∆, F, P) via the usual cylinder set construction [6, Ch. 10].

Remark 2. The visibility restriction of our pVPA is slightly different from the definition given in [13] which requires all incoming transitions to a state to be of the same type, i.e., call, internal, or return. Our definition, on the other hand, imposes the same requirement on the states' outgoing transitions. We believe that our condition is more natural for pVPA obtained from procedural programs, such as the one in Figure 1. In fact, programs where randomness is restricted to internal statements such as x := bernoulli(0.5) or x := uniform(0, 3) naturally comply with our visibility condition because all call and return states of such programs are deterministic and thus cannot violate visibility. However, the alternative condition of [13] is not necessarily fulfilled for such programs.

We can now formally state our main problem of interest:

Definition 7 (Probabilistic CaRet Model Checking). Let AP be a finite set of atomic propositions, ϕ be a CaRet formula over AP, ∆ be a pVPA with labels from the pushdown alphabet Σ = 2<sup>AP</sup> × { call, int, ret }, and θ ∈ [0, 1] ∩ Q. The quantitative CaRet Model Checking problem is to decide whether

$$\mathbb{P}(\{\rho \in Runs\_{\Delta} \mid \lambda(\rho) \in \mathcal{L}(\varphi)\}) \geq\_{?} \theta \ . $$

The qualitative CaRet Model Checking problem is the special case where θ = 1.

The probabilities in Definition 7 are well-defined as ω-VPL are measurable [22].

### 4 Model Checking against Stair-parity DVPA

In this section, we show that model checking pVPA (Definition 6) against VPL given in terms of a stair-parity DVPA (Definition 4) is decidable. This is achieved by first computing an automata-theoretic product of the pVPA and the DVPA and then evaluating the acceptance condition in the product automaton.

#### 4.1 Products of Visibly Pushdown Automata

In general, pushdown automata are not closed under taking products as this would require two independent stacks. However, the visibility conditions on VPA and pVPA ensure that their product is again an automaton with just a single stack because the stack operations (push, nop, or pop) are forced to synchronize.

We now define the product formally. An unlabeled pVPA is a pVPA where the labeling function λ and alphabet Σ are omitted.

Definition 8 (Product ∆ × D). Let ∆ = (Q, q0, Γ, ⊥, P, Σ, λ) be a pVPA, and D = (S, s0, Γ′, ⊥, δ, Σ) be a DVPA over pushdown alphabet Σ. The product of ∆ and D is the unlabeled pVPA

$$
\Delta \times \mathcal{D} \,=\,\left(Q \times S,\,\left(q\_0, s\_0\right),\,\,\Gamma \times \Gamma',\,\,\langle \perp, \perp \rangle,\,\, P\_{\Delta \times \mathcal{D}}\right),
$$

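where P∆×D is the smallest set of transitions satisfying the following rules for all q, r ∈ Q, Z ∈ Γ, s, t ∈ S, and Y ∈ Γ′:

$$
\frac{q \xrightarrow{p}\_{\Delta} rZ \;\land\; s \xrightarrow{\lambda(q)}\_{\mathcal{D}} tY}{(q, s) \xrightarrow{p}\_{\Delta \times \mathcal{D}} (r, t)\langle Z, Y \rangle}\;(\text{call})
\qquad
\frac{q \xrightarrow{p}\_{\Delta} r \;\land\; s \xrightarrow{\lambda(q)}\_{\mathcal{D}} t}{(q, s) \xrightarrow{p}\_{\Delta \times \mathcal{D}} (r, t)}\;(\text{internal})
\qquad
\frac{qZ \xrightarrow{p}\_{\Delta} r \;\land\; sY \xrightarrow{\lambda(q)}\_{\mathcal{D}} t}{(q, s)\langle Z, Y \rangle \xrightarrow{p}\_{\Delta \times \mathcal{D}} (r, t)}\;(\text{return})
$$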

If the DVPA D is equipped with a priority function Ω : S → N0, then we extend Ω to Ω′ : Q × S → N0 via Ω′(q, s) = Ω(s).
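The three product rules can be sketched as a direct table construction. This is only an illustration under an assumed encoding: the function name `product`, the dictionary layout of the pVPA and DVPA transition tables, and the toy automata are hypothetical.

```python
from fractions import Fraction


def product(p_call, p_int, p_ret, label, d_call, d_int, d_ret, d_states, d_stack):
    """Build the transition tables of the product of a pVPA and a DVPA.

    pVPA tables: p_call[q] = {(r, Z): p}, p_int[q] = {r: p}, p_ret[(q, Z)] = {r: p}.
    DVPA tables: d_call[(s, a)] = (t, Y), d_int[(s, a)] = t, d_ret[(s, Y, a)] = t.
    """
    x_call, x_int, x_ret = {}, {}, {}
    for s in d_states:
        for q, dist in p_call.items():          # rule (call)
            t, y = d_call[(s, label[q])]
            x_call[(q, s)] = {((r, t), (z, y)): p for (r, z), p in dist.items()}
        for q, dist in p_int.items():           # rule (internal)
            t = d_int[(s, label[q])]
            x_int[(q, s)] = {(r, t): p for r, p in dist.items()}
        for (q, z), dist in p_ret.items():      # rule (return)
            for y in d_stack:
                t = d_ret.get((s, y, label[q]))
                if t is not None:               # skip pairs the DVPA does not define
                    x_ret[((q, s), (z, y))] = {(r, t): p for r, p in dist.items()}
    return x_call, x_int, x_ret


# Hypothetical toy instance: pVPA with call state "c" and return state "r".
half = Fraction(1, 2)
p_call = {"c": {("c", "Z"): half, ("r", "Z"): half}}
p_int = {}
p_ret = {("r", "Z"): {"r": Fraction(1)}, ("r", "⊥"): {"c": Fraction(1)}}
label = {"c": "call", "r": "ret"}

# Hypothetical DVPA with states s0, s1 and stack symbols ⊥, Y.
d_states, d_stack = ["s0", "s1"], ["⊥", "Y"]
d_call = {(s, "call"): ("s1", "Y") for s in d_states}
d_int = {}
d_ret = {(s, "Y", "ret"): "s1" for s in d_states}
d_ret.update({(s, "⊥", "ret"): "s0" for s in d_states})

x_call, x_int, x_ret = product(p_call, p_int, p_ret, label,
                               d_call, d_int, d_ret, d_states, d_stack)
```

The sketch also emits return entries for mixed pairs such as ⟨Z, ⊥⟩, which can never appear on the product stack because both components push and pop in lockstep. Each product distribution simply copies the pVPA probabilities, so it again sums to 1.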

It is not difficult to show that ∆×D is indeed a well-defined pVPA and moreover satisfies the following property (the proof is standard, see [28]):

Lemma 1 (Soundness of ∆ × D). Let ∆ be a pVPA and D be a stair-parity DVPA with priority function Ω, both over pushdown alphabet Σ. Then the product pVPA ∆ × D with priority function Ω′ as in Definition 8 satisfies

P({ ρ ∈ Runs∆ | λ(ρ) ∈ L(D) }) = P({ ρ ∈ Runs∆×D | ρ↓Steps ∈ ParityΩ′ }),

where ParityΩ′ denotes the set of words in (Q × S)<sup>ω</sup> satisfying the standard parity condition defined by Ω′. Moreover, ∆ × D can be constructed in polynomial time.

Remark 3. It is not actually important that the product satisfies the visibility condition. All techniques we apply to the product also work for general pPDA.

#### 4.2 Stair-parity Acceptance Probabilities in pVPA

Lemma 1 effectively reduces model checking pVPA against stair-parity DVPA to computing stair-parity acceptance in the product, which is again an (unlabeled) pVPA. We therefore focus on pVPA in this section and do not consider DVPA.

Throughout the rest of this section, let ∆ = (Q, q0, Γ, ⊥, P) be an unlabeled pVPA. In the following we describe the construction of a finite Markov chain M∆ that we call the step chain of ∆. Loosely speaking, M∆ simulates jumping from one step (see Definition 3) of a run of ∆ to the next. A similar idea first appeared in [14]. Our construction, however, differs from the original one in various aspects. We discuss this in detail in Remark 5 further below.

Steps as events. For all n ∈ N0, we define a random variable V<sup>(n)</sup> on Runs∆ whose value is either the state q of ∆ at the n-th step, or the extended state q⊥ in the special case where the n-th step occurs at a bottom configuration of the form q⊥, for some q ∈ Q. We denote the set of all such extended states with Q⊥ = { q⊥ | q ∈ Q }. Formally, V<sup>(n)</sup> : Runs∆ → Q ∪ Q⊥ is defined as

$$V^{(n)}(\rho) \quad = \begin{cases} q & \text{if } \operatorname{step}\_n(\rho) = q\gamma \text{ and } \gamma \neq \bot \\ q \bot & \text{if } \operatorname{step}\_n(\rho) = q\bot \end{cases},$$

where step<sub>n</sub>(ρ) denotes the configuration at the n-th step of ρ. Note that V<sup>(0)</sup> = q0⊥ because the first position of a run is always a step.

Lemma 2. For all n ∈ N0 and v ∈ Q ∪ Q⊥, the event V<sup>(n)</sup> = v is measurable, and thus V<sup>(n)</sup> is a well-defined random variable.

We can view the sequence V<sup>(0)</sup>, V<sup>(1)</sup>, . . . of random variables as a stochastic process. It is intuitively clear that for all n ∈ N0, the value of V<sup>(n+1)</sup> depends only on V<sup>(n)</sup>, but not on V<sup>(i)</sup> for i < n. This is due to the more general observation that the state q at any step configuration qγ (with γ ≠ ⊥) fully determines the future of the run: being a step already implies that no symbol in γ can ever be read, since reading a symbol implies popping it from the stack. In particular, q determines the probability distribution over possible next steps. A similar observation applies to bottom configurations of the form q⊥. Phrased in probability-theoretic terms, the process V<sup>(0)</sup>, V<sup>(1)</sup>, . . . has the Markov property, i.e.,

$$\mathbb{P}(V^{(n)}=v\_n \mid V^{(n-1)}=v\_{n-1} \land \dots \land V^{(0)}=v\_0) = \mathbb{P}(V^{(n)}=v\_n \mid V^{(n-1)}=v\_{n-1}) \tag{1}$$

holds for all values of v<sub>0</sub>, . . . , v<sub>n</sub> such that the above conditional probabilities are well-defined<sup>1</sup>. This was proved in detail in [14]. It is also clear that the Markov process is time-homogeneous in the sense that

$$\mathbb{P}(V^{(n+1)}=v \mid V^{(n)}=v') \quad = \quad \mathbb{P}(V^{(n'+1)}=v \mid V^{(n')}=v')$$

holds for all n, n′ ∈ N0 for which the two conditional probabilities are well-defined. The following example provides some intuition on these facts.

Example 3. Consider the pVPA in Figure 7 (left). The initial fragments of its two equiprobable runs are depicted in the middle. In this example, it is easy to read off the next-step probabilities P(V<sup>(n)</sup> = v<sub>n</sub> | V<sup>(n−1)</sup> = v<sub>n−1</sub>) for all n ∈ N0 and v<sub>n</sub>, v<sub>n−1</sub> ∈ Q ∪ Q⊥. They are summarized in the Markov chain on the right. For example, V<sup>(0)</sup> = q0⊥ holds with probability 1, and V<sup>(1)</sup> = q1 and V<sup>(1)</sup> = q3⊥ hold with probability 1/2 each because the second step occurs either at position 1 with configuration q1⊥Z or at position 3 with configuration q3⊥, and both options are equally likely. The case P(V<sup>(2)</sup> = q2 | V<sup>(1)</sup> = q1) = 1 is slightly more interesting: Given that a configuration q1γ with γ ≠ ⊥ is a step, we know that the next state must be q2 (which is then also a step). Even though there is a transition from q1 to q3 in ∆, the next state cannot be q3 because the latter is a return state which would immediately decrease the stack height of γ. This shows that, intuitively speaking, conditioning on being a step influences the probabilities of a state's outgoing transitions.

<sup>1</sup> A conditional probability is well-defined if the condition, i.e., the event on the right-hand side of the vertical bar, has positive probability. Expressions like the one in (1) are thus not necessarily well-defined because the probability that V<sup>(n−1)</sup> = v<sub>n−1</sub> might be zero for certain values of n and v<sub>n−1</sub>.

Fig. 7. Left: An example (unlabeled) pVPA ∆. Call, internal, and return states are depicted as squares, circles, and rhombs, respectively. The format of the transition labels is analogous to Figure 5 (left). Middle: Initial fragments of the two possible runs of ∆. Steps are underlined. Right: Its step Markov chain M∆ (Definition 10).

Fig. 8. Next-step probabilities of the step Markov chain. Ptype for type ∈ { call, int, ret } are the probabilities of the pVPA's call, internal, and return transitions, respectively. The values [r′Z↓r] and [q↑] are the return and diverge probabilities from Definition 9.

Probabilities of next steps, returns, and diverges. Our next goal is to provide expressions for the next-step probabilities P(V<sup>(n+1)</sup> = v′ | V<sup>(n)</sup> = v) as we did in Example 3. It turns out that those can be stated in terms of the return and diverge probabilities of ∆.

Definition 9. Let p, q ∈ Q, Z ∈ Γ, and γ ∈ Γ ∗ . We define


Note that [p↑] is indeed independent of Z because the only way to read Z is by popping it from the stack, which decreases the stack height. The diverge probabilities are closely related to steps. Indeed, the probability that a configuration pγ with γ ≠ ⊥ is a step is equal to [p↑]. For example, in the pVPA in Figure 7 the configuration q1⊥Z is a step with probability [q1↑] = 1/2.

It is known that the return and diverge probabilities are in general non-rational. As a minimal example, consider a pVPA that repeats the following steps until emptying its stack or getting stuck: (i) It pushes four symbols with probability 1/6, or (ii) pops one symbol with probability 1/2, or (iii) gets stuck otherwise. The resulting return probability is the least solution of x = (1/6)x<sup>5</sup> + 1/2, a non-rational number that is not even solvable by radicals [16, Thm. 3.2(1)].
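Although the least solution of this equation has no closed form in radicals, it can be approximated numerically. The following sketch uses plain fixed-point (Kleene) iteration from 0, which converges to the least fixed point because the right-hand side is monotone and continuous on [0, 1]. The helper name `least_fixed_point` and the tolerance are illustrative choices, not taken from the paper.

```python
def least_fixed_point(f, tol=1e-15, max_iter=10_000):
    """Iterate x <- f(x) starting at 0 until successive values differ by at most tol."""
    x = 0.0
    for _ in range(max_iter):
        nxt = f(x)
        if abs(nxt - x) <= tol:
            return nxt
        x = nxt
    return x


def rhs(x):
    """Right-hand side of x = (1/6) x^5 + 1/2 from the example above."""
    return x ** 5 / 6 + 0.5


x_ret = least_fixed_point(rhs)
```

The iteration settles at roughly 0.5055. Such a floating-point approximation is useful for intuition only: exact comparisons against a threshold require the symbolic treatment of these probabilities described in the text.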

Remark 4. The terms return and diverge are natural. When modeling procedural probabilistic programs as pVPA, [pZ↓q] is just the probability to eventually return from local state p of the current procedure to local state q of the calling procedure (the return address is stored on the stack in Z). Similarly, [p↑] is the probability that the current procedure diverges, i.e., it never returns to the calling context. Clearly, this is independent of the return address.

Lemma 3. The conditional next-step probabilities in Figure 8 are correct in the sense that if P(V<sup>(n+1)</sup> = v′ | V<sup>(n)</sup> = v) is defined for n ∈ N0 and v, v′ ∈ Q ∪ Q⊥ then it is equal to the probability in the respective column "v → v′".

Proof sketch. We only provide some intuition for two important cases; formal derivations are in [28]. Let r ∈ Q be arbitrary.


The step chain. It is convenient to view the stochastic process V (0), V (1) . . . as an explicit (graphical) Markov chain.

Definition 10 (The Step Chain M∆). M∆ is the Markov chain with states

$$M = \{ q \in Q\_{\text{call}} \cup Q\_{\text{int}} \mid [q\uparrow] > 0 \} \cup Q\_{\bot},$$

initial state q0⊥, and for all v, v′ ∈ M, the probability of transition v → v′ is defined according to Figure 8.

Fig. 9. Left: Example pVPA with the following return-diverge probabilities: [cZ↓c] = 1/6, [cZ↓r] = 1/12, [rZ↓r] = 1/3, [rZ↓c] = 2/3, and [c↑] = 3/4, [τ↑] = 1/2, [r↑] = 0. Even though it is the case here, these probabilities are not always rational [16]. Right: Its step Markov chain according to Definition 10. The transition probabilities can be computed using the return and diverge probabilities and Figure 8.

Figure 9 depicts a non-trivial pVPA and its step chain. In this example, all return and diverge probabilities are rational. In general, however, the return and diverge probabilities (Definition 9) are algebraic numbers that are not always rational or even expressible by radicals [16]. As a consequence, one cannot easily perform numerical computations on the step chain. However, the probabilities can be encoded implicitly as the unique solution of an existential theory of the reals (ETR) formula, i.e. an existentially quantified FO-formula over (R, +, ·, ≤) [14]. Since the ETR is decidable, many questions about the step chain are still decidable as well. We will make use of this in Theorem 3 below.
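As a quick consistency check on the values listed for Figure 9, one can use the reading of Remark 4: a procedure either eventually returns to some state of the caller or diverges. Under the assumption that these two events are complementary for a non-bottom stack symbol Z, the diverge probability is one minus the sum of the return probabilities, which matches the stated numbers for c and r:

$$[c{\uparrow}] = 1 - \big([cZ{\downarrow}c] + [cZ{\downarrow}r]\big) = 1 - \big(\tfrac{1}{6} + \tfrac{1}{12}\big) = \tfrac{3}{4}, \qquad [r{\uparrow}] = 1 - \big([rZ{\downarrow}r] + [rZ{\downarrow}c]\big) = 1 - \big(\tfrac{1}{3} + \tfrac{2}{3}\big) = 0.$$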

The property of M∆ that is most relevant to us is given by the following Lemma 4. We call ρ⇓Steps = V<sup>(0)</sup>(ρ)V<sup>(1)</sup>(ρ) . . . the extended footprint of run ρ.

Lemma 4 (Soundness of M∆). Let ∆ be a pVPA with step chain M∆. Let M be the states of the step chain and consider a measurable set R ⊆ M<sup>ω</sup>. Then

$$\mathbb{P}(\{\rho \in Runs\_{\Delta} \mid \rho{\Downarrow}\_{Steps} \in R\}) \quad = \quad \mathbb{P}(R) \ .$$

Proof sketch. For basic cylinder sets of the form R = w · M<sup>ω</sup> for some w ∈ M<sup>∗</sup>, the claim follows from the Markov property (1) together with the correctness of the transition probabilities of M∆ according to Lemma 3. For other measurable sets, it can be shown by induction over the levels of the Borel hierarchy [28].

Remark 5. The step chain as presented here differs from the original definition in [14] in at least two important aspects. First, we have to take the semantics of our special bottom symbol ⊥ into account. This is why our chain uses a subset of Q ∪ Q⊥ as states—it must distinguish whether a step occurs at a bottom configuration. The pPDA in [14], on the other hand, may have both finite and infinite runs, and this needs to be handled differently in the step chain. Second, we use step chains for a different purpose than [14], namely to show that general measurable properties defined on steps—this includes stair-parity—can be evaluated on pVPA (Lemma 4).

Fig. 10. Left: The product of the pVPA from Figure 9 (left) and the DVPA from Figure 5 (left). Right: Its step chain according to Definition 10. The dashed region is the only BSCC. It violates the parity condition Ω(s0) = 1 and Ω(s1) = 2 inherited from the DVPA (see Example 2) since every run reaching the BSCC visits cs0 infinitely often with probability 1. Only reachable states are depicted.

Putting it all together. We can now prove the main result of this section.

Theorem 3. Let ∆ be a pVPA and let D be a stair-parity DVPA, both over the same pushdown alphabet Σ. Then for all θ ∈ [0, 1] ∩ Q, the problem P({ρ ∈ Runs∆ | λ(ρ) ∈ L(D)}) ≥? θ is decidable in PSPACE.

Proof sketch. We first construct the product ∆ × D according to Definition 8. By Lemma 1 we need to compute the stair-parity acceptance probability of ∆ × D. Lemma 4 reduces this to computing a usual parity acceptance probability in the step chain M∆×D. This can be achieved by finding the bottom strongly connected components (BSCCs) of M∆×D, classifying them as good (the minimum priority of a BSCC state is even) or bad (otherwise), and running a standard reachability analysis with respect to the good states. See Figure 10 for an example. The remaining technical difficulty is that the transition probabilities of M∆×D are not rational in general. However, this can be dealt with using the fact that these probabilities are expressible in the ETR [14] (see [28] for the details).
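The BSCC classification and reachability analysis can be illustrated on a finite Markov chain whose transition probabilities are given numerically. This is a simplifying assumption: in the step chain the probabilities are in general only available implicitly via the ETR. The function names, the dictionary encoding, and the three-state example are hypothetical, and the reachability probability is approximated by value iteration rather than solved exactly.

```python
def reachable(chain, start):
    """All states reachable from start (including start) via positive-probability edges."""
    seen, todo = {start}, [start]
    while todo:
        u = todo.pop()
        for v in chain[u]:
            if v not in seen:
                seen.add(v)
                todo.append(v)
    return seen


def bsccs(chain):
    """Bottom SCCs: u lies in one iff every state reachable from u can reach u again."""
    reach = {u: reachable(chain, u) for u in chain}
    found = []
    for u in chain:
        if all(u in reach[v] for v in reach[u]):
            comp = frozenset(reach[u])
            if comp not in found:
                found.append(comp)
    return found


def parity_probability(chain, priority, init, sweeps=200):
    """Approximate the probability that the minimum priority seen infinitely often is even."""
    good, bad = set(), set()
    for comp in bsccs(chain):
        target = good if min(priority[v] for v in comp) % 2 == 0 else bad
        target.update(comp)
    # Value iteration from below for the probability of reaching a good BSCC.
    x = {u: 1.0 if u in good else 0.0 for u in chain}
    for _ in range(sweeps):
        x = {
            u: x[u] if u in good or u in bad
            else sum(p * x[v] for v, p in chain[u].items())
            for u in chain
        }
    return x[init]


# Hypothetical chain: transient state "u", good BSCC {"g"}, bad BSCC {"b"}.
chain = {
    "u": {"u": 0.5, "g": 0.125, "b": 0.375},
    "g": {"g": 1.0},
    "b": {"b": 1.0},
}
priority = {"u": 1, "g": 2, "b": 1}
result = parity_probability(chain, priority, "u")
```

In this example the good BSCC is reached with probability 0.125 / (0.125 + 0.375) = 1/4, and `result` approximates that value. The classification step is sound because almost every run that enters a BSCC visits all of its states infinitely often.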

#### 4.3 Probabilistic One-counter Automata

A probabilistic visibly one-counter automaton (pVOC) is the special case of a pVPA with unary stack alphabet, i.e., |Γ<sub>−⊥</sub>| = 1. For example, the pVPA in Figure 9 (left) is a pVOC. For many problems, better complexity bounds are known for pVOC than for the general case. In particular, [p↑] > 0 can be decided in P [9, Thm. 4]. We can exploit this to improve Theorem 3 in the pVOC case:

Corollary 1. Let ∆ be a pVOC and D be a stair-parity DVPA over pushdown alphabet Σ. The problem P({ρ ∈ Runs∆ | λ(ρ) ∈ L(D)}) =? 1 is decidable in P.

Corollary 1 implies that there exist efficient algorithms for many properties of pVOC-expressible random walks on N0. In fact, a.-s. satisfaction of each fixed visibly-pushdown property can be decided in P. For instance, using the DVPA from Figure 5 we can decide if a random walk is a.-s. repeatedly bounded.

### 5 Model Checking against Büchi VPA and CaRet

With Theorems 1 and 3 it follows immediately that quantitative model checking of pVPA against non-deterministic Büchi VPA is decidable in EXPSPACE. We can improve the complexity in the qualitative case:

Theorem 4. Let ∆ be a pVPA and A be a (non-deterministic) Büchi VPA over the same pushdown alphabet. The problem P({ρ ∈ Runs∆ | λ(ρ) ∈ L(A)}) =? 1 is EXPTIME-complete.

In the above result, membership in EXPTIME relies on the fact that one can construct the underlying graph of a step chain M∆×D in time exponential in the size of ∆ but polynomial in the size of D; see [28]. EXPTIME-hardness follows from [15, Thm. 8]. In fact, qualitative model checking of pPDA against non-pushdown Büchi automata is also EXPTIME-complete [15]. With Theorems 1 to 4 we immediately obtain the following complexity results for CaRet model checking:

Theorem 5. The quantitative and qualitative probabilistic CaRet model checking problems (Def. 7) are decidable in 2EXPSPACE and 2EXPTIME, respectively.

Both problems are known to be EXPTIME-hard [30].

### 6 Conclusion

We have presented the first decidability result for model checking pPDA (an operational model of procedural discrete probabilistic programs) against CaRet, or more generally, against the class of ω-VPL. We heavily rely on the determinization procedure from [22] and the notion of a step chain used in previous works. These two constructions turn out to be a natural match.

We conjecture that our complexity bounds are not the best possible, which is often the case in purely automata-based model checking. Future work is thus to investigate whether the doubly-exponential complexity can be lowered to singly-exponential, e.g. by generalizing the automata-less algorithm from [30]. Other topics are to explore to what extent algorithms for probabilistic CTL can be generalized to the branching-time variant of CaRet [18], to consider more expressive logics such as visibly LTL [8] or OPTL [12], and to study the interplay of conditioning and recursion [27] through the lens of pPDA.

Acknowledgement. The authors thank Christof Löding for his pointer to stair-parity VPA, and the anonymous reviewers for their constructive feedback.

### References



### **Author Index**

Angluin, Dana 1 Antonopoulos, Timos 1 Ascari, Flavio 21 Baier, Christel 40 Balasubramanian, A. R. 61 Blondin, Michael 81 Boisseau, Guillaume 101 Boker, Udi 120, 140 Broadbent, Anne 161 Bruni, Roberto 21 Caltais, Georgiana 184 Castelnovo, Davide 205 Chistikov, Dmitry 225 Cimatti, Alessandro 244 Colcombet, Thomas 264 Esparza, Javier 81 Fervari, Raul 305 Finkbeiner, Bernd 325 Fisman, Dana 1 Funke, Florian 40 Gadducci, Fabio 205 Gay, Simon J. 347 Geatti, Luca 244 Gehnen, Christina 449 George, Nevin 1 Gigante, Nicola 244 Gori, Roberta 21 Guillou, Lucie 61 Haase, Christoph 225 Hainry, Emmanuel 368 Heim, Philippe 325 Hirschowitz, André 389 Hirschowitz, Tom 389 Hojjat, Hossein 184 Jaakkola, Reijo 409

Kapron, Bruce M. 368 Karvonen, Martti 161 Katoen, Joost-Pieter 449 Kesner, Delia 285 Lafont, Ambroise 389 Lehtinen, Karoliina 120, 140 Maggesi, Marco 389 Mansutti, Alessio 225, 305 Marion, Jean-Yves 368 McDermott, Dylan 428 Miculan, Marino 205 Montanari, Angelo 244 Morvan, Rémi 264 Mousavi, Mohammad Reza 184 Passing, Noemi 325 Péchoux, Romain 368 Peyrot, Loïc 285 Piedeleu, Robin 101 Piribauer, Jakob 40 Poças, Diogo 347 Rivas, Exequiel 428 Santo, José Espírito 285 Sickert, Salomon 140 Tonetta, Stefano 244 Tunç, Hünkar Can 184 Uustalu, Tarmo 428 van Gool, Sam 264 Vasconcelos, Vasco T. 347 Weil-Kennedy, Chana 61 Winkler, Tobias 449 Ziemek, Robin 40