**Alexandra Silva K. Rustan M. Leino (Eds.)**

# LNCS 12760

# **Computer Aided Verification**

**33rd International Conference, CAV 2021 Virtual Event, July 20–23, 2021 Proceedings, Part II**

# Lecture Notes in Computer Science 12760

# Founding Editors

Gerhard Goos, Karlsruhe Institute of Technology, Karlsruhe, Germany
Juris Hartmanis, Cornell University, Ithaca, NY, USA

### Editorial Board Members

Elisa Bertino, Purdue University, West Lafayette, IN, USA
Wen Gao, Peking University, Beijing, China
Bernhard Steffen, TU Dortmund University, Dortmund, Germany
Gerhard Woeginger, RWTH Aachen, Aachen, Germany
Moti Yung, Columbia University, New York, NY, USA

More information about this subseries at http://www.springer.com/series/7407


Editors

Alexandra Silva, University College London, London, UK
K. Rustan M. Leino, Automated Reasoning Group, AWS, Seattle, WA, USA

ISSN 0302-9743 · ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-030-81687-2 · ISBN 978-3-030-81688-9 (eBook)
https://doi.org/10.1007/978-3-030-81688-9

LNCS Sublibrary: SL1 – Theoretical Computer Science and General Issues

© The Editor(s) (if applicable) and The Author(s) 2021. This book is an open access publication.

Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

# Preface

It was our privilege to serve as the program chairs for CAV 2021, the 33rd International Conference on Computer-Aided Verification. CAV 2021 was held as a virtual conference during July 20–23, 2021. The tutorial days were on July 19 and July 24, 2021, and the pre-conference workshops were held during July 18–19, 2021. Due to the COVID-19 outbreak, all events took place online.

CAV is an annual conference dedicated to the advancement of the theory and practice of computer-aided formal analysis methods for hardware and software systems. The primary focus of CAV is to extend the frontiers of verification techniques by expanding to new domains such as security, quantum computing, and machine learning. This puts CAV at the cutting edge of formal methods research, and this year's program is a reflection of this commitment.

CAV 2021 received a very high number of submissions (290). We accepted 16 tool papers, 3 case studies, and 60 regular papers, which amounts to an acceptance rate of roughly 27%. The accepted papers cover a wide spectrum of topics, from theoretical results to applications of formal methods. These papers apply or extend formal methods to a wide range of domains such as concurrency, machine learning, and industrially deployed systems. The program featured keynote talks by Loris D'Antoni (UW-Madison), Corina Pasareanu (NASA), and Anna Slobodova (Centaur Technology, Inc.) as well as invited tutorials by Nate Foster (Cornell University), Zak Kincaid (Princeton) together with Tom Reps (UW-Madison), and Nadia Polikarpova (UC San Diego). Furthermore, we continued the tradition of Logic Lounge, a series of discussions on computer science topics targeting a general audience.

In addition to the main conference, CAV 2021 hosted the following workshops: Formal Approaches to Certifying Compliance (FACC), Formal Methods for ML-Enabled Autonomous Systems (FoMLAS), Formal Methods for Blockchains (FMBC), Numerical Software Verification (NSV), Theory and Practice of String Solving (TPSS), Verifying Probabilistic Programs (VeriProP), Synthesis (SYNT), Satisfiability Modulo Theories (SMT), and Verification Mentoring Workshop (VMW).

Organizing a flagship conference like CAV requires a great deal of effort from the community. The Program Committee for CAV 2021 consisted of 79 members — a committee of this size ensures that each member has to review only a reasonable number of papers in the allotted time. In all, the committee members wrote over 900 reviews while investing significant effort to maintain and ensure the high quality of the conference program. We are grateful to the CAV 2021 Program Committee for their outstanding efforts in evaluating the submissions and making sure that each paper got a fair chance. Like last year's CAV, we made the artifact evaluation mandatory for tool paper submissions and optional, but encouraged, for the rest of the accepted papers. This year saw an unprecedented number of 66 artifact submissions. The Artifact Evaluation Committee consisted of 72 members who put in significant effort to evaluate each artifact. The goal of this process was to provide constructive feedback to tool developers and help make the research published in CAV more reproducible. We are also very grateful to the Artifact Evaluation Committee for their hard work and dedication in evaluating the submitted artifacts.

CAV 2021 would not have been possible without the tremendous help we received from several individuals, and we would like to thank everyone who helped make CAV 2021 a success. First, we would like to thank Clément Pit-Claudel and Maria Schett for chairing the Artifact Evaluation Committee and John Cyphert for putting together the proceedings. We also thank Arie Gurfinkel for chairing the workshop organization, Bor-Yuh Evan Chang for managing sponsorship, Thomas Wies for arranging student fellowships, Norine Coenen for handling publicity, Leopold Haller for organising the Logic Lounge, and Peter Müller for putting together the Ask me Anything program. We also thank Jean-Baptiste Jeannin and Arjun Radhakrishna for chairing the Mentoring Committee. Putting together an online conference is a complex task and we are grateful to the virtualization chair Tiago Ferreira, the student volunteer coordinators Tobias Kappé and Tao Gu, the local organizers for the Asia timezone, Ichiro Hasuo and Krishna S, and the team at Slides Live for all their efforts. Last but not least, we would like to thank the members of the CAV Steering Committee (Kenneth McMillan, Aarti Gupta, Orna Grumberg, and Daniel Kroening) for helping us with several important aspects of organizing CAV 2021.

We hope that you will find the proceedings of CAV 2021 scientifically interesting and thought-provoking!

June 2021

Alexandra Silva
Rustan Leino

# Organization

# Steering Committee


# Conference Co-chairs


# Artifact Co-chairs


# Workshop Chair


# Verification Mentoring Workshop Organizing Committee


# Logic Lounge Organizer


# Ask Me Anything Organizer

# Publicity Chair


Sagar Chaki, Mentor Graphics, USA
Jennifer Davis, Collins Aerospace, USA
Rajeev Joshi, Amazon, USA
K. Rustan M. Leino (Co-chair)
Ruzica Piskac, Yale University, USA

# Program Committee

Bor-Yuh Evan Chang, University of Colorado Boulder and Amazon, USA
Hana Chockler, King's College London, UK
Cristina David, University of Bristol, UK
Yuxin Deng, East China Normal University, China
Rayna Dimitrova, CISPA Helmholtz Center for Information Security, Germany
Alastair Donaldson, Imperial College London, UK
Constantin Enea, Université de Paris, France
Joao Fernandes, University of Porto, Portugal
Bernd Finkbeiner, CISPA Helmholtz Center for Information Security, Germany
Vijay Ganesh, University of Waterloo, Canada
Pierre Ganty, IMDEA Software Institute, Spain
Aarti Gupta, Princeton University, USA
Arie Gurfinkel, University of Waterloo, Canada
Ichiro Hasuo, National Institute of Informatics, Japan
Marieke Huisman, University of Twente, Netherlands
David N. Jansen, Institute of Software, Chinese Academy of Sciences, China
Jean-Baptiste Jeannin, University of Michigan, USA
Ranjit Jhala, University of California, San Diego, USA
Temesghen Kahsai, The University of Iowa, USA
Benjamin Lucien Kaminski, University College London, UK
Joost-Pieter Katoen, RWTH Aachen University, Germany
Guy Katz, The Hebrew University of Jerusalem, Israel
Laura Kovacs, Vienna University of Technology, Austria
Mitja Kulczynski, Kiel University, Germany
Mohit Kumar Tekriwal, University of Michigan, USA
Orna Kupferman, The Hebrew University of Jerusalem, Israel
Marta Kwiatkowska, University of Oxford, UK
Shuvendu Lahiri, Microsoft Research, USA
Akash Lal, Microsoft Research, India
Kim Larsen, Aalborg University, Denmark
Marijana Lazic, Technical University of Munich, Germany
Owolabi Legunsen, University of Illinois at Urbana-Champaign, USA
Amazon, USA
Rupak Majumdar, Max Planck Institute for Software Systems, Germany
Ruben Martins, Carnegie Mellon University, USA
Ken McMillan, University of Texas at Austin, USA
Aina Niemetz, Stanford University, USA
Sylvie Putot, Ecole Polytechnique, France


# Artifact Evaluation Committee


Luke Geeson, Arm, UK
Julien Lepiller, Yale University, USA
Marcel Moosbrugger, TU Wien, Austria
Marianela Morales, Inria, France

Isabel Garcia-Contreras, IMDEA Software Institute and Universidad Politecnica de Madrid, Spain
Nick Giannarakis, University of Wisconsin-Madison, USA
Pablo Gordillo, Universidad Complutense de Madrid, Spain
Laura Graves, University of Waterloo, Canada
Zheng Guo, University of California, San Diego, USA
Vedad Hadžić, Graz University of Technology, Austria
Miguel Isabel, Universidad Politécnica de Madrid, Spain
Anastasiia Izycheva, Technical University of Munich, Germany
Chris Jenkins, University of Iowa, USA
Daniela Kaufmann, Johannes Kepler University Linz, Austria
Brian Kempa, Iowa State University, USA
Bettina Könighofer, Graz University of Technology, Austria
Mitja Kulczynski, Kiel University, Germany
Mohit Kumar Tekriwal, University of Michigan, USA
Stella Lau, Massachusetts Institute of Technology, USA
Chunxiao Li, University of Waterloo, Canada
Junyi Liu, Institute of Software, Chinese Academy of Sciences, China
Debasmita Lohar, Max Planck Institute for Software Systems, Germany
Makai Mann, Stanford University, USA
Roy Margalit, Tel Aviv University, Israel
Sidi Mohamed Beillahi, Université de Paris and CNRS, France
Jasper Nalbach, RWTH Aachen University, Germany
Andres Noetzli, Stanford University, USA
Mário Pereira, Universidade NOVA de Lisboa, Portugal
Mateo Perez, University of Colorado Boulder, USA
Elizabeth Polgreen, University of California, Berkeley, USA
Mathias Preiner, Stanford University, USA
Tim Quatmann, RWTH Aachen University, Germany
Bob Rubbens, University of Twente, Netherlands
Vimala S., Indian Institute of Technology, Madras, India
Philipp Schröer, RWTH Aachen University, Germany
Joseph Scott, University of Waterloo, Canada
Amanda Stjerna, Uppsala University, Sweden
Zachary Susag, University of Wisconsin-Madison, USA
Hira Syeda, Chalmers University of Technology, Sweden
Martin Tappler, Graz University of Technology, Austria
Michael Tautschnig, Queen Mary University of London, UK
Saeid Tizpaz Niari, University of Texas at El Paso, USA
Hazem Torfah, University of California, Berkeley, USA
Deivid Vale, Radboud University Nijmegen, Netherlands

Masaki Waga, Kyoto University, Japan
Peixin Wang, Shanghai Jiao Tong University, China
Sarah Winkler, Free University of Bozen-Bolzano, Italy
Tobias Winkler, RWTH Aachen University, Germany
Ali Younes, Bauman Moscow State University, Russia
Xiao-Yi Zhang, National Institute of Informatics, Japan
Yuhao Zhang, University of Wisconsin-Madison, USA

# Additional Reviewers

Ahmad, Hammad; An, Jie; Armborst, Lukas; Almagor, Shaull; Arenas, Puri; Asadi, Sepideh; Amir, Guy; Arif, Fareed; Asarin, Eugene; Baanen, Anne; Batz, Kevin; Berzish, Murphy; Bacci, Giovanni; Baumeister, Jan; Blicha, Martin; Balasubramanian, A. R.; Belo Lourenço, Cláudio; Boker, Udi; Barbosa, Haniel; Bentkamp, Alexander; Bønneland, Frederik M.; Barwell, Adam; Berger, Jana; Brain, Martin; Castellano, Ezequiel; Chen, Mingshuai; Coenen, Norine; Castro-Pérez, David; Chida, Nariyoshi; Cogumbreiro, Tiago; Cetinkaya, Ahmet; Chipara, Octav; Correas Fernández, Jesús; Cheang, Kevin; Dai, Gaoyang

Defourné, Antoine; Downing, Mara; Darwin, Oscar; Dill, David; Dunn, Isaac; Dave, Vrunda; Dohmen, Taylor; Dureja, Rohit; De Masellis, Riccardo; Doveri, Kyveli; Eberhart, Clovis; Eiers, William; Esen, Zafer; Ebrahimi, Masoud; Farzan, Azadeh; Feng, Yuan; Fleury, Mathias; Fedyukovich, Grigory; Ferraiuolo, Andrew; Gardy, Patrick; Godefroid, Patrice; Graham-Lengrand, Stéphane; Gehani, Ashish; Gomez-Zamalloa, Miguel; Grumberg, Orna; Genaim, Samir; Goorden, Martijn; Guan, Ji; Georgiou, Pamina; Gordillo, Pablo; Guha, Shibashis; Giacobbe, Mirco; Graf, Susanne; Gupta, Ashutosh; Giesl, Jürgen

Habermehl, Peter; Helfrich, Martin; Huang, Chengchao; Hadzic, Vedad; Hofmann, Jana; Huber, Nikolaus; Hark, Marcel; Holík, Lukáš; Hyvärinen, Antti; Hecking-Harbusch, Jesko; Hozzova, Petra; Irfan, Ahmed; Isabel, Miguel; Jaber, Nouraldin; Jha, Susmit; Jovanović, Dejan; Jensen, Mathias Claus; Jiang, Xu; Junges, Sebastian; Jensen, Peter Gjøl; Kadron, Burak; Klikovits, Stefan; Koenighofer, Bettina; Kempa, Brian; Klinkenberg, Lutz; Kremer, Gereon; Kheterpal, Nishant; Klüppelholz, Sascha; Kura, Satoshi; Kim, Edward; La Malfa, Emanuele; Li, Jianlin; Lin, Shaokai; Lachnitt, Hanna; Li, Yangjia; Lorber, Florian; Larraz, Daniel; Li, Yong; Lukina, Anna; Lathouwers, Sophie; Limperg, Jannis; Luppen, Zachary; Lee, Sang-Hwa; Maderbacher, Benedikt; Merayo, Alicia; Mora, Federico

Madnani, Khushraj; Metzger, Niklas; Mueller, Peter; Mallik, Kaushik; Michelmore, Rhiannon; Mundkur, Prashanth; Mann, Makai; Mohaqeqi, Morteza; Murali, Vishnu; Martin-Martin, Enrique; Monti, Raul; Möhle, Sibylle; Mazzucato, Denis; Moosbrugger, Marcel; Nagisetty, Vineel; Nenzi, Laura; Noll, Thomas; Narodytska, Nina; Nikšić, Filip; Nummelin, Visa; Nejati, Saeed; Otoni, Rodrigo; Ozdemir, Alex; Özkan, Burcu; Overbeek, Roy; Pant, Yash Vardhan; Perez, Mateo; Polgreen, Elizabeth; Passing, Noemi; Philipoom, Jade; Poulsen, Danny Bøgsted; Patane, Andrea; Pick, Lauren; Preiner, Mathias; Pereira, Mário; Piribauer, Jakob; Purser, David; Quatmann, Tim; Reynolds, Andrew; Rubbens, Bob; Ryan, Megan; Rowe, Reuben; Sato, Sota; Sebastiani, Roberto; Stanford, Caleb; Schupp, Stefan

Shah, Ameesh; Stankovic, Miroslav; Schurr, Hans-Jörg; Solovyev, Alexey; Stein, Benno; Schwenger, Maximilian; Spel, Jip; Tabar, Asmae; Torfah, Hazem; Tsiskaridze, Nestan; Tekriwal, Mohit; Tschaikowski, Max; Turrini, Andrea; Tibo, Alessandro; Unno, Hiroshi; Vasconcelos, Vasco; Vediramana Krishnan, Hari Govind; Vukmirović, Petar; Vazquez-Chanlatte, Marcell; Venkatesan, Abinaya; Waga, Masaki; Wang, Qisheng

Wilson, Amalee; Wagner, Christopher; Weil-Kennedy, Chana; Winkler, Tobias; Wang, Benjie; Welzel, Christoph; Wu, Haoze; Wang, Fang; Wicker, Matthew; Wu, Min; Wang, Peixin; Xue, Bai; Yu, Emily; Zeljić, Aleksandar; Zhang, Linpeng; Zhou, Mengchu; Zhang, Hanwei; Zhao, Hengjun; Zuleger, Florian; Zhang, Hengjun; Zhou, Li

# Contents – Part II

#### Complexity and Termination






# Contents – Part I

#### Invited Papers


Automated Safety Verification of Programs Invoking Neural Networks (page 201)
*Maria Christakis, Hasan Ferit Eniser, Holger Hermanns, Jörg Hoffmann, Yugesh Kothari, Jianlin Li, Jorge A. Navas, and Valentin Wüstholz*



#### Hybrid and Cyber-Physical Systems


#### Security



# **Complexity and Termination**

# **Learning Probabilistic Termination Proofs**

Alessandro Abate, Mirco Giacobbe, and Diptarko Roy

University of Oxford, Oxford, UK
{alessandro.abate,mirco.giacobbe,diptarko.roy}@cs.ox.ac.uk

**Abstract.** We present the first machine learning approach to the termination analysis of probabilistic programs. Ranking supermartingales (RSMs) prove that probabilistic programs halt, in expectation, within a finite number of steps. While previously RSMs were directly synthesised from source code, our method learns them from sampled execution traces. We introduce the *neural ranking supermartingale*: we let a neural network fit an RSM over execution traces and then we verify it over the source code using satisfiability modulo theories (SMT); if the latter step produces a counterexample, we generate from it new sample traces and repeat learning in a counterexample-guided inductive synthesis loop, until the SMT solver confirms the validity of the RSM. The result is thus a sound witness of probabilistic termination. Our learning strategy is agnostic to the source code and its verification counterpart supports the widest range of probabilistic single-loop programs that any existing tool can handle to date. We demonstrate the efficacy of our method over a range of benchmarks that include linear and polynomial programs with discrete, continuous, state-dependent, multi-variate, hierarchical distributions, and distributions with undefined moments.

# **1 Introduction**

Probabilistic programs are programs whose execution is affected by random variables [17,19,23,29,36]. Randomness in programs may emerge from numerous sources, such as uncertain external inputs, hardware random number generators, or the (probabilistic) abstraction of pseudo-random generators, and is intrinsic in quantum programs [34]. Notable exemplars are randomised algorithms, cryptographic protocols, simulations of stochastic processes, and Bayesian inference [7,33]. Verification questions for probabilistic programs require reasoning about the probabilistic nature of their executions in order to appropriately characterise properties of interest. For instance, consider the following question, corresponding to the program in Fig. 1: will an ambitious marble collector eventually gather any arbitrarily large amounts of red and blue marbles? Intuitively, the question has an affirmative answer regardless of the initially established target amounts, since there is always a chance of collecting a marble of either color. Notice that, if the probabilistic choice is replaced with non-determinism, as often happens in software verification, an adversary may exclusively draw one color of marble and make the program run forever. The question that matches the original intuition is whether the expected number of steps to termination is finite; this is the *positive almost-sure termination* (PAST) question [8,10,13,19,27].

**Fig. 1.** The ambitious marble collector (the variables red and blue are initialised nondeterministically).
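As an illustration, the marble-collector loop of Fig. 1 can be simulated with the short Python sketch below. The function name `marble_collector`, the probability split (0.01 for a red draw, 0.99 for a blue one, matching Eq. (2)), and the `max_steps` safety cap are our assumptions for this sketch and are not part of the paper's artifact.

```python
import random

def marble_collector(red, blue, p_red=0.01, rng=random, max_steps=1_000_000):
    """One run of the marble-collector loop of Fig. 1: while either target
    is positive, draw a red marble with probability p_red and a blue one
    otherwise, decrementing the matching counter. Returns the number of
    iterations; max_steps is an artificial cap so the sketch always halts."""
    steps = 0
    while (red > 0 or blue > 0) and steps < max_steps:
        if rng.random() < p_red:
            red -= 1
        else:
            blue -= 1
        steps += 1
    return steps

rng = random.Random(0)
runs = [marble_collector(5, 5, rng=rng) for _ in range(1000)]
avg_steps = sum(runs) / len(runs)  # a finite empirical mean is consistent with PAST
```

Every sampled run halts and the empirical mean number of steps is finite, which matches the intuition that the program is PAST; a simulation of course only suggests this, it does not prove it.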

Probabilistic termination analysis is typically mechanised through the automated synthesis of *ranking supermartingales* (RSMs), which are functions of the program variables whose value (i) decreases in expectation by a discrete amount across every loop iteration and (ii) is always bounded from below; an RSM formally witnesses that a program is PAST [10,13]. Early techniques for discovering RSMs reduced the synthesis problem from the source code of the program into constraint solving [10]. These methods have lent themselves to various generalisations, including polynomial programs, programs with non-determinism, lexicographic and modular termination arguments, and persistence properties [2,14–16,20,25]. Recently, for special classes of probabilistic programs or term rewriting systems, novel automated proof techniques that leverage computer algebra systems and satisfiability modulo theories (SMT) have been introduced [5,6,38,39,41]. All the above methods are sound and, under specific assumptions, complete; they represent the state of the art for the class of programs they have been designed for. However, their assumptions are often too restrictive for the analysis of many simple programs. In particular, to the best of our knowledge, none can identify an RSM for the program in Fig. 1. For this simple program, it is easy to argue that the expected output of the *neural network* depicted in Fig. 2 decreases after every iteration of the loop and that it is always non-negative (see Ex. 1). As such, this neural network is an appropriate RSM for the program.


**Fig. 2.** A neural ranking supermartingale for the program in Fig. 1.

We present a novel method for discovering RSMs using machine learning together with SMT solving. We introduce the *neural ranking supermartingale* (NRSM) model, which lets a neural network mimic a supermartingale over sampled execution traces from a program. We train an NRSM using standard optimisation algorithms over a loss function that makes the neural network decrease, on average, across sampled iterations. We phrase the certification problem as that of computing a counterexample for the NRSM. To do so, we encode the neural network together with the expected value of the program variables; then, we use an SMT solver for verifying that the expected output of the network decreases along every execution. If the solver falsifies the NRSM, then it provides a counterexample that we use to guide a resampling of the execution traces; with this new data we retrain the neural network and repeat verification in a *counterexample-guided inductive synthesis* (CEGIS) fashion, until the SMT solver determines that no counterexample exists [4,44]. In the latter case, the solver has certified the generated NRSM; our method thus produces a *sound* PAST proof or runs indefinitely. Our procedure does not return for programs that are not PAST and may, in general, not return for some PAST instances. However, we experimentally demonstrate that, in practice, our method succeeds over a broad range of PAST benchmarks within a few CEGIS iterations. Previously, machine learning has been applied to the termination analysis of deterministic programs and to the stability analysis of dynamical systems [1,12,21,24,28,30–32,42,43,45]; our method is the first machine learning approach for probabilistic termination analysis.
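The learn-then-verify loop just described can be sketched structurally as follows. Everything here is a placeholder skeleton under our own naming: `sample_traces`, `train`, and `verify` stand in for trace sampling, NRSM training, and SMT-based checking, none of which are implemented in this sketch.

```python
def cegis(sample_traces, train, verify, max_iters=100):
    """Skeleton of a counterexample-guided inductive synthesis loop.

    sample_traces(states) -> list of sampled transitions
    train(dataset)        -> a candidate NRSM
    verify(candidate)     -> None if the candidate is a valid RSM,
                             otherwise a counterexample state
    """
    dataset = list(sample_traces([]))
    for _ in range(max_iters):
        candidate = train(dataset)       # learn an NRSM from traces
        cex = verify(candidate)          # check it against the source code
        if cex is None:
            return candidate             # certified: a sound PAST witness
        dataset += sample_traces([cex])  # resample, guided by the counterexample
    return None                          # gave up within the iteration budget
```

The shape mirrors the text: learning never inspects the program, and all reasoning about source code is confined to the `verify` step.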

Our approach builds upon two key observations. First, the average of expressions along execution traces statistically approximates their true expected value. Thanks to this, we obtain a machine learning model for *guessing* RSM candidates that only requires execution traces and is thus agnostic to the source code. Second, solving the problem of *checking* an RSM is simpler than solving the entire termination analysis problem. Reasoning about source code is entirely delegated to the checking phase which, as such, supports programs that are out of reach to the available probabilistic termination analysers.

We experimentally demonstrate that our method is effective over many programs with linear and polynomial expressions, with both discrete and continuous distributions. This includes joint distributions, state-dependent distributions, distributions whose parameters are in turn random (hierarchical models), and distributions with undefined moments (e.g., the Cauchy distribution). We compare our method with a tool based on Farkas' lemma and with the tools Amber and Absynth [2,39,41]; whilst our software prototype is slower than these alternatives, it covers the widest range of benchmark single-loop programs.

Summarising, our contribution is fivefold. First, we present the first machine learning method for the termination analysis of probabilistic programs. Second, we introduce a loss function for training neural networks to behave as ranking supermartingales over execution traces. Third, we show an approach to verify the validity of ranking supermartingales using SMT solving, which applies to a wide variety of single-loop probabilistic programs. Fourth, we experimentally demonstrate over multiple baselines and newly-defined benchmarks the practical efficacy of our method. Fifth, we built a software prototype for evaluating our method.

**Fig. 3.** Syntax of loop-free probabilistic programs.

# **2 Termination Analysis of Probabilistic Programs**

We treat the termination analysis of single-loop probabilistic programs. We consider an imperative language that includes C-like arithmetic and Boolean expressions, and sequential and conditional composition of commands [13,17,19,23].

*Syntax.* A grammar for this language is shown in Fig. 3. We analyse single-loop programs of the form

$$\begin{array}{l} \textbf{while}\ G\ \textbf{do} \\ \quad U \\ \textbf{od} \end{array}$$

where the loop guard G is a Boolean expression and the update statement U is a command. Variables are real-valued and can be either assigned to arithmetic expressions using the usual = operator, or sampled from probability distributions using the *∼* operator. Probability distributions, which can be either discrete or continuous, take not only parameters that are constant, and thus known at compile time, but also parameters that depend on other variables, and thus determined only at run time. In other words, distributions may depend on the current state of the program, which is a random variable. Also, they may depend on other random variables; as such, distributions may be multi-variate, resulting from models with coupled and hierarchically-structured variables.

*Semantics.* The operational semantics of a probabilistic program induces a probability space over runs, together with a stochastic process [13]. A state of the process is an element of $\mathbb{R}^n$ with $n = |\mathsf{Vars}|$, that is, a valuation of the variables in the program. The space of outcomes $\Omega_{\mathrm{run}}$ of a program is the set of runs, where a run is a possibly infinite sequence of variable valuations (taken at the beginning of every loop iteration). This comes with a $\sigma$-algebra $\mathcal{F}$ of measurable subsets of $\Omega_{\mathrm{run}}$. Initial states are chosen non-deterministically and, thereafter, the process is purely probabilistic. Every initial state $x_0 \in \mathbb{R}^n$ determines a unique probability measure $\mathbb{P}^{(x_0)} \colon \mathcal{F} \to [0,1]$, namely a probability measure conditional on the state $x_0$. The associated stochastic process is $X^{(x_0)} = \{X_t^{(x_0)}\}_{t \in \mathbb{N}}$, where $X_t^{(x_0)}$ is a random vector representing the state at the $t$-th step, initialised as $X_0^{(x_0)} = x_0$. Given an initial condition $x_0$ and a solution process $X^{(x_0)}$, the associated termination time is a random variable $T^{(x_0)}$ denoting the length of an execution, which takes values in $\mathbb{N} \cup \{\infty\}$.

*Positive Almost-Sure Termination.* Runs are probabilistic and thus also the notion of termination requires a quantitative semantics. The termination question is generalised to the notions of *almost-sure* and *positive almost-sure* termination. Almost-sure termination (AST) indicates whether the joint probability of all runs that do not terminate is zero; positive almost-sure termination (PAST), which is stronger, indicates whether the expected number of steps to termination is finite. Formally, a probabilistic program terminates positively almost-surely if $\mathbb{E}[T^{(x_0)}] < \infty$ for all $x_0 \in \mathbb{R}^n$. Notably, this implies that the program also terminates almost-surely, that is, $\mathbb{P}[T^{(x_0)} < \infty] = 1$ for all $x_0 \in \mathbb{R}^n$. We provide conditions ensuring that probabilistic programs are PAST and, consequently, that they are AST. Notice that the converse may not be true, that is, there exist programs that are AST but not PAST. Our method addresses the PAST question only, by building upon the theory of ranking supermartingales [10].

*Ranking Supermartingales.* A scalar stochastic process $\{M_t\}$ is an RSM if, for some $\epsilon > 0$ and lower bound $K \in \mathbb{R}$,

$$\mathbb{E}\left[M\_{t+1} \mid M\_t = m\_t, \dots, M\_0 = m\_0\right] \le m\_t - \epsilon \tag{1}$$

and $M_t \ge K$ for all $t \ge 0$. In other words, this is a process whose values are bounded from below and whose expected value decreases by a discrete amount at each step of the program. We prove that a program is PAST by mapping $X^{(x_0)}$ into an RSM. Our goal is finding a function $\eta \colon \mathbb{R}^n \to \mathbb{R}$ such that, for every initial condition $x_0$, it satisfies the following two properties:

$$\text{(i)}\;\; \mathbb{E}[\eta(X_{t+1}^{(x_0)}) \mid X_t^{(x_0)} = x] \le \eta(x) - \epsilon \;\text{ for all } x \in I, \qquad \text{(ii)}\;\; \eta(x) \ge K \;\text{ for all } x \in I,$$

where $I \subseteq \mathbb{R}^n$ is some sufficiently strong loop invariant that can be the loop guard or, possibly, a stronger condition. Function $\eta$ maps the entire stochastic process into an RSM. For this reason, we call $\eta$ an RSM for the program.

    Input : single-loop probabilistic program (G, U), initial state x0 ∈ IR^n
    Output: transition samples S ⊂ IR^n × P(IR^n)
     1  S ← ∅
     2  P′ ← {x0}
     3  for i ← 1 to k do            // k = path length
     4      P ← P′
     5      P′ ← ∅
     6      p ← pick arbitrary element from P
     7      if eval(G, p) = True then
     8          for j ← 1 to m do    // m = branching factor
     9              P′ ← P′ ∪ {exec(U, p)}
    10          S ← S ∪ {(p, P′)}
    11  return S

#### **Algorithm 1:** Interpreter
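For concreteness, a Python rendering of Algorithm 1 might look as follows. The callables `guard` and `update` are our stand-ins for `eval(G, ·)` and `exec(U, ·)`, and the interface details (tuple-valued states, the `rng` argument) are assumptions of this sketch.

```python
import random

def interpret(guard, update, x0, k=100, m=5, rng=None):
    """Instrumented interpreter in the spirit of Algorithm 1.

    guard(state) -> bool: the loop guard G.
    update(state, rng) -> state: one probabilistic execution of the body U.
    Returns the dataset S of (state, successors) pairs: each visited state
    is branched m times (the branching factor), and execution resumes from
    one arbitrarily chosen successor, for up to k steps (the path length)."""
    rng = rng or random.Random()
    S = []
    p = x0
    for _ in range(k):
        if not guard(p):
            break
        succs = [update(p, rng) for _ in range(m)]
        S.append((p, succs))
        p = succs[0]  # arbitrary element of P'
    return S
```

For the program of Fig. 1, `guard` would test `red > 0 or blue > 0` and `update` would decrement `red` with probability 0.01 and `blue` otherwise.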

*Example 1.* Consider the ambitious marble collector problem from Fig. 1. An RSM for this program is a function η mapping variables red and blue to IR. Rephrasing condition (i) over this program, η is required to satisfy

$$0.01 \cdot \eta(\mathsf{red} - 1, \mathsf{blue}) + 0.99 \cdot \eta(\mathsf{red}, \mathsf{blue} - 1) \le \eta(\mathsf{red}, \mathsf{blue}) - \epsilon,\tag{2}$$

for all $\mathsf{red}, \mathsf{blue} \in \mathbb{Z}$ that satisfy red > 0 ∨ blue > 0, that is, the loop guard. So, for example, function η(red, blue) = red + blue satisfies this condition; however, it may take any negative value over arguments red and blue such that red > 0 ∨ blue > 0, thus violating condition (ii). By contrast, the neural network in Fig. 2 succeeds at satisfying both conditions. In fact, the network realises function η(red, blue) = max{red, 0} + max{blue, 0}, which satisfies Eq. (2) and is bounded from below by zero.
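Example 1 can be spot-checked numerically. The sketch below evaluates the network of Fig. 2 and tests Eq. (2) over a grid of guard-satisfying states; the choice ε = 0.005 is ours (any ε ≤ 0.01 works, up to floating-point rounding), as are the function names.

```python
def eta(red, blue):
    # the neural RSM of Fig. 2: eta(red, blue) = max{red, 0} + max{blue, 0}
    return max(red, 0) + max(blue, 0)

def condition_2_holds(red, blue, eps=0.005):
    # Eq. (2): the expected value after one loop iteration drops by >= eps
    expected_next = 0.01 * eta(red - 1, blue) + 0.99 * eta(red, blue - 1)
    return expected_next <= eta(red, blue) - eps

# all small integer states satisfying the loop guard red > 0 or blue > 0
guard_states = [(r, b) for r in range(-3, 4) for b in range(-3, 4)
                if r > 0 or b > 0]
all_decrease = all(condition_2_holds(r, b) for r, b in guard_states)
```

Unlike the candidate η(red, blue) = red + blue, this η also stays non-negative on states such as (1, −100), which is exactly condition (ii).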

# **3 Training Neural Ranking Supermartingales**

Our framework synthesises RSMs by learning from program execution traces. We define a loss function that penalises sampled program transitions violating the RSM conditions. Applying gradient-descent optimisation to the loss function guides the parameters to values at which the candidate's value decreases, on average, across sampled program transitions. Since the learner requires only execution traces, and not the underlying program, it is agnostic to the structure of program expressions, and the cost of evaluating the loss function does not scale with the size of the program.

A dataset of sampled transitions is produced using an instrumented program interpreter (Algorithm 1). At a program state p, the interpreter runs the loop body m times to sample successor states P′, where m is a branching-factor hyperparameter, before resuming execution from an arbitrarily chosen successor. The dataset S consists of the pairs (p, P′) generated by the interpreter.

**Fig. 4.** Neural ranking supermartingale architecture.

The loss function is used to optimise the parameters of an NRSM, whose architecture is shown in Fig. 4. This is a neural network with n inputs, one output neuron, and one hidden layer. The hidden layer has h neurons, each of which applies an activation function f to a weighted sum of its inputs. In our experiments, the activation function f is either f(x) = x<sup>2</sup> or f(x) = ReLU(x), where ReLU(x) = max{x, 0}.

Therefore, we employ one of the two following functional templates, defined over the learnable parameters $w\_{i,j}$ and $b\_i$:

– Sum of ReLU (SOR):

$$\eta(x\_1, \ldots, x\_n) = \sum\_{i=1}^h \text{ReLU}\left(\sum\_{j=1}^n w\_{i,j} x\_j + b\_i\right);\tag{3}$$

– Sum of Squares (SOS):

$$\eta(x\_1, \ldots, x\_n) = \sum\_{i=1}^h \left( \sum\_{j=1}^n w\_{i,j} x\_j + b\_i \right)^2 \,. \tag{4}$$
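In pure-Python form (illustrative only; the prototype uses Jax), the two templates amount to a single hidden layer followed by summation:

```python
def relu(z):
    return max(z, 0.0)

def sor(x, W, b):
    """Sum of ReLU template, Eq. (3): sum_i ReLU(sum_j w_ij * x_j + b_i)."""
    return sum(relu(sum(w * xj for w, xj in zip(row, x)) + bi)
               for row, bi in zip(W, b))

def sos(x, W, b):
    """Sum of Squares template, Eq. (4): sum_i (sum_j w_ij * x_j + b_i)^2."""
    return sum((sum(w * xj for w, xj in zip(row, x)) + bi) ** 2
               for row, bi in zip(W, b))

# With identity weights and zero biases, SOR realises the function
# eta(red, blue) = max{red, 0} + max{blue, 0} from Example 1:
W, b = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
print(sor([3.0, -2.0], W, b))   # ReLU(3) + ReLU(-2) = 3.0
```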

These choices of activation mean that our NRSMs are restricted to non-negative outputs, and therefore satisfy condition (ii) by construction. The learner thus only needs to find parameters that satisfy condition (i), which requires η to decrease in expectation by at least some positive constant ε > 0.

The role of the loss function is to allow the learner parameters to be optimised such that the NRSM decreases, on average, across sampled transitions. That is, the loss function evaluates the number of sampled transitions for which the NRSM does not satisfy the RSM condition (i), and the lower its value, the more the neural network behaves like an RSM.

Concretely, the loss associated with a state p and its successors P is:

$$L(p, P') = \text{softplus}\left(\mathbb{E}\_{p' \sim P'}[\eta(p')] - \eta\left(p\right) + \epsilon\right),\tag{5}$$

where softplus(x) = ln(1 + eˣ), and $\mathbb{E}\_{p' \sim P'}[\eta(p')]$ is the average of η over the sampled successor states p′ in P′.

We then train an NRSM by solving the following optimisation problem:

$$\min \frac{1}{|S|} \sum\_{(p, P') \in S} L(p, P'), \tag{6}$$

which aims to minimise the average loss over all sampled transitions in the dataset S, over the trainable weights $w\_{1,1}, \ldots, w\_{h,n} \in \mathbb{R}$ and biases $b\_1, \ldots, b\_h \in \mathbb{R}$. This objective is non-convex and non-linear, and we resort to gradient-based optimisation (see Sect. 6).
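The loss of Eqs. (5) and (6) can be sketched directly; the value of ε below is illustrative:

```python
import math

def softplus(x):
    return math.log1p(math.exp(x))

def transition_loss(eta, p, succs, eps=0.5):
    """Eq. (5): penalise a transition in proportion to how far eta is
    from decreasing by eps in (empirical) expectation."""
    exp_next = sum(eta(q) for q in succs) / len(succs)
    return softplus(exp_next - eta(p) + eps)

def dataset_loss(eta, S, eps=0.5):
    """Eq. (6): average the per-transition loss over the dataset S."""
    return sum(transition_loss(eta, p, ss, eps) for p, ss in S) / len(S)

# For eta(red, blue) = red + blue and successors that decrease eta by 1
# on average, the loss is softplus(-1 + eps):
eta = lambda s: s[0] + s[1]
S = [((5, 5), [(4, 5), (5, 4)])]
print(dataset_loss(eta, S))   # softplus(-0.5) ~ 0.474
```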

The softplus in Eq. (5) forces the parameters to satisfy condition (i) uniformly across all sampled transitions in the dataset, rather than decreasing by a large amount in expectation over some transitions at the expense of failing to decrease sufficiently quickly for others. Furthermore, for NRSMs of SOR form we replace the ReLU activation function by softplus during training, to help gradient descent converge faster. Softplus approximates the ReLU function and has the same asymptotic behaviour, but results in an NRSM that is differentiable w.r.t. the network parameters at all inputs, unlike ReLU [22, p. 193]. However, since softplus is a transcendental function, we revert to the simpler ReLU activation when verifying an SOR candidate.

**Fig. 5.** CEGIS architecture for the adversarial training of NRSM.

A CEGIS loop integrates the learner and verifier (Fig. 5). The dataset S sampled by the interpreter is used to train an NRSM candidate η according to Eq. (6). The verifier checks whether η satisfies condition (i), concluding either that the program is PAST, or producing a counterexample program state $x\_{\text{cex}}$ for which η does not satisfy (i). The interpreter then generates new traces starting at $x\_{\text{cex}}$, forcing the learner to explore parts of the state space over which the NRSM fails to decrease sufficiently in expectation.

**Fig. 6.** Verifier architecture.

### **4 Verifying Ranking Supermartingales by SMT Solving**

To verify an NRSM we must check that it decreases in expectation by at least some constant (condition (i)). Condition (ii) is satisfied by construction because the network's output is non-negative for every input, leaving only condition (i) to verify. The architecture of the verifier is depicted in Fig. 6. First, a program (G, U) is translated into an equivalent logical formulation denoted by G¯ and U¯ ('Encode' block), which are used to construct a closed-form term E[¯η] for the NRSM's expected value at the end of the loop body ('Marginalise' block). Second, given an NRSM η, its parameters are rounded and encoded as a logical term ¯η ('Round' block). Then, the satisfiability of the following formula is decided using SMT solving:

$$\bar{G}(x\_1\dots x\_n) \land \mathbb{E}[\bar{\eta}](x\_1\dots x\_n) > \bar{\eta}(x\_1\dots x\_n) - \epsilon. \tag{7}$$

This is the dual satisfiability problem for the validity problem associated with condition (i) on page 5. If Eq. (7) is unsatisfiable, then ¯η is a valid RSM and we conclude that the program is PAST. Otherwise, the solver yields a counterexample state $x\_{\text{cex}} \in \mathbb{R}^n$.
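Our verifier discharges this check with an SMT solver over the unbounded state space; purely as a dependency-free illustration, the sketch below instead searches a bounded grid for counterexamples to Eq. (7), instantiated with the marble collector of Fig. 1 and the NRSM of Example 1:

```python
def eta(red, blue):
    # The NRSM of Example 1: max{red, 0} + max{blue, 0}
    return max(red, 0) + max(blue, 0)

def expected_eta(red, blue):
    # Marginalised post-expectation for the marble collector, cf. Eq. (9)
    return 0.01 * eta(red - 1, blue) + 0.99 * eta(red, blue - 1)

def find_counterexample(eps, bound=40):
    """Search for a state where the guard holds and Eq. (7) is satisfied,
    i.e. eta fails to decrease by eps in expectation. Z3 covers the
    unbounded state space; this grid search is only an illustration."""
    for red in range(-bound, bound + 1):
        for blue in range(-bound, bound + 1):
            guard = red > 0 or blue > 0
            if guard and expected_eta(red, blue) > eta(red, blue) - eps:
                return (red, blue)
    return None   # no counterexample found: eta is valid on the grid

print(find_counterexample(eps=0.005))   # None: eta decreases by >= 0.005
print(find_counterexample(eps=0.5))     # a counterexample state
```

When blue ≤ 0 < red, the expected decrease is only 0.01 per iteration, so the check succeeds for a small ε but yields counterexamples for ε = 0.5.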

The rounding strategy ('Round' block) provides multiple candidates to the verifier by adding i.i.d. noise to parameters and rounding them to various precisions. Setting parameters that are numerically very small to zero is useful since learning that a parameter should be exactly zero could require an unbounded number of samples; rounding provides a pragmatic way of making this work in practice. If none of the generated candidates are valid NRSMs, all counterexamples are passed back to the interpreter which generates more transition samples for the learner (Fig. 5).
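A sketch of this rounding strategy; the hyperparameter values (noise scale, precisions, zero threshold) are illustrative:

```python
import itertools
import random

def rounding_candidates(params, precisions=(0, 1, 2), zero_tol=0.05,
                        n_noisy=3, noise=1e-3, rng=random):
    """Sketch of the 'Round' block: perturb learned parameters with
    i.i.d. noise, round them to several precisions, and snap near-zero
    values to exactly zero, yielding multiple candidates per learned one."""
    variants = [list(params)] + [
        [w + rng.uniform(-noise, noise) for w in params]
        for _ in range(n_noisy)
    ]
    candidates = []
    for v, d in itertools.product(variants, precisions):
        candidates.append([0.0 if abs(w) < zero_tol else round(w, d)
                           for w in v])
    return candidates

cands = rounding_candidates([0.998, 0.013, -1.502])
print(len(cands))   # (1 + 3 noisy variants) x 3 precisions = 12
```

The near-zero parameter 0.013 is snapped to exactly zero in every candidate, while 0.998 is rounded to 1.0 at the coarsest precision.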


**Fig. 7.** Quantifier-free first-order logic formulae.

Notice that, if a program's guard predicate is not strong enough to allow a valid RSM to be verified as such, the CEGIS loop will run indefinitely. In general, stronger supporting loop invariants may need to be provided.

### **4.1 From Programs to Symbolic Store Trees**

We now introduce a translation from a loop-free probabilistic program to a *symbolic store tree* (Fig. 8), a data structure representing the distribution over program states at the end of a loop iteration as a function of the variable valuation at its start. Marginalising out the probabilistic choices made in the loop yields the NRSM expectation E[¯η].


**Fig. 8.** Symbolic store tree.

This requires a form of symbolic execution. We represent program states symbolically using *symbolic stores*, denoted Σ (Fig. 8), which map program variables to *probabilistic terms*. A probabilistic term π can be either a first-order logic term (Fig. 7) representing an arithmetic expression, or a placeholder for a probability distribution whose parameters are terms (allowing them to be functions of the program state). Finally, a *symbolic store tree* σ (Fig. 8) represents the set of control-flow paths through the loop body arising from if-statements; it is a binary tree with symbolic stores at the leaves, and internal nodes labelled by logical formulae over program variables.

**Fig. 9.** Translation from a loop-free command to a symbolic store tree.

Figure 9 defines a translation from an initial symbolic store tree and command to a new symbolic store tree characterising the distribution over states after executing the command. At the top level, we provide the command G (the loop body) and the initial symbolic store {x <sup>1</sup> → x1,...,x <sup>n</sup> → xn}, where primed variables represent the variable valuation at the end of the iteration, whereas unprimed variables represent the variable valuation at the beginning of the loop.

The first four cases of Fig. 9 define the translation of arithmetic expressions (to terms) and Boolean expressions (to formulae), by replacing program syntax with the corresponding logical operators.

The next four cases define the translation of commands. skip leaves the symbolic store unchanged. For deterministic assignments, the right hand side of the assignment is translated in the current symbolic store and bound to the variable. Sequential composition involves translating the first command, and translating the second command in the resulting store tree. A conditional statement creates a new node in the symbolic store tree that selects between the two recursively-translated branches, based on the formula derived from the guard predicate. These rules assume the store tree to be a leaf-level symbolic store, because the next rule handles the case where the initial symbolic store tree is a node. Finally, if the command is a probabilistic assignment, we translate the parameters to terms, and bind the resulting probabilistic term to a freshly generated symbol. This allows variables to be overwritten by multiple probabilistic sampling operations in the body of the loop. The mapping of variables to distributions in leaf-level stores defines the probability density over particular probabilistic choices.
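The translation can be sketched in a few lines, representing terms as strings, symbolic stores as dictionaries, and store trees as ('node', formula, left, right) tuples; the command encoding below is illustrative, not the paper's concrete syntax:

```python
import itertools

fresh = (f"nu{i}" for i in itertools.count())   # fresh symbols for samples

def translate(cmd, tree):
    """Sketch of Fig. 9: map a loop-free command and a store tree to a
    new store tree. Terms are strings; stores are dicts at the leaves."""
    if isinstance(tree, tuple):                  # a node: recurse into leaves
        _, phi, left, right = tree
        return ('node', phi, translate(cmd, left), translate(cmd, right))
    store, op = dict(tree), cmd[0]
    if op == 'skip':
        return store
    if op == 'assign':                           # ('assign', x, expr_fn)
        _, x, expr = cmd
        store[x] = expr(store)
        return store
    if op == 'seq':                              # ('seq', c1, c2)
        _, c1, c2 = cmd
        return translate(c2, translate(c1, store))
    if op == 'if':                               # ('if', guard_fn, c1, c2)
        _, g, c1, c2 = cmd
        return ('node', g(store), translate(c1, store), translate(c2, store))
    if op == 'sample':                           # ('sample', x, dist_fn)
        _, x, dist = cmd
        sym = next(fresh)
        store[sym] = dist(store)                 # bind distribution to symbol
        store[x] = sym
        return store
    raise ValueError(op)

tree = translate(
    ('seq', ('sample', 'v', lambda s: 'Exp(1)'),
            ('if', lambda s: f"{s['x']} > 0",
                   ('assign', 'x', lambda s: f"({s['x']} - {s['v']})"),
                   ('skip',))),
    {'x': 'x'})
print(tree[1], tree[2]['x'])   # x > 0 (x - nu0)
```

The conditional produces a node whose then-branch store binds x to the term (x − nu0), while the else-branch store leaves x unchanged, mirroring Fig. 9.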

*Example 2.* Figure 10 is the store tree produced for the ambitious marble collector program (Fig. 1). Each leaf-level store in the program's store tree corresponds to a particular control-flow path through the loop body. The interpretation of a symbolic store tree is that if we fix the outcomes of the probabilistic sampling operations performed by the loop body, then the state of the variables at the end of the iteration is determined by the predicates labelling the internal nodes.

**Fig. 10.** A store tree for the program in Fig. 1.

#### **4.2 Marginalisation**

To construct the closed-form logical term representing the NRSM's expected value at the end of an iteration, the probabilistic choices in the symbolic store tree must be marginalised out. If the program is limited to discrete random variables with finite support, we automatically marginalise the random choices by enumeration (for both SOR- and SOS-form NRSMs), as illustrated by Ex. 3.

*Example 3.* The ambitious marble collector program of Fig. 1 yields the symbolic store tree of Fig. 10. Suppose we want to marginalise the NRSM:

$$\begin{aligned} \eta(\mathsf{red}, \mathsf{blue}) &= \mathrm{ReLU}(w\_{1,1} \cdot \mathsf{red} + w\_{1,2} \cdot \mathsf{blue} + b\_1) \\ &+ \mathrm{ReLU}(w\_{2,1} \cdot \mathsf{red} + w\_{2,2} \cdot \mathsf{blue} + b\_2) \end{aligned} \quad (8)$$

with respect to this symbolic store tree. We first apply the encoding of the NRSM to each leaf-level symbolic store of Fig. 10, and enumerate the possible outcomes of the probabilistic choices (which in this example are limited to ν ∈ {0, 1}), using the bindings of ν to distributions in leaf-level stores to compute the probability mass of each outcome. After resolving the predicates for each choice of ν, this yields:

$$0.01 \cdot \eta(\mathbf{red} - 1, \mathbf{blue}) + 0.99 \cdot \eta(\mathbf{red}, \mathbf{blue} - 1). \tag{9}$$

The term (9) is then provided as the value of the NRSM's expectation to the verifier. 

If the program samples from continuous distributions, we marginalise SOS-form NRSMs (but not SOR-form NRSMs) by substituting symbolic moments for a set of supported built-in distributions, including Gaussian, MultivariateGaussian, and Exponential; this could be extended to any distribution whose closed-form symbolic moments are available. Example 4 illustrates this. The strategy is general enough to support a wide variety of programs, including those of Sect. 5. If a sampling distribution lacks symbolic moments, the cumulative distribution function can also be used, as illustrated in the slicedcauchy case study (Fig. 15).

*Example 4.* Consider an NRSM η(x) = (wx + b)² and a symbolic store tree node(p = 1, σ₁, σ₂), where σ₁ = {x → x + v, v → Exp(λ), p → Bernoulli(3/4)} and σ₂ = {x → x − v, v → Exp(λ), p → Bernoulli(3/4)}. Exp(λ) denotes the exponential distribution with parameter λ, whose pdf is denoted p_Exp(λ)(v). We apply η to each leaf-level symbolic store, and marginalise the probabilistic choices: first p, by enumerating over its possible values, and then v. There are no dependencies between the distributions in this example, so the order in which they are marginalised does not matter.

$$\int\_0^\infty \left(\frac{3}{4}\eta(x+v) + \frac{1}{4}\eta(x-v)\right) p\_{\text{Exp}\{\lambda\}}(v) dv. \tag{10}$$

The result of marginalisation is a closed-form expression for Eq. (10). Note that since

$$
\eta(x+v) = w^2v^2 + 2(wx+b)wv + (wx+b)^2\tag{11}
$$

and $\int\_0^\infty v^n \, p\_{\text{Exp}(\lambda)}(v) \, \mathrm{d}v = \frac{n!}{\lambda^n}$, we use linearity of integration to perform the following simplification, by substituting expressions for the moments of v in terms of the parameter λ:

$$\int\_0^\infty \eta(x+v) p\_{\text{Exp}(\lambda)}(v) \text{d}v = \frac{2w^2}{\lambda^2} + \frac{2(wx+b)w}{\lambda} + (wx+b)^2,\tag{12}$$

which is used to reduce Eq. (10) to a closed form. This is the method used to perform marginalisation for several case studies, including crwalk, gaussrw and expdistrw. 
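As a numerical sanity check of this derivation (not part of the toolchain), the sketch below compares the closed form of Eq. (12) against a direct midpoint-rule quadrature of the left-hand integral, for illustrative parameter values:

```python
import math

def closed_form(w, b, x, lam):
    """Right-hand side of Eq. (12), via the moments n!/lam^n of Exp(lam)."""
    return 2 * w**2 / lam**2 + 2 * (w * x + b) * w / lam + (w * x + b) ** 2

def quadrature(w, b, x, lam, upper=60.0, n=100_000):
    """Midpoint-rule approximation of the integral on the left of Eq. (12);
    truncation at `upper` is sound up to an exponentially small tail."""
    eta = lambda y: (w * y + b) ** 2
    h = upper / n
    return sum(eta(x + (i + 0.5) * h) * lam * math.exp(-lam * (i + 0.5) * h)
               for i in range(n)) * h

# Illustrative parameters (w and b loosely echo Eq. (18)):
w, b, x, lam = 0.1, -3.3, 5.0, 1.0
print(abs(closed_form(w, b, x, lam) - quadrature(w, b, x, lam)) < 1e-4)
```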

Notably, our verifier requires the expected value of the RSM to be computed (or soundly approximated) in closed form. We automate marginalisation for discrete distributions of finite support, but require manual intervention for continuous distributions. Nevertheless, our learning component is automated in both cases. Characterising the space of programs with continuous distributions that admit fully automated verification of an RSM is an open question.

# **5 Case Studies**

Existing tools for synthesising RSMs reduce the problem to constraint-solving [2,10,11,14], which can limit the generality of the synthesis framework. For instance, methods that convert the RSM constraints into a linear program using Farkas' lemma can only handle programs with affine arithmetic, and can only synthesise linear/affine (lexicographic) RSMs [2,10]. A second restriction of existing approaches is that they typically require the moments of distributions to be compile-time constants. This rules out programs whose distributions are determined at runtime, such as hierarchical and state-dependent distributions. Since the loss function of Eq. (6) only requires execution traces, our learner is agnostic to the structure of program expressions, imposing minimal restrictions on the kinds of expressions that can occur, or the kinds of distributions that can be sampled from. This allows us to learn RSMs for a wider class of programs compared to existing tools, as we will illustrate in this section using a number of case studies.

#### **5.1 Non-linear Program Expressions and NRSMs**

Many simple programs, such as the program in Fig. 1, do not admit linear or polynomial RSMs. Since this program cannot be encoded as a prob-solvable loop (its disjunctive guard predicate cannot be replaced by a polynomial inequality), it cannot be handled by another recent tool, Amber [39]. However, it admits the following piecewise-linear NRSM:

$$\text{ReLU}(0 \cdot \mathbf{red} + 1 \cdot \mathbf{blue} + 11) + \text{ReLU}(1 \cdot \mathbf{red} + 0 \cdot \mathbf{blue} + 11),\qquad(13)$$

whose parameters are learnt by our method, within the first CEGIS iteration.

**Fig. 11.** Probabilistic factorial (probfact).

Similarly, we learn the piecewise-linear NRSM:

$$\text{ReLU}(-1\cdot\mathbf{i} + 0\cdot\mathbf{s} + 12) + \text{ReLU}(0\cdot\mathbf{i} + 0\cdot\mathbf{s} + 9) \tag{14}$$

for the program in Fig. 11, which contains a bilinear assignment (cf. multiplication of s and i on line 3), so this program is not supported by [2]. The conjunction in the guard means it is not supported by Amber, either.

**Fig. 12.** Random walk with correlated variables (crwalk).

#### **5.2 Multivariate and Hierarchical Distributions**

Figure 12 is a random walk that samples from a multivariate Gaussian distribution, with zero mean, unit variances, and correlation sampled uniformly from $[-\frac{1}{2}, \frac{1}{2}]$. The MultivariateGaussian of line 4 is an instance of a hierarchical distribution, having parameters that are random variables. This program also contains a non-linear (polynomial) expression that updates the value of x. For crwalk we learn an SOS-form NRSM:

$$\left(0.1 \cdot \mathbf{x} - 47.2\right)^2,\tag{15}$$

proving this program is PAST. To verify this, the NRSM expectation is computed via the symbolic moments of the multivariate Gaussian distribution, given its covariance matrix (line 3), and then marginalising w.r.t. rho (again, using the moments of the uniform distribution over $[-\frac{1}{2}, \frac{1}{2}]$). Unfortunately, it is challenging to translate many simple programs containing hierarchical distributions into ones that can be handled by existing tools. For instance, although it is possible to simulate sampling from a bivariate Gaussian of arbitrary correlation by sampling from independent standard Gaussian distributions, this would involve computing a non-polynomial function of the correlation. Similarly, for the program in Fig. 14 (further discussed below), if a variable is exponentially distributed, X ∼ Exponential(1), then X/λ ∼ Exponential(λ), providing a way of simulating an exponential distribution with arbitrary parameter λ. However, this again requires a non-polynomial program expression (i.e. the reciprocal of λ) when λ is part of the program state and not a constant, and it is therefore out of scope for methods that restrict program expressions to being linear/polynomial.

#### **5.3 State-Dependent Distributions and Non-Linear Expectations**


**Fig. 13.** Gaussian random walk with time-varying and coupled noise (gaussrw).

Once we allow hierarchical distributions, it is natural to consider *state-dependent* distributions, i.e. distributions whose parameters depend on the program state rather than being sampled from other distributions. As an example, consider the program in Fig. 13 (a 2-dimensional Gaussian random walk with state-dependent moments). This is unsupported by existing tools because the mean of the Gaussian is a non-polynomial function of the program state. However, after defining the function $\sqrt{1 + \mathtt{x}^2}$ by means of the following *polynomial* logical inequalities:

$$\mathtt{mu\\_x}^2 = 1 + \mathtt{x}^2 \tag{16}$$

$$\mathtt{mu\\_x} \ge 1 \tag{17}$$

(and similarly for mu_y), we express the expected value of an SOS-form NRSM in terms of the symbolic moments mu_x, etc. Since these moments are state-dependent, we cannot marginalise them out as in the hierarchical case. Instead we perform a non-deterministic abstraction, providing the inequalities $\frac{1}{10} \le \mathtt{vx}, \mathtt{vy} \le 2$ and $-1 \le \mathtt{rho} \le 1$ as further verifier assumptions.

**Fig. 14.** State-dependent exponential random walk (expdistrw).

Even if program expressions are linear, the presence of state-dependent distributions can result in a non-linear verification problem, if the moments are themselves non-linear functions of the program variables. For instance, the program in Fig. 14 represents a 1-dimensional random walk with steps sampled from an exponential distribution. Since the nth moment of Exponential(λ) is $\frac{n!}{\lambda^n}$, the expectation of an SOS-form NRSM is non-polynomial but still expressible in the theory of non-linear real arithmetic (see Ex. 4). For expdistrw we learn

$$\left(0.1 \cdot \mathbf{x} - 3.3\right)^2,\tag{18}$$

whereas for gaussrw in Fig. 13 we learn

$$(0 \cdot \mathbf{x} - 1 \cdot \mathbf{y} + 11)^2 + (0 \cdot \mathbf{x} + 0 \cdot \mathbf{y} + 8)^2. \tag{19}$$

We translate the program in Fig. 14 for Amber by replacing the update for λ by instead sampling it uniformly from [1, 10]. Amber correctly identifies that the program is AST, and that (10 − x) is a supermartingale expression (note, not an RSM), though it does not report that the program is PAST (answering "maybe").

#### **5.4 Undefined Moments**

The ability to evaluate the cumulative distribution function (CDF) of a sampled distribution can be useful in marginalisation, even if the moments of the sampled distribution are undefined or not known analytically to infinite precision. An example is Fig. 15: the program samples from the standard Cauchy distribution, for which all moments are undefined. Since the sampled value is *only* used to determine which branch of a conditional is taken, the RSM expectation is well defined, and can be expressed in terms of the standard Cauchy CDF. Namely, the if-branch is taken with probability $q = 1 - \left(\frac{1}{\pi}\arctan(10) + \frac{1}{2}\right)$. This quantity is not expressible using polynomials, so we perform a sound approximation by introducing a new variable that is quantified over a small interval surrounding a finite-precision approximation of q. This allows us to learn and verify the SOR-form NRSM:

$$\text{ReLU}(1.2 \cdot \mathbf{x} + 9.1). \tag{20}$$
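Concretely, q and an enclosing interval of the kind supplied to the verifier can be computed from the standard Cauchy CDF $F(x) = \frac{1}{\pi}\arctan(x) + \frac{1}{2}$; the enclosure routine below is an illustrative sketch:

```python
import math

# Probability that a standard Cauchy sample exceeds 10, i.e. the
# if-branch probability q = 1 - F(10), with F the standard Cauchy CDF:
q = 1.0 - (math.atan(10.0) / math.pi + 0.5)

def enclose(value, digits=4):
    """A small interval around a finite-precision approximation, of the
    kind quantified over in the verifier in place of the exact constant."""
    lo = math.floor(value * 10**digits) / 10**digits
    return lo, lo + 10**-digits

lo, hi = enclose(q)
print(f"{lo:.4f} {hi:.4f}")   # 0.0317 0.0318
```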

For our experimental evaluation (Sect. 6) we create a modified version of each of the six case studies described in this section, as follows:


#### **5.5 Rare Transitions**

A limitation of learning NRSM parameters from a sampled transition dataset is that the average $\mathbb{E}\_{p' \sim P'}[\eta(p')]$ in Eq. (5) must be accurate (see Sect. 3). This assumption is challenged by programs with control-flow paths of very low probability, which are unlikely to be sampled by the interpreter. For example, in the context of the ambitious marble collector (Fig. 1), Fig. 16 shows that when the probability of obtaining a red marble decreases below $2^{-7}$, our success rate drops. This is because a lower probability makes the corresponding control-flow path rarer in the dataset, to the point where the expected value of the NRSM can no longer be estimated accurately.
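This effect can be quantified directly: the probability that a dataset of N sampled transitions never exercises a branch of probability p is (1 − p)^N:

```python
# For p = 2**-7 (the threshold observed in Fig. 16), even moderately
# sized datasets are likely to miss the rare control-flow path entirely:
p = 2 ** -7
for N in (10, 100, 1000):
    print(N, round((1 - p) ** N, 3))
```

With N = 100 transitions, the rare branch is absent from the dataset almost half the time, so its contribution to the empirical expectation is simply missing.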

**Fig. 16.** Success rate and execution times for the ambitious marble collector program (Fig. 1), where *p* is the probability of taking the if-branch. Success rate refers to the fraction of 10 executions that succeeded in finding an NRSM before a timeout of 300 s. Execution times show the median time with the error bar ranging between the minimum and maximum times of the 10 executions.

### **6 Experimental Results**

We built a prototype implementation of our framework (in Python) and present experimental results for benchmarks adapted from previous work, as well as our own case studies (from Sect. 5). The case studies illustrate programs for which our framework synthesises an RSM, yet existing tools cannot prove to be PAST.

The learner is implemented with Jax [9]. To train NRSMs, we use AdaGrad [18] for gradient-based optimisation, with a learning rate of $10^{-2}$. Parameters are initialised by sampling from Gaussian distributions: weight parameters are sampled from a zero-mean Gaussian, whereas bias parameters are sampled from a Gaussian with mean 10 (for SOR candidates) or mean 0 (for SOS candidates). We verify the NRSMs using the SMT solver Z3 [26,40]. The results were obtained on the following platform: macOS Catalina version 10.15.4, 8 GB RAM, Intel Core i5 quad-core CPU at 2.4 GHz, 64-bit.

As mentioned in Sect. 4, the verifier checks a candidate NRSM over states satisfying the loop predicate, which characterises the set of reachable states. For our experiments, we manually provide the NRSM expectation, and augment the guard predicate with additional invariants where necessary. We generate outcomes using two different rounding strategies (Sect. 4): an "aggressive" rounding strategy, which generated between 80 and 120 candidates per CEGIS iteration, and a "weaker" rounding strategy, producing between 15 and 25 candidates per CEGIS iteration. The outcomes in Table 1 used the aggressive rounding strategy.

**Table 1.** Experimental results over existing (top section) and newly added benchmarks (bottom section); (c) indicates the benchmark uses continuous distributions, (d) indicates it uses only discrete distributions. All reported times are in seconds; oot indicates a time-out after 300 s, n/a indicates the tool terminated without a definite answer, and '—' indicates the benchmark is unsupported. Our method is run 10 times with different seeds; the overall success rate is reported. Runtimes of the interpretation, training, and verification phases, and the number of CEGIS iterations, refer to the run with median total runtime.


*Benchmarks from Previous Work.* We run our prototype on single-loop programs from the WTC benchmark suite [3], augmented with probabilistic branching and assignments [2]. These correspond to the programs in the first section of Table 1. We perturb assignment statements by adding noise sampled from a discrete uniform distribution of support {−2, 2}, or a continuous uniform distribution on the interval [−2, 2]. The *while* loops are also made probabilistic; with probability 1/2 the loop is executed, and with the remaining probability a skip command is executed.

We compare our framework against three existing tools. The first is Amber [39]: where possible, we translate instances from the WTC suite into the language of Amber, but this is not possible for some programs where the loop predicate is a logical conjunction or disjunction of predicates (indicated by dashes in Table 1). Second, we compare against a tool for synthesising affine lexicographic RSMs (referred to as Farkas' lemma) for affine programs (i.e. containing only linear expressions), based on a reduction to linear programming via Farkas' lemma [2]. This is applicable to probabilistic programs with nested loops, unlike our method. However, since it is limited to affine programs and affine lexicographic RSMs, it is not able to analyse all the programs we consider (again, indicated by dashes in Table 1). The third tool is Absynth [41], for which we are able to encode all programs that are limited to discrete random variables.

The experimental results (Table 1) show that for all the WTC benchmarks our approach has a success rate of at least 8/10, and is able to synthesise an RSM within 2 CEGIS iterations (for the seed that results in the median total execution time). For 15 of the 18 WTC benchmarks no full CEGIS iterations are required. As expected, our approach, particularly the learning component, is much slower than all three tools. However, our framework has broader applicability, as illustrated by the next set of experiments.

*Newly Defined Case Studies.* The examples in the second section of Table 1 (from Sect. 5) are not proven PAST by any of the three tools. Our approach is able to do so with a success rate of at least 9/10, under the "aggressive" rounding strategy. Of the new examples, marbles3 (Sect. 5) requires the longest time, since we use an NRSM with h = 3 ReLU nodes (see Sect. 3), and six of the nine parameters must be brought sufficiently close to zero to learn a valid RSM. For gaussrw/gaussrw2, we find it necessary to set an SMT solver time limit within the CEGIS loop (of 200 ms for gaussrw, and 5 s for gaussrw2), such that candidates taking longer than this to verify are skipped. The fact that these examples are harder to verify is unsurprising, given that they give rise to non-polynomial decision problems, containing equationally defined rational expressions. In comparing the two rounding strategies, we find that using the "aggressive" strategy tends to result in fewer CEGIS iterations, reducing the learner time, while increasing the verifier time: this is to be expected, since a larger number of candidates needs to be checked in each CEGIS iteration.

# **7 Conclusion**

We have presented the first machine learning method for the termination analysis of probabilistic programs. We have introduced a loss function for training neural networks so that they behave as RSMs over sampled execution traces; our training phase is agnostic to the program and thus easily portable to different programming languages. Reasoning about the program code is entirely delegated to our checking phase which, by SMT solving over a symbolic encoding of program and neural network, verifies whether the neural network is a sound RSM. Upon a positive answer, we have formally certified that the program is PAST; upon a negative answer, we obtain a counterexample that we use to resample traces and repeat training in a CEGIS loop. Our procedure runs indefinitely for programs that are not PAST, as these necessarily lack a ranking supermartingale, and may run indefinitely for some PAST programs. Nevertheless, we have experimentally demonstrated over several PAST benchmarks that our method is effective in practice and covers a broad range of programs w.r.t. existing tools.

Our method naturally generalises to deeper networks, but whether these are necessary in practice remains an open question; notably, neural networks with one hidden layer were sufficient to solve our examples. We have exclusively tackled the PAST question, and techniques for almost-sure (but not necessarily PAST) termination and non-termination exist [16,37,39]. Our results pose the basis for future research in machine learning (and CEGIS) for the formal verification of probabilistic programs. Different verification questions will require different learning models. Our approach lends itself to extensions toward probabilistic safety, exploiting supermartingale inequalities, and towards the non-termination question, using repulsing supermartingales [16]. Adapting our method to termination analysis with infinite expected time is also a matter for future investigation [37]. Moreover, we have exclusively considered purely probabilistic single-loop programs: generalisations to programs with non-determinism, arbitrary control-flow, and concurrency are material for future work [15,20,35].

**Acknowledgments.** This work was in part supported by a partnership between Aerospace Technology Institute (ATI), Department for Business, Energy & Industrial Strategy (BEIS) and Innovate UK under project HICLASS (113213), by the Engineering and Physical Sciences Research Council (EPSRC) Doctoral Training Partnership, by the Department of Computer Science Scholarship, University of Oxford, and by the DeepMind Computer Science Scholarship.

### **References**





# **Ghost Signals: Verifying Termination of Busy Waiting**

Tobias Reinhard(B) and Bart Jacobs

imec-DistriNet Research Group, KU Leuven, Leuven, Belgium
{tobias.reinhard,bart.jacobs}@kuleuven.be

**Abstract.** Programs for multiprocessor machines commonly perform busy waiting for synchronization. We propose the first separation logic for modularly verifying termination of such programs under fair scheduling. Our logic requires the proof author to associate a *ghost signal* with each busy-waiting loop and allows such loops to iterate while their corresponding signal *s* is not set. The proof author further has to define a well-founded order on signals and to prove that if the looping thread holds an obligation to set a signal *s′*, then *s′* is ordered above *s*. By using conventional shared state invariants to associate the state of ghost signals with the state of data structures, programs busy-waiting for arbitrary conditions over arbitrary data structures can be verified.

# **1 Introduction**

Programs for multiprocessor machines commonly perform busy waiting for synchronization [22,23]. In this paper, we propose a separation logic [24,31] to modularly verify termination of such programs under fair scheduling. Specifically, we consider programs where some threads busy-wait for a certain condition C over a shared data structure to hold, e.g., a memory flag being set by other threads. By modularly, we mean that we reason about each thread and each function in isolation. That is, we do not reason about thread scheduling or interleavings. We only consider these issues when proving the soundness of our logic. Assuming fair scheduling is necessary since busy-waiting for a condition C only terminates if the thread responsible for establishing the condition is sufficiently often scheduled to establish C.
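A minimal example of the class of programs considered, sketched in Python (names illustrative): one thread busy-waits on a shared flag until another thread sets it; under fair scheduling the setter eventually runs, so the waiting loop terminates.

```python
import threading
import time

flag_set = threading.Event()   # stands in for the shared memory flag

def setter():
    flag_set.set()             # establishes the awaited condition C

def waiter(log):
    while not flag_set.is_set():   # busy-wait: progress requires
        time.sleep(0)              # interference from the setter thread
    log.append('done')

log = []
threads = [threading.Thread(target=waiter, args=(log,)),
           threading.Thread(target=setter)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(log)   # ['done']
```

In the logic proposed here, the proof author would attach a ghost signal to the waiting loop and discharge the obligation to set it in the setter thread, alongside the write to the flag.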

Busy waiting is an example of *blocking* behaviour, where a thread's progress *requires interference* from other threads. This is not to be confused with *non-blocking* concurrency, where a thread's progress does not rely on—and may in fact be *impeded* by—interference from other threads. Existing approaches for verifying termination of concurrent programs consider only programs that involve non-blocking concurrent objects [32], or *primitive blocking constructs* of the programming language, such as acquiring built-in mutexes, receiving from built-in channels, joining threads, or waiting for built-in monitor condition variables [2,5,19], or both [11]. Existing techniques that do support busy waiting are not Hoare logics; instead, they verify termination-preserving *contextual refinements* between more concrete and more abstract implementations of busy-waiting concurrent objects [15,21]. In contrast, we here propose the first conventional program logic for modular verification of termination of programs involving busy waiting, using Hoare triples as module specifications.

In order to prove that a busy-waiting loop terminates, we have to prove that it performs only finitely many iterations. To do this we introduce a special form of *ghost resources* [13] which we call *ghost signals*. As ghost resources, they exist only at the verification level and hence do not affect the program's runtime behaviour. Signals are initially unset and come with an obligation to set them. Setting a signal does not by definition correspond to any runtime condition. So, in order to use a signal s effectively, the proof author has to prove an invariant stating that s is set if and only if the condition of interest holds. Further, the proof author must prove that every thread discharges all its obligations by performing the corresponding actions, e.g., by setting a signal and establishing the corresponding condition by setting the memory flag.

In our verification approach we tie every busy-waiting loop to a finite set of ghost signals S that corresponds to the set of conditions the loop is waiting for. Every iteration that does not terminate the loop must be justified by the proof author proving that some signal s ∈ S has indeed not been set yet. This way, we reduce proving termination to proving that no signal is waited for infinitely often.

Our approach ensures that no thread directly or indirectly waits for itself by requiring the proof author (i) to choose a well-founded and partially ordered set of levels Levs, (ii) to assign a level to every signal, and (iii) to allow a thread to wait for a signal only if the signal's level is lower than the level of each held obligation. This guarantees that every signal is waited for only finitely often and hence that every busy-waiting loop terminates. We use this to prove that every program that is verified using our approach indeed terminates.
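The level discipline above can be made concrete with a small dynamic checker. The following Python sketch is our own illustration, not part of the paper's formalism; the names (`Signal`, `may_wait_for`) are invented for this example. It checks condition (iii): a thread may wait for a signal only if the signal's level is strictly below the level of every obligation the thread holds.

```python
# Illustrative sketch (our own, not the paper's formalism) of the level
# discipline: a thread holding obligations may busy-wait for a signal only
# if that signal's level is below the level of every held obligation.
from dataclasses import dataclass

@dataclass(frozen=True)
class Signal:
    name: str
    level: int  # drawn from a well-founded order; here: natural numbers

def may_wait_for(held_obligations, s):
    """May a thread holding `held_obligations` busy-wait for signal `s`?"""
    return all(s.level < o.level for o in held_obligations)

# The discipline rules out mutual waiting: two threads, each obliged to set
# the signal the other waits for, would need lev(sig1) < lev(sig2) and
# lev(sig2) < lev(sig1) simultaneously. No assignment of levels allows both.
sig1, sig2 = Signal("sig1", 0), Signal("sig2", 1)
assert may_wait_for({sig2}, sig1)       # wait for sig1, obliged to set sig2: ok
assert not may_wait_for({sig1}, sig2)   # wait for sig2, obliged to set sig1: rejected
```

Because the order on levels is well-founded, any chain of such waits is finite, which is the core of the termination argument developed in Sect. 2.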

We start by gradually introducing the intuition behind our verification approach and the concepts we use. In Sect. 2.1 and Sect. 2.2 we present the main aspects of using signals to verify termination. We start by treating them as physical thread-safe resources and only consider busy waiting for a signal to be set. Then, we drop thread-safety and explain how to prove data-race- and deadlock-freedom. In Sect. 2.3 and Sect. 2.4 we generalize our approach to busy waiting for arbitrary conditions over arbitrary data structures and then lift signals to the verification level by introducing ghost signals.

In Sect. 3 we sketch the verification of a realistic producer-consumer example involving a bounded FIFO to demonstrate our approach's usability and address fine-grained concurrency in Sect. 4. Further, we describe the available tool support in Sect. 5 and discuss integrating higher-order features in Sect. 6. We conclude by comparing our approach to related work and reflecting on it in Sect. 7 and Sect. 8.

We formally define our logic and prove its soundness in the extended version of this paper [28]. To keep the presentation in this paper simple, we assume busy-waiting loops to have a certain syntactical form. In our technical report [29] we present a generalised version of our logic and its soundness proof. Further, we verify the realistic example presented in Sect. 3 in full detail in the extended version of this paper and in the technical report, using the respective version of our logic. We used our tool support to verify C versions of the bounded FIFO example and of the CLH lock. The tool we used and the annotated .c files can be found at [10,26,27].

# **2 A Guide on Verifying Termination of Busy Waiting**

When we try to verify termination of busy-waiting programs, multiple challenges arise. Throughout this section, we describe these challenges and our approach to overcoming them. In Sect. 2.1 we start by discussing the core ideas of our logic. In order to simplify the presentation we initially consider a simple language with built-in thread-safe *signals* and a corresponding minimal example where one thread busy-waits for such a signal. Signals are heap cells containing boolean values that are specially marked as being solely used for busy waiting. Throughout this section, we generalize our setting as well as our example towards verifying programs that busy-wait for arbitrary conditions over arbitrary shared data structures. In Sect. 2.2 we present the concepts necessary to verify data-race-freedom, deadlock-freedom, and termination in the presence of built-in signals that are not thread-safe. In Sect. 2.3 we explain how to use these non-thread-safe signals to verify programs that wait for arbitrary conditions over shared data structures. We illustrate this with an example waiting for a shared heap cell to be set. In Sect. 2.4 we erase the signals from our program and lift them to the verification level in the form of a concept we call *ghost signals*.

#### **2.1 Simplest Setting: Thread-Safe Physical Signals**

We want to verify programs that busy-wait for arbitrary conditions over arbitrary shared data structures. As a first step towards achieving this, we consider programs that busy-wait for simple boolean flags, specially marked as being used for the purpose of busy waiting. We call these flags *signals*. For now, we assume that read and write operations on signals are thread-safe. Consider a simple programming language with built-in signals and with the following commands: (i) **new signal** for creating a new unset signal, (ii) **set signal**(x) for setting x and (iii) **await is set**(x) for busy-waiting until x is set. Figure 1 presents a minimal example where two threads communicate via a shared signal sig. The main thread creates the signal sig and forks a new thread that busy-waits for sig to be set. Then, the main thread sets the signal. As we assume signal operations to be thread-safe in this example, we do not have to care about potential data races. Notice that, like all busy-waiting programs, this program is guaranteed to terminate only under fair thread scheduling: indeed, it does not terminate if the main thread is never scheduled again after it forks the new thread. In this paper we verify termination under fair scheduling.

**let** sig := **new signal in fork** (**await is set**(sig)); **set signal**(sig)

**Fig. 1.** Minimal example with two threads communicating via a physical thread-safe signal.
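The program of Fig. 1 can be mimicked in an ordinary programming language. The following Python sketch is our own analogue, not the paper's formal language; the class `Flag` and its field `is_set` are names we invented. Under CPython's global interpreter lock, reads and writes of the flag are effectively atomic, matching the thread-safe signals assumed in this section.

```python
# Our runnable analogue of Fig. 1: the main thread creates a "signal",
# forks a thread that busy-waits for it, then sets it. Under fair
# scheduling the forked thread's busy-waiting loop terminates.
import threading

class Flag:
    """A boolean 'signal' heap cell; accesses are atomic under the GIL."""
    def __init__(self):
        self.is_set = False

sig = Flag()                       # new signal (initially unset)

def waiter():
    while not sig.is_set:          # await is_set(sig): busy-wait
        pass

t = threading.Thread(target=waiter)
t.start()                          # fork (await is_set(sig))
sig.is_set = True                  # set_signal(sig): discharge the obligation
t.join(timeout=5)                  # the waiter finishes once it observes the flag
```

Note how the program only terminates because the main thread eventually sets the flag, i.e., discharges its obligation; if it never did, the waiter would loop forever.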

#### **Augmented Semantics**

*Obligations.* The only construct in our language that can lead to non-termination is the busy-waiting loop **await is set**(sig). In order to prove that programs terminate, it is therefore sufficient to prove that all created signals are eventually set. We use so-called *obligations* [5,6,16,19] to ensure this. These are *ghost resources* [13], i.e., resources that do not exist during runtime and hence cannot influence a program's runtime behaviour. They carry, however, information relevant to the program's verification. Generally, holding an obligation requires a thread to discharge it by performing a certain action. For instance, when the main thread in our example creates signal sig, it simultaneously creates an obligation to set it. The only way to discharge this obligation is to set sig.

We denote thread IDs by θ and describe which obligations a thread θ holds by bundling them into an obligations chunk θ.obs(O), where O is a multiset of signals. We denote multisets by double braces {[...]} and multiset union by ⊎. Each occurrence of a signal s in O corresponds to an obligation by thread θ to set s. Consequently, θ.obs(∅) asserts that thread θ does not hold any obligations.

*Augmented Semantics.* In the *real* semantics of the programming language we consider here, ghost resources such as obligations do not exist during runtime. To prove termination, we consider an *augmented* version of it that keeps track of ghost resources during runtime. In this semantics, we maintain the invariant that every thread holds exactly one obs chunk. That is, for every running thread θ, our heap contains a unique heap cell θ.obs that stores the thread's bag of obligations. Further, we let a thread get stuck if it tries to finish while it still holds undischarged obligations. Note that we use the term *finish* to refer to thread-local behaviour while we write *termination* to refer to program-global behaviour, i.e., meaning that every thread finishes. For every augmented execution there trivially exists a corresponding execution in the real semantics.

Figure 2 presents some of the reduction rules we use to define the augmented semantics. We use ḣ to refer to augmented heaps, i.e., heaps that can contain ghost resources. A reduction step ḣ, c →θaug ḣ′, c′, T expresses that thread θ reduces heap ḣ (which is shared by all threads) and command c to heap ḣ′ and command c′. Further, T represents the set of threads forked during this step. It is either empty or a singleton containing the new thread's ID and the command it is going to execute, i.e., {(θ*<sup>f</sup>* , c*<sup>f</sup>* )}. We omit it whenever it is clear from the context that no thread is forked. Further, we denote disjoint union of sets by ⊎.

Our reduction rules comply with the intuition behind obligations we outlined above. Aug-Red-NewSignal creates a new signal and simultaneously a corresponding obligation. The only way to discharge it is by setting the signal using Aug-Red-SetSignal.


**Fig. 2.** Reduction rules for augmented semantics.

*Forking.* Whenever a thread forks a new thread, it can pass some of its obligations to the newly forked thread, cf. Aug-Red-Fork. Forking a new thread with ID θ*<sup>f</sup>* also allocates a new heap cell θ*<sup>f</sup>* .obs to store its bag of obligations. Since this is the only way to allocate a new obs heap cell, we will never run into a heap ḣ ⊎ {θ.obs(O)} ⊎ {θ.obs(O′)} that contains multiple obligations chunks belonging to the same thread θ. Remember that threads cannot finish while holding obligations. This prevents them from dropping obligations via dummy forks.

*Levels.* In order to prove that a busy-waiting loop **await is set**(sig) terminates, we must ensure that the waiting thread does not directly or indirectly wait for itself. We could just check that it does not hold an obligation for the signal it is waiting for, but that is not sufficient, as the following example demonstrates: Consider a program with two signals sig1, sig2 and two threads. Let one thread hold the obligation for sig2 and execute **await is set**(sig1); **set signal**(sig2). Likewise, let the other thread hold the obligation for sig1 and let it execute **await is set**(sig2); **set signal**(sig1).

To prevent such *wait cycles* modularly, we apply the usual approach [3,4,19]. For every program that we want to execute in our augmented semantics, we choose a partially ordered set of levels Levs. Further, during every reduction step in the augmented semantics that creates a signal s, we pick a level L ∈ Levs and associate it with s. Note that, much like obligations, levels do not exist during runtime in the real semantics. Signal chunks in the augmented semantics have the form signal((id, L)), where id is the unique signal identifier returned by **new signal**. The level assigned to any signal can be chosen freely, cf. Aug-Red-NewSignal. In practice, determining levels boils down to solving a set of constraints that reflect the dependencies. In our example, however, the choice is trivial as it only involves a single signal. We choose Levs = {0} and 0 as the level for sig and thereby get signal((sig, 0)). Generally, we denote signal tuples by s = (id, L). Now we can rule out cyclic wait dependencies by only allowing a thread to busy-wait for a signal s if its level s.lev is smaller than the level of each held obligation, cf. Aug-Red-Await.<sup>1</sup> Given a bag of obligations O, we denote this by s.lev ≺L O.

*Proving Termination.* As we will explain below, the augmented semantics has no fair infinite executions. We can use this as follows to prove that a program c terminates under fair scheduling: For every fair infinite execution of c, show that we can construct a corresponding augmented execution. (This requires that each step's side conditions in the augmented semantics are satisfied. Note that we thereby prove certain properties for the real execution, like absence of cyclic wait dependencies.) As there are no fair infinite executions in the augmented semantics, we get a contradiction. It follows that c has no fair infinite executions in the real semantics.

*Soundness.* In order to prove soundness of our approach, we must prove that there indeed are no fair infinite executions in the augmented semantics. This boils down to proving that no signal can be waited for infinitely often. Consider any program and any fair augmented execution of it. Consider the execution's *program order graph*: (i) its nodes are the execution steps, and (ii) it has an edge from each step to the next step of the same thread and, for fork steps, to the first step of the forked thread. Notice that, for each obligation created during the execution, the set of nodes corresponding to a step made by a thread while that thread holds the obligation constitutes a path that ends when the obligation is discharged. We say that this path *carries* the obligation.

It is not possible that a signal is waited for infinitely often. Indeed, suppose the signals in some set S∞ are. Take smin ∈ S∞ with minimal level. Since smin is never set, the path in the program order graph that carries the obligation for smin must be infinite as well. Indeed, suppose it is finite. The final node N of the path cannot discharge the obligation without setting the signal, so it must pass the obligation on either to the next step of the same thread or to a newly forked thread. By fairness of the scheduler, both of these threads will eventually be scheduled. This contradicts N being the final node of the path.

The path carrying the obligation for smin waits only for signals that are waited for finitely often. (Remember that Aug-Red-Await requires the signal waited for to be of a lower level than all held obligations, i.e., a lower level than that of smin.) It is therefore a finite path. A contradiction.

<sup>1</sup> For simplicity, our augmented semantics assumes that the level order and the level associated with any object remain fixed for the entire execution. However, following the approach presented in [18], it would be sound to add a step rule that allows a thread to change the level of an object it has exclusive access to (cf. Sect. 2.2).

Notice that the above argument relies on the property that every non-empty set of levels has a minimal element. For this reason, for termination verification we require that Levs is not just partially ordered, but also well-founded.

#### **Program Logic**

Directly using the augmented semantics to prove that our example program terminates is cumbersome. In the following, we present a separation logic that simplifies this task.

*Safety.* We call a program c *safe* under a (partial) heap ḣ if it provides all the resources necessary such that both c and any threads it forks can execute without getting stuck in the augmented semantics. (This depends on the angelic choices.) We denote this by safe(ḣ, c) [33]<sup>2</sup>. Consider a program c that is safe under an augmented heap ḣ. Let h be the real heap that matches ḣ apart from the ghost resources. Then, for every real execution that starts with h we can construct a corresponding augmented execution.

*Specifications.* We use Hoare triples {A} <sup>c</sup> {λr. B(r)} [8] to specify the behaviour of a program c. Such a triple expresses the following: Consider any evaluation context E, such that for every return value v, running E[v] from a state that satisfies B(v) is safe. Then, running E[c] from a state that satisfies A is safe.

*Proof System.* We define a proof relation ⊢ which ensures that whenever we can prove ⊢ {A} c {λr. B(r)}, then c complies with the specification {A} c {λr. B(r)}. Figure 3b presents some of the proof rules we use to define ⊢. As we evolve our setting throughout this section, we also adapt our proof rules. Rules that will be changed later are marked with a prime in their name. The full set of rules is presented in the extended version of this paper [28]. Our proof rules PR-SetSignal' and PR-Await' are similar to the rules for sending and receiving on a channel presented in [19].

Notice how the proof rules enforce the side-conditions of the augmented semantics. Hence, all we have to do to prove that a program c terminates is to prove that every thread eventually discharges all its obligations. That is, we have to prove {obs(∅)} <sup>c</sup> {obs(∅)}. Figure 3a illustrates how we can apply our rules to verify that our minimal example terminates.

#### **2.2 Non-Thread-Safe Physical Signals**

As a step towards supporting waiting for arbitrary conditions over shared data structures, including non-thread-safe ones, we now move to non-thread-safe signals. For simplicity, in this paper we consider programs that use mutexes to synchronize concurrent accesses to shared data structures. (Our ideas apply equally to programs that use other constructs, such as atomic machine instructions.) Figure 4 presents our updated example.

<sup>2</sup> For a formal definition see this paper's extended version [28] and the technical report [29].

```
{obs(∅)}
let sig := new signal in PR-NewSignal' with L = 0
{obs({[(sig, 0)]}) ∗ signal((sig, 0))} s := (sig, 0)
fork ({obs(∅) ∗ signal(s)}
      await is set(sig) s.lev = 0 ≺L ∅
     {obs(∅) ∗ signal(s)});
{obs({[s]})}
set signal(sig)
{obs(∅)}
```
(a) Proof outline for program from Fig. 1. Applied proof rule marked in purple. Abbreviation marked in brown. General hint marked in red.

- PR-NewSignal': L ∈ Levs ⟹ {obs(O)} **new signal** {λr. obs(O ⊎ {[(r, L)]}) ∗ signal((r, L))}
- PR-SetSignal': {obs(O ⊎ {[s]})} **set signal**(s.id) {obs(O)}
- PR-Fork': {obs(O<sup>f</sup>) ∗ A} c {obs(∅) ∗ B} ⟹ {obs(O<sup>m</sup> ⊎ O<sup>f</sup>) ∗ A} **fork** c {obs(O<sup>m</sup>)}
- PR-Await': s.lev ≺L O ⟹ {obs(O) ∗ signal(s)} **await is set**(s.id) {obs(O) ∗ signal(s)}
- PR-Let: {A} c {λr. C(r)} and ∀v. {C(v)} c′[v/x] {B} ⟹ {A} **let** x := c **in** c′ {B}

(b) Proof rules. Rules only used in this section marked with '.

**Fig. 3.** Verifying termination of minimal example with physical thread-safe signal. (Color figure online)

```
let sig := new signal in
let mut := new mutex in
fork (with mut await is set(sig));
acquire mut;
set signal(sig);
release mut
```
(a) Code.

```
with mut await c := (while acquire mut;
                           let r := c in
                           release mut;
                           ¬r
                     do skip)
```
(b) Syntactic sugar. r not free in mut.

**Fig. 4.** Minimal example with two threads communicating via a physical non-threadsafe signal protected by a mutex.

As signal sig is no longer thread-safe, the two threads can no longer use it directly to communicate. Instead, we have to synchronize accesses to avoid data races. Hence, we protect the signal by a mutex mut created by the main thread. In each iteration, the forked thread acquires the mutex, checks whether sig has been set and releases it again. After forking, the main thread acquires the mutex, sets the signal and releases it again.

*Exposing Signal Values.* Signals are specially marked heap cells storing boolean values. We make this explicit by extending our signal chunks from signal(s) to signal(s, b) where b is the current value of s and by updating our proof rules accordingly. Upon creation, signals are unset. Hence, creating a signal sig now spawns an *unset* signal chunk signal((sig, L), False) for some freely chosen level L and an obligation for (sig, L), cf. PR-NewSignal". We present our new proof rules in Fig. 6 and demonstrate their application in Fig. 5.

```
{obs(∅)}
let sig := new signal in PR-NewSignal" with L = 1
{obs({[(sig, 1)]}) ∗ signal((sig, 1), False)} PR-ViewShift & VS-SemImp
{obs({[(sig, 1)]}) ∗ ∃b. signal((sig, 1), b)} s := (sig, 1), P := ∃b. signal(s, b)
let mut := new mutex in PR-NewMutex" with L = 0
{obs({[s]}) ∗ mutex(m, P)} PR-ViewShift

obs({[s]}) ∗ mutex(m, P) ∗ mutex(m, P)
                              & VS-CloneMut"
fork ({obs(∅) ∗ mutex(m, P)}
     with m await m.lev, s.lev ≺L ∅
        {obs({[m]}) ∗ P} PR-Exists
        ∀b. {obs({[m]}) ∗ signal(s, b)}
           is set(sig)
           {λr. obs({[m]}) ∗ signal(s, b) ∧ r = b} PR-ViewShift & VS-SemImp

            λr. obs({[m]})
               ∗ if r then P else signal(s, False)

    {obs(∅) ∗ mutex(m, P)} PR-ViewShift & VS-SemImp
    {obs(∅)});
{obs({[s]}) ∗ mutex(m, P)}
acquire mut; m.lev = 0 < 1 = s.lev
{obs({[s, m]}) ∗ locked(m, P) ∗ ∃b. signal(s, b)} PR-Exists
∀b. {obs({[s, m]}) ∗ locked(m, P) ∗ signal(s, b)}
   set signal(sig);
   {obs({[m]}) ∗ locked(m, P) ∗ signal(s, True)} PR-ViewShift & VS-SemImp
   {obs({[m]}) ∗ locked(m, P) ∗ P}
   release mut
   {obs(∅) ∗ mutex(m, P)} PR-ViewShift & VS-SemImp
   {obs(∅)}
```
**Fig. 5.** Proof outline for program Fig. 4, verifying termination with mutexes & nonthread safe signals. Applied proof and view shift rules marked in purple. Abbreviations marked in brown. General hints marked in red. (Color figure online)

(a) Signals & busy waiting.

- PR-NewSignal": L ∈ Levs ⟹ {obs(O)} **new signal** {λid. obs(O ⊎ {[(id, L)]}) ∗ signal((id, L), False)}
- PR-SetSignal": {obs(O ⊎ {[s]}) ∗ signal(s, _)} **set signal**(s.id) {obs(O) ∗ signal(s, True)}
- PR-IsSignalSet": {signal(s, b)} **is set**(s.id) {λr. signal(s, b) ∧ r = b}
- PR-Await": m.lev, s.lev ≺L O and signal(s, False) ∗ R ⇛ P and {obs(O ⊎ {[m]}) ∗ P} c {λr. obs(O ⊎ {[m]}) ∗ if r then P else signal(s, False) ∗ R} ⟹ {obs(O) ∗ mutex(m, P)} **with** m.loc **await** c {obs(O) ∗ mutex(m, P)}

(b) Mutexes.

- PR-NewMutex": L ∈ Levs ⟹ {P} **new mutex** {λℓ. mutex((ℓ, L), P)}
- PR-Acquire": {obs(O) ∗ mutex(m, P) ∧ m.lev ≺L O} **acquire** m.loc {obs(O ⊎ {[m]}) ∗ locked(m, P) ∗ P}
- PR-Release": {obs(O ⊎ {[m]}) ∗ locked(m, P) ∗ P} **release** m.loc {obs(O) ∗ mutex(m, P)}

(c) Standard rules.

- PR-Frame: {A} c {B} ⟹ {A ∗ F} c {B ∗ F}
- PR-Exists: ∀a ∈ A. {a} c {B} ⟹ {∃A} c {B}
- PR-Fork: {obs(O<sup>f</sup>) ∗ A} c {obs(∅)} ⟹ {obs(O<sup>m</sup> ⊎ O<sup>f</sup>) ∗ A} **fork** c {obs(O<sup>m</sup>)}
- PR-ViewShift: A ⇛ A′ and {A′} c {B′} and B′ ⇛ B ⟹ {A} c {B}

(d) View shifts.

- VS-SemImp: (∀H. consistentlh(H) ∧ H ⊨ A ⇒ H ⊨ B) ⟹ A ⇛ B
- VS-Trans: A ⇛ C and C ⇛ B ⟹ A ⇛ B
- VS-CloneMut": mutex(m, P) ⇛ mutex(m, P) ∗ mutex(m, P)

**Fig. 6.** Proof rules and view shift rules for mutexes and non-thread safe signals. Rules only used in this section marked with ".

*Data Races.* As read and write operations on signals are no longer thread-safe, our logic has to ensure that two threads never try to access sig at the same time. Hence, in our logic possession of a signal chunk signal(s, b) expresses (temporary) *exclusive ownership* of s. Further, our logic requires threads to own any signal they are trying to access. Specifically, when a thread wants to set sig, it must hold a chunk of the form signal((sig, L), b), cf. PR-SetSignal". The same holds for reading a signal's value, cf. PR-IsSignalSet". Note that signal chunks are not duplicable and only created upon creation of the signal they refer to. Therefore, holding a signal chunk for sig indeed guarantees that the holding thread has the exclusive right to access sig (while holding the signal chunk).

*Synchronization and Lock Invariants.* After the main thread creates sig, it exclusively owns the signal. The main thread can transfer ownership of this resource during forking, cf. PR-Fork', and thereby allow the forked thread to busy-wait for sig. This would, however, leave the main thread without any permission to set the signal and thereby discharge its obligation.

We use mutexes to let multiple threads share ownership of a common set of resources in a synchronized fashion. Every mutex is associated with a *lock invariant* P, an assertion chosen by the proof author that specifies which resources the mutex protects. In our example, we want both threads to share sig. To reflect the fact that the signal's value changes over time, we choose a lock invariant that abstracts over its concrete value. We choose <sup>P</sup> := <sup>∃</sup>b. signal((sig, L), b). Let us ignore the chosen signal level L for now. Creating the mutex mut consumes this lock invariant and binds it to mut by creating a mutex chunk mutex((mut,...), P), cf. PR-NewMutex". Thereby, the main thread loses access to sig. The only way to regain access is by acquiring mut, cf. PR-Acquire". Once the thread releases mut, it again loses access to all resources protected by the mutex, cf. PR-Release".

*Deadlocks.* We have to ensure that any acquired mutex is eventually released. Hence, acquiring a mutex spawns a release obligation for this mutex, and the only way to discharge this obligation is indeed by releasing the mutex, cf. PR-Acquire" and PR-Release".

Any attempt to acquire a mutex will block until the mutex becomes available. In order to prove that our program terminates, we have to prove that it does not get stuck during an acquisition attempt. To prevent wait cycles involving mutexes, we require the proof author to associate every mutex (just like every signal) with a level L. This level can be freely chosen during the mutex's creation, cf. PR-NewMutex". Mutex chunks therefore have the form mutex((ℓ, L), P), where ℓ is the heap location the mutex is stored at. Their only purpose is to record the level and lock invariant a mutex is associated with. Hence, these chunks can be freely duplicated, as we will see later. Generally, we denote mutex tuples by m = (ℓ, L). We only allow a thread to acquire a mutex if the mutex's level is lower than the level of each held obligation, cf. PR-Acquire". This also prevents any thread from attempting to acquire a mutex twice, e.g., **acquire** mut; **acquire** mut or **with** mut **await acquire** mut.
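The side condition of PR-Acquire" can again be phrased as a small dynamic check. This Python sketch is our own illustration (names like `acquire` and `LevelError` are invented); it tracks the levels of the release obligations a thread holds and rejects an acquisition whose mutex level is not below all of them, which in particular rules out acquiring the same mutex twice.

```python
# Our illustrative check of the acquisition rule: a mutex may be acquired
# only if its level is below the level of every obligation currently held;
# acquiring adds a release obligation at the mutex's level.
class LevelError(Exception):
    pass

def acquire(held_levels, mutex_level):
    """held_levels: levels of obligations held; returns the updated list."""
    if not all(mutex_level < lev for lev in held_levels):
        raise LevelError("mutex level not below every held obligation")
    return held_levels + [mutex_level]   # release obligation for the mutex

held = acquire([], 0)        # first acquire of mut (level 0): allowed
try:
    acquire(held, 0)         # acquire mut; acquire mut:
    reacquired = True        # needs 0 < 0, which fails,
except LevelError:           # so self-deadlock is rejected
    reacquired = False
```

The same check rejects any cyclic acquisition order between two mutexes, for the same reason as the signal wait cycle of Sect. 2.1.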

*View Shifts.* When verifying a program, it can be necessary to reformulate the proof state and to draw semantic conclusions. To allow this we introduce a so-called *view shift* relation ⇛ [14]. By applying proof rule PR-ViewShift and VS-SemImp we can strengthen the precondition and weaken the postcondition. In our example, we use this to convert the unset signal chunk into the lock invariant which abstracts over the signal's value, i.e., signal(s, False) ⇛ ∃b. signal(s, b).

The logic we present in this work is an intuitionistic separation logic that allows us to drop chunks.<sup>3</sup> This allows us to simplify the postcondition of our fork proof rule's premise from obs(∅) ∗ B to obs(∅), cf. PR-Fork, and drop all unneeded chunks via a semantic implication obs(∅) ∗ B ⇛ obs(∅).

We also allow mutex chunks to be cloned via view shifts, cf. VS-CloneMut". In our example, this is necessary to inform both threads which level and lock invariant mutex mut is associated with. That is, the main thread clones the mutex chunk mutex(m, P) and passes one chunk on when it forks the busy-waiting thread.

In Sect. 2.4 we extend our view shift relation and revisit our interpretation of what a view shift expresses. The full set of rules we use to define ⇛ is presented in the extended version of this paper [28].

*Busy Waiting.* In the approach presented in this paper, for simplicity we only support busy-waiting loops of the form **with** mut **await** c, which is syntactic sugar for **while acquire** mut; **let** r := c **in release** mut; ¬r **do skip**, where r denotes a fresh variable.<sup>4</sup> In each iteration, the loop tries to acquire mut, executes c, releases mut again and lets the result returned by c determine whether the loop continues. Such loops can fail to terminate for two reasons: (i) acquiring mut can get stuck and (ii) the loop could diverge.

We prevent the loop from getting stuck by requiring mut's level to be lower than the level of each held obligation, cf. PR-Await". Further, we enforce termination by requiring the loop to wait for a signal. That is, when verifying a busy-waiting loop using our approach, the proof author must choose a fixed signal and prove that this signal remains unset at the end of every non-finishing iteration. This way, we can prove that the loop terminates by proving that every signal is eventually set, just as in Sect. 2.1. And just as before, our logic requires the level of the waited-for signal to be lower than the level of each held obligation.

Acquiring the mutex in every iteration makes the lock invariant available during the verification of the loop body c. This lock invariant has to be restored at the end of the iteration such that it can be consumed during the mutex's release. PR-Await" allows for an additional view shift to restore the invariant. In our example, we end our busy-waiting loop's non-finishing iterations with the assertion signal(s, False). We use a semantic implication view shift to convert the signal chunk into the mutex invariant <sup>∃</sup>b. signal(s, b).

<sup>3</sup> This allows a thread to drop its obligations chunk obs(*O*). Note, however, that by dropping this chunk the thread does not drop its obligations, but only its ability to show what its obligations are. In particular the thread would be unable to present an empty obligations chunk upon termination.

<sup>4</sup> As we discuss in Sect. 5, in the technical report accompanying this paper we present a more general logic that imposes no such syntactic restrictions.

*Choosing Levels.* In our example, we have to assign levels to the mutex mut and to the signal sig. Our proof rules for mutex acquisition and busy waiting impose some restrictions on the levels of the involved mutexes and signals. By analysing the corresponding rule applications that occur in our proof, we can derive the constraints our level choice must comply with. Our example's verification involves one application of PR-Acquire" and one application of PR-Await": (i) our main thread tries to acquire mut while holding an obligation to set sig; (ii) the forked thread busy-waits for sig while not holding any obligations. Our assignment of levels must therefore satisfy the single constraint m.lev <L s.lev. So, we choose Levs = {0, 1}, m.lev = 0 and s.lev = 1.

# **2.3 Arbitrary Data Structures**

The proof rules we introduced in Sect. 2.2 allow us to verify programs busy-waiting for arbitrary conditions over arbitrary shared data structures as follows: for every condition C the program waits for, the proof author inserts a signal s into the program. They ensure that s is set at the same time the program establishes C and prove an invariant stating that the signal's value expresses whether C holds. Then, the waiting thread can use s to wait for C. We illustrate this here for the simplest case of setting a single heap cell in Fig. 7a.

```
let x := cons(0) in
let mut := new mutex in
fork with mut await [x] = 1;
acquire mut;
[x] := 1;
release mut
```
(a) Example program with busy waiting for heap cell x to be set.

```
let x := cons(0) in
let sig := new signal in
let mut := new mutex in
fork with mut await [x] = 1;
acquire mut;
[x] := 1;
set signal(sig);
release mut
```
(b) Example program from Fig. 7a with additional signal sig inserted, marked in green. sig and x are kept in sync.

```
[e] = e′ := (let r := [e] in r = e′)
```
(c) Syntactic sugar; r not free in e′.

**Fig. 7.** Minimal example illustrating busy waiting for condition over heap cell. (Color figure online)

The program involves three new non-thread-safe commands: (i) **cons**(v) for allocating a new heap cell and initializing it with value v, (ii) [e] := v for assigning value v to the heap location denoted by e, and (iii) [e] for reading the value stored at the heap location denoted by e. We use [e] = e′ as syntactic sugar for **let** r := [e] **in** r = e′, cf. Fig. 7c.

In our example, the main thread allocates x, initializes it with the value 0 and protects it using mutex mut. It forks a new thread busy-waiting for x to be set. Afterwards, the main thread sets x. As explained above, we verify the program by inserting a signal sig that reflects whether x has been set, yet. Figure 7b presents the resulting code. The main thread creates the signal and sets it when it sets x.

```
{obs(∅)}
let x := cons(0) in
{obs(∅) ∗ x → 0}
let sig := new signal in PR-NewSignal" with L = 1
let mut := new mutex in PR-NewMutex" with L = 0
s := (sig, 1), m := (mut, 0)
P := ∃v. x → v ∗ signal(s, v = 1)
{obs({[s]}) ∗ mutex(m, P) ∗ mutex(m, P)}
fork ({obs(∅) ∗ mutex(m, P)}
      with m await m.lev, s.lev ≺L ∅
         {obs({[m]}) ∗ P}
         ∀v. {obs({[m]}) ∗ x → v ∗ signal(s, v = 1)}
            [x] = 1
            {λr. obs({[m]})
                 ∗ if r then P
                   else x → v ∧ v ≠ 1 ∗ signal(s, False)}
     {obs(∅)});
{obs({[s]}) ∗ mutex(m, P)}
acquire mut; m.lev = 0 < 1 = s.lev
∀v. {obs({[s, m]}) ∗ locked(m, P) ∗ x → v ∗ signal(s, v = 1)}
   [x] := 1;
   {obs({[s, m]}) ∗ locked(m, P) ∗ x → 1 ∗ signal(s, v = 1)}
   set signal(sig);
   {obs({[m]}) ∗ locked(m, P) ∗ x → 1 ∗ signal(s, True)}
   release mut
   {obs(∅)}
```
(a) Proof outline for program 7b. Applied proof rules marked in purple. Abbreviations marked in brown. General hints marked in red.


(b) Proof rules. Evaluation function [[·]]. Rules only used in this section marked with "'.

**Fig. 8.** Verifying termination of busy waiting for condition over heap cell. (Color figure online)

*Heap Cells.* Verifying this example does not conceptually differ from the example we presented in Sect. 2.2. Figure 8b presents the new proof rules we need and Fig. 8a sketches our example's verification. As with non-thread-safe signals, we have to prevent multiple threads from trying to access x at the same time in order to prevent data races. For this we use so-called *points-to* chunks [24,31]. They have the form ℓ → v and express that heap location ℓ stores the value v. When a thread holds such a chunk, it exclusively owns the right to access heap location ℓ.

Heap locations are unique, and the only way to create a new points-to chunk is to allocate and initialize a new heap cell via **cons**(v), cf. PR-Cons. Hence, there will never be two points-to chunks involving the same heap location. In order to read or write a heap cell via [e] or [e] := e′, the acting thread must first acquire possession of the corresponding points-to chunk, cf. PR-AssignToHeap and PR-ReadHeapLoc"'.
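As an illustration of this exclusivity argument, the following Python sketch models a heap whose cells can only be accessed through the chunk returned by allocation. The class names `PointsTo` and `Heap` are ours and only mimic the role of points-to chunks; they are not VeriFast's actual representation.

```python
class PointsTo:
    """Ghost 'points-to' chunk l → v: exclusive permission for heap cell l."""
    def __init__(self, loc, val):
        self.loc, self.val = loc, val

class Heap:
    def __init__(self):
        self._cells = {}
        self._next = 0

    def cons(self, v):
        """Allocate and initialize a fresh cell. This is the ONLY way to
        obtain a points-to chunk, so chunks are never duplicated."""
        loc = self._next
        self._next += 1
        self._cells[loc] = v
        return PointsTo(loc, v)

    def read(self, chunk):
        # Reading requires possession of the chunk for this location.
        return self._cells[chunk.loc]

    def write(self, chunk, v):
        # Writing likewise requires possession of the chunk.
        self._cells[chunk.loc] = v
        chunk.val = v

heap = Heap()
x = heap.cons(0)      # {x → 0}
heap.write(x, 1)      # {x → 1}
print(heap.read(x))   # 1
```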

*Relating Signals to Conditions.* In our example, the forked thread busy-waits for x to be set, while our proof rules require us to justify each iteration by showing an unset signal. That is, we must prove an invariant stating that the value of x matches sig. As this invariant must be shared between both threads, we encode it in the lock invariant: P := ∃v. x → v ∗ signal(s, v = 1). This not only allows both threads to share the heap cell and the signal, but also automatically enforces that they maintain the invariant whenever they acquire and release the mutex.

# **2.4 Signal Erasure**

In the program from Fig. 7b, signal sig is never read and hence does not influence the waiting thread's runtime behaviour. Therefore, we can verify the original program presented in Fig. 7a by erasing the physical signal and treating it as ghost code.

*Ghost Signals.* Central aspects of the proof sketch we presented in Fig. 8a are that (i) the main thread was obliged to set sig and that (ii) the value of sig reflected whether x was already set. *Ghost signals* allow us to keep this information while removing the physical signals from the code. Ghost signals are essentially identical to the physical non-thread-safe signals we used so far. However, as ghost resources they cannot influence the program's runtime behaviour. They merely carry information we can use during the verification process.

*View Shifts Revisited.* We implement ghost signals by extending our view shift relation. In particular, we introduce two new view shift rules, VS-NewSignal and VS-SetSignal, presented in Fig. 9b. The former creates a new unset signal and simultaneously spawns an obligation to set it. The latter can be used to set a signal and thereby discharge a corresponding obligation. We say that these rules change the *ghost state* and therefore call their application a *ghost proof step*. With this extension, a view shift from A to B expresses that we can reach postcondition B from precondition A by (i) drawing semantic conclusions or (ii) manipulating the ghost state. In Fig. 9a we use ghost signals to verify the program from Fig. 7a.
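The bookkeeping performed by VS-NewSignal and VS-SetSignal can be mimicked in a few lines of Python. This is an illustrative model of the ghost state (the names `GhostState`, `new_signal`, and `set_signal` are ours), not the logic's formal definition.

```python
class GhostState:
    """Toy model of the ghost state manipulated by VS-NewSignal/VS-SetSignal."""
    def __init__(self):
        self.signals = {}   # id -> (level, is_set)
        self.obs = set()    # held obligations: ids of unset signals
        self._next = 0

    def new_signal(self, level):
        # VS-NewSignal: create an unset signal plus an obligation to set it.
        sid = self._next
        self._next += 1
        self.signals[sid] = (level, False)
        self.obs.add(sid)
        return sid

    def set_signal(self, sid):
        # VS-SetSignal: set the signal, discharging the obligation.
        level, _ = self.signals[sid]
        self.signals[sid] = (level, True)
        self.obs.discard(sid)

g = GhostState()
s = g.new_signal(level=1)
assert g.obs == {s}          # obligation held after creation
g.set_signal(s)
assert g.obs == set()        # obs(∅): the thread may terminate
```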

Note that lifting signals to the verification level does not affect the soundness of our approach. The argument we presented in Sect. 2.1 still holds. We formalize our logic and provide a formal soundness proof in the extended version of this paper [28] and in the technical report [29]. The latter contains a more general version of the presented logic that (i) is not restricted to busy-waiting loops of the form **with** mut **await** c and that (ii) is easier to integrate into existing tools like VeriFast [12], as explained in Sect. 5.

```
{obs(∅)}
let x := cons(0) in
{obs(∅) ∗ x → 0}
new ghost signal;                  VS-NewSignal with L = 1
{∃sig. obs({[(sig, 1)]}) ∗ x → 0 ∗ signal((sig, 1), False)}
s := (sig, 1)
∀sig. {obs({[s]}) ∗ x → 0 ∗ signal(s, False)}
P := ∃v. x → v ∗ signal(s, v = 1)
let mut := new mutex in            PR-NewMutex" with L = 0
{obs({[s]}) ∗ mutex((mut, 0), P) ∗ mutex((mut, 0), P)}
m := (mut, 0)
fork ({obs(∅) ∗ mutex(m, P)}
      with m await                 m.lev, s.lev ≺L ∅
         {obs({[m]}) ∗ P}
         ∀v. {obs({[m]}) ∗ x → v ∗ signal(s, v = 1)}
            [x] = 1
            {λr. obs({[m]})
                 ∗ if r then P
                   else x → v ∧ v ≠ 1 ∗ signal(s, False)}
     {obs(∅)});
{obs({[s]}) ∗ mutex(m, P)}
acquire mut;                       m.lev = 0 < 1 = s.lev
∀v. {obs({[s, m]}) ∗ locked(m, P) ∗ x → v ∗ signal(s, v = 1)}
   [x] := 1;
   set ghost signal(s);
   {obs({[m]}) ∗ locked(m, P) ∗ x → 1 ∗ signal(s, True)}
   release mut
   {obs(∅)}
```

(a) Proof outline for the program presented in Fig. 7a. Auxiliary commands hinting at view shifts and general hints marked in red. Applied proof and view shift rules marked in purple. Abbreviations marked in brown.

```
VS-NewSignal
  L ∈ Levs
  obs(O) ⇝ ∃id. obs(O ⊎ {[(id, L)]}) ∗ signal((id, L), False)

VS-SetSignal
  obs(O ⊎ {[s]}) ∗ signal(s, _) ⇝ obs(O) ∗ signal(s, True)
```

(b) Proof rules.

**Fig. 9.** Verifying termination with ghost signals. (Color figure online)

# **3 A Realistic Example**

To demonstrate the expressiveness of the presented verification approach, we verified the termination of the program presented in Fig. 10a. It involves two threads, a consumer and a producer, communicating via a shared bounded FIFO with a maximal capacity of 10. The producer enqueues the numbers 100, ..., 1 into the FIFO and the consumer dequeues them. Whenever the queue is full, the producer busy-waits for the consumer to dequeue an element. Likewise, whenever the queue is empty, the consumer busy-waits for the producer to enqueue the next element. Each thread's finishing depends on the other thread's productivity. This is, however, not a cyclic dependency: for instance, in order to prove that the producer eventually pushes number i into the queue, we only need to rely on the consumer to pop i + 10. A similar property holds for the consumer.

```
alloc ghost signal IDs (id_pop^i, id_push^i) for 1 ≤ i ≤ 100;
L_pop^i := 102 − i, L_push^i := 101 − i, s_x^i := (id_x^i, L_x^i) for 1 ≤ i ≤ 100
init ghost signals (s_pop^100, s_push^100);
{obs({[s_pop^100, s_push^100]}) ∗ ...}
let fifo10 := cons(nil) in let mut := new mutex in
let cp := cons(100) in let cc := cons(100) in
fork (while (                        cp decreases in each iteration.
        with mut await (             Busy-wait for fifo10 not being full.
          {obs({[s_push^cp, (mut, 0)]}) ∗ ...}    → Wait for consumer to pop.
          let f := [fifo10] in
          if size(f) < 10 then (     If fifo10 not full, push next element.
            let c := [cp] in [fifo10] := f·c; [cp] := c − 1;
            set ghost signal(s_push^c);
            if c − 1 ≠ 0 then init ghost signal(s_push^(c−1)));
          size(f) ≠ 10);             If size(f) = 10, wait for s_pop^(cp+10).
        [cp] ≠ 0)                    L_pop^(cp+10) = 92 − cp < 101 − cp = L_push^cp
      do skip);
while (                              cc decreases in each iteration.
  with mut await (                   Busy-wait for fifo10 not being empty.
    {obs({[s_pop^cc, (mut, 0)]}) ∗ ...}           → Wait for producer to push.
    let f := [fifo10] in
    if size(f) > 0 then (            If fifo10 not empty, pop next element.
      let c := [cc] in [fifo10] := tail(f); [cc] := c − 1;
      set ghost signal(s_pop^c);
      if c − 1 ≠ 0 then init ghost signal(s_pop^(c−1)));
    size(f) > 0);                    If size(f) = 0, wait for s_push^cc.
  [cc] ≠ 0)                          L_push^cc = 101 − cc < 102 − cc = L_pop^cc
do skip);
```
(a) Example program with two threads communicating via a shared bounded FIFO with maximal size 10. Auxiliary commands hinting at view shifts and general hints marked in red. Abbreviations marked in brown. Hints on proof state marked in blue.

```
VS-AllocSigID
  True ⇝ ∃id. uninitSig(id)

VS-SigInit
  obs(O) ∗ uninitSig(id) ⇝ obs(O ⊎ {[(id, L)]}) ∗ signal((id, L), False)
```
(b) Fine-grained view shift rules for signal creation.
**Fig. 10.** Realistic example program. (Color figure online)
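The producer-consumer protocol of Fig. 10a can be exercised with a small runnable sketch. This is plain Python with a lock-protected list standing in for the bounded FIFO and its busy-waiting loops; it mirrors the control structure only, not the ghost instrumentation.

```python
import threading

CAP = 10
fifo = []
mut = threading.Lock()

def producer():
    for i in range(100, 0, -1):       # enqueue the numbers 100, ..., 1
        while True:                   # busy-wait while the FIFO is full
            with mut:
                if len(fifo) < CAP:
                    fifo.append(i)
                    break

consumed = []

def consumer():
    for _ in range(100):
        while True:                   # busy-wait while the FIFO is empty
            with mut:
                if fifo:
                    consumed.append(fifo.pop(0))
                    break

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
print(consumed == list(range(100, 0, -1)))  # True
```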

*Fine-Tuning Signal Creation.* To simplify complex proofs involving many signals, we refine the process of creating a new ghost signal. So far, for simplicity, we combined the allocation of a new signal ID and its association with a level and a boolean into one step. For some proofs, such as the one we outline in this section, it can be helpful to fix, right at the beginning, the IDs of all signals that will be created throughout the proof. To realize this, we replace view shift rule VS-NewSignal by the rules presented in Fig. 10b and adapt our signal chunks accordingly. With these more fine-grained view shifts, we start by allocating a signal ID, cf. VS-AllocSigID. Thereby we obtain an *uninitialized* signal uninitSig(id) that is not yet associated with any level or boolean. Note that allocating a signal ID does not create any obligation, because threads can only wait for *initialized* (and unset) signals. When we initialize a signal, we bind its already allocated ID to a level of our choice and associate the signal with False, cf. VS-SigInit. This creates an obligation to set the signal.
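The split into allocation and initialization can be sketched in Python as follows. The names are ours; the point is only that allocating an ID creates no obligation, while initializing one does.

```python
class FineGrainedGhostState:
    """Toy model of VS-AllocSigID / VS-SigInit (illustrative only)."""
    def __init__(self):
        self.uninit = set()     # allocated but uninitialized signal IDs
        self.signals = {}       # id -> (level, is_set)
        self.obs = set()        # held obligations
        self._next = 0

    def alloc_sig_id(self):
        # VS-AllocSigID: True ⇝ ∃id. uninitSig(id); no obligation created.
        sid = self._next
        self._next += 1
        self.uninit.add(sid)
        return sid

    def sig_init(self, sid, level):
        # VS-SigInit: consume uninitSig(id), produce an unset signal at the
        # chosen level plus the obligation to set it.
        self.uninit.remove(sid)
        self.signals[sid] = (level, False)
        self.obs.add(sid)

g = FineGrainedGhostState()
ids = [g.alloc_sig_id() for _ in range(200)]   # fix all 200 IDs up front
assert g.obs == set()                          # still no obligations
g.sig_init(ids[0], level=1)
assert g.obs == {ids[0]}                       # initialization creates one
```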

*Loops and Signals.* In our program, both threads have a local counter initially set to 100 and run a nested loop. The outer loops are controlled by their thread's counter, which is decreased in each iteration until it reaches 0 and the loop stops. For such loops, we introduce a conventional proof rule for total correctness of loops, cf. this paper's extended version [28]. Verifying termination of the inner loops is a bit more tricky and requires the use of ghost signals.

So far, we had to fix a single signal for the verification of every **await** loop. We can relax this restriction and instead consider a finite set of signals the loop may wait for, cf. PR-Await presented in [28]. Apart from being a generalisation, this rule does not differ from PR-Await" introduced in Sect. 2.2.

Initially, we allocate 200 signal IDs id<sup>100</sup><sub>push</sub>, ..., id<sup>1</sup><sub>push</sub>, id<sup>100</sup><sub>pop</sub>, ..., id<sup>1</sup><sub>pop</sub>. We are going to ensure that at all times at most one push signal and at most one pop signal are initialized and unset. The producer and consumer are going to hold the obligation for the push and pop signal, respectively. The producer will hold the obligation for s<sup>i</sup><sub>push</sub> while i is the next number to be pushed into the FIFO, and it will set s<sup>i</sup><sub>push</sub> when it pushes the number i into the FIFO. Meanwhile, the consumer will use s<sup>i</sup><sub>push</sub> to wait for the number i to arrive in the queue when it is empty. Similarly, the consumer will hold the obligation for s<sup>i</sup><sub>pop</sub> while number i is the next number to be popped from the FIFO and will set s<sup>i</sup><sub>pop</sub> when it pops the number i. The producer uses s<sup>i</sup><sub>pop</sub> to wait for the consumer to pop i from the queue when it is full. At any time, we let the mutex mut protect the two active signals and thereby make them accessible to both threads.

*Choosing the Levels.* Note that we have ignored the levels so far. The producer and the consumer both acquire the mutex while holding an obligation for a signal. Hence, we choose Levs = ℕ, m.lev = 0 and s.lev > 0 for every signal s. Both threads will justify iterations of their respective **await** loop by presenting an unset signal at the end of such an iteration. Our proof rules allow us to ignore the mutex obligation during this step. Hence, the mutex level does not interfere with the level of the unset signal. Whenever the queue is full, the producer waits for the consumer to pop an element, and whenever the queue is empty, the consumer waits for the producer to push one. That is, the producer waits for s<sup>i+10</sup><sub>pop</sub> while holding an obligation for s<sup>i</sup><sub>push</sub>, and the consumer waits for s<sup>i</sup><sub>push</sub> while holding an obligation for s<sup>i</sup><sub>pop</sub>. So, we have to choose the signal levels such that s<sup>i+10</sup><sub>pop</sub>.lev < s<sup>i</sup><sub>push</sub>.lev and s<sup>i</sup><sub>push</sub>.lev < s<sup>i</sup><sub>pop</sub>.lev hold. We solve this by choosing s<sup>i</sup><sub>pop</sub>.lev = 102 − i and s<sup>i</sup><sub>push</sub>.lev = 101 − i.
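The two constraint families are easy to check mechanically for the chosen level assignment; this throwaway Python check simply evaluates the inequalities for every i.

```python
# Check that the chosen levels satisfy the two constraint families
# from the proof: s_pop^{i+10}.lev < s_push^i.lev and s_push^i.lev < s_pop^i.lev.
def lev_pop(i):  return 102 - i
def lev_push(i): return 101 - i

ok = all(lev_pop(i + 10) < lev_push(i)      # 92 − i < 101 − i
         and lev_push(i) < lev_pop(i)       # 101 − i < 102 − i
         for i in range(1, 101))
print(ok)  # True
```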

*Verifying Termination.* This setup suffices to verify the example program. Via the lock invariant, each thread has access to both active signals. Whenever the producer pushes a number i into the queue, it sets s<sup>i</sup><sub>push</sub>, which discharges the held obligation, and decreases its counter. Afterwards, if i > 1, it uses the uninitialized signal chunk uninitSig(id<sup>i−1</sup><sub>push</sub>) to initialize s<sup>i−1</sup><sub>push</sub> = (id<sup>i−1</sup><sub>push</sub>, 101 − (i − 1)) and replaces s<sup>i</sup><sub>push</sub> in the lock invariant by s<sup>i−1</sup><sub>push</sub> before it releases the lock. If i = 1, the counter has reached 0 and the loop ends. In this case, the producer holds no obligation. The consumer behaves similarly. Since we proved that each thread discharges all its obligations, we proved that the program terminates. Figure 10a illustrates the most important proof steps. We present the program's verification in full detail in the extended version of this paper [28] and in the technical report [29]. Furthermore, we encoded [27] the proof in VeriFast [12].

The number of threads in this program is fixed. However, our approach also supports the verification of programs where the number of threads is not even statically bounded. In [28] we present and verify such a program. It involves N producer and N consumer threads that communicate via a shared buffer of size 1, for a random number N > 0 determined at runtime.

# **4 Specifying Busy-Waiting Concurrent Objects**

Our approach can be used to verify busy-waiting concurrent objects with respect to abstract specifications. For example, we have verified [26] the CLH lock [7] against a specification that is very similar to our proof rules for built-in mutexes shown in Fig. 6. The main difference is that it is slightly more abstract: when a lock is initialized, it is associated with a *bounded infinite set* of levels rather than with a single particular level. (To make this possible, an appropriate universe of levels should be used, such as the set of lists of natural numbers, ordered lexicographically.) To acquire a lock, the levels of the obligations held by the thread must be above the elements of the set; the new obligation's level is an element of the set.
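A hedged sketch of why lists of naturals ordered lexicographically make a convenient level universe: Python tuples already compare lexicographically, so a bounded infinite set such as {(0, n) | n ∈ ℕ} lies entirely below the level (1,). The encoding below is ours, for illustration only.

```python
# Levels as tuples of naturals, compared lexicographically.
# A thread holding an obligation at level (1,) may acquire a lock whose
# level set is {(0, n) | n ∈ ℕ}: every element is below (1,), yet the set
# is infinite, so nested acquisitions can keep picking fresh levels.
held_obligation = (1,)                     # level of a held obligation
lock_levels = [(0, n) for n in range(5)]   # finitely many samples of the set

ok = all(lv < held_obligation for lv in lock_levels)
print(ok)  # True
```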

# **5 Tool Support**

We have extended the VeriFast tool [10] for separation logic-based modular verification of C and Java programs so that it supports verifying termination of busy-waiting C or Java programs. When verifying termination, VeriFast consumes a *call permission* at each recursive call or loop iteration. In the technical report [29] we define a generalised version of our logic that, instead of providing a special proof rule for busy-waiting loops, provides *wait permissions* and a *wait view shift*. A call permission of a *degree* δ can be turned into a wait permission of a lower degree δ′ < δ for a given signal s. A wait view shift for an unset signal s for which a wait permission of degree δ′ exists produces a call permission of degree δ′, which can be used to fuel a busy-waiting loop. When busy-waiting for some signal s, we can thus generate new permissions to justify each iteration as long as s remains unset.
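Our reading of this degree discipline can be sketched as a toy Python model. The class and method names are ours, and the model only captures two points: degrees strictly decrease when trading a call permission for a wait permission, and an unset signal is required to regenerate fuel.

```python
class Degrees:
    """Toy model of call/wait permissions with well-founded degrees
    (our reading of the rules sketched in the text; illustrative only)."""
    def __init__(self):
        self.call = []    # multiset of call-permission degrees
        self.wait = {}    # signal -> wait-permission degree

    def call_to_wait(self, delta, delta2, s):
        # Trade call(δ) for wait(s, δ') with δ' < δ.
        assert delta2 < delta, "wait permission must have a lower degree"
        self.call.remove(delta)
        self.wait[s] = delta2

    def wait_view_shift(self, s, signal_unset):
        # Only an unset signal yields fuel for another iteration.
        assert signal_unset, "signal already set: no further iterations"
        self.call.append(self.wait[s])

p = Degrees()
p.call.append(2)
p.call_to_wait(2, 1, "s")
p.wait_view_shift("s", signal_unset=True)
print(p.call)  # [1]
```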

VeriFast allows threads to freely exchange permissions. This is useful to verify termination of non-blocking algorithms involving compare-and-swap loops [11]. However, we must be careful to prevent self-fueling busy-waiting loops. Hence, we restrict where a permission can be consumed based on the *thread phase* it was created in. The main thread starts in the empty initial phase. When a thread in phase p forks a new thread, its phase changes to p.Forker and the new thread starts in phase p.Forkee. We allow a thread in phase p to consume a permission only if it was produced in an *ancestor thread phase* p′ of p, i.e., a prefix of p.
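The thread-phase rule amounts to a prefix check on fork histories. The following sketch encodes phases as tuples, an encoding we chose for illustration (the paper writes p.Forker and p.Forkee):

```python
# Thread phases as sequences of fork steps; a permission created in
# phase p may be consumed in phase q only if p is a prefix of q.
def fork(p):
    """Return (forker's new phase, forkee's phase)."""
    return p + ("Forker",), p + ("Forkee",)

def is_ancestor(p, q):
    return q[:len(p)] == p

root = ()                      # the main thread's initial phase
main2, child = fork(root)

assert is_ancestor(root, child)        # pre-fork permissions flow to the child
assert not is_ancestor(main2, child)   # post-fork ones do not
print("ok")
```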

The only change we had to make to VeriFast's symbolic execution engine was to enforce the thread phase rule. We encoded the other aspects of the logic simply as axioms in a *trusted header file*. We used this tool support to verify the bounded FIFO (Sect. 3) and the CLH lock (Sect. 4). The bounded FIFO proof [27] contains 160 lines of proof annotations for 37 lines of code (an annotation overhead of 435%) and takes 0.08 s to verify. The CLH lock proof [26] contains 343 lines of annotations for 49 lines of code (an overhead of 700%) and takes 0.1 s to verify.

### **6 Integrating Higher-Order Features**

The logic we presented in this paper does not support higher-order features such as assertions that quantify over assertions, or storing assertions in the (logical) heap as the values of ghost cells. While we did not need such features to carry out our example proofs, they are generally useful to verify higher-order program modules against abstract specifications. The typical way to support such features in a program logic is by applying *step indexing* [1,17], where the domain of logical heaps is indexed by the number of execution steps left in the (partial) program trace under consideration. Assertions stored in a logical heap at index n + 1 talk about logical heaps at index n; i.e., they are meaningful only *later*, after at least one more execution step has been performed.

It follows that such logics apply directly only to *partial* correctness properties. Fortunately, we can reduce a termination property to a safety property by writing our program in a programming language *instrumented* with runtime checks that guarantee termination. Specifically, we can write our program in a programming language that fulfils the following criteria: it tracks signals, obligations, and permissions at runtime and has constructs for signal creation, waiting, and setting a signal. The **fork** command takes as an extra operand the list of obligations to be transferred to the new thread (and the other constructs similarly take sufficient operands to eliminate any need for angelic choice). Threads get stuck when these constructs' preconditions are not satisfied, for example when a thread waits for a signal while holding the obligation for that signal. We can then use a step-indexing-based higher-order logic such as Iris [14] to verify that no thread in our program ever gets stuck. Once we have established this, we know that the instrumentation has no effect and can safely be *erased* from the program.

# **7 Related and Future Work**

In recent work [30] we proposed a separation logic to verify termination of programs where threads busy-wait to be abruptly terminated. The present paper generalizes that work to support busy waiting for arbitrary conditions.

In [11] we proposed an approach based on *call permissions* to verify termination of single- and multithreaded programs that involve loops and recursion. However, that work does not consider busy-waiting loops. In the technical report [29], we present a generalised logic that uses call permissions and allows busy waiting to be implemented using arbitrary looping and/or recursion. Furthermore, the use of call permissions allowed us to encode our case studies in our VeriFast tool, which also uses call permissions for termination verification.

Liang and Feng [20,21] propose LiLi, a separation logic to verify liveness of blocking constructs implemented via busy waiting. In contrast to our verification approach, theirs is based on the idea of contextual refinement. In their approach, client code involving calls of blocking methods of the concurrent object is verified by first applying the contextual refinement result to replace these calls by code involving primitive blocking operations and then verifying the resulting client code using some other approach. In contrast, specifications in our approach are regular Hoare-style triples and proofs are regular Hoare-style proofs.

In [9] we propose a Hoare logic to verify liveness properties of the I/O behaviour of programs that do not perform busy waiting. By combining that approach with the one we proposed in this paper, we expect to be able to verify I/O liveness of realistic concurrent programs involving both I/O and busy waiting, such as a server where one thread receives requests and enqueues them into a bounded FIFO, and another one dequeues them and responds. To support this claim, we encoded the combined logic in VeriFast and verified a simple server application where the receiver and responder thread communicate via a shared buffer [25].

# **8 Conclusion**

We propose what is, to the best of our knowledge, the first separation logic for verifying termination of programs with busy waiting. We provide a soundness proof for the system of this paper in its extended version [28], and for a more general system in the technical report [29]. Further, we demonstrated the logic's usability by verifying a realistic example. We encoded our logic and the realistic example in VeriFast [27] and used this encoding also to verify the CLH lock [26]. Moreover, we expect that our approach can be integrated into other existing concurrent separation logics such as Iris [14].

# **References**


50 T. Reinhard and B. Jacobs

33. Vafeiadis, V.: Concurrent separation logic and operational semantics. Electronic Notes in Theoretical Computer Science 276, 335–351 (2011). https://doi.org/10.1016/j.entcs.2011.09.029. Twenty-Seventh Conference on the Mathematical Foundations of Programming Semantics (MFPS XXVII)

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Reflections on Termination of Linear Loops**

Shaowei Zhu(B) and Zachary Kincaid

Princeton University, Princeton, NJ 08544, USA {shaoweiz,zkincaid}@cs.princeton.edu

**Abstract.** This paper shows how techniques for linear dynamical systems can be used to reason about the behavior of general loops. We present two main results. First, we show that every loop that can be expressed as a transition formula in linear integer arithmetic has a best model as a deterministic affine transition system. Second, we show that for any linear dynamical system f with integer eigenvalues and any integer arithmetic formula G, there is a linear integer arithmetic formula that holds exactly for the states of f for which G is eventually invariant. Combining the two, we develop a monotone conditional termination analysis for general loops.

**Keywords:** Termination · Conditional termination · Best abstraction · Reflective subcategory · Linear dynamical systems · Monotone analysis

# **1 Introduction**

Linear and affine dynamical systems are a model of computation that is easy to analyze (relative to non-linear systems), making them useful across a broad array of applications. In the context of program analysis, affine dynamical systems correspond to loops of the form

**while** (*G*(**x**)) **do x** := *A***x** + **b** (†)

where G is a formula, A is a matrix, **x** is a vector of program variables, and **b** is a constant vector. The termination problem for such loops has been shown to be decidable for several variations of this model [4,9,12,24,29]. However, few loops in real programs take this form, and so this work has not yet made an impact on practical termination analysis tools. This paper bridges the gap between theory and practice, showing how techniques for linear and affine dynamical systems can be used to reason about general programs.

*Example 1.* We illustrate our methodology using the example program in Fig. 1 (left). First, observe that although the body of this loop is not of the form (†), the value of the sum x + y decreases by z each iteration, and z remains the same. Thus, we can approximate the loop by the linear dynamical system in Fig. 1 (right), where the nature of the approximation is given by the linear map in the center of Fig. 1 (i.e., the a coordinate corresponds to x + y, and the b coordinate

```
z := 1
while (x ≥ 0 ∧ y ≥ 0) do
  ...
  if ... then
    x := x − z
  else
    y := y − z
```

$$\text{simulation: } \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} w \\ x \\ y \\ z \end{bmatrix} \qquad\qquad \text{dynamics: } \begin{bmatrix} a \\ b \end{bmatrix} := \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix}$$

**Fig. 1.** Over-approximation of a loop by a linear dynamical system.

to z). The linear map is a simulation, in the sense that it transforms the state space of the program into the state space of the linear dynamical system so that every step in the loop has a corresponding step in the linear dynamical system.

Next, we compute the image of the guard of the loop (x ≥ 0 ∧ y ≥ 0) under the simulation, which yields a ≥ 0 (corresponding to the constraint x + y ≥ 0 over the original program variables). We can compute a closed form for this constraint holding on the kth iteration of the loop by exponentiating the dynamics matrix of the linear dynamical system, multiplying on the left by the row vector corresponding to the constraint, and on the right by the simulation:

$$\underbrace{\begin{bmatrix} 1 \ 0 \end{bmatrix}}\_{\text{Constraints}} \underbrace{\begin{bmatrix} 1 \ -1 \\ 0 \ 1 \end{bmatrix}}\_{\text{Dynamics Similarity}} \underbrace{\begin{bmatrix} 0 \ 1 \ 1 \ 0 \\ 0 \ 0 \ 0 \ 1 \end{bmatrix}}\_{\text{Simulation}} \begin{bmatrix} w \\ x \\ y \\ z \end{bmatrix} = (x + y) - kz.$$

We then analyze the asymptotic behavior of the closed form:

$$\text{As } k \to \infty, (x+y) - kz \to \begin{cases} -\infty & \text{if } z > 0 \\ x+y & \text{if } z = 0 \\ \infty & \text{if } z < 0 \end{cases}$$

We conclude that z > 0 ∨ (x + y) < 0 is a sufficient condition for the loop to terminate. ⌟
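The reasoning above can be checked concretely. The following sketch (plain Python, not part of the paper's toolchain; the initial state is an arbitrary choice with z treated as an input) runs the loop of Fig. 1 and confirms that the abstracted quantity x + y follows the closed form (x₀ + y₀) − k·z₀ on every iteration k:

```python
def body(w, x, y, z):
    """One iteration of the loop body from Fig. 1."""
    w2 = 3*w + x + 1
    if (x - y) % 2 == 0:          # 2 | x - y
        return w2, x - z, y, z
    return w2, x, y - z, z

w, x, y, z = 5, 17, 9, 3          # z > 0, so the loop must terminate
s0, z0, k = x + y, z, 0
while x >= 0 and y >= 0:          # loop guard
    w, x, y, z = body(w, x, y, z)
    k += 1
    assert x + y == s0 - k * z0   # closed form from the abstraction
```

With z > 0 the sum x + y strictly decreases, so the guard eventually fails; this is exactly the first case of the asymptotic analysis above.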

The paper is organized as follows. To serve as the class of "linear models" of loops, we introduce *deterministic affine transition systems* (DATS), a computational model that generalizes affine dynamical systems. Section 3 shows that any loop expressed as a linear integer arithmetic formula has a *DATS-reflection*, which is a best representation of the behavior of the loop as a DATS. Moreover, this holds for a restricted class of DATS with rational eigenvalues. Section 4 shows that for a linear map f with integer eigenvalues and a linear integer arithmetic formula G, there is a linear integer arithmetic formula that holds exactly for those states **x** such that G(f^k(**x**)) holds for all but finitely many k ∈ ℕ. Section 5 brings the results together, showing that the analysis of a DATS with rational eigenvalues can be reduced to the analysis of a linear dynamical system with integer eigenvalues. The fact that DATS-reflections are *best* implies monotonicity of the analysis. Finally, in Section 6, we demonstrate experimentally that the analysis can be successfully applied to general programs, using the framework of algebraic termination analysis [34] to lift our loop analysis to a whole-program conditional termination analysis. Some proofs are omitted for space, but may be found in the extended version of this paper [33].

# **2 Preliminaries**

This paper assumes familiarity with linear algebra – see for example [19]. We recall some basic definitions below.

In the following, a **linear space** refers to a finite-dimensional linear space over the field of rational numbers ℚ. For V a linear space and U ⊆ V, *span*(U) is the linear space generated by U; i.e., the smallest linear subspace of V that contains U. An **affine subspace** of a linear space V is the image of a linear subspace of V under a translation (i.e., a set of the form {v + v₀ : v ∈ U} for some linear subspace U ⊆ V and some v₀ ∈ V). For any scalar a ∈ ℚ and any linear space V, we use a to denote the linear map a : V → V that maps v ↦ av (in particular, 1 is the identity). A **linear functional** on a linear space V is a linear map V → ℚ; the set of all linear functionals on V forms a linear space called the **dual space** of V, denoted V⋆. A linear map f : V₁ → V₂ induces a dual linear map f⋆ : V₂⋆ → V₁⋆, where f⋆(g) ≜ g ◦ f. For any linear space V, V is naturally isomorphic to V⋆⋆, where the isomorphism maps x ↦ λf : V⋆. f(x).

Let V be a linear space. A linear map f : V → V is associated with a **characteristic polynomial** p_f(x), defined to be the determinant of (xI − A_f), where A_f is a matrix representation of f with respect to some basis (the choice of which is irrelevant). Define the **spectrum** (set of eigenvalues) of f to be the set of (possibly complex) roots of its characteristic polynomial, *spec*(f) ≜ {λ ∈ ℂ : p_f(λ) = 0}. We say that f has **rational spectrum** if *spec*(f) ⊆ ℚ; equivalently (by the spectral theorem – see e.g. [19, Ch. 6, Theorem 7]), if V is spanned by generalized eigenvectors of f with rational eigenvalues.
It is possible to determine whether a linear map has rational spectrum (and compute a basis of eigenvectors for V and V⋆) in polynomial time by computing its characteristic polynomial [15], factoring it [22], and checking whether each factor is linear.
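This check can be sketched end-to-end in exact arithmetic. The code below is a naive stand-in for the cited polynomial-time routines [15,22]: the characteristic polynomial is computed with the Faddeev-LeVerrier recurrence, and candidate rational roots come from the rational root theorem rather than a factoring algorithm:

```python
from fractions import Fraction
from math import gcd

def char_poly(A):
    """Characteristic polynomial det(xI - A) via the Faddeev-LeVerrier
    recurrence; returns coefficients [c_0, ..., c_{n-1}, 1], low degree first."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    B = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(0)] * n + [Fraction(1)]
    for k in range(1, n + 1):
        AB = [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
              for i in range(n)]
        c = -sum(AB[i][i] for i in range(n)) / k          # c_{n-k}
        coeffs[n - k] = c
        B = [[AB[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs

def rational_spectrum(A):
    """Eigenvalues of A (with multiplicity) if all are rational, else None."""
    p = char_poly(A)
    n = len(p) - 1
    den = 1
    for c in p:
        den = den * c.denominator // gcd(den, c.denominator)
    ipoly = [int(c * den) for c in p]                     # integer coefficients
    k = next(i for i, c in enumerate(ipoly) if c != 0)
    roots = [Fraction(0)] * k                             # factor x^k
    poly = [Fraction(c) for c in ipoly[k:]]
    divs = lambda m: [d for d in range(1, m + 1) if m % d == 0]
    cands = {sgn * Fraction(pn, q)
             for pn in divs(abs(int(poly[0])))
             for q in divs(abs(int(poly[-1])))
             for sgn in (1, -1)}
    for r in cands:
        while len(poly) > 1 and sum(c * r**i for i, c in enumerate(poly)) == 0:
            # synthetic division of poly by (x - r)
            b = [Fraction(0)] * (len(poly) - 1)
            b[-1] = poly[-1]
            for i in range(len(poly) - 2, 0, -1):
                b[i - 1] = poly[i] + r * b[i]
            poly = b
            roots.append(r)
    return sorted(roots) if len(roots) == n else None
```

For instance, the matrix [[2,0,1],[0,0,1],[0,-1,0]] appearing in Example 5 has characteristic polynomial (x − 2)(x² + 1), so `rational_spectrum` reports that its spectrum is not rational.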

The syntax of linear integer arithmetic (LIA) is given as follows:

$$\begin{aligned} x &\in \mathsf{Variable} \\ n &\in \mathbb{Z} \\ t &\in \mathsf{Term} ::= x \mid n \mid n \cdot t \mid t_1 + t_2 \\ F &\in \mathsf{Formula} ::= t_1 \le t_2 \mid (n \mid t) \mid F_1 \land F_2 \mid F_1 \lor F_2 \mid \neg F \mid \exists x.F \mid \forall x.F \end{aligned}$$

Let <sup>X</sup> <sup>⊆</sup> Variable be a set of variables. A **valuation** over <sup>X</sup> is a map <sup>v</sup> : <sup>X</sup> <sup>→</sup> Z. If F is a formula whose free variables range over X and v is a valuation over X, then we say that v satisfies F (written v |= F) if the formula F is true when interpreted over the standard model of the integers, using v to interpret the free variables. We write F |= G if every valuation that satisfies F also satisfies G.

#### **2.1 Transition Systems**

A **transition system** T is a pair T = ⟨S_T, R_T⟩, where S_T is a set of states and R_T ⊆ S_T × S_T is a transition relation. Within this paper, we shall assume that the state space of any transition system is a finite-dimensional linear space (over ℚ). We write x →_T x′ to denote that the pair ⟨x, x′⟩ belongs to R_T. We define the **domain** of a transition system T, dom(T) ≜ {x ∈ S_T : ∃x′. x →_T x′}, to be the set of states that have a T-successor. We define the ω**-domain** dom^ω(T) of T to be the set of states from which there exist infinite T-computations:

$$\mathrm{dom}^{\omega}(T) \triangleq \{x_0 \in S_T : \exists x_1, x_2, \ldots \text{ such that } x_0 \to_T x_1 \to_T x_2 \to_T \cdots\} \ .$$

A **transition formula** F(X, X′) is an LIA formula whose free variables range over a designated finite set of variables X and a set of "primed copies" X′ = {x′ : x ∈ X}. For example, a transition formula that represents the body of the loop in Fig. 1 is

$$\begin{array}{l} x \ge 0 \land y \ge 0 \land w' = 3w + x + 1 \land z' = z\\ \land \left( \begin{array}{l} ((2 \mid x - y) \land x' = x - z \land y' = y) \\ \lor (\neg(2 \mid x - y) \land y' = y - z \land x' = x) \end{array} \right) \end{array} \tag{1}$$

We use **TF** to denote the set of transition formulas. A transition formula F(X, X′) defines a transition system where the state space is the set of functions X → ℚ, and where v →_F v′ if and only if both (1) v and v′ map each x ∈ X to an integer and (2) [v, v′] |= F, where [v, v′] denotes the valuation that maps each x ∈ X to v(x) and each x′ ∈ X′ to v′(x). Defining the state space of F to be X → ℚ rather than X → ℤ is a technical convenience (X → ℚ ≅ ℚ^|X| is a linear space), but does not materially affect the results of this paper, since only (integral) valuations are involved in transitions.

Let T = ⟨S_T, R_T⟩ be a transition system. We say that T is:

– **linear** if R_T is a linear subspace of S_T × S_T

– **affine** if R_T is an affine subspace of S_T × S_T

– **deterministic** if for each x ∈ S_T there is at most one x′ ∈ S_T with x →_T x′

– **total** if for all x ∈ S_T there exists some x′ ∈ S_T with x →_T x′

For example, the transition system T with transition relation

$$R_T \triangleq \left\{ \left\langle \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} x' \\ y' \end{bmatrix} \right\rangle : \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}$$

is deterministic and affine, but not linear or total. The transition system U with transition relation

$$R_U \triangleq \left\{ \left\langle \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} x' \\ y' \end{bmatrix} \right\rangle : \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \right\}$$

is total, linear (and affine), but not deterministic. The classical notion of a **linear dynamical system**—a transition system where the state evolves according to a linear map—corresponds to a *total*, *deterministic*, *linear* transition system. Similarly, an **affine dynamical system** is a transition system that is total, deterministic, and affine.

For any map s : X → Y and any relation R ⊆ X × X, define the image of R under s to be the relation s[R] = {⟨s(x), s(x′)⟩ : ⟨x, x′⟩ ∈ R}. For any relation R ⊆ Y × Y, define the inverse image of R under s to be the relation s⁻¹[R] = {⟨x, x′⟩ : ⟨s(x), s(x′)⟩ ∈ R}. Let T = ⟨S_T, R_T⟩ and U = ⟨S_U, R_U⟩ be transition systems. We say that a linear map s : S_T → S_U is a **linear simulation from** T **to** U, and write s : T → U, if for all x →_T x′ we have s(x) →_U s(x′). Observe that the following are equivalent: (1) s is a simulation, (2) s[R_T] ⊆ R_U, and (3) R_T ⊆ s⁻¹[R_U].
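Condition (2), s[R_T] ⊆ R_U, can be tested on sampled transitions. The sketch below checks it for the loop body and simulation of Fig. 1: mapping a concrete post-state through the simulation agrees with applying the abstract dynamics to the mapped pre-state:

```python
import random

S = [[0, 1, 1, 0],    # simulation from Fig. 1: (w,x,y,z) -> (x+y, z)
     [0, 0, 0, 1]]
D = [[1, -1],         # abstract dynamics: (a,b) -> (a-b, b)
     [0, 1]]

def mat_apply(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def body(w, x, y, z):
    """One transition of the loop body from Fig. 1 (guard x >= 0, y >= 0)."""
    w2 = 3*w + x + 1
    return (w2, x - z, y, z) if (x - y) % 2 == 0 else (w2, x, y - z, z)

random.seed(0)
for _ in range(1000):
    pre = [random.randint(0, 50) for _ in range(4)]       # satisfies the guard
    post = list(body(*pre))
    # s(x) ->_U s(x'): the square of maps commutes
    assert mat_apply(S, post) == mat_apply(D, mat_apply(S, pre))
```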

An example of a simulation between a transition formula and a linear dynamical system is given in Fig. 1. In fact, there are many linear dynamical systems that over-approximate this loop; however, the simulation and linear dynamical system given in Fig. 1 constitute its *best* abstraction.

To formalize the meaning of *best abstractions*, it is convenient to use the language of category theory [17]. Any class of transition systems defines a category, where the objects are transition systems of that class and the arrows are linear simulations between them. We use boldface letters (**L**inear, **A**ffine, **D**eterministic, **T**otal) to denote categories of transition systems (e.g., **DATS** denotes the category of **D**eterministic **A**ffine **T**ransition **S**ystems).

If T is a transition system and **C** is a category of transition systems, a **C-abstraction** of T is a pair ⟨U, s⟩ consisting of a transition system U belonging to **C** and a linear simulation s : T → U. A **C-reflection** of T is a **C**-abstraction that satisfies a universal property among the **C**-abstractions of T: for any **C**-abstraction ⟨V, t⟩ of T there exists a unique simulation t̄ : U → V such that t̄ ◦ s = t (i.e., the evident triangle of simulations commutes).

If **D** is a category of transition systems and **C** is a subcategory such that every transition system in **D** has a **C**-reflection, we say that **C** is a **reflective subcategory** of **D**.

Our ultimate goal is to bring techniques from linear dynamical systems to bear on transition formulas. Fig. 1 gives an example of a program and its linear dynamical system reflection. Unfortunately, such reflections do not exist for *all* transition formulas, which motivates our investigation of alternative models.

**Proposition 1.** *The transition formula* x′ = x ∧ x = 0 *has no* **TDATS**-*reflection.*

*Proof.* Let F be the 1-dimensional transition formula x′ = x ∧ x = 0. For a contradiction, suppose that ⟨A, s⟩ is a **TDATS**-reflection of F. Since the transition relation of F contains the origin, so must the transition relation of A, and so A is linear. Next, consider that for any λ ∈ ℚ, we have the simulation *id* : F → A_λ, where *id* is the identity function and A_λ ≜ ⟨ℚ, x ↦ λx⟩. Since ⟨A, s⟩ is a reflection of F, for any λ there is some t_λ such that t_λ : A → A_λ and *id* = t_λ ◦ s. Since t_λ is a simulation, we have λt_λ = A_λ ◦ t_λ = t_λ ◦ A. Since *id* = t_λ ◦ s, we must have t_λ non-zero, and so t_λ is a left eigenvector of A with eigenvalue λ. Since this holds for all λ, A must have infinitely many eigenvalues, a contradiction.

# **3 Linear Abstractions of Transition Formulas**

Proposition 1 shows that not every transition formula has a total deterministic affine reflection. In the following, we show that *totality* is the only barrier: every transition formula has a (computable) **DATS**-reflection. Moreover, we show that every transition formula has a *rational-spectrum* **DATS** (Q-**DATS**) reflection, where Q-**DATS** is a restricted class of **DATS** that generalizes affine maps **x** ↦ A**x** + **b** in which A has rational eigenvalues. The restriction on eigenvalues makes it easier to reason about the termination behavior of Q-**DATS**.

In the remainder of this section, we show that every transition formula has a Q-**DATS**-reflection by establishing a chain of reflective subcategories:

$$\text{TF} \xrightarrow{\text{Lemma 1}} \text{ATS} \xrightarrow{\text{Lemma 3}} \text{DATS} \xrightarrow{\text{Corollary 1}} \mathbb{Q}\text{-DATS}$$

The fact that Q-**DATS** is a reflective subcategory of **TF** then follows from the fact that a reflective subcategory of a reflective subcategory is reflective.

#### **3.1 Affine Abstractions of Transition Formulas**

Let F(X, X′) be a transition formula. The **affine hull** of F, denoted *aff*(F), is the smallest affine set *aff*(F) ⊆ ((X ∪ X′) → ℚ) ≅ (X → ℚ) × (X → ℚ) that contains all of the models of F. Reps et al. give an algorithm that can be used to compute *aff*(F) by using an SMT solver to sample a set of generators [26].

**Lemma 1.** *Let* F(X, X′) *be a transition formula. The affine hull of* F *(considered as a transition system) is the best affine abstraction of* F *(where the simulation from* F *to* aff(F) *is the identity).*

*Example 2.* Consider the example program in Fig. 1. Letting F denote the transition formula corresponding to the program, *aff*(F) can be represented as the solutions to the constraints

$$
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} w' \\ x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} 3 & 1 & 0 & 0 \\ 0 & 1 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} w \\ x \\ y \\ z \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \tag{2}
$$

Notice that *aff*(F) is 4-dimensional and has a transition relation defined by 3 constraints, and thus is *not* deterministic. The next step is to find a suitable projection onto a lower-dimensional space so that the resulting transition system is deterministic.
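As a sanity check on Example 2 (a brute-force stand-in for the SMT-guided sampling of Reps et al. [26]), one can enumerate concrete transitions of the loop body and verify that their affine hull is cut out by exactly the three constraints of Eq. (2):

```python
from fractions import Fraction
from itertools import product

def body(w, x, y, z):
    """One transition of the loop body from Fig. 1."""
    w2 = 3*w + x + 1
    return (w2, x - z, y, z) if (x - y) % 2 == 0 else (w2, x, y - z, z)

# sample transition points (w,x,y,z,w',x',y',z') in Q^8
points = [list(p) + list(body(*p)) for p in product(range(3), repeat=4)]

# every sample satisfies the three constraints of Eq. (2)
for w, x, y, z, w2, x2, y2, z2 in points:
    assert w2 == 3*w + x + 1 and x2 + y2 == x + y - z and z2 == z

def rank(vectors):
    """Rank over Q via Gaussian elimination with exact arithmetic."""
    M = [[Fraction(v) for v in vec] for vec in vectors]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# dimension of the affine hull = rank of differences from a base point
dim = rank([[a - b for a, b in zip(p, points[0])] for p in points[1:]])
assert dim == 8 - 3   # three independent affine constraints in Q^8
```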

#### **3.2 Reflections via the Dual Space**

This section presents a key technical tool that will be used in the next two subsections to prove the existence of reflections. For any transition system T, an abstraction ⟨U, s⟩ of T consisting of a transition system U and a simulation s : S_T → S_U induces a subspace of S_T⋆, namely the range of the dual map s⋆ (i.e., the set of all linear functionals on S_T of the form g ◦ s where g ∈ S_U⋆). The essential idea is that we can apply this in reverse: any subspace Λ of S_T⋆ induces a transition system U and a simulation s : T → U that satisfies a universal property among all abstractions ⟨V, v⟩ of T where the range of v⋆ is contained in Λ. We will now formalize this idea.

Let T be a transition system, and let Λ be a subspace of S_T⋆. Define α_Λ(T) to be the pair α_Λ(T) ≜ ⟨U, s⟩ consisting of a transition system U and a linear simulation s : T → U, where

– s : S_T → Λ⋆ sends each x ∈ S_T to λf : Λ. f(x)

– S_U ≜ Λ⋆, and R_U ≜ s[R_T] = {⟨s(x), s(x′)⟩ : ⟨x, x′⟩ ∈ R_T}

**Lemma 2 (Dual space simulation).** *Let* T *be a transition system, let* Λ *be a subspace of* S_T⋆*, and let* ⟨U, s⟩ = α_Λ(T)*. Suppose that* Z *is a transition system and* z : T → Z *is a simulation such that the range of* z⋆ *is contained in* Λ*. Then there exists a unique simulation* z̄ : U → Z *such that* z̄ ◦ s = z*.*

*Proof.* The high-level intuition is that since the range of z⋆ is contained in Λ, we may consider it to be a map z⋆ : S_Z⋆ → Λ; dualizing again, we get a map z⋆⋆ : Λ⋆ → S_Z⋆⋆, whose domain is S_U and whose codomain is (isomorphic to) S_Z.

More formally, let j : S_Z → S_Z⋆⋆ be the natural isomorphism between S_Z and S_Z⋆⋆, defined by j(y) ≜ λg : S_Z⋆. g(y). Define z̄ : Λ⋆ → S_Z by

$$\overline{z}(h) \triangleq j^{-1}(\lambda g : S_Z^\star.\, h(g \circ z)) \ .$$

First we show that z̄ ◦ s = z. Let x ∈ S_T. Then we have

$$\begin{aligned} (\overline{z} \circ s)(x) &= \overline{z}(s(x)) \\ &= j^{-1}(\lambda g : S_Z^\star.\, (s(x))(g \circ z)) \\ &= j^{-1}(\lambda g : S_Z^\star.\, (\lambda f : \Lambda.\, f(x))(g \circ z)) \\ &= j^{-1}(\lambda g : S_Z^\star.\, g(z(x))) \\ &= z(x) \ . \end{aligned}$$

Next we show that z̄ is a simulation. Suppose y →_U y′. Since R_U = s[R_T], there are x, x′ ∈ S_T such that x →_T x′, s(x) = y, and s(x′) = y′. Since z : T → Z is a simulation, we have z(x) →_Z z(x′), and so z̄(s(x)) →_Z z̄(s(x′)), and we may conclude that z̄(y) →_Z z̄(y′).

Finally, observe that s is surjective, and therefore the solution to the equation z̄ ◦ s = z is unique.

We conclude this section by illustrating how to compute the function α for affine transition systems. Suppose that T is an affine transition system of dimension n. We can represent states in S_T by vectors in ℚⁿ, and the transition relation R_T by a finite set of transitions B ⊆ ℚⁿ × ℚⁿ that generates R_T (i.e., R_T = *aff*(B)). Suppose that Λ is an m-dimensional subspace of S_T⋆; elements of S_T⋆ can be represented by n-dimensional row vectors, and Λ can be represented by a basis **f**₁⋆, ..., **f**_m⋆. We can compute a representation of ⟨U, s⟩ = α_Λ(T) as follows. The elements of S_U = Λ⋆ can be represented by m-dimensional vectors (with respect to the basis g₁, ..., g_m, where g_i is the linear map that sends **f**_j⋆ to 1 if i = j and to 0 otherwise). The simulation s can be represented by the m × n matrix whose ith row is **f**_i⋆. Finally, the transition relation R_U can be represented by the set of generators {⟨s(**x**), s(**x**′)⟩ : ⟨**x**, **x**′⟩ ∈ B}.

#### **3.3 Determinization**

In this section, we show that any transition system operating over a finite-dimensional vector space has a best deterministic abstraction, and give an algorithm for computing the best deterministic affine abstraction (or *determinization*) of an affine transition system.

Towards an application of Lemma 2, we seek to characterize the determinization of a transition system by a space of functionals on its state space. For any linear space V and space of functionals Λ on V, define an equivalence relation ≡_Λ on V by x ≡_Λ y iff f(x) = f(y) for all f ∈ Λ. If T is a transition system and Λ, Λ′ are spaces of functionals on S_T, we say that T is (Λ, Λ′)**-deterministic** if for all x₁, x₂, x₁′, x₂′ such that x₁ ≡_Λ x₂, x₁ →_T x₁′, and x₂ →_T x₂′, we also have x₁′ ≡_{Λ′} x₂′. Observe that if D is a deterministic transition system and d : T → D is a simulation, then T must be (Λ_d, Λ_d)-deterministic, where Λ_d is the range of the dual map d⋆.

For any T and Λ, define Det(T, Λ) ≜ {f : T is (Λ, {f})-deterministic} to be the greatest set of functionals such that T is (Λ, Det(T, Λ))-deterministic. Observe that Det(T, −) is a monotone operator on the complete lattice of linear subspaces of S_T⋆ (i.e., if Λ₁ ⊆ Λ₂ then Det(T, Λ₁) ⊆ Det(T, Λ₂), since Λ₁ induces a coarser equivalence relation than Λ₂). By the Knaster-Tarski fixpoint theorem [28], Det(T, −) has a greatest fixpoint, which we denote by Det(T). Then we have that T is (Det(T), Det(T))-deterministic, and Det(T) contains every space Λ such that T is (Λ, Λ)-deterministic.

**Lemma 3 (Determinization).** *For any transition system* T*,* α_{Det(T)}(T) *is a deterministic reflection of* T*.*

*Proof.* Let ⟨D, d⟩ ≜ α_{Det(T)}(T). First, we show that D is deterministic. Suppose that y →_D y₁′ and y →_D y₂′; we must show that y₁′ = y₂′. Since R_D is defined to be d[R_T], there must be x₁, x₂, x₁′, and x₂′ in S_T such that x₁ →_T x₁′, x₂ →_T x₂′, d(x₁) = d(x₂) = y, d(x₁′) = y₁′, and d(x₂′) = y₂′. Since d(x₁) = d(x₂), we have (λf : Det(T). f(x₁)) = (λf : Det(T). f(x₂)), and therefore x₁ ≡_{Det(T)} x₂. We thus have x₁′ ≡_{Det(T, Det(T))} x₂′, and since Det(T, Det(T)) = Det(T), we have y₁′ = d(x₁′) = d(x₂′) = y₂′.

It remains to show that ⟨D, d⟩ is a deterministic *reflection* of T. Suppose that ⟨U, u⟩ is another deterministic abstraction of T. Define G to be the range of u⋆. Since U is deterministic, we must have G ⊆ Det(T, G), and since Det(T) is the greatest fixpoint of Det(T, −), we have G ⊆ Det(T). By Lemma 2, there is a unique linear simulation ū : D → U such that ū ◦ d = u.

If a transition system T is affine, then its determinization can be computed in polynomial time. Fixing a basis for the state space S_T (of some dimension n), we can represent the transition relation of T in the form R_T = {⟨**x**, **x**′⟩ : A**x**′ = B**x** + **c**}, where A, B ∈ ℚ^(m×n) and **c** ∈ ℚ^m (for some m). We can represent functionals on S_T by n-dimensional vectors, where the vector **v** ∈ ℚⁿ corresponds to the functional that maps **u** ↦ **v**ᵀ**u**. A linear space of functionals Λ can be represented by a system of linear equations, Λ = {**x** : M**x** = 0}. The ith row **a**ᵢᵀ**x**′ = **b**ᵢᵀ**x** + cᵢ of the system of equations A**x**′ = B**x** + **c** can be read as "T is ({**b**ᵢ}, {**a**ᵢ})-deterministic." Thus, the functionals **f** such that T is (Λ, {**f**})-deterministic are those that can be written as a linear combination of the rows of A such that the corresponding linear combination of the rows of B belongs to Λ; i.e.,

$$\mathsf{Det}(\{\langle \mathbf{x}, \mathbf{x}' \rangle : A\mathbf{x}' = B\mathbf{x} + \mathbf{c}\}, \{\mathbf{f} : M\mathbf{f} = 0\}) = \{\mathbf{d} : \exists \mathbf{y}.\, MB^{\mathsf{T}}\mathbf{y} = 0 \land A^{\mathsf{T}}\mathbf{y} = \mathbf{d}\} \ .$$

A representation of Det(T, Λ) can be computed in polynomial time using Gaussian elimination. Since the lattice of linear subspaces of S_T⋆ has height n, the greatest fixpoint of Det(T, −) can be computed in polynomial time.

*Example 3.* Continuing the example from Fig. 1 and Example 2, we consider the determinization of the affine transition system in Eq. (2). The rows of the matrix on the left-hand side correspond to generators for Det(*aff*(F), (ℚ⁴)⋆):

$$\begin{aligned} \mathsf{Det}(\mathit{aff}(F), (\mathbb{Q}^4)^\star) &= \operatorname{span}(\{ \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 & 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix} \}) \\ \mathsf{Det}(\mathit{aff}(F), \mathsf{Det}(\mathit{aff}(F), (\mathbb{Q}^4)^\star)) &= \operatorname{span}(\{ \begin{bmatrix} 0 & 1 & 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix} \}) \end{aligned}$$

which is the greatest fixpoint Det(*aff*(F)). Intuitively: after one step of *aff*(F), the values of w, x + y, and z are affine functions of the input; after two steps x+y and z are affine functions of the input but w is not, since the value of w on the second step depends upon the value of x in the first, and x is not an affine function of the input.

This yields the deterministic reflection ⟨D, d⟩ (pictured in Fig. 1), where

$$R_D = \left\{ \left\langle \begin{bmatrix} a \\ b \end{bmatrix}, \begin{bmatrix} a' \\ b' \end{bmatrix} \right\rangle : \begin{bmatrix} a' \\ b' \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} \right\} \qquad \text{and} \qquad d = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad \lrcorner$$
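The fixpoint computation of Example 3 can be sketched with exact rational arithmetic (not the paper's implementation): iterate Det(T, −) from the full space of functionals, where each step keeps those combinations of the rows of A whose corresponding combination of the rows of B stays in the current space. The matrices below are the left- and right-hand sides of Eq. (2); the translation **c** plays no role:

```python
from fractions import Fraction

def rref(rows):
    """Canonical basis (reduced row echelon form) of a row space."""
    M = [[Fraction(v) for v in r] for r in rows]
    r = 0
    for c in range(len(M[0]) if M else 0):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [v / M[r][c] for v in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return M[:r]

def nullspace(rows, ncols):
    """Basis of {v : Mv = 0}, where M is given by its rows."""
    R = rref(rows)
    pivots = [next(j for j, v in enumerate(row) if v != 0) for row in R]
    basis = []
    for f in (j for j in range(ncols) if j not in pivots):
        v = [Fraction(0)] * ncols
        v[f] = Fraction(1)
        for row, p in zip(R, pivots):
            v[p] = -row[f]
        basis.append(v)
    return basis

def det_step(A, B, L):
    """Det(T, span L): functionals A^T y such that B^T y lies in span L."""
    m, n = len(A), len(A[0])
    k = len(L)
    # B^T y = L^T z  <=>  [B^T | -L^T] (y, z) = 0
    M = [[B[j][i] for j in range(m)] + [-L[j][i] for j in range(k)]
         for i in range(n)]
    ds = [[sum(A[j][i] * v[j] for j in range(m)) for i in range(n)]
          for v in nullspace(M, m + k)]
    return rref(ds)

def det_space(A, B):
    """Greatest fixpoint of Det(T, -), iterated from the full dual space."""
    n = len(A[0])
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    while True:
        Lnew = det_step(A, B, L)
        if Lnew == rref(L):
            return Lnew
        L = Lnew

A = [[1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]]    # left-hand side of Eq. (2)
B = [[3, 1, 0, 0], [0, 1, 1, -1], [0, 0, 0, 1]]   # right-hand side of Eq. (2)
D = det_space(A, B)
```

On this input the fixpoint is span{[0 1 1 0], [0 0 0 1]}, matching Example 3; the simulation d is the matrix with these rows.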

#### **3.4 Rational-Spectrum Reflections of DATS**

In this section, we define rational-spectrum **DATS** and show that every **DATS** has a rational-spectrum reflection.

In the following, it is convenient to work with transition systems that are linear rather than affine. We will prove that every deterministic *linear* transition system has a best abstraction with rational spectrum. The result extends to the affine case through the use of *homogenization*: i.e., we embed a (non-empty) affine transition system into a linear transition system with one additional dimension, such that if we fix that dimension to be 1 then we recover the affine transition system. If the transition relation of a **DATS** is represented in the form A**x** = B**x** + **c**, then its homogenization is simply

$$
\begin{bmatrix} A & \mathbf{0} \\ \mathbf{0} & 1 \end{bmatrix} \begin{bmatrix} \mathbf{x}' \\ y' \end{bmatrix} = \begin{bmatrix} B & \mathbf{c} \\ \mathbf{0} & 1 \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ y \end{bmatrix} \ .
$$

For a **DATS** T, we use homog(T) to denote the pair ⟨L, h⟩ consisting of the **DLTS** L resulting from homogenization and the affine simulation h : T → L that maps each **x** ∈ S_T to (**x**, 1) (i.e., the affine simulation h formalizes the idea that if we fix the extra dimension y to be 1, we recover the original **DATS** T).
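Homogenization is a one-line matrix construction; the sketch below (using the system of Eq. (2) as an example, with the relation represented by plain row lists) checks that appending the coordinate y = 1 carries transitions of the affine system to transitions of its homogenization:

```python
def homogenize(A, B, c):
    """Embed the DATS {<x,x'> : Ax' = Bx + c} into a DLTS over one extra
    coordinate y; fixing y = 1 recovers the original system."""
    n = len(A[0])
    Ah = [row + [0] for row in A] + [[0] * n + [1]]
    Bh = [row + [ci] for row, ci in zip(B, c)] + [[0] * n + [1]]
    return Ah, Bh

def mat_apply(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Eq. (2): the affine hull of the loop of Fig. 1
A = [[1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]]
B = [[3, 1, 0, 0], [0, 1, 1, -1], [0, 0, 0, 1]]
c = [1, 0, 0]
Ah, Bh = homogenize(A, B, c)

# a concrete transition of the affine system ...
pre, post = [0, 2, 2, 1], [3, 1, 2, 1]
assert mat_apply(A, post) == [u + v for u, v in zip(mat_apply(B, pre), c)]
# ... corresponds to a transition of the homogenization via h(x) = (x, 1)
assert mat_apply(Ah, post + [1]) == mat_apply(Bh, pre + [1])
```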

Let T be a deterministic linear transition system. Since our goal is to analyze the asymptotic behavior of T, and all long-running behaviors of T reside entirely within dom^ω(T), we are interested in the structure of dom^ω(T) and in T's behavior on this set. First, we observe that dom^ω(T) is a linear subspace of S_T and is computable. For any k, let T^k denote the linear transition system whose transition relation is the k-fold composition of the transition relation of T. Consider the descending sequence of linear spaces

$$\text{dom}(T) \supseteq \text{dom}(T^2) \supseteq \text{dom}(T^3) \supseteq \dotsb$$

(i.e., the set of states from which there are T-computations of length 1, length 2, length 3, . . . ). Since the space S_T is finite-dimensional, this sequence must stabilize at some k. Since the states in dom(T^k) have T-computations of any length and T is deterministic, we have that dom(T^k) is precisely dom^ω(T).
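The stabilization argument translates directly into an algorithm: represent dom(T^k) by constraint rows C_k with dom(T^k) = {x : C_k x = 0}, and derive C_{k+1} from the left null space of the stacked system. A sketch in exact arithmetic (the input matrices are those of Example 4 below):

```python
from fractions import Fraction

def rref(rows):
    """Canonical basis (reduced row echelon form) of a row space."""
    M = [[Fraction(v) for v in r] for r in rows]
    r = 0
    for c in range(len(M[0]) if M else 0):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [v / M[r][c] for v in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return M[:r]

def nullspace(rows, ncols):
    """Basis of {v : Mv = 0}, where M is given by its rows."""
    R = rref(rows)
    pivots = [next(j for j, v in enumerate(row) if v != 0) for row in R]
    basis = []
    for f in (j for j in range(ncols) if j not in pivots):
        v = [Fraction(0)] * ncols
        v[f] = Fraction(1)
        for row, p in zip(R, pivots):
            v[p] = -row[f]
        basis.append(v)
    return basis

def omega_domain(A, B):
    """Constraint rows C with dom^omega({<x,x'> : Ax' = Bx}) = {x : Cx = 0}."""
    n = len(A[0])
    C = []                                   # dom(T^0) is the whole space
    while True:
        S = A + C                            # x' must satisfy Ax' = Bx, Cx' = 0
        rhs = B + [[0] * n for _ in C]
        # solvable for x' iff u^T rhs x = 0 for every u in leftnull(S)
        left = nullspace([[S[i][j] for i in range(len(S))] for j in range(n)],
                         len(S))
        Cnew = rref([[sum(u[i] * rhs[i][j] for i in range(len(u)))
                      for j in range(n)] for u in left])
        if Cnew == rref(C):
            return Cnew
        C = Cnew

A = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]   # Example 4, left-hand side
B = [[2, 0, 1], [0, 2, 2], [0, 0, 3], [1, -1, 0]]  # Example 4, right-hand side
C = omega_domain(A, B)
```

On this input the iteration produces the constraints x − y = 0 and then additionally z = 0, after which it stabilizes, matching Example 4.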

Since T is total on dom^ω(T) and the successor of a state in dom^ω(T) must also belong to dom^ω(T), T defines a linear map T|_ω : dom^ω(T) → dom^ω(T). In this way, we can essentially reduce asymptotic analysis of **DATS** to asymptotic analysis of linear dynamical systems. The asymptotic analysis of linear dynamical systems developed in Sects. 4 and 5 requires rational eigenvalues; thus we are interested in **DATS** T such that T|_ω has rational eigenvalues. With this in mind, we define *spec*(T) ≜ *spec*(T|_ω), and say that T **has rational spectrum** if *spec*(T) ⊆ ℚ. Define Q-**DLTS** to be the subcategory of **DLTS** with rational spectrum, and Q-**DATS** to be the subcategory of **DATS** whose homogenizations lie in Q-**DLTS**.

*Example 4.* Consider the **DLTS** T with

$$R_T \triangleq \left\{ \left\langle \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} \right\rangle : \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} 2 & 0 & 1 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \\ 1 & -1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right\}$$

The bottom-most equation corresponds to a constraint that only vectors where the x and y coordinates are equal have successors, so we have:

$$\mathrm{dom}(T) = \left\{ \begin{bmatrix} x & y & z \end{bmatrix}^{\mathsf{T}} : x = y \right\}.$$

Supposing that the x and y coordinates are equal in some pre-state, they are equal in the post-state exactly when z = 0, so we have

$$\mathrm{dom}(T^2) = \left\{ \begin{bmatrix} x & y & z \end{bmatrix}^{\mathsf{T}} : x = y \wedge z = 0 \right\}$$

It is easy to check that dom(T³) = dom(T²), and therefore dom^ω(T) = dom(T²). The vector [1 1 0]ᵀ is a basis for dom^ω(T), and the matrix representation of T|_ω with respect to this basis is [2] (i.e., [1 1 0]ᵀ →_T [2 2 0]ᵀ). Thus we can see that *spec*(T) = {2}, and T is a Q-**DLTS**. ⌟

Towards an application of Lemma 2, define the **generalized rational eigenspace** of a DLTS T to be

$$E_{\mathbb{Q}}(T) \triangleq \operatorname{span}\left(\left\{ f \in S_T^{\star} : \exists \lambda \in \mathbb{Q}.\, \exists r \in \mathbb{N}^{+}.\, f \circ (T|_{\omega} - \underline{\lambda})^r = 0 \right\}\right).$$

**Lemma 4.** *Let* T *be a DLTS, and define* ⟨Q, q⟩ ≜ α_{E_ℚ(T)}(T)*. Then for any* Q-**DLTS** U *and any simulation* s : T → U*, there is a unique simulation* s̄ : Q → U *such that* s̄ ◦ q = s*.*

While α_{E_ℚ(T)}(T) satisfies a universal property for Q-**DLTS**, it does not necessarily belong to Q-**DLTS** itself, because it need not be deterministic. However, by iteratively interleaving Lemma 4 and determinization as shown in Algorithm 1, we arrive at a Q-**DLTS**-reflection. Example 5 demonstrates how we calculate a Q-**DLTS**-reflection of a particular **DLTS**.

*Example 5.* Consider the **DLTS** T with transition relation

$$R_T \triangleq \left\{ \left\langle \begin{bmatrix} w \\ x \\ y \\ z \end{bmatrix}, \begin{bmatrix} w' \\ x' \\ y' \\ z' \end{bmatrix} \right\rangle : \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} w' \\ x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 1 & -1 & 0 & 0 \end{bmatrix} \begin{bmatrix} w \\ x \\ y \\ z \end{bmatrix} \right\}$$

We can calculate the ω-domain of T, dom<sup>ω</sup>(T) = {[w x y z]<sup>⊤</sup> : w = x}, which has basis B = {[1 1 0 0]<sup>⊤</sup>, [0 0 1 0]<sup>⊤</sup>, [0 0 0 1]<sup>⊤</sup>}. With respect to B, T|<sub>ω</sub> corresponds to the matrix

$$T|\_{\omega} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix}$$

and so we have *spec*(T) = {2, i, −i}. We may calculate EQ(T) by finding (generalized) left eigenvectors with eigenvalue 2, the only rational number in *spec*(T):

$$E\_{\mathbb{Q}}(T) = \left\{ \mathbf{v}^{\mathsf{T}} : \mathbf{v}^{\mathsf{T}} \underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}}\_{B} \left( \underbrace{\begin{bmatrix} 2 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix}}\_{T|\_\omega} - \underbrace{\begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}}\_{2I} \right) = 0 \right\} = \operatorname{span}\left( \begin{bmatrix} 1 & 1 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 1 & -1 & 0 & 0 \end{bmatrix} \right)$$

Finally, we have ⟨Q, q⟩ = α<sub>E<sub>Q</sub>(T)</sub>(T), where

$$R\_Q = \left\{ \left\langle \begin{bmatrix} a \\ b \end{bmatrix}, \begin{bmatrix} a' \\ b' \end{bmatrix} \right\rangle : \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} a' \\ b' \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} \right\} \qquad q = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & -1 & 0 & 0 \end{bmatrix}$$

Q is deterministic and has rational spectrum, so ⟨Q, q⟩ is a Q-**DLTS**-reflection of T.
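As a sanity check, the simulation condition of Example 5 can be verified mechanically: q must commute with one step of T (on its ω-domain) and one step of Q. The following Python sketch is ours, not part of the paper's artifact; it hard-codes the maps read off from R<sub>T</sub> and R<sub>Q</sub> above.

```python
# One step of T restricted to its ω-domain w = x (read off from R_T).
def step_T(w, x, y, z):
    assert w == x                  # state must lie in dom^ω(T)
    return (w + x, w + x, z, -y)

# The simulation q: rows [1 1 0 0] and [1 -1 0 0].
def q(state):
    w, x, y, z = state
    return (w + x, w - x)

# One step of Q (read off from R_Q): a' = 2a, b' = 0, defined when b = 0.
def step_Q(a, b):
    assert b == 0
    return (2 * a, 0)

# Simulation condition: q ∘ T = Q ∘ q on the ω-domain.
for v in [(1, 1, 2, 3), (-2, -2, 0, 5), (0, 0, 7, -4)]:
    assert q(step_T(*v)) == step_Q(*q(v))
```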

**Theorem 1.** *For any deterministic linear transition system, Algorithm 1 computes a* Q*-***DLTS***-reflection.*

Finally, by homogenization and Theorem 1, we conclude with the desired result:

**Corollary 1.** Q*-***DATS** *is a reflective subcategory of* **DATS***.*

### **4 Asymptotic Analysis of Linear Dynamical Systems**

This section is concerned with analyzing the behavior of loops of the form

$$\text{while } (G(\mathbf{x})) \text{ do } \mathbf{x} := A\mathbf{x}$$

```
Input  : A DLTS T
Output : A Q-DLTS-reflection of T
1  U ← T;
2  s ← λx.x ;                    /* Invariant: s is a simulation from T to U */
3  while spec(U|ω) ⊄ Q do
4      ⟨Q, q⟩ ← α_{E_Q(U)}(U) ;  /* Lemma 4 */
5      ⟨U, d⟩ ← α_{Det(Q)}(Q) ;  /* Lemma 3 */
6      s ← d ◦ q ◦ s;
7  return ⟨U, s⟩
```
**Algorithm 1:** Computation of a Q-**DLTS**-reflection of a **DLTS**

where G(**x**) is an LIA formula and A is a matrix with integer spectrum. Our goal is to capture the asymptotic behavior of iterating the map A on an initial state **x**<sub>0</sub> with respect to the formula G. Specifically, we show that

**Theorem 2.** *For any LIA formula* G *and any matrix* A *with integer spectrum, there is a periodic sequence of LIA formulas* H<sub>0</sub>, H<sub>1</sub>, H<sub>2</sub>, ... *such that for any initial state* **x**<sub>0</sub> ∈ ℚ<sup>n</sup>*, there exists* K *such that for all* k > K*,* G(A<sup>k</sup>**x**<sub>0</sub>) *holds if and only if* H<sub>k</sub>(**x**<sub>0</sub>) *does.*

Recall that an infinite sequence H<sub>0</sub>, H<sub>1</sub>, H<sub>2</sub>, ... is *periodic* if it is of the form

$$(H\_0, H\_1, \dots, H\_P)^\omega \triangleq (H\_0, H\_1, \dots, H\_P, H\_0, H\_1, \dots, H\_P, \dots)$$

We call the periodic sequence (H<sub>0</sub>, H<sub>1</sub>, ..., H<sub>P</sub>)<sup>ω</sup> the *characteristic sequence* of the guard formula G with respect to the dynamics matrix A, and denote it by χ(G, A). Note that G(A<sup>k</sup>**x**<sub>0</sub>) holds for all but finitely many k exactly when $\bigwedge\_{i=0}^{P} H\_i(\mathbf{x}\_0)$ holds.

In the remainder of this section, we show how to compute characteristic sequences. Let G be an LIA formula and let A be a matrix with integer spectrum. To begin, we compute a quantifier-free formula G′ that is equivalent to G (using, for example, Cooper's algorithm [7]). We define χ(G′, A) by recursion on the structure of G′. For the logical connectives ∧, ∨, and ¬, characteristic sequences are defined pointwise:

$$\begin{aligned} \chi(\neg H, A) & \triangleq (\neg(\chi(H, A)\_0), \neg(\chi(H, A)\_1), \dots) \\ \chi(H\_1 \land H\_2, A) & \triangleq (\chi(H\_1, A)\_0 \land \chi(H\_2, A)\_0, \chi(H\_1, A)\_1 \land \chi(H\_2, A)\_1, \dots) \\ \chi(H\_1 \lor H\_2, A) & \triangleq (\chi(H\_1, A)\_0 \lor \chi(H\_2, A)\_0, \chi(H\_1, A)\_1 \lor \chi(H\_2, A)\_1, \dots) \end{aligned}$$
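If a periodic sequence is represented by its finite period, these pointwise rules are straightforward to implement; two periods are aligned at the least common multiple of their lengths. The following Python sketch is ours, with formulas modeled as Boolean predicates on the initial state.

```python
from math import lcm

# A periodic sequence (H_0, ..., H_P)^ω is represented by the finite list
# [H_0, ..., H_P] of predicates over the initial state.
def pointwise_not(h):
    return [lambda x0, f=f: not f(x0) for f in h]

def pointwise_and(h1, h2):
    # Align the two periods at the lcm of their lengths, then conjoin.
    n = lcm(len(h1), len(h2))
    return [lambda x0, f=h1[k % len(h1)], g=h2[k % len(h2)]: f(x0) and g(x0)
            for k in range(n)]

# Example periods: a 2-periodic sequence conjoined with a 1-periodic one.
h1 = [lambda x: x % 2 == 0, lambda x: x % 2 == 1]
h2 = [lambda x: x >= 0]
h = pointwise_and(h1, h2)
assert len(h) == 2
assert h[0](4) and not h[0](3) and h[1](3)
```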

It remains to show how χ acts on atomic formulas, which take the form of inequalities t<sub>1</sub> ≤ t<sub>2</sub> and divisibility constraints n | t. An important fact that we employ in both cases is that for any linear term **c**<sup>⊤</sup>**x** over the variables **x**, we can compute a closed form for **c**<sup>⊤</sup>A<sup>k</sup>(**x**) by symbolically exponentiating A. Since (by assumption) A has integer eigenvalues, this closed form has the shape $\frac{1}{Q} p(\mathbf{x}, k)$, where Q ∈ ℕ and p is an **integer exponential-polynomial term**, which takes the form

$$
\lambda\_1^k k^{d\_1} \mathbf{a\_1^\mathsf{T}x} + \dots + \lambda\_m^k k^{d\_m} \mathbf{a\_m^\mathsf{T}x} \tag{3}
$$

where λ<sub>i</sub> ∈ *spec*(A), d<sub>i</sub> ∈ ℕ, and **a**<sub>i</sub> ∈ ℤ<sup>n</sup>.<sup>1</sup>

**Characteristic Sequences for Inequalities.** Our method for computing characteristic sequences for inequalities is a variation of Tiwari's method for deciding termination of linear loops with real eigenvalues [29].

First, suppose that p(**x**, k) is an integer exponential-polynomial of the form in Eq. (3) such that each λ<sub>i</sub> is a *positive* integer. Further suppose that the summands are ordered by asymptotic growth, with the dominant term appearing earliest in the list; i.e., for i < j we have either λ<sub>i</sub> > λ<sub>j</sub>, or λ<sub>i</sub> = λ<sub>j</sub> and d<sub>i</sub> > d<sub>j</sub>. If we imagine that the variables **x** are fixed to some **x**<sub>0</sub> ∈ ℤ<sup>n</sup>, then we see that p(**x**<sub>0</sub>, k) is either identically zero or has finitely many zeros, and therefore its sign is eventually stable. Furthermore, the sign of p(**x**<sub>0</sub>, k) as k tends to ∞ is simply the sign of its *dominant term*; that is, the sign of $\mathbf{a}\_i^\mathsf{T}\mathbf{x}\_0$ for the least i such that $\mathbf{a}\_i^\mathsf{T}\mathbf{x}\_0$ is non-zero. Thus, we may define a function DTA that maps any exponential-polynomial term p(**x**, k) (with positive integral λ<sub>i</sub>) to an LIA formula such that for any **x**<sub>0</sub> ∈ ℤ<sup>n</sup>, **x**<sub>0</sub> |= DTA(p) holds if and only if p(**x**<sub>0</sub>, k) is eventually non-negative (p(**x**<sub>0</sub>, k) ≥ 0 for all but finitely many k ∈ ℕ). DTA is defined as follows:

$$\mathsf{DTA}(0) \triangleq true$$

$$\mathsf{DTA}(\lambda^k k^d \mathbf{a}^\mathsf{T}\mathbf{x} + p) \triangleq \mathbf{a}^\mathsf{T}\mathbf{x} \ge 1 \vee (\mathbf{a}^\mathsf{T}\mathbf{x} = 0 \wedge \mathsf{DTA}(p))$$
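Operationally, DTA scans the dominance-ordered terms and reports the sign of the first coefficient that does not vanish on the given state. The sketch below (ours) evaluates the semantics of DTA(p) on concrete integer states rather than constructing the LIA formula itself.

```python
def dta(terms):
    """Semantic dominant-term analysis.

    `terms` lists (lam, d, a) with positive integer lam, sorted with the
    asymptotically dominant term first (larger lam first, then larger d).
    The returned predicate holds at x0 iff
    sum_i lam_i^k k^{d_i} (a_i . x0) is >= 0 for all but finitely many k.
    """
    def pred(x0):
        for lam, d, a in terms:
            v = sum(ai * xi for ai, xi in zip(a, x0))
            if v > 0:            # the a^T x >= 1 branch (x0 is integral)
                return True
            if v < 0:
                return False
            # v == 0: fall through to the next (less dominant) term
        return True              # DTA(0) = true
    return pred

# Example 6's closed form of x: (1/2)(z k^2 + (2y - z) k + 2x),
# with dominance order z, then 2y - z, then 2x, over states (x, y, z).
p = dta([(1, 2, (0, 0, 1)), (1, 1, (0, 2, -1)), (1, 0, (2, 0, 0))])
assert p((5, 0, 0))          # z = 0, 2y - z = 0, 2x = 10 >= 0
assert not p((5, -1, 0))     # z = 0, 2y - z = -2 < 0 dominates
assert p((-100, 0, 1))       # z = 1 > 0 dominates eventually
```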

Finally, we define the characteristic sequence of an inequality atom as follows. An inequality t<sub>1</sub> ≤ t<sub>2</sub> over the variables **x** can be written as **c**<sup>⊤</sup>**x** + d ≥ 0 for **c** ∈ ℤ<sup>n</sup> and d ∈ ℤ. Let $\frac{1}{Q\_{even}} p\_{even}(\mathbf{x}, k)$ and $\frac{1}{Q\_{odd}} p\_{odd}(\mathbf{x}, k)$ be the closed forms of **c**<sup>⊤</sup>A<sup>2k</sup>(**x**) and **c**<sup>⊤</sup>A<sup>2k+1</sup>(**x**), respectively; by splitting into "even" and "odd" cases, we ensure that the exponential-polynomial terms p<sub>even</sub>(**x**, k) and p<sub>odd</sub>(**x**, k) have only *positive* λ<sub>i</sub> and thus are amenable to the dominant term analysis DTA described above. Then we define:

$$\chi\left(\mathbf{c}^{\mathsf{T}}\mathbf{x} + d \geq 0, A\right) \triangleq \left(\mathsf{DTA}(p\_{even}(\mathbf{x}, k) + dQ\_{even}), \mathsf{DTA}(p\_{odd}(\mathbf{x}, k) + dQ\_{odd})\right)^{\omega}$$

*Example 6.* Consider the matrix A and its exponential A<sup>k</sup> below:

$$A\left(\begin{bmatrix} x \\ y \\ z \\ a \\ b \end{bmatrix}\right) = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & -3 & 0 \\ 0 & 0 & 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ a \\ b \end{bmatrix}$$

$$A^k\left(\begin{bmatrix} x \\ y \\ z \\ a \\ b \end{bmatrix}\right) = \begin{bmatrix} 1 & k & \frac{k(k-1)}{2} & 0 & 0 \\ 0 & 1 & k & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & (-3)^k & 0 \\ 0 & 0 & 0 & 0 & 2^k \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ a \\ b \end{bmatrix} = \begin{bmatrix} \frac{1}{2}(zk^2 + (2y - z)k + 2x) \\ zk + y \\ z \\ (-3)^k a \\ 2^k b \end{bmatrix}$$

<sup>1</sup> Technically, we have $\frac{1}{Q}(\lambda\_1^k k^{d\_1} \mathbf{a}\_1^\mathsf{T}\mathbf{x} + \cdots + \lambda\_m^k k^{d\_m} \mathbf{a}\_m^\mathsf{T}\mathbf{x}) = \mathbf{c}^\mathsf{T}A^k\mathbf{x}$ only for all k greater than the rank of the highest-rank generalized eigenvector of 0, but since we are only interested in the asymptotic behavior of A, we can disregard the first steps of the computation.
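The closed form of A<sup>k</sup> can be cross-checked against plain repeated matrix multiplication; the following Python sketch (ours) does so for the matrix of Example 6.

```python
def mat_mul(A, B):
    # Naive integer matrix product.
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_pow(A, k):
    # A^k by repeated multiplication, starting from the identity.
    n = len(A)
    R = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        R = mat_mul(R, A)
    return R

# The matrix A of Example 6 and the claimed closed form of A^k.
A = [[1, 1, 0,  0, 0],
     [0, 1, 1,  0, 0],
     [0, 0, 1,  0, 0],
     [0, 0, 0, -3, 0],
     [0, 0, 0,  0, 2]]

def closed_form(k):
    return [[1, k, k * (k - 1) // 2, 0,         0],
            [0, 1, k,               0,         0],
            [0, 0, 1,               0,         0],
            [0, 0, 0,               (-3) ** k, 0],
            [0, 0, 0,               0,         2 ** k]]

assert all(mat_pow(A, k) == closed_form(k) for k in range(10))
```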

First we compute the characteristic sequence χ(x ≥ 0, A). Applying dominant term analysis to the closed form of x yields

$$\mathsf{DTA}\left(zk^2 + \left(2y - z\right)k + x\right) = \begin{pmatrix} z > 0\\ \vee\left(z = 0 \wedge 2y - z > 0\right) \\ \vee\left(z = 0 \wedge 2y - z = 0 \wedge x \ge 0\right) \end{pmatrix},$$

Since the closed form involves only positive exponential terms, we need not split into an even and odd case, and we simply have:

$$\chi(x \ge 0, A) = (z > 0 \lor (z = 0 \land 2y - z > 0) \lor (z = 0 \land 2y - z = 0 \land x \ge 0))^\omega$$

Next we compute the characteristic sequence χ(a − b ≥ 0, A), which does require a case split. Applying dominant term analysis to the closed form of (a − b) yields

$$\mathsf{DTA}(a \cdot (-3)^{2k} - b \cdot 2^{2k}) = a > 0 \vee (a = 0 \wedge -b \ge 0)$$

$$\mathsf{DTA}(a \cdot (-3)^{2k+1} - b \cdot 2^{2k+1}) = -a > 0 \vee (-a = 0 \wedge -b \ge 0)$$

and thus we have

$$\chi(a-b\ge 0, A) = (a>0 \lor (a = 0 \land -b \ge 0), -a > 0 \lor (-a = 0 \land -b \ge 0))^\omega.$$
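The parity split can be checked empirically: for all sufficiently large k of a fixed parity, the sign of a·(−3)<sup>k</sup> − b·2<sup>k</sup> is decided by the formula of matching parity. A small Python check (ours):

```python
# The two formulas in the period of χ(a - b >= 0, A).
def h_even(a, b):
    return a > 0 or (a == 0 and -b >= 0)

def h_odd(a, b):
    return -a > 0 or (-a == 0 and -b >= 0)

# For all sufficiently large k, the closed form a*(-3)^k - b*2^k is
# non-negative exactly when the formula of matching parity holds.
for a0, b0 in [(1, 5), (-2, 3), (0, -4), (0, 7), (3, -1)]:
    for k in range(20, 28):
        actual = a0 * (-3) ** k - b0 * 2 ** k >= 0
        expected = h_even(a0, b0) if k % 2 == 0 else h_odd(a0, b0)
        assert actual == expected
```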

**Characteristic Sequences for Divisibility Atoms.** Finally, we show how to define χ for divisibility atoms n | t. Write the term t as **c**<sup>⊤</sup>**x** + d and let the closed form of **c**<sup>⊤</sup>A<sup>k</sup>(**x**) be

$$\frac{1}{Q}(\lambda\_1^k k^{d\_1} \mathbf{a}\_1^\mathsf{T} \mathbf{x} + \dots + \lambda\_m^k k^{d\_m} \mathbf{a}\_m^\mathsf{T} \mathbf{x})$$

The formula $n \mid \mathbf{c}^\mathsf{T}A^k(\mathbf{x}) + d$ is equivalent to $Qn \mid \lambda\_1^k k^{d\_1}\mathbf{a}\_1^\mathsf{T}\mathbf{x} + \cdots + \lambda\_m^k k^{d\_m}\mathbf{a}\_m^\mathsf{T}\mathbf{x} + Qd$. For any i, the sequence $\langle \lambda\_i^k k^{d\_i} \bmod Qn \rangle\_{k=0}^\infty$ is ultimately periodic, since (1) $\langle k \bmod Qn \rangle\_{k=0}^\infty = (0, 1, \dots, Qn - 1)^\omega$, (2) $\langle \lambda\_i^k \bmod Qn \rangle\_{k=0}^\infty$ is ultimately periodic (with period and transient length bounded above by Qn),<sup>2</sup> and (3) ultimately periodic sequences are closed under pointwise product. It follows that for each i, there is a periodic sequence of integers $\langle z\_{i,k} \rangle\_{k=0}^\infty$ that agrees with $\langle \lambda\_i^k k^{d\_i} \bmod Qn \rangle\_{k=0}^\infty$ on all but finitely many terms. Finally, we take

$$\chi(n \mid t, A) \triangleq \langle Qn \mid z\_{1,k} \mathbf{a}\_1^\mathsf{T} \mathbf{x} + \dots + z\_{m,k} \mathbf{a}\_m^\mathsf{T} \mathbf{x} + Qd \rangle\_{k=0}^\infty$$
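The transient and period of such a sequence can be found by brute force, as in the following Python sketch (ours; the search bound is a heuristic rather than the Qn-based bound quoted above).

```python
def ultimately_periodic(f, bound=200):
    # Detect (transient, period) of an ultimately periodic sequence
    # f(0), f(1), ... by brute force within the first `bound` values.
    vals = [f(k) for k in range(bound)]
    for period in range(1, bound // 2):
        for transient in range(bound // 2):
            if all(vals[k] == vals[k + period]
                   for k in range(transient, bound - period)):
                return transient, period
    raise ValueError("no period detected within bound")

# lambda^k * k^d mod Qn, e.g. 5^k * k mod 9: since ord_9(5) = 6 and k mod 9
# has period 9, the product has period lcm(6, 9) = 18 with no transient.
t, p = ultimately_periodic(lambda k: (5 ** k * k) % 9)
assert (t, p) == (0, 18)
```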

*Example 7.* Consider the matrix A and the closed form of its powers below:

$$A\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \qquad A^k\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} 1 & k & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5^k \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}$$

<sup>2</sup> An infinite sequence s<sub>0</sub>, s<sub>1</sub>, s<sub>2</sub>, ... is *ultimately periodic* if there exists N such that s<sub>N</sub>, s<sub>N+1</sub>, s<sub>N+2</sub>, ... is a periodic sequence. We call N the *transient length* of the sequence.

We show the characteristic sequences for some divisibility atoms with respect to A:

$$\begin{aligned} \chi(3 \mid x, A) &= (3 \mid x, 3 \mid x + y, 3 \mid x + 2y)^{\omega} \\ \chi(3 \mid x + 2, A) &= (3 \mid x + 2, 3 \mid x + y + 2, 3 \mid x + 2y + 2)^{\omega} \\ \chi(3 \mid z, A) &= (3 \mid z, 3 \mid 2z)^{\omega} \end{aligned}$$
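These characteristic sequences can be validated directly against the dynamics: the x-component of A<sup>k</sup>(x, y, z) is x + ky, and the z-component is 5<sup>k</sup>z, whose residue modulo 3 has period 2 in k. A Python check (ours):

```python
def apply_k(k, x, y, z):
    # A^k (x, y, z) for A = [[1, 1, 0], [0, 1, 0], [0, 0, 5]].
    return (x + k * y, y, 5 ** k * z)

chi_x = [lambda x, y, z: x % 3 == 0,            # k ≡ 0 (mod 3)
         lambda x, y, z: (x + y) % 3 == 0,      # k ≡ 1 (mod 3)
         lambda x, y, z: (x + 2 * y) % 3 == 0]  # k ≡ 2 (mod 3)
chi_z = [lambda x, y, z: z % 3 == 0,            # k even
         lambda x, y, z: (2 * z) % 3 == 0]      # k odd

for x0, y0, z0 in [(1, 2, 3), (0, 1, 1), (6, 3, 2), (2, 2, 5)]:
    for k in range(12):
        xk, _, zk = apply_k(k, x0, y0, z0)
        assert (xk % 3 == 0) == chi_x[k % 3](x0, y0, z0)
        assert (zk % 3 == 0) == chi_z[k % 2](x0, y0, z0)
```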

### **5 A Conditional Termination Analysis for Programs**

This section demonstrates how the results from Sects. 3 and 4 can be combined to yield a conditional termination analysis that applies to general programs.

**Integer-Spectrum Restriction for** Q-**DLTS.** Section 3 gives a way to compute a Q-**DATS**-reflection of any transition formula. Yet the analysis we developed in Sect. 4 only applies to linear dynamical systems with integer spectrum. We now show how to bridge the gap. Let V be a Q-**DATS**. As discussed in Sect. 3.4, we may homogenize V to obtain a Q-**DLTS** T. Define Z(T) to be the space spanned by the generalized (right) eigenvectors of T|<sup>ω</sup> that correspond to integer eigenvalues:

$$\mathbb{Z}(T) \triangleq \operatorname{span}(\{x \in \operatorname{dom}^\omega(T) : \exists r \in \mathbb{N}^+, \lambda \in \mathbb{Z}.(T|\_\omega - \underline{\lambda})^r(x) = 0\})$$

Since <sup>Z</sup>(T) is invariant under <sup>T</sup>|<sup>ω</sup> and thus <sup>T</sup>, <sup>T</sup> defines a linear map <sup>T</sup>|<sup>Z</sup> : <sup>Z</sup>(T) <sup>→</sup> <sup>Z</sup>(T), and by construction <sup>T</sup>|<sup>Z</sup> has integer spectrum. The following lemma justifies the restriction of our attention to the subspace Z(T).

**Lemma 5.** *Let* F *be a transition formula, let* ⟨V, s⟩ *be a* Q*-***DATS***-reflection of* F*, and let* ⟨T, h⟩ = *homog*(V)*. For any state* v ∈ *dom*<sup>ω</sup>(F)*, we have* h(s(v)) ∈ ℤ(T)*.*

*Example 8.* The following loop computes the number of trailing 0s in the binary representation of the integer x; its corresponding transition formula is shown alongside:

$$\begin{array}{ll} 1 & c := 0\\ 2 & \textbf{while } (x \bmod 2 == 0) \textbf{ do} \\ 3 & \quad x := x / 2 \\ 4 & \quad c := c + 1 \end{array} \qquad F(x, c, x', c') = \left( \begin{array}{l} (2 \mid x) \\ \wedge\ (x - 1 \le 2x' \wedge 2x' \le x) \\ \wedge\ (c' = c + 1) \end{array} \right)$$

The homogenization of the Q-**DATS**-reflection of F is the Q-**DLTS** T, where:

$$R\_T \triangleq \left\{ \left\langle \begin{bmatrix} x \\ c \\ h \end{bmatrix}, \begin{bmatrix} x' \\ c' \\ h' \end{bmatrix} \right\rangle : \begin{bmatrix} x' \\ c' \\ h' \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ c \\ h \end{bmatrix} \right\}$$

The ω-domain of T is the whole state space ℚ<sup>3</sup>. Since the eigenvector [1 0 0]<sup>⊤</sup> of the transition matrix corresponds to the non-integer eigenvalue $\frac{1}{2}$, the x-coordinate of states in ℤ(T) must be 0; i.e., ℤ(T) = {(x, c, h) : x = 0}. We conclude that x ≠ 0 is a sufficient condition for the loop to terminate.
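Running the loop of Example 8 directly illustrates the analysis: the loop terminates on every nonzero x (with c counting the trailing zero bits), while on x = 0 the guard 2 | x holds forever. The sketch below is ours; the `fuel` parameter is an artificial cutoff used to observe divergence.

```python
def trailing_zeros(x, fuel=64):
    # The loop of Example 8; returns (terminated, c).
    c = 0
    while x % 2 == 0:
        if fuel == 0:
            return False, c        # fuel exhausted: treat as divergence
        x //= 2
        c += 1
        fuel -= 1
    return True, c

# x != 0 guarantees termination: c counts the trailing 0 bits of x.
for x in [1, 2, 12, 40, -8, 1024]:
    ok, c = trailing_zeros(x)
    assert ok and x % 2 ** c == 0 and (x // 2 ** c) % 2 != 0

# On x = 0 the guard never fails, and the fuel runs out.
assert trailing_zeros(0)[0] is False
```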

```
Input  : A transition formula F(x, x') ∈ TF in linear integer arithmetic.
Output : A mortal precondition mp(F) for F.
1   A ← aff(F) ;                        /* Affine hull [26]; Lemma 1 */
2   ⟨D, d⟩ ← α_{Det(A)}(A) ;            /* Determinize; Lemma 3 */
3   ⟨V, q⟩ ← Q-DATS-reflection of D ;   /* Algorithm 1 */
4   v ← q ◦ d ;                         /* ⟨V, v⟩ is a Q-DATS-reflection of F */
5   ⟨T, h⟩ ← homog(V) ;                 /* Homogenization of V */
6   t ← h ◦ v ;                         /* t is an affine simulation F → T */
7   p ← (any) linear projection of S_T onto Z(T);
8   C ← matrix such that Cw = 0 ⟺ w ∈ Z(T);
9   G(w) ← ∃x, x'. F(x, x') ∧ w = p(t(x)) ∧ Ct(x) = 0;
10  (H_0(w), ..., H_P(w))^ω ← χ(G(w), T|_Z) ;   /* Section 4 */
11  return ¬((⋀_i H_i(p(t(x)))) ∧ Ct(x) = 0)
```
**Algorithm 2:** Procedure for computing *mp*(F).

**The Mortal Precondition Operator.** Algorithm 2 shows how to compute a mortal precondition for an LIA transition formula F(**x**, **x**′) (i.e., a sufficient condition under which F terminates). The algorithm operates as follows. First, we compute a Q-**DATS**-reflection of F, and homogenize to get a Q-**DLTS** T and an *affine* simulation t : F → T. Let p denote an (arbitrary) projection from S<sub>T</sub> onto ℤ(T) (so p is a simulation from T to T|<sub>Z</sub>). We then compute an LIA formula G which represents the states **w** of T|<sub>Z</sub> such that there is some v ∈ dom(F) with t(v) ∈ ℤ(T) and p(t(v)) = **w**. Letting (H<sub>0</sub>, ..., H<sub>P</sub>)<sup>ω</sup> be the characteristic sequence χ(G, T|<sub>Z</sub>), we have that for any v ∈ dom<sup>ω</sup>(F), t(v) must belong to ℤ(T) and p(t(v)) must satisfy each H<sub>i</sub>, so we define

$$\mathit{mp}(F) \triangleq \{ v \in S\_F : t(v) \notin \mathbb{Z}(T) \text{ or } v \not\models \bigwedge\_i H\_i(p(t(\mathbf{x}))) \}$$

Within the context of the algorithm, we suppose that states of F are represented by n-dimensional vectors, states of T by m-dimensional vectors, and states of T|<sub>Z</sub> by q-dimensional vectors. The affine simulation t is represented in the form **x** ↦ A**x** + **b**, where A ∈ ℤ<sup>m×n</sup> and **b** ∈ ℤ<sup>m</sup>; the projection p is represented as a ℤ<sup>q×m</sup> matrix, and the linear map T|<sub>Z</sub> as a ℚ<sup>q×q</sup> matrix. The fact that p and t have all integer (rather than rational) entries is without loss of generality, since any simulation can be scaled by the least common denominator of its entries.
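The scaling argument can be sketched in a few lines: multiplying a rational matrix by the least common multiple of its entries' denominators yields an integer matrix representing the same simulation. The helper below is our own illustration.

```python
from fractions import Fraction
from math import lcm

def clear_denominators(M):
    # Scale a rational matrix by the lcm of its entries' denominators;
    # simulations are closed under scaling by a positive constant.
    d = lcm(*[entry.denominator for row in M for entry in row])
    return [[int(entry * d) for entry in row] for row in M], d

M = [[Fraction(1, 2), Fraction(0)],
     [Fraction(1, 3), Fraction(1)]]
N, d = clear_denominators(M)
assert d == 6 and N == [[3, 0], [2, 6]]
```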

**Theorem 3 (Soundness).** *For any transition formula* F *and any state* s *such that* s ∈ *mp*(F)*, we have* s ∉ *dom*<sup>ω</sup>(F)*.*

*Proof.* Let T, t, p, C, G, and H<sub>0</sub>, ..., H<sub>P</sub> be as in Algorithm 2. We prove the contrapositive: we assume v ∈ dom<sup>ω</sup>(F) and prove v ∉ mp(F), or equivalently, that t(v) ∈ ℤ(T) and v |= H<sub>i</sub>(p(t(**x**))) for each i. We have t(v) ∈ ℤ(T) by Lemma 5, so it remains only to show that v |= H<sub>i</sub>(p(t(**x**))) for each i.

Since v ∈ dom<sup>ω</sup>(F), there exists an infinite trajectory of F starting from v: v →<sub>F</sub> v<sub>1</sub> →<sub>F</sub> v<sub>2</sub> →<sub>F</sub> ···. For any j, let **w**<sub>j</sub> = T|<sub>Z</sub><sup>j</sup>(p(t(v))). Since p ◦ t is an (affine) simulation, we have **w**<sub>j</sub> = p(t(v<sub>j</sub>)) for all j. It follows that for any j, we have [v<sub>j</sub>, v<sub>j+1</sub>] |= F(**x**, **x**′) ∧ **w**<sub>j</sub> = p(t(**x**)) ∧ Ct(**x**) = 0, and so G(**w**<sub>j</sub>) = ∃**x**, **x**′. F(**x**, **x**′) ∧ **w**<sub>j</sub> = p(t(**x**)) ∧ Ct(**x**) = 0 holds for all j. By Theorem 2, H<sub>i</sub>(p(t(**x**))) holds for all H<sub>i</sub>. □

The proof of soundness requires only that we can compute Q-**DATS**-abstractions of transition formulas. The following is the culmination of our development of Q-**DATS**-*reflections*:

**Theorem 4 (Monotonicity).** *For any transition formulas* F<sup>1</sup> *and* F<sup>2</sup> *such that* F<sup>1</sup> |= F2*, we have mp*(F2) |= *mp*(F1)*.*

The desire for monotonicity is inspired by the principle that *changes to a program should have a predictable impact on its analysis* [34]. Monotonicity guarantees that feeding more information into the analysis always leads to better results; for example, if a user annotates a procedure with preconditions or adds loop invariants to the program, our termination analysis can only produce weaker (that is, better) preconditions for termination. Moreover, in the context of this work, monotonicity also guarantees that if we cannot prove termination using the *mp* operator that we defined, then *any* linear abstraction of the loop has reachable non-terminating states.

# **6 Evaluation**

Section 5 shows how to compute mortal preconditions for transition formulas. Using the framework of algebraic termination analysis [34], we can "lift" the analysis to compute mortal preconditions for whole programs. The essential idea is to compute summaries for loops and procedures in "bottom-up" fashion, apply the mortal precondition operator from Sect. 5 to each loop body summary, and then propagate the mortal preconditions for the loops back to the entry of the program (see [34] for more details). We can verify that a program terminates by using an SMT solver to check that its mortal precondition is valid.

We have implemented Algorithm 2 as a mortal precondition operator *mp*LR ("mortal precondition via Linear Reflections") in ComPACT, a tool that implements the termination analysis framework presented in [34]. We compare the performance of our analysis against 2LS [5], Ultimate Automizer [10] and CPAchecker [23], the top three competitors in the termination category of Competition on Software Verification (SV-COMP) 2020.

Experiments were run on a virtual machine with Ubuntu 18.04, a single-core Intel Core i7-9750H @ 2.60 GHz CPU, and 8 GB of RAM. All tools were run with a time limit of 10 min.

*Benchmarks.* We tested on a suite of 263 programs divided into 4 categories. The termination and recursive suites contain small programs with challenging termination arguments, while the polybench suite contains larger real-world programs that have relatively simple termination arguments. The termination


**Table 1.** Termination verification benchmarks; time in seconds.


**Table 2.** Comparing mpLR and ComPACT; time in seconds.

category consists of the *non-recursive, terminating* benchmarks from SV-COMP 2020 in the Termination-MainControlFlow suite. The recursive category consists of the *recursive, terminating* benchmarks from the recursive directory and Termination-MainControlFlow. Note that 2LS does not handle recursive programs, so we exclude it from the recursive category. Finally, we created a new test suite linear consisting of programs with terminating linear abstractions. This suite is designed to exercise the capabilities of the *mp*LR, and includes all examples from Ben-Amram and Genaim's article [1] on multi-phase ranking functions, loops with disjunctive and/or modular arithmetic guards, and loops that model integer division and remainder calculation.

*How Does Our Analysis Compare with the State-of-the-Art?* The comparison of ComPACT using the *mp*LR operator against state-of-the-art termination analysis tools is shown in Table 1. ComPACT with *mp*LR is competitive with (but not dominating) leading tools in terms of number of tasks solved across the suite, and uses substantially less time. The *mp*LR analysis is least successful on the termination and recursive suites, which are designed to have difficult termination arguments. Most competitive tools use a portfolio of different termination techniques to approach such problems (e.g., Ultimate Automizer synthesizes linear, nested, multi-phase, lexicographic and piecewise ranking functions); we investigate the use of *mp*LR in a portfolio solver in the following.

ComPACT with *mp*LR solves all tasks in the polybench suite, which contains numerical programs that have simple termination arguments, but which are larger than the SV-COMP tasks. 2LS, Ultimate Automizer, and CPAChecker exhaust time or memory limits on all tasks. Nested loops are a problematic pattern that appears in these programs, e.g.,

```
for (int i = 0; i < 4096; i += step)
    for (int j = 0; j < 4096; j += step)
        // no modifications to i, j, or step
```

For such loops, *mp*LR is guaranteed to synthesize a conditional termination argument that is *at least* as weak as *step* > 0 (regardless of the contents of the inner loop), by monotonicity and the fact that the loop body formula entails i < 4096 ∧ i′ = i + *step* ∧ *step*′ = *step*. Ultimate Automizer, CPAChecker, and 2LS cannot make such theoretical guarantees.

The linear suite demonstrates that *mp*LR is capable of proving termination of programs that lie outside the boundaries of the other tools.

*Can Our Analysis Improve a Portfolio Solver?* We compare *mp*LR and ComPACT in Table 2. The columns correspond to running ComPACT with the following options: excluding the portfolio from [34] (*mp*LR), including the portfolio but excluding *mp*LR (ComPACT−*mp*LR), and including both the portfolio and *mp*LR (ComPACT+*mp*LR). ComPACT+*mp*LR solves 11 additional tasks over ComPACT−*mp*LR while adding negligible runtime overhead. In fact, adding *mp*LR to the portfolio *decreases* the amount of time it takes for ComPACT to complete all benchmark suites. Note that the combined tool is successful on the most termination tasks among all the tools we tested, both overall and for each individual suite except the termination category.

# **7 Related Work**

*Termination Analysis of Linear Loops.* The universal termination problem for linear loops (or *total deterministic affine transition systems*, in the terminology of Sect. 4) was posed by Tiwari [29]. The case of linear loops over the reals was resolved by Tiwari [29], over the rationals by Braverman [4], and finally over the integers by Hosseini et al. [14]. In principle, we can combine any of these techniques with our algorithm for computing **DATS**-reflections of transition formulas to yield a sound (but incomplete) termination analysis. The significance of computing a **DATS**-reflection (rather than just "some" abstraction) is that it provides an algorithmic completeness result: if it is possible to prove termination of a loop by exhibiting a terminating linear dynamical system that simulates it, the algorithm will prove termination.

The method introduced in Sect. 4 to compute characteristic sequences of inequalities is based on the method that Tiwari used to prove decidability of the universal termination problem for linear loops with (positive) real spectra [29]. Tiwari's condition of having *real* spectra is strictly more general than the *integer* spectra used by our procedure; requiring that the spectrum be integer allows us to express the **DTA** procedure in linear *integer* arithmetic rather than real arithmetic. Similar procedures appear also in [12,18]. We note in particular that our results in Sects. 4 and 5 subsume Frohn and Giesl's decision procedure for universal termination of upper-triangular linear loops [12]; since every rational upper-triangular linear loop has a rational spectrum (and is therefore a Q-**DATS**), the mortal precondition computed for any rational upper-triangular linear loop is valid iff the loop is universally terminating.

*Linear Abstractions.* The formulation of "best abstractions" using reflective subcategories is based on the framework developed in [17]. A variation of this method was used in the context of invariant generation, based on computing (weak) reflections of linear rational arithmetic formulas in the category of rational vector addition systems [27]. This paper is the first to apply the idea to termination analysis.

A method for extracting polynomial recurrence (in)equations that are entailed by a transition formula appears in [16]. The algorithm can also be applied to compute a **TDATS**-abstraction of a transition formula. The procedure does not guarantee that the **TDATS**-abstraction is a reflection (*best* abstraction); Proposition 1 demonstrates that no such procedure exists. In this paper, we generalize the model to allow non-total transition systems, and show that best abstractions do exist. The techniques from Sect. 3 can be used for invariant generation, improving upon the methods of [16].

Kincaid et al. show that the category of linear dynamical systems with *periodic rational* spectrum is a reflective subcategory of the category of linear dynamical systems [18]. A complex number n is periodic rational if n<sup>p</sup> is rational for some p ∈ ℤ<sub>>0</sub>. Combining this result with the technique from Sect. 3 yields the result that the category of **DATS** with periodic rational spectrum is a reflective subcategory of **TF**. The decision procedure from Sect. 4 extends easily to the periodic rational case, which results in a strictly more powerful decision procedure.

*Termination Analysis.* Termination analysis, and in particular conditional termination analysis, has been widely studied. Work on the subject can be divided into practical termination analyses that work on real programs but offer few theoretical guarantees [2,6,8,11,13,20,30–32], and work on simplified models (such as linear, octagonal, and polyhedral loops) with strong guarantees that cannot be applied directly to real programs [1,3,4,14,21,25,29]. This paper aims to help bridge the gap between the two by showing how to apply analyses for linear loops to general programs, while preserving some of their desirable theoretical properties, in particular monotonicity.

**Acknowledgments.** This work was supported in part by the NSF under grant number 1942537 and by ONR under grant N00014-19-1-2318. Opinions, findings, conclusions, or recommendations expressed herein are those of the authors and do not necessarily reflect the views of the sponsoring agencies.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Decision Tree Learning in CEGIS-Based Termination Analysis**

Satoshi Kura<sup>1,2(B)</sup>, Hiroshi Unno<sup>3,4</sup>, and Ichiro Hasuo<sup>1,2</sup>

<sup>1</sup> National Institute of Informatics, Tokyo, Japan kura@nii.ac.jp

<sup>2</sup> The Graduate University for Advanced Studies (SOKENDAI), Kanagawa, Japan <sup>3</sup> University of Tsukuba, Ibaraki, Japan <sup>4</sup> RIKEN AIP, Tokyo, Japan

**Abstract.** We present a novel decision tree-based synthesis algorithm of ranking functions for verifying program termination. Our algorithm is integrated into the workflow of CounterExample Guided Inductive Synthesis (CEGIS). CEGIS is an iterative learning model where, at each iteration, (1) a synthesizer synthesizes a candidate solution from the current examples, and (2) a validator accepts the candidate solution if it is correct, or rejects it providing counterexamples as part of the next examples. Our main novelty is in the design of a synthesizer: building on top of a usual decision tree learning algorithm, our algorithm detects *cycles* in a set of example transitions and uses them for refining decision trees. We have implemented the proposed method and obtained promising experimental results on existing benchmark sets of (non-)termination verification problems that require synthesis of piecewise-defined lexicographic affine ranking functions.

# **1 Introduction**

*Termination Verification by Ranking Functions and CEGIS.* Termination verification is a fundamental but challenging problem in program analysis. It usually involves some well-foundedness argument. Among such methods are those that synthesize *ranking functions* [16]: a ranking function assigns a natural number (or, more generally, an ordinal) to each program state, in such a way that the assigned values strictly decrease along transitions. The existence of such a ranking function witnesses termination; here the well-foundedness of the set of natural numbers (or ordinals) is crucially used.

We study synthesis of ranking functions by CounterExample Guided Inductive Synthesis (CEGIS) [29]. CEGIS is an iterative learning model in which a synthesizer and a validator interact to find solutions for given constraints. At each iteration, (1) a synthesizer tries to find a candidate solution from the current examples, and (2) a validator accepts the candidate solution if it is correct, or rejects it providing counterexamples. These counterexamples are then used as part of the next examples (Fig. 1).
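The loop just described can be sketched in a few lines. Here `synthesize` and `validate` are placeholders for the two components of Fig. 1, and the toy instance (searching for an integer c ≥ 7 from lower-bound counterexamples) is purely illustrative, not the paper's instantiation.

```python
def cegis(synthesize, validate, max_iters=100):
    """Minimal CEGIS loop: alternate a synthesizer and a validator."""
    examples = set()                      # the growing set E of examples
    for _ in range(max_iters):
        candidate = synthesize(examples)  # (1) candidate consistent with E
        if candidate is None:
            return "unsat", examples      # E itself admits no solution
        cex = validate(candidate)         # (2) accept, or give counterexamples
        if cex is None:
            return "solution", candidate
        examples |= set(cex)              # counterexamples join the next E
    return "unknown", None

# Toy instance: find an integer c with c >= 7; each rejected candidate
# becomes a counterexample (a lower bound) added to E.
synth = lambda E: max(E) + 1 if E else 0
valid = lambda c: None if c >= 7 else [c]
```

In the paper's setting, candidates are ranking functions and counterexamples are transition examples; the loop structure is the same.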

© The Author(s) 2021. A. Silva and K. R. M. Leino (Eds.): CAV 2021, LNCS 12760, pp. 75–98, 2021. https://doi.org/10.1007/978-3-030-81688-9_4

**Fig. 1.** The CEGIS architecture

CEGIS has been applied not only to program verification tasks (synthesis of inductive invariants [17,18,25,26], of ranking functions [19], etc.) but also to constraint solving (for CHC [12,14,28,36], for pwCSP(T) [30,31], etc.). The success of CEGIS is attributed to the degree of freedom that synthesizers enjoy: in CEGIS, synthesizers receive a set of individual examples that they can use in various creative and speculative manners (such as machine learning). In contrast, in other methods such as [5–8,24,27], synthesizers receive logical constraints, which are much more binding.

*Segmented Synthesis in CEGIS-Based Termination Analysis.* The choice of a *candidate space* for candidate solutions σ is important in CEGIS. A candidate space should be *expressive*: by limiting a candidate space, the CEGIS architecture may miss a genuine solution. At the same time, *complexity* should be low: a larger candidate space tends to be more expensive for synthesizers to handle.

This tradeoff is also present in the choice of the type of examples: with an expressive example type, a small number of examples can prune a large portion of the candidate space; however, finding such expressive examples tends to be expensive.

In this paper, we use *piecewise affine functions* as our candidate space for ranking functions. Piecewise affine functions are functions of the form

$$f(\widetilde{x}) = \begin{cases} \widetilde{a}_1 \cdot \widetilde{x} + b_1 & \widetilde{x} \in L_1 \\ \quad\vdots \\ \widetilde{a}_n \cdot \widetilde{x} + b_n & \widetilde{x} \in L_n \end{cases} \tag{1}$$

where {L1, ..., Ln} is a partition of the domain of f(x̃) such that each Li is a polyhedron (i.e., a conjunction of linear inequalities). We say *segmented synthesis* to emphasize that our synthesis targets are piecewise affine functions with case distinction. Piecewise affine functions strike a good balance between expressiveness and complexity: the tasks of synthesizers and validators can be reduced to linear programming (LP); at the same time, case distinction allows them to model a variety of situations, especially where there are discontinuities in the function values and/or derivatives.
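For concreteness, here is a sketch of how a function of this shape can be represented and evaluated; the encoding (coefficient tuples plus inequality lists for each polyhedron) is our own illustrative choice, not the paper's implementation.

```python
def in_polyhedron(ineqs, x):
    """A polyhedron as a list of inequalities c.x + d >= 0."""
    return all(sum(c * xi for c, xi in zip(cs, x)) + d >= 0 for cs, d in ineqs)

def eval_piecewise(pieces, x):
    """pieces: list of (a, b, L) for the cases a.x + b on region L."""
    for a, b, region in pieces:
        if in_polyhedron(region, x):
            return sum(ai * xi for ai, xi in zip(a, x)) + b
    raise ValueError("the regions must partition the domain")

# f(x) = if x >= 0 then x else -x, over integer states:
f = [((1,), 0, [((1,), 0)]),      # a=1, b=0 on L1: x >= 0
     ((-1,), 0, [((-1,), -1)])]   # a=-1, b=0 on L2: -x - 1 >= 0, i.e. x <= -1
```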

We use *transition examples* as our example type (Table 1). Transition examples are pairs of program states that represent transitions; they are much cheaper to handle compared to *trace examples* (finite traces of executions until termination) used e.g. in [15,33]. The current work is the first to pursue segmented synthesis of ranking functions with transition examples; see Table 1.

**Table 1.** Ranking function synthesis by CEGIS

**Fig. 2.** Decision tree learning

*Decision Tree Learning for CEGIS-Based Termination Analysis: a Challenge.* In this paper, we represent piecewise affine functions (1) by the data structure of *decision trees*. The data structure suits the CEGIS architecture (Fig. 1): iterative refinement of candidate solutions can be naturally expressed by growing decision trees. The main challenge of this paper is the design of an effective synthesizer for decision trees—such a synthesizer *learns* decision trees from examples.

In fact, decision tree learning in the CEGIS architecture has already been actively pursued, for the synthesis of *invariants* as opposed to ranking functions [12,14,18,22,36]. It is therefore a natural idea to adapt the decision tree learning algorithms used there, from invariants to ranking functions. However, we find that a naive adaptation of those algorithms for invariants does not suffice: they are good at handling *state examples* that appear in CEGIS for invariants; but they are not good at handling transition examples.

More specifically, when decision tree learning is applied to invariant synthesis (Fig. 2a), examples are given in the form of program states labeled as positive or negative. Decision trees are then built by iteratively selecting the best halfspaces—where "best" is in terms of some quality measures—until each leaf contains examples with the same label. One common quality measure used here is an information-theoretic notion of *information gain*.

We extend this from invariant synthesis to ranking function synthesis, where examples are given by transitions instead of states (Fig. 2b). In this case, a major challenge is to cope with examples that cross a border of the current segmentation, such as the transition e4 crossing the border h1 in Fig. 2b. Our decision tree learning algorithm should handle such crossing examples, taking into account the constraints imposed on the leaf labels affected by those examples (the affected leaf labels are f1(x̃) and f3(x̃) in the case of e4).

*Our Algorithm: Cycle-Based Decision Tree Learning for Transition Examples.* We use what we call the *cycle detection theorem* (Theorem 17) as a theoretical tool to handle such crossing examples. The theorem claims the following: if there is no piecewise affine ranking function with the current segmentation of the domain (such as the one in Fig. 2b given by h1 and h2), then this must be caused by a certain type of cycle of constraints, which we call an *implicit cycle*.

In our decision tree learning algorithm, when we do not find a piecewise affine ranking function with the current segmentation, we find an implicit cycle and refine the segmentation to break the cycle. Once all the implicit cycles are gone, the cycle detection theorem guarantees the existence of a candidate piecewise affine ranking function with the segmentation.

We integrate this decision tree learning algorithm in the CEGIS architecture (Fig. 1) and use it as a synthesizer. Our implementation of this framework gives promising experimental results on existing benchmark sets.

*Contribution.* Our contribution is summarized as follows.


*Organization.* Section 2 gives an overview of our method via examples. Section 3 explains our target class of predicate constraint satisfaction problems and how to encode (non-)termination problems into such constraints. In Sect. 4, we review the CEGIS architecture and then explain the simplification of examples into positive/negative examples. Section 5 presents our main contribution, a decision tree-based ranking function synthesizer. Section 6 describes our implementation and experimental results. Related work is discussed in Sect. 7, and we conclude in Sect. 8.

# **2 Preview by Examples**

We present a preview of our method using concrete examples. We start with an overview of the general CEGIS architecture, after which we proceed to our main contribution, namely a decision tree learning algorithm for transition examples.

#### **2.1 Termination Verification by CEGIS**

Our method follows the usual workflow of termination verification by CEGIS: given a program, we encode its termination problem as a constraint solving problem, and then use the CEGIS architecture to solve it.

*Encoding the Termination Problem.* The first step of our method is to encode the termination problem as the set C of constraints.

**Example 1.** As a running example, consider the following C program.

while (x != 0) { if (x < 0) { x = x + 1; } else { x = x - 1; } }

The termination problem is encoded as the following constraints.

$$x < 0 \land x' = x + 1 \implies R(x, x') \tag{2}$$

$$\neg(x<0) \land x' = x - 1 \implies R(x, x'). \tag{3}$$

Here, R is a predicate variable representing a well-founded relation, and the term variables x, x′ are implicitly universally quantified.

The set C of constraints claims that the transition relation for the given program is subsumed by a well-founded relation. So, verifying termination is now rephrased as the existence of a solution for C. Note that we omitted constraints for invariants for simplicity in this example (see Sect. 3 for the full encoding).

*Constraint solving by CEGIS.* The next step is to solve C by CEGIS.

In the CEGIS architecture, a synthesizer and a validator iteratively exchange a set E of examples and a candidate solution R(x, x′) for C. For the moment, we give a rough sketch of CEGIS, leaving the details of our implementation to Sect. 2.2.

**Fig. 3.** An example of CEGIS iterations

**Example 2.** Figure 3 shows how the CEGIS architecture solves the set C of constraints shown in (2) and (3). Figure 3 consists of three pairs of interactions (i)–(vi) between a synthesizer and a validator.

(i) The synthesizer takes E = ∅ as the set of examples and returns a candidate solution R(x, x′) = ⊥ synthesized from E. In general, candidate solutions are required to satisfy all constraints in E, but here the requirement is vacuously true.


#### **2.2 Handling Cycles in Decision Tree Learning**

We explain the importance of handling cycles in our decision tree-based synthesizer of piecewise affine ranking functions.

In what follows, we deal with decision trees such as the one shown in Fig. 4: internal nodes carry affine inequalities (i.e., halfspaces), leaves carry affine functions, and overall such a decision tree expresses a piecewise affine function. When we remove the leaf labels from such a decision tree, we obtain a template of piecewise functions where condition guards are given but function bodies are not; we call the latter a *segmentation*.
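Such trees can be sketched as a small algebraic datatype; erasing the leaf labels yields the segmentation. The class names below are ours and purely illustrative.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Leaf:
    coeffs: Optional[Tuple[int, ...]]  # None marks an undefined leaf (segmentation)
    offset: int = 0

@dataclass
class Node:
    h: Tuple[int, ...]   # halfspace guard  h.x + h0 >= 0
    h0: int
    then_: object
    else_: object

def evaluate(tree, x):
    """Evaluate the piecewise affine function represented by a labeled tree."""
    while isinstance(tree, Node):
        branch = sum(c * xi for c, xi in zip(tree.h, x)) + tree.h0 >= 0
        tree = tree.then_ if branch else tree.else_
    return sum(c * xi for c, xi in zip(tree.coeffs, x)) + tree.offset

def segmentation(tree):
    """Erase leaf labels: keep only the guards (the template S(D))."""
    if isinstance(tree, Leaf):
        return Leaf(None)
    return Node(tree.h, tree.h0, segmentation(tree.then_), segmentation(tree.else_))

# The one-variable tree for  if x >= 0 then x else -x:
t = Node((1,), 0, Leaf((1,), 0), Leaf((-1,), 0))
```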

*Input and Output of our Synthesizer.* The input of our synthesizer is a set E of transition examples (e.g. E = {R(1, 0), R(−2, −1)}) as explained in Sect. 2.1. The output of our synthesizer is a well-founded relation R(x̃, x̃′) := f(x̃) > f(x̃′) ∧ f(x̃) ≥ 0, where x̃ is a sequence of variables and f(x̃) is a piecewise affine function represented by a decision tree (Fig. 4). Therefore our synthesizer aims at *learning* a suitable decision tree.

**Fig. 4.** An example of a decision tree that represents a piecewise affine ranking function f(x, y)

*Refining Segmentations and Handling Cycles.* Roughly speaking, our synthesizer learns decision trees in the following steps.

**Fig. 5.** Selecting halfspaces. Transition examples are shown by red arrows. Boundaries of halfspaces are shown by dashed lines.


The key step of our synthesizer is Step 3. We show a few examples.

**Example 3.** Suppose we are given E = {R(1, 0), R(−2, −1)} as a set of examples. Our synthesizer proceeds as follows: (1) it generates the set H := {x ≥ 1, x ≥ 0, x ≥ −2, x ≥ −1} from the examples in E. (2) It tries to find a ranking function of the form f(x) = ax + b (with the trivial segmentation), but there is no such ranking function. (3) It refines the current segmentation with (x ≥ 0) ∈ H because x ≥ 0 "looks good". (4) It tries to find a ranking function of the form f(x) = **if** x ≥ 0 **then** ax + b **else** cx + d, using the current segmentation; it obtains f(x) = **if** x ≥ 0 **then** x **else** −x and uses this f(x) in a candidate solution.
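Steps (2) and (4) of Example 3 can be replayed with a brute-force search over small integer coefficients (in practice an LP solver would be used, as Sect. 1 notes); the coefficient bound is an assumption of this sketch.

```python
from itertools import product

def find_affine_rf(examples, bound=3):
    """Search for f(x) = a*x + b with f(v) > f(v') and f(v) >= 0
    for every example R(v, v')."""
    for a, b in product(range(-bound, bound + 1), repeat=2):
        if all(a * v + b > a * w + b and a * v + b >= 0 for v, w in examples):
            return a, b
    return None

E = [(1, 0), (-2, -1)]
# Step (2): no single affine ranking function exists for E.
assert find_affine_rf(E) is None
# Step (4): after splitting on x >= 0 (both endpoints of each example stay
# on one side here), each leaf admits an affine ranking function.
assert find_affine_rf([e for e in E if e[0] >= 0]) is not None
assert find_affine_rf([e for e in E if e[0] < 0]) is not None
```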

How do we decide which halfspace in H "looks good"? We use a *quality measure*, a value representing the quality of each halfspace, and select the halfspace with the maximum quality measure.

Figure 5 shows the comparison of the quality of x ≥ 0 and x ≥ −2 in this example. Intuitively, x ≥ 0 is better than x ≥ −2 because we can obtain a simple ranking function **if** x ≥ 0 **then** x **else** − x with x ≥ 0 (Fig. 5a) while we need further refinement of the segmentation with x ≥ −2 (Fig. 5b). In Sect. 5, we introduce a quality measure for halfspaces following this intuition.

Our synthesizer iteratively refines segmentations following this quality measure, until examples contained in each leaf of the decision tree admit an affine ranking function. This approach is inspired by the use of information gain in the decision tree learning for invariant synthesis.

Example 3 showed a natural extension of a decision tree learning method for invariant synthesis. However, this is not enough for transition examples, because of *explicit* and *implicit cycles*. Here are examples of both.

**Fig. 6.** Two examples R(−1, 1) and R(1, 0) make an implicit cycle between x ≥ 1 and ¬(x ≥ 1).

**Example 4.** Suppose we are given E = {R(1, 0), R(0, 1)}. In this case, there is no ranking function because E contains a cycle 1 → 0 → 1 witnessing nontermination. We call such a cycle an *explicit cycle*.
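Explicit cycles are easy to detect, since the examples themselves form a directed graph on states; a sketch with a standard depth-first search:

```python
def has_explicit_cycle(examples):
    """DFS for a cycle in the graph whose edges are the example transitions."""
    graph = {}
    for v, w in examples:
        graph.setdefault(v, []).append(w)
    visited, on_stack = set(), set()
    def dfs(u):
        visited.add(u); on_stack.add(u)
        for w in graph.get(u, []):
            if w in on_stack or (w not in visited and dfs(w)):
                return True
        on_stack.discard(u)
        return False
    return any(dfs(v) for v in list(graph) if v not in visited)
```

On the example set above this reports the cycle 1 → 0 → 1, so a synthesizer can give up before building any tree.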

**Example 5.** Let E = {R(−1, 1), R(1, 0), R(−1, −2), R(2, 3)} (Fig. 6). Our synthesizer proceeds as follows. (1) It generates the set H := {x ≥ 1, x ≥ 0, ...} of halfspaces. (2) It tries to find a ranking function of the form f(x) = ax + b (with the trivial segmentation), but there is no such function. (3) It refines the current segmentation with (x ≥ 1) ∈ H because x ≥ 1 "looks good" (i.e., is the best with respect to a quality measure).

We have reached the point where the naive extension of decision tree learning explained in Example 3 no longer works: although all constraints contained in each leaf of the decision tree admit an affine ranking function, there is no piecewise affine ranking function for E of the form f(x) = **if** x ≥ 1 **then** ax + b **else** cx + d.

More specifically, in this example, the leaf representing x ≥ 1 contains R(2, 3), and the other leaf representing ¬(x ≥ 1) contains R(−1, −2). The example R(2, 3) admits an affine ranking function f1(x) = −x + 2, and R(−1, −2) admits f2(x) = x + 1, respectively. However, the combination f(x) = **if** x ≥ 1 **then** f1(x) **else** f2(x) is not a ranking function for E. Moreover, there is no ranking function for E of the form f(x) = **if** x ≥ 1 **then** ax + b **else** cx + d.

It is clear that this failure is caused by the *crossing examples* R(−1, 1) and R(1, 0). It is not that every crossing example is harmful. However, in this case, the set {R(−1, 1), R(1, 0)} forms a cycle between the leaf for x ≥ 1 and the leaf for ¬(x ≥ 1) (see Fig. 6). This "cycle" among leaves—in contrast to *explicit* cycles such as {R(1, 0), R(0, 1)} in Example 4—is called an *implicit cycle*.
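The dependency-graph view of Fig. 6 can be sketched directly: project each example onto the leaves of the current segmentation and keep only the crossing examples as edges. A leaf is represented here just by a string, for illustration.

```python
def dependency_edges(examples, leaf_of):
    """Edges between distinct leaves induced by crossing examples."""
    return {(leaf_of(v), leaf_of(w)) for v, w in examples
            if leaf_of(v) != leaf_of(w)}

E = [(-1, 1), (1, 0), (-1, -2), (2, 3)]           # the examples of Example 5
leaf_of = lambda x: "x>=1" if x >= 1 else "x<1"   # segmentation after step (3)
edges = dependency_edges(E, leaf_of)
# R(-1, 1) crosses into x>=1 and R(1, 0) crosses back: an implicit cycle.
assert edges == {("x<1", "x>=1"), ("x>=1", "x<1")}
```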

Once an implicit cycle is found, our synthesizer cuts it by refining the current segmentation. Our synthesizer continues the above steps (1–3) of decision tree learning as follows. (4) Our synthesizer selects (x ≥ 0) ∈ H and cuts the implicit cycle {R(−1, 1), R(1, 0)} by refining segmentations. (5) Using the refined segmentation, our synthesizer obtains f(x) = **if** x ≥ 1 **then** − x + 2 **else if** x ≥ 0 **then** 0 **else** x + 3 as a ranking function for E.

As explained in Examples 4 and 5, handling (explicit and implicit) cycles is crucial in decision tree learning from transition examples. Moreover, our *cycle detection theorem* (Theorem 17) claims that if there is no explicit or implicit cycle, then one can find a ranking function for E without further refinement of segmentations.

# **3 (Non-)Termination Verification as Constraint Solving**

We explain how to encode (non-)termination verification to constraint solving.

Following [31], we formalize our target class pwCSP of predicate constraint satisfaction problems parametrized by a first-order theory T .

**Definition 6.** Given a formula φ, let *ftv*(φ) be the set of free term variables and *fpv*(φ) be the set of free predicate variables in φ. 

**Definition 7.** A pwCSP is defined as a pair (C, R) where C is a finite set of clauses of the form

$$\phi \vee \left(\bigvee_{i=1}^{\ell} X_i(\widetilde{t_i})\right) \vee \left(\bigvee_{i=\ell+1}^{m} \neg X_i(\widetilde{t_i})\right) \tag{4}$$

and R ⊆ *fpv*(C) is a set of predicate variables that are required to denote *well-founded* relations. Here, 0 ≤ ℓ ≤ m. Meta-variables t and φ range over T-terms and T-formulas, respectively, such that *fpv*(φ) = ∅. Meta-variables x and X range over term and predicate variables, respectively.

A pwCSP (C, R) is called CHCs (constrained Horn clauses [9]) if R = ∅ and ℓ ≤ 1 for all clauses c ∈ C. The class of CHCs has been widely studied in the verification community [12,14,28,36].

**Definition 8.** A *predicate substitution* σ is a finite map from predicate variables X to closed predicates of the form λx1,...,x_{ar(X)}. φ. We write dom(σ) for the domain of σ and σ(C) for the application of σ to C.

**Definition 9.** A predicate substitution σ is a *(genuine) solution* for (C, R) if (1) *fpv*(C) ⊆ dom(σ); (2) |= σ(C) holds; and (3) for all X ∈ R, σ(X) represents a well-founded relation, that is, sort(σ(X)) = (s̃, s̃) → • for some sequence s̃ of sorts and there is no infinite sequence ṽ1, ṽ2, ... of sequences ṽi of values of the sorts s̃ such that |= σ(X)(ṽi, ṽi+1) for all i ≥ 1.

*Encoding Termination.* Given a set of initial states ι(x̃) and a transition relation τ(x̃, x̃′), the termination verification problem is expressed by the pwCSP (C, R) where R = {R} and C consists of the following clauses:

$$\iota(\widetilde{x}) \implies I(\widetilde{x}) \qquad \tau(\widetilde{x}, \widetilde{x}') \land I(\widetilde{x}) \implies I(\widetilde{x}') \qquad \tau(\widetilde{x}, \widetilde{x}') \land I(\widetilde{x}) \implies R(\widetilde{x}, \widetilde{x}')$$

We use φ =⇒ ψ as syntactic sugar for ¬φ ∨ ψ, so these are pwCSP clauses. The well-founded relation R asserts that τ is terminating. We also use an invariant I of τ to avoid synthesizing ranking functions on unreachable program states.
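These three clauses can be checked by brute force over a bounded window of states. The sketch below instantiates them for the program of Example 1, assuming its loop guard x != 0 and the ranking function from Sect. 2; this is a finite-range test, not a proof.

```python
iota = lambda x: x == 5                                  # an assumed initial state
tau  = lambda x, y: x != 0 and (y == x + 1 if x < 0 else y == x - 1)
I    = lambda x: True                                    # the trivial invariant suffices
f    = lambda x: x if x >= 0 else -x                     # candidate ranking function
R    = lambda x, y: f(x) >= 0 and f(x) > f(y)            # induced well-founded relation

S = range(-10, 11)                                       # bounded state window
assert all(I(x) for x in S if iota(x))                            # initiation
assert all(I(y) for x in S for y in S if tau(x, y) and I(x))      # consecution
assert all(R(x, y) for x in S for y in S if tau(x, y) and I(x))   # ranking
```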

*Encoding Non-termination.* We can also encode a problem of non-termination verification to pwCSP via recurrent sets [20]. For simplicity, we explain the encoding for the case of only one program variable x. We consider a recurrent set R satisfying the following conditions.

$$\iota(x) \implies R(x) \tag{5}$$

$$R(x) \implies \exists x'.\ \tau(x, x') \land R(x') \tag{6}$$

To remove ∃ from (6), we use the following constraints, which are equivalent to (6):

$$R(x) \implies E(x, 0) \tag{7}$$

$$E(x, x') \implies \left(\tau(x, x') \land R(x')\right) \lor \left(S(x', x'-1) \land E(x, x'-1)\right) \lor \left(S(x', x'+1) \land E(x, x'+1)\right) \tag{8}$$

The intuition is as follows. Given x in the recurrent set R, the relation E(x, x′) searches for a value of x′ witnessing the ∃x′ in (6). The search starts from x′ = 0 by (7), and x′ is nondeterministically incremented or decremented by (8). The well-founded relation S asserts that the search finishes in finitely many steps. As a result, we obtain a pwCSP for non-termination defined by (C, R) where R = {S} and C is given by (5), (7), and (the disjunctive normal form of) (8).

**Example 10.** Consider the following C program.

while (x > 0) { x = -2 * x + 9; }

The non-termination problem is encoded as the pwCSP (C, R) where R = {S}, and C consists of

$$x = 3 \implies R(x) \qquad\qquad R(x) \implies E(x, 0)$$

$$E(x, x') \implies \left(x > 0 \land x' = -2x + 9 \land R(x')\right) \lor \left(S(x', x'-1) \land E(x, x'-1)\right) \lor \left(S(x', x'+1) \land E(x, x'+1)\right).$$

The program is non-terminating when x = 3 (note that −2 · 3 + 9 = 3). This is witnessed by a solution σ for (C, R) given by σ(R)(x) := (x = 3), σ(E)(x, x′) := (x = 3 ∧ 0 ≤ x′ ∧ x′ ≤ 3), and σ(S)(x′, x) := (x = x′ + 1 ∧ x ≤ 3).
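The witness can be sanity-checked by running the loop body: x = 3 is a fixpoint of x ↦ −2x + 9 while the guard stays true, so execution from x = 3 never leaves the recurrent set {3}. A small simulation:

```python
def body(x):
    return -2 * x + 9   # the loop body of Example 10

x = 3
for _ in range(10):
    assert x > 0        # the guard keeps holding
    x = body(x)
assert x == 3           # we never leave the recurrent set {3}
```

Other positive starts fall out of the guard after finitely many steps, e.g. 1 → 7 → −5.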

# **4 CounterExample-Guided Inductive Synthesis (CEGIS)**

We explain how CounterExample-Guided Inductive Synthesis [29] (CEGIS for short) works for a given pwCSP (C, R), following [31]. Then, we add the extraction of positive/negative examples to the CEGIS architecture, which enables our decision tree-based synthesizer to use a simplified form of examples.

CEGIS proceeds through the iterative interaction between a synthesizer and a validator (Fig. 1), in which they exchange examples and candidate solutions.

**Definition 11.** A formula φ is an *example* of C if *ftv*(φ) = ∅ and C |= φ hold. Given a set E of examples of C, a predicate substitution σ is a *candidate solution* for (C, R) that is consistent with E if σ is a solution for (E, R).

*Synthesizer.* The input to the synthesizer is a set E of examples of C collected from previous CEGIS iterations. The synthesizer tries to find a candidate solution σ consistent with E instead of a genuine solution for (C, R). If a candidate solution σ is found, it is passed to the validator. If E is unsatisfiable, then E witnesses the unsatisfiability of (C, R). Details of our synthesizer are described in Sect. 5.

*Validator.* The validator checks whether the candidate solution σ from the synthesizer is a genuine solution of (C, R) by using SMT solvers: it checks whether ¬σ(C) is satisfiable. If ¬σ(C) is unsatisfiable, then σ is a genuine solution of the original pwCSP (C, R), so the validator accepts it. Otherwise, the validator adds new examples to the set E of examples, and the synthesizer is invoked again with the updated set E.

If ¬σ(C) is satisfiable, new examples are constructed as follows. Using SMT solvers, the validator obtains an assignment θ to term variables such that |= ¬θ(ψ) holds for some ψ ∈ σ(C). By (4), ¬θ(ψ) is of the form

$$\neg\theta(\phi) \land \left(\bigwedge_{i=1}^{\ell} \neg\sigma(X_i)(\theta(\widetilde{t_i}))\right) \land \left(\bigwedge_{i=\ell+1}^{m} \sigma(X_i)(\theta(\widetilde{t_i}))\right).$$

To prevent this counterexample from being found again in the next CEGIS iteration, the validator adds the following example to E:

$$\bigvee_{i=1}^{\ell} X_i(\theta(\widetilde{t_i})) \vee \bigvee_{i=\ell+1}^{m} \neg X_i(\theta(\widetilde{t_i})) \tag{9}$$

The CEGIS architecture repeats this interaction between the synthesizer and the validator until a genuine solution for (C, R) is found or E witnesses unsatisfiability of (C, R).

*Extraction of Positive/Negative Examples.* Examples obtained as above are somewhat complex to handle in our decision tree-based synthesizer: each example in E is a disjunction (9) of literals, which may contain multiple predicate variables.

To simplify the form of examples, we extract from E the set E^+_X of *positive examples* (i.e., examples of the form X(ṽ)) and the set E^-_X of *negative examples* (i.e., examples of the form ¬X(ṽ)) for each X ∈ *fpv*(E). This allows us to synthesize a predicate σ(X) for each predicate variable X ∈ *fpv*(E) separately. For simplicity, we write ṽ ∈ E^+_X and ṽ ∈ E^-_X instead of X(ṽ) ∈ E^+_X and ¬X(ṽ) ∈ E^-_X.

The extraction is done as follows. We first substitute for each predicate variable application X(ṽ) in E a boolean variable b_{X(ṽ)} to obtain a SAT problem **SAT**(E). Then, we use SAT solvers to obtain an assignment η that is a solution for **SAT**(E). If a solution η exists, then we construct positive/negative examples from η; otherwise, E is unsatisfiable.
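A sketch of this extraction, with a brute-force satisfiability search standing in for a SAT solver; an example clause is a list of literals (X, args, polarity), and all names are illustrative.

```python
from itertools import product

def extract(examples):
    """Assign booleans b_{X(v)} satisfying every clause; split into E+/E-."""
    atoms = sorted({(X, a) for clause in examples for X, a, _ in clause})
    for bits in product([False, True], repeat=len(atoms)):
        eta = dict(zip(atoms, bits))
        if all(any(eta[(X, a)] == pol for X, a, pol in clause)
               for clause in examples):
            pos = {atom for atom, b in eta.items() if b}
            neg = {atom for atom, b in eta.items() if not b}
            return pos, neg
    return None  # E is unsatisfiable

# E with two clauses:  R(1,0)   and   (not R(0,1)) or R(2,2)
E = [[("R", (1, 0), True)], [("R", (0, 1), False), ("R", (2, 2), True)]]
pos, neg = extract(E)
assert ("R", (1, 0)) in pos
```

Unlike Definition 12, this sketch assigns every atom; atoms that do not affect satisfaction could be left out, as the text notes.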

**Definition 12.** Let η be a solution for **SAT**(E). For each predicate variable X ∈ *fpv*(E), we define the set E^+_X of *positive examples* and the set E^-_X of *negative examples* under the assignment η by E^+_X := {ṽ | η(b_{X(ṽ)}) = **true**} and E^-_X := {ṽ | η(b_{X(ṽ)}) = **false**}.

Note that some predicate variable applications X(ṽ) may be assigned neither true nor false because they do not affect the evaluation of **SAT**(E). Such predicate variable applications are discarded from {(E^+_X, E^-_X)}_{X∈*fpv*(E)}.

Our method uses the extraction of positive and negative examples when the validator passes examples to the synthesizer. If X ∈ *fpv*(E) ∩ R, then we apply our ranking function synthesizer to (E^+_X, E^-_X). If X ∈ *fpv*(E) \ R, then we apply an invariant synthesizer.

We say a candidate solution σ is consistent with {(E^+_X, E^-_X)}_{X∈*fpv*(E)} if |= σ(X)(ṽ⁺) and |= ¬σ(X)(ṽ⁻) hold for each predicate variable X ∈ *fpv*(E), ṽ⁺ ∈ E^+_X, and ṽ⁻ ∈ E^-_X. If a candidate solution σ is consistent with {(E^+_X, E^-_X)}_{X∈*fpv*(E)}, then σ is also consistent with E.

Note that unsatisfiability of {(E^+_X, E^-_X)}_{X∈*fpv*(E)} does not immediately imply unsatisfiability of E or of (C, R), because {(E^+_X, E^-_X)}_{X∈*fpv*(E)} depends on the choice of the assignment η. Therefore, the CEGIS architecture needs to be modified: if a synthesizer finds {(E^+_X, E^-_X)}_{X∈*fpv*(E)} unsatisfiable, then we add the negation of an unsatisfiability core to E to prevent the same assignment η from being used again.

Note that some restricted forms of (9) have also been considered in previous work; they are called implication examples in [17] and implication/negation constraints in [12]. Our extraction of positive and negative examples is applicable to the general form of (9).

# **5 Ranking Function Synthesis**

In this section, we describe one of our main contributions: our decision tree-based synthesizer, which synthesizes a candidate well-founded relation σ(R) from a finite set E^+_R of examples. We assume that only positive examples are given, because well-founded relations occur only positively in pwCSP for termination analysis (see Sect. 3). The aim of our synthesizer is to find a piecewise affine lexicographic ranking function f̃(x̃) for the given set E^+_R of examples. Below, we fix a predicate variable R ∈ R and omit the subscript, writing E^+ = E^+_R.

#### **5.1 Basic Definitions**

To represent piecewise affine lexicographic ranking functions, we use decision trees like the one in Fig. 4. Let x̃ = (x1, ..., xn) be the program variables, where each xi ranges over Z.

**Definition 13.** A *decision tree* D is defined by D ::= g̃(x̃) | **if** h(x̃) ≥ 0 **then** D **else** D, where g̃(x̃) = (gk(x̃), ..., g0(x̃)) is a tuple of affine functions and h(x̃) is an affine function. A *segmentation tree* S is a decision tree with undefined leaves ⊥: that is, S ::= ⊥ | **if** h(x̃) ≥ 0 **then** S **else** S. For each decision tree D, we can canonically assign a segmentation tree by replacing the label of each leaf with ⊥; this is denoted by S(D). For each decision tree D, we denote the corresponding piecewise affine function by f̃_D(x̃) : Z^n → Z^{k+1}.

Each leaf in a segmentation tree S corresponds to a polyhedron. We often identify the segmentation tree S with its set of leaves, and a leaf with the corresponding polyhedron. For example, we say something like "for each L ∈ S, ṽ ∈ L is a point in the polyhedron L".

Suppose we are given a segmentation tree <sup>S</sup> and a set <sup>E</sup><sup>+</sup> of examples.

**Definition 14.** For each L1, L2 ∈ S, we denote the set of example transitions from L1 to L2 by E^+_{L1,L2} := {(ṽ, ṽ′) ∈ E^+ | ṽ ∈ L1, ṽ′ ∈ L2}. An example (ṽ, ṽ′) ∈ E^+ is *crossing* w.r.t. S if (ṽ, ṽ′) ∈ E^+_{L1,L2} for some L1 ≠ L2, and *non-crossing* if (ṽ, ṽ′) ∈ E^+_{L,L} for some L.

**Definition 15.** We define the *dependency graph* G(S, E^+) for S and E^+ as the graph (V, E) whose vertices V = S are the leaves and whose edges E = {(L1, L2) | L1 ≠ L2, ∃(ṽ, ṽ′) ∈ E^+_{L1,L2}} are induced by crossing examples. We also consider the set of start and end points of examples, {ṽ | (ṽ, ṽ′) ∈ E^+} ∪ {ṽ′ | (ṽ, ṽ′) ∈ E^+}.

#### **5.2 Segmentation and (Explicit and Implicit) Cycles: One-Dimensional Case**

For simplicity, we first consider the case where f̃(x̃) = f(x̃) : Z^n → Z is a one-dimensional ranking function. Our aim is to find a ranking function f(x̃) for E^+, i.e., one satisfying ∀(ṽ, ṽ′) ∈ E^+. f(ṽ) > f(ṽ′) and ∀(ṽ, ṽ′) ∈ E^+. f(ṽ) ≥ 0. If our ranking function synthesizer finds such a ranking function f(x̃), then a candidate well-founded relation R_f is constructed as R_f(x̃, x̃′) := f(x̃) ≥ 0 ∧ f(x̃) > f(x̃′).

Our synthesizer builds a decision tree D to find a ranking function f_D(x̃) for E^+. The main question in doing so is: when and how should we refine the partitions of decision trees? To answer this question, we consider the case where there is no ranking function f_D(x̃) for E^+ with a fixed segmentation S, and classify the reasons into three cases as follows.

*Case 1: Explicit Cycles in Examples.* We define an *explicit cycle* in E^+ as a cycle in the graph (Z^n, E^+). An explicit cycle witnesses that there is no ranking function for E^+ (see, e.g., Example 4).

*Case 2: Non-crossing Examples Are Unsatisfiable.* The second case is when there is a leaf $L \in S$ such that no affine (not *piecewise* affine) ranking function exists for the set $\mathcal{E}^+_{L,L}$ of non-crossing examples. This prohibits the existence of a piecewise affine function $f_D(\widetilde{x})$ for $\mathcal{E}^+$ with segmentation $S = S(D)$ because the restriction of $f_D(\widetilde{x})$ to $L \in S$ must be an affine ranking function for $\mathcal{E}^+_{L,L}$.

*Case 3: Implicit Cycles in the Dependency Graph.* We define an *implicit cycle* as a cycle in the dependency graph $G(S, \mathcal{E}^+)$. Case 3 is the case where an implicit cycle prohibits the existence of piecewise affine ranking functions for $\mathcal{E}^+$ with the segmentation $S$ (e.g., Example 5). If Case 1 and Case 2 do not hold but no piecewise affine ranking function for $\mathcal{E}^+$ with the segmentation $S$ exists, then there must be an implicit cycle by (the contraposition of) the following proposition.

**Proposition 16.** *Assume* $\mathcal{E}^+$ *is a set of examples that does not contain explicit cycles (i.e., Case 1 does not hold). Let* $S$ *be a segmentation tree and assume that for each* $L \in S$*, there exists an affine ranking function* $f_L(\widetilde{x})$ *for* $\mathcal{E}^+_{L,L}$ *(i.e., Case 2 does not hold). If the dependency graph* $G(S, \mathcal{E}^+)$ *is acyclic, then there exists a decision tree* $D$ *with the segmentation* $S(D) = S$ *such that* $f_D(\widetilde{x})$ *is a ranking function for* $\mathcal{E}^+$*.*

*Proof.* By induction on the height (i.e., the length of a longest path from a vertex) of vertices in $G(S, \mathcal{E}^+)$. We construct a decision tree $D$ as follows. If the height of $L \in S$ is $0$, then we assign $f'_L(\widetilde{x}) := f_L(\widetilde{x})$ to the leaf $L$, where $f_L(\widetilde{x})$ is a ranking function for $\mathcal{E}^+_{L,L}$. If the height of $L \in S$ is $n > 0$, then we assign $f'_L(\widetilde{x}) := f_L(\widetilde{x}) + c$ to the leaf $L$, where $c \in \mathbb{Z}$ is a constant that satisfies $\forall (\widetilde{v}, \widetilde{v}') \in \mathcal{E}^+_{L,L'}.\ f_L(\widetilde{v}) + c > f'_{L'}(\widetilde{v}')$ for each cell $L'$ with height less than $n$. □
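The construction in the proof can be replayed on finite data: process the leaves of the acyclic dependency graph in order of increasing height, and add to each leaf's affine function a constant large enough to dominate all crossing examples into already-processed leaves. A small sketch (all names and data are our own toy illustration):

```python
# Replaying the proof of Proposition 16: two leaves L1, L2 with affine
# ranking functions valid on their non-crossing examples, plus one
# crossing example from L1 into L2.
f = {"L1": lambda v: v[0], "L2": lambda v: v[0]}
crossing = {("L1", "L2"): [((1, 0), (5, 1))]}   # a jump from L1 into L2

succ = {"L1": ["L2"], "L2": []}                 # acyclic dependency graph
def height(L):                                  # longest path starting at L
    return 0 if not succ[L] else 1 + max(height(M) for M in succ[L])

# Offsets c[L] must satisfy f[L](v) + c[L] > f[M](v') + c[M]
# for every crossing example (v, v') from L to M; lower heights first.
c = {}
for L in sorted(succ, key=height):
    need = [f[M](w) + c[M] + 1 - f[L](v)
            for (A, M), exs in crossing.items() if A == L
            for (v, w) in exs]
    c[L] = max([0] + need)

fD = lambda L, v: f[L](v) + c[L]               # the piecewise function on the tree
print(fD("L1", (1, 0)) > fD("L2", (5, 1)))     # True: the crossing example decreases
```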

Note that the converse of Proposition 16 does not hold: the existence of implicit cycles in $G(S, \mathcal{E}^+)$ does not necessarily imply that no piecewise affine ranking function exists with the segmentation $S$.

#### **5.3 Segmentation and (Explicit and Implicit) Cycles: Multi-Dimensional Lexicographic Case**

We consider the more general case where $\widetilde{f}(\widetilde{x}) = (f_k(\widetilde{x}), \dots, f_0(\widetilde{x}))$ is a multi-dimensional lexicographic ranking function and $k$ is a fixed nonnegative integer. Given a function $\widetilde{f}(\widetilde{x})$, we consider the well-founded relation $R_{\widetilde{f}}(\widetilde{x}, \widetilde{x}')$ defined inductively as follows.

$$R_{\langle\rangle}(\widetilde{x}, \widetilde{x}') := \bot \qquad R_{\langle f_k, \dots, f_0\rangle}(\widetilde{x}, \widetilde{x}') := f_k(\widetilde{x}) \ge 0 \land f_k(\widetilde{x}) > f_k(\widetilde{x}') \lor f_k(\widetilde{x}) = f_k(\widetilde{x}') \land R_{\langle f_{k-1}, \dots, f_0\rangle}(\widetilde{x}, \widetilde{x}') \tag{10}$$

Our aim here is to find a lexicographic ranking function $\widetilde{f}(\widetilde{x})$ for $\mathcal{E}^+$, i.e., a function $\widetilde{f}(\widetilde{x})$ such that $R_{\widetilde{f}}(\widetilde{v}, \widetilde{v}')$ holds for each $(\widetilde{v}, \widetilde{v}') \in \mathcal{E}^+$. Our synthesizer does so by building a decision tree. The same argument as in the one-dimensional case holds for lexicographic ranking functions.
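The inductive definition (10) can be transcribed directly into code. A minimal Python sketch (the component functions below are our own toy example):

```python
# Lexicographic well-founded relation R_<fk,...,f0> from Eq. (10):
# the tuple is ordered from the most significant component fk down to f0.
def R(fs, v, w):
    if not fs:                      # R_<> is false
        return False
    fk, rest = fs[0], fs[1:]
    return (fk(v) >= 0 and fk(v) > fk(w)) or (fk(v) == fk(w) and R(rest, v, w))

# Illustration: lexicographic descent on pairs.
f1 = lambda v: v[0]
f0 = lambda v: v[1]
print(R([f1, f0], (2, 5), (2, 3)))  # True: f1 ties, f0 decreases
print(R([f1, f0], (2, 5), (3, 0)))  # False: f1 increases
```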

**Theorem 17 (cycle detection).** *Assume* $\mathcal{E}^+$ *is a set of examples that does not contain explicit cycles. Let* $S$ *be a segmentation tree and assume that for each* $L \in S$*, there exists an affine function* $\widetilde{f}_L(\widetilde{x})$ *that satisfies* $\forall (\widetilde{v}, \widetilde{v}') \in \mathcal{E}^+_{L,L}.\ R_{\widetilde{f}_L}(\widetilde{v}, \widetilde{v}')$*. If the dependency graph* $G(S, \mathcal{E}^+)$ *is acyclic, then there exists a decision tree* $D$ *with the segmentation* $S(D) = S$ *such that* $R_{\widetilde{f}_D}(\widetilde{v}, \widetilde{v}')$ *holds for each* $(\widetilde{v}, \widetilde{v}') \in \mathcal{E}^+$*.*

*Proof.* The proof is almost the same as that of Proposition 16. Here, note that if $\widetilde{f}'(\widetilde{x}) = \widetilde{f}(\widetilde{x}) + \widetilde{c}$ where $\widetilde{c}$ is a tuple of nonnegative integer constants, then $R_{\widetilde{f}'}(\widetilde{x}, \widetilde{x}')$ subsumes $R_{\widetilde{f}}(\widetilde{x}, \widetilde{x}')$. □


#### **Algorithm 1** Synthesizing a well-founded relation by decision tree learning.

```
Input: a set E+ of examples, an integer k ≥ 0
Output: a well-founded relation R such that ∀(v, v') ∈ E+, R(v, v')
 1: if E+ has a cycle then
 2:     return unsatisfiable
 3: end if
 4: D := ResolveCase2(E+)
 5: while true do
 6:     C := GetConstraints(D, E+)
 7:     O := SumAbsParams(D)
 8:     ρ := Minimize(O, C)
 9:     if ρ is defined then
10:         f(x) := f_ρ(D)(x)
11:         return R_f
12:     else
13:         get an unsat core in C
14:         find an implicit cycle (v1, v'1), ..., (vl, v'l) in the unsat core
15:         find a cell C and two distinct points v'i, v(i+1) ∈ C in the implicit cycle
16:         add a halfspace to separate v'i and v(i+1) and update D
17:     end if
18: end while
```
#### **5.4 Our Decision Tree Learning Algorithm**

We design a concrete algorithm based on Theorem 17. It is shown in Algorithm 1 and consists of three phases. We shall describe the three phases one by one.

**Phase 1.** Phase 1 (Lines 1–3) detects explicit cycles in $\mathcal{E}^+$ to exclude Case 1. Here, we use a cycle detection algorithm for directed graphs.
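Phase 1 amounts to standard cycle detection on the directed graph whose edges are the example transitions. A small sketch (not MuVal's actual implementation):

```python
# Explicit-cycle detection (Phase 1): the examples form a directed graph
# on states; any directed cycle refutes the existence of a ranking function.
def has_cycle(examples):
    succ = {}
    for v, w in examples:
        succ.setdefault(v, []).append(w)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}
    def visit(v):
        color[v] = GRAY
        for w in succ.get(v, []):
            if color.get(w, WHITE) == GRAY:      # back edge: cycle found
                return True
            if color.get(w, WHITE) == WHITE and visit(w):
                return True
        color[v] = BLACK
        return False
    return any(color.get(v, WHITE) == WHITE and visit(v) for v in list(succ))

print(has_cycle([(1, 2), (2, 3)]))           # False
print(has_cycle([(1, 2), (2, 3), (3, 1)]))   # True
```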

**Phase 2.** Phase 2 (Line 4) detects and resolves Case 2 using ResolveCase2 (Algorithm 2), a function that grows a decision tree recursively. ResolveCase2 takes the non-crossing examples in a leaf, divides the leaf, and returns a *template tree* that is fine enough to avoid Case 2. Here, template trees are decision trees whose leaves are labeled by affine templates.

Algorithm 2 shows the details of ResolveCase2. ResolveCase2 builds a template tree recursively, starting from the trivial segmentation $S = \bot$ and all given examples. In each polyhedron, ResolveCase2 checks whether the set $C$ of constraints imposed by the non-crossing examples can be satisfied by an affine lexicographic ranking function on the polyhedron (Lines 2–3). If $C$ is not satisfiable, then ResolveCase2 chooses a halfspace $h(\widetilde{x}) \ge 0$ (Line 6) and divides the current polyhedron by the halfspace.

There is a certain amount of freedom in the choice of halfspaces. To guarantee termination of the whole algorithm, we require that the chosen halfspace $h$ separates at least one point in $\widetilde{\mathcal{E}}^+ := \{\widetilde{v} \mid (\widetilde{v}, \widetilde{v}') \in \mathcal{E}^+\} \cup \{\widetilde{v}' \mid (\widetilde{v}, \widetilde{v}') \in \mathcal{E}^+\}$ from the other points in $\widetilde{\mathcal{E}}^+$. That is:

#### **Algorithm 2** Resolving Case 2.

#### **Algorithm 3** Computing the quality measure of a halfspace.

```
 1: function QualityMeasure(h, E+)
 2:     E++ := {(v, v') ∈ E+ | h(v) ≥ 0 ∧ h(v') ≥ 0}
 3:     E+− := {(v, v') ∈ E+ | h(v) ≥ 0 ∧ h(v') < 0}
 4:     E−+ := {(v, v') ∈ E+ | h(v) < 0 ∧ h(v') ≥ 0}
 5:     E−− := {(v, v') ∈ E+ | h(v) < 0 ∧ h(v') < 0}
 6:     f := MakeAffineTemplate(k)
 7:     C+ := GetConstraints(f, E++);  C− := GetConstraints(f, E−−)
 8:     N+ := MaxSmt(C+);  N− := MaxSmt(C−)
 9:     return N+ + N− + (|E+−| + |E−+|)(1 − entropy(|E+−|, |E−+|))
10: end function
```

**Assumption 18.** If halfspace $h(\widetilde{x}) \ge 0$ is chosen in Line 6 of Algorithm 2, then there exist $\widetilde{v}, \widetilde{u} \in \widetilde{\mathcal{E}}^+$ such that $h(\widetilde{v}) \ge 0$ and $h(\widetilde{u}) < 0$.

We explain two strategies (eager and lazy) for choosing halfspaces that can be used to implement ChooseQualifier. Both are guaranteed to terminate and, moreover, are intended to yield simple decision trees.

*Eager Strategy.* In the eager strategy, we eagerly generate a finite set $H$ of halfspaces from the set $\widetilde{\mathcal{E}}^+$ of all example points beforehand and choose the best one from $H$ with respect to a certain quality measure. To satisfy Assumption 18, $H$ is generated so that any two points $\widetilde{u}, \widetilde{v} \in \widetilde{\mathcal{E}}^+$ can be separated by some halfspace $(h(\widetilde{x}) \ge 0) \in H$.

For example, we can use intervals $H = \{\pm(x_i - a_i) \ge 0 \mid i = 1, \dots, n \land (a_1, \dots, a_n) \in \widetilde{\mathcal{E}}^+\}$ and octagons $H = \{\pm(x_i - a_i) \pm (x_j - a_j) \ge 0 \mid i \neq j \land (a_1, \dots, a_n) \in \widetilde{\mathcal{E}}^+\}$, where $\widetilde{x} = (x_1, \dots, x_n)$. For any input of ResolveCase2, intervals and octagons satisfy $\emptyset \neq H' := \{h(\widetilde{x}) \ge 0 \mid \exists \widetilde{v}, \widetilde{u} \in \widetilde{\mathcal{E}}^+.\ h(\widetilde{v}) \ge 0 \land h(\widetilde{u}) < 0\}$, so Assumption 18 is satisfied by choosing the best halfspace from $H'$ with respect to the quality measure.

For each halfspace $(h(\widetilde{x}) \ge 0) \in H'$, we calculate QualityMeasure in Algorithm 3 and choose one that maximizes $\mathrm{QualityMeasure}(h, \mathcal{E}^+)$. QualityMeasure computes the sum of the maximum numbers of satisfiable constraints in the two leaves obtained by dividing with $h(\widetilde{x}) \ge 0$, plus an additional term $(|E^{+-}| + |E^{-+}|)(1 - \mathrm{entropy}(|E^{+-}|, |E^{-+}|))$, where $\mathrm{entropy}(x, y) = -\frac{x}{x+y}\log_2\frac{x}{x+y} - \frac{y}{x+y}\log_2\frac{y}{x+y}$. This additional term is close to $|E^{+-}| + |E^{-+}|$ if almost all examples in $E^{+-} \cup E^{-+}$ cross $h$ in the same direction, and close to $0$ if $|E^{+-}|$ is almost equal to $|E^{-+}|$.
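The entropy-based bonus term of QualityMeasure can be computed as follows (the MaxSMT counts $N^+$ and $N^-$ are omitted here; the function names are ours):

```python
import math

# The "direction bonus" of the quality measure:
#   (|E+-| + |E-+|) * (1 - entropy(|E+-|, |E-+|)),
# which rewards halfspaces crossed mostly in one direction.
def entropy(x, y):
    total = x + y
    h = 0.0
    for n in (x, y):
        if n > 0:
            p = n / total
            h -= p * math.log2(p)
    return h

def direction_bonus(n_pm, n_mp):
    if n_pm + n_mp == 0:
        return 0.0
    return (n_pm + n_mp) * (1 - entropy(n_pm, n_mp))

print(direction_bonus(8, 0))  # 8.0: all crossings in one direction
print(direction_bonus(4, 4))  # 0.0: balanced crossings give no bonus
```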

*Lazy Strategy.* In the lazy strategy, we generate halfspaces lazily. We divide the current polyhedron $C$ so that the non-crossing examples in the cell point in almost the same direction. First, we label the states occurring in $\mathcal{E}^+_{C,C}$ as follows. We find a direction that most examples in $C$ point to by solving the MAX-SMT problem $\widetilde{a} := \mathrm{argmax}_{a} \left|\{(\widetilde{v}, \widetilde{v}') \in \mathcal{E}^+_{C,C} \mid a \cdot (\widetilde{v}' - \widetilde{v}) > 0\}\right|$. For each $(\widetilde{v}, \widetilde{v}') \in \mathcal{E}^+_{C,C}$, we label the two points $\widetilde{v}, \widetilde{v}'$ with $+1$ if $\widetilde{a} \cdot (\widetilde{v}' - \widetilde{v}) > 0$ and with $-1$ otherwise.

Then we apply weighted C-SVM to generate a hyperplane that separates most of the positive points from the negative points. To guarantee termination of Algorithm 1, we avoid "useless" hyperplanes that classify all the points with the same label. If we obtain such a useless hyperplane, then we undersample the majority class and apply C-SVM again. By suitable undersampling, we eventually obtain linearly separable data with at least one positive point and one negative point.

Note that since the coefficients of hyperplanes extracted from C-SVM are floating-point numbers, we have to approximate them by hyperplanes with rational coefficients. This is done by truncating the continued fraction expansions of the coefficients at a suitable length.
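The truncation of a continued fraction expansion can be sketched as follows (a generic illustration; `cf_approx` is our own helper, and Python's `fractions` module offers the same idea via a denominator bound):

```python
from fractions import Fraction
import math

# Truncate the continued fraction expansion of x after `depth` terms
# and fold it back into a rational number.
def cf_approx(x, depth):
    terms = []
    for _ in range(depth):
        a = math.floor(x)
        terms.append(a)
        if x == a:
            break
        x = 1 / (x - a)
    val = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        val = a + 1 / val
    return val

print(cf_approx(0.3333333333, 2))  # 1/3
print(cf_approx(1.4142, 4))        # 17/12, a small-denominator approximation
# The standard library offers the same idea via a denominator bound:
print(Fraction(0.3333333333).limit_denominator(100))  # 1/3
```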

**Phase 3.** In Lines 5–18 of Algorithm 1, we further refine the segmentation $S(D)$ to resolve Case 3. Once Case 2 is resolved by ResolveCase2, it never holds again, even after refining $S(D)$ further. This enables us to separate Phases 2 and 3.

Given a template tree $D$, we consider the set $C$ of constraints on the parameters in $D$ asserting that $\widetilde{f}_D(\widetilde{x})$ is a ranking function for $\mathcal{E}^+$ (Line 6).

If $C$ is satisfiable, we use an SMT solver to obtain a solution of $C$ (i.e., an assignment $\rho$ of integers to the parameters) while minimizing the sum of the absolute values of the unknown parameters in $D$ (Line 8). This minimization is intended to yield a simple candidate ranking function. The solution $\rho$ is used to instantiate the template tree $D$ (Line 11).

If $C$ cannot be satisfied, there must be an implicit cycle in the dependency graph $G(S(D), \mathcal{E}^+)$ by Theorem 17. The implicit cycle can be found in an unsatisfiable core of $C$. We refine the segmentation of $D$ to cut the implicit cycle in Line 16. To guarantee termination, we choose a halfspace satisfying the following assumption, which is similar to Assumption 18.

**Assumption 19.** If halfspace $h(\widetilde{x}) \ge 0$ is chosen in Line 16 of Algorithm 1, then there exist $\widetilde{v}, \widetilde{u} \in \widetilde{\mathcal{E}}^+$ such that $h(\widetilde{v}) \ge 0$ and $h(\widetilde{u}) < 0$.

We have two strategies (eager and lazy) to refine the segmentation of $D$. In the eager strategy, we choose a halfspace $(h(\widetilde{x}) \ge 0) \in H$ that separates the two distinct points $\widetilde{v}'_i$ and $\widetilde{v}_{i+1}$ in the implicit cycle. In doing so, we want to reduce the number of implicit cycles in $G(S(D), \mathcal{E}^+)$, but adding a new halfspace may introduce new implicit cycles if there exists $(\widetilde{v}, \widetilde{v}') \in \mathcal{E}^+_{C,C}$ that crosses the new border from the side of $\widetilde{v}'_i$ to the side of $\widetilde{v}_{i+1}$. Therefore, we choose a halfspace that minimizes the number of new crossing examples. In the lazy strategy, we use an SMT solver to find a halfspace $h(\widetilde{x}) \ge 0$ that separates $\widetilde{v}'_i$ and $\widetilde{v}_{i+1}$ and minimizes the number of new crossing examples.

**Termination.** Assumptions 18 and 19 guarantee that every leaf in $S(D)$ contains at least one point of the finite set $\widetilde{\mathcal{E}}^+$. Because the number of leaves in $S(D)$ strictly increases with each iteration of Phase 2 and Phase 3, in the worst case we eventually reach a segmentation $S(D)$ in which each $L \in S(D)$ contains exactly one point of $\widetilde{\mathcal{E}}^+$. Since we have excluded Case 1 at the beginning, Theorem 17 then guarantees the existence of a ranking function with the segmentation $S(D)$. Therefore, the algorithm terminates within $|\widetilde{\mathcal{E}}^+|$ refinements.

**Theorem 20.** *If Assumption 18 and Assumption 19 hold, then Algorithm 1 terminates. If Algorithm 1 returns a piecewise affine lexicographic function* $\widetilde{f}(\widetilde{x})$*, then* $R_{\widetilde{f}}(\widetilde{v}, \widetilde{v}')$ *holds for each* $(\widetilde{v}, \widetilde{v}') \in \mathcal{E}^+$*, where* $\mathcal{E}^+$ *is the input of the algorithm.*

#### **5.5 Improvement by Degenerating Negative Values**


There is another way to define a well-founded relation from the tuple $\widetilde{f}(\widetilde{x}) = (f_k(\widetilde{x}), \dots, f_0(\widetilde{x}))$ of functions, namely the well-founded relation $R'_{\widetilde{f}}(\widetilde{x}, \widetilde{x}')$ defined inductively by $R'_{\langle\rangle}(\widetilde{x}, \widetilde{x}') := \bot$ and $R'_{\langle f_k, \dots, f_0\rangle}(\widetilde{x}, \widetilde{x}') := f_k(\widetilde{x}) \ge 0 \land f_k(\widetilde{x}) > f_k(\widetilde{x}') \lor (f_k(\widetilde{x}') < 0 \lor f_k(\widetilde{x}) = f_k(\widetilde{x}')) \land R'_{\langle f_{k-1}, \dots, f_0\rangle}(\widetilde{x}, \widetilde{x}')$. In this definition, we loosen the equality $f_i(\widetilde{x}) = f_i(\widetilde{x}')$ of the usual lexicographic ordering (10) to $f_i(\widetilde{x}') < 0 \lor f_i(\widetilde{x}) = f_i(\widetilde{x}')$ (where $i = 1, \dots, k$). This means that once $f_i$ becomes negative, it must stay negative but its value does not have to be the same, which helps the synthesizer avoid complex candidate lexicographic ranking functions and thus improves performance.

However, if we use this well-founded relation $R'_{\widetilde{f}}(\widetilde{x}, \widetilde{x}')$ instead of $R_{\widetilde{f}}(\widetilde{x}, \widetilde{x}')$ of (10), then Theorem 17 fails because $R'_{\widetilde{f}}(\widetilde{x}, \widetilde{x}')$ is not necessarily subsumed by $R'_{\widetilde{f}+\widetilde{c}}(\widetilde{x}, \widetilde{x}')$, where $\widetilde{c} = (c_k, \dots, c_0)$ is a tuple of nonnegative constants (see the proofs of Proposition 16 and Theorem 17). As a result, it may happen that no implicit cycle can be found in Line 14 of Algorithm 1. Therefore, when we use $R'_{\widetilde{f}}(\widetilde{x}, \widetilde{x}')$, we modify Algorithm 1 so that if no implicit cycle can be found in Line 14, we fall back on the original definition $R_{\widetilde{f}}(\widetilde{x}, \widetilde{x}')$ and restart Algorithm 1.
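The relaxed relation can be transcribed as follows (a sketch; under our reading of the definition, the loosened condition lets a component that has gone negative change value as long as it stays negative):

```python
# Relaxed lexicographic relation R'_<fk,...,f0>: the equality in the
# recursive case of Eq. (10) is loosened to "stays negative or stays equal".
def R_relaxed(fs, v, w):
    if not fs:
        return False
    fk, rest = fs[0], fs[1:]
    return ((fk(v) >= 0 and fk(v) > fk(w))
            or ((fk(w) < 0 or fk(v) == fk(w)) and R_relaxed(rest, v, w)))

f1 = lambda v: v[0]
f0 = lambda v: v[1]
# f1 is negative on both states and changes value; f0 decreases.
print(R_relaxed([f1, f0], (-2, 5), (-7, 4)))  # True
# The strict relation of Eq. (10) would reject this step, since f1(v) != f1(w).
```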

# **6 Implementation and Evaluation**

*Implementation.* We implemented a constraint solver MuVal that supports invariant synthesis and ranking function synthesis. For invariant synthesis, we apply ordinary decision tree learning (see [12,14,18,22,36] for existing techniques). For ranking function synthesis, we implemented the algorithm in Sect. 5 with both the eager and lazy strategies for halfspace selection. Our synthesizer uses the well-founded relation explained in Sect. 5.5. Given a benchmark, we run our solver for both termination and non-termination verification in parallel, and when one of the two returns an answer, we stop the other and use that answer. MuVal is written in OCaml and uses Z3 as its backend SMT solver. We used clang and llvm2kittel [1] to convert C benchmarks to T2 [3] format files, which are then translated to pwCSP by MuVal.

*Experiments.* We evaluated our implementation MuVal on C benchmarks from Termination Competition 2020 (C Integer) [4]. We compared our tool with AProVE [10,13], iRankFinder [7], and Ultimate Automizer [21]. Experiments are conducted on StarExec [2] (CentOS 7.7 (1908) on Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz (2393 MHZ) with 263932744 kB main memory). The time limit was 300 s.

*Results.* The results are shown in Table 2. Yes/No/TO/U denote the numbers of benchmarks for which a tool verified termination, verified non-termination, could not answer within 300 s and timed out (TimeOut), or gave up before 300 s (Unknown), respectively. We also show scatter plots of runtimes in Fig. 7.

**Table 2.** Numbers of solved benchmarks


<sup>a</sup>We removed one benchmark from the result of iRankFinder because the answer was wrong.

MuVal was able to solve more benchmarks than Ultimate Automizer. Compared to iRankFinder, MuVal solved slightly fewer benchmarks but was faster on many of them: 265 benchmarks were solved faster by MuVal, 68 by iRankFinder, and 2 were solved by neither tool within 300 s (here, we regard U (unknown) as 300 s). Compared to AProVE, MuVal solved fewer benchmarks. However, there are several benchmarks that MuVal could solve but AProVE could not. Among them is "TelAviv-Amir-Minimum true-termination.c", which does require piecewise affine ranking functions. MuVal found the ranking function $f(x, y) = \textbf{if}\ x - y \ge 0\ \textbf{then}\ y\ \textbf{else}\ x$, while AProVE timed out.

We also observed that using CEGIS with transition examples itself showed its strengths even for benchmarks that do not require piecewise affine ranking functions. Notably, there are three benchmarks that MuVal could solve but the other tools could not; they are examples that do not require segmentations. Further analysis of these benchmarks indicates the following strengths of our framework: (1) the ability to handle nonlinear constraints (to some extent) thanks to

**Fig. 7.** Scatter plots of runtime. Ultimate Automizer and AProVE sometimes gave up before the time limit, and such cases are regarded as 300s.

the example-based synthesis and the recent development of SMT solvers; and (2) the ability to find a long lasso-shaped non-terminating trace assembled from multiple transition examples. See [23, Appendix A] for details.

# **7 Related Work**

There is a large body of work on synthesizing ranking functions via constraint solving. Among these are counterexample-guided methods such as CEGIS [29]. CEGIS is sound but not guaranteed to be complete in general: even if a given constraint has a solution, CEGIS may fail to find it. A complete method for ranking function synthesis is proposed in [19], which collects only extremal counterexamples instead of arbitrary transition examples to avoid handling infinitely many examples. A limitation of that method is that its search space is restricted to (lexicographic) affine ranking functions.

Another counterexample-guided method is proposed in [33] and implemented in SeaHorn. This method can synthesize piecewise affine functions, but its approach is quite different from ours. Given a program, it constructs a *safety* property stating that the number of loop iterations does not exceed the value of a candidate ranking function. The safety property is checked by a verifier. If it is violated, a trace is obtained as a counterexample and the candidate ranking function is updated accordingly. The main difference from our method is that their method uses trace examples whereas ours uses transition examples (which are less expensive to handle). FreqTerm [15] also uses the connection to safety properties, but exploits syntax-guided synthesis for synthesizing ranking functions.

Aside from counterexample-guided methods, constraint solving has been widely studied for affine ranking functions [27], lexicographic affine ranking functions [5,7,24], and multiphase affine ranking functions [6,8]. Implementations include RankFinder and iRankFinder. Farkas' lemma or Motzkin's transposition theorem is often used as a tool to transform ∃∀-constraints into ∃-constraints. However, when this technique is applied to piecewise affine ranking functions, the result is nonlinear constraints [24].

Abstract interpretation has also been applied to segmented synthesis of ranking functions and is implemented in FuncTion [32,34,35]. In this series of work, a decision tree representation of ranking functions is used in [35] for better handling of disjunctions. Compared to their work, we believe that our method is more easily extensible to theories other than linear integer arithmetic, as long as those theories are supported by SMT solvers (although such extensions are out of the scope of this paper).

Other state-of-the-art termination verifiers include the following. Ultimate Automizer [21] is automata-based: it repeatedly finds a trace and computes a termination argument that covers the trace, until the termination arguments cover the set of all traces. Büchi automata are used to handle such traces. AProVE [10,13] is based on term rewriting systems.

# **8 Conclusions and Future Work**

In this paper, we proposed a novel decision tree-based synthesizer for ranking functions, which is integrated into the CEGIS architecture. The key observation here was that we need to cope with explicit and implicit cycles contained in given examples. We designed a decision tree learning algorithm using the theoretical observation of the cycle detection theorem. We implemented the framework and observed that its performance is comparable to state-of-the-art termination analyzers. In particular, it solved three benchmarks that no other tool solved, a result that demonstrates the potential of the current combination of CEGIS, segmented synthesis, and transition examples.

We plan to extend our ranking function synthesizer to a synthesizer of piecewise affine ranking supermartingales. Ranking supermartingales [11] are a probabilistic counterpart of ranking functions and are used for verifying almost-sure termination of probabilistic programs.

We also plan to implement a mechanism to automatically select a suitable set of halfspaces with which decision trees are built. In our ranking function synthesizer, intervals/octagons/octahedra/polyhedra can be used as the set of halfspaces. However, selecting an overly expressive set of halfspaces may cause overfitting [25] and result in poor performance. Therefore, heuristics that adjust the expressiveness of halfspaces based on the current examples may improve the performance of our tool.

**Acknowledgement.** We thank Andrea Peruffo and the anonymous referees for many suggestions. This work was supported by JST ERATO HASUO Metamathematics for Systems Design Project (No. JPMJER1603) and JSPS KAKENHI Grant Numbers 20H04162, 20H05703, 19H04084, and 17H01720.

# **References**



# **ATLAS: Automated Amortised Complexity Analysis of Self-adjusting Data Structures**

Lorenz Leutgeb2(B), Georg Moser<sup>1</sup>, and Florian Zuleger<sup>2</sup>

<sup>1</sup> Department of Computer Science, Universität Innsbruck, Innsbruck, Austria <sup>2</sup> Institute of Logic and Computation 192/4, Technische Universität Wien, Vienna, Austria lorenz@leutgeb.xyz

**Abstract.** Being able to argue about the performance of self-adjusting data structures such as splay trees has been a main objective since Sleator and Tarjan introduced the notion of *amortised* complexity.

Analysing these data structures requires sophisticated potential functions, which typically contain logarithmic expressions. Possibly for these reasons, and despite the recent progress in automated resource analysis, they have so far eluded automation. In this paper, we report on the first fully-automated amortised complexity analysis of self-adjusting data structures. Following earlier work, our analysis is based on potential function templates with unknown coefficients.

We make the following contributions: 1) We encode the search for concrete potential function coefficients as an optimisation problem over a suitable constraint system. Our target function steers the search towards coefficients that minimise the inferred amortised complexity. 2) Automation is achieved by using a linear constraint system in conjunction with suitable lemmata schemes that encapsulate the required non-linear facts about the logarithm. We discuss our choices that achieve a scalable analysis. 3) We present our tool ATLAS and report on experimental results for *splay trees*, *splay heaps* and *pairing heaps*. We completely automatically infer complexity estimates that match previous results (obtained by sophisticated pen-and-paper proofs), and in some cases even infer better complexity estimates than previously published.

**Keywords:** Amortised cost analysis · Functional programming · Self-adjusting data structures · Automation · Constraint solving

# **1 Introduction**

Amortised analysis, as introduced by Sleator and Tarjan [47,49], is a method for the worst-case cost analysis of data structures. The innovation of amortised analysis lies in considering the cost of a single data structure operation as part of a sequence of data structure operations. The methodology of amortised analysis allows one to assign a low (e.g., constant or logarithmic) amortised cost to a data structure operation even though the worst-case cost of a single operation might be high (e.g., linear, polynomial or worse). The setup of amortised analysis guarantees that for a sequence of data structure operations the worst-case cost is indeed the number of data structure operations times the amortised cost. In this way amortised cost analysis provides a methodology for worst-case cost analysis. Notably, the cost analysis of self-adjusting data structures, such as splay trees, has been a main objective already in the initial proposal of amortised analysis [47,49]. Analysing these data structures requires sophisticated potential functions, which typically contain logarithmic expressions. Possibly for these reasons, and despite the recent progress in automated complexity analysis, they have so far eluded automation.
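The accounting behind amortised analysis can be made concrete with the textbook potential method: defining the amortised cost of an operation as its actual cost plus the potential difference, the total actual cost of any operation sequence telescopes to at most the total amortised cost. Below is a generic illustration with a dynamic array (our own example, not the splay-tree potentials analysed in this paper):

```python
# Potential-method accounting for a dynamic array: doubling on overflow
# costs |array| copies, yet the amortised cost per push is constant with
# potential Phi = 2*size - capacity.
def phi(size, cap):
    return 2 * size - cap

size, cap, total_actual, total_amortised = 0, 1, 0, 0
for _ in range(100):
    actual = 1 if size < cap else size + 1   # copy everything, then insert
    new_cap = cap if size < cap else 2 * cap
    total_actual += actual
    total_amortised += actual + phi(size + 1, new_cap) - phi(size, cap)
    size, cap = size + 1, new_cap

print(total_actual <= total_amortised)  # True: amortised cost bounds actual cost
print(total_amortised / 100)            # 3.0: constant amortised cost per push
```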

In this paper, we present the first fully-automated amortised cost analysis of self-adjusting data structures, that is, of *splay trees*, *splay heaps* and *pairing heaps*, which so far have only (semi-) manually been analysed in the literature. We implement and extend a recently proposed type-and-effect system for amortised resource analysis [26,27]. This system belongs to a line of work (see [20,22– 25,28] and the references therein), where types are template potential functions with unknown coefficients and the type-and-effect system extracts constraints over these coefficients in a syntax directed way from the program under analysis. Our work improves over [26,27] in three regards: 1) The approach of [26,27] only supports *type checking*, i.e. verifying that a manually provided type is correct. In this paper, we add an optimisation layer to the set-up of [26,27] in order to support *type inference*, i.e. our approach does not rely on manual annotations. Our target function steers the search towards coefficients that minimise the inferred amortised complexity. 2) The only case study of [26,27] is partial, focusing on the zig-zig case of the splay tree function splay, while we report on the full analysis of the operations of several data structures. 3) [26,27] does not report on a fully-automated analysis. Besides the requirement that the user needs to provide the resource annotation, the user also has to apply the structural rules of the type system manually. Our tool ATLAS is able to analyse our benchmarks fully automatically. Achieving full automation required substantial implementation effort as the structural rules need to be applied carefully—as we learned during our experiments—in order to avoid a size explosion of the generated constraint system. We evaluate and discuss our design choices that lead to a scalable implementation.

With our implementation and the obtained experimental results we make two contributions to the complexity analysis of data structures:

*1.) We automatically infer complexity estimates that match previous results (obtained by sophisticated pen-and-paper proofs), and in some cases even infer better complexity estimates than previously published.* In Table 1, we state the complexity bounds computed by ATLAS next to results from the literature. We match or improve the results from [37,41,42]. To the best of our knowledge, the bounds for splay trees and splay heaps represent the state-of-the-art. In particular, we improve the bound for the delete function of splay trees and all bounds for the splay heap functions. For pairing heaps, Iacono [29,30] has proven (using a more involved potential function) that insert and merge have constant amortised complexity,


**Table 1.** Amortised complexity bounds for splay trees (module name SplayTree, abbrev. ST), splay heaps (SplayHeap, SH) and pairing heaps (PairingHeap, PH).

<sup>a</sup>[42] uses a different cost metric, i.e. the number of arithmetic comparisons, whereas we and [37] count the number of (recursive) function applications. We adapted the results of [42] to our cost metric to make the results easier to compare, i.e. the coefficients of the logarithmic terms are smaller by a factor of 2 compared to [42].

while the other data structure operations continue to have an amortised complexity of k log2(|t|); while we leave an automated analysis based on Iacono's potential function for future work, we note that his coefficients k in the logarithmic terms are large, and that therefore the small coefficients in Table 1 are still of interest. We will detail below that we used a simpler potential function than [37,41,42] to obtain our results. Hence the new proofs of the confirmed complexity bounds can themselves be considered a contribution.

*2.) We establish a new approach for the complexity analysis of data structures.* Establishing the prior results in Table 1 required considerable effort. Schoenmakers studied in his PhD thesis [42] the best amortised complexity bounds that can be obtained using a parameterised potential function $\varphi(t)$, where $t$ is a binary tree, defined by $\varphi(\mathsf{leaf}) := 0$ and $\varphi(\langle l, d, r\rangle) := \varphi(l) + \beta \log_\alpha(|l| + |r|) + \varphi(r)$, for real-valued parameters $\alpha, \beta > 0$. Carrying out a sophisticated optimisation with pen and paper, he concluded that the best bounds are obtained by setting $\alpha = \sqrt[3]{4}$ and $\beta = \frac{1}{3}$ for splay trees, and by setting $\alpha = \sqrt{2}$ and $\beta = \frac{1}{2}$ for pairing heaps (splay heaps were proposed only some years later by Okasaki in [38]). Brinkop and Nipkow verified his complexity results for splay trees in the theorem prover Isabelle [37]. They note that manipulating the expressions corresponding to $\beta \log_\alpha(|t|)$ could only partly be automated<sup>1</sup>.

<sup>1</sup> Nipkow et al. [37] state "The proofs in this subsection require highly nonlinear arithmetic. Only some of the polynomial inequalities can be automated with Harrison's sum-of-squares method [16]".

For splay heaps, there is to the best of our knowledge no previous attempt to optimise the obtained complexity bounds, which might explain why our optimising analysis was able to improve all bounds. For pairing heaps, Brinkop and Nipkow did not use the optimal parameters reported by Schoenmakers—probably in order to avoid reasoning about polynomial inequalities—which explains the worse complexity bounds. In contrast to the discussed approaches, we were able to verify and improve the previous results fully automatically. Our approach uses a variation of Schoenmakers' potential function, where we roughly fix α = 2 and leave β as a parameter for the optimisation phase (see Sect. 2 for more details). Despite this choice, our approach was able to derive bounds that match and improve the previous results, which came as a surprise to us. Looking back at our experiments and interpreting the obtained results, we recognise that we might have been lucky with the particular choice of the potential function (because we can obtain the previous results despite fixing α = 2). However, we would not have expected that an automated analysis is able to match and improve all previously reported coefficients, which shows the power of the optimisation phase. *Thus, we believe that our results suggest a new approach for the complexity analysis of data structures.* So far, self-adjusting data structures had to be analysed manually. This is possibly due to the use of sophisticated potential functions, which may contain logarithmic expressions. Both features are challenging for automated reasoning. Our results suggest the following alternative (see Sects. 2 and 4.2 for more details): (i) fix a parameterised potential function; (ii) derive a (linear) constraint system over the function parameters from the AST of the program; (iii) capture the required non-linear reasoning in lemmata, and use Farkas' lemma to integrate the application of these lemmata into the constraint system (in our case two lemmata, one about an arithmetic property and one about the monotonicity of the logarithm, were sufficient for all of our benchmarks); and finally (iv) find values for the parameters with an (optimising) constraint solver. We believe that our approach will carry over to other data structures: one needs to adapt the potential functions and add suitable lemmata, but the overall set-up will be the same. We compare the proposed methodology to program synthesis by sketching [48], where the synthesis engineer communicates her main insights to the synthesis engine (in our case the potential functions plus suitable lemmata), and a constraint solver then fills in the details. As a conclusion from our benchmarking, we observe that an automated analysis of sophisticated data structures is possible without the need to (i) resort to user guidance; (ii) forfeit optimal results; or (iii) be bogged down in computation times. These results also show how dependencies on functional-correctness properties of the code can be circumvented.

*Related Work.* To the best of our knowledge, the automated amortised analysis of self-adjusting data structures presented here is novel and unparalleled in the literature. However, there is a vast amount of literature on (automated) resource analysis. Without any claim to completeness, we briefly mention [1–7,9–11,14,15,17,18,20,22–25,39,44–46,52] for an overview of the field. Logarithmic and sublinear bounds are typically not in the focus of the cited approaches, but can be inferred by some tools. In the recurrence-relation-based approach to cost analysis [1], refinements of linear ranking functions are combined with criteria for divide-and-conquer patterns; this allows the tool PUBS to recognise logarithmic bounds for some problems, but examples such as *mergesort* or *splaying* are beyond the scope of this approach. Logarithmic and exponential terms are integrated into the synthesis of ranking functions in [8], making use of an insightful adaptation of Farkas' and Handelman's lemmas. The approach is able to handle examples such as *mergesort*, but is again not suitable for self-balancing data structures. A type-based approach to cost analysis for an ML-like language is presented in [50], which uses the Master Theorem to handle divide-and-conquer-like recurrences. Recently, support for the Master Theorem was also integrated into the analysis of rewriting systems [51], extending [4] on the modular resource analysis of rewriting to so-called logically constrained rewriting systems [12]. The resulting approach also supports the fully automated analysis of *mergesort*.

*Structure.* In Sects. 2 and 3 we review the type system of [26,27]. We sketch the challenges to automation in Sect. 4 and present our contributions in Sects. 5 and 6. Finally, we conclude in Sect. 7.

### **2 Step by Step to an Automated Analysis of Splaying**

In this and the next section we sketch the theory developed by Hofmann et al. in [27], in order to be able to present the contributions of this article in Sects. 4 and 5. For brevity, we restrict our exposition to those parts essential to the analysis of a particular program code. As motivating example, consider *splay trees*, introduced by Sleator and Tarjan [47,49]. *Splaying* is the most important operation on splay trees, which performs rotations. Consider Fig. 1, a depiction of the zig-zig case of the function splay, which implements *splaying*.

The analysis of [27] (see also [26]) is formulated in terms of the physicist's method of amortised analysis in the style of Sleator and Tarjan [47,49]. The central idea of this approach is to assign a *potential* to the data structures of interest such that the difference in potential before and after executing a function is sufficient to pay for the actual cost of the function, i.e. one chooses potential functions $\varphi, \psi$ such that $\varphi(v) \geqslant c_f(v) + \psi(f(v))$ holds for all inputs $v$ to a function $f$, where $c_f(v)$ denotes the *worst-case cost* of executing function $f$ on $v$. This generalises the original formulation, which can be seen by setting $\varphi(v) := a_f(v) + \psi(v)$, where $a_f(v)$ denotes the *amortised cost* of $f$.

In order to be able to analyse self-adjusting data structures such as splay trees, one needs potential functions that can express *logarithmic* amortised cost. Hofmann et al. [26,27] propose to make use of a variant of Schoenmakers' potential, rk(t) for a tree t, cf. [37,41,42], defined inductively by

$$\text{rk}(\mathsf{leaf}) := 1 \qquad \text{rk}(\langle l,\, d,\, r\rangle) := \text{rk}(l) + \log_2(|l|) + \log_2(|r|) + \text{rk}(r)\,,$$

where l, r are the left resp. right child of the tree (l, d, r), |t| denotes the size of a tree (defined as the number of leaves of the tree), and d is some

```
splay a t = match t with
| (cl, c, cr) -> match cl with
  | (bl, b, br) -> let s = splay a bl in match s with
    | (al, a', ar) -> (al, a', (ar, b, (br, c, cr)))
```
**Fig. 1.** Zig-zig case of the splay function.

data element that is ignored by the potential function. Besides Schoenmakers' potential, further basic potential functions need to be added to the analysis: for a sequence of $m$ trees $t_1,\ldots,t_m$ and coefficients $a_i, b \in \mathbb{N}$, the potential function

$$p\_{(a\_1, \ldots, a\_m, b)}(t\_1, \ldots, t\_m) := \log\_2(a\_1 \cdot |t\_1| + \cdots + a\_m \cdot |t\_m| + b)$$

denotes the logarithm of a linear combination of the sizes of the trees.
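To make the definitions concrete, here is a small Python sketch (our own illustration, not part of ATLAS) of the two families of potential functions, representing a leaf as `None` and a node as a triple `(l, d, r)`:

```python
import math

def size(t):
    """|t|: the number of leaves of t (a leaf is represented by None)."""
    return 1 if t is None else size(t[0]) + size(t[2])

def rk(t):
    """Variant of Schoenmakers' potential, with rk(leaf) = 1."""
    if t is None:
        return 1
    l, _, r = t
    return rk(l) + math.log2(size(l)) + math.log2(size(r)) + rk(r)

def p(coeffs, b, trees):
    """p_(a1,...,am,b): log2 of a linear combination of tree sizes
    (using the paper's convention log2(0) = 0)."""
    s = sum(a * size(t) for a, t in zip(coeffs, trees)) + b
    return math.log2(s) if s > 0 else 0.0
```

For instance, a single node `(None, d, None)` has two leaves, so `rk` evaluates to $1 + 0 + 0 + 1 = 2$, and `p([0], 2, [t])` is the constant function 1.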

Following [37], we set the cost $c_{\texttt{splay}}(t)$ of splaying a tree $t$ to be the number of recursive calls to splay. Splaying and all operations that depend on splaying can be done in $\mathcal{O}(\log_2 n)$ amortised cost. Employing the above introduced potential functions, the analysis of [27] is able to verify the following cost annotation for splaying (the annotation needs to be provided by the user):

$$\text{rk}(t) + 3 \cdot p_{(1,0)}(t) + 1 \geqslant c_{\texttt{splay}}(t) + \text{rk}(\texttt{splay a t})\,. \tag{1}$$

From this result, one directly reads off $3 \cdot p_{(1,0)}(t) + 1 = 3 \cdot \log_2(|t|) + 1$ as a bound on the amortised cost of splaying.<sup>2</sup>
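The cost annotation (1) can also be checked empirically. The following Python sketch is our own construction: it assumes the standard recursive formulation of splaying (of which Fig. 1 shows only the zig-zig case) and counts recursive calls as the cost, per the metric above, then asserts the inequality on random trees:

```python
import math
import random

def size(t):
    """|t|: the number of leaves of t (None is a leaf)."""
    return 1 if t is None else size(t[0]) + size(t[2])

def rk(t):
    """Variant of Schoenmakers' potential, rk(leaf) = 1."""
    if t is None:
        return 1
    l, _, r = t
    return rk(l) + math.log2(size(l)) + math.log2(size(r)) + rk(r)

def splay(a, t):
    """Splay key a towards the root; returns (tree, number of recursive calls)."""
    if t is None:
        return t, 0
    l, b, r = t
    if a == b:
        return t, 0
    if a < b:
        if l is None:
            return t, 0
        ll, c, lr = l
        if a == c or (a < c and ll is None):
            return (ll, c, (lr, b, r)), 0                # zig
        if a < c:                                        # zig-zig (Fig. 1)
            (t1, a2, t2), k = splay(a, ll)
            return (t1, a2, (t2, c, (lr, b, r))), k + 1
        if lr is None:
            return (ll, c, (lr, b, r)), 0                # zig
        (t1, a2, t2), k = splay(a, lr)                   # zig-zag
        return ((ll, c, t1), a2, (t2, b, r)), k + 1
    # a > b: mirror image of the above
    if r is None:
        return t, 0
    rl, c, rr = r
    if a == c or (a > c and rr is None):
        return ((l, b, rl), c, rr), 0                    # zig
    if a > c:                                            # zig-zig
        (t1, a2, t2), k = splay(a, rr)
        return (((l, b, rl), c, t1), a2, t2), k + 1
    if rl is None:
        return ((l, b, rl), c, rr), 0                    # zig
    (t1, a2, t2), k = splay(a, rl)                       # zig-zag
    return ((l, b, t1), a2, (t2, c, rr)), k + 1

def insert(a, t):
    """Plain BST insertion, used only to build test inputs."""
    if t is None:
        return (None, a, None)
    l, b, r = t
    if a < b:
        return (insert(a, l), b, r)
    if a > b:
        return (l, b, insert(a, r))
    return t

def inorder(t):
    return [] if t is None else inorder(t[0]) + [t[1]] + inorder(t[2])

random.seed(42)
keys = random.sample(range(10000), 100)
t = None
for k in keys:
    t = insert(k, t)
for a in random.sample(keys, 30):
    s, cost = splay(a, t)
    assert s[1] == a and inorder(s) == inorder(t)
    # Eq. (1): rk(t) + 3 * p_(1,0)(t) + 1 >= c_splay(t) + rk(splay a t)
    assert rk(t) + 3 * math.log2(size(t)) + 1 >= cost + rk(s) - 1e-9
    t = s
```

On such runs the potential difference indeed pays for the recursive calls; this is, of course, only a sanity check, not a proof.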

Building on earlier work [6,20,22–25,28], [27] employs a *type-and-effect system* that uses *template potential functions*, i.e. functions of a fixed shape with indeterminate coefficients. The key challenge is to identify templates that are suitable for logarithmic analysis and that are closed under the basic operations of the considered programming language. For example, one introduces the coefficients $q_*, q_{(1,0)}, q_{(0,2)}, q'_*, q'_{(1,0)}, q'_{(0,2)}$ and the potential function templates

$$\begin{aligned} \Phi(t\colon \mathsf{T}|Q) &:= q_* \cdot \text{rk}(t) + q_{(1,0)} \cdot p_{(1,0)}(t) + q_{(0,2)} \cdot p_{(0,2)}(t)\\ \Phi(\texttt{splay a t}\colon \mathsf{T}|Q') &:= q'_* \cdot \text{rk}(\texttt{splay a t}) + q'_{(1,0)} \cdot p_{(1,0)}(\texttt{splay a t}) + q'_{(0,2)} \cdot p_{(0,2)}(\texttt{splay a t})\,, \end{aligned}$$

for the input and output of the splay function. The type system then derives constraints on the template function coefficients, as indicated in the sequel. We take up further discussion of the constraint system, in particular how to maintain a scalable analysis, in Sect. 4.

We explain the use of the type system on the motivating example. For brevity, type judgements and the type rules are presented in a simplified form. In particular, we restrict our attention to tree types, denoted as T. This omission is inessential to the actual complexity analysis. For the full set of rules see [27].

<sup>2</sup> For ease of presentation, we elide the underlying semantics for now and simply write "splay a t" for the resulting tree $t'$, obtained after evaluating splay a t.

[Derivation tree for splay, read bottom-up from the judgement $t\colon \mathsf{T}|Q \vdash \texttt{match t with} \mid \texttt{(cl,c,cr) -> } e_1 \colon \mathsf{T}|Q'$: the rules (match), (w), (let : T), (match) and (app) are applied in turn, introducing the intermediate annotations $Q_1, Q_2, Q_3, Q_4$ and the cost-free judgement $\Delta|R \vdash^{\text{cf}} \texttt{splay a bl} \colon \mathsf{T}|R'$.]

**Fig. 2.** Partial typing derivation for the motivating example splay.

Let e denote the body of the function definition of splay a t , depicted in Fig. 1. Our automated analysis infers an *annotated type* of splaying, by verifying that the type judgement

$$t \colon \mathsf{T} \vert Q \vdash e \colon \mathsf{T} \vert Q' \,, \tag{2}$$

is derivable. As above, types are decorated with *annotations* $Q := [q_*, q_{(1,0)}, q_{(0,2)}]$ and $Q' := [q'_*, q'_{(1,0)}, q'_{(0,2)}]$—employed to express the potential carried by the arguments to splay and its results.

The soundness theorem of the type system (Theorem 1) expresses that if the above type judgement is derivable, then the total cost $c_{\texttt{splay}}(t)$ of splaying is bounded by the difference between $\Phi(t\colon \mathsf{T}|Q)$ and $\Phi(\texttt{splay a t}\colon \mathsf{T}|Q')$, i.e. $\Phi(t\colon \mathsf{T}|Q) \geqslant c_{\texttt{splay}}(t) + \Phi(\texttt{splay a t}\colon \mathsf{T}|Q')$. In particular, Eq. (1) can be derived in this way.

We now provide an intuition on the type-and-effect system, stepping through the code of Fig. 1. The corresponding type derivation tree is depicted in Fig. 2. We note that the tree contains further annotations $Q_1, Q_2, Q_3, Q_4$ (besides the annotations $Q$ and $Q'$), which again represent the unknown coefficients of potential function templates. The goal of the type-and-effect system is to provide constraints for each programming construct that connect the annotations in subsequent derivation steps, e.g. $Q_2$ and $Q_3$. The type-and-effect system is *syntax-directed* and formulates one rule per programming language construct. We now discuss some of these rules for the partial derivation for splay.

The outermost command of e is a match statement, for which the following rule is applied:

$$\frac{cl\colon\mathsf{T},\, cr\colon\mathsf{T}\,|\,Q_1 \vdash e_1\colon \mathsf{T}\,|\,Q'}{t\colon\mathsf{T}\,|\,Q \vdash \texttt{match t with} \mid \texttt{(cl,c,cr) -> } e_1\colon \mathsf{T}\,|\,Q'}\ (\text{match})$$

Here $e_1$ denotes the subexpression of $e$ which constitutes the nested pattern match. Primarily, this is a standard type rule for pattern matching. The novelty lies in the constraints on the annotations $Q$, $Q'$ and $Q_1$. More precisely, (match) induces the constraints

$$q\_1^1 = q\_2^1 = q\_\* \quad \quad q\_{(1,1,0)}^1 = q\_{(1,0)} \quad \quad q\_{(1,0,0)}^1 = q\_{(0,1,0)}^1 = q\_\* \quad \quad q\_{(0,0,2)}^1 = q\_{(0,2)} \text{ },$$

which can be directly read off from the definition of $\text{rk}(t) = \text{rk}(cl) + \log_2(|cl|) + \log_2(|cr|) + \text{rk}(cr)$. Similarly, the nested match command, starting with expression $e'_1$, is subject to the same rule; the resulting constraints amount to

$$\begin{aligned} q\_1^2 &= q\_2^2 = q\_3^2 & q\_{(0,0,0,2)}^2 &= q\_{(0,0,2)}^1 & q\_{(1,1,1,0)}^2 &= q\_{(1,1,0)}^1\\ q\_{(0,1,1,0)}^2 &= q\_{(1,0,0)}^1 & q\_{(1,0,0,0)}^2 &= q\_{(0,1,0)}^1 & q\_{(0,1,0,0)}^2 &= q\_{(0,0,1,0)}^2 = q\_1^1 \end{aligned}$$

Besides the rules for programming language constructs, the type-and-effect system contains *structural rules*, which operate on the type annotations themselves. The *weakening* rule allows a suitable adaptation of the coefficients of the potential function Φ(Γ|Q2) to obtain a new potential function Φ(Γ|Q3), where we use the shorthand Γ := cr :T, bl:T, br :T:

$$\frac{\Gamma|Q\_3 \vdash e\_1' \colon \mathsf{T}|Q' \quad \Phi(\varGamma|Q\_2) \geqslant \Phi(\varGamma|Q\_3)}{\Gamma|Q\_2 \vdash e\_1' \colon \mathsf{T}|Q'} \text{ (w)}$$

The difficulty in applying the *weakening* rule consists in discharging the constraint:

$$
\Phi(\varGamma|Q_2) \geqslant \Phi(\varGamma|Q_3) \tag{3}
$$

Note that the comparison is to be performed *symbolically*, that is, abstracted from the concrete values of the variables. We emphasise that this step can neither be avoided, nor easily moved to the axioms of the derivation, as in related approaches in the literature [19,21–23,28,31,35]. We use Farkas' Lemma in conjunction with two facts about the logarithm to linearise this symbolic comparison, namely the monotonicity of the logarithm and the fact that $2 + \log_2(x) + \log_2(y) \leqslant 2\log_2(x+y)$ for all $x, y \geqslant 1$. For example, for the facts $\log_2(|bl|) \leqslant \log_2(|bl| + |br|)$ and $2 + \log_2(|bl|) + \log_2(|cr| + |br|) \leqslant 2\log_2(|cr| + |bl| + |br|)$, we use Farkas' Lemma to generate the constraints

$$\begin{aligned} q\_{(0,0,0,2)}^2 + 2f &\geqslant q\_{(0,0,0,2)}^3 \\ q\_{(1,0,1,0)}^2 + f &\geqslant q\_{(1,0,1,0)}^3 \\ q\_{(1,1,1,0)}^2 - 2f &\geqslant q\_{(1,1,1,0)}^3 \end{aligned} \qquad \begin{aligned} q\_{(0,1,0,0)}^2 + f + g &\geqslant q\_{(0,1,0,0)}^3 \\ q\_{(0,1,1,0)}^2 & -g \geqslant q\_{(0,1,1,0)}^3 \end{aligned}$$

for some coefficients $f, g \geqslant 0$ introduced by Farkas' Lemma. We note that Farkas' Lemma can be interpreted as systematically exploring all positive-linear combinations of the considered mathematical facts. This can be seen in the above example: one can combine $g$ times the first fact with $f$ times the second fact.
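The positive-linear-combination reading can be illustrated numerically. The following Python sketch (ours) checks the two facts and several of their $f, g$-weighted combinations on sample sizes:

```python
import itertools
import math

log2 = math.log2

def fact1(bl, br):
    """Monotonicity: log2(|bl|) <= log2(|bl| + |br|)."""
    return log2(bl) <= log2(bl + br) + 1e-9

def fact2(bl, br, cr):
    """Lemma 1 instantiated with x = |bl| and y = |cr| + |br|."""
    return 2 + log2(bl) + log2(cr + br) <= 2 * log2(cr + bl + br) + 1e-9

def combined(f, g, bl, br, cr):
    """g * fact1 + f * fact2: a positive-linear combination of the facts."""
    lhs = g * log2(bl) + f * (2 + log2(bl) + log2(cr + br))
    rhs = g * log2(bl + br) + f * 2 * log2(cr + bl + br)
    return lhs <= rhs + 1e-9

for bl, br, cr in itertools.product([1, 2, 5, 100], repeat=3):
    assert fact1(bl, br) and fact2(bl, br, cr)
    for f, g in [(0, 1), (1, 0), (1, 1), (2, 3)]:
        assert combined(f, g, bl, br, cr)
```

Any such combination with non-negative weights is again a valid inequality, which is exactly what Farkas' Lemma exploits.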

Next, we apply the rule for the let expression. This rule is the most involved typing rule in the system proposed by Hofmann et al. [27].

$$\frac{\Delta|Q \vdash e_2 \colon \mathsf{T}|Q'-1 \quad \Delta|R \vdash^{\text{cf}} e_2 \colon \mathsf{T}|R' \quad \Theta|Q_4 \vdash e_3 \colon \mathsf{T}|Q'}{cr\colon \mathsf{T},\, bl\colon \mathsf{T},\, br\colon \mathsf{T}\,|\,Q_3 \vdash \texttt{let } s = e_2 \texttt{ in } e_3 \colon \mathsf{T}|Q'}\ (\text{let}:\mathsf{T})$$

Ignoring the annotations, and in particular the second premise, for a moment, the type rule specifies a standard typing for a let expression. We note that, as required by the rule, all variables in the type context $\varGamma$ occur at most once in the let-expression. $\varGamma$ can then be split into the contexts $\Delta := bl\colon\mathsf{T}$ and $\Theta := cr\colon\mathsf{T}, br\colon\mathsf{T}$. Here, $e_2 := \texttt{splay a bl}$ and $e_3$ denotes the last match statement in $e$. The let-rule facilitates a splitting of the potential $Q_3$ for the evaluation of $e_2$ and $e_3$ according to the type contexts $\Delta$ and $\Theta$. Abusing notation, the distribution of potentials facilitated by the let-rule can be stated very roughly as two "equalities", that is, (i) "$Q_3 = Q + R + P$" and (ii) "$Q_4 = (Q' - 1) + R' + P$". (i) states that the potential $Q_3$ pays for evaluating the splay expression $e_2$ (with and without costs, requiring the potentials $Q$ and $R$) and leaves the remainder potential $P$. (ii) states that the potential $Q_4$ is constituted of the remainder potential $P$ and of the potentials left after evaluating $e_2$ (with and without costs, i.e. potentials $Q' - 1$ and $R'$). E.g. $Q_4$ is given by the following constraints

$$\begin{aligned} q_1^4 &= q_1^3 & q_3^4 &= q'_* & q_{(1,0,0,0)}^4 &= q_{(1,0,0,0)}^3 & q_{(1,1,1,0)}^4 &= r'_{(1,0)}\\ q_2^4 &= q_3^3 & q_{(0,1,0,0)}^4 &= q_{(0,0,1,0)}^3 & q_{(1,1,0,0)}^4 &= q_{(1,0,1,0)}^3 \end{aligned}$$

where the coefficients $q^3$ stem from the remainder potential of $Q_3$, the coefficient $q'_*$ from $Q' - 1$, and $r'_{(1,0)}$ from $R'$.

The most original part of this type rule is the second premise $\Delta|R \vdash^{\text{cf}} \texttt{splay a bl}\colon \mathsf{T}|R'$. Here, $\vdash^{\text{cf}}$ denotes the same kind of typing judgement as used in the overall typing derivation, but where all costs are set to zero (hence the superscript *cost-free*). Let us assume $R = [r_{(1,0)}]$, $R' = [r'_{(1,0)}]$, and that ATLAS was able to establish that

$$\Phi(bl\colon \mathsf{T}|R) = \log_2(|bl|) \geqslant \log_2(|s|) = \Phi(s\colon \mathsf{T}|R')\,,\tag{4}$$

establishing the coefficients $r_{(1,0)} = 1$ and $r'_{(1,0)} = 1$. (We note that cost-free typing derivations as in Eq. (4) constitute a *size analysis* that relates the sizes of input and output.) Then, ATLAS infers from (4), taking advantage of the monotonicity of log, that

$$
\log\_2(|cr| + |bl| + |br|) \geqslant \log\_2(|cr| + |br| + |s|) \text{ .}
$$

This inequality expresses that if the summand $\log_2(|cr| + |bl| + |br|)$ is included in the potential $\Phi(\varGamma|Q_3)$, then the summand $\log_2(|cr| + |br| + |s|)$ may be included in the potential $\Phi(cr\colon\mathsf{T}, br\colon\mathsf{T}, s\colon\mathsf{T}|Q_4)$. (The two logarithmic terms correspond to the coefficients $q^3_{(1,1,1,0)}$ and $q^4_{(1,1,1,0)}$ above.) Thus, the cost-free derivation allows the potential $R$ to pass from $Q_3$, via $R'$, to $Q_4$. This is crucial for being able to pay for the evaluation of $e_3$.

The let-rule has the three premises $\Delta|Q \vdash e_2\colon\mathsf{T}|Q' - 1$, $\Delta|R \vdash^{\text{cf}} e_2\colon\mathsf{T}|R'$ and $\Theta|Q_4 \vdash e_3\colon\mathsf{T}|Q'$. We focus here on the first premise and do not state the derivations for the other two premises (such derivations can be found in [27]). The judgement $\Delta|Q \vdash \texttt{splay a bl}\colon\mathsf{T}|Q' - 1$ can be derived by the rule for function application, which charges a cost of 1 with regard to the type signature of splay, represented by decrementing the potential induced by the annotation $Q'$.

$$\frac{\texttt{splay}\colon \mathsf{T}|Q \to \mathsf{T}|Q'}{t\colon \mathsf{T}|Q \vdash \texttt{splay a t}\colon \mathsf{T}|Q' - 1}\ (\text{app})$$

The rule for function application is an axiom and closes this branch of the typing derivation. This concludes the presentation of the partial type inference given in Fig. 2. Similarly to the above example of splay, estimates for the amortised costs of insertion and deletion on splay trees can be automatically inferred by our tool ATLAS. Further, our analysis handles similar self-adjusting data structures such as *pairing heaps* and *splay heaps* (see Sect. 6.1).

# **3 Technical Foundation**

In this short section, we provide a more detailed account of the formal system underlying our tool ATLAS. We state the soundness of the system in Theorem 1.

A *typing context* is a mapping from variables $\mathcal{V}$ to types, denoted by upper-case Greek letters. A program $\mathsf{P}$ is a set of typed function definitions of the form $f(x_1,\ldots,x_n) = e$, where the $x_i$ are variables and $e$ is an expression. A *substitution* (or an *environment*) $\sigma$ is a mapping from variables to values that respects types. Substitutions are denoted as sets of assignments: $\sigma = \{x_1 \mapsto t_1,\ldots,x_n \mapsto t_n\}$. We employ a simple cost-sensitive big-step semantics based on eager evaluation, dressed up with cost assertions. The judgement $\sigma \vdash e \Rightarrow v \mid \ell$ means that under environment $\sigma$, expression $e$ is evaluated to value $v$ in exactly $\ell$ steps. Here only rule applications emit (unit) costs. For brevity, the formal definition of the semantics is omitted but can be found in [27].

In Sect. 2, we introduced a variant of Schoenmakers' potential function, denoted as $\text{rk}(t)$, and the additional potential functions $p_{(a_1,\ldots,a_m,b)}(t_1,\ldots,t_m) := \log_2(a_1 \cdot |t_1| + \cdots + a_m \cdot |t_m| + b)$, denoting the $\log_2$ of a linear combination of tree sizes. $\log_2$ denotes the logarithm to the base 2; throughout the paper we stipulate $\log_2(0) := 0$ in order to avoid case distinctions. Note that the constant function 1 is representable: $1 = \lambda t.\ \log_2(0 \cdot |t| + 2) = p_{(0,2)}$. We are now ready to state the resource annotation of a sequence of trees:

**Definition 1.** *A* resource annotation *or simple* annotation *of length* $m$ *is a sequence* $Q = [q_1,\ldots,q_m] \cup [(q_{(a_1,\ldots,a_m,b)})_{a_i,b\in\mathbb{N}}]$*, vanishing almost everywhere. Let* $t_1,\ldots,t_m$ *be a sequence of trees. Then, the potential of* $t_1,\ldots,t_m$ *wrt.* $Q$ *is given by*

$$\Phi(t_1, \ldots, t_m|Q) := \sum_{i=1}^m q_i \cdot \text{rk}(t_i) + \sum_{a_1, \ldots, a_m, b \in \mathbb{N}} q_{(a_1, \ldots, a_m, b)} \cdot p_{(a_1, \ldots, a_m, b)}(t_1, \ldots, t_m)\,.$$

In case of an annotation of length 1, we sometimes write q<sup>∗</sup> instead of q1, as we already did above.

*Example 1.* Let t be a tree, then its potential could be defined as follows: rk(t)+ 3 · log2(|t|) + 1. Wrt. the above definition this potential becomes representable by setting <sup>q</sup><sup>∗</sup> := 1, q(1,0) := 3, q(0,2) := 1. Thus, <sup>Φ</sup>(t|Q) = rk(t)+3 · log2(|t|) + 1. 
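Definition 1 and Example 1 can be replayed directly in code; a minimal Python sketch (our own illustration) for annotations over a single tree:

```python
import math

def size(t):
    """|t|: number of leaves; a leaf is represented by None."""
    return 1 if t is None else size(t[0]) + size(t[2])

def rk(t):
    """Variant of Schoenmakers' potential, rk(leaf) = 1."""
    if t is None:
        return 1
    l, _, r = t
    return rk(l) + math.log2(size(l)) + math.log2(size(r)) + rk(r)

def phi(t, q_star, coeffs):
    """Phi(t|Q) for one tree: q_* * rk(t) + sum of q_(a,b) * log2(a|t| + b)."""
    pot = q_star * rk(t)
    for (a, b), q in coeffs.items():
        s = a * size(t) + b
        pot += q * (math.log2(s) if s > 0 else 0.0)   # log2(0) := 0
    return pot

# Example 1: q_* = 1, q_(1,0) = 3, q_(0,2) = 1.
Q = {(1, 0): 3, (0, 2): 1}
t = (None, "d", None)            # one node: |t| = 2, rk(t) = 2
assert phi(t, 1, Q) == 2 + 3 * 1 + 1   # rk(t) + 3 log2(|t|) + 1 = 6
```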

Let σ be a substitution, let Γ denote a typing context and let x<sup>1</sup> :T,...,x<sup>m</sup> :T denote all tree types in Γ. A *resource annotation for* Γ or simply *annotation* is an annotation for the sequence of trees x1σ,..., xmσ. We define the *potential* of the annotated context Γ|Q wrt. a substitution σ as Φ(σ; Γ|Q) := Φ(x1σ,..., xmσ|Q).

**Definition 2.** *An* annotated signature F *maps functions* f *to sets of pairs of the annotation type for the arguments and the annotation type of the result:*

$\mathcal{F}(f) := \{\alpha_1 \times\cdots\times \alpha_n|Q \to \beta|Q' : Q, Q'\ \text{are annotations},\ Q\ \text{is of length}\ m\}$.

*We suppose* $f$ *takes* $n$ *arguments of which* $m$ *are trees;* $m \leqslant n$ *by definition.*

Instead of $\alpha_1 \times\cdots\times \alpha_n|Q \to \beta|Q' \in \mathcal{F}(f)$, we sometimes succinctly write $f\colon \alpha_1 \times\cdots\times \alpha_n|Q \to \beta|Q'$. The *cost-free* signature, denoted as $\mathcal{F}^{\text{cf}}$, is similarly defined.

*Example 2.* Consider the function splay from above. Its signature is formally represented as $\mathsf{B} \times \mathsf{T}|Q \to \mathsf{T}|Q'$, where $Q := [q_*] \cup [(q_{(a,b)})_{a,b\in\mathbb{N}}]$ and $Q' := [q'_*] \cup [(q'_{(a,b)})_{a,b\in\mathbb{N}}]$. We leave it to the reader to specify the coefficients in $Q$, $Q'$ so that the rule (app) as depicted in Sect. 2 can indeed be employed to type the recursive call of splay.

Let $Q = [q_*] \cup [(q_{(a,b)})_{a,b\in\mathbb{N}}]$ be an annotation such that $q_{(0,2)} > 0$. Then $Q' := Q - 1$ is defined as follows: $Q' = [q_*] \cup [(q'_{(a,b)})_{a,b\in\mathbb{N}}]$, where $q'_{(0,2)} := q_{(0,2)} - 1$ and $q'_{(a,b)} := q_{(a,b)}$ for all $(a, b) \neq (0, 2)$. By definition the annotation coefficient $q_{(0,2)}$ is the coefficient of the basic potential function $p_{(0,2)}(t) = \log_2(0 \cdot |t| + 2) = 1$, so the annotation $Q - 1$ decrements cost 1 from the potential induced by $Q$.
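A minimal sketch (ours) of this decrement operation, representing the family of coefficients as a dictionary keyed by the index $(a, b)$:

```python
def decrement(q_star, coeffs):
    """Q - 1: subtract 1 from the coefficient of p_(0,2) = log2(0*|t| + 2) = 1.
    Defined only if q_(0,2) > 0; all other coefficients are unchanged."""
    assert coeffs.get((0, 2), 0) > 0
    out = dict(coeffs)
    out[(0, 2)] -= 1
    return q_star, out
```

For the annotation of Example 1, `decrement(1, {(1, 0): 3, (0, 2): 1})` removes exactly one unit of constant potential.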

*Type-and-Effect System.* The typing system makes use of a *cost-free* semantics, which does not attribute any costs to the calculation, i.e. the rule (app) (Sect. 2) is changed so that no cost is emitted. The cost-free application rule is denoted as (app : cf). The cost-free typing judgement is written as $\varGamma|Q \vdash^{\text{cf}} e\colon \alpha|Q'$. The judgement $\varGamma|Q \vdash e\colon \alpha|Q'$ is governed by a plethora of typing rules. We have illustrated several typing rules in Sect. 2 (the complete set of typing rules can be found in [27]).

A program $\mathsf{P}$ is called *well-typed* if for any rule $f(x_1,\ldots,x_k) = e \in \mathsf{P}$ and any annotated signature $f\colon \alpha_1 \times\cdots\times \alpha_k|Q \to \beta|Q'$, we have $x_1\colon\alpha_1,\ldots,x_k\colon\alpha_k|Q \vdash e\colon \beta|Q'$. A program $\mathsf{P}$ is called *cost-free well-typed* if the cost-free typing relation is employed.

Hofmann et al. establish the following soundness result:<sup>3</sup>

**Theorem 1 (Soundness Theorem).** *Let* $\mathsf{P}$ *be well-typed and let* $\sigma$ *be an environment. Suppose* $\varGamma|Q \vdash e\colon \alpha|Q'$ *and* $\sigma \vdash e \Rightarrow v \mid \ell$*. Then* $\Phi(\sigma; \varGamma|Q) - \Phi(v|Q') \geqslant \ell$*. Further, if* $\varGamma|Q \vdash^{\text{cf}} e\colon \alpha|Q'$*, then* $\Phi(\sigma; \varGamma|Q) \geqslant \Phi(v|Q')$*.*

# **4 The Road to Automation, Continued**

The above sketched type-and-effect system, originally proposed in [27], is only a first step towards full automation. Several challenges need to be overcome, which we detail in this section.

#### **4.1 Type Checking**

Comparison between logarithmic expressions constitutes a first major challenge, as such a comparison cannot be directly encoded as a *linear* constraint problem. To achieve such *linearisation*, [27] makes use of the following: (i) a subtle and surprisingly effective variant of Schoenmakers' potential (see Sect. 2); (ii) mathematical facts about the logarithm function—like Lemma 1 below—referred to as *expert knowledge*; and finally (iii) Farkas' Lemma for turning the universally-quantified premise of the weakening rule into an existentially-quantified statement that can be added to the constraint system—see Lemma 2.

A simple mathematical fact that is employed by Hofmann et al.—following earlier pen-and-paper proofs in the literature [37,38,41]—reads as follows:

**Lemma 1.** *Let* $x, y \geqslant 1$*. Then* $2 + \log_2(x) + \log_2(y) \leqslant 2\log_2(x + y)$*.*
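The lemma follows from the inequality of arithmetic and geometric means; a one-line justification (ours, not spelled out in the source): since $xy \leqslant \big(\tfrac{x+y}{2}\big)^2$,

$$\log_2(x) + \log_2(y) = \log_2(xy) \leqslant \log_2\!\Big(\big(\tfrac{x+y}{2}\big)^2\Big) = 2\log_2(x+y) - 2\,.$$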

We remark that our automated analysis shows that this lemma is not only crucial in the analysis of splaying, but also for the other data structures we have investigated. Further, Hofmann et al. state and prove the following variant of Farkas' Lemma, which lies at the heart of an effective transformation of comparison demands like (3) into a linear constraint problem. Note that u and f denote column vectors of suitable length.

**Lemma 2 (Farkas' Lemma).** *Suppose* $A\vec{x} \leqslant \vec{b}$, $\vec{x} \geqslant 0$ *is solvable. Then the following assertions are equivalent: (i)* $\forall \vec{x} \geqslant 0.\ A\vec{x} \leqslant \vec{b} \Rightarrow \vec{u}^T\vec{x} \leqslant \lambda$ *and (ii)* $\exists \vec{f} \geqslant 0.\ \vec{u}^T \leqslant \vec{f}^T A \wedge \vec{f}^T\vec{b} \leqslant \lambda$*.*
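A concrete instance (our own toy example) of the equivalence, with $A = (1\ 1)$, $\vec{b} = (4)$, $\vec{u} = (2, 2)^T$ and $\lambda = 8$; the witness $\vec{f} = (2)$ certifies assertion (ii), and assertion (i) is sampled on a grid:

```python
# Data of the instance: one constraint x1 + x2 <= 4 over x >= 0.
A = [[1, 1]]
b = [4]
u = [2, 2]
lam = 8
f = [2]                     # Farkas witness for assertion (ii)

# (ii): u^T <= f^T A componentwise, and f^T b <= lam.
fTA = [sum(f[i] * A[i][j] for i in range(len(A))) for j in range(len(u))]
assert all(u[j] <= fTA[j] for j in range(len(u)))
assert sum(fi * bi for fi, bi in zip(f, b)) <= lam

# (i): u^T x <= lam for all x >= 0 with A x <= b, sampled on a grid.
for x1 in range(5):
    for x2 in range(5):
        if x1 + x2 <= 4:
            assert u[0] * x1 + u[1] * x2 <= lam
```

In ATLAS the roles are reversed: the solver searches for the vector $\vec{f}$, so that the universally-quantified statement (i) becomes an existential constraint that can be added to the linear constraint system.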

<sup>3</sup> Note that soundness assumes a terminating execution $\sigma \vdash e \Rightarrow v \mid \ell$ of $\mathsf{P}$. We point out that our analysis does not guarantee the termination of $\mathsf{P}$ for all environments $\sigma$.

The lemma allows the incorporation of *expert knowledge* through the assumption $A\vec{x} \leqslant \vec{b}$ for all $\vec{x} \geqslant 0$. Expert knowledge formalised in this way provides a clear point of departure for additional information. For instance, Hofmann et al. [27] propose the following potential extensions: (i) additional mathematical facts on the log function; (ii) a dedicated size analysis; (iii) incorporation of basic static analysis techniques. The incorporation of Farkas' Lemma with suitable expert knowledge is already essential for *type checking*, whenever the symbolic weakening constraint (3) needs to be discharged.

ATLAS incorporates two facts into the expert knowledge: Lemma 1 and the monotonicity of the logarithm (see Sect. 5). We found these two facts sufficient for handling our benchmarks, i.e. expert knowledge of forms (ii) and (iii) was not needed. (We note, though, that we have experimented with adding a dedicated size analysis (ii), which interestingly increased solver performance despite generating a larger constraint system.)

We indicate how ATLAS may be used to solve the constraints generated for the example in Sect. 2. We recall the crucial application of the *weakening* step between annotations $Q_2$ and $Q_3$. This weakening step can be automatically discharged using the monotonicity of the logarithm and Lemma 1. (More precisely, ATLAS employs the mode w{mono l2xy}; see Sect. 5.) For example, ATLAS is able to verify the validity of the following concrete constants:

$$\begin{aligned} Q\_2 \colon q\_1^2 = q\_2^2 = q\_3^2 = 1 & \quad Q\_3 \colon q\_1^3 = q\_2^3 = q\_3^3 = 1\\ q\_{(0,0,0,2)}^2 = 1 & \quad q\_{(0,1,1,0)}^2 = 1 & \quad q\_{(0,0,0,2)}^3 = 2 & \quad q\_{(1,0,0,0)}^3 = 1\\ q\_{(0,0,1,0)}^2 = 1 & \quad q\_{(1,0,0,0)}^2 = 1 & \quad q\_{(0,0,1,0)}^3 = 1 & \quad q\_{(1,0,1,0)}^3 = 1\\ q\_{(0,1,0,0)}^2 = 1 & \quad q\_{(1,1,1,0)}^2 = 3 & \quad q\_{(0,1,0,0)}^3 = 3 & \quad q\_{(1,1,1,0)}^3 = 1 \end{aligned}$$
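These constants can be checked numerically. Since the rank coefficients of $Q_2$ and $Q_3$ agree, the weakening constraint $\Phi(\varGamma|Q_2) \geqslant \Phi(\varGamma|Q_3)$ depends only on the sizes $|cr|, |bl|, |br|$. The following Python sketch (ours; it reads the annotation indices in the context order $cr, bl, br$) samples the inequality:

```python
import math

def log2_(s):
    """log2 with the paper's convention log2(0) := 0."""
    return math.log2(s) if s > 0 else 0.0

def pot(coeffs, sizes):
    """Logarithmic part of Phi: sum of q_(a1,a2,a3,b) * log2(a1|cr| + a2|bl| + a3|br| + b)."""
    return sum(q * log2_(sum(a * n for a, n in zip(idx[:3], sizes)) + idx[3])
               for idx, q in coeffs.items())

Q2 = {(0, 0, 0, 2): 1, (0, 0, 1, 0): 1, (0, 1, 0, 0): 1,
      (0, 1, 1, 0): 1, (1, 0, 0, 0): 1, (1, 1, 1, 0): 3}
Q3 = {(0, 0, 0, 2): 2, (0, 0, 1, 0): 1, (0, 1, 0, 0): 3,
      (1, 0, 0, 0): 1, (1, 0, 1, 0): 1, (1, 1, 1, 0): 1}

for cr in (1, 2, 7, 100):
    for bl in (1, 2, 7, 100):
        for br in (1, 2, 7, 100):
            assert pot(Q2, (cr, bl, br)) >= pot(Q3, (cr, bl, br)) - 1e-9
```

The symbolic proof of this inequality is exactly the Farkas combination of the two logarithm facts shown in Sect. 2.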

#### **4.2 Type Inference**

We extend the type-and-effect system of [27] from type checking to type inference. Further, we automate the application of structural rules like *sharing* or *weakening*, which have so far required user guidance.

The two central contributions of this paper, as delineated in the introduction, rest on significant improvements over the state of the art described above. Concretely, they came about through (i) a novel *optimisation layer*; (ii) a careful control of the *structural rules*; (iii) the generalisation of user-defined *proof tactics* into an overall type-inference strategy; and (iv) the provision of an automated amortised analysis in the sense of Sleator and Tarjan. In the remainder of this section, we discuss these stepping stones towards full automation in more detail.

*Optimisation Layer.* We add an optimisation layer to the set-up in order to support *type inference*. This allows for the inference of (optimal) type annotations based on user-defined type annotations. For example, assume the user-provided type annotation rk(t) + 3·log2(|t|) + 1 → rk(splay(t)) can in principle be checked automatically. Then, instead of merely checking this annotation, ATLAS automatically *optimises* the signature by minimising the deduced coefficients.

```
(match (* t *) leaf
  (match (* cl *) ?
    (w{l2xy} (let:tree:cf (* s *)
      app (* splay_eq a bl *)
      (match leaf
        (let:tree:cf node (let:tree:cf node (w{mono} node))))))))
```
**Fig. 3.** Tactic that matches the zig-zig case of splay as shown in Fig. 1.

(In Sect. 5 we discuss how this optimisation step is performed.) That is, ATLAS reports the following annotation

splay: 1/2·rk(t) + 3/2·log2(|t|) → 1/2·rk(splay(t)),

which yields the *optimal* amortised cost of 3/2·log2(|t|) for splaying. Optimality here means that no better bound has been obtained by earlier pen-and-paper verification methods (compare the discussion in Sect. 1).

*Structural Rules.* We observed that an unchecked application of the structural rules, that is, of the *sharing* and the *weakening* rule, quickly leads to an explosion in the size of the constraint system and thus to de-facto unsolvable problems. To wit, an earlier version of our implementation ran continuously for a week without being able to infer a type for the complete definition of the function splay.<sup>4</sup>

The type-and-effect system proposed by Hofmann et al. is in principle *linear*, that is, variables occur at most once in the function body. For example, this is exploited in the definition of the let-rule, cf. Sect. 2. However, a *sharing* rule is admissible, which allows the treatment of multiple occurrences of variables. Occurrences of non-linear variables are suitably renamed apart and the potential they carry is shared among the variants. (See [27] for the details.) The number of variables strongly influences the size of the constraint problem. Hence, eager application of the sharing rule proved infeasible. Instead, we restricted its application to individual program traces. For the considered benchmark examples, this removed the need for sharing altogether.

With respect to *weakening*, a careful application of the weakening rule proved necessary for performance reasons. First, we apply weakening only selectively. Second, when applying weakening, we employ different levels of *granularity*: we may perform only a simple coefficient comparison, or we may apply monotonicity, Lemma 1, or both in conjunction with Farkas' Lemma. We give the details in Sect. 5.

*Proof Tactics.* Hofmann et al. [27] already propose user-defined proof plans, so-called *tactics*, to improve the effectiveness of type checking. In combination with our optimisation framework, tactics allow significant improvements of type annotations. To wit, ATLAS can be invoked with user-defined resource annotations for the function splay, representing its "standard" amortised complexity (e.g. copied from Okasaki's book [38]), and an easily definable tactic, cf. Fig. 3.

<sup>4</sup> The code ran single-threaded on AMD® Ryzen 7 3800 @ 3.90 GHz.

Then, ATLAS automatically derives the optimal bound reported above. Still, for full automation, tactics are clearly not sufficient. In order to obtain *type inference* in general, we developed a generalisation of all the tactics that proved useful on our benchmarks and incorporated this proof search strategy into the type inference algorithm. Using this, the aforementioned (unsuccessful) week-long quest for a type inference of splaying can now be answered successfully (in optimal form) in mere minutes.

We emphasise that ATLAS' proof search strategy for full automation is free of bias towards the provided complexity analysis. As detailed in Sect. 5, the heuristic incorporates common design principles of the data structures analysed. Thus, we exploit recurring patterns in the input (destructuring of input trees, handling of base and recursive cases, rotations), not in the solution. The situation is similar to the choice of the potential functions, which we expect to generalise to other data structures. Similarly, we expect the current proof search strategy to generalise.

*Automated Amortised Analysis.* In Sect. 2, we provided a high-level introduction to the potential method and remarked that Sleator and Tarjan's original formulation is re-obtained if the corresponding potential functions are defined such that φ(v) := a<sup>f</sup>(v) + ψ(x), see page 5. We now discuss how we can extract amortised complexities in the sense of Sleator and Tarjan from our approach. Suppose we are interested in an amortised analysis of splay heaps. Then it suffices to equate the right-hand sides of the annotated signatures of the splay heap functions. That is, we set del\_min: T|Q1 → T|Q', insert: B × T|Q2 → T|Q' and partition: B × T|Q3 → T|Q' for some unknown resource annotations Q1, Q2, Q3, Q'. Note that we use the same annotation Q' for all signatures. We can then obtain a potential function from the annotation Q' in the sense of Sleator and Tarjan and deduce Qi − Q' as an upper bound on the amortised complexity of the respective function. In Sect. 5, we discuss how to automatically optimise Qi − Q' in order to minimise the amortised complexity bound. This automated minimisation is the second major contribution of our work.

Our results suggest a new approach to the complexity analysis of data structures. On the one hand, we obtain novel insights into the automated worst-case runtime complexity analysis of involved programs. On the other hand, we provide a proof of concept for a computer-aided analysis of the amortised complexity of data structures that have so far only been analysed manually.
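The accounting scheme just described can be illustrated with a small, self-contained sketch (this is our own illustration, not ATLAS code): we take a sum-of-logarithms rank as the potential and compute amortised cost as actual cost plus potential difference, in the style of Sleator and Tarjan. The tuple encoding of trees and the `rank`/`amortised` helpers are hypothetical names introduced here for illustration only.

```python
import math

def rank(t):
    """Sum-of-logarithms potential: the sum over all inner nodes v of
    log2(size of the subtree rooted at v), where size counts leaves.
    Trees are (left, right) tuples; None denotes a leaf.
    Returns (potential, size)."""
    if t is None:
        return 0.0, 1
    pl, nl = rank(t[0])
    pr, nr = rank(t[1])
    n = nl + nr
    return pl + pr + math.log2(n), n

def amortised(cost, before, after):
    """Sleator-Tarjan amortised cost: actual cost plus potential difference."""
    return cost + rank(after)[0] - rank(before)[0]

# A single right rotation: ((A, B), C) becomes (A, (B, C)).
before = ((None, None), None)
after = (None, (None, None))
a = amortised(1, before, after)  # actual cost 1 plus the potential difference
```

Since the two trees in this tiny example are mirror images, their potentials coincide and the amortised cost equals the actual cost; in general the potential difference absorbs expensive restructuring steps.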

# **5 Implementation**

In this section, we present our tool ATLAS, which implements type inference for the type system presented in Sects. 2 and 3. ATLAS operates in three phases:


```
LNF[if a < a'
    then (l, a, (leaf, a', r))
    else ((l, a', leaf), a, r)]

let x1 = a < a' in if x1
    then LNF[(l, a, (leaf, a', r))]
    else LNF[((l, a', leaf), a, r)]

let x1 = a < a' in if x1
    then let x2 = leaf in let x3 = (x2, a', r) in (l, a, x3)
    else let x4 = leaf in let x5 = (l, a', x4) in (x5, a, r)
```
**Fig. 4.** Preprocessing: let normal forms.

In terms of overall resource requirements, the bottleneck of the system is phase three. Preprocessing is both simple and fast. While the code implementing constraint generation may be complex, its execution is fast. All of the underlying complexity is shifted into the third phase. On modern machines with multiple gibibytes of main memory, ATLAS is constrained by the CPU and not by the available memory. In the remainder of this section, we first detail the three phases of ATLAS, then go into more detail on the second phase, and finally elaborate on the optimisation function, which is the key enabler of type inference.

#### **5.1 The Three Phases of ATLAS**

*1.) Preprocessing.* The parser used in the first phase is generated with ANTLR<sup>5</sup> and the transformation of the syntax is implemented in Java. The preprocessing performs two tasks: (i) transformation of the input program into *let-normal-form*, which is the form of program input required by our type system; (ii) the *unsharing* conversion, which creates explicit copies of variables that are used multiple times. Making multiple uses of a variable explicit is required by the let-rule of the type system.

In order to satisfy the requirement of the let-rule, it is actually sufficient to track variable usage on the level of program paths. It turns out that in our benchmarks variables are only used multiple times in different branches of an if-statement, for which no unsharing conversion is needed. Hence, we do not discuss the unsharing conversion further in this paper and refer the interested reader to [27] for more details.

*Let-Normal-Form Conversion.* The let-normal-form conversion is performed recursively and rewrites composed expressions into simple expressions, where each operator is only applied to a variable or a constant. This conversion is achieved by introducing additional let-constructs. We exemplify let-normal-form conversion on a code snippet in Fig. 4.
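To make the recursion concrete, here is a minimal Python sketch of such a conversion (our own illustration, not the Java code of ATLAS): expressions are tuples, and every compound argument is hoisted into a fresh `let`-binding. The helper names `lnf`, `to_lets`, and `render` are hypothetical.

```python
import itertools

_fresh = itertools.count(1)  # fresh-variable supply

def lnf(expr):
    """Rewrite an expression tree so that every operator is applied only to
    variables or constants. Expressions are tuples ('op', arg1, ...),
    strings (variables) or ints (constants). Returns (simple body, bindings)."""
    if not isinstance(expr, tuple):
        return expr, []                 # already simple
    bindings, args = [], []
    for a in expr[1:]:
        a, bs = lnf(a)
        bindings += bs
        if isinstance(a, tuple):        # still compound: bind it to a fresh name
            x = f"x{next(_fresh)}"
            bindings.append((x, a))
            a = x
        args.append(a)
    return (expr[0],) + tuple(args), bindings

def to_lets(expr):
    """Render the conversion result as nested let-expressions."""
    body, bindings = lnf(expr)
    s = ""
    for x, e in bindings:
        s += f"let {x} = {render(e)} in "
    return s + render(body)

def render(e):
    if not isinstance(e, tuple):
        return str(e)
    return "(" + " ".join([e[0]] + [render(a) for a in e[1:]]) + ")"

# Nested node construction, in the spirit of Fig. 4:
converted = to_lets(('node', 'l', 'a', ('node', 'leaf', 'b', 'r')))
```

Running the snippet hoists the inner `node` expression into a binding `x1`, exactly the shape the let-rule of the type system expects.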

*2.) Generation of the Constraint System.* After preprocessing, we apply the typing rules. Importantly, the application of all typing rules, except for the weakening rule, which we discuss in further detail below, is *syntax-directed*: This means

<sup>5</sup> See antlr.org.

that each node of the AST of the input program dictates which typing rule is to be applied. The weakening rule could in principle be applied at each AST node, giving the constraint solver more freedom to find a solution. This degree of freedom needs to be controlled by the tool designer. In addition, recall that the suggested implementation of the weakening rule (see Sect. 4.1) is to be parameterised by the expert knowledge, fed into the weakening rule. In our experiments we noticed that the weakening rule has to be applied sparingly in order to avoid an explosion of the resulting constraint system.

We summarise the degrees of freedom available to the tool designer, which can be specified as parameters to ATLAS on source level. 1.) The selected template potential functions, i.e. the family of indices a, b for which coefficients q(a,b) are generated (coefficients not explicitly generated are assumed to be zero). 2.) The number of annotated signatures (with costs and without costs) for each function. 3.) The policy for applying the (parameterised) weakening rule.

We detail our choices for instantiating the above degrees of freedom in Sect. 5.2.

*3.) Solving.* For solving the generated constraint system, we rely on the Z3 SMT solver. We employ Z3's Java bindings, load Z3 as a shared library, and exchange constraints for solutions. ATLAS forwards user-supplied configuration to Z3, which allows for flexible tuning of solver parameters. We also record Z3's statistics, most importantly memory usage. During the implementation of ATLAS, Z3's feature to extract unsatisfiable cores has proven valuable. It supplied us with many counterexamples, often directly pinpointing bugs in our implementation. The tool exports constraint systems in SMT-LIB format to the file system. This way, solutions could be cross-checked by re-computing them with other SMT solvers that support minimisation, such as OptiMathSAT [43].
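As an illustration of the export step, the following sketch (our own; ATLAS's actual emitter is part of its Java code base) prints a toy constraint system in SMT-LIB 2 with a `minimize` objective, the optimisation extension understood by Z3's optimiser and by OptiMathSAT. The function name and the tiny example constraints are hypothetical.

```python
def to_smtlib(coeffs, constraints, objective):
    """Emit a toy constraint system in SMT-LIB 2 with an optimisation
    objective, as accepted by Z3's optimiser and OptiMathSAT."""
    lines = ["(set-logic QF_LRA)"]
    lines += [f"(declare-const {q} Real)" for q in coeffs]
    lines += [f"(assert {c})" for c in constraints]
    lines.append(f"(minimize {objective})")
    lines.append("(check-sat)")
    lines.append("(get-model)")
    return "\n".join(lines)

smt = to_smtlib(
    ["q1", "q2"],
    ["(>= q1 0)", "(>= q2 0)", "(>= (+ q1 q2) 1)"],
    "(+ q1 q2)",
)
```

Writing the resulting string to a `.smt2` file allows cross-checking solutions with any optimising SMT solver.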

#### **5.2 Details on the Generation of the Constraint System**

We now discuss our choices for the aforementioned degrees of freedom.

*Potential Function Templates.* Following [27], we create, for each node in the AST of the considered input program where n tree-typed variables are currently in context, the coefficients q1, …, qn for the rank functions and the coefficients q(a,b) for the logarithmic terms, where a ∈ {0, 1}^n and b ∈ {0, 2}. This choice turned out to be sufficient in our experiments.
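For illustration, this index family can be enumerated directly; the sketch below (ours, with a hypothetical function name) lists the indices (a, b) of the logarithmic coefficients generated for a context of n tree variables.

```python
from itertools import product

def template_indices(n):
    """Indices (a, b) of the logarithmic coefficients q_(a,b) for a context
    with n tree variables: a ranges over {0,1}^n and b over {0,2}."""
    return [(a, b) for a in product((0, 1), repeat=n) for b in (0, 2)]

idx = template_indices(2)
# 2^2 vectors a times 2 choices of b: 8 logarithmic coefficients,
# in addition to the n = 2 rank coefficients q1, q2.
```

The template thus grows exponentially in the number of tree variables in context, which explains why controlling the context size matters for solver performance.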

*Number of Function Signatures.* We fix the number of annotations for each function f : α1 × ··· × αn|Q → β|Q' to one regular and one cost-free signature. This was sufficient for our experiments.

*Weakening.* We need to discharge symbolic comparisons of the form Φ(Γ|P) ≥ Φ(Γ|Q). As indicated in Sect. 4, we employ Farkas' Lemma to derive constraints for the weakening rule. For a context Γ = t1, …, tn, we introduce variables x(a,b), where a ∈ {0, 1}^n, b ∈ {0, 2}, which represent the potential functions p(a,b) = log2(a1|t1| + … + an|tn| + b). Next, we explain how the monotonicity of log2 and Lemma 1 can be used to derive inequalities over the variables x(a,b), which can then be used to instantiate the matrix A in Farkas' Lemma as stated in Sect. 4.

**Fig. 5.** Monotonicity lattice for |Q| = 2.

*Monotonicity.* We observe that p(a,b) = log2(a1|t1| + … + an|tn| + b) ≤ log2(a'1|t1| + … + a'n|tn| + b') = p(a',b'), if a1 ≤ a'1, …, an ≤ a'n and b ≤ b'. This allows us to obtain the lattice shown in Fig. 5. A path from x(a',b') to x(a,b) signifies x(a,b) ≤ x(a',b'), resp. x(a,b) − x(a',b') ≤ 0, represented by a row with coefficients 1 and −1 in the corresponding columns of the matrix A.

*Mathematical Facts, Like Lemma 1.* For an annotated context of length 2, Lemma 1 can be stated by the inequality 2·x(0,0,2) + x(0,1,0) + x(1,0,0) − 2·x(1,1,0) ≤ 0; we add a corresponding row with coefficients 2, 1, 1, −2 to the matrix A. Likewise, for contexts of length > 2, we add, for each subset of two variables, a row with coefficients 2, 1, 1, −2, setting the coefficients of all other variables to 0.
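Both kinds of rows can be generated mechanically. The sketch below is our own illustration (ATLAS's internal representation may differ): rows of A are represented as dictionaries from an index (a, b) to its coefficient, one generator for the monotonicity edges and one for the Lemma 1 instances.

```python
from itertools import product, combinations

def monotonicity_rows(n):
    """Rows encoding x_(a,b) - x_(a',b') <= 0 whenever a <= a' componentwise
    and b <= b' (and the indices differ). Each row is a sparse dict mapping
    an index (a, b) to its coefficient."""
    idx = [(a, b) for a in product((0, 1), repeat=n) for b in (0, 2)]
    rows = []
    for (a, b), (a2, b2) in product(idx, idx):
        if (a, b) != (a2, b2) and all(x <= y for x, y in zip(a, a2)) and b <= b2:
            rows.append({(a, b): 1, (a2, b2): -1})
    return rows

def lemma1_rows(n):
    """Rows encoding 2 + log2|x| + log2|y| <= 2*log2(|x|+|y|) for every pair
    of tree variables: 2*x_(0,2) + x_(ei,0) + x_(ej,0) - 2*x_(ei+ej,0) <= 0."""
    zero = (0,) * n
    def unit(i, j=None):
        a = [0] * n
        a[i] = 1
        if j is not None:
            a[j] = 1
        return tuple(a)
    return [{(zero, 2): 2, (unit(i), 0): 1, (unit(j), 0): 1, (unit(i, j), 0): -2}
            for i, j in combinations(range(n), 2)]
```

For a context of length 2 this yields exactly the single Lemma 1 row with coefficients 2, 1, 1, −2 described above.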

*Sparse Expert Knowledge Matrix.* We observe for both kinds of constraints that matrix A is sparse. We exploit this in our implementation and only store non-zero coefficients.

*Parametrisation of Weakening.* Each application of the weakening rule is parameterised by the matrix A. In our tool, we instantiate A with the constraints for (i) monotonicity, referenced shortly as w{mono}; (ii) Lemma 1 (w{l2xy}); (iii) both (w{mono l2xy}); or (iv) none of the constraints (w).

In the last case, Farkas' Lemma is not needed, because weakening defaults to a point-wise comparison of the coefficients q(a,b), which can be implemented more directly. Each time we apply weakening, we need to choose how to instantiate the matrix A. Our experiments demonstrate that we need to apply monotonicity and Lemma 1 sparingly in order to avoid blowing up the constraint system.

*Tactics and Automation.* ATLAS supports manual application of the weakening rule (for this, the user has to provide a tactic) as well as a fully automated mode.

*Naive Automation.* Our first attempt at automation applied the weakening rule everywhere, instantiated with the full amount of available expert knowledge. This approach did not scale.

*Manual Mode via Tactics.* A tactic is given as a text file that contains a tree of rule names corresponding to the AST nodes of the input program, into which the user can insert applications of the weakening rule, parameterised by the expert knowledge that should be applied. A simple tactic is depicted in Fig. 3. Tactics are distributed with ATLAS, see [32]. The user can name sub-trees for reference in the result of the analysis and include ML-style comments in the tactics text. We provide two special commands that allow the user to directly deal with a whole branch of the input program: the question mark (?) allows partial proofs; no constraints will be created for the part of the program thus marked. The underscore (_) switches to the naive automation of ATLAS and applies the weakening rule with full expert knowledge everywhere. Both ? and _ were invaluable when developing and debugging the automated mode. We note that the manual mode still achieves solving times that are an order of magnitude faster than the automated mode, which may be of interest to a user willing to hand-optimise solving times.

*Automated Mode.* For automation, we extracted common patterns from the tactics we developed manually: Weakening with mode w{mono} is applied before (var) and (leaf), <sup>w</sup>{mono l2xy} is applied only before (app). (We recall that the full set of rules employed by our analysis can be found in [27].) Further, for AST subtrees that construct trees, i.e. which only consist of (node), (var) and (leaf) rule applications, we apply w{mono} for each inner node, and w{l2xy} for each outermost node. For all other cases, no weakening is applied. This approach is sufficient to cover all benchmarks, with further improvements possible.

#### **5.3 Optimisation**

Given an annotated function f : α1 × ··· × αn|Q → β|Q', we want to find values for the coefficients of the resource annotations Q and Q' that minimise Φ(Γ|Q) − Φ(Γ|Q'), since this difference is an upper bound on the amortised cost of f, cf. Sect. 4.2. However, as with weakening, we cannot directly express such a minimisation, and again resort to linearisation: we choose an optimisation function that directly maps the coefficients of Q and Q' to a rational number. Our optimisation function combines four measures, three of which involve a difference between coefficients of Q and Q', and a fourth one that only involves coefficients from Q, in order to minimise the absolute values of the discovered coefficients. We first present these measures for the special case of |Q| = 1.

The first measure d1(Q, Q') := q∗ − q'∗ reflects our goal of preserving the coefficient for rk; note that for d1(Q, Q') > 0, the resulting complexity bound would be super-logarithmic. The second measure d2(Q, Q') := Σ(a,b) (q(a,b) − q'(a,b)) · w(a, b) reflects the goal of achieving logarithmic bounds that are as small as possible. Weights are defined to penalise more complex terms, and to exclude constants. (Recall that 1 is representable as log2(0 + 2).) We set

$$w(a,b) := \begin{cases} 0, & \text{for } (a,b) = (0,2), \\ (a + (b+1)^2)^2, & \text{otherwise.} \end{cases}$$

The third measure d3(Q, Q') := q(0,2) − q'(0,2) reflects the goal of minimising constant cost. Lastly, we set d4(Q, Q') := Σ(a,b) q(a,b) in order to obtain small absolute numbers. The last measure does not influence bounds on the amortised cost, but leads to more beautiful solutions. These measures are then composed into the linear objective function min Σ_{i=1}^{4} d_i(Q, Q') · w_i. In our implementation, we set w_i = [16127, 997, 97, 2]; these weights are chosen (almost) arbitrarily, we only noticed that w1 must be sufficiently large to guarantee its priority. (We note that these weights were sufficient for our experiments; we refer to the literature for more principled ways of choosing the weights of an aggregated cost function [34].)

*Multiple Arguments.* For |Q| > 1, we set d1 := Σ_{i=1}^{|Q|} q_i − q'∗ and d2(Q, Q') := Σ_{(a,…,a,b)} (q(a,…,a,b) − q'(a,b)) · w(a, b). The required changes for d3 and d4 are straightforward. In our benchmarks, there is only one function (merge of pairing heaps) that requires this minimisation function.
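The weight function and the aggregated objective are simple to state in code. The following sketch is our own illustration for the |Q| = 1 case, treating the index a as a scalar; the function names are hypothetical.

```python
def w(a, b):
    """Weight of the logarithmic coefficient q_(a,b): the constant index
    (a, b) = (0, 2) is free, more complex terms are penalised quadratically."""
    if (a, b) == (0, 2):
        return 0
    return (a + (b + 1) ** 2) ** 2

def objective(d, weights=(16127, 997, 97, 2)):
    """Aggregate the four measures d1..d4 into one linear objective value."""
    return sum(di * wi for di, wi in zip(d, weights))
```

With these definitions, w(0, 2) = 0 excludes the representable constant 1 = log2(0 + 2), while e.g. w(1, 0) = 4 penalises the term log2(|t|); the dominant weight 16127 on d1 enforces the priority of preserving the rk coefficient.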

# **6 Evaluation**

We first describe the benchmark functions employed to evaluate ATLAS and then detail the experimental evaluation, whose main results were already depicted in Table 1.

### **6.1 Automated Analysis of Splaying et al.**

*Splay Trees.* Introduced by Sleator and Tarjan [47,49], *splay trees* are self-adjusting binary search trees with strictly increasing in-order traversal, but without an explicit balancing condition. Searching is performed by splaying with the sought element and comparing it to the root of the result. Similarly, insertion and deletion are based on splaying. Above, we used the zig-zig case of splaying, depicted in Fig. 1, as motivating code example. While the pen-and-paper analysis of this case is the most involved, type inference for this case alone did not directly yield the desired automation of the complete definition. Rather, full automation required substantial implementation effort, as detailed in Sect. 5. As already emphasised, it came as a surprise to us that our tool ATLAS is able to match and partly improve upon the sophisticated optimisations performed by Schoenmakers [41,42]. This seems to be evidence of the versatility of the employed potential functions. Further, we leverage the sophistication of our optimisation layer in conjunction with the power of state-of-the-art constraint solvers like Z3 [36].

*Splay Heaps.* To overcome deficiencies of splay trees when implemented functionally, Okasaki introduced *splay heaps*. Splay heaps are defined similarly to splay trees and their (manual) amortised cost analysis follows similar patterns as the one for splay trees. Due to the similarity in the definitions between splay heaps and splay trees, extension of our experimental results in this direction did not pose any problems. Notably, however, ATLAS improves the known complexity bounds on the amortised complexity for the functions studied. We also remark that typical assumptions made in pen-and-paper proofs are automatically discharged by our approach: Schoenmakers [41,42] as well as Nipkow and Brinkop [37] make use of the (obvious) fact that the size of the resulting tree t or heap h equals the size of the input. As discussed, this information is captured by a cost-free derivation, cf. Sect. 2.

*Pairing Heaps.* These are another implementation of heaps, represented as binary trees subject to the invariant that a heap is either leaf or its right child is leaf. The left child can be viewed as a list of pairing heaps. Schoenmakers and Nipkow et al. provide a (semi-)manual


**Table 2.** Experimental results

(a) Comparison of the number of constraints generated and time taken for the type inference of the core operation of each benchmark plus the zig-zig case of splay.


(b) Number of assertions, solving time and maximum memory usage (in mebibytes) for the combined analysis of functions per-module.

analysis of pairing heaps, which ATLAS can verify or even improve fully automatically. We note that we analyse a single function merge\_pairs, whereas [37] breaks down the analysis and studies two functions pass\_1 and pass\_2 with merge\_pairs = pass\_2 ∘ pass\_1. All definitions can be found at [33].

#### **6.2 Experimental Results**

Our main results have already been stated in Table 1 of Sect. 1. Table 2a compares the differences between the "naive automation" and our actual automation ("automated mode"), see Sect. 5. Within the latter, we distinguish between a "selective" and a "full" mode. The "selective" mode is as described on page 18. The "full" mode employs weakening for the same rule applications as the "selective" mode, but always with option w{mono l2xy}. The same applies to the "full" manual mode. The naive automation does not support selection of expert knowledge. Thus the "selective" option is not available, denoted as "n/a". Timeouts are denoted by "t/o". As depicted in the table, the naive automation does not terminate within 24 h for the core operations of the three considered data structures, whereas the improved automated mode produces optimised results within minutes. In Table 2b, we compare the (improved) automated mode with the manual mode, and report on the sizes of the resulting constraint system and on the resources required to produce the same results. Observe that even though our automated mode achieves reasonable solving times, there is still a significant gap between the manually crafted tactics and the automated mode, which invites future work.

# **7 Conclusion**

In this paper, we have for the first time been able to automatically conduct an amortised analysis of self-adjusting data structures. Our analysis is based on the "sum of logarithms" potential function, and we have been able to automate reasoning about these potential functions by using Farkas' Lemma for the linear part of the calculations and adding necessary facts about the logarithm. Immediate future work is concerned with replacing the "sum of logarithms" potential function in order to analyse skew heaps and Fibonacci heaps [42]. In particular, the potential function for skew heaps, which counts "right heavy" nodes, is interesting, because it is also used as a building block by Iacono in his improved analysis of pairing heaps [29,30]. Further, we envision extending our analysis to related probabilistic settings such as priority queues [13] and skip lists [40].

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Decision Procedures and Solvers**

# **Theory Exploration Powered by Deductive Synthesis**

Eytan Singher(B) and Shachar Itzhaky

Technion, Haifa, Israel {eytan.s,shachari}@cs.technion.ac.il

**Abstract.** This paper presents a symbolic method for automatic theorem generation based on deductive inference. Many software verification and reasoning tasks require proving complex logical properties; coping with this complexity is generally done by declaring and proving relevant sub-properties. This gives rise to the challenge of discovering useful sub-properties that can assist the automated proof process. This is known as the *theory exploration* problem, and the predominant solutions that have emerged so far rely on evaluation using concrete values. This limits the applicability of these theory exploration techniques to complex programs and properties.

In this work, we introduce a new symbolic technique for theory exploration, capable of (offline) generation of a library of lemmas from a base set of inductive data types and recursive definitions. Our approach introduces a new method for using abstraction to overcome the above limitations, combining it with deductive synthesis to reason about abstract values. Our implementation has been shown to find more lemmas than prior art, avoiding redundant lemmas (in terms of provability), while being faster in most cases. This new abstraction-based theory exploration method is a step toward applying theory exploration to software verification and synthesis.

**Keywords:** Theory exploration · Synthesis · Automatic theorem proving

# **1 Introduction**

Most forms of software verification and synthesis rely on some form of logical reasoning to complete their task. Whether it is checking pre- and post-conditions, deriving specifications for sub-problems [1,19], or equivalence reduction [39], these methods rely on assumptions from both the input and relevant background knowledge. Domain-specific knowledge can reinforce these methods, whether via the design of a domain-specific language [29,36,45], specialized decision procedures [28], or decomposing specifications [35]. While hand-crafted techniques can treat whole classes of programs, every library or module contributes a collection of new primitives, requiring tweaking or extending these methods. Automatic formation of background knowledge can enable effortless treatment of such libraries and programs.

In the context of verification tools, such as Dafny [27] and Leon [7], as well as interactive proof assistants, such as Coq [12] and Isabelle/HOL [33], background knowledge is typically given as a set of *lemmas*. Usually, these libraries of lemmas (*i.e.* the background knowledge) are created by human engineers and researchers who are tasked with formulating them and proving their correctness. When a proof or verification task requires an auxiliary lemma missing from the existing background knowledge, the user is required to add and prove it, sometimes repeating this process until the proof is trivial or can be found automatically. For example, both Dafny and Leon fail to prove that addition is associative and commutative from first principles, i.e. based on an algebraic construction of the natural numbers. However, when given knowledge of these properties (*i.e.* encoded as lemmas: (x + y) + z = x + (y + z) and x + y = y + x)<sup>1</sup>, they readily prove composite facts such as (x + 5) + y = 5 + (x + y).

A possible solution is to eagerly generate valid lemmas, and to do so automatically, offline, as a precursor to any work that would be built on top of the library. This paradigm is known as *theory exploration* [8,9], and differs from the common conjecture generation approach (in theorem provers and SMT solvers [37]) that is guided by a proof goal. As opposed to using proof goal as the basis for discovering sub-goals, when eagerly generating lemmas there is a vast space of possible lemmas to consider. Currently, two main approaches exist for filtering candidate conjectures, counterexample-based and observational equivalencebased [18,22,23,43]. These filtering techniques are all based on testing and therefore require automatic creation of concrete examples.

Testing with concrete values allows for fast evaluation and filtering of terms when the data types involved are simple. However, when scaling to larger data types and function types it becomes a bottleneck of the theory exploration process. Previous research effort has revealed that testing-based discovery is sensitive to the number and size of type definitions occurring in the code base. For example, QuickSpec, which is based on QuickCheck (as are all the existing testing-based theory exploration methods), employs a heuristic to restrict the set of types allowed in terms in order to make the checker's job easier. Compound data types such as lists can be nested up to two levels (lists of lists, but not lists of lists of lists). This presents an obstacle towards scaling the approach to real software libraries, since "*QuickCheck's size control interacts badly with deeply nested types* [...] *will generate extremely large test data.*" [38]

Following are two example scenarios that attempt to represent cases from software systems where structured data types and complicated APIs exist: (i) a series of tree data types Ti, where each Ti is a tree of height i with i children of type Ti−1, and the base case is an empty tree. Creating concrete examples for Ti will be resource-expensive, as each tree has O(i!) nodes, and each node requires a

<sup>1</sup> In fact, these properties are hard-wired into decision procedures for linear integer arithmetic in SMT solvers.

value. (ii) An ADT (algebraic data type) A with multiple fields, where each field can contain a large amount of text or other ADTs, and a function over A that accesses only one of the fields. Even if evaluating the function is fast, fully creating A is expensive and will impact the theory exploration run-time.
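The blow-up in scenario (i) is easy to make concrete. Under one reading of the definition (a root node with i children of type Ti−1; this recurrence is our assumption, chosen for illustration), the node count satisfies nodes(i) = 1 + i·nodes(i−1), which grows like i!:

```python
def nodes(i):
    """Number of nodes of the tree type T_i, assuming T_0 is empty and T_i
    consists of a root with i children of type T_(i-1):
    nodes(i) = 1 + i * nodes(i-1). The count is Theta(i!)."""
    return 0 if i == 0 else 1 + i * nodes(i - 1)

counts = [nodes(i) for i in range(6)]  # 0, 1, 3, 10, 41, 206
```

Already at i = 5 a single concrete example tree needs over two hundred nodes, each of which must additionally be populated with a value, illustrating why concrete test-data generation becomes a bottleneck.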

This paper presents a new symbolic theory exploration approach that takes advantage of the characteristics of induction-based proofs. To overcome the blow-up in the space of possible values, we make use of *symbolic values*, which contain interpreted symbols, uninterpreted symbols, or a mixture of the two. Conceptually, each symbolic value is an abstraction representing (infinitely) many possible values. This means that preexisting knowledge about the symbolic value can be applied without fully creating interpreted values. Still, when necessary, uninterpreted values can be expanded, creating larger symbolic values, thus refining the abstraction and facilitating the necessary computation. We focus on the formation of *equational* theories, that is, lemmas that state the equivalence of two terms, with universal quantification over all free variables.

We show that our symbolic method for theory exploration is applicable in more scenarios, and faster, than the state of the art. As an example, given standard definitions for the list functions ++, drop, take, and filter, our method proves facts that were not found by current state-of-the-art tools, such as:

> (take i *xs*) ++ (drop i *xs*) = *xs*
> filter p (*xs* ++ *ys*) = (filter p *xs*) ++ (filter p *ys*)

**Main Contributions.** This paper provides the following contributions:


### **2 Overview**

Our theory exploration method, named TheSy (Theory Synthesizer, pronounced *Tessy*), is based on syntax-guided enumerative synthesis. Similarly to previous approaches [10,20,38], TheSy generates a comprehensive set of terms from

**Fig. 1.** TheSy system overview: breakdown into phases, with feedback loop.

the given vocabulary and looks for pairs that seem equivalent. Notably, TheSy employs deductive reasoning based on term rewriting systems to propose these pairs by extrapolating from a set of known equalities, employing a relatively lightweight (but unsound) reasoning procedure. The proposed pairs are passed as equality conjectures to a theorem prover capable of reasoning by induction.

The process (as shown in Fig. 1) is separated into four stages. These stages work in an iterative deepening fashion and depend on each other's results. A short description of each is given here to help the reader place it in context later on.


The phases are run iteratively in a loop, where each iteration deepens the generated terms and, hence, the discovered lemmas. These lemmas are fed back to earlier phases; this form of feedback contributes to discovering more lemmas thanks to several factors:


$$\mathcal{V} = \left\{ \begin{array}{ll} [\,] & : \mathsf{list}\ T, \\ :: & : T \to \mathsf{list}\ T \to \mathsf{list}\ T, \\ ++ & : \mathsf{list}\ T \to \mathsf{list}\ T \to \mathsf{list}\ T, \\ \text{filter} & : (T \to \mathsf{bool}) \to \mathsf{list}\ T \to \mathsf{list}\ T \end{array} \right\} \qquad \mathcal{C} = \{ [\,],\ :: \}$$

$$\mathcal{E} = \left\{ \begin{array}{l} [\,] ++ l = l, \qquad (x :: xs) ++ l = x :: (xs ++ l), \\ \text{filter}\ p\ [\,] = [\,], \qquad \text{filter}\ p\ (x :: xs) = \text{if}\ p\ x\ \text{then}\ x :: \text{filter}\ p\ xs\ \text{else filter}\ p\ xs \end{array} \right\}$$

**Fig. 2.** An example input to TheSy.

*Running Example.* To illustrate TheSy's theory exploration procedure, we introduce a simple running example based on a list ADT. The input given to TheSy is shown in Fig. 2; it consists of a vocabulary V (of which C is a subset of ADT constructors) and a set of known equalities E. The vocabulary V contains the canonical list constructors [ ] and ::, and two basic list operations ++ (concatenate) and filter. The equalities E consist of the definitions of the latter two.

At a very high level, the following process is about to take place: TheSy generates symbolic terms representing length-bound lists, *e.g.*, [ ], [v1], [v2, v1]. Then, it will evaluate all combinations of function applications, up to a small depth, using these symbolic terms as arguments. If these evaluations yield common values for all possible assignments, the two application terms yielding them are conjectured to be equal. Since the evaluated expressions contain symbolic values, their result is a symbolic value. Comparing such symbolic values is done via congruence closure-based reasoning; we call this process *symbolic observational equivalence*, by way of analogy to observational equivalence [2] that is carried out using concrete values.

Out of the conjectures computed using symbolic observational equivalence, TheSy selects minimal ones according to a combined metric of compactness and generality. These are passed to a prover that employs both congruence closure and induction to verify the correctness of the lemmas for *all* possible list values.

Some lemmas that TheSy can discover this way are:

$$\begin{array}{ll} \text{filter}\ p\ (\text{filter}\ p\ l) = \text{filter}\ p\ l & l_1 ++ (l_2 ++ l_3) = (l_1 ++ l_2) ++ l_3 \\ \text{filter}\ p\ l_1 ++ \text{filter}\ p\ l_2 = \text{filter}\ p\ (l_1 ++ l_2) \end{array}$$

As briefly mentioned, our system design relies on congruence closure-based reasoning over universally quantified first-order formulas with uninterpreted functions. Congruence closure is weak but fast and constitutes one of the core procedures in SMT solvers [31,32]. On top of that, universally-quantified assumptions [4] are handled by formulating them as rewrite rules and applying some depth-bounded term rewriting as described in Subsect. 3.1. Additionally, TheSy implements a simple case splitting mechanism that enables reasoning on conditional expressions. Notably, this procedure *cannot* reason about recursive definitions since such reasoning routinely requires the use of induction. To that end, TheSy is geared towards discovering lemmas that can be proven by induction; a lemma is considered useful if it cannot be proven from existing lemmas by congruence closure alone, that is, without induction. Discovering such lemmas and adding them to the background knowledge evidently increases the reasoning power of the prover, since at least the fact of their own validity becomes provable, which it was not before.

# **3 Preliminaries**

This work relies heavily on term rewriting techniques, which are employed across multiple phases of the exploration. Term rewriting is implemented efficiently using equality graphs (e-graphs). In this section, we present some minimal background on both, which will be relevant for the exploration procedure described later.

# **3.1 Term Rewriting Systems**

Consider a formal language L of terms over some vocabulary of symbols. We use the notation R = t<sub>1</sub> → t<sub>2</sub> to denote a rewrite rule from t<sub>1</sub> to t<sub>2</sub>. For a (universally quantified) semantic equality law t<sub>1</sub> = t<sub>2</sub>, we would normally create *both* t<sub>1</sub> → t<sub>2</sub> and t<sub>2</sub> → t<sub>1</sub>. We refrain from assigning a direction to equalities since we do not wish to restrict the procedure to strongly normalizing systems, as is traditionally done in frameworks based on the Knuth-Bendix algorithm [24]. Instead, we deem two terms equivalent when a sequence of rewrites can identify them in either direction. A small caveat involves situations where FV(t<sub>1</sub>) ≠ FV(t<sub>2</sub>), that is, one side of the equality contains variables that do not occur on the other. We choose to admit only rules t<sub>i</sub> → t<sub>j</sub> where FV(t<sub>i</sub>) ⊇ FV(t<sub>j</sub>), because when FV(t<sub>i</sub>) ⊂ FV(t<sub>j</sub>), applying the rewrite would have to create new symbols for the unassigned variables in t<sub>j</sub>, which results in a large growth in the number of symbols and typically makes rewrites much slower as a result.
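The free-variable side condition above is easy to operationalize. A small sketch, with terms encoded as nested tuples (symbol, arg₁, …, argₙ) and variables as strings starting with `?` (an encoding we choose for illustration):

```python
def free_vars(term):
    """Free variables of a term; variables are strings starting with '?'."""
    if isinstance(term, str):
        return {term} if term.startswith("?") else set()
    vs = set()
    for arg in term[1:]:  # term[0] is the head symbol
        vs |= free_vars(arg)
    return vs

def admissible_rules(lhs, rhs):
    """Orientations of the equality lhs = rhs kept as rewrite rules:
    a rule t_i -> t_j is admitted only when FV(t_i) is a superset of FV(t_j)."""
    rules = []
    if free_vars(lhs) >= free_vars(rhs):
        rules.append((lhs, rhs))
    if free_vars(rhs) >= free_vars(lhs):
        rules.append((rhs, lhs))
    return rules
```

For `[] ++ l = l` both orientations are admitted (the free variables coincide), whereas an equality like `mul(zero, ?x) = zero` is only oriented left-to-right, since the reverse direction would have to invent a value for `?x`.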

This slight asymmetry is what motivates the following definitions.

**Definition 1.** *Given a rewrite rule* R = t<sub>1</sub> → t<sub>2</sub>, *we define a corresponding relation* $\xrightarrow{R}$ *such that* s<sub>1</sub> $\xrightarrow{R}$ s<sub>2</sub> ⟺ s<sub>1</sub> = C[t<sub>1</sub>σ] ∧ s<sub>2</sub> = C[t<sub>2</sub>σ] *for some context* C *and substitution* σ *for the free variables of* t<sub>1</sub>, t<sub>2</sub>. *(A* context *is a term with a single hole, and* C[t] *denotes the term obtained by filling the hole with* t.*)*

**Definition 2.** *Given a relation* <sup>R</sup> −→ *we define its symmetric closure:*

$$t_1 \xleftrightarrow{R} t_2 \iff t_1 \xrightarrow{R} t_2 \,\lor\, t_2 \xrightarrow{R} t_1$$

**Definition 3.** *Given a set of rewrite rules* {R<sub>i</sub>}, *we define a relation as the union of the relations of the individual rewrites:* $\xleftrightarrow{\{R_i\}} = \bigcup_i \xleftrightarrow{R_i}$.

*In the sequel, we will mostly use its reflexive transitive closure,* ${\xleftrightarrow{\{R_i\}}}{}^{*}$.

**Fig. 3.** An e-graph representing the expression filter *p* (*l*<sup>1</sup> ++ *l*2) (dark) and the equivalent expression filter *p l*<sup>1</sup> ++ filter *p l*<sup>2</sup> (light).

The relation ${\xleftrightarrow{\{R_i\}}}{}^{*}$ is reflexive, transitive, and symmetric, so it is an equivalence relation over L. Under the assumption that all rewrite rules in {R<sub>i</sub>} are semantics preserving, for any equivalence class $[t] \in L/{\xleftrightarrow{\{R_i\}}}{}^{*}$, all terms belonging to [t] are semantically equal. However, since L may be infinite, it is essentially impossible to compute ${\xleftrightarrow{\{R_i\}}}{}^{*}$ in full. Any algorithm can only explore a finite subset T ⊆ L, and in turn, construct a subset of ${\xleftrightarrow{\{R_i\}}}{}^{*}$.

#### **3.2 Compact Representation Using Equality Graphs**

In order to be able to cover a large set of terms T, we need a compact data structure that can efficiently represent many terms. Normally, terms are represented by their ASTs (Abstract Syntax Trees), but as there would be many instances of common subterms among the terms of T, this would be highly inefficient. Instead, we adopt the concept of equality graphs (e-graphs) from automated theorem proving [15], which has also seen use in compiler optimizations and program synthesis [30,34,41], in which context they are known as Program Expression Graphs (PEGs). An e-graph is essentially a hypergraph where each vertex represents a set of equivalent terms (programs), and labeled, directed hyperedges represent function applications. Hyperedges therefore have exactly one target and zero or more sources, which form an ordered multiset (essentially, a vector). To illustrate, the expression filter *p* (*l*<sub>1</sub> ++ *l*<sub>2</sub>) is represented by the nodes and edges shown in dark in Fig. 3. The nullary edges represent the constant symbols (*p*, *l*<sub>1</sub>, *l*<sub>2</sub>), and the node u<sub>0</sub> represents the entire term. The expression filter *p* *l*<sub>1</sub> ++ filter *p* *l*<sub>2</sub>, which is equivalent, is represented by the light nodes and edges, and the equivalence is captured by sharing of the node u<sub>0</sub>.

When used in combination with a rewrite system {R<sub>i</sub>}, each rewrite rule is represented as a premise pattern P and a conclusion pattern C. Applying a rewrite rule is then reduced to searching the e-graph for the premise pattern P and obtaining a substitution σ for the free variables of P. The result term is then obtained by substituting the free variables of C using σ. This term is added to the same equivalence class as the matched term (*i.e.* Pσ), meaning they will both have the same root node. Consequently, a single node can represent a set of terms exponentially large in the number of edges, all of which will always be equivalent modulo ${\xleftrightarrow{\{R_i\}}}{}^{*}$.

In addition, since hyperedges always represent functions, a situation may arise in which two vertices represent the same term: this happens if two edges $\bar{u} \xrightarrow{f} v_1$ and $\bar{u} \xrightarrow{f} v_2$ are introduced by {R<sub>i</sub>} for v<sub>1</sub> ≠ v<sub>2</sub>. In a purely functional setting, this means that v<sub>1</sub> and v<sub>2</sub> are equal. Therefore, when such duplication is found, it is beneficial to *merge* v<sub>1</sub> and v<sub>2</sub>, eliminating the duplicate hyperedge. The e-graph data structure therefore supports a vertex merge operation and a congruence closure-based transformation [44] that finds vertices eligible for merge, to keep the overall graph size small. This procedure can be quite expensive, so it is only run periodically.
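The merge-plus-congruence behavior can be sketched with a miniature e-graph. This is a toy, far simpler than the *egg* library used by TheSy: hash-consed nodes over a union-find, with a naive rebuild pass that restores congruence after merges.

```python
class EGraph:
    """A toy e-graph: hash-consed nodes over a union-find, with a naive
    rebuild pass that restores congruence after merges."""
    def __init__(self):
        self.parent = {}  # union-find over e-class ids
        self.table = {}   # (op, canonical child ids) -> e-class id

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def add(self, op, *children):
        key = (op, tuple(self.find(c) for c in children))
        if key in self.table:
            return self.find(self.table[key])
        cid = len(self.parent)
        self.parent[cid] = cid
        self.table[key] = cid
        return cid

    def merge(self, a, b):
        a, b = self.find(a), self.find(b)
        if a != b:
            self.parent[b] = a
            self.rebuild()

    def rebuild(self):
        # Re-canonicalize every node; nodes that become identical are
        # congruent, so their classes are merged; repeat to a fixpoint.
        changed = True
        while changed:
            changed = False
            new_table = {}
            for (op, ch), cid in self.table.items():
                key = (op, tuple(self.find(c) for c in ch))
                cid = self.find(cid)
                if key in new_table:
                    other = self.find(new_table[key])
                    if other != cid:
                        self.parent[cid] = other
                        changed = True
                else:
                    new_table[key] = cid
            self.table = new_table
```

After `merge(a, b)`, the rebuild merges `f(a)` with `f(b)`, then `g(f(a))` with `g(f(b))`, and so on upward, exactly the congruence propagation described above.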

# **4 Theory Synthesis**

In this section, we go into a more detailed description of the phases of theory synthesis and explain how they are combined within an iterative deepening loop. To simplify the presentation, we describe all the phases first, then explain how the output from the last phase is fed back to the next iteration to complete a feedback loop. We continue with the input from the running example in Sect. 2 (Fig. 2) and dive deeper by showing intermediate states encountered during the execution of TheSy on this input. Throughout the execution, TheSy maintains a state, consisting of the following elements:


### **4.1 Term Generation**

The first step is to generate a set of terms over the vocabulary V. For the purpose of generating universally-quantified conjectures, we introduce a set of uninterpreted symbols, which we will call *placeholders*. Let TY be the set of types occurring as the type of some argument of a function symbol in V. For each type τ ∈ TY we generate placeholders ◦<sub>1</sub><sup>τ</sup> and ◦<sub>2</sub><sup>τ</sup>, two for each type (we will explain later why two are enough). These placeholders, together with all the symbols in V, constitute the terms at depth 0.

At every iteration of deepening, TheSy uses the set of terms generated so far, and the (non-nullary) symbols of V, to form new terms by placing existing ones in argument positions. For example, with the definitions from Fig. 2, we will have terms such as these at depths 1 and 2:

$$\begin{array}{ll} 1 & \text{filter}\ \circ_1^{T\to\text{bool}}\ \circ_1^{\text{list}\ T} \qquad \circ_1^{\text{list}\ T} ++ \circ_2^{\text{list}\ T} \qquad \cdots \\[2pt] 2 & [\,] ++ \text{filter}\ \circ_1^{T\to\text{bool}}\ \circ_1^{\text{list}\ T} \qquad \circ_1^{\text{list}\ T} ++ (\text{filter}\ \circ_1^{T\to\text{bool}}\ \circ_2^{\text{list}\ T}) \\ & (\text{filter}\ \circ_1^{T\to\text{bool}}\ \circ_1^{\text{list}\ T}) ++ \circ_2^{\text{list}\ T} \qquad (\text{filter}\ \circ_1^{T\to\text{bool}}\ \circ_1^{\text{list}\ T}) ++ (\text{filter}\ \circ_1^{T\to\text{bool}}\ \circ_2^{\text{list}\ T}) \qquad \cdots \end{array} \tag{1}$$

It is easy to see that filter ◦<sub>1</sub><sup>T→bool</sup> ◦<sub>1</sub><sup>list T</sup> and [ ] ++ filter ◦<sub>1</sub><sup>T→bool</sup> ◦<sub>1</sub><sup>list T</sup> are equivalent in any context; this follows directly from the definition of ++, available as part of E. It is therefore acceptable to discard one of them without affecting completeness. TheSy does not discard terms—since they are merged in the e-graph, there is no need to—rather, it chooses the smaller term as representative when it needs one. This sort of *equivalence reduction* is present, in some way or another, in many automated reasoning and synthesis tools.

To formalize the procedure of generating and comparing the terms, in an attempt to discover new equality conjectures, we introduce the concept of *Syntax Guided Enumeration* (SyGuE). SyGuE is similar to Syntax Guided Synthesis (SyGuS for short [3]) in that they both use a formal definition of a language to find program terms solving a problem. They differ in the problem definition: while SyGuS is defined as a search for a correct program over the well-formed programs in the language, SyGuE is the sub-problem of iterating over *all distinct* programs in the language. SyGuS solvers may be improved using a smart search algorithm, while SyGuE solvers need an efficient way to eliminate duplicate terms, which may depend on the definition of program equivalence. We implement our variant of SyGuE, over the equivalence relation ${\xleftrightarrow{\{R_i\}}}{}^{*}$, using the aforementioned e-graph: by applying and re-applying rewrite rules, provably equivalent terms are naturally *merged* into hyper-vertices, representing equivalence classes.
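A naive SyGuE enumerator can be sketched as follows, with purely syntactic deduplication via sets (TheSy instead merges provably equivalent terms in the e-graph, so its per-depth frontier is smaller). The signature and placeholder names are illustrative:

```python
from itertools import product

# Symbols: name -> (argument types, result type); depth-0 terms are placeholders.
SIG = {"++": (("list", "list"), "list"),
       "filter": (("pred", "list"), "list")}
PLACEHOLDERS = {"list": ("l1", "l2"), "pred": ("p1", "p2")}

def terms_up_to(depth):
    """Enumerate all syntactically distinct terms (as nested tuples),
    grouped by type, up to the given application depth."""
    by_type = {t: set(vs) for t, vs in PLACEHOLDERS.items()}
    for _ in range(depth):
        new = {t: set(s) for t, s in by_type.items()}
        for op, (args, res) in SIG.items():
            # Place any existing terms of matching types in argument positions.
            for combo in product(*(by_type[a] for a in args)):
                new[res].add((op,) + combo)
        by_type = new
    return by_type
```

With the two list operations and two placeholders per type, depth 1 already yields 10 distinct list-typed terms and depth 2 yields 122, illustrating why duplicate elimination modulo the rewrite rules (rather than mere syntactic dedup) matters.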

#### **4.2 Conjecture Inference and Screening**

Of course, in order to discover *new* conjectures, we cannot rely solely on term rewriting based on E. To find more equivalent terms, TheSy carries on to generate a second set of terms, called *symbolic examples*, this time using only the constructors C ⊂ V and uninterpreted symbols for leaves. This set is denoted S<sub>τ</sub>, where τ is an algebraic datatype participating in V (if several such datatypes are present, one S<sub>τ</sub> per type is constructed). The depth of the symbolic examples (i.e. depth of applied constructors) is also bounded, but it is independent of the current term depth and does not increase during execution. For example, using the constructors of list T with an example depth of 2, we obtain the symbolic examples S<sub>list T</sub> = {[ ], v<sub>1</sub>::[ ], v<sub>2</sub>::v<sub>1</sub>::[ ]}, corresponding to lists of length up to 2 having arbitrary element values. Intuitively, if two terms are equivalent for all possible assignments of symbolic examples to ◦<sub>i</sub><sup>list T</sup>, then we are going to *hypothesize* that they are equivalent for all list values. This process is very similar to observational equivalence as used by program synthesis tools [2,42], but since it uses symbolic value terms instead of concrete values, we dub it *symbolic observational equivalence* (SOE).
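Generating S<sub>list T</sub> is straightforward. A sketch, with terms as nested tuples over `"nil"`/`"cons"` constructors and uninterpreted element values v1, v2, … (an encoding we choose for illustration):

```python
def symbolic_examples(depth):
    """S_{list T} up to a given constructor depth:
    [], v1::[], v2::v1::[], ... with uninterpreted element values v_i."""
    examples = [("nil",)]
    for i in range(1, depth + 1):
        # Cons a fresh uninterpreted value onto the previous example.
        examples.append(("cons", f"v{i}", examples[-1]))
    return examples
```

Each example stands for infinitely many concrete lists of the corresponding length, which is what lets SOE avoid enumerating element values.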

Consider, for example, the simple terms ◦<sub>1</sub><sup>list T</sup> and ◦<sub>1</sub><sup>list T</sup> ++ [ ]. In placeholder form, none of the rewrite rules derived from E applies, so it cannot be determined that these terms are, in fact, equivalent. However, with the symbolic list examples above, the following rewrites are enabled:

$$[\,] ++ [\,]\ {\xleftrightarrow{\{R_i\}}}{}^{*}\ [\,] \qquad v_1::[\,] ++ [\,]\ {\xleftrightarrow{\{R_i\}}}{}^{*}\ v_1::[\,] \qquad v_2::v_1::[\,] ++ [\,]\ {\xleftrightarrow{\{R_i\}}}{}^{*}\ v_2::v_1::[\,]$$

A similar case can be made for the two bottom terms in (1). For symbolic values l<sub>1</sub>, l<sub>2</sub> ∈ S<sub>list T</sub>, it can be shown that

$$\text{filter}\ \circ_1^{T\to\text{bool}}\ (l_1 ++ l_2)\ {\xleftrightarrow{\{R_i\}}}{}^{*}\ (\text{filter}\ \circ_1^{T\to\text{bool}}\ l_1) ++ (\text{filter}\ \circ_1^{T\to\text{bool}}\ l_2)$$

In fact, it is sufficient to substitute symbolic examples for ◦<sub>1</sub><sup>list T</sup>, while *leaving* ◦<sub>2</sub><sup>list T</sup> *alone, uninterpreted*: e.g., filter ◦<sub>1</sub><sup>T→bool</sup> ([ ] ++ ◦<sub>2</sub><sup>list T</sup>) ${\xleftrightarrow{\{R_i\}}}{}^{*}$ (filter ◦<sub>1</sub><sup>T→bool</sup> [ ]) ++ (filter ◦<sub>1</sub><sup>T→bool</sup> ◦<sub>2</sub><sup>list T</sup>). This reduces the number of equivalence checks significantly, and is more than a mere heuristic: since we are going to rely on a prover that proceeds by applying induction to one of the arguments, it makes perfect sense to only bound that argument. If computation is blocked on the second argument, we would prefer to infer an auxiliary lemma first, then use it to discover the blocked lemma later. See Example 1 below for an idea of when this situation arises.

The attentive reader may notice that the cases of v<sub>1</sub>::[ ] and v<sub>2</sub>::v<sub>1</sub>::[ ] are a bit more involved: to proceed with the rewrite of filter, the expressions ◦<sub>1</sub><sup>T→bool</sup> v<sub>1</sub> and ◦<sub>1</sub><sup>T→bool</sup> v<sub>2</sub> must be resolved to either *true* or *false*. However, the predicate ◦<sub>1</sub><sup>T→bool</sup> as well as the arguments v<sub>1</sub>, v<sub>2</sub> are uninterpreted. In this case, TheSy is required to perform a *case split* in order to enable the rewrites, and to unify the symbolic terms separately in each of the resulting four (2<sup>2</sup>) cases. Notice that leaving ◦<sub>1</sub><sup>T→bool</sup> uninterpreted means that cases are only split when evaluation is blocked by one or more rewrite rule applications, potentially saving some branching. The following steps are then carried out for each case.

TheSy applies all the available rewrite rules to the entire e-graph, containing all the terms and symbolic examples. For every two terms t<sub>1</sub>, t<sub>2</sub> such that, for all viable substitutions σ of placeholders to symbolic examples of the corresponding types, t<sub>1</sub>σ and t<sub>2</sub>σ were shown equal—that is, ended up in the same equivalence class of the e-graph—the conjecture t<sub>1</sub> ?= t<sub>2</sub> is emitted. *E.g.*, in the case of the running example:

$$\text{filter}\ \circ_1^{T\to\text{bool}}\ (\circ_1^{\text{list}\ T} ++ \circ_2^{\text{list}\ T}) \overset{?}{=} (\text{filter}\ \circ_1^{T\to\text{bool}}\ \circ_1^{\text{list}\ T}) ++ (\text{filter}\ \circ_1^{T\to\text{bool}}\ \circ_2^{\text{list}\ T})$$

In the presence of multiple cases, the results are intersected, so that a conjecture is emitted only if it follows from all the cases.

**Screening.** Generating all the pairs according to the above criteria potentially creates many "obvious" equalities, which are valid propositions, but do not contribute to the overall knowledge and just clutter the prover's state. For example, 

$$\text{filter}\ \circ_1^{T\to\text{bool}}\ (\circ_1^{\text{list}\ T} ++ \circ_2^{\text{list}\ T}) \overset{?}{=} \text{filter}\ \circ_1^{T\to\text{bool}}\ (\circ_1^{\text{list}\ T} ++ ([\,] ++ \circ_2^{\text{list}\ T}))$$

which follows from the definition of ++ and has nothing to do with filter. The synthesizer avoids generating such candidates, by choosing at most one term from every equivalence class of placeholder-form terms induced during the term generation phase. If both sides of the equality conjecture belong to the same equivalence class, the conjecture is dropped altogether.

The conjectures that remain are those equalities t<sub>1</sub> ?= t<sub>2</sub> where t<sub>1</sub> and t<sub>2</sub> were merged for all the assignments of S<sub>τ</sub> to some ◦<sub>1</sub><sup>τ</sup>, and, furthermore, t<sub>1</sub> and t<sub>2</sub> themselves *were not* merged in placeholder form, prior to substitution. Such conjectures, if true, are guaranteed to increase the knowledge represented by E, as (at least) the equality t<sub>1</sub> = t<sub>2</sub> was not previously provable using term rewriting and congruence closure.

#### **4.3 Induction Prover**

For practical reasons, the prover employs the following induction tactic:


The reasoning behind this design choice is that for every multi-variable term, *e.g.* ◦<sub>1</sub><sup>list T</sup> ++ ◦<sub>2</sub><sup>list T</sup>, the synthesizer also generates the symmetric counterpart ◦<sub>2</sub><sup>list T</sup> ++ ◦<sub>1</sub><sup>list T</sup>. So electing to perform induction on ◦<sub>1</sub><sup>list T</sup> does not impede generality.

In addition, if more than one level of induction is needed, the proof can (almost) always be revised by factoring out the inner induction as an auxiliary lemma. Since the synthesizer produces *all* candidate equalities, that inner lemma will also be discovered and proved with one level of induction. Lemmas so proven are added to E and are available to the prover, so that multiple passes over the candidates can gradually grow the set of provable equalities.

When starting a proof, the prover never needs to look at the base case, as this case has already been checked during conjecture inference. Recall that placeholders ◦<sub>1</sub><sup>τ</sup> are instantiated with bounded-depth expressions using the constructors of τ, and these include all base cases (non-recursive constructors) by default. For the example discussed above, the case of filter ◦<sub>1</sub><sup>T→bool</sup> ([ ] ++ ◦<sub>2</sub><sup>list T</sup>) = (filter ◦<sub>1</sub><sup>T→bool</sup> [ ]) ++ (filter ◦<sub>1</sub><sup>T→bool</sup> ◦<sub>2</sub><sup>list T</sup>) has been discharged early on; otherwise, the conjecture would not have been emitted. The prover then turns to the induction step, which is fairly routine but is included in Fig. 4 for completeness of the presentation.

It is worth noting that the conjecture inference, screening and induction phases utilize a common reasoning core based on rewriting and congruence closure. In situations where the definitions include conditions such as match p x in Fig. 4 (in this case, desugared from if p x), the prover also performs automatic case split and distributes equalities over the branches. Details and specific optimizations are described in Sect. 5.

**Fig. 4.** Example proof by induction based on congruence closure and case splitting.

*Speculative Generalization.* When the prover receives a conjecture with multiple occurrences of a placeholder, *e.g.* ◦<sub>1</sub><sup>list T</sup> ++ (◦<sub>2</sub><sup>list T</sup> ++ ◦<sub>1</sub><sup>list T</sup>) ?= (◦<sub>1</sub><sup>list T</sup> ++ ◦<sub>2</sub><sup>list T</sup>) ++ ◦<sub>1</sub><sup>list T</sup>, it first speculates a more general form by replacing the multiple occurrences with fresh placeholders. Recall that in Subsect. 4.1 we argued that two placeholders of each type are sufficient; this is the mechanism that enables it. There is more than one way to generalize a given conjecture: for this example, there are two ways (up to alpha-renaming):

$$\circ_1^{\text{list}\ T} ++ (\circ_2^{\text{list}\ T} ++ \circ_3^{\text{list}\ T}) \overset{?}{=} (\circ_1^{\text{list}\ T} ++ \circ_2^{\text{list}\ T}) ++ \circ_3^{\text{list}\ T} \qquad \circ_1^{\text{list}\ T} ++ (\circ_2^{\text{list}\ T} ++ \circ_3^{\text{list}\ T}) \overset{?}{=} (\circ_3^{\text{list}\ T} ++ \circ_2^{\text{list}\ T}) ++ \circ_1^{\text{list}\ T}$$

The prover must attempt both; failing that, it falls back to the original conjecture. Formally, given an equality conjecture s = t, we can consider generalized terms r, q and an assignment σ such that s = rσ and t = qσ, where the original conjecture corresponds to an assignment using only two placeholders per type. The prover thus iterates through the generalizations r ?= q induced by assignments σ<sub>i</sub> with more placeholders per type, and attempts to prove each in turn. This incurs more work for the prover, but is well worth its cost compared to a priori generation of terms with three placeholders.
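One way to enumerate these generalizations is to pair each occurrence of the repeated placeholder on the left-hand side with one on the right-hand side, introducing a fresh placeholder per pair; this guarantees every fresh placeholder occurs on both sides, as the resulting rewrite rules require. The sketch below (terms as nested tuples; it assumes equally many occurrences on each side, and the matching strategy is our reconstruction, not necessarily TheSy's exact algorithm) reproduces the two generalizations above:

```python
from itertools import permutations

def occurrences(term, var, path=()):
    """Paths (argument-index sequences) at which `var` occurs in `term`."""
    if term == var:
        return [path]
    if isinstance(term, tuple):
        out = []
        for i, arg in enumerate(term[1:], start=1):
            out += occurrences(arg, var, path + (i,))
        return out
    return []

def replace(term, path, new):
    """Replace the subterm of `term` at `path` with `new`."""
    if not path:
        return new
    i = path[0]
    return term[:i] + (replace(term[i], path[1:], new),) + term[i + 1:]

def generalizations(lhs, rhs, var):
    """Distinct generalizations obtained by matching left and right
    occurrences of `var` and introducing a fresh placeholder per pair."""
    locc, rocc = occurrences(lhs, var), occurrences(rhs, var)
    seen = []
    for perm in permutations(range(len(rocc))):
        gl, gr = lhs, rhs
        for k, j in enumerate(perm):
            fresh = f"x{k}"
            gl = replace(gl, locc[k], fresh)
            gr = replace(gr, rocc[j], fresh)
        if (gl, gr) not in seen:
            seen.append((gl, gr))
    return seen
```

For a ++ (b ++ a) ?= (a ++ b) ++ a this yields exactly the two candidates: associativity, and the variant with the outer placeholders swapped on the right.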

#### **4.4 Looping Back**

The equations obtained from Subsect. 4.3 are fed back in four different but interrelated ways. The first, inner feedback loop is from the induction prover to itself: the system attempts to prove the smaller lemmas first, so that when proving the larger ones, these are already available as part of E. This enables more proofs to go through. The second feedback loop uses the lemmas obtained to filter out conjectures whose proofs are no longer needed. The third, outer loop is more interesting: as equalities are made into rewrite rules, additional equations may now pass the inference phase, since the symbolic evaluation core can equate more terms based on this additional knowledge. The fourth complements the third: applying the new rewrite rules acts as an equivalence reduction mechanism, reducing the number of hyperedges added to the e-graph during term generation.

It is worth noting that concrete observational equivalence uses a trivially simple equivalence checking mechanism, with the trade-off that it may generate many incorrect equalities. Our *symbolic* observational equivalence, in contrast, is conservative: a symbolic value may represent infinitely many concrete inputs, and two terms are marked as equivalent only if the synthesizer can *prove*, by constructing a small proof, that they evaluate to equal values on *all* of them. This means that some actually-equivalent terms may be "blocked" by the inference phase, which cannot happen when using concrete values; but it also means that additional inference rules (E) can improve this equivalence checking, potentially leading to more discovered lemmas. This property of TheSy is appealing because it allows an explored theory to evolve from basic lemmas to more complex ones.

*Example 1 (Lemma seeding).* To understand this last point, consider the standard definition of list reversal for the list datatype:

$$\begin{array}{l} \text{rev}\ [\,] = [\,] \\ \text{rev}\ (x :: xs) = \text{rev}\ xs ++ (x :: [\,]) \end{array}$$
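Transcribed directly into executable form (a minimal sketch; Python lists stand in for the list ADT), together with a concrete instance of the lemma discussed next, rev (xs ++ ys) = rev ys ++ rev xs:

```python
def rev(xs):
    """rev [] = [];  rev (x::xs) = rev xs ++ (x::[])."""
    if not xs:
        return []
    return rev(xs[1:]) + [xs[0]]

# A concrete instance of the distributivity lemma over ++:
xs, ys = [1, 2], [3, 4]
print(rev(xs + ys) == rev(ys) + rev(xs))  # True
```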

Given the terms t<sub>1</sub> = rev (◦<sub>1</sub><sup>list T</sup> ++ ◦<sub>2</sub><sup>list T</sup>) and t<sub>2</sub> = rev ◦<sub>2</sub><sup>list T</sup> ++ rev ◦<sub>1</sub><sup>list T</sup>, symbolic observational equivalence with the assignments {◦<sub>1</sub><sup>list T</sup> ↦ S<sub>list T</sub>} fails to unify them. This is due to ++ being defined by induction on its first argument; hence, *e.g.*:

$$\begin{array}{rcl} \text{rev}\,(v_2 :: v_1 :: [\,] ++ \circ_2^{\text{list}\ T}) & \rightarrow^{*} & (\text{rev}\ \circ_2^{\text{list}\ T} ++ (v_1 :: [\,])) ++ (v_2 :: [\,]) \\ \text{rev}\ \circ_2^{\text{list}\ T} ++ \text{rev}\,(v_2 :: v_1 :: [\,]) & \rightarrow^{*} & \text{rev}\ \circ_2^{\text{list}\ T} ++ (v_1 :: v_2 :: [\,]) \end{array}$$

Without the associativity property of ++, it would not be possible to show that these symbolic values are equivalent, so the conjecture t<sub>1</sub> ?= t<sub>2</sub> will not even be generated. Luckily, having proven ◦<sub>1</sub><sup>list T</sup> ++ (◦<sub>2</sub><sup>list T</sup> ++ ◦<sub>3</sub><sup>list T</sup>) = (◦<sub>1</sub><sup>list T</sup> ++ ◦<sub>2</sub><sup>list T</sup>) ++ ◦<sub>3</sub><sup>list T</sup>, these rewrites are "unblocked", so that the equality can be conjectured and ultimately proven.

One caveat is that whenever E is updated by the addition of a new lemma, some of the previously emitted conjectures may consequently become redundant. Moreover, conjectures that were passed to the prover before but failed validation may now succeed, and new ones may be emitted in the generation phase. To take these into account, the actual loop performed by TheSy is a bit more involved than has been described so far. For each term depth, TheSy performs all phases as described, but each time a lemma is discovered TheSy re-runs the conjecture generation, screening, and prover phases. Only when no more conjectures are available does TheSy increase the term depth and generate new terms.

# **5 Evaluation**

We implemented TheSy in Rust, using the e-graph manipulation library *egg* [44]. TheSy accepts definitions in SMTLIB-2.6 format [6], based on the UF theory (uninterpreted functions), limited to universal quantifications. Type declarations occurring in the input are collected and comprise V; universal equalities form E and are translated into rewrite rules (either uni- or bidirectional, as explained in Subsect. 3.1). Then SyGuE is performed on V, generating candidate conjectures using SOE. SyGuE uses *egg* for equivalence reduction, and SOE uses it for comparing symbolic values. Conjectures are then either dismissed or proven using TheSy's induction-based prover. This is done in an iterative deepening loop.

*Case Split.* Both SOE and the prover use a case splitting mechanism. This mechanism detects when rewriting cannot match due to an opaque value (an uninterpreted symbol), and applies case splitting according to the constructors of the relevant ADTs. However, doing so for every rule is too costly and, in most cases, redundant—TheSy generates a variety of terms, so if one term is blocked due to an uninterpreted symbol, another one exists with a symbolic example instead. A situation where this is *not* the case is when *multiple* uninterpreted symbols block the rewrite (recall that TheSy only substitutes one placeholder per term with symbolic examples). To illustrate, consider the case in Fig. 4 where both the list x :: xs and the value p x are used in match expressions; therefore, a case split over p x ∈ {*true*, *false*} is needed. Accordingly, TheSy only performs case splitting for rewrite rules that require multiple match patterns of which only one is blocked.

The splitting mechanism itself operates by copying the e-graph and applying the term rewriting logic separately for each case. Each copy then yields a partition of the existing equivalence classes. These partitions are intersected between all cases, and each of the resulting intersections leads to merging of equivalence classes in the original e-graph. It is worth noting that TheSy never needs to backtrack a case split it has elected to apply. As a consequence, execution time is not exponential in the total number of case splits performed, only in the nesting level of such splits (which is bounded by 2 in our experiments).
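The partition-intersection step can be sketched as follows (a deliberate simplification: e-classes are plain ids, and each branch reports which classes it merged; the encoding is ours, not TheSy's internal representation). Two classes are merged in the original e-graph only if *every* branch merged them, i.e. we take the common refinement of the branch partitions:

```python
def intersect_partitions(classes, branch_partitions):
    """classes: e-class ids in the original e-graph.
    branch_partitions: for each case-split branch, a partition (list of
    sets) of the classes that were merged in that branch.
    Returns the groups that may be merged in the original e-graph."""
    def signature(c):
        sig = []
        for partition in branch_partitions:
            idx = next((i for i, grp in enumerate(partition) if c in grp), None)
            # A class untouched by a branch stays in its own singleton group.
            sig.append(idx if idx is not None else ("solo", c))
        return tuple(sig)

    groups = {}
    for c in classes:
        groups.setdefault(signature(c), []).append(c)
    return [grp for grp in groups.values() if len(grp) > 1]
```

For instance, if the *true* branch merges classes {1, 2, 3} but the *false* branch only merges {1, 2} (and separately {3, 4}), then only 1 and 2 are merged in the original e-graph.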

We compare TheSy to the most recent and closely related theory exploration system, Hipster [23], which is based on random testing (backed by QuickSpec [38]) with proof automation from, and a frontend in, Isabelle/HOL [33]. Hipster represents the culmination of several existing works on theory exploration (see Sect. 6). Both systems generate a set of proved lemmas as output, each such set encompassing a conceptual volume of knowledge that was discovered automatically. We note that the same knowledge can be represented in various ways, so directly comparing the sets of lemmas would be meaningless.

#### **5.1 Evaluating Theory Exploration Quality**

We define a comparison method for two theory exploration systems A and B starting from a common initial theory T (defined as a set of closed formulas). As a metric for the quality and efficacy of results obtained from theory exploration, and, therefore, their perceived usefulness, we use the notion of *knowledge*

**Fig. 5.** A scatter plot showing the ratio of lemmas in theories discovered by each tool that were subsumed by the theory discovered by its counterpart (T = TheSy, H = Hipster). Each point represents a single test case. The vertical axis shows how many of the lemmas discovered by Hipster were subsumed by those discovered by TheSy, and the horizontal axis shows the converse.

(inspired by "knowledge base" in Theorema [8]). A theory T in a given logical proof system induces a collection of attainable knowledge, K<sub>T</sub> = {ϕ | T ⊢ ϕ}; that is, T is characterized by the set of (true) statements that can be proven based on it. In practice, a "pure" notion of knowledge based on provability is impractical, because most interesting logics are undecidable, and automated proving techniques cannot feasibly find proofs for all true statements. We therefore parameterize knowledge relative to a *prover*: a procedure that always terminates and can prove a subset of the true statements. Termination can be achieved by restricting the space of proofs by either size or resource bounds. We write T ⊢<sub>S</sub> ϕ when a prover S is able to verify the validity of ϕ in a theory T. A more realistic characterization of knowledge is then K<sup>S</sup><sub>T</sub> = {ϕ | T ⊢<sub>S</sub> ϕ}. Assuming that the prover S is fixed, a theory T′ is said to *increase knowledge* over T when K<sup>S</sup><sub>T′</sub> ⊃ K<sup>S</sup><sub>T</sub>.

We utilize the notion of K<sup>S</sup><sub>T</sub> described above to test the knowledge gained by A against that of B, and vice versa. We take the set of lemmas T<sub>A</sub> generated by A and check whether it is subsumed by T<sub>B</sub>, generated by B, by checking whether T<sub>A</sub> ⊆ K<sup>S</sup><sub>T∪T<sub>B</sub></sub>; we then carry out the same comparison with the roles of A and B reversed. A working assumption is that both A and B include some mechanism for screening redundant conjectures, that is, a component that receives the current set of known lemmas T<sub>i</sub> and a conjecture ϕ and decides whether the conjecture is redundant. It is important to choose S such that whenever A (or B) discards ϕ due to redundancy, it holds that ϕ ∈ K<sup>S</sup><sub>T<sub>i</sub></sub>.
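To make these definitions concrete, the following Python sketch (names and the deliberately weak prover are our own, not TheSy's) models lemmas as ground equations between terms and the terminating prover S as plain equality closure:

```python
class UnionFind:
    """Textbook union-find over hashable terms."""
    def __init__(self):
        self.parent = {}
    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            x = self.parent[x]
        return x
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def proves(theory, goal):
    """Stand-in for the prover S: theories are sets of ground equations
    (pairs of terms), and S proves an equation iff it lies in the
    equality closure of the theory.  It always terminates and proves
    only a subset of the true statements."""
    uf = UnionFind()
    for lhs, rhs in theory:
        uf.union(lhs, rhs)
    return uf.find(goal[0]) == uf.find(goal[1])

def subsumption_ratio(T, TA, TB):
    """|T_A ∩ K^S_{T∪T_B}| / |T_A|: the fraction of A's lemmas that S
    re-derives from the base theory plus B's lemmas."""
    return sum(proves(T | TB, phi) for phi in TA) / len(TA)

T  = {("rev(rev(x))", "x")}
TA = {("rev(rev(x))", "x"), ("len(x++y)", "len(y++x)")}
TB = {("len(x++y)", "len(x)+len(y)"), ("len(y++x)", "len(x)+len(y)")}
print(subsumption_ratio(T, TA, TB))  # 1.0
```

Here both of A's lemmas lie in K<sup>S</sup><sub>T∪T<sub>B</sub></sub>: the first is already in T, and the second follows from B's two lemmas by transitivity.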

Incorporating the solver into the comparison makes the evaluation resistant to large numbers of trivial lemmas, as they will be discarded by A or B. It is still possible for some lemmas to be "better" than others, so knowledge is not uniformly distributed; this is hard to quantify, though. A few possible measures of usefulness come to mind, such as lemma utilization in a downstream task (such as proof search), proof complexity, or match with a given context, but given just the exploration task, there is not enough information to apply them. A first approximation is to treat the discovered lemmas themselves, *i.e.*, T<sub>A</sub> ∪ T<sub>B</sub>, as representing proof objectives. In doing so, we pit A and B in direct contest with one another. We choose this avenue because it is straightforward to apply, admitting that it may be inaccurate in some cases.

To evaluate our approach and its implementation, we run both TheSy and Hipster on functional definitions collected from the TIP 2015 benchmark suite [11], specifically the IsaPlanner [21] benchmarks (85 benchmarks in total), for compatibility between the two systems. TIP benchmarks also contain goal propositions, but for the purpose of evaluating the exploration technique these are redacted. This experiment uses the simple rewrite-driven congruence-closure decision procedure with a case-split mechanism in the role of the solver S occurring in the definition of knowledge K. Hipster uses Isabelle/HOL's simplifier as its conjecture-redundancy filter, which is itself a simple rewrite-driven decision procedure, so this choice of S provides a suitable basis for comparison. We compute the portion of lemmas found by Hipster that were provable (by S) from TheSy's results and vice versa. In other words, we check the ratio |T<sub>A</sub> ∩ K<sup>S</sup><sub>T∪T<sub>B</sub></sub>| / |T<sub>A</sub>|, which we denote T<sub>B</sub>%T<sub>A</sub>, in both directions. Figure 5 displays the ratios, where each point represents a single test case: points above the diagonal line represent test cases where TheSy's ratio was higher, and points under the line test cases where Hipster's ratio was higher. We conduct this experiment twice: once with TheSy's case-splitting mechanism turned off during exploration, and once with it turned on. (Hipster does not have such a switch, as it always generates concrete values.) The reason for this is that case splitting increases the running time significantly (as we show next), so we want to evaluate its contribution to the discovery of lemmas. Comparing the two charts, TheSy already performs reasonably well against Hipster without case splitting (in 48 out of the 85 test cases TheSy's ratio was better, and in 12 it was equal), and enabling it leads to a clear advantage (better in 65 out of the 85, equal in 6).

**Performance.** To compare runtime efficiency, we consider the time it took to fully explore the IsaPlanner test suite. We consider an exploration "full" when the tool has enumerated, and checked, all terms and associated candidate conjectures up to the depth bound (k = 2)<sup>2</sup> for TheSy or the size bound (s = 7) for Hipster, or when a timeout of one hour is reached, whichever comes first. We then sort the benchmarks from shortest- to longest-running for each of the tools, and report the accumulated time to explore the first i benchmarks (i = 1..85). The results are shown in the graph in Fig. 6, for Hipster, TheSy with case split disabled, and TheSy with case split enabled. In both configurations,

<sup>2</sup> Our experience shows that choosing larger *k*s greatly affects the run-time, but does not lead to many useful lemmas.

**Fig. 6.** Time to fully explore the 85 IsaPlanner benchmarks. A full exploration is considered one where either all terms up to the depth bound have been enumerated or a timeout of 1 h has been reached. The *y* axis shows the amount of time needed to complete the first *x* benchmarks, when they are sorted from shortest- to longest-running. (Time scale is logarithmic; lower is better.)

TheSy is very fast for the lower percentiles but begins to slow down, due to case splitting, toward the end of the line. To illustrate, at the 25th percentile TheSy was ∼380 times faster (0.48 s *vs.* 182.47 s); at the 50th percentile, ∼57 times faster (5.28 s *vs.* 305.37 s); and at the 75th percentile, ∼6 times faster (141.24 s *vs.* 883.8 s). Overall, TheSy took 51.6K seconds and Hipster 47.1K, meaning Hipster was ∼1.1 times faster. It is evident from the chart that case splitting is largely responsible for the longer execution times. Without case splitting, TheSy is much faster, and completes all 85 benchmarks in less time than it takes Hipster. Of course, in that mode of operation TheSy finds fewer lemmas (as shown in Fig. 5), but is still superior to Hipster. Future work should focus on improving the case-splitting mechanism, similarly to the treatment of case splits in SAT and SMT solvers, allowing TheSy to deal with such theories more efficiently.

#### **5.2 Efficacy for Automated Proving**

While the mission statement of TheSy is solely to provide lemmas based on core theories, we claim that the discovered theories are beneficial for proving further theorems over the same core theory. We took a collection of benchmarks for induction proofs used by CVC4 [37], and conducted the following experiment: first, the proof goals are stripped, and only the symbol declarations and provided axioms are used to construct an input to TheSy. Then, whenever a new lemma is discovered and passes through the prover, we also attempt to prove the goal, utilizing the same mechanism used for vetting conjectures. As soon as the latter succeeds, the exploration process is aborted, and all lemmas collected are discarded. The experiments are thus independent across the individual benchmarks.


**Table 1.** Results of the CVC4 benchmark suite (number of successful proofs in each category).

**Fig. 7.** Accumulated time-to-solve for each of the benchmark suites from the CVC4 collection. The *y* axis shows the amount of time needed to complete the first *x* (successful) proofs, when benchmarks are sorted from shortest- to longest-running.

Even though this setting is unfavorable to TheSy (it does not take advantage of the fact that theory exploration can be done offline and its results then re-used for proofs over the same core theory), we report considerable success in solving these benchmarks. Out of the 311 benchmarks, our theory exploration + simple-minded induction was able to prove 187 (with a 5-minute timeout, the same as in the original CVC4 experiments). For comparison, Z3 and CVC4 (without conjecture generation) were able to prove 75 and 70 of them, respectively. This shows that the majority of instances were not solvable without the use of induction. CVC4 with its conjecture generation enabled was able to solve 260 of them. Table 1 shows the number of successful proofs achieved for each of the four suites. Figure 7 shows the accumulated time required for the benchmarks; the vast majority of the success cases occur early on, because in some cases a rather small auxiliary lemma is all that is needed to make the proof go through.

# **6 Related Work**

*Equality Graphs.* Originally brought into use for automated theorem proving [15], e-graphs were popularized as a mechanism for implementing low-level compiler optimizations [41], under the name *PEGs*. These e-graphs can be used to represent a large program space compactly by packing together equivalent programs. In that sense they are similar to Version Space Algebras [26], but their prime objective is entirely different: while VSAs focus on efficient intersections, e-graphs are used to saturate a space of expressions with all equality relations that can be inferred. They have found use in optimizing expressions for more than just speed, for example to increase the numerical stability of floating-point programs in Herbie [34]. There are two key differences in the way e-graphs are used in this work compared to prior work: (i) the equality laws are neither hard-coded nor fixed; rather, the set of rules grows as the system proves more lemmas automatically; (ii) saturation cannot be guaranteed or even obtained in all cases, which we overcome with a bound on rewrite-rule application depth. (The latter point is an indirect consequence of the former.)

*Automated Theorem Provers.* Many systems rely on known theorems or are designed to support users in semi-automated proving. Congruence closure is also a proven method for tautology checking in automated theorem provers such as Vampire [25], and is used as a decision procedure for reasoning about equality in the leading SMT solvers Z3 [14] and CVC4 [5]. There, it is mostly limited to first-order reasoning, but can essentially be applied unchanged to higher-level scenarios such as ours.

Related to theory exploration, but using different techniques, are Zipperposition [13] and the conjecture-generation mechanism implemented as part of the induction prover in CVC4 [37]. It should be noted that these are directed toward a specific proof goal, as opposed to theory exploration, which is presumed to be an offline phase. As such, these two techniques incorporate the generation of inductive hypotheses into the saturation proof search and the SMT procedure, respectively.

*Theory Exploration.* IsaCoSy [22] pioneered the use of synthesis techniques for bottom-up lemma discovery. IsaCoSy combines equivalence reduction with counterexample-guided inductive synthesis (CEGIS [40]) for filtering candidate lemmas; this requires a solver capable of generating counterexamples to equivalence. Subsequent development was based on random generation of test values, as implemented in QuickSpec [38] for reasoning about Haskell programs, later combined with automated provers for checking the generated conjectures [10,20]. We discussed the deficiencies of using concrete values (as opposed to symbolic ones) and random testing in Sect. 1, and we make an empirical comparison with Hipster, a descendant of IsaCoSy and QuickSpec, in Sect. 5.

*Inductive Synthesis.* In the area of SyGuS [3], tractable bottom-up enumeration is commonly achieved by some form of equivalence reduction [39]. When dealing with concrete input-output examples, observational equivalence [2,42] is very effective. The use of symbolic examples in synthesis has been suggested [17], but to the best of our knowledge, ours is the only setting where symbolic observational equivalence has been applied. Inductive synthesis, in combination with abduction [16], has also been used to infer specifications [1], although not as an exploration method but as a supporting mechanism for verification.

# **7 Conclusion**

We described a new method for theory exploration, which differentiates itself from existing work by building its reasoning on a novel term-rewriting engine. The new approach differs from previous work, specifically from approaches based on testing techniques, in that:


By creating a feedback loop between the four phases (term generation, conjecture inference, conjecture screening, and induction proving), this system manages to explore many theories efficiently. It goes beyond the similar feedback loops in existing tools, which aim to reduce false and duplicate conjectures. As explained in Subsect. 4.2, this form of feedback is also present in TheSy, but TheSy utilizes it in more phases of the computation.

Theory exploration carries practical significance for many automated reasoning tasks, especially in formal methods, verification, and optimization. Complex properties lead to an ever-growing number of definitions and associated lemmas, which constitute an integral part of proof construction. These lemmas can be used for SMT solving, automated and interactive theorem proving, and as a basis for equivalence reduction in enumerative synthesis. The term rewriting-based method that we presented in this paper is simple, highly flexible, and has already shown results surpassing existing exploration methods. The generated lemmas allow even this simple method to prove conjectures that normally require sophisticated SMT extensions. Our main conclusion is that deductive techniques and symbolic evaluation can greatly contribute to theory exploration, in addition to their existing applications in invariant and auxiliary conjecture inference.

**Acknowledgements.** This research was supported by the Israeli Science Foundation (ISF) Grants No. 243/19 and 2740/19 and by the United States-Israel Binational Science Foundation (BSF) Grant No. 2018675.

# **References**



# **CoqQFBV: A Scalable Certified SMT Quantifier-Free Bit-Vector Solver**

Xiaomu Shi<sup>1</sup>, Yu-Fu Fu<sup>2</sup>, Jiaxiang Liu<sup>1(B)</sup>, Ming-Hsien Tsai<sup>3</sup>, Bow-Yaw Wang<sup>3</sup>, and Bo-Yin Yang<sup>3</sup>

> <sup>1</sup> Shenzhen University, Shenzhen, China <sup>2</sup> Georgia Institute of Technology, Atlanta, USA <sup>3</sup> Academia Sinica, Taipei City, Taiwan

**Abstract.** We present CoqQFBV, a certified SMT QF BV solver built from a verified bit blasting algorithm, the SAT solver Kissat, and the verified SAT certificate checker GratChk. Our verified bit blasting algorithm supports the full QF BV logic of SMT-LIB; it is specified and formally verified in the proof assistant Coq. We compare CoqQFBV with CVC4, Bitwuzla, and Boolector on benchmarks from the QF BV division of the single query track in the 2020 SMT Competition, as well as on real-world cryptographic program verification problems. Surprisingly, CoqQFBV solves more program verification problems with certification than the 2020 SMT QF BV division winner Bitwuzla does without certification.

# **1 Introduction**

Satisfiability Modulo Theories (SMT) solvers for the Quantifier-Free Bit-Vector (QF BV) logic have been used to verify programs with bit-level accuracy [9,10]. In such applications, a program verification problem is reformulated as an SMT QF BV query. An SMT QF BV solver is then invoked to compute a query result. The query result in turn decides the answer to the program verification problem. For cryptographic assembly programs, a missing carry or borrow flag will result in incorrect computation. Bit-accurate verification is thus necessary for cryptographic programs, and SMT QF BV solvers have in fact been employed to verify such programs [8,25]. These solvers are nonetheless very complex programs with possibly unknown bugs [7,18]. Since bugs in SMT QF BV solvers may induce incorrect query results, program verification results cannot be taken at face value when SMT QF BV solvers are employed.

In order to check SMT QF BV query results independently, SMT QF BV solvers can generate certificates to validate their answers. In the LFSC certificates [14,23], for instance, an SMT QF BV query result is certified by correct bit blasting and Boolean Satisfiability (SAT) solving. Such certificates demonstrate that the SMT QF BV query is reduced to a Boolean SAT query correctly *and* the corresponding SAT query is solved correctly. Although one can certify SAT query results with certificates from SAT solvers [24], it is not always easy to certify correct bit blasting due to complex arithmetic operations in SMT QF BV queries. Developing correct and efficient checkers for SMT QF BV certificates can be very challenging. Indeed, an LFSC certificate checker based on the proof assistant Coq has been developed to improve confidence [12]. Yet the Coq-based certificate checker does not fully support arithmetic operations and thus cannot certify results of SMT QF BV queries with complicated arithmetic operations. Consequently, the correctness of cryptographic programs still relies on the correctness of SMT QF BV solvers or their unverified certificate checkers.

In this paper, we take a more direct approach to ensure the correctness of SMT QF BV query results. Instead of certifying correct bit blasting for every SMT QF BV query, we specify a bit blasting algorithm and prove its correctness in the proof assistant Coq. In order to formalize the correctness of our bit blasting algorithm, we develop a formal bit-vector theory in Coq. Naturally, the formal theory has to support all arithmetic functions (addition, subtraction, multiplication, division, and remainder) for both signed and unsigned representations as needed in SMT-LIB [3]. Based on our new bit-vector theory, we give a formal semantics for SMT QF BV queries in Coq. Our semantics follows the SMT-LIB semantics carefully. In particular, division and remainder are total arithmetic operations even when the divisor is zero. Using our Coq bit-vector theory and semantics, we prove that our bit blasting algorithm always returns a corresponding Boolean formula correctly on any SMT QF BV query. Since our algorithm has been formally verified, bit blasting is always correct and need not be certified. Through the OCaml program extracted from our verified bit blasting algorithm, a corresponding SAT query is obtained for each SMT QF BV query and sent to a SAT solver. A SAT certificate checker suffices to validate SAT query results and hence the correctness of answers to SMT QF BV queries. Since neither complicated SMT QF BV solvers nor their certificate checkers need be trusted, our work can improve the confidence of SMT QF BV query results.

To our knowledge, our bit-vector theory is the first Coq formalization designed for bit blasting queries from the QF BV logic of SMT-LIB, and our semantics is the first Coq formalization for full SMT QF BV queries. We are not aware of any verified bit blasting algorithm or program for full SMT QF BV queries of SMT-LIB at the time of writing. Even if the correctness of its results could be ensured, our certified SMT QF BV solver CoqQFBV would not be very useful if it were extremely inefficient. In order to evaluate its performance, we run CoqQFBV on benchmarks from the QF BV division of the single query track in the 2020 SMT Competition. With the same memory and time limits as in the competition, our solver successfully finishes 88.72% of the 6861 queries with certification. In comparison, CVC4 with its certificate checker solves 55.97% with certification, and the division winner Bitwuzla solves 98.22% of the benchmarks without certification. Our certified solver outperforms CVC4 with certification significantly. Generating and checking certificates makes our certified solver finish about 10% fewer queries than the division winner; this price for certified accuracy is perhaps not unacceptable for the benchmarks in the competition. To further evaluate CoqQFBV, the certified solver is used to verify linear arithmetic assembly programs from various cryptography libraries such as OpenSSL [30]. CoqQFBV gives certified answers to 96.88% of the 96 SMT QF BV queries from real-world cryptographic program verification, while CVC4 with its certificate checker certifies 19.79%. Among efficient SMT QF BV solvers without certification, Boolector solves 100% and Bitwuzla 91.67% of the queries. Intriguingly, our certified SMT QF BV solver outperforms the 2020 division winner Bitwuzla on queries from real-world verification problems, and is thus likely to be useful for real-world verification.

*Related Work.* As mentioned, SMT certificate generation and checking are challenging. There are few efforts to develop SMT QF BV certificate checkers, let alone verified ones. CVC4 is able to produce unsatisfiability certificates for QF BV queries and is also equipped with an (unverified) certificate checker [14]. SMTCoq [12] was proposed to check certificates from the SMT solvers veriT and CVC4. It supports fragments of several logics, including the QF BV logic, and its correctness is formally proved in Coq. However, the QF BV logic is not fully supported by SMTCoq. Z3 also supports certificate generation for the QF BV logic [19]. The proofs can be reconstructed, and thus checked, within the proof assistants HOL4 and Isabelle [6]. But the lack of details in Z3's generated certificates makes proof reconstruction particularly challenging.

Taking an approach similar to the one in this paper, GL is a framework for bit blasting finitely bounded ACL2 theorems into SAT queries [28]. Its bit blasting algorithm is formally verified in ACL2. Though it is not designed for SMT-LIB, most of the operations defined in the QF BV logic are supported, with exceptions such as division and concatenation. A bit blasting algorithm is defined and verified in HOL4 as well [13]. Neither [28] nor [13] aims to develop a scalable SMT QF BV solver. CoqQFBV accepts SMT-LIB inputs with fully supported QF BV logic while adopting performance optimizations such as caches.

In Isabelle and HOL4, one can use the bit-vector libraries to conform to SMT-LIB operations; see [17] for example. In the Coq ecosystem, coq-bits is a formalization of logical and arithmetic operations on bit-vectors [15]. The library provides a mapping between bit-vector operations and abstract number operations. Different from our theory, it does not support division/remainder or signed operations. Why3 [11] provides a bit-vector theory which is also formalized in Coq; it defines division by zero in a different way from SMT-LIB, and moreover its operations are defined based on integer operations. Our new bit-vector theory instead defines bit-vector operations through bit manipulation, which makes it more suitable for the correctness proof of bit blasting algorithms.

This paper is organized as follows. After the introduction, an overview is given in Sect. 2, and Sect. 3 reviews preliminaries. Our formal bit-vector theory is presented in Sect. 4, followed by the formal semantics of SMT QF BV queries (Sect. 5). The correctness of our bit blasting algorithm is established in Sect. 6. Section 7 outlines the construction of our certified SMT QF BV solver. Experiments are presented in Sect. 8, and Sect. 9 concludes our presentation.

# **2 Methodology Overview**

Given an SMT QF BV query, a bit blasting algorithm computes a Boolean formula such that the SMT QF BV query is satisfiable if and only if the Boolean formula is satisfiable. The QF BV logic contains arithmetic operations for bit-vectors. Computing an equi-satisfiable Boolean formula for an arbitrary SMT QF BV query can be very complicated and susceptible to errors. Our goal is to construct a correct bit blasting program for every SMT QF BV query. The correctness of the program moreover is verified by the proof assistant Coq to minimize gaps or even errors in hand-written proofs.

Our construction is based on a new formal bit-vector theory coq-nbits (Sect. 4). In coq-nbits, we define bit-vectors and their functions on top of the Coq data type for Boolean sequences. In order to support the QF BV logic of SMT-LIB fully, five arithmetic bit-vector functions (addition, subtraction, multiplication, division, and remainder) are defined in our formal theory. To establish the correctness of our definitions, formal proofs are provided to relate bit-vector functions with their arithmetic counterparts. For instance, we show the number represented by the output of the bit-vector negation function is indeed the arithmetic negation of the number represented by the input bit-vector.

Using our coq-nbits theory, we then give a formal semantics for SMT QF BV queries as defined in SMT-LIB (Sect. 5). In our formalization, a QF BV predicate denotes a Boolean value, and a QF BV expression denotes a bit-vector. An SMT QF BV query is formalized as a Boolean combination of QF BV predicates on QF BV expressions over QF BV variables and bit-vector constants. In order to demonstrate the correctness of our formal semantics for SMT QF BV queries, formal proofs are provided to show that our formal semantics coincides with the one defined in SMT-LIB.

Our bit blasting algorithm is given in Coq (Sect. 6). It extends the Tseitin transformation for Boolean formulae to SMT QF BV queries. More precisely, a QF BV predicate is transformed to a literal with a Boolean formula; a QF BV expression is transformed to a literal sequence with a Boolean formula. Using our formalization of SMT QF BV queries, the correctness of the bit blasting algorithm is established in Coq by mutual induction. To improve efficiency, our bit blasting algorithm is further optimized with more economical transformations and a cache. The optimized bit blasting algorithm is also verified with formal Coq proofs.
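The classical Tseitin transformation that the algorithm extends can be sketched for plain Boolean formulae. The following Python model is our own illustration, not the Coq development: each subformula is transformed to a literal together with CNF clauses forcing the literal to agree with the subformula.

```python
from itertools import count

def tseitin(f, fresh):
    """f is an int variable or ('and', l, r) / ('or', l, r) / ('not', g).
    Returns (lit, clauses): a literal that the CNF clauses (lists of
    signed ints) constrain to agree with f, so that f is satisfiable
    iff [lit] together with the clauses is satisfiable."""
    if isinstance(f, int):
        return f, []
    if f[0] == 'not':
        lit, cls = tseitin(f[1], fresh)
        return -lit, cls
    a, ca = tseitin(f[1], fresh)
    b, cb = tseitin(f[2], fresh)
    x = next(fresh)
    if f[0] == 'and':   # clauses encoding x <-> (a AND b)
        defs = [[-x, a], [-x, b], [x, -a, -b]]
    else:               # clauses encoding x <-> (a OR b)
        defs = [[-x, a, b], [x, -a], [x, -b]]
    return x, ca + cb + defs

# v1 AND (NOT v1); fresh definitional variables are numbered from 2 up.
lit, clauses = tseitin(('and', 1, ('not', 1)), count(2))
print(lit, clauses)  # 2 [[-2, 1], [-2, -1], [2, -1, 1]]
```

CoqQFBV applies the same idea one level up: a QF BV expression becomes a *sequence* of literals (one per bit) plus defining clauses.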

Our formally verified bit blasting algorithm is written in the Coq specification language. It is not yet a program compilable into an executable binary. Using the code extraction mechanism in Coq, an OCaml program is extracted from our verified bit blasting algorithm. The OCaml program takes expressions in our formal SMT QF BV query syntax as inputs and returns expressions in our formal syntax for Boolean formulae as outputs. SAT solvers can be employed to decide satisfiability of the output Boolean formulae, and their certificates can be validated by SAT certificate checkers independently (Sect. 7).

# **3 Preliminaries**

Let v be a Boolean *variable* with values *ff* and *tt*. A *literal* is of the form v or ¬v. A *clause* is a disjunction l<sub>0</sub> ∨ l<sub>1</sub> ∨ ··· ∨ l<sub>k</sub> of literals l<sub>0</sub>, l<sub>1</sub>, ..., l<sub>k</sub>. A Boolean formula in the *conjunctive normal form (CNF)* is a conjunction c<sub>0</sub> ∧ c<sub>1</sub> ∧ ··· ∧ c<sub>m</sub> of clauses c<sub>0</sub>, c<sub>1</sub>, ..., c<sub>m</sub>. A SAT *query* is a Boolean CNF formula. An *environment* maps Boolean variables to their values. Given a SAT query, the *Boolean satisfiability problem* is to decide if the query evaluates to *tt* on some environment.
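For concreteness, these definitions can be mirrored in a few lines of Python (the representation of literals as signed integers is our own choice):

```python
from itertools import product

def evaluate(cnf, env):
    """cnf: a SAT query as a list of clauses, each a list of signed ints
    (v for a variable, -v for its negation).  env: an environment mapping
    each variable to a Boolean.  The query is the conjunction of its
    clauses; a clause is the disjunction of its literals."""
    value = lambda l: env[abs(l)] if l > 0 else not env[abs(l)]
    return all(any(value(l) for l in clause) for clause in cnf)

def is_satisfiable(cnf, variables):
    """The Boolean satisfiability problem, decided by trying every environment."""
    return any(evaluate(cnf, dict(zip(variables, bits)))
               for bits in product([False, True], repeat=len(variables)))

query = [[1, -2], [2, 3], [-1, -3]]      # (v1 or not v2)(v2 or v3)(not v1 or not v3)
print(is_satisfiable(query, [1, 2, 3]))  # True, e.g. v1=tt, v2=tt, v3=ff
```

Real SAT solvers, of course, decide the same problem without enumerating all 2<sup>n</sup> environments.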

A bit-vector of *width* w is written as #b*b<sub>w−1</sub>b<sub>w−2</sub>* ··· *b*<sub>0</sub> with *b<sub>i</sub>* ∈ {0, 1} for 0 ≤ i < w. In the *unsigned* representation, the bit-vector #b*b<sub>w−1</sub>b<sub>w−2</sub>* ··· *b*<sub>0</sub> denotes the natural number (non-negative integer) Σ<sub>0≤i<w</sub> *b<sub>i</sub>*2<sup>i</sup>; in the *two's complement (signed)* representation, it denotes the integer Σ<sub>0≤i<w−1</sub> *b<sub>i</sub>*2<sup>i</sup> − 2<sup>w−1</sup>*b<sub>w−1</sub>*. For instance, #b1010 denotes 10 and −6 in the unsigned and two's complement representations respectively. We use *bv2nat*(*bv*) for the natural number denoted by the bit-vector *bv* in the unsigned representation, and *nat2bv*(w, i) for the bit-vector of width w representing the natural number i modulo 2<sup>w</sup>.
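These conventions can be mirrored in Python as follows (helper names are ours; bits are written MSB first, as in the #b notation):

```python
def bv2nat(bits):
    """Unsigned value of an MSB-first bit string such as '1010'."""
    return int(bits, 2)

def bv2int(bits):
    """Two's complement (signed) value of an MSB-first bit string."""
    w = len(bits)
    n = int(bits, 2)
    return n - (1 << w) if bits[0] == '1' else n

def nat2bv(w, i):
    """Width-w bit string representing the natural number i modulo 2^w."""
    return format(i % (1 << w), '0{}b'.format(w))

# #b1010 denotes 10 unsigned and -6 in two's complement:
print(bv2nat('1010'), bv2int('1010'))   # 10 -6
print(nat2bv(4, 18))                    # '0010'  (18 mod 16 = 2)
```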

Let *bv* = #b*b<sub>w−1</sub>b<sub>w−2</sub>* ··· *b*<sub>0</sub> and *cv* = #b*c<sub>u−1</sub>c<sub>u−2</sub>* ··· *c*<sub>0</sub> be bit-vectors of widths w and u respectively. The following QF BV *operations* are defined in the QF BV logic of SMT-LIB: *concat bv cv* ≜ #b*b<sub>w−1</sub>b<sub>w−2</sub>* ··· *b*<sub>0</sub>*c<sub>u−1</sub>c<sub>u−2</sub>* ··· *c*<sub>0</sub> is the concatenation of *bv* and *cv*; *extract* i j *bv* ≜ #b*b<sub>i</sub>b<sub>i−1</sub>* ··· *b<sub>j</sub>* extracts bits from *bv*, where 0 ≤ j ≤ i < w; *bvnot bv*, *bvand bv cv*, and *bvor bv cv* are the bitwise complement, and, and or operations respectively. Additionally, *bvneg bv* ≜ *nat2bv*(w, 2<sup>w</sup> − *bv2nat*(*bv*)) is the arithmetic negation operation; *bvadd bv cv* ≜ *nat2bv*(w, *bv2nat*(*bv*) + *bv2nat*(*cv*)) is the arithmetic addition operation; and *bvmul bv cv* ≜ *nat2bv*(w, *bv2nat*(*bv*) × *bv2nat*(*cv*)) is the arithmetic multiplication operation. The arithmetic division and remainder operations are

$$\mathit{bvudiv}\ bv\ cv \triangleq \begin{cases} \mathit{nat2bv}(w,\, 2^w - 1) & \text{if } \mathit{bv2nat}(cv) = 0\\ \mathit{nat2bv}(w,\, \mathit{bv2nat}(bv) \div \mathit{bv2nat}(cv)) & \text{otherwise} \end{cases}$$

$$\mathit{bvurem}\ bv\ cv \triangleq \begin{cases} bv & \text{if } \mathit{bv2nat}(cv) = 0\\ \mathit{nat2bv}(w,\, \mathit{bv2nat}(bv) \bmod \mathit{bv2nat}(cv)) & \text{otherwise} \end{cases}$$

Note that the arithmetic division and remainder operations are defined even when the divisor represents the number zero. Finally, the operation *bvshl bv cv* ≜ *nat2bv*(w, *bv2nat*(*bv*) × 2<sup>*bv2nat*(*cv*)</sup>) shifts the bit-vector *bv* to the left by *bv2nat*(*cv*) bits; *bvlshr bv cv* ≜ *nat2bv*(w, *bv2nat*(*bv*) ÷ 2<sup>*bv2nat*(*cv*)</sup>) shifts the bit-vector *bv* to the right by *bv2nat*(*cv*) bits. In addition to bit-vector operations, the QF BV logic of SMT-LIB defines QF BV *predicates* on bit-vectors. The predicate *bveq bv cv* is true when the bit-vectors *bv* and *cv* are equal; *bvult bv cv* is true if *bv2nat*(*bv*) < *bv2nat*(*cv*). In the QF BV logic of SMT-LIB, both operands of binary operations and predicates must have the same width. Overall, seventeen bit-vector operations and predicates are defined in the QF BV logic of SMT-LIB. In particular, arithmetic division and remainder operations are defined for operands in both the unsigned and two's complement signed representations.
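The total division, remainder, and shift semantics are easy to mis-remember, so a small executable model may help (a hypothetical Python sketch; inputs are the natural numbers already obtained via *bv2nat*):

```python
def bvudiv(w, x, y):
    """SMT-LIB unsigned division on width-w values; total even for y = 0."""
    return 2 ** w - 1 if y == 0 else x // y

def bvurem(w, x, y):
    """SMT-LIB unsigned remainder; division by zero returns the dividend."""
    return x if y == 0 else x % y

def bvshl(w, x, s):
    """Logical left shift: multiply by 2^s and truncate modulo 2^w."""
    return (x * 2 ** s) % 2 ** w

def bvlshr(w, x, s):
    """Logical right shift: divide by 2^s (the result always fits)."""
    return x // 2 ** s

assert bvudiv(4, 10, 0) == 15          # quotient is #b1111 when cv is zero
assert bvurem(4, 10, 0) == 10          # remainder is the dividend itself
assert (bvudiv(4, 10, 3), bvurem(4, 10, 3)) == (3, 1)
assert bvshl(4, 0b0011, 2) == 0b1100 and bvlshr(4, 0b1100, 2) == 0b0011
```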

A QF BV *variable* denotes a bit-vector. A QF BV *expression* is constructed from QF BV operations over QF BV variables and bit-vectors. An SMT QF BV *query* is a Boolean combination of QF BV predicates on QF BV expressions. A *store* is a mapping from QF BV variables to bit-vectors. Given an SMT QF BV query, the *satisfiability modulo* QF BV *theory problem* is to decide whether the query evaluates to *tt* on some store.

# **4 Bit-Vector Theory**

We present our formal Coq bit-vector theory coq-nbits in this section. The coq-nbits theory supports bit-vectors in both unsigned and two's complement signed representations. In coq-nbits, a bit-vector is represented by a Boolean sequence of the data type bits in the least significant bit-first order.

Definition bits : Set := seq bool.

In the definition, bool and seq are the data types for Boolean values (false and true) and sequences in Coq respectively. For instance, the bit-vector #b100 is represented by [:: false; false; true] in coq-nbits.

Coq functions defined for sequences are applicable to bit-vectors. In particular, size *bv* computes the width of the bit-vector *bv* and *bv* ++ *cv* is the concatenation of the bit-vectors *bv* and *cv*. It is also straightforward to define auxiliary bit-vector functions. For example, zeros n returns the bit-vector of n false's; ones n returns the bit-vector of n true's; extract i j *bv* returns the sub-sequence of the bit-vector *bv* with indices from j to i where 0 ≤ j ≤ i < size *bv*. Let a = [:: false; false; true]. Then size a = 3 and extract 2 1 a = [:: false; true].

Bitwise functions are defined just as easily. For instance, the bitwise inverse function maps each Boolean value to its complement:

Definition invB *bv* : bits := map (fun b => ~~ b) *bv*.

Other bitwise functions are defined similarly. Specifically, bitwise and andB, bitwise or orB, logical left shift shlB, and logical right shift shrB are all defined in coq-nbits. Let b = [:: false; true; true]. We have invB b = [:: true; false; false], andB a b = [:: false; false; true], and shlB 1 b = [:: false; false; true].

Arithmetic bit-vector functions are slightly more complicated. To prove properties about arithmetic functions, coq-nbits provides conversion functions between bit-vectors and natural numbers.

Definition to\_N (*bv* : bits) : N := foldr (fun b *res* => N\_of\_bool b + *res* \* 2) 0 *bv*.

In the definition, to N *bv* converts the bit-vector *bv* to a natural number, where N\_of\_bool false = 0 and N\_of\_bool true = 1. Folding from the most significant end, to N multiplies the intermediate result by two and adds the current bit b. For instance, to N a = to N [:: false; false; true] = 4. The function from N w n, on the other hand, converts any natural number n to a bit-vector of width w.

```
Fixpoint from_N (w : nat) (n : N) : bits :=
  match w with
  | O => [::]
  | S w' => (N.odd n)::(from_N w' (N.div n 2))
  end.
```
The function first checks the width w. If the width is zero, it returns the empty bit-vector. Otherwise, the function returns the bit-vector with the least significant bit N.odd n and the remaining w − 1 bits representing n divided by two. Observe that two Coq formalizations of natural numbers are used. The nat theory uses the unary representation suitable for inductive proofs; N uses the succinct binary representation. The following lemma is proved in Coq:
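The pair to N/from N can be mimicked in Python to see the fold and the parity-and-halve recursion at work (an illustrative model only; Coq's N and nat are both played by Python integers here):

```python
def to_n(bv):
    """Right fold, as in to_N: combine from the most significant end down."""
    n = 0
    for b in reversed(bv):
        n = int(b) + n * 2
    return n

def from_n(w, n):
    """Peel off parity bits while halving n, exactly w times (from_N)."""
    bv = []
    for _ in range(w):
        bv.append(n % 2 == 1)   # N.odd n
        n //= 2                 # N.div n 2
    return bv

a = [False, False, True]        # the bit-vector #b100
assert to_n(a) == 4
assert from_n(len(a), to_n(a)) == a          # Lemma 1, property 1
assert to_n(from_n(3, 5)) == 5               # Lemma 1, property 2: 5 < 2^3
assert to_n(from_n(3, 13)) == 13 % 2 ** 3    # too-small widths wrap mod 2^w
```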

**Lemma 1.** *The following properties hold:*

*1.* ∀*bv*, from N (size *bv*) (to N *bv*) = *bv*.
*2.* ∀w n, n < 2<sup>w</sup> =⇒ to N (from N w n) = n.
The first property shows that bit-vectors can be converted to natural numbers and back to themselves. The second property shows that natural numbers can be converted to bit-vectors with sufficient widths and back to themselves. To see how they are used to prove properties about bit-vector functions in coq-nbits, consider the definition of the successor bit-vector function.

```
Fixpoint succB (bv : bits) : bits :=
  match bv with
  | [::] => [::]
  | hd::tl => if hd then false::(succB tl)
              else true::tl
  end.
```

If the input is the empty bit-vector, the function returns the empty bit-vector. Otherwise, succB checks the least significant bit of the input bit-vector. If the bit is true, the function computes the successor of the remaining bits and appends false as the least significant bit. If the least significant bit of the input is false, the function simply changes the least significant bit to true and copies the remaining bits. Using the conversion functions, the bit-vector successor is related to the arithmetic successor in the following lemma:

**Lemma 2.** ∀*bv*, succB *bv* = from N (size *bv*) ((to N *bv*) + 1)*.*

Lemma 2 says that succB *bv* does compute the bit-vector representing the arithmetic successor of the natural number represented by the bit-vector *bv*. Observe that the successor bit-vector function is correct when the input bit-vector is empty. It is also correct when there is overflow. Indeed, both sides are zeros of width size *bv* when overflow occurs.
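A direct Python transcription of succB makes it easy to test the statement of Lemma 2 exhaustively for small widths (a sketch; to_n and from_n model the coq-nbits conversions):

```python
from itertools import product

def succB(bv):
    """Increment an LSB-first bit list; the width is preserved, so the
    all-true vector wraps around to all-false (overflow)."""
    if not bv:
        return []
    return [False] + succB(bv[1:]) if bv[0] else [True] + bv[1:]

def to_n(bv): return sum(int(b) << i for i, b in enumerate(bv))
def from_n(w, n): return [(n >> i) & 1 == 1 for i in range(w)]

# Lemma 2, checked for every bit-vector of width 0..5
for w in range(6):
    for bv in map(list, product([False, True], repeat=w)):
        assert succB(bv) == from_n(w, to_n(bv) + 1)
```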

Other arithmetic bit-vector functions are defined and proved in coq-nbits similarly. Specifically, the arithmetic negation negB, addition addB, subtraction subB, unsigned multiplication mulB, unsigned division divB, and unsigned remainder remB functions are supported by coq-nbits. We give properties to relate the arithmetic functions for bit-vectors and natural numbers.

**Lemma 3.** *The following properties hold:*


Let *bv*, *cv* be bit-vectors of width w. Lemma 3 shows that the natural number represented by the bit-vector addB *bv cv* is equal to the modular sum of the natural numbers represented by *bv* and *cv*. Similarly, the natural number represented by mulB *bv cv* is equal to the modular product of the natural numbers represented by *bv* and *cv*. The division and remainder functions in coq-nbits follow the SMT-LIB semantics. Specifically, the quotient of any bit-vector divided by zero is equal to the bit-vector of all true's; the remainder of a bit-vector divided by zero is the bit-vector itself. For non-zero divisors, the division and remainder functions behave as expected. The natural number represented by the bit-vector divB *bv cv* is the quotient of the number represented by *bv* divided by the number represented by *cv*; and the bit-vector remB *bv cv* represents the remainder of the number represented by *bv* divided by the number represented by *cv*. Last but not least, the logical left (shlB) and right (shrB) shifts correspond to multiplication and division by powers of two respectively.
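The modular correspondences can be spelled out in Python for reference; note that coq-nbits defines addB with one-bit adders, whereas this sketch cheats by converting to numbers and back:

```python
def to_n(bv): return sum(int(b) << i for i, b in enumerate(bv))
def from_n(w, n): return [(n >> i) & 1 == 1 for i in range(w)]

def addB(bv, cv): return from_n(len(bv), to_n(bv) + to_n(cv))
def mulB(bv, cv): return from_n(len(bv), to_n(bv) * to_n(cv))
def shlB(s, bv):  return from_n(len(bv), to_n(bv) << s)

w = 4
for x in range(2 ** w):
    for y in range(2 ** w):
        bx, by = from_n(w, x), from_n(w, y)
        assert to_n(addB(bx, by)) == (x + y) % 2 ** w   # modular sum
        assert to_n(mulB(bx, by)) == (x * y) % 2 ** w   # modular product
assert to_n(shlB(2, from_n(w, 3))) == (3 * 2 ** 2) % 2 ** w
```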

coq-nbits also provides comparison predicates. In addition to the equality predicate == inherited from Boolean sequences, ltB *bv cv* and leB *bv cv* compare the natural numbers represented by the bit-vectors *bv* and *cv*. Properties about comparison predicates have also been proved in Coq.

**Lemma 4.** *The following properties hold:*

*1.* ∀*bv cv*, size *bv* = size *cv* =⇒ ltB *bv cv* = (to N *bv* < to N *cv*)*.*
*2.* ∀*bv cv*, size *bv* = size *cv* =⇒ leB *bv cv* = (to N *bv* ≤ to N *cv*)*.*

In addition to arithmetic functions and predicates in the unsigned representation, our formal bit-vector theory also defines arithmetic functions and predicates for bit-vectors in two's complement representation. For the signed representation, bit-vectors are converted to integers by the to Z function. Arithmetic bit-vector functions and predicates in the signed representation are related to arithmetic integer functions and predicates as follows.

**Lemma 5.** *The following properties hold:*

*1.* ∀*bv*, ¬(msb *bv* ∧ dropmsb *bv* = zeros (size *bv* − 1)) =⇒ to Z (negB *bv*) = −to Z *bv*.


In the lemma, sext n *bv* extends the bit-vector *bv* by n bits with the sign bit of *bv*, msb *bv* returns the sign bit of *bv*, and dropmsb *bv* drops the sign bit of *bv*. quot and rem are the quotient and remainder functions for Coq integers. Consider, for instance, the signed division function sdivB *bv cv* in coq-nbits (Lemma 5(4)). If the dividend *bv* is of width > 1, the widths of *bv* and the divisor *cv* are equal, and *bv* is not of the form #b100 ··· 0 or *cv* is not of the form #b11 ··· 1, then the bit-vector sdivB *bv cv* represents the quotient of the integers represented by *bv* and *cv*. The condition may appear counter-intuitive. To see why it is necessary, consider *bv* = #b100 ··· 0 and *cv* = #b11 ··· 1, both of width w. *bv* and *cv* thus represent the integers −2<sup>w−1</sup> and −1 respectively. Their quotient 2<sup>w−1</sup>, however, cannot be represented by bit-vectors of width w in two's complement representation. This corner input case is hence excluded. The corner case is also excluded from the arithmetic negation function (Lemma 5(1)).
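Both excluded corner cases can be checked with plain integer arithmetic; a width-4 sketch (ordinary Python, not coq-nbits code):

```python
w = 4
int_min = -2 ** (w - 1)        # to_Z of #b1000 at width 4 is -8

# Lemma 5(1)'s excluded case: bvneg maps the pattern #b1000 to itself
# (2^w - 8 = 8), so to_Z of the negation is -8 again, not -(-8) = 8.
neg_pattern = (2 ** w - (int_min % 2 ** w)) % 2 ** w
assert neg_pattern == int_min % 2 ** w     # same bit pattern 0b1000

# Lemma 5(4)'s excluded case: the quotient -8 / -1 = 8 lies outside
# the signed range [-8, 7] of width 4.
quot = int_min // -1
assert quot == 8 and not (-2 ** (w - 1) <= quot < 2 ** (w - 1))
```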

The coq-nbits theory differs from the prior Coq formalization in [15] in several important ways. Our formal bit-vector theory supports both the unsigned and two's complement signed representations. It also provides the arithmetic division and remainder functions. Since these features are needed in the QF BV logic of SMT-LIB, they are essential to the formalization of SMT QF BV queries. They are unfortunately lacking in the prior formalization. Another notable difference is the numeric representation used in the theory development. Since integers are needed for the QF BV logic, coq-nbits naturally uses binary representations for integers and natural numbers in Coq. The prior formalization, on the other hand, is mainly based on the unary natural number representation but provides conversion to positive integers in the binary representation.

# **5 Theory for SMT QF BV Queries**

Using coq-nbits, we formalize SMT QF BV queries. Our formalization consists of two parts: a syntactic representation for SMT QF BV queries in Coq inductive types and a formal semantics in our bit-vector theory coq-nbits.

#### **5.1 Syntax of SMT QF BV Queries**

An SMT QF BV query is a Coq term of the data type bexp. It can be one of the constants Bfalse and Btrue, the unary connective Bnot, or the binary connectives Band and Bor. Additionally, Bbveq and Bbvult, with two arguments of the data type exp, are binary QF BV predicates.

```
Inductive bexp : Type :=
  Bfalse : bexp
| Btrue : bexp
(* other QF_BV predicates *)
with exp : Type :=
(* other QF_BV operations *)
.
```
A Coq term of the data type exp represents a QF BV expression. It can be a QF BV variable Evar *vid* with a variable identifier *vid* : var, a bit-vector constant Econst *bv* with *bv* : bits, a bitwise-not operation Ebvnot e0, a bitwise-and operation Ebvand e0 e1, a bitwise-or operation Ebvor e0 e1, a logical left-shift operation Ebvshl e0 e1, or a logical right-shift operation Ebvlshr e0 e1. For arithmetic operations, there are Ebvneg e0 for negation, Ebvadd e0 e1 for addition, Ebvmul e0 e1 for multiplication, Ebvudiv e0 e1 for unsigned division, and Ebvurem e0 e1 for unsigned remainder with e0, e1 : exp. Finally, the extraction Eextract i j e0 and the concatenation Econcat e0 e1 operations have the data type exp with i, j : nat and e0, e1 : exp.

#### **5.2 Semantics of SMT QF BV Queries**

In our Coq formalization, an SMT QF BV query is interpreted on stores. A *store* is a mapping from QF BV variables to bits. Let σ be a store. The interpretation of *be* : bexp on σ is a Boolean value; the interpretation of e : exp on σ is a bit-vector. Semantic functions eval bexp and eval exp are as follows.

```
Fixpoint eval_bexp (be : bexp) (σ : store) : bool :=
  match be with
  | Bfalse => false
  | Btrue => true
  | Bnot be0 => ~~ (eval_bexp be0 σ)
  | Band be0 be1 => (eval_bexp be0 σ) && (eval_bexp be1 σ)
  | Bor be0 be1 => (eval_bexp be0 σ) || (eval_bexp be1 σ)
  | Bbveq e0 e1 => (eval_exp e0 σ) == (eval_exp e1 σ)
  | Bbvult e0 e1 => ltB (eval_exp e0 σ) (eval_exp e1 σ)
  (* other QF_BV predicates *)
  end
with eval_exp (e : exp) (σ : store) : bits :=
  match e with
  (* other QF_BV operations *)
  end.
```
An SMT QF BV query denotes a value in the Coq data type bool. Bfalse and Btrue denote false and true respectively. Boolean negation, conjunction, and disjunction correspond to ~~, &&, and || in bool respectively. For QF BV predicates, the bit-vector equality Bbveq is interpreted by the equality == for Boolean sequences. The coq-nbits function ltB is used to interpret Bbvult.

A QF BV expression denotes a bit-vector. For basic cases, QF BV variables are interpreted by corresponding bit-vectors in the store σ through the store access function Store.acc; bit-vector constants are interpreted by themselves. Bitwise logical operations Ebvnot, Ebvand, and Ebvor are interpreted by the corresponding coq-nbits functions invB, andB, and orB respectively. For logical shift operations, the offset e1 is first converted to a natural number through to nat (eval exp e1 σ) and then passed to the corresponding logical shift function shlB or shrB in coq-nbits. QF BV arithmetic operations are interpreted by corresponding coq-nbits arithmetic functions as expected. Finally, the extraction Eextract and concatenation Econcat operations are interpreted by extract and ++ in coq-nbits respectively.

In an SMT QF BV query, a QF BV variable designates a bit-vector of a certain width. An SMT QF BV query is hence associated with a *signature* Σ mapping QF BV variables to their respective widths. A store σ *conforms* to a signature Σ if the interpretation of each QF BV variable on σ has the same width as specified in Σ. Given an SMT QF BV query *be* : bexp with its signature Σ, *be* is *satisfiable* if there is a store σ conforming to Σ and eval bexp *be* σ = true.

#### **5.3 Derived QF BV Operations and Predicates**

In the QF BV logic of SMT-LIB, a number of QF BV operations and predicates are derived from a small set of core operations and predicates. Consider the signed comparison predicate *bvslt bv cv* in SMT-LIB:

*bvslt bv cv* ≜ (*or* (*and* (= (*extract* (w − 1) (w − 1) *bv*) #b1) (= (*extract* (w − 1) (w − 1) *cv*) #b0)) (*and* (= (*extract* (w − 1) (w − 1) *bv*) (*extract* (w − 1) (w − 1) *cv*)) (*bvult bv cv*))).

To compare two bit-vectors of width w in two's complement representation, the sign bits are checked. If *bv* is negative but *cv* is positive, *bvslt bv cv* is true. Otherwise, the signed predicate checks that both operands have the same sign and compares the operands using the unsigned comparison predicate. Interestingly, the arithmetic subtraction operation is actually a derived operation in SMT-LIB: *bvsub bv cv* ≜ *bvadd bv* (*bvneg cv*). The subtraction operation is defined as the bit-vector sum of the minuend and the negation of the subtrahend. It is *not*, for instance, defined as *nat2bv*(w, *bv2nat*(*bv*) − *bv2nat*(*cv*)) because *bv2nat*(*bv*) − *bv2nat*(*cv*) may not be a natural number.
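That *bvadd bv* (*bvneg cv*) nonetheless agrees with modular subtraction can be confirmed exhaustively at a small width (an illustrative Python model built on the unsigned conversions; coq-nbits itself defines subB with one-bit subtractors):

```python
from itertools import product

def to_n(bv): return sum(int(b) << i for i, b in enumerate(bv))
def from_n(w, n): return [(n >> i) & 1 == 1 for i in range(w)]

def negB(bv):     return from_n(len(bv), 2 ** len(bv) - to_n(bv))
def addB(bv, cv): return from_n(len(bv), to_n(bv) + to_n(cv))
def subB(bv, cv): return addB(bv, negB(cv))   # the SMT-LIB derivation

w = 3
for x, y in product(range(2 ** w), repeat=2):
    # matches subtraction modulo 2^w, even when x < y
    assert to_n(subB(from_n(w, x), from_n(w, y))) == (x - y) % 2 ** w
```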

For derived operations and predicates, there is a subtle yet important difference between our formal semantics and the one defined in SMT-LIB. In our formal bit-vector theory coq-nbits, most functions and predicates are defined directly. In particular, the arithmetic subtraction function subB is defined by one-bit subtractors in coq-nbits. Our formal semantics for the QF BV arithmetic operation *bvsub* is therefore defined by the corresponding bit-vector function subB. Since our formal semantics does not define *bvsub* by *bvadd* and *bvneg*, it could differ from that of SMT-LIB. In order to build a certified solver for the QF BV logic of SMT-LIB, it is necessary to establish semantic equivalences between the two semantic definitions for all derived QF BV operations and predicates.

To justify our formal semantics, we show the semantics of our definitions and those of SMT-LIB indeed denote the same bit-vector functions or predicates. Consider again the subtraction operation. Recall the semantics of the arithmetic operations *bvadd* and *bvneg* are defined by the bit-vector functions addB and negB respectively. The next lemma is useful to show the semantic equivalence:

**Lemma 6.** ∀*bv cv*, size *bv* = size *cv* =⇒ subB *bv cv* = addB *bv* (negB *cv*)*.*

For all derived QF BV operations and predicates, we give Coq proofs for the equivalence between our formal semantics and those of SMT-LIB. Particularly, semantics of all QF BV arithmetic operations and predicates over two's complement representation are equivalent to those in SMT-LIB. Our formal semantics for QF BV queries is thus certified to be equivalent to SMT-LIB.

# **6 Certified Bit Blasting**

Recall that a SAT query is a Boolean CNF formula. Given an SMT QF BV query, a bit blasting algorithm computes a SAT query that is satisfiable if and only if the given SMT QF BV query is satisfiable. Although it is the standard technique for solving SMT QF BV queries, bit blasting can be very complex due to arithmetic operations and various optimizations. Bit blasting algorithms therefore can be tedious to construct and thus prone to errors. We verify a bit blasting algorithm for SMT QF BV queries using our Coq formalization.

Let us start with a simple formalization of Boolean CNF formulae. In our formalization, a clause is represented by a sequence of literals; a CNF formula in turn is represented by a sequence of clauses. Let bvar be the data type for Boolean variables. We have the following data types in Coq:

```
Inductive lit : Set := Pos of bvar | Neg of bvar.
Definition clause : Set := seq lit.
Definition CNF : Set := seq clause.
```
Define an *environment* ε to be a mapping from bvar to bool. Given a literal ℓ, a CNF formula f, and an environment ε, it is straightforward to define the semantic functions eval lit ε ℓ : bool and eval cnf ε f : bool. A SAT query f is *satisfiable* if there is an environment ε such that eval cnf ε f = true.

To illustrate how our Coq proof works, consider the Tseitin transformation for the logical negation operation:

```
Definition bit_blast_Bnot (ℓ : lit) : lit * CNF :=
  let r := a fresh literal in
  (r, [:: [:: r; ℓ]; [:: !r; !ℓ] ]).
```
Given a literal ℓ, bit blast Bnot returns a new literal r and the CNF formula (r ∨ ℓ) ∧ (¬r ∨ ¬ℓ). The Tseitin transformation ensures that the interpretations of ℓ and r are complementary on any environment evaluating the CNF formula to true. We give a formal proof using our formalization in Coq:

**Lemma 7.** ∀ε ℓ r *cnf*, (r, *cnf*) = bit blast Bnot ℓ =⇒ eval cnf ε *cnf* = true =⇒ eval lit ε r = *~~* (eval lit ε ℓ)*.*

The idea generalizes to QF BV operations naturally. For each QF BV operation, we construct a literal sequence r and a Boolean CNF formula *cnf*. If *cnf* evaluates to true on an environment ε, the interpretation of r on ε needs to reflect the semantics of the QF BV operation. For instance, a Coq proof is given for the QF BV addition operation:

**Lemma 8.** ∀ε ℓ<sub>0</sub> ℓ<sub>1</sub> r *cnf*, (r, *cnf*) = bit blast Ebvadd ℓ<sub>0</sub> ℓ<sub>1</sub> =⇒ eval cnf ε *cnf* = true =⇒ eval lits ε r = addB (eval lits ε ℓ<sub>0</sub>) (eval lits ε ℓ<sub>1</sub>)*.*

Given two literal sequences ℓ<sub>0</sub> and ℓ<sub>1</sub>, bit blast Ebvadd ℓ<sub>0</sub> ℓ<sub>1</sub> returns a literal sequence r and a CNF formula *cnf*. If *cnf* evaluates to true on an environment ε, then the interpretation of the literal sequence r on ε is indeed the bit-vector sum of the interpretations of ℓ<sub>0</sub> and ℓ<sub>1</sub> on ε. Bit blasting algorithms for other QF BV operations are given and shown to reflect the semantics of the corresponding functions defined in the bit-vector theory coq-nbits. In particular, our bit blasting algorithms for arithmetic division and remainder correctly reflect the corresponding arithmetic bit-vector functions in coq-nbits.
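The flavor of such a lemma can be conveyed by a brute-force check in Python: a toy Tseitin-style ripple-carry bit blaster whose satisfying environments all respect the bit-vector sum (a hypothetical sketch, far simpler than the Coq development; names like bb_add are invented):

```python
from itertools import product

class Blaster:
    """Toy Tseitin-style bit blaster. Clauses use DIMACS conventions:
    a positive integer is a variable, a negative one its negation."""
    def __init__(self):
        self.nvars, self.cnf = 0, []
    def fresh(self):
        self.nvars += 1
        return self.nvars
    def const_false(self):
        f = self.fresh()
        self.cnf.append([-f])              # unit clause pins f to false
        return f
    def bb_xor(self, a, b):                # r <-> a xor b
        r = self.fresh()
        self.cnf += [[-r, a, b], [-r, -a, -b], [r, -a, b], [r, a, -b]]
        return r
    def bb_and(self, a, b):                # r <-> a /\ b
        r = self.fresh()
        self.cnf += [[-r, a], [-r, b], [r, -a, -b]]
        return r
    def bb_or(self, a, b):                 # r <-> a \/ b
        r = self.fresh()
        self.cnf += [[r, -a], [r, -b], [-r, a, b]]
        return r
    def bb_add(self, ls0, ls1):
        """Ripple-carry addition on LSB-first literal sequences."""
        carry, out = self.const_false(), []
        for a, b in zip(ls0, ls1):
            ab = self.bb_xor(a, b)
            out.append(self.bb_xor(ab, carry))
            carry = self.bb_or(self.bb_and(a, b), self.bb_and(ab, carry))
        return out

def reflects_addB(w):
    """Check the reflection property of Lemma 8 by brute force at width w."""
    bl = Blaster()
    xs = [bl.fresh() for _ in range(w)]
    ys = [bl.fresh() for _ in range(w)]
    rs = bl.bb_add(xs, ys)
    sat = lambda env, cl: any(env[abs(l)] == (l > 0) for l in cl)
    val = lambda env, ls: sum(env[l] << i for i, l in enumerate(ls))
    for bits in product([False, True], repeat=bl.nvars):
        env = dict(zip(range(1, bl.nvars + 1), bits))
        if all(sat(env, cl) for cl in bl.cnf):
            if val(env, rs) != (val(env, xs) + val(env, ys)) % 2 ** w:
                return False
    return True

assert reflects_addB(2)
```

Only the reflection direction is checked here; the existence of a satisfying environment for every input valuation holds by construction but is not asserted.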

Recall that the semantics for SMT QF BV queries is defined over stores for QF BV variables. In order to prove the correctness of bit blasting algorithms, one has to relate stores for QF BV variables with environments for Boolean variables. The relation is explicated through literal correspondences. A *literal correspondence* π is a mapping from QF BV variables to sequences of literals. For each QF BV variable v, the literal sequence π(v) is meant to interpret v on environments for Boolean variables. More formally, let eval lits ε ℓ : bits be the bit-vector for the literal sequence ℓ interpreted on the environment ε. The bit-vector eval lits ε π(v) is hence the interpretation of the QF BV variable v on the environment ε. Let σ be a store and π a literal correspondence. An environment ε is *consistent with* σ *through* π if the bit-vectors eval lits ε π(v) and Store.acc v σ are equal for every QF BV variable v in σ. Thus, an environment is consistent with a store if their interpretations of variables coincide.

It is now straightforward to give our bit blasting algorithm for SMT QF BV queries. For each QF BV expression, our algorithm first computes literals and CNF formulae for operands recursively. It then invokes an auxiliary bit blasting algorithm to construct result literals and a CNF formula for the QF BV operation. The literal correspondence is also updated when literals are allocated for QF BV variables. Finally, the result literals and the updated literal correspondence are returned along with the concatenation of all CNF formulae.

```
Fixpoint bit_blast_bexp Σ π be : lit * correspondence * CNF :=
  match be with
  | Bnot be0 =>
      let (r0, π', cnf0) := bit_blast_bexp Σ π be0 in
      let (r, cnf) := bit_blast_Bnot r0 in
      (r, π', cnf ++ cnf0)
  (* other QF_BV predicates *)
  end
with bit_blast_exp Σ π e : seq lit * correspondence * CNF :=
  match e with
  | Evar v =>
      if π(v) is defined then (π(v), π, [::])
      else let r := fresh literals for v according to Σ in
           let π' := update π with v ↦ r in
           (r, π', [::])
  | Ebvadd e0 e1 =>
      let (r0, π', cnf0) := bit_blast_exp Σ π e0 in
      let (r1, π'', cnf1) := bit_blast_exp Σ π' e1 in
      let (r, cnf) := bit_blast_Ebvadd r0 r1 in
      (r, π'', cnf ++ cnf0 ++ cnf1)
  (* other QF_BV operations *)
  end.
```
The following Coq theorem establishes the connection between the output literals and the input SMT QF BV query or expression of the algorithm.

**Theorem 1.** *Let be* : bexp *be an* SMT QF BV *query with the signature* Σ*be ,* e : exp *a* QF BV *expression with the signature* Σ*e, and* π<sup>0</sup> *the empty literal correspondence.*


Let *be* be an SMT QF BV query with the signature Σ*be* , *r* and *cnf* the literal and CNF formula returned by bit blast bexp respectively. Consider any store conforming to Σ*be* and any environment consistent with the store. If the environment evaluates the formula *cnf* to true, Theorem 1 says that the literal *r* and the SMT QF BV query *be* evaluate to the same Boolean value on the environment and store respectively. In other words, the algorithm bit blast bexp is a generalized Tseitin transformation for SMT QF BV queries. Particularly, all QF BV arithmetic operations (addition, subtraction, multiplication, division, and remainder in the unsigned and two's complement representations) are transformed to CNF formulae with formal proofs of correctness in Coq.

A useful corollary to Theorem 1 is the reduction of the satisfiability of SMT QF BV queries to the satisfiability of SAT queries.

**Corollary 1.** *Let be* : bexp *be an* SMT QF BV *query with the signature* Σ*be and* π<sup>0</sup> *the empty literal correspondence. Then*

$$
\begin{array}{l}
(r, \pi, \mathit{cnf}) = \mathtt{bit\_blast\_bexp}\ \Sigma_{be}\ \pi_0\ \mathit{be} \Longrightarrow\\
\quad \big( (\exists \sigma.\ \sigma\ \text{conforms to}\ \Sigma_{be} \wedge \mathtt{eval\_bexp}\ \mathit{be}\ \sigma = \mathtt{true}) \Longleftrightarrow (\exists \epsilon.\ \mathtt{eval\_cnf}\ \epsilon\ ([::\, [::\, r]] \mathbin{+\!\!+} \mathit{cnf}) = \mathtt{true}) \big)
\end{array}
$$

Corollary 1 gives the formal proof of correctness for our bit blasting algorithm bit blast bexp. Let *be* be an arbitrary SMT QF BV query, r and *cnf* the literal and the CNF formula returned by the algorithm. The corollary shows that the query *be* is satisfiable if and only if the SAT query r ∧ *cnf* is satisfiable. An equi-satisfiable SAT query is thus obtained from the bit blasting algorithm on every input SMT QF BV query, with a formal proof of correctness.

Recall that several QF BV operations and predicates are derived from a small number of operations and predicates in SMT-LIB. A naïve bit blasting algorithm could expand derived operations or predicates, and then perform bit blasting on a small set of operations and predicates. Such an algorithm would have a simpler proof of correctness but generate more intermediate literals and clauses. For instance, the naïve algorithm for *bvsub* would perform bit blasting on *bvneg* followed by *bvadd* with intermediate literals and clauses. Our bit blasting algorithm for *bvsub*, on the other hand, reflects our semantics defined by the bit-vector function subB. Intermediate literals or clauses are not needed. Our bit blasting algorithm hence transforms *bvsub* more economically than the naïve algorithm.

To improve our bit blasting algorithm further, a cache for QF BV expressions and predicates is added. In large queries, the same QF BV expressions and predicates can occur many times. If a QF BV expression has several occurrences, our basic bit blasting algorithm generates result literals and CNF formulae for each occurrence. Consider the SMT QF BV query

(*and* (*bvslt* #b1000 (*bvadd* x y)) (*bvslt* (*bvadd* x y) #b0111)).

The query checks whether the sum of the QF BV variables x and y can be in a proper range. Since the Boolean predicate *and* has two operands, our basic algorithm invokes the auxiliary bit blasting algorithm for the two comparison predicates. It in turn blasts the same expression *bvadd* x y twice. Repeated bit blasting on the same expression or predicate is redundant. A hash function can detect repeated QF BV expressions and predicates easily. When an expression or a predicate recurs, the previously computed literals with the empty CNF formula are returned from a cache as the result. More importantly, we give a formal Coq proof of Corollary 1 for the bit blasting algorithm with a cache.
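The effect of the cache can be illustrated on expression trees encoded as nested tuples (a schematic Python model; real caching in our algorithm works on the Coq bexp/exp data types and is itself verified):

```python
def blast(e, cache, stats):
    """Recursively 'blast' a toy expression tree; a recurring subexpression
    is looked up in the cache and contributes no new clauses."""
    if e in cache:
        stats['hits'] += 1
        return cache[e], []                # reuse literals, empty CNF
    op, *args = e
    lits, cnf = [op], []
    for a in args:
        if isinstance(a, tuple):
            ls, cs = blast(a, cache, stats)
            lits.append(ls); cnf += cs
        else:
            lits.append(a)
    cnf.append(tuple(lits))                # stand-in for Tseitin clauses
    cache[e] = tuple(lits)
    return cache[e], cnf

add = ('bvadd', ('var', 'x'), ('var', 'y'))
query = ('and', ('bvslt', ('const', '#b1000'), add),
                ('bvslt', add, ('const', '#b0111')))
stats = {'hits': 0}
_, cnf = blast(query, {}, stats)
assert stats['hits'] == 1   # the shared (bvadd x y) is blasted only once
```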

# **7 A Certified SMT QF BV Solver**

We have so far built a formally verified bit blasting algorithm for SMT QF BV queries. Using the code extraction mechanism in Coq, an OCaml program corresponding to the verified bit blasting algorithm is obtained. Together with a SAT solver and a SAT certificate checker, a certified SMT QF BV solver can be constructed. Figure 1 gives the flow of our certified solver.

**Fig. 1.** Certified SMT QF BV Solver

In the figure, the extracted OCaml program takes an OCaml expression *be* of the type bexp as an input (Sect. 5). The verified program performs bit blasting on the SMT QF BV query and returns an OCaml expression *cnf* of the type lit list list representing a SAT query (Sect. 6). Precisely, an OCaml term of the type lit represents a literal. The OCaml type lit list corresponds to the data type for clauses; and the type lit list list corresponds to the data type for CNF formulae. The expression *cnf* is sent to a SAT solver to check satisfiability. If the SAT solver reports SAT, the SMT QF BV query represented by *be* is satisfiable. Otherwise, the SAT solver reports UNSAT with a certificate. The certificate is sent to a SAT certificate checker for validation. If it is validated, the SMT QF BV query *be* is unsatisfiable with certification.

# **8 Experiments**

In order to evaluate the performance of our verified OCaml bit blasting program, we instantiate our SMT QF BV solver CoqQFBV based on Fig. 1 as follows. We write an OCaml parser to translate a text file in the SMT-LIB format to an SMT QF BV query in our formal syntax. The query is sent to the verified OCaml program for bit blasting. We then add an OCaml program to transform the output SAT query to a text file in the DIMACS format. The 2020 SAT Competition winner Kissat [5] is used to check the satisfiability of the SAT query. If the SAT solver reports UNSAT with a certificate in the DRAT format [31], the certificate is sent to the verified certificate checker GratChk [16] for validation. Certificate checkers for SAT solvers use much simpler algorithms than certificate checkers for SMT solvers. They are hence easier to build and prove correct. The correctness of GratChk is in fact verified by the proof assistant Isabelle [22]. We need not trust the certificate checker either.

We ran two experiments to evaluate our certified SMT QF BV solver. The first experiment is the QF BV division of the single query track in the 2020 SMT Competition [2]. The second experiment consists of verification problems from various assembly implementations for linear field arithmetic in cryptography libraries such as OpenSSL [30], RELIC [1], and BLST [29]. We compare CoqQFBV against three SMT QF BV solvers: CVC4 [4] with an LFSC certificate checker [27], the 2020 SMT QF BV division winner Bitwuzla [20], and the 2019 SMT QF BV division winner Boolector [21]. Bitwuzla and Boolector are designed for efficiency without certification. CVC4 provides an LFSC certificate checker implemented in C [26]. The certificate checker can validate certificates from different theories but is itself not verified. All experiments were run on a Linux machine with a 3.20 GHz CPU and 1 TB memory.<sup>1</sup>

#### **8.1 SMT QF BV Competition**

The first experiment is running our certified solver CoqQFBV on tasks from the QF BV division of the 2020 SMT Competition. We set a 60 GB memory limit and a 20-minute timeout for each task, as in the competition. A task solves a single SMT-LIB file sequentially. The SMT QF BV division contains 6861 files in the SMT-LIB format. All files are marked with *unsat*, *sat*, or *unknown*, indicating the expected query results. To save running time, we ran 10 tasks concurrently. The experimental results are summarized in Table 1.

In the table, the column N<sub>SC</sub> indicates the number of tasks solved with certification, O<sub>SC</sub> the number of timeouts, E<sub>SC</sub> the number of unsolved tasks due to tool errors, and T<sub>SC</sub> the average time for solved tasks. CoqQFBV solves 6087 tasks (88.72%) and CVC4 with its certificate checker solves 3840 (55.97%) with certification. We observe three stack overflow errors during bit blasting in CoqQFBV; these errors are induced by deep recursion. Among the 328 errors from CVC4, 249 are segmentation faults raised by the LFSC certificate checker.

<sup>1</sup> CoqQFBV is available at https://github.com/fmlab-iis/coq-qfbv.git.


**Table 1.** Experimental results on the 2020 SMT QF BV division

**Table 2.** Experimental results on the 2020 SMT QF BV division by categories


The same table also compares against efficient but uncertified solvers. To evaluate the overhead of certificate checking, the two certified solvers CoqQFBV and CVC4 still generate certificates but do not validate them. The column N<sub>S</sub> gives the number of solved tasks without certification, O<sub>S</sub> the number of timeouts, E<sub>S</sub> the number of errors, and T<sub>S</sub> the average time for solved tasks. Our certified solver CoqQFBV finishes 6169 tasks (89.91%); the CVC4 solver finishes 4255 (62.02%). CoqQFBV and CVC4 thus solve 82 (= 6169 − 6087) and 415 (= 4255 − 3840) more tasks without certification, respectively. Since our bit blasting algorithm is verified once for all inputs, CoqQFBV need not certify bit blasting on each query and hence incurs less overhead. The 2020 and 2019 SMT QF BV division winners Bitwuzla and Boolector finish 6739 (98.22%) and 6719 (97.93%) tasks without certification, respectively. CoqQFBV with certification solves about 10% fewer tasks than the 2020 division winner Bitwuzla without certification, and it performs significantly better than CVC4 with a general SMT certificate checker.

Table 2 compares the four solvers by tasks from the three expected query results. Among the 4238 *unsat* tasks, CoqQFBV and CVC4 give certified answers to 3838 (90.56%) and 1762 (41.58%) of them respectively. The column P<sub>SU</sub> gives the average size of certificates. The efficient solvers Bitwuzla and Boolector give 4188 (98.82%) and 4180 (98.63%) uncertified answers respectively.

Among the 2553 *sat* tasks, Bitwuzla and Boolector finish 2524 (98.86%) and 2516 (98.55%) of them respectively. CoqQFBV and CVC4 solve only 2242 (87.82%) and 2078 (81.39%) *sat* tasks respectively. For the 70 tasks marked *unknown*, Bitwuzla and Boolector respectively answer 27 (38.57%) and 23 (32.86%) of them without certification. Our certified SMT QF BV solver resolves two of these tasks as *sat* and five as *unsat*; the answers to the five *unsat* tasks are all certified. CVC4 with its certificate checker fails to solve any *unknown* task. For the benchmarks from the 2020 SMT QF BV division, our certified solver CoqQFBV appears to be more scalable than CVC4 with its general SMT certificate checker.

**Table 3.** Average time for CoqQFBV components


Table 3 further decomposes the time spent on different components of CoqQFBV. The column T<sub>BB</sub> gives the average time for our verified OCaml bit blasting program, T<sub>SAT</sub> the average time used by the SAT solver Kissat, and T<sub>Cert</sub> the average time for the certificate checker GratChk. For the tasks in the QF BV division, the times for SAT solving and certificate checking are comparable. In comparison, the OCaml bit blasting program seems to take an unexpectedly large amount of time and hence can still be improved.

#### **8.2 Linear Field Arithmetic in Cryptography**

In this section, we evaluate our certified SMT QF BV solver on benchmarks from real-world assembly implementations in various cryptography libraries such as OpenSSL [30], RELIC [1], and BLST [29]. In elliptic curve cryptography, arithmetic operations over large finite fields are needed. A field element is typically represented by hundreds of bits. A field arithmetic operation takes two field elements and returns a field element as the result. In the signature scheme Ed25519 used in OpenSSH, for instance, a field element belongs to the residue system modulo the prime number 2<sup>255</sup> − 19. The field sum of two field elements is obtained by the arithmetic sum modulo 2<sup>255</sup> − 19. Commodity processors however do not


**Table 4.** Experimental results on cryptographic assembly program verification

support arithmetic instructions with operands in hundreds of bits natively. Field arithmetic has to be implemented by 32- or 64-bit instructions. The functional specification of the field addition used in Ed25519 may look as follows.

$$\left\{\; \sum\_{i=0}^{3} a\_i \times 2^{64 \times i} < 2^{255} - 19 \;\wedge\; \sum\_{i=0}^{3} b\_i \times 2^{64 \times i} < 2^{255} - 19 \;\right\}$$

$$\texttt{x25519\_fe64\_add}(r\_0, r\_1, r\_2, r\_3, a\_0, a\_1, a\_2, a\_3, b\_0, b\_1, b\_2, b\_3)$$

$$\left\{\; \sum\_{i=0}^{3} r\_i \times 2^{64 \times i} \equiv \sum\_{i=0}^{3} a\_i \times 2^{64 \times i} + \sum\_{i=0}^{3} b\_i \times 2^{64 \times i} \pmod{2^{255} - 19} \;\wedge\; \sum\_{i=0}^{3} r\_i \times 2^{64 \times i} < 2^{255} - 19 \;\right\}$$

Let a<sub>i</sub>, b<sub>i</sub>, r<sub>i</sub> be 64-bit variables (registers) for 0 ≤ i ≤ 3. The specification says that the output field element represented by the r<sub>i</sub>'s computed by the program x25519 fe64 add is the field arithmetic sum of the input elements represented by the a<sub>i</sub>'s and b<sub>i</sub>'s. In finite field arithmetic programs, over- or under-flows in assembly instructions lead to incorrect results, so bit-accurate program verification is required. We obtain 46 implementations and generate 96 SMT QF BV queries from their verification conditions in order to evaluate our certified solver in this experiment.
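The limb-based representation in the specification can be modelled concretely. The sketch below (an assumed reference semantics, not the actual x25519 fe64 add assembly) holds a field element modulo p = 2<sup>255</sup> − 19 in four 64-bit limbs, least significant first:

```python
# Reference model of the field-addition specification above (assumed
# semantics for illustration; not the verified assembly). A field element
# mod p = 2**255 - 19 is held in four 64-bit limbs, least significant first.

P = 2**255 - 19

def limbs_to_int(limbs):
    # sum_{i=0}^{3} limbs[i] * 2^(64*i), as in the specification.
    return sum(l << (64 * i) for i, l in enumerate(limbs))

def int_to_limbs(x):
    return [(x >> (64 * i)) & (2**64 - 1) for i in range(4)]

def fe64_add(a_limbs, b_limbs):
    # Reference behaviour: arithmetic sum reduced mod p, so the result
    # satisfies both the congruence and the bound in the postcondition.
    return int_to_limbs((limbs_to_int(a_limbs) + limbs_to_int(b_limbs)) % P)
```

The verification task is to show that a sequence of 64-bit instructions agrees with this reference behaviour for all inputs satisfying the precondition.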

Table 4 shows the verification results with the same memory and time limits as in the 2020 SMT Competition. All SMT QF BV queries are expected to be unsatisfiable. Boolector successfully solves all queries (100%) without certification. The 2020 QF BV division winner Bitwuzla finishes 88 queries (91.67%) without certification. Surprisingly, CoqQFBV gives certified answers to 93 queries (96.88%). The verified SAT certificate checker GratChk used in CoqQFBV successfully validates all certificates for the real-world cryptographic program verification problems. In comparison, CVC4 solves 46 queries (47.92%) but certifies only 19 (19.79%). The CVC4 certificate checker raises segmentation faults on the 27 (= 46 − 19) solved but uncertified queries. These certificates are perhaps too complicated to be validated by the unverified LFSC certificate checker. For the SMT QF BV queries from real-world program verification problems, our certified solver CoqQFBV seems to perform slightly better than the efficient but uncertified SMT QF BV solver Bitwuzla. Our certified solver is probably scalable enough for certain bit-accurate program verification problems.

# **9 Conclusion**

We combine algorithm design with interactive theorem proving to build a scalable certified SMT QF BV solver CoqQFBV in this work. Our certified solver employs a verified OCaml bit blasting program and the verified certificate checker GratChk to improve the confidence in SMT QF BV query results. Experiments on the QF BV division of the 2020 SMT Competition and on real-world cryptographic program verification suggest that CoqQFBV is useful.

For future work, we plan to specify and verify more heuristics to further optimize CoqQFBV. In particular, cryptographic program verification requires more sophisticated range checks; more verified bit blasting algorithms for such checks would further improve the confidence in bit-accurate program verification.

**Acknowledgements.** We thank all the anonymous reviewers for their insightful comments and suggestions. We thank the authors of SSReflect for its powerful language and libraries. We would like to give special thanks to Prof. Moshe Vardi for his encouragement. The work is supported by the National Natural Science Foundation of China under the Grant Numbers 62002228, 61802259, and 61836005; the Guangdong Science and Technology Department under the Grant Number 2018B010107004; the Ministry of Science and Technology of Taiwan under the Grant Numbers MOST108-2221-E-001-010-MY3 and MOST108-2221-E-001-009-MY2; Academia Sinica for the Sinica Investigator Award AS-IA-109-M01; and the Data Safety and Talent Cultivation Project AS-KPQ-109-DSTCP.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Porous Invariants**

Engel Lefaucheux<sup>1</sup>, Joël Ouaknine<sup>1</sup>, David Purser<sup>1(B)</sup>, and James Worrell<sup>2</sup>

<sup>1</sup> Max Planck Institute for Software Systems, Saarland Informatics Campus, Saarbrücken, Germany dpurser@mpi-sws.org

<sup>2</sup> Department of Computer Science, Oxford University, Oxford, UK

**Abstract.** We introduce the notion of *porous invariants* for multipath (or branching/nondeterministic) affine loops over the integers; these invariants are not necessarily convex, and can in fact contain infinitely many 'holes'. Nevertheless, we show that in many cases such invariants can be automatically synthesised, and moreover can be used to settle (non-)reachability questions for various interesting classes of affine loops and target sets.

**Keywords:** Linear dynamical systems · Linear loops · Invariants · Reachability · Presburger arithmetic

# **1 Introduction**

We consider the reachability problem for multipath (or branching) affine loops over the integers, or equivalently for nondeterministic integer linear dynamical systems. A (deterministic) integer linear dynamical system consists of an update matrix M ∈ Z<sup>d×d</sup> together with an initial point x<sup>(0)</sup> ∈ Z<sup>d</sup>. We associate to such a system its infinite orbit (x<sup>(i)</sup>), the sequence of reachable points defined by the rule x<sup>(i+1)</sup> = Mx<sup>(i)</sup>. The reachability question then asks, given a target set Y, whether the orbit ever meets Y, i.e., whether there exists some time i such that x<sup>(i)</sup> ∈ Y. The nondeterministic reachability question allows the linear update map to be chosen at each step from a fixed finite collection of matrices.

When the orbit does eventually hit the target, one can easily substantiate this by exhibiting the relevant finite prefix. However, establishing non-reachability is intrinsically more difficult, since the orbit consists of an infinite sequence of points. One requires some sort of finitary certificate, which must be a relatively simple object that can be inspected and which provides a proof that the set Y is indeed unreachable. Typically, such a certificate will consist of an overapproximation I of the set R of reachable points, in such a manner that one can check both that Y ∩ I = ∅ and R ⊆ I; such a set I is called an invariant.

Formally we study the following problem for *inductive invariants*:

The full version of this paper is available at http://arxiv.org/abs/2106.00662. © The Author(s) 2021

A. Silva and K. R. M. Leino (Eds.): CAV 2021, LNCS 12760, pp. 172–194, 2021. https://doi.org/10.1007/978-3-030-81688-9_8

**Meta Problem 1.** *Consider a system with update functions* f1,...,f*n. A set* I *is an inductive invariant if* f*i*(I) ⊆ I *for all* i*. Given a reachability query* (x, Y ) *we search for a separating inductive invariant* I *such that* x ∈ I *and* Y ∩ I = ∅*.*

Meta Problem 1 is parametrised by the type of invariants and targets that are considered; that is, what are the classes of allowable invariant sets I and target sets Y , or equivalently how are such sets allowed to be expressed.

Fixing a particular invariant and target domain, a reachability query has three possible scenarios: (1) the instance is reachable, (2) the instance is unreachable and a separating invariant from the domain exists, or (3) the instance is unreachable but no separating invariant exists. Ideally, one would wish to provide a sufficiently expressive invariant domain so that the latter case does not occur, whilst keeping the resulting invariants as simple as possible and computable. For some classes of systems, it is known that distinguishing reachability (1) from unreachability (2, 3) is undecidable; it can also happen that determining whether a separating invariant exists (i.e., distinguishing (2) from (3)) is undecidable.

We note that the existence of *strongest* inductive invariants<sup>1</sup> is a desirable property for an invariant domain—when strongest invariants exist (and can be computed), separating (2) from (1, 3) is easy: compute the strongest invariant, and check whether it excludes the target state or not; if so, then you are done, and if not, no other invariant (from that class) can possibly do the trick either. However, unless (3) is excluded, computing the strongest invariant does not necessarily imply that reachability is decidable. Unfortunately, strongest invariants are not always guaranteed to exist for a particular invariant domain, although some separating inductive invariant may still exist for every target (or indeed may not).

In prior work from the literature, typical classes of invariants are usually convex, or finite unions of convex sets. In this paper we consider certain classes of invariants that can have infinitely many 'holes' (albeit in a structured and regular way); we call such sets *porous invariants*. These invariants can be represented via Presburger arithmetic<sup>2</sup>. We shall work instead with the equivalent formulation of semi-linear sets, generalising ultimately periodic sets to higher dimensions, as finite unions of linear sets of the form {<sup>b</sup> <sup>+</sup> <sup>p</sup>1<sup>N</sup> <sup>+</sup> ··· <sup>+</sup> <sup>p</sup>*m*N} (by which we mean {<sup>b</sup> <sup>+</sup> <sup>a</sup>1p<sup>1</sup> <sup>+</sup> ··· <sup>+</sup> <sup>a</sup>*m*p*<sup>m</sup>* <sup>|</sup> <sup>a</sup>1,...,a*<sup>m</sup>* <sup>∈</sup> <sup>N</sup>}, see Definition 2).

Let us first consider a motivating example:

*Example 1 (Hofstadter's MU Puzzle* [7]*).* Consider the following term-rewriting puzzle over alphabet {M, U, I}. Start with the word MI, and by applying the following grammar rules (where y and z stand for arbitrary words over our alphabet), we ask whether the word MU can ever be reached.

$$yI \to yIU \quad | \quad My \to Myy \quad | \quad yIIIz \to yUz \quad | \quad yUUz \to yz$$

<sup>1</sup> Given two invariants I and I′, we say that I is *stronger* than I′ iff I ⊆ I′; thus *strongest* invariants correspond to *smallest* invariant sets.

<sup>2</sup> Presburger arithmetic is a decidable theory over the natural numbers, comprising Boolean operations, first-order quantification, and addition (but not multiplication).

The answer is *no*. One way to establish this is to keep track of the number of occurrences of the letter 'I' in the words that can be produced, and observe that this number (call it x) will always be congruent to either 1 or 2 modulo 3. In other words, it is not possible to reach the set {x | x ≡ 0 mod 3}. Indeed, Rules 2 and 3 are the only rules that affect the number of I's, and can be described by the system dynamics x → 2x and x → x − 3. Hence the MU Puzzle can be viewed as a one-dimensional system with two affine updates,<sup>3</sup> or a two-dimensional system with two linear updates.<sup>4</sup> The set {1 + 3Z} ∪ {2 + 3Z} is an inductive invariant, and we wish to synthesise it. (The stability of this set under our two affine functions is easily checked: both components are invariant under x → x − 3, while under x → 2x we have {1 + 3Z} → {2 + 6Z} ⊆ {2 + 3Z}, and similarly {2 + 3Z} → {4 + 6Z} ⊆ {1 + 3Z}.)

The problem can be rephrased as a safety property of the following multipath loop: verify that the 'bad' state x = 0 is never reached, or equivalently that the loop can never halt, regardless of the nondeterministic choices made.

x = 1
while x ≠ 0
&nbsp;&nbsp;&nbsp;&nbsp;x = 2x || x = x − 3

(where || represents nondeterministic branching)

The MU Puzzle was presented as a challenge for algorithmic verification in [4]; the tools considered in that paper (and elsewhere, to the best of our knowledge) rely upon the manual provision of an abstract invariant template. Our approach is to find the invariant fully automatically (although one must still abstract from the MU Puzzle the correct formulation as the program x → 2x || x → x − 3).
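The invariant of Example 1 can be checked mechanically. The Python sketch below verifies that the set {x | x ≡ 1 or 2 (mod 3)} contains the initial count x = 1, excludes the target x = 0, and is closed under both updates; since 2x mod 3 and (x − 3) mod 3 depend only on x mod 3, it suffices to check one representative per residue class:

```python
# Mechanically checking the inductive invariant from Example 1:
# I = {x : x % 3 in {1, 2}} contains the initial count x = 1, excludes
# the target x = 0, and is closed under x -> 2x and x -> x - 3.

def in_invariant(x):
    return x % 3 in (1, 2)

updates = (lambda x: 2 * x, lambda x: x - 3)

# Both updates respect congruence mod 3, so checking one representative
# per residue class of the invariant suffices for inductiveness.
closed = all(in_invariant(f(x)) for x in (1, 2) for f in updates)
```

This is exactly the kind of check a synthesised porous invariant makes possible: a finite verification that certifies non-reachability of x = 0 over an infinite orbit.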

**Main Contributions.** Our focus is on the automatic generation of porous invariants for multipath affine loops over the integers, or equivalently nondeterministic integer linear dynamical systems.

	- We establish the existence of *strongest* <sup>Z</sup>-linear invariants, and show that they can be found algorithmically (Theorem 2). These invariants may or may not separate the target under consideration.
	- If a Z-linear invariant is not separating, we may instead look for an N-semi-linear invariant (which generalises both Z-semi-linear and N-linear invariants), and we show that such an invariant can always be found

<sup>3</sup> One-dimensional affine updates are functions of the form f(x) = ax + b.

<sup>4</sup> $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ 1 \end{pmatrix} = \begin{pmatrix} ax + b \\ 1 \end{pmatrix}$ models affine functions using a matrix representation, holding one of the entries fixed to 1.

**Table 1.** Results for integer linear dynamical systems for a point target. Det/Non refers to deterministic or nondeterministic LDS. "Subsumed by . . . " means that sufficient invariants can be generated, but of a more general type.


for any unreachable point target when dealing with *deterministic* integer linear dynamical systems (Theorem 4).


#### **1.1 Related Work**

The reachability problem (in arbitrary dimension) for loops with a single affine update, or equivalently for deterministic linear dynamical systems, is decidable in polynomial time for point targets (that is Y = {y}), as shown by Kannan and Lipton [16]. However for nondeterministic systems (where the update matrix is chosen nondeterministically from a finite set at each time step), reachability is undecidable, by reduction from the matrix semigroup membership problem [22].

In particular this entails that for unreachable nondeterministic instances we cannot hope *always* to be able to compute a separating invariant. In some cases

<sup>5</sup> The affine span covers the entire space.

we may compute the strongest invariant (which may suffice if this invariant happens to be separating for the given reachability query), or we may compute an invariant in sub-cases for which reachability is decidable (for example in low dimensions). For some classes of invariants, it is also undecidable whether an invariant exists (e.g., polyhedral invariants [8]).

Various types of invariants have been studied for linear dynamical systems, including polyhedra [8,23], algebraic [15], and o-minimal [1] invariants. For certain classes of invariants (e.g., algebraic [15]), it is decidable whether a separating invariant exists, notwithstanding the reachability problem being undecidable. Other works (e.g., [5]) use heuristic approaches to generate invariants, without aiming for any sort of completeness.

Kincaid, Breck, Cyphert and Reps [18] study loops with linear updates, studying the closed forms for the variables to prove safety and termination properties. Such closed forms, when expressible in certain arithmetic theories, can be interpreted as another type of invariant and can be used to over-approximate the reachable sets. The work is restricted to a single update function (deterministic loops) and places additional constraints on the updates to bring the closed forms into appropriate theories.

Bozga, Iosif and Konečný's FLATA tool [2] considers affine functions in arbitrary dimension. However, it is restricted to affine functions with finite monoids; in our one-dimensional case this would correspond to limiting oneself to counter-like functions of the form f(x) = x + b.

Finkel, Göller and Haase [9], extending Fremont [10], show that reachability in a single dimension is **PSPACE**-complete for polynomial update functions (allowing states, which can be used to control the sequences of updates that may be applied). The affine functions (and single-state restriction) we consider are a special case, but we focus on producing invariants to disprove reachability.

Other tools, e.g., AProVE [11] and Büchi Automizer [14], may (dis-)prove termination/reachability on *all* branches, but may not be able to prove termination/reachability on *some* branch.

Inductive invariants specified in Presburger arithmetic have been used to disprove reachability in vector addition systems [20]. A generalisation, 'almost semi-linear sets' [21] are also non-convex and can capture exactly the reachable points of vector addition systems. Our nondeterministic linear dynamical systems can be seen as vector addition systems over Z extended with affine updates (rather than only additive updates).

# **2 Preliminaries**

We denote by Z the integers and by N the non-negative integers. We say that x, y ∈ Z are congruent modulo d ∈ N, denoted x ≡ y mod d, if d divides x − y. Given an integer x and a natural number d, we write (x mod d) for the number in {0,...,d − 1} such that (x mod d) ≡ x mod d.

**Definition 1 (Integer Linear Dynamical Systems).** *A* d*-dimensional integer linear dynamical system (LDS)* (x<sup>(0)</sup>, {M1,...,M*k*}) *is defined by an initial point* x<sup>(0)</sup> ∈ Z<sup>d</sup> *and a set of integer matrices* {M1,...,M*k*} ⊆ Z<sup>d×d</sup>*. An LDS is* deterministic *if it comprises a single matrix (*k = 1*) and is otherwise* nondeterministic*.*

*A point* <sup>y</sup> *is* reachable *if there exists* <sup>m</sup> <sup>∈</sup> <sup>N</sup> *and* <sup>B</sup>1,...,B*<sup>m</sup> such that* <sup>B</sup><sup>1</sup> ··· <sup>B</sup>*m*x(0) <sup>=</sup> <sup>y</sup> *and* <sup>B</sup>*<sup>i</sup>* ∈ {M1,...,M*k*} *for all* <sup>1</sup> <sup>≤</sup> <sup>i</sup> <sup>≤</sup> <sup>m</sup>*.*

*The* reachability set O ⊆ <sup>Z</sup>*<sup>d</sup> of an LDS is the set of reachable points.*
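Definition 1 can be illustrated with a bounded enumeration of the reachability set; the true set O is in general infinite, so the sketch below truncates after a fixed number of steps. The MU Puzzle dynamics are encoded as a two-dimensional nondeterministic LDS via the affine-to-linear trick of footnote 4:

```python
# Illustrative bounded enumeration of the reachability set of a
# (nondeterministic) integer LDS as in Definition 1. The true set O is
# generally infinite, so we stop after a fixed number of steps.

def reach(x0, matrices, steps):
    def apply(M, x):
        # Matrix-vector product over the integers.
        return tuple(sum(M[i][j] * x[j] for j in range(len(x)))
                     for i in range(len(M)))
    seen = {tuple(x0)}
    frontier = set(seen)
    for _ in range(steps):
        frontier = {apply(M, x) for M in matrices for x in frontier} - seen
        seen |= frontier
    return seen

# The MU Puzzle dynamics x -> 2x and x -> x - 3 as 2-D linear maps,
# with the affine constant handled by a coordinate fixed at 1.
M_double = [[2, 0], [0, 1]]
M_minus3 = [[1, -3], [0, 1]]
```

Every point the enumeration produces lies in the inductive invariant of Example 1, consistent with the invariant containing the reachability set.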

**Definition 2 (**K**-(semi)-linear sets).** *A* linear set L *is defined by a base vector* <sup>b</sup> <sup>∈</sup> <sup>Z</sup>*<sup>d</sup> and period vectors* <sup>p</sup>1,...,p*<sup>d</sup>* <sup>∈</sup> <sup>Z</sup>*<sup>d</sup> such that*

$$L = \{b + a\_1p\_1 + \dots + a\_dp\_d \mid a\_1, \dots, a\_d \in \mathbb{K}\}.$$

*For convenience we often write* {<sup>b</sup> <sup>+</sup> <sup>p</sup>1<sup>K</sup> <sup>+</sup> ··· <sup>+</sup> <sup>p</sup>*d*K} *for* <sup>L</sup>*. A set is* semi-linear *if it is the finite union of linear sets.*

N-semi-linear sets are precisely those definable in Presburger arithmetic (FO(Z, +, ≤)) [12]. However, we can also consider Z-semi-linear sets (corresponding to FO(Z, +) without order), as well as the real counterparts (over R and R<sup>+</sup>). Note that even when K = N we still allow p*i* ∈ Z<sup>d</sup>.
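The difference between the choices of K in Definition 2 is easy to see in one dimension. The sketch below tests membership in a one-period linear set {b + pK}: for K = Z this is a divisibility test, while for K = N it additionally requires a non-negative coefficient:

```python
# Membership in a one-period K-linear set {b + p*K} in one dimension,
# for K = Z versus K = N (cf. Definition 2).

def in_Z_linear(x, b, p):
    # Some a in Z with x = b + a*p  <=>  p divides x - b.
    return (x - b) % p == 0

def in_N_linear(x, b, p):
    # Additionally requires the coefficient a = (x - b) / p to be >= 0.
    return (x - b) % p == 0 and (x - b) // p >= 0
```

For instance, −2 lies in {1 + 3Z} but not in {1 + 3N}, which is why Z-linear and N-linear invariants have different expressive power.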

**Definition 3.** *Given an integer linear dynamical system* (x(0), {M1,...,M*k*})*, a set* I *is an* inductive invariant *if*

*–* <sup>x</sup>(0) <sup>∈</sup> <sup>I</sup>*, and –* {M*i*x | x ∈ I} ⊆ I *for all* i ∈ {1,...,k}*.*

Note in particular that every inductive invariant contains the reachability set (O ⊆ I). We are interested in the following problem:

**Definition 4 (Invariant Synthesis Problem).** *Given an invariant domain* <sup>D</sup>*, an integer linear dynamical system* (x(0), {M1,...,M*k*})*, and a target* <sup>Y</sup> *, does there exist an inductive invariant* I *in* D *disjoint from* Y *?*

In our setting, we are interested in classes D of invariants that are linear, or semi-linear. When a separating inductive invariant I exists, we also wish to compute it. Since (semi)-linear invariants are enumerable, the decision problem is, in theory, sufficient—although all of our proofs are constructive.

# **3** R **Invariants:** R**-linear and** R**-semi-linear**

Before delving into porous invariants, let us consider invariants over the real numbers, i.e., described as R-(semi)-linear sets.

Strongest R-linear invariants are given precisely by the affine hull of the reachability set, and can be computed using Karr's algorithm [17]. Moreover, we will show that strongest R-semi-linear invariants also exist and can be computed by combining techniques for algebraic invariants [15] and R-linear invariants.

<sup>R</sup>*-linear.* Recall that a set <sup>L</sup> is <sup>R</sup>-linear if <sup>L</sup> <sup>=</sup> {v<sup>0</sup> <sup>+</sup> <sup>v</sup>1<sup>R</sup> <sup>+</sup> ··· <sup>+</sup> <sup>v</sup>*t*R} for some <sup>v</sup>0,...,v*<sup>t</sup>* <sup>∈</sup> <sup>Z</sup>*<sup>d</sup>* that can be assumed to be linearly-independent<sup>6</sup> without loss of generality (and thus t ≤ d). Given two distinct points of L, every point on the infinite line connecting them must also be in L. Generalising this idea to higher dimensions, given a set <sup>S</sup> <sup>⊆</sup> <sup>R</sup>*d*, let the affine hull be

$$\overline{S}^a = \left\{ \sum\_{i=1}^k \lambda\_i x\_i \mid k \in \mathbb{N}, x\_i \in S, \lambda\_i \in \mathbb{R}, \sum\_{i=1}^k \lambda\_i = 1 \right\}.$$

Fix an LDS (x<sup>(0)</sup>, {M1,...,M*k*}) and consider its reachability set $O = \{M\_{i\_m} \cdots M\_{i\_1} x^{(0)} \mid m \in \mathbb{N},\; i\_1,\dots,i\_m \in \{1,\dots,k\}\}$. Then $\overline{O}^a$ is precisely the strongest R-linear invariant. Karr's algorithm [17,26] can be used to compute this strongest invariant in polynomial time. The next lemma follows from Theorem 3.1 of [26].

**Lemma 1.** *Given an LDS* (x(0), {M1,...,M*k*}) *of dimension* <sup>d</sup>*, we can compute in time polynomial in* d*,* k*, and* log μ *(where* μ > 0 *is an upper bound on the absolute values of the integers appearing in* x(0) *and* M1,...,M*k), a* Q*-affinely independent set of integer vectors* R<sup>0</sup> ⊆ O *such that:*


Let $R\_0 = \{x^{(0)}, r\_1, \dots, r\_{d'}\}$ be obtained as per Lemma 1, with $d' \le d$. The R-linear invariant of the LDS is the affine span $\overline{R\_0}^a$, which can be written as the R-linear set $L\_0 = \{x^{(0)} + (r\_1 - x^{(0)})\mathbb{R} + \cdots + (r\_{d'} - x^{(0)})\mathbb{R}\}$.

R*-semi-linear.* Let us now generalise this approach to R-semi-linear sets. The collection of R-semi-linear sets, $\{\bigcup\_{i=1}^{m} L\_i \mid m \in \mathbb{N},\ L\_1, \dots, L\_m \text{ are } \mathbb{R}\text{-linear sets}\}$, is closed under finite unions and arbitrary intersections<sup>7</sup>. Thus for any given set X, the smallest R-semi-linear set containing X is simply the intersection of all R-semi-linear sets containing X. Let us denote by $\overline{X}^{\mathbb{R}}$ this smallest R-semi-linear set. We are interested in $\overline{O}^{\mathbb{R}}$.

**Theorem 1.** *The strongest* R*-semi-linear invariant* $\overline{O}^{\mathbb{R}}$ *of* O *is computable.*

Algebraic sets are those that are definable by finite unions and intersections of zeros of polynomials. For example, {(x, y) | xy = 0} describes the lines x = 0 and y = 0. The (real) Zariski closure X*<sup>z</sup>* of a set X is the smallest algebraic subset of R*<sup>d</sup>* containing the set X. The Zariski closure of the set of reachable points, <sup>O</sup>*<sup>z</sup>* , can be computed algorithmically [15].

<sup>6</sup> v<sub>0</sub>,...,v<sub>m</sub> are linearly independent if there do not exist a<sub>0</sub>,...,a<sub>m</sub> ∈ R, not all 0, such that a<sub>0</sub>v<sub>0</sub> + ··· + a<sub>m</sub>v<sub>m</sub> = 0.

<sup>7</sup> When intersecting a linear set with a semi-linear set, either the latter does not change, or one obtains a finite union of elements of smaller dimension. Thus, in an infinite intersection, only a finite number of intersections affects the original set.

An algebraic set A is *irreducible* if whenever A ⊆ B ∪ C, where B and C are algebraic sets, then we have A ⊆ B or A ⊆ C. Any algebraic set (and in particular a Zariski closure) can be written effectively as a finite union of irreducible sets [3].

**Proposition 1.** *Let* $\overline{X}^z = A\_1 \cup \cdots \cup A\_k$*, with the* $A\_i$*'s irreducible. Then* $\overline{X}^{\mathbb{R}} = \overline{\overline{X}^z}^{\mathbb{R}} = \overline{A\_1}^{\mathbb{R}} \cup \cdots \cup \overline{A\_k}^{\mathbb{R}} = \overline{A\_1}^{a} \cup \cdots \cup \overline{A\_k}^{a}$*.*

*Proof.* Since $A\_i \subseteq \overline{X}^{\mathbb{R}} = \cup\_j L\_j$ and $A\_i$ is irreducible, we have $A\_i \subseteq L\_j$ for some $j$ (as the $L\_j$'s are algebraic sets). Since $L\_j$ is $\mathbb{R}$-linear and $\overline{A\_i}^{a}$ is the smallest $\mathbb{R}$-linear set covering $A\_i$, we have $\overline{A\_i}^{a} \subseteq L\_j$. Taking $\overline{X}^{\mathbb{R}} = \overline{A\_1}^{a} \cup \cdots \cup \overline{A\_k}^{a}$ is thus optimal.

Thus $\overline{O}^{\mathbb{R}}$ can be obtained by computing $\overline{A\_i}^{a}$ for each irreducible $A\_i$, where $\overline{O}^z = A\_1 \cup \cdots \cup A\_k$. To complete the proof of Theorem 1 it remains to confirm that affine hulls of algebraic sets can be computed algorithmically. Let us fix an algebraic set A, and let W denote a set variable. Proceed as follows. Start with W ← {x} for some point x ∈ A, and repeatedly let $W \leftarrow \overline{W \cup \{y\}}^{a}$, where y ∈ A \ W. Such a point y can always be found using quantifier elimination in the theory of the reals. Each step necessarily increases the dimension, which can occur at most d times, ensuring termination, at which point one has $\overline{A}^{a} = W$.
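The dimension argument behind this saturation loop can be sketched over a finite sample of points: the affine dimension of a point set is the rank of its difference vectors, computed below with exact rational Gaussian elimination. This is illustrative only; the actual procedure finds new points y via quantifier elimination over the reals rather than from a given sample.

```python
# Sketch: affine dimension of a finite point sample, via the rank of
# difference vectors, using exact arithmetic (fractions) to avoid
# floating-point issues. Illustrates why the saturation loop above
# terminates after at most d dimension-increasing steps.

from fractions import Fraction

def rank(vectors):
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        # Find a pivot row for this column below the current rank.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def affine_dim(points):
    # Dimension of the affine hull = rank of {p - p0 : p in points}.
    base = points[0]
    diffs = [[a - b for a, b in zip(p, base)] for p in points[1:]]
    return rank(diffs)
```

Since each added point strictly increases this rank, and the rank is bounded by the ambient dimension d, the loop performs at most d iterations.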

# **4 Strongest** Z**-linear Invariants**

Recall that a Z-linear set {q + p_1Z + ··· + p_nZ} is defined by a base vector q ∈ Z^d and period vectors p_1, ..., p_n ∈ Z^d. Equivalently, a Z-linear set describes a *lattice* {p_1Z + ··· + p_nZ} in d-dimensional space, translated so as to start from q rather than **0**.

**Theorem 2.** *Given a d-dimensional dynamical system (x(0), {M_1, ..., M_k}), the strongest Z-linear inductive invariant containing the reachability set O exists and can be computed algorithmically.*

The image of a Z-linear set L = {q + p_1Z + ··· + p_nZ} under a matrix M is the Z-linear set M(L) = {Mq + (Mp_1)Z + ··· + (Mp_n)Z}. The following proposition asserts that when two points are in a Z-linear set, the direction between these two points can be applied from any point of the set, and hence this direction can be included as a period without altering the set.

**Proposition 2.** *Let L = {q + a_1p_1 + ··· + a_np_n | a_1, ..., a_n ∈ Z} be a Z-linear set. If x, y ∈ L then for all z ∈ L and all a ∈ Z we have z + (y − x)a ∈ L. In particular, we have L = {q + a_1p_1 + ··· + a_np_n + a(y − x) | a_1, ..., a_n, a ∈ Z}.*

*Proof.* If x = q + a_1p_1 + ··· + a_np_n and y = q + b_1p_1 + ··· + b_np_n then y − x = q + b_1p_1 + ··· + b_np_n − (q + a_1p_1 + ··· + a_np_n) = (b_1 − a_1)p_1 + ··· + (b_n − a_n)p_n.

Then for any z = q + c_1p_1 + ··· + c_np_n, we have z + a(y − x) = q + c_1p_1 + ··· + c_np_n + a((b_1 − a_1)p_1 + ··· + (b_n − a_n)p_n) = q + (c_1 + a(b_1 − a_1))p_1 + ··· + (c_n + a(b_n − a_n))p_n, where each (c_i + a(b_i − a_i)) ∈ Z, so z + a(y − x) ∈ L. □

**Proposition 3.** *Given two Z-linear sets L_1 = {q + p_1Z + ··· + p_nZ} and L_2 = {s + t_1Z + ··· + t_mZ}, there exists a smallest Z-linear set L containing L_1 ∪ L_2: the set L = {q + (s − q)Z + p_1Z + ··· + p_nZ + t_1Z + ··· + t_mZ}.*

*Proof.* First we show L_1 ∪ L_2 ⊆ L. Any point q + a_1p_1 + ··· + a_np_n of L_1 lies in L (take coefficient 0 for (s − q) and for each t_j); any point s + b_1t_1 + ··· + b_mt_m of L_2 equals q + 1·(s − q) + b_1t_1 + ··· + b_mt_m, and so also lies in L.

Next we show minimality as a straightforward consequence of Proposition 2. The vectors p_1, ..., p_n can be added by Proposition 2, because any two points of L_1 differing by p_i guarantee that adding p_i does not alter the resulting set. Similarly, t_1, ..., t_m can also be included. Finally, by Proposition 2, the vector s − q can be included because q and s both belong to L_1 ∪ L_2. □
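Proposition 3 can be checked concretely on a tiny example. The function names below (`covering`, `member`) are ours, and membership is decided by brute-force coefficient search, which suffices only for small instances.

```python
# Concrete check of Proposition 3: the covering of L1 ∪ L2 keeps the base q
# of L1 and adds the direction s - q plus all periods of both sets.
from itertools import product

def covering(q, periods1, s, periods2):
    diff = tuple(a - b for a, b in zip(s, q))
    return q, [diff] + periods1 + periods2

def member(v, q, periods, bound=6):
    """Brute-force membership in {q + sum a_i p_i | a_i in Z}, searching
    coefficients in [-bound, bound] (enough for tiny examples)."""
    target = tuple(a - b for a, b in zip(v, q))
    for coeffs in product(range(-bound, bound + 1), repeat=len(periods)):
        pt = tuple(sum(c * p[i] for c, p in zip(coeffs, periods))
                   for i in range(len(q)))
        if pt == target:
            return True
    return False

# L1 = {(0,0) + (2,0)Z}, L2 = {(1,1) + (0,2)Z}
q, periods = covering((0, 0), [(2, 0)], (1, 1), [(0, 2)])
assert member((4, 0), q, periods)      # a point of L1
assert member((1, 7), q, periods)      # a point of L2
assert not member((0, 1), q, periods)  # parity obstruction: a+2b=0 needs a even, a+2c=1 needs a odd
```

The failing point (0, 1) shows that the covering is still a proper (porous) subset of Z^2, not the whole plane.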

A d-dimensional lattice can always be defined by at most d vectors; thus if d is the dimension of the matrices, no more than d period vectors are needed in total. However, Proposition 3 induces a representation that may over-specify the lattice, producing more than d vectors to define it.

*Example 2.* Consider the lattice {(2, 2)<sup>Z</sup> + (0, 6)<sup>Z</sup> + (2, 6)Z}, specified with three vectors, which is equivalent to the lattice {(2, 0)<sup>Z</sup> + (0, 2)Z}. Note that one may not simply pick an independent subset of the periods, as none of the following sets are equal: {(2, 2)<sup>Z</sup> + (0, 6)Z}, {(2, 2)<sup>Z</sup> + (2, 6)Z}, {(0, 6)<sup>Z</sup> + (2, 6)Z}, and {(2, 2)<sup>Z</sup> + (0, 6)<sup>Z</sup> + (2, 6)Z}.

The *Hermite normal form* can be used to obtain a basis of the vectors that define the lattice. Consider a lattice L = {p_1Z + ··· + p_dZ}. The lattice remains the same if p_i is swapped with p_j, if p_i is replaced by −p_i, or if p_i is replaced by p_i + αp_j where α is any fixed integer⁸.

These are the unimodular operations. The Hermite normal form of a matrix M is a matrix H such that M = UH, where U is a unimodular matrix (formed by unimodular column operations) and H is lower triangular, non-negative and each row has a unique maximum entry which is on the main diagonal. Such a form always exists, and the columns of H form a basis of the same lattice as the columns of M, because they differ up to unimodular (lattice-preserving) operations. There are many texts on the subject; we refer the reader to the lecture notes of Shmonin [25] for more detailed explanations.

The columns of a matrix in Hermite normal form constitute a unique basis for the lattice (up to additional redundant zero columns). Hence a basis of minimal dimension can be obtained by computing the Hermite normal form of the matrix formed by placing the period vectors into columns.
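The basis extraction can be illustrated with a minimal integer implementation of the column-style Hermite normal form. This is our own sketch using only the unimodular operations listed above (swap, negate, add integer multiples); for serious use one would call a library routine rather than this function.

```python
# Minimal Hermite-normal-form sketch via unimodular column operations.
# Input: a list of integer period vectors (columns); output: a
# lower-triangular canonical basis of the same lattice.
def hnf(cols):
    cols, d, k = [list(c) for c in cols], len(cols[0]), 0
    for i in range(d):
        while True:
            nz = [j for j in range(k, len(cols)) if cols[j][i] != 0]
            if not nz:
                break
            j0 = min(nz, key=lambda j: abs(cols[j][i]))  # smallest candidate pivot
            cols[k], cols[j0] = cols[j0], cols[k]
            if all(cols[j][i] == 0 for j in range(k + 1, len(cols))):
                break
            for j in range(k + 1, len(cols)):            # Euclidean reduction step
                q = cols[j][i] // cols[k][i]
                cols[j] = [x - q * y for x, y in zip(cols[j], cols[k])]
        if k < len(cols) and cols[k][i] != 0:
            if cols[k][i] < 0:                           # make the pivot positive
                cols[k] = [-x for x in cols[k]]
            for j in range(k):                           # reduce row i left of the pivot
                q = cols[j][i] // cols[k][i]
                cols[j] = [x - q * y for x, y in zip(cols[j], cols[k])]
            k += 1
    return cols[:k]

# Example 2: the three redundant periods collapse to the basis {(2,0), (0,2)}
print(hnf([[2, 2], [0, 6], [2, 6]]))  # [[2, 0], [0, 2]]
```

On the lattice of Example 2 this recovers exactly the two-vector basis, discarding the redundant third period.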

⁸ The last replacement is valid since, if x = y + βp_i ∈ L, then x = y + β(p_i + αp_j) − βαp_j is in the new lattice.

We now prove the main theorem:

*Proof (Proof of* Theorem 2*).* We claim that Algorithm 1 returns the strongest Z-linear invariant I.

Algorithm 1 proceeds in two phases: the first phase computes an initial Z-linear set L_0 from finitely many reachable points, and the second phase repeatedly enlarges this set until it becomes inductive.


Recall the set R_0 = {x(0), r_1, ..., r_{d′}} ⊆ O, with d′ ≤ d, from Lemma 1. The resulting Z-linear set L_0 = {x(0) + (r_1 − x(0))Z + ··· + (r_{d′} − x(0))Z} is then a d′-dimensional porous subset of the d′-dimensional affine hull of the orbit (L_0 ⊆ O^a). Applying M_1, ..., M_k can only increase the density, but not the dimension. As each r_i and x(0) are in O, by Proposition 2 each of the directions (r_i − x(0)) must be represented in any Z-linear set containing O, and we therefore have that L_0 ⊆ I.

In the second phase, we 'fill in' the lattice as required to cover the whole of O. To do this we repeatedly apply the covering procedure of Proposition 3. That is, L_{i+1} is the smallest Z-linear set covering L_i ∪ M_1(L_i) ∪ ··· ∪ M_k(L_i). To keep the number of vectors small, we keep the period vectors of the Z-linear set in Hermite normal form.

The vectors p_1 = (r_1 − x(0)), ..., p_{d′} = (r_{d′} − x(0)) form a parallelepiped (hyper-parallelogram) that repeats regularly. There are finitely many integral points inside this parallelepiped. If new points are added at some step, they are added to every parallelepiped. Thus new points can be added only finitely many times before the process saturates or becomes fixed. The volume of the parallelepiped is bounded above by |p_1| ··· |p_{d′}|.

At each step, the volume of the parallelepiped must at least halve; thus the volume at step t is vol_t ≤ |p_1| ··· |p_{d′}|/2^t. The procedure must saturate at or before the volume reaches 1, which occurs after at most log(|p_1| ··· |p_{d′}|) = Σ_i log(|p_i|) steps. At each step, for efficiency, we convert the Z-linear set into Hermite normal form so as to retain exactly d′ period vectors.

*Claim (I is the strongest invariant).* For every invariant J, we have I ⊆ J.

By induction, let us prove that every invariant J must contain L_i. Clearly this is the case for L_0, because all points of R_0 ⊆ O must be in J, and every period vector of L_0 can be assumed present, without loss of generality, thanks to Proposition 2. Assume L_i ⊆ J. Then J must contain every M_j(L_i), as otherwise it would not be an invariant. It therefore follows that J must contain L_{i+1}, since the latter is the minimal Z-linear set containing L_i and M_j(L_i) for all j ≤ k. Finally, since I is itself one of the L_i, we have I ⊆ J as required. □

*Remark 1.* Note that a Z-linear set is not sufficient for the MU Puzzle: both 1 and 2 are in the reachability set, thus {1 + 1Z} = Z is the strongest Z-linear invariant.

**Algorithm 1:** Strongest Z-linear invariant for LDS (x(0), M1,...,M*k*)

**Input**: x(0), M_1, ..., M_k
1. Compute R_0 = {x(0), r_1, ..., r_{d′}} ⊆ O
2. Compute p_i = r_i − x(0) for i ∈ {1, ..., d′}
3. Let L_0 = {x(0) + p_1Z + ··· + p_{d′}Z}
4. **while** True **do**
   1. L_i = Covering(L_{i−1} ∪ M_1(L_{i−1}) ∪ ··· ∪ M_k(L_{i−1}))
   2. H_i = HermiteNormalForm(L_i)
   3. L_i = {x(0) + h_1Z + ··· + h_{d′}Z}, where h_1, ..., h_{d′} are the columns of H_i
   4. **if** L_i = L_{i−1} **then return** L_i
5. **end**
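In one dimension Algorithm 1 collapses to gcd arithmetic, since a lattice in Z generated by a set of integers is generated by their gcd, which then plays the role of the Hermite normal form. The sketch below is ours (names included); the seed points stand in for the set R_0 of Lemma 1, which we assume is given.

```python
# One-dimensional sketch of Algorithm 1: the period lattice is a single
# integer d, updated by the covering step until a fixed point is reached.
from math import gcd
from functools import reduce

def strongest_z_linear_invariant_1d(x0, fs, seed):
    """fs: affine maps x -> a*x + b given as (a, b) pairs.
    Returns (x0, d) representing the invariant {x0 + dZ}."""
    d = reduce(gcd, (r - x0 for r in seed), 0)   # initial lattice from seed points
    while True:
        new = [d]
        for a, b in fs:
            new.append(a * x0 + b - x0)  # direction from base to image base
            new.append(a * d)            # image of the period
        d2 = reduce(gcd, new, 0)         # gcd = 1-d Hermite normal form
        if d2 == d:
            return x0, d
        d = d2

# x -> 3x from 1: orbit {1, 3, 9, ...}; strongest Z-linear invariant is {1 + 2Z}
print(strongest_z_linear_invariant_1d(1, [(3, 0)], [1, 3]))  # (1, 2)
```

The returned invariant {1 + 2Z} (the odd integers) strictly over-approximates the orbit {3^k}, as expected: the orbit itself is not Z-linear.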

### **4.1 Extensions of** Z**-linear Sets Without Strongest Invariants**

In this section we show that several generalisations of Z-linear domains fail to admit strongest invariants.

Z-semi-linear sets are unions of Z-linear sets, and therefore can include singletons. Consider the deterministic dynamical system starting from point 1 and doubling at each step, M = (1, (x → 2x)). This system has reachability set O = {2^k | k ∈ N}, which is not even N-semi-linear (our most general class). For this LDS we can construct the invariant {1, 2, 4, ..., 2^k} ∪ {2^{k+1}p | p ∈ Z} for each k. For any proposed strongest Z-semi-linear invariant, one can find a k for which the corresponding invariant is an improvement.

N-linear sets generalise Z-linear sets (observe that Z-linear sets form a proper subclass, since {x + p_iZ} can be expressed as {x + (−p_i)N + p_iN}, but {x + p_iN} is clearly not in general Z-linear). Consider the LDS with initial point (x_1, x_2) and the coordinate-swapping matrix (0 1; 1 0), whose reachability set consists of just two points, x = (x_1, x_2) and y = (x_2, x_1). There are two incomparable candidates for a minimal N-linear invariant: {x + (y − x)N} and {y + (x − y)N}. Similarly for R_+-linear invariants, the sets {y + (x − y)R_+} and {x + (y − x)R_+} are incomparable half-lines.

### **4.2** Z**-linear Targets**

We have so far only considered invariants for point targets. We now turn to lattice-like targets, in particular targets specified as *full-dimensional* Z-linear sets.

**Theorem 3.** *It is decidable whether a given LDS (x(0), {M_1, ..., M_k}) reaches a full-dimensional Z-linear target Y = {x + p_1Z + ··· + p_dZ}, with x, p_i ∈ Z^d.*

*Furthermore, for unreachable instances, a* Z*-semi-linear inductive invariant can be provided.*

Theorem 3 requires the targets to be *full-dimensional*. For nondeterministic systems, reachability is undecidable for non-full-dimensional targets (in particular point targets) [22]. However, even for deterministic systems, when Z-linear targets fail to be *full-dimensional* the reachability problem becomes as hard as the Skolem Problem (see, e.g., [24]), for example by choosing as target the set {(0, x_2, ..., x_d) | x_2, ..., x_d ∈ Z} = {**0** + e_2Z + ··· + e_dZ}, where e_i ∈ {0, 1}^d is the standard basis vector with (e_i)_i = 1 and (e_i)_j = 0 for i ≠ j.

Towards proving Theorem 3, we first show that *full-dimensional* linear sets can be expressed as 'square' hybrid-linear sets. Hybrid-linear sets are semi-linear sets in which all the components share the same period vectors, and thus differ only in starting position (whereas semi-linear sets allow each component to have distinct period vectors). By square, we mean that all period vectors are the same multiple of standard basis vectors.

**Lemma 2.** *Let Y = {x + p_1Z + ··· + p_dZ} be a full-dimensional Z-linear set. Then there exist m ∈ N and a finite set B ⊆ [0, m − 1]^d such that Y = ⋃_{b∈B} {b + me_1Z + ··· + me_dZ}.*

*Proof.* Suppose p_1, ..., p_d span a d-dimensional vector space. Let P be the matrix with rows p_1, ..., p_d. Since P has full row rank it is invertible, hence there exists a rational matrix P^{−1} such that e_i = P^{−1}_{i,1}p_1 + ··· + P^{−1}_{i,d}p_d. In particular, let m_i be such that P^{−1}_{i,j}m_i is integral for all j. Then m_ie_i is an integral combination of p_1, ..., p_d, and hence an admissible direction in Y.

Let m = lcm{m_1, ..., m_d}. Then each me_i is an admissible direction in Y. Hence by Proposition 2, Y is equivalent to {x + p_1Z + ··· + p_dZ + me_1Z + ··· + me_dZ}. By the presence of me_1Z + ··· + me_dZ, a point z ∈ Z^d belongs to Y if and only if z′ ∈ Y, where z′_i = (z_i mod m).

Therefore Y can be written as ⋃_{b∈B} {b + me_1Z + ··· + me_dZ}, where B = [0, m − 1]^d ∩ Y. □
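The modulus m of Lemma 2 can be computed exactly as the proof suggests: invert P over the rationals and clear denominators. The sketch below is ours, restricted to d = 2 for brevity; `modulus_m` is not a name from the paper.

```python
# Computing m in Lemma 2 for a 2x2 period matrix P (rows p1, p2):
# invert P over the rationals and take the lcm of all denominators in P^{-1}.
from fractions import Fraction
from math import lcm

def modulus_m(p1, p2):
    (a, b), (c, d) = p1, p2                 # rows of P
    det = a * d - b * c
    assert det != 0, "periods must be full-dimensional"
    inv = [[Fraction(d, det), Fraction(-b, det)],
           [Fraction(-c, det), Fraction(a, det)]]   # P^{-1}
    return lcm(*(x.denominator for row in inv for x in row))

# lattice generated by (1,2) and (0,3): m = 3, so 3e1 and 3e2 are admissible
# directions, e.g. 3e1 = (3,0) = 3*(1,2) - 2*(0,3)
print(modulus_m((1, 2), (0, 3)))  # 3
```

Taking the lcm over all denominators of P^{−1} yields the lcm of the per-row values m_i in the proof, so the resulting m makes every me_i an admissible direction.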

We now prove Theorem 3.

*Proof (Proof of Theorem 3).* Choose m and B as in Lemma 2, so that Y is of the form ⋃_{b∈B} {b + me_1Z + ··· + me_dZ}. We build an invariant I of the form ⋃_{b∈B′} {b + me_1Z + ··· + me_dZ} for some B′ ⊆ [0, m − 1]^d.

We initialise the set I_0 = {x̄ + me_1Z + ··· + me_dZ}, where x̄ ∈ [0, m − 1]^d is such that x̄_j = (x(0)_j mod m). We then build the set I_1 by adding to I_0 the sets {y + me_1Z + ··· + me_dZ}, where for each choice of M_i, the vector y ∈ [0, m − 1]^d is formed by y_j = ((M_ix)_j mod m) for some x ∈ I_0. We iterate this construction until it stabilises in an inductive invariant I. Termination follows from the finiteness of [0, m − 1]^d (noting in particular that if termination occurs with B′ = [0, m − 1]^d, then I = Z^d, which is indeed an inductive invariant).

If there exists y ∈ B ∩ I then return Reachable. Indeed, a sequence of matrices whose application to x(0) produces, after the modulo step, the point y ∈ I must reach a point inside the set {y + me_1Z + ··· + me_dZ}, which is part of the target.

Otherwise, return Unreachable and I as invariant. By construction, I is indeed an inductive invariant disjoint from the target set. 
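The whole decision procedure reduces to a finite fixed-point computation over residue vectors modulo m. The following sketch is ours (`reach_mod` is not the paper's name); the example system and target residues are chosen for illustration.

```python
# Sketch of the Theorem 3 procedure: saturate the residues of the orbit
# modulo m, then intersect with the target residues B.
def reach_mod(x0, mats, m):
    """Set of residue vectors (mod m) reachable from x0 under the matrices."""
    def step(v, M):
        return tuple(sum(M[i][j] * v[j] for j in range(len(v))) % m
                     for i in range(len(M)))
    seen, frontier = set(), [tuple(x % m for x in x0)]
    while frontier:
        v = frontier.pop()
        if v in seen:
            continue
        seen.add(v)
        frontier.extend(step(v, M) for M in mats)
    return seen

mats = [[[0, 1], [1, 0]]]        # coordinate-swap system
B = {(0, 0)}                     # target residues mod m = 2
I = reach_mod((1, 0), mats, 2)   # reachable residues: {(1,0), (0,1)}
print(B & I)                     # empty intersection: Unreachable
```

Here B ∩ I = ∅, so the procedure returns Unreachable, and the residue classes in I induce the separating Z-semi-linear invariant of the theorem.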

*Remark 2.* By the same argument, Theorem 3 extends to a restricted class of Z-semi-linear targets: the finite union of *full-dimensional* Z-linear sets.

# **5** N**-Semi-linear Invariants**

We now consider N-semi-linear invariants, our most general class. N-semi-linear invariants gain expressivity thanks to the 'directions' provided by the period vectors. For example, the only possible Z-semi-linear invariant for the LDS (0, (x → x + 1)) is Z, yet the reachability set, N, is captured exactly by an N-linear invariant. We show that a separating N-semi-linear invariant can *always* be found for unreachable instances of deterministic integer LDS, although the computed invariant will depend on the target. However, finding invariants is undecidable for nondeterministic systems, at least in high dimension. Nevertheless, we show decidability for the low-dimensional setting of the MU Puzzle: one dimension with affine updates.

# **5.1 Existence of Sufficient (but Non-minimal)** N**-semi-linear Invariants for Point Reachability in Deterministic LDS**

Kannan and Lipton showed decidability of reachability of a point target for deterministic LDS [16]. In this subsection, we establish the following result, which provides a separating invariant for unreachable instances.

**Theorem 4.** *Given a deterministic LDS* (x(0), M) *together with a point target* y*, if the target is unreachable then a separating* N*-semi-linear inductive invariant can be provided.*

To do so, we will invoke the results from [8] to compute an R+-semi-linear inductive invariant, and then extract from it an N-semi-linear inductive invariant. More precisely, the authors of [8] show how to build polytopic inductive invariants for certain deterministic LDS. Such polytopes are either bounded or are R+-semi-linear sets. In the first case, the polytope contains only finitely many integral points, which can directly be represented via an N-semi-linear set. In the second case, we build an N-semi-linear set containing exactly the set of integral points included in the R+-semi-linear invariant, thanks to the following lemma.

**Lemma 3.** *Given an R_+-linear set S = {x + p_1R_+ + ··· + p_nR_+}, where the vectors p_i have rational coefficients and x is an integer vector, one can build an N-semi-linear set N comprising precisely the integral points of S.*

*Proof (Proof of Theorem 4).* We note that every invariant produced in [8] has rational period vectors, as the vectors are given by differences of successive points in the orbit of the system, and thus Lemma 3 can be applied. The authors of [8] build an inductive invariant in all cases except those for which every eigenvalue of the matrix governing the evolution of the LDS is either 0 or of modulus 1, and at least one of the latter is not a root of unity. This situation, however, cannot occur in our setting. Indeed, the eigenvalues of an integer matrix are algebraic integers, and an old result of Kronecker [19] asserts that unless all of the eigenvalues are roots of unity, one of them must have modulus strictly greater than 1 (the case in which *all* eigenvalues are 0 being of course trivial).

This concludes the proof of Theorem 4. 

# **5.2 Undecidability of** N**-semi-linear Invariants for Nondeterministic LDS**

If the enhanced expressivity of N-semi-linear sets allows us always to find an invariant for deterministic LDS, it contributes in turn to making the invariant-synthesis problem undecidable when the LDS is not deterministic. We establish this through a reduction from the infinite Post correspondence problem (ω-PCP), which can be defined as follows: given m pairs of non-empty words {(u_1, v_1), ..., (u_m, v_m)} over the alphabet {0, 2}, does there exist an infinite word w = w_1w_2... over the alphabet {1, ..., m} such that u_{w_1}u_{w_2}... = v_{w_1}v_{w_2}...? This problem is known to be undecidable when m is at least 8 [6,13].

**Theorem 5.** *The invariant synthesis problem for* N*-semi-linear sets and linear dynamical systems with at least two matrices of size* 91 *is undecidable.*

*Proof (Sketch).* We first establish the result in the case of several matrices in low dimension; this can then be transformed in a standard way to two larger matrices (of size 91).

The proof is by reduction from the infinite Post correspondence problem. Given an instance of this problem the pair of words corresponding to each sequence of tiles has an integer representation, using base-4 encoding. An important property of our encoding is that the operation of appending a new tile to an existing pair of words can be encoded by matrix multiplication.

Recall that if the instance of ω-PCP is negative, then every generated pair of words will differ at some point. Our encoding is such that this difference of letters creates a difference in their numerical encodings that can be identified with an N-semi-linear invariant. On the other hand, when there is a positive answer to the <sup>ω</sup>-PCP instance, there can be no <sup>N</sup>-semi-linear invariant. 

#### **5.3 Nondeterministic One-Dimensional Affine Updates**

The previous section shows that point reachability for nondeterministic LDS is undecidable once there are sufficiently many dimensions, motivating an analysis in lower dimensions. The MU Puzzle requires a single dimension with affine updates (or equivalently two dimensions in matrix representation, with the coordinate along the second dimension kept constant). We consider this one-dimensional affine-update case, and therefore, rather than taking matrices as input, we work directly with affine functions of the form f_i(x) = a_ix + b_i.

**Theorem 6.** *Given x(0), y ∈ Z, along with a finite set of functions {f_1, ..., f_k} where f_i(x) = a_ix + b_i with a_i, b_i ∈ Z for 1 ≤ i ≤ k, it is decidable whether y is reachable from x(0).*

*Moreover, when* y *is unreachable, an* N*-semi-linear separating inductive invariant can be algorithmically computed.*

We note that decidability of reachability is already known [9,10]. We refine this result by exhibiting an invariant which can be used to disprove reachability. In fact our procedure produces an N-semi-linear set which can be used to decide reachability and which, in instances of non-reachability, is a separating inductive invariant. We have implemented this algorithm in our tool porous, enabling us to efficiently tackle the MU Puzzle as well as its generalisation to arbitrary collections of one-dimensional affine functions. We report on our experiments in Sect. 6.

We build a case distinction depending on the type of functions that appear:

**Definition 5.** *A function* f(x) = ax + b*...*


# **Simplifying Assumptions**

**Lemma 4.** *Without loss of generality, we may assume there are no redundant functions; more precisely, we can reduce the computation of an invariant for a system having redundant functions to finitely many invariant computations for systems devoid of such functions.*

*Proof.* Clearly the identity function has no impact on the reachability set, and so can be removed outright. For any other redundant function, its impact on the reachability set does not depend on when the function is used, and we may therefore assume that it was used in the first step, or equivalently, using an alternative starting point. Hence the invariant-computation problem can be reduced to finitely many instances of the problem over different starting points, with redundant functions removed. Finally, taking the union of the resulting invariants yields an invariant for the original system. 

**Lemma 5.** *Without loss of generality, x(0) ≥ 0.*

*Proof.* We construct a new system in which each transition f(x) = ax + b is replaced by f′(x) = ax − b. Then x(0) reaches y in the original system if and only if −x(0) reaches −y in the new system. To see this, observe that if f(x) = ax + b, then f′(−x) = −ax − b = −f(x). □

**Lemma 6.** *Suppose there are at least two distinct pure inverting functions (and possibly other types of functions). Then without loss of generality there are two opposing counters.*

*Proof.* Consider f(x) = −x + b and g(x) = −x + c. Then f(g(x)) = −(−x + c) + b = x + b − c and g(f(x)) = −(−x + b) + c = x + c − b. Since b − c = −(c − b) and b ≠ c (as f ≠ g), these two functions are opposing counters. □

**Two Opposing Counters.** Let us first observe that when there are two opposing counters, we essentially move in either direction by some fixed amount. This will entail that only Z-(semi)-linear invariants can be produced, rather than proper N-(semi)-linear invariants.

**Lemma 7.** *Suppose there are two opposing counters, f(x) = x + b and g(x) = x − c. Then for any reachable x we have {x + dZ} ⊆ I for d = gcd(b, c).*

Therefore, starting with {x(0) + dZ} ⊆ I, we can 'saturate' the invariant under construction using the following lemma:

**Lemma 8.** *Let h(x) = x + d be chosen as a reference counter amongst the counters. If {x + dZ} ⊆ I, then {f(x) + dZ} ⊆ I for every function f.*

*Proof (Proof of Lemma 8).* Consider the function f(x) = ax + b. If x = y + dk with k ∈ Z, then f(x) = ax + b = ay + adk + b = f(y) + adk.

Now, thanks to the presence of the counter h(x) = x + d, by choosing the initial k ∈ Z appropriately and applying h sufficiently many times (say m ∈ N times), one can reach f(y) + adk + dm = f(y) + dn for any desired n ∈ Z. □

Without loss of generality, if {x + dZ} is in the invariant, then 0 ≤ x < d. We then repeatedly use Lemma 8 to find the required elements of the invariant. Since there are only finitely many residue classes (modulo d), every reachable residue class c_1, ..., c_n can be found by saturation (in at most d steps), yielding the invariant {c_1 + dZ} ∪ ··· ∪ {c_n + dZ}.
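The residue-class saturation can be sketched directly. The function name and the example system below are ours; we assume two opposing counters have already fixed d = gcd(b, c), as in Lemma 7.

```python
# Residue-class saturation (two-opposing-counters case): compute the set of
# residues mod d reachable under the affine maps, as in Lemma 8.
def saturate_residues(x0, fs, d):
    """Reachable residue classes mod d under maps x -> a*x + b, (a, b) in fs."""
    seen, frontier = set(), [x0 % d]
    while frontier:
        r = frontier.pop()
        if r in seen:
            continue
        seen.add(r)
        frontier.extend((a * r + b) % d for a, b in fs)
    return seen

# opposing counters x+6 and x-9 give d = gcd(6, 9) = 3; add a doubling map
fs = [(1, 6), (1, -9), (2, 0)]
print(sorted(saturate_residues(1, fs, 3)))  # [1, 2]
```

From x(0) = 1 the reachable residues mod 3 are {1, 2}, so the resulting invariant is {1 + 3Z} ∪ {2 + 3Z}, which in particular separates the system from any target divisible by 3.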

Thanks to Lemma 6, in all remaining cases there is without loss of generality at most one pure inverter.

**Only Pure Inverters.** If there is exactly one pure inverter f(x) = −x + b (and no other types of functions), then f(x(0)) = −x(0) + b and f(−x(0) + b) = x(0) − b + b = x(0), thus the reachability set is finite, with exact invariant {x(0), −x(0) + b}.

**No Counters.** If we are not in the preceding case and there are no counters, then there must be growing functions and, by Lemma 6, without loss of generality at most one pure inverter. We show that all growing functions increase the modulus outside of some bounded region.

**Lemma 9.** *For every M ≥ 0 and every growing function f(x) = ax + b, |a| ≥ 2, there exists C_f^M ≥ 0 such that if |x| ≥ C_f^M then |f(x)| ≥ |x| + M.*

*Proof.* By the triangle inequality we have |f(x)| = |ax + b| ≥ |a||x| − |b|. Thus |x| ≥ (|b| + M)/(|a| − 1) implies |a||x| − |b| ≥ |x| + M, and hence |f(x)| ≥ |x| + M. □

This is the only situation in which the invariant is not exactly the reachability set, and requires us to take an overapproximation.

Let C = max{C^0_{f_1}, ..., C^0_{f_k}, |y| + 1}, for growing functions f_1, ..., f_k. If there are no pure inverters then {−C − N} ∪ {C + N} is invariant (although it may not yet contain the whole of O). However, we can return the inductive invariant {−C − N} ∪ {C + N} ∪ (O ∩ (−C, C)). The set O ∩ (−C, C) is finite and can be elicited by exhaustive search, noting that once an element of the orbit reaches absolute value at least C, the remainder of the corresponding trajectory remains forever outside of (−C, C).
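The no-inverter construction can be sketched as follows. The function name is ours, C_f^0 is instantiated as the ceiling of |b|/(|a| − 1) from the proof of Lemma 9, and the example system is chosen for illustration.

```python
# No-counter, no-inverter case: pick C past which every growing function
# increases the modulus, exhaustively explore the window (-C, C), and emit
# the invariant {-C - N} ∪ {C + N} ∪ (O ∩ (-C, C)).
def invariant_no_counters(x0, fs, y):
    """fs: growing maps x -> a*x + b with |a| >= 2, given as (a, b) pairs."""
    # C_f^0 = ceil(|b| / (|a| - 1)) guarantees |f(x)| >= |x| once |x| >= C_f^0
    C = max(max(-(-abs(b) // (abs(a) - 1)) for a, b in fs), abs(y) + 1)
    seen, frontier = set(), [x0]
    while frontier:                      # exhaustive search inside (-C, C)
        x = frontier.pop()
        if x in seen or abs(x) >= C:     # outside the window: stays outside
            continue
        seen.add(x)
        frontier.extend(a * x + b for a, b in fs)
    return C, seen                       # invariant: beyond ±C, plus `seen`

# f(x) = 2x from x(0) = 1, target y = 5: window (-6, 6) holds {1, 2, 4}
C, finite_part = invariant_no_counters(1, [(2, 0)], 5)
print(C, sorted(finite_part))  # 6 [1, 2, 4]
assert 5 not in finite_part    # the invariant separates the target: unreachable
```

Since |y| + 1 ≤ C, the target lies inside the window, so membership in the finite part decides reachability outright.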

If there is one pure inverter g(x) = −x + d then observe that −C is mapped to C + d and C + d is mapped to −C. Thus, intuitively, we want to use the interval (−C, C + d). However, two problems may occur: (a) since d could be less than 0, the point C + d may no longer be growing (under the application of the growing functions), and (b) an inverting growing function only ensures that −C is mapped to a value greater than or equal to C, rather than C + d. Hence, we choose C′ to ensure that C′ ± d still grows by at least |d| (under the application of our growing functions). Let C′ = max{C^{|d|}_{f_1}, ..., C^{|d|}_{f_k}, |y| + 1} + |d|. Then the invariant is {−C′ − N} ∪ {C′ + d + N} ∪ (O ∩ (−C′, C′ + d)).

**Non-opposing Counters.** The only remaining possibility (if there do not exist two opposing counters, and not all functions are growing or pure inverters) is that there are counter-like functions, but they all count in the same direction. There may also be a single pure inverter, and possibly some growing functions.

Pick a counter h(x) = x + d to be the reference counter; the choice is arbitrary, but it is convenient to pick a counter with minimal |d|. As a starting point, we have {x(0) + dN} ⊆ I.

**Lemma 10.** *If there is an inverter g(x) = −ax + b, with a > 0, b ∈ Z, and we have {x + dN} ⊆ I, then {g(x) + dZ} ⊆ I.*

The crucial difference with Lemma 8 is the observation that now an N-linear set has induced a Z-linear set.

*Proof.* Let r = g(x) + dm for m ∈ Z. We show r ∈ I. Consider x + dn for n ∈ N; then g(x + dn) = −a(x + dn) + b = −ax + b − adn = g(x) − adn. Hence g(x) − adn + dk, for n, k ∈ N, is reachable by applying k times the function h(x). Finally, for any m ∈ Z there exist k, n ∈ N such that k − na = m, so that r is indeed reachable. □
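The closing step of the proof is constructive: for m ≥ 0 one may take n = 0 and k = m, while for m < 0 one may take n = ⌈−m/a⌉ and k = m + na ≥ 0. A quick check of this witness construction (our sketch, not from the paper):

```python
from math import ceil

def witness(m, a):
    """Return (n, k) in N x N with k - n*a == m, for any m in Z and a >= 1."""
    n = 0 if m >= 0 else ceil(-m / a)  # enough inverter-scaled steps down
    k = m + n * a                      # then count back up with h
    return n, k

for a in (1, 2, 5):
    for m in range(-30, 31):
        n, k = witness(m, a)
        assert n >= 0 and k >= 0 and k - n * a == m
```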

Similarly to the situation with two opposing counters, whenever the invariant contains some Z-linear set, Lemma 8 allows us to saturate amongst the finitely many reachable residue classes.

However, the invariant may contain subsets that are not Z-linear. Consider {x + dN} ⊆ I, which is not yet invariant. We repeatedly apply non-inverting functions to {x + dN} to obtain new N-linear sets (not Z-linear sets). When the function applied 'moves' in the direction of the counters this will ultimately saturate (in particular when applying other counter functions). However, in the opposite direction, we may generate infinitely many such classes.

*Example 3.* Consider the reference counter h(x) = x + 4, with initial point 5. This yields an initial set {5 + 4N} ⊆ O, where 5 is the initial point and 4N is derived from the counter increment. Now, applying x ↦ 2x + 6 to {5 + 4N} we obtain {10 + 6 + 8N + 4N} = {16 + 4N}, then {38 + 4N}, and then {82 + 4N}. However {82 + 4N} ⊆ {38 + 4N}, and we can therefore stop with the invariant {5 + 4N} ∪ {16 + 4N} ∪ {38 + 4N}.
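The saturation in Example 3 can be replayed mechanically. Representing each N-linear set by its base point, containment is just {u + dN} ⊆ {v + dN} iff u ≥ v and u ≡ v (mod d); the loop below (our sketch, hypothetical names) stops exactly at the invariant from the example:

```python
def saturate(base, d, a, b):
    """Apply x -> a*x + b to the base point of {base + dN}, closing each image
    under the +d counter (valid since d divides a*d), until the new set lies
    inside an earlier one."""
    def contained(u, v):  # {u + dN} subseteq {v + dN}
        return u >= v and (u - v) % d == 0

    bases = [base]
    while True:
        nxt = a * bases[-1] + b  # base point of the closed image {nxt + dN}
        if any(contained(nxt, v) for v in bases):
            return bases
        bases.append(nxt)

print(saturate(5, 4, 2, 6))  # [5, 16, 38]
```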

However, if the initial sequence is not moving in the direction of the reference counter, this saturation does not occur. Consider {5 + 4N} with the function x ↦ 2x − 6. Then {5 + 4N} maps to {10 − 6 + 8N + 4N} = {4 + 4N}, which maps to {2 + 4N}, {−2 + 4N}, {−10 + 4N}, {−26 + 4N}, and so on. However, −2 and −10 (and indeed −26) are all congruent to 2 modulo 4. This means that in the negative direction we can obtain arbitrarily large negative values congruent to 2 modulo 4, and then use the reference counter h(x) = x + 4 to obtain any value of {2 + 4Z}.

Clearly we can examine all reachable residue classes defined by our reference counter. Any residue class reached after an inverting function induces a Z-linear set. So it remains to consider those N-linear sets reachable without inverting functions. The remaining case to handle occurs when we repeatedly induce N-linear sets until they repeat a residue class in the direction opposite to that of the reference counter.

We consider the case h(x) = x + d with d ≥ 0; the case h(x) = x − d is symmetric. It remains to detect when a set {x + dN} leads to {y + dN} by a sequence of non-inverting functions with y < x and x ≡ y (mod d). Then by repeated application of these functions one can reach sets {z + dN} with z arbitrarily small, hence we can replace {x + dN} by {x + dZ}. We give further details in the full version.
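This detection step can be sketched as follows (our illustration, hypothetical names): iterate the non-inverting function, closing under the counter as before, and stop as soon as a residue class modulo d repeats at a strictly smaller base point. On the instance from the previous example it recovers the class 2 modulo 4:

```python
def detect_z_upgrade(base, d, a, b, max_steps=64):
    """Iterate x -> a*x + b from the base point of {base + dN}, closing under
    the +d counter.  Return (residue, first, second) for the first residue
    class mod d hit again at a strictly smaller base point -- evidence that
    {first + dN} can be upgraded to {first + dZ}."""
    seen = {}
    x = base
    for _ in range(max_steps):
        r = x % d
        if r in seen and x < seen[r]:
            return r, seen[r], x
        seen.setdefault(r, x)
        x = a * x + b
    return None

print(detect_z_upgrade(5, 4, 2, -6))  # (2, 2, -2)
```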

**Reachability.** The above procedure is sufficient to decide reachability. In all cases apart from that in which there are no counters, the invariants produced coincide precisely with the reachability sets. A reachability query therefore reduces to asking whether the target belongs to the invariant.

In the remaining case, the invariant obtained is parametrised by the target via the bound C′. The target lies within the region (−C′, C′ + d), within which we can compute all reachable points. Thus once again, the target is reachable precisely if it belongs to the invariant. However, for a new target of larger modulus, a different invariant would need to be built.

#### **Complexity**

**Lemma 11.** *Assume that all functions, starting point, and target point are given in unary. Then the invariant can be computed in polynomial time.*

Without the unary assumption, the invariant could have exponential size, and hence require at least exponential time to compute. That is because the invariant we construct could include every value in an interval, for example, (−C, C), where C is of size polynomial in the largest value.

As shown in [10], the reachability problem is **NP**-hard when the input is encoded in binary, because one can encode the integer Knapsack problem (which allows an object to be picked multiple times rather than at most once). Moreover, the integer Knapsack problem is solvable in pseudo-polynomial time via dynamic programming, that is, in polynomial time assuming the input is given in unary, matching the complexity of our procedure.
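For intuition, the pseudo-polynomial upper bound rests on the classic dynamic program for integer Knapsack reachability; a minimal sketch (ours, not from [10]):

```python
def integer_knapsack_reach(target, steps):
    """Decide whether `target` is a non-negative integer combination of `steps`.

    Runs in O(target * len(steps)) time -- polynomial when the numbers are
    written in unary, but exponential in the bit-length of a binary encoding.
    """
    reach = [False] * (target + 1)
    reach[0] = True
    for t in range(1, target + 1):
        reach[t] = any(s <= t and reach[t - s] for s in steps)
    return reach[target]

assert integer_knapsack_reach(11, [4, 7])      # 11 = 4 + 7
assert not integer_knapsack_reach(5, [4, 7])   # no combination sums to 5
```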

# **6 The POROUS Tool**

Our invariant-synthesis tool porous<sup>9</sup> computes N-semi-linear invariants for point and Z-linear targets on systems defined by one-dimensional affine functions. porous includes implementations of the procedures of Theorem 3 (restricted to one-dimensional affine systems) and Theorem 6. porous is built in Python and can be used via command-line file input, via a web interface, or by directly invoking the Python packages.

porous takes as input an instance (a start point, a target, and a collection of functions) and returns the generated invariant. Additionally it provides a proof that this set is indeed an inductive invariant: the invariant is a union of N-linear sets, so for each linear set and each function, porous illustrates the application of that function to the linear set and shows of which other linear set in the invariant the image is a subset. Using this invariant, porous can decide reachability; if the specific target is reachable, the invariant is not in itself a proof of reachability (since the invariant will often be an overapproximation of the global reachability set). Rather, equipped with the guarantee of reachability, porous searches for a direct proof of reachability: a sequence of functions from start to target (a search that would not otherwise be guaranteed to terminate).

<sup>9</sup> Tool: invariants.davidpurser.net Code: github.com/davidjpurser/porous-tool.

**Table 2.** Results varying by size parameter (last row includes all instances tested). Times are given in seconds, with the average and maximum shown (except reachability-proof times, which are all approximately 30 s due to instances that terminate just before the timeout).


**Experimentation.** porous was tested on all 2^7 − 1 = 127 possible combinations of the following function types, with a ≥ 2, b ≥ 1: positive counters (x → x + b), negative counters (x → x − b), growing (x → ax ± b), inverting and growing (x → −ax ± b), inverters with positive counters (x → −x + b), inverters with negative counters (x → −x − b), and the pure inverter (x → −x). For each such combination a random instance was generated, with a size parameter controlling the maximum modulus of a and b, ranging between 8 and 1024. The starting point was between 1 and the size parameter, and the target was between 1 and 4 times the size parameter. Ten instances were tested for each size parameter and each of the 127 combinations, with between 1 and 9 functions of each type (with a bias towards one of each function type).

Our analysis, summarised in Table 2, illustrates the effect of the size parameter. The time to produce the proof of invariance is separated from the process of building the invariant, since producing the proof can become slower as |I| becomes larger; it requires finding L_k ∈ I such that f_i(L_j) ⊆ L_k for every linear set L_j ∈ I and every affine function f_i. In every case porous successfully built the invariant, and hence decided reachability, very quickly (on average well below 1 s), and also produced the proof of invariance in around half a second on average. To demonstrate correctness on instances for which the target is reachable, porous also attempts to produce a proof of reachability (a sequence of functions from start to target). Since our paper is focused on invariants as certificates of non-reachability, our proof-of-reachability procedure was implemented crudely as a simple breadth-first search without any heuristics, and hence a timeout of 30 s was used for this part of the experiment only.

Our experimental methodology was partially limited by the high prevalence of reachable instances. A random instance will likely exhibit a large (often universal) reachability set. When two random counters are included, the chance that gcd(b_1, b_2) = 1 (whence the whole space is covered) is around 60.8%, and higher if more counters are chosen.

Overall around 86% of instances were reachable (of which 84% produced a proof within 30 s). All of the 14% of unreachable instances produced a proof, with the invariant taking around 0.2 s to build and 0.6 s for the proof of invariance. The 30 s timeout when demonstrating reachability directly is several orders of magnitude longer than answering the reachability query via our invariant-building method.

A typical academic/consumer laptop was used to conduct the timing and analysis (a four-year-old, four-core MacBook Pro).

# **7 Conclusions and Open Directions**

We introduced the notion of porous invariants, which are not necessarily convex and can in fact exhibit infinitely many 'holes', and studied these in the context of multipath (or branching/nondeterministic) affine loops over the integers, or equivalently nondeterministic integer linear dynamical systems. We have in particular focused on reachability questions. Clearly, the potential applicability of porous invariants to larger classes of systems (such as programs involving nested loops) or more complex specifications remains largely unexplored.

Our focus is on the boundary between decidability and undecidability, leaving precise complexity questions open. Indeed, the complexity of synthesising invariants could conceivably be quite high, except where we have highlighted polynomial-time results. On the other hand, the invariants produced should be easy to understand and manipulate, from both a human and machine perspective.

On a more technical level, in our setting the most general class of invariants that we consider are N-semi-linear. There remains at present a large gap between decidability for one-dimensional affine functions and undecidability for linear updates in higher dimensions. It would be interesting to investigate whether decidability can be extended further, for example to dimensions 2 and 3.

**Acknowledgements.** This work was funded by DFG grant 389792660 as part of TRR 248 (see perspicuous-computing.science). Joël Ouaknine was supported by ERC grant AVS-ISS (648701), and is also affiliated with Keble College, Oxford as emmy.network Fellow. James Worrell was supported by EPSRC Fellowship EP/N008197/1.

### **References**

1. Almagor, S., Chistikov, D., Ouaknine, J., Worrell, J.: O-minimal invariants for discrete-time dynamical systems. Preprint, submitted (2019). https://arxiv.org/abs/1802.09263


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# JavaSMT 3: Interacting with SMT Solvers in Java

Daniel Baier , Dirk Beyer , and Karlheinz Friedberger

LMU Munich, Munich, Germany

Abstract. Satisfiability Modulo Theories (SMT) is an enabling technology with many applications, especially in computer-aided verification. Due to advances in research and strong demand for solvers, there are many SMT solvers available. Since different implementations have different strengths, it is often desirable to be able to substitute one solver for another. Unfortunately, the solvers have vastly different APIs and it is not easy to switch to a different solver (lock-in effect). To tackle this problem, we developed JavaSMT, a solver-independent framework that unifies the API for using a set of SMT solvers. This paper describes version 3 of JavaSMT, which now supports eight SMT solvers and offers a simpler build and update process. Our feature comparisons and experiments show that SMT solvers differ significantly in terms of feature support and performance characteristics. A unifying Java API for SMT solvers is important to make SMT technology accessible to software developers. Similar APIs exist for other programming languages.

Keywords: Satisfiability Modulo Theories · SMT Solver · Java · API

# 1 Introduction

SMT solvers [6, 21] are used in a multitude of applications, e.g., in formal software analysis, where automated test-case generation [7, 16, 29, 30], SMT-based algorithms for software verification [10, 34], and interactive theorem proving [27, 44] are used. Applications and users rely on the efficiency and expressiveness (supported SMT theories) of solvers to compute results in reasonable time. For application developers, the usability and API of the solver are also important aspects, and some features needed in applications, such as interpolation or optimization, are not available in some solvers.

Using the solver's own API directly makes it difficult to switch to another solver without rewriting extensive parts of the application, as there is no standardized binary API for SMT solvers. The SMT-LIB2 standard [4] mitigates this issue by defining a common language in which to interact with SMT solvers. However, this communication channel does not define a solver interface for special features like optimization or interpolation.<sup>1</sup> Additionally, the application has to parse the data provided by the SMT solver on its own, and this output of course differs slightly from solver to solver.
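To illustrate the text-based workflow (our sketch, independent of any particular solver): the application assembles an SMT-LIB2 script as plain text and must then parse the solver's s-expression reply itself, e.g., a model such as `((define-fun x () Int 3))`:

```python
def parse_sexpr(text):
    """Minimal s-expression reader of the kind every SMT-LIB2 client ships."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(i):
        if tokens[i] == "(":
            items, i = [], i + 1
            while tokens[i] != ")":
                node, i = read(i)
                items.append(node)
            return items, i + 1
        return tokens[i], i + 1

    return read(0)[0]

# One direction: the query is just text handed to the solver process.
query = "\n".join([
    "(set-logic QF_LIA)",
    "(declare-const x Int)",
    "(assert (> x 2))",
    "(check-sat)",
    "(get-model)",
])
# Other direction: a typical (solver-specific) reply the client must parse.
reply = "((define-fun x () Int 3))"
print(parse_sexpr(reply))  # [['define-fun', 'x', [], 'Int', '3']]
```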

<sup>1</sup> A proposal for adding interpolation queries has existed since 2012, see https://ultimate.informatik.uni-freiburg.de/smtinterpol/proposal.pdf.

© The Author(s) 2021

A. Silva and K. R. M. Leino (Eds.): CAV 2021, LNCS 12760, pp. 195–208, 2021.

https://doi.org/10.1007/978-3-030-81688-9_9

JavaSMT [37] provides a common API layer across multiple back-end solvers to address these problems. Our Java-based approach creates only minimal overhead, while giving access to most solver features. JavaSMT is available under the Apache 2.0 License on GitHub.<sup>2</sup>

Contribution. Our contribution consists of three parts:


Outline. This paper first provides a brief overview of JavaSMT in Sect. 2, explaining the inner structure and features. Sect. 3 discusses the development since the previous publication [37]: more integrated SMT solvers and extended support for operating systems and build processes. Sect. 4 describes a case study, based on SMT-based algorithms [10] in a common verification framework.

Related Work. SMT-LIB2 [4] is the established standard format for exchanging SMT queries. It is simple to use, easy to debug, and widely known in the community. However, it requires extra effort to parse and transform formulas in the user application. Features like optimization, interpolation, and access to nested parts of formulas are not defined by the standard, so some SMT solvers provide their own individual solutions. Alternatively, several SMT solvers come with their own bindings for some programming languages. Most SMT solvers are written in C/C++, so interacting with them from these low-level languages is the easiest way. However, support for higher-level languages is sparse. The most prominent language binding for several SMT solvers is Python, as it allows direct access to C code and avoids automated memory-management operations like asynchronous garbage collection. Bindings for Java are available for some SMT solvers, such as MathSAT5 and Z3, but missing, unsupported, or unmaintained for others, such as Boolector and CVC4.

In the following, we discuss libraries similar to JavaSMT that provide access to several underlying SMT solvers via a common user interface in different popular languages, and their binding mechanism, i.e., whether the solver interaction is based on a native interface or is text-based via SMT-LIB2. With SMT-LIB2, an arbitrary SMT solver can be queried, but the interaction happens through communicating processes and the solver is mostly limited to features defined in the standard. Accessing a native interface directly makes it possible to support more features of the underlying solver, e.g., using callbacks, simplifying formulas, or eliminating quantifiers.

Table 1 provides an overview of the libraries for interacting with SMT solvers. We enumerate several special features that are not available in some libraries,

<sup>2</sup> https://github.com/sosy-lab/java-smt


Table 1: Comparison of different interface libraries for SMT solvers

such as unsat cores, interpolation, or optimization queries. Those features depend on support by the underlying SMT solver, but can in general be provided by an API on top of them. Most libraries use their own formula representation rather than merely wrapping the objects provided by the SMT solver. This potentially allows for easier formula decomposition and inspection, e.g., by using the visitor pattern. JavaSMT directly provides formula decomposition if available in the SMT solver. The numbers of forks and stars of the project repositories on GitHub or Bitbucket can be seen as a measure of popularity.

PySMT [28] is a Python-based project and aims at rapid prototyping of algorithms using the native API of the installed SMT solvers. It can perform formula manipulation without a back-end SMT solver, and additionally supports converting Boolean formulas to plain SAT problems, to which a SAT solver or a BDD library can then be applied. This approach comes with the drawbacks of noticeable memory overhead and the performance penalty of an interpreted language. metaSMT [45], SMT Kit, and Smt-Switch [38] provide solver-agnostic APIs for interacting with various SMT solvers in C/C++, so that users can focus on the application instead of the solver integration. jSMTLIB [20], Scala SMT-LIB, and ScalaSMT [17] are solver-independent libraries written in Java or Scala that interact with SMT solvers via SMT-LIB2. Scala SMT-LIB and ScalaSMT additionally offer a domain-specific language for interacting with SMT solvers, rewriting Scala syntax into valid SMT-LIB2 and back. Both partially extend the SMT-LIB2 standard, e.g., by offering the ability to overload operators or to receive interpolants. SBV and what4 are generic Haskell libraries based on process interaction via SMT-LIB2 and support several SAT and SMT solvers. rsmt2 offers a generic Rust library that currently supports three SMT solvers.

# 2 JavaSMT's Architecture and Solver Integration

In the following, we describe the architecture of JavaSMT and its main concepts. Afterwards, we give an overview of the integrated SMT solvers and their features. The architecture has not changed significantly since the previous version, but we have added a few new SMT solvers, as shown in Fig. 1.

Architecture. JavaSMT provides a common API for various SMT solvers. The architecture, shown in Fig. 1, consists of several components: As a common context, we use a SolverContext that loads the underlying SMT solver and defines the scope and lifetime of all created objects. As long as the context is available, we track the memory regions of the native SMT-solver libraries. When the context is closed, the corresponding memory is freed and garbage collection wipes all unused objects. Within a given context, JavaSMT provides FormulaManagers for creating formulas in various theories and ProverEnvironments for solving SMT queries.

A FormulaManager lets the user create symbols and formulas in the corresponding theories and provides a type-safe way to combine symbols and formulas in order to encode a more complex SMT query. We support structural analysis of formulas (like splitting a formula into its components or counting all function applications in a formula) as well as transformations (like substituting symbols or applying equisatisfiable simplifications).

Each ProverEnvironment represents a solver stack and allows the user to push and pop Boolean formulas and to check them for satisfiability (the hard part). This follows the idea of incremental solving (if the underlying SMT solver supports it). After a satisfiability check, the ProverEnvironment provides methods to retrieve a model, interpolants, or an unsatisfiable core for the given formulas.
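The push/pop discipline can be illustrated with a self-contained toy (in Python, deliberately not the JavaSMT API; the class and its brute-force check are our invention): push opens a scope, pop discards every constraint added since, and satisfiability is always judged against the whole stack:

```python
from itertools import product

class ToyProver:
    """Toy assertion stack over named Boolean variables (brute-force check)."""

    def __init__(self, variables):
        self.variables = list(variables)
        self.stack = [[]]              # one constraint list per open scope

    def push(self):
        self.stack.append([])

    def pop(self):
        self.stack.pop()

    def add_constraint(self, f):       # f: dict[str, bool] -> bool
        self.stack[-1].append(f)

    def is_unsat(self):
        constraints = [f for scope in self.stack for f in scope]
        return not any(
            all(f(dict(zip(self.variables, bits))) for f in constraints)
            for bits in product([False, True], repeat=len(self.variables))
        )

prover = ToyProver(["a", "b"])
prover.add_constraint(lambda e: e["a"] or e["b"])
prover.push()
prover.add_constraint(lambda e: not e["a"])
prover.add_constraint(lambda e: not e["b"])
print(prover.is_unsat())   # True: the scoped constraints contradict a or b
prover.pop()
print(prover.is_unsat())   # False: only (a or b) remains
```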

JavaSMT guarantees that formulas built with a single FormulaManager can be used in several ProverEnvironments, e.g., the same formula can be pushed onto and solved within several distinct ProverEnvironments. Interaction with independent ProverEnvironments works from multiple threads. However, some SMT solvers require synchronization (e.g., locking for interleaved usage), while others do not require external synchronization (which allows concurrent usage).

SMT-Solver Integration and Bindings. Of the eight SMT solvers that are available in JavaSMT, only Princess [46] and SMTInterpol [18] were 'easy' to integrate, as they are written in Scala and Java, respectively. Those solvers also use the memory management and garbage collection of the Java Virtual Machine (JVM). All other solvers are written in C/C++ and need a Java Native Interface (JNI) wrapper to interface with JavaSMT. Z3 [40] and CVC4 [5] provide their own Java wrappers, while the bindings used for MathSAT5 [19], Boolector [42], and Yices2 [25] are maintained by us. Those bindings are either written by us or partially based on a version from the solver developers, extended with exception handling, and usable for debugging in JavaSMT. By providing language bindings for solvers in our library, we relieve the solver developers of this burden, and exception handling and memory management are implemented in an efficient and uniform manner across several solvers.

Fig. 1: Overview of JavaSMT

Table 2: Size (LOC) of the Java-based solver wrappers and native solver bindings


Table 2 lists the size (lines of code) of the wrappers that integrate each solver in JavaSMT, to give a rough impression of the effort required to make a solver and its bindings usable in JavaSMT. The size information consists of two parts, namely the JNI bindings that are written in C/C++ and the Java code that implements the necessary interfaces of JavaSMT. An expressive solver API (like that of MathSAT5 or OptiMathSAT [47]) needs more code for its binding, with only a small increase in complexity compared to other solver bindings.

Note that the evolution of JavaSMT depends on the evolution of the underlying SMT solvers. Z3 is well known, has a large user group, and an active development team. Yet, interpolation support for Z3 was dropped with release 4.8.1.<sup>3</sup> Bitwuzla [41] is the successor of the SMT solver Boolector, for which the developers still provide small fixes. Bitwuzla could be supported in JavaSMT in the future. CVC4 has been developed further into CVC5. However, the maintainers

<sup>3</sup> https://github.com/Z3Prover/z3/releases/tag/z3-4.8.1

dropped the existing Java API, partially because of issues with the Java garbage collection, and plan to replace it.<sup>4</sup> Yices2 is also actively maintained and adds new features regularly. For example, the developers added support for third-party SAT solvers such as CaDiCaL and CryptoMiniSat [48].

# 3 New Contributions in JavaSMT 3

This section describes the improvements over the JavaSMT version from five years ago [37], split into two parts. First, we describe newly integrated solvers and theory features. Second, we provide information about the build process.

Support for Additional SMT Solvers. JavaSMT 3 provides access to eight SMT solvers. Besides the solvers that were already integrated before, MathSAT5, OptiMathSAT, Z3, Princess, and SMTInterpol, the user can now additionally use Boolector, CVC4, and Yices2. Table 3 lists available theories and important features supported by each individual solver. Boolector specializes in Bitvector-based theories, but does not support the Integer theory. It is shipped with several back-end SAT solvers, from which the user can choose a favorite: CaDiCaL, CryptoMiniSat [48], Lingeling, MiniSat [26], and PicoSAT [13]. All solvers support the input of plain SMT-LIB2 formulas. However, the feature most requested by JavaSMT users is the input and output of SMT queries via the API, i.e., parsing and printing Boolean formulas for a given context. This feature is required for (de-)serializing formulas to disk, for network transfer, and for translating formulas from one solver to another. It is unfortunately missing for the newly integrated solvers, even though each solver internally already contains code for parsing and printing SMT-LIB2 formulas.

For formula manipulation, JavaSMT accesses the components of a formula, e.g., operators and operands. We do not require full access to the internal data structures of the SMT solvers, but only limited access to the most basic parts. Only Boolector does not provide the necessary API.

Build Simplification. JavaSMT 3 also supports more operating systems than before. Besides the existing support for Linux, we started to provide pre-compiled binaries for macOS and Windows for more than half of the available solvers. This simplifies the initial steps for new users, who previously had to compile and link the solvers on their own, an involved task because of the diversity of build systems and dependencies among the solvers.

In addition, we now offer direct support for two popular build systems for Java applications, namely Ant and Maven. JavaSMT comes with several examples and documentation, such that these build systems can set up JavaSMT in a ready-to-go state on most systems. This eliminates the need for complex manual setup of dependencies and eases the use of JavaSMT and the SMT solvers.

<sup>4</sup> https://github.com/cvc5/cvc5/issues/5018


Table 3: SMT theories and features supported by SMT solvers in JavaSMT 3

### 4 Evaluation

Frameworks that provide a unified API to SMT solvers (such as JavaSMT, PySMT, and ScalaSMT) are necessary because the characteristics of SMT solvers vary considerably. Our evaluation provides support for this argument.

We already discussed feature support in the previous section. Table 3 provides an overview of supported theories and shows that certain theories are available only for a subset of SMT solvers. The table also shows that several features restrict the choice of SMT solvers for certain applications.

In terms of performance, we evaluate JavaSMT 3 as a component of CPAchecker [11], an open-source software-verification framework<sup>5</sup> that provides a range of SMT-based algorithms for program analysis [10] and encoding techniques for program control flow [8, 12]. We compare three well-known and successful SMT-based algorithms for software model checking and show that, when using the same algorithm and identical problem encoding, the performance of an analysis depends on the SMT solver used. Some

<sup>5</sup> https://cpachecker.sosy-lab.org

algorithms depend on special features of the SMT solver, e.g., the ability to provide a certain type of formula (such as interpolants) or operations on formulas (such as access to subformulas). Some SMT solvers therefore cannot be used for some algorithms.

We aim to show that, given the varying feature sets of SMT solvers, it is important to support a common API, and additionally, that text-based interaction via SMT-LIB2 is not an efficient solution when it comes to formula analysis, such as adding information to a formula.

Benchmark Programs. We evaluate the usage of JavaSMT on a large subset of the SV-benchmark suite<sup>6</sup> containing over 1 000 verification tasks. To obtain a broad variation of benchmark tasks, we include reachability problems from the categories BitVectors, ControlFlow, Heap, and Loops.

BitVectors depends on bit-precise reasoning, and thus the SMT solver needs to support Bitvector logic. Heap depends on modeling heap memory access, which is encoded either in the theory of Arrays or as Uninterpreted Functions. The category Loops contains tasks whose state space is potentially quite large.

Experimental Setup. We run all our experiments on computers with Intel Xeon E3-1230 v5 CPUs at 3.40 GHz, and limit the CPU time to 15 min and the memory to 15 GB. We use CPAchecker revision r36714, which internally uses JavaSMT 3.7.0-73. The time needed for transforming the input program into SMT queries is rather small compared to the analysis time. Additionally, the progress of an algorithm depends on the results (e.g., model values or interpolants) returned from an SMT solver; thus we do not explicitly extract the run time required by the SMT solver itself for answering the satisfiability problem, but instead measure the complete CPU time of CPAchecker for the verification run.

Analysis Configuration. We use three different SMT-based algorithms for software verification [10]. The first approach is bounded model checking (BMC) [14, 15], which has been applied in software and hardware model checking for many years. In this approach, a verification problem is encoded as a single large SMT query and given to the SMT solver. No further interaction with the SMT solver is required. In our evaluation, we use a loop bound k = 10, which limits the size of the SMT query.
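For intuition, the unrolling behind BMC can be sketched with a toy explicit-state search (our illustration; a real BMC encodes the unrolled path constraint as one SMT query instead of enumerating):

```python
from itertools import product

def bmc(init, trans, bad, k, domain):
    """Search for a path s0..sk with init(s0), trans between neighbours,
    and bad(s_j) for some j <= k -- the property BMC encodes as one query."""
    for path in product(domain, repeat=k + 1):
        if (init(path[0])
                and all(trans(path[i], path[i + 1]) for i in range(k))
                and any(bad(s) for s in path)):
            return path   # counterexample path
    return None

# Toy counter system: starts at 0, each step either stays put or adds 2.
init = lambda s: s == 0
trans = lambda s, t: t in (s, s + 2)
print(bmc(init, trans, lambda s: s == 6, k=4, domain=range(9)))  # a path hitting 6
print(bmc(init, trans, lambda s: s == 7, k=4, domain=range(9)))  # None: 7 is odd
```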

The second approach is k-induction [9, 24], which extends BMC, and which uses auxiliary invariants to strengthen the induction hypothesis. In this approach, the algorithm generates several SMT queries (base case, inductive-step case, each with increasing loop bound) and uses an invariant generator that provides the auxiliary invariants. We use an interval-based invariant generator that provides not only the invariants, but also information about pointers and aliases, which must be inserted into the SMT formula using the formula visitor.

The third approach is predicate abstraction [3, 12, 31, 35], which uses Craig interpolation [22, 32, 39] to compute predicate abstractions of the program. This approach not only queries the SMT solver multiple times, but also uses (sequential) interpolation, which is currently supported only by MathSAT5, Princess, and SMTInterpol.

<sup>6</sup> https://github.com/sosy-lab/sv-benchmarks

Fig. 2: Quantile plot for the runtime of k-induction with several SMT solvers

All approaches are executed in two configurations, depending on the encoding of program statements: first, a bitvector-based encoding that precisely models bit-precise arithmetic and overflows of the program; second, an encoding based on linear integer arithmetic, which approximates the concrete program execution and is sufficient for some programs.
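The gap between the two encodings can be illustrated in one line: the claim x + 1 > x is valid over mathematical integers but not over fixed-width bitvectors, where the addition can overflow (a minimal sketch using unsigned 32-bit wrap-around):

```python
# Why the two encodings can disagree: over mathematical integers, x + 1 > x
# always holds, but over 32-bit bitvectors it fails when x wraps around.
MASK = 0xFFFFFFFF  # 32-bit wrap-around, as a bitvector theory would model it

def holds_int(x):
    return x + 1 > x                  # linear integer arithmetic view

def holds_bv32(x):
    # bit-precise view with overflow (unsigned comparison)
    return (x + 1) & MASK > x & MASK

x = 2**32 - 1                         # maximal unsigned 32-bit value
print(holds_int(x), holds_bv32(x))    # True False
```

A program whose correctness depends on overflow behavior therefore needs the bitvector-based encoding, while the integer-based encoding suffices when no wrap-around is reachable.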

Solver Configuration. Overall, we aim to show that each solver provides a unique fingerprint of features and results. We aim for a precise program analysis and thus configure the SMT solvers to be as precise as possible, while using a reasonable configuration for each solver (i.e., avoiding feature combinations that the SMT solver does not support).

SMTInterpol does not support efficient solving of SMT queries in bitvector logic, so it is configured to use only integer logic. Boolector lacks integer logic and is therefore applied only in the bit-precise configurations. Additionally, Boolector does not support formula inspection and decomposition, which several components of k-induction require, e.g., to encode proper pointer aliasing for the program analysis. While the code for formula inspection is called quite often, its influence on the results for the selected benchmark tasks is small. To keep the results as comparable as possible, we deactivate pointer aliasing when using Boolector. Yices2 lacks proper support for array logic, so we use a UF-based encoding of heap memory as an alternative for this solver, which results in a slightly unsound analysis but a comparable formula size and run time.

Results and Discussion. Figure 2 provides the quantile plot for the results of k-induction configurations with bit-precise encoding using several SMT solvers. The plot shows the CPU time for valid analysis results, i.e., proofs or counterexamples found, for both expected results true and false. We aim to provide all results that are useful for a user and do not show results where the tool (or SMT solver) crashes or runs out of resources. We do not subtract the run time of the framework CPAchecker itself (which starts a Java virtual machine), as we assume it to be comparable across program tasks; we are only interested in the asymptotics in this evaluation. The overall performance of the SMT solvers is similar for simple verification tasks, i.e., those with a small analysis run time. For difficult tasks with harder SMT queries, differences between the SMT solvers emerge. When applying k-induction, the analysis inserts additional constraints into the

Table 4: Run time of different SMT solvers for bounded model checking ('BMC'), k-induction ('KI'), and predicate abstraction ('PA') with the theories of bitvectors ('BV') and integers ('Int'); CPU time given in seconds with two significant digits, 'TO' indicates timeouts (900 s), 'ERR' indicates errors, and empty cells indicate that the theory or interpolation was not supported


SMT formula and requires the SMT solver to allow access to components of existing formulas. As Boolector lacks this specific feature, k-induction cannot be very effective with it, and other SMT solvers are the preferred choice.

Table 4 contains example tasks from all used algorithms and encodings for which the differences between the SMT solvers are noteworthy. Choosing the optimal SMT solver for an arbitrary verification task is not obvious.

# 5 Conclusion

We contribute JavaSMT 3, the third generation of the unifying Java API for SMT solvers. The package now contains more SMT solvers, an improved build process, and support for macOS and Windows. The project has over 20 contributors, 2 500 commits, and overall about 41 000 lines of code.<sup>7</sup> JavaSMT is used in Java applications (e.g., [23, 33, 36]) as a solution that combines convenience and performance in the interaction with SMT solvers, and that makes it easy to switch between different solvers and compare them [11, 49]. The most prominent application using JavaSMT is the verification framework CPAchecker (a widely-used software

<sup>7</sup> https://www.openhub.net/p/java-smt

project<sup>8</sup> with 73 forks on GitHub alone), for which JavaSMT was originally developed. In the future, we plan to support more SMT solvers, operating systems, and hardware architectures, while keeping the user interface stable. We hope that even more researchers and developers of Java applications can benefit from SMT solving via a convenient and powerful API.

Data Availability Statement. All benchmark tasks for evaluation, configuration files, a ready-to-run version of our implementation, and tables with detailed results are available in our reproduction package on Zenodo as virtual machine [1] and as ZIP archive [2]. The source code of the open-source library JavaSMT [37] is available in the project repository; see https://github.com/sosy-lab/java-smt.

Funding. This project was supported by the Deutsche Forschungsgemeinschaft (DFG) – 378803395 (ConVeY).

# References


<sup>8</sup> https://github.com/sosy-lab/cpachecker


49. Sprey, J., Sundermann, C., Krieter, S., Nieke, M., Mauro, J., Thüm, T., Schaefer, I.: SMT-based variability analyses in FeatureIDE. In: Proc. VaMoS. pp. 6:1–6:9. ACM (2020). https://doi.org/10.1145/3377024.3377036

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/ 4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Efficient SMT-Based Analysis of Failure Propagation**

Marco Bozzano<sup>1</sup>, Alessandro Cimatti<sup>1</sup>, Anthony Fernandes Pires<sup>1</sup>, Alberto Griggio<sup>1</sup>, Martin Jonáš<sup>1</sup>, and Greg Kimberly<sup>2</sup>

> <sup>1</sup> Fondazione Bruno Kessler, Trento, Italy {bozzano,cimatti,griggio,mjonas}@fbk.eu <sup>2</sup> The Boeing Company, Seattle, USA greg.kimberly@boeing.com

**Abstract.** The process of developing civil aircraft and their related systems includes multiple phases of Preliminary Safety Assessment (PSA). An objective of PSA is to link the classification of failure conditions and effects (produced in the functional hazard analysis phases) to appropriate safety requirements for elements in the aircraft architecture. A complete and correct preliminary safety assessment phase avoids potentially costly revisions to the design late in the design process. Hence, automated ways to support PSA are an important challenge in modern aircraft design. A modern approach to conducting PSAs uses abstract propagation models, which are essentially hypergraphs whose arcs model the dependency among components, e.g., how the degradation of one component may lead to the degraded or failed operation of another. Such models are used for computing *failure propagations*: the fault of a component may have multiple ramifications within the system, causing the malfunction of several interconnected components. A central aspect of this problem is identifying the minimal fault combinations, also referred to as *minimal cut sets*, that cause overall failures.

In this paper we propose an expressive framework to model failure propagation, catering for multiple levels of degradation as well as cyclic and nondeterministic dependencies. We define a formal sequential semantics, and present an efficient SMT-based method for the analysis of failure propagation, able to enumerate cut sets that are minimal with respect to the order between levels of degradation. In contrast with the state of the art, the proposed approach is provably more expressive, and dramatically outperforms other systems when a comparison is possible.

# **1 Introduction**

The process of developing civil aircraft and their related systems is guided by documents ARP4754A [17] and ARP4761 [16] produced by the engineering and standards organization SAE International. These documents describe a structured process for the safety assessment of these classes of platforms. An important stage is that of the Preliminary Aircraft Safety Assessment (PASA) and Preliminary System Safety Assessment (PSSA). The PASA is followed by multiple PSSAs, carried out at the level of the systems composing the aircraft. One important goal of these process stages is to link the classification of failure conditions and effects (produced in the aircraft functional hazard analysis phase) to appropriate safety requirements for elements in the aircraft architecture. These safety requirements drive, among other things, assignment of target Development Assurance Levels (DAL) for items within the architecture. A complete and correct preliminary safety assessment phase avoids potentially costly revisions to the design late in the design process. Hence, automated ways to support PSA are an important challenge in modern aircraft design [18].

An important goal of PSAs is to fully understand how faults of simple functions (e.g. providing electrical power, on-ground braking) interact and propagate to affect the overall behaviours (e.g. landing, take-off, taxiing). A modern approach to conducting such safety assessments is via propagation models [1,14,19], that model the dependency among components, e.g. how the degradation of one component may lead to the degraded or failed operation of another. Such models are used for computing *failure propagations*: the fault of a component may have multiple ramifications within the system, causing the malfunction of several interconnected components. A central problem is identifying the minimal fault combinations, also referred to as *minimal cut sets*, that cause overall failures [12].

Given that PSAs occur in the early stages of the development process, when limited information regarding the design is available, reasoning is carried out at a very high level of abstraction. Therefore, instead of using behavioural models (e.g., infinite-state transition systems) adopted in formal verification, the system is more naturally modeled by the simpler formalism of propagation graphs. This does not make PSA any easier. There are in fact several aspects that must be taken into account. The first problem is the sheer size of propagation graphs, both in terms of nodes and hyper-paths to be explored, which makes enumerative techniques completely inadequate.

Second, the propagation is non-Boolean [19]. That is, the degradation levels of the system functions are not binary (working vs. not working): the functions may be subject to different levels of degradation (e.g., fully operational, partly failed, completely failed), may fail in different ways (e.g., detected vs. undetected, stuck open vs. stuck closed), and different failures may be associated with different probabilities [19]. For example, the state of a component can be abstractly modeled as *working (w)*, *failed safe (fs)*, *failed detected (fd)*, or *failed undetected (fu)*, with degrees of degradation partially ordered as shown in Fig. 1.

**Fig. 1.** Hasse diagram of the fds W3F [14].

In this setting, the notion of minimality needs to take into account the order among the levels of degradation, and cannot simply be considered in terms of minimality with respect to set-inclusion. Third, various forms of failure propagation may be possible, e.g., nondeterministic, temporally-constrained, cyclic. For example, the failure of a power generator may lead, within a certain amount of time, to a depleted battery and then to the loss of an engine. In turn, the loss of an engine may compromise the ability to generate power, which clearly requires the ability to deal with cyclic propagation graphs. Additionally, a failure of the control system might cause a pressure valve to become either stuck open or stuck closed; this requires the ability to deal with nondeterministic propagations.

In this paper we tackle the problem of analyzing failure propagation in the full generality required by real-world applications. We start from Finite Degradation Structures (fds) [14], a recently-proposed modeling framework, which unifies various combinational models traditionally used in safety analysis (such as fault trees and minimal cut sets) and generalizes them to deal with different levels of degradation. We propose a framework, referred to as pgfds (Propagation Graphs over fds), that allows modeling nondeterministic and cyclic propagation graphs. The framework is general and can be used in other safety-critical domains.

In order to deal with cyclic behaviours, pgfds require a sequential semantics, expressed via symbolic transition systems. The computation of minimal cut sets over pgfds can be carried out by means of techniques based on model checking, developed for the general case of behavioural models [6].

Then, we prove that it is possible to carry out the same analysis within a combinational setting, leveraging two widely adopted assumptions: that faults are persistent and that the fault propagation is monotone. These assumptions allow us to devise an efficient algorithm that can analyze fault propagations of realistic industrial benchmarks that are currently out of reach of state-of-the-art methods. The analysis of pgfdss is reduced to model enumeration for an SMT formula that does not require the explicit unrolling of the transition system. We tackle two key difficulties. The first is to ensure causality and rule out self-supporting fault configurations in the combinational encoding. This is done by imposing cycle-breaking constraints requiring the existence of a partial order, which is then constructed by the SMT solver during the analysis. The second is to devise efficient techniques for enumerating models that are fds-minimal, i.e., minimal with respect to *the severity of the degradation* given by the fds. To this end, we propose an SMT-based enumerator of fds-minimal models.

We have experimentally evaluated our approach on a comprehensive set of realistic benchmarks, also generating random systems that have a structure similar to our proprietary systems<sup>1</sup>. The results demonstrate substantial advances with respect to the state of the art. Our approach is clearly superior to the approach proposed in [14], which is limited to the case of acyclic deterministic pgfdss. For cyclic pgfdss, we compare our approach against the sequential approach based on model checking and show that our approach is able to scale to large pgfdss, dramatically outperforming the sequential approach.

This paper is structured as follows. In Sect. 2 we present the mathematical notation and background on fds. In Sect. 3 we describe Propagation Graphs over fds (pgfds). In Sect. 4 we present the combinational encoding of pgfds into SMT. In Sect. 5 we describe how to use the SMT encoding for the enumeration of fds-minimal cut sets. In Sect. 6 we discuss some related work, and in Sect. 7

<sup>1</sup> Unfortunately the proprietary systems cannot be disclosed.

we present the experimental evaluation. In Sect. 8 we draw some conclusions and outline directions for future work.

# **2 Preliminaries**

In this section, we explain the basic mathematical conventions that are used in the paper. We assume that the reader is familiar with the basic ideas of Satisfiability Modulo Theories (SMT) and in particular with the theory of linear integer arithmetic and the DPLL(T) procedure, as presented, e.g., in [2].

When convenient, we define unary functions with small domains extensionally in-place, e.g., $\{1 \mapsto 2, 2 \mapsto 3\}$ is the function with domain $\{1, 2\}$ that maps 1 to 2 and 2 to 3. We say that the $n$-ary function $f(x_1, x_2, \ldots, x_n)$ *depends* on its formal argument $x_i$ if there are values $v_1, v_2, \ldots, v_n, v'_i$ in the corresponding domains such that $f(v_1, v_2, \ldots, v_i, \ldots, v_n) \neq f(v_1, v_2, \ldots, v'_i, \ldots, v_n)$. Given sets $A$ and $B$, we denote by $B^A$ the set of all functions from $A$ to $B$. Given a partially ordered set $(A, \le)$, a subset $B \subseteq A$ is called an *upper* (resp. *lower*) set if for all $b \in B$, $a \in A$, the condition $a \ge b$ (resp. $a \le b$) implies $a \in B$.

A Finite Degradation Structure (fds) [14] is a triple $(FM, \le, \bot)$, where $FM$ is a finite set of failure modes and $\le$ is a partial order on $FM$ with the least element $\bot$. For any set $A$ and an fds $B = (FM_B, \le_B, \bot_B)$, the fds $B^A$ for the set of functions from $A$ to $FM_B$ is defined as $((FM_B)^A, \le_{B^A}, \bot_{B^A})$, where $\bot_{B^A}(a) = \bot_B$ for all $a \in A$, and $f \le_{B^A} f'$ if and only if $f(a) \le_B f'(a)$ for all $a \in A$. We assume that each fds contains at least two elements. We say that an fds is *Boolean* if it is isomorphic to the structure $(\{\bot, \top\}, \bot \le \top, \bot)$. In the following, for an fds $D = (FM, \le, \bot)$, we denote elements of the set $FM$ by $f, f'$ and call them *failure modes*.
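The pointwise lifting of an fds order to functions (states) can be sketched as follows. The four failure modes follow the w/fs/fd/fu example of the introduction, but the Hasse diagram used here is an assumption (w as least element, the three failure modes pairwise incomparable); the actual order is the one in Fig. 1:

```python
# Sketch of an fds order and its pointwise lifting to states.
# Assumed partial order: w <= every mode; fs, fd, fu pairwise incomparable.
LE = {("w", "w"), ("w", "fs"), ("w", "fd"), ("w", "fu"),
      ("fs", "fs"), ("fd", "fd"), ("fu", "fu")}

def le(f, g):
    """The order <= on failure modes."""
    return (f, g) in LE

def le_pointwise(s, t):
    """Lift <= to states, i.e., functions from components to failure modes:
    s <= t iff s(c) <= t(c) for every component c."""
    return all(le(s[c], t[c]) for c in s)

s = {"g": "w", "e": "fd", "h": "w"}
t = {"g": "fs", "e": "fd", "h": "w"}
print(le_pointwise(s, t), le_pointwise(t, s))  # True False
```

This lifted order is exactly the order $\le_{B^A}$ on the fds $B^A$, and it is the order with respect to which cut sets will be compared for minimality later in the paper.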

Given a first-order formula $\varphi$ over the language of the theory of linear integer arithmetic, an assignment $\mu$ that assigns a value $\mu(b) \in \{\mathbf{false}, \mathbf{true}\}$ to each free Boolean variable $b$ of $\varphi$ and a value $\mu(n) \in \mathbb{Z}$ to each free integer variable $n$ of $\varphi$ is called a model of $\varphi$ (denoted $\mu \models \varphi$) if $\mu$ makes $\varphi$ true. If $B$ is a subset of the free Boolean variables of $\varphi$, the model $\mu \models \varphi$ is called *subset-minimal with respect to* $B$ if there is no model $\mu' \models \varphi$ such that $\{b \in B \mid \mu'(b) = \mathbf{true}\} \subsetneq \{b \in B \mid \mu(b) = \mathbf{true}\}$.

A *transition system* $TS$ is a tuple $(X, I, T)$ where $X$ is a set of (state) variables, $I(X)$ is a formula representing the initial states, and $T(X, X')$ is a formula representing the transitions. A *state* of $TS$ is an assignment to the variables $X$. A *trace* of $TS$ is a (possibly infinite) sequence $s_0, s_1, \ldots$ of states such that $s_0 \models I$ and, for all $i \ge 0$, $(s_i, s_{i+1}) \models T$.

# **3 Propagation Graphs over FDSs**

In this section, we introduce our model for fault propagation, which we call Propagation Graphs over fdss (pgfds), and provide a sequential semantics for it which can be used to encode pgfdss into transition systems.

Intuitively, a Propagation Graph over fds (pgfds) consists of a set of components of the system and of the *next* function. In each step of the failure propagation, each component is in some failure mode from the underlying fds. In the next step of the failure propagation, each component can either 1) stay in its previous failure mode or 2) switch to an arbitrary failure mode from the set of possible next failure modes. The set of possible next failure modes for each component is given by the function *next*, based on the current failure modes of all components in the system.

**Definition 1 (Propagation Graph over FDS (PGFDS)).** *Given a finite degradation structure* $D = (FM, \le, \bot)$*, a* propagation graph over $D$ *is a pair* $S = (C, next)$*, where*

*–* $C$ *is a finite set of system components, and*
*–* $next\colon C \to ((FM)^C \to 2^{FM})$ *is a function that assigns to each component* $c \in C$ *a mapping from states of the system to the set of failure modes to which* $c$ *may switch in the next step.*
*A* state *of* $S$ *is a mapping* $s\colon C \to FM$ *that assigns a failure mode* $f \in FM$ *to each system component* $c \in C$*.*

*Example 1.* Consider a system with three components, $h$ (hydraulic), $e$ (electric), and $g$ (control on ground), over the Boolean fds $(\{\bot, \top\}, \bot \le \top, \bot)$. Each of the components is either working correctly (represented by the failure mode $\bot$) or incorrectly ($\top$). Component $g$ depends on the correct functionality of either $e$ or $h$. Component $e$ depends on $h$ to function correctly and, symmetrically, $h$ depends on $e$. The failure propagation of this system can be described by a pgfds $S = (\{g, e, h\}, next)$, where

$$\begin{aligned}
next(g)(s) &= \{\top\} \text{ if } s(e) = \top \text{ and } s(h) = \top \text{, and } \emptyset \text{ otherwise},\\
next(e)(s) &= \{\top\} \text{ if } s(h) = \top \text{, and } \emptyset \text{ otherwise},\\
next(h)(s) &= \{\top\} \text{ if } s(e) = \top \text{, and } \emptyset \text{ otherwise}.
\end{aligned}$$
Note that *next*(c)(s) = <sup>∅</sup> means that if the system is in the state <sup>s</sup>, the component c cannot change its current failure mode.

The structure is intuitively associated with the hypergraph depicted in Fig. 2. The dashed rectangles represent the fact that each component can fail on its own (*locally*); the hyper-arc from e and h to g is conjunctive, while the arcs incoming into a node are disjunctive. 

An important assumption of our approach is that we consider only fault-persistent propagations, i.e., fault propagations where each component can fail only once and, after it does, it stays in the same failure mode forever. Note that this is a realistic assumption that is also used in other techniques for reliability analysis [5]. It is also implicitly used in other modeling techniques that are purely combinational (e.g., [19]), because they model the system at a single time step, without considering any change over time. A single propagation step of such computations is described by a *fault-persistent transition relation*; the whole computation by a *fault-persistent failure propagation*.

**Fig. 2.** The hypergraph view of a simple pgfds.

**Definition 2 (Fault-persistent transition relation).** *Let* $S = (C, next)$ *be a* pgfds *over an* fds *with the least element* $\bot$*. The* fault-persistent transition relation *of* $S$*, denoted* $R_S$*, is the binary relation between states of* $S$ *such that for all states* $s, s'$*, the relation* $R_S(s, s')$ *holds if and only if for each* $c \in C$

*–* $s'(c) = s(c)$*, or*
*–* $s(c) = \bot$ *and* $s'(c) \in next(c)(s)$*.*

**Definition 3 (Fault-persistent failure propagation).** *Given a* pgfds $S = (C, next)$*, its fault-persistent transition relation* $R_S$*, and* $k \in \mathbb{N}$*, the sequence* $(s_i)_{0 \le i \le k}$ *of states of* $S$ *is called a* fault-persistent failure propagation *if the relation* $R_S(s_i, s_{i+1})$ *holds for all* $0 \le i < k$*.*

Because we deal only with fault-persistent failure propagations in this paper, from now on we refer to the fault-persistent transition relation and the fault-persistent failure propagation simply as *transition relation* and *failure propagation*, respectively.
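The transition relation of Definition 2 can be computed by brute force for the system of Example 1. In the sketch below, the Boolean fds is encoded as False/True, and the next function is a reconstruction based on the encoding shown later in Example 4, i.e., an assumption about the definition rather than a quote of it:

```python
from itertools import product

COMPONENTS = ("g", "e", "h")

def nxt(c, s):
    """Reconstructed next(c)(s) for Example 1 (an assumption):
    g can fail once both e and h have failed; e and h fail via each other."""
    if c == "g":
        return {True} if s["e"] and s["h"] else set()
    if c == "e":
        return {True} if s["h"] else set()
    return {True} if s["e"] else set()          # c == "h"

def successors(s):
    """All s' with R_S(s, s'): each component keeps its mode or, if still
    at bottom (False), may switch to a mode allowed by next(c)(s)."""
    choices = []
    for c in COMPONENTS:
        opts = {s[c]}
        if not s[c]:                             # s(c) is bottom, may switch
            opts |= nxt(c, s)
        choices.append(sorted(opts))
    return [dict(zip(COMPONENTS, combo)) for combo in product(*choices)]

s0 = {"g": False, "e": False, "h": True}         # only h initially failed
print(successors(s0))
```

From the state in which only h has failed, the only nontrivial successor is the one in which e has additionally failed, matching the intuition that faults propagate one dependency at a time while already-failed components stay failed.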

**Definition 4 (Cyclic PGFDS).** *Let* $S = (C, next)$ *be a* pgfds*. A component* $c \in C$ depends on *a component* $d \in C$ *iff* $next(c)(s) \neq next(c)(s')$ *for some* $s, s'\colon C \to FM$ *such that* $s(d) \neq s'(d)$ *and* $s(c') = s'(c')$ *for all* $c' \neq d$*. Let* $deps(c) := \{d \in C \mid c \text{ depends on } d\}$*, let* $D \subseteq C \times C$ *be such that* $D(c, c')$ *if and only if* $c' \in deps(c)$*, and let* $D^+$ *be the transitive closure of* $D$*. Then we say that* $S$ *is* cyclic *if and only if there exists* $c \in C$ *such that* $D^+(c, c)$ *holds.*
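The dependency relation and the cyclicity test of Definition 4 can be checked exhaustively on small systems. The sketch below does so for Example 1, again with the next function reconstructed from the encoding in Example 4 (an assumption):

```python
from itertools import product

COMPONENTS = ("g", "e", "h")

def nxt(c, s):
    """Reconstructed next(c)(s) for Example 1 (an assumption)."""
    if c == "g":
        return frozenset({True}) if s["e"] and s["h"] else frozenset()
    if c == "e":
        return frozenset({True}) if s["h"] else frozenset()
    return frozenset({True}) if s["e"] else frozenset()  # c == "h"

def deps(c):
    """Components d whose failure mode can change next(c) (Definition 4)."""
    result = set()
    for vals in product([False, True], repeat=len(COMPONENTS)):
        s = dict(zip(COMPONENTS, vals))
        for d in COMPONENTS:
            flipped = dict(s, **{d: not s[d]})
            if nxt(c, s) != nxt(c, flipped):
                result.add(d)
    return result

def is_cyclic():
    edges = {(c, d) for c in COMPONENTS for d in deps(c)}
    closure = set(edges)
    while True:  # naive transitive closure by repeated composition
        new = {(a, d) for (a, b) in closure for (b2, d) in closure if b == b2}
        if new <= closure:
            break
        closure |= new
    return any((c, c) in closure for c in COMPONENTS)

print(deps("g"), deps("e"), deps("h"), is_cyclic())
```

The result agrees with Example 2: g depends on e and h, e depends on h, h depends on e, and the mutual e/h dependency makes the pgfds cyclic.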

*Example 2.* In the pgfds S from Example 1, the component g depends on components e and h, the component e depends on h, and the component h depends on e. The pgfds S is therefore cyclic because e (and also h) transitively depends on itself. 

To analyze the reliability of the modeled system, it is important to identify the failures of its components (i.e., assignments of failure modes to the components) that cause the system to reach a given set of dangerous states, usually called the *top level event (TLE)*. Such assignments are called *cut sets*. Since the number of all cut sets can be prohibitively large, it is often enough to identify the least severe failures, in terms of the underlying fds, that are sufficient to cause the TLE. Such cut sets are called fds*-minimal*, or *minimal* for short. These concepts are formalized in the following definitions.

**Definition 5 (Top Level Event).** *Given a* pgfds S*, a* Top Level Event *(TLE) is an arbitrary set of states of* S*.*

**Definition 6 ((FDS-Minimal) Cut Set).** *Given a* pgfds $S = (C, next)$ *and a top level event* $TLE$*, a* cut set *is any state* $s$ *for which there is a fault-persistent failure propagation that starts in* $s$ *and ends in some* $s_k \in TLE$*. A cut set is called* fds-minimal *(or* minimal *for short) if it is minimal with respect to the pointwise ordering* $\le$ *of the underlying* fds*.*

Given a system S and a top level event *TLE*, we denote the set of all corresponding cut sets as *CS*(S, *TLE*) and the set of all minimal cut sets as *MCS*(S, *TLE*). As a convention, when talking about cut sets, we will explicitly mention only the components to which the cut set assigns a failure mode different from ⊥.

*Example 3.* Consider again the pgfds $S$ from Example 1 and the top level event $TLE = \{s\colon \{g, e, h\} \to \{\bot, \top\} \mid s(g) = \top\}$, which corresponds to the component $g$ not working correctly. The minimal cut sets for the pgfds $S$ and the given top level event are

$$\{g \mapsto \top\}, \qquad \{e \mapsto \top\}, \qquad \text{and} \qquad \{h \mapsto \top\}.$$
Note that besides these three minimal cut sets, there are other cut sets that are not minimal, such as $\{e \mapsto \top, h \mapsto \top\}$.
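For a system as small as Example 1, the minimal cut sets can be reproduced by exhaustive search. In the sketch below, the Boolean fds is encoded as False/True, the next function is reconstructed from the encoding in Example 4 (an assumption), and the fixed-point saturation is valid because faults are persistent and the system is subset-monotone (a notion formalized in Definition 8):

```python
from itertools import product

COMPONENTS = ("g", "e", "h")

def nxt(c, s):
    """True iff next(c)(s) = {top} in the reconstructed Example 1 system."""
    if c == "g":
        return s["e"] and s["h"]
    if c == "e":
        return s["h"]
    return s["e"]                                 # c == "h"

def reaches_tle(s0):
    """Can some fault-persistent propagation from s0 make g fail?
    Faults persist and propagation is monotone, so the maximal
    propagation is a least fixed point reached by saturation."""
    s = dict(s0)
    while True:
        changed = False
        for c in COMPONENTS:
            if not s[c] and nxt(c, s):
                s[c] = True
                changed = True
        if not changed:
            return s["g"]

states = [dict(zip(COMPONENTS, v))
          for v in product([False, True], repeat=3)]
cuts = [s for s in states if reaches_tle(s)]
minimal = [s for s in cuts
           if not any(c != s and all(c[x] <= s[x] for x in COMPONENTS)
                      for c in cuts)]
print([{k for k in s if s[k]} for s in minimal])  # the three singletons
```

The search confirms Example 3: every nonempty initial failure eventually brings down g, and the pointwise-minimal cut sets are exactly the three singletons.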

Fault-persistent computations of a pgfds can be easily represented as traces of a (symbolic) transition system.

**Definition 7 (Fault-persistent transition system).** *Given a* pgfds $S = (C, next)$ *and an* fds $D = (FM, \le, \bot)$*, the corresponding* fault-persistent (symbolic) transition system *is given by* $TS_S = (X, \mathbf{true}, T)$*, where:*


By definition, every fault-persistent computation of $S$ has a corresponding trace (of the same length) in $TS_S$. Therefore, encoding pgfdss as transition systems allows leveraging off-the-shelf algorithms for subset-minimal cut set enumeration, such as those given in [6]. However, this might be inefficient, particularly for TLEs that are triggered by long failure propagations (corresponding to equally long traces of the induced transition system). Moreover, as we show later, enumerating fds-minimal cut sets is more involved.

Fault propagation systems used in practice often have the property that no transition can be disabled by additional faults, i.e., by switching the failure mode of a component from $\bot$ to some $f \neq \bot$. This is also the case for the pgfds from Example 1. Such systems are called *subset-monotone*, or *monotone* for short. This is formalized by the following definition.

**Definition 8 (Subset-monotone PGFDS).** *A* pgfds $S = (C, next)$ *is called* subset-monotone *if for all* $s, s'\colon C \to FM$*, the condition* $\forall c \in C.\ s(c) \neq \bot \rightarrow s(c) = s'(c)$ *implies* $\forall c \in C.\ next(c)(s) \subseteq next(c)(s')$*.*
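For a small system, Definition 8 can be checked exhaustively. The following sketch confirms that the Example 1 system (with the next function reconstructed from the encoding in Example 4, an assumption) is subset-monotone:

```python
from itertools import product

COMPONENTS = ("g", "e", "h")

def nxt(c, s):
    """Reconstructed next(c)(s) for Example 1 (an assumption)."""
    if c == "g":
        return frozenset({True}) if s["e"] and s["h"] else frozenset()
    if c == "e":
        return frozenset({True}) if s["h"] else frozenset()
    return frozenset({True}) if s["e"] else frozenset()  # c == "h"

def is_subset_monotone():
    states = [dict(zip(COMPONENTS, v))
              for v in product([False, True], repeat=3)]
    for s, t in product(states, repeat=2):
        # Premise of Definition 8: t agrees with s on every component
        # that has already failed in s (components at bottom are free).
        if all(not s[c] or s[c] == t[c] for c in COMPONENTS):
            # Conclusion: next may only grow when faults are added.
            if not all(nxt(c, s) <= nxt(c, t) for c in COMPONENTS):
                return False
    return True

print(is_subset_monotone())  # True
```

Intuitively, monotonicity holds here because each next condition only tests that certain components have failed, so adding faults can only enable, never disable, a propagation step.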

# **4 From Sequential to Combinational**

In this section, we describe a combinational encoding of fault-persistent computations of a pgfds, which is guaranteed to be exact for subset-monotone pgfdss and provides a useful overapproximation for general pgfdss. In the rest of the section, let $S = (C, next)$ be a pgfds over the fds $D = (FM, \le, \bot)$, and let $TLE$ be a top level event. We show how to construct a first-order formula $\varphi_{cs}$ over the theory of linear integer arithmetic whose models correspond to cut sets of $S$ with respect to $TLE$. In the next section, we then use this formula to enumerate all fds-minimal cut sets of $S$.

To encode the propagations of $S$, for each component $c \in C$ and each failure mode $f \in FM$ we introduce two Boolean variables: $I_{c,f}$ and $F_{c,f}$. The variable $I_{c,f}$ encodes whether $c$ was in the failure mode $f$ in the initial state of the propagation. The variable $F_{c,f}$ encodes whether $c$ has been in the failure mode $f$ at any time during the propagation. We can then encode $TLE$ as a formula $\varphi_{TLE}$ over the variables $F_{c,f}$.<sup>2</sup>

Considering now a possible propagation, a component $c$ can be in failure mode $f \neq \bot$ at some time during the propagation for two reasons: either it was already in $f$ in the initial state of the propagation, or it transitioned to $f$ because of its $next$ function. The first case is represented by $I_{c,f}$ being true. The second case can be encoded as follows (for each $c \in C$ and $f \in FM \setminus \{\bot\}$):

$$\bigvee\_{\substack{s\colon C\to FM\\f\in next(c)(s)}} \bigwedge\_{\substack{d\in deps(c)\\s(d)\neq\bot}} F\_{d,s(d)},\tag{1}$$

stating that there must exist a row in the truth table of $next(c)$ whose result includes $f$ and which agrees with the current state on the failure modes of the failed dependencies.<sup>3</sup> The above, however, would *not* work in the presence of cycles. This can already be seen in the simple cyclic pgfds from Example 1.

<sup>2</sup> A naive encoding would be the formula $\bigvee_{s\in TLE}\big(\bigwedge_{c\in C, s(c)\neq\bot} F_{c,s(c)} \wedge \bigwedge_{c\in C, s(c)=\bot}\bigwedge_{f\in FM\setminus\{\bot\}} \neg F_{c,f}\big)$, but more compact representations are of course possible (particularly if $TLE$ is given symbolically).

<sup>3</sup> This formula can again be encoded more compactly, particularly if the $next$ function is given symbolically, which is usually the case in practice.

*Example 4.* Consider again the pgfds S from Example 1. The above-described encoding of the propagations of S is

$$\begin{aligned} (F\_{\mathcal{G},\top} &\rightarrow (I\_{\mathcal{G},\top} \vee (F\_{\mathcal{E},\top} \wedge F\_{\mathcal{H},\top}))) \quad \land \\ (F\_{\mathcal{E},\top} &\rightarrow (I\_{\mathcal{E},\top} \vee F\_{\mathcal{H},\top})) \quad \land \\ (F\_{\mathcal{H},\top} &\rightarrow (I\_{\mathcal{H},\top} \vee F\_{\mathcal{E},\top})). \end{aligned}$$

Although this encoding has a model μ such that μ |= ¬I*g,*⊤ ∧ ¬I*e,*⊤ ∧ ¬I*h,*⊤ ∧ F*g,*⊤ ∧ F*e,*⊤ ∧ F*h,*⊤, there is no propagation path of S in which both components e and h are initially in the state ⊥ and switch to the state ⊤ during the propagation. The problem is that the encoding allows models where a failure of e is caused by a failure of h, which is in turn caused by the same failure of e.

In order to solve the problem, we introduce constraints imposing a *causal ordering* among the components, stating that the failure of a component can be caused only by other components that precede it in the causal order. We encode this by introducing one additional integer variable o*<sup>c</sup>* for each component c, which intuitively corresponds to the time when the component c switched to a failure mode different from ⊥, and modifying the formula (1) to take the causal ordering into account:<sup>4</sup>

$$\bigvee\_{\substack{s\colon C\to FM\\f\in next(c)(s)}}\bigwedge\_{\substack{d\in deps(c)\\s(d)\neq\bot}}\left(F\_{d,s(d)}\wedge o\_d < o\_c\right).\tag{2}$$

Putting it all together, the encoding for the failure mode changes is given by the formula ϕ*next* below:

$$\varphi\_{next} = \bigwedge\_{\substack{c \in C\\ f \in FM\setminus\{\bot\}}} (F\_{c,f} \to (I\_{c,f} \vee (2))) \wedge (I\_{c,f} \to F\_{c,f}).$$

*Example 5.* For the pgfds S from Example 1, the correct encoding of the propagations of S is thus the following formula ϕ*next*:

$$\begin{split} & \left( \begin{array}{rcl} \left( \boldsymbol{F}\_{\mathrm{G},\top} & \rightarrow & \left( \boldsymbol{I}\_{\mathrm{G},\top} \vee \left( \left( \boldsymbol{F}\_{\mathrm{E},\top} \wedge o\_{\mathrm{E}} < o\_{\mathrm{G}} \right) \wedge \left( \boldsymbol{F}\_{\mathrm{H},\top} \wedge o\_{\mathrm{H}} < o\_{\mathrm{G}} \right) \right) \right) \right) & \wedge \\ & \left( \boldsymbol{I}\_{\mathrm{G},\top} & \rightarrow & \boldsymbol{F}\_{\mathrm{G},\top} \right) & \wedge \\ & \left( \boldsymbol{F}\_{\mathrm{E},\top} & \rightarrow & \left( \boldsymbol{I}\_{\mathrm{E},\top} \vee \left( \boldsymbol{F}\_{\mathrm{H},\top} \wedge o\_{\mathrm{H}} < o\_{\mathrm{E}} \right) \right) \right) & \wedge \\ & \left( \boldsymbol{I}\_{\mathrm{E},\top} & \rightarrow & \boldsymbol{F}\_{\mathrm{E},\top} \right) & \wedge \\ & \left( \boldsymbol{F}\_{\mathrm{H},\top} & \rightarrow & \left( \boldsymbol{I}\_{\mathrm{H},\top} \vee \left( \boldsymbol{F}\_{\mathrm{E},\top} \wedge o\_{\mathrm{E}} < o\_{\mathrm{H}} \right) \right) \right) & \wedge \\ & \left( \boldsymbol{I}\_{\mathrm{H},\top} & \rightarrow & \boldsymbol{F}\_{\mathrm{H},\top} \right) . \end{split}$$

Note that the constraints for causal ordering now rule out the spurious self-supporting propagation in which e fails because of h and h fails because of e.

<sup>4</sup> We remark that such ordering constraints are needed only if the input pgfds is cyclic, and only between components in the same strongly connected component of the dependency graph.

This would require that o*e* < o*h* and o*h* < o*e* are both true, which is clearly impossible in the theory of linear integer arithmetic (or, more generally, in any theory in which < is interpreted as a strict ordering relation).
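The effect of the ordering constraints can be checked exhaustively on this toy instance. The following sketch (our own illustration, not part of the paper's tool) encodes the formula of Example 5 over Booleans and brute-forces all assignments; since the order variables occur only positively in strict inequalities, it suffices to try all total orders (permutations) instead of all integer assignments:

```python
from itertools import product, permutations

# Toy check of the Example 5 encoding: components g, e, h over the Boolean fds.
# I[c] / F[c] are the initial / eventual failure variables; o is a causal order.
COMPONENTS = ["g", "e", "h"]

def phi_next(I, F, o):
    # Implications "A -> B" are encoded as "not A or B".
    return (
        (not F["g"] or (I["g"] or (F["e"] and o["e"] < o["g"]
                                   and F["h"] and o["h"] < o["g"])))
        and (not I["g"] or F["g"])
        and (not F["e"] or (I["e"] or (F["h"] and o["h"] < o["e"])))
        and (not I["e"] or F["e"])
        and (not F["h"] or (I["h"] or (F["e"] and o["e"] < o["h"])))
        and (not I["h"] or F["h"])
    )

def has_model(check):
    # Enumerate all Boolean assignments and all strict total orders on
    # {g, e, h}; ties never help because '<' occurs only positively.
    for i_bits, f_bits in product(product([False, True], repeat=3), repeat=2):
        I = dict(zip(COMPONENTS, i_bits))
        F = dict(zip(COMPONENTS, f_bits))
        for perm in permutations(range(3)):
            o = dict(zip(COMPONENTS, perm))
            if phi_next(I, F, o) and check(I, F):
                return True
    return False

# The spurious self-supporting propagation is excluded: no model has
# e and h failed without any initial failure.
spurious = lambda I, F: not any(I.values()) and F["e"] and F["h"]
assert not has_model(spurious)
# A genuine propagation is still allowed: e initially failed lets h fail.
genuine = lambda I, F: I["e"] and not I["h"] and F["h"]
assert has_model(genuine)
```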

The propagations of S mentioned in Example 3 correspond to the following assignments:


These assignments are not unique; there are infinitely many choices for the values of the ordering variables o*c*. Also note that there is no global causality ordering for the system: the causality ordering is different for different propagations. 

Finally, we encode the fault-persistence constraint by stating that no component can be in two failure modes either in the initial state of the propagation or at any time during the propagation:

$$\varphi\_{once} = \bigwedge\_{\substack{c \in C \\ f, f' \in FM\setminus\{\bot\} \\ f \neq f'}} \left( \neg I\_{c,f} \vee \neg I\_{c,f'} \right) \wedge \left( \neg F\_{c,f} \vee \neg F\_{c,f'} \right).$$

The final formula is then given by ϕ*cs* :

$$
\varphi\_{cs} = \varphi\_{TLE} \wedge \varphi\_{next} \wedge \varphi\_{once}.
$$

As the following theorem shows, for general systems the formula ϕ*cs* encodes an *overapproximation* of the set *CS*(S, *TLE*). The reason is that the encoding does not constrain the dependencies that are working, i.e., that are in the failure mode ⊥. Note that even an overapproximation of *CS*(S, *TLE*) is useful for safety analysis; it can be used, for example, to compute an upper bound on the probability of failure of the system. Moreover, if the system S is *subset-monotone*, which is often the case in practice, the formula ϕ*cs* is guaranteed to encode the set *CS*(S, *TLE*) exactly.

To formulate the relationship precisely, we define the function that provides the correspondence between the models μ of ϕ*cs* and the cut sets of S. Observe that thanks to ϕ*once*, each model μ of ϕ*cs* corresponds to a unique initial state *modelToState*(μ) of S, defined as follows:

$$modelToState(\mu)(c) = \begin{cases} f, & \text{if } \{f' \in FM \setminus \{\bot\} \mid \mu(I\_{c,f'}) = \mathtt{true}\} = \{f\}, \\ \bot, & \text{if } \{f' \in FM \setminus \{\bot\} \mid \mu(I\_{c,f'}) = \mathtt{true}\} = \emptyset. \end{cases}$$

MCS-enumeration(ϕ*cs* , *modelToState*):


**Fig. 3.** SMT-based MCS enumeration algorithm.

**Theorem 1.** *For an arbitrary* pgfds S *and a top level event TLE ,*

$$CS(S, TLE) \subseteq \{modelToState(\mu) \mid \mu \models \varphi\_{cs}\}.$$

*Moreover, if* S *is subset-monotone, these sets are equal.*

# **5 Enumeration of FDS-Minimal Cut Sets**

In this section, we show how to efficiently enumerate fds-minimal cut sets of subset-monotone systems using the formula ϕ*cs* and an SMT solver. We first consider a simplified case, in which the underlying fds D is Boolean. We then show how to generalize our solution to arbitrary fdss.

# **5.1 Algorithm for Boolean FDSs**

The pseudo-code of our procedure for the case when the underlying fds is Boolean is shown in Fig. 3. Intuitively, the algorithm enumerates all the subset-minimal models of ϕ*cs* with respect to the set of variables of the form I*c,f*. These models are enumerated one by one, and each enumerated model is, together with all its supermodels, blocked by the assertion on line 13, until the formula becomes unsatisfiable. Each model of the formula is converted to a cut set by the function *modelToState*.
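Abstracting away the DPLL(T) machinery, the enumerate-and-block loop can be sketched in a few lines of Python (a brute-force stand-in for the solver; the predicate `formula` plays the role of ϕ*cs* and `preferred` the role of the I*c,f* variables — the names are ours, not those of Fig. 3):

```python
from itertools import product

def minimal_models(formula, variables, preferred):
    """Enumerate models of `formula` that are subset-minimal with respect
    to the `preferred` variables, blocking each found model's supersets.
    `formula` is a predicate over a dict assignment (toy stand-in for an
    SMT solver call)."""
    found = []  # frozensets of preferred variables assigned true
    while True:
        best = None
        for bits in product([False, True], repeat=len(variables)):
            asg = dict(zip(variables, bits))
            tset = frozenset(v for v in preferred if asg[v])
            # Blocking constraint: skip supersets of already-found models.
            if any(b <= tset for b in found):
                continue
            # A minimum-cardinality unblocked model is subset-minimal.
            if formula(asg) and (best is None or len(tset) < len(best)):
                best = tset
        if best is None:
            return found  # formula plus blocking clauses is unsatisfiable
        found.append(best)

# Usage: two incomparable minimal models {x} and {y} for "x or y".
ms = minimal_models(lambda a: a["x"] or a["y"], ["x", "y"], ["x", "y"])
assert sorted(map(sorted, ms)) == [["x"], ["y"]]
```

Unlike this exhaustive sketch, the actual algorithm obtains each subset-minimal model in a single solver call by branching on the preferred variables with value **false** first.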

The algorithm makes use of a DPLL(T)-based SMT solver that provides the following functionalities:


The correctness for our algorithm is formalized by the theorem below.

**Theorem 2 (MCS enumeration over Boolean FDS).** *For a subset-monotone* pgfds S *over the Boolean* fds*, the result of* MCS-enumeration(ϕ*cs*, *modelToState*) *is the set of all* fds*-minimal cut sets of* S*.*

*Proof.* Let S = (C, *next*) be a subset-monotone pgfds. It was proven by Di Rosa et al. [15] that if the branching heuristics of a CDCL-based SAT solver are modified to assign **false** to a subset V of variables before branching on other variables (lines 4–5 of our pseudocode), the produced model is subset-minimal with respect to the set of variables V. This claim straightforwardly extends to DPLL(T)-based SMT solvers. In every iteration, the algorithm thus finds one subset-minimal model μ of ϕ*cs* with respect to the set of variables I*c,f* and adds a constraint that prevents enumerating, in the following iterations, any model μ′ such that {I*c,f* ∈ *vars*(ϕ*cs*) | μ(I*c,f*) = **true**} ⊆ {I*c,f* ∈ *vars*(ϕ*cs*) | μ′(I*c,f*) = **true**}. Therefore, the described algorithm enumerates, for each model μ of the formula ∃{F*c,f* | c ∈ C, f ∈ *FM*} ∃{o*c* | c ∈ C} (ϕ*cs*) that is subset-minimal with respect to the set of variables I*c,f*, exactly one model μ′ of ϕ*cs* that agrees with μ on all variables I*c,f*.

Note that *vars*(ϕ*cs*) does not contain the variable I*c,*⊥ for any c ∈ C. For a Boolean fds and models μ, μ′ |= ϕ*cs*, we thus have {I*c,f* ∈ *vars*(ϕ*cs*) | μ(I*c,f*) = **true**} ⊆ {I*c,f* ∈ *vars*(ϕ*cs*) | μ′(I*c,f*) = **true**} if and only if *modelToState*(μ) ≤ *modelToState*(μ′). Therefore, Theorem 1 implies that for subset-monotone S, the subset-minimal models of ϕ*cs* with respect to the set of variables of the form I*c,f* correspond precisely to the fds-minimal cut sets of S, and the correspondence is given by the function *modelToState*.

#### **5.2 Extension to Arbitrary FDSs**

The algorithm of Fig. 3 does not work for arbitrary fdss in general, but only for fdss in which all the failure modes different from ⊥ are incomparable. The problem is that the assumption that a cut set is fds-minimal iff the corresponding model of ϕ*cs* is subset-minimal with respect to the set of variables I*c,f* with f ≠ ⊥ does not hold in general with the encoding of Sect. 4, as can be seen in the following simple example.

<sup>5</sup> For example, calling add-preferred-var(solver, *v*, **true**) means that if the solver has to perform a case split, *v* will be assigned before all non-preferred variables, and it will always be assigned to true by the branching heuristic.

**Fig. 4.** Hasse diagram of the ordered set (*W*3*F* ↓*,* ⊆) together with the encoding of the elements as formulas.

*Example 6.* Consider the fds D = ({⊥, m, ⊤}, ⊥ ≤ m ≤ ⊤, ⊥) and the pgfds S = ({c}, *next*) with *next*(c)(s) = ∅ for all s. Intuitively, S contains one component that cannot change its failure mode during the computation. Consider further the top-level event *TLE* = {{c ↦ m}, {c ↦ ⊤}}.

Both {c ↦ ⊤} and {c ↦ m} are cut sets, but only the latter is fds-minimal. However, the algorithm of Fig. 3 will return both, since both correspond to subset-minimal models with respect to the set of variables I*c,f*.

We can adapt the procedure of Fig. 3 to arbitrary fdss by using an encoding in which the ordering of assignments to the I*c,f* variables corresponds to the severity ordering ≤ of the underlying fds D. To do this, we exploit the isomorphism between D = (*FM*, ≤, ⊥) and the poset D↓ of its lower subsets generated by single elements, defined as D↓ = {{f′ ∈ *FM* | f′ ≤ f} | f ∈ *FM*}, with partial order ⊆ and least element {⊥}. For example, the poset (W3F↓, ⊆) for the fds W3F of Fig. 1 is shown in Fig. 4, together with an encoding of the elements as formulas.

With this isomorphism in mind, we define for each c ∈ C and f ∈ *FM* the formula ψ*c*=*f* that represents the failure mode f of component c by assigning the subset of variables {I*c,f̂* | f̂ ≤ f} to true:

$$\psi\_{c=f} = \bigwedge\_{\hat{f} \in FM,\ \hat{f} \le f} I\_{c,\hat{f}} \quad \wedge \bigwedge\_{\hat{f} \in FM,\ \hat{f} \not\le f} \neg I\_{c,\hat{f}}.$$

The important property of this definition is that for all c ∈ C, f, f′ ∈ *FM* and assignments μ |= ψ*c*=*f* and μ′ |= ψ*c*=*f*′, we have f ≤ f′ if and only if {I*c,f̂* | μ(I*c,f̂*) = **true**} ⊆ {I*c,f̂* | μ′(I*c,f̂*) = **true**}.
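This correspondence between severity ordering and subset inclusion can be checked mechanically on a small instance. The sketch below (our own illustration, using the three-element chain of Example 6 as the fds) computes lower sets and verifies the property:

```python
# Toy fds: the chain bot <= m <= top (cf. Example 6).
FM = ["bot", "m", "top"]
RANK = {"bot": 0, "m": 1, "top": 2}
leq = lambda a, b: RANK[a] <= RANK[b]

def lower_set(f):
    # {f_hat in FM | f_hat <= f}: the formula psi_{c=f} sets exactly the
    # variables I_{c,f_hat} for f_hat in lower_set(f) to true.
    return frozenset(g for g in FM if leq(g, f))

# Key property: f <= f' iff lower_set(f) is a subset of lower_set(f'),
# so subset-minimality over the I variables coincides with fds-minimality.
for f in FM:
    for g in FM:
        assert leq(f, g) == (lower_set(f) <= lower_set(g))

# modelToState^FM recovers the failure mode as the maximum true I variable.
assert max(lower_set("m"), key=RANK.get) == "m"
```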

We then modify the encoding ϕ*cs* of Sect. 4 as follows:

1. First, we modify ϕ*next* to encode the initial state by using ψ*c*=*<sup>f</sup>* instead of I*c,f* . This ensures that the ordering of assignments to the initial variables reflects the ordering given by the underlying fds. We also remove the mutual exclusion constraints on the variables I*c,f* from ϕ*once*, because the mutual exclusion of initial failure modes is now guaranteed by the definition of ψ*c*=*<sup>f</sup>* :

$$\varphi\_{next} = \bigwedge\_{\substack{c \in C \\ f \in FM\setminus\{\bot\}}} \left( F\_{c,f} \to (\psi\_{c=f} \vee (2)) \right) \wedge (\psi\_{c=f} \to F\_{c,f}),$$

$$\varphi\_{once} = \bigwedge\_{\substack{c \in C \\ f, f' \in FM\setminus\{\bot\} \\ f \neq f'}} \left( \neg F\_{c,f} \vee \neg F\_{c,f'} \right).$$

2. Then, we add domain constraints that ensure that the resulting formula represents only models with assignments to I*c,f* that correspond to elements of <sup>D</sup> <sup>↓</sup>:

$$\varphi\_{D\downarrow} = \bigwedge\_{c \in C} \bigvee\_{f \in FM} \psi\_{c=f}.$$

The new encoding is then given by ϕ*cs*<sup>FM</sup>:

$$\varphi\_{cs}^{FM} = \varphi\_{TLE} \wedge \varphi\_{next} \wedge \varphi\_{once} \wedge \varphi\_{D\downarrow}.$$

The modified encoding ϕ*cs*<sup>FM</sup> represents the cut sets in a different way: instead of representing the failure modes directly by I*c,f* as in ϕ*cs*, they are now represented by the subformulas ψ*c*=*f*. Therefore, to prove correctness of the modified encoding, the function *modelToState* that maps models to cut sets also has to be changed. We define the initial state *modelToState*<sup>FM</sup>(μ) corresponding to the model μ by *modelToState*<sup>FM</sup>(μ)(c) = max{f ∈ *FM* | μ(I*c,f*) = **true**}. Note that the maximum is guaranteed to exist because of the ϕ*D*↓ constraint.

**Theorem 3.** *For an arbitrary* pgfds S *and a top level event TLE ,*

$$CS(S, TLE) \subseteq \{modelToState^{FM}(\mu) \mid \mu \models \varphi\_{cs}^{FM}\}.$$

*Moreover, if* S *is subset-monotone, these sets are equal.*

Therefore, the algorithm MCS-enumeration from Fig. 3 can be used to enumerate the fds-minimal cut sets of a subset-monotone pgfds, given as inputs the modified encoding ϕ*cs*<sup>FM</sup> and the modified function *modelToState*<sup>FM</sup>. This is formalized by the following theorem:

**Theorem 4 (MCS enumeration for general FDS).** *For a subset-monotone* pgfds S *over an* fds D*, the result of* MCS-enumeration(ϕ*cs*<sup>FM</sup>, *modelToState*<sup>FM</sup>) *is the set of all* fds*-minimal cut sets of* S*.*

Note that our encoding of fds-minimality is general and does not depend on the algorithm for enumeration of subset-minimal models. Indeed, thanks to our encoding, any off-the-shelf minimal-model enumerator can be used to enumerate fds-minimal models. Therefore, any improvements to minimal model enumeration directly translate to improved performance of our method for fds-minimal cut set enumeration. From the opposite point of view, our encoding can in principle be employed by other tools to reduce fds-minimal cut set enumeration to subset-minimal cut set enumeration.

### **6 Related Work**

Finite Degradation Models (fdms) [14] are an algebraic framework accommodating the concept of fault degradation, where faults may have different values organized into a semi-lattice. Using fdms, (probabilistic) safety analysis (fault trees and minimal cut sets) can be generalized from Boolean models to multi-state systems. Compared to fdms, fault-persistent pgfdss differ in two significant aspects: first, since the function *next* returns a set of possible next failure modes, pgfdss allow non-determinism in the failure propagation, i.e., the failure of a component is not *uniquely* determined by the failure modes of its dependencies. Second, and more importantly, pgfdss allow cyclic dependencies and give them a well-defined and expected semantics. Since the work on fdms is the closest to ours, we discuss it in detail below.

In [8], the authors present a framework for failure propagation which enables modeling sets of failure modes using a domain-specific language. It is less expressive than fdms, in that sets of failure modes cannot be related by degradation orders, which significantly simplifies the enumeration of MCSs. Finally, classical formalisms for failure propagation that are less expressive than fdss include fptn [9] and Hip-HOps [11].

tfpgs (Timed Failure Propagation Graphs) [1] extend fault propagation models by enabling the specification of time bounds and mode constraints on the propagation links. However, tfpgs do not consider degradation, and they do not support cyclic dependencies. Conversely, the pgfds formalism can easily be extended to support time bounds, failure probabilities, mode constraints, and constraints on propagation delays similar to those available in tfpgs (e.g., following [5]). Moreover, once the minimal cut sets of a pgfds are computed, the existing approach to computing the probability of overall failure [5] can be used almost unchanged.

Finally, xSAP [3] is a safety analysis platform that supports library-based fault models and the generation of safety artifacts for fully general behavioral models, e.g., it can generate fault trees and minimal cut sets for arbitrary transition systems [6]. Currently, xSAP does not support fds and degradation models.

#### **6.1 Detailed Comparison with Finite Degradation Models**

As outlined above, the formalism of Finite Degradation Models (fdms), introduced in [14], is closely related to our pgfds. Here, we describe fdms in further detail and show that pgfdss are a strict generalization of fdms, obtained by (i) allowing non-determinism in the propagation of failures, and (ii) allowing cyclic dependencies among the components.

Each fdm has *state variables*, which correspond to the sources of failures in the system, and *flow variables*, which correspond to the propagated consequences of these failures. Each flow variable has an associated *equation*, which prescribes the failure mode of the corresponding flow variable based on the failure modes of state variables and other flow variables. We assume that the failure modes of all state and flow variables are modeled by the fds <sup>D</sup> = (*FM* , <sup>≤</sup>, <sup>⊥</sup>).<sup>6</sup>

**Definition 9 (Finite Degradation Model** [14]**).** *Given an arbitrary* fds D = (*FM*, ≤, ⊥)*, a Finite Degradation Model (* fdm*) is a pair* M = (V = S ⊎ F, E)*, where*


We say that a flow variable W*m*+*i* *depends on* a variable v if the function φ*m*+*i* depends on v. An fdm is called *acyclic* if there are no cyclic dependencies among its flow variables, i.e., no flow variable transitively depends on itself. We stress that, in contrast to our definitions for pgfds, the original paper [14] only deals with acyclic fdms and does not provide the semantics and necessary definitions for cyclic fdms. We thus assume in the rest of the section that all fdms are acyclic.

An assignment σ : V → *FM* is called *admissible* if the failure modes assigned to the flow variables satisfy all the corresponding equations, i.e., σ(W*m*+*i*) = φ*m*+*i*(σ) for each 1 ≤ i ≤ n. The assumption of acyclicity of fdms, together with the fact that all equations are deterministic functions and not general relations, guarantees that in each admissible assignment, the failure modes of the flow variables are uniquely determined by the failure modes of the state variables. This defines a function [[M]](σ) = σ*M*, which maps each state variable assignment σ to its unique admissible extension σ*M* that assigns values to all variables. This is in stark contrast to pgfds, where a single initial state can give rise to multiple different propagation paths.
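For an acyclic fdm, the unique admissible extension can be computed by a single pass over the flow equations in topological order. The following toy sketch (our own, over a two-valued fds and hypothetical variable names) illustrates the function [[M]]:

```python
# Toy acyclic fdm over the Boolean fds {ok, failed}: state variables s1, s2,
# flow variables w1 (fails iff s1 or s2 fails) and r (fails iff w1 fails).
# Equations are listed in a topological order of the flow dependencies, so a
# single left-to-right pass yields the unique admissible extension sigma_M.
EQUATIONS = [
    ("w1", lambda env: "failed" if "failed" in (env["s1"], env["s2"]) else "ok"),
    ("r",  lambda env: env["w1"]),
]

def admissible_extension(sigma):
    env = dict(sigma)           # start from the state-variable assignment
    for flow, eq in EQUATIONS:  # acyclicity makes one pass sufficient
        env[flow] = eq(env)
    return env

ext = admissible_extension({"s1": "failed", "s2": "ok"})
assert ext["r"] == "failed"
# With the observer (r, {failed}), {s1 -> failed, s2 -> ok} is thus a cut set.
```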

The notion for fdms corresponding to our notion of *top level event* is that of an *observer*. An observer is a pair (R, U), where R is a flow variable and U ⊆ *FM* is a set of failure modes. Intuitively, the observer represents a set of dangerous failure modes of the given flow variable. A *cut set* is any assignment σ : S → *FM* of failure modes to state variables such that σ*M*(R) ∈ U.

The notion for fdms related to our notion of *monotonicity* is *coherence*. An observer is coherent if for all assignments σ, σ′ : S → *FM* such that σ is a cut set and σ ≤ σ′, the assignment σ′ is also a cut set.

Each fdm M can be translated to a pgfds S*<sup>M</sup>* such that the cut sets of M correspond to the cut sets of S*M*. Moreover, if the fdm M is coherent, the resulting pgfds S*<sup>M</sup>* is guaranteed to be subset-monotone. This enables efficient analysis of coherent fdms by our SMT-based technique. Intuitively, the pgfds S*<sup>M</sup>* has one component for each state variable of M and an additional component R for the observer flow variable R. The *next* function is defined in a way that the failure modes of all the components that correspond to state variables cannot

<sup>6</sup> Both fdms and our pgfds can be defined over multiple different fdss for different variables. Such generalization is straightforward, but it complicates the notation and the exposition significantly.


**Table 1.** Classes of pgfds and their traces that each of the compared tools can handle precisely.

change and that the component R can switch to a predefined set of failure modes if σ*M*(R) ∈ U. This is achieved by composing all equations for the flow variables. If local variables are used in the symbolic encoding<sup>7</sup>, the size of the result is guaranteed to be polynomial.

### **7 Experimental Evaluation**

To evaluate the performance and scalability of our approach, we have implemented the proposed algorithm MCS-enumeration in a simple Python tool that uses the solver MathSAT [7], which supports all the required functionalities that are described in Sect. 5.1. In this section, we refer to the tool as SMT-PGFPS.

For comparison, we have used Emmy [13], a tool based on decision diagrams for the enumeration of fds-minimal cut sets of fdms, and xSAP [3], a tool for safety assessment of arbitrary transition systems. Each of these tools supports only a subset of the capabilities of our approach, as summarized in Table 1.


<sup>7</sup> For example, let-expressions of form (let ((var definition) ...) body) in SMT-LIB.

For the comparison, we have created three sets of benchmarks:

**Scalable acyclic benchmarks** consisting of linear structures extended by a triple modular redundancy scheme. The basic architecture of these structures is parameterized by its size n and the system contains 6n components: 3n modules and 3n voters. These benchmarks use the fds W2F, which is a restriction of the fds W3F of Fig. <sup>1</sup> to failure modes {w, *fd*, *fu*}, with the ordering w < *fd* < *fu*.

Note that fds-minimal cut sets of these benchmarks cannot be enumerated by xSAP, as the benchmarks use a non-Boolean fds.

**Randomly generated systems with cycles over Boolean FDS** which share some structural properties with real-world systems. In particular, we generated random systems that have a similar distribution of in-degrees and out-degrees of the components as our proprietary systems, which we cannot disclose. We have generated 950 such systems of sizes ranging between 50 and 1000 components. We have used the Boolean fds for these benchmarks, so that they can be precisely analyzed also by xSAP.

Note that these benchmarks cannot be solved by Emmy, as they contain cyclic dependencies among the components.

**Randomly generated systems over W2F** which are created from the above-mentioned randomly generated systems by using the fds W2F instead of the Boolean one. Although this does not change the overall structure of the system, it makes the transition relation more complicated and significantly increases the number of minimal cut sets.

In the evaluation, we only used systems of size at most 400, as both the compared approaches timed out on the vast majority of larger systems.

Note that these benchmarks cannot be solved by Emmy, as they contain cyclic dependencies among the components. They can be solved by xSAP, but the generated cut sets are only subset-minimal with respect to fault variables, and not (in general) fds-minimal.

For the scalable benchmarks, we have generated encodings in the SMT format described in this paper and in the fds-ml format used by Emmy. For the randomly generated cyclic benchmarks, we have generated encodings in the SMT format and in the SMV format used by xSAP. The SMV encodings also include the assumption of *fault-persistence*. All the used benchmarks are *subset-monotone*, and therefore our SMT-based approach can be used to compute the set of minimal cut sets correctly.

We have used a wall-time limit of 30 min for each solver–benchmark pair. All experiments were performed on a Linux laptop with an Intel Core i7-8665U CPU and 32 GiB of RAM.

A comparison of SMT-PGFPS and Emmy on the scalable acyclic benchmarks can be seen in Table 2. It shows that Emmy times out already on systems of size 5, i.e., on systems with 30 components. On the other hand, our approach is able to scale to systems with three thousand components.

A comparison against the sequential approach of xSAP on cyclic benchmarks can be seen in Fig. 5. Figures 5a and 5b show that on random systems over



Boolean fdss, our approach significantly outperforms the sequential approach of xSAP. As the size of the system grows, the difference can be up to several orders of magnitude. Both xSAP and SMT-PGFPS compute exactly the same minimal cut sets. Hence, the dramatic difference in performance can be justified by the reduction to the combinational case, which prevents the unrolling of the transition relation by implicitly encoding the propagations in the total ordering(s) found by the SMT solver.

The performance difference on the systems over the fds W2F, shown in Figures 5c and 5d, is even more pronounced. This can be caused by two additional factors. First, the systems over the fds W2F have a more complicated transition relation, have more minimal cut sets, and are in general harder. Thus, the unrolling performed by xSAP is even more costly. Second, xSAP has to enumerate more cut sets, because it enumerates all subset-minimal cut sets and not only the fds-minimal ones. However, this cannot be the main source of the observed performance gap: on 35 of the 113 benchmarks on which both xSAP and SMT-PGFPS finished before the timeout, the numbers of cut sets are the same; on the remaining 78 benchmarks, xSAP enumerates on average 6% more cut sets and at most 62% more cut sets. In order to obtain fds-minimal cut sets from xSAP, the produced subset-minimal cut sets would have to be filtered or explicitly minimized, which would add yet another performance penalty for xSAP.

Overall, the SMT-based techniques presented in this paper yield a fundamental advancement with respect to the state of the art, both in terms of expressiveness as well as in terms of performance.

(a) Scatter plot of solving times over Boolean fds.


(b) Dependence of solving time on the number of components over Boolean fds.

(c) Scatter plot of solving times over fds W2F.

(d) Dependence of solving time on the number of components over fds W2F.

**Fig. 5.** Comparison of SMT-PGFPS and xSAP-IC3 on random cyclic systems.

# **8 Conclusions and Further Work**

We tackled the problem of supporting the Preliminary Safety Assessment phase of aircraft design. Specifically, we defined an expressive framework for modeling failure propagation over components with multiple levels of degradation, with non-determinism and cyclic dependencies. We presented a sequential semantics and proved that the problem can be tackled by means of minimal model enumeration in SMT. The framework is more expressive than the state of the art, and the proposed method outperforms the BDD-based techniques from [14] on acyclic benchmarks over generic fdss, and the model checking techniques of [6] on cyclic benchmarks.

In the future, we plan to introduce timing constraints and to analyze redundancy architectures. We will also investigate ways to relax the monotonicity and fault-persistence assumptions, in order to explore recovery mechanisms and to further extend the reach of our approach. We are also working on encoding the causality constraints in the frameworks of SAT modulo acyclicity [10] and ASP modulo acyclicity [4], which could improve the performance of our approach even further.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **ddSMT 2.0: Better Delta Debugging for the SMT-LIBv2 Language and Friends**

Gereon Kremer , Aina Niemetz(B) , and Mathias Preiner

Stanford University, Stanford, USA {gkremer,niemetz,preiner}@cs.stanford.edu

**Abstract.** Erroneous behavior of verification back ends such as SMT solvers requires effective and efficient techniques to identify, locate, and fix failures of any kind. Manual analysis of large real-world inputs usually becomes infeasible due to the complex nature of these tools. Delta debugging has emerged as a valuable technique to automatically reduce failure-inducing inputs while preserving the original erroneous behavior. We present ddSMT 2.0, the successor of the delta debugger ddSMT, the current de-facto standard delta debugger for the SMT-LIBv2 language. Our tool improves and extends the core concepts of ddSMT and extends input language support to the entire family of SMT-LIBv2 language dialects. In addition to its *ddmin*-based main minimization strategy, it implements an alternative, orthogonal strategy based on hierarchical input minimization. We combine both strategies into a hybrid strategy and show that ddSMT 2.0 significantly improves over ddSMT and other delta debugging tools for SMT-LIBv2 on real-world examples.

# **1 Introduction**

In recent years, a growing number of formal methods applications (e.g., [6,8]) rely on Satisfiability Modulo Theories (SMT) solvers as the back end. Current state-of-the-art SMT solvers are typically complex pieces of software, and debugging erroneous behavior requires effective and efficient techniques to analyze failure-inducing input with the purpose of identifying and locating the cause of the failure. Manual analysis of real-world problems that trigger a particular unwanted behavior is very often infeasible for large inputs, mainly due to the complex nature of these tools.

Erroneous behavior is never only triggered by a single unique input, but by a class of inputs that share a common trait. Extracting a *minimal working example*, i.e., an input that is *as small as possible* but still triggers the original faulty behavior, from such a class of inputs usually significantly decreases the time to identify and locate the cause of the failure. While ideally, the notion of size of an input directly correlates to the effort required to determine the failure cause, in practice this is hard to quantify. We instead use metrics such as file size, number of language constructs, and solver runtime until the failure occurs.

This work was supported in part by DARPA (award no. FA8650-18-2-7861) and ONR (award no. N68335-17-C-0558).

© The Author(s) 2021 A. Silva and K. R. M. Leino (Eds.): CAV 2021, LNCS 12760, pp. 231–242, 2021. https://doi.org/10.1007/978-3-030-81688-9_11

Finding such minimal working examples, however, is a problem of its own. Manual minimization is typically infeasible in practice, simply due to the large number of possible simplifications that may even depend on each other. Delta debugging techniques, on the other hand, provide automated means to minimize failure-inducing inputs. This typically entails first reading some input, applying a set of rules to simplify it, and then checking that the modified input still triggers the original behavior. Delta debugging in its simplest form [24] extracts a minimal working example by omitting parts of the input that are irrelevant for triggering the original faulty behavior. More input-language-specific tools perform additional simplifications to further minimize the input. All of these simplifications are typically performed until a fixed point is reached.
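The read-simplify-check loop sketched above can be written down as a small fixed-point iteration. The following is a minimal illustration only; `simplify` (a candidate generator) and `still_fails` (the behavior check) are placeholder names, not part of any actual tool:

```python
# Generic delta debugging skeleton: repeatedly try candidate simplifications
# and keep any candidate that still triggers the original faulty behavior,
# until no accepted simplification remains (a fixed point).
def delta_debug(input_, simplify, still_fails):
    changed = True
    while changed:                      # iterate to a fixed point
        changed = False
        for candidate in simplify(input_):
            if still_fails(candidate):  # original behavior preserved?
                input_ = candidate
                changed = True
                break
    return input_
```

For example, with `simplify` yielding all one-character deletions of a string and `still_fails` checking for the presence of `"x"`, the loop reduces `"axbxc"` to the minimal failing input `"x"`.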

For the design of a delta debugger, this process raises a number of questions: How does the debugging tool check for "same behavior" of a tool on some input? Which simplification rules should be employed and how should they be combined? To what (syntactic and semantic) degree should the delta debugger itself understand the input language? In this paper, we address these questions in the context of delta debugging for the SMT-LIBv2 language and its dialects with our delta debugger ddSMT 2.0, the successor of ddSMT [18]. In the following, we will refer to ddSMT 2.0 as ddSMTv2, and to its predecessor as ddSMTv1.

*Related Work.* Generic delta debugging tools that are agnostic to the input language can be surprisingly efficient for some use cases. For minimizing SMT-LIB input, however, their usefulness is usually rather modest. One such generic tool is linedd [4], which solely performs line-based simplifications. The first delta debugging tool specific to the SMT-LIB language was presented in [7] as deltaSMT and targeted SMT-LIBv1 [22]. Three years later, the SMT community adopted a new input language SMT-LIBv2. In 2013, an updated version of deltaSMT [10] extended the tool syntactically for SMT-LIBv2 compliance, but limited to the feature set of the SMT-LIBv1 language and without full SMT-LIBv2 support. Note that this updated version is not available anymore. In the same year, ddsexpr [5], a generic hierarchical delta debugger for S-expressions (and thus applicable to the SMT-LIB language family), and ddSMTv1 [18], a delta debugger specific to the SMT-LIBv2 language, were presented. The latter implements a variant of Zeller's *ddmin* algorithm [24] and is considered the current de-facto standard delta debugger in the SMT community. The only other delta debugging tool specific to the SMT-LIBv2 language we are aware of is delta [15], a hierarchical delta debugger shipped together with the SMT solver SMT-RAT [9]. A reimplementation of delta in Python is available as pyDelta at [14].

*Contributions.* In this paper, we present ddSMTv2, a delta debugging tool for the SMT-LIBv2 [2] language and its dialects. It supports the entirety of the SMT-LIBv2 standard as well as non-standardized extensions and derived formats such as the SyGuS input language [21]. Our tool is agnostic to future extensions of the standard in the sense that it does not require any modifications for basic support. It is easy to extend, and extensions will only be required for simplifications that are specific to new language features or a certain dialect of the SMT-LIBv2 language. In this sense it will also immediately support the SMT-LIBv3 [1] language, which is currently under development.

ddSMTv2 is the successor of the delta debugger ddSMTv1 [18] and incorporates, improves and extends its core concepts. It also implements an improved variant of the hierarchical approach of pyDelta as an alternative, orthogonal strategy, and allows combining these two strategies in a hybrid manner. ddSMTv2 is intended to overcome major weaknesses of ddSMTv1, which is limited to the SMT-LIBv2 language and does not support the full set of standardized background theories or language extensions, to the point where it is even unable to parse the input file. ddSMTv2 further extends the set of theory-specific simplifications over both ddSMTv1 and pyDelta, which allows it to exploit even more minimization opportunities.

ddSMTv2 is implemented in Python and can be installed via pip3 install ddsmt. Its documentation is available at [11], and its source code is available under version 3 of the GNU General Public License (GPLv3) at [13].

# **2 Detecting Failure-Inducing Inputs**

An SMT solver is a fully automated tool to determine the satisfiability of a first-order logic formula modulo some background theories and their combinations. For satisfiable inputs, SMT solvers optionally allow querying a model, whereas for unsatisfiable inputs, some optionally generate a proof of unsatisfiability. Additionally, SMT solvers usually provide a plethora of configuration options.

Within the SMT community, the notion of *failure* is generally defined as anything from abnormal termination or crashes (including segmentation faults and assertion failures), to performance regressions (one solver performs significantly worse on an input than a reference solver), unsoundness (answering sat instead of unsat and vice versa), incorrect models or incorrect proofs of unsatisfiability. In the following, we define a *failure-inducing input* to an SMT solver as an SMT-LIB input that triggers a failure. In particular, we do not consider options configured via command line as part of the input.

Strategies to determine if a minimized input still triggers the original faulty behavior typically differ depending on the kind of the failure. For *abnormal termination or crashes*, it is usually sufficient to compare the exit code of the solver call, optionally with additional comparisons of output on the standard output and error channels. For failures that generate error messages that include memory addresses, it is often useful to not compare the full output, but to only match against a specific phrase that occurs in the original error output.

By default, ddSMTv2 does exactly that: it determines if a simplified input has the same erroneous behavior as the original input by comparing the exit code and the output on the standard output and error channels for equality. Standard output and error output can optionally be ignored or matched against user-defined strings via command line options.
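Such a behavior check can be sketched as follows. This is our illustration only: the helper name `same_behavior`, its parameters, and the defaults are assumptions, not ddSMT 2.0's actual implementation or option names:

```python
import subprocess

def same_behavior(cmd, input_file, golden_exit, golden_out="", golden_err="",
                  match_out=None, ignore_output=False, timeout=60):
    """Does running `cmd` on `input_file` reproduce the original failure?

    By default, exit code and both output channels are compared for equality.
    If `match_out` is given, the outputs are instead searched for that phrase
    (useful when error messages contain memory addresses).
    """
    try:
        res = subprocess.run(cmd + [input_file], capture_output=True,
                             text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False  # a timeout never counts as "same behavior"
    if res.returncode != golden_exit:
        return False
    if ignore_output:
        return True
    if match_out is not None:
        return match_out in res.stdout or match_out in res.stderr
    return res.stdout == golden_out and res.stderr == golden_err
```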

*Performance Regressions* are more tricky and typically involve helper scripts that call two solver configurations with some time limit and return a specific exit code in case the performance regression is triggered. The delta debugger will then minimize the input based on this exit code. Inputs that trigger *unsoundness failures* can be dealt with in a similar way. For inputs that reveal performance regressions and unsound answers, ddSMTv2 provides easy-to-use wrapper scripts that can also be adapted to more specific use cases.
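A regression wrapper along these lines could look roughly as follows; the distinguished exit code, the commands and the time limit are illustrative assumptions of this sketch, not the wrapper scripts actually shipped with ddSMTv2:

```python
# Hypothetical wrapper for delta debugging a performance regression: exit with
# a distinguished code iff the candidate solver hits the time limit while the
# reference solver still finishes.
import subprocess
import sys
import time

REGRESSION_EXIT = 42  # the exit code the delta debugger is told to look for

def runtime(cmd, limit):
    """Return the solver's runtime in seconds, or None if it hits the limit."""
    start = time.monotonic()
    try:
        subprocess.run(cmd, capture_output=True, timeout=limit)
    except subprocess.TimeoutExpired:
        return None
    return time.monotonic() - start

def main(reference_cmd, slow_cmd, input_file, limit=10.0):
    ref = runtime(reference_cmd + [input_file], limit)
    slow = runtime(slow_cmd + [input_file], limit)
    # regression preserved: reference finishes, candidate does not
    sys.exit(REGRESSION_EXIT if ref is not None and slow is None else 0)
```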

*Incorrect models* and *incorrect proofs* are more involved since they typically require some checking mechanism to determine if a generated model or proof is incorrect. Most SMT solvers implement such mechanisms and will throw an assertion failure in debug mode when such a failure is detected. For cases that are not detected by the solver itself, external checking tools are required. Implementing such checks is considered out of scope for a debugging tool due to their complex nature.

# **3 Simplification Rules and Staged Simplification**

Historically, the set of simplification rules for delta debugging has generally been rather small, mainly limited to removing or reordering parts of the input. Adding *structural and semantic simplifications* on top of these basic transformations has proved successful for the SMT-LIB language, and greatly improves performance over language agnostic minimization techniques. The delta debuggers deltaSMT, delta and ddSMTv1 all support structural and semantic simplifications, albeit to a varying degree. Of these three, ddSMTv1 implements the largest set of language-specific simplifications. The SMT-LIB-agnostic delta debugger ddsexpr, on the other hand, performs structural simplifications only.

Additionally, it is beneficial to devise a strategy for *when* to apply *which kind* of simplification rules to *which part* of the input in order to avoid generating useless test cases. An example for a useless test case is when the declaration of a constant is removed before removing all occurrences of this constant. Such a test case is useless because it is almost guaranteed to fail due to a parse error in the solver instead of triggering the original faulty behavior. It is further beneficial to perform simplifications that promise larger overall reduction (e.g., removal of commands) early on, in order to reduce the burden of more local, theory-specific simplifications (e.g., replacing terms with default values of the same sort).

We require that applying a simplification rule indeed *simplifies* the input and that it is not possible to cycle between applications of simplification rules in order to ensure termination of the minimization procedure. Generally, we define *simplification* in terms of measuring the input size in bytes or in the number of S-expressions. We supplement this with specific syntactic and semantic properties, e.g., the number of variable binders in a quantified formula, or the degree of "sortedness" of children of an S-expression. Intuitively, we say that given an input A, a simplification rule yields a simpler input B if the constructs in B are simpler according to some metric specific to the rule, or if B is smaller than A in terms of size. As an example of such a metric, consider a simplification rule that replaces a value with another value. Such a transformation is only interpreted as simpler if the value to be replaced does not already fall into the class of simpler values, e.g., for integer values we define the set of simpler values as {0, 1}. Thus, replacing value 1234 with 0 is a simplification, but replacing 1 with 0 is not.
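The "simpler value" criterion for integer constants can be stated in a few lines. This is our illustration of the rule described above, not ddSMT 2.0's code:

```python
# A replacement counts as a simplification only if the old value is not
# already among the simplest values {0, 1} and the new value is.
SIMPLE_INT_VALUES = {0, 1}

def is_simplification(old_value, new_value):
    return (old_value not in SIMPLE_INT_VALUES
            and new_value in SIMPLE_INT_VALUES)
```

Hence replacing 1234 with 0 qualifies, while replacing 1 with 0 does not, which rules out cycling between equally simple values.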

In ddSMTv2, possible input simplifications are generated by so-called *mutators*, which implement simplification rules. They either perform small local changes to a given S-expression, or introduce global modifications on the input based on that S-expression. Each mutator implements a *filter* method, which checks if the mutator is applicable to the given S-expression. If this is the case, the mutator can be queried to suggest (a list of) possible local and global simplifications. Mutators are not required to be equivalence or satisfiability preserving. They may extract semantic information from the input when needed, e.g., to infer the sort of a term, to query the set of declared or defined symbols, to extract indices of indexed operators, and more. ddSMTv2 applies a considerably larger set of simplifications than ddSMTv1 and currently implements 48 mutators, which range from generic simplifications on S-expressions that require no understanding of SMT-LIB, to more theory-specific mutators that make full use of SMT-LIB semantics. Each of these mutators is enabled by default and can optionally be disabled. Extending ddSMTv2 with a new simplification boils down to implementing a filter method and methods to query local and/or global mutations in a new mutator class, and registering this class as an active mutator.
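A minimal mutator in this spirit is shown below. The method names `filter` and `mutations` mirror the description above, but the concrete interface and class name are our sketch, not ddSMT 2.0's API; S-expressions are modeled as nested tuples of strings:

```python
class EraseChild:
    """Suggest dropping each child of a non-atomic S-expression."""

    def filter(self, node):
        # applicable only to S-expressions that actually have children
        return isinstance(node, tuple) and len(node) > 0

    def mutations(self, node):
        # local simplifications: copies of `node` with one child removed
        return [node[:i] + node[i + 1:] for i in range(len(node))]
```

Registering such a class as an active mutator is then all that is needed to extend the set of simplifications.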

# **4 Parsing and Input Representation**

While the question about the syntactic and semantic degree of understanding of the input language may seem silly at first glance, it is indeed warranted and actually crucial for the overall design of the delta debugger. The two extreme cases are aiming at *full understanding* of the language, and *no understanding*, i.e., treating the input as a sequence of bytes. The trade-off at hand is mainly between the ability to easily devise *language compliant* simplifications, and the burden of infrastructure required for *parsing* and *representing* the input, which is an additional burden on *maintenance* in case the input language changes.

Both deltaSMT and ddSMTv1 aim at full understanding, while most of the others try for some intermediate level of abstraction, i.e., a level that does not require full understanding of the input language but allows for smarter simplifications than just manipulating bytes. The line-based delta debugger linedd minimizes input by removing lines, whereas ddsexpr is syntax-aware in the sense that it understands S-expressions, but without any SMT-LIBv2 specific semantics. Both delta and pyDelta extend understanding of S-expressions with some semantic properties, however, in the case of delta only to a very basic degree (it is, e.g., not even aware of sorts). Outside of the context of the SMT-LIBv2 language, applying an intermediate abstraction approach was successful for the original *ddmin* algorithm [24], which considers change sets (e.g., commits or individual hunks of a commit), and in [23], where the authors use local semantics of certain C++ constructs. Another example is presented in [16], which exploits the hierarchical structure of an input, independent of the concrete semantics.

Our main target language is SMT-LIBv2, which is a hierarchically structured language where, to cite the SMT-LIBv2 standard [2], "every expression [...] is a legal S-expression of Common Lisp". In contrast to ddSMTv1, in ddSMTv2 we aim for an intermediate level of abstraction to ease the burden on infrastructure and maintenance and choose to use S-expressions as the main representation of the input, just like ddsexpr does. However, additionally, we extract a comprehensive set of semantic properties to allow for SMT-LIBv2 specific and compliant simplifications. Language compliant transformations are a requirement for the specific use case of minimizing SMT-LIBv2 input to debug erroneous behavior of SMT solvers. This is mainly to avoid generating nonsensical test cases, i.e., test cases that an SMT solver will refuse to parse. Even when such test cases are refused immediately, the efficiency of our debugging tool suffers significantly if the overwhelming majority of generated test cases is nonsensical. Note that we explicitly do not disallow delta debugging non-compliant input.

ddSMTv2 features a simple S-expression parser and represents S-expressions as a lightweight wrapper around built-in Python tuples and strings. Semantic information is recovered in an ad-hoc manner after parsing. This allows for minimal infrastructure and maintenance overhead for input parsing and representation. The parser component of ddSMTv2 has less than 100 LOC, and the ad-hoc semantic analysis accounts for less than 400 LOC. Adding support for new versions, dialects or non-standardized extensions of the SMT-LIB language does not require any changes to the parser.
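A parser of this flavor fits in a handful of lines. The following is our own tiny version, not the one shipped with ddSMTv2, and it deliberately ignores string literals, quoted symbols and comments; tokens become strings and parenthesized groups become tuples:

```python
import re

def parse_sexpr(text):
    """Parse text into a list of top-level S-expressions (nested tuples)."""
    tokens = re.findall(r'\(|\)|[^\s()]+', text)
    stack = [[]]
    for tok in tokens:
        if tok == '(':
            stack.append([])          # open a new group
        elif tok == ')':
            group = tuple(stack.pop())
            stack[-1].append(group)   # close it into the enclosing group
        else:
            stack[-1].append(tok)     # plain token
    return stack[0]
```

For example, `parse_sexpr("(assert (= x 0))")` yields `[("assert", ("=", "x", "0"))]`.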

This is in stark contrast to deltaSMT and ddSMTv1, which both aim to get a full understanding of the input, with all its negative consequences: deltaSMT dedicates about 50% (more than 2000 LOC) of its Java code base and ddSMTv1 even over 80% (3000 LOC) of its Python code base to parsing and input representation. Note that the former targets SMT-LIBv1, whereas the latter provides full SMT-LIBv2 support for most of the standardized theories. In both tools, parsing is a disproportionate part of the code base and extending the tools to support new theories or language constructs usually requires extensive modifications to their input parsers. These modifications have significantly complicated or even inhibited the development of these tools in the past: adding support for the theory of floating-point arithmetic in ddSMTv1 required touching more than 1000 LOC; deltaSMT, on the other hand, has never seen full support of SMT-LIBv2 and fails to parse almost all inputs from our test set.

# **5 Delta Debugging Strategies**

Our delta debugger ddSMTv2 implements two minimization strategies which we call ddmin and hierarchical. These two can be combined into a third strategy called hybrid, which aims to utilize the best of both worlds. All three strategies use the same input representation and have access to the same pool of available mutators. However, they differ in *how* they apply mutators to simplify the input.


*Strategy ddmin.* Our ddmin strategy implements a variant of the minimization strategy of ddSMTv1 and tries to perform simplifications on multiple S-expressions in the input in parallel. Algorithm 1 shows the main loop of this strategy. For each active mutator *M*, the algorithm first collects all S-expressions in the input that can be simplified by *M* (Line 4). Simplifications are applied and checked in a fashion similar to Zeller's original *ddmin* algorithm [24]: the set of S-expressions *sexprs* is partitioned into subsets of size *size*; each S-expression *e* ∈ *subset* is substituted in *input* (Line 7) with a simplification suggested by *M*; the resulting simplified input *candidate* is then checked for whether it still triggers the original behavior (Line 8). Once all subsets of a given size are checked, *sexprs* is updated based on the current input and partitioned into smaller subsets. As soon as all subsets of size 1 have been checked, the algorithm repeats these steps with the next available mutator. The main loop of strategy ddmin is run until a fixed point is reached, i.e., the input cannot be further simplified. Strategy ddmin applies mutators in two stages. The first stage targets top-level S-expressions (e.g., specific kinds of SMT-LIB commands) until a fixed point is reached, in order to aggressively simplify the input before applying more expensive mutators in the second stage.
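One pass of this partitioning scheme can be sketched as follows, heavily simplified: `apply` and `check` stand in for substituting simplifications into the input and for the behavior test, and the re-collection of `sexprs` between sizes is only hinted at in a comment. This is our illustration, not Algorithm 1 verbatim:

```python
def ddmin_pass(sexprs, apply, check, input_):
    """Try simplifying subsets of `sexprs`, halving the subset size each round."""
    size = len(sexprs)
    while size >= 1:
        for i in range(0, len(sexprs), size):
            subset = sexprs[i:i + size]
            candidate = apply(input_, subset)  # substitute simplifications
            if check(candidate):               # still triggers the failure?
                input_ = candidate
        size //= 2
        # the real algorithm re-collects `sexprs` from the current input here
    return input_
```

As a toy instance, treating the input as a set of integers, `apply` as set removal and `check` as "element 3 is still present", the pass reduces `{0, ..., 7}` to `{3}`.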

**Algorithm 2:** Core simplification loop of hierarchical strategy

*Strategy hierarchical.* The main loop of the hierarchical strategy performs a simple breadth-first traversal of the S-expressions in the input, and applies all enabled mutators to every S-expression, as shown in Algorithm 2. Once a simplification is found (Line 7), all pending checks for the current S-expression are aborted and the breadth-first traversal continues with the simplified S-expression *sexpr* (Line 9). This process is repeated until a fixed point is reached, i.e., until no further simplifications are found for any S-expression. The main simplification loop (Line 3) is applied multiple times, with varying sets of mutators. In the initial stages, strategy hierarchical aims for aggressive minimization using only a small set of selected mutators, in the next-to-last stage it employs all but a few mutators that usually only have cosmetic impact, and in the last stage it includes all mutators. We observed that breadth-first traversal yields significantly better results than a depth-first traversal, most probably since it tends to favor simplifications on larger subtrees of the input.
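The breadth-first traversal underlying this strategy is the standard one; the sketch below (ours, not ddSMTv2's code, with mutator application elided) shows why it visits larger subtrees before their components:

```python
from collections import deque

def bfs_sexprs(root):
    """Yield S-expression nodes (nested tuples) level by level."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node                 # a mutator would be applied here
        if isinstance(node, tuple):
            queue.extend(node)     # children are visited after all siblings
```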

*Strategy hybrid.* This strategy combines strategies ddmin and hierarchical in a sequential portfolio manner. It first applies ddmin until a fixed point is reached, and then calls strategy hierarchical on the simplified input. We chose this order of strategies after observing in our experiments that ddmin is usually faster in simplifying input, while hierarchical often yields smaller inputs.

# **6 Experimental Evaluation**

We compare the different strategies implemented in ddSMTv2 against the existing delta debuggers ddsexpr, ddSMTv1, delta, linedd, and pyDelta. For this purpose, we compiled a set of SMT-LIB and SyGuS test cases from different sources. Every test case consists of an input file, a solver binary and command line configuration options for that binary. Our set of test cases includes those used in [18] and instances reported in bug reports of the SMT solvers Bitwuzla [19], CVC4 [3], Yices [12], and Z3 [17]. The test cases from [18] include issues encountered with development versions of the SMT solvers Boolector [20] and CVC4. Note that we excluded 9 test cases from this set because they did not trigger any faulty behavior on our experimental setup. In total, we collected 244 test cases consisting of inputs that trigger assertion failures, unexpected behavior or wrong solver answers. We performed all experiments on a cluster with Intel Xeon E5-2620v4 CPUs with 2.1 GHz and 128 GB memory and used a 1 h wall-clock time limit and 8 GB of memory for each delta debugger/test case pair. Table 1 summarizes the results on all 244 test cases.

**Table 1.** Results summarized over all 244 test cases.

A first immediate observation is the value of a simpler and more generic parser: ddSMTv1 fails to parse more than 20% of the inputs, mostly due to the lack of support for newer standard and non-standard SMT-LIBv2 constructs. Examples include the check-sat-assuming command, algebraic datatypes, some operators of the theory of strings, the SyGuS language extension, and the non-standardized extension to encode problems of separation logic. We also observe that each strategy of ddSMTv2 simplifies significantly more inputs than any other tool. The only inputs that could not be simplified by ddSMTv2 were already very small (83 and 98 bytes). Strategy hybrid achieves the smallest output on 168 test cases (more than two thirds) and an average reduction in file size by 77% (79% not counting timeouts), while only timing out on 6 test cases.

Some debuggers increase the input size (in bytes), indicated by positive reductions. Eliminating let binders or inlining function definitions frequently increases the size of the input. A positive reduction occurs if the debugger times out while performing such simplifications, or if it is unable to find viable simplifications after the input size increased. In rare individual cases, incorrect outputs were produced that did not trigger the issue under investigation. This happened because of the unchecked removal of unused variables (delta), incorrect handling of timeouts (linedd) and defective handling of quoted symbols (pyDelta).

The hybrid strategy performs significantly better than ddSMTv1, even on the set of instances that both can reduce without any timeout or error. On these commonly reduced instances (107), the results from hybrid are smaller in most cases (99), and on average smaller by about a third.

On inputs that both ddmin and hybrid reduce without timeout or error (238), the hybrid strategy produces smaller outputs on 125 cases and never generates larger results. On average, over all 238 inputs the outputs are about 5% smaller. This may seem marginal, but can make a big difference for users in practice.

Figures 1–2 show the direct comparison of ddmin, hierarchical, hybrid and ddSMTv1 in terms of output size and overall runtime as scatter plots, where a dot represents a test case and dots on the "T" lines correspond to timeouts. While strategy hierarchical tends to produce smaller output files, it is considerably slower than ddmin and runs into the time limit on 116 more test cases. As a result of this observation, we combined both strategies into the hybrid strategy, which first uses ddmin to quickly reduce the input before applying hierarchical to achieve maximum reduction. Comparing hybrid to the best of strategies ddmin and hierarchical, we see that hybrid usually achieves the smallest output and is only slower on test cases that are comparably fast to minimize. If the runtime of ddSMTv2 exceeds a few minutes, there is no discernible performance penalty.

**Fig. 1.** Output size (in % of original size).

**Fig. 2.** Overall runtime (in seconds).

In comparison to ddSMTv1, strategy hybrid obtains significantly smaller output files on almost all inputs while having a similar runtime on inputs where ddSMTv1 terminates within the given time limit.

All strategies allow using multiple worker processes to perform checks asynchronously. Though there is potential for significant runtime improvements, the current impact is rather limited. With 8 worker processes, hierarchical achieves on average a 2x speedup, and up to 6x speedup on a few instances. Both ddmin and hybrid, on the other hand, slow down on average (by 25% and 9%, respectively).

# **7 Conclusion**

We have presented ddSMTv2, a delta debugger for the SMT-LIBv2 language and its dialects. Our tool improves substantially over its predecessor ddSMTv1, which is the current de-facto standard in the SMT community for delta debugging SMT-LIB input. We have shown how a more generic parser approach not only lowers the maintenance overhead of the tool itself, but also makes the delta debugger more robust and easier to extend for future SMT-LIB extensions. Our experimental evaluation has shown that ddSMTv2 significantly outperforms existing delta debugging tools on a variety of real-world test cases from different SMT solvers. Further, our experiments suggest that combining different minimization strategies is beneficial in practice to quickly obtain small output files.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Learning Union of Integer Hypercubes with Queries (with Applications to Monadic Decomposition)**

Oliver Markgraf<sup>1</sup>(B), Daniel Stan<sup>1</sup>, and Anthony W. Lin<sup>1,2</sup>

<sup>1</sup> TU Kaiserslautern, Kaiserslautern, Germany {markgraf,stan,lin}@cs.uni-kl.de <sup>2</sup> Max Planck Institute for Software Systems, Kaiserslautern, Germany

**Abstract.** We study the problem of learning a finite union of integer (axis-aligned) hypercubes over the d-dimensional integer lattice, i.e., whose edges are parallel to the coordinate axes. This is a natural generalization of the classic problem in the computational learning theory of learning rectangles. We provide a learning algorithm with access to a minimally adequate teacher (i.e. membership and equivalence oracles) that solves this problem in polynomial time, for any fixed dimension d. Over a non-fixed dimension, the problem subsumes the problem of learning DNF boolean formulas, a central open problem in the field. We also provide extensions to handle infinite hypercubes in the union, and show how subset queries can improve the performance of the learning algorithm in practice. Our problem has a natural application to the problem of monadic decomposition of quantifier-free integer linear arithmetic formulas, which has been actively studied in recent years. In particular, a finite union of integer hypercubes corresponds to a finite disjunction of monadic predicates over integer linear arithmetic (without modulo constraints). Our experiments suggest that our learning algorithms substantially outperform the existing algorithms.

# **1 Introduction**

Suppose that we are interested in finding a formula ϕ(x̄) over some theory T (e.g. integer linear arithmetic) to "capture" a certain phenomenon, which in verification could be, for instance, an invariant that a program satisfies some safety property. The process of discovering ϕ can be captured by the notion of a *learning algorithm* by allowing certain types of queries as an interface to some teacher [3]. Most standard learning frameworks can be captured in this way. Here are some examples. Valiant's well-known notion of *PAC-learning* can be captured by an oracle that returns a new random sample from an unknown distribution. Angluin's well-known notion of *exact learning* [2,3] can be captured by an interaction with the so-called *minimally adequate teachers*, which can answer membership and equivalence queries. This has many applications in verification, e.g., verification of parameterized systems [10,20,23] and compositional verification [9]. Another learning framework that has become very popular in verification is CEGIS (Counterexample Guided Inductive Synthesis) [21,27], wherein a learning algorithm can ask equivalence queries, but expect various types of "constraint-like" counterexamples (e.g. implication counterexamples) to be returned by the teacher. This is of course in contrast to Angluin's exact learning setting, wherein the teacher may return only a positive/negative counterexample (a point in the symmetric difference of the target concept and the hypothesis).

This research was supported by the ERC Starting Grant 759969 (AV-SMP) and Max-Planck Fellowship.

In this paper, we study the problem of learning sets of points over the d-dimensional integer lattice that can be expressed as a *finite union of integer (axis-aligned, a.k.a. rectilinear) hypercubes*, i.e., whose edges are parallel to the coordinate axes. Such a concept class of course forms a strict subclass of sets of points that are definable by a formula ϕ(x1,...,x*d*) in the integer linear arithmetic (a.k.a. *semilinear sets*), which have been addressed in several papers including [1,17,28], whose PAC-learnability is as hard as PAC-learning boolean formulas in DNF [16]—a long-standing open problem in learning theory—when binary representations are permitted (even over dimension one [1]). That said, finite unions of integer hypercubes are a concept class that naturally arises in computer science. Below we mention a few examples.
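As a concrete illustration of this concept class (not of the paper's learning algorithm), membership in a finite union of axis-aligned integer hypercubes is straightforward to test; here each hypercube is given by its lower corner and a common edge length:

```python
def in_union(point, cubes):
    """Is `point` in the union of hypercubes?

    `cubes` is a list of (lower_corner, edge_length) pairs; a point lies in a
    hypercube iff every coordinate is within [lo, lo + edge] for that axis.
    """
    return any(all(lo <= x <= lo + edge for x, lo in zip(point, corner))
               for corner, edge in cubes)
```

For instance, with `cubes = [((0, 0), 2), ((10, 10), 1)]`, the point `(1, 2)` is in the union while `(5, 5)` is not.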

The problem of learning rectangles (2-cubes) and its generalization to d dimensions is a classic example in computational learning theory, e.g., see [16,22]. Maass and Turán [22] showed for example that d-dimensional rectilinear cubes can be learned in polynomial time with O(log n) queries, where the corners of the cubes are represented in binary. The authors posed as an open problem if one can learn a union of two (possibly overlapping) rectangles with only O(log n) equivalence queries. Chen [11] showed that this can be learned with 2 equivalence queries and O(d log n) membership queries. Later Chen and Ameur [12] showed that there is a polynomial-time algorithm using at most O(log² n) queries. The same paper left as an open problem if there is a polynomial-time exact learning algorithm that learns finite unions of rectilinear cubes over a fixed dimension d. In this paper, *we answer this in the positive*, and further show that this can be extended to allow *infinite rectilinear hypercubes*, which in turn allow interesting applications in formal verification, as we discuss below.

Finite unions of rectilinear cubes arise naturally in program analysis and verification. Here we mention two examples. First, solving games over a large game graph has benefited from constraint-based approaches, where winning regions can be succinctly represented and checked efficiently [6]. For example, the discretization of the Cinderella-Stepmother problem [6] admits winning regions that may be represented by a union of a small number of cubes. Secondly, verification algorithms benefit from optimization techniques like monadic decomposition [29], where the aim is to rewrite a given quantifier-free SMT formula φ(x_1, ..., x_n) into an equivalent boolean combination of monadic predicates ψ(x_i) in some special form, typically in DNF [5,7,15,19] or as an if-then-else formula [29], which can sometimes be exponentially smaller than the equivalent DNF representation. Veanes *et al.* [29] provided a generic semi-decision procedure for computing a monadic decomposition as an if-then-else formula, which works regardless of the base theory. The restriction of the problem to the quantifier-free theory of integer linear arithmetic (with and without extra modulo constraints) was studied in [15], where the problem was shown to be coNP-complete and a monadic decomposition can be exponentially large in general. For the subcase without modulo constraints, a monadic decomposition in DNF corresponds precisely to a finite union of (possibly infinite) rectilinear hypercubes, the subject of this paper. We describe below how oracles for membership and equivalence (as well as more powerful queries like subset queries) admit a fast implementation via an SMT solver, which enables our learning algorithms to be applied to compute such a monadic decomposition.

*Contributions.* We study the problem of learning finite unions of rectilinear hypercubes (over Z^d) in Angluin's exact learning framework with membership and equivalence queries [2,3]. Our result is a polynomial-time exact learning algorithm for finite unions of rectilinear hypercubes over Z^d for fixed d. This answers an open problem of [12]. As observed in [12], over non-fixed d this problem generalizes DNF, since each term can be seen as a hypercube over {0, 1}^d. That is, without fixing d, the problem is as hard as learning unrestricted DNF, a well-known major open problem in computational learning theory [4].

In view of applying our learning algorithm to the monadic decomposition problem [15,29] for quantifier-free integer linear arithmetic formulas, we consider two extensions. Firstly, we allow *infinite hypercubes*: over dimension one, for example, these include infinite intervals like [7, ∞), which corresponds to the formula x ≥ 7. Secondly, we observe that the *subset query* (i.e., checking whether the target concept includes a given finite union H of hypercubes) is not an expensive query when performing monadic decomposition: it corresponds to a single satisfiability check of a quantifier-free integer linear arithmetic formula, which can be handled easily by an SMT solver. Subset queries are one of the standard types of queries in Angluin's active learning framework, e.g., see [3]. For this reason, we provide an optimization of our learning algorithm by means of subset queries.

We implemented these learning algorithms (vanilla and with various optimizations, including subset queries and "unary/binary acceleration"), using Z3 [26] as the backend for answering equivalence and subset queries (each a satisfiability check of a quantifier-free formula). We performed a micro-benchmark to stress-test our algorithms against the generic monadic decomposition procedure of [29], which also uses Z3 as its backend, using various geometric objects over Z^d as benchmarks. Our experiments suggest that our algorithms substantially outperform the generic procedure.

*Organization.* Preliminaries are in Sect. 2. In Sect. 3, we present the *overshooting algorithm*, which witnesses polynomial learnability of finite unions of rectilinear cubes over a fixed dimension d with membership and equivalence queries. In Sect. 4, we provide two extensions: (1) how subset queries can speed up the overshooting algorithm, and (2) how the algorithm can be extended to handle infinite cubes. Applications to monadic decomposition and experiments are presented in Sect. 5. We conclude in Sect. 6.

We refer the reader to the technical report [25] for omitted proofs and to the artifact [24] for implementation and benchmark details.

# **2 Preliminaries**

We introduce below some common mathematical notation. N and Z are the sets of natural numbers and integers, respectively. For a, b ∈ Z, we write [a, b] = {i | a ≤ i ≤ b}. For any set X, we denote its power set by P(X) and its cardinality by |X| ∈ N ∪ {∞}. Given two sets A, B, the *symmetric difference* is written AΔB = (A\B) ∪ (B\A).

When analyzing the complexity of the presented algorithms, we assume a binary encoding for any number n ∈ Z that is part of the input of the considered algorithms, namely size(n) = 1 + ⌈log(|n| + 1)⌉, where log is the base-2 logarithm.
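For instance, this size function can be computed directly from the bit length of the integer; a small Python sketch (the function name is ours):

```python
def size(n: int) -> int:
    """size(n) = 1 + ceil(log2(|n| + 1)): the number of bits of |n|,
    plus one bit of overhead."""
    # abs(n).bit_length() equals ceil(log2(|n| + 1)) for every integer n
    return 1 + abs(n).bit_length()
```

For example, size(0) = 1 and size(1024) = 12 under this convention.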

*Hypercubes.* For a fixed *dimension* d ∈ N, we consider the *discrete lattice* Z^d. A *point* **v** ∈ Z^d is described by its coordinates **v**[k] for k ∈ [1, d]. Let **v**[k/α] denote the vector **v** where the k-th coordinate has been replaced by α ∈ Z. The notation **0**_d = (0, ..., 0) ∈ Z^d denotes the origin, or simply **0** when the dimension is clear from context. We use standard notation for component-wise addition and scalar multiplication. In particular, for α ∈ Z, **v** + α · **v**′ denotes the vector **w** ∈ Z^d such that for all i, **w**[i] = **v**[i] + α · **v**′[i]. For 1 ≤ i ≤ d, we write **e**_i for the i-th *elementary vector*, **e**_i = **0**[i/1]. We mostly use the standard *component-wise order* ≤ over vectors in Z^d: **v** ≤ **v**′ iff for all i, **v**[i] ≤ **v**′[i]. We finally define the size of a vector as the sum of the sizes of its components: size(**v**) = Σ_{i=1}^{d} size(**v**[i]), for any **v** ∈ Z^d.

Our main study focuses on *rectilinear hypercubes* (*cubes* for short), i.e., sets of points of the form C = Cube(v̲, v̄) = {**v** | v̲ ≤ **v** ≤ v̄} for some v̲ ≤ v̄ ∈ Z^d. The size of such a cube C is uniquely defined as size(C) = size(v̲) + size(v̄). In contrast, an arbitrary finite set X has no unique representation as a finite union of cubes, so we define its size as the size of its best representation:

$$\text{size}(X) = \min \left\{ \sum\_{i=1}^{n} \text{size}(\underline{\mathbf{v}}\_{i}) + \text{size}(\overline{\mathbf{v}}\_{i}) \; \middle| \; \exists n, \underline{\mathbf{v}}\_{1} \dots \overline{\mathbf{v}}\_{n} : X = \bigcup\_{i=1}^{n} \text{Cube}(\underline{\mathbf{v}}\_{i}, \overline{\mathbf{v}}\_{i}) \right\}$$

We adopt here a worst-case analysis approach: our later reasoning and complexity bounds are valid for any representation, and therefore in particular for the best representation.

*Learning Model.* We first recall some standard definitions from computational learning theory; for more, see [16]. Fix a countable *base set* D = ⋃_{i=1}^∞ D_i, where the sets D_i are pairwise disjoint. The problem of learning boolean formulas in DNF uses D_i = {0, 1}^i, i.e., the set of all binary sequences of length i, which can be thought of as the set of all assignments to a boolean function over x_1, ..., x_i. The learning problem in this paper uses D_i = Z^i. A *concept* X is simply a subset of D_i, for some i ∈ Z_{>0}. For example, when D_i = {0, 1}^i, a concept is simply a boolean function over x_1, ..., x_i. When we speak of a learning problem, we always have a fixed set of representations in mind. For example, when we speak of learning boolean formulas in DNF (Disjunctive Normal Form), the representation φ_X of a boolean function X has to be a formula over x_1, ..., x_i in DNF: X is a boolean function, whereas φ_X is a DNF formula representing X. Note that a concept may admit many possible representations. A *concept class* C = ⋃_{i=1}^∞ C_i is a set of concepts, where C_i ⊆ P(D_i). For example, C_i could be the set of boolean functions over the variables x_1, ..., x_i. When the set of representations for C is fixed (e.g., DNF for representing boolean functions), we define size(X) of a concept X to be the size of its smallest representation.
In this paper, we deal with the concept class C_d ⊆ P(Z^d) of sets of integer points that can be represented as a finite union of rectilinear hypercubes over Z^d. Earlier in this section we defined this concept, as well as the size of a representation. To avoid notational clutter, we often denote the concept class C_d by C, because our algorithm typically assumes that d is fixed.

In Angluin's active learning framework [2,3], the learner has access to oracles (a.k.a. teachers) that could provide hints about the target concept X to the learner. A *minimally adequate teacher* must be able to answer membership and equivalence queries.

**Definition 1 (M+EQ Oracles).** *Consider a target concept* X ∈ C_d *for some concept class* C = ⋃_{d=1}^∞ C_d*, and let* ⊤, ⊥ ∉ D *be two fresh symbols.*

*A* membership oracle *(*M*) for* X *is a function* Φ_X : D_d → {⊤, ⊥} *which outputs* ⊤ *iff the queried point belongs to* X*.*

*An* equivalence oracle *(*EQ*) for* X *is a function* Ψ_X : C → D_d ∪ {⊤} *such that for all hypotheses* H ∈ C*,* Ψ_X(H) ∈ (HΔX) ∪ {⊤} *and* Ψ_X(H) = ⊤ *implies* H = X*.*

Intuitively, an equivalence oracle tells, for any hypothesis H ∈ C, whether H = X. If yes, ⊤ is returned; if not, it provides a *counterexample*, namely a point in the symmetric difference. Angluin considered other types of queries in her framework as well, including subset/superset queries and difference queries (e.g., see her excellent survey [3]). We will use subset queries in Sect. 4.

A learning algorithm A is said to *learn* the concept class C = ⋃_{d=1}^∞ C_d if, given d as input and any unknown target concept X, it terminates and outputs a representation of X after a finite amount of interaction with the oracles. Assuming that the oracle always returns the shortest counterexamples, the running time of A is defined to be the number of steps (measured in d and size(X)) that A takes to output a representation of X. The complexity comp(d, size(X)) of A measures the number of steps taken in the worst case over all d and size(X); A runs in polynomial time if comp is a polynomial function. It remains a long-standing open problem in computational learning theory whether there is an efficient learning algorithm for boolean formulas represented in DNF; this is open in almost all major models, including exact learning and PAC (see [4]). For geometric concepts, including hypercubes and semilinear sets, the dimension d is sometimes considered a fixed parameter, e.g., see [1,12,17,22].

# **3 Minimally Adequate Teacher**

We first restrict our attention to the minimally adequate teacher setting, where only a membership and an equivalence oracle are provided, and we provide constructions for intermediate procedures that can be interpreted as oracles.

# **3.1 Corner Oracle**

At the heart of our learning algorithm is the concept of corners:

**Definition 2.** *Given a set of points* X ⊆ Z^d*, a* maximal corner *(resp.* minimal corner*) of* X *is a point* **v** ∈ X *that is maximal (resp. minimal) with respect to the component-wise ordering* ≤*. We write* $\overline{\mathrm{Corners}}(X)$ *and* $\underline{\mathrm{Corners}}(X)$ *for the sets of maximal and minimal corners, respectively, and write* Corners(X) = $\overline{\mathrm{Corners}}(X)$ ∪ $\underline{\mathrm{Corners}}(X)$*.*

Given a membership oracle for some X ∈ C containing **0**, Algorithm 1 returns *some* maximal corner of the given finite set. Intuitively, for each coordinate i, a doubling phase followed by a binary search is performed until a border of X is found. More precisely, we provide the following complexity analysis.

```
Algorithm 1. Binary search for a maximal corner, assuming 0 ∈ X
Require: 0 ∈ X; Φ_X a membership oracle for X
Ensure: Returned value is a maximal corner of X
  function findMaxCorner(Φ_X)
     i ← 0; v ← 0
     while i < d do
        i ← i + 1; k ← 1; l ← 1
        if Φ_X(v + e_i) then
           while Φ_X(v + k · e_i) do
              l ← k; k ← 2k
           while k − l > 1 do
              if Φ_X(v + ((k + l)/2) · e_i) then
                 l ← (k + l)/2
              else
                 k ← (k + l)/2
           v ← v + l · e_i; i ← 0
     return v
```
**Proposition 1.** *Let* Φ_X *be a membership oracle for* X = ⋃_{i=1}^{n} Cube(v̲_i, v̄_i) *and assume* **0** ∈ X*. Then* findMaxCorner(Φ_X) *terminates after* O(Σ_{j=1}^{n} size(v̄_j)) *queries and returns some* v̄ ∈ $\overline{\mathrm{Corners}}(X)$*.*
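Algorithm 1 transcribes almost directly into executable code. A minimal Python sketch, assuming a membership oracle given as a plain predicate on integer tuples (all names are ours):

```python
from typing import Callable, List, Tuple

Point = Tuple[int, ...]

def find_max_corner(phi: Callable[[Point], bool], d: int) -> Point:
    """Sketch of Algorithm 1: starting from the origin (assumed to lie
    in X), push one coordinate at a time as far as possible inside X via
    a doubling phase followed by a binary search; restart the scan after
    every successful move, so the returned point is maximal w.r.t. <=."""
    v: List[int] = [0] * d
    i = 0
    while i < d:
        i += 1
        a = i - 1  # 0-based axis index

        def member(k: int) -> bool:
            w = list(v)
            w[a] += k
            return phi(tuple(w))

        if member(1):
            k = l = 1
            while member(k):      # doubling phase: find a failing bound
                l, k = k, 2 * k
            while k - l > 1:      # invariant: member(l) holds, member(k) fails
                m = (k + l) // 2
                if member(m):
                    l = m
                else:
                    k = m
            v[a] += l
            i = 0                 # other coordinates may be extendable again
    return tuple(v)
```

For the union of the rectangles [0,5]×[0,3] and [0,9]×[0,1], this search returns the maximal corner (9, 1).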

This algorithm provides a partial implementation of the following oracle:

**Definition 3.** *Given* X ∈ C*, a* corner oracle *for* X *is any function* Θ_X : X → $\underline{\mathrm{Corners}}(X)$ × $\overline{\mathrm{Corners}}(X)$*.*

A complete implementation of this oracle is provided by noticing that membership oracles can easily be composed:

*Remark 1.* Assume Φ_A and Φ_B are two given membership oracles, for two arbitrary sets A and B respectively, and let f : Z^d → Z^d. One can build membership oracles for A ∪ B, A ∩ B, AΔB, A\B, and f(A), by combining the answers of Φ_A and Φ_B (and, for f(A), by querying pre-images under f).

For the transformations f built from a vector **v**_0 (e.g., a translation), notice that size(f(A)) ≤ size(A) + size(**v**_0) ≤ 2 · size(A).

In the sequel we write Φ_C for the membership oracle of any set C obtained by composing sets whose oracles are provided. We also assume that the two procedures findMaxCorner(**v**, Φ_X) and findMinCorner(**v**, Φ_X), which start the corner search from a given point **v**, have been constructed.
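The compositions of Remark 1 are immediate to realize when membership oracles are plain predicates; a sketch (function names are ours, with a translation as one concrete choice of f):

```python
from typing import Callable, Tuple

Point = Tuple[int, ...]
Oracle = Callable[[Point], bool]

def union(pa: Oracle, pb: Oracle) -> Oracle:
    return lambda v: pa(v) or pb(v)

def intersection(pa: Oracle, pb: Oracle) -> Oracle:
    return lambda v: pa(v) and pb(v)

def difference(pa: Oracle, pb: Oracle) -> Oracle:
    return lambda v: pa(v) and not pb(v)

def sym_diff(pa: Oracle, pb: Oracle) -> Oracle:
    return lambda v: pa(v) != pb(v)

def translate(pa: Oracle, v0: Point) -> Oracle:
    # Membership in f(A) for the translation f(v) = v + v0: query the
    # pre-image v - v0 against the oracle for A.
    return lambda v: pa(tuple(x - y for x, y in zip(v, v0)))
```

Each composed oracle answers one query with at most two queries to the underlying oracles, so compositions cost only a constant factor.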

#### **3.2 Overshooting Algorithm**


The core loop of the learning algorithm is presented in the LearnCubes function of Algorithm 2. The hypothesis is initially empty and is refined as long as a counterexample is returned. How do we refine the hypothesis given a counterexample? Two implementations of Refine are provided, namely RefineSym and RefineAddRemove, giving rise to two variants of the algorithm. In both cases, the refinement takes a counterexample as input and uses the corner oracle to build a cube C. In the former variant, the symmetric difference between the current hypothesis and C is taken, while in the latter, C is either added to or removed from the hypothesis.
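The loop just described can be sketched end-to-end in Python. This is a deliberately naive, self-contained rendition: hypotheses are explicit point sets, a greedy corner walk stands in for the binary-search corner oracles of Sect. 3.1, and the equivalence oracle enumerates a bounded grid; all names are ours.

```python
from itertools import product

def grow_cube(start, inside, lo=-16, hi=16):
    """Greedy corner search in an arbitrary region (a naive stand-in for
    the corner oracles): walk each axis down to a minimal corner, then up
    to a maximal one, restarting the scan after every move; the [lo, hi]
    bounds keep the walk finite."""
    d = len(start)

    def walk(v, step):
        v = list(v)
        moved = True
        while moved:
            moved = False
            for a in range(d):
                w = list(v)
                w[a] += step
                while lo <= w[a] <= hi and inside(tuple(w)):
                    v = list(w)
                    w[a] += step
                    moved = True
        return tuple(v)

    vmin = walk(start, -1)
    vmax = walk(vmin, +1)
    return set(product(*(range(vmin[a], vmax[a] + 1) for a in range(d))))

def refine_add_remove(H, ce, phi):
    """RefineAddRemove: grow a cube around the counterexample inside X\\H
    (positive ce) or H\\X (negative ce), then add or remove it wholesale;
    overshooting is possible and is repaired by later counterexamples."""
    if phi(ce):
        return H | grow_cube(ce, lambda v: phi(v) and v not in H)
    return H - grow_cube(ce, lambda v: v in H and not phi(v))

def learn_cubes(equiv, refine, phi):
    """Core loop of LearnCubes: refine the empty hypothesis on each
    counterexample until the equivalence oracle accepts (returns None)."""
    H = frozenset()
    while (ce := equiv(H)) is not None:
        H = refine(H, ce, phi)
    return H

def make_equiv(target, d, lo=-16, hi=16):
    """Naive equivalence oracle over a bounded grid, for demonstration."""
    def equiv(H):
        for p in product(range(lo, hi + 1), repeat=d):
            if (p in target) != (p in H):
                return p
        return None
    return equiv
```

On the two-overlapping-rectangles example used throughout this section, this sketch converges in a handful of refinements.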

**Fig. 1.** Possible run of the overshooting algorithm on two cubes in 2 dimensions

An example run of the RefineAddRemove variant is depicted in Fig. 1. While the top diagrams represent the search space used by the corner oracles, the bottom diagrams depict the resulting hypothesis after refinement. Initially, the hypothesis is empty (not represented), so the search space coincides with the target set X, which can be represented as a union of two overlapping cubes. A counterexample **v** ∈ X\H is therefore returned by the equivalence oracle. As **v** ∈ X, the refinement procedure adds some cube by searching the space X\H = X around **v**. An overly large cube is then added to the hypothesis, and a negative counterexample **v** ∈ H\X is returned. The search space is now H\X, and the algorithm aims at removing some smaller cube from the hypothesis. After two removals, the final hypothesis coincides with the target.

*Hypothesis Representation.* Both variants operate on the hypothesis by applying boolean operations. One may naturally wonder whether hypotheses represented by unions, symmetric differences, and differences of cubes can be handled by oracles operating on the concept class of finite unions of cubes. As a matter of fact, we will observe that HΔX, H\X, and X\H can all be represented in C:

**Lemma 1 (Cube intersection and subtraction).** *Let* C_1 = Cube(v̲_1, v̄_1) *and* C_2 = Cube(v̲_2, v̄_2) *be two cubes. Then* C_1 ∩ C_2 *is a cube, and* C_2\C_1 *can be written as a disjoint union of at most* 2d *cubes. Moreover, these computations are effective in* 2d *operations.*

Intuitively, subtracting a smaller cube from a larger one results in a family of cubes, one for each face of the larger cube; a cube in dimension d has 2d faces.
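Both operations of Lemma 1 can be sketched in a few lines of Python, representing a cube by its pair of corners (names are ours; the subtraction peels one axis-aligned slab per face, in the spirit of the lemma):

```python
from typing import List, Optional, Tuple

Cube = Tuple[Tuple[int, ...], Tuple[int, ...]]  # (min corner, max corner)

def intersect(c1: Cube, c2: Cube) -> Optional[Cube]:
    """Intersection of two rectilinear cubes: a cube, or None if empty."""
    lo = tuple(max(a, b) for a, b in zip(c1[0], c2[0]))
    hi = tuple(min(a, b) for a, b in zip(c1[1], c2[1]))
    return (lo, hi) if all(a <= b for a, b in zip(lo, hi)) else None

def subtract(c2: Cube, c1: Cube) -> List[Cube]:
    """C2 \\ C1 as a disjoint union of at most 2d cubes, one slab per
    face of the intersection."""
    inter = intersect(c1, c2)
    if inter is None:
        return [c2]
    lo, hi = list(c2[0]), list(c2[1])
    ilo, ihi = inter
    pieces: List[Cube] = []
    for a in range(len(lo)):
        if lo[a] < ilo[a]:  # slab of C2 strictly below the intersection on axis a
            cut = list(hi)
            cut[a] = ilo[a] - 1
            pieces.append((tuple(lo), tuple(cut)))
        if ihi[a] < hi[a]:  # slab strictly above the intersection on axis a
            cut = list(lo)
            cut[a] = ihi[a] + 1
            pieces.append((tuple(cut), tuple(hi)))
        lo[a], hi[a] = ilo[a], ihi[a]  # clamp axis a; later slabs are thinner
    return pieces
```

For example, subtracting the inner square [1,2]² from [0,4]² yields four disjoint cubes covering exactly the 21 remaining points.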

#### **3.3 Repetition-Free Complexity**

In order to analyze the complexity of both variants of the algorithm, we fix a finite target set X ∈ C_d and one of its representations as a union of cubes:

$$X = \bigcup\_{i=1}^{n} \text{Cube}(\underline{\mathbf{v}}\_i, \overline{\mathbf{v}}\_i)$$

We prove by induction on the iteration step that H can be expressed as a union of cubes, whose corners are aligned on a particular set of points:

**Definition 4 (Abstract grid).** *For* <sup>1</sup> <sup>≤</sup> <sup>k</sup> <sup>≤</sup> <sup>d</sup>*, we define the sets:*

$$\begin{aligned} \underline{B}\_k &= \{ \overline{\mathbf{v}}\_i[k] + 1 \mid 1 \le i \le n \} \cup \{ \underline{\mathbf{v}}\_i[k] \mid 1 \le i \le n \}, \\ \overline{B}\_k &= \{ \overline{\mathbf{v}}\_i[k] \mid 1 \le i \le n \} \cup \{ \underline{\mathbf{v}}\_i[k] - 1 \mid 1 \le i \le n \} \end{aligned}$$

*For any* A ⊆ Z^d*, we write* A ∈ B *whenever* A *is a finite union of cubes of the form* Cube(v̲, v̄) *such that for all* k*,* v̲[k] ∈ B̲_k *and* v̄[k] ∈ B̄_k*.*

Intuitively, B̲_k (resp. B̄_k) describes all the possible k-coordinates of minimal (resp. maximal) corners. A coordinate of a maximal corner, i.e., a constraint of the form x_k ≤ α, can become a coordinate of a minimal corner, i.e., a constraint of the form x_k ≥ α + 1, when taking the complement during a difference operation, and vice versa.

We observe that B is stable under union, intersection, and difference. In particular, the overshooting algorithms maintain H ∈ B, i.e., the hypothesis always has minimal (resp. maximal) corners that align with B̲_k (resp. B̄_k) on the k-th coordinate. Figure 2 provides an example of such points for a target made of the union of two cubes.

**Fig. 2.** Possible minimal and maximal corners for cubes appearing in the hypothesis, for a given target space

Since the sets B̲_k and B̄_k are of size at most 2n for every k, there are at most (2n)^{2d} possible cubes, which is polynomial for a fixed d. Assuming H ∈ B, Lemma 1 thus maintains a polynomial representation of the hypothesis throughout the algorithm until termination.

Although B is of polynomial size, proving H ∈ B is not sufficient to establish termination of the algorithm in polynomial time, especially if some cubes in B are added and removed several times. Consider for example Fig. 3, which depicts, through its successive hypotheses, a possible run of the algorithm on three aligned cubes:

**Fig. 3.** Possible run on three cubes where cube B is added twice to the hypothesis.

Cube B is added during the first step, but is later covered when the algorithm tries to learn A and overshoots. Another overshoot happens when trying to remove the space between A and B, which ends up removing all the space between A and C. Cube B then has to be learned a second time, after which the algorithm terminates.

To circumvent this issue, we propose an optimization that prevents visiting the same minimal corner v̲ twice. We base our reasoning on the following observations:

– If v̲ ∈ X, then v̲ should not later be removed from the hypothesis.

– If v̲ ∉ X, then v̲ should not later be added back to H.

Algorithm 3 introduces an optimized refinement procedure that keeps track of the already visited minimal corners. Although an analogous optimization can be applied to the symmetric-difference variant, we only discuss RefineAddRemove2 here.

Once a minimal corner v̲ for a candidate cube has been found, we continue the search for a maximal corner v̄ while avoiding points that would result in the removal (resp. addition) of already added (resp. removed) minimal corners.

```
Algorithm 3. Optimized refinement avoiding visited minimal corners

Let V ← ∅
function RefineAddRemove2(H, v_e, Φ_X)
   if Φ_X(v_e) then
      Let v̲ ← findMinCorner(v_e, Φ_{X\H})
      Let v̄ ← findMaxCorner(v̲, Φ_{X\H\{v | ∃v′ ∈ V : v̲ ≤ v′ ≤ v}})
      V ← V ∪ {v̲}
      return H ∪ Cube(v̲, v̄)
   else
      Let v̲ ← findMinCorner(v_e, Φ_{H\X})
      Let v̄ ← findMaxCorner(v̲, Φ_{H\X\{v | ∃v′ ∈ V : v̲ ≤ v′ ≤ v}})
      V ← V ∪ {v̲}
      return H\Cube(v̲, v̄)
```
Notice how only the maximal-corner search benefits from the optimization, by tracking minimal corners only. One could instead store the whole visited cubes in the set V; however, when a search for a maximal corner is carried out, the resulting cube intersects a previously visited cube as soon as the maximal corner crosses the minimal corner of the visited cube.

We exploit Remark 1 again to build each of the mentioned membership oracles. Since V is a finite set, one can indeed build a membership oracle for the set {**v** | ∃v′ ∈ V\X : v̲ ≤ v′ ≤ **v**}. Due to this exclusion region, a finer analysis has to be conducted to prove H ∈ B.

**Lemma 2.** *The two optimized variants maintain the following invariants:*

*1.* V ∩ X ⊆ H*;*
*2.* (V\X) ∩ H = ∅*;*
*3. for all* v̲ ∈ V *and any* k*,* v̲[k] ∈ B̲_k*;*
*4.* H ∈ B*.*

Properties 1 and 2 ensure that every v̲ added to V is never added twice. They also ensure correctness of the algorithm: remark that the search for a maximal corner is started not from the initial counterexample **v**_e but from v̲, which is indeed in the search space since v̲ ∉ {**v** | ∃v′ ∈ V : v̲ ≤ v′ ≤ **v**} (no point is added twice to V). Finally, property 3 ensures that only elements of the sets (B̲_k)_k are added to V, hence at most (2n)^d additions.

*Proof.* At the beginning of the algorithm, V = H = ∅, which satisfies all the given properties. We prove the result by induction on the iteration step:


For any k ∈ [1, d], by maximality of v̄ the point v̄ + **e**_k does not belong to the search space, so either:

– v̄ + **e**_k ∉ A, with A ∈ B, so v̄[k] ∈ B̄_k;

– or v̄ + **e**_k ∈ {**v** | ∃v′ ∈ V : v̲ ≤ v′ ≤ **v**}, but since v̄ itself is not in this set, there exists v′ ∈ V such that v̄[k] + 1 = v′[k]. Since v′[k] ∈ B̲_k, we have v̄[k] ∈ B̄_k.

This concludes the proof.

By combining Proposition 1 and Lemma 2, we summarize the complexity of our overshooting algorithms for a particular target X = ⋃_{i=1}^{n} Cube(v̲_i, v̄_i) ∈ C_d.

**Theorem 1 (M+EQ).** *Both variants of* LearnCubes *terminate in at most* (2n)^d *iterations, where an iteration requires:*


This algorithm terminates in polynomial time, for fixed d, for any representation of the target X. In particular, the result holds in the worst case, where the representation of X as a finite union of cubes is minimal. As a matter of fact, the presented exponential bound in d is tight: there exist a target X ∈ C and a pair of corner and equivalence oracles such that both algorithms take exponential time.

**Fig. 4.** Exponential blow-up, case d = 2

*Example 1.* Consider X = {**0**, Σ_{i=1}^{d} 2**e**_i}, composed of two (singleton) cubes. By first learning Cube(**0**, Σ_{i=1}^{d} 2**e**_i) and then removing every middle hyperplane of equation x_k = 1 for every k ∈ [1, d], the run reaches a hypothesis of 2^d points, leaving 2^d − 2 cubes to remove. An example with d = 2 is depicted in Fig. 4.

Whether finite unions of cubes can be learned in polynomial time in the dimension is left as an open problem, which we relate to learning DNF formulas over d variables, where each term can be interpreted as a cube over {0, 1}^d.

# **4 Extensions**

In this section we introduce extensions to the overshooting algorithm of Sect. 3.2. While membership and equivalence queries are sufficient for learning finite sets, one natural extension of the minimally adequate teacher setting is to introduce a subset oracle [3]:

**Definition 5 (Subset Oracle).** *Consider a target concept* X ∈ C_d *for some concept class* C = ⋃_{d=1}^∞ C_d*, and let* ⊤, ⊥ ∉ D *be two fresh symbols.*

*A* subset oracle *(*SUB*) for* X *is a function* ρ_X : C_d → {⊤, ⊥} *which outputs* ⊤ *iff* H ⊆ X *for the queried hypothesis* H*.*

The definition is similar to that of the membership oracle from Definition 1, except that the oracle takes a set instead of a single point as input.
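For intuition, a subset oracle for a target given as a finite union of cubes can be sketched as follows. This naive version enumerates the points of every hypothesis cube, which is only suitable for small demonstrations; the paper instead discharges a subset query as a single SMT satisfiability check (all names below are ours):

```python
from itertools import product
from typing import Callable, List, Tuple

Cube = Tuple[Tuple[int, ...], Tuple[int, ...]]  # (min corner, max corner)

def subset_oracle(target_cubes: List[Cube]) -> Callable[[List[Cube]], bool]:
    """Build a naive SUB oracle: rho(H) is True iff every point of every
    hypothesis cube lies in the union of the target cubes."""
    def member(p):
        return any(all(l <= x <= h for x, l, h in zip(p, lo, hi))
                   for lo, hi in target_cubes)

    def rho(hypothesis_cubes: List[Cube]) -> bool:
        for lo, hi in hypothesis_cubes:
            for p in product(*(range(a, b + 1) for a, b in zip(lo, hi))):
                if not member(p):
                    return False
        return True

    return rho
```

An SMT-backed implementation would instead check unsatisfiability of "point in H and not in X", which is exactly one quantifier-free linear-arithmetic query.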

#### **4.1 Maximal Cube Oracle**

In contrast to the overshooting algorithm, using a subset oracle avoids the overshooting issue altogether: we can now search for cubes included in the target X. To increase the convergence speed, we nonetheless introduce a maximality criterion on the suitable cubes:

**Definition 6 (Maximal Cubes).** *A cube* Cube(v̲, v̄) *is* maximal *w.r.t.* X *if* Cube(v̲, v̄) ⊆ X *and no cube* C′ *with* Cube(v̲, v̄) ⊊ C′ *satisfies* C′ ⊆ X*.*
Figure 5 provides examples of possible maximal cubes in dimension d = 2.

**Fig. 5.** Examples of maximal cubes w.r.t. a union of n cubes: (a) 4 maximal cubes when n = 2; (b) n(n + 1)/2 maximal cubes

Next, we modify the corner oracle from Sect. 3.1 to use subset queries. Again, we only define the algorithm finding a maximal corner; the minimal-corner algorithm can be implemented analogously. The algorithm first computes lower and upper bounds for the subsequent binary search, as shown in the function computeMaxBounds: given a cube defined by its minimal and maximal corners, the value of coordinate i is increased as long as the resulting cube is still a subset of the target set X. The upper bound b̄ is given by the first negative reply of the oracle, and the lower bound b̲ by the last positive reply. A binary search between b̲ and b̄ is then performed in the findMaxIncCorner function.
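The doubling-plus-binary-search scheme just described can be sketched in Python, taking the subset oracle as a predicate over (min corner, max corner) pairs (names adapted from the pseudocode):

```python
from typing import Callable, Tuple

Point = Tuple[int, ...]
Cube = Tuple[Point, Point]

def find_max_inc_corner(vmin: Point, vmax: Point,
                        rho: Callable[[Cube], bool]) -> Point:
    """Sketch of findMaxIncCorner/computeMaxBounds: grow the max corner
    one coordinate at a time, keeping Cube(vmin, vmax) ⊆ X throughout
    (the inclusion is assumed to hold on entry)."""
    vmax = list(vmax)
    for i in range(len(vmin)):

        def fits(m: int) -> bool:
            w = list(vmax)
            w[i] = m
            return rho((vmin, tuple(w)))

        # Doubling phase: compute (last success, first failure) bounds.
        delta = 1
        while fits(vmax[i] + delta):
            delta *= 2
        b_lo, b_hi = vmax[i] + delta // 2, vmax[i] + delta
        # Binary search: fits(b_lo) holds, fits(b_hi) fails.
        while b_hi - b_lo > 1:
            m = (b_lo + b_hi) // 2
            if fits(m):
                b_lo = m
            else:
                b_hi = m
        vmax[i] = b_lo
    return tuple(vmax)
```

Starting from the degenerate cube at (3, 2) inside the union of [0,5]×[0,3] and [3,8]×[2,6], this search returns the maximal corner (8, 6).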

#### **4.2 Maximal Cube Algorithm**

Algorithm 5 presents a procedure that iteratively refines the hypothesis: for each counterexample point, the algorithm searches for a maximal cube containing this point w.r.t. the target and adds it to the hypothesis. One can check that both procedure calls are valid, as H ⊆ X is an invariant. At every iteration the counterexample **v** satisfies **v** ∈ X\H. The use of the subset oracle ensures that the function findMaxIncCorner always returns a point v̄ such that Cube(**v**, v̄) ⊆ X. Similarly, the function findMinIncCorner always returns a corner v̲ such that Cube(v̲, v̄) ⊆ X. The resulting cube is then added to the hypothesis, ensuring that the point **v** is never visited again as a counterexample. This entails termination of the algorithm in at most |X| iterations of the main loop; a better bound will be established in Sect. 4.4.

```
Algorithm 4. Maximal corner of a maximal cube, in O(size(X)) subset queries

Ensure: Returned value is a maximal corner of a maximal cube w.r.t. X
  function findMaxIncCorner(v̲, v̄, ρ_X)
     for i ∈ [1, d] do
        (b̲, b̄) ← computeMaxBounds(v̲, v̄, i, ρ_X)
        while b̄ − b̲ > 1 do
           m ← (b̲ + b̄) ÷ 2
           if ρ_X(Cube(v̲, v̄[i/m])) then
              b̲ ← m
           else
              b̄ ← m
        v̄[i] ← b̲
     return v̄
  function computeMaxBounds(v̲, v̄, i, ρ_X)
     δ ← 1
     while ρ_X(Cube(v̲, v̄ + δ · e_i)) do
        δ ← 2 · δ
     return (v̄[i] + δ/2, v̄[i] + δ)
```
```
Algorithm 5. The maximal cube algorithm

function LearnMaxCube(ρ_X, Ψ_X)
   Let H ← ∅
   while (v ← Ψ_X(H)) ≠ ⊤ do
      Let v̄ ← findMaxIncCorner(v, v, ρ_X)
      Let v̲ ← findMinIncCorner(v, v̄, ρ_X)
      H ← H ∪ Cube(v̲, v̄)
```

#### **4.3 Extension to the Infinite Case**

We now discuss one possible extension to the infinite case, namely when cubes are possibly unbounded and may contain infinitely many points.

We adapt our learning formalism to deal with infinite bounds: for the remainder of the section we extend the discrete lattice Z*<sup>d</sup>* to (Z ∪ {−∞, +∞})*<sup>d</sup>* and extend ≤ trivially over the newly introduced points. For **v**, **v̄** ∈ (Z ∪ {−∞, +∞})*<sup>d</sup>*, the definition of C = Cube(**v**, **v̄**) remains unchanged; in particular C ⊆ Z*<sup>d</sup>* but may be infinite. The concept class C, hence the domain of the oracle functions, is augmented with all finite unions of cubes with (possibly) infinite bounds.

A possible approach to tackle this problem in the minimally adequate teacher (M+EQ) formalism consists in running the overshooting algorithm of Sect. 3 on the state space restricted to some cube of width 2*<sup>k</sup>* centered at **0**, gradually increasing k whenever counterexamples outside this restriction are found. This method is discussed in the extended version of the present article [25], but here we focus on a LearnMaxCube adaptation exploiting subset queries (SUB+EQ).

While Algorithm 5 remains unchanged, we nevertheless need to adjust the functions FindMaxIncCorner and FindMinIncCorner, as those are not able to accelerate the search to infinity. Algorithm 6 achieves this goal by simply overriding the ComputeMaxBounds and ComputeMinBounds subroutines to check for possible +∞ and −∞ bounds. Whenever such a bound is returned, no further binary search occurs for this coordinate (constant time).
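The override can be sketched as follows in Python; the names and the oracle interface are illustrative, and a brute-force predicate stands in for the actual subset oracle.

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, ...]

def compute_max_bounds(lo: Point, hi: Point, i: int,
                       subset: Callable[[Point, Point], bool]) -> Tuple[float, float]:
    """Sketch of the Algorithm 6 override for the infinite case.

    subset(lo, hi) answers rho_X(Cube(lo, hi)), where corners may now
    contain math.inf / -math.inf.  A single extra query detects an
    unbounded coordinate, in which case the binary search is skipped
    entirely (constant time for this coordinate)."""
    unbounded = list(hi)
    unbounded[i] = math.inf
    if subset(lo, tuple(unbounded)):
        return (math.inf, math.inf)  # the bound is +infinity
    # otherwise fall back to the finite doubling probe of Algorithm 4
    delta = 1
    probe = list(hi)
    while True:
        probe[i] = hi[i] + delta
        if not subset(lo, tuple(probe)):
            break
        delta *= 2
    return (hi[i] + delta // 2, hi[i] + delta)
```

The symmetric `compute_min_bounds` override for −∞ is analogous.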


#### **4.4 Complexity**

Termination of LearnMaxCube was proved using cardinality arguments in Sect. 4.1. These arguments obviously do not apply when the target set is infinite. Moreover, we are interested in a finer complexity analysis.

As in Sect. 3.3, we fix a target representation X = ⋃<sup>n</sup><sub>i=1</sub> Cube(**v**<sub>i</sub>, **v̄**<sub>i</sub>) ∈ C<sub>d</sub> and study the algorithm's complexity with respect to Σ<sup>n</sup><sub>i=1</sub> size(**v**<sub>i</sub>) + size(**v̄**<sub>i</sub>). As some of the vectors may contain infinite coordinates, we carefully specify size(+∞) = size(−∞) = 1 and keep the usual definition of size(v).

**Theorem 2 (SUB+EQ).** LearnMaxCube *terminates in at most* n<sup>2d</sup> *iterations, where each iteration requires one equivalence query and* O(size(X)) *subset queries.*

*Proof.* At every iteration, one equivalence query is performed, then FindMaxIncCorner and FindMinIncCorner perform a binary search, resulting in a linear number of subset queries (proof similar to Proposition 1).

In order to analyze the number of iterations of the main loop, let us first remark that each maximal cube is added only once: if we write **v**<sub>k</sub> for the k-th counterexample and C<sub>k</sub> for the learned maximal cube, then **v**<sub>k+1</sub> ∈ X \ ⋃<sup>k</sup><sub>i=1</sub> C<sub>i</sub> and **v**<sub>k+1</sub> ∈ C<sub>k+1</sub>, so C<sub>k+1</sub> ≠ C<sub>i</sub> for every i ∈ [1, k].

The number of iterations is therefore bounded by the number of maximal cubes, which we now bound. Let C = Cube(**v**, **v̄**) be a maximal cube w.r.t. X. For any k ∈ [1, d] there exist i, j ∈ [1, n] such that **v**[k] = **v**<sub>i</sub>[k] and **v̄**[k] = **v̄**<sub>j</sub>[k]; hence there are at most n<sup>2</sup> possibilities for coordinate k.

As in Theorem 1, the number of iterations is polynomial in the number of cubes n but exponential in the dimension d. As opposed to the LearnCubes algorithm, the bound is not tight, as the example of Fig. 5b provides only a quadratic number of maximal cubes. As the maximal cube concept can be related to the notion of *prime implicant*, examples of DNF formulas with an exponential number of prime implicants (see for example [8]) can be translated into unions of cubes with an exponential number of maximal 0–1 cubes.

From a practical perspective, one can nonetheless argue that LearnMaxCube is likely to perform well in practice, by avoiding the overshooting problem mentioned in Example 1, as H ⊆ X is an invariant. In fact, one can easily check that if there are no adjacent<sup>1</sup> cubes, the number of iterations becomes linear.

# **5 Applications and Experiments**

In this section, we describe an immediate application of our learning algorithms to monadic decomposition of quantifier-free Presburger formulas [15,29]. We then report on experimental comparisons between our algorithms and existing methods for the problem.

#### **5.1 Application to Monadic Decomposition**

Here we consider quantifier-free linear integer arithmetic formulas without modulo arithmetic:

$$
\varphi ::= \alpha_1 \sim \alpha_2 \mid \varphi \land \varphi \mid \varphi \lor \varphi,
$$

where ∼ ∈ {≤, ≥, =}, and α<sub>1</sub>, α<sub>2</sub> are integer linear combinations of the variables x<sub>1</sub>, ..., x<sub>n</sub>, i.e., α<sub>i</sub> is of the form c<sub>0</sub> + Σ<sup>n</sup><sub>j=1</sub> c<sub>j</sub>·x<sub>j</sub>, where each c<sub>j</sub> ∈ Z. The formula φ(x̄) is said to be *satisfiable* (written ⟨Z; +⟩ ⊨ φ) if there exists an assignment σ of x̄ to Z under which the formula becomes true. Of course, this is just a simple fragment of the first-order theory of integer linear arithmetic, and the notion ⟨Z; +⟩ ⊨ φ can be defined in the same way even with quantifiers [14,18]. A formula φ is said to be *monadic* if it has only one variable. Every monadic formula φ(x) in this fragment can easily be transformed into a union of integer intervals of the form: (1) l ≤ x ∧ x ≤ u where l, u ∈ Z, (2) l ≤ x where l ∈ Z, (3) x ≤ u where u ∈ Z, or (4) ⊥.
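As an illustration of this normalisation, the following Python sketch intersects the intervals induced by the atoms of a monadic conjunction; the tuple encoding of atoms is ours, not the paper's.

```python
import math
from typing import List, Optional, Tuple

def atom_to_interval(c1: int, c0: int, rel: str) -> Tuple[float, float]:
    """Integers x with c1*x + c0 rel 0, rel in {'<=', '>=', '='},
    returned as (lower, upper) with +-math.inf for a missing bound."""
    assert c1 != 0
    if rel == '=':
        if c0 % c1 == 0:
            v = -c0 // c1
            return (v, v)
        return (1, 0)  # empty interval marker
    # normalise to x <= u or x >= l, flipping the relation when c1 < 0
    if (rel == '<=') == (c1 > 0):
        return (-math.inf, math.floor(-c0 / c1))
    return (math.ceil(-c0 / c1), math.inf)

def conjunction_to_interval(atoms: List[Tuple[int, int, str]]
                            ) -> Optional[Tuple[float, float]]:
    """Intersect the atoms' intervals; None encodes case (4), i.e. bottom."""
    lo, hi = -math.inf, math.inf
    for c1, c0, rel in atoms:
        l, u = atom_to_interval(c1, c0, rel)
        lo, hi = max(lo, l), min(hi, u)
    return None if lo > hi else (lo, hi)
```

For example, 2x − 6 ≤ 0 together with x − 1 ≥ 0 normalises to the interval [1, 3] of case (1).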

A *monadic decomposition* [29] of a formula φ(x̄) is a boolean combination ψ(x̄) of monadic formulas that is equivalent to φ over the theory, i.e., ⟨Z; +⟩ ⊨ ∀x̄ (φ ↔ ψ). Of course, not all formulas admit a monadic decomposition (e.g., x = y). It was shown in [15] that deciding whether a formula in the theory is monadically decomposable is coNP-complete<sup>2</sup>. Veanes *et al.* [29] provide a generic semi-decision procedure for computing a monadic decomposition of a quantifier-free formula as an if-then-else formula, applicable to essentially all theories considered in SMT. Despite its genericity, the procedure performs rather well, as the authors' benchmarks in [29] show.

<sup>1</sup> Two cubes C<sub>1</sub> and C<sub>2</sub> are *adjacent* if min { Σ<sub>i</sub> |**v**<sub>1</sub>[i] − **v**<sub>2</sub>[i]| : **v**<sub>1</sub> ∈ C<sub>1</sub>, **v**<sub>2</sub> ∈ C<sub>2</sub> } ≤ 1.

<sup>2</sup> The proof in [15] uses modulo constraints to show that monadic decomposition of a two-variable formula φ(x, y) is coNP-complete. Modulo constraints could easily be removed by allowing more integer variables.

The application of our learning algorithms to computing monadic decompositions arises from the following observation. Since each monadic decomposition can be transformed into DNF, a monadic decomposition of a formula φ(x̄) over ⟨Z; +⟩ can be constructed as a finite union of (possibly infinite) hypercubes, where an infinite hypercube arises when a variable is either not bounded from above or not bounded from below (or both). Conversely, a finite union H of possibly infinite hypercubes can also easily be transformed into a boolean combination of monadic formulas φ<sub>H</sub>. For example, the formula (0 ≤ x ≤ 5 ∧ 3 ≤ y ≤ 10) ∨ (8 ≤ x) corresponds to the union of hypercubes Cube((0, 3), (5, 10)) ∪ Cube((8, −∞), (+∞, +∞)). Furthermore, all relevant oracles admit a straightforward implementation:


– An equivalence query on the hypothesis H can be reduced to checking the satisfiability of

$$
\langle \mathbb{Z}; + \rangle \models (\varphi_H \land \neg \varphi) \lor (\varphi \land \neg \varphi_H).
$$

This is a single satisfiability check of a quantifier-free integer linear arithmetic formula, for which highly optimized solvers exist (e.g., Z3 [26]).

– A subset query on H can similarly be reduced to checking

$$
\langle \mathbb{Z}; + \rangle \models (\varphi_H \wedge \neg \varphi).
$$

This is also a single satisfiability check over ⟨Z; +⟩.

This allows us to apply both of our learning algorithms to the problem.
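The reduction can be sketched as follows in Python; a brute-force search over a bounded grid stands in for the Z3 satisfiability call so that the sketch is self-contained (in the actual setting, `bounded_sat` would be a single solver query, and the names here are illustrative).

```python
from itertools import product
from typing import Callable, Tuple

Point = Tuple[int, ...]

def bounded_sat(f: Callable[[Point], bool], dim: int, bound: int) -> bool:
    """Illustrative stand-in for the Z3 call: look for a model of f on
    the finite grid [-bound, bound]^dim."""
    return any(f(p) for p in product(range(-bound, bound + 1), repeat=dim))

def equivalence_query(phi, phi_h, dim: int, bound: int = 10) -> bool:
    """EQ: phi_H is equivalent to phi iff the symmetric difference
    (phi_H & ~phi) | (phi & ~phi_H) is unsatisfiable."""
    return not bounded_sat(lambda p: phi_h(p) != phi(p), dim, bound)

def subset_query(phi, phi_h, dim: int, bound: int = 10) -> bool:
    """SUB: H is a subset of X iff phi_H & ~phi is unsatisfiable."""
    return not bounded_sat(lambda p: phi_h(p) and not phi(p), dim, bound)
```

On the running example, the hypothesis consisting of only the bounded cube is a subset of the target but not equivalent to it, since the unbounded cube 8 ≤ x is missing.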

Monadic decomposition has numerous applications including quantifier elimination [29], string solving [15], and symbolic finite automata/transducers [13,29], among others. In the following example we illustrate how our learning algorithm(s) could be applied to improving quantifier elimination for the theory of linear integer arithmetic.

*Example 2.* Consider a formula of the form ∀x̄ ∃ȳ φ(x̄, ȳ), where φ is a formula in linear integer arithmetic without modulo constraints. Suppose that φ is monadically decomposable and equivalent to the formula ⋁<sup>n</sup><sub>i=1</sub> D<sub>i</sub>(x̄, ȳ), where each

D<sub>i</sub> is a conjunction of monadic predicates over the variables x̄ ∪ ȳ. We assume w.l.o.g. that each D<sub>i</sub> is satisfiable. Then, this formula is equisatisfiable (over linear integer arithmetic) with ψ := ∀x̄ (⋁<sup>n</sup><sub>i=1</sub> D<sub>i</sub>(x̄, c̄<sub>i</sub>)), where the ȳ in D<sub>i</sub> are replaced by *fresh* constants c̄<sub>i</sub> (i.e., two distinct D<sub>i</sub>, D<sub>j</sub> use different constants). This can be proven by a simple application of skolemization, observing that each occurrence of f(x̄) in any disjunct is of the form a < f(x̄) < b, where a ∈ {−∞} ∪ Z and b ∈ Z ∪ {∞}, implying that f(x̄) can be replaced by a single constant, which does not depend on x̄. Finally, let D′<sub>i</sub> be the conjuncts in D<sub>i</sub> only involving variables in x̄. Checking that ψ is true reduces to checking the satisfiability of ⋀<sup>n</sup><sub>i=1</sub> ¬D′<sub>i</sub>: the formula ψ is true exactly when this conjunction is unsatisfiable.

To make this example concrete, consider the formula ∀x ∃y (x ≥ 0 → x + y ≥ 5 ∧ y ≥ 0). A monadic decomposition of the quantifier-free part is x < 0 ∨ ⋁<sup>5</sup><sub>i=0</sub> (x ≥ i ∧ y ≥ 5 − i). Therefore, checking the above formula can be reduced to the satisfiability of x ≥ 0 ∧ ⋀<sup>5</sup><sub>i=0</sub> x < i, which is not satisfiable.
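The final check of this concrete instance is small enough to verify mechanically; in the snippet below, an illustrative brute-force window takes the place of a solver query and confirms that the reduced formula has no model.

```python
def psi_check(x: int) -> bool:
    """The reduced formula from the concrete instance:
    x >= 0 together with x < i for every i in [0, 5]."""
    return x >= 0 and all(x < i for i in range(6))

# the conjunct for i = 0 forces x < 0, contradicting x >= 0, so the
# conjunction is unsatisfiable and the original forall/exists formula holds
```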

#### **5.2 Experiments**

In order to assess the performance of the overshooting and maximal cube algorithms respectively introduced in Sect. 3 and Sect. 4, we consider prototype implementations. The prototypes and experiments can be found in [24].

*Variants.* Although the methods were presented with binary search strategies in mind, we also implemented a more naive unary search procedure to obtain the corners. As noted later in the experiments, unary search may be preferred for very small cubes and performs especially well for cubes based on 0–1 integer programs, while binary search achieves better performance for larger cubes. Consequently, we introduce a third variant of the algorithm, called "optimized", combining unary search for small instances and binary search for large values. More precisely, two variants of the overshooting algorithm from Sect. 3 and three variants of the max cubes algorithm from Sect. 4 are presented, called respectively *overshoot unary*, *overshoot binary*, *max unary*, *max binary*, and *max optimized*.

*Tool Comparison.* Evaluation is performed against a generic monadic decomposition procedure *mondec*<sup>1</sup> from [29] by Veanes et al., which works over an arbitrary base theory and outputs an if-then-else formula, which can be exponentially more succinct than a formula in DNF. The algorithm, which builds on the python-Z3 framework [26], uses a decision-tree-style search heuristic to split the input into monadic predicates.

*Implementation.* Similarly to mondec<sup>1</sup>, our prototype is implemented in Python using the python-Z3 framework, but it is specialized to linear integer arithmetic formulas, and its output formulas are in DNF, unlike those of mondec<sup>1</sup>. For monadic decomposition applications, oracle queries are converted to appropriate Z3 satisfiability queries, since a (possibly non-monadic) representation of the target set is already known.

(Figure: (a) 50 overlapping cubes and the diagonal *x* + *y* = 50; (b) 100 big cubes.)

#### **5.3 Benchmark Suite**

Our benchmark suite is restricted to the problem of monadic decomposition of linear integer arithmetic, and its purpose is to stress-test our learning algorithms and mondec<sup>1</sup> under various kinds of "extreme conditions". The suite consists of six classes of monadically decomposable example formulas, constructed to test five features (see below). Note that the given formulas themselves might contain non-monadic predicates.

The five features (left to right in Table 1) represent the presence of (1) a large number of cube overlaps, (2) a large number of cubes, (3) a large cube, (4) a large dimension, and (5) an unbounded cube. We hypothesized that these five features play an important role in how fast the algorithms perform, which is indeed validated in our experimental results.

**Table 1.** Features of the conducted benchmarks. A "+" (resp. "−") indicates a high (resp. low) presence of a feature.

The six classes of formulas are elaborated below.


#### **5.4 Results**

Experiments were conducted on an AMD Ryzen 5 1600 six-core CPU with 16 GB of RAM running Windows 10. The results are summarized in Fig. 7, where each graph represents one benchmark comparing the run times of the algorithms.

**Fig. 7.** Benchmark results. The y-axis encodes the time in seconds; the timeout is set to 1800 s. (a) Benchmark on K Diagonal Restricted in Z²; the x-axis encodes the number of cubes *K*. (c) Benchmark on K Diagonal Unrestricted in Z²; the x-axis encodes the number of cubes *K*. (f) Benchmark on Example 2 in Z²; the x-axis encodes the parameter *K*.

The overshooting phenomenon can be observed in Fig. 7c and Fig. 7e with its quadratic shape, as d = 2. In Fig. 7b, the running time quickly diverges as d increases, as anticipated by Example 1.

When the considered cubes are small, as in Fig. 7a and Fig. 7c, the unary search algorithms outperform their binary counterparts, meaning the few additional queries made by the binary search are more costly than a direct enumeration. The optimized variant is therefore a good compromise in all cases.

Figure 7d depicts a benchmark with many large cubes for a fixed dimension. While the impact of the overshooting phenomenon remains contained, the maxcube unary search variant is particularly slow. This can be explained by the size of the cubes making unary search inefficient, combined with the already expensive cost of every single inclusion query.

The mondec<sup>1</sup> algorithm is comparable to the overshooting algorithms in Fig. 7e. It also performs particularly well in Fig. 7f, which we conjecture is due to the conciseness of the solution in if-then-else form used by mondec1.

Overall, the maxcube algorithm in its optimized form is the most stable algorithm for this benchmark set and should be preferred when an inclusion oracle is available. The extra cost of these queries is taken into account here and remains affordable when they are implemented as Z3 queries.

# **6 Conclusion and Future Work**

We have presented a polynomial-time algorithm in Angluin's exact learning framework, using membership and equivalence queries, for learning a finite union of rectilinear cubes over Z*<sup>d</sup>* for any fixed dimension d. By considering an additional subset oracle, learning possibly infinite cubes can be achieved with the same complexity, via a simpler learning algorithm that is faster in practice. The technique enables the introduction of auxiliary oracles, namely the corner (resp. maximal cube) oracle when a membership (resp. subset) oracle is provided. While oracles for subset queries tend to be difficult to implement, this turns out not to be the case for our proposed application of computing monadic decompositions of quantifier-free integer linear arithmetic formulas without modulo constraints, which is successfully solved by our algorithm.

We mention three future research directions. First, extensions to modulo operations could be explored, by encoding periodicity on d additional coordinates and providing adequate oracles on the encoded target. A second direction consists in applying these learning techniques to the verification of systems by learning invariants which are monadically decomposable in a small number of cubes. Lastly, one promising direction to further improve our algorithms is to investigate how to leverage if-then-else formula representations as used in mondec<sup>1</sup> [29], which could be exponentially more succinct than formulas in DNF.

**Acknowledgements.** The authors would like to thank the reviewers and program chairs for their thoughtful comments and suggestions to improve the presentation, as well as the Simons Institute for the Theory of Computing and Christoph Haase for fruitful discussions.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Interpolation and Model Checking for Nonlinear Arithmetic**

Dejan Jovanovi´c(B) and Bruno Dutertre

SRI International, Menlo Park, USA

**Abstract.** We present a new model-based interpolation procedure for satisfiability modulo theories (SMT). The procedure uses a new mode of interaction with the SMT solver that we call *solving modulo a model*. This either extends a given partial model into a full model for a set of assertions or returns an explanation (a model interpolant) when no solution exists. This mode of interaction fits well into the model-constructing satisfiability (MCSAT) framework of SMT. We use it to develop an interpolation procedure for any MCSAT-supported theory. In particular, this method leads to an effective interpolation procedure for nonlinear real arithmetic. We evaluate the new procedure by integrating it into a model checker and comparing it with state-of-the-art model-checking tools for nonlinear arithmetic.

**Keywords:** Satisfiability modulo theories · Craig interpolation · Nonlinear arithmetic

# **1 Introduction**

Craig interpolation is one of the central reasoning tools in modern verification algorithms. Verification techniques such as model checking rely on Craig interpolation [11,39] as a symbolic learning oracle that drives abstraction refinement and invariant inference. Interpolation has been studied for many fragments of first-order logic that are useful in practice, such as linear arithmetic [23], uninterpreted functions [9,37], arrays [25,38], and sets [32]. In these fragments, a typical interpolation procedure constructs interpolants by traversing the clausal proof of unsatisfiability provided by an SMT solver [26,34,41] while performing interpolation locally at proof nodes. A major missing piece in the class of fragments supported by interpolating SMT solvers is nonlinear arithmetic,<sup>1</sup> as the

<sup>1</sup> By nonlinear arithmetic we mean Boolean combination of arithmetic constraints over arbitrary-degree polynomials.

This material is based upon work supported by the Defense Advanced Research Project Agency (DARPA) and Space and Naval Warfare Systems Center, Pacific (SSC Pacific) under Contract No. N66001-18-C-4011, and the National Science Foundation (NSF) grant 1816936. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA, SSC Pacific, or the NSF.

c The Author(s) 2021

A. Silva and K. R. M. Leino (Eds.): CAV 2021, LNCS 12760, pp. 266–288, 2021. https://doi.org/10.1007/978-3-030-81688-9\_13

complex reasoning required for nonlinear arithmetic makes fine-grained symbolic proof generation extremely difficult.

We present an approach to interpolation that is driven by models rather than proofs. Given a pair of formulas A and B such that A ∧ B is unsatisfiable, an interpolant is a formula I that is implied by A and inconsistent with B. Recent model-based decision procedures, specifically those developed within the MCSAT [13,28] framework for SMT, are internally naturally interpolating. But, rather than interpolating two formulas, they provide a way to interpolate a set of constraints against a partial model. We capitalize on this internal ability, and extend it so that a formula A can be checked and interpolated against a partial model (*model interpolation*). This is closely related to the ability of modern SAT solvers to perform solving modulo assumptions [17], a technique that can also be used to provide interpolation capabilities in finite-state model checking [3].

We take advantage of model interpolation to build a formula-interpolation procedure through a simple idea: we can compute an interpolant of formulas A and B by iteratively interpolating (and refuting) all models of B with model interpolants from A. We develop the interpolation procedure within the MCSAT framework. This immediately allows us to generate interpolants for any theory supported by the framework. As MCSAT provides efficient complete solvers for nonlinear real arithmetic [27,29], we develop the first complete interpolation procedure for real nonlinear arithmetic.

To show that this new interpolation procedure is an effective tool that can be used on real-world problems, we integrate it into a model checker that uses interpolation for inferring k-inductive invariants. We evaluate this model checker on a set of industrial benchmarks. Our evaluation shows that the new procedure is highly effective, both in terms of speed, and the ability to support the model checker in its quest for counter-examples and invariants.

*Outline.* Section 2 gives background on SMT, interpolation, and nonlinear arithmetic. Section 3 presents solving modulo a model and model interpolation, and develops the general interpolation procedure. In Sect. 4, we discuss the particular needs of nonlinear arithmetic. In Sect. 5 we evaluate our implementation on nonlinear model-checking problems. We conclude in Sect. 6 and provide future research directions.

### **2 Background**

We assume that the reader is familiar with the usual notions and terminology of first-order logic and model theory (for an introduction see, e.g., [1]).

*Nonlinear Arithmetic.* As usual, we denote the ring of integers by Z and the field of real numbers by R. Given a vector of variables *x*, we denote the set of polynomials with integer coefficients and variables *x* by Z[*x*]. A polynomial f ∈ Z[*y*, x] is of the form

$$f(\boldsymbol{y}, x) = a_m \cdot x^{d_m} + a_{m-1} \cdot x^{d_{m-1}} + \cdots + a_1 \cdot x^{d_1} + a_0,$$

where 0 < d<sub>1</sub> < ··· < d<sub>m</sub>, and the coefficients a<sub>i</sub> are polynomials in Z[*y*] with a<sub>m</sub> ≠ 0. We call x the *top variable*, and the highest power d<sub>m</sub> is the *degree* of the polynomial f. As usual, we denote by f<sup>(k)</sup> the k-th derivative of f in its top variable. A number α ∈ R is a *root of the polynomial* f ∈ Z[x] if f(α) = 0.

A *polynomial constraint* C is a constraint of the form f ∇ 0, where f is a polynomial and ∇ ∈ {<, ≤, =, ≥, >}. If the polynomial f = f(x) is univariate, then we also say that C is univariate. An atom is either a polynomial constraint or a Boolean variable, and formulas are defined inductively with the usual Boolean connectives (∧, ∨, ¬). The symbols ⊤ and ⊥ denote true and false, respectively. In addition to the basic polynomial constraints, we will also be working with extended polynomial constraints. An *extended polynomial constraint* F is of the form x ∇<sub>r</sub> root(f, k, x), where f ∈ Z[*y*, x] and ∇<sub>r</sub> ∈ {<<sub>r</sub>, ≤<sub>r</sub>, =<sub>r</sub>, ≥<sub>r</sub>, ><sub>r</sub>}. The semantics of this predicate is the following: given an assignment that gives real values *a* to the variables *y*, the roots of f(*a*, x) can be ordered over R. If the polynomial f(*a*, x) has at least k real roots and α<sub>k</sub> is the k-th smallest root<sup>2</sup>, then the constraint is equivalent to x ∇ α<sub>k</sub>. Otherwise, the constraint evaluates to ⊥. For example, the constraint x <<sub>r</sub> root(x² − 2, 2, x) represents x < √2.
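A toy evaluator for the extended constraint can be sketched as follows; the naive scan-and-bisect root finder is purely illustrative, as an actual MCSAT implementation relies on exact real-root isolation for polynomials with rational coefficients.

```python
from typing import Callable, List

def real_roots(f: Callable[[float], float], lo: float, hi: float,
               steps: int = 20000) -> List[float]:
    """Naive real-root isolation: scan for sign changes, then bisect.

    Illustrative only; exact methods (e.g. Sturm sequences) are
    needed for a sound solver."""
    roots = []
    step = (hi - lo) / steps
    a = lo
    for _ in range(steps):
        b = a + step
        fa, fb = f(a), f(b)
        if fa == 0.0:
            roots.append(a)
        elif fa * fb < 0.0:
            x, y = a, b
            for _ in range(80):  # bisection refinement
                m = (x + y) / 2.0
                if f(x) * f(m) <= 0.0:
                    y = m
                else:
                    x = m
            roots.append((x + y) / 2.0)
        a = b
    return roots

def holds_lt_root(v: float, f: Callable[[float], float], k: int,
                  lo: float = -100.0, hi: float = 100.0) -> bool:
    """Semantics of the extended constraint  x <_r root(f, k, x)  at x = v:
    true iff f has at least k real roots (within the search window) and
    v is below the k-th smallest one; otherwise the constraint is bottom."""
    rs = sorted(real_roots(f, lo, hi))
    return len(rs) >= k and v < rs[k - 1]
```

For f = x² − 2 and k = 2, the constraint holds for v = 1 (since 1 < √2) and fails for v = 1.5; asking for a third root yields bottom.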

Given a formula F(*x*), we say that a type-consistent variable assignment M = {*x* ↦ *a*} satisfies F if the formula F evaluates to ⊤ in the standard semantics of Booleans and reals. We call M a model of F and denote this by M ⊨ F. If there is such a variable assignment, we say that F is *satisfiable*; otherwise it is *unsatisfiable*. If two models M<sub>1</sub> and M<sub>2</sub> agree on the values of their common variables, we denote the model that combines M<sub>1</sub> and M<sub>2</sub> by M<sub>1</sub> ∪ M<sub>2</sub>.

**Definition 1 (Craig interpolant).** *Given two formulas* A(*x*, *y*) *and* B(*y*, *z*) *such that* A ∧ B *is unsatisfiable, a* Craig interpolant *is a formula* I(*y*) *such that* A ⇒ I *and* I ⇒ ¬B*. We call the pair* (A, B) *an* interpolation problem*.*

*Model Checking.* A *state-transition system* is a pair S = ⟨I, T⟩, where I(*x*) is a state formula describing the initial states and T(*x*, *x*′) is a state-transition formula describing the system's evolution. Given a state formula P (*the property*), we want to determine whether all reachable states of S satisfy P. If this is the case, P is an *invariant* of S. If P is not invariant, there is a concrete trace of the system, called a *counter-example*, that reaches ¬P.

The direct way to prove that a property P is an invariant of S is to show that it is inductive. This requires showing that P holds in the initial states, I ⇒ P, and that it is preserved by transitions, P(*x*) ∧ T(*x*, *x*′) ⇒ P(*x*′). As most invariants are not inductive, a key problem in model checking is to find an *inductive strengthening* of P, that is, a property P′ such that P′ ⇒ P and P′ is inductive.

<sup>2</sup> For example, x² − 2 has two roots. The first root −√2 is the smaller of the two, and the second root is √2.

*Example 1 (Cauchy–Schwarz inequality).* We can frame the Cauchy–Schwarz inequality as a model-checking problem in nonlinear arithmetic. The inequality is the following

$$\Big(\sum_{i=1}^{n} x_i y_i\Big)^2 \le \Big(\sum_{i=1}^{n} x_i^2\Big)\Big(\sum_{i=1}^{n} y_i^2\Big). \tag{1}$$

As shown in [21], many inequalities that involve a discrete parameter (such as n above) can be converted to model-checking problems. For inequality (1), we construct the transition system S*cs* = I,T where

$$\begin{aligned} I &\equiv (S_1 = 0) \land (S_2 = 0) \land (S_3 = 0), \\ T &\equiv (S_1' = S_1 + xy) \land (S_2' = S_2 + x^2) \land (S_3' = S_3 + y^2). \end{aligned}$$

The variables S<sub>1</sub>, S<sub>2</sub>, S<sub>3</sub> correspond to the sums in (1), in order. The two variables x and y of S*cs* model the variables x<sub>i</sub> and y<sub>i</sub> from (1) in each iteration of S*cs*. Proving the inequality amounts to showing that the property P*cs* ≡ (S<sub>1</sub>² ≤ S<sub>2</sub>S<sub>3</sub>) is an invariant of S*cs*. Property P*cs* is not inductive on its own, but the property P′*cs* ≡ P*cs* ∧ (S<sub>2</sub> ≥ 0) ∧ (S<sub>3</sub> ≥ 0) is an inductive strengthening of P*cs*.
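The inductiveness of P′*cs* can be sanity-checked numerically: sample states satisfying the strengthened property, take one transition, and re-check it. This random sampling is of course no substitute for the symbolic proof, and all names below are illustrative.

```python
import random

def prop(s1: float, s2: float, s3: float) -> bool:
    """P'_cs: (S1^2 <= S2*S3) and S2 >= 0 and S3 >= 0."""
    return s1 * s1 <= s2 * s3 and s2 >= 0 and s3 >= 0

def step(s1: float, s2: float, s3: float, x: float, y: float):
    """One transition of S_cs: accumulate x*y, x^2 and y^2."""
    return s1 + x * y, s2 + x * x, s3 + y * y

def looks_inductive(trials: int = 10000, seed: int = 0) -> bool:
    """Randomly sample states satisfying P'_cs and check that one
    transition preserves it.  A sanity check, not a proof."""
    rng = random.Random(seed)
    for _ in range(trials):
        s2, s3 = rng.uniform(0, 10), rng.uniform(0, 10)
        s1 = rng.uniform(-1, 1) * (s2 * s3) ** 0.5  # forces S1^2 <= S2*S3
        x, y = rng.uniform(-5, 5), rng.uniform(-5, 5)
        if not prop(*step(s1, s2, s3, x, y)):
            return False
    return True
```

The initial state (0, 0, 0) also satisfies the strengthened property, matching the base case I ⇒ P′.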

Many modern model-checking techniques, specifically those based on SMT solving, use interpolation as a tool to automatically infer inductive invariants. In this context, an interpolant can be used to over-approximate a transition in the context of a spurious counter-example. In addition to interpolation, the recent class of techniques broadly termed *property-directed reachability* (PDR) (e.g., [24,30,33]), relies on *model generalization*, which converts a concrete counterexample state into a set of counter-examples.

**Definition 2 (Generalization).** *Given a formula* F(*x*, *y*) *such that* F *is true in a model* M*, we call a formula* G(*x*) *a* generalization of M *if* G(*x*) *is true in* M *and* G(*x*) ⇒ ∃*y* . F(*x*, *y*)*.*

A PDR model-checking procedure for nonlinear arithmetic requires both an interpolation and a generalization procedure.

# **3 SMT Modulo Models and Interpolation**

SMT solvers typically provide an API to assert formulas and to check the satisfiability of asserted formulas. We denote with solver::assert(F) the solver method that adds the formula F to the set of assertions to be checked by the solver. We denote with solver::check() the solver method for checking satisfiability, with the following contract.

solver::check(): Check satisfiability of the asserted formulas A and

1. if there is a model M such that M ⊨ A, return ⟨**sat**, M⟩;
2. otherwise return ⟨**unsat**⟩.
In this contract, the solver does not return any form of inconsistency certificate when the assertions are unsatisfiable.<sup>3</sup> We generalize the standard SMT satisfiability checking to *SMT modulo models* as follows.

solver::check(M0): Check satisfiability of the asserted formulas A and

1. if there is a model M ⊇ M<sub>0</sub> such that M ⊨ A, return ⟨**sat**, M, −⟩;
2. otherwise return ⟨**unsat**, ∅, I⟩ where A ⇒ I and M<sub>0</sub> ⊨ ¬I.

SMT modulo models allows one to check that a formula is satisfiable modulo a partial model M<sub>0</sub>, by seeking a solution that extends M<sub>0</sub>. If there is no such solution, the formula I returned as the certificate of unsatisfiability is a *model interpolant*: it is implied by the assertions and inconsistent with M<sub>0</sub> (i.e., I evaluates to ⊥ in the model M<sub>0</sub>). If we restrict ourselves to Boolean formulas, SMT modulo models reduces exactly to solving modulo assumptions [17] as used in the SAT community. Although this idea is not completely new, to the best of our knowledge this is the first time it has been used for interpolation in SMT.

#### **3.1 Interpolation**

Before diving into an approach that can support the above mode of satisfiability checking, we first show how model interpolation can be used to devise a general interpolation method.

**Algorithm 1:** interpolate(A, B)

```
 1 SA.assert(A) ;
 2 SB.assert(B) ;
 3 I ← ⊤ ;
 4 while true do
 5   rB, MB ← SB.check() ;
 6   if rB = unsat then
 7     return unsat, I
 8   rA, MA, IA ← SA.check(MB) ;
 9   if rA = sat then
10     return sat, MA ∪ MB
11   I ← I ∧ IA ;
12   SB.assert(IA)
```
Algorithm 1 shows the pseudocode of a procedure that checks satisfiability and interpolates two formulas A and B. The basic idea is simple: we enumerate

<sup>3</sup> Some solvers support proof generation. While proofs are fundamentally important, we are interested in certificates that can always be computed and are useful in supporting further analysis. For example, proof generation for nonlinear arithmetic is still a hard open problem.

models M*<sup>k</sup>* of the formula B, and refute each model M*<sup>k</sup>* with a model interpolant I*<sup>k</sup>* from A. If the process converges and returns **unsat**, we collect the model interpolants and construct the final interpolant I = ⋀*<sup>k</sup>* I*<sup>k</sup>*. Each interpolant I*<sup>k</sup>* is implied by A because it is a model interpolant, so A ⇒ I. Each model of B is refuted by some model interpolant I*<sup>k</sup>*, and so I ⇒ ¬B. On the other hand, if the process returns **sat**, the procedure has found a common model for A and B. The procedure above is model-driven and modular, in that it checks the formulas A and B independently, communicating only models (from B to A) and model interpolants (from A to B).
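For the Boolean case, where SMT modulo models coincides with solving modulo assumptions, Algorithm 1 can be sketched in a few lines of Python. The sketch below is illustrative only: it brute-forces satisfiability over a finite set of Boolean variables, and it uses the naive model interpolant (the clause refuting exactly the given partial model) rather than the generalizing interpolants discussed later; all function names and the formula encoding (constraints as Python predicates over assignment dictionaries) are our own, not part of yices2.

```python
from itertools import product

def check(constraints, variables, partial_model):
    """SMT modulo models, Boolean toy version: look for a total model over
    `variables` that extends `partial_model` and satisfies all constraints.
    On unsat, return the naive model interpolant: a predicate that refutes
    exactly the given partial model."""
    free = [v for v in variables if v not in partial_model]
    for values in product([False, True], repeat=len(free)):
        model = dict(partial_model, **dict(zip(free, values)))
        if all(c(model) for c in constraints):
            return 'sat', model, None
    interpolant = lambda m: any(m[v] != val for v, val in partial_model.items())
    return 'unsat', None, interpolant

def interpolate(A, B, shared, vars_a, vars_b):
    """Algorithm 1: enumerate models of B, refute each shared-variable model
    with a model interpolant from A, and conjoin the interpolants."""
    b_constraints = list(B)
    interpolants = []                           # final interpolant I = AND of these
    while True:
        r_b, m_b, _ = check(b_constraints, vars_b, {})
        if r_b == 'unsat':
            return 'unsat', interpolants
        r_a, m_a, i_a = check(list(A), vars_a, {v: m_b[v] for v in shared})
        if r_a == 'sat':
            return 'sat', dict(m_b, **m_a)      # common model of A and B
        interpolants.append(i_a)
        b_constraints.append(i_a)               # SB.assert(I_A)
```

For A ≡ x ∧ y and B ≡ ¬x ∧ z with shared variable x, the loop terminates after one refutation, and the collected interpolant is (the Boolean predicate for) x, the classic interpolant for this pair.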

**Lemma 1 (Correctness).** *If* interpolate(A, B) *returns* **unsat**, I *then* A ∧ B *is unsatisfiable and* I *is an interpolant for* (A, B)*. If* interpolate(A, B) *returns* **sat**, M *then* A ∧ B *is satisfiable and* M *is a model of both* A *and* B*.*

Note that Lemma 1 does not claim termination of the procedure. Termination depends on the ability of model interpolation to produce a finite number of model interpolants that can eliminate a potentially infinite number of models.

A naive approach to check a formula A(*x*, *y*) for satisfiability modulo a model M<sup>0</sup> = {*y* → *v*} is to use an interpolating SMT solver. First, encode the model into a formula F*<sup>M</sup>* ≡ ⋀*<sup>i</sup>* (y*<sup>i</sup>* = v*<sup>i</sup>*). If the formula A ∧ F*<sup>M</sup>* is satisfiable in a model M, then so is A, and M ⊇ M0. Otherwise, we compute the interpolant I of A and F*<sup>M</sup>*. This naive approach satisfies the requirements of solver::check(M0), but it is limited for the following reasons. First, theories such as nonlinear arithmetic have complex models and the formula F*<sup>M</sup>* can be hard to express. For example, x → √2 can only be expressed by extending the constraint language to support algebraic numbers, or by using additional assertions such as (x<sup>2</sup> = 2) ∧ (x > 0). More importantly, traditional interpolation provides no guarantees in terms of convergence of a sequence of interpolation problems. For example, as already noted in [42], ¬F*<sup>M</sup>* would be a valid interpolant for A and F*<sup>M</sup>*. But such an interpolant eliminates only a single model and could, in general, lead to nontermination of interpolate(A, B). To tackle this issue, we require that the procedure solver::check() produce interpolants general enough to disallow such infinite sequences of model interpolants. We do this by adopting the convergence approach and terminology of [42] for model interpolation, as follows.

**Definition 3 (Model Interpolation Sequence).** *Given a formula* A(*x*, *y*)*, a sequence of models* (M*<sup>k</sup>*) *of y, and a sequence of formulas* (I*<sup>k</sup>*) *over y, we call* (I*<sup>k</sup>*) *a* model interpolation sequence *for* A *and* (M*<sup>k</sup>*) *if for all* k *it holds that*


*3.* I*<sup>k</sup>* *is a model interpolant between* A *and* M*<sup>k</sup>*.

**Definition 4 (Finite Convergence).** *We say that* solver::check() *has the* finite convergence property *if it does not allow infinite model interpolation sequences.*

**Lemma 2 (Termination).** *If* solver::check() *has the finite convergence property, then* interpolate(A, B) *always terminates.*

#### **3.2 SMT Modulo Models with MCSAT**

We build a procedure for solving SMT modulo models by modifying the satisfiability checking procedure of MCSAT. The MCSAT method for SMT solving was introduced in [13,28] and further extended in [27]. We give a brief overview of the MCSAT terminology and mechanics, and we describe the satisfiability procedure. We emphasize modifications to the original MCSAT procedure that are needed for solving SMT modulo models.

The architecture of an MCSAT solver consists of a core solver, an assignment trail, and reasoning plugins. The *core solver* drives the overall solving process, and is responsible for dispatching notifications and handling requests from the plugins. The *solver trail* is a chronological record that tracks assignments of terms to values. It is shared by the core solver and the reasoning plugins. The *reasoning plugins* are modules dedicated to handling specific theory terms and constraints (e.g., clauses for Booleans, polynomial constraints for arithmetic). A plugin reasons about the content of the solver trail with respect to the set of currently relevant terms. In the context of nonlinear arithmetic problems, the reasoning plugins are the arithmetic plugin and the Boolean plugin. The most important role of the core solver is to perform conflict analysis when one of the reasoning plugins detects a conflicting state.

When formulas F<sup>1</sup>,...,F*<sup>n</sup>* are asserted, by calling solver::assert(F*<sup>i</sup>*), the core solver notifies all plugins of the asserted formulas. The plugins analyze the formulas and report all *relevant terms* back to the core. The relevant terms are the variables and subterms of the formulas F*<sup>i</sup>* that need to be consistently assigned to ensure a satisfying assignment. In nonlinear arithmetic, the relevant terms are all variables, arithmetic constraints, and non-negated Boolean terms that appear in the input formula (or are part of a learnt clause). Once the relevant terms are collected, the core solver adds the assertions to the trail. The initial trail then contains the partial assignment F*<sup>i</sup>* ↦ ⊤, and the search for a full satisfying assignment starts from this trail.

*Solver Trail and Evaluation.* The assignment trail is the central data structure in the MCSAT framework. It is a generalization of the Boolean assignment trail used in modern CDCL SAT solvers. The trail records a partial (and potentially inconsistent) model that assigns values to relevant terms. If the satisfiability algorithm terminates with a **sat** answer, the full satisfying assignment can be read off the trail. At any point during the search, the trail can be used to evaluate any relevant compound term based on the values of its sub-terms. A term t (and ¬t, if Boolean) *can be evaluated* in the trail M if t itself is assigned in M, or if all closest relevant sub-terms of t are assigned in M (and its value can therefore be computed). As the search progresses, it is possible for some terms to *be evaluated in two different ways*, which can result in a conflict (i.e., a term assigned different values). In order to account for this ambiguity, we define an evaluation predicate evaluates[M](t, v) that returns **true** if the term t can evaluate to the value v in trail M.
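As a toy illustration of the evaluation predicate (a simplification of our own, not the yices2 data structures), a trail can be modeled as a map from term names and variables to values, and evaluates[M](t, v) checks the two ways a term can obtain a value:

```python
def evaluates(trail, term, value):
    """evaluates[M](t, v): True if term t can evaluate to value v in trail M.
    A compound term is a triple (name, fn, subterms): it evaluates to v either
    because it is assigned v directly on the trail, or because all of its
    closest relevant sub-terms are assigned and computing it yields v."""
    name, fn, subterms = term
    if name in trail and trail[name] == value:
        return True                      # t itself is assigned in the trail
    if all(s in trail for s in subterms):
        return fn(*(trail[s] for s in subterms)) == value
    return False

def evaluation_consistent(trail, term):
    """A trail is evaluation consistent for t if t cannot evaluate to two
    different values (here: to both True and False)."""
    return not (evaluates(trail, term, True) and evaluates(trail, term, False))

# The constraint C ≡ (x² + y² < 1) over the relevant terms {C, x, y}:
C = ('C', lambda x, y: x * x + y * y < 1, ('x', 'y'))
```

With the trails of Example 2 below, the trail assigning C ↦ ⊤, x ↦ 0, y ↦ 0 is evaluation consistent, while the trail assigning C ↦ ⊤, x ↦ 1, y ↦ 0 is not, since C can evaluate to both ⊤ (directly assigned) and ⊥ (computed from x and y).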

**Algorithm 2:** mcsat::check(*x* → *v*)

```
Data: solver trail M, relevant variables/terms to assign in queue
 1 while true do
 2   unitPropagate() ;
 3   if a plugin detected a conflict and the conflict clause is C then
 4     C, final ← analyzeConflict(M, C, x) ;
 5     if final then
 6       I ← analyzeFinal(M, C) ;
 7       return unsat, I
 8     else backtrackWith(M, C) ;
 9   else
10     if exists xi ∈ x unassigned in M then
11       ownerOf(xi).decideValue(xi, vi)
12     else
13       if queue.empty() then return sat, M ;
14       x ← queue.pop() ;
15       if x is unassigned then ownerOf(x).decideValue(x) ;
```
*Conflicts and Conflict Clauses.* One of the main responsibilities of reasoning plugins is to ensure that the trail is consistent at any point in the search. A trail is *evaluation consistent* if no relevant term can evaluate to two different values, as described above. A trail is *unit consistent* if every relevant term can be given a value without making the trail evaluation inconsistent. If the trail is not evaluation consistent or unit consistent, the trail is *in conflict*.

Trail consistency is a generalization of the consistency that CDCL SAT solvers enforce during their search. By unit propagation, a SAT solver ensures that, if no conflict has been detected, no clause can be falsified by assigning a single variable (i.e., no clause evaluates to both ⊤ and ⊥). In the MCSAT framework, the plugins do the same: they keep track of unit constraints and reason about the consistency of the trail. It is the responsibility of the plugin to report conflicts. Each conflict must be accompanied by a *valid* conflict clause that explains the inconsistency.<sup>4</sup> A clause C ≡ (L<sup>1</sup> ∨ ... ∨ L*<sup>n</sup>*) is a *conflict clause* in a trail M if each literal L*<sup>i</sup>* can evaluate to ⊥ in M, i.e., if evaluates[M](L*<sup>i</sup>*, ⊥).

*Example 2.* Consider the constraint C ≡ (x<sup>2</sup> + y<sup>2</sup> < 1) with the set of relevant terms {C, x, y}, and the following solver trails

$$\begin{aligned} M_1 &= \llbracket\, C \mapsto \top,\ x \mapsto 0 \,\rrbracket, & M_2 &= \llbracket\, C \mapsto \top,\ x \mapsto 0,\ y \mapsto 0 \,\rrbracket, \\ M_3 &= \llbracket\, C \mapsto \top,\ x \mapsto 1 \,\rrbracket, & M_4 &= \llbracket\, C \mapsto \top,\ x \mapsto 1,\ y \mapsto 0 \,\rrbracket. \end{aligned}$$

The trails M<sup>1</sup> and M<sup>2</sup> are consistent, the trail M<sup>3</sup> is unit inconsistent (no consistent assignment for y exists), and M<sup>4</sup> is evaluation inconsistent (C evaluates to both ⊤ and ⊥). A valid explanation for the inconsistency of M<sup>3</sup> is the conflict

<sup>4</sup> By valid here we mean that the clause is a universally true statement on its own.

clause C<sup>3</sup> ≡ ¬C ∨ (x < 1), while a valid explanation for the inconsistency of M<sup>4</sup> is the conflict clause C<sup>4</sup> ≡ ¬C ∨ C. Although C<sup>4</sup> is a tautology, it is an acceptable conflict clause since both literals can evaluate to ⊥ (because evaluates[M<sup>4</sup>](C, ⊤) and evaluates[M<sup>4</sup>](C, ⊥)).

*Main Procedure.* The implementation of the satisfiability checking procedure solver::check() is a generalization of the search-and-resolve loop of modern SAT solvers (see, e.g., [16,17]). The procedure is shown in Algorithm 2, where we emphasize the extensions needed for SMT modulo models in red. The overall procedure performs a direct search for a satisfying assignment and terminates either by finding an assignment that extends the given partial model, or by deducing that the problem is unsatisfiable, as certified by an appropriate model interpolant.

The main elements of the procedure are unit propagation and decisions, used for constructing the assignment, and conflict analysis for repairing the trail when it becomes inconsistent. The unitPropagate() procedure invokes the propagation procedures provided by the plugins. Propagation allows each plugin to add new assignments to the top of the trail. If, during propagation, a plugin detects an inconsistency, it reports the conflict to the core solver along with a valid conflict clause. The decideValue(x) procedure assigns a value to the given unassigned term x. Decisions are performed only after propagation has fully saturated with no reported conflicts, which means that the trail is unit consistent. In such a trail, an assignment for x is guaranteed to exist, but the choice of a particular value is delegated to the plugin responsible for x (e.g., the arithmetic plugin for real-typed terms).

**Modification 1 (Decisions)***. To support SMT modulo a model x* → *v, variables* x*<sup>i</sup>* ∈ *x of the input model are decided before any other term, and are assigned the provided value* v*i. The procedure that performs this decision is denoted with* decideValue(x*i,* v*i*)*. If a decision introduces an evaluation inconsistency, the plugin reports the conflict with a conflict clause.*

Detecting and explaining decision conflicts is straightforward: there must exist a single constraint C that can evaluate to both ⊤ and ⊥ in the trail. Such conflicts can always be explained with a clause of the form (¬C ∨ C).

If a conflict is reported, either during propagation or in a decision, the procedure invokes the conflict analysis procedure analyzeConflict(). This procedure takes the reported conflict clause C and finds the root cause of the conflict. The analysis backtracks the trail, element by element, so long as C is a conflict clause, while resolving any trail propagations from C. Once done, the analysis returns the clause along with the flag that indicates whether this conflict clause C is empty (indicating the final conflict). If the conflict is not final, the procedure calls backtrackWith() to backtrack the trail further, if possible, and add a new assignment to the trail, ensuring progress and fixing the conflict. The main invariant of the conflict resolution procedure is that the *conflict clause* C *is always implied by asserted formulas*.

**Modification 2 (Conflict Analysis)***. To support SMT modulo a model x* → *v, the analysis procedure* analyzeConflict(M*,* C*, x*) *stops as soon as it encounters a variable* x*<sup>i</sup>* ∈ *x to resolve, and returns* C, **true***.*

This modification is based on the fact that the variables x*<sup>i</sup>* have a fixed value given by the model. Assume that conflict analysis attempts to undo a variable x*<sup>i</sup>* that is part of the provided model *x* → *v*. This can only happen when the trail consists of only variables from *x* and implications of asserted formulas. In other words, this particular conflict cannot be resolved unless we modify either the assertions themselves or the input model. The clause resulting from the analysis marked as final is our starting point for producing the model interpolant.

**Modification 3 (Final Analysis)***. To support SMT modulo a model x* → *v, the procedure* analyzeFinal(M*,* C) *resolves any remaining trail propagations in* M *from the clause* C *and returns the resulting clause* I*.*

The resolution of propagations in this final analysis is done in the same manner as in regular conflict analysis. This means that the resulting clause I is implied by the asserted formulas. In addition, resolving all propagations from the conflict clause ensures that all literals of I evaluate to false only because of the assignment *x* → *v*, making I an appropriate model interpolant.

*Example 3.* Consider two formulas F<sup>1</sup> ≡ b and F<sup>2</sup> ≡ ¬b ∨ (x<sup>2</sup> + y<sup>2</sup> < 2). When asserting these two formulas to the MCSAT solver, the Boolean and arithmetic plugins will identify the set of terms relevant for satisfiability as R = {b, x, y, (x<sup>2</sup> + y<sup>2</sup> < 2)}. Additionally, the assertions will be added to the trail and propagated,<sup>5</sup> resulting in the following initial trail

$$M_0 = \llbracket\, b \leadsto \top,\ F_2 \leadsto \top,\ (x^2 + y^2 < 2) \stackrel{F_2}{\leadsto} \top \,\rrbracket.$$

We now apply our procedure to solve F<sup>1</sup> and F<sup>2</sup> modulo the partial model {x → 2}.

In the first iteration, no term in R is unit (with only one variable unassigned), and propagation does not infer any new facts or conflicts. The procedure thus performs a decision on the unassigned variable x of the model, resulting in the trail

$$M_1 = \llbracket\, b \leadsto \top,\ F_2 \leadsto \top,\ (x^2 + y^2 < 2) \stackrel{F_2}{\leadsto} \top,\ x \mapsto 2 \,\rrbracket.$$

In the second iteration, as (x<sup>2</sup>+y<sup>2</sup> < 2) is unit in the trail M1, the arithmetic plugin examines the constraint and deduces that there is no potential solution for y. This constitutes a unit inconsistency that the plugin reports, along with the conflict clause<sup>6</sup>

$$C\_0 \equiv \neg(x^2 + y^2 < 2) \lor \neg(x > \sqrt{2}).$$

<sup>5</sup> Notation *t <sup>F</sup> v* denotes that *t* is assigned to *v* due to propagation, and *F* is the reason of the propagation.

<sup>6</sup> We use (*x >* <sup>√</sup>2) as a shorthand for the extended constraint *x ><sup>r</sup>* root(*x*<sup>2</sup> <sup>−</sup> <sup>2</sup>*,* <sup>2</sup>*, x*).

Conflict analysis takes the clause C<sup>0</sup> and starts the resolution process. As the top variable x on the trail M<sup>1</sup> is part of the input model, the analysis stops and reports the clause C<sup>0</sup> as the final explanation. This clause is valid, but not yet a model interpolant, as it contains a literal with the variable y. We then proceed with the final analysis to remove such literals. First, we resolve (x<sup>2</sup> + y<sup>2</sup> < 2) from C<sup>0</sup> using its reason clause F<sup>2</sup>, which gives the clause C<sup>1</sup> ≡ ¬b ∨ ¬(x > √2). Then, we resolve b from C<sup>1</sup> with an empty reason (b is an assertion), resulting in the final clause and model interpolant I = ¬(x > √2).

# **4 Nonlinear Arithmetic**

The general approach to interpolation presented so far is not specific to nonlinear arithmetic. We now tackle two practical issues that arise in nonlinear arithmetic, and we discuss the properties of our interpolation procedure in this context. First, on nonlinear problems, as seen in Example 3, the interpolation procedure can return model interpolants that include extended polynomial constraints. This is an artifact of the underlying decision procedure (such as NLSAT [29]), which might use extended polynomial constraints to succinctly represent conflict explanations. While such constraints make decision procedures more effective, they are undesirable for interpolation: interpolants should be described in the language of the input formulas, if possible. Second, to use the interpolation procedure in the context of model checking, we also need to devise a generalization procedure for polynomial constraints.

This section uses concepts from cylindrical algebraic decomposition (CAD). We keep the presentation example-driven and focused on our particular needs, and refer the reader to the existing literature for further information [2,5,7]. Cylindrical algebraic decomposition is a general approach for reasoning about polynomials, based on the following result due to Collins [10]. For any set of polynomials f<sup>1</sup>,...,f*<sup>k</sup>* ∈ Z[x<sup>1</sup>,...,x*<sup>n</sup>*], one can algorithmically decompose R*<sup>n</sup>* into connected regions (called cells) such that all the polynomials f*<sup>j</sup>* are sign-invariant in every cell C*<sup>i</sup>*. This means that the cells also maintain the truth value of any polynomial constraints over the polynomials f*<sup>i</sup>*, which is crucial in many reasoning techniques for polynomial constraints.

The theory and practice of CAD is heavily dependent on the ordering of variables involved. For this paper we always assume the CAD order to be the same as the order of the defined polynomials (e.g., x<sup>1</sup> < x<sup>2</sup> < ... < x*n*). Every CAD cell is cylindrical in nature, and can be described by constraints where every dimension of the cell (called a level) can be completely defined by relying only on the previous dimensions. We illustrate this through an example.

*Example 4.* Consider the polynomial f = x<sup>2</sup> + y<sup>2</sup> − 2 ∈ Z[x, y]. A CAD of f is depicted in Fig. 1 (left). The cell C<sup>1</sup> is defined by two constraints:

$$\begin{aligned} C_1^y &\equiv y >_r \mathsf{root}(x^2 + y^2 - 2, 2, y), \\ C_1^x &\equiv x >_r \mathsf{root}(x^2 - 2, 1, x) \land x <_r \mathsf{root}(x^2 - 2, 2, x). \end{aligned}$$

**Fig. 1.** CAD of the polynomial *<sup>f</sup>* <sup>=</sup> *<sup>x</sup>*<sup>2</sup> <sup>+</sup> *<sup>y</sup>*<sup>2</sup> <sup>−</sup> 2 from Example <sup>4</sup> (left). Computed cell capturing the model (1*,* 2) of Example 5 (right).

Constraint C*<sup>x</sup>*<sup>1</sup> is at the first level (it is a constraint on x only), while constraint C*<sup>y</sup>*<sup>1</sup> is at the second level and relates variables x and y. The full cell description is then C<sup>1</sup> ≡ C*<sup>x</sup>*<sup>1</sup> ∧ C*<sup>y</sup>*<sup>1</sup>. The green cell C<sup>2</sup> can be described by C*<sup>y</sup>*<sup>2</sup> ≡ ⊤ and C*<sup>x</sup>*<sup>2</sup> ≡ x >*<sup>r</sup>* root(x<sup>2</sup> − 2, 2, x), with the full description C<sup>2</sup> ≡ C*<sup>y</sup>*<sup>2</sup> ∧ C*<sup>x</sup>*<sup>2</sup>.

Model-based decision procedures such as NLSAT rely on CAD construction but do not construct the complete CAD decomposition. Instead, given a point in R*<sup>n</sup>* they can construct a single cell of a CAD in a model-driven fashion. For more information about this approach, we refer the reader to [4,27]. For our purposes we abstract the cell construction, and denote with describeCell(F,M) the function that, given a set of polynomials F, returns a description of a CAD cell of F that contains the model M.

Following the terminology used in CAD, we say that a non-empty connected subset of R*<sup>k</sup>* is a *region*. A set of polynomials {f<sup>1</sup>,...,f*<sup>s</sup>*} ⊂ Z[*y*, x], with *y* = y<sup>1</sup>,...,y*<sup>n</sup>*, is said to be *delineable* in a region S ⊆ R*<sup>n</sup>* if for every f*<sup>i</sup>* (and f*<sup>j</sup>*) from the set, the following properties are invariant for any *α* ∈ S:


Delineability has important consequences on the number and arrangement of real roots of polynomials f*i*. As explained by the following theorem, if a set of polynomials F is delineable on a region S, then the number of real roots of the polynomials does not change on S. Moreover, these roots maintain their relative order on the whole of S.

**Theorem 1 (Corollary 8.6.5 of** [40]**).** *Let* F *be a set of polynomials in* Z[*y*, x]*, delineable in a region* S ⊂ R*<sup>n</sup>*. *Then, the real roots of* F *vary continuously over* S*, while maintaining their order.*

For a polynomial f ∈ Z[*x*] and a model M = {*x* → *v*}, we denote with sgncstr(f, M) the polynomial constraint that matches the sign of f in M, i.e.

$$\mathsf{sgncstr}(f,M) = \begin{cases} f < 0 & \text{if } \mathsf{sgn}(f(\mathfrak{v})) < 0 \\ f > 0 & \text{if } \mathsf{sgn}(f(\mathfrak{v})) > 0 \\ f = 0 & \text{if } \mathsf{sgn}(f(\mathfrak{v})) = 0 \end{cases}$$
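As a sketch of this definition (representing a polynomial simply as a Python callable over an assignment dictionary, a toy encoding of our own, not the libpoly representation):

```python
def sgncstr(f, model):
    """sgncstr(f, M): the basic polynomial constraint (f < 0, f > 0, or f = 0)
    matching the sign of polynomial f in model M, returned as a relation
    symbol together with the corresponding predicate."""
    value = f(model)
    rel = '<' if value < 0 else ('>' if value > 0 else '=')
    ops = {'<': lambda m: f(m) < 0,
           '>': lambda m: f(m) > 0,
           '=': lambda m: f(m) == 0}
    return rel, ops[rel]
```

For f = x² + y² − 2 and the model {x ↦ 1, y ↦ 2}, where f evaluates to 3, this yields the constraint f > 0, i.e., (x² + y² > 2).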

As described above, a CAD cell can be succinctly described by relying on extended polynomial constraints. We now show that the description of the cell can be reduced to basic polynomial constraints.

**Lemma 3.** *Let* f<sup>1</sup>, f<sup>2</sup> ∈ Z[y<sup>1</sup>,...,y*<sup>n</sup>*, x] *be two polynomials of degrees* m*<sup>i</sup>*, *and let* F*<sup>i</sup>* ≡ x ⋈*<sup>r</sup>* root(f*<sup>i</sup>*, k*<sup>i</sup>*, x) *be extended polynomial constraints of a cell description. Let* S *be a region of* R*<sup>n</sup>* *where* {f<sup>1</sup>, f<sup>2</sup>} *are delineable, and let* M = {*y* → *v*, x → α} *be a model such that* *v* ∈ S*. Then, for all* *y* ∈ S *it holds that*

$$\bigwedge\_{i=0}^{m\_1-1} \mathsf{sgn} \mathsf{cstr}(f\_1^{(i)}, M) \wedge \bigwedge\_{i=0}^{m\_2-1} \mathsf{sgn} \mathsf{cstr}(f\_2^{(i)}, M) \Rightarrow F\_1 \wedge F\_2.$$

The proof of this lemma is relatively straightforward. The CAD cell description for level x represents an entry in the sign table of f<sup>1</sup> and f<sup>2</sup> (with no roots in between). The part of this sign table entry that contains M can be described by the signs of all the derivatives of f<sup>1</sup> and f<sup>2</sup>, as long as we can guarantee that neither the arrangement nor the number of roots of f<sup>1</sup> and f<sup>2</sup> changes. But this is guaranteed by f<sup>1</sup> and f<sup>2</sup> being delineable on S, so the lemma holds.

As a corollary to this lemma, in the context of CAD cell construction around a model M, we can replace any extended constraints describing a cell C with basic constraints stating that the signs of the polynomial derivatives are the same as in M. This results in a valid CAD subcell C′ ⊆ C for the same polynomials that still contains the model M. We denote with describeCellBasic(F, M) the function that constructs a basic CAD cell description of a set of polynomials F capturing the model M.

*Example 5.* Based on Example 4, we can construct a cell around the model M = {x → 1, y → 2}. The function describeCellBasic(F, M) will return the constraints

$$\begin{aligned} C_3^y &\equiv (x^2 + y^2 > 2) \land (y > 0), \\ C_3^x &\equiv (x^2 < 2) \land (x > 0). \end{aligned}$$

The full cell description is then C<sup>3</sup> ≡ C*<sup>x</sup>*<sup>3</sup> ∧ C*<sup>y</sup>*<sup>3</sup>. Note that this cell is smaller than the cell C<sup>1</sup> from Example 4. This reduction in size is generally undesirable, but it is the price to pay for having the description in a simpler language.
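For a single univariate level, the substitution behind Lemma 3 can be sketched as follows (a toy illustration with dense coefficient lists, our own encoding rather than the libpoly implementation):

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial given as a coefficient list
    (coeffs[i] is the coefficient of x**i)."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

def poly_derivative(coeffs):
    """Formal derivative of a polynomial in coefficient-list form."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def basic_cell_level(coeffs, x0):
    """Describe the cell level around x0 by the signs of f and its proper
    derivatives f', ..., f^(m-1), as in Lemma 3: the conjunction of these
    sign constraints implies the extended root constraints of the cell
    (on a region where f is delineable)."""
    constraints, f = [], list(coeffs)
    while len(f) > 1:                          # f^(0) .. f^(m-1)
        v = poly_eval(f, x0)
        rel = '<' if v < 0 else ('>' if v > 0 else '=')
        constraints.append((f, rel))           # meaning: f(x) rel 0
        f = poly_derivative(f)
    return constraints
```

For f = x² − 2 (coefficients [−2, 0, 1]) around x₀ = 1, this produces (x² − 2 < 0) ∧ (2x > 0), i.e., the description C₃ˣ ≡ (x² < 2) ∧ (x > 0) of Example 5.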

*Interpolation Without Extended Constraints.* We now show how the cell construction described above can be used to remove extended polynomial constraints from a model interpolant. Assume a clausal model interpolant

$$I = (L\_1 \lor \dots \lor L\_i \lor \dots \lor L\_N)$$

that is implied by the formula A and refutes a model M = {*x* → *v*}, i.e., all literals of I evaluate to ⊥ in M. Assume also that some literal L*<sup>i</sup>* contains an extended polynomial constraint x*<sup>n</sup>* ⋈*<sup>r</sup>* root(f, k, x*<sup>n</sup>*), with f ∈ Z[*x*]. We aim to replace the extended literal L*<sup>i</sup>* with literals over basic polynomial constraints. To do so, we need to find literals L<sup>1</sup>*<sup>i</sup>*,...,L*<sup>m</sup>**<sup>i</sup>* such that L*<sup>i</sup>* ⇒ (L<sup>1</sup>*<sup>i</sup>* ∨ ... ∨ L*<sup>m</sup>**<sup>i</sup>*) and all literals L*<sup>j</sup>**<sup>i</sup>* evaluate to ⊥ in M. Then, the clause

$$I' = (L\_1 \lor \dots \lor L\_i^1 \lor \dots \lor L\_i^m \lor \dots \lor L\_N)$$

will also be a model interpolant implied by A that refutes the model M.

We can construct the literals L*<sup>j</sup>**<sup>i</sup>* using single cell construction as follows. We create a description of the CAD cell of the polynomial f from L*<sup>i</sup>* that captures the model M. Let describeCellBasic({f}, M) = D<sup>1</sup> ∧ ... ∧ D*<sup>m</sup>* be this description. Since the cell fully captures the behavior of f around M, we know that D<sup>1</sup> ∧ ... ∧ D*<sup>m</sup>* ⇒ ¬L*<sup>i</sup>* and all literals D*<sup>j</sup>* evaluate to ⊤ in M. Therefore, we can use the cell description to eliminate the extended literal L*<sup>i</sup>*, obtaining the clause

$$I' = (L_1 \lor \dots \lor \neg D_1 \lor \dots \lor \neg D_m \lor \dots \lor L_N)$$

By continuing this process, we can replace all extended literals in a model interpolant, obtaining a model interpolant in the basic language of polynomial constraints.

*Example 6.* Consider the model interpolant I = ¬(x >*<sup>r</sup>* root(x<sup>2</sup> − 2, 2, x)) from Example 3, which refutes the model M = {x → 2}. To express I in terms of basic polynomial constraints, we first construct a regular CAD cell of f = x<sup>2</sup> − 2 around M. In this case, the cell is simply x >*<sup>r</sup>* root(x<sup>2</sup> − 2, 2, x). Then, we use Lemma 3 to construct the basic CAD cell description (x<sup>2</sup> > 2) ∧ (x > 0). Finally, the simplified interpolant is I′ = ¬(x<sup>2</sup> > 2) ∨ ¬(x > 0).

*Termination.* With the description of the interpolation procedure complete, we now discuss its termination. To do so, we fix the formula A(*x*, *y*) of Definition 3 and we assume a fixed order of variables that ensures y*<sup>i</sup>* < x*<sup>i</sup>*. Since the MCSAT decision procedure on which we rely is based on CAD, we can bound the set of literals that can ever appear in a model interpolant from the formula A to an arbitrary model M. Let P*<sup>A</sup>* be the set of polynomials appearing in A, let P denote the closure of the set P*<sup>A</sup>* under the CAD projection operator used by the decision procedure, and let P′ be the closure of P under derivatives. The set of polynomial constraints that can appear in the interpolant I is limited to basic polynomial constraints over polynomials in P′. This means that the procedure mcsat::check() can only generate a finite number of model interpolants and therefore has the finite convergence property.

**Lemma 4.** *Assuming a fixed variable order, the* mcsat::check() *procedure has the finite convergence property for nonlinear arithmetic formulas.*

Together with Lemma 2, this lemma implies that our interpolation procedure for the theory of nonlinear arithmetic terminates.

*Model Generalization.* We now proceed to show how the CAD cell construction can be used in a natural way to provide model-driven generalization. As in Definition 2, assume a formula F(*x*, *y*) such that F is true in a model M. Our aim is to construct a formula G(*x*) that generalizes the model M and still guarantees a solution to F.

Following the approach of [15], we do this in two steps. First, we construct an implicant B of F based on the model M. Then, we eliminate the variables *y* from B, again relying on the model M. The implicant B is a conjunction of literals that implies F and such that B is true in M. The implicant can be computed by a top-down traversal of the formula F while using the model M to evaluate the formula nodes (see, e.g., [15] for a detailed description). To find a formula G such that G ⇒ ∃*y* . B, we use CAD cell construction as follows. Let <sup>P</sup> <sup>⊆</sup> <sup>Z</sup>[*x*, *<sup>y</sup>*] be the set of all polynomials appearing in <sup>B</sup>, and let the cell description of P around M be

$$\mathsf{describeCellBasic}(P, M) = D_{x} \land D_{y}.$$

Here, D*<sup>x</sup>* denotes the description of cell levels of variables *x*, while D*<sup>y</sup>* denotes the description of cell levels of variables *y*. Because of the cylindrical nature of CAD cells, and the order on variables y*<sup>i</sup>* and x*i*, we are guaranteed that every solution of D*<sup>x</sup>* can be extended to a solution of D*<sup>y</sup>* . Therefore we set the final generalization G(x) ≡ D*x*.
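The first step, implicant extraction by top-down evaluation, can be sketched as follows (a minimal version of the traversal described in [15]; the formula encoding as nested 'and'/'or' nodes over named literals is our own):

```python
def holds(formula, model):
    """Evaluate a formula over literals in a model (a dict of variable values)."""
    op = formula[0]
    if op == 'lit':
        return formula[2](model)
    subs = [holds(sub, formula_model) for sub, formula_model in
            ((s, model) for s in formula[1:])]
    return all(subs) if op == 'and' else any(subs)

def implicant(formula, model):
    """Top-down traversal computing a conjunction of literals B such that
    B implies the formula and B is true in the model: the first step of
    model generalization (a toy sketch, not the yices2 implementation)."""
    op = formula[0]
    if op == 'lit':
        return [formula]
    if op == 'and':
        lits = []
        for sub in formula[1:]:
            lits += implicant(sub, model)
        return lits
    # op == 'or': descend into any disjunct that is true in the model
    for sub in formula[1:]:
        if holds(sub, model):
            return implicant(sub, model)
    raise ValueError('formula is false in the model')
```

Eliminating the variables *y* from the resulting conjunction B is then delegated to the CAD cell construction, keeping only the levels D*<sup>x</sup>* as the generalization G(*x*).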

*Example 7 (Generalization).* Consider the formula F ≡ (x<sup>2</sup> + y<sup>2</sup> > 2) and the model M = {x → 1, y → 2} that satisfies F, and let us compute a generalization G(x) of M. First, we compute a CAD cell of f = x<sup>2</sup> + y<sup>2</sup> − 2, as shown in Example 5. Then we drop the description of cell level y, to obtain the model generalization G(x) ≡ (x<sup>2</sup> < 2) ∧ (x > 0).

# **5 Evaluation**

To the best of our knowledge, there is no clear metric for evaluating how good an interpolant is, or for comparing different interpolants. In this section, we first show two examples to illustrate the procedure and its applications. Then, we evaluate the effectiveness of our interpolation procedure on practical problems that arise from model-checking applications. To this end, we integrate the procedure into a model checker and evaluate whether the procedure is efficient, and can produce abstractions that help the model checker synthesize invariants and discover counter-examples.

We have implemented the reasoning procedures (solving modulo partial models and interpolation procedure) by extending the existing mcsat implementation of the yices2 SMT solver [14]. We used the libpoly library [31] for computing the model generalization and simplification of algebraic cells. Since yices2 is integrated into the sally model checker [30], we rely on the pdkind method [30] as the model checking engine (the user of interpolation) in our evaluation.

**Fig. 2.** Illustration of interpolants from Example 8. In blue and orange are the feasible space of the formulas *A* and *B* (projected on *x* and *y*). In green is the feasible space of the interpolant produced by our method (on the left) and the interpolant produced by [19] (on the right). (Color figure online)

*Example 8.* We compare the style of interpolants generated by our new procedure with the ones generated by numerical approaches such as [19]. Example 4 from [19] considers two formulas of the form

$$\begin{aligned} A(x, y, a\_1, a\_2, b\_1, b\_2) &\equiv (f\_1 \ge 0 \land f\_2 \ge 0) \lor (f\_3 \ge 0 \land f\_4 \ge 0), \\ B(x, y, c\_1, c\_2, d\_1, d\_2) &\equiv (g\_1 \ge 0 \land g\_2 \ge 0) \lor (g\_3 \ge 0 \land g\_4 \ge 0). \end{aligned}$$

The polynomials f_i and g_i involved in A and B are of degree 2. The right-hand side of Fig. 2 shows the interpolant I₁ found by the approach of [19]. This interpolant is of the form h(x, y) > 0, where h is a polynomial of degree two computed using semidefinite programming. Our approach, on the other hand, produces the interpolant I₂ shown on the left-hand side of Fig. 2. This interpolant consists of 12 clauses, each containing 6–8 polynomial constraints over 16 different polynomials (8 linear, 8 of degree 2). The interpolant I₂ is ultimately produced from fragments of a CAD, so its edges touch the critical points of the shape it was produced from (formula A). Interpolant I₁, on the other hand, has a simple form dictated by the method of [19]. Which form is ultimately more useful depends on the particular application.


**Fig. 3.** Evaluation Results. For each tool, we report the number of solved problems, how many of the solved problems were valid and invalid, and the total time used to solve them. The rows correspond to different problem classes, and the bottom row reports the overall results for all 114 benchmarks.

*Example 9 (Cauchy-Schwarz).* As described in Example 1, we can model the computation of the Cauchy-Schwarz inequality as a transition system S_cs. Then we can prove the inequality correct by proving that the property P_cs is valid in S_cs. The pdkind model checking engine with the new interpolation procedure proves the property valid in 1 s.

*Benchmarks.* We run the evaluation on an existing set of nonlinear model-checking problems used by Cimatti et al. [8]. This set consists of 114 benchmarks from various sources: handcrafted benchmarks, hybrid system verification, nuxmv benchmarks, C floating-point verification, and verification of Simulink models. The benchmark problems all contain transition systems with nonlinear behavior. For each problem, the goal is to prove or disprove a single invariant. We refer the reader to [8] for a more detailed description.

*Evaluation.* Cimatti et al. [8] present an abstraction approach based on incrementally more precise linear approximations of nonlinear polynomials. They show that this approach, implemented in the ic3-nra tool, is superior to other tools (such as isat3 [36] and nuxmv [6] with upfront linear abstraction). Since our goal is to show the effectiveness of our interpolation procedure, rather than to compare many model checking engines, we keep the evaluation simple and only compare to ic3-nra. In addition, we include the k-induction engine kind of sally in the comparison to illustrate the importance of invariant inference and counter-example generation.<sup>7</sup>

We ran the tools on the benchmark set with a 1 h CPU timeout per problem. The results are shown in Fig. 3 and on the cactus plot in Fig. 4. A scatter plot comparison of pdkind against ic3-nra and kind is shown in Fig. 5.

<sup>7</sup> kind performs *k*-induction checks for increasing values of *k* and stops if either the property is shown *k*-inductive, or a counter-example is found.

**Fig. 4.** Cactus Plots Comparing the Performance of ic3-nra, kind, and pdkind. The *x* axis is the number of problems solved (valid on the left, invalid on the right) and the *y* axis is the time needed to solve the problem (log scale).

**Fig. 5.** Scatter Plots Comparing the Performance of ic3-nra and kind with pdkind. Green squares represent problems that are valid. Red dots represent problems that are invalid. Each axis represents the time it took the tool to solve the problem (log scale). (Color figure online)

As can be seen from Fig. 3, the results are positive. The pdkind engine with the new interpolation method can prove more properties and find more counter-examples than the state-of-the-art ic3-nra.

Out of the 59 properties that pdkind shows correct, 36 cannot be proved by kind. This means that these properties are likely not k-inductive and that the interpolants produced by our procedure are valuable abstractions in invariant inference. Similarly, ic3-nra proves 37 properties that are not k-inductive. As can be seen from the scatter plot in Fig. 5, there are properties that pdkind can prove but ic3-nra cannot, and vice versa (11 and 10, respectively). This is to be expected in a difficult domain, but it also means that the interpolation and abstraction approaches (or other methods) can be used to complement each other.

As for the invalid properties, since our interpolation method (and thus pdkind) is based on complete and precise reasoning, while ic3-nra relies on abstraction, it is to be expected that pdkind can prove more properties invalid. Furthermore, the comparison with kind in Fig. 5 shows that pdkind finds all but one of the counter-examples that kind does, in a similar amount of time. We see this as confirmation that the interpolation and generalization methods are effective, i.e., they do not impede the search for counter-examples.

#### **5.1 Related Work**

There is ample literature on interpolation for different fragments of nonlinear arithmetic. Existing methods can roughly be classified into two categories: approaches based on interval reasoning, and approaches based on semidefinite programming. Interval reasoning techniques (e.g., [20,35,36]) construct a proof of unsatisfiability through interval slicing and propagation. From such a proof, interpolants can be built using proof-based interpolation techniques. While incomplete, interval-based techniques can be very effective on problems that are hard for complete techniques. Moreover, they can support more polynomial functions (e.g., elementary functions, ODEs). Our procedure is complete, but it is limited to the theories supported by MCSAT. The approaches based on semidefinite programming [12,18,19] generally restrict both the fragment of arithmetic (e.g., bounded constraints, same set of variables, quadratic constraints) and the shape of the interpolant (a single polynomial constraint) so that the search for an interpolant can be cast as a semidefinite optimization problem. When they apply, these procedures are also very effective, but they suffer from numerical imprecision, which requires special care to account for the resulting errors and makes them difficult to use in formal verification. In contrast, our procedure applies to nonlinear arithmetic as a whole. It relies on symbolic techniques, which are not subject to numerical errors. It is precise and complete, and it produces clausal interpolants.

The core ideas behind our model-based interpolation approach were presented at the Boolean level as SAT solving with assumptions [17]. Closest to our work is that of Schindler and Jovanović [42], where a similar model-based approach to interpolation is applied to conjunctions of linear arithmetic constraints based on conflict resolution. Our work is more general: it applies to formulas other than conjunctions, and it is applicable to a wider range of theories.

# **6 Conclusion and Future Work**

We have presented a general approach for interpolation in SMT. This novel approach relies on a mode of interaction with the SMT solver that can check a formula for satisfiability modulo a partial model and, if the formula is unsatisfiable, can return a model interpolant that refutes the model. This allows us to develop a first complete interpolation procedure for nonlinear arithmetic. We have implemented the new procedure in the yices2 SMT solver and evaluated the interpolation procedure on model-checking problems. The new procedure seems to be effective in practice and opens new possibilities in the verification of systems that contain nonlinear behavior. Additionally, we show interesting examples of how the procedure can be used in automating induction proofs in mathematics.

The interpolation procedure that we presented can support other theories available in MCSAT (e.g., uninterpreted functions [28], bit-vectors [22], nonlinear integer arithmetic [27]). We plan to explore interpolation in these theories in more detail, and in the contexts where interpolation can be beneficial (e.g., model checking, quantified reasoning, termination, and proof generation).

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **An SMT Solver for Regular Expressions and Linear Arithmetic over String Length**

Murphy Berzish1(B) , Mitja Kulczynski<sup>2</sup>, Federico Mora<sup>3</sup>, Florin Manea<sup>4</sup>, Joel D. Day<sup>5</sup>, Dirk Nowotka<sup>2</sup>, and Vijay Ganesh<sup>1</sup>

<sup>1</sup> University of Waterloo, Waterloo, Canada mtrberzi@uwaterloo.ca <sup>2</sup> Kiel University, Kiel, Germany <sup>3</sup> University of California, Berkeley, USA <sup>4</sup> University of Göttingen and Campus-Institute Data Science, Göttingen, Germany <sup>5</sup> Loughborough University, Loughborough, UK

**Abstract.** We present a novel length-aware solving algorithm for the quantifier-free first-order theory over regex membership predicate and linear arithmetic over string length. We implement and evaluate this algorithm and related heuristics in the Z3 theorem prover. A crucial insight that underpins our algorithm is that real-world regex and string formulas contain a wealth of information about upper and lower bounds on lengths of strings, and such information can be used very effectively to simplify operations on automata representing regular expressions. Additionally, we present a number of novel general heuristics, such as the prefix/suffix method, that can be used to make a variety of regex solving algorithms more efficient in practice. We showcase the power of our algorithm and heuristics via an extensive empirical evaluation over a large and diverse benchmark of 57256 regex-heavy instances, almost 75% of which are derived from industrial applications or contributed by other solver developers. Our solver outperforms five other state-of-the-art string solvers, namely, CVC4, OSTRICH, Z3seq, Z3str3, and Z3-Trau, over this benchmark, in particular achieving a speedup of 2.4*×* over CVC4, 4.4*×* over Z3seq, 6.4*×* over Z3-Trau, 9.1*×* over Z3str3, and 13*×* over OSTRICH.

**Keywords:** String solvers · SMT solvers · Regular expressions

# **1 Introduction**

Satisfiability Modulo Theories (SMT) solvers that support theories over regular expression (regex) membership predicate and linear arithmetic over length of strings, such as CVC4 [25], Z3str3 [8], Norn [3], S3P [39], and HAMPI [22], have enabled many important applications in the context of analysis of string-intensive programs. Examples include symbolic execution and path analysis [11,32], as well as security analyzers that make use of string and regex constraints for input sanitization and validation [5,33,35]. Regular expression libraries in programming languages provide very intuitive and popular ways for developers to express input validation, sanitization, or pattern matching constraints. Common to all these program analysis applications is the requirement for a rich quantifier-free (QF) first-order theory over strings, regexes, and integer arithmetic over string length. Unfortunately, the QF first-order theory of strings containing regex constraints, linear integer arithmetic over string length, string-number conversion, and string concatenation (but no string equations<sup>1</sup>) is undecidable [7,9]. In a previous paper [19] we showed that a related QF first-order theory over word equations, linear integer arithmetic over string length, and string-number conversion predicate, but without regular expressions is also undecidable. It can also be shown that many non-trivial fragments of this theory are hard to decide (e.g., they have exponential-space lower bounds or are PSPACE-complete). Therefore, the task of creating efficient solvers to handle practical string constraints that belong to fragments of this theory remains a very difficult challenge.

Many modern solvers typically handle regex constraints via an automata-based approach [4]. Automata-based methods are powerful and intuitive, but solvers must handle two key practical challenges in this setting. The first challenge is that many automata operations, such as intersection, are computationally expensive, yet handling these operations is required in order to solve constraints that are relevant to real-world applications. The second challenge relates to the integration of length information with regex constraints. Length constraints derived from automata may imply a disjunction of linear constraints, which is often more challenging for solvers to handle than a conjunction.

As we demonstrate in this paper, the challenges of using automata-based methods can be addressed via prudent use of *lazy extraction of implied length constraints* and *lazy regex heuristics* in order to avoid performing expensive automata operations when possible. Inspired by this observation, we introduce a length-aware automata-based algorithm, Z3str3RE (and its implementation as part of the Z3 theorem prover [18]), for solving regex constraints and linear integer arithmetic over length of string terms. Z3str3RE takes advantage of the compactness of automata in representing regular expressions, while at the same time mitigating the effects of expensive automata operations such as intersection by leveraging length information and lazy heuristics.

**Contributions:** We make the following contributions in this paper.

**Z3str3RE: An SMT Solver for Regular Expressions and Linear Integer Arithmetic over String Length.** In Sect. 3, we present a novel decision procedure for the QF first-order theory over regex membership predicate and linear integer arithmetic over string length. We also describe its implementation, Z3str3RE, as part of the Z3 theorem prover [8,18]. The basic idea of our algorithm is that formulas obtained from practical applications have many implicit and explicit length constraints that can be used to reason efficiently about automata representing regexes. In Sect. 4 we present four heuristics that aid in solving regular expression constraints and that can be leveraged in general settings. Specifically, we present a heuristic to derive explicit length information directly from

<sup>1</sup> We use the terms "word" and "string" interchangeably in this paper.

regexes, a heuristic to perform expensive automata operations lazily, a heuristic to refine lower and upper bounds on lengths of string terms with respect to regex constraints, and a prefix/suffix over-approximation heuristic to find empty intersections without constructing automata. All heuristics are designed to guide the search and avoid expensive automata operations whenever possible. Our solver, Z3str3RE, handles the above theory as well as extensions (e.g. word equations and substring function) via the existing support in Z3str3. We focus on the core algorithm as it is the centerpiece of our regex solver. We also carefully distinguish the novelty of our method from previous work.

**Empirical Evaluation and Comparison of Z3str3RE**<sup>2</sup> **Against CVC4, OSTRICH, Z3seq, Z3str3, and Z3-Trau:** To validate the practical efficacy of our algorithm, we present a thorough and extensive evaluation of Z3str3RE in Sect. 5, where we compare it against CVC4 [24], OSTRICH [15], Z3's sequence solver [18], Z3str3 [42], and Z3-Trau [1] on 57256 instances across four regex-heavy benchmarks with connections to industrial security applications, including instances from Amazon Web Services and AutomatArk [16]. Z3str3RE significantly outperforms the other state-of-the-art tools on the benchmarks considered, having more correctly solved instances in total, lower running time, fewer combined timeouts/unknowns, and no soundness errors or crashes. We note that almost 75% of the benchmarks were obtained from industrial applications or other solver developers. Over all the benchmarks, we demonstrate a speedup of 2.4× over CVC4, 4.4× over Z3seq, 6.4× over Z3-Trau, 9.1× over Z3str3, and 13× over OSTRICH.

# **2 Preliminaries**

This section contains some basic definitions as well as a brief overview of the theoretical results which shape the landscape in which we state our contribution.

# **2.1 Basic Definitions**

We first describe the syntax and semantics of the input language supported by our solver Z3str3RE (Algorithm 1).

**Syntax:** The core algorithm we present in Sect. 3 accepts formulas of the quantifier-free many-sorted first-order theory of regex membership predicates over strings and linear integer arithmetic over string length function. The syntax of this theory is shown in Fig. 1.

We denote the set of all string variables and all integer variables as Varstr and Varint respectively, and the set of all string constants and all integer constants as Constr and Conint respectively. String constants are any sequence of zero or more characters over a finite alphabet (e.g., ASCII).

Atomic formulas are regular expression membership constraints and linear integer (in)equalities. Regex terms are defined recursively over regex concatenation, union, Kleene star, and complement; for a string constant w, the

<sup>2</sup> A reproduction package is available at https://figshare.com/s/5ae73a6f3c55f5c5e4c1.

```
F ::= Atom | F ∧ F | F ∨ F | ¬F
Atom::= tstr ∈ RE | Aint
Aint ::= tint = tint | tint < tint
RE ::= "w" | RE · RE | RE ∪ RE | RE∗ | ¬RE, with w ∈ Constr
tint ::= m | v | len(tstr) | tint + tint | m · tint, with m ∈ Conint, v ∈ V arint
tstr ::= s, with s ∈ Varstr ∪ Constr
```
**Fig. 1.** Syntax of the input language accepted by Algorithm 1. Z3str3RE accepts an extension of this syntax supporting word equations and other string terms.

regex term "w" represents the regular language containing w only. All regex terms must be grounded (i.e., they cannot contain variables). Linear integer arithmetic terms include integer constants and variables, addition, and string length. Multiplication by a constant is expanded to repeated addition. String terms are either string variables or string constants. The length of a string S, denoted len(S), is the number of characters in S. The empty string has length 0.

Our implementation Z3str3RE supports the theory in Fig. 1 extended with more expressive functions and predicates, including word equations (equality between arbitrary string terms) and functions such as indexof and substr that are needed for program analysis. Z3str3RE handles these terms via existing support in Z3str3. We focus on the above input language in the presentation of our algorithm and in the theoretical content of this paper.

**Semantics:** We refer the reader to [42] for a detailed description of the semantics of standard terms in this theory. We focus here on the semantics of terms which are less commonly known. The regex membership predicate S ∈ R, where S is a string term and R is a regex term, is defined by structural recursion as follows:

- S ∈ "w" iff S = w (where w is a string constant)
- S ∈ R₁ · R₂ iff there exist strings S₁, S₂ with S = S₁ · S₂, S₁ ∈ R₁, and S₂ ∈ R₂
- S ∈ R₁ ∪ R₂ iff either S ∈ R₁ or S ∈ R₂
- S ∈ R∗ iff either S = ε or there exists a positive integer n such that S = S₁ · S₂ · ... · Sₙ and Sᵢ ∈ R for each i = 1 ... n
- S ∈ ¬R iff S ∉ R (that is, S ∈ R is false)
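The recursive semantics can be implemented directly, as in the following sketch; the constructor names are our own, and Z3str3RE of course operates on automata rather than on this naive recursion.

```python
from dataclasses import dataclass

# Direct implementation of the regex membership semantics (illustrative).
@dataclass(frozen=True)
class Str: w: str                        # the language {w}
@dataclass(frozen=True)
class Cat: l: object; r: object          # R1 · R2
@dataclass(frozen=True)
class Union: l: object; r: object        # R1 ∪ R2
@dataclass(frozen=True)
class Star: r: object                    # R*
@dataclass(frozen=True)
class Comp: r: object                    # complement of R

def member(s, R):
    if isinstance(R, Str):
        return s == R.w
    if isinstance(R, Cat):               # try every split point of s
        return any(member(s[:i], R.l) and member(s[i:], R.r)
                   for i in range(len(s) + 1))
    if isinstance(R, Union):
        return member(s, R.l) or member(s, R.r)
    if isinstance(R, Star):              # peel a non-empty prefix in R
        return s == "" or any(member(s[:i], R.r) and member(s[i:], R)
                              for i in range(1, len(s) + 1))
    if isinstance(R, Comp):
        return not member(s, R.r)
    raise TypeError(R)

ab_star = Star(Union(Str("a"), Str("b")))
assert member("abba", ab_star) and not member("abc", ab_star)
assert member("abc", Comp(ab_star))
```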

### **2.2 Theoretical Landscape**

To put our contributions in context, we briefly discuss a series of (un)decidability and complexity results developed around the fragments and extensions of the theory supported by Z3str3RE.

In particular, we consider extensions which may have a string-number conversion predicate numstr<sup>3</sup> and/or string concatenation. Both extensions are

<sup>3</sup> We introduce *numstr*, which is not part of the SMT-LIB standard, in order to simplify the presentation of the theoretical results. The predicate is no more expressive than the standard operators str.to_int/str.from_int, except that those terms handle decimal inputs. The results easily extend to other (finite) alphabets, including decimal/hexadecimal digits, with appropriate case analysis.

important to real-world program analysis. The predicate numstr has the syntax numstr(tint, tstr) and the following semantics: numstr(n, s) is true for a given integer n and string s iff s is a valid binary representation of the number n (possibly with leading zeros) and n is a non-negative integer. That is, s only contains the characters 0 and 1, and

$$\sum_{i=0}^{\mathrm{len}(s)-1} s[i] \cdot 2^{\mathrm{len}(s)-i-1} = n,$$

where s[i] is 0 if the i-th character of s is '0' and 1 if it is '1'. String concatenation has the syntax tstr ::= tstr · tstr and the usual semantics defined by SMT-LIB [10].
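This semantics can be stated as a small reference implementation; this is illustrative only, since numstr is a theoretical device rather than an operator of the solver, and the treatment of the empty string is our assumption.

```python
# Reference implementation of the numstr semantics: numstr(n, s) holds
# iff s is a binary representation of the non-negative integer n,
# leading zeros allowed. (Rejecting the empty string is our assumption.)
def numstr(n, s):
    if n < 0 or s == "" or any(c not in "01" for c in s):
        return False
    L = len(s)
    # the sum from the definition: s[i] * 2^(len(s)-i-1) over all i
    return sum((1 if s[i] == "1" else 0) * 2 ** (L - i - 1)
               for i in range(L)) == n

assert numstr(5, "101") and numstr(5, "0101")
assert not numstr(5, "111") and not numstr(-1, "1")
```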

In the following, TLRE,n,c is the quantifier-free many-sorted first-order theory of linear integer arithmetic over string length function (L), regex (RE) membership predicates, string-number conversion (n), and string concatenation (c) <sup>4</sup>. The following quantifier-free fragments of TLRE,n,c are of interest: TLRE,c, TLRE, TRE,n,c, TRE,n, and TRE. The fragment TLRE,c (respectively, TLRE) has all functions and predicates of TLRE,n,c except the string-number conversion predicate (and, respectively, except the string concatenation function). The theory TRE,n,c (respectively, TRE,n and TRE) has all functions and predicates of TLRE,n,c except the length function (and, respectively, the string concatenation function, and, in the case of TRE, the string-number conversion predicate). Note that while all these theories allow equalities between terms of sort Int, they do not allow equalities between terms of sort Str and cannot express general word equations.

The theoretical landscape is laid out as follows. First, following the results and techniques introduced in [3], we obtain that TLRE,c and, in particular, TLRE are decidable. A procedure deciding a formula from TLRE,c would first construct, for each variable (string or integer), a finite automaton based on the regular expression constraints and length constraints which involve it, and then reduce checking the satisfiability of the formula to checking whether the constructed automata accept at least one string. A similar approach shows that TRE,n is decidable. We observe that the presence of complements in regular expressions is an inherent source of complexity for these procedures. Indeed, we can easily encode the universality problem for regular expressions as a formula in the theory TRE: given a regex R of length n over an alphabet Σ, deciding whether L(R) = Σ∗ is equivalent to deciding the unsatisfiability of the formula ϕ of TRE consisting of the atom x ∈ ¬R. Accordingly, by the results from [37], if the choice for R is restricted to regular expressions with at least k stacked complements, then there exists a positive rational number c

such that the considered problems are not contained in

$$\mathrm{NSPACE}\Bigg(\,\underbrace{2^{2^{\cdot^{\cdot^{\cdot^{2^{cn}}}}}}}_{k-1\ \text{times}}\,\Bigg).$$

In other words, the depth of the stack of complements in the formula translates to the height of the tower of exponents in the complexity of deciding the formula ϕ. On the other hand, if we only consider regular expressions without stacked complements, then the decision problems for the considered theories are PSPACE-complete. Indeed, the automata-based approach described above can be implemented to work in nondeterministic polynomial space; strongly related complexity results are obtained in [26,27].

<sup>4</sup> Note that the fragments considered here do not include word equations.

#### **Algorithm 1:** Z3str3RE's length-aware algorithm for the theory TLRE of regex and integer constraints

```
Input : Conjunction φ of constraints of the form S ∈ RE, and conjunction ψ of linear
             integer arithmetic constraints over string lengths
   Output : SAT or UNSAT
1 forall constraints S ∈ RE in φ do
2 LS ← ComputeLengthAbstraction(S) ;
3 LRE ← ComputeLengthAbstraction(RE) ;
4 if ψ ∪ LS ∪ LRE inconsistent then
5 return UNSAT
6 end
7 refine LS as tightly as possible with respect to LRE;
8 end
9 forall strings Si occurring in φ do
10 let R be the set of all regexes RE in all terms Si ∈ RE ;
11 Automaton I ← intersection of all automata corresponding to regexes in R ;
12 if I is empty then
13 return UNSAT
14 else
15 LI ← ComputeLengthAbstraction(I) ;
16 end
17 end
18 LS ← the union of all length abstractions LS ;
19 LRE ← the union of all length abstractions LRE;
20 LI ← the union of all length abstractions LI ;
21 if ψ ∪ LS ∪ LRE ∪ LI has any solution M then
22 forall strings S occurring in φ do
23 obtain len(S) from M ;
24 let A be the set of all automata for all regexes RE in all terms S ∈ RE ;
25 Automaton J ← intersection of all automata in A ;
26 S ← any string of length len(S) in J ;
27 end
28 return SAT
29 else
30 return UNSAT
31 end
```
At the opposite end of the spectrum is the theory TLRE,n,c, which is undecidable. Indeed, one can show that the more specific theory TRE,n,c (i.e. disallowing arithmetic over length) has equivalent expressive power to the theory of word equations with regular constraints, a predicate allowing the comparison of the length of string terms, and the numstr predicate. Therefore, using the techniques from [17], one can show that the theory TLRE,n,c, in which we additionally allow arithmetic over length, is undecidable [7].

# **3 Length-Aware Regular Expression Algorithm**

This section outlines the high-level algorithm used by Z3str3RE to solve the satisfiability problem for TLRE, and its extension based on length-aware heuristics.

#### **3.1 High-Level Algorithm**

The pseudocode presented in Algorithm 1 captures the essence of the Z3str3RE regex solver. Implementation-specific details are omitted for clarity. Z3str3RE incorporates a version of this algorithm as part of a DPLL(T)-style interaction with a core solver for Boolean combinations of atoms and other theory solvers able to handle arithmetic constraints and other terms. The tool handles string concatenation, string equality, and other string terms and predicates besides regex membership and string length via existing support in Z3str3, and leverages Z3's integer arithmetic solver for arithmetic reasoning and model construction. This high-level presentation is expanded in Sect. 4, where we describe several heuristics used in our implementation as part of the Z3str3RE tool.

The algorithm takes as input a conjunction φ of regex membership constraints and a conjunction ψ of linear integer arithmetic constraints over the lengths of string variables appearing in φ. Without loss of generality, it is assumed that all constraints in φ are positive; a negative constraint S ∉ RE can be replaced with the positive constraint S ∈ ¬RE over the complemented regex. The algorithm returns SAT iff there is a satisfying assignment to all string variables consistent with the regex constraints φ and the length constraints ψ. It is assumed that the algorithm has access to a decision procedure for checking the consistency of linear integer arithmetic constraints and for obtaining satisfying assignments to these constraints (in our implementation, this role is fulfilled by Z3's arithmetic solver).

Lines 1–8 check whether the length information implied by φ is consistent with ψ. The function ComputeLengthAbstraction takes as input either a string term S or a regex RE and computes a system of length constraints corresponding to derived length information from string constraints or to the possible lengths of words accepted by the regex RE. This abstraction is exact, not an over-approximation. For example, given the regex (abc)∗ as input, ComputeLengthAbstraction would construct the length abstraction S ∈ (abc)∗ → len(S) = 3n ∧ n ≥ 0 for a fresh integer variable n. If the length abstractions are inconsistent with the given length constraints, there can be no solution which satisfies both the length and regex constraints, and hence the algorithm returns UNSAT. Otherwise, line 7 refines the length abstraction LS with respect to the regex RE. This improves the efficiency of finding solutions to the augmented system of length constraints later in the algorithm. In our implementation, the lower and upper bounds of the length of S are checked against the lengths of accepting paths in the automaton for RE. For instance, if LS implies that len(S) ≥ 5, but the shortest accepting path in the automaton has length 7, the lower bound is refined to len(S) ≥ 7.
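To illustrate what a length abstraction captures, the sketch below enumerates the exact set of feasible lengths of a regex up to a cutoff. The real ComputeLengthAbstraction returns symbolic constraints (such as len(S) = 3n ∧ n ≥ 0) rather than enumerating; the tuple-based regex encoding and the helper name `lengths` are our own.

```python
# Bounded length abstraction: all lengths (up to a cutoff) of words
# matching a regex, computed by structural recursion. Illustrative only.
def lengths(R, bound):
    tag = R[0]
    if tag == "str":                     # ("str", w): the language {w}
        return {len(R[1])} if len(R[1]) <= bound else set()
    if tag == "union":                   # ("union", R1, R2)
        return lengths(R[1], bound) | lengths(R[2], bound)
    if tag == "cat":                     # ("cat", R1, R2): pairwise sums
        L2 = lengths(R[2], bound)
        return {a + b for a in lengths(R[1], bound) for b in L2
                if a + b <= bound}
    if tag == "star":                    # ("star", R1): closure under +
        step = lengths(R[1], bound) - {0}
        out, frontier = {0}, {0}
        while frontier:
            new = {a + b for a in frontier for b in step if a + b <= bound}
            frontier = new - out
            out |= new
        return out
    raise ValueError(tag)

abc_star = ("star", ("str", "abc"))
assert lengths(abc_star, 10) == {0, 3, 6, 9}   # i.e., len(S) = 3n
```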

Lines 9–17 check that the intersection of all automata constraining each string variable is non-empty. Although intersecting automata is relatively expensive (it runs in quadratic time w.r.t. the size of the intersected automata), it is still more efficient to do this before enumerating length assignments, and taking the intersection here is necessary to maintain soundness. (The heuristics in Sect. 4 illustrate some methods by which this computation can be made more efficient or even avoided.) If the length information is consistent, the algorithm adds a length abstraction constraint LI encoding the lengths of all possible solutions to the intersection I.
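The product construction behind this step, together with the shortest-accepting-path computation used for bound refinement, can be sketched as follows; the `(start, accepting, delta)` DFA encoding is an illustrative simplification of our own, with missing transition entries meaning rejection.

```python
from collections import deque

# Product construction for two DFAs plus a BFS for the shortest accepting
# path -- the kind of bound used to refine lower bounds on len(S).
def product_shortest_accepting(d1, d2, alphabet):
    """Length of the shortest word in L(d1) ∩ L(d2), or None if empty."""
    (s1, F1, t1), (s2, F2, t2) = d1, d2
    start = (s1, s2)
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        (p, r), dist = queue.popleft()
        if p in F1 and r in F2:
            return dist                      # intersection is non-empty
        for ch in alphabet:
            nxt = (t1.get((p, ch)), t2.get((r, ch)))
            if None not in nxt and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None                              # intersection is empty

d1 = (0, {2}, {(0, "a"): 1, (1, "b"): 2, (2, "a"): 1})   # (ab)+
d2 = (0, {0}, {(0, "a"): 1, (0, "b"): 1,
               (1, "a"): 0, (1, "b"): 0})                # even length
d3 = (0, {1}, {(0, "a"): 1})                             # exactly "a"

assert product_shortest_accepting(d1, d2, "ab") == 2      # shortest: "ab"
assert product_shortest_accepting(d3, d2, "ab") is None   # odd vs. even
```

The BFS explores only reachable product states, so in practice it often touches far fewer than the quadratic worst-case number of states.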

By construction of ψ ∪ L<sub>S</sub> ∪ L<sub>RE</sub> ∪ L<sub>I</sub>, the input formula is satisfiable iff this system of integer constraints has a solution. If such a solution M exists, lines 22–28 construct an assignment for each string variable with respect to its length assignment. A string solution must exist, because the lengths considered are limited to those for which the intersection of the corresponding automata is non-empty; by construction, the solution is consistent with both the input length constraints and the string constraints. If no solution M exists, then the constraints φ ∧ ψ are not jointly satisfiable, and the algorithm returns UNSAT.

We demonstrate soundness, completeness, and termination of Algorithm 1 as follows. On line 4 we check whether ψ ∪ L<sub>S</sub> ∪ L<sub>RE</sub> is satisfiable. If not, we return UNSAT on line 5. Lines 9–17 check whether the intersection of the regex constraints for each string variable is empty. If so, we return UNSAT; otherwise, we add an additional constraint encoding the lengths of all strings in this intersection. Therefore, ψ ∪ L<sub>S</sub> ∪ L<sub>RE</sub> ∪ L<sub>I</sub> has a solution iff there exists an assignment to each string variable that is consistent with the arithmetic constraints ψ and that corresponds to the length of a solution in the intersection of its regex constraints. Lines 22–28 construct this solution if it exists. Therefore, Algorithm 1 is a decision procedure for the QF first-order theory of regex constraints, string length, and linear integer arithmetic.

As previously mentioned, Z3str3RE supports other high-level operations that are not part of this theory via existing support in Z3str3. An extension of this algorithm supports these operations, although their inclusion may render the theory undecidable. These terms are omitted from Algorithm 1 because their inclusion would make the algorithm incomplete (see Sect. 2.2); Algorithm 1 describes the part of the implementation which is novel and complete.

# **4 Length-Aware and Prefix/Suffix Heuristics in Z3str3RE**

In this section, we describe the length-aware heuristics that are used in Z3str3RE to improve the efficiency of regular expression reasoning. We present an empirical evaluation of the power of these heuristics in Sect. 5.6.

#### **4.1 Computing Length Information from Regexes**

The first length-aware heuristic is used when constructing the length abstraction on line 3. If the regex can easily be converted to a system of equations describing the lengths of all possible solutions (for instance, when it contains no complements or intersections), this system can be returned as the abstraction without yet constructing the automaton for RE. As illustrated above, given the regex (abc)<sup>∗</sup> as input, ComputeLengthAbstraction constructs the length abstraction S ∈ (abc)<sup>∗</sup> → len(S) = 3n, n ≥ 0 for a fresh integer variable n. Note that this can be done from the syntax of the regex without converting it to an automaton. Deriving length information from the automaton would be straightforward, for example by constructing a corresponding unary automaton and converting it to Chrobak normal form. However, performing automata construction lazily means we cannot rely on having an automaton in all cases; this technique also provides length information even when constructing an automaton would be expensive.

In cases where we cannot directly infer the length abstraction, the heuristic will fix a lower bound on the length of words in RE, and possibly an upper bound if it exists. Reasoning about the length abstraction early in the procedure gives our algorithm the opportunity to detect inconsistencies before expensive automaton operations are performed. This gives the arithmetic solver more opportunities to propagate facts discovered by refinement and potentially more chances to find inconsistencies or learn further derived facts.
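The two cases above can be sketched as follows. This is a minimal illustration assuming a tuple-based regex AST whose node names (`lit`, `concat`, `union`, `star`, `plus`) are invented for this example; it is not the Z3str3RE code. A star over a fixed-length body yields an exact abstraction like len(S) = 3n, and the fallback computes lower/upper bounds:

```python
def length_bounds(regex):
    """Syntactic lower/upper bounds on the lengths of words a regex
    accepts. An upper bound of None means unbounded."""
    op = regex[0]
    if op == 'lit':
        return len(regex[1]), len(regex[1])
    if op == 'concat':
        lo1, hi1 = length_bounds(regex[1])
        lo2, hi2 = length_bounds(regex[2])
        return lo1 + lo2, None if None in (hi1, hi2) else hi1 + hi2
    if op == 'union':
        lo1, hi1 = length_bounds(regex[1])
        lo2, hi2 = length_bounds(regex[2])
        return min(lo1, lo2), None if None in (hi1, hi2) else max(hi1, hi2)
    if op == 'star':
        return 0, None
    if op == 'plus':
        return length_bounds(regex[1])[0], None
    raise ValueError(op)

def star_length_abstraction(var, body):
    """For S in r* with a fixed-length body r, the exact abstraction is
    len(S) = |r| * n for a fresh n >= 0; otherwise fall back to bounds."""
    lo, hi = length_bounds(body)
    if lo == hi:
        return f"len({var}) = {lo}*n, n >= 0"
    return None  # caller falls back to length_bounds

print(star_length_abstraction('S', ('lit', 'abc')))            # len(S) = 3*n, n >= 0
print(length_bounds(('union', ('lit', 'abc'), ('lit', 'de'))))  # (2, 3)
```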

#### **4.2 Optimizing Automata Operations via Length Information**

Similarly, computing the intersection I in line 11 is done lazily in the implementation of Z3str3RE and over several iterations of the algorithm. The most expensive intersection operations can be performed at the end of the search, after as much other information as possible has been learned. We use the following heuristics recursively to estimate the "cost" of each operation without actually constructing any automata:


In essence, the constructions which "blow up" the least are expected to be the least expensive and are performed first. In the best-case scenario, this could mean avoiding the most expensive operations completely if an intersection of smaller automata ends up being empty. In the worst case, all intersections are computed eventually, as this is necessary to maintain the soundness of our approach.

#### **4.3 Leveraging Length Information to Optimize Search**

Our implementation communicates integer assignments and lower/upper bounds with the external arithmetic solver in order to prune the search space. Checking for length assignments is done in practice as an abstraction-refinement loop involving Z3's arithmetic solver. The arithmetic solver proposes a single candidate model for the system of arithmetic constraints; the regex algorithm checks whether that model has a corresponding solution over the regex constraints. If it does not, it asserts a conflict clause blocking that combination of length assignments and regex constraints from being considered again. This is necessary in a DPLL(T)-style solver such as Z3 in order to handle Boolean structure in the input formula.
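The abstraction-refinement loop can be caricatured as follows. This is a toy sketch, not the actual DPLL(T) integration: the "arithmetic solver" is a brute-force model enumerator over a bounded range, and the predicates standing in for ψ and the regex-side check are invented for the example:

```python
def solve_lengths(psi, has_regex_solution, bound=100):
    """Toy abstraction-refinement loop: an 'arithmetic solver' proposes
    one candidate length model at a time; the regex side either accepts
    it or blocks it with a conflict clause, and the search resumes.
    psi and has_regex_solution are predicates over candidate lengths."""
    blocked = set()
    while True:
        # Stand-in for the arithmetic solver: pick any model of psi
        # not yet excluded by a conflict clause.
        candidate = next((n for n in range(bound + 1)
                          if psi(n) and n not in blocked), None)
        if candidate is None:
            return None  # arithmetic side exhausted -> UNSAT
        if has_regex_solution(candidate):
            return candidate  # this length admits a word in the regexes
        blocked.add(candidate)  # conflict clause: never propose it again

# psi: len(S) >= 4; regex side: S in (abc)*, i.e. length divisible by 3.
print(solve_lengths(lambda n: n >= 4, lambda n: n % 3 == 0))  # 6
```

The real loop blocks a combination of length assignments and regex constraints rather than a single integer, which is what lets it interact correctly with the Boolean structure handled by DPLL(T).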

#### **4.4 Prefix/Suffix Over-Approximation Heuristic**

As previously mentioned, computing automata intersections is expensive, but in many cases it is necessary in order to prove that a set of intersecting regex constraints has no solution. In some cases, this can be done "by inspection" from the syntax of the regex terms, without constructing or intersecting any automata. From the structure of a regular expression, it is easy to determine the set of possible first letters of the strings it matches. When several regexes over the same string term are to be intersected, this is used to check whether the regexes have a prefix of length one in common. If they do not, their intersection cannot contain any strings other than the empty string (and whether the empty string is accepted can be checked by a similar syntactic approach). A similar construction for suffixes of length 1 is also used. In this way, the heuristic can infer that the intersection of several regex constraints is either empty, resulting in a conflict clause, or can contain only the empty string, resulting in a new fact and a simplification of the formula – without actually constructing the intersection or, indeed, any automata for these regexes.

For example, consider the following regex constraints on a variable X:

$$\begin{aligned} X &\in (abc)^{*}\\ X &\in a^{+} \mid b^{+} \end{aligned}$$

In the first constraint, the pattern abc is matched zero or more times, and could be empty; therefore, either X is empty or it must start with a and end with c. In the second constraint, each pattern is matched at least once, and cannot be empty; therefore X must start with a or b, end with a or b, and cannot be the empty string. Observe that according to the prefix heuristic, these constraints are consistent, since a is a valid prefix of both regexes; however, according to the suffix heuristic, they are inconsistent, as the possible suffixes a and b of the second regex do not include c, and the empty string is not a solution to both constraints. Hence these constraints are not jointly satisfiable.

As demonstrated, all of these facts are derived from the syntax of the regular expression without constructing any automata. By computing an over-approximation of the possible solutions of X allowed by the regex constraints, the heuristic can determine that their intersection is empty (or contains only the empty string) without computing it precisely using expensive automata-based reasoning. We limit this heuristic to the first (and last) letter, as each additional letter requires exponentially more space.
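The prefix side of this check can be sketched as a syntactic recursion over a regex AST (the tuple-based node encoding is invented for this example, mirroring the earlier sketches; the suffix side is symmetric):

```python
def first_letters(regex):
    """Possible first letters of nonempty words matched by a regex AST,
    plus whether the regex accepts the empty string. Nodes: ('lit', s),
    ('concat', r1, r2), ('union', r1, r2), ('star', r), ('plus', r).
    Purely syntactic; no automata are built."""
    op = regex[0]
    if op == 'lit':
        s = regex[1]
        return ({s[0]} if s else set()), s == ''
    if op == 'union':
        f1, n1 = first_letters(regex[1])
        f2, n2 = first_letters(regex[2])
        return f1 | f2, n1 or n2
    if op == 'concat':
        f1, n1 = first_letters(regex[1])
        f2, n2 = first_letters(regex[2])
        return ((f1 | f2) if n1 else f1), (n1 and n2)
    if op in ('star', 'plus'):
        f, n = first_letters(regex[1])
        return f, (n or op == 'star')
    raise ValueError(op)

# X in (abc)* vs X in a+ | b+ : first-letter sets {a} and {a, b}
# overlap, so the prefix check alone cannot refute the pair (the
# suffix check, {c} vs {a, b}, does, as argued above).
f1, null1 = first_letters(('star', ('lit', 'abc')))
f2, null2 = first_letters(('union', ('plus', ('lit', 'a')),
                                    ('plus', ('lit', 'b'))))
print(f1 & f2, null1 and null2)  # {'a'} False
```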

# **5 Empirical Results**

In this section, we describe the empirical evaluation of Z3str3RE, our implementation of the length-aware regular expression algorithm presented in Sect. 3, to validate the effectiveness of the techniques presented. We evaluate the correctness and efficiency of our tool against other solvers, as well as against different configurations of the tool in order to demonstrate the efficacy of our heuristics.

**Fig. 2.** Cactus plot summarizing performance on all benchmarks. Z3str3RE has the best overall performance.

**Table 1.** Combined results of string solvers on all benchmarks. **Z3str3RE** has the best overall performance on all benchmarks compared to CVC4, OSTRICH, Z3seq, Z3str3, and Z3-trau and the biggest lead with a score of 1.02.


#### **5.1 Empirical Setup and Solvers Used**

We compare Z3str3RE against five other leading string solvers. CVC4 [24] is a general-purpose SMT solver which reasons about strings and regular expressions algebraically. Z3str3 [8] is the latest solver in the Z3-str family, and uses a reduction to word equations to reason about regular expressions. Z3str3RE is based on Z3str3, differing only in the length-aware algorithm and heuristics described in Sects. 3 and 4. Z3seq [36] is the Z3 sequence solver, implemented by Nikolaj Bjørner and others at Microsoft Research as part of the Z3 theorem prover; it uses a new theory of derivatives for solving extended regular expressions. Z3-Trau [1] is also based on Z3 and uses an automata-based approach known as "flat automata" with both under- and over-approximations. OSTRICH [15] uses a reduction from string functions (including word equations) to a model-checking problem that is

**Fig. 3.** Cactus plot summarizing detailed performance on Automatark benchmark.

solved using the SLOTH tool and an implementation of IC3. We used CVC4's binary version 1.8, commit 59e9c87 of Z3str3, the sequence solver included in Z3's binary version 4.8.9, Z3-Trau commit 1628747, and OSTRICH version 1.0.1. All of these tools support the full SMT-LIB standard for strings. We did not compare against the Z3str2 [42] or Norn [3] solvers, as neither tool supports the str.to_int or str.from_int terms for string-number conversion, which are used in some sanitizer benchmarks. Additionally, Norn does not support many of the other high-level string terms, such as indexof or substr, which are used in the benchmarks. The ABC [4] solver handles string and length constraints by conversion to automata. However, its method over-approximates the solution set of the input formula, which can be unsound; we therefore excluded ABC from our evaluation. We were also unable to evaluate against Trau [2], as the provided source code did not compile. All evaluations were performed on a server running Ubuntu 18.04.4 LTS with two AMD EPYC 7742 processors and 2 TB of memory, using the ZaligVinder [23] benchmarking framework with a 20 s timeout. We cross-verified the models generated by each solver for satisfiable instances against all competing solvers.

#### **5.2 Benchmarks**

The comparison was performed on four suites of regex-based benchmarks with a total of 57256 instances. In total, almost 75% of the instances in our evaluation came from previously published industrial benchmarks or other solver developers. Under 10% contain extended regular expressions (having complement, intersection, or both) and 53% contain only regex predicates. Only 201 instances fall into the undecidable theory T<sub>LRE,n,c</sub>. More details can be found in [7], where we analyse the benchmarks in greater depth. We briefly describe each benchmark's origin and composition below.


**Table 2.** Detailed results for the Automatark benchmark. Z3str3RE has the biggest lead with a score of 1.01.

**AutomatArk** is a set of 19979 benchmarks based on a collection of real-world regex queries collected by Loris D'Antoni from the University of Wisconsin, Madison, USA. We translated the provided regexes [16] into SMT-LIB syntax resulting in two sets of instances: a "simple" set with a single regex membership predicate per instance, and a "complex" set with 2–5 regex membership predicates (possibly negated) over a single variable per instance. The instances in this benchmark are evenly divided between simple and complex problems.

**RegEx-Collected** is a set of 22425 instances taken from existing benchmarks with the purpose of evaluating the performance of solvers against real-world regex instances. This benchmark includes all instances from the AppScan [41], BanditFuzz,<sup>5</sup> JOACO [38], Kaluza [33], Norn [3], Sloth [21], Stranger [40], and Z3str3-regression [8] benchmarks in which at least one regex membership constraint appears.<sup>6</sup> No additional restrictions are placed on which instances were chosen besides the presence of at least one regex membership predicate. This benchmark tests solvers against challenging instances from widely distributed benchmark suites. Additionally, these instances may contain regex terms in any context and with any other supported string operators. As a result, the benchmark is also exemplary of how string solvers perform in the presence of operations and predicates that are relevant to program analysis.

**StringFuzz-regex-generated** is a set of 4170 problems generated by the StringFuzz string instance fuzzing tool [12]. These instances only contain regular expression and linear arithmetic constraints. This benchmark isolates the regex performance of a string solver in the context of mixed regex and arithmetic constraints. Tools with better regex and arithmetic solvers should perform better. Fuzz testing, as performed in the **StringFuzz-regex-generated** benchmark, has been shown to be extremely productive in discovering bugs and performance

<sup>5</sup> The BanditFuzz benchmark is an unpublished suite obtained via private communication with the authors.

<sup>6</sup> Other benchmark suites available to us, including the PyEx, PISA, and Kausler benchmarks, did not include any regex membership constraints.

**Fig. 4.** Cactus plot showing detailed results for the StringFuzz-regex-generated benchmark.



issues in SMT solvers. We included these instances because they exercise the performance of the solver on regex-heavy constraints in a way that the industrial benchmarks or instances obtained from other solver developers cannot.

**StringFuzz-regex-transformed** is a set of 10682 instances which were produced by transforming existing industrial instances with StringFuzz. We applied StringFuzz's transformers to instances supplied by Amazon Web Services related to security policy validation, handcrafted instances inspired by real-world input validation vulnerabilities, and the regex test cases in Z3str3's regression test suite. The instances contain regex constraints, arithmetic and length constraints, string-number conversion (numstr), string concatenation, word equations, and other high-level string operations such as charAt, indexof, and substr. As is

**Fig. 5.** Cactus plot showing detailed results for the StringFuzz-regex-transformed benchmark.



typical for fuzzing in software testing, the goal is to create a suite of tests from a given input that are similar in structure but that explore interesting behaviour not captured by a "typical" industrial instance. These transformed instances are often harder than the original industrial ones.

#### **5.3 Comparison and Scoring Methods**

We compare solvers directly against the total number of correctly solved cases, total time with and without timeouts, and total number of soundness errors and program crashes. We also computed the biggest lead winner and largest contribution ranking following the scoring system used by the SMT Competition [6]. Briefly, the biggest lead measures the proportion of correct answers of the leading tool to correct answers of the next ranking tool, and the contribution score measures what proportion of instances were solved the fastest by that solver. In accordance with the SMT Competition guidelines, a solver receives no contribution score (denoted as –) if it produces any incorrect answers on a given benchmark. In both cases, higher scores are better.
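As a rough illustration of the biggest-lead metric (a simplification for exposition; the SMT Competition's exact formula and tie-breaking rules differ, and the solver names and counts below are invented):

```python
def biggest_lead(correct_counts):
    """Simplified biggest-lead score: ratio of the number of correct
    answers of the best solver to that of the runner-up."""
    best, second = sorted(correct_counts.values(), reverse=True)[:2]
    return best / second

# Hypothetical counts of correctly solved instances per solver.
print(round(biggest_lead({'A': 51000, 'B': 50000, 'C': 40000}), 2))  # 1.02
```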

#### **5.4 Analysis of Empirical Results**

The cactus plot in Fig. 2 shows the cumulative time taken by each solver on all cases in increasing order of runtime. Solvers that are further to the right and closer to the bottom of the plot have better performance.

Overall, Z3str3RE solves more instances and performs better than all competing solvers. Across all benchmarks, Z3str3RE is over 2.4× faster than CVC4, 4.4× faster than Z3seq, 6.4× faster than Z3-Trau, 9.1× faster than Z3str3, and 13× faster than OSTRICH (including timeouts). Additionally, Z3str3RE has fewer combined timeouts and unknowns than the other tools considered, and no soundness errors or crashes. We summarize these results in Table 1. Notably, both Z3-Trau [1] and OSTRICH [15] had significant runtime issues in our experiments. Z3-Trau produced 5325 soundness errors and 2477 crashes on our benchmarks (13% of all instances), which is significantly higher than the other tools used. OSTRICH produced 10901 "unknown" responses on the benchmarks (19% of all instances), due to both unsupported features and crashes, and also produced 28 soundness errors. Over all benchmarks, Z3str3RE produced 291 unknowns. There are several potential reasons for this: the solver may have encountered a resource limit and returned UNKNOWN, or it may have detected non-termination and returned UNKNOWN instead of looping forever. According to SMT Competition scoring, Z3str3RE won the division across all benchmarks with a lead of 1.02, and had the largest contribution to the division with a score of 145.07. CVC4 had a contribution score of 95.99, and Z3seq had a score of 19.87. OSTRICH, Z3-Trau, and Z3str3 received no contribution score as they each returned at least one incorrect answer. The presented results are typical of the performance of the evaluated tools over multiple runs. Results were cross-validated within runs and between multiple runs. For a random single instance, the sample variance in execution time over 100 runs is 0.001 (0.07% of average execution time); over 57256 instances, this is negligible.

The empirical results make clear the efficacy of length-aware automata-based techniques for regular expression constraints accompanied by length constraints (which is typical of industrial instances). The effectiveness of our technique is demonstrated particularly by comparing Z3str3RE with Z3str3, as the only differences between these tools are the length-aware regex algorithm and heuristics implemented in Z3str3RE, plus bug fixes. By improving the regex algorithm and applying our heuristics, we achieved a speedup of over 9× and solved over 10000 more cases than Z3str3.

**Fig. 6.** Cactus plot showing detailed performance for the RegEx-Collected benchmark.


**Table 5.** Detailed results for the RegEx-Collected benchmark. CVC4 has the biggest lead with a score of 1.03.

#### **5.5 Detailed Experimental Results**

Figure 3 and Table 2 show the detailed results for the **AutomatArk** benchmark. In this benchmark, Z3str3RE solves more instances than all other solvers, has the fewest timeouts/unknowns, and has the fastest overall running time. Including timeouts, Z3str3RE is 2.2× faster than CVC4, 4.7× faster than Z3seq, 40.4× faster than OSTRICH, 20.4× faster than Z3-Trau, and 32.3× faster than Z3str3.

Figure 4 and Table 3 show the detailed results for the **StringFuzz-regex-generated** benchmark. Z3str3RE solves more instances than all other solvers, has over 90% fewer timeouts than the other solvers, no unknowns, and has the fastest overall running time. Including timeouts, Z3str3RE is 6.1× faster than CVC4, 6.9× faster than Z3seq, 10× faster than OSTRICH, 7.3× faster than Z3-Trau, and 4.3× faster than Z3str3.

**Fig. 7.** Cactus plot comparing performance by disabling individual heuristics on all benchmarks.

Figure 5 and Table 4 show the detailed results for the **StringFuzz-regex-transformed** benchmark. Z3str3RE solves more instances in total than all other solvers and has the lowest total running time without timeouts. Including timeouts, Z3str3RE is 2.7× faster than CVC4, 1.9× faster than Z3seq, 21× faster than OSTRICH, and 27× faster than Z3str3. Although Z3-Trau is 1.5× faster than Z3str3RE on this benchmark (including timeouts), Z3-Trau also produces 1241 answers with soundness errors, crashes on 718 other cases, and solves 1923 fewer cases correctly in total than Z3str3RE. Z3str3RE produces no wrong answers or soundness errors on this benchmark.

Figure 6 and Table 5 show the detailed results for the **RegEx-Collected** benchmark. Z3str3RE outperforms Z3seq, Z3str3, OSTRICH, and Z3-Trau on this benchmark and is competitive with CVC4 both in terms of total number of instances correctly solved and total running time. CVC4 solves 609 more instances than Z3str3RE on this benchmark, but Z3str3RE is 1.1× faster overall (including timeouts). Z3str3RE is 3.6× faster than Z3seq, 5.4× faster than OSTRICH, 2.4× faster than Z3-Trau, and 2.6× faster than Z3str3.

#### **5.6 Analysis of Individual Heuristics and Results**

To demonstrate the effectiveness of individual heuristics described in Sect. 4 and implemented in Z3str3RE, we evaluated different configurations of the tool in which one or more heuristics were disabled. Figure 7 and Table 6 show the results. The plot line "Z3str3RE" shows the performance of the tool with all heuristics enabled. The plot line "All heuristics off" shows the performance with all heuristics disabled. Each of the other plot lines shows the performance with the named heuristic disabled and all others kept enabled. From the plots and table, it is clear that Z3str3RE performs best with all heuristics enabled. Z3str3RE is 4.4× faster using all our heuristics than using none. Every other configuration of the


**Table 6.** Comparison of different heuristics in Z3str3RE on all benchmarks.

tool performs significantly worse relative to the one with all heuristics enabled. Moreover, the length-aware and prefix/suffix heuristics provide a significant boost over lazy intersections and the baseline. These results demonstrate empirically that each heuristic we introduce provides significant benefit in both the total number of solved instances and total solver runtime, and that all of the heuristics can be used simultaneously for maximum efficacy.

# **6 Related Work**

**Comparison with Z3str3:** Z3str3 [8] supports regex constraints via an (incomplete) reduction to word equations. We have replaced this word-based technique with the automata-based approach introduced in this paper. As demonstrated by our evaluation, the length-aware automata-based approach used in Z3str3RE is more efficient at solving these constraints, and is sound and complete for the QF theory T<sub>LRE</sub>.

**Comparison with Z3's Sequence Solver:** Z3's sequence solver [18] supports a more general theory of "sequences" over arbitrary datatypes, which allows it to be used as a string solver. Z3seq uses regular expression derivatives to reduce regex constraints without constructing automata. The experiments show Z3str3RE performs better than Z3seq overall.

**Comparison with CVC4:** The CVC4 solver [24] uses an algebraic approach to solving regex constraints. As shown in the experiments, Z3str3RE performs better than CVC4, which is widely considered one of the best SMT solvers for strings as well as many other theories.

**Comparison with Z3-Trau:** The Z3-Trau [1] solver builds on Trau [2], reimplemented in Z3 and enriched with new ideas, e.g., more efficient handling of string-number conversion. Our evaluation of Z3-Trau exposed 5325 soundness errors and 2477 crashes on our benchmarks.

**Comparison with OSTRICH:** The OSTRICH solver [15] implements a reduction from straight-line and acyclic fragments of an input formula to the emptiness problem of alternating finite automata. OSTRICH produced 10901 "unknown" responses and 4575 timeouts on our benchmarks, as well as 28 soundness errors.

**Related Algorithms and Theoretical Results:** The theory of word equations and various extensions have been studied extensively for many decades. In 1977, Makanin proved that satisfiability for the QF theory of word equations is decidable [28]; in 1999, Plandowski showed that this is in PSPACE [30,31]. Schulz [34] extended Makanin's algorithm to word equations with regex constraints. The satisfiability problem for the theory of word equations with length constraints still remains open [20,28,29,31], although the status of many other extensions of this theory was clarified [17]. Automata-based approaches were used to reason about string constraints enhanced with a ReplaceAll function [14] or transducers [21].

Liang et al. [25] present a formal calculus for a theory that extends T<sub>LRE</sub> with string concatenation (but not word equations). However, the authors do not present experimental results for an implementation of the proposed calculus. We have implemented an algorithm based on the fundamentals of the theory and standard automata-based constructions, and presented a thorough experimental evaluation of our implementation.

Abdulla et al. [3] present an automata-based solver called Norn built upon results involving construction of length constraints from regex constraints. This approach differs significantly from our method. In particular, Norn only uses automata in inferring length constraints implied by regular expressions, then uses an algebraic approach to solve the remainder of the formula. By contrast, our tool uses a hybrid approach that includes both algebraic solving and automata-based reasoning in a symbiotic loop. In addition, we present several novel heuristics using length information to guide the search and, in some cases, avoid constructing automata or computing intersections.

The prefix/suffix over-approximation heuristic is inspired partly by the work of Brzozowski on regex derivatives [13]. The heuristic we introduce is conceptually different as we examine possible prefixes (and suffixes) of strings that could be accepted by a regex in order to demonstrate unsatisfiability, rather than examining the set of all possible suffixes given a fixed prefix in order to demonstrate satisfiability. Our heuristic computes suffixes as well, whereas Brzozowski derivatives are traditionally computed with respect to prefixes of a string. Newer versions of Z3seq, including the one we evaluated, use a regex algorithm based on symbolic derivatives [36].

# **7 Conclusions and Future Work**

In this paper, we empirically showcase the power of length-aware and prefix/suffix reasoning for regex constraints with our algorithm and its implementation in Z3str3RE, via an extensive empirical comparison against five other state-of-the-art solvers (namely, CVC4, Z3seq, Z3str3, Z3-Trau, and OSTRICH) over a large and diverse benchmark suite of 57256 instances. Over this entire suite, Z3str3RE shows a speedup of 2.4× over CVC4, 4.4× over Z3seq, 6.4× over Z3-Trau, 9.1× over Z3str3, and 13× over OSTRICH. Our length-aware method is very general and has wide applicability in the broad context of string solving. In the future, we plan to explore further length-aware heuristics covering more expressive functions and predicates, including indexof, substr, and string-number conversion.

**Acknowledgments.** The work of Federico Mora is supported by NSF grants CNS-1739816 and CCF-1837132, by the DARPA LOGiCS project under contract FA8750-20-C-0156, by the iCyPhy center, and by gifts from Intel, Amazon, and Microsoft. The work of Florin Manea is supported by DFG grant 389613931.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Counting Minimal Unsatisfiable Subsets**

Jaroslav Bendík<sup>1,2</sup>(B) and Kuldeep S. Meel<sup>2</sup>

<sup>1</sup> Faculty of Informatics, Masaryk University, Brno, Czech Republic xbendik@fi.muni.cz

<sup>2</sup> National University of Singapore, Singapore, Singapore

**Abstract.** Given an unsatisfiable Boolean formula *F* in CNF, an unsatisfiable subset *U* of the clauses of *F* is called a Minimal Unsatisfiable Subset (MUS) if every proper subset of *U* is satisfiable. Since MUSes serve as explanations for the unsatisfiability of *F*, MUSes find applications in a wide variety of domains. The availability of efficient SAT solvers has aided the development of scalable techniques for finding and enumerating MUSes in the past two decades. Building on recent developments in the design of scalable model counting techniques for SAT, Bendík and Meel initiated the study of MUS counting techniques. They succeeded in designing the first approximate MUS counter, AMUSIC, that does not rely on exhaustive MUS enumeration. AMUSIC, however, suffers from two shortcomings: the lack of exact estimates and limited scalability due to its reliance on 3-QBF solvers.

In this work, we address the two shortcomings of AMUSIC by designing the first exact MUS counter, CountMUST, that does not rely on exhaustive enumeration. CountMUST circumvents the need for 3-QBF solvers by reducing the problem of MUS counting to projected model counting. While projected model counting is #NP-hard, the past few years have witnessed the development of scalable projected model counters. An extensive empirical evaluation demonstrates that CountMUST successfully returns the MUS count for 1500 instances, while AMUSIC and enumeration-based techniques could only handle up to 833 instances.

# **1 Introduction**

Boolean formulas serve as a primary representation language to model the behaviour of systems and properties. Given an unsatisfiable Boolean formula F in Conjunctive Normal Form (CNF), i.e. a set of clauses F = {f<sub>1</sub>, f<sub>2</sub>, ..., f<sub>n</sub>}, a subset U ⊆ F is called a Minimal Unsatisfiable Subset (MUS) of F iff U is unsatisfiable and for every f ∈ U, U \ {f} is satisfiable.
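The definition can be made concrete with a tiny deletion-based MUS extractor. This is a brute-force sketch for illustration only (exponential satisfiability check, not one of the algorithms cited here), using DIMACS-style clauses:

```python
from itertools import product

def is_sat(clauses):
    """Brute-force SAT check over DIMACS-style clauses (lists of
    nonzero ints; negative = negated literal). Exponential in the
    number of variables; for illustration only."""
    variables = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, bits))
        if all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def deletion_mus(clauses):
    """Deletion-based MUS extraction: drop each clause whose removal
    keeps the set unsatisfiable; what remains is an MUS."""
    mus = list(clauses)
    for c in list(mus):
        rest = [d for d in mus if d != c]
        if not is_sat(rest):
            mus = rest
    return mus

# F = {x1, -x1, x1 v x2, -x2} is unsatisfiable; one of its MUSes is
# {-x1, x1 v x2, -x2} (another is {x1, -x1}).
F = [[1], [-1], [1, 2], [-2]]
print(deletion_mus(F))  # [[-1], [1, 2], [-2]]
```

Note that the MUS found depends on the deletion order; a formula may have exponentially many distinct MUSes, which is precisely why counting them by enumeration does not scale.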

MUSes serve as *explanations* or *reasons* for the unsatisfiability of F, and have consequently found applications in a wide variety of domains such as diagnosis [24,56], constrained sampling and counting [28], equivalence checking [20], and the like [1,2,25,30,47,64]. While the early applications relied on identifying a single [3,6,7,51,53] or enumerating multiple [4,10,12,39,41,52] MUSes, the rapid adoption of MUSes led researchers to investigate problem formulations, and their corresponding applications, that do not rely on explicit MUS identification. These include, e.g., computing the union of all MUSes [45], deciding whether a given clause belongs to an MUS [31], or counting the number of MUSes. In particular, counting MUSes has found many applications in the domain of diagnosis, where the MUS count can be used to compute various inconsistency metrics [25,29,48–50,65] for general propositional knowledge bases.

A straightforward, and for many years the only available, approach for counting MUSes is to simply enumerate them. However, there can be exponentially many MUSes w.r.t. $|F|$, and hence complete enumeration is often practically intractable [9,10,39,69]. Inspired by the development of model counting techniques in the context of SAT, which in its nascent stages also depended on complete model enumeration while contemporary techniques often need to explicitly identify just a fraction of the models, Bendík and Meel [13] recently initiated an investigation of counting MUSes without their explicit enumeration. In this context, they succeeded in developing a hashing-based approximate counter, AMUSIC [13], that provides the so-called PAC guarantees, also known as $(\varepsilon, \delta)$-guarantees, wherein the computed answer is within a $(1+\varepsilon)$-factor of the exact count with confidence at least $1 - \delta$. AMUSIC reduces the problem of MUS counting to logarithmically many calls to a $\Sigma_3^P$ oracle (a 3-QBF solver, in practice), wherein every $\Sigma_3^P$ query is constructed over a CNF formula conjuncted with XORs.

While AMUSIC achieved its stated goal of avoiding explicit enumeration, its scalability is significantly hampered by its reliance on a 3-QBF solver that can efficiently handle formulas conjuncted with XOR constraints. It is worth highlighting that the scalability of model counting techniques [17,60] in the context of SAT crucially relies on the availability of CryptoMiniSAT [61], a SAT solver with native support for CNF-XOR constraints. Despite significant advances in QBF solving over the years, scalability remains a formidable challenge for 3-QBF solvers, even more so when XOR constraints are involved. As such, AMUSIC could scale only to formulas involving a few hundred variables and clauses.

In this work, we focus on improving the scalability of MUS counting techniques. We begin our investigation with the observation of Bendík and Meel that their technique relied on a $\Sigma_3^P$ oracle even though the problem of finding an MUS is in $\mathrm{FP}^{\mathrm{NP}}$ [19,44]. Therefore, a natural direction is to investigate the design of an algorithmic framework that can circumvent reliance on oracles of such high complexity. In this context, we rely on the observation of Durand, Hermann, and Kolaitis [21] that counting problems whose search problems have $\mathrm{FP}^{\mathrm{NP}}$ complexity tend to lie in #NP (a class that contains #P). Such an observation is timely given the recent surge of interest in designing efficient techniques for projected model counting, which is #NP-hard. Therefore, one wonders: *is it possible to design an MUS counting technique that can take advantage of projected model counters?*

The primary contribution of this paper is an affirmative answer to the above question. We design a new algorithmic framework, CountMUST, that reduces the problem of MUS counting to two projected model counting queries. In particular, CountMUST constructs a wrapper $W$ and its remainder $R$ such that the number of MUSes of $F$ is $|W| - |R|$; i.e., the wrapper $W$ over-approximates the set of MUSes while the remainder contains the spurious, non-MUS, subsets of $F$ that emerge due to the over-approximation. We encode the wrapper $W$ and the remainder $R$ with Boolean formulas $\mathbb{W}$ and $\mathbb{R}$ such that the projected model counts of $\mathbb{W}$ and $\mathbb{R}$ (for a suitable projection set) equal $|W|$ and $|R|$, respectively. An interesting (and perhaps surprising) aspect of CountMUST is that it does not enumerate a single MUS in the process, in stark contrast to the design of AMUSIC, which relies on the enumeration of a *small* number of MUSes.

We discuss several strategies to construct wrappers (and their corresponding remainders) that are efficient to compute and are tight over-approximations of the set of MUSes. We conduct a detailed empirical analysis over 2553 instances and observe that CountMUST successfully returns the MUS count for 1500 instances, while AMUSIC and enumeration-based techniques could handle only up to 833 instances. We also observe an interesting complementarity between the exact and approximate MUS counting approaches: the scalability of AMUSIC is often impacted by the number of clauses and appears to be less impacted by the number of MUSes while, on the other hand, the scalability of CountMUST is less impacted by the number of clauses and appears to depend on the number of MUSes.

Finally, our empirical analysis showcases that our wrappers $W$ approximate the set of MUSes very tightly. Motivated by this tightness, we discuss several interesting applications of our framework: approximate MUS counting [13], MUS enumeration [5,40], MUS sampling, estimation of the minimum and maximum MUS cardinality [27,38], and MUS membership testing [31].

The rest of the paper is organized as follows. We introduce preliminaries in Sect. 2 and discuss related work in Sect. 3. We then present the primary technical contribution of our work in Sect. 4. We present the empirical evaluation in Sect. 5 and then discuss the implications of the tightness of our wrappers in Sect. 6. We finally conclude in Sect. 7.

# **2 Preliminaries and Problem Definition**

A Boolean formula $F$ is built over the Boolean values $\{1, 0\}$ and over a set $\mathit{Vars}(F)$ of Boolean variables connected via the standard logical operators: ∧, ∨, →, ↔, ¬. A literal is either a variable $x \in \mathit{Vars}(F)$ or its negation $\neg x$; $\mathit{Lits}(F)$ denotes the set of all literals used in $F$. Given a set $A$ of variables, a valuation $\pi : A \to \{1, 0\}$ assigns to each variable its Boolean value. $F[\pi]$ denotes the formula that emerges from $F$ by substituting every variable $x$ of $F$ that is in the domain of $\pi$ by $\pi(x)$; furthermore, trivial simplifications, e.g., $G \lor 0 = G$, $G \land 0 = 0$, $\neg 1 = 0$, $\neg 0 = 1$, are applied. Note that if $A \supseteq \mathit{Vars}(F)$, then $F[\pi]$ simplifies either to 1 or to 0. In the case when $A \supseteq \mathit{Vars}(F)$ and $F[\pi] = 1$, we call $\pi$ a *model* of $F$ and write $\pi \models F$; otherwise, when $F[\pi] = 0$, we write $\pi \not\models F$. A formula $F$ is *satisfiable* if it has a model; otherwise, $F$ is *unsatisfiable*. We write $\mathcal{M}_F$ to denote the set of all models of $F$. Moreover, given a set $A \subseteq \mathit{Vars}(F)$ of variables, we write $\mathcal{M}_F{\downarrow}A$ to denote the projection of $\mathcal{M}_F$ on $A$, and for every $\pi \in \mathcal{M}_F$, we write $\pi{\downarrow}A$ to denote the projection of $\pi$ on $A$. Finally, given two variable sets, $A = \{a_1, \dots, a_k\}$ and $B = \{b_1, \dots, b_k\}$, such that $A \subseteq \mathit{Vars}(F)$, we write $F[A/B]$ to denote the formula that originates from $F$ by substituting each variable $a_i \in A$ by $b_i \in B$.
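The semantics above can be sketched in a few lines of Python, as a brute-force illustration for tiny formulas only; the helper names `satisfies`, `models`, and `project` are ours, chosen for illustration.

```python
from itertools import product

def satisfies(clause, pi):
    # pi |= clause iff some literal of the clause is true under pi
    # (literals: v encodes x_v, -v encodes the negation of x_v)
    return any(pi[abs(l)] == (l > 0) for l in clause)

def models(cnf, variables):
    """All models of the CNF formula over the given variable list."""
    result = []
    for bits in product([False, True], repeat=len(variables)):
        pi = dict(zip(variables, bits))
        if all(satisfies(c, pi) for c in cnf):
            result.append(pi)
    return result

def project(model_list, A):
    """The projection of the model set on A: distinct restrictions pi|A."""
    return {tuple(sorted((v, pi[v]) for v in A)) for pi in model_list}

# F = (x1 or x2) and (not x1 or x2); its two models agree on x2 = 1,
# so the projection on {x2} has a single element.
F = [{1, 2}, {-1, 2}]
M = models(F, [1, 2])
print(len(M), len(project(M, {2})))
```

Running the sketch prints `2 1`: two models, one projected model.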

A formula in conjunctive normal form, shortly a *CNF formula*, is a conjunction of *clauses*, where a clause is a disjunction of literals. When suitable, a CNF formula can also be viewed as a multiset of clauses, where a clause is a set of literals; we use the two representations interchangeably based on the context. Throughout the whole text, let $F = \{f_1, \dots, f_n\}$ denote the input CNF formula of interest. Furthermore, capital letters, e.g., $S$, $K$, $N$, or blackboard bold letters, e.g., $\mathbb{W}$, $\mathbb{R}$, are used to denote other formulas, small letters, e.g., $f$, $f_1$, $f_i$, are used to denote clauses, and small letters, e.g., $x$, $x'$, $y$, are used to denote variables. Finally, given a set $X$, $\mathcal{P}(X)$ denotes the power-set of $X$, and $|X|$ denotes the cardinality of $X$.

**Definition 1 (MUS).** *A subset* $N$ *of* $F$ *is a* minimal unsatisfiable subset *(MUS) of* $F$ *iff* $N$ *is unsatisfiable and for every* $f \in N$ *it holds that* $N \setminus \{f\}$ *is satisfiable.*

**Definition 2 (MSS).** *A subset* $N$ *of* $F$ *is a* maximal satisfiable subset *(MSS) of* $F$ *iff* $N$ *is satisfiable and for every* $f \in F \setminus N$ *it holds that* $N \cup \{f\}$ *is unsatisfiable.*

**Definition 3 (MCS).** *A subset* $N$ *of* $F$ *is a* minimal correction subset *(MCS) of* $F$ *iff* $F \setminus N$ *is satisfiable and for every* $f \in N$ *it holds that* $F \setminus (N \setminus \{f\})$ *is unsatisfiable. Equivalently,* $N$ *is an MCS iff* $F \setminus N$ *is an MSS.*

Note that Boolean satisfiability is monotone w.r.t. the (clause) subset inclusion, i.e., all subsets of a satisfiable set of clauses are satisfiable. Consequently, all proper subsets of an MUS are in fact satisfiable, and, dually, all proper supersets of an MSS are unsatisfiable. Also, note that the minimality/maximality concept used here is a *set minimality/maximality* and not a *minimum/maximum cardinality*. Consequently, there can be up to $\binom{|F|}{\lfloor |F|/2 \rfloor}$ MUSes/MCSes/MSSes of $F$ (intuitively, this is the maximum number of pairwise incomparable subsets of $F$; see Sperner's theorem [62]). We write *maximum* and *minimum* MUS to denote an MUS with the maximum and the minimum cardinality, respectively. Note that there can also be exponentially many maximum and minimum MUSes. We write $\mathrm{MUS}_F$ to denote the set of all MUSes of $F$, and $\mathrm{SS}_F$ to denote the set of all satisfiable subsets of $F$.

*Example 1.* Let us demonstrate the concepts of MUSes, MSSes, and MCSes on an example. Assume that $F = \{f_1 = \{x_1\},\; f_2 = \{\neg x_1\},\; f_3 = \{x_2\},\; f_4 = \{\neg x_1, \neg x_2\}\}$. There are 2 MUSes: $\mathrm{MUS}_F = \{\{f_1, f_3, f_4\}, \{f_1, f_2\}\}$, 3 MSSes: $\{\{f_2, f_3, f_4\}, \{f_1, f_4\}, \{f_1, f_3\}\}$, and thus also 3 MCSes: $\{\{f_1\}, \{f_2, f_3\}, \{f_2, f_4\}\}$. For illustration, see Fig. 1.
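For an input as small as Example 1, Definitions 1–3 can be checked directly by brute force; the following sketch (helper names are ours) recovers exactly the 2 MUSes, 3 MSSes, and 3 MCSes listed above.

```python
from itertools import chain, combinations, product

def sat(clauses):
    """Brute-force satisfiability over the variables occurring in `clauses`."""
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        pi = dict(zip(vs, bits))
        if all(any(pi[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def subsets(s):
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

# Example 1: f1 = (x1), f2 = (not x1), f3 = (x2), f4 = (not x1 or not x2)
F = [frozenset({1}), frozenset({-1}), frozenset({2}), frozenset({-1, -2})]

def is_mus(N):  # Definition 1
    return not sat(N) and all(sat([g for g in N if g != f]) for f in N)

def is_mss(N):  # Definition 2
    return sat(N) and all(not sat(list(N) + [f]) for f in F if f not in N)

muses = [set(N) for N in subsets(F) if is_mus(N)]
msses = [set(N) for N in subsets(F) if is_mss(N)]
mcses = [set(F) - m for m in msses]   # Definition 3: an MCS is F \ MSS
print(len(muses), len(msses), len(mcses))
```

The sketch prints `2 3 3`, matching the example.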

In this paper, we are concerned with the following two problems.

**Name:** #MUS **Input:** A CNF formula $F$. **Output:** The number $|\mathrm{MUS}_F|$ of MUSes of $F$.

**Fig. 1.** Illustration of *P*(*F*) from the Example 1. Individual subsets are represented as bit-vectors, e.g., *{f*1*, f*2*}* is written as 1100. The subsets with a dashed border are the unsatisfiable subsets, and the others are satisfiable subsets. MUSes and MSSes are filled with a background colour.

**Name:** proj-#SAT **Input:** A formula $G$ and a set of variables $S \subseteq \mathit{Vars}(G)$. **Output:** The number $|\mathcal{M}_G{\downarrow}S|$ of models of $G$ projected on $S$.

Our goal is to solve the #MUS problem, and to do that, we propose a *strong subtractive reduction* to the proj-#SAT problem.

**Definition 4 (Strong Subtractive Reductions).** *[21] Let* $\Sigma$ *be an alphabet and let* $Q_1$ *and* $Q_2$ *be two binary relations over* $\Sigma$*. Let* $\#{\cdot}Q_1$ *and* $\#{\cdot}Q_2$ *represent the corresponding counting problems. Then,* $\#{\cdot}Q_1$ *reduces to* $\#{\cdot}Q_2$ *via a strong subtractive reduction if there exist polynomial-time computable functions* $f$ *and* $g$ *such that for every string* $z \in \Sigma^*$*:*

*1.* $Q_2(f(z)) \subseteq Q_2(g(z))$*, and*
*2.* $|Q_1(z)| = |Q_2(g(z))| - |Q_2(f(z))|$*.*

# **3 Related Work**

*MUS Counting.* A straightforward approach to count the MUSes is to simply enumerate them via an MUS enumeration algorithm, e.g., [4,5,8,10,12,39,41,52]. However, since there can be exponentially many MUSes w.r.t. $|F|$, complete enumeration is often practically intractable. An alternative approach to identify the MUS count is based on the so-called *minimal hitting set duality* between MUSes and MCSes, which states that every MUS is a *minimal hitting set* of the set of all MCSes [32,56]. Consequently, one can determine the MUS count by first identifying all MCSes and then counting their minimal hitting sets [40]. However, there can in general be exponentially many MCSes, which makes this approach also often practically intractable [11,52].

The study of MUS counting without relying on exhaustive enumeration was initiated just recently by Bendík and Meel [13], who proposed an $(\varepsilon, \delta)$-approximation scheme called AMUSIC. AMUSIC extends a prior hashing-based model counting framework [15,18,63] to MUS counting. Briefly, AMUSIC divides the power-set $\mathcal{P}(F)$ into *nCells* small *cells*, then picks one of the cells, counts the number *inCell* of MUSes in the cell, and estimates the overall MUS count as *nCells* × *inCell*. The approach requires logarithmically many calls to a $\Sigma_3^P$ oracle (a 3-QBF solver), wherein each query consists of a CNF formula conjuncted with XOR constraints. The lack of solvers with native support for such constraints presents the major hindrance to the scalability of AMUSIC.

It is worth remarking on a recent work by Bendík and Meel [14] that focuses on exact counting of maximal satisfiable subsets (MSSes). While MUSes and MSSes are closely related concepts, to the best of our knowledge, there does not exist any efficient reduction from MUS counting to MSS counting, or vice versa. Note that the best known upper bound on the problem of finding an MUS is $\mathrm{FP}^{\mathrm{NP}}$ [19], whereas for finding an MSS a tighter upper bound, $\mathrm{FP}^{\mathrm{NP}}[\mathrm{wit}, \log]$, is known [44], which suggests that counting MUSes is practically harder than counting MSSes. Whether the counter developed in this work can be employed to perform MSS counting is an interesting question for future work.

*Model Counting.* The complexity-theoretic study of model counting was initiated by Valiant [67], who showed that proj-#SAT is #P-complete when $S = \mathit{Vars}(G)$. Subsequently, Durand, Hermann, and Kolaitis [21] showed that the general problem of proj-#SAT is #NP-hard. A significant conceptual contribution of Durand et al. was to show the importance of subtractive reductions for problems in #NP; this idea has since been applied in reductions to projected counting [14].

Our work relies on the recent progress in the development of efficient projected model counters; in particular, we employ GANAK [59], a state-of-the-art *search-based* exact model counter; the entry based on GANAK won the projected model counting track of the 2020 Model Counting Competition [23]. Search-based model counters build on three core ideas: (1) for a formula $G$ and $x \in S$, we have $|\mathcal{M}_G{\downarrow}S| = |\mathcal{M}_{G(x \to 0)}{\downarrow}S| + |\mathcal{M}_{G(x \to 1)}{\downarrow}S|$; (2) if $G$ can be partitioned into subsets of clauses $\{C_1, C_2, \dots, C_k\}$ such that $\mathit{Vars}(C_i) \cap \mathit{Vars}(C_j) = \emptyset$ for all $i \neq j$, then we have $|\mathcal{M}_G{\downarrow}S| = \prod_{i=1}^{k} |\mathcal{M}_{C_i}{\downarrow}S|$; and (3) component caching is employed to store and reuse the counts of already-solved components. Consequently, the model count can often be determined by explicitly identifying just a fraction of all models. GANAK is built on top of earlier search-based model counters, sharpSAT [66] and Cachet [57,58].
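The three core ideas can be illustrated with a toy recursive projected counter. This is a naive sketch, not GANAK's actual algorithm, and all names (`pcount`, `components`, `condition`) are ours.

```python
from itertools import product

def sat(clauses):
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        pi = dict(zip(vs, bits))
        if all(any(pi[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def condition(clauses, lit):
    """G(x -> value): drop satisfied clauses, shrink falsified literals."""
    out = []
    for c in clauses:
        if lit in c:
            continue                      # clause already satisfied
        if -lit in c:
            c = c - {-lit}                # this literal is falsified
            if not c:
                return None               # empty clause: no models here
        out.append(c)
    return out

def components(clauses):
    """Greedy partition into variable-disjoint groups of clauses (idea 2)."""
    groups = []
    for c in clauses:
        vs = {abs(l) for l in c}
        touching = [g for g in groups
                    if vs & {abs(l) for cc in g for l in cc}]
        merged = [c] + [cc for g in touching for cc in g]
        groups = [g for g in groups if g not in touching] + [merged]
    return groups

cache = {}                                # idea (3): cache solved components

def pcount(clauses, S):
    """|M_G projected on S|; variables outside S are existential."""
    if clauses is None:
        return 0
    key = (frozenset(clauses), frozenset(S))
    if key in cache:
        return cache[key]
    occ = {abs(l) for c in clauses for l in c}
    free = [v for v in S if v not in occ]     # unconstrained projection vars
    if free:
        res = 2 ** len(free) * pcount(clauses, [v for v in S if v in occ])
    else:
        groups = components(clauses)
        if len(groups) > 1:                   # idea (2): multiply components
            res = 1
            for g in groups:
                gv = {abs(l) for c in g for l in c}
                res *= pcount(g, [v for v in S if v in gv])
        elif not S:
            res = 1 if sat(clauses) else 0    # only existential vars remain
        else:
            x = S[0]                          # idea (1): split on x in S
            res = (pcount(condition(clauses, -x), S[1:])
                   + pcount(condition(clauses, x), S[1:]))
    cache[key] = res
    return res

# G = (x1 or x2) and (x3 or y), projected on S = {x1, x2, x3}; y (var 4)
# is existential, so the two components contribute 3 * 2 = 6.
G = [frozenset({1, 2}), frozenset({3, 4})]
print(pcount(G, [1, 2, 3]))
```

The two clauses share no variables, so the counter multiplies the per-component counts instead of exploring all $2^3$ projected assignments in one search tree.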

# **4 MUS Counting via a Projected Model Counter**

We now gradually introduce several subtractive reductions of the MUS counting problem to projected model counting, starting with the base idea in Sect. 4.1 and followed by the particular reductions in Sects. 4.2–4.11.

#### **4.1 Basic MUS Counting Idea**

**Definition 5 (wrapper and remainder).** *A set* $W$ *of subsets of* $F$ *is a* wrapper *iff* $\mathrm{MUS}_F \subseteq W \subseteq \mathrm{MUS}_F \cup \mathrm{SS}_F$*. Furthermore, the* remainder *of* $W$ *is the set* $R = W \cap \mathrm{SS}_F$*.*

**Proposition 1.** *Let* $W$ *be a wrapper and* $R$ *its corresponding remainder. Then* $|\mathrm{MUS}_F| = |W| - |R|$*.*

*Proof.* Since every MUS is unsatisfiable, $\mathrm{MUS}_F \cap \mathrm{SS}_F = \emptyset$, and since $R = W \cap \mathrm{SS}_F$, we get $\mathrm{MUS}_F \cap R = \emptyset$. Moreover, $\mathrm{MUS}_F \subseteq W$ and every element of $W \setminus \mathrm{MUS}_F$ belongs to $\mathrm{SS}_F$ and hence to $R$; thus $W = \mathrm{MUS}_F \cup R$ and $|W| = |\mathrm{MUS}_F| + |R|$.

Our approach to determine the MUS count $|\mathrm{MUS}_F|$ consists of the following steps. First, we define a wrapper $W$ and its corresponding remainder $R$. Subsequently, we encode the wrapper $W$ with a Boolean formula $\mathbb{W}$ such that each projected model of $\mathbb{W}$ (for a suitable projection set) corresponds to an element of $W$. Similarly, we construct a Boolean formula $\mathbb{R}$ such that each projected model of $\mathbb{R}$ corresponds to an element of the remainder $R$. Finally, we employ a projected model counter to determine the projected model counts of $\mathbb{W}$ and $\mathbb{R}$, i.e., $|W|$ and $|R|$, and hence we obtain the MUS count $|\mathrm{MUS}_F| = |W| - |R|$.
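At the level of sets, Proposition 1 can be replayed on Example 1 by brute force (a sketch for tiny inputs only; helper names are ours). Here the wrapper is taken to be all satisfiable subsets plus all MUSes, with the satisfiable subsets as its remainder.

```python
from itertools import chain, combinations, product

def sat(clauses):
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        pi = dict(zip(vs, bits))
        if all(any(pi[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def subsets(s):
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def is_mus(N):
    return not sat(N) and all(sat([g for g in N if g != f]) for f in N)

# Example 1: F = {x1, not x1, x2, not x1 or not x2}
F = [frozenset({1}), frozenset({-1}), frozenset({2}), frozenset({-1, -2})]

SS = [set(N) for N in subsets(F) if sat(N)]       # all satisfiable subsets
MUS = [set(N) for N in subsets(F) if is_mus(N)]
W = SS + MUS    # wrapper SS_F ∪ MUS_F (a disjoint union)
R = SS          # its remainder W ∩ SS_F = SS_F
print(len(W), len(R), len(W) - len(R))
```

The sketch prints `13 11 2`: the difference recovers the two MUSes of Example 1 without ever listing them.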

In the following, we first describe in Sect. 4.2 how to build a simple wrapper $W_1$ and its remainder $R_1$ and how to encode them via Boolean formulas $\mathbb{W}_1$ and $\mathbb{R}_1$, respectively. Subsequently, in Sects. 4.3–4.11, we propose several additional wrappers (and their remainders) that improve upon the base wrapper $W_1$ by exploiting various observations about MUSes. Finally, in Sect. 4.12, we show how to combine the individual wrappers.

# **4.2** $W_1$ **- the Base Wrapper and Its Remainder**

Our base wrapper, $W_1$, is simply the set of all satisfiable subsets and all MUSes of $F$, i.e., $W_1 = \mathrm{SS}_F \cup \mathrm{MUS}_F$. The corresponding remainder $R_1$ is thus the set $\mathrm{SS}_F$ of all satisfiable subsets of $F$. In the following, we describe how to encode the wrapper $W_1$ and the remainder $R_1$ via Boolean formulas $\mathbb{W}_1$ and $\mathbb{R}_1$ whose projected models correspond to the elements of $W_1$ and $R_1$, respectively.

Let us start with encoding the remainder $R_1 = \mathrm{SS}_F$. Given the unsatisfiable formula $F = \{f_1, \dots, f_n\}$, we introduce a set $\mathcal{A} = \{a_1, \dots, a_n\}$ of *activation variables*. Note that every valuation $\pi$ of $\mathcal{A}$ maps one-to-one to an *activated* subset $\pi_{\mathcal{A},F}$ of $F$ defined as $\pi_{\mathcal{A},F} = \{f_i \in F \mid \pi(a_i) = 1\}$. Using the activation variables, we build the formula $\mathbb{R}_1$ as follows:

$$\mathbb{R}\_1 = \bigwedge\_{f\_i \in F} a\_i \to f\_i \tag{1}$$

Intuitively, if we set $a_i$ to 0, then the formula $a_i \to f_i$ is trivially satisfied, and if we set $a_i$ to 1, then $f_i$ has to be satisfied to satisfy $a_i \to f_i$. Hence, the models of $\mathbb{R}_1$ projected on $\mathcal{A}$ map to the satisfiable subsets of $F$; formally:

**Proposition 2.** *For every valuation* $\pi$ *of* $\mathcal{A}$*,* $\pi \in \mathcal{M}_{\mathbb{R}_1}{\downarrow}\mathcal{A}$ *iff* $\pi_{\mathcal{A},F} \in R_1 = \mathrm{SS}_F$*. Consequently,* $|\mathcal{M}_{\mathbb{R}_1}{\downarrow}\mathcal{A}| = |R_1|$*.*

Let us note that the concept of activation variables (alternatively, *relaxation variables*) and the idea behind the formula $\mathbb{R}_1$ are not novel; they have appeared in several MUS/MSS/MCS-related studies such as [14,31,42]. However, we are the first to apply them in the context of MUS counting.
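To make Eq. (1) concrete, here is Example 1 with the activation variables $a_1, \dots, a_4$ encoded as fresh variables 3–6 (our own numbering): each $a_i \to f_i$ becomes the clause $\neg a_i \lor f_i$, and counting the models of $\mathbb{R}_1$ projected on the activation variables yields the number of satisfiable subsets.

```python
from itertools import product

# Example 1 over x1 = 1, x2 = 2; activation variables a1..a4 = 3..6
F = [frozenset({1}), frozenset({-1}), frozenset({2}), frozenset({-1, -2})]
A = [3, 4, 5, 6]
R1 = [frozenset({-a}) | f for a, f in zip(A, F)]   # a_i -> f_i as a clause

vs = [1, 2] + A
proj = set()
for bits in product([False, True], repeat=len(vs)):
    pi = dict(zip(vs, bits))
    if all(any(pi[abs(l)] == (l > 0) for l in c) for c in R1):
        proj.add(tuple(pi[a] for a in A))          # pi restricted to A
print(len(proj))
```

The sketch prints `11`, i.e., $|\mathrm{SS}_F|$ for Example 1: the all-zero activation (the empty, trivially satisfiable subset) is a projected model, while activating all four clauses is not.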

To build a formula $\mathbb{W}_1$ that represents the wrapper $W_1 = \mathrm{SS}_F \cup \mathrm{MUS}_F$, we proceed similarly, i.e., we build $\mathbb{W}_1$ over the activation variables $\mathcal{A}$ in such a way that a valuation $\pi$ of $\mathcal{A}$ is a projected model of $\mathbb{W}_1$ iff $\pi_{\mathcal{A},F} \in W_1$. A straightforward approach to encode $W_1$ is to directly express that we are interested either in satisfiable subsets or in MUSes of $F$. Such an encoding might look like $\mathbb{R}_1(\mathcal{A}) \lor \mathrm{isMUS}(\mathcal{A})$, where $\mathbb{R}_1(\mathcal{A})$ is the formula from Eq. 1 encoding that $\pi_{\mathcal{A},F}$ is satisfiable and $\mathrm{isMUS}(\mathcal{A})$ is a formula encoding that $\pi_{\mathcal{A},F}$ is an MUS. However, encoding that a set $S$ is an MUS is quite expensive, since one has to express that all proper subsets of $S$ are satisfiable and that $S$ is unsatisfiable (Definition 1). In particular, encoding that a set $S$ is unsatisfiable requires considering all the exponentially many valuations of $\mathit{Vars}(S)$. Several MUS-related studies used various QBF encodings of the property of being an MUS, e.g., [13,31]. In particular, to express that a set $S$ is an MUS, one can use the following, intuitively described, ∀∃-QBF encoding: "**for every** valuation $\tau$ of $\mathit{Vars}(S)$, the valuation $\tau$ models $\neg S$ (i.e., $S$ is unsatisfiable), **and for every** proper subset $S'$ of $S$ there **exists** a valuation $\tau'$ of $\mathit{Vars}(S')$ that satisfies $S'$". One could convert the ∀∃-QBF encoding into a plain Boolean formula by explicitly enumerating all possible valuations of $\mathit{Vars}(S)$ and all subsets of $S$; however, this yields an exponentially large, and thus intractable, formula. Hence, instead of directly expressing that every element of the wrapper $W_1$ is either a satisfiable subset or an MUS of $F$, we propose another approach based on the novel concept of an *evidence*.

**Definition 6 (evidence).** *Let* $A$ *be a subset of* $F = \{f_1, \dots, f_n\}$*. An* evidence *for* $A$ *is a tuple* $(\rho_1, \dots, \rho_n)$ *such that for every* $1 \le i \le n$ *with* $f_i \in A$ *it holds that:*

*1.* $\rho_i : \mathit{Vars}(F) \to \{1, 0\}$ *is a truth assignment, and*
*2.* $\rho_i \models A \setminus \{f_i\}$*.*

Crucially, we observe the following:

**Proposition 3.** *For every subset* $A$ *of* $F$ *it holds that* $A \in \mathrm{SS}_F \cup \mathrm{MUS}_F = W_1$ *iff there exists an evidence for* $A$*.*
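Proposition 3 can be sanity-checked by brute force on Example 1 (a sketch; helper names are ours). An evidence exists iff for every $f_i \in A$ the set $A \setminus \{f_i\}$ is satisfiable, which is exactly the condition tested below.

```python
from itertools import chain, combinations, product

def sat(clauses):
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        pi = dict(zip(vs, bits))
        if all(any(pi[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def subsets(s):
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def is_mus(N):
    return not sat(N) and all(sat([g for g in N if g != f]) for f in N)

def has_evidence(N):
    # an evidence exists iff every N \ {f_i} with f_i in N is satisfiable
    return all(sat([g for g in N if g != f]) for f in N)

F = [frozenset({1}), frozenset({-1}), frozenset({2}), frozenset({-1, -2})]
checked = sum(1 for N in subsets(F)
              if has_evidence(N) == (sat(N) or is_mus(N)))
print(checked)
```

The sketch prints `16`: all subsets of Example 1 agree with Proposition 3, since an unsatisfiable non-MUS contains a proper unsatisfiable subset and hence some clause whose removal leaves the set unsatisfiable.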

Our formula $\mathbb{W}_1$ (Eq. 2) that encodes the wrapper $W_1$ captures every set $A \subseteq F$ for which there exists an evidence $(\rho_1, \dots, \rho_n)$. To represent the set $A$, we use the activation variables $\mathcal{A} = \{a_1, \dots, a_n\}$. To represent the truth assignments $\rho_1, \dots, \rho_n$, we introduce variable sets $I_1, \dots, I_n$, where $I_i$ is a fresh copy of $\mathit{Vars}(F)$ for every $i \in \{1, \dots, n\}$.

$$\mathbb{W}\_1 = \bigwedge\_{a\_i \in \mathcal{A}} a\_i \to \Bigl( \bigwedge\_{j \in \{1, \ldots, n\} \setminus \{i\}} \bigl(a\_j \to f\_j[\mathit{Vars}(F)/I\_i]\bigr) \Bigr) \tag{2}$$

Intuitively, let $\pi'$ be a valuation of $\mathit{Vars}(\mathbb{W}_1)$ and $\pi'_{\mathcal{A},F} = \{f_i \in F \mid \pi'(a_i) = 1\}$ the subset of $F$ activated by $\mathcal{A}$. For every activated clause $f_i \in \pi'_{\mathcal{A},F}$, the formula expresses that $\pi'{\downarrow}I_i$ is a model of $\pi'_{\mathcal{A},F} \setminus \{f_i\}$ in which the variable set $\mathit{Vars}(F)$ is substituted by $I_i$.

**Proposition 4.** *For every valuation* $\pi$ *of* $\mathcal{A}$*,* $\pi \in \mathcal{M}_{\mathbb{W}_1}{\downarrow}\mathcal{A}$ *iff* $\pi_{\mathcal{A},F} \in W_1 = \mathrm{SS}_F \cup \mathrm{MUS}_F$*. Consequently,* $|\mathcal{M}_{\mathbb{W}_1}{\downarrow}\mathcal{A}| = |W_1|$*.*

Based on Propositions 2 and 4, we can now employ a projected model counter to obtain the model counts $|\mathcal{M}_{\mathbb{W}_1}{\downarrow}\mathcal{A}|$ and $|\mathcal{M}_{\mathbb{R}_1}{\downarrow}\mathcal{A}|$, which yields $|W_1|$ and $|R_1|$, and hence also $|\mathrm{MUS}_F|$ (Proposition 1). The concern here, however, is the tractability of obtaining the model counts. There are mainly two criteria that affect the practical tractability of projected model counting. One criterion is the number of projected models, i.e., the cardinality of the wrapper (and of the remainder), and the other is the cardinality of the projection set, i.e., $|\mathcal{A}|$. The wrapper $W_1$ is not very efficient w.r.t. these two criteria. In particular, $W_1$ contains all satisfiable subsets of $F$, and there are often exponentially many satisfiable subsets of $F$ w.r.t. $|F|$. Therefore, in the following, we present nine additional wrappers, $W_2, \dots, W_{10}$, and their corresponding remainders. Each of the wrappers captures a property of MUSes that allows us to provide a better description of MUSes, and hence to reduce the cardinality of the wrapper and/or the cardinality of the projection set. As in the case of $W_1$, we use the activation variables $\mathcal{A}$ to represent the elements of the wrappers/remainders. Moreover, each of the following wrappers $W_i$ is encoded by a Boolean formula $\mathbb{W}_i$ such that for every valuation $\pi$ of $\mathcal{A}$, $\pi \in \mathcal{M}_{\mathbb{W}_i}{\downarrow}\mathcal{A}$ iff $\pi_{\mathcal{A},F} \in W_i$ (and similarly for the remainders).

# **4.3** $W_2$ **- the Intersection of MUSes**

Our second wrapper, $W_2$, is based on a simple observation: every MUS of $F$ has to contain the intersection $\mathrm{IMUS}_F$ of all MUSes of $F$. Hence, we define the wrapper as $W_2 = \{N \in W_1 \mid N \supseteq \mathrm{IMUS}_F\}$ and encode it via $\mathbb{W}_2$ as follows:

$$\mathbb{W}\_2 = \mathbb{W}\_1 \land \bigwedge\_{f\_i \in \mathrm{IMUS}\_F} a\_i \tag{3}$$

**Proposition 5.** *For every valuation* $\pi$ *of* $\mathcal{A}$*,* $\pi \in \mathcal{M}_{\mathbb{W}_2}{\downarrow}\mathcal{A}$ *iff* $\pi_{\mathcal{A},F} \in W_2$*. Consequently,* $|\mathcal{M}_{\mathbb{W}_2}{\downarrow}\mathcal{A}| = |W_2|$*.*

The remainder $R_2$ of $W_2$ is, by Definition 5, the set $W_2 \cap \mathrm{SS}_F$. To build the formula $\mathbb{R}_2$ that encodes $R_2$, observe that we already have an encoding of the set $W_2$ (Eq. 3), and we also have an encoding of the set $\mathrm{SS}_F$, since $\mathrm{SS}_F = R_1$. Hence, we can build $\mathbb{R}_2$ as a conjunction of the two encodings: $\mathbb{R}_2 = \mathbb{W}_2 \land \mathbb{R}_1$. Note that this construction of the remainder, and of the formula that encodes it, is purely mechanical and does not involve any specific property of the particular wrapper. Therefore, for every wrapper $W_i$ and its encoding $\mathbb{W}_i$ presented in the following sections, we define the remainder as $R_i = W_i \cap R_1$ and encode it as $\mathbb{R}_i = \mathbb{W}_i \land \mathbb{R}_1$. Proposition 6 witnesses the soundness of this construction:

**Proposition 6.** *For every valuation* $\pi$ *of* $\mathcal{A}$*,* $\pi \in \mathcal{M}_{\mathbb{R}_i}{\downarrow}\mathcal{A}$ *iff* $\pi_{\mathcal{A},F} \in R_i$*. Consequently,* $|\mathcal{M}_{\mathbb{R}_i}{\downarrow}\mathcal{A}| = |R_i|$*.*

This section's final question is how to compute the intersection $\mathrm{IMUS}_F$. It is well known that a clause $f \in F$ belongs to $\mathrm{IMUS}_F$ iff $F \setminus \{f\}$ is satisfiable (see, e.g., [32,40,56]). Hence, a straightforward way would be to perform such a satisfiability check for each $f \in F$; however, that might be very expensive. Fortunately, a quite efficient algorithm to compute $\mathrm{IMUS}_F$ was recently proposed [13]; it usually requires only a few satisfiability checks, so we implemented this algorithm and use it while building the wrapper.
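On Example 1, the naive per-clause check runs as follows (a brute-force sketch, names ours; the algorithm of [13] avoids doing one SAT call per clause). The result agrees with the two MUSes $\{f_1, f_2\}$ and $\{f_1, f_3, f_4\}$, whose intersection is $\{f_1\}$.

```python
from itertools import product

def sat(clauses):
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        pi = dict(zip(vs, bits))
        if all(any(pi[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

F = [frozenset({1}), frozenset({-1}), frozenset({2}), frozenset({-1, -2})]

# f belongs to IMUS_F iff F \ {f} is satisfiable
IMUS = [f for f in F if sat([g for g in F if g != f])]
print(IMUS)
```

Only $f_1 = \{x_1\}$ survives the check: removing any other clause leaves the contradictory pair $\{f_1, f_2\}$ or the MUS $\{f_1, f_3, f_4\}$ intact.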

# **4.4** $W_3$ **- The Union of MUSes**

Our next wrapper, $W_3$, is very similar to the previous one. Observe that every MUS of $F$ is necessarily a subset of the union $\mathrm{UMUS}_F$ of all MUSes of $F$. Consequently, a weaker observation also holds: every MUS of $F$ is a subset of every over-approximation of $\mathrm{UMUS}_F$. We define the wrapper as $W_3 = \{N \in W_1 \mid N \subseteq U\}$, where $U$ is either the exact union $\mathrm{UMUS}_F$ or an over-approximation of it ($U \supseteq \mathrm{UMUS}_F$). Details on obtaining $U$ are provided below. The encoding $\mathbb{W}_3$ of $W_3$ is analogous to $\mathbb{W}_2$:

$$\mathbb{W}\_3 = \mathbb{W}\_1 \land \bigwedge\_{f\_i \notin U} \neg a\_i \tag{4}$$

**Proposition 7.** *For every valuation* $\pi$ *of* $\mathcal{A}$*,* $\pi \in \mathcal{M}_{\mathbb{W}_3}{\downarrow}\mathcal{A}$ *iff* $\pi_{\mathcal{A},F} \in W_3$*. Consequently,* $|\mathcal{M}_{\mathbb{W}_3}{\downarrow}\mathcal{A}| = |W_3|$*.*

The computation of the union $\mathrm{UMUS}_F$ has been examined in two recent studies [13,45] that provided two different approaches for this task. Unfortunately, due to the hardness of the problem, both studies showed that the proposed approaches can usually handle only relatively small input formulas. Namely, the approach from [13] requires $O(|F|)$ calls to a $\Sigma_2^P$ oracle. Fortunately, it is often possible to cheaply compute a good over-approximation of $\mathrm{UMUS}_F$ via the concepts of *autark variables* and the *lean kernel*. Briefly, a subset $V$ of $\mathit{Vars}(F)$ is an *autark set* [46] of $F$ iff there exists a valuation $\chi$ of $V$ such that for every clause $f \in F$ that contains a variable from $V$ it holds that $\chi \models f$. Since the union of two autark sets is also an autark set, there exists a unique maximum autark set [33,34]. The *lean kernel* $K$ of $F$ is the set of clauses that do not use any variable from the maximum autark set. It has been shown (e.g., [33,34]) that the lean kernel is an over-approximation of $\mathrm{UMUS}_F$. Hence, when building the wrapper $W_3$, we use the lean kernel $K$ as the over-approximation $U$ of $\mathrm{UMUS}_F$, i.e., $W_3 = \{N \in W_1 \mid N \subseteq K\}$. Several algorithms have been proposed to compute the lean kernel, e.g., [36,43]; we have implemented the algorithm by Marques-Silva et al. [43], using the MaxSAT solver UWrMaxSat [54] as a back-end.
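The autark-set and lean-kernel notions can be checked by brute force on a toy input (a sketch only; names and the extended example are ours, not the paper's). We extend Example 1 with a fifth clause $f_5 = (x_3)$: the set $\{x_3\}$ is autark via $x_3 = 1$, so $f_5$ falls outside the lean kernel, and indeed outside every MUS.

```python
from itertools import combinations, product

F = [frozenset({1}), frozenset({-1}), frozenset({2}), frozenset({-1, -2}),
     frozenset({3})]                      # Example 1 plus f5 = (x3)
VARS = [1, 2, 3]

def touches(c, V):
    return {abs(l) for l in c} & set(V)

def chi_sat(c, chi):
    # chi (a partial valuation) satisfies c via one of its own variables
    return any(abs(l) in chi and chi[abs(l)] == (l > 0) for l in c)

autarks = []
for r in range(len(VARS) + 1):
    for V in combinations(VARS, r):
        for bits in product([False, True], repeat=r):
            chi = dict(zip(V, bits))
            if all(chi_sat(c, chi) for c in F if touches(c, V)):
                autarks.append(set(V))
                break

max_autark = max(autarks, key=len)        # unions of autark sets are autark
lean_kernel = [c for c in F if not touches(c, max_autark)]
print(max_autark, len(lean_kernel))
```

Here the maximum autark set is $\{x_3\}$ and the lean kernel consists of the four clauses of Example 1, which equals $\mathrm{UMUS}_F$ in this case; in general the kernel is only an over-approximation.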

A few words are in order regarding the effect of the two wrappers, $W_2$ and $W_3$, on the tractability of projected model counting. Observe that in both cases ($\mathbb{W}_2$ and $\mathbb{W}_3$), we fix the values of some variables from the projection set $\mathcal{A}$. Hence, before passing the formulas to the projected model counter, we first propagate the fixed values of $\mathcal{A}$ to simplify the formulas. By doing so, we effectively reduce the size of the projection set $\mathcal{A}$ by $|\mathrm{IMUS}_F|$ and by $|F \setminus K|$, respectively.

Finally, let us note that the fact that every MUS is a subset of the union of all MUSes and a superset of their intersection is well known and has already been exploited in various ways in several MUS-related studies (see, e.g., [10,11,45]). In particular, the approximate MUS counting algorithm AMUSIC [13] utilizes UMUS<sub>F</sub> in its preprocessing phase, and IMUS<sub>F</sub> to simplify 3-QBF queries while searching for MUSes.

# **4.5 W<sub>4</sub> - Minimum MUS Cardinality**

Assume we can somehow compute the cardinality of a minimum MUS, or at least a lower bound minMUS on it. Knowing this number, we define our next wrapper as W<sub>4</sub> = {N ∈ W<sub>1</sub> | |N| ≥ minMUS}. To encode this wrapper via a formula W<sub>4</sub>, we employ a Boolean cardinality constraint atLeast(A, minMUS) expressing that at least minMUS variables from A are set to 1:

$$\mathbb{W}_4 = \mathbb{W}_1 \land \texttt{atLeast}(\mathcal{A}, \texttt{minMUS}) \tag{5}$$
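For intuition, here is a tiny (exponential) *binomial* encoding of the atLeast constraint together with a brute-force equivalence check. Production encodings are polynomial (e.g., sequential counters or totalizers), so treat this purely as an illustrative sketch of the semantics:

```python
from itertools import combinations, product

def at_least(lits, k):
    """Naive binomial CNF encoding of atLeast(lits, k): at least k of the n
    literals are true iff every subset of n-k+1 literals contains a true one."""
    n = len(lits)
    if k <= 0:
        return []                         # trivially satisfied
    return [list(c) for c in combinations(lits, n - k + 1)]

def check_equivalent(lits, k):
    """Brute-force check of the encoding against the intended semantics
    (assumes positive literals, as for activation variables)."""
    cnf = at_least(lits, k)
    for bits in product([False, True], repeat=len(lits)):
        v = dict(zip(lits, bits))
        if all(any(v[l] for l in c) for c in cnf) != (sum(bits) >= k):
            return False
    return True

assert at_least([1, 2, 3], 2) == [[1, 2], [1, 3], [2, 3]]
assert check_equivalent([1, 2, 3, 4], 2)
```

An atMost constraint (used by W<sub>5</sub> below) is the dual: at most k of the literals are true iff every (k+1)-subset contains a false one.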

**Proposition 8.** *For every valuation* π *of* A*,* π ∈ M<sub>W4↓A</sub> *iff* π<sub>A,F</sub> ∈ W<sub>4</sub>*. Consequently,* |M<sub>W4↓A</sub>| = |W<sub>4</sub>|*.*

Several algorithms have been proposed for computing an MUS of minimum cardinality, e.g. [26,27,38]. However, since this task is in FP<sup>Σ<sup>P</sup><sub>2</sub></sup> [27,37], computing a minimum MUS exactly is too expensive for our scenario (as we also experienced empirically). Instead, we propose an approach for cheaply computing a lower bound on the minimum MUS cardinality.

Our method is based on a well-known relationship between MUSes and MCSes called *minimal hitting set duality* [32,56]. Given a collection C of sets, a set X is a *hitting set* of C iff C ∩ X ≠ ∅ for every C ∈ C. Furthermore, a hitting set X of C is *minimal* if none of its proper subsets is a hitting set. The duality states that a set N is an MUS of F iff N is a minimal hitting set of the set MCS<sub>F</sub> of all MCSes of F. Dually, a set M is an MCS of F iff M is a minimal hitting set of MUS<sub>F</sub>. Consequently, one can identify all the MCSes and then compute a *minimum* minimal hitting set of them to obtain an MUS of minimum cardinality. However, there can be exponentially many MCSes of F, and thus their complete enumeration is often practically intractable. Our approach to obtaining a lower bound on the minimum MUS cardinality is the following. First, we employ a recent MCS enumeration algorithm, RIME [11], to generate a subset M of MCS<sub>F</sub>. Subsequently, we compute a minimum minimal hitting set of M and use its cardinality as the lower bound minMUS while building the wrapper W<sub>4</sub>. Since M ⊆ MCS<sub>F</sub>, every hitting set of MCS<sub>F</sub> is also a hitting set of M, and hence minMUS is indeed a sound lower bound on the cardinality of a minimum hitting set of MCS<sub>F</sub>, i.e., on the minimum MUS cardinality.
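The hitting-set step can be illustrated by naive enumeration over candidate sizes; this brute force is only a stand-in for the dedicated computation, since the paper describes the step at a high level only:

```python
from itertools import combinations

def min_hitting_set_size(mcses):
    """Cardinality of a minimum hitting set of `mcses` (a list of sets of
    clause indices), found by brute force over candidate set sizes."""
    universe = sorted(set().union(*mcses)) if mcses else []
    for k in range(len(universe) + 1):
        for cand in combinations(universe, k):
            if all(set(cand) & m for m in mcses):
                return k

# Three MCSes; {2, 3} hits all of them and no single element does.
assert min_hitting_set_size([{1, 2}, {2, 3}, {3, 4}]) == 2
```

With only a subset M of the MCSes available, the returned value is a sound lower bound minMUS on the minimum MUS cardinality, exactly as argued above.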

Let us also briefly describe the algorithm for computing a minimum MUS by Ignatiev et al. [27], since it works on a principle similar to ours. Their algorithm iteratively maintains a set *kMCSes* of known MCSes; initially *kMCSes* = ∅. In each iteration, the algorithm computes a minimum minimal hitting set X of *kMCSes* and checks X for satisfiability. If X is unsatisfiable, it is guaranteed to be a minimum MUS. Otherwise, X is enlarged to an MSS using a single-MSS-extraction subroutine, the complement of the MSS (i.e., an MCS) is added to *kMCSes*, and the algorithm proceeds with the next iteration. Observe that one can also terminate their approach after a given time limit and use the cardinality of the last computed X as a lower bound on the minimum MUS cardinality. The main difference between our approach and theirs is that we employ a dedicated MCS enumerator in the first step and then compute just a single minimum minimal hitting set, whereas they alternate single MCS extraction with minimum minimal hitting set computation.

# **4.6 W<sub>5</sub> - Maximum MUS Cardinality**

Assuming that we can somehow compute an upper bound maxMUS on the maximum cardinality of an MUS of F, we define our next wrapper as W<sub>5</sub> = {N ∈ W<sub>1</sub> | |N| ≤ maxMUS}. As in the case of W<sub>4</sub>, to build the formula W<sub>5</sub> that encodes W<sub>5</sub>, we introduce a Boolean cardinality constraint atMost(A, maxMUS) expressing that at most maxMUS variables from A are set to 1:

$$\mathbb{W}_5 = \mathbb{W}_1 \land \texttt{atMost}(\mathcal{A}, \texttt{maxMUS}) \tag{6}$$

**Proposition 9.** *For every valuation* π *of* A*,* π ∈ M<sub>W5↓A</sub> *iff* π<sub>A,F</sub> ∈ W<sub>5</sub>*. Consequently,* |M<sub>W5↓A</sub>| = |W<sub>5</sub>|*.*

We are not aware of any prior work on computing the maximum MUS cardinality, nor of a reasonable approach for computing even an upper bound on it. Hence, we propose a custom approach to compute such an upper bound maxMUS. The base idea is to exploit our concept of wrappers:

**Proposition 10.** *Let* W *be a wrapper, i.e.,* W ⊆ MUS<sub>F</sub> ∪ SS<sub>F</sub>*,* A *the set of activation variables, and* W *a formula such that for every valuation* π *of* A*,* π ∈ M<sub>W↓A</sub> *iff* π<sub>A,F</sub> ∈ W*. Furthermore, let* maxOnes = max({*ones*(π) | π ∈ M<sub>W↓A</sub>}) *where ones*(π) = |{a<sub>i</sub> ∈ A | π(a<sub>i</sub>) = 1}|*. Then* maxOnes *is an upper bound on the maximum MUS cardinality.*

We use maxOnes as the value maxMUS while constructing the wrapper W<sub>5</sub>. Any of the wrappers presented in this paper, together with its encoding, can be used as W and W, respectively. To determine the value maxOnes, we define a partial MaxSAT problem on the formula W ∧ ⋀<sub>a<sub>i</sub>∈A</sub> a<sub>i</sub>, where the clauses of W are hard and the unit clauses a<sub>i</sub>, a<sub>i</sub> ∈ A, are soft. To solve the problem, we employ the MaxSAT solver UWrMaxSat [54].
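On a toy scale, the role of the MaxSAT query can be mimicked by brute force over all assignments to the activation variables. The predicate `holds` below is a hypothetical stand-in for "this assignment is a projected model of W"; the actual tool delegates this optimization to UWrMaxSat:

```python
from itertools import product

def max_ones(num_activation_vars, holds):
    """Brute-force maxOnes: the maximum number of activation variables set
    to 1 over all assignments accepted by the predicate `holds`."""
    best = -1
    for bits in product([0, 1], repeat=num_activation_vars):
        if holds(bits):
            best = max(best, sum(bits))
    return best

# Toy stand-in: every assignment except all-ones is a "model".
assert max_ones(4, lambda b: sum(b) != 4) == 3
```

By Proposition 10, the returned value is a sound upper bound maxMUS, since every MUS corresponds to some accepted assignment whose number of ones equals the MUS cardinality.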

# **4.7 W<sub>6</sub> - Component Partitioning**

It is often the case that the clauses of F can be partitioned into several *components*, i.e. disjoint subsets of clauses, such that every MUS of F consists only of clauses from a single component. In particular:

**Definition 7 (components).** *Given a clause* f<sub>i</sub> ∈ F*, the component* C(f<sub>i</sub>) *of* f<sub>i</sub> *is the minimal subset of* F *satisfying:*

*1.* f<sub>i</sub> ∈ C(f<sub>i</sub>)*, and*
*2. for every* l ∈ f<sub>i</sub> *and every* f<sub>j</sub> ∈ F *with* ¬l ∈ f<sub>j</sub>*,* C(f<sub>i</sub>) = C(f<sub>j</sub>)*.*

*Example 2.* Assume that F = {{x<sub>1</sub>}, {¬x<sub>1</sub>}, {x<sub>2</sub>}, {¬x<sub>1</sub>, ¬x<sub>2</sub>}, {x<sub>3</sub>}, {¬x<sub>3</sub>}, {x<sub>4</sub>}, {x<sub>4</sub>, x<sub>5</sub>}}. There are four components: C<sub>1</sub> = {{x<sub>1</sub>}, {¬x<sub>1</sub>}, {x<sub>2</sub>}, {¬x<sub>1</sub>, ¬x<sub>2</sub>}}, C<sub>2</sub> = {{x<sub>3</sub>}, {¬x<sub>3</sub>}}, C<sub>3</sub> = {{x<sub>4</sub>}}, and C<sub>4</sub> = {{x<sub>4</sub>, x<sub>5</sub>}}. C<sub>1</sub> contains two MUSes: {{x<sub>1</sub>}, {¬x<sub>1</sub>}} and {{x<sub>1</sub>}, {x<sub>2</sub>}, {¬x<sub>1</sub>, ¬x<sub>2</sub>}}; C<sub>2</sub> contains one MUS: {{x<sub>3</sub>}, {¬x<sub>3</sub>}}; and C<sub>3</sub> and C<sub>4</sub> contain no MUSes.

**Proposition 11.** *Let* N *be an MUS. Then for every two clauses* f<sub>i</sub>, f<sub>j</sub> ∈ N*, it holds that* C(f<sub>i</sub>) = C(f<sub>j</sub>)*.*

The wrapper W<sub>6</sub> captures the partitioning of MUSes into components; it is defined as W<sub>6</sub> = {N ∈ W<sub>1</sub> | ∀f<sub>i</sub>, f<sub>j</sub> ∈ N. C(f<sub>i</sub>) = C(f<sub>j</sub>)} and encoded via W<sub>6</sub>:

$$\mathbb{W}_6 = \mathbb{W}_1 \land \bigwedge_{a_i \in \mathcal{A}} \Big(a_i \to \bigwedge_{f_j \in F \setminus \mathcal{C}(f_i)} \neg a_j\Big) \tag{7}$$

**Proposition 12.** *For every valuation* π *of* A*,* π ∈ M<sub>W6↓A</sub> *iff* π<sub>A,F</sub> ∈ W<sub>6</sub>*. Consequently,* |M<sub>W6↓A</sub>| = |W<sub>6</sub>|*.*

To partition the input formula F into components, we construct an undirected graph whose vertices are the clauses of F; two vertices f<sub>i</sub> and f<sub>j</sub> are connected by an edge iff there exists l ∈ f<sub>i</sub> such that ¬l ∈ f<sub>j</sub>. The components of F then correspond to the connected components of this graph, which can be identified in time linear in the size of F by traversing the graph. Note that a similar *flip graph* has been used in a study [68] of *model rotation* and its use during single MUS extraction.
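Under the signed-integer clause representation assumed in these sketches, the flip-graph construction and its connected components can be written compactly with a union-find (the real implementation, which traverses the graph, is equivalent):

```python
from collections import defaultdict

def components(cnf):
    """Partition clause indices into flip-graph components: clauses i and j
    are adjacent iff some literal of clause i occurs negated in clause j."""
    occ = defaultdict(list)            # literal -> indices of clauses containing it
    for i, clause in enumerate(cnf):
        for lit in clause:
            occ[lit].append(i)

    parent = list(range(len(cnf)))     # union-find over clause indices
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, clause in enumerate(cnf):
        for lit in clause:
            for j in occ[-lit]:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(cnf)):
        groups[find(i)].append(i)
    return sorted(groups.values())

# The formula from Example 2; indices 0..7 match the clause order there.
F = [[1], [-1], [2], [-1, -2], [3], [-3], [4], [4, 5]]
assert components(F) == [[0, 1, 2, 3], [4, 5], [6], [7]]
```

On the formula from Example 2 this yields exactly the four components C<sub>1</sub>–C<sub>4</sub>.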

# **4.8 W<sub>7</sub> - Minimal Hitting Set Duality**

We again exploit the minimal hitting set duality between MUSes and MCSes (Sect. 4.5). Recall that if a set M is an MCS of F, then M ∩ N ≠ ∅ for every N ∈ MUS<sub>F</sub>. We define the wrapper W<sub>7</sub> as {N ∈ W<sub>1</sub> | ∀M ∈ M. M ∩ N ≠ ∅}, where M is a set of MCSes. To obtain M, we run the MCS enumeration algorithm RIME [11], constrained by a user-defined time limit. The encoding W<sub>7</sub> of W<sub>7</sub> is:

$$\mathbb{W}\_7 = \mathbb{W}\_1 \land \bigwedge\_{M \in \mathcal{M}} \bigvee\_{f\_i \in M} a\_i \tag{8}$$

**Proposition 13.** *For every valuation* π *of* A*,* π ∈ M<sub>W7↓A</sub> *iff* π<sub>A,F</sub> ∈ W<sub>7</sub>*. Consequently,* |M<sub>W7↓A</sub>| = |W<sub>7</sub>|*.*

# **4.9 W<sub>8</sub> - Literal Negation Cover**

Our next wrapper captures the following observation about MUSes.

**Proposition 14.** *Let* N *be an MUS of* F*,* f<sub>i</sub> ∈ N *a clause of* N*, and* l ∈ f<sub>i</sub> *a literal of* f<sub>i</sub>*. Then there exists a clause* f<sub>j</sub> ∈ N *such that* ¬l ∈ f<sub>j</sub>*.*

Based on the above proposition, we define the wrapper W<sub>8</sub> as W<sub>8</sub> = {N ∈ W<sub>1</sub> | ∀f<sub>i</sub> ∈ N. ∀l ∈ f<sub>i</sub>. ∃f<sub>j</sub> ∈ N. ¬l ∈ f<sub>j</sub>}, and encode it as follows:

$$\mathbb{W}_8 = \mathbb{W}_1 \land \bigwedge_{a_i \in \mathcal{A}} a_i \to \Big(\bigwedge_{l \in f_i} \big(\bigvee_{f_j \in F,\ \neg l \in f_j} a_j\big)\Big) \tag{9}$$

**Proposition 15.** *For every valuation* π *of* A*,* π ∈ M<sub>W8↓A</sub> *iff* π<sub>A,F</sub> ∈ W<sub>8</sub>*. Consequently,* |M<sub>W8↓A</sub>| = |W<sub>8</sub>|*.*
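Proposition 14 also suggests a simple preprocessing filter, which we sketch here as our own illustration (it is not the paper's encoding): repeatedly drop clauses containing a literal whose negation occurs in no surviving clause. This is sound because, by induction on the rounds, every clause of every MUS survives each round, so a dropped clause belongs to no MUS.

```python
def negation_cover_filter(cnf):
    """Fixpoint filter derived from Proposition 14: repeatedly drop clauses
    containing a literal whose negation occurs in no surviving clause."""
    alive = list(range(len(cnf)))
    while True:
        present = {l for i in alive for l in cnf[i]}
        kept = [i for i in alive if all(-l in present for l in cnf[i])]
        if kept == alive:
            return kept
        alive = kept

# Dropping [1, 2] (no -2 anywhere) then makes [-1] droppable as well.
assert negation_cover_filter([[1, 2], [-1], [3], [-3]]) == [2, 3]
```

Note that two rounds are needed in the example: removing the first clause is what makes the second removable, which is why the condition must be iterated to a fixpoint.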

# **4.10 W<sub>9</sub> - Non-extendable Evidence Models**

Assume that N is an MUS and (ρ<sub>1</sub>, ..., ρ<sub>n</sub>) is its evidence. By Definition 6, it holds that ρ<sub>i</sub> ⊨ N \ {f<sub>i</sub>} for every 1 ≤ i ≤ n. Observe that since N is unsatisfiable, it is necessarily also the case that ρ<sub>i</sub> ⊨ ¬f<sub>i</sub> for every 1 ≤ i ≤ n. Hence, we define our next wrapper as W<sub>9</sub> = {N ∈ W<sub>1</sub> | ∃ρ<sub>1</sub>, ..., ρ<sub>n</sub>. ∀1 ≤ i ≤ n. ρ<sub>i</sub> ⊨ N \ {f<sub>i</sub>} and ρ<sub>i</sub> ⊨ ¬f<sub>i</sub>}. Note that the above-stated property applies *universally* to every evidence of an MUS, yet in the definition of the wrapper we require only the *existence* of one such evidence. The reason is that there can be exponentially many evidences for an MUS w.r.t. |*Vars*(F)|, and hence it is intractable to reason about all of them in the Boolean encoding of the wrapper.

$$\mathbb{W}_9 = \mathbb{W}_1 \land \bigwedge_{a_i \in \mathcal{A}} a_i \to \neg f_i[\mathit{Vars}(F)/\mathcal{Z}_i] \tag{10}$$

**Proposition 16.** *For every valuation* π *of* A*,* π ∈ M<sub>W9↓A</sub> *iff* π<sub>A,F</sub> ∈ W<sub>9</sub>*. Consequently,* |M<sub>W9↓A</sub>| = |W<sub>9</sub>|*.*

# **4.11** *W***<sup>10</sup> - Enforced Evidence Models**

Our final wrapper, W<sub>10</sub>, again builds on the variable valuations ρ<sub>1</sub>, ..., ρ<sub>n</sub> that form an evidence of an MUS N of F. In the previous wrapper, W<sub>9</sub>, we exploited the fact that none of the valuations can be a model of N. Here, we express that none of the valuations can be *easily modified* into a model of N. In particular, if f<sub>i</sub> ∈ N, then by the definition of an evidence, ρ<sub>i</sub> ⊨ N \ {f<sub>i</sub>}. Assume that we pick a literal l ∈ f<sub>i</sub> and turn ρ<sub>i</sub> into a valuation ρ′<sub>i</sub> by flipping the assignment to l so that ρ′<sub>i</sub> ⊨ f<sub>i</sub>. Since N is an MUS (i.e., unsatisfiable), there necessarily exists a clause f<sub>j</sub> ∈ N such that ρ′<sub>i</sub> ⊭ f<sub>j</sub>; i.e., f<sub>j</sub> *forces* ρ<sub>i</sub> to satisfy ¬l and hence prevents flipping ρ<sub>i</sub> into a model ρ′<sub>i</sub> of the whole N. Formally:

**Proposition 17.** *Let* N *be an MUS,* f<sub>i</sub> ∈ N *a clause of* N*, and* ρ<sub>i</sub> *a model of* N \ {f<sub>i</sub>}*. Then for every literal* l ∈ f<sub>i</sub>*, there exists a clause* f<sub>j</sub> ∈ N *such that* ¬l ∈ f<sub>j</sub> *and* ρ<sub>i</sub> ⊭ f<sub>j</sub> \ {¬l}*.*
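A small checker for the condition of Proposition 17, under the signed-integer clause representation assumed throughout these sketches (the checker is our illustration, not part of the tool):

```python
def satisfies(clause, rho):
    """True iff the partial valuation rho (dict var -> bool) satisfies clause."""
    return any(rho.get(abs(l)) == (l > 0) for l in clause)

def blocked_everywhere(cnf, mus, i, rho):
    """Proposition-17 condition for clause cnf[i] of the MUS `mus` (a list of
    clause indices) and a model rho of mus minus cnf[i]: each literal l of
    cnf[i] is blocked by some clause f_j containing -l that rho falsifies
    once -l is removed, so flipping l cannot yield a model of the MUS."""
    for l in cnf[i]:
        if not any(-l in cnf[j]
                   and not satisfies([m for m in cnf[j] if m != -l], rho)
                   for j in mus if j != i):
            return False
    return True

# N = {(x1), (-x1)} is an MUS; rho = {x1: 0} models N \ {(x1)}.
assert blocked_everywhere([[1], [-1]], [0, 1], 0, {1: False})
```

When some literal of cnf[i] is not blocked, the given set cannot be an MUS, which is exactly what the encoding of W<sub>10</sub> exploits.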

Similarly to W<sub>9</sub>, observe that Proposition 17 applies *universally* to every evidence of an MUS; however, since there can be exponentially many such evidences, it is expensive to reason about all of them. Hence, in the wrapper, we capture just the *existence* of such an evidence: W<sub>10</sub> = {N ∈ W<sub>1</sub> | ∃ρ<sub>1</sub>, ..., ρ<sub>n</sub>. ∀1 ≤ i ≤ n. ρ<sub>i</sub> ⊨ N \ {f<sub>i</sub>} and if f<sub>i</sub> ∈ N then ∀l ∈ f<sub>i</sub>. ∃f<sub>j</sub> ∈ N. ¬l ∈ f<sub>j</sub> and ρ<sub>i</sub> ⊭ f<sub>j</sub> \ {¬l}}. Equation 11 shows the corresponding encoding via W<sub>10</sub>:

$$\mathbb{W}_{10} = \mathbb{W}_1 \land \bigwedge_{a_i \in \mathcal{A}} a_i \to \bigwedge_{l \in f_i} \bigvee_{f_j \in F,\ \neg l \in f_j} \Big( a_j \land \neg (f_j \setminus \{\neg l\})[\mathit{Vars}(F)/\mathcal{Z}_i] \Big) \tag{11}$$

**Proposition 18.** *For every valuation* π *of* A*,* π ∈ M<sub>W10↓A</sub> *iff* π<sub>A,F</sub> ∈ W<sub>10</sub>*. Consequently,* |M<sub>W10↓A</sub>| = |W<sub>10</sub>|*.*

#### **4.12 Combining Wrappers and Their Remainders**

In the previous sections, we presented multiple wrappers, each capturing a different property of MUSes. In this section, we show that the individual wrappers can be easily combined to form wrappers that provide a more accurate description of the set MUS<sub>F</sub>.

**Proposition 19.** *Let* A *be the set of activation variables,* W<sub>k</sub> *and* W<sub>l</sub> *wrappers, and* R<sub>k</sub> *and* R<sub>l</sub> *the remainders of* W<sub>k</sub> *and* W<sub>l</sub>*. Furthermore, for every* m ∈ {k, l}*, let* W<sub>m</sub> *and* R<sub>m</sub> *be formulas such that:*


*Then all the following hold:*


Note that although Proposition 19 discusses only the combination of two wrappers, it can be applied repeatedly to already combined wrappers. Hence, we can combine any subset of the proposed wrappers W<sub>1</sub>, ..., W<sub>10</sub>. Also, note that all the formulas W<sub>2</sub>, ..., W<sub>10</sub> subsume the formula W<sub>1</sub>, so combining multiple wrappers duplicates some clauses. In our implementation, we first remove all duplicates and apply other straightforward model-preserving simplifications before passing the encoding to a projected model counter.

# **5 Experimental Evaluation**

We have implemented our approach for counting MUSes in a Python-based tool<sup>1</sup>, using the projected model counter GANAK [59] to count the models of wrappers and remainders, and several auxiliary tools as described above.
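For intuition, and for sanity-checking results on tiny formulas, MUS counting can also be done directly from the definition. This brute-force reference counter is only an illustration of what is being counted; the tool itself instead reduces the problem to projected model counting with GANAK, which is what makes it scale:

```python
from itertools import combinations, product

def is_sat(cnf):
    """Brute-force satisfiability check (clauses are lists of signed ints)."""
    variables = sorted({abs(l) for c in cnf for l in c})
    for bits in product([False, True], repeat=len(variables)):
        v = dict(zip(variables, bits))
        if all(any(v[abs(l)] == (l > 0) for l in c) for c in cnf):
            return True
    return False

def count_muses(cnf):
    """Count MUSes by definition: N is an MUS iff N is unsatisfiable and
    dropping any single clause of N makes it satisfiable."""
    total = 0
    for k in range(1, len(cnf) + 1):
        for sub in combinations(range(len(cnf)), k):
            chosen = [cnf[i] for i in sub]
            if not is_sat(chosen) and all(
                    is_sat([cnf[i] for i in sub if i != j]) for j in sub):
                total += 1
    return total

# F = (x1) & (-x1) & (x2) & (-x1 | -x2) has exactly two MUSes.
assert count_muses([[1], [-1], [2], [-1, -2]]) == 2
```

The doubly exponential cost of this loop (subsets times assignments) is precisely why a reduction to projected model counting is needed for nontrivial inputs.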

We presented 10 *base* wrappers W<sub>1</sub>, ..., W<sub>10</sub> and showed how to combine them. Since W<sub>1</sub> is subsumed by all the wrappers W<sub>2</sub>, ..., W<sub>10</sub>, there are 2<sup>9</sup> combined wrappers. Due to this large number of combinations, we evaluated only some of them. In particular, we evaluated the combination W<sub>1</sub> ∩ ··· ∩ W<sub>10</sub> of all wrappers, denoted *Wall*, since it provides the most precise description of MUSes. We also evaluated 6 wrappers that emerge from Wall by excluding individual base wrappers or combinations of similar base wrappers, as well as the most basic wrapper W1. The table below shows the names and definitions of the evaluated combinations:


We also evaluated two contemporary MUS enumerators, MARCO<sup>2</sup> [39] and UNIMUS<sup>3</sup> [10]. Moreover, we evaluated the approximate MUS counter AMUSIC<sup>4</sup> [13] using its default guarantees, i.e., the provided MUS count estimates are within a multiplicative factor of 1.8 of the true count with 80% confidence.

Our benchmark suite consists of 2553 instances employed in prior MUS and MSS literature, including those released by the authors of AMUSIC [13]. The formulas contain from 78 to 1000 clauses and from 40 to 996 variables. The MUS count varies from 1 to 1.7 × 10<sup>9</sup>.

We focus on three comparison criteria: 1) the number of benchmarks solved by the evaluated tools (i.e., benchmarks where the tools provided the MUS count), 2) the scalability of the tools w.r.t. the number of MUSes in the benchmarks, and 3) the *accuracy* of our wrappers.

All experiments were run with a time limit of 3600 s per benchmark on a Linux machine with a 16-core AMD processor and a 20 GB memory limit. When using the wrappers W<sub>4</sub> and W<sub>7</sub>, we used a combined limit of 300 s (included in the 3600 s) and 100,000 MCSes for the MCS enumeration while building the wrappers; if both wrappers were used, we ran the MCS enumeration just once. Finally, while constructing a combined wrapper of the form W<sub>∗</sub> ∩ W<sub>5</sub>, we used W<sub>∗</sub> to compute the value maxMUS for creating W<sub>5</sub>.

<sup>1</sup> https://github.com/jar-ben/exactMUSCounter.

<sup>2</sup> https://sun.iwu.edu/~mliffito/marco/.

<sup>3</sup> https://github.com/jar-ben/unimus.

<sup>4</sup> https://github.com/jar-ben/amusic.

**Table 1.** Number of solved benchmarks by individual tools.

**Fig. 2.** The number of solved benchmarks in time.

#### **5.1 Solved Benchmarks**

In Table 1, we show the number of benchmarks solved by the individual evaluated tools. The worst performance was achieved by the basic wrapper W1 (W<sub>1</sub>), which is not surprising since it does not provide a good description of MUSes. AMUSIC solved 623 benchmarks, whereas UNIMUS and MARCO solved 833 and 799 benchmarks, respectively. Except for Wno8910 (and W1), which solved *only* 1058 benchmarks, all the remaining combined wrappers solved around 1450–1500 benchmarks and hence significantly dominated both AMUSIC and the two MUS enumerators. Perhaps surprisingly, Wall, which combines all the base wrappers, ended up in third position; the highest number of solved benchmarks (1500) was achieved by Wno5, and the second-highest (1498) by Wno4. Note that Wno5 and Wno4 exclude the encodings of the maximum and the minimum MUS cardinality, respectively, via Boolean cardinality constraints. In general, solving Boolean cardinality constraints is often quite hard; hence, even though the presence of the two wrappers might provide a better description of MUSes, the constraints increase the hardness of the generated instances.

Figure 2 compares the time needed to solve the benchmarks by a subset of the evaluated tools (restricted for better clarity). A point with coordinates [x, y] means that x benchmarks were solved by the corresponding tool within the first y seconds.

#### **5.2 Scalability W.r.t the MUS Count**

In Fig. 3, we compare the scalability of the evaluated tools w.r.t. the number of MUSes in the benchmarks. In particular, a point with coordinates [x, y] denotes that the corresponding tool solved y benchmarks that contained at most x MUSes. For better clarity, we compare only our best wrapper, Wno5, with AMUSIC, MARCO, and UNIMUS. Note that whereas AMUSIC scales to instances

**Fig. 3.** The number of solved benchmarks w.r.t. the MUS count.

with 10<sup>8</sup> MUSes, the remaining three tools scale only to instances with at most a million MUSes. In fact, even though AMUSIC solved overall *just* 623 benchmarks, there are 319 benchmarks that were solved only by AMUSIC. Based on a closer examination of the results, we identified that AMUSIC scales much better than the other tools w.r.t. the MUS count; however, it does not scale so well w.r.t. the number of clauses in the input formula F. This is not surprising, since AMUSIC is *just* an approximate counter and as such needs to explicitly identify only logarithmically many MUSes w.r.t. |F|, even though there can be up to O(2<sup>|F|</sup>) MUSes. On the other hand, AMUSIC relies on repeated calls to a 3-QBF solver whose efficiency highly depends on |F|.

#### **5.3 Accuracy of Wrappers**

Recall that a wrapper W *over-approximates* the set MUS<sub>F</sub> of all MUSes of F, i.e., W ⊇ MUS<sub>F</sub> (Definition 5); hence we are interested in measuring the *accuracy* of the over-approximation. In particular, given a wrapper W and its remainder R constructed over a formula F, we measure the ratio |R|/|W|. The range of the ratio is [0, 1); the closer to 0, the more accurate the wrapper. In particular, when |R|/|W| = 0, the wrapper captures the set MUS<sub>F</sub> *exactly* (i.e., W = MUS<sub>F</sub>).

We illustrate the ratio |R|/|W| achieved by the individual wrappers in Fig. 4. A point with coordinates [x, y] expresses that for x percent of the benchmarks completed by the corresponding tool, the ratio |R|/|W| was at most y. As expected, the ratio achieved by the most basic wrapper W1 (W<sub>1</sub>) is very high for all the benchmarks, i.e., the wrapper captures MUS<sub>F</sub> very inaccurately. The other wrappers, on the other hand, achieved a very low ratio for the vast majority of benchmarks, i.e., they over-approximate MUS<sub>F</sub> very tightly. In fact, for 87% of the benchmarks, the wrappers Wno23, Wno4, Wno5, Wno6, and Wall achieved ratio 0, i.e., they captured the set MUS<sub>F</sub> exactly. In contrast, the wrappers Wno7 and Wno8910 achieved ratio 0 for *only* 68% and 80% of the benchmarks, respectively, which suggests that the excluded base wrappers, W<sub>7</sub>, W<sub>8</sub>, W<sub>9</sub>, and W<sub>10</sub>, are vital for an accurate description of MUS<sub>F</sub>. Moreover, note that the accuracy of the wrappers correlates highly with the number of solved benchmarks (Table 1), since Wno7 and Wno8910 (and W1) were the least efficient wrappers.

**Fig. 4.** The ratio |R|/|W| expressing the inaccuracy of wrappers.

# **6 Future Possible Applications of Wrappers and Remainders**

Recall that a wrapper W *over-approximates* the set MUS<sub>F</sub> of all MUSes of F, i.e., W ⊇ MUS<sub>F</sub> (Definition 5). Moreover, in Sect. 5, we empirically witnessed that the best of our wrappers usually over-approximate MUS<sub>F</sub> very tightly, or even capture it exactly. Consequently, the propositional encodings W and R of a wrapper W and its remainder R can describe the set MUS<sub>F</sub> very precisely. We strongly believe that such an accurate propositional description of MUS<sub>F</sub> paves the way (to be thoroughly examined in our future work) to efficiently solving many other MUS-related problems, including, e.g., the following:

**Approximate MUS Counting.** Recall that |MUS<sub>F</sub>| = |W| − |R|. Assuming that |R| is much smaller than |W| and observing that R ⊆ W, computing |M<sub>R↓A</sub>| = |R| should be much faster than computing |M<sub>W↓A</sub>| = |W|. Hence, one could first, relatively quickly, compute the value |M<sub>R↓A</sub>| *exactly*, and then use an *approximate* model counter to find an *estimate* w of |M<sub>W↓A</sub>|. The MUS count |MUS<sub>F</sub>| can then be approximated as w − |R|. The *accuracy* of the approximation depends on the approximation guarantees of the model counter (e.g., using ApproxMC4 [18,60], we get the (ε, δ)-guarantees provided by AMUSIC).

**MUS Enumeration.** Assume a valuation π of the activation variables A and the corresponding *activated* subset π<sub>A,F</sub> = {f<sub>i</sub> ∈ F | π(a<sub>i</sub>) = 1} of F. As shown in Sect. 4, π<sub>A,F</sub> is an MUS iff π ∈ M<sub>W↓A</sub> and π ∉ M<sub>R↓A</sub>. Hence, one can enumerate MUSes by enumerating projected models of W and discarding those that are also projected models of R.

**MUS Sampling.** To sample an MUS of F, one can iteratively sample elements π of M<sub>W↓A</sub> until one identifies a π such that π ∉ M<sub>R↓A</sub>, i.e., π<sub>A,F</sub> is an MUS. Note that while the past decade has witnessed significant progress in the development of projected model sampling approaches [16,22,55] (with various distribution guarantees), we are not aware of any existing MUS sampling technique (with reasonable distribution guarantees).

**Minimum and Maximum MUS Cardinality.** As discussed in Sect. 4.6 (W<sub>5</sub>), one can over-approximate the maximum MUS cardinality by finding a model π ∈ M<sub>W↓A</sub> that maximizes the number of variables assigned 1. Similarly, one can under-approximate the minimum MUS cardinality by finding a model π ∈ M<sub>W↓A</sub> that minimizes the number of variables assigned 1. Intuitively, the smaller |R| is, the more precise approximations can be expected. Moreover, by checking whether π ∈ M<sub>R↓A</sub>, one can verify whether π<sub>A,F</sub> is actually an MUS.

**MUS Membership.** The MUS membership problem is to decide whether a clause f<sub>i</sub> ∈ F belongs to an MUS of F; it is known to be Σ<sup>P</sup><sub>2</sub>-complete [31,35,37]. Contemporary techniques for deciding the problem are mainly based on solving 2-QBF or 3-QBF encodings [13,31]. Our wrapper-based framework allows for an alternative approach: to decide whether a clause f<sub>i</sub> belongs to an MUS of F, one can check whether there exists a valuation π of A such that π(a<sub>i</sub>) = 1, π ∈ M<sub>W↓A</sub>, and π ∉ M<sub>R↓A</sub>. Note that when |R| = 0, or when |R| can be bounded by a constant, this check boils down to a single call of a SAT solver.

# **7 Conclusion and Future Work**

In this paper, we focused on the problem of MUS counting and proposed the first exact MUS counter, called CountMUST, that does not rely on explicit MUS enumeration. The base idea is to reduce MUS counting to (two queries of) projected model counting via the framework of wrappers and remainders. The availability of a scalable projected model counter, GANAK, allowed CountMUST to scale much better and solve significantly more instances than the existing approaches. Moreover, as discussed in Sect. 6, the tightness of wrappers and remainders opens up new potential applications, including approximate counting, enumeration, sampling, and membership.

We also highlight the complementary nature of CountMUST and AMUSIC with respect to the size of instances and the MUS count. This complementary performance opens up opportunities for a portfolio approach that achieves the best of both worlds. Finally, let us note that we are fighting here the *chicken-and-egg* nature of the existence of practical applications and of scalable algorithmic techniques for problems in automated reasoning: the lack of scalable techniques often leaves end-users with no incentive to design reductions from practical applications, and vice versa. Even though MUS counting already has many applications in the diagnosis domain [25,29,48–50,65], we hope that the availability of CountMUST will break this chicken-and-egg loop in other areas and enable further investigations into applications of MUS counting.

**Acknowledgements.** This work was supported in part by the National Research Foundation Singapore under its NRF Fellowship Programme [NRF-NRFFAI1-2019- 0004] and the AI Singapore Programme [AISG-RP-2018-005], NUS ODPRT Grant [R-252-000-685-13].

# **References**

1. Andraus, Z.S., Liffiton, M.H., Sakallah, K.A.: CEGAR-based formal hardware verification: a case study. Technical report, University of Michigan, CSE-TR-531-07 (2007)


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# Sound Verification Procedures for Temporal Properties of Infinite-State Systems

Quentin Peyras<sup>1</sup>, Jean-Paul Bodeveix<sup>2</sup>, Julien Brunel<sup>1</sup>(B), and David Chemouil<sup>1</sup>

> <sup>1</sup> ONERA DTIS, Université de Toulouse, Toulouse, France
> {quentin.peyras,julien.brunel,david.chemouil}@onera.fr
> <sup>2</sup> IRIT CNRS UPS, Université de Toulouse, Toulouse, France
> jean-paul.bodeveix@irit.fr

Abstract. First-Order Linear Temporal Logic (FOLTL) is particularly convenient to specify distributed systems, especially because of the unbounded aspect of their state space. We have recently exhibited novel decidable fragments of FOLTL which pave the way for tractable verification. However, these fragments are not expressive enough for realistic specifications. In this paper, we propose three transformations to translate a typical FOLTL specification into two of its decidable fragments. All three transformations are proved sound (the associated propositions are proved in Coq) and have a high degree of automation. To put these techniques into practice, we propose a specification language relying on FOLTL, as well as a prototype which performs the verification, relying on existing model checkers. This approach allows us to successfully verify safety and liveness properties for various specifications of distributed systems from the literature.

# 1 Introduction

Verifying properties of distributed protocols is a demanding endeavor. Several approaches have been proposed, ranging from verification frameworks, like IronFleet [12] or Verdi [27], to tool-supported languages like TLA+ [17], Event-B [1] or Ivy [20,21]. However, when systems of *arbitrary size* are considered, verifying properties usually requires considerable effort: inductive invariants must be sought and exhibited (possibly with tool support), and some manual proof effort may still be necessary. Worse, when *liveness* properties are checked, this effort becomes very substantial and tool support is still quite limited.

A natural setting for specification, in particular for safety and liveness properties of infinite-state systems, is (mono- and many-sorted) first-order linear temporal logic (FOLTL). However, it is highly undecidable [13,14]. In recent work [23,24], some of the present authors devised the "Geneva" fragments of FOLTL, which were shown to be decidable. More precisely, these fragments enjoy a "bounded domain property" (BDP), a form of computable finite model property over the first-order domains. Decidability is obtained by expanding first-order quantifiers over the domains (using the computed bounds) and then relying on (decidable) propositional-LTL satisfiability checking.

The Geneva fragments are rather expressive but still have limitations that thwart their use for the specification of systems. In particular, most forms of fairness assumptions, as well as frame conditions (which specify what does not change when a transition happens in a system), do not fit in the fragments. Furthermore, topological properties of systems (such as ring topologies) are hard or even impossible to specify.

In this article, we mitigate this deficiency by exhibiting three transformations that map an *undecidable*, expressive fragment of FOLTL<sup>∗</sup><sub>=</sub> (FOLTL with equality and reflexive-transitive closure, to characterize topological properties) into decidable fragments (akin to the Geneva ones), thus allowing the automatic verification of safety and liveness properties of infinite-state systems. Then we apply these techniques to the verification of properties of various protocols.

Notice that none of the proposed transformations is complete. It is actually impossible to devise complete transformations, even assuming a procedure that would be fed additional user input. This is because FOLTL is not even semidecidable.<sup>1</sup>

In more detail, we make the following contributions (cf. Fig. 1):

	- the first of these transformations (called TEA) is fully automatic, while the other two (TTC and TFC) must be passed additional data (in the shape of specific formulas);
	- these three transformations, as well as other minor ones, are implemented as *tactics* in a prototype tool [22];
	- the associated theorems and lemmas are also formalized and proved correct, using Coq [22].

This article is organized as follows: in Sect. 2, we illustrate our approach using an example (a leader election protocol). Section 3 introduces definitions as well as the two fragments used in the rest of the paper. In Sect. 4, we present basic techniques, which are used in some of our transformations. Then, in Sect. 5, we formalize the automatic TEA transformation. Sections 6 and 7 present, respectively, the TFC and TTC transformations. In Sect. 8, we evaluate our approach on various protocols. Finally, we compare our results with related work in Sect. 9.

<sup>1</sup> Indeed, having such a transformation would give a procedure for semi-decidability by testing all possible inputs on this transformation.

Fig. 1. Summary of the contributions of this article

# 2 The Cervino Language

In this section, we present the Cervino modeling language informally. Its semantics, given in terms of many-sorted FOLTL<sup>∗</sup><sub>=</sub> (FOLTL with equality and reflexive-transitive closure), is formally introduced in Sect. 3.3. This language is suitable for specifying infinite-state systems. It is undecidable, but we enforce some syntactic constraints anyway, in order to ease the further application of transformations mapping into decidable fragments of logic.

Cervino is illustrated in Fig. 2 using the example of a leader election protocol [6] in a ring of unbounded size. Nodes sit in a directed ring and each node has a unique ID. There is a total order on IDs. The goal of the protocol is to elect a leader (in practice, the one with the greatest ID). A node can send to its successor in the ring the IDs it knows about, the receiver keeping those that are greater than its own ID. A node is elected if it receives its own ID.

#### 2.1 Sorts, Relations and Axioms

A Cervino specification may define sorts, (first-order) sorted relations and sorted constants. An interpretation structure for such a specification is a set of *infinite* traces of states. Classically, a state maps a sort to a non-empty set, a constant to an element of such a set and a relation to a set of tuples, all respecting the obvious sorting and arity constraints. The interpretation of sorts and constants is *rigid* while that of relations is *flexible*.

In the example, nodes and their IDs are conflated into a single sort Node; and: an *elected* relation represents the set of elected nodes; a *succ* relation represents if two nodes are successive in the ring topology; a *toSend* relation represents the mailbox for each node; an *lte* relation defines a total ordering on nodes; an *lmax* constant represents the maximal identifier among nodes.

States can be constrained by axioms, *i.e.* sets of formulas. The latter belong to FOLTL<sup>∗</sup><sub>=</sub>, that is, they can mix first-order logic (with equality) with the "always" (**G**), "eventually" (**F**) and "next" (written as a prime symbol and only applied to atoms) connectives, as well as a reflexive-transitive closure connective (written <sup>∗</sup>). However, we enforce a syntactic constraint on axioms: after converting them to *negation*

Fig. 2. Specification of the leader election protocol (prettified syntax)

*normal form* (NNF), *an existential quantifier cannot appear in the scope of a universal quantifier or of a* **G** *connective* (no ∀ … ∃ …, no **G** … ∃ …).

A binary relation r can be "tagged" (written using btw) to force r to be a function<sup>2</sup> and enable a special ternary relation btw[r]. Then, btw[r](x,y,z) means that there is an acyclic path between x and z passing through y. The semantics of btw[r] is given through axioms (see Definition 14) and is related to r<sup>∗</sup> through the following equivalence: r<sup>∗</sup>(x, y) ⇔ btw[r](x, y, y).

#### 2.2 Events

Events specify how the system may evolve from one state to another. Events (more precisely: event schemas) are declared with a name and a list of arguments that are the only variables that can appear free in the body of the event.

<sup>2</sup> ∀ x,y,z : s · r(x,y) ∧ r(x,z) ⇒ y = z.

The declaration of an event also features a modifies section describing which tuples of which relations may be modified by the event. Other relations or parts of relations are necessarily left unchanged. The body of an event is specified in *primed FO* (FO augmented with primed relation symbols representing the value of these relations in the next state) with the additional constraint that *no existential quantifier may appear positively in the body*.

The semantics for events is standard and comparable to the one used in TLA+ or Electrum: in every state, at least one event is fired. In other words, there is a valuation for the arguments of at least one event such that the body of the said event evaluates to true. More formally (and ignoring sorting constraints for the sake of readability), given event bodies φ<sub>1</sub>,…,φ<sub>n</sub> and arguments y<sub>1</sub>,…,y<sub>m<sub>i</sub></sub> appearing as free variables in φ<sub>i</sub>, the semantics of events is given by the formula **G**(⋁<sup>n</sup><sub>i=1</sub> ∃y<sub>1</sub>,…,y<sub>m<sub>i</sub></sub> · φ<sub>i</sub>). We insist that this formula is only implicit: it cannot be input by the specifier, as it is the purpose of transformations to massage it. Finally, if needed, fairness constraints must be added by the specifier.

In the example, the *send* event represents the fact that a node updates its successor's mailbox by adding all IDs that are larger than the successor's ID. This way, the largest ID is passed along the ring. Notice we use *universal* quantification: we could have defined *dst* and *id* as parameters of *send*, but the implicit existential quantification, although theoretically acceptable, can be costly performance-wise (as *succ* is a function, this is significant for the *id* argument only). We also specify that the event modifies the *toSend* relation for specific pairs of a node and an identifier, only if these satisfy a condition saying that the ID is in the sender's mailbox (or corresponds to the sender's ID) and if the node is the sender's successor (the body of the event says what happens in that case).

#### 2.3 Commands

A check declares a command to verify whether a property holds. To do so, a command uses a certain tactic (TEA, TFC, TTC), as well as additional parameters in the case of TFC and TTC (these are presented in Sects. 6 and 7, respectively). The purpose of this article is precisely to present these transformations. Notice that a command may also be associated with additional, specific axioms in an assuming section (in the example, this section contains a fairness property, necessary to prove the liveness property).

# 3 Background on **FOLTL**

#### 3.1 Syntax and Semantics of **FOLTL**

The basic vocabulary of MSFOLTL (that we simply call FOLTL in the following) is defined out of a signature Σ = (S, *Const*, R) where S is a set of sorts, *Const* is the set of (sorted) constant symbols and R = (R<sub>s̄</sub>)<sub>s̄∈S<sup>+</sup></sub> is a family of sets of *relation symbols*, with R<sub>s̄</sub> the set of relation symbols over tuples of sorts s̄.

Definition 1 (Formulas). *Given a signature* Σ = (S, *Const*, R) *and a set of variables* V*,* FOLTL<sub>=</sub> formulas *over* Σ *and* V *are defined inductively by the following grammar:*

$$\psi ::= r(t\_1, \ldots, t\_n) \mid t\_1 = t\_2 \mid \neg \psi \mid \psi \vee \psi \mid \mathbf{X} \psi \mid \mathbf{F} \psi \mid \forall x : s \cdot \psi \mid \exists x : s \cdot \psi$$

*where* x ∈ V<sub>s</sub>*,* r ∈ R<sub>s<sub>1</sub>,…,s<sub>n</sub></sub> *and* t<sub>i</sub> ∈ V<sub>s<sub>i</sub></sub> ∪ *Const*<sub>s<sub>i</sub></sub> *for each* i*, with* V<sub>s</sub> *(resp. Const*<sub>s</sub>*) the set of variables (resp. constants) of sort* s*.*

**X** and **F** stand for the "next" and "eventually" connectives. Usually FOLTL includes the **U** connective; however, it is not required in this paper. We also define "always" as **G**ψ = ¬**F**(¬ψ). Similarly, classical propositional connectives ∧, ⇒ and ⇔ are defined in the natural way. Additionally:


We now introduce the semantics of FOLTL<sub>=</sub>. In the interpretation structures defined below, the interpretation of relations *varies* over time while that of constant symbols *does not*.

Definition 2 (Interpretation Structure). *Given a signature* Σ = (S, *Const*, R)*, an* (interpretation) structure M (over Σ) *is a triple* ((D<sub>s</sub>)<sub>s∈S</sub>, σ, ρ) *where:*


Definition 3 (Assignment). *An* assignment C *in domains* (D<sub>s</sub>)<sub>s∈S</sub> *for variables in* V *is a map* V → D*. We write* C[x ↦ d] *the assignment defined as* C[x ↦ d](x) = d *and* C[x ↦ d](y) = C(y) *if* y ≠ x*. The* extension *of* C *to* terms*, also written* C*, is defined in the obvious way.*

Definition 4 (Satisfaction). *Given a structure* M = (D, σ, ρ) *and an assignment* C*, the* satisfaction *relation is defined by induction on formulas, for any* i ∈ ℕ*, as follows:*

$$- \; \mathcal{M}, i, \mathcal{C} \models t\_1 = t\_2 \text{ iff } \mathcal{C}(t\_1) = \mathcal{C}(t\_2);$$


*Given a closed formula* φ*, we write* M, k ⊨ φ *if* M, k, [] ⊨ φ*, where* [] *is the empty assignment. Then* Mod(φ) *denotes the set of structures* M *such that* M, 0 ⊨ φ*.*

Definition 5 (Reflexive-Transitive Closure<sup>3</sup>). *We write* FOLTL<sup>∗</sup><sub>=</sub> *for the enrichment of* FOLTL<sub>=</sub> *with a reflexive-transitive closure connective. Then for any sort* s ∈ S *and any binary relation symbol* r ∈ R<sub>s,s</sub>*, the language of* FOLTL<sup>∗</sup><sub>=</sub> *is augmented with a fresh binary relation symbol* r<sup>∗</sup> ∈ R<sub>s,s</sub>*, and we have:*

M, i, C ⊨ r<sup>∗</sup>(t<sub>1</sub>, t<sub>2</sub>) *iff* M, i, C ⊨ t<sub>1</sub> = t<sub>2</sub> *or there exists* n ∈ ℕ *s.t.* M, i, C ⊨ ∃x<sub>0</sub>,…,x<sub>n</sub> · t<sub>1</sub> = x<sub>0</sub> ∧ t<sub>2</sub> = x<sub>n</sub> ∧ (⋀<sub>0≤i≤n−1</sub> r(x<sub>i</sub>, x<sub>i+1</sub>))*.*
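On a finite first-order domain, the semantics of r<sup>∗</sup> above can be computed directly as a least fixpoint. The following sketch is only illustrative (it is not part of the paper's formal development): it closes a binary relation, given as a set of pairs, under reflexivity and composition with r.

```python
# Illustrative sketch: reflexive-transitive closure of a finite binary relation,
# matching the semantics of r* in Definition 5 when the domain is finite.
def rt_closure(domain, r):
    """Return the reflexive-transitive closure of r over the given domain."""
    closure = {(d, d) for d in domain} | set(r)   # reflexivity + r itself
    changed = True
    while changed:                                 # iterate to the fixpoint
        changed = False
        for (a, b) in list(closure):
            for (c, d) in r:
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# A 3-node directed ring: every node reaches every node.
ring = {(0, 1), (1, 2), (2, 0)}
print(rt_closure({0, 1, 2}, ring) == {(a, b) for a in range(3) for b in range(3)})  # True
```

This is exactly the intuition used later for the btw[r] relation: on finite domains, btw[r](x, y, y) coincides with r<sup>∗</sup>(x, y).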

Let φ, φ′ be two FOLTL<sup>∗</sup><sub>=</sub> formulas. If for any structure M and any assignment C, we have M, 0, C ⊨ φ iff M, 0, C ⊨ φ′, then we say that φ and φ′ are logically equivalent, written φ ≡ φ′.

#### 3.2 Bounded Domain Property

In this section we introduce the Bounded Domain Property (BDP) and present two fragments of FOLTL that enjoy the BDP. These fragments play an important role in the verification procedures presented in this article.

Definition 6 (Bounded Domain Property). *A fragment Frag of* FOLTL *enjoys the* bounded domain property *(BDP) if, for any* φ ∈ *Frag, either* φ *is unsatisfiable, or there is a structure* M *with finite domains s.t.* M, 0 ⊨ φ*, the domain sizes being computable from* φ*. Notice that the BDP implies decidability.*

We now present the two fragments that are used in this paper. Both fragments are included in a larger fragment for which the BDP is established in [24].

Definition 7 (*LTR* fragment). *A formula* φ *of* FOLTL<sub>=</sub> *is said to belong to the (multisorted) Linear Temporal Reasoning (LTR) fragment if* φ *is in NNF and existential quantifiers only appear at the head of* φ*.*

Theorem 1 ([16,24]). *Any formula* φ ∈ *LTR (even with equality) enjoys the BDP. The bound of verification for each sort is the sum of the numbers of existential quantifiers and constant symbols over this sort.*
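The bound of Theorem 1 is a simple syntactic count. The following sketch illustrates it; the tuple-based formula encoding is an assumption made for illustration, not the paper's representation.

```python
# Illustrative sketch of the Theorem 1 bound: per sort, count the existential
# quantifiers and add the number of constant symbols of that sort.
from collections import Counter

def ltr_bound(phi, constants):
    """constants maps each sort to its number of constant symbols."""
    counts = Counter(constants)
    def walk(f):
        op = f[0]
        if op == 'exists':                # ('exists', var, sort, body)
            counts[f[2]] += 1
            walk(f[3])
        elif op == 'forall':
            walk(f[3])
        elif op in ('not', 'X', 'F', 'G'):
            walk(f[1])
        elif op in ('and', 'or'):
            walk(f[1]); walk(f[2])
        # atoms contribute nothing
    walk(phi)
    return dict(counts)

# ∃x:Node · ∃y:Node · G p(x, y), with one constant of sort Node:
phi = ('exists', 'x', 'Node', ('exists', 'y', 'Node', ('G', ('atom', 'p', 'x', 'y'))))
print(ltr_bound(phi, {'Node': 1}))  # {'Node': 3}
```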

<sup>3</sup> It is possible to fully axiomatize the transitive closure in pure FOLTL; however, since it does not fit into the scope of this paper, such an axiomatization is not presented here and we simply extend FOLTL with the classical definition of transitive closure.

Definition 8. *An* FOLTL *formula* ψ *is in* FOLTL(∃↑, ∀↓) *if* ψ = ∃y<sub>1</sub> : s<sub>1</sub> … y<sub>n</sub> : s<sub>n</sub> · θ[y<sub>1</sub>,…,y<sub>n</sub>]*, where* θ *has the following syntax:* θ ::= ℓ | α | θ ∨ θ | θ ∧ θ | **X**θ | **G**θ | **F**θ*, where* α *is an FO formula in NNF without any existential quantifier and* ℓ *is a literal.*

Definition 9. FOLTL(**X**, **F**, ∀↓) *is defined by the following grammar:* φ ::= ℓ | α | φ ∨ φ | φ ∧ φ | **X**φ | **F**φ | ∃y : s · φ*, with* α *an FO formula in NNF without any existential quantifier,* ℓ *a literal and* y ∈ V*.*

Definition 10 (*Geneva* fragment). *The Geneva fragment of* FOLTL *consists of formulas* ψ ∧ **G**(φ) *s.t.* φ *is a closed formula of* FOLTL(**X**, **F**, ∀↓) *and* ψ *is a closed formula of* FOLTL(∃↑, ∀↓)*.*

Definition 11. *Given a formula* φ ∈ FOLTL(**X**, **F**) *in NNF, we define its* stride K<sub>φ</sub> *as the maximal number of nested* **X** *connectives. Formally:*

$$\begin{cases} K\_{\ell} = K\_{\mathbf{F}\phi} = 0 \text{ (if } \ell \text{ is a literal)}\\ K\_{\forall x \cdot \phi} = K\_{\exists x \cdot \phi} = K\_{\phi} \end{cases} \quad \begin{cases} K\_{\mathbf{X}\phi} = K\_{\phi} + 1\\ K\_{\phi\_1 \land \phi\_2} = K\_{\phi\_1 \lor \phi\_2} = \max(K\_{\phi\_1}, K\_{\phi\_2}) \end{cases}$$

Theorem 2 ([24]). *The Geneva fragment enjoys the BDP. If* ψ ∧ **G**(φ) *is a satisfiable formula in this fragment, for each sort* s *the (exact) bound on the domain size is:* |*Const*<sub>s</sub>| + (K<sub>φ</sub> + 1) × |V<sub>s</sub>|*.*
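The stride of Definition 11 and the bound of Theorem 2 are both computable by a direct recursion on the formula. The sketch below assumes a tuple-based formula encoding (not the paper's); literals and **F** sub-formulas have stride 0, and **X** increments it.

```python
# Illustrative sketch of Definition 11 (stride) and the Theorem 2 domain bound.
def stride(f):
    op = f[0]
    if op == 'lit':                        # literals have stride 0
        return 0
    if op == 'F':                          # F resets the count of nested X
        return 0
    if op == 'X':
        return stride(f[1]) + 1
    if op in ('forall', 'exists'):         # ('forall', var, sort, body)
        return stride(f[3])
    if op in ('and', 'or'):
        return max(stride(f[1]), stride(f[2]))
    raise ValueError(op)

def geneva_bound(phi, n_consts, n_vars):
    """Domain bound for one sort: |Const_s| + (K_phi + 1) * |V_s|."""
    return n_consts + (stride(phi) + 1) * n_vars

# X X p has stride 2; with 1 constant and 2 variables of the sort:
phi = ('X', ('X', ('lit', 'p')))
print(stride(phi))              # 2
print(geneva_bound(phi, 1, 2))  # 1 + 3*2 = 7
```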

#### 3.3 Semantics of Cervino

In this section, we define the semantics of a Cervino machine as an FOLTL<sup>∗</sup><sub>=</sub> formula. Notice first that, in Cervino, the next instant is referred to using the prime symbol, applied to relations only: this translates to an FOLTL sub-formula using the **X** connective, after application of the semantics.

Now, a frame condition is defined as a formula specifying that a certain relation does not change (between the instants before and after the event occurs) for tuples satisfying some constraints.

Definition 12 (Frame condition). *We define a frame condition as a formula expressing that, under some hypotheses, a certain relation does not change along a transition. Given a tuple* (r, x̄, ψ)*, where* r ∈ R<sub>s̄</sub>*,* x̄ ∈ V<sup>|s̄|</sup> *and* ψ *is a Boolean formula in which the variables of* x̄ *may appear free, we define the frame condition* unchanged[r, x̄, ψ] *as the formula* ∀x̄ : s̄ · ψ ⇒ (r(x̄) ⇔ **X**r(x̄))*.*

Definition 13 (Semantics of an event). *Let ev be an event of a Cervino machine declared as follows:* event ev[ȳ : s̄] modif {τ}*, with modif = modifies* q<sub>1</sub> *at* {(x̄<sub>1</sub>) · ψ<sub>1</sub>},…,q<sub>j</sub> *at* {(x̄<sub>j</sub>) · ψ<sub>j</sub>}*, where the free variables in each* ψ<sub>k</sub> *are included in* x̄<sub>k</sub>, ȳ*. Its semantics is defined as* [[*ev*]] = ∃ȳ : s̄ · (τ ∧ [[*modif*]])*, where*

$$[\![\mathit{modif}]\!] = (\bigwedge\_{r \in \mathcal{R} \backslash \{q\_1, \ldots, q\_j\}} \mathsf{unchanged}[r, \vec{x}, \top]) \land (\bigwedge\_{1 \le k \le j} \mathsf{unchanged}[q\_k, \vec{x\_k}, \neg \psi\_k]),$$

*where each list* x̄ *of variables has sorts corresponding to the profile of* r*.*

For any binary relation r that enables *btw*[r], the ternary relation *btw*[r] stating that there exists an acyclic path between two elements passing through a third element is axiomatized in FO following [18].

Definition 14 (Semantics of between). *Given a binary relation symbol* r*, the semantics of btw[r] is given by adding axioms of transitivity, antisymmetry, partial totality, partial reflexivity, cycle maximality, transitivity of reachability, path consistency, taken from [18] in addition to the following axiom:*

$$\forall x, y: s \cdot \left[ r(x, y) \Leftrightarrow \left( \mathsf{btw}[r](x, y, y) \land (\forall z: s \cdot \mathsf{btw}[r](x, z, z) \Rightarrow \mathsf{btw}[r](x, y, z)) \right) \right] \tag{S}$$

*The property (TC) relating btw[r] and r*<sup>∗</sup> *can be deduced from the axioms provided that the domain of s is finite.*

$$\forall x, y: s \cdot \left[ \mathsf{btw}[r](x, y, y) \Leftrightarrow r^\*(x, y) \right] \tag{\text{TC}}$$

*Then, calling* btw *the conjunction of all between axioms,* [[*btw[r]*]] = **G** btw*.*

Definition 15 (Semantics of Cervino). *Let Mch be a Cervino machine with axioms* ψ<sub>1</sub>,…,ψ<sub>n</sub>*, events ev*<sub>1</sub>*, …, ev*<sub>m</sub>*, and such that the relations enabling btw are* r<sub>1</sub>,…,r<sub>l</sub>*. Then its semantics is given by the following* FOLTL<sup>∗</sup><sub>=</sub> *formula:*

$$[\![Mch]\!] = \phi\_0 \wedge (\mathbf{G}\phi\_{tr}) \wedge \phi\_{btw}$$

*where* φ<sub>0</sub> = ⋀<sup>n</sup><sub>i=1</sub> ψ<sub>i</sub>*,* φ<sub>tr</sub> = ⋁<sup>m</sup><sub>i=1</sub> [[*ev*<sub>i</sub>]] *and* φ<sub>btw</sub> = ⋀<sub>1≤i≤l</sub> [[*btw*[r<sub>i</sub>]]]*.*

The semantics of a Cervino machine is then an FOLTL<sup>∗</sup><sub>=</sub> formula describing the set of its traces. But, since we aim at verifying systems, we are not only interested in the set of traces but also in the set of counterexamples of a property. This set is also described by an FOLTL<sup>∗</sup><sub>=</sub> formula, which is the conjunction of the semantics of the machine and the negation of the property we aim to check.

Definition 16 (Counterexamples). *Let Mch be a Cervino machine and* φ *an* FOLTL<sup>∗</sup><sub>=</sub> *formula. Then we define* [[*Mch*]]<sub>φ</sub> = [[*Mch*]] ∧ [[¬φ]]*.*

# 4 Basic Transformations

In this section, we present basic transformations used to build the more complex TFC and TTC tactics (presented in Sects. 6 and 7, respectively). These transformations are used to map (the semantics of) a system specification into a more general Geneva formula.

#### 4.1 Transforming Equality

Equality is replaced<sup>4</sup> by a dynamic congruence relation ≡<sub>s</sub>, for every sort s. The signature is therefore extended with these fresh ≡<sub>s</sub> relations.

<sup>4</sup> In practice, we ensure that the semantics of the modifies section, which uses equality, is also affected by this transformation.

Definition 17 (Equality transformation). *Given a fresh binary relation* ≡<sub>s</sub> *for every sort* s *of a formula, the transformation of equality is defined recursively:*


*Furthermore, the following set Eq*<sub>≡</sub> *of axioms is added to the whole specification:*


**G** ∀x̄ : s̄, ȳ : s̄ · (x<sub>1</sub> ≡<sub>s<sub>1</sub></sub> y<sub>1</sub> ∧ … ∧ x<sub>n</sub> ≡<sub>s<sub>n</sub></sub> y<sub>n</sub>) ⇒ (r(x̄) ⇔ r(ȳ))

Lemma 1. *Given an* FOLTL<sub>=</sub> *formula* φ*, if* φ *is satisfiable then Abs*<sub>=</sub>(φ) *is satisfiable (and does not contain* = *anymore).*

*Proof.* Proof validated in Coq. It is easy to see that equality is a particular case of the equivalence relation introduced by this transformation. 
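The core of the equality transformation is a recursive rewrite of equality atoms into applications of the fresh congruence relation. The following sketch illustrates this rewrite on a tuple-based formula encoding; the encoding and the relation naming scheme are illustrative assumptions, not the paper's implementation (which also emits the Eq<sub>≡</sub> axioms).

```python
# Illustrative sketch of Abs_= (Definition 17): every equality atom t1 = t2 at
# sort s is replaced by an application of a fresh congruence relation cong_s.
def abs_eq(f):
    op = f[0]
    if op == 'eq':                        # ('eq', sort, t1, t2) --> cong_s(t1, t2)
        return ('atom', 'cong_' + f[1], f[2], f[3])
    if op in ('not', 'X', 'F', 'G'):
        return (op, abs_eq(f[1]))
    if op in ('and', 'or'):
        return (op, abs_eq(f[1]), abs_eq(f[2]))
    if op in ('forall', 'exists'):        # ('forall', var, sort, body)
        return (op, f[1], f[2], abs_eq(f[3]))
    return f                              # other atoms are unchanged

# G ∀x:Node · x = lmax  becomes  G ∀x:Node · cong_Node(x, lmax)
phi = ('G', ('forall', 'x', 'Node', ('eq', 'Node', 'x', 'lmax')))
print(abs_eq(phi))  # ('G', ('forall', 'x', 'Node', ('atom', 'cong_Node', 'x', 'lmax')))
```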

#### 4.2 Restricted Skolemization

The following transformation corresponds to a form of Skolemization meant to create only new constant symbols. Its main purpose is to introduce constants that can then be used by instantiation (Sect. 4.3). Existentially-quantified variables can be substituted by fresh constants, except when under a **G** connective.

Definition 18 (Skolemization). *Skolemization is defined by the following operation (all fresh constant symbols are added to the signature):*


Lemma 2. *Given an* FOLTL<sub>=</sub> *formula* φ*, then Abs*<sub>∃</sub>(φ) *and* φ *are equisatisfiable.*

*Proof.* Proof validated in Coq. Corresponds to a usual Skolemization procedure. 
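This restricted Skolemization can be sketched as a recursion that substitutes a fresh constant for each existential variable and stops at **G** connectives. The tuple encoding and the `sk…` naming are illustrative assumptions; recall that the syntactic constraint on axioms (Sect. 2.1) already rules out existentials under universal quantifiers, so only constants (never Skolem functions) are needed.

```python
# Illustrative sketch of Abs_∃ (Definition 18): replace existential quantifiers
# by fresh constants, except under a G connective, which is left untouched.
import itertools

_fresh = itertools.count()

def subst(f, var, const):
    """Replace a free variable by a constant in the tuple encoding."""
    if isinstance(f, str):
        return const if f == var else f
    return tuple(subst(g, var, const) for g in f)

def abs_exists(f):
    op = f[0]
    if op == 'exists':                    # ('exists', var, sort, body)
        c = 'sk%d' % next(_fresh)         # fresh Skolem constant
        return abs_exists(subst(f[3], f[1], c))
    if op == 'G':                         # do NOT Skolemize under G
        return f
    if op in ('not', 'X', 'F'):
        return (op, abs_exists(f[1]))
    if op in ('and', 'or'):
        return (op, abs_exists(f[1]), abs_exists(f[2]))
    if op == 'forall':                    # ∃ under ∀ is excluded syntactically
        return (op, f[1], f[2], abs_exists(f[3]))
    return f

# ∃y:Node · p(y) ∧ G q(y): the outer ∃ becomes a constant, the G body is kept.
phi = ('exists', 'y', 'Node', ('and', ('atom', 'p', 'y'), ('G', ('atom', 'q', 'y'))))
print(abs_exists(phi))
```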

#### 4.3 Instantiation

One of the main limitations of the Geneva fragment is the prohibition of temporal operators under universal quantifiers. The solution we propose to this problem is to *finitely* instantiate such universal quantifiers. The following transformation formalizes this idea: all universal quantifiers over temporal formulas are replaced by a conjunction over the set of constants and existentially-bound variables.

Definition 19 (Forall instantiation). *Given a set* I *of constant and variable symbols, we define the transformation of universal quantifiers as follows:*

	- *Abs*<sub>∀,I</sub>(∀x : s · θ) = ⋀<sub>t∈I<sub>s</sub></sub> *Abs*<sub>∀,I</sub>(θ[x ↦ t]) *(where* I<sub>s</sub> *is the set of terms in* I *of sort* s*)*

*Remark 1.* There is no need to transform a universal quantifier if all temporal operators in its scope permute with it; for instance, ∀x · **G**P is equivalent to **G**(∀x · P) and ∀x · (**X**P) ⇒ (**X**Q) is equivalent to **X**(∀x · P ⇒ Q).

Lemma 3. *Given an* FOLTL<sub>=</sub> *formula* φ*, if* φ *is satisfiable and* I ⊆ *Const then Abs*<sub>∀,I</sub>(φ) *is satisfiable.*

*Proof.* Proof validated in Coq. This operation consists in instantiating universal quantifiers, thus preserving satisfiability.
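The instantiation above can be sketched as follows: a universal quantifier whose scope contains a temporal operator is unfolded into a finite conjunction over the terms of I of the right sort. The tuple encoding is an illustrative assumption, not the paper's representation.

```python
# Illustrative sketch of Abs_{∀,I} (Definition 19): replace a universal
# quantifier over a temporal sub-formula by a conjunction over the terms of I.
from functools import reduce

def has_temporal(f):
    if isinstance(f, str):
        return False
    return f[0] in ('X', 'F', 'G') or any(has_temporal(g) for g in f[1:])

def subst(f, var, term):
    if isinstance(f, str):
        return term if f == var else f
    return tuple(subst(g, var, term) for g in f)

def inst_forall(f, I):
    """I maps each sort to the terms (constants) used for instantiation."""
    if isinstance(f, str):
        return f
    op = f[0]
    if op == 'forall' and has_temporal(f[3]):      # ('forall', var, sort, body)
        bodies = [inst_forall(subst(f[3], f[1], t), I) for t in I[f[2]]]
        return reduce(lambda a, b: ('and', a, b), bodies)
    return (op,) + tuple(inst_forall(g, I) for g in f[1:])

# ∀x:Node · F elected(x), instantiated with two constants a and b:
phi = ('forall', 'x', 'Node', ('F', ('atom', 'elected', 'x')))
print(inst_forall(phi, {'Node': ['a', 'b']}))
# ('and', ('F', ('atom', 'elected', 'a')), ('F', ('atom', 'elected', 'b')))
```

Note that, as in Remark 1, a universal quantifier whose temporal operators all permute with it need not be instantiated at all.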

#### 4.4 Addressing Transitive Closure and the Between Relation

Since we target fragments of FOLTL (without transitive closure), we define the transformation *Abs*<sub>∗</sub>(), which leaves a formula unchanged except that it *uninterprets* the operator <sup>∗</sup>, *i.e.*, *Abs*<sub>∗</sub>(φ) returns φ where every occurrence of r<sup>∗</sup> is considered as a new relation symbol, unrelated to r.

Besides, the between relation axioms do not fit into Geneva or LTR, so we define their abstract semantics as follows.

Definition 20 (Transformation of between axioms). *Given a binary relation symbol r, we define btw[r]* = **G** btw *where* btw *is the conjunction of the axioms from Definition 14, except that axiom (S) is replaced by the weaker (AS) below, and axiom (TC) is added:*


$$\forall x, y: s \cdot \left[ r(x, y) \Rightarrow \left( \mathsf{btw}[r](x, y, y) \land (\forall z: s \cdot \mathsf{btw}[r](x, z, z) \Rightarrow \mathsf{btw}[r](x, y, z)) \right) \right] \quad \text{(AS)}$$

$$\forall x, y: s \cdot \left[ \mathsf{btw}[r](x, y, y) \Leftrightarrow r^\*(x, y) \right] \tag{\text{TC}}$$

#### 4.5 Geneva Transformation

The basic transformations introduced above are mainly used together, in a specific order.

Definition 21 (*Geneva* Transformation). *We define:*

$$Abs\_{Gen}(\phi) = Abs\_\*(Abs\_{\forall,Const}(Abs\_{\exists}(Eq\_{\equiv} \land Abs\_{=}(\phi))))$$

Theorem 3. *Given* ψ ∈ FOLTL<sup>ȳ<sub>1</sub></sup>(∀) *and* φ ∈ FOLTL<sup>ȳ<sub>1</sub>∪ȳ<sub>2</sub></sup>(**X**, **F**, ∀)*, then Abs*<sub>Gen</sub>(∃ȳ<sub>1</sub> : s̄<sub>1</sub> · (ψ ∧ **G**(∃ȳ<sub>2</sub> : s̄<sub>2</sub> · φ))) *belongs to the Geneva fragment.*

*Proof.* Recall the conditions to belong to Geneva: (1) no **G** operator in the scope of an existential quantifier that is itself under a **G** connective; (2) no existential quantifier in the scope of a universal quantifier; (3) no equality; (4) no temporal connective in the scope of universal quantifiers; and (5) no transitive closure. Given ψ, φ satisfying the given hypotheses, let us write α = ∃ȳ<sub>1</sub> : s̄<sub>1</sub> · (ψ ∧ **G**(∃ȳ<sub>2</sub> : s̄<sub>2</sub> · φ)). Then, in α, existential quantifiers appear either at the head of the formula or under a **G** operator over the φ formula. Since φ contains no temporal connectives other than **X** and **F**, condition (1) is met. Condition (2) is met as all existential quantifiers appear before universal quantifiers. *Abs*<sub>=</sub>(·) ensures that equality is not used in the final formula, thus ensuring condition (3). *Abs*<sub>∀,*Const*</sub>(·) instantiates all universal quantifiers that contain temporal connectives in their scope (we assume that whenever such a connective could have been swapped with a universal quantifier, this has been done beforehand), which ensures condition (4). Finally, *Abs*<sub>∗</sub>(·) erases the reflexive-transitive closure, ensuring condition (5). Since none of the transformations can introduce formulas breaking any of the conditions, we conclude that *Abs*<sub>Gen</sub>(α) belongs to Geneva.

# 5 TEA: Transforming Existential Quantifiers

We now present the fully-automatic TEA transformation. It starts with the observation that the formula specifying events (see Definition 13) is of the shape **G**(∃x̄ · ⋁<sub>i</sub> *ev*<sub>i</sub>(x̄)), that is, in every state, at least one event is fired. The gist of the TEA transformation is then twofold: (1) we replace these existential quantifiers by *universal* ones; (2) for every such existential quantifier, we add a fresh relation E, which holds only for the constant semantically associated to this quantifier.

The whole resulting abstract specification lies in the LTR fragment, which enjoys the BDP (Theorem 1). The formula specifying events is however more general than the original one, because it allows more transitions to happen. The abstract system may thus violate a property holding on the original specification. But it is now decidable to check whether the property holds in the abstract system and, if so, this entails that it also holds in the original system.

Before presenting the transformation, notice that, in the following, we consider *event formulas*, that is, *primed* FO<sub>=</sub> formulas of the shape φ = ∃y<sub>1</sub> : s<sub>y<sub>1</sub></sub>,…,y<sub>n</sub> : s<sub>y<sub>n</sub></sub> · ∀x<sub>1</sub> : s<sub>x<sub>1</sub></sub>,…,x<sub>m</sub> : s<sub>x<sub>m</sub></sub> · ψ, where ψ is in NNF and does not contain any first-order quantifiers. These formulas naturally arise when putting the semantics of events in prenex normal form. We also suppose we have a supply of fresh relation symbols, written E<sub>i</sub> (one for every y<sub>i</sub>, 1 ≤ i ≤ n).

To devise the transformation and prove its soundness, we first introduce a formula specifying that the E relations are functional. This schema appears in the final abstract specification.

Definition 22 (Functional E relations). *Given an event formula* φ = ∃y<sub>1</sub> : s<sub>y<sub>1</sub></sub>,…,y<sub>n</sub> : s<sub>y<sub>n</sub></sub> · ∀x<sub>1</sub> : s<sub>x<sub>1</sub></sub>,…,x<sub>m</sub> : s<sub>x<sub>m</sub></sub> · ψ*, we define the* functional formula based on φ *as:* Ax<sup>E</sup>(φ) = **G** ⋀<sup>n</sup><sub>i=1</sub> ∀z<sub>1</sub>, z<sub>2</sub> : s<sub>y<sub>i</sub></sub> · (E<sub>i</sub>(z<sub>1</sub>) ∧ E<sub>i</sub>(z<sub>2</sub>)) ⇒ z<sub>1</sub> = z<sub>2</sub>*, where* E<sub>1</sub>,…,E<sub>n</sub> *are fresh unary relation symbols.*

As we introduce these E relations, we also define an enrichment of the event formula accounting for the extended signature. This new formula appears as a link between the two lemmas entailing soundness.

Definition 23 (Enriched event formula). *Given an event formula* φ = ∃y₁ : s_{y₁}, …, y_n : s_{y_n} · ∀x₁ : s_{x₁}, …, x_m : s_{x_m} · ψ*, we define the* enriched event formula based on φ *as:*

$$\overline{\phi} = \mathrm{Ax}^{\mathbb{E}}(\phi) \land \left[ \exists y_1 : s_{y_1}, \dots, y_n : s_{y_n} \cdot \left( \bigwedge_{i=1}^n \mathbb{E}_i(y_i) \land \forall x_1 : s_{x_1}, \dots, x_m : s_{x_m} \cdot \psi \right) \right]$$

*where* E1,...,E<sup>n</sup> *are fresh unary relation symbols.*

We now present the essential part of the transformation, which turns an event formula φ into a purely universal one U[φ], more general than φ. In other words, U[φ] allows more transitions than φ if we ignore the specification of E₁, …, E_n. To do so, for any variable y whose corresponding fresh relation is E, we proceed as follows. First, equality between y and another variable is replaced with the relation E applied to the latter; and any other literal ℓ containing y is replaced by E(y) ⇒ ℓ. Once these transformations are done, it is possible to replace the existential quantification over y by a universal quantification.

Definition 24 (Transformation). *Given an event formula* φ *of shape* ∃y₁ : s_{y₁}, …, y_n : s_{y_n} · ∀x₁ : s_{x₁}, …, x_m : s_{x_m} · ψ*, we define the* (TEA) transformation function *on* φ *as:*

$$\mathbb{U}[\phi] = \forall y_1 : s_{y_1}, \dots, y_n : s_{y_n} \cdot \mathbb{U}_{\vec{y}}[\forall x_1 : s_{x_1}, \dots, x_m : s_{x_m} \cdot \psi]$$

*where* ȳ = {y₁, …, y_n} *and where* E₁, …, E_n *are fresh relation symbols (one for every* y ∈ ȳ*); with* U_ȳ[ψ] *defined recursively as follows:*


– U_ȳ[ℓ] = (⋀_{k=1}^{i} E_{a_k}(y_{a_k})) ⇒ ℓ = (⋁_{k=1}^{i} ¬E_{a_k}(y_{a_k})) ∨ ℓ, *where* ℓ *is a (possibly primed) literal and* {y_{a₁}, …, y_{a_i}} = FV(ℓ) ∩ ȳ
– *(the rest is just a recursive walk on formulas)*

*Example 1.* Consider the following event formula, stating that there is an event making R true in the next state for a variable y (other variables remain unchanged w.r.t. R): φ = ∃y : A · R′(y) ∧ (∀x : A · x ≠ y ⇒ (R(x) ⇔ R′(x))), that is, in prenex form: ∃y : A · ∀x : A · R′(y) ∧ (x = y ∨ (¬R(x) ∧ ¬R′(x)) ∨ (R(x) ∧ R′(x))). Then there is only one fresh E relation, and U[φ] is:

∀y, x : A · (¬E(y) ∨ R′(y)) ∧ (E(x) ∨ (¬R(x) ∧ ¬R′(x)) ∨ (R(x) ∧ R′(x)))

Now, the following lemma states that every model of the enriched event formula is also a model for the transformed event formula.

Lemma 4. *Given an event formula* φ = ∃y₁ : s_{y₁}, …, y_n : s_{y_n} · ∀x₁ : s_{x₁}, …, x_m : s_{x_m} · ψ*, we have* φ̄ ⊨ U[φ]*.*

*Proof.* Proof validated in Coq.

Lemma 5 applies to a formula representing a whole specification: if such a specification is satisfiable, then a certain transformed version of it is satisfiable too.

Lemma 5. *Let* θ *be an* FOLTL⁼ *formula, and* φ *be an event formula on the same signature. Then if* θ ∧ **G**φ *is satisfiable,* θ ∧ **G**(U[φ]) ∧ Ax^E(φ) *is also satisfiable.*

*Proof.* Proof validated in Coq.

Definition 25 (Abstract semantics). *Given a Cervino machine Mch such that the relations enabling btw are* r₁, …, r_l*, we define* U⟦Mch⟧ = φ₀ ∧ **G** U[φ_tr] ∧ φ_btw*, where* φ₀ *and* φ_tr *are defined as in Definition 15 and* φ_btw = ⋀_{1≤i≤l} btw[rᵢ]*. Also, given an* FOLTL⁼ *formula* φ*, we define* U⟦Mch⟧_φ = Abs∗(U⟦Mch⟧ ∧ φ)*.*

Theorem 4 (Soundness). *If* ⟦Mch⟧_φ *is satisfiable, then* U⟦Mch⟧_φ *is also satisfiable.*

*Proof.* This is a direct application of Lemma 5.

Theorem 5. *Given a Cervino machine Mch such that* φ₀ *and* φ_tr *are defined as in Definition 15, if* φ₀, φ ∈ LTR *then* U⟦Mch⟧_φ ∈ LTR*.*

*Proof.* Directly follows from the definition of U⟦·⟧.

# 6 TFC: Transforming Frame Conditions

The TEA transformation has the advantage of being fully automatic but it can be inconclusive in a number of cases. For instance, the verification of a distributed system involving strong interactions between its components, which induce events with two or more parameters, is likely to be inconclusive using TEA. This is because the universal quantifiers introduced by TEA abstract these interactions (which are naturally expressed with existential quantifiers) too drastically.

In this section, we present another transformation, called TFC, which overcomes these limitations but requires some intervention from the specifier.

Instead of targeting the LTR fragment, we now target the Geneva one, which allows existential quantifiers in the scope of **G**, but forbids temporal formulas in the scope of a universal quantifier. As a consequence, frame conditions, which are typically of shape ∀x : s · ϕ_cond ⇒ (r(x) ⇔ **X**r(x)), are not expressible in Geneva. In order to fit into it, such universal quantifiers are instantiated over constants (see Abs_{∀,Const}(·) defined in Sect. 4.3). But then a large part of the information included in the frame conditions is lost. Therefore, as a finer transformation of frame conditions, we associate with each event a particular kind of invariant property, called a stability axiom. Intuitively, a stability axiom is a pure FO formula that is preserved by an event. Since it is expressed in pure FO, the preservation of a stability axiom is expressible in Geneva.

Definition 26 (Stability Axiom). *Given a set of frame conditions* C*, an* FO *formula* φ *is a* stability axiom *for* C *if* C ⊨ φ ⇒ **X**φ*.* St_C *denotes the set of stability axioms for* C*.*

The specification of stability axioms is a creative step, but it can be eased with the help of a syntactic condition that is sufficient for a formula to be a stability axiom. The idea is that a formula of the following shape is necessarily a stability axiom: ϕ_hyp ⇒ ϕ, where ϕ_hyp corresponds to the guard of a frame condition that leaves a relation r unchanged, and ϕ only refers to the relation r.

*Example 2.* In order to illustrate the use of stability axioms, let us consider the leader election distributed system, introduced in Sect. 2. Since TEA does not succeed in proving the safety property, we can try TFC with the following stability axiom for event send:

∀ x, y : Node · ¬succ(src, x) ⇒ (¬toSend(x, y) ∨ (x ≠ lmax ∧ btw[succ](x, lmax, y)))

This axiom expresses that if a node x different from the successor of src has an ID y in its mailbox, then the node with the greatest ID is located between x and y (recall that a node and its ID are conflated). This means that outside the scope of the event, an ID cannot jump over the node with the greatest identifier.

Exhibiting this stability axiom requires some work. It would also be possible to proceed using an inductive invariant but, since the property to check is not inductive, doing so would also require some effort.

*Example 3.* In order to illustrate the difference between stability axioms and inductive invariants, we take a toy token protocol as an example. For the sake of simplicity, we consider a property to check that is already inductive. The protocol features one token passing from node to node with only one event send(x, y), with body: token(x) ∧ ¬token'(x) ∧ token'(y) ∧ frame, where frame := ∀z · (z ≠ x ∧ z ≠ y) ⇒ (token(z) ⇔ token'(z)).

The (inductive) property to check is that there is always at most one node holding the token. To prove this property without relying on its inductiveness, we can use the following stability axiom: stab := ∀z · (z ≠ x ∧ z ≠ y) ⇒ ¬token(z). Contrary to the inductive invariant, the stability axiom has free variables matching the parameters of the event (which are implicitly quantified existentially). Also, the preservation of the stability axiom follows from the frame condition alone, as frame ⊨ stab ⇒ **X**stab, while the preservation of the inductive invariant follows from the whole transition.
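The entailment frame ⊨ stab ⇒ **X**stab can be checked exhaustively on a small instance of this toy protocol. The following Python sketch (our own explicit-state encoding over three nodes, not Cervino output) enumerates all pairs of token assignments and all event parameters:

```python
from itertools import product

NODES = range(3)

def frame(x, y, s, s2):
    """Frame condition of send(x, y): token unchanged outside {x, y}."""
    return all(s[z] == s2[z] for z in NODES if z != x and z != y)

def stab(x, y, s):
    """Stability axiom: no node outside {x, y} holds the token."""
    return all(not s[z] for z in NODES if z != x and z != y)

def frame_entails_stab_preservation():
    # For every pre-state s, post-state s2 and parameters (x, y):
    # frame and stab(s) must imply stab(s2).
    for s in product([False, True], repeat=3):
        for s2 in product([False, True], repeat=3):
            for x in NODES:
                for y in NODES:
                    if frame(x, y, s, s2) and stab(x, y, s):
                        if not stab(x, y, s2):
                            return False
    return True
```

The check succeeds precisely because the frame condition freezes the truth value of token on every node the stability axiom constrains.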

*Remark 2.* Notice that this property is also true for the nodes that are in the scope of the event, *i.e.*, src and its successor. So in this case, the stability axiom is very close to an invariant property. But this is not the case in general. A distinguishing aspect is that TFC with this stability axiom succeeds in proving the safety property, whereas it would not be possible to deduce it from the "invariant" version of this stability axiom.

The TFC transformation is performed in two phases: first, every event is enriched with a stability axiom (Definitions 27 and 28); then the Geneva abstraction is applied to the enriched machine (Definition 29).


Definition 27 (Event enrichment with a stability axiom). *Let ev be an event of a Cervino machine declared as* event ev[ȳ : s̄] modif {τ} *and* C *be the frame condition of ev,* C = ⟦modif⟧*. Given a stability axiom* I *for* C*, we define the enrichment* ρ⟦ev, I⟧ *of ev with* I *as:* ρ⟦ev, I⟧ = ∃ȳ : s̄ · τ ∧ C ∧ (I ⇒ **X**I)*.*

Definition 28 (Cervino machine enrichment with stability axioms). *Let Mch be a Cervino machine with axioms* ψ₁, …, ψ_n*, events* ev₁, …, ev_m *declared as* event evᵢ[ȳᵢ : s̄ᵢ] modif {τᵢ} *for each* i ∈ 1..m*, and such that the relations enabling btw are* r₁, …, r_l*. Let* sta *be a function mapping each event to a stability axiom for the corresponding frame condition. Using the same notation as Definition 27, we define the stability axiom enrichment* ρ⟦Mch, sta⟧ *of Mch as* ρ⟦Mch, sta⟧ = φ₀ ∧ **G**φ_tr ∧ φ_btw

$$\text{where } \phi_0 = \bigwedge_{i=1}^n \psi_i, \quad \phi_{tr} = \bigvee_{i=1}^m \rho\llbracket ev_i, \mathtt{sta}(ev_i)\rrbracket \quad \text{and} \quad \phi_{btw} = \bigwedge_{1 \le i \le l} btw[r_i].$$

Definition 29 (Abstract semantics). *Given a Cervino machine Mch and a function* sta *mapping each event to a stability axiom, we define the stability axiom semantics as* F⟦Mch⟧_φ = Abs_{Gen}(ρ⟦Mch, sta⟧ ∧ φ)*.*

Theorem 6 (Soundness). *If* ⟦Mch⟧_φ *is satisfiable then* F⟦Mch⟧_φ *is satisfiable.*

*Proof.* Follows from Lemmas 1, 2 and 3.

Theorem 7. *If* φ ∈ LTR *then* F⟦Mch⟧_φ ∈ Geneva*.*

*Proof.* Follows from Theorem 3.

# 7 TTC: Transforming Reflexive-Transitive Closure

We now present a simple, effective transformation technique to approximate reflexive-transitive closure (which is present in Cervino and its FOLTL∗⁼ semantics). This technique has proven useful to prove some liveness properties.

As is well known, transitive closure cannot be fully specified in pure FO. On the other hand, it *can* be specified in pure FOLTL, but the axiomatization we are aware of does not fit in the fragments considered here. However, it is possible to define an interesting *approximation* that does fit in the Geneva fragment.

Informally, the crux of our technique relies on the following observation: *any property propagating along a binary relation will eventually propagate to the reflexive-transitive closure thereof*. This is proved (see Theorem 8 below) by following the definitions of the transitive closure and of the eventually connective.

Definition 30 (Propagation schema). *Given binary relations r and t on a sort* s*, a formula* P *with* k + 1 *free variables (*k ≥ 0*), the first of which (of sort* s*) is distinguished in the following, and* k *variables* x̄ *of appropriate typing, we define the* propagation *and* closure schemas *as follows:*

$$\begin{aligned} \textit{Propagates}[r, P, \vec{x}] &= \forall u, v : s \cdot r(u, v) \Rightarrow \mathbf{G}\left(P[u, \vec{x}] \Rightarrow \mathbf{F}P[v, \vec{x}]\right) \\ \textit{Closure}[r, t, P, \vec{x}] &= \textit{Propagates}[r, P, \vec{x}] \Rightarrow \textit{Propagates}[t, P, \vec{x}]. \end{aligned}$$

Theorem 8 (Propagation). *Given a binary relation* r *on a sort* s*, the following property over its reflexive-transitive closure* r∗ *is valid: Closure*[r, r∗, P, x̄]*.*

*Proof.* Proof validated in Coq.

The proof sketch is the following: we consider the set of elements to which the property eventually propagates. We then use the hypothesis that the property propagates along the binary relation r to show that this set is closed under r. As the transitive closure from some element is the smallest set closed under r, the property propagates to any element in the transitive closure.

More precisely, we prove that, under the Propagates[r, P, x̄] hypothesis, for any u, the set of v's satisfying **G**(P[u, x̄] ⇒ **F**P[v, x̄]) includes all v's reachable from u along r. Let M be a structure and C an assignment s.t. M, C ⊨ Propagates[r, P, x̄]. We assume that there is an instant i such that P[u, x̄] holds (otherwise the satisfaction of the schema is trivial). Then M, i, C ⊨ **F**P[u, x̄]. Now, given v s.t. M, i, C ⊨ **F**P[v, x̄], there exists k ≥ i s.t. M, k, C ⊨ P[v, x̄]. For any v′ s.t. M, 0, C ⊨ r(v, v′), Propagates[r, P, x̄] implies M, k, C ⊨ **F**P[v′, x̄], thus M, i, C ⊨ **F**P[v′, x̄]. The set of v's satisfying **F**P[v, x̄] at instant i therefore contains u and is closed under r, so M, 0, C ⊨ r∗(u, v) implies M, i, C ⊨ **F**P[v, x̄]. Hence Closure[r, r∗, P, x̄] is valid.
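The theorem can be observed on a finite instance: if P propagates one step along r at each tick, it eventually reaches exactly the r∗-successors of the nodes where it holds initially. The following Python sketch (a toy illustration with names of our choosing, not part of the Cervino toolchain) compares the propagation fixpoint with an independently computed reflexive-transitive closure:

```python
def rstar(r, nodes):
    """Reflexive-transitive closure of r, as a set of pairs (fixpoint)."""
    closure = {(u, u) for u in nodes} | set(r)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in r:
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def propagate(r, p0):
    """Iterate 'P propagates along r' until a fixpoint is reached."""
    p = set(p0)
    while True:
        nxt = p | {v for (u, v) in r if u in p}
        if nxt == p:
            return p
        p = nxt

nodes = {0, 1, 2, 3, 4}
r = {(0, 1), (1, 2), (2, 0), (3, 4)}   # a cycle plus a separate edge
p0 = {0}                               # P holds initially at node 0
```

Here `propagate` stabilises on {0, 1, 2}, which is exactly the set of nodes r∗-reachable from node 0, mirroring the closure schema on this instance.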

Given this theorem, the technique we propose consists in replacing the reflexive-transitive closure of a relation (which fits in Cervino and FOLTL∗⁼) by an uninterpreted relation satisfying the closure schema shown above, for some property P that depends on the sort of the considered binary relation as well as, possibly, other arguments. Remark that finding such a property P requires creativity: the specifier must come up with a relevant propagating property.

*Example 4.* In the case of the leader election example, we use TTC to check that a leader will be elected at some point. The property we use is propagation along *succ* of having a given ID in one's mailbox (Propagates[*succ*,*toSend*, *id*]).

Definition 31 (Abstract semantics). *Let Mch be a Cervino machine such that* r_{j₁}, …, r_{j_l} *are binary relations enabling btw, and* r_{k₁}, …, r_{k_m} *are binary relations whose reflexive-transitive closure is used in Mch. Now, given formulas* P₁, …, P_m*, where for every* 1 ≤ i ≤ m*,* FV(Pᵢ) = {x, x₁, …, x_{n_i}} *(with* x *the distinguished free variable), we define the transitive closure transformation as:*

$$\begin{aligned} \mathsf{T}\llbracket Mch \rrbracket = {} & \phi_0 \wedge \mathbf{G}\phi_{tr} \wedge \phi_{btw} \\ & \wedge \Big(\bigwedge_{1 \leqslant i \leqslant m} \; \bigwedge_{(c_1, \ldots, c_{n_i}) \in Constant^{n_i}} \textit{Closure}[r_{k_i}, r_{k_i}^{*}, P_i, (c_1, \ldots, c_{n_i})]\Big) \end{aligned}$$

*where* φ₀ *and* φ_tr *are defined as in Definition 15 and* φ_btw = ⋀_{1≤i≤l} btw[r_{j_i}]*.*

*We also define* T⟦Mch⟧_φ = Abs_{Gen}(T⟦Mch⟧ ∧ φ) *(notice that, due to the application of the Geneva transformation, the* rᵢ∗ *relations become uninterpreted).*

Theorem 9 (Soundness). *If* ⟦Mch⟧_φ *is satisfiable then* T⟦Mch⟧_φ *is satisfiable.*

*Proof.* Follows directly from Theorem 8 and Lemmas 1, 2 and 3.

Theorem 10. *If* φ ∈ LTR *then* T⟦Mch⟧_φ ∈ Geneva*.*

*Proof.* Follows from Theorem 3.

# 8 Evaluation

To evaluate the relevance of our three tactics, we applied them to several models of distributed protocols. Our research questions were (1) to check that our methods were applicable to real models; (2) to check whether our approach was efficient enough; and (3) to assess the effort for the specifier to come up with parameters for TFC and TTC. Our strategy was always to first apply the TEA tactic. If TEA failed, then in the case of safety properties, we devised stability axioms in order to apply TFC. Otherwise, for liveness properties and for systems relying on transitive closure, we relied on TTC.

Fig. 3. All verifications take less than 20 s ("effort": estimation of user effort with the number of atoms (literals and equality tests) used in the TTC or TFC parameters).

The Cervino prototype takes a Cervino specification as input and generates Electrum models which are then fed to the Electrum Analyzer [4], which itself calls a complete procedure in nuXmv [5]. On a general note, efficiency can be compromised in the case of the TTC and TFC tactics due to larger inferred bounds than for TEA. Furthermore, the size of LTL formulas generated by Electrum for nuXmv grows quickly as the tool merely unfolds quantifiers into conjunctions and disjunctions, depending on the bounds. For this reason, we leveraged some properties of the Geneva fragment to end up with smaller models: (1) the size of each domain is an *exact* bound rather than just an upper one; (2) all constants are distinct; (3) existential quantifiers can be unfolded on a limited part of the domain. This is the case because the proof of the BDP for the Geneva fragment [24] shows that, if there is a model of a Geneva formula, there is a model satisfying these properties. The specifications we evaluated are of moderate complexity but are not just toy models.
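The blow-up caused by quantifier unfolding can be made concrete with a schematic sketch (the string encoding and function name are ours): over a bounded domain, ∀ becomes a conjunction and ∃ a disjunction over all domain elements, so nesting quantifiers multiplies the formula size by the domain size at each level.

```python
def unfold(quantifier, var, body, domain):
    """Unfold a bounded quantifier into a propositional connective:
      forall x . P(x)  ->  P(c1) & ... & P(cn)
      exists x . P(x)  ->  P(c1) | ... | P(cn)
    Formulas are plain strings here, purely to show the size blow-up."""
    instances = [body.replace(var, str(c)) for c in domain]
    joiner = ' & ' if quantifier == 'forall' else ' | '
    return '(' + joiner.join(instances) + ')'
```

For a formula with q nested quantifiers over a domain of size n, the unfolded propositional formula contains n^q copies of the matrix, which is why tight exact bounds matter.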


Our conclusion to these case studies is the following (Fig. 3). First, the TEA tactic is only effective for models involving few interactions, which can be attributed to the loss of precision when using universal quantifiers. Regarding TFC, the effort required to find stability axioms seems similar to that of finding an inductive invariant. For TTC, all propagating properties were very simple. Finally, we noticed that, for more complex systems, TTC and TFC can lead to problems that are too large for the model checker to answer in time (*e.g.*, 1 h).

# 9 Related Work

The usual way to check a safety property is to exhibit an inductive invariant for the system. The TEA tactic is completely automatic and can handle safety properties but remains quite limited. In our experiments, the TFC tactic proved to be as flexible as an invariant for proving safety properties. Finding stability axioms or an inductive invariant appears similar in difficulty. However, once found, checking an inductive invariant is quicker in computation time than checking the abstract system obtained with stability axioms. On the other hand, stability axioms make it possible to check complex temporal properties.

Regarding liveness properties, important approaches are based on exhibiting a variant or on the liveness-to-safety reduction method proposed in [19]. For the simple examples handled with TEA, such methods would allow proving the properties with little effort, if done right, but contrary to the TEA tactic, they are not fully automatic. In both cases, the computation time is very low.

With the TTC approach, we do not need to exhibit any sort of invariant, and the propagating property to exhibit has always been straightforward. To our knowledge, there is no easier method to prove some of the examples we presented. For example, the liveness property of the leader election protocol requires exhibiting a variant and an invariant, and both are harder to exhibit than the propagating property. The liveness-to-safety reduction method also applies here, but it requires finding an invariant on the system obtained by reduction, as well as an axiomatization of the reflexive-transitive closure preserving the liveness property (while this axiomatization is embedded in the TTC tactic). However, despite being more immediate in our examples, the TTC tactic is less flexible than these two alternatives since it only applies to liveness properties based on the reflexive-transitive closure.

Our approach can also be compared with the verification of parameterized systems. Cubicle [7–10] is an SMT-based model checker for the verification of safety properties on parameterized systems. Cubicle is efficient on challenging systems but, contrary to our techniques, it enforces strict syntactic constraints on guards and on the checked property. Other techniques, based on labelled proof systems, have also been proposed [2]. In [15], the safety of the TLB Shootdown algorithm is proved using such a technique. There, the user must exhibit the correct invariant for the proof system to conclude, while the TEA tactic is automatic. Also, some methods, such as invisible invariants [25], rely on automatically finding a candidate inductive invariant and then checking whether it is indeed one, without needing any input from the user. Such an approach is automatic and efficient but only applies to bounded-data parameterized systems, whereas our methods apply in a wider context. While most work on parameterized systems focuses on safety properties, [11] addresses liveness properties, but remains essentially theoretical. We remark that the techniques mainly used for parameterized systems are mostly orthogonal to those presented in this paper, and a combination of both could be fruitful.

# 10 Conclusion

We devised three original, sound (but incomplete) transformations that allow checking that a state machine specification, expressed in a rather expressive fragment of FOLTL∗⁼, enjoys a *temporal* property, expressed in the same setting, whatever the bounds on domains (associated with sorts) are. The transformations were proved correct in Coq. We evaluated our approach on several case studies and found that the transformations were effective and, for the semi-automatic ones, demanded an effort comparable to other approaches. A drawback is that the computed bounds can sometimes grow too large for model checking to be feasible with the back-end tools we used. Notice that our approach is orthogonal to the main other approaches (for instance, inference of invariants) and could certainly be combined with some of them. Once a universally quantified inductive invariant Inv is found, such a combination would be possible by adding an axiom of the form **G** Inv to our abstract specification. This refines the abstraction while fitting in both LTR and Geneva. This is left for future work.

Acknowledgements. We thank the article and artifact anonymous reviewers for their remarks that helped improve this work, as well as Nuno Macedo for his technical assistance.

# References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Hardware and Model Checking**

# **Progress in Certifying Hardware Model Checking Results**

Emily Yu<sup>1</sup>(B), Armin Biere<sup>1</sup>, and Keijo Heljanko<sup>2,3</sup>

<sup>1</sup> Johannes Kepler University, Linz, Austria <sup>2</sup> University of Helsinki, Helsinki, Finland <sup>3</sup> Helsinki Institute for Information Technology, Helsinki, Finland

**Abstract.** We present a formal framework to certify k-induction-based model checking results. The key idea is the notion of a k-witness circuit which simulates the given circuit and has a simple inductive invariant serving as proof certificate. Our approach allows checking proofs with an independent proof checker by reducing the certification problem to pure SAT checks and checking a simple QBF with one quantifier alternation. We also present Certifaiger, the resulting certification toolkit, and evaluate it on instances from the hardware model checking competition. Our experiments show the practical use of our certification method.

# **1 Introduction**

In many verification applications, k-induction [34] (also known as temporal induction) is used as a powerful technique that reduces model checking to a series of SAT problems. It has been extensively investigated as an effective approach for unbounded model checking [18,22]. As a generalisation of simple induction, for a given safety property, the k-induction method consists of a base case and an inductive case: the base case is a bounded model checking problem of depth k; the inductive case assumes the property holds for k consecutive steps and checks that it also holds at the (k + 1)-th step. The safety property is said to be k-inductive if both conditions are satisfied. The nature of the k-induction algorithm allows it to be integrated with modern SAT/SMT solvers. For example, reduction techniques such as preprocessing have been investigated with k-induction in an incremental setting [17]. The present state of the art also concerns combining k-induction with existing SAT-based model checking (SMC) techniques, including interpolation and property directed reachability [23,27]. Furthermore, k-induction has also been extended to the context of infinite-state systems [13,19,26,32], as well as software verification [16]. Another variant of this line of research is the use of k-induction in sequential equivalence checking [31].
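The base and inductive checks can be illustrated on a tiny explicit-state system. The sketch below is our own toy encoding with assumed names (real tools discharge both checks with a SAT solver over symbolic unrollings, not by enumeration); it exhibits a property that is 2-inductive but not 1-inductive.

```python
from itertools import product

def paths(trans, states, length):
    """All state sequences of the given length that follow trans."""
    for p in product(states, repeat=length):
        if all(p[i + 1] in trans[p[i]] for i in range(length - 1)):
            yield p

def k_inductive(states, init, trans, prop, k):
    # Base case: every initialised path of length <= k satisfies prop.
    for p in paths(trans, states, k + 1):
        if p[0] in init and not all(prop(s) for s in p):
            return False
    # Inductive case: prop holding on k consecutive states implies
    # prop on the next state, regardless of where the path starts.
    for p in paths(trans, states, k + 1):
        if all(prop(s) for s in p[:-1]) and not prop(p[-1]):
            return False
    return True

# Toy system: state 3 is a bad sink, only reachable from the
# (unreachable) state 2, so prop fails simple induction but is
# 2-inductive: no prop-satisfying 2-step path can reach state 3.
states = [0, 1, 2, 3]
init = {0}
trans = {0: {1}, 1: {0}, 2: {3}, 3: {3}}
prop = lambda s: s != 3
```

With k = 1 the inductive case fails on the edge 2 → 3 even though state 2 is unreachable; with k = 2 both checks pass.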

Model checking has been an effective technique for the verification of safety-critical systems. In particular, applications deployed in industrial settings, such as nuclear facilities, increasingly utilise model checking to gain trust in the correctness of their designs [20,30,36]. In such ultra safety-critical applications, the certification that the model checking results are in fact correct is crucial. We argue that in model checking, generic machine-checkable certification is still in its infancy, in contrast to related fields. For instance, in SAT competitions [2,24], certifiable proofs are mandatory. This has helped to improve the trust we have in SAT solving results, as well as the quality of SAT solvers, tremendously.

Even though counterexample validation is commonly used in model checking to certify negative verification results through simulation, producing a generic machine-checkable proof on success is less straightforward. To mitigate this problem, certification of model checking has been suggested earlier in [14,21,23,29,33,36,37], but the methods presented in these works are either not directly applicable to k-induction (in its vanilla form), produce k-induction-specific certificates (fail to provide an inductive invariant), or are considered to have exponential certificates. This apparently made it hard to, e.g., require all model checkers to produce proofs in the hardware model checking competitions.

As symbolic model checking of bit-level properties for hardware circuits is PSPACE-complete, we introduce in this paper a novel certification framework for k-induction-based model checking. Our proposed approach generates a fixed number of SAT problems together with a QBF with only one quantifier alternation, which are verified by an independent certifier, thereby enabling the certification of k-induction proofs at lower complexity. Our method efficiently extends the given model checking problem to finding a simple inductive invariant of a larger circuit as a proof of k-induction of the original circuit. In particular, the certificate size (as a circuit) is shown to be linear in the size of the given model and the inductive depth. We present Certifaiger, which works as a complete tool suite for certification, independent of any model checker. Experimental results show that our technique works efficiently and can be adapted for practical use.

The rest of the paper is organised as follows: In Sect. 2 we introduce the notion of combinational simulation in the context of circuits. In Sect. 3, we study the formal property of combinational simulation and define k-induction-based model checking with an example. In Sect. 4, we present our proposed certification approach followed by theoretical results in terms of k-induction. We describe the implementation of our tool suite in Sect. 5, and report on experimental results in Sect. 6. Finally, we conclude in Sect. 7.

# **2 Circuits**

In this section, we present a slightly non-standard notation to formalize systems. It allows us to represent systems and particularly circuits symbolically in a compact way and is crucial to reduce notational clutter in the following.

Let B(V) be the set of Boolean expressions (propositional formulas) over the Boolean variables V. We also write B(I, L) to denote the set of Boolean expressions over I ∪ L, where I and L are two sets of Boolean variables. Given two Boolean expressions f(V), g(V) ∈ B(V), we call them *equivalent*, written f(V) ≡ g(V), if they have the same models. This notation is also applied to Boolean expressions over different sets of variables by simply interpreting them over the union of their variables. We use "↔" for syntactic equivalence [15], "→" for syntactic implication, and "⇒" for semantic implication. To define semantic concepts or abbreviations we stick to equality "=".
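The semantic notion of equivalence above can be made concrete with a naive enumeration. The Python sketch below (the functional representation is our own choice) checks whether two expressions have the same models over the union of their variables:

```python
from itertools import product

def equivalent(f, g, variables):
    """f ≡ g iff they agree on every assignment over `variables`
    (the union of the variable sets of f and g)."""
    return all(
        f(dict(zip(variables, vals))) == g(dict(zip(variables, vals)))
        for vals in product([False, True], repeat=len(variables))
    )

# f = a -> b and g = not (a and not b) are equivalent over {a, b};
# h = not a is compared with f over the same union of variables.
f = lambda m: (not m['a']) or m['b']
g = lambda m: not (m['a'] and not m['b'])
h = lambda m: not m['a']
```

Enumerating all 2^|V| assignments is exponential, which is precisely why the later sections reduce such checks to SAT instead.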

In the context of this paper, models are expressed in the form of finite logical circuits, where states can be seen as truth assignments to latches and inputs. Initial states are defined by the reset values of latches, in our case represented by their reset functions. For each latch l in L, there is a reset function r_l(L), which is a formula (Boolean expression) over the set of latches L, thus allowing cyclic definitions. Note that a cyclic definition can lead to unsatisfiable reset formulas, in which case there are simply no initial states. Additionally, for some L′ ⊆ L, we define R(L′) = ⋀_{l∈L′} (l ↔ r_l(L)) to allow us to analyse the reset functions of individual subsets of latches. The transition relation is expressed as a "next state" formula associated with each latch, whereas non-determinism comes from inputs (which act as the environment). The successor value of each latch is defined by applying its transition function to the current values of latches and inputs. Intuitively, a safety property specifies that the system must not violate certain behaviours, i.e., only "good states" are reachable. In this paper we focus on such simple safety properties and leave liveness properties (see e.g., [29]) etc. for future work.

**Definition 1 (Circuit).** *A circuit* C = (I, L, R, F, P) *consists of a set of input variables* I*, a set of latch variables* L *disjoint from* I*, the reset functions* $R = \{r_l(L) \mid l \in L\}$*, the transition functions* $F = \{f_l(I, L) \mid l \in L\}$*, and a safety property* $P(I, L) \in B(I, L)$*.*

The reset functions characterise the initialisation of the circuit; this definition of reset abstracts over the way circuits are actually reset. As a shorthand we use $L \leftrightarrow F(I, L)$ to denote the conjunction of the corresponding equivalences, i.e., it is interpreted as $\bigwedge_{l \in L} (l \leftrightarrow f_l(I, L))$. For clarity, we use **subscripts** as in $L_i$ to denote a copy of the latch variables L in the **temporal direction** at timestamp i, where $L_0$ is the set of latches at timestamp 0, when the circuit is supposed to be initialised. Note that using such transition *functions* to describe transition relations implies that there is always a successor state. The temporal evolution of a system is expressed using the notion of *unrolling*, which has a specific length and follows the transition relation at each step.

**Definition 2 (Unrolling).** *For an unrolling depth* m ∈ N*, the* unrolling *of a circuit* C *of length* m *is defined as the formula*

$$U\_m = \bigwedge\_{i \in [0,m)} (L\_{i+1} \leftrightarrow F(I\_i, L\_i)).$$

Note that in this definition we use $I_i$ and $L_i$ as sets of variables, whereas $U_m$ is a formula. For m = 0 the conjunction is empty and the formula is thus trivially true.
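Definition 2 can be read operationally: a trace of timestamped assignments satisfies $U_m$ exactly when every adjacent pair of latch assignments is related by the transition functions. A small sketch, where the toggle circuit used as an example is an illustrative assumption:

```python
# F maps each latch name to a function of (inputs, latches); a trace is a
# list of (inputs, latches) assignment pairs, one per timestamp.
def unrolling_holds(F, trace):
    # U_m = AND over i in [0, m) of (L_{i+1} <-> F(I_i, L_i))
    return all(
        trace[i + 1][1][l] == bool(f(trace[i][0], trace[i][1]))
        for i in range(len(trace) - 1)
        for l, f in F.items()
    )

# One latch "t" that toggles whenever the input "e" is set.
F = {"t": lambda inp, lat: lat["t"] != inp["e"]}
good = [({"e": True}, {"t": False}),
        ({"e": False}, {"t": True}),
        ({"e": True}, {"t": True})]
bad = [({"e": False}, {"t": False}),
       ({"e": False}, {"t": True})]
print(unrolling_holds(F, good), unrolling_holds(F, bad))  # → True False
```

The second trace violates $U_1$ because the latch changes value even though the enable input is off.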

**Definition 3 (Initialised unrolling).** *An* initialised unrolling *of a circuit* C = (I, L, R, F, P) *is defined as* $U_m \wedge R(L_0)$*, where* $U_m$ *is an unrolling.*

We say an unrolling is *safe* if and only if the property holds at every timestamp along the whole length of the unrolling.

**Definition 4 (Safe unrolling).** *An unrolling* $U_m$ *of a circuit* C = (I, L, R, F, P) *is said to be* safe *if*

$$U\_m \Rightarrow \bigwedge\_{i \in [0,m]} P(I\_i, L\_i).$$

**Definition 5 (Safe initialised unrolling).** *An initialised unrolling* $U_m \wedge R(L_0)$ *of a circuit* C = (I, L, R, F, P) *is said to be* safe *if*

$$U\_m \wedge R(L\_0) \Rightarrow \bigwedge\_{i \in [0, m]} P(I\_i, L\_i).$$

We are now ready to introduce the notion of a *combinational extension* between two circuits. It is purely syntactic, based on sharing inputs and latches.

**Definition 6 (Combinational extension).** *Given circuits* C = (I, L, R, F, P) *and* C′ = (I′, L′, R′, F′, P′)*,* C′ combinationally extends C *if* I = I′ *and* L ⊆ L′*.*

As noted above, this definition allows us to interpret the inputs and latches of one circuit as being part of another circuit. In practice we simply assume that, under some ordering of the latches, the first |L| latches of the circuit C′ are mapped to those of C, as is for instance the case in the AIGER format [7] used in the Hardware Model Checking Competition (HWMCC) [5].

To tackle the problem of generating a proof certificate for the k-induction-based safety of a circuit C, which is the main goal of this paper, we extend it to a larger circuit C′ with additional "book-keeping" behaviours [1], for which we can show the same property using standard induction. To ensure that the resulting extended circuit C′ preserves the original property, we provide a formalisation through a *combinational simulation* relation between two circuits, which needs to be formally verified by a certifier. One important aspect of our design principles is to keep the complexity of the required certification procedure low; in other words, certification should be possible via pure SAT solver checks or by solving a QBF with at most one quantifier alternation. This leads to a more involved, non-standard design of the certification approach, the details of which are described in Sect. 4.

From a practical perspective, under the *combinational simulation* relation defined below in Definition 7, we require that the transition functions on the "common" parts of the two circuits are equivalent. For the new latches, the transition functions are always satisfiable (as they are functions), and thus we need no constraints on them. As a second condition we require that if the safety property P′ holds in the extended circuit, then the property P holds in the original circuit. The last condition to check is that all the new latches of the extended circuit can be initialised with some values whenever the original circuit can be initialised, using the same values for the common latches. In other words, for every initialisation of the original circuit there is at least one initialisation of the extended circuit that agrees on the common latches.

Under these conditions, Theorem 1 in Sect. 3 shows that if the extended circuit combinationally simulates the original one and the extended circuit is safe, then the original circuit is safe as well.

With some abuse of notation, we use <sup>∃</sup>L in a Quantified Boolean Formula (QBF) to denote existential quantification over variables in L. As usual, free variables are (implicitly) assumed to be quantified universally.

**Definition 7 (Combinational simulation).** *Given circuits* C = (I, L, R, F, P) *and* C′ = (I′, L′, R′, F′, P′) *where* C′ *combinationally extends* C*, we say that* C′ combinationally simulates C *if the following holds:*

*1.* $f_l(I, L) \equiv f'_l(I, L')$ *for* $l \in L$*, "transition"*
*2.* $P'(I, L') \Rightarrow P(I, L)$*, and "property"*
*3.* $R(L) \Rightarrow \exists (L' \setminus L)\, R'(L')$*. "reset"*

Later, when verifying the combinational simulation relation between two circuits, we refer to Definition 7.1 as the *transition check*, Definition 7.2 as the *property check*, and Definition 7.3 as the *reset check*.
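For intuition, the three checks can be prototyped by brute force on toy circuits. In the sketch below, circuits are tuples (I, L, R, F, P) with R and F mapping latch names to Python predicates over an assignment dict; the concrete one-latch circuit and its extension with a "history" latch are illustrative assumptions, not taken from the paper:

```python
from itertools import product

def assignments(variables):
    for bits in product((False, True), repeat=len(variables)):
        yield dict(zip(variables, bits))

def reset_holds(R, a):
    return all(a[l] == bool(r(a)) for l, r in R.items())

def transition_check(C, Cx):  # f_l(I, L) ≡ f'_l(I, L') for all l in L
    (I, L, _, F, _), (_, Lx, _, Fx, _) = C, Cx
    return all(bool(F[l](a)) == bool(Fx[l](a))
               for a in assignments(I + Lx) for l in L)

def property_check(C, Cx):    # P'(I, L') ⇒ P(I, L)
    (I, _, _, _, P), (_, Lx, _, _, Px) = C, Cx
    return all(P(a) for a in assignments(I + Lx) if Px(a))

def reset_check(C, Cx):       # R(L) ⇒ ∃(L' \ L) R'(L')
    (_, L, R, _, _), (_, Lx, Rx, _, _) = C, Cx
    new = [v for v in Lx if v not in L]
    return all(any(reset_holds(Rx, {**a, **b}) for b in assignments(new))
               for a in assignments(L) if reset_holds(R, a))

# Original: latch "l" driven by input "i". Extension: adds a history latch
# "h" remembering the previous value of "l", with a strengthened property.
C = (["i"], ["l"], {"l": lambda a: False},
     {"l": lambda a: a["i"]}, lambda a: not a["l"])
Cx = (["i"], ["l", "h"], {"l": lambda a: False, "h": lambda a: False},
      {"l": lambda a: a["i"], "h": lambda a: a["l"]},
      lambda a: not a["l"] and not a["h"])

print(transition_check(C, Cx), property_check(C, Cx), reset_check(C, Cx))  # → True True True
```

In practice the same three checks are discharged by a SAT solver (checks 1 and 2) and a QBF solver with one quantifier alternation (check 3), rather than by enumeration.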

# **3 Model Checking**

In this section, we consider model checking via k-induction. The model checking problem for safety properties concerns determining whether, given a circuit with a property P, it is the case that P holds in all reachable states, *i.e.,* the initialised unrolling of a circuit of any arbitrary length is safe.

**Definition 8 (Safe circuit).** *Let* $U_m$ *be the unrolling of circuit* C*. Then* C *is* safe *iff* $U_m \wedge R(L_0) \Rightarrow \bigwedge_{i \in [0,m]} P(I_i, L_i)$ *holds for all* m ∈ N*.*

Based on the above definition, we say the property P "holds" in C if the circuit is safe with respect to P.

**Theorem 1.** *Assume that the circuit* C′ *combinationally simulates the circuit* C*. If* C′ *is safe, then* C *is safe.*

*Proof.* We do a proof by contradiction. Let m ∈ N be a bound for which the claim does not hold. Thus the unrolling of length m of C′ is safe w.r.t. P′, and therefore $U'_m \wedge R'(L'_0) \Rightarrow \bigwedge_{i \in [0,m]} P'(I'_i, L'_i)$ holds. To obtain the contradiction we assume there is a satisfying assignment s of $U_m \wedge R(L_0) \wedge \neg \bigwedge_{i \in [0,m]} P(I_i, L_i)$, which would make C unsafe. Thus $R(L_0)$ needs to be satisfiable. Now the reset check of Definition 7.3 implies that $R'(L'_0) \wedge R(L_0)$ is guaranteed to be satisfiable, with $L_0$ being a subset of $L'_0$. Moreover, by Definition 7.1, the unrolling $U'_m$ of C′ is also satisfiable, with the transition function F applied to the projected ("common") component of both circuits. For the new latches, the fact that we use transition functions for them means they are also satisfiable (transition functions guarantee that there is always a successor state). Therefore the initialised unrolling $R'(L'_0) \wedge U'_m$ is satisfiable. Furthermore, by our assumption, $\bigwedge_{i \in [0,m]} P'(I'_i, L'_i)$ holds. By Definition 7.1 and Definition 7.3, the projected latches of C′ agree with $L_i$ for all i ∈ [0, m], and thus by Definition 7.2 we have that $\bigwedge_{i \in [0,m]} P(I_i, L_i)$ holds, contradicting the assumption on s. □

As usual, we call a formula φ an *inductive invariant* of a circuit C if φ satisfies the following conditions: (1) $R(L) \Rightarrow \varphi(I, L)$, (2) $\varphi(I, L) \Rightarrow P(I, L)$, and (3) $U_1 \wedge \varphi(I_0, L_0) \Rightarrow \varphi(I_1, L_1)$. As a generalisation, k-induction looks at k steps of evolution rather than one, by assuming in the induction step that the property holds at k consecutive timestamps.
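The three invariant conditions can be checked exhaustively on a toy circuit. In this sketch, both the one-latch circuit and the candidate invariants are illustrative assumptions:

```python
from itertools import product

def assigns(vs):
    return [dict(zip(vs, b)) for b in product((False, True), repeat=len(vs))]

I, L = ["i"], ["l"]
reset = {"l": lambda a: False}               # latch l resets to 0
trans = {"l": lambda a: a["l"] and a["i"]}   # l can never become 1 once it is 0
P = lambda a: not a["l"]                     # safety property: l is never 1

def is_inductive_invariant(phi):
    # (1) R(L) ⇒ φ(I, L): every reset state satisfies φ
    c1 = all(phi(a) for a in assigns(I + L)
             if all(a[l] == bool(r(a)) for l, r in reset.items()))
    # (2) φ(I, L) ⇒ P(I, L)
    c2 = all(P(a) for a in assigns(I + L) if phi(a))
    # (3) U_1 ∧ φ(I_0, L_0) ⇒ φ(I_1, L_1): one step preserves φ
    c3 = all(phi({**ai, **{l: bool(f(a0)) for l, f in trans.items()}})
             for a0 in assigns(I + L) if phi(a0)
             for ai in assigns(I))
    return c1 and c2 and c3

print(is_inductive_invariant(P))               # → True
print(is_inductive_invariant(lambda a: True))  # → False (condition 2 fails)
```

Here the property itself happens to be inductive, i.e., this toy circuit is 1-inductive in the sense of Definition 9 below.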

**Definition 9 (**k**-inductive).** *Given a circuit* C *with a property* P*, define the formula* $S_k = \bigwedge_{i \in [0,k)} P(I_i, L_i)$*. Then* P *is called* k-inductive *in* C *if and only if the following two conditions hold:*

*1.* $U_{k-1} \wedge R(L_0) \Rightarrow S_k$*, and "initiation"*
*2.* $U_k \wedge S_k \Rightarrow P(I_k, L_k)$*. "consecution"*

The first condition, Definition 9.1, is called the *initiation check*, also the *bounded model checking check* or simply BMC check, on the initialised unrolling of length k − 1, whereas the second condition, Definition 9.2, is referred to as the *consecution check* for the unrolling of length k. Note that P being 1-inductive coincides with P being an inductive invariant, i.e., the case $\varphi(I, L) \equiv P(I, L)$.


**Fig. 1.** The SMV code for the *Counter* example.

*Example 1.* We consider a simple example of an N-bit counter that counts up to a *modulo* bound m and then wraps around to zero. There is also a *reset* input signal: when it is set to 1, the counter is forced back to zero. The safety property states that the counter value never reaches b.

**Fig. 2.** The transition diagram of the *Counter* example. The initial state is "000" (colored yellow). In the (gray) "bad" state "110" the property does not hold. (Color figure online)

Here the exact modulo check makes the model checking problem k-inductive with k = b − m + 1. More precisely, for N = 3, the formal description of a 3-bit counter is given in the SMV language in Fig. 1, with m = 5 and b = 6. (Note that our example can easily be extended to integers too.) The state diagram of this system is shown in Fig. 2. The input values are specified within the transition relations. This model is 2-inductive.
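The two checks of Definition 9 can be verified by brute force on this example. Since Fig. 1's exact SMV encoding is not reproduced here, the transition function below is a plausible reconstruction rather than the paper's code:

```python
N, m, b = 3, 5, 6

def step(c, reset):
    # count up towards the modulo bound m, wrapping to 0 there or on reset
    return 0 if (reset or c == m - 1) else (c + 1) % (2 ** N)

def P(c):
    return c != b  # safety: the bad value b (state "110") is never reached

def k_inductive(k):
    # initiation: every initialised execution of length k - 1 stays safe
    paths = [[0]]
    for _ in range(k - 1):
        paths = [p + [step(p[-1], r)] for p in paths for r in (False, True)]
    if not all(P(s) for p in paths for s in p):
        return False
    # consecution: k consecutive safe states force safety one step later
    prefixes = [[c] for c in range(2 ** N)]
    for _ in range(k):
        prefixes = [p + [step(p[-1], r)] for p in prefixes for r in (False, True)]
    return all(P(p[-1]) for p in prefixes if all(P(s) for s in p[:-1]))

print(k_inductive(1), k_inductive(2))  # → False True
```

Under this encoding, the unreachable state 5 ("101") steps to the bad state 6, so P is not 1-inductive; but 5 has no predecessor at all, which is exactly why two safe consecutive states suffice and P is 2-inductive.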

# **4 Certification**

In our suggested approach, certifying model checking results amounts to finding and checking an inductive invariant that implies the original specification, in our case the safety property P. To tackle the problem of certifying k-induction-based model checking for any given circuit, in this section we reduce the problem to generating a simple inductive invariant from a *k-witness circuit*, which combinationally simulates the original circuit.

We start by defining the formalism of a k-witness circuit. The main idea is to record the previous k − 1 states and inputs of the circuit observed during execution, "flattening" the k-induction procedure back to normal induction on a larger circuit. As a result, the size of the circuit increases by a factor of k, where k is the constant used in the k-induction scheme. The k-witness circuit has k local components of inputs and latches, each of which can be seen as representing a state of the original circuit. Whenever a new state is saved, the oldest one is discarded.
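The save-and-discard behaviour is essentially a shift register over (input, state) pairs. A minimal sketch of one step of this idea, where the toggle transition used as the original circuit is an illustrative assumption:

```python
# One step of the shift-register view of a k-witness circuit: compute a
# fresh step of the original circuit on the youngest stored state, append
# it, and discard the oldest stored copy.
def witness_step(step, xs, ls, new_input):
    # xs: the k-1 stored inputs; ls: the k stored states (youngest last)
    newest = step(new_input, ls[-1])
    return xs[1:] + [new_input], ls[1:] + [newest]

step = lambda i, s: s != i                       # original circuit: a toggle bit
xs, ls = [False, False], [False, False, False]   # k = 3, all copies start at 0
xs, ls = witness_step(step, xs, ls, True)
print(xs, ls)  # → [False, True] [False, False, True]
```

The actual Definition 10 below additionally carries k initialisation bits and encodes this shifting in the latches' reset and transition functions.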

One of the key technical challenges is the proper initialisation of the k-witness circuit. We use k additional initialisation bits to indicate which components of the circuit have been initialised; this helps establish the combinational simulation relation later. We say a component is initialised if its initialisation bit is ⊤. At initialisation, the k-witness circuit can be either *fully* or *partially* initialised. Figure 3 displays three ways of initialising the components. In the case of full initialisation, the circuit pre-computes k steps of the original circuit as the initial state of the k-witness circuit; intuitively, the initial state of the k-witness circuit then encodes the states reachable in the k-step initialised BMC unrolling of the original circuit. In partial initialisation scenarios, the circuit instead pre-computes an initialised BMC unrolling for fewer steps, and some components are left uninitialised. In the final case, where there are no pre-computed steps, the circuit simply runs from an original initial state, leaving all other components fully uninitialised.

In the definitions below, we use the **superscript** i in $L^i$ to denote a copy of the latches L in the **spatial direction**: we introduce a set of new latch variables for every $L^i$, where $l^i \in L^i$ is the corresponding copy of $l \in L$, and similarly for inputs. We refer to $l^i$ as a latch in $L^i$, where i is the index of the latch set $L^i$. The formal definition of a k-witness circuit is given below. We continue to use **subscripts** for the **temporal direction**.

**Definition 10 (**k**-witness circuit).** *Given a circuit* C = (I, L, R, F, P) *and* $k \in \mathbb{N}^+$*, the* k*-witness circuit* C′ = (I′, L′, R′, F′, P′) *of* C *is defined as follows:*

*1.* $I' = I$*. For simplicity we also refer to* $I'$ *as* $X^{k-1}$*.*
*2.* $L' = X^0 \cup \dots \cup X^{k-2} \cup L^0 \cup \dots \cup L^{k-1} \cup B$*, such that:*
   *(a)* $X^i$ *is a copy of the original inputs, for all* $i \in [0, k-2]$*.*
   *(b)* $L^i$ *is a copy of the original latches, for all* $i \in [0, k-1]$*.*
   *(c)* $B = \{b^0, \dots, b^{k-1}\}$ *is the set of initialisation bits.*
*3. The reset functions* $R' = \{r'_l(L') \mid l \in L'\}$ *are defined as follows:*
   *(a) For* $x \in X^0 \cup \dots \cup X^{k-2}$*,* $r'_x = x$*.*
   *(b) For* $i \in [1, k-1)$*,* $u^i = R(L^i) \vee u^{i+1}$*, and* $u^{k-1} = R(L^{k-1})$*.*
   *(c) For* $l \in L^0$*,* $r'_l = \mathit{ite}(u^1, l, r_l(L^0))$*.*
   *(d) For* $i \in [1, k)$*,* $r'_{l^i} = \mathit{ite}(u^i, l^i, f_l(X^{i-1}, L^{i-1}))$*.*
   *(e)* $r'_{b^{k-1}} = \top$*.*
   *(f)* $r'_{b^0} = \neg u^1$*.*
   *(g) For* $i \in [1, k-1)$*,* $r'_{b^i} = b^{i-1} \vee (R(L^i) \wedge \neg u^{i+1})$*.*
*4.* $F' = \{f'_l(I', L') \mid l \in L'\}$ *is defined as follows:*
   *(a) For* $i \in [0, k-1)$*,* $f'_{x^i}(I', L') = x^{i+1}$*.*
   *(b) For* $l \in L^{k-1}$*,* $f'_l(I', L') = f_l(X^{k-1}, L^{k-1})$*.*
   *(c) For* $i \in [0, k-1)$*,* $f'_{l^i}(I', L') = l^{i+1}$*.*
   *(d) For* $i \in [0, k-1)$*,* $f'_{b^i}(I', L') = b^{i+1}$*, and* $f'_{b^{k-1}}(I', L') = b^{k-1}$*.*
*5. The property* $P'$ *is defined as* $P'(I', L') = \bigwedge_{i \in [0,4]} p^i(I', L')$ *such that:*
   *(a) For* $i \in [0, k-1)$*,* $h^i = (L^{i+1} \leftrightarrow F(X^i, L^i))$*.*
   *(b)* $p^0(I', L') = \bigwedge_{i \in [0,k-1)} (b^i \rightarrow b^{i+1})$*.*
   *(c)* $p^1(I', L') = \bigwedge_{i \in [0,k-1)} (b^i \rightarrow h^i)$*.*
   *(d)* $p^2(I', L') = \bigwedge_{i \in [0,k)} (b^i \rightarrow P(X^i, L^i))$*.*
   *(e)* $p^3(I', L') = \bigwedge_{i \in [1,k)} ((\neg b^{i-1} \wedge b^i) \rightarrow R(L^i))$*.*
   *(f)* $p^4(I', L') = b^{k-1}$*.*

In Definition 10 we list the five parts of the k-witness circuit. For clarity, we explain each part in more detail in the following text.


Figure 4 illustrates a comparison of the variable structures of the original circuit and its k-witness circuit (which also suggests their combinational extension relation). The areas marked yellow (the left box and the top right box on the right) consist of the same set of variables. We consider each pair $(X^i, L^i)$ as a *component* of the circuit and refer to $(X^{k-1}, L^{k-1})$ as the most recent component (youngest copy) and $(X^0, L^0)$ as the oldest component (copy). Additionally, we also refer to the inputs I′ as $X^{k-1}$ for convenience.

The property P′ comprises five sub-properties. The *monotonicity* property $p^0$ expresses the monotonic nature of the initialisation bits: intuitively, if a component is initialised, all components younger than it should also be initialised. The *transition* property $p^1$ expresses that every initialised component has to follow the transition relation of the original circuit. Of particular interest is the k-*safety* property $p^2$, which says that the original property P needs to be satisfied in every initialised component. The *reset* property $p^3$ expresses that, in the case of partial initialisation, the oldest initialised component needs to satisfy the original reset function. Finally, $p^4$ expresses that at least the youngest component must have its initialisation bit set.

We now show the combinational simulation relation between the original circuit and its k-witness circuit.

**Theorem 2.** *The circuit* C *is combinationally simulated by its* k*-witness circuit.*

*Proof.* By the construction in Definition 10, the inputs stay the same in the k-witness circuit C′, and the new latches are a superset of the original ones (the youngest component in C′). Thus, by Definition 6, C′ combinationally extends C. Based on Definition 10.4, the transition function of $L^{k-1}$ is identical to the original one, which satisfies Definition 7.1. In the new property, $p^4$ and $p^2$ together

**Fig. 3.** The diagram shows three possible initial states of C′. Here (1) illustrates 1-initialisation, (2) i-initialisation, and (3) full initialisation. The grey areas are the uninitialised components (the "don't care"s).

**Fig. 4.** The structure of input and latch variables in C and C′. (Color figure online)

imply $P(X^{k-1}, L^{k-1})$. In other words, the original property holds in the most recent component, which satisfies Definition 7.2. By Definition 10, every satisfying assignment of R(L) also satisfies R′(L′) on the common latches (the youngest component). For the new latches we observe the following. Because the reset of the newest component is satisfiable with the same assignment as in the original circuit, $u^{k-1}$ is true in the k-witness circuit, and therefore all other $u^i$ are also true. Hence all the ite-expressions of the reset definition become trivially satisfiable. To complete the argument, by Definition 10.3, all the initialisation bits can now be set to $\bot$ except $b^{k-1}$, which can be set to $\top$. A satisfying assignment of R′(L′) can thus be directly constructed (deterministically, in polynomial time) from any satisfying assignment of R(L). This implies that the reset condition of Definition 7.3 holds. (Side note: this suggests that the QBF check needed for the combinational simulation relation could be solved easily in practice for these k-witness circuits.) Therefore C′ combinationally simulates C. □

In the following, we present the main result of this paper on the relationship between a circuit C and its k-witness circuit C′ in terms of k-induction.

**Theorem 3.** *Given a circuit* C*, a fixed* $k \in \mathbb{N}^+$*, and its* k*-witness circuit* C′*,* P *is* k*-inductive in* C *iff* P′ *is 1-inductive in* C′*.*

*Proof.* We consider the two k-induction checks of Definition 9 for both directions. In Theorem 4 we show that the BMC check (of the initialised unrolling of length k − 1) in C passes if and only if the same check (of the initialised unrolling of length 0) in C′ passes. In Theorem 5 we prove that if the consecution check of C′ passes, then the consecution check also passes in C. Lastly, Theorem 6 shows that if P is k-inductive in C, then the consecution check of P′ on the unrolling of length 1 passes in C′. Combining these results, we conclude that P is k-inductive in C iff P′ is 1-inductive in C′. □

For the BMC check in the two circuits, we need to analyse three separate cases, as shown in Fig. 3, which correspond to Lemmas 2, 3, and 4, respectively. Before this we need a technical Lemma 1 on the initialisation bits. In the following, we consider a given circuit C and its k-witness circuit C′ with a fixed k.

**Lemma 1.** *For the initialised unrolling of length* 0 *of the* k*-witness circuit* C′*, the reset values of the initialisation bits* $B_0$ *are deterministic and depend only on the component with the largest index* $i \in [0, k)$ *for which* $R(L^i_0)$ *is satisfied.*

*Proof.* Firstly, we define $S = \{i \mid R(L^i_0)\}$, based on which we consider two cases. (1) By Definition 10.3(c), if $\neg u^1_0$, then $0 \in S$. In this case, $b^0_0 = \top$ by Definition 10.3(f), and by Definition 10.3(e)(g), $b^1_0, \dots, b^{k-1}_0$ are all set to $\top$. (2) Otherwise we consider $u^1_0$, where S contains at least some $i \in [1, k)$. Let m be the maximum index in S, with $m \neq 0$. Since $R(L^m_0)$ holds, $u^m_0$ is satisfied, and so are $u^{m-1}_0, \dots, u^1_0$, while $u^{m+1}_0, \dots, u^{k-1}_0$ are not. In Definition 10.3(g), for all $i \in S$, $R(L^i_0) \wedge \neg u^{i+1}$ is only satisfied when $i = m$, thus $b^m_0 = \top$. Therefore $b^i_0 = \top$ for all $i \in [m+1, k)$. By Definition 10.3(f), $b^0_0 = \bot$, and therefore $b^i_0 = \bot$ for all $i \in [1, m)$. □

Initialisation bits are indicators of the initialisation status of the k-witness circuit. We observe that the sub-properties $p^0, \dots, p^3$ of the k-witness circuit trivially hold for uninitialised components (*i.e.,* those whose initialisation bit is 0), while $p^4$ depends solely on $b^{k-1}$.
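Lemma 1 can be summarised operationally. The sketch below computes the reset values of $B_0$ from the set S of component indices whose original reset is satisfied; it is a paraphrase of Definition 10.3, not the paper's code:

```python
# Only the components from the largest index in S = {i | R(L^i_0)} upwards
# start initialised; if the oldest component alone is resettable, the
# witness circuit starts fully initialised.
def reset_init_bits(k, S):
    m = max(S)  # Definition 10.3 rules out S being empty
    return [True] * k if m == 0 else [i >= m for i in range(k)]

print(reset_init_bits(4, {0}))     # full initialisation → all bits set
print(reset_init_bits(4, {0, 2}))  # partial (i-initialisation with m = 2)
print(reset_init_bits(4, {3}))     # 1-initialisation: only the youngest bit
```

These three outcomes correspond exactly to cases (3), (2), and (1) of Fig. 3.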

**Lemma 2.** *If the initialised unrolling of length* k − 1 *of the original circuit* C *is safe, then the initialised unrolling of length* 0 *of the* k*-witness circuit* C′ *is also safe, in the case of* 1*-initialisation.*

*Proof.* Assume $U_{k-1} \wedge R(L_0) \Rightarrow \bigwedge_{i \in [0,k)} P(I_i, L_i)$, i.e., the initialised unrolling of C is safe. In the case of 1-initialisation, we consider $R'(L'_0) \wedge R(L^{k-1}_0)$ as the initialised unrolling of C′, as $U'_0$ is trivial. By Lemma 1 and Definition 10.3, among the initialisation bits only $b^{k-1}_0$ is set to $\top$ and the rest remain $\bot$. The values of $B_0$ then satisfy $p^0(I'_0, L'_0)$, $p^1(I'_0, L'_0)$, and $p^4(I'_0, L'_0)$ trivially. Every satisfying assignment of $R'(L'_0) \wedge R(L^{k-1}_0)$ satisfies $R(L_0)$ with $L_0 = L^{k-1}_0$, $I_0 = X^{k-1}_0$. Similar to our argument in Theorem 1, $U_{k-1} \wedge R(L_0)$ is then also satisfiable. By our assumption, $P(X^{k-1}_0, L^{k-1}_0)$ is thus satisfied. The premise of $p^2(I'_0, L'_0)$ is only satisfied for $b^{k-1}_0$, and with the same assignment satisfying $P(X^{k-1}_0, L^{k-1}_0)$, $p^2(I'_0, L'_0)$ is also satisfied. Lastly, the premise of $p^3(I'_0, L'_0)$ is only satisfied for $\neg b^{k-2}_0 \wedge b^{k-1}_0$, and since $R(L^{k-1}_0)$ holds, $p^3(I'_0, L'_0)$ is satisfied. Therefore we have $P'(I'_0, L'_0)$. □

**Lemma 3.** *If the initialised unrolling of length* k − 1 *of the original circuit* C *is safe, then the initialised unrolling of length* 0 *of the* k*-witness circuit* C′ *is also safe, in the case of* i*-initialisation.*

*Proof.* Firstly, we assume $U_{k-1} \wedge R(L_0) \Rightarrow \bigwedge_{i \in [0,k)} P(I_i, L_i)$. In the case of i-initialisation, we consider $R'(L'_0) \wedge R(L^m_0) \wedge \neg u^{m+1}_0$ as the initialised unrolling of C′, where $m \in [1, k-1)$ is the largest index for which $R(L^m_0)$ is satisfied. As we showed in Lemma 1, $b^m_0, \dots, b^{k-1}_0$ are set to $\top$ while $b^0_0, \dots, b^{m-1}_0$ are $\bot$. Following Definition 10.3, $L^i_0 \leftrightarrow F(X^{i-1}_0, L^{i-1}_0)$ for all $i \in (m, k)$, while all components older than m are uninitialised. Every satisfying assignment of $R'(L'_0) \wedge R(L^m_0) \wedge \neg u^{m+1}_0$ also satisfies $\bigwedge_{i \in [0, k-m-1)} (L_{i+1} \leftrightarrow F(I_i, L_i)) \wedge R(L_0)$ with $I_{i-m} = X^i_0$, $L_{i-m} = L^i_0$ for all $i \in [m, k)$. In the rest of the proof, we fix an assignment satisfying $R'(L'_0) \wedge R(L^m_0) \wedge \neg u^{m+1}_0$. Similar to our argument in Theorem 1, $U_{k-1} \wedge R(L_0)$ is satisfiable with our fixed assignment. By our assumption, $\bigwedge_{i \in [m,k)} P(X^i_0, L^i_0)$ is then satisfied. We now consider $P'(I'_0, L'_0)$. As the premise of $p^2(I'_0, L'_0)$ is only satisfied for $b^m_0, \dots, b^{k-1}_0$, $p^2(I'_0, L'_0)$ is satisfied. Similarly for the transition property: with $L^i_0 \leftrightarrow F(X^{i-1}_0, L^{i-1}_0)$ for all $i \in (m, k)$, $p^1(I'_0, L'_0)$ is satisfied. Given the values of $B_0$, the monotonicity property is satisfied. In addition, $p^4(I'_0, L'_0)$ is also satisfied, as $b^{k-1}_0 = \top$. Finally, the premise of $p^3(I'_0, L'_0)$ is only satisfied for $\neg b^{m-1}_0 \wedge b^m_0$, and as we already have $R(L^m_0)$, $p^3$ is satisfied. □

**Lemma 4.** *If the initialised unrolling of length* k − 1 *of the original circuit* C *is safe, then the initialised unrolling of length* 0 *of the* k*-witness circuit* C′ *is also safe, in the case of full initialisation.*

*Proof.* We assume $U_{k-1} \wedge R(L_0) \Rightarrow \bigwedge_{i \in [0,k)} P(I_i, L_i)$ for the original circuit. Since we consider full initialisation, $R'(L'_0) \wedge R(L^0_0) \wedge \neg u^1_0$ is the initialised unrolling of C′. Following Definition 10.3, $L^i_0 \leftrightarrow F(X^{i-1}_0, L^{i-1}_0)$ for all $i \in [1, k)$. Every satisfying assignment of $R'(L'_0) \wedge R(L^0_0) \wedge \neg u^1_0$ satisfies $U_{k-1} \wedge R(L_0)$ with $I_i = X^i_0$, $L_i = L^i_0$ for all $i \in [0, k)$. The rest of the proof follows the same logic as in Lemma 3. □

**Lemma 5.** *If the BMC check for the unrolling of length* k − 1 *of the original circuit* C *passes, then the BMC check for the unrolling of length* 0 *of the* k*-witness circuit* C′ *also passes.*

*Proof.* Based on Definition 10.3, we consider the BMC check for all possible initial states. Lemmas 2, 3, and 4 cover the case split over all initial states of C′, based on whether each component satisfies the original reset function $R(L^i_0)$ or not. We showed that the BMC check of C′ passes under the same assumption in each of the three initialisation cases. In particular, our construction in Definition 10.3 does not allow all components to be uninitialised: in that case $R'(L'_0)$ becomes unsatisfiable (more specifically, $R(L^0_0)$ is unsatisfiable). We conclude that the BMC check of the initialised unrolling of length 0 passes in C′. □

We proceed to prove the opposite direction of the BMC check for C and C′ by considering the reset status in the k-witness circuit.

**Lemma 6.** *If the BMC check for the unrolling of length 0 of the* k*-witness circuit* C′ *passes, then the BMC check for the unrolling of length* k − 1 *of the original circuit* C *also passes.*

*Proof.* We assume the BMC check passes in the k-witness circuit, i.e., R′(L′_0) ⇒ P′(I′_0, L′_0). We proceed by contradiction, assuming the BMC check of length k − 1 fails for the original circuit. Thus there exists a satisfying assignment s of U_{k−1} ∧ R(L_0) ∧ ¬⋀_{i∈[0,k)} P(I_i, L_i). We can construct a satisfying assignment of R′(L′_0) as follows. Let a ∈ [0, k) be some index for which ¬P(I_a, L_a) is satisfied. Let m ∈ [0, a] be the index for which R(L_m) ∧ ⋀_{i∈(m,a]} ¬R(L_i) is satisfied. Let X^{k−1−i}_0 = I_{a−i}, L^{k−1−i}_0 = L_{a−i}, and b^{k−1−i}_0 = ⊤ for all i ∈ [0, a − m]. The remaining initialisation bits of B_0 are set to ⊥. By Definition 2, we have L_{i+1} ≡ F(I_i, L_i) for all i ∈ [m, a), which satisfies Definition 10.3(d). As our construction satisfies R′(L′_0), by our assumption, P′(I′_0, L′_0) is satisfied. By Theorem 2, P(I_a, L_a) is satisfied. Since we assume s satisfies ¬P(I_a, L_a), we have reached a contradiction. □

As an immediate consequence of Lemmas 5 and 6, the BMC check of C passes iff the same check passes in C′. We record this result in the following theorem.

**Theorem 4.** *The BMC check for the unrolling of length 0 of the* k*-witness circuit* C′ *passes if and only if the BMC check for the unrolling of length* k − 1 *of the original circuit* C *passes.*

**Fig. 5.** The diagram shows the consecution check in C and C′.

We show in Fig. 5 an illustration of the consecution check in both circuits.
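For reference, the checks discussed in this section can be written out as follows. The notation is reconstructed from the surrounding proofs: U_j denotes an unrolling of length j, and primed symbols refer to the k-witness circuit C′.

```latex
% BMC (base) check of length k-1 for the original circuit C:
U_{k-1} \land R(L_0) \;\Rightarrow\; \bigwedge_{i \in [0,k)} P(I_i, L_i)
% Consecution (step) check of length k for C:
U_k \land \bigwedge_{i \in [0,k)} P(I_i, L_i) \;\Rightarrow\; P(I_k, L_k)
% The corresponding checks for the k-witness circuit C' have length 0 and 1:
R'(L'_0) \Rightarrow P'(I'_0, L'_0)
\qquad
U' \land P'(I'_0, L'_0) \Rightarrow P'(I'_1, L'_1)
```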

**Theorem 5.** *If the consecution check for the unrolling of length 1 of the* k*-witness circuit* C′ *passes, then the consecution check for the unrolling of length* k *of the original circuit* C *also passes.*

*Proof.* We assume U′ ∧ P′(I′_0, L′_0) ⇒ P′(I′_1, L′_1) holds. We then proceed by contradiction, assuming that the consecution check for the original circuit fails. Thus there is a satisfying assignment s of the formula U_k ∧ ⋀_{i∈[0,k)} P(I_i, L_i) ∧ ¬P(I_k, L_k). Based on s, we obtain a satisfying assignment for U′ ∧ P′(I′_0, L′_0) as follows. Let X^i_0 = I_i, L^i_0 = L_i, and b^i_0 = ⊤ for all i ∈ [0, k). Let X^{i−1}_1 = I_i, L^{i−1}_1 = L_i, and b^{i−1}_1 = ⊤ for all i ∈ [1, k]. We now show that this satisfies L′_1 ≡ F′(I′_0, L′_0). Since X^{i−1}_1 = I_i = X^i_0 and L^{i−1}_1 = L_i = L^i_0 for all i ∈ [1, k), Definitions 10.4(a) and 10.4(c) are satisfied. Since s satisfies U_k, by Definition 2, it satisfies L_k ≡ F(I_{k−1}, L_{k−1}). With X^{k−1}_0 = I_{k−1}, L^{k−1}_0 = L_{k−1}, and L^{k−1}_1 = L_k, we have L^{k−1}_1 ≡ F(X^{k−1}_0, L^{k−1}_0), and thus Definition 10.4(b). As for the initialisation bits, since all of them are set to ⊤ in both B_0 and B_1, Definition 10.4(d) is satisfied. As a result, U′ is satisfied, and we continue by showing that the same assignment satisfies P′(I′_0, L′_0). Similar to our proof in Lemma 3, the values of B_0 immediately satisfy p_0(I′_0, L′_0) and p_4(I′_0, L′_0). As the premiss of p_3(I′_0, L′_0) is unsatisfiable, p_3(I′_0, L′_0) trivially holds.
Since U_k is satisfied, by Definition 2, we have L_{i+1} ≡ F(I_i, L_i), which satisfies h_i for all i ∈ [0, k−1), and thus also p_1(I′_0, L′_0). Lastly, since P(I_i, L_i) is satisfied for all i ∈ [0, k), the original property is satisfied in every component P(X^i_0, L^i_0), resulting in the satisfaction of p_2(I′_0, L′_0). By our initial assumption, P′(I′_1, L′_1) is then satisfied. By Theorem 2, we have P(X^{k−1}_1, L^{k−1}_1), and thus P(I_k, L_k). We have reached a contradiction, and can therefore conclude that the consecution check of the original circuit passes. □

**Lemma 7.** *If the safety property* P *is* k*-inductive in the original circuit* C*, then the consecution check of the unrolling of length 1 passes in the* k*-witness circuit* C′*, given that* L′_0 *is partially initialised.*

*Proof.* Assume P is k-inductive in C. Let U′ be the unrolling of C′, and let m ∈ [1, k) be some index such that b^0_0, ..., b^{m−1}_0 are set to ⊥, while b^m_0, ..., b^{k−1}_0 are set to ⊤ (as we consider partial initialisation here). We proceed by contradiction and assume there is a satisfying assignment s of the negation of the consecution check formula, U′ ∧ P′(I′_0, L′_0) ∧ ¬P′(I′_1, L′_1). Since we assume P′(I′_0, L′_0), it implies R(L^m_0), based on p_3(I′_0, L′_0). We also have L^{i+1}_0 ≡ F(X^i_0, L^i_0) for i ∈ [m, k − 1), based on p_1(I′_0, L′_0). Furthermore, U′ implies L′_1 ≡ F′(I′_0, L′_0), and by Definition 10.4, L^{k−1}_1 ≡ F(X^{k−1}_0, L^{k−1}_0). Therefore the same assignment satisfies U_{k−1} ∧ R(L_0), where I_{i−m} = X^i_0 and L_{i−m} = L^i_0 for all i ∈ [m, k), and I_{k−m} = I′_1, L_{k−m} = L^{k−1}_1. By our assumption that the BMC check passes in C, we have P(X^i_0, L^i_0) for all i ∈ [m, k) and P(I′_1, L^{k−1}_1).

We can then proceed to prove that P′(I′_1, L′_1) is indeed satisfied. Similar to our proof of Theorem 5, based on Definition 10.4, b^i_1 = ⊤ for all i ∈ [m, k), while b^i_1 = ⊥ for all i ∈ [0, m). Additionally, X^i_1 = X^{i+1}_0 and L^i_1 = L^{i+1}_0 for i ∈ [0, m − 1). The rest of the proof follows the same logic as Theorem 5 for showing that P′(I′_1, L′_1) is satisfied. We thus reach a contradiction, and conclude that the consecution check for C′ passes in this case. □

**Lemma 8.** *If the consecution check for the unrolling of length* k *passes in the original circuit* C*, then the consecution check for the unrolling of length 1 passes in the* k*-witness circuit* C′*, given that* L′_0 *is fully initialised.*

*Proof.* Let U′_1 be the unrolling of C′ with b^0_0, ..., b^{k−1}_0 all set to ⊤. Similar to Lemma 7, we proceed by contradiction and assume there is a satisfying assignment s of U′_1 ∧ P′(I′_0, L′_0) ∧ ¬P′(I′_1, L′_1). By the transition property p_1(I′_0, L′_0), the components follow the transition function F, such that L^{i+1}_0 ≡ F(X^i_0, L^i_0) for all i ∈ [0, k − 1). Similar to our argument in Lemma 7, U′_1 implies L^{k−1}_1 ≡ F(I′_0, L^{k−1}_0). We also have ⋀_{i∈[0,k)} P(X^i_0, L^i_0), based on p_2(I′_0, L′_0) and the values of B_0. The same assignment thus satisfies U_k ∧ ⋀_{i∈[0,k)} P(I_i, L_i), where L_i = L^i_0 and I_i = X^i_0 for all i ∈ [0, k), and I_k = I′_1, L_k = L^{k−1}_1. Based on our assumption that the consecution check of C passes, we have P(I′_1, L^{k−1}_1). Following the same reasoning as in Lemma 7, after one transition, b^i_1 = ⊤ for all i ∈ [0, k), and X^i_1 = X^{i+1}_0, L^i_1 = L^{i+1}_0 for i ∈ [0, k − 1).

We can now show that P′(I′_1, L′_1) is satisfied. The k-safety property p_2(I′_1, L′_1) is satisfied, as we have proved P(X^i_1, L^i_1) for all i ∈ [0, k). The transition property p_1 is preserved, as U_k is satisfied, which implies L^{i+1}_1 ≡ F(X^i_1, L^i_1). Based on the values of B_1, the properties p_0(I′_1, L′_1), p_3(I′_1, L′_1), and p_4(I′_1, L′_1) are satisfied immediately. We conclude that P′(I′_1, L′_1) is satisfied, and thus reach a contradiction. Therefore the consecution check for C′ passes in this case. □

**Theorem 6.** *If both* k*-induction checks pass in the original circuit* C*, then the consecution check of the unrolling of length 1 in the* k*-witness circuit* C′ *passes.*

*Proof.* First, we assume both checks pass in C. We then proceed by contradiction, assuming there is a satisfying assignment s for the negation of the consecution check, U′_1 ∧ P′(I′_0, L′_0) ∧ ¬P′(I′_1, L′_1). Since s satisfies U′_1 ∧ P′(I′_0, L′_0), we consider the two separate cases in which the property P′(I′_0, L′_0) can be satisfied: full initialisation or partial initialisation. Note that when all of b^0_0, ..., b^{k−1}_0 are set to ⊥, P′(I′_0, L′_0) is not satisfied. Therefore, applying Lemma 8 and Lemma 7 together, we conclude that if both k-induction checks pass in C, the consecution check of the unrolling of length 1 in the k-witness circuit also passes. □

We briefly discuss why the k-witness circuit is linear in the size of the original circuit and in the value k. If we measure circuit size in the number of gates, the numbers of latches and inputs increase by a factor of approximately k. The transition function is copied k − 1 times, i.e., k − 2 times for reset in Definition 10.3(d) and once more in Definition 10.4(b), while the k − 2 copies in the property part, Definition 10.5(a), have the same arguments and can be shared. For the reset predicates, defining R(L^i) is linear in the number of latches, while u^i is linear in k. The same reasoning applies to the definition of the property, so we conclude that our construction is linear in the size of the circuit and in k.
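To make the two k-induction checks concrete, the following self-contained sketch applies them to a hypothetical five-state machine (a toy example of ours, not taken from the paper); explicit path enumeration stands in for the SAT queries that a model checker would issue:

```python
# Toy 5-state machine; F, RESET and P are stand-ins for the paper's
# transition function, reset predicate and safety property.
# State 4 satisfies P but is unreachable and steps to the bad state 3,
# so P is 2-inductive yet not 1-inductive.
F = {0: 1, 1: 2, 2: 0, 3: 3, 4: 3}   # transition function
RESET = {0}                          # initial states (R)
P = lambda s: s != 3                 # safety property

def paths(length):
    """All state sequences s_0, ..., s_length consistent with F."""
    for s0 in F:
        path = [s0]
        for _ in range(length):
            path.append(F[path[-1]])
        yield path

def bmc_check(k):
    """Base check of length k-1: every initialised path of k states
    satisfies P everywhere (U_{k-1} ∧ R(L_0) ⇒ ⋀ P(I_i, L_i))."""
    return all(all(P(s) for s in path)
               for path in paths(k - 1) if path[0] in RESET)

def consecution_check(k):
    """Step check of length k: k consecutive P-states imply P in the
    next state (U_k ∧ ⋀_{i∈[0,k)} P ⇒ P(I_k, L_k))."""
    return all(P(path[k])
               for path in paths(k)
               if all(P(s) for s in path[:k]))

assert bmc_check(1) and bmc_check(2)
assert not consecution_check(1)   # state 4 is a counterexample
assert consecution_check(2)       # P is 2-inductive
print("P is 2-inductive but not 1-inductive")
```

Brute force over an explicit state space replaces SAT solving here purely for illustration; the structure of the two checks is the same.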

# **5 Implementation**

Based on our new construction, we implemented Certifaiger [12], a tool suite comprising multiple components, as shown in Fig. 6. The tool takes as input a circuit containing a safety property, given in AIGER format [7], and a value k provided by a k-induction-based model checker that reports a positive model checking result. Upon invocation, the inputs are passed to the k-witness generator, which parses the AIGER file and generates a k-witness circuit as defined in Definition 10. The new safety property is a simple inductive invariant (to be verified) for the k-witness circuit. We extended the reset logic of the existing AIGER format defined by the authors of [7] to support reset functions, whereas all previous AIGER versions only allow reset values of 0, 1, or uninitialised. The k-witness circuit produced by the k-witness generator is given in this extended AIGER format.

**Fig. 6.** The architecture of Certifaiger. C is the input circuit in AIGER format and k is the value given by a k-induction-based model checker. The final outputs of the SAT solvers are given in the form of S/U, for satisfiable or unsatisfiable. The QBF solver outputs true or false (T /F) as the result.

To verify the inductive invariant φ(I, L), as discussed in Sect. 3, our certifier generates three conditions. (Note that here we are only looking at extended circuits, therefore we use L instead of L′.)


In our implementation, the latch variables used in the inductive invariant are updated with their next state literals after each transition. The consistency condition is rather trivial here, as the inductive invariant is exactly the property in the k-witness circuit, although this is only specific to our case.
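In standard inductive-invariant certification, such conditions amount to initiation (every initial state satisfies φ), consecution (φ is preserved by the transition function), and property implication (φ implies the safety property). The brute-force sketch below illustrates this scheme on a hypothetical four-state model; the toy model and variable names are ours, not the formulas the tool actually emits:

```python
# Brute-force validity check of the three certificate conditions for an
# inductive invariant phi over a small explicit state space (hypothetical
# toy model; in the tool, a SAT solver performs these checks on circuits).
STATES = range(4)
F = {0: 1, 1: 0, 2: 3, 3: 3}      # transition function
RESET = {0}                       # initial states
P = lambda s: s != 3              # safety property
phi = lambda s: s in (0, 1)       # candidate inductive invariant

phi_init   = all(phi(s) for s in RESET)                 # R implies phi
phi_consec = all(phi(F[s]) for s in STATES if phi(s))   # phi is inductive
phi_prop   = all(P(s) for s in STATES if phi(s))        # phi implies P

assert phi_init and phi_consec and phi_prop
print("certificate accepted")
```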

Our certifier generates for each of the three conditions a (combinational) AIGER circuit which is then checked by a SAT solver. In our implementation, we used the SAT solver Kissat [6] for checking validity of the formulas after they have been converted to CNF by invoking aigtocnf from the AIGER library.

Furthermore, we implemented the combinational simulation checker for verifying the combinational simulation relation described in Definition 7. The checker takes as inputs the original circuit and the k-witness circuit. It generates two AIGER files for the transition check and the property check, as well as a QAIGER file for the reset check, as defined in Definition 7. Similar to the inductive invariant checker, the AIGER files are then converted to CNFs and verified by Kissat. QAIGER is a standard format used in QBF Competitions. In our experiments the formula is verified with the QBF solver QuAbS [35].

The tool Certifaiger returns "success" as a result if all six formulas hold, meaning that the circuit C′ combinationally simulates C and that C′ is safe by the 1-induction proof. Thus, by Theorem 1, the original circuit C is also safe. Note that this result holds regardless of how C′ is constructed.

In a scenario where we are willing to trust the correctness of the extended circuit mapping inside the k-witness generator (i.e., to trust that the k-witness circuit construction of Definition 10 is correct and that the program implementing it is also provably correct), all three combinational simulation checks (one QBF and two SAT checks) could be skipped in the certification procedure.

Intuitively, given a faulty generation of the k-witness circuit C′, the error would be caught either by the combinational simulation check (due to an erroneous under-approximation of the set of reachable states) or by the inductive invariant check (due to an erroneous over-approximation of the set of reachable states). Furthermore, we also performed a sanity check of certification on failure, where incorrect model checking results are falsified by Certifaiger. An incorrect value of k is detected by a negative result of ϕ_consec, whereas ϕ_init does not hold in cases where an initial state is a bad state.

# **6 Experiments**

As described in previous sections, the complexity of extending the original circuit into the k-witness circuit is linear in the size of the circuit and the inductive depth. To evaluate the practicality of our tool, we now report experimental results obtained by evaluating Certifaiger on a number of widely used benchmarks. The benchmarks were first run on the open-source k-induction-based model checker McAiger [3], which was modified to report the values of k explicitly. All experiments were carried out on an Intel® Core™ i9-9900 CPU at 3.60 GHz with 32 GB RAM, running Manjaro with kernel version 5.4.72-1.

We start with the TIP suite benchmarks, originally used in [18]. The benchmarks were converted from .smv to AIGER by invoking smvtoaig from the AIGER library. Table 1 reports the certification results obtained, where the file names reflect the origin of the problems, as explained in [18]. The table displays the following information in each column:

**Fig. 7.** The time (a) and file size (b) comparison results for the TIP suite. The benchmark names are shown on the x-axis. Average values are shown as the blue horizontal line in each plot. The y-axis of (a) displays the ratio of total certification time to model checking time. The y-axis of (b) shows the expansion factor comparing circuit sizes (k-witness circuit vs. the original). (Color figure online)


Note that we selected only benchmarks with a positive model checking result, since only in that case is the original property k-inductive. Moreover, three instances that require simple path constraints (also called *loopFree* constraints in [34]) were ruled out. Handling these constraints is an interesting area for future study. We retrieved the inductive depths k from the model checker McAiger and compared them with the results in [18] to ensure the values are identical. As shown in Table 1, the values of k vary between 4 and 96. The SAT solver was able to handle the proof checking without experiencing time-outs. We observe that the k-witness circuit generation time is rather small compared with both the model checking time and the proof checking time. In the proof checking stage, Table 1 suggests that the SAT-solving time for ϕ_consec is much higher than for the rest of the formulas. This is as expected, as ϕ_consec is in general more complicated than the rest and appears to be the most difficult formula to solve. In addition, the QBF solving times are worth noting: in a few cases the QBF solving time is longer than for the other formulas, but in most cases it is rather small. To compare certification time with model checking time, we plotted the results in Fig. 7a, where the y-axis shows the ratio of certification time to model checking time.


**Table 1.** Experimental results for the TIP suite.

Here certification time is the sum of the time taken on each component, assuming the six conditions are computed in parallel. As shown in the diagram, the average time ratio is around 8, which is quite promising. Furthermore, Fig. 7b shows a comparison of circuit sizes, where the *expansion factor* ε is computed as #C′ / (#C × k) (alternatively, #C′ = ε · #C · k). The average value observed here is around 1.5. This is consistent with Definition 10, as we expect the size of the k-witness circuit to grow linearly with respect to the original circuit and the value of k.
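As a small illustration of the size comparison, the expansion factor can be computed directly from the circuit sizes; the sizes below are hypothetical, chosen so that ε works out to 1.5:

```python
# Expansion factor of the k-witness circuit: eps = #C' / (#C * k),
# equivalently #C' = eps * #C * k. Sizes here are illustrative only.
def expansion_factor(size_original, size_witness, k):
    return size_witness / (size_original * k)

eps = expansion_factor(size_original=200, size_witness=3000, k=10)
print(eps)  # 1.5
```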

**Fig. 8.** Certification time *vs.* model checking time obtained by running HWMCC'10 benchmarks.

We also used benchmarks from the Hardware Model Checking Competition (HWMCC) 2010 [4]. The benchmarks were pre-filtered by running McAiger with a time-out of 15 min. A total of 513 instances were solved by McAiger, from which we selected the 216 *unsat* instances with a meaningful k (*i.e.,* k ≥ 2). We observed that only 7 of the 216 instances require simple path constraints. The results in Fig. 8 are sorted by benchmark name, which enables us to compare individual benchmarks from the same family. In most cases, similar to our previous observation on the TIP suite, the SAT solving

**Fig. 9.** The <sup>k</sup>-witness circuit size *vs.* the original circuit size.

time of ϕ_consec is much longer than for the rest, while in very few cases it is less than the QBF solving time for ϕ_reset. The average time ratio is 30, where we excluded 4 outliers in the plot, coming from the pj20 family, which give a worse result (total certification time ≥ 15 min). We observed that this was due to the high format conversion time from QAIGER to QCIR [25] before the QBF solving performed by QuAbS, while the actual QBF solving time was significantly smaller and quite feasible. We believe this can be overcome in practice by generating the alternative format directly. Finally, similar to our previous TIP results, Fig. 9 shows the values of the expansion factor, with an average of 1.5.

In the final experiments, to further inspect the expansion factor, we generalised the *Counter* example of Example 1, scaling the number of bits to 500 with a modulo value of 32. To assess the complexity of our construction of the k-witness circuit, we ran experiments with different values of k. The results are shown in Fig. 10, where the x-axis shows the values of b up to 431, meaning the value of k was scaled up to 400. As expected, the expansion factor gradually converges to a constant as we increase the value of b.

**Fig. 10.** The experimental results of the *Counter* example. The values of *b* are shown on the x-axis.

As noted above, our approach is overall efficient in the certification stage. In particular, our implementation adopts the linear construction of the k-witness circuit from Definition 10, so the size of the resulting AIGER circuit is linear in the size of the original circuit and in the value of k. Each component in the tool suite works independently of the others when performing verification, which increases trust in the verification results.

# **7 Conclusion**

We propose an approach to certify k-induction-based model checking results by extending the model to produce an inductive invariant. The resulting tool, Certifaiger, was evaluated experimentally on multiple sets of widely used benchmarks. The analysis shows that our approach is usable in practice.

Our certificates are linear in the size of the original problem and in k. Validation requires several SAT checks and solving a simple QBF. In related work [8,23], the worst case is considered to be exponential. It is an interesting open question whether our notion of combinational simulation, which requires a QBF check for the reset condition, can be changed to use only SAT checks.

Further, we only considered k-induction without simple path constraints, even though such constraints on executions of the original model can in principle be handled by adding unique-state constraints to our k-witness circuit. For simplicity we stick to models without such constraints, a restriction also made, for instance, in the hardware model checking competition. Certifying k-induction with simple path constraints is thus left to future work, as is the handling of different types of properties, such as liveness properties.

We also want to extend our approach to common preprocessing techniques, including temporal decomposition [11] and retiming [28], with the goal of obtaining a single certificate (witness circuit). This goal is particularly challenging for complex multi-engine model checkers [9,10]. Furthermore, we believe our approach can be extended to infinite-state systems, where k-induction is commonly used.

**Acknowledgement.** This work is supported by the Austrian Science Fund (FWF) under the project W1255-N23, the LIT AI Lab funded by the State of Upper Austria, and Academy of Finland under the project 336092.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Model-Checking Structured Context-Free Languages**

Michele Chiari<sup>1</sup>(B), Dino Mandrioli<sup>1</sup>, and Matteo Pradella<sup>1,2</sup>

<sup>1</sup> DEIB, Politecnico di Milano, Milan, Italy
{michele.chiari,dino.mandrioli,matteo.pradella}@polimi.it
<sup>2</sup> IEIIT, Consiglio Nazionale delle Ricerche, Milan, Italy

**Abstract.** The problem of model checking procedural programs has fostered much research towards the definition of temporal logics for reasoning on context-free structures. The most notable of such results are temporal logics on Nested Words, such as CaRet and NWTL. Recently, the logic OPTL was introduced, based on the class of Operator Precedence Languages (OPL), more powerful than Nested Words. We define the new OPL-based logic POTL, and provide a model checking procedure for it. POTL improves on NWTL by enabling the formulation of requirements involving pre/post-conditions, stack inspection, and others in the presence of exception-like constructs. It improves on OPTL by being FO-complete, and by expressing more easily stack inspection and function-local properties. We developed a model checking tool for POTL, which we experimentally evaluate on some interesting use-cases.

**Keywords:** Linear temporal logic · Operator precedence languages · Model Checking · Visibly pushdown languages · Input-driven languages

# **1 Introduction**

Model checking is one of the most successful techniques for the verification of software programs. It consists in the exhaustive verification of the mathematical model of a program against a specification of its desired behavior. The kind of properties that can be proved in this way depends both on the formalism employed to model the program, and on the one used to express the specification. The initial and most classical frameworks consist in the use of operational formalisms, such as Transition Systems and Finite State Automata (generally Büchi automata) for the model, and temporal logics such as Linear-time Temporal Logic (LTL), Computation-Tree Logic (CTL) and CTL\* for the specification [24]. The success of such logics is due to their ease in reasoning about linear or branching sequences of events over time, by expressing liveness and safety properties, their conciseness with respect to automata, and the complexity of their model checking.

In this paper we consider linear-time temporal domains. LTL limits its set of expressible properties to the First-Order Logic (FOL) definable fragment

of regular languages. This is quite restrictive when compared with the most popular abstract models of procedural programs, such as Pushdown Systems, Boolean Programs [10], and Recursive State Machines [3]. All such stack-based formalisms show behaviors that are expressible by means of Context-Free Languages (CFL), rather than regular ones. State and configuration reachability, fair computation problems, and model checking of *regular specifications* have been thoroughly studied for such formalisms [3,4,13,17,28,30,32,40,51,55]. To expand the expressive power of specification languages too, [12,14] augmented LTL with Presburger arithmetic constraints on the occurrences of states, obtaining a logic capable of even some context-sensitive specifications, but with only restricted decidable fragments. [41] introduced model checking of pushdown tree automata specifications on regular systems, and Dynamic Logic was extended to some limited classes of CFL [34]. Decision procedures for different kinds of regular constraints on stack contents have been given in [18,29,37].

A coherent approach came with the introduction of temporal logics based on Visibly Pushdown Languages (VPL) [7], a.k.a. Input-Driven Languages [47]. Such logics, namely CaRet [6] and its FO-complete successor NWTL [2], model the execution trace of a procedural program as a Nested Word [8], consisting in a linear ordering augmented with a one-to-one matching relation between function calls and returns. They are the first ones featuring temporal modalities that explicitly refer to the nesting structure of CFL [4]. This enables requirement specifications to include Hoare-style pre/post-conditions, stack-inspection properties, and more. A μ-calculus based on VPL extends model checking to branching-time semantics in [5], while [16] introduces a temporal logic capturing the whole class of VPL. Timed extensions of CaRet are given in [15].

VPL too have their limitations. They are more general than Parenthesis Languages [46], but their *matching relation* is essentially constrained to be one-to-one [43]. This hinders their suitability to model processes in which a single event must be put in relation with multiple ones. Unfortunately, computer programs often present such behaviors: exceptions and continuations are single events that cause the termination (or re-instantiation) of multiple functions on the stack.

To reason about such behaviors, temporal logics based on Operator Precedence Languages (OPL) have been proposed [22]. OPL were initially introduced with the purpose of efficient parsing [31], a field in which they continue to offer useful applications [11]. They are capable of capturing the syntax of arithmetic expressions, and other constructs whose context-free structure is not immediately visible. The generality of the structure of their syntax trees is much greater than that of VPL, which are strictly included in OPL [25]. Nevertheless, they retain the same closure properties that make regular languages and VPL suitable for automata-theoretic model checking: OPL are closed under Boolean operations, concatenation, Kleene \*, and language emptiness and inclusion are decidable [42]. They have been characterized by means of push-down automata, Monadic Second-Order Logic and, recently, by an extension of Regular Expressions [42,44].

OPTL [22] is the first linear-time temporal logic for which a model checking procedure has been given on both finite and ω-words of OPL. It enables reasoning on procedural programs with exceptions, expressing properties about whether a function can be terminated by an exception, or throw one, as well as pre/post-conditions. NWTL can be translated into OPTL in linear time, thus the latter is capable of expressing all properties that can be formalized in CaRet and NWTL, and many more. [22] does not explore OPTL's expressiveness further, and does not investigate the practical applicability of its model checking construction.

In this article, we introduce Precedence Oriented Temporal Logic (POTL), which redefines the syntax and semantics of OPTL to be much closer to the context-free structure of words. With POTL, it is much easier to navigate a word's syntax tree, expressing requirements that are aware of its structure. From a more theoretical point of view, POTL is FO-complete whereas OPTL is not, so that CaRet, NWTL, OPTL and POTL constitute a strict hierarchy in terms of expressive power. Such a theoretical elaboration, however, is technically involved; thus, for length reasons, it is documented in a technical report [23].

In this paper, instead, we focus on the model checking application of POTL. We provide a tableau-construction procedure for model checking POTL, which yields nondeterministic automata of size at most singly exponential in the formula's length, and is thus asymptotically no larger than those for LTL, NWTL and OPTL. We implemented this procedure in a tool called POMC, which we evaluate on several interesting case studies. POMC's performance is promising: almost all case studies are verified in seconds with reasonable memory consumption, and the few outliers are inevitable, given the exponential complexity of the task.

Work on tools is not as rich as the theoretical literature. Tools and libraries such as VPAlib [48], VPAchecker [54], OpenNWA [27] and SymbolicAutomata [26] only implement operations such as union, intersection, and universality/inclusion/emptiness checks for Visibly Pushdown or Nested Word Automata, but have no model checking capabilities. PAL [19] uses nested-word-based monitors to express program specifications, and a tool based on BLAST [36] implements its runtime monitoring and model checking. PAL follows the paradigm of program monitors, and is not—strictly speaking—a temporal logic. PTCaRet [52] is a past version of CaRet, and its runtime monitoring has been implemented in JavaMOP [20]. [49,50] describe a tool for model checking programs against CaRet specifications. Since its purpose is malware detection, it targets program binaries directly, by modeling them as Pushdown Systems. Unfortunately, this tool does not seem to be available online. To the best of our knowledge, POMC is the only publicly available<sup>1</sup> tool for model checking temporal logics capable of expressing context-free properties.

The paper is organized as follows: we give some background on OPL in Sect. 2, we introduce POTL in Sect. 3 and its model checking in Sect. 4, and we evaluate our prototype model checker in Sect. 5. Due to space constraints, we leave all formal proofs to a technical report [21].

<sup>1</sup> https://github.com/michiari/POMC.

### **2 Operator Precedence Languages**

We assume some familiarity with classical formal language theory concepts, such as context-free grammars, parsing, the shift-reduce algorithm, and syntax trees (ST) [33,35]. Operator Precedence Languages (OPL) are usually defined through their generating grammars [31]; in this paper, however, we characterize them through their accepting automata [42], which are the natural device for proving equivalence with the logic characterization, and for model checking. Readers not familiar with OPL may refer to [43] for more explanations of the following basic concepts; an explanatory example is also given at the end of this section.

Let Σ be a finite alphabet, and ε the empty string. We use a special symbol # ∉ Σ to mark the beginning and the end of any string. An *operator precedence matrix* (OPM) M over Σ is a partial function (Σ ∪ {#})² → {⋖, ≐, ⋗} that, for each ordered pair (a, b), defines the *precedence relation* (PR) M(a, b) holding between a and b. If the function is total, we say that M is *complete*. We call the pair (Σ, M) an *operator precedence alphabet*. Relations ⋖, ≐, ⋗ are respectively named *yields precedence*, *equal in precedence*, and *takes precedence*. By convention, the initial # yields precedence, and other symbols take precedence on the ending #. If M(a, b) = π, where π ∈ {⋖, ≐, ⋗}, we write a π b. For u, v ∈ Σ⁺ we write u π v if u = xa and v = by with a π b. The role of PR is to give structure to words: they can be seen as special and more concise parentheses, where e.g. one "closing" ⋗ can match more than one "opening" ⋖. Despite their graphical appearance, PR are not ordering relations.
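As a concrete illustration (ours, not part of the formal development), an OPM can be encoded as a small lookup table. The sketch below uses the ASCII characters `<`, `=`, `>` for ⋖, ≐, ⋗, and hardcodes the conventions for the delimiter #; the entries of M**call** are transcribed from Fig. 1 (the entries not exercised by the examples in this section are our best reading of the figure and should be treated as an assumption).

```python
# Rows of the OPM M_call of Fig. 1, one precedence character per column.
# '<' stands for "yields precedence", '=' for "equal in precedence",
# '>' for "takes precedence".
COLS = ["call", "ret", "han", "exc"]
ROWS = {"call": "<=<>", "ret": ">>>>", "han": "<><=", "exc": ">>>>"}

def prec(a, b):
    """The precedence relation M(a, b), with the conventions for #."""
    if a == "#":
        return "=" if b == "#" else "<"   # the initial # yields precedence
    if b == "#":
        return ">"                        # symbols take precedence on the ending #
    return ROWS[a][COLS.index(b)]
```

For instance, `prec("call", "ret")` is `=`, reflecting the one-to-one matching of a **call** with its **ret**.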

**Definition 1.** *An* operator precedence automaton (OPA) *is a tuple* A = (Σ, M, Q, I, F, δ) *where:* (Σ, M) *is an operator precedence alphabet,* Q *is a finite set of states (disjoint from* Σ*),* I ⊆ Q *is the set of initial states,* F ⊆ Q *is the set of final states, and* δ ⊆ Q × (Σ ∪ Q) × Q *is the transition relation, which is the union of the three disjoint relations* δshift ⊆ Q × Σ × Q*,* δpush ⊆ Q × Σ × Q*, and* δpop ⊆ Q × Q × Q*. An OPA is deterministic iff* I *is a singleton, and all three components of* δ *are—possibly partial—functions.*

To define the semantics of OPA, we need some new notation. Letters p, q, pᵢ, qᵢ, ... denote states in Q. We write q₀ −a→ q₁ for (q₀, a, q₁) ∈ δpush, q₀ ⇢a q₁ for (q₀, a, q₁) ∈ δshift, q₀ ⇒q₂ q₁ for (q₀, q₂, q₁) ∈ δpop, and q₀ ⇝w q₁ if the automaton can read w ∈ Σ∗ going from q₀ to q₁. Let Γ = Σ × Q and Γ′ = Γ ∪ {⊥} be the *stack alphabet*; we denote symbols in Γ′ as [a, q] or ⊥. We set smb([a, q]) = a, smb(⊥) = #, and st([a, q]) = q. For a stack content γ = γₙ ... γ₁⊥, with γᵢ ∈ Γ, n ≥ 0, we set smb(γ) = smb(γₙ) if n ≥ 1, and smb(γ) = # if n = 0.

A *configuration* of an OPA is a triple c = ⟨w, q, γ⟩, where w ∈ Σ∗#, q ∈ Q, and γ ∈ Γ∗⊥. A *computation* or *run* is a finite sequence c₀ ⊢ c₁ ⊢ ... ⊢ cₙ of *moves* or *transitions* cᵢ ⊢ cᵢ₊₁. There are three kinds of moves, depending on the PR between the symbol on top of the stack and the next input symbol:

**Push move:** if smb(γ) ⋖ a then ⟨ax, p, γ⟩ ⊢ ⟨x, q, [a, p]γ⟩, with (p, a, q) ∈ δpush;

**Shift move:** if a ≐ b then ⟨bx, q, [a, p]γ⟩ ⊢ ⟨x, r, [b, p]γ⟩, with (q, b, r) ∈ δshift;

**Pop move:** if a ⋗ b then ⟨bx, q, [a, p]γ⟩ ⊢ ⟨bx, r, γ⟩, with (q, p, r) ∈ δpop.


**Fig. 1.** OPM M**call** (left) and a string with chains shown by brackets (right).

Shift and pop moves are not performed when the stack contains only ⊥. Push moves put a new element on top of the stack, consisting of the input symbol together with the current state of the OPA. Shift moves update the top element of the stack by *changing its input symbol only*. Pop moves remove the element on top of the stack, and update the state of the OPA according to δpop, on the basis of the current state of the OPA and the state in the removed stack symbol. They do not consume the input symbol, which is used only to establish the PR and remains available for the next move. The OPA accepts the language L(A) = {x ∈ Σ∗ | ⟨x#, qI, ⊥⟩ ⊢∗ ⟨#, qF, ⊥⟩ for some qI ∈ I, qF ∈ F}.
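The three moves can be made concrete with a short Python sketch of a deterministic OPA interpreter (our illustration, with assumed names; `<`, `=`, `>` encode ⋖, ≐, ⋗, and the M**call** entries are transcribed from Fig. 1). For demonstration we run a one-state OPA whose transition functions are total, which therefore accepts every word compatible with the OPM:

```python
# Illustrative encoding (ours) of M_call; '<', '=', '>' stand for the three PR.
COLS = ["call", "ret", "han", "exc"]
ROWS = {"call": "<=<>", "ret": ">>>>", "han": "<><=", "exc": ">>>>"}

def prec(a, b):
    if a == "#":
        return "=" if b == "#" else "<"   # initial # yields precedence
    return ">" if b == "#" else ROWS[a][COLS.index(b)]

def run_opa(word, push, shift, pop, q0, final):
    """Accept iff <x#, q0, bottom> |-* <#, qF, bottom> with qF in `final`."""
    q, stack, i = q0, [], 0               # the empty list plays the role of bottom
    x = list(word) + ["#"]
    while True:
        a = x[i]
        if a == "#" and not stack:        # input consumed, stack emptied
            return q in final
        top = stack[-1][0] if stack else "#"
        pi = prec(top, a)
        if pi == "<":                     # push: consume a, save the current state
            stack.append((a, q)); q = push[q, a]; i += 1
        elif pi == "=":                   # shift: update the top symbol only
            stack[-1] = (a, stack[-1][1]); q = shift[q, a]; i += 1
        elif pi == ">":                   # pop: a is not consumed
            q = pop[q, stack.pop()[1]]
        else:
            return False                  # undefined precedence relation

# One-state OPA with total transition functions: it accepts exactly
# the words compatible with M_call.
PUSH = SHIFT = {(0, c): 0 for c in COLS}
POP = {(0, 0): 0}
w_ex = ["call", "han", "call", "call", "exc", "call", "ret", "ret"]
```

On `w_ex`, reading **exc** triggers two pop moves (unwinding the two active **call** frames) before the shift on the **han** symbol, without consuming further input.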

We now introduce the concept of *chain*, which makes the connection between OP relations and context-free structure explicit, through brackets.

**Definition 2.** *A* simple chain *is a string* c₀c₁c₂ … cℓcℓ₊₁*, written* c₀[c₁c₂ … cℓ]cℓ₊₁*, such that:* c₀, cℓ₊₁ ∈ Σ ∪ {#}*,* cᵢ ∈ Σ *for every* i = 1, 2, …, ℓ *(*ℓ ≥ 1*), and* c₀ ⋖ c₁ ≐ c₂ … cℓ₋₁ ≐ cℓ ⋗ cℓ₊₁*. A* composed chain *is a string* c₀s₀c₁s₁c₂ … cℓsℓcℓ₊₁*, where* c₀[c₁c₂ … cℓ]cℓ₊₁ *is a simple chain, and* sᵢ ∈ Σ∗ *is either the empty string or such that* cᵢ[sᵢ]cᵢ₊₁ *is a chain (simple or composed), for every* i = 0, 1, …, ℓ *(*ℓ ≥ 1*). Such a composed chain will be written as* c₀[s₀c₁s₁c₂ … cℓsℓ]cℓ₊₁*.* c₀ *(resp.* cℓ₊₁*) is called its* left *(resp.* right*)* context*; all symbols between them form its* body*.*

A finite word w over Σ is *compatible* with an OPM M iff for each pair of letters c, d, consecutive in w, M(c, d) is defined and, for each substring x of #w# that is a chain of the form <sup>a</sup>[y] <sup>b</sup>, M(a, b) is defined.

Chains can be identified through the traditional operator precedence parsing algorithm. We apply it to the sample word wex = **call han call call exc call ret ret**, which is compatible with M**call** (for a more complete treatment, cf. [33,43]). First, write all precedence relations between consecutive characters, according to M**call**. Then, recognize all innermost patterns of the form a ⋖ c ≐ … ≐ c′ ⋗ b as simple chains, and remove their bodies. Then, write the precedence relation between the left and right contexts of the removed body, a and b, and iterate this process until only ## remains. This procedure is applied to wex as follows:

```
1  # ⋖ call ⋖ han ⋖ call ⋖ [call] ⋗ exc ⋗ call ≐ ret ⋗ ret ⋗ #
2  # ⋖ call ⋖ han ⋖ [call] ⋗ exc ⋗ call ≐ ret ⋗ ret ⋗ #
3  # ⋖ call ⋖ [han ≐ exc] ⋗ call ≐ ret ⋗ ret ⋗ #
4  # ⋖ call ⋖ [call ≐ ret] ⋗ ret ⋗ #
5  # ⋖ [call ≐ ret] ⋗ #
6  # ≐ #
```
Each step removes the body of an innermost chain. In step 1, **call**[**call**] **exc** is a simple chain, so its body **call** is removed. Then, in step 2 we recognize the simple chain **han**[**call**] **exc**, which means that **han**[**call**[**call**]]**exc**, where [**call**] is the body removed in step 1, is a composed chain. This way we recognize, e.g., **han**[**call**] **exc** and **call**[**han exc**] **call** as simple chains, and **han**[**call**[**call**]]**exc** and **call**[**han**[**call**[**call**]]**exc**] **call** as composed chains (with inner chain bodies enclosed in brackets). Figure 1 shows the structure of a longer version of wex, which is an isomorphic representation of its ST as depicted in Fig. 4. Each chain corresponds to an internal node, and the fringe of the subtree rooted at it is the chain's body.
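The body-removal procedure above fits in a few lines of Python (an illustrative sketch with our own names; the M**call** entries are transcribed from Fig. 1). Each iteration locates the leftmost innermost ⋖ … ⋗ pattern, records the chain's contexts and body, and deletes the body:

```python
COLS = ["call", "ret", "han", "exc"]
ROWS = {"call": "<=<>", "ret": ">>>>", "han": "<><=", "exc": ">>>>"}

def prec(a, b):
    if a == "#":
        return "=" if b == "#" else "<"
    return ">" if b == "#" else ROWS[a][COLS.index(b)]

def chains(word):
    """Recognize chains as in the trace above: repeatedly find the leftmost
    pattern a < c = ... = c' > b, record (a, body, b), delete the body.
    The word is assumed compatible with the OPM."""
    seq, found = ["#"] + list(word) + ["#"], []
    while len(seq) > 2:
        k = next(i for i in range(len(seq) - 1)
                 if prec(seq[i], seq[i + 1]) == ">")
        i = k
        while prec(seq[i - 1], seq[i]) == "=":   # extend the body left over '='
            i -= 1
        found.append((seq[i - 1], seq[i:k + 1], seq[k + 1]))
        del seq[i:k + 1]                          # remove the chain's body
    return found
```

On wex = **call han call call exc call ret ret**, the recorded chains reproduce steps 1–5 of the trace above, ending with the outermost chain between the two # delimiters.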

Let A be an OPA. We call a *support* for the simple chain c₀[c₁c₂ … cℓ]cℓ₊₁ any path in A of the form q₀ −c₁→ q₁ ⇢c₂ … ⇢cℓ qℓ ⇒q₀ qℓ₊₁. The label of the last (and only) pop is exactly q₀, i.e. the first state of the path; this pop is executed because of the relation cℓ ⋗ cℓ₊₁. We call a *support for the composed chain* c₀[s₀c₁s₁c₂ … cℓsℓ]cℓ₊₁ any path in A of the form q₀ ⇝s₀ q′₀ −c₁→ q₁ ⇝s₁ q′₁ ⇢c₂ … ⇢cℓ qℓ ⇝sℓ q′ℓ ⇒q′₀ qℓ₊₁ where, for every i = 0, 1, …, ℓ: if sᵢ ≠ ε, then qᵢ ⇝sᵢ q′ᵢ is a support for the chain cᵢ[sᵢ]cᵢ₊₁, else q′ᵢ = qᵢ.

Chains fully determine the parsing structure of any OPA over (Σ, M). If the OPA performs the computation ⟨sb, qᵢ, [a, qⱼ]γ⟩ ⊢∗ ⟨b, qₖ, γ⟩, then a[s]b is necessarily a chain over (Σ, M), and there exists a support like the one above with s = s₀c₁ … cℓsℓ and qℓ₊₁ = qₖ. This corresponds to the parsing of the string s₀c₁ … cℓsℓ within the contexts a, b, which contains all information needed to build the subtree whose frontier is that string.

Consider the OPA A(Σ, M) = (Σ, M, {q}, {q}, {q}, δmax), where δmax(q, q) = q and δmax(q, c) = q, ∀c ∈ Σ. We call it the *OP Max-Automaton* over (Σ, M). For a max-automaton, each chain has a support. Since there is a chain #[s]# for any string s compatible with M, a string is accepted by A(Σ, M) iff it is compatible with M. If M is complete, each string is accepted by A(Σ, M), which defines the universal language Σ∗ by assigning to any string the (unique) structure compatible with the OPM. With M**call** of Fig. 1, if we take e.g. the string **ret call han**, it is accepted by the max-automaton with structure #[[**ret**]**call**[**han**]]#.

In conclusion, given an OP alphabet, the OPM M assigns a unique structure to any compatible string in Σ∗; unlike VPL, such a structure is not visible in the string, and must be built by means of a non-trivial parsing algorithm. An OPA defined on the OP alphabet selects an appropriate subset within the "universe" of strings compatible with M. For a more complete description of the OPL family and of its relations with other CFL we refer the reader to [43].

#### **2.1 Operator Precedence** *ω***-Languages**

All definitions regarding OPL are extended to infinite words in the usual way, but with a few distinctions. Given an OP alphabet (Σ, M), an ω-word w ∈ Σω is compatible with M if every prefix of w is compatible with M. OP ω-words are not terminated by the delimiter #. An ω-word may contain never-ending chains of the form c₀ ⋖ c₁ ≐ c₂ ≐ ⋯, where the ⋖ relation between c₀ and c₁ is never closed by a corresponding ⋗. Such chains are called *open chains*, and may be simple or composed. A composed open chain may contain both open and closed subchains. Of course, a closed chain cannot contain an open one. A terminal symbol a ∈ Σ is *pending* if it is part of the body of an open chain and of no closed chain.

OPA classes accepting the whole class of ωOPL can be defined by augmenting Definition 1 with Büchi or Muller acceptance conditions [42]. In this paper, we only consider the former. The semantics of configurations, moves and infinite runs are defined as for finite OPA. For the acceptance condition, let ρ be a run on an ω-word w. Define

Inf(ρ) = {q ∈ Q | there exist infinitely many positions i s.t. ⟨xᵢ, q, βᵢ⟩ ∈ ρ}

as the set of states that occur infinitely often in ρ. A run ρ is successful iff there exists a state qf ∈ F such that qf ∈ Inf(ρ). An ωOPBA A accepts w ∈ Σω iff there is a successful run of A on w. The ω-language recognized by A is L(A) = {w ∈ Σω | A accepts w}. Unlike OPA, ωOPBA do not require the stack to be empty for word acceptance: when reading an open chain, the stack symbol pushed when the first character of the body of its underlying simple chain is read remains in the stack forever; it is at most updated by shift moves.

The most important closure properties of OPL are preserved by ωOPL, which form a Boolean algebra and are closed under concatenation of an OPL with an ωOPL [42]. The equivalence between deterministic and nondeterministic automata is lost in the infinite case, which is unsurprising, since it also happens for regular ω-languages and ωVPL.

#### **2.2 Modeling Programs with OPA**

For readers not familiar with OPL, we show how OPA can naturally model programming languages such as Java and C++. Given a set AP of atomic propositions describing events and states of the program, we use (P(AP), MAP) as the OP alphabet. For convenience, we consider a partitioning of AP into a set of standard propositional labels (in round font), and *structural labels* (SL, in bold). SL define the OP structure of the word: MAP is only defined for subsets of AP containing exactly one SL, so that, given two SL **l**₁, **l**₂, for any a, a′, b, b′ ∈ P(AP) s.t. **l**₁ ∈ a, a′ and **l**₂ ∈ b, b′, we have MAP(a, b) = MAP(a′, b′). Hence, we define an OPM on the entire P(AP) by only giving the relations between SL, as we did for M**call**. Figure 2 shows how to model a procedural program with an OPA. The OPA simulates the program's behavior with respect to the stack, by expressing its execution traces with four event kinds: **call** (resp. **ret**) marks a procedure call (resp. return), **han** the installation of an exception handler by a try statement, and **exc** an exception being raised. OPM M**call** defines the context-free structure of the word, which is strictly linked with the programming language semantics: the ⋖ PR causes nesting (e.g., **call**s can be nested into other **call**s), and the ≐ PR implies a one-to-one relation, e.g. between a **call** and the **ret** of the same function, or a **han** and the **exc** it catches. Each OPA state represents a line in the source code. First, procedure pA is called by the program loader (M0),

**Fig. 2.** Example procedural program (top) and the derived OPA (bottom). '\*' implies a non-deterministic choice. Push, shift, pop moves are shown by, resp., solid, dashed and double arrows.

and [{**call**, pA}, M0] is pushed onto the stack, to track the program state before the **call**. Then, the try statement at line A0 of pA installs a handler. All subsequent calls to pB and pC push new stack symbols on top of the one pushed with **han**. pC may only call itself recursively, or throw an exception, but never return normally. This is reflected by **exc** being the only transition leading from state C0 to the accepting state Mr, and by pB and pC having no way to reach a normal **ret**. The OPA has a look-ahead of one input symbol, so when it encounters **exc**, it must pop all stack symbols corresponding to active function frames, until it finds the one containing **han**, which cannot be popped because **han** ≐ **exc**. Notice that such behavior cannot be modeled by Visibly Pushdown Automata or Nested Word Automata, because they need to read an input symbol for each pop move. Thus, **han** protects the parent function from the exception. Since the state contained in **han**'s stack symbol is A0, the execution resumes in the catch clause of pA. pA then calls the error-handling function pErr twice, which ends regularly both times, and returns. The string of Fig. 1 is accepted by this OPA.

In this example, we only model the stack behavior for simplicity, but other statements, such as assignments, and other behaviors, such as continuations, could be modeled by a different choice of the OPA and OPM, and other aspects of the program's state by appropriate abstractions [38].

### **3 POTL: Syntax and Semantics**

Given a finite set of atomic propositions AP, the syntax of POTL follows:

$$\begin{aligned} \varphi ::= {} & \mathrm{a} \mid \neg \varphi \mid \varphi \vee \varphi \mid \bigcirc^t \varphi \mid \ominus^t \varphi \mid \chi\_F^t \varphi \mid \chi\_P^t \varphi \mid \varphi \, \mathcal{U}\_\chi^t \, \varphi \mid \varphi \, \mathcal{S}\_\chi^t \, \varphi \\ & \mid \bigcirc\_H^t \varphi \mid \ominus\_H^t \varphi \mid \varphi \, \mathcal{U}\_H^t \, \varphi \mid \varphi \, \mathcal{S}\_H^t \, \varphi \end{aligned}$$

**Fig. 3.** The string of Fig. 1 as an OP word. Chains are shown by edges joining their contexts. Standard atomic propositions are shown below SL: p*<sup>l</sup>* means a **call** or a **ret** is related to procedure p*l*. First, procedure p*<sup>A</sup>* is called (pos. 1), and it installs a handler in pos. 2. Then, three procedures are called, and one (p*<sup>C</sup>* ) throws an exception, which is caught by the handler. Two more functions are called and, finally, p*<sup>A</sup>* returns.

where a <sup>∈</sup> AP, and <sup>t</sup> ∈ {d, u}.

The semantics of POTL is based on the *word structure*—also called *OP word* for short—(U, MAP, P), where U = {0, 1, …, n, n+1}, with n ∈ ℕ, is a set of word positions; MAP is an OPM on P(AP); and P : U → P(AP) is a function associating each position in U with the set of atomic propositions holding in that position, with P(0) = P(n+1) = {#}. Given two positions i, j and a PR π, we write i π j to mean P(i) π P(j).

We define the chain relation χ ⊆ U × U so that χ(i, j) holds between two positions i, j iff i < j − 1, and i and j are respectively the left and right contexts of the same chain. For composed chains, χ need not be one-to-one: it may also be one-to-many or many-to-one. The PR between the contexts identifies the nesting of such chains: when the chain relation is one-to-many, the contexts of the outermost chain are in the ≐ or ⋗ relation, while the inner ones are in the ⋖ relation; symmetrically, the contexts of the outermost many-to-one chain are in the ≐ or ⋖ relation, the inner ones being in the ⋗ relation. In the ST, the right context j of a chain is at the *same level* as the left one i when i ≐ j (e.g., in Fig. 4, pos. 1 and 11), at a *lower level* when i ⋖ j (e.g., pos. 1 with 7 and 9), and at a *higher level* when i ⋗ j (e.g., pos. 3 and 4 with 6).
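The chain relation can be computed by the same body-removal parsing of Sect. 2, keeping track of the original positions (our illustrative sketch, with assumed names; M**call** entries transcribed from Fig. 1). For the word of Fig. 3, this recovers exactly the χ pairs used in the examples below:

```python
COLS = ["call", "ret", "han", "exc"]
ROWS = {"call": "<=<>", "ret": ">>>>", "han": "<><=", "exc": ">>>>"}

def prec(a, b):
    if a == "#":
        return "=" if b == "#" else "<"
    return ">" if b == "#" else ROWS[a][COLS.index(b)]

def chain_relation(word):
    """All pairs (i, j) with chi(i, j); positions are 1-based as in Fig. 3,
    with 0 and len(word)+1 holding the # delimiters."""
    seq = [(0, "#")] + list(enumerate(word, 1)) + [(len(word) + 1, "#")]
    chi = set()
    while len(seq) > 2:
        k = next(i for i in range(len(seq) - 1)
                 if prec(seq[i][1], seq[i + 1][1]) == ">")
        i = k
        while prec(seq[i - 1][1], seq[i][1]) == "=":
            i -= 1
        left, right = seq[i - 1][0], seq[k + 1][0]
        if left < right - 1:          # chi requires i < j - 1
            chi.add((left, right))
        del seq[i:k + 1]              # remove the chain's body
    return chi
```

On the word of Fig. 3 this yields χ(1,7), χ(1,9), χ(1,11), χ(2,6), χ(3,6), χ(4,6), plus the outermost pair between the two # delimiters.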

The truth of POTL formulas is defined w.r.t. a single word position. Let <sup>w</sup> be an OP word, and a <sup>∈</sup> AP. Then, for any position <sup>i</sup> <sup>∈</sup> <sup>U</sup> of <sup>w</sup>, we have (w, i) <sup>|</sup>= a if a <sup>∈</sup> <sup>P</sup>(i). Operators such as <sup>∧</sup> and <sup>¬</sup> have the usual semantics from propositional logic. Next, while giving the formal semantics of POTL operators, we illustrate it by showing how it can be used to express properties on program execution traces, such as the one of Fig. 3.

**a) Next/Back Operators.** The *downward* next and back operators ◯<sup>d</sup> and ⊖<sup>d</sup> are like their LTL counterparts, except that they are true only if the next (resp. current) position is at a lower or equal ST level than the current (resp. preceding) one. The *upward* next and back, ◯<sup>u</sup> and ⊖<sup>u</sup>, are symmetric. Formally, (w, i) ⊨ ◯<sup>d</sup>ϕ iff (w, i+1) ⊨ ϕ and i ⋖ (i+1) or i ≐ (i+1), and (w, i) ⊨ ⊖<sup>d</sup>ϕ iff (w, i−1) ⊨ ϕ and (i−1) ⋖ i or (i−1) ≐ i. Substitute ⋖ with ⋗ to obtain the semantics for ◯<sup>u</sup> and ⊖<sup>u</sup>. E.g., we can write ◯<sup>d</sup>**call** to say that the next position is an inner call (it

**Fig. 4.** The ST corresponding to the word of Fig. 3. Dots are internal nodes.

holds in pos. 2, 3, 4 of Fig. 3), ⊖<sup>d</sup>**call** to say that the previous position is a **call**, and the current one is the first of the body of a function (pos. 2, 4, 5) or the **ret** of an empty one (pos. 8, 10), and ⊖<sup>u</sup>**call** to say that the current position terminates an empty function frame (it holds in 6, 8, 10). In pos. 2, ◯<sup>d</sup>pB holds, but ◯<sup>u</sup>pB does not.
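As a sanity check of this semantics, the next/back operators over structural labels can be evaluated directly from the OPM (our illustration, with assumed names; the word is the one of Fig. 3 and the M**call** entries are transcribed from Fig. 1):

```python
COLS = ["call", "ret", "han", "exc"]
ROWS = {"call": "<=<>", "ret": ">>>>", "han": "<><=", "exc": ">>>>"}
WORD = ["call", "han", "call", "call", "call",
        "exc", "call", "ret", "call", "ret", "ret"]   # Fig. 3, pos. 1..11

def prec(a, b):
    if a == "#":
        return "=" if b == "#" else "<"
    return ">" if b == "#" else ROWS[a][COLS.index(b)]

def sym(i):
    """Structural label at position i (0 and 12 hold the # delimiters)."""
    return "#" if i in (0, len(WORD) + 1) else WORD[i - 1]

# (w, i) |= next^d p / next^u p / back^d p / back^u p, for p a structural label
def next_d(i, p): return sym(i + 1) == p and prec(sym(i), sym(i + 1)) in "<="
def next_u(i, p): return sym(i + 1) == p and prec(sym(i), sym(i + 1)) in ">="
def back_d(i, p): return sym(i - 1) == p and prec(sym(i - 1), sym(i)) in "<="
def back_u(i, p): return sym(i - 1) == p and prec(sym(i - 1), sym(i)) in ">="
```

Over positions 1–11 this reproduces the examples in the text: ◯<sup>d</sup>**call** holds exactly at pos. 2, 3, 4; ⊖<sup>d</sup>**call** at 2, 4, 5, 8, 10; and ⊖<sup>u</sup>**call** at 6, 8, 10.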

**b) Chain Next/Back Operators.** The *chain* next and back operators χ<sup>t</sup><sub>F</sub> and χ<sup>t</sup><sub>P</sub>, t ∈ {d, u}, evaluate their argument respectively on future and past positions in the chain relation with the current one. The *downward* (resp. *upward*) variant only considers chains whose right context goes down (resp. up) in the ST. E.g., in pos. 1 of Fig. 3, χ<sup>d</sup><sub>F</sub> pErr holds because χ(1, 7) and χ(1, 9), meaning that pA calls pErr at least once. Formally, (w, i) ⊨ χ<sup>d</sup><sub>F</sub> ϕ iff there exists a position j > i such that χ(i, j), i ⋖ j or i ≐ j, and (w, j) ⊨ ϕ; (w, i) ⊨ χ<sup>d</sup><sub>P</sub> ϕ iff there exists a position j < i such that χ(j, i), j ⋖ i or j ≐ i, and (w, j) ⊨ ϕ. Replace ⋖ with ⋗ for the upward versions. In Fig. 3, χ<sup>u</sup><sub>F</sub> **exc** is true in **call** positions whose procedure is terminated by an exception thrown by an inner procedure (e.g. pos. 3 and 4). χ<sup>u</sup><sub>P</sub> **call** is true in **exc** statements that terminate at least one procedure other than the one raising it, such as the one in pos. 6. χ<sup>d</sup><sub>F</sub> **ret** and χ<sup>u</sup><sub>F</sub> **ret** hold in **call**s to non-empty procedures that terminate normally, and not due to an uncaught exception (e.g., pos. 1).

**c) Until/Since Operators.** POTL has two kinds of until and since operators. They express properties on paths, which are sequences of positions obtained by iterating the different kinds of next or back operators. In general, a *path* of length <sup>n</sup> <sup>∈</sup> <sup>N</sup> between i, j <sup>∈</sup> <sup>U</sup> is a sequence of positions <sup>i</sup> <sup>=</sup> <sup>i</sup><sup>1</sup> < i<sup>2</sup> <sup>&</sup>lt; ··· < i<sup>n</sup> <sup>=</sup> <sup>j</sup>. The *until* operator on a set of paths Γ is defined as follows: for any word w and position <sup>i</sup> <sup>∈</sup> <sup>U</sup>, and for any two POTL formulas <sup>ϕ</sup> and <sup>ψ</sup>, (w, i) <sup>|</sup><sup>=</sup> <sup>ϕ</sup> <sup>U</sup>(Γ) <sup>ψ</sup> iff there exist a position <sup>j</sup> <sup>∈</sup> <sup>U</sup>, <sup>j</sup> <sup>≥</sup> <sup>i</sup>, and a path <sup>i</sup><sup>1</sup> < i<sup>2</sup> <sup>&</sup>lt; ··· < i<sup>n</sup> between <sup>i</sup> and <sup>j</sup> in <sup>Γ</sup> such that (w, ik) <sup>|</sup><sup>=</sup> <sup>ϕ</sup> for any 1 <sup>≤</sup> k<n, and (w, in) <sup>|</sup><sup>=</sup> <sup>ψ</sup>. *Since* operators are defined symmetrically. Note that, depending on Γ, a path from i to j may not exist. We define until/since operators by associating them with different sets of paths.

The *summary* until ψ U<sup>t</sup><sub>χ</sub> θ (resp. since ψ S<sup>t</sup><sub>χ</sub> θ) operator is obtained by inductively applying the ◯<sup>t</sup> and χ<sup>t</sup><sub>F</sub> (resp. ⊖<sup>t</sup> and χ<sup>t</sup><sub>P</sub>) operators. It holds in a position in which either θ holds, or ψ holds together with ◯<sup>t</sup>(ψ U<sup>t</sup><sub>χ</sub> θ) (resp. ⊖<sup>t</sup>(ψ S<sup>t</sup><sub>χ</sub> θ)) or χ<sup>t</sup><sub>F</sub>(ψ U<sup>t</sup><sub>χ</sub> θ) (resp. χ<sup>t</sup><sub>P</sub>(ψ S<sup>t</sup><sub>χ</sub> θ)). It is an until operator on paths that can move not only between consecutive positions, but also between the contexts of a chain, skipping its body. With the OPM of Fig. 1, this means skipping function bodies. The downward variants can move between positions at the same level in the ST (i.e., in the same simple chain body), or down in the nested chain structure. The upward ones remain at the same level, or move to higher levels of the ST.

Formula ⊤ U<sup>u</sup><sub>χ</sub> **exc** is true in positions contained in the frame of a function that is terminated by an exception. It is true in pos. 3 of Fig. 3 because of path 3-6, and false in pos. 1, because no path can enter the chain whose contexts are pos. 1 and 11. Formula ⊤ U<sup>d</sup><sub>χ</sub> **exc** is true in **call** positions whose function frame contains **exc**s, but is not necessarily terminated by one of them, such as the one in pos. 1 (with path 1-2-6).
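The inductive characterization above suggests a direct recursive evaluator. The sketch below (our illustration; the χ pairs and the M**call** entries for the word of Fig. 3 are taken as computed in Sect. 2) checks the two formulas just discussed:

```python
COLS = ["call", "ret", "han", "exc"]
ROWS = {"call": "<=<>", "ret": ">>>>", "han": "<><=", "exc": ">>>>"}
WORD = ["call", "han", "call", "call", "call",
        "exc", "call", "ret", "call", "ret", "ret"]      # Fig. 3
CHI = {(1, 7), (1, 9), (1, 11), (2, 6), (3, 6), (4, 6)}  # chain relation

def prec(a, b):
    if a == "#":
        return "=" if b == "#" else "<"
    return ">" if b == "#" else ROWS[a][COLS.index(b)]

def sym(i):
    return "#" if i in (0, len(WORD) + 1) else WORD[i - 1]

def summary_until(i, psi, theta, down):
    """psi U_chi^t theta, via the expansion: theta holds now, or psi holds and
    the until holds after a next step or a chain-next step of polarity t."""
    ok = "<=" if down else ">="
    if theta(i):
        return True
    if not psi(i) or i > len(WORD):
        return False
    succs = [j for (h, j) in CHI if h == i and prec(sym(i), sym(j)) in ok]
    if prec(sym(i), sym(i + 1)) in ok:
        succs.append(i + 1)                  # plain next step
    return any(summary_until(j, psi, theta, down) for j in succs)

top = lambda i: True
exc = lambda i: sym(i) == "exc"
```

The upward formula succeeds at pos. 3 through the chain jump χ(3, 6), but fails at pos. 1, whose only upward successor is pos. 11; the downward one succeeds at pos. 1 through the path 1-2-6.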

We define *Downward Summary Paths* (DSP) as follows. Given an OP word <sup>w</sup>, and two positions <sup>i</sup> <sup>≤</sup> <sup>j</sup> in <sup>w</sup>, the DSP between <sup>i</sup> and <sup>j</sup>, if it exists, is a sequence of positions <sup>i</sup> <sup>=</sup> <sup>i</sup><sup>1</sup> < i<sup>2</sup> <sup>&</sup>lt; ··· < i<sup>n</sup> <sup>=</sup> <sup>j</sup> such that, for each 1 <sup>≤</sup> p<n,

$$i\_{p+1} = \begin{cases} k & \text{if } k = \max\{h \mid h \le j \wedge \chi(i\_p, h) \wedge (i\_p \lessdot h \vee i\_p \doteq h)\} \text{ exists;}\\ i\_p + 1 & \text{otherwise, if } i\_p \lessdot (i\_p + 1) \text{ or } i\_p \doteq (i\_p + 1). \end{cases}$$

The Downward Summary (DS) until and since operators U<sup>d</sup><sub>χ</sub> and S<sup>d</sup><sub>χ</sub> use as Γ the set of DSP starting in the position in which they are evaluated. The definition of their upward counterparts is, again, obtained by substituting ⋖ with ⋗. In Fig. 3, **call** U<sup>d</sup><sub>χ</sub> (**ret** ∧ pErr) holds in pos. 1 because of paths 1-7-8 and 1-9-10, (**call** ∨ **exc**) S<sup>u</sup><sub>χ</sub> pB holds in pos. 7 because of path 3-6-7, and (**call** ∨ **exc**) U<sup>u</sup><sub>χ</sub> **ret** holds in pos. 3 because of path 3-6-7-8.
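The DSP successor equation transcribes almost literally into code (our illustrative sketch; the χ pairs and M**call** entries for the word of Fig. 3 are taken as computed in Sect. 2):

```python
COLS = ["call", "ret", "han", "exc"]
ROWS = {"call": "<=<>", "ret": ">>>>", "han": "<><=", "exc": ">>>>"}
WORD = ["call", "han", "call", "call", "call",
        "exc", "call", "ret", "call", "ret", "ret"]      # Fig. 3
CHI = {(1, 7), (1, 9), (1, 11), (2, 6), (3, 6), (4, 6)}

def prec(a, b):
    if a == "#":
        return "=" if b == "#" else "<"
    return ">" if b == "#" else ROWS[a][COLS.index(b)]

def sym(i):
    return "#" if i in (0, len(WORD) + 1) else WORD[i - 1]

def dsp(i, j):
    """The downward summary path between i and j, or None if there is none:
    prefer the farthest chain jump chi(p, h) with h <= j, else step to p+1."""
    path = [i]
    while path[-1] != j:
        p = path[-1]
        jumps = [h for (a, h) in CHI
                 if a == p and h <= j and prec(sym(p), sym(h)) in "<="]
        if jumps:
            path.append(max(jumps))           # the k of the first case
        elif p + 1 <= j and prec(sym(p), sym(p + 1)) in "<=":
            path.append(p + 1)                # the second case
        else:
            return None                       # no DSP between i and j
    return path
```

`dsp(1, 8)` and `dsp(1, 10)` return the paths 1-7-8 and 1-9-10 used in the example above, skipping the function bodies through the chain jumps from pos. 1.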

**d) Hierarchical Operators.** A single position may be the left or right context of multiple chains. The operators seen so far cannot take this fact into account, since they "forget" about a left context when they jump to the right one. Thus, we introduce the *hierarchical* next and back operators. The *upward* hierarchical next (resp. back), ◯<sup>u</sup><sub>H</sub> ψ (resp. ⊖<sup>u</sup><sub>H</sub> ψ), is true iff the current position j is the right context of a chain whose left context is i, with i ⋖ j, and ψ holds in the next (resp. previous) position j′ that is the right context of i, with i ⋖ j′. So, ◯<sup>u</sup><sub>H</sub> pErr holds in pos. 7 of Fig. 3 because pErr holds in 9, and ⊖<sup>u</sup><sub>H</sub> pErr holds in 9 because pErr holds in 7. In the ST, ◯<sup>u</sup><sub>H</sub> goes *up* between **call**s to pErr, while ⊖<sup>u</sup><sub>H</sub> goes down. Their *downward* counterparts behave symmetrically, and consider multiple inner chains sharing their right context. The downward ones are formally defined as:


– $(w, i) \models \bigcirc^d\_H \varphi$ iff there exist a position $h > i$ s.t. $\chi(i, h)$ and $i \gtrdot h$, and a position $j = \min\{k \mid i < k \wedge \chi(k, h) \wedge k \gtrdot h\}$, and $(w, j) \models \varphi$;
– $(w, i) \models \ominus^d\_H \varphi$ iff there exist a position $h > i$ s.t. $\chi(i, h)$ and $i \gtrdot h$, and a position $j = \max\{k \mid k < i \wedge \chi(k, h) \wedge k \gtrdot h\}$, and $(w, j) \models \varphi$.

In the ST of Fig. 4, $\bigcirc^d\_H$ and $\ominus^d\_H$ go *down* and *up* among **call**s terminated by the same **exc**. For example, in pos. 3 $\bigcirc^d\_H\, p\_C$ holds, because both pos. 3 and 4 are in the chain relation with 6. Similarly, in pos. 4 $\ominus^d\_H\, p\_B$ holds. Note that these operators do not consider leftmost/rightmost contexts, so $\bigcirc^u\_H\, \mathbf{ret}$ is false in pos. 9, as $\mathbf{call} \doteq \mathbf{ret}$, and pos. 11 is the rightmost context of pos. 1.

The hierarchical until and since operators are defined by iterating these next and back operators. The upward hierarchical path (UHP) between $i$ and $j$ is a sequence of positions $i = i\_1 < i\_2 < \dots < i\_n = j$ such that there exists a position $h < i$ such that for each $1 \le p \le n$ we have $\chi(h, i\_p)$ and $h \lessdot i\_p$, and for each $1 \le q < n$ there exists no position $k$ such that $i\_q < k < i\_{q+1}$ and $\chi(h, k)$. The until and since operators based on the set of UHPs starting in the position in which they are evaluated are denoted $\mathcal{U}\_H^u$ and $\mathcal{S}\_H^u$. E.g., $\mathbf{call}\; \mathcal{U}\_H^u\; \mathrm{pErr}$ holds in pos. 7 because of the singleton path 7 and path 7-9, and $\mathbf{call}\; \mathcal{S}\_H^u\; \mathrm{pErr}$ in pos. 9 because of paths 9 and 7-9.

The downward hierarchical path (DHP) between $i$ and $j$ is a sequence of positions $i = i\_1 < i\_2 < \dots < i\_n = j$ such that there exists a position $h > j$ such that for each $1 \le p \le n$ we have $\chi(i\_p, h)$ and $i\_p \gtrdot h$, and for each $1 \le q < n$ there exists no position $k$ such that $i\_q < k < i\_{q+1}$ and $\chi(k, h)$. The until and since operators based on the set of DHPs starting in the position in which they are evaluated are denoted $\mathcal{U}\_H^d$ and $\mathcal{S}\_H^d$. In Fig. 3, $\mathbf{call}\; \mathcal{U}\_H^d\; p\_C$ holds in pos. 3, and $\mathbf{call}\; \mathcal{S}\_H^d\; p\_B$ in pos. 4, both because of path 3-4.
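All of these until and since operators share the same path-based pattern: fix a set of paths starting at the current position (DSPs, UHPs, or DHPs), and require the second argument at the end of some path whose earlier positions all satisfy the first. A minimal sketch of this shared semantics, with positions as integers and paths as lists (the path sets themselves would be produced by definitions like those above):

```python
def until_on_paths(paths, holds_phi, holds_theta):
    """Generic path-based until: true iff some path i1 < ... < in in `paths`
    satisfies theta at its last position and phi at all earlier ones."""
    return any(
        holds_theta(p[-1]) and all(holds_phi(q) for q in p[:-1])
        for p in paths)

def eventually_on_paths(paths, holds_phi):
    # An "eventually" variant: Top U phi, i.e., no constraint on the prefix.
    return until_on_paths(paths, lambda _: True, holds_phi)
```

Since operators are obtained symmetrically, with paths ending (rather than starting) at the current position.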

The POTL until and since operators enjoy expansion laws similar to those of LTL. Here we give those for two until operators, those for their since and downward counterparts being symmetric.

$$\begin{aligned} \varphi \mathcal{U}\_\chi^t \psi &\equiv \psi \vee \left( \varphi \wedge \left( \bigcirc^t \left( \varphi \mathcal{U}\_\chi^t \psi \right) \vee \chi\_F^t (\varphi \mathcal{U}\_\chi^t \psi) \right) \right) \\ \varphi \mathcal{U}\_H^u \psi &\equiv \left( \psi \wedge \chi\_P^d \top \wedge \neg \chi\_P^u \top \right) \vee \left( \varphi \wedge \bigcirc\_H^u (\varphi \mathcal{U}\_H^u \psi) \right) \end{aligned}$$

#### **3.1 Expressiveness of POTL**

We first define some derived operators. For $t \in \{d, u\}$, we define the downward/upward summary *eventually* as $\Diamond^t \varphi := \top\; \mathcal{U}\_\chi^t\; \varphi$, and the downward/upward summary *globally* as $\Box^t \varphi := \neg \Diamond^t (\neg\varphi)$. $\Diamond^u \varphi$ and $\Box^u \varphi$ resp. say that $\varphi$ holds in one or all positions in the path from the current position to the root of the ST. $\Diamond^d \varphi$ says that $\varphi$ holds in at least one position in the current subtree, and $\Box^d \varphi$ in all of them. E.g., if $\Box^d(\neg p\_A)$ holds in a **call**, it means that $p\_A$ never holds in its whole function body, which is the subtree rooted next to the **call**.

In the technical report, we prove

#### **Theorem 1 (**[23]**).** *POTL = FOL with one free variable on OP words.*

Equivalence to FOL on the relevant algebraic structure is a desirable feature of linear-time temporal logics, and it was proved for LTL [39] and NWTL [2]. It is in some sense a theoretical assurance of the sufficient expressive power of the logic. Moreover, NWTL ⊂ OPTL was proved in [22], and OPTL ⊆ POTL comes from Theorem 1 and the semantics of OPTL being expressible in FOL. In [23], we also prove that there exist POTL formulas not expressible in OPTL. Thus, we can claim CaRet [6] ⊆ NWTL ⊂ OPTL ⊂ POTL. One such formula is $\Diamond^d p\_A$, which, evaluated e.g. on a **han** position with a matched **exc**, states that $p\_A$ holds in one of the positions in the same subtree.

More importantly, POTL can express many useful requirements of procedural programs. To emphasize its potential practical applications in automatic verification, we give a few examples of typical program properties expressed as POTL formulas, not all of which are expressible in the other languages mentioned above.

The LTL *globally* can be written as $\Box \psi := \neg \Diamond^u (\Diamond^d \neg\psi)$. The two nested eventually operators enumerate all future positions by going up and then down in any direction in the syntax tree: when negated, this means $\neg\psi$ never holds in any of them. POTL can express Hoare-style pre/postconditions with formulas such as $\Box(\mathbf{call} \wedge \rho \implies \chi\_F^d (\mathbf{ret} \wedge \theta))$, where $\rho$ is the precondition and $\theta$ is the postcondition.

Unlike NWTL, POTL can easily express properties related to exception handling and interrupt management [43]. E.g., the shortcut $\mathrm{CallThr}(\psi) := \bigcirc^u(\mathbf{exc} \wedge \psi) \vee \chi\_F^u (\mathbf{exc} \wedge \psi)$, evaluated in a **call**, states that the procedure currently started is terminated by an **exc** in which $\psi$ holds. So, $\Box(\mathbf{call} \wedge \rho \wedge \mathrm{CallThr}(\top) \implies \mathrm{CallThr}(\theta))$ means that if precondition $\rho$ holds when a procedure is called, then postcondition $\theta$ must hold if that procedure is terminated by an exception. In object-oriented programming languages, if $\rho \equiv \theta$ is a class invariant asserting that a class instance's state is valid, this formula expresses *weak exception safety* [1], and *strong exception safety* if $\rho$ and $\theta$ express particular states of the class instance. The *no-throw guarantee* can be stated with $\Box(\mathbf{call} \wedge p\_A \implies \neg\mathrm{CallThr}(\top))$, meaning procedure $p\_A$ is never interrupted by an exception.

*Stack inspection* [29,37], i.e. properties regarding the sequence of procedures active on the program's stack at a certain point of its execution, is an important class of requirements that can be expressed with the shortcut $\mathrm{Scall}(\varphi, \psi) := (\mathbf{call} \implies \varphi)\; \mathcal{S}\_\chi^d\; (\mathbf{call} \wedge \psi)$, which subsumes the *call since* of CaRet, as it also works with exceptions. E.g., $\Box\big((\mathbf{call} \wedge p\_B \wedge \mathrm{Scall}(\top, p\_A)) \implies \mathrm{CallThr}(\top)\big)$ means that whenever $p\_B$ is executed and at least one instance of $p\_A$ is on the stack, $p\_B$ is terminated by an exception. The OPA of Fig. 2 satisfies this formula, because $p\_B$ is called by $p\_A$, and $p\_C$ throws.

### **4 Model Checking**

Given an OP alphabet $(\mathcal{P}(AP), M\_{AP})$, where $AP$ is a finite set of atomic propositions, and a POTL formula $\varphi$, we build an OPA $\mathcal{A}\_\varphi = (\mathcal{P}(AP), M\_{AP}, Q, I, F, \delta)$ that accepts models of $\varphi$. The construction of $\mathcal{A}\_\varphi$ resembles the classical one for LTL and the ones for NWTL and OPTL, diverging from them significantly when dealing with temporal obligations that involve positions in the chain relation.

We first introduce $\mathrm{Cl}(\varphi)$, the *closure* of $\varphi$, containing all subformulas of $\varphi$, and some auxiliary operators. The latter are needed to model-check chain next and back operators. For any PR $\pi \in \{\lessdot, \doteq, \gtrdot\}$, we define them as follows: $(w, i) \models \chi\_F^\pi \varphi$ iff there exists $j > i$ such that $\chi(i, j)$, $i \mathbin{\pi} j$, and $(w, j) \models \varphi$; $(w, i) \models \chi\_P^\pi \varphi$ iff there exists $j < i$ such that $\chi(j, i)$, $j \mathbin{\pi} i$, and $(w, j) \models \varphi$.

Cl(ϕ) is the smallest set such that, for <sup>t</sup> ∈ {d, u}:


The set Atoms(ϕ) contains all consistent subsets of Cl(ϕ), i.e. all <sup>Φ</sup> <sup>⊆</sup> Cl(ϕ) s.t.


The consistency constraints on Atoms(ϕ) will be augmented incrementally in the following, for each operator.
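To make the construction concrete, here is a small Python sketch of how a closure set and its consistent subsets could be enumerated. It covers only a simplified fragment (atomic propositions, negation, conjunction) with a hypothetical tuple encoding of formulas; the operator cases and consistency constraints of the full construction would be added in the same style, and this is not POMC's actual code.

```python
from itertools import combinations

# Simplified formula encoding: ('ap', name), ('not', f), ('and', f, g).
# The full construction also handles the next, chain, and until operators.

def closure(phi):
    """All subformulas of phi, closed under (single) negation."""
    cl = set()
    def visit(f):
        if f in cl:
            return
        cl.add(f)
        for sub in f[1:]:
            if isinstance(sub, tuple):
                visit(sub)
    visit(phi)
    for f in list(cl):
        if f[0] != 'not':
            cl.add(('not', f))  # close under negation, avoiding double negations
    return cl

def atoms(phi):
    """All consistent subsets of closure(phi)."""
    cl = sorted(closure(phi))
    result = []
    for r in range(len(cl) + 1):
        for subset in combinations(cl, r):
            s = set(subset)
            consistent = all(
                # no formula together with its negation, and a conjunction is
                # present exactly when both its conjuncts are
                (('not', f) not in s or f not in s) and
                (f[0] != 'and' or ((f in s) == (f[1] in s and f[2] in s)))
                for f in cl)
            if consistent:
                result.append(frozenset(s))
    return result
```

Being exponential in $|\mathrm{Cl}(\varphi)|$, explicit enumeration like this is only feasible for small formulas; an implementation would generate atoms on the fly.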

The set of states of $\mathcal{A}\_\varphi$ is $Q = \mathrm{Atoms}(\varphi)^2$, and its elements, which we denote with Greek capital letters, are of the form $\Phi = (\Phi\_c, \Phi\_p)$, where $\Phi\_c$ is the set of formulas that hold in the current position, and $\Phi\_p$ is the set of temporal obligations. The latter keep track of arguments of temporal operators that must be satisfied after a chain body, skipping it. The way they do so depends on the transition relation $\delta$, which we also define incrementally. Each automaton state is associated to word positions. So, for $(\Phi, a, \Psi) \in \delta\_{push/shift}$, with $\Phi \in \mathrm{Atoms}(\varphi)^2$ and $a \in \mathcal{P}(AP)$, we have $\Phi\_c \cap AP = a$ (by $\Phi\_c \cap AP$ we mean the set of atomic propositions in $\Phi\_c$). *Pop* moves do not read input symbols, and the automaton remains at the same position when performing them: for any $(\Phi, \Theta, \Psi) \in \delta\_{pop}$ we impose $\Phi\_c = \Psi\_c$. The initial set $I$ contains states of the form $(\Phi\_c, \Phi\_p)$ with $\varphi \in \Phi\_c$, and the final set $F$ states of the form $(\Psi\_c, \Psi\_p)$ s.t. $\Psi\_c \cap AP = \{\#\}$ and $\Psi\_c$ contains no future operators. We extend the construction to the most important operators, leaving the others and correctness proofs to [21].

**Next/Back Operators.** Let $(\Phi, a, \Psi) \in \delta\_{shift} \cup \delta\_{push}$, with $\Phi, \Psi \in \mathrm{Atoms}(\varphi)^2$, $a \in \mathcal{P}(AP)$, and let $b = \Psi\_c \cap AP$: we have $\bigcirc^d \psi \in \Phi\_c$ iff $\psi \in \Psi\_c$ and either $a \lessdot b$ or $a \doteq b$. The constraints introduced for the $\ominus^d$ operator are symmetric, and for their upward counterparts it suffices to replace $\lessdot$ with $\gtrdot$.

If $\chi\_F^d \psi \in \mathrm{Cl}(\varphi)$, for each $\Phi \in \mathrm{Atoms}(\varphi)^2$ we impose that $\chi\_F^d \psi \in \Phi\_c$ iff $\chi\_F^{\lessdot} \psi \in \Phi\_c$ or $\chi\_F^{\doteq} \psi \in \Phi\_c$. Analogous rules are defined for the upward and past chain operators. The auxiliary symbol $\chi\_L$ forces the current position to be the first one of a chain body. Let the current state of the OPA be $\Phi \in \mathrm{Atoms}(\varphi)^2$:


**Fig. 5.** Example accepting run of the automaton for $\chi\_F^d\, \mathbf{ret}$.

$\chi\_L \in \Phi\_p$ iff the next transition (i.e. the one reading the current position) is a push. Formally, if $(\Phi, a, \Psi) \in \delta\_{shift}$ or $(\Phi, \Theta, \Psi) \in \delta\_{pop}$, for any $\Phi, \Theta, \Psi$ and $a$, then $\chi\_L \notin \Phi\_p$. If $(\Phi, a, \Psi) \in \delta\_{push}$, then $\chi\_L \in \Phi\_p$. For any initial state $(\Phi\_c, \Phi\_p) \in I$, we have $\chi\_L \in \Phi\_p$ iff $\# \notin \Phi\_c$.

If $\chi\_F^{\doteq} \psi \in \mathrm{Cl}(\varphi)$, its satisfaction is ensured by the following constraints on $\delta$:


We illustrate how the construction works for $\chi\_F^{\doteq}$ with the example of Fig. 5. The OPA starts in state $\Phi^0$, with $\chi\_F^d\, \mathbf{ret} \in \Phi^0\_c$, and guesses that $\chi\_F^d$ will be fulfilled by $\chi\_F^{\doteq}$, so $\chi\_F^{\doteq}\, \mathbf{ret} \in \Phi^0\_c$. **call** is read by a push move, resulting in state $\Phi^1$. The OPA guesses the next move will be a push, so $\chi\_L \in \Phi^1\_p$. By rule 1, we have $\chi\_F^{\doteq}\, \mathbf{ret} \in \Phi^1\_p$. The last guess is immediately verified by the next push (step 2–3). Thus, the pending obligation for $\chi\_F^{\doteq}\, \mathbf{ret}$ is stored onto the stack in $\Phi^1$. The OPA, then, reads **exc** with a shift, and pops the stack symbol containing $\Phi^1$ (step 4–5). By rule 2, the temporal obligation is resumed in the next state $\Phi^4$, so $\chi\_F^{\doteq}\, \mathbf{ret} \in \Phi^4\_p$. Finally, **ret** is read by a shift which, by rule 3, may occur only if $\mathbf{ret} \in \Phi^4\_c$. Rule 3 verifies the guess that $\chi\_F^{\doteq}\, \mathbf{ret}$ holds in $\Phi^0$, and fulfills the temporal obligation contained in $\Phi^4\_p$, by preventing computations in which $\mathbf{ret} \notin \Phi^4\_c$ from continuing. Had the next transition been a pop (e.g. because there was no **ret** and $\mathbf{call} \gtrdot \#$), the run would have been blocked by rule 2, preventing the OPA from reaching an accepting state and from emptying the stack.

**Summary Until and Since.** The construction for these operators is based on their expansion laws. For any $\Phi \in \mathrm{Atoms}(\varphi)^2$, we have $\psi\; \mathcal{U}\_\chi^t\; \theta \in \Phi\_c$, with $t \in \{d, u\}$ being a direction, iff either: 1. $\theta \in \Phi\_c$; 2. $\bigcirc^t(\psi\, \mathcal{U}\_\chi^t\, \theta), \psi \in \Phi\_c$; or 3. $\chi\_F^t(\psi\, \mathcal{U}\_\chi^t\, \theta), \psi \in \Phi\_c$. The rules for since are symmetric.
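This local constraint translates almost literally into code. The sketch below checks one direction of the "iff" for a candidate atom, using a hypothetical tuple encoding of formulas: `('until', t, psi, theta)` for the summary until, with `('next', t, f)` and `('chainnext', t, f)` standing for $\bigcirc^t f$ and $\chi\_F^t f$.

```python
def until_locally_consistent(phi_c):
    """Check that every summary-until formula in the atom phi_c is justified
    by one of the three cases of its expansion law."""
    for f in phi_c:
        if f[0] == 'until':
            _, t, psi, theta = f
            ok = (theta in phi_c                                       # case 1
                  or (psi in phi_c and ('next', t, f) in phi_c)        # case 2
                  or (psi in phi_c and ('chainnext', t, f) in phi_c))  # case 3
            if not ok:
                return False
    return True
```

In the actual construction this check is imposed as a consistency constraint on Atoms(φ), together with its converse direction.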

**Hierarchical Operators.** For the hierarchical operators, we do not give an explicit OPA construction, but rely on a translation into other POTL operators. For each hierarchical operator $\eta$ in $\varphi$, we add a propositional symbol $\mathbf{q}\_{(\eta)}$. The upward hierarchical operators consider the right contexts of chains sharing the same left context. To distinguish such positions, we define formula $\gamma\_{L,\eta} := \chi\_{P}^{\lessdot}\big(\mathbf{q}\_{(\eta)} \wedge \bigcirc\Box(\neg \mathbf{q}\_{(\eta)}) \wedge \ominus\boxminus(\neg \mathbf{q}\_{(\eta)})\big)$, where $\Box$ and $\boxminus$ are as in Sect. 3.1. $\bigcirc$ and $\ominus$ are the LTL next and back operators, for which model checking can be done as for $\bigcirc^d$ and $\ominus^d$, but removing the restrictions on PRs. $\gamma\_{L,\eta}$, evaluated on a position $i$, asserts that $\mathbf{q}\_{(\eta)}$ holds in the unique position $h$ such that $\chi(h, i)$ and $h \lessdot i$. Thus, $\mathbf{q}\_{(\eta)}$ can be used to distinguish other positions $j$ such that $\chi(h, j)$ and $h \lessdot j$, as $\chi\_{P}^{\lessdot} \mathbf{q}\_{(\eta)}$ holds in them. The translations for future upward hierarchical operators follow, the others being analogous.

$$\begin{aligned} \bigcirc\_{H}^{u} \psi &:= \gamma\_{L, \bigcirc\_{H}^{u} \psi} \wedge \bigcirc \big( (\neg \chi\_{P}^{\lessdot} \mathbf{q}\_{(\bigcirc\_{H}^{u} \psi)}) \mathcal{U}\_{\chi}^{u} \left( \chi\_{P}^{\lessdot} \mathbf{q}\_{(\bigcirc\_{H}^{u} \psi)} \wedge \psi \right) \big) \\ \psi \mathcal{U}\_{H}^{u} \theta &:= \gamma\_{L, \psi \mathcal{U}\_{H}^{u} \theta} \wedge \left( \chi\_{P}^{\lessdot} \mathbf{q}\_{(\psi \mathcal{U}\_{H}^{u} \theta)} \implies \psi \right) \mathcal{U}\_{\chi}^{u} \left( \chi\_{P}^{\lessdot} \mathbf{q}\_{(\psi \mathcal{U}\_{H}^{u} \theta)} \wedge \theta \right) \end{aligned}$$

#### **4.1 Model Checking for** *ω***-Words**

To perform model checking of a POTL formula $\varphi$ on OP $\omega$-words, we build a generalized $\omega$OPBA $\mathcal{A}^\omega\_\varphi = (\mathcal{P}(AP), M\_{AP}, Q\_\omega, I, \mathbf{F}, \delta)$, where $Q\_\omega = \mathrm{Atoms}(\varphi)^2 \times \mathcal{P}(\mathrm{Cl}\_{stack}(\varphi))$, which differs from the finite-word OPA only in the state set and the acceptance condition. As in [2], the generalized Büchi acceptance condition is a slight variation on the one shown in Sect. 2.1: $\mathbf{F}$ is the set of sets of Büchi final states, and an $\omega$-word is accepted iff at least one state from each of the sets contained in $\mathbf{F}$ is visited infinitely often during the computation.
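On an ultimately periodic run, this acceptance condition is easy to state: the states visited infinitely often are exactly those on the cycle. A toy check on an explicit lasso, given as a prefix and a cycle, follows; real emptiness checking works on the fly, without materializing runs.

```python
def accepts_lasso(prefix, cycle, acceptance_sets):
    """Generalized Buchi acceptance for a run of the form prefix . cycle^omega:
    every acceptance set must intersect the set of states occurring in the
    cycle, i.e., the states visited infinitely often."""
    inf = set(cycle)  # states in `prefix` are visited only finitely often
    return all(inf & f for f in acceptance_sets)
```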

In finite words, the stack is empty at the end of every accepting computation, which implies the satisfaction of all temporal constraints tracked by the pending part of stack symbols. In $\omega$OPBAs, the stack may never be empty, and symbols with a non-empty pending part may remain in it indefinitely, never enforcing the satisfaction of the respective formulas. To overcome this issue, we use $\mathrm{Atoms}(\varphi)^2 \times \mathcal{P}(\mathrm{Cl}\_{stack}(\varphi))$, with $\mathrm{Cl}\_{stack}(\varphi) \subseteq \mathrm{Cl}(\varphi)$, as the state set of the $\omega$OPBA. Such states have the form $\Phi = (\Phi\_c, \Phi\_p, \Phi\_s)$, where $\Phi\_c$ and $\Phi\_p$ have the same role as in the finite-word case, and $\Phi\_s$ is the *in-stack* part of $\Phi$. All rules previously defined for $\Phi\_c$ and $\Phi\_p$ remain the same. $\Phi\_s$ contains the elements of $\mathrm{Cl}\_{stack}(\varphi)$ contained in any symbol currently on the stack. $\mathrm{Cl}\_{stack}(\varphi)$ contains the formulas in $\mathrm{Cl}(\varphi)$ that use the stack to ensure the satisfaction of future temporal requirements, namely all $\chi\_F^\pi \psi \in \mathrm{Cl}(\varphi)$, with $\pi \in \{\lessdot, \doteq, \gtrdot\}$. Thus, pending temporal obligations are moved from the stack to the $\omega$OPBA state, and they can be considered by the Büchi acceptance condition.

Suppose we want to model check $\chi\_F^{\doteq} \psi$. Formula $\chi\_F^{\doteq} \psi$ must be inserted in the in-stack part of the current state whenever a stack symbol containing it in its pending part is pushed. It must be kept in the in-stack part of the current state until the last stack symbol containing it in its pending part is popped, marking


**Fig. 6.** Prefix of an accepting run of the automaton for $\chi\_F^d\, \mathbf{ret}$.

the satisfaction of its temporal requirement. Then, it is possible to define an acceptance set $F\_{\chi\_F^{\doteq} \psi} \in \mathbf{F}$ as the set of states not containing $\chi\_F^{\doteq} \psi$ in any part. Figure 6 shows an $\omega$OPBA run of this kind. Notice that after step 7, $\chi\_F^{\doteq} \psi$ does not appear in any state's in-stack part, so the run is accepting.

This construction is formalized as follows. Let $\psi \in \mathrm{Cl}\_{stack}(\varphi)$. We add a few constraints on the transition relations. For any $\Phi, \Theta, \Psi \in Q\_\omega$ and $a \in \mathcal{P}(AP)$:


An acceptance condition for summary until operators is also needed. For $\psi\; \mathcal{U}\_\chi^d\; \theta \in \mathrm{Cl}(\varphi)$, we add an acceptance set $\mathbf{F}\_{\psi \mathcal{U}\_\chi^d \theta}$ such that for any $\Phi$ in it we have $\chi\_F^{\lessdot}(\psi\, \mathcal{U}\_\chi^d\, \theta), \chi\_F^{\doteq}(\psi\, \mathcal{U}\_\chi^d\, \theta) \notin \Phi\_s$, and either $\psi\, \mathcal{U}\_\chi^d\, \theta \notin \Phi\_c$ or $\theta \in \Phi\_c$. The condition for $\psi\, \mathcal{U}\_\chi^u\, \theta$ is symmetric.

#### **4.2 Complexity**

The set $\mathrm{Cl}(\varphi)$ is linear in $|\varphi|$, the length of $\varphi$. $\mathrm{Atoms}(\varphi)$ has size at most $2^{|\mathrm{Cl}(\varphi)|} = 2^{O(|\varphi|)}$, and the size of the set of states is the square of that in the finite case, and is bounded by its cube in the $\omega$-case. Moreover, the use of the equivalences for the hierarchical operators causes only a linear increase in the length of $\varphi$. Therefore,

**Theorem 2.** *Given a POTL formula* $\varphi$*, it is possible to build an OPA or an* $\omega$*OPBA* $\mathcal{A}\_\varphi$ *accepting the language denoted by* $\varphi$ *with at most* $2^{O(|\varphi|)}$ *states.*

$\mathcal{A}\_\varphi$ can then be intersected [42] with an OPA/$\omega$OPBA modeling a program (e.g. Fig. 2), and emptiness can be decided with *summarization* techniques [4].


**Table 1.** Results of the evaluation. '# states' refers to the OPA to be verified.

# **5 Experimental Evaluation**

We implemented the OPA construction of Sect. 4 in an explicit-state model checking tool called POMC. The tool is written in Haskell [45], a purely functional, statically typed programming language with lazy evaluation. POMC checks OPAs for emptiness by searching for a reachable accepting configuration by means of a modified DFS of the transition relation. This algorithm, similar to the one in [9], exploits the fact that all transitions only consider the topmost stack symbol, so reachability is actually computed only for *semi-configurations* made of one stack symbol and one state. Each time a chain support is explored, its ending semi-configuration is saved and associated with the starting one, so the next time the latter is reached, the support does not have to be re-explored. This allows the algorithm to exploit the cyclic structure of OPAs and terminate after having explored the whole transition relation. Given a POTL specification $\varphi$ and an OPA $\mathcal{A}$ to be checked, POMC executes the reachability algorithm, generating the product between $\mathcal{A}$ and the OPA for $\neg\varphi$ on the fly. The present prototype of POMC only supports finite-word model checking; its extension to $\omega$-languages is under development.

With POMC, we checked several requirements on three case studies; the results are reported in Table 1. Some additional formulas we checked are in Table 2. These results can be reproduced through a publicly available artifact.<sup>2</sup> The experiments were executed on a laptop with a 2.2 GHz Intel processor and 15 GiB of RAM, running Ubuntu GNU/Linux 20.04. In the tables, by "Total" memory we mean the maximum resident memory including the Haskell runtime (which allocates 70 MiB by default), and by "MC only" the maximum memory used by model checking as reported by the runtime. Since model checking is polynomial in OPA size and exponential in formula length, we focus on checking a variety of requirements, rather than large OPAs.

<sup>2</sup> https://doi.org/10.5281/zenodo.4723741.

**Generic Procedural Program.** We checked the formula

$$\Box\big((\mathbf{call} \wedge p\_B \wedge \mathrm{Scall}(\top, p\_A)) \implies \mathrm{CallThr}(\top)\big)$$

from Sect. 3.1 on the OPA of Fig. 2 (bench. 1), and also against two larger OPA (2, where the property does not hold, and 3, where it holds).

We also checked the largest of such OPA against a set of formulas devised with the purpose of testing all POTL operators. The results are reported in Table 2. All formulas are checked very quickly, with only one outlier that runs out of memory. We ran the same experiment on a machine with a 2.0 GHz AMD CPU and 512 GiB of RAM running Debian GNU/Linux 10, obtaining a time of 367 s with a memory occupancy of 16.3 GiB.

**Stack Inspection.** The security framework of the Java Development Kit (JDK) is based on stack inspection, i.e. the analysis of the contents of the program's stack during the execution. The JDK provides method checkPermission(perm) from class AccessController, which searches the stack for frames of functions that have not been granted permission perm. If any are found, an exception is thrown. Such permission checks prevent the execution of privileged code by unauthorized parts of the program, but they must be placed in sensitive points manually. Failure to place them appropriately may cause the unauthorized execution of privileged code. An automated tool to check that no code can escape such checks is thus desirable. Any such tool would need the ability to model exceptions, as they are used to avoid code execution in case of security violations.

[37] explains such needs by providing an example Java program for managing a bank account. It allows the user to check the account balance, and to withdraw money. To perform these tasks, the invoking program must have been granted permissions CanPay and Debit, respectively. We modeled this program as an OPA (4), and proved that it enforces these security measures effectively by checking it against the formula

$$
\Box(\mathsf{call}\land\mathsf{read}\implies\neg(\top\,\mathcal{S}^{d}\_{\chi}\,(\mathsf{call}\land\neg\mathsf{CanPay}\land\neg\mathsf{read})))
$$

meaning that the account balance cannot be read if some function in the stack lacks the CanPay permission (a similar formula checks the Debit permission).

**Exception Safety.** [53] is a tutorial on how to make exception-safe generic containers in C++. It presents two implementations of a generic stack data structure, parametric on the element type T. The first one is not exception-safe: if the constructor of T throws an exception during a pop action, the topmost element is removed, but it is not returned, and is thus lost. This violates the strong exception safety requirement that each operation be rolled back if an exception is thrown. The second version of the data structure satisfies this requirement.

While exception safety is, in general, undecidable, it is possible to prove the stronger requirement that each modification to the data structure is only committed once no more exceptions can be thrown. We modeled both versions as OPA, and checked this requirement with the following formula:

$$\Box\big(\mathbf{exc} \implies \neg((\ominus^{u}\, \mathrm{modified} \vee \chi\_{P}^{u}\, \mathrm{modified}) \wedge \chi\_{P}^{u}\, (\mathrm{Stack}{::}\mathrm{push} \vee \mathrm{Stack}{::}\mathrm{pop}))\big)$$

POMC successfully found a counterexample for the first implementation (5), and proved the safety of the second one (6).

Additionally, we proved that both implementations are *exception neutral* (7, 8), i.e. Stack functions do not block exceptions thrown by the underlying type T. This was accomplished by checking the following formula:

$$
\Box(\mathbf{exc} \wedge \ominus^{u} \mathbf{T} \wedge \chi\_{P}^{d}(\mathbf{han} \wedge \chi\_{P}^{d} \mathbf{Stack}) \implies \chi\_{P}^{d} \chi\_{P}^{d} \chi\_{P}^{u} \mathbf{exc}).
$$

**Table 2.** Results of the additional experiments on OPA "generic larger".


# **6 Conclusions**

We introduced the temporal logic POTL, gave an automata-theoretic model checking procedure, and implemented it in a prototype tool. The results obtained in its experimental evaluation are promising. Additionally, POTL is proved to be FO-complete in a technical report [23]. We argue that the strong gain in expressive power w.r.t. previous approaches to model checking context-free languages, which comes without an increase in computational complexity, is worth the technicalities needed to achieve the present—and future—results.

In the evaluation, we used models directly coded into OPAs. To ease user interaction with our tool, we additionally implemented a new input format based on a simple procedural language with exceptions and Boolean variables, which is automatically translated into OPA. Moreover, we are currently working on the implementation of the model checking for ω-words, described in Sect. 4.1.

As a future research step, we plan to develop user-friendly domain-specific languages for specification too, to prove that OP languages and logics are suitable in practice to program verification.

**Acknowledgments.** We are thankful to Davide Bergamaschi for developing an early POMC prototype, and to Francesco Pontiggia for implementing performance optimizations.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Model Checking** *ω***-Regular Properties with Decoupled Search**

Daniel Gnad<sup>1</sup>(B), Jan Eisenhut<sup>1</sup>, Alberto Lluch Lafuente<sup>2</sup>, and Jörg Hoffmann<sup>1</sup>

<sup>1</sup> Saarland University, Saarland Informatics Campus, Saarbrücken, Germany {gnad,hoffmann}@cs.uni-saarland.de, s8jaeise@stud.uni-saarland.de <sup>2</sup> Technical University of Denmark, Kongens Lyngby, Denmark albl@dtu.dk

**Abstract.** Decoupled search is a state space search method originally introduced in AI Planning. Similar to partial-order reduction methods, decoupled search exploits the independence of components to tackle the state explosion problem. Similar to symbolic representations, it does not construct the explicit state space, but sets of states are represented in a compact manner, exploiting component independence. Given the success of both partial-order reduction and symbolic representations when model checking liveness properties, our goal is to add decoupled search to the toolset of liveness checking methods. Specifically, we show how decoupled search can be applied to liveness verification for composed Büchi automata by adapting, and showing correct, a standard algorithm for detecting lassos (i.e., infinite accepting runs), namely nested depth-first search. We evaluate our approach using a prototype implementation.

# **1 Introduction**

Model checking is a well-known problem in formal verification. Given a formal description of a system M, the model checking problem is to decide whether the system satisfies a property φ. In contrast to safety properties, which can only express whether there exists a finite run of the system that reaches a state with certain (bad) properties, liveness properties can express good behaviours of the system that should occur repeatedly, i.e., infinite runs in which something good happens infinitely often.

In this work, we consider a liveness verification problem that arises when composing a set $\mathcal{A}\_1, \dots, \mathcal{A}\_n$ of *non-deterministic Büchi automata* (NBA), each with its own acceptance condition. We recall that an accepting run for a single NBA is a *lasso* $\rho\_p(\rho\_c)^\omega$ with a prefix $\rho\_p$ and a cycle $\rho\_c$ that visits an accepting state. For the composition of a set of NBAs into an NBA we consider the following liveness property: a composed run is accepting if there is a cycle visiting a state that is accepting for *all* components. Such a general problem captures standard liveness verification problems related to $\omega$-regular properties. An archetypal example is automata-based LTL checking, where system components are represented as NBAs and are composed with a property monitor, represented as a Büchi automaton (often the negation of an LTL property). In this case an accepting composed run witnesses a violation of a linear-time property.

The predominant approach to address such verification problems using explicit state space search is to use *nested depth-first search* (NDFS) algorithms [5,22,32],

also called *double depth-first search*, which perform on-the-fly checking of liveness properties while composing the NBAs. NDFS, like all state space search methods, suffers from the state explosion problem. Various methods, such as partial-order reduction [10,19,27,30,34], symbolic representations [2,28], symmetry reduction [7,23], or Petri-net unfolding [8,9] have been proposed to alleviate the state explosion problem. Here, we add *decoupled state space search* [14], *decoupled search* for short, as a new method for model checking liveness properties, complementary to the existing approaches. Indeed, as Gnad and Hoffmann [14,15] have shown, decoupled search complements these techniques in the sense that there exist cases where it yields exponentially stronger reductions. It has also been shown that decoupled search can be fruitfully combined with partial-order reduction [16], symmetry reduction [18], and symbolic search [17].

Decoupled search has recently been introduced in AI planning [14], addressing goal reachability problems. Its applicability to model checking of safety properties has been shown in [12], where it was effectively integrated into the SPIN model checker [20]. However, the extension of decoupled search to the cycle detection problems inherent to liveness model checking and NDFS algorithms has not been investigated before. This paper addresses that gap for the first time.

Decoupled search exploits the independence of system components, similar to partial-order reduction techniques, by not enumerating all interleavings of transitions across components. Similar to symbolic representations, decoupled search does not construct the explicit state space of the product. Instead, search nodes, called *decoupled states*, symbolically represent sets of states. Each decoupled state compactly represents many global states and their closure up to internal transitions of individual components. Similar to partial-order reduction or symbolic search, decoupled search can be exponentially more efficient than explicit search of the state space, as shown for reachability problems in the domains of AI planning [14] and model checking [12].

The main contribution of our paper is to extend the scope of decoupled search from safety properties, as done in [12], to liveness properties. In particular, we adapt a standard NDFS algorithm to the decoupled state representation. The resulting algorithms are able to solve the verification problem mentioned above, namely checking acceptance of composed NBAs. The main technical challenge for the correctness of our algorithms was to identify the conditions that imply existence of accepting runs in decoupled search and to show how such runs can be constructed efficiently.

We evaluate our decoupled NDFS algorithm using a prototype implementation on two showcase examples similar to the dining philosophers problem, and a set of randomly generated models. We compare to established tools, namely the SPIN model checker [20], and Petri-net unfolding with Cunf [30]. The results show that, as for safety properties, decoupled search can yield exponential advantages over state-of-the-art methods. In particular, its advantage grows with the degree to which components act independently of others, via internal transitions that do not affect other components.

The rest of the paper is structured as follows. We start in Sect. 2 by recalling the necessary background on NBAs, the verification problem we consider, and a standard NDFS algorithm typically used to solve the problem. Sections 3–5 present our contribution: Sect. 3 formalizes decoupled search in terms of composed NBAs, and shows its desired properties; Sect. 4 discusses some issues that would arise in a naïve attempt to (incorrectly) adapt it, and describes the (correct) adapted NDFS algorithm; Sect. 5 provides its correctness proof. In Sect. 6 we show our experimental evaluation, whose code and models are publicly available at [13]. Section 7 concludes the paper, discussing related work and future research avenues.

# **2 Büchi Automata, Composition and Verification**

This section recalls some basic notions of Büchi automata, their composition, the verification problem we consider in this paper for such composition, and its standard algorithmic resolution based on NDFS.

*Büchi Automata and Accepting Runs.* We start with the definition of non-deterministic Büchi automata (NBA).

**Definition 1 (Non-deterministic Büchi Automaton).** *A* non-deterministic Büchi automaton $\mathcal{A}$ *is a tuple* $\langle S, \rightarrow, L, s_0, A \rangle$*, where* $S$ *is a finite set of* states*,* $L$ *is a finite set of* transition labels*,* $\rightarrow \subseteq S \times L \times S$ *is a* transition relation*,* $s_0 \in S$ *is an* initial state*, and* $A \in (S \rightarrow \mathbb{B})$ *is an* acceptance function*.*

A run $\rho$ of an NBA is an infinite sequence of states $s_0, s_1, s_2, \dots \in S^\omega$ starting from the initial state. The $i$-th state of a run $\rho$ is denoted by $\rho[i]$, and we will use the same notation for other lists and sequences. A run $\rho$ is accepting if it traverses accepting states infinitely often; formally, $\overset{\infty}{\exists} j \in \mathbb{N} : A(\rho[j])$, i.e., there are infinitely many $j$ with $A(\rho[j]) = \top$. We define a *trace* $\pi$ of a run $\rho = s_0, s_1, s_2, \dots \in S^\omega$ as a sequence of labels $\pi = l_0, l_1, \dots \in L^\omega$ such that $\forall i \in \mathbb{N} : (s_i, l_i, s_{i+1}) \in \,\rightarrow$. We will also consider *finite* runs $\rho \in S^n$ and *finite* traces $\pi \in L^n$.
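Definition 1 and the run notions above can be made concrete in a small sketch (Python, with illustrative names that are ours, not the paper's): an NBA is a finite tuple, and a lasso $\rho_p(\rho_c)^\omega$ is an accepting run exactly when its cycle visits an accepting state.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NBA:
    states: frozenset
    labels: frozenset
    trans: frozenset      # transition relation: set of (s, label, s') triples
    init: object          # initial state s_0
    accepting: frozenset  # states s with A(s) = True

def lasso_is_accepting(nba, prefix, cycle):
    """A lasso prefix.cycle^omega is an accepting run iff its cycle
    visits an accepting state (that state then repeats infinitely often)."""
    return any(s in nba.accepting for s in cycle)

# A toy two-state NBA over labels {a, b}; the accepting state 1 lies on a cycle.
toy = NBA(frozenset({0, 1}), frozenset({"a", "b"}),
          frozenset({(0, "a", 1), (1, "b", 0)}), 0, frozenset({1}))
print(lasso_is_accepting(toy, [0], [1, 0]))  # True
```

The `prefix` argument is irrelevant to acceptance; it is kept only to mirror the lasso shape $\rho_p(\rho_c)^\omega$.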

As hinted in Sect. 1, the existence of accepting runs is interesting for several theoretical and practical reasons. On the theoretical side, the language of an NBA is the set of all traces $\sigma \in L^\omega$ for which an accepting run $\rho$ exists such that $\rho[i] \xrightarrow{\sigma[i]} \rho[i+1]$ for all $i \in \mathbb{N}$. On the practical side, model checking $\omega$-regular properties, including LTL properties, can be reduced to checking the existence of accepting runs. Such runs, indeed, provide witnesses or counterexamples for the properties of interest.

*Composition of NBAs.* From now on we assume that the set of labels $L$ of an NBA is partitioned into a set $L_I$ of *internal labels* and a set $L_G$ of *global labels*. The notion of composition we use is based on (maximal) synchronisation on global labels; in words: in every transition involving a global label, each component having that global label in its set of labels must perform a local transition, while transitions with internal labels can be performed independently. When composing NBAs we assume w.l.o.g. that they do not share any internal label. Further, we assume that every global label is shared by at least two component NBAs; otherwise, such labels can be made internal. We will use the following notation: for a set $\mathcal{A}^1, \dots, \mathcal{A}^n$ of NBAs, we use superscripts to denote the components of each $\mathcal{A}^i$, i.e., we assume $\mathcal{A}^i = \langle S^i, \rightarrow^i, L^i = L^i_I \cup L^i_G, s^i_0, A^i \rangle$.

**Definition 2 (Composition of NBAs).** *The* composition *of* $n$ *NBAs* $\mathcal{A}^1, \dots, \mathcal{A}^n$*, denoted by* $\mathcal{A}^1 \parallel \dots \parallel \mathcal{A}^n$*, is the NBA* $\langle S, \rightarrow, L, \boldsymbol{s}_0, A \rangle$*, where* $S = S^1 \times \dots \times S^n$*,* $L = \bigcup_{i \in \{1,\dots,n\}} L^i$*,* $\boldsymbol{s}_0 = (s^1_0, \dots, s^n_0)$*,* $A(s_1, \dots, s_n) = \bigwedge_{i=1,\dots,n} A^i(s_i)$*, and* $\rightarrow$ *is the smallest set of transitions closed under the following rules for interleaving of local transitions (1) and maximal synchronization on global labels (2):*

$$\begin{aligned}
(1) \quad & \frac{s_i \xrightarrow{l_I} s'_i \qquad l_I \in L^i_I}{(s_1, \dots, s_i, \dots, s_n) \xrightarrow{l_I} (s_1, \dots, s'_i, \dots, s_n)}\\[6pt]
(2) \quad & \frac{\exists_{i \in \{1,\dots,n\}} : l_G \in L^i_G \qquad \forall_{j \in \{1,\dots,n\} \mid l_G \in L^j_G} : s_j \xrightarrow{l_G} s'_j \qquad \forall_{j \in \{1,\dots,n\} \mid l_G \notin L^j_G} : s'_j = s_j}{(s_1, \dots, s_n) \xrightarrow{l_G} (s'_1, \dots, s'_n)}
\end{aligned}$$

As a notational convention, we will denote component states by lowercase letters, e.g. $s$, and composed states $(s_1, \dots, s_n) \in S$ by $\boldsymbol{s}$, i.e., as a vector, and similarly for local runs $\rho$ (resp. traces $\pi$) and composed runs $\boldsymbol{\rho}$ (composed traces $\boldsymbol{\pi}$).
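As an illustration of Definition 2, the following sketch enumerates the successor transitions of a composed state under rules (1) and (2). The component encoding and all names are ours, chosen for brevity, not taken from the paper's implementation.

```python
from itertools import product

def product_successors(comps, state):
    """Successor transitions of a composed state, following Definition 2.
    comps: per component a triple (trans, L_I, L_G), where trans is a set
    of (s, label, s') triples.  state: tuple of one local state per component."""
    succs = []
    # Rule (1): internal transitions of one component are interleaved freely.
    for i, (trans, L_I, _) in enumerate(comps):
        for (s, l, t) in trans:
            if s == state[i] and l in L_I:
                succs.append((l, state[:i] + (t,) + state[i + 1:]))
    # Rule (2): maximal synchronization -- every component that has l_G in
    # its alphabet must take an l_G-transition; the others keep their state.
    for l in set().union(*(L_G for (_, _, L_G) in comps)):
        options, ok = [], True
        for i, (trans, _, L_G) in enumerate(comps):
            if l in L_G:
                nxt = [t for (s, m, t) in trans if s == state[i] and m == l]
                if not nxt:
                    ok = False  # some synchronizing component cannot move
                    break
                options.append(nxt)
            else:
                options.append([state[i]])
        if ok:
            succs.extend((l, combo) for combo in product(*options))
    return succs

# Two hypothetical components synchronizing on "g"; "i" is internal.
comp1 = ({("p", "i", "q"), ("q", "g", "p")}, {"i"}, {"g"})
comp2 = ({("x", "g", "y"), ("y", "g", "x")}, set(), {"g"})
print(product_successors([comp1, comp2], ("q", "x")))  # [('g', ('p', 'y'))]
```

The cross-product over `options` reflects non-determinism: if several components each have several $l_G$-successors, all combinations are composed successors.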

In Fig. 1 we illustrate a small example of a composition of two NBAs $\mathcal{A}^1, \mathcal{A}^2$. In the top of the figure, we show the local state space of the two components ($\mathcal{A}^1$ left, $\mathcal{A}^2$ right), where the component states are $S^1 = \{1, 2, 3\}$, $S^2 = \{A, B\}$, and the labels are defined as $L^1_G = L^2_G = \{l^1_G, l^2_G\}$, $L^1_I = \{l^1_I\}$, $L^2_I = \{l^2_I\}$. A local state is accepting for $\mathcal{A}^1$, i.e. $A^1(s) = \top$, iff $s = 2$, and similarly $A^2(s) = \top$ iff $s = B$. The initial states are $s^1_0 = 1$ and $s^2_0 = A$. The transitions are as shown. In the bottom, we depict the part of the state space of the composition $\mathcal{A}^1 \parallel \mathcal{A}^2$ reachable from $\boldsymbol{s}_0 = (1, A)$ as it would be generated by a standard DFS. Here, transitions via global labels synchronize the components, while internal transitions are executed independently. The states crossed out would be pruned by duplicate checking; the underlined state is accepting.


**Fig. 1.** Example of two NBAs, $\mathcal{A}^1$ and $\mathcal{A}^2$, and the state space of their composition $\mathcal{A}^1 \parallel \mathcal{A}^2$.

*Verification Problem and Its Resolution with NDFS.* The verification problem we address in this paper is the existence of accepting runs in the composed NBA $\mathcal{A}^1 \parallel \dots \parallel \mathcal{A}^n$. In words, we look for runs in $\mathcal{A}^1 \parallel \dots \parallel \mathcal{A}^n$ that infinitely often traverse states in which *all* component NBAs are in an accepting state. We discuss alternative acceptance conditions in Sect. 7.

Determining the existence of accepting runs in an NBA can be boiled down to the existence of so-called *lassos*, i.e., finite sequences of states in the NBA of the form $\boldsymbol{\rho}_p\boldsymbol{\rho}_c$

**Fig. 2.** A standard NDFS algorithm for lasso search in composed NBAs.

where $\boldsymbol{\rho}_p$ is the prefix of the lasso and $\boldsymbol{\rho}_c$ is the cycle of the lasso, which contains at least one accepting state and closes the cycle (i.e., $\boldsymbol{\rho}_p[|\boldsymbol{\rho}_p| - 1] = \boldsymbol{\rho}_c[|\boldsymbol{\rho}_c| - 1]$). Such a finite sequence of states represents an accepting run $\boldsymbol{\rho}_p(\boldsymbol{\rho}_c)^\omega$.

Several algorithms can be used to check the existence of lassos. The predominant family of algorithms are the variants of NDFS, originally introduced in [5]. Figure 2 shows the pseudo-code for one such variant, based on NDFS as presented in [4]. The algorithm is based on an ordinary depth-first search algorithm (**DFS**) that works as usual: a set $V$ is used to record already visited states, and recursion enforces the depth-first exploration order of the state space. Moreover, a stack *Stack* is used to keep track of the states on the current initial trace being explored. The main difference w.r.t. ordinary DFS is that a second, nested, depth-first search algorithm (**NestedDFS**) is invoked from accepting states on backtracking, i.e., after the recursive call to **DFS**. The idea is that, if this second depth-first search finds a state that is on *Stack*, then it is guaranteed that a cycle has been found, which contains at least one accepting state. That is, one finds the (un)desired lasso. The algorithm is also complete: no accepting cycle is missed.
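The NDFS variant described above can be sketched in Python as follows (visited sets $V$ and $V_2$, an explicit *Stack*, and the nested search launched from accepting states on backtracking). Names and details are illustrative and may differ from the pseudo-code of Fig. 2.

```python
def ndfs(init, successors, accepting):
    """Nested DFS: returns a witness state on an accepting cycle, or None.
    successors(s) yields successor states; accepting(s) tests A(s)."""
    V, V2, stack, found = set(), set(), [], []

    def dfs(s):
        V.add(s)
        stack.append(s)
        for t in successors(s):
            if not found and t not in V:
                dfs(t)
        if not found and accepting(s):
            nested(s)  # launched on backtracking from an accepting state
        stack.pop()

    def nested(s):
        V2.add(s)
        for t in successors(s):
            if found:
                return
            if t in stack:       # state on Stack: accepting cycle closed
                found.append(t)
                return
            if t not in V2:
                nested(t)

    dfs(init)
    return found[0] if found else None

# A 3-state graph with an accepting cycle 1 -> 2 -> 1 (state 1 accepting).
print(ndfs(0, lambda s: {0: [1], 1: [2], 2: [1]}[s], lambda s: s == 1))  # 1
```

The post-order invocation of the nested search is what makes reusing a single set $V_2$ across all nested searches safe in the classical algorithm.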

**Fig. 3.** Example run of **CheckEmptiness**. The wavy arrow indicates the invocation of **NestedDFS**((2, B)); the dashed arrow indicates how the cycle is closed.

In Fig. 3, we illustrate an example run of the **CheckEmptiness** algorithm on our example. When **DFS** backtracks from $(2, B)$, **NestedDFS** is invoked, illustrated by the wavy arrow. **NestedDFS** generates the successor $(2, A)$, which is on *Stack*, so a cycle is reported. We can construct an accepting run $\boldsymbol{\rho}_p(\boldsymbol{\rho}_c)^\omega$ with prefix $\boldsymbol{\rho}_p$ induced by the trace $l^1_I$ and cycle $\boldsymbol{\rho}_c$ induced by the trace $l^2_G, l^1_G, l^1_I, l^2_I$.

# **3 The Decoupled State Space for Composed NBAs**

As previously stated, decoupled state space search was recently developed in AI planning [14] and later adapted to model checking of safety properties [12]. It is designed to tackle the state explosion problem inherent in search problems that result from compactly represented systems with exponentially large state spaces. In AI planning, where decoupled search was originally introduced, such systems are modelled through state variables and a set of transition rules (called "actions"). The adaptation of decoupled search to reachability checking in SPIN presented in [12] devised decoupled search for automata models, but only informally. Here, we introduce decoupled search formally for NBA models, defining the decoupled state space that results from the composition of a set of NBAs.

#### **3.1 Decoupled Composition of NBAs**

In contrast to the explicit construction of the state space, where all reachable states are generated by searching over all traces of enabled transitions, decoupled search only searches over traces of global transitions, the ones that synchronize the component NBAs. In decoupled search, a *decoupled state* $s^{\mathcal{D}}$ compactly represents a set of states closed under internal steps. This is done in terms of the sequence of global labels used to reach these states, plus a set of reached states for each component. Definition 3 formalizes this through the operation *decoupled composition of NBAs*, which adapts the composition operation provided in Definition 2 to decoupled state space search.

**Definition 3 (Decoupled composition of NBAs).** *The* decoupled composition *of* $n$ *NBAs* $\mathcal{A}^1, \dots, \mathcal{A}^n$*, denoted by* $\mathcal{A}^1 \parallel_{\mathcal{D}} \dots \parallel_{\mathcal{D}} \mathcal{A}^n$*, is a tuple* $\langle S^{\mathcal{D}}, \rightarrow_{\mathcal{D}}, L_G, s^{\mathcal{D}}_0, A^{\mathcal{D}} \rangle$ *defined as follows:*


$$\frac{l_G \in L_G \qquad \forall_{i \in \{1,\dots,n\}} : S'_i = \{\, s'_i \mid \exists\, \boldsymbol{s} \in s^{\mathcal{D}} : \boldsymbol{s} \xrightarrow{l_G} (s'_1, \dots, s'_i, \dots, s'_n) \,\} \qquad \forall_i : S'_i \neq \emptyset}{s^{\mathcal{D}} \xrightarrow{l_G}_{\mathcal{D}} \langle \mathrm{iclose}(S'_1), \dots, \mathrm{iclose}(S'_n) \rangle}$$

*where, abusing notation, we write $\boldsymbol{s} \in s^{\mathcal{D}}$ if $s^{\mathcal{D}} = \langle S_1, \dots, S_n \rangle$ and $\boldsymbol{s} \in S_1 \times \dots \times S_n$. Here, $\mathrm{iclose}(S)$ denotes the closure of a set of local states $S$ under internal transitions.*

In the decoupled composition $\mathcal{A}^1 \parallel_{\mathcal{D}} \dots \parallel_{\mathcal{D}} \mathcal{A}^n$ a decoupled state $s^{\mathcal{D}}$ is defined by a tuple $\langle s^{\mathcal{D}}[\mathcal{A}^1], \dots, s^{\mathcal{D}}[\mathcal{A}^n] \rangle$, consisting of a non-empty set of component states $s^{\mathcal{D}}[\mathcal{A}^i]$ for each $\mathcal{A}^i$. A decoupled state represents exponentially many *member states*, namely all composed states $\boldsymbol{s} = (s_1, \dots, s_n)$ such that $\boldsymbol{s} \in s^{\mathcal{D}}[\mathcal{A}^1] \times \dots \times s^{\mathcal{D}}[\mathcal{A}^n]$. We will always use a superscript $\mathcal{D}$ to denote decoupled states $s^{\mathcal{D}}$.

We overload the subset operation $\subseteq$ for decoupled states by applying it component-wise to the sets of reached local states, namely $s^{\mathcal{D}} \subseteq t^{\mathcal{D}} \Leftrightarrow \forall \mathcal{A}^i : s^{\mathcal{D}}[\mathcal{A}^i] \subseteq t^{\mathcal{D}}[\mathcal{A}^i]$.

During a search in the decoupled composition we define the *global trace* of a decoupled state $s^{\mathcal{D}}$, denoted $\pi_G(s^{\mathcal{D}})$, as the sequence of global transitions on which $s^{\mathcal{D}}$ was reached from $s^{\mathcal{D}}_0$. For DFS, as considered in this work, this is well-defined.

In explicit state search, states that have been visited before – duplicates – are pruned to avoid repeating the search effort unnecessarily. The corresponding operation in decoupled search is *dominance pruning* [14]. A newly generated decoupled state $t^{\mathcal{D}}$ is pruned if there exists a previously seen decoupled state $s^{\mathcal{D}}$ that *dominates* $t^{\mathcal{D}}$, i.e., where $t^{\mathcal{D}} \subseteq s^{\mathcal{D}}$. With the correctness result given below, this is safe. One can make the representation of decoupled states, and thereby also the dominance checking, more efficient by representing the state sets $s^{\mathcal{D}}[\mathcal{A}^i]$ symbolically [17].
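Representing a decoupled state as a tuple of sets of local states, dominance pruning reduces to a component-wise subset test; a minimal sketch (names are ours):

```python
def dominated(tD, seen):
    """t^D is pruned iff some previously seen s^D dominates it,
    i.e. t^D[A^i] is a subset of s^D[A^i] for every component i."""
    return any(all(t <= s for t, s in zip(tD, sD)) for sD in seen)

# One previously seen decoupled state over two hypothetical components.
seen = [(frozenset({1, 2}), frozenset({"A", "B"}))]
print(dominated((frozenset({2}), frozenset({"A"})), seen))  # True
print(dominated((frozenset({3}), frozenset({"A"})), seen))  # False
```

The `<=` operator on `frozenset` is Python's subset test, matching the overloaded $\subseteq$ on decoupled states.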

The initial decoupled state is obtained by closing each local state with internal steps (iclose), and decoupled transitions generate decoupled states whose local states are also closed under internal steps. This maximally preserves the decomposition afforded by the decoupled representation. Namely, as we prove in what follows, a decoupled state $s^{\mathcal{D}}$ compactly represents all explicit states that are reachable via traces that extend the global trace $\pi_G(s^{\mathcal{D}}) = l^1_G, l^2_G, \dots, l^k_G$ with local transition labels. That is, for every component $\mathcal{A}^i$, $s^{\mathcal{D}}$ contains the non-empty subset of its local states $s^{\mathcal{D}}[\mathcal{A}^i] \subseteq S^i$ that can be reached with traces $\pi^i = l_1, l_2, \dots, l_m$ such that there exist indices $j_1 < j_2 < \dots < j_k$ where $l_{j_t} = \pi_G(s^{\mathcal{D}})[t]$ for all $1 \leq t \leq k$. In words, after every global label on $\pi_G(s^{\mathcal{D}})$, arbitrary enabled sequences of internal transitions are allowed.

We remark that the decoupled composition of a set of NBAs is always deterministic: for every pair of a decoupled state $s^{\mathcal{D}}$ and a global label $l_G$, there is a unique successor $t^{\mathcal{D}}$. This is easy to see: if there is a composed state $\boldsymbol{s}$ contained in $s^{\mathcal{D}}$ that has multiple outgoing transitions labelled with $l_G$, all of the composed successor states are contained in $t^{\mathcal{D}}$. This increases the possible state space reduction compared to standard search, which needs to branch over all these successors. Note that this is different from the determinization of NBAs, which comes with a blow-up [31]. The determinism is a consequence of the compact representation, where all possible outcome states of a non-deterministic transition are contained in the decoupled successor state.
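The decoupled transition of Definition 3, together with the iclose() closure, can be sketched as follows (the encoding is ours, for illustration): components synchronizing on $l_G$ move and are then closed internally, the others keep their reached sets, and the transition is undefined whenever some synchronizing component's target set would be empty. Note that the function computes at most one successor per global label, reflecting the determinism remarked above.

```python
def iclose(S, trans, L_I):
    """Closure of a set of local states under internal transitions."""
    closed, frontier = set(S), list(S)
    while frontier:
        s = frontier.pop()
        for (u, l, v) in trans:
            if u == s and l in L_I and v not in closed:
                closed.add(v)
                frontier.append(v)
    return frozenset(closed)

def decoupled_successor(comps, sD, lG):
    """One decoupled transition on global label lG.  comps: per component a
    triple (trans, L_I, L_G); sD: tuple of reached local-state sets.
    Returns None if a synchronizing component has no lG-transition enabled."""
    new = []
    for (trans, L_I, L_G), Si in zip(comps, sD):
        if lG not in L_G:
            new.append(Si)  # non-synchronizing component keeps its set
            continue
        Ti = {v for (u, l, v) in trans if l == lG and u in Si}
        if not Ti:
            return None
        new.append(iclose(Ti, trans, L_I))
    return tuple(new)

# Two hypothetical components: "i" internal in the first, "g" shared.
c1 = (frozenset({("1", "i", "2"), ("2", "g", "3")}), {"i"}, {"g"})
c2 = (frozenset({("A", "g", "B")}), set(), {"g"})
s0 = (iclose({"1"}, c1[0], c1[1]), iclose({"A"}, c2[0], c2[1]))
print(s0)  # initial decoupled state: ({'1','2'}, {'A'}) up to set ordering
```

The initial state already shows the compaction: the interleaving of the internal step with anything else is never enumerated.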

#### **3.2 Correctness of Decoupled Composition**

In this section we show that decoupled search, as presented here, is sound and complete w.r.t. reachability properties. We adapt the corresponding result from AI planning [14].

We require some additional notation. For a trace *π*, by π<sup>G</sup>(*π*) we denote the subsequence of *π* that is obtained by projecting onto the global labels LG.

As previously stated, the decoupled state space captures reachability of the composed system exactly. The proof is an adaptation of previous results from AI Planning [14] to composed NBAs as considered here.

**Theorem 1.** *A state $\boldsymbol{t}$ of a composition of NBAs $\mathcal{A}^1 \parallel \dots \parallel \mathcal{A}^n$ is reachable from a state $\boldsymbol{s}$ via a trace $\boldsymbol{\pi}$, iff there exist decoupled states $s^{\mathcal{D}}$, $t^{\mathcal{D}}$ in the decoupled composition $\mathcal{A}^1 \parallel_{\mathcal{D}} \dots \parallel_{\mathcal{D}} \mathcal{A}^n$, such that $\boldsymbol{s} \in s^{\mathcal{D}}$, $\boldsymbol{t} \in t^{\mathcal{D}}$, and $t^{\mathcal{D}}$ is reachable from $s^{\mathcal{D}}$ via $\pi_G(\boldsymbol{\pi})$.*

**Fig. 4.** Illustration of the exponential separations to ample sets (left) and unfolding (right).

*Proof.* Let $\pi_G(\boldsymbol{\pi}) = l^1_G, \dots, l^k_G$, and $s^{\mathcal{D}}_i \xrightarrow{l^{i+1}_G}_{\mathcal{D}} s^{\mathcal{D}}_{i+1}$ for all $1 \leq i < k$. We prove the claim by induction over the length of $\pi_G(\boldsymbol{\pi})$. For the base case $|\pi_G(\boldsymbol{\pi})| = 0$, the claim trivially holds, since, by the definition of iclose(), $s^{\mathcal{D}}$ contains all composed states $\boldsymbol{t}$ that are reachable from any $\boldsymbol{s} \in s^{\mathcal{D}}$ via only internal transitions.

Assume a decoupled state $s^{\mathcal{D}}_i$ is reachable from $s^{\mathcal{D}}$ via $l^1_G, \dots, l^i_G$. Then, by the definition of decoupled transitions and iclose(), the state $s^{\mathcal{D}}_{i+1}$ contains all composed states $\boldsymbol{s}_{i+1}$ that are reachable from a state $\boldsymbol{s}_i \in s^{\mathcal{D}}_i$ via a trace $\boldsymbol{\pi}_{i \to i+1}$ that consists of only internal transitions and $l^{i+1}_G$. By the induction hypothesis, we can extend the traces reaching every such $\boldsymbol{s}_i$ from an $\boldsymbol{s} \in s^{\mathcal{D}}$ by $\boldsymbol{\pi}_{i \to i+1}$ and obtain a trace reaching $\boldsymbol{s}_{i+1}$ from $\boldsymbol{s}$ with global sub-trace $l^1_G, \dots, l^i_G, l^{i+1}_G$.

For the other direction, if a composed state $\boldsymbol{s}_i$ is reached in a decoupled state $s^{\mathcal{D}}_i$ and can reach a state $\boldsymbol{s}_{i+1}$ via a trace $\boldsymbol{\pi}_{i \to i+1}$ that consists of internal labels and $l^{i+1}_G$, then there exists a decoupled transition $s^{\mathcal{D}}_i \xrightarrow{l^{i+1}_G}_{\mathcal{D}} s^{\mathcal{D}}_{i+1}$ and, again by the definition of decoupled transitions and iclose(), $s^{\mathcal{D}}_{i+1}$ contains $\boldsymbol{s}_{i+1}$. By the induction hypothesis $s^{\mathcal{D}}_i$ is reachable from $s^{\mathcal{D}}$, where $\boldsymbol{s}_i$ is reachable from $\boldsymbol{s} \in s^{\mathcal{D}}$. Thus, $s^{\mathcal{D}}_{i+1}$ is reachable from $s^{\mathcal{D}}$ via $l^1_G, \dots, l^i_G, l^{i+1}_G$.

#### **3.3 Relation to Other State-Space Reduction Methods**

Prior work has investigated the relation of decoupled search to other state-space reduction methods in the context of AI planning [14,15], in particular to strong stubborn sets [34], Petri-net unfolding [8,9], and symbolic representations using BDDs [2,28]. For all these techniques, there exist families of scaling examples where decoupled search is exponentially more efficient.

We capture this formally in terms of *exponential separations*. A search method $X$ is *exponentially separated* from decoupled search if there exists a family of models $\{M_n = \langle \mathcal{A}^1, \dots, \mathcal{A}^m \rangle \mid n \in \mathbb{N}\}$ of size polynomially related to $n$ such that (1) the number of reachable decoupled states in $\mathcal{A}^1 \parallel_{\mathcal{D}} \dots \parallel_{\mathcal{D}} \mathcal{A}^m$ is bounded by a polynomial in $n$, and (2) the state space representation of $\mathcal{A}^1 \parallel \dots \parallel \mathcal{A}^m$ under $X$ is exponential in $n$.

We next describe two scaling models showing that the ample sets variant of SPIN [21,29], as a representative for partial-order reduction in explicit-state search, and Petri-net unfolding are exponentially separated from decoupled search. For symbolic search with BDDs, the reduction achieved by both methods is in general incomparable.

For ample sets, a simple model family looks as follows: there are $n$ components, each with the same state space: two local states $A_i, B_i$, initial state $A_i$, two global transitions labelled $l^{a,i}_G, l^{b,i}_G$, and one internal transition. A component and its transitions are depicted in the left of Fig. 4 (the dashed transition is internal). The global transitions synchronize components pairwise; our argument holds for every possible such synchronization.

Under ample set pruning, no reduction is achieved (no state is pruned) because there is a global transition enabled in every state. Thus, there exists no state where only safe (i.e. internal) transitions are enabled, and the search always branches over all enabled transitions of all components. The decoupled state space, in contrast, only has a single decoupled state, where both local states are reached in each component. All decoupled successor states are dominated and will be pruned.

Similar to decoupled search, Petri-net unfolding exploits component independence by a special representation. Instead of searching over composed states and pruning transitions, the states of individual components are maintained separately.<sup>1</sup>

A scaling model showing that unfolding is exponentially separated from decoupled search is illustrated in the right of Fig. 4. There are $n$ components, each with the same state space with three local states $A_i, B_i, C_i$, a global label $l_G$, and transitions as shown in the figure. In a Petri net, this model is encoded with $3n$ places and $2^n$ transitions, one for every combination of one output place in each of the components. In the unfolding, this results in an event (the equivalent of a state) for every net transition. The decoupled state space has only two decoupled states: the initial state where $\{A_i\}$ is reached for all components, and its $l_G$-successor where $\{B_i, C_i\}$ is reached in every component.

# **4 NDFS for Decoupled Search**

We now adapt NDFS to decoupled search. We start by discussing the deficiencies of a naïve adaptation. We then introduce the key concepts of our fixed algorithm in Sect. 4.2, and present the algorithm itself in Sect. 4.3. We close this section by showing that the exponential separations from partial-order reduction and unfolding in Sect. 3.3 carry over to liveness checking via simple modifications of the models.

#### **4.1 Issues with a Naïve Adaptation of NDFS**

In a naïve adaptation of NDFS to decoupled search, the only thing that changes is the treatment of decoupled states, which represent *sets of composed states*, compared to single states in the standard variant. This leads to three mostly minor changes: (1) instead of duplicate checking we perform dominance pruning; (2) checking if a decoupled state is accepting boils down to checking if it contains an accepting member state; and (3) to see if a state $\boldsymbol{t}$ contained in a state $t^{\mathcal{D}}$ generated in **NestedDFS** is on the stack, we need to check if $t^{\mathcal{D}}$ has a non-empty intersection with a state on *Stack*.

As we will show next, it turns out that this naïve adaptation can *miss* cycles due to pruning. Revisiting a composed state in **NestedDFS** does not actually imply a cycle, because reaching $t^{\mathcal{D}}$ from $s^{\mathcal{D}}$ entails only that every member state of $t^{\mathcal{D}}$ can be reached from *at least one* member state of $s^{\mathcal{D}}$, not from all of them. The critical point is that pruning does not take into account *from where* states are reachable.

<sup>1</sup> A general difference between the methods is that checking reachability of a conjunctive property is linear in the number of decoupled states, but **NP**-complete for an unfolding prefix [27].

**Fig. 5.** Counterexample showing that a naïve adaptation of the NDFS algorithm is incomplete. The (only) component NBA $\mathcal{A}^1$ is depicted on the left. The search tree on the right shows the entire reachable decoupled state space, where pruned states are crossed out; the wavy arrow depicts the invocation of **NestedDFS** on the acceptance restriction $s^{\mathcal{D}}_{0,A}$ of $s^{\mathcal{D}}_0$.

Consider the example in Fig. 5. The left part of the figure shows the local state space of component NBA $\mathcal{A}^1$. For simplicity, we only show a single component, which is sufficient to illustrate the issue. Here, $\mathcal{A}^1$ is defined as follows: $S^1 = \{1, 2, 3, 4\}$, $L^1_G = \{l^1_G, l^2_G\}$, $L^1_I = \{l^1_I, l^2_I, l^3_I\}$, $A^1(s) = \top$ iff $s \in \{2, 4\}$, and $s^1_0 = 1$. The transitions are as shown in the left of the figure. The decoupled search space generated using NDFS is depicted in the right of the figure. Pruned states are crossed out.

**NestedDFS** is launched (indicated by the wavy arrow) on the accepting initial state $s^{\mathcal{D}}_0$. Before explaining the main issue, we remark that, to ensure that a cycle through an *accepting* member state of $s^{\mathcal{D}}_0$ is found, not a cycle through a non-accepting one, we need to restrict the set of reached local states to those that are accepting, and the states internally reachable from those via iclose(). Thus, **NestedDFS** starts in what we call the acceptance restriction $s^{\mathcal{D}}_{0,A}$ of $s^{\mathcal{D}}_0$, where $s^{\mathcal{D}}_{0,A}[\mathcal{A}^1] = \{2, 4\}$. Now, the issue results from the fact that $s^{\mathcal{D}}_{0,A}$ contains two accepting member states, only one of which, namely state 2, is on a cycle. Assuming that the decoupled states are generated in order of increasing subscripts, so $s^{\mathcal{D}}_1$ before $s^{\mathcal{D}}_2$ and so on, state 2 is first reached in **NestedDFS** as a member state of $s^{\mathcal{D}}_{2,A}$, but via the transition labelled with $l^2_G$ from state 3, so the cycle cannot be closed. When generating the $l^1_G$-successor $s^{\mathcal{D}}_{4,A}$ of $s^{\mathcal{D}}_{0,A}$, its only member state 3 has already been reached in $s^{\mathcal{D}}_{1,A}$, so $s^{\mathcal{D}}_{4,A}$ is pruned and the cycle of state 2 via $l^1_G, l^2_G$ is missed. In the next section we show how to fix this, through an extended state representation that keeps track of reachability from a set of reference states.

A second, minor issue concerns lassos $\rho_p(\rho_c)^\omega$ whose cycle $\rho_c$ is induced by internal labels only. These will not be detected, because **NestedDFS** only considers traces via global labels. We fix this by checking, in every accepting decoupled state generated during **DFS**, whether some component contains a reached accepting local state that lies on a cycle of $L_I$-transitions.
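
This internal-cycle check can be sketched as a plain graph search. The following is a minimal illustration in Python, not the paper's implementation; `reached`, `internal_succ`, and `accepting` are assumed encodings of one component's reached states, internal transition relation, and acceptance set:

```python
def has_internal_accepting_cycle(reached, internal_succ, accepting):
    """Return True iff some reached accepting local state lies on a cycle
    of internal transitions.
    reached: set of local states reached in the decoupled state;
    internal_succ: dict mapping a local state to its internal successors;
    accepting: set of accepting local states."""
    for s in reached & accepting:
        # DFS over internal transitions only, looking for a path back to s
        stack, seen = [s], set()
        while stack:
            u = stack.pop()
            for v in internal_succ.get(u, ()):
                if v == s:
                    return True
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return False
```

Because only one component moves on internal labels, this per-component check suffices to detect such cycles.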

#### **4.2 Reference-State Splits**

The problem underlying the issue described in the previous section is that pruning is done regardless of the accepting states in the root node of **NestedDFS**. We now introduce an operation on decoupled states that splits them with respect to the set of reached local accepting states of each component. In our algorithm, this serves to distinguish the different accepting states, and thus forces dominance pruning to distinguish reachability from each of them. Formally, we define the restriction to accepting local states as a new transition with a global label $l^A_G$ that is a self-loop for all accepting states:

**Definition 4 (Acceptance-Split Transition).** *Let* $\langle S^{\mathcal{D}}, \to^{\mathcal{D}}, L, s^{\mathcal{D}}_0, A^{\mathcal{D}} \rangle$ *be the decoupled composition of* $\mathcal{A}^1, \ldots, \mathcal{A}^n$*. Let* $s^{\mathcal{D}}$ *be an accepting decoupled state, and for* $1 \le i \le n$ *let* $s^i_1, \ldots, s^i_{c_i} \subseteq s^{\mathcal{D}}[\mathcal{A}^i]$ *be the list of reached accepting states of* $\mathcal{A}^i$*, where for all* $1 \le j \le c_i$: $A^i(s^i_j) = \top$*. Then the* acceptance-split transition $l^A_G$ *in* $s^{\mathcal{D}}$ *is defined as follows:*

$$\frac{A^{\mathcal{D}}(s^{\mathcal{D}}) = \top \quad \forall i \in \{1, \ldots, n\}, j \in \{1, \ldots, c\_i\}: s\_j^i \in s^{\mathcal{D}}[\mathcal{A}^i] \land A^i(s\_j^i) = \top}{\displaystyle s^{\mathcal{D}} \xrightarrow{l\_G^A} \langle \langle \mathrm{iclose}(s\_1^1), \ldots, \mathrm{iclose}(s\_{c\_1}^1) \rangle, \ldots, \langle \mathrm{iclose}(s\_1^n), \ldots, \mathrm{iclose}(s\_{c\_n}^n) \rangle \rangle}$$

*The outcome state* $s^{\mathcal{D}}_A$ *of an acceptance-split transition is a* split decoupled state*. The set of* reference states *of* $s^{\mathcal{D}}_A$ *is* $R(s^{\mathcal{D}}_A) := \{s \mid \exists \mathcal{A}^i : s \in s^{\mathcal{D}}[\mathcal{A}^i] \land A^i(s) = \top\}$*.*

In words, the operation splits up the single set of reached component states $s^{\mathcal{D}}[\mathcal{A}^i]$ of $\mathcal{A}^i$ into a list of state sets, where each such set $s^{\mathcal{D}}_A[\mathcal{A}^i]_s$ contains the states that can be reached via internal transitions from the respective accepting state $s \in s^{\mathcal{D}}[\mathcal{A}^i]$.
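
For one component, the acceptance-split operation can be sketched as follows. This is an illustrative Python sketch under assumed data structures (`reached` as a set of local states, `internal_succ` as the internal transition relation), with iclose() realized as a closure computation:

```python
def iclose(states, internal_succ):
    """All local states reachable from `states` via internal transitions."""
    closure, frontier = set(states), list(states)
    while frontier:
        u = frontier.pop()
        for v in internal_succ.get(u, ()):
            if v not in closure:
                closure.add(v)
                frontier.append(v)
    return closure

def acceptance_split(reached, internal_succ, accepting):
    """One reached-set per accepting reference state of the component."""
    return {s: frozenset(iclose({s}, internal_succ))
            for s in reached if s in accepting}
```

Applied to the component of Fig. 5 with reached set $\{1,2,3,4\}$ and accepting states $\{2,4\}$, this yields one reached-set per reference state, as in Definition 4.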

Our search algorithm will use the acceptance-split transition to generate the root node $s^{\mathcal{D}}_A$ of **NestedDFS** from an accepting state $s^{\mathcal{D}}$ backtracked from in **DFS**. Hence **NestedDFS** searches in the space of split decoupled states. The transitions over these behind an $s^{\mathcal{D}}_A$ are defined as follows:

**Definition 5 (Split Transitions).** *Let* $\langle S^{\mathcal{D}}, \to^{\mathcal{D}}, L, s^{\mathcal{D}}_0, A^{\mathcal{D}} \rangle$ *be the decoupled composition of* $\mathcal{A}^1, \ldots, \mathcal{A}^n$*. Let* $s^{\mathcal{D}}$ *and* $t^{\mathcal{D}}$ *be decoupled states, with a transition* $s^{\mathcal{D}} \xrightarrow{l_G} t^{\mathcal{D}}$*. Let* $s^i_1, \ldots, s^i_{c_i} \subseteq S^i$ *be reference states for* $\mathcal{A}^i$*. Then the* split transition $s^{\mathcal{D}}_R \xrightarrow{l_G} t^{\mathcal{D}}_R$ *is defined such that for every* $\mathcal{A}^i$ *and every* $1 \le j \le c_i$ *we have:*

$$t\_R^{\mathcal{D}}[\mathcal{A}^i]\_{s\_j^i} = \begin{cases} \text{close}(\{s' \in t^{\mathcal{D}}[\mathcal{A}^i] \mid \exists s \in s\_R^{\mathcal{D}}[\mathcal{A}^i]\_{s\_j^i} : s \xrightarrow{l\_G} s' \}) & l\_G \in L\_G^i\\ s\_R^{\mathcal{D}}[\mathcal{A}^i]\_{s\_j^i} & l\_G \notin L\_G^i \end{cases}$$

The list of reference states for an $\mathcal{A}^i$ does not change along a trace of split transitions. Let $s^{\mathcal{D}}_A$ be a decoupled state generated by an acceptance-split transition $s^{\mathcal{D}} \xrightarrow{l^A_G} s^{\mathcal{D}}_A$; then for all successor states $t^{\mathcal{D}}$ of $s^{\mathcal{D}}_A$, the set of reference states is $R(t^{\mathcal{D}}) = R(s^{\mathcal{D}}_A)$.
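
The successor computation of Definition 5 for one component can be sketched as below. This is an illustrative Python sketch, not the paper's implementation; it assumes close() is the closure under internal transitions, and `global_succ` encodes the component's successors under the fired global label:

```python
def close(states, internal_succ):
    """Assumed: close() is the closure under internal transitions."""
    closure, frontier = set(states), list(states)
    while frontier:
        u = frontier.pop()
        for v in internal_succ.get(u, ()):
            if v not in closure:
                closure.add(v)
                frontier.append(v)
    return closure

def split_successor(split_sets, global_succ, internal_succ, affected):
    """split_sets: dict reference state -> frozenset of reached local states.
    global_succ: local state -> successors under the fired global label.
    affected: True iff the fired label is in this component's L_G^i."""
    if not affected:
        return dict(split_sets)  # the component does not move
    out = {}
    for ref, states in split_sets.items():
        # step through the global label from every state reached from `ref`,
        # then close under internal transitions, per reference state
        step = set()
        for s in states:
            step |= set(global_succ.get(s, ()))
        out[ref] = frozenset(close(step, internal_succ))
    return out
```

Note that the per-reference sets are advanced independently, which is exactly what lets the search remember *from which* accepting state each local state was reached.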

We extend set operations to the split representation as follows. A split decoupled state $s^{\mathcal{D}}_R$ *dominates* a split decoupled state $t^{\mathcal{D}}_R$, denoted $t^{\mathcal{D}}_R \subseteq_R s^{\mathcal{D}}_R$, if $R(t^{\mathcal{D}}_R) \subseteq R(s^{\mathcal{D}}_R)$ and for all components $\mathcal{A}^i$ and reference states $s \in R(t^{\mathcal{D}}_R) \cap S^i$ we have $t^{\mathcal{D}}_R[\mathcal{A}^i]_s \subseteq s^{\mathcal{D}}_R[\mathcal{A}^i]_s$. In contrast, state membership is defined in a global manner, across reference states. Namely, the set of local states of an $\mathcal{A}^i$ reached in a split decoupled state $s^{\mathcal{D}}_R$ is defined as $s^{\mathcal{D}}_R[\mathcal{A}^i] := \bigcup_{s \in R(s^{\mathcal{D}}_R) \cap S^i} s^{\mathcal{D}}_R[\mathcal{A}^i]_s$. Composed state membership is defined relative to these $s^{\mathcal{D}}_R[\mathcal{A}^i]$ as before.
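
The two extended set operations can be sketched directly on the per-component split representation (a dict from reference state to reached set). This is an illustrative sketch under that assumed encoding:

```python
def dominated_by(t_split, s_split):
    """t ⊆_R s: reference states contained, and per-reference containment."""
    return (set(t_split) <= set(s_split)
            and all(t_split[r] <= s_split[r] for r in t_split))

def reached_overall(split_state):
    """Global membership: union of the per-reference reached sets."""
    out = set()
    for states in split_state.values():
        out |= states
    return out
```

The asymmetry is deliberate: dominance is checked per reference state (making pruning weaker, as intended), while membership ignores the split.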

An important property of the splitting is that it preserves reachability of member states. Concretely, for a split transition $s^{\mathcal{D}}_R \xrightarrow{l_G} t^{\mathcal{D}}_R$ induced by a transition $s^{\mathcal{D}} \xrightarrow{l_G} t^{\mathcal{D}}$, for all $\mathcal{A}^i$ it holds that if $s^{\mathcal{D}}_R[\mathcal{A}^i] = s^{\mathcal{D}}[\mathcal{A}^i]$, then $t^{\mathcal{D}}_R[\mathcal{A}^i] = t^{\mathcal{D}}[\mathcal{A}^i]$.

As a notation convention, we will always denote split states $s^{\mathcal{D}}_R$ by a subscript $R$, and the direct outcome of an acceptance-split transition by $s^{\mathcal{D}}_A$, with a subscript $A$.


**Fig. 6.** With acceptance-splitting, **NestedDFS** invoked on the $l^A_G$-successor $s^{\mathcal{D}}_{1,A}$ of $s^{\mathcal{D}}_0$ of the example in Fig. 5 correctly detects the cycle of state 2 induced by the trace $l^1_G, l^2_G$.

Considering our example again, Fig. 6 illustrates how, on split decoupled states, the cycle $2 \xrightarrow{l^1_G} 3 \xrightarrow{l^2_G} 2$ is not pruned. The state $s^{\mathcal{D}}_{3,R}$ is still pruned, as it contains only component states reached from state 4. In $s^{\mathcal{D}}_{4,R}$ and $s^{\mathcal{D}}_{5,R}$, the decoupled state keeps track of the traces from the origin state 2, so neither of the two is pruned, since they are not dominated by any state $s^{\mathcal{D}}_{i,R}$ (the root node $s^{\mathcal{D}}_{1,A}$ of **NestedDFS** is not yet visited).

As indicated before, in our emptiness checking algorithm we will use split decoupled states only within **NestedDFS**. The seed state $s^{\mathcal{D}}_A$ of **NestedDFS** will always be the $l^A_G$-successor of an accepting state $s^{\mathcal{D}}$ backtracked from in **DFS**. Every member state of $s^{\mathcal{D}}_A$ is accepting, or can be reached via $L_I$-transitions from an accepting state.

#### **4.3 Putting Things Together: Decoupled NDFS**

We are now ready to describe our adaptation of the standard NDFS algorithm to decoupled compositions. The pseudo-code is shown in Fig. 7. The differences w.r.t. the standard algorithm (Fig. 2) are highlighted in blue. The basic structure of the algorithm is preserved. It starts by putting the decoupled initial state $s^{\mathcal{D}}_0$ onto the Stack in **CheckEmptiness**, and launches the main **DFS** from it.
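
The overall control flow is that of a standard nested DFS. The following is a hedged skeleton sketch of this structure, not the paper's Fig. 7: state expansion, pruning, and the nested search are abstracted into the assumed callbacks `succ`, `accepting`, and `nested_dfs`:

```python
def check_emptiness(s0, succ, accepting, nested_dfs):
    """succ(s): pruned successor states of s; accepting(s): bool;
    nested_dfs(s): launched on backtracking from an accepting state,
    returns True iff it closes a cycle."""
    visited = {s0}
    def dfs(s):
        for t in succ(s):
            if t not in visited:
                visited.add(t)
                if dfs(t):
                    return True
        # on backtracking from an accepting state, launch the nested search
        return accepting(s) and nested_dfs(s)
    return dfs(s0)
```

The decoupled adaptation keeps exactly this shape; the differences lie in what `succ` generates and prunes, and in which state the nested search is seeded on.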

In **DFS**, the control flow does not change: decoupled states are generated in depth-first order by recursion, updating the stack accordingly. There are, however, three differences to the standard variant:


**Fig. 7.** Adaptation of a standard NestedDFS for lasso search in decoupled compositions of NBA.

3. As discussed in Sect. 4.2, when we launch **NestedDFS** at a decoupled state $s^{\mathcal{D}}$, we do so on the acceptance-split $l^A_G$-successor $s^{\mathcal{D}}_A$ of $s^{\mathcal{D}}$.

**NestedDFS** now starts in the acceptance-split $s^{\mathcal{D}}_A$, and traverses split transitions as per Definition 5. On generation of a new state $t^{\mathcal{D}}_R$, we perform dominance pruning against the decoupled states visited during all prior calls to **NestedDFS**. If in a $t^{\mathcal{D}}_R$, for every component $\mathcal{A}^i$, there exists a reference state $s \in S^i$ that is reachable from itself, i.e., $s \in t^{\mathcal{D}}_R[\mathcal{A}^i]_s$, then we can construct a cycle. As we will show in Theorem 4, this test is guaranteed to find all cycles that start from an accepting state $s_A \in s^{\mathcal{D}}_A$.
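
This termination test has a direct set-based reading; a minimal sketch, assuming a split decoupled state is given as one dict per component mapping a reference state to the set of local states reached from it:

```python
def closes_cycle(split_state_per_component):
    """A cycle exists if every component has a reference state s that is
    reached from itself, i.e. s is in its own reached-set t[A^i]_s."""
    return all(any(ref in reached for ref, reached in comp.items())
               for comp in split_state_per_component)
```

For example, a state where component 1 reaches $\{2,3\}$ from reference state 2 and component 2 reaches $\{1\}$ from reference state 1 passes the test.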

Note that we cannot check for a non-empty intersection with states $r^{\mathcal{D}}$ on *Stack*, since these are not split relative to the reference states of $s^{\mathcal{D}}_A$. Thus, since we do not know from which local state in $r^{\mathcal{D}}$ the state in the intersection was reached, such a non-empty intersection would *not* imply a cycle. What we can do, however, is check for dominance instead, as an algorithm optimization inspired by [22]. The pseudo-code in Fig. 7 does so by checking whether $t^{\mathcal{D}}_R \supseteq r^{\mathcal{D}}$, where the $\supseteq$ relation between a split and a non-split state is simply evaluated based on the overall sets $t^{\mathcal{D}}_R[\mathcal{A}^i]$ vs. $r^{\mathcal{D}}[\mathcal{A}^i]$ of reached component states. If this dominance relation holds, then the reachability issue mentioned in the previous section is resolved, because *all* $t \in r^{\mathcal{D}}$ are then reachable from $s^{\mathcal{D}}_A$ – including those $t$ from which an accepting state $s \in s^{\mathcal{D}}_A$ is reachable. Lemma 1 in the next section will spell out this argument as part of our correctness proof.

Observe that splitting a decoupled state incurs an increase in the size of the state representation, as the same local state may be reached from several reference states. More importantly, as dominance pruning is weaker on split states (which, after all, is the purpose of the split operation), the size of the search space may increase. As shown by the example in Fig. 5, though, there is no easy way around the splitting, since the

**Fig. 8.** Illustration of the component NBAs used in Example 1.

algorithm has to be able to know *from which* component state the successor states are reached. Assuming a component has $M$ accepting states, then in the worst case all local successor states that are shared between these accepting states can be visited $M$ times across all **NestedDFS** invocations. Unless some of the decoupled states revisiting the same member state are pruned by dominance pruning, the revisits can actually multiply across the components, so the size of the decoupled state space in **NestedDFS** can potentially be exponentially larger than the standard state space. As we shall see in our experimental evaluation, though, such blow-ups typically do not seem to occur.

In case we want to construct a lasso, we need to store a pointer to the predecessor of each decoupled state and the label of the generating transition. With this, we can, for each component $\mathcal{A}^i$ separately, reconstruct a trace $\pi$ of a state $t \in t^{\mathcal{D}}$ reached from a state $s \in s^{\mathcal{D}}$ where $\pi_G(s^{\mathcal{D}}, t^{\mathcal{D}}) = \pi_G(\pi)$. Here, for a decoupled state $t^{\mathcal{D}}$ that was reached from another decoupled state $s^{\mathcal{D}}$, by $\pi_G(s^{\mathcal{D}}, t^{\mathcal{D}})$ we denote the global trace via which $t^{\mathcal{D}}$ was reached from $s^{\mathcal{D}}$. This can be done in time polynomial in the size of the component and linear in the length of $\pi_G(s^{\mathcal{D}}, t^{\mathcal{D}})$. Since the traces of all components are synchronized via $\pi_G(\pi)$, we add the required internal labels for each component in between every pair of global labels. We remark that, to decide if a lasso exists, we do not need to store any predecessor or generating-label pointers.
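
Recovering the global trace from the stored pointers is a simple walk back to the root. A minimal sketch, with a hypothetical `Node` layout standing in for however decoupled states store their predecessor and generating label:

```python
from collections import namedtuple

# Hypothetical node layout (not the paper's data structure): each decoupled
# state stores its predecessor and the label of the generating transition;
# the initial state has parent None.
Node = namedtuple("Node", ["parent", "label"])

def global_trace(node):
    """Reconstruct the global trace pi_G(s0^D, t^D) by following
    predecessor pointers back to the initial state."""
    labels = []
    while node.parent is not None:
        labels.append(node.label)
        node = node.parent
    return list(reversed(labels))
```

The per-component internal labels are then filled in between consecutive global labels, as described above.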

We next show on an example how our algorithm works.

*Example 1.* The model has two component NBAs $\mathcal{A}^1, \mathcal{A}^2$, illustrated in Fig. 8. It is a variant of an example from [26]. The figure should be self-explanatory; we remark that all global transitions $l^1_G, \ldots, l^7_G$ induce a self-loop in the only state 1 of $\mathcal{A}^2$.

**CheckEmptiness** starts by putting $s^{\mathcal{D}}_0$ onto *Stack* and enters **DFS**($s^{\mathcal{D}}_0$). Let $s^{\mathcal{D}}_1 = \langle \{B, D\}, \{1\} \rangle$, $s^{\mathcal{D}}_2 = \langle \{E, F\}, \{1\} \rangle$, and $s^{\mathcal{D}}_3 = \langle \{D\}, \{1\} \rangle$ be the successors generated along the trace $l^1_G, l^2_G, l^3_G$ in **DFS**. Since $s^{\mathcal{D}}_3 \subseteq s^{\mathcal{D}}_1 \in V$, $s^{\mathcal{D}}_3$ is pruned and the search backtracks to $s^{\mathcal{D}}_1$. Say **DFS** selects the transition via $l^4_G$ next, generating the state $s^{\mathcal{D}}_4 = \langle \{C\}, \{1\} \rangle$ and its $l^5_G$-successor $s^{\mathcal{D}}_5 = \langle \{F\}, \{1\} \rangle$. Then $s^{\mathcal{D}}_5$ is pruned because it is dominated by $s^{\mathcal{D}}_2 \in V$, and the search backtracks from $s^{\mathcal{D}}_4$, which is accepting.

Thus, **NestedDFS**($s^{\mathcal{D}}_{5,A}$) is invoked, where $s^{\mathcal{D}}_{5,A} = \langle \langle \{C\}_C \rangle, \langle \{1\}_1 \rangle \rangle$, because $C$ and 1 are accepting local states that become the reference states of $s^{\mathcal{D}}_{5,A}$. **NestedDFS** will follow the trace $l^5_G, l^3_G, l^6_G, l^7_G, l^7_G$, which among others generates the state $s^{\mathcal{D}}_{6,R} = \langle \langle \{G\}_C \rangle, \langle \{1\}_1 \rangle \rangle$ by $l^6_G$, and ends in $s^{\mathcal{D}}_{7,R} = \langle \langle \{G\}_C \rangle, \langle \{1\}_1 \rangle \rangle$. The latter is pruned, because it is dominated by $s^{\mathcal{D}}_{6,R}$, which is contained in $V$. No cycle is reported. This is correct, because the only member state $(C, 1)$ of $s^{\mathcal{D}}_{5,A}$ does not occur on a cycle.


**Fig. 9.** Illustration of the exponential separations to ample sets (left) and unfolding (right).

**DFS** then backtracks to $s^{\mathcal{D}}_1 = \langle \{B, D\}, \{1\} \rangle$ and generates its remaining successor $s^{\mathcal{D}}_8 = \langle \{G\}, \{1\} \rangle$ via $l^6_G$. **DFS** further generates the $l^7_G$-successors of $s^{\mathcal{D}}_8$ and eventually backtracks from $s^{\mathcal{D}}_8$, invoking **NestedDFS**($s^{\mathcal{D}}_{8,A}$), where $s^{\mathcal{D}}_{8,A} = \langle \langle \{G\}_G \rangle, \langle \{1\}_1 \rangle \rangle$.

After two transitions via $l^7_G$, the resulting state $s^{\mathcal{D}}_{9,R} = \langle \langle \{G\}_G \rangle, \langle \{1\}_1 \rangle \rangle$ satisfies the condition that for all components $\mathcal{A}^i$ there exists an $s$ with $s \in s^{\mathcal{D}}_{9,R}[\mathcal{A}^i]_s$, namely $G$ and 1. Thus, a cycle is reported. It is induced by the trace $l^1_G, l_I, l^6_G, l^7_G, l^7_G$.

Note that no decoupled state in the second **NestedDFS** is pruned, since none of them is dominated by a state in $V$ of the first **NestedDFS** invocation. In particular, $s^{\mathcal{D}}_{8,A} = \langle \{G\}_G, \{1\}_1 \rangle$ is not dominated by $s^{\mathcal{D}}_{6,R} = \langle \{G\}_C, \{1\}_1 \rangle$, because the reference states differ – $G$ and 1 for $s^{\mathcal{D}}_{8,A}$ and $C$ and 1 for $s^{\mathcal{D}}_{6,R}$.

#### **4.4 Relation to Other State-Space Reduction Methods**

The comparison to ample set pruning and Petri-net unfolding from Sect. 3.3 carries over directly to liveness checking via simple adaptations of the examples; see Fig. 9.

**Theorem 2.** *CheckEmptiness with explicit-state search and ample sets pruning is exponentially separated from CheckEmptiness with decoupled search.*

*Proof (sketch).* The argument from Sect. 3.3 remains valid. With the states $B_i$ accepting (see Fig. 9, left), explicit-state search with ample-set pruning in the worst case has to exhaust the entire state space. It invokes **NestedDFS** on the accepting state $(B_i)^n$ and, in the worst case, needs to exhaust the state space again to detect the cycle. Decoupled search invokes **NestedDFS** on the initial state restricted to the component states $B_i$. Every successor of that state closes the cycle via an arbitrary $l^b_G$ transition. So there are only three decoupled states overall (including the acceptance-restricted initial state).

**Theorem 3.** *Constructing a complete unfolding prefix is exponentially separated from CheckEmptiness with decoupled search.*

*Proof (sketch).* The component states $B_i$ are made accepting and internal transitions $B_i \to A_i$ are added to the model (see Fig. 9, right). Unfolding constructs a complete prefix as described in Sect. 3.3, plus one event for each new internal transition.<sup>2</sup> Decoupled search generates the two states as described. The second state has $\{A_i, B_i, C_i\}$ reached for all components; its successor via $l_G$ is pruned. **NestedDFS** is invoked on its restriction to $B_i$, in which all $A_i$ get reached via the new internal transitions. The $l_G$-successor of this state closes the cycle, so there are only four decoupled states.

<sup>2</sup> A weaker cut-off rule is required for liveness checking that can only increase the prefix size [8].

# **5 Decoupled NDFS Correctness**

We now show the correctness of our approach. In Lemmas 1, 2, and 3, we show that if our algorithm reports a cycle, then there exists an accepting run for $\mathcal{A}^1 \parallel \cdots \parallel \mathcal{A}^n$. In Theorem 4, we then show that decoupled NDFS does not miss an accepting run.

We first show that the optimization of checking dominance of states in **NestedDFS** against states on the stack is sound, i.e., that an accepting run exists.

**Lemma 1.** *Let* $r^{\mathcal{D}}$ *be a decoupled state on the current DFS Stack, and let* $t^{\mathcal{D}}_R$ *be a decoupled state generated by NestedDFS. If* $t^{\mathcal{D}}_R \supseteq r^{\mathcal{D}}$*, then there exists an accepting run for* $\mathcal{A}^1 \parallel \cdots \parallel \mathcal{A}^n$*.*

*Proof.* Let $s^{\mathcal{D}}$ be the accepting state that is backtracked from in **DFS**, i.e., the current **NestedDFS** was invoked on its $l^A_G$-successor $s^{\mathcal{D}}_A$.

From Theorem 1 we know that if $s^{\mathcal{D}}_2$ is reachable from $s^{\mathcal{D}}_1$, then for every state $s_2 \in s^{\mathcal{D}}_2$ there exists a state $s_1 \in s^{\mathcal{D}}_1$ such that $s_1 \xrightarrow{\pi} s_2$, where $\pi_G(\pi) = \pi_G(s^{\mathcal{D}}_1, s^{\mathcal{D}}_2)$.

This result also holds for decoupled states reached in **NestedDFS** from states in **DFS**. This is because the acceptance-split transition $l^A_G$ only restricts the set of reached member states of $s^{\mathcal{D}}$ in $s^{\mathcal{D}}_A$, so in particular $s^{\mathcal{D}}_A \subseteq s^{\mathcal{D}}$. Furthermore, split transitions generating states behind $s^{\mathcal{D}}_A$ do not affect reachability of member states of these split decoupled states compared to their non-split counterparts.

In particular, (1) for every state $s_2 \in s^{\mathcal{D}}$ there exists a state $s_1 \in s^{\mathcal{D}}_0$ that reaches $s_2$ on a trace $\pi$ where $\pi_G(\pi) = \pi_G(s^{\mathcal{D}}_0, s^{\mathcal{D}})$, which, with $s^{\mathcal{D}}_A \subseteq s^{\mathcal{D}}$, also holds for all $s_2 \in s^{\mathcal{D}}_A$; and (2) for every state $t \in t^{\mathcal{D}}_R$ there exists an accepting state $s_A \in s^{\mathcal{D}}_A$ that reaches $t$ on a trace $\pi$ where $\pi_G(\pi) = \pi_G(s^{\mathcal{D}}_A, t^{\mathcal{D}}_R)$.

Since $r^{\mathcal{D}}$ is on *Stack*, it holds that every $s \in s^{\mathcal{D}}_A$ is reachable from an $r \in r^{\mathcal{D}}$, and, with $t^{\mathcal{D}}_R \supseteq r^{\mathcal{D}}$, that every $r \in r^{\mathcal{D}}$ is reachable from an accepting state $s_A \in s^{\mathcal{D}}_A$.

Let $\mathrm{pred}(s^{\mathcal{D}}_1, s^{\mathcal{D}}_2, s_2)$ be a function that, if $s^{\mathcal{D}}_2$ is reachable from $s^{\mathcal{D}}_1$ and $s_2 \in s^{\mathcal{D}}_2$, outputs a state $s_1 \in s^{\mathcal{D}}_1$ that reaches $s_2$ via a trace $\pi$ with $\pi_G(s^{\mathcal{D}}_1, s^{\mathcal{D}}_2) = \pi_G(\pi)$.

Let $s_0$ be a state reached in both $t^{\mathcal{D}}_R$ and $r^{\mathcal{D}}$, and let $s_1 = \mathrm{pred}(r^{\mathcal{D}}, t^{\mathcal{D}}_R, s_0)$ be its predecessor in $r^{\mathcal{D}}$. If $s_1 = s_0$, then we are done, because there exists a lasso $\ldots, s_0, \ldots, s_A, \ldots, s_0$, whose cycle leads from $s_0$ via an accepting state $s_A$ traversed in $s^{\mathcal{D}}_A$ back to $s_0$. Such an accepting state exists because all member states of a decoupled state in **NestedDFS** are reachable from an accepting state in $s^{\mathcal{D}}_A$.

If $s_1 \neq s_0$, then we iterate and set $s_i = \mathrm{pred}(r^{\mathcal{D}}, t^{\mathcal{D}}_R, s_{i-1})$, where such $s_i$ exist because $r^{\mathcal{D}} \subseteq t^{\mathcal{D}}_R$. Because there are only finitely many states in $r^{\mathcal{D}}$, eventually we get $s_i = s_j$ (where $j < i$) and there exists a lasso as follows:

First, there exists a cycle $s_i, \ldots, s_{i-1}, \ldots, s_j = s_i$, where between every pair of states $s_k, s_{k-1}$ an accepting state $s_{k,A}$ in $s^{\mathcal{D}}_A$ is traversed, for the same reason as before. We can obviously shift and truncate the cycle to start right after and end in $s_{i,A}$. The prefix of the lasso is $s_0, \ldots, s_{i,A}$.

Lemmas 2 and 3 show the soundness of our main termination criterion, and of **CheckLocalAccept**.

**Lemma 2.** *Let* $t^{\mathcal{D}}_R$ *be a split decoupled state generated in NestedDFS. If for every component* $\mathcal{A}^i$ *there exists a component state* $s^i$ *such that* $s^i \in t^{\mathcal{D}}_R[\mathcal{A}^i]_{s^i}$*, then there exists an accepting run for* $\mathcal{A}^1 \parallel \cdots \parallel \mathcal{A}^n$*.*

*Proof.* Let $s^{\mathcal{D}}_A$ be the acceptance-split decoupled state from which **NestedDFS** was started. If for every component $\mathcal{A}^i$ such an $s^i$ exists, then the state $s = (s^1, \ldots, s^n)$ is reachable in both $s^{\mathcal{D}}_A$ and $t^{\mathcal{D}}_R$. By the construction of the reached state sets $t^{\mathcal{D}}_R[\mathcal{A}^i]_{s^i}$, $s$ is reachable from itself and is accepting. Hence, there exists a lasso $s_0, \ldots, s, \ldots, s$.

**Lemma 3.** *Let* $t^{\mathcal{D}}$ *be an accepting decoupled state generated in DFS such that a cycle is reported by CheckLocalAccept*($t^{\mathcal{D}}$)*. Then an accepting run for* $\mathcal{A}^1 \parallel \cdots \parallel \mathcal{A}^n$ *exists.*

*Proof.* By prerequisite, there exists an accepting member state $s$ of $t^{\mathcal{D}}$. If **CheckLocalAccept**($t^{\mathcal{D}}$) reports a cycle, then there exists a component $\mathcal{A}^i$ where an accepting state $s^i \in t^{\mathcal{D}}[\mathcal{A}^i]$ is reached that lies on a cycle induced by transitions labelled with $L^i_I$. Thus, we can set the local state of $\mathcal{A}^i$ in $s$ to $s^i$, and the lasso looks as follows: $s_0, \ldots, s, \ldots, s$, where on the cycle only $\mathcal{A}^i$ moves.

We are now ready to prove the correctness of our decoupled NDFS algorithm.

**Theorem 4.** *Let* $\mathcal{A}^1 \parallel \cdots \parallel \mathcal{A}^n$ *be the composition of* $n$ *NBA and let* $\mathcal{A}^1 \parallel^{\mathcal{D}} \cdots \parallel^{\mathcal{D}} \mathcal{A}^n$ *be its decoupled composition. Then CheckEmptiness*($\mathcal{A}^1 \parallel^{\mathcal{D}} \cdots \parallel^{\mathcal{D}} \mathcal{A}^n$) *reports a cycle if and only if an accepting run for* $\mathcal{A}^1 \parallel \cdots \parallel \mathcal{A}^n$ *exists.*

*Proof.* If **CheckEmptiness** reports a cycle, then by Lemmas 1, 2, and 3, which cover exactly the cases where a cycle is reported, an accepting run for $\mathcal{A}^1 \parallel \cdots \parallel \mathcal{A}^n$ exists.

For the other direction, assume that $\rho$ is an accepting run for $\mathcal{A}^1 \parallel \cdots \parallel \mathcal{A}^n$. Let $s_a$, with $0 \le a < k$, be the accepting state that starts the cycle of the lasso $\rho_p = s_0, \ldots, s_a$, $\rho_c = s_{a+1}, \ldots, s_k$, where $s_a = s_k$. Let $\pi = l_1, \ldots, l_k$ be the trace on which $s_k$ is reached, i.e., for all $1 \le i < k$: $\langle s_i, l_{i+1}, s_{i+1} \rangle \in \to$.

By Theorem 1, there exists a decoupled state $s^{\mathcal{D}}$ reached in **DFS** that contains $s_a$.

If $\pi$ is such that for all $a < i \le k$: $l_i \in L_I$, i.e., the cycle $\rho_c$ is induced only by internal labels, we next prove that **CheckLocalAccept**($s^{\mathcal{D}}$) reports a cycle. As $s_a$ is accepting, $s^{\mathcal{D}}$ is accepting, too, so unless a cycle is reported before, eventually **CheckLocalAccept**($s^{\mathcal{D}}$) is called. If $\rho_c$ is induced by only internal labels, then, because there cannot be any component interaction via $L_I$-transitions, there must exist a component $\mathcal{A}^i$ for which the local state $s^i$ in $s_a$ reaches itself with only $L^i_I$-transitions. We can pick any such $\mathcal{A}^i$ and ignore transitions from $\rho_c$ that are labelled by an element of $L_I \setminus L^i_I$, since these are not required for an accepting cycle. Consequently, **CheckLocalAccept**($s^{\mathcal{D}}$) reports a cycle.

We next show that, if $\pi$ contains a global label on the cycle, i.e., there exists an $i \in \{a+1, \ldots, k\}$ such that $l_i \in L_G$, then, unless a cycle is reported before, **NestedDFS**($s^{\mathcal{D}}_A$) reports a cycle, where $s^{\mathcal{D}}_A$ is the $l^A_G$-successor of $s^{\mathcal{D}}$.

Assume for contradiction that this is not the case, i.e., no cycle has been reported before, and **NestedDFS**($s^{\mathcal{D}}_A$) does not report a cycle. Let **NestedDFS**($s^{\mathcal{D}}_A$) be the first call to **NestedDFS** that misses a cycle, although an $s_a \in s^{\mathcal{D}}_A$ that is on a cycle exists.

If $s_a$ is on a cycle, then by Theorem 1 there exists a decoupled state $t^{\mathcal{D}}$ reachable from $s^{\mathcal{D}}_A$ that also contains $s_a$. The result of Theorem 1 holds in this case because, by the definition of split transitions, the splitting does not affect reachability of member states. So there exists $t^{\mathcal{D}}_R$ reachable from $s^{\mathcal{D}}_A$ that contains $s_a$.


**Fig. 10.** Statistics on the two scaling models, where #A is the number of philosophers, resp. the number of NBAs, Time is runtime in seconds, #States (#S) and #E are the number of visited states, resp. generated events, and Mem (M) is the memory usage in MiB.

Denote by $\pi_c = l_{a+1}, \ldots, l_k$ the cycle part of $\pi$. Because $s_a$ is an accepting member state of $s^{\mathcal{D}}$, all its component states $s^i_A$ become reference states in $s^{\mathcal{D}}_A$. Therefore, assuming that $\pi_G(s^{\mathcal{D}}_A, t^{\mathcal{D}}_R) = \pi_G(\pi_c)$, for all components we have $s^i_A \in t^{\mathcal{D}}_R[\mathcal{A}^i]_{s^i_A}$ and a cycle is reported. If this is not the case, then either (1) $s^{\mathcal{D}}$ was not reached in **DFS**, or (2) $t^{\mathcal{D}}_R$ was not reached in **NestedDFS**($s^{\mathcal{D}}_A$).

In case (1), there must exist a state $s^{\mathcal{D}}_P \supseteq s^{\mathcal{D}}$ that prunes $s^{\mathcal{D}}$. But then $s^{\mathcal{D}}_P$ contains $s_a$, too, and **NestedDFS** was called on its $l^A_G$-successor $s^{\mathcal{D}}_{P,A}$, and the cycle of $s_a$ was missed before, in contradiction.

For (2), either (a) there exists a state t<sup>D</sup><sub>P,R</sub> ⊇<sub>R</sub> t<sup>D</sup><sub>R</sub> that was reached in a prior invocation of **NestedDFS** on an accepting state s<sup>D</sup><sub>P,A</sub>, or (b) a state t<sup>D</sup><sub>P,R</sub> ⊇ t<sup>D</sup><sub>R</sub> was reached in **NestedDFS**(s<sup>D</sup><sub>A</sub>) before t<sup>D</sup><sub>R</sub>. In both cases, t<sup>D</sup><sub>R</sub> is pruned and the cycle through *s*<sub>a</sub> is missed. Case (a) can only happen if s<sup>D</sup><sub>P,A</sub> contains *s*<sub>a</sub>, too, because the reference states of s<sup>D</sup><sub>A</sub> need to be a subset of those of s<sup>D</sup><sub>P,A</sub>. But then the cycle of *s*<sub>a</sub> was missed before, in contradiction. For (b), if t<sup>D</sup><sub>R</sub> ⊆<sub>R</sub> t<sup>D</sup><sub>P,R</sub>, then for all A<sub>i</sub> we have s<sup>i</sup><sub>A</sub> ∈ t<sup>D</sup><sub>P,R</sub>[A<sub>i</sub>]<sub>s<sup>i</sup><sub>A</sub></sub>, so the cycle would have been reported before, in contradiction.

The reachability argument in (1), (2a), and (2b) applies recursively to all predecessors of s<sup>D</sup> in **DFS**, and of t<sup>D</sup><sub>R</sub> in **NestedDFS**(s<sup>D</sup><sub>A</sub>). So, unless a cycle is reported before, eventually a state s<sup>D</sup> that contains *s*<sub>a</sub> is reached in **DFS**, and a state t<sup>D</sup><sub>R</sub> with s<sup>i</sup><sub>A</sub> ∈ t<sup>D</sup><sub>R</sub>[A<sub>i</sub>]<sub>s<sup>i</sup><sub>A</sub></sub> is reached in **NestedDFS**(s<sup>D</sup><sub>A</sub>).

# **6 Experimental Evaluation**

We implemented a prototype of the decoupled NDFS algorithm from Fig. 7. The input is specified in the Hanoi Omega-Automata format [1], describing a set of NBAs synchronized via global labels as in Definition 2. We compare our prototype to the SPIN model checker [20] (v6.5.1), and to the Cunf Petri-net unfolding tool [30] (v1.6.1). We also experimented with the symbolic model checkers NuSMV and PRISM [3,25], but both are significantly outperformed by the other methods. We conjecture that this is

**Fig. 11.** Left part: scatter plots with the runtime of DecNDFS on the y-axis and that of SPIN (left column), resp. Cunf (right column), on the x-axis, on randomly generated models. Each point represents one instance. In the top row we highlight different ratios of local labels with different colors/shapes; in the bottom row we highlight different numbers of components. Right part: illustrations of the ring model (top) and the fork (middle) and philosopher (bottom) NBAs of the philosophers model. Initial (accepting) states are marked by an incoming arrow (double circle).

because neither system is specifically designed for asynchronous execution of processes or for LTL model checking. For SPIN, we translate each NBA to a process where NBA states are represented by state labels, internal transitions by goto statements, and global transitions by rendezvous channel operations. For the latter, SPIN only supports synchronization of two processes at a time, so we restrict the models to global transitions with exactly two components. We model acceptance for SPIN explicitly using a monitor process that enters an accepting state if all processes are in a local accepting state. The translation for Cunf encodes NBA states as net places and transitions as net transitions in a single Petri net, ignoring the individual components. In our prototype and in SPIN, when a lasso is reported or the algorithm proves that no lasso exists within the cut-off limits, we say that the instance was *solved*. For Cunf, we attempt to construct a complete unfolding prefix. We consider an instance solved if the construction terminates, i.e., we do not actually check the liveness property. The experiments were performed on a cluster of Intel E5-2660 machines running at 2.20 GHz, with time (memory) cut-offs of 15 min (4 GiB). Our code and models are publicly available [13].
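For readers unfamiliar with the base algorithm, the following is a minimal sketch of classic (non-decoupled) nested depth-first search for Büchi emptiness, the algorithm that DecNDFS adapts. The automaton encoding (a successor dictionary over opaque state ids, a set of accepting states) is a hypothetical illustration, not the paper's data structures.

```python
# Minimal sketch of classic NDFS for Buechi emptiness: the outer (blue)
# search explores the state space; in post-order, the inner (red) search
# is started from each accepting state to look for a cycle back to it.
def ndfs(succ, init, accepting):
    blue, red = set(), set()

    def dfs(s):                          # outer (blue) search
        blue.add(s)
        for t in succ.get(s, []):
            if t not in blue and dfs(t):
                return True
        # post-order: start the nested search from accepting states
        if s in accepting and nested(s, s):
            return True
        return False

    def nested(seed, s):                 # inner (red) search
        red.add(s)
        for t in succ.get(s, []):
            if t == seed:
                return True              # closed an accepting cycle
            if t not in red and nested(seed, t):
                return True
        return False

    return dfs(init)                     # True iff an accepting lasso exists
```

The persistent red set across nested searches is what makes the algorithm linear-time; DecNDFS replaces the per-state checks by the decoupled-state conditions discussed above.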

We compare SPIN with standard options, i.e., with partial-order reduction enabled, Cunf with the cut-off rule of [10], and decoupled search (DecNDFS), using two kinds of benchmarks: (1) two scaling examples to showcase the behaviour on well-known models. One is an encoding of the dining philosophers problem, the other is a ring-shaped synchronisation topology. Both are illustrated in Fig. 11 (right). The philosophers model has 2N NBAs, N philosophers and N forks, synchronized by global transitions l<sup>lF<sub>i</sub></sup><sub>G</sub> and l<sup>rF<sub>i</sub></sup><sub>G</sub>. After synchronizing with its left and right fork, a philosopher can perform an


**Fig. 12.** Number of solved instances on the random models as a function of the ratio of internal transitions (left) and the number of components #A (right).

internal *eat* transition; after releasing the forks it can perform an internal *think* transition. In the ring-topology model, each component can enter a diamond-shaped region via internal transitions, followed by a synchronization with its left or right neighbor via l<sup>i</sup><sub>G</sub> or l<sup>i+1</sup><sub>G</sub>. No accepting run exists for either model. Moreover, (2) we use a set of random automata: for each combination of a ratio of internal transitions in {0%, 20%, …, 80%}, i.e., the number of transitions labelled with L<sub>I</sub> divided by the total number of transitions, and a number of components in {2, …, 8}, we generated sets of 150 random graphs. Each component has 15 to 100 local states, out of which up to 3% are accepting (at least one). To focus on the more interesting cases, we ensure that none of the instances has an internal accepting cycle. One could easily implement a lookup similar to **CheckLocalAccept**, which is necessary for DecNDFS, for the other methods, too, which would essentially reduce the problem to basic reachability.

In Fig. 10, we show detailed statistics for the scaling models, with increasing number of components #A (Time in seconds, #States is the sum of states visited in both DFSs, #E is the number of events in the prefix, Memory in MiB). On dining philosophers, SPIN and DecNDFS show similar results. SPIN has a runtime advantage of roughly a factor of 2 on the larger instances, but DecNDFS uses only a fraction of the memory. Cunf clearly outperforms both. This model is not very well suited to decoupled search: only half of the NBAs have internal transitions, and only two each, and there are no non-deterministic transitions that DecNDFS could represent compactly. On the ring-topology model, SPIN manages to exhaust the search space for up to 9 components. Cunf and DecNDFS scale significantly higher; the number of decoupled states grows only linearly in the number of components. Cunf, on the other hand, does show a blow-up and runs out of memory between 50 and 75 components. This showcase example only serves to illustrate a near-optimal case for decoupled-search reductions, which likely does not carry over to this extent to real-world models.

In Fig. 11 (left part), we show detailed runtime behaviour in terms of scatter plots with a per-instance comparison on the random models. Each point corresponds to one instance, where the x-value is the runtime of SPIN, resp. Cunf, and the y-value is the runtime of DecNDFS, so points below the diagonal indicate an advantage of DecNDFS. Different ratios of internal labels (top row) and numbers of components (bottom row) are depicted in different colors/shapes. We observe that, as expected, with a higher ratio of internal transitions, the advantage of DecNDFS increases significantly. For all ratios, DecNDFS clearly improves with a higher number of components.

In Fig. 12, for the same benchmark set we show the number of solved instances as a function of the ratio (left) and of the number of components (right). Here, we see that from around 20% internal transitions, DecNDFS consistently beats both SPIN and Cunf. SPIN and Cunf also benefit from the decrease in synchronizing statements, although not as much as DecNDFS. On the right, we see that starting with 4 component NBAs (#A), DecNDFS consistently beats SPIN and Cunf. While SPIN and Cunf show a significant decline with more components, this effect is less pronounced for DecNDFS.

# **7 Concluding Remarks and Future Work**

We have presented an approach that adapts decoupled search, an AI planning technique to mitigate the state-space explosion, to the verification of liveness properties of composed NBAs. Specifically, we have adapted a standard on-the-fly algorithm for checking ω-regular properties, nested depth-first search (NDFS), and proven its correctness. The necessary adaptations essentially pertain to the conditions that identify the existence of accepting runs, which must be handled differently given the different properties of decoupled states. Our approach extends the scope of decoupled search from safety properties, as done in [12], to liveness properties. Our experimental evaluation has shown that decoupled search can yield significant reductions in search effort, both on random models that consist of a set of synchronized NBAs and on simple scaling showcase examples.

We have focused on a verification problem for composed NBAs that is sufficiently general to cover significant cases like automata-based LTL model checking. We believe that our solution can be adapted to other verification problems for composed NBAs, including Büchi automata with multiple acceptance conditions such as *generalized Büchi automata*, and language intersection of the involved automata. Indeed, NDFS has successfully been used for emptiness checking of generalized NBAs. We are confident that decoupled NDFS can be adapted to the compilation introduced by [33], where an additional "counter component" is added to keep track of the components that already have an accepting cycle during the nested DFS. Concretely, we believe that the verification problem of generalized NBAs can be handled by our approach with adaptations: in the compilation by [33], the counter component increases its local state from 1 to n (assuming n components), one by one, whenever component i has an accepting state. We can essentially apply the same compilation in decoupled NDFS, restricting the set of local states of A<sub>i</sub> to the accepting ones when the counter is increased from i to i + 1 by a separate acceptance-split transition l<sup>A<sub>i</sub></sup><sub>G</sub> for each A<sub>i</sub>. This ensures that a global cycle includes an accepting state for all components.

There are several interesting topics for future work, like the adaptation of optimizations proposed for basic NDFS (e.g. [22,32]), or the combination with orthogonal state space reduction methods, as previously done in the context of AI planning for partial-order reduction [16], symmetry reduction [18], and symbolic search [17]. Having focused on NDFS [5,22,32] in this work, we believe that the adaptation of SCC-based algorithms is a promising line of research [6,11], extending the scope of decoupled search further to model checking of CTL properties [24].

**Acknowledgment.** We thank Álvaro Torralba for helpful discussions about the state-splitting approach. Daniel Gnad was supported by the German Research Foundation (DFG), as part of project grant HO 2169/6-2, "Star-Topology Decoupled State Space Search". Jörg Hoffmann's research group has received support by DFG grant 389792660 as part of TRR 248 (see perspicuous-computing.science).

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **AIGEN: Random Generation of Symbolic Transition Systems**

Swen Jacobs<sup>1</sup> and Mouhammad Sakr<sup>1,2</sup>

<sup>1</sup> CISPA Helmholtz Center for Information Security, Saarbrücken, Germany, jacobs@cispa.de
<sup>2</sup> Saarland University, Saarbrücken, Germany, sakr@react.uni-saarland.de

**Abstract.** AIGEN is an open source tool for the generation of transition systems in a symbolic representation. To ensure diversity, it employs a uniform random sampling over the space of all Boolean functions with a given number of variables. AIGEN relies on reduced ordered binary decision diagrams (ROBDDs) and canonical disjunctive normal form (CDNF) as canonical representations that allow us to enumerate Boolean functions, in the former case with an encoding that is inspired by data structures used to implement ROBDDs. Several parameters allow the user to restrict generation to Boolean functions or transition systems with certain properties, which are then output in AIGER format. We report on the use of AIGEN to generate random benchmark problems for the reactive synthesis competition SYNTCOMP 2019, and present a comparison of the two encodings with respect to time and memory efficiency in practice.

# **1 Introduction**

Verification and synthesis algorithms require benchmark problems that can be used for testing and evaluation. Unfortunately, a diverse set of benchmarks is very hard to obtain. This is a problem not only for tool developers, but also for organizers of competitions [3,4,8,11] that need to evaluate tools on a wide range of benchmarks, and to regularly search for new meaningful benchmarks.

If done properly, the generation of random benchmarks can be a solution to this problem, providing the best possible diversity and generating new benchmarks whenever needed. On the other hand, random benchmarks come with a few caveats. First of all, completely random generation is usually not desired, since it could result in many benchmarks that, while drawn from a diverse set, are not interesting, e.g., they may be too easy or too difficult to solve for existing tools. Secondly, users may be interested in how their implementation handles benchmarks with specific properties, for instance those that require long chains of computations to reach a conclusion. Finally, if users know what *realistic* benchmarks for a certain type of verification or synthesis problem usually look like, they may want to restrict the random generation to such benchmarks, e.g., by forcing them to comply with certain conditions on their structure.

In this paper we present AIGEN, a tool for the random generation of transition systems in a symbolic representation. We generate transition systems with a partitioned transition relation, i.e., consisting of a set of Boolean functions. We ensure diversity at the level of individual Boolean functions by requiring a uniform random sampling over all Boolean functions with a given number of variables.

While for some application areas there exist tools that generate random Boolean functions in a specific form (e.g., randomly generated propositional formulas in CNF [9,16]), to the best of our knowledge none of these supports uniform random distributions. The obvious benefit of our approach is that uniform sampling allows us to make statements about the actual space of Boolean functions, instead of statements about a specific representation of the functions, and these benefits extend to the random generation of transition systems.

To ensure uniform random sampling, we rely on an enumeration of all Boolean functions with a given number of variables, based on their truth tables. From the truth tables one can generate in a straightforward way standard canonical representations of the functions, e.g., in canonical disjunctive normal form (CDNF) or canonical conjunctive normal form. As a more memory-efficient alternative, we developed an encoding that is inspired by data structures used for implementing reduced ordered binary decision diagrams (ROBDDs).

AIGEN implements our ROBDD-based algorithm and a CDNF-based algorithm. Development of AIGEN was motivated by the evaluation of reactive synthesis tools [13], and it was used to generate benchmarks for the reactive synthesis competition (SYNTCOMP) [11,12]. Since the existing benchmark library of SYNTCOMP consists mostly of benchmarks that were hand-crafted by tool developers, the diversity of benchmarks is limited, and their choice may be skewed towards problems or encodings that are well-suited for the existing tools. Hence, as an addition to the existing hand-crafted examples, random benchmarks are a valuable source of insight into the performance of synthesis algorithms.

**Outline.** We introduce BDDs and ROBDDs in Sect. 2. In Sect. 3 we present our basic idea for the random generation of symbolic transition systems, based on enumerating Boolean functions. In Sect. 4, we present a detailed description of the ROBDD-based algorithm, and in Sect. 5 the algorithm based on CDNF. Finally, in Sect. 6 we present a comparison between the ROBDD and the CDNF approaches, and we give details about our implementation and how to effectively use the tool to produce diverse benchmarks.

# **2 Canonical Representation of Boolean Functions**

A *Binary Decision Diagram (BDD)* over a set of variables *X* is a directed acyclic graph *G* = (*V*, *E*) with *V* ⊂ N, exactly one root *v<sub>r</sub>* ∈ *V*, and a labeling on nodes. Each terminal node *v* ∈ *V* is labeled with a value *val*(*v*) ∈ {0, 1}. Each non-terminal node *v* ∈ *V* is labeled with a variable *var*(*v*) ∈ *X* and has exactly two outgoing edges, leading to nodes that are denoted by *high*(*v*) ∈ *V* and *low*(*v*) ∈ *V*, respectively. Note that if *v* ∈ *V* is a non-terminal node, then the directed acyclic graph rooted in *v* is also a BDD. It is called the *sub-BDD of G with root v*.

A BDD *G*(*V,E*) over a set of variables *X* is *ordered* if on every path from the root to a terminal node, variables in node labels occur in the same order and each variable occurs at most once. A BDD is *reduced* if it does not contain any of the following:


Any ordered BDD can be transformed into a reduced BDD by using the isomorphism and Shannon reductions (cp. [10]). A BDD that is reduced and ordered is called a *Reduced Ordered Binary Decision Diagram (ROBDD)*.

Note that in an ROBDD, a triple (*x, high*(*v*)*, low*(*v*)) of a node *v*, where *x* = *var*(*v*), uniquely defines a sub-ROBDD. This implies that ROBDDs are a canonical representation of Boolean functions [10], i.e., for a fixed variable order there is a unique ROBDD representation for every Boolean function.
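The canonicity property can be made concrete with a hash-consing sketch: maintaining a unique table keyed by (*var*, *high*, *low*) and applying the Shannon and isomorphism reductions on node creation guarantees that structurally equal sub-ROBDDs are represented by the same node. This is a minimal illustration, not the data structure of any particular BDD package.

```python
# Hash-consed ROBDD node construction via a unique table.  Node ids 0 and 1
# are reserved for the terminal nodes; `mk` applies both reduction rules
# on the fly, so equal Boolean functions always get equal node ids.
class UniqueTable:
    def __init__(self):
        self.table = {}              # (var, high, low) -> node id
        self.nodes = [None, None]    # ids 0 and 1 reserved for terminals

    def mk(self, var, high, low):
        if high == low:              # Shannon reduction: redundant test
            return high
        key = (var, high, low)
        if key not in self.table:    # isomorphism reduction: share nodes
            self.table[key] = len(self.nodes)
            self.nodes.append(key)
        return self.table[key]
```

With a fixed variable order, repeated `mk` calls for the same triple return the same id, which is exactly the canonicity argument above.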

# **3 Enumerating Boolean Functions**

Based on a canonical representation of Boolean functions, we define an enumeration, i.e., a bijective mapping from natural numbers to Boolean functions (or ROBDDs), such that any procedure that produces uniformly random natural numbers (in some range) can be used to produce uniformly random Boolean functions (in some range, see below for details).

To define our mapping, we first describe the data structure for ROBDDs that is used by various BDD packages. Then we will illustrate the data structure we use for ROBDDs and how it guarantees canonicity and uniform random distribution. In the following, we assume that *<sup>X</sup>* <sup>=</sup> {*x*1*,...,xm*} is a set of variables with a fixed order.

**Unique Table.** BDD packages use the so-called *unique table* as a data structure for storing ROBDD nodes. The unique table of a BDD *G* = (*V,E*) over a set of variables *<sup>X</sup>* is a hash table that establishes a bijection between nodes *<sup>v</sup>* <sup>∈</sup> *<sup>V</sup>* and triples (*x, h, l*) <sup>∈</sup> *<sup>X</sup>* <sup>×</sup> *<sup>V</sup>* <sup>×</sup> *<sup>V</sup>* that uniquely identify them, where *<sup>x</sup>* <sup>=</sup> *val*(*v*) if *<sup>v</sup>* is a terminal node, and *x* = *var*(*v*) otherwise, *h* = *high*(*v*) and *l* = *low*(*v*).

**Virtual ROBDD Table.** We will use the ideas from the unique table that is used in BDD packages to define the virtual ROBDD table that enumerates all possible ROBDDs with respect to our variable order. This table can of course not be constructed explicitly, but its idea can be used to define a bijective mapping from natural numbers to ROBDDs. We want to generate random Boolean functions based on a uniform distribution. To this end, the algorithm randomly generates a natural number *bddID* ≤ 2<sup>2<sup>m</sup></sup> (since there are 2<sup>2<sup>m</sup></sup> different Boolean functions of type B<sup>m</sup> → B), then computes a unique triple, similar to the one above, that corresponds to *bddID*, and then iteratively builds the complete ROBDD.

For the sake of illustrating how the algorithm computes the triple, assume that there exists a table, called *Virtual ROBDD Table* (or short: VRT), that maps natural numbers to ROBDDs, identified by a triple of variable index, high child, and low child. In other words, every entry in the table uniquely maps a number *bddID* ∈ N (i.e., a BDD node) to a triple (*level*, *high*, *low*), where *level* is a variable index, *high* = *high*(*bddID*), and *low* = *low*(*bddID*). Like in the unique table, none of the entries (i.e., ROBDDs) appears twice. However, in contrast to the unique table, the VRT is based on the fixed variable order, and uses the variable index in this order instead of the variable itself. Table 1 depicts a sketch of the VRT.

**Table 1.** VRT: Entries in the table are in ascending order over *bddID*. Each row is annotated with a *level* and a *sublevel*. *L<sub>i</sub>* denotes the *i*th level, containing all triples with variable index *i*. The sublevel *sl<sub>ij</sub>* denotes the *j*th sublevel of *L<sub>i</sub>*, which contains all triples of *L<sub>i</sub>* in which *j* is the *high* or the *low* child, and the other child *j′* is a *bddID* that belongs to a level *L<sub>i′</sub>* with *i′* < *i* such that [*j.j′*] has not appeared before in *L<sub>i</sub>*. Each cell in a row annotated with *L<sub>i</sub>* and *sl<sub>ij</sub>* is of the form (*bddID*)[*high.low*], where *bddID* is the unique identifier of the triple (*i*, *high*, *low*). Let *Y*<sub>1</sub> = 2<sup>2<sup>i−1</sup></sup> and *Y*<sub>2</sub> = Σ<sub>j′=1</sub><sup>j−1</sup> 2(2<sup>2<sup>i−1</sup></sup> − *j′*).



Note that a *bddID* between 1 and 2<sup>2<sup>m</sup></sup> corresponds to a Boolean function with at most *m* input variables, and a *bddID* between 2<sup>2<sup>m−1</sup></sup> + 1 and 2<sup>2<sup>m</sup></sup> corresponds to a function with exactly *m* input variables. Thus, to uniformly sample Boolean functions, we can use a random number generator that uniformly samples natural numbers in such a range.
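Assuming a standard-library pseudorandom generator is acceptable, such a sampler is a few lines; the function name and the seed handling below are illustrative, mirroring the tool's use of seeds for reproducibility.

```python
import random

# Uniformly sample a bddID in [1, 2^(2^m)]; an explicit seed makes the
# draw reproducible, like the seed list AIGEN writes into its output.
def sample_bdd_id(m, seed=None):
    rng = random.Random(seed)
    return rng.randrange(1, 2 ** (2 ** m) + 1)
```

For example, `sample_bdd_id(3, seed=42)` draws one of the 2<sup>8</sup> = 256 Boolean functions over at most three variables, and the same seed always yields the same function.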

**Fig. 1.** BDD generated for number 16, equivalent to the Boolean function *x*<sub>2</sub>*x*<sub>1</sub> + *x̄*<sub>2</sub>*x̄*<sub>1</sub>. The numbers on the left of the BDD represent the *level*, i.e., the corresponding variable indices.

It is important to remember that the VRT is not constructed explicitly. Instead, given a number of variables *m*, and based on the predefined ordering of ROBDDs in the VRT (2<sup>2<sup>m</sup></sup> ROBDDs), the algorithm first generates a random number *bddID* ≤ 2<sup>2<sup>m</sup></sup>, then computes the triple (*level*, *high*, *low*) to which *bddID* maps. We note: *level* (or *i*) is equal to ⌈log<sub>2</sub>(log<sub>2</sub>(*bddID*))⌉. Let *Y*<sub>1</sub> = 2<sup>2<sup>i−1</sup></sup>; we then solve the following system of inequations to compute *x*, which gives the *sublevel*:

*Y*<sub>1</sub> + 2(*Y*<sub>1</sub> − 1) + … + 2(*Y*<sub>1</sub> − *x*) < *bddID*
*Y*<sub>1</sub> + 2(*Y*<sub>1</sub> − 1) + … + 2(*Y*<sub>1</sub> − (*x* + 1)) ≥ *bddID*

*High* and *low* are then computed according to what is given in the table; see Sect. 4 for more details. Figure 1 shows the BDD generated for *bddID* = 16, which is equivalent to *x*<sub>2</sub>*x*<sub>1</sub> + *x̄*<sub>2</sub>*x̄*<sub>1</sub>.
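The level and sublevel computations just described can be sketched with plain integer arithmetic, avoiding floating-point logarithms: `level` returns the smallest *i* with *bddID* ≤ 2<sup>2<sup>i</sup></sup>, which coincides with ⌈log<sub>2</sub>(log<sub>2</sub>(*bddID*))⌉, and `sublevel` iterates the inequality system directly. This is our reading of the text, not AIGEN's actual code.

```python
# Level and sublevel of a bddID, per the inequality system above.

def level(bdd_id):
    # smallest i with bdd_id <= 2^(2^i), i.e. ceil(log2(log2(bdd_id)))
    i = 0
    while 2 ** (2 ** i) < bdd_id:
        i += 1
    return i

def sublevel(bdd_id, i):
    # find x with  Y1 + 2(Y1-1) + ... + 2(Y1-x)       <  bdd_id
    #         and  Y1 + 2(Y1-1) + ... + 2(Y1-(x+1))   >= bdd_id
    y1 = 2 ** (2 ** (i - 1))
    x, s = 0, y1                     # s is the partial sum up to term x
    while s + 2 * (y1 - (x + 1)) < bdd_id:
        x += 1
        s += 2 * (y1 - x)
    return x
```

For *bddID* = 16 (the example of Fig. 1) this yields level 2, consistent with a root labeled by variable index 2.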

# **4 Random Generation of (Controllable) Transition Systems**

In this section we present our algorithm for generating random transition systems, represented as AIGER circuits [5]. We use a generalization of the usual notion of transition systems that allows some of the input signals to be declared as controllable. This is useful to define synthesis problems, i.e., a synthesis procedure can define how these inputs should behave depending on the state and uncontrollable inputs of the system.

A *controllable transition system* (or short: controllable system) *TS* is a 6-tuple (*L*, *X<sub>u</sub>*, *X<sub>c</sub>*, *F*, *BAD*, *q*<sub>0</sub>), where *L* is a set of state variables (also called *latches*), *X<sub>u</sub>* is a set of uncontrollable input variables, *X<sub>c</sub>* is a set of controllable input variables, *F* = (*f*<sub>1</sub>, …, *f*<sub>|L|</sub>) with *f<sub>i</sub>*: B<sup>L</sup> × B<sup>X<sub>u</sub></sup> × B<sup>X<sub>c</sub></sup> → B is a vector of update functions for the latches, *BAD*: B<sup>L</sup> → B is the set of unsafe states, and *q*<sub>0</sub> is the initial state, in which all latches are initialized to 0.
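A hypothetical rendering of this 6-tuple in code (with B encoded as 0/1 tuples, *F* as a list of callables, and *BAD* as a predicate) may help fix the semantics; the class and parameter names are illustrative only.

```python
# Illustrative encoding of a controllable transition system
# (L, Xu, Xc, F, BAD, q0); bit vectors are tuples of 0/1.
class ControllableTS:
    def __init__(self, n_latches, update, bad):
        self.q0 = (0,) * n_latches   # initial state: all latches 0
        self.update = update         # F: one f_i(latches, x_u, x_c) per latch
        self.bad = bad               # BAD: predicate over latch valuations

    def step(self, latches, x_u, x_c):
        # all latches are updated simultaneously from the current valuation
        return tuple(f(latches, x_u, x_c) for f in self.update)
```

A synthesis procedure would then choose the `x_c` argument as a function of `latches` and `x_u` so that `bad` is never reached.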

Then, the idea of our tool for random generation of transition systems can be summarized in the following way:

– The user input determines parameters of the system, such as the number of latches and controllable or uncontrollable inputs.


#### **4.1 Random Generation Algorithm**

The procedure GenerateRandomAiger takes as input the number of latches *l*, uncontrollable inputs *u*, controllable inputs *c*, the bound *o*, and optionally a list of seeds (i.e., natural numbers used to initialize a pseudorandom number generator). As output it produces a file in AIGER format.

Lines 3–6 generate for every latch a random ROBDD that represents an *update function* B<sup>l+c+u</sup> → B for the latch, i.e., a function that takes all current values of inputs and latches as input, and returns a new value for the given latch. Line 4 generates a random integer with 2<sup>vars</sup> random bits, i.e., a natural number between 1 and 2<sup>2<sup>vars</sup></sup>. All the seeds used for generating the random integers are written to the comment section at the end of the generated file; these seeds can be fed back to the algorithm in order to regenerate the same instance. Line 5 constructs the ROBDD that corresponds to the generated number. Line 6 converts the constructed ROBDD into an AIG (And-Inverter Graph), relying on the fact that a BDD can be seen as a network of multiplexers.
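The multiplexer view used in Line 6 can be illustrated as follows: each BDD node computes ITE(*var*, *high*, *low*), and the OR inside the mux is rewritten with AND and NOT only, matching the AIGER gate basis. The node encoding below (a dict from made-up ids to (*var*, *high*, *low*) triples, with Boolean terminals) is purely illustrative.

```python
# Each BDD node acts as a 2:1 multiplexer ITE(x, h, l) = (x AND h) OR
# (NOT x AND l); the OR is expressed via De Morgan using AND/NOT only,
# as in an And-Inverter Graph.
def ite(x, h, l):
    return not ((not (x and h)) and (not ((not x) and l)))

def eval_bdd(nodes, node, assignment):
    if isinstance(node, bool):           # terminal node 0/1
        return node
    var, high, low = nodes[node]
    return ite(assignment[var],
               eval_bdd(nodes, high, assignment),
               eval_bdd(nodes, low, assignment))
```

Evaluating the Fig. 1 function x<sub>2</sub>x<sub>1</sub> + x̄<sub>2</sub>x̄<sub>1</sub> with such a node table reproduces its truth table, which is exactly what the mux network computes gate by gate.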

Lines 8–10 construct the ROBDD of the function *f<sub>BAD</sub>*: B<sup>o</sup> → B, which uses *o* ≤ *l* latch variables. The set of unsafe states *BAD* is then defined as *f*(*x*<sub>i<sub>1</sub></sub>, …, *x*<sub>i<sub>o</sub></sub>) ∧ ⋀<sub>j ∈ {1,…,l} \ {i<sub>1</sub>,…,i<sub>o</sub>}</sub> *x*<sub>j</sub>, where the indices {i<sub>1</sub>, …, i<sub>o</sub>} are also picked randomly. Line 11 creates the AIGER file that corresponds to the total number of variables and to the update functions that were randomly generated. Line 12 uses the *ABC* [7] tool to reduce the size of the generated AIGER file.

ConstructBDD is a recursive procedure that constructs all the nodes of the ROBDD corresponding to the unique ID *bddID*. It starts with the root node and recursively proceeds to the child nodes until it reaches the nodes 0 or 1. Line 14 checks if the node was already created. If not, Line 15 computes the triple (*level*, *high*, *low*) that uniquely represents the node and adds it to the table *robddTable*. Lines 16–17 construct the child nodes. Note that *robddTable* is initialized with the IDs 1 and 2, which correspond to the nodes 0 and 1, respectively.

Given an ID, the procedure GetChildren computes the triple (*level*, *high*, *low*). Line 20 computes the level. Lines 21–24 compute the sublevel. Note that, as depicted in Table 1, a sublevel *sl<sub>ij</sub>* has size 2(2<sup>2<sup>i−1</sup></sup> − *j*), where 2<sup>2<sup>i−1</sup></sup> is the sum of the sizes of all levels smaller than *i*. To compute the sublevel, we compute the unique solution of the system of inequations in Lines 22–23 (cf. the VRT in Table 1). Line 25 computes the ID of the left-most bit in the sublevel. Lines 26–27 compute the ID of the second child node, and Lines 28–30 check which node is the low edge and which node is the high edge.


### **5 CDNF-based Algorithm**

An obvious alternative to our ROBDD approach is to use the canonical disjunctive or conjunctive normal form to generate random Boolean functions. Algorithm 2 employs CDNF, as it is easier to convert to an And-Inverter Graph (AIG). The CDNF is usually constructed directly from a truth table by taking the OR of all satisfying assignments. To convert a Boolean formula *f<sub>i</sub>* = *cl*<sub>1</sub> ∨ *cl*<sub>2</sub> ∨ … ∨ *cl<sub>n</sub>* in CDNF to an AIG, we consider its equivalent *f<sub>i</sub>* = ¬(¬*cl*<sub>1</sub> ∧ ¬*cl*<sub>2</sub> ∧ … ∧ ¬*cl<sub>n</sub>*).

The procedure DNFGenerateRandomAiger takes as input the number of latches *l*, uncontrollable inputs *u*, controllable inputs *c*, and the bound *o*, and produces a file in AIGER format as output. Lines 3–6 generate a random update function for every latch. Line 4 generates a random bit vector of size 2<sup>vars</sup>.



This bit vector represents the valuation of all *minterms*<sup>1</sup> of the truth table that represents the random function *f<sub>i</sub>*. For instance, if the left-most bit of the bit vector is equal to 1, then *x*<sub>c<sub>0</sub></sub> = 0, …, *x*<sub>c<sub>|c|−1</sub></sub> = 0, *x*<sub>u<sub>0</sub></sub> = 0, …, *x*<sub>u<sub>|u|−1</sub></sub> = 0, *x*<sub>l<sub>0</sub></sub> = 0, …, *x*<sub>l<sub>|l|−1</sub></sub> = 0 is a satisfying assignment of *f<sub>i</sub>*. Similarly, if the last element of the bit vector is equal to 1, then the all-ones assignment *x*<sub>c<sub>0</sub></sub> = 1, …, *x*<sub>c<sub>|c|−1</sub></sub> = 1, *x*<sub>u<sub>0</sub></sub> = 1, …, *x*<sub>u<sub>|u|−1</sub></sub> = 1, *x*<sub>l<sub>0</sub></sub> = 1, …, *x*<sub>l<sub>|l|−1</sub></sub> = 1 is a satisfying assignment of *f<sub>i</sub>*. Line 5 builds the random function that corresponds to the generated bit vector, and Line 6 converts it to an AIG. Lines 8–10 generate the random output function, and Lines 11–12 create the AIGER file and call ABC to minimize it.

The procedure ConstructDNF takes as input a bit vector and the number of variables, and generates the corresponding Boolean function. Line 14 initializes the DNF function to be created. For every element of the bit vector, if the *i*th element is equal to 1 (Line 15), then, in order to obtain the corresponding minterm, Line 17 converts the positive integer *i* to binary. For instance, if *i* = 3 and vars = 3, then the minterm *x<sub>c</sub>* ∧ ¬*x<sub>u</sub>* ∧ *x<sub>l</sub>* is created. Line 18 creates the corresponding minterm. Line 19 negates the created minterm and adds it to the formula. Line 20 returns the negation of the constructed formula. As mentioned earlier, since the formula represented by the truth table is in DNF, we need to generate its equivalent that includes only AND and NOT logical gates: given a formula *f<sub>i</sub>* = *cl*<sub>1</sub> ∨ *cl*<sub>2</sub> ∨ … ∨ *cl<sub>n</sub>* in CDNF, we construct its equivalent *f<sub>i</sub>* = ¬(¬*cl*<sub>1</sub> ∧ ¬*cl*<sub>2</sub> ∧ … ∧ ¬*cl<sub>n</sub>*).
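The bit-vector-to-CDNF step can be sketched as follows, under the assumption that bit *i* of the vector corresponds to the assignment obtained by writing *i* in binary (left-most bit ↔ the all-zeros assignment, as described above); the result is returned as a Python predicate rather than an AIG, purely for illustration.

```python
# Build the Boolean function encoded by a truth-table bit vector: each set
# bit contributes one minterm (one satisfying assignment).
def construct_dnf(bits, nvars):
    # bit i set => the binary expansion of i fixes each variable's polarity
    minterms = {tuple((i >> (nvars - 1 - k)) & 1 for k in range(nvars))
                for i, b in enumerate(bits) if b}

    def f(assignment):
        # CDNF: true iff some minterm matches the assignment exactly
        # (equivalently NOT(AND of negated minterms), the AND/NOT form
        # used for the AIG translation)
        return int(tuple(assignment) in minterms)

    return f
```

For a bit vector of length 2<sup>vars</sup>, the resulting predicate realizes exactly the truth table that was sampled, which is the function the tool then rewrites into AND/NOT gates.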

<sup>1</sup> A minterm of *n* variables is a product (logical AND) of the variables in which each appears exactly once in uncomplemented or complemented form.

**Fig. 2.** Average number of AND gates.

**Fig. 3.** Average running time in seconds, including the time needed to minimize the generated AIGER circuit using the ABC tool.

**Fig. 4.** Average running times.

### **6 Implementation and Evaluation**

AIGEN is implemented in Python, and a virtual machine with the tool ready to run is available at https://doi.org/10.5281/zenodo.4721314 [14]. The source code of AIGEN is also publicly available at https://github.com/mhdsakr/AIGEN-Tool, allowing interested users to add functionality, e.g., further parameters to generate only Boolean functions or transition systems with certain properties. It uses the mpmath [15] library together with GMPY [1] to deal with large numbers. By default, mpmath uses Python integers; however, if GMPY is also installed on the operating system, mpmath automatically detects it and uses gmpy integers instead. This makes mpmath perform much faster, particularly at high precision (approximately above 100 digits). Furthermore, AIGEN uses ABC [7] and the AIGER tool set [6] to post-process AIGER circuits.

AIGEN has been used to generate thousands of random transition systems. Figures 2, 3, and 4 show average sizes and times for generating systems, where, for example, 4.3.7 denotes systems with 4 controllable inputs, 3 uncontrollable inputs, and 7 latches (*o* = *l* = 7). These times were measured on a laptop with a quad-core i7-6600U CPU at 2.6 GHz and 20 GB RAM.

Figures 4 and 2 compare the average running time and the average number of AND-gates between the ROBDD and DNF approaches. These results are without the use of the ABC tool (i.e., the command "ABCMinimize(aigerFilePath)" was skipped). Figure 4 shows that the DNF approach was faster in all cases, which was expected, since generating a random ROBDD is much more complex than generating a truth table. Figure 2 shows that the ROBDD approach produces much smaller circuits in all cases. Figure 3 compares the average running time between the ROBDD and DNF approaches, including the time needed for the ABC tool to minimize the generated transition system. Benchmarks 8*.*8*.*4, 9*.*9*.*4, and 10*.*11*.*2 timed out for the DNF approach (we used a time limit of 10 h): the ABC tool needed a great deal of time to process these benchmarks. A thorough inspection revealed that the reason was, in addition to the huge size of these circuits, the extremely long chains of AND-gates generated for every Boolean function. The figure shows that the total running time of the tool was considerably lower with the ROBDD approach.

**The Effect of Parameters.** Although the benchmarks are randomly generated, AIGEN allows the user to choose the input parameters to obtain benchmarks with certain properties that correspond to their needs, for example:


To demonstrate the effect of these parameters, Table 2 shows the running times and results (realizable or unrealizable) of the synthesis tool *SimpleBDDSolver* in SyntComp 2019 on selected benchmarks generated using the ROBDD-based approach. SimpleBDDSolver had won all previous iterations of the SyntComp competition. A benchmark name contains the parameters that were used to generate the file; e.g., *random n 19 1 3 15 14 1 abc* means that the benchmark has in total 19 variables, with 1 controllable input, 3 uncontrollable inputs, 15 latches, and *o* = 14. The table shows that the example benchmarks with ratio *c/u* = 1*/*3 or *c/u* = 1*/*5 were unrealizable and the benchmarks with ratio *c/u* = 2 were realizable, while the benchmarks with ratio *c/u* = 1*/*2 were difficult for the tool, which timed out while trying to solve them. Note that a benchmark with *c/u* = 1*/*5 *can* still be realizable, and one with *c/u* = 2 *can* be unrealizable; it is just unlikely that this is the case for a randomly generated benchmark.


**Table 2.** Results of SimpleBDDSolver on selected random benchmarks generated by AIGEN in SyntComp 2019 [2]

# **7 Conclusion**

We have presented AIGEN, a tool for the generation of random transition systems in a symbolic representation, using either ROBDDs or CDNF for representing Boolean functions. Although the ROBDD-based approach generates much smaller symbolic transition systems, the CDNF approach is faster when the ABC minimization procedure is disabled: in contrast to the ROBDD approach, generating a random formula in CDNF requires no complex computation. However, when minimization is enabled, the huge size of these formulas becomes a problem for ABC, as it has to inspect all the generated AND-gates.

In future work, instead of using a fixed variable order, we will also allow the use of a random order. The drawback of a fixed order is that some Boolean functions only have a large ROBDD representation under that order, even though smaller ones exist for different orderings, and vice versa. Going further, we plan to include variable reordering techniques to find, at runtime, an order that leads to small ROBDDs. Finally, we also plan to investigate the use of AIGEN for finding bugs in verification and synthesis tools.

# **References**



# **GPU Acceleration of Bounded Model Checking with ParaFROST**

Muhammad Osama(B) and Anton Wijs

Eindhoven University of Technology, Eindhoven, The Netherlands *{*o.m.m.muhammad,a.j.wijs*}*@tue.nl

**Abstract.** The effective parallelisation of Bounded Model Checking (BMC) is challenging, as SAT and SMT solving are hard to parallelise. We present ParaFROST, the first tool to employ a graphics processor to accelerate BMC, in particular the simplification of SAT formulas before and repeatedly during the solving, known as pre- and inprocessing. The solving itself is performed by a single CPU thread. We explain the design of the tool, the data structures, and the memory management, the latter having been specifically designed to handle the SAT formulas typically generated for BMC, which are large and contain many redundant variables. Furthermore, the solver can make multiple decisions simultaneously. We discuss experimental results, having applied ParaFROST on programs from the Core C99 package of Amazon Web Services.

**Keywords:** Bounded model checking · SAT solving · GPU computing

# **1 Introduction**

Bounded Model Checking (BMC) [5] determines whether a model M satisfies a certain property ϕ expressed in temporal logic, by translating the model checking problem to a propositional satisfiability (SAT) problem or a Satisfiability Modulo Theories (SMT) problem. The term *bounded* refers to the fact that the BMC procedure searches for a counterexample to the property, i.e., an execution trace, which is bounded in length by an integer k. If no counterexample up to this length exists, k can be increased and BMC can be applied again. This process can continue until a counterexample has been found, a user-defined threshold has been reached, or it can be concluded (via k-induction [38]) that increasing k further will not result in finding a counterexample. CBMC [14] is an example of a successful BMC model checker that uses SAT solving. CBMC can check ANSI-C programs. The verification is performed by *unwinding* the loops in the program under verification a finite number of times, and checking whether the bounded

M. Osama—This work is part of the GEARS project with project number TOP2.16.044, which is (partly) financed by the Netherlands Organisation for Scientific Research (NWO).

**Fig. 1.** Variable redundancy in CBMC SAT formulas

executions of the program satisfy a particular safety property [22]. These properties may address common program errors, such as null-pointer exceptions and array out-of-bound accesses, and user-provided assertions.

The performance of BMC heavily relies on the performance of the solver. Over the last decade, efficient SAT solvers [3,6,17,26] have been developed and applied for BMC [5,10–12,25]. Effectively *parallelising* BMC is hard. Parallel SAT solving often involves running several solvers, each solving the problem in its own way [18]. For BMC, multiple solvers can be used to solve the problem for different values of the bound k in parallel [1,21]. However, in these approaches, the individual solvers are still single-threaded.

Recently, Leiserson *et al.* [23] concluded that in the future, advances in computational performance will come from *many-threaded* algorithms that can employ hardware with a massive number of processors. Graphics processors (GPUs) are an example of such hardware. Multi-threaded BMC model checkers have been proposed, such as in [13,19,35], but these address tens of threads, not thousands.

In this paper, we propose the application of GPUs to accelerate SAT-based BMC. To the best of our knowledge, this is the first time this is being addressed. Recently, GPUs have been applied for explicit-state model checking and graph analysis [8,9,40,41]. In SAT solving, we used GPUs to accelerate test pattern generation [31], metaheuristic search [42], *preprocessing* [32,33] and *inprocessing* [34]. In these operations, a given SAT formula is simplified, i.e., it is rewritten to a formula with fewer variables and/or clauses, while preserving satisfiability, using various simplification rules. In preprocessing, this is only done once before the solving starts, while in inprocessing, this is done periodically during the solving. While the benefit of accelerating these procedures has been demonstrated [34], their impact on BMC has not yet been addressed.

The structure of typical BMC SAT formulas suggests that GPU pre- and inprocessing will be effective. Figure 1a shows, for a BMC benchmark set taken from the Core C99 package of Amazon Web Services (AWS)<sup>1</sup> [2], consisting of 168 problems over various data structures, that propositional formulas produced by CBMC tend to have a substantial number of redundant variables that can

<sup>1</sup> We thank Daniel Kroening and Natasha Jebbo for pointing us to this package.

be removed using simplification procedures. For approximately 50% of the cases, 40% of the variables can be removed. Furthermore, Fig. 1b presents the amount of redundancy in relation to the total number of variables in the formula. It indicates that when a formula contains one million variables or more, at least 25% of those are redundant, and often many more. In the benchmark set, the maximum number of variables in one formula is 13 million (encoding the verification of the priority-queue shift-down routine), of which 65% are redundant. In contrast, the largest formula we encountered in the application track of the 2013–2020 SAT competitions that does not encode a verification problem only has 0.2 million variables (it encodes a graph coloring problem [29]).

*Contributions.* We present the SAT solver ParaFROST that applies Conflict Driven Clause Learning (CDCL) [26] with GPU acceleration of pre- and inprocessing [32–34], tuned for BMC. It has been implemented in CUDA C++ v11 [28], is based on CaDiCaL [6], and interfaces with CBMC.

Having to deal on a GPU with large formulas containing a lot of redundancy poses particular challenges. The elimination of variables typically leads to actually adding new clauses, and since the amount of memory on a GPU is limited, this cannot be done carelessly. Therefore, first, we have compacted the data structure used to store formula clauses in ParaFROST as much as possible, while still allowing for the application of effective solving optimisations. Second, we introduce *memory-aware* variable elimination, to avoid running out of memory due to adding too many new clauses. In practice, we encountered this problem when applying the original procedure of [34] to BMC.

Additionally, to support BMC, ParaFROST must be an *incremental* solver, i.e., it must exploit that a number of very similar SAT problems are solved in sequence [16]. The procedure in [34] does not support this, so we extended it.

Finally, because of the many variables in BMC SAT formulas, ParaFROST supports *Multiple Decision Making* (MDM) in the solving procedure, as presented in [30]. With MDM, multiple decisions can be made at once, periodically during the solving. When there are many variables, there is more potential to make many decisions simultaneously. We have generalised the original MDM decision procedure [30], making it easier to integrate MDM in solvers other than MiniSat and Glucose [3]. The effectiveness of MDM in BMC has never been investigated before, nor has it been combined with GPU pre- and inprocessing.

# **2 Background**

*SAT Solving.* We assume that SAT formulas are in conjunctive normal form (CNF). A CNF formula is a conjunction of m clauses C<sub>1</sub> ∧ ··· ∧ C<sub>m</sub>, and each clause C<sub>i</sub> is a disjunction of n literals ℓ<sub>1</sub> ∨ ··· ∨ ℓ<sub>n</sub>. A literal is a Boolean variable x or its negation ¬x, also referred to as x̄. The domain of all literals is L. A clause can be interpreted as a set of literals, i.e., {ℓ<sub>1</sub>,…,ℓ<sub>n</sub>} encodes ℓ<sub>1</sub> ∨ … ∨ ℓ<sub>n</sub>, and a SAT formula S as a set of clauses, i.e., {C<sub>1</sub>,…,C<sub>m</sub>} encodes C<sub>1</sub> ∧ … ∧ C<sub>m</sub>. With *Var*(C), we refer to the set of variables in C: *Var*(C) = {x | x ∈ C ∨ x̄ ∈ C}. The set S<sub>ℓ</sub> consists of all clauses in S containing ℓ: S<sub>ℓ</sub> = {C ∈ S | ℓ ∈ C}.

In CDCL, clauses are LEARNT or ORIGINAL. A LEARNT clause has been derived by the CDCL clause learning process during solving, and an ORIGINAL clause is part of the formula. We refer with L to the set of LEARNT clauses.

For a set of assignments Σ, consisting of all literals that have been assigned **true**, a formula S evaluates to **true** iff ∀C ∈ S. ∃ℓ ∈ C. ℓ ∈ Σ. When a *decision* is made, a literal is picked and added to Σ. Each assignment is associated with a *decision level* (time stamp) to monitor the assignment order. We call a clause C *unit* iff a single literal in it is still unassigned and the others are assigned **false**, i.e., |*Var*(C) \ *Var*(Σ)| = 1 and C ∩ Σ = ∅.
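These definitions translate directly into executable form. A small Python sketch (ours, not ParaFROST's code), using DIMACS-style integer literals where −x stands for x̄:

```python
def is_satisfied(formula, sigma):
    """S evaluates to true iff every clause contains a literal of Sigma."""
    return all(any(lit in sigma for lit in clause) for clause in formula)

def is_unit(clause, sigma):
    """A clause is unit iff exactly one of its variables is unassigned
    and no literal of the clause is assigned true (C intersect Sigma empty),
    so all other literals are assigned false."""
    assigned_vars = {abs(lit) for lit in sigma}
    unassigned = [lit for lit in clause if abs(lit) not in assigned_vars]
    return len(unassigned) == 1 and not any(lit in sigma for lit in clause)
```

For example, under Σ = {x₁}, the clause {x̄₁, x₃} is unit (x₃ must be assigned **true** to satisfy it), while {x₁, x₂} is already satisfied and therefore not unit.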

*Variable-Clause Elimination* (VCE). Variables and clauses can be removed from formulas by applying *simplification rules* [15,20]. They rewrite a formula to an equi-satisfiable one with fewer variables and/or clauses. Applying them is referred to as pre- and inprocessing, before and during the solving, respectively.

*Incremental Bounded Model Checking.* Since 2001, incremental BMC has been applied to hardware and software verification [16,39]. It relies on incremental SAT solving [16]. In CDCL, clauses are learnt during the solving each time a wrong decision has been made, to avoid making those decisions again in the future. Incremental SAT solving builds on this: when multiple SAT formulas with similar characteristics are solved sequentially, then in each iteration, the clauses learnt in previous iterations are reused. An efficient approach to add and remove clauses is by using *assumptions* [16], which are initial assignments.

For BMC, the transition relation of a system design and the (negation of the) property to be verified are encoded in a SAT formula. A predicate I(s<sub>0</sub>) identifies the initial states, δ(s<sub>i</sub>, s<sub>i+1</sub>) encodes the transition relation at trace depth i, and E(i) = ⋁<sub>0≤j≤i</sub> e(s<sub>j</sub>) encodes the reachability of an error state up to trace depth i, where e(s<sub>j</sub>) is **true** iff state s<sub>j</sub> is an error state. For incremental BMC, additional unit clauses σ<sub>i</sub> are used. These predicates are combined to define the following series of SAT formulas S(i) that must be solved incrementally:

$$\begin{aligned} \mathcal{S}(0) &= \mathcal{I}(s\_0) \wedge (\mathcal{E}(0) \vee \sigma\_0), \text{ under assumption } \neg \sigma\_0\\ \mathcal{S}(i+1) &= \mathcal{S}(i) \wedge \delta(s\_i, s\_{i+1}) \wedge \sigma\_i \wedge (\mathcal{E}(i+1) \vee \sigma\_{i+1}), \text{ under assumption } \neg \sigma\_{i+1} \end{aligned}$$

Formula S(i) is satisfiable iff an error state is reachable via a trace of length up to i [16,39]. At iteration i + 1, we know that E(i), included via S(i), cannot be satisfied (otherwise iteration i + 1 would not have been started). This means that E(i) must effectively be removed, to avoid S(i + 1) being unsatisfiable for that reason alone. To achieve this, σ<sub>i</sub> is assigned **true**, so that E(i) ∨ σ<sub>i</sub> is satisfied. In general, at iteration i, σ<sub>i</sub> is assigned **false**, while in all later iterations i′ > i, it is assigned **true**.
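To make the role of the σ literals concrete, here is a toy Python model (entirely ours, not ParaFROST's): a single Boolean state bit that toggles each step, starting at **false**, with "the state bit is true" as the error condition. Clauses are DIMACS-style integer sets, and a naive enumeration stands in for a real incremental SAT solver.

```python
from itertools import product

def solve(clauses, assumptions, nvars):
    """Naive complete SAT check by enumeration (illustration only)."""
    for bits in product([False, True], repeat=nvars):
        def val(lit):  # value of a literal under this assignment
            v = bits[abs(lit) - 1]
            return v if lit > 0 else not v
        if all(val(a) for a in assumptions) and \
           all(any(val(lit) for lit in c) for c in clauses):
            return True
    return False

# Variables: s0 = 1, s1 = 2, sigma0 = 3, sigma1 = 4.
# S(0) = I(s0) AND (E(0) OR sigma0), with I = "s0 is false" and e(s) = s.
S = [{-1}, {1, 3}]
assert not solve(S, [-3], 4)   # under assumption NOT sigma0: no error at depth 0

# S(1) = S(0) AND delta(s0, s1) AND sigma0 AND (E(1) OR sigma1),
# where delta encodes s1 = NOT s0, and E(1) = s0 OR s1.
S += [{1, 2}, {-1, -2}, {3}, {1, 2, 4}]
assert solve(S, [-4], 4)       # under assumption NOT sigma1: error at depth 1
```

Note how σ<sub>0</sub>, asserted as a unit clause at iteration 1, neutralises E(0) without deleting any clause, which is exactly what lets a real incremental solver keep its learnt clauses across iterations.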

*GPU Programming.* CUDA [28] is NVIDIA's parallel computing platform that can be used to develop general purpose GPU programs. A GPU consists of multiple streaming multiprocessors (SMs), and each SM contains several streaming processors (SPs). A GPU program consists of a *host* part, executed on a CPU, and *device* functions, or *kernels*, executed on a GPU. Each time a kernel is launched, the number of threads that need to execute it is given. On the SPs, the threads are executed. Compared to a CPU thread, GPU threads perform

**Fig. 2.** An activity diagram for the workflow of ParaFROST.

a relatively simple task. In particular, they read some data, perform a computation, and write the result. This allows the SPs to switch contexts easily. In practice, one to two orders of magnitude more threads are typically launched than the number of SPs, which results in hiding the memory latency: whenever a thread is waiting for some data, the associated SP can switch to another thread.

A GPU has various types of memory. Relevant here are *registers* and *global* memory. Global memory is used to copy data between the host and the device. *Registers* are used for on-chip storage of thread-local data. Global memory has a much higher latency than registers. We use *unified memory* [28] to store clauses. Unified memory creates one virtual memory pool for host and device. In this way, the same memory addresses can be used by the host and the device, combining the main memory of the host side and the global memory of the device side.

# **3 GPU-Accelerated Bounded Model Checking**

We implemented ParaFROST<sup>2</sup> with CUDA C++ v11. It is a hybrid CPU-GPU tool, with (sequential) solving done on the host side, and (parallel) VCE done on the device side. An interface with CBMC is implemented in C++. CBMC is patched to read a configuration file before ParaFROST is instantiated. This file contains all options supported by ParaFROST.

*The Workflow.* Figure 2 presents the general workflow of ParaFROST in the form of an activity diagram with host and device lanes. The diagram is focused on inprocessing; preprocessing works similarly on the device. First, the host performs a predetermined number of solving iterations. Once those have finished, and (un)satisfiability has not yet been proven, relevant clause data is copied to the global memory. To hide the latency of this operation as much as possible, clauses are copied asynchronously in batches. One batch is copied while the next is formatted for the GPU, as not all clause information on the host side is relevant for the device (see the next paragraph on data structures). On the device, signatures are computed for fast clause comparison, and the clauses are sorted for VCE (more on VCE later). Next, the device constructs a histogram, for fast lookup of clauses, and sorts the variables. The Thrust library is used

<sup>2</sup> The tool is available at https://gears.win.tue.nl/software/gpu4bmc.

for sorting.<sup>3</sup> After that, the host *schedules variables* for VCE, marking those variables in the global memory using unified memory. Next, the device applies VCE, marking clauses to be removed as DELETED. The host propagates units (literals in unit clauses are assigned **true**), which directly has an effect on the formula in the global memory. The VCE procedure is repeated until it has been performed a predetermined number of times. After each time, DELETED clauses are removed, and after the last iteration, this is done while the new clauses are copied to the host. Once this has been done, the overall procedure is repeated.

*Data Structures and Memory Management.* We have worked on making the storage of each clause in the GPU global memory as efficient as possible. However, we also wanted to annotate each clause with sufficient information for effective optimisations. In ParaFROST, the following information is stored for each clause:


In addition, a list of literals is stored, each literal taking 32 bits (1 bit to indicate whether it is negated or not, and 31 bits to identify the variable). In total, a clause requires 12 + 4t bytes, with t the number of literals in the clause. For comparison, MiniSat only requires 4 + 4t bytes, but it does not involve the used, lbd and sig fields, thereby not supporting the associated optimisations. CaDiCaL [6] uses 28 + 4t bytes, since it applies solving and VCE on the same structures. In ParaFROST, the GPU is only used for VCE, in which information for *probing* [24] and *vivification* [36], for instance, is irrelevant. Finally, in [34], 20 + 4t bytes are used, storing the same information as ParaFROST.
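The footprint comparison above is simple arithmetic; a quick sketch (header sizes in bytes as stated in the text, everything else ours):

```python
def clause_bytes(num_lits, header_bytes):
    """Total bytes for one clause: fixed header plus 4 bytes per 32-bit literal."""
    return header_bytes + 4 * num_lits

# Per-clause header sizes reported above, in bytes.
headers = {'MiniSat': 4, 'ParaFROST': 12, '[34]': 20, 'CaDiCaL': 28}
sizes = {name: clause_bytes(5, h) for name, h in headers.items()}
# A 5-literal clause: MiniSat 24, ParaFROST 32, [34] 40, CaDiCaL 48 bytes.
```

Over millions of clauses, the 8-byte saving per clause relative to [34] adds up to a substantial share of the 11 GB of global memory available on the GPU used in Sect. 5.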

To store a formula S, a clause array is preallocated in the global memory, and filled with the clauses of S. More space is allocated than the size of S, to allow the addition of clauses that result from VCE. As the amount of allocated space is the limiting factor for the addition of new clauses, we have developed a memory-aware VCE mechanism, which we explain later in the current section.

*Parallel* VCE. ParaFROST supports the VCE rules *substitution* (i.e., gate equivalence reasoning), *resolution* (RES), *subsumption elimination* (SUB) and *eager redundancy elimination* (ERE) [15,20]. Substitution applies to patterns representing logical gates, and substitutes the involved variables with their gate definitions. ParaFROST supports *AND/OR*, *Inverter*, *If Then Else* and *XOR*.

<sup>3</sup> https://docs.nvidia.com/cuda/thrust.

```
RES:  x ∪ C1, x̄ ∪ C2 ⇒ C1 ∪ C2                    (x ∪ C1 ∉ L ∧ x̄ ∪ C2 ∉ L)
SUB1: x ∪ C1 ∪ C2, x̄ ∪ C2 ⇒ C1 ∪ C2, x̄ ∪ C2
SUB2: C1 ∪ C2, C2 ⇒ C2                             (C2 ∈ L ⟹ L′ = L \ {C2})
ERE:  x ∪ C1, x̄ ∪ C2, C1 ∪ C2 ⇒ x ∪ C1, x̄ ∪ C2   ({x ∪ C1, x̄ ∪ C2} ∩ L ≠ ∅ ⟹ C1 ∪ C2 ∈ L)
```
**Fig. 3.** VCE rules in ParaFROST. *C*<sup>1</sup> and *C*<sup>2</sup> are non-empty sets of literals.

In Fig. 3, we provide rewrite rules for SUB and RES. If clauses exist in S of the form expressed by the left hand side of a rule, then the rule is applicable, and the involved clauses are replaced by the clauses (called *resolvents*) on the right hand side. RES is applicable if there are two clauses of the form x∪C<sup>1</sup> and x¯ ∪ C2, and applying it results in replacing those with a clause C<sup>1</sup> ∪ C2. SUB consists of two rules; the second is applied once the first is no longer applicable.

Conditions are given between parentheses. For RES, only ORIGINAL clauses are considered. Besides that, if C<sub>1</sub> ∪ C<sub>2</sub> evaluates to **true**, it is not actually created. As LEARNT clauses are sometimes deleted during solving, SUB2 should only produce ORIGINAL clauses; if C<sub>2</sub> is LEARNT before applying the rule, it will become ORIGINAL (L′ refers to the set of LEARNT clauses after application). For ERE, LEARNT clauses cannot cause the deletion of an ORIGINAL clause.
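The RES rule is easy to render in sequential Python (our sketch, not the GPU kernel): clauses are frozensets of integer literals, LEARNT clauses are excluded as the side condition requires, and tautological resolvents, which would evaluate to **true**, are not created.

```python
def resolve_on(x, formula, learnt):
    """Apply RES on variable x > 0: replace all ORIGINAL clauses containing
    x or -x by their non-tautological resolvents (sequential sketch)."""
    pos = [c for c in formula if x in c and c not in learnt]
    neg = [c for c in formula if -x in c and c not in learnt]
    keep = [c for c in formula if c not in pos and c not in neg]
    resolvents = []
    for c1 in pos:
        for c2 in neg:
            r = (c1 - {x}) | (c2 - {-x})
            if not any(-lit in r for lit in r):  # skip tautologies
                resolvents.append(r)
    return keep + resolvents
```

For instance, resolving {x₁, x₂} with {x̄₁, x₃} on x₁ yields {x₂, x₃}, while resolving {x₁, x₂} with {x̄₁, x̄₂} yields the tautology {x₂, x̄₂}, which is dropped.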

VCE is applied in parallel by ParaFROST by scheduling sets of *mutually independent* variables for analysis. Two variables x and y are independent in S iff S does not contain a clause with literals that refer to both variables, i.e., S<sub>x</sub> ∪ S<sub>x̄</sub> and S<sub>y</sub> ∪ S<sub>ȳ</sub> are disjoint. This ensures that two threads focussing on x and y, respectively, do not cause data races. In incremental solving, variables referred to by assumptions must be excluded from VCE. In each VCE iteration, a different set Ψ of variables is selected. This is achieved by using an upper bound μ on the number of occurrences of a variable in S. After each iteration, μ is increased, allowing the selection of more variables. ParaFROST supports configuring μ and the number of VCE iterations.
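The independence condition, and a greedy way of building a mutually independent set Ψ, can be sketched as follows (ours; ParaFROST's actual scheduling additionally uses the occurrence bound μ described above):

```python
def independent(x, y, formula):
    """x and y are independent iff no clause contains literals of both."""
    return not any({x, -x} & clause and {y, -y} & clause
                   for clause in formula)

def schedule(candidates, formula):
    """Greedily pick a mutually independent subset of candidate variables;
    a thread could then safely be assigned to each chosen variable."""
    chosen = []
    for v in candidates:
        if all(independent(v, w, formula) for w in chosen):
            chosen.append(v)
    return chosen
```

For S = {{x₁, x₂}, {x̄₂, x₃}}, variables x₁ and x₃ are independent, but x₁ and x₂ are not, so a greedy pass over [x₁, x₂, x₃] selects {x₁, x₃}.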

As already mentioned, clauses that can be removed are marked DELETED before they are removed. The removal of clauses is done once VCE has finished (see Fig. 2) to avoid data races. However, because of this, VCE may at first require more memory to store clauses. The clauses added during VCE must fit in the memory, otherwise the procedure fails. To ensure this, we have developed a memory-aware mechanism for VCE. Next, we explain this mechanism for the RES rule and substitution, as the application of those rules results in new clauses.

Algorithm 1 presents how RES and substitution are applied in ParaFROST. It requires S, stored in a clause array clauses. As clauses are of varying sizes, we need an array references that provides a reference to each clause. In addition, arrays varinfo, cindex and rindex are given, which are filled in the first lines.

At line 1, the kernel VceScan is called in which a different thread is assigned to each variable <sup>x</sup> <sup>∈</sup> <sup>Ψ</sup>. Each thread checks the applicability of VCE rules on its variable and computes the number of clauses and literals that will be produced by the first applicable rule. A thread with ID *i* stores the type τ of the applicable rule (NONE, RESOLVE, or SUBSTITUTE) and the number of clauses β and

#### **Algorithm 1:** Parallel memory-aware application of RES and substitution

```
Input : global Ψ, clauses, references, varinfo, cindex, rindex
 1 varinfo ← VceScan(Ψ, S)
 2 cindex ← computeClauseIndices(varinfo, size(clauses))
 3 rindex ← computeClauseRefIndices(varinfo, size(references))
 4 VceApply(Ψ, clauses, references, varinfo, cindex, rindex)
 5 kernel VceApply(Ψ, clauses, references, varinfo, cindex, rindex):
 6 for all i ∈ [0, |Ψ|) do in parallel
 7 register cidx ← cindex[i], ridx ← rindex[i]
 8 register τ, β, γ ← varinfo[i]
 9 if τ = RESOLVE ∧ memorySafe(ridx, cidx, β, γ) then
10 ResApply(clauses, references, x, ridx, cidx)
11 if τ = SUBSTITUTE ∧ memorySafe(ridx, cidx, β, γ) then
12 SubApply(clauses, references, x, ridx, cidx)
13 device function memorySafe(ridx, cidx, β, γ):
14 reqSpace ← cidx + 12 · β + 4 · γ // required number of bytes
15 if reqSpace > capacity(clauses) then return false
16 numRefs ← ridx + β // required number of clause references
17 if numRefs > capacity(references) then return false
18 return true
```
literals γ produced by that rule in one integer at varinfo[i]. At lines 2–3, kernels computeClauseIndices and computeClauseRefIndices are called to add up the β's and γ's to obtain offsets into the arrays references and clauses (the method size(A) refers to the amount of data in array A). Both methods apply a parallel exclusive prefix sum [37], involving the β's and γ's. The result is that thread i, assigned to x, is instructed to start writing clause references at references[rindex[i]] and clauses at clauses[cindex[i]] when applying the next VCE rule for x. Whether the data actually fits is checked later.

Next, the kernel VceApply is called (lines 5–12). To each variable in Ψ, a thread is assigned. It retrieves the precomputed data (lines 7–8) and either applies the RES rule (lines 9–10), substitution (lines 11–12), or nothing, in case τ = NONE. However, a condition for applying a rule is that there is enough space, which is checked using the device function memorySafe (lines 13–18). The amount of allocated space for A is reflected by capacity(A), and memorySafe checks if there is enough space in clauses, starting at *cidx* (lines 14–15). If there is, it is checked if the references can be stored in references (lines 16–17).
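Lines 2–3 of Algorithm 1 boil down to an exclusive scan over the per-variable resolvent sizes. A sequential Python stand-in (ours) for that offset computation and for the clause-array part of memorySafe:

```python
def exclusive_prefix_sum(xs):
    """Sequential stand-in for the parallel exclusive scan of [37]:
    out[i] is the sum of xs[0..i-1], with out[0] = 0."""
    out, acc = [], 0
    for v in xs:
        out.append(acc)
        acc += v
    return out

def clause_offsets(betas, gammas, used_bytes):
    """Byte offsets into the clause array where each variable's thread may
    start writing: each new clause costs 12 header bytes + 4 per literal."""
    sizes = [12 * b + 4 * g for b, g in zip(betas, gammas)]
    return [used_bytes + o for o in exclusive_prefix_sum(sizes)]

def memory_safe(cidx, beta, gamma, capacity_bytes):
    """Mirror of the device function memorySafe (clause array check only)."""
    return cidx + 12 * beta + 4 * gamma <= capacity_bytes
```

Because the scan assigns disjoint, precomputed write ranges, the threads in VceApply can emit their resolvents without locks; a thread whose range would overflow the preallocated capacity simply skips its rule application, which is exactly the memory-aware behaviour of lines 9–12.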

### **4 Multiple Decision Making in Incremental Solving**

Given the fact that BMC SAT formulas often have many variables, a recently proposed extension of CDCL [30], in which periodically multiple decisions are made (MDM) at the same time, has much potential to speed up BMC. When the MDM method is called, it constructs a set M = {ℓ ∈ L | *Var*({ℓ}) ∩ *Var*(Σ) = ∅} such that there does not exist a clause C ∈ S with |*Var*(C) \ *Var*(Σ ∪ M)| = 1. In other words, the decisions M do not lead to logical follow-up assignments, i.e., implications. The reason for this restriction is that implications may lead to conflicts (clauses that cannot be satisfied). When a single decision is made, this decision needs to be rolled back when a conflict is caused, but when multiple


#### **Algorithm 2:** The decide method of ParaFROST

```
 1 freevars ← Var(S) \ Var(Σ) // the unassigned variables
 2 if r > 0 then
3 M ← MDM(freevars, decqueue)
4 r ← r − 1, prevMDsize ← |M|
 5 else
6 M ← singleDecision(freevars, decqueue)
 7 if r = 0 ∧ |freevars| ≥ prevMDsize then
8 r ← periodicFuse(nConflicts, ConfFactor)
 9 return M
10 function periodicFuse(nConflicts, ConfFactor):
11 if nConflicts ≥ ConfFactor then
12 updateFactor(ConfFactor)
13 return mdmrounds
14 else
15 return 0
```
decisions are made, detecting which decisions actually cause a conflict is more difficult. Note that MDM cannot always make multiple decisions; implications are needed to solve a formula, so single decisions still have to be made frequently.

In [30], MDM was integrated into MiniSat and Glucose, and since multiple decisions should be selected periodically, a mechanism was proposed that decides when to make multiple decisions based on the solver restart policy. However, since solvers can differ greatly in this policy, we wanted to create an alternative mechanism not depending on this. ParaFROST is based on CaDiCaL [6], which has a very different restart policy compared to MiniSat and Glucose.

Algorithm 2 presents ParaFROST's decide method, which is called every time a decision must be made. Besides Σ and L, it is given a queue *decqueue*, in which the variables are ordered according to a decision heuristic. In ParaFROST, the heuristics Variable State Independent Decaying Sum (VSIDS) [27] and Variable Move-To-Front (VMTF) [7] are used in alternation; the latter was not used in [30]. decide also gets a variable *r*, initially set to the constant mdmrounds, which controls the periodic calls of MDM, each making a set of multiple decisions per round. Experiments have shown that mdmrounds = 3 is effective [30]. Finally, the number of conflicts so far (*nConflicts*), a variable *ConfFactor* used to switch MDM on and off, and a variable *prevMDsize*, storing the size of the most recent set of multiple decisions, are given.

To select new decisions, the set of unassigned variables is created at line 1. If MDM still has rounds left (line 2), then MDM is called again and r is decremented. Otherwise, a single decision is made (line 6). If we have stopped calling MDM and enough unassigned variables are present (line 7), the method periodicFuse is called, which sets *r* back to either mdmrounds or 0, depending on *nConflicts* (lines 10–15). There are enough unassigned variables if there are at least as many of them as there were variables in the most recent set of multiple decisions. In periodicFuse, *nConflicts* is compared to *ConfFactor*, which is initially set to a configurable value (default 2,000) and updated using a function updateFactor. This makes *ConfFactor* grow linearly, to achieve a suitable balance between *ConfFactor* and *nConflicts* as the solving progresses.
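The interplay of r, nConflicts and ConfFactor can be mimicked in a few lines of Python. This is our simplification of Algorithm 2's control flow, not ParaFROST's code, and the linear growth step of updateFactor is an assumed value:

```python
MDM_ROUNDS = 3  # mdmrounds in Algorithm 2

def periodic_fuse(n_conflicts, conf_factor, step=2000):
    """Re-arm MDM for MDM_ROUNDS rounds once enough conflicts occurred;
    grow the threshold linearly (the step size is an assumption)."""
    if n_conflicts >= conf_factor:
        return MDM_ROUNDS, conf_factor + step
    return 0, conf_factor

def decision_mode(r, n_conflicts, conf_factor, n_free, prev_md_size):
    """Return ('multi' | 'single', new r, new ConfFactor), mirroring the
    branches of Algorithm 2 in simplified form."""
    if r > 0:                       # MDM rounds left: make multiple decisions
        return 'multi', r - 1, conf_factor
    if n_free >= prev_md_size:      # enough unassigned variables remain
        r, conf_factor = periodic_fuse(n_conflicts, conf_factor)
    return 'single', r, conf_factor
```

So after the three MDM rounds are exhausted, the solver keeps making single decisions until the conflict count catches up with the growing threshold, at which point MDM is re-enabled for another three rounds.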

# **5 Benchmarks**

We conducted experiments with CBMC in combination with MiniSat (the default), Glucose, CaDiCaL, ParaFROST, ParaFROST with MDM, and a CPU-only version, referred to as ParaFROST (noGpu).<sup>4</sup> We used the AWS benchmarks in which the data structures hash table, array list, array buff, linked list, priority queue, byte cursor, and string were analysed. The loop-unwinding upper bounds 8, 16, 64, 128, and 1,000 were used, resulting in 168 different verification problems.

All experiments were executed on the DAS-5 cluster [4]. Each program was verified in isolation on a separate node, with a time-out of 3,600 s. Each node had an Intel Xeon E5-2630 CPU (2.4 GHz) with 64 GB of memory, and an NVIDIA RTX 2080 Ti, with 68 SMs (64 cores/SM) and 11 GB global memory.

Figure 4 presents the decision procedure runtime, and how much time was spent on VCE. ParaFROST outperforms all sequential solvers, including CaDiCaL (plot 4a). Even though ParaFROST is based on CaDiCaL, its different data structures, simplification mechanism, and parameters tuned for large formulas make ParaFROST more effective in these experiments. MDM further improves ParaFROST. Plot 4b demonstrates that CBMC with MiniSat often spends most of the time on VCE. ParaFROST significantly reduces the time spent on VCE compared to the other solvers.

In Table 1, the Verified column lists per solver the number of verified programs, and PAR-2 gives the *penalized average runtime-2* metric. The PAR-2 score accumulates the running times of all solved instances, adds twice the time-out for each unsolved one, and divides by the total number of formulas. The solver with the lowest score is the winner. The upward and downward triangles mean significantly better and worse, respectively. The MiniSat column lists how many programs were verified faster with the other solvers compared to MiniSat. Between parentheses, it is given how many of those programs were not solved by MiniSat at all. The final four columns serve the same purpose for the other solvers. For example, ParaFROST-MDM verified 123 programs faster than CaDiCaL, of which 12 could not be verified by the latter. The last two rows provide a similar comparison. Clearly, ParaFROST-MDM verified the largest number of programs, with the lowest score.
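The PAR-2 computation described above is small enough to state directly; the following is a generic sketch, not code from any of the evaluated tools.

```python
def par2(runtimes, timeout):
    """PAR-2 ("penalized average runtime-2"): sum the runtimes of
    solved instances, add 2x the timeout for each unsolved one
    (represented as None), and divide by the total instance count.
    Lower is better."""
    total = sum(t if t is not None else 2 * timeout for t in runtimes)
    return total / len(runtimes)

# Example: three instances, one timed out at 3,600 s.
# par2([10.0, 200.0, None], 3600) == (10 + 200 + 7200) / 3 == 2470.0
```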

Figure 5 presents the speedups of the ParaFROST configurations for the individual cases. Overall, SAT solving was accelerated effectively with ParaFROST and ParaFROST-MDM. Compared to ParaFROST (noGpu), ParaFROST (and ParaFROST-MDM) accelerated multiple instances by up to 18× (and 27×), and the geometric average speedup for all programs was 1.3× (and 1.6×).
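The geometric average quoted above is the *n*-th root of the product of the per-instance speedups, conventionally computed in log space for numerical stability; a minimal sketch:

```python
import math

def geomean(speedups):
    """Geometric mean of positive per-instance speedups."""
    return math.exp(sum(math.log(s) for s in speedups) / len(speedups))

# e.g. geomean([4.0, 1.0]) == 2.0
```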

<sup>4</sup> We also tried to use CBMC with Z3, but were not able to correctly configure this combination at the time of writing.

**Fig. 4.** CBMC runtimes for all solvers over the benchmark suite: (a) verification time (timeout: 3,600 s); (b) percentage of verification time used for VCE.

**Table 1.** CBMC performance analysis using the various solvers.


**Fig. 5.** Speedups of the individual cases.

# **6 Conclusion**

We have presented ParaFROST, the first tool to accelerate BMC using GPUs. Given that BMC formulas tend to have much redundancy, ParaFROST effectively reduces solving times with GPU pre- and inprocessing, and by using MDM, which is particularly effective when many variables are present. In the future, we will combine our approach with (existing) multi-threaded BMC. We expect these techniques to strengthen each other.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Pono: A Flexible and Extensible SMT-Based Model Checker**

Makai Mann<sup>1(B)</sup>, Ahmed Irfan<sup>1</sup>, Florian Lonsing<sup>1</sup>, Yahan Yang<sup>1,3</sup>, Hongce Zhang<sup>2</sup>, Kristopher Brown<sup>1</sup>, Aarti Gupta<sup>2</sup>, and Clark Barrett<sup>1</sup>

> <sup>1</sup> Stanford University, Stanford, USA
> {makaim,irfan,lonsing,barrett}@cs.stanford.edu, ksb@stanford.edu
> <sup>2</sup> Princeton University, Princeton, USA
> hongcez@princeton.edu, aartig@cs.princeton.edu
> <sup>3</sup> University of Pennsylvania, Philadelphia, USA
> yangy96@seas.upenn.edu

**Abstract.** Symbolic model checking is an important tool for finding bugs (or proving the absence of bugs) in modern system designs. Because of this, improving the ease of use, scalability, and performance of model checking tools and algorithms continues to be an important research direction. In service of this goal, we present Pono, an open-source SMT-based model checker. Pono is designed to be both a research platform for developing and improving model checking algorithms, as well as a performance-competitive tool that can be used for academic and industry verification applications. In addition to performance, Pono prioritizes transparency (developed as an open-source project on GitHub), flexibility (Pono can be adapted to a variety of tasks by exploiting its general SMT-based interface), and extensibility (it is easy to add new algorithms and new back-end solvers). In this paper, we describe the design of the tool with a focus on the flexible and extensible architecture, cover its current capabilities, and demonstrate that Pono is competitive with state-of-the-art tools.

# **1 Introduction**

Model checking [39,61] is an influential verification capability in modern system design. Its greatest success has been with finite-state systems, where propositional methods such as binary decision diagrams (BDDs) [28] and Boolean satisfiability (SAT) solvers [69] are used as verification engines. At the same time, significant efforts have been made to lift model checking techniques from finite-state to infinite-state systems [24,30,31,35,46,63]. This requires more expressive verification engines, such as solvers for satisfiability modulo theories (SMT) [19]. Proponents of SMT-based techniques argue that such techniques can also benefit

"Pono" is the Hawaiian word for proper, correct, or goodness. Our goal is that Pono can be a useful tool for people to verify the correctness of systems.

© The Author(s) 2021

A. Silva and K. R. M. Leino (Eds.): CAV 2021, LNCS 12760, pp. 461–474, 2021. https://doi.org/10.1007/978-3-030-81688-9\_22

finite-state systems, due to their ability to leverage word-level reasoning. Indeed, a word-level model checker won the most recent hardware model checking competition [22], giving credence to this claim. Despite these successes, there remain many directions for exploration in model checking. In this paper, we present Pono, an SMT-based model checking tool, with the goal of providing an open research platform for advancing these efforts.

Pono is designed with three use cases in mind: 1) *push-button verification*; 2) *expert verification*; and 3) *model checker development*. For 1, Pono provides competitive implementations of standard model checking algorithms. For 2, it exposes a flexible API, affording expert users fine-grained control over the tool. This can be useful in traditional model checking tasks (e.g., manually guiding the tool to an invariant, or adjusting the encoding for better performance), but it also enables the tool to be easily adapted for other tasks. In addition, Pono is designed using a completely generic SMT solver interface, making it trivial to experiment with different back-end solvers. For 3, Pono is open-source [7] and designed to be easily modifiable and extensible with a simple, modular, and hierarchical architecture. Taken together, these features make it relatively easy to do controlled experiments by comparing results obtained using Pono, while varying only the SMT solver or the model checking algorithm. Pono has already been used in a variety of research projects, both for model checking and other custom applications. It has also been used in two graduate-level courses at Stanford University, where students used both the command-line interface and the API. With this promising start, we hope it will have a long and productive existence supporting research, education, and industry.

# **2 Design**

Pono is designed around the manipulation and analysis of transition systems. A symbolic transition system is a tuple ⟨*X*, *I*, *T*⟩, where *X* is a set of (sorted) uninterpreted constants referred to as the current-state variables of the system and coupled with corresponding next-state variables *X′*; *I*(*X*) is a formula constraining the initial states of the system; and *T*(*X*, *X′*) is a formula expressing the transition relation, which encodes the dynamics of the system. The transition system representation provides a clean and general interface, allowing Pono to target both hardware and software model checking. Pono is designed to fully leverage the expressivity and reasoning power of modern SMT solving. Its formulas use the language and semantics of the SMT-LIB standard [17], and its model checking algorithms use an SMT solving oracle. To streamline the interaction with SMT solvers, Pono uses Smt-Switch [59], an open-source C++ API for SMT solving. Smt-Switch provides a convenient, efficient, and generic interface for SMT solving. Smt-Switch supports a variety of SMT solver back-ends and can switch between them easily.
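To make the tuple ⟨*X*, *I*, *T*⟩ concrete, consider a toy system with one state variable *c*, a counter that wraps at 10. For illustration only, *I* and *T* are given below as Python predicates over explicit states rather than SMT formulas (Pono represents them as Smt-Switch terms); the helper `reachable` shows how *I* and *T* together determine the reachable states.

```python
X = ("c",)                                     # current-state variables
I = lambda s: s["c"] == 0                      # I(X): initial states
T = lambda s, sp: sp["c"] == (0 if s["c"] == 9 else s["c"] + 1)  # T(X, X')

def reachable(k, domain=range(10)):
    """States reachable from I in at most k steps of T
    (explicit-state enumeration over a small finite domain)."""
    frontier = {c for c in domain if I({"c": c})}
    states = set(frontier)
    for _ in range(k):
        frontier = {cp for c in frontier for cp in domain
                    if T({"c": c}, {"c": cp})}
        states |= frontier
    return states
```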

The diagram in Fig. 1 displays the overall architecture of Pono. The blocks with a dashed outline are globally available and used throughout the codebase. The Pono API provides access to all of the components shown, supporting the design goal of giving expert users control and flexibility.

**Fig. 1.** Architecture diagram

**Core.** The TransitionSystem class in Pono represents symbolic transition systems as structured Smt-Switch terms. Key data structures include the following: i) inputvars: a vector of Smt-Switch symbolic constants representing primary inputs to the system (i.e., they are part of *X*, but their primed versions are not used and cannot appear in *T*); ii) statevars: a vector of Smt-Switch symbolic constants corresponding to the non-input state variables (the remaining variables in *X*); iii) next_map: a map from current-state (*X*) to next-state (*X′*) variables; iv) init: an Smt-Switch formula representing *I*(*X*); and v) trans: an Smt-Switch formula representing *T*(*X*, *X′*).
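The five fields above can be pictured for the wrap-at-10 counter as follows; this uses plain SMT-LIB strings in place of Smt-Switch terms, and the dataclass is a sketch following the description in this section, not Pono's actual class layout.

```python
from dataclasses import dataclass

@dataclass
class TransitionSystem:
    """Illustrative analogue of the described data structures."""
    inputvars: list   # primary inputs (part of X, never primed)
    statevars: list   # non-input state variables
    next_map: dict    # current-state -> next-state variable
    init: str         # I(X)
    trans: str        # T(X, X')

counter = TransitionSystem(
    inputvars=[],
    statevars=["c"],
    next_map={"c": "c'"},
    init="(= c #b0000)",
    trans="(= c' (ite (= c #b1001) #b0000 (bvadd c #b0001)))",
)
```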

There are two kinds of transition systems: RelationalTransitionSystem and FunctionalTransitionSystem. The former has no restrictions on the form of the transition relation, while the latter is restricted to only functional updates: an equality (update assignment) with a next-state variable on the left and a function of current-state and input variables on the right. Some model checking algorithms take advantage of this structure [46,47]. Built-in checks ensure compliance with the restrictions.

A Property is an Smt-Switch formula representing a property to check for invariance.<sup>1</sup> A ProverResult is an enum which can be one of the following: i) UNKNOWN (result could not be determined, including incompleteness due to checking only up to some bound); ii) FALSE (the property does not hold); iii) TRUE (the property holds); and iv) ERROR (there was an internal error). The Unroller is a class for producing unrolled transition systems, i.e., encoding a finite-length symbolic execution by introducing fresh variables for each timestep.
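The unrolling idea behind the Unroller can be sketched with plain SMT-LIB strings: each timestep *t* gets fresh copies of the state variables, and *I* and *T* are instantiated over consecutive copies. Pono builds in-memory Smt-Switch terms instead; the `@t` naming convention and the helper names here are assumptions of this sketch.

```python
def timed(var, t):
    """Fresh copy of var at timestep t."""
    return f"{var}@{t}"

def unroll(statevars, init_fn, trans_fn, k):
    """Build I(X_0) /\\ T(X_0, X_1) /\\ ... /\\ T(X_{k-1}, X_k).
    init_fn and trans_fn return SMT-LIB strings over the
    timestamped variable names."""
    x = lambda t: {v: timed(v, t) for v in statevars}
    parts = [init_fn(x(0))]
    for t in range(k):
        parts.append(trans_fn(x(t), x(t + 1)))
    return "(and " + " ".join(parts) + ")"

# A simple incrementing counter unrolled for 2 steps:
f = unroll(["c"],
           lambda x0: f"(= {x0['c']} 0)",
           lambda xi, xj: f"(= {xj['c']} (+ {xi['c']} 1))",
           2)
```

A bounded check then conjoins the negated property at step *k* and asks the SMT oracle for satisfiability.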

<sup>1</sup> Pono currently supports invariant checking. Support for temporal properties is left to future work.

**Engines.** Model checking algorithms are implemented as subclasses of the abstract class Prover and stored in the engines directory. We cover the current suite of engines in more detail in Sect. 3.

**Frontends.** Although users can manually build transition systems through the API, it is also convenient to generate transition systems from structured input formats. Pono includes the following frontends: i) BTOR2Encoder: uses the open-source btor2tools [2] library to read the BTOR2 [66] format for hardware model checking; ii) SMVEncoder: supports a subset of nuXmv's [30] SMT-based theory extension of SMV [61], which added support for infinite-state systems; iii) CoreIREncoder: encodes the CoreIR [11] circuit intermediate representation. Note that Verilog [10] can be supported by using a translator from Verilog to either BTOR2 or SMV. Examples of translators include Yosys [72] and Verilog2SMV [53], both of which are open-source.

**Printers.** Pono prints witness traces when a property does not hold. The supported formats are the BTOR2 witness format and the VCD standard format used by EDA tools [10]. For theories such as arithmetic that are not supported by these formats, Pono implements simple extensions, ensuring that all variable assignments are included in witness traces.

**Modifiers and Refiners.** Pono includes functions that perform various transformations on transition systems, including: adding an auxiliary variable [14]; building an implicit predicate abstraction [70]; and computing a static cone-of-influence reduction for a functional transition system under a given property. It also includes functions for refining an abstract transition system.

**Utils and Options.** utils contains a collection of general-purpose classes and functions for manipulating and analyzing Smt-Switch terms and transition systems. options contains a single class, PonoOptions, for managing command-line options.

**API.** Pono's native API is in C++. In addition, Pono has Python bindings that interact with the Smt-Switch Python bindings, both written in *Cython* [20]. These bindings behave very similarly to "pure" Python objects, allowing introspection and *pythonic* use of the API.

We follow best practices for modern C++ development and code quality maintenance, including issue tracking, code reviews, and continuous integration (via *GitHub Actions*). The build infrastructure is written in CMake [3] and is configurable. The Pono repository also provides helper scripts for installing its dependencies. We use GoogleTest [5] for unit testing and gperftools [12] for code profiling. Tests can be parameterized by both the SMT solver and the algorithm or type of transition system. We utilize *PyTest* [9] to manage and parameterize unit tests for the Python bindings.

# **3 Capabilities**

In this section, we highlight some key capabilities of Pono. The design makes use of abstract interfaces and inheritance to make it easy to add or extend functionality. Base class implementations of core functionality are provided but are kept simple to prioritize readability and transparency. And, of course, they can be overridden using inheritance and virtual functions.

We start by describing the interface and engines provided for push-button verification. Next, we take a closer look at two ways that the basic architecture can be extended. We then show how to use Pono to reason about a transition system using algebraic datatypes, demonstrating the expressive power provided by the SMT back-end.

**Main Engines.** All model checking algorithms in Pono are derived classes of the abstract base class Prover. The base class defines a simple public interface through a set of virtual functions:


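The C++ listing of that interface is not reproduced here. Based on the description in this section, a minimal Python analogue of Prover and ProverResult (Sect. 2) might look like the following; the method name `check_until` and the constructor shape are assumptions, not Pono's actual API.

```python
from abc import ABC, abstractmethod
from enum import Enum

class ProverResult(Enum):
    UNKNOWN = 0  # result could not be determined (e.g. bounded check)
    FALSE = 1    # the property does not hold
    TRUE = 2     # the property holds
    ERROR = 3    # internal error

class Prover(ABC):
    """Sketch of an abstract engine base class: each model checking
    algorithm subclasses it and implements the check method."""

    def __init__(self, transition_system, prop):
        self.ts = transition_system
        self.prop = prop

    @abstractmethod
    def check_until(self, k):
        """Check the property up to bound k; return a ProverResult."""
```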
Pono has several engines, all of which have been lifted to the SMT level. We now list the main engines and include the corresponding lines of code (LoC) in the primary source file (the LoC includes all comments and license headers): 1. Bounded Model Checking [21] (88 LoC); 2. K-Induction [68] (161 LoC); 3. Interpolant-based Model Checking [62] (230 LoC); 4. IC3-style algorithms [25] (see below for LoC). The engines leverage the reusable infrastructure described in Sect. 2 (e.g., the Unroller for the unrolling-based techniques).

**IC3 Variants.** IC3 is widely recognized as one of the best-performing algorithms for SAT-based model checking [43]. Liftings to SMT are an area of active research and have produced several variations with promising results [23,24,34,35,47,51,54,55,71]. To support this active research direction, Pono includes a special IC3 base class IC3Base, which implements a framework common to all variations of the algorithm.<sup>2</sup> The framework has several parameters that can be provided by specific instances of the algorithm: IC3Formula is a configurable data structure used to represent formulas constraining IC3 frames; inductive generalization is the method used for inductive generalization; predecessor generalization

<sup>2</sup> For details on how the IC3 algorithm works, we refer the reader to [25,43].

is the method used for predecessor generalization; and abstract and refine are methods that can be implemented for abstraction-refinement approaches to IC3 [35,47]. The implementation of IC3Base is 1086 lines of code. Current instantiations of IC3Base implemented in Pono include: i) IC3: a standard Boolean IC3 implementation [25,43] (152 LoC); ii) IC3Bits: a simple extension of IC3 to bit-vectors, which learns clauses over the individual bits (113 LoC); iii) Model-based IC3: a naive implementation of IC3 lifted to SMT, which learns clauses of equalities between variables and model values (397 LoC); iv) IC3IA: IC3 via Implicit Predicate Abstraction [35] (456 LoC); v) IC3SA: a basic implementation of IC3 with Syntax-Guided Abstraction for hardware verification [47] (984 LoC); vi) SyGuS-PDR: a syntax-guided synthesis approach for inductive generalization targeting hardware designs [73] (1047 LoC).

**Counterexample-Guided Abstraction Refinement (CEGAR).** CEGAR [57] is a popular framework for iteratively solving difficult model checking problems. It is typically parameterized by the underlying model checking algorithm, which operates on an abstract system that is iteratively refined as needed. Pono provides a generic CEGAR base class, parameterized by a model checking engine through a template argument. We describe two example uses of the CEGAR infrastructure implemented in Pono.

*Operator Abstraction.* This simple CEGAR algorithm uses uninterpreted functions (UF) to abstract potentially expensive theory operators (e.g. multiplication). The implementation is parameterized by the set of operators to replace with UFs. The refinement step analyzes a counterexample trace by restoring the concrete theory operator semantics. If the trace is found to be spurious, constraints are added to enforce the real semantics for the abstracted operators (e.g., equalities between certain abstract UFs and their theory operator counterparts), thus ruling out the spurious counterexample.
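The generic CEGAR loop underlying both of these uses can be sketched as a higher-order function; in Pono it is a C++ base class parameterized by the engine via a template argument, so every name below is an illustrative stand-in rather than the tool's API.

```python
def cegar(check, abstract, refine, system, prop, max_iters=100):
    """Generic CEGAR skeleton.

    check(abs_sys, prop)  -> (holds: bool, counterexample or None)
    abstract(system)      -> initial abstraction of the system
    refine(system, abs_sys, cex) -> refined abstraction, or None
                             when the counterexample is concrete."""
    abs_sys = abstract(system)
    for _ in range(max_iters):
        holds, cex = check(abs_sys, prop)
        if holds:
            return True, None    # holds on the over-approximation,
                                 # hence on the concrete system
        refined = refine(system, abs_sys, cex)
        if refined is None:
            return False, cex    # counterexample is real
        abs_sys = refined        # rule out the spurious trace
    return None, None            # inconclusive
```

For operator abstraction, `refine` would restore the concrete theory semantics along the trace and, if the trace is spurious, add constraints enforcing the real semantics of the abstracted operators.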

*Counterexample-Guided Prophecy.* This CEGAR approach replaces array variables with initially memoryless variables of uninterpreted sort and replaces the select and store array operators with UFs [58]. Due to the array theory semantics, it is not always possible to remove spurious counterexamples with quantifier-free refinement axioms over existing variables. However, instead of using potentially expensive quantifiers, the algorithm adds auxiliary variables (history and prophecy variables) [14], which can rule out spurious counterexamples of a given finite length. This approach has the effect of removing the need for array solving and can sometimes prove properties using prophecy variables that would otherwise require a universally quantified invariant.

**Case Study with Algebraic Datatypes.** To illustrate the flexibility of Pono's SMT-based formalism, we next describe a case study with generalized algebraic theories (GATs) [29]. GATs are a rich formalism which can be used for high-level specifications of software or mathematical constructs. While the equality of two terms in a GAT is undecidable, one can ask the bounded question: "Does there exist a path of up to *n* rewrites to take a source term to a target term?"

To model this question, we use algebraic datatypes to represent dependently-typed abstract syntax trees (ASTs), paths through an AST (e.g., the 2nd argument of the 3rd argument of a term's 1st argument), and rewrite rules (e.g., *succ*(*n*+1) = *succ*(*m*+1) ≡ *succ*(*n*) = *succ*(*m*)). Smt-Switch supports algebraic datatypes through the CVC4 [18] back-end. A rewrite function is encoded as a transition relation. The decision of which rule to apply and at which subpath to apply it is controlled by input variables, and a state variable represents the current AST term (initially set to the source term). We check the property that the target term is not reachable from the source term. Consequently, any discovered counterexample is a valid rewrite sequence, serving as a proof of an equality that holds in the theory.

The workflow accepts a GAT input, produces an SMT encoding optimized for that particular theory, and then parses user-provided source and target terms into this theory before running bounded model checking. We used Pono to successfully find equalities in the theories of Boolean algebras, preorders, monoids, categories, and read-over-write arrays. This case study demonstrates Pono's ability to model and model check unconventional systems.

# **4 Related Work**

Existing academic model checkers span a wide range of supported theories, modeling capabilities, and implemented algorithms. An important early model checker was SMV [61], which pioneered *symbolic* model checking of temporal logic properties [67] through BDDs [28]. NuSMV [32] and NuSMV2 [33] refined and extended the tool, followed by nuXmv [30] – a closed-source tool which added support for various SMT-based verification techniques using the SMT solver Math-SAT5 [36]. Spin [52] is a well-known explicit-state model checker with extensive support for partial order reduction and other optimizations.

Several model checkers specifically target hardware verification. ABC [26] is a well-established, state-of-the-art bit-level hardware model checker based on SAT solving. CoSA [60] is an open-source model checker implemented in Python using the Python solver-agnostic SMT solving library, PySMT [45]. Although CoSA also relies on a generic API similar to Smt-Switch, the Python implementation introduces significant overhead, limiting its ability to include efficient procedures that must be implemented outside of the underlying SMT solver (e.g., CEGAR loops and some IC3 variants). AVR [48] is a state-of-the-art SMT-based hardware model checker supporting several standard model checking algorithms. It also implements a novel technique: IC3 via syntax-guided abstraction [47]. Importantly, AVR won the hardware model checking competition in 2020 [22], outperforming the previous state-of-the-art SAT-based model checker, ABC. AVR is currently closed-source, making it unsuitable for several of the use-cases targeted by our work, but a binary is available on GitHub [1].

There are several SMT-based model checkers focused on parameterized protocols. MCMT [46], the open-source extension Cubicle [49], and related systems [15,16] perform backward-reachability analysis over infinite-state arrays.

Other open-source SMT-based model checkers include: i) ic3ia [13] – an example implementation of IC3IA built on MathSAT [36]; ii) Kind2 [31] – a model checker for Lustre programs; iii) Sally [42] – a model checker for infinite-state systems that uses the SAL language [65] and MCMT, an extension of the SMT-LIB text format for declaring transition systems; iv) Spacer [56] – a Constrained Horn Clauses (CHC) solver built into the open-source Z3 [64] SMT solver, also based on an IC3-style algorithm; and v) Intrepid [27] – a model checker focusing primarily on the control engineering domain.

Pono is open-source, SMT-based, and implements a variety of model checking algorithms over transition systems. Furthermore, in contrast to the tools that focus on more limited domains, it has support for a wide set of SMT theories, including fixed-width bit-vectors, arithmetic, arrays, and algebraic datatypes. To our knowledge, all current open-source SMT-based model checkers tie the implementation directly to an existing SMT solver, or use PySMT or the SMT-LIB text format to interact with arbitrary solvers. In contrast, Pono makes use of the C++ API of Smt-Switch to efficiently manipulate SMT terms and solvers in memory without the need for a textual interface. This allows Pono to provide both flexibility and performance. Finally, like the new model checker Intrepid, Pono provides an extensive API, which can be adapted and extended as needed. However, its focus is broader than Intrepid's in terms of application domains.

# **5 Evaluation**

In this section, we evaluate Pono<sup>3</sup> against current state-of-the-art model checkers across several domains. Our evaluation is not intended to be exhaustive. Rather, we highlight the breadth of Pono by selecting four sets of benchmarks in three diverse categories and a few reasonable competitors for each. The benchmarks are drawn from the following theories: i) unbounded quantifier-free arrays indexed by integers; ii) quantifier-free linear arithmetic over reals and integers; and iii) hardware verification over quantifier-free bit-vectors and (finite, bit-vector indexed) arrays. We ran all experiments on a 3.5 GHz Intel Xeon E5-2637 v4 CPU with a timeout of 1 h and a memory limit of 16 GB. For all results, we also include the average runtime of solved instances in seconds. For portfolio solving, we ran each configuration in its own process with the full time and memory resources. In the first two categories, Pono used MathSAT5 [36] as the underlying SMT solver and interpolant [37,40,62] producer. For the hardware benchmarks, it used MathSAT5, Boolector [66], or both, depending on the configuration.

**Arrays.** We evaluate Pono on the integer-indexed array benchmark set of [44]. These are Constrained Horn Clauses (CHC) benchmarks inspired by software verification problems. Although there are no quantifiers in the benchmarks themselves, most cannot be proved safe without strengthening the property with quantified invariants. We compare against: i) freqhorn [44], a state-of-the-art CHC solver for this type of problem; ii) prophic3 [8], a recent method that

<sup>3</sup> GitHub commit c175a302857ff00229a0919d5cc8fc3f78d04a26.


**Fig. 2.** Results on Freqhorn Array benchmarks (81 total), all expected to be safe.


**Fig. 3.** Results on arithmetic benchmarks.

outperforms freqhorn [58]; and iii) nuXmv, which does not support quantified invariants, to illustrate that most of these benchmarks do require them; freqhorn takes the CHC format natively, and we used scripts from the ic3ia and nuXmv distributions to translate the CHC input to SMV and the Verification Modulo Theories (VMT) format [38] – an annotated SMT-LIB file representing a transition system – for the other tools. We ran Pono with Counterexample-Guided Prophecy using IC3IA as the underlying model checking technique. We ran prophic3 with both of the option sets used in their paper, and we ran the default configuration of freqhorn. Our results are shown in Fig. 2. We observe that Pono solves the same number of benchmarks as the reference implementation prophic3 and is a bit faster.

**Arithmetic.** We next evaluate Pono on two sets of arithmetic benchmarks, both from the nuXmv distribution's example directory. The first uses linear real arithmetic, and the second uses linear integer arithmetic. Figure 3 displays the results on both benchmark sets.

*Linear Real Arithmetic.* We chose the systemc QF_LRA example benchmarks, because this is the largest set of linear real arithmetic benchmarks in the subset of SMV supported by Pono.<sup>4</sup> We ran both nuXmv and Pono with BMC and IC3IA in a portfolio. For both model checkers, BMC did not contribute any unique solves. We observe that Pono is quite competitive with nuXmv on nuXmv's own benchmarks.

*Linear Integer Arithmetic.* We also evaluate Pono on a set of Lustre benchmarks which use quantifier-free linear integer arithmetic. We obtained the Lustre benchmarks from the Kind [50] website [6] and the SMV translation of the benchmarks from the distribution of nuXmv. We compare against both nuXmv and Kind2 [31], the latest version of Kind. We ran all tools with a portfolio of techniques. For Pono and nuXmv we ran BMC and IC3IA. For Kind2 we ran two configurations suggested by the authors: the default configuration with Z3 [64] and the default configuration, but with Yices2 [41] as the main SMT solver. Since the default

<sup>4</sup> Pono does not yet support enumeration types.


**Fig. 4.** Results on HWMCC2020 benchmarks.

configurations of Kind2 run 8 techniques in parallel, we gave each configuration 8 cores. Additionally, we ran Kind2's BMC and IC3 implementations using MathSAT5 as the SMT solver, because this is closest to the other model checkers' configurations. The default with Z3 was the best configuration of Kind2. We observe that Pono solves the most benchmarks overall. Once again, BMC contributed no unique solves for any model checker.

**Hardware Verification.** Finally, we evaluate Pono on the 2020 Hardware Model Checking Competition (HWMCC) benchmarks. The benchmarks are split into bitvector-only and bitvector plus array categories. We evaluate against AVR [1,48] and CoSA2 [4] (a previous name and version of Pono), the winners of HWMCC 2020 and HWMCC 2019, respectively. We also compare against sygus-apdr (the reference implementation of SyGuS-PDR [73]) on the bitvector benchmarks (as sygus-apdr targets bitvectors). We ran all 16 configurations of AVR from their HWMCC 2020 entry: several configurations of BMC and k-induction, and 11 configurations of IC3SA. We ran the 4 configurations of CoSA2 from the HWMCC 2019 entry: two BMC configurations, k-induction, and interpolant-based model checking. We ran sygus-apdr with 4 different parameters controlling the grammar for lemmas. For the bitvector-only benchmarks, we ran Pono with 10 configurations: 3 configurations of IC3IA, 2 configurations of IC3SA, 2 configurations of SyGuS-PDR, IC3Bits, k-induction, and BMC. For the array benchmarks, we ran 5 configurations: 3 configurations of IC3IA (one with Counterexample-Guided Prophecy), k-induction, and BMC. We show our results on the HWMCC 2020 benchmarks in Fig. 4. AVR wins in both categories, although Pono is fairly competitive, outperforming the other tools.

These results show that Pono is well on its way to being both widely applicable and performance-competitive. The arithmetic experiments demonstrate the capabilities of its IC3IA engine, but other engines have some room for improvement. In particular, both IC3SA and SyGuS-PDR were recently added to Pono, and its implementations of these algorithms still lag the corresponding implementations in AVR and sygus-apdr, respectively. There are also some features that are known to help performance and are not yet implemented in Pono. For example, the best configurations of AVR use UF data abstraction. This differs from our UF operator abstraction in that it replaces all abstracted data with uninterpreted sorts and learns targeted data refinement axioms.

# **6 Conclusion**

We have presented Pono: a new open-source, SMT-based, and solver-agnostic model checker. We described its capabilities, design, and the emphasis on flexibility and extensibility in addition to performance. We demonstrated empirically that the suite of model checking algorithms is competitive with state-of-the-art tools. Pono has already been used in several research projects and two graduate-level classes. With this promising start, we believe that Pono is poised to have an enduring and beneficial impact on research, education, and model checking applications. Future work includes adding support for temporal properties [67] and improving and adding to Pono's engines, in particular the IC3 variants.

**Acknowledgements.** This work was partially supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-1656518. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. This work was also supported by the Defense Advanced Research Projects Agency, grants FA8650-18-1-7818 and FA8650-18-2-7854. We thank these sponsors and our industry collaborators for their support.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Logical Foundations**

# Towards a Trustworthy Semantics-Based Language Framework via Proof Generation

Xiaohong Chen1(B) , Zhengyao Lin<sup>1</sup> , Minh-Thai Trinh<sup>2</sup> , and Grigore Roşu<sup>1</sup>

> <sup>1</sup> University of Illinois at Urbana-Champaign, Champaign, USA {xc3,zl38,grosu}@illinois.edu <sup>2</sup> Advanced Digital Sciences Center, Illinois at Singapore, Singapore, Singapore trinhmt@illinois.edu

Abstract. We pursue the vision of an *ideal language framework*, where programming language designers only need to define the formal *syntax* and *semantics* of their languages, and all language tools are automatically generated by the framework. Due to the complexity of such a language framework, it is a big challenge to ensure its trustworthiness and to establish the correctness of the autogenerated language tools. In this paper, we propose an innovative approach based on *proof generation*. The key idea is to generate proof objects as correctness certificates for each individual task that the language tools conduct, on a case-by-case basis, and use a trustworthy proof checker to check the proof objects. This way, we avoid formally verifying the entire framework, which is practically impossible, and thus can make the language framework both *practical* and *trustworthy*. As a first step, we formalize program execution as mathematical proofs and generate their complete proof objects. The experimental result shows that the performance of our proof object generation and proof checking is very promising.

Keywords: Semantic framework · Proof generation · Proof checking

# 1 Introduction

Unlike natural languages that allow vagueness and ambiguity, programming languages must be precise and unambiguous. Only with rigorous definitions of programming languages, called the *formal semantics*, can we guarantee the reliability, safety, and security of computing systems.

Our vision is thus an *ideal language framework* based on the formal semantics of programming languages. Shown in Fig. 1, an ideal language framework is one where language designers only need to define the formal syntax and semantics of their language, and all language tools are automatically generated by the framework. The *correctness* of these language tools is established by generating complete mathematical proofs as certificates that can be automatically machine-checked by a trustworthy proof checker.

Fig. 1. An ideal language framework vision; language tools are autogenerated, with machine-checkable mathematical proofs as correctness certificates.

The K language framework (https://kframework.org) is in pursuit of the above ideal vision. It provides a simple and intuitive front end language (i.e., a meta-language) for language designers to define the formal syntax and semantics of other programming languages. From such a formal language definition, the framework automatically generates a set of language tools, including a parser, an interpreter, a deductive verifier, and a program equivalence checker, among many others [9,24]. K has obtained much success in practice, and has been used to define the complete executable formal semantics of many real-world languages, such as C [12], Java [2], JavaScript [21], Python [13], the Ethereum virtual machine bytecode [15], and x86-64 [10], from which their implementations and formal analysis tools are automatically generated. Some commercial products [14,18] are powered by these autogenerated implementations and/or tools.

What is *missing* in K (compared to the ideal vision in Fig. 1) is its ability to generate proof objects as correctness certificates. The current K implementation is a complex artifact with over 500,000 lines of code written in 4 programming languages, with new code committed on a weekly basis. Its code base includes complex data structures, algorithms, optimizations, and heuristics to support the various features such as defining formal language syntax using BNF grammar, defining computation configurations as constructor terms, defining formal semantics using rewrite rules, specifying arbitrary evaluation strategies, and defining the binding behaviors of binders (Sect. 3). The large code base and rich features make it challenging to formally verify the correctness of K.

Our main contribution is the proposal of a *practical approach* to establishing the correctness of a complex language framework, such as K, via *proof object generation*. Our approach consists of the following main components:


The key idea that makes our approach practical is that we establish the correctness not for the entire framework, but for each individual language task that it conducts, on a case-by-case basis. This idea is not limited to K but is also applicable to existing language frameworks and/or formal semantics approaches.

As a first step, we formalize *program execution* as mathematical proofs and generate their complete proof objects. The experimental results (Table 1) show promising performance of proof object generation and proof checking. For example, for a 100-step program execution trace, the complete proof object has 1.6 million lines of code and takes only 5.6 s to proof-check.

We organize the rest of the paper as follows. We give an overview of our approach in Sect. 2. We introduce K and discuss the generation of proof parameters in Sect. 3. We discuss *matching logic*—the logical foundation of K—in Sect. 4. We then compile K to matching logic in Sect. 5, and discuss *proof object generation* in Sect. 6. We discuss the limitations of our current implementation and show the experiment results in Sects. 7 and 8, respectively. Finally, we discuss related work in Sect. 9 and conclude the paper in Sect. 10.

# 2 Our Approach Overview

We give an overview of our approach via the following four main components: (1) a logical foundation of K, (2) proof parameters, (3) proof object generation, and (4) a trustworthy proof checker.

Logical Foundation of K. Our approach is based on *matching logic* [5,22]. Matching logic is the *logical foundation* of K, in the following sense:


$$
\varphi\_{init} \Rightarrow \varphi\_{final} \tag{1}
$$

where ϕ*init* is the formula that specifies the initial state of the execution, ϕ*final* specifies the final state, and "⇒" states the rewriting/reachability relation between states (see Sect. 5.1).

3. There exists a matching logic *proof system* that defines the provability relation between theories and formulas. For example, the correctness of the above execution from ϕ*init* to ϕ*final* is witnessed by the formal proof:

$$
\Gamma^L \vdash \varphi\_{init} \Rightarrow \varphi\_{final} \tag{2}
$$

Therefore, matching logic is the logical foundation of K. The *correctness* of K conducting one language task is reduced to the *existence of a formal proof* in matching logic. Such formal proofs are encoded as proof objects, discussed below.

Proof Parameters. A proof parameter is the necessary information that K should provide to help generate proof objects. For program execution, such as Eq. (2), the proof parameter includes the following information:


In other words, a proof parameter of a program execution trace contains the complete information about how such an execution is carried out by K. The proof parameter, once generated by K, is passed to the proof object generator to generate the corresponding proof object, discussed below.

Proof Object Generation. In our approach, a proof object is an encoding of matching logic formal proofs, such as Eq. (2). Proof objects are generated by a proof object generator from the proof parameters provided by K. At a high level, a proof object for program execution, such as Eq. (2), consists of:


Our proof objects have a *linear structure*, which implies a nice separation of concerns. Indeed, Item 1 is only about matching logic and is *not specific* to any programming languages/language tasks, so we only need to develop and proof-check it *once and for all*. Item 2 is specific to the language semantics Γ *<sup>L</sup>* but is independent of the actual program executions, so it can be reused in the proof objects of various program executions for the same programming language L.

A Trustworthy Proof Checker. A proof checker is a small program that checks whether the formal proofs encoded in a proof object are correct. The proof checker is the main trust base of our work. In this paper, we use Metamath [20], a third-party proof-checking tool that is simple, fast, and trustworthy, to formalize matching logic and encode its formal proofs.

Fig. 2. The complete K formal definition of an imperative language IMP.

Summary. Our approach to establishing the correctness of K is based on its logical foundation—matching logic. We formalize language semantics as logical theories, and program executions as formulas and proof goals, whose proof objects are automatically generated and proof-checked. Our proof objects have a linear structure that allows easy reuse of their components. The key characteristics of our logical-based approach are the following:


# 3 K Framework and Generation of Proof Parameters

### 3.1 K Overview

K is an effort in realizing the ideal language framework vision in Fig. 1. An easy way to understand K is to look at it as a meta-language that can define other programming languages. In Fig. 2, we show an example K language definition of an imperative language IMP. In the 39-line definition, we *completely* define the formal syntax and the (executable) formal semantics of IMP, using a front end language that is easy to understand. From this language definition, K can generate all language tools for IMP, including its parser, interpreter, verifier, etc.

We use IMP as an example to illustrate the main K features. There are two *modules*: IMP-SYNTAX defines the syntax and IMP defines the semantics using rewrite rules. Syntax is defined as BNF grammars. The keyword syntax introduces production rules, which can have attributes specifying additional syntactic and/or semantic information. For example, the syntax of if -statements is defined in lines 11–12 and has the attribute [strict(1)] , meaning that the evaluation order is strict in the first argument, i.e., the condition of an if -statement.

In the module IMP , we define the *configurations* of IMP and its formal semantics. A configuration (lines 23–25) is a constructor term that has all semantic information needed to execute programs. IMP configurations are simple, consisting of the IMP code and a program state that maps variables to values. We organize configurations using *(semantic) cells*: </k> is the cell of IMP code and </state> is the cell of program states. In the initial configuration (lines 24–25), </state> is empty and </k> contains the IMP program that we pass to K for execution (represented by the special K variable \$PGM ).

We define formal semantics using *rewrite rules*. In lines 26–27, we define the semantics of variable lookup, where we match on a variable X in the </k> cell and look up its value I in the </state> cell, by matching on the binding X → I . Then, we rewrite X to I , denoted by X ⇒ I in the </k> cell in line 26. Rewrite rules in K are similar to those in rewrite engines such as Maude [7].

A Running Example. IMP is too complex to use as a running example, so we introduce a simpler one: TWO-COUNTERS . Although simple, TWO-COUNTERS still uses the core features of defining formal syntax as grammars and formal semantics as rewrite rules.


TWO-COUNTERS is a tiny language that defines a state machine with two counters. Its computation configuration is simply a pair ⟨m, n⟩ of two integers m and n, and its semantics is defined by the following (conditional) rewrite rule:

$$
\langle m, n \rangle \Rightarrow \langle m-1, n+m \rangle \qquad \text{if } m > 0 \tag{3}
$$

Therefore, each TWO-COUNTERS step increases n by m and decreases m by 1. Starting from the initial state ⟨m, 0⟩, TWO-COUNTERS carries out m execution steps and terminates at the final state ⟨0, m(m + 1)/2⟩, where m(m + 1)/2 = m + (m − 1) + ··· + 1.
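The rule's behavior can be simulated directly; a minimal Python sketch of our own (not K-generated code):

```python
# Sketch of the TWO-COUNTERS semantics: repeatedly apply the single
# conditional rewrite rule  <m, n> => <m - 1, n + m>  if m > 0
# until its side condition fails.

def run(m, n):
    trace = [(m, n)]
    while m > 0:              # side condition of the rewrite rule
        m, n = m - 1, n + m   # instantiate the right-hand side
        trace.append((m, n))
    return trace

trace = run(100, 0)
# the trace is <100,0>, <99,100>, ..., <0,5050>: m steps to <0, m(m+1)/2>
```

Running from ⟨100, 0⟩ reproduces the execution trace in Eq. (4) below, ending at ⟨0, 5050⟩ after 100 steps.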

#### 3.2 Program Execution and Proof Parameters

In the following, we show a concrete program execution trace of TWO-COUNTERS starting from the initial state ⟨100, 0⟩:

$$\langle 100, 0 \rangle, \langle 99, 100 \rangle, \langle 98, 199 \rangle, \dots, \langle 1, 5049 \rangle, \langle 0, 5050 \rangle \tag{4}$$

To make K generate the above execution trace, we need to follow these steps:


3. Use the K execution tool krun and pass the source file to it: \$ krun 100.two-counters --depth N

The option --depth N tells K to execute for N steps and output the (intermediate) snapshot. By letting N be 1, 2, . . . , we collect all snapshots in Eq. (4).

The *proof parameter* of Eq. (4) includes the additional rewriting information for each execution step. That is, we need to know the rewrite rule that is applied and the corresponding substitution. In TWO-COUNTERS , there is only one rewrite rule, and the substitution can be easily obtained by pattern matching, where we simply match the snapshot with the left-hand side of the rewrite rule.
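For this one-rule language, recovering the substitution by pattern matching can be sketched as follows (a hypothetical illustration; the variable names and data layout are ours, not K's internal representation):

```python
# Match a snapshot against the rule's left-hand side <m, n> and check
# the side condition m > 0; the resulting substitution is the extra
# rewriting information recorded in the proof parameter.

def match_rule(lhs_vars, snapshot, condition):
    sigma = dict(zip(lhs_vars, snapshot))       # pattern matching on <m, n>
    return sigma if condition(sigma) else None  # rule applies only if m > 0

sigma = match_rule(("m", "n"), (100, 0), lambda s: s["m"] > 0)
# sigma maps m to 100 and n to 0; instantiating the right-hand side
# <m - 1, n + m> with sigma yields the next snapshot <99, 100>
```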

Note that we regard K as a "black box". We are not interested in its complex internal algorithms. Instead, we hide such complexity by letting K generate proof parameters that include enough information for proof object generation. This way, we create a separation of concerns between K and proof object generation. K can aim at optimizing the performance of the autogenerated language tools, *without* making proof object generation more complex.

# 4 Matching Logic and Its Formalization

We review the syntax and proof system of matching logic—the logical foundation of K. Then, we discuss its formalization, which is our main technical contribution and is a critical component of the proof objects we generate for K (see Sect. 2).

#### 4.1 Matching Logic Overview

Matching logic was proposed in [23] as a means to specify and reason about programs compactly and modularly. The key concept is its formulas, called *patterns*, which are used to specify program syntax and semantics in a uniform way. Matching logic is known for its simplicity and rich expressiveness. In [4–6,22], the authors developed matching logic theories that capture FOL, FOL-lfp, separation logic, modal logic, temporal logics, Hoare logic, λ-calculus, type systems, etc. In Sect. 5, we discuss the matching logic theories that capture K.

The *syntax* of matching logic is parametric in two sets of variables EV and SV . We call EV the set of *element variables*, denoted x, y, . . . , and SV the set of *set variables*, denoted X, Y, . . . .

Definition 1. *A* (matching logic) signature Σ *is a set of* (constant) symbols*. The set of* Σ-patterns*, denoted* Pattern(Σ)*, is inductively defined as follows:*

$$\varphi ::= x \mid X \mid \sigma \mid \varphi\_1 \; \varphi\_2 \mid \perp \mid \varphi\_1 \to \varphi\_2 \mid \exists x. \varphi \mid \mu X. \varphi$$

*where in* μX. ϕ *we require that* ϕ *has no negative occurrences of* X*.*

Thus, element variables, set variables, and symbols are patterns. ϕ1 ϕ2 is a pattern, called *application*, where the first argument is applied to the second. We have propositional connectives ⊥ and ϕ1 → ϕ2, existential quantification ∃x. ϕ, and the least fixpoint μX. ϕ, from which the following *notations* are defined:

Fig. 4. Capture-free substitution, defined in the usual way and formalized in Sect. 4.2 as a part of our proof objects.

$$\begin{array}{ccccc} \neg\varphi \equiv \varphi \rightarrow \bot & \top \equiv \neg\bot & \varphi\_1 \land \varphi\_2 \equiv \neg(\neg\varphi\_1 \lor \neg\varphi\_2) \\ \varphi\_1 \lor \varphi\_2 \equiv \neg\varphi\_1 \rightarrow \varphi\_2 & \forall x.\,\varphi \equiv \neg\exists x.\,\neg\varphi & \quad \nu X.\,\varphi \equiv \neg\mu X.\,\neg\varphi[\neg X/X] \end{array}$$

We use fv(ϕ) to denote the free variables of ϕ, and ϕ[ψ/x] and ϕ[ψ/X] to denote capture-free substitution. Their (usual) definitions are listed in Fig. 4.

Matching logic has a *pattern matching semantics*, where a pattern ϕ is interpreted as the set of elements that match it. For example, ϕ1 ∧ ϕ2 is the pattern matched by those elements matching both ϕ1 and ϕ2. Matching logic semantics is not needed for proof object generation, so we omit it here and refer the reader to [5,22].

We show the *matching logic proof system* in Fig. 5, which defines the provability relation, written Γ ⊢ ϕ, meaning that ϕ can be proved using the proof system, with the patterns in Γ added as additional axioms. We call Γ a *matching logic theory*. The proof system is a main component of proof objects. To understand it, we first need to define *application contexts*.

Definition 2. *A* context *is a pattern* C *with a hole variable* □*. We write* C[ϕ] ≡ C[ϕ/□] *as the result of context plugging. We call* C *an* application context*, if*


That is, the path from the root to the hole variable □ in C contains only applications.

The proof rules are sound and can be divided into 4 categories: FOL reasoning, frame reasoning, fixpoint reasoning, and some technical rules. The FOL reasoning rules provide (complete) FOL reasoning (see, e.g., [25]). The frame reasoning rules state that application contexts are commutative with disjunctive connectives such as ∨ and ∃. The fixpoint reasoning rules support the standard fixpoint reasoning as in modal μ-calculus [17]. The technical proof rules are needed for some completeness results (see [5] for details).

#### 4.2 Formalizing Matching Logic

We discuss the formalization of matching logic, which is our first main contribution and forms an important component in our proof objects (see Sect. 2).


Fig. 5. Matching logic proof system (where C, C1, C2 are application contexts).

Metamath [20] is a tiny language to state abstract mathematics and their proofs in a machine-checkable style. In our work, we use Metamath to formalize matching logic and to encode our proof objects. We choose Metamath for its simplicity and fast proof checking: Metamath proof checkers are often only a few hundred lines of code and can proof-check thousands of theorems in a second.

Our formalization closely follows Sect. 4.1. We formalize the syntax of patterns and the proof system. We also need to formalize some metalevel operations such as free variables and capture-free substitution. An *innovative* contribution is a generic way of handling *notations* (such as ¬ and ∧) in matching logic. The resulting formalization has only 245 lines of code, which we show in [16]. This formalization of matching logic is the main trust base of our proof objects.

Metamath Overview. We use an extract of our formalization of matching logic (Fig. 6) to explain the basic concepts in Metamath. At a high level, a Metamath source file consists of a list of *statements*. The main ones are:

1. *constant statements* ( \$c ) that declare Metamath constants;

Fig. 6. An extract of the Metamath formalization of matching logic.


Figure 6 defines the fragment of matching logic with only implications. We declare five constants in a row in line 1, where \imp , ( , and ) build the syntax, #Pattern is the type of patterns, and |- is the provability relation. We declare three metavariables of patterns in lines 3–6, and the syntax of implication ϕ1 → ϕ2 as ( \imp ph1 ph2 ) in line 7. Then, we define matching logic proof rules as Metamath axioms. For example, lines 18–22 define the rule (Modus Ponens).

In line 23, we show an example (meta-)theorem and its formal proof in Metamath. The theorem states that ϕ1 → ϕ1 holds, and its proof (lines 25–43) is a sequence of labels referring to the previous axiomatic/provable statements.

Metamath proofs are very easy to proof-check, which is why we use Metamath in our work. The proof checker reads the labels in order, maintaining a *proof stack* S that is initially empty. When a label l is read, the checker pops the premises of l's statement from S and pushes l's conclusion. When all labels are consumed, the checker checks whether S contains exactly one statement, which must be the original proof goal. If so, the proof is checked; otherwise, it fails.

As an example, we look at the first 5 labels of the proof in Fig. 6, line 25:

```
// Initially, the proof stack S is empty
ph1-is-pattern // S = [ #Pattern ph1 ]
ph1-is-pattern // S = [ #Pattern ph1 ; #Pattern ph1 ]
ph1-is-pattern // S = [ #Pattern ph1 ; #Pattern ph1 ; #Pattern ph1 ]
imp-is-pattern // S = [ #Pattern ph1 ; #Pattern ( \imp ph1 ph1 ) ]
imp-is-pattern // S = [ #Pattern ( \imp ph1 ( \imp ph1 ph1 ) ) ]
```
where we show the stack status in comments. The first label ph1-is-pattern refers to a \$f -statement without premises, so nothing is popped off, and the corresponding statement #Pattern ph1 is pushed to the stack. The same happens for the second and third labels. The fourth label imp-is-pattern refers to a \$a -statement with two metavariables of patterns, and thus has 2 premises. Therefore, the top two statements in S are popped off, and the corresponding conclusion #Pattern ( \imp ph1 ph1 ) is pushed to S. The last label does the same, popping off two premises and pushing #Pattern ( \imp ph1 ( \imp ph1 ph1 ) ) to S. Thus, these five proof steps prove the *well-formedness* of ϕ1 → (ϕ1 → ϕ1).
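The checking loop itself can be sketched in a few lines of Python (a toy model of the five-label prefix above; real Metamath checkers also substitute metavariables and verify \$d constraints, which we omit):

```python
# Toy sketch of Metamath-style stack-based proof checking: statements
# are plain strings and each label maps to (number of premises,
# conclusion builder).

def check(labels, rules):
    stack = []
    for label in labels:
        arity, conclude = rules[label]
        args = [stack.pop() for _ in range(arity)][::-1]  # premises, in order
        stack.append(conclude(*args))
    return stack

strip = lambda stmt: stmt.split(" ", 1)[1]  # drop the "#Pattern" typecode
rules = {
    "ph1-is-pattern": (0, lambda: "#Pattern ph1"),
    "imp-is-pattern": (2, lambda a, b: f"#Pattern ( \\imp {strip(a)} {strip(b)} )"),
}
stack = check(["ph1-is-pattern"] * 3 + ["imp-is-pattern"] * 2, rules)
# the stack ends with the single statement #Pattern ( \imp ph1 ( \imp ph1 ph1 ) )
```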

Formalizing Matching Logic Syntax. Now, we go through the formalization of matching logic and emphasize some highlights. See [5,6,22] for full detail.

The syntax of patterns is formalized below, following Definition 1:

```
$c \bot \imp \app \exists \mu ( ) $.
var-is-pattern $a #Pattern xX $.
symbol-is-pattern $a #Pattern sg0 $.
bot-is-pattern $a #Pattern \bot $.
imp-is-pattern $a #Pattern ( \imp ph0 ph1 ) $.
app-is-pattern $a #Pattern ( \app ph0 ph1 ) $.
exists-is-pattern $a #Pattern ( \exists x ph0 ) $.
${ mu-is-pattern.0 $e #Positive X ph0 $.
   mu-is-pattern $a #Pattern ( \mu X ph0 ) $. $}
```
Note that we omit the declarations of metavariables (such as xX , sg0 , ...) because their meaning can be easily inferred. The only nontrivial case above is mu-is-pattern , where we require that ph0 is positive in <sup>X</sup> , discussed below.

Metalevel Assertions. To formalize matching logic, we need the following metalevel operations and/or assertions:


Item 1 is needed to define the syntax of μX. ϕ, while Items 2–5 are needed to define the proof system (Fig. 5). Here, we show how to define capture-free substitution as an example. Notations are discussed in the next section.

To formalize capture-free substitution, we first define a Metamath constant

**\$c** #Substitution **\$.**

that serves as an assertion symbol: #Substitution ph ph' ph" xX holds iff ph ≡ ph' [ ph" / xX ]. Then, we can define substitution following Fig. 4. The only nontrivial case is when ph' is ∃x. ϕ or μX. ϕ, in which case α-renaming is required to avoid variable capture. We show the case when ph' is ∃x. ϕ below:

```
substitution-exists-shadowed
   $a #Substitution ( \exists x ph1 ) ( \exists x ph1 ) ph0 x $.
${ $d xX x $.
   $d y ph0 $.
   substitution-exists.0 $e #Substitution ph2 ph1 y x $.
   substitution-exists.1 $e #Substitution ph3 ph2 ph0 xX $.
   substitution-exists
     $a #Substitution ( \exists y ph3 ) ( \exists x ph1 ) ph0 xX $. $}
```
There are two cases, as expected from Fig. 4. substitution-exists-shadowed is when the substitution is shadowed. substitution-exists is the general case, where we first rename x to a fresh variable y and then continue the substitution. The \$d -statements state that the substitution is not shadowed and y is fresh.
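The two cases can be illustrated with a small Python sketch over a toy pattern AST (our own encoding, not the Metamath one), where a shadowed binder stops the substitution and a capturing binder is alpha-renamed first:

```python
# Capture-free substitution p[psi/x] for a tiny fragment: variables,
# implication, and exists, mirroring the two Metamath cases above.

def free_vars(p):
    tag = p[0]
    if tag == "var":
        return {p[1]}
    if tag == "imp":
        return free_vars(p[1]) | free_vars(p[2])
    if tag == "exists":
        return free_vars(p[2]) - {p[1]}
    return set()  # e.g., ("bot",)

def fresh(avoid):
    i = 0
    while f"y{i}" in avoid:
        i += 1
    return f"y{i}"

def subst(p, psi, x):
    tag = p[0]
    if tag == "var":
        return psi if p[1] == x else p
    if tag == "imp":
        return ("imp", subst(p[1], psi, x), subst(p[2], psi, x))
    if tag == "exists":
        b, body = p[1], p[2]
        if b == x:                   # substitution-exists-shadowed
            return p
        if b in free_vars(psi):      # alpha-rename to avoid capture
            y = fresh(free_vars(body) | free_vars(psi) | {x})
            body, b = subst(body, ("var", y), b), y
        return ("exists", b, subst(body, psi, x))
    return p
```

For example, substituting z for x in ∃z. x triggers the renaming case, producing ∃y0. z rather than the capturing ∃z. z.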

Supporting Notations. Notations (e.g., ¬ and ∧) play an important role in matching logic. Many proof rules such as (Propagation∨) and (Singleton) use notations (see Fig. 5). However, Metamath has no built-in support for notations. To define a notation, say ¬ϕ ≡ ϕ → ⊥, we need to (1) declare a constant \not and add it to the pattern syntax; (2) define the equivalence relation ¬ϕ ≡ ϕ → ⊥; and (3) add a new case for \not to *every metalevel assertion*. While (1) and (2) are reasonable, we want to avoid (3) because there are many metalevel assertions and thus it creates duplication.

Therefore, we implement an innovative and generic method that allows us to define *any notations* in a compact way. Our method is to declare a new constant #Notation and use it to capture the *congruence relation of sugaring/desugaring*. Using #Notation , it takes only three lines to define the notation ¬ϕ ≡ ϕ → ⊥:

```
$c \not $.
not-is-pattern $a #Pattern ( \not ph0 ) $.
not-is-sugar $a #Notation ( \not ph0 ) ( \imp ph0 \bot ) $.
```
To make the above work, we need to state that #Notation is a congruence relation with respect to the syntax of patterns and all the other metalevel assertions. Firstly, we state that it is reflexive, symmetric, and transitive:

```
notation-reflexivity $a #Notation ph0 ph0 $.
${ notation-symmetry.0 $e #Notation ph0 ph1 $.
   notation-symmetry $a #Notation ph1 ph0 $. $}
${ notation-transitivity.0 $e #Notation ph0 ph1 $.
   notation-transitivity.1 $e #Notation ph1 ph2 $.
   notation-transitivity $a #Notation ph0 ph2 $. $}
```
And the following is an example where we state that #Notation is a congruence with respect to provability:

```
${ notation-provability.0 $e #Notation ph0 ph1 $.
  notation-provability.1 $e |- ph0 $.
  notation-provability $a |- ph1 $. $}
```
This way, we only need a *fixed* number of statements that state that #Notation is a congruence, making it more compact and less duplicated to define notations.
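The effect of treating #Notation as a congruence can be pictured as a sugar table whose repeated unfolding yields the primitive form; a hypothetical Python sketch (the AST encoding and the desugar function are ours, purely illustrative):

```python
# Notations as a sugar table: each entry maps a sugared head to its
# one-step unfolding; desugar repeatedly unfolds until only core
# syntax (var, imp, bot, ...) remains, which is what #Notation equates.

SUGAR = {
    "not": lambda p: ("imp", p, ("bot",)),       # \not ph  ==  \imp ph \bot
    "or":  lambda a, b: ("imp", ("not", a), b),  # \or a b  ==  \imp ( \not a ) b
}

def desugar(p):
    head, *args = p
    args = [desugar(a) if isinstance(a, tuple) else a for a in args]
    if head in SUGAR:
        return desugar(SUGAR[head](*args))
    return (head, *args)
```

For instance, desugaring a ∨ b first unfolds to ¬a → b and then to (a → ⊥) → b, matching the notation definitions in Sect. 4.1.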

Formalizing Proof System. With metalevel assertions and notations, it is now straightforward to formalize the matching logic proof rules. We have seen the formalization of (Modus Ponens) in Fig. 6. In the following, we formalize the fixpoint proof rule (Knaster-Tarski), whose premises use capture-free substitution:

```
${ rule-kt.0 $e #Substitution ph0 ph1 ph2 X $.
   rule-kt.1 $e |- ( \imp ph0 ph2 ) $.
   rule-kt $a |- ( \imp ( \mu X ph1 ) ph2 ) $. $}
```

# 5 Compiling K into Matching Logic

To execute programs using K, we need to compile the K language definition for language L into a matching logic theory, written Γ *<sup>L</sup>* (see Sect. 3.2). In this section, we discuss this compilation process and show how to formalize Γ *<sup>L</sup>*.

#### 5.1 Basic Matching Logic Theories

Firstly, we discuss the basic matching logic theories that are required by Γ *<sup>L</sup>*. We discuss the theories of equality, sorts (and sorted functions), and rewriting.

Theory of Equality. By equality, we mean a (predicate) pattern ϕ1 = ϕ2 that holds (i.e., equals ⊤) iff ϕ1 equals ϕ2, and fails (i.e., equals ⊥) otherwise. We first need to define *definedness* ⌈ϕ⌉, which is a predicate pattern stating that ϕ is *defined*, i.e., ϕ is matched by at least one element: ϕ is not ⊥.

Definition 3. *Consider a symbol* ⌈\_⌉ ∈ Σ*, called the* definedness symbol*. We write* ⌈ϕ⌉ *for the application* ⌈\_⌉ ϕ*. In addition, we define the following axiom:*

$$(\text{Definedness}) \quad \lceil x \rceil \tag{5}$$

(Definedness) states that any element x is *defined*. Using the definedness symbol, we can define many important mathematical instruments, including equality, as the following notations:

$$\begin{array}{llll} \lfloor \varphi \rfloor \equiv \neg \lceil \neg \varphi \rceil & \text{// Totality} & \varphi\_1 = \varphi\_2 \equiv \lfloor \varphi\_1 \leftrightarrow \varphi\_2 \rfloor & \text{// Equality} \\ \varphi\_1 \subseteq \varphi\_2 \equiv \lfloor \varphi\_1 \to \varphi\_2 \rfloor & \text{// Inclusion} & x \in \varphi \equiv \lceil x \wedge \varphi \rceil & \text{// Membership} \end{array}$$

[22, Section 5.1] shows that the above indeed capture the intended semantics.
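On a small finite carrier, these notations can be checked against the pattern-matching semantics directly; a hedged Python sketch (the carrier M and the encoding of patterns as sets are purely illustrative):

```python
# Pattern-matching semantics on a finite carrier M: a pattern denotes
# the set of elements matching it. Definedness ⌈phi⌉ is all of M iff
# phi is non-empty; totality ⌊phi⌋ is all of M iff phi is everything;
# equality is the totality of the biconditional.

M = frozenset({0, 1, 2})

def ceil(phi):                 # definedness ⌈phi⌉
    return M if phi else frozenset()

def floor(phi):                # totality ⌊phi⌋ = ¬⌈¬phi⌉
    return M if phi == M else frozenset()

def equals(p1, p2):            # p1 = p2  ==  ⌊p1 <-> p2⌋
    iff = (p1 & p2) | ((M - p1) & (M - p2))
    return floor(iff)
```

As intended, equals yields ⊤ (all of M) exactly when the two patterns denote the same set, and ⊥ (the empty set) otherwise, so equality is a predicate pattern.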

Theory of Sorts. Matching logic is not sorted, but K is. To compile K into matching logic, we need a systematic way of dealing with sorts. We follow the "sort-as-predicate" paradigm to handle sorts and sorted functions in matching logic, following [4,6]. The main idea is to define a symbol ⟦\_⟧ ∈ Σ, called the *inhabitant symbol*, and use the *inhabitant pattern* ⟦s⟧ (abbreviating the application ⟦\_⟧ s) to represent the *inhabitant set* of sort s. For example, to define a sort *Nat*, we define a corresponding symbol *Nat* that represents the sort name, and use ⟦*Nat*⟧ to represent the set of all natural numbers.

*Sorted functions* can be axiomatized as special matching logic symbols. For example, the successor function *succ* of natural numbers is a symbol with axiom:

$$\forall x.\, x \in \llbracket Nat \rrbracket \to \exists y.\, y \in \llbracket Nat \rrbracket \wedge succ\; x = y \tag{6}$$

In other words, for any x in the inhabitant set of *Nat*, there exists a y in the inhabitant set of *Nat* such that *succ* x equals y. Thus, *succ* is a sorted function from *Nat* to *Nat*.

Theory of Rewriting. Recall that in K, the formal language semantics is defined using rewrite rules, which essentially define a *transition system* over computation configurations. In matching logic, a transition system can be captured by only one symbol • ∈ Σ, called *one-path next*, with the intuition that for any configuration γ, •γ is matched by all configurations that can go to γ in one step. In other words, γ is reached on *one-path* in the *next* configuration.

Program execution is the reflexive and transitive closure of one-path next. Formally, we define program execution (i.e., rewriting) as follows:

$$\begin{array}{ll} \Diamond\varphi \equiv \mu X.\, \varphi \vee {\bullet} X & \text{// Eventually; equals } \varphi \vee {\bullet}\varphi \vee {\bullet}{\bullet}\varphi \vee \cdots \\ \varphi\_1 \Rightarrow \varphi\_2 \equiv \varphi\_1 \to \Diamond\varphi\_2 & \text{// Rewriting} \end{array}$$
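On a finite transition system, the least-fixpoint reading of "eventually" can be computed by straightforward iteration; a hypothetical Python sketch (the transition relation below is illustrative):

```python
# "Eventually phi" as the least fixpoint of F(X) = phi ∪ pre(X), where
# pre(X) holds the states with a successor in X (the one-path next •X).
# Iterating F from the empty set converges on a finite system.

def eventually(phi, trans):
    x = set()
    while True:
        pre = {a for (a, b) in trans if b in x}  # states matching •X
        nxt = phi | pre
        if nxt == x:
            return x
        x = nxt

# a TWO-COUNTERS-like chain of configurations: 3 -> 2 -> 1 -> 0
trans = {(3, 2), (2, 1), (1, 0)}
reach0 = eventually({0}, trans)
# every state on the chain eventually reaches 0
```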

### 5.2 Kore: The Intermediate Between K and Matching Logic

The K compilation tool kompile (explained shortly) is what compiles a K language definition into a matching logic theory Γ *<sup>L</sup>*, written in a formal language called Kore. For legacy reasons, the Kore language is not the same as the syntax of matching logic (Definition 1), but an axiomatic extension with equality, sorts, and rewriting. Thus, to formalize Γ *<sup>L</sup>* in proof objects, we need to (1) formalize the matching logic theories of equality, sorts, and rewriting; and (2) automatically translate Kore definitions into the corresponding matching logic theories. Figure 7 shows the 2-phase translation from K to matching logic, via Kore.

*Phase 1: From* K *to Kore.* To compile a K definition such as two-counters.k in Fig. 3, we pass it to the K compilation tool kompile as follows:

\$ kompile two-counters.k

The result is a compiled Kore definition two-counters.kore. Figure 7 shows the autogenerated Kore axiom that corresponds to the rewrite rule in Eq. (3). As we can see, Kore is a much lower-level language than K: the concrete programming language syntax and K's front-end syntax are parsed and replaced by abstract syntax trees, represented as constructor terms.

Fig. 7. Automatic translation from K to matching logic, via Kore

*Phase 2: From Kore to Matching Logic.* We develop an automatic encoder that translates Kore syntax into matching logic patterns. Since Kore is essentially the theory of equality, sorts, and rewriting, we can define the syntactic constructs of the Kore language as *notations*, using the basic theories in Sect. 5.1.

### 6 Generating Proof Objects for Program Execution

In this section, we discuss how to generate proof objects for program execution, based on the formalization of matching logic and K/Kore in Sects. 4 and 5. The key step is to generate proof objects for *one-step executions*, which are then put together to build the proof objects for multi-step executions using the transitivity of the rewriting relation. Thus, we focus on the process of generating proof objects for one-step executions from the proof parameters provided by K.

#### 6.1 Problem Formulation

Consider the following K definition that consists of K (conditional) rewrite rules:

$$S = \{\, t_k \land p_k \Rightarrow s_k \mid k = 1, 2, \dots, K \,\}$$

where t<sub>k</sub> and s<sub>k</sub> are the left- and right-hand sides of the rewrite rule, respectively, and p<sub>k</sub> is the rewriting condition. Consider the following execution trace:

$$\left[\varphi_0, \varphi_1, \dots, \varphi_n\right] \tag{7}$$

where ϕ<sub>0</sub>, ..., ϕ<sub>n</sub> are snapshots. We let K generate the following proof parameter:

$$\Theta \equiv (k_0, \theta_0), \dots, (k_{n-1}, \theta_{n-1}) \tag{8}$$

where for each 0 ≤ i &lt; n, k<sub>i</sub> denotes the rewrite rule that is applied on ϕ<sub>i</sub> (1 ≤ k<sub>i</sub> ≤ K) and θ<sub>i</sub> denotes the corresponding substitution such that t<sub>k<sub>i</sub></sub>θ<sub>i</sub> = ϕ<sub>i</sub>.
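Under stated assumptions (terms encoded as nested tuples, variables as bare strings), the consistency check implied by Eq. (8) can be sketched as follows; check_parameter and the rule encoding are our own illustrative names, not K's data structures:

```python
def apply_subst(term, theta):
    """Apply a substitution to a term; variables are plain strings."""
    if isinstance(term, str):
        return theta.get(term, term)
    return tuple(apply_subst(t, theta) for t in term)

def check_parameter(rules, trace, param):
    """rules: k -> (lhs, cond, rhs); trace: snapshots; param: [(k_i, θ_i)].
    Each cited rule's LHS, instantiated by θ_i, must equal snapshot ϕ_i."""
    for i, (k, theta) in enumerate(param):
        lhs, _cond, _rhs = rules[k]
        if apply_subst(lhs, theta) != trace[i]:
            return False
    return True

# One step of TWO-COUNTERS: <m, n> => <m-1, n+m> if m > 0 (as rule 1).
rules = {1: (("m", "n"), "m > 0", None)}
trace = [(3, 0), (2, 3)]
print(check_parameter(rules, trace, [(1, {"m": 3, "n": 0})]))  # True
```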

As an example, the rewrite rule of TWO-COUNTERS, restated below:

$$\langle m, n\rangle \Rightarrow \langle m-1, n+m\rangle \quad \text{if } m > 0 \qquad \text{// Same as Eq. (3)}$$

has the left-hand side t<sub>k</sub> ≡ ⟨m, n⟩, the right-hand side s<sub>k</sub> ≡ ⟨m − 1, n + m⟩, and the condition p<sub>k</sub> ≡ m &gt; 0. Note that the right-hand side s<sub>k</sub> contains the arithmetic operations "+" and "−", which can be evaluated further to a value once concrete instances of the variables m and n are given. Generally speaking, the right-hand side of a rewrite rule may include (built-in or user-defined) functions that are not constructors and can thus be evaluated further. We call such an evaluation process a *simplification*.
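Concretely, the rule and its simplifications can be executed as follows (an illustrative sketch; the pair encoding of configurations is ours):

```python
def run(m, n):
    """Execute TWO-COUNTERS: <m, n> => <m-1, n+m> while m > 0.
    Each step instantiates the RHS, then simplifies "+" and "-" to values."""
    trace = [(m, n)]
    while m > 0:                 # the rewriting condition p ≡ m > 0
        m, n = m - 1, n + m      # simplification of the instantiated RHS
        trace.append((m, n))
    return trace

print(run(3, 0))  # [(3, 0), (2, 3), (1, 5), (0, 6)] — n accumulates 3+2+1
```

The final value of n is the sum 1 + 2 + ... + m, which is what the TWO-COUNTERS program computes.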

#### 6.2 Applying Rewrite Rules and Applying Simplifications

In the following, we list all proof objects for one-step executions.

$$\begin{aligned} \Gamma^L &\vdash \varphi_0 \Rightarrow s_{k_0}\theta_0 && \text{// by applying } t_{k_0} \land p_{k_0} \Rightarrow s_{k_0} \text{ using } \theta_0\\ \Gamma^L &\vdash s_{k_0}\theta_0 = \varphi_1 && \text{// by simplifying } s_{k_0}\theta_0\\ &\;\;\vdots\\ \Gamma^L &\vdash \varphi_{n-1} \Rightarrow s_{k_{n-1}}\theta_{n-1} && \text{// by applying } t_{k_{n-1}} \land p_{k_{n-1}} \Rightarrow s_{k_{n-1}} \text{ using } \theta_{n-1}\\ \Gamma^L &\vdash s_{k_{n-1}}\theta_{n-1} = \varphi_n && \text{// by simplifying } s_{k_{n-1}}\theta_{n-1} \end{aligned}$$

As we can see, there are two types of proof objects: one that proves the result of *applying a rewrite rule*, and one that proves the result of *applying a simplification*.

Applying Rewrite Rules. The main steps in proving Γ<sup>L</sup> ⊢ ϕ<sub>i</sub> ⇒ s<sub>k<sub>i</sub></sub>θ<sub>i</sub> are (1) to *instantiate* the rewrite rule t<sub>k<sub>i</sub></sub> ∧ p<sub>k<sub>i</sub></sub> ⇒ s<sub>k<sub>i</sub></sub> using the substitution

$$\theta_i = [c_1/x_1, \dots, c_m/x_m],$$

given in the proof parameter, and (2) to show that the (instantiated) rewriting condition p<sub>k<sub>i</sub></sub>θ<sub>i</sub> holds. Here, x<sub>1</sub>, ..., x<sub>m</sub> are the variables that occur in the rewrite rule and c<sub>1</sub>, ..., c<sub>m</sub> are the terms by which we instantiate them. For (1), we first need to prove the following lemma, called (Functional Substitution) in [5], which states that ∀-quantification can be instantiated by functional patterns:

$$\frac{\forall \mathbf{x}.\, t_{k_i} \land p_{k_i} \Rightarrow s_{k_i} \qquad \exists y_1.\, \varphi_1 = y_1 \;\cdots\; \exists y_m.\, \varphi_m = y_m}{t_{k_i}\theta_i \land p_{k_i}\theta_i \Rightarrow s_{k_i}\theta_i} \quad y_1, \dots, y_m \text{ fresh}$$

Intuitively, the premise ∃y<sub>1</sub>. ϕ<sub>1</sub> = y<sub>1</sub> states that ϕ<sub>1</sub> is a functional pattern, because it equals some element y<sub>1</sub>.

If Θ in Eq. (8) is the correct proof parameter, then θ<sub>i</sub> is the correct substitution and thus t<sub>k<sub>i</sub></sub>θ<sub>i</sub> ≡ ϕ<sub>i</sub>. Therefore, to prove the original proof goal for one-step execution, i.e., Γ<sup>L</sup> ⊢ ϕ<sub>i</sub> ⇒ s<sub>k<sub>i</sub></sub>θ<sub>i</sub>, we only need to prove Γ<sup>L</sup> ⊢ p<sub>k<sub>i</sub></sub>θ<sub>i</sub>, i.e., that the rewriting condition p<sub>k<sub>i</sub></sub> holds under θ<sub>i</sub>. This is done by *simplifying* p<sub>k<sub>i</sub></sub>θ<sub>i</sub> to ⊤, discussed together with the general simplification process below.

Applying Simplifications. K carries out simplification exhaustively before trying to apply a rewrite rule, and simplifications are done by applying (oriented) equations. Generally speaking, let s be a functional pattern and p → t = t′ a (conditional) equation. We say that s can be *simplified* w.r.t. p → t = t′ if there is a sub-pattern s<sub>0</sub> of s (written s ≡ C[s<sub>0</sub>], where C is a context) and a substitution θ such that s<sub>0</sub> = tθ and pθ holds. The resulting *simplified pattern* is denoted C[t′θ]. Therefore, a proof object for the above simplification consists of two proofs: Γ<sup>L</sup> ⊢ s = C[t′θ] and Γ<sup>L</sup> ⊢ pθ. The latter can be handled recursively, by simplifying pθ to ⊤, so we only need to consider the former.

The main steps of proving Γ<sup>L</sup> ⊢ s = C[t′θ] are the following:


Finally, we repeat the above one-step simplifications until no sub-patterns can be simplified further. The resulting proof objects are then put together by the transitivity of equality.
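The one-step simplification just described (matching a sub-pattern against the left-hand side of an oriented equation, then rewriting it inside its context C) can be sketched as follows; the term encoding is ours, not K's simplifier:

```python
def match(pattern, term, theta):
    """Syntactic matching; variables are strings prefixed with '?'."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in theta:
            return theta if theta[pattern] == term else None
        return {**theta, pattern: term}
    if isinstance(pattern, str) or isinstance(term, str):
        return theta if pattern == term else None
    if len(pattern) != len(term):
        return None
    for p, t in zip(pattern, term):
        theta = match(p, t, theta)
        if theta is None:
            return None
    return theta

def subst(pattern, theta):
    """Instantiate a pattern with a substitution."""
    if isinstance(pattern, str):
        return theta.get(pattern, pattern)
    return tuple(subst(p, theta) for p in pattern)

def simplify_once(term, lhs, rhs):
    """Rewrite the first sub-pattern matching lhs; None if term is irreducible."""
    theta = match(lhs, term, {})
    if theta is not None:
        return subst(rhs, theta)
    if isinstance(term, tuple):
        for i, sub in enumerate(term):
            new = simplify_once(sub, lhs, rhs)
            if new is not None:            # rebuild the surrounding context C
                return term[:i] + (new,) + term[i + 1:]
    return None

# Oriented (unconditional) equation plus(0, x) = x, applied inside a context:
term = ("pair", ("plus", "0", "n"), "1")
lhs, rhs = ("plus", "0", "?x"), "?x"
print(simplify_once(term, lhs, rhs))  # ('pair', 'n', '1')
```

Repeating simplify_once until it returns None models the exhaustive simplification described above; chaining the steps corresponds to the transitivity of equality.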

# 7 Discussion on Implementation

As discussed in Sect. 2, a complete proof object for program execution (i.e., Γ<sup>L</sup> ⊢ ϕ<sub>init</sub> ⇒ ϕ<sub>final</sub>) consists of (1) the formalization of matching logic and its basic theories; (2) the formalization of Γ<sup>L</sup>; and (3) the proofs of one-step and multi-step program executions. In our implementation, (1) is developed manually, because it is fixed for all programming languages and program executions; (2) and (3) are generated automatically by the algorithms in Sect. 6.

During the (manual) development of (1), we needed to prove many basic matching logic (meta-)theorems as lemmas, such as (Functional Substitution) in Sect. 6.2. To ease the manual work, we developed an *interactive theorem prover* (ITP) for matching logic, which allows us to carry out higher-level interactive proofs that are later automatically translated into the lower-level Metamath proofs. We show the highlights of our ITP for matching logic in Sect. 7.1.

In Sect. 7.2, we discuss the main limitations of our current preliminary implementation. These limitations are planned to be addressed in future work.

#### 7.1 An Interactive Theorem Prover for Matching Logic

Metamath proofs are low-level and not human readable (see, e.g., the proof of ϕ → ϕ in Fig. 6). Metamath has its own interactive theorem prover (ITP), but it is for general purposes and does not have specific support for matching logic. Therefore, we developed a new ITP for matching logic that has the following characteristic features:


When an interactive proof is finished, our ITP translates the higher-level proof tactics into actual Metamath formal proofs, thus easing the manual development. A full presentation of the ITP is beyond the scope of this paper; more details will appear in future publications.

#### 7.2 Limitations and Threats to Validity

We discuss the trust base of the autogenerated proof objects by pointing out the main threats to validity, caused by the limitations of our preliminary implementation. It should be noted that these limitations are about the implementation, and *not* our approach. We shall address these limitations in future work.

Limitation 1: Need to Trust Kore. Our current implementation is based on the existing K compilation tool kompile that compiles K into Kore definitions. Recall that Kore is a (legacy) formal language with built-in support for equality, sorts, and rewriting, and thus is different from (and more complex than) the syntax of matching logic. By using Kore as the intermediate between K and matching logic (Fig. 7), we need to trust Kore and the K compilation tool kompile.

In the future, we will eliminate Kore entirely from the picture and formalize K *directly*. To do that, we need to formalize the "front end matters" of K, such as concrete programming language syntax and K attributes, currently handled by kompile . That is, we need to formalize and generate proof objects for kompile .

Limitation 2: Need to Trust Domain Reasoning. K has built-in support for domain reasoning such as integer arithmetic. Our current proof objects do not include the formal proofs of such domain reasoning, but instead regard them as assumed lemmas. In the future, we will incorporate the existing research on generating proof objects for SMT solvers [1] into our implementation, in order to generate proof objects also for domain reasoning; see also Sect. 9.

Limitation 3: No Support for More Complex K Features. Our current implementation supports only the core K features for defining programming language syntax and formal semantics as rewrite rules. Some more complex features are not supported; the main ones are (1) the [strict] attributes that specify evaluation orders, and (2) the built-in collection datatypes, such as lists, sets, and maps.

To support (1), we must handle the so-called *heating/cooling rules*: autogenerated rewrite rules that implement the specified evaluation orders. Our current implementation does not support them because they are conditional rules whose conditions state that an element is *not* a computation result. To prove such conditions, we need additional constructor axioms for the sorts/types that represent results of computation. To support (2), we should extend our algorithms in Sect. 6 with *unification* modulo these collection datatypes.

# 8 Evaluation

In this section, we evaluate the performance of our implementation and discuss the experiment results, summarized in Table 1. We use two sets of benchmarks.


Table 1. Performance of proof generation/checking (time measured in seconds).

The first is our running example TWO-COUNTERS with different inputs (10, 20, 50, and 100). The second is REC [11], which is a popular performance benchmark for rewriting engines. We evaluate both the performance of proof object *generation* and that of proof *checking*. Our implementation can be found in [16] and [3].

The main takeaways of our experiments are:


Proof Object Generation. We measure the proof object generation time as the time to generate complete proof objects following the algorithms in Sect. 6, from the compiled language semantics (i.e., Kore definitions) and proof parameters. As shown in Table 1, proof generation takes around 17–406 s on the benchmarks, and the average is 107 s.

Proof object generation can be divided into two parts: that of the language semantics Γ<sup>L</sup> and that of the (one-step and multi-step) program executions, shown in Table 1 under the columns "sem" and "rewrite", respectively. For the same language, the time to generate the language semantics Γ<sup>L</sup> is the same (up to experimental error). The time for executions is linear in the number of steps.

Proof Checking. Proof checking is efficient and takes a few seconds on our benchmarks. We can divide the proof checking time into two parts: that of the logical foundation and that of the actual program execution tasks. Both parts are shown in Table 1 under columns "logic" and "task". The "logic" part includes formalization of matching logic and its basic theories, and thus is *fixed* for any programming language and program and has the same proof checking time (up to experimental error). The "task" part includes the language semantics and proof objects for the one-step and multi-step executions. Therefore, the time to check the "task" part is a *more valuable and realistic* measure, and according to our experiments, it is often less than 1 s, making it acceptable in practice.

As a pleasant surprise, the time for "task-specific" proof checking is roughly the same as the time it takes K to parse and execute the programs. In other words, there is *no significant performance difference* on our benchmarks between running the programs directly in K and checking the proof objects.

There is much potential to optimize the performance of proof checking and make it even faster than program execution. For example, in our approach proof checking is an *embarrassingly parallel* problem, because each metatheorem can be proof-checked entirely independently. Therefore, we can significantly reduce the proof checking time by running multiple checkers in parallel.
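For instance, using Python's standard library, independent obligations can be dispatched to a worker pool; check_obligation below is a placeholder for invoking a real checker such as Metamath on a single lemma:

```python
from concurrent.futures import ThreadPoolExecutor

def check_obligation(obligation):
    """Stand-in for running the proof checker on one independent lemma."""
    name, proof_ok = obligation
    return name, proof_ok

def check_all(obligations, workers=4):
    """Check all obligations concurrently; accept iff every one checks out."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = dict(pool.map(check_obligation, obligations))
    return all(results.values()), results

obligations = [("lemma-%d" % i, True) for i in range(8)]
print(check_all(obligations)[0])  # True: every obligation checks out
```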

# 9 Related Work

The idea of using proof generation to address the functional correctness of complicated systems was introduced long ago.

Interactive theorem provers such as Coq [19] and Isabelle [26] are often used to formalize programming language semantics and to reason about program properties. These provers often provide a high-level proof script language that allows the users to develop human-readable proofs, which are then automatically translated into lower-level proof objects that can be checked by the corresponding proof checkers. For example, the proof objects of Coq are of the form t : t′, where t′ is a term that represents the proposition to be proved and t represents a formal proof. The typing claim t : t′ can then be checked by a proof checker that implements the typing rules of the calculus of inductive constructions (CIC) [8], the logical foundation of Coq.

There are two main differences between provers such as Coq and our technique. Firstly, Coq is not regarded as a language framework in the sense of Fig. 1, because no language tools are autogenerated from the formal semantics. In our case, we handle the correctness of individual tasks on a case-by-case basis to reduce the complexity. Secondly, Coq proof checking is based on CIC, which is arguably more complex than matching logic, the logical foundation of K as demonstrated in this paper. Indeed, the formalization of matching logic requires only 245 lines of code, which we display in full in [16].

Another application of proof generation is ensuring the correctness of SMT solvers. These are popular tools for checking the satisfiability of FOL formulas, written in a formal language containing interpreted functions and predicates. SMT solvers often implement complex data structures and algorithms, putting their correctness at risk. Recent work such as [1] studies proof generation for SMT solvers. This research has been incorporated into theorem provers such as Lean, which attempts to bridge the gap between SMT reasoning and proof assistants more directly, by building a proof assistant with efficient and sophisticated built-in SMT capabilities. As discussed in Sect. 7, our current implementation does not generate proofs for domain reasoning, so we plan to incorporate the above SMT proof generation work into our future implementation.

# 10 Conclusion

We have proposed an approach to trustworthy language frameworks based on proof generation. The key idea is to generate proof objects as *proof certificates* for each individual task that the language tools conduct, on a case-by-case basis. This way, we avoid formally verifying the entire framework, which is practically impossible, and can thus make the language framework both *practical* and *trustworthy*.

Acknowledgment. The work presented in this paper was supported in part by NSF CNS 16-19275 and an IOHK grant. This material is based upon work supported by the United States Air Force and DARPA under Contract No. FA8750-18-C-0092.

# References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Foundations of Fine-Grained Explainability**

Sylvain Hallé and Hugo Tremblay

Laboratoire d'informatique formelle, Université du Québec à Chicoutimi, Saguenay, Canada shalle@acm.org, Hugo Tremblay2@uqac.ca

**Abstract.** Explainability is the process of linking part of the inputs given to a calculation to its output, in such a way that the selected inputs somehow "cause" the result. We establish the formal foundations of a notion of explainability for arbitrary abstract functions manipulating nested data structures. We then establish explanation relationships for a set of elementary functions, and for compositions thereof. A fully functional implementation of these concepts is finally presented and experimentally evaluated.

# **1 Introduction**

Developers of information systems in all disciplines are facing increasing pressure to come up with mechanisms to describe how or why a specific result is produced—a concept called *explainability*. For example, a web application testing tool that discovers a layout bug can be asked to pinpoint the elements of the page actually responsible for the bug [24]. A process mining system finding a compliance violation inside a business process log can extract a subset of the log's sequence of events that causes the violation [46]. Similarly, an event stream processing system can monitor the state of a server room and, when raising an alarm, identify what machines in the room are the cause of the alarm [40]. All these situations have in common that one is not only interested in an oracle that produces a simple Boolean pass/fail verdict from a given input, but also additional information that somehow links parts of this input to the result.

Explainability is currently handled by *ad hoc* means, if at all. Hence, a developer may write a script that checks a complex condition on some input object; however, for this script to provide an explanation, and not just a verdict, extra code must be written in order to identify, organize, and format the relevant input elements that form an explanation for the result. This extra code is undesirable: it represents additional work, is specific to the condition being evaluated, and relies completely on the developer's intuition as to what constitutes a suitable "explanation". Better yet would be a formal framework where this notion would be defined for arbitrary abstract functions, and accompanied by a generic and systematic way of constructing an explanation.

c The Author(s) 2021

A. Silva and K. R. M. Leino (Eds.): CAV 2021, LNCS 12760, pp. 500–523, 2021. https://doi.org/10.1007/978-3-030-81688-9\_24

In this paper, we present the theoretical foundations of a notion of explainability. Our model focuses on abstract functions whose input arguments and output values can be composite objects—that is, data structures that may be composed of multiple parts, such as lists. In contrast with existing works on the subject, which consider relationships between inputs and outputs as a whole, our framework is fine-grained: it is possible to point to a specific *part* of an output result, and construct an explanation that refers to precise *parts* of the input.

Figure 1 illustrates the approach on a simple example. On the left is an input, which in this case is a text log of comma-separated values. Suppose that on this log, one wishes to extract the second element of each line, and check that the average of any three successive values is always greater than 3. It is easy to see that the condition is false, but where in the input log are the locations that cause it to be violated? By applying the systematic mechanism described in this paper, one can construct an explanation that is represented by the graph at the right. We can observe that the false output of the condition (the graph's root) is linked to several leaves that designate parts of the input (character locations in each line of the file, identified by colors). Moreover, the graph contains Boolean nodes, indicating that the explanation may involve multiple elements ("and"), and that alternate explanations are possible ("or").

**Fig. 1.** Left: a simple text file. Right: an explanation graph obtained from the evaluation of a function on this input.
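To fix ideas, the condition of Fig. 1, without any explanation machinery, amounts to the following check; the log content below is made up, since the file of Fig. 1 is not reproduced here:

```python
def verdict(log):
    """Take the second comma-separated value of each line, and check that
    every window of three successive values has an average greater than 3."""
    values = [int(line.split(",")[1]) for line in log.splitlines() if line]
    windows = [values[i:i + 3] for i in range(len(values) - 2)]
    return all(sum(w) / 3 > 3 for w in windows)

log = "a,4\nb,5\nc,6\nd,1\ne,1"
print(verdict(log))  # False: the window (6, 1, 1) has average 8/3 <= 3
```

Note that this script yields only a Boolean verdict; pinpointing the character locations of the offending values is exactly the kind of extra, ad hoc code that a systematic notion of explanation avoids.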

First, in Sect. 2, we review existing works related to the concept of causality, provenance and taint propagation, which are the notions closest to our concerns. Section 3 lays out the theoretical foundations of our framework: it introduces the notion of parts of composite objects, and formally defines a relation between parts of inputs and outputs of an arbitrary function, called *explanation*. Moreover, it shows how an explanation can be constructed for a function that is a composition of basic functions, by composing their respective explanations. Section 4 then illustrates these definitions by demonstrating what constitutes an explanation on a small yet powerful set of basic functions. Taken together, these results make it possible to easily construct explanations for a wide range of computations.

To showcase the feasibility of this approach, Sect. 5 presents *Petit Poucet*, a fully-functioning Java library that implements these concepts. The library allows users to create their own complex functions by composing built-in primitives, and can automatically produce an explanation for any result computed by these

functions on a given input. Experiments on a few examples provide insight on the time and memory overhead incurred by the handling of explanations. Finally, Sect. 6 identifies a number of exciting open theoretical questions that arise from our formal framework.

# **2 Related Work**

Explainability can be seen as a particular case of a more general concept called *lineage*, where inputs and outputs of a calculation are put in relation according to various definitions. Related works around this notion can be separated into a few categories, which we briefly describe below.

#### **2.1 Causality**

In the field of testing and formal verification of software systems, lineage often takes the form of *causality*. A classical definition is given by Halpern and Pearl [28], based on what is called "counterfactual dependence": A is a cause of B if the absence of A entails the absence of B. According to this principle, some feature of an object causes the failure of a verification or testing procedure if the absence of this feature instead causes the procedure to emit a passing verdict. This notion can be transposed to various types of systems. When a condition is expressed as a propositional logic formula, the cause of the true or false value of the formula can be constructed in the form of an *explanatory sub-model*; informally, it can be thought of as the smallest set of propositional variables whose value implies the value of the formula.
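In its naive single-variable form, counterfactual dependence can be sketched as follows (illustrative only; the full Halpern-Pearl definition quantifies over contingencies on sets of variables):

```python
def counterfactual_causes(f, v):
    """Variables x such that f is true under v but false when x is flipped."""
    causes = []
    for x in v:
        flipped = {**v, x: not v[x]}
        if f(v) and not f(flipped):
            causes.append(x)
    return causes

f = lambda v: v["a"] and (v["b"] or v["c"])
print(counterfactual_causes(f, {"a": True, "b": True, "c": False}))  # ['a', 'b']
```

When both b and c are true, neither is a counterfactual cause on its own, which is one of the well-known weaknesses that the contingency-based refinements address.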

In the case of conditions expressed on state-based systems, the cause of a violation has been taken either as the shortest prefix that guarantees the violation of the property regardless of what follows [23], or as a minimal set of word-level predicates extracted from a failed execution [50]. A platform for hardware formal verification, called RuleBasePE, expands on this latter definition to identify components of a system that are responsible for the violation of a safety property on a single execution trace [5].

Causality has been criticized as an all-or-nothing notion; *responsibility* has been introduced as a refinement of this concept, where the involvement of some element A as the cause of B can be quantified [12]. The problem of deciding causality also has high computational complexity; in fact, determining whether A is a cause of B in a model allowing only Boolean values for variables is already NP-complete [18]. Tight automata [31,43] have also been developed to produce minimal counter-examples, which in this case are sequences of states produced by a traversal of the finite-state machine. Explanatory sub-models can also be extended to the temporal case [19]. Another approach involves the computation of a so-called "minimal debugging window", which is a small segment of the input trace that contains the discovered violation [36].

Finally, distance-based approaches compare a faulty trace with the closest valid trace (according to a given distance metric) [22]; the differences between the two traces are defined as the cause of the failure. A similar principle is applied in a software testing technique called *delta debugging* to identify values of variables in a program that are responsible for a failure [15].

#### **2.2 Provenance in Information Systems**

In a completely different direction, a large amount of work on lineage has been done in the field of databases, where this notion is often called *provenance*. A thorough survey of related approaches on provenance [9] reveals that this concept has been studied in several different areas of data management, such as scientific data processing and database management systems.

Research in this field typically distinguishes between three types of provenance. The first type is called *why-provenance* [17]: to each tuple t in the output of a relational database query, why-provenance associates a set of tuples present in the input of the query that helped to "produce" t. *How-provenance*, as its name implies, keeps track not only of which input tuples contribute to the output, but also of the way these tuples have been combined to form the result [21]. For example, a symbolic polynomial like t² + t·t′ indicates that an output tuple can be explained in two alternative ways: either by using tuple t twice, or by combining t and t′. Finally, *where-provenance* describes where a piece of data is copied from [8]. It is typically expressed at a finer level of granularity, linking individual values inside an output tuple to individual values of one or more input tuples. One possible way of doing this is through a technique called annotation propagation, where each part of the input is given symbolic "annotations", which are then percolated and tracked all the way to the output [7].
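As an illustration of how-provenance (our own sketch, not a provenance-aware database), each input tuple gets a symbolic annotation, joint use multiplies annotations into a monomial, and alternative derivations are summed into a polynomial:

```python
from collections import Counter

def join_provenance(derivations):
    """Each derivation is a tuple of input-tuple annotations used jointly;
    alternatives are summed. Returns a polynomial as monomial -> coefficient."""
    poly = Counter()
    for deriv in derivations:
        poly[tuple(sorted(deriv))] += 1
    return dict(poly)

# An output tuple derivable either from t joined with itself, or from t and t':
print(join_provenance([("t", "t"), ("t", "t'")]))
# {('t', 't'): 1, ('t', "t'"): 1}  — i.e. the polynomial t^2 + t·t'
```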

A recent survey reveals the existence of more than two dozen provenance-aware systems [38]. Where-provenance has been implemented in Polygen [51], DBNotes [11], Mondrian [20], MXQL [48] and Orchestra [29]. The Spider system performs a slightly different task, by showing to a user the "route" from input to output that is taken by data when a specific database query is executed [10]. The foundations for all these systems are relational databases, where sets of tuples are manipulated by operators from relational algebra, or extensions of SQL.

Taken in a broad sense, we also list in this category various works that aim to develop *explainable* Artificial Intelligence [42]. Models used in AI vary widely, ranging from deep neural networks to Bayesian rules and decision trees; consequently, the notion of what constitutes an explanation for each of them is also very variable, and is at times only informally stated.

#### **2.3 Taint Analysis and Information Flow**

A last line of work considers the link between the inputs and the outputs produced by a piece of procedural code, mostly for security considerations. Dynamic *taint analysis* consists of marking and tracking certain data in a program at run time. A typical use case for taint analysis is to check whether sensitive information (such as a password) is being leaked into an unprotected memory location, or if a "tainted" piece of input such as a user-provided string is being passed to a function like a database query without having been sanitized first (opening the door to injection attacks).

In this category, TaintCheck is a system where each memory byte is associated with a 4-byte pointer to a taint data structure [37]; program inputs are marked as tainted, and the system propagates taint markers to other memory locations during the execution of a program; this concept has been extended to the operating system as a whole in an implementation called Asbestos [47]. Hardware implementations of this principle have also been proposed [16,44]. Dytan [14] is a notable system for taint propagation and analysis on programs written in assembly language. We shall also mention the Gift system, as well as a compiler based on it called Aussum [32]. For its part, Rifle focuses on information flow [45], while TaintBochs is a system that has been used to track the lifetime of sensitive data inside the memory of a program [13].

The capability of following taint markings on the inputs of a program can be used in many ways. For example, a system called Comet uses taint propagation to improve the coverage of an existing test suite [33]. Taint analysis can also be used to quantify the amount of information leak in a system [34]. Stated simply, the propagation of taint markings can be seen as a form of how-provenance, applied to variables and procedural code instead of tuples and relational queries. Note however that it operates in a top-down fashion: mark the inputs, and track these markings on their way to the output. In contrast, we shall see that our notion of explanation is bottom-up: point at a part of the output, and retrace the parts of the input that are related to it.
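A toy version of this top-down propagation, for contrast with the bottom-up explanations developed next (the three-instruction program below is hypothetical):

```python
def propagate(program, tainted_inputs):
    """program: list of (dest, sources) assignments, in order.
    A destination becomes tainted iff one of its sources is tainted."""
    taint = set(tainted_inputs)
    for dest, sources in program:
        if any(s in taint for s in sources):
            taint.add(dest)
        else:
            taint.discard(dest)
    return taint

program = [("tmp", ["password"]),      # tmp := f(password)
           ("out", ["tmp", "salt"]),   # out := g(tmp, salt)
           ("log", ["salt"])]          # log := h(salt)
print(propagate(program, {"password"}))  # {'password', 'tmp', 'out'}
```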

# **3 A Formal Definition of Explanation**

The problem we consider can be simply stated. Given an abstract function f and an input argument x, establish a formal relationship that explains f(x) based on x. While this is more or less closely related to the works we presented in the previous section, the solution we propose has a few distinguishing features.

First, x and f(x) may be composite objects made of multiple "parts", and it is possible to relate specific parts of an output to specific parts of an input. Second, if f is itself a composition g ◦ h, it is possible to construct the input-output relationships of f from the individual input-output relationships of g and h. Third, these relations are not defined *ad hoc* for each individual function, but come as consequences of a general definition. Finally, given x and f(x), determining the parts of x in relation with a given part of f(x) is tractable.

In this section, we establish the formal foundations of our proposed approach. We start by defining an abstract "part-of" relation between abstract objects, based on the notion of *designator*, and establish some properties of this relation. We then propose a definition of *explanation* for arbitrary functions, and discuss how it differs from existing causality and lineage relationships mentioned earlier.

#### **3.1 Object Parts**

Let U = ⋃<sub>i</sub> O<sub>i</sub> be a union of sets of *objects*; the sets O<sub>i</sub> are called *types*. We suppose that U contains a special object, noted ⊠, that represents "nothing". Types are disjoint sets, with the exception of ⊠, which, for convenience, is assumed to be part of every type.

Each type O is associated with a set Π<sub>O</sub>, whose elements are called *parts*. The set Π<sub>O</sub> contains functions of the form π : O → O; it is expected that π(⊠) = ⊠ for every such function. In addition, we impose that Π<sub>O</sub> always contains two other functions, defined as **1** : x ↦ x and **0** : x ↦ ⊠. We shall overload the term "part" and say that an object o′ ∈ O is a part of some other object o ∈ O if it is the result of applying some π ∈ Π<sub>O</sub> to o. A *proper part* is any part other than **1** and **0**; we shall use Π<sup>∗</sup><sub>O</sub> to designate the set of proper parts for a given type. A type O is called *scalar* if Π<sup>∗</sup><sub>O</sub> = ∅; otherwise it is called *composite*.

**Fig. 2.** Illustration of two abstract composite objects of the same type. Composite parts are represented by gray rectangles; scalar parts are represented by colored rectangles. (Color figure online)

Figure 2 shows an example of two abstract composite objects X and Y of the same type O, which we suppose has a set of three proper parts called π<sub>A</sub>, π<sub>B</sub> and π<sub>C</sub>. Parts of an object can themselves be of a composite type; we assume here that type A has four parts π<sub>1</sub>, ..., π<sub>4</sub>, type B has two parts π<sub>5</sub>, π<sub>6</sub>, and type C is scalar. For example, π<sub>B</sub>(X) corresponds to the rectangle numbered 4, and π<sub>C</sub>(Y) is the rectangle numbered 7. Consistent with our definitions, **1** designates the whole object; hence **1**(X) is rectangle 1. These parts are not all present in all objects of a given type; for example, we see that π<sub>C</sub>(X) = ⊠.

Parts can be composed in the usual way: if π : O′ → O″ and π′ : O → O′, their composition π ◦ π′ is defined as o ↦ π(π′(o)). This corresponds intuitively to the notion of the part of some part of an object, and will allow us to point to arbitrarily fine-grained portions of input arguments or output values. For example, in Fig. 2, we have that (π<sub>1</sub> ◦ π<sub>A</sub>)(X) corresponds to the rectangle numbered 8, which indeed corresponds to part π<sub>1</sub> of the part π<sub>A</sub> of object X. If Π = {π<sub>1</sub>, ..., π<sub>n</sub>} is a set of parts and π is another part, we will abuse notation and write Π ◦ π to mean the set {π<sub>1</sub> ◦ π, ..., π<sub>n</sub> ◦ π}.

If π′ is a part of an object, we shall say that π ◦ π′ is a *refinement* of π′; inversely, π′ is a *generalization* of π ◦ π′, as it corresponds to a "greater" part of the object. Remark that since (**1** ◦ π) = (π ◦ **1**) = π, any part π is simultaneously a refinement and a generalization of itself. If π is a proper part of o, all its refinements are also considered as parts of o. The notions of refinement and generalization can be extended to sets of parts. Given two sets Π = {π<sub>1</sub>, ..., π<sub>m</sub>} and Π′ = {π′<sub>1</sub>, ..., π′<sub>n</sub>}, Π is a refinement of Π′ if there exists an injection μ between Π and Π′ such that μ(π) = π′ if and only if π is a refinement of π′; conversely, Π′ is a generalization of Π. Again, a refinement of a set of parts Π′ picks fewer parts, and more selective parts of an object, than Π′. In the example of Fig. 2, the set Π = {π<sub>1</sub> ◦ π<sub>A</sub>, π<sub>C</sub>} is a refinement of Π′ = {π<sub>A</sub>, π<sub>5</sub> ◦ π<sub>B</sub>, π<sub>C</sub>} (the injection here being the two associations π<sub>1</sub> ◦ π<sub>A</sub> ↦ π<sub>A</sub> and π<sub>C</sub> ↦ π<sub>C</sub>).

Given a part π, two objects o, o′ ∈ O are said to *differ on* π if π(o) ≠ π(o′). This can be illustrated in Fig. 2. Objects X and Y differ on π<sub>C</sub>, since π<sub>C</sub>(X) ≠ π<sub>C</sub>(Y). They also differ on π<sub>A</sub> (since rectangles 3 and 5 are different) and on π<sub>4</sub> ◦ π<sub>A</sub> (which produces rectangle 11 for X and ⊠ for Y). However, they do not differ on π<sub>B</sub> (rectangles 4 and 6 are identical), and they do not differ on π<sub>3</sub> ◦ π<sub>A</sub> (rectangles 10 and 16 are identical). This example shows that if two objects differ on a part π, they do not necessarily differ on a refinement of π (compare π<sub>A</sub> and π<sub>1</sub> ◦ π<sub>A</sub>). Given a set of parts Π = {π<sub>1</sub>, ..., π<sub>n</sub>}, two objects differ on Π if they differ on some π<sub>i</sub> ∈ Π. Obviously, if objects differ on Π, they also differ on any of its generalizations.

Finally, two parts π and π′ *intersect* if there exist two parts π<sub>I</sub> and π′<sub>I</sub> (different from **0**) such that whenever (π<sub>I</sub> ◦ π)(o) ≠ ⊠ and (π′<sub>I</sub> ◦ π′)(o) ≠ ⊠, then (π<sub>I</sub> ◦ π)(o) = (π′<sub>I</sub> ◦ π′)(o). Two sets of parts Π and Π′ intersect if at least one of their respective parts intersect. In Fig. 2, the sets {π<sub>A</sub>} and {π<sub>2</sub> ◦ π<sub>A</sub>, π<sub>C</sub>} intersect (they have in common the part π<sub>2</sub> ◦ π<sub>A</sub>), while the sets {π<sub>A</sub>} and {π<sub>5</sub> ◦ π<sub>B</sub>} do not.
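As an illustrative sketch (our own code, independent of the formalism and of the Petit Poucet library discussed later), the notions above can be modeled by representing parts as plain functions and the "nothing" object as a sentinel value:

```python
# Illustrative sketch: parts as functions over nested tuples.
# NOTHING stands for the special "nothing" object; all names are our own.
NOTHING = object()

def one(o):          # the part "1": the whole object
    return o

def zero(o):         # the part "0": always "nothing"
    return NOTHING

def elem(i):
    """Part [i]: the i-th element (1-based) of a vector, or NOTHING."""
    def part(o):
        if o is NOTHING or not (1 <= i <= len(o)):
            return NOTHING
        return o[i - 1]
    return part

def compose(p, q):
    """(p ∘ q)(o) = p(q(o)): a part of a part, i.e. a refinement of q."""
    return lambda o: p(q(o))

x = (("a", "b"), ("c",))
p = compose(elem(2), elem(1))   # second element of the first element
assert p(x) == "b"
assert one(x) == x and zero(x) is NOTHING
```

Composing `elem` functions in this way plays the role of designators pointing to arbitrarily nested portions of a composite object.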

#### **3.2 A Definition of Explanation**

We are now interested in relations between a part of a function's output and one or more parts of that function's input. We shall focus on the set of unary functions f : O → O′. For the sake of simplicity, functions of multiple input arguments will be modeled as unary functions whose input is a composite type that contains the arguments. These composite types will be used informally to illustrate the notions, and will be formally defined in Sect. 4.

**Definition 1.** Let f : O → O′ be a function, Π ⊆ Π<sub>O</sub> be a set of parts of the function's input type, and π ∈ Π<sub>O′</sub> be a part of the function's output type. Given an input object o, consider a set Π such that there exists an object o′ that differs from o only on Π, and for which f(o) and f(o′) differ on π. We say that Π *explains* π for o if Π is minimal, meaning that no refinement of Π satisfies the previous condition.

As an example, consider the function f : ⟨x, y⟩ ↦ xy, with x = 1, y = 1. For this particular input object, the part designating the first element of the input is an explanation, as changing it to any other value changes the result of the function; the same argument can be made for the part that designates the second element of the input. Consider now the case where x = 0, y = 1. This time, the value of the second element is irrelevant: changing it to anything else still produces 0. Therefore, the second element of the input is not an explanation for the output; the first element of the input alone "explains" the result of 0 in the output.
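To make the definition concrete, here is a small brute-force sketch (our own illustration, not the paper's algorithm): over a finite domain, a set of input positions explains the output if a change confined to exactly those positions can alter the result, and no proper subset has this property.

```python
from itertools import product, combinations

def explanations(f, x, domain):
    """All minimal sets of input positions (0-based) that 'explain' f(x):
    some change confined to exactly those positions alters the output,
    and no proper nonempty subset has that property."""
    n = len(x)
    def can_alter(positions):
        # Try every alternative assignment on the chosen positions only.
        for values in product(domain, repeat=len(positions)):
            y = list(x)
            for i, v in zip(positions, values):
                y[i] = v
            # y must differ from x on every chosen position
            if all(y[i] != x[i] for i in positions) and f(y) != f(x):
                return True
        return False
    result = []
    for k in range(1, n + 1):
        for positions in combinations(range(n), k):
            if can_alter(positions) and not any(set(s) < set(positions) for s in result):
                result.append(positions)
    return result

prod = lambda v: v[0] * v[1]
dom = range(-2, 3)
assert explanations(prod, [1, 1], dom) == [(0,), (1,)]   # either argument explains
assert explanations(prod, [0, 1], dom) == [(0,)]         # only x explains the 0
assert explanations(prod, [0, 0], dom) == [(0, 1)]       # both must change together
```

The third assertion previews the discussion of the case x = 0, y = 0 below: neither position alone can alter the product, so the only minimal explanation contains both.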

Based on these simple definitions, we can already contrast our framework with some of the papers surveyed in Sect. 2. First, note how this definition differs from counterfactual causality [1,28]. This can be illustrated by considering the previous function in the case where x = 0 and y = 0. Argument x is not a counterfactual cause of the output value, as changing it to anything else still produces 0; the same argument shows that y is not a cause of the output either. One therefore ends up with the counter-intuitive conclusion that none of the inputs causes the output.

In contrast, there exists a minimal set of parts that satisfies our explanation property: the set that contains *both* the first and the second element. Indeed, there exists another input object that differs on both elements and produces a different result. This is in line with the intuition that either element alone is sufficient to explain the null value produced by the function, and that therefore both need to be changed to have an impact. It highlights a first distinguishing feature of our approach: the presence of multiple parts inside a set indicates a form of "disjunction", or alternate explanations, something that cannot easily be accounted for in many definitions of causality.<sup>1</sup>

Why-provenance is expressed on tuples manipulated by a relational query [17], but our simple case can easily be adapted by assuming that ⟨x, y⟩ and f(⟨x, y⟩) are each tuples with a dummy attribute a. The definition then leads to the conclusion that both x and y are considered to "produce" the output, whereas explanation rather concludes that *either* explains the output; the use of how-provenance [21] would produce a similar verdict. Where-provenance [8] is even less appropriate here, as it makes little sense to ask whether the product of two numbers "copies" any of the input arguments to its output.

Since a single explanation is a set of parts, the set of all explanations is a set of sets of parts. As we have shown, a set of parts intuitively represents an alternative (either part is an explanation). In turn, the elements of a set of sets of parts represent the fact that each of them is an explanation. Therefore, a concise graphical notation for representing sets of sets of parts is the and-or tree, such as the one shown in Fig. 3. In the present case, each leaf of the tree represents a (single) object part, while non-leaf nodes are labeled either with "and" (∧) or "or" (∨). For example, this tree represents the set of sets {{π<sub>1</sub>}, {π<sub>2</sub> ◦ π<sub>3</sub>, π<sub>5</sub> ◦ π<sub>6</sub>, π<sub>4</sub>}, {π<sub>2</sub> ◦ π<sub>3</sub>, π<sub>7</sub> ◦ π<sub>6</sub>, π<sub>4</sub>}}.<sup>2</sup>

<sup>1</sup> Case in point, in [1] a cause is assumed to be a *conjunction* of assertions of the form X = x.

<sup>2</sup> Obviously, there exist multiple equivalent trees for the same set of sets of parts.

# **4 Building Explanations for Functions**

Equipped with these abstract definitions, we shall now establish properties of a handful of elementary functions, namely logical and arithmetic operations and list manipulations. The reader may find that this section states the obvious, as many of the results we present correspond to very intuitive notions. The main interest of our approach is that these seemingly trivial conclusions are not defined *ad hoc*, but rather come as consequences of the general definition of explanation introduced in the previous section.

**Fig. 3.** An example of and-or tree where leaves are object parts.

We first consider as *scalar* types the set of Boolean values B, the set of real numbers R, and the set of characters S. Then, we shall denote by V⟨O<sub>1</sub>, ..., O<sub>n</sub>⟩ the set of vectors of size n, where the i-th element is of type O<sub>i</sub>. Its set of parts Π<sub>V⟨O<sub>1</sub>,...,O<sub>n</sub>⟩</sub> contains **1** and **0**, as well as all functions [i] : V⟨O<sub>1</sub>, ..., O<sub>n</sub>⟩ → O<sub>i</sub> defined as:

$$[i]: \langle o\_1, \dots, o\_n \rangle \mapsto \begin{cases} o\_i & \text{if } 1 \le i \le n \\ \boxtimes & \text{otherwise.} \end{cases}$$

In other words, the proper parts of a vector are each of its elements. We shall designate finite vectors of variable length and uniform type O by V⟨O<sup>∗</sup>⟩. Finally, character strings will be viewed as the type V⟨S<sup>∗</sup>⟩, i.e. finite words over the alphabet of symbols S. We stress that, although our concept of explanation is illustrated on a small set of functions operating on these types, it is by no means limited to these functions or these types.

#### **4.1 Conservative Generalizations**

Some of the functions we shall consider return objects that may be composite; these functions introduce the additional complexity that one may want to refer not only to the whole output of the function, but also to a single part of that function's output. What is more, the inputs of these functions can also be composite objects, and explicitly enumerating all their minimal sets of parts for explanation may not be possible.

Take for example the function f : V⟨O<sup>∗</sup>⟩ → V⟨O<sup>∗</sup>⟩, which simply returns its input vector as is. Suppose that we focus on π = [1], the first element of the output vector. Clearly, the set {[1]}, pointing to the first element of the input vector, should be recognized as the only one that explains this output. However, if O is a composite type, this set is not minimal, and should be further broken down into all the parts of O. Besides being unmanageable, this enumeration also misses the intuition that what explains the first element of the output is simply the first element of the input.

In the following, we employ an alternate approach, which will be to define a set of *conservative generalizations* of the function's input parts. For most functions, the principle will be the same: given an output part π, we shall define a set of sets of input parts {Π<sub>1</sub>, ..., Π<sub>n</sub>}, and demonstrate that any set of input parts Π that explains π on some input intersects with one of the Π<sub>i</sub>. It follows that any minimal set of input parts that explains the output is a refinement of one of the Π<sub>i</sub>.

**Fig. 4.** The sets of parts Π1, Π<sup>2</sup> and Π<sup>3</sup> (in green) represent a conservative generalization of the minimal sets of explanation parts (yellow circles) (Color figure online).

This is illustrated in Fig. 4, where the minimal sets of explanations of an input are shown in yellow. Here, the set {Π<sub>1</sub>, Π<sub>2</sub>, Π<sub>3</sub>} has been identified as the target set of sets of input parts. If one establishes that any set lying outside of the green ovals is not an explanation, it follows that the minimal sets of parts for explanation are all contained inside one of Π<sub>1</sub>, Π<sub>2</sub> and Π<sub>3</sub>. Note that this is a generalization, as, for example, Π<sub>2</sub> does not contain any minimal set. However, this generalization is conservative, in the sense that all minimal sets are indeed contained within a green oval.

The goal is therefore to come up with generalizations that are, in a sense, as tight as possible. In the example of function f above, we could easily demonstrate that any set of input parts Π that has an impact on the first element of the output must contain a refinement of the first element of the input, and therefore identify Π <sup>=</sup> {{[1]}} as a sufficient set that "covers" all the minimal explanation input parts. It so happens that this set corresponds exactly to the intuitive result we expected in the first place: the first element of the input vector contains all the parts that impact the output, and no other part of the input has this property.

#### **4.2 Explanation for Scalar Functions**

In the following, we provide formal definitions of conservative generalizations of the explanation relationship for a number of elementary functions. We start with functions performing basic arithmetic operations that return a scalar value. Establishing explanation for addition over a vector of numbers is trivial.

**Theorem 1.** Let f : V⟨R<sup>∗</sup>⟩ → R be the function defined as ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩ ↦ x<sub>1</sub> + ··· + x<sub>n</sub>. For any input ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩, Π explains **1** on ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩ if and only if Π = {[i]} for some 1 ≤ i ≤ n.

In other words, any single element of the input vector explains the result; the case of subtraction is handled identically. Multiplication, however, has a different definition. This is caused by the fact that 0 is an absorbing element: its presence suffices for a product to yield zero, as explained by the following theorem.

**Theorem 2.** Let f : R<sup>n</sup> → R be the function defined as ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩ ↦ x<sub>1</sub> ··· x<sub>n</sub>. For a given input ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩, Π explains **1** if and only if:

– Π = {[i]} for some 1 ≤ i ≤ n, if x<sub>j</sub> ≠ 0 for all 1 ≤ j ≤ n;
– Π = ⋃<sub>i ∈ {j : 1 ≤ j ≤ n and x<sub>j</sub> = 0}</sub> {[i]} otherwise.

*Proof.* Suppose that all elements of the input vector are non-null. Let Π = {[i]} for some 1 ≤ i ≤ n. Clearly, any input that differs from ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩ only in the i-th element produces a different product, and hence Π is a minimal explanation. Suppose now that at least one element of the vector is null; in such a case, the function returns 0. Let S = {j : 1 ≤ j ≤ n and x<sub>j</sub> = 0} be the set of all such vector indices. A vector must differ from the input in at least all these positions in order to produce a different output; otherwise the function still returns zero. The only refinements of this set are its strict subsets, but none is sufficient to change the output, and hence the defined set is the only minimal set satisfying the explanation property.

The same argument can be made for the Boolean function that computes the conjunction of a vector of Boolean values. If all elements of the vector are ⊤, changing any of them produces a change of value, and hence each {[i]} explains the output. Otherwise, the set Π = ⋃<sub>i ∈ {j : 1 ≤ j ≤ n and x<sub>j</sub> = ⊥}</sub> {[i]} is the only minimal set of input parts that explains the output, by a reasoning similar to the case of multiplication above. A dual argument can be made for disjunction by swapping the roles of ⊤ and ⊥.
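Following Theorem 2 and the argument above, the minimal explanation sets for a product (and, dually, a conjunction) can be computed in closed form; the sketch below (our own code, using 0-based positions) returns a set of singletons when no absorbing element is present, and the single set of absorbing positions otherwise:

```python
def product_explanations(xs):
    """Minimal explanation sets for the product of xs (Theorem 2),
    as a list of position tuples (0-based)."""
    zeros = tuple(i for i, x in enumerate(xs) if x == 0)
    if not zeros:
        return [(i,) for i in range(len(xs))]  # any single factor explains
    return [zeros]  # all zero positions must change together

def conjunction_explanations(bs):
    """Dual case for Boolean conjunction: ⊥ plays the role of 0."""
    falses = tuple(i for i, b in enumerate(bs) if not b)
    if not falses:
        return [(i,) for i in range(len(bs))]
    return [falses]

assert product_explanations([2, 3, 4]) == [(0,), (1,), (2,)]
assert product_explanations([2, 0, 0]) == [(1, 2)]
assert conjunction_explanations([True, False, True]) == [(1,)]
```

Note that no search is involved: the rule is applied directly from the values of the input, which is the tractability property claimed in Sect. 3.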

The cases of the remaining usual arithmetic and Boolean functions can be dispatched easily. Functions taking a single argument (abs, etc.) obviously have this single argument as their only minimal explanation part.

We finally turn to the case of a function that extracts the k-th element of a vector. We recall that vectors can be nested, and hence this element may itself be composite. The intuition here is that what explains a part of the output is that same part in the k-th element of the input.

**Theorem 3.** Let f<sub>k</sub> : V⟨O<sup>n</sup>⟩ → O be the function defined as ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩ ↦ x<sub>k</sub> if 1 ≤ k ≤ n, and ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩ ↦ ⊠ otherwise. Let π ∈ Π<sub>O</sub> be an arbitrary part of O. For any input ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩, Π explains π for ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩ if and only if Π = {π ◦ [k]} and 1 ≤ k ≤ n.

*Proof.* If k < 1 or k > n, f<sub>k</sub> produces ⊠ regardless of its argument, and so no set of input parts explains the result. Otherwise, it suffices to observe that (π ◦ [k])(⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩) = π(x<sub>k</sub>) = π(f<sub>k</sub>(⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩)), and hence {π ◦ [k]} explains the output. No other set satisfies this condition.
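The key identity in this proof can be checked mechanically. In the self-contained sketch below (our own notation, not the paper's), parts are plain Python functions and ⊠ is a sentinel value:

```python
NOTHING = object()  # stands for the "nothing" object ⊠

def elem(i):
    """Part [i]: the i-th element (1-based) of a vector, or NOTHING."""
    return lambda o: o[i - 1] if o is not NOTHING and 1 <= i <= len(o) else NOTHING

def f_k(k):
    """The extraction function of Theorem 3: returns the k-th element."""
    return lambda v: v[k - 1] if 1 <= k <= len(v) else NOTHING

x = (("a", "b"), ("c", "d"))
pi = elem(2)            # an arbitrary part π of the element type
k = 2
# (π ∘ [k])(x) equals π(f_k(x)) on this input
assert pi(elem(k)(x)) == pi(f_k(k)(x)) == "d"
```

The same check fails for no input: extracting the k-th element and then taking part π is, by construction, the same as taking part π ∘ [k] of the whole vector.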

**Table 1.** Definition of elementary vector functions studied in this paper.

$$\begin{aligned} [i](\langle o\_1, \dots, o\_n \rangle) &= \begin{cases} o\_i & \text{if } 1 \le i \le n \\ \boxtimes & \text{otherwise} \end{cases} \\ \alpha\_f(\langle o\_1, \dots, o\_n \rangle) &= \langle f(o\_1), \dots, f(o\_n) \rangle \\ \omega\_f^{k,o}(\langle o\_1, \dots, o\_n \rangle) &= \begin{cases} \langle f(\langle o\_1, \dots, o\_k \rangle), \dots, f(\langle o\_{n-k+1}, \dots, o\_n \rangle) \rangle & \text{if } n \ge k \\ o & \text{otherwise} \end{cases} \\ \text{¿}(\langle b, o, o' \rangle) &= \begin{cases} o & \text{if } b = \top \\ o' & \text{otherwise} \end{cases} \end{aligned}$$

#### **4.3 Explanation for Vector Functions**

We shall now delve into more detail on basic functions that produce a value that may be of non-scalar type, summarized in Table 1 (the function [i] has already been discussed earlier). Here, we must consider the fact that an explanation may refer to a part of an element of the output, i.e. designators of the form π ◦ [i], with π some arbitrary designator.

The first function, noted α<sub>f</sub>, applies a function f to each element of an input vector, resulting in an output vector of the same cardinality.

**Theorem 4.** On a given input ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩, Π explains π ◦ [i] of α<sub>f</sub> if and only if Π = Π′ ◦ [i] for some set of parts Π′, and Π′ explains π for x<sub>i</sub> of f.

*Proof.* The i-th element of the output of α<sub>f</sub> is f(x<sub>i</sub>); if Π′ explains π for x<sub>i</sub> of f, then Π′ ◦ [i] explains π ◦ [i] on ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩ of α<sub>f</sub>. Conversely, suppose that Π′ ◦ [i] explains π ◦ [i] on ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩ of α<sub>f</sub>; by definition, there exists another input ⟨x′<sub>1</sub>, ..., x′<sub>n</sub>⟩ that differs on Π′ ◦ [i] and such that the output of α<sub>f</sub> differs on π ◦ [i]. Then x<sub>i</sub> and x′<sub>i</sub> differ on Π′, and by the definition of α<sub>f</sub>, f(x<sub>i</sub>) and f(x′<sub>i</sub>) differ on π. If Π′ admits a proper part that satisfies this property, then Π′ ◦ [i] is not an explanation, which contradicts the hypothesis. Hence Π′ is minimal, and it therefore explains π for x<sub>i</sub> of f.

Function ω<sup>k,o</sup><sub>f</sub> applies a function f on a sliding window of width k over the input vector. That is, the first element of the output vector is the result of evaluating f on the first k elements; the second element is the evaluation of f on elements at positions 2 to k + 1, and so on. If the input vector has fewer than k elements, the function is defined to return a predefined value o. To establish the set of minimal explanations, we define a special function σ<sub>i</sub>: given a set of parts Π′ such that all parts are of the form π ◦ [j], it replaces each of them by π ◦ [j − i]. In other words, parts pointing to a part π of the j-th element of a vector end up pointing to the same part π of the (j − i)-th element of a vector.

**Theorem 5.** On a given input ⟨x<sub>1</sub>, ..., x<sub>n</sub>⟩, Π explains π ◦ [i] of ω<sup>k,o</sup><sub>f</sub> if and only if n ≥ k, 1 ≤ i ≤ n − k + 1, Π = Π′ ◦ [i] for some set of parts Π′, and σ<sub>i</sub>(Π′) explains π of f on the i-th window.

*Proof.* The proof is almost identical to that of α<sub>f</sub>, with the added twist that in the i-th window, an explanation for f referring to a part of the j-th element of its input vector actually refers to the (j − i)-th element of the input vector given to ω<sup>k,o</sup><sub>f</sub>; this explains the presence of σ<sub>i</sub>. We omit the details.

Finally, the function "¿" acts as a form of if-then-else construct: depending on the value of its (Boolean) first argument, it returns either its second or its third argument (which can be arbitrary objects). Defining its explanation requires a few cases, depending on whether the second and third elements of the input are equal.

**Theorem 6.** On a given input ⟨x<sub>1</sub>, x<sub>2</sub>, x<sub>3</sub>⟩, if x<sub>2</sub> ≠ x<sub>3</sub>, then {[1]} always explains π of ¿; moreover, {π ◦ [2]} explains π of ¿ if x<sub>1</sub> = ⊤, and {π ◦ [3]} explains π of ¿ if x<sub>1</sub> = ⊥. However, if x<sub>2</sub> = x<sub>3</sub>, then:

1. if x<sub>1</sub> = ⊤, Π explains π of ¿ if and only if Π = {π ◦ [2]} or Π = {[1], π ◦ [3]};
2. if x<sub>1</sub> = ⊥, Π explains π of ¿ if and only if Π = {π ◦ [3]} or Π = {[1], π ◦ [2]}.

*Proof.* Direct from the fact that ¿(⟨⊤, x<sub>2</sub>, x<sub>3</sub>⟩) = x<sub>2</sub> and ¿(⟨⊥, x<sub>2</sub>, x<sub>3</sub>⟩) = x<sub>3</sub>. The only corner case is when x<sub>2</sub> = x<sub>3</sub>; if x<sub>1</sub> = ⊤, one must change either x<sub>2</sub>, or *both* x<sub>1</sub> and x<sub>3</sub>, in order to produce a different result (and dually when x<sub>1</sub> = ⊥).
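The three functions of Table 1 can be sketched directly (our own Python rendering, not the paper's library; the window function below uses windows of width k and returns the default value o when the input is too short):

```python
def alpha(f):
    """α_f: apply f to each element of the input vector."""
    return lambda v: tuple(f(x) for x in v)

def omega(f, k, o):
    """ω_f^{k,o}: apply f to each sliding window of width k,
    or return the default value o if the vector is too short."""
    def g(v):
        if len(v) < k:
            return o
        return tuple(f(v[i:i + k]) for i in range(len(v) - k + 1))
    return g

def ite(b, x, y):
    """The "¿" construct: x if b is true, y otherwise."""
    return x if b else y

assert alpha(abs)((-1, 2, -3)) == (1, 2, 3)
assert omega(sum, 2, ())((1, 2, 3)) == (3, 5)
assert omega(sum, 4, ())((1, 2, 3)) == ()
assert ite(True, "a", "b") == "a"
```

These definitions are the behaviors that Theorems 4 to 6 reason about; the theorems describe which input positions (and which parts of those positions) can be blamed for each output position.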

#### **4.4 Explanation for Composed Functions**

Defining and proving conservative generalizations is a task that can quickly become tedious for complex functions, as the previous examples have shown. Moreover, this process must be done from scratch for each new function one wishes to consider, as the proofs for each of them are quite different. In this section, we consider the situation where a complex function f is built through the composition of simpler functions.

We first demonstrate a recipe for building conservative generalizations for compositions of functions. In such a case, it is possible to derive a conservative generalization for f by chaining and combining the generalizations already obtained for the simpler functions it is made of. To ease notation, we shall write Π ⇝<sub>o</sub> π to indicate that Π is a conservative generalization of all minimal input parts that explain π for input o. We extend this notion to sets of output parts: for a set of parts Π′, we have Π ⇝<sub>o</sub> Π′ if Π ⇝<sub>o</sub> π for every π ∈ Π′. We first trivially observe the following:

**Theorem 7.** For a given function f : O → O′ and a given input o ∈ O, if Π<sub>1</sub> ⇝<sub>o</sub> π<sub>1</sub> and Π<sub>2</sub> ⇝<sub>o</sub> π<sub>2</sub>, then Π<sub>1</sub> ∪ Π<sub>2</sub> ⇝<sub>o</sub> {π<sub>1</sub>, π<sub>2</sub>}.

Thus, given a set of output parts Π′, a conservative generalization can be obtained by taking the union of the generalizations for each individual part in Π′. We can then establish a result for the composition of two functions.

**Theorem 8.** Let π be an output part of some function f, and let o ∈ O be a given input. Let Π<sub>f</sub> be a set of parts such that Π<sub>f</sub> ⇝<sub>g(o)</sub> π for function f, and let Π<sub>g</sub> be a set of parts such that Π<sub>g</sub> ⇝<sub>o</sub> Π<sub>f</sub> for some function g. Then Π<sub>g</sub> ⇝<sub>o</sub> π for the function f ◦ g.

*Proof.* Suppose that there is a set of parts Π that does not intersect with Π<sub>g</sub>, and such that for two inputs o and o′ that differ only on Π, π((f ◦ g)(o)) ≠ π((f ◦ g)(o′)). Let x = g(o) and y = g(o′); since π(f(x)) ≠ π(f(y)), x and y differ on some set of parts Π′, and a refinement of Π′ is also a refinement of Π<sub>f</sub>. Since Π<sub>g</sub> ⇝<sub>o</sub> Π<sub>f</sub>, this implies that o and o′ differ on a part that intersects with Π<sub>g</sub>, which contradicts the hypothesis. It follows that all sets of parts that explain π for f ◦ g on input o intersect with Π<sub>g</sub>, and hence Π<sub>g</sub> ⇝<sub>o</sub> π for the function f ◦ g.

Thanks to this result, one can obtain a conservative generalization for f ◦ g by first finding a conservative generalization Π of π for g, and then finding a conservative generalization of Π of <sup>Π</sup> for <sup>f</sup>. This spares us from defining an input-output explanation relation for each possible function, at the price of obtaining a conservative approximation of the actual relation. When expressing these explanations as and-or trees, this simply amounts to appending the root of an explanation (the output of a function) to the leaf designating the corresponding input of the function it is composed with.
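As a toy illustration of this chaining (our own code, not the Petit Poucet API), a conservative generalization can be represented as a mapping from an output part to a set of input parts; composing two functions then amounts to looking up each intermediate part:

```python
def chain(gen_g, gen_f):
    """Chain conservative generalizations: gen_f maps an output part of f
    to input parts of f (which are output parts of g); gen_g maps those to
    input parts of g. The result maps output parts of f ∘ g to input parts
    of g. Parts are encoded as tuples of designator names."""
    def gen(part):
        inputs = set()
        for intermediate in gen_f(part):
            inputs |= gen_g(intermediate)
        return inputs
    return gen

# Hypothetical example: f extracts element 2 of its input vector
# (an output part p maps to p∘[2]), and g reverses a 3-element vector
# (part p∘[i] of its output maps to p∘[4−i] of its input).
gen_f = lambda p: {p + ("[2]",)}
gen_g = lambda p: {p[:-1] + ("[%d]" % (4 - int(p[-1][1])),)}
gen_fg = chain(gen_g, gen_f)
assert gen_fg(()) == {("[2]",)}   # the middle element maps to itself
```

This mirrors the and-or tree construction described above: the root of g's explanation is appended to the leaf of f's explanation that designates the corresponding intermediate value.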

We recall that one of the claimed features of our proposed approach is tractability. The theorems stated throughout this section give credence to this claim. One can see that, for each of the elementary functions studied in Sects. 4.2 and 4.3, determining the sets of input parts that are (conservative) explanations of an output part can be done by applying simple rules that require no particular calculation. Building an explanation for a composed function is then not much harder, and requires properly matching the output parts of a function to the input parts of the one it calls.

# **5 Implementation and Experiments**

Combined, the previous results make it possible to systematically construct explanations for a wide range of computations. It suffices to observe that nested lists of lists, coupled with the functions defined in Sect. 4, represent a significant fragment of a functional programming language such as Lisp.

To illustrate this point, the concepts introduced above have been concretely implemented into *Petit Poucet*<sup>3</sup>, an open source Java library.<sup>4</sup> The library allows

<sup>3</sup> In English *Hop-o'-My-Thumb*, a fairy tale by French writer Charles Perrault where the main character uses stones to mark a trail that enables him to successfully lead his lost brothers back home.

<sup>4</sup> https://github.com/liflab/petitpoucet. Version 1.0 is considered in this paper.

users to create complex functions by composing the elementary functions studied earlier, to evaluate these functions on inputs, and to generate the corresponding explanation graphs. This library is meant as a proof of concept that serves two goals: (1) show the feasibility of our proposed theoretical framework and provide initial results on its running time and memory consumption; (2) provide a test bench allowing us to study the explanation graphs of various functions for various inputs.

#### **5.1 Library Overview**

Petit Poucet provides a set of ready-made Function objects; in its current implementation, it contains all the functions defined in Sect. 4, in addition to a few others for number comparison, type conversion, descriptive statistics (e.g., average), basic I/O (reading and writing to files) and string and list manipulation. Composed functions are created by adding elementary functions into an object called CircuitFunction, and manually connecting the output of each function instance to the input argument of another.

Figure 5 shows a graphical representation of a complex function that can be created by instantiating and composing elementary functions of the library. Each white box represents an elementary function; composition is illustrated by lines connecting the output (dark square) of a function to the input (light square) of another. For functions taking other functions as parameters, namely <sup>α</sup>*<sup>f</sup>* and <sup>ω</sup>*<sup>f</sup>* , the parameterized function is represented by a rectangle attached with a dotted line (such as box A attached to box 2). The composed function shown here is exactly the one from our example in the introduction: from a CSV file (box 1), the second element of each line is extracted and cast to a number (boxes 2 and A), the average over a sliding window is taken (boxes 3 and B), each value is checked to be greater than 3 (boxes 4 and C), and the logical conjunction of all these values is taken (box 5).
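As an illustration, the behavior of this composed pipeline can be sketched in plain Python. This is a hypothetical stand-in for the library's chain of Function objects, and the input format (numbers in the second CSV column) is an assumption for the example:

```python
# Sketch of the composed function of Fig. 5 as plain Python (the library
# builds it by wiring Function objects instead). Assumes a CSV string
# whose second column holds the numbers.

def sliding_window_check(csv_text, window=3, threshold=3.0):
    # Extract the second element of each line and cast it to a number.
    values = [float(line.split(",")[1]) for line in csv_text.strip().splitlines()]
    # Average over a sliding window of the given width.
    averages = [sum(values[i:i + window]) / window
                for i in range(len(values) - window + 1)]
    # Conjunction of "average > threshold" over every window.
    return all(avg > threshold for avg in averages)

csv_text = "a,5\nb,2\nc,1\nd,2\ne,9"
result = sliding_window_check(csv_text)  # one window averages below 3
```

On this invented input, the first window (5, 2, 1) averages below 3, so the global conjunction is false.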

**Fig. 5.** Evaluating a condition on the average of values over a sliding window.

Once created, a function can be evaluated with input arguments. When this happens, it returns a special object called a Queryable. The purpose of the queryable is to retain the information about the function's evaluation necessary to answer "queries" about it at a later time. Each evaluation of the function produces a distinct queryable object with its own memory. Given a designator pointing to a part of the function's output, calling a Queryable's method query produces the corresponding explanation and-or tree. Typically, one is interested in a simplified rendition displaying only the root, leaves and Boolean nodes, hiding the nodes made of the inputs and outputs of intermediate functions in the explanation. On the CSV file shown in the introduction, the library produces the tree that is shown in Fig. 1.<sup>5</sup>

One can see that the false result produced by the function admits three alternate explanations (the three sub-trees under the "or" node). The first explanation involves the numerical values in lines 3-4-5 (children of the first "and" node), the second includes lines 4-5-6, and the third explanation is made of the numbers in lines 5-6-7. This corresponds exactly to the three windows of successive values whose average is not greater than 3. Indeed, the presence of any of these three windows, and nothing else, suffices for our global condition to evaluate to false. Note how the explanation generation mechanism correctly and automatically identifies these windows, and also how, thanks to our concept of designator, the explanation can refer to specific locations inside specific lines of the input object.
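A minimal sketch of how such an "or of ands" explanation can be recovered, assuming the numerical values have already been extracted from the file (illustrative code, not the library's actual algorithm):

```python
# Sketch: recovering the "or of ands" explanation for a false conjunction
# over sliding-window averages. Each failing window alone explains the
# false result ("or"); each of its lines is a mandatory part ("and").

def explain_false(values, window=3, threshold=3.0):
    failing = []
    for i in range(len(values) - window + 1):
        if sum(values[i:i + window]) / window <= threshold:
            failing.append(("and", list(range(i, i + window))))  # line indices
    return ("or", failing) if failing else None  # None: the result is true

expl = explain_false([5, 2, 1, 2, 9])
```

Here two overlapping windows fail, so the returned tree is a disjunction of two conjunctions of line indices, mirroring the structure of Fig. 1.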

In Petit Poucet, lineage capabilities are *built-in*. The user is not required to perform any special task in order to keep track of provenance information. In addition, it should be noted that one does not need to declare in advance what designation graph will be asked for: the construction of the Queryable objects is the same, regardless of the output part being used as the starting point. Finally, the library follows a modular architecture where the set of available functions can easily be extended by creating packages defining new descendants of Function. It suffices for each function to produce a Queryable object that computes its specific input-output relationships; an explanation can then be computed for any composed function that uses it.

#### **5.2 Experiments**

To test the performance of the library, we selected various data processing tasks and implemented them as composed functions in Petit Poucet.

*Get All Numbers.* Represents a simple operation that takes an input comma-separated list of elements, and produces a vector containing only the elements of the file that are numerical values. The explanation we ask is to point at a given element of the output vector, and retrace the location in the input string that corresponds to this number.
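A rough Python sketch of this task, attaching a simple character-range "designator" to each output element (the representation and names are ours, for illustration only):

```python
# Sketch of the Get All Numbers task: keep the numeric elements of a
# comma-separated string, recording for each output element the character
# range in the input it came from (a crude stand-in for a designator).

def get_all_numbers(text):
    out, pos = [], 0
    for token in text.split(","):
        start = pos
        pos += len(token) + 1  # advance past the token and the comma
        try:
            out.append((float(token), (start, start + len(token))))
        except ValueError:
            pass  # non-numeric elements are dropped
    return out

numbers = get_all_numbers("3,a,7.5,x,2")
```

Each returned pair carries the value and the exact span of the input string that explains it, which is precisely the retracing asked for above.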

*Sliding Window Average.* Given a CSV file, this task extracts the numerical value in each line, computes the average of each set of n successive values and checks that it is below some threshold t (similar to the example we discussed

<sup>5</sup> Or more precisely a directed acyclic graph, since leaf nodes with the same designator are not repeated.

earlier). Computing an aggregation over a sliding window is a common task in the field of event stream processing [26] and runtime verification [4], and is also provided by most statistical software, such as R's smooth package. It can be seen as a basic form of trend deviation detection [41], where the end result of the calculation is an "alarm" indicating that the expected trend has not been followed across the whole data file; a classical example of this is the detection of temperature peaks in a server rack [40]. The explanation we ask is to point at the output Boolean value (true or false), and retrace the locations in the file corresponding to the numbers explaining the result.

*Triangle Areas.* Given a list of arbitrary vectors, this task checks that each vector contains the lengths of the three sides of a valid triangle; if so, it computes their area using Heron's formula<sup>6</sup> and sums the area of all valid triangles. It was chosen because it involves multiple if-then-else cases to verify the sides of a triangle. It also requires a slightly more involved arithmetical calculation to obtain the area, which is implemented entirely by composing basic arithmetic operators in a composed function. The whole function is the composition of 59 elementary functions.

This example is notable because the explanations it generates may take different forms. An explanation for the output value (total area) includes an explanation for each vector: if it represents a valid triangle, it refers to its three sides; if it does not, the condition can fail for different reasons: the vector may not have three elements, one of its elements may not be a number, it may contain a negative value, or it may violate the triangle inequality. Each condition, when violated, produces different explanations pointing at different elements of the vector, or at the vector as a whole. Moreover, each vector in the list may fail to be a valid triangle for a different reason, and hence a different explanation will be built for each of them. For example, given the list [⟨a, 4, 2⟩, ⟨3, 5, 6⟩, ⟨2, 3⟩], a tree will be produced that describes an explanation involving three elements: the first points at the element a of the first vector (not a number), the second points at all three components of the second vector (valid triangle), and the last points at the whole third vector (wrong number of elements).
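The different failure cases can be sketched as follows (illustrative Python with invented explanation labels; the actual library builds and-or trees out of 59 composed elementary functions instead):

```python
import math

# Sketch of the Triangle Areas task with per-vector explanations. The
# string labels are invented stand-ins for the library's explanations.

def classify(v):
    if len(v) != 3:
        return "wrong arity"
    if not all(isinstance(x, (int, float)) for x in v):
        return "not a number"
    if any(x <= 0 for x in v):
        return "negative value"
    a, b, c = sorted(v)
    if a + b <= c:
        return "triangle inequality"
    return "valid"

def total_area(vectors):
    total, reasons = 0.0, []
    for v in vectors:
        reason = classify(v)
        reasons.append(reason)
        if reason == "valid":
            s = sum(v) / 2  # Heron's formula
            total += math.sqrt(s * (s - v[0]) * (s - v[1]) * (s - v[2]))
    return total, reasons

area, reasons = total_area([["a", 4, 2], [3, 5, 6], [2, 3]])
```

On the example list from the text, only the second vector contributes an area; the other two yield distinct explanations.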

*Nested Bounding Boxes.* Given a DOM tree<sup>7</sup>, this task checks that each element has a bounding box (width and height) larger than all of its children. This condition is the symptom of a layout bug which shows visually as an element protruding from its parent box inside the web browser's window. This corresponds to one of the properties that is evaluated by web testing tools such as Cornipickle [24] and ReDeCheck [49] on real web pages. In this task, trees are represented as nested lists-of-lists, with each DOM node corresponding to a triplet made of its width, height, and a list of its children nodes. The explanation we ask is to point at the Boolean output of the condition, and retrace the nodes of the tree that violate it.
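A simplified sketch of this check, assuming the nested list-of-lists encoding [width, height, children] described above and ignoring element positions (which the real property would also take into account):

```python
# Sketch of the Nested Bounding Boxes task: each node is a triplet
# [width, height, children]. Returns the paths (sequences of child
# indices) of nodes whose box protrudes from their parent's box.

def violations(node, path=()):
    w, h, children = node
    bad = []
    for i, child in enumerate(children):
        cw, ch, _ = child
        if cw > w or ch > h:
            bad.append(path + (i,))  # child i sticks out of this node
        bad.extend(violations(child, path + (i,)))
    return bad

# Invented tree: the second child (120 wide) protrudes from its parent.
tree = [100, 50, [[40, 30, []], [120, 20, []]]]
offenders = violations(tree)
```

The returned paths play the role of the explanation: they retrace exactly which nodes of the tree violate the condition.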

<sup>6</sup> A = √(s(s − a)(s − b)(s − c)), where s = (a + b + c)/2.

<sup>7</sup> A DOM tree represents the structure of elements in an HTML document [2].

In all these tasks, the inputs given to the function are randomly generated structures of the corresponding type. The experiments were implemented using the LabPal testing framework [25], which makes it possible to bundle all the necessary code, libraries and input data within a single self-contained executable file, such that anyone can download and independently reproduce the experiments. A downloadable lab instance containing all the experiments of this paper can be obtained online [27]. Overall, our empirical measurements involve 56 individual experiments, which together generated 224 distinct data points. All the experiments were run on an AMD Athlon II X4 640 at 1.8 GHz running Ubuntu 18.04, inside a Java 8 virtual machine with 3566 MB of memory.

**Memory Consumption.** The first experiment aims to measure the amount of memory used up by the Queryable objects generated by the evaluation of a function, and the impact of the size of the input on global memory consumption. To this end, we ran various functions on inputs of different size; for each, we measured the amount of memory consumed, with explainability successively turned on and off. This is possible thanks to a switch provided by Petit Poucet, and which allows users to completely disable tracking if desired.

**Fig. 6.** Impact of explainability on memory consumption.

Figure 6 shows a plot that compares the amount of memory consumed by Function objects. Each point in the plot corresponds to a pair of experiments: the x coordinate corresponds to the memory consumed by a function without explainability, and the y coordinate corresponds to the memory consumed by the same function on the same input, but with explainability enabled. All the points for the same task have been grouped into a category and are given the same color.

Analyzing this plot brings both bad news and good news. The "bad" news is that the additional memory required for explainability is high in relative terms. For example, a composed function that requires 498 KB to be evaluated on an input requires close to 15 MB once explainability is enabled. The "good" news is twofold. First, this consumption is still reasonable in absolute terms: at this rate, it takes an input file of 42 million lines before filling up the available RAM in a 64 GB machine. Second, and most importantly, the relationship between memory consumption with and without lineage is linear: for all the tasks we tested, if m is the memory used without lineage, then the memory m′ used when lineage is enabled is in O(m), i.e., the ratio m′/m does not depend on the size of the input.

These figures should be put in context by comparing the overhead incurred by other systems mentioned in Sect. 2. Related systems for provenance in databases (namely Polygen [51], Mondrian [20], MXQL [48], DBNotes [11], pSQL [7] and Orchestra [29]) do not divulge their storage overhead for provenance data. A recent technical report on a provenance-aware database management system measures an overhead ranging between 19% and 702% [3]. Dynamic taint propagation systems report a memory overhead reaching 4× for TaintCheck [37], 240× for Dytan [14], and "an enormity" of logging information for Rifle [13] (authors' quote). Although these systems operate at a different level of abstraction, this shows that explainability is inherently costly regardless of the approach chosen.

**Computation Time.** We performed the same experiments, but this time measuring computation time. The results are shown in Fig. 7; similar to memory consumption, they compare the running time of the same function on the same input, both with and without explanation tracking. The largest slowdown observed across all instances is 6.7×. For a task like *Sliding Window Average*, the average slowdown observed is 1.93× across all inputs. Although this slowdown is non-negligible, it remains reasonable, adding at most a few hundred milliseconds on the problems considered in our benchmark.

**Fig. 7.** Impact of explainability on computation time.

Again, these results should be put in context with respect to existing works that include a form of lineage. The Mondrian system reports an average slowdown of 3×; pSQL ranges between 10× and 1,000×; the remaining tools do not report CPU overhead. For taint analysis tools, Dytan reports a 30–50× slowdown; GIFT-compiled programs are slowed down by up to 12×; TaintCheck incurs a slowdown of around 20×, and Rifle of 1–2×. Of course, these various systems compute different types of lineage information, but these figures give an outlook of the order of magnitude one should expect from such systems.

# **6 Conclusion**

This paper provided the formal foundations for a generic and granular explainability framework. An important highlight of this model is its capability to handle abstract composite data structures, including character strings and lists of elements. The paper then defined the notion of *designator*: functions that can point to and extract *parts* of these data structures. An explainability relationship on functions has been formally defined, and conservative approximations of this relation have been proved for a set of elementary functions. A point in favor of this approach is that explanations of composed functions can be built by composing the explanations for elementary functions. Combined, these concepts make it possible to automatically extract the explanation of a result for generic functions at a fine level of granularity. These concepts have been implemented into a proof-of-concept, yet fully functional, library called *Petit Poucet*, and evaluated experimentally on a number of data processing tasks. These experiments revealed that the amount of memory required to track explainability metadata is relatively high, but more importantly, showed that it is *linear* in the amount of memory required to evaluate the function in the first place.

Obviously, Petit Poucet is not intended to replace programs written using other languages and following different paradigms. However, it could be used as a library by other tools that could benefit from its explanation features. In particular, testing libraries such as JUnit could be extended with assertions written as Petit Poucet functions, providing a detailed explanation of a test failure without requiring extra code. Explainability functionalities could also easily be retrofitted into existing (Java) software, with minimal interference with their current code. Case in point, we have already identified the Cornipickle web testing tool [24] and the BeepBeep event stream processing engine [26] as some of the first targets for the addition of explainability based on Petit Poucet. A lineage-aware version of the GRAL plotting library<sup>8</sup> is also considered.

The existence of a definition of fine-grained explainability opens the way to multiple exciting theoretical questions. For example: for a given function, is there a part of the input that is present in all explanations? We can see an example of this in Fig. 1, with the leaf pointing to value −80. Intuitively, this tends to indicate that some parts of an input have a greater "responsibility" than others in the result, and could provide an alternate way of quantifying this notion than what has been studied so far [12]. Conversely, is there a part of the input that never explains the production of the output, regardless of the input? This latter question could shed a different light on an existing notion called *vacuity* [6], expressed not in terms of elements of the specification, but in terms of the parts of

<sup>8</sup> https://github.com/eseifert/gral.

the input it is evaluated on. More generally, explainability can be viewed as a particular form of static analysis for functions; it would therefore be interesting to recast our model in the abstract interpretation framework [35,39] in order to further assess its strengths and weaknesses.

Finally, explanations could also prove useful from a testing and verification standpoint. The explanation graph could be used for log trace and bug triaging [30]: if two execution traces violate the same condition, one could keep one trace instance for each distinct explanation they induce, as representatives of traces that fail for different reasons. This could help reduce the amount of log data that needs to be preserved, by keeping only one log instance of each type of failure.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Latticed** *k***-Induction with an Application to Probabilistic Programs**

Kevin Batz<sup>1(B)</sup>, Mingshuai Chen<sup>1(B)</sup>, Benjamin Lucien Kaminski<sup>2(B)</sup>, Joost-Pieter Katoen<sup>1(B)</sup>, Christoph Matheja<sup>3(B)</sup>, and Philipp Schröer<sup>1</sup>

> <sup>1</sup> RWTH Aachen University, Aachen, Germany {kevin.batz,chenms,katoen}@cs.rwth-aachen.de <sup>2</sup> University College London, London, UK b.kaminski@ucl.ac.uk <sup>3</sup> ETH Zürich, Zürich, Switzerland cmatheja@inf.ethz.ch

**Abstract.** We revisit two well-established verification techniques, k*-induction* and *bounded model checking* (BMC), in the more general setting of fixed point theory over complete lattices. Our main theoretical contribution is *latticed* k*-induction*, which (i) generalizes classical k-induction for verifying transition systems, (ii) generalizes Park induction for bounding fixed points of monotonic maps on complete lattices, and (iii) extends from naturals k to transfinite ordinals κ, thus yielding κ*-induction*.

The lattice-theoretic understanding of k-induction and BMC enables us to apply both techniques to the *fully automatic verification of infinite-state probabilistic programs*. Our prototypical implementation manages to automatically verify non-trivial specifications for probabilistic programs taken from the literature that, using existing techniques, cannot be verified without synthesizing a stronger inductive invariant first.

**Keywords:** k-induction · Bounded model checking · Fixed point theory · Probabilistic programs · Quantitative verification

# **1 Introduction**

Bounded model checking (BMC) [12,17] is a successful method for analyzing models of hardware and software systems. For checking a *finite-state* transition system (TS) against a safety property ("bad states are unreachable"), BMC unrolls the transition relation until it either finds a counterexample and hence refutes the property, or reaches a pre-computed completeness threshold on the unrolling depth and accepts the property as verified. For *infinite-state* systems, however, such completeness thresholds need not exist (cf. [64]), rendering BMC a *refutation-only* technique. To *verify* infinite-state systems, BMC is typically combined with the search for an *inductive invariant*, i.e., a superset of the reachable

This work has been partially funded by the ERC Advanced Project FRAPPANT under grant No. 787914.

© The Author(s) 2021

A. Silva and K. R. M. Leino (Eds.): CAV 2021, LNCS 12760, pp. 524–549, 2021. https://doi.org/10.1007/978-3-030-81688-9\_25

states which is closed under the transition relation. Proving a (not necessarily inductive) safety property then amounts to *synthesizing* a sufficiently strong, often complicated, inductive invariant that excludes the bad states. A plethora of techniques target computing or approximating inductive invariants, including IC3 [14], induction [13,20], interpolation [50,51], and predicate abstraction [27,36]. However, invariant synthesis can hinder full automation, as it either relies on user-supplied annotations or confines push-button technologies to semi-decision or approximate procedures.

k*-induction* [65] generalizes the principle of simple induction (aka 1-induction) by considering k consecutive transition steps instead of only a single one. It is more powerful: an invariant can be k-inductive for some k > 1 but not 1-inductive. Following the seminal work of Sheeran et al. [65], which combines k-induction with SAT solving to check safety properties, k-induction has found a broad spectrum of applications in the realm of hardware [29,37,45,65] and software verification [10,21–23,55,63]. Its success is due to (1) being a foundational yet potent reasoning technique, and (2) integrating well with SAT/SMT solvers, as also pointed out in [45]: "*the simplicity of applying* k*-induction made it the go-to technique for SMT-based infinite-state model checking*". This paper explores whether k-induction can have a similar impact on the *fully automatic verification* of infinite-state *probabilistic programs*. That is, we aim to verify that the *expected value* of a specified *quantity* (think: "quantitative postcondition") after the execution of a probabilistic program is bounded by a specified threshold.
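For intuition, classical k-induction on a finite TS can be sketched by explicit enumeration; real tools replace the enumeration below with a SAT/SMT encoding, and the example system and property are invented for illustration. The property P here is 2-inductive but not 1-inductive:

```python
# Explicit-state sketch of classical k-induction for a finite transition
# system. `succ` maps a state to its successor set, `P` is the candidate
# invariant, `init` the set of initial states.

def base_case(init, succ, P, k):
    # Base case: P holds on all states reachable in fewer than k steps.
    frontier = set(init)
    for _ in range(k):
        if not all(P(s) for s in frontier):
            return False
        frontier = {t for s in frontier for t in succ(s)}
    return True

def step_case(states, succ, P, k):
    # Step case: every path of k consecutive P-states leads only to P-states.
    def p_paths(prefix):
        if len(prefix) == k:
            yield prefix
        else:
            for t in succ(prefix[-1]):
                if P(t):
                    yield from p_paths(prefix + (t,))
    return all(P(t)
               for s in states if P(s)
               for path in p_paths((s,))
               for t in succ(path[-1]))

# Invented system: 0 <-> 1 is the reachable part; the unreachable state 2
# leads to the bad state 3, which defeats plain (1-)induction.
succ = lambda s: {0: {1}, 1: {0}, 2: {3}, 3: {3}}[s]
P = lambda s: s != 3
k_inductive = [k for k in (1, 2)
               if base_case({0}, succ, P, k) and step_case(range(4), succ, P, k)]
```

State 2 satisfies P but steps to the bad state 3, so 1-induction fails; no path of two consecutive P-states passes through 2, so 2-induction succeeds.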

*Example 1 (Bounded Retransmission Protocol* [19,32]*).* The loop

while (*sent* < *toSend* ∧ *fail* < *maxFail*) {
&nbsp;&nbsp;&nbsp;&nbsp;{ *fail* := 0; *sent* := *sent* + 1 } [0.9] { *fail* := *fail* + 1; *totalFail* := *totalFail* + 1 }
}

models a simplified version of the bounded retransmission protocol, which attempts to transmit *toSend* packages via an unreliable channel (that fails with probability 0.1) allowing for at most *maxFail* retransmissions per package.
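A Monte Carlo sketch of this loop (our own illustrative code; the choice of *maxFail* = 5 and the number of runs are arbitrary) suggests that the expected number of failed transmissions indeed stays small:

```python
import random

# Monte Carlo sketch of the loop in Example 1. The probabilistic choice
# [0.9] takes the left branch with probability 0.9 (channel failure
# probability 0.1).

def simulate(toSend, maxFail, rng):
    sent = fail = totalFail = 0
    while sent < toSend and fail < maxFail:
        if rng.random() < 0.9:
            fail, sent = 0, sent + 1                      # package delivered
        else:
            fail, totalFail = fail + 1, totalFail + 1     # retransmission
    return totalFail

rng = random.Random(42)  # fixed seed, for reproducibility
runs = [simulate(3, 5, rng) for _ in range(10000)]
mean = sum(runs) / len(runs)  # empirically well below the bound of 1
```

Simulation of course only estimates the expected value; the point of this paper is to *verify* such bounds, as done next.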

Using our generalization of k-induction, we can fully automatically verify that the *expected total number of failed transmissions* is at most 1, if the number of packages we want to (successfully) send is at most 3. In terms of weakest preexpectations [38,44,49], this quantitative property reads

$$\mathsf{wp}[C](\mathit{totalFail}) \;\preceq\; [\mathit{toSend} \leq 3] \cdot (\mathit{totalFail} + 1) \;+\; [\mathit{toSend} > 3] \cdot \infty.$$

The bound on the right-hand side of the inequality is 4-inductive, but *not* 1-inductive; verifying the same bound using 1-induction requires finding a nontrivial, and far less perspicuous, inductive invariant. Moreover, if we consider an arbitrary number of packages to send, i.e., we drop [*toSend* ≤ 3], this bound becomes invalid. In this case, our BMC procedure produces a counterexample, i.e., values for *toSend* and *maxFail*, proving that the bound does not hold.

Lifting the classical formalization (and SAT encoding) of k-induction over TSs to the probabilistic setting is non-trivial. We encounter the following challenges:

(A) *Quantitative reachability.* In a TS, a state reachable within k steps remains reachable as k increases. In contrast, reachability *probabilities* in Markov chains (a common operational model for probabilistic programs [28]) may increase as k increases. Hence, proving that the probability of reaching a bad state remains below a given threshold is more intricate than reasoning about qualitative reachability.

(B) *Counterexamples are subsystems.* In a TS, an acyclic path from an initial to a bad state suffices as a witness for refuting safety, i.e., non-reachability. SAT encodings of k-induction rely on this by expressing the absence of witnesses up to a certain path-length. In the probabilistic setting, however, witnesses are no longer single paths [30]. Rather, a witness for the probability of reaching a bad state to exceed a threshold is a *subsystem* [15], i.e., a set of possibly cyclic paths.

(C) *Symbolic encodings.* To enable fully automated verification, we need a suitable encoding such that our lifting integrates well into SMT solvers. Verifying probabilistic programs involves reasoning about execution *trees*, where each (weighted) branch corresponds to a probabilistic choice. A suitable encoding needs to capture such trees which requires more involved theories than encoding paths in classical k-induction.

We address challenges (A) and (B) by developing *latticed* k*-induction*, which is a proof technique in the rather general setting of fixed point theory over complete lattices. Latticed k-induction generalizes classical k-induction in three aspects: (1) it works with any monotonic map on a complete lattice instead of being confined to the transition relation of a transition system, (2) it generalizes the Park induction principle for bounding fixed points of such monotonic maps, and (3) it extends from natural numbers k to (possibly transfinite) ordinals κ, hence its short name: κ*-induction*.

It is this lattice-theoretic understanding that enables us to lift both k-induction and BMC to reasoning about quantitative properties of probabilistic programs. To enable *automated* reasoning, we address challenge (C) by an incremental SMT encoding based on the theory of quantifier-free mixed integer and real arithmetic with uninterpreted functions (QF_UFLIRA). We show how to effectively compute all needed operations for κ-induction using the SMT encoding and, in particular, how to decide *quantitative entailments*.

A prototypical implementation of our method demonstrates that κ-induction for (linear) probabilistic programs manages to automatically verify non-trivial specifications for programs taken from the literature which—using existing techniques—cannot be verified without synthesizing a stronger inductive invariant.

Due to space restrictions, most proofs and details about individual benchmarks have been omitted; they are found in an extended version of this paper [8].

**Related Work.** Besides the aforementioned related work on k-induction, we briefly discuss other automated analysis techniques for probabilistic systems and other approaches for bounding fixed points. Symbolic engines exist for exact inference [26] and sensitivity analysis [33]. Other automated approaches focus on bounding expected costs [56], termination analysis [2,16], and static analysis [3,67]. BMC has been applied in a rather rudimentary form to the on-the-fly verification of finite unfoldings of probabilistic programs [35], and the enumerative generation of counterexamples in finite Markov chains [68]. (Semi-)automated invariant-synthesis techniques can be found in [6,24,41]. A recent variant of IC3 for probabilistic programs called PrIC3 [7] is restricted to finite-state systems. When applied to finite-state Markov chains, our κ-induction operator is related to other operators that have been employed for determining reachability probabilities through value iteration [4,31,61]. In particular, when iterated on the candidate upper bound, the κ-induction operator coincides with the (upper value iteration) operator in interval iteration [4]; the latter operator can be used together with the up-to techniques (cf. [53,58,59]) to prove our κ-induction rule sound (in contrast, we give an elementary proof). However, the κ-induction operator avoids comparing current and previous iterations. It is thus easier to implement and more amenable to SMT solvers. Finally, the proof rules for bounding fixed points recently developed in [5] are restricted to finite-state systems.

### **2 Verification as a Fixed Point Problem**

We start by recapping some fundamentals on fixed points of monotonic operators on complete lattices before we state our target verification problem.

*Fundamentals.* For the next three sections, we fix a *complete lattice* (E, ⊑), i.e., a carrier set E together with a partial order ⊑, such that every subset S ⊆ E has a *greatest lower bound* ⨅S (also called the *meet* of S) and a *least upper bound* ⨆S (also called the *join* of S). For just two elements {g, h} ⊆ E, we denote their meet by g ⊓ h and their join by g ⊔ h. Every complete lattice has a *least* and a *greatest* element, which we denote by ⊥ and ⊤, respectively.

In addition to (E, ⊑), we also fix a *monotonic operator* Φ: E → E. By the Knaster-Tarski theorem [43,47,66], every monotonic operator Φ admits a *complete lattice of (potentially infinitely many) fixed points*. The least fixed point lfp Φ and the greatest fixed point gfp Φ are moreover constructible by (possibly transfinite) *fixed point iteration* from ⊥ and ⊤, respectively: Cousot & Cousot [18] showed that there exist ordinals α and β such that<sup>1</sup>

$$\mathsf{lfp}\,\Phi \;=\; \Phi^{\lceil\alpha\rceil}(\bot) \qquad \text{and} \qquad \mathsf{gfp}\,\Phi \;=\; \Phi^{\lfloor\beta\rfloor}(\top), \tag{†}$$

where Φ^⌈δ⌉(g) denotes the *upper* δ*-fold iteration* and Φ^⌊δ⌋(g) denotes the *lower* δ*-fold iteration* of Φ on g, respectively. Formally, Φ^⌈δ⌉(g) is given by<sup>2</sup>

<sup>1</sup> We use lowercase Greek letters α, β, γ, δ, etc. to denote arbitrary (possibly transfinite) ordinals and i, j, k, m, n, etc. to denote natural (finite) numbers in ℕ.

<sup>2</sup> To ensure well-definedness of transfinite iterations, we fix an *ambient ordinal* ν and *tacitly assume* δ < ν *for all ordinals* δ *considered throughout this paper.* Formally, ν is the smallest ordinal such that |ν| > |E|. Intuitively, ν then upper-bounds the length of any repetition-free sequence over elements of E.

$$\Phi^{\lceil \delta \rceil}(g) = \begin{cases} g & \text{if } \delta = 0, \\ \Phi\left(\Phi^{\lceil \gamma \rceil}(g)\right) & \text{if } \delta = \gamma + 1 \text{ is a successor ordinal,} \\ \bigsqcup\left\{\Phi^{\lceil \gamma \rceil}(g) \mid \gamma < \delta\right\} & \text{if } \delta \text{ is a limit ordinal.} \end{cases}$$

Intuitively, if δ is the successor of γ, then we simply do another iteration of Φ. If δ is a limit ordinal, then Φ^⌈δ⌉(g) can also be thought of as a limit, namely of iterating Φ on g. However, simply iterating Φ on g need not always converge, especially if the iteration does not yield an ascending chain. To remedy this, we take as limit the join over the whole (possibly transfinite) iteration sequence, i.e., the least upper bound over all elements that occur along the iteration. The lower δ-fold iteration Φ^⌊δ⌋(g) is defined analogously to Φ^⌈δ⌉(g), except that we take a meet instead of a join whenever δ is a limit ordinal.

An important special case for fixed point iteration (see (†)) is when the operator Φ is *Scott-continuous* (or simply *continuous*), i.e., if Φ(⊔{g₁, g₂, ...}) = ⊔ Φ({g₁, g₂, ...}) holds for every ascending chain g₁ ⊑ g₂ ⊑ ⋯. In this case, α in (†) coincides with the first infinite limit ordinal ω (which can be identified with the set ℕ of natural numbers). This fact is also known as the Kleene fixed point theorem [1].

*Problem Statement.* Fixed points are ubiquitous in computer science. Prime examples of properties that can be conveniently characterized as least fixed points include both the set of reachable states in a transition system and the function mapping each state in a Markov chain to the probability of reaching some goal state (cf. [60]). However, least and greatest fixed points are often difficult or even impossible [39] to compute; it is thus desirable to *bound* them.
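The first of these examples, reachable states as a least fixed point, can be made concrete in a few lines of Python. The transition system below is a toy example of our own; `lfp` performs plain Kleene iteration of Φ(F) = I ∪ Succs(F) from ⊥ = ∅, which terminates here because the state space is finite:

```python
# Toy 4-state transition system (hypothetical example).
I = frozenset({0})                       # initial states
T = {(0, 1), (1, 2), (2, 1), (3, 3)}     # transition relation

def succs(F):
    # States reachable from F in exactly one transition.
    return frozenset(t for (s, t) in T if s in F)

def phi(F):
    # Monotonic operator whose least fixed point is the set of reachable states.
    return I | succs(F)

def lfp(op):
    F = frozenset()                      # start the iteration at bottom, the empty set
    while op(F) != F:
        F = op(F)
    return F

print(sorted(lfp(phi)))                  # state 3 is unreachable from I
```

The same loop would not terminate on an infinite lattice; that is exactly why the paper works with (possibly transfinite) iteration and with bounds rather than exact fixed points.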

For example, it may be sufficient to prove that a system modeled as a Markov chain reaches a bad state from its initial state with probability *at most* 10⁻⁶, instead of computing *precise* reachability probabilities for each state. Moreover, if said probability is *not* bounded by 10⁻⁶, we would like to witness that as well.

In general lattice-theoretic terms, our problem statement reads as follows:

Given a complete lattice (E, ⊑), a monotonic operator Φ: E → E, and a candidate upper bound f ∈ E on lfp Φ, *prove* or *refute* that lfp Φ ⊑ f.

For *proving*, we will present *latticed* k*-induction*; for *refuting*, we will present *latticed bounded model checking*. Running both in parallel may (and under certain conditions: *will*) lead to a decision of the above problem.

# **3 Latticed** *k***-Induction**

In this section, we generalize the well-established k-induction verification technique [23,29,37,45,55,65] to *latticed* k*-induction* (for short: κ*-induction*; reads: "kappa induction").

**Fig. 1.** κ-induction and latticed BMC in case that lfp Φ ⊑ f. An arrow from g to h indicates g ⊑ h. The solid blue arrow from Φ(Ψ_f^⌊κ⌋(f)) to f is the premise of κ-induction, i.e., the LHS of Lemma 2, which implies the dash-dotted blue arrow from Φ(Ψ_f^⌊κ⌋(f)) to Ψ_f^⌊κ⌋(f), i.e., the RHS of Lemma 2. The dashed blue arrow from lfp Φ to Φ(Ψ_f^⌊κ⌋(f)) is a consequence of the dash-dotted arrow (by Park induction, Theorem 1) and ultimately proves that lfp Φ ⊑ f.

With κ-induction, our aim is to *prove* that lfp Φ ⊑ f. To this end, we attempt "ordinary" induction, also known as *Park induction*:

**Theorem 1 (Park Induction** [57]**).** *Let* f ∈ E*. Then*

Φ(f) ⊑ f implies lfp Φ ⊑ f.

Intuitively, this principle says: if pushing our candidate upper bound f through Φ takes us *down* in the partial order ⊑, we have verified that f is indeed an upper bound on lfp Φ. The true power of Park induction is that applying Φ *once* tells us something about iterating Φ possibly *transfinitely often* (see (†) in Sect. 2).

Park induction, unfortunately, does *not* work in the reverse direction: If we are unlucky, f ⊒ lfp Φ *is* an upper bound on lfp Φ, but nevertheless Φ(f) ⋢ f. In this case, we say that f is *not inductive*. But how can we verify that f is indeed an upper bound in such a non-inductive scenario? We search *below* f for a *different, but inductive*, upper bound on lfp Φ, that is, we

search for an h ∈ E such that lfp Φ ⊑ Φ(h) ⊑ h ⊑ f.

In order to perform a *guided* search for such an h, we introduce the κ-induction operator—a modified version of Φ that is parameterized by our candidate f:

**Definition 1 (***κ***-Induction Operator).** *For* f ∈ E*, we call*

$$
\Psi\_f \colon \quad E \to E, \qquad g \mapsto \Phi(g) \sqcap f
$$

*the* κ-induction operator *(with respect to* f *and* Φ*).*

What does Ψ_f do? As illustrated in Fig. 1, if Φ(f) ⋢ f (i.e., f is non-inductive), then "*at least some part of* Φ(f) *is greater than* f". If the whole of Φ(f) is greater than f, then f ⊑ Φ(f); if only some part of Φ(f) is greater and some is smaller than f, then f and Φ(f) are incomparable. The κ-induction operator Ψ_f now *rectifies* Φ(f) being (partly) greater than f by *pulling* Φ(f) *down* via the meet with f (i.e., via ⊓ f), so that the result is in no part greater than f. Applying Ψ_f to f hence always yields something below or equal to f.
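A toy instance (our own, not from the paper) makes this pull-down concrete: on pairs of naturals ordered componentwise, the meet is the componentwise minimum, and Ψ_f repairs exactly the coordinate in which Φ(f) exceeds f:

```python
# Componentwise-ordered pairs; hypothetical monotonic operator Φ(a, b) = (b, 1).
def phi(v):
    a, b = v
    return (b, 1)

def leq(u, v):                       # the partial order: componentwise <=
    return u[0] <= v[0] and u[1] <= v[1]

def meet(u, v):                      # the meet: componentwise minimum
    return (min(u[0], v[0]), min(u[1], v[1]))

f = (1, 2)                           # candidate upper bound on lfp Φ = (1, 1)
assert not leq(phi(f), f)            # f is not Park-inductive: Φ(f) = (2, 1) exceeds f in one part
g = meet(phi(f), f)                  # Ψ_f(f) = (1, 1): Φ(f) pulled back down below f
assert leq(phi(g), f)                # Φ(Ψ_f(f)) ⊑ f, so f turns out to be 2-inductive
```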

Together with the observation that Ψ_f is monotonic, iterating Ψ_f on f necessarily *descends* from f downwards in the direction of lfp Φ (and never below):

**Lemma 1 (Properties of the** *κ***-Induction Operator).** *Let* f ∈ E *and let* Ψ_f *be the* κ*-induction operator with respect to* f *and* Φ*. Then*

*(a)* Ψ_f *is monotonic, i.e.,* ∀ g₁, g₂ ∈ E: g₁ ⊑ g₂ implies Ψ_f(g₁) ⊑ Ψ_f(g₂)*.*
*(b) Iterations of* Ψ_f *starting from* f *are descending, i.e., for all ordinals* γ*,* δ*,*

> γ < δ implies Ψ_f^⌊δ⌋(f) ⊑ Ψ_f^⌊γ⌋(f).


$$\mathrm{lfp}\,\Phi \;\sqsubseteq\; \dots \;\sqsubseteq\; \Psi_f^{\lfloor\delta\rfloor}(f) \;\sqsubseteq\; \dots \;\sqsubseteq\; \Psi_f^{\lfloor 2\rfloor}(f) \;\sqsubseteq\; \Psi_f(f) \;\sqsubseteq\; f.$$

The descending sequence f ⊒ Ψ_f(f) ⊒ Ψ_f^⌊2⌋(f) ⊒ ⋯ constitutes our guided search for an inductive upper bound on lfp Φ. For each ordinal κ (hence the short name: κ-induction), Ψ_f^⌊κ⌋(f) is a potential candidate for Park induction:

$$\Phi\left(\Psi_f^{\lfloor\kappa\rfloor}(f)\right) \;\overset{\text{potentially}}{\sqsubseteq}\; \Psi_f^{\lfloor\kappa\rfloor}(f). \tag{‡}$$

For efficiency reasons, e.g., when offloading the above inequality check to an SMT solver, we will not check the inequality (‡) directly but a property equivalent to (‡), namely whether Φ(Ψ_f^⌊κ⌋(f)) is below f instead of below Ψ_f^⌊κ⌋(f):

**Lemma 2 (Park Induction from** *κ***-Induction).** *Let* f ∈ E*. Then*

$$\Phi\left(\Psi_f^{\lfloor\kappa\rfloor}(f)\right) \;\sqsubseteq\; f \quad \text{iff} \quad \Phi\left(\Psi_f^{\lfloor\kappa\rfloor}(f)\right) \;\sqsubseteq\; \Psi_f^{\lfloor\kappa\rfloor}(f).$$

*Proof.* The if-direction is trivial, as Ψ_f^⌊κ⌋(f) ⊑ f (Lemma 1(d)). For only-if:

$$\begin{aligned}
\Psi_f^{\lfloor\kappa\rfloor}(f) \;&\sqsupseteq\; \Psi_f^{\lfloor\kappa+1\rfloor}(f) && \text{(by Lemma 1(b))}\\
&=\; \Psi_f\left(\Psi_f^{\lfloor\kappa\rfloor}(f)\right) && \text{(by definition of } \Psi_f^{\lfloor\kappa+1\rfloor}(f)\text{)}\\
&=\; \Phi\left(\Psi_f^{\lfloor\kappa\rfloor}(f)\right) \sqcap f && \text{(by definition of } \Psi_f\text{)}\\
&=\; \Phi\left(\Psi_f^{\lfloor\kappa\rfloor}(f)\right). && \text{(by the premise)}
\end{aligned}$$


If Φ(Ψ_f^⌊κ⌋(f)) ⊑ f, then Lemma 2 tells us that Ψ_f^⌊κ⌋(f) is Park inductive and thereby an upper bound on lfp Φ. Since iterating Ψ_f on f yields a descending iteration sequence (see Lemma 1(b)), Ψ_f^⌊κ⌋(f) is below f, and therefore f is also an upper bound on lfp Φ. Put in more traditional terms, we have shown that Ψ_f^⌊κ⌋(f) is an inductive invariant stronger than f. Formulated as a proof rule, we obtain the following induction principle:

**Theorem 2 (***κ***-Induction).** *Let* f ∈ E *and let* κ *be an ordinal. Then*

Φ(Ψ_f^⌊κ⌋(f)) ⊑ f implies lfp Φ ⊑ f.

*Proof.* Following the argument above; for details see [8, Appx. A.2]. □

An illustration of κ-induction is shown in (the right frame of) Fig. 1. For every ordinal κ, if Φ(Ψ_f^⌊κ⌋(f)) ⊑ f, then we call f (κ+1)*-inductive* (for Φ). In particular, κ-induction generalizes Park induction, in the sense that 1-induction *is* Park induction and (κ > 1)-induction is a *more general principle of induction*.

Algorithm 1 depicts a (semi-)algorithm that performs *latticed* k*-induction* (for k < ω) in order to prove lfp Φ ⊑ f by iteratively increasing k. For implementing this algorithm, we require, of course, that both Φ and Ψ_f are computable and that ⊑ is decidable. Notice that the loop (lines 2–3) never terminates if f ⊏ Φ(f), a condition that can easily be checked before entering the loop. Even with this optimization, however, Algorithm 1 is a *proper* semi-algorithm: even if lfp Φ ⊑ f, f is still not guaranteed to be k-inductive for any k < ω. And even if an algorithm *could* somehow perform transfinitely many iterations, f is still not guaranteed to be κ-inductive for any ordinal κ:
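Algorithm 1 itself is not reproduced in this excerpt, but its structure follows directly from the text: repeatedly apply Ψ_f and test the premise of Theorem 2. The sketch below is our own rendering, with an explicit iteration cap in place of true semi-decision; the toy operator on componentwise-ordered pairs is a made-up example:

```python
def k_induction(phi, meet, leq, f, max_k=100):
    # Sketch of the latticed k-induction loop: g ranges over f, Ψ_f(f), Ψ_f^2(f), ...
    # The cap max_k is our addition; the actual procedure is a semi-algorithm.
    g = f
    for _ in range(max_k):
        if leq(phi(g), f):           # premise of Theorem 2, checked against f (cf. Lemma 2)
            return True
        g = meet(phi(g), f)          # one more application of Ψ_f
    return False                     # inconclusive within max_k iterations

# Hypothetical monotonic operator on pairs ordered componentwise; lfp Φ = (1, 1).
phi = lambda v: (v[1], 1)
meet = lambda u, v: (min(u[0], v[0]), min(u[1], v[1]))
leq = lambda u, v: u[0] <= v[0] and u[1] <= v[1]

assert k_induction(phi, meet, leq, (1, 2))        # (1, 2) is 2-inductive
assert not k_induction(phi, meet, leq, (1, 0))    # (1, 0) is no upper bound on lfp Φ
```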

**Counterexample 1 (Incompleteness of** *κ***-Induction).** *Consider the carrier set* {0, 1, 2}*, partial order* 0 ⊑ 1 ⊑ 2*, and the monotonic operator* Φ *with* Φ(0) = 0 = lfp Φ*,* Φ(1) = 2*, and* Φ(2) = 2 = gfp Φ*. Then* lfp Φ ⊑ 1*, but for any ordinal* κ*,* Ψ₁^⌊κ⌋(1) = 1 *and* Φ(1) = 2 ⋢ 1*. Hence* 1 *is not* κ*-inductive.* ◁
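Counterexample 1 can be replayed mechanically. On the three-element chain 0 ⊑ 1 ⊑ 2 the meet is `min`, so the Ψ₁-iteration is stuck at 1 while Φ(1) = 2 never descends below 1:

```python
PHI = {0: 0, 1: 2, 2: 2}        # the monotonic operator from Counterexample 1
f = 1                           # candidate upper bound; indeed lfp Φ = 0 <= 1

g = f
for _ in range(10):             # finitely many steps suffice: the iteration is stationary
    g = min(PHI[g], f)          # Ψ_f(g) = Φ(g) ⊓ f, with meet = min on the chain
    assert g == 1               # Ψ_f^k(1) = 1 for every k

assert PHI[g] == 2 and not PHI[g] <= f   # the premise Φ(Ψ_f^k(f)) ⊑ f never holds
```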

Despite its incompleteness, we now provide a *sufficient* criterion which ensures that *every* upper bound on lfp Φ is κ-inductive for some ordinal κ.

**Theorem 3 (Completeness of** *κ***-Induction for Unique Fixed Point).** *If* lfp Φ = gfp Φ *(i.e.,* Φ *has* exactly one *fixed point), then, for every* f ∈ E*,*

lfp Φ ⊑ f implies f *is* κ*-inductive for some ordinal* κ.

*Proof.* By the Knaster-Tarski theorem, we have Φ^⌊β⌋(⊤) = gfp Φ for some ordinal β. We then show that f is (β+1)-inductive; see [8, Appx. A.3] for details. □

The proof of the above theorem immediately yields that, if the unique fixed point can be reached through *finite* fixed point iteration starting at ⊤, then f is k-inductive for some *natural* number k; Algorithm 1 thus eventually terminates.

**Corollary 1.** *If* Φ^⌊n⌋(⊤) = lfp Φ *for some* n ∈ ℕ*, then, for every* f ∈ E*,*

lfp Φ ⊑ f implies f *is* k*-inductive for some* k ∈ ℕ.

# **4 Latticed vs. Classical** *k***-Induction**

We show that our purely lattice-theoretic κ-induction from Sect. 3 generalizes classical k-induction for hardware and software verification. To this end, we first recap how k-induction is typically formalized in the literature [10,23,29,37]: Let TS = (S, I, T) be a transition system, where S is a (countable) set of *states*, I ⊆ S is a non-empty set of *initial states*, and T ⊆ S × S is a *transition relation*. As in the seminal work on k-induction [65], we require that T is a *total* relation, i.e., every state has at least one successor. This requirement is sometimes overlooked in the literature, which renders the classical SAT-based formulation of k-induction ((1a) and (1b) below) unsound in general.

Our goal is to verify that a given *invariant property* P ⊆ S covers all states reachable in TS from some initial state. Suppose that I, T, and P are characterized by logical formulae I(s), T(s, s′), and P(s) (over the free variables s and s′), respectively. Then, achieving the above goal with classical k-induction amounts to proving the validity of

$$I(s\_1) \land T(s\_1, s\_2) \land \dots \land T(s\_{k-1}, s\_k) \implies P(s\_1) \land \dots \land P(s\_k), \qquad \text{and} \qquad \text{(1a)}$$

$$P(s\_1) \land T(s\_1, s\_2) \land \dots \land P(s\_k) \land T(s\_k, s\_{k+1}) \implies P(s\_{k+1}).\tag{1b}$$

Here, the *base case* (1a) asserts that P holds for *all states reachable within* k *transition steps from some initial state*; the *induction step* (1b) formalizes that P is *closed under taking up to* k *transition steps*, i.e., if we start in P and stay in P for up to k steps, then we also end up in P after taking the (k+1)-st step. If both (1a) and (1b) are valid, then classical k-induction tells us that the property P holds for *all* reachable states of TS. How is the above principle reflected in *latticed* k-induction (cf. Sect. 3)? For that, we choose the complete lattice (2^S, ⊆), where 2^S denotes the powerset of S; the least element is ⊥ = ∅ and the meet operation is standard intersection ∩.

Moreover, we define a monotonic operator Φ whose least fixed point precisely characterizes the set of reachable states of the transition system TS:

$$\Phi \colon \quad 2^S \to 2^S, \qquad F \longmapsto I \cup \mathsf{Succs}(F).$$

That is, Φ maps any given set of states F ⊆ S to the union of the initial states I and of those states Succs(F) that are reachable from F using a single transition.<sup>3</sup>

Using the κ-induction operator Ψ_P constructed from Φ and P according to Definition 1, the principle of κ-induction (cf. Theorem 2) then tells us that

$$\Phi\left(\Psi_P^{\lfloor\kappa\rfloor}(P)\right) \subseteq P \qquad \text{implies} \quad \underbrace{\mathrm{lfp}\,\Phi}_{\text{reachable states of TS}} \subseteq P.$$

For our above choices, the premise of κ-induction equals the classical formalization of k-induction—formulae (1a) and (1b)—because the set of initial states I is "baked into" the operator Φ. More concretely, for the base case (1a), we have

$$\underbrace{\underbrace{\underbrace{I(s_1)}_{\Phi(\emptyset)} \wedge\, T(s_1, s_2)}_{\Phi^{2}(\emptyset)} \wedge \dots \wedge T(s_{k-1}, s_k)}_{\Phi^{k}(\emptyset)} \implies P(s_1) \wedge \dots \wedge P(s_k), \qquad \text{meaning} \quad \Phi^{k}(\emptyset) \subseteq P.$$

In other words, formula (1a) captures those states that are reachable from I via at most k transitions. If we assume that (1a) is valid, then P contains all initial states and formula (1b) coincides with the premise of κ-induction:

$$\underbrace{\underbrace{\underbrace{P(s_1) \wedge T(s_1, s_2)}_{\Psi_P(P)} \wedge\, P(s_2) \wedge T(s_2, s_3)}_{\Psi_P^{\lfloor 2\rfloor}(P)} \wedge \dots \wedge P(s_k) \wedge T(s_k, s_{k+1})}_{\Phi\left(\Psi_P^{\lfloor k-1\rfloor}(P)\right)} \implies P(s_{k+1}).$$

It follows that, when considering transition systems, our (latticed) κ-induction is equivalent to the classical notion of k-induction for κ<ω:

**Theorem 4.** *For every natural number* k ≥ 1*,*

$$\Phi\left(\Psi_P^{\lfloor k-1 \rfloor}(P)\right) \subseteq P \quad \text{iff} \quad \text{formulae (1a) and (1b) are valid}.$$
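The correspondence of Theorem 4 can be observed directly on a small, hypothetical total transition system, with sets standing in for formulae: the invariant P below is not 1-inductive (Park induction fails), but it is 2-inductive:

```python
I = {0}                                          # initial states
T = {0: {1}, 1: {0}, 2: {3}, 3: {2}}             # total transition relation
P = {0, 1, 2}                                    # candidate invariant; reachable = {0, 1}

def phi(F):
    # Φ(F) = I ∪ Succs(F) on the powerset lattice (2^S, ⊆).
    return I | {t for s in F for t in T[s]}

assert not phi(P) <= P              # Park induction fails: Φ(P) = {0, 1, 3} contains 3
G = phi(P) & P                      # Ψ_P(P) = Φ(P) ∩ P = {0, 1}
assert phi(G) <= P                  # premise Φ(Ψ_P(P)) ⊆ P holds: P is 2-inductive
```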

<sup>3</sup> Formally, Succs(F) ≜ { t′ | t ∈ F, (t, t′) ∈ T }.

**(a)** pGCL programs:

$$C \;::=\; \mathtt{skip} \;\mid\; x := e \;\mid\; C \,;\, C \;\mid\; \{\, C \,\}\,[\, p \,]\,\{\, C \,\} \;\mid\; \mathtt{if}\,(\,\varphi\,)\,\{\,C\,\}\;\mathtt{else}\;\{\,C\,\} \;\mid\; \mathtt{while}\,(\,\varphi\,)\,\{\,C\,\}$$

**(b)** Linear expressions (where e₁ ∸ e₂ denotes the monus max{0, e₁ − e₂}):

$$e \;::=\; n \;\mid\; x \;\mid\; n \cdot e \;\mid\; e + e \;\mid\; e \mathbin{\dot-} e$$

**(c)** Linear guards:

$$\varphi \;::=\; e < e \;\mid\; \varphi \wedge \varphi \;\mid\; \neg\varphi$$

**Fig. 2.** Syntax of pGCL programs, linear expressions, and guards, where x is a variable taken from a countable set Vars of program variables (evaluating to natural numbers), p ∈ [0, 1] ∩ ℚ is a rational probability, and n ∈ ℕ is a constant.

# **5 Latticed Bounded Model Checking**

We complement κ-induction with a latticed analog of bounded model checking [11,12] for *refuting* that lfp Φ ⊑ f. In lattice-theoretic terms, bounded model checking amounts to a *fixed point iteration* of Φ on ⊥ while continually checking whether the iteration exceeds our candidate upper bound f. If so, then we have indeed refuted lfp Φ ⊑ f:

**Theorem 5 (Soundness of Latticed BMC).** *Let* f ∈ E*. Then*

∃ ordinal δ: Φ^⌈δ⌉(⊥) ⋢ f implies lfp Φ ⋢ f.

Furthermore, if we were actually able to perform transfinite iterations of Φ on ⊥, then latticed bounded model checking would also be complete: If f is in fact *not* an upper bound on lfp Φ, this *will* be witnessed at some ordinal:

**Theorem 6 (Completeness of Latticed BMC).** *Let* f ∈ E*. Then*

lfp Φ ⋢ f implies ∃ ordinal δ: Φ^⌈δ⌉(⊥) ⋢ f.

Of more practical relevance: if Φ is continuous (which is the case for Bellman operators characterizing reachability probabilities in Markov chains), then a simple *finite* fixed point iteration, see Algorithm 2, is sound and complete for refutation:

**Corollary 2 (Latticed BMC for Continuous Operators).** *Let* f ∈ E *and let* Φ *be continuous. Then*

∃ n ∈ ℕ: Φ^n(⊥) ⋢ f iff lfp Φ ⋢ f.
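Algorithm 2 is likewise not reproduced in this excerpt, but for a continuous Φ it is just the finite iteration of Corollary 2. As a sketch (our own rendering, using the characteristic functional of the geometric loop from Example 2 in Sect. 7, with expectations represented as Python functions), the loop below refutes a candidate bound by iterating Φ on ⊥ = 0 and checking a witness state:

```python
def phi(h):
    # Characteristic functional of the geometric loop (cf. Example 2):
    # Φ(h)(x, c) = [x != 1]·c + [x = 1]·0.5·(h with x:=0  +  h with c:=c+1).
    return lambda x, c: c if x != 1 else 0.5 * (h(0, c) + h(1, c + 1))

f = lambda x, c: c + 0.99            # candidate upper bound to be refuted

h = lambda x, c: 0.0                 # start at bottom, the constant expectation 0
refuted_at = None
for n in range(1, 30):               # finite BMC loop with an arbitrary cutoff
    h = phi(h)                       # h = Φ^n(0)
    if h(1, 0) > f(1, 0):            # witness state x = 1, c = 0
        refuted_at = n
        break

assert refuted_at is not None        # BMC refutes the bound: wp of the loop is c + 1
```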

# **6 Probabilistic Programs**

In the remainder of this article, we employ latticed k-induction and BMC to verify imperative programs with access to discrete probabilistic choices—branching on the outcomes of coin flips. In this section, we briefly recap the necessary background on formal reasoning about probabilistic programs (cf. [44,49] for details).

#### **6.1 The Probabilistic Guarded Command Language**

*Syntax.* Programs in the *probabilistic guarded command language* pGCL adhere to the grammar in Fig. 2a. The semantics of most statements is standard. In particular, the *probabilistic choice* { C₁ } [ p ] { C₂ } flips a coin with bias p ∈ [0, 1] ∩ ℚ. If the coin yields heads, it executes C₁; otherwise, C₂. In addition to the syntax in Fig. 2, we admit standard expressions that are definable as syntactic sugar, e.g., true, false, ϕ₁ ∨ ϕ₂, e₁ = e₂, e₁ ≤ e₂, etc.

*Program States.* A *program state* σ maps every variable in Vars to its value, i.e., a natural number in ℕ.<sup>4</sup> To ensure that the set of program states Σ remains countable<sup>5</sup>, we restrict ourselves to states in which only finitely many variables (namely those that appear in a given program) evaluate to non-zero values. Formally,

$$\Sigma \;\triangleq\; \left\{\, \sigma\colon \mathsf{Vars} \to \mathbb{N} \;\middle|\; \left|\{x \in \mathsf{Vars} \mid \sigma(x) \neq 0\}\right| < \infty \,\right\}.$$

The evaluation of expressions e and guards ϕ under a state σ, denoted by e(σ) and ϕ(σ), is standard. For example, we define the evaluation of "monus" as

$$(e_1 \mathbin{\dot-} e_2)(\sigma) \;\triangleq\; \max\left\{0,\; e_1(\sigma) - e_2(\sigma)\right\}.$$

#### **6.2 Weakest Preexpectations**

*Expectations.* An *expectation* f: Σ → ℝ^∞_{≥0} is a map from program states to the non-negative reals extended by infinity. We denote by E the set of all expectations. Moreover, (E, ⊑) forms a complete lattice, where the partial order ⊑ is given by the pointwise application of the canonical ordering ≤ on ℝ^∞_{≥0}, i.e.,

$$f \;\sqsubseteq\; g \quad \text{iff} \quad \forall \sigma \in \Sigma\colon\; f(\sigma) \,\leq\, g(\sigma).$$

To conveniently describe expectations evaluating to some r ∈ ℝ^∞_{≥0} for every state, we slightly abuse notation and denote by r the constant expectation λσ. r. Similarly, given an arithmetic expression e, we denote by e the expectation λσ. e(σ).

<sup>4</sup> We prefer unsigned integers because our quantitative "specifications" (aka *expectations*) must evaluate to non-negative numbers. Otherwise, expectations like x+y are not well-defined, and, as a remedy, we would frequently have to take the absolute value of every program variable. Restricting ourselves to unsigned variables does not decrease expressive power as signed variables can be emulated (cf. [9, Sec. 11.2]).

<sup>5</sup> In order to avoid any technical issues pertaining to measurability.


**Table 1.** Rules defining the weakest preexpectation transformer.

| C | wp⟦C⟧(g) |
| --- | --- |
| skip | g |
| x := e | g[x/e] |
| C₁ ; C₂ | wp⟦C₁⟧(wp⟦C₂⟧(g)) |
| { C₁ } [ p ] { C₂ } | p · wp⟦C₁⟧(g) + (1 − p) · wp⟦C₂⟧(g) |
| if ( ϕ ) { C₁ } else { C₂ } | [ϕ] · wp⟦C₁⟧(g) + [¬ϕ] · wp⟦C₂⟧(g) |
| while ( ϕ ) { C′ } | lfp h. [¬ϕ] · g + [ϕ] · wp⟦C′⟧(h) |

The least element of (E, ⊑) is 0 and the greatest element is ∞. We employ the *Iverson bracket* notation to cast Boolean expressions into expectations, i.e.,

$$[\varphi] \;=\; \lambda\sigma\text{.}\;\begin{cases} 1 & \text{if } \varphi(\sigma) = \text{true},\\ 0 & \text{if } \varphi(\sigma) = \text{false}. \end{cases}$$

The *weakest preexpectation transformer* wp: pGCL → (E → E) is defined in Table 1, where g[x/e] denotes the substitution of variable x by expression e, i.e.,

$$g\left[x/e\right] \;\triangleq\; \lambda\sigma\text{.}\; g(\sigma\left[x \mapsto e(\sigma)\right]), \quad \text{where} \quad \sigma\left[x \mapsto e(\sigma)\right] \;\triangleq\; \lambda y\text{.}\;\begin{cases} e(\sigma) & \text{if } y = x,\\ \sigma(y) & \text{otherwise.} \end{cases}$$

We call wp⟦C⟧(g) the *weakest preexpectation* of program C w.r.t. postexpectation g. The weakest preexpectation wp⟦C⟧(g) is itself an expectation in E, which maps each initial state σ to the expected value of g after running C on σ. More formally, if μ^σ_C is the distribution over final states obtained by executing C on initial state σ, then for any postexpectation g [44],

$$\mathsf{wp}\llbracket C\rrbracket\,(g)\,(\sigma) \;=\; \sum_{\tau \in \Sigma} \mu_C^{\sigma}(\tau) \cdot g(\tau).$$

For a gentle introduction to weakest preexpectations, see [38, Chap. 2 and 4].
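The expected-value characterization can be checked by hand on a tiny loop-free program. The sketch below is our own encoding (states as dicts, expectations as functions); it computes wp⟦{x := 0} [0.5] {c := c + 1}⟧(c) by the substitution and convex-combination rules and compares the result against the explicit sum over the two possible final states:

```python
def wp_assign(var, expr, g):
    # wp of "var := e" w.r.t. postexpectation g is the substitution g[var/e].
    def pre(s):
        t = dict(s)
        t[var] = expr(s)
        return g(t)
    return pre

def wp_choice(wp1, wp2, p):
    # wp of a probabilistic choice { C1 } [ p ] { C2 } is the convex combination.
    return lambda s: p * wp1(s) + (1 - p) * wp2(s)

g = lambda s: s["c"]                                   # postexpectation c
wp_C = wp_choice(wp_assign("x", lambda s: 0, g),
                 wp_assign("c", lambda s: s["c"] + 1, g), 0.5)

s0 = {"x": 1, "c": 3}
# Distribution over final states: (x=0, c=3) w.p. 0.5 and (x=1, c=4) w.p. 0.5,
# so the expected value of c is 0.5*3 + 0.5*4 = 3.5.
assert wp_C(s0) == 0.5 * 3 + 0.5 * 4
```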

# **7 BMC and** *k***-Induction for Probabilistic Programs**

We now instantiate latticed κ-induction and BMC (as developed in Sects. <sup>2</sup> to 5) to enable verification of loops written in pGCL; we discuss practical aspects later in Sects. 7.1 to 7.3 and Sect. 8. For the next two sections, we fix a loop

$$C_{\mathrm{loop}} \;=\; \mathtt{while}\,(\,\varphi\,)\,\{\,C\,\}\,.$$

For simplicity, we assume that the loop body C is loop-free (every probabilistic program can be rewritten as a single while loop with loop-free body [62]).

Given an expectation g ∈ E and a candidate upper bound f ∈ E on the expected value of g after executing C_loop (i.e., on wp⟦C_loop⟧(g)), we will apply latticed verification techniques to check whether f indeed upper-bounds wp⟦C_loop⟧(g).

To this end, we denote by Φ the *characteristic functional* of C_loop and g, i.e.,

$$\Phi \colon \quad \mathbb{E} \to \mathbb{E}, \qquad h \mapsto [\neg\varphi] \cdot g + [\varphi] \cdot \mathsf{wp}\llbracket C\rrbracket\,(h),$$

whose least fixed point defines wp⟦C_loop⟧(g) (cf. Table 1). We remark that Φ is a monotonic (and in fact even continuous) operator over the complete lattice (E, ⊑) (cf. Sect. 6.2). In this lattice, the meet is a pointwise minimum, i.e.,

h ⊓ h′ = h min h′ ≜ λσ. min { h(σ), h′(σ) }.

By Definition 1, Φ and f then induce the (continuous) κ-induction operator

$$
\Psi\_f \colon \quad \mathbb{E} \to \mathbb{E}, \qquad h \mapsto \Phi(h) \text{ min } f.
$$

With this setup, we obtain the following proof rule for reasoning about probabilistic loops as an immediate consequence of Theorem 2:

**Corollary 3 (***k***-Induction for pGCL).** *For every natural number* k ∈ ℕ*,*

Φ(Ψ_f^⌊k⌋(f)) ⊑ f implies wp⟦C_loop⟧(g) ⊑ f.

Analogously, refuting that f upper-bounds the expected value of g after execution of C_loop via bounded model checking is an instance of Corollary 2:

#### **Corollary 4 (Bounded Model Checking for pGCL).**

$$\exists\, n \in \mathbb{N}\colon\quad \Phi^n(0) \;\not\sqsubseteq\; f \qquad \text{iff} \qquad \mathsf{wp}\llbracket C_{\mathrm{loop}}\rrbracket\,(g) \;\not\sqsubseteq\; f.$$

*Example 2 (Geometric Loop).* The pGCL program

$$C_{\mathrm{geo}} \;=\; \mathtt{while}\,(\,x = 1\,)\,\{\,\{\,x := 0\,\}\;[\,0.5\,]\;\{\,c := c + 1\,\}\,\}$$

keeps flipping a fair coin x until it flips heads, sets x to 0, and terminates. Whenever it flips tails instead, it increments the counter c and continues. We refer to <sup>C</sup>geo as the "geometric loop" because after its execution, the counter variable c is distributed according to a geometric distribution.

What is a (preferably small) upper bound on the expected value wp⟦C_geo⟧(c) of c after execution of C_geo? Using 2-induction, we can (automatically) verify that c + 1 is indeed an upper bound: Since Φ(Ψ_{c+1}(c + 1)) ⊑ c + 1, where Φ denotes the characteristic functional of C_geo, Corollary 3 yields wp⟦C_geo⟧(c) ⊑ c + 1.
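This 2-induction claim can be spot-checked numerically, representing expectations as functions on states (x, c). The check below replays the inequality on finitely many sample states only; it is a sanity check of ours, not the SMT-backed decision procedure of Sect. 7.2:

```python
def phi(h):
    # Characteristic functional of the geometric loop w.r.t. postexpectation c.
    return lambda x, c: c if x != 1 else 0.5 * (h(0, c) + h(1, c + 1))

f = lambda x, c: c + 1                                        # candidate upper bound

psi_f = lambda h: (lambda x, c: min(phi(h)(x, c), f(x, c)))   # Ψ_f = Φ(·) min f

states = [(x, c) for x in (0, 1) for c in range(100)]
g = psi_f(f)                                                  # Ψ_f(f)

# f is not Park-inductive: Φ(f)(1, c) = c + 1.5 exceeds f(1, c) = c + 1 ...
assert any(phi(f)(x, c) > f(x, c) for (x, c) in states)
# ... but the 2-induction premise Φ(Ψ_f(f)) ⊑ f holds on all sampled states.
assert all(phi(g)(x, c) <= f(x, c) for (x, c) in states)
```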

However, c + 1 *cannot* be proven an upper bound using Park induction as it is *not* inductive. Moreover, it is indeed the *least* upper bound, i.e., any smaller bound is refutable using BMC (cf. Corollary 4). For example, we have wp⟦C_geo⟧(c) ⋢ c + 0.99, since Φ^11(0) ⋢ c + 0.99. Finally, we remark that some correct upper bounds only become κ-inductive for *transfinite* ordinals κ. For instance, the innocuous-looking bound 2·c + 1 is not k-inductive for any natural number k, but it is (ω + 1)-inductive, since Φ(Ψ_{2·c+1}^⌊ω⌋(2·c + 1)) ⊑ 2·c + 1. ◁

In principle, we can semi-decide whether wp⟦C_loop⟧(g) ⋢ f holds or whether f is k-inductive for some k: it suffices to run Algorithms 1 and 2 in parallel. However, for these two algorithms to actually be semi-decision procedures, we cannot admit arbitrary expectations. Rather, we restrict ourselves to a suitable subset Exp of expectations in E satisfying all of the following requirements:

1. Exp is closed under computing the characteristic functional Φ, i.e.,

∀ h ∈ Exp: Φ(h) is computable and belongs to Exp.

2. Quantitative entailments between expectations in Exp are decidable, i.e.,

∀ h, h′ ∈ Exp: it is decidable whether h ⊑ h′.

3. (For k-induction) Exp is closed under computing meets, i.e.,

∀ h, h′ ∈ Exp: h min h′ is computable and belongs to Exp.

Below, we show that *linear expectations* meet all of the above requirements.

#### **7.1 Linear Expectations**

Recall from Fig. 2b that we assume all expressions appearing in pGCL programs to be linear. For our fragment of syntactic expectations, we consider *extended* linear expressions ẽ that (1) are defined over *rationals* instead of natural numbers and (2) admit ∞ as a constant (but not as a *sub*expression). Formally, the set of extended linear expressions is given by the following grammar:

$$\tilde e \;::=\; e \mid \infty \qquad\qquad e \;::=\; r \mid x \mid r \cdot e \mid e + e \mid e \mathbin{\dot-} e \qquad\qquad (r \in \mathbb{Q}_{\geq 0})$$

Similarly, we admit extended linear expressions (without ∞) in linear guards ϕ.<sup>6</sup> With these adjustments to expressions and guards in mind, the set LinExp of *linear expectations* is defined by the grammar

$$h \quad ::= \quad \tilde{e} \quad \mid \quad [\varphi] \cdot h \quad \mid \quad h + h.$$

We write h = h′ if h and h′ are *syntactically identical*, and h ≡ h′ if they are *semantically equivalent*, i.e., if for all states σ, we have h(σ) = h′(σ).

Furthermore, the *rescaling* c·h of a linear expectation h by a constant c ∈ ℚ≥0 is syntactic sugar for rescaling suitable<sup>7</sup> arithmetic subexpressions of h, e.g.,

$$\tfrac{1}{2} \cdot \left( [x = 1] \cdot 4 + \tfrac{1}{3} \cdot x + \infty \right) \;\equiv\; [x = 1] \cdot \tfrac{1}{2} \cdot 4 \;+\; \tfrac{1}{2} \cdot \tfrac{1}{3} \cdot x \;+\; \infty \;\in\; \mathsf{LinExp}.$$

A formal definition of the rescaling c · h is found in [8, Appx A.5].

<sup>6</sup> We do not admit <sup>∞</sup> in guards for convenience. In principle, all comparisons with <sup>∞</sup> in guards can be removed by a simple preprocessing step.

<sup>7</sup> We do not rescale every subexpression to account for the corner cases c · ∞ <sup>=</sup> <sup>∞</sup> and 0 · ∞ = 0.

If we choose a linear expectation h as a postexpectation, then a quick inspection of Table 1 reveals that the weakest preexpectation wp⟦C⟧(h) of any *loop-free* pGCL program C and h yields a linear expectation again. Hence, linear expectations are closed under applying Φ (Requirement 1 above) because

$$\forall g, h \in \mathsf{LinExp}\colon\quad \Phi(h) \;=\; \underbrace{\underbrace{[\neg\varphi] \cdot g}_{\in\,\mathsf{LinExp}} \;+\; \underbrace{[\varphi] \cdot \underbrace{\mathsf{wp}\llbracket C\rrbracket\,(h)}_{\in\,\mathsf{LinExp}}}_{\in\,\mathsf{LinExp}}}_{\in\,\mathsf{LinExp}}.$$

#### **7.2 Deciding Quantitative Entailments Between Linear Expectations**

To prove that linear expectations meet Requirement 2 (decidability of quantitative entailments), we effectively reduce the question of whether an entailment h ⊑ h′ holds to the decidable satisfiability problem for QF_LIRA, quantifier-free mixed linear integer and real arithmetic (cf. [42]).

As a first step, we show that every linear expectation can be represented as a sum of mutually exclusive extended arithmetic expressions—a representation we refer to as the *guarded normal form* (similar to [41, Lem. 1], [9, Lem. A.2]).

**Definition 2 (Guarded Normal Form (GNF)).** h ∈ LinExp *is in GNF if*

$$h = \sum\_{i=1}^{n} \left[ \varphi\_i \right] \cdot \tilde{e}\_i,$$

*where* ẽ₁, ..., ẽₙ *are extended linear expressions,* n ∈ ℕ *is some natural number, and* ϕ₁, ..., ϕₙ *are linear Boolean expressions that partition the set of states, i.e., for each* σ ∈ Σ *there exists* exactly one i ∈ {1, ..., n} *such that* ϕᵢ(σ) = true*.*

**Lemma 3.** *Every linear expectation* h <sup>∈</sup> LinExp *can effectively be transformed into an equivalent linear expectation* GNF (h) <sup>≡</sup> h *in guarded normal form.*

The number of summands <sup>|</sup>GNF (h)<sup>|</sup> in GNF (h) is, in general, exponential in the number of summands in h. In practice, however, this exponential blow-up can often be mitigated by pruning summands with unsatisfiable guards. Throughout the remainder of this paper, we denote the components of GNF (h) and GNF (h ), where h and h are arbitrary linear expectations, as follows:

$$\mathsf{GNF}(h) \;=\; \sum_{i=1}^{n} [\varphi_i] \cdot \tilde e_i \quad \text{and} \quad \mathsf{GNF}(h') \;=\; \sum_{j=1}^{m} [\psi_j] \cdot \tilde a_j.$$

We now present a decision procedure for the *quantitative entailment* over LinExp.

**Theorem 7 (Decidability of Quantitative Entailment over LinExp).** *For* h, h′ ∈ LinExp*, it is decidable whether* h ⊑ h′ *holds.*

*Proof.* Let h, h′ ∈ LinExp. By Lemma 3, we have h ⊑ h′ iff GNF(h) ⊑ GNF(h′).

Let σ be some state. By definition of the GNF, σ satisfies exactly one guard ϕᵢ and exactly one guard ψⱼ. Hence, the inequality GNF(h)(σ) ≤ GNF(h′)(σ) does *not* hold iff ẽᵢ(σ) > ãⱼ(σ) holds for the expressions ẽᵢ and ãⱼ guarded by ϕᵢ and ψⱼ, respectively. Based on this observation, we construct a QF_LIRA formula cex⊑(h, h′) that is *unsatisfiable* iff there is no counterexample to h ⊑ h′:

$$\mathsf{cex}\_{\preceq}(h, h') \;\triangleq\; \bigvee\_{i=1}^{n} \; \bigvee\_{j=1,\ \tilde{a}\_{j} \neq \infty}^{m} \left( \varphi\_{i} \wedge \psi\_{j} \wedge \mathsf{encodeInfty}(\tilde{e}\_{i}) > \tilde{a}\_{j} \right).$$

Here, we identify every program variable in $h$ or $h'$ with an $\mathbb{N}$-valued SMT variable. Moreover, to account for comparisons with $\infty$, we rely on the fact that our (extended) arithmetic expressions either evaluate to $\infty$ for *every* state or *never* evaluate to $\infty$. To deal with the case $\tilde{e}_i > \infty$, which is always false, we can thus safely exclude the cases in which $\tilde{a}_j = \infty$ holds. To deal with the case $\infty > \tilde{a}_j$, we represent $\infty$ by some unbounded number, i.e., we introduce a fresh, unconstrained $\mathbb{N}$-valued SMT variable $\mathsf{infty}$ and set $\mathsf{encodeInfty}(\tilde{e})$ to $\mathsf{infty}$ if $\tilde{e} = \infty$; otherwise, $\mathsf{encodeInfty}(\tilde{e}) = \tilde{e}$. Since QF_LIRA is decidable (cf. [42]), we conclude that the quantitative entailment problem is decidable. □
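The counterexample search behind $\mathsf{cex}_\preceq$ can also be illustrated without an SMT solver. The following pure-Python sketch is our own illustration, not kipro2's encoding: it represents a GNF expectation as a list of (guard, expression) pairs and brute-forces a finite fragment of the state space in place of the QF_LIRA query; all names are ours.

```python
import math

# A GNF expectation: a list of (guard, expr) pairs whose guards partition
# the states; 'expr' may evaluate to math.inf (then it does so everywhere).
def find_entailment_cex(gnf_h, gnf_h2, states):
    """Brute-force analogue of cex_<=(h, h'): search finitely many states
    for one where the active summand of h exceeds the active summand of h'
    (pairs with a_j = inf are skipped, since e_i > inf is always false)."""
    for sigma in states:
        # Exactly one guard of each GNF holds in sigma (Lemma 3).
        e = next(ex for g, ex in gnf_h if g(sigma))
        a = next(ex for g, ex in gnf_h2 if g(sigma))
        if a(sigma) != math.inf and e(sigma) > a(sigma):
            return sigma               # counterexample to h <= h'
    return None                        # no counterexample in this fragment

# h = [x <= 2] * (x + 1) + [x > 2] * 0   versus   h' = [true] * x
h  = [(lambda s: s["x"] <= 2, lambda s: s["x"] + 1),
      (lambda s: s["x"] > 2, lambda s: 0)]
h2 = [(lambda s: True, lambda s: s["x"])]
states = [{"x": n} for n in range(10)]
cex = find_entailment_cex(h, h2, states)   # x + 1 > x whenever x <= 2
```

A failed search over a finite fragment is of course no proof of $h \preceq h'$; the SMT encoding in the proof above covers all of $\Sigma$ at once.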

Since quantitative entailments are decidable, we can already conclude that, for linear expectations, Algorithm 2 is a semi-decision procedure.

#### **7.3 Computing Minima of Linear Expectations**

To ensure that latticed k-induction on pGCL programs (cf. Algorithm 1 and Sect. 7) is a semi-decision procedure when considering linear expectations, we have to consider Requirement 3, the expressibility and computability of meets:

**Theorem 8.** LinExp *is effectively closed under taking minima.*

*Proof.* For $k \in \mathbb{N}$, let $\mathbf{k} \triangleq \{1, \ldots, k\}$. Then, for two linear expectations $h, h'$, the pointwise minimum $\mathsf{GNF}(h) \sqcap \mathsf{GNF}(h') \in \mathsf{LinExp}$ is given by:

$$\sum\_{(i,j)\in\mathbf{n}\times\mathbf{m}} \begin{cases} [\varphi\_{i}\wedge\psi\_{j}]\cdot\tilde{a}\_{j} & \text{if }\tilde{e}\_{i}=\infty,\\ [\varphi\_{i}\wedge\psi\_{j}]\cdot\tilde{e}\_{i} & \text{if }\tilde{a}\_{j}=\infty,\\ [\varphi\_{i}\wedge\psi\_{j}\wedge\tilde{e}\_{i}\leq\tilde{a}\_{j}]\cdot\tilde{e}\_{i}+[\varphi\_{i}\wedge\psi\_{j}\wedge\tilde{e}\_{i}>\tilde{a}\_{j}]\cdot\tilde{a}\_{j} & \text{otherwise}, \end{cases}$$

where we exploit that, for every state, exactly one guard $\varphi_i$ and exactly one guard $\psi_j$ is satisfied (cf. Lemma 3). Notice that in the last case we indeed obtain a linear expectation since neither $\tilde{e}_i$ nor $\tilde{a}_j$ equals $\infty$. □
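The case split of Theorem 8 can be executed directly on a guard-and-lambda representation of GNF expectations. The sketch below is our own illustration under that representation (guards as predicates, summand expressions as either `math.inf` or a callable, mirroring the dichotomy for extended expressions); it is not taken from kipro2.

```python
import math

def gnf_min(gnf_h, gnf_h2):
    """Pointwise minimum in guarded normal form, via Theorem 8's case
    split over every pair of guards (phi_i, psi_j)."""
    result = []
    for phi, e in gnf_h:
        for psi, a in gnf_h2:
            conj = lambda s, p=phi, q=psi: p(s) and q(s)
            if e is math.inf:                 # min(inf, a_j) = a_j
                result.append((conj, a))
            elif a is math.inf:               # min(e_i, inf) = e_i
                result.append((conj, e))
            else:                             # refine guard by comparison
                le = lambda s, c=conj, e=e, a=a: c(s) and e(s) <= a(s)
                gt = lambda s, c=conj, e=e, a=a: c(s) and e(s) > a(s)
                result.append((le, e))
                result.append((gt, a))
    return result

def evaluate(gnf, s):
    """Value of a GNF expectation in state s (its guards partition states)."""
    for g, ex in gnf:
        if g(s):
            return ex if ex is math.inf else ex(s)
    raise ValueError("guards do not cover this state")

# h = [x = 1] * inf + [x != 1] * x   versus   h' = [true] * 5
h  = [(lambda s: s["x"] == 1, math.inf),
      (lambda s: s["x"] != 1, lambda s: s["x"])]
h2 = [(lambda s: True, lambda s: 5)]
m = gnf_min(h, h2)
```

Note that the guards of `m` again partition the state space: each state satisfies exactly one $\varphi_i \wedge \psi_j$, and the last case splits that conjunction into two disjoint refinements.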

# **8 Implementation**

We have implemented a prototype called kipro2 (k-Induction for PRObabilistic PROgrams) in Python 3.7 using the SMT solver Z3 [54] and the solver API PySMT [25]. Our tool, its source code, and our experiments are available online.<sup>8</sup>

<sup>8</sup> https://github.com/moves-rwth/kipro2.

kipro2 runs latticed k-induction and BMC in parallel to fully automatically verify upper bounds on expected values of pGCL programs as described in Sect. 7. In addition to reasoning about expected values, kipro2 supports verifying bounds on *expected runtimes* of pGCL programs, which are characterized as least fixed points à la [40]. Rather than fixing a specific runtime model, we took inspiration from [56] and added a statement tick(n) that does not affect the program state but consumes $n \in \mathbb{N}$ time units.

To discharge quantitative entailments and compute the meet, we use the constructions in Theorems 7 and 8, respectively. As an additional optimization, we do not iteratively apply the k-induction operator $\Psi_f$ directly but use an *incremental encoding*. We briefly sketch our encoding for k-induction (Algorithm 2); the encoding for BMC is similar. In both cases, we employ uninterpreted functions on top of mixed integer and real arithmetic, i.e., QF_UFLIRA.

Recall Example 2, the geometric loop $C_{\mathit{geo}}$, where we used k-induction to prove $\mathsf{wp}\llbracket C_{\mathit{geo}}\rrbracket(c) \preceq c + 1$. For every $k \in \mathbb{N}$, $\Phi(\Psi^{k}_{c+1}(c+1))$ is given by

$$[x = 1] \cdot \Big( 0.5 \cdot \underbrace{\Psi^{k}\_{c+1}(c+1)}\_{Q\_k} [x/0] \;+\; 0.5 \cdot \underbrace{\Psi^{k}\_{c+1}(c+1)}\_{Q\_k} [c/c + 1] \Big) \;+\; [x \neq 1] \cdot c$$
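Before turning to the SMT encoding, the fixpoint iteration itself can be replayed numerically. The following sketch is our own illustration (expectations as plain Python functions over the two program variables); kipro2 instead builds the symbolic encoding described next. Checking the inequality on finitely many states can refute k-inductivity; passing it on all sampled states is evidence, not a proof, of k-inductivity.

```python
# Characteristic function of the geometric loop C_geo from Example 2
# (while (x = 1) { {x := 0} [0.5] {c := c + 1} }) with postexpectation c:
# Phi(X) = [x = 1] * (0.5 * X[x/0] + 0.5 * X[c/c+1]) + [x != 1] * c.
def Phi(X):
    return lambda c, x: (0.5 * X(c, 0) + 0.5 * X(c + 1, x)) if x == 1 else c

def Psi(f, X):
    """k-induction operator Psi_f(X) = f meet Phi(X) (pointwise minimum)."""
    phiX = Phi(X)
    return lambda c, x: min(f(c, x), phiX(c, x))

def looks_k_inductive(f, k, samples):
    """Check Phi(Psi_f^{k-1}(f)) <= f on finitely many sampled states."""
    X = f
    for _ in range(k - 1):
        X = Psi(f, X)
    phiX = Phi(X)
    return all(phiX(c, x) <= f(c, x) for c, x in samples)

f = lambda c, x: c + 1                       # candidate upper bound c + 1
samples = [(c, x) for c in range(20) for x in (0, 1, 2)]
```

On these samples, $c+1$ fails the 1-inductivity check (at $x = 1$ one obtains $c + 1.5 \not\le c + 1$) but passes the 2-inductivity check, matching Example 2.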

To obtain an incremental encoding, we introduce an uninterpreted function $P_k \colon \mathbb{N} \times \mathbb{N} \to \mathbb{R}_{\ge 0}$ and a formula $\rho_k(c, x)$ specifying that $P_k(c, x)$ characterizes $\Phi(\Psi^{k}_{c+1}(c+1))$, i.e., for all $\sigma \in \Sigma$ and $r \in \mathbb{R}_{\ge 0}$ with $\Phi(\Psi^{k}_{c+1}(c+1))(\sigma) < \infty$,<sup>9</sup>

$$\rho\_k(\sigma(c), \sigma(x)) \land P\_k(\sigma(c), \sigma(x)) = r \;\text{ is satisfiable} \quad \text{iff} \quad r = \Phi(\Psi\_{c+1}^{k}(c+1))(\sigma).$$

If $\Phi(\Psi^{k}_{c+1}(c+1))(\sigma) = \infty$, our construction of $\rho_k(c, x)$ ensures that the above conjunction is satisfiable for arbitrarily large $r$. Analogously, we introduce an uninterpreted function $Q_k \colon \mathbb{N} \times \mathbb{N} \to \mathbb{R}_{\ge 0}$ that characterizes $\Psi^{k}_{c+1}(c+1)$.

In particular, $\rho_k(c, x)$ may use all uninterpreted functions introduced for smaller or equal values of $k$, not just the function $P_k(c, x)$ it needs to characterize. This enables an incremental encoding, i.e., $\rho_k(c, x)$ can be computed on top of $\rho_{k-1}(c, x)$ by reusing $P_{k-1}(c, x)$, $Q_k(c, x)$, and the construction in Theorem 8.

Moreover, we can reuse $\rho_k(c, x)$ to avoid computing the (expensive) GNF for deciding certain quantitative entailments (cf. Theorem 7): for example, to check whether $\Phi(\Psi^{k}_{c+1}(c+1)) \preceq h'$ holds, we only need to transform the right-hand side into GNF (cf. Sect. 7.2), i.e., if $\mathsf{GNF}(h') = \sum_{j=1}^{m} [\psi_j] \cdot \tilde{a}_j$, then

$$\Phi\left(\Psi\_{c+1}^{k}(c+1)\right) \not\preceq h' \quad \text{iff} \quad \rho\_k \land \bigvee\_{j=1,\ \tilde{a}\_j \ne \infty}^{m} \left(\psi\_j \land P\_k(c, x) > \tilde{a}\_j\right) \;\text{ is satisfiable.}$$

<sup>9</sup> Notice that we do *not* axiomatize in $\rho_k(c, x)$ that $\Phi(\Psi^{k}_{c+1}(c+1))$ and $P_k(c, x)$ are the same function because we have no access to universal quantifiers. Rather, we specify that both functions coincide for any fixed concrete values assigned to $c$ and $x$. This weaker notion is *not* robust against formal modifications of the parameters, e.g., through substitution. For example, to assign the correct interpretation to $P_k(c, x)[c/c + 1]$, we have to construct a (second) formula $\rho_k(c, x)[c/c + 1]$.

# **9 Experiments**

We evaluate kipro2 on two sets of benchmarks. The first set, shown in Table 2, consists of four (infinite-state) probabilistic systems compiled from the literature; each benchmark is evaluated on multiple variants of candidate upper bounds.


Our second set of benchmarks, shown in Table 3, confirms the correctness of (1-inductive) bounds on the expected runtime of pGCL programs synthesized by the runtime analyzers Absynth [56] and (later) KoAT [52]; this gives a baseline for evaluating the performance of our implementation. Moreover, it demonstrates the flexibility of our approach as we effortlessly apply the expected runtime calculus [40] instead of the weakest preexpectation calculus for verification.

*Setup.* We ran Algorithms 1 and 2 in parallel on an AMD Ryzen 5 3600X processor with a shared memory limit of 8 GB and a 15-minute timeout. For every benchmark finishing within the time limit, kipro2 either finds the smallest k required to prove the candidate bound by k-induction or the smallest unrolling depth k required to refute it. If kipro2 refutes, the SMT solver provides a concrete initial state witnessing the violation. In Tables 2 and 3, column #formulae gives the maximal number of conjuncts on the solver stack; columns formulae t, sat t, and total t give the amount of time spent on (1) computing formulae, (2) satisfiability checking, and (3) everything (including preprocessing), respectively. The input consists of a program, a candidate upper bound, and a postexpectation; in Table 3, the latter is fixed to the "postruntime" 0 and thus omitted.


**Table 2.** Empirical results for the first benchmark set (time in seconds).

*Evaluation of Benchmark Set 1.* Table 2 empirically underlines that probabilistic program verification can benefit from k-induction to the same extent as classical software verification: kipro2 *fully automatically* verifies relevant properties of *infinite-state* randomized algorithms and stochastic processes from the literature that require k *to be strictly larger than* 1. That is, proving these properties using (1-)inductive invariants requires either non-trivial invariant synthesis or additional user annotations. This indicates that k-induction mitigates the need for complicated specifications in probabilistic program verification (cf. [40]).

We observe that k-induction tends to succeed if *some* variable is bounded in the candidate upper bound under consideration (cf. brp, rabin, unif gen). However, k-induction can also succeed without any bounds (cf. geo). The time and formulae required for checking k-inductivity increases rapidly for larger k; this is particularly striking for rabin and unif gen. When refuting candidate bounds with BMC, we obtain a similar picture. Both the time and formulae required for refutation increase if the candidate bound increases (cf. brp, geo, rabin).

For both k-induction and BMC, we observe a direct correlation between the complexity of the loop, i.e., the number of possible traces through the loop from some fixed initial state after some bounded number of iterations, and the required time and space (number of formulae). Whereas for geo and brp—which exhibit a rather simple structure—these checks tend to be fast, this is not the case for rabin and unif gen, which have more complex loop bodies. For such complex loops, k-induction and BMC quickly become infeasible as k increases.


**Table 3.** Empirical results for (a subset of) the ERTs [56] (time in *milliseconds*).

*Evaluation of Benchmark Set 2.* From Table 3, we observe that, in almost every case, verification is instantaneous and requires very few formulae. The programs we verify are equivalent to the programs provided in [56] up to interpreting minus as *monus* and using $\mathbb{N}$-typed (instead of $\mathbb{Z}$-typed) variables. A manual inspection reveals that this matters for C4B t303 and rdwalk, which is the reason why the runtime bound for C4B t303 is 3-inductive rather than 1-inductive.

There are two timeouts (2drwalk, bayesian network) due to the GNF construction from Lemma 3, which exhibits a runtime exponential in the number of possible execution branches through the loop body. We conjecture that further preprocessing (by pruning infeasible branches upfront) can mitigate this, rendering 2drwalk and bayesian network tractable as well. We consider a thorough investigation of suitable preprocessing strategies for GNF construction, which is outside the scope of this paper, a worthwhile direction for future research.

# **10 Conclusion**

We presented latticed k-induction, a generalization of classical k-induction to arbitrary complete lattices, and, together with a complementary bounded model checking approach, obtained a fully automated technique for verifying infinite-state probabilistic programs. Experiments showed that this technique can automatically prove nontrivial properties that existing techniques cannot prove, at least not without synthesizing a stronger inductive invariant. If a given candidate bound is k-inductive for some k, then our prototypical tool will find that k for linear programs and linear expectations. In theory, our tool is also applicable to non-linear programs, at the expense of an undecidable quantitative entailment problem. It is left for future work to support (positive) real-valued program variables and non-linear expectations.

**Acknowledgements.** B. L. Kaminski thanks Larry Fischer for his linguistic advice.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Stochastic Systems**

# **Runtime Monitors for Markov Decision Processes**

Sebastian Junges(B) , Hazem Torfah , and Sanjit A. Seshia

University of California at Berkeley, Berkeley, USA sjunges@berkeley.edu

**Abstract.** We investigate the problem of monitoring partially observable systems with nondeterministic and probabilistic dynamics. In such systems, every state may be associated with a risk, e.g., the probability of an imminent crash. During runtime, we obtain partial information about the system state in the form of observations. The monitor uses this information to estimate the risk of the (unobservable) current system state. Our results are threefold. First, we show that extensions of state estimation approaches do not scale due to the combination of nondeterminism and probabilities. While exploiting a geometric interpretation of the state estimates improves the practical runtime, this cannot prevent an exponential memory blowup. Second, we present a tractable algorithm based on model checking conditional reachability probabilities. Third, we provide prototypical implementations and demonstrate the applicability of our algorithms on a range of benchmarks. The results highlight the possibilities and boundaries of our novel algorithms.

# **1 Introduction**

Runtime assurance is essential in the deployment of safety-critical (cyberphysical) systems [12,29,45,49,50]. Monitors observe system behavior and indicate when the system is at risk of violating its specifications. A critical aspect in developing reliable monitors is their ability to handle noisy or missing data. In cyber-physical systems, monitors observe the system state via sensors, i.e., sensors are an interface between the system and the monitor. A monitor has to base its decision solely on the obtained sensor output. These sensors are not perfect, and not every aspect of a system state can be measured.

This paper considers a model-based approach to the construction of monitors for systems with imprecise sensors. Consider Fig. 1(b). We assume a model for the environment together with the controller. Typically, such a model contains both nondeterministic and probabilistic behavior, and thus describes a Markov decision process (MDP): in particular, the sensor is a stochastic process [56] that translates the environment state into an observation. For example, this could be a perception module on a plane that, during landing, estimates the movements of an on-ground vehicle, as depicted in Fig. 1(a). Due to the lack of precise data, the vehicle movements themselves may be most accurately described using nondeterminism.

This work is partially supported by NSF grants 1545126 (VeHICaL), 1646208 and 1837132, by the DARPA contracts FA8750-18-C-0101 (AA) and FA8750-20-C-0156 (SDCPS), by Berkeley Deep Drive, and by Toyota under the iCyPhy center.

© The Author(s) 2021

A. Silva and K. R. M. Leino (Eds.): CAV 2021, LNCS 12760, pp. 553–576, 2021. https://doi.org/10.1007/978-3-030-81688-9\_26

We are interested in the *state risk* associated with the current system state. The state risk may encode, e.g., the probability that the plane will collide with the vehicle within a given number of steps, or the expected time until reaching the other side of the runway. The challenge is that the monitor cannot directly observe the current system state. Instead, the monitor must infer the current state risk from a trace of observations. This cannot be done perfectly, as the system state cannot be inferred precisely. Rather, we want a sound, conservative estimate of the system state. More concretely, for a fixed resolution of the nondeterminism, the *trace risk* is the weighted sum, over all states, of the probability of being in a state having observed the trace, times the risk imposed by that state. The monitoring problem is to decide whether, for any possible scheduler resolving the nondeterminism, the trace risk of a given trace exceeds a threshold.

Monitoring of systems that contain either only probabilistic or only nondeterministic behavior is typically based on *filtering*. Intuitively, the monitor then estimates the current system states based on the model. For purely nondeterministic systems (without probabilities) a set of states needs to be tracked, and purely probabilistic systems (without nondeterminism) require tracking a distribution over states. This tracking is rather efficient. For systems that contain both probabilistic and nondeterministic behavior, filtering is more challenging. In particular, we show that filtering on MDPs results in an exponential memory blowup as the monitor must track sets of distributions. We show that a reduction based on the geometric interpretation of these distributions is essential for practical performance, but cannot avoid the worst-case exponential blowup. As a tractable alternative to filtering, we rephrase the monitoring problem as the computation of conditional reachability probabilities [9]. More precisely, we unroll and transform the given MDP, and then model check this MDP. This alternative approach yields a polynomial-time algorithm. Indeed, our experiments show the feasibility of computing the risk by computing conditional probabilities. We also show benchmarks on which filtering is a competitive option.

**Contribution and Outline.** This paper presents the first runtime monitoring approach for systems that can be adequately abstracted by a combination of *probabilities and nondeterminism* and where the system state is *partially observable*. We describe the use case, show that typical filtering approaches in general fail to deal with this setting, and show that a tractable alternative solution exists. In Sect. 3, we investigate *forward filtering*, used to estimate the possible system states in partially observable settings. We show that this approach is tractable for systems that have probabilistic *or* nondeterministic uncertainty, but not for systems that have both. To alleviate the blowup, Sect. 4 discusses an (often) efficacious pruning strategy and its limitations. In Sect. 5 we consider model checking as a more tractable alternative. This result utilizes constructions from the analysis of *partially observable MDPs* and model checking of MDPs with *conditional properties*. In Sect. 6 we present baseline implementations of these algorithms, on top of the open-source model checker Storm, and evaluate their performance. The results show that the implementation allows for monitoring a variety of MDPs, and reveal both strengths and weaknesses of both algorithms. We start with a motivating example and review related work at the end of the paper.

**Fig. 1.** A probabilistic world and sensor model represented by two MDPs for the scenario of an airplane in landing approach with on-ground vehicle movements.

**Motivating Example.** Consider a scenario where an autonomous airplane is in its final approach, i.e., lined up with a designated runway and descending for landing, see Fig. 1(a). On the ground, close to the runway, maintenance vehicles may cross the runway. The airplane tracks the movements of these vehicles and has to decide, depending on those movements, whether to abort the landing. To simplify matters, assume that the airplane (P) is tracking the movement of one vehicle (V) that is about to cross the runway. Let us further assume that P tracks V using a perception module that can only determine the position of the vehicle with a certain accuracy [33], i.e., for every position of V, the perception module reports a noisy variant of that position. Importantly, however, the plane obtains a sequence of these measurements.

Figure 1 illustrates the dynamics of the scenario. The world model describing the movements of V and P is given in Fig. 1(c), where $D_2$, $D_1$, and $D_0$ define how close P is to the runway, and R, M, and L define the position of V. Depending on what information V perceives about P, given by the atomic proposition {(p)*rogress*}, and what commands it receives ({(w)*ait*}), it may or may not cross the runway. The perception module receives the information about the state of the world and reports with a certain accuracy (given as a probability) the position of V. The (simple) model of the perception module is given in Fig. 1(d). For example, if P is in zone $D_2$ and V is in R, then there is a high chance that the perception module reports that V is on the runway. The probability of incorrectly detecting V's position decreases significantly when P is in $D_0$.

A monitor responsible for making the decision to land or to perform a go-around based on the information computed by the perception module must take into consideration the accuracy of this information. For example, if the sequence of sensor readings passed to the monitor is $\tau = R_o \cdot R_o \cdot M_o$, and each state is mapped to a certain risk, then how risky is it to land after seeing $\tau$? If, for instance, the world is with high probability in state $\langle M, D_0 \rangle$, a very risky state, then the plane should go around. In this paper, we address the question of computing the risk based on such an observation sequence. We will use this scenario as our running example.

# **2 Monitoring with Imprecise Sensors**

In this section, we formalize the problem of monitoring with imprecise sensors when both the world and the sensor models are given by MDPs. We start with a recap of MDPs, define the monitoring problem for MDPs, and finally show how the dynamics of the system under inspection can be modeled by an MDP arising as the composition of two MDPs: the sensor model and the world model of the system.

# **2.1 Markov Decision Processes**

For a countable set $X$, let $\mathsf{Distr}(X) \subset (X \to [0, 1])$ define the set of all distributions over $X$, i.e., for $d \in \mathsf{Distr}(X)$ it holds that $\sum_{x \in X} d(x) = 1$. For $d \in \mathsf{Distr}(X)$, let the *support* of $d$ be defined by $\mathsf{supp}(d) := \{x \mid d(x) > 0\}$. We call a distribution $d$ *Dirac* if $|\mathsf{supp}(d)| = 1$.
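These notions are straightforward to operationalize; a minimal sketch (our own convention, not notation from the paper) representing a distribution as a Python dict from outcomes to probabilities:

```python
def support(d):
    """supp(d): the outcomes of distribution d with positive probability."""
    return {x for x, p in d.items() if p > 0}

def is_dirac(d):
    """A distribution is Dirac iff its support is a single outcome."""
    return len(support(d)) == 1

# A distribution over {a, b} that happens to be Dirac on b.
d = {"a": 0.0, "b": 1.0}
```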

**Definition 1 (Markov decision process).** *A* Markov decision process *is a tuple* $\mathcal{M} = \langle S, \iota, \mathit{Act}, P, Z, \mathsf{obs} \rangle$*, where* $S$ *is a finite set of* states*,* $\iota \in \mathsf{Distr}(S)$ *is an* initial distribution*,* $\mathit{Act}$ *is a finite set of* actions*,* $P \colon S \times \mathit{Act} \rightharpoonup \mathsf{Distr}(S)$ *is a* partial transition function*,* $Z$ *is a finite set of* observations*, and* $\mathsf{obs} \colon S \to \mathsf{Distr}(Z)$ *is an* observation function*.*

*Remark 1.* The observation function can also be defined as a state-action observation function $\mathsf{obs} \colon S \times \mathit{Act} \to \mathsf{Distr}(Z)$. MDPs with a state-action observation function can easily be transformed into equivalent MDPs with a state observation function using auxiliary states [19]. Throughout the paper we use state-action observations to keep (sensor) models concise.

For a state $s \in S$, we define $\mathsf{AvAct}(s) = \{\alpha \mid P(s, \alpha) \neq \bot\}$. W.l.o.g., $|\mathsf{AvAct}(s)| \ge 1$. If all distributions in $\mathcal{M}$ are Dirac, we refer to $\mathcal{M}$ as a *Kripke structure* (KS). If $|\mathsf{AvAct}(s)| = 1$ for all $s \in S$, we refer to $\mathcal{M}$ as a *Markov chain* (MC). When $Z = S$, we refer to $\mathcal{M}$ as *fully observable* and omit $Z$ and $\mathsf{obs}$ from its definition. A *finite path* in an MDP $\mathcal{M}$ is a sequence $\pi = s_0 a_0 s_1 \ldots s_n \in S \times (\mathit{Act} \times S)^*$ such that $\iota(s_0) > 0$ and, for every $0 \le i < n$, $P(s_i, a_i)(s_{i+1}) > 0$. We denote the set of finite paths of $\mathcal{M}$ by $\Pi_\mathcal{M}$. The *length* of a path is the number of actions along the path. For $n \in \mathbb{N}$, the set $\Pi^n_\mathcal{M}$ denotes the set of finite paths of length $n$. We use $\pi_\downarrow$ to denote the last state in $\pi$. We omit $\mathcal{M}$ whenever it is clear from the context. A *trace* is a sequence of observations $\tau = z_0 \ldots z_n \in Z^+$. Every path induces a distribution over traces.

As standard, any nondeterminism is resolved by means of a scheduler.

**Definition 2 (Scheduler).** *A* scheduler *for an MDP* $\mathcal{M}$ *is a function* $\sigma \colon \Pi_\mathcal{M} \to \mathsf{Distr}(\mathit{Act})$ *with* $\mathsf{supp}(\sigma(\pi)) \subseteq \mathsf{AvAct}(\pi_\downarrow)$ *for every* $\pi \in \Pi_\mathcal{M}$*.*

We use $\mathit{Sched}(\mathcal{M})$ to denote the set of schedulers. For a fixed scheduler $\sigma \in \mathit{Sched}(\mathcal{M})$, the probability $\mathsf{Pr}_\sigma(\pi)$ of a path $\pi$ (under the scheduler $\sigma$) is the product of the transition probabilities in the induced Markov chain. For more details, we refer the reader to [8].

#### **2.2 Formal Problem Statement**

Our goal is to determine the risk that a system is exposed to after having observed a trace $\tau \in Z^+$. Let $r \colon S \to \mathbb{R}_{\ge 0}$ map states of $\mathcal{M}$ to some risk in $\mathbb{R}_{\ge 0}$. We call $r$ a *state-risk function* for $\mathcal{M}$; it maps every state to the risk associated with being in that state. For example, in our experiments, we flexibly define the state risk using the (expected-reward extension of the) temporal logic PCTL [8], e.g., as the probability of reaching a fail state, or as the probability to crash within $H$ steps. The use of expected rewards allows for even more flexible definitions.

Intuitively, to compute this risk of the system we need to estimate the current system state having observed $\tau$, considering both the probabilistic and the nondeterministic context. To this end, we formalize the (conditional) probabilities and risks of paths and traces. Let $\mathsf{Pr}_\sigma(\pi \mid \tau)$ denote the probability of a path $\pi$, under a scheduler $\sigma$, having observed $\tau$. Since a scheduler may define many paths that induce the observation trace $\tau$, we are interested in the weighted risk over all paths, i.e., $\sum_{\pi \in \Pi^{|\tau|}_\mathcal{M}} \mathsf{Pr}_\sigma(\pi \mid \tau) \cdot r(\pi_\downarrow)$. The monitoring problem for MDPs then conservatively over-approximates the risk of a trace by assuming an adversarial scheduler, that is, by taking the supremum of the risk estimate over all schedulers<sup>1</sup>.

<sup>1</sup> We later see in Lemma 8 that this is indeed a maximum.

**The Monitoring Problem.** Given an MDP $\mathcal{M}$, a state-risk function $r \colon S \to \mathbb{R}_{\ge 0}$, an observation trace $\tau \in Z^+$, and a threshold $\lambda \in [0, \infty)$, decide whether $R_r(\tau) > \lambda$, where the *weighted risk function* $R_r \colon Z^+ \to \mathbb{R}_{\ge 0}$ is defined as

$$R\_r(\tau) \quad := \sup\_{\sigma \in Sched(\mathcal{M})} \sum\_{\pi \in \Pi\_{\mathcal{M}}^{|\tau|}} \mathsf{Pr}\_{\sigma}(\pi \mid \tau) \cdot r(\pi\_\downarrow).$$

The conditional probability Prσ(<sup>π</sup> <sup>|</sup> <sup>τ</sup> ) can be characterized using Bayes' rule<sup>2</sup>:

$$\text{Pr}\_{\sigma}(\pi \mid \tau) = \frac{\text{Pr}(\tau \mid \pi) \cdot \text{Pr}\_{\sigma}(\pi)}{\text{Pr}\_{\sigma}(\tau)}.$$

The probability $\mathsf{Pr}(\tau \mid \pi)$ of a trace $\tau$ for a fixed path $\pi$ is $\mathsf{obs}_{\mathsf{tr}}(\pi)(\tau)$, where

$$\mathsf{obs}\_{\mathsf{tr}}(s) := \mathsf{obs}(s), \quad \mathsf{obs}\_{\mathsf{tr}}(\pi \alpha s') := \{\, \tau \cdot z \mapsto \mathsf{obs}\_{\mathsf{tr}}(\pi)(\tau) \cdot \mathsf{obs}(s')(z) \,\},$$

when $|\pi| = |\tau|$, and $\mathsf{obs}_{\mathsf{tr}}(\pi)(\tau) = 0$ otherwise. The probability $\mathsf{Pr}_\sigma(\tau)$ of a trace $\tau$ is $\sum_{\pi} \mathsf{Pr}_\sigma(\pi) \cdot \mathsf{Pr}(\tau \mid \pi)$.
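For the scheduler-free case of a Markov chain, these definitions yield a direct computation of $R_r(\tau)$ by enumerating all paths with one state per observation. The sketch below is our own illustration (states, transitions, observation functions, and risks as dicts); the supremum over schedulers disappears because there is no nondeterminism.

```python
def weighted_risk(init, P, obs, risk, trace):
    """R_r(tau) for a Markov chain: weight each path by
    Pr(path) * Pr(tau | path), then condition on the trace via Bayes'
    rule, with the convention 0/0 = 0 from footnote 2."""
    # Each entry: (last state of a path, Pr(path) * Pr(trace-so-far | path)).
    frontier = [(s, p * obs[s].get(trace[0], 0.0)) for s, p in init.items()]
    for z in trace[1:]:
        frontier = [(t, w * q * obs[t].get(z, 0.0))
                    for s, w in frontier
                    for t, q in P[s].items()]
    pr_trace = sum(w for _, w in frontier)          # Pr(tau)
    if pr_trace == 0.0:
        return 0.0                                  # convention: 0/0 = 0
    return sum(w * risk[s] for s, w in frontier) / pr_trace

# A hypothetical two-state chain with a noisy sensor.
init = {"safe": 1.0}
P    = {"safe": {"safe": 0.5, "bad": 0.5}, "bad": {"bad": 1.0}}
obs  = {"safe": {"ok": 0.9, "alarm": 0.1}, "bad": {"ok": 0.2, "alarm": 0.8}}
risk = {"safe": 0.0, "bad": 1.0}
```

Since the state risk depends only on the last state $\pi_\downarrow$, the sketch may group paths by their last state; the weights simply add up.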

We call the special variant with λ = 0 the *qualitative monitoring problem*. The problems are (almost) equivalent on Kripke structures, where considering a single path to an adequate state suffices. Details are given in [36, Appendix].

**Lemma 1.** *For Kripke structures the monitoring and qualitative monitoring problems are logspace interreducible.*

In the next sections we present two types of algorithms for the monitoring problem. The first is based on the widespread (forward) filtering approach [44]. The second is a new algorithm based on model checking conditional probabilities. While filtering approaches are efficacious in a purely nondeterministic or a purely probabilistic setting, they do not scale on models, such as MDPs, that are both probabilistic and nondeterministic. For those models, model checking provides a tractable alternative. Before going into details, we first connect the problem statement more formally to our motivating example.

#### **2.3 An MDP Defining the System Dynamics**

We show how the weighted risk for a system given by a world and a sensor model can be formalized as a monitoring problem for MDPs. To this end, we define the dynamics of the world and the sensors, which serve as the basis for our monitor, as the following joint MDP.

For a fully observable world MDP $\mathcal{E} = \langle S_\mathcal{E}, \iota_\mathcal{E}, \mathit{Act}_\mathcal{E}, P_\mathcal{E} \rangle$ and a sensor MDP $\mathcal{S} = \langle S_\mathcal{S}, \iota_\mathcal{S}, S_\mathcal{E}, P_\mathcal{S}, Z, \mathsf{obs} \rangle$, where $\mathsf{obs}$ is state-action based, the *inspected system* is defined by the MDP $\langle\!\langle \mathcal{E}, \mathcal{S} \rangle\!\rangle = \langle S_\mathcal{J}, \iota_\mathcal{J}, \mathit{Act}_\mathcal{E}, P_\mathcal{J}, Z, \mathsf{obs}_\mathcal{J} \rangle$, the synchronous composition of $\mathcal{E}$ and $\mathcal{S}$:

<sup>2</sup> For conciseness we assume throughout the paper that $\frac{0}{0} = 0$.

**Fig. 2.** A run, with its observations, of the inspected system $\langle\!\langle \mathcal{E}, \mathcal{S} \rangle\!\rangle$, where $\mathcal{E}$ and $\mathcal{S}$ are the models given in Fig. 1.

– S<sup>J</sup> := S<sup>E</sup> × S<sup>S</sup> , – ι<sup>J</sup> is defined as ι<sup>J</sup> (u, s) := ι<sup>E</sup> (u) · ι<sup>S</sup> (s) for each u ∈ S<sup>E</sup> and s ∈ S<sup>S</sup> , – <sup>P</sup><sup>J</sup> : <sup>S</sup><sup>J</sup> <sup>×</sup> Act<sup>E</sup> <sup>→</sup> Distr(S<sup>J</sup> ) such that for all u, s ∈ <sup>S</sup><sup>J</sup> and <sup>α</sup> <sup>∈</sup> Act<sup>E</sup> ;

$$P_J(\langle u, s \rangle, \alpha) = d_{u,s} \in \mathsf{Distr}(S_J),$$

where for all u' ∈ S_E and s' ∈ S_S: d_{u,s}(⟨u', s'⟩) = P_E(u, α)(u') · P_S(s, u)(s'),
– obs_J : S_J → Distr(Z) with obs_J : ⟨u, s⟩ ↦ obs(s, u).

In Fig. 2 we illustrate a run of ⟨⟨E, S⟩⟩ for the world and sensor MDPs presented in Fig. 1. In particular, we show the observations of the joint MDP, given by the distributions over the observations for each transition in the run (we omit the probabilistic transitions for simplicity). The observations of the MDP M represent the output of the sensor along a path through M. These observations in turn are the inputs to a monitor running on top of the system. The role of the monitor is then to compute the risk of being in a critical state based on the received observations.
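The synchronous composition above can be sketched in a few lines. The dict-based encoding (states and actions as keys, successor distributions as dicts) is our own illustrative choice, not the paper's implementation.

```python
def compose(P_E, P_S, iota_E, iota_S):
    """Joint MDP of a world MDP E and a sensor MDP S (Sect. 2.3).

    P_E:    dict (u, alpha) -> {u': prob}   world transitions
    P_S:    dict (s, u)     -> {s': prob}   sensor transitions, conditioned
                                            on the current world state u
    Returns the joint initial distribution iota_J and transition function P_J.
    """
    # iota_J(<u, s>) := iota_E(u) * iota_S(s)
    iota_J = {(u, s): pu * ps
              for u, pu in iota_E.items()
              for s, ps in iota_S.items()
              if pu * ps > 0}

    def P_J(joint_state, alpha):
        # d_{u,s}(<u', s'>) = P_E(u, alpha)(u') * P_S(s, u)(s')
        u, s = joint_state
        dist = {}
        for u2, p_world in P_E[(u, alpha)].items():
            for s2, p_sensor in P_S[(s, u)].items():
                dist[(u2, s2)] = dist.get((u2, s2), 0.0) + p_world * p_sensor
        return dist

    return iota_J, P_J
```

On the running example, composing the world MDP of Fig. 1 with its sensor yields exactly the joint transitions whose observation distributions are depicted in Fig. 2.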

### **3 Forward Filtering for State Estimation**

We start by showing why standard forward filtering does not scale well on MDPs. We briefly show how filtering can be used to solve the monitoring problem for purely nondeterministic systems (Kripke structures) or purely probabilistic systems (Markov chains). Then, we show why for MDPs, forward filtering needs to manage a finite, yet exponentially large, set of distributions. In Sect. 4 we present a new, improved variant of forward filtering for MDPs based on filtering with the vertices of the convex hull. In Sect. 5 we present a new polynomial-time model-checking-based algorithm for solving the problem.

#### **3.1 State Estimators for Kripke Structures**

For Kripke structures, we maintain a set of possible states that agree with the observed trace. This set of states is inductively characterized by the function estKS : Z⁺ → 2^S, which we define formally below. For an observation trace τ, estKS(τ) defines the set of states that can be reached with positive probability. This set can be computed by a forward state traversal [31]. To illustrate how estKS(τ) is computed for τ, consider the underlying Kripke structure of the inspected system ⟨⟨E, S⟩⟩ for our running example in Fig. 1 (to make this a Kripke structure, we remove the probabilities). Consider further the observation trace τ = R_o · M_o · L_o. Since ⟨⟨E, S⟩⟩ has only one initial state ⟨R, D2, *sense*⟩ and R_o is observable with positive probability in this state, estKS(R_o) = {⟨R, D2, *sense*⟩}. As M_o is observed next, estKS(R_o · M_o) computes the states reached from ⟨R, D2, *sense*⟩ in which M_o can be observed with positive probability, i.e., estKS(R_o · M_o) = {⟨R, D1, *sense*⟩, ⟨M, D1, *sense*⟩}. Finally, the current state having observed R_o · M_o · L_o may be one of the states estKS(τ) = {⟨M, D1, *sense*⟩, ⟨L, D1, *sense*⟩, ⟨L, D0, *sense*⟩, ⟨M, D0, *sense*⟩}, which in particular shows that we might be in the high-risk world state ⟨M, D0⟩.

**Definition 3 (**KS **state estimator).** *For* KS = ⟨S, ι, Act, P, Z, obs⟩*, the state estimation function* estKS : Z⁺ → 2^S *is defined as*

$$\begin{aligned} \mathsf{est}_{\mathsf{KS}}(z) &:= \{ s \in S \mid \iota(s) > 0 \land \mathsf{obs}(s)(z) > 0 \} \\ \mathsf{est}_{\mathsf{KS}}(\tau \cdot z) &:= \left\{ s' \in S \mid \exists s \in \mathsf{est}_{\mathsf{KS}}(\tau), \exists \alpha \in \mathsf{Act}.\ P(s, \alpha)(s') > 0 \land \mathsf{obs}(s')(z) > 0 \right\}. \end{aligned}$$

For a Kripke structure KS and a given trace τ , the monitoring problem can be solved by computing estKS(τ ), using [31] and Lemma 1.

**Lemma 2.** *For a Kripke structure* KS = ⟨S, ι, Act, P, Z, obs⟩*, a trace* τ ∈ Z⁺*, and a state-risk function* r : S → R≥0*, it holds that* R_r(τ) = max_{s ∈ estKS(τ)} r(s)*. Computing* R_r(τ) *requires time* O(|τ|·|P|) *and space* O(|S|)*.*

A proof can be found in [36, Appendix]. The time and space requirements follow directly from the inductive definition of estKS, which resembles solving a forward state traversal problem in automata [31]. In particular, the algorithm allows updating the result after extending τ in time O(|P|).
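As a minimal sketch (using our own dict encoding, not the paper's implementation), the forward state traversal behind estKS can be written as:

```python
def est_ks(trace, states, iota, actions, P, obs):
    """Kripke-structure state estimator est_KS (Definition 3).

    P:   dict (s, alpha) -> {s': prob}  (probabilities only checked for > 0)
    obs: dict s -> {z: prob}
    Returns the set of states consistent with the observation trace.
    """
    z0 = trace[0]
    current = {s for s in states
               if iota.get(s, 0) > 0 and obs[s].get(z0, 0) > 0}
    for z in trace[1:]:
        # One step of forward traversal: successors that can emit z.
        current = {s2
                   for s in current
                   for a in actions
                   for s2, p in P.get((s, a), {}).items()
                   if p > 0 and obs[s2].get(z, 0) > 0}
    return current
```

By Lemma 2, R_r(τ) is then `max(r[s] for s in est_ks(...))`; each update step touches every transition at most once, matching the O(|τ|·|P|) bound.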

#### **3.2 State Estimators for Markov Chains**

For Markov chains, in addition to tracking the potential reachable system states, we also need to take the transition probabilities into account. When a system is (observation-)deterministic, we can adapt the notion of beliefs, similar to RVSE [54], and similar to the construction of belief MDPs for *partially observable MDPs*, cf. [53]:

**Definition 4 (Belief).** *For an MDP* M *with a set of states* S*, a belief* bel *is a distribution in* Distr(S)*.*

In the remainder of the paper, we will denote the function S → {0} by **0** and the set Distr(S) ∪ {**0**} by Bel. A state estimator based on Bel is then defined as follows [51,54,57] 3:

<sup>3</sup> For the deterministic case, we omit the unique action for brevity.

**Definition 5 (MC state estimator).** *For* MC = ⟨S, ι, Act, P, Z, obs⟩ *and a trace* τ ∈ Z⁺*, the state estimation function* estMC : Z⁺ → Bel *is defined as*

$$\mathsf{est}_{\mathsf{MC}}(z) := \begin{cases} \left\{ s \mapsto \frac{\iota(s) \cdot \mathsf{obs}(s)(z)}{\sum_{\hat{s} \in S} \iota(\hat{s}) \cdot \mathsf{obs}(\hat{s})(z)} \right\} & \exists s \in S.\ \iota(s) \cdot \mathsf{obs}(s)(z) > 0, \\ \mathbf{0} & \text{otherwise.} \end{cases}$$

$$\mathsf{est\_{\mathsf{MC}}}(\tau \cdot z) := \left\{ s' \mapsto \frac{\sum\_{s \in S} \mathsf{est\_{\mathsf{MC}}}(\tau)(s) \cdot P(s, s') \cdot \mathsf{obs}(s')(z)}{\sum\_{s \in S} \mathsf{est\_{\mathsf{MC}}}(\tau)(s) \cdot \left( \sum\_{\hat{s} \in S} P(s, \hat{s}) \cdot \mathsf{obs}(\hat{s})(z) \right)} \right\}$$

To illustrate how estMC is computed, consider again our system in Fig. 1 and assume that the MDP has only the actions labeled with {p} (reducing it to the Markov chain induced by the scheduler that only performs the {p} actions). Again we consider the observation trace τ = R_o · M_o · L_o and compute estMC(τ). For the first observation R_o, and since there is only one initial state, it follows that estMC(R_o) = {⟨R, D2⟩ ↦ 1}<sup>4</sup>. From ⟨R, D2⟩ and having observed M_o, we can reach the states ⟨R, D1⟩ and ⟨M, D1⟩ with probabilities estMC(R_o · M_o) = {⟨R, D1⟩ ↦ (1/2 · 1/3) / (1/2 · 1/3 + 1/2 · 3/4) = 4/13, ⟨M, D1⟩ ↦ (1/2 · 3/4) / (1/2 · 1/3 + 1/2 · 3/4) = 9/13}. Finally, from the latter two states, when observing L_o, the states ⟨M, D0⟩ and ⟨L, D0⟩ can be reached with probabilities estMC(R_o · M_o · L_o) = {⟨M, D0⟩ ↦ 0.0001, ⟨L, D0⟩ ↦ 0.999}. Notice that although the state ⟨R, D0⟩ can be reached from ⟨R, D1⟩, the probability of being in this state is 0 since the probability of observing L_o in this state is obs(⟨R, D0⟩)(L_o) = 0.

**Lemma 3.** *For a Markov chain* MC = ⟨S, ι, Act, P, Z, obs⟩*, a trace* τ ∈ Z⁺*, and a state-risk function* r : S → R≥0*, it holds that* R_r(τ) = Σ_{s∈S} estMC(τ)(s) · r(s)*. Computing* R_r(τ) *can be done in time* O(|τ|·|S|·|P|)*, and using* |S| *many rational numbers. The size of the rationals*<sup>5</sup> *may grow linearly in* |τ|*.*

*Proof Sketch.* Since the system is deterministic, there is a unique scheduler σ, thus R_r(τ) = Σ_{π ∈ Π^{|τ|}_MC} Pr_σ(π | τ) · r(π↓) by definition. We can show by induction over the length of τ that Pr_σ(π | τ) = estMC(τ)(π↓), and conclude that R_r(τ) = Σ_{π ∈ Π^{|τ|}_MC} estMC(τ)(π↓) · r(π↓) = Σ_{s∈S} estMC(τ)(s) · r(s), because estMC(τ)(s) = 0 for all s ∈ S for which there is no path π ∈ Π^{|τ|}_MC with π↓ = s. The complexity follows from the inductive definition of estMC, which requires in each inductive step to iterate over all transitions of the system and maintain a belief over the states of the system.
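A single update step of estMC (Definition 5) is a standard filtering step. The following sketch uses illustrative dict encodings and realizes the paper's 0/0 = 0 convention by returning the zero belief when the normalization constant vanishes.

```python
def mc_filter_step(belief, z, P, obs):
    """One update est_MC(tau . z) given belief = est_MC(tau).

    belief: dict s -> prob,  P: dict s -> {s': prob},  obs: dict s -> {z: prob}
    """
    unnorm = {}
    for s, b in belief.items():
        for s2, p in P.get(s, {}).items():
            w = b * p * obs[s2].get(z, 0.0)
            if w > 0:
                unnorm[s2] = unnorm.get(s2, 0.0) + w
    total = sum(unnorm.values())
    if total == 0:
        return {}  # the zero belief "0": the trace has probability 0
    return {s: w / total for s, w in unnorm.items()}


def weighted_risk(belief, r):
    """Lemma 3: R_r(tau) = sum_s est_MC(tau)(s) * r(s)."""
    return sum(b * r.get(s, 0.0) for s, b in belief.items())
```

On the running example, updating the belief {⟨R, D2⟩ ↦ 1} with observation M_o reproduces the values 4/13 and 9/13 computed above.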

#### **3.3 State Estimators for Markov Decision Processes**

In an MDP, we have to account for every possible resolution of nondeterminism, which means that a belief can evolve into a set of beliefs:

<sup>4</sup> We omit the (single) sensor state for conciseness.

<sup>5</sup> To avoid growth, one may use fixed-precision numbers that over-approximate the probability of being in any state—inducing a growing (but conservative) error.

**Definition 6 (MDP state estimator).** *For an MDP* M = ⟨S, ι, Act, P, Z, obs⟩*, a trace* τ ∈ Z⁺*, and a state-risk function* r : S → R≥0*, the state estimation function* estMDP : Z⁺ → 2^Bel *is defined as*

$$\begin{aligned} \mathsf{est}_{\mathsf{MDP}}(z) &= \{ \mathsf{est}_{\mathsf{MC}}(z) \}, \\ \mathsf{est}_{\mathsf{MDP}}(\tau \cdot z) &= \left\{ \mathsf{bel}' \in \mathsf{Bel} \,\middle|\, \exists \mathsf{bel} \in \mathsf{est}_{\mathsf{MDP}}(\tau).\ \mathsf{bel}' \in \mathsf{est}_{\mathsf{MDP}}^{\mathsf{up}}(\mathsf{bel}, z) \right\}, \end{aligned}$$

*and where* bel' ∈ est^up_MDP(bel, z) *if there exists* ς_bel : S → Distr(Act) *such that:*

$$\forall s'.\ \mathsf{bel}'(s') = \frac{\sum_{s \in S} \mathsf{bel}(s) \cdot \sum_{\alpha \in \mathsf{Act}} \varsigma_{\mathsf{bel}}(s)(\alpha) \cdot P(s, \alpha, s') \cdot \mathsf{obs}(s')(z)}{\sum_{s \in S} \mathsf{bel}(s) \cdot \sum_{\alpha \in \mathsf{Act}} \varsigma_{\mathsf{bel}}(s)(\alpha) \cdot \sum_{\hat{s} \in S} P(s, \alpha, \hat{s}) \cdot \mathsf{obs}(\hat{s})(z)}.$$

The definition conservatively extends both Definition 3 and Definition 5. Furthermore, we remark that we do not restrict how the nondeterminism is resolved: any distribution over actions can be chosen, and the distributions may be different for different traces.

Consider our system in Fig. 1. For the trace τ = R_o · M_o · L_o, estMDP(τ) is computed as follows. First, when observing R_o, the state estimator computes the initial belief set estMDP(R_o) = {{⟨R, D2⟩ ↦ 1}}. From this set of beliefs, when observing M_o, a set estMDP(R_o · M_o) can be computed since all transitions ∅, {p}, {w}, {p, w} (as well as their convex combinations) are possible from ⟨R, D2⟩. One of these beliefs is for example {⟨R, D1⟩ ↦ 4/13, ⟨M, D1⟩ ↦ 9/13}, reached when a scheduler takes the transition {p} (as was computed in our example for the Markov chain case). Having additionally observed L_o, a new set estMDP(R_o · M_o · L_o) of beliefs can be computed based on the beliefs in estMDP(R_o · M_o). For example, from the belief {⟨R, D1⟩ ↦ 4/13, ⟨M, D1⟩ ↦ 9/13}, two of the new beliefs are {⟨L, D0⟩ ↦ 0.999, ⟨M, D0⟩ ↦ 0.0001} and {⟨M, D1⟩ ↦ 0.0287, ⟨M, D0⟩ ↦ 0.0001, ⟨L, D0⟩ ↦ 0.9712}. The first belief is reached by a scheduler that takes a transition {p} at both ⟨R, D1⟩ and ⟨M, D1⟩. Notice that this belief does not give a positive probability to the state ⟨R, D0⟩ because L_o cannot be observed in this state. The second belief is reached by considering a scheduler that takes transition {p} at ⟨M, D1⟩ and transition ∅ at ⟨R, D1⟩.

**Theorem 1.** *For an MDP* M = ⟨S, ι, Act, P, Z, obs⟩*, a trace* τ ∈ Z⁺*, and a state-risk function* r : S → R≥0*, it holds that* R_r(τ) = sup_{bel ∈ estMDP(τ)} Σ_{s∈S} bel(s) · r(s)*.*

*Proof Sketch.* For a given trace τ, each (history-dependent, randomizing) scheduler induces a belief over the states of the Markov chain induced by the scheduler. Also, each belief in estMDP(τ) corresponds to a fixed scheduler, namely the one used to compute the belief recursively (i.e., an arbitrary randomizing memoryless scheduler for every time step). Once a scheduler σ and its corresponding belief bel are fixed, or vice versa, we can show using induction over the length of τ that Σ_{π ∈ Π^{|τ|}_M} Pr_σ(π | τ) · r(π↓) = Σ_{s∈S} bel(s) · r(s).
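A naive realization of the estimator enumerates one action choice per support state in each step (deterministic choices suffice to obtain the vertices, as discussed in Sect. 4; randomized choices would add their convex combinations). This sketch uses our own dict encodings and makes the exponential blow-up of Sect. 4.2 directly visible.

```python
from itertools import product

def mdp_filter_step(beliefs, z, P, actions, obs):
    """One step of est_MDP (Definition 6), restricted to deterministic
    memoryless choices S -> Act.

    beliefs: iterable of dicts s -> prob; returns the updated list of beliefs.
    P: dict (s, alpha) -> {s': prob};  obs: dict s -> {z: prob}
    """
    new_beliefs = []
    for bel in beliefs:
        support = [s for s, b in bel.items() if b > 0]
        for choice in product(actions, repeat=len(support)):
            unnorm = {}
            for s, a in zip(support, choice):
                for s2, p in P.get((s, a), {}).items():
                    w = bel[s] * p * obs[s2].get(z, 0.0)
                    if w > 0:
                        unnorm[s2] = unnorm.get(s2, 0.0) + w
            total = sum(unnorm.values())
            if total > 0:
                cand = {s: w / total for s, w in unnorm.items()}
                if cand not in new_beliefs:  # naive de-duplication
                    new_beliefs.append(cand)
    return new_beliefs
```

In the worst case, each step multiplies the number of tracked beliefs by Π_s |Act(s)|, which motivates the vertex-based reduction of Sect. 4.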

**Fig. 3.** Beliefs in <sup>R</sup>*<sup>n</sup>* on <sup>M</sup> for <sup>τ</sup> <sup>=</sup> <sup>z</sup>0z0, <sup>z</sup>0z0z<sup>0</sup> and <sup>z</sup>0z0z1, respectively.

### **4 Convex Hull-Based Forward Filtering**

In this section, we show that we can use a finite representation for estMDP(τ ), but that this representation is exponentially large for some MDPs.

#### **4.1 Properties of estMDP(τ)**

First, observe that **0** never maximizes the risk. Furthermore, **0** is closed under updates, i.e., est^up_MDP(**0**, z) = {**0**}. We can thus w.l.o.g. assume that **0** ∉ estMDP(τ). Second, observe that estMDP(τ) ≠ ∅ if Pr_σ(τ) > 0 for some scheduler σ.

We can interpret a belief bel ∈ Bel as a point in (a bounded subset of) R^(|S|−1). We are in particular interested in convex sets of beliefs. A set B ⊆ Bel is convex if the convex hull CH(B) of B, i.e., the set of all convex combinations of beliefs in B<sup>6</sup>, coincides with B, i.e., CH(B) = B. For a set B ⊆ Bel, a belief bel ∈ B is an interior belief if it can be expressed as a convex combination of the beliefs in B \ {bel}. All other beliefs are (extremal) points or *vertices*. Let V(B) ⊆ B denote the set of *vertices of the convex hull* of B.

*Example 1.* Consider Fig. 3(a). All observations are Dirac, and only states s_2 and s_4 have observation z_1. The beliefs having observed z_0z_0 are distributions over s_1, s_3, and can thus be depicted in a one-dimensional simplex. In particular, we have V(estMDP(z_0z_0)) = {{s_1 ↦ 1}, {s_1 ↦ 3/4, s_3 ↦ 1/4}}, as depicted in Fig. 3(b). The six beliefs having observed z_0z_0z_0 are distributions over s_0, s_1, s_3, depicted in Fig. 3(c). Five out of the six beliefs are vertices. The belief having observed z_0z_0z_1 is shown in Fig. 3(d).

*Remark 2.* Observe that we illustrate the beliefs over only the states estKS(τ). We therefore call |estKS(τ)| the dimension of estMDP(τ).

From the fundamental theorem of linear programming [47, Ch. 7] it immediately follows that the trace risk R_r(τ) is attained at a vertex of the beliefs in estMDP(τ). We obtain the following refinement of Theorem 1:

<sup>6</sup> That is, CH(B) = {Σ_{bel∈B} w(bel) · bel | w ∈ R^B_{≥0} with Σ_{bel∈B} w(bel) = 1}.

**Theorem 2.** *For every* τ *and* r*:* R_r(τ) = max_{bel ∈ V(estMDP(τ))} Σ_{s∈S} bel(s) · r(s)*.*

Lemma 5 below clarifies that this maximum indeed exists.

We make some observations that allow us to compute the vertices more efficiently. Let est^up_MDP(B, z) denote ∪_{bel∈B} est^up_MDP(bel, z). From the properties of convex sets [18, Ch. 2], we make the following observations: If B is convex, est^up_MDP(B, z) is convex, as all operations in computing a new belief are convex-set preserving<sup>7</sup>. Furthermore, if B has a finite set of vertices, then est^up_MDP(B, z) has a finite set of vertices. The following lemma, which is based on the observations above, clarifies how to compute the vertices:

**Lemma 4.** *For a convex set of beliefs* B *with a finite set of vertices and an observation* z*:*

$$\mathcal{V}(\mathsf{est}\_{\mathsf{MDP}}^{\mathsf{up}}(B, z)) = \mathcal{V}(\mathsf{est}\_{\mathsf{MDP}}^{\mathsf{up}}(\mathcal{V}(B), z)).$$

By induction and using the facts above we obtain:

**Lemma 5.** *Any* V(estMDP(τ)) *is finite.*

A monitor thus only needs to track the vertices. Furthermore, est^up_MDP(B, z) can be adapted to compute only vertices by limiting ς_bel to (deterministic) functions S → Act.
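For beliefs whose support contains three states, V(B) can be computed by projecting to the plane (the entries sum to 1) and running a standard 2D convex hull. The following is a sketch for this special case only; in general dimension, LP-, SMT-, or Quickhull-style methods are needed, and degenerate belief sets require extra care (cf. Sect. 6).

```python
def hull_vertices_2d(points):
    """Vertices of the convex hull of 2D points (Andrew's monotone chain).

    Collinear and interior points are discarded, so the result is exactly
    the vertex set V(B) of the projected beliefs.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(seq):
        h = []
        for p in seq:
            # Pop while the last turn is clockwise or collinear.
            while len(h) >= 2 and cross(h[-2], h[-1], p) <= 0:
                h.pop()
            h.append(p)
        return h

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    return lower[:-1] + upper[:-1]
```

For instance, projecting beliefs over {s_0, s_1, s_3} by dropping the last coordinate and applying the function above removes exactly the interior beliefs, as in Fig. 3(c).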

#### **4.2 Exponential Lower Bounds on the Relevant Vertices**

We show that a monitor in general cannot avoid an exponential blow-up in the beliefs it tracks. First observe that updating bel yields up to Π_s |Act(s)| new beliefs (vertex or not), a prohibitively large number. The number of vertices is also exponential:

**Lemma 6.** *There exists a family of MDPs* M_n *with* 2n + 1 *states such that* |V(estMDP(τ))| = 2^n *for every* τ *with* |τ| > 2*.*

*Proof Sketch.* We construct M_n for n = 3, that is, M_3 in Fig. 4(a). For this MDP and τ = AAA, |V(estMDP(τ))| = 2^3. In particular, observe how the belief factorizes into a belief within each component C_i = {h_i, l_i}, and notice that M_n has components C_1 to C_n. In particular, for each component, the belief is that we are with probability mass 1/n (for n = 3, 1/3) in either the 'low' state l_i or the 'high' state h_i. We depict the beliefs in Fig. 4(b,c,d). Thus, for any τ with |τ| > 2 we can compactly represent V(estMDP(τ)) as bit-strings of length n. Concretely, the belief

$$\begin{aligned} \{h_1, l_2, l_3 \mapsto 1/3;\ l_1, h_2, h_3 \mapsto 0\} &\text{ maps to } 100, \text{ and} \\ \{h_1, l_2, h_3 \mapsto 1/3;\ l_1, h_2, l_3 \mapsto 0\} &\text{ maps to } 101. \end{aligned}$$

These bit-strings of length n represent exponentially many beliefs.
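The bit-string encoding from the proof sketch can be enumerated directly; the state names `h{i}`/`l{i}` and the dict encoding are illustrative:

```python
from itertools import product

def lemma6_vertices(n):
    """Enumerate the 2^n vertex beliefs of M_n (Lemma 6), keyed by bit-string.

    Bit i set to '1' means component C_i carries its 1/n probability mass
    in the 'high' state h_i; '0' means it sits in the 'low' state l_i.
    """
    beliefs = {}
    for bits in product("01", repeat=n):
        bel = {}
        for i, b in enumerate(bits, start=1):
            bel[f"h{i}"] = 1.0 / n if b == "1" else 0.0
            bel[f"l{i}"] = 0.0 if b == "1" else 1.0 / n
        beliefs["".join(bits)] = bel
    return beliefs
```

For n = 3 this yields the 8 vertices of Fig. 4, e.g. the bit-string 100 maps to the belief {h_1, l_2, l_3 ↦ 1/3; l_1, h_2, h_3 ↦ 0}.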

One might ask whether a symbolic encoding of an exponentially large set may result in a more tractable approach to filtering. While Theorem 2 allows us

<sup>7</sup> The scaling is called a *projection*.

**Fig. 4.** Construction for the correctness of Lemma 6.

to compute the associated risk from a set of linear constraints with standard techniques, it is not clear whether such a concise set of constraints can be efficiently constructed and updated in every step. We leave this concern for future work.

In the remainder we investigate whether we need to track all these beliefs. First, when the monitor is unaware of the state-risk, an exponential blow-up is trivially unavoidable. More precisely, every vertex may induce the maximal weighted trace risk for an appropriately chosen state-risk:

**Lemma 7.** *For every* <sup>τ</sup> *and every* bel ∈ V(estMDP(<sup>τ</sup> )) *there exists an* <sup>r</sup> *s.t.*

$$\sum_{s \in S} \mathsf{bel}(s) \cdot r(s) \ge \max_{\mathsf{bel}' \in \mathcal{V}(\mathsf{est}_{\mathsf{MDP}}(\tau)) \setminus \{\mathsf{bel}\}} \sum_{s \in S} \mathsf{bel}'(s) \cdot r(s), \text{ with } \max_{\emptyset} = -\infty.$$

*Proof Sketch.* We construct r such that r(s) > r(s') whenever bel(s) > bel(s').

Second, even if the monitor is aware of the state risk r, it may not be able to prune enough vertices to avoid exponential growth. The crux here is that while some of the current beliefs may induce a smaller risk, an extension of the trace may cause such a belief to evolve into one that induces the maximal risk.

**Theorem 3.** *There exist MDPs* M_n*, a trace* τ *with* B := V(estMDP(τ))*, and a state-risk* r *such that* |B| = 2^n *and for all* bel ∈ B *there exists* τ' ∈ Z⁺ *with* R_r(τ · τ') > sup_{bel' ∈ B'} Σ_s bel'(s) · r(s)*, where* B' = est^up_MDP(B \ {bel}, τ')*.*

It is helpful to understand this theorem as describing the outcome of a game between monitor and environment: The statement says that if the monitor decides to drop some vertices from estMDP(τ), the environment may produce an observation trace τ' that leads the monitor to underestimate the weighted risk R_r(τ · τ').

*Proof Sketch.* We extend the construction of Fig. 4(a) with choices to go to a final state. The full proof sketch can be found in [36, Appendix].

#### **4.3 Approximation by Pruning**

Finally, we illustrate that we cannot simply prune small probabilities from beliefs. This indicates that an approximate version of filtering for the monitoring problem is nontrivial. Reconsider observing z_0z_0 in the MDP of Fig. 3, and, for the sake of argument, let us prune the (small) entry s_3 ↦ 1/4 to 0. Now, continuing with the trace z_0z_0z_1, we would update the beliefs from before and then conclude that this trace cannot be observed with positive probability. With pruning, there is thus no upper bound on the difference between the *computed* R_r(τ) and the *actual* R_r(τ). Thus, forward filtering is, in general, not tractable on MDPs.

### **5 Unrolling with Model Checking**

We present a tractable algorithm for the monitoring problem. Contrary to filtering, this method incorporates the state risk. We briefly consider the qualitative case first. An algorithm that solves this problem iteratively guesses a successor such that the given trace has positive probability and reaches a state with sufficient risk. The algorithm only stores the current and next state and a counter.

**Theorem 4.** *The Monitoring Problem with* λ = 0 *is in NLOGSPACE.*

This result implies the existence of a polynomial-time algorithm, e.g., using a graph search on a graph growing in |τ|. There is also a deterministic algorithm with space complexity O(log²(|M| + |τ|)), which follows from applying Savitch's Theorem [46], but that algorithm has exponential time complexity.

We now present a tractable algorithm for the quantitative case, where we need to store all paths. We do this efficiently by storing an unrolled MDP containing these paths, using ideas from [9,19]. In particular, on this MDP, we can efficiently obtain the scheduler that optimizes the risk by model checking rather than by enumerating over all schedulers explicitly. We give the result before going into details.

**Theorem 5.** *The Monitoring Problem (with* λ > 0*) is P-complete.*

The problem is P-hard, as unary-encoded step-bounded reachability is P-hard [41]. It remains to give a P-time algorithm<sup>8</sup>, which is outlined below. Roughly, the algorithm constructs an MDP M''' from M in three conceptual steps, such that the

<sup>8</sup> On first sight, this might be surprising, as step-bounded reachability in MDPs is PSPACE-hard and only quasi-polynomial. However, our problem receives a trace as input and can therefore (assuming that the trace is not compressed) be handled in time polynomial in the length of the trace.

**Fig. 5.** Polynomial-time algorithm for solving Problem 1 illustrated.

maximal probability of reaching a dedicated goal state in M''' coincides with R_r(τ). This reachability probability can be computed by linear programming in polynomial time. The downside is that, even in the best case, the memory consumption grows linearly in |τ|.

We outline the main steps of the algorithm and exemplify them below. First, we transform M into an MDP M' with *deterministic state observations*, i.e., with obs' : S' → Z. This construction is detailed in [19, Remark 1] and runs in polynomial time. The new initial distribution takes into account the initial observation and the initial distribution. Importantly, for each path π and each trace τ, obstr(π)(τ) is preserved. From here, the idea for the algorithm is a tailored adaption of the construction for conditional reachability probabilities in [9]. We ensure that r(s) ∈ [0, 1] by scaling r and λ accordingly. Now, we construct a new MDP M'' = ⟨S'', ι', Act, P''⟩ with state space S'' := (S' × {0, ..., |τ|−1}) ∪ {⊥, ⊤} and a |τ|-times unrolled transition relation. Furthermore, from the states ⟨s, |τ|−1⟩, there is a single outgoing action that with probability r(s) leads to ⊤ and with probability 1 − r(s) leads to ⊥. Observe that the risk is now the supremum of conditional reachability probabilities over paths that reach ⊤, conditioned on the trace τ. The MDP M'' is only polynomially larger. Then, we construct MDP M''' by copying M'' and replacing (part of) the transition relation P'' by P''' such that paths π that are inconsistent with τ are looped back to the initial state (resembling rejection sampling). Formally,

$$P^{\prime\prime\prime}(\langle s,i\rangle,\alpha) = \begin{cases} P^{\prime\prime}(\langle s,i\rangle,\alpha) & \text{if } \mathsf{obs}^{\prime}(s) = \tau\_i, \\ \iota & \text{otherwise}. \end{cases}$$

The maximal conditional reachability probability in M'' is the maximal reachability probability in M''' [9]. Maximal reachability probabilities can be computed by solving a linear program [43], and can thus be computed in polynomial time.

*Example 2.* We illustrate the construction in Fig. 5. In Fig. 5(a), we depict an MDP M with ι = {s_0, s_1 ↦ 1/2}. Furthermore, let τ = z_0z_0 and let r(s_0) = 1 and r(s_1) = 2. Let obs(s_0) = {z_0 ↦ 1} and obs(s_1) = {z_0 ↦ 1/4, z_1 ↦ 3/4}. State s_1 has two possible observations, so we split s_1 into s_1 and s_2 in MDP M', each with their own observation. Any transition into s_1 is now split. As |τ| = 2, we unroll the MDP M' into MDP M'' to represent two steps, and add goal and sink states ⊤ and ⊥. After rescaling, we obtain r(s_0) = 1/2, whereas r(s_1) = r(s_2) = 2/2 = 1, and we add the appropriate outgoing transitions to the states ⟨s, 1⟩. In a final step, we create MDP M''' from M'': we reroute all probability mass that does not agree with the observations to the initial states. Now, R_r(z_0z_0) is given by the probability of reaching ⊤ in M''' in an unbounded number of steps.
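Instead of the linear program used for the polynomial-time bound, the maximal reachability probability in the rerouted MDP M''' can also be approximated by plain value iteration (optimistic value iteration, used in Sect. 6, is a sound refinement of this idea). The sketch below uses our own dict encoding and is illustrative only.

```python
def max_reach(P, goal, states, actions, iters=2000):
    """Approximate max probability of reaching `goal`, per state, by value
    iteration on an MDP given as P: dict (s, a) -> {s': prob}.

    Goal states are treated as absorbing with value 1; states without
    enabled actions keep value 0 (e.g. a sink).
    """
    v = {s: (1.0 if s in goal else 0.0) for s in states}
    for _ in range(iters):
        nv = {}
        for s in states:
            if s in goal:
                nv[s] = 1.0
                continue
            best = 0.0
            for a in actions:
                dist = P.get((s, a))
                if dist:
                    best = max(best, sum(p * v[s2] for s2, p in dist.items()))
            nv[s] = best
        v = nv
    return v
```

A self-loop back to the initial state, as produced by the rerouting in M''', simply pushes the iterate towards the conditional probability: all mass inconsistent with τ is "resampled" until it either reaches ⊤ or ⊥.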

The construction also implies that maximizing over a finite set of schedulers suffices, namely the deterministic schedulers with a counter from 0 to |τ|. We denote this class Σ_DC(|τ|). Formally, a scheduler σ is in Σ_DC(k) if for all π, π':

$$\left(\pi\_{\downarrow} = \pi'\_{\downarrow} \land \left( |\pi| = |\pi'| \lor (|\pi| > k \land |\pi'| > k) \right) \right) \text{ implies } \sigma(\pi) = \sigma(\pi').$$

**Lemma 8.** *For every* τ *, it holds that*

$$R\_r(\tau) \quad = \max\_{\sigma \in \Sigma\_{DC}(|\tau|)} \sum\_{\pi \in \Pi\_M^{|\tau|}} \mathsf{Pr}\_{\sigma}(\pi \mid \tau) \cdot r(\pi\_\downarrow).$$

The crucial idea underpinning this lemma is that memoryless schedulers suffice for the unrolling, and that the states of the unrolling can be uniquely mapped to a state and the length of the history for every π through M. By reducing step-bounded reachability, we can also show that this set of schedulers is necessary [4].

### **6 Empirical Evaluation**

*Implementation.* We provide prototype implementations of both the filtering-based (Sect. 4) and the model-checking-based (Sect. 5) approaches, built on top of the probabilistic model checker Storm [30]. We provide a schematic setup of our implementation in Fig. 6. As input, we consider a symbolic description of MDPs with state-based observation labels, based on an extended dialect of the Prism language. We define the state risk in this MDP via a temporal property (given as a PCTL formula), and obtain the concrete state risk by model checking. We take a seed that yields a trace using the simulator. For the experiments, actions are resolved uniformly in this simulator<sup>9</sup>. The simulator iteratively feeds observations into the monitor, running either of our two algorithms (implemented in C++). After each observation z_i, the monitor computes the risk R_i having observed z_0 ... z_i. We flexibly combine these components via a Python API<sup>10</sup>.

For filtering as in Sect. 4, we provide a sparse data structure for beliefs that is updated using only deterministic schedulers. This is sufficient, see Lemma 4. To further prune the set of beliefs, we implement an SMT-driven elimination [48]

<sup>9</sup> This is not an assumption but rather our evaluation strategy.

<sup>10</sup> Available at https://github.com/monitoring-MDPs/premise.

**Fig. 6.** Schematic setup for the prototype mapping stream z_0 ... z_k to stream R_0 ... R_k.

of interior beliefs inside the convex hull<sup>11</sup>. We construct the unrolling as described in Sect. 5 and apply model checking via the sparse engines in Storm.

*Reproducibility.* We archived a container with sources, benchmarks, and scripts to reproduce our experiments: https://doi.org/10.5281/zenodo.4724622.

*Set-Up.* For each benchmark described below, we sampled 50 random traces using seeds 0–49, of lengths up to |τ| = 500. We are interested in the *promptness*, that is, the delay between receiving an observation z_i and returning the corresponding risk R_i, as well as the *cumulative performance* obtained by summing the promptness along the trace. We use a timeout of 1 second for this query. We compare the forward filtering (FF) approach with and without convex hull (CH) reduction, and the model unrolling approach (UNR) with two model checking engines of Storm: exact policy iteration (EPI, [43]) and optimistic value iteration (OVI, [28]). All experiments are run on a MacBook Pro MV962LL/A, using a single core. The memory limit of 6 GB was never violated. We use Z3 [38] as the SMT solver [11] for the convex hull reduction.

*Benchmarks.* We present three benchmark families, all MDPs with a combination of probabilities, nondeterminism and partial observability.

Airport-A is as in Sect. 1, but with a higher resolution for both the ground vehicle in the middle lane and the plane. Airport-B has a two-state sensor model with stochastic transitions between the two states.

Refuel-A models a robot with a depleting battery and recharging stations. The world model consists of a robot moving around in a D×D grid with some dedicated charging cells, where each action costs energy. The risk is to deplete the battery within a fixed horizon. Refuel-B is a variant with a two-state sensor.

Evade-I is inspired by a navigation task in a multi-agent setting in a D×D grid. The monitored robot moves randomly, and the risk is defined as the probability of crashing into the other robot. The other robot has an internal incentive in the form of a cardinal direction, and nondeterministically decides to move or

<sup>11</sup> Advanced algorithms like Quickhull [10] are not applicable without significant adaptations, as the set of beliefs can be degenerate (roughly, a set without full rank).


**Table 1.** Performance for promptness of online monitoring on various benchmarks.

to uniformly randomly change its incentive. The monitor observes everything except the incentive of the other robot. Evade-V is an alternative navigation task: contrary to the above, the other robot does not have an internal state and indeed navigates nondeterministically in one of the cardinal directions. We only observe the other robot's location when it is within the view range.

*Results.* We split our results over two tables. In Table 1, we give an ID for every benchmark name and instance, along with the size of the MDP (number of states |S| and transitions |P|) our algorithms operate on. We consider the promptness after prefixes of length |τ|. In particular, for forward filtering with the convex hull optimization, we give the number N of traces that did not time out earlier, and consider the average time T<sub>avg</sub> and maximal time T<sub>max</sub> needed (over all sampled traces that did not time out earlier). Furthermore, we give the average (B<sub>avg</sub>) and maximal (B<sub>max</sub>) number of beliefs stored (after reduction), and the average (D<sub>avg</sub>) and maximal (D<sub>max</sub>) dimension of the belief support. Likewise, for unrolling with exact model checking, we give the number N of traces that did not time out earlier, the average time T<sub>avg</sub> and maximal time T<sub>max</sub>, as well as the average and maximal number of states of the unrolled MDP.

In Table 2, we consider the cumulative performance on the benchmarks above. In particular, this table also considers an alternative implementation for both FF and UNR. We use the IDs to identify the instances, and sum the time over each prefix of length |τ|. For filtering, we recall the number of traces N that did not time out, and report the average and maximal cumulative time along the trace, the average cumulative number of beliefs that were considered, and the average cumulative number of beliefs eliminated. For the case without convex hull reduction, we do not eliminate any vertices. For unrolling, we report the average T<sub>avg</sub> and maximal cumulative time using EPI, as well as the time required for model building, *Bld*% (relative to the total time, per trace). We compare this to the average


**Table 2.** Summarized performance for online monitoring

and maximal cumulative time for using OVI (notice that building times remain approximately the same).

*Discussion.* The results from our prototype show that conservative (sound) predictive modeling of systems that combine probabilities, nondeterminism, and partial observability is within reach with the methods we proposed and state-of-the-art algorithms. Both forward filtering and the unrolling-based approach have their merits. The practical results thus slightly diverge from the complexity results in Sect. 3.1, due to structural properties of some benchmarks. In particular, for airport-A and refuel-A, the nondeterminism barely influences the belief, so there is no explosion, and consequently the dimension of the belief is sufficiently small that the convex hull can be efficiently computed. Rather than the number of states, this belief dimension makes evade-V a difficult benchmark<sup>12</sup>. *If many states can be reached with a particular trace, and if along these paths there are some probabilistic states, forward filtering suffers significantly*. We see that if the benchmark allows for efficacious forward filtering, it does not slow down on longer traces the way unrolling does. For UNR, we observe that OVI is typically the fastest, but EPI does not suffer from the numerical worst cases that OVI does. *If an observation trace is unlikely, the unrolled MDP constitutes a numerically challenging problem, in particular for value-iteration based model checkers*, see [27]. For FF, the convex hull computation is essential for any dimension, and eliminating some vertices in every step keeps the number of belief states manageable.

<sup>12</sup> The maximal dimension of 1 in evade-V is only over the traces that did not time out. The dimension on traces that run into time-outs is above 5.

# **7 Related Work**

We are not the first to consider model-based runtime verification in the presence of partial observability and probabilities. Runtime verification with state estimation on hidden Markov models (HMMs), i.e., without nondeterminism, has been studied for various types of properties [51,54,57] and has been extended to hybrid systems [52]. The tool Prevent focuses on black-box systems by learning an HMM from a set of traces. The HMM approximates (with only convergence-in-the-limit guarantees) the actual system [6]; during runtime, the most likely trace is then estimated rather than a distribution over current states. Extensions consider symmetry reductions on the models [7]. These techniques do not make a conservative (sound) risk estimation. The recent framework for runtime verification in the presence of partial observability [23] takes a stricter black-box view and cannot provide state estimates. Finally, [26] chooses partial observability to make monitoring of software systems more efficient, and [58] monitors a noisy sensor to reduce energy consumption.

State beliefs are studied when verifying HMMs [59], where the question is whether a sequence of observations is likely to occur, or which HMM is an adequate representation of a system [37]. State beliefs are prominent in the verification of partially observable MDPs [16,32,40], where one can observe the actions taken (but the problem itself is to find the right scheduler). Our monitoring problem can be phrased as a special case of the verification of partially observable stochastic games [20], but automatic techniques for those very general models are lacking. Likewise, the idea of *shielding* (pre)computes all action choices that lead to safe behavior [3,5,15,24,34,35]. For partially observable settings, shielding again requires computing partial-information schedulers [21,39], contrary to our approach. Partial observability has also been studied in the context of diagnosability, studying whether a fault has occurred (in the past) [14], or which actions uncover faults [13]. We, instead, assume partial observability under which we do detect faults, but want to estimate the risk that these faults occur in the future.

The assurance framework for reinforcement learning [42] implicitly allows for stochastic behavior, but cannot cope with partial observability or nondeterminism. Predictive monitoring has been combined with deep learning [17] and Bayesian inference [22], where the key problem is that the computation of an imminent failure is too expensive to be done exactly. More generally, learning automata models has been motivated with runtime assurance [1,55]. Testing approaches statistically evaluate whether traces are likely to be produced by a given model [25]. The approach in [2] studies stochastic black-box systems with controllable nondeterminism and iteratively learns a model for the system.

### **8 Conclusion**

We have presented the first framework for monitoring based on a trace of observations on models that combine nondeterminism and probabilities. Future work includes heuristics for approximate monitoring and for faster convex hull computations, and to apply this work to gray-box (learned) models.

**Acknowledgments.** The example in Fig. 1 is inspired by a challenge problem provided by Boeing in the DARPA Assured Autonomy program.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# Model Checking Finite-Horizon Markov Chains with Probabilistic Inference

Steven Holtzen<sup>1(B)</sup>, Sebastian Junges<sup>2</sup>, Marcell Vazquez-Chanlatte<sup>2</sup>, Todd Millstein<sup>1</sup>, Sanjit A. Seshia<sup>2</sup>, and Guy Van den Broeck<sup>1</sup>

> <sup>1</sup> University of California, Los Angeles, CA, USA (sholtzen@cs.ucla.edu)
> <sup>2</sup> University of California, Berkeley, CA, USA

Abstract. We revisit the symbolic verification of Markov chains with respect to finite horizon reachability properties. The prevalent approach iteratively computes step-bounded state reachability probabilities. By contrast, recent advances in probabilistic inference suggest symbolically representing all horizon-length paths through the Markov chain. We ask whether this perspective advances the state-of-the-art in probabilistic model checking. First, we formally describe both approaches in order to highlight their key differences. Then, using these insights we develop Rubicon, a tool that transpiles Prism models to the probabilistic inference tool Dice. Finally, we demonstrate better scalability compared to probabilistic model checkers on selected benchmarks. Altogether, our results suggest that probabilistic inference is a valuable addition to the probabilistic model checking portfolio, with Rubicon as a first step towards integrating both perspectives.

# 1 Introduction

Systems with probabilistic uncertainty are ubiquitous, e.g., probabilistic programs, distributed systems, fault trees, and biological models. Markov chains replace nondeterminism in transition systems with probabilistic uncertainty, and *probabilistic model checking* [4,7] provides model checking algorithms. A key property that probabilistic model checkers answer is: *What is the (precise) probability that a target state is reached (within a finite number of steps* h*)*? Contrary to classical *qualitative* model checking and approximate variants of probabilistic model checking, precise probabilistic model checking must find the total probability of *all* paths from the initial state to any target state.

© The Author(s) 2021

S. Holtzen and S. Junges—Contributed equally.

This work is partially supported by NSF grants #IIS-1943641, #IIS-1956441, #CCF-1837129, DARPA grant #N66001-17-2-4032, a Sloan Fellowship, a UCLA Dissertation Year Fellowship, and gifts by Intel and Facebook Research.

This work is partially supported by NSF grants 1545126 (VeHICaL), 1646208 and 1837132, by the DARPA contracts FA8750-18-C-0101 (AA) and FA8750-20-C-0156 (SDCPS), by Berkeley Deep Drive, and by Toyota under the iCyPhy center.

A. Silva and K. R. M. Leino (Eds.): CAV 2021, LNCS 12760, pp. 577–601, 2021. https://doi.org/10.1007/978-3-030-81688-9\_27

Fig. 1. Motivating example. Figure 1(c) compares the performance of Rubicon, Storm's explicit engine, Storm's symbolic engine, and Prism when invoked on the model in (b) with arbitrarily fixed (different) constants for p<sub>i</sub>, q<sub>i</sub> and horizon h = 10. Times are in seconds, with a time-out of 30 min.

Nevertheless, the prevalent ideas in probabilistic model checking are generalizations of qualitative model checking. Whereas qualitative model checking tracks the states that can reach a target state (or dually, that can be reached from an initial state), probabilistic model checking tracks the i-step reachability probability for each state in the chain. The (i+1)-step reachability can then be computed via multiplication with the *transition matrix*. The scalability concern is that this matrix grows with the state space of the Markov chain. Mature model checking tools such as Storm [36], Modest [34], and Prism [51] utilize a variety of methods to alleviate the state space explosion. Nevertheless, various natural models cannot be analyzed by the available techniques.

In parallel, within the AI community a different approach to representing a distribution has emerged, which on first glance can seem unintuitive. Rather than marginalizing out the paths and tracking reachability probabilities per state, the probabilistic AI community commonly aggregates all *paths* that reach the target state. At its core, inference is then a weighted sum over all these paths [16]. This hinges on the observation that this set of paths can often be stored more compactly, and that the probability of two paths that share the same prefix or suffix can be efficiently computed on this concise representation. This *inference technique* has been used in a variety of domains in the artificial intelligence (AI) and verification communities [9,14,27,39], but is not part of any mature probabilistic model checking tools.

This paper theoretically and experimentally compares and contrasts these two approaches. In particular, we describe and motivate Rubicon, a probabilistic model checker that *leverages the successful probabilistic inference techniques*. We begin with an example that explains the core ideas of Rubicon followed by the paper structure and key contributions.

Motivating Example. Consider the example illustrated in Fig. 1(a). Suppose there are n factories. Each day, the workers at each factory collectively decide whether or not to strike. To simplify, we model each factory i with two states, striking (t<sub>i</sub>) and not striking (s<sub>i</sub>). Furthermore, since no two factories are identical, we take the probability to begin striking (p<sub>i</sub>) and to stop striking (q<sub>i</sub>) to be different for each factory. Assuming that each factory transitions synchronously and in parallel with the others, we query: "what is the probability that all the factories are simultaneously striking within h days?"

Despite its simplicity, we observe that state-of-the-art model checkers like Storm and Prism do not scale beyond 15 factories.<sup>1</sup> For example, Fig. 1(b) provides a Prism encoding for this simple model (we show the instance with 3 factories), where a Boolean variable c<sup>i</sup> is used to encode the state of each factory. The "allStrike" label identifies the target state. Figure 1(c) shows the run time for an increasing number of factories. While all methods eventually time out, Rubicon scales to systems with an order of magnitude more states.

*Why is This Problem Hard?* To understand the issue with scalability, observe that tools such as Storm and Prism store the transition matrix, either explicitly or symbolically using algebraic decision diagrams (ADDs). Every distinct entry of this transition matrix needs to be represented, in the case of ADDs by a unique leaf node. Because each factory in our example has a different probability of going on strike, each subset of factories will likely have a unique probability of jointly going on strike. Hence, the transition matrix will have a number of distinct probabilities that is exponential in the number of factories, and its representation as an ADD must blow up in size. Concretely, for 10 factories, the ADD representing the transition matrix has 1.9 million nodes. Moreover, the explicit engine fails due to the dense nature of the underlying transition matrix. We discuss this method in Sect. 3.

*How to Overcome This Limitation?* This problematic combinatorial explosion is often unnecessary. For the sake of intuition, consider the simple case where the horizon is 1. Still, the standard transition matrix representations blow up exponentially with the number of factories n. Yet, the probability of reaching the "allStrike" state is easy to compute, even as n grows: it is p<sub>1</sub> · p<sub>2</sub> ··· p<sub>n</sub>.

Rubicon aims to compute probabilities in this compact *factorized* way by representing the computation as a binary decision diagram (BDD). Figure 1(d) gives an example of such a BDD, for three factories and a horizon of one. A key property of this BDD, elaborated in Sect. 3, is that it can be interpreted as a *parametric Markov chain*, where the weight of each edge corresponds to the probability of a particular factory striking. Then, the probability that the goal state is reached is given by the weighted sum of paths terminating in T: for this instance, there is a single such path, with weight p<sub>1</sub> · p<sub>2</sub> · p<sub>3</sub>. These BDDs are tree-like Markov chains, so model checking can be performed in time linear in the size

<sup>1</sup> Section 6 describes the experimental apparatus and our choice of comparisons.

of the BDD using dynamic programming. Essentially, the BDD represents the set of paths that reach a target state—an idea common in probabilistic inference.
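This path-weighting view can be sketched in a few lines. The node encoding, variable names, and probabilities below are hypothetical stand-ins, not Rubicon's actual data structures: an inner node is a tuple `(var, lo, hi)`, terminals are the strings `"T"` and `"F"`, and the map `p` gives the probability of each hi-branch (factory i striking).

```python
def reach_prob(node, p, cache=None):
    """Weighted sum over all root-to-T paths; with the cache this is
    linear in the number of distinct (shared) BDD nodes."""
    if node == "T":
        return 1.0
    if node == "F":
        return 0.0
    cache = {} if cache is None else cache
    key = id(node)
    if key not in cache:
        var, lo, hi = node
        # A path multiplies p[var] on the hi-edge, 1 - p[var] on the lo-edge.
        cache[key] = (1 - p[var]) * reach_prob(lo, p, cache) \
                     + p[var] * reach_prob(hi, p, cache)
    return cache[key]

# Horizon-1 "allStrike" BDD for three factories: c1 AND c2 AND c3.
bdd = ("c1", "F", ("c2", "F", ("c3", "F", "T")))
p = {"c1": 0.5, "c2": 0.2, "c3": 0.1}
print(reach_prob(bdd, p))  # p1 * p2 * p3, i.e. about 0.01
```

The single weighted path to `"T"` realizes the product p<sub>1</sub> · p<sub>2</sub> · p<sub>3</sub> discussed above.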

To construct this BDD, we propose to encode our reachability query symbolically as a *weighted model counting* (WMC) query on a logical formula. By compiling that formula into a BDD, we obtain a diagram on which computing the query probability is efficient (in the size of the BDD). Concretely, for Fig. 1(d), the BDD represents the formula c<sub>1</sub><sup>(1)</sup> ∧ c<sub>2</sub><sup>(1)</sup> ∧ c<sub>3</sub><sup>(1)</sup>, which encodes all paths through the chain that terminate in the goal state (all factories strike on day 1). For this example and this horizon, this is a single path. WMC is a well-known strategy for probabilistic inference and is currently among the state-of-the-art approaches for discrete graphical models [16], discrete probabilistic programs [39], and probabilistic logic programs [27].

In general, the exponential growth of the number of paths might seem like it dooms this approach: for n = 3 factories and horizon h = 1, we only need to represent 8 paths, but for h = 2, we would need to consider 64 different paths, and so on. However, a key insight is that, for many systems – such as the factory example – the structural compression of BDDs allows a concise representation of exponentially many paths, all *while* being parametric over path probabilities (see Sect. 4). To see why, observe that in the above discussion, the state of each factory is *independent* of the other factories: independence, and its natural generalizations like *conditional* and *contextual* independence, are the driving force behind many successful probabilistic inference algorithms [47]. Succinctly, the key advantage of Rubicon is that it exploits a form of structure that has thus far been under-exploited by model checkers, which is why it scales to more parallel factories than the existing approaches on the hard task. In Sect. 6 we consider an extension to this motivating example that adds dependencies between factories. This dependency (or rather, the accompanying increase in the size of the underlying MC) significantly decreases scalability for the existing approaches but negligibly affects Rubicon.

This leads to the task: *how does one go from a* Prism *model to a concise BDD efficiently*? To do this, Rubicon leverages a novel translation from Prism models into a probabilistic programming language called Dice (outlined in Sect. 5).

Contribution and Structure. Inspired by the example, we contribute conceptual and empirical arguments for leveraging BDD-based probabilistic inference in model checking. Concretely:


Fig. 2. (a) MC toy example (b) (distinct) pMC toy example (c) ADD transition matrix

4. We demonstrate that Rubicon indeed attains an order-of-magnitude scaling improvement on several natural problems including sampling from parametric Markov chains and verifying network protocol stabilization (Sect. 6).

Ultimately we argue that Rubicon makes a valuable contribution to the portfolio of probabilistic model checking backends, and brings to bear the extensive developments on probabilistic inference to well-known model checking problems.

### 2 Preliminaries and Problem Statement

We state the problem formally and recap relevant concepts. See [7] for details. We sometimes use p̄ to denote 1−p. A *Markov chain* (MC) is a tuple M = ⟨S, ι, P, T⟩ with S a (finite) set of *states*, ι ∈ S the *initial state*, P : S → Distr(S) the *transition function*, and T ⊆ S a set of *target states*, where Distr(S) is the set of distributions over a (finite) set S. We write P(s, s′) to denote P(s)(s′) and call P a *transition matrix*. The successors of s are Succ(s) = {s′ | P(s, s′) > 0}. To support MCs with billions of states, we may describe MCs symbolically, e.g., with Prism [51] or as a probabilistic program [42,48]. For such a symbolic description P, we denote the corresponding MC with [[P]]. States then reflect assignments to symbolic variables.

A *path* π = s<sub>0</sub> ... s<sub>n</sub> is a sequence of states, π ∈ S<sup>+</sup>. We use π<sub>↓</sub> to denote the *last state* s<sub>n</sub>; the *length* of π above is n, denoted |π|. Let Paths<sub>h</sub> denote the paths of length h. The probability of a path is the product of the transition probabilities, defined inductively by Pr(s) = 1 and Pr(π · s) = Pr(π) · P(π<sub>↓</sub>, s). For a fixed *horizon* h and set of states T, let [[s→♦<sup>≤h</sup>T]] = {π | π<sub>0</sub> = s ∧ |π| ≤ h ∧ π<sub>↓</sub> ∈ T ∧ ∀i < |π|. π<sub>i</sub> ∉ T} denote the paths from s of length at most h that terminate at their first visit to a state in T. Furthermore, let Pr<sub>M</sub>(s ⊨ ♦<sup>≤h</sup>T) = Σ<sub>π∈[[s→♦≤hT]]</sub> Pr(π) denote the probability to reach T within h steps. We simplify notation when s = ι and write [[♦<sup>≤h</sup>T]] and Pr<sub>M</sub>(♦<sup>≤h</sup>T), respectively. We omit M whenever it is clear from the context.
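The definitions above can be paraphrased as a deliberately naive enumeration. The 3-state chain below (states "a", "b", target "t") and its probabilities are hypothetical, chosen only to make the path set [[s→♦<sup>≤h</sup>T]] and its probability mass concrete:

```python
def paths_to_target(P, s, T, h):
    """[[ s -> <>^{<=h} T ]]: paths from s of length <= h whose only
    T-state is the last one."""
    if s in T:
        return [(s,)]
    if h == 0:
        return []
    return [(s,) + tail
            for s2, pr in P[s].items() if pr > 0
            for tail in paths_to_target(P, s2, T, h - 1)]

def path_prob(P, path):
    """Pr(pi): product of the transition probabilities along pi."""
    prob = 1.0
    for a, b in zip(path, path[1:]):
        prob *= P[a][b]
    return prob

def bounded_reach(P, s, T, h):
    """Pr(s |= <>^{<=h} T) as the sum over the path set."""
    return sum(path_prob(P, pi) for pi in paths_to_target(P, s, T, h))

P = {"a": {"a": 0.5, "b": 0.3, "t": 0.2}, "b": {"t": 1.0}, "t": {"t": 1.0}}
print(bounded_reach(P, "a", {"t"}, 2))  # 0.2 + 0.5*0.2 + 0.3*1.0, about 0.6
```

Precise probabilistic model checking must account for exactly this sum; the rest of the paper is about computing it without enumerating the (exponentially many) paths.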

**Formal Problem:** Given an MC M and a horizon h, compute Pr<sub>M</sub>(♦<sup>≤h</sup>T).

*Example 1.* For conciseness, we introduce a toy example MC M in Fig. 2(a). For horizon h = 3, there are three paths that reach state ⟨1,0⟩: for example the path ⟨0,0⟩⟨0,1⟩⟨1,0⟩ with corresponding reachability probability 0.4 · 0.5. The reachability probability is Pr<sub>M</sub>(♦<sup>≤3</sup>{⟨1,0⟩}) = 0.42.

It is helpful to separate the topology and the probabilities. We do this by means of a *parametric MC* (pMC) [22]. A pMC over a fixed set of parameters *p* generalises MCs by allowing for a transition function that maps to Q[*p*], i.e., to polynomials over these variables [22]. A pMC and a *valuation* of parameters *u*: *p* → R describe an MC obtained by replacing *p* with *u* in the transition function P, yielding P[*u*]. If P[*u*](s) is a distribution for every s, then we call *u* a *well-defined* valuation. We can then think of a pMC M as a generator of a set of MCs {M[*u*] | *u* well-defined}. Figure 2(b) shows a pMC; any valuation *u* with *u*(p), *u*(q) ∈ [0, 1] is well-defined. We consider the following associated problem:

Parameter Sampling: Given a pMC M, a finite set of well-defined valuations U, and a horizon h, compute Pr<sub>M[*u*]</sub>(♦<sup>≤h</sup>T) for each *u* ∈ U.

We recap *binary decision diagrams* (BDDs) and their generalization, algebraic decision diagrams (ADDs, a.k.a. multi-terminal BDDs). ADDs over a set of variables X are directed acyclic graphs whose vertices V can be partitioned into *terminal nodes* V<sub>t</sub> without successors and *inner nodes* V<sub>i</sub> with two successors. Each terminal node is labeled with a polynomial over some parameters *p* (or just a constant in Q), val: V<sub>t</sub> → Q[*p*], and each inner node with a variable, var: V<sub>i</sub> → X. One node is the root node v<sub>0</sub>. Edges are described by the two successor functions E<sub>0</sub> : V<sub>i</sub> → V and E<sub>1</sub> : V<sub>i</sub> → V. A BDD is an ADD with exactly two terminals, labeled T and F. Formally, we denote an ADD by the tuple ⟨V, v<sub>0</sub>, X, var, val, E<sub>0</sub>, E<sub>1</sub>⟩. ADDs describe functions f : B<sup>X</sup> → Q[*p*] (each input follows a path in the underlying graph to the label of the corresponding terminal node). As finite sets can be encoded with bit vectors, ADDs represent functions from (tuples of) finite sets to polynomials.
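A toy sketch of the function an ADD represents: below, an inner node is a tuple `(var, e0, e1)` and a terminal carries its label directly (a float standing in for a polynomial or constant). This encoding is hypothetical, not the node layout of a real decision-diagram library such as CUDD:

```python
def evaluate(node, assignment):
    """Follow E1 when the node's variable is true, E0 otherwise, down
    to a terminal; this realises f: B^X -> Q[p] for one input."""
    while isinstance(node, tuple):
        var, e0, e1 = node
        node = e1 if assignment[var] else e0
    return node

# f(x, y) = 0.4 if x and y, 0.7 if x and not y, 0.0 otherwise
add = ("x", 0.0, ("y", 0.7, 0.4))
print(evaluate(add, {"x": True, "y": True}))   # 0.4
print(evaluate(add, {"x": False, "y": True}))  # 0.0
```

Note that the second assignment never inspects y: sharing and skipped variables are exactly what keeps decision diagrams small.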

*Example 2.* The transition matrix P of the MC in Fig. 2(a) maps states, encoded by bit vectors ⟨x, y⟩ and ⟨x′, y′⟩, to the probability to move from state ⟨x, y⟩ to ⟨x′, y′⟩. Figure 2(c) shows the corresponding ADD.<sup>2</sup>

# 3 A Model Checking Perspective

We briefly analyze the de-facto standard approach to symbolic probabilistic model checking of finite-horizon reachability probabilities. It is an adaptation of qualitative model checking, in which we track the (backward) reachable states. This set can be thought of as a mapping from states to a Boolean indicating whether a target state can be reached. We generalize the mapping to a function that maps every state s to the probability that we reach T within i steps,

<sup>2</sup> The ADD also depends on the variable order, which we assume fixed for conciseness.

Fig. 3. Bounded reachability and symbolic model checking for the MC M in Fig. 2(a).

denoted Pr<sub>M</sub>(s ⊨ ♦<sup>≤i</sup>T). First, it is convenient to construct a transition relation in which the target states have been made absorbing, i.e., we define a matrix with A(s, s′) = P(s, s′) if s ∉ T and A(s, s′) = [s = s′]<sup>3</sup> otherwise. The following *Bellman equations* characterize the aforementioned mapping:

$$\begin{aligned} \Pr_{\mathcal{M}}(s \models \Diamond^{\leq 0} T) &= [s \in T],\\ \Pr_{\mathcal{M}}(s \models \Diamond^{\leq i} T) &= \sum_{s' \in \mathtt{Succ}(s)} A(s, s') \cdot \Pr_{\mathcal{M}}(s' \models \Diamond^{\leq i-1} T) \end{aligned}$$

The main aspect model checkers take from these equations is that to compute the h-step reachability from state s, one only needs to combine the (h−1)-step reachability from every state s′ *and* the transition probabilities P(s, s′). We define a vector *T* with *T*(s) = [s ∈ T]. The algorithm then iteratively computes and stores the i-step reachability for i = 0 to i = h, e.g., computing A<sup>3</sup> · *T* as A · (A · (A · *T*)). This reasoning is thus *inherently backwards* and *implicitly marginalizes out paths*. In particular, rather than storing the i-step paths that lead to the target, one only stores a vector *x* = A<sup>i</sup> · *T* that records, for every state s, the sum over all i-long paths from s.
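As a sketch, the backward iteration *x* = A<sup>h</sup> · *T* on an explicit, dense matrix reads as follows; the 3-state chain (states 0 and 1, target state 2, made absorbing in A) and its probabilities are hypothetical:

```python
def matvec(A, x):
    """One backward step: (A @ x)[s] = sum_s' A[s][s'] * x[s']."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

A = [[0.5, 0.3, 0.2],   # transition matrix with the target row absorbing
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0]]
x = [0.0, 0.0, 1.0]     # T(s) = [s in T]
h = 2
for _ in range(h):      # after i rounds, x[s] = Pr(s |= <>^{<=i} T)
    x = matvec(A, x)    # realises A . (A . (A . T))-style products
print(x[0])  # about 0.6
```

Only the vector *x* is kept between rounds; the i-step paths themselves are never materialised.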

Explicit representations of the matrix A and the vector *x* require memory at least on the order of |S|.<sup>4</sup> To overcome this limitation, *symbolic* probabilistic model checking stores both A and A<sup>i</sup> · *T* as ADDs, by considering the matrix as a function from tuples ⟨s, s′⟩ to A(s, s′), and *x* as a function from s to *x*(s) [2].

*Example 3.* Reconsider the MC in Fig. 2(a). The h-bounded reachability probability Pr<sub>M</sub>(♦<sup>≤h</sup>{⟨1,0⟩}) can be computed as reflected in Fig. 3(a). The ADD for P is shown in Fig. 2(c). The ADD for *x* when h = 2 is shown in Fig. 3(b).

The performance of symbolic probabilistic model checking is directly governed by the sizes of these two ADDs. The size of an ADD is bounded from below by the number of leaves. In qualitative model checking, both ADDs are in fact BDDs, with two leaves. However, for the ADD representing A, this lower bound is given by the number of distinct probabilities in the transition matrix. In the running example, we have seen that a small program P may have an underlying MC [[P]] with an exponential state space S and equally many distinct transition probabilities. Symbolic probabilistic model checking also scales

<sup>3</sup> Where [x]=1 if x holds and 0 otherwise.

<sup>4</sup> Excluding e.g., partial exploration or sampling which typically are not exact.

Fig. 4. The computation tree for M and horizon 3 and its compression. We label states as s = ⟨0,0⟩, t = ⟨0,1⟩, u = ⟨1,0⟩, v = ⟨1,1⟩. Probabilities are omitted for conciseness.

badly on some models where A has a concise encoding but *x* has too many different entries.<sup>5</sup> Therefore, model checkers may store *x* partially explicit [49].

The insights above are not new. Symbolic probabilistic model checking has advanced [46] to create small representations of both A and *x*. In competitions, Storm often applies a bisimulation-to-explicit method that extracts an explicit representation of the bisimulation quotient [26,36]. Finally, game-based abstraction [32,44] can be seen as a predicate abstraction technique on the ADD level. However, these methods do not change the computation of the finite horizon reachability probabilities and thus do not overcome the inherent weaknesses of the iterative approach in combination with an ADD-based representation.

# 4 A Probabilistic Inference Perspective

We present four key insights into probabilistic inference. (1) Sect. 4.1 shows how probabilistic inference takes the classical definition as summing over the set of paths, and turns this definition into an algorithm. In particular, these paths may be stored in a computation tree. (2) Sect. 4.2 gives the traditional reduction from probabilistic inference to the classical weighted model counting (WMC) problem [16,57]. (3) Sect. 4.3 connects this reduction to point (1) by showing that a BDD that represents this WMC is *bisimilar* to the computation tree assuming that the out-degree of every state in the MC is two. (4) Sect. 4.4 describes and compares the computational benefits of the BDD representation. In particular, we clarify that enforcing an out-degree of two is a key ingredient to overcoming one of the weaknesses of symbolic probabilistic model checking: the number of different probabilities in the underlying MC.

#### 4.1 Operational Perspective

The following perspective frames (an aspect of) probabilistic inference as a model transformation. By definition, the set of all paths – each annotated with the transition probabilities – suffices to extract the reachability probability. These sets of paths may be represented in the computation tree (which is itself an MC).

<sup>5</sup> For an interesting example of this, see the "Queue" example in Sect. 6.

*Example 4.* We continue from Example 1. We put all paths of length three into a computation tree in Fig. 4(a) (cf. the caption for state identifiers). The three paths that reach the target are highlighted in red. This MC is highly redundant; we may compress it to the MC in Fig. 4(b).

Definition 1. *For an MC* M *and horizon* h*, the computation tree (CT)* CT(M, h) = ⟨*Paths*<sub>h</sub>, ι, P′, T′⟩ *is an MC with states corresponding to paths in* M*, i.e., Paths*<sub>h</sub><sup>M</sup>*, initial state* ι*, target states* T′ = [[♦<sup>≤h</sup>T]]*, and transition relation*

$$P'(\pi, \pi') = \begin{cases} P(\pi_{\downarrow}, s) & \text{if } \pi_{\downarrow} \notin T \wedge \pi' = \pi.s, \\ \left[ \pi_{\downarrow} \in T \wedge \pi' = \pi \right] & \text{otherwise.} \end{cases} \tag{1}$$

The CT contains (up to renaming) the same paths to the target as the original MC. Notice that after h transitions, all paths are in a sink state, and thus we can drop the step bound from the property and consider either finite or indefinite horizons. The latter considers all paths that eventually reach the target. We denote the probability mass of these paths with PrM(s |= ♦T) and refer to [7] for formal details.<sup>6</sup> Then, we may compute bounded reachability probabilities in the original MC by analysing unbounded reachability in the CT:

$$\Pr_{\mathcal{M}}(\lozenge^{\leq h} T) = \Pr_{\mathsf{CT}(\mathcal{M},h)}(\lozenge^{\leq h} T') = \Pr_{\mathsf{CT}(\mathcal{M},h)}(\lozenge T').$$

The nodes in the CT have a natural topological ordering. The unbounded reachability probability is then computed (efficiently in the CT's size) using dynamic programming (i.e., topological value iteration) on the Bellman equation for $s \notin T$:

$$\Pr_{\mathcal{M}}(s \models \lozenge T) = \sum_{s' \in \mathsf{Succ}(s)} P(s, s') \cdot \Pr_{\mathcal{M}}(s' \models \lozenge T).$$

For pMCs, the right-hand side naturally is a factorised form of the *solution function* $f$ that maps parameter values to the induced reachability probability, i.e., $f(u) = \Pr_{\mathcal{M}[u]}(\lozenge^{\leq h} T)$ [22,24,33]. For bounded reachability (or acyclic pMCs), this function amounts to a sum over all paths, with every path reflected by a term of a polynomial, i.e., the sum is a polynomial. In sum-of-terms representation, the polynomial can be exponential in the number of parameters [5].
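The topological dynamic programming sketched above is short to write down. The following Python fragment is a minimal illustration, not Storm's implementation, and the acyclic chain it runs on is a made-up example rather than the one from Fig. 4:

```python
def reachability(P, targets):
    """Unbounded reachability probabilities on an acyclic MC, computed by
    memoized recursion -- effectively a backward pass in topological order."""
    memo = {}
    def pr(s):
        if s in targets:
            return 1.0
        if s not in memo:
            # Bellman equation: weighted sum over the successors of s.
            memo[s] = sum(p * pr(t) for t, p in P.get(s, {}).items())
        return memo[s]
    return pr

# Hypothetical acyclic chain (not the one from Fig. 4):
P = {"s0": {"s1": 0.6, "s2": 0.4},
     "s1": {"goal": 0.5, "dead": 0.5},
     "s2": {"goal": 1.0}}
pr = reachability(P, {"goal"})
print(pr("s0"))  # 0.6 * 0.5 + 0.4 * 1.0 = 0.7
```

Each state is visited once, so the pass is linear in the size of the (acyclic) chain, matching the "efficiently in the CT's size" claim above.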

For computational efficiency, we need a smaller representation of the CT. As we only consider reachability of T, we may simplify [43] the notion of (weak) bisimulation [6] (in the formulation of [40]) to the following definition.

Definition 2. *For* $\mathcal{M}$ *with states* $S$*, a relation* $R \subseteq S \times S$ *is a (weak) bisimulation (with respect to* $T$*) if* $s R s'$ *implies* $\Pr_{\mathcal{M}}(s \models \lozenge T) = \Pr_{\mathcal{M}}(s' \models \lozenge T)$*. Two states* $s, s'$ *are (weakly) bisimilar (with respect to* $T$*) if* $\Pr_{\mathcal{M}}(s \models \lozenge T) = \Pr_{\mathcal{M}}(s' \models \lozenge T)$*.*

Two MCs $\mathcal{M}, \mathcal{M}'$ are bisimilar, denoted $\mathcal{M} \sim \mathcal{M}'$, if the initial states are bisimilar in the disjoint union of the MCs. It holds by definition that if $\mathcal{M} \sim \mathcal{M}'$, then $\Pr_{\mathcal{M}}(\lozenge T) = \Pr_{\mathcal{M}'}(\lozenge T')$. The notion of bisimulation can be lifted to pMCs [33].

<sup>6</sup> Alternatively, on acyclic models, a large step bound $h > |S|$ suffices.

Idea 1: Given a symbolic description $P$ of an MC $[[P]]$, efficiently construct a concise MC $\mathcal{M}$ that is bisimilar to $\mathsf{CT}([[P]], h)$.

Indeed, the (compressed) CT in Fig. 4(b) and the CT in Fig. 4(a) are bisimilar. We remark that we do not necessarily compute the bisimulation quotient of $\mathsf{CT}([[P]], h)$.

#### 4.2 Logical Perspective

The previous section defined weakly bisimilar chains and showed computational advantages, but did not present an algorithm. In this section we frame the finite horizon reachability probability as a logical query known as *weighted model counting* (WMC). In the next section we will show how this logical perspective yields an algorithm for constructing bisimilar MCs.

Weighted model counting is well known as an effective reduction for probabilistic inference [16,57]. Let $\varphi$ be a logical sentence over variables $C$. The *weight function* $W_C \colon C \to \mathbb{R}_{\geq 0}$ assigns a weight to each logical variable. A *total variable assignment* $\eta \colon C \to \{0, 1\}$ by definition has weight $\mathsf{weight}(\eta) = \prod_{c \in C} \big( W_C(c) \cdot \eta(c) + (1 - W_C(c)) \cdot (1 - \eta(c)) \big)$. Then the *weighted model count* for $\varphi$ given $W_C$ is $\mathsf{WMC}(\varphi, W_C) = \sum_{\eta \models \varphi} \mathsf{weight}(\eta)$. Formally, we desire to compute a reachability query using a WMC query in the following sense:

Idea 2: Given an MC $\mathcal{M}$, efficiently construct a predicate $\varphi^C_{\mathcal{M},h}$ and a weight function $W_C$ such that $\Pr_{\mathcal{M}}(\lozenge^{\leq h} T) = \mathsf{WMC}(\varphi^C_{\mathcal{M},h}, W_C)$.

Consider initially the simplified case when the MC $\mathcal{M}$ is *binary*: every state has at most two successors. In this case producing $(\varphi^C_{\mathcal{M},h}, W_C)$ is straightforward:

*Example 5.* Consider the MC in Fig. 2(a), and note that it is binary. We introduce logical variables called *state/step coins* $C = \{c_{s,i} \mid s \in S, i < h\}$ for every state and step. Assignments to these coins denote choices of transitions at particular times: if the chain is in state $s$ at step $i$, then it takes the transition to the lexicographically first successor of $s$ if $c_{s,i}$ is true, and otherwise takes the transition to the lexicographically second successor. To construct the predicate $\varphi^C_{\mathcal{M},3}$, we need to write a logical sentence on coins whose models encode accepting paths (red paths) in the CT in Fig. 4(a).

We start in state $s = \langle 0, 0\rangle$ (using state labels from the caption of Fig. 4). We order states as $s = \langle 0,0\rangle < t = \langle 0,1\rangle < u = \langle 1,0\rangle < v = \langle 1,1\rangle$. Then, $c_{s,0}$ is true if the chain transitions into state $s$ at time 0 and false if it transitions to state $t$ at time 0. So, one path from $s$ to the target node $\langle 1,0\rangle$ is given by the logical sentence $(c_{s,0} \wedge \neg c_{s,1} \wedge c_{t,2})$. The full predicate $\varphi^C_{\mathcal{M},3}$ is therefore:

$$\varphi^C_{\mathcal{M},3} = (c_{s,0} \wedge \neg c_{s,1} \wedge c_{t,2}) \vee (\neg c_{s,0} \wedge c_{t,1}) \vee (\neg c_{s,0} \wedge \neg c_{t,1} \wedge c_{v,2}).$$

Each model of this sentence is a single path to the target. This predicate $\varphi^C_{\mathcal{M},h}$ can clearly be constructed by considering all possible paths through the chain, but later on we will show how to build it more efficiently.

Finally, we fix $W_C$: the weight for each coin is directly given by the transition probability to the lexicographically first successor: for $0 \leq i < h$, $W_C(c_{s,i}) = 0.6$ and $W_C(c_{t,i}) = W_C(c_{v,i}) = 0.5$. The WMC is indeed 0.42, reflecting Example 1.
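For intuition, the weighted model count of Example 5 can be verified by brute-force enumeration. The Python sketch below uses our own coin naming; coins that do not occur in the predicate sum out to 1 and are therefore omitted:

```python
from itertools import product

# Coin weights from Example 5: W(c_{s,i}) = 0.6, W(c_{t,i}) = W(c_{v,i}) = 0.5.
W = {"cs0": 0.6, "cs1": 0.6, "ct1": 0.5, "ct2": 0.5, "cv2": 0.5}

def phi(a):
    # The predicate phi^C_{M,3}: one disjunct per accepting path.
    return ((a["cs0"] and not a["cs1"] and a["ct2"])
            or (not a["cs0"] and a["ct1"])
            or (not a["cs0"] and not a["ct1"] and a["cv2"]))

def wmc(phi, W):
    # Sum the weight of every satisfying total assignment.
    total = 0.0
    names = list(W)
    for bits in product([False, True], repeat=len(names)):
        a = dict(zip(names, bits))
        if phi(a):
            w = 1.0
            for c in names:
                w *= W[c] if a[c] else 1 - W[c]
            total += w
    return total

print(wmc(phi, W))  # approximately 0.42, matching Example 1
```

Enumeration is exponential in $|C|$, of course; the point of the BDD-based approach in the next section is precisely to avoid this blow-up.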

When the MC is not binary, it suffices to limit the out-degree of the MC to at most two by adding auxiliary states, hence binarizing all transitions, cf. [38].
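One standard way to realize such a binarization is to renormalize the residual probability mass at each auxiliary coin. The sketch below illustrates this general idea; the construction in [38] may differ in detail:

```python
def binarize(dist):
    """Replace a k-ary probabilistic choice by a chain of binary coin
    flips: coin i is 'heads' with probability p_i / (remaining mass)."""
    items = list(dist.items())
    coins, remaining = [], 1.0
    for succ, p in items[:-1]:
        coins.append((succ, p / remaining))  # heads: go to succ now
        remaining -= p                       # tails: defer to later coins
    coins.append((items[-1][0], 1.0))        # last successor takes the rest
    return coins

coins = binarize({"a": 0.2, "b": 0.3, "c": 0.5})
# Re-derive each successor's probability to check the chain is faithful:
prob, tails = {}, 1.0
for succ, bias in coins:
    prob[succ] = tails * bias
    tails *= 1 - bias
# prob recovers {"a": 0.2, "b": 0.3, "c": 0.5} up to float rounding
```

A choice over $k$ successors thus becomes $k-1$ binary coins, which is what allows every transition in the encoding to be represented by a single logical variable.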

#### 4.3 Connecting the Operational and the Logical Perspective

Now that we have reduced bounded reachability to weighted model counting, we reach a natural question: how do we perform WMC?<sup>7</sup> Various approaches to performing WMC have been explored; a prominent approach is to compile the logical function into a binary decision diagram (BDD), which supports fast weighted model counting [21]. In this paper, we investigate the use of a BDD-driven approach for two reasons: (i) BDDs admit straightforward support for parametric models. (ii) BDDs provide a direct connection between the logical and operational perspectives. To start, observe that the graph of the BDD, together with the weights, can be interpreted as an MC:

Definition 3. *Let* $\varphi^X$ *be a propositional formula over variables* $X$ *and* $<_X$ *an ordering on* $X$*. Let* $\mathsf{BDD}(\varphi^X, <_X) = \langle V, v_0, X, \mathsf{var}, \mathsf{val}, E_0, E_1 \rangle$ *be the corresponding BDD, and let* $W$ *be a weight function on* $X$ *with* $0 \leq W(x) \leq 1$*. We define the MC* $\mathsf{BDD}_{\mathsf{MC}}(\varphi^X, <_X, W) = \langle S, \iota, P, T \rangle$ *with* $S = V$*,* $\iota = v_0$*,* $P(s) = \{E_0(s) \mapsto W(\mathsf{var}(s)),\ E_1(s) \mapsto 1 - W(\mathsf{var}(s))\}$*, and* $T = \{v \in V \mid \mathsf{val}(v) = 1\}$*.*
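Evaluating the graph of a BDD as an MC amounts to one weighted bottom-up pass over its nodes. The following Python fragment is a minimal sketch in the spirit of Definition 3; the tuple-based node representation and the formula $x \wedge y$ are our own hypothetical illustrations:

```python
# A BDD node is ("node", var, lo, hi) or a terminal ("leaf", 0) / ("leaf", 1).
# Reading the BDD as an MC: from each decision node, take the branch where
# the variable is true with probability W(var), the other branch otherwise.
# The probability of reaching the 1-terminal is the weighted model count.
def wmc_bdd(node, W, memo=None):
    if memo is None:
        memo = {}
    key = id(node)
    if key in memo:
        return memo[key]
    if node[0] == "leaf":
        p = float(node[1])
    else:
        _, var, lo, hi = node
        p = W[var] * wmc_bdd(hi, W, memo) + (1 - W[var]) * wmc_bdd(lo, W, memo)
    memo[key] = p
    return p

# Hypothetical BDD for x AND y, with W(x) = 0.6 and W(y) = 0.5:
zero, one = ("leaf", 0), ("leaf", 1)
y_node = ("node", "y", zero, one)
bdd = ("node", "x", zero, y_node)
print(wmc_bdd(bdd, {"x": 0.6, "y": 0.5}))  # 0.6 * 0.5 = 0.3
```

Because shared nodes are memoized, the pass is linear in the BDD's size, not in the number of models, which is exactly why compact BDDs make WMC fast.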

These BDDs are intimately related to the computation trees discussed before. For a binary MC $\mathcal{M}$, the tree $\mathsf{CT}(\mathcal{M}, h)$ is binary and can be considered as a (not necessarily reduced) BDD. More formally, let us construct $\mathsf{BDD}_{\mathsf{MC}}(\varphi^C_{\mathcal{M},h}, <_C, W)$. We fix a total order on states. Then we fix *state/step coins* $C = \{c_{s,i} \mid s \in S, i < h\}$ and the weights as in Example 5. Finally, let $<_C$ be an order on $C$ such that $i < j$ implies $c_{s,i} <_C c_{s',j}$. Then:

$$\mathsf{CT}(\mathcal{M}, h) \sim \mathsf{BDD}_{\mathsf{MC}}(\varphi^C_{\mathcal{M},h}, <_C, W). \tag{2}$$

In the spirit of Idea 1, we thus aim to construct $\mathsf{BDD}_{\mathsf{MC}}(\varphi^C_{\mathcal{M},h}, <_C, W)$, a representation as outlined in Idea 2, efficiently. Indeed, the BDD (as MC) in Fig. 4(c) is bisimilar to the MC in Fig. 4(b).

Idea 3: Represent a bisimilar version of the computation tree using a BDD.

<sup>7</sup> In this paper, we concentrate on reductions to *exact* WMC, leaving approximate approaches for future work [14].

Fig. 5. Two computation trees for the motivating example in Sect. 1.

#### 4.4 The Algorithmic Benefits of BDD Construction

Thus far we have described how to construct a binarized MC bisimilar to the CT. Here, we argue that this construction has algorithmic benefits by filling in two details. First, the binarized representation is an important ingredient for compact BDDs. Second, we show how to choose a variable ordering that ensures that the BDDs grow linearly in the horizon. In sum,

Idea 4: WMC encodings of binarized Markov chains may increase compression of computation trees.

To see the benefits of binarized transitions, we return to the factory example in Sect. 1. Figure 5(a) gives a bisimilar computation tree for the 3-factory, $h = 1$ example. However, in this tree, the states are *unfactorized*: each node in the tree is a joint configuration of factories. This tree has 8 transitions (one for each possible joint state transition) with 8 distinct probabilities. On the other hand, the bisimilar computation tree in Fig. 1(d) has binarized transitions: each node corresponds to a single factory's state at a particular time step, and each transition describes an update to only a single factory. This binarization enables the exploitation of structure: in this case, the independence of the factories yields smaller BDDs; this structure is otherwise lost when considering only joint configurations of factories.

Recall that the size of the ADD representation of the transition function is bounded from below by the number of distinct probabilities in the underlying MC: in this case, this is visualized by the number of distinct outgoing edge probabilities over all nodes in the unfactorized computation tree. Thus, a good binarization can have a drastically positive effect on performance. For the running example, rather than $2^n$ different transition probabilities (with $n$ factories), the system now has only $4 \cdot n$ distinct transition probabilities!

*Causal Orderings.* Next, we explore some of the *engineering choices* Rubicon makes to exploit the sequential structure in an MC when constructing the BDD for a WMC query. First, note that the transition matrix $P(s, s')$ implicitly encodes a distribution over state transition functions, $S \to S$. To encode $P$ as a BDD, we must encode each transition as a logical variable, similar to the situation in Sect. 4.2. In the case of binary transitions this is again easy. In the case of non-binary transitions, we again introduce additional logical variables [16,27,39,57]. This logical function has the following form:

$$f_P \colon \{0, 1\}^C \to (S \to S). \tag{3}$$

Whereas the computation tree follows a fixed (temporal) order of states, BDDs can represent the same function (and the same weighted model count) using an arbitrary order. Note that the BDD's size and structure depend drastically both on the construction of the propositional formula *and* on the order of the variables in that encoding. We can bound the size of the BDD by enforcing a variable order based on the temporal structure of the original MC. Specifically, given $h$ coin collections $\mathbf{C} = C \times \ldots \times C$, one can generate a function $f$ describing the $h$-length paths via repeated applications of $f_P$:

$$f \colon \{0, 1\}^{\mathbf{C}} \to \mathsf{Paths}_h, \quad f(C_1, \dots, C_h) = \big(f_P(C_h) \circ \dots \circ f_P(C_1)\big)(\iota) \tag{4}$$

Let $\psi$ denote an indicator for the reachability property as a function over paths, $\psi \colon \mathsf{Paths}_h \to \{0, 1\}$ with $\psi(\pi) = [\pi \in [[\lozenge^{\leq h} T]]]$. We call predicates formed by composition with $f$, i.e., $\varphi = \psi \circ f$, *causal encodings*, and orderings on $c_{i,t} \in \mathbf{C}$ that are lexicographically sorted in time, $t_1 < t_2 \implies c_{i,t_1} < c_{j,t_2}$, *causal orderings*. Importantly, causally ordered/encoded BDDs grow linearly in the horizon $h$ [61, Corollary 1]. More precisely, let $\varphi^{\mathbf{C}}_{\mathcal{M},h}$ be causally encoded with $|\mathbf{C}| = h \cdot m$. The causally ordered BDD for $\varphi^{\mathbf{C}}_{\mathcal{M},h}$ has at most $h \cdot |S \times S_\psi| \cdot m \cdot 2^m$ nodes, where $|S_\psi| = 2$ for reachability properties.<sup>8</sup> However, while the worst-case growth is linear in the horizon, constructing that BDD may induce a super-linear cost in its size, e.g., function composition using BDDs is super-linear!
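Equation (4) can be made concrete on a toy example. The sketch below is our own simplification: a hypothetical two-state chain with a single coin per step rather than a full per-state coin collection. It enumerates coin sequences, applies $f_P$ once per step, and sums the weights of the sequences whose induced path reaches the target:

```python
from itertools import product

# Hypothetical binary MC on states {0, 1}: from state 0 a true coin
# (weight 0.6) moves to state 1, a false coin stays; state 1 absorbs.
def f_P(coin):
    # One application of f_P: a coin valuation induces a state-to-state map.
    return lambda s: 1 if (s == 1 or coin) else 0

def f(coins, init=0):
    # Eq. (4): repeated application of f_P, starting from the initial state.
    s = init
    for c in coins:
        s = f_P(c)(s)
    return s

W, h = 0.6, 2
total = 0.0
for coins in product([False, True], repeat=h):  # enumerate coin sequences
    if f(coins) == 1:  # psi: the induced path reaches the target state 1
        w = 1.0
        for c in coins:
            w *= W if c else 1 - W
        total += w
print(total)  # 1 - 0.4**2 = 0.84
```

A causal encoding compiled to a BDD would avoid this exponential enumeration; the point here is only to make the types in Eqs. (3) and (4) tangible.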

Figure 5(b) shows the motivating factory example with 2 factories and $h = 2$. The variables are causally ordered: the factories in time step 1 occur before the factories in time step 2. For $n$ factories, a fixed number $f(n)$ of nodes is added to the BDD in each iteration, guaranteeing growth on the order $O(f(n) \cdot h)$. Note the factorization that occurs: the BDD has node sharing (node $c^{(2)}_2$ is reused) that yields additional computational benefits.

*Summary and Remaining Steps.* The operational view highlights that we want to compute a transformation of the original input MC $\mathcal{M}$. The logical view presents an approach to do so efficiently: by computing a BDD that stores a predicate describing all paths that reach the target, and interpreting and evaluating the (graph of the) BDD as an MC. In the following section, we discuss the two steps that we follow to create the BDD: (i) from $P$, generate $P'$ such that $\mathsf{CT}([[P]], h) \sim [[P']]$; (ii) from $P'$, generate $\mathcal{M}$ such that $\mathcal{M} = [[P']]$.

### 5 RUBICON

We present Rubicon, which follows the two steps outlined above. For exposition, we first describe a translation of *monolithic* Prism programs to Dice programs

<sup>8</sup> Generally, it is the smallest number of states required for a DFA to recognize $\psi$.

Fig. 6. From Prism to Dice using Rubicon.

and then extend this translation to admit modular programs. Technical steps and extensions are deferred to [38, Appendix].

Dice Preliminaries. We give a brief description of Dice, a probabilistic programming language (PPL) introduced in [39]. A PPL is a programming language augmented with a primitive notion of random choice: for instance, in Dice, a Bernoulli random variable is introduced by the syntax `flip 0.5`. The syntax of Dice is similar to the programming language OCaml: local variables are introduced by the syntax `let x = e1 in e2`, where `e1` and `e2` are *expressions*, i.e., sub-programs. Dice supports procedures, bounded integers, bounded loops, and standard control flow via if-statements.

One goal of a PPL is to perform *probabilistic inference*: compute the probability that the program returns a particular value. Inference on the tiny Dice program `let x = flip 0.1 in x` would yield that true is returned with probability 0.1. The Dice compiler performs probabilistic inference via weighted model counting and BDD compilation. In doing so, it accomplishes the *non-trivial* tasks of: (i) choosing a logical encoding for probabilistic programs, (ii) establishing good variable orderings, (iii) efficiently manipulating and constructing BDDs, and (iv) performing WMC. For details, we refer the reader to [39].

Rubicon uses Dice to effectively construct a BDD and perform WMC on a Dice program that reflects a description of some computation tree. This implementation exploits the structure that was described in Sect. 4.4: in particular, the BDD generated in Fig. 5(b) is exactly the BDD that will be generated by Dice from the output of Rubicon. The variable ordering used by Dice is given by the order in which program variables are introduced, and Rubicon's translation was designed with this variable ordering in mind.

Transpiling Prism to Dice. We present the core translation routine implemented in Rubicon. We note that the ultimate performance of Rubicon is heavily dependent on the quality of this translation. We evaluate the performance in the next section.

The Prism specification language consists of one or more reactive *modules* (or partially synchronized state machines) that may interact with each other. Our example in Fig. 1(b) illustrates fully synchronized state machines. While Prism programs containing multiple modules can be flattened into a single monolithic program, this yields an exponential blow-up: if one flattens the $n$ modules in Fig. 1(b) into a single module, the resulting program has $2^n$ updates per command. This motivates our direct translation of Prism programs containing multiple modules.

*Monolithic Prism Programs.* We explain most ideas on Prism programs that consist of a single "monolithic" module before we address the modular translation at the end of the subsection. A module has a set of bounded variables, and the valuations of these variables span the state space of the underlying MC. Its transitions are described by guarded *commands* of the form:

$$[\mathbf{act}]\ \mathbf{guard} \rightarrow p_1 : \mathbf{update}_1 + \dots + p_n : \mathbf{update}_n$$

The *action* name act is only relevant in the modular case and can be ignored for now. The *guard* is a Boolean expression over the module's variables. If the guard evaluates to true in some state (a valuation), then the module evolves into one of the $n$ successor states by updating its variables. An *update* is chosen according to the probability distribution given by the expressions $p_1, \ldots, p_n$. In every state enabling the guard, the evaluations of $p_1, \ldots, p_n$ must sum up to one. A set of guards *overlaps* if all its guards evaluate to true in a given state. The semantics of overlapping guards in the monolithic setting is to first uniformly select an active guard and then apply the corresponding stochastic transition. Finally, a self-loop is implicitly added to states without an enabled guard.

*Example 6.* We present our translation primarily through example. In Fig. 6(a), we give a Prism program for an MC. The program contains two variables x and y, where x is either zero or one, and y is between zero and two. There are thus 6 different states. We denote states as tuples with the x- and y-value. We depict the MC in Fig. 6(b). From state $\langle 0, 0\rangle$, (only) the first guard is enabled and thus there are two transitions, each with probability one half: one in which x becomes one, and one in which y is increased by one. Finally, there is no guard enabled in state $\langle 1, 1\rangle$, resulting in an implicit self-loop.
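These semantics (guard evaluation, uniform selection among enabled commands, probabilistic updates, and the implicit self-loop) can be captured by a one-step interpreter. The Python sketch below is illustrative only; its state representation and its two commands are hypothetical, loosely mirroring the behavior described in Example 6 rather than the actual program text:

```python
from fractions import Fraction

def step_distribution(state, commands):
    """One step of a monolithic module: `commands` is a list of
    (guard, updates) pairs; `guard` is a predicate on states and
    `updates` maps a state to a distribution over successor states."""
    enabled = [upds for guard, upds in commands if guard(state)]
    if not enabled:
        return {state: Fraction(1)}          # implicit self-loop
    dist = {}
    for upds in enabled:                     # uniform choice among enabled commands
        for succ, p in upds(state).items():
            dist[succ] = dist.get(succ, Fraction(0)) + p / len(enabled)
    return dist

# Hypothetical module over states (x, y); the second command is invented:
commands = [
    (lambda s: s[0] == 0,
     lambda s: {(1, s[1]): Fraction(1, 2), (s[0], s[1] + 1): Fraction(1, 2)}),
    (lambda s: s[1] == 2,
     lambda s: {(s[0], 0): Fraction(1)}),
]
# From (0, 0): two transitions with probability 1/2 each.
# From (1, 1): no guard enabled, so an implicit self-loop.
```

Calling `step_distribution((0, 0), commands)` yields probability 1/2 each for (1, 0) and (0, 1), while `step_distribution((1, 1), commands)` returns the self-loop distribution, matching the behavior described above.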

*Translation.* All Dice programs consist of two parts: a *main* routine, which is run by default when the program starts, and *function declarations* that declare auxiliary functions. We first define the auxiliary functions. For simplicity let us temporarily assume that no guards overlap and that probabilities are constants, i.e., not state-dependent.

The main idea in the translation is to construct a Dice function step that, given the current state, outputs the next state. Because a monolithic Prism

```
module main
  x : [0..2] init 1;
  y : [0..2] init 1;
  [] x>1 -> 1: x'=y & y'=x;
  [] y<2 -> 1: x'=min(x+1,2);
endmodule
            (a)

fun step((x,y)) {
  let aEn = (x>1) in
  let bEn = (y<2) in
  let act = selectFrom(aEn, bEn) in
  if act==1 then (y,x)
  else if act==2 then (min(x+1,2),y)
  else (x,y)
} ...
            (b)
```
Fig. 7. Prism program with overlapping guards and its translation (conceptually).

```
module m1
  x : [0..1] init 0;
  [a] x=1 -> 1:x'=1-y;
  [b] x=0 -> 1:x'=0;
endmodule

module m2
  y : [0..1] init 0;
  [b] y=1 -> 0.5:y'=0 + 0.5:y'=1;
  [c] true -> 1:x'=1-x;
endmodule
             (a)

fun step((x,y)) {
  let aEn = (x==1) in
  let bEn = (x==0 && y==1) in
  let cEn = true in
  let act = selectFrom(aEn, bEn, cEn) in
  if act==1 then (1-y, y)
  else if act==2 then (0, flip 0.5)
  else if act==3 then (1-x, y)
  else (x, y)
}
                (b)
```
Fig. 8. Modular Prism and resulting Dice step function.

program is almost a sequential program, the step function is, in its most basic version, straightforward to construct using built-in Dice language primitives: we simply build a large if-else block corresponding to the commands. This block iteratively considers each command's guard until it finds one that is satisfied. To perform the corresponding update, we flip a coin – based on the probabilities corresponding to the updates – to determine which update to perform. If no command is enabled, we return the same state, in accordance with the implicit self-loop. Figure 6(d) shows the program blocks for the Prism program from Fig. 6(a) with target state [[ x = 0, y = 2 ]]. There are two other important auxiliary functions. The init function simply returns the initial state by translating the initialization statements from Prism, and the hit function checks whether the current state is a target state, as obtained from the property.

Now we outline the main routine, given for this example in Fig. 6(c). This function first initializes the state. Then, it calls step 2 times, checking on each iteration using hit if the target state is reached. Finally, we return whether we have been in a target state. The probability to return true corresponds to the reachability probability on the underlying MC specified by the Prism program.
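The unrolled main routine can be mimicked over explicit distributions rather than via weighted model counting. In the following Python sketch, the step and hit functions and the two-state chain are hypothetical stand-ins for the generated Dice functions:

```python
def bounded_reach(init, step, hit, h):
    """Mirror the unrolled main routine: initialize, then h times apply
    step and check hit, propagating a distribution over (state, hit-yet)."""
    dist = {(init, hit(init)): 1.0}
    for _ in range(h):
        nxt = {}
        for (s, seen), p in dist.items():
            if seen:  # target already reached: the flag is sticky
                nxt[(s, True)] = nxt.get((s, True), 0.0) + p
                continue
            for t, q in step(s).items():
                key = (t, hit(t))
                nxt[key] = nxt.get(key, 0.0) + p * q
        dist = nxt
    return sum(p for (_, seen), p in dist.items() if seen)

# Hypothetical two-state chain: 0 moves to 1 with probability 0.6, else
# stays; 1 is absorbing; the target is state 1.
step = lambda s: {1: 0.6, 0: 0.4} if s == 0 else {1: 1.0}
print(bounded_reach(0, step, lambda s: s == 1, 2))  # 1 - 0.4**2 = 0.84
```

Dice performs the same unrolling symbolically: instead of enumerating states, each iteration conjoins the step function's logical encoding onto the BDD.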

*Overlapping Guards.* Prism allows multiple commands to be enabled in the same state; the semantics is to choose one of the enabled commands uniformly at random. Dice has no primitive notion of this construct.<sup>9</sup> We illustrate the translation in Fig. 7(a) and Fig. 7(b). The translation determines which guards aEn, bEn, cEn are enabled. Then, we randomly select one of the commands that are enabled, i.e., we uniformly at random select a true bit from a given tuple

<sup>9</sup> One cannot simply condition on selecting an enabled guard as this redistributes probability mass over all paths and not only over paths with the same prefix.

of bits. We store the index of that bit and use it to execute the corresponding command.
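Viewed as a distribution over indices, this selection primitive is simple to model. The sketch below is our own rendering of its semantics, not Dice code; we reserve index 0 for the case that no bit is set (no command enabled):

```python
from fractions import Fraction

def select_from(*bits):
    """Distribution over the 1-based index of a uniformly selected true
    bit; index 0 means no bit was set (no command enabled)."""
    true_idx = [i + 1 for i, b in enumerate(bits) if b]
    if not true_idx:
        return {0: Fraction(1)}
    return {i: Fraction(1, len(true_idx)) for i in true_idx}

dist = select_from(True, False, True)  # commands 1 and 3 enabled:
# act == 1 and act == 3 are each selected with probability 1/2
```

Crucially, the selection is made among *enabled* bits only, locally per state, which is exactly why simple global conditioning would be incorrect (cf. footnote 9).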

*Modular Prism Programs.* For modular Prism programs, the *action names* at the front of Prism commands are important. In each module, there is a set of action names available. An action is *enabled* if each module that contains this action name has (at least) one command with this action whose guard is satisfied. Commands with an empty action are assumed to have a globally unique action name, so in that case the action is enabled iff the guard is enabled. Intuitively, once an action is selected, we randomly select a command per module in all modules containing this action name. Our approach resembles that for overlapping guards described above. See Fig. 8 for an intuitive example. To automate this, the updates require more care, cf. [38] for details.

*Implementation.* Rubicon is implemented on top of Storm's Python API and translates Prism to Dice fully automatically. Rubicon supports all MCs in the Prism benchmark suite and a large set of benchmarks from the Prism website and the QVBS [35], with the note that we require a single initial state and ignore reward declarations. Furthermore, we currently do not support the hide/restrict process-algebraic compositions and some integer operations.

### 6 Empirical Comparisons

We compare and contrast the performance of Storm against Rubicon to empirically demonstrate the following strengths and weaknesses:<sup>10</sup>


The sources, benchmarks and binaries are archived.<sup>11</sup>

There is no clear-cut model checking technique that is superior to others (see QCOMP [12]). We demonstrate that, while Rubicon is not competitive on some

<sup>10</sup> All experiments were conducted with Storm version 1.6.0 on the same server with 512 GB of RAM, using a single thread of execution. Time was reported using the built-in Unix time utility; the total wall-clock time is reported.

<sup>11</sup> http://doi.org/10.5281/zenodo.4726264 and http://github.com/sjunges/rubicon.

Fig. 9. Scaling plots comparing Rubicon, Storm's symbolic engine, and Storm's explicit engine. An "(R)" in the caption denotes random parameters.

commonly used benchmarks [52], it improves a modern model checking portfolio approach on a significant set of benchmarks. Below we provide several natural models on which Rubicon is superior to one or both competing methods. We also evaluated Rubicon on standard benchmarks, highlighting that Rubicon is applicable to models from the literature. We see that Rubicon is effective on Herman (elaborated below), has mixed results on BRP [38, Appendix], and is currently not competitive on some other standard benchmarks (NAND, EGL, LeaderSync). While not exhaustive, our selected benchmarks highlight specific strengths and weaknesses of Rubicon. Finally, a particular benefit of Rubicon is fast sampling of parametric chains, which we demonstrate on Herman and our factory example.

Scaling Experiments. In this section, we describe several scaling experiments (Fig. 9), each designed to highlight a specific strength or weakness.

*Weather Factories.* First, Fig. 9(a) describes a generalization of the motivating example from Sect. 1. In this model, the probability that each factory is on strike is dependent on a common random event: whether or not it is raining. The rain on each day is dependent on the previous day's weather. We plot runtime for an increasing number of factories for h=10. Both Storm engines eventually fail due to the state explosion and the number of distinct probabilities in the MC. Rubicon is orders of magnitude faster in comparison, highlighting that it does not depend on complete independence among the factories. Figure 9(b) shows a more challenging instance where the weather includes *wind* which, each day, affects whether or not the sun will shine, which in turn affects strike probability.

*Herman*. Herman is based on a distributed protocol [37] that has been well-studied [1,53] and is one of the standard benchmarks in probabilistic model checking. Rather than computing the expected number of steps to 'stabilization', we consider the step-bounded probability of stabilization. Usually, all participants in

the protocol flip a coin with the same bias. The model is then highly symmetric, and hence is amenable to symbolic representation with ADDs. Figures 9(c) and 9(e) show how the methods scale on Herman examples with 13 and 17 parallel processes. We observe that the explicit approach scales very efficiently in the number of iterations but has a much higher up-front model-construction cost, and hence can be slower for fewer iterations.

To study what happens when the coin biases vary over the protocol participants, we made a version of the Herman protocol where each participant's bias is randomly chosen, which ruins the symmetry and so causes the ADD-based approaches to scale significantly worse (Figs. 9(d), 9(f), and 9(g)); we see that symbolic ADD-based approaches completely fail on Herman 17 and Herman 19 (the curve terminating denotes a memory error). Rubicon and the explicit approach are unaffected by varying parameters.

*Queues.* The Queues model has K queues of capacity Q where, in every step, tasks arrive with a particular probability. Three queues are of type 1, the others of type 2. We ask for the probability that all queues of type 1 and at least one queue of type 2 are full within k steps. Contrary to the previous models, the ADD representation of the transition matrix is small. Figure 9(h) shows the relative scaling on this model with K = 8 and Q = 3. We observe that ADDs quickly fail due to the inability to concisely represent the probability vector *x* from Sect. 3. Rubicon outperforms explicit model checking until h = 10.

Sampling Parametric Markov Chains. We evaluate performance for the pMC sampling problem outlined in Sect. 2. Table 1 gives for four models the time to construct the BDD and to perform WMC, as well as the time to construct an ADD in Storm and to perform model checking with this ADD. Finally, we show the time for Storm to compute the solution function of the pMC (with the explicit representation). The pMC sampling in Storm – symbolic and explicit – computes the reachability probabilities with concrete probabilities. Rubicon, in contrast, constructs a 'parametric' BDD once, amortizing the cost of repeated efficient evaluation. The 'parametric BDD' may be thought of as a solution function, as discussed in Sect. 4.1. Storm cannot compute these solution functions as efficiently. We observe in Table 1 that fast parametric sampling is realized in Rubicon: for instance, after a 40s up-front compilation of the factories example with 15 factories, we have a solution function in factorized form and it costs an order of magnitude less time to draw a sample. Hence, sampling and computation of solution functions of pMCs is a major strength of Rubicon.

### 7 Discussion, Related Work, and Conclusion

We have demonstrated that the probabilistic inference approach to probabilistic model checking can improve scalability on an important class of problems. Another benefit of the approach is for sampling pMCs. These are used to evaluate, e.g., the robustness of systems [1], or to synthesise POMDP controllers [41]. Many state-of-the-art approaches [17,19,24] require the evaluation of various instantiated MCs, and Rubicon is well-suited to this setting. More generally, support


Table 1. Sampling performance comparison and pMC model checking, time in seconds.

of inference techniques opens the door to a variety of algorithms for additional queries, e.g., computing *conditional probabilities* [3,8].

An important limitation of probabilistic inference is that only finitely many paths can be stored. For infinite horizon properties in cyclic models, an infinite set of arbitrarily long paths would be required. However, as is standard in probabilistic model checking, we may soundly approximate infinite horizons. Additionally, the inference algorithm in Dice does not support a notion of nondeterminism. It can thus only be used to evaluate MCs, not Markov decision processes. However, [61] illustrates that this is not a conceptual limitation. Finally, we remark that Rubicon achieves its performance with a straightforward translation. We are optimistic that this is a first step towards supporting a larger class of models by improving the transpilation process for specific problems.

Related Work. The tight connection with inference has been recently investigated via the use of model checking for Bayesian networks, the prime model in probabilistic inference [56]. Bayesian networks can be described as probabilistic programs [10] and their operational semantics coincides with MCs [31]. Our work complements these insights by studying how symbolic model checking can be sped up by probabilistic inference.

The path-based perspective is tightly connected to *factored state spaces*. Factored state spaces are often represented as (bipartite) dynamic Bayesian networks. ADD-based model checking for DBNs has been investigated in [25], with mixed results. Their investigation focuses on using ADDs for factored state space representations. We investigate using BDDs representing paths. Other approaches have also investigated a path-based view: the symbolic encoding in [28] annotates propositional sub-formulae with probabilities, an idea closer to ours. The underlying process implicitly constructs an (uncompressed) CT, leading to an exponential blow-up. Likewise, an explicit construction of a computation tree without factorization is considered in [62]. Compression by grouping paths has been investigated in two *approximate* approaches: [55] discretises probabilities and encodes the problem into a satisfiability problem with quantifiers and bit-vectors. This idea has been extended [60] to a PAC algorithm by purely propositional encodings and (approximate) model counting [14]. Finally, factorisation exploits symmetries, which can also be exploited using symmetry reduction [50]. We highlight that the latter is not applicable to the example in Fig. 1(d).

There are many techniques for exact probabilistic inference in various forms of probabilistic modeling, including probabilistic graphical models [20,54]. The semantics of graphical models make it difficult to transpile Prism programs, since commonly used operations are lacking. Recently, *probabilistic programming languages* have been developed which are more amenable to transpilation [13,23,29,30,59]. We target Dice due to the technical development it enables in Sect. 4, which allowed us to design and explain our experiments. Closest related to Dice is ProbLog [27], which is also a PPL that performs inference via WMC; ProbLog has different semantics from Dice that make the translation less straightforward. The paper [61] uses an encoding similar to Dice for inferring specifications based on observed traces. ADDs and variants have been considered for probabilistic inference [15,18,58], which is similar to the process commonly used for probabilistic model checking. The planning community has developed its own, largely disjoint, set of methods [45]. Some ideas from learning have been applied in a model checking context [11].

# 8 Conclusion

We present Rubicon, bringing probabilistic AI to the probabilistic model checking community. Our results show that Rubicon can outperform probabilistic model checkers on some interesting examples, and that this is not a coincidence but rather the result of a significantly different perspective.

# References



# **Enforcing Almost-Sure Reachability in POMDPs**

Sebastian Junges<sup>1(B)</sup>, Nils Jansen<sup>2</sup>, and Sanjit A. Seshia<sup>1</sup>

<sup>1</sup> University of California at Berkeley, Berkeley, USA sjunges@berkeley.edu

<sup>2</sup> Radboud University Nijmegen, Nijmegen, The Netherlands

**Abstract.** Partially-Observable Markov Decision Processes (POMDPs) are a well-known stochastic model for sequential decision making under limited information. We consider the EXPTIME-hard problem of synthesising policies that almost-surely reach some goal state without ever visiting a bad state. In particular, we are interested in computing the winning region, that is, the set of system configurations from which a policy exists that satisfies the reachability specification. A direct application of such a winning region is the safe exploration of POMDPs by, for instance, restricting the behavior of a reinforcement learning agent to the region. We present two algorithms: A novel SAT-based iterative approach and a decision-diagram based alternative. The empirical evaluation demonstrates the feasibility and efficacy of the approaches.

# **1 Introduction**

Partially observable Markov decision processes (POMDPs) constitute the standard model for agents acting under partial information in uncertain environments [34,52]. A common problem is to find a policy for the agent that maximizes a reward objective [36]. This problem is undecidable, yet, well-established approximate [27], point-based [43], or Monte-Carlo-based [49] methods exist. In safety-critical domains, however, one seeks a *safe* policy that exhibits strict behavioral guarantees, for instance in the form of temporal logic constraints [44]. The aforementioned methods are not suitable to deliver provably safe policies. In contrast, we employ almost-sure reach-avoid specifications, where the probability to reach a set of *avoid* states is zero, and the probability to *reach* a set of goal states is one. Our **Challenge 1** is to compute a policy that adheres to such specifications. Furthermore, we aim to ensure the *safe exploration of a POMDP*, with safe reinforcement learning [23] as direct application. **Challenge 2** is then

© The Author(s) 2021

This work is partially supported by NSF grants 1545126 (VeHICaL), 1646208 and 1837132, by the DARPA contracts FA8750-18-C-0101 (AA) and FA8750-20-C-0156 (SDCPS), by Berkeley Deep Drive, and by Toyota under the iCyPhy center.

This research has been partially funded by NWO grant OCENW.KLEIN.187: "Provably Correct Policies for Uncertain Partially Observable Markov Decision Processes".

to compute a large set of safe policies for the agent to choose from at any state of the POMDP. Such sets of policies are called *permissive policies* [21,31].

*POMDP Almost-Sure Reachability Verification.* Let us remark that in POMDPs, we cannot directly observe in which state we are, but we are in general able to track a *belief*, i.e., a distribution over states that describes where in the POMDP we may be. The belief allows us to formulate the following **verification task**:

For a POMDP, sets of target and avoid states, and a belief, does a policy exist such that we reach the target states without ever visiting a bad state?

The underlying EXPTIME-complete problem requires—in general—policies with access to memory of exponential size in the number of states [4,18]. For safe exploration and, e.g., to support nested temporal properties, the ability to solve this problem *for each belief in the POMDP* is essential.

We base our approaches on the concept of a *winning region*, also referred to as controllable or attractor regions. Such regions are sets of *winning beliefs* from which a policy exists that guarantees to satisfy an almost-sure specification. The verification task relates three concrete problems which we tackle in this paper: (1) *Decide* whether a belief is winning, (2) *compute* the *maximal* winning region, and (3) *compute* a *large* yet not necessarily maximal winning region. We now outline our two approaches. First, we directly exploit model checking for MDPs [5] using belief abstractions. The second, much faster approach iteratively exploits *satisfiability solving* (SAT) [8]. Finally, we define a scheme to enable safe reinforcement learning [23] for POMDPs, referred to as *shielding* [2,30].

*MDP Model Checking.* A prominent approach gives the semantics of a POMDP via an (infinite) belief MDP whose states are the beliefs in the POMDP [36]. For almost-sure specifications, it is sufficient to consider *belief supports* rather than beliefs. In particular, two beliefs with the same support are either both in a winning region or both not [47]. We abstract the belief MDP into a finite belief-support MDP, whose states are the supports of beliefs. The (maximal) winning region consists of (all) states of the belief-support MDP from which one can almost surely reach a belief support that contains a goal state without visiting belief supports that contain an avoid state.

To find a winning region in the POMDP, we thus just have to solve almost-sure reachability in this finite MDP. The number of belief supports, however, is exponentially large in the number of POMDP states, threatening the efficient application of explicit-state verification approaches. Symbolic state space representations are a natural option to mitigate this problem [7]. We construct a symbolic description of the belief-support MDP and apply state-of-the-art symbolic model checking. Our experiments show that this approach (referred to as *MDP Model Checking*) does in general not alleviate the exponential blow-up.

*Incremental SAT Solving.* While the belief-support model exploits the structure of the belief-support MDP by using a symbolic state space representation, it does not exploit elementary properties of the structure of winning regions. To overcome the scalability challenge, we aim to exploit information from the original POMDP, rather than working purely on the belief-support MDP. In a nutshell, our approach computes the winning regions in a backward fashion by *optimistically* searching for policies without memory on the POMDP level. Concretely, starting from the belief-support states that shall be reached almost surely, further states are added to the winning region if we can quickly find a policy that reaches these states without visiting those that are to be avoided. We search for these policies by incrementally employing an encoding based on SAT solving. This symbolic encoding avoids an expensive construction of the belief-support MDP. The computed winning region directly translates to sufficient constraints on the set of safe policies, i.e., each policy satisfying these constraints satisfies, by construction, the specification. The key idea is to successively add shortcuts corresponding to already known safe policies. These changes to the structure of the POMDP are performed implicitly on the SAT encoding. The resulting scalable method is sound, but not complete by itself. However, it can be rendered complete by trading off a certain portion of the scalability; intuitively, one would eventually search for policies with larger amounts of memory.

*Shielding.* An agent that stays within a winning region is guaranteed to adhere to the specification. In particular, we *shield* (or *mask*) any action of the agent that may lead out of the winning region [1,39,42]. We stress that the shape of the winning region is independent of the transition probabilities or rewards in the POMDP. This independence means that the only prior knowledge we need to assume is the topology, that is, the graph of the POMDP. A pre-computation of the winning region thus yields a shield and allows us to restrict an agent to safely explore environments, which is the essential requirement for safe reinforcement learning [22,23] of POMDPs. The shield can be used with any RL agent [2].

*Comparison with the State of the Art.* Similar to our approach, [15] solves almost-sure specifications using SAT. Intuitively, the aim is to find a so-called *simple policy* that is Markovian (aka memoryless). Such a policy may not exist, yet the method can be applied to a POMDP that has an extended state space to account for finite memory [33,37]. There are three shortcomings that our incremental SAT approach overcomes. First, one needs to pre-define the memory a policy has at its disposal, as well as a fixed lookahead on the exploration of the POMDP. Our encoding does not require fixing these hyperparameters a priori. Second, the approach is only feasible if small memory bounds suffice. Our approach scales to models that require policies with larger memory bounds. Third, the approach finds a single simple policy starting from a pre-defined initial state. Instead, we find a large winning region. For safe exploration, this means that we may exclude many policies and never explore important parts of the system, harming the final performance of the agent. Shielding MDPs is not new [2,9,10,30]. However, those methods neither take partial observability into account, nor can they guarantee reaching desirable states. Nam and Alur [39] cover partial observability and reachability, but do not account for stochastic uncertainty.

*Experiments.* To showcase the feasibility of our method, we adopted a number of typical POMDP environments. We demonstrate that our method scales better than the state of the art. We evaluate the shield by letting an agent explore the POMDP environment according to the permissive policy, thereby enforcing the satisfaction of the almost-sure specification. We visualize the resulting behavior of the agent in those environments with a set of videos.

*Contributions.* Our paper makes four contributions: (1) We present an incremental SAT-based approach to compute policies that satisfy almost-sure properties. The method scales to POMDPs with billions of belief-support states; (2) The novel approach is able to find large winning regions that yield permissive policies. (3) We implement a straightforward approach that constructs the belief-support MDP symbolically using state-of-the-art model checking. We show that its completeness comes at the cost of limited scalability. (4) We construct a shield for almost-sure specifications on POMDPs which enforces at runtime that *no unsafe states are visited* and that, under mild assumptions, *the agent almost-surely reaches the set of desirable states*.

*Further Related Work.* Chatterjee et al. compute winning regions for minimizing a reward objective via an explicit state representation [17], or consider almost-sure reachability using an explicit state space [16,51]. The problem of determining any winning policy can be cast as a strong cyclic planning problem, proposed earlier with decision diagrams [7]. Indeed, our BDD-based implementation on the belief-support MDP can be seen as a reimplementation of that approach.

Quantitative variants of reach-avoid specifications have gained attention in, e.g., [11,28,40]. Other approaches restrict themselves to simple policies [3,33,45, 58]. Wang et al. [55] use an iterative Satisfiability Modulo Theories (SMT) [6] approach for quantitative finite-horizon specifications, which requires computing beliefs. Various general POMDP approaches exist, e.g., [26,27,29,48,49,54,56]. The underlying approaches depend on discounted reward maximization and can satisfy almost-sure specifications with high reliability. However, enforcing probabilities that are close to 0 or 1 requires a discount factor close to 1, drastically reducing the scalability of such approaches [28]. Moreover, probabilities in the underlying POMDP need to be precisely given, which is not always realistic [14].

Another line of work (for example [53]) uses an idea similar to winning regions with uncertain specifications, but in a fully observable setting. Finally, complementary to shielding, there are approaches that guide reinforcement learning (with full observability) via temporal logic constraints [24,25].

# **2 Preliminaries and Formal Problem**

We briefly introduce POMDPs and their semantics in terms of belief MDPs, before formalising and studying the problem variants outlined in the introduction. We present belief-support MDPs as a finite abstraction of infinite belief MDPs.

We define the support *supp*(μ) = {x ∈ X | μ(x) > 0} of a discrete probability distribution μ and denote the set of all distributions with *Distr* (X).

**Definition 1 (MDP).** *A* Markov decision process *(MDP) is a tuple* M = ⟨S, Act, μ<sub>init</sub>, **P**⟩ *with a set* S *of states, an initial distribution* μ<sub>init</sub> ∈ *Distr*(S)*, a finite set* Act *of actions, and a transition function* **P**: S × Act → *Distr*(S)*.*

Let post<sub>s</sub>(α) = *supp*(**P**(s, α)) denote the states that may be the successors of the state s ∈ S for action α ∈ Act under the distribution **P**(s, α). If post<sub>s</sub>(α) = {s} for all actions α, then s is called *absorbing*.

**Definition 2 (POMDP).** *A* partially observable MDP *(POMDP) is a tuple* P = ⟨M, Ω, obs⟩ *with* M = ⟨S, Act, μ<sub>init</sub>, **P**⟩ *the underlying MDP with finite* S*,* Ω *a finite set of observations, and* obs: S → Ω *an observation function. We assume that there is a unique initial observation, i.e., that* |{obs(s) | s ∈ *supp*(μ<sub>init</sub>)}| = 1*.*

More general observation functions obs: S → *Distr*(Ω) are possible via a (polynomial) reduction [17]. A path through an MDP is a sequence π = (s<sub>0</sub>, α<sub>0</sub>)(s<sub>1</sub>, α<sub>1</sub>) ... s<sub>n</sub> of states and actions, such that s<sub>i+1</sub> ∈ post<sub>s<sub>i</sub></sub>(α<sub>i</sub>) for α<sub>i</sub> ∈ Act and 0 ≤ i < n. The observation function obs applied to a path yields an observation(-action) sequence obs(π) of observations and actions.

For modeling flexibility, we allow actions to be unavailable in a state (e.g., opening doors is only available when at a door), and it turned out to be crucial to handle this explicitly in the following algorithms. Technically, the transition function is a partial function, and the enabled actions are the set EnAct(s) = {α ∈ Act | post<sub>s</sub>(α) ≠ ∅}. To ease the presentation, we assume that states s, s′ with the same observation share the same set of enabled actions, EnAct(s) = EnAct(s′).
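As an illustration of these definitions, the partial transition function, post<sub>s</sub>(α), and EnAct(s) can be sketched with a dictionary encoding (a minimal sketch of ours; the toy POMDP and all names are illustrative, not from the paper):

```python
# Transition function P: (state, action) -> successor distribution.
# Pairs that are absent model actions that are not enabled in that state.
P = {
    ("s0", "a"): {"s1": 0.5, "s2": 0.5},
    ("s0", "b"): {"s0": 1.0},
    ("s1", "a"): {"s1": 1.0},  # s1 loops under its only enabled action
    ("s2", "a"): {"s2": 1.0},
}
ACTIONS = ("a", "b")

def post(s, alpha):
    """post_s(alpha) = supp(P(s, alpha)): the possible successors of s."""
    return {t for t, p in P.get((s, alpha), {}).items() if p > 0}

def enact(s):
    """EnAct(s) = {alpha in Act | post_s(alpha) is non-empty}."""
    return {a for a in ACTIONS if post(s, a)}
```

In this toy model, s1 and s2 are absorbing, since every enabled action leads back to the state itself.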

**Definition 3 (Policy).** *A policy* σ: (S × Act)<sup>∗</sup> × S → *Distr*(Act) *maps a path* π *to a distribution over actions. A policy is* observation-based*, if for each two paths* π, π′ *it holds that* obs(π) = obs(π′) ⇒ σ(π) = σ(π′)*. A policy is* memoryless*, if for each* π, π′ *it holds that* last(π) = last(π′) ⇒ σ(π) = σ(π′)*. A policy is* deterministic*, if for each* π*,* σ(π) *is a Dirac distribution, i.e., if* |*supp*(σ(π))| = 1*.*

Policies resolve nondeterminism and partial observability by turning a (PO)MDP into the *induced* infinite discrete-time Markov chain whose states are the finite paths of the (PO)MDP. Probability measures are defined on this Markov chain.

For POMDPs, a *belief* describes the probability of being in a certain state based on an observation sequence. Formally, a belief b is a distribution b ∈ *Distr*(S) over the states. A state s with positive belief b(s) > 0 is in the *belief support*, s ∈ *supp*(b). Let *Pr*<sup>σ</sup><sub>b</sub>(S′) denote the probability to reach a set S′ ⊆ S of states from belief b under the policy σ. More precisely, *Pr*<sup>σ</sup><sub>b</sub>(S′) denotes the probability of all paths that reach S′ from b when nondeterminism is resolved by σ.

The policy synthesis problem usually consists in finding a policy that satisfies a certain specification for a POMDP. We consider *reach-avoid* specifications, a subclass of indefinite horizon properties [46]. For a POMDP P with states S, such a specification is ϕ = ⟨*REACH*, *AVOID*⟩ with *REACH*, *AVOID* ⊆ S. We assume that states in *AVOID* and in *REACH* are (made) absorbing and *REACH* ∩ *AVOID* = ∅.

**Definition 4 (Winning).** *A policy* σ *is* winning *for* ϕ *from belief* b *in (PO)MDP* P *iff Pr*<sup>σ</sup><sub>b</sub>(*AVOID*) = 0 *and Pr*<sup>σ</sup><sub>b</sub>(*REACH*) = 1*, i.e., if it reaches AVOID with probability zero and REACH with probability one (almost surely) when* b *is the initial belief. Belief* b *is* winning *for* ϕ *in* P *if there exists a winning policy from* b*.*

We omit P and ϕ whenever it is clear from the context and simply call b winning.

**Problem 1:** Given a POMDP, a belief b, and a specification ϕ, decide whether b is winning and find a policy σ that is winning from b.

The problem is EXPTIME-complete [18]. Contrary to MDPs, it is not sufficient to consider memoryless policies.

Model checking queries for POMDPs often rely on the analysis of the *belief MDP*. Indeed, we may analyse this generally infinite model. Let us first recap a formal definition of the belief MDP, using the presentation from [11]. In the following, let **P**(s, α, z) := Σ<sub>s′∈S</sub> [obs(s′) = z] · **P**(s, α, s′) denote the probability<sup>1</sup> to move to (a state with) observation z from state s using action α. Then, **P**(b, α, z) := Σ<sub>s∈S</sub> b(s) · **P**(s, α, z) is the probability to observe z after taking α in b. We define the *belief obtained by taking* α *from* b*, conditioned on observing* z:

$$\text{update}(\mathfrak{b}|\alpha, z)(s') := \frac{[\text{obs}(s') = z] \cdot \sum\_{s \in S} \mathfrak{b}(s) \cdot \mathbf{P}(s, \alpha, s')}{\mathbf{P}(\mathfrak{b}, \alpha, z)}. \tag{1}$$

**Definition 5 (Belief MDP).** *The* belief MDP *of POMDP* P = ⟨M, Ω, obs⟩ *where* M = ⟨S, Act, μ<sub>init</sub>, **P**⟩ *is the MDP BelMDP*(P) := ⟨B, Act, **P**<sub>B</sub>, μ<sub>init</sub>⟩ *with* B = *Distr*(S)*, and transition function* **P**<sub>B</sub> *given by*

$$\mathbf{P}\_{\mathcal{B}}(\mathfrak{b},\alpha,\mathfrak{b}') := \begin{cases} \mathbf{P}(\mathfrak{b},\alpha,\mathrm{obs}(\mathfrak{b}')) & \text{if } \mathfrak{b}' = \mathrm{update}(\mathfrak{b}|\alpha,\mathrm{obs}(\mathfrak{b}')),\\ 0 & \text{otherwise}. \end{cases}$$

Due to (1) and the unique initial observation, we may restrict the beliefs to B = ⋃<sub>z∈Ω</sub> *Distr*({s | obs(s) = z}), that is, each belief state has a unique associated observation. We can lift specifications to belief MDPs: *Avoid-beliefs* are the set of beliefs b such that *supp*(b) ∩ *AVOID* ≠ ∅, and *reach-beliefs* are the set of beliefs b such that *supp*(b) ⊆ *REACH*.
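The belief update (1) and the observation probabilities **P**(b, α, z) can be sketched directly (a minimal example of ours; the toy POMDP and all names are illustrative, not from the paper):

```python
# Toy POMDP: beliefs are dicts mapping states to probabilities.
P = {
    ("s0", "a"): {"s1": 0.5, "s2": 0.5},
    ("s1", "a"): {"s1": 1.0},
    ("s2", "a"): {"s2": 1.0},
}
obs = {"s0": "z0", "s1": "z1", "s2": "z1"}

def prob_obs(b, alpha, z):
    """P(b, alpha, z): probability of observing z after taking alpha in b."""
    return sum(bs * p
               for s, bs in b.items()
               for t, p in P.get((s, alpha), {}).items()
               if obs[t] == z)

def update(b, alpha, z):
    """update(b | alpha, z) from equation (1): condition on observing z."""
    denom = prob_obs(b, alpha, z)
    new_b = {}
    for s, bs in b.items():
        for t, p in P.get((s, alpha), {}).items():
            if obs[t] == z:
                new_b[t] = new_b.get(t, 0.0) + bs * p
    return {t: q / denom for t, q in new_b.items()}

b1 = update({"s0": 1.0}, "a", "z1")  # both successors carry observation z1
```

Note that the resulting belief support {s1, s2} indeed has a unique associated observation, as required above.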

Towards obtaining a finite abstraction, the main algorithmic idea is the following. For the qualitative reach-avoid specifications we consider, the belief probabilities are irrelevant—*only the belief support is important* [47].

**Lemma 1.** *For a winning belief* b*, any belief* b′ *with supp*(b) = *supp*(b′) *is winning.*

Consequently, we can abstract the belief MDP into a finite belief support MDP.

**Definition 6 (Belief-Support MDP).** *For a POMDP* P = ⟨M, Ω, obs⟩ *with* M = ⟨S, Act, μ<sub>init</sub>, **P**⟩*, the finite state space of the* belief-support MDP P<sub>B</sub> *is* B = {b ⊆ S | ∀s, s′ ∈ b: obs(s) = obs(s′)}*, where each state is the support of a belief state. Action* α *in state* b *leads (with an irrelevant positive probability* p > 0*) to a state* b′*, if*

$$b' \in \left\{ \bigcup\_{s \in b} \text{post}\_s(\alpha) \cap \{s \mid \text{obs}(s) = z\} \mid z \in \Omega \right\}.$$

<sup>1</sup> We use Iverson brackets: [x] = 1 if x holds and 0 otherwise.

Thus, transitions between b and b′ mimic the transitions between their states in the POMDP. Equivalently, the following clarifies that the belief-support MDP is an abstraction of the belief MDP: there is a transition with action α between b and b′, if there exist beliefs 𝔟, 𝔟′ with *supp*(𝔟) = b and *supp*(𝔟′) = b′, such that 𝔟′ ∈ post<sub>𝔟</sub>(α). We lift the specification as before:

**Definition 7 (Lifted specification).** *For* ϕ = ⟨*REACH*, *AVOID*⟩*, we define* ϕ<sub>B</sub> = ⟨*REACH*<sub>B</sub>, *AVOID*<sub>B</sub>⟩ *with AVOID*<sub>B</sub> = {b | b ∩ *AVOID* ≠ ∅}*, and REACH*<sub>B</sub> = {b | b ⊆ *REACH*}*.*
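The successor relation of Definition 6 amounts to taking the union of the POMDP successors of the states in b and splitting it by the observation received. A minimal sketch (the toy POMDP and all names are ours, for illustration only):

```python
# Toy POMDP: s0 and s4 share an observation, so {"s0", "s4"} is a valid
# belief support; their successors s1 and s2 carry different observations.
P = {
    ("s0", "a"): {"s1": 0.5, "s2": 0.5},
    ("s4", "a"): {"s2": 1.0},
}
obs = {"s0": "z0", "s1": "z1", "s2": "z2", "s4": "z0"}

def post(s, alpha):
    """post_s(alpha): possible POMDP successors of s under alpha."""
    return {t for t, p in P.get((s, alpha), {}).items() if p > 0}

def support_successors(b, alpha):
    """Successor supports of belief support b under alpha: the union of the
    states' successors, partitioned by observation (Definition 6)."""
    union = set()
    for s in b:
        union |= post(s, alpha)
    by_obs = {}
    for t in union:
        by_obs.setdefault(obs[t], set()).add(t)
    return {frozenset(group) for group in by_obs.values()}

succ = support_successors({"s0", "s4"}, "a")
```

Here the union {s1, s2} splits into the two successor supports {s1} and {s2}, one per observation, as the definition prescribes.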

We obtain the following lemma, which follows from the fact that almost-sure reachability is a graph property<sup>2</sup>.

**Lemma 2.** *If belief* b *is winning in the POMDP* P *for* ϕ*, then the support supp*(b) *is winning in the belief-support MDP* P<sup>B</sup> *for* ϕB*.*

Lemma 2 yields an equivalent reformulation of Problem 1 for belief supports:

**Problem 1 (equivalent):** Given a POMDP P, belief b, and specification ϕ, decide whether *supp*(b) is winning for ϕ<sup>B</sup> in the belief-support MDP PB.

# **3 Winning Regions**

This section provides the observations on winning regions, a key concept for this paper. An important consequence of Lemma 2 and the reformulation of Problem 1 to the belief-support MDP is that the initial distribution of the POMDP is no longer relevant. Winning policies for individual beliefs may be composed to a policy that is winning for all of these beliefs, using the individual action choices.

**Lemma 3.** *If the policies* σ *and* σ′ *are winning for the belief supports* b *and* b′*, respectively, then there exists a policy* σ′′ *that is winning for both* b *and* b′*.*

While this statement may seem trivial on the MDP (or equivalently on beliefs), we notice that it does not hold for POMDP states. As a natural consequence, we are able to consider winning beliefs without referring to a specific policy.

**Definition 8 (Winning region).** *Let* σ *be a policy. A set* W<sup>σ</sup><sub>ϕ</sub> ⊆ B *of belief supports is a* winning region for ϕ and σ*, if* σ *is winning from each* b ∈ W<sup>σ</sup><sub>ϕ</sub>*. A set* W<sub>ϕ</sub> ⊆ B *is a* winning region for ϕ*, if every* b ∈ W<sub>ϕ</sub> *is winning. The region containing all winning beliefs is the* maximal winning region<sup>3</sup>*.*

<sup>2</sup> Although the probabilities are not relevant to compute almost-sure reachability, it is important to notice that almost-sure reachability is different from sure reachability [5]: For almost-sure reachability, there can be an infinite path that never reaches the target, as long as the probability mass over all those paths is 0. Almost-sure reachability can, however, be expressed as sure reachability in a particular game setting [47].

<sup>3</sup> In some literature, *winning region* always refers to a *maximal* winning region.

Observe that the maximal winning region in MDPs exists for qualitative reachability, but not for quantitative reachability, which we do not consider here.

**Problem 2:** Given a POMDP P and a specification ϕ, find the maximal winning region Wϕ.

Using this definition of winning regions, we are able to reformulate **Problem 1** by asking whether the support of some belief b is in the winning region.

Part of **Problem 1** was to compute a winning policy. Below, we study the connection between the winning region and winning policies. We are interested in subsets of the maximal winning region that exhibit two properties:

**Definition 9 (Deadlock-free).** *A set* W ⊆ B *of belief supports is* deadlock-free*, if for every* b ∈ W*, an action* α ∈ EnAct(b) *exists such that* post<sub>b</sub>(α) ⊆ W*.*

**Definition 10 (Productive).** *A set of belief supports* W ⊆ B *is* productive *(towards a set REACH*<sub>B</sub>*), if from every* b ∈ W*, there exists a (finite) path* π = b<sub>0</sub>α<sub>1</sub>b<sub>1</sub> ... b<sub>n</sub> *from* b<sub>0</sub> = b *to* b<sub>n</sub> ∈ *REACH*<sub>B</sub> *with* b<sub>i</sub> ∈ W *and* post<sub>b<sub>i−1</sub></sub>(α<sub>i</sub>) ⊆ W *for all* 1 ≤ i ≤ n*.*

Every productive region is deadlock-free, as *REACH*-states are absorbing. The maximal winning region is productive towards *REACH*<sub>B</sub> (and thus deadlock-free) by definition. Intuitively, while a deadlock-free region ensures that one never has to leave the region, a productive winning region additionally ensures that from every belief support within the region there is a policy that stays in the winning region and almost-surely reaches a *REACH*-state. In particular, to find a winning policy (Challenge 1) or for the purpose of safe exploration (Challenge 2), it is sufficient to find a productive subset of the maximal winning region. We detail this insight in Sect. 6.

**Problem 3:** Given a POMDP P and a specification ϕ, find a (large) productive winning region Wϕ.

To allow a compact representation of winning regions, we exploit that for any belief support b′ ⊆ b it holds that post<sub>b′</sub>(α) ⊆ post<sub>b</sub>(α) for all actions α ∈ Act, that is, the successors of b′ are contained in the successors of b.

**Lemma 4.** *For a winning belief support* b*, any* b′ ⊆ b *is winning.*
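On an explicit belief-support MDP, the maximal winning region of Problem 2 can be computed with the standard qualitative fixpoint for almost-sure reach-avoid; this is the explicit analogue of the symbolic *MDP Model Checking* approach from the introduction. A sketch of ours (the four-state belief-support MDP and its names are hypothetical):

```python
# Standard two-level fixpoint for almost-sure reach-avoid on an explicit MDP:
# the outer loop shrinks the candidate region W; the inner loop keeps only
# states that can reach `reach` via actions whose successors stay inside W.
def maximal_winning_region(states, enact, post, reach, avoid):
    W = set(states) - set(avoid)
    while True:
        # actions usable in b without ever risking to leave the candidate set
        allowed = {b: [a for a in enact(b) if post(b, a) <= W] for b in W}
        good = set(reach) & W  # backward reachability via allowed actions
        changed = True
        while changed:
            changed = False
            for b in W - good:
                if any(post(b, a) & good for a in allowed[b]):
                    good.add(b)
                    changed = True
        if good == W:
            return W
        W = good

# Hypothetical belief-support MDP: from b0, action "a" may loop or reach the
# goal, action "b" risks the avoid state; b1 can never reach the goal.
POST = {
    ("b0", "a"): {"b0", "goal"},
    ("b0", "b"): {"bad"},
    ("b1", "a"): {"b1"},
    ("goal", "a"): {"goal"},
    ("bad", "a"): {"bad"},
}
W = maximal_winning_region(
    {"b0", "b1", "goal", "bad"},
    enact=lambda b: [a for (s, a) in POST if s == b],
    post=lambda b, a: POST.get((b, a), set()),
    reach={"goal"}, avoid={"bad"})
```

As expected, b0 is winning (choose "a" forever), whereas b1 is excluded because it is deadlock-free but not productive.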

# **4 Iterative SAT-Based Computation of Winning Regions**

We devise an approach for iteratively computing an increasing sequence of productive winning regions. The approach delivers a compact symbolic encoding of winning regions: For a belief (or belief-support) state from a given winning region, we can efficiently decide whether the outcome of an action emanating from the state stays within the winning region.

A key ingredient is the computation of so-called memoryless winning policies. We start this section by briefly recapping how to compute such policies directly

**Fig. 1.** Cheese-Maze example to explain memoryless policies and shortcuts

on the POMDP, before we build an efficient incremental approach on top of this base method. In particular, we first present a naive iterative algorithm based on the notion of *shortcuts*, then describe how to implicitly add shortcuts within the encoding, and finally combine the ideas into an efficient algorithm.

#### **4.1 One-Shot Approach to Find Small Policies from a Single Belief**

We aim to solve **Problem 1** and determine a winning policy. The number of policies is exponential in the actions and the (exponentially many) belief support states. Searching among doubly exponentially many possibilities is intractable in general. However, Chatterjee et al. [15] observe that often much simpler winning policies exist and provide a *one-shot approach* to find them. The essential idea is to search only for memoryless observation-based policies σ: Ω → *Distr*(Act) that are winning for the (initial) belief support b.

*Example 1.* Consider the small Cheese-POMDP [35] in Fig. 1(a). States are cells, actions are moving in the cardinal directions (if possible), and observations are the directions with adjacent cells, e.g., the boldface states 6, 7, 8 share an observation. We set *REACH* = {10} and *AVOID* = {9, 11}. From belief support b = {6, 8} there is no memoryless winning policy: in states 6 and 8 we have to go north, which prevents us from going south in state 7. However, we can find a memoryless winning policy for {1, 5}, see Fig. 1(b).

This problem is NP-complete, and it is thus natural to encode the problem as a satisfiability query in propositional logic. We mildly adapt the original encoding of winning policies [15]. We introduce three sets of Boolean variables: A<sub>z,α</sub>, C<sub>s</sub>, and P<sub>s,j</sub>. If a policy takes action α ∈ Act with positive probability upon observation z ∈ Ω, then and only then, A<sub>z,α</sub> is true. If under this policy a state s ∈ S is reached from some initial belief support b<sub>ι</sub> with positive probability, then and only then, C<sub>s</sub> is true. We define a maximal rank k to ensure productivity. For each state s and rank 0 ≤ j ≤ k, variable P<sub>s,j</sub> indicates rank j for s, that is, a path from s leads to s′ ∈ *REACH* within j steps.<sup>4</sup> A winning policy is then obtained by finding a satisfying assignment (via a SAT solver) to the conjunction Ψ<sup>ϕ</sup><sub>P</sub>(b<sub>ι</sub>, k) of the constraints (2a)–(5), where S<sub>?</sub> = S \ (*AVOID* ∪ *REACH*).

<sup>4</sup> Notice that a state s can have multiple 'ranks' in this encoding. Its rank is the smallest j such that P<sub>s,j</sub> is true.


$$\bigwedge\_{s \in b\_{\mathbf{i}}} C\_s \qquad\qquad\text{(2a)}\qquad\bigwedge\_{z \in \Omega} \left(\bigvee\_{\alpha \in \text{EnAct}(z)} A\_{z,\alpha}\right) \qquad\text{(2b)}$$

The initial belief support is clearly reachable (2a). The conjunction in (2b) ensures that in every observation, at least one action is taken.

$$\bigwedge\_{s \in \text{AVOID}} \neg C\_s \quad \land \bigwedge\_{\substack{s \in S \\ \alpha \in \text{EnAct}(s)}} \left( C\_s \land A\_{\text{obs}(s), \alpha} \to \bigwedge\_{s' \in \text{post}\_s(\alpha)} C\_{s'} \right) \tag{3}$$

The conjunction (3) ensures that for any model of these formulas, the set of states {s ∈ S | C<sub>s</sub> = true} is reachable, does not overlap with *AVOID*, and is transitively closed under reachability (for the policy described by the variables A<sub>z,α</sub>).

$$\bigwedge\_{s \in S\_?} C\_s \to P\_{s,k} \tag{4}$$

$$\bigwedge\_{s \notin \text{REACH}} \neg P\_{s,0} \quad \wedge \bigwedge\_{\substack{s \in S \\ 1 \le j \le k}} P\_{s,j} \leftrightarrow \left( \bigvee\_{\alpha \in \text{EnAct}(s)} \left( A\_{\text{obs}(s),\alpha} \wedge \left( \bigvee\_{s' \in \text{post}\_s(\alpha)} P\_{s',j-1} \right) \right) \right) \tag{5}$$

Conjunction (4) states that any state that is reached almost-surely reaches a state in *REACH*, i.e., that there is a path of length at most k to the target. Conjunctions (5) describe a ranking function that ensures the existence of this path. Only states in *REACH* have rank zero, and a state with positive probability to reach a state with rank j−1 within a step has rank at most j.

By [15, Thm. 2], if the conjunction Ψ<sup>ϕ</sup><sub>P</sub>(b<sub>ι</sub>, k) of the constraints (2a)–(5) is satisfiable, then there is a memoryless observation-based policy such that ϕ is satisfied. If k = |S|, then the reverse direction also holds. If k < |S|, we may miss states with a higher rank. Large values for k are practically intractable [15], as the encoding grows significantly with k. Pandey and Rintanen [41] propose extending SAT solvers with a dedicated handling of ranking constraints.

In order to apply this to small-memory policies, one can unfold log(m) bits of memory of such a policy into an m times larger POMDP [15,33], and then search for a memoryless policy in this larger POMDP. Chatterjee et al. [15] include a slight variation to this unfolding, allowing smaller-than-memoryless policies by enforcing the same action over various observations.
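
As a rough illustration of this unfolding (names are hypothetical, not the cited construction verbatim), pairing each state with one of m memory values yields the m-times larger product POMDP, in which a memoryless policy plays the role of a finite-memory policy of the original:

```python
def unfold_memory(states, obs, m):
    """Pair every state with a memory value; lift the observation map accordingly.
    The policy's memory update would additionally be folded into the transitions."""
    product_states = [(s, n) for s in states for n in range(m)]
    product_obs = {(s, n): (obs[s], n) for (s, n) in product_states}
    return product_states, product_obs

# Two states, one observation, one bit of memory (m = 2):
states, pobs = unfold_memory(["s0", "s1"], {"s0": "z", "s1": "z"}, 2)
# a memoryless policy on the 4 product states acts as a 1-bit-memory policy
```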

#### **4.2 Iterative Shortcuts**

We exploit the one-shot approach to create a naive iterative algorithm that constructs a productive winning region. The iterative algorithm avoids the following restrictions of the one-shot approach: (1) to increase the likelihood of finding winning policies, we do not restrict ourselves to small-memory policies, and (2) we do not have to fix a maximal rank k. These modifications allow us to find more winning policies without guessing hyper-parameters. As we do not need to fix the initial belief support, those parts of the winning region that are easy for the solver to find are encountered first.

*The One-Shot Approach on Winning Regions.* To understand the naive iterative algorithm, it is helpful to consider the previous encoding in the light of **Problem 3**, i.e., finding productive winning regions. Consider first the interpretation of the variables: we have found *the same* winning policy for all states s where C<sub>s</sub> is true. Consequently, any belief support b<sub>z</sub> = {s | C<sub>s</sub> = true ∧ obs(s) = z} is winning.

**Lemma 5.** *If* σ *is winning for* b *and* b′*, then* σ *is also winning for* b ∪ b′*.*

This lemma is somewhat dual to Lemma 4, but requires a fixed policy. The constraints (2b) and (3) ensure that a winning region is deadlock-free. The constraints (4) and (5) ensure productivity of the winning region.

*Adding Shortcuts Explicitly.* The key idea is to iteratively add *shortcuts* to the POMDP that represent known winning policies. We find a winning policy σ for some belief states in the first iteration, and then add a fresh action α<sub>σ</sub> to all (original) POMDP states: this action leads – with probability one – to a *REACH* state if the state is in the winning belief support under policy σ. Otherwise, the action leads to an *AVOID* state.

**Definition 11.** *For a POMDP* P = ⟨M, Ω, obs⟩ *where* M = ⟨S, Act, μ<sub>init</sub>, **P**⟩ *and a policy* σ *with associated winning region* W<sup>σ</sup><sub>ϕ</sub>*, and assuming w.l.o.g.* ⊤ ∈ *REACH and* ⊥ ∈ *AVOID, we define the* shortcut POMDP P{σ} = ⟨M′, Ω, obs⟩ *with* M′ = ⟨S, Act′, μ<sub>init</sub>, **P**′⟩*,* Act′ = Act ∪ {α<sub>σ</sub>}*,* **P**′(s, α) = **P**(s, α) *for all* s ∈ S *and* α ∈ Act*, and* **P**′(s, α<sub>σ</sub>) = {⊤ ↦ [{s} ∈ W<sup>σ</sup><sub>ϕ</sub>], ⊥ ↦ [{s} ∉ W<sup>σ</sup><sub>ϕ</sub>]}*.*
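
A minimal sketch of the shortcut construction over a set-based transition representation (rather than distributions); here `top`/`bot` stand for ⊤ ∈ *REACH* and ⊥ ∈ *AVOID*, and all names are illustrative:

```python
def add_shortcut(post, states, winning, goal="top", sink="bot"):
    """Add a fresh action alpha_sigma to every state: it jumps to the goal
    state iff the singleton belief {s} lies in sigma's winning region."""
    new_post = {s: dict(acts) for s, acts in post.items()}  # keep old actions
    for s in states:
        new_post[s]["alpha_sigma"] = {goal} if s in winning else {sink}
    return new_post

post = {"s0": {"a": {"s1"}}, "s1": {"a": {"s1"}}}
ext = add_shortcut(post, ["s0", "s1"], winning={"s1"})
# taking alpha_sigma from s1 wins; taking it from s0 loses
```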

**Lemma 6.** *For a POMDP* P *and policy* σ*, the (maximal) winning regions for* P{σ} *and* P *coincide.*

First, adding actions never turns a winning belief support into a losing one. Furthermore, by construction, taking the novel action leads to a winning belief support only if following σ from that point onwards is a winning policy. The *key* benefit is that adding shortcuts may extend the set of belief-support states that win via a memoryless policy. This observation also gives rise to the following extension of the one-shot approach.

*Example 2.* We continue with Example 1. If we add shortcuts, we can now find a memoryless winning policy for b = {6, 8}, depicted in Fig. 1(c).

*Iterative Shortcuts to Extend a Winning Region.* The idea is now to run the one-shot approach, extract the winning region, add the shortcuts to the POMDP, and rerun the one-shot approach. To make the one-shot approach applicable in this setting, only one change is needed: rather than fixing an initial belief support, we ask for an arbitrary new belief support to be added to the states that we have previously covered. We use a data structure Win such that Win(z) encodes all winning belief supports with observation z. Internally, the data structure stores maximal winning belief supports (w.r.t. set inclusion, see also Lemma 4) as bit-vectors. By construction, for every b ∈ Win(z), a winning policy exists, i.e., conceptually, there is a shortcut action leading to *REACH*.
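
The Win data structure can be sketched as follows; we use frozensets instead of bit-vectors, and `add` reports whether the new support actually extends the stored region (the subsumption test justified by Lemma 4). All names are hypothetical:

```python
class Win:
    """Per observation, store only maximal winning belief supports."""

    def __init__(self):
        self.by_obs = {}

    def add(self, z, belief):
        belief = frozenset(belief)
        supports = self.by_obs.setdefault(z, set())
        if any(belief <= b for b in supports):
            return False  # subsumed: subsets of winning supports are winning
        # drop supports strictly contained in the new one, then insert it
        self.by_obs[z] = {b for b in supports if not (b < belief)} | {belief}
        return True

win = Win()
win.add("z", {1, 2})
subsumed = win.add("z", {1})        # {1} is subsumed by {1, 2}
extended = win.add("z", {1, 2, 3})  # {1, 2, 3} replaces {1, 2}
```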


We extend the encoding (in partial preparation of the next subsection) and add, for each observation z, a variable U<sub>z</sub> that is true if the policy is winning in a belief support that is not yet in Win(z). We replace (2a) with:

$$\bigvee\_{z \in \Omega} U\_z \quad \wedge \bigwedge\_{\substack{z \in \Omega \\ \mathsf{Win}(z) = \emptyset}} \left( U\_z \leftrightarrow \bigvee\_{\substack{s \in S \\ \mathsf{obs}(s) = z}} C\_s \right) \quad \wedge \bigwedge\_{\substack{z \in \Omega \\ \mathsf{Win}(z) \neq \emptyset}} \left( U\_z \leftrightarrow \bigwedge\_{X \in \mathsf{Win}(z)} \bigvee\_{\substack{s \in S \setminus X \\ \mathsf{obs}(s) = z}} C\_s \right) \tag{6}$$

For an observation z for which we have not found a winning belief support yet, finding a policy from any state s with obs(s) = z updates the winning region. Otherwise, updating means finding a winning policy for a belief support that is not subsumed by a previously found one (6).

*Real-Valued Ranking.* To avoid setting a maximal path length, we use unbounded (real-valued) variables R<sub>s</sub> rather than Boolean variables for the ranking [57]. This relaxation avoids the growth of the encoding and admits arbitrarily large ranks with a fixed-size encoding into difference logic, an extension of propositional logic that can be checked using an SMT solver [6].

$$\bigwedge\_{s \in S\_?} C\_s \to \bigvee\_{\alpha \in \text{EnAct}(s)} \left( A\_{\text{obs}(s),\alpha} \wedge \bigvee\_{s' \in \text{post}\_s(\alpha)} R\_s > R\_{s'} \right) \tag{7}$$

We replace (4) and (5): A state must have a successor state with a lower rank – as before, but with real-valued ranks (7).

*Algorithm.* The combined procedure is given in Algorithm 1. We initialize the winning region based on the specification, then encode the POMDP using the (modified) one-shot encoding. As long as the SMT solver finds policies that are winning for a new belief support, we add those belief supports to the winning region. In each iteration, Win contains a winning region. Once we find no more policies that extend the winning region on the extended POMDP, we terminate.

The algorithm always terminates because the set of winning regions is finite, but in general it does not solve **Problem 2**. Formally, the maximal winning region is a greatest fixpoint [5] and we iterate from below, i.e., the fixpoint that we find is the least fixpoint (of the operation that we implement). However, iterating from above requires reasoning that none of the doubly-exponentially many policies is winning for a particular belief-support state, whereas our approach profits from finding simple strategies early on. Unfolding memory as discussed earlier also makes this algorithm complete, yet it suffers from the same blow-up. A main advantage is that the algorithm often avoids the need for unfolding when searching for a winning policy or large winning regions.

Next, we address two weaknesses: First, the algorithm currently creates a new encoding in every iteration, yielding significant overhead. Second, the algorithm in many settings requires adding a bit of memory to realize behavior where in a particular observation, we *first* want to execute an action α and *then* follow a shortcut from the state (with the same observation) reached from there. We adapt the encoding to explicitly allow for these (non-memoryless) policies.

#### **4.3 Incremental Encoding of Winning Regions**

In this section, instead of naively adjusting the POMDP, we realize the idea of adding shortcuts directly on the encoding. This encoding is the essential step towards an efficacious approach for solving **Problem 3**. We find winning states based on a previous solution, and instead of adding actions, we allow the solver to decide following individual policies from each observation. In Sect. 4.4, we embed this encoding into an improved algorithm.

Our encoding represents an observation-based policy that can decide to take a shortcut, which means that it follows a previously computed winning policy from there (implicitly using Lemma 3). In addition to A<sub>z,α</sub>, C<sub>s</sub>, and R<sub>s</sub> from the previous encoding, we use the following variables: The policy takes shortcuts in states s where D<sub>s</sub> is true. For each observation, we must take the same shortcut, referred to by a positive integer-valued index I<sub>z</sub>. More precisely, I<sub>z</sub> refers to a shortcut from a previously computed (fragment of a) winning region stored in Win(z)<sub>I<sub>z</sub></sub>. The policy may decide to *switch*, that is, to follow a shortcut *after* taking an action starting in a state with observation z. If F<sub>z</sub> is true, the policy takes some action from z-states and, from the next state, takes a shortcut. The encoding thus implicitly represents policies that are not memoryless but rather allow for a particular type of memory.

The conjunction of (6) and (8)–(13) yields the encoding Φ<sup>ϕ</sup><sub>P</sub>(Win):

$$\bigwedge\_{z \in \Omega} \left( \bigvee\_{\alpha \in \text{EnAct}(z)} A\_{z, \alpha} \right) \quad \wedge \bigwedge\_{s \in \text{AVOID}} \neg C\_s \wedge \neg D\_s \tag{8}$$

$$\bigwedge\_{\substack{s\in S\\\alpha\in \text{EnAct}(s)}} \left( C\_s \wedge A\_{\text{obs}(s),\alpha} \wedge \neg F\_{\text{obs}(s)} \quad \rightarrow \bigwedge\_{s' \in \text{post}\_s(\alpha)} C\_{s'} \right) \tag{9}$$

$$\bigwedge\_{\substack{s\in S\\\alpha\in \text{EnAct}(s)}} \left( C\_s \wedge A\_{\text{obs}(s),\alpha} \wedge F\_{\text{obs}(s)} \quad \rightarrow \bigwedge\_{s' \in \text{post}\_s(\alpha)} D\_{s'} \right) \tag{10}$$

Similar to (2b) and (3), we select at least one action, and *AVOID*-states should not be reached (8). The set of reached states is transitively closed, however,


```
Input:  POMDP P, reach-avoid specification ϕ
Output: Winning region encoded in Win

Win(z) ← {s ∈ REACH | obs(s) = z} for all z ∈ Ω
Φ ← Encode(P, ϕ, Win)                          ▹ create encoding (6), (8)–(13)
while ∃η s.t. η |= Φ do                        ▹ call an SMT solver
    Win(z) ← Win(z) ∪ {b | s ∈ b iff η(C_s)} for all z ∈ Ω
    Φ ← Encode(P, ϕ, Win)
```

only if we do not switch to taking a shortcut (9). Furthermore, we mark the states reached after switching (10) and need to select a shortcut for these states.

$$\bigwedge\_{s \in S} \left( D\_s \to I\_{\text{obs}(s)} > 0 \right) \quad \wedge \quad \bigwedge\_{z \in \Omega} I\_z \le |\mathsf{Win}(z)| \tag{11}$$

$$\bigwedge\_{\substack{z \in \Omega \\ 0 < i \le |\mathsf{Win}(z)|}} \bigwedge\_{\substack{s \in S \setminus \mathsf{Win}(z)\_i \\ \mathsf{obs}(s) = z}} D\_s \to I\_z \neq i \tag{12}$$

If we reach a state s after switching, then we must pick a shortcut, and we can only pick an index that reflects a found winning region (11). If we pick the shortcut with index i, reflecting a winning region (fragment) for observation z, then we are winning from the states in Win(z)<sub>i</sub>, but not from any other state s with that observation. Thus, for s ∉ Win(z)<sub>i</sub>, if we are going to follow any shortcut (that is, D<sub>s</sub> holds), we should not pick this particular shortcut, i.e., I<sub>z</sub> ≠ i (because it will lead to an *AVOID*-state). In terms of the policy: taking this previously computed policy from state s is not (known to) lead us to a *REACH*-state (12). Finally, we update the ranking to account for shortcuts.

$$\bigwedge\_{s \in S\_?} C\_s \to \left( \bigvee\_{\alpha \in \text{EnAct}(s)} \left( A\_{\text{obs}(s), \alpha} \wedge \left( \bigvee\_{s' \in \text{post}\_s(\alpha)} R\_s > R\_{s'} \right) \right) \vee F\_{\text{obs}(s)} \right) \tag{13}$$

We make a slight adaptation to (7): either we have a successor state with a lower rank (as before) or we follow a shortcut – which either leads to the target or to violating the specification (13). We formalize the correctness of the encoding:

**Lemma 7.** *If* η |= Φ<sup>ϕ</sup><sub>P</sub>(Win)*, then for every observation* z*, the belief support* b<sub>z</sub> = {s | η(C<sub>s</sub>) = *true*, obs(s) = z} *is winning.*

Algorithm 2 is a straightforward adaptation of Algorithm 1 that avoids adding shortcuts explicitly (and uses the updated encoding). As before, the algorithm terminates, and it solves **Problem 3**. We conclude:

**Theorem 1.** *In any iteration, Algorithm 2 computes a productive winning region.*

#### **4.4 An Incremental Algorithm**

We adapt the algorithm sketched above to exploit the incrementality of modern SMT solvers. Furthermore, we aim to reduce the invocations of the solver by finding some extensions to the winning region via a graph-based algorithm.


**Algorithm 3** Incremental construction of winning regions

*Graph-Based Preprocessing.* To reduce the number of SMT invocations, we employ polynomial-time graph-based heuristics. The first step is to use (fully observable) MDP model checking on the POMDP as follows: find all states that under each (not necessarily observation-based) policy reach an *AVOID*-state with positive probability, and make them absorbing. Then, we find all states that under *each* policy reach a *REACH*-state almost-surely. Then, we iteratively search for *winning observations* and use them to extend the *REACH*-states. An observation z is winning if the belief support {s | obs(s) = z} is winning. We start with a previously determined winning region W. We iteratively update W by adding the states b<sub>z</sub> = {s | obs(s) = z} for some observation z if there is an action α such that post<sub>s</sub>(α) ⊆ W holds for every s ∈ b<sub>z</sub>. The iterative updates are interleaved with MDP model checking on the POMDP as described above until we reach a fixpoint.
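
The winning-observation fixpoint can be sketched in a few lines of Python (set-based transitions, hypothetical names; the interleaved MDP model checking is omitted):

```python
def extend_by_winning_observations(states, obs, post, actions, W):
    """Grow W with observation supports b_z that some joint action maps into W."""
    W = set(W)
    changed = True
    while changed:
        changed = False
        for z in {obs[s] for s in states}:
            b_z = {s for s in states if obs[s] == z}
            if b_z <= W:
                continue  # observation already won
            if any(all(a in post[s] and post[s][a] <= W for s in b_z)
                   for a in actions):
                W |= b_z  # z is a winning observation
                changed = True
    return W

obs = {0: "z0", 1: "z0", 2: "goal"}
post = {0: {"a": {2}}, 1: {"a": {2}}, 2: {"a": {2}}}
W = extend_by_winning_observations([0, 1, 2], obs, post, ["a"], {2})
# both z0-states move jointly into the goal, so W grows to {0, 1, 2}
```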

*Optimized Algorithm.* We improve Algorithm 2 along four dimensions to obtain Algorithm 3. First, we employ fewer updates of the winning region: we aim to extend the policy as much as possible, i.e., we want the SMT solver to find more states with the same observation that are winning under the same policy. Therefore, we fix the variables for action choices that yield a new winning policy, and let the SMT solver search whether we can extend the corresponding winning region by finding more states and actions that are compatible with the partial policy. Second, we observe that between (outer) iterations, large parts of the encoding stay intact, and use an incremental approach in which we first push all the constraints from the POMDP onto the stack, then all the constraints from the winning region, and finally a constraint that asks for progress. After we find a new policy, we pop the last constraint from the stack, add new constraints regarding the winning region (notice that the old constraints remain intact), and push new constraints asking to extend the winning region. We refresh the encoding periodically to avoid unnecessary clutter. Third, further constraints (1) make the usage of shortcuts more flexible – we allow taking shortcuts either immediately or after the next action – and (2) enable an even more incremental encoding with some minor technical reformulations. Fourth, we add the graph-based preprocessing discussed above to the outer iteration.
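
The push/pop pattern described above can be mimicked with a toy constraint stack – a stand-in for an SMT solver's incremental interface (e.g., Z3's `push`/`pop`); all names are illustrative:

```python
class IncrementalStack:
    """Toy stand-in for an SMT solver's incremental push/pop interface."""

    def __init__(self):
        self.frames = [[]]          # one list of constraints per frame

    def push(self):
        self.frames.append([])      # open a new backtracking point

    def add(self, constraint):
        self.frames[-1].append(constraint)

    def pop(self):
        self.frames.pop()           # drop everything since the last push

    def constraints(self):
        return [c for frame in self.frames for c in frame]

solver = IncrementalStack()
solver.add("pomdp-encoding")        # pushed once, reused across iterations
solver.push()
solver.add("progress-iteration-1")  # iteration-specific progress constraint
solver.pop()                        # discard only the progress constraint
solver.push()
solver.add("progress-iteration-2")  # the POMDP encoding was never re-added
```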

# **5 Symbolic Model Checking for the Belief-Support MDP**

In this section, we briefly describe how we encode a given POMDP into a belief-support MDP to employ symbolic, off-the-shelf probabilistic model checking. In particular, we employ symbolic (decision-diagram, DD) representations of the belief-support MDP, as we expect this MDP to be huge. Constructing that DD representation effectively is not entirely trivial. Instead, we advocate constructing a (modular) symbolic description of the belief-support MDP. Concretely, we automatically generate a model description in the MDP modeling language JANI [13],<sup>5</sup> and then apply off-the-shelf model checking to the JANI description.

Conceptually, we create a belief-support MDP with auxiliary states to allow for a concise encoding.<sup>6</sup> We use an auxiliary state b̂ to describe, for any transition, the conditioning on the observation. Concretely, a single transition **P**(b, α, b′) in the belief-support MDP is reflected by two transitions **P**(b, α, b̂) and **P**(b̂, α<sub>⊥</sub>, b′) in our encoding, where α<sub>⊥</sub> is a unique dummy action. We encode states using triples ⟨belsup, newobs, lact⟩. Here, belsup is a bit vector with an entry for every state s that we use to encode the belief support. The variables newobs and lact store an observation and an action and are relevant only for the auxiliary states. Technically, we encode the first transition from b with the nondeterministic action α to b̂: **P**(b, α) yields (with arbitrary positive probability) a new observation that reflects the observation obs(b′). We store α and obs(b′) in lact and newobs, respectively. The second step is a single deterministic (dummy) action updating belsup while taking newobs into account. The step also resets lact and newobs.
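
The net effect of the two-step scheme is an observation-indexed support update, sketched here explicitly (hypothetical names; the probabilities and the auxiliary JANI variables are abstracted away):

```python
def belief_successors(b, alpha, post, obs):
    """Map each observable next observation z to the successor belief support:
    the successors of b under alpha, filtered by observation z."""
    succ = set().union(*(post[s][alpha] for s in b))
    by_obs = {}
    for s in succ:
        by_obs.setdefault(obs[s], set()).add(s)
    return by_obs

post = {0: {"a": {1, 2}}, 1: {"a": {1}}, 2: {"a": {2}}}
obs = {0: "z0", 1: "z1", 2: "z1"}
succs = belief_successors({0}, "a", post, obs)
# observing z1 after action a yields the support {1, 2}
```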

The encoding of the transitions is as follows: For the first step, we create nondeterministic choices for each action α and observation z. We guard these choices with z, meaning that the edge is only applicable to states having observation z, i.e., the guard is ⋁<sub>s∈S, obs(s)=z</sub> belsup(s). With these guarded edges, we define the destinations: with an arbitrary<sup>7</sup> probability p, we go to an observation z<sub>1</sub> *if* there is at least one state s ∈ belsup which has a successor state s′ ∈ post<sub>s</sub>(α) with obs(s′) = z<sub>1</sub>.

<sup>5</sup> The description here works on a network of synchronized state machines as is also common in the PRISM language.

<sup>6</sup> The usage of message passing or *indexed assignments* in JANI would circumvent the need for intermediate states, but is to the best of our knowledge not supported by decision-diagram based model checkers.

<sup>7</sup> We leave this as a parametric probability during model building to reduce the number of different probabilities, as this is beneficial for the size of the decision diagram that Storm constructs – it will only have leaves 0, p, and 1. Technically, such MDPs are not necessarily well-defined, but we can employ model checking on the graph structure.

The following pseudocode reflects the first step in the transition encoding. The syntax is as follows: **take** an action **if** a Boolean guard is satisfied; **then** each update is executed with the probability given by **prob**. An example of a guard is an observation z.

$$\mathtt{take}\ \alpha\ \mathtt{if}\ z\ \mathtt{then}\begin{cases}\mathtt{prob}\left(\bigvee\_{\substack{s \in S \\ \mathbf{P}(s,\alpha,z\_1)>0}}\mathtt{belsup}(s)\ ?\ p:0\right)\colon & \mathtt{lact}\leftarrow\alpha,\ \mathtt{newobs}\leftarrow z\_1\\\qquad\qquad\vdots & \qquad\qquad\vdots\\\mathtt{prob}\left(\bigvee\_{\substack{s \in S \\ \mathbf{P}(s,\alpha,z\_n)>0}}\mathtt{belsup}(s)\ ?\ p:0\right)\colon & \mathtt{lact}\leftarrow\alpha,\ \mathtt{newobs}\leftarrow z\_n\end{cases}$$

The second step synchronously updates each state s′ in the POMDP independently: the entry belsup(s′) is set to true if obs(s′) = newobs and there is a state s currently true in (the old) belsup with s′ ∈ post<sub>s</sub>(lact). The step thus can be captured by the following pseudocode for each s′:

$$\mathtt{take}\ \alpha\_{\bot}\ \mathtt{if}\ \mathtt{true}\ \mathtt{then}\ \mathtt{prob}\ 1\colon\ \mathtt{belsup}(s') \leftarrow \left(\bigvee\_{s \in S} \mathtt{belsup}(s) \wedge \mathbf{P}(s, \mathtt{lact}, s') > 0\right) \wedge \mathtt{obs}(s') = \mathtt{newobs}$$

Finally, whenever the dummy action α<sub>⊥</sub> is executed, we also reset the variables newobs and lact. The resulting encoding thus has in the order of |S| + |Ω|<sup>2</sup> · max<sub>z∈Ω</sub> |EnAct(z)| transitions.

# **6 Almost-Sure Reachability Shields in POMDPs**

In this section, we define a *shield* for POMDPs – towards the application of safe exploration (Challenge 2) – that blocks actions which would lead an agent out of a winning region. In particular, the shield imposes restrictions on policies to satisfy the reach-avoid specification. Technically, we adapt so-called *permissive* policies [21,31] to the belief-support MDP. To force an agent to stay within a productive winning region W<sub>ϕ</sub> for specification ϕ, we define a ϕ*-shield* ν mapping each belief support b to a set of actions ν(b) ⊆ Act such that, for any b winning for ϕ, we have ν(b) ⊆ {α ∈ Act | post<sub>b</sub>(α) ⊆ W<sub>ϕ</sub>}, i.e., an action is part of the shield ν(b) only if it exclusively leads to belief-support states within the winning region.

A shield ν restricts the set of actions an arbitrary policy may take<sup>8</sup>. We call such restricted policies *admissible*. Specifically, let b<sub>τ</sub> be the belief support after observing an observation sequence τ. Then a policy σ is ν-admissible if *supp*(σ(τ)) ⊆ ν(b<sub>τ</sub>) for every observation sequence τ. Consequently, a policy is *not* admissible if for some observation sequence τ, the policy selects an action α ∈ Act that is not allowed by the shield.
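
A shield can be sketched as follows, assuming the winning region W is stored as a set of belief supports (frozensets of states) and transitions are set-based; all names are illustrative:

```python
def shield(b, actions, post, obs, W):
    """Allow exactly the actions from b whose successor supports all lie in W."""
    allowed = []
    for a in actions:
        succ = set().union(*(post[s][a] for s in b))
        by_obs = {}
        for s in succ:                      # split successors per observation
            by_obs.setdefault(obs[s], set()).add(s)
        if all(frozenset(bp) in W for bp in by_obs.values()):
            allowed.append(a)
    return allowed

post = {0: {"a": {1}, "b": {2}}}
obs = {0: "z0", 1: "win", 2: "lose"}
W = {frozenset({0}), frozenset({1})}        # winning belief supports
nu = shield({0}, ["a", "b"], post, obs, W)  # an admissible policy picks from nu
```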

Some admissible policies may choose to stay in the winning region without progressing towards the *REACH* states. Such a policy adheres to the avoid-part of the specification, but violates the reachability part. To enforce *progress*, we

<sup>8</sup> While memory policies based on the belief (support) are sufficient to ensure almostsure reachability, the goal is to shield other policies that do not necessarily fall in this restricted class.

**Fig. 2.** Video stills from simulating a shielded agent on three different benchmarks.

adapt a notion of *fairness*. A policy is fair if it takes every action infinitely often at any belief-support state that appears infinitely often along a trace [5]. For example, a policy that randomizes (arbitrarily) over all actions is fair; we note that most reinforcement learning policies are therefore fair.

**Theorem 2.** *For a* ϕ*-shield* ν *and a winning belief support* b*, any fair* ν*-admissible policy satisfies* ϕ *from* b*.*

We give a proof (sketch) in [32, Appendix]. The main idea is to show that the induced Markov chain of any admissible policy has only bottom SCCs that contain *REACH*-states.

*Remark 1.* If ϕ is a safety specification (where *Pr*<sup>σ</sup><sub>b</sub>(*AVOID*) = 0 suffices), we can rely on deadlock-free winning regions rather than productive winning regions and drop the fairness assumption.

# **7 Empirical Evaluation**

We investigate the applicability of our incremental approach (Algorithm 3) to **Challenge 1** and **Challenge 2**, and compare with our adaptation and implementation of the one-shot approach [15], see Sect. 4.1. We also employ the MDP model-checking approach from Sect. 5. Experiments, videos, and source code are archived<sup>9</sup>.

*Setting.* We implemented the one-shot algorithm, our incremental algorithm, and the generation of the JANI description of the belief-support MDP in the model checker Storm [19] on top of the SMT solver Z3 [38]. To compare with the one-shot algorithm for **Problem 1**, that is, for finding a policy from the initial state, we add a variant of Algorithm 3: intuitively, every outer iteration starts with an SMT check to see whether we find a policy covering the initial states. We realize the latter by (temporarily) fixing the C<sub>s</sub>-variables. In the first iteration, this configuration and its resulting policy closely resemble the one-shot approach. For the MDP model-checking approach, we use Storm (from the C++ API) with the dd engine and default settings.

For the experiments, we use a MacBook Pro MV962LL/A, a single core, no randomization, and use a 6 GB memory limit. The time-out (TO) is 15 min.

<sup>9</sup> http://doi.org/10.5281/zenodo.4784940 or on http://github.com/sjunges/shielding-POMDPs.

*Baseline.* We compare with the one-shot algorithm, including the graph-based preprocessing to identify more winning observations. We use two setups: (1) We (manually, a priori) search for optimal hyper-parameters for each instance, i.e., for the smallest amount of memory and the smallest maximal rank k (as a multiple of five) that yield a result. Guessing parameters via such an "oracle" is time-consuming and unrealistic. (2) We investigate the performance of the one-shot algorithm with the hyper-parameters fixed to two memory states and k = 30; these parameters provide results for most benchmarks.

*Benchmarks.* Our benchmarks involve agents operating in N×N grids, inspired by, e.g., [12,15,20,50,51]. See Fig. 2 for video stills from simulating the following benchmarks. *Rocks* is a variant of *rock sample*. The grid contains two rocks which are either valuable or dangerous to collect. To find out with certainty, a rock has to be sampled from an adjacent field. The goal is to collect a valuable rock, bring it to the drop-off zone, and not collect dangerous rocks. *Refuel* concerns a rover that shall travel from one corner to the other while avoiding an obstacle on the diagonal. Every movement costs energy, and the rover may recharge at recharging stations to its full battery capacity E. It receives noisy information about its position and battery level. *Evade* is a scenario where a robot needs to reach a destination and evade a faster agent. The robot has a limited range of vision (R), but may scan the whole grid instead of moving. A certain safe area is accessible only to the robot. *Intercept* is inverse to *Evade* in the sense that the robot aims to meet an agent before it leaves the grid via one of two available exits. On top of the view radius, the robot observes a corridor in the center of the grid. *Avoid* is a related scenario where a robot shall keep its distance from patrolling agents that move with uncertain speed, yielding partial information about their position. The robot may exploit their predefined routes. *Obstacle* contains static obstacles among which the robot needs to reach the exit. Its initial state and movement are uncertain, and it only observes whether the current position is a trap or the exit.

*Results for Challenge 1.* Table 1 details the numerical benchmark results. For each benchmark instance (columns), we report the name and relevant characteristics: the number of states (|S|), the number of transitions (#Tr, the edges in the graph described by the POMDP), the number of observations (|Ω|), and the number of belief-support states (|b|). For the incremental method, we provide the run time (Time, in seconds), the number of outer iterations (#Iter.) of Algorithm 3, the number of invocations of the SMT solver (#solve), and the approximate size of the winning region (|W|). We then report these numbers when searching for a policy that wins from the initial state. For the one-shot method, we provide the time for the optimal parameters (on the next line) – TOs reflect settings in which we did not find any suitable parameters – and the time for the preset parameters (2, 30), or N/A if no policy can be found with these parameters. Finally, for (belief-support) MDP model checking, we give only the run times.

The incremental algorithm finds winning policies for the initial state *without guessing parameters* and is often *faster* than the one-shot approach with an


**Table 1.** Numerical results towards solving **Problem 1** and **Problem 3**.

oracle providing optimal parameters, and significantly faster than the one-shot approach with reasonably fixed parameters. In detail, *Rocks* shows that we can handle large numbers of iterations, solver invocations, and winning regions. The incremental approach scales to larger models, see, e.g., *Avoid*. *Refuel* shows a large sensitivity of the one-shot method to the lookahead (going from k = 15 to k = 30 increases the runtime), while *Evade* shows sensitivity to memory (going from one to two memory states). In contrast, the incremental approach does not rely on user input, yet delivers comparable performance on *Refuel* and *Avoid*. It suffers slightly on *Evade*, where the one-shot approach has reduced overhead. We furthermore conclude that off-the-shelf MDP model checking is not a fast alternative. Its advantage is the guarantee of finding the maximal winning region; however, for our benchmarks, the maximal winning regions (empirically) coincide with the results of the incremental fixpoint approach.

*Results for Challenge 2.* Winning regions obtained from running the incremental algorithm to a fixpoint are significantly larger than those obtained when running only until an initial winning policy is found (cf. the table), but this requires extra computational effort.

If a *shielded agent* moves randomly through the grid-worlds, the larger winning regions indeed induce more permissiveness, that is, freedom to move for the agent (cf. the videos, Fig. 2). This observation can also be quantified. In Table 2, we compare the two different types of shields. For both, we give average and standard deviation over permissiveness over 250 paths. We choose to approximate permissiveness along a path as the number of cumulative actions allowed by the permissive scheduler along a path, divided by the number of cumulative actions available in the POMDP along that path. As the shield is correct by construction, each run indeed never visits avoid states and eventually reaches the target (albeit after many steps). This statement is not true for the unshielded agents.
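
Our path-wise approximation of permissiveness can be sketched directly (hypothetical names):

```python
def permissiveness(path, shield_allowed, available):
    """Cumulative number of shield-allowed actions along a path, divided by
    the cumulative number of available actions along that path."""
    allowed = sum(len(shield_allowed[b]) for b in path)
    total = sum(len(available[b]) for b in path)
    return allowed / total

path = ["b0", "b1"]                              # visited belief supports
shield_allowed = {"b0": {"a"}, "b1": {"a", "b"}}
available = {"b0": {"a", "b"}, "b1": {"a", "b"}}
ratio = permissiveness(path, shield_allowed, available)  # (1 + 2) / (2 + 2)
```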


**Table 2.** Quantification of permissiveness using fraction of allowed actions.

# **8 Conclusion**

We provided an incremental approach for finding POMDP policies that satisfy almost-sure reachability specifications. Its superior scalability is demonstrated on a range of benchmarks. Furthermore, the approach allows us to shield agents in POMDPs, guaranteeing that any exploration of an environment satisfies the specification without needlessly restricting the freedom of the agent. We plan to investigate a tight interaction with state-of-the-art reinforcement learning and with quantitative verification of POMDPs. For the latter, we expect that an explicit approach to model checking the belief-support MDP can be feasible.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Rigorous Roundoff Error Analysis of Probabilistic Floating-Point Computations**

George Constantinides<sup>1</sup>, Fredrik Dahlqvist<sup>1,2</sup>, Zvonimir Rakamarić<sup>3</sup>, and Rocco Salvia<sup>3</sup>(B)

> <sup>1</sup> Imperial College London, London, UK g.constantinides@ic.ac.uk <sup>2</sup> University College London, London, UK f.dahlqvist@ucl.ac.uk <sup>3</sup> University of Utah, Salt Lake City, USA {zvonimir,rocco}@cs.utah.edu

**Abstract.** We present a detailed study of roundoff errors in probabilistic floating-point computations. We derive closed-form expressions for the distribution of roundoff errors associated with a random variable, and we prove that roundoff errors are generally close to being uncorrelated with their generating distribution. Based on these theoretical advances, we propose a model of IEEE floating-point arithmetic for numerical expressions with probabilistic inputs and an algorithm for evaluating this model. Our algorithm provides rigorous bounds to the output and error distributions of arithmetic expressions over random variables, evaluated in the presence of roundoff errors. It keeps track of complex dependencies between random variables using an SMT solver, and is capable of providing sound but tight probabilistic bounds to roundoff errors using symbolic affine arithmetic. We implemented the algorithm in the PAF tool, and evaluated it on FPBench, a standard benchmark suite for the analysis of roundoff errors. Our evaluation shows that PAF computes tighter bounds than current state-of-the-art on almost all benchmarks.

# **1 Introduction**

There are two common sources of randomness in a numerical computation (a straight-line program). First, the computation might be using inherently noisy data, for example from analog sensors in cyber-physical systems such as robots, autonomous vehicles, and drones. A prime example is data from GPS sensors, whose error distribution can be described very precisely [2] and which we study in some detail in Sect. 2. Second, the computation itself might sample from random number generators. Such probabilistic numerical routines, known as Monte-Carlo methods, are used in a wide variety of tasks, such as integration [34,42], optimization [43], finance [25], fluid dynamics [32], and computer graphics [30]. We call numerical computations whose input values are sampled from some probability distributions *probabilistic computations*.

Supported in part by the National Science Foundation awards CCF 1552975 and 1704715, the Engineering and Physical Sciences Research Council (EP/P010040/1), and the Leverhulme Project Grant "Verification of Machine Learning Algorithms".

Probabilistic computations are typically implemented using floating-point arithmetic, which leads to roundoff errors being introduced in the computation. To strike the right balance between performance and energy consumption on the one hand and the quality of the computed result on the other, expert programmers rely on either a manual or an automated floating-point error analysis to guide their design decisions. However, the current state-of-the-art approaches in this space have primarily focused on *worst-case* roundoff error analysis of *deterministic* computations. So what can we say about floating-point roundoff errors in a probabilistic context? Is it possible to probabilistically quantify them by computing confidence intervals? Can we, for example, say with 99% confidence that the roundoff error of the computed result is smaller than some chosen constant? What is the distribution of outputs when roundoff errors are taken into account? In this paper, we explore these and similar questions. To answer them, we propose a rigorous – that is to say *sound* – approach to quantifying roundoff errors in probabilistic computations. Based on this approach, we develop an automatic tool that efficiently computes an overapproximate probabilistic profile of roundoff errors.

As an example, consider the floating-point arithmetic expression $(X + Y) \div Y$, where $X$ and $Y$ are random inputs represented by independent random variables. In Sect. 4, we first show how the computation in *finite-precision* of a single arithmetic operation such as $X + Y$ can be modeled as $(X + Y)(1 + \varepsilon)$, where $\varepsilon$ is also a random variable. We then show how this random variable can be computed from first principles and why it makes sense to view $(X + Y)$ and $(1 + \varepsilon)$ as independent expressions, which in turn allows us to easily compute the distribution of $(X + Y)(1 + \varepsilon)$. The distribution of $\varepsilon$ depends on that of $X + Y$, and we therefore need to evaluate arithmetic operations between random variables. When the operands are independent – as in $X + Y$ – this is standard [48], but when the operands are dependent – as in the case of the division in $(X + Y) \div Y$ – this is a hard problem. To solve it, we adopt and improve a technique for soundly bounding these distributions described in [3]. Our improvement comes from the use of an SMT solver to reason about the dependency between $(X + Y)$ and $Y$ and remove regions of the state-space with zero probability. We describe this in Sect. 6.

We can thus soundly bound the output distribution of any probabilistic computation, such as $(X + Y) \div Y$, performed in floating-point arithmetic. This gives us the ability to perform *probabilistic range analysis* and prove rigorous assertions like: 99% of the outputs of a floating-point computation are smaller than a given constant bound. In order to perform *probabilistic roundoff error analysis*, we develop *symbolic affine arithmetic* in Sect. 5. This technique is combined with probabilistic range analysis to compute *conditional roundoff errors*. Specifically, we over-approximate the maximal error conditioned on the output landing in the 99% range computed by the probabilistic range analysis, meaning conditioned on the computation not returning an outlier.

We implemented our model and algorithms in a tool called PAF (for Probabilistic Analysis of Floating-point errors). We evaluated PAF on the standard floating-point benchmark suite FPBench [11], and compared its range and error analysis with the worst-case roundoff error analyzer FPTaylor [46,47] and the probabilistic roundoff error analyzer PrAn [36]. We present the results in Sect. 7, and show that FPTaylor's worst-case analysis is often overly pessimistic in the probabilistic setting, while PAF also generates tighter probabilistic error bounds than PrAn on almost all benchmarks.

We summarize our contributions as follows:


# **2 Motivating Example**

GPS sensors are inherently noisy. Bornholt [1] shows that the conditional probability of the true coordinates given a GPS reading is distributed according to a Rayleigh distribution. Interestingly, since the density of any Rayleigh distribution is always zero at x = 0, it is extremely unlikely that the true coordinates lie in a small neighborhood of those given by the GPS reading. This leads to errors, and hence the sensed coordinates should be corrected by adding a probabilistic error term which, on average, shifts the observed coordinates into an area of high probability for the true coordinates [1,2]. The latitude correction is given by:

$$\text{TrueLat} = \text{GPSLat} + \left( (\text{radius} \ast \sin(\text{angle})) \ast \text{DPERM} \right), \tag{1}$$

where radius is Rayleigh distributed, angle uniformly distributed, GPSLat is the latitude, and DPERM a constant for converting meters into degrees.

A developer trying to strike the right balance between resources, such as energy consumption or execution time, versus the accuracy of the computation, might want to run a rigorous worst-case floating-point analysis tool to determine which floating-point format is accurate enough to process GPS signals. This is mandatory if the developer requires rigorous error bounds holding with 100% certainty. The problem when analyzing a piece of code involving (1) is that the Rayleigh distribution has [0,∞) as its support, and *any* worst-case roundoff error analysis will return an infinite error bound in this situation. To get a meaningful (numeric) error bound, we need to truncate the support of the distribution. The most conservative truncation is [0, *max* ], where *max* is the largest representable number (not causing an overflow) at the target floating-point precision format.
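To make the setup concrete, here is a small Monte Carlo sketch of (1); this is not the PAF analysis, which is sound rather than sampling-based. The DPERM value, the Rayleigh scale, and the Greenwich latitude are illustrative assumptions, and binary32 is simulated by rounding every intermediate result through `struct`:

```python
import math
import random
import struct

random.seed(0)

DPERM = 1.0 / 111_320.0   # metres-to-degrees factor (illustrative value)
GPS_LAT = 51.476852       # latitude of the Greenwich observatory (approximate)
SIGMA = 4.0               # Rayleigh scale of the GPS noise, in metres (assumed)

def to_f32(x):
    # Round a binary64 value to the nearest binary32.
    return struct.unpack('f', struct.pack('f', x))[0]

def eq1(radius, angle, rnd):
    # Eq. (1), with every intermediate result rounded by `rnd`.
    term = rnd(rnd(rnd(radius) * rnd(math.sin(rnd(angle)))) * rnd(DPERM))
    return rnd(rnd(GPS_LAT) + term)

errors = []
for _ in range(50_000):
    radius = SIGMA * math.sqrt(-2.0 * math.log(1.0 - random.random()))  # Rayleigh(SIGMA)
    angle = random.uniform(0.0, 2.0 * math.pi)
    exact = eq1(radius, angle, lambda x: x)   # binary64 reference
    single = eq1(radius, angle, to_f32)       # simulated binary32
    errors.append(abs(single - exact))

errors.sort()
print("max abs error (deg):   ", errors[-1])
print("99.9th pct error (deg):", errors[int(0.999 * len(errors))])
```

Sampling of course only estimates the error profile; the point of the paper is to bound it soundly.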

<sup>1</sup> PAF is open source and publicly available at https://github.com/soarlab/paf.


**Table 1.** Roundoff error analysis for the probabilistic latitude correction of (1).

In Table 1, we report a detailed roundoff error analysis of (1) implemented in IEEE 754 double-, single-, and half-precision formats, with GPSLat set to the latitude of the Greenwich observatory. With each floating-point format, we associate the range [0, *max* ] of the truncated Rayleigh distribution. We compute worst-case roundoff error bounds for (1) with the state-of-the-art error analyzer FPTaylor [47] and with our tool PAF by setting the confidence interval to 100%. As expected, the error bounds from the two tools are identical. Finally, we compute the 99.9999% *conditional roundoff error* using PAF. This value is an upper bound to the roundoff error *conditioned* on the computation having landed in an interval capturing 99.9999% of all possible outputs. Column Absolute gives the error in degrees and column Meters gives it in meters (1° ≈ 111 km).

By looking at the results obtained without our *probabilistic error analysis* (columns FPTaylor and PAF 100%), the developer might *erroneously* conclude that the half-precision format is the most appropriate to implement (1) because it results in the smallest error bound. However, with the information provided by the 99.9999% *conditional roundoff error*, the developer can see that the *average* error is many orders of magnitude smaller than the worst-case scenarios. Armed with this information, the developer can conclude that, with a roundoff error of roughly 40 cm (4.1e−1 m) when correcting 99.9999% of GPS latitude readings, working in single-precision is an adequate compromise between efficiency and accuracy of the computation.

This motivates the innovative concept of *probabilistic precision tuning*, evolved from standard worst-case precision tuning [5,12], to determine which floating-point format is the most appropriate for a given computation. As an example, let us do a probabilistic precision tuning exercise for the latitude correction computation of (1). We truncate the Rayleigh distribution to the interval [0, 10<sup>307</sup>], and assume we can tolerate up to 1e−5 roundoff error (roughly 1 m). First, we manually perform worst-case precision tuning using FPTaylor to determine that the minimal floating-point format not violating the given error bound needs 1022 mantissa and 11 exponent bits. Such a large custom format is prohibitively expensive, in particular for devices performing frequent GPS readings, like smartphones or smartwatches. Conversely, when we manually perform probabilistic precision tuning using PAF with the confidence interval set to 99.9999%, we determine that we need only 22 mantissa and 11 exponent bits. Thanks to PAF, the developer can provide a custom confidence interval of interest to the probabilistic precision tuning routine to adjust for extremely unlikely corner cases like the ones we described for (1), and ultimately obtain better tuning results.

# **3 Preliminaries**

#### **3.1 Floating-Point Arithmetic**

Given a *precision* $p \in \mathbb{N}$ and an *exponent range* $[e\_{min}, e\_{max}] \triangleq \{n \mid n \in \mathbb{N} \wedge e\_{min} \le n \le e\_{max}\}$, we define $\mathbb{F}(p, e\_{min}, e\_{max})$, or simply $\mathbb{F}$ if there is no ambiguity, as the set of extended real numbers

$$\mathbb{F} \triangleq \left\{ (-1)^s 2^e \left( 1 + \frac{k}{2^p} \right) \, \middle| \, s \in \{0, 1\}, e \in [e\_{min}, e\_{max}], 0 \le k < 2^p \right\} \cup \{ -\infty, 0, \infty \}.$$

Elements $z = z(s, e, k) \in \mathbb{F}$ will be called *floating-point representable numbers* (for the given precision $p$ and exponent range $[e\_{min}, e\_{max}]$) and we will use the variable $z$ to represent them. The variable $s$ will be called the *sign*, the variable $e$ the *exponent*, and the variable $k$ the *significand* of $z(s, e, k)$.

Next, we introduce a *rounding map* $\text{Round} : \overline{\mathbb{R}} \to \mathbb{F}$ that rounds to nearest (or to $-\infty$/$\infty$ for values smaller/greater than the smallest/largest finite element of $\mathbb{F}$) and follows any of the IEEE 754 rounding modes in case of a tie. We will not worry about which choice is made, since the set of mid-points will always have probability zero for the distributions we will be working with. All choices are thus equivalent, probabilistically speaking, and what happens in a tie can therefore be left unspecified. We will denote the extended real line by $\overline{\mathbb{R}} \triangleq \mathbb{R} \cup \{-\infty, \infty\}$. The (signed) *absolute error function* $\text{err}\_{\text{abs}} : \overline{\mathbb{R}} \to \overline{\mathbb{R}}$ is defined as $\text{err}\_{\text{abs}}(x) = x - \text{Round}(x)$. We define the sets $\lfloor z \rceil \triangleq \{y \in \overline{\mathbb{R}} \mid \text{Round}(y) = \text{Round}(z)\}$; thus if $z \in \mathbb{F}$, then $\lfloor z \rceil$ is the collection of all reals rounding to $z$. As the reader will see, the basic result of Sect. 4 (Eq. (5)) is expressed entirely using the notation $\lfloor z \rceil$, which is parametric in the choice of the Round function. It follows that our results apply to rounding modes other than round-to-nearest with minimal changes. The *relative error function* $\text{err}\_{\text{rel}} : \overline{\mathbb{R}} \setminus \{0\} \to \overline{\mathbb{R}}$ is defined by

$$\text{err}\_{\text{rel}}(x) = \frac{x - \text{Round}(x)}{x}.$$

Note that $\text{err}\_{\text{rel}}(x) = 1$ on $\lfloor 0 \rceil \setminus \{0\}$, $\text{err}\_{\text{rel}}(x) = \infty$ on $\lfloor -\infty \rceil$, and $\text{err}\_{\text{rel}}(x) = -\infty$ on $\lfloor \infty \rceil$. Recall also the fact [26] that $-2^{-(p+1)} < \text{err}\_{\text{rel}}(x) < 2^{-(p+1)}$ outside of $\lfloor 0 \rceil \cup \lfloor -\infty \rceil \cup \lfloor \infty \rceil$. The quantity $2^{-(p+1)}$ is usually called the *unit roundoff* and will be denoted by $u$.
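This bound is easy to check empirically. A minimal sanity-check sketch (not from the paper), simulating binary32 rounding ($p = 23$, so $u = 2^{-24}$) via the `struct` module:

```python
import random
import struct

random.seed(1)

P = 23                # binary32 significand bits (excluding the hidden bit)
U = 2.0 ** -(P + 1)   # unit roundoff u = 2^-24

def round_f32(x):
    # IEEE 754 round-to-nearest from binary64 to binary32.
    return struct.unpack('f', struct.pack('f', x))[0]

def err_rel(x):
    return (x - round_f32(x)) / x

# Outside the rounding intervals of 0 and +/-infinity, the relative
# error is strictly bounded by u.
for _ in range(100_000):
    x = random.uniform(-1e30, 1e30)
    if x != 0.0:
        assert abs(err_rel(x)) < U
print("all relative errors below u =", U)
```

The sampled range deliberately avoids the subnormal region, which the paper leaves to future work.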

For $z\_1, z\_2 \in \mathbb{F}$ and $\text{op} \in \{+, -, \times, \div\}$ an (infinite-precision) arithmetic operation, the traditional model of IEEE 754 floating-point arithmetic [26,39] states that the finite-precision implementation $\text{op}\_m$ of op must satisfy

$$z\_1 \text{ op}\_m \ z\_2 = (z\_1 \text{ op } z\_2)(1+\delta) \qquad |\delta| \le u.\tag{2}$$

We leave dealing with subnormal floating-point numbers to future work. The model given by Eq. (2) stipulates that the implementation of an arithmetic operation can induce a relative error of magnitude *at most* $u$. The exact size of the error is, however, not specified, and Eq. (2) is therefore a *non-deterministic model of computation*. It follows that numerical analyses based on Eq. (2) must consider *all* possible relative errors $\delta$ and are fundamentally *worst-case* analyses. Since the output of such a program might be the input of another, one should also consider non-deterministic inputs, and this is indeed what happens with automated tools for roundoff error analysis, such as Daisy [12] or FPTaylor [46,47], which require for each variable of the program a (bounded) range of possible values in order to perform a worst-case analysis (*cf.* the GPS example in Sect. 2).

In this paper, we study a model formally similar to Eq. (2), namely

$$z\_1 \text{ op}\_m \ z\_2 = (z\_1 \text{ op } z\_2)(1+\delta) \qquad \delta \sim dist. \tag{3}$$

The difference is that δ is now *distributed according to* dist, a probability distribution whose support is [−u, u]. In other words, we move from a non-deterministic to a *probabilistic* model of roundoff errors. This is similar to the 'Monte Carlo arithmetic' of [41], but whilst *op. cit. postulates* that dist is the uniform distribution on [−u, u], we compute dist from first principles in Sect. 4.

#### **3.2 Probability Theory**

To fix the notation and be self-contained, we present some basic notions of probability theory which are essential to what follows.

**Cumulative Distribution Functions and Probability Density Functions.** We assume that the reader is (at least intuitively) familiar with the notion of a (real) random variable. Given a random variable $X$ we define its Cumulative Distribution Function (CDF) as the function $c(t) \triangleq \mathbb{P}\left[X \le t\right]$. If there exists a non-negative integrable function $d : \mathbb{R} \to \mathbb{R}$ such that

$$c(t) \triangleq \mathbb{P}\left[X \le t\right] = \int\_{-\infty}^{t} d(s) \, ds,$$

then we call $d(t)$ the Probability Density Function (PDF) of $X$. If it exists, the PDF can be recovered from the CDF by differentiation, $d(t) = \frac{\partial}{\partial t} c(t)$, by the fundamental theorem of calculus.

Not all random variables have a PDF: consider the random variable which takes value 0 with probability 1/2 and value 1 with probability 1/2. For this random variable it is impossible to write $\mathbb{P}\left[X \le t\right] = \int\_{-\infty}^{t} d(s)\, ds$. Instead, we will write the distribution of such a variable using the so-called Dirac delta measures at 0 and 1 as $\frac{1}{2}\delta\_0 + \frac{1}{2}\delta\_1$. It is possible for a random variable to have a PDF covering part of its distribution – its *continuous part* – and a sum of Dirac deltas covering the rest of its distribution – its *discrete part*. We will encounter examples of such random variables in Sect. 4. Finally, if $X$ is a random variable and $f : \mathbb{R} \to \mathbb{R}$ is a measurable function, then $f(X)$ is a random variable. In particular, $\text{err}\_{\text{rel}}(X)$ is a random variable, which we will describe in Sect. 4.

**Arithmetic on Random Variables.** Suppose $X, Y$ are *independent* random variables with PDFs $f\_X$ and $f\_Y$, respectively. Using the arithmetic operations we can form new random variables $X + Y$, $X - Y$, $X \times Y$, $X \div Y$. The PDFs of these new random variables can be expressed as operations on $f\_X$ and $f\_Y$, which can be found in [48]. It is important to note that these operations are only valid if $X$ and $Y$ are assumed to be independent. When an arithmetic expression containing variable repetitions is given a random variable interpretation, this independence can no longer be assumed. In the expression $(X + Y) \div Y$, the sub-term $(X + Y)$ can be interpreted by the formulas of [48] if $X, Y$ are independent. However, the sub-terms $X + Y$ and $Y$ cannot be interpreted in this way since $X + Y$ and $Y$ are clearly not independent random variables.
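For independent operands, the PDF of the sum is the convolution of the two PDFs. A discretised sketch of this standard construction (the grid step and the two uniform distributions are arbitrary choices for illustration):

```python
# PDF of X + Y for independent X ~ Unif(1,2) and Y ~ Unif(3,5), via the
# convolution f_{X+Y}(t) = integral of f_X(x) f_Y(t - x) dx, discretised
# on a uniform grid of step H covering [0, 10].
H = 0.01
N = 1001
xs = [i * H for i in range(N)]

def unif_pdf(a, b):
    return [1.0 / (b - a) if a <= x <= b else 0.0 for x in xs]

fX = unif_pdf(1.0, 2.0)
fY = unif_pdf(3.0, 5.0)

fSum = [sum(fX[i] * fY[k - i] for i in range(max(0, k - N + 1), min(k, N - 1) + 1)) * H
        for k in range(2 * N - 1)]

mass = sum(fSum) * H
mean = sum((k * H) * fSum[k] for k in range(len(fSum))) * H / mass
print("total mass ~", mass)   # close to 1, up to discretisation error
print("mean ~", mean)         # close to E[X] + E[Y] = 5.5
```

This construction is exactly what breaks down for dependent operands such as $X + Y$ and $Y$.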

**Soundly Bounding Probabilities.** The constraint that the distribution of a random variable must integrate to 1 makes it impossible to order random variables in the 'natural' way: if $\mathbb{P}\left[X \in A\right] \le \mathbb{P}\left[Y \in A\right]$, then $\mathbb{P}\left[Y \in A^c\right] \le \mathbb{P}\left[X \in A^c\right]$, so we cannot declare that $X \le Y$ whenever $\mathbb{P}\left[X \in A\right] \le \mathbb{P}\left[Y \in A\right]$ for all measurable sets $A$. This means that we cannot quantify our probabilistic uncertainty about a random variable by sandwiching it between two other random variables as one would do with reals or real-valued functions. One solution is to restrict the sets used in the comparison, i.e., declare that $X \le Y$ iff $\mathbb{P}\left[X \in A\right] \le \mathbb{P}\left[Y \in A\right]$ for $A$ ranging over a given set of 'test subsets'. Such an order can be defined by taking as 'test subsets' the intervals $(-\infty, x]$ [44]. This order is called the *stochastic order*. It follows from the definition of the CDF that this order can equivalently be defined by saying that $X \le Y$ iff $c\_X \le c\_Y$, where $c\_X$ and $c\_Y$ are the CDFs of $X$ and $Y$, respectively. If it is possible to sandwich an unknown random variable $X$ between known lower and upper bounds $X\_{lower} \le X \le X\_{upper}$ in the stochastic order, then it becomes possible to give sound bounds to the quantities $\mathbb{P}\left[X \in [a, b]\right]$ via

$$\mathbb{P}\left[X \in [a, b]\right] = c\_X(b) - c\_X(a) \le c\_{X\_{upper}}(b) - c\_{X\_{lower}}(a)$$

**P-Boxes and DS-Structures.** As mentioned above, giving a random variable interpretation to an arithmetic expression containing variable repetitions cannot be done using the arithmetic of [48]. In fact, these interpretations are in general analytically intractable. Hence, a common approach is to give up on soundness and approximate such distributions using Monte-Carlo simulations. We use this approach in our experiments to assess the quality of our sound results. However, we will also provide sound under- and over-approximations of the distribution of arithmetic expressions over random variables using the stochastic order discussed above. Since $X\_{lower} \le X \le X\_{upper}$ is equivalent to saying that $c\_{X\_{lower}}(x) \le c\_X(x) \le c\_{X\_{upper}}(x)$, the fundamental approximating structure will be a pair of CDFs satisfying $c\_1(x) \le c\_2(x)$. Such a structure is known in the literature as a *p-box* [19], and has already been used in the context of probabilistic roundoff errors in related work [3,36]. The data of a p-box is equivalent to a pair of sandwiching distributions for the stochastic order.
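The sound bounding above is mechanical once the two CDF envelopes are known. A small illustrative sketch (the envelope CDFs are made up for the example):

```python
def prob_bounds(c_lo, c_hi, a, b):
    # Sound bounds on P[X in [a, b]] from CDF envelopes c_lo <= c_X <= c_hi.
    upper = min(1.0, c_hi(b) - c_lo(a))
    lower = max(0.0, c_lo(b) - c_hi(a))
    return lower, upper

def unif_cdf(lo, hi):
    # CDF of Unif(lo, hi)
    return lambda t: min(1.0, max(0.0, (t - lo) / (hi - lo)))

# A toy p-box: the unknown CDF lies between two shifted uniform CDFs.
c_lo = unif_cdf(0.5, 2.5)    # lower CDF envelope
c_hi = unif_cdf(0.0, 2.0)    # upper CDF envelope

print(prob_bounds(c_lo, c_hi, 0.5, 1.5))   # (0.25, 0.75)
```

Any distribution whose CDF lies between the two envelopes, e.g. Unif(0.25, 2.25) with $\mathbb{P}\left[X \in [0.5, 1.5]\right] = 0.5$, is covered by the returned interval.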

A *Dempster-Shafer structure* (DS-structure) of size $N$ is a collection (i.e., set) of interval-probability pairs $\{([x\_0, y\_0], p\_0), ([x\_1, y\_1], p\_1), \ldots, ([x\_N, y\_N], p\_N)\}$ where $\sum\_{i=0}^{N} p\_i = 1$. The intervals in the collection might overlap. One can always convert a DS-structure to a p-box and back again [19], but arithmetic operations are much easier to perform on DS-structures than on p-boxes [3], which is why we will use DS-structures in the algorithm described in Sect. 6.
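A sketch of the two operations we rely on, assuming *independent* operands (dependent operands require the SMT-assisted treatment of Sect. 6): pairwise combination of focal elements for addition, and conversion of a DS-structure to p-box envelopes. The focal elements below are arbitrary:

```python
# Two DS-structures: lists of ((lo, hi), p) focal elements (values are arbitrary).
X = [((1.0, 2.0), 0.5), ((2.0, 3.0), 0.5)]
Y = [((0.0, 1.0), 0.25), ((1.0, 4.0), 0.75)]

def ds_add_independent(A, B):
    # Under independence, focal elements combine pairwise: the interval
    # sum [x1 + x2, y1 + y2] gets probability p1 * p2.
    return [((x1 + x2, y1 + y2), p1 * p2)
            for ((x1, y1), p1) in A for ((x2, y2), p2) in B]

Z = ds_add_independent(X, Y)
assert abs(sum(p for _, p in Z) - 1.0) < 1e-12   # probabilities still sum to 1

def pbox_from_ds(ds, t):
    # P-box envelopes of the CDF at t: an interval certainly lies at or
    # below t when hi <= t, and possibly does so when lo <= t.
    lower = sum(p for (lo, hi), p in ds if hi <= t)
    upper = sum(p for (lo, hi), p in ds if lo <= t)
    return lower, upper

print(pbox_from_ds(Z, 4.0))
```

The size of the result grows multiplicatively, which is why implementations typically re-merge focal elements after each operation.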

# **4 Distribution of Floating-Point Roundoff Errors**

Our tool PAF computes *probabilistic* roundoff errors by conditioning the maximization of the symbolic affine form (presented in Sect. 5) on the output of the computation landing in a given confidence interval. The purpose of this section is to provide the necessary probabilistic tools to compute these intervals. In other words, this section provides the foundations of *probabilistic range analysis*. All proofs can be found in the extended version [7].

#### **4.1 Derivation of the Distribution of Rounding Errors**

Recall the probabilistic model of Eq. (3), where op is an infinite-precision arithmetic operation and $\text{op}\_m$ its finite-precision implementation:

$$z\_1 \text{ op}\_m \ z\_2 = (z\_1 \text{ op } z\_2)(1+\delta) \qquad \delta \sim dist.$$

Let us also assume that $z\_1, z\_2$ are random variables with known distributions. Then $z\_1 \text{ op } z\_2$ is also a random variable which can (in principle) be computed. Since the IEEE 754 standard states that $z\_1 \text{ op}\_m z\_2$ is computed by rounding the infinite-precision operation $z\_1 \text{ op } z\_2$, it is a completely natural consequence of the standard to require that $\delta$ is simply given by

$$\delta = \text{err}\_{\text{rel}}(z\_1 \text{ op } z\_2)$$

Thus, dist is the distribution of the random variable $\text{err}\_{\text{rel}}(z\_1 \text{ op } z\_2)$. More generally, if $X$ is a random variable with known distribution, we will show how to compute the distribution dist of the random variable

$$\text{err}\_{\text{rel}}(X) = \frac{X - \text{Round}(X)}{X}.$$

We choose to express the distribution dist of relative errors *in multiples of the unit roundoff* u. This choice is arbitrary, but it allows us to work with a distribution on the conceptually and numerically convenient interval [−1, 1], since the absolute value of the relative error is strictly bounded by u (see Sect. 3.1), rather than the interval [−u, u].

To compute the density function of dist, we proceed as described in Sect. 3.2 by first computing the CDF $c(t)$ and then taking its derivative. Recall first from Sect. 3.1 that $\text{err}\_{\text{rel}}(x) = 1$ if $x \in \lfloor 0 \rceil \setminus \{0\}$, $\text{err}\_{\text{rel}}(x) = \infty$ if $x \in \lfloor -\infty \rceil$, $\text{err}\_{\text{rel}}(x) = -\infty$ if $x \in \lfloor \infty \rceil$, and $-u \le \text{err}\_{\text{rel}}(x) \le u$ elsewhere. Thus:

$$\mathbb{P}\left[\text{err}\_{\text{rel}}(X) = -\infty\right] = \mathbb{P}\left[X \in \lfloor \infty \rceil\right] \qquad \mathbb{P}\left[\text{err}\_{\text{rel}}(X) = \infty\right] = \mathbb{P}\left[X \in \lfloor -\infty \rceil\right] \qquad \mathbb{P}\left[\text{err}\_{\text{rel}}(X) = 1\right] = \mathbb{P}\left[X \in \lfloor 0 \rceil\right]$$

In other words, the probability measure corresponding to errrel has three discrete components at {−∞}, {1}, and {∞}, which cannot be accounted for by a PDF (see Sect. 3.2). It follows that the probability measure dist is given by

$$dist\_c + \mathbb{P}\left[X \in \lfloor 0 \rceil\right] \delta\_1 + \mathbb{P}\left[X \in \lfloor -\infty \rceil\right] \delta\_\infty + \mathbb{P}\left[X \in \lfloor \infty \rceil\right] \delta\_{-\infty} \tag{4}$$

**Fig. 1.** Theoretical vs. empirical error distribution, clockwise from top-left: (i) Eq. (5) for Unif(2, 4) with 3 bit exponent, 4 bit significand, (ii) Eq. (5) for Unif(2, 4) in half-precision, (iii) Eq. (6) for Unif(7, 8) in single-precision, (iv) Eq. (6) for Unif(4, 5) in single-precision, (v) Eq. (6) for Unif(4, 32) in single-precision, (vi) Eq. (6) for Norm(0, 1) in single-precision.

where $dist\_c$ is a continuous measure that is not quite a probability measure since its total mass is $1 - \mathbb{P}\left[X \in \lfloor 0 \rceil\right] - \mathbb{P}\left[X \in \lfloor -\infty \rceil\right] - \mathbb{P}\left[X \in \lfloor \infty \rceil\right]$. In general, $dist\_c$ integrates to 1 in machine precision since $\mathbb{P}\left[X \in \lfloor 0 \rceil\right]$ is of the order of the smallest positive floating-point representable number, and the PDF of $X$ rounds to 0 well before it reaches the smallest/largest floating-point representable number. However, in order to be sound, we must in general include these three discrete components in our computations. The density $dist\_c$ is given explicitly by the following result, whose proof can already be found in [9].

**Theorem 1.** *Let* $X$ *be a real random variable with PDF* $f$*. The continuous part* $dist\_c$ *of the distribution of* $\text{err}\_{\text{rel}}(X)$ *has a PDF given by*

$$d(t) = \sum\_{z \in \mathbb{F} \setminus \{ -\infty, 0, \infty \}} \mathbb{1}\_{\lfloor z \rceil} \left( \frac{z}{1 - tu} \right) f\left( \frac{z}{1 - tu} \right) \frac{u |z|}{(1 - tu)^2},\tag{5}$$

*where* $\mathbb{1}\_A(x)$ *is the indicator function, which returns 1 if* $x \in A$ *and 0 otherwise.*

Figure 1 (i) and (ii) show an implementation of Eq. (5) applied to the distribution Unif(2, 4), first in very low precision (3 bit exponent, 4 bit significand) and then in half-precision. The theoretical density is plotted alongside a histogram of the relative error incurred when rounding 100,000 samples to low precision (computed in double-precision). The reported statistic is the K-S (Kolmogorov-Smirnov) test, which measures the likelihood that a collection of samples were drawn from a given distribution. This test reports that we cannot reject the hypothesis that the samples are drawn from the corresponding density. Note how in low precision the term $\frac{1}{(1-tu)^2}$ induces a visible asymmetry on the central section of the distribution. This effect is much less pronounced in half-precision.

For low precisions, say up to half-precision, it is computationally feasible to explicitly go through all floating-point numbers and compute the density of the roundoff error distribution dist directly from Eq. (5). However, this rapidly becomes prohibitively computationally expensive for higher precisions (since the number of floating-point representable numbers grows exponentially).
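For illustration, Eq. (5) can be evaluated by direct enumeration in a toy format. The sketch below (not the PAF implementation) uses $p = 4$ significand bits, exponents in $[-3, 4]$, and positive representable numbers only (which suffices for $X \sim \text{Unif}(2, 4)$), and checks that the continuous part of the error distribution integrates to roughly 1:

```python
import bisect

P, EMIN, EMAX = 4, -3, 4
U = 2.0 ** -(P + 1)          # unit roundoff of the toy format

# Positive representable numbers of the toy format (negatives are not
# needed for an input distribution supported on [2, 4]).
F = sorted(2.0 ** e * (1 + k / 2.0 ** P)
           for e in range(EMIN, EMAX + 1) for k in range(2 ** P))

def rnd(x):
    # Round-to-nearest onto F (tie handling is irrelevant: probability zero).
    i = bisect.bisect_left(F, x)
    if i == 0:
        return F[0]
    if i == len(F):
        return F[-1]
    return F[i] if F[i] - x < x - F[i - 1] else F[i - 1]

def f(x):
    # PDF of Unif(2, 4)
    return 0.5 if 2.0 <= x <= 4.0 else 0.0

def d(t):
    # Eq. (5): density at t (relative error in multiples of u).
    denom = 1.0 - t * U
    total = 0.0
    for z in F:
        x = z / denom
        fx = f(x)
        if fx > 0.0 and rnd(x) == z:     # x lies in the rounding interval of z
            total += fx * U * abs(z) / denom ** 2
    return total

# Midpoint rule over [-1, 1]: the continuous part should integrate to about 1.
N = 4000
h = 2.0 / N
mass = sum(d(-1.0 + (i + 0.5) * h) for i in range(N)) * h
print("mass of the error density:", mass)
```

The sum over all of `F` is exactly the enumeration that becomes infeasible at higher precisions.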

#### **4.2 High-Precision Case**

As the working precision increases, a regime change occurs: on the one hand, it becomes practically impossible to enumerate all floating-point representable numbers as done in Eq. (5), but on the other hand, sufficiently well-behaved density functions are numerically close to being constant at the scale of an interval between two floating-point representable numbers. We exploit this smoothness to overcome the combinatorial limit imposed by Eq. (5).

**Theorem 2.** *Let* $X$ *be a real random variable with PDF* $f$*. The continuous part* $dist\_c$ *of the distribution of* $\text{err}\_{\text{rel}}(X)$ *has a PDF given by* $d\_c(t) = d\_{hp}(t) + R(t)$*, where* $d\_{hp}(t)$ *is the function on* $[-1, 1]$ *defined by*

$$d\_{hp}(t) = \begin{cases} \frac{1}{1-tu} \sum\_{s,e=e\_{min}+1}^{e\_{max}-1} \int\_{(-1)^s 2^e (1-u)}^{(-1)^s 2^e (2-u)} \frac{|x|}{2^{e+1}} f(x) \, dx & |t| \le \frac{1}{2} \\\\ \frac{1}{1-tu} \sum\_{s,e=e\_{min}+1}^{e\_{max}-1} \int\_{(-1)^s 2^e (1-u)}^{(-1)^s 2^e (\frac{1}{|t|} - u)} \frac{|x|}{2^{e+1}} f(x) \, dx & \frac{1}{2} < |t| \le 1 \end{cases} (6)$$

*and* $R(t)$ *is an error term whose total contribution* $|R| \triangleq \int\_{-1}^{1} |R(t)|\, dt$ *can be bounded by*

$$|R| \le \mathbb{P}\left[\text{Round}(X) = z(s, e\_{min}, k)\right] + \mathbb{P}\left[\text{Round}(X) = z(s, e\_{max}, k)\right] + \frac{3}{4} \left(\sum\_{s,\, e\_{min} < e < e\_{max}} \left| f'(\xi\_{e,s})\, \xi\_{e,s} + f(\xi\_{e,s}) \right| \frac{2^{2e}}{2^p}\right)$$

*where for each exponent* $e$ *and sign* $s$*,* $\xi\_{e,s}$ *is a point in* $[z(s, e, 0), z(s, e, 2^p - 1)]$ *if* $s = 0$ *and in* $[z(s, e, 2^p - 1), z(s, e, 0)]$ *if* $s = 1$*.*

Note how Eq. (6) reduces the sum over *all* floating-point representable numbers in Eq. (5) to a sum over *the exponents* by exploiting the regularity of $f$. Note also that since $f$ is a PDF, it usually decreases very quickly away from 0, and its derivative decreases even more quickly; $|R|$ thus tends to be very small, and $|R| \to 0$ as the precision $p \to \infty$.

Figure 1 shows Eq. (6) for: (i) the distribution Unif(7, 8) where large significands are more likely, (ii) the distribution Unif(4, 5) where small significands are more likely, (iii) the distribution Unif(4, 32) where significands are equally likely, and (iv) the distribution Norm(0, 1) with infinite support. The graphs show the density function given by Eq. (6) in single-precision versus a histogram of the relative error incurred when rounding 1,000,000 samples to single-precision (computed in double-precision). The K-S test reports that we cannot reject the hypothesis that the samples are drawn from the corresponding distributions.

#### **4.3 Typical Distribution**

The distributions depicted in graphs (ii), (v) and (vi) of Fig. 1 are very similar, despite being computed from very different input distributions. What they have in common is that their input distributions have the property that all significands in their supports are equally likely. We show that under this assumption, the distribution of roundoff errors given by Eq. (5) converges to a unique density as the precision increases, irrespective of the input distribution! Since significands are frequently equiprobable (this is the case for a third of our benchmarks), this density is of great practical importance. If one had to choose 'the' canonical distribution for roundoff errors, we claim it should be the density given below, and we therefore call it the *typical distribution*; we depict it in Fig. 2 and formalize it with the following theorem, which can mostly be found in [9].

**Fig. 2.** Typical distribution.

**Theorem 3.** *If* $X$ *is a random variable such that* $\mathbb{P}\left[\text{Round}(X) = z(s, e, k\_0)\right] = \frac{1}{2^p}$ *for any significand* $k\_0$*, then*

$$d\_{typ}(t) \triangleq \lim\_{p \to \infty} d(t) = \begin{cases} \frac{3}{4} & |t| \le \frac{1}{2} \\ \frac{1}{2} \left( \frac{1}{|t|} - 1 \right) + \frac{1}{4} \left( \frac{1}{|t|} - 1 \right)^2 & |t| > \frac{1}{2} \end{cases} \tag{7}$$

*where* d(t) *is the exact density given by Eq.* (5)*.*
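As a quick numerical check, the typical density of Eq. (7) integrates to 1; the sketch below writes the second case with $1/|t|$, which makes the density symmetric as in Fig. 2:

```python
def d_typ(t):
    # Eq. (7), written with 1/|t| so the density is symmetric (cf. Fig. 2).
    a = abs(t)
    if a > 1.0:
        return 0.0
    if a <= 0.5:
        return 0.75
    return 0.5 * (1.0 / a - 1.0) + 0.25 * (1.0 / a - 1.0) ** 2

# Midpoint rule over [-1, 1]: the typical density integrates to 1.
N = 200_000
h = 2.0 / N
mass = sum(d_typ(-1.0 + (i + 0.5) * h) for i in range(N)) * h
print(mass)
```

The integral can also be done by hand: the central plateau contributes 3/4, and the two tails together contribute exactly 1/4.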

#### **4.4 Covariance Structure**

The result above can be interpreted as saying that if X is such that all mantissas are equiprobable, then X and errrel(X) are asymptotically independent (as <sup>p</sup> → ∞). Much more generally, we now show that if a random variable <sup>X</sup> has a sufficiently regular PDF, it is close to being uncorrelated from errrel(X). Formally, we prove that the covariance

$$\operatorname{Cov}(X, \operatorname{err}\_{\operatorname{rel}}(X)) = \mathbb{E}\left[X \cdot \operatorname{err}\_{\operatorname{rel}}(X)\right] - \mathbb{E}\left[X\right]\mathbb{E}\left[\operatorname{err}\_{\operatorname{rel}}(X)\right] \tag{8}$$

is small, specifically of the order of u. Note that the expectation in the first summand above is taken w.r.t. the joint distribution of X and err<sub>rel</sub>(X).

The main technical obstacles to proving that the expression above is small are that E[err<sub>rel</sub>(X)] turns out to be difficult to compute (we only manage to bound it) and that the joint distribution P[X ∈ A ∧ err<sub>rel</sub>(X) ∈ B] does not have a PDF, since it is not continuous w.r.t. the Lebesgue measure on ℝ<sup>2</sup>. Indeed, it is supported by the graph of the function err<sub>rel</sub>, which has Lebesgue measure 0. This does not mean that it is impossible to compute the expectation

$$\mathbb{E}\left[X \cdot \text{err}\_{\text{rel}}(X)\right] = \int\_{\mathbb{R}^2} x\,t \; d\mathbb{P} \tag{9}$$

but it is necessary to use some more advanced probability theory. To keep the proof manageable, we make the simplifying assumption that the density of X is constant on each interval of reals that round to the same floating-point number z. In practice this is an extremely good approximation. Without this assumption, we would need to add an error term similar to that of Theorem 2 to the expression below. This is not conceptually difficult, but it is messy and would distract from the main aim of the following theorem, which is to bound E[err<sub>rel</sub>(X)], compute E[X · err<sub>rel</sub>(X)], and show that the covariance between X and err<sub>rel</sub>(X) is typically of the order of u.

**Theorem 4.** *If the density of* X *is piecewise constant on the intervals of reals rounding to each floating-point number* z*, then*

$$\left(L - \mathbb{E}\left[X\right] K \frac{u}{6}\right) \le \text{Cov}(X, \text{err}\_{\text{rel}}(X)) \le \left(L - \mathbb{E}\left[X\right] K \frac{4u}{3}\right).$$

*where*

$$L = \sum\_{s,e} f((-1)^s 2^e)\, (-1)^s 2^{2e}\, \frac{3u^2}{2} \qquad \text{and} \qquad K = \sum\_{s,\; e=e\_{min}+1}^{e\_{max}-1} \int\_{(-1)^s 2^e (1-u)}^{(-1)^s 2^e (2-u)} \frac{|x|}{2^{e+1}}\, f(x)\, dx.$$

If the distribution of X is centered (i.e., E[X] = 0) then L is the exact value of the covariance. It is worth noting that L is fundamentally an artifact of the floating-point representation: it stems from the fact that the rounding intervals around the powers of two 2<sup>e</sup> are not symmetric. More generally, for E[X] of the order of, say, 2, the covariance will be small (of the order of u) since K ≤ 1 (because |x| ≤ 2<sup>e+1</sup> in each summand). For very large values of E[X], there is a high chance that L is also very large, partially canceling E[X]. An illustration of this is given by the *doppler* benchmark examined in Sect. 7, an outlier in that it has an input variable with range [20, 20000]. Nevertheless, even for this benchmark the bounds of Theorem 4 still give a small covariance, of the order of 0.001.

#### **4.5 Error Terms and P-Boxes**

In low precision we can use the exact formula of Eq. (5) to compute the error distribution. In high precision, however, we must use (typically extremely good) approximations like Eqs. (6) and (7). To remain sound in the implementation of our model (see Sect. 6), we must account for the error introduced by these approximations. We do not have the space to discuss the error made by Eq. (7), but taking the term |R| of Theorem 2 as an illustration, we can use the notion of p-box described in Sect. 3.2 to create an object which soundly approximates the error distribution. We proceed as follows: since |R| bounds the total error accumulated over all t ∈ [−1, 1], we can soundly bound the CDF c(t) of the error distribution given by Eq. (6) using the p-box

$$c^-(t) = \max(0, c(t) - |R|) \qquad \text{and} \qquad c^+(t) = \min(1, c(t) + |R|)$$
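The clamping above is straightforward to implement. Below is a minimal sketch (the names are ours); `c` is an illustrative stand-in CDF, not the actual density of Eq. (6):

```python
# Sketch: building a sound p-box around an approximate error CDF c(t) by
# widening it with the approximation bound |R| from Theorem 2, exactly as
# in the two formulas above.

def make_pbox(c, R):
    """Return (c_minus, c_plus): sound lower and upper CDF bounds."""
    c_minus = lambda t: max(0.0, c(t) - abs(R))
    c_plus = lambda t: min(1.0, c(t) + abs(R))
    return c_minus, c_plus

# Illustrative CDF of a distribution on [-1, 1] (a stand-in, not Eq. (6)).
c = lambda t: min(1.0, max(0.0, (t + 1.0) / 2.0))
lo, hi = make_pbox(c, R=0.05)
print(lo(0.0), hi(0.0))  # 0.45 0.55
```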

# **5 Symbolic Affine Arithmetic**

In this section, we introduce *symbolic affine arithmetic*, which we employ to generate the symbolic form for the roundoff error that we use in Sect. 6.3. Affine arithmetic [6] is a model for range analysis that extends classic interval arithmetic [40] with information about linear correlations between operands. Symbolic affine arithmetic extends standard affine arithmetic by keeping the coefficients of the noise terms *symbolic*. We define a *symbolic affine form* as

$$
\hat{x} = x\_0 + \sum\_{i=1}^n x\_i \epsilon\_i, \qquad \text{where } \epsilon\_i \in [-1, 1]. \tag{10}
$$

We call x<sub>0</sub> the central symbol of the affine form, while the x<sub>i</sub> are the symbolic coefficients of the noise terms ε<sub>i</sub>. We can always convert a symbolic affine form to its corresponding interval representation. This can be done using interval arithmetic or, to avoid precision loss, using a global optimizer.

Affine operations between symbolic forms follow the usual rules, such as

$$
\alpha \hat{x} + \beta \hat{y} + \zeta = \alpha x\_0 + \beta y\_0 + \zeta + \sum\_{i=1}^{n} (\alpha x\_i + \beta y\_i) \epsilon\_i
$$

Non-linear operations cannot be represented exactly by an affine form. Hence, we approximate them as in standard affine arithmetic [49].
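A minimal (numeric rather than symbolic) sketch of the affine form of Eq. (10) and the affine operation above; in PAF the coefficients stay symbolic, so the floats here are placeholders:

```python
# Affine-arithmetic sketch of Eq. (10): a form x0 + sum_i x_i * eps_i with
# eps_i in [-1, 1], supporting alpha*x + beta*y + zeta and conversion to
# an interval. PAF keeps the coefficients symbolic; here they are floats.

class AffineForm:
    def __init__(self, center, coeffs=None):
        self.center = center              # x0
        self.coeffs = dict(coeffs or {})  # noise-term index i -> x_i

    def affine(self, alpha, other, beta, zeta=0.0):
        """alpha*self + beta*other + zeta (exact for affine operations)."""
        keys = set(self.coeffs) | set(other.coeffs)
        coeffs = {i: alpha * self.coeffs.get(i, 0.0)
                     + beta * other.coeffs.get(i, 0.0) for i in keys}
        return AffineForm(alpha * self.center + beta * other.center + zeta,
                          coeffs)

    def to_interval(self):
        """Interval concretization: each eps_i ranges over [-1, 1]."""
        radius = sum(abs(c) for c in self.coeffs.values())
        return (self.center - radius, self.center + radius)

x = AffineForm(1.0, {1: 0.5})           # 1 + 0.5*eps1
y = AffineForm(2.0, {1: -0.5, 2: 1.0})  # 2 - 0.5*eps1 + eps2
z = x.affine(1.0, y, 1.0)               # x + y: the eps1 terms cancel
print(z.to_interval())  # (2.0, 4.0)
```

Note how the shared noise term ε<sub>1</sub> cancels in the sum: this is exactly the correlation information that plain interval arithmetic (which would give [2.5, 4.5] here from [0.5, 1.5] + [0.5, 3.5]... widened per operand) cannot track.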

**Sound Error Analysis with Symbolic Affine Arithmetic.** We now show how the roundoff errors get propagated through the four arithmetic operations. We apply these propagation rules to an arithmetic expression to accurately keep track of the roundoff errors. Since the (absolute) roundoff error directly depends on the range of a computation, we describe range and error together as a pair (range: Symbol, err : Symbolic Affine Form). Here, range represents the infinite-precision range of the computation, while err is the symbolic affine form for the roundoff error in floating-point precision. Unary operators (e.g., rounding) take as input a (range, error form) pair, and return a new output pair; binary operators take as input two pairs, one per operand. For linear operators, the ranges and errors get propagated using the standard rules of affine arithmetic.

For the multiplication, we distribute each term in the first operand to every term in the second operand:

$$(\mathbf{x}, \widehat{err}\_x) \* (\mathbf{y}, \widehat{err}\_y) = (\mathbf{x} \* \mathbf{y},\ \mathbf{x} \* \widehat{err}\_y + \mathbf{y} \* \widehat{err}\_x + \widehat{err}\_x \* \widehat{err}\_y)$$

The output range is the product of the input ranges and the remaining terms contribute to the error. Only the last (quadratic) expression cannot be represented exactly in symbolic affine arithmetic; we bound such non-linearities using a global optimizer. The division is computed as the term-wise multiplication of the numerator with the inverse of the denominator. Hence, we need the inverse of the denominator error form, and then we can proceed as for multiplication. To compute the inverse, we leverage the symbolic expansion used in FPTaylor [46].

Finally, after every operation we apply the unary rounding operator from Eq. (2). The infinite-precision range is not affected by rounding. The rounding operator appends a fresh noise term to the symbolic error form. The coefficient for the new noise term is the (symbolic) floating-point range given by the sum of the input range with the input error form.
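The propagation rules above can be sketched with plain interval arithmetic standing in for the symbolic forms (all names are ours; `U` is the single-precision unit roundoff):

```python
# Sketch of the (range, error) propagation rules: multiplication
# distributes the operand errors, and the rounding step appends a fresh
# noise term whose weight is U times the floating-point range (input
# range + input error). Intervals stand in for the symbolic forms.

U = 2**-24  # unit roundoff, single precision

def imul(a, b):
    ps = [a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]]
    return (min(ps), max(ps))

def iadd(a, b):
    return (a[0] + b[0], a[1] + b[1])

def multiply(x, err_x, y, err_y):
    rng = imul(x, y)  # infinite-precision range of the product
    err = iadd(iadd(imul(x, err_y), imul(y, err_x)), imul(err_x, err_y))
    return rng, err

def round_op(rng, err):
    fp = iadd(rng, err)                      # floating-point range
    bound = U * max(abs(fp[0]), abs(fp[1]))  # fresh noise term's weight
    return rng, iadd(err, (-bound, bound))

rng, err = multiply((1.0, 2.0), (0.0, 0.0), (3.0, 4.0), (0.0, 0.0))
rng, err = round_op(rng, err)
print(rng, err)  # range (3.0, 8.0); |error| bounded by 8*U
```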

**Fig. 3.** Toolflow of PAF.

# **6 Algorithm and Implementation**

In this section, we describe our probabilistic model of floating-point arithmetic and how we implement it in a prototype named PAF (for Probabilistic Analysis of Floating-point errors). Figure 3 shows the toolflow of PAF.

#### **6.1 Probabilistic Model**

PAF takes as input a text file describing a probabilistic floating-point computation and its input distributions. The kinds of computations we support are captured with this simple grammar:

$$\mathbf{t} ::= \mathbf{z} \mid \mathbf{x}\_{\mathbf{i}} \mid \mathbf{t} \ \mathbf{op}\_{\mathfrak{m}} \mathbf{t} \qquad \mathbf{z} \in \mathbb{F},\ \mathbf{i} \in \mathbb{N},\ \mathbf{op}\_{\mathfrak{m}} \in \{+, -, \times, \div\}$$

Following [8,31], we interpret each computation t given by the grammar as a random variable. We define the interpretation map ⟦−⟧ over the computation tree inductively. The base cases are given by ⟦z(s, e, k)⟧ ≜ (−1)<sup>s</sup>2<sup>e</sup>(1 + k2<sup>−p</sup>) and ⟦x<sub>i</sub>⟧ ≜ X<sub>i</sub>, where the real numbers z(s, e, k) are understood as constant random variables and each X<sub>i</sub> is a random input variable with a user-specified distribution. Currently, PAF supports several well-known distributions out-of-the-box (e.g., uniform, normal, exponential), and the user can also define custom distributions as piecewise functions. For the inductive case t<sub>1</sub> op<sub>m</sub> t<sub>2</sub>, we put the lessons from Sect. 4 to work. Recall first the probabilistic model from Eq. (3):

$$x \text{ op}\_{\mathfrak{m}} y = (x \text{ op } y)(1 + \delta), \qquad \delta \sim dist$$

In Sect. 4.1, we showed that dist should be taken as the distribution of the actual roundoff errors of the random elements (x op y). We therefore define:

$$\llbracket \mathbf{t}\_1 \ \mathbf{op}\_{\mathfrak{m}} \ \mathbf{t}\_2 \rrbracket \triangleq \left( \llbracket \mathbf{t}\_1 \rrbracket \ \mathbf{op} \ \llbracket \mathbf{t}\_2 \rrbracket \right) \times \left( 1 + \text{err}\_{\text{rel}}\left( \llbracket \mathbf{t}\_1 \rrbracket \ \mathbf{op} \ \llbracket \mathbf{t}\_2 \rrbracket \right) \right) \tag{11}$$

To evaluate the model of Eq. (11), we first use the appropriate closed-form expression from Eqs. (5) to (7) derived in Sect. 4 to evaluate the distribution of the random variable err<sub>rel</sub>(⟦t<sub>1</sub>⟧ op ⟦t<sub>2</sub>⟧)—or the corresponding p-box as described in Sect. 4.5. We then use Theorem 4 to justify evaluating the multiplication operation in Eq. (11) *independently*—that is to say, by using [48]—since the roundoff process is very close to being uncorrelated with the process generating it. The validity of this assumption is also confirmed experimentally by the remarkable agreement of Monte Carlo simulations with this analytical model.
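The near-independence claim can also be checked empirically. The following self-contained sketch (ours) samples X, measures the actual relative error of rounding to single precision via a `struct` round-trip, and estimates the covariance of Eq. (8):

```python
# Empirical check of the near-independence that justifies evaluating the
# multiplication in Eq. (11) independently: sample X, measure the actual
# relative error of rounding to single precision, estimate Cov(X, err).

import random
import struct

def to_f32(x: float) -> float:
    """Round a double to the nearest single-precision float."""
    return struct.unpack('f', struct.pack('f', x))[0]

def rel_err(x: float) -> float:
    return (to_f32(x) - x) / x

random.seed(0)
xs = [random.uniform(1.0, 2.0) for _ in range(200_000)]
es = [rel_err(x) for x in xs]

mx = sum(xs) / len(xs)
me = sum(es) / len(es)
cov = sum((x - mx) * (e - me) for x, e in zip(xs, es)) / len(xs)
print(f"Cov(X, err_rel(X)) ≈ {cov:.2e}")  # of the order of u = 2^-24 or below
```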

We now introduce the algorithm for evaluating the model given in Eq. (11). The evaluation performs an in-order (LNR) traversal of the *Abstract Syntax Tree* (AST) of a computation given by our grammar, and it feeds the results to the parent level along the way. At each node, it computes the probabilistic range of the intermediate result using the probabilistic ranges computed for its children nodes (i.e., operands). We first determine whether the operands are independent or not (Ind? branch in the toolflow), and we either apply a cheaper (i.e., no SMT solver invocations) algorithm if they are independent (see below) or a more involved one (see Sect. 6.2) if they are not. We describe our methodology at a generic intermediate computation in the AST of the expression.

We consider two distributions X and Y discretized into DS-structures DS<sub>X</sub> and DS<sub>Y</sub> (Sect. 3.2), and we want to derive the DS-structure DS<sub>Z</sub> for Z = X op Y, op ∈ {+, −, ×, ÷}. Together with the DS-structures of the operands, we also need the traces trace<sub>X</sub> and trace<sub>Y</sub> containing the history of the operations performed so far, one for each operand. A trace is constructed at each leaf of the AST with the input distribution and its range. It is then propagated to the parent level and extended at each node with the current operation. Such history traces are critical when dealing with dependent operations, since they allow us to interrogate an SMT solver about the feasibility of the current operation, as we describe in the next section. When the operands are independent, we simply use the arithmetic operations on independent DS-structures [3].
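For the independent case, the arithmetic on DS-structures reduces to combining focal elements pairwise. A sketch for addition (ours), assuming a DS-structure is represented as a list of (interval, probability) pairs:

```python
# Combining two independent DS-structures for Z = X + Y: every pair of
# focal elements ([a,b], p) and ([c,d], q) yields the focal element
# ([a+c, b+d], p*q), by independence of the operands.

def ds_add(ds_x, ds_y):
    out = []
    for (lo1, hi1), p in ds_x:
        for (lo2, hi2), q in ds_y:
            out.append(((lo1 + lo2, hi1 + hi2), p * q))
    return out

ds_x = [((0.0, 0.5), 0.5), ((0.5, 1.0), 0.5)]  # X discretized in 2 intervals
ds_y = [((0.0, 1.0), 1.0)]                     # Y as a single interval
ds_z = ds_add(ds_x, ds_y)
print(ds_z)  # the probability masses still sum to 1
```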

#### **6.2 Computing Probabilistic Ranges for Dependent Operands**

When the operands are dependent, we start by assuming that the dependency is unknown. This assumption is sound because the actual dependency of the operation is included in the set of unknown dependencies, while the result of the operation is no longer a single distribution but a p-box. Due to this "unknown assumption", the CDFs of the output p-box are a very pessimistic over-approximation of the operation, i.e., they are far from each other. Our key insight is to use an SMT solver to prune infeasible combinations of intervals from the input DS-structures, which prunes regions of zero probability from the output p-box. This probabilistic pruning using a solver squeezes together the CDFs of the output p-box, often resulting in a much more accurate over-approximation. With the solver, we move from an unknown to a *partially known* dependency between the operands. Currently, PAF supports the Z3 [17] and dReal [23] SMT solvers.

**Algorithm 1.** Dependent Operation Z = X op Y

Algorithm 1 shows the pseudocode of our algorithm for computing the probabilistic output range (i.e., DS-structure) for dependent operands. When dealing with dependent operands, interval arithmetic (line 5) might not be as precise as in the independent case. Hence, we use an SMT solver to prune away any over-approximations introduced by interval arithmetic when computing with dependent ranges (line 6); this use of the solver is orthogonal to the one dealing with probabilities. On line 7, we check with an SMT solver whether the current combination of ranges [x1, x2] and [y1, y2] is compatible with the traces of the operands. If the query is satisfiable, the probability is strictly greater than zero but currently unknown (line 8). If the query is unsatisfiable, we assign a probability of zero to the range in DS<sup>Z</sup> (line 10). Finally, we append a new range to the DS-structure DS<sup>Z</sup> (line 11). Note that the loops are independent, and hence in our prototype implementation we run them in parallel.
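The pruning loop at the heart of Algorithm 1 can be sketched as follows; the `is_feasible` predicate is a stand-in for the SMT query over the operand traces (a real implementation would ask Z3 or dReal):

```python
# Sketch of the probabilistic pruning loop of Algorithm 1. Infeasible
# interval combinations get probability 0; the rest get an 'unknown'
# marker, resolved later by the linear-programming step. `is_feasible`
# stands in for the SMT query on the operand traces.

def prune(ds_x, ds_y, op, is_feasible):
    ds_z = []
    for (x1, x2), _ in ds_x:
        for (y1, y2), _ in ds_y:
            rng = op((x1, x2), (y1, y2))
            prob = 'unknown' if is_feasible((x1, x2), (y1, y2)) else 0.0
            ds_z.append((rng, prob))
    return ds_z

# Toy dependency Y = X: only combinations with overlapping interiors are
# feasible, so the "off-diagonal" combinations are pruned.
overlap = lambda a, b: a[0] < b[1] and b[0] < a[1]
iadd = lambda a, b: (a[0] + b[0], a[1] + b[1])
ds = [((0.0, 0.5), 0.5), ((0.5, 1.0), 0.5)]
ds_z = prune(ds, ds, iadd, overlap)
print(sum(1 for _, p in ds_z if p == 0.0))  # 2 of the 4 combinations pruned
```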

After this algorithm terminates, we still need to assign probability values to all the unknown-probability ranges in DS<sub>Z</sub>. Since we cannot assign exact values, we instead compute a range of potential values [p<sup>z</sup><sub>min</sub>, p<sup>z</sup><sub>max</sub>] for each. This computation is encoded as a *linear programming* routine exactly as in [3].

#### **6.3 Computing Conditional Roundoff Error**

The final step of our toolflow computes the conditional roundoff error by combining the symbolic affine arithmetic error form of the computation (see Sect. 5) with the probabilistic range analysis described above. The symbolic error form gets maximized conditioned on the results of all the intermediate operations.

**Algorithm 2.** Conditional Roundoff Error Computation
Algorithm 2 shows the pseudocode of the roundoff error computation algorithm. The algorithm takes as input a list DSS of DS-structures (one for each intermediate result range in the computation), the generated symbolic error form, and a confidence interval. It iterates over all intermediate DS-structures (line 3), and for each it determines the ranges needed to support the chosen confidence interval (lines 4–12). In each iteration, it sorts the list of range-probability pairs (i.e., focal elements) of the current DS-structure by their probability values in descending order (line 4). This is a heuristic that prioritizes the focal elements carrying most of the probability mass and keeps unlikely outliers, which cause large roundoff errors, out of the final error computation. With the help of an accumulator (line 8), we keep collecting focal elements (line 9) until the accumulated probability satisfies the confidence interval (line 10). Finally, we maximize the error form conditioned on the collected ranges of intermediate operations (line 13). The maximization is done using the rigorous global optimizer Gelpia [24].
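The confidence-driven selection of focal elements can be sketched as follows (our own Python rendering of the steps just described); `maximize` is a stand-in for the call to the rigorous global optimizer (Gelpia in PAF):

```python
# Sketch of Algorithm 2: for each intermediate DS-structure, sort focal
# elements by probability (descending) and collect them until the
# accumulated mass reaches the confidence level; the error form is then
# maximized only over the collected ranges.

def support_ranges(ds, confidence):
    ranges, acc = [], 0.0
    for rng, p in sorted(ds, key=lambda fe: fe[1], reverse=True):
        ranges.append(rng)
        acc += p
        if acc >= confidence:
            break
    return ranges

def conditional_error(dss, error_form, confidence, maximize):
    all_ranges = [support_ranges(ds, confidence) for ds in dss]
    return maximize(error_form, all_ranges)

# Toy run: one intermediate result; 99% confidence drops the 0.5% outlier.
ds = [((0.0, 1.0), 0.9), ((1.0, 2.0), 0.095), ((2.0, 100.0), 0.005)]
keep = support_ranges(ds, 0.99)
bound = conditional_error([ds], "errorForm", 0.99,
                          maximize=lambda form, rs: max(hi for _, hi in rs[0]))
print(keep, bound)  # [(0.0, 1.0), (1.0, 2.0)] 2.0 -- the outlier is excluded
```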

# **7 Experimental Evaluation**

We evaluate PAF (version 1.0.0) on the standard FPBench benchmark suite [11, 20], restricted to the four basic operations we currently support {+, −, ×, ÷}. Many of these benchmarks were also used in the recent related work [36] that we compare against. The benchmarks come from a variety of domains: embedded software (*bsplines*), linear classifications (*classids*), physics computations (*dopplers*), filters (*filters*), controllers (*traincars*, *rigidBody*), polynomial approximations of functions (*sine*, *sqrt*), equation solving (*solvecubic*), and global optimization (*trids*). Since FPBench has primarily been used for worst-case roundoff error analysis, the benchmarks come with ranges for input variables, but they do not specify input distributions. We instantiate the benchmarks with three well-known distributions for all the inputs: uniform, standard normal, and double exponential (Laplace) with σ = 0.01, which we will call 'exp'. The normal and exp distributions get truncated to the given range. We assume single-precision floating-point format for all operands and operations.

To assess the accuracy and performance of PAF, we compare it with PrAn (commit 7611679 [10]), the current state-of-the-art tool for automated analysis of probabilistic roundoff errors [36]. PrAn currently supports only uniform and normal distributions. We run all 6 tool configurations and report the best result for each benchmark. We fix the number of intervals in each discretization to 50 to match PrAn. We choose 99% as the confidence interval for the computation of our conditional roundoff error (Sect. 6.3) and of PrAn's probabilistic error. We also compare our probabilistic error bounds against FPTaylor (commit efbbc83 [21]), which performs worst-case roundoff error analysis, and hence it does not take into account the distributions of the input variables. We ran our experiments in parallel on a 4-socket 2.2 GHz 8-core Intel Xeon E5-4620 machine.

Table 2 compares roundoff errors reported by PAF, PrAn, and FPTaylor. PAF outperforms PrAn by computing tighter probabilistic error bounds on almost all benchmarks, occasionally by orders of magnitude. In the case of uniform input distributions, PAF provides tighter bounds for 24 out of 27 benchmarks, for 2 benchmarks the bounds from PrAn are tighter, while for *sqrt* they are the same. In the case of normal input distributions, PAF provides tighter bounds for all the benchmarks. Unlike PrAn, PAF supports probabilistic output range analysis as well. We present these results in the extended version [7].

In Table 2, of particular interest are the benchmarks (10 for normal and 18 for exp) where the error bounds generated by PAF for the 99% confidence interval are at least an order of magnitude tighter than the worst-case bounds generated by FPTaylor. For such a benchmark and input distribution, PAF's results inform a user that there is an opportunity to optimize the benchmark (e.g., by reducing the precision of floating-point operations) if their use case can handle at most 1% of inputs generating roundoff errors that exceed a user-provided bound. FPTaylor's results, on the other hand, do not allow a user to explore such fine-grained trade-offs, since they are worst-case bounds and do not take probabilities into account.

In general, we see a gradual reduction of the errors when transitioning from uniform to normal to exp. When the input distributions are uniform, there is a significant chance of generating a roundoff error of the same order of magnitude as the worst-case error, since all inputs are equally likely. The standard normal distribution concentrates more than 99% of the probability mass in the interval [−3, 3], resulting in the *long tail* phenomenon, where less than 0.5% of the mass spreads over the interval [3, ∞). When the normal distribution gets truncated in a neighborhood of zero (e.g., [0, 1] for *bsplines* and *filters*), nothing changes with respect to the uniform case—there is still a high chance of committing errors close to the worst case.

**Table 2.** Roundoff error bounds reported by PAF, PrAn, and FPTaylor given uniform (uni), normal (norm), and Laplace (exp) input distributions. We set the confidence interval to 99% for PAF and PrAn, and mark the smallest reported roundoff errors for each benchmark in bold. Asterisk (\*) highlights a difference of more than one order of magnitude between PAF and FPTaylor.


However, when the normal distribution gets truncated to a wider range (e.g., [−100, 100] for *trids*), the outliers causing large errors are very rare events, not included in the 99% confidence interval. The exponential distribution further compresses the 99% probability mass into the tiny interval [−0.01, 0.01], so the long-tail effect is common among all the benchmarks.

**Fig. 4.** CDFs of the range (left) and error (right) distributions for the benchmark *traincars3* for uniform (top), normal (center), and exp (bottom).

The runtimes of PAF range from 10 minutes for small benchmarks, such as *bsplines*, to several hours for benchmarks with more than 30 operations, such as *trid4*; they are always under two hours, except for *trids* (11 h) and *filters* (6 h). The runtime of PAF is usually dominated by Z3 invocations, and the long runtimes are caused by the numerous Z3 timeouts that the respective benchmarks induce. The runtimes of PrAn are comparable to PAF's: they are always under two hours, except for *trids* (3 h), *sqrt* (3 h), and *sine* (11 h). Note that neither PAF nor PrAn is memory intensive.

To assess the quality of our rigorous (i.e., sound) results, we implement Monte Carlo sampling to generate both roundoff error and output range distributions. The procedure consists of randomly sampling from the provided input distributions, evaluating the floating-point computation in both the specified precision and a higher precision (e.g., double precision) to measure the roundoff error, and finally partitioning the computed errors into bins to get an approximation (i.e., a histogram) of the PDF. Of course, Monte Carlo sampling does not provide rigorous bounds, but it is a useful tool for assessing how far the rigorous bounds computed statically by PAF are from an empirical measure of the error.
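A pure-stdlib sketch of this Monte Carlo procedure for a toy computation (the expression and names are ours; PAF's benchmarks differ), using `struct` round-trips to emulate single precision:

```python
# Monte Carlo roundoff-error sampling: sample inputs, evaluate a toy
# computation in single precision (struct round-trips) and in double
# precision, and bin the resulting errors into a crude histogram.

import random
import struct

def f32(x):
    """Round a double to the nearest single-precision float."""
    return struct.unpack('f', struct.pack('f', x))[0]

def eval_single(x, y):          # (x + y) * x with every operation rounded
    return f32(f32(f32(x) + f32(y)) * f32(x))

def eval_double(x, y):
    return (x + y) * x

random.seed(1)
errors = []
for _ in range(100_000):
    x, y = random.uniform(1, 2), random.uniform(1, 2)
    errors.append(eval_single(x, y) - eval_double(x, y))

# Crude 5-bin histogram of the absolute error.
emax = max(abs(e) for e in errors)
bins = [0] * 5
for e in errors:
    bins[min(4, int(abs(e) / emax * 5))] += 1
print(emax, bins)
```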

Figure 4 shows the effects of the input distributions on the output and roundoff error ranges of the *traincars3* benchmark. In the error graphs (right column), we show the Monte Carlo sampling evaluation (yellow line) together with the error bounds from PAF with 99% confidence interval (red plus symbol) and FPTaylor's worst-case bounds (green crossmark). In the range graphs (left column), we also plot PAF's p-box over-approximations. We can observe that in the case of uniform inputs the computed p-boxes overlap at the extrema of the output range. This phenomenon makes it impossible to distinguish between 99% and 100% confidence intervals, and hence as expected the bound reported by PAF is almost identical to FPTaylor's. This is not the case for normal and exponential distributions, where PAF can significantly improve both the output range and error bounds over FPTaylor. This again illustrates how pessimistic the bounds from worst-case tools can be when the information about the input distributions is not taken into account. Finally, the graphs illustrate how the p-boxes and error bounds from PAF follow their respective empirical estimations.

# **8 Related Work**

Our work draws inspiration from *probabilistic affine arithmetic* [3,4], which aims to bound probabilistic uncertainty propagated through a computation, a goal similar to our probabilistic range analysis. This approach was recently extended to polynomial dependencies [45]; PAF, in contrast, detects any non-linear dependency supported by the SMT solver. While these approaches show how to bound moments, we do not consider moments but instead compute conditional roundoff error bounds, a concern specific to the analysis of floating-point computations. Finally, concentration-of-measure inequalities [4,45] provide bounds for (possibly very large) problems that can be expressed as sums of random variables, for example, multiple increments of a noisy dynamical system, but they are unsuitable for typical floating-point computations (such as the FPBench benchmarks).

The most similar approach to our work is the recent static probabilistic roundoff error analysis called PrAn [36]. PrAn also builds on [3], and inherits the same limitations in dealing with dependent operations. Like us, PrAn hinges on a discretization scheme that builds p-boxes for both the input and error distributions and propagates them through the computation. The question of how these p-boxes are chosen is left open in the PrAn approach. In contrast, we take the input variables to be user-specified random variables, and show how the distribution of each error term can be computed directly and exactly from the random variables generating it (Sect. 4). Furthermore, unlike PrAn, PAF leverages the non-correlation between random variables and the corresponding error distribution (Sect. 4.4). Thus, PAF performs the rounding in Eq. (3) as an *independent* operation. Putting these together leads to PAF computing tighter probabilistic roundoff error bounds than PrAn, as our experiments show (Sect. 7).

The idea of using a probabilistic model of rounding errors to analyze *deterministic* computations can be traced back to von Neumann and Goldstine [51]. Parker's so-called 'Monte Carlo arithmetic' [41] is probably the most detailed description of this approach. We, however, consider *probabilistic* computations. For this reason, the famous critique of the probabilistic approach to roundoff errors [29] does not apply to this work. Our preliminary report [9] presents some early ideas behind this work, including Eqs. (5) and (7) and a very rudimentary range analysis. However, this early work manipulated distributions *unsoundly*, could not handle any repeated variables, and did not provide any roundoff error analysis. Recently, probabilistic roundoff error models have also been investigated using concentration-of-measure inequalities [27,28]. Interestingly, this means that the distribution of errors in Eq. (3) can be left almost completely unspecified. However, as with the related work discussed at the beginning of this section [4,45], concentration inequalities are ill-suited to the applications captured by the FPBench benchmark suite.

Worst-case analysis of roundoff errors has been an active research area with numerous published approaches [12–16,18,22,33,35,37,38,46,47,50]. Our symbolic affine arithmetic used in PAF (Sect. 5) evolved from rigorous affine arithmetic [14] by keeping the coefficients of the noise terms symbolic, which often leads to improved precision. These symbolic terms are very similar to the first-order Taylor approximations of the roundoff error expressions used in FPTaylor [46,47]. Hence, PAF with a 100% confidence interval computes the same worst-case roundoff error bounds as FPTaylor (Sect. 7).

**Acknowledgments.** We thank Ian Briggs and Mark Baranowski for their generous and prompt support with Gelpia. We also thank Alexey Solovyev for his detailed feedback and suggestions for improvements. Finally, we thank the anonymous reviewers for their insightful reviews that improved our final version, and program chairs for carefully guiding the review process.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Model-Free Reinforcement Learning for Branching Markov Decision Processes**

Ernst Moritz Hahn<sup>1</sup> , Mateo Perez<sup>2</sup> , Sven Schewe<sup>3</sup> , Fabio Somenzi<sup>2</sup> , Ashutosh Trivedi2(B) , and Dominik Wojtczak<sup>3</sup>

> <sup>1</sup> University of Twente, Enschede, The Netherlands
> <sup>2</sup> University of Colorado Boulder, Boulder, USA
> ashutosh.trivedi@colorado.edu
> <sup>3</sup> University of Liverpool, Liverpool, UK

**Abstract.** We study reinforcement learning for the optimal control of Branching Markov Decision Processes (BMDPs), a natural extension of (multitype) Branching Markov Chains (BMCs). The state of a (discrete-time) BMC is a collection of entities of various types that, while spawning other entities, generate a payoff. In comparison with BMCs, where each entity of a given type evolves according to the same probabilistic pattern, BMDPs allow an external controller to pick from a range of options. This permits us to study the best/worst behaviour of the system. We generalise model-free reinforcement learning techniques to compute an optimal control strategy of an unknown BMDP in the limit. We present results of an implementation that demonstrate the practicality of the approach.

# **1 Introduction**

Branching Markov Chains (BMCs), also known as Branching Processes, are natural models of population dynamics and parallel processes. The state of a BMC consists of entities of various types, and many entities of the same type may coexist. Each entity can branch in a single step into a (possibly empty) set of entities of various types while disappearing itself. This assumption is natural, for instance, for annual plants that reproduce only at a specific time of the year, or for bacteria, which either split or die. An entity may spawn a copy of itself, thereby simulating the continuation of its existence.

The offspring of an entity is chosen at random among options according to a distribution that depends on the type of the entity. The type captures significant differences between entities. For example, stem cells are very different from

This work was supported by the Engineering and Physical Sciences Research Council through grant EP/P020909/1 and by the National Science Foundation through grant 2009022. This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 101032464 (SyGaST) and 864075 (CAESAR).

regular cells; parallel processes may be interruptible or have different privileges. The type may reflect characteristics of the entities such as their age or size.

Although entities coexist, the BMC model assumes that there is no interaction between them. Thus, how an entity reproduces and for how long it lives is the same as if it were the only entity in the system. This assumption greatly reduces the computational complexity of analysing such models and is appropriate when the population exists in an environment that has virtually unlimited resources to sustain its growth. This is a common situation that holds when a species has just been introduced into an environment, in an early stage of an epidemic outbreak, or when running jobs in cloud computing.

BMCs have a wide range of applications in modelling various physical phenomena, such as nuclear chain reactions, red blood cell formation, population genetics, population migration, epidemic outbreaks, and molecular biology. Many examples of BMC models used in biological systems are discussed in [12].

Branching Markov Decision Processes (BMDPs) extend BMCs by allowing a controller to choose the branching dynamics for each entity. This choice is modelled as nondeterministic, instead of random. This extension is analogous to how Markov Decision Processes (MDPs) generalise Markov chains (MCs) [24]. Allowing an external controller to select a mode of branching allows us to study the best/worst behaviour of the examined model.

As a motivating example, let us discuss a simple model of cloud computing. A computation may be divided into tasks in order to finish it faster, as each server may have different computational power. Since the computation of each task depends on the previous one, the total running time is the sum of the running times of the spawned tasks as well as the time needed to split the work and merge the results of each computation into the final solution. As we shall see, the execution of each task is not guaranteed to be successful and is subject to random delays. Specifically, let us consider the following model with two types (T and S) and two actions (a<sub>1</sub> and a<sub>2</sub>). This BMDP consists of the main task, T, that may be split (action a<sub>1</sub>) into three smaller tasks, for simplicity assumed to be of the same type S; this split and the merger of the intermediate results takes 1 hour (1 h). Alternatively (action a<sub>2</sub>), we can execute the whole task T on the main server, but it will be slow (8 h). Task S can (action a<sub>1</sub>) be run on a reliable server in 1.6 h or (action a<sub>2</sub>) on an unreliable one that finishes after 1 h (irrespective of whether or not the computation is completed successfully), but with a 40% chance we need to rerun this task due to the server crashing. We can represent this model formally as:

$$\begin{array}{ccc} T \stackrel{a\_1}{\longrightarrow} SSS & \text{[1h]} & S \stackrel{a\_1}{\longrightarrow} \epsilon & \text{[1.6h]}\\ T \stackrel{a\_2}{\longrightarrow} \epsilon & \text{[8h]} & S \stackrel{a\_2}{\longrightarrow} 40\% : S \text{ or } 60\% : \epsilon & \text{[1h]} \end{array}$$

We would like to know the infimum of the expected running time of task T (i.e. the expected running time when optimal decisions are made). In this case the optimal control is to pick action a<sub>1</sub> first and then action a<sub>1</sub> for all tasks S, with a total running time of 1 + 3 · 1.6 = 5.8 h. The expected running time when picking action a<sub>2</sub> for S instead would be 1 + 3 · 1/0.6 = 6 h.
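These optimal values can be recovered by a simple fixed-point (value) iteration over the two types. The dictionary encoding below is our own illustrative choice, not a format used by the paper or its tool:

```python
# Value iteration for the cloud-computing BMDP above. The encoding
# (type, action) -> (cost, [(probability, offspring), ...]) is an
# assumption made for this sketch.
model = {
    ("T", "a1"): (1.0, [(1.0, ["S", "S", "S"])]),
    ("T", "a2"): (8.0, [(1.0, [])]),
    ("S", "a1"): (1.6, [(1.0, [])]),
    ("S", "a2"): (1.0, [(0.4, ["S"]), (0.6, [])]),
}

def value_iteration(model, rounds=100):
    """Kleene iteration x^{k+1} = F(x^k) starting from the zero vector."""
    types = {q for (q, _) in model}
    x = {q: 0.0 for q in types}
    for _ in range(rounds):
        x = {q: min(cost + sum(p * sum(x[t] for t in off) for p, off in dist)
                    for (q2, _a), (cost, dist) in model.items() if q2 == q)
             for q in types}
    return x

v = value_iteration(model)
print(v)  # optimal expected running times: S -> 1.6, T -> 5.8
```

Note that the iteration reaches the exact values after a handful of rounds, since the only cyclic dependency (S on itself under a<sub>2</sub>) is cut off by the minimum once a<sub>1</sub> becomes cheaper.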

Let us now assume that the execution of tasks S for action a<sub>1</sub> may be interrupted with probability 30% by a task of higher priority (type H). Moreover, these H tasks may themselves be interrupted by tasks with even higher priority (to simplify matters, again modelled by type H). The computation time of T is prolonged by 0.1 h for each H spawned. Our model then becomes:

$$\begin{aligned}
T &\stackrel{a\_1}{\longrightarrow} SSS &&\text{[1 h]} & S &\stackrel{a\_1}{\longrightarrow} 30\% : H \text{ or } 70\% : \epsilon &&\text{[1.6 h]}\\
T &\stackrel{a\_2}{\longrightarrow} \epsilon &&\text{[8 h]} & S &\stackrel{a\_2}{\longrightarrow} 40\% : S \text{ or } 60\% : \epsilon &&\text{[1 h]}\\
& && & H &\stackrel{\ast}{\longrightarrow} 30\% : HH \text{ or } 70\% : \epsilon &&\text{[0.1 h]}
\end{aligned}$$

As we shall see, the expected total running time of H can be calculated by solving the equation x = 0.3(x + x) + 0.1, which gives x = 0.25 h. So the expected running time of S using action a<sub>1</sub> increases by 0.3 · 0.25 = 0.075 h, to 1.675 h. This is enough for the optimal action for S to become a<sub>2</sub>. Note that if the probability of H being interrupted is at least 50% then the expected running time of H becomes ∞.
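The same iteration, applied to the extended model (again with an assumed dictionary encoding), confirms the value 0.25 for H and the switch of the optimal action for S, and illustrates the divergence once the interrupt probability reaches 50%:

```python
# Value iteration on the model extended with the interrupt type H.
# Encoding (type, action) -> (cost, [(prob, offspring), ...]) is an
# illustrative assumption, as before.
model = {
    ("T", "a1"): (1.0, [(1.0, ["S", "S", "S"])]),
    ("T", "a2"): (8.0, [(1.0, [])]),
    ("S", "a1"): (1.6, [(0.3, ["H"]), (0.7, [])]),
    ("S", "a2"): (1.0, [(0.4, ["S"]), (0.6, [])]),
    ("H", "*"):  (0.1, [(0.3, ["H", "H"]), (0.7, [])]),
}

def bellman_values(model, rounds=400):
    types = {q for (q, _) in model}
    x = {q: 0.0 for q in types}
    for _ in range(rounds):
        x = {q: min(cost + sum(p * sum(x[t] for t in off) for p, off in dist)
                    for (q2, _a), (cost, dist) in model.items() if q2 == q)
             for q in types}
    return x

v = bellman_values(model)
# H solves x = 0.3(x + x) + 0.1, i.e. x = 0.25, so action a1 for S now
# costs 1.6 + 0.3 * 0.25 = 1.675 > 1/0.6, and a2 becomes optimal for S.
print(v)

# If H is interrupted with probability 50%, the iterates for H diverge:
model[("H", "*")] = (0.1, [(0.5, ["H", "H"]), (0.5, [])])
print(bellman_values(model, rounds=2000)["H"])  # grows without bound
```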

When dealing with a real-life process, it is hard to come up with a (probabilistic and controlled) model that approximates it well. This requires experts to analyse all possible scenarios and estimate the probability of outcomes in response to actions based on either complex calculations or the statistical analysis of sufficient observational data. For instance, it is hard to estimate the probability of an interrupt H occurring in the model above without knowing which server will run the task, its usual workload and statistics regarding the priorities of the tasks it executes. Even if we do this estimation well, unexpected or rare events may happen that would require us to recalibrate the model as we observe the system under our control.

Instead of first building such a model explicitly and fixing the probabilities of possible transitions in the system based on our knowledge of the system or its statistics, we advocate the use of reinforcement learning (RL) techniques [27] that have been successfully applied to finding optimal control for finite-state Markov Decision Processes (MDPs). Q-learning [30] is a well-studied model-free RL approach that computes an optimal control strategy without knowing anything about the model apart from its initial state and the set of actions available in each of its states. It also has the advantage that the learning process converges to the optimal control while exploiting along the way what it has already learnt. While the formulation of the Q-learning algorithm for BMDPs is straightforward, the proof that it works is not. This is because, unlike the MDPs with discounted rewards for which the original Q-learning algorithm was defined, our model does not have an explicit contraction in each step, nor does boundedness of the optimal values or of the one-step updates hold. Similarly, one cannot generalise the result from [11] that estimates the time needed for the Q-learning algorithm to converge to within ε of the optimal values with high probability for finite-state MDPs.

#### **1.1 Related Work**

The simplest BMCs are Galton-Watson processes [31], discrete-time models where all entities are of the same type. They date as far back as 1845 [14] and were used to explain why some aristocratic family surnames became extinct. The generalisation of this model to multiple types of entities was first studied in the 1940s by Kolmogorov and Sevast'yanov [17]. For an overview of the results known for BMCs, see e.g. [13] and [12]. The precise computational complexity of decision problems about the probabilities of extinction of an arbitrary BMC was first established in [9]. The problem of checking whether a given BMC terminates almost surely was shown in [5] to be decidable in strongly polynomial time. The probability of a run of a BMC being accepted by a deterministic parity tree automaton was studied in [4] and shown to be computable in PSPACE, and in polynomial time for probabilities 0 or 1. In [16] a generalisation of BMCs was considered that allows for limited synchronisation of different tasks.

BMDPs, a natural generalisation of BMCs to a controlled setting, have been studied in the operations research literature, e.g., [23,26]. Hierarchical MDPs (HMDPs) [10] are a special case of BMDPs where there are no cycles in the offspring graph (equivalently, no cyclic dependency between types). BMDPs and HMDPs have found applications in manpower planning [29], controlled queuing networks [2,15], management of livestock [20], and epidemic control [1,25], among others. The focus of these works was on optimising the expected average or discounted reward over a run of the process, or on optimising the population growth rate. In [10] the decision problem whether the optimal probability of termination exceeds a given threshold was studied: it was shown to be solvable in PSPACE and to be at least as hard as the square-root sum problem, but one can determine whether the optimal probability is 0 or 1 in polynomial time. In [7], it was shown that the optimal probability of extinction for BMDPs can be approximated in polynomial time. The optimal expected total cost before extinction for BMDPs was shown in [8] to be computable in polynomial time via a linear programming formulation. The problem of maximising the probability of reaching a state with an entity of a given type in BMDPs was studied in [6]. In [28] an extension of BMDPs with real-valued clocks and timing constraints on productions was studied.

### **1.2 Summary of the Results**

We show that an adaptation of the Q-learning algorithm converges almost surely to the optimal values for BMDPs under mild conditions: all costs are positive and each Q-value is selected for update independently at random. We have implemented the proposed algorithm in the tool Mungojerrie [21] and tested its performance on small examples to demonstrate its efficiency in practice. To the best of our knowledge, this is the first time model-free RL has been used for the analysis of BMDPs.

# **2 Problem Definitions**

### **2.1 Preliminaries**

We denote by N the set of non-negative integers, by R the set of reals, by R<sup>+</sup> the set of positive reals, and by R<sub>≥0</sub> the set of non-negative reals. We let R̄<sup>+</sup> = R<sup>+</sup> ∪ {∞} and R̄<sub>≥0</sub> = R<sub>≥0</sub> ∪ {∞}. We denote by |X| the cardinality of a set X and by X<sup>∗</sup> (X<sup>ω</sup>) the set of all finite (infinite) sequences of elements of X. Finite sequences are also called lists.

*Vectors and Lists.* We use x̄, ȳ, c̄ to denote vectors and x̄<sub>i</sub> or x̄(i) to denote the i-th entry of x̄. We let 0̄ denote a vector with all entries equal to 0; its size may vary depending on the context. Likewise, 1̄ is a vector with all entries equal to 1. For vectors x̄, ȳ ∈ R<sup>n</sup><sub>≥0</sub>, x̄ ≤ ȳ means x̄<sub>i</sub> ≤ ȳ<sub>i</sub> for every i, and x̄ < ȳ means x̄ ≤ ȳ and x̄<sub>i</sub> ≠ ȳ<sub>i</sub> for some i. We also make use of the infinity norm ‖x̄‖<sub>∞</sub> = max<sub>i</sub> |x̄(i)|.

We use α, β, γ to denote finite lists of elements. For a list α = a<sub>1</sub>, a<sub>2</sub>,...,a<sub>k</sub> we write α<sub>i</sub> for the i-th element a<sub>i</sub> of list α and |α| for its length. For two lists α and β we write α · β for their concatenation. The empty list is denoted by ε.

*Probability Distributions.* A *finite discrete probability distribution* over a countable set Q is a function μ : Q → [0, 1] such that Σ<sub>q∈Q</sub> μ(q) = 1 and its support set *supp*(μ) = {q ∈ Q | μ(q) > 0} is finite. We denote the set of all such distributions over Q by D(Q). We say that μ ∈ D(Q) is a *point distribution* if μ(q) = 1 for some q ∈ Q.

*Markov Decision Processes.* Markov decision processes [24], are a well-studied formalism for systems exhibiting nondeterministic and probabilistic behaviour.

**Definition 1.** *A* Markov decision process *(MDP) is a tuple* M = (S, A, p, c) *where:*

*–* S *is a countable set of* states*,*
*–* A *is a countable set of* actions*,*
*–* p : S × A ⇀ D(S) *is a partial function called the* probabilistic transition function*, and*
*–* c : S × A → R *is the* cost function*.*

We say that an MDP M is *finite* (*discrete*) if both S and A are finite (countable). We write A(s) for the set of actions available at s, i.e., the set of actions a for which p(s, a) is defined. In an MDP M, if the current state is s, then one of the actions in A(s) is chosen nondeterministically. If the chosen action is a then the probability of reaching state s′ ∈ S in the next step is p(s, a)(s′) and the cost incurred is c(s, a).

#### **2.2 Branching Markov Decision Processes**

We are now ready to define (multitype) BMDPs.

**Definition 2.** *A* branching Markov decision process *(BMDP) is a tuple* B = (P, A, p, c) *where:*

*–* P *is a finite set of* types*,*
*–* A *is a finite set of* actions*,*
*–* p : P × A ⇀ D(P<sup>∗</sup>) *is a partial* probabilistic transition function*, and*
*–* c : P × A → R<sup>+</sup> *is the* cost function*.*

We write A(q) for the set of actions available to an entity of type q ∈ P, i.e., the set of actions a for which p(q, a) is defined. A *Branching Markov Chain (BMC)* is simply a BMDP with just one action available for each type.

Let us first describe informally how BMDPs evolve. A state of a BMDP B is a list of elements of P that we call *entities*. A BMDP starts at some initial configuration α<sup>0</sup> ∈ P<sup>∗</sup>, and the controller picks, for one of the entities, one of the actions available to an entity of its type. In the new configuration α<sup>1</sup>, this one entity is replaced by the list of new entities that it spawned. This list is picked according to the probability distribution p(q, a) that depends both on the type of the entity, q, and the action, a, performed on it by the controller. The process proceeds in the same manner from α<sup>1</sup>, moving to α<sup>2</sup>, and from there to α<sup>3</sup>, etc. Once the empty state ε is reached, i.e., when no entities are present in the system, the process stays in that state forever.
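The evolution just described is easy to mimic in code. A minimal sketch (the distribution shown, type S under action a2 from the cloud example, and its encoding are illustrative assumptions, not an API of any tool):

```python
import random

# One evolution step of a BMDP configuration: the controller picks an
# entity index and an action, and that entity is replaced in the list
# by its sampled offspring.
p = {("S", "a2"): [(0.4, ["S"]), (0.6, [])]}

def step(config, i, action, rng):
    """Replace the i-th entity (0-based) of config by sampled offspring."""
    outcomes = p[(config[i], action)]
    r, acc = rng.random(), 0.0
    for prob, offspring in outcomes:
        acc += prob
        if r < acc:
            return config[:i] + offspring + config[i + 1:]
    return config[:i] + outcomes[-1][1] + config[i + 1:]  # float slack

rng = random.Random(0)
config = ["S", "S", "S"]
while config:            # the empty configuration is absorbing
    config = step(config, 0, "a2", rng)
print("terminated")
```

Since each S dies with probability 0.6 at every step, this loop terminates almost surely, illustrating the absorbing empty state ε.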

**Definition 3 (Semantics of BMDP).** *The semantics of a BMDP* B = (P, A, p, c) *is an MDP* M<sub>B</sub> = (*States*B, *Actions*B, *Prob*B, *Cost*B) *where:*

*–* *States*B = P<sup>∗</sup> *is the set of states,*
*–* *Actions*B = {(i, a) | i ≥ 1, a ∈ A} *is the set of actions, where* (i, a) *applies action* a *to the* i*-th entity,*
*–* *Prob*B *is the partial probabilistic transition function, defined for every* α ∈ *States*B *and* (i, a) *with* i ≤ |α| *and* a ∈ A(α<sub>i</sub>) *by*

$$Prob\_{\mathcal{B}}(\alpha, (i, a))(\alpha\_1 \dots \alpha\_{i-1} \cdot \beta \cdot \alpha\_{i+1} \dots \alpha\_{|\alpha|}) = p(\alpha\_i, a)(\beta),$$

*for every* β ∈ P<sup>∗</sup>*, and* 0 *in all other cases.*

*– Cost*<sup>B</sup> : *States*<sup>B</sup> <sup>×</sup> *Actions*<sup>B</sup> <sup>→</sup> <sup>R</sup><sup>+</sup> *is the cost function such that*

$$\operatorname{Cost}\_{\mathfrak{B}}(\alpha,(i,a)) = c(\alpha\_i, a).$$

For a given BMDP B and state α ∈ *States*B, we denote by *Actions*B(α) the set of actions (i, a) ∈ *Actions*B for which *Prob*B(α, (i, a)) is defined.

Note that our semantics of BMDPs assumes an explicit listing of all the entities in a particular order, similar to [10]. One could, instead, define a state as a multiset, or simply as a vector counting the number of occurrences of each type of entity as in [23]. As argued in [10], all these models are equivalent to each other. Furthermore, we assume that the controller expands a single entity of its choice at a time, rather than all entities being expanded simultaneously. As argued in [32], this makes no difference for the optimal values of the expected total cost that we study in this paper, provided that all transition costs are positive.

#### **2.3 Strategies**

A *path* of a BMDP B is a finite or infinite sequence

$$\begin{aligned} \pi &= \alpha^0, ((i\_1, a\_1), \alpha^1), ((i\_2, a\_2), \alpha^2), ((i\_3, a\_3), \alpha^3), \dots \\ &\in \mathit{States}\_{\mathcal{B}} \times ((\mathit{Actions}\_{\mathcal{B}} \times \mathit{States}\_{\mathcal{B}})^\* \cup (\mathit{Actions}\_{\mathcal{B}} \times \mathit{States}\_{\mathcal{B}})^\omega), \end{aligned}$$

consisting of the initial state and a finite or infinite sequence of action-state pairs, such that *Prob*B(α<sup>j</sup>, (i<sub>j</sub>, a<sub>j</sub>))(α<sup>j+1</sup>) > 0 for every 0 ≤ j < |π|, where |π| is the number of actions taken along path π (|π| = ∞ if the path is infinite). For a path π, we denote by π<sub>A</sub>(j) = (i<sub>j</sub>, a<sub>j</sub>) the j-th action taken along π, by π<sub>S</sub>(j) = α<sup>j</sup> the j-th state visited, where π<sub>S</sub>(0) = α<sup>0</sup> is the initial state, and by π(j) = α<sup>0</sup>, ((i<sub>1</sub>, a<sub>1</sub>), α<sup>1</sup>),..., ((i<sub>j</sub>, a<sub>j</sub>), α<sup>j</sup>) the first j action-state pairs of π.

We call a path of infinite (finite) length a *run* (*finite path*). We write *Runs*<sup>B</sup> (*FPath*B) for the sets of all runs (finite paths) and *Runs*B,α (*FPath*B,α) for the sets of all runs (finite paths) that start at a given initial state α ∈ *States*B, i.e., paths π with πS(0) = α. We write *last*(π) for the last state of a finite path π.

A *strategy* in BMDP B is a function σ : *FPath*B → D(*Actions*B) such that, for all π ∈ *FPath*B, *supp*(σ(π)) ⊆ *Actions*B(*last*(π)). We write Σ<sub>B</sub> for the set of all strategies. A strategy is called *static* if it always applies an action to the first entity in any state and picks the same action for all entities of the same type. A static strategy τ is essentially given by a function σ : P → A, i.e., for an arbitrary π ∈ *FPath*B, we have τ(π) = (1, σ(*last*(π)<sub>1</sub>)) whenever *last*(π) ≠ ε.

A strategy σ ∈ Σ<sub>B</sub> and an initial state α induce a probability measure over the set of runs of BMDP B in the following way: the basic open sets of *Runs*B are of the form π · (*Actions*B × *States*B)<sup>ω</sup>, where π ∈ *FPath*B, and the measure of this open set is equal to ∏<sub>i=0</sub><sup>|π|−1</sup> σ(π(i))(π<sub>A</sub>(i+1)) · *Prob*B(π<sub>S</sub>(i), π<sub>A</sub>(i+1))(π<sub>S</sub>(i+1)) if π<sub>S</sub>(0) = α, and equal to 0 otherwise. It is a classical result of measure theory that this extends to a unique measure over all Borel subsets of *Runs*B, which we denote by *P*<sup>σ</sup><sub>B,α</sub>. Let f : *Runs*B → R̄<sup>+</sup> be a function measurable with respect to *P*<sup>σ</sup><sub>B,α</sub>. The expected value of f under strategy σ when starting at α is defined as E<sup>σ</sup><sub>B,α</sub>{f} = ∫<sub>*Runs*B</sub> f d*P*<sup>σ</sup><sub>B,α</sub> (which can be ∞ even if the probability that the value of f is infinite is 0). The infimum expected value of f in B when starting at α is defined as V<sub>∗</sub>(α)(f) = inf<sub>σ∈Σ<sub>B</sub></sub> E<sup>σ</sup><sub>B,α</sub>{f}. A strategy σ′ is said to be *optimal* if E<sup>σ′</sup><sub>B,α</sub>{f} = V<sub>∗</sub>(α)(f), and *ε-optimal* if E<sup>σ′</sup><sub>B,α</sub>{f} ≤ V<sub>∗</sub>(α)(f) + ε. Note that ε-optimal strategies always exist by definition. We omit the subscript B, e.g., in *States*B, Σ<sub>B</sub>, etc., when the intended BMDP is clear from the context.

For a given BMDP B and N ≥ 0 we define Total<sup>N</sup>(π), the cumulative cost of a run π after N steps, as Total<sup>N</sup>(π) = Σ<sub>i=0</sub><sup>N−1</sup> *Cost*(π<sub>S</sub>(i), π<sub>A</sub>(i+1)). For a configuration α ∈ *States* and a strategy σ ∈ Σ, let ETotal<sup>N</sup>(B, α, σ) be the N*-step expected total cost* defined as ETotal<sup>N</sup>(B, α, σ) = E<sup>σ</sup><sub>B,α</sub>{Total<sup>N</sup>}, and let the *expected total cost* be ETotal<sub>∗</sub>(B, α, σ) = lim<sub>N→∞</sub> ETotal<sup>N</sup>(B, α, σ). This last value can be ∞. For each starting state α, we compute the *optimal expected cost* over all strategies of a BMDP starting at α, denoted by ETotal<sub>∗</sub>(B, α), i.e.,

$$\text{ETotal}\_\*(\mathcal{B}, \alpha) = \inf\_{\sigma \in \Sigma\_{\mathcal{B}}} \text{ETotal}\_\*(\mathcal{B}, \alpha, \sigma).$$

As we are going to prove in Theorem 4.b, for any α ∈ *States* we have

$$\text{ETotal}\_\*(\mathcal{B}, \alpha) = \sum\_{i=1}^{|\alpha|} \text{ETotal}\_\*(\mathcal{B}, \alpha\_i).$$

This justifies focusing on this value for initial states that consist of a single entity only, as we will do in the following section.
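The expected total cost can also be estimated empirically by sampling runs. A sketch (the model encoding and the particular static strategy are our own illustrative choices, reusing the first cloud example, whose exact value under σ(T) = a1, σ(S) = a2 is 1 + 3 · (1/0.6) = 6 hours):

```python
import random

# Monte Carlo estimate of ETotal(B, T, sigma) for a static strategy
# sigma(T) = a1, sigma(S) = a2 in the first cloud example.
model = {
    ("T", "a1"): (1.0, [(1.0, ["S", "S", "S"])]),
    ("S", "a2"): (1.0, [(0.4, ["S"]), (0.6, [])]),
}
sigma = {"T": "a1", "S": "a2"}

def sample_total_cost(start, rng):
    config, total = [start], 0.0
    while config:
        q = config.pop()                 # expand one entity at a time
        cost, dist = model[(q, sigma[q])]
        total += cost
        r, acc = rng.random(), 0.0
        for prob, offspring in dist:
            acc += prob
            if r < acc:
                config.extend(offspring)
                break
    return total

rng = random.Random(42)
runs = 20000
estimate = sum(sample_total_cost("T", rng) for _ in range(runs)) / runs
print(estimate)  # close to 6.0
```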

### **3 Fixed Point Equations**

Following [8], we define here a linear equation system with a minimum operator whose *least fixed point* yields the desired optimal value for each type of a BMDP with non-negative costs. This system generalises Bellman's equations for finite-state MDPs. We use a variable x<sub>q</sub> for each unknown ETotal<sub>∗</sub>(B, q), where q ∈ P. Let x̄ be the vector of all x<sub>q</sub>, where q ∈ P. The system has one equation of the form x<sub>q</sub> = F<sub>q</sub>(x̄) for each type q ∈ P, defined as

$$x\_q = \min\_{a \in A(q)} \left( c(q, a) + \sum\_{\alpha \in P^\*} p(q, a)(\alpha) \sum\_{i \le |\alpha|} x\_{\alpha\_i} \right) \,. \tag{\spadesuit}$$

We denote the system in vector form by x̄ = F(x̄). Given a BMDP, we can easily construct its associated system in linear time. Let c̄<sup>∗</sup> ∈ R̄<sup>n</sup><sub>≥0</sub> denote the n-dimensional vector of the values ETotal<sub>∗</sub>(B, q), where n = |P|. Let us define x̄<sup>0</sup> = 0̄ and x̄<sup>k+1</sup> = F<sup>k+1</sup>(0̄) = F(x̄<sup>k</sup>) for k ≥ 0.

**Theorem 4.** *The following hold:*

*(a) The map* F : R̄<sup>n</sup><sub>≥0</sub> → R̄<sup>n</sup><sub>≥0</sub> *is monotone.*
*(b) For any* α ∈ *States*, ETotal<sub>∗</sub>(B, α) = Σ<sub>i=1</sub><sup>|α|</sup> ETotal<sub>∗</sub>(B, α<sub>i</sub>). (♣)
*(c)* x̄<sup>k</sup> ≤ c̄<sup>∗</sup> *for all* k ≥ 0*.*
*(d) For every fixed point* c̄′ *of* F *there is a static strategy* σ′ *with* ETotal<sub>∗</sub>(B, q, σ′) ≤ c̄′<sub>q</sub> *for all* q ∈ P*; in particular,* c̄<sup>∗</sup> ≤ c̄′*.*
*(e)* lim<sub>k→∞</sub> x̄<sup>k</sup> = c̄<sup>∗</sup>*.*
*Proof.* For (b), observe first that the expected total cost, when starting at q, of picking action a whose outcome is the offspring list α is

$$c(q, a) + \sum\_{i \le |\alpha|} \text{ETotal}\_\*(\mathcal{B}, \alpha\_i) \; .$$

The expected total cost of picking action a when at q is then just the weighted sum of these expressions with weights p(q, a)(α) over all offspring α. Finally, to optimise the cost, one picks an action a with the smallest such expected total cost, showing that

$$\text{ETotal}\_\*(\mathcal{B}, q) = \min\_{a \in A(q)} \left( c(q, a) + \sum\_{\alpha \in P^\*} p(q, a)(\alpha) \sum\_{i \le |\alpha|} \text{ETotal}\_\*(\mathcal{B}, \alpha\_i) \right)$$

indeed holds.

Now, to show (♣), consider an ε-optimal strategy σ<sub>i</sub> for a BMDP that starts at α<sub>i</sub>. These can easily be composed into a strategy σ that starts at α, just by executing σ<sub>1</sub> first until all descendants of α<sub>1</sub> die out, before moving on to σ<sub>2</sub>, etc. If one of these strategies, σ<sub>i</sub>, never stops executing then, due to the assumption that all costs are positive, the expected total cost when starting with α<sub>i</sub> has to be infinite, and so has the overall cost when starting with α (as all descendants of α<sub>i</sub> have to die out before the overall process terminates), so (♣) holds in that case. This shows that Σ<sub>i≤|α|</sub> ETotal<sub>∗</sub>(B, α<sub>i</sub>) can be achieved, up to an arbitrarily small ε, when starting at α. At the same time, we cannot do better, because that would imply the existence of a strategy for one of the entities α<sub>j</sub> with a better cost than its optimal cost ETotal<sub>∗</sub>(B, α<sub>j</sub>).


For (d), let c̄′ be a fixed point of F, i.e., F(c̄′) = c̄′, and let σ′ be the static strategy that picks, for each type q, an action minimising c(q, a) + Σ<sub>α∈P<sup>∗</sup></sub> p(q, a)(α) Σ<sub>i≤|α|</sub> c̄′<sub>α<sub>i</sub></sub>. We now claim that, for all k ≥ 0, ETotal<sub>k</sub>(B, q, σ′) ≤ c̄′<sub>q</sub> holds. For k = 0 this is trivial, as ETotal<sub>0</sub>(B, q, σ′) = 0 ≤ c̄′<sub>q</sub>. For k > 0, we have that

$$\begin{aligned} \text{ETotal}\_k(\mathcal{B}, q, \sigma') &\stackrel{(1)}{\leq} c(q, \sigma'(q)) + \sum\_{\alpha \in P^\*} p(q, \sigma'(q))(\alpha) \sum\_{i \leq |\alpha|} \text{ETotal}\_{k-1}(\mathcal{B}, \alpha\_i, \sigma') \\ &\stackrel{(2)}{\leq} c(q, \sigma'(q)) + \sum\_{\alpha \in P^\*} p(q, \sigma'(q))(\alpha) \sum\_{i \leq |\alpha|} \bar{c}'\_{\alpha\_i} \\ &\stackrel{(3)}{=} \min\_{a \in A(q)} \left( c(q, a) + \sum\_{\alpha \in P^\*} p(q, a)(\alpha) \sum\_{i \leq |\alpha|} \bar{c}'\_{\alpha\_i} \right) \stackrel{(4)}{=} \bar{c}'\_q \end{aligned}$$

where (1) follows from the fact that, after taking action σ′(q) first, there are only k − 1 steps of the BMDP B left, to be distributed somehow among the offspring α of q; allowing k − 1 steps for each of the entities α<sub>i</sub> is clearly an overestimate of the actual cost. (2) follows from the inductive assumption. (3) follows from the definition of σ′. The last equality, (4), follows from the fact that c̄′ is a fixed point of F.

Finally, for every q ∈ P, from the definition we have c̄<sup>∗</sup><sub>q</sub> = ETotal<sub>∗</sub>(B, q) ≤ ETotal<sub>∗</sub>(B, q, σ′) = lim<sub>k→∞</sub> ETotal<sub>k</sub>(B, q, σ′), and each element of the last sequence was just shown to be ≤ c̄′<sub>q</sub>.

(e) We know that x̄<sup>∗</sup> = lim<sub>k→∞</sub> x̄<sup>k</sup> exists in R̄<sup>n</sup><sub>≥0</sub> because the sequence is monotonically non-decreasing (note that some entries may be infinite). In fact we have x̄<sup>∗</sup> = lim<sub>k→∞</sub> F<sup>k+1</sup>(0̄) = F(lim<sub>k→∞</sub> F<sup>k</sup>(0̄)), and thus x̄<sup>∗</sup> is a fixed point of F. So from (d) we have c̄<sup>∗</sup> ≤ x̄<sup>∗</sup>. At the same time, due to (c), we have x̄<sup>k</sup> ≤ c̄<sup>∗</sup> for all k ≥ 0, so x̄<sup>∗</sup> = lim<sub>k→∞</sub> x̄<sup>k</sup> ≤ c̄<sup>∗</sup>, and thus lim<sub>k→∞</sub> x̄<sup>k</sup> = c̄<sup>∗</sup>.

The following is a simple corollary of Theorem 4.

**Corollary 5.** *In BMDPs, there exists an optimal static control strategy* σ∗*.*

*Proof.* It is enough to pick as σ<sub>∗</sub> the strategy σ′ from Theorem 4.d for c̄′ = c̄<sup>∗</sup>. We showed there that, for all k ≥ 0 and q ∈ P, we have ETotal<sub>k</sub>(B, q, σ<sub>∗</sub>) ≤ c̄<sup>∗</sup><sub>q</sub>. So ETotal<sub>∗</sub>(B, q, σ<sub>∗</sub>) = lim<sub>k→∞</sub> ETotal<sub>k</sub>(B, q, σ<sub>∗</sub>) ≤ c̄<sup>∗</sup><sub>q</sub> = ETotal<sub>∗</sub>(B, q), so in fact ETotal<sub>∗</sub>(B, q, σ<sub>∗</sub>) = ETotal<sub>∗</sub>(B, q) has to hold, as clearly ETotal<sub>∗</sub>(B, q, σ<sub>∗</sub>) ≥ ETotal<sub>∗</sub>(B, q).

Note that for a BMDP with a fixed static strategy σ (or, equivalently, a BMC), we have that F(x̄) = B<sub>σ</sub>x̄ + b<sub>σ</sub> for some non-negative matrix B<sub>σ</sub> ∈ R<sup>n×n</sup><sub>≥0</sub> and a positive vector b<sub>σ</sub> > 0̄ consisting of all the one-step costs c(q, σ(q)). We will refer to F as F<sub>σ</sub> in this case and exploit this fact later in various proofs.

We now show that c̄<sup>∗</sup> is in fact essentially the unique fixed point of F.

**Theorem 6.** *If* F(x̄) = x̄ *and* x̄<sub>q</sub> < ∞ *for some* q ∈ P*, then* x̄<sub>q</sub> = c̄<sup>∗</sup><sub>q</sub>*.*

*Proof.* By Corollary 5, there exists an optimal static strategy, denoted by σ<sup>∗</sup>, which yields the optimal cost vector c̄<sup>∗</sup>.

We clearly have that x̄ = F(x̄) ≤ F<sub>σ<sup>∗</sup></sub>(x̄), because σ<sup>∗</sup> is just one possible pick of actions for each type rather than the minimising one as in (♠). Furthermore,

$$\begin{aligned} F\_{\sigma^\*} (\bar{x}) &= B\_{\sigma^\*} \bar{x} + b\_{\sigma^\*} \\ &\le B\_{\sigma^\*} (B\_{\sigma^\*} \bar{x} + b\_{\sigma^\*}) + b\_{\sigma^\*} \\ &= B\_{\sigma^\*}^2 \bar{x} + (B\_{\sigma^\*} + I) b\_{\sigma^\*} \\ &\le \dots \le \lim\_{k \to \infty} B\_{\sigma^\*}^k \bar{x} + \left( \sum\_{k=0}^{\infty} B\_{\sigma^\*}^k \right) b\_{\sigma^\*} .\end{aligned}$$
Note that c̄<sup>∗</sup> = (Σ<sub>k=0</sub><sup>∞</sup> B<sup>k</sup><sub>σ<sup>∗</sup></sub>)b<sub>σ<sup>∗</sup></sub>, because

$$\bar{c}^\* = \lim\_{k \to \infty} F^k(\bar{0}) = \lim\_{k \to \infty} F^k\_{\sigma^\*}(\bar{0}) = \lim\_{k \to \infty} \sum\_{i=0}^k B^i\_{\sigma^\*} b\_{\sigma^\*}.$$

Due to Theorem 4.d, we know that c̄<sup>∗</sup><sub>q</sub> ≤ x̄<sub>q</sub> < ∞, so all entries in the q-th row of B<sup>k</sup><sub>σ<sup>∗</sup></sub> have to converge to 0 as k → ∞, because otherwise the q-th row of Σ<sub>k=0</sub><sup>∞</sup> B<sup>k</sup><sub>σ<sup>∗</sup></sub> would have at least one infinite value and, as a result, the q-th entry of c̄<sup>∗</sup> = (Σ<sub>k=0</sub><sup>∞</sup> B<sup>k</sup><sub>σ<sup>∗</sup></sub>)b<sub>σ<sup>∗</sup></sub> would also be infinite, as all entries of b<sub>σ<sup>∗</sup></sub> are positive. Therefore lim<sub>k→∞</sub>(B<sup>k</sup><sub>σ<sup>∗</sup></sub>x̄)<sub>q</sub> = 0, and so

$$\bar{x}\_q \le (\lim\_{k \to \infty} B^k\_{\sigma^\*} \bar{x})\_q + ((\sum\_{k=0}^{\infty} B^k\_{\sigma^\*}) b\_{\sigma^\*})\_q = \bar{c}^\*\_q.$$

The proof is now complete.

### **4 Q-learning**

We next discuss the applicability of Q-learning to the computation of the fixed point defined in the previous section.

Q-learning [30] is a well-studied model-free RL approach to computing an optimal strategy for discounted rewards. Q-learning computes so-called Q-values for every state-action pair. Intuitively, once Q-learning has converged to the fixed point, Q(s, a) is the optimal reward the agent can obtain by performing action a first after starting at s. The Q-values can be initialised arbitrarily, but ideally they should be close to the actual values. Q-learning learns over a number of episodes, each consisting of a sequence of actions of bounded length. An episode can terminate early if a sink-state or another non-productive state is reached. Each episode starts at the designated initial state s<sub>0</sub>. The Q-learning process moves from state to state of the MDP using one of its available actions and accumulates rewards along the way. Suppose that in the i-th step the process has reached state s<sub>i</sub>. It then either performs the currently (believed to be) optimal action (the so-called *exploitation* option) or, with probability ε, picks uniformly at random one of the actions available at s<sub>i</sub> (the so-called *exploration* option). Either way, if a<sub>i</sub>, r<sub>i</sub>, and s<sub>i+1</sub> are the action picked, the reward observed, and the state the process moved to, respectively, then the Q-value is updated as follows:

$$Q\_{i+1}(s\_i, a\_i) = (1 - \lambda\_i) Q\_i(s\_i, a\_i) + \lambda\_i (r\_i + \gamma \cdot \max\_a Q\_i(s\_{i+1}, a)) \ ,$$

where λ<sup>i</sup> ∈]0, 1[ is the learning rate and γ ∈]0, 1] is the discount factor. Note the model-freeness: this update does not depend on the set of transitions nor their probabilities. For all other pairs s, a we have Qi+1(s, a) = Qi(s, a), i.e., they are left unchanged. Watkins and Dayan showed the convergence of Q-learning [30]. 
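As a concrete illustration of this update rule, here is a tabular Q-learning sketch on a tiny two-state discounted MDP; the MDP, its rewards, and the ε-greedy episode scheme are illustrative choices of ours, not taken from the paper:

```python
import random

# Tabular Q-learning with the update above. State 1 is a zero-reward sink.
p = {(0, "a"): [(1.0, 1)], (0, "b"): [(0.5, 0), (0.5, 1)], (1, "a"): [(1.0, 1)]}
r = {(0, "a"): 1.0, (0, "b"): 2.0, (1, "a"): 0.0}
actions = {0: ["a", "b"], 1: ["a"]}
gamma, eps = 0.9, 0.2

rng = random.Random(0)
Q = {sa: 0.0 for sa in r}
visits = {sa: 0 for sa in r}
for _episode in range(5000):
    s = 0                                    # designated initial state
    for _step in range(20):                  # bounded episode length
        if rng.random() < eps:               # exploration
            a = rng.choice(actions[s])
        else:                                # exploitation
            a = max(actions[s], key=lambda b2: Q[(s, b2)])
        visits[(s, a)] += 1
        lam = 1.0 / visits[(s, a)]           # learning rate ~ 1/n
        r2, acc, s2 = rng.random(), 0.0, None
        for prob, nxt in p[(s, a)]:          # sample the next state
            acc += prob
            if r2 < acc:
                s2 = nxt
                break
        target = r[(s, a)] + gamma * max(Q[(s2, a2)] for a2 in actions[s2])
        Q[(s, a)] = (1 - lam) * Q[(s, a)] + lam * target
        s = s2
print(Q)
```

The exact fixed point here is Q(0, a) = 1, Q(0, b) = 2/0.55 ≈ 3.64 and Q(1, a) = 0; the update itself never consults the transition probabilities `p`, which are used only to simulate the environment.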

**Theorem 7 (Convergence** [30]**).** *For* γ < 1*, bounded rewards* r<sub>i</sub>*, and learning rates* 0 ≤ λ<sub>i</sub> < 1 *satisfying*

$$\sum\_{i=0}^{\infty} \lambda\_i = \infty \quad\text{and}\quad \sum\_{i=0}^{\infty} \lambda\_i^2 < \infty,$$

*we have that* Q<sub>i</sub>(s, a) → Q(s, a) *as* i → ∞ *for all* (s, a) ∈ S × A *almost surely if all* (s, a) *pairs are visited infinitely often.*

However, in the total reward setting, which corresponds to Q-learning with discount factor γ = 1, Q-learning may not converge, or may converge to incorrect values. It is, however, guaranteed to work for finite-state MDPs in the setting of undiscounted total reward with a target sink-state, under the assumption that all strategies reach that sink-state almost surely. The assumption that we make instead is that every transition of the BMDP incurs a positive cost. This guarantees that a process that does not terminate almost surely generates an infinite expected total cost, in which case Q-learning will converge (or rather diverge) to ∞, so our results generalise these existing results for Q-learning.

We adapt the Q-learning algorithm to minimise cost as follows. Each episode starts at the designated initial state q<sub>0</sub> ∈ P. The Q-learning process moves from state to state of the BMDP using one of its available actions and accumulates costs along the way. Suppose that, in the i-th step, the process has reached state α. It then selects uniformly at random one of the entities of α, say the j-th one, α<sub>j</sub>, and either performs the currently (believed to be) optimal action or, with probability ε, picks an action uniformly at random among all the actions available for α<sub>j</sub>. If a<sub>i</sub>, c, and β denote the action picked, the observed cost, and the entities spawned by this action, respectively, then the Q-value of the pair α<sub>j</sub>, a<sub>i</sub> is updated as follows:

$$Q\_{i+1}(\alpha\_j, a\_i) = (1 - \lambda\_i) Q\_i(\alpha\_j, a\_i) + \lambda\_i \left(c + \sum\_{k=1}^{|\beta|} \min\_{a \in A(\beta\_k)} Q\_i(\beta\_k, a)\right),$$

and all other Q-values are left unchanged. In the next section we show that Q-learning almost surely converges (respectively, diverges) to the optimal finite (respectively, infinite) values of c̄<sup>∗</sup> under rather mild conditions.
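For the BMC obtained by fixing action a1 for S in the interrupt example (one action per type, so Q is indexed by types only), the adapted update can be sketched as follows; the encoding and the 1/n learning-rate schedule are our own illustrative choices:

```python
import random

# Q-learning for the two-type BMC {S, H} with a1 fixed for S.
# Exact optimal costs: Q(H) = 0.25 and Q(S) = 1.6 + 0.3 * 0.25 = 1.675.
dist = {
    "S": (1.6, [(0.3, ["H"]), (0.7, [])]),
    "H": (0.1, [(0.3, ["H", "H"]), (0.7, [])]),
}

rng = random.Random(7)
Q = {"S": 0.0, "H": 0.0}
visits = {"S": 0, "H": 0}
for _ in range(200000):
    q = rng.choice(["S", "H"])      # Q-value to update, picked at random
    cost, outcomes = dist[q]
    r, acc, offspring = rng.random(), 0.0, []
    for prob, off in outcomes:      # sample the offspring list
        acc += prob
        if r < acc:
            offspring = off
            break
    visits[q] += 1
    lam = 1.0 / visits[q]           # satisfies the conditions of Theorem 7
    Q[q] = (1 - lam) * Q[q] + lam * (cost + sum(Q[t] for t in offspring))
print(Q)  # approaches Q(S) = 1.675, Q(H) = 0.25
```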

### **5 Convergence of Q-Learning for BMDPs**

We show almost sure convergence of Q-learning to the optimal values c̄<sup>∗</sup> in a number of stages. We first focus on the case when all optimal values in c̄<sup>∗</sup> are finite. In this case, we show weak convergence of the expected Q-values for BMCs to the unique fixed point c̄<sup>∗</sup>, as defined in Sect. 3. To establish this, we show that the expected Q-values are monotonically decreasing (increasing) if we start with Q-values κc̄<sup>∗</sup> for κ > 1 (κ < 1). This convergence from above and below gives us convergence in expectation using the squeeze theorem.

We then establish almost sure convergence to ¯c<sup>∗</sup> by proving a contraction argument, with the extra assumption that the selection of the Q-value to update is done independently at random in each step.

In the next step, we extend this result to BMDPs, first establishing that Q-learning almost surely converges to the *region* of Q-values less than or equal to ¯c∗. We then show that, when considering the pointwise limes inferior of the sequences of Q-values, no point in that region other than ¯c∗ is such that every ε-ball around it has a non-zero probability of being represented in the limes inferior. This establishes that ¯c∗ is the fixed point the Q-values converge to.

Only at the very end do we show that Q-learning also converges (or rather diverges) to the optimal value when that value is infinite: we consider a type with non-finite optimal value and argue that its corresponding Q-value diverges to ∞.

We assume that all the Q-values are stored in a vector Q of size |P| · |A|. We also use Q(q, a) to refer to the entry for type q ∈ P and action a ∈ A(q). We introduce the *target for* Q *operator*, T, that maps a Q-values vector Q to:

$$T(Q)(q,a) = c(q,a) + \sum\_{\alpha \in P^\*} p(q,a)(\alpha) \sum\_{i=1}^{|\alpha|} \min\_{a\_i \in A(\alpha\_i)} Q(\alpha\_i, a\_i).$$

We call T the 'target', because, when the Q(q, a) value is updated, then

$$\mathbb{E}(Q\_{i+1}(q, a)) = (1 - \lambda\_i) Q\_i(q, a) + \lambda\_i T(Q\_i)(q, a)$$

holds, whereas otherwise Q<sub>i+1</sub>(q, a) = Q<sub>i</sub>(q, a).

Thus, when Q(q, a) is selected for update with a chance of pq,a, we have that

$$\mathbb{E}(Q\_{i+1}(q,a)) = (1 - \lambda\_i p\_{q,a}) Q\_i(q,a) + \lambda\_i p\_{q,a} T(Q\_i)(q,a) \ . \tag{$\heartsuit$}$$

 

#### **5.1 Convergence for BMCs with Finite ¯c∗**

Since BMCs have only one action, we omit mentioning it for ease of notation.

Note that for BMCs, the target for the Q-values is a simple affine function:


$$T(Q)(q) = c(q) + \sum\_{\alpha \in P^\*} p(q)(\alpha) \sum\_{i=1}^{|\alpha|} Q(\alpha\_i).$$

It coincides with the operator F defined in Sect. 3. Therefore, due to Theorem 6, T has a unique fixed point, which is ¯c∗. Moreover, T(Q) = BQ + ¯c, where B is a non-negative matrix and ¯c is the vector of one-step costs c(q), which are all positive.

Naturally, applying T to a non-negative vector Q or multiplying it by B are monotone operations: Q ≥ Q′ implies T(Q) ≥ T(Q′) and BQ ≥ BQ′. Also, due to the linearity of T, E(T(Q)) = T(E(Q)) holds, where Q is a random vector.

We now start with a lemma describing the behaviour of Q-learning when the initial Q-values happen to be equal to κ¯c∗ for some κ ≥ 1.

**Lemma 8.** *Let* Q<sub>0</sub> = κ¯c∗ *for a scalar factor* κ ≥ 1*. Then the following holds for all* i ∈ N*,*

$$
\bar{c}^\* \le T(\mathbb{E}(Q\_i)) \le \mathbb{E}(Q\_{i+1}) \le \mathbb{E}(Q\_i),
$$

*assuming that the Q-value to be updated in each step is selected independently at random.*

*Proof.* We show this by induction. For the induction basis (i = 0), we have ¯c∗ ≤ Q<sub>0</sub> by definition.

As ¯c∗ is the fixed point of T, we have T(¯c∗) = ¯c∗, and the monotonicity of T provides T(¯c∗) ≤ T(Q<sub>0</sub>). At the same time,

$$\begin{aligned} T(Q\_0) = T(\kappa \bar{c}^\*) &= B\kappa \bar{c}^\* + \bar{c} \\ &= \kappa (B\bar{c}^\* + \bar{c}) - \kappa \bar{c} + \bar{c} \\ &= \kappa \bar{c}^\* - (\kappa - 1)\bar{c} \\ &= Q\_0 - (\kappa - 1)\bar{c} \le Q\_0. \end{aligned}$$

This provides ¯c∗ ≤ T(E(Q<sub>0</sub>)) ≤ E(Q<sub>0</sub>). Finally, T(E(Q<sub>0</sub>)) ≤ E(Q<sub>0</sub>) entails for a learning rate λ<sub>0</sub> ∈ [0, 1] that T(E(Q<sub>0</sub>)) ≤ E(Q<sub>1</sub>) ≤ E(Q<sub>0</sub>), due to (♥).

For the induction step (i → i + 1), we use the induction hypothesis

$$
\overline{c}^\* \le T(\mathbb{E}(Q\_i)) \le \mathbb{E}(Q\_{i+1}) \le \mathbb{E}(Q\_i).
$$

The monotonicity of T and ¯c∗ ≤ E(Q<sub>i+1</sub>) ≤ E(Q<sub>i</sub>) imply that T(¯c∗) ≤ T(E(Q<sub>i+1</sub>)) ≤ T(E(Q<sub>i</sub>)) holds. With T(¯c∗) = ¯c∗ (from the fixed point equation) and the induction hypothesis, ¯c∗ ≤ T(E(Q<sub>i+1</sub>)) ≤ E(Q<sub>i+1</sub>) follows.

Using T(E(Q<sub>i+1</sub>)) = E(T(Q<sub>i+1</sub>)), this provides E(T(Q<sub>i+1</sub>)) ≤ E(Q<sub>i+1</sub>), which implies with λ<sub>i+1</sub> ∈ [0, 1] that

$$T(\mathbb{E}(Q\_{i+1})) = \mathbb{E}(T(Q\_{i+1})) \le \mathbb{E}(Q\_{i+2}) \le \mathbb{E}(Q\_{i+1})$$

holds, completing the induction step.

By simply replacing all ≤ with ≥ in the above proof, we get the following for all initial Q-values that happen to be κ¯c∗ with κ ≤ 1:

**Lemma 9.** *Let* Q<sub>0</sub> = κ¯c∗ *for a scalar factor* κ ∈ [0, 1]*. Then, assuming that the Q-value to update in each step is selected independently at random, the following holds for all* i ∈ N*:* ¯c∗ ≥ T(E(Q<sub>i</sub>)) ≥ E(Q<sub>i+1</sub>) ≥ E(Q<sub>i</sub>)*.*

We first establish that the distance between Q and ¯c∗ can be bounded from above by the distance between Q and T(Q), up to a fixed linear factor μ > 0.

**Lemma 10.** *There exists a constant* μ > 0 *such that*

$$\sum\_{q \in P} |Q(q) - T(Q)(q)| \ge \mu \sum\_{q \in P} |Q(q) - \bar{c}^\*(q)|$$

*when* Q<sub>0</sub> = κ¯c∗*.*

*Proof.* We show this for κ > 1. The proof for κ < 1 is similar, and there is nothing to show for κ = 1.

We first consider the linear programme with a variable for each type with the following constraints for some fixed δ > 0:

$$Q \ge \bar{c}^\*, \quad T(Q) \le Q, \quad \text{and} \quad \sum\_{q \in P} Q(q) = \sum\_{q \in P} \bar{c}^\*(q) + \delta.$$

An example solution to this constraint system is

$$Q = \left(1 + \frac{\delta}{\sum\_{q \in P} \bar{c}^\*(q)}\right) \bar{c}^\*.$$

We then find a solution minimising the objective ∑<sub>q∈P</sub> |Q(q) − T(Q)(q)|, noting that, due to the constraint T(Q) ≤ Q, all entries of Q − T(Q) are non-negative. This is expressed by adding 2|P| constraints and minimising

$$\begin{aligned} x\_q &\geq Q(q) - T(Q)(q) \\ x\_q &\geq T(Q)(q) - Q(q) \end{aligned}$$

$$\sum\_{q \in P} x\_q.$$

As ¯c∗ is the only fixed point of T, and the constraint ∑<sub>q∈P</sub> Q(q) = ∑<sub>q∈P</sub> ¯c∗(q) + δ implies that Q∗ ≠ ¯c∗ for any optimal solution Q∗, we have that

$$\sum\_{q \in P} |Q^\*(q) - T(Q^\*)(q)| > 0.$$

Due to the constraint Q ≥ ¯c∗, we always have Q = ¯c∗ + Q<sub>Δ</sub> for some Q<sub>Δ</sub> ≥ ¯0. We can now re-formulate this linear programme to look for Q<sub>Δ</sub> instead of Q:

$$\begin{aligned} Q\_{\Delta} &\geq \bar{0}, \\ B Q\_{\Delta} &\leq Q\_{\Delta}, \text{and} \\ \sum\_{q \in P} Q\_{\Delta}(q) &= \delta, \end{aligned}$$

with the objective to minimise ∑<sub>q∈P</sub> |(Q<sub>Δ</sub> − BQ<sub>Δ</sub>)(q)|.

The optimal solution Q∗<sub>Δ</sub> to this linear programme gives an optimal solution Q∗ = ¯c∗ + Q∗<sub>Δ</sub> for the former and, vice versa, an optimal solution Q∗ for the former provides an optimal solution Q∗<sub>Δ</sub> = Q∗ − ¯c∗ for the latter, and these two solutions have the same value in their respective objective functions.

Thus, while the former constraint system is convenient for showing that the value of the objective function is positive, the latter constraint system is, except for the constraint ∑<sub>q∈P</sub> Q<sub>Δ</sub>(q) = δ, invariant under scaling. This means that an optimal solution for δ = δ<sub>1</sub> can be obtained from an optimal solution for δ = δ<sub>2</sub> simply by rescaling it by δ<sub>1</sub>/δ<sub>2</sub>. It follows that the optimal value of the objective function is linear in δ, i.e., there exists μ > 0 such that its value is μδ.

We now show that the sequence of Q-value updates converges in expectation to ¯c∗ when Q<sub>0</sub> = κ¯c∗.

**Lemma 11.** *Let* Q<sub>0</sub> = κ¯c∗ *where* κ ≥ 0*. Then, assuming that each type-action pair is selected for update with a minimal probability* p<sub>min</sub> *in each step, and that* ∑<sub>i=0</sub><sup>∞</sup> λ<sub>i</sub> = ∞*,* lim<sub>i→∞</sub> E(Q<sub>i</sub>) = ¯c∗ *holds.*

*Proof.* We prove this for κ ≥ 1; a similar proof shows it for any κ ∈ [0, 1]. Lemma 8 provides that all E(Q<sub>i</sub>) satisfy the constraints E(Q<sub>i</sub>) ≥ ¯c∗ and T(E(Q<sub>i</sub>)) ≤ E(Q<sub>i</sub>).

666 E. M. Hahn et al. 

Let p<sub>min</sub> be the smallest probability with which any Q-value is selected in each update step. Due to Lemma 10, there is a fixed constant μ > 0 such that

 

$$\sum\_{q \in P} |Q\_i(q) - T(Q\_i)(q)| \ge \mu \sum\_{q \in P} |Q\_i(q) - \bar{c}^\*(q)| \text{ .}$$

By taking the expected value of both sides and using that ¯c∗ ≤ T(E(Q<sub>i</sub>)) ≤ E(Q<sub>i+1</sub>) ≤ E(Q<sub>i</sub>) due to Lemma 8, we get

$$\sum\_{q \in P} \left(\mathbb{E}(Q\_i)(q) - T(\mathbb{E}(Q\_i))(q)\right) \ge \mu \sum\_{q \in P} \left(\mathbb{E}(Q\_i)(q) - \bar{c}^\*(q)\right),$$

then due to (♥) we have 

$$\sum\_{q \in P} \left(\mathbb{E}(Q\_i)(q) - \mathbb{E}(Q\_{i+1})(q)\right) \ge \mu p\_{\text{min}} \lambda\_i \sum\_{q \in P} \left(\mathbb{E}(Q\_i)(q) - \bar{c}^\*(q)\right),$$

 

and finally just by rearranging these terms we get

$$\sum\_{q \in P} \left(\mathbb{E}(Q\_{i+1})(q) - \bar{c}^\*(q)\right) \le (1 - \mu p\_{\text{min}} \lambda\_i) \sum\_{q \in P} \left(\mathbb{E}(Q\_i)(q) - \bar{c}^\*(q)\right).$$

Note that all summands are positive by Lemma 8.

With ∑<sub>i=0</sub><sup>∞</sup> λ<sub>i</sub> = ∞, we get that ∑<sub>i=0</sub><sup>∞</sup> μp<sub>min</sub>λ<sub>i</sub> = ∞, because p<sub>min</sub> and μ are fixed positive values. This implies that ∏<sub>i=0</sub><sup>∞</sup>(1 − μp<sub>min</sub>λ<sub>i</sub>) = 0, and so the distance between E(Q<sub>i</sub>) and ¯c∗ converges to 0.

Lemma 11 suffices to show convergence of Q-values in expectation.

**Theorem 12.** *When each Q-value is selected for an update with a minimal probability* p<sub>min</sub> *in each step, and* ∑<sub>i=0</sub><sup>∞</sup> λ<sub>i</sub> = ∞*, then* lim<sub>i→∞</sub> E(Q<sub>i</sub>) = ¯c∗ *holds for all starting Q-values* Q<sub>0</sub> ≥ ¯0*.*

*Proof.* We first note that none of the entries of ¯c∗ can be 0. This implies that there is a scalar factor κ ≥ 0 such that ¯0 ≤ Q<sub>0</sub> ≤ κ¯c∗. As the Q<sub>i</sub> are monotone in the entries of Q<sub>0</sub>, and as the property holds for Q′<sub>0</sub> = ¯0 = 0 · ¯c∗ and Q″<sub>0</sub> = κ¯c∗ by Lemma 11, the squeeze theorem implies that it also holds for Q<sub>0</sub>.

Convergence of the expected value is a weaker property than expected convergence, which also explains why our assumptions are weaker than in Theorem 7. With the common assumption of sufficiently fast falling learning rates, ∑<sub>i=0</sub><sup>∞</sup> λ<sub>i</sub><sup>2</sup> < ∞, we will now argue that the pointwise limes inferior of the sequence of Q-values almost surely converges to ¯c∗. This will later allow us to infer convergence of the actual sequence of Q-values to ¯c∗.

**Theorem 13.** *When each Q-value is selected for update with a minimal probability* p<sub>min</sub> *in each step,*

$$\sum\_{i=0}^{\infty} \lambda\_i = \infty \quad \text{and} \quad \sum\_{i=0}^{\infty} \lambda\_i^2 < \infty,$$

*then* lim<sub>i→∞</sub> Q<sub>i</sub> = ¯c∗ *holds almost surely for all starting Q-values* Q<sub>0</sub> ≥ ¯0*.*

*Proof.* We assume for contradiction that, for some Q̂ ≠ ¯c∗, there is a non-zero chance of a sequence {Q<sub>i</sub>}<sub>i∈N₀</sub> such that

– ‖Q̂ − lim inf<sub>i→∞</sub> Q<sub>i</sub>‖<sub>∞</sub> < ε for all ε > 0, and
– there is a type q such that Q̂(q) < T(Q̂)(q).


Then there must be an ε > 0 such that Q̂(q) + 3ε < T(Q̂ − 2ε · ¯1)(q). We fix such an ε > 0. Now we have the assumption that the probability of ‖Q̂ − lim inf<sub>i→∞</sub> Q<sub>i</sub>‖<sub>∞</sub> < ε is positive. Then, in particular, the chance that, at the same time, lim inf<sub>i→∞</sub> Q<sub>i</sub> > Q̂ − ε · ¯1 *and* lim inf<sub>i→∞</sub> Q<sub>i</sub> < Q̂ + ε · ¯1, is positive.

Thus, there is a positive chance that the following holds: there exists an n<sub>ε</sub> such that, for all i > n<sub>ε</sub>, Q<sub>i</sub> ≥ Q̂ − 2ε · ¯1. This implies

$$T(Q\_i)(q) \ge T(\hat{Q} - 2\varepsilon \cdot \bar{1})(q) > \hat{Q}(q) + 3\varepsilon.$$

Thus, the expected limit value of Q<sub>i</sub>(q) is at least Q̂(q) + 3ε, for every tail of

the update sequence. Now, we can use Q̂ − 2ε · ¯1 as a bound on the estimation of the updates in Q-learning, as Q<sub>i</sub> ≥ Q̂ − 2ε · ¯1 holds. At the same time, the variation of the sum of the updates goes to 0 when ∑<sub>i=0</sub><sup>∞</sup> λ<sub>i</sub><sup>2</sup> is bounded. Therefore, it cannot be that lim inf<sub>i→∞</sub> Q<sub>i</sub> < Q̂ + ε · ¯1 holds; a contradiction.

We note that if, for Q-values Q ≥ ¯0, there is a q′ ∈ P with Q(q′) < ¯c∗(q′), then there is a q ∈ P with Q(q) < T(Q)(q) and Q(q) < ¯c∗(q). This is because, for the Q-values Q′ with Q′(q) = min{Q(q), ¯c∗(q)} for all q ∈ P, we have Q′ < ¯c∗. Thus, there must be a type q ∈ P such that κ = Q′(q)/¯c∗(q) < 1 is minimal, and Q′ ≥ κ¯c∗. As we have shown before, T(κ¯c∗) = κ¯c∗ − (κ−1)¯c, such that the following holds:

$$T(Q)(q) \ge T(Q')(q) \ge T(\kappa \bar{c}^\*)(q) = \kappa \bar{c}^\*(q) + (1 - \kappa)c(q) > \kappa \bar{c}^\*(q) = Q(q).$$

Thus, we have that lim inf<sub>i→∞</sub> Q<sub>i</sub> ≥ ¯c∗ holds almost surely. With lim<sub>i→∞</sub> E(Q<sub>i</sub>) = ¯c∗, it follows that lim<sub>i→∞</sub> Q<sub>i</sub> = ¯c∗.

#### **5.2 Convergence for BMDPs and Finite ¯c∗**

We start by showing that, for BMDPs, the pointwise limes superior of each sequence is almost surely less than or equal to ¯c∗. We then proceed to show that the limes inferior of a sequence is almost surely ¯c∗, which together implies almost sure convergence.

**Lemma 14.** *When each Q-value of a BMDP is selected for update with a minimal probability* p<sub>min</sub> *in each step,* ∑<sub>i=0</sub><sup>∞</sup> λ<sub>i</sub> = ∞*, and* ∑<sub>i=0</sub><sup>∞</sup> λ<sub>i</sub><sup>2</sup> < ∞*, then* lim sup<sub>i→∞</sub> Q<sub>i</sub> ≤ ¯c∗ *holds almost surely for all starting Q-values* Q<sub>0</sub> ≥ ¯0*.*

*Proof.* To show the property for the limes superior, we fix an optimal static strategy σ∗, which exists due to Corollary 5.

We define a BMC obtained by replacing each type q in the BMDP, with A(q) = {a<sub>1</sub>,...,a<sub>k</sub>}, by k types (q, a<sub>1</sub>),...,(q, a<sub>k</sub>) with one action each, where each spawned type q′ is replaced by the type-action pair (q′, σ∗(q′)).

It is easy to see that a type (q, σ∗(q)) for the resulting BMC has the same value as the type q and the type-action pair (q, σ∗(q)) in the BMDP that we started with.

When identifying these corresponding type-action pairs, we can look at the same sampling for the BMDP and the BMC, leading to sequences Q<sub>0</sub>, Q<sub>1</sub>, Q<sub>2</sub>,... and Q′<sub>0</sub>, Q′<sub>1</sub>, Q′<sub>2</sub>,..., respectively, where Q<sub>0</sub> = Q′<sub>0</sub>.

It is easy to see by induction that Q<sub>i</sub> ≤ Q′<sub>i</sub>. Considering that {Q′<sub>i</sub>}<sub>i∈N</sub> almost surely converges to ¯c∗ by Theorem 13, we obtain our result.
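The reduction used in this proof can be sketched as follows. The data shapes (dictionaries mapping type-action pairs to a cost and an offspring distribution) are our own illustrative encoding, not the paper's.

```python
# Sketch of the BMDP-to-BMC reduction: every type q with actions a1..ak
# becomes k one-action types (q, a1)..(q, ak); each spawned type r is
# replaced by the pair (r, sigma[r]) fixed by the static strategy sigma.
def bmdp_to_bmc(bmdp, sigma):
    bmc = {}
    for (q, a), (cost, dist) in bmdp.items():
        # dist is a list of (probability, offspring-tuple) pairs
        new_dist = [(p, tuple((r, sigma[r]) for r in offs)) for p, offs in dist]
        bmc[(q, a)] = (cost, new_dist)
    return bmc

# Toy input: one type 'q' with two actions; sigma fixes action 'a2'.
bmdp = {
    ('q', 'a1'): (1.0, [(0.5, ()), (0.5, ('q', 'q'))]),
    ('q', 'a2'): (2.0, [(1.0, ())]),
}
bmc = bmdp_to_bmc(bmdp, {'q': 'a2'})
print(bmc[('q', 'a1')])
```

Under the strategy, the one-action type ('q', 'a1') now spawns two copies of ('q', 'a2'), mirroring how each spawned type is redirected through σ∗.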

**Theorem 15.** *When each Q-value of a BMDP is selected for update with a minimal probability* p<sub>min</sub>*,* ∑<sub>i=0</sub><sup>∞</sup> λ<sub>i</sub> = ∞*, and* ∑<sub>i=0</sub><sup>∞</sup> λ<sub>i</sub><sup>2</sup> < ∞*, then* lim<sub>i→∞</sub> Q<sub>i</sub> = ¯c∗ *holds almost surely for all starting Q-values* Q<sub>0</sub> ≥ ¯0*.*

*Proof.* As a first simple corollary from Lemma 14, we get the same result for the limes inferior (as lim inf ≤ lim sup must hold). We now assume for contradiction that, for some vector Q̂ < ¯c∗, there is a non-zero chance of a sequence {Q<sub>i</sub>}<sub>i∈N</sub> such that ‖Q̂ − lim inf<sub>i→∞</sub> Q<sub>i</sub>‖<sub>∞</sub> < ε for all ε > 0. As Q̂ is below the fixed point of T, there must be one type-action pair (q, σ∗(q)) such that Q̂(q, σ∗(q)) < T(Q̂)(q, σ∗(q)) (cf. the proof of Theorem 13).

Moreover, there must be an ε > 0 such that

$$
\hat{Q}(q, \sigma^\*(q)) + 3\varepsilon < T(\hat{Q} - 2\varepsilon \cdot \bar{1})(q, \sigma^\*(q)).
$$

We fix such an ε > 0.

Now we assume that the probability of ‖Q̂ − lim inf<sub>i→∞</sub> Q<sub>i</sub>‖<sub>∞</sub> < ε is positive. Then the chance that, simultaneously, lim inf<sub>i→∞</sub> Q<sub>i</sub>(q, σ∗(q)) > Q̂(q, σ∗(q)) − ε *and* lim inf<sub>i→∞</sub> Q<sub>i</sub>(q, σ∗(q)) < Q̂(q, σ∗(q)) + ε, is positive.

Thus, there is a positive chance that the following holds: there exists an n<sub>ε</sub> such that, for all i > n<sub>ε</sub>, we have Q<sub>i</sub> ≥ Q̂ − 2ε · ¯1. This entails

$$T(Q\_i)(q, \sigma^\*(q)) \ge T(\hat{Q} - 2\varepsilon \cdot \bar{1})(q, \sigma^\*(q)) > \hat{Q}(q, \sigma^\*(q)) + 3\varepsilon.$$

Thus, the expected limit value of Q<sub>i</sub>(q, σ∗(q)) is at least Q̂(q, σ∗(q)) + 3ε, for

every tail of the update sequence. Now, we can use T(Q̂ − 2ε · ¯1)(q, σ∗(q)) as a bound on the estimation of T(Q)(q, σ∗(q)) during the update of the Q-value of the type-action pair (q, σ∗(q)). At the same time, the variation of the sum of the updates goes to 0 when ∑<sub>i=0</sub><sup>∞</sup> λ<sub>i</sub><sup>2</sup> is bounded. Therefore, it cannot be that lim inf<sub>i→∞</sub> Q<sub>i</sub>(q, σ∗(q)) < Q̂(q, σ∗(q)) + ε holds; a contradiction.

#### **5.3 Divergence**

We now show divergence of Q(q) to ∞ when the entry ¯c∗(q) is infinite. First, due to Theorem 6 and its proof, we have that ¯c∗ = ∑<sub>i=0</sub><sup>∞</sup> B<sup>i</sup>¯c for some non-negative matrix B and positive ¯c. Therefore ¯c∗ is monotonic in B for BMCs. Likewise, the value of ¯c∗ for a BMDP depends only on the cost function and the expected number of successors of each type spawned: two BMDPs with the same cost functions and the same expected numbers of successors have the same fixed point ¯c∗. Thus, if a type q with one action spawns either exactly one q′ or exactly one q″ with a chance of 50% each, or if it spawns 10 successors of type q′ and another 10 of type q″ with a chance of 5%, while dying without offspring with a chance of 95%, both lead to identical matrices B and so to the same ¯c∗ (though this difference may impact the performance of Q-learning).

Naturally, raising the expected number of successors of any type for any type-action pair strictly raises ¯c∗, while lowering it reduces ¯c∗; and for every set of expected numbers, the value of ¯c∗ is either finite or infinite.

Let us consider a set of parameters at the fringe of finite vs. infinite ¯c∗, chosen pointwise not larger than the parameters of the BMC or BMDP under consideration. As the fixed point from Sect. 3 grows continuously in the parameter values, this set of expected successors leads to a ¯c∗ which is not finite.

We now look at the family of parameter values that lead to α ∈ [0, 1[ times the expected successors of our chosen parameters at the fringe between finite and infinite values, and refer to the result as the α-BMDP. Let ¯c∗<sub>α</sub> denote the fixed point for the reduced parameters. As the solution to the fixed point equations grows continuously, so does ¯c∗<sub>α</sub>. Moreover, if ¯c∗<sub>1</sub> = lim<sub>α→1</sub> ¯c∗<sub>α</sub> were finite, then ¯c∗ would be finite as well, because then ¯c∗<sub>1</sub> = ¯c∗.

Clearly, for all parameters α ∈ [0, 1[, the Q-values of an α-BMC or α-BMDP converge to ¯c∗<sub>α</sub>. Thus, the Q-values for the BMC or BMDP we started with converge to a value that is at least sup<sub>α∈[0,1[</sub> ¯c∗<sub>α</sub>. As this is not a finite value, Q-learning diverges to ∞.
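The effect can be seen on a one-type example. Here m, the expected number of offspring, is an illustrative parameter of our own; the iterates of T(Q) = c + mQ converge to c/(1 − m) below the fringe and grow without bound at it.

```python
# One type, expected offspring m, cost c: T(Q) = c + m*Q. Below the fringe
# (m < 1) iteration converges to c/(1-m); at the fringe (m = 1) the iterates
# grow linearly, mirroring the divergence of Q-learning when c* is infinite.
def iterate(m, c=1.0, n=1000):
    Q = 0.0
    for _ in range(n):
        Q = c + m * Q
    return Q

print(iterate(0.9))   # converges towards c/(1-m) = 10
print(iterate(1.0))   # after n steps equals n*c, diverging with n
```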

# **6 Experimental Results**

We implemented the algorithm described in the previous section in the formal reinforcement learning tool Mungojerrie [21], a C++-based tool which reads BMDPs described in an extension of the PRISM language [18]. The tool provides an interface for RL algorithms akin to that of [3] and invokes a linear programming tool (GLOP) [22] to compute the optimal expected total cost based on the optimality equations (♠).

#### **6.1 Benchmark Suite**

The BMDPs on which we tested Q-learning are listed in Table 1. For each model, the number of types in the BMDP is given. Table 1 also shows the total cost as computed by the LP solver, which has full access to the BMDP. This is followed by the estimate of the total cost computed by Q-learning and the time taken by learning. The learner has several hyperparameters: ε is the exploration rate, α is the learning rate, and tol is the tolerance for Q-values to be considered different when selecting an optimal strategy. Finally, ep-l is the maximum episode length and ep-n is the number of episodes. The last two columns of Table 1 report the values of ep-l and ep-n when they deviate from the default values. All performance data are averages over three Q-learning trials. Since costs are undiscounted, the value of a state-action pair computed by Q-learning is a direct estimate of the optimal total cost from that state when taking that action.


**Table 1.** Q-learning results. The default values of the learner hyperparameters are: ε = 0.1, α = 0.1, tol = 0.01, ep-l = 30, and ep-n = 20000. Times are in seconds.

Models cloud1 and cloud2 are based on the motivating example given in the introduction. Examples bacteria1 and bacteria2 model the population dynamics of a family of two bacteria [28] subject to two treatments. The objective is to determine which treatment results in the minimum expected cost to extinction of the bacteria population. The protein example models a stochastic Petri net description [19] of a protein synthesis example, with entities corresponding to active and inactive genes and proteins. The example frozenSmall [3] is similar to the classical frozen lake example, except that one of the holes results in branching the process into two entities. Entities that fall into the target cell become extinct. The objective is to determine a strategy that results in a minimum number of steps before extinction. Finally, the remaining 5 examples are randomly created BMDP instances.

# **7 Conclusion**

We study the total reward optimisation problem for branching decision processes with unknown probability distributions, and give the first reinforcement learning algorithm to compute an optimal policy. Extending Q-learning is hard, even for branching processes, because they lack a central property of the standard convergence proof: as the value range of the Q-table is not a priori bounded for a given starting table Q<sub>0</sub>, the variation of the disturbance is not bounded. This looks like a more substantial obstacle than the one Q-learning faces when maximising undiscounted rewards for finite-state MDPs, where it is well known that this defeats Q-learning. So it is quite surprising that we could not only show that Q-learning works for branching processes, but could extend these results to branching decision processes, too. Finally, in the previous section, we have demonstrated that our Q-learning algorithm works well on examples of reasonable size even with default hyperparameters, so it is ready to be applied in practice without the need for excessive hyperparameter tuning.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Software Verification**

# **Cameleer: A Deductive Verification Tool for OCaml**

Mário Pereira and António Ravara

NOVA LINCS, Nova School of Science and Technology, Lisbon, Portugal
{mjp.pereira,aravara}@fct.unl.pt

**Abstract.** We present Cameleer, an automated deductive verification tool for OCaml. We leverage the recently proposed GOSPEL (Generic OCaml SPEcification Language) to attach rigorous, yet readable, behavioral specifications to OCaml code. The formally-specified program is fed to our toolchain, which translates it into an equivalent one in WhyML, the programming and specification language of the Why3 verification framework. We report on successful case studies conducted in Cameleer.

**Keywords:** Deductive software verification *·* OCaml *·* Why3 *·* GOSPEL

# **1 Introduction**

Over the past decades, we have witnessed a tremendous development in the field of deductive software verification [11], the practice of turning the correctness of code into a mathematical statement and then proving it. Interactive proof assistants have evolved from obscure and mysterious tools into *de facto* standards for proving industrial-size projects. On the other end of the spectrum, the so-called *SMT revolution* and the development of reusable intermediate verification infrastructures contributed decisively to the development of practical automated deductive verifiers.

Despite all the advances in deductive verification and proof automation, little attention has been given to the family of *functional languages* [27]. Let us consider, for instance, the OCaml language. It is well suited for verification, given its well-defined semantics, clear syntax, and state-of-the-art type system. Yet, the community still lacks an easy-to-use framework for the specification and verification of OCaml code. Working programmers must either re-implement their code in a proof-aware language (and then rely on code extraction), or they must turn to interactive frameworks. Cameleer fills the gap, being a tool for the deductive verification of programs written in OCaml, with a clear

This work is partly supported by the HORIZON 2020 Cameleer project (Marie Sklodowska-Curie grant agreement ID:897873) and NOVA LINCS (Ref. UIDB/04516/2020).

focus on proof automation. Cameleer uses the recently proposed GOSPEL [5], a specification language for OCaml. We advocate here the vision of the *specifying programmer* : the person who writes the code should also be able to naturally provide suitable specification. GOSPEL terms are written in a subset of the OCaml language, which makes them more appealing to the regular programmer. Moreover, we believe specification and implementation should co-exist and evolve together, which is exactly the approach followed in Cameleer.

Cameleer takes as input a GOSPEL-annotated OCaml program and translates it into an equivalent counterpart in WhyML, the programming and specification language of the Why3 framework [16]. Why3 is a toolset for the deductive verification of software, clearly oriented towards automated proof. A distinctive feature of Why3 is that it interfaces with several different off-the-shelf theorem provers, namely SMT solvers.

*Contributions.* To the best of our knowledge, Cameleer is the first deductive verification tool for annotated OCaml programs. It handles a realistic subset of the language, and its interaction with the Why3 verification framework greatly increases proof automation. Our set of case studies successfully verified with the Cameleer tool constitutes, by itself, an important contribution towards building a comprehensive body of verified OCaml codebases. Finally, it is worth noting that the original presentation of GOSPEL was limited to the specification of interface files. In the scope of this work, we have extended it to include implementation primitives, such as loop invariants and ghost code (*i.e.*, code that has no computational purpose and is used only to ease specification and proof effort), evolving GOSPEL from an interface specification language into a more mature proof language.

# **2 Illustrative Example – Binary Search**

*Higher-Order Implementation.* Fig. 1 presents an implementation of binary search, where the comparison function, cmp, is given as an argument to the main function. For the sake of readability, we give the types of the arguments and return value of function binary search, but these can be inferred by the OCaml compiler.

The function contract is given after its definition as a GOSPEL annotation, written within comments of the form (\*@ ... \*). The first line names the returned value. Next, the first precondition establishes that cmp is a total preorder, following the OCaml convention: if x is smaller than y, then cmp x y < 0; if x is greater than y, then cmp x y > 0; finally, cmp x y = 0 if x and y are equal values<sup>1</sup>. It is worth noting that GOSPEL, hence Cameleer, assumes cmp to be a pure function (*i.e.*, a function without any form of side effects). The second precondition requires the array to be sorted according to the cmp relation. Finally, the last two clauses capture the possible outcomes of execution: the regular postcondition (ensures clause) states that the returned index is within the bounds of a and that the value of a at that index is equal to v; the exceptional postcondition (raises)

<sup>1</sup> For the sake of space, we omit the definition of predicate is_total_pre_order.

**Fig. 1.** Binary search with the comparison function passed as an argument.

states that whenever exception Not_found is raised, there is no index within bounds whose value is equal to v. As usual in deductive verification, the presence of the while loop requires one to supply a loop invariant. Here, it boils down to the two invariant clauses, which state that the limits of the search space are always within the bounds of a and that every index i for which a.(i) is equal to v must be within the limits of the current search space. We also provide a decreasing measure (variant) in order to prove loop termination.
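Since Fig. 1 is not reproduced here, the following sketch reconstructs such an implementation from the description above. The GOSPEL clause syntax and the names is_total_pre_order and sorted are approximations on our part; the actual code in the paper may differ.

```ocaml
let binary_search (cmp : 'a -> 'a -> int) (a : 'a array) (v : 'a) : int =
  let l = ref 0 and u = ref (Array.length a - 1) in
  let res = ref (-1) in
  while !res < 0 && !l <= !u do
    (* the midpoint division is one of the generated safety VCs *)
    let m = !l + (!u - !l) / 2 in
    let c = cmp a.(m) v in
    if c < 0 then l := m + 1
    else if c > 0 then u := m - 1
    else res := m
  done;
  if !res < 0 then raise Not_found else !res
(*@ r = binary_search cmp a v
      requires is_total_pre_order cmp
      requires sorted cmp a
      ensures  0 <= r < Array.length a && cmp a.(r) v = 0
      raises   Not_found ->
               forall i. 0 <= i < Array.length a -> cmp a.(i) v <> 0 *)
(* loop invariants (sketch): the search space [!l, !u] stays within the
   bounds of a, and every index i with cmp a.(i) v = 0 satisfies
   !l <= i <= !u; variant: !u - !l *)
```

Since GOSPEL lives inside regular comments, the annotated file remains an ordinary, compilable OCaml program.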

Assuming file binary_search.ml contains the program of Fig. 1, starting a proof with Cameleer is as easy as typing cameleer binary_search.ml in a terminal. Users are immediately presented with the Why3 IDE, where they can conduct the proof. Twelve verification conditions are generated for binary_search: two for loop invariant initialization, four for loop invariant preservation (two for each branch of the if..then..else), two for safety (absence of division by zero and array accesses within bounds), two for loop termination (one for each branch), and finally one for each postcondition. All of these are easily discharged by SMT solvers.

*Functor-Based Implementation.* Fig. 2 depicts (the skeleton of) an alternative implementation of the binary search routine. Instead of passing the comparison function as an argument of binary_search, here the functor Make takes as argument a module of type OrderedType, which provides a monomorphic comparison function over a type t. This is the same approach found in the OCaml standard library, namely in the Set and Map modules. The @logic attribute instructs Cameleer that cmp is both a programming and a logical function. This is what allows us to provide the axiom about the behavior of cmp.
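A sketch of this functorial skeleton is shown below; the exact spelling of the attribute and of the GOSPEL axiom clause is our approximation of Fig. 2, and the function body simply repeats the higher-order version with cmp replaced by the qualified call Ord.cmp.

```ocaml
module type OrderedType = sig
  type t
  val cmp : t -> t -> int [@@logic]
  (*@ axiom total_pre_order : is_total_pre_order cmp *)
end

module Make (Ord : OrderedType) = struct
  let binary_search (a : Ord.t array) (v : Ord.t) : int =
    let l = ref 0 and u = ref (Array.length a - 1) in
    let res = ref (-1) in
    while !res < 0 && !l <= !u do
      let m = !l + (!u - !l) / 2 in
      let c = Ord.cmp a.(m) v in   (* the only change: a qualified call *)
      if c < 0 then l := m + 1
      else if c > 0 then u := m - 1
      else res := m
    done;
    if !res < 0 then raise Not_found else !res
  (*@ ... same contract as in the higher-order version ... *)
end

(* Instantiation with a monomorphic ordered type: *)
module Int = struct type t = int let cmp = compare end
module IntSearch = Make (Int)
```

As in the Set and Map modules of the standard library, clients obtain a monomorphic search function by applying the functor.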

Other than the call to Ord.cmp, the implementation and specification of binary_search do not change, hence we omit them here. When fed to Cameleer, the functorial implementation generates the *exact same* twelve verification conditions as its higher-order counterpart, all of them easily discharged as well. Thus, the use of a functor does not impose any verification burden, showing the flexibility of Cameleer in handling different idiomatic OCaml programming styles.

**Fig. 2.** Binary search implemented as a functor.

**Fig. 3.** Cameleer verification workflow.

# **3 Implementation**

*Cameleer Workflow.* Figure 3 depicts the verification workflow of the Cameleer tool. We use the GOSPEL toolchain<sup>2</sup> to parse and manipulate (via the ppxlib library) the abstract syntax tree of the GOSPEL-annotated OCaml program. A dedicated parser and type-checker (extended to handle implementation features) treat GOSPEL special comments and attach the generated specification to nodes in the OCaml AST. Cameleer translates the decorated AST into an

<sup>2</sup> https://github.com/ocaml-gospel/gospel.

equivalent WhyML representation, which is then fed to Why3. The Why3 type-and-effect system might reject the input program, in which case the reported error is propagated back to the level of the original OCaml code. Otherwise, if the translated program satisfies Why3's requirements, the underlying VCGen computes a set of verification conditions that can then be discharged by different solvers. Throughout this pipeline, the user only has to write the OCaml code and GOSPEL specification (represented in Fig. 3 as a solid-line box), while every other element is automatically generated (dashed-line boxes). The user never needs to manipulate, or even care about, the generated WhyML program. In short, the Cameleer user intervenes at the beginning and at the end of the process, *i.e.*, in the initial specification phase and in the last step, helping Why3 to close the proof. Our development effort currently amounts to 1.8K non-blank lines of OCaml code.

*Translation into WhyML.* The core of Cameleer is a translation from GOSPEL-annotated OCaml code into WhyML. In order to guide our implementation effort, we have defined this translation as a set of inductive inference rules between the source and target languages [26]. Here, rather than focusing on the more fundamental aspects, we give a brief overview of how the translation works in practice.

OCaml and WhyML are both dialects of the ML family, sharing many syntactic and semantic traits. Hence, the translation of OCaml expressions and declarations into WhyML is rather straightforward: GOSPEL annotations are readily translated into WhyML specifications, while supported OCaml programming constructions (including ghost code) are easily mapped into semantically equivalent WhyML constructions. Consider, for instance, the following piece of OCaml code:

```
type 'a non_empty_list = { self: 'a list }
(*@ invariant self <> [] *)
let[@ghost] hd (l: 'a non_empty_list) = match l with
  | [] -> assert false
  | x :: _ -> x
(*@ r = hd l
      ensures match l with
              | [] -> false
              | x :: _ -> r = x *)
```
For such a case, Cameleer generates the following WhyML program:

```
type non_empty_list 'a = { self: list 'a }
invariant { self <> Nil }
let ghost hd (l: non_empty_list 'a)
  returns { r -> match l with
                 | Nil -> false
                 | Cons x _ -> r = x end }
= match l with
  | Nil -> absurd
  | Cons x _ -> x end
```
Other than small syntactic differences, the generated WhyML program is identical to the original OCaml one. In particular, the @ghost annotation generates a ghost function in WhyML, while the assert false expression (which is treated in a special way by the OCaml type-checker) is translated into the absurd construction, which has the same semantics. The supplied annotations, in this case the postcondition and the type invariant, are readily mapped into equivalent specifications.

The translation of the OCaml module language is more interesting and involved. A WhyML program is a list of modules, a module is a list of top-level declarations, and declarations can be organized within *scopes*, the WhyML unit for namespace management. However, there is no dedicated syntax for functors on the Why3 side. These are represented, instead, as modules containing only abstract symbols [17]. Thus, when translating OCaml functors into WhyML, we need to be more creative. If we consider, for instance, the Make functor from Fig. 2, Cameleer will generate the following WhyML program:

```
scope Make
  scope Ord
    type t
    val function cmp t t : int
    axiom total_pre_order: is_total_pre_order cmp
  end
  let binary_search a v = ...
end
```
The functor argument Ord is encoded as a nested scope inside Make. This means the binary_search implementation can access any symbol from the Ord namespace via name qualification (*e.g.*, Ord.t and Ord.cmp).

*Interaction with Why3.* One distinguishing feature of the Why3 architecture is that it can be extended to accommodate new front-end languages [32, Chap. 4]. Building on the devised OCaml-to-WhyML translation scheme, we use the Why3 API to build an in-memory representation of the WhyML program. We also register OCaml as an admissible input language for Why3, which amounts to instructing Why3 to recognize .ml files as a valid input format and to trigger our translation for such files. Following this integration, we can use any Why3 tool, out of the box, to process a .ml file. We currently use the extract and session tools: the latter to gather statistics about the number of generated verification conditions and proof times; the former to erase ghost code.

*Limitations of Using Why3.* The WhyML specification sub-language and GOSPEL are similar. Moreover, they share some fundamental principles, namely that the arguments of functions are not aliased by construction and that each data structure carries an implicit representation predicate. However, one can use GOSPEL to formally specify OCaml programs that cannot be translated into WhyML. This is evident when it comes to recursive mutable data structures. Consider, for instance, the cell type from the Queue module of the OCaml standard library<sup>3</sup>:

```
type 'a cell = Nil | Cons of { content: 'a; mutable next: 'a cell }
```

When we attempt to translate this data type, Why3 emits the following error:

```
This field has non-pure type, it cannot be used in a recursive type definition
```

Recursive mutable data types are beyond the scope of Why3's type-and-effect discipline [14], since they can introduce arbitrary memory aliasing, which breaks the *bounded-mutability* principle of Why3 (*i.e.*, all aliases must be statically known). The solution would be to resort to an axiomatic memory model of OCaml in Why3, or to employ a richer program logic, *e.g.*, Separation Logic [28] or Implicit Dynamic Frames [31]. We describe such an extension as future work (Sect. 6).
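To see why such types escape a statically-known-alias discipline, consider the following small example (our own illustration, not from the Queue module), in which two structures share a mutable tail, so a write through one is observable through the other:

```ocaml
type 'a cell = Nil | Cons of { content : 'a; mutable next : 'a cell }

(* helpers over the inline record *)
let next = function Cons r -> r.next | Nil -> Nil
let content = function Cons r -> Some r.content | Nil -> None
let set_next c v = match c with Cons r -> r.next <- v | Nil -> ()

(* two cells sharing the same mutable tail: an alias that cannot be
   tracked statically *)
let shared = Cons { content = 2; next = Nil }
let q1 = Cons { content = 1; next = shared }
let q2 = Cons { content = 0; next = shared }

let () =
  (* mutate through q1's tail ... *)
  set_next (next q1) (Cons { content = 3; next = Nil });
  (* ... and the update is visible through q2 *)
  assert (content (next (next q2)) = Some 3)
```

The set of aliases here depends on run-time data flow, which is exactly what Why3's bounded-mutability discipline rules out.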

# **4 Evaluation**

In order to assess the usability and performance of Cameleer, we have put together a test suite of over 1000 lines of OCaml code. The reported case studies are all automatically verified. To build our gallery of verified programs, we used a combination of Alt-Ergo 2.4.0, CVC4 1.8, and Z3 4.8.6. Figure 4 summarizes important metrics about our verified case studies: the number of generated verification conditions for each example; the total lines of OCaml code, of GOSPEL specification, and of ghost code (the latter also included in the OCaml LOC count), respectively; the time it takes (in seconds) to replay a proof; and finally, whether the proof is immediately discharged, *i.e.*, whether no extra user effort is required other than writing down a suitable specification.

Our test bed includes OCaml implementations issued from realistic and massively used programming libraries: the List.fold_left iterator and the Stack module from the OCaml standard library; the Leftist Heap implementation from ocaml-containers<sup>4</sup>; and finally, the applicative Queue module from OCamlgraph<sup>5</sup>. We have used Cameleer to verify programs of different natures. These include: numerical programs (*e.g.*, binary multiplication and fast exponentiation); sorting and searching (*e.g.*, binary search and insertion sort); logical algorithms (conversion of a propositional formula into conjunctive normal form); array scanning (finding duplicate values in an array of integers); small-step iterators; data structures implemented as functors (*e.g.*, Pairing Heaps and Binary Search Trees); historical algorithms (the checking of a large routine by Turing, Boyer-Moore's majority algorithm, FIND by Hoare, and binary tree same fringe); examples from Rustan Leino's forthcoming textbook "Program Proofs"; and higher-order implementations (height of a binary tree computed in CPS). Both small-step iterators and

<sup>3</sup> https://caml.inria.fr/pub/docs/manual-ocaml/libref/Queue.html.

<sup>4</sup> https://github.com/c-cube/ocaml-containers/blob/master/src/core/CCHeap.ml

<sup>5</sup> https://github.com/backtracking/ocamlgraph/blob/master/src/lib/persistentQueue.ml



**Fig. 4.** Summary of the case studies verified with the Cameleer tool.

the list fold function use a modular approach to reason about iteration [18]. Our largest case study to date is a toy compiler from arithmetic expressions to a stack machine, while Union Find features the most involved, but very elegant, specification. The former is inspired by the presentation in the Nielsons' textbook [25]; the latter follows recently proposed specification techniques [7,12] to achieve fully automatic proofs of correctness and termination.

The runtimes shown in Fig. 4 were measured by averaging over ten runs on a Lenovo ThinkPad X1 Carbon 8th Generation, running Linux Mint 20.1, OCaml 4.11.1, and Why3 1.3.3 (developer version). They show that Cameleer can effectively verify realistic OCaml code in a reasonable amount of time. Following good practices in deductive verification, Cameleer allows the user to write *ghost code* in order to ease proof and specification. The number of lines of ghost code in Fig. 4 accounts for ghost fields in record types, ghost functions, and lemma functions. In particular, the arithmetic compiler example uses lemma functions to prove, by induction, results about semantics preservation. Finally, the case studies marked accordingly in Fig. 4 required some form of manual interaction in the Why3 IDE [9]. These are very simple proofs by induction (of auxiliary lemmas) and case analyses, used to better guide the SMT solvers.

From our experience developing this gallery of verified programs, we believe the required annotation effort is reasonable, although non-negligible. Some case studies, namely the Heap implementations, feature a considerable number of lines of GOSPEL specification. However, these are classic definitions (*e.g.*, the minimum element) and results (*e.g.*, the root of the Heap is the minimum element), which are easily adapted to any variant of the Heap implementation.
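To give an idea of what such reusable definitions look like, here is a small illustrative sketch in plain OCaml (our own example, not taken from the paper's case studies; in GOSPEL these would typically be logical functions and predicates inside (*@ ... *) comments):

```ocaml
(* A toy binary-tree heap representation. *)
type tree = E | N of tree * int * tree

(* Membership: the classic definition a heap specification builds on. *)
let rec mem x = function
  | E -> false
  | N (l, v, r) -> x = v || mem x l || mem x r

(* The min-heap property: each node is no larger than its children.
   From this, "the root is the minimum element" follows by induction. *)
let rec is_min_heap = function
  | E -> true
  | N (l, v, r) ->
    let le = function E -> true | N (_, w, _) -> v <= w in
    le l && le r && is_min_heap l && is_min_heap r
```

Definitions of this kind carry over essentially unchanged between, say, a Leftist Heap and a Pairing Heap, which is why the specification effort amortizes across Heap variants.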

# **5 Related Work**

*Automated Deductive Verification.* One can cite Why3, F\* [1], Dafny [23], and Viper [24] as successful automated deductive verification tools. Formal proofs are conducted in the proof-aware language of these frameworks, and reliable executable code can then be automatically extracted. In the Cameleer project, we chose instead to develop a verification tool that accepts as input a program written directly in OCaml, rather than in a dedicated proof language. This obviates the need to rewrite entire OCaml codebases (*e.g.*, libraries) just for the sake of verification.

Regarding tools that tackle the verification of programs written in mainstream languages, one can cite Frama-C [21] (for the C language), VeriFast [20] (C and Java), Nagini [10] (Python), Leon [22] (Scala), Spec# [3] (C#), and Prusti [2] (Rust). Despite the remarkable case studies verified with these tools, programs written in these languages can quickly degenerate into a nightmare of pointer manipulation and tricky semantic issues. We argue that the OCaml language presents a number of features that make it a better target for formal verification.

Finally, language-based approaches offer an alternative path to the verification of software. Liquid Haskell [34] extends standard Haskell types with Liquid Types [29], a form of refinement types [30], in order to prove properties about realistic Haskell programs [33]. In this approach, verification conditions are generated and discharged during type-checking. This is also its major weakness: in order to remain decidable, the expressiveness of the refinement language is limited. In Cameleer, the use of GOSPEL allows us to provide rich specifications for relevant case studies, while still achieving good proof automation results.

*Deductive Verification of OCaml Programs.* Prior to our work, CFML [4] and coq-of-ocaml [8] were the only available tools for the deductive verification of OCaml-written code, via translation into the Coq proof language. On one hand, CFML features an embedding of a higher-order Separation Logic in Coq, together with a *characteristic formulae* generator. On the other hand, coq-of-ocaml compiles non-mutable OCaml programs to pure Gallina code. These two tools have been successfully applied to the verification of non-trivial case studies, such as the correctness and worst-case amortized complexity bound of a cycle detection algorithm [19], as well as part of the Tezos blockchain protocol<sup>6</sup>. However, they

<sup>6</sup> https://clarus.github.io/coq-of-ocaml/examples/tezos/.

still require a tremendous level of expertise and manual effort from users. Also, no behavioral specification is provided with the OCaml implementation: the user must write specifications at the level of the generated code, which breaks our vision that implementation and specification must coexist and evolve together.

The VOCaL project aims at developing a mechanically verified OCaml library [6]. One of the main novelties of this project is the combined use of three different verification tools: Why3, CFML, and Coq. The GOSPEL specification language was developed in the scope of this project, as a tool-agnostic language that could be manipulated by any of the three mentioned frameworks. Until now, however, these tools have used GOSPEL only for interface specification, and not as a proof language. We believe the Cameleer approach nicely complements the existing toolchains [13] in the VOCaL ecosystem.

# **6 Conclusions and Future Work**

In this paper we presented Cameleer, a tool for the automated deductive verification of OCaml programs with bounded mutability. We use the recently proposed GOSPEL language, which we also extended in the scope of this work, in order to attach formal specifications to an OCaml program. Cameleer fills a gap in the OCaml community, by providing programmers with a tool to directly specify and verify their implementations. By departing from interactive proof-based approaches, we believe Cameleer can be an important step towards bringing more OCaml programmers to include formal methods techniques in their daily routines.

The core of Cameleer is a translation from OCaml annotated code into WhyML. The two languages share many common traits (both in their syntax and semantics), so it makes sense to target this intermediate verification language in the first major iteration of Cameleer. We have successfully applied our tool and approach to the verification of several case studies. These include implementations issued from existing libraries, and scale up to data structures implemented as functors and tricky effectful computations. In the future, we intend to apply Cameleer to the verification of even larger case studies.

*What We Do Not Support.* Currently, we target a subset of the OCaml language which roughly corresponds to caml-light, with basic support for the module language (including functors). Also, WhyML limits effectful computations to the cases where alias information is statically known, which limits our support for higher-order functions and mutable recursive data structures. Adding support for the object layer of the OCaml language would require a major extension to the GOSPEL language and a redesign of our translation into WhyML. Nonetheless, Why3 has been used in the past to verify Java-written programs [15], so in principle an encoding of OCaml objects in WhyML is possible.

We do not support some of the more advanced type features of OCaml, namely Generalized Algebraic Data Types (GADTs) and polymorphic variants. One way to support such constructions would be to extend the type system of Why3 itself, which would likely mean a considerable redesign of the WhyML language. Another possible route is to extend the core of Cameleer with the ability to translate OCaml code into other, richer, verification frameworks.

*Interface with Viper and CFML.* In order to augment the class of OCaml programs we can treat, we plan on extending Cameleer to target the Viper infrastructure and the CFML tool. On one hand, Viper is an intermediate verification language based on Separation Logic but oriented towards SMT-based software verification, allowing one to automatically verify heap-dependent programs. On the other hand, the CFML tool allows one to verify effectful higher-order programs. We plan on extending the CFML translation engine, in order to take source-code level GOSPEL annotations into account. Since it targets the rich proof language and type system of Coq, it can in principle be extended to reason about GADTs and other advanced OCaml features. Even if it relies on an interactive proof assistant, CFML provides a comprehensive tactics library that eases proof effort.

Our ultimate goal is to grow Cameleer to a verification tool that can simultaneously benefit from the best features of different intermediate verification frameworks. Our motto: we want Cameleer to be able to verify parts of OCaml code using Why3, others with Viper, and some very specific functions with CFML.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# LLMC: Verifying High-Performance Software

Freark I. van der Berg

Formal Methods and Tools, University of Twente, Enschede, The Netherlands f.i.vanderberg@utwente.nl

Abstract. Multi-threaded unit tests for high-performance thread-safe data structures typically do not test all behaviour, because only a single scheduling of threads is witnessed per invocation of the unit tests. Model checking such unit tests allows one to verify all interleavings of threads. These tests can be written in or compiled to LLVM IR. Existing LLVM IR model checkers like divine and Nidhugg use an LLVM IR interpreter to determine the next state. This paper introduces llmc, a multi-core explicit-state model checker of multi-threaded LLVM IR that translates LLVM IR to LLVM IR that is *executed* instead of interpreted. A test suite of 24 tests, stressing data structures, shows that on average llmc clearly outperforms the state-of-the-art tools divine and Nidhugg.

# 1 Introduction

High-performance software often uses thread-safe data structures to allow multiple threads access to the data without corrupting it. Unit tests for such data structures typically do not test all behaviour, because the thread scheduler of the run-time environment non-deterministically chooses only a single interleaving. Thus, only a single trace is witnessed each time the unit test is invoked. If we *model check* [1] these unit tests, however, we can witness all possible traces by exploring all thread schedules. Because it does not depend on the run-time environment, model checking can become part of a continuous integration pipeline, enabling push-button verification of multi-threaded software.

These thread-safe data structures can be written in or compiled to LLVM IR, the intermediate representation of the LLVM Project [2]. The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Many front-ends for LLVM IR exist, for example for C, C++, Java, Ruby, and Rust, potentially allowing an LLVM IR model checker to be usable for many languages.

#### 1.1 Related Work

Model checkers that operate on LLVM IR already exist, for example divine, Nidhugg, RCMC and LLBMC. divine [3] is a stateful multi-core model checker of multi-threaded LLVM IR. It offers many features, such as capturing I/O during model checking, the SC and TSO memory models, and library support including libc and libpthread. Input programs are linked with divine's operating system layer, DiOS, and are interpreted as a whole on the DiVM virtual machine.

© The Author(s) 2021

A. Silva and K. R. M. Leino (Eds.): CAV 2021, LNCS 12760, pp. 690–703, 2021. https://doi.org/10.1007/978-3-030-81688-9_32

divine detects memory operations to thread-private memory by traversing the heap on-the-fly and recognizing whether a memory-object is known only to one thread or to multiple [4]. In the former case, memory operations to that memory-object can be *collapsed*, i.e. joined with the previous instruction.

Nidhugg [5] is a stateless multi-core model checker of multi-threaded LLVM IR that uses an LLVM IR interpreter. It features a sophisticated partial-order reduction, *rfsc* [6], that categorizes traces according to which read reads from which write, and traverses only one trace in each category. In practice this reduction is quite powerful. However, Nidhugg comes with a caveat: because Nidhugg is stateless, common prefixes of traces are traversed once per trace instead of once in total. This downside of a stateless approach becomes more pronounced with longer and more frequently occurring common prefixes. Moreover, Nidhugg might not terminate in the presence of infinite loops.

RCMC [7] is also a stateless LLVM IR model checker. During execution within its LLVM IR interpreter, it keeps track of a happens-before graph of all observed memory operations. Using this, RCMC can determine the possible values a read can observe, without simply executing all interleavings of all threads. Unlike Nidhugg, it does not support heap memory and is only released in binary form.

CBMC [8] is a bounded model checker for C and C++ programs, using SMT solving to check for memory safety, exceptions, undefined behaviour and assertions. Loops and recursion are a problem for CBMC when their bound cannot be determined: one needs to set an upper bound on the number of unwindings.

LLBMC [9] is similar to CBMC, using SMT-solving to find bugs, but only for single-threaded C/C++ programs and it operates on LLVM IR.

Other, less related tools include SMACK [10], SeaHorn [11] and KLEE [12].

#### 1.2 Contribution

This paper introduces llmc 0.2, a stateful multi-core model checker of multi-threaded LLVM IR. Instead of using an LLVM IR interpreter like divine, Nidhugg and llmc 0.1 [13], it transforms input LLVM IR to LLVM IR that implements the dmc api, the next-state interface to the model checker dmc [14]. We call this transformation process ll2dmc; combined with dmc (Fig. 1), it allows for up to three orders of magnitude higher throughput (states/s) than divine. At present, llmc lacks sophisticated state space reductions, resulting in state spaces roughly two orders of magnitude larger than divine's. We compared llmc to divine and Nidhugg using a test suite covering various data structures. Overall, despite the lack of sophisticated reductions, llmc is on average an order of magnitude faster than divine and <sup>∼</sup>3.8x faster than Nidhugg. Additionally, llmc is able to compute the state spaces of the tests where divine or Nidhugg fail.

Fig. 1. The flow of how an LLVM IR input program is verified in llmc.

# 2 LLMC: Low-Level Model Checker

This section explains how the transformation process (ll2dmc) transforms the input LLVM IR of a program to LLVM IR that implements the dmc api. llmc supports LLVM IR compiled from C and C++, by handling a number of builtins (e.g. \_\_atomic\_\* for atomic memory operations), part of libpthread (for thread support), part of libc (e.g. memory allocation) and global constructors.

#### 2.1 DMC Model Checker

The model created by ll2dmc is given to dmc to explore. Dmc interacts with the model via the dmc api (NextState API and dtree API combined) as illustrated in Fig. 2: after requesting the initial state from the model, dmc continues to request successor states, until the state space has been generated. A state is a vector of 32-bit integers; two states need not be of the same length.

The states are stored in the concurrent compression tree dtree [14], allowing lossless compression, fast insertion and duplicate detection of states. When inserted, states are given a unique StateID. A StateID can be stored in states as well, thus allowing the creation of a DAG of states: a *root-state* and *sub-states*. Additionally, dtree allows incremental updates to a state, without having the actual contents of the state, and it allows partial reconstruction of states. This *delta interface* uses the StateID to identify states and can avoid needless copying of entire states, increasing performance. Dmc exposes these dtree features as part of the dmc api [14].

Fig. 2. DMC model checker

#### 2.2 Input Language to LL2DMC: LLVM IR

To understand how llmc handles input LLVM IR [2], we briefly explain it here. LLVM IR supports control flow by way of basic blocks. A basic block is a list of instructions that execute sequentially; the last instruction of a basic block is a terminator instruction, such as a branch (jump) instruction or a return statement.

LLVM IR uses static single assignment (SSA) form for register values. To support data flow that depends on control flow, φ-nodes exist. These are instructions at the beginning of a basic block that take a value depending on which basic block control jumped from to reach the block containing the φ-nodes.

#### 2.3 Output of LL2DMC: Model Implementing DMC API

The output of ll2dmc is a model that implements the NextState API part of the dmc api of the model checker dmc [14]. The NextState API requires two interfaces from a model: one to communicate the initial state and one to generate next states, given a state.

The *initial state* of a model generated by ll2dmc is as if one had just started the program: registers are unused, global memory is initialized to 0, and a call to the global constructors (@llvm.global\_ctors) is set up. Global constructors are functions that are called before main and perform memory and other miscellaneous initialization, so that the executable is set up properly before main is invoked. Setting up the initial state in this manner allows the global constructors to be part of the state space and thus to be checked as well.

Starting with the initial state, dmc will keep asking the model to generate the next states for a given state, by invoking the *next-state interface* of the model, until there are no new states left from which to request next states. Given a state, the next-state interface determines the states reachable from that state. In the case of a model generated by ll2dmc, the global constructors of the modelled program are explored first, so faults in global constructors are detected as well. When the global constructors have completed, a call to main is set up. From this point, the exploration proceeds until no new states are visited.

#### 2.4 State Space Exploration

This section describes the next-state function and how it is generated from LLVM IR. Figure 3 describes what a state looks like. A state contains information not unlike what an operating system keeps track of [15]. All instructions are mapped to a unique index, such that the PC (program counter) uniquely identifies the current position in the code. The field Thread Results holds the return values of finished threads; the field #threads specifies the number of threads in the current state. The remainder of the state constitutes a list of per-thread data.

Each thread has its own PC and can independently manipulate it by function calls or branching. Status fields indicate whether the thread/program is running, done or failed. Each thread has its own set of Registers, the current state of the LLVM IR registers. The size of Registers is determined by the function requiring the largest number of LLVM IR registers. Function calls manipulate these registers and the list of stack frames described by Previous frame.

A Field is a StateID referring to a sub-state, as described in Sect. 2.1. The separation into a root-state and sub-states allows sub-states to grow, and allows the state storage component of dmc, dtree, to compress them using tree compression [14]. It also allows the use of the delta interface: a write to memory can be simply translated to a single, efficient call, taking the current Memory index, the offset to write to, and the new data. The resulting index can then be written to Memory.

Fig. 3. A description of the state used by llmc.

A single LLVM IR instruction in the program is translated to many LLVM IR instructions in the model. We distinguish LLVM IR registers in the model from registers in the source program by calling the former *model-registers*. In general, a single LLVM IR instruction is translated to a single step with three phases. In the *Preamble* phase, the operands of the source LLVM IR instruction are remapped to model-registers and loaded from Registers or Memory. In the *Action* phase, the source LLVM IR instruction is cloned, with the operands remapped to the LLVM IR model-registers set up during the Preamble phase. In the *Epilogue* phase, if the source LLVM IR instruction assigns a value to a register, the value returned by the cloned instruction is written to Registers.

Listing 1 illustrates how a step is performed as part of the next-state function. Multiple steps can be performed as part of the same transition (line 8), as long as the changes are local to the thread (line 4). This is explained in more detail in Sect. 2.5. The step function is called for every thread in the state vector.

#### 2.4.1 Register Manipulation

Note that the Registers are not separated into a sub-state like Memory. We chose this design so that simple register-manipulating LLVM IR instructions need no indirection and translate directly to an identical instruction, with operands mapped such that they are loaded from Registers and the return value written back to the corresponding register. This also lets us trivially collapse such instructions, combining their Preamble phases so that dependencies need to be loaded only once.

#### 2.4.2 Memory Instructions

Memory instructions such as loads and stores can be directly mapped to the delta interface, reading or writing only a part of the Memory sub-state. There is no distinction between memory allocated on the stack (alloca) and on the heap

Listing 1 In the next-state function, the step function is called for each thread.

```
1  void step(StateVector sv, int threadID)
2    bool onlyLocal = true;  # true while handling commutative instructions
3    bool emit = false;      # set to true when a new state is to be emitted
4    while(sv.threads[threadID].pc > 0 and onlyLocal)
5      switch(sv.threads[threadID].pc)
6        case 0: break;      # not running, do nothing
7        case SomePC:        # PC of first instruction of group
8          # statically collapsed instructions: preamble, action, epilogue
9          # sv.threads[threadID].pc, onlyLocal and emit may change
10         ...
11   if(emit) MC.insert(sv); # emit new state if needed
```
(malloc): both allocate memory by growing the Memory sub-state. The returned pointer describes which thread created the memory and the offset within the sub-state. Any thread can write to and read from any such memory location. At present, memory cannot be freed, so free has no effect. Because of the tree compression, this has no detrimental effect on memory usage, but does mean llmc currently does not detect free-related bugs.
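The pointer layout is not specified beyond "thread plus offset"; the following sketch assumes, purely for illustration, a 16/48-bit split:

```python
OFFSET_BITS = 48  # assumed split; the paper does not specify bit widths

def make_pointer(thread_id: int, offset: int) -> int:
    # Encode which thread created the memory and the offset in its sub-state.
    return (thread_id << OFFSET_BITS) | offset

def pointer_thread(p: int) -> int:
    return p >> OFFSET_BITS

def pointer_offset(p: int) -> int:
    return p & ((1 << OFFSET_BITS) - 1)
```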

#### 2.4.3 Branching, Function Calls and Threading

To support control flow in llmc, the PC can be changed to the index assigned to the first instruction in the target basic block. If the target basic block contains φ-nodes, those registers are updated to the value corresponding to the basic block we are branching from.
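A minimal sketch of this φ-node update (names and data layout are illustrative, not llmc's representation):

```python
def branch(registers, phi_nodes, from_block, to_block):
    # phi_nodes maps a target block to its phi-registers; each phi-register
    # maps a predecessor block to the register holding its incoming value.
    regs = registers.copy()
    for dst, incoming in phi_nodes.get(to_block, {}).items():
        regs[dst] = registers[incoming[from_block]]
    return regs
```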

A function call sets up a new stack frame with the current Registers, the PC and the register that will receive the return value, then pushes it onto the linked list of frames pointed to by Previous frame. A return pops the top frame from the list, copies its Registers into the state vector, updates the PC and writes the return value into the designated register. There is no bound on the number of frames; the bottom frame has Previous frame set to 0, indicating there is no earlier frame.
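This call/return behaviour can be mimicked as follows; a Python dict and tuple stand in for the state vector and the sub-state-encoded frame list, so this is a sketch rather than llmc's implementation:

```python
def call(state, target_pc, ret_reg):
    # Push a frame: saved registers, return PC, destination register, and a
    # link to the previous frame (0 terminates the list, as in llmc).
    frame = (state["registers"].copy(), state["pc"], ret_reg, state["frames"])
    return {"registers": [0] * len(state["registers"]),
            "pc": target_pc,
            "frames": frame}

def ret(state, value):
    # Pop the top frame, restore registers, and deliver the return value.
    saved_regs, ret_pc, ret_reg, prev = state["frames"]
    regs = saved_regs.copy()
    regs[ret_reg] = value
    return {"registers": regs, "pc": ret_pc, "frames": prev}
```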

Threads are created (pthread\_create) by enlarging the root state with enough space to fit another thread and incrementing #threads. When a thread is done, it is marked as such, but not removed from the state vector, in order to retain the memory the thread allocated. Due to the compression of dtree, this has little impact on the memory footprint of the state space. The return value of the thread is added to Thread results, where it can be read (pthread\_join).

#### 2.5 State Space Reduction

Instructions whose effect is local to a thread do not change the behaviour of other threads. Such instructions are commutative: their relative ordering is irrelevant, so they can be collapsed with the previous or next instruction. For example, instructions that read and write only a thread's own registers are local and do not influence other threads. Branching and function calls are other such commutative instructions.

llmc collapses commutative instructions both statically and dynamically. The latter is needed to collapse instructions after conditional control flow, where the condition is statically unknown. On the fly, llmc evaluates the condition, takes the branch and determines whether the next instruction can be collapsed.

#### 2.5.1 Thread-Private Memory

llmc collapses all such commutative instructions, with one important exception: memory operations on memory accessible only to the current thread (memory operations on memory accessible to other threads are never collapsed). Collapsing these requires knowledge of what memory each thread can access, which llmc currently does not track. divine implements this [4] by traversing the memory graph in every state, using a run-time type system to identify pointers and how to follow them (edges); each allocation yields a node.

Nidhugg uses a partial-order reduction [6] that takes into account from which write a read obtains its value. In this process, memory operations on thread-private memory are indeed collapsed, because such a read can read only a single value: the last value written by the thread itself. The current version of llmc does not feature an on-the-fly state space reduction for memory operations. Instead, we preprocess the input LLVM IR and statically annotate memory operations that cannot be proven to be local to a thread. While this does reduce the state space, because many operations are to stack variables that remain thread-private, it can only approach the on-the-fly reductions of divine and Nidhugg.

# 3 Evaluation

Table 1 shows a feature comparison between the tools mentioned in Sect. 1.1. The table shows that RCMC and CBMC do not support dynamic memory in the presence of multiple threads. This limits their usability for our use case, model checking multi-threaded tests of data structures, since numerous thread-safe data structures use dynamic memory. Furthermore, RCMC, CBMC and LLBMC do not support infinite loops and only have limited support for spin-locks. More complex infinite loops, like appending a new node to the Michael-Scott queue [17] using compare-and-swap, are not supported. Thus, we focus on an experimental comparison between llmc, divine and Nidhugg on execution time, memory footprint of the state space and scalability across multiple threads, since all three tools support using multiple threads for model checking.


Table 1. A feature comparison between the tools mentioned in Sect. 1.1.

*<sup>a</sup>* Models [16]: S) Sequentially consistent; T) TSO; P) PSO; W) POWER; A) ARM.

*<sup>b</sup>* Not supported in combination with threads.

*<sup>c</sup>* Only trivial spin-locks are supported.

*<sup>d</sup>* Threads within global constructors not supported.

We ran our experiments on a Dell R930 with 4 E7-8890-v4 CPUs totaling 96 cores and 2 TiB RAM. All sources were compiled using GCC 9.3.0.

### 3.1 Test Suite

We tested the tools using four real-world concurrent LLVM IR data structures, one concurrent algorithm and one protocol. Sources for all tests are available online<sup>1</sup>. We instantiate the tests with various combinations of the number of threads and the number of elements inserted, processed or dequeued. All combinations are listed later, in Table 2. These six tests cover different classes of problem types and different shapes of state spaces, and serve to illustrate the strengths and weaknesses of the tools:


These tests highlight the strengths and weaknesses of each tool using real-world data structures and algorithms: the well-known Michael-Scott queue, for example, is used in many software packages. They also reflect different *kinds* of state spaces: LinkedList produces "wide" state spaces with many end states; SortedLinkedList exemplifies state spaces that go wide but converge into a single end state; Prefixsum highlights a model checker's ability to detect thread-local memory: tools that detect it have a narrow state space, while those that do not must explore all interleavings.

<sup>1</sup> https://github.com/bergfi/llmc/tree/cav2021/tests/performance.

#### 3.2 Observations and Considerations

For each model, we verified that all expected end states were reachable. For example, for <sup>1</sup>, we manually verified that all 8!/(4!4!) = 70 possible outcomes of the linked list were generated.
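The count of 70 outcomes can be checked directly: assuming (as a reading of the test setup) two threads each inserting four elements, an outcome is determined by which 4 of the 8 final positions hold the first thread's elements:

```python
import math

# 8!/(4! * 4!) interleavings, i.e. "8 choose 4".
outcomes = math.factorial(8) // (math.factorial(4) * math.factorial(4))
```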

We witnessed divine returning varying state space sizes across different runs on the same test when using multiple threads, indicating a concurrency problem. It also occasionally crashed, most often when using 192 threads. Even though this indicates the answers divine gives might not be correct, we opted to include the results, assuming they would at least provide an indication of the performance.

Furthermore, we did run RCMC on a number of tests. RCMC often ran out of memory before crashing, likely the result of an infinite loop. Even for some small tests, it could not finish within 100x the time the other tools needed.

#### 3.3 Experimental Results

Figure 4 shows the results of llmc compared to divine on state space exploration time (4a) and Nidhugg on wall-clock time (4b) when applied to the models from Table 2. These graphs indicate relative performance: the uppermost (blue) line, for example, marks where llmc is 100x faster. Figure 4c compares llmc (lower data points) and divine (upper data points) on the memory compression of the state spaces they generate. Figure 4d compares llmc (upper data points) and divine (lower data points) on the throughput of states per second.

#### 3.3.1 LLMC vs DIVINE

Looking at the results in Fig. 4a, we see that llmc outperforms divine by at least 5x in all test cases except Prefixsum and two SortedLinkedList tests. llmc suffers in the Prefixsum tests because it lacks dynamic thread-private memory detection. This results in significantly larger state spaces, up to three orders of magnitude for <sup>4</sup>, as seen in Fig. 4c.

Comparing the sorted and non-sorted linked list cases, we notice that llmc outperforms divine by higher factors in the non-sorted cases than in the sorted ones. The difference is explained by the two tools generating more similarly sized state spaces for the non-sorted cases than for the sorted ones. For example, llmc generates <sup>∼</sup>14.4x more states than divine for <sup>4</sup>, but only <sup>∼</sup>2.2x more for <sup>4</sup>. This highlights that llmc lacks a reduction technique that works well for divine in the sorted cases, but less so in the non-sorted cases.

For the two Hashmap cases that both tools completed, llmc outperforms divine by 8.4x and 157x. Since the hash map is a single global memory object all threads can access, llmc does not have the disadvantage of lacking a dynamic thread-private memory reduction. divine crashed for the two other test cases.

divine is unable to complete two of the four Michael-Scott queue tests, crashing out; the other two are verified 86x and 272x faster by llmc than by divine.

As the complexity of the Philosopher test cases increases, llmc increasingly outperforms divine. The two tools generate similarly sized state spaces,

Fig. 4. All experimental results, see Table 2 for a legend. Results above the DNF line mean the tool on the y-axis Did Not Finish, not supporting the test.

because the high contention leaves relatively few memory instructions to be collapsed by divine's reduction, thus levelling the playing field.

In summary, llmc outperforms divine in most of the test cases, mostly by 10x–100x, with an outlier as high as 2450x ( ). This highlights the performance difference: on average, llmc visits <sup>∼</sup>1.4M

Table 2. The six tests with various combinations of number of threads and elements, totaling 24 input programs. MSQ configurations describe a combination of Enqueuers and ([B]locking) Dequeuers in parallel (-) and sequential (;).


states per second (∼8.5M states/s for ), where divine visits <sup>∼</sup>4k states per second (Fig. 4d).

#### 3.3.2 LLMC vs Nidhugg

Moving on to Fig. 4b, we notice Nidhugg is unable to complete any of the Michael-Scott queue , Hashmap or Philosopher test cases. This is because Nidhugg supports neither the \_\_atomic\_\* instructions needed for the Michael-Scott queue nor the spin-lock used in the Hashmap and Philosopher tests. We tried Nidhugg's transformation capabilities to transform the spin-lock to an assume statement, thus limiting the traces traversed to the ones where the condition of the spin-lock holds, but the generated LLVM IR was invalid and could not be used. Additionally, we tried an experimental version (7b8be8a) with a changelog containing potential fixes to no avail.

We see that Nidhugg outperforms llmc in the Prefixsum test cases consistently by multiple orders of magnitude: Nidhugg traverses only a *single* trace for each of these test cases. This highlights the strength of Nidhugg in its ability to conclude that each read can only read a single value. Without this technique, llmc needs to exhaustively go through all interleavings of the threads.

For the linked list, sorted and non-sorted , we see that as the cases get bigger, llmc is able to outperform Nidhugg. This highlights the disadvantage of stateless model checking: bigger state spaces tend to cause more common prefixes of paths, which causes more work for stateless model checking.

#### 3.3.3 Scalability

Figure 5 shows the results for various numbers of threads for SortedLinkedList3.9 <sup>3</sup>, chosen for the performance similarity of the three tools. The graph shown is typical: other tests expose patterns similar to the one we highlight here. divine does not scale well in the number of threads: its peak performance typically lies around 4 or 8 threads, as confirmed by the divine developers<sup>2</sup>. Nidhugg, as expected, scales very well, since its threads each execute a specific trace with hardly any communication. llmc shows some scalability, but a <sup>∼</sup>4x improvement using 192 threads leaves a lot of room for improvement<sup>3</sup>.

Fig. 5. Scalability comparison of divine , llmc , Nidhugg .

<sup>2</sup> https://divine.fi.muni.cz/trac/ticket/44.

<sup>3</sup> https://github.com/bergfi/dmc/issues/1.

#### 3.3.4 dmc and dtree

We highlight one aspect of llmc's performance: the underlying model checker dmc and its storage component dtree [14]. In Fig. 4c, we notice that although llmc on average generates state spaces an order of magnitude larger than divine's, it uses two orders of magnitude less memory per state, due to dtree. Furthermore, dtree allows applying a delta to a state without reconstructing the entire state. Since states in these tests are typically ∼2 KiB, this avoids substantial memory copying and increases performance.
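The sharing that makes this compression effective can be sketched with a toy hash-consed binary tree store in the spirit of dtree [14]; this is an illustration of the idea, not dtree's actual implementation:

```python
class TreeStore:
    def __init__(self):
        self._nodes = {}   # (left, right) -> node id (hash-consing table)

    def intern(self, left, right):
        key = (left, right)
        if key not in self._nodes:
            self._nodes[key] = len(self._nodes)
        return self._nodes[key]

    def store(self, chunks):
        # Build a balanced binary tree over the chunks (length must be a
        # power of two in this toy version); identical subtrees are shared,
        # so states differing in one chunk share all other nodes.
        level = list(chunks)
        while len(level) > 1:
            level = [self.intern(level[i], level[i + 1])
                     for i in range(0, len(level), 2)]
        return level[0]
```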

# 4 Conclusion

We have introduced llmc 0.2<sup>4</sup>, the multi-threaded low-level model checker that model checks software via LLVM IR. It translates the input LLVM IR into a model LLVM IR that implements the dmc API, the API of the high-performance model checker dmc. This allows llmc to *execute* the model's next-state function, instead of *interpreting* the input LLVM IR as divine and Nidhugg do. We compared llmc to these tools using a test suite of 24 tests, covering various data structures. llmc outperforms divine and Nidhugg by up to three orders of magnitude, while other tests have shown areas for improvement. Averaging the results of all completed tests, llmc is an order of magnitude faster than divine and <sup>∼</sup>3.4x faster than Nidhugg. divine and Nidhugg are unable to complete 4 and 12 tests, respectively, due to crashing or not supporting infinite loops or \_\_atomic\_\* library calls.

*Future Work.* llmc will benefit most from a state space reduction technique that collapses memory instructions to thread-private memory. We aim to integrate this as part of a memory emulation layer that also adds support for relaxed memory models. Even without this dynamic reduction technique, the results show that llmc in its current form is a high-performing tool for model checking software.

# References


<sup>4</sup> https://github.com/bergfi/llmc.


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# Formally Validating a Practical Verification Condition Generator

Gaurav Parthasarathy1(B), Peter Müller<sup>1</sup>, and Alexander J. Summers<sup>2</sup>

<sup>1</sup> Department of Computer Science, ETH Zurich, Zurich, Switzerland {gaurav.parthasarathy, peter.mueller}@inf.ethz.ch

<sup>2</sup> University of British Columbia, Vancouver, Canada alex.summers@ubc.ca

Abstract. A program verifier produces reliable results only if both the *logic* used to justify the program's correctness is sound, and the *implementation* of the program verifier is itself correct. Whereas it is common to formally prove soundness of the logic, the implementation of a verifier typically remains unverified. Bugs in verifier implementations may compromise the trustworthiness of successful verification results. Since program verifiers used in practice are complex, evolving software systems, it is generally not feasible to formally verify their implementation.

In this paper, we present an alternative approach: we *validate successful runs* of the widely-used Boogie verifier by producing a *certificate* which proves correctness of the obtained verification result. Boogie performs a complex series of program translations before ultimately generating a verification condition whose validity should imply the correctness of the input program. We show how to certify three of Boogie's core transformation phases: the elimination of cyclic control flow paths, the (SSA-like) replacement of assignments by assumptions using fresh variables (passification), and the final generation of verification conditions. Similar translations are employed by other verifiers. Our implementation produces certificates in Isabelle, based on a novel formalisation of the Boogie language.

# 1 Introduction

Program verifiers are tools which attempt to prove the correctness of an implementation with respect to its specification. A successful verification attempt is, however, only meaningful if both the *logic* used to justify the program's correctness is sound, and the *implementation* of the program verifier is itself correct. It is common to formally prove soundness of the logic, but the implementations of program verifiers typically remain unverified. As is standard for complex software systems, bugs in verifier implementations can and do arise, potentially raising doubts as to the trustworthiness of successful verification results.

One way to close this gap is to prove a verifier's implementation correct. However, such a *once-and-for-all* approach faces serious challenges. Verifying an existing implementation bottom-up is not practically feasible because such implementations tend to be large and complex (for instance, the Boogie verifier [29] consists of over 30K lines of imperative C# code), use a variety of libraries, and are typically written in efficient mainstream programming languages which themselves lack a formalisation. Alternatively, one could develop a verifier that is correct by construction. However, this approach requires the verifier to be (re-)implemented in an interactive theorem prover (ITP) such as Coq [14] or Isabelle [24]. This precludes the free choice of implementation language and paradigm, exploitation of concurrency, and possibility of tight integration with standard compilers and IDEs, which is often desirable for program verifiers [4,5,13,26]. Both verification approaches substantially impede software maintenance, which is problematic since verifiers are often rapidly-evolving software projects (for instance, the Boogie repository [1] contains more than 5000 commits).

To address these challenges, in this work we employ a different approach. Instead of verifying the implementation once and for all, we *validate specific runs* of the verifier by automatically producing a *certificate* which proves the correctness of the obtained verification result. Our certificate generation formally relates the input and output of the verifier, but does so largely independently of its implementation, which can freely employ complex languages, algorithms, or optimisations. Our certificates are formal proofs in Isabelle, and so checkable by an independent trusted tool; their guarantees for a certified run of the verifier are as strong as those provided by a (hypothetical) verified verifier.

We apply our novel verifier validation approach to the widely-used Boogie verifier, which verifies programs written in the intermediate verification language Boogie. The Boogie verifier is a *verification condition generator* : it verifies programs by generating a verification condition (VC), whose validity is then discharged by an SMT solver. Certifying a verifier run requires proving that validity of the VC implies the correctness of the input program. Certification of the validity-checking of the VC is an orthogonal concern; our results can be combined with work in that area [11,15,19] to obtain end-to-end guarantees.

Like many automatic verifiers, Boogie is a *translational verifier* : it performs a sequence of substantial Boogie-to-Boogie translations (*phases*), simplifying the task and output of the final efficient VC computation [6,18]. The key challenges in certifying runs of the Boogie tool are to certify each of these phases, including final VC generation. In particular, we present novel techniques for making the following three key phases (and many smaller ones) of Boogie's tool chain certifying:


3. The final generation of the VC, which includes the erasure and logical encoding of Boogie's polymorphic type system [33] *(VC phase)*.

The certification of such verifier phases is related to existing work on compiler verification [34] and validation [8,41,42]. However, the translations and the certified property we tackle here are fundamentally different from those in compilers. Compilers typically require that each execution of the target program corresponds to an execution of the source program. In contrast, the encoding of a program in a translational verifier typically has intentionally more executions (for instance, allows more non-determinism). Moreover, translational verifiers need to handle features not present in standard programming languages such as **assume** statements and background theories. Prior work on validating such verifier phases has been limited in the supported language and extent of the formal guarantee; we discuss comparisons in detail in Sect. 8.

Contributions. Our paper makes the following technical contributions.


Making the Boogie verifier certifying is an important result, reducing the trusted code base for a wide variety of verification tools implemented via encodings into Boogie, e.g. Dafny [31], VCC [13], Corral [28], and Viper [35]. Moreover, the technical approach we present here can in future be applied to certify the translations performed by these tools, and those of tools built on comparable intermediate verification languages, such as Frama-C [26] and Krakatoa [17] (based on Why3 [16]) and Prusti [4] and VerCors [10] (based on Viper [35]).

*Outline.* Section 2 explains, at a high level, how our validation approach is structured for the different phases. Section 3 introduces a formal semantics for Boogie. Sections 4, 5 and 6 present our validation of the CFG-to-DAG, passification, and VC phases, respectively. Section 7 evaluates our certificate-producing version of Boogie. Section 8 discusses related work. Section 9 concludes. Further details are available in our accompanying technical report (hereafter, TR) [37].

# 2 Approach

A Boogie program consists of a set of procedures, each with a specification and a procedure body in the form of a (reducible) control-flow-graph (CFG), whose blocks contain basic commands; we present the formal details in the next section. Boogie verifies each procedure modularly, desugaring procedure calls according

Fig. 1. Key phases of verification in Boogie and their certification. The solid edges show Boogie's transformations on a procedure body; each node G<sup>i</sup> represents a control-flow graph. Our final certificate (dashed edge) is constructed by formally linking the three phase certificates represented by the dotted edges. Each of the three phase certificates also incorporates extra smaller transformations that we do not show here.

to their specifications. Verification is implemented via a series of phases: programto-program translations and a final computation of a VC to be checked by an SMT solver. Our goal is to formally certify (per run of Boogie) that validity of this VC implies the correctness of the original procedure.

To keep the complexity of certificates manageable, our technical approach is *modular* in three dimensions: decomposing our formal goal per *procedure* in the Boogie program, per *phase* of the Boogie verification, and per *block* in the CFG of each procedure. This modularity makes the full automation of our certification proofs in Isabelle practical. In the following, we give a high-level overview of this modular structure; the details are presented in subsequent sections.

*Procedure Decomposition.* Boogie has no notion of a main program or an overall program execution. A Boogie program is correct if each of its procedures is individually correct (that is, the procedure body has no failing traces, as we make precise in the next section). Boogie computes a separate VC for each procedure, and we correspondingly validate the verification of each procedure separately.

*Phase Decomposition.* We break our overall validation efforts down into per-phase sub-problems. In this paper, we focus on the following three most substantial and technically-challenging of these sequential phases, illustrated in Fig. 1. (1) The *CFG-to-DAG phase* translates a (possibly-cyclic) CFG to an acyclic CFG (*cf.* Sect. 4). This phase substantially alters the CFG structure, cutting loops using annotated loop invariants to over-approximate their executions. (2) The *passification phase* eliminates imperative updates by transforming the code into static single assignment (SSA) form and then replacing assignments with *constraints* on variable versions (*cf.* Sect. 5). Both of these phases introduce extra non-determinism and **assume** statements (which, if implemented incorrectly, could make verification unsound by masking errors in the program). (3) The final *VC phase* translates the acyclic, passified CFG to a verification condition that, in addition to capturing the weakest precondition, encodes away Boogie's polymorphic type system [33].
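A toy version of passification for straight-line code may clarify the idea: each assignment becomes an **assume** constraining a fresh version of the variable, e.g. `x := x + 1; assert x > 0` becomes `assume x1 == x0 + 1; assert x1 > 0`. The function below is an illustration under that simplification, not Boogie's algorithm:

```python
def passify(commands):
    # commands: list of ("assign", var, expr_fn) or ("assume"/"assert", None,
    # expr_fn), where expr_fn renders an expression given a current-version
    # lookup. Returns passified commands over versioned variables.
    version = {}
    def cur(v):
        return f"{v}{version.get(v, 0)}"
    out = []
    for kind, var, expr_fn in commands:
        if kind == "assign":
            rhs = expr_fn(cur)                 # RHS over current versions
            version[var] = version.get(var, 0) + 1
            out.append(("assume", f"{cur(var)} == {rhs}"))
        else:
            out.append((kind, expr_fn(cur)))
    return out
```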

We construct certificates for each of these key phases separately (depicted by the blue dotted lines in Fig. 1). For each phase, we certify that *if* the target of the translation phase is correct (a correct Boogie program for the first two phases; a valid VC for the VC phase) then the source (program) of the phase is correct. This modular approach lets us focus the proof strategy for each phase on its conceptually-relevant concerns, and provides robustness against *changes* to the verifier since at most the certification of the changed phases may need adjustment. Logically, our per-phase certificates are finally glued together to guarantee the analogous end-to-end property for the entire pipeline, depicted by the green dashed edge in Fig. 1. For our certificates, we import the input and output programs (and VC) of each key phase from Boogie into Isabelle; we do not reimplement any of Boogie's phases inside Isabelle.

The certificates of the key phases also incorporate various smaller transformations between the key phases, such as peephole optimisation. Our work also validates these smaller transformations, but we focus the presentation on the key phases in this paper. Boogie also performs several smaller translation steps *prior* to the CFG-to-DAG phase. These include transforming ASTs to corresponding CFGs, optimisations such as dead variable elimination, and desugaring procedure calls using their specifications (via explicit **assert**, **assume**, and **havoc** statements). Our approach applies analogously to these initial smaller phases, but our current implementation certifies only the pipeline of all phases from the (input to the) CFG-to-DAG phase onwards. Thus, our certificate relates Boogie's VC to the original source AST program so long as these prior translation steps are correct.

*CFG Decomposition.* When tackling the certification of *each* phase, we further break down validation of a procedure's CFG in the source program of the phase into sub-problems for each block in the CFG. We prove two results for each block in the source CFG:


This decomposition separates command-level reasoning (local block lemmas) from CFG-level reasoning (global block theorems). It enables concise lemmas and proofs in Isabelle and makes each comprehensible to a human.

# 3 A Formal Semantics for Boogie

Our certificates prove that the validity of a VC generated by Boogie formally implies correctness of the Boogie CFG-to-DAG source program. This proof relies crucially on a formal semantics for Boogie itself. Our first contribution is the first such formal semantics for a significant subset of Boogie, mechanised in Isabelle. Our semantics uses the Boogie reference manual [29], the presentation of its type system [33], and the Boogie implementation for reference; none of those provide a formal account of the language. For space reasons, we explain only the key concepts of our detailed formalisation here; more details are provided in App. A of the TR [37] and the full Isabelle mechanisation is available as part of our accompanying artifact [36].

#### 3.1 The Boogie Language

Boogie programs consist of a set of top-level declarations of global variables and constants (the *global data*), axioms, uninterpreted (polymorphic) functions, type constructors, and procedures. A procedure declaration includes parameter, local-variable, and result-variable declarations (the *local data*), a pre- and postcondition, and a procedure body given as a CFG.<sup>1</sup> CFGs are formalised as usual in terms of basic blocks (containing a possibly-empty list of *basic commands*), and edges; semantically, execution after a basic block continues via any of its successors non-deterministically.

$$\begin{aligned} e &::= x \mid \mathsf{false} \mid \mathsf{true} \mid i \mid e_1 \mathbin{\mathit{bop}} e_2 \mid \mathit{uop}(e) \mid f[\vec{\tau}](\vec{e}) \mid \mathbf{old}(e) \mid {} \\ & \qquad \forall x : \tau.\; e \mid \exists x : \tau.\; e \mid \forall_{ty}\, t.\; e \mid \exists_{ty}\, t.\; e \\ \tau &::= \mathit{Int} \mid \mathit{Bool} \mid C(\vec{\tau}) \mid t \qquad c ::= \mathbf{assume}\; e \mid \mathbf{assert}\; e \mid x := e \mid \mathbf{havoc}\; x \end{aligned}$$

Fig. 2. The syntax of our formalised Boogie subset, where τ, e, and c denote the types, expressions, and basic commands, respectively; control flow is handled via CFGs over the basic commands. *bop* and *uop* denote binary and unary operations, respectively.

The types, expressions, and basic commands in our Boogie subset are shown in Fig. 2. We support the primitive types *Int* and *Bool*; types obtained via declared type constructors are *uninterpreted types*; the sets of values such types denote are constrained only via Boogie axioms and **assume** commands. Moreover, types can contain type variables (for instance, to specify polymorphic functions).

Boogie expression syntax is largely standard (e.g. including typical arithmetic and boolean operations). Old-expressions **old**(e) evaluate the expression e w.r.t. the current local data and the global data as it *was* in the pre-state of the

<sup>1</sup> Source-level procedure specifications also include *modifies clauses*, declaring a set of global variables the procedure may modify. As we tackle Boogie programs after procedure calls have been desugared, there are no modifies clauses in our formalisation.

procedure execution. Boogie expressions also include universal and existential *value* quantification (written ∀x : τ. e and ∃x : τ. e), as well as universal and existential *type* quantification (written ∀*ty* t. e and ∃*ty* t. e). In the latter, t is bound in e and quantifies over *closed* Boogie types (i.e. types that do not contain any type variables).

Basic commands form the single steps of traces through a Boogie CFG; sequential composition is implicit in the list of basic commands in a CFG basic block, and further control flow (including loops) is prescribed by CFG edges. Boogie's basic commands are assumes, asserts, assignments, and havocs; **havoc** x non-deterministically assigns a value matching the type of variable x to x.

The main Boogie features *not* supported by our subset are maps and other primitive types such as bitvectors. Boogie maps are polymorphic and impredicative, i.e. one can define maps that contain themselves in their domain. Giving a semantic model for such maps in a proof assistant such as Isabelle or Coq is non-trivial; we aim to tackle this issue in the future. Modelling bitvectors will be simpler, although maintaining full automation may require some additional work.

### 3.2 Operational Semantics

*Values and State Model.* Our formalisation embeds integer and boolean values shallowly as their Isabelle counterparts; an Isabelle carrier type for all *abstract values* (those of uninterpreted types) is a parameter of our formalisation. Each uninterpreted type is (indirectly) associated with a *non-empty* subset of abstract values via a *type interpretation* map T from abstract values to (single) types; particular interpretations of uninterpreted types can be obtained via different choices of type interpretation T .

One can understand Boogie programs in terms of the sets of possible *traces* through each procedure body. Traces are (as usual) composed of sequences of steps according to the semantics of basic commands and paths through the CFG; these can be finite or infinite (representing a non-terminating execution). A trace may halt in three cases: (1) an exit block of the procedure is reached in a state satisfying the procedure's postcondition (a *complete* trace),<sup>2</sup> (2) an **assert** A command is reached in a state not satisfying assertion A (a *failing* trace), or (3) an **assume** A command is reached in a state not satisfying A (a trace which *goes to magic* and stops). Our formalisation correspondingly includes three kinds of Boogie program states: a distinguished *failure state* F, a distinguished *magic state* M, and *normal states* N((*os*, *gs*, *ls*)). A normal state is a triple of partial mappings from variables to values for the old global state (for the evaluation of old-expressions), the (current) global state, and the local state, respectively.
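The interplay between the four basic commands and the three kinds of program states can be made concrete in a small executable sketch. The following is a toy Python model of our own (not the Isabelle formalisation): normal states are dictionaries, the failure and magic states are the tags `"F"` and `"M"`, and **havoc** ranges over a small finite domain so that the resulting traces can be enumerated.

```python
# Toy model of Boogie's basic commands over the three kinds of program
# states (normal / failure / magic); a simplified sketch, not the paper's
# formal semantics.
FAILURE, MAGIC = "F", "M"   # distinguished failure and magic states

def run_cmd(cmd, ns):
    """Execute one basic command in normal state `ns` (a dict).
    Returns the list of possible successor states; havoc branches
    non-deterministically (here over a tiny finite domain)."""
    kind = cmd[0]
    if kind == "assume":            # ("assume", pred): failing pred -> magic
        return [ns if cmd[1](ns) else MAGIC]
    if kind == "assert":            # ("assert", pred): failing pred -> failure
        return [ns if cmd[1](ns) else FAILURE]
    if kind == "assign":            # ("assign", x, expr)
        return [{**ns, cmd[1]: cmd[2](ns)}]
    if kind == "havoc":             # ("havoc", x): any value of x's type
        return [{**ns, cmd[1]: v} for v in range(-2, 3)]
    raise ValueError(kind)

def run_block(cmds, ns):
    """All traces through one block's command list; traces that reached the
    failure or magic state halt and are carried through unchanged."""
    states = [ns]
    for cmd in cmds:
        nxt = []
        for s in states:
            nxt.extend(run_cmd(cmd, s) if isinstance(s, dict) else [s])
        states = nxt
    return states
```

For instance, `run_block([("assume", lambda s: s["i"] != 0)], {"i": 0})` yields only the magic state, mirroring a trace that *goes to magic*.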

*Expression Evaluation.* An expression e evaluates to value v if the (big-step) judgement T, Λ, Γ, Ω ⊢ ⟨e, N(*ns*)⟩ ⇓ v holds in the context (T, Λ, Γ, Ω). Here, T

<sup>2</sup> The case of the postcondition *not* holding is subsumed under point (2), since Boogie checks postconditions by generating extra **assert** statements.

Fig. 3. Running example in source code and CFG representation, respectively.

is a *type interpretation* (as above), Λ is a *variable context*: a pair (G, L) of type declarations for the global (G) and local (L) data. Γ is a *function interpretation*, which maps each function name to a semantic function mapping a list of types and a list of values to a return value. The type substitution Ω maps type variables to types.

The rules defining this judgement can be found in App. A.2 of the TR [37]. For example, the following rule expresses when a universal type quantification evaluates to **true** (t is bound to the quantified type and may occur in e):

$$\frac{\forall \tau. \ closed(\tau) \Longrightarrow \mathcal{T}, \Lambda, \Gamma, \Omega(t \mapsto \tau) \vdash \langle e, ns \rangle \Downarrow \mathtt{true}}{\mathcal{T}, \Lambda, \Gamma, \Omega \vdash \langle \forall\_{ty}\, t. \, e, ns \rangle \Downarrow \mathtt{true}}$$

The premise requires one to show that the expression e reduces to **true** for every possible type τ that is closed. In general, expression evaluation is possible only for well-typed expressions; we also formalise Boogie's type system and (for the first time) prove its type safety for expressions in Isabelle.

*Command and CFG Reduction.* The (big-step) judgement T, Λ, Γ, Ω ⊢ ⟨c, s⟩ → s′ defines when a command c reduces in state s to state s′; the rules are in App. A.3 of the TR [37]. This reduction is lifted to lists of commands cs to model the semantics of a single trace through a CFG block (the judgement T, Λ, Γ, Ω ⊢ ⟨cs, s⟩ [→] s′). The operational semantics of CFGs is modelled by the (small-step) judgement T, Λ, Γ, Ω, G ⊢ δ →<sub>CFG</sub> δ′, expressing that the CFG configuration δ reduces to configuration δ′ in the CFG G. A CFG configuration is either *active* or *final*. An active configuration is a tuple (inl(b<sub>n</sub>), s), where b<sub>n</sub> is the block identifier indicating the current position of the execution and s is the current state. A final configuration is a tuple (inr(()), s) for state s (and unit value ()); it is reached at the end of a block that either has no successors, or whose execution ends in a magic or failure state.
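The configuration-based reduction can likewise be sketched executably. In this toy Python model (again our own simplification, not the Isabelle definitions), each block is abstracted to a function from a state to its possible post-states, and the tags `("inl", b)` and `("inr", ())` mirror the inl/inr constructors of active and final configurations.

```python
# Sketch of the small-step CFG reduction over configurations.
def step(cfg, conf):
    """One reduction step: `cfg` maps block ids to (block_fun, successors),
    where block_fun runs the block's commands on a state. Returns the list
    of possible successor configurations; final configurations do not reduce."""
    (tag, b), s = conf
    if tag == "inr":
        return []
    fun, succs = cfg[b]
    out = []
    for s2 in fun(s):
        # failure/magic states and blocks without successors end the trace
        if s2 in ("F", "M") or not succs:
            out.append((("inr", ()), s2))
        else:
            out.extend((("inl", b2), s2) for b2 in succs)
    return out

def finals(cfg, entry, s0):
    """All final configurations reachable from the entry block (exhaustive
    enumeration; assumes finitely many, finite traces)."""
    todo, done = [(("inl", entry), s0)], []
    while todo:
        c = todo.pop()
        nxt = step(cfg, c)
        if not nxt:
            done.append(c)
        todo.extend(nxt)
    return done
```

A two-way branch where one successor goes to magic and the other assigns a variable then yields exactly two final configurations, one per path.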

Fig. 4. The CFG-to-DAG phase applied to the running example (source on the left, target on the right). The back-edge (the red edge from B<sub>5</sub> to B<sub>1</sub> in the left CFG) is eliminated. The blue commands are new. A is given by j >= 0 ∧ (i = 0 ⇒ j > 0).

#### 3.3 Correctness

A procedure is *correct* if it has *no failing traces*. This is a *partial correctness* semantics; a procedure body whose traces never leave a loop is trivially correct, provided that no intermediate **assert** commands fail. Procedure correctness relies on CFG correctness. A CFG G is correct w.r.t. a postcondition Q and a context (T, Λ, Γ, Ω) in an initial normal state N(*ns*) if the following holds for all configurations (r, s′):

$$\begin{aligned} &\mathcal{T}, \Lambda, \Gamma, \Omega, G \vdash (\mathsf{inl}(\mathsf{entry}(G)), \mathsf{N}(ns)) \to^{\*}\_{\mathsf{CFG}} (r, s') \Longrightarrow s' \neq \mathsf{F} \;\land \\ &\quad (r = \mathsf{inr}(()) \Longrightarrow (\forall ns'.\; s' = \mathsf{N}(ns') \Longrightarrow \mathcal{T}, \Lambda, \Gamma, \Omega \vdash \langle Q, \mathsf{N}(ns') \rangle \Downarrow \mathtt{true})) \end{aligned}$$

where entry(G) is the entry block of G and →<sup>∗</sup><sub>CFG</sub> is the reflexive-transitive closure of the CFG reduction. The postcondition is needed only if a final configuration is reached in a normal state, while failing states must be unreachable. Whenever we omit Q, we implicitly mean the postcondition to be simply **true**. In our tool, we consider only empty initial mappings Ω, since we do not support procedure type parameters (lifting our work to this feature will be straightforward).

For a procedure p to be correct w.r.t. a context, its body CFG must be correct w.r.t. the same context and p's postcondition, *for all* initial normal states N(ns) that satisfy p's precondition and that respect the context. For *ns* to *respect* a context, it must be well-typed and must satisfy the axioms when restricted to its constants. We say that p is *correct* if it is correct w.r.t. *all well-formed contexts*, which must have a well-typed function interpretation and a type interpretation that inhabits every uninterpreted closed type (and only those).

*Running Example.* We will use the simple CFG of Fig. 3 as a running example, intended as the body of a procedure with trivial (**true**) pre- and postconditions. The code includes a simple loop with a declared loop invariant, which functions as a classical Floyd/Hoare-style inductive invariant and can, for the moment, be considered an implicit **assert** statement at the loop head. The CFG has infinite traces: those which start from any state in which i is negative. Traces starting from a state in which i is zero go to magic; they do not reach the loop. The program is correct (has no failing traces): all other initial states result in traces that satisfy the loop invariant and the final **assert** statement. If we removed the initial **assume** statement, however, there *would* be failing traces: the loop invariant check would fail if i were initially zero.

# 4 The CFG-to-DAG Phase

In this section, we present the validation for the CFG-to-DAG phase in the Boogie verifier. This phase is challenging as it changes the CFG structure, inserts additional non-deterministic assignments and **assume** statements, and must do so correctly for arbitrary (reducible) nested loop structures, which can include unstructured control flow (e.g. jumps out of loops).

#### 4.1 CFG-to-DAG Phase Overview

The CFG-to-DAG phase applies to every *loop head* block identified by Boogie's implementation and to any *back-edges* from a block reachable from the loop head back to the loop head (following standard definitions for reducible CFGs [21]). Figure 4 illustrates the phase's effect on our running example. Block B<sub>1</sub> is the (only) loop head here, and the edge from B<sub>5</sub> to it is the only back-edge (completing looping paths via B<sub>2</sub> and B<sub>3</sub> or B<sub>2</sub> and B<sub>4</sub>). An **assert** A statement starting a loop head (like B<sub>1</sub>) is interpreted as declaring A to be the loop invariant.<sup>3</sup> The CFG-to-DAG phase performs the following steps:

1. Compute the set X<sub>H</sub> of variables that are modified on some looping path from the loop head back to itself.
2. Append **assert** A to the end of every block that is the source of a back-edge to the loop head.
3. At the loop head, replace the leading **assert** A by **havoc** x commands for each x ∈ X<sub>H</sub>, followed by **assume** A.
4. Remove all back-edges; if the source block of a back-edge is thereby left without successors, additionally append **assume false** to it.<sup>4</sup>
The havoc-then-assume sequence introduced in step 3 can be understood as generating traces for *arbitrary values of* X<sub>H</sub> satisfying the loop invariant A,

<sup>3</sup> In general, multiple asserts at the beginning of a loop head may form the invariant.

<sup>4</sup> Omitting **assume false** if there are no successors would be incomplete, since otherwise the postcondition would have to be satisfied.

effectively over-approximating the set of states reachable at the loop head in the original program. In particular, the remnants of any originally looping path (e.g. B<sub>1</sub>, B<sub>2</sub>, B<sub>3</sub>, B<sub>5</sub>) enforce that any non-failing trace starting from any such state must (due to the **assert** added to block B<sub>5</sub> in step 2) result in a state which re-establishes the loop invariant. Such paths exist only to enforce this inductive step (analogously to the premise of a Hoare logic while rule); so long as the **assert** succeeds, we can discard these traces via step 4.
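The effect of this phase on a single loop can be sketched in a few lines. The following is a hypothetical Python model of our own (Boogie's implementation works on its internal CFG representation): a CFG maps block identifiers to pairs of command lists and successor lists, commands are tagged tuples, and X<sub>H</sub> is passed in as `modified`.

```python
def cfg_to_dag(cfg, loop_head, back_edges, invariant, modified):
    """Sketch of CFG-to-DAG loop elimination for one loop.
    `back_edges` lists the source blocks of back-edges to `loop_head`;
    `modified` plays the role of X_H (step 1, computed by the caller).
    Assumes the loop head starts with `("assert", invariant)`."""
    cfg = {b: (list(cs), list(ss)) for b, (cs, ss) in cfg.items()}
    cs, ss = cfg[loop_head]
    # step 3: replace the leading assert by havocs of X_H plus assume A
    head = [("havoc", x) for x in modified] + [("assume", invariant)]
    cfg[loop_head] = (head + cs[1:], ss)
    for b in back_edges:
        cs_b, ss_b = cfg[b]
        cs_b = cs_b + [("assert", invariant)]          # step 2: check invariant
        ss_b = [s for s in ss_b if s != loop_head]     # step 4: cut back-edge
        if not ss_b:                                   # step 4: block the trace
            cs_b = cs_b + [("assume", ("lit", False))]
        cfg[b] = (cs_b, ss_b)
    return cfg
```

On a three-block loop this reproduces the shape of Fig. 4: havocs and an **assume** at the former loop head, and an **assert** plus **assume false** at the former back-edge source.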

While we illustrate this step on a simple CFG, in general a loop head may have multiple back-edges, looping structures may nest, and edges may exit multiple loops. For the above translation to be correct, the CFG must be reducible, and loop heads and their corresponding back-edges must be identified accurately, which is complex in general. Importantly (but perhaps surprisingly), our work makes this phase of Boogie certifying *without* explicitly verifying (or even defining) these notions.

# 4.2 CFG-to-DAG Certification: Local Block Lemmas

We first define our local block lemmas for this phase. Recall that these prove that if executing the statements of a target block yields no failing executions, the same holds for the corresponding source block; this result is trivial for source blocks other than loop heads and their immediate predecessors, since these are unchanged by this phase. To enable the eventual composition of our block lemmas, we also need to reflect the role of the **assume** and **assert** statements employed in this phase. The formal statement of our local block lemmas is as follows<sup>5</sup>:

Theorem 1 (CFG-to-DAG Local Block Lemma). *Let* B *be a source block with commands* cs<sub>S</sub>*, whose corresponding target block has commands* cs<sub>T</sub>*. If* B *is a loop head, let* X<sub>H</sub> *be as defined in CFG-to-DAG step 1 (and empty otherwise) and let* A<sub>pre</sub> *be its loop invariant (or* true *otherwise). If* B *is a* predecessor *of a loop head, let* A<sub>post</sub> *be the loop invariant of its successor (and* true *otherwise). Then, if:*

*1.* T, Λ, Γ, Ω ⊢ ⟨cs<sub>S</sub>, N(ns<sub>1</sub>)⟩ [→] s′<sub>1</sub>*, where* ns<sub>1</sub> *and* ns<sub>2</sub> *differ at most on* X<sub>H</sub> *and* A<sub>pre</sub> *holds in* N(ns<sub>2</sub>)*, and 2. every execution of* cs<sub>T</sub> *starting in* N(ns<sub>2</sub>) *does not fail,*

then*:* s′<sub>1</sub> ≠ *F and if* s′<sub>1</sub> *is a normal state, then (1)* A<sub>post</sub> *is satisfied in* s′<sub>1</sub>*, and (2) if no* **assume false** *was added at the end of* cs<sub>T</sub>*, then there is a target execution of* cs<sub>T</sub> *from N*(ns<sub>2</sub>) *that reaches a normal state that differs from* s′<sub>1</sub> *only on variables not defined in* Λ*.*

The gist of this lemma is to capture *locally* the ideas behind the four steps of the phase. For example, consequence (1) reflects that *after* the transformation, any blocks that *were previously* predecessors of a loop head (B<sub>0</sub> and B<sub>5</sub> in our running example) will have an **assert** statement checking the corresponding invariant (and so, if the target program has no failing traces, this invariant will be true at that point in each trace).

<sup>5</sup> We omit some details regarding well-typedness, handled fully in our formalisation.

Fig. 5. The passification phase applied to the branch in the running example, with the result on the right. The final (green) commands in the target blocks B<sub>3</sub> and B<sub>4</sub> are the synchronisation commands. At the uppermost blocks shown here, the current versions of i and j are i1 and j2, respectively. The full CFGs are shown in App. B of the TR [37].

#### 4.3 CFG-to-DAG Certification: Global Block Theorems

We lift our certification to *all* traces through the source and target CFGs; the statement of the corresponding global block theorems is similar to that of the local block lemmas, lifted to CFG executions. For space reasons we do not present it here, but it is included in our Isabelle formalisation. In particular, we prove for each block (working in reverse topological order through the target CFG blocks) that if executions starting in the target CFG block never fail, neither do any executions starting from the corresponding source CFG block, and that looping paths modify at most the variables havoced according to step 3 of the phase.

The major challenge in these proofs is reasoning about looping paths in the source CFG, since these revisit blocks. To solve this challenge, we perform inductive arguments per loop head in terms of the number of steps remaining in the trace in question.<sup>6</sup> Our global block theorem for a block B then carries as an assumption an induction hypothesis for each loop that contains B. Proving a global block theorem for the origin of a back-edge is taken care of by applying the corresponding induction hypothesis.

This proof strategy works only if we have obtained the induction hypothesis for the loop head *before* we use the global block theorem of the origin of a back-edge (otherwise we cannot discharge the block theorem's hypothesis). In other words, our proof implicitly shows the necessary requirement that loop heads (as identified by Boogie) dominate all back-edges reaching them *without us formalising any notion of domination, CFG reducibility, or any other advanced graph-theoretic concept*. This shows a major benefit of our validation approach over a once-and-for-all verification of Boogie itself: our proofs indirectly check that the identification of loop heads and back-edges guarantees the necessary *semantic properties* without being concerned with *how* Boogie's implementation computes this information.

<sup>6</sup> This may seem insufficient since traces can be infinite, but importantly a *failing* trace is always finite, and our theorems need only eliminate the chance of failing traces.

Our approach applies equally to nested loops and, more generally, to reducible CFG structures; *all* corresponding induction hypotheses are carried through from the visited loop heads. The requirement that no more than the havoced variables X<sub>H</sub> are modified in the source program is easily handled by showing that the variables modified in an inner loop are a subset of those of the outer loops. As for all of our results, our global block theorems are proven automatically in Isabelle per Boogie procedure, providing per-run certificates for this phase.

# 5 The Passification Phase

In this section, we describe the validation of the passification phase in the Boogie verifier. Unlike the previous phase, passification makes no changes to the CFG structure, but it substantially changes the program states (via SSA-like renamings), substantially increases non-determinism, and employs **assume** statements to re-tame the sets of possible traces.

### 5.1 Passification Phase Overview

The main goal of passification is to eliminate assignments so that a more efficient VC can ultimately be generated [6,18,30]. In the Boogie verifier, this is implemented as a single transformation phase that can be thought of as two independent steps. First, the source CFG is transformed into *static single assignment* (SSA) form, introducing *versions* (fresh variables) for each original program variable such that each version is assigned at most once in any program trace. In a second step, variable assignments are *completely eliminated*: each assignment command x := e is replaced by **assume** x = e. Havoc statements are simply removed; their effect is implicit in the fact that a new variable version is used (via the SSA step) *after* such a statement.
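Both steps can be sketched for a single straight-line block. This is an illustrative Python model of our own, just enough to show the renaming: variables map to version counters, `x := e` becomes an **assume** on a fresh version, and **havoc** merely bumps the version.

```python
def passify(cmds, versions=None):
    """Sketch of passification for one block: SSA-rename variables,
    turn assignments into assumes, and drop havocs. Expressions are
    modelled as lists of variable names; a version is a (name, n) pair,
    e.g. ("j", 2) standing for j2."""
    versions = dict(versions or {})
    out = []

    def cur(x):
        # current version of x (version 0 if never assigned yet)
        return (x, versions.get(x, 0))

    for cmd in cmds:
        if cmd[0] == "assign":               # ("assign", x, rhs_vars)
            rhs = [cur(v) for v in cmd[2]]   # rename rhs *before* bumping x
            versions[cmd[1]] = versions.get(cmd[1], 0) + 1
            out.append(("assume", ("eq", cur(cmd[1]), rhs)))
        elif cmd[0] == "havoc":              # ("havoc", x): fresh, unconstrained
            versions[cmd[1]] = versions.get(cmd[1], 0) + 1
        else:                                # assume/assert: rename only
            out.append((cmd[0], [cur(v) for v in cmd[1]]))
    return out, versions
```

Starting from versions i1, j2 (as at the top of Fig. 5), the assignment j := j+1 becomes **assume** j3 = j2 + 1, matching the transformation shown there.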

Figure 5 shows the effect of this phase on four blocks of our running example (the full target CFG is shown in App. B of the TR [37]). The commands inserted just before the join block (here, B<sub>5</sub>) introduce a consistent variable version (here, j4) for use in the join block. It is convenient to speak of target variables in terms of their source program counterparts: we say, e.g., that j *has version 4* on entry to block B<sub>5</sub>.

Compared to traces through the source program, the space of variable values in a trace through the target program is initially much larger; each version may, on entry to the CFG, have an arbitrary value. For example, j4 may have any value on entry to B<sub>2</sub>; traces in which its value does not satisfy the constraint of the **assume** statements in B<sub>3</sub> or B<sub>4</sub> will go to magic and not reach B<sub>5</sub>. Importantly, however, not *all* traces go to magic; enough are preserved to simulate the executions of the original program: each **assume** statement constrains the value of exactly one variable version, and the same version is never constrained more than once. Capturing this delicate argument formally is the main challenge in certifying this step.

As extra parts of the passification phase, the Boogie verifier performs constant propagation and desugars old-expressions (using variable versions appropriate to the entry point of the CFG). We omit their descriptions here for brevity, but our implementation certifies them.

#### 5.2 Passification Certification: Local Block Lemmas

To validate the passification phase, it is sufficient to show that each source execution is simulated by a corresponding target execution, made precise by constructing a relation between the states in these executions. Such *forward simulation* arguments are standard for proving correctness of compilers for deterministic languages. However, the situation here is more complex due to the fact that the target CFG has a much wider space of traces: the values of each versioned variable in the target program are initially unconstrained, meaning traces exist for all of their combinations. On the other hand, many of these traces do not survive the **assume** statements encountered in the target program. Picking the correct *single* trace or state to simulate a particular source execution would require knowledge of all variable assignments that are *going* to happen, which is not possible due to non-determinism and would preclude the block-modular proof strategies that our validation approach employs.

Instead, we generalise this idea to relating each single source state s with a *set* T of corresponding target program states. We define variable relations V<sub>R</sub> at each point in a trace, making explicit the mappings used in the SSA step between source program variables and their corresponding versions. For example, on entry to block B<sub>2</sub> in the source version of our running example (and correspondingly in the target), the V<sub>R</sub> relation relates i to i1 and j to j2. All states t ∈ T must precisely agree with s w.r.t. V<sub>R</sub> (e.g., s(i) = t(i1), s(j) = t(j2)). On the other hand, our sets of states T are defined to be completely unconstrained (besides typing) for *future* variable versions. For example, for every t ∈ T at the same point in our example, there will be states in T assigning each possible value (of the same type) to i2 (and otherwise agreeing with t).

More precisely, for a set of variables X, we say that a set of states T *constrains at most* X *w.r.t. variable context* Λ if, for every t ∈ T, every variable z ∉ X that is in Λ, and every value v of z's type, we have t[z ↦ v] ∈ T. In other words, the set T is closed under arbitrary changes to the values of all variables in Λ but *not* in X. We construct our sets T such that they constrain at most *current and past versions* of program variables. It is this fact that enables us to handle subsequent **assume** statements in the target program and, in particular, to show that the set of possible traces in the target program never becomes empty while there are possible traces in the source program. For example, when relating the source command j := j+1 in B<sub>3</sub> with the corresponding target command **assume** j3 = j2 + 1, we use the fact that our set of states does not constrain j3 to prove that, although many traces go to magic at this point, for a non-empty subset T′ ⊆ T (those states in which j3 has the "right" value equal to j2 + 1), execution continues in the target.
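The closure property "T constrains at most X" can be checked directly over a finite domain, which is a convenient way to see what it demands. The sketch below is our own illustration (states as frozensets of variable/value pairs, a small explicit domain), not the Isabelle definition.

```python
def constrains_at_most(T, X, variables, domain):
    """Check the closure property: T (a set of states, each a frozenset of
    (var, val) pairs) must be closed under arbitrary updates to every
    variable of the context that is NOT in X, over the finite `domain`."""
    def upd(t, z, v):
        # state update t[z -> v]
        return frozenset({(x, val) for (x, val) in t if x != z} | {(z, v)})
    return all(upd(t, z, v) in T
               for t in T
               for z in variables if z not in X
               for v in domain)
```

For instance, a set that pins a to 0 but lets b range over the whole domain constrains at most {a}; the same set does not constrain at most ∅, because changing a leaves the set.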

We now make these notions more precise by showing the definition of our local block lemmas for the passification phase (See footnote 5).

Theorem 2 (Passification Local Block Lemma). *Let* B *be a source block with commands* cs<sub>1</sub>*, whose corresponding target block has commands* cs<sub>2</sub>*; let* V<sub>R</sub> *and* V′<sub>R</sub> *be the variable relations at the beginning and end of* B*, respectively. Let* X *be a set of variable versions, and N*(*ns*) *be a normal state. Let* T *be a non-empty set of normal states such that N*(*ns*) *agrees with* T *according to* V<sub>R</sub>*, and* T *constrains at most* X *w.r.t.* Λ<sub>2</sub>*. Furthermore, let* Y *be the variable versions corresponding to the targets of assignment and havoc statements in* cs<sub>1</sub>*. If both*

*1.* T, Λ<sub>1</sub>, Γ, Ω ⊢ ⟨cs<sub>1</sub>, *N*(*ns*)⟩ [→] s′ ∧ s′ ≠ *M* &nbsp;&nbsp; *2.* X ∩ Y = ∅

*then there exists a non-empty set of normal states* T′ ⊆ T *s.t.* T′ *constrains at most* X ∪ Y *w.r.t.* Λ<sub>2</sub> *and for each* t′ ∈ T′*, there exists a state* t<sup>∗</sup> *s.t.*

*1.* T, Λ<sub>2</sub>, Γ, Ω ⊢ ⟨cs<sub>2</sub>, t′⟩ [→] t<sup>∗</sup> ∧ (s′ = *F* =⇒ t<sup>∗</sup> = *F*) &nbsp;&nbsp; *2. If* s′ *is a normal state, then* s′ *and* t′ *are related w.r.t.* V′<sub>R</sub> *(and* t<sup>∗</sup> = t′*).*

This lemma captures our generalised notion of forward simulation appropriately. The first conclusion expresses that the target does not get stuck and that failures are preserved, while the second shows that if the source execution neither fails nor stops then the resulting states are related. Note that premise 2 is essential in the proof to guarantee that the **assume** statements introduced by passification do not eliminate the chance to simulate source executions; the condition expresses that the variable versions newly constrained do not intersect with those previously constrained. To prove these lemmas over the commands in a single block, we are forced to check that the same version is not constrained twice.

### 5.3 Passification Certification: Global Block Theorems

As for all phases, we lift our local block lemmas to theorems certifying all executions *starting* from a particular block, and thus, ultimately, to entire CFGs. For the passification phase, most of the conceptual challenges are analogous to those of the local block lemmas; we similarly employ V<sub>R</sub> relations between source variables and their corresponding target versions. To connect with our local block lemmas (and build up our global block theorems, which we do backwards through the CFG structure), we repeatedly require the key property that the set of variable versions constrained in our executions so far is disjoint from those which may be constrained by a subsequent **assume** statement (*cf.* premise 2 of our local block lemma above). Concretely tracking and checking disjointness of these sets of variables is simple, but turns out to be expensive in Isabelle when the sets are large.

We circumvent this issue with our own *global versioning scheme* (as opposed to the versions used by Boogie, which are *independent* for different source variables): according to the CFG structure, we assign a *global* version number ver<sub>G</sub>(x) to each variable x in the target program such that, if x is constrained in a target block B and y is constrained in another target block B′ reachable from B, then ver<sub>G</sub>(x) < ver<sub>G</sub>(y). Such a consistent global versioning always exists in the target programs generated by Boogie because the only variables not constrained exactly once *in the program* are those used to synchronise executions (e.g. j4 in Fig. 5), which always appear right before branches are merged. We can now encode our disjointness properties much more cheaply: we simply compare the *maximal* global version of all already-constrained variables with the *minimal* global version of those (potentially) to be constrained. Since we represent variables as integers in the mechanisation, we directly use the global version *as* the variable name for the target program; there is no need for an extra lookup table. Note that (readability aside) it makes no difference which variable names are used in intermediate CFGs; we ultimately care only about validating the original CFG.
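The max/min comparison replacing the element-wise disjointness check can be sketched in a few lines (an illustrative Python model; the mechanisation uses integer variable names directly as global versions):

```python
def disjoint_by_version(ver_g, constrained_so_far, to_be_constrained):
    """Cheap disjointness check enabled by global versioning: instead of
    intersecting the two sets, compare the maximal global version already
    constrained with the minimal global version still to be constrained.
    Sound because ver_g increases along reachability in the target CFG."""
    if not constrained_so_far or not to_be_constrained:
        return True
    return (max(ver_g[x] for x in constrained_so_far)
            < min(ver_g[y] for y in to_be_constrained))
```

If the max/min comparison succeeds, the two sets cannot share an element, since every element of the first set has a strictly smaller global version than every element of the second.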

# 6 The VC Phase

In this section, we present the validation of the VC phase in the Boogie verifier. This phase has two main aspects: (1) it encodes and desugars all aspects of the Boogie type system, employing additional uninterpreted functions and axioms to express its properties [33]; program expression elements such as Boogie functions are analogously desugared in terms of these additional uninterpreted functions, creating a non-trivial logical gap between expressions as represented in the VC and those from the input program. (2) It performs an efficient (block-by-block) calculation of a weakest precondition for the (acyclic, passified) CFG, resulting in a formula characterising its verification requirements, subject to background axioms and other hypotheses.

#### 6.1 VC Structure

The generated VC has the following overall structure (represented as a shallow embedding in our certificates)<sup>7</sup>:

$$\forall\ \underbrace{\ \ldots\ }\_{\substack{\textit{VC quantifiers:}\\ \text{type encoding parameters,}\\ \text{functions, variable values}}}.\ \Big(\ \underbrace{\ \ldots\ }\_{\substack{\textit{VC assumptions:}\\ \text{type encoding, typing,}\\ \text{program axioms}}} \Longrightarrow \mathit{CFG\ WP}\ \Big)$$

The VC quantifies over parameters required for the type encoding, as well as VC counterparts representing the variable values and functions in the Boogie program. The VC body is an implication, whose premise contains: (1) assumptions that axiomatise the type encoding parameters, (2) axioms expressing the typing of Boogie variables and functions, and (3) assumptions directly relating

<sup>7</sup> Note that top-level quantification over functions is implicit in the (first-order) SMT problem generated by Boogie; we quantify explicitly in our Isabelle representation.

to axioms explicitly declared in the Boogie program. The conclusion of the implication is an optimised version of the weakest (liberal) precondition (WP) of the CFG.<sup>8</sup>

### 6.2 Boogie's Logical Encoding of the Boogie Type System

We first briefly explain Boogie's logical encoding of its own type system. Values and types are represented at the VC level by two uninterpreted carrier sorts V and T. An uninterpreted function *typ* from V to T maps each value to the representation of its type. Boogie type constructors are each modelled with an (injective) uninterpreted function C with return sort T, taking one argument of sort T per constructor parameter. For example, a type constructor *List*(t) is represented by a VC function from T to T. Projection functions C<sup>π</sup><sub>i</sub> (one per type argument position i) are also generated for each type constructor, e.g. mapping the representation of a type *List*(t) to the representation of type t.

This encoding is then used in the VC to recover Boogie typing constraints for the untyped VC terms. Recovering the constraints is not always straightforward due to optimisations performed by Boogie. For example, the VC translation of the Boogie expression ∀*ty* t. ∀x : List(t). e no longer quantifies over types; all original occurrences of t in e have been translated to *List*<sup>π</sup><sub>1</sub>(*typ*(x)). This optimisation reflects that this particular type quantification is redundant, since t can be recovered from the type of x.<sup>9</sup>
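The constructor/projection pattern can be illustrated with a toy model (our own Python sketch: values carry their type representation, `typ` reads it off, and the projection inverts the constructor, which is the inverse property the instantiation in Sect. 6.3 must satisfy):

```python
# Toy model of the VC-level type encoding for a constructor List(t).
def typ(value):
    """VC function typ : V -> T; a value is modelled as (type_rep, payload)."""
    return value[0]

def List(t):
    """Type constructor C : T -> T, modelled injectively as a tagged tuple."""
    return ("List", t)

def List_proj_1(t_rep):
    """Projection C^pi_1: recovers the type argument of a constructed type."""
    assert t_rep[0] == "List", "projection applied to a non-List type rep"
    return t_rep[1]
```

With these definitions, List(List_proj_1(typ(x))) = typ(x) holds exactly for values x of some List type, which is the guard used in footnote 9.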

# 6.3 Working from VC Validity

Our certificates assume that the generated VC is valid (certifying the validity checking of the VC by an SMT solver is an orthogonal concern). However, connecting VC validity back to block-level properties about the specific program requires a number of technical steps. We need to construct Isabelle-level semantic values to *instantiate* the top-level quantifiers in the VC such that the corresponding VC assumptions (left-hand side of the VC) can be proved and, thus, validity of the corresponding WP can be deduced. Moreover, we must ensure that our instantiation yields a WP whose validity implies correctness of the Boogie program. For example, a top-level VC quantifier modelling a Boogie function f must be instantiated with a mathematical function that behaves in the same way as f for arguments of the correct type.

We instantiate the carrier sort V for values in the VC with the corresponding type denoting Boogie values in our formalisation; the carrier sort T for *types* is instantiated to be all Boogie types that do not contain free variables (i.e. closed types). Constructing explicit models for the quantified functions used to

<sup>8</sup> One difference in our version of the Boogie verifier is that we switched off the generation of extra variables introduced to report error traces [32]; these are redundant for programs that do not fail and further complicate the VC structure.

<sup>9</sup> Note that in the VC the quantification over x ranges over all values of sort V. An implication is used to consider only those x for which *typ*(x) = *List*(*List*<sup>π</sup><sub>1</sub>(*typ*(x))).

model Boogie's type system (satisfying, e.g., suitable inverse properties for the projection functions) is straightforward. For the VC-level variable values, we can directly instantiate the corresponding values in the initial Boogie program state.

VC-level functions representing those declared in the Boogie program are instantiated as (total) functions which, *for input values of appropriate type* (the arguments and output are untyped values of sort V ), are defined simply to return the same values as the corresponding function in our model. However, perhaps surprisingly, Boogie's VC embedding of functions logically requires functions to return values of the specified return type even if the input values do not have the types specified by the function. In such cases, we define the instantiated function to return some value of the specified type, which is possible since in well-formed contexts every closed type has at least one value in our model.
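The totalisation described above can be pictured with a small, hypothetical Python sketch (the names `totalize`, `typeof`, and `default` are ours, purely for illustration; they are not part of Boogie or of our Isabelle formalisation):

```python
def totalize(f, arg_tys, ret_ty, typeof, default):
    """Wrap a model function f into a total function on untyped values:
    on well-typed arguments it agrees with f; otherwise it returns some
    fixed value of the declared return type (which exists, since every
    closed type is inhabited in the model)."""
    def g(*args):
        if len(args) == len(arg_tys) and \
           all(typeof(a) == t for a, t in zip(args, arg_tys)):
            return f(*args)
        return default(ret_ty)
    return g

# Toy model: untyped values are tagged pairs (type_name, payload).
typeof = lambda v: v[0]
default = lambda ty: (ty, 0 if ty == "int" else False)

# A Boogie-like function int -> int, modelled on well-typed inputs only.
succ = totalize(lambda v: ("int", v[1] + 1), ["int"], "int", typeof, default)
```

On a well-typed argument, `succ` behaves like the model function; on an ill-typed one, it still returns a value of the declared return type, mirroring what the VC embedding logically requires.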

After our instantiation, we need to prove the hypotheses of the VC's implication; in particular that all axioms (both those generated by the type system encoding and those coming from the program itself) are satisfied. The former are standard and simple to prove (given the work above), while the latter largely follow from the assumption that each declared axiom must be satisfied in the initial state restricted to the constants. The only remaining challenge is to relate VC expressions with the evaluation of corresponding Boogie expressions; an issue which also arises (and is explained) below, where we show how to connect validity of the instantiated WP to the program.

#### 6.4 Certifying the VC Phase

Boogie's weakest precondition calculation is made size-efficient by using explicit named constants for the weakest preconditions wp(B, **true**) of each block B, each of which is defined in terms of the named constants for its successor blocks. For example, in Fig. 5, wp(B<sub>2</sub>, **true**) is given by i<sub>1</sub><sup>vc</sup> = 0 =⇒ wp(B<sub>3</sub>, **true**) ∧ wp(B<sub>4</sub>, **true**). Here i<sub>1</sub><sup>vc</sup> is the value that we instantiated for the variable i<sub>1</sub>.
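As a rough illustration (our own sketch, with formulas represented as plain strings and block names modelled on Fig. 5, not Boogie's actual implementation), one named constant per block keeps the number of wp definitions linear in the size of the CFG:

```python
def block_wp_defs(blocks_rev_topo, successors, body_wp):
    """One named constant per block; each block's definition uses the
    conjunction of its successors' named constants as postcondition."""
    defs = {}
    for b in blocks_rev_topo:  # blocks in reverse topological order
        post = " && ".join(f"wp_{s}" for s in successors[b]) or "true"
        defs[f"wp_{b}"] = body_wp(b, post)
    return defs

# Shape of the Fig. 5 example: B2 branches to B3 and B4.
succs = {"B4": [], "B3": [], "B2": ["B3", "B4"]}
body = {"B2": "i1_vc == 0 ==> ({post})",
        "B3": "({post})", "B4": "({post})"}
defs = block_wp_defs(["B4", "B3", "B2"], succs,
                     lambda b, post: body[b].format(post=post))
```

Each block's weakest precondition is written once and shared by name, instead of being duplicated at every predecessor.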

We exploit this modular construction of the generated weakest precondition for the local and global block theorems. We prove for each block B with commands cs the following local block lemma:

#### Theorem 3 (VC Phase Local Block Lemma).

*If* A, Λ, Γ, Ω ⊢ ⟨*cs*, *N*(*ns*)⟩ [→] s *and* wp(B, *true*) *holds, then* s ≠ *F and, if* s *is a normal state, then* ∀B<sub>suc</sub> ∈ *successors*(B). wp(B<sub>suc</sub>, *true*)*.*

Once one has proved this lemma for all blocks in the CFG, combining them to obtain the corresponding global block theorems (via our usual reverse walk of the CFG) is straightforward. The main challenge is in decomposing the proof for the local block lemma itself for a block B, for which we outline our approach next.

By this phase, the first command in B must be either an **assume** e or an **assert** e command. In the former case, we rewrite wp(B, **true**) into the form e<sub>vc</sub> =⇒ H, where e<sub>vc</sub> is the VC counterpart of e and where H corresponds to the weakest precondition of the remaining commands. This rewriting may involve undoing certain optimisations that Boogie's implementation performed on the formula structure. Next, we need to prove that e evaluates to e<sub>vc</sub> (see below). Hence, if e evaluates to **true** (the execution does not go to magic), then H must be true, and we can continue inductively. The argument for **assert** e is similar, but we rewrite the VC to e<sub>vc</sub> ∧ H (i.e. e<sub>vc</sub> and H must both hold); if e evaluates to e<sub>vc</sub>, we know that the execution does not fail.
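These two rewriting rules can be sketched as a tiny weakest-precondition calculator (our own illustration; formulas are plain strings, not Boogie or Isabelle terms):

```python
def wp(cmds, post):
    """Weakest precondition of a list of assume/assert commands,
    computed right-to-left over string-represented formulas."""
    f = post
    for kind, e in reversed(cmds):
        if kind == "assume":
            f = f"({e}) ==> {f}"   # assume e: e ==> H
        elif kind == "assert":
            f = f"({e}) && {f}"    # assert e: e && H
        else:
            raise ValueError(f"unknown command kind: {kind}")
    return f
```

For example, `wp([("assume", "x > 0"), ("assert", "y == x")], "true")` yields `"(x > 0) ==> (y == x) && true"`, matching the implication and conjunction forms discussed above.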

Proving that e evaluates to e<sub>vc</sub> arises in both cases and also in our previous discharging of VC hypotheses. Note that, in contrast to e, e<sub>vc</sub> is not a Boogie expression, but a shallowly embedded formula that includes the instantiations of quantified variables we constructed above. Showing this property works largely via syntax-driven rules that relate a Boogie expression with its VC counterpart, except for extra work due to mismatching function signatures and optimisations that Boogie made either to the formula structure or via the type system encoding (*cf.* Sect. 6.2). In some of these cases, we show that we can rewrite the formula back into the unoptimised standard form required by our syntax-driven rules; in others, we work directly with the optimised form. Both cases are automated using Isabelle tactics.

This concludes our discussion of the certification of Boogie's three key phases. Combining the three certificates yields an end-to-end proof that the validity of the generated verification conditions implies the correctness of the input program, that is, that the given verification run is sound.

# 7 Implementation and Evaluation

In this section, we evaluate our certifying version of the Boogie verifier [36], which produces Isabelle certificates proving the correctness of Boogie's pipeline for programs it verifies.

We have implemented our validation tool as a new C# module compiled with Boogie. We instrumented Boogie's codebase to call out to our module, which allows us to obtain information that we can use to validate the key phases, and extended parts of the codebase to extract information more easily. Moreover, we disabled counter-example related VC features and the generation of VC axioms for any built-in types and operators that we do not support. We added or changed fewer than 250 non-empty, uncommented lines of code across 11 files in the existing Boogie implementation.

Given an input file verified by Boogie, our work produces an Isabelle certificate per procedure p that certifies the correctness of the corresponding source CFG (the input to the CFG-to-DAG phase) as represented internally in Boogie. The generation and checking of the certificate is fully automatic, without any user input. We use a combination of custom and built-in Isabelle tactics. In addition to the three key phases we describe in detail, our implementation also handles several smaller transformations made by Boogie, such as constant propagation. Our tool currently supports only Boogie's default options and does not support advanced source-level *attributes* (for instance, to selectively force procedures to be inlined).

Table 1. Selection of algorithmic examples with the lines of code (LOC), the number of procedures (#P), the time it takes for Isabelle to check the certificate in seconds (the average of 5 runs on a Lenovo T480 with 32 GB RAM, i7-8550U 1.8 GHz, Ubuntu 18.04 on the Windows Subsystem for Linux), and the certificate size expressed as the number of non-empty lines of Isabelle.


We evaluated our work in two ways. Firstly, to evaluate the applicability of our certificate generation, we automatically collected all input files with at least one procedure from Boogie's test suite [1] which verify successfully and which either use no unsupported features or are easily desugared (by hand) into versions without them. This includes programs with procedure calls since Boogie simply desugars these in an early stage. For programs employing attributes, we checked whether the program still verifies *without* attributes, and if so we also kept these. In total, this yields 100 programs from Boogie's test suite. Secondly, we collected a corpus of ten Boogie programs which verify interesting algorithms with non-trivial specifications: three from Boogie's test suite and seven from the literature [12,27]. Where needed we manually desugared usages of Boogie maps (which we do not yet support) using type declarations, functions, and axioms.

Of the 100 programs from Boogie's test suite, we successfully generate certificates in 96 cases. The remaining 4 cases involve corner cases that we do not handle yet. For 2 of them, extending our work is straightforward: one involves a naming clash, and the other can be fixed by using a more specific version of a helper lemma. The remaining two fail because of our incomplete handling of function calls in the VC phase when combined with coercions between VC integers or booleans and their Boogie counterparts. Handling this is more challenging but is not a fundamental issue.

For the corpus of 10 examples, Table 1 shows the generated certificate size and the time for Isabelle to check their validity.<sup>10</sup> The ratio of certificate size to code size ranges from 41 to 89; this rather large ratio reflects the substantial work that Boogie's implementation performs and that we validate formally. Optimisations to further reduce the ratio are possible. Validating a certificate usually takes under one second per line of code. While these times are not short, they are acceptable, since certificate generation needs to run only for (verified) release versions of the program in question.

<sup>10</sup> The time to generate the certificate is not included, but is negligible here.

# 8 Related Work

Several works explore the validation of program verifiers. Garchery et al. [20] validate VC rewritings in the Why3 VC generator [16]. Unlike our work, they do not connect VCs with programs and do not handle the erasure of polymorphic types. Strub et al. [39] validate part of a previous version of the F\* verifier [40] by generating a certificate for the F\* type checker itself, which type checks programs by generating VCs. Like us, they assume the validity of the generated VC itself, but they do not consider program-to-program transformations such as ours. Another approach is taken by Aguirre [2], who shows how one can map proofs of the VC back to correctness of an F\* program. They prove a once-and-for-all result, but the approach could be lifted to a validation approach using the proof-producing capability of SMT solvers [7]. Lifting the approach would require extending the work to handle classical instead of constructive VC proofs.

There is some work on proving VC generator implementations correct once and for all, although none of the proven tools are used in practice. Homeier and Martin [23] prove a VC generator correct in HOL for an executable language and a simpler VC phase than Boogie's. Herms et al. [22] prove a VC generator inspired by Why3 correct in Coq. However, some more-challenging aspects of Why3's VC transformation and polymorphic type system are not handled. Vogels et al. [44] prove a toolchain for a Boogie-like language correct in Coq, including passification and VC phases. However, the language is quite limited: it has no unstructured control flow, no loops (i.e. no need for a CFG-to-DAG phase), no functions, and no polymorphism (i.e. no type encoding). Verifiers other than VC generators include the verified Verasco static analyzer [25], which supports a realistic subset of C, but whose performance is not yet on par with unverified, industrial analyzers.

Validation has also been explored in other settings. Alkassar et al. [3] adjust graph algorithms to produce witnesses that can then be used by verified validators to check whether the result is correct. In the context of compiler correctness, many validation techniques express a per-run validator in Coq, prove it correct once-and-for-all [8,41,43], and then extract executable code (the extraction must be trusted). In the verified CompCert compiler [34], such validators have been used in combination with the once-and-for-all approach, for phases that can be more easily validated than proved correct once and for all. One such example, related to our certification of the passification phase, is the validation of the SSA phase [8], which also deals with versioned variables in the target (but not with **assume** statements that prune executions). In contrast to our work, they require an explicit notion of CFG domination, and they do not use a global versioning scheme to efficiently check that two parts of the CFG constrain disjoint versions. Our versioning idea is similar to a technique used for the validation of a dominator relation in a CFG [9], which assigns intervals to basic blocks (as opposed to assigning versions to variables) to efficiently determine whether one block dominates another. The validation of the Cogent compiler [38] follows an approach similar to ours in that it generates proofs in Isabelle.

# 9 Conclusion

We have presented a novel verifier validation approach, and applied it successfully to three key phases of the Boogie verifier, providing formal underpinnings for both the language and its verifier for the first time. Our work demonstrates that it is feasible to provide strong formal guarantees regarding the verification results of practical VC generators written in modern mainstream languages.

In the future, we plan to extend our supported subset of Boogie, e.g. to include procedure calls and bitvectors. Supporting Boogie's potentially-impredicative maps is the main open challenge: maps can take other maps as input, potentially including themselves. The challenge with this feature is to still be able to express a type in Isabelle capturing all Boogie values despite the potentially-cyclic nature of map types. In practice, however, this may not be required in full generality: we have observed that Boogie front-ends rarely use maps that take maps of the same type as input. Therefore, we plan to extend our technique to support a suitably-expressive restricted form of Boogie maps.

Acknowledgements. We thank Alain Delaët–Tixeuil for his earlier work on this topic, Thibault Dardinier for improving our artifact, Martin Clochard for helpful discussions and the anonymous reviewers for their valuable comments. This work was partially funded by the Swiss National Science Foundation (SNSF) under Grant No. 197065.

# References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Automatic Generation and Validation of Instruction Encoders and Decoders**

Xiangzhe Xu , Jinhua Wu , Yuting Wang(B) , Zhenguo Yin, and Pengfei Li

Shanghai Jiao Tong University, Shanghai 200240, China yuting.wang@sjtu.edu.cn

**Abstract.** Verification of instruction encoders and decoders is essential for formalizing manipulation of machine code. The existing approaches cannot guarantee the critical *consistency* property, i.e., that an encoder and its corresponding decoder are mutual inverses of each other. We observe that consistent encoder-decoder pairs can be automatically derived from bijections inherently embedded in instruction formats. Based on this observation, we develop a framework for writing specifications that capture these bijections, for automatically generating encoders and decoders from these specifications, and for formally validating the consistency and soundness of the generated encoders and decoders by synthesizing proofs in Coq and discharging verification conditions using SMT solvers. We apply this framework to a subset of X86-32 instructions to illustrate its effectiveness in these regards. We also demonstrate that the generated encoders and decoders have reasonable performance.

**Keywords:** Formalized instruction formats · Verified parsing · Program synthesis · Proof synthesis · Translation validation

# **1 Introduction**

Software that manipulates machine code, such as compilers, OS kernels, and binary analysis tools, relies on *instruction encoders and decoders* to extract structural information of instructions from machine code and to translate such information back into binary form. Because of the sheer number of instructions provided by any instruction set architecture (ISA) and the complexity of instruction formats, it is extremely tedious and error-prone to implement instruction encoders and decoders by hand. Therefore, the literature contains abundant work on the automatic generation of instruction encoders and decoders, often from specifications written in a formal language capable of concisely and accurately characterizing instruction formats on various ISAs [7,12,15].

Unfortunately, the above approaches provide few formal guarantees and are therefore not suitable for rigorous analysis or verification of machine code. In those settings, instruction encoders and decoders are expected to be *consistent*, i.e., any encoder and its corresponding decoder are inverses of each other, and *sound*, i.e., they meet formal specifications of instruction formats that humans can easily understand and check.

Consistency is essential for verification of machine code because it guarantees that manipulation and reasoning over the abstract syntax of instructions can be mirrored precisely onto their binary forms. For example, verification of assemblers requires that instruction decoding reverts the assembling (encoding) process [20]. However, the previously proposed approaches to verifying instruction encoders and decoders all fail to establish consistency: to handle the complexity of instruction formats (especially that of CISC architectures), they employ expressive but ambiguous specifications such as context-free grammars or variants of regular expressions, from which it is impossible to derive consistent encoders and decoders. A representative example is the bidirectional grammar proposed by Tan and Morrisett [18]. It is an extension of regular expressions for writing instruction specifications from which verified encoders and decoders can be generated. However, because of the ambiguity of such specifications, two different abstract instructions may be encoded into the same *bit string* (i.e., a sequence of bits). When the decoder is deterministic, not all encoded instructions can be decoded back to the original instructions.

In this paper, we present an approach to automatic construction of instruction encoders and decoders that are verified to be consistent and sound. It is based on the observation that an instruction format inherently implies a bijection between abstract instructions and their binary forms that manifests as the determinacy of instruction decoding in actual hardware. This is true even for the most complicated CISC architectures. From a well-designed instruction specification that *precisely* captures this bijection, we are able to extract an appropriate representation of instructions, a pair of instruction encoder and decoder between this representation and the binary forms of instructions, and the consistency and soundness proofs of the encoder and decoder.

Based on the above ideas, we develop a framework for automatically generating consistent and sound instruction encoders and decoders. It extends the approach to specifying and generating instruction encoders and decoders proposed by Ramsey and Fernández [15] with mechanisms for *validating* their soundness and consistency by using theorem provers and SMT solvers. The framework consists of the following components (which are also our technical contributions):


the Coq theorem prover so that the encoder and decoder can be formally validated later.

– *The algorithms for automatically validating the consistency and soundness of the generated encoders and decoders*. Given any instruction specification, they synthesize the consistency and soundness proofs for the generated encoder and decoder in Coq. This is possible because the bijection implied by the original specification guarantees that the encoder and decoder are inverses of each other, under the requirement that the binary "shapes" of different instructions or operands do not overlap with each other. This requirement is inherently satisfied by any instruction format, and can be easily proved with SMT solvers.

To demonstrate the effectiveness of our framework, we have applied it to a subset of 32-bit X86 instructions. In the rest of this paper, we first introduce relevant background information for this work and discuss the inadequacy of the existing work in Sect. 2. We then give an overview of our framework in Sect. 3 by further elaborating on the points above. After that, we discuss the definition of our specification language and the ideas supporting its design in Sect. 4. In the two subsequent sections Sect. 5 and Sect. 6, we discuss the algorithms for automatically generating and validating encoders and decoders. In Sect. 7, we present the evaluation of our framework. Finally, we discuss related work and conclude in Sect. 8.

# **2 Background**

For our approach to work, the specification language we use must support the instruction formats on contemporary RISC and CISC architectures. In this section, we first introduce the key characteristics of these formats and then present a running example. We conclude this section by exposing the inadequacy of the existing approaches in capturing the bijections between the abstract and binary forms of instructions.

# **2.1 The Characteristics of Instruction Formats**

**Fig. 1.** The format of 32-bit X86 instructions

Instruction formats on CISC architectures may vary in length and structure even for the same type of instructions and may contain complex dependencies between their operands. In contrast, instructions on RISC architectures usually have fixed formats which are largely subsumed by CISC formats. Therefore, we focus on handling CISC formats in this paper.

We use the format of 32-bit X86 instructions as an example to illustrate the complex characteristics of CISC instructions. It is depicted in Fig. 1. An instruction is divided into a sequence of *tokens*, where each token is one or more bytes playing a particular role. The first token **Opcode** partially or fully determines the basic type of the instruction; it may be one to three bytes long. Following **Opcode** is a one-byte token **ModRM**. **ModRM** is further divided into a sequence of *fields*, where a field f[n<sub>1</sub> : n<sub>2</sub>] represents a segment of the token named f that occupies the n<sub>2</sub>-th to n<sub>1</sub>-th bits of that token. Depending on the value of **Opcode**, **ModRM** may or may not exist. When it exists, the value of **Reg op[5:3]** may contain the encoded representation of a register operand. Another operand of the instruction may be an *addressing mode*. It is collectively determined by the values of **Mod[7:6]**, **RM[2:0]**, the token **SIB** (scaled index byte), and the displacement **Disp** following **ModRM**. Finally, the instruction may have an immediate-value operand in the token **Imms**.
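As a concrete illustration of the **ModRM** field layout just described, the following Python sketch packs and unpacks the three fields of a **ModRM** byte; the helper names are ours, purely illustrative, and not part of any of the tools discussed:

```python
def pack_modrm(mod, reg, rm):
    """Pack Mod[7:6], Reg op[5:3] and RM[2:0] into one ModRM byte."""
    assert 0 <= mod < 4 and 0 <= reg < 8 and 0 <= rm < 8
    return (mod << 6) | (reg << 3) | rm

def unpack_modrm(byte):
    """Inverse of pack_modrm: recover (mod, reg, rm) from a byte."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111
```

Because the three fields occupy disjoint bit ranges, `pack_modrm` and `unpack_modrm` are mutual inverses on every byte value, a miniature instance of the bijection this paper is concerned with.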

For simplicity of our discussion, we have omitted some details such as the optional prefixes of instructions in Fig. 1. However, this simplified form is already enough to expose the key characteristics and complexity of CISC instruction formats (some of which also manifest in RISC). We summarize them below:


own fields or tokens. For example, if an instruction does not take any argument, then the value of its **Opcode** determines that there is no token following **Opcode**. For another example, when **Mod[7:6]** contains the value 0b11, the addressing mode is simply a register operand. Otherwise, the addressing mode may further depend on the values in **SIB** and **Disp**.

Note that, despite the above complexity, an instruction format is designed to inherently embed a (partial) bijection between the binary forms of instructions and their abstract representation as the composition of components. This is to ensure the determinacy of instruction decoding in hardware. This bijection is the central property to be investigated in this work.

#### **2.2 A Running Example**


**Table 1.** The different forms of addressing modes

We present an example of encoding the add instruction to concretely illustrate the characteristics of the X86 instruction format. It will be used as a running example for the rest of the paper. The operands of add may have many forms. For simplicity, we only consider two cases: *1)* the first operand is a register while the second one is an addressing mode, and *2)* the first operand is an addressing mode while the second one is an immediate value.

In the first case, **Opcode** is 0x03, indicating that **ModRM** exists and the first operand is encoded in its **Reg op** field. The addressing mode has over 23 combinations because of the dependencies and constraints over its fields. We list only some of the combinations in Table 1, where - indicates that the field or token does not exist. The first row shows the direct addressing mode **r**, where **Mod** is 0b11 and **RM** contains the encoded register operand **r**. The following three rows show different kinds of indirect addressing modes. They are valid only if **Mod** is 0b00 and further constraints are satisfied. For example, the second row shows the indirect addressing mode **(r)**, where **r** is encoded in **RM**. In this case, **r** must be neither **ESP** (encoded as 0b100) nor **EBP** (encoded as 0b101). Similarly, the addressing mode (**s** ∗ **i** + **b**) requires that **RM** must be 0b100, **Index** must not be 0b100, and **Base** must not be 0b101.

In the second case, **Opcode** is 0x81, indicating that **ModRM** exists, the first operand is an addressing mode, and the second operand is an immediate value following it. Here, **Reg op** must be 0b000.

**Fig. 2.** Some concrete examples of instruction encoding

We show concrete examples of encoding add (4,%ecx,%esp), %ebx, add 0x88, %ebx and add \$0x66, (%ebx) in Fig. 2, where %ebx and %ecx are encoded into 0b011 and 0b001, respectively (the order of operands is *reversed* because we use the AT&T assembly syntax). Note how the forms of operands change significantly depending on the different values in the related fields. Note also that, despite such complex dependencies, a bit string representing a valid add instruction corresponds to a *unique* combination of components.

#### **2.3 Inadequacy of the Existing Approaches**

The existing approaches to specifying instructions are either *1)* too general and allow ambiguity or *2)* too low-level and break the component-based abstraction we just described. Either way, they fail to capture the inherent bijection embedded in an instruction format.

The bidirectional grammars [18] demonstrate the first kind of inadequacy. They contain the alternation grammar Alt g<sub>1</sub> g<sub>2</sub>, which matches a bit string s when either the sub-grammar g<sub>1</sub> or g<sub>2</sub> matches s. Ambiguity arises when both g<sub>1</sub> and g<sub>2</sub> match s: in this case, the same s corresponds to two different internal representations. Therefore, bidirectional grammars cannot encode bijections in general. The same can be said for other work on verified parsing based on ambiguous grammars. We shall discuss them in detail in Sect. 8.
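The ambiguity problem can be made concrete with a toy Python sketch (our own illustration, not the bidirectional-grammar implementation): two distinct abstract values encode to the same bit string, so no deterministic decoder can invert both.

```python
# Two branches of an Alt-style grammar whose binary shapes overlap:
# both ("A", 0) and ("B", 0) are encoded as the single bit string "0".
def encode(ast):
    tag, v = ast
    if tag == "A":
        return format(v, "b")   # branch g1: plain binary
    else:
        return format(v, "b")   # branch g2: same shape -- overlaps!

def decode(bits):
    # A deterministic decoder must commit to one branch, here g1.
    return ("A", int(bits, 2))
```

Here `decode(encode(("B", 0)))` yields `("A", 0)`, so the encoder and decoder are not mutual inverses; avoiding such overlapping shapes is exactly the well-formedness requirement CSLED imposes.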

The Specification Language for Encoding and Decoding (or SLED) demonstrates the second kind of inadequacy [15]. It is a language for describing translations between symbolic and binary representations of machine instructions. On the surface, SLED takes the component-based view in specifying instructions. However, SLED specifications are interpreted through a normalization process by which every component is flattened into a sequence of tokens. After that, the structural information of components is completely lost. As a result, users can only derive encoders from the normalized specifications. They need to write decoders by using completely different specifications called "matching statements." This inability to generate matching encoders and decoders from a single specification is a common phenomenon in other approaches to ISA specifications.

In summary, no existing approach can precisely capture the bijections inherently embedded in instruction formats. This is the main intellectual problem we try to tackle in this paper. We shall elaborate on our solution to this problem in the remaining sections.

# **3 An Overview of the Framework**

**Fig. 3.** The framework

We develop a framework for automatic generation of verified encoders and decoders that are consistent and sound. It is depicted in Fig. 3. To generate formally verified encoders and decoders, users first need to write down a specification of instructions S in a language called CSLED (or CoreSLED). CSLED is an enhancement to SLED for characterizing the bijection between the binary forms and the abstract syntax of instructions. Roughly speaking, S consists of a collection of *class* definitions, each of which defines a unique type of components that form instructions or their operands; the "top-most" class defines the type of instructions. Each class is associated with a set of *patterns* to uniquely determine a bijection between the binary and abstract forms of components in that class. Note that this bijection exists only when certain *well-formedness conditions* for patterns are satisfied. We shall elaborate on these ideas in Sect. 4.

From S, the following definitions are generated and translated into Coq:


Then, S is fed into a collection of algorithms G to generate the following definitions and proofs in Coq:


$$\begin{aligned} \forall \mathcal{K}\ k\ l\ l',\ \mathbb{E}_{\mathcal{K}}(k) = \lfloor l \rfloor &\Longrightarrow \mathbb{D}_{\mathcal{K}}(l \mathbin{+\!\!+} l') = \lfloor (k,l') \rfloor. \\ \forall \mathcal{K}\ k\ l\ l',\ \mathbb{D}_{\mathcal{K}}(l \mathbin{+\!\!+} l') = \lfloor (k,l') \rfloor &\Longrightarrow \mathbb{E}_{\mathcal{K}}(k) = \lfloor l \rfloor. \end{aligned}$$

Their Coq proofs are automatically generated by inspecting the logical structure of classes and patterns in S. For this, we need to derive a very important property: the decoder always decodes a bit string l back to the same sequence of components. We achieve this goal by combining proofs in Coq with SMT solving of verification conditions that are automatically derived from well-formed specifications.
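A minimal Python analogue of the consistency theorems above (our own toy class, not generated by the framework; returning a value plays the role of ⌊·⌋, and `None` the role of failure): a class with two branches whose leading tag bits make the binary shapes disjoint.

```python
def encode(k):
    """Encoder for a toy class with two branches: a 3-bit register
    operand Reg r (tag bit 0) and a 4-bit immediate Imm n (tag bit 1).
    The tags keep the binary shapes of the branches disjoint."""
    tag, v = k
    if tag == "Reg" and 0 <= v < 8:
        return [0] + [(v >> i) & 1 for i in (2, 1, 0)]
    if tag == "Imm" and 0 <= v < 16:
        return [1] + [(v >> i) & 1 for i in (3, 2, 1, 0)]
    return None  # outside the bijection's domain

def decode(l):
    """Decoder: consumes a prefix of l, returns (component, rest)."""
    if l and l[0] == 0 and len(l) >= 4:
        return ("Reg", l[1] * 4 + l[2] * 2 + l[3]), l[4:]
    if l and l[0] == 1 and len(l) >= 5:
        return ("Imm", l[1] * 8 + l[2] * 4 + l[3] * 2 + l[4]), l[5:]
    return None
```

Whenever `encode(k)` yields a bit list l, `decode(l ++ l')` yields `(k, l')` and vice versa, which is exactly the shape of the two consistency statements.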

– The proof of soundness of the encoder and decoder. The soundness theorems are stated as follows:

$$\begin{aligned} \forall \mathcal{K}\ k\ l\ l',\ \mathbb{E}_{\mathcal{K}}(k) = \lfloor l \rfloor &\Longrightarrow \mathbb{R}[\![\mathcal{K}]\!]\ k\ l. \\ \forall \mathcal{K}\ k\ l\ l',\ \mathbb{D}_{\mathcal{K}}(l \mathbin{+\!\!+} l') = \lfloor (k,l') \rfloor &\Longrightarrow \mathbb{R}[\![\mathcal{K}]\!]\ k\ l. \end{aligned}$$

As we shall see later, E<sub>K</sub> and R[[K]] are both defined recursively on the definition of classes in S. Their main difference is that the former is a function while the latter is a relation. Therefore, it is easy to prove the first soundness theorem by induction on k. Using the second consistency theorem and the first soundness theorem, we can easily prove the second soundness theorem.

As we shall see in the following sections, the actual implementations of encoders and decoders and their consistency and soundness theorems are more complicated than presented here. Nevertheless, the above discussion covers the high-level ideas of our framework.

Note that in Fig. 3, S and G are not formalized and hence not in the trusted base. The consistency and soundness of E and D are independently *validated* by using Coq and SMT solvers. If the validation of either property fails, the framework reports a failed attempt to generate the encoder and decoder. This often indicates that the instruction specification is not well-formed.

# **4 The Specification Language**

The key idea underlying the design of CSLED is to record explicitly the structures of components in instruction specifications, instead of normalizing them into tokens as is done in SLED. In this way, CSLED specifications accurately capture the key characteristics of instruction formats described in Sect. 2.1, and hence the bijections embedded in instruction formats. In this section, we present the syntax of CSLED, explain the ideas underlying its design, and use the running example to illustrate how CSLED specifications are written. We also introduce the syntactic and relational interpretations of CSLED specifications and present the well-formedness conditions under which the bijections exist.

#### **4.1 The Syntax**


**Fig. 4.** The syntax of CSLED

The syntax of CSLED is shown in Fig. 4. A CSLED specification (denoted by S) consists of a list of *definitions* (denoted by D). The three kinds of definitions are for tokens (denoted by T ), fields (denoted by F) and classes (denoted by K). Every definition is bound to a unique identifier, where *tid*, *fid* and *kid* represent the identifiers of tokens, fields and classes, respectively.

Tokens represent consecutive segments of bytes and are the basic elements for forming instructions. They are necessary for distinguishing between different interpretations of the same sequence of bytes. Their definitions have the form (n), where n must be divisible by 8, denoting a token of n bits (i.e., n/8 bytes). Definitions of fields have the form *tid*(n1 : n2), which denotes a field occupying the n2-th to n1-th bits of the token *tid*.
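To make the bit-level meaning of field definitions concrete, here is a minimal Python sketch (not the framework's generated Coq code; the helper names are ours) of how a field *tid*(n1 : n2) selects bits n2 through n1 of its token:

```python
def read_field(token: int, n1: int, n2: int) -> int:
    """Return bits n2..n1 (inclusive, n1 >= n2) of a token."""
    width = n1 - n2 + 1
    return (token >> n2) & ((1 << width) - 1)

def write_field(token: int, n1: int, n2: int, value: int) -> int:
    """Overwrite bits n2..n1 of a token with value."""
    width = n1 - n2 + 1
    mask = ((1 << width) - 1) << n2
    return (token & ~mask) | ((value & ((1 << width) - 1)) << n2)

# Example: for token ModRM = (8) with field mod = ModRM(7 : 6) and
# field rm = ModRM(2 : 0), the byte 0b00000100 has mod = 0b00, rm = 0b100.
modrm = 0b00000100
assert read_field(modrm, 7, 6) == 0b00
assert read_field(modrm, 2, 0) == 0b100
```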

Classes represent specific types of components. They play a central role in the specifications by accurately capturing the component-based abstraction we discussed in Sect. 2.1. A class consists of a collection of *branches* (denoted by B) each of which denotes a possible form of components in the class. Definitions of branches have the form constr *cid* [*aid*] (P) where *cid* is a unique identifier for the branch (denoting a constructor) and [*aid*] is a list of *fid* or *kid* denoting the sub-components or fields for constructing a component (i.e., the arguments to the constructor). These arguments capture the nested structures of components where a bigger component may be constructed from smaller ones or basic fields.

A branch is associated with a single *pattern* P. A pattern plays two roles: it determines the types of the sequence of tokens that concretely forms components of this branch, and it describes a relation between these tokens (and their fields) and the abstract arguments of the branch. This relation essentially encodes the bijection between the abstract and binary forms of components in this branch.

At the top-most level, P is a sequence of *judgments* (denoted by J ) separated by ;, such that J1; ...; Jn matches a sequence of tokens concretely represented by a bit string l if and only if l = l1 ++ l2 ++ ... ++ ln and Ji matches li for 1 ≤ i ≤ n. This sequential pattern suffices for relating abstract and binary forms of components when each Ji (and li) corresponds to a single (sub-)component. However, according to the discussion in Sect. 2.1, components may be interleaved with each other, and a Ji may correspond to multiple components. Therefore, a judgment is a conjunction of *atomic patterns* (denoted by A), each of which matches an interleaved component. When there is no interleaving, a judgment reduces to a single atomic pattern.

An atomic pattern has two forms: cls %i for relating a sequence of tokens to the i-th argument in [*aid*] of the corresponding branch, which must be a class, and O for relating tokens to field arguments in [*aid*] and for further constraining the fields of these tokens. The O patterns are called *basic patterns*. Among them, -:*tid* matches any token of type *tid*; *fid* = n (resp. *fid* ≠ n) matches a token whose field *fid* has (resp. does not have) the constant value n; similar to cls %i, fld %i relates the i-th argument in [*aid*] of the branch, which must be a field, to the concrete value of the field in the matching token. The last two cases of basic patterns indicate that arbitrary sequencing and interleaving of basic patterns are allowed. Despite such free interleaving, a basic pattern can only match sequences of tokens of the same length and of a unique type, because we require that O1 & O2 be well-formed only if both O1 and O2 match sequences of tokens of the same type. Therefore, basic patterns have the same expressiveness as SLED specifications in their normalized forms [15].
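As a small illustration of how basic patterns constrain a single token, the following Python sketch (our own rendering, not part of CSLED) shows the match semantics of *fid* = n against one 8-bit token:

```python
def match_field_eq(token: int, n1: int, n2: int, n: int) -> bool:
    """fid = n: bits n2..n1 of the token must equal the constant n."""
    width = n1 - n2 + 1
    return ((token >> n2) & ((1 << width) - 1)) == n

# A ModRM byte with mod = 0b11, reg_op = 0b000, rm = 0b001:
modrm = 0b11000001
assert match_field_eq(modrm, 7, 6, 0b11)        # mod = 0b11 matches
assert not match_field_eq(modrm, 2, 0, 0b100)   # rm = 0b100 does not
# fid != n is simply the negation, and a conjunction O1 & O2 requires
# both matches on the same token.
```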

In contrast to basic patterns, judgments and atomic patterns are much more expressive, as they may match tokens of different lengths and forms. This is because a class pattern cls %i can match components of a class K with multiple branches, each of which may have a different pattern. By introducing class patterns into atomic patterns, we are able to represent the complete structures of components and establish bijections from these structures. This is the key improvement of CSLED over SLED.

#### **4.2 The CSLED Specification of the Running Example**

```
token Opcode = (8); token Disp = (32); token Imms = (32);
token ModRM = (8); token SIB = (8);
field opcode = Opcode(7 : 0); field disp = Disp(31 : 0);
field imms = Imms(31 : 0); field mod = ModRM (7 : 6);
field reg_op = ModRM (5 : 3); field rm = ModRM (2 : 0);
field scale = SIB(7 : 6); field index = SIB(5 : 3);
field base = SIB(2 : 0);
class Addrmode =
  (mod = 0b00 & rm = 0b100;
  fld %1 & fld %2 & fld %3 & index = 0b100 & base = 0b101)
...
class Instruction =
  (opcode = 0x81; reg_op = 0b000 & cls %1; fld %2)
...
```
#### **Fig. 5.** The CSLED specification of the running example

The CSLED specification of our running example is depicted in Fig. 5. The *Addrmode* class specifies the possible addressing modes. Its branches are translated one by one from the addressing modes described in Table 1, such that their patterns exactly match the binary structures of components in the corresponding branches. For instance, the branch *addr sib* is translated from the fourth addressing mode in Table 1. Its pattern is a sequence of two judgments. The first judgment is a conjunction of two basic patterns expressing the required constraints on the fields *mod* and *rm* of *ModRM* described in Table 1; therefore, it must match the single token *ModRM*. The second judgment is a conjunction of basic patterns that constrain the fields *index* and *base* of *SIB* and relate the arguments of *addr sib* to the concrete values in the fields *scale*, *index* and *base*. Because these patterns all constrain fields of *SIB*, the second judgment must match the single token *SIB*.

Similarly, the *Instruction* class specifies the instructions. Its two branches characterize the two kinds of add instructions described in Sect. 2.2. Note how conjunctions between the basic patterns for *reg op* and class patterns for *Addrmode* are used to describe the interleaving of register operands and addressing modes. Note also that in every branch of *Addrmode* the first pattern matches the token *ModRM* , and in any branch of *Instruction* the token *Opcode* is always followed by *Addrmode*. Therefore, *ModRM* always follows *Opcode* as desired.

This example demonstrates the critical feature of CSLED: because its syntax is designed to precisely describe instruction formats in ISA manuals, it implicitly captures the embedded bijections. Note that, because of this faithfulness to the ISA manuals, CSLED's syntax naturally contains full details of instruction encoding. However, it is not hard to imagine this syntax being refined to a client-facing syntax through another straightforward bijection. In fact, this is how we anticipate clients will use CSLED in practice, e.g., to build verified assemblers for X86.

#### **4.3 Interpretation of CSLED Specifications**

From a CSLED specification S, we extract *1)* a collection of data types for representing the abstract syntax of components, and *2)* a collection of binary relations between these data types and bit strings for representing the mappings between the abstract and concrete forms of components.

**Data Types of Components.** We use the operator T[[−]] to denote the interpretation of basic fields and classes into data types. The translation of fields is simple: given a field definition field *fid* = *tid*(n1 : n2), T[[*fid*]] is the type of unsigned binary integers of n1 − n2 + 1 bits. Note that we do not further translate the values of fields, as they have straightforward interpretations (such as the mapping from bits to registers described in Sect. 2.1). The interpretation of classes is only slightly more involved. Given a class definition class *kid* = K, T[[*kid*]] is an algebraic data type named *kid*. For each branch constr *cid* [*aid* 1,..., *aid n*] P of K, there is a constructor *cid* for *kid* that takes n arguments of types T[[*aid* 1]],...,T[[*aid n*]].
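For instance, a hypothetical Python rendering of T[[*Addrmode*]] might look as follows (the framework itself generates a Coq inductive type; the constructor and argument names mirror Fig. 9 but are otherwise illustrative):

```python
from dataclasses import dataclass

class Addrmode:
    """T[[Addrmode]]: one constructor (subclass) per branch of the class."""

@dataclass
class AddrSib(Addrmode):
    # The branch's field arguments, each an unsigned integer of the
    # field's width: T[[scale]] is 2 bits, T[[index]]/T[[base]] are 3 bits.
    scale: int
    index: int
    base: int

m = AddrSib(scale=0b10, index=0b100, base=0b101)
assert isinstance(m, Addrmode) and m.scale == 0b10
```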

**Relations Derived from CSLED.** The translation of CSLED specifications into relations is defined in Fig. 6. Here, *BS* denotes the type of bit strings. When *aids* = [*aid* 1,..., *aid n*], we write T[[*aids*]] to denote the product of the types T[[*aid* 1]],...,T[[*aid n*]]. We use ≡ to denote definitional equality.

The function R[[*aid*]] translates the type of components associated with *aid* into a binary relation between its abstract representation and bit strings, where *aid* may denote a field or a class. The definition for field components is straightforward. R[[*kid*]] k *l* holds iff there is a branch of *kid* whose interpretation relates k and *l*, which further requires (by the third rule in Fig. 6) that k is constructed by the constructor of that branch and that the pattern of the branch relates the arguments of the constructor to *l*. The latter relation is defined by Rp[[−, −]] such that Rp[[P, *aids*]] *args l* holds iff P matches *l* and the arguments *args* satisfy the constraints enforced by P and *aids*. More specifically, Rp[[P;J , *aids*]] *args l* holds iff P matches a prefix of *l* and J matches the rest of *l*. The definition of Rp[[J &A, *aids*]] is slightly different: it holds iff A matches the whole of *l* and J matches a prefix of *l*. This is necessary for describing the

**Fig. 6.** Translation of CSLED specifications into relations

interleaving of components. Furthermore, certain constraints need to be satisfied for deriving a bijection, as we shall discuss in Sect. 4.4. Rp[[O1;O2, *aids*]] and Rp[[O1&O2, *aids*]] are not shown in Fig. 6 because they are defined in the same way as Rp[[P;J , *aids*]] and Rp[[J &A, *aids*]], respectively. Rp[[*fid* = n, *aids*]] *args l* holds iff *l* is a token containing *fid* whose value is n; similarly for Rp[[*fid* ≠ n, *aids*]]. Rp[[fld %i, *aids*]] holds iff the i-th argument in *args* matches the concrete value of the field found in *l*; likewise for Rp[[cls %i, *aids*]]. Note how the last two definitions make use of *args* to obtain the values of arguments.

#### **4.4 Well-Formedness of Specifications**

The binary relations defined in the previous section denote bijections only when the CSLED specification under investigation satisfies certain well-formedness conditions. These conditions guarantee that, given any bit string l, there is at most one abstract object related to l via the defined binary relation. Well-formedness is the conjunction of three properties, which we call *disjointness*, *compatibility* and *uniqueness*; we give and explain their definitions below. The logic for checking these conditions is embedded in the generation algorithms discussed in the next section and is exploited for the validation of the generated encoders and decoders.

**Disjointness.** Given a pattern P1&P2, it satisfies disjointness if P1 and P2 match disjoint fields.<sup>1</sup> To understand this, suppose P1 and P2 relate different abstract arguments a1 and a2 to overlapping bits in a bit string l. Then we cannot determine whether the values in the overlapping bits belong to a1 or a2; hence the derived binary relation cannot possibly be a bijection. Disjointness rules out such a possibility.
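The failure mode that disjointness prevents can be seen with a toy example: if two arguments were mapped to overlapping bits, distinct argument pairs could produce the same byte, so no bijection can exist. A minimal Python illustration, with invented 3-bit "fields" placed at bits 3..1 and 2..0:

```python
def pack_overlapping(a1: int, a2: int) -> int:
    """Write a1 to bits 3..1 and a2 to bits 2..0 -- bits 2..1 overlap."""
    return ((a1 & 0b111) << 1) | (a2 & 0b111)

# Two different argument pairs collide on the same byte, so the byte
# cannot be decoded back to a unique pair of arguments.
assert pack_overlapping(0b011, 0b110) == pack_overlapping(0b010, 0b110)
```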

**Compatibility.** We call the types of the sequences of tokens a pattern P matches the "shapes" of P. Given a pattern P1&P2, it satisfies compatibility if every possible shape of P1 is a prefix of every possible shape of P2 when P2 is a class pattern (and vice versa). Enforcing compatibility simplifies the interpretation of P1&P2 when P1 or P2 is a class pattern with multiple branches that may match bit strings with different shapes. Compatibility makes sense because, for common instruction formats, it is always the case that the components matched by P1 are embedded in the *longest common prefix* of all the possible shapes of P2 when P2 is a class pattern (and vice versa). For example, in Fig. 2, **Reg op** is always embedded in the common prefix of all the possible shapes of addressing modes, i.e., the **ModRM** token.

**Uniqueness.** Given a class pattern K, it satisfies uniqueness if, for any bit string l, at most one of its branches matches l. Uniqueness is essential for ensuring the determinacy of decoders in the presence of class patterns. Fortunately, it implicitly holds for common instruction formats, as they are designed with determinacy of decoding in mind. To concretely check the uniqueness implied by instruction formats, we first define the *structural condition* for a branch with pattern P as the conjunction of the statically known constraints in P, denoted by [[P]]*cond*. We then require that the structural conditions of no two branches of a class can be satisfied simultaneously. This requirement allows us to uniquely determine the branch used to construct a class component. For example, the structural conditions of the first three branches of *Addrmode* are (*mod* = 0b11), (*mod* = 0b00 & *rm* ≠ 0b100 & *rm* ≠ 0b101) and (*mod* = 0b00 & *rm* = 0b101). Obviously, no pairwise combination of these conditions can be satisfied, and this remains true when all the branches of *Addrmode* are considered. Therefore, there is at most one way to decode any addressing mode.
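The pairwise-unsatisfiability check on these structural conditions can be replayed by brute force over all 256 ModRM bytes. The following Python sketch is our own stand-in for the framework's check, reading the second branch's constraints as the inequalities rm ≠ 0b100 and rm ≠ 0b101:

```python
def mod(b: int) -> int:
    return (b >> 6) & 0b11

def rm(b: int) -> int:
    return b & 0b111

# Structural conditions of the first three Addrmode branches.
cond1 = lambda b: mod(b) == 0b11
cond2 = lambda b: mod(b) == 0b00 and rm(b) != 0b100 and rm(b) != 0b101
cond3 = lambda b: mod(b) == 0b00 and rm(b) == 0b101

for b in range(256):
    # No ModRM byte satisfies two structural conditions at once.
    assert cond1(b) + cond2(b) + cond3(b) <= 1
```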

# **5 Generation of Encoders and Decoders**

We discuss the algorithm for generating encoders and decoders from CSLED specifications. The structures of these encoders and decoders closely match the relations derived from specifications. Furthermore, every operation in an encoder has a counterpart in the corresponding decoder, and vice versa.

<sup>1</sup> We abuse notation by using P to denote any suitable pattern, such as J , A or O.

#### **5.1 Generation of Encoders**

From every class K, we extract an encoder EK for its components. It is a partial function that takes two arguments—a component k and a bit string l representing the result previously generated by encoders—and outputs an updated bit string if the encoding succeeds. We shall write EK(k, l) = ⌊l′⌋ to denote that l′ is the result of encoding k on top of l.

EK(k, l) is defined by recursion on the structure of k. For every branch B of K, we generate from the pattern P of B a piece of Coq code for encoding k, and insert it into the definition of EK(k, l). We write GE[[P, *bs*, *args*]] to denote the code snippet so generated, where *bs* is the name of the bit string generated up to this point and *args* contains the names of the arguments to the constructor. GE[[P, *bs*, *args*]] is defined in Fig. 7, where we use the option monad for sequencing the encoding operations. The first case is obvious. Code generated by GE[[*fid* = n, *bs*, *args*]] writes the constant n into the field associated with *fid*. GE[[*fid* ≠ n, *bs*, *args*]] checks that the corresponding field does not contain the constant n, returning none if the check fails. GE[[fld %i, *bs*, *args*]] writes the value of the i-th argument into the corresponding field. GE[[cls %i, *bs*, *args*]] calls the encoder for the class corresponding to cls %i. GE[[O1 ; O2, *bs*, *args*]] encodes its two parts recursively and concatenates the results, where *first n*(*bs*, n) returns the first n bits of *bs* and *skip n*(*bs*, n) skips the first n bits of *bs* and returns the remaining ones. GE[[O1&O2, *bs*, *args*]] first encodes data matching O1 and then passes the result to the encoding for O2. The last two cases are similar. Note that if the generated code occurs at the beginning of a branch, then *bs* coincides with the input argument l; otherwise, *bs* denotes an intermediate result. As we can see, all these cases follow the logical structure of CSLED specifications described before.
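The flavor of the generated code can be conveyed by a Python sketch of what GE produces for the *addr sib* branch (cf. Fig. 9). Byte-valued integers stand in for bit strings, encoding failure is modeled by returning None (mimicking the option monad), and all helper logic is our own:

```python
def encode_addr_sib(scale: int, index: int, base: int, modrm_in: int = 0):
    # Judgment 1: write the constants mod = 0b00 and rm = 0b100 into ModRM,
    # preserving any previously written bits (here, the reg_op field).
    modrm = (modrm_in & 0b00111000) | (0b00 << 6) | 0b100
    # Judgment 2: the SIB fields, with index/base constrained by the pattern.
    if index != 0b100 or base != 0b101:
        return None                      # encoding fails
    sib = (scale << 6) | (index << 3) | base
    return bytes([modrm, sib])           # concatenate the two tokens

assert encode_addr_sib(0b10, 0b100, 0b101) == bytes([0b00000100, 0b10100101])
assert encode_addr_sib(0b10, 0b000, 0b101) is None
```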

#### **5.2 Generation of Decoders**

From every class K, we extract a decoder DK. It is a partial function such that DK(l) = ⌊(k, l1, l2)⌋ holds iff l = l′ ++ l2, where l′ is the binary representation of k and l1 is the result of inverting the encoding operation on l′, i.e., setting every bit the decoder touches in l′ to 0. This extra return value is introduced to help with the verification, as we shall see in Sect. 6.


**Fig. 8.** Generation of decoders from patterns

The first step of DK is to decide which branch of K should be chosen for decoding l. This is done by checking the structural conditions derived from the patterns of the branches (introduced in Sect. 4.4) against l. Specifically, for the pattern P of each branch of K, we translate its structural condition [[P]]*cond* into a decision procedure in Coq (a function returning boolean values) in a straightforward manner. We then insert an if-statement checking whether [[P]]*cond* is satisfied. If so, we start the decoding process for this branch; otherwise, we check the remaining branches until a matching case is found. Note that, by uniqueness, at most one structural condition can be satisfied; therefore, DK is deterministic in choosing branches.

Once a matching branch is found, we use the algorithm GD[[P, *bs*, *args*]] (the counterpart of GE[[P, *bs*, *args*]]) to generate a piece of Coq code for decoding the arguments of this branch. It is defined in Fig. 8. As with encoding, the generated code snippet follows the logical structure of CSLED specifications. The function clear*fid*(*bs*) sets the bits of the field *fid* in *bs* to 0. Note that the decoding operations are exactly the inverses of those in Fig. 7. Note also that the fourth and fifth cases in Fig. 8 are responsible for decoding the arguments and storing them in *argi*. By applying the corresponding constructor to these arguments, we get the output component k, which, together with the two values returned by GD, forms the final output of DK.
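A Python sketch of the decoder side (mirroring the Coq code in Fig. 9, with our own helper logic) shows how the arguments are read back while every touched bit is zeroed to produce the reverted output:

```python
def decode_addr_sib(bs: bytes):
    modrm, sib, rest = bs[0], bs[1], bs[2:]
    # Structural condition: mod = 0b00 and rm = 0b100.
    if (modrm >> 6) & 0b11 != 0b00 or modrm & 0b111 != 0b100:
        return None
    # Read the arguments back from SIB.
    scale = (sib >> 6) & 0b11
    index = (sib >> 3) & 0b111
    base = sib & 0b111
    # Revert the encoding: zero every bit the encoder wrote.
    ori1 = modrm & 0b00111000            # clear mod and rm, keep reg_op
    ori2 = 0                             # clear scale, index and base
    return (scale, index, base), bytes([ori1, ori2]), rest

args, reverted, remains = decode_addr_sib(bytes([0b00000100, 0b10100101]))
assert args == (0b10, 0b100, 0b101)
assert reverted == bytes([0, 0])         # the original all-zero input
assert remains == b""
```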

# **5.3 Generation for the Running Example**

We show representative cases of the generated encoder and decoder for our running example in Fig. 9. They include the encoding and decoding procedures for the fourth branch of *Addrmode* (the most complicated one). We can see that the encoding and decoding operations are exact inverses of each other. The encoder first writes the fields in *ModRM* and then those in *SIB*. Conversely, the decoder first reads the fields in *ModRM* and then those in *SIB*. Finally, it forms the component and returns the reverted and remaining bits. The function BF_addr_sib is the decision procedure generated from the structural condition for the fourth branch of *Addrmode*. We also show the encoding and decoding procedures for the first add instruction in Fig. 9. Their structures are very similar to those of *Addrmode*.

# **6 Validation of Encoders and Decoders**

In this section, we discuss how to exploit the logical structure of, and the well-formedness conditions for, CSLED specifications to automatically synthesize the proofs of consistency and soundness for encoders and decoders.

# **6.1 Synthesizing the Proof of Consistency**

The consistency between encoders and decoders comprises two properties, stated as follows:

**Theorem 1 (Consistency between Encoders and Decoders).** *Given any class* K*, its encoder* EK *and decoder* DK *are consistent with each other if they invert each other. That is, the following properties hold:*

$$\begin{aligned} \forall \; k \; l \; r \; l', \; \mathit{valid\_input}_{\mathcal{K}}(l) \Longrightarrow \mathbb{E}_{\mathcal{K}}(k, l) &= \lfloor r \rfloor \Longrightarrow \mathbb{D}_{\mathcal{K}}(r ++ l') = \lfloor (k, l, l') \rfloor. \\ \forall \; k \; l \; r \; l', \; \mathbb{D}_{\mathcal{K}}(r ++ l') = \lfloor (k, l, l') \rfloor &\Longrightarrow \mathbb{E}_{\mathcal{K}}(k, l) = \lfloor r \rfloor. \end{aligned}$$

```
Definition encode_addrmode instance input :=
  match instance with
  ...
  | addr_sib arg1 arg2 arg3 ⇒
    (* Encode ModRM *)
    let ModRM := input in
    let tmp := write_mod ModRM b["00"] in
    let tmp := write_rm tmp b["100"] in
    let result0 := tmp in
    (* Encode SIB *)
    let SIB := zeros 8 in
    let tmp := write_scale SIB arg1 in
    let tmp := write_index tmp arg2 in
    let tmp := write_base tmp arg3 in
    let index := read_index tmp in
    let base := read_base tmp in
    do _ ← assert(index = b["100"]);
    do _ ← assert(base = b["101"]);
    let result1 := tmp in
    (* Concatenate the results of
       encoding ModRM and SIB *)
    Some (result0++result1)
  | ...
  end.
Definition decode_addrmode bs :=
  ...
  if BF_addr_sib bs then
    (* Revert the encoding of ModRM *)
    let ori := clear_mod bs in
    let ori := clear_rm ori in
    let ori1 := ori in
    do remains ← skipn bs 8; (* Skip ModRM *)
    (* Decode SIB to get the arguments
       and revert the encoding of SIB *)
    let bs := remains in
    let arg3 := read_base bs in
    let ori := clear_base bs in
    let arg2 := read_index ori in
    let ori := clear_index ori in
    let arg1 := read_scale ori in
    let ori := clear_scale ori in
    let ori2 := ori in
    do remains ← skipn bs 8; (* Skip SIB *)
    (* Return the result *)
    Some(addr_sib arg1 arg2 arg3,
         ori1++ori2, remains)
  else if BF_addr_r bs then ...
    ...
Definition encode_instr instance input :=
  match instance with
  | AddGvEv arg1 arg2 ⇒
    ...
    let tmp := write_reg_op ModRM arg1 in
    do tmp ← encode_addrmode arg2 tmp;
    ...
  | ...
  end.
Definition decode_instr bs :=
  if BF_AddGvEv bs then
    ...
    do arg2, ori, remains ←
       decode_addrmode bs;
    let arg1 := read_reg_op ori in
    let ori := clear_reg_op ori in
    ...
Definition BF_addr_sib bs :=
  let ModRM := firstn bs 8 in
  (* mod = 0b00 ∧ rm = 0b100 *)
  let result0 :=
    (ModRM & b["11000111"]) = b["00000100"] in
  let tmp := skipn bs 8 in
  let SIB := firstn tmp 8 in
  (* index = 0b100 *)
  let result10 :=
    (SIB & b["00111000"]) = b["00100000"] in
  (* base = 0b101 *)
  let result11 :=
    (SIB & b["00000111"]) = b["00000101"] in
  result0 ∧ result10 ∧ result11.
Definition BF_AddGvEv bs :=
  let Opcode := firstn bs 8 in
  (Opcode & b["11111111"]) = b["00000011"].
```
**Fig. 9.** Encoders and decoders generated from the running example

We first discuss how the proof of the first property in Theorem 1 is generated. Here, the assumption *valid input*K(l) asserts that all the bits in l that may be modified by EK are 0. This is necessary to ensure that the decoder can revert the resulting bit string back to its initial state by setting those bits to 0 (i.e., the second result of decoding is the same as l).

The proof proceeds by induction on the structure of k. For each branch B with pattern P, we generate a lemma (and its proof) stating that the decision procedure generated from [[P]]*cond*, as described in Sect. 5.2, always returns true on any bit string generated by the encoder for P. With this lemma, the proof for the "symmetric" case, where the decoder takes the same branch as the encoder, reduces to proving that the encoder and decoder generated from P are inverses of each other. This proof is straightforward by the definitions of GE and GD

in Sect. 5. An important point to note is that, for any pattern cls %i, we need to recursively apply the consistency lemma for its corresponding class, which in turn requires us to establish a *valid input* assumption. By the disjointness property in Sect. 4.4, we can easily conclude that the encodings of sub-components do not interfere with each other, so the desired *valid input* assumption can be derived.

To finish the proof, we need to show that the "asymmetric" cases are impossible. For each asymmetric branch B′ with pattern P′, we have that [[P′]]*cond* holds, by the decision procedure guarding this branch. Furthermore, by the above reasoning, [[P]]*cond* holds. Hence the conjunction of [[P]]*cond* and [[P′]]*cond* holds. However, this contradicts the uniqueness property given in Sect. 4.4. Therefore, the decoder can never enter a branch different from the one taken by the encoder. Continuing with our running example, suppose we are proving the consistency of the encoder and decoder for *Addrmode*, and further suppose we are working on the branch with the constructor *addr sib*. Then the verification condition for the asymmetric case with the constructor *addr r* is

$$\forall \; bs, \; (\mathit{read\_mod}\; bs = \texttt{0b00} \;\land\; \mathit{read\_rm}\; bs = \texttt{0b100} \;\land\; \dots) \;\land\; (\mathit{read\_mod}\; bs = \texttt{0b11})$$

which cannot possibly hold (for simplicity we omit the conditions for *index* and *base*). We note that such conditions can be easily checked by any SMT solver supporting the theory of bit-vectors; we use Z3 [5] to validate them. This checking could also be formalized directly in Coq, which we plan to do in the future.
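Since this condition ranges over a single 8-bit token, its unsatisfiability can also be confirmed by brute force. The following Python check is a stand-in for the Z3 bit-vector query, with our own field readers; it exhausts all 256 ModRM bytes:

```python
def read_mod(b: int) -> int:
    return (b >> 6) & 0b11

def read_rm(b: int) -> int:
    return b & 0b111

# The asymmetric verification condition for addr_sib vs. addr_r:
# no byte can have mod = 0b00 (and rm = 0b100) and mod = 0b11 at once.
assert not any(read_mod(b) == 0b00 and read_rm(b) == 0b100
               and read_mod(b) == 0b11
               for b in range(256))
```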

Finally, the second property in Theorem 1 can be proved by induction on k in a similar fashion. We elide a discussion of its proof.

#### **6.2 Synthesizing the Proof of Soundness**

As we have discussed in Sect. 4.3, the relational specifications extracted from CSLED specifications are tightly related to the actual instruction formats. Thus, it is reasonable to check the soundness of the generated encoders and decoders against these specifications. The relational specifications are easily translated into Coq definitions and we shall use the same notations. The soundness of encoders and decoders is then stated as follows:

**Theorem 2 (Soundness of Encoders and Decoders).** *Given any class* K*, its encoder* EK *is sound if the following property holds:*

$$\forall \; k \; l \; r \; l', \; \mathbb{E}_{\mathcal{K}}(k, l) = \lfloor r \rfloor \Longrightarrow \mathbb{R}[\mathcal{K}] \; k \; r.$$

*Similarly, its decoder* DK *is sound if the following holds:*

$$\forall \; k \; l \; r \; l', \; \mathbb{D}_{\mathcal{K}}(r ++ l') = \lfloor (k, l, l') \rfloor \Longrightarrow \mathbb{R}[\mathcal{K}] \; k \; r.$$

The soundness of the encoder is easily proved by induction on the structure of k; as in the consistency proofs, we exploit the well-formedness conditions of CSLED specifications at the relevant points. The soundness of the decoder is a corollary of the soundness of the encoder and the second consistency property.

# **7 Evaluation**

Besides the CSLED language, our framework has two major parts: *1)* the algorithms for generating encoders, decoders and their proofs and *2)* a Coq library containing the definitions and properties of basic types (including bits, bytes and bit strings) and a collection of automation tactics (Ltac definitions) for proof synthesis. The generation algorithms amount to 5,193 lines of C++ code (excluding comments and empty lines, and likewise for the following statistics). The Coq library amounts to 1,036 lines of Coq code (written in Coq 8.11.0 and counted using coqwc). We also make use of the monad definitions and some basic data formats in CompCert's library [13]. The whole framework took six person-months to develop.


**Table 2.** The lines of generated Coq code

To evaluate the effectiveness of our framework, we have written a CSLED specification for a total of 186 representative X86-32 instructions which cover the operands with the most complicated formats (e.g., addressing modes) and are sufficient for supporting the assembling process in CompCert's X86-32 backend. The specification is very succinct, containing only 260 lines of CSLED code. From this specification, our framework *automatically* generates around 87k lines of Coq code which form the verified encoder and decoder. The lines of Coq definitions and proofs for individual components are shown in Table 2. Note that the verification conditions account for a major part of the definitions because we need to consider all the possible combinations of structural conditions for the proofs of consistency and soundness. The Coq proofs related to verification conditions are for identifying the concrete forms of structural conditions. As expected, the consistency proof is the most complicated one among all the proofs.

To evaluate the performance of the generated encoder and decoder, we randomly generate four sets of instructions, encode them into bit strings, and decode the bit strings back. The executable encoder and decoder are obtained by extracting the Coq definitions into OCaml programs and compiling them with OCaml 4.08.0. We repeat this experiment 30 times on a machine with an Intel(R) i7-4980HQ CPU @ 2.8 GHz and 16 GB of memory. For comparison, we conduct the same experiments on the hand-written encoder and decoder in the X86-32 back-end of CompCertELF [20]. The results are shown in Table 3. For each test case, it shows the


**Table 3.** Performance evaluation

numbers of randomly generated instructions and the median time (in seconds) and the variance (in percentage) for encoding and decoding. We observe that the automatically generated encoder and decoder perform reasonably well, but are significantly slower than the hand-written ones. This is because 1) the hand-written encoder and decoder in CompCertELF currently support significantly fewer instructions (about 20) than the CSLED ones, due to the complexity of manual implementation, and 2) the hand-written ones are manually optimized while the auto-generated ones are not optimized at all. We plan to address these issues by optimizing our generation algorithms in the future.

# **8 Related Work and Conclusion**

We compare our framework with existing work on specification languages of instruction sets, verified parsing and pretty printing, and formalized ISAs.

There is a large body of work on developing languages for specifying ISAs; its major deficiency is the lack of formal guarantees. For example, the nML specification language employs attribute grammars to describe instruction sets [7]. As another example, EEL uses machine-independent primitives to provide syntactic and semantic information about instructions [12]. The most relevant work in this category is the SLED language, on which our CSLED is based [15]. The patterns in SLED can only describe constraints on tokens and fields. By contrast, CSLED contains class patterns for accurately characterizing the structures of components. This extension enables CSLED to capture the bijection between the abstract and concrete forms of instructions.

Instruction decoding and encoding are special cases of parsing and pretty printing, respectively. Although there was early work on verifying that parsing and pretty-printing are inverses of each other by formulating them as bijections [1,10], this requirement was perceived as too strong [16]. Most of the recent work on verified parsing and pretty printing is dedicated to verifying parser generators based on context-free grammars, regular expressions, parser combinators, or general data formats [3,11,17]. There is also specialized work on verifying encoder-decoder pairs [6,14,19,21]. These efforts mostly deal with general and ambiguous grammars or specifications for which a bijection is difficult (if not impossible) to establish. By contrast, we intentionally restrict the expressiveness of CSLED specifications to make proving consistency possible. Specifically, the syntax presented in Fig. 4 implies that CSLED specifications can only match sequences of tokens with finite lengths and shapes, making them strictly weaker than regular expressions, yet sufficiently strong to precisely capture the common instruction formats.

There is also abundant work on the development of formal ISA specifications (e.g., [2,4,8,9]). However, almost all of it focuses on the problem of rigorously defining the *semantics* of ISAs (such as their sequential behaviors, concurrency models, and interrupt behaviors). Although formalized encoders or decoders (or both) are sometimes generated (e.g., in Coq or Isabelle/HOL), there is no formal verification of the soundness or consistency of instruction encoding and decoding, which concern only the *syntax* of instructions.

In this paper, we have presented a framework for specifying instruction formats and for automatically generating and verifying encoders and decoders based on such specifications. The verified encoders and decoders are consistent with each other (being inverses of each other) and sound (conforming to high-level specifications). Consistency is provable in our framework because our specifications capture the bijections inherently embedded in instruction formats. In the future, we would like to apply this framework to a major part of the X86-32 and X86-64 instruction sets and also to other ISAs, thereby demonstrating the versatility and scalability of our framework.

**Acknowledgments.** We thank Zhong Shao for his comments and suggestions on this project when it was at an early stage. We are also indebted to anonymous reviewers for their detailed comments. This work was supported by the National Natural Science Foundation of China (NSFC) under Grant No. 62002217.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# An SMT Encoding of LLVM's Memory Model for Bounded Translation Validation

Juneyoung Lee1(B), Dongjoo Kim<sup>1</sup>, Chung-Kil Hur<sup>1</sup>, and Nuno P. Lopes<sup>2</sup>

<sup>1</sup> Seoul National University, Seoul, South Korea juneyoung.lee@sf.snu.ac.kr <sup>2</sup> Microsoft Research, Cambridge, UK

Abstract. Several automatic verification tools have been recently developed to verify subsets of LLVM's optimizations. However, none of these tools has robust support to verify memory optimizations.

In this paper, we present the first SMT encoding of LLVM's memory model that 1) is sufficiently precise to validate all of LLVM's intraprocedural memory optimizations, and 2) enables bounded translation validation of programs with up to hundreds of thousands of lines of code. We implemented our new encoding in Alive2, a bounded translation validation tool, and used it to uncover 21 new bugs in LLVM memory optimizations, 10 of which have been already fixed. We also found several inconsistencies in LLVM IR's official specification document (LangRef) and fixed LLVM's code and the document so they are in agreement.

# 1 Introduction

Ensuring that LLVM is correct is crucial for the safety and reliability of the software ecosystem. There has been significant work towards this goal including, e.g., formally specifying the semantics of the LLVM IR (intermediate representation). This entails describing precisely what each instruction does and how it handles special cases such as integer overflows, division by zero, or dereferencing out-of-bounds pointers [8,24,26,29,47]. There has also been work on automatic verification of classes of optimizations, such as peephole optimizations [25,31], semi-automated proofs [48], translation validation [20,35,42,44], and fuzzing [23,46]. All this work uncovered several hundred bugs in LLVM.

While there has been great success in improving correctness of scalar optimizations, current verification tools only support basic memory optimizations, if any. Since memory operations can take a significant fraction of a program's run time, memory optimizations are very important for performance. The implementation of these optimizations and related pointer analyses tends to be complex, which further justifies the investment in verifying them.

Verifying programs with memory operations is very challenging and it is hard to scale automatic verification tools that handle these. The main issue lies with pointer aliasing: which objects does a given memory operation access? Without any prior information, a verifier must consider that each operation *may* load or store from any live object (global variables and stack/heap allocations). This creates a big case split for the underlying constraint solver to (attempt to) solve.

Since automatic verification of the source code of memory optimizations is out of reach at the moment, we focus on bounded translation validation [30, 40] (BTV) instead. (Bounded) translation validation consists in verifying that an optimization was correct for a particular input program (up to a bounded unrolling of loops) rather than verifying its correctness for all input programs.

In this paper, we present the first SMT encoding of LLVM's memory model [24] that is precise enough to validate all of LLVM's intraprocedural memory optimizations. The design of the encoding was guided by practical insights of the common aliasing cases in BTV to achieve better performance. For example, we observed that in most cases we can cheaply infer whether a pointer aliases with a locally-allocated or a global object (but not both). Therefore, our encoding case-splits itself on this property rather than leaving that to the SMT solver, as we can cheaply resolve the case split for over 95% of the cases.

The second contribution of this paper is a new semantics for heap allocation for the verification of optimizations for real-world C/C++ programs. Although LLVM's memory model has a reasonable semantics for heap allocations [24], we realized it was not suitable for verifying optimizations. In some programming styles, the result of functions such as malloc is not checked against NULL and the resulting pointer is dereferenced right away. Since malloc can return NULL in some executions, we could end up proving that some undesirable optimizations were correct since the program triggers undefined behavior in at least one execution. We propose a new semantics for heap allocations in this paper that is better suited for the verification of optimizations.

The third contribution is the identification of approximations to the SMT encoding such that it is still sufficiently precise to verify (and find bugs) in LLVM's memory optimizations. This is possible since for translation validation we only need to be as precise as LLVM's static analyses (e.g., in the encoding of aliasing rules), and therefore we do not need to consider extremely precise analyses nor arbitrary transformations. Compilers have limited reasoning power by construction in order to keep compilation time reasonable.

We implemented our new SMT encoding of LLVM's memory model in Alive2 [30], a bounded translation validation tool for LLVM. We used Alive2 to find and report 21 previously unknown bugs in LLVM memory optimizations, 10 of which have already been fixed.

To summarize, the contributions of this paper are as follows.


# 2 Overview

Consider the functions below in C: <sup>1</sup> a source (original) function on the left and a target (optimized) function on the right. According to the semantics of high-level languages, and also of LLVM IR, neither a pointer received as an argument nor a callee can guess the address of a memory region allocated within a function. That is, pointer q is not aliased with p or r, nor touched by g(p+1). Although the caller of f may guess the address of q in practice, that behavior is excluded by the language semantics because p's object (*provenance*) cannot be a fresh one like q. If p happens to alias q, accessing such a pointer triggers undefined behavior (UB).

```
// source
int f(int *p) {
  int *q = malloc(4);
  *q = 42;
  int *r = g(p+1);
  *r = 37;
  return *q;
}

// target
int f(int *p) {
  // q removed
  int *r = g(p+1);
  *r = 37;
  return 42;
}
```
The provenance rules allow LLVM to forward the stored value in line 3 to line 6, and therefore line 6 simply returns 42. As the value stored to \*q is not used anymore and pointer q does not escape, LLVM also removes the heap allocation.

Next we show how to verify this example. Note that we do not require the two programs to be aligned; the example is aligned to make it easier to understand.

# 2.1 Verifying the Example Transformation

We start by defining two auxiliary functions that encode the effect of memory operations on the program state. Let state S = (m, ub) be a pair, where m is a memory and ub a boolean that tracks whether the program has already executed UB or not. Let p be the accessed pointer, and v the stored value. The definition of functions **load** and **store** is as follows:

$$\begin{aligned} \textbf{load}\ p\ S &::= \big(\texttt{load}(p, S.\texttt{m}),\ (S.\texttt{m},\ S.\texttt{ub} \lor \lnot\,\texttt{deref}(p, \texttt{sizeof}(*p), S.\texttt{m}))\big)\\ \textbf{store}\ p\ v\ S &::= \big(\texttt{store}(p, v, S.\texttt{m}),\ S.\texttt{ub} \lor \lnot\,\texttt{deref}(p, \texttt{sizeof}(*p), S.\texttt{m})\big) \end{aligned}$$

**load** returns a pair with the loaded value and the updated state, where ub is further constrained to ensure that pointer p is dereferenceable for at least the size of the loaded type. Similarly, **store** returns the updated state. The gray boxes ( ··· ) encode SMT expressions; we describe these in the next section.
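The two state transformers can be sketched executably as follows. This is our own simplification in Python, with dictionaries standing in for SMT arrays and tuples for states; it is not the paper's actual SMT encoding.

```python
# Sketch of the load/store state transformers: a state is (m, ub), and each
# access ORs the UB flag with the negated dereferenceability check.

def deref(p, sz, m):
    """Pointer p = (bid, off) is dereferenceable for sz bytes in memory m."""
    bid, off = p
    return bid in m and 0 <= off and off + sz <= m[bid]["size"]

def load(p, sz, state):
    m, ub = state
    ub = ub or not deref(p, sz, m)
    bid, off = p
    val = m[bid]["data"].get(off) if not ub else None
    return val, (m, ub)

def store(p, v, sz, state):
    m, ub = state
    ub = ub or not deref(p, sz, m)
    if not ub:                           # update the block's contents
        m = {**m, p[0]: {**m[p[0]], "data": {**m[p[0]]["data"], p[1]: v}}}
    return (m, ub)

m0 = {4: {"size": 4, "data": {}}}        # one block, e.g. q's allocation
s1 = store((4, 0), 42, 4, (m0, False))   # *q = 42
v, s2 = load((4, 0), 4, s1)
assert v == 42 and s2[1] is False
_, (_, ub) = load((4, 8), 4, s1)         # out-of-bounds access sets the UB flag
assert ub is True
```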

<sup>1</sup> We use the syntax of C for many of the examples in this paper to make them easier to read, even though we consider the semantics of LLVM IR.


Table 1. States and axioms after executing each of the lines of f.

*1. Encoding the output states.* Table 1 shows the state after executing each of the programs' lines. p, m₀, and ub₀ are SMT variables for the input pointer, the memory of f's caller, and the UB flag, respectively. The target's corresponding variables are primed. Meta variables are upper-cased and SMT variables are lower-cased.

On line 2, q is assigned a pointer to a new object (encoded in axiom A1). On line 3, '\*q = 42' updates the state using **store**.

On line 4, the return value, output memory, and UB of g(p+1) are represented with fresh variables r, m<sub>g</sub>, and ub<sub>g</sub>, respectively. Axiom A₂ encodes the provenance rules: the return value cannot alias with locally non-escaped pointers (q) and only the remaining objects are modified. The target's line 4 does not need these axioms because there are no locally-allocated objects in the target function.

Finally, the outputs $O$ and $O'$ are each a pair of a return value and a state.

*2. Relating the source and target's states.* To prove correctness of a transformation, we must first establish refinement between the input states of the source and target functions. Refinement ($\sqsupseteq$) is used rather than equality because the source's caller is allowed to give less defined inputs than the target's.

$$A_{\text{in}} := p \sqsupseteq p' \land m_0 \sqsupseteq m_0' \land (ub_0' \implies ub_0)$$

The inputs and outputs of function calls are also related using refinement. For any pair of calls in the source and target functions, if the target's inputs refine those of the source, the target's output also refines the source's output. The example only has one function call pair:

$$A_{\text{call}} := \big(S_2.\texttt{m} \sqsupseteq m_0' \land p+1 \sqsupseteq p'+1 \implies m_g \sqsupseteq m_g' \land r \sqsupseteq r' \land (ub_g' \implies ub_g)\big)$$

We can now state the correctness theorem for the example transformation. For any input, if the axioms hold, the output of the target must refine that of the source for some internal nondeterminism in the source (e.g., the address of pointer q). Output is refined iff (i) the source triggers UB, or (ii) the target triggers no UB, and the target's return value and memory refine those of the source.

$$\forall p, p', m_0, m_0', ub_0, ub_0', m_g, m_g', ub_g, ub_g'.\ \exists q.\ (A_1 \land A_2 \land A_{\text{in}} \land A_{\text{call}}) \implies O \sqsupseteq O'$$

# 2.2 Efficiently Encoding LLVM's Memory Model and Refinement

We now present our key ideas for efficiently encoding LLVM's memory model and refinement (the gray boxes) in SMT, which is one of our main contributions.

*1. Pointers.* We represent a pointer as a pair (bid, o) of a block id (i.e., its provenance) and an offset within, so that we can easily detect out-of-bounds accesses: accessing (bid, o) in memory m triggers UB unless $0 \le o < m[\textit{bid}].\textit{size}$, from which **deref**((bid, o), sz, m) naturally follows.

*2. Bounding the number of blocks.* Our first observation is that we can safely bound the number of memory blocks for *bounded* translation validation since loops are unrolled for a fixed number of iterations. As a result, we can use a (fixed-length) bit-vector to encode block ids.

For the example source function, four blocks are sufficient: three for pointers p, q, r as they may all point to different blocks, and an extra to represent all the other blocks that are not syntactically present but are accessible by function g.

For the sake of simplifying the example, we ignore that p, q, r may be **null**. Our model does not make such an assumption; we explain later how null is handled.

*3. Aliasing rules.* Several of the aliasing rules are encoded for free as we can distinguish most blocks by construction. First, we use the most significant bit of the block ids to distinguish local (1) from non-local (0) blocks. Second, we assign constant ids whenever possible (e.g., global variables and stack allocations).

For the example source function, (without loss of generality) we set the block ids of q, p, and the extra block to 100₂, 000₂, and 011₂ (in binary), respectively. However, we cannot fix the block id of r and instead constrain it to be either 000₂ or 001₂, since r may alias with p but not with q. This establishes the alias constraints in A₁ and A₂ for free.

*4. Memory accesses.* In order to leverage the fact that each pointer may range over a small number of blocks as seen above, we use one SMT array per block (from an offset to a byte) instead of using a single global array (from a pointer to a byte). For the latter, it becomes harder to exploit non-aliasing guarantees since all stores to different blocks are grouped together.

For the example source function, m₀ consists of four arrays m₀⁽¹⁰⁰⁾, m₀⁽⁰⁰⁰⁾, m₀⁽⁰⁰¹⁾, m₀⁽⁰¹¹⁾ for the four blocks. Then, since q's block id is 100₂, **store** q 42 S₁ at line 3 only updates the array m₀⁽¹⁰⁰⁾, leaving the others unchanged. Similarly, **store** r 37 S₃ at line 5 only updates m₀⁽⁰⁰⁰⁾ and m₀⁽⁰⁰¹⁾, using the SMT if-then-else expression on r's block id. Finally, **load** q S₄ at line 6 reads from the updated array at 100₂, thereby easily realizing that the read value is 42.
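The payoff of the per-block arrays can be sketched in plain Python. This is our own toy model: Python lists stand in for SMT arrays, and the block ids are the binary strings from the example.

```python
# Multi-memory encoding sketch: one array per block id, so a store through r,
# which may only alias blocks 000 or 001, provably leaves q's block 100 alone.

mem = {"100": [0] * 4, "000": [0] * 4, "001": [0] * 4, "011": [0] * 4}

def store_bid(mem, bid, off, val):
    """Functionally update one byte in the array of block `bid`."""
    new = dict(mem)
    new[bid] = mem[bid][:off] + [val] + mem[bid][off + 1:]
    return new

m1 = store_bid(mem, "100", 0, 42)        # *q = 42: touches only array 100
# *r = 37 with r's bid constrained to {000, 001}: whichever case the SMT
# if-then-else picks, array 100 is unchanged...
for r_bid in ("000", "001"):
    m2 = store_bid(m1, r_bid, 0, 37)
    assert m2["100"][0] == 42            # ...so loading q still yields 42
```

With a single global array keyed on whole pointers, the solver would instead have to reason about every store when resolving the final load.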

*5. Refinement.* The value/memory refinement is defined based on a mapping between source and target blocks, which we efficiently encode leveraging the alignment information between source and target as much as possible (Sect. 7).

# 3 LLVM's Memory Model

In this section, we give a brief introduction to LLVM's memory model [24]. In this paper we only consider logical pointers (i.e., integer-to-pointer casts are not supported) and a single address space.

*Memory Block.* A memory block is the unit of memory allocation: each stack or global variable has a distinct block, and heap allocation functions like **malloc** create a fresh block each time they are called. Each block is uniquely identified with a non-negative integer (bid), and has associated properties, including size, alignment, whether it can be written to, whether it is alive, allocation type (heap, stack, global), physical address, and value.

*Pointer.* A pointer is defined as a triple (bid, off, attrs), where off is an offset within the block bid, and attrs is a set of attributes that constrain dereferenceability and which operations are allowed.

Pointer arithmetic operations (**gep**) only change the offset, with bid and attrs being carried over. Unlike C, an offset is allowed to go out-of-bounds (OOB). Such a pointer, however, cannot be dereferenced as in C (doing so triggers undefined behavior, UB), but it can still be used, for example, in pointer comparisons.

LLVM supports several pointer attributes. For example, a **readonly** pointer p cannot be used to store data. However, it is possible to use a non-**readonly** pointer q to store data to the same location as p (provided the block is writable). A **nocapture** pointer cannot escape from a function. For example, when a function returns, no global variable may have a **nocapture** pointer stored (otherwise it is UB).

LLVM has three constant pointers. The **null** pointer is defined as (0, 0, ∅). Block 0 is defined as zero-sized and not alive. The **undef**<sup>2</sup> pointer is defined as (β, δ, ∅), with β, δ being fresh variables for each observation of the pointer. There is also a **poison**<sup>3</sup> pointer.

*Instructions.* We consider the following LLVM memory-related instructions:


<sup>2</sup> In LLVM, **undef** values are arbitrary values of a given type with the additional property that they can yield a different value each time they are observed. **undef** values can be replaced with any value of the same type, except **poison** values.

<sup>3</sup> A **poison** value taints whole expression trees (e.g., **poison** +1 = **poison**), and branching on it is UB. Similarly, dereferencing a **poison** pointer is UB.


Unsupported memory instructions are: integer-to-pointer casts, and atomic and volatile memory accesses.

# 4 Encoding Memory Blocks and Pointers in SMT

We describe our new encoding of LLVM's memory model in SMT over the next few sections. We use the theories of UFs (uninterpreted functions), BVs (bitvectors), and arrays with lambdas [7], with first order quantification. Moreover, we consider that the scope of verification is a single function without loops (or where loops have been previously unrolled).

#### 4.1 Memory Blocks

Each memory block is assigned a distinct identifier (a bit-vector number). We further split memory blocks into local and non-local. Local blocks are all those that are allocated within the function under consideration, either on the stack or the heap. Non-local blocks are the remaining ones, including global variables, heap/stack allocations in callers and heap allocations in callees (stack allocations in callees are not observable, since they are deallocated when the called function returns, hence there is no need to consider them).

We use the most significant bit (MSB) to encode whether a block is local (1) or non-local (0). This representation allows the null block to have bid = 0 and be non-local. We use the short block id, bid⁻, to refer to bid without the MSB; it is used in cases where it has already been established whether the block is local or not. Example with 4-bit block ids:

```
int g; // bid(g) = 0001
void f(int *p) { // bid(p) = 0xyz (with xyz = arbitrary)
 int a[2]; // bid(a) = 1000
 int *q = malloc(4); // bid(q) = 1001
}
```
The separation of local and non-local block ids is an efficient way to encode the constraint that pointers of these groups cannot alias with each other. In the example above, argument p cannot alias with either a or q.

As we only consider functions without loops, block ids can be statically assigned for each allocation site.
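The MSB convention can be sketched with a few toy helpers. This is our own illustration, assuming the 4-bit block ids of the example above.

```python
# Sketch of the block-id scheme with 4-bit ids: the MSB marks local blocks,
# so a local and a non-local pointer can never share a block id.

BID_BITS = 4

def is_local(bid):
    return (bid >> (BID_BITS - 1)) & 1 == 1

def short_bid(bid):
    """bid-minus: the block id without its MSB."""
    return bid & ((1 << (BID_BITS - 1)) - 1)

g = 0b0001      # global variable: non-local
a = 0b1000      # stack array:     local
q = 0b1001      # malloc result:   local

assert not is_local(g) and is_local(a) and is_local(q)
# p's id 0xyz is non-local for every xyz, so p cannot alias a or q:
for xyz in range(8):
    p = 0b0000 | xyz
    assert not is_local(p)
assert short_bid(q) == 0b001
```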

#### 4.2 Pointers

A pointer ptr = (bid, off, attrs) is encoded as a single bit-vector consisting of the concatenation of the three elements. The offset is interpreted as a *signed* number (which is why blocks cannot be larger than half of the address space). Each attribute (such as **readonly**) is encoded with a bit. Example with 2-bit block ids and offsets, and a single attribute (we use '.' to visually separate the elements):

```
void f(char readonly *p, char *q) { // p = 0x.ab.1, q = 0y.cd.0
  char *r = p + 2; // r = 0x.(ab+2).1
  char *s = q + 3; // s = 0y.(cd+3).0
  char *t = malloc(4); // t = 10.00.0
}
```

Let off⁻ be a truncated offset where the least significant bits corresponding to the greatest common divisor of the alignments and sizes of all memory operations are removed. For example, if all operations are 4-byte aligned and they access either 4- or 8-byte values, then off⁻ has 2 bits fewer than off (as these bits are guaranteed to always be zero when accessing the memory).
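A minimal sketch of this packing, using the 2-bit field widths from the example above. The helper names are ours, and the signedness of the offset is ignored for brevity.

```python
# Packing a pointer (bid, off, attrs) into one integer, mirroring the
# 2-bit block-id / 2-bit offset / 1-attribute layout of the example.
import math

BID_W, OFF_W, ATTR_W = 2, 2, 1

def pack(bid, off, ro):
    return (bid << (OFF_W + ATTR_W)) | ((off & ((1 << OFF_W) - 1)) << ATTR_W) | ro

def unpack(ptr):
    return (ptr >> (OFF_W + ATTR_W),
            (ptr >> ATTR_W) & ((1 << OFF_W) - 1),
            ptr & 1)

p = pack(0b01, 0b10, 1)                  # readonly pointer 01.10.1
assert unpack(p) == (0b01, 0b10, 1)
# gep-style arithmetic changes only the offset field:
bid, off, ro = unpack(p)
assert unpack(pack(bid, off + 1, ro)) == (0b01, 0b11, 1)

# off-minus: drop the bits guaranteed zero by the gcd of alignments and sizes.
accesses = [(4, 4), (4, 8)]              # (alignment, size) of all accesses
g = math.gcd(*[math.gcd(a, s) for a, s in accesses])
dropped = g.bit_length() - 1             # gcd = 4 -> drop 2 low bits
assert dropped == 2
```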

#### 4.3 Block Properties

Each block has seven associated properties: size, alignment, read-only, liveness, allocation type (heap, stack, global), physical address, and value. Block properties are looked up and updated by memory operations. For example, when doing a store, we need to check if the access is within the bounds of the block.

Except for liveness and value, properties are fixed at allocation time. Liveness is encoded with a bit-vector (one bit per block), and value with arrays (indexed on off⁻). We use a multi-memory encoding, where we have one array per bid.

The encoding of fixed properties differs for local and non-local blocks. For non-local blocks, we use a UF symbol per property, taking bid⁻ as argument. For local blocks, we cannot use UFs because for the refinement check some of these would have to be quantified (cf. Sect. 7) and most, if not all, SMT solvers do not support quantification of UF symbols. Therefore, we encode each of the remaining properties of local blocks as an if-then-else (ITE) expression, which is tailored for each use (e.g., each time an operation needs to look up a local block's size, we build an ITE expression for the given bid⁻).

Using ITE expressions to encode properties is less concise than using UFs. However, it is not a disaster for two reasons. Firstly, we only need to consider the local blocks that have been allocated beforehand, since the program cannot access blocks allocated afterward. Secondly, pointers are usually not fully arbitrary. Oftentimes we know statically which type of block they refer to, and even what the block id is, given that pointer arithmetic operations do not change the block id. Therefore, the ITE expressions are usually small in practice. Example with 4-bit block ids and offsets of a source program:

```
int g; // g = 0001.0000, size_src(001) = 4
void f() {
  char p[2]; // p = 1000.0000
  char q[3]; // q = 1001.0000
  char *r = ... p or q or g ...
  r[2] = 0;
  char t[1]; // t = 1010.0000
}
```
The store in this program is only well defined if the size of the block pointed to by r is greater than 2. This is encoded in SMT as follows:

$$\text{ite}(\textbf{islocal}(r),\ \text{ite}(\textbf{bid}(r) = 0,\ 2,\ 3),\ \text{size}_{src}(\textbf{bid}(r))) > 2$$

Function **islocal**(p) is encoded with the SMT extract expression to fetch the MSB of the pointer. Similarly, **bid** (p) extracts the relevant bits from a pointer. The expression for local blocks only needs to consider local blocks 0 and 1, since block 2 (t) is only allocated afterward. This allows a simple single pass through the code to generate optimized ITE expressions.

Value. Value is defined as an array from short offset to byte (described later in Sect. 6.1). For non-local blocks, only those that are constant are initialized with the respective value. The remaining blocks are allowed to take almost any value. The exception is for pointers: non-local blocks cannot initially have local pointers stored, since the calling environment cannot fabricate local pointers.

Local blocks are initialized with **poison** values using a constant array (i.e., an array that yields the same value for all indexes).

#### 4.4 Physical Addresses

If a program observes addresses (through, e.g., pointer-to-integer casting), we need additional constraints to ensure that addresses of blocks that overlap in time are disjoint. Since we are doing translation validation, we have two programs with potentially different sets of locally allocated blocks. Therefore, we need to ensure that non-local blocks' addresses are disjoint from those of local blocks of both programs. This makes the disjointness constraints quite complex.

As an optimization, we split the address space in two: local blocks have MSB = 1 and non-locals have MSB = 0. Since the encoding of address disjointness is quadratic in the worst case (cross-product of blocks), halving the number of blocks is significant. This optimization, however, is an under-approximation of the program's behavior (Sect. 9). After investigating LLVM's optimizations, we believe it is highly unlikely this approximation will cause false negatives.

If a program does not observe any pointer's physical address, neither the block's physical address property nor the disjointness axioms are instantiated. However, when dereferencing a pointer, we need to check if the physical address is sufficiently aligned. When physical addresses are not created, we resort to checking alignment of both of the pointer's block and offset. Since in this case physical addresses are not observed (and therefore not constrained by the program using, e.g., pointer comparisons), a block's physical address can take any value, and therefore blocks and offsets must be both sufficiently aligned to ensure that physical pointers are aligned in all program executions. This argument justifies why we can soundly discard physical addresses.


Table 2. Comparison of two semantics for pointer comparison.

#### 4.5 Pointer Comparison

Given two pointers p and q, if a program learns that q is placed right after p in memory, the program can potentially change the contents of q without the compiler realizing it. Detecting the existence of such code is impossible in general, hence restricting the ways a program can learn the layout of objects in memory is important to make pointer analyses fast yet precise.

A way the memory layout can leak is through pointer comparison. For example, what should p < q return if these point to different memory blocks? If it is a well-defined operation (i.e., simply compares their integer values), it leaks memory layout information. An alternative is to return a non-deterministic value to prevent layout leaks, the formal semantics of which is defined at [24].

We found that there are pros and cons of both semantics for the comparison of pointers of different blocks, and that neither of them covers all optimizations that LLVM performs. Table 2 summarizes the effects on each of the optimizations.

We decided to implement the integer comparison semantics, as LLVM performs all the optimizations above and its alias analyses (AA) mostly give up when they encounter an integer-to-pointer cast. In summary, we have to remove the first optimization from LLVM to make it sound. Additionally, this choice makes it harder to improve LLVM's AA algorithms w.r.t. pointers cast from integers.

#### 4.6 Bounding the Maximum Number of Blocks

Since we assume that programs do not have loops, we can statically bound the maximum number of both local and non-local blocks a program may observe.

The maximum number of local blocks in the source and target programs, respectively $N^{src}_{local}$ and $N^{tgt}_{local}$, is computed by counting the number of heap and stack allocation instructions. Note that this is an upper bound, because not all allocation sites may be reachable in practice.

For non-local blocks, we cannot see their definitions as with local blocks, except for global variables. Nevertheless, we can still bound the maximum number of observed blocks. It is sufficient to count the number of instructions that may return non-local pointers, such as function calls and pointer loads. In addition, we consider a null block when needed (if the null pointer may be observed).

To encode the behavior of source and target programs, we need $N^{src}_{nonlocal} + N^{tgt}_{nonlocal}$ non-local blocks in the worst case, as all referenced pointers may be distinct. However, correct transformations will not have the target program observe more blocks than the source. If the target observes a pointer to a non-local block that was not observed in the source, we can set that pointer to **poison** because its value is not restricted by the source. Therefore, $N^{src}_{nonlocal}$ non-local blocks are sufficient to allow the target to exhibit an incorrect behavior. The bit-width of bid is:

$$w_{bid}' = \lceil \log_2(\max(N^{src}_{nonlocal}, \max(N^{src}_{local}, N^{tgt}_{local}))) \rceil$$

When only local or only non-local pointers are used, $w_{bid} = w_{bid}'$, as we know statically whether the pointer is local or not. Otherwise, $w_{bid} = w_{bid}' + 1$.
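
The width computation is mechanical; the helper below is a hypothetical sketch of it (names are ours, not Alive2's):

```python
import math

def bid_width(n_src_nonlocal: int, n_src_local: int, n_tgt_local: int,
              mixed: bool) -> int:
    # Non-local blocks are bounded by the source's count; local blocks by the
    # larger of the two programs' counts.
    w = math.ceil(math.log2(max(n_src_nonlocal, n_src_local, n_tgt_local)))
    # If a pointer may be either local or non-local, one extra bit selects
    # between the two halves of the block-id space.
    return w + 1 if mixed else w

assert bid_width(4, 2, 3, mixed=False) == 2
assert bid_width(4, 2, 3, mixed=True) == 3
```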

# 5 Memory Allocation

In LLVM, memory blocks can be allocated on the stack (**alloca**), in the heap (e.g., **malloc**, **calloc**, etc.), or as global variables. It is surprisingly non-trivial to find a semantics for memory allocations that allows all of LLVM's optimizations, and rejects undesired transformations. For example, we have to support allocation removal and splitting, introduce new stack allocations and new constant global variables, etc. We explore multiple semantics and show their merits and shortcomings in the context of proving correctness of program transformations.

# 5.1 Heap Allocation

Heap allocation is done through functions such as **malloc**, **calloc**, C++'s new operator, etc. We describe the semantics of **malloc**; the remaining functions can be described in terms of it.

First of all, it is important to note that there are two common idioms used in practice by C programmers when doing memory allocation:

```
int *p = malloc(4);        int *p = malloc(4);
*p = 0;                    if (p) { *p = 0; }
```

In some programs, like the example on the left, **malloc** is assumed to never return **null** (the *non-null assumption*). This is mainly because the program does not consume too much memory and the computer is expected to have enough memory/swap space. In other programs, like the one on the right, **malloc** is expected to sometimes return **null** (the *may-null assumption*), so the program performs null-ness checks.

Since both programming styles are prevalent, we would like optimizations to be correct for both. This is non-trivial, as the two assumptions are conflicting: with the non-null assumption, it is sound to eliminate **null** checks, but not with the may-null assumption. We now explore several possible semantics to find one that works for both programming styles.

*A. Malloc always succeeds.* Based on the non-null assumption, in this semantics we only consider executions where there is enough space for all allocations to succeed. Regardless of whether the target uses more or less memory than the source, all calls to **malloc** yield non-null pointers. Therefore, for example, deleting unused **malloc** calls is allowed.

However, removing **null** checks of **malloc** is also allowed in this semantics. For example, optimizing the right example above into the left one is sound. This transformation, however, is obviously undesirable.

*B. Malloc only succeeds if there is enough free space.* To solve the problem just described, based on the may-null assumption, we can simulate the behavior of dynamic memory allocation and define **malloc** to return a pointer to a newly created block if there is an empty space in memory, and **null** otherwise. This semantics prevents the removal of **null** checks of **malloc** as it may return **null**.

However, this semantics does not explain removal of unused allocations. It aligns both source and target programs' allocations such that any change in the allocation sequence disrupts the program alignment and thus makes verification fail. For example, the following transformation removing unused **malloc** instructions and replacing comparisons of their output with **null** is not supported:

```
int *x = malloc(4);  // (unused)         if (true) {
if (x != nullptr) {                ⇒       ...
  ...                                    }
}
```

In case there were 0 bytes left in memory, x would be **null**, but since LLVM assumes that the program cannot observe the state of the allocator it folds the comparison x != nullptr to true after eliminating the allocation. This optimization would be flagged as incorrect in this semantics.

LLVM assumes very little about the run-time behavior of memory allocators. This is to support, for instance, garbage collectors, where an allocation may fail but if repeated it may succeed because memory was reclaimed in between. This explains why LLVM folds comparisons with **null** of unused memory blocks, and also contradicts the linear view of allocations of this semantics.

*C. Malloc non-deterministically returns null.* This semantics abstracts the behavior of the memory allocator by (1) allowing **malloc** to nondeterministically return **null** even if there is available space, and (2) only considering executions where there is enough space for all allocations to succeed. This semantics prevents the removal of null checks of **malloc**, which fixes the shortcomings of semantics A, and also allows the removal of unused allocations, which fixes those of semantics B. However, this semantics is too weak and therefore allows other undesirable transformations, like the following:

```
p = malloc(4);        ⇒        exit();
*p = 0;
```

For the sake of proving refinement (Sect. 7), we need just one trace triggering UB (i.e., one particular realization of the non-deterministic choices) for a given input to be able to transform the source program into anything for that input. Informally speaking, refinement always picks the worst-case execution for each input. Since the source program executes UB when p is **null**, it is correct to transform the source into any program, although that is obviously undesirable.

Fig. 1. Bit-wise representation of a byte. A pointer byte is poison if 'p?' is zero. A non-pointer byte tracks poison bit-wise.

This semantics is too weak in practice since many programs are written without **null** checks, either assuming the program will not run out of memory, or assuming the program will terminate if it runs out of memory. It is not reasonable in practice to allow compilers to break all such programs.

*Our Solution.* As we have seen, there is no single semantics that both allows all desired transformations and rejects undesired ones. While semantics B prevents desired optimizations like allocation removal, semantics A and C allow undesired optimizations, but in a complementary way. For example, removing null checks of **malloc** is allowed in A but not in C. On the other hand, transforming an access of a **malloc**-allocated block without a **null** check beforehand into arbitrary code is allowed in C but not in A.

Therefore, we obtain a good semantics by requiring both A and C: an optimization is correct if it passes the refinement criteria with each of the two semantics. Intuitively, this definition requires the compiler to support the two considered coding styles: semantics A supports the non-null assumption, while semantics C the may-null assumption.
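
The effect of requiring both semantics can be illustrated with a toy trace model (our own simplification, not Alive2's actual encoding): a source trace triggering UB permits any target behavior, so semantics C alone would accept replacing an unchecked **malloc** with arbitrary code, while semantics A rejects it.

```python
def refines(src_traces, tgt_traces):
    # Target refines source iff every target trace is a possible source
    # trace; UB anywhere in the source permits arbitrary target behavior.
    return "UB" in src_traces or set(tgt_traces) <= set(src_traces)

# Source: p = malloc(4); *p = 0;  (no null check)
src_A = {"store"}          # semantics A: malloc never returns null
src_C = {"store", "UB"}    # semantics C: malloc may return null -> UB
tgt = {"exit"}             # undesirable rewrite into arbitrary code

assert not refines(src_A, tgt)   # rejected under semantics A
assert refines(src_C, tgt)       # accepted under semantics C alone
assert not (refines(src_A, tgt) and refines(src_C, tgt))  # rejected overall
```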

#### 5.2 Stack Allocation

The semantics of **alloca**, the stack-allocation instruction, is slightly different from that of **malloc**. LLVM assumes that stack allocations always succeed, since the program will likely crash if there is a stack overflow. That is, **alloca** never returns a **null** pointer.

LLVM performs more optimizations on stack allocations than on heap ones. For example, LLVM can split an allocation into multiple smaller ones or increase the alignment. These transformations can increase memory consumption.

# 6 Encoding Loads and Stores in SMT

We encode the values of memory blocks with several arrays (one per bid), each mapping an offset to a byte. We next give the definition of a byte and the encoding of memory-accessing instructions in SMT.

#### 6.1 Byte

There are two types of bytes: *pointer* bytes and *non-pointer* bytes, cf. Fig. 1.

A pointer byte has the most significant bit (MSB) set to one. The following bit states whether the byte is poison or not. Next is the pointer representation as described in Sect. 4.2 (bid, off, attrs).

Pointers are often longer than one byte, so when storing a pointer to memory we write multiple consecutive bytes. Each of these bytes records the same pointer, but with a different byte offset (the last bits of the byte) to distinguish between the partial bytes of the pointer.

For non-pointer bytes, we track whether each of the bits is poison or not. This is not required for pointers, since LLVM does not allow pointer values to be manipulated bit-wise. Non-pointer values can be manipulated bit-wise (e.g., using vectors with element types smaller than 8 bits). Each bit of the integral value is only significant if the corresponding poison bit is zero.

#### 6.2 Load and Store Instructions

Load and store instructions are trivially encoded using SMT arrays. These arrays store bytes as described in the previous section. We next describe how LLVM values are encoded to and decoded from our byte representation.

We define two functions, *ty*⇓(v) and *ty*⇑(b), which convert a value v into a byte array and a byte array b back to a value, respectively. We show below *ty*⇓(v) when v ≠ **poison**. **i***sz* stands for the integer type with bit-width sz. If *sz* is not a multiple of 8 bits, v is zero-extended first. When v is **poison**, all poison bits are set to one. BitVec(n, b) stands for the number n with bit-width b. A pointer's byte offset is 3 bits because we assume 64-bit pointers.

**i***sz*⇓(v) or **float**⇓(v) = λi. 0 ++ 0⁸ ++ bitrepr(v)[8×i … 8×(i+1)−1] ++ padding

*ty*∗⇓(v) = λi. 1² ++ bitrepr(v) ++ BitVec(i, 3)

**<sup>i</sup>***sz*⇑(b) and **float**⇑(b) return **poison** if any bit is **poison**, or if any of the bytes is a pointer. Otherwise, these functions return the concatenation of the integral values of the bytes.
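
A minimal model of these conversions for integers follows (our own sketch; Alive2 operates on SMT bit-vectors, not concrete Python values, and the pair layout is ours):

```python
POISON = None  # stand-in for LLVM's poison value

def int_lower(v, sz):
    """i_sz⇓(v) as (poison_mask, value) byte pairs, least significant first."""
    nbytes = (sz + 7) // 8
    if v is POISON:
        return [(0xFF, 0)] * nbytes      # all poison bits set
    v &= (1 << sz) - 1                   # zero-extend to a byte multiple
    return [(0x00, (v >> (8 * i)) & 0xFF) for i in range(nbytes)]

def int_raise(bs, sz):
    """i_sz⇑(b): poison if any tracked bit is poison."""
    if any(pmask for pmask, _ in bs):
        return POISON
    val = sum(b << (8 * i) for i, (_, b) in enumerate(bs))
    return val & ((1 << sz) - 1)

assert int_raise(int_lower(0x1234, 16), 16) == 0x1234
assert int_raise(int_lower(POISON, 16), 16) is POISON
```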

*ty*∗⇑(b) returns **poison** if any of the bytes is **poison** or not a pointer, there is more than one distinct pointer value in b, or one of the bytes has an incorrect byte offset (they have to be consecutive, from zero to byte size minus one). An exception is reading a non-pointer zero byte, which is interpreted as a null pointer byte. This allows initialization of, e.g., arrays with null pointers with **memset** (which is an idiom commonly used in LLVM IR).
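
The pointer case can be sketched the same way, with each byte carrying the whole (bid, offset) pair plus its byte index (again a simplified model of ours; the zero-byte exception is omitted for brevity):

```python
def ptr_lower(bid, off):
    # ty*⇓(p): every byte stores the same pointer plus a 3-bit byte index,
    # assuming 64-bit pointers (8 bytes).
    return [(bid, off, i) for i in range(8)]

def ptr_raise(bs):
    # ty*⇑(b): poison (modeled as None) unless all bytes carry the same
    # pointer with consecutive byte indices 0..7.
    ptr = bs[0][:2]
    ok = all(b[:2] == ptr and b[2] == i for i, b in enumerate(bs))
    return ptr if ok else None

assert ptr_raise(ptr_lower(3, 16)) == (3, 16)
assert ptr_raise(list(reversed(ptr_lower(3, 16)))) is None  # bad byte order
```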

#### 6.3 Multi-array Memory

As already described, we use a multi-array encoding for memory, with one array per block id, each indexed on the offset. A simpler encoding would have used a single array indexed on the whole pointer. The multi-array encoding is beneficial when we can cheaply compute small aliasing sets for each memory access. In that case, we reduce the


Fig. 2. Type definitions and variable naming conventions.


Fig. 3. Refinement of value and final state.

case-splitting work on bid that the SMT solver needs to do, and it enables further formula simplifications like store forwarding.

The multi-array encoding may, however, result in a larger encoding overall if several of the accesses may alias with too many blocks. For load operations that alias multiple blocks, the resulting expression is a linear combination of the loads of each block, e.g., ite(bid = 0, **load**(m0, off), ite(bid = 1, **load**(m1, off),...)). In this case, it would be more compact to use the single-array encoding. Note that even if we do not know the specific block id, we often know whether a pointer refers to a local or non-local block (e.g., pointers received as arguments have unknown block id, but are known to be non-local), and hence splitting the memory in two is usually a good idea (cf. Sect. 10).
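
Building that linear combination is mechanical; the helper below produces the nested ite term as a string for illustration (a hypothetical helper of ours, mirroring the expression shape in the text):

```python
def load_expr(bid, off, alias_set):
    # Nested if-then-else over the candidate block ids of the alias set:
    # the last candidate is the default branch.
    expr = f"load(m{alias_set[-1]}, {off})"
    for b in reversed(alias_set[:-1]):
        expr = f"ite({bid} = {b}, load(m{b}, {off}), {expr})"
    return expr

assert load_expr("bid", "off", [0]) == "load(m0, off)"
assert load_expr("bid", "off", [0, 1]) == \
    "ite(bid = 0, load(m0, off), load(m1, off))"
```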

We perform several optimizations that are enabled by this multi-array encoding. We do partial-order reduction (POR) to shrink the potential aliasing of pointers with unknown block id. For example, consider a function with two pointer arguments (x and y) and one global variable. We assign bid = 1 to the global variable. Then, we stipulate that x can only alias blocks with bid ≤ 2, which is sufficient to access the global variable or another unknown block. Argument y is also constrained to only alias blocks with bid ≤ 3, allowing it to alias with the global variable, the same block as x, or a different block. The same is

Fig. 4. Refinement of memory and pointers.

done for function calls that return pointers. This POR technique greatly reduces the potential aliasing of unknown pointers without losing precision.
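
Under this scheme, the i-th unknown pointer argument may alias the globals plus at most i + 1 fresh unknown blocks. A sketch of the resulting bounds (the numbering and helper name are ours):

```python
def por_alias_bounds(num_globals, num_ptr_args):
    # Globals get the lowest block ids (1..num_globals); each successive
    # unknown pointer argument may additionally alias one more fresh block
    # than the previous one, which covers all distinct aliasing cases.
    return {f"arg{i}": num_globals + i + 1 for i in range(num_ptr_args)}

# One global variable (bid = 1): x may alias bid <= 2, y may alias bid <= 3.
assert por_alias_bounds(1, 2) == {"arg0": 2, "arg1": 3}
```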

### 7 Verifying Correctness of Optimizations

To verify correctness of LLVM optimizations, we establish a refinement relation between source (or original) and target (or optimized) functions. Equivalence is not used due to undefined behavior and nondeterminism. Compilers are allowed to reduce the set of possible behaviors from the source.

Given functions $f_{src}$ and $f_{tgt}$, sets of input and output variables $I_{src}$/$I_{tgt}$ and $O$ (which include, e.g., memory and the return value), and sets of non-determinism variables $N_{src}$/$N_{tgt}$, $f_{src}$ is refined by $f_{tgt}$ iff:

$$\begin{array}{l}
\forall I_{src}, I_{tgt}, O_{tgt}.\; \big(\mathrm{valid}(I_{src}, I_{tgt}) \,\land\, I_{src} \sqsupseteq I_{tgt} \,\land\, (\exists N_{src}.\; \mathrm{pre}_{src}(I_{src}, N_{src})) \\
\qquad \land\, (\exists N_{tgt}.\; \mathrm{pre}_{tgt}(I_{tgt}, N_{tgt}) \,\land\, f_{tgt}(I_{tgt}, N_{tgt}) = O_{tgt})\big) \\
\quad \Longrightarrow\; \big(\exists N_{src}.\; \mathrm{pre}_{src}(I_{src}, N_{src}) \,\land\, f_{src}(I_{src}, N_{src}) \sqsupseteq_{st} O_{tgt}\big)
\end{array}$$

Predicate $\mathrm{valid}(I_{src}, I_{tgt})$ encodes the global precondition on the input memory and arguments, such as disjointness of non-local blocks. The functions' preconditions, $\mathrm{pre}_{src}$ and $\mathrm{pre}_{tgt}$, include the constraint for disjointness of local blocks. The existential over $N_{src}$ on the left-hand side constrains the input such that the source function has at least one possible execution. $\sqsupseteq_{st}$ is the refinement relation between final states.

Figure 2 shows the definition of final program state which is a tuple of return value, return memory, and UB. A memory is a function from block id to a memory block. A memory block has seven attributes that are described in Sect. 4.3.

Figure 3 shows the definition of refinement of value and final state. For pointers, we cannot simply use equality because local pointers in source and target are internal to each of the functions. Even if they have the same block identifier, they may refer to different allocation sites in the functions (value-ptr). Similarly, the refinement of the final state should consider this difference between local pointers. To address this, we track a mapping μ between escaped local blocks of the two functions (described next).

#### 7.1 Refinement of Memory

Checking refinement of non-local memory blocks is simple as blocks are the same in the source and target functions (e.g., global variables have the same ids in the two functions). Therefore, one just needs to compare blocks of source and target functions with the same id pairwise.

Checking refinement of local blocks is harder but needed when, e.g., the function returns a locally-allocated heap block. This is legal, but block ids in the two functions may not be equal as allocations may have happened in a different order. Therefore, we cannot simply compare local blocks with the same ids.

To check refinement of local blocks, we need to align the two functions' allocations, i.e., we need to find a correspondence between local blocks of the two functions. We introduce a mapping μ ∈ BlockID → BlockID between target and source local block ids.

Local blocks become related on function calls and return statements, which is when local pointers may be observed. For example, if a function is called with a pointer to a local block as the first argument, μ should relate that pointer with the first argument of an equivalent function call in the target function.
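
A simplified version of how μ grows at a call site, relating target to source local block ids argument-by-argument (our own sketch; None marks a non-pointer argument, and the well-formedness check is reduced to consistency):

```python
def extend_mu(mu, src_args, tgt_args):
    # Relate local blocks passed in matching argument positions; an
    # inconsistent correspondence means this alignment attempt fails.
    mu = dict(mu)
    for s, t in zip(src_args, tgt_args):
        if s is not None and t is not None:   # both are local pointers
            if t in mu and mu[t] != s:
                raise ValueError("inconsistent local-block correspondence")
            mu[t] = s
    return mu

assert extend_mu({}, [3, None], [7, None]) == {7: 3}
```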

Figure 4 gives the definition of memory refinement, $M \sqsupseteq^{\mu}_{mem} M'$, as well as other related relations between memory blocks and pointers. The first rule, pointer, describes refinement between a source pointer p and a target pointer p′ with respect to μ. The following four rules define refinement between bytes b and b′. In rule byte-nonptr, 'a | b' is the bitwise OR operation, and it is used to check the equality of only those bits that are not **poison**. Predicate isZeroByte(b) holds if b is a **null** pointer or a zero-valued non-pointer byte. This is needed because stores of **null** pointers can be optimized to **memset** instructions.

Rules bytes and block define refinement between memory blocks' values and memory blocks, respectively. Rule memory-map describes memory refinement with respect to local block mapping μ. M[bid] stands for the memory block with block id bid.

The well-formedness of μ is established in the refinement rules for function calls and return statements. We show these for function calls in the next section. We note that there might be multiple well-formed μ due to non-determinism.


Fig. 5. Refinement between function arguments.

### 8 Function Calls

A call to an unknown function may change the memory arbitrarily (except for, e.g., constant variables and non-escaped local blocks). The outputs in the source and target are, however, related: if the target's inputs refine those of the source, refinement holds between their outputs as well. Alive2 already supported function calls; this section shows how it was extended to support memory.

Let $(M_{in}, v_{in})$ and $(M_{out}, v_{out})$ be the input and output of a function call in the source, and their primed versions, $(M'_{in}, v'_{in})$ and $(M'_{out}, v'_{out})$, those of a function call in the target. Let $\mu_{in}$ be the local block mapping before executing the calls. To state that the outputs are refined if the inputs are refined, we add the following formula to the target's precondition:

$$\big(M_{in} \sqsupseteq^{\mu_{in}}_{mem} M'_{in} \,\land\, \forall i.\; v_{in}[i] \sqsupseteq^{\mu_{in}} v'_{in}[i]\big) \Longrightarrow \big(M_{out} \sqsupseteq^{\mu_{out}}_{mem} M'_{out} \,\land\, v_{out} \sqsupseteq^{\mu_{out}} v'_{out}\big)$$

A call to a function with a pointer to a local block as an argument escapes this block, as the callee may, e.g., store that pointer to a global variable. Moreover, any pointer stored in this block also escapes, as the callee may traverse the block and grab any pointer stored there, and do so transitively. The updated mapping $\mu_{out} = \mathrm{extend}(\mu_{in}, M_{in}, M'_{in}, v_{in}, v'_{in})$ returns $\mu_{in}$ updated with the relationship between the newly escaped blocks in the source and target functions.

Figure 5 shows the definition of refinement between function call arguments in source and target programs. The first rule relates non-pointer arguments. The second one handles pointers that have escaped before these calls. The third rule handles local pointers of blocks that did not escape before these calls, and therefore we need to check if the contents of these blocks are refined.

The fourth refinement rule handles **byval** pointer arguments. These arguments get a freshly allocated block and the contents of the pointer are copied from the pointer's offset onwards.

### 9 Approximating Program Behavior

In order to speed up verification, we approximate programs' behaviors, which can result in false positives and false negatives. We believe none of these approximations has a significant impact for two reasons: (1) we only need to be as precise as LLVM's static analyses, i.e., we do not need to support arbitrary optimizations, and (2) we do not consider the compiler to be malicious (which may not be true in certain contexts). Moreover, we conducted an extensive evaluation to support these claims, on which we report in the next section.

*Under-Approximations.* One such approximation is the address-space split described earlier, where local blocks are assigned addresses with MSB = 1 and non-local blocks addresses with MSB = 0; this excludes executions that do not follow that assignment.


*Over-Approximations.* The set of local blocks that escape (e.g., whose address is stored into a global variable) is computed per function. This may overapproximate the set of escaped pointers at times because, e.g., a pointer may only escape in a particular branch. LLVM also computes the set of escaped pointers per function.

# 10 Evaluation

We implemented our new memory model in Alive2 [30]. The implementation of the memory model consists of about 3.0 KLoC, plus an additional 0.4 KLoC of static analyses for optimization.

We ran two sets of experiments, both to validate our implementation and the formal semantics, and to identify bugs in LLVM. First, we did translation validation of LLVM's unit tests (test/Transforms) to increase confidence that we match LLVM's behavior in practice. Second, we ran five benchmarks: bzip2, gzip, oggenc, ph7, and SQLite3.

Benchmarks were compiled with -O3. Moreover, we disabled type-based aliasing because there is no formal model for this feature yet. During compilation, we emitted pairs of IR files before and after each intra-procedural optimization. We discarded syntactically equal pairs as well as pairs without memory operations.

We used a machine with two Intel Xeon E5-2630 v2 CPUs (total of 12 cores). We set Z3's timeout to 1 min and memory limit to 1 GB. Loops were unrolled once. We used LLVM from 11/Dec (5e31e22) and Z3 [33] from 16/Dec (11477f).

# 10.1 LLVM Unit Tests

LLVM's Transforms unit test suite consists of 6,600 tests totaling 36,600 functions. Alive2 takes about 2.5 h (in parallel) to validate these. By running LLVM's unit tests, we found 21 new bugs in memory optimizations.


Table 3. Statistics and results for the single-file benchmarks.

We show below an example of a bug we found. This optimization was shrinking the store from 64 to 32 bits, which is incorrect since the last 32 bits were not copied. This happened because of the mismatch in the load/store's sizes.

```
// i32 *x, *y, *z;                    // i32 *x, *y, *z;
i32 *p = (*x < *y ? x : y);     ⇒     i32 r = (*x < *y ? *x : *y);
*(i64*)z = *(i64*)p;                  *z = r;
```

#### 10.2 Benchmarks

Table 3 shows the statistics and results for translation validation. The Pairs column indicates the number of source/optimized function pairs considered for validation. We discarded pairs where the two functions were syntactically equal, as the transformation is then trivially correct. The last column indicates the number of skipped pairs because they use features Alive2 does not yet support.

All 79 incorrect pairs are due to mismatches between LLVM and the formal semantics. Of these, 74 are related to incorrect handling of **undef** and **poison** values, and the remaining 5 are caused by incorrect load type-punning optimizations. This shows that our tool has no false positives.

#### 10.3 Specification Bugs

While testing our tool, we found a mismatch in the semantics of the **nonnull** attribute between LLVM's documentation and LLVM's code. The documentation specified that passing a null pointer to a **nonnull** argument triggered UB. However, as illustrated below, LLVM adds **nonnull** to a pointer that may be **poison**. This is incorrect because **poison** can be optimized into any value including null.

```
p = gep inbounds q, 1          p = gep inbounds q, 1
f(p)                     ⇒     f(nonnull p)   ; UB if p is poison
```
We proposed a new semantics to the LLVM developers, where nonconforming pointers would be considered **poison** rather than UB. This was accepted and we have contributed patches to fix the docs and the incorrect optimizations.

#### 10.4 Alias Sets

To show that splitting the memory into multiple arrays is beneficial, we gathered statistics of the alias sets in our benchmarks. More than 96% of the dereferenced pointers turned out to be only local or non-local, but not both. This shows that splitting the memory into local and non-local simplifies the memory encoding.

We also counted the number of memory blocks pointers may alias with. Half of the pointers aliased with just one block. About 80% of the pointers aliased with at most 3 blocks. This is much less than the median number of blocks functions have: the median number of memory blocks per function was 7–13 (varying over programs), and only 10% of the functions had fewer than 3 blocks.

# 11 Related Work

*Semantics of LLVM IR.* The official LLVM IR's specification is written in prose [1]. Vellvm [47] and K-LLVM [29] formalized large subsets of the IR in Coq and K, respectively. [26] clarifies the semantics of **undef** and **poison** and proposes a new **freeze** instruction. [24] formalizes various memory instructions of LLVM. [32] presents a C memory model that supports compilation to that LLVM model.

*Translation validation.* [38] presents a translation validation infrastructure for GCC's intermediate language, using a set of arithmetic/aliasing rules for showing equivalence. LLVM-MD [44] and Peggy [42] verify LLVM optimizations by showing equivalence of source and targets with rewrite rules/equality axioms. They suffer, however, from incomplete axioms for aliasing.

In order to simplify the work of translation validation tools, it is possible to extend the compiler to produce hints (witnesses) [18,36,38,41]. One of these tools, Crellvm [20], is formally verified in Coq.

*Verifying programs with memory using SMT solvers.* SMT solvers have been used before to check equivalence of programs with memory [11,14,21,25,31]. [12] give an encoding of some (but not all) aliasing constraints needed to do translation validation of assembly generated by C compilers.

Other memory models encoded in SMT include one for Solidity (Ethereum smart contracts) [16], and others for separation logic [37,39]. Several verification tools include SAT/SMT-based (partial) memory models for C [2,9,10] and Java [43].

Several automatic software verification tools, often based on CHCs (constrained Horn clauses), support memory programs [6,13]. For example, both Sea-Horn and Cascade use a field-sensitive alias analysis to split the memory [15,45]. SLAyer [4] is an automatic tool for analyzing memory safety of a C program using Z3. Smallfoot [3] verifies assertions written in separation logic.

There have been recent advances in speeding up verification of (SMT) array programs [17,22], from which we could likely benefit.

CompCert [27] splits the memory into local (private) and non-local (public) blocks, similarly to what we do, but assumes that allocations never fail [28]. Work on verifying peephole optimizations for CompCert does not support memory [34].

To support integer-to-pointer casts in CompCert, [5] proposes extending integer values to carry block ids as well. In this model, arithmetic on pointer values yields a symbolic expression. [19] makes the pointer-to-integer cast an instruction that assigns a physical address to the block. Neither of these models supports several optimizations performed by LLVM.

# 12 Conclusion

We presented the first SMT encoding of LLVM's memory model that is sufficiently precise to validate all of LLVM's intra-procedural memory optimizations.

Using our new encoding, we found and reported 21 previously unknown bugs in LLVM memory optimizations, 10 of which have already been fixed.

Acknowledgement. This work was supported in part by the Basic Science Research Program through the National Research Foundation of Korea (NRF-2020R1A2C2011947).

# References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Automatically Tailoring Abstract Interpretation to Custom Usage Scenarios**

Muhammad Numair Mansur1(B) , Benjamin Mariano<sup>2</sup>, Maria Christakis<sup>1</sup>, Jorge A. Navas<sup>3</sup>, and Valentin Wüstholz<sup>4</sup>

> <sup>1</sup> MPI-SWS, Kaiserslautern and Saarbrücken, Germany {numair,maria}@mpi-sws.org <sup>2</sup> The University of Texas at Austin, Austin, USA bmariano@cs.utexas.edu <sup>3</sup> SRI International, Menlo Park, USA jorge.navas@sri.com <sup>4</sup> ConsenSys, Kaiserslautern, Germany valentin.wustholz@consensys.net

**Abstract.** In recent years, there has been significant progress in the development and industrial adoption of static analyzers, specifically of abstract interpreters. Such analyzers typically provide a large, if not huge, number of configurable options controlling the analysis precision and performance. A major hurdle in integrating them in the software-development life cycle is tuning their options to custom usage scenarios, such as a particular code base or certain resource constraints.

In this paper, we propose a technique that automatically tailors an abstract interpreter to the code under analysis and any given resource constraints. We implement this technique in a framework, tAIlor, which we use to perform an extensive evaluation on real-world benchmarks. Our experiments show that the configurations generated by tAIlor are vastly better than the default analysis options, vary significantly depending on the code under analysis, and most remain tailored to several subsequent code versions.

# **1 Introduction**

*Static analysis* inspects code, without running it, in order to prove properties or detect bugs. Typically, static analysis approximates code behavior, for instance, because checking the correctness of most properties is undecidable. *Performance* is another important reason for this approximation. Typically, the closer the approximation is to the actual code behavior, the less efficient and the more *precise* the analysis is, that is, the fewer false positives it reports. For less tight approximations, the analysis tends to become more efficient but less precise.

Recent years have seen tremendous progress in the development and industrial adoption of static analyzers. Notable successes include Facebook's Infer [7,8] and AbsInt's Astrée [5]. Many popular analyzers, such as these, are based on *abstract interpretation* [12], a technique that abstracts the concrete program

semantics and reasons about its abstraction. In particular, program states are abstracted as elements of *abstract domains*. Most abstract interpreters offer a wide range of abstract domains that impact the precision and performance of the analysis. For instance, the Intervals domain [11] is typically faster but less precise than Polyhedra [16], which captures linear inequalities among variables.

In addition to the domains, abstract interpreters usually provide a large number of other options, for instance, whether backward analysis should be enabled or how quickly a fixpoint should be reached. In fact, the sheer number of option combinations (over 6M in our experiments) is bound to overwhelm users, especially non-expert ones. To make matters worse, the best option combinations may vary significantly depending on the code under analysis and the resources, such as time or memory, that users are willing to spend.

In light of this, we suspect that most users resort to using the default options that the analysis designer pre-selected for them. However, these are definitely not suitable for all code. Moreover, they do not adjust to different stages of software development, e.g., running the analysis in the editor should be much faster than running it in a continuous integration (CI) pipeline, which in turn should be much faster than running it prior to a major release. The alternative of enabling the (in theory) most precise analysis can be even worse, since in practice it often runs out of time or memory as we show in our experiments. As a result, the widespread adoption of abstract interpreters is severely hindered, which is unfortunate since they constitute an important class of practical analyzers.

**Our Approach.** To address this issue, we present the first technique that automatically tailors a generic abstract interpreter to a custom usage scenario. With the term *custom usage scenario*, we refer to a particular piece of code and specific resource constraints. The key idea behind our technique is to phrase the problem of customizing the abstract-interpretation configuration to a given usage scenario as an optimization problem. Specifically, different configurations are compared using a cost function that penalizes those that prove fewer properties or require more resources. The cost function can guide the configuration search of a wide range of existing optimization algorithms. This problem of tuning abstract interpreters can be seen as an instance of the more general problem of *algorithm configuration* [31]. In the past, algorithm configuration has been used to tune algorithms for solving various hard problems, such as SAT solving [32,33], and more recently, training of machine-learning models [3,18,52].

We implement our technique in an open-source framework called tAIlor<sup>1</sup>, which configures a given abstract interpreter for a given usage scenario using a given optimization algorithm. As a result, tAIlor enables the abstract interpreter to prove as many properties as possible within the resource limit, without requiring any domain expertise on the part of the user.

Using tAIlor, we find that tailored configurations vastly outperform the default options pre-selected by the analysis designers. In fact, we show that this is possible even with very simple optimization algorithms. Our experiments

<sup>1</sup> The tool implementation is found at https://github.com/Practical-Formal-Methods/tailor and an installation at https://doi.org/10.5281/zenodo.4719604.

also demonstrate that tailored configurations vary significantly depending on the usage scenario—in other words, there cannot be a single configuration that fits all scenarios. Finally, most of the generated configurations remain tailored to several subsequent code versions, suggesting that re-tuning is only necessary after major code changes.

**Contributions.** We make the following contributions:


# **2 Overview**

We now illustrate the workflow and tool architecture of tAIlor and provide examples of its effectiveness.

**Terminology.** In the following, we refer to an abstract domain with all its options (e.g., enabling backward analysis or a more precise treatment of arrays) as an *ingredient*.

As discussed earlier, abstract interpreters typically provide a large number of such ingredients. To make matters worse, it is also possible to combine different ingredients into a sequence (which we call a *recipe*) such that more properties are verified than with individual ingredients. For example, a user could configure the abstract interpreter to first use Intervals to verify as many properties as possible and then use Polyhedra to attempt verification of any remaining properties. Of course, the number of possible configurations grows exponentially in the length of the recipe (over 6M in our experiments for recipes up to length 3).

**Workflow.** The high-level architecture of tAIlor is shown in Fig. 1. It takes as input the code to be analyzed (i.e., any program, file, function, or fragment), a user-provided resource limit, and optionally an optimization algorithm. We focus on time as the constrained resource in this paper, but our technique could be easily extended to other resources, such as memory.

The optimization engine relies on a recipe generator to generate a fresh recipe. To assess its quality in terms of precision and performance, the recipe evaluator computes a cost for the recipe. The cost is computed by evaluating how precise and efficient the abstract interpreter is for the given recipe. This cost is used by the optimization engine to keep track of the best recipe so far, i.e., the one that proves the most properties in the least amount of time. tAIlor repeats this process for a given number of iterations to sample multiple recipes and returns the recipe with the lowest cost.

Zooming in on the evaluator, a recipe is processed by invoking the abstract interpreter for each ingredient. After each analysis (i.e., one ingredient), the evaluator collects the new verification results, that is, the verified assertions. All

**Fig. 1.** Overview of our framework.

verification results that have been achieved so far are subsequently shared with the analyzer when it is invoked for the next ingredient. Verification results are shared by converting all verified assertions into assumptions. After processing the entire recipe, the evaluator computes a cost for the recipe, which depends on the number of unverified assertions and the total analysis time.
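The evaluator loop described above can be sketched as follows. This is an illustrative reconstruction, not tAIlor's actual API; `Result`, `evaluate_recipe`, and `toy_analyzer` are hypothetical names.

```python
from dataclasses import dataclass

# Sketch of the recipe evaluator (hypothetical names, not tAIlor's API):
# each ingredient is one analyzer run, and assertions verified by earlier
# ingredients become assumptions for later ones.

@dataclass
class Result:
    verified: set    # assertions this run managed to prove
    time: float      # analysis time for this run

def evaluate_recipe(code, recipe, run_analyzer):
    verified = set()                  # proved so far, shared as assumptions
    total_time = 0.0
    for ingredient in recipe:
        res = run_analyzer(code, ingredient, verified)
        verified |= res.verified      # accumulate newly proved assertions
        total_time += res.time
    return verified, total_time

# Toy analyzer: "polyhedra" proves a2 only once a1 is available as an assumption.
def toy_analyzer(code, ingredient, assumptions):
    if ingredient == "intervals":
        return Result({"a1"}, 0.1)
    if ingredient == "polyhedra" and "a1" in assumptions:
        return Result({"a2"}, 1.0)
    return Result(set(), 1.0)
```

In this toy setup, the recipe `["intervals", "polyhedra"]` proves both assertions, whereas `["polyhedra"]` alone proves neither, which is exactly why sharing verification results between ingredients pays off.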

In general, there might be more than one recipe tailored to a particular usage scenario. Naïvely, finding one requires searching the space of all recipes. Section 4.3 discusses several optimization algorithms for performing this search, which tAIlor already incorporates in its optimization engine.

**Examples.** As an example, let us consider the usage scenario where a user runs the Crab abstract interpreter [25] in their editor for instant feedback during code development. This means that the allowed time limit for the analysis is very short, say, 1 s. Now assume that the code under analysis is a program file<sup>2</sup> of the multimedia processing tool ffmpeg, which is used to evaluate the effectiveness of tAIlor in our experiments. In this file, Crab checks 45 assertions for common bugs, i.e., division by zero, integer overflow, buffer overflow, and use after free.

Analysis of this file with the default Crab configuration takes 0.35 s to complete. In this time, Crab proves 17 assertions and emits 28 warnings about the properties that remain unverified. For this usage scenario, tAIlor is able to tune the abstract-interpreter configuration such that the analysis time is 0.57 s and the number of verified properties increases by 29% (i.e., 22 assertions are proved). Note that the tailored configuration uses a completely different abstract domain than the one in the default configuration. As a result, the verification results are significantly better, but the analysis takes slightly longer to complete (although remaining within the specified time limit). In contrast, enabling the most precise analysis in Crab verifies 26 assertions but takes over 6 min to complete, which by far exceeds the time limit imposed by the usage scenario.

While it takes tAIlor 4.5 s to find the above configuration, this is time well invested; the configuration can be re-used for several subsequent code versions. In fact, in our experiments, we show that generated configurations can remain

<sup>2</sup> https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/idcin.c

tailored for at least 50 subsequent commits to a file under version control. Given that changes in the editor are typically much more incremental, we expect that no re-tuning would be necessary at all during an editor session. Re-tuning may be beneficial after major changes to the code under analysis and can happen offline, e.g., between editor sessions, or in the worst case overnight.

As another example, consider the usage scenario where Crab is integrated in a CI pipeline. In this scenario, users should be able to spare more time for analysis, say, 5 min. Here, let us assume that the analyzed code is a program file<sup>3</sup> of the curl tool for transferring data by URL, which is also used in our evaluation. The default Crab configuration takes 0.23 s to run and only verifies 2 out of 33 checked assertions. tAIlor is able to find a configuration that takes 7.6 s and proves 8 assertions. In contrast, the most precise configuration does not terminate even after 15 min.

Both scenarios demonstrate that, even when users have more time to spare, the default configuration cannot take advantage of it to improve the verification results. At the same time, the most precise configuration is completely impractical since it does not respect the resource constraints imposed by these scenarios.

### **3 Background: A Generic Abstract Interpreter**

Many successful abstract interpreters (e.g., Astrée [5], C Global Surveyor [53], Clousot [17], Crab [25], IKOS [6], Sparrow [46], and Infer [8]) follow the generic architecture in Fig. 2. In this section, we describe its main components to show that our approach should generalize to such analyzers.

**Memory Domain.** Analysis of low-level languages such as C and LLVM-bitcode requires reasoning about pointers. It is, therefore, common to design a *memory domain* [42] that can simultaneously reason about pointer aliasing, memory contents, and numerical relations between them.

*Pointer domains* resolve aliasing between pointers, and *array domains* reason about memory contents. More specifically, array domains can reason about individual memory locations (cells), infer universal properties over multiple cells, or both. Typically, reasoning about individual cells trades performance for precision unless there are very few array elements (e.g., [22,42]). In contrast, reasoning about multiple memory locations (*summarized cells*) trades precision for performance. In our evaluation, we use *Array smashing* domains [5] that abstract different array elements into a single summarized cell. *Logico-numerical domains* infer relationships between program and *synthetic* variables, introduced by the pointer and array domains, e.g., summarized cells.

Next, we introduce domains typically used for proving the absence of runtime errors in low-level languages. *Boolean domains* (e.g., flat Boolean, BDDApron [1]) reason about Boolean variables and expressions. *Non-relational domains* (e.g., Intervals [11], Congruence [23]) do not track relations among different variables, in contrast to *relational domains* (e.g., Equality [35], Zones [41],

<sup>3</sup> https://github.com/curl/curl/blob/master/lib/cookie.c

**Fig. 2.** Generic architecture of an abstract interpreter.

Octagons [43], Polyhedra [16]). Due to their increased precision, relational domains are typically less efficient than non-relational ones. *Symbolic domains* (e.g., Congruence closure [9], Symbolic constant [44], Term [21]) abstract complex expressions (e.g., non-linear) and external library calls by uninterpreted functions. *Non-convex domains* express disjunctive invariants. For instance, the DisInt domain [17] extends Intervals to a finite disjunction; it retains the scalability of the Intervals domain by keeping only non-overlapping intervals. On the other hand, the Boxes domain [24] captures arbitrary Boolean combinations of intervals, which can often be expensive.

**Fixpoint Computation.** To ensure termination of the fixpoint computation, Cousot and Cousot introduce *widening* [12,14], which usually incurs a loss of precision. There are three common strategies to reduce this precision loss, which however sacrifice efficiency. First, *delayed widening* [5] performs a number of initial fixpoint-computation iterations in the hope of reaching a fixpoint before resorting to widening. Second, *widening with thresholds* [37,40] limits the number of program expressions (thresholds) that are used when widening. The third strategy consists in applying *narrowing* [12,14] a certain number of times.
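As a concrete illustration of why widening loses precision and how narrowing recovers some of it, here is the textbook Intervals widening/narrowing (our sketch, not Crab's implementation):

```python
import math

# Textbook interval widening/narrowing (illustrative, not Crab's code).
# Intervals are (lo, hi) pairs; a bound that keeps moving outward is jumped
# to infinity so the fixpoint iteration terminates, and narrowing later
# refines the infinite bounds.

def widen(old, new):
    lo = old[0] if new[0] >= old[0] else -math.inf  # lower bound still dropping?
    hi = old[1] if new[1] <= old[1] else math.inf   # upper bound still growing?
    return (lo, hi)

def narrow(wide, new):
    lo = new[0] if wide[0] == -math.inf else wide[0]  # refine only infinite bounds
    hi = new[1] if wide[1] == math.inf else wide[1]
    return (lo, hi)

# For a loop counter 0, 1, 2, ...: successive iterates are [0,0], [0,1], ...
w = widen((0, 0), (0, 1))   # -> (0, inf): convergence forced in one step
n = narrow(w, (0, 9))       # -> (0, 9): precision recovered by narrowing
```

Delayed widening and widening with thresholds both aim to avoid the coarse jump to infinity seen here, at the cost of extra iterations.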

**Forward and Backward Analysis.** Classically, abstract interpreters analyze code by propagating abstract states in a *forward* manner. However, abstract interpreters can also perform *backward* analysis to compute the execution states that lead to an assertion violation. Cousot and Cousot [13,15] define a *forward-backward refinement* algorithm in which a forward analysis is followed by a backward analysis until no more refinement is possible. The backward analysis uses invariants computed by the forward analysis, while the forward analysis does not explore states that cannot reach an assertion violation based on the backward analysis. This refinement is more precise than forward analysis alone, but it may also become very expensive.

**Algorithm 1: Optimization engine.**

```
1  Function Optimize(P, rmax, lmax, idom, iset, recinit,
                     GenerateRecipe, Accept) is
2    // Phase 1 (optimize domains)
3    recbest := reccurr := recinit
4    costbest := costcurr := Evaluate(P, rmax, recbest)
5    for l := 1 to lmax do
6      for i := 1 to idom · l do
7        recnext := GenerateRecipe(reccurr, l)
8        costnext := Evaluate(P, rmax, recnext)
9        if costnext < costbest then
10         recbest, costbest := recnext, costnext
11       if Accept(costcurr, costnext) then
12         reccurr, costcurr := recnext, costnext
13   // Phase 2 (optimize settings)
14   for i := 1 to iset do
15     recmut := MutateSettings(recbest)
16     costmut := Evaluate(P, rmax, recmut)
17     if costmut < costbest then
18       recbest, costbest := recmut, costmut
19   return recbest
```
**Intra- and Inter-procedural Analysis.** An *intra-procedural* analysis analyzes a function ignoring the information (i.e., call stack) that flows into it, while an *inter-procedural* analysis considers all flows among functions. The former is much more efficient and easy to parallelize, but the latter is usually more precise.

### **4 Our Technique**

This section describes the components of tAIlor in detail; Sects. 4.1, 4.2, and 4.3 explain the optimization engine, the recipe evaluator, and the recipe generator, respectively (Fig. 1).

#### **4.1 Recipe Optimization**

Algorithm 1 implements the optimization engine. In addition to the code *P* and the resource limit *rmax* , it also takes as input the maximum length of the generated recipes *lmax* (i.e., the maximum number of ingredients), a function to generate new recipes GenerateRecipe (i.e., the recipe generator from Fig. 1), and four other parameters, which we explain later.

A tailored recipe is found in two phases. The first phase aims to find the best abstract domain for each ingredient, while the second tunes the remaining analysis settings for each ingredient (e.g., whether backward analysis should be enabled). Parameters *idom* and *iset* control the number of iterations of each phase. Note that we start with a search for the best domains since they have the largest impact on the precision and performance of the analysis.

During the first phase, the algorithm initializes the best recipe *recbest* with an initial recipe *recinit* (line 3). The cost of this recipe is evaluated with function Evaluate, which implements the recipe evaluator from Fig. 1. The subsequent nested loop (line 5) samples a number of recipes, starting with the shortest recipes (*l* := 1) and ending with the longest recipes (*l* := *lmax*). The inner loop generates *idom* candidate recipes for each ingredient in the recipe (i.e., *idom* · *l* total iterations) by invoking function GenerateRecipe, and in case a recipe with lower cost is found, it updates the best recipe (lines 9–10). Several optimization algorithms, such as hill climbing and simulated annealing, search for an optimal result by mutating some of the intermediate results. Variable *reccurr* stores intermediate recipes to be mutated, and function Accept decides when to update it (lines 11–12).

As explained earlier, the purpose of the first phase is to identify the best sequence of abstract domains. The second phase (lines 13–18) focuses on tuning the other settings of the best recipe so far. This is done by randomly mutating the best recipe via MutateSettings (line 15), and updating the best recipe if better settings are found (lines 17–18). After exploring *iset* random settings, the best recipe is returned to the user (line 19).

#### **4.2 Recipe Evaluation**

The recipe evaluator from Fig. 1 uses a cost function to determine the quality of a fresh recipe with respect to the precision and performance of the abstract interpreter. This design is motivated by the fact that analysis imprecision and inefficiency are among the top pain points for users [10].

Therefore, the cost function depends on the number of generated warnings *w* (that is, the number of unverified assertions), the total number of assertions in the code *wtotal* , the resource consumption *r* of the analyzer, and the resource limit *rmax* imposed on the analyzer:

$$cost(w, w\_{total}, r, r\_{max}) = \begin{cases} \dfrac{w + \frac{r}{r\_{max}}}{w\_{total}}, & \text{if } r \le r\_{max} \\ \infty, & \text{otherwise} \end{cases}$$

Note that *w* and *r* are measured by invoking the abstract interpreter with the recipe under evaluation. The cost function evaluates to a lower cost for recipes that improve the precision of the abstract interpreter (due to the term *w/wtotal*). In case of ties, the term *r/rmax* causes the function to evaluate to a lower cost for recipes that result in a more efficient analysis. In other words, for two recipes resulting in equal precision, the one with the smaller resource consumption is assigned a lower cost. When a recipe causes the analyzer to exceed the resource limit, it is assigned infinite cost.
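In Python, the cost function described above reads as follows (a direct transcription of the formula, using the paper's variable names; the function itself is ours):

```python
import math

# Sketch of the cost function described above, with the paper's variable
# names (w, w_total, r, r_max); the Python function itself is ours.

def cost(w, w_total, r, r_max):
    if r > r_max:
        return math.inf               # resource limit exceeded
    return (w + r / r_max) / w_total  # warnings dominate; time breaks ties

# Fewer warnings always win; among equally precise recipes, the faster wins.
assert cost(10, 45, 0.5, 1.0) < cost(11, 45, 0.1, 1.0)
assert cost(10, 45, 0.3, 1.0) < cost(10, 45, 0.9, 1.0)
assert cost(10, 45, 1.5, 1.0) == math.inf
```

Since *r/rmax* ≤ 1 whenever the limit is respected, the warning count *w* always dominates the ordering, and resource consumption only discriminates between recipes of equal precision.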

#### **4.3 Recipe Generation**

In the literature, there is a broad range of optimization algorithms for different application domains. To demonstrate the generality and effectiveness of tAIlor, we instantiate it with four adaptations of three well-known optimization algorithms, namely random sampling [38], hill climbing (with regular restarts) [48], and simulated annealing [36,39]. Here, we describe these algorithms in detail, and in Sect. 5, we evaluate their effectiveness.

Before diving into the details, let us discuss the suitability of different kinds of optimization algorithms for our domain. There are algorithms that leverage mathematical properties of the function to be optimized, e.g., by computing derivatives as in Newton's iterative method. Our cost function, however, is evaluated by running an abstract interpreter, and thus, it is not differentiable or continuous. This constraint makes such analytical algorithms unsuitable. Moreover, evaluating our cost function is expensive, especially for precise abstract domains such as Polyhedra. This makes algorithms that require a large number of samples, such as genetic algorithms, less practical.

Now recall that Algorithm 1 is parametric in how new recipes are generated (with GenerateRecipe) and accepted for further mutations (with Accept). Instantiations of these functions essentially constitute our search strategy for a tailored recipe. In the following, we discuss four such instantiations. Note that, in theory, the order of recipe ingredients matters. This is because any properties verified by one ingredient are converted into assumptions for the next, and different assumptions may lead to different verification results. Therefore, all our instantiations are able to explore different ingredient orderings.

**Random Sampling.** Random sampling (rs) just generates random recipes of a certain length. Function Accept always returns *false* as each recipe is generated from scratch, and not as a result of any mutations.

**Domain-Aware Random Sampling.** rs might generate recipes containing abstract domains of comparable precision. For instance, the Octagons domain is typically strictly more precise than Intervals. Thus, a recipe consisting of these domains is essentially equivalent to one containing only Octagons.

Now, assume that we have a partially ordered set (poset) of domains that defines their ordering in terms of precision. An example of such a poset for a particular abstract interpreter is shown in Fig. 3. An optimization algorithm can then leverage this information to reduce the search space of possible recipes. Given such a poset, we therefore define domain-aware random sampling (dars), which randomly samples recipes that do not contain abstract domains of comparable precision. Again, Accept always returns *false*.
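A minimal sketch of dars follows. The precision relation below is a small hypothetical subset of the poset in Fig. 3, for illustration only; dars rejection-samples recipes until no two domains are comparable.

```python
import random

# Sketch of domain-aware random sampling (dars). MORE_PRECISE is a small
# hypothetical subset of the Fig. 3 poset, for illustration only.

MORE_PRECISE = {                     # d -> domains strictly more precise than d
    "intervals": {"zones", "octagons", "polyhedra"},
    "zones": {"octagons", "polyhedra"},
    "octagons": {"polyhedra"},
}

def comparable(d1, d2):
    if d1 == d2:
        return True                  # duplicates are redundant, too
    return d2 in MORE_PRECISE.get(d1, set()) or d1 in MORE_PRECISE.get(d2, set())

def pairwise_incomparable(recipe):
    return all(not comparable(a, b)
               for i, a in enumerate(recipe) for b in recipe[i + 1:])

def dars_sample(domains, length, rng):
    while True:                      # reject recipes violating the poset constraint
        recipe = [rng.choice(domains) for _ in range(length)]
        if pairwise_incomparable(recipe):
            return recipe
```

With domains `["intervals", "octagons", "congruence"]`, for instance, `dars_sample` never returns a recipe containing both intervals and octagons, since the former is subsumed by the latter.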

**Simulated Annealing.** Simulated annealing (sa) searches for the best recipe by mutating the current recipe *reccurr* in Algorithm 1. The resulting recipe (*recnext*), if accepted on line 12, becomes the new recipe to be mutated. Algorithm 2 shows an instantiation of GenerateRecipe, which mutates a given recipe such that the poset precision constraints are satisfied (i.e., there are no domains of comparable precision). A recipe is mutated either by adding new ingredients with

#### **Algorithm 2: A recipe-generator instantiation.**

```
1  Function GenerateRecipe(rec, lmax) is
2    act := RandomAction({ADD: 0.2, MOD: 0.8})
3    if act = ADD ∧ Len(rec) < lmax then
4      ingrnew := RandomPosetLeastIncomparable(rec)
5      recmut := AddIngredient(rec, ingrnew)
6    else
7      ingr := RandomIngredient(rec)
8      actm := RandomAction({GT: 0.5, LT: 0.3, INC: 0.2})
9      if actm = GT then
10       ingrnew := PosetGreaterThan(ingr)
11     else if actm = LT then
12       ingrnew := PosetLessThan(ingr)
13     else
14       recrem := RemoveIngredient(rec, ingr)
15       ingrnew := RandomPosetLeastIncomparable(recrem)
16     recmut := ReplaceIngredient(rec, ingr, ingrnew)
17   if ¬PosetCompatible(recmut) then
18     recmut := GenerateRecipe(rec, lmax)
19   return recmut
```
20% probability or by modifying existing ones with 80% probability (line 2). The probability of adding ingredients is lower to keep recipes short.

When adding a new ingredient (lines 4–5), Algorithm 2 calls RandomPosetLeastIncomparable, which considers all domains that are incomparable with the domains in the recipe. Given this set, it randomly selects from the domains with the least precision to avoid adding overly expensive domains. When modifying a random ingredient in the recipe (lines 7–16), the algorithm can replace its domain with one of three possibilities: a domain that is immediately more precise (i.e., not transitively) in the poset (via PosetGreaterThan), a domain that is immediately less precise (via PosetLessThan), or an incomparable domain with the least precision (via RandomPosetLeastIncomparable). If the resulting recipe does not satisfy the poset precision constraints, the algorithm retries mutating the original recipe (lines 17–18).

For simulated annealing, Accept returns *true* if the new cost (for the mutated recipe) is less than the current cost. It also accepts recipes whose cost is higher with a certain probability, which is inversely proportional to the cost increase and the number of explored recipes. That is, recipes with a small cost increase are likely to be accepted, especially at the beginning of the exploration.
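One common way to realize such an acceptance rule is a Metropolis-style criterion with a cooling temperature. The `1/(1 + iteration)` schedule below is our assumption, not necessarily tAIlor's exact choice:

```python
import math
import random

# A Metropolis-style acceptance rule matching the description above; the
# temperature schedule is an assumption, not necessarily tAIlor's choice.

def accept(cost_curr, cost_next, iteration, rng, t0=1.0):
    if cost_next < cost_curr:
        return True                        # improvements are always accepted
    temperature = t0 / (1 + iteration)     # cools as more recipes are explored
    p = math.exp(-(cost_next - cost_curr) / temperature)
    return rng.random() < p                # small, early regressions often pass
```

As the temperature drops, the probability of accepting a cost increase decays exponentially, which yields exactly the behavior described: small regressions are likely to be accepted early in the exploration and rarely later.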

**Hill Climbing.** Our instantiation of hill climbing (hc) performs regular restarts. In particular, it starts with a randomly generated recipe that satisfies the poset precision constraints, generates 10 new valid recipes, and restarts with a random recipe. Accept returns *true* only if the new cost is lower than the best cost, which is equivalent to the current cost.

### **5 Experimental Evaluation**

To evaluate our technique, we aim to answer the following research questions:

**RQ1:** Is our technique effective in tailoring recipes to different usage scenarios?

**RQ2:** Are the tailored recipes optimal?


#### **5.1 Implementation**

We implemented tAIlor by extending Crab [25], a parametric framework for modular construction of abstract interpreters<sup>4</sup>. We extended Crab with the ability to pass verification results between recipe ingredients as well as with the four optimization algorithms discussed in Sect. 4.3.

Table 1 shows all settings and values used in our evaluation. The first three settings refer to the strategies discussed in Sect. 3 for mitigating the precision loss incurred by widening. For the initial recipe, tAIlor uses Intervals and the Crab default values for all other settings (in bold in the table). To make the search more efficient, we selected a representative subset of all possible setting values.

Crab uses a DSA-based [26] pointer analysis and can, optionally, reason about array contents using array smashing. It offers a wide range of logico-numerical domains, shown in Fig. 3. The bool domain is the flat Boolean domain, ric is a reduced product of Intervals and Congruence, and term(int) and term(disInt) are instantiations of the Term domain with intervals and disInt, respectively. Although Crab provides a bottom-up inter-procedural analysis, we use the default intra-procedural analysis; in fact, most analyses deployed in real usage scenarios are intra-procedural due to time constraints [10].

#### **5.2 Benchmark Selection**

For our evaluation, we systematically selected popular and (at some point) active C projects on GitHub. In particular, we chose the six most starred C repositories with over 300 commits that we could successfully build with the Clang-5.0 compiler. We give a short description of each project in Table 2.

**Table 1.** Crab settings and their possible values as used in our experiments. Default settings are shown in bold.

<sup>4</sup> Crab is available at https://github.com/seahorn/crab.

**Table 2.** Overview of projects.

For analyzing these projects, we needed to introduce properties to be verified. We, thus, automatically instrumented these projects with four types of assertions that check for common bugs; namely, division by zero, integer overflow, buffer overflow, and use after free. Introducing assertions to check for runtime errors such as these is common practice in program analysis and verification.

As projects consist of different numbers of files, to avoid skewing the results in favor of a particular project, we randomly and uniformly sampled 20 LLVM-bitcode files from each project, for a total of 120. To ensure that each file was neither too trivial nor too difficult for the abstract interpreter, we used the number of assertions as a complexity indicator and only sampled files with at least 20 assertions and at most 100. Additionally, to guarantee all four assertion types were included and avoid skewing the results in favor of a particular assertion type, we required that the sum of assertions for each type was at least 70 across all files—this exact number was largely determined by the benchmarks.

Overall, our benchmark suite of 120 files totals 1346 functions, 5557 assertions (on average 4 assertions per function), and 667927 LLVM instructions (Table 3).

#### **5.3 Results**

We now present our experimental results for each research question. We performed all experiments on a 32-core Intel® Xeon® E5-2667 v2 CPU @ 3.30 GHz machine with 264 GB of memory, running Ubuntu 16.04.1 LTS.

**Fig. 3.** Comparing logico-numerical domains in Crab. A domain *d*<sup>1</sup> is less precise than *d*<sup>2</sup> if there is a path from *d*<sup>1</sup> to *d*<sup>2</sup> going upward, otherwise *d*<sup>1</sup> and *d*<sup>2</sup> are incomparable.


**Table 3.** Benchmark characteristics (20 files per project). The last three columns show the number of functions, assertions, and LLVM instructions in the analyzed files.

**RQ1: Is Our Technique Effective in Tailoring Recipes to Different Usage Scenarios?** We instantiated tAIlor with the four optimization algorithms described in Sect. 4.3: rs, dars, sa, and hc. We constrained the analysis time to simulate two usage scenarios: 1 s for instant feedback in the editor, and 5 min for feedback in a CI pipeline. We compare tAIlor with the default recipe (def), i.e., the default settings in Crab as defined by its designer after careful tuning on a large set of benchmarks over the years. def uses a combination of two domains, namely, the reduced product of Boolean and Zones. The other default settings are in Table 1.

For this experiment, we ran tAIlor with each optimization algorithm on the 120 benchmark files, enabling optimization at the granularity of files. Each algorithm was seeded with the same random seed. In Algorithm 1, we restrict recipes to contain at most 3 domains (*lmax* = 3) and set the number of iterations for each phase to be 5 and 10 (*idom* = 5 and *iset* = 10).

The results are presented in Fig. 4, which shows the number of assertions that are verified with the best recipe found by each algorithm as well as by the default recipe. All algorithms outperform the default recipe for both usage scenarios, verifying almost twice as many assertions on average. The random-sampling algorithms find better recipes than the others, with dars being the most effective. Hill climbing is less effective since it gets stuck in local cost minima despite restarts. Simulated annealing is the least effective because it slowly climbs up the poset toward more precise domains (see Algorithm 2). However, as we explain later, we expect the algorithms to converge on the number of verified assertions given more iterations.

**Fig. 4.** Comparison of the number of assertions verified with the best recipe generated by each optimization algorithm and with the default recipe, for varying timeouts.

**Fig. 5.** Comparison of the number of assertions verified by a tailored vs. the default recipe.

**Fig. 6.** Comparison of the total time (in sec) that each algorithm requires for all iterations, for varying timeouts.

Figure 5 gives a more detailed comparison with the default recipe for the time limit of 5 min. In particular, each horizontal bar shows the total number of assertions verified by each algorithm. The orange portion represents the assertions verified by both the default recipe and the optimization algorithm, while the green and red portions represent the assertions only verified by the algorithm and default recipe, respectively. These results show that, in addition to verifying hundreds of new assertions, tAIlor is able to verify the vast majority of assertions proved by the default recipe, regardless of optimization algorithm.

In Fig. 6, we show the total time each algorithm takes for all iterations. dars takes the longest. This is due to generating more precise recipes thanks to its domain knowledge. Such recipes typically take longer to run but verify more assertions (as in Fig. 4). On average, for all algorithms, tAIlor requires only 30 s to complete all iterations for the 1-s timeout and 16 min for the 5-min timeout. As discussed in Sect. 2, this tuning time can be spent offline.

**Fig. 7.** Comparison of the number of assertions verified with the best recipe generated by the different optimization algorithms, for different numbers of iterations.

Figure 7 compares the total number of assertions verified by each algorithm when tAIlor runs for 40 (*idom* = 5 and *iset* = 10) and 80 (*idom* = 10 and *iset* = 20) iterations. The results show that only a relatively small number of additional assertions are verified with 80 iterations. In fact, we expect the algorithms to eventually converge on the number of verified assertions, given the time limit and precision of the available domains.

As dars performs best in this comparison, we only evaluate dars in the remaining research questions. We use a 5-min timeout.

**RQ1 takeaway:** tAIlor verifies between 1.6 and 2.1<sup>×</sup> as many assertions as the default recipe, regardless of optimization algorithm, timeout, or number of iterations. In fact, even very simple algorithms (such as rs) significantly outperform the default recipe.

**RQ2: Are the Tailored Recipes Optimal?** To check the optimality of the tailored recipes, we compared them with the most precise (and least efficient) Crab configuration. It uses the most precise domains from Fig. 3 (i.e., bool, polyhedra, term(int), ric, boxes, and term(disInt)) in a recipe of 6 ingredients and assigns the most precise values to all other settings from Table 1. We generously gave a 30-min timeout to this recipe.

For 21 out of 120 files, the most precise recipe ran out of memory (264 GB). For 86 files, it terminated within 5 min, and for 13, it took longer (within 30 min)—in many cases, this was even longer than tAIlor's tuning time in Fig. 6. We compared the number of assertions verified by our tailored recipes (which do not exceed 5 min) and by the most precise recipe. For the 86 files that terminated within 5 min, our recipes prove 618 assertions, whereas the most precise recipe proves 534. For the other 13 files, our recipes prove 119 assertions, whereas the most precise recipe proves 98.

Consequently, our (in theory) less precise and more efficient recipes prove more assertions in files where the most precise recipe terminates. Possible explanations for this non-intuitive result are: (1) Polyhedra coefficients may overflow, in which case the constraints are typically ignored by abstract interpreters, and (2) more precise domains with different widening operations may produce less precise results [2,45].

**Fig. 8.** Effect of different settings on the precision and performance of the abstract interpreter. (dw: NUM DELAY WIDEN, ni: NUM NARROW ITERATIONS, wt: NUM WIDEN THRESHOLDS, as: array smashing, b: backward analysis, d: abstract domain, o: ingredient ordering).

We also evaluated the optimality of tailored recipes by mutating individual parts of the recipe and comparing to the original. In particular, for each setting in Table 1, we tried all possible values and replaced each domain with all other comparable domains in the poset of Fig. 3. For example, for a recipe including zones, we tried octagons, polyhedra, and intervals. In addition, we tried all possible orderings of the recipe ingredients, which in theory could produce different results. We observed whether these changes resulted in a difference in the precision and performance of the analyzer.

Figure 8 shows the results of this experiment, broken down by setting. Equal (in orange) indicates that the mutated recipe proves the same number of assertions within ±5 s of the original. Positive (in green) indicates that it either proves more assertions or the same number of assertions at least 5 s faster. Negative (in red) indicates that the mutated recipe either proves fewer assertions or the same number of assertions at least 5 s slower.

The results show that, for our benchmarks, mutating the recipe found by tAIlor rarely led to an improvement. In particular, at least 93% of all mutated recipes were either equal to or worse than the original recipe. In the majority of these cases, mutated recipes are equally good. This indicates that there are many optimal or close-to-optimal solutions and that tAIlor is able to find one.

**RQ2 takeaway:** Compared to the most precise recipe, tAIlor verified more assertions across benchmarks where the most precise recipe terminated. Furthermore, mutating recipes found by tAIlor resulted in an improvement for less than 7% of recipes.

**RQ3: How Diverse are the Tailored Recipes?** To motivate the need for optimization, we must show that tailored recipes are sufficiently diverse such that they could not be replaced by a well-crafted default recipe. To better understand the characteristics of tailored recipes, we manually inspected all of them.

**Fig. 9.** Occurrence of domains (in %) in the best recipes for all assertion types.

tAIlor generated recipes of length greater than 1 for 61 files; of these, 37 are of length 2 and 24 of length 3. For 77% of generated recipes, NUM DELAY WIDEN is not set to the default value of 1. Additionally, 55% of the ingredients enable array smashing, and 32% enable backward analysis.

Figure 9 shows how often (in percentage) each abstract domain occurs in a best recipe found by tAIlor. We observe that all domains occur almost equally often, with 6 of the 10 domains occurring in between 9% and 13% of recipes. The most common domain was bool at 18%, and the least common was intervals at 4%. We observed a similar distribution of domains even when instrumenting the benchmarks with only one assertion type, e.g., checking for integer overflow.

We also inspected which domain combinations are frequently used in the tailored recipes. One common pattern is the combination of bool with numerical domains (18 occurrences). Similarly, we observed 2 occurrences of term(disInt) together with zones. Interestingly, the less powerful variants of combining disInt with zones (3 occurrences) and term(int) with zones (6 occurrences) seem to be sufficient in many cases. Finally, we observed 8 occurrences of polyhedra or octagons with boxes, which are the most precise convex and non-convex domains. Our approach is thus useful not only for users but also for designers of abstract interpreters, by potentially inspiring new domain combinations.

**RQ3 takeaway:** The diversity of tailored recipes prevents replacing them with a single default recipe. Over half of the tailored recipes contain more than one ingredient, and ingredients use a variety of domains and their settings.

**RQ4: How Resilient are the Tailored Recipes to Code Changes?** We expect tailored recipes to be resilient to code changes, i.e., to retain their optimality across several changes without requiring re-tuning. We now evaluate if a recipe tailored for one code version is also tailored for another, even when the two versions are 50 commits apart.

For this experiment, we took a random sample of 60 files from our benchmarks and retrieved the 50 most recent commits per file. We only sampled 60 out of

**Fig. 10.** Difference in the safe assertions across commits.

120 files as building these files for each commit is quite time-consuming; it can take up to a couple of days. We instrumented each file version with the four assertion types described in Sect. 5.2. It should be noted that, for some files, we retrieved fewer than 50 versions, either because there were fewer than 50 total commits or because our build procedure for the project failed on older commits. This is also why we did not run this experiment beyond 50 commits.

We analyzed each file version with the best recipe, *R<sup>o</sup>*, found by tAIlor for the oldest file version. We compared this recipe with the new best recipes, *R<sup>n</sup>*, generated by tAIlor when run on each subsequent file version. For this experiment, we used a 5-min timeout and 40 iterations.

Note that, when running tAIlor with the same optimization algorithm and random seed, it explores the same recipes. It is, therefore, very likely that recipe *R<sup>o</sup>* for the oldest commit is also the best for other file versions since we only explore 40 different recipes. To avoid any such bias, we performed this experiment by seeding tAIlor with a different random seed for each commit. The results are shown in Fig. 10.

In Fig. 10, we give a bar chart comparing the number of files per commit that have a positive, equal, and negative difference in the number of verified assertions, where commit 0 is the oldest commit and 49 the newest. An equal difference (in orange) means that recipe *R<sup>o</sup>* for the oldest commit proves the same number of assertions in the current file version, *f<sup>n</sup>*, as recipe *R<sup>n</sup>* found by running tAIlor on *f<sup>n</sup>*. To be more precise, we consider the two recipes to be equal if they differ by at most 1 verified assertion or 1% of verified assertions, since such a small change in the number of safe assertions seems acceptable in practice (especially given that the total number of assertions may change across commits). A positive difference (in green) means that *R<sup>o</sup>* achieves better verification results than *R<sup>n</sup>*, that is, *R<sup>o</sup>* proves more assertions safe (by more than 1 assertion or 1% of the assertions that *R<sup>n</sup>* proves). Analogously, a negative difference (in red) means that *R<sup>o</sup>* proves fewer assertions. We do not consider time here because none of the recipes timed out when applied to any file version.
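The equal/positive/negative classification described above can be sketched as follows; the `classify` helper is our own, mirroring the stated tolerance of 1 assertion or 1% of verified assertions.

```python
def classify(proved_old, proved_new):
    """Compare recipe R_o (tailored for the oldest commit) against R_n
    (tailored for the current commit) by the number of assertions each
    proves on the current file version."""
    diff = proved_old - proved_new
    # Two recipes count as equal when they differ by at most 1 assertion
    # or at most 1% of the verified assertions, whichever is larger.
    tolerance = max(1, 0.01 * max(proved_old, proved_new))
    if abs(diff) <= tolerance:
        return "equal"
    return "positive" if diff > 0 else "negative"

assert classify(100, 100) == "equal"
assert classify(101, 100) == "equal"      # within 1 assertion
assert classify(200, 198) == "equal"      # within 1% (tolerance is 2)
assert classify(120, 100) == "positive"   # R_o proves clearly more
assert classify(100, 120) == "negative"   # R_o proves clearly fewer
```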

Note that the number of files decreases for newer commits. This is because not all files go forward by 50 commits, and even if they do, not all file versions build. However, in a few instances, the number of files increases going forward in time. This happens for files that change names and later change back, which we do not account for.

For the vast majority of files, using recipe *R<sup>o</sup>* (found for the oldest commit) is as effective as using *R<sup>n</sup>* (found for the current commit). The difference in safe assertions is negative for less than a quarter of the files tested, with the average negative difference among these files being around 22% (i.e., *R<sup>o</sup>* proved 22% fewer assertions than *R<sup>n</sup>* in these files). On the remaining three quarters of the files tested, however, *R<sup>o</sup>* proves at least as many assertions as *R<sup>n</sup>*, and thus *R<sup>o</sup>* tends to remain tailored across code versions.

Commits can result in both small and large changes to the code. We therefore also measured the average difference in the number of verified assertions per changed line of code with respect to the oldest commit. For most files, regardless of the number of changed lines, we found that *R<sup>o</sup>* and *R<sup>n</sup>* are equally effective, with changes of 1000 LOC or more resulting in little to no loss in precision. In particular, the median difference in safe assertions across all changes between *R<sup>o</sup>* and *R<sup>n</sup>* was 0 (i.e., *R<sup>o</sup>* proved the same number of assertions safe as *R<sup>n</sup>*), with a standard deviation of 15 assertions. We manually inspected a handful of outliers where *R<sup>o</sup>* proved significantly fewer assertions than *R<sup>n</sup>* (a difference of over 50 assertions). These were due to one file from git where *R<sup>o</sup>* is not as effective because its widening and narrowing settings have very low values.

**RQ4 takeaway:** For over 75% of files, tAIlor's recipe for a previous commit (from up to 50 commits previous) remains tailored for future versions of the file, indicating the resilience of tailored recipes across code changes.

#### **5.4 Threats to Validity**

We have identified the following threats to the validity of our experiments.

**Benchmark Selection.** Our results may not generalize to other benchmarks. However, we selected popular GitHub projects from different application domains (see Table 2). Hence, we believe that our benchmark selection mitigates this threat and increases the generalizability of our findings.

**Abstract Interpreter and Recipe Settings.** For our experiments, we only used a single abstract interpreter, Crab, which, however, is a mature and actively supported tool. The selection of recipe settings was, of course, influenced by the settings available in Crab. Nevertheless, Crab implements the generic architecture of Fig. 2, used by most abstract interpreters, such as those mentioned at the beginning of Sect. 3. We, therefore, expect our approach to generalize to such analyzers.

**Optimization Algorithms.** We considered four optimization algorithms, but in Sect. 4.3, we explain why these are suitable for our application domain. Moreover, tAIlor is configurable with respect to the optimization algorithm.

**Assertion Types.** Our results are based on four types of assertions. However, these cover a wide range of runtime errors that are commonly checked by static analyzers.

# **6 Related Work**

The impact of different abstract-interpretation configurations has been previously evaluated [54] for Java programs and partially inspired this work. To the best of our knowledge, we are the first to propose tailoring abstract interpreters to custom usage scenarios using optimization.

However, optimization is a widely used technique in many engineering disciplines. In fact, it is also used to solve the general problem of algorithm configuration [31], of which there exist numerous instantiations, for instance, to tune hyper-parameters of learning algorithms [3,18,52] and options of constraint solvers [32,33]. Existing frameworks for algorithm configuration differ from ours in that they are not geared toward problems that are solved by sequences of algorithms, such as analyses with different abstract domains. Even if they were, our experience with tAIlor shows that there seem to be many optimal or close-to-optimal configurations, and even very simple optimization algorithms such as rs are surprisingly effective (see RQ2); similar observations were made about the effectiveness of random search in hyper-parameter tuning [4].

In the rest of this section, we focus on the use of optimization in program analysis. It has been successfully applied to a number of program-analysis problems, such as automated testing [19,20], invariant inference [50], and compiler optimizations [49].

Recently, researchers have started to explore the direction of enriching program analyses with machine-learning techniques, for example, to automatically learn analysis heuristics [27,34,47,51]. A particularly relevant body of work is on adaptive program analysis [28–30], where existing code is analyzed to learn heuristics that trade soundness for precision or that coarsen the analysis abstractions to improve memory consumption. More specifically, adaptive program analysis poses different static-analysis problems as machine-learning problems and relies on Bayesian optimization to solve them, e.g., the problem of selectively applying unsoundness to different program components (e.g., different loops in the program) [30]. The main insight is that program components (e.g., loops) that produce false positives are alike, predictable, and share common properties. After learning to identify such components for existing code, this technique suggests components in unseen code that should be analyzed unsoundly.

In contrast, tAIlor currently does not adjust soundness of the analysis. However, this would also be possible if the analyzer provided the corresponding configurations. More importantly, adaptive analysis focuses on learning analysis heuristics based on existing code in order to generalize to arbitrary, unseen code. tAIlor, on the other hand, aims to tune the analyzer configuration to a custom usage scenario, including a particular program under analysis. In addition, the custom usage scenario imposes user-specific resource constraints, for instance by limiting the time according to a phase of the software-engineering life cycle. As we show in our experiments, the tuned configuration remains tailored to several versions of the analyzed program. In fact, it outperforms configurations that are meant to generalize to arbitrary programs, such as the default recipe.

# **7 Conclusion**

In this paper, we have proposed a technique and framework that tailors a generic abstract interpreter to custom usage scenarios. We instantiated our framework with a mature abstract interpreter to perform an extensive evaluation on real-world benchmarks. Our experiments show that the configurations generated by tAIlor are vastly better than the default options, vary significantly depending on the code under analysis, and typically remain tailored to several subsequent code versions. In the future, we plan to explore the challenges that an inter-procedural analysis would pose, for instance, by using a different recipe for computing a summary of each function or each calling context.

**Acknowledgements.** We are grateful to the reviewers for their constructive feedback. This work was supported by DFG grant 389792660 as part of TRR 248 (see https:// perspicuous-computing.science). Jorge Navas was supported by NSF grant 1816936.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Functional Correctness of C Implementations of Dijkstra's, Kruskal's, and Prim's Algorithms**

Anshuman Mohan(B), Wei Xiang Leow, and Aquinas Hobor

School of Computing, National University of Singapore, Singapore, Republic of Singapore amohan@cs.cornell.edu

**Abstract.** We develop machine-checked verifications of the full functional correctness of C implementations of the eponymous graph algorithms of Dijkstra, Kruskal, and Prim. We extend Wang *et al.*'s CertiGraph platform to reason about labels on edges, undirected graphs, and common spatial representations of edge-labeled graphs such as adjacency matrices and edge lists. We certify binary heaps, including Floyd's bottom-up heap construction, heapsort, and increase/decrease priority.

Our verifications uncover subtle overflows implicit in standard textbook code, including a nontrivial bound on edge weights necessary to execute Dijkstra's algorithm; we show that the intuitive guess fails and provide a workable refinement. We observe that the common notion that Prim's algorithm requires a connected graph is wrong: we verify that a standard textbook implementation of Prim's algorithm can compute minimum spanning forests without finding components first. Our verification of Kruskal's algorithm reasons about two graphs simultaneously: the undirected graph undergoing MSF construction, and the directed graph representing the forest inside union-find. Our binary heap verification exposes precise bounds for the heap to operate correctly, avoids a subtle overflow error, and shows how to recycle keys to avoid overflow.

**Keywords:** Separation logic · Graph algorithms · Coq · VST

# **1 Introduction**

Dijkstra's eponymous shortest-path algorithm [22] finds the cost-minimal paths from a distinguished *source* vertex to all reachable vertices in a directed graph. Prim's [61] and Kruskal's [42] algorithms return minimal spanning trees for undirected graphs. Binary heaps are the first priority queue one typically encounters. These algorithms/structures are classic and ubiquitous, appearing widely in textbooks [20,33,36,65,66,68] and in real routing protocol libraries.

In addition to decades of use and textbook analysis, recent efforts have verified one or more of these algorithms in proof assistants and formally proved

claims about their behavior [12,15,30,45,53]. A reasonable person might think that all that can be said, has been. However, we have found that textbook code glosses over a cornucopia of issues that routinely crop up in real-world settings: under/overflows, integration with performant data structures, manual memory (de-)allocation, error handling, casts, memory alignment, *etc.* Further, previous verification efforts with formal checkers often operate within idealized formal environments, which likewise leads them to ignore the same kinds of issues.

In our work, we provide C implementations of each of these algorithms/data structures, and prove in Coq [71] the functional correctness of the same with respect to the formal semantics of CompCert C [50]. By "functional correctness" we mean natural algorithmic specifications; we do not prove resource bounds. Although our C code is developed from standard textbooks, we uncover several subtleties that are absent from the algorithmic and formal methods literature:

- §5 several potential overflows in binary heaps equipped with Floyd's linear-time build-heap function and an edit-priority operation.

We wish to develop general and reusable techniques for verifying graph-manipulating programs written in real programming languages. This is a significant challenge, and so we choose to leverage and/or extend three large existing proof developments to state and prove the full functional correctness of our code in Coq: CompCert; the Verified Software Toolchain [4] (VST) separation logic [59] deductive verifier; and our own previous efforts [73], hereafter dubbed the CertiGraph project. Our primary extensions are to the third, and include:


We prove that our pure machinery and our spatial machinery are well-isolated from each other by verifying several implementations (of each of Dijkstra and Prim) that represent graphs differently in memory but reuse the entire pure portion of the proof. Likewise, we show that our spatial reasoning is generic by reusing graph representations across Dijkstra and Prim. Our verification of Kruskal proves that we can reason about two graphs simultaneously: a directed graph with vertex labels for union-find and an undirected graph with edge labels for which we are building a spanning forest. In addition to our verification of Dijkstra, Prim, and Kruskal, we develop increased lemma support for the preexisting CertiGraph union-find example [73]. Our extension to "base VST" (*e.g.*, verifications without graphs) primarily consists of our verified binary heap.

The remainder of this paper is organized as follows:


Our results are completely machine-checked in Coq and publicly available [1].

# **2 Extensions to CertiGraph**

We begin with the briefest of introductions to CertiGraph's core structure and then detail the extensions we make to various levels of CertiGraph in service of our Dijkstra, Prim, and Kruskal verifications. Ignoring modularity and eliding elements not used in this work, a mathematical graph in CertiGraph is a tuple: (V, E, vvalid, evalid, src, dst, vlabel, elabel, sound). Here V/E are the carrier types of vertices/edges, vvalid/evalid place restrictions specifying whether a vertex/edge is valid<sup>1</sup>, and src/dst : E→V map edges to their source/destination. Labels are allowed on vertices and edges, and a soundness condition allows custom application-specific restrictions [73]. Mathematical graphs connect to graphs in computer memory via spatial predicates in separation logic.

#### **2.1 Pure Reasoning for Adjacency Matrix-Represented Graphs**

Two of our algorithms operate over graphs represented as adjacency matrices. Not every legal graph can be represented as an adjacency matrix, so we develop a unified, reusable, and extendable soundness condition SoundAdjMat that a graph must satisfy in order for it to be represented as an adjacency matrix.

SoundAdjMat is parameterized by the graph's size and a distinguished number inf. We restrict most fields in the tuple: (V = Z, E = Z × Z, vvalid = λv. 0 ≤ v < size, evalid = ..., src = *fst*, dst = *snd*, vlabel, elabel, sound = ...). We also restrict the carrier type of vertex labels to unit

<sup>1</sup> Validity denotes presence in the graph: *e.g.*, if we are using Z as the carrier type V and have only 7 vertices, then vvalid(*x*) is probably the proposition 0 ≤ *x* < 7.

and edge labels to Z. We require the parameters size and inf be strictly positive and representable on the machine. Most critical, however, is the semantics of evalid: a valid edge must have a machine-representable label and that label cannot have value inf; an invalid edge *must* have label inf. Last, the graph must be finite.
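The evalid discipline can be sketched in ordinary code. This is our own Python stand-in for the Coq condition, with `INT_MAX` modeling machine representability; the function name and matrix encoding are ours, not CertiGraph's.

```python
INT_MAX = 2**31 - 1  # modeling a 32-bit machine's representable range

def sound_adj_mat(mat, size, inf):
    """Check the SoundAdjMat edge discipline on a size x size label matrix:
    a valid edge has a representable label different from inf, and an
    invalid edge (the absence of an edge) is marked exactly by inf."""
    # size and inf must be strictly positive and representable.
    if not (0 < size <= INT_MAX and 0 < inf <= INT_MAX):
        return False
    for row in mat:
        for label in row:
            # Every cell is either the distinguished inf (no edge here)
            # or a machine-representable weight different from inf.
            if label != inf and not (-INT_MAX - 1 <= label <= INT_MAX):
                return False
    return True

inf = INT_MAX
assert sound_adj_mat([[inf, 3], [5, inf]], 2, inf)
assert not sound_adj_mat([[inf, 3], [5, inf]], 2, 0)  # inf must be positive
```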

The restriction on edge labels is necessary because we are working with labeled adjacency matrices on a real system: we need to set aside a distinguished number inf such that an edge weight of inf indicates the *absence* of an edge. We cannot prescribe a single inf because client needs can vary widely. For instance, our verifications of Dijkstra's and Prim's algorithms require subtly different infs.

SoundAdjMat guarantees spatial representability as an adjacency matrix, but it can be extended with further algorithm-specific restrictions before being plugged in for sound. Dijkstra's algorithm requires nonnegative edge weights, and—as we will discuss in §3.2—nontrivial restrictions on size and inf.

# **2.2 New Spatial Representations for Edge-Labeled Graphs**

We give predicates for adjacency matrices and edge lists for edge-labeled graphs.

**Adjacency Matrices.** Adjacency matrices enable efficient label access for edge-labeled graphs. We support three common adjacency matrix representations: a stack-allocated 2D array int graph[size][size], a stack-allocated 1D array int graph[size×size], and a heap-allocated 2D array int \*\*graph. To the casual observer, these are essentially interchangeable, but conflating them is a mistake when thinking spatially. Apart from the arithmetic that the second flavor uses to access cells, there is a more subtle point: the first and second enjoy a contiguous block of memory, but the third does not: it is an allocated "spine" with pointers to separately-allocated rows. For a taste, the spatial representation of the first is:

$$\begin{array}{rcl} \mathsf{arr\_addr}(ptr, i, \texttt{size}) & \stackrel{\Delta}{=} & ptr + (i \times \texttt{size})\\ \mathsf{array}(ptr, list) & \stackrel{\Delta}{=} & \mathop{\ast}_{i}\ \big((ptr + i) \mapsto list[i]\big)\\ \mathsf{arr\_rep}(\gamma, i, ptr) & \stackrel{\Delta}{=} & \mathsf{let}\ row := \texttt{graph2mat}(\gamma)[i]\ \mathsf{in}\ \mathsf{array}\big(\mathsf{arr\_addr}(ptr, i, \texttt{size}), row\big)\\ \mathsf{graph\_rep}(\gamma, g\_addr, \_) & \stackrel{\Delta}{=} & \mathop{\ast}_{v}\ \mathsf{arr\_rep}(\gamma, v, g\_addr) \end{array}$$

We use the separation logic ∗ in its iterated form to say that the arrays are separate in memory. We elide details relating to object sizes, pointer alignment, and so forth, although our formal proofs handle such matters. Of particular note are graph2mat, which performs two projections to drag out the graph's nested edge labels into a 2D matrix, and arr_addr, which in this instance simply computes the address of any legal row *i* from the base address of the graph. Notice that this graph_rep predicate ignores its third argument. To represent a heap-allocated 2D array we can still use graph2mat but can no longer use address arithmetic; the third parameter is then a list of pointers to the row sub-arrays.
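The address arithmetic for the contiguous flavors can be sketched as follows. The `graph2mat` and `arr_addr` stand-ins here are our own executable illustrations of the idea, not the Coq definitions; edges are keyed by (src, dst) pairs.

```python
def arr_addr(base, i, size):
    """Offset of row i in a contiguous size x size matrix (row-major)."""
    return base + i * size

def graph2mat(elabel, size, inf):
    """Project a graph's edge labels into a 2D matrix, writing the
    distinguished inf wherever an edge is absent."""
    return [[elabel.get((u, v), inf) for v in range(size)]
            for u in range(size)]

def flatten(mat):
    """Lay the matrix out the way the 1D array int graph[size*size] would."""
    return [cell for row in mat for cell in row]

inf = 2**31 - 1
mat = graph2mat({(0, 1): 7, (1, 0): 7}, 2, inf)
flat = flatten(mat)
# Cell (i, j) of the 2D view and the 1D view agree via arr_addr.
assert flat[arr_addr(0, 1, 2) + 0] == mat[1][0] == 7
```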

While ironing out these spatial wrinkles, we develop utilities that easily unfold and refold our adjacency matrices, thus smoothing user experience when reading and writing arrays and cells. Of course these utilities themselves vary by flavor of representation, but the net effect is that users of our adjacency matrices really can be agnostic to the style of representation they are using (see §3.1).

**Edge Lists.** Edge lists are the representation of choice for sparse graphs. Our C implementation defines an edge as a struct containing src, dst, and weight, and defines a graph as a struct containing the graph's size, edge count, and an array of edges. Our spatial representation follows this pattern:

$$\mathsf{graph\_rep}(\gamma, g\_addr, e\_addr) \stackrel{\Delta}{=} g\_addr \mapsto \big(|\gamma.V|, |\gamma.E|, e\_addr\big) \ast \mathsf{array}(e\_addr, \gamma.E)$$

#### **2.3 Undirectedness in a Directed World**

The CertiGraph library presented in [73] supports only directed graphs and, as we have seen, bakes direction-reliant idioms such as src and dst deep into its development. Our challenge is to add support for undirected graphs on top of this.

Our approach is to observe that every directed graph can be treated as an undirected graph by ignoring edge direction. We develop a lightweight layer of "undirected flavored" definitions on top of the existing "directed flavored" definitions, state and prove connections between these, and then build the undirected infrastructure we need. The result is that we retain full access to CertiGraph's graph-theory formalizations modulo some mathematical bridging.

Our basic "undirected flavored" definitions are standard [20]. Vertices *u* and *v* are adjacent if there is an edge between them in either direction; vertices are self-adjacent. A valid upath (undirected path) is a list of valid vertices that form a pairwise-adjacent chain. Two vertices are connected when a valid upath features them as head and foot (essentially the transitive closure of adjacent).
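These definitions can be given executable intuition over a directed edge set. The sketch below is our own illustration (not CertiGraph code): edges are (src, dst) pairs whose direction is deliberately ignored.

```python
from collections import deque

def adjacent(edges, u, v):
    """u and v are adjacent if some edge joins them in either direction;
    every vertex is self-adjacent."""
    return u == v or (u, v) in edges or (v, u) in edges

def connected(vertices, edges, u, v):
    """Transitive closure of adjacency: some upath has u as head, v as foot."""
    seen, frontier = {u}, deque([u])
    while frontier:
        x = frontier.popleft()
        if x == v:
            return True
        for y in vertices:
            if y not in seen and adjacent(edges, x, y):
                seen.add(y)
                frontier.append(y)
    return False

V = {0, 1, 2, 3}
E = {(0, 1), (2, 1)}          # edge directions mixed on purpose
assert connected(V, E, 0, 2)  # 0 - 1 - 2, ignoring direction
assert not connected(V, E, 0, 3)
```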

The definitions above sync up with preexisting "directed flavored" definitions. Intuitively, undirectedness is more lax than directedness, and so it is unsurprising that these connections are straightforward weakenings of directed properties. We next give standard definitions [20] that culminate in minimum spanning forest, which is exactly the postcondition of both Prim's and Kruskal's algorithms.<sup>2</sup>

An undirected cycle (ucycle) is a valid non-empty upath whose first and last vertices are equal. A connected graph is one in which any two valid vertices are connected. is partial graph f g means that everything in f is also in g. We proceed:

```
1 Definition uforest g :=
2 (∀ e, evalid g e → strong_evalid g e) ∧
3 (∀ p l, ¬ucycle g p l).
4 Definition spanning g g' :=
5 ∀ u v, connected g u v ↔ connected g' u v.
```
<sup>2</sup> That Prim's postcondition has a *forest* may raise an eyebrow. See §4.2.

```
6 Definition spanning_uforest f g :=
7 is_partial_graph f g ∧ uforest f ∧ spanning f g.
```
The strong evalid predicate means that the src and dst of the edge are also valid, so *e.g.*, a valid edge cannot point to a deleted/absent vertex. The second conjunct of uforest is critical: a forest has no undirected cycles. The other definitions are straightforward from there, and minimum spanning forest f g means that no other spanning forest has lower total edge cost than f.

Our undirected work is also compatible with our new developments in §2.1 and §2.2. An adjacency matrix-representable undirected graph has all the pure properties discussed in SoundAdjMat, and also has symmetry across the left diagonal. We extend SoundAdjMat into SoundUAdjMat by requiring that all valid edges have src ≤ dst. This effectively "turns off" the matrix on one half of the diagonal and avoids double-counting. Prim's algorithm uses SoundUAdjMat and places no further restrictions. Further, spatial representations and fold/unfold utilities are shared across directed and undirected adjacency matrices.
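The src ≤ dst convention can be mirrored computationally. In this hypothetical sketch (ours, not the library's), an undirected edge is normalized so that src ≤ dst, and a symmetric matrix is visited on only one side of the diagonal, avoiding double-counting:

```c
#include <assert.h>

/* Illustrative sketch of the SoundUAdjMat convention: every valid
   undirected edge is stored with src <= dst. Names are ours. */
typedef struct { int src, dst; } UEdge;

static UEdge normalize(int u, int v) {
    UEdge e = { u <= v ? u : v, u <= v ? v : u };
    return e;
}

/* Sum the edge weights of a symmetric size x size matrix without
   double-counting: visit only cells with row <= col. Cells holding
   inf denote absent edges. */
static long total_weight(int size, const int *m, int inf) {
    long total = 0;
    for (int r = 0; r < size; r++)
        for (int c = r; c < size; c++) /* the src <= dst half only */
            if (m[r * size + c] < inf)
                total += m[r * size + c];
    return total;
}
```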

# **3 Shortest Path**

We verify a standard C implementation of Dijkstra's algorithm. We first sketch our proof in some detail with an emphasis on our loop invariants, then uncover and remedy a subtle overflow bug, and finish with a discussion of related work.

# **3.1 Verified Dijkstra's Algorithm in C**

Figure 1 shows the code and proof sketch of Dijkstra's algorithm. Red text is used in the figure to highlight changes compared to the annotation immediately prior. Our code is implemented exactly as suggested by CLRS [20], so we refer readers there for a general discussion of the algorithm. The adjacency-matrix-represented graph γ of size vertices is passed as the parameter g along with the source vertex src and two allocated arrays dist and prev. The spatial predicate array(x, *v*), which connects an array pointer x with its contents *v*, is standard and unexciting. PQ(pq, *heap*) is the spatial representation of our priority queue (PQ) and Item(i, (*key*, *pri*, *data*)) lays out a struct that we use to interact with the PQ; we leave the management of the PQ to the operations described in §5. Of greater interest is AdjMat(g, γ), which, as explained in §2.2, links the concrete memory values of g to an abstract mathematical graph γ, which in turn exposes an interface in the language of graph theory (*e.g.*, vertices, edges, labels). Graph γ satisfies the general adjacency matrix restrictions given in §2.1 along with some further Dijkstra-specific restrictions to be explained in §3.2. We verify Dijkstra three times using different adjacency-matrix representations as explained in §2.2. Thanks to some careful engineering, the C code and the Coq verification are both almost completely agnostic to the form of representation. The only variation between implementations is when reading a cell (line 15), so we refactor this out into a straightforward helper function and verify it separately; accordingly, the proof bases for the three variants differ by less than 1%.

```
1  void dijkstra(int **g, int src, int *dist,
2                int *prev, int size, int inf) {
3  // { AdjMat(g, γ) ∗ array(dist, _) ∗ array(prev, _) ∧ src ∈ γ ∧ connected(γ, src) }
4    Item *temp = (Item *) mallocN(sizeof(Item));
5    int *keys = mallocN(size * sizeof(int));
6    PQ *pq = pq_make(size); int i, u, cost;
7    for (i = 0; i < size; i++)
8      { dist[i] = inf; prev[i] = inf; keys[i] = pq_push(pq, inf, i); }
9    dist[src] = 0; prev[src] = src; pq_edit_priority(pq, keys[src], 0);
10   while (pq_size(pq) > 0) {
11 // { ∃ dist, prev, popped, heap. AdjMat(g, γ) ∗ PQ(pq, heap) ∗ Item(temp, _) ∗
//      array(dist, dist) ∗ array(prev, prev) ∗ array(keys, keys) ∧
//      linked_correctly(γ, heap, keys, dist, popped) ∧
//      dijk_correct(γ, src, popped, prev, dist) }
12     pq_pop(pq, temp); u = temp->data;
13     for (i = 0; i < size; i++) {
14 // { ∃ dist′, prev′, heap′. AdjMat(g, γ) ∗ PQ(pq, heap′) ∗
//      array(dist, dist′) ∗ array(prev, prev′) ∗ array(keys, keys) ∗
//      Item(temp, (keys[u], dist[u], u)) ∧ min(dist[u], heap′) ∧
//      linked_correctly(γ, heap′, keys, dist′, popped ∪ {u}) ∧
//      dijk_correct_weak(γ, src, popped ∪ {u}, prev′, dist′, i, u) }
15       cost = getCell(g, u, i);
16       if (cost < inf) {
17         if (dist[i] > dist[u] + cost) {
18           dist[i] = dist[u] + cost; prev[i] = u;
19           pq_edit_priority(pq, keys[i], dist[i]);
20 }}}} // { ∃ dist′′, prev′′. AdjMat(g, γ) ∗ PQ(pq, ∅) ∗ Item(temp, _) ∗
      //     array(dist, dist′′) ∗ array(prev, prev′′) ∗ array(keys, keys) ∧
      //     ∀ dst. dst ∈ γ → inv_popped(γ, src, γ.V, prev′′, dist′′, dst) }
21   freeN(temp); pq_free(pq); freeN(keys); return; }
```
**Fig. 1.** C code and proof sketch for Dijkstra's algorithm.

Dijkstra's algorithm uses a PQ to greedily choose the cheapest unoptimized vertex on line 12. The best-known distances to vertices are expected to improve as various edges are relaxed, and such improvements need to be logged in the PQ: Dijkstra's algorithm implicitly assumes that its PQ supports the additional operation decrease priority. Our "advanced" PQ (§5.3) supports this operation in logarithmic time with the pq edit priority function<sup>3</sup>.
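To illustrate how a PQ can support this operation, here is a minimal toy min-heap sketch (our own code, not the verified §5 implementation; the names loosely echo the paper's API). Each pushed item gets a stable key, and a position table records where that item currently sits in the heap array, so an edited item can be re-sifted in logarithmic time:

```c
#include <assert.h>

/* Toy array-based min-heap supporting edit_priority. pos[key] tracks
   the current index of the item with that key. Names are ours. */
enum { CAP = 16 };
typedef struct { int pri[CAP]; int key[CAP]; int pos[CAP]; int n; } PQ;

static void swap(PQ *q, int i, int j) {
    int tp = q->pri[i], tk = q->key[i];
    q->pri[i] = q->pri[j]; q->key[i] = q->key[j];
    q->pri[j] = tp;        q->key[j] = tk;
    q->pos[q->key[i]] = i; q->pos[q->key[j]] = j;
}

static void swim(PQ *q, int i) {
    while (i > 0 && q->pri[i] < q->pri[(i - 1) / 2]) {
        swap(q, i, (i - 1) / 2);
        i = (i - 1) / 2;
    }
}

static void sink(PQ *q, int i) {
    for (;;) {
        int j = 2 * i + 1;
        if (j >= q->n) return;
        if (j + 1 < q->n && q->pri[j + 1] < q->pri[j]) j++;
        if (q->pri[i] <= q->pri[j]) return;
        swap(q, i, j);
        i = j;
    }
}

static int pq_push(PQ *q, int pri) { /* returns the item's stable key */
    int k = q->n;
    q->pri[k] = pri; q->key[k] = k; q->pos[k] = k; q->n++;
    swim(q, k);
    return k;
}

static int pq_pop(PQ *q) { /* returns the key of the minimum item */
    int k = q->key[0];
    swap(q, 0, q->n - 1);
    q->n--;
    sink(q, 0);
    return k;
}

static void pq_edit_priority(PQ *q, int key, int pri) {
    int i = q->pos[key];
    q->pri[i] = pri;
    swim(q, i); sink(q, i); /* at most one of these actually moves it */
}
```

A decrease falls out as the swim case and an increase as the sink case, which is exactly the generalization from decrease priority to edit priority.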

The first nine lines are standard setup. The *keys* array, assigned on line 8, is thereafter a mathematical constant. The pure predicate *linked correctly* contains the plumbing connecting the various mathematical arrays. The verification turns on the loop invariants on lines 11 and 14. The pure while invariant

<sup>3</sup> Because decrease priority is relatively complex to implement, several popular workarounds (*e.g.* [12]) use simpler PQs at the cost of decreased performance.

*dijk correct*(γ, *src*, *popped*, *prev*, *dist*) essentially unfolds into:

∀*dst*. *dst* ∈ γ → *inv popped*(γ, *src*, *popped*, *prev*, *dist*, *dst*) ∧ *inv unpopped*(γ, *src*, *popped*, *prev*, *dist*, *dst*) ∧ *inv unseen*(γ, *src*, *popped*, *prev*, *dist*, *dst*)

That is, a destination vertex *dst* falls into one of three categories:

1. *inv popped*: *dst* has been popped, and its *dist* and *prev* entries record a globally optimal path from *src* (or *dst* is altogether unreachable and its entries record inf).
2. *inv unpopped*: *dst* is still in the PQ with *dist*[*dst*] < inf, and its *dist* and *prev* entries record the best path from *src* that routes only through popped vertices.
3. *inv unseen*: *dst* is still in the PQ with *dist*[*dst*] = inf, and no edge reaches *dst* from any popped vertex.
After line 12, the above invariant is no longer true: a minimum-cost item *u* has been popped from the PQ, and so the *dist* and *prev* arrays need to be updated to account for this pop. The for loop does exactly this repair work. Its pure invariant *dijk correct weak*(γ, *src*, *popped*, *prev*, *dist*, *u*, *i*) essentially unfolds into:


We now have five cases, many of which are familiar from *dijk correct*:


5. *inv unseen weak* (between *i* and size): no edge exists from any previously-popped vertex to *dst*, but there may be one from *u*. As *i* increments, we consider whether routing via *u* reveals a path to *dst*. This is strengthened into *inv unpopped* if so, and into *inv unseen* if not.

At the end of the for loop the fourth and fifth cases fall away (*i* = size), and the PQ and the *dist* and *prev* arrays finish "catching up" to the pop on line 12. This allows us to infer the while invariant *dijk correct*, and thus continue the while loop. The while loop itself breaks when all vertices have been popped and processed. The second and third clauses of the while loop invariant *dijk correct* then fall away, as seen on line 20: all vertices satisfy *inv popped*, and are either optimally reachable or altogether unreachable. We are done.

#### **3.2 Overflow in Dijkstra's Algorithm**

Dijkstra's algorithm clearly cannot work when a path cost is more than INT MAX. A reasonable-looking restriction is to bound edge costs by INT MAX/(size − 1), since the longest optimal path has size − 1 links and so the most expensive possible path costs no more than INT MAX. However, this has two flaws.

First, since we are writing real code in C, rather than pseudocode in an idealized setting, we must reserve some concrete int value inf for "infinity". Suppose we set inf = INT MAX, and that size − 1 divides INT MAX. Now the longest path can have cost (size − 1) · INT MAX/(size − 1) = INT MAX = inf. This creates an unpleasant ambiguity: we cannot tell if the farthest vertex is unreachable, or if it is reachable with legitimate cost INT MAX. We need to adjust our maximum edge weights to leave room for inf; using (INT MAX − 1)/(size − 1) solves this first issue.

Second, even though the best-known distances start at inf (see line 8) and only ever decrease from there, the code can overflow on lines 17 and 18. Consider applying Dijkstra's algorithm on a 32-bit unsigned machine to the graph in Fig. 2. The size of the graph is 3 nodes, and the proposed edge-weight upper bound is (INT MAX − 1)/(size − 1) = ((2<sup>32</sup> − 1) − 1)/(3 − 1) = 2<sup>31</sup> − 1, for example as in the graph pictured in Fig. 2. A glance at the figure shows that the true distances from the source A to vertices B and C are 2<sup>31</sup> − 1 and 2<sup>32</sup> − 2 respectively. Both values are representable with 32 bits, and neither distance is inf = 2<sup>32</sup> − 1, so naïvely all seems well. Unfortunately, Dijkstra's algorithm does not exactly work like that.

After processing vertices A and B, 2<sup>31</sup> − 1 and 2<sup>32</sup> − 2 *are* the costs reflected in the dist array for B and C respectively—*but unfortunately vertex C is still in the priority queue*. After vertex C is popped on line 12, we fetch its neighbors in the for loop; the cost from C to B (2<sup>31</sup> − 1) is fetched on line 15. On line 17 the currently optimal cost to B (2<sup>31</sup> − 1) is compared with the sum of the optimal cost to C (2<sup>32</sup> − 2) plus the just-retrieved cost of the edge from C to B (2<sup>31</sup> − 1). Naïvely, (2<sup>32</sup> − 2) + (2<sup>31</sup> − 1) is *greater than* the currently optimal cost 2<sup>31</sup> − 1, so the algorithm should stick with the latter. However, (2<sup>32</sup> − 2) + (2<sup>31</sup> − 1) overflows, with ((2<sup>32</sup> − 2) + (2<sup>31</sup> − 1)) mod 2<sup>32</sup> = 2<sup>31</sup> − 3, which is *less than* 2<sup>31</sup> − 1! Thus the code decides that a new cheaper path from A to B exists (in particular, A-B-C-B) and then trashes the dist and prev arrays on line 18.

**Fig. 2.** A graph that will result in overflow on a 32-bit machine.

Our code uses signed int rather than unsigned int so we have undefined behavior rather than defined-but-wrong behavior, but the essence of the overflow is identical. We ensure that the "probing edge" does not overflow by restricting the maximum edge cost further, from (INT MAX − 1)/(size − 1) to INT MAX/size. In Fig. 2, edge weights should be bounded by (2<sup>32</sup> − 1)/3 = 1,431,655,765; call this value w. Suppose we change the edge weights in Fig. 2 from 2<sup>31</sup> − 1 to w. Now vertex B has distance w and C has distance 2 · w. When we remove C from the priority queue, the comparison on line 17 is between the known best cost to B (*i.e.*, w) and the candidate best cost to B via C (*i.e.*, 3 · w = 2<sup>32</sup> − 1 = INT MAX). There is no overflow, so the candidate is rejected and the code behaves as advertised.
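The arithmetic above can be checked mechanically. The following is our own demonstration (not the verified code), using fixed-width unsigned arithmetic to mirror the 32-bit unsigned machine of the example:

```c
#include <assert.h>
#include <stdint.h>

/* Reproduce the comparison on line 17 of Fig. 1: dist[u] + cost wraps
   modulo 2^32 and can appear *cheaper* than an already-optimal dist[i]. */
static int relax_would_fire(uint32_t dist_i, uint32_t dist_u, uint32_t cost) {
    return dist_i > (uint32_t)(dist_u + cost); /* sum wraps modulo 2^32 */
}
```

Running it on the two weight choices discussed above shows the bogus relaxation firing under the loose bound and being correctly rejected under the tightened bound.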

We fold these new restrictions into the mathematical graph γ. In addition to the bounds discussed above, we require a few other more straightforward bounds: edge costs must be non-negative, as is typical for Dijkstra; 4 · size ≤ INT MAX, to ensure that the multiplication in the malloc on line 5 does not overflow; and (INT MAX/size) · (size − 1) < inf, so no valid path has cost inf. These bounds are optimal: if the input is any less restricted, the postcondition will fail. The last restriction on inf is not sufficient when size = 1, so in that special case we further require that any (self-loop) edges cost less than inf. Whenever 0 < 4 · size ≤ INT MAX, the restrictions on inf are satisfiable with inf ≜ INT MAX.
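These side conditions are simple enough to sketch as a runtime check. The helper below is our own illustration of the bounds just listed, not the verified specification:

```c
#include <assert.h>
#include <limits.h>

/* Sketch (ours) of the side conditions folded into γ: non-negative
   edge costs bounded by INT_MAX/size, 4*size small enough for the
   malloc on line 5, and inf high enough that no valid path costs inf
   (with the self-loop special case when size = 1). */
static int graph_bounds_ok(int size, int inf, int max_edge_cost) {
    if (size < 1 || 4 > INT_MAX / size) return 0;      /* 4*size <= INT_MAX  */
    if (max_edge_cost < 0 || max_edge_cost > INT_MAX / size) return 0;
    if (size == 1) return max_edge_cost < inf;         /* self-loop case     */
    return (INT_MAX / size) * (size - 1) < inf;        /* no path costs inf  */
}
```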

#### **3.3 Related Work on Dijkstra in Algorithms and Formal Methods**

We were not able to find a reference that gives a robust, precise, and full description of the overflow issue we describe above. Dijkstra's original paper [22] ignores the issue, as do the standard textbooks *Introduction to Algorithms* (*a.k.a.* CLRS) by Cormen *et al.* [20] and *Algorithm Design* by Kleinberg and Tardos [38]. Sedgewick's book on graph algorithms in C [66] sidesteps the overflow in line 17 by requiring that weights be in double, which *does* have a well-defined positive infinity value and cannot overflow in the traditional sense; Sedgewick and Wayne's *Algorithms* textbook in Java does the same [67]. However, Sedgewick's sidestep entails enduring the inevitable round-off intrinsic to floating-point arithmetic, and so his algorithm computes approximate optimal costs rather than exact ones. Sedgewick does not specify any bounds on input edge weights, and accordingly does not (and cannot) provide any bound on this accumulated error. Sedgewick is also silent on how to handle an int-weighted input graph. Skiena's *Algorithm Design Manual* [68] contains code with exactly the bug we identify: he uses integer weights and does not specify any bounds. To its credit, Heineman *et al.*'s *Algorithms in a Nutshell* [33] takes int edge weights as inputs and mentions overflow as a possibility. Heineman *et al.* hustle their way around this overflow by performing the arithmetic in line 17 in long. However, this cast does not really handle the problem in a fundamental way: if edge weights are given in long rather than int, then it would be necessary to cast to long long; if edge weights are given in long long, then Heineman's hustle breaks as there is no bigger type to which to cast. Moreover, Heineman *et al.* do not bound edge weights, so when the cumulative edge weights are too high their code fails silently.

Chen verified Dijkstra in Mizar [15], Gordon *et al.* formalized the reachability property in HOL [29], Moore and Zhang verified it in ACL2 [53], Mange and Kuhn in Jahob [52], Filliâtre in Why3 [25], and Klasen in KeY [37]. Liu *et al.* took an alternative SMT-based approach to verify a Java implementation of Dijkstra [51]. The most recent effort (2019) is by Lammich *et al.*, working within Isabelle/HOL, although they only return the weight of the shortest path rather than the path itself [45]. In general, the previous mechanized proofs on Dijkstra verify code defined within idealized formal environments, *e.g.* with unbounded integers rather than machine ints and a distinguished non-integer value for infinity. No previous work mentions the overflow we uncover.

### **4 Minimum Spanning Trees**

Here we discuss our verifications of the classic MST algorithms Prim and Kruskal. Although our machine-checked proofs are about real C code, in this section we take a higher-level approach than we did in §3, focusing on our key algorithmic findings and overall experience. Accordingly, we only provide pseudocode for Prim's algorithm rather than a decorated program and do not show any code for Kruskal's. Our development contains our C code and formal proofs [1].

#### **4.1 Prim's Algorithm**

We put the pseudocode for Prim's algorithm in Fig. 3; the code on the left-hand side is directly from CLRS, whereas the code on the right omits line 5 and will be discussed in §4.2. Note that line 12 contains an implicit call to the PQ's edit priority. Since the pseudocode only compares keys (*i.e.*, edge weights) rather than doing arithmetic on them *à la* Dijkstra, there are no potential overflows and it is reasonable to set INF to INT MAX in C.

Indeed, our initial verifications of C code were largely "turning the crank" once we had the definitions and associated lemma support for pure/abstract undirected graphs, forests, *etc.* discussed in §2.3. Accordingly, our initial contribution was a demonstration that this new graph machinery was sufficient to verify real code. We also showed that our extensions to CertiGraph from §2 were generic rather than verification-specific by reusing much of the pure and spatial reasoning that had originally been developed for our verification of Dijkstra.

**Fig. 3.** Left: Prim's algorithm from CLRS [20]. Right: the same omitting line 5.

### **4.2 Prim's Algorithm Handles Multiple Components Out of the Box**

Textbook discussions of Prim's algorithm are usually limited to single-component input graphs (*a.k.a.* connected graphs), producing a minimum spanning tree. It is widely believed that Prim's is not directly applicable to graphs with multiple components, which should produce a minimum spanning forest. For example, both Rozen [65] and Sedgewick *et al.* [66,67] leave the extension to multiple components as an exercise for the reader, whereas Kepner and Gilbert suggest that multiple-component graphs should be handled by first finding the components and then running Prim on each component [36].

After we completed our initial verification, a close examination of our formal invariants showed us that the algorithm *exactly as given by standard textbooks* will properly handle multi-component graphs *in a single run*. The confusion starts because, in a connected graph, any vertex u removed from the PQ on line 8 must have u.key < INF; *i.e.*, u must be immediately reachable from the spanning tree that is in the process of being built. However, nothing in the code relies upon this connectedness fact! All we need is that u is the "closest vertex" to the "current component." If u.key = INF *and* u is a minimum of the PQ, then it simply means that the "previous component" is done, and we have started spanning tree construction on a new unconnected component "rooted" at u, yielding a forest. The node u's parent will remain NIL, as it was after the setup loop on line 4, indicating that it is the root of a spanning tree. Its key will be INF rather than 0, but the keys are *internal to Prim's algorithm*: clients only get back the spanning forest as encoded in the parent pointers<sup>4</sup>.

Having made this discovery, we updated our proofs to support the new weaker precondition, which is what we currently formally verify in Coq [71]. A little further thought led to the realization that since Prim can handle arbitrary numbers

<sup>4</sup> The keys simply record the edge-weight connecting a vertex to its candidate parent; recall that line 12 is really a call to the PQ's edit priority. If a client wishes to know this edge weight, it can simply look up the edge in the graph.

of components, the initialization of the root's key in line 5 is in fact unnecessary. Accordingly, if we remove this line and the associated function argument r from MST-PRIM (*i.e.*, the code on the right half of Fig. 3), the algorithm still works correctly. Moreover, *the program invariants become simpler* because we no longer need to treat a specified vertex (r) in a distinguished manner. Our formal development verifies this version of the algorithm as well [1].
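To see the behavior concretely, here is our own toy transcription of the right-hand variant of Fig. 3 (no root initialization), with a linear scan standing in for the PQ; all names are ours, not the verified sources. Run on a two-component graph, it leaves one NIL parent per component, encoding a spanning forest:

```c
#include <assert.h>
#include <limits.h>

/* Toy MST-PRIM without the root-initialization step (right half of
   Fig. 3). The "PQ" is a linear scan over keys; NIL parents after the
   run mark the root of each spanning tree in the forest. */
enum { NIL = -1, N = 5 };

static void prim_no_root(int g[N][N], int parent[N]) {
    int key[N], in_pq[N];
    for (int v = 0; v < N; v++) {
        key[v] = INT_MAX; /* every key starts at INF: no line 5 */
        parent[v] = NIL;
        in_pq[v] = 1;
    }
    for (int it = 0; it < N; it++) {
        int u = -1; /* extract-min over the "PQ" */
        for (int v = 0; v < N; v++)
            if (in_pq[v] && (u == -1 || key[v] < key[u])) u = v;
        in_pq[u] = 0; /* key[u] may still be INF: u roots a new component */
        for (int v = 0; v < N; v++)
            if (g[u][v] && in_pq[v] && g[u][v] < key[v]) {
                parent[v] = u;     /* candidate parent            */
                key[v] = g[u][v];  /* the implicit edit_priority  */
            }
    }
}
```

Popping a vertex whose key is still INF simply starts the next tree; no connectedness precondition and no per-component restart is needed.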

#### **4.3 Related Work on Prim in Algorithms and Formal Methods**

Prim's algorithm was in fact first developed by the Czech mathematician Vojtěch Jarník in 1930 [35] before being rediscovered by Robert Prim in 1957 [61] and a third time by Edsger W. Dijkstra in 1959 [22]. Both Prim's and Dijkstra's treatments explicitly assume a connected graph; although we cannot read Czech, some time with Google Translate suggests that Jarník's treatment probably does the same. The textbooks we surveyed [20,36,38,65–68] seem to derive from Prim's and/or Dijkstra's treatment. More casual references such as Wikipedia [3] and innumerable lecture slides are presumably derived from the textbooks cited. We have not found any references that state that Prim's algorithm *without modification* applies to multi-component graphs, even when executable code is provided: *e.g.*, Heineman *et al.* provide C++ code that aligns closely with our C code [33], but do not mention that their code works equally well on multi-component graphs. Sadly, many sources promulgate the false proposition that modifications to the algorithm are needed to handle multi-component graphs (*e.g.*, [3,36,65–67]). Likewise, we have found no reference that removes the initialization step (line 5 in Fig. 3) from the standard algorithm.

Prim's algorithm has been the focus of a few previous formalization efforts. Guttmann formalized and proved the correctness of Prim's algorithm using Stone-Kleene relation algebras in Isabelle/HOL [30]. He works in an idealized formal environment that does not require the development of explicit data structures; his code does not appear to be executable. Lammich *et al.* provided a verification of Prim's algorithm [45]. They also work within the idealized formal environment of Isabelle/HOL, but, in contrast to Guttmann, develop efficient purely functional data structures and extract them to executable code. Both Guttmann and Lammich explicitly require that the input graph be connected.

#### **4.4 Kruskal's Algorithm**

Although Kruskal's algorithm is sometimes presented as taking connected graphs and producing spanning trees, the literature also discusses the more general case of multi-component input graphs and spanning forests. However, Kruskal has only recently been the focus of formal verification efforts, partly because it relies on the notoriously difficult-to-verify union-find algorithm; fortunately, the CertiGraph project has an existing fully-verified union-find implementation that we can leverage [73]. Kruskal also requires a sorting function; we implemented heapsort as explained in §5.2. Kruskal is optimized for compact representations of sparse graphs, so the O(1) space cost of heapsort is a reasonable fit.

The primary interest of our verification of Kruskal lies in our proof engineering. Kruskal inputs graphs as edge lists rather than adjacency matrices. Besides requiring an addition to our spatial graph predicate menu, this means that Kruskal's input graphs can have multiple edges between a given pair of vertices (*i.e.*, a "multigraph"). Pleasingly, we can reuse most of the undirected graph definitions (§2.3), demonstrating that they are generic and reusable.

Another challenge is integrating the pre-existing CertiGraph verification of union-find. We are pleased to say that no change was required to CertiGraph's existing union-find definitions, lemmas, specifications, and verification. Kruskal actually manipulates two graphs simultaneously: a directed graph with vertex labels (to store parent pointers and ranks) within union-find, and an undirected multigraph with edge labels (for which the algorithm is constructing a spanning forest). Beyond showing that CertiGraph was capable of this kind of systems-integration challenge, we had to develop additional lemma support to bridge the directed notion of "reachability," used within the directed union-find graph, to the undirected notion of "connectedness," used in the MSF graph (§2.3).

#### **4.5 Related Work on Kruskal in Algorithms and Formal Methods**

Joseph Kruskal published his algorithm in 1956 [42] and it has appeared in numerous textbooks since (*e.g.*, [20,38,66,68]). Kruskal's algorithm is usually preferred over Prim's for sparse graphs, and is sometimes presented as "the right choice" when confronted with multi-component graphs under the mistaken assumption that Prim's first requires a component-finding initial step.

Guttmann generalized minimum spanning tree algorithms using Stone relation algebras [31], and provided a proof of Kruskal's algorithm formulated in said algebras. As in his work on Prim's [30], Guttmann works within Isabelle/HOL and does not include concrete data structures such as priority queues and union-find, instead capturing their action as equivalence relations in the underlying algebras. In his Kruskal paper, Guttmann mentions that his Prim paper axiomatizes the fact that "every finite graph has a minimum spanning forest," which he is then able to prove *using his Kruskal algorithm*. Interestingly, our Prim verification needs the same fact, but we prove it directly.

In a similar vein, Haslbeck *et al.* verified Kruskal's algorithm [32] by building on Lammich *et al.*'s earlier work on Prim [45]. Like Lammich *et al.*, Haslbeck *et al.* work within Isabelle/HOL with a focus on purely functional data structures.

One of the stumbling blocks in verifying Kruskal's algorithm is the need to verify union-find. In addition to CertiGraph [73], two recent efforts to certify union-find are by Charguéraud and Pottier, who also prove time complexity [14]; and by Filliâtre [26], whose proof benefits from a high degree of automation.

### **5 Verified Binary Heaps in C**

A binary heap embeds a heap-ordered tree in an array and uses arithmetic on indices to navigate between a parent and its left and right children [20]. In addition to providing the standard insert and remove-min/remove-max operations (depending on whether it is a min- or max-ordered heap) in logarithmic time, binary heaps can be upgraded to support two nontrivial operations. First, Floyd's heapify function builds a binary heap from an unordered array in linear time, and as a related upgrade, heapsort performs a worst-case linearithmic-time sort using only constant additional space. Second, binary heaps can be upgraded to support logarithmic-time decrease- and increase-priority operations, which we generalize straightforwardly into edit priority.
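The first upgrade can be sketched briefly. The following is our own illustration (not the verified C): Floyd's heapify sinks each internal node bottom-up, after which heapsort repeatedly swaps the maximum to the end of the shrinking array, using only constant extra space:

```c
#include <assert.h>

/* Toy sketch (ours) of Floyd's heapify and in-place heapsort on a
   max-heap of ints; the verified code works over Item structs. */
static void sink_max(int *a, int k, int n) {
    for (;;) {
        int j = 2 * k + 1; /* left child */
        if (j >= n) return;
        if (j + 1 < n && a[j + 1] > a[j]) j++; /* pick the larger child */
        if (a[k] >= a[j]) return;
        int t = a[k]; a[k] = a[j]; a[j] = t;
        k = j;
    }
}

static void heapsort_ints(int *a, int n) {
    for (int k = n / 2 - 1; k >= 0; k--) /* Floyd's heapify: linear time */
        sink_max(a, k, n);
    while (n > 1) { /* repeatedly move the max past the heap boundary */
        int t = a[0]; a[0] = a[--n]; a[n] = t;
        sink_max(a, 0, n);
    }
}
```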

Binary heaps are a good fit for our graph algorithms because Dijkstra's and Prim's algorithms need to edit priorities, and a constant-space heapsort is appropriate for the sparse edge-list-represented graphs typically targeted by Kruskal's. The C language has poor support for polymorphic higher-order functions, and a binary heap that supports edit priority is half as fast as a binary heap that does not. Accordingly, we implement binary heaps in C three times:


Priorities are of type int. The Kruskal-specific implementation is stripped down to the bare minimum required to implement heapsort (*e.g.*, it does not support insert). We next overview these verifications in three parts: basic heap operations, heapify and heapsort operations, and the edit priority operation.

#### **5.1 The Basic Heap Operations of Insertion and Min/Max-Removal**

Because we are juggling three implementations, we take some care to factor our verification to maximize reuse. First, each C implementation has its own exchange and comparison functions that handle the nitty-gritty of the payload and choose between a min or max heap. The following lines are from the "basic" implementation, in which the "payload" (data field) is of type void\*:

```
5 void exch(unsigned int j, unsigned int k, Item arr[]) {
6   int priority = arr[j].priority; void *data = arr[j].data;
7   arr[j].priority = arr[k].priority; arr[j].data = arr[k].data;
8   arr[k].priority = priority; arr[k].data = data; }
9 int less(unsigned int j, unsigned int k, Item arr[]) {
10   return (arr[j].priority <= arr[k].priority); }
```
These C functions are specified as refinements of Gallina functions that exchange polymorphic data in lists and compare objects in an abstract preordered set; we verify them in VST after a little irksome engineering. The payoff is that the key heap operations, which, following Sedgewick [66], we call swim and sink, can use identical C code (up to alpha renaming) in all three implementations:

```
11 void swim(unsigned int k, Item arr[]) {
12   while (k > ROOT_IDX && less(k, PARENT(k), arr)) {
13     exch(k, PARENT(k), arr); k = PARENT(k); } }
14 void sink(unsigned int k, Item arr[], unsigned int available) {
15   while (LEFT_CHILD(k) < available) {
16     unsigned j = LEFT_CHILD(k);
17     if (j+1 < available && less(j+1, j, arr)) j++;
18     if (less(k, j, arr)) break; exch(k, j, arr); k = j; } }
```
These functions involve a number of complexities, both at the algorithms level and at the semantics-of-C level. At the C level, there is the potential for a rather subtle bug in the macros ROOT IDX, PARENT, etc. Abstractly, these are simple: the root is at index 0; the children of x are at roughly 2x and the parent at roughly x/2, with ±1 as necessary. The danger is thinking that because the variables are unsigned int, all arithmetic will occur in this domain; in fact we must force the associated constants into unsigned int as well:

```
1 #define ROOT_IDX 0u
2 #define PARENT(x) (x-1u)/2u
3 #define LEFT_CHILD(x) (2u*x)+1u
4 #define RIGHT_CHILD(x) 2u*(x+1u)
```
A second C-semantics issue is the potential for overflow within LEFT CHILD and RIGHT CHILD (as well as the increments on line 17), and underflow within the PARENT macro (if x should ever be 0). To avoid this overflow, the precondition of sink requires that when k is in bounds (*i.e.*, k < available), then 2 ·(available−1) ≤ max unsigned. An edge case occurs when deleting the last element from a heap (k = available); we then require 2 · k ≤ max unsigned.
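The underflow in PARENT can be seen concretely. The following small check is our own (it restates the macros so it stands alone): evaluating PARENT at the root wraps around, which is exactly why swim guards with k > ROOT_IDX.

```c
#include <assert.h>
#include <limits.h>

/* The index macros from above, restated for a standalone check.
   Unsigned subtraction wraps modulo UINT_MAX + 1, so PARENT(0u) is
   not 0 but a huge index: the guard k > ROOT_IDX in swim is essential. */
#define ROOT_IDX 0u
#define PARENT(x) (x-1u)/2u
#define LEFT_CHILD(x) (2u*x)+1u
#define RIGHT_CHILD(x) 2u*(x+1u)
```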

At the algorithmic level, both the swim and sink functions involve nontrivial loop invariants; sink is complicated by the further need to support Floyd's heapify, during which a large portion of the array is unordered. Accordingly, we build Gallina models of both functions and show that they restore heap order given a mostly-ordered input heap. There are two different versions of "mostlyordered". Specifically, swim uses a "bottom-up" version:

```
5 Definition weak_heapOrdered2 (L : list A) (j : nat) : Prop :=
6 (∀ i b, i ≠ j → nth_error L i = Some b →
7 ∀ a, nth_error L (parent i) = Some a → a ≤ b) ∧
8 (grandsOk L j root_idx).
```
whereas sink uses a "top-down" version:

```
9 Definition weak_heapOrdered_bounded (L: list A) (k:nat) (j:nat) :=
10 (∀ i a, i ≥ k → i ≠ j → nth_error L i = Some a →
11 (∀ b, nth_error L (left_child i) = Some b → a ≤ b) ∧
12 (∀ c, nth_error L (right_child i) = Some c → a ≤ c)) ∧
13 (grandsOk L j k).
```
The parameter j indicates a "hole", at which the heap may not be heap-ordered; grandsOk bridges this hole by ordering the parent and the children of j:

```
1 Definition grandsOk (L : list A) (j : nat) (k : nat) : Prop :=
2 j ≠ root_idx → parent j ≥ k →
3 ∀ gs bb, parent gs = j → nth_error L gs = Some bb →
4 ∀ a, nth_error L (parent j) = Some a → a ≤ bb.
```
The parameter k is used to support Floyd's heapify: it bounds the portion of the list in which elements are heap-ordered (with the exception of j). The proofs that the Gallina swim and sink can restore (bounded) heap-orderedness involve a number of edge cases, but go through given the above definitions. The invariants of the C versions of swim and sink are stated via the associated Gallina versions, thereby delegating all heap-ordering proofs to the Gallina versions.

The insertion and removal functions we verify are in fact "non-checking" versions (insert_nc and remove_nc): their preconditions assume there is room in the heap to add or an item in the heap to remove. In the context of Dijkstra and Prim, these preconditions can be proven to hold. The associated verifications involve a little separation logic hackery (specifically, to Frame away the "junk" part of the heap-array from the "live" part), but are straightforward using VST. We avoid the overflow issue in sink by bounding the maximum capacity of the heap: 4 ≤ 12 · capacity ≤ max_unsigned; the magic number 12 comes from the size of the underlying data structure in C. We require users to prove this bound on heap creation, and thereafter handle it under the hood.

#### **5.2 Bottom-Up Heapify and Heapsort**

Floyd's bottom-up procedure for constructing a binary heap in linear time, and the use of a binary heap to sort, are classics of the literature [20,66]. Happily, while the asymptotic bound on heap construction is nontrivial, the implementations of both are basically repeated calls to sink (and exchanges to remove the root):

```
19 void build_heap( Item arr[], unsigned int size) {
20 unsigned int start = PARENT(size);
21 while (1) { sink(start, arr, size);
22 if (start == 0) break; start--; } }
23 void heapsort_rev( Item* arr, unsigned int size) {
24 build_heap(arr, size);
25 while (size > 1) { size--;
26 exch(ROOT_IDX, size, arr); sink(ROOT_IDX, arr, size); } }
```
Given that in §5.1 we already generalized the specification for sink to handle a portion of the array being unordered, the verification of these functions is straightforward. There is, however, the possibility of a subtle underflow on line 20, in the case when building an empty heap (*i.e.*, size = 0). In turn, this means that heapsort_rev as given above cannot sort empty lists; in our "basic" implementation we strengthen the precondition accordingly, whereas in our "Kruskal" implementation we add a line before 24 that returns when size = 0. We use a max-heap for Kruskal because heapsort yields a *reverse* sorted list.

#### **5.3 Modifying an Element's Priority**

To support edit-priority, each live item is associated not only with its usual int priority but also given a unique unsigned int "key", generated during insert and returned to the client. The binary heap internally maintains a secondary array key_table that maps each key to the current location of the associated item within the primary heap array. The client calls edit_priority by supplying the key for the item that it wishes to modify, which the binary heap looks up in key_table to locate the item in the heap array before calling sink or swim. To keep everything linked together, key_table is modified during exchange.

To generate the keys on insert, we store a key field within each heap-item in the main array. These keys are initialized to 0..(capacity − 1), and thereafter are never modified other than when two cells are swapped during exchange. An invariant can then be maintained that the keys from the "live" and "junk" parts have no duplicates. On insertion, we "recycle" the key of the first "junk" item, which is by the invariant known to be appropriately fresh.

#### **5.4 Related Work on Binary Heaps in Algorithms and Formal Methods**

J. W. J. Williams published the binary heap data structure, along with heapsort, in June 1964 [28]. Floyd proposed his linear-time bottom-up method to construct such heaps that December [27]. Since then, binary heaps, including Floyd's construction and heapsort, have become a staple of the introductory data structure diet [20]. On the other hand, standard textbooks are surprisingly vague on the implementation of edit_priority [20,38,66], and completely silent on the generation of fresh keys during insertion. Our method above of "recycling keys" avoids a subtle overflow in a naïve approach, and does not appear in the literature we examined. The naïve idea is to have a global counter starting at 0, which is then increased on each insert. Unfortunately, this is unsound: during (very) long runs involving both insert and remove-min, this key counter will overflow. Although overflow is defined in C for unsigned int, this overflow is fatal algorithmically: multiple live items could be assigned the same key.

Binary heaps have been verified several times in the literature. They were problem 2 of the VACID-0 benchmark [49], and have been solved in this regard by the Why3 team as well [69]. These solutions did not implement bottom-up heap construction or edit_priority. Summers verified heapsort in Viper, again without bottom-up heap construction [56]. Lammich verified Introsort, which includes a heapsort subroutine [44]. Previous formal work ignores nitty-gritty C issues such as the difference between signed and unsigned arithmetic. We believe ours is the first formally verified binary heap to support edit-priority.

# **6 Engineering Considerations**

Verifying real code is meaningfully harder than verifying toy implementations. On top of such challenges, verifying graph algorithms requires a significant amount of mathematical machinery: there are many plausible ways to define basic notions such as reachability, but not all of them can handle the challenges of verifying real code [72]. Moreover, we would like our mathematical, spatial, and verification machinery to be generic and reusable.

All of the above suggests that it is important to work within existing formal proof developments, due to a strong desire not to reinvent very large wheels (the existing proof bases we work with contain hundreds of thousands of lines of formal proof). We chose to work with the CompCert certified compiler [50]; the Verified Software Toolchain [4], which provides significant tactic support for separation logic-based deductive verification of CompCert C programs; and the CertiGraph framework [73], which provides much pure and spatial reasoning support for verifying graph-manipulating programs within VST. We did so because these frameworks can handle the challenges of real code and because CertiGraph included several fully verified implementations of union-find that we wished to reuse in our verification of Kruskal's algorithm.

Modular formal proof development involves major software engineering challenges [64]. Accordingly, we took care to factor our extensions to CertiGraph into generic and reusable pieces. This factoring allows us to reuse machinery between verifications, at the mathematical, spatial, and verification levels. So, *e.g.*, we share significant pure and spatial machinery between Dijkstra, Prim, and Kruskal. Moreover, we maintain good separation between pure and spatial reasoning. So, *e.g.*, both our Dijkstra and Prim verifications can handle multiple spatial variants of adjacency matrices without significant change.

On the other hand, working within existing developments involves some challenges, primarily in that some design decisions have already been made and are hard to change. Moreover, our verifications tickled numerous bugs within VST, including: overly aggressive automatic entailment simplification, poor error messages, improper handling of C structs, and performance issues. We have been fortunate that the VST team has been willing to work with us to fix such bugs, although some work still remains. Performance remains one area of focus: for example, checking our verification of Kruskal with a 3.7 GHz processor and 32 GB of memory takes more than 22 minutes even after all of the generic pure and spatial reasoning has been checked, *i.e.* approximately 7 s per line of C code (including whitespace and comments). This performance is not viable for verifying an industrial-sized application of equivalent difficulty: *e.g.*, it would take 13 years for Coq to check the proof for 1,000,000 lines of C. Before some optimizations to our proof structure, the time was significantly longer still.

Our contributions to CertiGraph include pieces that are reused repeatedly and pieces that are more bespoke. Below, we give a sense of both the size of our development (lines of formal Coq proof) and the mileage we get out of our own work via reuse. Items "added with +" are very similar (within 1%) to each other; Prim #4 is the version that does not set the root, *i.e.* the one on the right in Fig. 3.


In total we have 26,314 novel lines of Coq proof to verify 1,155 lines of C code divided among 12 files, including 3 variants of Dijkstra, 4 variants of Prim, 1 of Kruskal (which includes its heapsort), and 2 binary heaps.

# **7 Concluding Thoughts: Related and Future Work**

We have already discussed work directly related to Dijkstra's (§3.3), Prim's (§4.3), and Kruskal's (§4.5) algorithms, as well as binary heaps (§5.4). Summarizing briefly to the point of unreasonableness, our observations about Dijkstra's overflow and Prim's specification are novel, and existing formal proofs focus on code working within idealized environments rather than handling the real-world considerations that we do. We have also discussed the three formal developments we build upon and extend: CompCert, VST, and CertiGraph (Sect. 6). Our goal now is to discuss mechanized graph reasoning and verification more broadly.

*Reasoning About Mathematical Graphs.* There is a 30+ year history of mechanizing graph theory, beginning at least with Wong [74] and Chou [19] and continuing to the present day; Wang discusses many such efforts [72, §3.3]. The two abstract frameworks that seem closest to ours are those by Noschinski [58]; and by Lammich and Nipkow [45]. The latter is particularly related to our work, because they too start with a directed graph library and must extend it to handle undirected graphs so that they can verify Prim's algorithm.

*More-Automated Verification.* Broadly speaking, mechanized verification of software falls on a spectrum between more-automated-but-less-precise verifications and less-automated-but-more-precise verifications. Although VST contains some automation, we fall within the latter camp. In the former camp, landmark initial separation logic [63] tools such as Smallfoot [7] have grown into Facebook's industrial-strength Infer [11]. Other notable relatively-automated separation logic-based tools include HIP/SLEEK [17], Bedrock [18], KIV [24], VerCors [9], and Viper [57]. More-automated solutions that use techniques other than separation logic include Boogie [6], Blast [8], Dafny [48], and KeY [2]. In Sect. 3.3 we discuss how some of these more-automated approaches have been applied to verify Dijkstra's algorithm. Petrank and Hawblitzel's Boogie-based verification of a garbage collector [60], Bubel's KeY-based verification of the Schorr-Waite algorithm, and Chen *et al.*'s verification of Tarjan's strongly connected components algorithm in (among others) Why3 [16] are three examples of more-automated verification of graph algorithms. Müller verified *binomial* (not binary) heaps in Viper, although his implementation did not support an edit-priority function [55]. The VOCAL project has verified a number of data structures, including binary and other heaps (all without edit-priority) and union-find [13].

We are not confident that more-automated tools would be able to replicate our work easily. We prove full functional correctness, whereas many more-automated tools prove only more limited properties. Moreover, our full functional correctness results rely upon a meaningful amount of domain-specific knowledge about graphs, which automated tools usually lack. Even if we restrict ourselves to more limited domains such as overflows, several more-automated efforts did not uncover the overflow that we described in Sect. 3.3. The proof that certain bounds on edge weights and inf suffice depends on an intimate understanding of Dijkstra's algorithm (in particular, that it explores one edge beyond the optimum paths); overall the problem seems challenging in highly-automated settings. The more powerful specification we discover for Prim's algorithm in Sect. 4.2 is likewise not something a tool is likely to discover: human insight appears necessary, at least given the current state of machine learning techniques.

In contrast, several of the potential overflows in our binary heap might be uncovered by more-automated approaches, especially those related to the PARENT and LEFT_CHILD macros from Sect. 5.1. Although the arithmetic involves both addition/subtraction and multiplication/division, we suspect a tool such as Z3 [54] could handle it. Moreover, a sufficiently-precise tool would probably spot the necessity of forcing the internal constants into unsigned int. The issue of sound key generation described in Sect. 5.3 might be a bit trickier. On the one hand, unsigned int overflow is defined in C, so real code sometimes relies upon it. Accordingly, merely observing that the counter could overflow does not guarantee that the code is necessarily buggy. On the other hand, some tools might flag it anyway out of caution (*i.e.* right answer, wrong reason).

*Less-Automated Verification.* Although, as discussed above, some more-automated tools have been applied to verify graph algorithms, the problem domain is sufficiently complex that many of the verifications discussed in Sect. 3.3, Sect. 4.3, and Sect. 4.5 use less-automated techniques. Two basic approaches are popular. The "shallow embedding" approach is to write the algorithm in the native language of a proof assistant. The "deep embedding" approach is to write the algorithm in another language whose semantics has been precisely defined in the proof assistant. VST uses a deep embedding, and so we do too; one of VST's more popular competitors in the deep embedding style is "Iris Proof Mode" [39]. In contrast, Lammich *et al.* have produced a series of results verifying a variety of graph algorithms using a shallow embedding (*e.g.*, [32,43,45–47]). From a bird's-eye view, Lammich *et al.*'s work is the most related to our results in this paper: they verify all three algorithms we do and are able to extract fully-executable code, even if sometimes their focus is a bit different, *e.g.* on novel purely-functional data structures such as a priority queue with edit priority.

*Pen-and-Paper Verification of Graph Algorithms.* We use separation logic [63] as our base framework. Initial work on graph algorithms in separation logic was minimal; Bornat *et al.* is an early example [10]. Hobor and Villard developed the technique of ramification to verify graph algorithms [34], using a particular "star/wand" pattern to express heap update. Wang *et al.* later integrated ramification into VST as the CertiGraph project we use [73]. Krishna *et al.* [40] have developed a flow algebraic framework to reason about local and global properties of *flow graphs* in the program heap; their flow algebra is mainly used to tackle local reasoning of global graphs in program heaps. Flow algebras should be compatible with existing separation logics; implementation and integration with the Iris project appears to be work in progress [41].

Krishna *et al.* are interested in concurrency [40]; Raad *et al.* provide another example of pen-and-paper reasoning about concurrent graph algorithms [62].

*Future Work.* We see several opportunities for decreasing the effort and/or increasing the automation in our approach. At the level of Hoare tuples, we see opportunities for improved VST tactics to handle common cases we encounter in graph algorithms. At the level of spatial predicates, we can continue to expand our library of graph constructions, for example for adjacency lists. We also believe there are opportunities to increase modularity and automation at the interface between the spatial and the mathematical levels. For example, we sometimes compare C pointers to heap-represented graph nodes for equality; due to the nature of our representations, this equality check is well-defined in C whenever the associated nodes are present in the mathematical graph, so the check should pass automatically.

We believe that more automation is possible at the level of mathematical graphs: for example reachability techniques based on regular expressions over matrices and related semirings [5,23,70]. We are also intrigued by the recent development of various specialized graph logics such as by Costa *et al.* [21] and hope that these kinds of techniques will allow us to simplify our reasoning. The key advantage of having end-to-end machine-checked examples such as the ones we presented above is that they guide the automation efforts by providing precise goals that are known to be strong enough to verify real code.

*Conclusion.* We extend the CertiGraph library to handle undirected graphs and several flavours of graphs with edge labels, both at the pure and at the spatial levels. We verify the full functional correctness of the three classic graph algorithms of Dijkstra, Prim, and Kruskal. We find nontrivial bounds on edge costs and infinity for Dijkstra and provide a novel specification for Prim. We verify a binary heap with Floyd's heapify and edit priority. All of our code is in CompCert C and all of our proofs are machine-checked in Coq.

**Acknowledgements.** We thank Shengyi Wang for his help and support.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# Gillian, Part II: Real-World Verification for JavaScript and C

Petar Maksimović<sup>1(B)</sup>, Sacha-Élie Ayoun<sup>1</sup>, José Fragoso Santos<sup>2</sup>, and Philippa Gardner<sup>1</sup>

<sup>1</sup> Imperial College London, London, UK {p.maksimovic,s.ayoun17,p.gardner}@imperial.ac.uk <sup>2</sup> INESC-ID/Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal jose.fragoso@tecnico.ulisboa.pt

Abstract. We introduce verification based on separation logic to Gillian, a multi-language platform for the development of symbolic analysis tools which is parametric on the memory model of the target language. Our work develops a methodology for constructing compositional memory models for Gillian, leading to a unified presentation of the JavaScript and C memory models. We verify the JavaScript and C implementations of the AWS Encryption SDK message header deserialisation module, specifically designing common abstractions used for both verification tasks, and find two bugs in the JavaScript and three bugs in the C implementation.

# 1 Introduction

Separation logic (SL) [25,40] introduced compositional program verification using Hoare reasoning. Current analysis tools based on ideas from SL include: the automatic tool Infer [8,9] used inside Facebook to find lightweight bugs in Java/C/C++/Obj-C programs; the semi-automatic tool VeriFast [26], which provides full verification for fragments of C and Java; the semi-automatic tool JaVerT [21], which provides bug-finding and verification for JavaScript (JS) programs; and the Viper architecture [36,35], which provides a verification backend for multiple programming languages, including Java, Rust, and Python. Our goal is to introduce verification based on SL to Gillian [19], a multi-language platform for symbolic analysis, integrating bug-finding and verification in the spirit of JaVerT and targeting many languages in the spirit of Viper.

Gillian currently supports three types of program analysis: symbolic testing, verification and bi-abduction. In [19], the focus was on symbolic testing, parametrised on complete concrete and symbolic memory models of the target language (TL), and underpinned by a core symbolic execution engine with strong mathematical foundations. Gillian analysis is done on GIL, an intermediate goto language parametric on a set of memory actions, which describe the fundamental ways in which TL programs interact with their memories. To instantiate Gillian to a new TL, a tool developer must: (1) identify the set of the TL memory actions and implement the TL memory models using these actions; and (2) provide a

<sup>©</sup> The Author(s) 2021 A. Silva and K. R. M. Leino (Eds.): CAV 2021, LNCS 12760, pp. 827-850, 2021. https://doi.org/10.1007/978-3-030-81688-9\_38

trusted compiler from the TL to GIL, which preserves the TL memory models and the semantics. In [19], Gillian was instantiated to JS and C, and used to find bugs in two real-world data-structure libraries, Buckets.js [43] and Collections-C [41]. Here, we introduce compositional memory models for Gillian, extend Gillian analysis with verification based on separation logic, adapt Gillian-JS and Gillian-C to this compositional setting, and provide verified specifications of the JS and C implementations of the deserialisation module of the AWS Encryption SDK.

The compositional Gillian memory models (§2) are given by the tool developer for each TL instantiation. They are based on partial memories, and formulated using core predicates and the associated consumer and producer actions. Core predicates describe fundamental units of TL memories: e.g., a property of a JS object and a C block cell. Consumers and producers, respectively, frame off and frame on the TL memory resource described by the core predicates. Partiality and frame are familiar concepts from SL [25,40,11]. What is perhaps less familiar is our emphasis on negative resource: i.e., the resource known to be absent from the partial memory. For example, in JS, a new extensible object is known not to contain any property; and, in C, a freed block is known not to be in memory and a cell is known not to exist beyond the block bound. We introduce a methodology for designing Gillian compositional memory models, and apply it to JS and C (§3), resulting in an unexpected similarity between the two models. Our compositional JS memory models follow those given in work on a JS program logic [24] and the JaVerT tool [21], where negative resource was essential for frame preservation, inspired by the use of negative resource to capture stability properties in the CAP concurrent separation logic [14], now used in Iris [27]. Our compositional C memory models are based on the complete CompCert memory model [31]. Despite a large body of work on separation logic for C, we were unable to find a partial C memory model that captures the negative resource in its entirety. The nearest is probably the CH2O formalism [29], which handles freed locations but not block bounds. Negative resource for freed locations has also been used in incorrectness logic [39], and for block bounds in a program logic for WebAssembly [48].

We build Gillian verification on top of our compositional memory models. In particular, using the core predicates, we design an assertion language for writing function specifications in separation logic and, using the consumers and producers, we build a fully parametric spatial entailment engine which enables the use of function specifications in symbolic execution. Gillian also supports user-defined predicates, which allow tool developers to identify the TL language interface familiar to code developers, and code developers to describe and prove properties about the particular data structures in their programs.

We extend Gillian-JS and Gillian-C to enable verification, introducing the JS and C compositional memory models, and using the same trusted compilers as in [19]. With these instantiations, we provide functionally-correct, verified specifications of the message header deserialisation module of the AWS Encryption SDK JS and C implementations (§4, §5). This is stable, critical, industry-grade code (~200 LoC for JS, ~950 LoC for C), which uses advanced language features to manipulate complex data structures. To verify this code, we create language-independent

predicates to capture the message header, which we then connect without modification to both JS and C memories, giving specifications for the module functions. We also build a library of associated lemmas, used for the verification of both implementations. The verification itself required a substantial improvement of the reasoning capabilities of Gillian, especially when it came to handling arrays of symbolic size. We discovered two bugs in the JS implementation: one a form of prototype poisoning, predicted theoretically in our paper on JaVerT [21]; and another that allowed third parties to potentially alter authenticated, non-secret data. We have also discovered three bugs in the C implementation: one which allowed some malformed headers to be parsed as correct; one over-allocation; and one undefined behaviour. All of these bugs have been fixed.

# 2 Gillian Verification

We introduce Gillian verification based on separation logic (§2.2), extending the GIL execution engine presented in [19] with compositional memory models (§2.1).

#### 2.1 Compositional Memory Models

GIL is a simple goto intermediate language whose syntax is given below. It is parametric on a set of TL memory actions, A ∋ α, given per instantiation by the tool developer. GIL values, v ∈ Val, contain numbers, strings, booleans, uninterpreted symbols (used, e.g., to represent memory locations), simple types (e.g., numbers, strings), function identifiers and lists of values. GIL expressions, e ∈ Expr, contain values, program variables, and unary and binary operators (e.g. addition, list concatenation); GIL symbolic expressions, ê ∈ Êxpr, are analogous except that symbolic variables, x̂ ∈ X̂, are used instead of program variables.

#### GIL Syntax


GIL commands, c ∈ Cmd, contain variable assignment, conditional goto, function call, memory actions, allocation of uninterpreted/interpreted symbols, function return, error termination and path cutting. A GIL function, f(x){c}, comprises an identifier f ∈ F, a formal parameter x<sup>3</sup>, and a body given by a list of commands c. A GIL program is a set of GIL functions with unique identifiers.

GIL execution is defined in terms of state models, which are parametric on a value set, V ⊇ Val, and a set of memory actions, A. We distinguish the Boolean value set, Π ⊂ V, and refer to π ∈ Π as a context. State models expose an interface consisting of state actions, A ⊎ A<sub>S</sub>, where the actions

<sup>3</sup> The implementation supports multiple parameters.

Fig. 1: GIL Execution Semantics: Memory Actions

A<sub>S</sub> = {setVar<sub>x</sub>}<sub>x∈X</sub> ∪ {setStore, getStore} ∪ {eval<sub>e</sub>}<sub>e∈Expr</sub> ∪ {assume, uSym, iSym} address store management, expression evaluation, branching, and allocation.

Definition 1 (State Model). A state model, S(V, A) ≜ ⟨|S|, ea⟩, comprises: a set of states σ = ⟨µ, ρ, π⟩ ∈ |S|, containing a memory µ, a variable store ρ, and a (satisfiable) context π<sup>4</sup>; and an action execution function, ea : (A ⊎ A<sub>S</sub>) → |S| → V ⇀ P(|S| × V × R), with the result r ∈ R = {S, E, M} denoting success, non-correctible error, or missing information error, pretty-printed σ.α(v) ⇝ {(σ<sub>i</sub>, v<sub>i</sub>)<sup>r<sub>i</sub></sup> | i ∈ I} for all outcomes and σ.α(v) ⇝ (σ<sub>i</sub>, v<sub>i</sub>)<sup>r<sub>i</sub></sup> for a specific outcome, with countable I. The value set of concrete state models is the set of GIL values, Val<sup>5</sup>; the value set of symbolic state models is the set of symbolic expressions, Êxpr.

Definition 2 (GIL Execution Semantics). Given a state model S, the GIL execution semantics has judgements of the form:

$$\mathbf{p} \vdash \langle \sigma, cs, i \rangle^o \leadsto\_S \langle \sigma', cs', j \rangle^{o'}$$

with: call stacks, cs ∈ Call <sup>S</sup>; command indexes, i, j ∈ N; and outcomes, o ∈ O.

The GIL execution semantics is standard for a goto language, except that it is parametrised by the memory actions. Call stacks capture function-related control flow, with cmd(p, cs, i) denoting the i-th command of the currently executing function (cf. [33] for details). Outcomes, o ∈ O ≜ S | N(v) | E(v) | M(v), indicate how the execution is to proceed: S states that it can continue; N(v) states that it terminated normally with return value v; and E(v) and M(v) state that it failed with either a non-correctible or missing information error described by v. We give the rules for memory action execution in Figure 1; all can be found in [33].

Compositional Memory Models. We move from whole-program memory models [19] to compositional memory models by introducing memory core predicates, γ ∈ Γ, which represent the fundamental units of the TL memory model (e.g., a memory cell). Core predicates take two lists of parameters, in-parameters (or ins), denoted v<sub>i</sub>, and out-parameters (or outs), denoted v<sub>o</sub>, such that from the ins we can learn the outs. This concept is similar to predicate parameter modes

<sup>4</sup> States also include allocators (cf. [33] for details), elided to limit clutter.

<sup>5</sup> Note that the only satisfiable concrete context is true, meaning that concrete contexts can be elided and concrete states can be viewed as memory-store pairs, hµ, ρi.

of [37], and we use it to implement a parametric spatial entailment engine. An example of a core predicate is the cell assertion, x ↦ v, which captures a cell in memory at address x having value v. Its in-parameter is x, and its out-parameter is v, because, if we know x, we can find v by looking it up in the memory.

With each core predicate γ ∈ Γ, we associate a consumer and a producer memory action, denoted by cons<sub>γ</sub> and prod<sub>γ</sub> respectively, to obtain the set of predicate actions A<sub>Γ</sub> = ⋃<sub>γ∈Γ</sub> {cons<sub>γ</sub>, prod<sub>γ</sub>}, whose meaning is discussed shortly.

Definition 3 (Compositional Memory Model). Given value set V and core predicate set Γ, a compositional memory model, M(V, Γ) ≜ ⟨|M|, Wf, ea<sub>Γ</sub>⟩, comprises: (1) a partial commutative monoid (PCM)<sup>6</sup>, |M| = (|M|, •, 0), where 0 denotes the (indivisible) empty memory; (2) a well-formedness relation, Wf ⊆ |M| × Π, with Wf<sub>π</sub>(µ) denoting that memory µ is well-formed in (satisfiable) context π; and (3) a predicate action execution function, ea<sub>Γ</sub> : A<sub>Γ</sub> × |M| × V × Π ⇀ P(|M| × V × Π × R), pretty-printed µ.α(v)<sub>π</sub> ⇝ {(µ<sub>i</sub>, v<sub>i</sub>)<sup>r<sub>i</sub></sup><sub>π<sub>i</sub></sub> | i ∈ I} for all outcomes and µ.α(v)<sub>π</sub> ⇝ (µ<sub>i</sub>, v<sub>i</sub>)<sup>r<sub>i</sub></sup><sub>π<sub>i</sub></sub> for a specific outcome, with countable I. The value set of concrete memory models is the set of GIL values, Val; the value set of symbolic memory models is the set of symbolic expressions, Êxpr.

We discuss the most important properties that the components of compositional memory models must satisfy; the full list is available in [33]. The PCM requirement is well-known from separation logic [40,11]. Well-formedness holds only for satisfiable contexts, and describes the separation of symbolic resource as well as any further TL-specific well-formedness criteria (cf. §3). It must be monotonic with respect to context strengthening and compatible with the PCM composition, and the empty memory must be well-formed in any satisfiable context. The action execution judgement, µ.α(v)_π ⇝ (µ_i, v_i)^{r_i}_{π_i} |_{i∈I}, denotes that, in a memory µ that is well-formed in context π, executing action α with parameter v yields a countable number of branches characterised by the non-overlapping<sup>7</sup>, satisfiable contexts π_i, each of which implies π and makes the corresponding memory µ_i well-formed, and all of which together cover π (i.e., π ⇒ ⋁_{i∈I} π_i). This last property means that memory actions do not drop paths, which is essential for verification.
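To make the PCM requirement concrete, the composition of heap-like memories can be sketched as disjoint union of finite maps. The following toy Python model (our illustration, not Gillian's implementation) makes the partiality of • explicit, with None standing for "undefined":

```python
def compose(mu1, mu2):
    """Partial, commutative composition: disjoint union of two memories.

    Returns None when the memories overlap, i.e. when mu1 • mu2 is undefined.
    """
    if set(mu1) & set(mu2):   # overlapping resource: composition undefined
        return None
    return {**mu1, **mu2}

EMPTY = {}  # the unit 0: the (indivisible) empty memory

m1 = {("x", 0): 1}
m2 = {("y", 0): 2}

assert compose(m1, m2) == compose(m2, m1) == {("x", 0): 1, ("y", 0): 2}
assert compose(m1, EMPTY) == m1              # 0 is a unit
assert compose(m1, m1) is None               # duplicated resource: undefined
```

Commutativity and the unit laws hold by construction; partiality arises exactly when the two memories claim overlapping resource.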

The intuition behind consumers and producers is that consumers frame off the core predicate resource (CPR), uniquely determined by the core predicate ins, and producers frame it on. The following properties capture this intuition. First, we define the CPR of a core predicate γ⟨v_i · v_o⟩ as the memory resulting from its production in 0, which must succeed in any satisfiable context:

$$
\pi~\mathbf{SAT} \implies \mathbf{0}.\text{prod}\_{\gamma}(\mathsf{v}\_{i}\cdot\mathsf{v}\_{o})\_{\pi} \leadsto (\gamma\langle\mathsf{v}\_{i}\cdot\mathsf{v}\_{o}\rangle,\mathsf{true})\_{\pi}^{\mathcal{S}} \;\wedge\; \gamma\langle\mathsf{v}\_{i}\cdot\mathsf{v}\_{o}\rangle \neq \mathbf{0}
$$

overloading notation for the core predicate and its resource. Moreover, we require that any successful production frames on the CPR:

$$
\mu\text{.prod}\_{\gamma}(\mathbf{v}\_{i}\cdot\mathbf{v}\_{o})\_{\pi} \leadsto (\mu', \text{true})\_{\pi'}^{\mathcal{S}} \implies \mu' = \mu\bullet\gamma\langle\mathbf{v}\_{i}\cdot\mathbf{v}\_{o}\rangle.
$$

<sup>6</sup> A PCM, X = ⟨X, •, 0⟩, comprises a carrier set X (overloaded for simplicity), a partial, associative, and commutative composition operator •, and a unit element 0.

<sup>7</sup> Note that this requirement makes concrete memory actions deterministic.

and also that producers cannot return missing information errors, as they are meant to succeed precisely when the CPR is missing. The consumers, on the other hand, must succeed if and only if the CPR is present in memory:

$$\begin{aligned} \mu. \text{cons}\_{\gamma}(\mathsf{v}\_{i})\_{\pi} &\leadsto (\mu', \mathsf{v}\_{o})\_{\pi'}^{\mathcal{S}} \implies \pi' \vdash \mu = \mu' \bullet \gamma \langle \mathsf{v}\_{i} \cdot \mathsf{v}\_{o} \rangle \\ \pi \vdash \mu = \mu' \bullet \gamma \langle \mathsf{v}\_{i} \cdot \mathsf{v}\_{o} \rangle \land \mathcal{W}f\_{\pi}(\mu) &\Longrightarrow \mu. \text{cons}\_{\gamma}(\mathsf{v}\_{i})\_{\pi} \leadsto (\mu', \mathsf{v}\_{o})\_{\pi}^{\mathcal{S}} \end{aligned}$$

with the resulting context π′ having enough information to isolate the CPR<sup>8</sup>. Interestingly, erroneous executions cannot be fully characterised in terms of CPR presence or absence because of TL-specific error cases: for example, in C, attempting to either get or set the value of a block cell that is beyond the block bound raises an out-of-bounds error (cf. §3). What we require instead is that consumed CPR can always be re-produced, that producers fail in a memory in which consumers succeed, and that producers succeed in a memory in which consumers return a missing information error (and vice versa for the latter):

$$\begin{aligned} \mu.\mathrm{cons}\_{\gamma}(\mathsf{v}\_{i})\_{\pi} &\leadsto(\mu',\mathsf{v}\_{o})\_{\pi'}^{\mathcal{S}} \implies \mu'.\mathrm{prod}\_{\gamma}(\mathsf{v}\_{i}\cdot\mathsf{v}\_{o}')\_{\pi'} \leadsto(\mu'',\mathsf{true})\_{\pi'}^{\mathcal{S}}\\ \mu.\mathrm{cons}\_{\gamma}(\mathsf{v}\_{i})\_{\pi} &\leadsto(\mu',\mathsf{v}\_{o})\_{\pi'}^{\mathcal{S}} \implies \mu.\mathrm{prod}\_{\gamma}(\mathsf{v}\_{i}\cdot -)\_{\pi} \leadsto(\mu,\mathsf{false})\_{\pi'}^{\mathcal{E}}\\ \mu.\mathrm{cons}\_{\gamma}(\mathsf{v}\_{i})\_{\pi} &\leadsto(\mu,\mathsf{false})\_{\pi'}^{\mathcal{M}} \iff \mu.\mathrm{prod}\_{\gamma}(\mathsf{v}\_{i}\cdot\mathsf{v}\_{o})\_{\pi} \leadsto(\mu\bullet\gamma\langle\mathsf{v}\_{i}\cdot\mathsf{v}\_{o}\rangle,\mathsf{true})\_{\pi'}^{\mathcal{S}}. \end{aligned}$$

The properties given so far allow us, for example, to prove that well-formed memories cannot contain duplicated CPR. The final property below requires that non-missing executions of consumers and erroneous executions of producers must be frame-preserving, with the former formulated as follows:

$$\begin{aligned} \mu\text{.cons}\_{\gamma}(\mathsf{v}\_{i})\_{\pi} &\leadsto (\mu', \mathsf{v}\_{o})\_{\pi'}^{r} \wedge r \neq \mathcal{M} \wedge (\pi'' \Rightarrow \pi') \wedge \mathcal{W}f\_{\pi''}(\mu \bullet \mu\_{f}) \\ \implies \ (\mu \bullet \mu\_{f}).\mathsf{cons}\_{\gamma}(\mathsf{v}\_{i})\_{\pi''} &\leadsto (\mu' \bullet \mu\_{f}, \mathsf{v}\_{o})\_{\pi''}^{r} \end{aligned}$$

where π″ effectively maintains the well-formedness constraints for µ, adds further ones required for µ • µ_f to be defined, and also isolates the consumed CPR. Note that neither missing executions of consumers nor successful executions of producers can be frame-preserving, as framing on the appropriate CPR could result in success for the former, and in a duplicated-resource error for the latter.

Using the consumers and producers, we derive getter and setter actions, A ≜ {get_γ, set_γ : γ ∈ Γ}, which perform frame-preserving CPR lookup and mutation, as given below. We discuss getters and setters further in §3, in the context of our JS and C instantiations.

$$\begin{array}{cc}
\text{Getter: Success} & \text{Setter: Success}\\[3pt]
\dfrac{\mu.\text{cons}\_{\gamma}(\mathsf{v}\_{i})\_{\pi} \leadsto (\mu',\mathsf{v}\_{o})\_{\pi'}^{\mathcal{S}} \quad \mu'.\text{prod}\_{\gamma}(\mathsf{v}\_{i}\cdot\mathsf{v}\_{o})\_{\pi'} \leadsto (\mu'',\mathsf{true})\_{\pi'}^{\mathcal{S}}}{\mu.\text{get}\_{\gamma}(\mathsf{v}\_{i})\_{\pi} \leadsto (\mu'',\mathsf{v}\_{o})\_{\pi'}^{\mathcal{S}}} &
\dfrac{\mu.\text{cons}\_{\gamma}(\mathsf{v}\_{i})\_{\pi} \leadsto (\mu',-)\_{\pi'}^{\mathcal{S}} \quad \mu'.\text{prod}\_{\gamma}(\mathsf{v}\_{i}\cdot\mathsf{v}\_{o})\_{\pi'} \leadsto (\mu'',\mathsf{true})\_{\pi'}^{\mathcal{S}}}{\mu.\text{set}\_{\gamma}(\mathsf{v}\_{i}\cdot\mathsf{v}\_{o})\_{\pi} \leadsto (\mu'',\mathsf{true})\_{\pi'}^{\mathcal{S}}}
\end{array}$$

$$\begin{array}{cc}
\text{Getter: Non-Success} & \text{Setter: Non-Success}\\[3pt]
\dfrac{\mu.\text{cons}\_{\gamma}(\mathsf{v}\_{i})\_{\pi} \leadsto (\mu,\mathsf{false})\_{\pi'}^{r} \quad r \neq \mathcal{S}}{\mu.\text{get}\_{\gamma}(\mathsf{v}\_{i})\_{\pi} \leadsto (\mu,\mathsf{false})\_{\pi'}^{r}} &
\dfrac{\mu.\text{cons}\_{\gamma}(\mathsf{v}\_{i})\_{\pi} \leadsto (\mu,\mathsf{false})\_{\pi'}^{r} \quad r \neq \mathcal{S}}{\mu.\text{set}\_{\gamma}(\mathsf{v}\_{i}\cdot\mathsf{v}\_{o})\_{\pi} \leadsto (\mu,\mathsf{false})\_{\pi'}^{r}}
\end{array}$$
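The consumer/producer discipline and the derived getters and setters can be illustrated with a minimal executable model. The Python sketch below (ours, not Gillian's; the result flags S/E/M stand for success, error, and missing) implements cons and prod for a single cell core predicate over a concrete memory, and derives get and set as consume-then-produce:

```python
S, E, M = "success", "error", "missing"

def cons_cell(mu, x):
    """Consume cell⟨x · v⟩: frame off the resource, returning its out-parameter."""
    if x not in mu:
        return mu, None, M                      # missing resource
    rest = {k: v for k, v in mu.items() if k != x}
    return rest, mu[x], S

def prod_cell(mu, x, v):
    """Produce cell⟨x · v⟩: frame the resource on; fails on duplicates."""
    if x in mu:
        return mu, None, E                      # duplicated resource
    return {**mu, x: v}, True, S

def get_cell(mu, x):
    """Getter: consume then re-produce, a frame-preserving lookup."""
    mu1, v, r = cons_cell(mu, x)
    if r != S:
        return mu, None, r
    mu2, _, _ = prod_cell(mu1, x, v)
    return mu2, v, S

def set_cell(mu, x, v):
    """Setter: consume the old cell, then produce one with the new value."""
    mu1, _, r = cons_cell(mu, x)
    if r != S:
        return mu, None, r
    mu2, _, _ = prod_cell(mu1, x, v)
    return mu2, True, S

mu = {0: 42}
assert cons_cell(mu, 0) == ({}, 42, S)          # CPR framed off
assert prod_cell(mu, 0, 7) == (mu, None, E)     # producer fails where consumer succeeds
assert get_cell(mu, 0) == (mu, 42, S)           # lookup leaves memory unchanged
assert set_cell(mu, 0, 7) == ({0: 7}, True, S)  # frame-preserving mutation
assert get_cell(mu, 1)[2] == M                  # missing cell
```

The assertions mirror the properties above: the producer errors exactly where the consumer succeeds, and the getter re-produces the consumed CPR, leaving the memory unchanged.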

<sup>8</sup> The π ⊢ … denotes reasoning under context π. In the concrete case, it can be ignored.

Compositional State Models. Compositional memory models lift to compositional state models, similarly to the lifting of the complete memory models illustrated in [19]; see [33] for details. Here, we focus on memory action execution, which is lifted to state action execution as follows, given a memory model M(V, Γ) and α ∈ A_Γ ⊎ A:

$$\operatorname{ea}(\alpha, \langle \mu, \rho, \pi \rangle, \mathbf{v}) \triangleq \{ (\langle \mu', \rho, \pi' \rangle, \mathbf{v}')^r \mid \mu . \alpha(\mathbf{v})\_\pi \leadsto (\mu', \mathbf{v}')\_{\pi'}^r \}.$$

Observe how the context of the state is passed to the memory execution function, which may then strengthen it before passing it back to the resulting state. We can show that the PCM and well-formedness relation on memories lift to a PCM and well-formedness relation on states, and that state action execution maintains properties analogous to those given for memory models.

#### 2.2 GIL Verification

We give an overview of Gillian verification based on separation logic (SL); see [33] for details. We describe GIL assertions, parameterised by the core predicates of the TL, define assertion satisfiability in a novel, parametric way using the core predicate producers, and provide a mechanism for using verified function specifications in GIL execution.

GIL Assertion Syntax. A compositional memory model with core predicates Γ induces an SL-assertion language: p, q ∈ A ::= emp | γ⟨ê_i · ê_o⟩ | δ⟨ê_i · ê_o⟩ | p ∗ q, with P, Q ∈ Asrt ::= p ∧ π. GIL memory assertions, p, q ∈ A, are formed using the empty assertion, the separating conjunction, the core predicates, and user-defined predicates, whose names come from a dedicated set, δ ∈ ∆. The empty assertion and the separating conjunction are standard. Core predicate assertions are lifted from memory core predicates. User-defined predicates, introduced by example in §3 and §4, are used by tool developers to characterise the interface of the TL, and by code developers to describe the data structures in their programs. They have in- and out-parameters, like core predicates, and can have multiple definitions, separated by a semi-colon. Assertions, P, Q ∈ Asrt, extend memory assertions with pure first-order assertions, π, conflated with Boolean symbolic expressions.

Satisfiability. To define assertion satisfiability, we lift memory consumers and producers from core predicates to memory assertions, denoted µ.cons_θ(p) and µ.prod_θ(p), and then to states and arbitrary assertions, denoted σ.cons_θ(P) and σ.prod_θ(P), using substitutions θ : X̂ ⇀ V (extended to symbolic expressions inductively, in the standard way) to map core predicate assertions, with parameters given by symbolic expressions, to the core predicates of the memory model, with parameters given by values. We highlight the successful base case of the memory assertion consumers, where the returned context requires the out-parameters of the assertion to match the ones found in memory:

$$\frac{\mu.\text{cons}\_{\gamma}(\theta(\hat{e}\_{i}))\_{\pi}\leadsto(\mu',\mathsf{v}\_{o}')\_{\pi'}^{\mathcal{S}}}{\mu.\text{cons}\_{\theta}(\gamma\langle\hat{e}\_{i}\cdot\hat{e}\_{o}\rangle)\_{\pi}\leadsto(\mu',\mathsf{true})\_{\pi'\wedge\,\theta(\hat{e}\_{o})=\mathsf{v}\_{o}'}^{\mathcal{S}}}$$

and the successful consumption of an arbitrary assertion P = p ∧ π:

$$\frac{\mu'.\text{cons}\_{\theta}(p)\_{\pi'} \leadsto (\mu'',\text{true})\_{\pi''}^{\mathcal{S}}}{\langle \mu',\rho,\pi' \rangle.\text{cons}\_{\theta}(p\land\pi) \leadsto (\langle \mu'',\rho,\pi'' \rangle,\text{true})^{\mathcal{S}}}$$

Definition 4 (Satisfiability). The satisfiability relation, stating that memory µ′ and context π′ satisfy assertion p ∧ π under substitution θ, is defined by:

$$\mu',\pi',\theta \models p \wedge \pi \iff \mathbf{0}.\text{prod}\_{\theta}(p)\_{\mathsf{true}} \leadsto (\mu\_{p},\mathsf{true})\_{\pi\_{p}}^{\mathcal{S}} \,\wedge\, \pi' \vdash (\mu' = \mu\_{p} \wedge \pi\_{p} \wedge \theta(\pi))$$

and is lifted to states as: ⟨µ′, ρ, π′⟩, θ ⊨ p ∧ π if and only if µ′, π′, θ ⊨ p ∧ π.

In Definition 4, the production, when successful, creates the (unique) memory µ_p that corresponds to the resource of the assertion p, together with its (unique) well-formedness constraints, π_p. In the concrete case, as the only allowed context is true, the formulation simplifies to the more intuitive 0.prod_θ(p) ⇝ (µ′, true)^S ∧ θ(π).
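In the concrete case, this satisfiability check is directly executable: an assertion built from cell core predicates is satisfied by precisely the memory obtained by producing it in 0. A small Python sketch (ours, not Gillian's), assuming the substitution θ has already been applied to the assertion's parameters:

```python
def produce_assertion(cells):
    """Produce a list of cell assertions (addr, val) in the empty memory 0.

    Returns the resulting memory, or None if production fails
    (duplicated resource, i.e. the assertion is unsatisfiable).
    """
    mu = {}
    for addr, val in cells:
        if addr in mu:          # duplicated cell: 0.prod would fail
            return None
        mu[addr] = val
    return mu

# θ has been applied: the assertion is x ↦ 1 ∗ y ↦ 2 with θ(x) = 0, θ(y) = 1
assert produce_assertion([(0, 1), (1, 2)]) == {0: 1, 1: 2}
# a memory satisfies the assertion iff it equals the produced memory
assert produce_assertion([(0, 1), (0, 2)]) is None   # x ↦ 1 ∗ x ↦ 2: unsatisfiable
```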

Specifications. Gillian function specifications have the form {x̂, P} f(x) {Q, ê}, where f is the function identifier, x is the function parameter, x̂ is the symbolic variable holding the value of x, P is the pre-condition, Q is the post-condition, and ê is the return value of the function, subject to the following well-known constraints:


We extend GIL programs with function specifications, accessible via p.specs, and the GIL execution semantics with rules for folding and unfolding user-defined predicates, as well as with a rule for calling function specifications, the success case of which is given below. Gillian verifies a specification {x̂, P} f(x) {Q, ê} if, given the identity substitution θ̂ and a symbolic state σ̂ with store {x ↦ θ̂(x̂)} such that σ̂, θ̂ ⊨ P, the symbolic execution of f starting from σ̂ always terminates and, for all final symbolic states σ̂_i, there exists some θ̂_i ≥ θ̂ such that σ̂_i, θ̂_i ⊨ Q and the corresponding return value equals θ̂_i(ê) under the context of σ̂_i. We can prove that if Gillian verifies a specification, then its standard SL interpretation holds.

Spec Call – Success:

$$\begin{array}{ll}
\text{cmd}(p, cs, i) = y := e(e')\ \text{with}\ \theta & \textit{function call with substitution}\ \theta\\
\sigma.\text{eval}\_{e} \leadsto f \quad \sigma.\text{eval}\_{e'} \leadsto \mathsf{v}' & \textit{get function id and parameter value}\\
\{\hat{x}, P\}\,f(x)\,\{Q, \hat{e}\} \in p.\text{specs} & \textit{get one of the function specifications}\\
\theta' = \theta[\hat{x} \mapsto \mathsf{v}'] & \textit{extend substitution with parameter value}\\
\sigma.\text{cons}\_{\theta'}(P) \leadsto \{(\sigma\_{j}, \mathsf{true})^{\mathcal{S}} \mid j \in J\} & \textit{consume pre-condition}\\
j \in J & \textit{select a branch}\\
\sigma\_{j}.\text{prod}\_{\theta'}(Q) \leadsto (\sigma'\_{j}, \mathsf{true})^{\mathcal{S}} & \textit{produce post-condition}\\
\sigma'\_{j}.\text{setVar}\_{y}(\theta'(\hat{e})) \leadsto \sigma' & \textit{assign return value}\\
\hline
p \vdash \langle\sigma, cs, i\rangle \leadsto \langle\sigma', cs, i{+}1\rangle
\end{array}$$

Note that for this rule to succeed, the consumption of P must succeed. The rule is slightly simplified for presentation. First, it assumes that the substitution is given upfront; in the implementation, we have a unification algorithm that, starting from the function parameter and using the consumers, learns the substitution. Second, it assumes that the post-condition does not introduce fresh symbolic variables; these are handled using allocators and added to the substitution.

Remark. Due to space constraints, we have not been able to give the full technical details of Gillian verification. These are available in the Gillian technical report [33], where we demonstrate that the overall GIL execution using compositional memory models is frame-preserving (up to the usual renaming of allocated memory locations) and prove a standard verification soundness result.

### 3 Compositional Memory Models: JavaScript and C

We present the compositional memory models of JS and C, giving the basic actions and core predicates, and some of the user-defined predicates that capture the intuitive interfaces of these languages. The key ideas behind compositional JS memory models were introduced in the JaVerT project [21,20,22]; we transfer them to Gillian. We introduce the compositional C memory models, building on the concrete block-offset memory model of CompCert [31], simplifying the presentation.<sup>9</sup> In doing so, we highlight a striking similarity between the JS and C models that is the result of our emphasis on negative resource.

The JS and C concrete compositional memory models are made up of building blocks that are assigned a unique location (or identifier) from a set of uninterpreted symbols, L ⊂ U: for JS, the building blocks are the extensible objects; for C, they are the blocks of linear memory of a given size. Each building block is divided into at least one component. For JS, each object has three components: a property table, h : S ⇀ Val, partially mapping property names (strings) to values; a domain, d ∈ P(S), discussed shortly; and metadata, m ∈ Val, which keeps track of internal JS properties for that object [22]. For C, each block has two components: the block contents, k : N ⇀ Val, partially mapping offsets (natural numbers) to values; and a bound, n ∈ N, discussed shortly. Finally, the memory units are, intuitively, the parts of the memory components that cannot be separated further: for JS, these are single object properties, domains, and metadata; for C, these are single block cells and bounds. These memory units directly correspond to the core predicates given in Definitions 6 and 7.

Compositional memory models must keep track of negative resource, which can come from two sources: allocation and deallocation. For JS and C, the negative information originating from allocation has an infinite representation: in JS, a freshly created object is known not to have any properties; in C, a freshly allocated block of a given size is known not to have offsets beyond that size. This infinite information is captured, for JS, by the object domain, whose meaning

<sup>9</sup> We assume that values have the same size in memory and omit permissions. Gillian-C implements the full models, eliding the concurrency-related aspects of permissions.

is that any property not in the domain is absent, and, for C, by the block bound, whose meaning is that any access beyond that bound results in a buffer overrun error. The negative information originating from deallocation is easier to handle, and is tracked by a dedicated uninterpreted symbol, ∅ ∈ U. In JS, deallocation is at the unit level: only object properties are deleted. This is captured by extending the co-domain of property tables with ∅: that is, h : S ⇀ Val_∅. In C, deallocation is at the building-block level: only entire blocks can be deleted. This is captured by extending the co-domain of blocks with ∅, indicating that a block has been freed.

Due to compositionality, any building block, component or unit can be missing. In the theory, we capture this either implicitly, via absence from the domain of a mapping (e.g., a missing object property for JS or a missing block cell for C), or explicitly, using the symbol ⊥ (e.g. a missing domain, metadata, or bound).

Definition 5 (Compositional JS and C Memories). The PCMs of compositional concrete JS and C memories, |MJS| and |MC|, are given by the sets

$$\begin{array}{rl}
\mu \in |M\_{\text{JS}}| : & \mathcal{L} \rightharpoonup ((\mathcal{S} \rightharpoonup \mathcal{V}al\_{\varnothing}) \times \mathcal{P}(\mathcal{S})\_{\perp} \times \mathcal{V}al\_{\perp}),\\
\mu \in |M\_{\text{C}}| : & \mathcal{L} \rightharpoonup ((\mathbb{N} \rightharpoonup \mathcal{V}al) \times \mathbb{N}\_{\perp})\_{\varnothing},
\end{array}$$

composition defined as disjoint union, and empty memory ∅. The PCMs of compositional symbolic JS and C memories, |Mˆ JS| and |Mˆ <sup>C</sup>|, are given by the sets

$$\begin{array}{rl}
\hat{\mu} \in |\hat{M}\_{\text{JS}}| : & \hat{\mathcal{E}xpr} \rightharpoonup ((\hat{\mathcal{E}xpr} \rightharpoonup \hat{\mathcal{E}xpr}\_{\varnothing}) \times \hat{\mathcal{E}xpr}\_{\perp} \times \hat{\mathcal{E}xpr}\_{\perp}),\\
\hat{\mu} \in |\hat{M}\_{\text{C}}| : & \hat{\mathcal{E}xpr} \rightharpoonup ((\hat{\mathcal{E}xpr} \rightharpoonup \hat{\mathcal{E}xpr}) \times \hat{\mathcal{E}xpr}\_{\perp})\_{\varnothing},
\end{array}$$

with composition defined as (syntactic) disjoint union, and empty memory ∅.

In the above definition, symbolic memory models are simple liftings of the concrete ones. In the implementation, we employ heavy optimisation: for example, in Gillian-C, we have developed a complex tree representation of symbolic blocks inspired by [29], enabling tractable reasoning about arrays of symbolic size.

Well-formedness of concrete memories addresses the relationship between positive and negative information, given for JS and C below:

$$\begin{array}{rl}
\mathcal{W}f^{\text{JS}}(\mu) \triangleq & \forall (h,d,-) \in \mathsf{codom}(\mu).\ d \neq \perp \implies \mathsf{dom}(h) \subseteq d\\
\mathcal{W}f^{\text{C}}(\mu) \triangleq & \forall (k,n) \in \mathsf{codom}(\mu).\ n \neq \perp \implies \mathsf{dom}(k) \subseteq [0,n)
\end{array}$$
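Both conditions translate directly into executable checks. The Python sketch below (our rendering, not Gillian's, with ⊥ represented as None) models a concrete JS memory as a map from locations to triples (h, d, m) and a concrete C memory as a map from locations to pairs (k, n):

```python
def wf_js(mu):
    """dom(h) ⊆ d whenever the domain d is present (not ⊥)."""
    return all(d is None or set(h) <= d for (h, d, _m) in mu.values())

def wf_c(mu):
    """dom(k) ⊆ [0, n) whenever the bound n is present (not ⊥)."""
    return all(n is None or all(0 <= o < n for o in k)
               for (k, n) in mu.values())

js = {"l1": ({"a": 1}, {"a", "b"}, None)}
assert wf_js(js)                                      # property "a" is in the domain
assert not wf_js({"l1": ({"c": 1}, {"a"}, None)})     # "c" outside the domain

c = {"b1": ({0: 10, 1: 20}, 2)}
assert wf_c(c)                                        # offsets 0, 1 within bound 2
assert not wf_c({"b1": ({5: 1}, 2)})                  # offset 5 beyond the bound
```

With no domain or bound present (None), there is no negative information to contradict, so the memory is trivially well-formed.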

Well-formedness of symbolic memories additionally has to address separation of locations and separation in any other mappings with symbolic expressions in its domain (e.g. object properties for JS and offsets for C). We give the well-formedness criterion for the symbolic C memory:

$$\hat{\mathcal{W}f}^{\text{C}}\_{\pi}(\hat{\mu}) \triangleq \pi \vdash \bigwedge\_{\substack{\hat{l},\hat{l}' \in \mathsf{dom}(\hat{\mu})\\ \hat{l} \not\equiv \hat{l}'}} \hat{l} \neq \hat{l}' \;\wedge \bigwedge\_{\substack{(\hat{k},-) \in \mathsf{codom}(\hat{\mu})\\ \hat{o},\hat{o}' \in \mathsf{dom}(\hat{k}),\ \hat{o} \not\equiv \hat{o}'}} \hat{o} \neq \hat{o}' \;\wedge \bigwedge\_{\substack{(\hat{k},\hat{n}) \in \mathsf{codom}(\hat{\mu})\\ \hat{o} \in \mathsf{dom}(\hat{k}),\ \hat{n} \neq \perp}} \hat{o} < \hat{n}$$

For our JS and C instantiations, the core predicates follow straightforwardly from the units of their memory models.


Fig. 2: Selected rules for the consCell consumer.

Definition 6 (JS Core Predicates). JS has three core predicates, γ_JS ∈ Γ_JS:


Definition 7 (C Core Predicates). C has three core predicates, γ_C ∈ Γ_C<sup>10</sup>:


We illustrate the C predicate action execution functions, ea_C and êa_C, with a selection of rules for the C cell-predicate consumer, consCell, given in Figure 2. The remaining rules, as well as the rules for their JS counterparts, ea_JS and êa_JS, can be found in the Gillian technical report [33]. With this information, we can define the compositional concrete and symbolic JS and C memory models.

Definition 8 (JS Memory Models). The compositional concrete and symbolic JS memory models are defined, respectively, as M_JS(Val, Γ_JS) = ⟨|M_JS|, Wf^JS, ea_JS⟩ and M̂_JS(Êxpr, Γ_JS) = ⟨|M̂_JS|, Ŵf^JS, êa_JS⟩.

Definition 9 (C Memory Models). The compositional concrete and symbolic C memory models are defined, respectively, as M_C(Val, Γ_C) = ⟨|M_C|, Wf^C, ea_C⟩ and M̂_C(Êxpr, Γ_C) = ⟨|M̂_C|, Ŵf^C, êa_C⟩.

<sup>10</sup> In full C and the Gillian-C implementation, memory values may be of different sizes, and holes may exist between these values due to alignment restrictions. To address this, the implemented cell assertion carries additional information related to, e.g., size and type, similarly to that of [4], and there also exists a hole core predicate.

The getters and setters for JS and C are defined using the methodology described in §2. In particular, the JS getters and setters are given by A_JS = {getProp, setProp, getDomain, setDomain, getMetadata, setMetadata}, and the summary of the execution of the symbolic getProp(l̂, p̂) getter is illustrated below:

Similarly, the C getters and setters are given by A_C = {getCell, setCell, getBound, setBound, getFreed, setFreed}, and the summary of the execution of the symbolic getCell(l̂, ô) getter is illustrated below:

*(Diagram: symbolic execution of getCell(l̂, ô), branching on whether l̂ is in the domain of the memory, whether the block has been freed, whether ô is in the domain of the block contents, and whether ô lies beyond the bound n̂; outcomes range from a missing block, through use-after-free and buffer overrun errors, to a successful lookup.)*

The similarities in the two diagrams are evident, with the main difference being that JS getters do not throw errors, whereas C getters do.

User-defined JS and C Predicates. Core predicates describe fundamental units of the TL memory model. On top, user-defined predicates build layers of abstraction to describe memory components and building blocks, standard library interfaces, all the way to complex data structures for particular code such as the AWS message header. Using Gillian notation, we present some of the JS and C user-defined predicates; in this notation: ∗ and ∧ are conflated to ∗, with automatic differentiation between spatial and pure assertions<sup>11</sup>; predicate definitions are separated with a semi-colon; and logical variables are prefixed with the # symbol and are implicitly existentially quantified in predicate definitions.

Gillian-JS inherits many user-defined predicates from JaVerT [21], including simple ones for describing JS objects and their properties, as well as advanced ones for specifying scoping, function closures and prototype chains. We focus here on the new FrozenObject(o, proto, pvs) predicate, which describes a frozen object<sup>12</sup> o with prototype proto and property-value pairs pvs. We first define the predicate FrozenObjectProps(o, pvs) to grab the resource of the object properties:

```
pred FrozenObjectProps(o, pvs) : pvs = [ ];
    pvs = [#p, #v] :: #rpvs * DataPropConst(o, #p, #v) *
    FrozenObjectProps(o, #rpvs);
```
where DataPropConst(o, #p, #v) states that the object o has a non-writable property #p with value #v. We then add information about the object prototype and its non-extensibility using the JSObject(o, proto, ext) predicate, and also state that the object has no properties other than pvs using the domain core predicate:

<sup>11</sup> From the separation logic literature, the pure assertions can be regarded as dotted.

<sup>12</sup> A JS object is frozen if it cannot be extended and all its properties are non-writable.

```
pred FrozenObject(o, proto, pvs) :
    JSObject(o, proto, false) * FrozenObjectProps(o, pvs) *
    FirstProj(pvs, #ps) * ListToSet(#ps, #pss) * domain(o, #pss)
```
where FirstProj(pvs, #ps) means that the list #ps is the first projection of the list of pairs pvs, and ListToSet(#ps, #pss) means that the elements of the list #ps form the set #pss.

Gillian-C, on the other hand, comes with user-defined predicates capturing, for example, arrays and blocks in memory, as well as automatically-generated predicates describing C structs, with support for nested structs. In particular, the array(b, off, c) predicate describes a contiguous fragment of a block b, starting from offset off, with contents described by the mathematical list c:

```
array(b, off, c) : c = [];
                   (b, off) -> #c * array(b, off+1, #d) * c = #c :: #d
```
and the block(b, c) predicate captures an entire C block with contents c:

```
block(b, c) : array(b, 0, c) * bound(b, |c|)
```
In the implementation, arrays also exist as core predicates. This allows us to reason about arrays automatically in the symbolic memory (e.g., to split an array into sub-arrays), supported by our tree representation of symbolic blocks, instead of requiring manual application of lemmas.

Finally, we illustrate automatically generated struct-related predicates using the aws\_byte\_cursor structure given below, which contains two fields: an unsigned integer len; and a nullable pointer to an array of 8-bit unsigned integers buf. This struct is used for traversing the AWS message header (cf. §4), and is intended to capture an array in memory that starts at buf and has length len.

```
struct aws_byte_cursor {
    size_t len;
    uint8_t *buf;
}
```

```
pred struct_aws_byte_cursor(cur, len, buf) :
    (cur == [#b, #o]) * ((#b, #o) -int64-> len) *
    ((#b, #o +p 8) -int64-> buf) *
    is_ptr_or_null(buf)
```
The generated predicate describes the struct's layout in memory and gives basic typing information: it states that an aws\_byte\_cursor, starting from the position given by the pointer cur, occupies 16 bytes in memory (8 + 8, given by the type annotation int64), with the first 8 bytes taken by len, and the second 8 bytes (note the pointer addition +p) taken by buf, which is either a pointer or null.
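This layout can be cross-checked with Python's struct module, assuming a 64-bit platform where size_t and pointers are both 8 bytes wide (as the int64 annotations indicate); the address 0x1000 below is purely illustrative:

```python
import struct

# aws_byte_cursor on a 64-bit platform: an 8-byte len followed by an 8-byte pointer
LAYOUT = "<QQ"                    # two unsigned 64-bit fields; no padding required
assert struct.calcsize(LAYOUT) == 16

# serialise a (hypothetical) cursor with len = 5 and buf at address 0x1000
raw = struct.pack(LAYOUT, 5, 0x1000)
length, buf = struct.unpack(LAYOUT, raw)
assert (length, buf) == (5, 0x1000)
assert raw[0:8] == (5).to_bytes(8, "little")  # the first 8 bytes hold len
```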

# 4 AWS Encryption SDK Message Header Specification

The encrypted data handled by the AWS Encryption SDK is stored within a structure called a message [3]. The message format has two versions of similar complexity: we verify version 1; version 2 was introduced very recently. Messages consist of a header, a body, and a footer. Here, we describe only the structure of the header, as we are verifying header deserialisation.

The AWS Encryption SDK message header is a sequence of bytes (buffer) divided into sections, as illustrated below; above each section is its length in bytes.


Our approach is to abstract the header contents into a list and formulate pure predicates that describe its structure in a language-independent way. This allows us to then use the same abstractions as part of further, language-dependent, abstractions for both JS and C. Our design of the abstractions was informed by existing code annotations found in the implementations, which describe simple first-order properties of the code and, in the case of C, can also link to the CBMC [30] bounded model checker. However, these annotations are limited by the expressivity of JS and C, particularly when it comes to reflecting on the memory contents. Our predicates have no such limitations.

We narrow down our exposition to the encryption context, as it illustrates well the language-independent and language-dependent aspects of our specification, and is also the section in which we discovered bugs in both implementations.

Pure Specification of the Encryption Context. The encryption context (EC) is a sequence of bytes that describes a set of key-value pairs. Its structure is given in the diagram below.

The first two bytes represent the number of key-value pairs, denoted by KC, and the rest describe the KC key-value pairs themselves. Keys and values are represented by sequences of bytes and, as they are of variable length, are serialised by first having two bytes that represent the length, followed by that many bytes of the actual key or value; we refer to this pattern as a field, and to a sequence of n fields as an n-element. Then, a key-value pair is serialised as a 2-field element, and all of the key-value pairs form a sequence of KC 2-field elements.

We specify the EC by building layers of abstraction, from fields to elements to element sequences to the EC, each of which can either be complete, incomplete (partial, but with correct structure), or malformed (with incorrect structure). In the implementation, these are specified separately and are joined together in appropriate over-arching abstractions. Here, we focus on complete variants only.

The Field(buf, pos, fld, len) predicate states that the buffer (list of bytes) buf, at index pos, holds a field with contents fld (list of bytes) and total length len:

This predicate uses the GIL operator sub(l, s, n), which returns the sublist of list l starting from index s and of length n, and also the UInt16(rn, n) predicate, which states that n is a 16-bit big-endian interpretation of the raw 2-byte list rn. The Element(buf, pos, fC, elem, len) predicate states that buffer buf at index pos holds a sequence of fC fields, with contents elem (a list of the appropriate field contents) and total length len. It is defined similarly to a standard linked-list predicate, with the 'link' being the fact that the list members are contiguous in memory:

```
pred Element(buf, pos, fC, elem, len) :
  (fC = 0) * (0 <= pos) * (pos <= |buf|) * (elem = [ ]) * (len = 0);
  (0 < fC) * Field(buf, pos, #fld, #fL) *
  Element(buf, pos+#fL, fC-1, #rFs, #rL) *
  (elem = #fld :: #rFs) * (len = #fL+#rL)
```
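The field and element layers described above can be mirrored by a small executable parser. The Python sketch below (our illustration; the helper names are ours, not part of Gillian) parses a field as a 2-byte big-endian length followed by that many content bytes, and an fC-element as fC contiguous fields, returning None for malformed input:

```python
def uint16(buf, i):
    """Big-endian interpretation of two raw bytes (the UInt16 predicate)."""
    return (buf[i] << 8) | buf[i + 1]

def parse_field(buf, pos):
    """Parse a field at index pos: (contents, total length), or None if malformed."""
    if pos + 2 > len(buf):
        return None                          # no room for the 2-byte length
    n = uint16(buf, pos)
    if pos + 2 + n > len(buf):
        return None                          # contents run off the buffer
    return buf[pos + 2:pos + 2 + n], 2 + n

def parse_element(buf, pos, fc):
    """Parse fc contiguous fields (an fc-element), as in the Element predicate."""
    elem, total = [], 0
    for _ in range(fc):
        f = parse_field(buf, pos + total)
        if f is None:
            return None                      # malformed element
        fld, flen = f
        elem.append(fld)
        total += flen
    return elem, total

buf = [0x00, 0x02, 0x61, 0x62,   # field "ab" (length 2)
       0x00, 0x01, 0x63]         # field "c"  (length 1)
assert parse_field(buf, 0) == ([0x61, 0x62], 4)
assert parse_element(buf, 0, 2) == ([[0x61, 0x62], [0x63]], 7)
```

As in the predicate, the "link" between fields is purely positional: each field starts where the previous one ends.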
Next, analogously to Element, we define the Elements(buf, pos, eC, fC, elems, len) predicate, which states that the buffer buf, at index pos, holds a sequence of eC elements, each with fC fields, with contents elems (a list of the appropriate element contents) and of total length len. Finally, the EncryptionContext(buf, KVs) predicate states that the entire buffer buf is an EC with key-value pairs KVs, with all keys being unique:

```
pred EncryptionContext(buf, KVs) : (buf = [ ]) * (KVs = [ ]);
     (#rKC = sub(buf, 0, 2)) * UInt16(#rKC, #KC) * (0 < #KC) *
     Elements(buf, 2, #KC, 2, KVs, #len) *
     FirstProj(KVs, #Ks) * Unique(#Ks) * (2+#len = |buf|)
```
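The layered structure of these predicates can be mirrored by a small executable checker (a hypothetical sketch, not the verified implementation; it accepts exactly the complete, well-formed buffers the predicates describe, including the key-uniqueness condition):

```python
def parse_field(buf: bytes, pos: int):
    """Mirror of Field: 2-byte big-endian length, then that many bytes.
    Returns (contents, total_length) or None if the buffer is too short."""
    if pos + 2 > len(buf):
        return None
    n = int.from_bytes(buf[pos:pos + 2], "big")
    if pos + 2 + n > len(buf):
        return None
    return buf[pos + 2:pos + 2 + n], 2 + n

def parse_element(buf: bytes, pos: int, fc: int):
    """Mirror of Element: fc contiguous fields starting at index pos."""
    fields, total = [], 0
    for _ in range(fc):
        r = parse_field(buf, pos + total)
        if r is None:
            return None
        fld, flen = r
        fields.append(fld)
        total += flen
    return fields, total

def is_encryption_context(buf: bytes) -> bool:
    """Mirror of EncryptionContext: empty, or KC > 0 key-value pairs
    filling the whole buffer, with all keys unique."""
    if buf == b"":
        return True
    if len(buf) < 2:
        return False
    kc = int.from_bytes(buf[0:2], "big")
    if kc == 0:
        return False
    pos, keys = 2, []
    for _ in range(kc):
        r = parse_element(buf, pos, 2)
        if r is None:
            return False
        (key, _value), elen = r
        keys.append(key)
        pos += elen
    return pos == len(buf) and len(keys) == len(set(keys))
```

Note how `parse_element` plays the role of the inductive Element predicate, with the recursion unrolled into a loop over contiguous fields.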
Next, we show how this pure specification of the EC contents can be connected, without modification, to both the JS and C memories.

Encryption Context in JS. In JS, the EC is serialised as an ArrayBuffer, which is a raw binary data buffer in memory, and accessed using a Uint8Array, which is a view on top of that ArrayBuffer starting from a given offset and of a given length, treating the raw data underneath as 8-bit unsigned integers. This Uint8Array view is similar in function to the aws\_byte\_cursor C structure (cf. §3). Abstracting ArrayBuffer contents to lists, we connect these data structures in JS memory to our pure EC specification (cf. Figure 3, top and centre):

```
pred JSSerEC(o, EC, KVs) :
     Uint8Array(o, #aBuf, #off, #len) * ArrayBuffer(#aBuf, #data) *
     (EC = sub(#data, #off, #len)) * EncryptionContext(EC, KVs)
```
In JS, the EC is deserialised into a frozen JS object with prototype null, whose properties represent the keys and hold the values. This is done by converting the keys and the values to UTF-8 strings, and is specified as follows:

```
pred JSDeserEC(o, KVs) : toUtf8(KVs, #sKVs) * FrozenObject(o, null, #sKVs)
```
where toUtf8 converts the list KVs point-wise to strings, obtaining #sKVs.
Fig. 3: Serialised Encryption Context: language-independent pure part (red; middle) and language-specific resource (green; JS above, C below)

Finally, the specification of the decodeEncryptionContext function states that the EC deserialisation is performed correctly:

```
{ JSSerEC(eEC, #EC, #KVs) }
  function decodeEncryptionContext(eEC)
{ JSSerEC(eEC, #EC, #KVs) * JSDeserEC(ret, #KVs) }
```

Encryption Context in C. In C, the EC is serialised as a block in memory, and is traversed using an AWS byte cursor. Using the auto-generated predicate given in §3, we define the aws\_byte\_cursor(cur, buf, c) predicate, stating that cur points to a byte cursor which has access to an array starting from buf, and holding contents c, making the length implicit:

```
pred aws_byte_cursor(cur, buf, c) :
  struct_aws_byte_cursor(cur, #len, buf) * (buf = [#b, #off]) *
  array(#b, #off, c) * (#len = |c|)
```
A serialised EC can then be described as a valid byte cursor whose contents represent the EC key-value pairs (cf. Figure 3, centre and bottom):

```
pred CSerEC(cur, buf, EC, KVs) :
  aws_byte_cursor(cur, buf, EC) * EncryptionContext(EC, KVs)
```
In C, the EC is deserialised into an AWS hash table, whose keys and values directly correspond to the key/value pairs of the EC, specified as follows, eliding the internal structure of the hash tables due to space constraints:

```
pred CDeserEC(ht, KVs) : valid_hash_table(ht, KVs)
```
The specification of the EC deserialisation function is more complex than for JS. In particular, the byte cursor that originally pointed to the EC ends up shifted to the end of the byte buffer, exposing the array underneath the CSerEC predicate.

```
{ empty_hash_table(ec) * CSerEC(cur, #buf, #EC, #KVs) }
  int aws_cryptosdk_enc_ctx_deserialize(
      struct aws_hash_table *ec, struct aws_byte_cursor *cur)
{ (ret = 0) * CDeserEC(ec, #KVs) * (#buf = [#b, #off]) *
  array(#b, #off, #EC) * aws_byte_cursor(cur, #buf +p |#EC|, [ ]) }
```
# 5 AWS Encryption SDK Message Header Verification

Using Gillian-JS and Gillian-C, together with the specifications given in §4, we verify full functional correctness of the header deserialisation module of the AWS Encryption SDK JS [2] (~200 loc) and C [1] (~950 loc) implementations. In particular, we verify that the deserialisation of a complete header is correct, and that the deserialisation of an incomplete or a malformed header raises an appropriate error.

Verification Effort and Performance. The JS verification took 3 person-months and the C verification took 2 person-months, with the latter taking less time because a large part of the infrastructure developed for JS could be re-used. We substantially improved the first-order solver of Gillian to reason automatically about complex operations on lists of symbolic length, first used in the modelling of JS ArrayBuffers and then for C dynamic arrays. We created a collection of language-independent predicates and lemmas about their inductive properties (~1.2kloc) that cover the project-specific AWS header, but also reusable first-order concepts such as list element uniqueness, projections of lists of pairs, conversion from bytes to numbers, and conversion from raw bytes to strings. Similarly, we also had to create language-dependent abstractions and associated lemmas for the JS and C manipulation of the AWS message header (~1.2kloc). Finally, we had to: annotate the code with specifications and loop invariants, with the latter often having more than twenty components; manually apply lemmas to prove numerous complex entailments; and manually unfold user-defined predicates at times (the folding is automated) (~1.1kloc).

On a machine with an Intel Core i7-4980HQ CPU 2.80 GHz, DDR3 RAM 16GB, and a 256GB solid-state hard-drive running macOS, the JS verification takes approximately 45 seconds and the C verification takes approximately six minutes. The C time is longer, in part due to the larger codebase, but mainly due to the complexity of the implementation of the full C memory model, which is able to reason about arrays of symbolic size. This requires frequent satisfiability checks and (for the moment) branching on non-zero array size. These times could both be improved with the implementation of basic merging techniques.

JS Verification: Bugs/Improvements. We discovered two bugs and improved one function implementation to link better with the underlying data structures.

The parameters of the improved function were non-intuitive (it received eC · fC, buf, and pos), and it used complex array operations to re-form the final return value. We re-implemented this function to construct the returned array of arrays efficiently, simplifying specification and verification, and our implementation was integrated into the codebase.

JS Verification: Caveats. Our JS verification is correct up to the following caveats. First, as the AWS SDK JS implementation is written in TypeScript, we elide types to obtain JS; this could be automated, potentially generating predicates from the types. Next, some ES6 features, such as patterns in function parameters, are not yet supported by Gillian-JS; these we rewrite to ES5 Strict, preserving their meaning. Next, we use axiomatic specifications of the ArrayBuffer, DataView, and UInt8Array ES6 built-in libraries, as well as of the Object.freeze and Array.prototype.map built-in functions. These would ideally be accompanied with implementations, tested against the official Test262 test suite [16] and verified against their specifications. Finally, as Gillian does not support higher-order reasoning, we axiomatise the toUtf8 function, passed into the deserialisation module as a parameter, as an injective function from raw bytes to JS strings.

C Verification: Bugs. We discovered three bugs: one logical error; one undefined behaviour; and one over-allocation.


C Verification: Caveats. Our C verification is correct up to the following caveats. First, we do not use the aws\_byte\_cursor\_advance\_nospec function, which advances the byte cursor, but also uses complex computation to protect against the Spectre bug. We instead use aws\_byte\_cursor\_advance, which has equivalent behaviour, as our specifications are not expressive enough to capture this distinction. Next, we axiomatise the functions of the AWS hash tables and array list libraries, as their verification is of comparable complexity to the entire deserialisation module. Finally, the AWS allocators of the C implementation, which are passed into some of the functions, contain pointers to memory management functions; this is higher-order in nature. In the verification, we assume those functions are malloc, calloc, and realloc.

### 6 Related Work

The literature explores many techniques and tools for verifying JS [44,18,22,21] and C [23,26,28,13,7]. We describe: multi-language verification architectures; JS and C verification tools based on separation logic; C memory models related to our models; and other analyses applied to the AWS Encryption SDK.

Multi-Language Verification Architectures. The multi-language verification architectures closest to Gillian are coreStar [6] and Viper [36,35]. Both of these architectures were designed to serve as verification back-ends for TLs and both have at their core a simple intermediate representation with a dedicated symbolic execution engine<sup>13</sup>. However, they work with the TL in different ways.

In coreStar, TL core assertions are modelled as abstract predicates and memory actions as function calls. The function specifications play the role of our consumer and producer actions. The user also has to provide logical axioms, describing properties of the abstract predicates. The Gillian equivalent of these axioms are the implementations of the memory actions using consumers and producers, which can be optimised, but require understanding of the inner workings of Gillian. Like Gillian, coreStar's symbolic execution engine is parametric on the underlying logical theory and can thus be used to reason about any memory model representable using abstract predicates. It is, however, unclear how efficiently this can be done. coreStar has been used inside the tool jStar [15], which has verified implementations of several Java design patterns but was not pushed to more complex Java code. In [21], the authors observed that coreStar was unable to tractably handle even simple JS programs.

Unlike Gillian and coreStar, Viper [35,36] comes with a fixed intermediate language, also called Viper. The user encodes their memory model and corresponding core assertions into the memory model and assertion language of Viper. A key advantage of Viper lies in its expressive permission model, which includes fractional, recursive, and abstract read permissions, as well as in its support for custom mathematical domains, which enable users to extend Viper with their own first-order theories, tailored to the data structures at hand. Viper has mechanisms similar to our consumer and producer actions, called inhale and exhale. Viper can reason about both sequential and concurrent programs, and has been used to verify programs written in Java, Go, Rust, and Python, but not JS and C. In fact, it is not clear to us how difficult it would be to use Viper to reason about JS objects and the linear memory of C, as neither can be simply expressed using the static objects natively provided by Viper.

Semi-automatic JS and C Verification Tools. There are very few verification tools for JS based on separation logic. For example, JaVerT [21] has been used to verify simple sequential data-structure algorithms. Its successor, JaVerT 2.0 [22], provides whole-program symbolic testing, verification and bi-abductive reasoning [10], unified by a core symbolic execution engine.

<sup>13</sup> Viper includes both a symbolic execution engine and a verification condition generator based on Boogie [5] for its intermediate language.

JaVerT 2.0 verification is more efficient than JaVerT verification, but has still only been applied to simple data-structure algorithms. Gillian [19] builds on JaVerT 2.0, taking the highly non-trivial step of designing the intermediate language, correctness results, and implementation to be parametric on the TL memory models. Despite this generalisation, Gillian substantially outperforms JaVerT 2.0, both for symbolic testing [19] and for verification.

VeriFast [26] and the tool in [7] are prominent examples of semi-automatic tools that provide functionally-correct verification of C programs using separation-logic specifications. These tools work with C fragments and simplified memory models. While the tool in [7] has not been applied to real-world code, VeriFast has been used to verify, e.g., an implementation of a Policy Enforcement Point (PEP) for Network Admission Control scenarios [38]. One difference between these tools and Gillian is that Gillian specifications can express negative resource, allowing us to differentiate missing-resource errors from use-after-free errors. However, VeriFast, unlike Gillian, supports reasoning about concurrent programs. There is also much work on using theorem provers to verify both sequential and concurrent C code using separation logic: see, for example, the DeepSpec project [45] and the Iris project [47], which we do not describe here.

Related Formal C Memory Models. Our compositional C memory models were inspired by CompCert [32] and the CH2O formalisation of Krebbers [29]. In particular, our concrete C model is adapted from the complete model of CompCert, which supports reasoning about programs that access in-memory data representations. This feature is used by the AWS deserialisation algorithm, which reads the buffer contents at the byte-granularity.

We present our compositional symbolic C memory model in this paper as a simple lifting of the concrete one. Our implementation is more complex, however, representing blocks as trees holding symbolic values and combining the concepts of memory trees and abstract values from the concrete memory model of the CH2O formalisation. Although not mentioned in [29], CH2O does keep track of some negative resource in that it maintains freed locations, but not block bounds.

Analysis of the AWS Encryption SDK. Amazon has recently directed considerable effort towards the formal analyses of their codebase, with a number of tools incorporated into their CI pipeline. For example, the main cryptographic algorithms of the AWS Encryption SDK have certified implementations in the specification language Cryptol [17], underpinned by SAW [12]. These implementations, however, have not yet been proven equivalent to the corresponding C implementation. In addition, the C implementation of the AWS Encryption SDK includes a symbolic test suite run using CBMC [30]. This implementation makes heavy use of the aws-c-common data-structure library, which is annotated with first-order assertions checked by CBMC. CBMC is a mature, industrial-strength tool, likely to outperform and have broader coverage than the symbolic testing of Gillian-C, with substantially fewer annotations than Gillian verification. However, as CBMC is a bounded model checker, it provides weaker correctness guarantees and is not compositional. Its expressivity is also somewhat constrained by the expressivity of the C runtime. For example, it does not allow reasoning about the size of allocated memory. Gillian specifications have this expressivity, as highlighted by the discovered over-allocation bug. The subtle logical bug found by Gillian also demonstrates the importance of being able to express full, functionally-correct specifications. We believe there has been no previous analysis of the JS implementation of the AWS Encryption SDK.

# 7 Conclusions

We have introduced compositional verification to the Gillian platform. Our work includes a methodology for designing compositional TL memory models, distinguishing negative resource from missing resource and using the JS and C memory models as demonstrator examples. It also includes a novel, parametric approach to assertion interpretation, independent of the TL, enabling compositional use of function specifications in verification. We have been able to push the Gillian verification to self-contained, critical, real-world AWS JS and C code. The bugs and suggestions for code improvements that arose during this verification process have all been accepted by the developers and incorporated into the codebase. To our knowledge, this is the first time that industry-grade JS code has been fully verified and the first time that, in one verification platform, the same abstractions were used to verify industry code from languages as different as JS and C. The artifact accompanying this paper can be found at [34], and the entire Gillian development at [46]. In future, we will publish correctness results for Gillian verification [33], as part of an in-depth theoretical study of program correctness and incorrectness for symbolic testing, verification and bi-abductive reasoning being developed in Gillian.

Acknowledgments. We thank AWS engineers and Mike Dodds from Galois for several inspiring meetings which led to our focus on verifying the JS implementation of the AWS Encryption SDK message header deserialisation module. We would especially like to thank Ryan Emery for many detailed discussions about his JS code. We thank the reviewers, whose comments have improved the overall quality of the paper. Gardner and Maksimović were supported by the EPSRC Fellowship 'VetSpec: Verified Trustworthy Software Specification' (EP/R034567/1). Fragoso Santos was supported by national funds through Fundação para a Ciência e a Tecnologia (UIDB/50021/2020, INESC-ID multi-annual funding) and project INFOCOS (PTDC/CCI-COM/32378/2017). Ayoun was supported by a Department of Computing PhD Scholarship from Imperial.

### References



**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Debugging Network Reachability with Blocked Paths**

S. Bayless(B) , J. Backes, D. DaCosta, B. F. Jones, N. Launchbury, P. Trentin, K. Jewell, S. Joshi, M. Q. Zeng, and N. Mathews

> Amazon Web Services, Seattle, USA sabayles@amazon.com

**Abstract.** In this industrial case study we describe a new network troubleshooting analysis used by VPC Reachability Analyzer, an SMT-based network reachability analysis and debugging tool. Our troubleshooting analysis uses a formal model of AWS Virtual Private Cloud (VPC) semantics to identify whether a destination is reachable from a source in a given VPC configuration. In the case where there is no feasible path, our analysis derives a *blocked path*: an infeasible but otherwise complete path that would be feasible if a corresponding set of VPC configuration settings were adjusted.

Our blocked path analysis differs from other academic and commercial offerings that either rely on packet probing (e.g., tcptrace) or provide only partial paths terminating at the first component that rejects the packet. By providing a complete (but infeasible) path from the source to destination, we identify for a user all the configuration settings they will need to alter to admit that path (instead of requiring them to repeatedly re-run the analysis after making partial changes). This allows users to refine their query so that the blocked path is aligned with their intended network behavior before making any changes to their VPC configuration.

# **1 Introduction**

This paper describes a new network connectivity troubleshooting analysis used by VPC Reachability Analyzer, a service that analyzes Amazon Web Services' (AWS) Virtual Private Cloud (VPC) configurations.

VPCs are user-configured networks of virtual compute devices and resources. AWS VPC offers dozens of networking components and controls to give users flexibility in configuring their networks. Access to these resources is logically isolated within virtual networks configured by the users. As VPCs grow in size and complexity, users can increasingly benefit from automation to identify and resolve misconfigurations, as well as to validate that applications maintain security and availability invariants through infrastructure changes.

VPC Reachability Analyzer uses the Tiros [2] formal model of AWS VPC networking semantics to identify whether a destination is reachable from a source in a given VPC configuration. If the destination is reachable, then Tiros identifies a *feasible path* from the source to the destination, where a path is a sequence of network components associated with incoming and/or outgoing packet header assignments (protocol, addresses, ports). The outgoing packet header of one component is the incoming packet header of the next component. Paths may also identify relevant VPC configuration details such as the specific routes, firewall rules, or other settings admitting the packet at each step. Each component in a VPC may accept or reject incoming and outgoing packet headers; a *feasible path* is a path in which every component on the path accepts both its incoming and outgoing packet header.
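The feasibility condition above can be phrased as a small executable check (a hypothetical representation of paths and components, for illustration only; Tiros itself encodes this symbolically, as described in Sect. 2.3):

```python
def is_feasible(path):
    """Check the feasible-path condition on a path given as a list of
    (component, header_in, header_out) triples, where each component is
    a dict of two predicates over packet headers."""
    for i, (comp, h_in, h_out) in enumerate(path):
        # the outgoing header of one component is the incoming header of the next
        if i > 0 and path[i - 1][2] != h_in:
            return False
        # every component must accept both its incoming and outgoing header
        if not (comp["accepts_in"](h_in) and comp["accepts_out"](h_out)):
            return False
    return True
```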

Tiros's analysis is static, *i.e.*, Tiros does not inject traffic into VPC configurations, and is complete for the subset of AWS VPC semantics it supports: if there exists a path connecting the source and destination, Tiros will find it. Since 2018, Tiros has powered the commercially available *Network Reachability* assessment in Amazon Inspector [1], statically identifying ports on EC2 Instances (virtual machines) accessible outside of their VPCs.

In this work, we extend Tiros by introducing a new diagnostic *blocked path* analysis when there is not a feasible path, to help users understand why their query is infeasible. A *blocked path* is a path as defined above, in which at least one component rejects its incoming or outgoing packet, along with one or more *blocking reasons*: elements of the VPC configuration preventing one or more components on the path from accepting packets. The blocked path identifies a sufficient set of blocking reasons, such that if each were addressed the query would be satisfiable.

Previous tools for connectivity diagnosis typically provide a partial path, up to the first component or rule that rejects the packet; in some cases those tools also identify a single blocking reason. Remediations based on a partial path may address that initial blocking reason only for the user to discover that further remediations are necessary, or that the remediation is working towards a path that the user will ultimately reject. Providing a complete blocked path connecting the source and destination allows users to ensure that their intent is aligned with our diagnosis before taking any corrective actions.

Our contributions in this work are:


# **2 Background**

#### **2.1 Related Works**

Many previous works have proposed network reachability diagnosis tools, including both widely-used industry tools and academic literature. These tools can be broadly divided into model-based and non-model-based approaches.

Non-model-based network diagnostic tools include system applications such as iptrace and tcptrace, commercial tools such as *Cisco Packet Tracer* [7], and academic works such as *Tulip* [12]. These tools trace live packets through a network or routing device, identifying the sequence of addresses of devices that accept the packet. Packet tracing tools lack visibility into the configuration settings that block and route packets.

Model-based tools [2,5,6,13,16] statically analyze reachability between a specified source and destination in a network or routing device. Rather than transmitting live packets, these tools use formal methods such as constraint solvers to rigorously identify feasible paths. Existing model-based tools provide control-plane level information when there is a feasible path, but produce either no information for unreachable paths, or identify only the first (out of potentially many) reasons why a path is blocked.

Our blocked path analysis is based on deriving minimal correction subsets (described below), which several previous works have proposed for general-purpose SAT-based error diagnosis or repair [4,8,9,17].

#### **2.2 Minimal Correction Subsets**

The blocked path analysis we describe in Sect. 3 relies on two related concepts: Maximal Satisfiable Subsets (MSS) and Minimal Correction Subsets (MCS), which we define below. Following the definitions from [14]:

**Definition 1 (MSS).** S ⊆ F *is a Maximal Satisfiable Subset of constraints* F *iff* S *is satisfiable and* ∀c ∈ F \ S, S ∪ {c} *is unsatisfiable.*

**Definition 2 (MCS).** C ⊆ F *is a Minimal Correction Subset of constraints* F *iff* F \ C *is satisfiable and* ∀c ∈ C, (F \ C) ∪ {c} *is unsatisfiable.*

The complement of an MCS, F \ *MCS*(F), is guaranteed to be a maximal satisfiable subset of F; for this reason the MCS is sometimes called the coMSS.<sup>1</sup>

In general, the MCS and MSS are not guaranteed to be unique. There is a close connection between the definition of a Maximal Satisfiable Subset and MaxSAT [10]: The largest MSS (and therefore smallest MCS) corresponds to a solution to MaxSAT. Indeed, one approach for computing the MCS is to compute MaxSAT and take the complement. Efficient algorithms for directly computing the (not necessarily smallest) MCS without computing MaxSAT are available and are typically much faster than computing MaxSAT; a good survey of MCS algorithms including an empirical evaluation can be found in [14].

In constraint optimization problems, it is common to consider hard and soft constraints, in which only the soft constraints may be relaxed. Definition 2 assumes that all constraints are soft, but can be easily extended to support a mix of soft and hard constraints (where the MCS must contain only soft constraints). In this case, the MCS is only well defined if the hard constraints are satisfiable.

<sup>1</sup> Note that a minimal correction subset is a distinct concept from an unsatisfiable core [11]. An unsatisfiable core is always unsatisfiable, but its complement F \ CORE(F) is not guaranteed to be satisfiable; in contrast, an MCS may or may not be satisfiable, but its complement is guaranteed to be satisfiable.

In Sect. 4, we will use a function computeMCS(*Soft, Hard*) that supports both hard and soft constraints. computeMCS returns a minimal correction set C = *MCS*(*Soft* ∪ *Hard*), with C ⊆ *Soft*. Our implementation of computeMCS uses a simple binary search, similar to FastDiag [4], or Algorithm BFD from [14]. We add activation literals to the soft constraints to allow the underlying solver instance to be re-used incrementally while testing different subsets of soft constraints for satisfiability.
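As a toy illustration of MCS computation (a sketch using a brute-force satisfiability check and a linear scan, rather than the incremental binary search described above; the constraint representation is invented for this example):

```python
from itertools import product

def satisfiable(constraints, n_vars):
    """Brute-force check: some assignment of n_vars booleans satisfies all
    constraints, each given as a predicate over an assignment tuple."""
    return any(all(c(a) for c in constraints)
               for a in product([False, True], repeat=n_vars))

def compute_mcs(soft, hard, n_vars):
    """Greedily keep each soft constraint that is jointly satisfiable with
    the hard constraints and everything kept so far; the rejected
    constraints form a minimal correction subset of the soft constraints."""
    assert satisfiable(hard, n_vars), "MCS undefined if hard part is unsat"
    kept, mcs = list(hard), []
    for c in soft:
        if satisfiable(kept + [c], n_vars):
            kept.append(c)
        else:
            mcs.append(c)
    return mcs
```

For soft constraints x0, ¬x0, x1 (and no hard constraints), the scan keeps x0 and x1 and rejects ¬x0, so the MCS is {¬x0}: removing it makes the remainder satisfiable, and adding it back does not.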

### **2.3 Network Reachability**

We use the SMT-encoding of AWS VPC network semantics previously described in Tiros [2]. In this section, we briefly review this graph-based encoding; we refer readers to [2] for more details.

We take as input a configuration describing one or more user VPCs, and a user-specified reachability query, consisting of a source and destination component in the VPC. For example, the source of the query may be an internet gateway, and the destination may be an EC2 Instance. A query may also optionally specify additional constraints, such as the protocol, a range of source or destination addresses or ports for the packet, or an intermediate component that must (or must not) be on the path.

**Fig. 1.** Simplified example symbolic graph representation of a VPC (*left*), with symbolic packet header consisting of bitvectors (*right*). Edges in the graph are associated with theory atoms, and are traversable only if those atoms are assigned true. Two example constraints, enforcing that a network interface is only accessible if the packet is addressed to/from that interface, are shown. These constraints relate edge atoms in the symbolic graph to the bitvectors in the symbolic packet header to enforce AWS VPC semantics.

We encode VPC configurations as constrained symbolic graphs using the SMT solver MonoSAT [3], with fixed-width bitvectors representing the protocol, port, and addressing information in a symbolic packet header. Figure 1 shows a symbolic graph along with a packet header and example constraints.

VPC components are represented as nodes in the symbolic graph. Each component has semantics governing which packets it will accept; these semantics are encoded as constraints that restrict which edges incident to that component's node are traversable, depending on the assignment of the packet header variables. A satisfying assignment to the full set of constraints corresponds to a feasible path. In such an assignment, the bitvector variable assignments provide the packet header(s) and the graph theory model provides a path of network component nodes connecting the source and destination of the user's query.
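The encoding can be approximated in miniature with a brute-force search in place of an SMT solver (the two-component graph and its acceptance rules below are invented, for illustration only):

```python
def find_feasible_path(graph, src, dst, header_space):
    """graph maps (u, v) edges to a predicate over packet headers, the
    analogue of an edge's theory atom plus its constraints. Search for a
    header and a simple path from src to dst whose edges all accept it."""
    def dfs(node, header, visited):
        if node == dst:
            return [node]
        for (u, v), accepts in graph.items():
            if u == node and v not in visited and accepts(header):
                rest = dfs(v, header, visited | {v})
                if rest:
                    return [node] + rest
        return None

    for header in header_space:
        path = dfs(src, header, {src})
        if path:
            return header, path
    return None

# Toy VPC: a security group only admits web traffic, and the internet
# gateway only forwards packets from instances with a public IP.
toy_graph = {
    ("instance", "sg"): lambda h: h["port"] in (80, 443),
    ("sg", "igw"): lambda h: h["public_ip"],
}
```

A query then amounts to enumerating the (here, tiny) header space; a satisfying header plus path plays the role of the satisfying assignment described above.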

Some components (such as NAT gateways) transform and retransmit packets. Tiros supports this by unrolling the VPC configuration graph into multiple copies with separate packet header variables. Edges from packet-transforming components connect to the corresponding components in the next unrolled section of the graph. Tiros unrolls the graph to a sufficient depth to model the behavior of the components for each query.

Query source and destination reachability is enforced with a single graph theory reachability predicate requiring a feasible path in the VPC configuration graph from the source to the destination of the query. Query restrictions requiring intermediate components are enforced using additional reachability predicates. Query restrictions requiring that a given resource not occur on a path are enforced by excluding that resource from the VPC configuration graph representation. Packet header restrictions are enforced using bitvector constraints.

If the constraints are satisfiable, Tiros extracts a reachable path satisfying the query from the satisfying assignment to the constraints. In the next section, we will discuss how we extend Tiros to also provide diagnostic feedback in the case where the constraints are unsatisfiable.

# **3 Blocked Paths for Network Configuration Diagnosis**

We introduce the notion of *blocked path* for analyzing infeasible network connections. As shown in Fig. 2, a blocked path is an infeasible but otherwise complete path from a source to a destination, in which one or more edges or nodes are annotated with *blocking reasons*: configuration settings or network semantics that explain why that transition in the path is infeasible.

Unlike a live packet trace, a blocked path continues past components that reject or redirect the packet so as to reach the user's intended destination, potentially transiting through multiple infeasible steps along the way.

#### **Definition 3 (Blocked path).**

*1. A blocked path is a complete (but infeasible) path from a source to a destination in a network, satisfying the user's query.*


**Fig. 2.** Two alternative blocked paths from an EC2 instance to an internet gateway. These blocked paths take different routes, and have different *blocking reasons* (shown in red) that explain why those paths are infeasible. In the first blocked path, there are two blocking reasons: the security group egress rule rejects packets destined for the Internet, and the internet gateway requires that the source instance must have a public IP address. Note that although the packet would be rejected by the security group, the blocked path continues past the security group to identify a complete (but infeasible) path to the internet gateway. The second blocked path transitions through an intermediate NAT gateway, which satisfies the security group rule and also has a public IP address. However, this path is still blocked, because the route table does not have an applicable route to the NAT gateway.

#### **Validating User Intent**

Showing a complete path from the source to the destination, along with all the relevant configuration settings blocking that path, allows users to confirm that the implied remediation matches their intended network behavior before making any changes. However, in many cases there are multiple ways to adjust a configuration to admit a path, resulting in different blocked paths.

For example, Fig. 2 shows two example blocked paths to an internet gateway from an EC2 instance lacking a public IP address. Our analysis might initially present the user with the shorter blocked path. Two remediation steps are required to admit this shorter path: the user must adjust the security group rule of the instance to admit egress packets to the public internet, and the user must also associate a public IP address with the source instance. Upon seeing the complete blocked path, the user may immediately determine that this would be the wrong solution for their network.

If the proposed blocked path does not match the user's intent, we allow users to submit a refined query so as to generate an alternative blocked path. For instance, the user may specify allowed address or port ranges for the packet, or specify components that must or must not appear on the path. In the example above, the user may submit a refined query specifying that a NAT gateway must be an intermediate component on the path, in which case we might produce the longer blocked path from Fig. 2.

#### **Actionable Blocked Paths**

In some cases, there may not exist any combination of VPC configuration adjustments that would allow a query to be satisfied. For example, under typical conditions in VPCs, route tables cannot be adjusted to redirect packets that are destined for a local address within the VPC. It is possible for users to specify queries that cannot be satisfied without violating this local route restriction.

In principle, it is possible to derive a blocked path with non-user-configurable blocking reasons; however, the resulting paths may behave in misleading or confusing ways, and in general will not be achievable in any real configuration of the user's VPC. If possible, we want to ensure that the path contains only user-configurable blocking reasons, so that we produce an *actionable* finding for users. However, we still want to be able to provide useful diagnostics in cases where no actionable blocked path is possible (*e.g.*, to explain to the user that the local route restriction will prevent their path).

In Sect. 4, we describe how we determine when it is not possible to produce a blocked path without including non-configurable blocking reasons. In this case, we produce a partial path up to that first non-configurable blocking reason.

Additionally, in some cases a user may specify a query that remains unsatisfiable even if all of the network semantics in our model are relaxed. This can occur if the user specifies components that do not exist, or that are in isolated, disconnected networks (for which no relaxation of the edge constraints will admit a path). In this case, our blocked path analysis fails, and Tiros falls back on other techniques to produce diagnostic information.

In Sect. 5 we show that in most cases, our analysis succeeds and produces an actionable blocked path.

# **4 Deriving Blocked Paths from Unsatisfiable Queries**

We group VPC configuration semantics into three disjoint sets of constraints: (*U* ∪ *N* ∪ *H*). Set *U* contains constraints that enforce user-configurable control-plane settings (such as a user-defined route or firewall rule), while set *N* contains non-configurable constraints for user-visible network semantics (such as the local route restriction).

Set *H* contains elements of the constraints that are either not user-visible (such as internal implementation details) or that should never be relaxed (such as the reachability predicate or any other constraints defined by the user's query). For example, many of our constraints involve containment comparisons between CIDRs and bitvectors representing IP addresses. An individual CIDR comparison is encoded as a fresh literal representing the truth value of the comparison, along with multiple clauses that enforce the comparison semantics. The intermediate clauses that enforce the comparison semantics are implementation details that we include in set *H*, ensuring they are not included in the blocking reasons.
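As an illustration of the containment check that such a comparison encodes, the following sketch performs the same prefix comparison directly in Python. This is our own executable stand-in for the solver's bitvector clauses (the function name and the 32-bit IPv4 assumption are ours, not from the paper):

```python
import ipaddress

def ip_in_cidr(ip: int, network: int, prefix_len: int) -> bool:
    # Compare only the top prefix_len bits of the 32-bit addresses,
    # mirroring the bitvector comparison the solver clauses enforce.
    mask = ((1 << prefix_len) - 1) << (32 - prefix_len) if prefix_len else 0
    return (ip & mask) == (network & mask)

addr = int(ipaddress.IPv4Address("10.0.1.7"))
net = int(ipaddress.IPv4Network("10.0.0.0/16").network_address)
print(ip_in_cidr(addr, net, 16))  # True: 10.0.1.7 is inside 10.0.0.0/16
print(ip_in_cidr(addr, net, 24))  # False once a /24 match is required
```

In the solver encoding, the boolean result of such a comparison is the fresh literal; only that literal, never the intermediate clauses, can surface as a blocking reason.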

When a query is unsatisfiable, we derive a blocked path and corresponding blocking reasons from a Maximal Satisfiable Subset (MSS) and Minimal Correction Subset (MCS) of (*U* ∪ *N* ∪ *H*), with set *H* treated as hard constraints that must not be included in the MCS.

If possible, we want to produce an MCS containing only configurable blocking reasons from *U*. This ensures that the resulting blocked path is actionable. If we directly compute the MCS of the full constraint set *U* ∪ *N* ∪ *H*, with both *U* and *N* as soft constraints, non-configurable constraints from *N* may be included in the MCS even in cases where there exists an MCS containing only constraints from *U*. On the other hand, we still want to be able to produce an MCS in the case where the non-configurable and hard constraints (*N* ∪ *H*) are, by themselves, unsatisfiable.

In Algorithm 1, we resolve this by breaking the computation of the MCS into two steps: we first compute an MCS of *N* ∪ *H*, and only allow constraints from *N* into the blocking reasons if MCS(*N* ∪ *H*) is non-empty.
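The two-step scheme can be sketched with a toy brute-force solver. This is our own illustration, not the paper's implementation: the exhaustive `sat` check stands in for the SMT solver, and the greedy grow loop yields a minimal (not necessarily minimum-size) correction set:

```python
from itertools import product

def sat(constraints, variables):
    # Brute-force SAT check over boolean variables (stand-in for the solver).
    return any(all(c(dict(zip(variables, bits))) for c in constraints)
               for bits in product([False, True], repeat=len(variables)))

def grow_mcs(hard, soft, variables):
    # Greedily grow an MSS on top of satisfiable `hard`; the soft
    # constraints that cannot be added form the complementary MCS.
    mss, mcs = list(hard), []
    for c in soft:
        (mss if sat(mss + [c], variables) else mcs).append(c)
    return mcs

def two_step_mcs(U, N, H, variables):
    # Step 1: correct only the non-configurable constraints N against hard H.
    mcs_n = [] if sat(H + N, variables) else grow_mcs(H, N, variables)
    n_kept = [c for c in N if c not in mcs_n]
    # Step 2: correct the configurable constraints U against H plus surviving N.
    mcs_u = grow_mcs(H + n_kept, U, variables)
    return mcs_n, mcs_u

# Toy instance: H forces a or b, while U asks for both not-a and not-b.
H = [lambda e: e["a"] or e["b"]]
U = [lambda e: not e["a"], lambda e: not e["b"]]
mcs_n, mcs_u = two_step_mcs(U, [], H, ["a", "b"])
print(len(mcs_n), len(mcs_u))  # 0 1: one configurable constraint is relaxed
```

Because *N* is empty and *H* alone is satisfiable here, the first step returns an empty correction set and only a configurable constraint ends up in the blocking reasons, matching the actionable case described above.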

When *N* ∪ *H* is satisfiable, Algorithm 1 produces a blocked path that contains only the configurable blocking reasons from *U*.

Algorithm 1 constructs two correction sets, *MCS*<sub>*N*</sub> ⊆ *N* and *MCS*<sub>*U*</sub> ⊆ *U*, with *MCS*<sub>*N*</sub> ∪ *MCS*<sub>*U*</sub> a valid MCS of (*U* ∪ *N* ∪ *H*). We then extract a path *p* from a satisfying assignment to the corresponding MSS (*U* ∪ *N* ∪ *H*) \ (*MCS*<sub>*N*</sub> ∪ *MCS*<sub>*U*</sub>). Finally, as shown below, we return either a complete or a partial blocked path, by associating blocking reasons from the MCS with nodes on that path.

Algorithm 1 relies on two helper methods, ExtractPath and BuildPath. ExtractPath retrieves the satisfying theory model (a sequence of edges) for the query reachability predicate from a satisfiable formula, using the graph theory in the SMT solver MonoSAT, and associates packet header assignments with each step of that path from the corresponding bitvector assignments. BuildPath maps the literals of the MCS to descriptive strings representing blocking reasons, and associates those strings with steps on the blocked path.



We can see that *MCS*<sub>*N*</sub> ∪ *MCS*<sub>*U*</sub> meets the definition of a minimal correction set of *U* ∪ *N* ∪ *H* by observing that:

SAT(((*U* ∪ *N* ∪ *H*) \ *MCS*<sub>*N*</sub>) \ *MCS*<sub>*U*</sub>) (line 8)
=⇒ SAT((*U* ∪ *N* ∪ *H*) \ (*MCS*<sub>*N*</sub> ∪ *MCS*<sub>*U*</sub>))

∀*c* ∈ *MCS*<sub>*N*</sub>, UNSAT((*N* ∪ *H*) \ (*MCS*<sub>*N*</sub> \ {*c*})) (line 6)

∀*c* ∈ *MCS*<sub>*U*</sub>, UNSAT(((*U* ∪ *N* ∪ *H*) \ *MCS*<sub>*N*</sub>) \ (*MCS*<sub>*U*</sub> \ {*c*})) (line 8)
=⇒ ∀*c* ∈ (*MCS*<sub>*N*</sub> ∪ *MCS*<sub>*U*</sub>), UNSAT((*U* ∪ *N* ∪ *H*) \ ((*MCS*<sub>*N*</sub> ∪ *MCS*<sub>*U*</sub>) \ {*c*}))

If *N* ∪ *H* is satisfiable, then *MCS*<sub>*N*</sub> is empty and *MCS*<sub>*U*</sub>, containing only configurable constraints, is an MCS of (*U* ∪ *N* ∪ *H*). In this case, BuildPath constructs a complete blocked path consisting entirely of configurable blocking reasons.

If *N* ∪ *H* is unsatisfiable, then *MCS*<sub>*N*</sub> is non-empty and *MCS*<sub>*N*</sub> ∪ *MCS*<sub>*U*</sub> contains at least one non-actionable constraint. In this case, the path *p* may behave unexpectedly and may not be realizable in any VPC configuration after adjustment. If *MCS*<sub>*N*</sub> is non-empty, BuildPath forms the blocked path as above, but returns only the prefix of that blocked path up to and including the first edge or node associated with a non-actionable setting.

Above, we discussed the cases where *N* ∪ *H* is satisfiable or unsatisfiable. There is also a third possibility: the hard constraints *H*, representing the constraints enforcing the user's query or implementation details of our model, may by themselves be unsatisfiable. For example, *H* may be unsatisfiable if the user specifies a source and destination that are in separate, disconnected networks.

If *H* is unsatisfiable, Algorithm 1 fails, and is unable to produce even a partial blocked path. In this case, we fall back on other techniques to provide useful diagnostic information for users. In practice, the typical reason that *H* is unsatisfiable is that the source and destination are in disconnected VPCs (so the reachability constraint is unsatisfiable). We use a static analysis pass to identify this case and handle it separately in our service.

In the case that Algorithm 1 produces a complete (*resp.* partial) blocked path, the underlying MCS algorithm guarantees that the blocked path will have the fewest possible number of blocking reasons from among all complete (*resp.* partial) blocked paths. In general this blocked path is not unique.

In our implementation of Algorithm 1, the graph-based decision heuristic in MonoSAT will prioritize finding shortest-length paths in most cases, but does not guarantee that a shortest-length path is always found.

# **5 Evaluation**

VPC Reachability Analyzer, a commercial offering available from AWS since December 2020, uses the blocked path analysis we have described to derive findings for queries between unreachable endpoints.

To demonstrate the practical impact of this blocked path analysis, we randomly selected 1000 unreachable queries processed by VPC Reachability Analyzer. We executed the blocked path analysis for those queries on an 'm5.24xlarge' EC2 instance using GNU Parallel [15], running Amazon Linux 2, using MonoSAT version 1.6.0.

**Fig. 3.** Number of blocking reasons per blocked path (among the 63% of unreachable queries for which the blocked path analysis produced a complete blocked path). 97% of blocked paths have three or fewer blocking reasons; 60% have just a single blocking reason.

Excluding the time to complete the blocked path analysis, the average time required to initially determine satisfiability of the constraints was 2.1 s (P50: 1.7 s, P99: 7.4 s). The blocked path analysis was as fast or faster than the initial solving time, requiring 0.3 s on average (P50: 0.05 s, P99: 6.6 s).

As described in Sect. 4, in some cases the blocked path analysis can produce only a partial path, or no results at all. Of those 1000 unreachable queries, 63.2% resulted in complete blocked paths, 7.4% resulted in partial blocked paths, and the remainder (29.4%) produced no analysis (in which case VPC Reachability Analyzer applies other techniques so that it can still provide useful diagnostics).<sup>2</sup>

As can be seen in Fig. 3, most blocked paths have just one blocking reason, and 97% have at most three. This demonstrates that our analysis produces actionable, concise findings on real production data, a key requirement of a useful diagnosis service.

# **6 Conclusion**

The blocked path analysis we have introduced provides key advantages over previous network diagnostic techniques. By showing users a blocked path from a source to a destination, we give users the opportunity to refine their query until the path produced by our analysis matches their intended path. Furthermore, showing all blocking reasons on a blocked path allows users to understand the VPC configuration adjustments necessary to realize a path for their query.

Our blocked path analysis is a fully static analysis (requiring no packets to be injected into the network), can be computed efficiently using standard techniques from the formal methods literature, and is now used successfully in production by VPC Reachability Analyzer.

# **References**


<sup>2</sup> Of the queries for which no blocked path analysis was performed, 80% were due to users specifying endpoints in disconnected VPCs. We perform a disconnected component analysis to identify this case. Others were due to users specifying resources they lack access to, or that we do not support.


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Lower-Bound Synthesis Using Loop Specialization and Max-SMT**

Elvira Albert1,2 , Samir Genaim1,2 , Enrique Martin-Martin<sup>1</sup> , Alicia Merayo1(B), and Albert Rubio1,2

<sup>1</sup> Fac. Informática, Complutense University of Madrid, Madrid, Spain amerayo@ucm.es

<sup>2</sup> Instituto de Tecnología del Conocimiento, Madrid, Spain

**Abstract.** This paper presents a new framework to synthesize lower bounds on the worst-case cost for non-deterministic integer loops. As in previous approaches, the analysis searches for a *metering function* that under-approximates the number of loop iterations. The key novelty of our framework is the *specialization* of loops, which is achieved by restricting their enabled transitions to a subset of the inputs combined with the narrowing of their transition scopes. Specialization allows us to find metering functions for complex loops that could not be handled before, or to be more precise than previous approaches. Technically, it is performed (1) by using quasi-invariants while searching for the metering function, (2) by strengthening the loop guards, and (3) by narrowing the space of non-deterministic choices. We also propose a Max-SMT encoding that takes advantage of the use of soft constraints to force the solver to look for more accurate solutions. We show our accuracy gains on benchmarks extracted from the 2020 Termination and Complexity Competition by comparing our results to those obtained by the LoAT system.

# **1 Introduction**

One of the most important problems in program analysis is to automatically –and accurately– bound the cost of a program's executions. The first automated analysis was developed in the 70s [24] for a strict functional language and, since then, a plethora of techniques has been introduced to handle the peculiarities of different programming languages (see, e.g., for Integer programs [5], for Java-like languages [2,19], for concurrent and distributed languages [16], for probabilistic programs [15,18], etc.) and to increase their accuracy (see, e.g., [10,14,21,22]). The vast majority of these techniques have focused on inferring *upper bounds* on the worst-case cost, since having the assurance that no execution of the program will exceed the inferred amount of resources (e.g., time, memory, etc.) has crucial applications in safety-critical contexts. On the other hand, *lower bounds*

This work was funded partially by the Spanish MCIU, AEI and FEDER (EU) project RTI2018-094403-B-C31, by the CM project S2018/TCS-4314 co-funded by EIE Funds of the European Union, and by the UCM CT42/18-CT43/18 grant.

c The Author(s) 2021

A. Silva and K. R. M. Leino (Eds.): CAV 2021, LNCS 12760, pp. 863–886, 2021. https://doi.org/10.1007/978-3-030-81688-9\_40

on the best-case cost characterize the minimal cost of any program execution and are useful in task parallelization (see, e.g., [3,9,10]). There is a third type of important bounds, which are the focus of this work: *lower bounds on the worst-case cost*, which bound the worst-case cost from below. Their main application is that, together with upper bounds on the worst-case cost, they allow us to infer tighter worst-case cost bounds (when the two coincide, the inferred cost is guaranteed to be exact), which can be crucial in safety-critical contexts. Besides, lower bounds on the worst-case cost give us families of inputs that lead to an expensive cost, which could be used to detect performance bugs. In what follows, we use the acronyms LB<sup>w</sup> and LB<sup>b</sup> to refer to *w*orst-case and *b*est-case lower bounds, resp.

*State-of-the-Art in* LB<sup>w</sup>. An important difference between LB<sup>w</sup> and LB<sup>b</sup> is that, while the best-case must consider *all* program runs, an LB<sup>w</sup> holds for (usually infinite) families of the most expensive program executions. This is why the techniques applicable to LB<sup>b</sup> inference (e.g., [3,9,10]) are not useful for LB<sup>w</sup> in general, since they would provide too inaccurate (low) results. The state-of-the-art in LB<sup>w</sup> inference is [12,13] (implemented in the LoAT system), which introduces a variation of ranking functions, called *metering functions*, to underestimate the number of iterations of *simple* loops, i.e., loops with neither branching nor nested loops. The core of this method is a simplification technique that allows treating general loops (with branchings and nested loops) by using so-called *acceleration*, which replaces a transition representing one loop iteration by another rule that collects the effect of applying several consecutive loop iterations using the original rule. Asymptotic lower bounds are then deduced from the resulting simplified programs using a special-purpose calculus and an SMT encoding.
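For a single self-loop, acceleration has a simple closed form. The sketch below is our own illustration (not LoAT's implementation): it accelerates the update x′ = x − 1 under guard x > 0 and checks the closed form against step-by-step execution:

```python
def run_loop(x):
    # Execute the simple loop (guard x > 0, update x := x - 1) step by step.
    count = 0
    while x > 0:
        x -= 1
        count += 1
    return count, x

def accelerated(x):
    # Closed form produced by acceleration: the loop runs max(0, x)
    # iterations and leaves x at min(x, 0), collected in one rule.
    k = max(0, x)
    return k, x - k

# The accelerated rule agrees with iterated execution on a range of inputs.
for x0 in range(-3, 10):
    assert run_loop(x0) == accelerated(x0)
print("acceleration matches step-by-step execution")
```

The precision problem discussed next arises because each simple loop is accelerated in isolation, not because any individual closed form is wrong.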

*Motivation.* Our work is motivated by the limitation of state-of-the-art methods when, by treating each simple loop separately, an LB<sup>w</sup> bound cannot be found or is too imprecise. For example, consider the interleaved loop in Fig. 1, a simplification of the benchmark SimpleMultiple.koat from the Termination and Complexity Competition. Its *transition system* appears to the right (the transition system is like a control-flow graph (CFG) in which the transitions τ are labeled with the applicability conditions and with the updates for the variables; primed variables denote the updated values). In every iteration x or y can decrease by one, and these behaviors can interleave. The worst case is obtained for instance when x is decreased to 0 (x iterations) and then y is decreased to 0 (y iterations), resulting in x + y iterations, or when y is first decreased to 1 and then x to −1, etc. The approach in [12,13] accelerates τ<sub>1</sub> and τ<sub>4</sub> independently, resulting in accelerated versions τ<sup>a</sup><sub>1</sub> = x ≥ −1 ∧ y > 0 ∧ x′ = −1 ∧ y′ = y with cost x + 1, and τ<sup>a</sup><sub>4</sub> = x ≥ 0 ∧ y ≥ 0 ∧ x′ = x ∧ y′ = 0 with cost y. Applying one accelerated version leaves the variables in final values under which the other accelerated version cannot be applied. Thus, the overall knowledge extracted from the loop is that it can iterate x + 1 or y times, whereas the precise LB<sup>w</sup> is x + y iterations.
Our challenge for inferring more precise LB<sup>w</sup> is to devise a method that can handle all loop transitions simultaneously, as disconnecting them leads to a semantics loss that cannot be recovered by acceleration.

while (x >= 0 && y > 0) { if (∗) { x = x − 1; } else { y = y − 1; } }

Transitions (over locations ℓ<sub>0</sub>, ℓ<sub>1</sub>, ℓ<sub>e</sub>): τ<sub>0</sub> : x′ = x ∧ y′ = y; τ<sub>1</sub> : x ≥ 0 ∧ y > 0 ∧ x′ = x − 1 ∧ y′ = y; τ<sub>4</sub> : x ≥ 0 ∧ y > 0 ∧ x′ = x ∧ y′ = y − 1; τ<sub>2</sub> : x < 0 ∧ x′ = x ∧ y′ = y; τ<sub>3</sub> : y ≤ 0 ∧ x′ = x ∧ y′ = y

**Fig. 1.** Interleaved loop (left) and its representation as a transition system (right)
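The x + y worst case of the loop in Fig. 1 can be checked by brute force on small inputs. The sketch below is our own illustration: it explores both non-deterministic branches exhaustively and returns the maximal number of iterations:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def worst_iterations(x, y):
    # Worst-case iteration count of the interleaved loop: while the
    # guard x >= 0 and y > 0 holds, either branch may fire, so take
    # the maximum over decrementing x (tau_1) or y (tau_4).
    if not (x >= 0 and y > 0):
        return 0
    return 1 + max(worst_iterations(x - 1, y), worst_iterations(x, y - 1))

print(worst_iterations(3, 4))  # 7 == x + y, the precise worst case
```

For non-negative x and positive y this evaluates to exactly x + y, confirming that the bound max(x + 1, y) obtained by accelerating the two transitions separately is strictly weaker.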

*Non-Termination and* LB<sup>w</sup>. Our work is inspired by [17], which introduces the powerful concept of a *quasi-invariant* to find witnesses for non-termination. A quasi-invariant is an invariant which does not necessarily hold on initialization, and can be found as in template-based verification [23]. Intuitively, when a loop in the program can be mapped to a quasi-invariant that forbids executing any of the outgoing transitions of the loop, the program is non-terminating. This paper lifts this powerful use of quasi-invariants and Max-SMT in non-termination analysis to the more difficult problem of LB<sup>w</sup> inference. Non-termination and LB<sup>w</sup> are indeed related properties: in both cases we need to find witnesses, resp., for non-termination of the loop and for executing at least a certain number of iterations. For LB<sup>w</sup>, we additionally need to provide such an under-estimation of the number of iterations and search for LB<sup>w</sup> behaviors that occur for a class of inputs rather than for a single input instantiation (since the LB<sup>w</sup> for a single input is a concrete (i.e., constant) cost, rather than the parametric LB<sup>w</sup> function we are searching for). Instead, for non-termination, it is enough to find a non-terminating input instantiation.

*Our Approach.* A fundamental idea of our approach is to *specialize* loops in order to guide the search for the metering functions of complex loops, avoiding the inaccuracy introduced by disconnecting them into simple loops. To this purpose, we propose specializing loops by combining the addition of constraints to their transitions with the restriction of the valid states by means of quasi-invariants. For instance, for the loop in Fig. 1, our approach automatically narrows τ<sub>1</sub> by adding x > 0 (so that x is decreased until x = 0) and τ<sub>4</sub> by adding x ≤ 0 (so that τ<sub>4</sub> can only be applied when x = 0). This specialized loop has lost many of the possible interleavings of the original loop, but keeps the worst-case execution of x + y iterations. These specialized guards do not guarantee that the loop executes x + y iterations from every possible state, as the loop will finish immediately for x < 0 or y ≤ 0; thus our approach also infers the quasi-invariant x ≥ 0 ∧ x ≤ y. Combining the specialized guards and the quasi-invariant, we can ensure that when the loop is reached in a valid state according to the quasi-invariant, x + y is a lower bound on the number of iterations of the loop, i.e., on its cost. Using quasi-invariants that include all (invariant) inequalities syntactically appearing in loop transitions might work for loops with a single path. For the general case, however, the specialized guards usually lead to essential quasi-invariants that do not appear in the original loop. The specialization achieved by adding constraints could also be applied in the context of non-termination to increase the accuracy of [17], where only quasi-invariants were used.
Therefore, we argue that our work avoids the precision loss caused by the simplification in [12,13] and, besides, introduces a loop specialization technique that can also be applied to gain precision in non-termination analysis [17].

*Contributions.* Briefly, our main theoretical and practical contributions are:


# **2 Background**

This section introduces some notation on the program representation and recalls the notion of LB<sup>w</sup> that we aim to infer.

# **2.1 Program Representation**

Our technique is applicable to sequential non-deterministic programs with integer variables and commands whose updates can be expressed in linear (integer) arithmetic. We assume that the non-determinism originates from non-deterministic assignments of the form "x:=nondet();", where x is a program variable and nondet() can be represented by a fresh non-deterministic variable u. This assumption allows us to also cover non-deterministic branching, e.g., "if (\*){..} else {..}", as it can be expressed by introducing a non-deterministic variable u and rewriting the code as "u:=nondet(); if (u≥0){..} else {..}".

Our programs are represented using *transition systems*, in particular using the formalization of [17], which simplifies the presentation of some formal aspects of our work. A transition system (abbrev. TS) is a tuple S = ⟨x̄, ū, L, T, Θ⟩, where x̄ is a tuple of *n* integer program variables, ū is a tuple of integer (non-deterministic) variables, L is a set of locations, T is a set of transitions, and Θ is a formula that defines the valid input and is specified by a conjunction of linear constraints of the form ā·x̄ + b ⋈ 0 where ⋈ ∈ {>, <, =, ≥, ≤}. A transition is of the form (ℓ, ℓ′, R) ∈ T such that ℓ, ℓ′ ∈ L, and R is a formula over x̄, ū and x̄′ that is specified by a conjunction of linear constraints of the form ā·x̄ + b̄·ū + c̄·x̄′ + d ⋈ 0 where ⋈ ∈ {>, <, =, ≥, ≤}, and primed variables x̄′ represent the values of the corresponding unprimed variables after the transition. We sometimes write R as R(x̄, ū, x̄′), use R(x̄) to refer to the constraints that involve only variables x̄ (i.e., the *guard*), and use R(x̄, ū) to refer to the constraints that involve only variables ū and (possibly) x̄. W.l.o.g., we may assume that constraints involving primed variables are of the form x′<sub>i</sub> = ā·x̄ + b̄·ū + c. This is because non-determinism can be moved to R(x̄, ū): if a primed variable x′<sub>i</sub> appears in any expression that is not of this form, we replace x′<sub>i</sub> by a fresh non-deterministic variable u<sub>i</sub> in such expressions and add the equality x′<sub>i</sub> = u<sub>i</sub>.
We require that for any x̄ satisfying R(x̄), there are ū satisfying R(x̄, ū); formally

$$\forall \bar{x}.\exists \bar{u}.\ \mathcal{R}(\bar{x}) \to \mathcal{R}(\bar{x},\bar{u})\tag{1}$$

This guarantees that for any state x̄ satisfying the guard, there are values for the non-deterministic variables ū such that we can make progress. A transition that does not satisfy this condition is called *invalid*. Note that (1) does not refer to x̄′, since those values are set in a deterministic way once the values of x̄ and ū are fixed. W.l.o.g., we assume that all coefficients and free constants, in all linear constraints, are integer; and that there is a single *initial location* ℓ<sub>0</sub> ∈ L with no incoming transitions, and a single *final location* ℓ<sub>e</sub> with no outgoing transitions.
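On a finite grid of values, condition (1) can be checked by brute force. The sketch below is our own illustration (using the guard and non-deterministic constraint of τ<sub>2</sub> from Example 2 later in this section), treating R(x̄) and R(x̄, ū) as Python predicates:

```python
def satisfies_condition_1(guard, rel, states, choices):
    # Condition (1) on a finite grid: for every state x with the guard
    # R(x) true, some value u must satisfy R(x, u).
    return all(any(rel(x, u) for u in choices) for x in states if guard(x))

# tau_2 of Example 2: guard x > 0, non-deterministic constraint 1 <= u <= 2.
guard = lambda x: x > 0
rel = lambda x, u: 1 <= u <= 2
print(satisfies_condition_1(guard, rel, range(-5, 6), range(-5, 6)))  # True

# An invalid transition: R(x, u) demands u > x while u is bounded by 2,
# so no u exists once x >= 2.
bad_rel = lambda x, u: u > x and u <= 2
print(satisfies_condition_1(guard, bad_rel, range(-5, 6), range(-5, 6)))  # False
```

The real check is a universally quantified entailment over the integers; this finite enumeration only illustrates what validity means for a transition.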

*Example 1.* The TS graphically presented in Fig. 1 is expressed as follows, considering that all inputs are valid (Θ = *true*):

$$\begin{array}{l} \mathcal{S} \equiv \langle \{x, y\}, \emptyset, \{\ell\_0, \ell\_1, \ell\_e\}, \\ \quad \{ (\ell\_0, \ell\_1, x' = x \wedge y' = y), \\ \quad\ \ (\ell\_1, \ell\_1, x \ge 0 \wedge y > 0 \wedge x' = x - 1 \wedge y' = y), \\ \quad\ \ (\ell\_1, \ell\_e, x < 0 \wedge x' = x \wedge y' = y), \\ \quad\ \ (\ell\_1, \ell\_e, y \le 0 \wedge x' = x \wedge y' = y), \\ \quad\ \ (\ell\_1, \ell\_1, x \ge 0 \wedge y > 0 \wedge x' = x \wedge y' = y - 1) \}, \mathit{true} \rangle \\ \end{array}$$

A configuration C is a pair (ℓ, σ) where ℓ ∈ L and σ : x̄ → ℤ is a mapping representing a state. We abuse notation and use σ to refer to ⋀<sup>n</sup><sub>i=1</sub> x<sub>i</sub> = σ(x<sub>i</sub>), and also write σ′ for the assignment obtained from σ by renaming the variables to primed variables. There is a transition from (ℓ, σ<sub>1</sub>) to (ℓ′, σ<sub>2</sub>) iff there is (ℓ, ℓ′, R) ∈ T such that ∃ū. σ<sub>1</sub> ∧ σ′<sub>2</sub> ⊨ R. A (valid) trace t is a (possibly infinite) sequence of configurations (ℓ<sub>0</sub>, σ<sub>0</sub>), (ℓ<sub>1</sub>, σ<sub>1</sub>), ... such that σ<sub>0</sub> ⊨ Θ, and for each i there is a transition from (ℓ<sub>i</sub>, σ<sub>i</sub>) to (ℓ<sub>i+1</sub>, σ<sub>i+1</sub>). Traces that are infinite or end in a configuration with location ℓ<sub>e</sub> are called complete. A configuration (ℓ, σ), where ℓ ≠ ℓ<sub>e</sub>, is *blocking* iff

$$
\sigma \not\models \bigvee\_{(\ell, \ell', \mathcal{R}) \in \mathcal{T}} \mathcal{R}(\bar{x}) \tag{2}
$$

A TS is non-blocking if no trace includes a blocking configuration. We assume that the TS under consideration is non-blocking, and thus any trace is a prefix of a complete one. Throughout the paper, we represent a TS as a CFG, and analyze its strongly connected components (SCC) one by one. An SCC is said to be *trivial* if it has no edge.

# **2.2 Lower-Bounds**

For simplicity, we assume that an execution step (a transition) costs 1. Under this assumption, the cost of a trace t is simply its length *len*(t), where the length of an infinite trace is ∞. In what follows, the set of all configurations is denoted by C, the set of all valid complete traces (using a transition system S) when starting from configuration C ∈ C is denoted by *Traces*<sub>S</sub>(C), and ℝ<sub>≥0</sub> = {k ∈ ℝ | k ≥ 0} ∪ {∞}. For a non-empty set M ⊆ ℝ<sub>≥0</sub>, *sup* M is the least upper bound of M and *inf* M is the greatest lower bound of M. The worst-case cost of an initial configuration C is the cost of the most expensive complete trace starting from C, and the best-case cost is the cost of the least expensive one.

**Definition 1 (worst- and best-case cost).** *Let* S *be a TS. Its worst-case cost function* wc<sub>S</sub> : C → ℝ<sub>≥0</sub> *is* wc<sub>S</sub>(C) = *sup* {*len*(t) | t ∈ *Traces*<sub>S</sub>(C)} *and its best-case cost function* bc<sub>S</sub> : C → ℝ<sub>≥0</sub> *is* bc<sub>S</sub>(C) = *inf* {*len*(t) | t ∈ *Traces*<sub>S</sub>(C)}*.*

Clearly, wc<sub>S</sub> and bc<sub>S</sub> are not computable. Our goal in this paper is to automatically find a lower-bound function ρ : ℤ<sup>n</sup> → ℝ<sub>≥0</sub> such that for any initial configuration C = (ℓ<sub>0</sub>, σ) we have wc<sub>S</sub>(C) ≥ ρ(σ(x̄)), i.e., it is an LB<sup>w</sup>. An LB<sup>b</sup> would be a function ρ : ℤ<sup>n</sup> → ℝ<sub>≥0</sub> that ensures that bc<sub>S</sub>(C) ≥ ρ(σ(x̄)) for any initial configuration C = (ℓ<sub>0</sub>, σ). In what follows, for a function ρ(x̄), we implicitly replace ρ(x̄) by max(0, ρ(x̄)) to map all negative valuations of ρ to zero.

*Example 2.* Consider the TS $\mathcal{S} = \langle \{x\}, \{u\}, \{\ell_0, \ell_1, \ell_e\}, \mathcal{T}, \mathit{true} \rangle$ with transitions:

$$\begin{array}{l} \mathcal{T} \equiv \{\ \tau_1 = (\ell_0, \ell_1, x \ge 0), \\ \qquad\quad \tau_2 = (\ell_1, \ell_1, x > 0 \land x' = x - u \land u \ge 1 \land u \le 2), \\ \qquad\quad \tau_3 = (\ell_1, \ell_e, x \le 0 \land x' = x)\ \} \end{array}$$

$\mathcal{S}$ contains a loop at $\ell_1$ where the variable $x$ is non-deterministically decreased by 1 or 2. From any initial configuration $C_0 = (\ell_0, \sigma_0)$, the longest possible complete trace decreases $x$ by 1 in every iteration with $\tau_2$, therefore $wc_{\mathcal{S}}(C_0) = \sigma_0(x) + 2$ because of the $\sigma_0(x)$ iterations in $\ell_1$ plus the cost of $\tau_1$ and $\tau_3$. The most precise lower bound for $wc_{\mathcal{S}}$ is $\rho(x) = x + 2$, although $\rho(x) = x$ or $\rho(x) = x - 2$ are also valid lower bounds. The shortest complete trace from $C_0$ decreases $x$ by 2 in every iteration, so $bc_{\mathcal{S}}(C_0) = \lceil \frac{\sigma_0(x)}{2} \rceil + 2$. There are several valid lower bounds for $bc_{\mathcal{S}}(C_0)$, like $\rho(x) = \frac{x}{2} + 2$, $\rho(x) = \frac{x}{2}$, or $\rho(x) = 2$.
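These costs can be cross-checked by exhaustively exploring the non-deterministic choices of the loop; the sketch below assumes integer values with $u \in \{1, 2\}$, as in the example:

```python
from functools import lru_cache

# Exhaustive exploration of the loop of Example 2: while x > 0, tau_2
# decreases x by a non-deterministic u in {1, 2}; tau_1 and tau_3 add
# one step each at entry and exit.

@lru_cache(maxsize=None)
def wc_iters(x):
    # length of the longest possible run of tau_2 from value x
    return 0 if x <= 0 else 1 + max(wc_iters(x - 1), wc_iters(x - 2))

@lru_cache(maxsize=None)
def bc_iters(x):
    # length of the shortest possible run of tau_2 from value x
    return 0 if x <= 0 else 1 + min(bc_iters(x - 1), bc_iters(x - 2))

def wc(x): return wc_iters(x) + 2   # plus tau_1 and tau_3
def bc(x): return bc_iters(x) + 2

assert wc(7) == 7 + 2               # sigma0(x) + 2
assert bc(7) == (7 + 1) // 2 + 2    # ceil(sigma0(x)/2) + 2
```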

# **3 Local Lower-Bound Functions**

*Focus on Local Bounds.* Existing techniques and tools for cost analysis (e.g., [1, 12]) work by inferring *local* (iteration) bounds for those parts of the TS that correspond to loops, and then combining these bounds by propagating them "backwards" to the entry point in order to obtain a *global* bound. For example, suppose that our program consists of the following two loops:

```
assert(x > 0 && z > 0);
while (z > 0) { z--; x = x + z; }
while (x > 0) { x--; }
```

where the second loop makes $x$ iterations (when considering the value of $x$ just before executing the loop), and the first loop makes $z$ iterations, incrementing $x$ by $z$ in each iteration. We are interested in inferring a global function that describes the total number of iterations of both loops, in terms of the input values $x_0$ and $z_0$. While both loops have linear complexity locally, i.e., iteration bounds $z$ and $x$, the second one has quadratic complexity w.r.t. the initial values. This can be inferred automatically from the local bounds $z$ and $x$ by inferring how the value of $x$ changes in the first loop, and then rewriting $x$ in terms of the initial values to $e = x_0 + \frac{z_0 \cdot (z_0 - 1)}{2}$ (e.g., by solving corresponding recurrence relations). Now the global cost would be $e$ plus the cost of the first loop, $z_0$. Rewriting the loop bound $x$ as above is done by propagating it backwards to the entry point, and there are several techniques in the literature for this purpose that can be directly adopted in our setting to produce global bounds. These techniques can infer global bounds for nested loops as well, given the iteration bounds of each loop. Thus, we focus on inferring local lower-bounds on the number of iterations that non-nested loops (more precisely, parts of the TS that correspond to loops) can make, and assume that they can be rewritten into global bounds by adopting the existing techniques of [1,12] (our implementation could indeed be used as a black-box that provides local lower-bounds to these tools). Namely, we aim at inferring, for each non-nested loop, a function $\overline{\rho}(\bar{x}) = \max(0, \rho(\bar{x}))$ that is a (local) LB$^{w}$ on its number of iterations, i.e., whenever the loop is reached with values $\bar{v}$ for the variables $\bar{x}$, it is possible to make at least $\overline{\rho}(\bar{v})$ iterations.
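The composition of the two local bounds can be checked by direct execution; this sketch assumes the first loop decrements z before adding it to x, which matches the closed form e = x0 + z0·(z0−1)/2 given above:

```python
def total_iterations(x0, z0):
    """Run both loops and count their iterations (the global cost,
    ignoring the constant cost of the remaining transitions)."""
    assert x0 > 0 and z0 > 0
    x, z, iters = x0, z0, 0
    while z > 0:          # first loop: z0 iterations
        z -= 1
        x += z            # x grows by z0-1, z0-2, ..., 0
        iters += 1
    while x > 0:          # second loop: x iterations, x = e on entry
        x -= 1
        iters += 1
    return iters

def closed_form(x0, z0):
    # z0 (first loop) plus e = x0 + z0*(z0-1)/2 (second loop)
    return z0 + x0 + z0 * (z0 - 1) // 2

assert total_iterations(3, 4) == closed_form(3, 4) == 13
```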

*Loops and TSs.* For ease of presentation, we first consider a special case of TSs in which all locations, except the initial and exit ones, define loops; Sect. 3.6 explains how the techniques can be used in the general case. In particular, we consider that each non-trivial SCC consists of a single location $\ell$ and at least one transition, and we call it *loop* $\ell$. Transitions from $\ell$ to $\ell$ are called *loop transitions* and their guards are called *loop guards*, and transitions from $\ell$ to $\ell' \neq \ell$ are called *exit transitions*. The number of iterations of a loop $\ell$ in a trace $t$ is defined as the number of transitions from $\ell$ to $\ell$, which we also refer to as the cost of loop $\ell$ (since we are assuming that the cost of transitions is always 1, see Sect. 2.2). The notions of best-case and worst-case cost in Definition 1 naturally extend to the cost of a loop $\ell$, i.e., we can ask what the best-case and worst-case number of iterations of a given loop is.

*Overview of the Section.* The overall idea of our approach is to *specialize* each loop $\ell$, by restricting the initial values and/or adding constraints to its transitions, such that it becomes possible to obtain a metering function for the specialized loop. A function that is an LB$^{b}$ of the specialized loop is by definition an LB$^{w}$ of loop $\ell$, as it does not necessarily hold for all execution traces but rather for the class of restricted ones. Technically, inferring an LB$^{b}$ of a (specialized) loop is done by inferring a metering function $\rho_\ell$ [13], such that whenever the (specialized) loop is reached with a state $\sigma$, it is guaranteed to make at least $\rho_\ell(\sigma(\bar{x}))$ iterations. Besides, specialization is done in such a way that the TS obtained by putting all specialized loops together is non-blocking, i.e., there is an execution that is either non-terminating or reaches the exit location, and thus the cost of this execution is, roughly, the sum of the costs of all (specialized) loops that are traversed. The rest of this section is organized as follows. In Sect. 3.1 we generalize the basic definition of a metering function for simple loops from [13] to more general kinds of loops and explore its limitations. Then, in the following three sections, we explain how to overcome these limitations by means of the following specializations: using quasi-invariants to narrow the set of input values (Sect. 3.2); narrowing loop guards to make loop transitions mutually exclusive and force some execution order between them (Sect. 3.3); and narrowing the space of non-deterministic choices to force longer executions (Sect. 3.4). Sect. 3.5 states the conditions to be satisfied when specializing loops in order to guarantee that the TS obtained by putting all specialized loops together is non-blocking.

#### **3.1 Metering Functions**

*Metering functions* were introduced in [13] as a tool for inferring a lower-bound on the number of iterations that a given loop can make. The definition is analogous to that of a (linear) ranking function, which is often used to infer upper-bounds on the number of iterations. The definition as given in [13] considers a loop with a single transition, and assumes that the exit condition is the negation of its guard. We start by generalizing it to our notion of loop.

**Definition 2 (Metering function).** *We say that a function* $\rho_\ell$ *is a metering function for a loop* $\ell \in L$*, if the following conditions are satisfied*

$$\begin{aligned} \forall \bar{x}, \bar{u}, \bar{x}'.\ \mathcal{R} \to \rho_\ell(\bar{x}) - \rho_\ell(\bar{x}') \le 1 \qquad & \text{for each } (\ell, \ell, \mathcal{R}) \in \mathcal{T} \qquad (3) \\ \forall \bar{x}, \bar{u}, \bar{x}'.\ \mathcal{R} \to \rho_\ell(\bar{x}) \le 0 \qquad & \text{for each } (\ell, \ell', \mathcal{R}) \in \mathcal{T} \qquad (4) \end{aligned}$$

*Intuitively, Condition* (3) *requires* $\rho_\ell$ *to decrease at most by* 1 *in each iteration, and Condition* (4) *requires* $\rho_\ell$ *to be non-positive when leaving the loop.*

Assuming $(\ell, \sigma)$ is a reachable configuration in $\mathcal{S}$, it is easy to see that loop $\ell$ will make at least $\rho_\ell(\sigma(\bar{x}))$ iterations when starting from $(\ell, \sigma)$. We require $(\ell, \sigma)$ to be reachable in $\mathcal{S}$ since we are interested only in non-blocking executions. Typically, we are interested in linear metering functions, i.e., of the form $\rho_\ell(\bar{x}) = \bar{a} \cdot \bar{x} + a_0$, since they are easier to infer and cover most loops in practice. Non-linear lower-bound functions will be obtained when rewriting these local linear lower-bounds in terms of the initial input at location $\ell_0$ (see the beginning of Sect. 3) and by composing nested loops (see Sect. 3.6).

*Example 3 (Metering function).* Consider the following loop at location $\ell_1$ that decreases $x$ ($\tau_1$) until it takes a negative value and exits to $\ell_2$ ($\tau_2$):

$$
\tau\_1 = (\ell\_1, \ell\_1, x \ge 0 \land x' = x - 1) \qquad \tau\_2 = (\ell\_1, \ell\_2, x < 0 \land x' = x)
$$

The function $\rho_{\ell_1}(x) = x + 1$ is a valid metering function because it decreases by exactly 1 in $\tau_1$ and becomes non-positive when $\tau_2$ is applicable ($x < 0 \to x + 1 \le 0$, Condition (4) of Definition 2). The function $\rho'_{\ell_1}(x) = \frac{x}{2}$ is also metering because its value decreases by less than 1 when applying $\tau_1$ ($\frac{x}{2} - \frac{x-1}{2} = \frac{1}{2} \le 1$) and becomes non-positive in $\tau_2$. Even a function such as $\rho''_{\ell_1}(x) = 0$ is trivially metering, as it satisfies (3) and (4). Although all of them are valid metering functions, $\rho_{\ell_1}(x)$ is preferable as it is more accurate (i.e., larger) and thus captures more precisely the number of iterations of the loop. Note that functions like $\rho^{*}_{\ell_1}(x) = 2x$ or $\rho^{**}_{\ell_1}(x) = x + 5$ are not metering because they do not satisfy (3) (since $2x - 2(x - 1) = 2 \not\le 1$ for $\rho^{*}_{\ell_1}$) or (4) (since $x < 0 \not\to x + 5 \le 0$ for $\rho^{**}_{\ell_1}$).
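Conditions (3) and (4) for this loop can be sanity-checked by brute force over a finite range of integer values (a finite check, of course, not a proof):

```python
def is_metering(rho, lo=-50, hi=50):
    """Check conditions (3) and (4) of Definition 2 for the loop of
    Example 3 on all integer values in [lo, hi]."""
    for x in range(lo, hi + 1):
        # tau_1 = (l1, l1, x >= 0 and x' = x - 1): condition (3)
        if x >= 0 and rho(x) - rho(x - 1) > 1:
            return False
        # tau_2 = (l1, l2, x < 0): condition (4)
        if x < 0 and rho(x) > 0:
            return False
    return True

assert is_metering(lambda x: x + 1)        # rho(x) = x + 1
assert is_metering(lambda x: x / 2)        # rho'(x) = x/2
assert is_metering(lambda x: 0)            # trivial rho''(x) = 0
assert not is_metering(lambda x: 2 * x)    # violates (3)
assert not is_metering(lambda x: x + 5)    # violates (4)
```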

#### **3.2 Narrowing the Set of Input Values Using Quasi-Invariants**

Metering functions typically exist for loops with simple loop guards. However, when guards involve more than one inequality they usually do not exist in a simple (linear) form. This is because such loops often include several exit transitions with unrelated conditions, where each one corresponds to the negation of an inequality of the guard. It is unlikely then that a non-trivial (linear) function satisfies (4) for all exit transitions. This is illustrated in the next example.

*Example 4.* Consider the following loop that iterates at $\ell_1$ while $x \ge 0 \land y > 0$, and exits when $x < 0$ or $y \le 0$:

$$\begin{array}{l} \tau\_1 = (\ell\_1, \ell\_1, x \ge 0 \land y > 0 \land x' = x - 1 \land y' = y) \\ \tau\_2 = (\ell\_1, \ell\_2, x < 0 \land x' = x \land y' = y) \\ \tau\_3 = (\ell\_1, \ell\_2, y \le 0 \land x' = x \land y' = y) \end{array}$$

Intuitively, this loop executes $x + 1$ loop transitions, but $\rho_{\ell_1}(x, y) = x + 1$ is not a valid metering function because it does not satisfy (4) for $\tau_3$: $y \le 0 \not\to x + 1 \le 0$. Moreover, no other function depending on $x$ (e.g., $\frac{x}{2}$, $x - 2$, etc.) will be a valid metering function, as it is impossible to prove (4) for $\tau_3$ only from the information $y \le 0$ in its guard. The only valid metering functions for this loop are the trivial ones $\rho_{\ell_1}(x, y) = c$ with $c \le 0$, which do not provide any information about the number of iterations of the loop.

Our proposal to overcome the imprecision discussed above is to consider only a subset of the input values such that conditions (3) and (4) hold in the context of the corresponding reachable states. For example, the reachable states might exclude some of the exit transitions, i.e., it is guaranteed that they are never used, and then (4) is not required to hold for them. A metering function in this context is an LB$^{b}$ of the loop when starting from that specific input, and thus it is an LB$^{w}$ (i.e., not necessarily best-case) of the loop when the input values are not restricted.

Technically, our analysis materializes the above idea by relying on quasi-invariants [17]. A quasi-invariant for a loop $\ell$ is a formula $\mathcal{Q}_\ell$ over $\bar{x}$ such that

$$\begin{aligned} \forall \bar{x}, \bar{u}, \bar{x}'. \; \mathcal{Q}\_{\ell}(\bar{x}) \land \mathcal{R} \to \mathcal{Q}\_{\ell}(\bar{x}') \qquad \qquad \text{for each } (\ell, \ell, \mathcal{R}) \in \mathcal{T} \qquad (5) \\\exists \bar{x}. \; \mathcal{Q}\_{\ell}(\bar{x}) \end{aligned} \tag{6}$$

Intuitively, $\mathcal{Q}_\ell$ is similar to an inductive invariant but without requiring it to hold on the initial states, i.e., once $\mathcal{Q}_\ell$ holds it will hold during all subsequent visits to $\ell$. This also means that, for executions that start in states within $\mathcal{Q}_\ell$, it is guaranteed that $\mathcal{Q}_\ell$ is an over-approximation of the reachable states. Condition (6) is used to rule out quasi-invariants that are *false*. Given a quasi-invariant $\mathcal{Q}_\ell$ for $\ell$, we say that $\rho_\ell$ is a metering function for $\ell$ if the following holds

$$\begin{aligned} \forall \bar{x}, \bar{u}, \bar{x}'. \; \mathcal{Q}\_{\ell}(\bar{x}) \land \mathcal{R} \to \rho\_{\ell}(\bar{x}) - \rho\_{\ell}(\bar{x}') \le 1 \qquad \text{for each } (\ell, \ell, \mathcal{R}) \in \mathcal{T} \qquad (7) \\\forall \bar{x}, \bar{u}, \bar{x}'. \; \mathcal{Q}\_{\ell}(\bar{x}) \land \mathcal{R} \to \rho\_{\ell}(\bar{x}) \le 0 \qquad \qquad \text{for each } (\ell, \ell', \mathcal{R}) \in \mathcal{T} \qquad (8) \end{aligned}$$

Intuitively, these conditions state that (3) and (4) hold in the context of the states induced by $\mathcal{Q}_\ell$. Assuming that $(\ell, \sigma)$ is reachable in $\mathcal{S}$ and that $\sigma \models \mathcal{Q}_\ell$, loop $\ell$ will make at least $\rho_\ell(\sigma(\bar{x}))$ iterations in any execution that starts in $(\ell, \sigma)$.

*Example 5.* Recall that the loop in Example 4 only admitted trivial metering functions because of the exit transition $\tau_3$. It is easy to see that $\mathcal{Q}_{\ell_1} \equiv x < y$ satisfies (5) and (6), because $y$ is not modified in $\tau_1$ while $x$ decreases, and thus it is a quasi-invariant. In the context of $\mathcal{Q}_{\ell_1}$, the function $\rho_{\ell_1}(x, y) = x + 1$ is metering because when taking $\tau_3$ the value of $x$ is guaranteed to be negative, i.e., $\tau_3$ satisfies (8) because $x < y \land y \le 0 \to x + 1 \le 0$. Notice that $\rho_{\ell_1}(x, y) = x + 1$ would still be a valid metering function considering other quasi-invariants of the form $\mathcal{Q}'_{\ell_1} \equiv y > c$ with $c \ge 0$, as they would completely disable transition $\tau_3$.
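The claims of Example 5 can likewise be checked on a finite grid of integer states: $\mathcal{Q} \equiv x < y$ is preserved by $\tau_1$ (condition (5)), and $\rho(x, y) = x + 1$ satisfies (7) and (8) in the context of $\mathcal{Q}$. A small sketch:

```python
def check_example5(lo=-20, hi=20):
    """Finite-range check of conditions (5), (7), and (8) for the loop
    of Example 4 with Q = x < y and rho = x + 1."""
    Q   = lambda x, y: x < y            # candidate quasi-invariant
    rho = lambda x, y: x + 1            # candidate metering function
    for x in range(lo, hi + 1):
        for y in range(lo, hi + 1):
            if not Q(x, y):
                continue
            if x >= 0 and y > 0:        # tau_1: x' = x - 1, y' = y
                assert Q(x - 1, y)                          # (5)
                assert rho(x, y) - rho(x - 1, y) <= 1       # (7)
            if x < 0:                   # exit tau_2
                assert rho(x, y) <= 0                       # (8)
            if y <= 0:                  # exit tau_3: x < y <= 0
                assert rho(x, y) <= 0                       # (8)
    return True

assert check_example5()
```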

#### **3.3 Narrowing Guards**

The loops that we have considered so far consist of a single loop transition, which makes it easier to find a metering function. This is because there is only one way to modify the program variables (with some degree of non-determinism induced by the non-deterministic variables). However, when we allow several loop transitions, we can have loops for which a non-trivial metering function does not exist even when narrowing the set of input values.

*Example 6.* Consider the extension of the loop in Example 4 with a new transition τ<sup>4</sup> that decrements y (it corresponds to the example in Sect. 1):

$$\begin{array}{l} \tau\_{1} = (\ell\_{1}, \ell\_{1}, x \ge 0 \land y > 0 \land x' = x - 1 \land y' = y) \\ \tau\_{4} = (\ell\_{1}, \ell\_{1}, x \ge 0 \land y > 0 \land x' = x \land y' = y - 1) \\ \tau\_{2} = (\ell\_{1}, \ell\_{2}, x < 0 \land x' = x \land y' = y) \\ \tau\_{3} = (\ell\_{1}, \ell\_{2}, y \le 0 \land x' = x \land y' = y) \end{array}$$

The most precise LB$^{w}$ of this loop is $\overline{\rho}_{\ell_1}(x, y)$ where $\rho_{\ell_1}(x, y) = x + y$. As mentioned, this corresponds, e.g., to an execution that uses $\tau_1$ until $x = 0$, i.e., $x$ times, and then $\tau_4$ until $y = 0$, i.e., $y$ times. It is easy to see that if we start from a state that satisfies $x \ge 0 \land x \le y$, then this formula remains satisfied during the particular execution that we just described. Moreover, assuming that $\mathcal{Q}_{\ell_1} \equiv x \ge 0 \land x \le y$ is a quasi-invariant, it is easy to show that together with $\rho_{\ell_1}$ we can verify (7) and (8), and thus $\rho_{\ell_1}$ would be a metering function. Unfortunately, however, $\mathcal{Q}_{\ell_1}$ is not a quasi-invariant, since the above loop can make executions other than the one described above (e.g., decreasing $y$ to 1 first and then $x$ to 0).
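A concrete run confirms that $\mathcal{Q} \equiv x \ge 0 \land x \le y$ is not preserved by $\tau_4$:

```python
# tau_4 keeps x and decrements y; it is enabled when x >= 0 and y > 0.
Q = lambda x, y: x >= 0 and x <= y

x, y = 2, 3
assert Q(x, y)
x, y = x, y - 1     # tau_4: (2, 2), still inside Q
assert Q(x, y)
x, y = x, y - 1     # tau_4 again: (2, 1), Q is violated since x > y
assert not Q(x, y)
```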

Our idea to overcome this imprecision is to narrow the set of states for which loop transitions are enabled, i.e., to strengthen loop guards with additional inequalities. This, in principle, reduces the number of possible executions, and thus it is more likely that we find a metering function (or a better quasi-invariant), because now they have to be valid for fewer executions. For example, this might force an execution order between the different paths, or even disable some transitions by narrowing their guard to *false*. Again, a metering function for the specialized loop is not a valid LB$^{b}$ of the original loop, but rather a valid LB$^{w}$, which is what we are interested in. Next, we state the requirements that such a narrowing should satisfy. The choice of a narrowing that leads to longer executions is discussed in Sect. 4.

A *guard narrowing* for a loop transition $\tau \in \mathcal{T}$ is a formula $\mathcal{G}_\tau(\bar{x})$ over the variables $\bar{x}$. A specialization of a loop is obtained simply by adding these formulas to the corresponding transitions. Conditions (5)-(8) can be specialized to hold only for executions that use the specialized loop as follows. Suppose that for a loop $\ell \in L$ we are given a narrowing $\mathcal{G}_\tau$ for each loop transition $\tau$; then $\mathcal{Q}_\ell$ and $\rho_\ell$ are a quasi-invariant and a metering function, resp., for the corresponding specialized loop if the following conditions hold

$$\forall \vec{x}, \vec{u}, \vec{x}'. \; \mathcal{Q}\_{\ell}(\vec{x}) \land \mathcal{G}\_{\tau}(\vec{x}) \land \mathcal{R} \to \mathcal{Q}\_{\ell}(\vec{x}') \qquad \text{for each } (\ell, \ell, \mathcal{R}) \in \mathcal{T} \tag{9}$$

$$\exists \vec{x}. \; \mathcal{Q}\_{\ell}(\vec{x}) \tag{10}$$

$$\forall \vec{x}, \vec{u}, \vec{x}'. \; \mathcal{Q}\_{\ell}(\vec{x}) \land \mathcal{G}\_{\tau}(\vec{x}) \land \mathcal{R} \to \rho\_{\ell}(\vec{x}) - \rho\_{\ell}(\vec{x}') \le 1 \qquad \text{for each } (\ell, \ell, \mathcal{R}) \in \mathcal{T} \tag{11}$$

$$\forall \vec{x}. \; \mathcal{Q}\_{\ell}(\vec{x}) \land \mathcal{R}(\vec{x}) \to \rho\_{\ell}(\vec{x}) \le 0 \qquad\qquad\qquad\text{for each } (\ell, \ell', \mathcal{R}) \in \mathcal{T} \tag{12}$$

Conditions (9) and (10) guarantee that $\mathcal{Q}_\ell$ is a non-empty quasi-invariant for the specialized loop, and conditions (11) and (12) guarantee that $\rho_\ell$ is a metering function for the specialized loop in the context of $\mathcal{Q}_\ell$. However, in this case, function $\rho_\ell$ induces a lower-bound on the number of iterations only if the specialized loop is non-blocking for states in $\mathcal{Q}_\ell$. This is illustrated in the following example.

*Example 7.* Consider the loop from Example 3 where we have specialized the guard of $\tau_1$ by adding $x \ge 5$:

$$\tau\_1 = (\ell\_1, \ell\_1, x \ge 0 \land x \ge 5 \land x' = x - 1) \qquad \tau\_2 = (\ell\_1, \ell\_2, x < 0 \land x' = x)$$

With this specialized guard and considering $\mathcal{Q}_{\ell_1} \equiv \mathit{true}$, the metering function $\rho_{\ell_1}(x) = x + 1$ still satisfies (11) and (12), and $\mathcal{Q}_{\ell_1}$ trivially satisfies (9) and (10). However, $\rho_{\ell_1}$ is not a valid measure of the number of transitions executed, because the loop gets blocked whenever $x$ takes a value $0 \le x < 5$, and thus it will never execute $x + 1$ transitions.
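The blocked region can be computed by enumerating the states where neither the narrowed $\tau_1$ ($x \ge 0 \land x \ge 5$) nor the exit $\tau_2$ ($x < 0$) is enabled:

```python
# States of the specialized loop of Example 7 where no transition applies.
blocked = [x for x in range(-3, 10)
           if not (x >= 0 and x >= 5)   # narrowed tau_1 disabled
           and not (x < 0)]             # exit tau_2 disabled
assert blocked == [0, 1, 2, 3, 4]
```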

To guarantee that the specialized loop is non-blocking for states in $\mathcal{Q}_\ell$, it is enough to require the following condition to hold

$$\forall \bar{x}. \; \mathcal{Q}\_{\ell}(\bar{x}) \to \bigvee\_{\tau = (\ell, \ell, \mathcal{R}) \in \mathcal{T}} (\mathcal{R}(\bar{x}) \land \mathcal{G}\_{\tau}(\bar{x})) \;\lor \bigvee\_{\tau = (\ell, \ell', \mathcal{R}) \in \mathcal{T}} \mathcal{R}(\bar{x}) \tag{13}$$

Intuitively, it states that from any state in $\mathcal{Q}_\ell$ we can make progress, either by making a loop iteration or by exiting the loop. Assuming that $(\ell, \sigma)$ is reachable in $\mathcal{S}$ and that $\sigma \models \mathcal{Q}_\ell$, the specialized loop will make at least $\rho_\ell(\sigma(\bar{x}))$ iterations in any execution that starts in $(\ell, \sigma)$. This also means that the original loop *can* make at least $\rho_\ell(\sigma(\bar{x}))$ iterations in some execution that starts in $(\ell, \sigma)$.

*Example 8.* In Example 6, we saw that if $\mathcal{Q}_{\ell_1} \equiv x \le y \land x \ge 0$ were a quasi-invariant, then the function $\rho_{\ell_1}(x, y) = x + y$ would be metering. We can make $\mathcal{Q}_{\ell_1}$ a quasi-invariant by specializing the guards of the loop transitions $\tau_1$ and $\tau_4$ to force the following execution with $x + y$ iterations: first use $\tau_1$ until $x = 0$ ($x$ iterations) and then use $\tau_4$ until $y = 0$ ($y$ iterations). This behavior can be forced by taking $\mathcal{G}_{\tau_1} \equiv x > 0$ and $\mathcal{G}_{\tau_4} \equiv x \le 0$. With $\mathcal{G}_{\tau_1}$ we ensure that $x$ stops decreasing when $x = 0$, and with $\mathcal{G}_{\tau_4}$ we ensure that $\tau_4$ is used only when $x = 0$. Now, $\mathcal{Q}_{\ell_1} \equiv x \le y \land x \ge 0$ and $\rho_{\ell_1}(x, y) = x + y$ are a valid quasi-invariant and metering function, resp. Function $\rho_{\ell_1}$ decreases by exactly 1 in $\tau_1$ and $\tau_4$, is trivially non-positive in $\tau_2$ because that transition is in fact disabled ($x \ge 0$ from $\mathcal{Q}_{\ell_1}$ and $x < 0$ from the guard), and is non-positive in $\tau_3$ ($x \le y \land y \le 0 \to x + y \le 0$). Regarding $\mathcal{Q}_{\ell_1}$, it satisfies (9) and (10), and more importantly, the loop at $\ell_1$ is non-blocking w.r.t. $\mathcal{Q}_{\ell_1}$, $\mathcal{G}_{\tau_1}$, and $\mathcal{G}_{\tau_4}$, i.e., Condition (13) holds.
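Simulating the specialized loop confirms that every run from a state in $\mathcal{Q}$ makes exactly $x + y$ iterations; a small sketch:

```python
def specialized_run(x, y):
    """Execute the loop of Example 8 with the narrowed guards
    G_tau1 = x > 0 and G_tau4 = x <= 0, counting loop iterations."""
    assert 0 <= x <= y          # start inside the quasi-invariant Q
    iters = 0
    while True:
        if x >= 0 and y > 0 and x > 0:      # tau_1 with G_tau1
            x -= 1
        elif x >= 0 and y > 0 and x <= 0:   # tau_4 with G_tau4
            y -= 1
        else:                               # exit via tau_3 (y <= 0)
            break
        iters += 1
    return iters

assert specialized_run(3, 5) == 3 + 5   # rho(x, y) = x + y iterations
assert specialized_run(0, 4) == 4
```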

#### **3.4 Narrowing Non-deterministic Choices**

Loop transitions that involve non-deterministic variables might give rise to executions of different lengths when starting from the same input values. Since we are interested in LB$^{w}$, we are clearly searching for longer executions. However, since our approach is based on inferring LB$^{b}$, we have to take all executions into account, which might result in less precise, or even trivial, LB$^{w}$.

*Example 9.* Consider a modification of the loop in Example 6 in which the variable x in τ<sup>1</sup> is decreased by a non-deterministic positive quantity u:

$$\tau\_1 = (\ell\_1, \ell\_1, x \ge 0 \land y > 0 \land x' = x - u \land u \ge 1 \land y' = y)$$

The effect of this non-deterministic variable $u$ is that $\tau_1$ can be applied $x$ times if we always take $u = 1$, $\frac{x}{2}$ times if we always take $u = 2$, or even only once if we take $u > x$. As a consequence, $\rho_{\ell_1}(x, y) = x + y$ is no longer a valid metering function, because $x$ can decrease by more than 1 in $\tau_1$. Moreover, $\mathcal{Q}_{\ell_1} \equiv x \le y \land x \ge 0$ is not a quasi-invariant anymore, since $x' = x - u \land u \ge 1$ does not entail $x' \ge 0$. In fact, no metering function involving $x$ will be valid in $\tau_1$, because $x$ can decrease by any positive amount.
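The dependence of the trace length on the choice of $u$ can be seen by running $\tau_1$ under different strategies (this sketch assumes $y$ stays positive so only the guard $x \ge 0$ matters; note the guard allows one extra firing at $x = 0$):

```python
def count_tau1(x, pick_u):
    """Count how often tau_1 of Example 9 fires when u is chosen by the
    given strategy (y is assumed to remain positive throughout)."""
    iters = 0
    while x >= 0:
        u = pick_u(x)
        assert u >= 1           # guard of tau_1 on u
        x -= u
        iters += 1
    return iters

x0 = 10
assert count_tau1(x0, lambda x: 1) == 11        # u = 1: longest run
assert count_tau1(x0, lambda x: 2) == 6         # u = 2: about half
assert count_tau1(x0, lambda x: x + 1) == 1     # u > x: one iteration
```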

To handle this situation, we propose narrowing the space of non-deterministic choices, so that metering functions need to be valid w.r.t. fewer executions and are thus more likely to be found and to be more precise. Next we state the requirements that such a narrowing should satisfy. The choice of a narrowing that leads to longer executions is discussed in Sect. 4.

A *non-deterministic variables narrowing* for a loop transition $\tau \in \mathcal{T}$ is a formula $\mathcal{U}_\tau(\bar{x}, \bar{u})$, over the variables $\bar{x}$ and $\bar{u}$, that is added to $\tau$ to restrict the choices for the variables $\bar{u}$. A specialized loop is now obtained by adding both $\mathcal{G}_\tau$ and $\mathcal{U}_\tau$ to the corresponding transitions. Suppose that for loop $\ell \in L$, in addition to $\mathcal{G}_\tau$, we are also given $\mathcal{U}_\tau$ for each of its loop transitions $\tau$. For $\mathcal{Q}_\ell$ and $\rho_\ell$ to be a quasi-invariant and a metering function for the specialized loop $\ell$, we require conditions (9)-(13) to hold after adding $\mathcal{U}_\tau$ to the left-hand side of the implications in (9) and (11). Besides, unlike the narrowing of guards, narrowing the non-deterministic choices might make a transition invalid, i.e., not satisfying Condition (1), and thus $\rho_\ell(\bar{x})$ could not be used as a lower-bound on the number of iterations. To guarantee that specialized transitions are valid we require, in addition, the following condition to hold

$$\forall \vec{x} \exists \bar{u}. \; Q\_{\ell}(\bar{x}) \land \mathcal{R}(\bar{x}) \land \mathcal{G}\_{\tau}(\bar{x}) \to \mathcal{R}(\bar{x}, \bar{u}) \land \mathcal{U}\_{\tau}(\bar{x}, \bar{u}) \qquad \text{for each } (\ell, \ell, \mathcal{R}) \in \mathcal{T} \tag{14}$$

This condition is basically (1) taking into account the inequalities introduced by the corresponding narrowings. Assuming that $(\ell, \sigma)$ is reachable in $\mathcal{S}$ and that $\sigma \models \mathcal{Q}_\ell$, the specialized loop will make at least $\rho_\ell(\sigma(\bar{x}))$ iterations in any execution that starts in $(\ell, \sigma)$, which also means, as before, that the original loop *can* make at least $\rho_\ell(\sigma(\bar{x}))$ iterations in some execution that starts in $(\ell, \sigma)$.

*Example 10.* To solve the problems shown in Example 9, we need to narrow the non-deterministic variable $u$ to take bounded values that reflect the worst-case execution of the loop. Concretely, we take $\mathcal{U}_{\tau_1} \equiv u \le 1$, which combined with $u \ge 1$ entails $u = 1$, so $x$ decreases by exactly 1 in $\tau_1$. Considering the narrowing $\mathcal{U}_{\tau_1}$, the resulting loop is equivalent to the one presented in Example 8, so we can obtain the precise metering function $\rho_{\ell_1}(x, y) = x + y$ with the quasi-invariant $\mathcal{Q}_{\ell_1} \equiv x \le y \land x \ge 0$. Note that (14) holds for $\tau_1$ because $u = 1$ makes the consequent true for every value of $x$ and $y$: $\forall \bar{x} \exists \bar{u}.\ (x \le y \land x \ge 0) \land (x \ge 0 \land y > 0) \land x > 0 \to u \ge 1 \land u \le 1$.

#### **3.5 Ensuring the Feasibility of the Specialized Loops**

In order to enable the propagation of the local lower-bounds back to the input location (as discussed at the beginning of Sect. 3), we have to ensure that there is actually an execution that starts in $\ell_0$ and passes through the specialized loop. In other words, we have to guarantee that, when putting all specialized loops together, they still form a non-blocking TS for some set of input values. We achieve this by requiring that the quasi-invariants of the preceding loops ensure that the considered quasi-invariant of this loop also holds on initialization (i.e., it is an invariant for the considered context). Technically, we require, in addition to (9)-(14), the following conditions to hold for each loop $\ell$:

$$\begin{aligned} \forall \bar{x}, \bar{u}, \bar{x}'. \; \mathcal{Q}\_{\ell'}(\bar{x}) \land \mathcal{R} \to \mathcal{Q}\_{\ell}(\bar{x}') \qquad \qquad \text{for each } (\ell', \ell, \mathcal{R}) \in \mathcal{T} \qquad (15) \\\forall \bar{x}. \; \mathcal{Q}\_{\ell\_0}(\bar{x}) \to \Theta \end{aligned} \tag{16}$$

Condition (15) means that transitions entering loop $\ell$, strengthened with the quasi-invariant of the preceding location $\ell'$, must lead to states within the quasi-invariant $\mathcal{Q}_\ell$. Condition (16) guarantees that $\mathcal{Q}_{\ell_0}$ defines valid input values, i.e., within the initial condition $\Theta$.

**Theorem 1 (soundness).** *Given* $\mathcal{Q}_\ell$ *for each non-exit location* $\ell \in L$*, narrowings* $\mathcal{G}_\tau$ *and* $\mathcal{U}_\tau$ *for each loop transition* $\tau \in \mathcal{T}$*, and a function* $\rho_\ell$ *for each loop location* $\ell$*, such that* (9)*-*(16) *are satisfied, it holds:*


The proof of this soundness result is straightforward: it follows as a sequence of facts using the definitions of the conditions (9)-(16) given in this section.

We note that when there is an unbounded overlap between the guards of the loop transitions and the guards of the exit transitions, it is likely that a non-trivial metering function does not exist, because it must be non-positive on the overlapping states. To overcome this limitation, instead of using the exit transitions in (12), we can use conditions that correspond to the negation of the guards of the loop transitions, which ensures that they do not overlap. However, we should then require (13) to hold for the original exit transitions as well, in order to ensure that the non-blocking property holds. Another way to overcome this limitation is to simply strengthen the exit transitions with the negations of the loop guards.

As a final comment, we note that it is not necessary to assume that the TS $\mathcal{S}$ that we start with is non-blocking (even though we have done so in Sect. 2.1 for clarity). This is because our formalization above finds a subset of $\mathcal{S}$ ($\mathcal{S}'$ in Theorem 1) that is non-blocking, which is enough to ensure the feasibility of the local lower-bounds. This is useful not only for enlarging the set of TSs that we accept as input, but also because it allows us to start the analysis from any subset of $\mathcal{S}$ that includes a path from $\ell_0$ to the exit location. For example, it can be used to remove trivial execution paths from $\mathcal{S}$, or to concentrate on those that include more sequences of loops (since we are interested in LB$^{w}$).

#### **3.6 Handling General TSs**

So far, we have considered a special case of TSs in which all locations, except the entry and exit ones, are multi-path loops. Next, we explain how to handle the general case. It is easy to see that we can allow locations that correspond to trivial SCCs. These correspond to paths that connect loops and might include branching as well. For such locations there is no need to infer metering functions or apply any specialization; we only need to assign them quasi-invariants that satisfy (15), to guarantee that the overall specialized TS is non-blocking.

The more elaborate case is when the TS includes non-trivial SCCs that do not form a multi-path loop. In such a case, if an SCC has a single cut-point, we can unfold its edges and transform it into a multi-path loop following the techniques of [1]. It is important to note that when merging two transitions, the cost of the new one is the sum of their costs. In this case the number of iterations is still a lower-bound on the cost of the loop; however, we might get a better bound by multiplying it by the minimal cost of its transitions.

If an SCC cannot be transformed into a multi-path loop by unfolding its transitions, then it might correspond to a nested loop, and, in such a case, we can recover the nesting structure and consider the loops as separate TSs that are "called" from the outer one, using loop extraction techniques [25]. Each inner loop is then analyzed separately and replaced (in the original TS, where it is "called") by a single edge with its lower-bound as the cost of that edge, and then the outer loop is analyzed taking that cost into account. Besides, to guarantee that the specialized program corresponds to a valid execution, we require the quasi-invariant of the inner loop to hold in the context of the quasi-invariant of the outer loop. This approach is rather standard in cost analysis of structured programs [1,3,12].

Another issue is how to compose the (local) lower-bounds of the specialized loops into a global lower-bound. For this, we can rely on the techniques of [1,3] that rewrite the local lower-bounds in terms of the input values by relying on invariant generation and recurrence relation solving.

### **4 Inference Using Max-SMT**

This section presents how metering functions and narrowings can be inferred automatically using Max-SMT, namely how to automatically infer all $\mathcal{G}\_\tau$, $\mathcal{U}\_\tau$, $\mathcal{Q}\_\ell$, and $\rho\_\ell$ such that (9)-(16) are satisfied. We do it in a modular way, i.e., we seek $\mathcal{G}\_\tau$, $\mathcal{U}\_\tau$, $\mathcal{Q}\_\ell$, and $\rho\_\ell$ for one loop at a time, following a (reversed) topological order of the SCCs, as we describe next. Recall that (16) is required only for loops connected directly to $\ell\_0$, and w.l.o.g. we assume there is only one such loop.

#### **4.1 A Template-Based Verification Approach**

We first show how the template-based approach of [6,17] can be used to find $\mathcal{G}\_\tau$, $\mathcal{U}\_\tau$, and $\mathcal{Q}\_\ell$ by representing them as template constraint systems, i.e., each is a conjunction of linear constraints where coefficients and constants are unknowns. Also, $\rho\_\ell$ is represented as a linear template function $\bar{a} \cdot \bar{x} + a\_0$ where $(a\_0, \bar{a})$ are unknowns. Then, the problem is to find concrete values for the unknowns such that all formulas generated by (9)-(16) are satisfied:

– Each ∀-formula generated by (9)-(16), except those of (14) that we handle below, can be viewed as an ∃∀ problem where the ∃ is over the unknowns of the templates and the ∀ is over (some of) the program variables. It is well-known that solving such an ∃∀ problem, i.e., finding values for the unknowns, can be done by translating it into a corresponding ∃ problem over the existentially quantified variables (i.e., the unknowns) using Farkas' lemma [20], which can then be solved using an off-the-shelf SMT solver.
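The certificate side of Farkas' lemma can be made concrete with a small checker: the implication $\forall \bar{x}.\, A\bar{x} \le b \to c \cdot \bar{x} \le d$ is witnessed by multipliers $\lambda \ge 0$ with $\lambda^T A = c$ and $\lambda^T b \le d$. The matrices and multipliers below are illustrative only; in the actual approach the SMT solver searches for the multipliers (and the template unknowns inside $A$, $c$) rather than checking a given certificate:

```python
# Verify a Farkas certificate lam for the implication A·x <= b  =>  c·x <= d.

def check_farkas_certificate(A, b, c, d, lam):
    m, n = len(A), len(c)
    if any(l < 0 for l in lam):           # multipliers must be non-negative
        return False
    for j in range(n):                    # lam^T A == c, component-wise
        if sum(lam[i] * A[i][j] for i in range(m)) != c[j]:
            return False
    return sum(lam[i] * b[i] for i in range(m)) <= d   # lam^T b <= d
```

For example, `x >= 1 and y >= x implies x + y >= 2` (in `<=` form: `-x <= -1`, `x - y <= 0` entail `-x - y <= -2`) is certified by the multipliers `(2, 1)`.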

– To handle (14) we follow [17], and eliminate $\exists \bar{u}$ using the skolemization $u\_i = \bar{a} \cdot \bar{x} + a\_0$ where $(a\_0, \bar{a})$ are fresh unknowns (different for each $u\_i$). This allows handling it using Farkas' lemma as well. However, in addition, when solving the corresponding $\exists$ problem we require all $(a\_0, \bar{a})$ to be integer. This is because the domain of program variables is the integers, and picking integer values for all $(a\_0, \bar{a})$ guarantees that the values of any $x'\_i$ that depends on $\bar{u}$ will be integer as well<sup>1</sup>.

The size of the templates for $\mathcal{G}\_\tau$, $\mathcal{U}\_\tau$, and $\mathcal{Q}\_\ell$, i.e., the number of inequalities, is crucial for precision and performance. The larger the size, the more likely it is that we get a solution if one exists, but also the worse the performance (as the corresponding SMT problem will include more constraints and variables). In practice, one typically starts with templates of size 1, and iteratively increases the size by 1 when failing to find values for the unknowns, until a solution is found or a bound on the size is reached.

Alternatively, we can use the approach of [17] to construct $\mathcal{G}\_\tau$, $\mathcal{U}\_\tau$, and $\mathcal{Q}\_\ell$ incrementally. This starts with templates of size 1, but instead of requiring all of (9)-(16) to hold, the conditions generated by (12) are marked as soft constraints (i.e., we accept solutions in which they do not hold) and Max-SMT is used to get a solution that satisfies as many of these soft conditions as possible. If all are satisfied, we are done; if not, we use the current solution to instantiate the templates, add another template inequality to each of them, and repeat the process. This means that at any given moment, each template includes at most one inequality with unknowns. Finally, to guarantee progress from one iteration to another, soft conditions that hold at some iteration are required to hold at the next one, i.e., they become hard.

The use of (12) as a soft constraint is based on the observation [12] that when seeking a metering function, the problematic part is often to guarantee that it is negative on exit transitions, which is normally achieved by adding quasi-invariants that are incrementally inferred. By requiring (12) to be soft, we handle more exit transitions as the quasi-invariant gets stronger, until all are covered.

#### **4.2 Better Quality Solutions**

The precision can also be affected by the quality of the solution picked by the SMT solver for the corresponding $\exists$ problem. Since there might be many metering functions that satisfy (9)-(16), we are interested in narrowing the search space of the SMT solver in order to find more accurate ones, i.e., ones that lead to longer executions. Next we present some techniques for this purpose.

<sup>1</sup> Because we assumed that constraints involving primed variables are of the form $x'\_i = \bar{a} \cdot \bar{x} + \bar{b} \cdot \bar{u} + c$.

*Enabling More Loop Transitions.* We are interested in guard narrowings that keep as many loop transitions as possible, since such narrowings are more likely to generate longer executions. This can be done by requiring the following to hold

$$\exists \bar{x}. \quad \bigvee\_{\tau=(\ell,\ell,\mathcal{R}) \in \mathcal{T}} \left( \mathcal{Q}\_{\ell}(\bar{x}) \land \mathcal{R}(\bar{x}) \land \mathcal{G}\_{\tau}(\bar{x}) \right) \tag{17}$$

We also use Max-SMT to require a solution that satisfies as many disjuncts as possible, thus eliminating fewer loop transitions (if $\mathcal{Q}\_\ell(\bar{x}) \land \mathcal{R}(\bar{x}) \land \mathcal{G}\_\tau(\bar{x})$ is *false* for a transition $\tau$, then $\tau$ is actually disabled). Note that this condition can be used instead of (10), which requires the quasi-invariant to be non-empty.

*Larger Metering Functions.* We are interested in metering functions that lead to longer executions. One way to achieve this is to require metering functions to be ranking as well, i.e., in addition to (11) we require the following to hold

$$\forall \bar{x}, \bar{u}, \bar{x}'.\mathcal{Q}\_{\ell}(\bar{x}) \land \mathcal{G}\_{\tau}(\bar{x}) \land \mathcal{U}\_{\tau}(\bar{x}, \bar{u}) \land \mathcal{R}(\bar{x}, \bar{u}, \bar{x}') \to \rho\_{\ell}(\bar{x}) - \rho\_{\ell}(\bar{x}') \ge 1 \quad \text{for each } (\ell, \ell, \mathcal{R}) \in \mathcal{T} \tag{18}$$

$$\forall \bar{x}, \bar{u}.\mathcal{Q}\_{\ell}(\bar{x}) \land \mathcal{G}\_{\tau}(\bar{x}) \land \mathcal{R}(\bar{x}) \to \rho\_{\ell}(\bar{x}) \ge 0 \qquad \text{for each } (\ell, \ell, \mathcal{R}) \in \mathcal{T} \tag{19}$$

These new conditions are added as soft constraints, and we use Max-SMT to ask for a solution that satisfies as many conditions as possible.
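Conditions (18) and (19) can be checked concretely on a one-transition loop such as $x \ge 0$, $x' = x - 1$ with $\rho(x) = x$. The sketch below samples guarded states instead of discharging the universally quantified formulas, which an actual implementation does with an SMT solver:

```python
# Check the ranking conditions (18)/(19) for rho(x) = x on the loop
# "x >= 0, x' = x - 1", sampled over a finite set of guarded states.

def rho(x):
    return x

def update(x):          # the loop's update: x' = x - 1
    return x - 1

for x in range(0, 100):              # states satisfying the guard x >= 0
    xp = update(x)
    assert rho(x) - rho(xp) >= 1     # (18): rho decreases by at least 1
    assert rho(x) >= 0               # (19): rho is non-negative on the guard
```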

*Unbounded Metering Functions.* We are interested in metering functions that do not have an upper bound, since otherwise they lead to constant lower-bound functions. For example, for a loop with a transition $x \ge 0 \land x' = x - 1$, we want to avoid quasi-invariants like $x \le 5$, which would make the metering function $x$ bounded by 5. For this, we rely on the following lemma.

**Lemma 1.** *A function $\rho(\bar{x}) = \bar{a} \cdot \bar{x} + a\_0$ is unbounded over a polyhedron $P$ iff $\bar{a} \cdot \bar{y}$ is positive for at least one ray $\bar{y}$ of the* recession cone *of $P$.*

It is known that for a polyhedron $P$ given in constraints representation, its recession cone $\mathsf{cone}(P)$ is the set specified by the constraints of $P$ after removing all free constants. Now we can use the above lemma to require that the metering function $\rho\_\ell(\bar{x}) = \bar{a} \cdot \bar{x} + a\_0$ is unbounded over the quasi-invariant $\mathcal{Q}\_\ell$ by requiring the following condition to hold

$$
\exists \bar{x}. \; \mathsf{cone}(\mathcal{Q}\_{\ell}) \land \bar{a} \cdot \bar{x} > 0 \tag{20}
$$

where $\mathsf{cone}(\mathcal{Q}\_\ell)$ is obtained from the template of $\mathcal{Q}\_\ell$ by removing all (unknowns corresponding to) free constants, i.e., it is the *recession cone* of $\mathcal{Q}\_\ell$.
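The unboundedness check (20) can be illustrated on concrete (non-template) polyhedra. The sketch below replaces the SMT query by a small integer search over candidate rays, which suffices for the running example; the constraint encoding is our own:

```python
from itertools import product

def recession_cone(constraints):
    """Drop the free constants: a·x <= b becomes a·x <= 0.
    A constraint is a pair (coeffs, const) meaning coeffs·x <= const."""
    return [(a, 0) for (a, _) in constraints]

def unbounded_over(a, constraints, radius=3):
    """Look for a small integer ray y of the recession cone with a·y > 0."""
    cone = recession_cone(constraints)
    n = len(a)
    for y in product(range(-radius, radius + 1), repeat=n):
        in_cone = all(sum(ci * yi for ci, yi in zip(c, y)) <= 0
                      for c, _ in cone)
        if in_cone and sum(ai * yi for ai, yi in zip(a, y)) > 0:
            return True
    return False
```

For $\rho(x) = x$, the quasi-invariant $\{x \ge 0\}$ admits the ray $y = 1$, so $\rho$ is unbounded; adding $x \le 5$ collapses the cone to $\{0\}$ and $\rho$ becomes bounded.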

Note that all encodings discussed in this section generate non-linear SMT problems, because they either correspond to ∃∀ problems that include templates on the left-hand side of implications, or to ∃ problems over templates that include both program variables and unknowns.

Finally, it is important to note that the described optimizations provide theoretical guarantees of better lower bounds: the one that adds (18,19) leads to a bound that corresponds exactly to the worst-case execution (of the specialized program), and the one that uses (20) is essential to avoid constant bounds.

# **5 Implementation and Experimental Evaluation**

We have implemented a *LO*wer-*B*ound synthesiz*ER*, named LOBER, that can be used from an online web interface at http://costa.fdi.ucm.es/lober. LOBER is built as a pipeline with the following processes: (1) it first reads a KoAT file [5] and generates a corresponding set of multi-path loops, by extracting parts of the TS that correspond to loops [25], applying unfolding, and inferring loop summaries to be used in the calling context of nested loops, as explained in Sect. 3.6; (2) it then encodes in SMT the conditions (9)–(13) defined throughout the paper, for each loop separately, using template generation, a process that involves several non-trivial implementations using Farkas' lemma (this part is implemented in Java and uses Z3 [8] for simple (linear) satisfiability checks when producing the Max-SMT encoding); (3) the problem is solved using the SMT solver Barcelogic [4], as it allows us to use non-linear arithmetic and Max-SMT capabilities in order to assert soft conditions and implement the solutions described in Sect. 4; (4) to guarantee the correctness of our system's results, we have added to the pipeline an additional checker that uses Z3 to prove that the obtained metering function and quasi-invariants verify conditions (9)–(13).

To empirically evaluate our approach, we have used benchmarks from the Termination Problem Data Base (TPDB), namely those from the category *Complexity ITS*, which contains Integer Transition Systems. We have removed non-terminating TSs and terminating TSs whose cost is unbounded (i.e., the cost depends on some non-deterministic variables and can be arbitrarily high) or non-linear, because they are outside the scope of our approach. In total, we have considered a set of 473 multi-path loops, from which we have excluded 13 that were non-linear. Analyzing these 473 programs took 199 min, an average of approximately 25 s per program. For 255 of them, it took less than 1 s.

Table 1 illustrates our results and compares them to those obtained by the LoAT [12,13] system, which also outputs a pair $(\rho, \mathcal{Q})$ of a lower-bound function $\rho$ and initial conditions $\mathcal{Q}$ on the input for which $\rho$ is a valid lower-bound. In order to automatically compare the results obtained by the two systems, we have implemented a comparator that first expresses costs as functions $f : \mathbb{N} \to \mathbb{R}\_{\ge 0}$ over a single variable $n$ and then checks which function is greater. To obtain this unary cost function from the results $(\rho, \mathcal{Q})$, we use convex polyhedra manipulation libraries to maximize the obtained cost $\rho$ wrt. $\mathcal{Q} \land -n \le x\_i \le n$, where $x\_i$ are the TS variables, and express the maximized expressions in terms of $n$. Therefore, $f(n)$ represents the maximum cost when the variables are bounded by $|x\_i| \le n$ and satisfy the corresponding initial condition $\mathcal{Q}$, a notion very similar to the runtime complexity used in [12,13]. Once we have both unary linear costs $f\_1(n) = k\_1 n + d\_1$ and $f\_2(n) = k\_2 n + d\_2$, we compare them on $n \ge 0$ by inspecting $k\_1$ and $k\_2$.
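The final comparison step admits a direct sketch (our own helper, not the paper's comparator code): with equal slopes the constants decide, otherwise the larger slope eventually dominates for all sufficiently large $n \ge 0$.

```python
# Compare two unary linear cost bounds f1(n) = k1*n + d1 and
# f2(n) = k2*n + d2 for all sufficiently large n >= 0.

def compare_costs(k1, d1, k2, d2):
    """Return '>', '<' or '=' for f1 vs f2."""
    if k1 != k2:
        return ">" if k1 > k2 else "<"
    if d1 != d2:
        return ">" if d1 > d2 else "<"
    return "="
```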

Each row of the table contains the number of loops for which both tools obtain the same result (=), the number of loops where LOBER is better than LoAT (>), and the number of loops where LoAT is better than LOBER (<). The subcategories are obtained directly from the name of the innermost folder, except for the cases in which this folder contains too few examples, which we merge into a Misc folder in the parent directory. The total number of loops considered in each subcategory appears in column **Total**. Brockschmidt 16 and Hark 20 have their first row empty, as all their results are contained in their subcategories.

**Table 1.** Results of the experiments.

Globally, both tools behave the same in 412 programs (column "="), obtaining equivalent linear lower bounds in 376 of them and a constant lower bound in the remaining ones. Our tool LOBER achieves better accuracy in 37 programs (column ">"), while LoAT is more precise in 11 programs (column "<"). Let us discuss the two sets of programs on which the tools differ. As regards the 37 examples for which we get better results, LoAT crashes in 4 cases and can only find a constant lower bound in 1 example, while our tool is able to find a path of linear length by introducing the necessary quasi-invariants. For the remaining 32 loops, both tools get a linear bound, but LOBER finds one that leads to an unboundedly longer execution: 18 of these loops correspond to cases that have implicit relations between the different execution paths (like our running examples) and require semantic reasoning; for the remaining 14, we get a better set of quasi-invariants. The following techniques have been needed to get these results in the 37 better cases (note that (i) is not mutually exclusive with the others):


This shows experimentally the relevance of all components of our framework and its practical applicability, thanks to the good performance of the Max-SMT solver on non-linear arithmetic problems. In general, over the whole set of programs, we can solve 308 examples without quasi-invariants and 444 without guard-narrowing. The intersection of these two sets contains 298 examples (63% of the programs), which leaves 175 programs that need some of the proposed techniques to be solved.

As regards the 11 examples for which we get worse results than LoAT, we have two situations: (1) In 6 cases, the SMT solver is not able to find a solution. We noticed that too many quasi-invariants were required, which made the SMT problem too hard. To improve our results, we could start, as a preprocessing step, from a quasi-invariant that includes all invariant inequalities that syntactically appear in the loop transitions, similar to what is done by LoAT when inferring what they call conditional metering functions [12]. This is left for future experimentation. (2) In the other 5 cases, our tool finds a linear bound but with a worse set of quasi-invariants, which makes the LoAT bound provide unboundedly longer executions. We are investigating whether this can be improved by adding new soft constraints that guide the solver to these better solutions. Finally, let us mention that, for the 13 problems on which LoAT gives a non-linear bound and which have been excluded from our benchmarks as justified above, we get a linear bound for the 12 that have a polynomial bound (of degree 2 or more), and a constant bound for the remaining one, which has a logarithmic lower bound. This is the best we can obtain, as our approach focuses on the inference of precise local linear bounds, which constitute the most common type of loops.

All in all, we argue that our experimental results are promising: we obtain more accurate results than LoAT on three times as many benchmarks, and many of those examples correspond to complex loops that lead to worse results when disconnecting transitions. Besides, we see room for further improvement, as most examples for which LoAT outperforms us could be handled as accurately with better quasi-invariants (which are somewhat of a black-box component in our framework). Syntactic strategies that use invariant inequalities appearing in the transitions, like those used in LoAT, would help, as would further improvements in SMT non-linear arithmetic.

*Application Domains.* The accuracy gains obtained by LOBER have applications in several domains in which knowing the precise cost can be fundamental. This is the case for predicting the gas usage [26] of executing *smart contracts*, where gas cost amounts to monetary fees. The caller of a transaction needs to include a gas limit to run it. Setting the gas limit too low can end in an "out of gas" exception, while setting it too high can end in a "not enough eth (money)" error. Therefore, a tighter prediction is needed to be safe on both sides. Also, when the UB is equal to the LB, we have an exact estimation, e.g., we would know precisely the runtime or memory consumption of the most costly executions. This can be crucial in safety-critical applications and has also been used to detect potential vulnerabilities such as denial-of-service attacks. In https://apps.dtic.mil/sti/pdfs/AD1097796.pdf, vulnerabilities are detected in situations in which the two bounds do not coincide. For instance, in password verification programs, if the UB and LB differ due to a difference in the delays associated with how many characters of the guessed password are right, this is identified as a potential attack.

# **6 Related Work and Conclusions**

We have proposed a novel approach to synthesize precise lower-bounds for integer non-deterministic programs. The main novelties are the use of loop specialization to facilitate the task of finding a (precise) metering function and the Max-SMT encoding to find larger (better) solutions. Our work is related to two lines of research: (1) non-termination analysis and (2) LB inference. In both kinds of analysis, one aims at finding classes of inputs for which the program features a non-terminating behavior (1) or a cost-expensive behavior (2). Therefore, techniques developed for non-termination might provide a good basis for developing a LB analysis. In this sense, our work exploits ideas from the Max-SMT approach to non-termination in [17]. The main idea borrowed from [17] has been the use of quasi-invariants to specialize loops towards the desired behavior: in our case towards the search for a metering function, in theirs towards the search for a non-termination proof. However, there are fundamental differences, since we have proposed other new forms of loop specialization (see a more detailed comparison in Sect. 1) and have been able to adapt the use of Max-SMT to accurately solve our problem (i.e., find larger bounds). As mentioned in Sect. 1, our loop specialization technique can be used to gain precision in non-termination analysis [17]. For instance, in the loop "while (x>=0 and y>=0) {if (∗) {x++; y−−;} else {x−−;y++;}}", no sub-SCC (considering only one of the transitions) is non-terminating, and no quasi-invariant can be found that ensures we stay in the loop (when considering both transitions); hence the loop cannot be handled by [17]. Instead, if we narrow the transitions by adding y >= x to the if-condition (and hence x > y to the else), we can prove that x >= 0 ∧ y >= 0 ∧ x + y = 1 is a quasi-invariant, which allows us to prove non-termination in the way of [17] (as we stay in the loop forever).
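The narrowing argument above can be checked concretely by simulation (assuming, for illustration, the start state x = 0, y = 1, which satisfies the quasi-invariant): every step of the narrowed loop preserves x >= 0, y >= 0, and x + y == 1, so the guard never fails.

```python
# Simulate the narrowed version of
#   while (x>=0 and y>=0) { if (*) {x++; y--;} else {x--; y++;} }
# where the if-branch is guarded by y >= x and the else-branch by x > y.

def narrowed_step(x, y):
    assert x >= 0 and y >= 0          # the loop guard holds on entry
    if y >= x:                        # narrowed if-branch
        return x + 1, y - 1
    else:                             # narrowed else-branch (x > y)
        return x - 1, y + 1

x, y = 0, 1
for _ in range(1000):
    x, y = narrowed_step(x, y)
    assert x >= 0 and y >= 0 and x + y == 1   # quasi-invariant preserved
```

The execution oscillates between (0, 1) and (1, 0) forever, witnessing non-termination.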

As regards LB inference, the current state of the art is the work by Frohn et al. [12,13], which introduces the notions of metering function and acceleration. Our work indeed tries to recover the semantic loss in [12,13] due to defining metering functions for simple loops and combining them in a later stage using acceleration. Technically, we only share with this work the basic definition of metering function in Sect. 3.1. Indeed, the definition in conditions (3) and (4) already generalizes the one in [12,13], since it is not restricted to simple loops. This definition is improved in the following sections with several loop specializations. While [12,13] relies on pure SMT to solve the problem, we propose to gain precision using Max-SMT. We believe that similar ideas could be adapted by [12,13]. Due to the different technical approaches underlying both frameworks, their accuracy and efficiency must be compared experimentally wrt. the LoAT system that implements the ideas in [12,13]. We argue that the results in Sect. 5 justify the important gains of using our new framework and prove experimentally that the fact that we do not lose semantic relations in the search for metering functions is key to infer LBs for challenging cases in which [12,13] fails. Originally, the LoAT [12,13] system only accelerated simple loops by using metering functions, so the overall precision of the lower bound relied on obtaining valid and precise metering functions. However, the framework in [12,13] is independent of the acceleration technique applied. In order to increase the number of simple loops that can be accelerated, Frohn [11] proposes a calculus to combine different conditional acceleration techniques (monotonic increase/decrease, eventual increase/decrease, and metering functions).
These conditional acceleration techniques assume that all iterations of the loop verify some condition ϕ, and the calculus applies the techniques in order and extracts those conditions ϕ from fragments of the loop guard. Although more precise and powerful, the combined acceleration calculus considers only simple loops, so it does not solve the precision loss when the loop cost involves several interleaved transitions. Moreover, the techniques in [11] are integrated into LoAT, so the experimental evaluation in Sect. 5 compares our approach to the framework of [12,13] extended with several techniques to accelerate loops (not only metering functions).

Finally, our approach presents similarities to the CTL\* verification for ITSs in [7], as both extend transition guards of the original ITS. The difference is that in [7] the added constraints contain only newly created *prophecy variables*, and the transitions to modify are detected directly using graph algorithms, whereas our SMT-based approach adds constraints only over existing variables to satisfy the properties that characterize a good metering function. Additionally, the approaches differ both in the goal (CTL\* verification vs. inference of lower-bounds) and in the technologies applied (CTL model checkers vs. Max-SMT solvers).

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Fast Computation of Strong Control Dependencies**

Marek Chalupa(B), David Klaška, Jan Strejček, and Lukáš Tomovič

Masaryk University, Brno, Czech Republic {chalupa,strejcek}@fi.muni.cz, {david.klaska,tomovic}@mail.muni.cz

**Abstract.** We introduce new algorithms for computing *non-termination sensitive control dependence* (NTSCD) and *decisive order dependence* (DOD). These relations on vertices of a control flow graph have many applications including program slicing and compiler optimizations. Our algorithms are asymptotically faster than the current algorithms. We also show that the original algorithms for computing NTSCD and DOD may produce incorrect results. We implemented the new as well as fixed versions of the original algorithms for the computation of NTSCD and DOD. Experimental evaluation shows that our algorithms dramatically outperform the original ones.

# **1 Introduction**

Control dependencies between program statements have been studied since the 1970s. They have important applications in compiler optimizations [12,14,16], program analysis [9,19,36], and program transformations, especially program slicing [1,9,22,26,37]. Slicing is used in many areas including testing, debugging, parallelization, reverse engineering, program analysis, and verification [17,28].

Informally, two statements in a program are control dependent if one directly controls the execution of the other in some way. This is typically the case for **if** statements and their bodies. Control dependencies are nowadays classified as *weak (non-termination insensitive)* if they assume that a given program always terminates, or as *strong (non-termination sensitive)* if they do not make this assumption [13]. We illustrate the difference on the control flow graph in Fig. 1. Node a controls whether b or c (and then d) is going to be executed, so b, c, and d are control dependent on a (the convention is to display dependence as edges in the "controls" direction). Similarly, b controls the execution of c and d, as these nodes may be bypassed by going from b to e. Note also that d controls whether d is going to be executed in the future, and thus d is control dependent on itself. However, c does not control d, as any path from c hits d. All dependencies mentioned so far are weak, namely *standard control dependencies* as defined by Ferrante et al. [16]. Weak control dependence assumes that the program always terminates, in particular, that the loop over d cannot iterate forever. As a result, e is reached by all executions, and thus it is not weakly control dependent on any node. However, e is strongly control dependent on b and d. Indeed, if we assume that some executions can loop over d forever, then reaching e is clearly controlled by d, and also by b, as b can send the execution directly to e.

**Fig. 1.** An example of a control flow graph and control dependencies (red edges). The dotted dependencies are additional non-termination sensitive control dependencies. (Color figure online)
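The weak (standard) dependencies of Fig. 1 can be reproduced with the classical post-dominator construction of Ferrante et al.: v is control dependent on u iff v post-dominates some successor of u but does not strictly post-dominate u. A small fixpoint sketch on the graph of Fig. 1 (our own encoding of the example):

```python
# Compute standard (weak) control dependencies via post-dominators.

def post_dominators(succ, exit_node):
    nodes = set(succ)
    pd = {n: set(nodes) for n in nodes}   # start from the full set
    pd[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            if succ[n]:
                new = {n} | set.intersection(*(pd[s] for s in succ[n]))
            else:
                new = {n}
            if new != pd[n]:
                pd[n], changed = new, True
    return pd

def control_deps(succ, exit_node):
    """Pairs (v, u): v is control dependent on u."""
    pd = post_dominators(succ, exit_node)
    deps = set()
    for u in succ:
        for s in succ[u]:
            for v in pd[s]:
                # v post-dominates a successor of u and does not
                # strictly post-dominate u (v == u is allowed).
                if v == u or v not in pd[u]:
                    deps.add((v, u))
    return deps
```

On the graph of Fig. 1 (edges a→b, a→c, b→c, b→e, c→d, d→d, d→e) this yields exactly the weak dependencies listed above: b, c, d on a; c, d on b; and d on itself.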

This paper is concerned with the computation of two prominent strong control dependencies introduced by Ranganath et al. [32,33], namely *non-termination sensitive control dependence (NTSCD)* and *decisive order dependence (DOD)*. NTSCD is studied in Sect. 3, which follows after preliminaries in Sect. 2. We first recall the definition of NTSCD and the algorithm of Ranganath et al. [33] for its computation. Then we show a flaw in the algorithm and suggest a fix. Finally, we introduce a new algorithm for the computation of NTSCD. Given a control flow graph with $|V|$ nodes, the new algorithm runs in time $O(|V|^2)$, while the algorithm of Ranganath et al. runs in time $O(|V|^4 \cdot \log |V|)$ and its fixed version in time $O(|V|^5)$. We show an NTSCD relation of size $\Theta(|V|^2)$, which means that our algorithm is asymptotically optimal.

The DOD relation captures the cases when one node controls the execution order of two other nodes. Roughly speaking, nodes {b, c} are DOD on <sup>a</sup> whenever all executions passing through a eventually reach both b and c and a controls which is reached first. Ranganath et al. [33] proved that the relation is empty for *reducible* graphs [21], i.e., graphs where every cycle has a single entry point. Control flow graphs of structured programs are reducible, but irreducible graphs may arise for example in the following situations [11,33,35]:


The DOD relation is important (together with NTSCD) when we want to slice possibly non-terminating programs with irreducible control flow graphs and preserve their termination properties as well as data integrity [1,33]. This is a common requirement when slicing is used as a preprocessing step before program verification [9,23,26], worst-case execution time analysis [29], information flow analysis [18,19], analysis of concurrent programs [18] with busy-waiting synchronization or synchronization where possible spurious wake-ups of threads are guarded by loops (e.g., programs using the *pthread* library), and analysis of reactive systems and generic state-based models [2,24,33].

The DOD relation is studied in Sect. 4, where we recall its definition, discuss Ranganath et al.'s algorithm for DOD [33], and show that this algorithm also contains a flaw. Fortunately, this flaw can be easily fixed without changing the complexity of the algorithm. Further, we develop a theory that underpins our new algorithm for the computation of DOD. Due to space limitations, proofs of theorems can be found only in the extended version of this paper [8]. The new algorithm, presented at the end of the section, computes DOD in time $O(|V|^3)$, while the original as well as the fixed version of Ranganath et al.'s algorithm runs in $O(|V|^5 \cdot \log |V|)$. We show a DOD relation of size $\Theta(|V|^3)$, which means that our algorithm is again asymptotically optimal.

Section 5 focuses on *control closures (CC)* introduced by Danicic et al. [13], which generalize control dependence to arbitrary directed graphs. It is known that the *strong* (i.e., non-termination sensitive) control closure for a set of nodes containing the starting node is equivalent to the closure under the NTSCD and DOD relations. Hence, our algorithms for NTSCD and DOD can be used to compute strong CC in time $O(|V|^3)$ on control flow graphs, while the original algorithm by Danicic et al. [13] runs in $O(|V|^4)$.

Our theoretical contribution to computation of strong control dependencies is summarized in Table 1. Section 6 presents experimental evaluation showing that our algorithms are indeed dramatically faster than the original ones. The paper is concluded with Sect. 7.

### **1.1 Related Work**

The first paper concerned with control dependence is due to Denning and Denning [15], who used control dependence to certify that the flow of information in a program is secure. Weiser [37], Ottenstein and Ottenstein [30], and Ferrante et al. [16] used control dependence in program slicing, which is also the motivation for most of the later research in this area. These "classical" papers study control dependence in terminating programs with a unique exit node eventually reached by every execution. These restrictions have been gradually removed.


**Table 1.** Overview of discussed algorithms and their complexities on CFGs

Podgurski and Clarke [31] defined the first strong control dependence that does not assume termination of the program.¹ However, their definitions and algorithms still require programs to have a unique exit node.

Bilardi and Pingali [5] introduced a framework that uses a generalized dominance relation on graphs. In their framework, they are able to compute Podgurski and Clarke's control dependence in O(|E| + |V|²) time for a directed graph (V,E) with a unique exit node. In theory, NTSCD could be computed in their framework. However, computing the *augmented post-dominator tree* – the central data structure of their framework – requires the unique exit node, as it starts with the post-dominator tree and, mainly, is much more complicated compared to our algorithm for NTSCD [5].

Chen and Rosu [10] introduced a parametric approach where loops can be annotated with information about termination. The resulting control dependence is somewhere between the classical and Podgurski and Clarke's control dependence, the two being the extremes.

The notions of NTSCD and DOD were introduced in the works of Ranganath et al. [32,33] in order to slice reactive systems, e.g., operating systems or controllers of embedded devices. They also generalized the classical (*non-termination insensitive*) control dependence to graphs without a unique exit node (further investigated, e.g., by Androutsopoulos et al. [3]) and provided several relaxed versions of DOD.

Danicic et al. [13] introduced *weak* and *strong control closures (CC)* that generalize weak and strong control dependence (thus also NTSCD) to arbitrary graphs. They provide algorithms for the computation of minimal closures that run in O(|V|³) (weak CC) and O(|V|⁴) (strong CC) on graphs with |V| nodes.

An orthogonal study of control dependence that arises between statements in different procedures (e.g., due to calls to exit()) was carried out by Loyall and Mathisen [27], Harrold et al. [20], and Sinha et al. [34].

# **2 Preliminaries**

A *finite directed graph* is a pair G = (V,E), where V is a finite set of *nodes* and E ⊆ V × V is a set of *edges*. If there is an edge (m, n) ∈ E, then n is called a *successor* of m, m is a *predecessor* of n, and the edge is an *outgoing edge* of m. Given a node n, *Successors*(n) and *Predecessors*(n) denote the sets of all its successors and predecessors, respectively. A *path* from a node n₁ is a nonempty finite or infinite sequence n₁n₂... ∈ V⁺ ∪ V^ω of nodes such that there is an edge (nᵢ, nᵢ₊₁) ∈ E for each pair nᵢ, nᵢ₊₁ of adjacent nodes in the sequence. A path is called *maximal* if it cannot be prolonged, i.e., it is infinite or the last node of the path has no outgoing edge. A node m is *reachable* from a node n if there exists a finite path whose first node is n and whose last node is m.

We say that a graph is a *cycle* if it is isomorphic to a graph (V,E) where V = {n₁,...,nₖ} for some k > 0 and E = {(n₁, n₂), (n₂, n₃), ..., (nₖ₋₁, nₖ), (nₖ, n₁)}. A cycle *unfolding* is a path in the cycle that contains each node precisely once.

¹ Podgurski and Clarke [31] called their control dependence *weak control dependence* as it is a superset of classical control dependence. Nowadays, the terms *weak* and *strong* are used in precisely the opposite meaning [13].

In this paper, we consider programs represented by control flow graphs, where nodes correspond to program statements and edges model the flow of control between the statements. As control dependence reflects only the program structure, our definition of a control flow graph does not contain any statements. Our definition also does not contain any start or exit nodes as these are not important for the problems we study in this paper.

**Definition 1 (Control flow graph, CFG).** *A* control flow graph (CFG) *is a finite directed graph* <sup>G</sup> = (V,E) *where each node* <sup>v</sup> <sup>∈</sup> <sup>V</sup> *has at most two outgoing edges. Nodes with exactly two outgoing edges are called* predicate nodes *or simply* predicates*. The set of all predicates of a CFG* G *is denoted by Predicates*(G)*.*
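As a minimal sketch of Definition 1 in Python, predicate nodes can be extracted from a CFG encoded as a node set `V` and an edge list `E` (this tuple-based encoding is our own assumption, not notation from the paper):

```python
from collections import defaultdict

def predicates(V, E):
    """Return the predicate nodes of a CFG given as a node set V and an
    edge list E: exactly the nodes with two outgoing edges (Definition 1)."""
    out_degree = defaultdict(int)
    for u, _ in E:
        out_degree[u] += 1
    # Definition 1 requires at most two outgoing edges per node
    assert all(out_degree[v] <= 2 for v in V), "not a CFG"
    return {v for v in V if out_degree[v] == 2}
```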

# **3 Non-termination Sensitive Control Dependence**

This section recalls the definition of NTSCD by Ranganath et al. [32] and their algorithm for computing NTSCD. Then we show that the algorithm can produce incorrect results and introduce a new algorithm that is asymptotically faster.

**Definition 2 (Non-termination sensitive control dependence, NTSCD).** *Given a CFG* G = (V,E)*, a node* n ∈ V *is* non-termination sensitive control dependent (NTSCD) *on a predicate node* p ∈ *Predicates*(G)*, written* p −*NTSCD*→ n*, if* p *has two successors* s₁ *and* s₂ *such that*

*– all maximal paths from* s₁ *contain* n*, and*

*– there exists a maximal path from* s₂ *that does not contain* n*.*

#### **3.1 Algorithm of Ranganath et al. [33] for NTSCD**

The algorithm is presented in Algorithm 1. Its central data structure is a two-dimensional array S where, for each node n and each predicate node p with successors r and s, S[n, p] always contains a subset of {tₚᵣ, tₚₛ}. Intuitively, tₚᵣ should be added to S[n, p] if n appears on all maximal paths from p that start with the prefix pr. The *workbag* holds the set of nodes n for which some value S[n, p] has been changed and this change should be propagated. The first part of the algorithm initializes the array S with the information that each successor r of a predicate node p is on all maximal paths from p starting with pr. The main part of the algorithm then spreads the information about the reachability on all maximal paths in a forward manner. Finally, the last part computes the NTSCD relation according to Definition 2 with use of the information in S.

The algorithm runs in time O(|E| · |V|³ · log |V|) [33] for a CFG G = (V,E). The log |V| factor comes from set operations. Since every node in a CFG has at most 2 outgoing edges, we can simplify the complexity to O(|V|⁴ · log |V|).

Although the correctness of the algorithm has been proved [32, Theorem 7], Fig. 2 presents an example where the algorithm provides an incorrect answer.

**Algorithm 1:** The NTSCD algorithm by Ranganath et al. [33]

```
Input: a CFG G = (V,E)
Output: a potentially incorrect NTSCD relation stored in ntscd

 1 Set S[n, p] = ∅ for all n ∈ V and p ∈ Predicates(G)   // Initialization
 2 workbag ← ∅
 3 for p ∈ Predicates(G) do
 4     for r ∈ Successors(p) do
 5         S[r, p] ← {tpr}
 6         workbag ← workbag ∪ {r}
 7
 8 while workbag ≠ ∅ do                                  // Computation of S
 9     n ← pop from workbag
10     if Successors(n) = {s} for some s ≠ n then        // One successor case
11         for p ∈ Predicates(G) do
12             if S[n, p] \ S[s, p] ≠ ∅ then
13                 S[s, p] ← S[s, p] ∪ S[n, p]
14                 workbag ← workbag ∪ {s}
15     if |Successors(n)| > 1 then                       // Multiple successors case
16         for m ∈ V do
17             if |S[m, n]| = |Successors(n)| then
18                 for p ∈ Predicates(G) \ {n} do
19                     if S[n, p] \ S[m, p] ≠ ∅ then
20                         S[m, p] ← S[m, p] ∪ S[n, p]
21                         workbag ← workbag ∪ {m}
22
23 ntscd ← ∅                                             // Computation of NTSCD
24 for n ∈ V do
25     for p ∈ Predicates(G) do
26         if 0 < |S[n, p]| < |Successors(p)| then
27             ntscd ← ntscd ∪ {p −NTSCD→ n}
```
The first part of the algorithm initializes S as shown in the figure and sets *workbag* to {2, 6, 3, 4}. Then any node from *workbag* can be popped and processed. Let us apply the policy used for queues: always pop the oldest element in *workbag*. Hence, we pop 2 and nothing happens, as the condition on line 17 is not satisfied for any m. This also means that the symbol t₁₂ is not propagated any further. Next we pop 6, which has no effect as 6 has no successor. By processing 3 and 4, t₂₃ and t₂₄ are propagated to S[5, 2] and 5 is added to the *workbag*. Finally, we process 5 and set S[6, 2] to {t₂₃, t₂₄}. The final content of S is provided in the figure. Unfortunately, the information in S is sound but incomplete. In other words, if tₚᵣ ∈ S[n, p], then n is indeed on all maximal paths from p starting with pr, but the opposite implication does not hold. In particular, t₁₂ is missing in S[5, 1] and S[6, 1]. Consequently, the last part of the algorithm computes an incorrect NTSCD relation: it correctly identifies 1 −NTSCD→ 2, 2 −NTSCD→ 3, and 2 −NTSCD→ 4, but it also incorrectly produces 1 −NTSCD→ 6 and misses 1 −NTSCD→ 5.

**Fig. 2.** An example that shows the incorrectness of the NTSCD algorithm by Ranganath et al. [33]. Solid red edges depict the dependencies computed by the algorithm when it always pops the oldest element in *workbag*. The crossed dependence is incorrect. The dotted dependence is missing in the result.

A necessary condition to get the correct result is to process 2 only after 3 and 4 are processed and S[5, 2] = {t₂₃, t₂₄}. For example, one obtains the correct S (also shown in the figure) when the nodes are processed in the order 3, 4, 2, 5, 6.

The algorithm is clearly sensitive to the order of popping nodes from *workbag*. We are currently not sure whether for each CFG there exists an order that leads to the correct result. An easy way to fix the algorithm is to ignore the *workbag* and repeatedly execute the body of the **while** loop (lines 10–21) for all n ∈ V until the array S reaches a fixpoint. However, this modification would slow down the algorithm substantially. Computing the fixpoint needs O(|V|³) iterations over the loop body (lines 10–21 excluding lines 14 and 21 handling the *workbag*), and one iteration of this loop body needs O(|V|²). Hence, the overall time complexity of the fixed version is O(|V|⁵).

#### **3.2 New Algorithm for NTSCD**

We have designed and implemented a new algorithm computing NTSCD. Our algorithm is correct, significantly simpler, and asymptotically faster than the original algorithm of Ranganath et al. [33].

The new algorithm calls, for each node n, a procedure that identifies all NTSCD dependencies of n on predicate nodes. The procedure works in the following steps:

1. Color the node n red.
2. Repeatedly color red every node that has at least one successor and whose successors are all red, until no further node can be colored.
3. For each predicate p with a red successor and an uncolored successor, report p −NTSCD→ n.


# **Algorithm 2:** The new NTSCD algorithm

```
Input: a CFG G = (V,E)
Output: the NTSCD relation stored in ntscd

 1 Procedure visit(n)                     // Auxiliary procedure
 2     n.counter ← n.counter − 1
 3     if n.counter = 0 ∧ n.color ≠ red then
 4         n.color ← red
 5         for m ∈ Predecessors(n) do
 6             visit(m)
 7
 8 Procedure compute(n)                   // Coloring the graph red for a given n
 9     for m ∈ V do
10         m.color ← uncolored
11         m.counter ← |Successors(m)|
12     n.color ← red
13     for m ∈ Predecessors(n) do
14         visit(m)
15
16 ntscd ← ∅                              // Computation of NTSCD
17 for n ∈ V do
18     compute(n)
19     for p ∈ Predicates(G) do
20         if p has a red successor and an uncolored successor then
21             ntscd ← ntscd ∪ {p −NTSCD→ n}
```
Unlike Ranganath et al.'s algorithm, which works in a forward manner, our algorithm spreads the information about the reachability of n on all maximal paths in the backward direction, starting from n.

The algorithm is presented in Algorithm 2. The procedure compute(n) implements the first two steps mentioned above. In the second step, it does not search over all nodes to pick the next node to color. Instead, it maintains the count of uncolored successors for each node. Once the count drops to 0 for a node, the node is colored red and the search continues with predecessors of this node. The third step is implemented directly in the main loop of the algorithm.

To prove that the algorithm is correct, we basically need to show that when compute(n) finishes, a node m is red iff all maximal paths from m contain n. We start with a simple observation.

**Lemma 1.** *After* compute(n) *finishes, a node* m *is red if and only if* m = n *or* m *has a positive number of successors and all of them are red.*

*Proof.* For each node m, the counter is initialized to the number of its successors and it is decreased by a call to visit(m) each time a successor of m gets red. When the counter drops to 0 (i.e., all successors of the node are red), the node is colored red. Therefore, if m is red, it either got red on line 12 and m = n, or m ≠ n and m got red because all its successors got red (it must have a positive number of successors, otherwise its counter could not be 0 after a decrement). In the other direction, if m = n, it gets red on line 12. If m has a positive number of successors which all get red, the node is colored red by the argument above.

**Theorem 1.** *After* compute(n) *finishes, for each node* m *it holds that* m *is red if and only if all maximal paths from* m *contain* n*.*

*Proof.* ("⇐=") We prove this implication by contraposition. Assume that <sup>m</sup> is an uncolored node. Lemma 1 implies that each uncolored node has an uncolored successor (if it has any). Hence, we can construct a maximal path from m containing only uncolored nodes simply by always going to an uncolored successor, either up to infinity or up to a node with no successors. This uncolored maximal path cannot contain n which is red.

("=⇒") For the sake of contradiction, assume that there is a red node <sup>m</sup> and a maximal path from m that does not contain n. Lemma 1 implies that all nodes on this path are red. If the maximal path is finite, it has to end with a node without any successor. Lemma 1 says that such a node can be red if and only if it is n, which is a contradiction. If the maximal path is infinite, it must contain a cycle since the graph is finite. Let r be the node on this cycle that has been colored red as the first one. Let s be the successor of r on the cycle. Recall that <sup>r</sup> <sup>=</sup> <sup>n</sup> as the maximal path does not contain <sup>n</sup>. Hence, node <sup>r</sup> could be colored red only when all its successors including s were already red. This contradicts the fact that <sup>r</sup> was colored red as the first node on the cycle.

To determine the complexity of our algorithm on a CFG (V,E), we first analyze the complexity of one run of compute(n). Lines 9–11 iterate over all nodes. The crucial observation is that the procedure visit is called at most once for each edge (m, m′) ∈ E of the graph: to decrease the counter of m when m′ gets red. Hence, the procedure compute(n) runs in O(|V| + |E|). This procedure is called on line 18 for each node n. Finally, lines 20–21 are executed for each pair of a node n and a predicate node p. This gives us the overall complexity O((|V| + |E|) · |V| + |V|²) = O((|V| + |E|) · |V|). Since in control flow graphs it holds that |E| ≤ 2|V|, the complexity can be simplified to O(|V|²).
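The whole of Algorithm 2 can be sketched in Python as follows. This is an illustrative reimplementation, assuming a CFG encoded as a node set `V` and an edge list `E` (our own encoding), with the recursive `visit` procedure replaced by an explicit worklist; the asymptotics are unchanged.

```python
from collections import defaultdict

def ntscd(V, E):
    """Sketch of Algorithm 2: for each node n, color red (backwards)
    every node from which all maximal paths contain n, then compare
    the colors of each predicate's two successors."""
    succs, preds = defaultdict(list), defaultdict(list)
    for u, v in E:
        succs[u].append(v)
        preds[v].append(u)

    def red_nodes(n):
        # backward coloring with successor counters (procedure compute(n))
        counter = {m: len(succs[m]) for m in V}
        red, work = {n}, list(preds[n])
        while work:
            m = work.pop()
            counter[m] -= 1                    # a successor of m turned red
            if counter[m] == 0 and m not in red:
                red.add(m)                     # all successors of m are red
                work.extend(preds[m])
        return red

    relation = set()
    for n in V:
        red = red_nodes(n)
        for p in V:
            if len(succs[p]) == 2:             # predicate node
                s1, s2 = succs[p]
                if (s1 in red) != (s2 in red): # one red, one uncolored successor
                    relation.add((p, n))
    return relation
```

For instance, on a diamond-shaped CFG 1→{2,3}, 2→4, 3→4, the sketch reports exactly 1 −NTSCD→ 2 and 1 −NTSCD→ 3, since the exit node 4 lies on all maximal paths from both successors of 1.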

Note that our algorithm is asymptotically optimal, as there are CFGs with NTSCD relations of size Θ(|V|²). For example, the CFG in Fig. 3 has |V| = 2k+1 nodes and the corresponding NTSCD relation

$$\{n\_i \xrightarrow{\mathit{NTSCD}} m\_j \mid i, j \in \{1, \ldots, k\}\} \cup \{n\_i \xrightarrow{\mathit{NTSCD}} n\_{i+1} \mid i \in \{1, \ldots, k-1\}\}$$

is of size k² + k − 1 ∈ Θ(|V|²).

**Fig. 3.** A CFG with |V| nodes that has an NTSCD relation of size Θ(|V|²).

**Fig. 4.** An example of an irreducible CFG. There are no NTSCD dependencies, but a and b are DOD on p.

# **4 Decisive Order Dependence**

There are control dependencies not captured by NTSCD. For example, consider the CFG in Fig. 4. Nodes a and b are not NTSCD on p as they lie on all maximal paths from p. However, p controls which of a and b is executed first. Ranganath et al. [33] introduced the DOD relation to capture such dependencies.

**Definition 3 (Decisive order dependence, DOD).** *Let* G = (V,E) *be a CFG and* p, a, b ∈ V *be three distinct nodes such that* p *is a predicate node with successors* s₁ *and* s₂*. Nodes* a, b *are* decisive order-dependent (DOD) *on* p*, written* p −*DOD*→ {a, b}*, if*

*– all maximal paths from* p *contain both* a *and* b*,*

*– all maximal paths from* s₁ *contain* a *before any occurrence of* b*, and*

*– all maximal paths from* s₂ *contain* b *before any occurrence of* a*.*
The importance of DOD for slicing of irreducible programs is discussed in the introduction.

#### **4.1 Algorithm of Ranganath et al. [33] for DOD**

Ranganath et al. provided an algorithm that computes the DOD relation for a given CFG G = (V,E) in time O(|V|⁴ · |E| · log |V|), which amounts to O(|V|⁵ · log |V|) on CFGs [33, Fig. 7]. The algorithm contains one unclear point. For each triple of nodes p, a, b ∈ V such that p ∈ *Predicates*(G) and a ≠ b, the algorithm executes the following check, and if it succeeds, then p −DOD→ {a, b} is reported:

reachable(*a*, *b*, *G*) ∧ reachable(*b*, *a*, *G*) ∧ dependence(*p*, *a*, *b*, *G*) (1)

**Fig. 5.** An example that shows the incorrectness of the DOD algorithm by Ranganath et al. [33]

The procedure dependence(p, a, b, G) returns *true* iff a is on all maximal paths from one successor of p before any occurrence of b, and b is on all maximal paths from the other successor of p before any occurrence of a. The procedure reachable is specified only in words [33, description of Fig. 7] as follows:

reachable(*a*, *b*, *G*) returns *true* if b is reachable from a in the graph G.

Unfortunately, this algorithm can provide incorrect results. For example, consider the CFG in Fig. 5. Nodes p, a, b satisfy formula (1): a appears on all maximal paths from one successor of p (namely a) before any occurrence of b, and b appears on all maximal paths from the other successor of p (which is b) before any occurrence of a. At the same time, a and b are reachable from each other. However, it is not true that p −DOD→ {a, b}, because a and b do not lie on all maximal paths from p (the first condition of Definition 3 is violated).

The algorithm can be fixed by modifying the procedure reachable(a, b, G) to return *true* iff b is on all maximal paths from a. The modified procedure can be implemented with use of the procedure compute(b) of Algorithm 2. As the procedure compute(b) runs in O(|V| + |E|), the modification does not increase the overall complexity of the algorithm. By comparing the fixed and the original version of reachable(a, b, G), one can readily confirm that the original version produces supersets of DOD relations.
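The fixed check can be sketched as follows; the coloring is the same backward counting as in compute(b) of Algorithm 2, and the node-set plus edge-list CFG encoding is our own assumption.

```python
from collections import defaultdict

def on_all_maximal_paths(a, b, V, E):
    """Fixed version of reachable(a, b, G): True iff b lies on all
    maximal paths from a (backward counting coloring, as in compute(b))."""
    succs, preds = defaultdict(list), defaultdict(list)
    for u, v in E:
        succs[u].append(v)
        preds[v].append(u)
    counter = {m: len(succs[m]) for m in V}
    red, work = {b}, list(preds[b])
    while work:
        m = work.pop()
        counter[m] -= 1                    # one more successor of m is red
        if counter[m] == 0 and m not in red:
            red.add(m)                     # all successors of m are red
            work.extend(preds[m])
    return a in red
```

As a usage example, take a hypothetical CFG shaped like the counterexample above: p→{a,b}, a→b, b→{a,e}. Every maximal path from a contains b, but the path b,e misses a, so the fixed check distinguishes the two directions.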

#### **4.2 New Algorithm for DOD: Crucial Observations**

As in the case of NTSCD, we have designed a new algorithm for the computation of DOD, which is relatively simple and asymptotically faster than the DOD algorithm of Ranganath et al. [33].

Given a CFG, our algorithm first computes for each predicate p the set V<sup>p</sup> of nodes that are on all maximal paths from p. The definition of DOD implies that only pairs of nodes in V<sup>p</sup> can be DOD on p. For every predicate p we build an auxiliary graph A<sup>p</sup> with nodes V<sup>p</sup> and from this graph we get all pairs of nodes that are DOD on p. The graph A<sup>p</sup> is defined as follows.

**Definition 4 (**V′**-interval** [13]**).** *Given a CFG* G = (V,E) *and a subset* V′ ⊆ V*, a path* n₁ ...nₖ *such that* k ≥ 2*,* n₁, nₖ ∈ V′*, and* ∀1 < i < k : nᵢ ∉ V′ *is called a* V′-interval *from* n₁ *to* nₖ *in* G*.*

In other words, a V′-interval is a finite path with at least one edge whose first and last nodes are in V′, while no other node on the path is in V′.

**Definition 5 (Graph** Aₚ²**).** *Given a CFG* G = (V,E)*, a predicate node* p ∈ *Predicates*(G)*, and the subset* Vₚ ⊆ V *of nodes that are on all maximal paths from* p*,* Aₚ = (Vₚ, Eₚ) *is the graph where*

Eₚ = {(x, y) | *there exists a* Vₚ*-interval from* x *to* y *in* G}.
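Assuming the set Vₚ has already been computed, the edges of Aₚ can be found by a truncated search in G from each node of Vₚ that stops whenever it reaches another node of Vₚ. A sketch under our own edge-list encoding (the helper name `build_Ap_edges` is hypothetical):

```python
from collections import defaultdict

def build_Ap_edges(Vp, V, E):
    """Sketch: compute the edges of A_p, i.e., the pairs (x, y) of nodes
    of Vp connected by a Vp-interval in the CFG (V, E)."""
    succs = defaultdict(list)
    for u, v in E:
        succs[u].append(v)
    Ep = set()
    for x in Vp:
        stack = list(succs[x])
        seen = set(stack)
        while stack:
            y = stack.pop()
            if y in Vp:
                Ep.add((x, y))        # a Vp-interval from x to y ends here
                continue              # stop the search on this path
            for z in succs[y]:
                if z not in seen:
                    seen.add(z)
                    stack.append(z)
    return Ep
```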

In this subsection, we describe the connections between these graphs and DOD that underpin our algorithm. The proofs of the theorems can be found in the extended version of this paper [8].

Given a predicate p of a CFG G, the graph A<sup>p</sup> does not have to be a CFG as nodes in A<sup>p</sup> can have more than two successors. However, A<sup>p</sup> preserves exactly all possible orders of the first occurrences of nodes in V<sup>p</sup> on maximal paths in G starting from p. More precisely, for each maximal path from p in G, there exists a maximal path from p in A<sup>p</sup> with the same order of the first occurrences of all nodes in Vp, and vice versa. Further, it turns out that there are no nodes DOD on p unless A<sup>p</sup> has the *right shape*.

**Definition 6 (Right shape of** Aₚ**).** *Given a CFG* G*, a predicate node* p ∈ *Predicates*(G) *and the graph* Aₚ = (Vₚ, Eₚ)*, we say that* Aₚ *has the* right shape *if it consists only of a cycle and the node* p *with at least two edges going to some nodes on the cycle (i.e., the nodes of* Vₚ \ {p} *can be labeled* n₁,...,nₖ *such that* Eₚ = {(n₁, n₂), (n₂, n₃), ..., (nₖ₋₁, nₖ), (nₖ, n₁)} ∪ {(p, nᵢ) | i ∈ I} *for some* I ⊆ {1,...,k} *with* |I| ≥ 2*).*

Figure 6 depicts an A<sup>p</sup> which has the right shape. In the following text, we work only with A<sup>p</sup> graphs in the right shape.

Let s₁ and s₂ be the two successors of p in G. Note that s₁ and s₂ may, but do not have to, be in Aₚ. To compute the pairs of nodes that are DOD on p, we need to know all possible orders of the first occurrences of nodes in Vₚ on the maximal paths in G starting in s₁ and s₂. Hence, for each successor sᵢ we compute the set Sᵢ of nodes that appear as the first node of Vₚ on some maximal path from sᵢ in G. Formally, for i ∈ {1, 2}, we define

$$S\_i = \{ n \in V\_p \mid \text{there exists a path } s\_i \ldots n \in (V \setminus V\_p)^\*.V\_p \text{ in } G \}.$$

The nodes in S₁ ∪ S₂ are exactly all the successors of p in Aₚ. Further, the maximal paths from the nodes of Sᵢ in Aₚ reflect exactly all possible orders of the first occurrences of nodes in Vₚ on maximal paths in G starting in sᵢ. If S₁ and S₂ are not disjoint, then there exist two maximal paths in G, one starting in s₁ and the other in s₂, that differ only in prefixes of nodes outside Vₚ. The definition of DOD implies that there are no nodes DOD on p in this case. Therefore, we assume that S₁ and S₂ are disjoint.

The nodes in Sᵢ divide the cycle of Aₚ into sᵢ*-strips*, which are parts of the cycle starting with a node from Sᵢ and ending before the next node of Sᵢ.

² Graph Aₚ can be defined as the graph induced by Vₚ in terms of Danicic et al. [13].

**Fig. 6.** An example of Aₚ in the right shape. Strips are computed for S₁ = {n₁, n₇} (blue nodes) and S₂ = {n₂, n₅} (red nodes). (Color figure online)

**Definition 7 (**sᵢ**-strip).** *Let* i ∈ {1, 2}*. An* sᵢ-strip *is a path* n...m ∈ Sᵢ.(Vₚ \ Sᵢ)∗ *in* Aₚ *such that the successor of* m *in* Aₚ *is a node in* Sᵢ*.*

An example of A<sup>p</sup> with si-strips is in Fig. 6. The si-strips directly say which pairs of nodes of V<sup>p</sup> are in the same order on all maximal paths from s<sup>i</sup> in G. In particular, a node a is before any occurrence of node b on all maximal paths from a successor s of p in G if and only if there is an s-strip containing both a and b where a is before b. As a corollary, we get the following theorem:

**Theorem 2.** *Let* p *be a predicate with successors* s₁, s₂ *such that* Aₚ *has the right shape and* S₁ ∩ S₂ = ∅*. Then nodes* a, b ∈ Vₚ *are DOD on* p *if and only if*

*– there is an* s₁*-strip in which* a *appears before* b*, and*

*– there is an* s₂*-strip in which* b *appears before* a

*(or the other way round).*
Consider again the Aₚ in Fig. 6. The theorem implies that nodes n₁, n₅ are DOD on p as they appear in the s₁-strip n₁n₂n₃n₄n₅n₆ and in the s₂-strip n₅n₆n₇n₈n₁ in the opposite order. Nodes n₁, n₆ are DOD on p for the same reason.

With use of the previous theorem, we can find a regular language over V<sup>p</sup> such that there exist nodes a, b DOD on p iff some unfolding of the cycle in A<sup>p</sup> is in the language.

**Theorem 3.** *Let* p *be a predicate with successors* s₁, s₂ *such that* Aₚ *has the right shape and* S₁ ∩ S₂ = ∅*. Further, let* U = Vₚ \ (S₁ ∪ S₂)*. There are some nodes* a, b *DOD on* p *if and only if the cycle in* Aₚ *has an unfolding of the form* S₁.U∗.(S₂.U∗)∗.S₂.U∗.(S₁.U∗)∗*.*

Finally, an unfolding of the mentioned form can be directly used for the computation of nodes that are DOD on p.

**Theorem 4.** *Let* p *be a predicate with successors* s₁, s₂ *such that* Aₚ *has the right shape and* S₁ ∩ S₂ = ∅*. Further, let* Aₚ *have an unfolding of the form* S₁.U∗.(S₂.U∗)∗.S₂.U∗.(S₁.U∗)∗ *where* U = Vₚ \ (S₁ ∪ S₂)*. Then there is exactly one path* m₁ ...mᵢ ∈ S₁.U∗.S₂ *and exactly one path* o₁ ...oⱼ ∈ S₂.U∗.S₁ *on the cycle. Moreover,* p −*DOD*→ {a, b} *if and only if* m₁ ...mᵢ₋₁ *contains* a *and* o₁ ...oⱼ₋₁ *contains* b *(or the other way round).*

# **Algorithm 3:** The algorithm computing V<sup>n</sup> for all nodes n

```
Input: a CFG G = (V,E)
Output: Vn = {m ∈ V | m is on all maximal paths from n} for all n ∈ V

 1 Procedure visit(n, r)                  // Auxiliary procedure
 2     n.counter ← n.counter − 1
 3     if n.counter = 0 ∧ r ∉ Vn then
 4         Vn ← Vn ∪ {r}
 5         for m ∈ Predecessors(n) do
 6             visit(m, r)
 7
 8 Procedure compute(n)                   // 'Coloring the graph red' for a given n
 9     for m ∈ V do
10         m.counter ← |Successors(m)|
11     Vn ← Vn ∪ {n}
12     for m ∈ Predecessors(n) do
13         visit(m, n)
14
15 Procedure computeVps                   // Computation of sets Vn for all nodes n
16     for n ∈ V do
17         Vn ← ∅
18     for n ∈ V do
19         compute(n)
```
#### **4.3 New Algorithm for DOD: Pseudocode and Complexity**

Our DOD algorithm is shown in Algorithms 3 and 4. As nearly all applications of DOD also need NTSCD, we present the algorithm with a simple extension (lines marked with asterisks) that simultaneously computes NTSCD.

The DOD algorithm starts at line 20 of Algorithm 4. The first step is to compute the sets Vₚ for all predicate nodes p of a given CFG G. This computation can be found in Algorithm 3, which is a slightly modified version of Algorithm 2. Recall that the procedure compute(n) of Algorithm 2 marks red every node such that all maximal paths from the node contain n. The procedure compute(n) of Algorithm 3 does in principle the same, but instead of using the red color it marks the nodes with the identifier of the node n. Every node m collects these marks in the set Vₘ. After we run compute(n) for all nodes n in the graph, each node m has in its set Vₘ precisely all nodes that are on all maximal paths from m. For the computation of DOD, only the sets Vₚ for predicate nodes p are needed, but the extension computing NTSCD may use all these sets.
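Algorithm 3 can be sketched in Python as follows; again an illustrative reimplementation with an explicit worklist instead of recursion, under our own node-set plus edge-list encoding of the CFG.

```python
from collections import defaultdict

def compute_all_V(V, E):
    """Sketch of Algorithm 3: return a dict mapping every node n to
    V_n = the set of nodes lying on all maximal paths from n."""
    succs, preds = defaultdict(list), defaultdict(list)
    for u, v in E:
        succs[u].append(v)
        preds[v].append(u)
    Vsets = {n: set() for n in V}
    for r in V:                                # one backward pass per node r
        counter = {m: len(succs[m]) for m in V}
        Vsets[r].add(r)
        work = list(preds[r])
        while work:
            m = work.pop()
            counter[m] -= 1                    # a successor of m was marked with r
            if counter[m] == 0 and r not in Vsets[m]:
                Vsets[m].add(r)                # all maximal paths from m contain r
                work.extend(preds[m])
    return Vsets
```

On the diamond CFG 1→{2,3}, 2→4, 3→4, for example, every node has the exit 4 on all its maximal paths, so 4 appears in each V_n.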

When the sets Vₚ are calculated, we compute DOD (and NTSCD) dependencies for each predicate node separately by the procedures computeDOD(p) and computeNTSCD(p). The procedure computeDOD(p) first constructs the graph Aₚ with the use of buildAp(p). The nodes of the graph are those of Vₚ. To compute the edges, we trigger a depth-first search in G from each n ∈ Vₚ. If we find a node m ∈ Vₚ, we add the edge (n, m) to the graph Aₚ and stop the search on this path.

**Algorithm 4:** The new DOD algorithm, which also computes NTSCD if the lines marked with asterisks are included (computeVps is given in Algorithm 3)

```
Input: a CFG G = (V,E)
Output: the DOD relation stored in dod (and NTSCD stored in ntscd)

  1 Procedure computeDOD(p)              // Computation of DOD for predicate p
  2     Ap ← buildAp(p)                  // Get the graph Ap
  3     if Ap does not have the right shape then
  4         return ∅
  5     S1, S2 ← computeS1S2(p)          // Get the sets S1, S2
  6     if S1 ∩ S2 ≠ ∅ then
  7         return ∅
  8     n1n2 ...nt ← unfoldCycle(Ap, S1) // Unfold the cycle of Ap
  9     U ← Vp \ (S1 ∪ S2)
 10     if n1n2 ...nt ∉ (S1.U∗)⁺.(S2.U∗)⁺.(S1.U∗)∗ then   // Apply Thm. 3
 11         return ∅
 12     m1 ...mi ← extract(n1n2 ...nt, S1.U∗.S2)          // Apply Thm. 4
 13     o1 ...oj ← extract(n1n2 ...nt, S2.U∗.S1)
 14     return {p −DOD→ {a, b} | a ∈ {m1,...,mi−1}, b ∈ {o1,...,oj−1}}
 15
*16 Procedure computeNTSCD(p)            // Computation of NTSCD for predicate p
*17     {s1, s2} ← Successors(p)
*18     return {p −NTSCD→ n | n ∈ (Vs1 \ Vs2) ∪ (Vs2 \ Vs1)}
 19
 20 computeVps                           // Computation of DOD and NTSCD for all nodes
 21 dod ← ∅
*22 ntscd ← ∅
 23 for p ∈ Predicates(G) do
 24     dod ← dod ∪ computeDOD(p)
*25     ntscd ← ntscd ∪ computeNTSCD(p)
```

When the graph Aₚ is constructed, we check whether it has the right shape. If not, we return ∅, as there are no nodes DOD on p in this case.

The next step is to compute the sets S<sub>1</sub> and S<sub>2</sub>. Again, we apply a depth-first search similar to the one used in the construction of A<sup>p</sup> described above. If the sets S<sub>1</sub>, S<sub>2</sub> are not disjoint, we return ∅ as there are no nodes DOD on p.

Then we unfold the cycle in A<sup>p</sup> from an arbitrary node in S<sub>1</sub>, compute the set U, and check whether the unfolding matches (S<sub>1</sub>.U*)<sup>+</sup>.(S<sub>2</sub>.U*)<sup>+</sup>.(S<sub>1</sub>.U*)<sup>*</sup>. Note that any unfolding starting in S<sub>1</sub> matches this language iff the cycle has an unfolding of the form S<sub>1</sub>.U*.(S<sub>2</sub>.U*)*.S<sub>2</sub>.U*.(S<sub>1</sub>.U*)<sup>*</sup> of Theorem 3. Hence, we return ∅ if the check fails.
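The language check on line 10 of Algorithm 4 can be phrased as ordinary regular-expression matching over a three-letter alphabet. The sketch below (a hypothetical helper of our own, not the paper's code) abstracts each node of the unfolding to '1' (in S1), '2' (in S2), or 'u' (in U) and matches the resulting word:

```python
import re

def unfolding_matches(unfolding, S1, S2):
    """Abstract the cycle unfolding to a word over {1, 2, u} and test it
    against (S1.U*)+.(S2.U*)+.(S1.U*)*; if the test fails, the algorithm
    reports that no nodes are DOD on the predicate."""
    word = ''.join('1' if n in S1 else '2' if n in S2 else 'u'
                   for n in unfolding)
    return re.fullmatch(r'(?:1u*)+(?:2u*)+(?:1u*)*', word) is not None
```

For example, an unfolding visiting S1, U, S2, U in that order matches, while an unfolding that alternates S1, S2, S1, S2 does not.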

**Fig. 7.** A CFG with |V| nodes whose DOD relation has size Θ(|V|<sup>3</sup>).

Finally, we extract the paths of the form S<sub>1</sub>.U*.S<sub>2</sub> and S<sub>2</sub>.U*.S<sub>1</sub> from the unfolding. Note that the last node of the latter path can be the first node of the unfolding. We then compute the DOD dependencies according to Theorem 4.

The procedure computeNTSCD(p) used for the computation of NTSCD simply follows Definition 2: it makes dependent on p each node that is on all maximal paths from the successor s<sub>1</sub> but not on all maximal paths from the successor s<sub>2</sub>, or symmetrically for s<sub>2</sub> and s<sub>1</sub>.
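Given the V sets, the NTSCD part is essentially a set-difference per predicate. A minimal sketch (assuming V maps each node to its set of always-reached nodes, as computed by computeVps; the encoding is ours):

```python
def compute_ntscd(p, successors, V):
    """computeNTSCD(p): p controls every node that lies on all maximal
    paths from one successor of p but not from the other."""
    s1, s2 = successors[p]
    return {(p, n) for n in (V[s1] - V[s2]) | (V[s2] - V[s1])}
```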

As the correctness of our algorithm follows directly from the observations made in the previous subsection, it remains only to analyze its complexity. The procedure computeVps consists of two cycles in sequence. The first cycle runs in O(|V|). The second cycle calls the procedure compute(n) O(|V|) times. This procedure is essentially identical to the procedure of the same name in Algorithm 2 and so is its time complexity, namely O(|V| + |E|). Note that sets can be represented by bitvectors, and therefore adding an element and checking the presence of an element in a set are constant-time operations. Overall, the procedure computeVps runs in O(|V| · (|V| + |E|)), which is O(|V|<sup>2</sup>) for CFGs.

Now we discuss the complexity of the procedure computeDOD(p). Creating the graph A<sup>p</sup> requires calling depth-first search O(|V|) times, which yields O(|V| · |E|) in total. The computation of S<sub>1</sub>, S<sub>2</sub> requires another two calls of depth-first search, which is in O(|E|). When sets are represented as bitvectors, checking that S<sub>1</sub> and S<sub>2</sub> are disjoint is in O(|V|). Unfolding the cycle, matching the unfolding against the language (line 10), and the procedure extract also run in O(|V|). The construction of the DOD relation on line 14 is in O(|V|<sup>2</sup>). Altogether, computeDOD(p) runs in O(|V| · |E| + |V|<sup>2</sup>), which simplifies to O(|V|<sup>2</sup>) for CFGs.

computeDOD is called O(|V|) times, so the overall complexity of computing DOD for a CFG G = (V,E) is O(|V|<sup>3</sup>). If we also compute NTSCD, we make O(|V|) extra calls to computeNTSCD(p), where one call takes O(|V|) time. Therefore, the asymptotic complexity of computing NTSCD together with DOD is the same as that of computing DOD alone.

Our algorithm running in time O(|V|<sup>3</sup>) is asymptotically optimal, as there exist graphs with DOD relations of size Θ(|V|<sup>3</sup>). For example, the CFG in Fig. 7 has |V| = 4k + 1 nodes and the corresponding DOD relation

{ q<sub>i</sub> −DOD→ {n<sub>j</sub>, m<sub>l</sub>} | i ∈ {1,...,k+1}, j, l ∈ {1,...,k} }

is of size k<sup>3</sup> + k<sup>2</sup> ∈ Θ(|V|<sup>3</sup>).
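The size claim is simple counting: each of the k + 1 predicates q<sub>i</sub> is paired with every pair (n<sub>j</sub>, m<sub>l</sub>), giving (k + 1) · k · k triples:

```python
def dod_relation_size(k):
    # i ranges over k+1 predicates, j and l over k nodes each,
    # so the relation for the CFG of Fig. 7 has (k+1) * k^2 elements
    return (k + 1) * k * k

# equals k^3 + k^2, i.e. cubic in |V| = 4k + 1
assert dod_relation_size(10) == 10**3 + 10**2
```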

# **5 Comparison to Control Closures**

In 2011, Danicic et al. [13] introduced *control closures (CC)* that generalize control dependence from CFGs to arbitrary graphs. In particular, *strong control closure*, which is sensitive to non-termination, generalizes strong control dependence including NTSCD and DOD.

**Definition 8 (Strongly control-closed set).** *Let* G = (V,E) *be a CFG and let* U ⊆ V*. The set* U *is* strongly control-closed<sup>3</sup> *in* G *if and only if for every node* v ∈ V − U *that is reachable in* G *from a node in* U*, one of these holds:*

- *no node of* U *is reachable in* G *from* v*, or*
- *every maximal path in* G *from* v *contains a node of* U*, and the first node of* U *on all these paths is the same.*
In other words, whenever we leave a strongly control-closed set, either we cannot return to it at all, or we always return to it through one particular node.

**Definition 9 (Strong control closure, strong CC).** *Let* G = (V,E) *be a CFG and* V′ ⊆ V*. A* strong control closure (strong CC) *of* V′ *is a strongly control-closed set* U ⊇ V′ *such that there is no strongly control-closed set* U′ *satisfying* U ⊋ U′ ⊇ V′*.*

Danicic et al. present an algorithm for the computation of strong control closures running in O(|V|<sup>4</sup>) [13, Theorem 66]. In fact, the algorithm uses a procedure Γ that is very similar to our procedure compute(n) of Algorithm 2.

We can also define the closure of a set of nodes under NTSCD and DOD.

**Definition 10 (NTSCD and DOD closure).** *Let* G = (V,E) *be a CFG. An* NTSCD and DOD closure *of a set* V′ ⊆ V *is the smallest set* U ⊇ V′ *satisfying*

(n ∈ U ∧ p −*NTSCD*→ n) ⟹ p ∈ U *and* (a, b ∈ U ∧ p −*DOD*→ {a, b}) ⟹ p ∈ U.

Definition 10 directly provides an algorithm computing the NTSCD and DOD closure of a given set V′ ⊆ V. Roughly speaking, if we represent the NTSCD relation with edges and the DOD relation with hyperedges in a directed hypergraph with nodes V, the closure computation amounts to gathering the nodes backward reachable from V′.
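A straightforward fixpoint over the edges and hyperedges realizes this closure. The sketch below uses our own tuple encoding of the relations (a pair (p, n) for p −NTSCD→ n, a triple (p, a, b) for p −DOD→ {a, b}):

```python
def ntscd_dod_closure(Vp, ntscd, dod):
    """Compute the NTSCD-and-DOD closure of Vp (Definition 10) by
    backward reachability: keep adding a predicate p to U whenever
    its NTSCD target is in U, or both of its DOD targets are."""
    U = set(Vp)
    changed = True
    while changed:
        changed = False
        for p, n in ntscd:
            if n in U and p not in U:
                U.add(p)
                changed = True
        for p, a, b in dod:
            if a in U and b in U and p not in U:
                U.add(p)
                changed = True
    return U
```

A worklist over the hypergraph makes this linear in the relation size, which is why the closure is dominated by computing the NTSCD and DOD relations themselves.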

<sup>3</sup> We adjusted the definition to the fact that predicates in our CFGs always have two outgoing edges (i.e., they are *complete* in terms of Danicic et al. [13]). The original definition [13] works with CFGs where each predicate has at most two successors and considers also paths that may end in a predicate with less than two successors.

Danicic et al. [13, Lemmas 93 and 94] proved that for a CFG G = (V,E) with a distinguished *start* node from which all nodes in V are reachable, and a subset U ⊆ V such that *start* ∈ U, the set U is strongly control-closed iff it is closed under NTSCD and DOD. Hence, on graphs with such a *start* node, the strong CC of a set V′ containing the *start* node can also be computed by computing its NTSCD and DOD closure. The computation of the NTSCD and DOD closure runs in O(|V|<sup>3</sup>), as the backward reachability is dominated by the computation of the NTSCD and DOD relations.

A substantial difference between the algorithm for strong CC by Danicic et al. [13] and our algorithm is that ours can compute DOD and NTSCD separately, whereas the former cannot. Moreover, our algorithm for the NTSCD and DOD closure is asymptotically faster.

# **6 Experimental Evaluation**

We implemented our algorithms for the computation of NTSCD, DOD, and the NTSCD and DOD closure in C++ on top of the LLVM [25] infrastructure. The implementation is a part of the library for program analysis and slicing called DG [6], which is used for example in the verification and test generation tool Symbiotic [7]. We also implemented the original algorithms of Ranganath et al. for NTSCD and DOD, the fixed versions of these algorithms from Subsects. 3.1 and 4.1, and the algorithm for the computation of strong CC by Danicic et al.

In the implementation of the strong CC algorithm by Danicic et al. [13], we use our procedure compute(n) of Algorithm 2 to implement the function Γ. This should have only a positive effect as this procedure is more efficient than iterating over all edges in a copy of the graph and removing them [13].

In our experiments, we use CFGs of functions (where nodes of the CFG represent basic blocks of the function) obtained in the following way. We took all benchmarks from the *Competition on Software Verification (SV-COMP)* 2020.<sup>4</sup> These benchmarks contain a lot of artificial or generated code, but also a lot of real-life code, e.g., from the Linux project. Each source code file was compiled with clang into LLVM and preprocessed by the -lowerswitch pass to ensure that every basic block has at most two successors. Then we extracted individual functions and removed those with fewer than 100 basic blocks, as the computation of control dependence runs swiftly on small graphs. Because one function can be present in multiple benchmarks, the next step was to remove such duplicate functions. For every function, we computed the number of nodes and edges in its CFG, and performed DFS on the CFG to obtain the numbers of tree, forward, cross, and back edges, and the depth of the DFS tree. If two or more functions shared the name and all the computed numbers, we kept only one of them. Note that this process may also have removed a function that was not a duplicate of another, but only with a low probability. In the end, we were left with 2440 functions. The biggest function has 27851 basic blocks. Table 2 shows the distribution of the sizes of the generated CFGs.
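The DFS part of this fingerprint can be computed with a standard iterative depth-first search that classifies each edge by discovery and finish times. The sketch below is our own reconstruction of such a classifier, not the code used in DG:

```python
def classify_edges(nodes, succ, roots):
    """Count tree/forward/back/cross edges of a directed graph via
    iterative DFS with discovery (disc) and finish (fin) timestamps."""
    disc, fin = {}, {}
    counts = {'tree': 0, 'forward': 0, 'back': 0, 'cross': 0}
    time = 0
    for r in roots:
        if r in disc:
            continue
        disc[r] = time; time += 1
        stack = [(r, iter(succ.get(r, ())))]
        while stack:
            u, it = stack[-1]
            advanced = False
            for v in it:
                if v not in disc:                 # first visit: tree edge
                    counts['tree'] += 1
                    disc[v] = time; time += 1
                    stack.append((v, iter(succ.get(v, ()))))
                    advanced = True
                    break
                elif v not in fin:                # v still on the DFS stack
                    counts['back'] += 1
                elif disc[v] > disc[u]:           # finished descendant
                    counts['forward'] += 1
                else:                             # finished, discovered earlier
                    counts['cross'] += 1
            if not advanced:
                fin[u] = time; time += 1
                stack.pop()
    return counts

# example: a -> {b, c}, b -> c, c -> a
counts = classify_edges('abc', {'a': ['b', 'c'], 'b': ['c'], 'c': ['a']}, 'a')
```

Two CFGs that differ in any of these counts cannot be identical, which makes the counts a cheap (if conservative) duplicate filter.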

<sup>4</sup> https://github.com/sosy-lab/sv-benchmarks, tag svcomp20.


**Table 2.** The *numbers* of considered CFGs by their *sizes*. The size of a CFG is the number of its nodes, which is the number of basic blocks of the corresponding function.

**Fig. 8.** Comparison of the running times of the new NTSCD algorithm and the incorrect (left) and the fixed (right) versions of the original NTSCD algorithm. TO stands for timeout.

The experiments were run on machines with *AMD EPYC* CPUs with a frequency of 3.1 GHz. Each benchmark run was constrained to 1 core and 8 GB of RAM. We used the tool *Benchexec* [4] to enforce resource isolation and to measure resource usage. All presented times are CPU times. We set the timeout to 100 s for each algorithm run.

In the following, the term *original* algorithms refers to the algorithms of Ranganath et al. (we distinguish between the incorrect and the fixed versions when needed) and *new* algorithms refers to the algorithms introduced in this paper.

**NTSCD Algorithms.** In the first set of experiments, we compared the new algorithm for NTSCD against the incorrect and the fixed versions of the original NTSCD algorithm. Although it may seem meaningless to compare against the incorrect version, we did not want to compare only to the fixed version, as the provided fix slows the algorithm down.

The results are depicted in Fig. 8. The left scatter plot compares the new algorithm to the incorrect original algorithm, and the right scatter plot compares it to the fixed original algorithm. As we can see,

**Fig. 9.** Comparison of the running times of the new and the (fixed) original DOD algorithm. We use the considered benchmarks (left) and random graphs with 500 nodes and the number of edges specified by the x-axis (right).

the new algorithm outperforms the original algorithm significantly. The incorrect original algorithm produced a wrong NTSCD relation on 98.6% of the considered benchmarks. The fixed version of the original algorithm returned precisely the same NTSCD relations as the new algorithm. We can also see that the scatter plot on the right contains more timeouts of the original algorithm, which supports the claim that the fix slows down the original algorithm.

**DOD Algorithms.** We compared the new DOD algorithm to the fixed version of the original DOD algorithm. As the fix does not change the asymptotic complexity of the original algorithm, we do not compare the new algorithm with the incorrect version. The results of the experiments are displayed in Fig. 9 (left). We can see that the new algorithm is again very fast. In fact, the results resemble those of the pure NTSCD algorithm, which is basically the part of the DOD algorithm that computes the V<sup>p</sup> sets. The DOD algorithm benefits from early checks that detect predicate nodes with no DOD dependencies.

As mentioned in the introduction, DOD is empty for structured programs as their CFGs are reducible. We do not know precisely how many of the 2440 considered functions have irreducible CFGs, but we know that 2373 of them use **goto** statements. The DOD relations of 12 functions were non-empty, which means that the CFGs of these functions are irreducible. Note that there may have been other irreducible CFGs with an empty DOD relation.

Additionally, we tested the DOD algorithms on randomly generated graphs, where we can expect that irreducible graphs emerge more often. Figure 9 (right) shows the results for graphs that have 500 nodes and 50, 100, 150, . . . randomly distributed edges (such that every node has at most two successors). Each presented running time is in fact an average of 10 measurements with different random graphs. We can see that the new algorithm is agnostic to the number of

**Fig. 10.** Comparison of the running times of the strong CC algorithm by Danicic et al. [13] and our algorithm for the NTSCD and DOD closure.

edges. Its running time in this experiment ranges from 4.12 · 10<sup>−3</sup> to 8.89 · 10<sup>−3</sup> seconds. The original DOD algorithm does not scale well with the increasing number of edges.

**Strong CC Algorithm.** We also compared the strong CC algorithm of Danicic et al. [13] against our NTSCD and DOD closure algorithm on sets of nodes containing a distinguished *start* node, where the two algorithms produce equivalent results. For these experiments, we need a starting set that is going to be closed. We decided to run these experiments on the considered functions that have at least two exit points. The starting set consists of the node representing the entry point and the node representing one of the exit points. The closure of this set contains all nodes that may influence reaching the other exit points. The results are shown on the scatter plot in Fig. 10. Our algorithm clearly outperforms the strong CC algorithm.

# **7 Conclusion**

We studied algorithms for the computation of strong control dependence, namely non-termination sensitive control dependence (NTSCD) and decisive order dependence (DOD) by Ranganath et al. [33] and strong control closures (strong CC) by Danicic et al. [13], on control flow graphs where each branching statement has two successors. We have demonstrated flaws in the original algorithms for the computation of NTSCD and DOD, and we have suggested corrections. Moreover, we have introduced new algorithms for NTSCD, DOD, and strong CC that are asymptotically faster. All the mentioned algorithms have been implemented, and our experiments confirm the dramatically better performance of the new algorithms.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **DIFFY: Inductive Reasoning of Array Programs Using Difference Invariants**

Supratik Chakraborty<sup>1</sup> , Ashutosh Gupta<sup>1</sup>, and Divyesh Unadkat1,2(B)

<sup>1</sup> Indian Institute of Technology Bombay, Mumbai, India *{*supratik,akg*}*@cse.iitb.ac.in <sup>2</sup> TCS Research, Pune, India divyesh.unadkat@tcs.com

**Abstract.** We present a novel verification technique to prove properties of a class of array programs with a symbolic parameter N denoting the size of arrays. The technique relies on constructing two slightly different versions of the same program. It infers difference relations between the corresponding variables at key control points of the joint control-flow graph of the two program versions. The desired post-condition is then proved by inducting on the program parameter N, wherein the difference invariants are crucially used in the inductive step. This contrasts with classical techniques that rely on finding potentially complex loop invariants for each loop in the program. Our synergistic combination of inductive reasoning and finding simple difference invariants helps prove properties of programs that cannot be proved even by the winner of the Arrays sub-category in SV-COMP 2021. We have implemented a prototype tool called Diffy to demonstrate these ideas. We present results comparing the performance of Diffy with that of state-of-the-art tools.

# **1 Introduction**

Software used in a wide range of applications uses arrays to store and update data, often using loops to read and write them. Verifying correctness properties of such array programs is important, yet challenging. A variety of techniques have been proposed in the literature to address this problem, including inference of quantified loop invariants [20]. However, it is often difficult to automatically infer such invariants, especially when programs have loops that are sequentially composed and/or nested within each other, and have complex control flows. This has spurred recent interest in mathematical induction-based techniques for verifying parametric properties of array manipulating programs [11,12,42,44]. While induction-based techniques are efficient and quite powerful, their Achilles heel is the automation of the inductive argument. Indeed, this often becomes the limiting step in applications of induction-based techniques. Automating the induction step and expanding the class of array manipulating programs to which induction-based techniques can be applied forms the primary motivation for our work. Rather than being a stand-alone technique, we envisage our work being used as part of a portfolio of techniques in a modern program verification tool.

We propose a novel and practically efficient induction-based technique that advances the state-of-the-art in automating the inductive step when reasoning about array manipulating programs. This allows us to automatically verify interesting properties of a large class of array manipulating programs that are beyond the reach of state-of-the-art induction-based techniques, viz. [12,42]. The work that comes closest to us is Vajra [12], which is part of the portfolio of techniques in VeriAbs [1] – the winner of SV-COMP 2021 in the Arrays Reach sub-category. Our work addresses several key limitations of the technique implemented in Vajra, thereby making it possible to analyze a much larger class of array manipulating programs than can be done by VeriAbs. Significantly, this includes programs with nested loops that have hitherto been beyond the reach of automated techniques that use mathematical induction [12,42,44].

A key innovation in our approach is the construction of two slightly different versions of a given program that have identical control flow structures but slightly different data operations. We automatically identify simple relations, called *difference invariants*, between corresponding variables in the two versions of a program at key control flow points. Interestingly, these relations often turn out to be significantly simpler than inductive invariants required to prove the property directly. This is not entirely surprising, since the difference invariants depend less on what individual statements in the programs are doing, and more on the difference between what they are doing in the two versions of the program. We show how the two versions of a given program can be automatically constructed, and how differences in individual statements can be analyzed to infer simple difference invariants. Finally, we show how these difference invariants can be used to simplify the reasoning in the inductive step of our technique.

We consider programs with (possibly nested) loops manipulating arrays, where the size of each array is a symbolic integer parameter N (> 0)<sup>1</sup>. We verify (a sub-class of) quantified and quantifier-free properties that may depend on the symbolic parameter N. Like in [12], we view the verification problem as one of proving the validity of a parameterized Hoare triple {ϕ(N)} P<sub>N</sub> {ψ(N)} for all values of N (> 0), where arrays are of size N in the program P<sub>N</sub>, and N is a free variable in ϕ(·) and ψ(·).

To illustrate the kind of programs that are amenable to our technique, consider the program shown in Fig. 1(a), adapted from an SV-COMP benchmark. This program has a couple of sequentially composed loops that update arrays and scalars. The scalars S and F are initialized to 0 and 1, respectively, before the first loop starts iterating. Subsequently, the first loop computes a recurrence in variable S and initializes elements of the array B to 1 if the corresponding elements of array A have non-negative values, and to 0 otherwise. The outermost branch condition in the body of the second loop evaluates to true only if the program parameter N and the variable S have the same value. The value of F is reset based on some conditions depending on corresponding entries of arrays A and B. The pre-condition of this program is true; the post-condition asserts that F is never reset in the second loop.

<sup>1</sup> For a more general class of programs supported by our technique, please see [13].

```
// assume(true)
 1. S = 0; F = 1;
 2. for(i = 0; i < N; i++) {
 3.   S = S + 1;
 4.   if (A[i] >= 0) B[i] = 1;
 5.   else B[i] = 0;
 6. }
 7. for(j = 0; j < N; j++) {
 8.   if (S == N) {
 9.     if (A[j] >= 0 && !B[j]) F = 0;
10.     if (A[j] < 0 && B[j]) F = 0;
11.   }
12. }
// assert(F == 1)
```
(a)

```
// assume(true)
1. S = 0;
2. for(i = 0; i < N; i++) A[i] = 0;
3. for(j = 0; j < N; j++) S = S + 1;
4. for(k = 0; k < N; k++) {
5.   for(l = 0; l < N; l++) A[l] = A[l] + 1;
6.   A[k] = A[k] + S;
7. }
// assert(forall x in [0,N), A[x] == 2*N)
```
(b)
**Fig. 1.** Motivating examples

State-of-the-art techniques find it difficult to prove the assertion in this program. Specifically, Vajra [12] is unable to prove the property, since it cannot reason about the branch condition (in the second loop) whose value depends on the program parameter N. VeriAbs [1], which employs a sequence of techniques such as loop shrinking, loop pruning, and inductive reasoning using [12] is also unable to verify the assertion shown in this program. Indeed, the loops in this program cannot be merged as the final value of S computed by the first loop is required in the second loop; hence loop shrinking does not help. Also, loop pruning does not work due to the complex dependencies in the program and the fact that the exact value of the recurrence variable S is required to verify the program. Subsequent abstractions and techniques applied by VeriAbs from its portfolio are also unable to verify the given post-condition. VIAP [42] translates the program to a quantified first-order logic formula in the theory of equality and uninterpreted functions [32]. It applies a sequence of tactics to simplify and prove the generated formula. These tactics include computing closed forms of recurrences, induction over array indices and the like to prove the property. However, its sequence of tactics is unable to verify this example within our time limit of 1 min.

Benchmarks with nested loops are a long standing challenge for most verifiers. Consider the program shown in Fig. 1(b) with a nested loop in addition to sequentially composed loops. The first loop initializes entries in array A to 0. The second loop aggregates a constant value in the scalar S. The third loop is a nested loop that updates array A based on the value of S. The entries of A are updated in the inner as well as outer loop. The property asserts that on termination, each array element equals twice the value of the parameter N.
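To see why the asserted post-condition of Fig. 1(b) holds, one can simulate the program concretely for small values of N (a sanity check for fixed N, not a proof for all N):

```python
def run_fig1b(N):
    """Direct simulation of the program in Fig. 1(b)."""
    S = 0
    A = [0] * N                 # first loop: A[i] = 0
    for j in range(N):          # second loop: S ends up equal to N
        S = S + 1
    for k in range(N):          # third (nested) loop
        for l in range(N):      # inner loop adds 1 to every cell
            A[l] = A[l] + 1
        A[k] = A[k] + S         # outer loop adds S = N to cell k once
    return S, A

# every cell receives N from the inner loops plus N from A[k] += S
for N in (1, 2, 5, 8):
    S, A = run_fig1b(N)
    assert all(A[x] == 2 * N for x in range(N))
```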

While the inductive reasoning of Vajra and the tactics in VIAP do not support nested loops, the sequence of techniques used by VeriAbs is also unable to prove the given post-condition in this program. In sharp contrast, our prototype tool Diffy is able to verify the assertions in both these programs automatically within a few seconds. This illustrates the power of the inductive technique proposed in this paper.

The technical contributions of the paper can be summarized as follows:


# **2 Overview and Relation to Earlier Work**

In this section, we provide an overview of the main ideas underlying our technique. We also highlight how our technique differs from [12], which comes closest to our work. To keep the exposition simple, we consider the program P<sub>N</sub>, shown in the first column of Fig. 2, where N is a symbolic parameter denoting the sizes of arrays a and b. We assume that we are given a parameterized pre-condition ϕ(N), and our goal is to establish the parameterized post-condition ψ(N), for all N > 0. In [12,44], techniques based on mathematical induction (on N) were proposed to solve this class of problems. As with any induction-based technique, these approaches consist of three steps. First, they check if the *base case* holds, i.e. if the Hoare triple {ϕ(N)} P<sub>N</sub> {ψ(N)} holds for small values of N, say 1 ≤ N ≤ M, for some M > 0. Next, they assume that the *inductive hypothesis* {ϕ(N − 1)} P<sub>N−1</sub> {ψ(N − 1)} holds for some N ≥ M + 1. Finally, in the *inductive step*, they show that if the inductive hypothesis holds, so does {ϕ(N)} P<sub>N</sub> {ψ(N)}. It is not hard to see that the inductive step is the most crucial step in this style of reasoning. It is also often the limiting step, since not all programs and properties allow for efficient inferencing of {ϕ(N)} P<sub>N</sub> {ψ(N)} from {ϕ(N − 1)} P<sub>N−1</sub> {ψ(N − 1)}.

Like in [12,44], our technique uses induction on N to prove the Hoare triple {ϕ(N)} P<sub>N</sub> {ψ(N)} for all N > 0. Hence, our base case and inductive hypothesis are the same as those in [12,44]. However, our reasoning in the crucial inductive step is significantly different from that in [12,44], and this is where our primary contribution lies. As we show later, not only does this allow a much larger class of programs to be efficiently verified compared to [12,44], it also permits reasoning about classes of programs with nested loops that are beyond the reach of [12,44]. Since the work of [12] significantly generalizes that of [44], henceforth we only refer to [12] when talking of earlier work that uses induction on N.

In order to better understand our contribution and its difference vis-a-vis the work of [12], a quick recap of the inductive step used in [12] is essential. The

**Fig. 2.** Pictorial depiction of our program transformations

inductive step in [12] crucially relies on finding a "difference program" ∂P<sub>N</sub> and a "difference pre-condition" ∂ϕ(N) such that: (i) P<sub>N</sub> is semantically equivalent to P<sub>N−1</sub>; ∂P<sub>N</sub>, where ';' denotes sequential composition of programs<sup>2</sup>, (ii) ϕ(N) ⇒ ϕ(N − 1) ∧ ∂ϕ(N), and (iii) no variable/array element in ∂ϕ(N) is modified by P<sub>N−1</sub>. As shown in [12], once ∂P<sub>N</sub> and ∂ϕ(N) satisfying these conditions are obtained, the problem of proving {ϕ(N)} P<sub>N</sub> {ψ(N)} can be reduced to that of proving {ψ(N − 1) ∧ ∂ϕ(N)} ∂P<sub>N</sub> {ψ(N)}. This approach can be very effective if (i) ∂P<sub>N</sub> is "simpler" (e.g. has fewer loops or strictly less deeply nested loops) than P<sub>N</sub> and can be computed efficiently, and (ii) a formula ∂ϕ(N) satisfying the conditions mentioned above exists and can be computed efficiently.

The requirement that P<sub>N</sub> be semantically equivalent to P<sub>N−1</sub>; ∂P<sub>N</sub> is a very stringent one, and finding such a program ∂P<sub>N</sub> is non-trivial in general. In fact, the authors of [12] simply provide a set of syntax-guided conditionally sound heuristics for computing ∂P<sub>N</sub>. Unfortunately, when these conditions are violated (we have found many simple programs where they are violated), there are no known algorithmic techniques to generate ∂P<sub>N</sub> in a sound manner. Even if a program ∂P<sub>N</sub> were to be found in an ad-hoc manner, it may be as "complex" as P<sub>N</sub> itself. This makes the approach of [12] ineffective for analyzing such programs. As an example, the fourth column of Fig. 2 shows P<sub>N−1</sub> followed by one possible ∂P<sub>N</sub> that ensures P<sub>N</sub> (shown in the first column of the same figure) is semantically equivalent to P<sub>N−1</sub>; ∂P<sub>N</sub>. Notice that ∂P<sub>N</sub> in this example has two sequentially composed loops, just like P<sub>N</sub>. In addition, the assignment statement in the body of the second loop uses a more complex expression than that present in the corresponding loop of P<sub>N</sub>. Proving {ψ(N − 1) ∧ ∂ϕ(N)} ∂P<sub>N</sub> {ψ(N)}

² Although the authors of [12] mention that it suffices to find a ∂P_N that satisfies {ϕ(N)} P_{N−1}; ∂P_N {ψ(N)}, they do not discuss any technique that takes ϕ(N) or ψ(N) into account when generating ∂P_N.

may therefore be no simpler (and perhaps even more difficult) than proving {ϕ(N)} P_N {ψ(N)}.

In addition to the difficulty of computing ∂P_N, it may be impossible to find a formula ∂ϕ(N) such that ϕ(N) ⇒ ϕ(N−1) ∧ ∂ϕ(N), as required by [12]. This can happen even for fairly routine pre-conditions, such as ϕ(N) ≡ ⋀_{i=0}^{N−1} (A[i] = N). Notice that there is no ∂ϕ(N) that satisfies ϕ(N) ⇒ ϕ(N−1) ∧ ∂ϕ(N) in this case. In such cases, the technique of [12] cannot be used at all, even if P_N, ϕ(N) and ψ(N) are such that there exists a trivial proof of {ϕ(N)} P_N {ψ(N)}.

The inductive step proposed in this paper largely mitigates the above problems, thereby making it possible to efficiently reason about a much larger class of programs than is possible using the technique of [12]. Our inductive step proceeds as follows. Given P_N, we first algorithmically construct two programs Q_{N−1} and peel(P_N), such that P_N is semantically equivalent to Q_{N−1}; peel(P_N). Intuitively, Q_{N−1} is the same as P_N, but with all loop bounds that depend on N modified to depend on N−1 instead. Note that this is different from P_{N−1}, which is obtained by replacing *all uses* (not just in loop bounds) of N in P_N by N−1. As we will see, this simple difference makes the generation of peel(P_N) significantly simpler than the generation of ∂P_N in [12]. While generating Q_{N−1} and peel(P_N) may sound similar to generating P_{N−1} and ∂P_N [12], there are fundamental differences between the two approaches. First, as noted above, P_{N−1} is semantically different from Q_{N−1}. Similarly, peel(P_N) is semantically different from ∂P_N. Second, we provide an algorithm for generating Q_{N−1} and peel(P_N) that works for a significantly larger class of programs than the technique of [12]. Specifically, our algorithm works for all programs amenable to the technique of [12], and also for programs that violate the restrictions imposed by the grammar and conditional heuristics in [12].
For example, we can algorithmically generate Q_{N−1} and peel(P_N) even for a class of programs with arbitrarily nested loops – a program feature explicitly disallowed by the grammar in [12]. Third, we guarantee that peel(P_N) is "simpler" than P_N in the sense that the maximum nesting depth of loops in peel(P_N) is *strictly less* than that in P_N. Thus, if P_N has no nested loops (all programs amenable to analysis by [12] belong to this class), peel(P_N) is guaranteed to be loop-free. As demonstrated by the fourth column of Fig. 2, no such guarantees can be given for the ∂P_N generated by the technique of [12]. This is a significant difference, since it greatly simplifies the analysis of peel(P_N) vis-à-vis that of ∂P_N.

We mentioned earlier that some pre-conditions ϕ(N) do not admit any ∂ϕ(N) such that ϕ(N) ⇒ ϕ(N−1) ∧ ∂ϕ(N). In such cases, however, it is often easy to compute formulas ϕ′(N−1) and Δϕ′(N) such that ϕ(N) ⇒ ϕ′(N−1) ∧ Δϕ′(N), and the variables/array elements in Δϕ′(N) are not modified by either P_{N−1} or Q_{N−1}. For example, if we were to consider a (new) pre-condition ϕ(N) ≡ ⋀_{i=0}^{N−1} (A[i] = N) for the program P_N shown in the first column of Fig. 2, then we have ϕ′(N−1) ≡ ⋀_{i=0}^{N−2} (A[i] = N) and Δϕ′(N) ≡ (A[N−1] = N). We assume the availability of such ϕ′(N−1) and Δϕ′(N) for the given ϕ(N). This significantly relaxes the requirement on pre-conditions and allows a much larger class of Hoare triples to be proved using our technique vis-à-vis that of [12].

The third column of Fig. 2 shows the Q_{N−1} and peel(P_N) generated by our algorithm for the program P_N in the first column of the figure. It is illustrative to compare these with P_{N−1} and ∂P_N shown in the fourth column of Fig. 2. Notice that Q_{N−1} has the same control flow structure as P_{N−1}, but is not semantically equivalent to P_{N−1}. In fact, Q_{N−1} and P_{N−1} may be viewed as closely related versions of the same program. Let V_Q and V_P denote the sets of variables of Q_{N−1} and P_{N−1} respectively. We assume V_Q is disjoint from V_P, and analyze the joint execution of Q_{N−1} starting from a state satisfying the pre-condition ϕ′(N−1), and P_{N−1} starting from a state satisfying ϕ(N−1). The purpose of this analysis is to compute a difference predicate D(V_Q, V_P, N−1) that relates corresponding variables in Q_{N−1} and P_{N−1} at the end of their joint execution. This problem is reminiscent of (yet different from) translation validation [4,17,24,40,46,48,49], and indeed, our calculation of D(V_Q, V_P, N−1) is motivated by techniques from the translation validation literature. An important finding of our study is that corresponding variables in Q_{N−1} and P_{N−1} are often related by simple expressions in N, regardless of the complexity of P_N, ϕ(N) or ψ(N). Indeed, in all our experiments, we did not need to go beyond quadratic expressions in N to compute D(V_Q, V_P, N−1).
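The joint execution can be sketched concretely. The two loops below are assumptions for illustration (not the program of Fig. 2): P_{N−1} has all uses of N replaced by N−1, while Q_{N−1} only truncates the loop bound and keeps N in the body; their outputs are then related by an expression in N.

```python
# Joint execution of Q_{N-1} and P_{N-1} to discover a difference predicate
# D(V_Q, V_P, N-1), on assumed single-loop programs whose body adds N*N.

def P_body(n):                  # P_n: loop bound n, body uses n
    x = 0
    for i in range(n):
        x = x + n * n
    return x

def Q_body(m, n):               # Q_{n-1}: loop bound m = n-1, body still uses n
    x = 0
    for i in range(m):
        x = x + n * n
    return x

# Relate x_Q and x_P at loop exit for several N.
diffs = []
for N in range(2, 8):
    x_P = P_body(N - 1)         # variables V_P of P_{N-1}
    x_Q = Q_body(N - 1, N)      # variables V_Q of Q_{N-1}
    diffs.append((N, x_Q - x_P))

# Candidate difference predicate: x_Q - x_P == (N-1)*(N*N - (N-1)*(N-1)),
# a simple polynomial expression in N (checked pointwise here, not proved).
for N, d in diffs:
    assert d == (N - 1) * (N * N - (N - 1) * (N - 1))
```

In a real implementation, the candidate relation would be guessed from a template (e.g. polynomials in N up to degree two) and then verified, rather than merely sampled.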

Once the steps described above are completed, we have Δϕ′(N), peel(P_N) and D(V_Q, V_P, N−1). It can now be shown that if the inductive hypothesis {ϕ(N−1)} P_{N−1} {ψ(N−1)} holds, then proving {ϕ(N)} P_N {ψ(N)} reduces to proving {Δϕ′(N) ∧ ψ′(N−1)} peel(P_N) {ψ(N)}, where ψ′(N−1) ≡ ∃V_P (ψ(N−1) ∧ D(V_Q, V_P, N−1)). A few points are worth emphasizing here. First, if D(V_Q, V_P, N−1) is obtained as a set of equalities, the existential quantifier in the formula ψ′(N−1) can often be eliminated simply by substitution. We can also use the quantifier elimination capabilities of modern SMT solvers, viz. Z3 [39], to eliminate the quantifier if needed. Second, recall that unlike the ∂P_N generated by the technique of [12], peel(P_N) is guaranteed to be "simpler" than P_N, and is indeed loop-free if P_N has no nested loops. Therefore, proving {Δϕ′(N) ∧ ψ′(N−1)} peel(P_N) {ψ(N)} is typically significantly simpler than proving {ψ(N−1) ∧ ∂ϕ(N)} ∂P_N {ψ(N)}. Finally, it may happen that the pre-condition in {Δϕ′(N) ∧ ψ′(N−1)} peel(P_N) {ψ(N)} is not strong enough to yield a proof of the Hoare triple. In such cases, we need to strengthen the existing pre-condition by a formula, say ξ′(N−1), such that the strengthened pre-condition implies the weakest pre-condition of ψ(N) under peel(P_N). Having a simple structure for peel(P_N) (e.g., loop-free for the entire class of programs for which [12] works) makes it significantly easier to compute the weakest pre-condition. Note that ξ′(N−1) is defined over the variables in V_Q.
In order to ensure that the inductive proof goes through, we need to strengthen the post-condition of the original program by a formula ξ(N) such that ξ(N−1) ∧ D(V_Q, V_P, N−1) ⇒ ξ′(N−1). Computing ξ(N−1) requires a special form of logical abduction that ensures that ξ(N−1) refers only to variables in V_P. However, if D(V_Q, V_P, N−1) is given as a set of equalities (as is often the case), ξ(N−1) can be computed from ξ′(N−1) simply by substitution. This process of strengthening the pre-condition and post-condition may need to iterate a few times until a fixed point is reached, similar to what happens in the inductive step of [12]. Note that the fixed point iterations may not always converge (verification is undecidable in general); in our experiments, however, convergence always happened within a few iterations. If ξ′(N−1) denotes the formula obtained on reaching the fixed point, the final Hoare triple to be proved is {ξ′(N−1) ∧ Δϕ′(N) ∧ ψ′(N−1)} peel(P_N) {ξ(N) ∧ ψ(N)}, where ψ′(N−1) ≡ ∃V_P (ψ(N−1) ∧ D(V_Q, V_P, N−1)). Having a simple (often loop-free) peel(P_N) significantly simplifies the above process.

We conclude this section with an overview of how Q_{N−1} and peel(P_N) are computed for the program P_N shown in the first column of Fig. 2. The second column of this figure shows the program obtained from P_N by peeling the last iteration of each loop of the program. Clearly, the programs in the first and second columns are semantically equivalent. Since there are no nested loops in P_N, the peels (shown in solid boxes) in the second column are loop-free program fragments. For each such peel, we identify variables/array elements that are modified in the peel and used in subsequent non-peeled parts of the program. For example, the variable x is modified in the peel of the first loop and used in the body of the second loop, as shown by the arrow in the second column of Fig. 2. We replace all such uses (if needed, transitively) by the expressions on the right-hand side of the corresponding assignments in the peel, until no variable/array element modified in the peel is used in any subsequent non-peeled part of the program. Thus, the use of x in the body of the second loop is replaced by the expression x + N\*N in the third column of Fig. 2. The peeled iteration of the first loop can now be moved to the end of the program, since the variables modified in this peel are no longer used in any subsequent non-peeled part of the program. Repeating the above steps for the peeled iteration of the second loop, we get the program shown in the third column of Fig. 2. This effectively gives a transformed program that can be divided into two parts: (i) a program Q_{N−1} that differs from P_N only in that all loops are truncated to iterate N−1 (instead of N) times, and (ii) a program peel(P_N) that is obtained by concatenating the peels of loops in P_N in the same order in which the loops appeared in P_N.
It is not hard to see that P_N, shown in the first column of Fig. 2, is semantically equivalent to Q_{N−1}; peel(P_N). Notice that the construction of Q_{N−1} and peel(P_N) was fairly straightforward and did not require any complex reasoning. In sharp contrast, the construction of ∂P_N, shown in the bottom half of the fourth column of Fig. 2, requires non-trivial reasoning and produces a program with two sequentially composed loops.
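The decomposition and substitution step can be replayed executably. The two-loop program below is an assumed example in the spirit of Fig. 2 (the exact program of the figure is not reproduced here); the substituted expression x + N\*N plays the role of the arrow in the second column.

```python
# Executable sketch of the Q_{N-1}; peel(P_N) decomposition for an assumed
# two-loop program:
#   loop 1: for (i=0; i<N; i++) x = x + N*N;
#   loop 2: for (j=0; j<N; j++) A[j] = x + j;

def P(N):
    x, A = 0, [0] * N
    for i in range(N):
        x = x + N * N
    for j in range(N):
        A[j] = x + j
    return x, A

def Q_then_peel(N):
    x, A = 0, [0] * N
    for i in range(N - 1):        # Q_{N-1}: truncated bound, body still uses N
        x = x + N * N
    for j in range(N - 1):
        A[j] = (x + N * N) + j    # use of x replaced by "x + N*N" (substitution)
    x = x + N * N                 # peel of loop 1, moved to the end
    A[N - 1] = x + (N - 1)        # peel of loop 2
    return x, A

# Bounded check of semantic equivalence P_N == Q_{N-1}; peel(P_N)
for N in range(1, 7):
    assert P(N) == Q_then_peel(N)
```

The substitution is what allows the peel of the first loop to migrate past the truncated second loop without changing the computed values.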

### **3 Preliminaries and Notation**

We consider programs generated by the grammar shown below:

```
PB    ::= St
St    ::= St ; St | v := E | A[E] := E
        | if (BoolE) then St else St
        | for (ℓ := 0; ℓ < UB; ℓ := ℓ+1) {St}
E     ::= E op E | A[E] | v | ℓ | c | N
op    ::= + | - | * | /
UB    ::= UB op UB | ℓ | c | N
BoolE ::= E relop E | BoolE AND BoolE | NOT BoolE | BoolE OR BoolE
```

Formally, we consider a program P_N to be a tuple (V, L, A, PB, N), where V is a set of scalar variables, L ⊆ V is a set of scalar loop counter variables, A is a set of array variables, PB is the program body, and N is a special symbol denoting a positive integer parameter of the program. In the grammar shown above, we assume that A ∈ A, v ∈ V \ L, ℓ ∈ L and c ∈ Z. We also assume that each loop L has a unique loop counter variable ℓ that is initialized at the beginning of L and is incremented by 1 at the end of each iteration, and that the assignments in the body of L do not update ℓ. For each loop L with termination condition ℓ < UB, we require that UB is an expression in terms of N, variables in L representing loop counters of loops that nest L, and constants, as shown in the grammar. Our grammar admits a large class of programs (with nested loops) that can be analyzed using our technique and are beyond the reach of state-of-the-art tools like [1,12,42].

We verify Hoare triples of the form {ϕ(N)} P_N {ψ(N)}, where the formulas ϕ(N) and ψ(N) are either universally quantified formulas of the form ∀I (α(I, N) ⇒ β(A, V, I, N)) or quantifier-free formulas of the form η(A, V, N). In these formulas, I is a sequence of array index variables, α is a quantifier-free formula in the theory of arithmetic over integers, and β and η are quantifier-free formulas in the combined theory of arrays and arithmetic over integers.

For technical reasons, we rename all scalar and array variables in the program in a pre-processing step, as follows. We rename each scalar variable using the well-known Static Single Assignment (SSA) [43] technique, such that each variable is written at (at most) one location in the program. We also rename arrays in the program such that each loop updates its own version of an array, and multiple writes to an array element within the same loop are performed on different versions of that array. For this purpose, we use techniques for array SSA [30] renaming studied earlier in the context of compilers. In the subsequent exposition, we assume that scalar and array variables in the program are already SSA renamed, and that all array and scalar variables referred to in the pre- and post-conditions are also expressed in terms of SSA renamed arrays and scalars.
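For the scalar case, the renaming can be sketched in a few lines. The helper below is an illustrative assumption (it handles only straight-line assignments, not the array SSA of [30]): each assignment creates a fresh version of its target, and uses refer to the latest version.

```python
# A minimal scalar-SSA renaming sketch: statements are (target, expression)
# pairs; every assignment writes a fresh version of its target.
import re

def ssa_rename(stmts):
    version = {}            # variable -> latest version number
    out = []
    for lhs, rhs in stmts:
        # rewrite uses in rhs to their current versions
        def sub(m):
            v = m.group(0)
            return f"{v}{version[v]}" if v in version else v
        rhs_new = re.sub(r"[a-z]+", sub, rhs)
        version[lhs] = version.get(lhs, 0) + 1
        out.append((f"{lhs}{version[lhs]}", rhs_new))
    return out

prog = [("x", "a + 1"), ("x", "x * 2"), ("y", "x + a")]
print(ssa_rename(prog))
# each variable is now written at most once:
# [('x1', 'a + 1'), ('x2', 'x1 * 2'), ('y1', 'x2 + a')]
```

After renaming, the substitution of peel right-hand sides into subsequent code (Sect. 4.1) never clashes with a later redefinition.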

### **4 Verification Using Difference Invariants**

The key steps in the application of our technique, as discussed in Sect. 2, are: (i) generating Q_{N−1} and peel(P_N) from P_N, (ii) generating ϕ′(N−1) and Δϕ′(N) from the given pre-condition ϕ(N), and (iii) inferring inductive difference invariants relating Q_{N−1} and P_{N−1}.

We now discuss techniques for solving each of these sub-problems.

#### **4.1 Generating Q_{N−1} and peel(P_N)**

The procedure illustrated in Fig. 2 (going from the first column to the third column) is fairly straightforward if none of the loops have nested loops within them. It is easy to extend this to arbitrary sequential compositions of non-nested loops. Having all variables and arrays in SSA-renamed form makes it particularly easy to carry out the substitution exemplified by the arrow shown in the second column of Fig. 2. Hence, we do not discuss further the generation of Q_{N−1} and peel(P_N) when all loops are non-nested.

The case of nested loops is, however, challenging and requires additional discussion. Before we present an algorithm for handling this case, we discuss the intuition using an abstract example. Consider a pair of nested loops, L_1 and L_2, as shown in Fig. 3. Suppose that B1 and B3 are loop-free code fragments in the body of L_1 that precede and succeed the nested loop L_2, and suppose further that the loop body, B2, of L_2 is loop-free.

**Fig. 3.** A generic nested loop

To focus on the key aspects of computing peels of nested loops, we make two simplifying assumptions: (i) no scalar variable or array element modified in B2 is used subsequently (including transitively) in either B3 or B1, and (ii) every scalar variable or array element that is modified in B1 and used subsequently in B2 is not modified again in B1, B2 or B3. Note that these assumptions are made primarily to simplify the exposition; for a detailed discussion of how our technique can be used with some relaxations of these assumptions, the reader is referred to [13]. The peel of the abstract loops L_1 and L_2 is as shown in Fig. 4. The first loop in the peel includes the last iteration of L_2, in each of the N−1 iterations of L_1, that was missed in Q_{N−1}. The subsequent code includes the last iteration of L_1 that was missed in Q_{N−1}.

Formally, we use the notation L1(N) to denote a loop L_1 that has no nested loops within it, and whose loop counter, say ℓ_1, increases from 0 to an upper bound given by an expression in N. Similarly, we use L1(N, L2(N)) to denote a loop L_1 that has another loop L_2 nested within it. The loop counter ℓ_1 of L_1 increases from 0 to an upper bound expression in N, while the loop counter ℓ_2 of L_2 increases from 0 to an upper bound expression in ℓ_1 and N. Using this notation, L1(N, L2(N, L3(N))) represents three nested loops, and so on. Notice that the upper bound expression for a nested loop can depend not only on N but also on the loop counters of the loops nesting it.

**Fig. 4.** Peel of the nested loop

For notational clarity, we also use LPeel(Li, a, b) to denote the peel of loop L_i consisting of all iterations of L_i in which the value of ℓ_i ranges from a to b−1, both inclusive. Note that if b−a is a constant, this corresponds to the concatenation of (b−a) peels of L_i.
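The LPeel notation has a direct operational reading, sketched below with an assumed helper that runs a loop body over the peeled counter range.

```python
# LPeel(L_i, a, b) as an executable sketch: run the iterations of a loop
# body in which the loop counter ranges over a, a+1, ..., b-1.

def lpeel(body, a, b):
    for li in range(a, b):
        body(li)

# A single peeled iteration when b - a == 1, e.g. LPeel(L, N-1, N) for N = 5:
visited = []
lpeel(visited.append, 4, 5)
assert visited == [4]

# When b - a is a constant c, LPeel is just c concatenated peels:
visited = []
lpeel(visited.append, 3, 6)
assert visited == [3, 4, 5]
```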

We will now see how to implement the transformation from the first column to the second column of Fig. 2 for a nested loop L1(N, L2(N)). The first step is to truncate all loops to use N−1 instead of N in the upper bound expressions. Using the notation introduced above, this gives the loop L1(N-1, L2(N-1)). Note that all uses of N other than in loop upper bound expressions stay unchanged as we go from L1(N, L2(N)) to L1(N-1, L2(N-1)). We now ask: *which loop iterations of* L1(N, L2(N)) *have been missed (or skipped) in going to* L1(N-1, L2(N-1))? Let the upper bound expression of L_1 in L1(N, L2(N)) be U_{L1}(N), and that of L_2 be U_{L2}(ℓ_1, N). It is not hard to see that in every iteration ℓ_1 of L_1, where 0 ≤ ℓ_1 < U_{L1}(N−1), the iterations corresponding to ℓ_2 ∈ {U_{L2}(ℓ_1, N−1), ..., U_{L2}(ℓ_1, N)−1} have been missed. In addition, all iterations of L_1 corresponding to ℓ_1 ∈ {U_{L1}(N−1), ..., U_{L1}(N)−1} have also been missed. This implies that the "peel" of L1(N, L2(N)) must include all the above missed iterations. This peel is therefore the program fragment shown in Fig. 5.

Notice that if U_{L2}(ℓ_1, N) − U_{L2}(ℓ_1, N-1) is a constant (as is the case if U_{L2}(ℓ_1, N) is any linear function of ℓ_1 and N), then the peel does not have any loop at nesting depth 2. Hence, the maximum nesting depth of loops in the peel is strictly less than that in L1(N, L2(N)), yielding a peel that is "simpler" than the original program. This argument can easily be generalized to loops with arbitrarily large nesting depths. The peel of L1(N, L2(N, L3(N))) is shown in Fig. 6.

```
for (ℓ1=0; ℓ1 < U_{L1}(N-1); ℓ1++) {
  for (ℓ2=0; ℓ2 < U_{L2}(ℓ1,N-1); ℓ2++)
    LPeel(L3, U_{L3}(ℓ1,ℓ2,N-1), U_{L3}(ℓ1,ℓ2,N))
  LPeel(L2, U_{L2}(ℓ1,N-1), U_{L2}(ℓ1,N))
}
LPeel(L1, U_{L1}(N-1), U_{L1}(N))
```

**Fig. 6.** Peel of L1(N, L2(N, L3(N)))
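The missed-iteration analysis can be checked on a concrete instance. The upper bounds below are assumptions chosen to exercise a counter-dependent inner bound (U_{L1}(N) = N, U_{L2}(ℓ_1, N) = ℓ_1 + N); the body merely accumulates, so reordering the peeled iterations is sound here.

```python
# Bounded check of the peel construction of Fig. 5 for L1(N, L2(N)) with
# assumed linear upper bounds, so the peel contains no depth-2 loop.

def run_P(N):
    s = 0
    for l1 in range(N):               # U_L1(N) = N
        for l2 in range(l1 + N):      # U_L2(l1, N) = l1 + N
            s += l1 * l2 + 1
    return s

def run_Q_then_peel(N):
    s = 0
    for l1 in range(N - 1):           # Q_{N-1}: truncated bounds
        for l2 in range(l1 + N - 1):
            s += l1 * l2 + 1
    # peel, as in Fig. 5:
    for l1 in range(N - 1):           # LPeel(L2, U_L2(l1,N-1), U_L2(l1,N))
        for l2 in range(l1 + N - 1, l1 + N):
            s += l1 * l2 + 1
    for l1 in range(N - 1, N):        # LPeel(L1, U_L1(N-1), U_L1(N))
        for l2 in range(l1 + N):
            s += l1 * l2 + 1
    return s

for N in range(1, 8):
    assert run_P(N) == run_Q_then_peel(N)
```

Since U_{L2}(ℓ_1, N) − U_{L2}(ℓ_1, N−1) = 1 here, the first part of the peel runs exactly one inner iteration per outer iteration, as predicted.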

As an illustrative example, let us consider the program in Fig. 7(a), and suppose we wish to compute the peel of this program containing nested loops. In this case, the upper bounds of the loops are U_{L1}(N) = U_{L2}(N) = N. The peel is shown

```
(a)  for (i=0; i<N; i++)
       for (j=0; j<N; j++)
         A[i][j] = N;

(b)  for (i=0; i<N-1; i++)
       A[i][N-1] = N;
     for (j=0; j<N; j++)
       A[N-1][j] = N;
```

**Fig. 7.** (a) A program with nested loops, and (b) its peel

in Fig. 7(b) and consists of two sequentially composed non-nested loops. The first loop takes into account the missed iterations of the inner loop (a single iteration in this example) that are executed in P_N but are missed in Q_{N−1}. The

```
for (ℓ1=0; ℓ1 < U_{L1}(N-1); ℓ1++)
  LPeel(L2, U_{L2}(ℓ1,N-1), U_{L2}(ℓ1,N))
LPeel(L1, U_{L1}(N-1), U_{L1}(N))
```

**Fig. 5.** Peel of L1(N, L2(N))

**Algorithm 1.** GenQandPeel(P_N : program)

```
 1: Let the sequentially composed loops in P_N be in the order L1, L2, ..., Lm;
 2: for each loop Li ∈ TopLevelLoops(P_N) do
 3:     QLi, RLi ← GenQandPeelForLoop(Li);
 4:     while ∃v. use(v) ∈ QLi ∧ def(v) ∈ RLj, for some 1 ≤ j < i ≤ N do  ▷ v is a var/array element
 5:         Substitute the rhs expression for v from RLj in QLi;          ▷ If RLj is a loop, abort
 6: Q_{N-1} ← QL1; QL2; ...; QLm;
 7: peel(P_N) ← RL1; RL2; ...; RLm;
 8: return Q_{N-1}, peel(P_N);

 9: procedure GenQandPeelForLoop(L : loop)
10:     Let UL(N) be the UB expression of loop L;
11:     QL ← L with N−1 substituted for N in all UB expressions (including those of nested loops);
12:     if L has subloops then
13:         t ← nesting depth of the inner-most nested loop in L;
14:         R_{t+1} ← empty program with no statements;
15:         for k = t; k ≥ 2; k-- do
16:             for each subloop SLj in L at nesting depth k do     ▷ Ordered SL1, SL2, ..., SLj
17:                 R_{SLj} ← LPeel(SLj, U_{SLj}(ℓ1,...,ℓ_{k-1}, N−1), U_{SLj}(ℓ1,...,ℓ_{k-1}, N));
18:             R_k ← for (i=0; i < U_{L_{k-1}}(N−1); i++) { R_{k+1}; R_{SL1}; R_{SL2}; ...; R_{SLj} };
19:         RL ← R_2; LPeel(L, UL(N−1), UL(N));
20:     else
21:         RL ← LPeel(L, UL(N−1), UL(N));
22:     return QL, RL;
```

second loop takes into account the missed iterations of the outer loop in Q_{N−1} compared to P_N.
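The example of Fig. 7 can be replayed directly: the program fills A[i][j] = N for 0 ≤ i, j < N, Q_{N−1} does the same for indices below N−1 (still writing the value N), and the peel consists of the two non-nested loops of Fig. 7(b).

```python
# Executable check for the example of Fig. 7.

def P(N, A):
    for i in range(N):
        for j in range(N):
            A[i][j] = N

def Q_then_peel(N, A):
    for i in range(N - 1):            # Q_{N-1}
        for j in range(N - 1):
            A[i][j] = N
    for i in range(N - 1):            # peel: missed inner-loop iterations
        A[i][N - 1] = N
    for j in range(N):                # peel: missed outer-loop iteration
        A[N - 1][j] = N

for N in range(1, 6):
    A1 = [[0] * N for _ in range(N)]
    A2 = [[0] * N for _ in range(N)]
    P(N, A1)
    Q_then_peel(N, A2)
    assert A1 == A2
```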

Generalizing the above intuition, Algorithm 1 presents the function GenQandPeel for computing Q_{N−1} and peel(P_N) for a given P_N that has sequentially composed loops with potentially nested loops. By the grammar of our programs, all loops are well nested. The method works by traversing the structure of loops in the program. In this algorithm, QLi and RLi represent the counterparts of Q_{N−1} and peel(P_N) for loop L_i. We create the program Q_{N−1} by peeling each loop in the program and then propagating these peels across subsequent loops. We identify the missed iterations of each loop in the program P_N from the upper bound expression UB. Recall that the upper bound of each loop L_k at nesting depth k, denoted U_{Lk}, is in terms of the loop counters of outer loops and the program parameter N. We need to peel U_{Lk}(ℓ_1, ℓ_2, ..., ℓ_{k−1}, N) − U_{Lk}(ℓ_1, ℓ_2, ..., ℓ_{k−1}, N−1) iterations from each loop, where ℓ_1, ℓ_2, ..., ℓ_{k−1} are the counters of the outer nesting loops. As discussed above, whenever this difference is a constant, we are guaranteed that the loop nesting depth reduces by one. It may happen that there are multiple sequentially composed loops SLj at nesting depth k, and not just a single loop L_k. At line 2, we iterate over the top-level loops and call the function GenQandPeelForLoop(Li) for each sequentially composed loop L_i in P_N. At line 11, we construct QL for loop L.
If the loop L has no nested loops, then the peel consists of its last iterations, computed using the upper bound at line 21. For nested loops, the loop at line 15 builds the peel for all loops inside L following the above intuition. The peels of all sub-loops are collected and inserted in the peel of L at line 19. Since all the peeled iterations are moved after QL of each loop, we need to repair expressions appearing in QL. The repairs are applied by the loop at line 4. In the repair step, we identify the right-hand-side expressions for all the variables and array elements assigned in the peeled iterations. Subsequently, the uses of the variables and arrays in QLi that are assigned in RLj are replaced with the assigned expressions whenever j < i. If RLj is a loop, this step is more involved and hence currently not considered. Finally, at line 8, the peels and Qs of all top-level loops are stitched together and returned.

Note that lines 4 and 5 of Algorithm 1 implement the substitution represented by the arrow in the second column of Fig. 2. This is necessary in order to move the peel of a loop to the end of the program. If either of the loops L_i or L_j uses array elements as indices into other arrays, then it can be difficult to identify what expression to use in QLi for the substitution. However, such scenarios are observed less often, and hence they hardly impact the effectiveness of the technique on programs seen in practice. The peel RLj, from which the expression to be substituted in QLi has to be taken, may itself have a loop. In such cases, it can be significantly more challenging to identify what expression to use in QLi. We use several optimizations to transform the peeled loop before trying to identify such an expression. If the values modified in the peel can be summarized as closed-form expressions, then we can replace the loop in the peel with its summary. For example, consider the peeled loop for (ℓ1=0; ℓ1 < N; ℓ1++) { S = S + 1; }. This loop is summarized as S = S + N before it is moved across subsequent code. If the variables modified in the peel of a nested loop are not used later, then the peel can be trivially moved. In many cases, the loop in the peel can also be substituted with a conservative over-approximation. We have implemented some of these optimizations in our tool and are able to verify several benchmarks with sequentially composed nested loops. It may not always be possible to move the peel of a nested loop across subsequent loops, but we have observed that these optimizations suffice for many programs seen in practice.
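The loop-summarization optimization mentioned above can be sketched as follows, using the same example loop.

```python
# Sketch of loop summarization: the peeled loop
#   for (l1=0; l1<N; l1++) { S = S + 1; }
# has the closed-form effect S = S + N on S, so the loop can be replaced
# by its summary before being moved across subsequent code.

def peel_loop(S, N):
    for l1 in range(N):
        S = S + 1
    return S

def summary(S, N):
    return S + N                 # closed-form summary of the loop

# The summary agrees with the loop on all sampled states.
for N in range(0, 10):
    for S in (0, 7, -3):
        assert peel_loop(S, N) == summary(S, N)
```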

**Theorem 1.** *Let* Q_{N−1} *and* peel(P_N) *be generated by application of function* GenQandPeel *from Algorithm 1 on program* P_N*. Then* P_N *is semantically equivalent to* Q_{N−1}; peel(P_N)*.*

**Lemma 1.** *Suppose that the upper bound expression of every loop in* P_N *is a linear function of* N *and of the loop counters of the loops nesting it.*

*Then, the maximum nesting depth of loops in* peel(P_N) *is strictly less than that in* P_N*.*

*Proof.* Let U_{Lk}(ℓ_1, ..., ℓ_{k−1}, N) be the upper bound expression of a loop L_k at nesting depth k. Suppose U_{Lk} = c_1·ℓ_1 + ··· + c_{k−1}·ℓ_{k−1} + C·N + D, where c_1, ..., c_{k−1}, C and D are constants. Then U_{Lk}(ℓ_1, ..., ℓ_{k−1}, N) − U_{Lk}(ℓ_1, ..., ℓ_{k−1}, N−1) = C, a constant. Now, recalling the discussion in Sect. 4.1, we see that LPeel(Lk, U_{Lk}(ℓ_1, ..., ℓ_{k−1}, N−1), U_{Lk}(ℓ_1, ..., ℓ_{k−1}, N)) simply results in concatenating a constant number of peels of the loop L_k. Hence, the maximum nesting depth of loops in LPeel(Lk, U_{Lk}(ℓ_1, ..., ℓ_{k−1}, N−1), U_{Lk}(ℓ_1, ..., ℓ_{k−1}, N)) is strictly less than the maximum nesting depth of loops in L_k.
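The key arithmetic step of the proof is easy to illustrate numerically; the coefficients below are arbitrary assumed constants.

```python
# For a linear upper bound U(l1,...,l_{k-1}, N) = c1*l1 + ... + c_{k-1}*l_{k-1}
# + C*N + D, the difference U(..., N) - U(..., N-1) equals the constant C,
# independently of the loop counters.

def U(ls, N, cs, C, D):
    return sum(c * l for c, l in zip(cs, ls)) + C * N + D

cs, C, D = [2, -1, 3], 4, 7          # assumed constants c1..c3, C, D
for N in range(1, 6):
    for ls in ([0, 0, 0], [1, 2, 3], [5, 0, 9]):
        assert U(ls, N, cs, C, D) - U(ls, N - 1, cs, C, D) == C
```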

Suppose a loop L with nested loops (having maximum nesting depth t) is passed as the argument of function GenQandPeelForLoop (see Algorithm 1). In line 15 of function GenQandPeelForLoop, we iterate over all loops at nesting depth 2 and above within L. Let L_k be a loop at nesting depth k, where 2 ≤ k ≤ t. Clearly, L_k can have at most t − k nested levels of loops within it. Therefore, when LPeel is invoked on such a loop, the maximum nesting depth of loops in the peel generated for L_k can be at most t − k − 1. From lines 18 and 19 of function GenQandPeelForLoop, we also know that this LPeel can itself appear at nesting depth k of the overall peel RL. Hence, the maximum nesting depth of loops in RL can be t − k − 1 + k, i.e. t − 1. This is strictly less than the maximum nesting depth of loops in L.

**Corollary 1.** *If* P_N *has no nested loops, then* peel(P_N) *is loop-free.*

#### **4.2 Generating ϕ′(N − 1) and Δϕ′(N)**

Given ϕ(N), we check if it is of the form ⋀_{i=0}^{N−1} ρ_i, where ρ_i is a formula on the i-th elements of one or more arrays, and scalars used in P_N. If so, we infer ϕ′(N − 1) to be ⋀_{i=0}^{N−2} ρ_i and Δϕ′(N) to be ρ_{N−1} (assuming variables/array elements in ρ_{N−1} are not modified by Q_{N−1}). Note that all uses of N in ρ_i are retained as is (i.e. not changed to N − 1) in ϕ′(N − 1). In general, when deriving ϕ′(N − 1), we do not replace any use of N in ϕ(N) by N − 1 unless it is the limit of an iterated conjunct as discussed above. Specifically, if ϕ(N) doesn't contain an iterated conjunct as above, then we consider ϕ′(N − 1) to be the same as ϕ(N) and Δϕ′(N) to be True. Thus, our generation of ϕ′(N − 1) and Δϕ′(N) differs from that of [12]. As discussed earlier, this makes it possible to reason about a much larger class of pre-conditions than that admissible by the technique of [12].

#### **4.3 Inferring Inductive Difference Invariants**

Once we have P_{N−1}, Q_{N−1}, ϕ(N−1) and ϕ′(N−1), we infer *difference invariants*. We construct the standard cross-product of programs Q_{N−1} and P_{N−1}, denoted Q_{N−1} × P_{N−1}, and infer difference invariants at key control points. Note that P_{N−1} and Q_{N−1} are guaranteed to have synchronized iterations of corresponding loops (both are obtained by restricting the upper bounds of all loops to use N − 1 instead of N). However, the conditional statements within the loop body may not be synchronized. Whenever we can infer that the corresponding branch conditions are equivalent, we synchronize the branches of the conditional statement; otherwise, we consider all four combinations of the branch conditions. It can be seen that the net effect of the cross-product is that of executing the programs P_{N−1} and Q_{N−1} one after the other.

We run a dataflow analysis pass over the constructed product graph to infer difference invariants at loop heads, loop exits and at each branch condition. The only dataflow values of interest are differences between corresponding variables in Q_{N−1} and P_{N−1}. Indeed, since the structure and variables of Q_{N−1} and P_{N−1} are similar, we can create a correspondence map between the variables. We start the difference invariant generation by considering relations between corresponding variables/array elements appearing in the pre-conditions of the two programs. We then apply a static analysis that tracks equality expressions (including disjunctions over equality expressions) over variables as we traverse the program. These equality expressions are our difference invariants.

We observed in our experiments that most of the inferred equality expressions are simple expressions of N (at most quadratic in N). This is not totally surprising, and similar observations have also been independently made in [4,15,24]. Note that the difference invariants may not always be equalities. We can easily extend our analysis to learn inequalities using interval domains in static analysis. We can also use a library of expressions to infer difference invariants in a guess-and-check framework. Moreover, guessing difference invariants can be easy, since in many cases the difference expressions are independent of the program constructs; for example, the equality expression v′ = v, where v ∈ P_{N−1} and v′ ∈ Q_{N−1}, does not depend on any other variable from the two programs.

For the example in Fig. 2, the difference invariant at the head of the first loop of Q_{N−1} × P_{N−1} is D(V_Q, V_P, N − 1) ≡ (x′ − x = i × (2 × N − 1) ∧ ∀i ∈ [0, N − 1), a′[i] − a[i] = 1), where x, a ∈ V_P and x′, a′ ∈ V_Q. Given this, we easily get x′ − x = (N − 1) × (2 × N − 1) when the first loop terminates. For the second loop, D(V_Q, V_P, N − 1) ≡ (∀j ∈ [0, N − 1), b′[j] − b[j] = (x′ − x) + N² = (N − 1) × (2 × N − 1) + N²).

Note that the difference invariants and their computation are agnostic of the given post-condition. Hence, our technique does not need to re-run this analysis when proving a different post-condition for the same program.

#### **4.4 Verification Using Inductive Difference Invariants**

We present our method Diffy for verification of programs using inductive difference invariants in Algorithm 2. It takes a Hoare triple {ϕ(N)} P_N {ψ(N)} as input, where ϕ(N) and ψ(N) are the pre- and post-condition formulas. We check the base case in line 1 to verify the Hoare triple for N = 1. If this check fails, we report a counterexample. Subsequently, we compute Q_{N−1} and peel(P_N) as described in Sect. 4.1, using the function GenQandPeel from Algorithm 1. At line 4, we compute the formulas ϕ′(N − 1) and Δϕ′(N) as described in Sect. 4.2. For automation, we analyze the quantifiers appearing in ϕ(N) and modify the quantifier ranges such that the conditions in Sect. 4.2 hold. We infer difference invariants D(V_Q, V_P, N − 1) on line 5 using the method described in Sect. 4.3, where V_Q and V_P are the sets of variables of Q_{N−1} and P_{N−1} respectively. At line 6, we compute ψ′(N − 1) by eliminating the variables V_P of P_{N−1} from ψ(N − 1) ∧ D(V_Q, V_P, N − 1). At line 7, we check the inductive step of our analysis. If the inductive step succeeds, then we conclude that the assertion holds.

# **Algorithm 2.** Diffy({ϕ(N)} P_N {ψ(N)})


If that is not the case, then we try to iteratively strengthen both the pre- and post-condition of peel(P_N) simultaneously by invoking Strengthen.

The function Strengthen first initializes the formula χ(N) with ψ(N) and the formulas ξ(N) and ξ′(N − 1) to True. To strengthen the pre-condition of peel(P_N), we infer a formula χ′(N − 1) using Dijkstra's weakest pre-condition computation of χ(N) over peel(P_N) in line 16. It may happen that we are unable to infer such a formula. In that case, if the program peel(P_N) has loops, then we recursively invoke Diffy at line 19 to further simplify the program; otherwise, we abandon the verification effort (line 21). We use quantifier elimination to infer χ(N) from χ′(N − 1) and D(V_Q, V_P, N − 1) at line 22.

The inferred pre-conditions χ(N) and χ′(N − 1) are accumulated in ξ(N) and ξ′(N − 1), which strengthen the post-conditions of P_N and Q_{N−1} respectively in lines 23–24. We again check the base case for the inferred formulas in ξ(N) at line 25. If the check fails, we abandon the verification attempt at line 26. If the base case succeeds, we proceed to the inductive step. When the inductive step succeeds, we conclude that the assertion is verified. Otherwise, we continue in the loop and try to infer more pre-conditions until we run out of time.

The pre-condition in Fig. 2 is ϕ(N) ≡ True and the post-condition is ψ(N) ≡ (∀j ∈ [0, N), b[j] = j + N³). At line 4, ϕ′(N − 1) and Δϕ′(N) are computed to be True. D(V_Q, V_P, N − 1) is the formula computed in Sect. 4.3. At line 6,


**Table 1.** Summary of the experimental results. S: successful result, U: inconclusive result, TO: timeout.

ψ′(N − 1) ≡ (∀j ∈ [0, N − 1), b′[j] = j + (N − 1)³ + (N − 1) × (2 × N − 1) + N² = j + N³). The algorithm then invokes Strengthen at line 10, which infers the formulas χ′(N − 1) ≡ (x = (N − 1)³) at line 16 and χ(N) ≡ (x = N³) at line 22. These are accumulated in ξ′(N − 1) and ξ(N), simultaneously strengthening the pre- and post-condition. Verification succeeds after this strengthening iteration.
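The simplification inside ψ′(N − 1), namely that the strengthened right-hand side collapses back to j + N³, can be checked algebraically:

```latex
\begin{aligned}
(N-1)^3 + (N-1)(2N-1) + N^2
  &= (N^3 - 3N^2 + 3N - 1) + (2N^2 - 3N + 1) + N^2 \\
  &= N^3,
\end{aligned}
```

so b′[j] = j + N³, exactly as required by the post-condition ψ(N).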

The following theorem guarantees the soundness of our technique.

**Theorem 2.** *Suppose there exist formulas* ξ′(N) *and* ξ(N) *and an integer* M > 0 *such that the following hold:*


*Then* {ϕ(N)} P_N {ψ(N)} *holds for all* N > 0*.*

### **5 Experimental Evaluation**

We have instantiated our technique in a prototype tool called Diffy. It is written in C++ and built using the LLVM (v6.0.0) [31] compiler framework. We use the SMT solver Z3 (v4.8.7) [39] for proving Hoare triples of loop-free programs. Diffy and the supporting data needed to replicate the experiments are openly available at [14].

**Setup.** All experiments were performed on a machine with an Intel i7-6500U CPU running at 2.5 GHz, with 16 GB RAM, running the Ubuntu 18.04.5 LTS operating system. We compared the results obtained from Diffy with Vajra (v1.0) [12], VIAP (v1.1) [42] and VeriAbs (v1.4.1-12) [1]. We chose Vajra, which also employs inductive reasoning for proving array programs, and verified the benchmarks in its test-suite. We compared with VeriAbs as it is the winner of the arrays sub-category in SV-COMP 2020 [6] and 2021 [7]. VeriAbs applies a

**Fig. 8.** Cactus Plots (a) All Safe Benchmarks (b) All Unsafe Benchmarks

sequence of techniques from its portfolio to verify array programs. We also compared with VIAP, the winner of the arrays sub-category in SV-COMP 2019 [5]. VIAP likewise employs a sequence of tactics, implemented for proving a variety of array programs. Diffy does not use multiple techniques; however, we chose to compare it with these portfolio verifiers to show that it performs well on a class of programs and could become part of their portfolios. All tools take C programs in the SV-COMP format as input. A timeout of 60 s was set for each tool. A summary of the results is presented in Table 1.

**Benchmarks.** We evaluated Diffy on a set of 303 array benchmarks, comprising the entire test-suite of [12], enhanced with challenging benchmarks to test the efficacy of our approach. These benchmarks take a symbolic parameter N which specifies the size of each array. Assertions are (in-)equalities over array elements, scalars and (non-)linear polynomial terms over N. We divided both the safe and unsafe benchmarks into three categories. Benchmarks in the C1 category have standard array operations such as min, max, init, copy and compare, as well as benchmarks that compute polynomials. In these benchmarks, branch conditions are not affected by the value of N, and operations such as modulo and nested loops are not present. There are 110 safe and 99 unsafe programs in the C1 category in our test-suite. In the C2 category, the branch conditions are affected by changes in the program parameter N, and operations such as modulo are used in these benchmarks. These benchmarks do not have nested loops. There are 24 safe and 24 unsafe benchmarks in the C2 category. Benchmarks in category C3 are programs with at least one nested loop. There are 23 safe and 23 unsafe programs in category C3 in our test-suite. The test-suite has a total of 157 safe and 146 unsafe programs.

**Analysis.** Diffy verified 151 safe benchmarks, compared to 110 verified by Vajra as well as by VeriAbs, and 20 verified by VIAP. Diffy was unable to verify 6 safe benchmarks: in 3 cases, the SMT solver timed out while trying to prove the induction step, since the formulated query had a modulo operation, and in 3 cases it was unable to compute the predicates needed to prove the assertions. Vajra was unable to verify 47 programs from categories C2 and

**Fig. 9.** Cactus plots (a) Safe C1 benchmarks (b) Unsafe C1 benchmarks

C3. These are programs with nested loops, branch conditions affected by N, and cases where it could not compute the difference program. The sequence of techniques employed by VeriAbs ran out of time on 47 programs while trying to prove the given assertion. VeriAbs proved 2 benchmarks in category C2 and 3 benchmarks in category C3 where Diffy was inconclusive or timed out. VeriAbs spends a considerable amount of time on different techniques in its portfolio before it resorts to Vajra, and hence it could not verify 14 programs that Vajra was able to prove efficiently. VIAP was inconclusive on 24 programs which had nested loops or constructs that the tool could not handle. It ran out of time on 113 benchmarks, as the initial tactics in its sequence took up the allotted time without verifying the benchmarks. Diffy was able to verify all programs that VIAP and Vajra were able to verify within the specified time limit.

The cactus plot in Fig. 8(a) shows the performance of each tool on all safe benchmarks. Diffy was able to prove most of the programs within three seconds. The cactus plot in Fig. 9(a) shows the performance of each tool on safe benchmarks in the C1 category. Vajra and Diffy perform equally well in the C1 category, owing to the fact that both tools perform efficient inductive reasoning. Diffy outperforms VeriAbs and VIAP in this category. The cactus plot in Fig. 10(a) shows the performance of each tool on safe benchmarks in the combined categories C2 and C3, which are difficult for Vajra as most of these programs are not within its scope. Diffy outperforms all other tools in categories C2 and C3. VeriAbs was an order of magnitude slower than Diffy on the programs it was able to verify. VeriAbs spends a significant amount of time trying techniques from its portfolio, until one of them succeeds in verifying the assertion or the entire allotted time is used up. VIAP took, on average, 70 seconds more than Diffy to verify a given benchmark. VIAP also spends a large portion of its time trying the different tactics implemented in the tool and solving the recurrence relations in programs.

Our technique reports property violations when the base case of the analysis fails for small fixed values of N. While the focus of our work is on proving assertions, we report results on unsafe versions of the safe benchmarks from our test-suite. Diffy was able to detect a property violation in 142 unsafe programs and was inconclusive on 4 benchmarks. Vajra detected violations in 115 programs and was inconclusive on 31 programs. VeriAbs reported 125 programs as unsafe and ran out of time on 21 programs. VIAP reported property violation in 120 programs, was inconclusive on 23 programs and timed out on 3 programs.

The cactus plot in Fig. 8(b) shows the performance of each tool on all unsafe benchmarks. Diffy was able to detect a violation faster than all other tools, and on more benchmarks from the test-suite. Figure 9(b) and Fig. 10(b) give a finer glimpse of the performance of these tools on the categories that we have defined. In the C1 category, Diffy and Vajra have comparable performance, and Diffy disproves the same number of benchmarks as Vajra and VIAP. In the C2 and C3 categories, Diffy detects property violations in more benchmarks, and in less time, than the other tools.

To observe any changes in the performance of these tools, we also ran them with an increased timeout of 100 seconds (Fig. 11). Performance remained unchanged for Diffy, Vajra and VeriAbs on both safe and unsafe benchmarks, and for VIAP on unsafe benchmarks. With the increased time limit, VIAP was able to verify 89 additional safe programs in categories C1 and C2.

**Fig. 10.** Cactus plots (a) Safe C2 & C3 benchmarks (b) Unsafe C2 & C3 benchmarks

**Fig. 11.** Cactus plots. TO = 100 s. (a) Safe benchmarks (b) Unsafe benchmarks

# **6 Related Work**

*Techniques Based on Induction.* Our work is related to several efforts that apply inductive reasoning to verify properties of array programs. Our work subsumes the full-program induction technique of [12], which works by inducting on the entire program via a program parameter N. We propose a principled method for the computation and use of difference invariants, instead of computing difference programs, which is more challenging. An approach to construct safety proofs by automatically synthesizing squeezing functions that shrink program traces is proposed in [27]. Such functions are not easy to synthesize, whereas difference invariants are relatively easy to infer. In [11], the post-condition is inductively established by identifying a tiling relation between the loop counter and the array indices used in the program. Our technique can verify programs from [11] when supplied with the *tiling* relation. [44] identifies recurrent program fragments for induction using the loop counter. It requires restrictive data dependencies, called *commutativity of statements*, to move peeled iterations across subsequent loops. Unfortunately, these restrictions are not satisfied by a large class of programs in practice, where our technique succeeds.

*Difference Computation.* Computing differences of program expressions has been studied for incremental computation of expensive expressions [35,41], optimizing programs with arrays [34], and checking data-structure invariants [45]. These differences are not always well suited for verifying properties, in contrast with the difference invariants which enable inductive reasoning in our case.

*Logic Based Reasoning.* In [21], a trace logic that implicitly captures inductive loop invariants is described. The authors use theorem provers to introduce and prove lemmas at arbitrary control locations in the program. Unlike their technique, we focus primarily on universally quantified and quantifier-free properties, although a restricted class of existentially quantified properties can be handled by our technique (see [13] for more details). VIAP [42] translates the program to a quantified first-order logic formula using the scheme proposed in [32]. It uses a portfolio of tactics to simplify and prove the generated formulas. It relies on dedicated solvers for recurrences, whereas our technique adapts induction for handling recurrences.

*Invariant Generation.* Several techniques generate invariants for array programs. QUIC3 [25] and FreqHorn [9,19] infer universally quantified invariants over arrays for Constrained Horn Clauses (CHCs). Template-based techniques [8,23,47] search for inductive quantified invariants by instantiating parameters of a fixed set of templates. We generate relational invariants, which are often easier to infer than inductive quantified invariants for each loop.

*Abstraction-Based Techniques.* Counterexample-guided abstraction refinement using prophecy variables for programs with arrays is proposed in [36]. VeriAbs [1] uses a portfolio of techniques, specifically identifying loops that can be soundly abstracted by a bounded number of iterations. Vaphor [38] transforms array programs to array-free Horn formulas that track a bounded number of array cells. Booster [3] combines lazy abstraction based interpolation [2] and acceleration [10,28] for array programs. Abstractions in [16,18,22,26,29,33,37] implicitly or explicitly partition the range of array indices to infer and prove facts on array segments. In contrast, our method does not rely on abstractions.

# **7 Conclusion**

We presented a novel verification technique that combines the generation of difference invariants with inductive reasoning. Difference invariants relate corresponding variables and arrays from the two versions of a program and are often easy to infer and prove. We have instantiated these techniques in our prototype Diffy. Experiments show that Diffy outperforms the tools that won the arrays sub-category in SV-COMP 2019, 2020 and 2021. Although we have focused on universal and quantifier-free properties in this paper, the technique applies to some classes of existential properties as well; the interested reader is referred to [13] for more details. Using synthesis techniques for the automatic generation of difference invariants to verify properties of array-manipulating programs is part of future work.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Author Index**

Abate, Alessandro II-3 Agarwal, Pratyush I-341 Akshay, S. I-619 Albert, Elvira II-863 Alur, Rajeev I-249 Amram, Gal I-870 André, Étienne I-552 Andriushchenko, Roman I-856 Arcaini, Paolo I-595 Armstrong, Alasdair I-303 Arquint, Linard I-367 Ayoun, Sacha-Élie II-827 Backes, J. II-851 Bae, Kyungmin I-491 Baier, Christel I-894 Baier, Daniel II-195 Bak, Stanley I-263 Balunovic, Mislav I-225 Bansal, Suguman I-870 Bardin, Sébastien I-669 Barrett, Clark II-461 Batz, Kevin II-524 Baumeister, Jan I-694 Bayless, S. II-851 Bendík, Jaroslav II-313 Beneš, Nikola I-505 Berzish, Murphy II-289 Beyer, Dirk II-195 Biere, Armin II-363 Bodeveix, Jean-Paul II-337 Bonakdarpour, Borzoo I-694 Boston, Brett I-645 Bozzano, Marco II-209 Bragg, Nate F. F. I-808 Breese, Samuel I-645 Brim, Luboš I-505 Brown, Kristopher II-461 Brunel, Julien II-337

Campbell, Brian I-303 Carpenter, Taylor I-249 Cauli, Claudia I-767

Češka, Milan I-856 Chakraborty, Supratik II-911 Chalupa, Marek II-887 Chatterjee, Krishnendu I-341 Chemouil, David II-337 Chen, Guangke I-175 Chen, Jiayu I-225 Chen, Mingshuai I-443, II-524 Chen, Taolue I-175 Chen, Xiaohong II-477 Chiari, Michele II-387 Christakis, Maria I-201, II-777 Cimatti, Alessandro I-529, II-209 Clochard, Martin I-367 Coenen, Norine I-694, I-894 Cogumbreiro, Tiago I-403 Constantinides, George II-626 Cyphert, John I-46, I-783 D'Antoni, Loris I-84, I-783 DaCosta, D. II-851 Dahlqvist, Fredrik II-626 Dan, Andrei I-225 Day, Joel D. II-289 Dodds, Joey I-645 Dodds, Mike I-645 Dutertre, Bruno II-266 Dwyer, Matthew B. I-137

Eilers, Marco I-718 Eisenhut, Jan II-411 Elad, Neta I-317 Elbaum, Sebastian I-137 Eniser, Hasan Ferit I-201

Fang, Wang I-151 Farinier, Benjamin I-669 Farzan, Azadeh I-832 Ferlez, James I-287 Fernandes Pires, Anthony II-209 Finkbeiner, Bernd I-694, I-894 Foster, Jeffrey S. I-808 Fried, Dror I-870 Friedberger, Karlheinz II-195

Fu, Yu-Fu II-149 Funke, Florian I-894 Ganesh, Vijay II-289 Gardner, Philippa II-827 Gastin, Paul I-619 Genaim, Samir II-863 Giacobbe, Mirco II-3 Girol, Guillaume I-669 Gnad, Daniel II-411 Goel, Shilpi I-26 Gopinath, Divya I-3 Griggio, Alberto I-529, II-209 Guan, Ji I-151 Gupta, Aarti II-461 Gupta, Ashutosh II-911 Hahn, Ernst Moritz II-651 Hallé, Sylvain II-500 Hamilton, Nathaniel I-263 Hasuo, Ichiro I-595, II-75 Hauptman, Dustin I-566 Heljanko, Keijo II-363 Hermanns, Holger I-201 Hobor, Aquinas II-801 Hoffmann, Jörg I-201, II-411 Holtzen, Steven II-577 Hu, Qinheping I-84, I-783 Huffman, Brian I-645 Hur, Chung-Kil II-752 Immerman, Neil I-317 Irfan, Ahmed I-529, II-461 Itzhaky, Shachar I-110, II-125 Ivanov, Radoslav I-249 Jacobs, Bart II-27 Jacobs, Swen II-435 Jansen, Nils II-602 Jantsch, Simon I-894 Jewell, K. II-851 Johnson, Andrew I-380 Johnson, Taylor T. I-263 Jonáš, Martin II-209 Jones, B. F. II-851 Joshi, S. II-851 Jovanović, Dejan II-266 Junges, Sebastian I-856, II-553, II-577, II-602

Kaminski, Benjamin Lucien II-524 Katoen, Joost-Pieter I-443, I-856, II-524 Keshmiri, Shawn I-566 Khedr, Haitham I-287 Kim, Dongjoo II-752 Kim, Jinwoo I-84 Kim, Sharon I-491 Kimberly, Greg II-209 Kincaid, Zachary I-46, II-51 Klaška, David II-887 Kokologiannakis, Michalis I-427 Koskinen, Eric I-742 Kothari, Yugesh I-201 Kovács, Laura I-317 Kremer, Gereon II-231 Kulczynski, Mitja II-289 Kura, Satoshi II-75

Lal, Ratan I-566 Lange, Julien I-403 Launchbury, N. II-851 Lee, Insup I-249 Lee, Jaehun I-491 Lee, Juneyoung II-752 Lefaucheux, Engel II-172 Leow, Wei Xiang II-801 Leutgeb, Lorenz II-99 Li, Jianlin I-201 Li, Meng I-767 Li, Pengfei II-728 Li, Yangge I-580 Lin, Anthony W. II-243 Lin, Wang I-467 Lin, Zhengyao II-477 Liu, Jiaxiang II-149 Liu, Zhiming I-467 Lluch Lafuente, Alberto II-411 Lonsing, Florian II-461 Lopes, Nuno P. II-752 Lopez, Diego Manzanas I-263 Lyu, Deyun I-595

Ma, Lei I-595 Maksimović, Petar II-827 Mandrioli, Dino II-387 Manea, Florin II-289 Mann, Makai II-461 Mansur, Muhammad Numair II-777 Mariano, Benjamin II-777 Markgraf, Oliver II-243


Ölveczky, Peter Csaba I-491 Oortwijn, Wytse I-367 Osama, Muhammad II-447 Ouaknine, Joël II-172

Pal, Neelanjana I-263 Pappas, George I-249 Parthasarathy, Gaurav II-704 Păsăreanu, Corina S. I-3 Pastva, Samuel I-505 Pathak, Shreya I-341 Pavlogiannis, Andreas I-341 Peleg, Hila I-110 Pereira, João C. I-367 Pereira, Mário II-677 Perez, Mateo II-651 Petcher, Adam I-645 Peyras, Quentin II-337 Piterman, Nir I-767 Polikarpova, Nadia I-110 Prabhakar, Pavithra I-566 Pradella, Matteo II-387 Prakash, Karthik R. I-619 Preiner, Mathias II-231 Pulte, Christopher I-303 Purser, David II-172

Rain, Sophie I-317 Rakamarić, Zvonimir II-626 Ravara, António II-677 Reinhard, Tobias II-27 Reps, Thomas I-46, I-84, I-783 Rong, Dennis Liew Zhen I-403 Roşu, Grigore II-477 Roux, Cody I-808 Rowe, Reuben N. S. I-110 Roy, Diptarko II-3 Rubio, Albert II-863 Ryou, Wonryong I-225 Šafránek, David I-505 Sagiv, Mooly I-317 Sakr, Mouhammad II-435 Salvia, Rocco II-626 Sánchez, César I-694 Santos, José Fragoso II-827 Schewe, Sven II-651 Schröer, Philipp II-524 Sergey, Ilya I-110 Seshia, Sanjit A. II-553, II-577, II-602 Sewell, Peter I-303 Shi, Xiaomu II-149 Shoukry, Yasser I-287 Shriver, David I-137 Sibai, Hussein I-580 Siber, Julian I-894 Simner, Ben I-303 Singh, Gagandeep I-225 Singher, Eytan II-125 Slobodova, Anna I-26 Solar-Lezama, Armando I-808 Somenzi, Fabio II-651 Song, Fu I-175 Stan, Daniel II-243 Stefanescu, Andrei I-645 Strejček, Jan II-887 Stupinský, Šimon I-856 Summers, Alexander J. II-704 Sumners, Rob I-26 Sun, Youcheng I-3 Swords, Sol I-26 Tabajara, Lucas Martinelli I-870

Tang, Xiaochao I-467 Terauchi, Tachio I-742 Tkachuk, Oksana I-767 Toman, Viktor I-341

Tomovič, Lukáš II-887 Tonetta, Stefano I-529 Torfah, Hazem II-553 Tran, Hoang-Dung I-263 Tremblay, Hugo II-500 Trentin, P. II-851 Trinh, Minh-Thai II-477 Trivedi, Ashutosh II-651 Tsai, Ming-Hsien II-149

Unadkat, Divyesh II-911 Unno, Hiroshi I-742, II-75 Usman, Muhammad I-3

Vafeiadis, Viktor I-427 Van den Broeck, Guy II-577 van der Berg, Freark I. II-690 Vardi, Moshe Y. I-870 Vazquez-Chanlatte, Marcell II-577 Vechev, Martin I-225

Wahl, Thomas I-380 Wang, Bow-Yaw II-149 Wang, Qiuye I-443 Wang, Yuting II-728 Weimer, James I-249 Weiss, Gera I-870 Wijs, Anton II-447 Wojtczak, Dominik II-651 Wolf, Felix A. I-367 Worrell, James II-172 Wu, Jinhua II-728 Wüstholz, Valentin I-201, II-777

Xu, Xiangzhe II-728 Xue, Bai I-443

Yang, Bo-Yin II-149 Yang, Xiaodong I-263 Yang, Yahan II-461 Yang, Zhengfeng I-467 Yin, Zhenguo II-728 Ying, Mingsheng I-151 Yu, Emily II-363

Zeng, M. Q. II-851 Zeng, Xia I-467 Zeng, Zhenbing I-467 Zhan, Naijun I-443 Zhang, Hongce II-461 Zhang, Yedi I-175 Zhang, Yidan I-467 Zhang, Zhenya I-595 Zhao, Jianjun I-595 Zhao, Zhe I-175 Zhu, Shaowei II-51 Zicarelli, Hannah I-403 Zuleger, Florian II-99