**Constantin Enea Akash Lal (Eds.)**

# **Computer Aided Verification**

**35th International Conference, CAV 2023 Paris, France, July 17–22, 2023 Proceedings, Part II**

## **Lecture Notes in Computer Science 13965**

Founding Editors

Gerhard Goos and Juris Hartmanis

### Editorial Board Members

Elisa Bertino, *Purdue University, West Lafayette, IN, USA*
Wen Gao, *Peking University, Beijing, China*
Bernhard Steffen, *TU Dortmund University, Dortmund, Germany*
Moti Yung, *Columbia University, New York, NY, USA*

The series Lecture Notes in Computer Science (LNCS), including its subseries Lecture Notes in Artificial Intelligence (LNAI) and Lecture Notes in Bioinformatics (LNBI), has established itself as a medium for the publication of new developments in computer science and information technology research, teaching, and education.

LNCS enjoys close cooperation with the computer science R&D community; the series counts many renowned academics among its volume editors and paper authors, and collaborates with prestigious societies. Its mission is to serve this international community by providing an invaluable service, mainly focused on the publication of conference and workshop proceedings and postproceedings. LNCS commenced publication in 1973.

*Editors*

Constantin Enea
LIX, Ecole Polytechnique, CNRS and Institut Polytechnique de Paris, Palaiseau, France

Akash Lal
Microsoft Research, Bangalore, India

ISSN 0302-9743 (print) · ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-031-37702-0 (print) · ISBN 978-3-031-37703-7 (eBook)
https://doi.org/10.1007/978-3-031-37703-7

© The Editor(s) (if applicable) and The Author(s) 2023. This book is an open access publication.

**Open Access** This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

## **Preface**

It was our privilege to serve as the program chairs for CAV 2023, the 35th International Conference on Computer-Aided Verification. CAV 2023 was held during July 19–22, 2023, and the pre-conference workshops were held during July 17–18, 2023. CAV 2023 was an in-person event held in Paris, France.

CAV is an annual conference dedicated to the advancement of the theory and practice of computer-aided formal analysis methods for hardware and software systems. The primary focus of CAV is to extend the frontiers of verification techniques by expanding to new domains such as security, quantum computing, and machine learning. This puts CAV at the cutting edge of formal methods research, and this year's program is a reflection of this commitment.

CAV 2023 received a large number of submissions (261). We accepted 15 tool papers, 3 case-study papers, and 49 regular papers, which amounts to an acceptance rate of roughly 26%. The accepted papers cover a wide spectrum of topics, from theoretical results to applications of formal methods. These papers apply or extend formal methods to a wide range of domains such as concurrency, machine learning and neural networks, quantum systems, as well as hybrid and stochastic systems. The program featured keynote talks by Ruzica Piskac (Yale University), Sumit Gulwani (Microsoft), and Caroline Trippel (Stanford University). In addition to the contributed talks, CAV also hosted the CAV Award ceremony, and a report from the Synthesis Competition (SYNTCOMP) chairs.

In addition to the main conference, CAV 2023 hosted the following workshops: Meeting on String Constraints and Applications (MOSCA), Verification Witnesses and Their Validation (VeWit), Verification of Probabilistic Programs (VeriProP), Open Problems in Learning and Verification of Neural Networks (WOLVERINE), Deep Learning-aided Verification (DAV), Hyperproperties: Advances in Theory and Practice (HYPER), Synthesis (SYNT), Formal Methods for ML-Enabled Autonomous Systems (FoMLAS), and Verification Mentoring Workshop (VMW). CAV 2023 also hosted a workshop dedicated to Thomas A. Henzinger for his 60th birthday.

Organizing a flagship conference like CAV requires a great deal of effort from the community. The Program Committee for CAV 2023 consisted of 76 members—a committee of this size ensures that each member has to review only a reasonable number of papers in the allotted time. In all, the committee members wrote over 730 reviews while investing significant effort to maintain and ensure the high quality of the conference program. We are grateful to the CAV 2023 Program Committee for their outstanding efforts in evaluating the submissions and making sure that each paper got a fair chance. As in recent years at CAV, we made artifact evaluation mandatory for tool paper submissions, but optional for the rest of the accepted papers. This year we received 48 artifact submissions, out of which 47 submissions received at least one badge. The Artifact Evaluation Committee consisted of 119 members who put in significant effort to evaluate each artifact. The goal of this process was to provide constructive feedback to tool developers and help make the research published in CAV more reproducible. We are also very grateful to the Artifact Evaluation Committee for their hard work and dedication in evaluating the submitted artifacts.

CAV 2023 would not have been possible without the tremendous help we received from several individuals, and we would like to thank everyone who helped make CAV 2023 a success. We thank Alessandro Cimatti, Isil Dillig, Javier Esparza, Azadeh Farzan, Joost-Pieter Katoen, and Corina Pasareanu for serving as area chairs, and Bernhard Kragl and Daniel Dietsch for chairing the Artifact Evaluation Committee. We also thank Mohamed Faouzi Atig for chairing the workshop organization as well as leading publicity efforts, Eric Koskinen as the fellowship chair, Sébastien Bardin and Ruzica Piskac as sponsorship chairs, and Srinidhi Nagendra as the website chair. Srinidhi, along with Enrique Román Calvo, helped prepare the proceedings. We thank Ankush Desai, Eric Koskinen, Burcu Kulahcioglu Ozkan, Marijana Lazic, and Matteo Sammartino for chairing the mentoring workshop. Last but not least, we thank the members of the CAV Steering Committee (Kenneth McMillan, Aarti Gupta, Orna Grumberg, and Daniel Kroening) for helping us with several important aspects of organizing CAV 2023.

We hope that you will find the proceedings of CAV 2023 scientifically interesting and thought-provoking!

June 2023 Constantin Enea Akash Lal

## **Organization**

## **Conference Co-chairs**

Constantin Enea, LIX, École Polytechnique, CNRS and Institut Polytechnique de Paris, France
Akash Lal, Microsoft Research, India
## **Artifact Co-chairs**

Bernhard Kragl
Daniel Dietsch
## **Workshop Chair**


Mohamed Faouzi Atig Uppsala University, Sweden

## **Verification Mentoring Workshop Organizing Committee**

Ankush Desai, Eric Koskinen, Burcu Kulahcioglu Ozkan, Marijana Lazic, Matteo Sammartino
## **Fellowship Chair**

Eric Koskinen
## **Website Chair**

Srinidhi Nagendra
## **Sponsorship Co-chairs**

Sébastien Bardin, CEA, LIST, Université Paris Saclay, France
Ruzica Piskac, Yale University, USA
## **Proceedings Chairs**

Srinidhi Nagendra, Enrique Román Calvo
## **Program Committee**

- Aarti Gupta, Princeton University, USA
- Abhishek Bichhawat, IIT Gandhinagar, India
- Aditya V. Thakur, University of California, USA
- Ahmed Bouajjani, University of Paris, France
- Aina Niemetz, Stanford University, USA
- Akash Lal, Microsoft Research, India
- Alan J. Hu, University of British Columbia, Canada
- Alessandro Cimatti, Fondazione Bruno Kessler, Italy
- Alexander Nadel, Intel, Israel
- Anastasia Mavridou, KBR, NASA Ames Research Center, USA
- Andreas Podelski, University of Freiburg, Germany
- Ankush Desai, Amazon Web Services
- Anna Slobodova, Intel, USA
- Anthony Widjaja Lin, TU Kaiserslautern and Max-Planck Institute for Software Systems, Germany
- Arie Gurfinkel, University of Waterloo, Canada
- Arjun Radhakrishna, Microsoft, India
- Aws Albarghouthi, University of Wisconsin-Madison, USA
- Azadeh Farzan, University of Toronto, Canada
- Bernd Finkbeiner, CISPA Helmholtz Center for Information Security, Germany
- Bettina Koenighofer, Graz University of Technology, Austria
- Bor-Yuh Evan Chang, University of Colorado Boulder and Amazon, USA
- Burcu Kulahcioglu Ozkan, Delft University of Technology, The Netherlands
- Caterina Urban, Inria and École Normale Supérieure, France
- Cezara Dragoi, Amazon Web Services, USA
- Christoph Matheja, Technical University of Denmark, Denmark
- Claudia Cauli, Amazon Web Services, UK
- Constantin Enea, LIX, CNRS, Ecole Polytechnique, France
- Corina Pasareanu, CMU, USA
- Cristina David, University of Bristol, UK
- Dirk Beyer, LMU Munich, Germany
- Elizabeth Polgreen, University of Edinburgh, UK
- Elvira Albert, Complutense University, Spain
- Eunsuk Kang, Carnegie Mellon University, USA
- Gennaro Parlato, University of Molise, Italy
- Hossein Hojjat, Tehran University and Tehran Institute of Advanced Studies, Iran
- Ichiro Hasuo, National Institute of Informatics, Japan
- Isil Dillig, University of Texas, Austin, USA
- Javier Esparza, Technische Universität München, Germany
- Joost-Pieter Katoen, RWTH Aachen University, Germany
- Juneyoung Lee, AWS, USA
- Jyotirmoy Deshmukh, University of Southern California, USA
- Kenneth L. McMillan, University of Texas at Austin, USA
- Kristin Yvonne Rozier, Iowa State University, USA
- Kshitij Bansal, Google, USA
- Kuldeep Meel, National University of Singapore, Singapore
- Kyungmin Bae, POSTECH, South Korea
- Marcell Vazquez-Chanlatte, Alliance Innovation Lab (Nissan-Renault-Mitsubishi), USA
- Marieke Huisman, University of Twente, The Netherlands
- Markus Rabe, Google, USA
- Marta Kwiatkowska, University of Oxford, UK
- Matthias Heizmann, University of Freiburg, Germany
- Michael Emmi, AWS, USA
- Mihaela Sighireanu, University Paris Saclay, ENS Paris-Saclay and CNRS, France
- Mohamed Faouzi Atig, Uppsala University, Sweden
- Naijun Zhan, Institute of Software, Chinese Academy of Sciences, China
- Nikolaj Bjorner, Microsoft Research, USA
- Nina Narodytska, VMware Research, USA
- Pavithra Prabhakar, Kansas State University, USA
- Pierre Ganty, IMDEA Software Institute, Spain
- Rupak Majumdar, Max Planck Institute for Software Systems, Germany
- Ruzica Piskac, Yale University, USA
- Sebastian Junges, Radboud University, The Netherlands
- Sébastien Bardin, CEA, LIST, Université Paris Saclay, France
- Serdar Tasiran, Amazon, USA
- Sharon Shoham, Tel Aviv University, Israel
- Shaz Qadeer, Meta, USA
- Shuvendu Lahiri, Microsoft Research, USA
- Subhajit Roy, Indian Institute of Technology, Kanpur, India
- Suguman Bansal, Georgia Institute of Technology, USA
- Swarat Chaudhuri, UT Austin, USA
- Sylvie Putot, École Polytechnique, France
- Thomas Wahl, GrammaTech, USA
- Tomáš Vojnar, Brno University of Technology, FIT, Czech Republic
- Yakir Vizel, Technion - Israel Institute of Technology, Israel
- Yu-Fang Chen, Academia Sinica, Taiwan
- Zhilin Wu, State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, China

## **Artifact Evaluation Committee**

- Alejandro Hernández-Cerezo, Complutense University of Madrid, Spain
- Alvin George, IISc Bangalore, India
- Aman Goel, Amazon Web Services, USA
- Amit Samanta, University of Utah, USA
- Anan Kabaha, Technion, Israel
- Andres Noetzli, Cubist, Inc., USA
- Anna Becchi, Fondazione Bruno Kessler, Italy
- Arnab Sharma, University of Oldenburg, Germany
- Avraham Raviv, Bar Ilan University, Israel
- Ayrat Khalimov, TU Clausthal, Germany
- Baoluo Meng, General Electric Research, USA
- Benjamin Jones, Amazon Web Services, USA
- Bohua Zhan, Institute of Software, Chinese Academy of Sciences, China
- Cayden Codel, Carnegie Mellon University, USA
- Charles Babu M., CEA LIST, France
- Chungha Sung, Amazon Web Services, USA
- Clara Rodriguez-Núñez, Universidad Complutense de Madrid, Spain
- Cyrus Liu, Stevens Institute of Technology, USA
- Daniel Hausmann, University of Gothenburg, Sweden
- Daniela Kaufmann, TU Wien, Austria
- Debasmita Lohar, MPI SWS, Germany
- Deivid Vale, Radboud University Nijmegen, Netherlands
- Denis Mazzucato, Inria, France
- Đorđe Žikelić, Institute of Science and Technology Austria, Austria
- Ekanshdeep Gupta, New York University, USA
- Enrico Magnago, Amazon Web Services, USA
- Ferhat Erata, Yale University, USA
- Filip Cordoba, Graz University of Technology, Austria
- Filipe Arruda, UFPE, Brazil
- Florian Dorfhuber, Technical University of Munich, Germany
- Florian Sextl, TU Wien, Austria
- Francesco Parolini, Sorbonne University, France
- Frédéric Recoules, CEA LIST, France
- Goktug Saatcioglu, Cornell, USA
- Goran Piskachev, Amazon Web Services, USA
- Grégoire Menguy, CEA LIST, France
- Guy Amir, Hebrew University of Jerusalem, Israel
- Habeeb P., Indian Institute of Science, Bangalore, India
- Hadrien Renaud, UCL, UK
- Haoze Wu, Stanford University, USA
- Hari Krishnan, University of Waterloo, Canada
- Hünkar Tunç, Aarhus University, Denmark
- Idan Refaeli, Hebrew University of Jerusalem, Israel
- Ignacio D. Lopez-Miguel, TU Wien, Austria
- Ilina Stoilkovska, Amazon Web Services, USA
- Ira Fesefeldt, RWTH Aachen University, Germany
- Jahid Choton, Kansas State University, USA
- Jie An, National Institute of Informatics, Japan
- John Kolesar, Yale University, USA
- Joseph Scott, University of Waterloo, Canada
- Kevin Lotz, Kiel University, Germany
- Kirby Linvill, CU Boulder, USA
- Kush Grover, Technical University of Munich, Germany
- Levente Bajczi, Budapest University of Technology and Economics, Hungary
- Liangcheng Yu, University of Pennsylvania, USA
- Luke Geeson, UCL, UK
- Lutz Klinkenberg, RWTH Aachen University, Germany
- Marek Chalupa, Institute of Science and Technology Austria, Austria
- Mario Bucev, EPFL, Switzerland
- Mário Pereira, NOVA LINCS, Nova School of Science and Technology, Portugal
- Marius Mikucionis, Aalborg University, Denmark
- Martin Jonáš, Masaryk University, Czech Republic
- Mathias Fleury, University of Freiburg, Germany
- Matthias Hetzenberger, TU Wien, Austria
- Maximilian Heisinger, Johannes Kepler University Linz, Austria
- Mertcan Temel, Intel Corporation, USA
- Michele Chiari, TU Wien, Austria
- Miguel Isabel, Universidad Complutense de Madrid, Spain
- Mihai Nicola, Stevens Institute of Technology, USA
- Mihály Dobos-Kovács, Budapest University of Technology and Economics, Hungary
- Mikael Mayer, Amazon Web Services, USA
- Mitja Kulczynski, Kiel University, Germany
- Muhammad Mansur, Amazon Web Services, USA
- Muqsit Azeem, Technical University of Munich, Germany
- Neelanjana Pal, Vanderbilt University, USA
- Nicolas Koh, Princeton University, USA
- Niklas Metzger, CISPA Helmholtz Center for Information Security, Germany
- Omkar Tuppe, IIT Bombay, India
- Pablo Gordillo, Complutense University of Madrid, Spain
- Pankaj Kalita, Indian Institute of Technology, Kanpur, India
- Parisa Fathololumi, Stevens Institute of Technology, USA
- Pavel Hudec, HKUST, Hong Kong, China
- Peixin Wang, University of Oxford, UK
- Philippe Heim, CISPA Helmholtz Center for Information Security, Germany
- Pritam Gharat, Microsoft Research, India
- Priyanka Darke, TCS Research, India
- Ranadeep Biswas, Informal Systems, Canada
- Robert Rubbens, University of Twente, Netherlands
- Rubén Rubio, Universidad Complutense de Madrid, Spain
- Samuel Judson, Yale University, USA
- Samuel Pastva, Institute of Science and Technology Austria, Austria
- Sankalp Gambhir, EPFL, Switzerland
- Sarbojit Das, Uppsala University, Sweden
- Sascha Klüppelholz, Technische Universität Dresden, Germany
- Sean Kauffman, Aalborg University, Denmark


## **Additional Reviewers**

Azzopardi, Shaun Baier, Daniel Belardinelli, Francesco Bergstraesser, Pascal Boker, Udi Ceska, Milan Chien, Po-Chun Coglio, Alessandro Correas, Jesús Doveri, Kyveli Drachsler Cohen, Dana Durand, Serge Fried, Dror Genaim, Samir Ghosh, Bishwamittra Gordillo, Pablo

Román-Díez, Guillermo Gómez-Zamalloa, Miguel Hernández-Cerezo, Alejandro Holík, Lukáš Isabel, Miguel Ivrii, Alexander Izza, Yacine Jothimurugan, Kishor Kaivola, Roope Kaminski, Benjamin Lucien Kettl, Matthias Kretinsky, Jan Lengal, Ondrej Losa, Giuliano Luo, Ning Malik, Viktor

Markgraf, Oliver Martin-Martin, Enrique Meller, Yael Perez, Mateo Petri, Gustavo Pote, Yash Preiner, Mathias Rakamaric, Zvonimir Rastogi, Aseem Razavi, Niloofar Rogalewicz, Adam Sangnier, Arnaud Sarkar, Uddalok Schoepe, Daniel Sergey, Ilya

Stoilkovska, Ilina Stucki, Sandro Tsai, Wei-Lun Turrini, Andrea Vafeiadis, Viktor Valiron, Benoît Wachowitz, Henrik Wang, Chao Wang, Yuepeng Wies, Thomas Yang, Jiong Yen, Di-De Zhu, Shufang Žikelić, Đorđe Zohar, Yoni

## **Contents – Part II**

#### **Decision Procedures**


#### **Model Checking**




## **Decision Procedures**

## **Bitwuzla**

Aina Niemetz and Mathias Preiner

Stanford University, Stanford, USA {niemetz,preiner}@cs.stanford.edu

**Abstract.** Bitwuzla is a new SMT solver for the quantifier-free and quantified theories of fixed-size bit-vectors, arrays, floating-point arithmetic, and uninterpreted functions. This paper serves as a comprehensive system description of its architecture and components. We evaluate Bitwuzla's performance on all benchmarks of supported logics in SMT-LIB and provide a comparison against other state-of-the-art SMT solvers.

## **1 Introduction**

Satisfiability Modulo Theories (SMT) solvers serve as back-end reasoning engines for a wide range of applications in formal methods (e.g., [13,14,21,23,35]). In particular, the theory of fixed-size bit-vectors, in combination with arrays, uninterpreted functions and floating-point arithmetic, has received increasing interest in recent years, as witnessed by the large and growing number of benchmarks submitted to the SMT-LIB benchmark library [5] and the number of participants in the corresponding divisions of the annual SMT competition (SMT-COMP) [42]. State-of-the-art SMT solvers supporting (a subset of) these theories include Boolector [31], cvc5 [3], MathSAT [15], STP [19], Yices2 [17] and Z3 [25]. Among these, Boolector had been largely dominating the quantifier-free divisions with bit-vectors and arrays in SMT-COMP over the years [2].

Boolector was originally published in 2009 by Brummayer and Biere [11] as an SMT solver for the quantifier-free theories of fixed-size bit-vectors and arrays. Since 2012, Boolector has been mainly developed and maintained by the authors of this paper, who have extended it with support for uninterpreted functions and lazy handling of non-recursive lambda terms [32,38,39], local search strategies for quantifier-free bit-vectors [33,34], and quantified bit-vector formulas [40].

While Boolector is still competitive in terms of performance, it has several limitations. Its code base consists of largely monolithic C code, with a rigid architecture focused on a very specialized, tight integration of bit-vectors and arrays. Consequently, it is cumbersome to maintain, and adding new features is difficult and time-intensive. Further, Boolector requires manual management of memory and reference counts from API users; terms and sorts are tied to a specific solver instance and cannot be shared across instances; all preprocessing techniques are destructive, which disallows incremental preprocessing; and due to architectural limitations, incremental solving with quantifiers is not supported.

This work was supported in part by the Stanford Center for Automated Reasoning, the Stanford Agile Hardware Center, the Stanford Center for Blockchain Research and a gift from Amazon Web Services.

© The Author(s) 2023

C. Enea and A. Lal (Eds.): CAV 2023, LNCS 13965, pp. 3–17, 2023. https://doi.org/10.1007/978-3-031-37703-7_1

In 2018, we forked Boolector in preparation for addressing these issues, and entered an improved and extended version of this fork as Bitwuzla in the SMT competition 2020 [26]. At that time, Bitwuzla extended Boolector with: support for floating-point arithmetic by integrating SymFPU [8] (a C++ library of bit-vector encodings of floating-point operations); a novel generalization of its propagation-based local search strategy [33] to ternary values [27]; unsat core extraction; and, since 2022, support for reasoning about quantified formulas for all supported theories and their combinations. This version of Bitwuzla was already made available on GitHub [28], but not officially released. However, architectural and structural limitations inherited from Boolector remained. Thus, to overcome these limitations and address the above issues, we decided to discard the existing code base and rewrite Bitwuzla from scratch.

In this paper, we present the first official release of Bitwuzla, an SMT solver for the (quantified and quantifier-free) theories of fixed-size bit-vectors, arrays, floating-point arithmetic, uninterpreted functions and their combinations. Its name (pronounced as *bitvootslah*) is derived from an Austrian dialect expression that can be translated as *someone who tinkers with bits*. Bitwuzla is written in C++, inspired by techniques implemented in Boolector. That is, rather than only redesigning problematic aspects of Boolector, we carefully dissected and (re)evaluated its parts to serve as guidance when writing a new solver from scratch. In that sense, it is not a reimplementation of Boolector, but can be considered its superior successor. Bitwuzla is available on GitHub [28] under the MIT license, and its documentation is available at [29].

## **2 Architecture**

Bitwuzla supports reasoning about quantifier-free and quantified formulas over fixed-size bit-vectors, floating-point arithmetic, arrays and uninterpreted functions as standardized in SMT-LIB [4]. In this section, we provide an overview of Bitwuzla's system architecture and its core components as given in Fig. 1.

Bitwuzla consists of two main components: the *Solving Context* and the *Node Manager*. The Solving Context can be seen as a solver instance that determines satisfiability of a set of formulas and implements the lazy, abstraction/refinement-based SMT paradigm *lemmas on demand* [6,24] (in contrast to SMT solvers like cvc5 and Z3, which are based on the CDCL(T ) [36] framework). The Node Manager is responsible for constructing and maintaining nodes and types and is shared across multiple Solving Context instances.

Bitwuzla provides a comprehensive C++ API as its main interface, with a C and Python API built on top. All features of the C++ API are also accessible to C and Python users. The API documentation is available at [29]. The C++ API exports Term, Sort, Bitwuzla, and Option classes for constructing nodes and types, configuring solver options, and constructing Bitwuzla solver instances (the external representation of Solving Contexts). Term and Sort objects may be used in multiple Bitwuzla instances. The parser interacts with the solver instance via the C++ API. A textual command line interface (CLI) builds on top of the parser, supporting SMT-LIBv2 [4] and BTOR2 [35] as input languages.

**Fig. 1.** Bitwuzla system architecture.

### **2.1 Node Manager**

Bitwuzla represents formulas and terms as reference-counted, immutable nodes in a directed acyclic graph. The Node Manager is responsible for constructing and managing these nodes and employs hash-consing to maximize sharing of subgraphs. Automatic reference counting allows the Node Manager to determine when to delete nodes. Similarly, types are constructed and managed by the *Type Manager*, which is maintained by the Node Manager. Nodes and types are stored globally (thread-local) in the Node Database and Type Database, which has the key advantage that they can be shared between arbitrarily many solving contexts within one thread. This is one of the key differences to Boolector's architecture, where terms and types are manually reference counted and tied to a single solver instance, which does not allow sharing between solver instances.

### **2.2 Solving Context**

A *Solving Context* is the internal equivalent of a solver instance and determines the satisfiability of a set of asserted formulas (assertions). Solving Contexts are fully configurable via options and provide an incremental interface for adding and removing assertions via push and pop. Incremental solving allows users to perform multiple satisfiability checks with similar sets of assertions while reusing work from earlier checks. On the API level, Bitwuzla also supports satisfiability queries under a given set of assumptions (SMT-LIB command check-sat-assuming), which are internally handled via push and pop.

Nodes and types constructed via the Node Manager may be shared between multiple Solving Contexts. If the set of assertions is satisfiable, the Solving Context provides a model for the input formula. It further allows querying the model value of any term (SMT-LIB command get-value). For unsatisfiable queries, the Solving Context can be configured to extract an unsatisfiable core and unsat assumptions.

A Solving Context consists of three main components: a *Rewriter*, a *Preprocessor* and a *Solver Engine*. The Rewriter and Preprocessor perform local (node level) and global (over all assertions) simplifications, whereas the Solver Engine is the central solving engine, managing theory solvers and their interaction.

**Preprocessor.** As a first step of each satisfiability check, prior to solving, the preprocessor applies a pipeline of preprocessing passes in a predefined order to the current set of assertions until fixed-point. Each preprocessing pass implements a set of satisfiability-preserving transformations. All passes can be optionally disabled except for one mandatory transformation: the reduction of the full set of operators supported on the API level to a reduced operator set. Boolean connectives are expressed by means of {¬, ∧}, quantifier ∃ is represented in terms of ∀, inequalities are represented in terms of < and >, signed bit-vector operators are expressed in terms of unsigned operators, and more. These reduction transformations are a subset of the term rewrites performed by the Rewriter, and rewriting is implemented as one preprocessing pass. Additionally, Bitwuzla implements seven preprocessing passes, which are applied sequentially, after rewriting, until no further transformations are possible: *and flattening*, which splits a top-level ∧ into its subformulas, e.g., a ∧ (b ∧ (c = d)) into {a, b, c = d}; *substitution*, which replaces all occurrences of a constant x with a term t if x = t is derived at the top level; *skeleton preprocessing*, which simplifies the Boolean skeleton of the input formula with a SAT solver; *embedded constraints*, which substitutes all occurrences of top-level constraints in subterms of other top-level constraints with *true*; *extract elimination*, which eliminates bit-vector extracts over constants; *lambda elimination*, which applies beta reduction on lambda terms; and *normalization* of arithmetic expressions.

Preprocessing in Bitwuzla is *fully incremental*: all passes are applied to the current set of assertions, from all assertion levels, and simplifications derived from lower levels are applied to all assertions of higher levels (including assumptions). Assertions are processed per assertion level i, starting from i = 0, and for each level i > 0, simplifications are applied based on information from all levels j ≤ i. Note that when solving under assumptions, Bitwuzla internally pushes an assertion level and handles these assumptions as assertions of that level. When a level i is popped, the assertions of that level are popped, and the state of the preprocessor is backtracked to the state associated with level i − 1. Preprocessing assertion levels i < j with information derived from level j would, however, require not only restoring the state of the preprocessor when level j is popped, but also reconstructing the assertions on levels i < j to their state before level j was pushed; this is left to future work.

Boolector, on the other hand, only performs preprocessing based on top-level assertions (assertion level 0) and does not incorporate any information from assumptions or higher assertion levels.

**Rewriter.** The rewriter transforms terms via a predefined set of rewrite rules into semantically equivalent normal forms. This transformation is local in the sense that it is independent from the current set of assertions. We distinguish between required and optional rewrite rules, and further group rules into so-called rewrite levels from 0–2. The set of required rules consists of operator elimination rewrites, which are considered level 0 rewrites and ensure that nodes only contain operators from a reduced base set. For example, the two's complement −x of a bit-vector term x is rewritten to (∼x + 1) by means of one's complement and bit-vector addition. Optional rewrite rules are grouped into level 1 and level 2. Level 1 rules perform rewrites that only consider the immediate children of a node, whereas level 2 rules may consider multiple levels of children. If not implemented carefully, level 2 rewrites can potentially destroy sharing of subterms and consequently increase the overall size of the formula. For example, rewriting (t + 0) to t is considered a level 1 rewrite rule, whereas rewriting (a − b = c) to (b + c = a) is considered a level 2 rule since it may introduce an additional bit-vector addition (b + c) if (a − b) occurs somewhere else in the formula. The maximum rewrite level of the rewriter can be configured by the user.

Rewriting is applied on the current set of assertions as a preprocessing pass and, as all other passes, applied until fixed-point. That is, on any given term, the rewriter applies rewrite rules until no further rewrite rules can be applied. For this, the rewriter must guarantee that no set of applied rewrite rules may lead to cyclic rewriting of terms. Additionally, all components of the solving context apply rewriting on freshly created nodes to ensure that all nodes are always fully normalized. In order to avoid processing nodes more than once, the rewriter maintains a cache that maps nodes to their fully rewritten form.

**Solver Engine.** After preprocessing, the solving context sends the current set of assertions to the Solver Engine, which implements a lazy SMT paradigm called *lemmas on demand* [6,24]. However, rather than using a propositional abstraction of the input formula as in [6,24], it implements a bit-vector abstraction similar to Boolector [12,38]. At its core, the Solver Engine maintains a bit-vector theory solver and a solver for each supported theory. Quantifier reasoning is handled by a dedicated quantifiers module, implemented as a theory solver. The Solver Engine manages all theory solvers, the distribution of relevant terms, and the processing of lemmas generated by the theory solvers.

The bit-vector solver is responsible for reasoning about the bit-vector abstraction of the input assertions and the lemmas generated during solving, which includes all propositional and bit-vector terms. Theory atoms that do not belong to the bit-vector theory are abstracted as Boolean constants, and bit-vector terms whose operator does not belong to the bit-vector theory are abstracted as bit-vector constants. For example, an array select operation of bit-vector type is abstracted as a bit-vector constant, while an equality between two arrays is abstracted as a Boolean constant.

If the bit-vector abstraction is satisfiable, the bit-vector solver produces a satisfying assignment, and the floating-point, array, function and quantifier solvers check this assignment for theory consistency. If a solver finds a theory inconsistency, i.e., a conflict between the current satisfying assignment and the solver's theory axioms, it produces a lemma to refine the bit-vector abstraction and rule out the detected inconsistency. Theory solvers are allowed to send any number of lemmas, with the only requirement that if a theory solver does not send a lemma, the current satisfying assignment is consistent with the theory.

Finding a satisfying assignment for the bit-vector abstraction and the subsequent theory consistency checks are implemented as an abstraction/refinement loop as given in Algorithm 1. Whenever a theory solver sends lemmas, the loop is restarted to get a new satisfying assignment for the refined bit-vector abstraction. The loop terminates if the bit-vector abstraction is unsatisfiable, or if the bit-vector abstraction is satisfiable and none of the theory solvers report any theory inconsistencies. Note that the abstraction/refinement algorithm may return *unknown* if the input assertions include quantified formulas.
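A minimal sketch of this abstraction/refinement loop, in the spirit of Algorithm 1; the solver objects and their methods are hypothetical stand-ins, not Bitwuzla's internal API.

```python
def solve(bv_solver, theory_solvers):
    """Refine the bit-vector abstraction until it is unsat, or sat and
    consistent with all theory solvers."""
    while True:
        result = bv_solver.solve()        # satisfiability of the abstraction
        if result == "unsat":
            return "unsat"
        lemmas = []
        for solver in theory_solvers:     # FP, arrays, UF, quantifiers
            lemmas += solver.check(bv_solver.model())
        if not lemmas:                    # no theory inconsistency detected
            return result                 # "sat" (may be "unknown" with quantifiers)
        for lemma in lemmas:              # refine the abstraction and restart
            bv_solver.assert_formula(lemma)
```

A theory solver that returns no lemmas thereby certifies, as required in the text, that the current candidate assignment is consistent with its theory.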



**Backtrackable Data Structures.** Every component of the Solving Context except for the Rewriter depends on the current set of assertions. When solving incrementally, the assertion stack is modified by adding (SMT-LIB command push) and removing (SMT-LIB command pop) assertions. In contrast to Boolector, Bitwuzla supports saving and restoring the internal solver state, i.e., the state of the Solving Context, corresponding to these push and pop operations by means of *backtrackable data structures*. These data structures are custom variants of the mutable data structures provided by the C**++** standard library, extended with an interface for saving and restoring their state on push and pop calls. This allows the solver to take full advantage of incremental solving by reusing work from previous satisfiability checks and backtracking to previous states. Further, this enables incremental preprocessing. Bitwuzla's backtrackable data structures are conceptually similar to the context-dependent data structures in cvc5 [3].
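The idea behind such a data structure can be illustrated with a map that records an undo trail: push marks the trail, pop rewinds it. This is a conceptual Python sketch, not the C++ implementation.

```python
class BacktrackableMap:
    """A mutable map whose state can be saved (push) and restored (pop)."""

    _MISSING = object()          # sentinel: key was absent before the write

    def __init__(self):
        self._data = {}
        self._trail = []         # undo records: (key, previous value)
        self._levels = []        # trail size at each push

    def __setitem__(self, key, value):
        self._trail.append((key, self._data.get(key, self._MISSING)))
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def push(self):
        self._levels.append(len(self._trail))

    def pop(self):
        mark = self._levels.pop()
        while len(self._trail) > mark:   # undo all changes since the push
            key, old = self._trail.pop()
            if old is self._MISSING:
                del self._data[key]
            else:
                self._data[key] = old
```

On pop, only the entries written since the matching push are undone, so work done at lower assertion levels (e.g., cached preprocessing results) is preserved.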

## **3 Theory Solvers**

The Solver Engine maintains a theory solver for each supported theory and implements a module for handling quantified formulas as a dedicated theory solver. Its central component is the bit-vector theory solver, which reasons about a bit-vector abstraction of the current set of input assertions, refined with lemmas generated by the other theory solvers. The theories of fixed-size bit-vectors, arrays, floating-point arithmetic, and uninterpreted functions are combined via a model-based theory combination approach similar to [12,38].

Theory combination is based on candidate models produced by the bit-vector theory solver for the bit-vector abstraction (function T_BV::solve() in Algorithm 1). For each candidate model, each theory solver checks consistency with the axioms of the corresponding theory (functions T∗::check() in Algorithm 1). If a theory solver requests a model value for a term that is not part of the current bit-vector abstraction, the theory solver that "owns" that term is queried for a value. If this value or the candidate model is inconsistent with the axioms of the theory querying the value, it sends a lemma to refine the bit-vector abstraction.

### **3.1 Arrays**

The array theory solver implements and extends the array procedure from [12] with support for reasoning over (equalities of) nested arrays and non-extensional constant arrays. This is in contrast to Boolector, which generalizes the lemmas-on-demand procedure for extensional arrays described in [12] to non-recursive first-order lambda terms [37,38], without support for nested arrays. Generalizing arrays to lambda terms makes it possible to use the same procedure for arrays and uninterpreted functions, and enables a natural, compact representation and extraction of extended array operations such as *memset*, *memcpy*, and array initialization patterns as described in [39]. As an example, *memset*(a, i, n, e), which updates the n elements of array a within the range [i, i + n) to a value e, can be represented as λj. *ite*(i ≤ j < i + n, e, a[j]). Reasoning over equalities involving arbitrary lambda terms (including these operations), however, requires higher-order reasoning, which is not supported by Boolector. Further, extensionality over standard array operators that are represented as lambda terms (e.g., store) requires special handling, which makes the procedure unnecessarily complex. Bitwuzla, on the other hand, implements separate theory solvers for arrays and uninterpreted functions. Consequently, since it does not generalize arrays to lambda terms, it cannot utilize Boolector's elegant representation of the extended array operations of [39]. Thus, extracting and reasoning about these operations is not yet supported. Instead of representing such operators as lambda terms, we plan to introduce dedicated array operators. This will allow a seamless integration into Bitwuzla's array procedure, with support for reasoning about extensionality involving these operators. We also plan to add support for reasoning about extensional constant arrays in the near future.
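The lambda view of *memset* described above can be made concrete by modelling arrays as functions of their index; a toy Python sketch, with illustrative names only.

```python
def memset_lambda(a, i, n, e):
    """Return the array (as a function of its index) equal to a with all
    indices in [i, i + n) overwritten by e, i.e. the term
    lambda j . ite(i <= j < i + n, e, a[j])."""
    return lambda j: e if i <= j < i + n else a(j)
```

The representation composes naturally: applying `memset_lambda` to a result of `memset_lambda` models nested updates, mirroring how lambda terms compose in Boolector's encoding.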

### **3.2 Bit-Vectors**

The bit-vector theory solver implements two orthogonal approaches: the classic *bit-blasting* technique employed by most state-of-the-art bit-vector solvers, which eagerly translates the current bit-vector abstraction to SAT; and the *ternary propagation-based local search* approach presented in [27]. Since local search procedures can only determine satisfiability, they are particularly effective as a complementary strategy, in combination with (rather than instead of) bit-blasting [27,33]. Bitwuzla's bit-vector solver allows combining local search with bit-blasting in a sequential portfolio setting: the local search procedure is run until a predefined resource limit is reached before falling back on the bit-blasting procedure. Currently, Bitwuzla combines these two approaches only in this particular setting. As future work, we plan to explore more interleaved configurations, possibly sharing information between the procedures.
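The sequential portfolio combination can be sketched as follows; both procedures are hypothetical callables, and we assume the local search returns "unknown" when its resource limit is hit, since it can never conclude unsatisfiability.

```python
def check_sat(local_search, bit_blast, limit):
    """Sequential portfolio: incomplete-but-fast local search first,
    complete bit-blasting as the fallback."""
    result = local_search(limit)   # "sat" or "unknown" (limit reached / unsat input)
    if result == "sat":            # local search can only ever conclude sat
        return result
    return bit_blast()             # complete: "sat" or "unsat"
```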

*Bit-Blasting.* Bitwuzla implements the eager reduction of the bit-vector abstraction to propositional logic in two phases. First, it constructs an And-Inverter-Graph (AIG) circuit representation of the abstraction while applying AIG-level rewriting techniques [10]. This AIG circuit is then converted into Conjunctive Normal Form (CNF) via Tseitin transformation and sent to the SAT solver back-end. Note that for assertions from levels > 0, Bitwuzla leverages solving under assumptions in the SAT solver in order to be able to backtrack to lower assertion levels on pop. Bitwuzla supports CaDiCaL [7], CryptoMiniSat [41], and Kissat [7] as SAT back-ends and uses CaDiCaL as its default SAT solver.
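For a single AIG node, the Tseitin step introduces a fresh variable g and three clauses encoding g ↔ (a ∧ b). A sketch with DIMACS-style signed-integer literals; this simplified encoder is illustrative, not Bitwuzla's implementation.

```python
def tseitin_and(g, a, b):
    """CNF clauses for g <-> (a AND b), literals as signed integers."""
    return [
        [-g, a],        # g -> a
        [-g, b],        # g -> b
        [g, -a, -b],    # (a AND b) -> g
    ]
```

Since every AIG node is an AND gate with optionally inverted inputs (inversion is just literal negation), this single template suffices to convert the whole circuit.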

*Local Search.* Bitwuzla implements an improved version of the ternary propagation-based local search procedure described in [27]. This procedure is a generalization of the propagation-based local search approach implemented in Boolector [33] and addresses one of its main weaknesses: its obliviousness to bits that can be simplified to constant values. Propagation-based local search is based on propagating target values from the outputs to the inputs, does not require bit-blasting, brute-force randomization or restarts, and lifts the concept of backtracing of Automatic Test Pattern Generation (ATPG) [22] to the word-level. Boolector additionally implements the stochastic local search (SLS) approach presented in [18], optionally augmented with a propagation-based strategy [34]. Bitwuzla, however, only implements our ternary propagation-based approach since it was shown to significantly outperform these approaches [33].
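The word-level backtracing underlying propagation-based local search can be illustrated on a single operation: given a target value t for a node x & s, compute a value for the input x that produces t, or report that none exists. This toy example covers one inversion rule over concrete values only; the procedure of [27] handles all bit-vector operators and ternary (partially constant) values.

```python
def invert_and(s, t, width):
    """A value v with v & s == t, if one exists.

    Infeasible iff t has a 1 in a position where the fixed input s has a 0."""
    if t & ~s & ((1 << width) - 1):
        return None        # t demands a 1 that s masks away
    return t               # v := t works, since then v & s == t
```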

### **3.3 Floating-Point Arithmetic**

The solver for the theory of floating-point arithmetic implements an eager translation of floating-point atoms in the bit-vector abstraction to equisatisfiable formulas in the theory of bit-vectors, a process sometimes referred to as *word-blasting*. To translate floating-point expressions to the word level, Bitwuzla integrates SymFPU [9], a C**++** library of bit-vector encodings of floating-point operations. SymFPU uses templated types for Booleans, (un)signed bit-vectors, rounding modes, and floating-point formats, which allows utilizing solver-specific representations. SymFPU has also been integrated into cvc5 [3].
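The flavor of word-blasting can be conveyed by a floating-point operation that reduces to a pure bit-vector operation on the IEEE 754 representation: absolute value is clearing the sign bit. This Python sketch is illustrative only; SymFPU encodes far more involved operations (addition, multiplication, rounding) in the same bit-level spirit.

```python
import struct

def double_to_bits(x):
    """IEEE 754 binary64 representation of x as a 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def bits_to_double(b):
    return struct.unpack("<d", struct.pack("<Q", b))[0]

def fp_abs(x):
    """fp.abs expressed as a bit-vector operation: clear the sign bit."""
    return bits_to_double(double_to_bits(x) & ~(1 << 63))
```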

### **3.4 Uninterpreted Functions**

For the theory of uninterpreted functions (UF), Bitwuzla implements *dynamic Ackermannization* [16], a lazy form of Ackermann's reduction. The UF solver checks whether the current satisfying assignment of the bit-vector abstraction is consistent with the function congruence axiom ā = b̄ → f(ā) = f(b̄) and produces a lemma whenever the axiom is violated.
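The congruence check behind dynamic Ackermannization can be sketched as follows: scan the applications of an uninterpreted function under the current assignment and emit a lemma for the first violated axiom instance. Term names and the interface are hypothetical.

```python
def check_congruence(apps):
    """apps: list of (argument values, result value, application name).

    Returns a (symbolic) Ackermann lemma for the first pair of applications
    with equal arguments but different results, or None if consistent."""
    seen = {}
    for args, value, name in apps:
        if args in seen:
            prev_value, prev_name = seen[args]
            if prev_value != value:
                # congruence violated: equal arguments force equal results
                return ("lemma", f"{prev_name} = {name}")
        else:
            seen[args] = (value, name)
    return None
```

In the actual lazy scheme, the emitted lemma would be the guarded instance "args equal → applications equal", added only when the candidate model violates it, rather than eagerly for all pairs.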

### **3.5 Quantifiers**

Quantified formulas are handled by the quantifiers module, which is treated as a theory solver and implements model-based quantifier instantiation [20] for all supported theories and their combinations. In the bit-vector abstraction, quantified formulas are abstracted as Boolean constants. Based on the assignment of these constants, the quantifiers solver produces instantiation or Skolemization lemmas. If the constant is assigned to true, the formula is treated as a universal quantifier and the solver produces instantiation lemmas. If the constant is assigned to false, the solver generates a Skolemization lemma. Bitwuzla supports combining quantifiers with all supported theories, as well as with incremental solving and unsat core extraction. This is in contrast to Boolector, which only supports sequential reasoning about quantified bit-vector formulas and, generally, does not provide unsat cores for unsatisfiable instances.

## **4 Evaluation**

We evaluate the overall performance of Bitwuzla on all non-incremental and incremental benchmarks of all supported logics in SMT-LIB [5]. We further include logics with floating-point arithmetic that are classified as containing linear real arithmetic (LRA). Bitwuzla does not support LRA reasoning, but


**Table 1.** Solved instances and total runtime on solved instances (non-incremental).

the benchmarks in these logics currently only involve to-floating-point conversion (SMT-LIB command to_fp) from real values, which is supported.

We compare against Boolector [31] and the SMT-COMP 2022 version of Bitwuzla [26] (configuration SC22), which, at that time, was an improved and extended version of Boolector and won several divisions in all tracks of SMT-COMP 2022 [2]. Boolector did not participate in SMT-COMP 2022; thus, we use the current version of Boolector available on GitHub (commit 13a8a06d) [1]. Further, since Boolector supports neither logics involving floating-point arithmetic, nor quantified logics other than pure quantified bit-vectors, nor incremental solving when quantifiers are involved, we also compare against the SMT-COMP 2022 versions of cvc5 [3] and Z3 [25]. Both are widely used, high-performance SMT solvers with support for a wide range of theories, including the theories supported by Bitwuzla. Note that this version of cvc5 uses a sequential portfolio of multiple configurations for some logics.


**Table 2.** Solved queries and total runtime on solved queries (incremental).

We ran all experiments on a cluster with Intel Xeon E5-2620 v4 CPUs. We allocated one CPU core and 8 GB of RAM for each solver and benchmark pair, and used a 1200 s time limit, the same limit as used in SMT-COMP 2022 [2].

Table 1 shows the number of solved benchmarks for each solver in the non-incremental quantifier-free (QF) and quantified divisions. Overall, Bitwuzla solves the largest number of benchmarks in the quantified divisions, considerably improving over SC22 and Boolector with over 600 and 4,200 more solved benchmarks, respectively. Bitwuzla also takes the lead in the quantifier-free divisions, with 44 more solved instances than SC22, and more than 650 more solved benchmarks than cvc5. On the 140,438 commonly solved instances between Bitwuzla, SC22, cvc5, and Z3 over all divisions, Bitwuzla is the fastest solver with 203,838 s, SC22 is slightly slower with 208,310 s, cvc5 is 2.85× slower (586,105 s), and Z3 is 5.1× slower (1,049,534 s).

Table 2 shows the number of solved incremental check-sat queries for each solver in the incremental divisions. Again, Bitwuzla solves the largest number of queries overall and in the quantifier-free divisions. For the quantified divisions, Bitwuzla solves 42,770 queries, the second largest number of solved queries after Z3 (45,373), and more than 3,700 more queries than SC22 (39,040). On benchmarks of the ABVFPLRA division, Bitwuzla significantly outperforms SC22 due to the occurrence of nested arrays, which were unsupported in SC22.

The artifact of this evaluation is archived and available in the Zenodo open-access repository at https://zenodo.org/record/7864687.

## **5 Conclusion**

Our experimental evaluation shows that Bitwuzla is a state-of-the-art SMT solver for the quantified and quantifier-free theories of fixed-size bit-vectors, arrays, floating-point arithmetic, and uninterpreted functions. Bitwuzla has been extensively tested for robustness and correctness with Murxla [30], an API fuzzer for SMT solvers, which is an integral part of its development workflow. We have outlined several avenues for future work throughout the paper. We further plan to add support for the upcoming SMT-LIB version 3 standard, when finalized.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Decision Procedures for Sequence Theories

Artur Jeż¹, Anthony W. Lin²,³, Oliver Markgraf²(B), and Philipp Rümmer⁴,⁵

¹ University of Wrocław, Wrocław, Poland ² TU Kaiserslautern, Kaiserslautern, Germany markgraf@cs.uni-kl.de ³ Max Planck Institute for Software Systems, Kaiserslautern, Germany ⁴ University of Regensburg, Regensburg, Germany ⁵ Uppsala University, Uppsala, Sweden

Abstract. Sequence theories are an extension of theories of strings with an infinite alphabet of letters, together with a corresponding alphabet theory (e.g. linear integer arithmetic). Sequences are natural abstractions of extendable arrays, which permit a wealth of operations including append, map, split, and concatenation. In spite of the growing amount of tool support for theories of sequences by leading SMT-solvers, little is known about the decidability of sequence theories, which is in stark contrast to the state of the theories of strings. We show that the decidable theory of strings with concatenation and regular constraints can be extended to the world of sequences over an alphabet theory that forms a Boolean algebra, while preserving decidability. In particular, decidability holds when regular constraints are interpreted as parametric automata (which extend both symbolic automata and variable automata), but fails when interpreted as register automata (even over the alphabet theory of equality). When length constraints are added, the problem is Turing-equivalent to word equations with length (and regular) constraints. Similar investigations are conducted in the presence of symbolic transducers, which naturally model sequence functions like map, split, filter, *etc*. We have developed a new sequence solver, SeCo, based on parametric automata, and show its efficacy on two classes of benchmarks: (i) invariant checking on array-manipulating programs and parameterized systems, and (ii) benchmarks on symbolic register automata.

### 1 Introduction

Sequences are an extension of strings, wherein elements might range over an infinite domain (e.g., integers, strings, and even sequences themselves). Sequences

A. Jeż was supported under National Science Centre, Poland project number 2017/26/E/ST6/00191. A. Lin and O. Markgraf were supported by the ERC Consolidator Grant 101089343 (LASD). P. Rümmer was supported by the Swedish Research Council (VR) under grant 2018-04727, the Swedish Foundation for Strategic Research (SSF) under the project WebSec (Ref. RIT17-0011), and the Wallenberg project UPDATE.

are ubiquitous and commonly used data types in modern programming languages. They come under different names, e.g., Python/Haskell/Prolog lists, Java ArrayList (and to some extent Streams) and JavaScript arrays. Crucially, sequences are *extendable*, and a plethora of operations (including append, map, split, filter, concatenation, etc.) can naturally be defined and are supported by built-in library functions in most modern programming languages.

Various techniques in software model checking [30] — including symbolic execution and invariant generation — require an appropriate SMT theory to which verification conditions can be discharged. In the case of programs operating on sequences, we consequently require an SMT theory of sequences, for which leading SMT solvers like Z3 [6,38] and cvc5 [4] have already provided some basic support for over a decade. The basic design of sequence theories, as done in Z3 and cvc5, as well as in other formalisms like symbolic automata [15], is in fact quite natural. That is, sequence theories can be thought of as extensions of theories of strings with an infinite alphabet of letters, together with a corresponding alphabet theory, e.g., Linear Integer Arithmetic (LIA) for reasoning about sequences of integers. Despite this, very little is known about what is decidable over theories of sequences.

In the case of finite alphabets, sequence theories become theories over strings, in which a lot of progress has been made in the last few decades, barring the long-standing open problem of string equations with length constraints (e.g. see [26]). For example, it is known that the existential theory of concatenation over strings with regular constraints is decidable (in fact, PSpace-complete), e.g., see [17,29,36,40,43]. Here, a *regular constraint* takes the form x ∈ L(E), where E is a regular expression, mandating that the expression E matches the string represented by x. In addition, several natural syntactic restrictions — including straight-line, acyclicity, and chain-free (e.g. [1,2,5,11,12,26,35]) — have been identified, with which string constraints remain decidable in the presence of more complex string functions (e.g. transducers, replace-all, reverse, etc.). In the case of infinite alphabets, only a handful of results are available. Furia [25] showed that the existential theory of sequence equations over the alphabet theory of LIA is decidable by a reduction to the existential theory of concatenation over strings (over a finite alphabet) *without regular constraints*. Loosely speaking, a number (e.g. 4) can be represented as a string in unary (e.g. 1111), and addition is then simulated by concatenation. Therefore, his decidability result does not extend to other data domains and alphabet theories. Wang et al. [45] define an extension of the array property fragment [9] with concatenation. This fragment imposes strong restrictions, however, on the equations between sequences (here called finite arrays) that can be considered.

*"Regular Constraints" Over Sequences.* One answer to the question of what a regular constraint over sequences is comes from *automata modulo theories*. Automata modulo theories [15,16] are an elegant framework that can be used to capture the notion of regular constraints over sequences: Fix an alphabet theory T that forms a Boolean algebra; this is satisfied by virtually all existing SMT theories. In this framework, one uses formulas in T to capture multiple (possibly infinitely many) transitions of an automaton. More precisely, with each transition between two states of a *symbolic automaton* one associates a unary¹ formula ϕ(x) ∈ T. For example, a transition q →ϕ q′ with ϕ := x ≡ 0 (mod 2) over LIA corresponds to all transitions q →ᵢ q′ for any even number i. Despite their nice properties, it is known that many simple languages cannot be captured using symbolic automata; e.g., one cannot express the language consisting of sequences containing the same even number i *throughout* the sequence.
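A symbolic automaton can be prototyped with Python callables standing in for the guard formulas ϕ(x); the example automaton below is the even-number self-loop from the text, with made-up state names.

```python
def sa_accepts(transitions, initial, finals, word):
    """transitions: list of (state, guard, state); each guard is a
    predicate on a single letter, playing the role of phi(x)."""
    current = {initial}
    for letter in word:
        current = {q2 for q1, guard, q2 in transitions
                   if q1 in current and guard(letter)}
    return bool(current & finals)

# self-loop q -> q guarded by phi(x) := x = 0 (mod 2): accepts even sequences
even_loop = [("q", lambda x: x % 2 == 0, "q")]
```

Note how a single guarded transition stands for infinitely many concrete transitions, one per even number.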

There are essentially two (expressively incomparable) extensions of symbolic automata that address the aforementioned problem: (i) Symbolic Register Automata (SRA) [14] and (ii) Parametric Automata (PA) [21,23,24]. The model SRA was obtained by combining register automata [31] and symbolic automata. The model PA extends symbolic automata by allowing *free variables* (a.k.a. *parameters*) in the transition guards, i.e., a guard takes the form ϕ(x, p̄) for parameters p̄. In an accepting path of a PA, a parameter p used in multiple transitions has to be instantiated with the same value, which enables comparisons of different positions in an input sequence. For example, we can assert that only sequences of the form i∗, for an even number i, are accepted by the PA with a single transition q →ϕ q, where ϕ(x, p) := x = p ∧ x ≡ 0 (mod 2) and q is both the start and final state. PA can also be construed as an extension of both variable automata [27] and symbolic automata. SRA and PA are not comparable: while parameters can be construed as read-only registers, SRA can only compare two different positions using equality, while PA may use a general formula of the theory in such a comparison (e.g., involving order).
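A parametric automaton additionally threads a parameter valuation through the whole run. For a finite candidate set of parameter values, acceptance can be checked by trying each instantiation; real procedures reason symbolically, so this brute-force sketch is for illustration only.

```python
def pa_accepts(transitions, initial, finals, word, param_candidates):
    """Guards take (letter, parameter); a run must use one consistent
    parameter value throughout, so we try each candidate in turn."""
    for p in param_candidates:
        current = {initial}
        for letter in word:
            current = {q2 for q1, guard, q2 in transitions
                       if q1 in current and guard(letter, p)}
        if current & finals:
            return True
    return False

# single loop with phi(x, p) := x = p and x = 0 (mod 2):
# accepts exactly the sequences i* for an even number i
even_const = [("q", lambda x, p: x == p and x % 2 == 0, "q")]
```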

*Contributions.* The main contribution of this paper is to provide *the first decidable fragments of a theory of sequences parameterized in the element theory*. In particular, we show how to leverage string solvers to solve theories over sequences. We believe this is especially interesting in view of the plethora of string solvers developed over the last 10 years (e.g. see the survey [3]). This opens up new possibilities for verification tasks to be automated; in particular, we show how verification conditions for Quicksort, as well as for the Bakery and Dijkstra protocols, can be captured in our sequence theory. This formalization was done in the style of *regular model checking* [8,34], whose extension to infinite alphabets has been a long-standing challenge in the field. We also provide a new (dedicated) sequence solver, SeCo. We detail our results below.

We first show that the quantifier-free theory of sequences with concatenation and PA as regular constraints is decidable. Assuming that the theory is solvable in PSpace (which is reasonable for most SMT theories), we show that our algorithm runs in ExpSpace (i.e., double-exponential time and exponential space). We also identify conditions on the SMT theory T under which PSpace can be achieved and as an example show that Linear Real Arithmetic (LRA) satisfies those conditions. This matches the PSpace-completeness of the theory of strings with concatenation and regular constraints [18].

We consider three different variants/extensions:

<sup>1</sup> This can be generalized to any arity, which has to be set uniformly for the automaton.


We have implemented the solver SeCo based on our algorithms, and demonstrated its efficacy on two classes of benchmarks: (i) invariant checking on array-manipulating programs and parameterized systems, and (ii) benchmarks on Symbolic Register Automata (SRA) from [14]. For the first class, we model invariants for QuickSort, Dijkstra's Self-Stabilizing Protocol [20], and Lamport's Bakery Algorithm [33] as sequence constraints. For (ii), we solve decision problems for SRA on benchmarks of [14], such as emptiness, equivalence, and inclusion on regular expressions with back-references. We report promising experimental results: our solver SeCo is up to three orders of magnitude faster than the SRA solver of [14].

*Organization.* We provide a motivating example of sequence theories in Sect. 2. Section 3 contains the syntax and semantics of the sequence constraint language, as well as some basic algorithmic results. We deal with equational and regular constraints in Sect. 4. In Sect. 5, we deal with the decidable fragments with equational constraints, regular constraints, and transducers. We deal with extensions of these languages with length and SRA constraints in Sect. 6. In Sect. 7 we report our implementation and experimental results. We conclude in Sect. 8. Missing details and proofs can be found in the full version.

### 2 Motivating Example

We illustrate the use of sequence theories in verification using an implementation of QuickSort [28], shown in Listing 1. The example uses the Java Streams API and resembles typical implementations of QuickSort in functional languages; the program uses high-level operations on streams and lists like *filter* and *concatenation*. As we show, the data types and operations can naturally be modelled using a theory of sequences over integer arithmetic, and our results imply decidability of checks that would be done by a verification system.

The function quickSort processes a given list l by picking the first element as the pivot p, then creating two sub-lists left, right in which all numbers

```
/*@
 * ensures \forall int i; \result.contains(i) == l.contains(i);
 */
public static List<Integer> quickSort(List<Integer> l) {
  if (l.size() < 1) return l;
  Integer p = l.get(0);
  List<Integer> left = l.stream().filter(i -> i < p)
                           .collect(Collectors.toList());
  List<Integer> right = l.stream().skip(1).filter(i -> i >= p)
                           .collect(Collectors.toList());
  List<Integer> result = quickSort(left);
  result.add(p); result.addAll(quickSort(right));
  return result;
}
```
Listing 1. Implementation of QuickSort with Java Streams.

≥ p (resp., < p) have been eliminated. The function quickSort is then recursively invoked on the two sub-lists, and the results are finally concatenated and returned.

We focus on the verification of the post-condition shown in the beginning of Listing 1: sorting does not change the set of elements contained in the input list. This is a weaker form of the permutation property of sorting algorithms, and as such known to be challenging for verification methods (e.g., [42]). Sortedness of the result list can be stated and verified in a similar way, but is not considered here. Following the classical design-by-contract approach [37], to verify the partial correctness of the function it is enough to show that the post-condition is established in any top-level call of the function, assuming that the post-condition holds for all recursive calls. For the case of non-empty lists, the verification condition, expressed in our logic, is:

$$\begin{pmatrix} \mathtt{left} = T_{<\mathtt{l}_0}(\mathtt{l}) \land \mathtt{right} = T_{\ge \mathtt{l}_0}(\mathit{skip}_1(\mathtt{l})) \land \\ (\forall i.\; i \in \mathtt{left} \leftrightarrow i \in \mathtt{left'}) \land (\forall i.\; i \in \mathtt{right} \leftrightarrow i \in \mathtt{right'}) \land \\ \mathtt{res} = \mathtt{left'}.\left[\mathtt{l}_0\right].\mathtt{right'} \end{pmatrix} \rightarrow (\forall i.\; i \in \mathtt{l} \leftrightarrow i \in \mathtt{res})$$

The variables **l**, **res**, **left**, **right**, **left′**, **right′** range over sequences of integers, while i is a bound integer variable. The formula uses several operators that a useful sequence theory has to provide: (i) **l**₀: the first element of input list **l**; (ii) ∈ and ∉: membership and non-membership of an integer in a list, which can be expressed using symbolic parametric automata; (iii) skip₁, T<**l**₀, T≥**l**₀: sequence-to-sequence functions, which can be represented using symbolic parametric transducers; (iv) the operator "·": concatenation of several sequences. The formula otherwise is a direct model of the method in Listing 1; the variables **left′**, **right′** are the results of the recursive calls, and are concatenated to obtain the result sequence.

In addition, the formula contains quantifiers. To demonstrate validity of the formula, it is enough to eliminate the last quantifier ∀i by instantiating with a Skolem symbol k, and then instantiate the other quantifiers (left of the implication) with the same k:

$$\begin{pmatrix} \mathtt{left} = T_{<\mathtt{l}_0}(\mathtt{l}) \land \mathtt{right} = T_{\ge \mathtt{l}_0}(\mathit{skip}_1(\mathtt{l})) \land \\ (k \in \mathtt{left} \leftrightarrow k \in \mathtt{left'}) \land (k \in \mathtt{right} \leftrightarrow k \in \mathtt{right'}) \land \\ \mathtt{res} = \mathtt{left'}.\left[\mathtt{l}_0\right].\mathtt{right'} \end{pmatrix} \rightarrow (k \in \mathtt{l} \leftrightarrow k \in \mathtt{res})$$

As one of the results of this paper, we prove that this final formula is in a decidable logic. The formula can be rewritten to a disjunction of straight-line formulas, and shown to be valid using the decision procedure presented in Sect. 5.

### 3 Models

In this section, we will define our sequence constraint language, and prove some basic results regarding various constraints in the language. The definition is a natural generalization of string constraints (e.g. see [12,17,26,29,35]) by employing an alphabet theory (a.k.a. element theory), as is done in symbolic automata and automata modulo theories [15,16,44].

For simplicity, our definitions will follow a model-theoretic approach. Let σ be a vocabulary. We fix a σ-structure S = (D; I), where D can be a finite or an infinite set (i.e., the universe) and I maps each function/relation symbol in σ to a function/relation over D. The elements of our sequences will range over D. We assume that the quantifier-free theory T_S over S (including equality) is decidable. Examples of such T_S abound in SMT, e.g., LRA and LIA. We write T instead of T_S when S is clear. Our quantifier-free formulas will use *uninterpreted* T*-constants* a, b, c, . . ., and may also use variables x, y, z, . . .. (The distinction between uninterpreted constants and variables is made only for the purpose of presenting sequence constraints, as will become clear shortly.) We use C to denote the set of all uninterpreted T-constants. A formula ϕ is satisfiable if there is an assignment that maps the uninterpreted constants and variables to concrete values in D such that the formula becomes true in S.

Next, we define how we lift T to sequence constraints, using T as the *alphabet theory* (a.k.a. *element theory*). As in the case of strings (over a finite alphabet), we use standard notation like D∗ to refer to the set of all sequences over D. By default, elements of D∗ are written as standard in mathematics, e.g., 7, 8, 100, when D = Z. Sometimes we will disambiguate them by using brackets, e.g., (7, 8, 100) or [7, 8, 100]. We will use the symbol s (with/without subscripts) to refer to concrete sequences (i.e., members of D∗). We will use x, y, z to refer to T-sequence variables. Let V denote the set of all T-sequence variables, and Γ := C ∪ D. We will define constraint languages syntactically at the beginning, and will instantiate them to specific sequence operations. The theory T∗ of T-sequences consists of the following constraints:

$$\varphi ::= R(\mathbf{x}\_1, \dots, \mathbf{x}\_r) \mid \varphi \land \varphi$$

where R is an r-ary relation symbol. In our definition of each atom R below, we will specify when an assignment μ, which maps each x<sub>i</sub> to a T-sequence and each uninterpreted constant to a T-element, satisfies R. If μ satisfies all atoms, we say that μ is a *solution*, and the *satisfiability problem* is to decide whether there is a solution for a given ϕ.

A few remarks about the missing boolean operators in the constraint language above are in order. Disjunctions can be handled easily using the DPLL(T) framework (e.g. see [32]), so we have kept our theory conjunctive. As in the case of strings, negations are usually handled separately because they can sometimes (but not in all cases) be eliminated while preserving decidability.

*Equational Constraints.* A T*-sequence equation* is of the form

$$L = R$$

where each of L and R is a concatenation of concrete T-elements, uninterpreted constants, and T-sequence variables. That is, if Θ := Γ ∪ V, then L, R ∈ Θ<sup>∗</sup>.

For example, in the equation

$$0.1.\mathbf{x} = \mathbf{x}.0.1$$

the set of all solutions maps x to sequences in (0.1)<sup>∗</sup>. To make this more formal, we extend each assignment μ to a homomorphism on Θ<sup>∗</sup>. We write μ |= L = R if μ(L) = μ(R). Notice that this definition is a direct extension of that of *word equations* (e.g. see [17]), i.e., when the domain D is finite.
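As a quick sanity check, one can test candidate assignments against this equation directly; the following is a minimal sketch (a hypothetical helper, not from the paper) over integer sequences:

```python
# Minimal sketch: checking the sequence equation 0.1.x = x.0.1 over integer
# sequences by substituting candidate values for x.

def satisfies(x):
    """Does the assignment of x to the given sequence satisfy 0.1.x = x.0.1?"""
    lhs = (0, 1) + tuple(x)
    rhs = tuple(x) + (0, 1)
    return lhs == rhs

# Every sequence in (0.1)* is a solution ...
assert all(satisfies((0, 1) * k) for k in range(5))
# ... and these sequences outside (0.1)* are not:
assert not satisfies((1, 0))
assert not satisfies((0,))
```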

In most cases the disequality constraints L ≠ R can be reduced to equalities; in our case this also requires element constraints, described below.

*Element Constraints.* We allow T-formulas to constrain the uninterpreted constants. More precisely, given a T-sentence ϕ (i.e., with no free variables) that uses C as uninterpreted constants, we obtain a proposition P (i.e., a 0-ary relation) such that μ |= P iff ϕ is true in S under μ.

Negations in the equational constraints can be removed just like in the case of strings, i.e., by means of additional variables/constants and element constraints. For example, x ≠ y can be replaced by (x = z.a.x′ ∧ y = z.b.y′ ∧ a ≠ b) ∨ x = y.a.z ∨ x.a.z = y. Notice that a ≠ b is a T-formula because we assume the equality symbol in T.

*Regular Constraints.* Over strings, regular constraints are simply unary constraints U(x), where U is an automaton. The interpretation is that x is in the language of U. We define an analogue of regular constraints over sequences using *parametric automata* [21,23,24], which generalize both symbolic automata [15,16] and variable automata [27].

A *parametric automaton* (PA) over T is of the form A = (X, Q, Δ, q<sub>0</sub>, F), where X is a finite set of parameters, Q is a finite set of control states, q<sub>0</sub> ∈ Q is the initial state, F ⊆ Q is the set of final states, and Δ ⊆<sub>fin</sub> Q × T(*curr*, X) × Q. Here, *parameters* are simply uninterpreted T-constants, i.e., X ⊆ C. Formulas that appear in transitions in Δ will be referred to as *guards*, since they restrict which transitions are enabled at a given state. Note that *curr* is an uninterpreted constant that refers to the "current" position in the sequence. The semantics is quite simply defined: a sequence (d<sub>1</sub>, d<sub>2</sub>, . . . , d<sub>n</sub>) is in the language of A under the assignment of parameters μ, written (d<sub>1</sub>, . . . , d<sub>n</sub>) ∈ L<sub>μ</sub>(A), when there is a sequence of Δ-transitions

$$(q\_0, \varphi\_1(\mathit{curr}, \mathcal{X}), q\_1), (q\_1, \varphi\_2(\mathit{curr}, \mathcal{X}), q\_2), \dots, (q\_{n-1}, \varphi\_n(\mathit{curr}, \mathcal{X}), q\_n),$$

such that q<sub>n</sub> ∈ F and T |= ϕ<sub>i</sub>(d<sub>i</sub>, μ(X)) for each i. Finally, a regular constraint A(x) is satisfied by μ when μ(x) ∈ L<sub>μ</sub>(A).
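The semantics above can be sketched in a few lines; in the following toy encoding (not from the paper), guards are plain Python predicates over the current element and a parameter assignment:

```python
# Minimal sketch of parametric-automaton semantics: transitions carry
# guards as Python predicates over (curr, parameters).

def pa_accepts(delta, q0, finals, params, seq):
    """Is seq in L_mu(A) for the parameter assignment `params`?"""
    states = {q0}
    for d in seq:
        states = {q2 for (q1, guard, q2) in delta
                  if q1 in states and guard(d, params)}
    return bool(states & finals)

# One parameter p; this PA accepts sequences all of whose elements exceed p.
delta = [("q0", lambda curr, P: curr > P["p"], "q0")]
assert pa_accepts(delta, "q0", {"q0"}, {"p": 5}, [6, 7, 100])
assert not pa_accepts(delta, "q0", {"q0"}, {"p": 5}, [6, 3])
```

Different parameter assignments yield different languages for the same automaton, which is exactly the point of parameters as opposed to fixed guards.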

Note that while it is possible to complement a PA A, one has to be careful with the semantics: we treat A as a symbolic automaton, and symbolic automata are closed under Boolean operations [15]. So we are looking for μ such that μ(x) ∉ L<sub>μ</sub>(A). What we cannot do using complementation is universal quantification over the parameters; note that already the theory of strings with universal and existential quantifiers is undecidable.

We state next a lemma showing that PAs using only "local" parameters, together with equational constraints, can encode the constraint language that we have defined so far.

Lemma 1. *Satisfiability of sequence constraints with equation, element, and regular constraints can be reduced in polynomial-time to satisfiability of sequence constraints with equation and regular constraints (i.e., without element constraints). Furthermore, it can be assumed that no two regular constraints share any parameter.*

Proposition 1. *Assume that* T *is solvable in* NP *(resp.* PSpace*). Then, deciding nonemptiness of a parametric automaton over* T *is in* NP *(resp.* PSpace*).*

The proof is standard (e.g. see [21,23,24]), and only sketched here. The algorithm first nondeterministically guesses a simple path in the automaton A from the initial state q<sub>0</sub> to some final state q<sub>F</sub>. Let us say that the guards appearing on this path are ψ<sub>1</sub>(*curr*, X), . . . , ψ<sub>k</sub>(*curr*, X). We need to check whether this path is realizable by checking T-satisfiability of

$$\exists \mathcal{X}. \bigwedge\_{i=1}^{k} \exists \mathit{curr}. (\psi\_i(\mathit{curr}, \mathcal{X})).$$

It is easy to see that this is an NP (resp. NPSPACE = PSpace) procedure.
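The guess-and-check procedure can be sketched as follows; this toy version (not the paper's implementation) decides "T-satisfiability" by brute force over a small finite slice of the integers, as a stand-in for a real SMT call:

```python
# Sketch of the nonemptiness check in Proposition 1: enumerate simple
# paths to a final state, then test whether the guards on some path are
# satisfiable, i.e. exists X . /\_i exists curr . psi_i(curr, X).

def simple_paths(delta, q, finals, seen):
    """Yield the guard lists of all simple paths from q to a final state."""
    if q in finals:
        yield []
    for (p, guard, r) in delta:
        if p == q and r not in seen:
            for rest in simple_paths(delta, r, finals, seen + (r,)):
                yield [guard] + rest

def nonempty(delta, q0, finals, domain):
    for guards in simple_paths(delta, q0, finals, (q0,)):
        for x in domain:                       # candidate parameter value
            if all(any(g(c, x) for c in domain) for g in guards):
                return True
    return False

# Guards over one parameter x: first read something above x, then below it.
delta = [("q0", lambda c, x: c > x, "q1"),
         ("q1", lambda c, x: c < x, "q2")]
assert nonempty(delta, "q0", {"q2"}, range(-3, 4))
# An unsatisfiable guard makes the automaton empty:
assert not nonempty([("q0", lambda c, x: c < x < c, "q1")],
                    "q0", {"q1"}, range(-3, 4))
```

Note that, as in the proof, each guard only needs its *own* witness for *curr*, while the parameter value x is shared across the whole path.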

*Parametric Transducers.* We define a suitable extension of symbolic transducers with parameters, following the definition from Veanes et al. [44]. A *transducer constraint* is of the form y = T(x), for a parametric transducer T. A *parametric transducer* over T is of the form T = (X, Q, Δ, q<sub>0</sub>, F), where X, Q, q<sub>0</sub>, F are just like in parametric automata. Unlike parametric automata, Δ is a finite set of tuples of the form (p, (ϕ, w), q), where (p, ϕ, q) is a standard transition of a parametric automaton and w is a (possibly empty) sequence of T-terms over the variable *curr* and the constants X, e.g., w = (*curr* + 7, *curr* + 2). One can think of w as the output produced by the transition. Given an assignment μ of the parameters and the sequence variables, the constraint y = T(x) is satisfied when there is a sequence of Δ-transitions

$$(q\_0, \varphi\_1(\mathit{curr}, \mathcal{X}), w\_1, q\_1), (q\_1, \varphi\_2(\mathit{curr}, \mathcal{X}), w\_2, q\_2), \dots, (q\_{n-1}, \varphi\_n(\mathit{curr}, \mathcal{X}), w\_n, q\_n),$$

such that q<sub>n</sub> ∈ F and T |= ϕ<sub>i</sub>(d<sub>i</sub>, μ(X)), where μ(x) = (d<sub>1</sub>, . . . , d<sub>n</sub>), and finally μ(y) = μ<sub>1</sub>(w<sub>1</sub>) ··· μ<sub>n</sub>(w<sub>n</sub>),

where μ<sub>i</sub> is μ but maps *curr* to d<sub>i</sub>. The definition assumes that μ<sub>i</sub> is extended to terms and concatenations thereof by homomorphism, e.g., in LRA, if w<sub>1</sub> = (*curr* + 7, *curr* + 2) and μ<sub>1</sub> maps *curr* to 10, then w<sub>1</sub> will get mapped to (17, 12). Given a set S ⊆ D<sup>∗</sup> and an assignment μ (mapping the constants to D), we define the *pre-image* T<sup>−1</sup><sub>μ</sub>(S) of S under T with respect to μ as the set of sequences w ∈ D<sup>∗</sup> such that w′ = T(w) holds with respect to μ for some w′ ∈ S.
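A run of such a transducer can be sketched as follows; this toy version (illustrative names, deterministic single-run only) uses Python functions for both guards and outputs:

```python
# Sketch of a parametric transducer run: each transition carries a guard
# and an output function returning a (possibly empty) tuple of elements.

def run_transducer(delta, q0, finals, params, seq):
    """Return T_mu(seq), or None if the (deterministic) run gets stuck."""
    q, out = q0, []
    for d in seq:
        step = next(((w, q2) for (q1, guard, w, q2) in delta
                     if q1 == q and guard(d, params)), None)
        if step is None:
            return None
        w, q = step
        out.extend(w(d, params))
    return tuple(out) if q in finals else None

# The LRA-flavoured example from the text: output (curr + 7, curr + 2).
delta = [("q0", lambda c, P: True, lambda c, P: (c + 7, c + 2), "q0")]
assert run_transducer(delta, "q0", {"q0"}, {}, [10]) == (17, 12)
assert run_transducer(delta, "q0", {"q0"}, {}, [0, 1]) == (7, 2, 8, 3)
```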

### 4 Solving Equational and Regular Constraints

Here we present results on solving equational constraints, together with regular constraints, by a reduction to the string case, for which a wealth of results are already available. In general, this reduction causes an exponential blow-up in the resulting string constraint, which we show to be unavoidable in general. That said, we also provide a more refined analysis in the case when the underlying theory is LRA, where we can avoid this exponential blow-up.

Prelude: The Case of Strings. We start with some known results about the case of strings. The satisfiability of word equations with regular constraints is PSpace-complete [18,19]. This upper bound can be extended to the full quantifier-free theory [10]. When no regular constraints are given, the problem is only known to be NP-hard, and it is widely believed to be in NP. In the absence of regular constraints, without loss of generality Γ can be assumed to contain only letters from the equations; this is not the case in the presence of regular constraints. The algorithm solving word equations [19] does not need explicit access to Γ: it is enough to know whether there is a letter which labels a given set of transitions in the NFAs used in the regular constraints. In principle, there could be exponentially many different (i.e., inducing different transitions in the NFAs) letters. When oracle access to such an alphabet is provided, the satisfiability can still be decided in PSpace: while not explicitly claimed, this is exactly the scenario in [19, Sect. 5.2].

Other constraints are also considered for word equations; perhaps the most widely known are the length constraints, which are of the form ∑<sub>x∈V</sub> a<sub>x</sub>·|x| ≤ c, where the a<sub>x</sub> and c are integer constants and |x| denotes the length |μ(x)|, with the obvious semantics. It is an open problem whether word equations with length constraints are decidable, see [26].

Reduction to Word Equations. We assume Lemma 1, i.e., that the parameters used in different automata-based constraints are pairwise different. In particular, when looking for a satisfying assignment μ we can first fix the assignment for X and then try to extend it to V. To avoid confusion, we call this partial assignment π : X → D.

Consider the set Φ of all atoms in all guards in the regular constraints, together with the formulas {x = c} over all constants c ∈ D that appear in the equational constraints, and the negations of both types of formulas. Fix an assignment π : X → D. The type type<sub>π</sub>(a) of a (under assignment π) is the set of formulas in Φ satisfied by a, i.e., {ϕ ∈ Φ : ϕ(π(X), a) holds}. Clearly there are at most exponentially many different types (for a fixed π). A type t is realizable (for π) when t = type<sub>π</sub>(a) for some a, and it is then realized by a.
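The notion of a type can be sketched concretely; in the following toy encoding (illustrative names, atoms as Python predicates), a type is the set of indices of the atoms of Φ that an element satisfies:

```python
# Sketch: the type of an element a under a fixed parameter assignment pi
# is the set of atoms of Phi that a satisfies, stored as a frozenset
# (equivalently, a bitvector).

def type_of(a, pi, phi):
    return frozenset(i for i, atom in enumerate(phi) if atom(pi, a))

# Phi: two atoms and their negations; one parameter p with pi = {"p": 5}.
phi = [lambda pi, a: a > pi["p"], lambda pi, a: not a > pi["p"],
       lambda pi, a: a == 0,      lambda pi, a: a != 0]
pi = {"p": 5}
assert type_of(7, pi, phi) == type_of(100, pi, phi)   # same type
assert type_of(0, pi, phi) != type_of(7, pi, phi)     # different type
# D_pi: one representative per realizable type found in a sample of D.
d_pi = {type_of(a, pi, phi): a for a in range(-10, 11)}
assert len(d_pi) <= 2 ** len(phi)   # at most exponentially many types
```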

If the constraints are satisfiable (for some parameter assignment π) then they are satisfiable over a subset D<sub>π</sub> ⊆<sub>fin</sub> D, in the sense that we assign to uninterpreted constants elements of D<sub>π</sub> and to T-sequence variables elements of D<sub>π</sub><sup>∗</sup>, where D<sub>π</sub> is created by taking (arbitrarily) one element of each realizable type. Note that for each constant c in the equational constraints there is a formula "x = c" in Φ; in particular type<sub>π</sub>(c) is realizable (only by c) and so c ∈ D<sub>π</sub>.

Lemma 2. *Given a system of constraints and a parameter assignment* π*, let* D<sub>π</sub> ⊆ D *be obtained by choosing (arbitrarily) for each realizable type a single element of this type. Then the set of constraints is satisfiable (for* π*) over* D *if and only if it is satisfiable (for* π*) over* D<sub>π</sub>*. To be more precise, there is a letter-to-letter homomorphism* ψ : D<sup>∗</sup> → D<sub>π</sub><sup>∗</sup> *such that if* μ *is a solution of a system of constraints then* ψ ◦ μ *is also a solution.*

The proof can be found in the full version; the intuition is clear: we map each letter a ∈ D to the unique letter in D<sub>π</sub> of the same type.

Once the assignment is fixed (to π) and the domain restricted to a finite set (D<sub>π</sub>), the equational and regular constraints reduce to word equations with regular constraints: treat D<sub>π</sub> as a finite alphabet, and for a parametric automaton A = (X, Q, Δ, q<sub>0</sub>, F) create an NFA A′ = (D<sub>π</sub>, Q, Δ′, q<sub>0</sub>, F), i.e., over the alphabet D<sub>π</sub>, with the same set of states Q, the same starting state q<sub>0</sub> and accepting states F, and the transition relation defined as (q, a, q′) ∈ Δ′ if and only if there is (q, ϕ(*curr*, X), q′) ∈ Δ such that ϕ(a, π(X)) holds, i.e., we can move from q to q′ by a in A′ if and only if we can make this move in A under the assignment π. Clearly, from the construction:

Lemma 3. *Given an assignment of parameters* π*, let* D<sub>π</sub> *be a set from Lemma 2,* A *be a parametric automaton and* A′ *the automaton constructed above. Then*

$$L\_{\pi}(\mathcal{A}) \cap D\_{\pi}^\* = L(\mathcal{A}') \text{ .}$$
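The construction behind this lemma can be sketched in a few lines; the following toy encoding (not the paper's implementation; guards are plain predicates rather than T-formulas) instantiates every guard on every letter of D<sub>π</sub>:

```python
# Sketch of the Lemma 3 construction: once pi is fixed and the domain
# restricted to the finite D_pi, a parametric automaton becomes an
# ordinary NFA over the alphabet D_pi.

def pa_to_nfa(pa_delta, d_pi, pi):
    """Instantiate every guard on every letter of D_pi."""
    return {(q1, a, q2) for (q1, guard, q2) in pa_delta
            for a in d_pi if guard(a, pi)}

def nfa_accepts(delta, q0, finals, word):
    states = {q0}
    for a in word:
        states = {q2 for (q1, b, q2) in delta if q1 in states and b == a}
    return bool(states & finals)

pa = [("q0", lambda curr, pi: curr > pi["p"], "q0")]
d_pi = {3, 7}                    # one representative per realizable type
nfa = pa_to_nfa(pa, d_pi, {"p": 5})
assert nfa == {("q0", 7, "q0")}  # only the letter 7 satisfies curr > p
assert nfa_accepts(nfa, "q0", {"q0"}, [7, 7])
assert not nfa_accepts(nfa, "q0", {"q0"}, [3])
```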

We can rewrite the parametric-automata constraints as regular constraints and treat equational constraints as word equations (over the finite alphabet D<sub>π</sub>). From Lemma 2 and Lemma 3 it follows that the original constraints have a solution for the assignment π if and only if the constructed system of constraints has a solution. Therefore, once the appropriate assignment π is fixed, the satisfiability of the constraints can be verified [19]. It turns out that we do not need the actual π: it is enough to know which types are realizable for it, which translates to an exponential-size formula. We will use the letter τ to denote a subset of 2<sup>Φ</sup>; the idea is that τ = {type<sub>π</sub>(a) : a ∈ D} ⊆ 2<sup>Φ</sup>, and if different π, π′ give the same sets of realizable types, then either both yield a satisfying assignment or neither does. Hence it is enough to focus on τ and not on the actual π.

Lemma 4. *Given a system of equational and regular constraints we can nondeterministically reduce them to a formula of a form*

$$\exists\_{t \in \tau} a\_t \in D. \exists \mathcal{X} \in D^+. \bigwedge\_{t \in \tau} \bigwedge\_{\varphi \in t} \varphi(\mathcal{X}, a\_t) \; , \tag{1}$$

*where* τ ⊆ 2<sup>Φ</sup> *is of at most exponential size, and a system of word equations with regular constraints of linear size and over an* |τ|*-size alphabet, using auxiliary* O(n|τ|) *space. The solutions of the latter word equations (for which also* (1) *holds) are solutions of the original system, by appropriate identification of symbols.*

*Proof.* We guess the set τ of types of the assignment of parameters π, i.e., τ = {type<sub>π</sub>(a) : a ∈ D} such that there is an assignment μ extending π; note that as Φ has linearly many atoms and τ ⊆ 2<sup>Φ</sup>, |τ| may be of exponential size in general. Formula (1) verifies the guess: we validate whether there are values of X such that for each type t ∈ τ there is a value a<sub>t</sub> such that type<sub>π</sub>(a<sub>t</sub>) = t.

Let D<sub>π</sub> be a set having one symbol per type in τ, as in Lemma 2; note that this includes all constants in the equational constraints. The algorithm will not have access to particular values; instead we store each t ∈ τ, say as a bitvector describing which atoms in Φ this letter satisfies. In particular, |D<sub>π</sub>| = |τ| and it is at most exponential. In the following we will consider only solutions over D<sub>π</sub>.

For each a ∈ D<sub>π</sub> we can validate which transitions in A it can take: each transition is labelled by a guard which is a conjunction of atoms from Φ, and each such atom either is in type<sub>π</sub>(a) or is not. Hence we can treat A as an NFA over D<sub>π</sub>. We do not need to construct nor store it; we can use A directly: when we want to make a transition by ϕ(X, a), we look up whether each atom of ϕ is in type<sub>π</sub>(a) or not. Similarly, the constraint A(x) is restricted to x ∈ L<sub>π</sub>(A), and for x ∈ D<sub>π</sub><sup>∗</sup> this is a usual regular constraint.

We treat equational constraints as word equations over alphabet Dπ.

Concerning the correctness of the reduction: if the system of word equations (with regular constraints) is satisfiable and formula (1) is also satisfiable, then there is a satisfying assignment μ over D<sub>π</sub> and D<sub>π</sub><sup>∗</sup>; in particular, there is an assignment of parameters for which there are letters of the given types. (Note that in principle μ could induce more types, i.e., there could be a value a such that type<sub>μ</sub>(a) ∉ τ and so it is not represented in D<sub>π</sub>, but this is fine: enlarging the alphabet cannot invalidate a solution.) The transitions for a<sub>t</sub> in the automata after the reduction are then the same as in the corresponding parametric automata for the assignment π; this is guaranteed by the satisfiability of (1) and the way we construct the instance, see Lemma 3.

On the other hand, when there is a solution of the input constraints, there is one for some assignment of parameters π. Hence, by Lemma 2, there is a solution over D<sub>π</sub>. The algorithm guesses τ = {type<sub>π</sub>(a) : a ∈ D}, and (1) is true for it. Then by Lemma 2 there is a solution over the D<sub>π</sub> constructed in the reduction, and by Lemma 3 the regular constraints define the same subsets of D<sub>π</sub><sup>∗</sup> whether interpreted as parametric automata or as NFAs.

Theorem 1. *If theory* T *is in* PSpace *then sequence constraints are in* ExpSpace*.*

*If* τ *is polynomial size and the formula* (1) *can be verified in* PSpace*, then sequence constraints can be verified in* PSpace*.*

One of the difficulties in deciding sequence constraints using the word-equations approach is the size of the set of realizable types τ, which can be exponential. For some concrete theories it is known to be smaller, and a better upper bound on the complexity then follows. For instance, it is easy to show that for LRA there are linearly many realizable types, which implies a PSpace upper bound.

Corollary 1. *Sequence constraints for Linear Real Arithmetic are in* PSpace*.*

In general, the ExpSpace upper bound from Theorem 1 cannot be improved, as even non-emptiness of intersection of parametric automata is ExpSpace-complete for some theories decidable in PSpace. This is in contrast to the case of symbolic automata, for which the non-emptiness of intersection (for a theory T decidable in PSpace) is in PSpace. This shows the importance of parameters in our lower bound proof.

Theorem 2. *There are theories with existential fragment decidable in* PSpace *for which non-emptiness of intersection of parametric automata is* ExpSpace*-complete.*

When no regular constraints are allowed, we can solve the equational and element constraints in PSpace (note that we do not use Lemma 1).

Theorem 3. *For a theory* T *decidable in* PSpace*, the element and equational constraints (so no regular constraints) can be decided in* PSpace*.*

### 5 Algorithm for Straight-Line Formulas

It is known that adding finite transducers to word equations results in an undecidable model (e.g. see [35]). Therefore, we extend the *straight-line restriction* [12,35] to sequences, and show that it suffices to recover decidability for equational constraints together with regular and transducer constraints. In fact, we will show that satisfiability in the straight-line fragment is solvable in doubly exponential time and is ExpSpace-hard, if T is solvable in PSpace. It has been observed that the straight-line fragment for the theory of strings already covers many interesting benchmarks [12,35], and similarly many properties of sequence-manipulating programs can be proven using the fragment, including the QuickSort example from Sect. 2 and other benchmarks shown in Sect. 7.

The Straight-Line Fragment SL. We start by defining recognizable formulas over sequences, followed by the syntactic and semantic restrictions on our constraint language. This definition follows closely the definition of recognizable relations over finite alphabets, except that we replace finite automata with parametric automata.

Definition 1 (Recognizable formula). *A formula* R(*x*1,..., *x*r) *is* recognizable *if it is equivalent to a positive Boolean combination of regular constraints.*

Note that this is simply a generalization of regular constraints to multiple variables; in particular, a 1-ary recognizable formula can be turned into a regular constraint, since regular constraints are closed under intersection and union.

To define the straight-line fragment, we use the approach of [12]; that is, the fragment is defined in terms of "feasibility of a symbolic execution". Here, a *symbolic execution* is just a sequence of assignments and assertions, whereas the *feasibility* problem amounts to deciding whether there are concrete values of the variables so that the symbolic execution can be run and none of the assertions are violated. We now make this intuition formal. A symbolic execution is syntactically generated by the following grammar:

$$S \quad ::= \quad \mathbf{y} := f(\mathbf{x}\_1, \dots, \mathbf{x}\_k, \mathcal{X}) \mid \mathbf{assert}(R(\mathbf{x}\_1, \dots, \mathbf{x}\_r)) \mid \mathbf{assert}(\varphi) \mid S; S \tag{2}$$

where f : (D<sup>∗</sup>)<sup>k</sup> × D<sup>|X|</sup> → D<sup>∗</sup> is a function, R are recognizable formulas, and ϕ are element constraints.

The symbolic execution S can be turned into a sequence constraint as follows. Firstly, we can turn S into the standard *Static Single Assignment (SSA)* form by introducing new variables on the left-hand side of assignments. For example, y := f(x); y := g(y) becomes y<sub>1</sub> := f(x); y := g(y<sub>1</sub>). Then, in the resulting constraint, each variable appears *at most once* on the left-hand side of an assignment. That way, we can simply replace each assignment symbol := with an equality symbol =. We then treat each sequential composition as the conjunction operator ∧ and each assertion as a conjunct. Note that individual assertions are already sequence constraints. Next, we define when an interpretation μ satisfies the constraint y = f(x<sub>1</sub>, . . . , x<sub>r</sub>, X):

$$\mu \models \mathbf{y} = f(\mathbf{x}\_1, \dots, \mathbf{x}\_r, \mathcal{X}) \quad \text{iff} \quad \mu(\mathbf{y}) = f(\mu(\mathbf{x}\_1), \dots, \mu(\mathbf{x}\_r), \mu(\mathcal{X})).$$

Note that '=' on the l.h.s. is syntactic, while the '=' on the r.h.s. is in the metalanguage. The definition of the semantics of the language is now inherited from Sect. 3.
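The SSA renaming step above can be sketched as follows; this is a toy encoding (statements as `(target, rhs_vars)` pairs, all names illustrative), not the paper's implementation:

```python
# Sketch of the SSA step: rename earlier assignments to a reassigned
# variable so that each variable appears at most once on the left-hand
# side, as in  y := f(x); y := g(y)  ~>  y1 := f(x); y := g(y1).

def to_ssa(program):
    total = {}
    for t, _ in program:
        total[t] = total.get(t, 0) + 1
    seen, cur, out = {}, {}, []
    for t, rhs in program:
        rhs = [cur.get(v, v) for v in rhs]     # refer to latest versions
        seen[t] = seen.get(t, 0) + 1
        # the last assignment keeps the original name, earlier ones are fresh
        cur[t] = t if seen[t] == total[t] else f"{t}{seen[t]}"
        out.append((cur[t], rhs))
    return out

prog = [("y", ["x"]), ("y", ["y"])]            # y := f(x); y := g(y)
assert to_ssa(prog) == [("y1", ["x"]), ("y", ["y1"])]
assert to_ssa([("z", ["x"])]) == [("z", ["x"])]  # single assignments kept
```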

In addition to the syntactic restrictions, we also need a semantic condition: in our language, we only permit functions f such that the pre-image of each regular constraint under f is effectively a recognizable formula:

(RegInvRel) A function <sup>f</sup> is permitted if for each regular constraint <sup>A</sup>(y), it is possible to compute a recognizable formula that is equivalent to the formula <sup>∃</sup><sup>y</sup> : <sup>A</sup>(y) <sup>∧</sup> <sup>y</sup> <sup>=</sup> <sup>f</sup>(x1,..., <sup>x</sup>r, <sup>X</sup> ).

Two functions satisfying (RegInvRel) are the concatenation function x := y.z (here <sup>y</sup> could be the same as <sup>z</sup>) and parametric transducers <sup>y</sup> := <sup>T</sup> (x). We will only use these two functions in the paper, but the result is generalizable to other functions.

Proposition 2. *Given a regular constraint* <sup>A</sup>(*y*) *and a constraint <sup>y</sup>* <sup>=</sup> *<sup>x</sup>*.*z, we can compute a recognizable formula* <sup>ψ</sup>(*x*, *<sup>z</sup>*) *equivalent to* <sup>∃</sup>*<sup>y</sup>* : <sup>A</sup>(*y*) <sup>∧</sup> *<sup>y</sup>* <sup>=</sup> *<sup>x</sup>*.*z. Furthermore, this can be achieved in polynomial time.*

The proof of this proposition is exactly the same as in the case of strings, e.g., see [12,35].
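The idea behind the concatenation pre-image can be sketched with ordinary NFAs (a toy stand-in for parametric automata; names are illustrative): y = x.z and A(y) hold iff, for some state q of A, x drives A from its start state to q and z drives A from q to a final state.

```python
# Sketch of the Proposition 2 idea: split the run of A on y = x.z at an
# intermediate state q, yielding a disjunction over all choices of q.

def accepts(delta, start, finals, word):
    states = {start}
    for a in word:
        states = {q2 for (q1, b, q2) in delta if q1 in states and b == a}
    return bool(states & finals)

def concat_preimage_holds(delta, q0, finals, all_states, x, z):
    return any(accepts(delta, q0, {q}, x) and accepts(delta, q, finals, z)
               for q in all_states)

# A over {0, 1} accepting words that contain a 1:
delta = {("q0", 0, "q0"), ("q0", 1, "q1"),
         ("q1", 0, "q1"), ("q1", 1, "q1")}
assert concat_preimage_holds(delta, "q0", {"q1"}, ["q0", "q1"], [0], [1, 0])
assert not concat_preimage_holds(delta, "q0", {"q1"}, ["q0", "q1"], [0], [0])
```

This split-at-a-state view is exactly what the A<sup>p,F</sup> notation in Example 1 below captures, and it explains the polynomial cost: only linearly many split states need to be considered per concatenation.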

Proposition 3. *Given a regular constraint* <sup>A</sup>(*y*) *and a parametric transducer constraint <sup>y</sup>* <sup>=</sup> <sup>T</sup> (*x*)*, we can compute a regular constraint* <sup>A</sup> (*x*) *that is equivalent to* <sup>∃</sup>*<sup>y</sup>* : <sup>A</sup>(*y*) <sup>∧</sup> *<sup>y</sup>* <sup>=</sup> <sup>T</sup> (*x*)*. This can be achieved in exponential time.*

The construction in Proposition 3 is essentially the same as the pre-image computation of a symbolic automaton under a symbolic transducer [44]. The complexity is exponential in the maximum number of output symbols of a single transition (i.e. the maximum length of w in the transducer), which is in practice a small natural number.

The following is our main theorem on the SL fragment with equational constraints, regular constraints, and transducers.

Theorem 4. *If* T *is solvable in* PSpace*, then the SL fragment with concatenation and parametric transducers over* T *is in 2-*ExpTime *and is* ExpSpace*-hard.*

*Proof.* We give a decision procedure. We assume that S is already in SSA form (i.e., each variable appears at most once on the left-hand side). Let us assume that S is of the form S′; y := f(x<sub>1</sub>, . . . , x<sub>r</sub>), for some symbolic execution S′. Without loss of generality, we may assume that each recognizable constraint is of the form A(x). This is no limitation: (1) since each R in an assertion is a recognizable formula, we simply have to "guess" one of the implicants for each R, and (2) assert(ψ<sub>1</sub> ∧ ψ<sub>2</sub>) is equivalent to assert(ψ<sub>1</sub>); assert(ψ<sub>2</sub>).

Assume now that {A<sub>1</sub>(y), . . . , A<sub>m</sub>(y)} are all the regular constraints on y in S. By our assumption, it is possible to compute a recognizable formula equivalent to

$$\psi(\mathbf{x}\_1, \dots, \mathbf{x}\_r) := \exists \mathbf{y} : \bigwedge\_{i=1}^m \mathcal{A}\_i(\mathbf{y}) \wedge \mathbf{y} = f(\mathbf{x}\_1, \dots, \mathbf{x}\_r).$$

There are two ways to see this. The first is that regular constraints are closed under intersection; this is in general computationally quite expensive because of a product-automaton construction before applying the pre-image computation. A better way is to observe that ψ is equivalent to the conjunction of the ψ<sub>i</sub> over i = 1, . . . , m, where

$$\psi\_i := \exists \mathbf{y} : \mathcal{A}\_i(\mathbf{y}) \wedge \mathbf{y} = f(\mathbf{x}\_1, \dots, \mathbf{x}\_r).$$

Fig. 1. A<sub>0</sub> accepts all words not containing *k* and A<sub>1</sub> accepts all words containing *k*.

By our semantic condition, we can compute recognizable formulas ψ′<sub>1</sub>, . . . , ψ′<sub>m</sub> equivalent to ψ<sub>1</sub>, . . . , ψ<sub>m</sub> respectively. Therefore, we simply replace S by

> S′; assert(ψ′<sub>1</sub>); ··· ; assert(ψ′<sub>m</sub>),

in which every occurrence of y has been completely eliminated. Applying the above variable elimination iteratively, we end up with a conjunction of regular constraints and element constraints, which as we saw in Sect. 4 is decidable.

*Example 1.* We consider the example from Sect. 2 where a weaker form of the permutation property is shown for QuickSort. The formula that has to be proven is a disjunction of straight-line formulas and in the following we execute our procedure only on one disjunct without redundant formulas:

$$\mathbf{assert}(\mathcal{A}\_0(\mathbf{left}')); \mathbf{assert}(\mathcal{A}\_0(\mathbf{right}')); \mathbf{res} = \mathbf{left}'.[\mathbf{l}\_0].\mathbf{right}'; \mathbf{assert}(\mathcal{A}\_1(\mathbf{res}))$$

We model L(A<sub>1</sub>) as the language of all words which contain one letter equal to k, and L(A<sub>0</sub>) as the language of all words not containing k, where k is an uninterpreted constant, so a single element. See Fig. 1. We begin by removing the operation **res** = **left**′.[**l**<sub>0</sub>].**right**′. The product automaton for all assertions that contain **res** is just A<sub>1</sub>. Hence, we can remove the assertion assert(A<sub>1</sub>(**res**)). The concatenation function satisfies RegInvRel, and the pre-image g can be represented by

$$\bigvee\_{0 \le i,j \le 1} \mathcal{A}\_1^{q\_0,\{q\_i\}}(\mathbf{left}') \wedge \mathcal{A}\_1^{q\_i,\{q\_j\}}([\mathbf{l}\_0]) \wedge \mathcal{A}\_1^{q\_j,\{q\_1\}}(\mathbf{right}'),$$

where A<sub>i</sub><sup>p,F</sup> is A<sub>i</sub> with the start state set to p and the final states to F.

In the next step, the assertion g is added to the program and all assertions containing **res** and the concatenation function are removed.

assert(A<sub>0</sub>(**left**′)); assert(A<sub>0</sub>(**right**′)); assert(g(**left**′, [**l**<sub>0</sub>], **right**′))

From here, we pick a tuple from g, let's say i = j = 1, and obtain

$$\begin{aligned} \mathsf{assert}(\mathcal{A}\_0(\mathsf{left}')); \mathsf{assert}(\mathcal{A}\_0(\mathsf{right}')); \mathsf{assert}(\mathsf{left}' \in \mathcal{A}\_1^{q\_0,\{q\_1\}});\\ \mathsf{assert}([\mathsf{l}\_0] \in \mathcal{A}\_1^{q\_1,\{q\_1\}}); \mathsf{assert}(\mathsf{right}' \in \mathcal{A}\_1^{q\_1,\{q\_1\}}) \end{aligned}$$

Finally, the product automata A<sub>0</sub> × A<sub>1</sub><sup>q0,{q1}</sup> and A<sub>0</sub> × A<sub>1</sub><sup>q1,{q1}</sup> are computed for the variables **left**′, **right**′, and a non-emptiness check over the product automata and the automaton for [**l**<sub>0</sub>] is done. The procedure will find no combination of paths that can be satisfied, since **left**′ is forced by A<sub>0</sub> to accept no words containing k, while A<sub>1</sub><sup>q0,{q1}</sup> only accepts after reading a k. Next, the procedure needs to exhaust all tuples from (A<sub>1</sub><sup>q0,{qi}</sup>, A<sub>1</sub><sup>qi,{qj}</sup>, A<sub>1</sub><sup>qj,{q1}</sup>)<sub>0≤i,j≤1</sub> before it is proven that this disjunct is unsatisfiable.
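The final emptiness check of this example can be reconstructed concretely; in the toy sketch below (hypothetical encoding, not the paper's implementation) the uninterpreted constant k is fixed to a concrete value, so the automata of Fig. 1 become ordinary NFAs over a three-letter alphabet:

```python
# Toy reconstruction of the emptiness check in Example 1 for the tuple
# i = j = 1: left' must satisfy both A0 (no k) and A1^{q0,{q1}} (reads k).

K = 42
ALPHABET = (0, 1, K)
# A0: all words not containing k.
a0 = {("s", c, "s") for c in ALPHABET if c != K}
# A1 with start q0 and final q1: all words containing k.
a1 = ({("q0", c, "q0") for c in (0, 1)}
      | {("q0", K, "q1")}
      | {("q1", c, "q1") for c in ALPHABET})

def product_nonempty(n1, s1, f1, n2, s2, f2, alphabet):
    """Breadth-first reachability in the product automaton."""
    frontier, seen = {(s1, s2)}, set()
    while frontier:
        if any(p in f1 and q in f2 for (p, q) in frontier):
            return True
        seen |= frontier
        frontier = {(p2, q2)
                    for (p1, q1) in frontier for c in alphabet
                    for (pa, ca, p2) in n1 if (pa, ca) == (p1, c)
                    for (pb, cb, q2) in n2 if (pb, cb) == (q1, c)} - seen
    return False

# A1 alone is nonempty (just read a k) ...
assert product_nonempty(a1, "q0", {"q1"}, a1, "q0", {"q1"}, ALPHABET)
# ... but left' cannot both avoid k (A0) and read one (A1^{q0,{q1}}):
assert not product_nonempty(a0, "s", {"s"}, a1, "q0", {"q1"}, ALPHABET)
```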

### 6 Extensions and Undecidability

Length Constraints. We consider the extension of our model by allowing *length constraints* on the sequence variables: for each sequence variable x we consider an associated length variable ℓ<sub>x</sub>; let the set of length variables be L = {ℓ<sub>x</sub> : x ∈ V}. We extend μ to L, assigning natural numbers to the length variables. The length constraints are of the form ∑<sub>x∈V</sub> a<sub>x</sub>·ℓ<sub>x</sub> ? 0, where ? ∈ {<, ≤, =, ≠, ≥, >} and each a<sub>x</sub> is an integer constant, i.e., linear arithmetic formulas on the length variables. The semantics is natural: we require that |μ(x)| = μ(ℓ<sub>x</sub>) (the assigned values are the true lengths of the sequences) and that μ(L) satisfies each length constraint.
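This semantics can be checked mechanically; the following is a minimal sketch (illustrative names) that evaluates one length constraint against an assignment of sequences, forcing ℓ<sub>x</sub> to |μ(x)|:

```python
# Sketch of the length-constraint semantics: sum_x a_x * l_x ? 0, checked
# against an assignment mu of sequences to variables.
import operator

OPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
       "!=": operator.ne, ">=": operator.ge, ">": operator.gt}

def length_constraint_holds(coeffs, op, mu):
    """coeffs: {var: a_x}; mu: {var: sequence}; l_x is forced to |mu(x)|."""
    return OPS[op](sum(a * len(mu[x]) for x, a in coeffs.items()), 0)

mu = {"x": (1, 2, 3), "y": (4, 5)}
assert length_constraint_holds({"x": 1, "y": -1}, ">", mu)       # 3 - 2 > 0
assert not length_constraint_holds({"x": 1, "y": -2}, ">=", mu)  # 3 - 4 < 0
```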

There is, however, another possible extension: if the theory T<sup>S</sup> is Presburger arithmetic, then the parametric automata could use the values ℓ<sub>x</sub>. We first deal with the more generic, though restricted, case when this is not allowed: then all reductions from Sect. 4 generalize, and we can reduce to word equations with regular and length constraints. However, the decidability status of this problem is unknown. When we consider Presburger arithmetic and allow the automata to employ the length variables, it turns out that we can interpret formula (1) as a collection of length constraints, and again we reduce to word equations with regular and length constraints.

*Automata Oblivious of Lengths.* We first consider the setting in which the length variables $L$ can only be used in length constraints. It is routine to verify that the reductions from Sect. 4 generalize to the case of length constraints: it is possible to first fix $\mu$ on the parameters, calling it again $\pi$. Then Lemma 2 shows that each solution $\mu$ can be mapped by a letter-to-letter homomorphism to a finite alphabet $D_\pi$; since the mapping is letter-to-letter, the lengths of the substitutions for variables are preserved, and so is the satisfiability/unsatisfiability of length constraints. Hence Lemma 2 still holds when length constraints are allowed. Similarly, Lemma 3 is not affected by length constraints, and Lemma 4 deals with regular and equational constraints while ignoring the other possible constraints, so it holds as well. Therefore, when length constraints are allowed, the resulting word equations use regular and length constraints.

Unfortunately, the decidability of word equations with linear length constraints (even without regular constraints) is a notorious open problem. Thus instead of decidability, we get Turing-equivalent problems.

Theorem 5. *Deciding regular, equational and length constraints for sequences of a decidable theory* T *is Turing-equivalent to word equations with regular and length constraints.*

*Automata Aware of the Sequence Lengths.* We now consider the case when the underlying theory $T_S$ is Presburger arithmetic, i.e., $S$ is the natural numbers and we can use addition, the constants 0 and 1, comparisons, and variables. The additional functionality of a parametric automaton $A$ is that $\Delta \subseteq_{\text{fin}} Q \times T(\textit{curr}, X, L) \times Q$, i.e., the guards can also use the length variables; the semantics is extended in the natural way.

Then the type $\text{type}_\pi(a)$ of $a \in \mathbb{N}$ now depends on the values of $\mu$ on $X$ and $L$, hence we denote by $\pi$ the restriction of $\mu$ to $X \cup L$. Then Lemmas 2 and 3 still hold when we fix $\pi$. Similarly, Lemma 4 holds, but the analogue of (1) now also uses the length variables, which also appear in the length constraints. Such a formula can be seen as a collection of length constraints over the original length variables $L$ as well as over $X \cup \{a_t : t \in \tau\}$. Hence we validate this formula as part of the word equations with length constraints. Note that $a_t$ has two roles: as a letter in $D_\pi$ and as a length variable. However, the connection is encoded in the formula from the reduction (the analogue of (1)), so we can use two different sets of symbols.

Theorem 6. *Deciding conjunction of regular, equational and length constraints for sequences of natural numbers with Presburger arithmetic, where the regular constraints can use length variables, is Turing-equivalent to word equations with regular and (up to exponentially many) length constraints.*

Undecidability of Register Automata Constraints. One could use more powerful automata for regular constraints; one popular such model is register automata. Informally, such an automaton has $k$ registers $r_1,\ldots,r_k$, and its transitions depend on the state and on the value of a formula over the registers and *curr*, the value being read [23]; the registers can be updated, to *curr* or to one of the registers' values, as specified by the transition. In "classic" register automata, guards can only use equality and inequality between the registers and *curr*; the SRA model allows more powerful atoms. We show that combining sequence constraints with register automata constraints (which use quantifier-free formulas with equality and inequality as the only atoms, i.e., do not employ the SRA extension) leads to undecidability (over an infinite domain $D$).
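For concreteness, the "classic" register-automata model just described can be simulated directly. The automaton below is an illustrative assumption (not taken from the paper): it uses a single register and only equality/inequality guards, and accepts exactly the words in which some value occurs twice:

```python
# Transition: (state, guard, update, next_state).
#   guard(regs, cur) -> bool uses only (in)equality tests;
#   update(regs, cur) -> new register contents.

def run(transitions, init, finals, word, k=1):
    # Nondeterministic simulation: a configuration is (state, registers).
    confs = {(init, (None,) * k)}
    for cur in word:
        confs = {(q2, upd(regs, cur))
                 for (q, regs) in confs
                 for (q1, grd, upd, q2) in transitions
                 if q1 == q and grd(regs, cur)}
    return any(q in finals for (q, _) in confs)

# Example automaton: nondeterministically store a value in r1,
# accept iff the same value is read again later.
T = [
    ("q0", lambda r, c: True,      lambda r, c: r,    "q0"),  # skip
    ("q0", lambda r, c: True,      lambda r, c: (c,), "q1"),  # store cur
    ("q1", lambda r, c: c != r[0], lambda r, c: r,    "q1"),  # wait
    ("q1", lambda r, c: c == r[0], lambda r, c: r,    "q2"),  # match
    ("q2", lambda r, c: True,      lambda r, c: r,    "q2"),  # accept sink
]

assert run(T, "q0", {"q2"}, [3, 1, 4, 1, 5])   # 1 occurs twice
assert not run(T, "q0", {"q2"}, [2, 7, 1, 8])  # all values distinct
```

Over an infinite domain the guard values range over infinitely many elements, which is precisely what the undecidability argument exploits.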

Theorem 7. *Satisfiability of equational constraints and register automata constraints, which use equality and inequality only, over infinite domain, is undecidable.*

## 7 Implementations, Optimizations and Benchmarks

Implementation. We have implemented our decision procedure for problems in the constraint language SL for the theory of sequences in a new tool SeCo (Sequence Constraint Solver), built on top of the SMT solver Princess [41]. We extend a publicly available library for symbolic automata and transducers [13] to parametric automata and transducers by connecting them to the uninterpreted constants in our theory of sequences. Our tool supports symbolic transducers, concatenation of sequences, and reversal of sequences. Any additional function satisfying RegInvRel, such as a replace function that replaces only the first, leftmost-longest match, can be added in the future.

Our algorithm is an adaptation of the tool OSTRICH [12] and closely follows the proof of Theorem 4. To summarize the procedure: a depth-first search is employed, removing all functions in the given input and splitting on the pre-images of those functions. When removing a function, new assertions are added for the pre-image constraints. After all functions have been removed and only assertions are left, a non-emptiness check is called over all parametric automata that encode the assertions. If the check is successful, a corresponding model can be constructed; otherwise the procedure computes a conflict set and backjumps to the last split in the depth-first search.<sup>2</sup>

Benchmarks. We have performed experiments on two benchmark suites. The first one concerns the verification of properties of programs manipulating sequences. The second benchmark suite compares our tool against an algorithm using symbolic register automata [13] on decision problems for regular expressions with back-references, such as emptiness, equivalence and inclusion.

Both benchmark suites require universal quantification over the parameters. There are existing methods for eliminating these universal quantifiers; one such class is that of *semantically deterministic* (SD) [22] PAs, and despite its name, being SD is algorithmically checkable. Most of the considered PAs are SD, in particular all in benchmark suite 2.

Experiments were conducted on an AMD Ryzen 5 1600 Six-Core CPU with 16 GB of RAM running Windows 10. The results for the second benchmark suite are shown in Table 1. The timeout for all benchmarks is 300 s.

In the first benchmark suite we are looking to verify a weaker form of the permutation property of sorting, as shown in Sect. 2. Furthermore, we verify properties of two self-stabilizing algorithms for mutual exclusion on parameterized systems. The first one is Lamport's bakery algorithm [33], for which we proved that the algorithm ensures mutual exclusion. The system is modelled in the style of regular model checking [8], with system states represented as words, here over an infinite alphabet: the character representing a thread stores the thread's control state, a Boolean flag, and an integer holding the number drawn by the thread. The system transitions are modelled as parametric transducers, and invariants as parametric automata. The second algorithm is Dijkstra's Self-Stabilizing Protocol [20], in which system states are encoded as sequences of integers, and for which we verify that the set of states in which exactly one processor is privileged forms an invariant. The mentioned benchmarks require

<sup>2</sup> For a more detailed write-up of the depth-first search algorithm see OSTRICH [12] Algorithm 1.


Table 1. Benchmark suite 2. *SRA* is used for the algorithm for symbolic register automata and *SEQ* for our tool. The symbol ∅ indicates the column where emptiness was checked, ≡ indicates self equivalence and ⊆ inclusion of languages.

universal quantification, but, similarly to the motivating example from Sect. 2, one can eliminate the quantifiers by Skolemization and instantiation, which was done by hand.

The second benchmark suite consists of three different types of benchmarks, summarized in Table 1. The benchmark PR-Cn describes a regular expression matching products which have the same code number of length n, and PR-CLn matches not only the code number but also the lot number. The last type of benchmark is IP-n, which matches n positions of two IP addresses. The benchmarks are taken from the regular-expression crowd-sourcing website RegExLib [39] and are also used in the experiments for symbolic register automata [14], against which we also compare our results. To apply our decision procedure to the benchmarks, we encode each of the benchmarks as a parametric automaton, using parameters for the (bounded-size) back-references. The task in the experiments is to check emptiness, language equivalence, and language inclusion for the same combinations of the benchmarks as considered in [14].
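To give a flavour of the encoding, the following sketch (an illustration, not SeCo's actual data structures) simulates a parametric automaton with one parameter p. Its guards compare the current letter against p, and a word is accepted if some value of p admits an accepting run; this existential reading is exactly how a bounded back-reference such as `(.)\1` can be captured by a parameter:

```python
# A parametric automaton with one parameter p. A guard is a function of
# (cur, p); here guards only test cur == p or True, mirroring a
# back-reference of size one.
TRANS = [
    ("q0", lambda cur, p: True,     "q0"),  # skip a letter
    ("q0", lambda cur, p: cur == p, "q1"),  # first occurrence of p
    ("q1", lambda cur, p: cur == p, "q2"),  # second, adjacent occurrence
    ("q2", lambda cur, p: True,     "q2"),  # accept sink
]

def runs_to_final(word, p, init="q0", finals=frozenset({"q2"})):
    states = {init}
    for cur in word:
        states = {q2 for q in states
                  for (q1, grd, q2) in TRANS if q1 == q and grd(cur, p)}
    return bool(states & finals)

def accepts(word):
    # Existential semantics: some instantiation of the parameter succeeds.
    # Only values occurring in the word can satisfy an equality guard.
    return any(runs_to_final(word, p) for p in set(word))

assert accepts("abba")       # contains "bb": doubled letter, p = 'b'
assert not accepts("abcd")   # no doubled letter
```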

*Results of the Experiments.* All properties can be encoded by parametric automata with very few states and parameters. As a result, the properties for each program can be verified in less than 2.6 s; in detail, the property for Dijkstra's algorithm was proven in 0.6 s, QuickSort in 1.1 s, and Lamport's bakery algorithm in 2.5 s.

The results for the second benchmark suite are shown in Table 1. The algorithm for symbolic register automata times out on 11 of the 36 benchmarks, while our tool solves most benchmarks in under 1 s. One thing to observe is that the symbolic-register-automata algorithm scales poorly when more registers are needed to capture the back-references, whereas the performance of our approach does not change noticeably when more parameters are introduced.

#### 8 Conclusion and Future Work

In this paper, we have performed a systematic investigation of the decidability and complexity of constraints on sequences. Our starting point is the subcase of string constraints (i.e., over a finite set of sequence elements), which includes equational constraints with concatenation, regular constraints, length constraints, and transducers. We have identified parametric automata (extending symbolic automata and variable automata) as a suitable notion of "regular constraints" over sequences, and parametric transducers (extending symbolic transducers) as a suitable notion of transducers over sequences. We showed that decidability results in the case of strings carry over to sequences, although the complexity is in general higher than in the case of strings (sometimes exponentially higher). For certain element theories (e.g., Linear Real Arithmetic), it is possible to retain the same complexity as in the string case. We also delineate the boundary of the suitable notion of "regular constraints" by showing that equational constraints with symbolic register automata [14] yield an undecidable satisfiability problem. Finally, our new sequence solver SeCo shows promising experimental results.

There are several future research avenues. Firstly, the complexity of sequence constraints over other specific element theories (e.g., Linear Integer Arithmetic) should be precisely determined. Secondly, is it possible to recover decidability with other fragments of register automata (e.g., single-use automata [7])? On the implementation side, there is room for algorithmic improvements, e.g., better non-emptiness checks for parametric automata, both for a single automaton and for products of multiple automata.

Acknowledgment. We thank the anonymous reviewers for their thorough and helpful feedback. We are grateful to Nikolaj Bjørner, Rupak Majumdar and Margus Veanes for inspiring discussions.

#### References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Exploiting Adjoints in Property Directed Reachability Analysis**

Mayuko Kori1,2(B) , Flavio Ascari<sup>3</sup> , Filippo Bonchi<sup>3</sup> , Roberto Bruni<sup>3</sup> , Roberta Gori<sup>3</sup> , and Ichiro Hasuo1,2

<sup>1</sup> National Institute of Informatics, Tokyo, Japan {mkori,hasuo}@nii.ac.jp <sup>2</sup> The Graduate University for Advanced Studies (SOKENDAI), Hayama, Japan <sup>3</sup> Dipartimento di Informatica, Università di Pisa, Pisa, Italy flavio.ascari@phd.unipi.it, {filippo.bonchi,roberto.bruni,roberta.gori}@unipi.it

**Abstract.** We formulate, in lattice-theoretic terms, two novel algorithms inspired by Bradley's property directed reachability algorithm. For finding safe invariants or counterexamples, the first algorithm exploits over-approximations of both forward and backward transition relations, expressed abstractly by the notion of adjoints. In the absence of adjoints, one can use the second algorithm, which exploits lower sets and their principals. As a notable example of application, we consider quantitative reachability problems for Markov Decision Processes.

**Keywords:** PDR · Lattice theory · Adjoints · MDPs · Over-approximation

## **1 Introduction**

*Property directed reachability analysis* (PDR) refers to a class of verification algorithms for solving safety problems of transition systems [5,12]. Its essence consists of 1) interleaving the construction of an *inductive invariant* (a *positive chain*) with that of a *counterexample* (a *negative sequence*), and 2) making the two sequences *interact*, with one narrowing down the search space for the other.

PDR algorithms have shown impressive performance both in hardware and software verification, leading to active research [15,18,28,29] going far beyond its original scope. For instance, an abstract domain [8] capturing the overapproximation exploited by PDR has been recently introduced in [13], while PrIC3 [3] extended PDR for quantitative verification of probabilistic systems.

Research supported by MIUR PRIN Project 201784YSZ5 *ASPRA*, by JST ERATO HASUO Metamathematics for Systems Design Project JPMJER1603, by JST CREST Grant JPMJCR2012, by JSPS DC KAKENHI Grant 22J21742 and by EU Next-GenerationEU (NRRP) SPOKE 10, Mission 4, Component 2, Investment N. 1.4, CUP N. I53C22000690001.

To uncover the abstract principles behind PDR and its extensions, Kori et al. proposed LT-PDR [19], a generalisation of PDR in terms of lattice/category theory. LT-PDR can be instantiated using domain-specific *heuristics* to create effective algorithms for different kinds of systems such as Kripke structures, Markov Decision Processes (MDPs), and Markov reward models. However, the theory in [19] does not offer guidance on devising concrete heuristics.

**Adjoints in PDR.** Our approach shares the same vision of LT-PDR, but we identify different principles: *adjunctions* are the core of our toolset.

An adjunction f ⊣ g is one of the central concepts in category theory [23]. It is prevalent in various fields of computer science, too, such as abstract interpretation [8] and functional programming [22]. Our use of adjoints in this work comes in the following two flavours.


**Our Algorithms.** The problem we address is the standard lattice-theoretic formulation of safety problems, namely whether the least fixed point of a continuous map b over a complete lattice (L, ⊑) is below a given element p ∈ L; in symbols, μb ⊑? p. We present two algorithms.

The first one, named AdjointPDR, assumes to have an element i ∈ L and two adjoint maps f ⊣ g : L → L, representing respectively initial states, forward semantics and backward semantics, such that b(x) = f(x) ⊔ i for all x ∈ L. Under this assumption, we have the following equivalences (they follow from the Knaster-Tarski theorem, see Sect. 2):

$$
\mu b \sqsubseteq p \quad \Leftrightarrow \quad \mu(f \sqcup i) \sqsubseteq p \quad \Leftrightarrow \quad i \sqsubseteq \nu(g \sqcap p),
$$

where μ(f ⊔ i) and ν(g ⊓ p) are, by the Kleene theorem, the limits of the *initial* and *final* chains illustrated below.

$$\bot \sqsubseteq i \sqsubseteq f(i) \sqcup i \sqsubseteq \cdots \qquad \qquad \qquad \cdot \cdots \sqsubseteq g(p) \sqcap p \sqsubseteq p \sqsubseteq \top$$

As positive chain, PDR exploits an over-approximation of the initial chain: it is made greater to accelerate convergence; still it has to be below p.

The distinguishing feature of AdjointPDR is to take as a negative sequence (that is a sequential construction of potential counterexamples) an overapproximation of the final chain. This crucially differs from the negative sequence of LT-PDR, namely an under-approximation of the computed positive chain.

We prove that AdjointPDR is sound (Theorem 5) and does not loop (Proposition 7), but since the problem μb ⊑? p is not always decidable, we cannot prove termination. Nevertheless, AdjointPDR allows for a formal theory of heuristics, which are essential when instantiating the algorithm to concrete problems. The theory prescribes the choices that yield the boundary executions, using the initial and final chains (Proposition 10); it thus identifies a class of heuristics guaranteeing termination when the answer is negative (Theorem 12).

AdjointPDR's assumption of a forward-backward adjunction f ⊣ g, however, does not hold very often, especially in probabilistic settings. Our second algorithm AdjointPDR<sup>↓</sup> circumvents this problem by extending the lattice for the negative sequence, from L to the lattice L<sup>↓</sup> of *lower sets* in L.

Specifically, by using the second form of adjoints, namely an abstraction-concretization pair, the problem μb ⊑? p in L can be translated to an equivalent problem on b<sup>↓</sup> in L<sup>↓</sup>, for which an adjoint pair b<sup>↓</sup><sub>l</sub> ⊣ b<sup>↓</sup><sub>r</sub> is guaranteed. This allows one to run AdjointPDR in the lattice L<sup>↓</sup>. We then notice that the search for a positive chain can be conveniently restricted to principals in L<sup>↓</sup>, which have representatives in L. The resulting algorithm, using L for positive chains and L<sup>↓</sup> for negative sequences, is AdjointPDR<sup>↓</sup>.

The use of lower sets for the negative sequence is a key advantage. It not only avoids the restrictive assumption of a forward-backward adjunction f ⊣ g, but also enables a more thorough search for counterexamples. AdjointPDR<sup>↓</sup> can simulate LT-PDR step-by-step (Theorem 17), while the reverse is not possible: a single negative sequence in AdjointPDR<sup>↓</sup> can represent multiple (Proposition 18) or even all (Proposition 19) negative sequences of LT-PDR.

**Concrete Instances.** Our lattice-theoretic algorithms yield many concrete instances: the original IC3/PDR [5,12] as well as Reverse PDR [27] are instances of AdjointPDR with L being the powerset of the state space; since LT-PDR can be simulated by AdjointPDR↓, the latter generalizes all instances in [19].

As a notable instance, we apply AdjointPDR<sup>↓</sup> to MDPs, specifically to decide whether the maximum reachability probability [1] is below a given threshold. Here the lattice L = [0, 1]<sup>S</sup> is that of fuzzy predicates over the state space S. Our theory provides guidance to devise two heuristics, for which we prove negative termination (Corollary 20). We present its implementation in Haskell, and its experimental evaluation, where a comparison is made against existing probabilistic PDR algorithms (PrIC3 [3], LT-PDR [19]) and a non-PDR one (Storm [11]). The performance of AdjointPDR<sup>↓</sup> is encouraging: it supports the potential of PDR algorithms in probabilistic model checking. The experiments also indicate the importance of having a variety of heuristics, and thus the value of our adjoint framework, which helps in devising them.

Additionally, we found that the abstraction features of Haskell allow us to code the lattice-theoretic algorithms almost literally (∼100 lines). Implementing a few heuristics takes another ∼240 lines. In this way, we found that mathematical abstraction can directly help ease the implementation effort.

**Related Work.** Reverse PDR [27] applies PDR from the unsafe states using a backward transition relation **T** and tries to prove that the initial states are unreachable. Our right adjoint g is also backward, but it differs from **T** in the presence of nondeterminism: roughly, **T**(X) is the set of states which *can* reach X in one step, while g(X) is the set of states which *only* reach X in one step. *fb*PDR [28,29] runs PDR and Reverse PDR in parallel with shared information. Our work uses both forward and backward directions (the pair f ⊣ g), too, but approximates differently: Reverse PDR over-approximates the set of states that can reach an unsafe state, while we over-approximate the set of states that only reach safe states.
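The distinction between **T** and g is visible on any concrete nondeterministic transition relation. A small sketch (with a hypothetical δ, chosen only for illustration) contrasts the two backward operators:

```python
# delta: successor map of a nondeterministic transition system (hypothetical).
delta = {0: {1, 2}, 1: {3}, 2: {3, 4}, 3: set(), 4: set()}

def pre_may(X):
    # T(X): states that CAN reach X in one step (some successor in X).
    return {s for s, succ in delta.items() if succ & X}

def pre_must(X):
    # g(X): states that ONLY reach X in one step (all successors in X).
    return {s for s, succ in delta.items() if succ <= X}

X = {3}
assert pre_may(X) == {1, 2}      # s2 may reach 3, but may also reach 4
assert pre_must(X) == {1, 3, 4}  # deadlocked s3, s4 are vacuously in g(X)
```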

The comparison with LT-PDR [19] is extensively discussed in Sect. 4.2. PrIC3 [3] extended PDR to MDPs, which are our main experimental ground: Sect. 6 compares the performances of PrIC3, LT-PDR and AdjointPDR↓.

We remark that PDR has been applied to other settings, such as software model checking using theories and SMT-solvers [6,21] or automated planning [30]. Most of them (e.g., software model checking) fall already in the generality of LT-PDR and thus they can be embedded in our framework.

It is also worth mentioning that, in the context of abstract interpretation, the use of adjoints to construct initial and final chains and to exploit the interaction between their approximations has been investigated in several works, e.g., [7].

**Structure of the Paper.** After recalling some preliminaries in Sect. 2, we present AdjointPDR in Sect. 3 and AdjointPDR<sup>↓</sup> in Sect. 4. In Sect. 5 we introduce the heuristics for the max reachability problem of MDPs, which are experimentally tested in Sect. 6.

## **2 Preliminaries and Notation**

We assume that the reader is familiar with lattice theory, see, e.g., [10]. We use (L, ⊑), (L1, ⊑1), (L2, ⊑2) to range over complete lattices and x, y, z to range over their elements. We omit subscripts and order relations whenever clear from the context. As usual, ⨆ and ⨅ denote least upper bound and greatest lower bound, ⊔ and ⊓ denote (binary) join and meet, and ⊤ and ⊥ top and bottom. Hereafter we will tacitly assume that all maps are monotone. Obviously, the identity map id: L → L and the composition f ◦ g : L1 → L3 of two monotone maps g : L1 → L2 and f : L2 → L3 are monotone. For a map f : L → L, we inductively define f<sup>0</sup> = id and f<sup>n+1</sup> = f ◦ f<sup>n</sup>. Given l: L1 → L2 and r : L2 → L1, we say that l is the *left adjoint* of r, or equivalently that r is the *right adjoint* of l, written l ⊣ r, when it holds that l(x) ⊑2 y iff x ⊑1 r(y) for all x ∈ L1 and y ∈ L2. Given a map f : L → L, an element x ∈ L is a *post-fixed point* iff x ⊑ f(x), a *pre-fixed point* iff f(x) ⊑ x, and a *fixed point* iff x = f(x). Pre-, post- and fixed points form complete lattices: we write μf and νf for the least and the greatest fixed point.

Several problems relevant to computer science can be reduced to checking whether μb ⊑ p for a monotone map b : L → L on a complete lattice L. The Knaster-Tarski fixed-point theorem characterises μb as the greatest lower bound of all pre-fixed points of b, and νb as the least upper bound of all its post-fixed points:

$$\mu b = \bigsqcap \{ x \mid b(x) \sqsubseteq x \} \qquad \qquad \nu b = \bigsqcup \{ x \mid x \sqsubseteq b(x) \} \quad .$$

This immediately leads to two proof principles, illustrated below:

$$\begin{array}{c} \exists x, \; b(x) \sqsubseteq x \sqsubseteq p \\ \hline \mu b \sqsubseteq p \end{array} \qquad \begin{array}{c} \exists x, \; i \sqsubseteq x \sqsubseteq b(x) \\ \hline i \sqsubseteq \nu b \end{array} \tag{KT}$$

**Fig. 1.** The transition system of Example 1, with S = {s0,...,s6} and I = {s0}.

By means of (KT), one can prove μb ⊑ p by finding some pre-fixed point x, often called an *invariant*, such that x ⊑ p. However, automatically finding invariants might be rather complicated, so most of the algorithms rely on another fixed-point theorem, usually attributed to Kleene. It characterises μb and νb as the least upper bound and the greatest lower bound, respectively, of the *initial* and *final chains*:

$$\begin{aligned} \bot &\sqsubseteq b(\bot) \sqsubseteq b^2(\bot) \sqsubseteq \cdots \quad \text{and} \quad \cdots \sqsubseteq b^2(\top) \sqsubseteq b(\top) \sqsubseteq \top. \quad \text{That is,} \qquad (\text{Kl})\\ \mu b &= \bigsqcup\_{n \in \mathbb{N}} b^n(\bot), & \nu b &= \bigsqcap\_{n \in \mathbb{N}} b^n(\top). \end{aligned}$$

The assumptions are stronger than for Knaster-Tarski: the leftmost statement requires the map b to be ω*-continuous* (i.e., to preserve least upper bounds of ω-chains) and the rightmost requires it to be ω*-co-continuous* (similar, but for greatest lower bounds). Observe that every left adjoint is continuous and every right adjoint is co-continuous (see, e.g., [23]).

As explained in [19], property directed reachability (PDR) algorithms [5] exploit (KT) to try to prove the inequation and (Kl) to refute it. In the algorithm we introduce in the next section, we further assume that b is of the form f ⊔ i for some element i ∈ L and map f : L → L, namely b(x) = f(x) ⊔ i for all x ∈ L. Moreover, we require f to have a right adjoint g : L → L. In this case

$$
\mu(f \sqcup i) \sqsubseteq p \qquad \text{iff} \qquad i \sqsubseteq \nu(g \sqcap p) \tag{1}
$$

(which is easily shown using the Knaster-Tarski theorem) and (f ⊔ i) and (g ⊓ p) are guaranteed to be (co)continuous. Since f ⊣ g, and left and right adjoints preserve, respectively, arbitrary joins and meets, for all n ∈ ℕ

$$(f \sqcup i)^{n}(\bot) = \bigsqcup\_{j<n} f^{j}(i) \qquad \qquad (g \sqcap p)^{n}(\top) = \bigsqcap\_{j<n} g^{j}(p),$$

which by (Kl) provide useful characterisations of least and greatest fixed points.

$$\mu(f \sqcup i) = \bigsqcup\_{n \in \mathbb{N}} f^n(i) \qquad \qquad \nu(g \sqcap p) = \bigsqcap\_{n \in \mathbb{N}} g^n(p) \tag{Kl-⊣}$$

We conclude this section with an example that we will often revisit. It also provides a justification for the intuitive terminology that we sporadically use.

*Example 1 (Safety problem for transition systems).* A *transition system* consists of a triple (S, I, δ) where S is a set of states, I ⊆ S is a set of initial states, and δ : S → PS is a transition relation. Here PS denotes the powerset of S, which

$$\begin{array}{llll} x\_0 = \bot & \text{(P1)} & \text{if } y \neq \varepsilon \text{ then } p \sqsubseteq y\_{n-1} & \text{(N1)}\\ 1 \le k \le n & \text{(I1)} & \forall j \in [k, n-2],\; g(y\_{j+1}) \sqsubseteq y\_j & \text{(N2)}\\ \forall j \in [0, n-2],\; x\_j \sqsubseteq x\_{j+1} & \text{(I2)} & \forall j \in [k, n-1],\; x\_j \not\sqsubseteq y\_j & \text{(PN)}\\ x\_{n-2} \sqsubseteq p & \text{(P2)} & \forall j \in [0, n-1],\; (f \sqcup i)^{j}(\bot) \sqsubseteq x\_j \sqsubseteq (g \sqcap p)^{n-1-j}(\top) & \text{(A1)}\\ \forall j \in [0, n-2],\; x\_j \sqsubseteq g(x\_{j+1}) & \text{(P3)} & \forall j \in [1, n-1],\; x\_{j-1} \sqsubseteq g^{n-1-j}(p) & \text{(A2)}\\ & & \forall j \in [k, n-1],\; g^{n-1-j}(p) \sqsubseteq y\_j & \text{(A3)} \end{array}$$

**Fig. 2.** Invariants of AdjointPDR.

forms a complete lattice ordered by inclusion ⊆. By defining F : PS → PS as F(X) def= ⋃<sub>s∈X</sub> δ(s) for all X ∈ PS, one has that μ(F ∪ I) is the set of all states reachable from I. Therefore, for any P ∈ PS representing some safety property, μ(F ∪ I) ⊆ P holds iff all reachable states are safe. It is worth remarking that F has a right adjoint G: PS → PS, defined for all X ∈ PS as G(X) def= {s | δ(s) ⊆ X}. Thus, by (1), μ(F ∪ I) ⊆ P iff I ⊆ ν(G ∩ P).

Consider the transition system in Fig. 1. Hereafter we write Sj for the set of states {s0, s1,...,sj} and we fix the set of safe states to be P = S5. It is immediate to see that μ(F ∪ I) = S4 ⊆ P. Automatically, this can be checked with the initial chain of (F ∪ I) or with the final chain of (G ∩ P), displayed below on the left and on the right, respectively.

$$\emptyset \subseteq I \subseteq S\_2 \subseteq S\_3 \subseteq S\_4 \subseteq S\_4 \subseteq \dotsb \qquad \qquad \cdots \subseteq S\_4 \subseteq S\_4 \subseteq P \subseteq S$$

The (j + 1)-th element of the initial chain contains all the states that can be reached from I in at most j transitions, while the (j + 1)-th element of the final chain contains all the states that, in at most j transitions, reach safe states only.
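These chains can be computed directly. The sketch below uses a hypothetical transition relation δ that is consistent with the chains shown (Fig. 1 itself is not reproduced here), and iterates F ∪ I and G ∩ P to their fixed points:

```python
# Hypothetical transition relation consistent with the chains of Example 1:
# s0 -> s1, s2; s1, s2 -> s3; s3 -> s4; s4 -> s4; s5 -> s6; s6 -> s6.
delta = {0: {1, 2}, 1: {3}, 2: {3}, 3: {4}, 4: {4}, 5: {6}, 6: {6}}
I = {0}
P = {0, 1, 2, 3, 4, 5}          # safe states (s6 is unsafe)

def F(X):  # forward image: successors of X
    return {t for s in X for t in delta[s]}

def G(X):  # right adjoint of F: states whose successors all lie in X
    return {s for s in delta if delta[s] <= X}

def kleene(step, start):
    # Iterate the initial/final chain to its (here finite) limit.
    cur = start
    while True:
        nxt = step(cur)
        if nxt == cur:
            return cur
        cur = nxt

lfp = kleene(lambda X: F(X) | I, set())          # mu(F ∪ I)
gfp = kleene(lambda X: G(X) & P, set(delta))     # nu(G ∩ P)

assert lfp == {0, 1, 2, 3, 4}   # S4: the reachable states
assert gfp == {0, 1, 2, 3, 4}   # S4: states that only ever reach safe states
assert lfp <= P and I <= gfp    # both sides of equivalence (1) agree
```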

## **3 Adjoint PDR**

In this section we present AdjointPDR, an algorithm that takes as input a tuple (i, f, g, p) with i, p ∈ L and f ⊣ g : L → L and, if it terminates, returns true whenever μ(f ⊔ i) ⊑ p and false otherwise.

The algorithm manipulates two sequences of elements of L: *x* def= x0,...,xn−1 of length n and *y* def= yk,...,yn−1 of length n − k. These satisfy, throughout the executions of AdjointPDR, the invariants in Fig. 2. Observe that, by (A1), xj over-approximates the j-th element of the initial chain, namely (f ⊔ i)<sup>j</sup>(⊥) ⊑ xj, while, by (A3), the j-indexed element yj of *y* over-approximates g<sup>n−j−1</sup>(p), which, borrowing the terminology of Example 1, is the set of states that remain safe for n − j − 1 transitions. Moreover, by (PN), the element yj witnesses that xj is unsafe, i.e., that xj ⋢ g<sup>n−1−j</sup>(p) or, equivalently, f<sup>n−j−1</sup>(xj) ⋢ p. Notably, *x* is a positive chain and *y* a negative sequence, according to the definitions below.

#### AdjointPDR (i, f, g, p)

```
<INITIALISATION>
  (x, y)n,k := (⊥⊤, ε)2,2
<ITERATION>                                 % x, y not conclusive
  case (x, y)n,k of
    y = ε and xn−1 ⊑ p :                    %(Unfold)
        (x, y)n,k := (x⊤, ε)n+1,n+1
    y = ε and xn−1 ⋢ p :                    %(Candidate)
        choose z ∈ L such that xn−1 ⋢ z and p ⊑ z;
        (x, y)n,k := (x, z)n,n−1
    y ≠ ε and f(xk−1) ⋢ yk :                %(Decide)
        choose z ∈ L such that xk−1 ⋢ z and g(yk) ⊑ z;
        (x, y)n,k := (x, zy)n,k−1
    y ≠ ε and f(xk−1) ⊑ yk :                %(Conflict)
        choose z ∈ L such that z ⊑ yk and (f ⊔ i)(xk−1 ⊓ z) ⊑ z;
        (x, y)n,k := (x ⊓k z, tail(y))n,k+1
  endcase
<TERMINATION>
  if ∃j ∈ [0, n − 2] . xj+1 ⊑ xj then return true     % x conclusive
  if k = 1 and i ⋢ y1 then return false               % y conclusive
```
**Fig. 3.** AdjointPDR algorithm checking μ(f ⊔ i) ⊑ p.

**Definition 2 (positive chain).** *A* positive chain *for* μ(f ⊔ i) ⊑ p *is a finite chain* x0 ⊑ ··· ⊑ xn−1 *in* L *of length* n ≥ 2 *which satisfies* (P1)*,* (P2)*,* (P3) *in Fig. 2. It is* conclusive *if* xj+1 ⊑ xj *for some* j ≤ n − 2*.*

In a conclusive positive chain, x_{j+1} provides an invariant for f ⊔ i and thus, by (KT), μ(f ⊔ i) ⊑ p holds. So, when **x** is conclusive, AdjointPDR returns true.

**Definition 3 (negative sequence).** *A* negative sequence *for* μ(f ⊔ i) ⊑ p *is a finite sequence* y_k, …, y_{n−1} *in* L *with* 1 ≤ k ≤ n *which satisfies* (N1) *and* (N2) *in Fig. 2. It is* conclusive *if* k = 1 *and* i ⋢ y_1*.*

When **y** is conclusive, AdjointPDR returns false as y_1 provides a counterexample: (N1) and (N2) entail (A3), so g^{n−2}(p) ⊑ y_1 and thus i ⋢ y_1 gives i ⋢ g^{n−2}(p). By (Kl-), ν(g ⊓ p) ⊑ g^{n−2}(p) and thus i ⋢ ν(g ⊓ p). By (1), μ(f ⊔ i) ⋢ p.

The pseudocode of the algorithm is displayed in Fig. 3, where we write (x ‖ y)_{n,k} to compactly represent the state of the algorithm: the pair (n, k) is called the *index* of the state, with **x** of length n and **y** of length n − k. When k = n, **y** is the empty sequence ε. For any z ∈ L, we write **x**, z for the chain x_0, …, x_{n−1}, z of length n + 1 and z, **y** for the sequence z, y_k, …, y_{n−1} of length n − (k − 1). Moreover, we write **x** ⊓_j z for the chain x_0 ⊓ z, …, x_j ⊓ z, x_{j+1}, …, x_{n−1}. Finally, tail(**y**) stands for the tail of **y**, namely y_{k+1}, …, y_{n−1} of length n − (k + 1).

The algorithm starts in the initial state s_0 def= (⊥, ⊤ ‖ ε)_{2,2} and, unless one of **x** and **y** is conclusive, iteratively applies one of the four mutually exclusive rules: (Unfold), (Candidate), (Decide) and (Conflict). The rule (Unfold) extends the positive chain by one element when the negative sequence is empty and the positive chain is under p; since the element introduced by (Unfold) is ⊤, its application typically triggers the rule (Candidate), which starts the negative sequence with an over-approximation of p. Recall that the role of y_j is to witness that x_j is unsafe. After (Candidate), either (Decide) or (Conflict) is possible: if y_k witnesses that, besides x_k, also f(x_{k−1}) is unsafe, then (Decide) is used to further extend the negative sequence to witness that x_{k−1} is unsafe; otherwise, the rule (Conflict) improves the precision of the positive chain in such a way that y_k no longer witnesses x_k ⊓ z unsafe and, thus, the negative sequence is shortened.

Note that, in (Candidate), (Decide) and (Conflict), the element z ∈ L is chosen among a set of possibilities; thus AdjointPDR is nondeterministic.
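Since all rules and guards are effectively computable on a finite powerset lattice, the loop of Fig. 3 can be prototyped directly. The following is a minimal executable sketch of ours (not an artefact of the paper), instantiated on the powerset lattice of a finite transition system as in Example 1, with the nondeterminism resolved by the canonical choices of Proposition 6: z = p in (Candidate), z = g(y_k) in (Decide) and z = (f ⊔ i)(x_{k−1}) in (Conflict).

```python
def adjoint_pdr(states, trans, init, safe):
    """Sketch of AdjointPDR on the powerset lattice of a finite transition
    system: f is the successor image, i the initial states, g the right
    adjoint of f, p the safe states."""
    S = frozenset(states)
    I, P = frozenset(init), frozenset(safe)

    def f(X):                                   # successor image of X
        return frozenset(t for (s, t) in trans if s in X)

    def g(Y):                                   # right adjoint: largest X with f(X) ⊆ Y
        return frozenset(s for s in S if f(frozenset({s})) <= Y)

    x = [frozenset(), S]                        # positive chain: x_0 = ⊥, x_1 = ⊤
    y, k = [], 2                                # negative sequence y_k, ..., y_{n-1}
    while True:
        n = len(x)
        if any(x[j + 1] <= x[j] for j in range(n - 1)):
            return True                         # x conclusive: an invariant was found
        if k == 1 and not (I <= y[0]):
            return False                        # y conclusive: a counterexample was found
        if not y and x[-1] <= P:                # (Unfold)
            x, k = x + [S], n + 1
        elif not y:                             # (Candidate), canonical z = p
            y, k = [P], n - 1
        elif not (f(x[k - 1]) <= y[0]):         # (Decide), canonical z = g(y_k)
            y, k = [g(y[0])] + y, k - 1
        else:                                   # (Conflict), canonical z = (f ⊔ i)(x_{k-1})
            z = f(x[k - 1]) | I
            x = [xj & z if j <= k else xj for j, xj in enumerate(x)]
            y, k = y[1:], k + 1
```

For instance, on the three-state system with transitions 0→1, 1→0 and 2→2, initial state 0 and safe region {0, 1}, the sketch returns true, while shrinking the safe region to {0} makes it return false.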

To illustrate the executions of the algorithm, we adopt a labeled transition system notation. Let S def= {(x ‖ y)_{n,k} | n ≥ 2, k ≤ n, **x** ∈ L^n and **y** ∈ L^{n−k}} be the set of all possible states of AdjointPDR. We call (x ‖ y)_{n,k} ∈ S *conclusive* if **x** or **y** is such. When s ∈ S is not conclusive, we write s →^D to mean that s satisfies the guards in the rule (Decide), and s →^D_z s′ to mean that, (Decide) being applicable, AdjointPDR moves from state s to s′ by choosing z. Similarly for the other rules: the labels *Ca*, *Co* and U stand for (Candidate), (Conflict) and (Unfold), respectively. When irrelevant, we omit labels and choices and just write s → s′. As usual, →⁺ stands for the transitive closure of → while →* stands for the reflexive and transitive closure of →.

*Example 4.* Consider the safety problem in Example 1. Below we illustrate two possible computations of AdjointPDR that differ in the choice of z in (Conflict). The first run is conveniently represented as the following series of transitions.

$$\begin{array}{l}
(\emptyset, S \,\|\, \varepsilon)_{2,2} \xrightarrow[P]{Ca} (\emptyset, S \,\|\, P)_{2,1} \xrightarrow[I]{Co} (\emptyset, I \,\|\, \varepsilon)_{2,2} \xrightarrow{U} (\emptyset, I, S \,\|\, \varepsilon)_{3,3} \xrightarrow[P]{Ca} (\emptyset, I, S \,\|\, P)_{3,2} \xrightarrow[S_2]{Co} (\emptyset, I, S_2 \,\|\, \varepsilon)_{3,3} \\
\xrightarrow{U}\xrightarrow[P]{Ca} (\emptyset, I, S_2, S \,\|\, P)_{4,3} \xrightarrow[S_3]{Co} (\emptyset, I, S_2, S_3 \,\|\, \varepsilon)_{4,4} \xrightarrow{U}\xrightarrow[P]{Ca} (\emptyset, I, S_2, S_3, S \,\|\, P)_{5,4} \xrightarrow[S_4]{Co} (\emptyset, I, S_2, S_3, S_4 \,\|\, \varepsilon)_{5,5} \\
\xrightarrow{U}\xrightarrow[P]{Ca} (\emptyset, I, S_2, S_3, S_4, S \,\|\, P)_{6,5} \xrightarrow[S_4]{Co} (\emptyset, I, S_2, S_3, S_4, S_4 \,\|\, \varepsilon)_{6,6}
\end{array}$$

The last state returns true since x_4 = x_5 = S_4. Observe that the elements of **x**, with the exception of the last element x_{n−1}, are those of the initial chain of (F ∪ I), namely, x_j is the set of states reachable in at most j − 1 steps. In the second computation, the elements of **x** are roughly those of the final chain of (G ∩ P). More precisely, after (Unfold) or (Candidate), x_{n−j} for j < n − 1 is the set of states which only reach safe states within j steps.

$$\begin{array}{l}
(\emptyset, S \,\|\, \varepsilon)_{2,2} \xrightarrow[P]{Ca} (\emptyset, S \,\|\, P)_{2,1} \xrightarrow[P]{Co} (\emptyset, P \,\|\, \varepsilon)_{2,2} \\
\xrightarrow{U \cdot Ca} (\emptyset, P, S \,\|\, P)_{3,2} \xrightarrow[S_4]{D} (\emptyset, P, S \,\|\, S_4, P)_{3,1} \xrightarrow[S_4]{Co} (\emptyset, S_4, S \,\|\, P)_{3,2} \xrightarrow[P]{Co} (\emptyset, S_4, P \,\|\, \varepsilon)_{3,3} \\
\xrightarrow{U \cdot Ca} (\emptyset, S_4, P, S \,\|\, P)_{4,3} \xrightarrow[S_4]{D} (\emptyset, S_4, P, S \,\|\, S_4, P)_{4,2} \xrightarrow[S_4]{Co} (\emptyset, S_4, S_4, S \,\|\, P)_{4,3}
\end{array}$$

Observe that, by invariant (A1), the values of *x* in the two runs are, respectively, the least and the greatest values for all possible computations of AdjointPDR.

Theorem 5.1 follows by invariants (I2), (P1), (P3) and (KT); Theorem 5.2 by (N1), (N2) and (Kl-). Note that both results hold for any choice of z.

#### **Theorem 5 (Soundness).** AdjointPDR *is sound. Namely,*

*1. if it returns* true*, then* μ(f ⊔ i) ⊑ p*;*
*2. if it returns* false*, then* μ(f ⊔ i) ⋢ p*.*


#### **3.1 Progression**

It is necessary to prove that, at any step of the execution, if the algorithm does not return true or false, then it can progress to a new state, not yet visited. To this end, we must deal with the subtleties of the nondeterministic choice of the element z in (Candidate), (Decide) and (Conflict). The following proposition ensures that, for any of these three rules, there is always a possible choice.

**Proposition 6 (Canonical choices).** *The following are always possible: 1. in (Candidate)* z = p*; 2. in (Decide)* z = g(y_k)*; 3. in (Conflict)* z = y_k*; 4. in (Conflict)* z = (f ⊔ i)(x_{k−1})*.*

*Thus, for all non-conclusive* s ∈ S*, if* s<sup>0</sup> →<sup>∗</sup> s *then* s →*.*

Then, Proposition 7 ensures that AdjointPDR always traverses new states.

#### **Proposition 7 (Impossibility of loops).** *If* s_0 →* s →⁺ s′*, then* s ≠ s′*.*

Observe that the above propositions entail that AdjointPDR terminates whenever the lattice L is finite, since the set of reachable states is finite in this case.

*Example 8.* For (I, F, G, P) as in Example 1, AdjointPDR behaves essentially as IC3/PDR [5], solving reachability problems for transition systems with finite state space S. Since the lattice PS is also finite, AdjointPDR always terminates.

#### **3.2 Heuristics**

The nondeterministic choices of the algorithm can be resolved by using heuristics. Intuitively, a heuristic chooses for any state s ∈ S an element z ∈ L to be possibly used in (Candidate), (Decide) or (Conflict), so it is just a function h: S → L. When defining a heuristic, we will avoid specifying its values on conclusive states or on those performing (Unfold), as they are clearly irrelevant.

With a heuristic, one can instantiate AdjointPDR by making the choice of z as prescribed by h. Syntactically, this means erasing from the code of Fig. 3 the three lines of choose and replacing them by z := h((x ‖ y)_{n,k}). We call AdjointPDR_h the resulting deterministic algorithm and write s →_h s′ to mean that AdjointPDR_h moves from state s to s′. We let S_h def= {s ∈ S | s_0 →*_h s} be the set of all states reachable by AdjointPDR_h.

**Definition 9 (legit heuristic).** *A heuristic* h: S → L *is called* legit *whenever for all* s, s′ ∈ S_h*, if* s →_h s′ *then* s → s′*.*

When <sup>h</sup> is legit, the only execution of the deterministic algorithm AdjointPDRh is one of the possible executions of the non-deterministic algorithm AdjointPDR.

The canonical choices provide two legit heuristics: first, we call *simple* any legit heuristic h that chooses z in (Candidate) and (Decide) as in Proposition 6:

$$(x \| y)_{n,k} \mapsto \begin{cases} p & \text{if } (x \| y)_{n,k} \xrightarrow{Ca} \\ g(y_k) & \text{if } (x \| y)_{n,k} \xrightarrow{D} \end{cases} \tag{3}$$

Then, if the choice in (Conflict) is like in Proposition 6.4, we call h *initial*; if it is like in Proposition 6.3, we call h *final*. Shortly, the two legit heuristics are:

$$\begin{array}{l|l}
\textit{simple initial} & (3) \text{ and } (x \| y)_{n,k} \mapsto (f \sqcup i)(x_{k-1}) \ \text{ if } (x \| y)_{n,k} \xrightarrow{Co} \\ \hline
\textit{simple final} & (3) \text{ and } (x \| y)_{n,k} \mapsto y_k \ \text{ if } (x \| y)_{n,k} \xrightarrow{Co}
\end{array}$$

Interestingly, with any simple heuristic, the sequence *y* takes a familiar shape:

**Proposition 10.** *Let* h: S → L *be any simple heuristic. For all* (x ‖ y)_{n,k} ∈ S_h*, invariant* (A3) *holds as an equality, namely for all* j ∈ [k, n−1]*,* y_j = g^{n−1−j}(p)*.*

By the above proposition and (A3), the negative sequence **y** occurring in the execution of AdjointPDR_h, for a simple heuristic h, is the least amongst all the negative sequences occurring in any execution of AdjointPDR.

Instead, invariant (A1) informs us that the positive chain **x** always lies between the initial chain of f ⊔ i and the final chain of g ⊓ p. These two extremal values of **x** are obtained by the simple initial and the simple final heuristic, respectively.

*Example 11.* Consider the two runs of AdjointPDR in Example 4. The first one exploits the simple initial heuristic and indeed, the positive chain *x* coincides with the initial chain. Analogously, the second run uses the simple final heuristic.

#### **3.3 Negative Termination**

When the lattice L is not finite, AdjointPDR may not terminate, since checking μ(f ⊔ i) ⊑ p is not always decidable. In this section, we show that the use of certain heuristics can guarantee termination whenever μ(f ⊔ i) ⋢ p.

The key insight is the following: if μ(f ⊔ i) ⋢ p then, by (Kl), there exists some ñ ∈ ℕ such that (f ⊔ i)^ñ(⊥) ⋢ p. By (A1), the rule (Unfold) can be applied only when (f ⊔ i)^{n−1}(⊥) ⊑ x_{n−1} ⊑ p. Since (Unfold) increases n and n is never decreased by other rules, (Unfold) can be applied at most ñ times.

The elements of negative sequences are introduced by the rules (Candidate) and (Decide). If we guarantee that for any index (n, k) the heuristic returns in such cases a finite number of values for z, then one can prove termination. To make this formal, we fix CaD^h_{n,k} def= {(x ‖ y)_{n,k} ∈ S_h | (x ‖ y)_{n,k} →^{Ca} or (x ‖ y)_{n,k} →^{D}}, i.e., the set of all (n, k)-indexed states reachable by AdjointPDR_h that trigger (Candidate) or (Decide), and h(CaD^h_{n,k}) def= {h(s) | s ∈ CaD^h_{n,k}}, i.e., the set of all possible values returned by h in such states.

**Theorem 12 (Negative termination).** *Let* h *be a legit heuristic. If* h(CaD^h_{n,k}) *is finite for all* n, k *and* μ(f ⊔ i) ⋢ p*, then* AdjointPDR_h *terminates.*

**Corollary 13.** *Let* h *be a simple heuristic. If* μ(f ⊔ i) ⋢ p*, then* AdjointPDR_h *terminates.*

Note that this corollary ensures negative termination whenever we use the canonical choices in (Candidate) and (Decide), *irrespective of the choice for* (Conflict); therefore it holds for both the simple initial and the simple final heuristic.

## **4 Recovering Adjoints with Lower Sets**

In the previous section, we introduced an algorithm for checking μb ⊑ p whenever b is of the form f ⊔ i for an element i ∈ L and a left adjoint f: L → L. This, unfortunately, is not the case for several interesting problems, like the max reachability problem [1] that we will illustrate in Sect. 5.

The next result informs us that, under standard assumptions, one can transfer the problem of checking μb ⊑ p to lower sets, where adjoints can always be defined. Recall that, for a lattice (L, ⊑), a *lower set* is a subset X ⊆ L such that if x ∈ X and x′ ⊑ x then x′ ∈ X; the set of lower sets of L forms a complete lattice (L↓, ⊆) with joins and meets given by union and intersection; as expected, ⊥ is ∅ and ⊤ is L. Given b: L → L, one can define two functions b↓, b↓_r: L↓ → L↓ as b↓(X) def= {b(x) | x ∈ X}↓ and b↓_r(X) def= {x | b(x) ∈ X}. It holds that b↓ ⊣ b↓_r.
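On a finite lattice, the adjunction b↓ ⊣ b↓_r can be checked exhaustively. The sketch below is a toy of ours (the four-element chain 0 < 1 < 2 < 3, not an example from the paper): it enumerates all lower sets and verifies that b↓(X) ⊆ Y holds exactly when X ⊆ b↓_r(Y).

```python
from itertools import combinations

L = [0, 1, 2, 3]                     # toy lattice: the chain 0 < 1 < 2 < 3
b = lambda x: min(x + 1, 3)          # a monotone map b : L -> L

def down(xs):
    """Down-closure of a collection of elements."""
    return frozenset(y for x in xs for y in L if y <= x)

def lower_sets():
    """All lower sets of the chain, by brute-force enumeration."""
    subsets = (frozenset(c) for r in range(len(L) + 1)
               for c in combinations(L, r))
    return [X for X in subsets if X == down(X)]

b_down = lambda X: down(b(x) for x in X)                    # b↓(X) = (b X)↓
b_down_r = lambda X: frozenset(x for x in L if b(x) in X)   # b↓_r(X)

# adjunction b↓ ⊣ b↓_r :  b↓(X) ⊆ Y  iff  X ⊆ b↓_r(Y), for all lower sets
LS = lower_sets()
assert all((b_down(X) <= Y) == (X <= b_down_r(Y)) for X in LS for Y in LS)
```

On a chain of four elements there are exactly five lower sets, so the assertion ranges over all twenty-five pairs.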

$$b \circlearrowright (L, \sqsubseteq) \ \overset{\textstyle\bigsqcup}{\underset{(-)^{\downarrow}}{\rightleftarrows}} \ (L^{\downarrow}, \subseteq) \circlearrowleft b^{\downarrow} \dashv b^{\downarrow}_{r} \tag{4}$$

In the diagram above, (−)↓: x ↦ {x′ | x′ ⊑ x} and ⨆: L↓ → L maps a lower set X to ⨆{x | x ∈ X}. The maps ⨆ and (−)↓ form a *Galois insertion*, namely ⨆ ⊣ (−)↓ and ⨆ ∘ (−)↓ = id, and thus one can think of (4) in terms of *abstract interpretation* [8,9]: L↓ represents the concrete domain, L the abstract domain and b is a sound abstraction of b↓. Most importantly, it turns out that b is *forward-complete* [4,14] w.r.t. b↓, namely the following equation holds.

$$(-)^{\downarrow} \circ b = b^{\downarrow} \circ (-)^{\downarrow} \tag{5}$$

**Proposition 14.** *Let* (L, ⊑) *be a complete lattice,* p ∈ L *and* b: L → L *be an* ω*-continuous map. Then* μb ⊑ p *iff* μ(b↓ ∪ ⊥↓) ⊆ p↓*.*

By means of Proposition 14, we can thus solve μb ⊑ p in L by running AdjointPDR on (⊥↓, b↓, b↓_r, p↓). Hereafter, we tacitly assume that b is ω-continuous.
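Proposition 14 can also be checked by brute force on a small finite example. The following sketch (our own toy, the four-element chain 0 < 1 < 2 < 3) computes both least fixed points by Kleene iteration and compares the two checks for every candidate p:

```python
L = [0, 1, 2, 3]                      # toy chain 0 < 1 < 2 < 3 (ours)
b = lambda x: min(x + 1, 3)           # monotone, hence ω-continuous on a finite chain
down = lambda x: frozenset(y for y in L if y <= x)   # the principal x↓

def lfp(F, bot):
    """Least fixed point by Kleene iteration (the lattice is finite)."""
    cur = bot
    while F(cur) != cur:
        cur = F(cur)
    return cur

mu_b = lfp(b, 0)                                               # μb in L
F = lambda X: frozenset(y for x in X for y in down(b(x))) | down(0)
mu_low = lfp(F, frozenset())                                   # μ(b↓ ∪ ⊥↓) in L↓

# Proposition 14:  μb ⊑ p  iff  μ(b↓ ∪ ⊥↓) ⊆ p↓, here for every p ∈ L
assert all((mu_b <= p) == (mu_low <= down(p)) for p in L)
```

On this chain μb is the top element 3, and accordingly the lower-set fixed point is the whole of L, so both sides of the equivalence fail for p < 3 and hold for p = 3.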

## **4.1 AdjointPDR↓: Positive Chain in L, Negative Sequence in L↓**

While AdjointPDR on (⊥↓, b↓, b↓_r, p↓) might be computationally expensive, it is the first step toward the definition of an efficient algorithm that exploits a convenient form of the positive chain.

A lower set X ∈ L↓ is said to be a *principal* if X = x↓ for some x ∈ L. Observe that the top of the lattice (L↓, ⊆) is a principal, namely ⊤↓, and that the meet (intersection) of two principals x↓ and y↓ is the principal (x ⊓ y)↓.

Suppose now that, in (Conflict), AdjointPDR(⊥↓, b↓, b↓_r, p↓) always chooses principals rather than arbitrary lower sets. This suffices to guarantee that all the elements of **x** are principals (with the only exception of x_0, which is constantly the bottom element of L↓ that, note, is ∅ and not ⊥↓). In fact, the elements of

```
AdjointPDR↓ (b, p)
<INITIALISATION>
  (x ‖ Y)_{n,k} := (∅, ⊥, ⊤ ‖ ε)_{3,3}
<ITERATION>                                          % x, Y not conclusive
  case (x ‖ Y)_{n,k} of
    Y = ε and x_{n-1} ⊑ p :                          %(Unfold)
      (x ‖ Y)_{n,k} := (x, ⊤ ‖ ε)_{n+1,n+1}
    Y = ε and x_{n-1} ⋢ p :                          %(Candidate)
      choose Z ∈ L↓ such that x_{n-1} ∉ Z and p ∈ Z;
      (x ‖ Y)_{n,k} := (x ‖ Z)_{n,n-1}
    Y ≠ ε and b(x_{k-1}) ∉ Y_k :                     %(Decide)
      choose Z ∈ L↓ such that x_{k-1} ∉ Z and b↓_r(Y_k) ⊆ Z;
      (x ‖ Y)_{n,k} := (x ‖ Z, Y)_{n,k-1}
    Y ≠ ε and b(x_{k-1}) ∈ Y_k :                     %(Conflict)
      choose z ∈ L such that z ∈ Y_k and b(x_{k-1} ⊓ z) ⊑ z;
      (x ‖ Y)_{n,k} := (x ⊓_k z ‖ tail(Y))_{n,k+1}
  endcase
<TERMINATION>
  if ∃j ∈ [0, n-2] . x_{j+1} ⊑ x_j then return true    % x conclusive
  if Y_1 = ∅ then return false                          % Y conclusive
```
**Fig. 4.** The algorithm AdjointPDR↓ for checking μb ⊑ p: the elements of the negative sequence are in L↓, while those of the positive chain are in L, with the only exception of x_0, which is constantly the bottom lower set ∅. For x_0, we fix b(x_0) = ⊥.

**x** are all obtained by (Unfold), which adds the principal ⊤↓, and by (Conflict), which takes their meets with the chosen principal.

Since principals are in bijective correspondence with the elements of L, by imposing on AdjointPDR(⊥↓, b↓, b↓_r, p↓) the choice of a principal in (Conflict), we obtain an algorithm, named AdjointPDR↓, where the elements of the positive chain are drawn from L, while the negative sequence is taken in L↓. The algorithm is reported in Fig. 4, where we use the notation (x ‖ Y)_{n,k} to emphasize that the elements of the negative sequence are lower sets of elements of L.

All definitions and results illustrated in Sect. 3 for AdjointPDR are inherited<sup>1</sup> by AdjointPDR↓, with the only exception of Proposition 6.3. The latter does not hold, as it prescribes a choice for (Conflict) that may not be a principal. In contrast, the choice in Proposition 6.4 is, thanks to (5), a principal. This means in particular that the simple initial heuristic is always applicable.

**Theorem 15.** *All results in Sect. 3, but Proposition 6.3, hold for AdjointPDR*↓*.*

#### **4.2 AdjointPDR↓ Simulates LT-PDR**

The closest approach to AdjointPDR and AdjointPDR<sup>↓</sup> is the lattice-theoretic extension of the original PDR, called LT-PDR [19]. While these algorithms exploit essentially the same positive chain to find an invariant, the main difference lies in the sequence used to witness the existence of some counterexamples.

<sup>1</sup> Up to a suitable renaming: the domain is (L↓, ⊆) instead of (L, ⊑), the parameters are ⊥↓, b↓, b↓_r, p↓ instead of i, f, g, p and the negative sequence is **Y** instead of **y**.

**Definition 16 (Kleene sequence, from** [19]**).** *A sequence* **c** = c_k, …, c_{n−1} *of elements of* L *is a* Kleene sequence *if the conditions* (C1) *and* (C2) *below hold. It is* conclusive *if also condition* (C0) *holds.*

$$(\text{C0})\ c_1 \sqsubseteq b(\bot), \qquad (\text{C1})\ c_{n-1} \not\sqsubseteq p, \qquad (\text{C2})\ \forall j \in [k, n-2].\ c_{j+1} \sqsubseteq b(c_j).$$

LT-PDR tries to construct an under-approximation c_{n−1} of b^{n−2}(⊥) that violates the property p. The Kleene sequence is constructed by trial and error, starting from some arbitrary choice of c_{n−1}.

AdjointPDR crucially differs from LT-PDR in the search for counterexamples: LT-PDR under-approximates the final chain while AdjointPDR over-approximates it. The algorithms are thus incomparable. However, we can draw a formal correspondence between AdjointPDR↓ and LT-PDR by showing that AdjointPDR↓ simulates LT-PDR, but cannot be simulated by LT-PDR. In fact, AdjointPDR↓ exploits the existence of the adjoint to start from an over-approximation Y_{n−1} of p↓ and computes backward an over-approximation of the set of safe states. Thus, the key difference comes from the strategy to look for a counterexample: to prove μb ⋢ p, AdjointPDR↓ tries to find Y_{n−1} satisfying p ∈ Y_{n−1} and μb ∉ Y_{n−1}, while LT-PDR tries to find c_{n−1} s.t. c_{n−1} ⋢ p and c_{n−1} ⊑ μb.

Theorem 17 below states that any execution of LT-PDR can be mimicked by AdjointPDR↓. The proof exploits a map from LT-PDR's Kleene sequences **c** to AdjointPDR↓'s negative sequences neg(**c**) of a particular form. Let (L↑, ⊇) be the complete lattice of upper sets, namely subsets X ⊆ L such that X = X↑ def= {x′ ∈ L | ∃x ∈ X. x ⊑ x′}. There is an isomorphism ¬: (L↑, ⊇) ≅ (L↓, ⊆) mapping each X ⊆ L to its complement. For a Kleene sequence **c** = c_k, …, c_{n−1} of LT-PDR, the sequence neg(**c**) def= ¬({c_k}↑), …, ¬({c_{n−1}}↑) is a negative sequence, in the sense of Definition 3, for AdjointPDR↓. Most importantly, the assignment **c** ↦ neg(**c**) extends to a function, from the states of LT-PDR to those of AdjointPDR↓, that is proved to be a *strong simulation* [24].

**Theorem 17.** AdjointPDR<sup>↓</sup> *simulates LT-PDR.*

Remarkably, AdjointPDR↓'s negative sequences are not limited to the images of LT-PDR's Kleene sequences: they are more general than the complement of the upper closure of a singleton. In fact, a single negative sequence of AdjointPDR<sup>↓</sup> can represent *multiple* Kleene sequences of LT-PDR at once. Intuitively, this means that a single execution of AdjointPDR<sup>↓</sup> can correspond to multiple runs of LT-PDR. We can make this formal by means of the following result.

**Proposition 18.** *Let* {**c**^m}_{m∈M} *be a family of Kleene sequences. Then its pointwise intersection* ⋂_{m∈M} neg(**c**^m) *is a negative sequence.*

The above intersection is pointwise in the sense that, for all j ∈ [k, n − 1], it holds that (⋂_{m∈M} neg(**c**^m))_j def= ⋂_{m∈M} (neg(**c**^m))_j = ¬({c^m_j | m ∈ M}↑): intuitively, this is (up to neg(·)) a set containing all the M counterexamples. Note that, if the negative sequence of AdjointPDR↓ makes (A3) hold as an equality, as is possible with any simple heuristic (see Proposition 10), then its complement contains *all* Kleene sequences possibly computed by LT-PDR.

**Proposition 19.** *Let* **c** *be a Kleene sequence and* **Y** *be the negative sequence s.t.* Y_j = (b↓_r)^{n−1−j}(p↓) *for all* j ∈ [k, n − 1]*. Then* c_j ∈ ¬(Y_j) *for all* j ∈ [k, n − 1]*.*

While the previous result suggests that simple heuristics are always the best in theory, as they can carry all counterexamples, this is often not the case in practice, since they might be computationally hard and outperformed by some smart over-approximations. An example is given by (6) in the next section.

## **5 Instantiating AdjointPDR↓ for MDPs**

In this section we illustrate how to use AdjointPDR<sup>↓</sup> to address the max reachability problem [1] for Markov Decision Processes.

A *Markov Decision Process* (MDP) is a tuple (A, S, s_ι, δ) where A is a set of labels, S is a set of states, s_ι ∈ S is an initial state, and δ: S × A → DS + 1 is a transition function. Here DS is the set of probability distributions over S, namely functions d: S → [0, 1] such that Σ_{s∈S} d(s) = 1, and DS + 1 is the disjoint union of DS and 1 = {∗}. The transition function δ assigns to every label a ∈ A and to every state s ∈ S either a distribution of states or ∗ ∈ 1. We assume that both S and A are finite sets and that the set Act(s) def= {a ∈ A | δ(s, a) ≠ ∗} of actions enabled at s is non-empty for all states.

Intuitively, the *max reachability problem* requires checking whether the probability of reaching some bad states β ⊆ S is less than or equal to a given threshold λ ∈ [0, 1]. Formally, it can be expressed in lattice-theoretic terms by considering the lattice ([0, 1]^S, ≤) of all functions d: S → [0, 1], often called frames, ordered pointwise. The max reachability problem consists in checking μb ≤ p for p ∈ [0, 1]^S and b: [0, 1]^S → [0, 1]^S, defined for all d ∈ [0, 1]^S and s ∈ S, as

$$p(s) \stackrel{\text{def}}{=} \begin{cases} \lambda & \text{if } s = s\_\iota, \\ 1 & \text{if } s \neq s\_\iota, \end{cases} \qquad b(d)(s) \stackrel{\text{def}}{=} \begin{cases} 1 & \text{if } s \in \beta, \\ \max\_{a \in Act(s)} \sum\_{s' \in S} d(s') \cdot \delta(s, a)(s') & \text{if } s \notin \beta. \end{cases}$$

The reader is referred to [1] for all details.

Since b is not of the form f ⊔ i for a left adjoint f (see e.g. [19]), rather than using AdjointPDR, one can exploit AdjointPDR↓. Beyond the simple initial heuristic, which is always applicable and enjoys negative termination, we now illustrate two additional heuristics that are experimentally tested in Sect. 6.

The two novel heuristics make the same choices in (Candidate) and (Decide). They exploit functions α: S → A, also known as memoryless schedulers, and the function b_α: [0, 1]^S → [0, 1]^S defined for all d ∈ [0, 1]^S and s ∈ S as follows:

$$b\_{\alpha}(d)(s) \stackrel{\text{def}}{=} \begin{cases} 1 & \text{if } s \in \beta, \\ \sum\_{s' \in S} d(s') \cdot \delta(s, \alpha(s))(s') & \text{otherwise.} \end{cases}$$

Since for all D ∈ ([0, 1]^S)↓, b↓_r(D) = {d | b(d) ∈ D} = ⋂_α {d | b_α(d) ∈ D}, and since AdjointPDR↓ executes (Decide) only when b(x_{k−1}) ∉ Y_k, there must exist some α such that b_α(x_{k−1}) ∉ Y_k. One can thus fix

$$(x \| Y)_{n,k} \mapsto \begin{cases} p^{\downarrow} & \text{if } (x \| Y)_{n,k} \xrightarrow{Ca} \\ \{d \mid b_{\alpha}(d) \in Y_k\} & \text{if } (x \| Y)_{n,k} \xrightarrow{D} \end{cases} \tag{6}$$

Intuitively, such choices are smart refinements of those in (3): for (Candidate) they are exactly the same; for (Decide), rather than taking b↓_r(Y_k), we consider a larger lower set determined by the labels chosen by α. This allows representing each Y_j as a set of d ∈ [0, 1]^S satisfying a *single* linear inequality, while using b↓_r(Y_k) would yield a system of possibly exponentially many inequalities (see Example 21 below). Moreover, from Theorem 12, it follows that such choices ensure negative termination.
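The decomposition b↓_r(D) = ⋂_α {d | b_α(d) ∈ D} underlying these heuristics can be sanity-checked numerically. The sketch below is a toy MDP of our own making (three states, bad state 2, threshold 1/4 — none of it from the paper): it enumerates the memoryless schedulers and tests, on random frames, that b(d) belongs to the half-space p↓ exactly when every b_α(d) does.

```python
import itertools
import random

beta = {2}                     # bad states of the toy MDP (ours)
delta = {                      # delta[s][a] = distribution over successor states
    0: {"a": {1: 0.5, 2: 0.5}, "b": {0: 1.0}},
    1: {"a": {0: 1.0}},
    2: {"a": {2: 1.0}},
}

def b_alpha(alpha, d):
    """One-step operator under the memoryless scheduler alpha."""
    return [1.0 if s in beta else
            sum(q * d[t] for t, q in delta[s][alpha[s]].items())
            for s in delta]

schedulers = [dict(zip(delta, acts)) for acts in
              itertools.product(*[list(delta[s]) for s in delta])]

def b(d):                      # b = pointwise max over all schedulers
    return [max(col) for col in zip(*(b_alpha(a, d) for a in schedulers))]

in_p = lambda d: d[0] <= 0.25  # membership in the lower set p↓

# b↓_r(p↓) = ⋂_α {d | b_α(d) ∈ p↓}:  b(d) ∈ p↓ iff every b_α(d) ∈ p↓
random.seed(1)
for _ in range(500):
    d = [random.random() for _ in delta]
    assert in_p(b(d)) == all(in_p(b_alpha(a, d)) for a in schedulers)
```

Since only state 0 has two enabled actions, there are exactly two schedulers here, mirroring the situation of Example 21 below where only two schedulers are relevant.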

**Corollary 20.** *Let* h *be a legit heuristic defined for (Candidate) and (Decide) as in* (6)*. If* μb ≰ p*, then* AdjointPDR↓_h *terminates.*

*Example 21.* Consider the maximum reachability problem with threshold λ = 1/4 and β = {s_3} for the following MDP on alphabet A = {a, b} and with s_ι = s_0.

*(Transition diagram garbled in extraction. Reading the transitions off the presentation of b below: under a, s_0 moves to s_1 and s_2 with probability 1/2 each, and s_1 moves to s_0 and s_3 with probability 1/2 each; under b, s_0 moves to s_0 with probability 1/3 and to s_2 with probability 2/3, and s_2 moves to s_0 with probability 1; s_3 ∈ β.)*

Hereafter we write d ∈ [0, 1]^S as a column vector with four entries v_0, …, v_3 and we use · for the usual matrix multiplication. With this notation, the lower set p↓ ∈ ([0, 1]^S)↓ and b: [0, 1]^S → [0, 1]^S can be written as

$$p^{\downarrow} = \left\{ \begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix} \;\middle|\; \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} \cdot \begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix} \le \begin{bmatrix} \tfrac{1}{4} \end{bmatrix} \right\} \qquad \text{and} \qquad b\left(\begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix}\right) = \begin{bmatrix} \max\!\left(\frac{v_1+v_2}{2},\, \frac{v_0+2v_2}{3}\right) \\ \frac{v_0+v_3}{2} \\ v_0 \\ 1 \end{bmatrix}.$$

Amongst the several memoryless schedulers, only two are relevant for us: ζ def= (s_0 ↦ a, s_1 ↦ a, s_2 ↦ b, s_3 ↦ a) and ξ def= (s_0 ↦ b, s_1 ↦ a, s_2 ↦ b, s_3 ↦ a). By using the definition of b_α: [0, 1]^S → [0, 1]^S, we have that

$$b_{\zeta}\left(\begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix}\right) = \begin{bmatrix} \frac{v_1+v_2}{2} \\ \frac{v_0+v_3}{2} \\ v_0 \\ 1 \end{bmatrix} \qquad \text{and} \qquad b_{\xi}\left(\begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix}\right) = \begin{bmatrix} \frac{v_0+2v_2}{3} \\ \frac{v_0+v_3}{2} \\ v_0 \\ 1 \end{bmatrix}.$$

It is immediate to see that the problem has a negative answer: using ζ, in 4 steps or less, s_0 can reach s_3 already with probability 1/4 + 1/8 = 3/8 > λ.
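This claim can be double-checked by Kleene iteration of b from ⊥, using the matrix presentation of b given above (the code is ours, and the operator is our reading of that presentation): b^5(⊥)(s_0) is the maximal probability of reaching s_3 within 4 steps.

```python
def b(v):
    # the operator b of Example 21 in coordinates (v_0, ..., v_3), with s_3 ∈ β
    return [max((v[1] + v[2]) / 2, (v[0] + 2 * v[2]) / 3),
            (v[0] + v[3]) / 2,
            v[0],
            1.0]

v = [0.0, 0.0, 0.0, 0.0]          # ⊥ in [0, 1]^S
for _ in range(5):                # b^5(⊥): max reachability within 4 steps
    v = b(v)

assert v[0] >= 1/4 + 1/8          # at least the two paths counted above
assert v[0] > 1/4                 # hence the threshold λ = 1/4 is exceeded
```

The iterate yields v[0] = 7/16, which also accounts for the third path s_0 s_1 s_0 s_1 s_3 of probability 1/16.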

To illustrate the advantages of (6), we run AdjointPDR↓ with the simple initial heuristic and with the heuristic that differs only in the choice for (Decide), taken as in (6). For both heuristics, the first iterations are the same: several

With the simple initial heuristic (central column of Fig. 5), the elements F_i = (b↓_r)^i(p↓) are systems of linear inequalities whose size grows along the execution: F_0 = {v | v_0 ≤ 1/4} and F_1 = {v | (v_1+v_2)/2 ≤ 1/4 and (v_0+2v_2)/3 ≤ 1/4}, while F_2 and F_3 are systems of three and five inequalities and F_4 is the singleton containing only the zero vector. With the heuristic in (6) (rightmost column), every element is a single inequality: F_0 = {v | v_0 ≤ 1/4}, F_1 = {v | (v_1+v_2)/2 ≤ 1/4}, F_2 = {v | (3v_0+v_3)/4 ≤ 1/4}, F_3 = {v | (3/8)(v_1+v_2) ≤ 0} and F_4 = {v | (9v_0+3v_3)/16 ≤ 0}. (The original table of Fig. 5 is garbled in extraction; only these entries are reproduced.)

**Fig. 5.** The elements of the negative sequences computed by AdjointPDR↓ for the MDP in Example 21. In the central column, these elements are computed by means of the simple initial heuristic, that is, $\mathcal{F}^i = (b^{\downarrow}_r)^i(p^{\downarrow})$. In the rightmost column, these elements are computed using the heuristic in (6): in particular, $\mathcal{F}^i = \{d \mid b_{\zeta}(d) \in \mathcal{F}^{i-1}\}$ for $i \le 3$, while for $i \ge 4$ they are computed as $\mathcal{F}^i = \{d \mid b_{\xi}(d) \in \mathcal{F}^{i-1}\}$.

repetitions of (Candidate), (Conflict) and (Unfold), exploiting elements of the positive chain that form the initial chain (except for the last element $x_{n-1}$).

$$(\emptyset\ \left[\begin{smallmatrix}0\\0\\0\\1\end{smallmatrix}\right]\left[\begin{smallmatrix}1\\1\\1\\1\end{smallmatrix}\right] \big\| \varepsilon)_{3,3} \ \xrightarrow{Ca}\ \xrightarrow{Co}\ \xrightarrow{U}\ \cdots\ \xrightarrow{Ca}\ \xrightarrow{Co}\ (\emptyset\ \left[\begin{smallmatrix}0\\0\\0\\1\end{smallmatrix}\right] \cdots \left[\begin{smallmatrix}1/2\\1/2\\1\\1\end{smallmatrix}\right]\left[\begin{smallmatrix}1/2\\1/2\\1\\1\end{smallmatrix}\right] \big\| p^{\downarrow})_{7,6}$$

In the latter state the algorithm has to perform (Decide), since $b(x_5) \notin p^{\downarrow}$. Now the choice of $z$ in (Decide) differs between the two heuristics: the former uses $b^{\downarrow}_r(p^{\downarrow}) = \{d \mid b(d) \in p^{\downarrow}\}$, the latter uses $\{d \mid b_{\zeta}(d) \in p^{\downarrow}\}$. Despite the different choices, both heuristics proceed with 6 steps of (Decide):

$$(\emptyset\ \left[\begin{smallmatrix}0\\0\\0\\1\end{smallmatrix}\right]\left[\begin{smallmatrix}0\\1\\1\\1\end{smallmatrix}\right]\left[\begin{smallmatrix}1/4\\1/3\\1\\1\end{smallmatrix}\right]\left[\begin{smallmatrix}1/4\\1/3\\1\\1\end{smallmatrix}\right] \big\| \mathcal{F}^{0})_{7,6} \ \xrightarrow{D}\ \cdots\ \xrightarrow{D}\ (\emptyset\ \left[\begin{smallmatrix}0\\0\\0\\1\end{smallmatrix}\right]\left[\begin{smallmatrix}0\\1\\1\\1\end{smallmatrix}\right]\left[\begin{smallmatrix}1/4\\1/3\\1\\1\end{smallmatrix}\right]\left[\begin{smallmatrix}1/4\\1/3\\1\\1\end{smallmatrix}\right] \big\| \mathcal{F}^{5}, \mathcal{F}^{4}, \mathcal{F}^{3}, \mathcal{F}^{2}, \mathcal{F}^{1}, \mathcal{F}^{0})$$

The elements of the negative sequence $\mathcal{F}^i$ are illustrated in Fig. 5 for both heuristics. In both cases, $\mathcal{F}^5 = \emptyset$ and thus AdjointPDR↓ returns false.

To appreciate the advantages provided by (6), it is enough to compare the two columns for the $\mathcal{F}^i$ in Fig. 5: in the central column, the number of inequalities defining $\mathcal{F}^i$ grows significantly, while in the rightmost column it is always 1.

Whenever $Y_k$ is generated by a single linear inequality, we observe that $Y_k = \{d \in [0,1]^S \mid \sum_{s \in S} r_s \cdot d(s) \le r\}$ for suitable non-negative real numbers $r$ and $r_s$, for all $s \in S$. The convex set $Y_k$ is generated by finitely many $d \in [0,1]^S$ enjoying a convenient property: $d(s)$ is different from 0 and 1 for at most one $s \in S$. The set of its generators, denoted by $G_k$, can thus be easily computed. We exploit this property to resolve the choice for (Conflict). We consider its subset $\mathcal{Z}_k \stackrel{\text{def}}{=} \{d \in G_k \mid b(x_{k-1}) \le d\}$ and define $z_B, z_{01} \in [0,1]^S$ for all $s \in S$ as

$$z\_B(s) \stackrel{\text{def}}{=} \begin{cases} (\bigwedge \mathcal{Z}\_k)(s) & \text{if } r\_s \neq 0, \mathcal{Z}\_k \neq \emptyset \\ b(x\_{k-1})(s) & \text{otherwise} \end{cases} \\ z\_{01}(s) \stackrel{\text{def}}{=} \begin{cases} \lceil z\_B(s) \rceil & \text{if } r\_s = 0, \mathcal{Z}\_k \neq \emptyset \\ z\_B(s) & \text{otherwise} \end{cases} \tag{7}$$

where, for $u \in [0,1]$, $\lceil u \rceil$ denotes 0 if $u = 0$ and 1 otherwise. We call hCoB and hCo01 the heuristics defined as in (6) for (Candidate) and (Decide), and as $z_B$, respectively $z_{01}$, for (Conflict). The heuristic hCo01 can be seen as a Boolean modification of hCoB, rounding up positive values to 1 to accelerate convergence.
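The computation of $\mathcal{Z}_k$ and the two conflict choices can be sketched in a few lines. This is a simplified Python model (not the authors' Haskell code), restricted to the case where $Y_k$ is one inequality with a single non-zero coefficient, which is exactly the shape of $p^{\downarrow}$ in Example 23 below:

```python
from fractions import Fraction as F

def generators_Z(r, rhs, bx):
    """Z_k: generators d of Y_k = {d | sum_s r[s]*d[s] <= rhs} with
    b(x_{k-1}) <= d.  Simplified sketch: the inequality has a single
    non-zero coefficient, at position `pivot`."""
    n = len(r)
    pivot = next(s for s in range(n) if r[s] != 0)
    free = [s for s in range(n) if s != pivot]
    Z = []
    for bits in range(2 ** len(free)):
        d = [F(0)] * n
        d[pivot] = rhs / r[pivot]          # boundary value at the pivot
        for i, s in enumerate(free):
            d[s] = F((bits >> i) & 1)      # 0/1 everywhere else
        if all(bx[s] <= d[s] for s in range(n)):
            Z.append(d)
    return Z

def z_B(r, rhs, bx):
    Z = generators_Z(r, rhs, bx)
    return [min(d[s] for d in Z) if (r[s] != 0 and Z) else bx[s]
            for s in range(len(r))]

def z_01(r, rhs, bx):
    zb = z_B(r, rhs, bx)
    Z = generators_Z(r, rhs, bx)
    # round positive values up to 1 on coordinates with r_s = 0
    return [F(1) if (r[s] == 0 and Z and zb[s] > 0) else zb[s]
            for s in range(len(r))]

r, rhs = [F(1), F(0), F(0), F(0)], F(2, 5)           # p-down: v0 <= 2/5
zb1 = z_B(r, rhs, [F(0), F(0), F(0), F(1)])          # first (Conflict)
zb2 = z_B(r, rhs, [F(2, 5), F(4, 5), F(0), F(1)])    # second (Conflict)
z01 = z_01(r, rhs, [F(2, 5), F(4, 5), F(0), F(1)])
print(zb1, zb2, z01)
```

On these inputs the sketch reproduces the two (Conflict) steps of Example 23: $z_B = [\frac25, 0, 0, 1]$, then $z_B = [\frac25, \frac45, 0, 1]$ versus $z_{01} = [\frac25, 1, 0, 1]$.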

**Proposition 22.** *The heuristics* hCoB *and* hCo01 *are legit.*

By Corollary 20, AdjointPDR<sup>↓</sup> terminates for negative answers with both hCoB and hCo01. We conclude this section with a last example.

*Example 23.* Consider the following MDP with alphabet $A = \{a, b\}$ and $s_\iota = s_0$

*(Diagram: the MDP has states $s_0, s_1, s_2, s_3$. Consistently with $b$ below: $s_0$ has one action looping on $s_0$ and one moving to $s_1$ and $s_2$ with probability $\frac12$ each; from $s_1$ the only action moves to $s_0$ with probability $\frac13$ and to $s_3$ with probability $\frac23$; $s_2$ loops with probability 1; $s_3$ is absorbing.)*

and the max reachability problem with threshold $\lambda = \frac{2}{5}$ and $\beta = \{s_3\}$. The lower set $p^{\downarrow} \in ([0,1]^S)^{\downarrow}$ and $b \colon [0,1]^S \to [0,1]^S$ can be written as

$$p^{\downarrow} = \left\{ \begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix} \,\middle|\, [1\ 0\ 0\ 0] \cdot \begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix} \le \frac{2}{5} \right\} \qquad \text{and} \qquad b\!\left(\begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix}\right) = \begin{bmatrix} \max\!\left(v_0, \frac{v_1 + v_2}{2}\right) \\[2pt] \frac{v_0 + 2 v_3}{3} \\[2pt] v_2 \\[2pt] 1 \end{bmatrix}.$$

With the simple initial heuristic, AdjointPDR↓ does not terminate. With the heuristic hCo01, it returns true in 14 steps, while with hCoB it does so in 8. The first 4 steps, common to both hCoB and hCo01, are illustrated below.

$$(\emptyset\ \left[\begin{smallmatrix}0\\0\\0\\0\end{smallmatrix}\right]\left[\begin{smallmatrix}1\\1\\1\\1\end{smallmatrix}\right] \big\| \varepsilon)_{3,3} \xrightarrow{Ca} (\emptyset\ \left[\begin{smallmatrix}0\\0\\0\\0\end{smallmatrix}\right]\left[\begin{smallmatrix}1\\1\\1\\1\end{smallmatrix}\right] \big\| p^{\downarrow})_{3,2} \xrightarrow{Co} (\emptyset\ \left[\begin{smallmatrix}0\\0\\0\\0\end{smallmatrix}\right]\left[\begin{smallmatrix}2/5\\0\\0\\1\end{smallmatrix}\right] \big\| \varepsilon)_{3,3}$$

where $b([0\ 0\ 0\ 0]^{\top}) = [0\ 0\ 0\ 1]^{\top}$ and $\mathcal{Z}_2 = \{[\tfrac25\ 0\ 0\ 1]^{\top}, [\tfrac25\ 1\ 0\ 1]^{\top}, [\tfrac25\ 0\ 1\ 1]^{\top}, [\tfrac25\ 1\ 1\ 1]^{\top}\}$; then

$$\xrightarrow{U} \xrightarrow{Ca} (\emptyset\ \left[\begin{smallmatrix}0\\0\\0\\0\end{smallmatrix}\right]\left[\begin{smallmatrix}2/5\\0\\0\\1\end{smallmatrix}\right]\left[\begin{smallmatrix}1\\1\\1\\1\end{smallmatrix}\right] \big\| p^{\downarrow})_{4,3} \xrightarrow{Co} \begin{cases} (\emptyset\ \left[\begin{smallmatrix}0\\0\\0\\0\end{smallmatrix}\right]\left[\begin{smallmatrix}2/5\\0\\0\\1\end{smallmatrix}\right]\left[\begin{smallmatrix}2/5\\1\\0\\1\end{smallmatrix}\right] \big\| \varepsilon)_{4,4} \\[6pt] (\emptyset\ \left[\begin{smallmatrix}0\\0\\0\\0\end{smallmatrix}\right]\left[\begin{smallmatrix}2/5\\0\\0\\1\end{smallmatrix}\right]\left[\begin{smallmatrix}2/5\\4/5\\0\\1\end{smallmatrix}\right] \big\| \varepsilon)_{4,4} \end{cases}$$

where $b([\tfrac25\ 0\ 0\ 1]^{\top}) = [\tfrac25\ \tfrac45\ 0\ 1]^{\top}$ and $\mathcal{Z}_3 = \{[\tfrac25\ 1\ 0\ 1]^{\top}, [\tfrac25\ 1\ 1\ 1]^{\top}\}$.

Observe that in the first (Conflict) $z_B = z_{01}$, while in the second $z_{01}(s_1) = 1$ and $z_B(s_1) = \frac{4}{5}$, leading to the two different states displayed above.
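The answer true can be cross-checked with plain value iteration. The sketch below hard-codes the entries of the Bellman operator $b$ of Example 23 as read off the displayed definition (an illustrative reading, not generated from a model file), and shows that the maximum reachability probability from $s_0$ converges from below to $\frac25 = \lambda$:

```python
# Value iteration for the Bellman operator b of Example 23.
def b(v):
    v0, v1, v2, v3 = v
    return [max(v0, (v1 + v2) / 2),  # s0: best of self-loop and uniform move
            (v0 + 2 * v3) / 3,       # s1: 1/3 to s0, 2/3 to s3
            v2,                      # s2: self-loop, never reaches s3
            1.0]                     # s3: the target itself

v = [0.0, 0.0, 0.0, 0.0]
for _ in range(100):                 # Kleene iteration from the bottom element
    v = b(v)
print(round(v[0], 6))                # -> 0.4, i.e. P = 2/5 <= lambda
```

Since the iterates approach $\frac25$ from below and never exceed it, $P \le \frac25$ indeed holds, matching the returned answer.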

## **6 Implementation and Experiments**

We first developed, using Haskell and exploiting its abstraction features, a common template that accommodates both AdjointPDR and AdjointPDR↓. It is a program parametrized by two lattices—used for positive chains and negative sequences, respectively—and by a heuristic.

For our experiments, we instantiated the template to AdjointPDR↓ for MDPs (letting $L = [0,1]^S$), with three different heuristics: hCoB and hCo01 from Proposition 22, and hCoS introduced below. Besides the template (∼100 lines), we needed ∼140 lines to account for hCoB and hCo01, and an additional ∼100 lines to further obtain hCoS. All this indicates a clear benefit of our abstract theory: a general template can itself be coded succinctly, and instantiation to concrete problems is easy, too, thanks to an explicitly specified interface of heuristics.

Our implementation accepts MDPs expressed in a symbolic format inspired by Prism models [20], in which states are variable valuations and transitions are described by symbolic functions (they can be segmented with symbolic guards {guardi}i). We use rational arithmetic (Rational in Haskell) for probabilities to limit the impact of rounding errors.

**Heuristics.** The three heuristics (hCoB, hCo01, hCoS) use the same choices in (Candidate) and (Decide), as defined in (6), but different ones in (Conflict).

The third heuristic hCoS is a *symbolic* variant of hCoB; it relies on our symbolic model format. It uses $z_S$ for $z$ in (Conflict), where $z_S(s) = z_B(s)$ if $r_s \neq 0$ or $\mathcal{Z}_k = \emptyset$. Otherwise, the definition of $z_S(s)$ is notable: we use a piecewise affine function $(t_i \cdot s + u_i)_i$ for $z_S(s)$, where the affine functions $(t_i \cdot s + u_i)_i$ are guarded by the same guards $\{guard_i\}_i$ of the MDP's transition function. We let the SMT solver Z3 [25] search for the values of the coefficients $t_i, u_i$, so that $z_S$ satisfies the requirements of (Conflict) (namely $b(x_{k-1})(s) \le z_S(s) \le 1$ for each $s \in S$ with $r_s = 0$), together with the condition $b(z_S) \le z_S$ for faster convergence. If the search is unsuccessful, we give up on hCoS and fall back on the heuristic hCoB.

As a task common to the three heuristics, we need to calculate $\mathcal{Z}_k = \{d \in G_k \mid b(x_{k-1}) \le d\}$ in (Conflict) (see (7)). Rather than computing the whole set $G_k$ of generating points of the linear inequality that defines $Y_k$, we implemented an ad hoc algorithm that crucially exploits the condition $b(x_{k-1}) \le d$ for pruning.

**Experiment Settings.** We conducted the experiments on Ubuntu 18.04 and AWS t2.xlarge (4 CPUs, 16 GB memory, up to 3.0 GHz Intel Scalable Processor). We used several Markov chain (MC) benchmarks and a couple of MDP ones.

**Research Questions.** We wish to address the following questions.

- **RQ1**: Does AdjointPDR↓ improve on existing probabilistic PDR algorithms (LT-PDR and PrIC3)?
- **RQ2**: How does AdjointPDR↓ compare with a state-of-the-art probabilistic model checker (Storm)?
- **RQ3**: Do the heuristics derived from our theory (hCoB, hCo01, hCoS) complement each other?
- **RQ4**: Can AdjointPDR↓ handle the nondeterminism of MDPs as well as it handles MCs?


**Table 1.** Experimental results on MC benchmarks. |S| is the number of states, P is the reachability probability (calculated by manual inspection), λ is the threshold in the problem P ≤? λ (shaded if the answer is no). The other columns show the average execution time in seconds; TO is timeout (900 s); MO is out-of-memory. For AdjointPDR↓ and LT-PDR we used the tasty-bench Haskell package and repeated executions until the standard deviation was < 5% (at least three executions). For PrIC3 and Storm, we made five executions. Storm's execution does not depend on λ: it seems to answer queries of the form P ≤? λ by calculating P. We observed a wrong answer for the entry marked (†) (Storm, sp.-num., Haddad-Monmege); see the discussion of RQ2.

**Experiments on MCs (Table 1).** We used six benchmarks: Haddad-Monmege is from [17]; the others are from [3,19]. We compared AdjointPDR↓ (with three heuristics) against LT-PDR [19], PrIC3 (with four heuristics *none*, *lin.*, *pol.*, *hyb.*, see [3]), and Storm 1.5 [11]. Storm is a recent comprehensive toolsuite that implements different algorithms and solvers. Among them, our comparison is against *sparse-numeric*, *sparse-rational*, and *sparse-sound*. The *sparse* engine uses explicit state-space representation by sparse matrices; this is unlike another representative engine, *dd*, which uses symbolic BDDs. (We did not use *dd* since it often reported errors, and was overall slower than *sparse*.) *Sparse-numeric* is a value-iteration (VI) algorithm; *sparse-rational* solves linear (in)equations using rational arithmetic; *sparse-sound* is a sound VI algorithm [26].<sup>2</sup>

<sup>2</sup> There are two other sound algorithms in Storm: one utilizes interval iteration [2], the other optimistic VI [16]. We have excluded them from the results since we observed that they returned incorrect answers.


**Table 2.** Experimental results on MDP benchmarks. The legend is the same as Table 1, except that P is now the maximum reachability probability.

**Experiments on MDPs (Table** 2**).** We used two benchmarks from [17]. We compared AdjointPDR<sup>↓</sup> only against Storm, since RQ1 is already addressed using MCs (besides, PrIC3 did not run for MDPs).

**Discussion.** The experimental results suggest the following answers to the RQs.

**RQ1**. The performance advantage of AdjointPDR↓ over both LT-PDR and PrIC3 was clearly observed throughout the benchmarks. AdjointPDR↓ outperformed LT-PDR, thus confirming empirically the theoretical observation in Sect. 4.2. The benefit is particularly evident in those instances whose answer is positive. AdjointPDR↓ generally outperformed PrIC3, too. Exceptions are in ZeroConf, Chain and DoubleChain, where PrIC3 with polynomial (pol.) and hybrid (hyb.) heuristics performs well. This seems to be thanks to the expressivity of the polynomial template in PrIC3, a possible enhancement we have yet to implement (currently our symbolic heuristic hCoS uses only the affine template).

**RQ2**. The comparison with Storm is interesting. Note first that Storm's *sparse-numeric* algorithm is a VI algorithm that gives a guaranteed lower bound *without guaranteed convergence*. Therefore its positive answer to P ≤? λ may not be correct. Indeed, for Haddad-Monmege with $|S| \sim 10^3$, it answered P = 0.5, which is wrong ((†) in Table 1). This is in contrast with PDR algorithms, which discover an explicit witness for P ≤ λ via their positive chain.

Storm's *sparse-rational* algorithm is precise. It was faster than PDR algorithms in many benchmarks, although AdjointPDR↓ was better or comparable in ZeroConf ($10^4$) and Haddad-Monmege (41), for λ such that P ≤ λ is true. We believe this suggests a general advantage of PDR algorithms, namely accelerating the search for an invariant-like witness for safety.

Storm's *sparse-sound* algorithm is a sound VI algorithm that returns correct answers aside from numerical errors. Its performance was similar to that of sparse-numeric, except for the two instances of Haddad-Monmege: sparse-sound returned correct answers but was much slower than sparse-numeric. For these two instances, AdjointPDR↓ outperformed sparse-sound.

It seems that a big part of Storm's good performance is attributed to the sparsity of state representation. This is notable in the comparison of the two instances of Haddad-Monmege (41 vs. 10<sup>3</sup>): while Storm handles both of them easily, AdjointPDR<sup>↓</sup> struggles a bit in the bigger instance. Our implementation can be extended to use sparse representation, too; this is future work.

**RQ3**. We derived the three heuristics (hCoB, hCo01, hCoS) exploiting the theory of AdjointPDR↓. The experiments show that each heuristic has its own strength. For example, hCo01 is slower than hCoB for MCs, but it is much better for MDPs. In general, there is no silver bullet heuristic, so coming up with a variety of them is important. The experiments suggest that our theory of AdjointPDR<sup>↓</sup> provides great help in doing so.

**RQ4**. Table 2 shows that AdjointPDR↓ can handle nondeterminism well: once a suitable heuristic is chosen, its performance on MDPs and on MCs of similar size is comparable. It is also interesting that the better-performing heuristics vary, as we discussed above.

**Summary.** AdjointPDR<sup>↓</sup> clearly outperforms existing probabilistic PDR algorithms in many benchmarks. It also compares well with Storm—a highly sophisticated toolsuite—in a couple of benchmarks. These are notable especially given that AdjointPDR<sup>↓</sup> currently lacks enhancing features such as richer symbolic templates and sparse representation (adding which is future work). Overall, we believe that AdjointPDR<sup>↓</sup> *confirms the potential of PDR algorithms in probabilistic model checking*. Through the three heuristics, we also observed the value of an abstract general theory in devising heuristics in PDR, which is probably true of verification algorithms in general besides PDR.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Fast Approximations of Quantifier Elimination

Isabel Garcia-Contreras1(B) , V. K. Hari Govind<sup>1</sup> , Sharon Shoham<sup>2</sup> , and Arie Gurfinkel<sup>1</sup>

> <sup>1</sup> University of Waterloo, Waterloo, Canada {igarciac,hgvedira,agurfink}@uwaterloo.ca <sup>2</sup> Tel-Aviv University, Tel Aviv, Israel sharon.shoham@cs.tau.ac.il

Abstract. Quantifier elimination (qelim) is used in many automated reasoning tasks including program synthesis, exist-forall solving, quantified SMT, Model Checking, and solving Constrained Horn Clauses (CHCs). Exact qelim is computationally expensive. Hence, it is often approximated. For example, Z3 uses "light" pre-processing to reduce the number of quantified variables. CHC-solver Spacer uses model-based projection (MBP) to under-approximate qelim relative to a given model, and over-approximations of qelim can be used as abstractions.

In this paper, we present the QEL framework for fast approximations of qelim. QEL provides a uniform interface for both quantifier reduction and model-based projection. QEL builds on the egraph data structure – the core of the EUF decision procedure in SMT – by casting quantifier reduction as a problem of choosing *ground* (i.e., variable-free) representatives for equivalence classes. We have used QEL to implement MBP for the theories of Arrays and Algebraic Data Types (ADTs). We integrated QEL and our new MBP in Z3 and evaluated it within several tasks that rely on quantifier approximations, outperforming the state of the art.

#### 1 Introduction

Quantifier Elimination (qelim) is used in many automated reasoning tasks including program synthesis [18], exist-forall solving [8,9], quantified SMT [5], and Model Checking [17]. Complete qelim, even when possible, is computationally expensive, and solvers often approximate it. We call these approximations *quantifier reductions*, to separate them from qelim. The difference is that quantifier reduction might leave some free variables in the formula.

For example, Z3 [19] performs quantifier reduction, called QeLite, by greedily substituting variables with definitions that appear syntactically in the formula. While it is very useful, it is necessarily sensitive to the order in which variables are substituted and depends on definitions appearing explicitly in the formula. Even though it may seem that these shortcomings must be tolerated to keep QeLite fast, in this paper we show that this is not actually the case; we propose an egraph-based algorithm, QEL, that performs fast quantifier reduction and is complete relative to some semantic properties of the formula.

Egraph [20] is a data structure that compactly represents infinitely many terms and their equivalence classes. It was initially proposed as a decision procedure for EUF [20] and used for theorem proving (e.g., Simplify [7]). Since then, the applications of egraphs have grown. Egraphs are now used as term rewrite systems in equality saturation [15,23], for theory combination in SMT solvers [7,21], and for term abstract domains in Abstract Interpretation [6,10,12].

Using egraphs for rewriting or other formula manipulations (like qelim) requires a special operation, called *extract*, that converts nodes in the egraph back into terms. Term extraction was not considered when egraphs were first designed [20]. As far as we know, extraction was first studied in the application of egraphs for compiler optimization. Specifically, equality saturation [15,22] is an optimization technique over egraphs that consists in populating an egraph with many equivalent terms inferred by applying rules. When the egraph is saturated, i.e., applying the rules has no effect, the equivalent term that is most desired, e.g., smallest in size, is *extracted*. This is a recursive process that extracts each sub-term by choosing one representative among its equivalents.

Applications of egraphs to rewriting have recently resurged, driven by the egg library [24] and the associated workshop<sup>1</sup>. In [24], the authors show, once again, the power and versatility of this data structure. Motivated by applications of equality saturation, they provide a generic and efficient framework equipped with term extraction, based on an extensible class analysis.

Egraphs seem to be the perfect data structure to address the challenges of quantifier reduction: they allow reasoning about infinitely many equivalent terms and consider all available variable definitions and orderings at once. However, things are not always what they seem. The key to quantifier reduction is finding ground (i.e., variable-free) representatives for equivalence classes with free variables. This goes against existing techniques for term extraction, since it requires selecting larger, rather than smaller, terms as representatives. Selecting representatives carelessly makes term extraction diverge. To our surprise, this problem has not been studied so far. In fact, egg [24] incorrectly claims that any representative function can be used with its term extraction, while the implementation diverges. In this paper, we bridge this gap by providing necessary and sufficient conditions for a representative function to be admissible for term extraction as defined in [15,24]. Furthermore, we extend extraction from terms to formulas, to enable extracting a formula from the egraph.

Our main contribution is a new quantifier reduction algorithm, called QEL. Building on the term extraction described above, it is formulated as finding a representative function that maximizes the number of ground terms as representatives. Furthermore, it greedily attempts to represent variables without ground representatives in terms of other variables, thus further reducing the number of variables in the output. We show that QEL is complete relative to ground definitions entailed by the formula. Specifically, QEL guarantees to eliminate a variable if it is equivalent to a ground term.

<sup>1</sup> https://pldi22.sigplan.org/series/egraphs.

Whenever an application requires eliminating all free variables, incomplete techniques such as QeLite or QEL are insufficient. In this case, qelim is underapproximated using a Model-based Projection (MBP) that uses a model M of a formula to guide under-approximation using equalities and variable definitions that are consistent with M. In this paper, we show that MBP can be implemented using our new techniques for QEL together with the machinery from equality saturation. Just like SMT solvers use egraphs as glue to combine different theory solvers, we use egraphs as glue to combine projection for different theories. In particular, we give an algorithm for MBP in the combined theory of Arrays and Algebraic DataTypes (ADTs). The algorithm uses insights from QEL to produce less under-approximate MBPs.

We implemented QEL and the new MBP using egraphs inside the state-of-the-art SMT solver Z3 [19]. Our implementation (referred to as Z3eg) replaces the existing QeLite and MBP. We evaluate our algorithms in two contexts. First, inside the QSAT [5] algorithm for quantified satisfiability. The performance of QSAT in Z3eg is improved, compared to QSAT in Z3, when ADTs are involved. Second, we evaluate our algorithms inside the Constrained Horn Clause (CHC) solver Spacer [17]. Our experiments show that Spacer in Z3eg solves many more benchmarks containing nested Arrays and ADTs.

*Related Work.* Quantifier reduction by variable substitution is widely used in quantified SMT [5,11]. To our knowledge, we are the first to look at this problem semantically and provide an algorithm that guarantees that the variable is eliminated if the formula entails that it has a ground definition.

Term extraction for egraphs comes from equality saturation [15,22]. The egg Rust library [24] is a recent implementation of equality saturation that supports rewriting and term extraction. However, we did not use egg because we integrated QEL within Z3 and built it using Z3 data structures instead.

Model-based projection was first introduced for the Spacer CHC solver for LIA and LRA [17] and extended to the theory of Arrays [16] and ADTs [5]. Until now, it was implemented by syntactic rewriting. Our egraph-based MBP implementation is less sensitive to syntax and, more importantly, allows for combining MBPs of multiple theories for MBP of the combination. As a result, our MBP is more general and less model dependent. Specifically, it requires fewer model equalities and produces more general under-approximations than [5,16].

*Outline.* The rest of the paper is organized as follows. Section 2 provides background. Section 3 introduces term extraction, extends it to formulas, and characterizes representative-based term extraction for egraphs. Section 4 presents QEL, our algorithm for fast quantifier reduction that is relatively complete. Section 5 shows how to compute MBP combining equality saturation and the ideas from Sect. 4 for the theories of ADTs and Arrays. All algorithms have been implemented in Z3 and evaluated in Sect. 6.

### 2 Background

We assume the reader is familiar with multi-sorted first-order logic (FOL) with equality and the theory of equality with uninterpreted functions (EUF) (for an introduction see, e.g., [4]). We use ≈ to denote the designated logical equality symbol. For simplicity of presentation, we assume that the FOL signature Σ contains only functions (i.e., no predicates) and constants (i.e., 0-ary functions). To represent predicates, we assume the FOL signature has a designated sort Bool, and two Bool constants ⊤ and ⊥, representing true and false, respectively. We then use Bool-valued functions to represent predicates, using P(a) ≈ ⊤ and P(a) ≈ ⊥ to mean that P(a) is true or false, respectively. Informally, we continue to write P(a) and ¬P(a) as syntactic sugar for P(a) ≈ ⊤ and P(a) ≈ ⊥, respectively. We use lowercase letters like a, b for constants, and f, g for functions, and uppercase letters like P, Q for Bool functions that represent predicates. We denote by $\psi^{\exists}$ the existential closure of ψ.

*Quantifier Elimination (qelim).* Given a quantifier-free (QF) formula ϕ with free variables *<sup>v</sup>*, *quantifier elimination* of <sup>ϕ</sup><sup>∃</sup> is the problem of finding a QF formula ψ with no free variables such that ψ <sup>≡</sup> ϕ<sup>∃</sup>. For example, a qelim of <sup>∃</sup>a · (a <sup>≈</sup> x <sup>∧</sup> f(a) > 3) is f(x) > <sup>3</sup>; and, there is no qelim of <sup>∃</sup>x · (f(x) > 3), because it is impossible to restrict f to have "at least one value in its range that is greater than 3" without a quantifier.

*Model Based Projection (MBP).* Let ϕ be a formula with free variables *v*, and M a model of ϕ. A *model-based projection* of ϕ relative to M is a QF formula ψ such that ψ <sup>⇒</sup> ϕ<sup>∃</sup> and <sup>M</sup> <sup>|</sup><sup>=</sup> <sup>ψ</sup>. That is, <sup>ψ</sup> has no free variables, is an underapproximation of ϕ, and satisfies the designated model M, just like ϕ. MBP is used by many algorithms to under-approximate qelim, when the computation of qelim is too expensive or, for some reason, undesirable.

*Egraphs.* An egraph is a well-known data structure to compactly represent a set of terms and an equivalence relation on those terms [20]. Throughout the paper, we assume that graphs have an ordered successor relation and use $n[i]$ to denote the $i$th successor (child) of a node $n$. The out-degree of a node $n$, $\deg(n)$, is the number of edges leaving $n$. Given a node $n$, $\mathit{parents}(n)$ denotes the set of nodes with an outgoing edge to $n$, and $\mathit{children}(n)$ denotes the set of nodes with an incoming edge from $n$.

**Definition 1.** *Let* Σ *be a first-order logic signature. An* egraph *is a tuple* $G = \langle N, E, L, \mathit{root} \rangle$*, where*


Fig. 1. Example egraph of ϕ1.

Given an egraph $G$, the *class* of a node $n \in G$, $\mathit{class}(n)$, is the set of all nodes that are equivalent to $n$ (i.e., its equivalence class under $\rho_{\mathit{root}}$). The *term* of $n$, $\mathit{term}(n)$, with $L(n) = f$, is $f$ if $\deg(n) = 0$, and $f(\mathit{term}(n[1]), \ldots, \mathit{term}(n[\deg(n)]))$ otherwise. We assume that the terms of different nodes are different, and refer to a node $n$ by its term. An example of an egraph $G = \langle N, E, L, \mathit{root} \rangle$ is shown in Fig. 1. A symbol $f$ inside a circle depicts a node $n$ with label $L(n) = f$; solid black and dashed red arrows depict $E$ and $\mathit{root}$, respectively. The order of the black arrows from left to right defines the order of the children. In our examples, we refer to a specific node $i$ by its number, using $N(i)$, or by its term, e.g., $N(k+1)$. A node $n$ without an outgoing red arrow is its own root. A set of nodes connected to the same node with red edges forms an equivalence class. In this example, $\mathit{root}$ defines the equivalence classes $\{N(3), N(4), N(5), N(6)\}$, $\{N(8), N(9)\}$, and a class for each of the remaining nodes. Examples of some terms in $G$ are $\mathit{term}(N(9)) = y$ and $\mathit{term}(N(5)) = \mathit{read}(a, y)$.
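The definitions of class and term can be made concrete in a few lines. The following is an illustrative Python skeleton (the names and encoding are ours, not Z3's or egg's actual API): nodes carry a label and ordered children, and root pointers encode the equivalence classes.

```python
class Node:
    """A node of an egraph: a label, ordered children, and a root pointer."""
    def __init__(self, label, children=()):
        self.label, self.children = label, list(children)
        self.root = self          # initially, every node is its own root

def find(n):
    """Follow root pointers to the representative of n's class."""
    while n.root is not n:
        n = n.root
    return n

def term(n):
    """term(n): the label applied to the terms of the actual children."""
    if not n.children:
        return n.label
    return f"{n.label}({', '.join(term(c) for c in n.children)})"

# Fragment of Fig. 1: read(a, x), with x and y merged into one class.
a, x, y = Node("a"), Node("x"), Node("y")
rd = Node("read", [a, x])
y.root = x                        # the equivalence class {x, y}
print(term(rd), find(y) is find(x))
```

Here `find` plays the role of following the dashed red arrows, and `term` is the recursive term-of-a-node definition from the text.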

*An Egraph of a Formula.* We consider formulas that are conjunctions of equality literals (recall that we represent predicate applications by equality literals). Given a formula $\varphi \triangleq (t_1 \approx u_1 \wedge \cdots \wedge t_k \approx u_k)$, an egraph for $\varphi$ is built (following the standard procedure [20]) by creating nodes for each $t_i$ and $u_i$, recursively creating nodes for their subexpressions, and merging the classes of each pair $t_i$ and $u_i$, computing the congruence closure for $\mathit{root}$. We write $\mathit{egraph}(\varphi)$ for an egraph of $\varphi$ constructed via some deterministic procedure based on the recipe above. Figure 1 shows an egraph of $\varphi_1$. The equality $z \approx \mathit{read}(a, x)$ is captured by $N(3)$ and $N(4)$ belonging to the same class (i.e., a red arrow from $N(4)$ to $N(3)$). Similarly, the equality $x \approx y$ is captured by a red arrow from $N(9)$ to $N(8)$. Note that by congruence, $\varphi_1$ implies $\mathit{read}(a, x) \approx \mathit{read}(a, y)$, which, by transitivity, implies that $k + 1 \approx \mathit{read}(a, x)$. In Fig. 1, this corresponds to red arrows from $N(5)$ and $N(6)$ to $N(3)$. The predicate application $3 > z$ is captured by the red arrow from $N(1)$ to $N(0)$. From now on, we omit ⊤ and ⊥ and the corresponding edges from figures to avoid clutter.

*Explicit and Implicit Equality.* Note that egraphs represent equality implicitly by placing nodes with equal terms in the same equivalence class. Sometimes, it is necessary to represent equality explicitly, for example, when using egraphs for

(a) G*a*, interpreting *eq* as ≈. (b) G*b*, not interpreting *eq*. (c) G*c*, combining (a) and (b).

Fig. 2. Different egraph interpretations for ϕ2.

equality-aware rewriting (e.g., in egg [24]). To represent equality explicitly, we introduce a binary Bool function *eq* and write *eq*(a, b) for an equality that has to be represented explicitly. We change the *egraph* algorithm to treat *eq*(a, b) both as a function application and as a logical equality a ≈ b: when processing the term *eq*(a, b), the algorithm adds *eq*(a, b) to the egraph and merges the nodes for a and b into one class. For example, Fig. 2 shows three different interpretations of a formula ϕ2 with equality interpreted: implicitly (as in [20]), explicitly (as in [24]), and both implicitly and explicitly (as in this paper).

#### 3 Extracting Formulas from Egraphs

Egraphs were proposed as a decision procedure for EUF [20] – a setting in which converting an egraph back to a formula, or *extracting*, is irrelevant. Term extraction has been studied in the context of equality saturation and term rewriting [15,24]. However, the existing literature treats extraction as a heuristic, and, to the best of our knowledge, the problem has not been explored exhaustively. In this section, we fill these gaps in the literature and extend extraction from terms to formulas.

*Term Extraction.* We begin by recalling how to extract the term of a node. The function ntt (node-to-term) in Fig. 3 performs an extraction parametrized by a representative function repr : N → N (as in [24]). A function repr assigns to each class a unique representative node (i.e., nodes in the same class are mapped to the same representative), so that the equivalence relations induced by root and by repr coincide. The function ntt extracts a term of a node recursively, similarly to *term*, except that the representatives of the children of a node are used instead of the actual children. We refer to terms built in this way by ntt(n, repr), and omit repr when it is clear from the context.

As an example, consider repr1 ≜ {N(3), N(8)} for Fig. 1. For readability, we denote representative functions by the sets of nodes that are class representatives, omitting N(⊤), which always represents its class, and omitting all singleton classes. Thus, repr1 maps all nodes in *class*(N(3)) to N(3), nodes in *class*(N(8)) to N(8), nodes in *class*(N(⊤)) to N(⊤), and all nodes in singleton classes to themselves. For example, ntt(N(5)) extracts *read*(a, x), since the representative of N(9) is N(8).
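A sketch of ntt under an illustrative encoding (node ids mapped to (label, children) pairs, and `repr_of` mapping each node to its class representative; the encoding is ours):

```python
nodes = {
    0: ("a", ()),            # leaf a
    1: ("x", ()),            # leaf x
    2: ("y", ()),            # leaf y
    3: ("read", (0, 2)),     # read(a, y)
}
# x and y share a class whose representative is x; other classes are singletons
repr_of = {0: 0, 1: 1, 2: 1, 3: 3}

def ntt(n):
    """node-to-term: route every child through its class representative."""
    label, children = nodes[n]
    if not children:
        return label
    args = ", ".join(ntt(repr_of[c]) for c in children)
    return f"{label}({args})"

# Although node 3 stores read(a, y), extraction yields the representative's term:
assert ntt(3) == "read(a, x)"
```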

```
egraph :: to_formula(repr, S)
 1: Lits := ∅
 2: for r = repr(r) ∈ N do
 3:   t := ntt(r, repr)
 4:   for n ∈ (class(r) \ {r}) do
 5:     if n ∉ S then
 6:       Lits := Lits ∪ {t ≈ ntt(n, repr)}
 7: ret Lits

egraph :: ntt(n, repr)
 8: f := L[n]
 9: if deg(n) = 0 then
10:   ret f
11: else
12:   for i ∈ [1, deg(n)] do
13:     Args[i] := ntt(repr(n[i]), repr)
14:   ret f(Args)
```
Fig. 3. Producing formulas from an egraph.

*Formula Extraction.* Let G = *egraph*(ϕ) be an egraph of some formula ϕ. A formula ψ is a *formula of* G, written *isFormula*(G, ψ), if ψ∃ ≡ ϕ∃.

Figure 3 shows an algorithm to\_formula(repr, S) that computes a formula ψ satisfying *isFormula*(G, ψ) for a given egraph G. In addition to repr, to\_formula is parameterized by a set of nodes S ⊆ *N* to exclude². To produce the equalities corresponding to the classes, for each representative r and each n ∈ (*class*(r) \ {r}), the output formula has a literal ntt(r) ≈ ntt(n). For example, using repr1 for the egraph in Fig. 1, we obtain for *class*(N(8)), (x ≈ y); for *class*(N(3)), (z ≈ *read*(a, x) ∧ z ≈ *read*(a, x) ∧ z ≈ k + 1); and for *class*(N(0)), (⊤ ≈ 3 > z). The final result (slightly simplified) is: x ≈ y ∧ z ≈ *read*(a, x) ∧ z ≈ k + 1 ∧ 3 > z.
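The extraction loop can be sketched as follows, using an illustrative encoding (a `classes` map from representative to members, and string terms; names are ours):

```python
nodes = {
    0: ("x", ()), 1: ("y", ()),          # class {0, 1}, representative 0
    2: ("a", ()),                        # singleton class
    3: ("z", ()), 4: ("read", (2, 0)),   # class {3, 4}, representative 4
}
classes = {0: [0, 1], 2: [2], 4: [3, 4]}   # representative -> class members
repr_of = {0: 0, 1: 0, 2: 2, 3: 4, 4: 4}

def ntt(n):
    label, ch = nodes[n]
    return label if not ch else f"{label}({', '.join(ntt(repr_of[c]) for c in ch)})"

def to_formula(S=frozenset()):
    """One literal per non-representative class member not excluded by S."""
    lits = []
    for r, members in classes.items():
        for n in members:
            if n != r and n not in S:
                lits.append(f"{ntt(r)} ≈ {ntt(n)}")
    return lits

assert to_formula() == ["x ≈ y", "read(a, x) ≈ z"]
assert to_formula(S={3}) == ["x ≈ y"]   # excluding node 3 drops its literal
```

The exclusion set S plays no role yet, but it is exactly the hook that find\_core exploits in Sect. 4.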

Let G = *egraph*(ϕ) for some formula ϕ. Note that the ψ computed by to\_formula is not syntactically the same as ϕ; that is, to\_formula is not an inverse of *egraph*. Furthermore, since to\_formula commits to one representative per class, it is limited in what formulas it can generate. For example, since x ≈ y is in ϕ1, ϕ1 cannot be the result of to\_formula for any repr, because the output can contain only one of *read*(a, x) or *read*(a, y).

*Representative Functions.* The representative function is instrumental in determining the terms that appear in the extracted formula. To illustrate the importance of the representative choice, consider the formula ϕ4 of Fig. 4 and its egraph G4 = *egraph*(ϕ4). For now, ignore the blue dotted lines. For repr4*a*, to\_formula obtains ψ*a* ≜ (x ≈ g(6) ∧ f(x) ≈ 6 ∧ y ≈ 6). For repr4*b*, to\_formula produces ψ*b* ≜ (g(6) ≈ x ∧ f(g(6)) ≈ 6 ∧ y ≈ 6). In some applications (like qelim, considered in this paper), ψ*b* is preferable to ψ*a*: simply removing the literals g(6) ≈ x and y ≈ 6 from ψ*b* results in a formula equivalent to ∃x, y · ϕ4 that does not contain variables. Consider a third representative choice, repr4*c*, for which ntt does not terminate: to produce a term for N(1), a term for N(3), the representative of its child N(2), is required; similarly, to produce a term for N(3), a term for N(1), the representative of its child N(5), is necessary. Thus, no terms can be extracted with repr4*c*.

For extraction, representative functions repr are either provided explicitly or implicitly (as in [24]), the latter by associating a cost to nodes and/or terms and

² The set S affects the result, but for this section, we restrict attention to the case S = ∅.

Fig. 4. Egraphs of ϕ<sup>4</sup> with Grepr (Color figure online).

letting the representative be a node with minimal cost. However, observe that not every cost function guarantees that the chosen repr can be used (the computation of ntt may not terminate). For example, the ill-defined repr4*c* from above is a representative function consistent with the cost function that assigns function applications cost 0 and variables and constants cost 1. A commonly used cost function is term AST size, which is sufficient to ensure termination of ntt(n, repr).

We are thus interested in characterizing representative functions, motivated by two observations: not every cost function guarantees that ntt(n) terminates; and the kind of representative choices that are most suitable for qelim (e.g., repr4*b*) cannot be expressed over term AST size.

Definition 2. *Given an egraph* G = ⟨*N*, *E*, L, root⟩*, a representative function* repr : N → N *is* admissible for G *if the graph* Grepr *is acyclic, where* Grepr *has the nodes* N *and an edge from each node* n *to* repr(n[i]) *for every child* n[i] *of* n*.*

Dotted blue edges in the graphs of Fig. 4 show the corresponding Grepr. Intuitively, for each node n, the nodes reachable from n in Grepr are exactly those whose ntt terms are needed to produce ntt(n). Observe that Grepr4*c* has a cycle; thus, repr4*c* is not admissible.
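Since non-admissibility manifests as a cycle in Grepr, it can be checked with a standard depth-first search. A small sketch under our own encoding (the graph and repr below are hypothetical, chosen to mimic the cyclic choice above):

```python
def is_admissible(nodes, repr_of):
    """repr is admissible iff G_repr -- an edge from each node to the
    representative of each of its children -- is acyclic (DFS cycle check)."""
    state = {}  # missing = unvisited, "open" = on stack, "done" = finished
    def dfs(n):
        if state.get(n) == "done":
            return True
        if state.get(n) == "open":
            return False  # back edge: cycle in G_repr
        state[n] = "open"
        for c in nodes[n][1]:
            if not dfs(repr_of[c]):
                return False
        state[n] = "done"
        return True
    return all(dfs(n) for n in nodes)

# Two classes {a, g(b)} and {b, f(a)}, as in a hypothetical a ≈ g(b) ∧ b ≈ f(a)
nodes = {0: ("a", ()), 1: ("b", ()), 2: ("g", (1,)), 3: ("f", (0,))}
# Choosing the leaves as representatives is fine ...
assert is_admissible(nodes, {0: 0, 1: 1, 2: 0, 3: 1})
# ... but choosing g(b) and f(a) makes G_repr cyclic: g(b) -> f(a) -> g(b)
assert not is_admissible(nodes, {0: 2, 1: 3, 2: 2, 3: 3})
```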

Theorem 1. *Given an egraph* G *and a representative function* repr*, the function* G.to\_formula(repr, ∅) *terminates with a result* ψ *such that isFormula*(G, ψ) *iff* repr *is admissible for* G*.*

To the best of our knowledge, Theorem 1 is the first complete characterization of all terms of a node that can be obtained by extraction based on class representatives (by describing all admissible repr; note that there are finitely many). This result contradicts [24], where it is claimed that a term of a node can be extracted for any cost function; repr4*c* is a counterexample. Importantly, this characterization allows us to explore representative functions beyond those in the existing literature, which, as we show in the next section, is key for qelim.

Input: A formula <sup>ϕ</sup> with free variables *v*. Output: A quantifier reduction of ϕ.

```
QEL(ϕ, v)
1: G := egraph(ϕ)
2: repr := G.find_defs(v)
3: repr := G.refine_defs(repr, v)
4: core := G.find_core(repr)
5: ret G.to_formula(repr, G.Nodes() \ core)
```

Algorithm 1: QEL – Quantifier reduction using egraphs.

### 4 Quantifier Reduction

Quantifier reduction is a relaxation of quantifier elimination: given two formulas ϕ and ψ with free variables *v* and *u*, respectively, ψ is a *quantifier reduction* of ϕ if *u* ⊆ *v* and ϕ∃ ≡ ψ∃. If *u* is empty, then ψ is a quantifier elimination of ϕ∃. Note that quantifier reduction is possible even when quantifier elimination is not (e.g., for EUF). We are interested in an efficient quantifier reduction algorithm (that can be used as pre-processing for qelim), even when a complete qelim is possible (e.g., for LIA). In this section, we present such an algorithm, called QEL.

Intuitively, QEL is based on the well-known substitution rule: (∃x · x ≈ t ∧ ϕ) ≡ ϕ[x ↦ t]. A naive implementation of this rule, called QeLite in Z3, looks for syntactic definitions of the form x ≈ t, where x is a variable and t is an x-free term, and substitutes x with t. While efficient, QeLite is limited by: (a) dependence on syntactic equalities in the formula (specifically, it misses implicit equalities due to transitivity and congruence); (b) sensitivity to the order in which variables are eliminated (eliminating one variable may affect the syntactic equalities available for another); and (c) difficulty in dealing with circular equalities such as x ≈ f(x).
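A toy QeLite-style pass makes limitation (b) and (c) concrete. Terms are strings (variables/constants) or tuples `("f", arg, ...)`; literals are (lhs, rhs) pairs; everything is purely syntactic, and the encoding is ours:

```python
def occurs(v, t):
    return t == v if isinstance(t, str) else any(occurs(v, a) for a in t[1:])

def subst(t, v, d):
    if isinstance(t, str):
        return d if t == v else t
    return (t[0],) + tuple(subst(a, v, d) for a in t[1:])

def eliminate(lits, v):
    """Drop one literal v ≈ t with t free of v, substituting t for v elsewhere."""
    for k, (a, b) in enumerate(lits):
        for lhs, rhs in ((a, b), (b, a)):
            if lhs == v and not occurs(v, rhs):
                rest = lits[:k] + lits[k + 1:]
                return [(subst(p, v, rhs), subst(q, v, rhs)) for p, q in rest]
    return lits  # no syntactic definition: v survives

# A ϕ4-like formula: x ≈ g(y) ∧ y ≈ f(x) ∧ f(x) ≈ 6
phi = [("x", ("g", "y")), ("y", ("f", "x")), (("f", "x"), "6")]
step1 = eliminate(phi, "y")            # y := f(x)
assert step1 == [("x", ("g", ("f", "x"))), (("f", "x"), "6")]
assert eliminate(step1, "x") == step1  # x is now circular: the pass is stuck
```

Eliminating y via the syntactic definition y ≈ f(x) leaves x with only a circular equality, exactly the failure mode the egraph-based approach avoids.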

For example, consider the formula ϕ4(x, y) in Fig. 4. Assume that y is eliminated first using y ≈ f(x), resulting in x ≈ g(f(x)) ∧ f(x) ≈ 6. Now x cannot be eliminated, since the only equality for x is circular. Alternatively, assume that QeLite somehow noticed that, by transitivity, ϕ4 implies y ≈ 6, and obtains (∃y · ϕ4) ≡ x ≈ g(6) ∧ f(x) ≈ 6. This time, x ≈ g(6) can be used to obtain f(g(6)) ≈ 6, which is a qelim of ϕ4∃. Thus, both the elimination order and implicit equalities are crucial.

In QEL, we address the above issues by using an egraph data structure to concisely capture all implicit equalities and terms. Furthermore, egraphs allow eliminating multiple variables together, ensuring that a variable is eliminated if it is equivalent (explicitly or implicitly) to a ground term in the egraph.

Pseudocode for QEL is shown in Algorithm 1. Given an input formula ϕ, QEL first builds its egraph G (line 1). Then, it finds a representative function repr that maps variables to equivalent ground terms, as much as possible (line 2). Next, it further reduces the remaining free variables by refining repr to map each variable x to an equivalent x-free (but not variable-free) term (line 3). At this point, QEL is committed to the variables to eliminate. To produce the output, find\_core identifies the subset of the nodes of G, which we call *core*,

Fig. 5. Egraphs including Grepr (Color figure online) of ϕ5.

that must be considered in the output (line 4). Finally, to\_formula converts the core of G to the resulting formula (line 5). We show that the combination of these steps is even stronger than variable substitution.

To illustrate QEL, we apply it to ϕ1 and its egraph G from Fig. 1. The function find\_defs returns repr = {N(6), N(8)}³. Node N(6) is the only node with a ground term in the equivalence class *class*(N(3)). This corresponds to the definition z ≈ k + 1. Node N(8) is chosen arbitrarily, since *class*(N(8)) has no ground terms. No refinement is possible, so refine\_defs returns repr unchanged. The core is N \ {N(3), N(5), N(9)}. Nodes N(3) and N(9) are omitted because they correspond to variables with definitions (under repr), and N(5) is omitted because it is congruent to N(4), so only one of them is needed. Finally, to\_formula produces k + 1 ≈ *read*(a, x) ∧ 3 > k + 1. Variables z and y are eliminated.

In the rest of this section, we present QEL in detail along with its key properties.

*Finding Ground Definitions.* Ground variable definitions are found by selecting a representative function repr that ensures that the maximum number of terms in the formula are rewritten into ground equivalent ones, which, in turn, means finding a ground definition for all variables that have one.

Computing a representative function repr that is admissible and ensures finding ground definitions when they exist is not trivial. Naive approaches for identifying ground terms, such as iterating arbitrarily over the classes and selecting a representative based on *term*(n) are not enough – *term*(n) may not be in the output formula. It is also not possible to make a choice based on ntt(n), since, in general, it cannot be yet computed (repr is not known yet).

Admissibility raises an additional challenge, since choosing a node that appears to be a definition (e.g., not a leaf) may cause cycles in Grepr. For example, consider ϕ5 of Fig. 5. Assume that N(1) and N(4) are chosen as representatives of their equivalence classes. At this point, Grepr has two edges: ⟨N(5), N(4)⟩ and ⟨N(2), N(1)⟩, shown by blue dotted lines in Fig. 5a. Next, if either N(2) or N(5) is chosen as representative (the only choices in their classes), then Grepr

<sup>3</sup> Recall that we only show representatives of non-singleton classes.

```
egraph :: find_defs(v)
 1: for n ∈ N do repr(n) := ⊥
 2: todo := {leaf(n) | n ∈ N ∧ ground(n)}
 3: repr := process(repr, todo)
 4: todo := {leaf(n) | n ∈ N}
 5: repr := process(repr, todo)
 6: ret repr

egraph :: process(repr, todo)
 7: while todo ≠ ∅ do
 8:   n := todo.pop()
 9:   if repr(n) ≠ ⊥ then continue
10:   for n′ ∈ class(n) do repr(n′) := n
11:   for n′ ∈ class(n) do
12:     for p ∈ parents(n′) do
13:       if ∀c ∈ children(p) · repr(c) ≠ ⊥ then
14:         todo.push(p)
15: ret repr
```
Algorithm 2: Find definitions maximizing groundness.

becomes cyclic (shown in blue in Fig. 5a). Furthermore, backtracking on representative choices needs to be avoided if we are to find a representative function efficiently.

Algorithm 2 finds a representative function repr while overcoming these challenges. To ensure that the computed representative function is admissible (without backtracking), Algorithm 2 selects representatives for each class using a "bottom up" approach. Namely, leaves cannot be part of cycles in <sup>G</sup>repr because they have no outgoing edges. Thus, they can always be safely chosen as representatives. Similarly, a node whose children have already been assigned representatives in this way (leaves initially), will also never be part of a cycle in Grepr. Therefore, these nodes are also safe to be chosen as representatives.

This intuition is implemented in find\_defs by initializing repr to be undefined (⊥) for all nodes, and maintaining a workset, *todo*, containing nodes that, if chosen as representatives for the remaining classes (under the current selection), maintain acyclicity of Grepr. The initialization of *todo* includes leaves only. The specific choice of leaves ensures that ground definitions are preferred; we return to it later. After initialization, the function process extracts an element from *todo* and sets it as the representative of its class if the class has not been assigned one yet (lines 9 and 10). Once a class representative has been chosen, on lines 11 to 14, the parents of all nodes in the class whose children have all been assigned representatives (the condition on line 13) are added to *todo*.
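A runnable sketch of this worklist, under an illustrative encoding (`nodes` maps ids to (label, children), `cls_of` maps each node to its class's member set, `is_var` marks variable labels; all names are ours):

```python
def find_defs(nodes, cls_of, is_var):
    """Bottom-up representative selection: ground leaves first, then all
    leaves; a parent is promoted once all its children's classes are assigned."""
    parents = {n: set() for n in nodes}
    for n, (_, ch) in nodes.items():
        for c in ch:
            parents[c].add(n)
    repr_of = {n: None for n in nodes}

    def process(todo):
        todo = list(todo)
        while todo:
            n = todo.pop()
            if repr_of[n] is not None:
                continue                      # class already has a representative
            for m in cls_of[n]:
                repr_of[m] = n                # n represents its whole class
            for m in cls_of[n]:
                for p in parents[m]:
                    if all(repr_of[c] is not None for c in nodes[p][1]):
                        todo.append(p)        # p is now safe to choose

    leaves = [n for n, (_, ch) in nodes.items() if not ch]
    process(n for n in leaves if not is_var(nodes[n][0]))  # ground leaves first
    process(leaves)                                        # then everything else
    return repr_of

# A ϕ4-like egraph: y ≈ 6 ∧ x ≈ g(y) ∧ f(x) ≈ 6
nodes = {0: ("6", ()), 1: ("y", ()), 2: ("x", ()),
         3: ("g", (1,)), 4: ("f", (2,))}
cls = {0: {0, 1, 4}, 1: {0, 1, 4}, 4: {0, 1, 4}, 2: {2, 3}, 3: {2, 3}}
r = find_defs(nodes, cls, is_var=lambda s: s in {"x", "y"})
assert r[1] == 0   # y's class is represented by the ground leaf 6
assert r[2] == 3   # x's class is represented by g(y), so ntt yields g(6)
```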

So far, we discussed how admissibility of repr is guaranteed. To also ensure that ground definitions are found whenever possible, we observe that a similar bottom up approach identifies terms that can be rewritten into ground ones. This builds on the notion of constructively ground nodes, defined next.

A class c is *ground* if c contains a *constructively ground*, or *c-ground* for short, node n, where a node n is c-ground if either (a) *term*(n) is ground, or (b) n is not a leaf and the class *class*(n[i]) of every child n[i] is ground. Note that nodes labeled by variables are never c-ground.

In the example in Fig. 1, *class*(N(7)) and *class*(N(8)) are not ground, because all their nodes represent variables; *class*(N(6)) is ground because N(6) is c-ground. Nodes N(4) and N(5) are not c-ground because the class of N(8) (a child of both nodes) is not ground. Interestingly, N(1) is c-ground because *class*(N(3)) = *class*(N(6)) is ground, even though its term 3 > z is not ground.

Ground classes and c-ground nodes are of interest because whenever ϕ |= *term*(n) ≈ t for some node n and ground term t, the class *class*(n) is ground, i.e., it contains a c-ground node, where c-ground nodes can be found recursively starting from ground leaves. Furthermore, the recursive definition ensures that when such c-ground nodes are selected as representatives, the corresponding terms w.r.t. repr are ground.
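The recursive definition suggests a simple fixpoint. A sketch under our own encoding (classes as frozensets; the example treats k as an uneliminated constant, loosely after Fig. 1):

```python
def ground_classes(nodes, cls_of, is_var):
    """Fixpoint: a class becomes ground once it contains a c-ground node
    (a ground leaf, or an inner node all of whose child classes are ground)."""
    ground = set()  # classes (frozensets) known to be ground
    changed = True
    while changed:
        changed = False
        for n, (label, ch) in nodes.items():
            if cls_of[n] in ground:
                continue
            cground = (not ch and not is_var(label)) or \
                      (ch and all(cls_of[c] in ground for c in ch))
            if cground:
                ground.add(cls_of[n])
                changed = True
    return ground

# z ≈ k + 1 alongside an unconstrained variable x
nodes = {0: ("k", ()), 1: ("1", ()), 2: ("+", (0, 1)),
         3: ("z", ()), 4: ("x", ())}
c_k, c_1 = frozenset({0}), frozenset({1})
c_z, c_x = frozenset({2, 3}), frozenset({4})
cls = {0: c_k, 1: c_1, 2: c_z, 3: c_z, 4: c_x}
g = ground_classes(nodes, cls, is_var=lambda s: s in {"x", "z"})
assert c_z in g      # z's class is ground: k + 1 is c-ground
assert c_x not in g  # x has no ground definition
```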

As a result, to maximize the ground definitions found, we are interested in finding an admissible representative function repr that is *maximally ground*, which means that for every node n <sup>∈</sup> N, if *class*(n) is ground, then repr(n) is c-ground. That means that c-ground nodes are always chosen if they exist.

Theorem 2. *Let* G = *egraph*(ϕ) *be an egraph and* repr *an admissible representative function that is maximally ground. For all* n ∈ N*, if* ϕ |= *term*(n) ≈ t *for some ground term* t*, then* repr(n) *is c-ground and* ntt(repr(n)) *is ground.*

We note that not every choice of c-ground nodes as representatives results in an admissible representative function. For example, consider the formula <sup>ϕ</sup><sup>4</sup> of Fig. 4 and its egraph. All nodes except for N(5) and N(2) are c-ground. However, a repr with N(3) and N(1) as representatives is not admissible. Intuitively, this is because the "witness" for c-groundness of N(1) in *class*(N(2)) is N(4) and not N(3). Therefore, it is important to incorporate the selection of c-ground representatives into the bottom up procedure that ensures admissibility of repr.

To promote c-ground nodes over non c-ground in the construction of an admissible representative function, find\_defs chooses representatives in two steps. First, only the ground leaves are processed (line 2). This ensures that c-ground representatives are chosen while guaranteeing the absence of cycles. Then, the remaining leaves are added to *todo* (line 4). This triggers representative selection of the remaining classes (those that are not ground).

We illustrate find\_defs with two examples. For ϕ4 of Fig. 4, there is only one ground leaf, N(4), which is added to *todo* on line 2, and *todo* is processed. N(4) is chosen as representative and, as a consequence, its parent N(1) is added to *todo*. N(1) is chosen as representative, so N(3), even though added to the queue later, is not chosen as representative, yielding repr4*b* = {N(4), N(1)}. For ϕ5 of Fig. 5, no nodes are added to *todo* on line 2. N(3) and N(6) are added on line 4. In process, both are chosen as representatives, yielding repr5*b*.

Algorithm 2 guarantees that repr is maximally ground. Together with Theorem 2, this implies that all terms that can be rewritten into ground equivalent ones will be rewritten, which, in turn, means that for each variable that has a ground definition, its representative is one such definition.

*Finding Additional (Non-ground) Definitions.* At this point, QEL found ground definitions while avoiding cycles in Grepr. However, this does not mean that as many variables as possible are eliminated. A variable can also be eliminated if it can be expressed as a function of other variables. This is not achieved by

```
egraph :: refine_defs(repr, v)
 1: for n ∈ N do
 2:   if n = repr(n) and L(n) ∈ v then
 3:     r := n
 4:     for n′ ∈ class(n) \ {n} do
 5:       if L(n′) ∉ v then
 6:         if not cycle(n′, repr) then
 7:           r := n′
 8:           break
 9:     for n′ ∈ class(n) do
10:       repr[n′] := r
11: ret repr

egraph :: find_core(repr, v)
 1: core := ∅
 2: for n ∈ N s.t. n = repr(n) do
 3:   core := core ∪ {n}
 4:   for n′ ∈ (class(n) \ {n}) do
 5:     if L(n′) ∈ v then continue
 6:     else if ∃m ∈ core · m congruent with n′ then
 7:       continue
 8:     core := core ∪ {n′}
 9: ret core
```
Algorithm 3: Refining repr and building core.

find\_defs. For example, in repr5*b* both variables are representatives, hence neither is eliminated, even though, since x ≈ g(f(y)), x could be eliminated in ϕ5 by rewriting it as a function of y, namely g(f(y)). Algorithm 3 shows the function refine\_defs, which refines a maximally ground repr to find such definitions while preserving admissibility and ground maximality. This is done by greedily attempting to change class representatives that are labeled with a variable. refine\_defs iterates over the nodes in the class, checking whether there is a different node that is not a variable and does not create a cycle in Grepr (line 6). The resulting repr remains maximally ground because representatives of ground classes are not changed.

For example, let us refine repr5*b* = {N(3), N(6), N(5)} obtained for ϕ5. Assume that x is processed first. For *class*(N(x)), changing the representative to N(1) does not introduce a cycle (see Fig. 5c), so N(1) is selected. Next, for *class*(N(y)), choosing N(4) would cause Grepr to be cyclic, since N(1) was already chosen (Fig. 5a), so the representative of *class*(N(y)) is not changed. The final refinement is repr5*c* = {N(1), N(6), N(5)}.
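A greedy sketch of this refinement step, reusing the DFS acyclicity check as the `cycle` test of line 6 (encodings and names are ours; the example mimics the ϕ5 pattern with x ≈ g(y) ∧ y ≈ f(x)):

```python
def acyclic(nodes, repr_of):
    state = {}
    def dfs(n):
        if state.get(n) == "done":
            return True
        if state.get(n) == "open":
            return False
        state[n] = "open"
        for c in nodes[n][1]:
            if not dfs(repr_of[c]):
                return False
        state[n] = "done"
        return True
    return all(dfs(n) for n in nodes)

def refine_defs(nodes, cls_of, repr_of, is_var):
    """Greedily swap a variable representative for a non-variable class
    member whenever G_repr stays acyclic."""
    for n in nodes:
        if repr_of[n] == n and is_var(nodes[n][0]):
            for m in cls_of[n]:
                if m != n and not is_var(nodes[m][0]):
                    trial = {k: (m if v == n else v) for k, v in repr_of.items()}
                    if acyclic(nodes, trial):
                        repr_of.update(trial)  # commit the refinement
                        break
    return repr_of

# x ≈ g(y) ∧ y ≈ f(x): both variables start out as representatives
nodes = {0: ("x", ()), 1: ("y", ()), 2: ("g", (1,)), 3: ("f", (0,))}
cls = {0: {0, 2}, 2: {0, 2}, 1: {1, 3}, 3: {1, 3}}
r = refine_defs(nodes, cls, {0: 0, 1: 1, 2: 0, 3: 1},
                is_var=lambda s: s in {"x", "y"})
assert r[0] == 2  # x's class is refined to g(y): x gets a definition
assert r[1] == 1  # refining y to f(x) would close a cycle, so y stays
```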

At this point, QEL found a representative function repr with as many ground definitions as possible and attempted to refine repr to have fewer variables as representatives. Next, QEL finds a core of the nodes of the egraph, based on repr, that will govern the translation of the egraph to a formula. While repr determines the semantic rewrites of terms that enable variable elimination, it is the use of the core in the translation that actually eliminates them.

*Variable Elimination Based on a Core.* A *core* of an egraph G = ⟨*N*, *E*, L, root⟩ and a representative function repr is a subset of the nodes N*c* ⊆ N such that ψ*c* = G.to\_formula(repr, N \ N*c*) satisfies *isFormula*(G, ψ*c*).

Algorithm 3 shows pseudocode for find\_core, which computes a core of an egraph for a given representative function. The idea is that non-representative nodes that are labeled by variables, as well as nodes congruent to nodes already in the core, need not be included in the core. The former are not needed because we are only interested in preserving the existential closure of the output; the latter are not needed because congruent nodes introduce the same syntactic terms in the output. For example, for ϕ1 and repr1, find\_core returns core1 = N1 \ {N(3), N(5), N(9)}. Nodes N(3) and N(9) are excluded because they are labeled with variables, and node N(5) because it is congruent with N(4).
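A sketch of find\_core under our illustrative encoding, where two nodes are congruent when they have the same label and children in the same classes (captured by a signature over child representatives):

```python
def find_core(nodes, cls_of, repr_of, is_var):
    """Keep representatives; drop variable members and members congruent
    to something already kept."""
    def sig(n):
        label, ch = nodes[n]
        return (label, tuple(repr_of[c] for c in ch))
    core, seen = set(), set()
    for n in nodes:
        if repr_of[n] != n:
            continue
        core.add(n)
        seen.add(sig(n))
        for m in cls_of[n]:
            if m == n or is_var(nodes[m][0]) or sig(m) in seen:
                continue
            core.add(m)
            seen.add(sig(m))
    return core

# x ≈ y ∧ z ≈ read(a, x) ∧ z ≈ read(a, y): the two reads are congruent
nodes = {0: ("a", ()), 1: ("x", ()), 2: ("y", ()),
         3: ("read", (0, 1)), 4: ("read", (0, 2)), 5: ("z", ())}
cls = {0: {0}, 1: {1, 2}, 2: {1, 2}, 3: {3, 4, 5}, 4: {3, 4, 5}, 5: {3, 4, 5}}
repr_of = {0: 0, 1: 1, 2: 1, 3: 3, 4: 3, 5: 3}
core = find_core(nodes, cls, repr_of, is_var=lambda s: s in {"x", "y", "z"})
assert 4 not in core  # read(a, y) is congruent to read(a, x)
assert 5 not in core  # z is a non-representative variable
assert core == {0, 1, 3}
```

Note that x (a variable) stays in the core because it is a representative, matching the behavior described above.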

Finally, QEL produces a quantifier reduction by applying to\_formula with the computed repr and core. Variables that are not in the core (they are not representatives) are eliminated – this includes variables that have a ground definition. However, QEL may eliminate a variable even if it is a representative (and thus in the core). As an example, consider ψ(x, y) ≜ f(x) ≈ f(y) ∧ x ≈ y, whose egraph G contains two classes with two nodes each. The core N*c* relative to any admissible repr contains only one representative per class: in *class*(N(x)) because both nodes are labeled with variables, and in *class*(N(f(x))) because the nodes are congruent. In this case, to\_formula(repr, N \ N*c*) results in ⊤ (since singleton classes in the core produce no literals in the output formula), a quantifier elimination of ψ. More generally, the variables are eliminated because none of them is reachable in Grepr from a non-singleton class in the core (only such classes contribute literals to the output).

We conclude the presentation of QEL by showing its output for our examples. For ϕ1, QEL obtains (k + 1 ≈ *read*(a, x) ∧ 3 > k + 1), a quantifier reduction, using repr = {N(6), N(8)} (as computed by find\_defs) and core1 = N1 \ {N(3), N(5), N(9)}. For ϕ4, QEL obtains (6 ≈ f(g(6))), a quantifier elimination, using repr4*b* = {N(4), N(1)} and core4*b* = N4 \ {N(3), N(2)}. Finally, for ϕ5, QEL obtains (y ≈ h(f(y)) ∧ f(g(f(y))) ≈ f(y)), a quantifier reduction, using repr5*c* = {N(1), N(6), N(5)} and core5*c* = N5 \ {N(3)}.

*Guarantees of QEL.* Correctness of QEL is straightforward. We conclude this section by providing two conditions that ensure that a variable is eliminated by QEL. The first condition guarantees that a variable is eliminated whenever a ground definition for it exists (regardless of the specific representative function and core computed by QEL). This makes QEL *complete relative to quantifier elimination based on ground definitions*. Relative completeness is an important property since it means that QEL is unaffected by variable orderings and syntactic rewrites, unlike QeLite. The second condition, illustrated by ψ above, depends on the specific representative function and core computed by QEL.

Theorem 3. *Let* ϕ *be a QF conjunction of literals with free variables* *v*, *and let* v ∈ *v*. *Let* G = *egraph*(ϕ)*,* n*v* *the node in* G *such that* L(n*v*) = v*, and* repr *and* core *computed by QEL. Let NS* = {n ∈ core | (*class*(n) ∩ core) ≠ {n}} *be the set of nodes from classes with two or more nodes in* core*. If one of the following conditions holds, then* v *does not appear in QEL*(ϕ, *v*)*:*

*(1) there exists a ground term* t *s.t.* ϕ |= v ≈ t*, or*

*(2)* n*v* *is not reachable from any node in NS in* Grepr*.*

As a corollary, if every variable meets one of the two conditions, then QEL finds a quantifier elimination.

This concludes the presentation of our quantifier reduction algorithm. Next, we show how QEL can be used to under-approximate quantifier elimination, which allows working with formulas for which QEL does not result in a qelim.

ElimWrRd1: if M |= i ≈ j, rewrite ϕ[*read*(*write*(t, i, v), j)] to ϕ[v] ∧ i ≈ j.

ElimWrRd2: if M |= i ≉ j, rewrite ϕ[*read*(*write*(t, i, v), j)] to ϕ[*read*(t, j)] ∧ i ≉ j.

Fig. 6. Two MBP rules from [16], written as guarded rewrites. The notation ϕ[t] means that ϕ contains the term t. The rules rewrite all occurrences of *read*(*write*(t, i, v), j) with v and *read*(t, j), respectively.

```
ElimWrRd
1: function match(t)
2:   ret t = read(write(s, i, v), j)
3: function apply(t, M, G)
4:   if M |= i ≈ j then
5:     G.assert(i ≈ j)
6:     G.assert(t ≈ v)
7:   else
8:     G.assert(i ≉ j)
9:     G.assert(t ≈ read(s, j))
```

Fig. 7. Adaptation of rules in Fig. 6 using QEL API.

### 5 Model Based Projection Using QEL

Applications like model checking and quantified satisfiability require efficient computation of under-approximations of quantifier elimination. They use model-based projection (MBP) algorithms to project variables that cannot be eliminated cheaply. Our QEL algorithm is efficient and relatively complete, but it is not guaranteed to eliminate all variables. In this section, we use a model and theory-specific projection rules to implement an MBP algorithm on top of QEL.

We focus on two important theories: Arrays and Algebraic DataTypes (ADT). They are widely used to encode program verification tasks. Prior works separately develop MBP algorithms for Arrays [16] and ADTs [5]. Both MBPs were presented as a set of syntactic rewrite rules applied until fixed point.

Combining the MBP algorithms for Arrays and ADTs is non-trivial because applying projection rules for one theory may produce terms of the other theory. Therefore, separately achieving saturation in either theory is not sufficient to reach saturation in the combined setting. The MBP for the combined setting has to call both MBPs, check whether either one of them produced terms that can be processed by the other, and, if so, call the other algorithm. This is similar to theory combination in SMT solving where the core SMT solver has to keep track of different theory solvers and exchange terms between them.

Our main insight is that egraphs can be used as a glue to combine MBP algorithms for different theories, just like egraphs are used in SMT solvers to combine satisfiability checking for different theories. Implementing MBP using egraphs allows us to use the insights from QEL to combine MBP with on-the-fly quantifier reduction to produce less under-approximate formulas than what we get by syntactic application of MBP rules.

To implement MBP using egraphs, we implement all rewrite rules for MBP in Arrays [16] and ADTs [5] on top of egraphs. In the interest of space, we explain the implementation of just a couple of the MBP rules for Arrays<sup>4</sup>.

Figure 6 shows two Array MBP rules from [16]: ElimWrRd1 and ElimWrRd2. Here, ϕ is a formula with arrays and M is a model for ϕ. Both rules rewrite terms that match the pattern *read*(*write*(t, i, v), j), where t, i, j, v are all terms and t contains a variable to be projected. ElimWrRd1 is applicable when M |= i ≈ j. It rewrites the term *read*(*write*(t, i, v), j) to v. ElimWrRd2 is applicable when M |= i ≉ j and rewrites *read*(*write*(t, i, v), j) to *read*(t, j).

Figure 7 shows the egraph implementation of ElimWrRd1 and ElimWrRd2. The match(t) method checks whether t syntactically matches *read*(*write*(s, i, v), j), where s contains a variable to be projected. The apply(t) method assumes that t is *read*(*write*(s, i, v), j). It first checks whether M |= i ≈ j, and, if so, it adds i ≈ j and t ≈ v to the egraph G. Otherwise, if M |= i ≉ j, apply(t) adds a disequality i ≉ j and an equality t ≈ *read*(s, j) to G. That is, the egraph implementation of the rules only adds (and does not remove) literals that capture the side condition and the conclusion of the rule.
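As an illustration, the match/apply structure of Fig. 7 can be sketched in Python. The egraph is simplified here to a plain list of asserted literals (a real egraph would also merge equivalence classes), terms are nested tuples, and all names are illustrative rather than Z3's actual API:

```python
# Sketch of ElimWrRd in the match/apply style of Fig. 7.
# A term is a nested tuple; the "egraph" is just a list of asserted literals.

def match(t):
    # Does t have the shape read(write(s, i, v), j)?
    return (isinstance(t, tuple) and t[0] == "read"
            and isinstance(t[1], tuple) and t[1][0] == "write")

def apply_rule(t, model, asserted):
    _, wr, j = t
    _, s, i, v = wr
    if model[i] == model[j]:                  # M |= i ≈ j
        asserted.append(("eq", i, j))
        asserted.append(("eq", t, v))         # read(write(s,i,v),j) ≈ v
    else:                                     # M |= i ≉ j
        asserted.append(("neq", i, j))
        asserted.append(("eq", t, ("read", s, j)))

# read(write(a, i, x), j) under a model where i = j
t = ("read", ("write", "a", "i", "x"), "j")
lits = []
assert match(t)
apply_rule(t, {"i": 0, "j": 0}, lits)
assert ("eq", t, "x") in lits
```

As in the figure, apply only asserts new literals capturing the side condition and the conclusion; it never removes the matched term.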

Our algorithm for MBP based on egraphs, MBP-QEL, is shown in Alg. 4. It initializes an egraph with the input formula (line 1), applies MBP rules until saturation (line 4), and then uses the steps of QEL (lines 7–12) to generate the projected formula.

Applying rules is as straightforward as iterating over all terms t in the egraph and, for each rule r such that r.match(t) is true, calling r.apply(t, M, G) (lines 14–22). As opposed to the standard approach based on formula rewriting, here the terms are *not* rewritten – both the matched term and the result remain. Therefore, it is possible to get into an infinite loop by re-applying the same rules on the same terms over and over again. To avoid this, MBP-QEL marks terms as *seen* (line 23) and avoids them in the next iteration (line 15). Some rules in MBP are applied to pairs of terms. For example, Ackermann rewrites pairs of *read* terms over the same variable. This is different from usual applications where rewrite rules are applied to individual expressions. Yet, it is easy to adapt such pairwise rewrite rules to egraphs by iterating over pairs of terms (lines 25–30).
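The marking discipline described above can be sketched as follows, under the simplifying assumption that the egraph is a growing set of terms and each rule is a (match, apply) pair; all names here are illustrative:

```python
# Sketch of the ApplyRules saturation loop: terms are never rewritten away,
# so a `seen` set stops rules from firing on the same terms forever.

def apply_rules(nodes, rules, model, graph, seen):
    progress = False
    current = set(nodes())
    for t in current - seen:            # skip already-seen terms (cf. line 15)
        for match, apply in rules:
            if match(t):
                apply(t, model, graph)  # only adds literals/terms to the graph
                progress = True
    seen |= current                     # mark this round as seen (cf. line 23)
    return progress

def saturate(nodes, rule_sets, model, graph):
    # Iterate the rule sets (e.g., Array rules, then ADT rules) until no rule
    # makes progress; `seen` only grows, so this terminates whenever the
    # underlying rule systems are terminating.
    seen = set()
    progress = True
    while progress:
        progress = False
        for rules in rule_sets:
            progress |= apply_rules(nodes, rules, model, graph, seen)

# Toy run: a single rule that, on seeing term "a", asserts a new term "b".
graph = {"a"}
rule = (lambda t: t == "a", lambda t, m, g: g.add("b"))
saturate(lambda: graph, [[rule]], None, graph)
assert graph == {"a", "b"}
```

The toy rule fires once on "a", the new term "b" matches nothing, and the loop stops – mirroring how marking prevents re-application without ever deleting terms.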

MBP-QEL does not apply MBP rules to terms that contain variables but are already c-ground (line 16), which is sound because such terms are replaced by ground terms in the output (Theorem 3). This prevents unnecessary application of MBP rules thus allowing MBP-QEL to compute MBPs that are closer to a quantifier elimination (less model-specific).

Just like each application of a rewrite rule introduces a new term to a formula, each call to the apply method of a rule adds new terms to the egraph. Therefore, each call to *ApplyRules* (line 4) makes the egraph bigger. However, provided that the original MBP combination is terminating, the iterative application of *ApplyRules* terminates as well (due to marking).

Some MBP rules introduce new variables to the formula. MBP-QEL computes repr based on both original and newly introduced variables (line 7). This

<sup>4</sup> Implementation of all other rules is similar.

Input: A QF formula ϕ with free variables v̄, all of sort *Array*(I, V) or *ADT*, a model M |= ϕ∃, and sets of rules *ArrayRules* and *ADTRules*
Output: A cube ψ s.t. ψ∃ ⇒ ϕ∃, M |= ψ∃, and *vars*(ψ) are not Arrays or ADTs

```
MBP-QEL(ϕ, v̄, M)
 1: G := egraph(ϕ)
 2: p1, p2 := ⊤, ⊤; S, Sp := ∅, ∅
 3: while p1 ∨ p2 do
 4:     p1 := ApplyRules(G, M, ArrayRules, S, Sp)
 5:     p2 := ApplyRules(G, M, ADTRules, S, Sp)
 6: v̄′ := G.Vars()
 7: repr := G.find_defs(v̄′)
 8: repr := G.refine_defs(repr, v̄′)
 9: core := G.find_core(repr, v̄′)
10: v̄e := {v ∈ v̄′ | is_arr(v) ∨ is_adt(v)}
11: coree := {n ∈ core | gr(term(n), v̄e)}
12: ret G.to_formula(repr, G.Nodes() \ coree)
```

```
ApplyRules(G, M, R, S, Sp)
13: progress := ⊥
14: N := G.Nodes()
15: U := {n | n ∈ N \ S}
16: T := {term(n) | n ∈ U ∧ (is_eq(term(n)) ∨ ¬c-ground(n))}
17: Rp := {r ∈ R | r.is_for_pairs()}
18: Ru := R \ Rp
19: for each t ∈ T, r ∈ Ru do
20:     if r.match(t) then
21:         r.apply(t, M, G)
22:         progress := ⊤
23: S := S ∪ N
24: Np := {⟨n1, n2⟩ | n1, n2 ∈ N}
25: Tp := {term(np) | np ∈ Np \ Sp}
26: for each tp ∈ Tp, r ∈ Rp do
27:     if r.match(tp) then
28:         r.apply(tp, M, G)
29:         progress := ⊤
30: Sp := Sp ∪ Np
31: ret progress
```

Algorithm 4: MBP-QEL: an MBP using QEL. Here *gr*(t, *v*) checks whether term t contains any variables in *v* and *is*\_*eq*(t) checks if t is an equality literal.

allows MBP-QEL to eliminate all variables, including non-Array, non-ADT variables, that are equivalent to ground terms (Theorem 3).

As mentioned earlier, MBP-QEL never removes terms while rewrite rules are saturating. Therefore, after saturation, the egraph still contains all original terms and variables. From soundness of the MBP rules, it follows that after each invocation of apply, MBP-QEL creates an under-approximation of ϕ<sup>∃</sup> based on the model M. From completeness of MBP rules, it follows that, after saturation, all terms containing Array or ADT variables can be removed from the egraph without affecting equivalence of the saturated egraph. Hence, when calling to\_formula, MBP-QEL removes all terms containing Array or ADT variables (line 12). This includes, in particular, all the terms on which rewrite rules were applied, but potentially more.

We demonstrate our MBP algorithm on an example with nested ADTs and Arrays. Let P be the datatype of pairs of an integer array and an integer, with sole constructor *pair* : *Array*(I, I) × I → P and destructors *fst* : P → *Array*(I, I) and *snd* : P → I. In the following, let i, l, j be integers, a an integer array, p, p′ pairs, and p₁, p₂ arrays of pairs (of sort *Array*(I, P)). Consider the formula:

$$\varphi_{mbp}(p,a) \triangleq \mathit{read}(a,i) \approx i \land p \approx \mathit{pair}(a,l) \land p_2 \approx \mathit{write}(p_1,j,p) \land p \not\approx p'$$

where p and a are free variables that we want to project, and all of i, j, l, p₁, p₂, p′ are constants that we want to keep. MBP is guided by a model M_mbp |= ϕ_mbp. To eliminate p and a, MBP-QEL constructs the egraph of ϕ_mbp and applies the MBP rules. In particular, it uses Array MBP rules to rewrite the *write*(p₁, j, p) term by adding the equality *read*(p₂, j) ≈ p, which merges *read*(p₂, j) into the equivalence class of p (justified by p₂ ≈ *write*(p₁, j, p)). It then applies ADT MBP rules to deconstruct the equality p ≈ *pair*(a, l) by creating two equalities *fst*(p) ≈ a and *snd*(p) ≈ l. Finally, the call to to_formula produces

$$\begin{aligned} \mathit{read}(\mathit{fst}(\mathit{read}(p_1, j)), i) &\approx i \land \mathit{snd}(\mathit{read}(p_1, j)) \approx l \land {}\\ \mathit{read}(p_2, j) &\approx \mathit{pair}(\mathit{fst}(\mathit{read}(p_1, j)), l) \land {}\\ p_2 &\approx \mathit{write}(p_1, j, \mathit{read}(p_2, j)) \land \mathit{read}(p_2, j) \not\approx p' \end{aligned}$$

The output is easy to understand by tracing it back to the input. For example, the first literal is a rewrite of the literal *read*(a, i) ≈ i, where a is represented by *fst*(p) and p is represented by *read*(p₁, j). While the interaction of these rules might seem straightforward in this example, the MBP implementation in Z3 fails to project a here because of the multilevel nesting.

Notably, in this example, the c-ground computation during projection allows MBP-QEL to avoid splitting on the disequality p ≉ p′ based on the model. While ADT MBP rules eliminate disequalities by using the model to split them, MBP-QEL benefits from the fact that, after the application of Array MBP rules, the class of p becomes ground, making p ≉ p′ c-ground. Thus, the c-ground computation allows MBP-QEL to produce a formula that is less approximate than those produced by syntactic application of MBP rules. In fact, in this example, a quantifier elimination is obtained (the model M_mbp was not used).

In the next section, we show that our improvements to MBP translate to significant improvements in a CHC-solving procedure that relies on MBP.

### 6 Evaluation

We implement QEL (Alg. 1) and MBP-QEL (Alg. 4) inside Z3 [19] (version 4.12.0), a state-of-the-art SMT solver. Our implementation, referred to as Z3eg, is publicly available on GitHub<sup>5</sup>. Z3eg replaces QeLite with QEL, and the existing MBP with MBP-QEL.

We evaluate Z3eg using two solving tasks. Our first evaluation is on the QSAT algorithm [5] for checking satisfiability of formulas with alternating quantifiers. In QSAT, Z3 uses both QeLite and MBP to under-approximate quantified formulas. We compare three QSAT implementations: the existing version in Z3 with the default QeLite and MBP; the existing version in Z3 in which QeLite and MBP are replaced by our egraph-based algorithms, Z3eg; and the QSAT implementation in YicesQS<sup>6</sup>, based on the Yices [8] SMT solver. During the evaluation, we found a bug in the QSAT implementation of Z3 and fixed it<sup>7</sup>.

<sup>5</sup> Available at https://github.com/igcontreras/z3/tree/qel-cav23.

<sup>6</sup> Available at https://github.com/disteph/yicesQS.

<sup>7</sup> Available at https://github.com/igcontreras/z3/commit/133c9e438ce.


Table 1. Instances solved within 20 min by different implementations. Benchmarks are quantified LIA and LRA formulas from SMT-LIB [2].

Table 2. Instances solved within 60 s for our handcrafted benchmarks.

The fix resulted in Z3 solving over 40 sat instances and over 120 unsat instances more than before. In the following, we use the fixed version of Z3.

We use benchmarks in the theory of (quantified) LIA and LRA from SMT-LIB [2,3], with alternating quantifiers. LIA and LRA are the only tracks in which Z3 uses the QSAT tactic by default. To make our experiments more comprehensive, we also consider two modified variants of the LIA and LRA benchmarks, where we add some non-recursive ADT variables to the benchmarks. Specifically, we wrap all existentially quantified arithmetic variables using a record type ADT and unwrap them whenever they get used<sup>8</sup>. Since these benchmarks are similar to the original, we force Z3 to use the QSAT tactic on them with a tactic.default\_tactic=qsat command line option.

Table 1 summarizes the results for the SMT-LIB benchmarks. In LIA, both Z3eg and Z3 solve all benchmarks in under a minute, while YicesQS is unable to solve many instances. In LRA, YicesQS solves all instances with very good performance. Z3 is able to solve only some benchmarks, and our Z3eg performs similarly to Z3. We found that in the LRA benchmarks, the new algorithms in Z3eg are not being used since there are not many equalities in the formula, and no equalities are inferred during the run of QSAT. Thus, any differences between Z3 and Z3eg are due to inherent randomness of the solving process.

Table 2 summarizes the results for the categories of mixed ADT and arithmetic. YicesQS is not able to compete because it does not support ADTs. As expected, Z3eg solves many more instances than Z3.

The second part of our evaluation shows the efficacy of MBP-QEL for Arrays and ADTs (Alg. 4) in the context of CHC-solving. Z3 uses both QeLite and MBP inside the CHC-solver Spacer [17]. Therefore, we compare Z3 and Z3eg on CHC problems containing Arrays and ADTs. We use two sets of benchmarks to test the efficacy of our MBP. The benchmarks in the first set were generated for verification of Solidity smart contracts [1] (we exclude benchmarks with non-linear arithmetic, as they are not supported by Spacer). These benchmarks have a very complex structure that nests ADTs and Arrays. Specifically, they contain both ADTs of Arrays, as well as Arrays of ADTs. This makes them suitable to test our MBP-QEL. Row 1 of Table 3 shows the number of instances

<sup>8</sup> The modified benchmarks are available at https://github.com/igcontreras/LIA-ADT and https://github.com/igcontreras/LRA-ADT.



solved by Z3 (Spacer) with and without MBP-QEL. Z3eg solves 29 more instances than Z3. Even though MBP is just one part of the overall Spacer algorithm, we see that for these benchmarks, MBP-QEL makes a significant impact on Spacer. Digging deeper, we find that many of these instances come from the category called *abi* (row 2 in Table 3). Z3eg solves all of these benchmarks, while Z3 fails to solve 20 of them. We traced the problem down to the MBP implementation in Z3: it fails to eliminate all variables, causing a runtime exception. In contrast, MBP-QEL eliminates all variables successfully, allowing Z3eg to solve these benchmarks.

We also compare Z3eg with Eldarica [14], a state-of-the-art CHC-solver that is particularly effective on these benchmarks. Z3eg solves almost as many instances as Eldarica. Furthermore, like Z3, Z3eg is orders of magnitude faster than Eldarica. Finally, we compare the performance of Z3eg on Array benchmarks from the CHC competition [13]. Z3eg is competitive with Z3, solving 2 additional safe instances and almost as many unsafe instances as Z3 (row 3 of Table 3). Both Z3eg and Z3 solve quite a few instances more than Eldarica.

Our experiments show the effectiveness of our QEL and MBP-QEL in different settings inside the state-of-the-art SMT solver Z3. While we maintain performance on quantified arithmetic benchmarks, we improve Z3's QSAT algorithm on quantified benchmarks with ADTs. On verification tasks, QEL and MBP-QEL help Spacer solve 30 new instances, even though MBP is only a relatively small part of the overall Spacer algorithm.

### 7 Conclusion

Quantifier elimination, and its under-approximation, model-based projection, are used by many SMT-based decision procedures, including quantified SAT and Constrained Horn Clause solving. Traditionally, these are implemented as a series of syntactic rules operating directly on the syntax of an input formula. In this paper, we argue that these procedures should be implemented directly on the egraph data structure already used by most SMT solvers. This yields algorithms that handle implicit equality reasoning better, and procedures that are easier to implement and faster. We justify this argument by implementing quantifier reduction and MBP in Z3 using egraphs, and show that the new implementation translates into significant improvements in the target decision procedures. Thus, our work provides both theoretical foundations for quantifier reduction and practical contributions to the Z3 SMT solver.

Acknowledgment. The research leading to these results has received funding from the European Research Council under the European Union's Horizon 2020 research and innovation programme (grant agreement No [759102-SVIS]). This research was partially supported by the Israeli Science Foundation (ISF) grant No. 1810/18. We acknowledge the support of the Natural Sciences and Engineering Research Council of Canada (NSERC), MathWorks Inc., and the Microsoft Research PhD Fellowship.

## References



## **Local Search for Solving Satisfiability of Polynomial Formulas**

Haokun Li, Bican Xia, and Tianqi Zhao(B)

School of Mathematical Sciences, Peking University, Beijing, China {haokunli,zhaotq}@pku.edu.cn, xbc@math.pku.edu.cn

**Abstract.** Satisfiability Modulo the Theory of Nonlinear Real Arithmetic, SMT(NRA) for short, concerns the satisfiability of *polynomial formulas*, which are quantifier-free Boolean combinations of polynomial equations and inequalities with integer coefficients and real variables. In this paper, we propose a local search algorithm for a special subclass of SMT(NRA), where all constraints are strict inequalities. An important fact is that, given a polynomial formula with n variables, the zero level set of the polynomials in the formula decomposes the n-dimensional real space into finitely many components (cells), and every polynomial has constant sign in each cell. The key point of our algorithm is a new operation based on real root isolation, called *cell-jump*, which updates the current assignment along a given direction such that the assignment can 'jump' from one cell to another. One cell-jump may adjust the values of several variables, while traditional local search operations, such as *flip* for SAT and *critical move* for SMT(LIA), only change the value of one variable. We also design a two-level operation selection to balance the success rate and efficiency. Furthermore, our algorithm can be easily generalized to a wider subclass of SMT(NRA) where polynomial equations linear with respect to some variable are allowed. Experiments show that the algorithm is competitive with state-of-the-art SMT solvers, and performs particularly well on formulas with high-degree polynomials.

**Keywords:** SMT · Local search · Nonlinear real arithmetic · Cell-jump · Cylindrical Algebraic Decomposition (CAD)

## **1 Introduction**

Satisfiability modulo theories (SMT) refers to the problem of determining whether a first-order formula is satisfiable with respect to (w.r.t.) certain theories, such as the theories of linear integer/real arithmetic, nonlinear integer/real arithmetic and strings. In this paper, we consider the theory of nonlinear real arithmetic (NRA) and restrict our attention to the problem of solving satisfiability of quantifier-free polynomial formulas.

Solving polynomial constraints has been a central problem in the development of mathematics. In 1951, Tarski's decision procedure [33] made it possible to solve polynomial constraints in an algorithmic way. However, Tarski's

The authors are listed in alphabetical order and made equal contributions.

© The Author(s) 2023 C. Enea and A. Lal (Eds.): CAV 2023, LNCS 13965, pp. 87–109, 2023. https://doi.org/10.1007/978-3-031-37703-7\_5

algorithm is impractical because of its super-exponential complexity. The first relatively practical method is the cylindrical algebraic decomposition (CAD) algorithm [13], proposed by Collins in 1975 and followed by many improvements; see, for example, [6,14,20,22,26]. Unfortunately, those variants do not improve the complexity of the original algorithm, which is doubly exponential. On the other hand, SMT(NRA) is important in theorem proving and program verification, since many complicated programs use real variables and perform nonlinear arithmetic operations on them. In particular, SMT(NRA) has various applications in the formal analysis of hybrid systems, dynamical systems and probabilistic systems (see the book [12] for reference).

The most popular approach for solving SMT(NRA) is the lazy approach, also known as CDCL(T) [5]. It combines a propositional satisfiability (SAT) solver that uses a conflict-driven clause learning (CDCL) style algorithm to find assignments of the propositional abstraction of a polynomial formula and a theory solver that checks the consistency of sets of polynomial constraints. The solving effort in the approach is devoted to both the Boolean layer and the theory layer. For the theory solver, the only complete method is the CAD method, and there also exist many efficient but incomplete methods, such as linearisation [10], interval constraint propagation [34] and virtual substitution [35]. Recall that the complexity of the CAD method is doubly-exponential. In order to ease the burden of using CAD, an improved CDCL-style search framework, the model constructing satisfiability calculus (MCSAT) framework [15,21], was proposed. Further, there are many optimizations on CAD projection operation, *e.g.* [7,24,29], custom-made for this framework. Besides, an alternative algorithm for determining the satisfiability of conjunctions of non-linear polynomial constraints over the reals based on CAD is presented in [1].

The development of this approach has brought us effective SMT(NRA) solvers. Almost all state-of-the-art SMT(NRA) solvers are based on the lazy approach, including Z3 [28], CVC5 [3], Yices2 [16] and MathSAT5 [11]. These solvers have made great progress in solving SMT(NRA). However, their time and memory usage on some hard instances may be unacceptable, particularly when the proportion of nonlinear polynomials among all polynomials appearing in the formula is high. This pushes us to design algorithms that perform well on these hard instances.

Local search plays an important role in solving satisfiability problems, although it is an incomplete method: it can only determine satisfiability, not unsatisfiability. A local search algorithm moves in the space of candidate assignments (the search space) by applying local changes, until a satisfying assignment is found or a time bound is reached. Local search has been successfully applied to SAT problems [2,4,9,23]. In recent years, some efforts to develop local search methods for SMT solving have been inspiring: under the DPLL(T) framework, Griggio et al. [19] introduced a general procedure for integrating a local search solver of the WalkSAT family with a theory solver. Pure local search algorithms [17,30,31] were proposed to solve SMT problems with respect to the theory of bit-vectors directly on the theory level. Cai et al. [8] developed a local search procedure for SMT on the theory of linear integer arithmetic (LIA) through the *critical move* operation, which works on the literal level and changes the value of one variable in a false LIA literal to make it true. We also notice that there exists a local search SMT solver for the theory of NRA, called NRA-LS, which performed well at the SMT Competition 2022<sup>1</sup>. A brief description of the solver, without details about local search, can be found in [25].

In this paper, we propose a local search algorithm for a special subclass of SMT(NRA), where all constraints are strict inequalities. The idea of applying the local search method to SMT(NRA) comes from CAD, which is a decomposition of the search space R<sup>n</sup> into finitely many cells such that every polynomial in the formula is sign-invariant on each cell. CAD guarantees that the search space only has finitely many states. Similar to the local search method for SAT which moves between finitely many Boolean assignments, local search for SMT(NRA) should jump between finitely many cells. So, we may use a local search framework for SAT to solve SMT(NRA).

Local search algorithms require an operation to perform local changes. For SAT, a standard operation is *flip*, which modifies the current assignment by flipping the value of one Boolean variable from false to true or vice versa. For SMT(NRA), we propose a novel operation, called *cell-jump*, which updates the current assignment x1 ↦ a1, ..., xn ↦ an (ai ∈ Q) to a solution of a false polynomial constraint 'p < 0' or 'p > 0', where xi is a variable appearing in the given polynomial formula. Different from the critical move operation for linear integer constraints [8], it is difficult to determine a threshold value of some variable xi such that the false polynomial constraint becomes true. We deal with this issue by the method of real root isolation, which isolates every real root of the univariate polynomial p(a1,...,a(i−1), xi, a(i+1),...,an) in a sufficiently small open interval with rational endpoints. If at least one endpoint makes the false constraint true, a cell-jump operation assigns xi to the endpoint closest to ai. The procedure can be viewed as searching for a solution along a line parallel to the xi-axis. In fact, a cell-jump operation can search along any fixed straight line, and thus one cell-jump may change the values of more than one variable. At each step, the local search algorithm picks a cell-jump operation to execute according to a two-level operation selection and updates the current assignment, until a solution to the polynomial formula is found or the termination condition is satisfied. Moreover, our algorithm can be generalized to deal with a wider subclass of SMT(NRA) where polynomial equations linear w.r.t. some variable are allowed.
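To make the axis-parallel case concrete, the following Python sketch performs a one-variable cell-jump for a strict constraint p < 0 (or p > 0). Exact real root isolation is replaced here by dense rational sampling plus bisection, and the search range, step count, and function names are all illustrative assumptions, not the paper's actual procedure:

```python
# Sketch of a one-variable "cell-jump" for a strict constraint p(x) < 0 or
# p(x) > 0, where p is the univariate polynomial obtained by substituting the
# current values of all other variables (given here as a Python callable).

from fractions import Fraction

def isolate_roots(p, lo=-100, hi=100, steps=2000):
    """Return small rational intervals, each bracketing one sign change of p."""
    intervals = []
    step = Fraction(hi - lo, steps)
    a = Fraction(lo)
    while a < hi:
        b = a + step
        if p(a) == 0:  # landed exactly on a root; take a tiny interval around it
            intervals.append((a - step / 2, a + step / 2))
        elif p(a) * p(b) < 0:
            x, y = a, b
            for _ in range(30):          # bisect to shrink the interval
                m = (x + y) / 2
                if p(x) * p(m) <= 0:
                    y = m
                else:
                    x = m
            intervals.append((x, y))
        a = b
    return intervals

def cell_jump(p, current, want_negative=True):
    """Move `current` to the nearest interval endpoint where the constraint holds."""
    ok = (lambda v: p(v) < 0) if want_negative else (lambda v: p(v) > 0)
    candidates = [e for (x, y) in isolate_roots(p) for e in (x, y) if ok(e)]
    if not candidates:
        return None  # no endpoint satisfies the constraint along this axis
    return min(candidates, key=lambda e: abs(e - current))

# Example: make p(x) = x^2 - 2 < 0 true starting from x = 5.
p = lambda x: x * x - 2
new_x = cell_jump(p, Fraction(5))
assert new_x is not None and p(new_x) < 0
```

On p(x) = x² − 2 with current value x = 5, the jump lands just inside the cell (−√2, √2), at the satisfying endpoint nearest to 5.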

The local search algorithm is implemented as a tool in Maple 2022. Experiments are conducted to evaluate the tool on two classes of benchmarks: selected instances from SMT-LIB<sup>2</sup>, and some hard instances generated randomly with only nonlinear constraints. Experimental results show that our tool is competitive with state-of-the-art SMT solvers on the SMT-LIB benchmarks, and performs particularly well on the hard instances. We also combine our tool with

<sup>1</sup> https://smt-comp.github.io/2022.

<sup>2</sup> https://smtlib.cs.uiowa.edu/benchmarks.shtml.

Z3, CVC5, Yices2 and MathSAT5 respectively to obtain four sequential portfolio solvers, which show better performance.

The rest of the paper is organized as follows. The next section introduces basic definitions and notation, as well as a general local search framework for solving satisfiability problems. Section 3 shows, from the CAD perspective, that the search space for SMT(NRA) has only finitely many states. In Sect. 4, we describe cell-jump operations, while in Sect. 5 we provide the scoring function, which gives every operation a score. The main algorithm is presented in Sect. 6. In Sect. 7, experimental results are provided to indicate the efficiency of the algorithm. Finally, the paper is concluded in Sect. 8.

## **2 Preliminaries**

#### **2.1 Notation**

Let x̄ := (x1,...,xn) be a vector of variables. Denote by Q, R and Z the set of rational numbers, real numbers and integers, respectively. Let Q[x̄] and R[x̄] be the rings of polynomials in the variables x1,...,xn with coefficients in Q and in R, respectively.

**Definition 1 (Polynomial Formula).** *Suppose* Λ = {P1,...,Pm} *where every* Pi *is a non-empty finite subset of* Q[x̄]*. The following formula*

$$F = \bigwedge_{P_i \in \Lambda} \bigvee_{p_{ij} \in P_i} p_{ij}(x_1, \dots, x_n) \rhd_{ij} 0, \quad \text{where } \rhd_{ij} \in \{<, >, =\},$$

*is called a* polynomial formula*. Additionally, we call* $p_{ij}(x_1, \dots, x_n) \rhd_{ij} 0$ *an* atomic polynomial formula*, and* $\bigvee_{p_{ij} \in P_i} p_{ij}(x_1, \dots, x_n) \rhd_{ij} 0$ *a* polynomial clause*.*

For any polynomial formula F, poly(F) denotes the set of polynomials appearing in F. For any atomic formula ℓ, poly(ℓ) denotes the polynomial appearing in ℓ and rela(ℓ) denotes the relational operator ('<', '>' or '=') of ℓ.

For any polynomial formula F, an *assignment* is a mapping α : x̄ → Rⁿ such that α(x̄) = (a1,...,an) where ai ∈ R. Given an assignment α, an atomic polynomial formula p ▷ 0 is *true* under α if p(α(x̄)) ▷ 0 holds, and *false* under α otherwise; a polynomial clause is *satisfied* under α if at least one of its atomic formulas is true under α, and *falsified* under α otherwise.
When the context is clear, we simply say a *true* (or *false*) atomic polynomial formula and a *satisfied* (or *falsified*) polynomial clause. A polynomial formula is *satisfiable* if there exists an assignment α such that all clauses in the formula are satisfied under α, and such an assignment is a *solution* to the polynomial formula. A polynomial formula is *unsatisfiable* if any assignment is not a solution.
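These definitions transcribe directly into code. The sketch below assumes polynomials are represented as Python callables over the assignment vector, a formula as a list of clauses, and a clause as a list of (polynomial, relation) atoms — a purely illustrative representation:

```python
# Evaluating polynomial formulas under an assignment, per the definitions above.

from fractions import Fraction
import operator

REL = {"<": operator.lt, ">": operator.gt, "=": operator.eq}

def atom_true(atom, point):
    """An atomic formula p ▷ 0 is true under α iff p(α(x̄)) ▷ 0 holds."""
    p, rel = atom
    return REL[rel](p(*point), 0)

def clause_satisfied(clause, point):
    """A clause is satisfied iff at least one of its atoms is true."""
    return any(atom_true(a, point) for a in clause)

def is_solution(formula, point):
    """α is a solution iff every clause is satisfied under α."""
    return all(clause_satisfied(c, point) for c in formula)

# F = (x^2 + y^2 - 1 < 0) ∧ (x > 0 ∨ y > 0)
F = [
    [(lambda x, y: x * x + y * y - 1, "<")],
    [(lambda x, y: x, ">"), (lambda x, y: y, ">")],
]
assert is_solution(F, (Fraction(1, 2), Fraction(1, 2)))
assert not is_solution(F, (Fraction(2), Fraction(0)))
```

Rational points (here via `Fraction`) keep the evaluation exact, matching the role Q plays in the paper's assignments.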

#### **2.2 A General Local Search Framework**

When applying local search algorithms to solve a satisfiability problem, the search space is the set of all assignments. A general local search framework begins with a complete initial assignment. At each step, one of the operations with the highest score is picked and the assignment is updated by executing the operation, until the terminal condition is reached. Below, we give formal definitions of *operation* and *scoring function*.

**Definition 2 (Operation).** *Let* F *be a formula. Given an assignment* α *which is not a solution of* F*, an* operation *modifies* α *to another assignment* α′*.*

**Definition 3 (Scoring Function).** *Let* F *be a formula. Suppose* α *is the current assignment and* op *is an operation. A* scoring function *is defined as* score(op, α) := cost(α) − cost(α′)*, where the real-valued function* cost *measures the cost of making* F *satisfied under an assignment according to some heuristic, and* α′ *is the assignment after executing* op*.*

*Example 1.* In local search algorithms for SAT, a standard operation is *flip*, which modifies the current assignment by flipping the value of one Boolean variable from false to true or vice versa. A commonly used scoring function measures the change in the number of falsified clauses caused by flipping a variable. Thus, an operation $op$ is $\mathrm{flip}(b)$ for some Boolean variable $b$, and $\mathrm{cost}(\alpha)$ is interpreted as the number of falsified clauses under the assignment $\alpha$.

It only makes sense to execute an operation $op$ when $\mathrm{score}(op, \alpha)$ is positive, since such an operation moves the current assignment to one with a lower cost of becoming a solution.

**Definition 4 (Decreasing Operation).** *Suppose* α *is the current assignment. Given a scoring function* score*, an operation* op *is a* decreasing operation *under* α *if* score(op, α) > 0*.*

A general local search framework is described in Algorithm 1. The framework was used in GSAT [27] for solving SAT problems. Note that if the input formula F is satisfiable, Algorithm 1 outputs either (i) a solution of F, if one is found successfully, or (ii) "unknown", if the algorithm fails.
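The framework can be sketched in a few lines of Python. This is only an illustration of the loop structure, not the authors' implementation: the clause representation, the `decreasing_ops` and `score` callables, and the step budget are assumptions of this sketch.

```python
def local_search(clauses, init, decreasing_ops, score, max_steps=10_000):
    """Skeleton of the general local search framework (Algorithm 1 / GSAT style).
    `clauses` is a list of clauses, each a list of predicates over an assignment;
    `decreasing_ops(alpha)` returns the operations op with score(op, alpha) > 0."""
    alpha = dict(init)                                    # complete initial assignment
    for _ in range(max_steps):
        if all(any(lit(alpha) for lit in c) for c in clauses):
            return alpha                                  # a solution of F
        ops = decreasing_ops(alpha)
        if not ops:
            return None                                   # "unknown": search is stuck
        best = max(ops, key=lambda op: score(op, alpha))  # highest-scoring operation
        alpha = best(alpha)                               # execute it, update assignment
    return None                                           # "unknown": budget exhausted
```

Instantiated with the SAT *flip* operation of Example 1 this behaves like GSAT; the NRA instantiation developed below replaces flips by cell-jumps.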

## **3 The Search Space of SMT(NRA)**

The search space for SAT problems consists of finitely many assignments, so, theoretically speaking, a local search algorithm can eventually find a solution, as long as the formula indeed has one and there is no cycling during the search. The search space of an SMT(NRA) problem, however, is the infinite set $\mathbb{R}^n$, so it is not obvious that such search algorithms can work.

Fortunately, due to Tarski's work and the theory of CAD, SMT(NRA) is decidable. Given a polynomial formula in $n$ variables, by the theory of CAD, $\mathbb{R}^n$ can be decomposed into finitely many cells such that every polynomial in the formula is sign-invariant on each cell. Therefore, the search space of the problem is essentially finite. The cells play a role analogous to the Boolean assignments of SAT, so, just as one traverses all Boolean assignments in SAT, there is a basic strategy for traversing all cells.

In this section, we describe the search space of SMT(NRA) based on the CAD theory from a local search perspective, providing a theoretical foundation for the operators and heuristics we will propose in the next sections.

*Example 2.* Consider the polynomial formula

$$F := (f_1 > 0 \lor f_2 > 0) \land (f_1 < 0 \lor f_2 < 0),$$

where $f_1 = 17x^2 + 2xy + 17y^2 + 48x - 48y$ and $f_2 = 17x^2 - 2xy + 17y^2 - 48x - 48y$.

The solution set of $F$ is shown as the shaded area in Fig. 1. Notice that $\mathrm{poly}(F)$ consists of two polynomials and decomposes $\mathbb{R}^2$ into 10 areas: $C_1,\ldots,C_{10}$ (see Fig. 2). We refer to these areas as *cells*.

**Fig. 1.** The solution set of F in Example 2.

**Fig. 2.** The zero level set of poly(F) decomposes R<sup>2</sup> into 10 cells.

**Definition 5 (Cell).** *For any finite set* $Q \subseteq \mathbb{R}[\bar{x}]$*, a* cell *of* $Q$ *is a maximal connected set in* $\mathbb{R}^n$ *on which the sign of every polynomial in* $Q$ *is constant. For any point* $\bar{a} \in \mathbb{R}^n$*, we denote by* $\mathrm{cell}(Q, \bar{a})$ *the cell of* $Q$ *containing* $\bar{a}$*.*

By the theory of CAD, we have

**Corollary 1.** *For any finite set* $Q \subseteq \mathbb{R}[\bar{x}]$*, the number of cells of* $Q$ *is finite.*

It is obvious that any two cells of $Q$ are disjoint and that the union of all cells of $Q$ equals $\mathbb{R}^n$. Definition 5 implies that for a polynomial formula $F$ with $\mathrm{poly}(F) = Q$, the truth of $F$ is constant on every cell of $Q$; that is, either all points in a cell are solutions to $F$ or none of them are.

*Example 3.* Consider the polynomial formula F in Example 2. As shown in Fig. 3, assume that we start from point a to search for a solution to F. Jumping from a to b makes no difference, as both points lie in the same cell and thus neither is a solution to F. However, jumping from a to c or from a to d crosses different cells, and we may discover a cell satisfying F. Here, the cell containing d satisfies F.

**Fig. 3.** Jumping from point a to search for a solution of F.

**Fig. 4.** A cylindrical expansion of a cylindrically complete set containing poly(F).

In the remainder of this section, we demonstrate how to traverse all cells through point jumps between cells. The method of traversing cell by cell, variable by variable, is explained step by step from Definition 6 to Definition 8.

**Definition 6 (Expansion).** *Let* $Q \subseteq \mathbb{R}[\bar{x}]$ *be finite and* $\bar{a} = (a_1,\ldots,a_n) \in \mathbb{R}^n$*. Given a variable* $x_i$ ($1 \le i \le n$)*, let* $r_1 < \cdots < r_s$ *be all real roots of* $\{q(a_1,\ldots,a_{i-1}, x_i, a_{i+1},\ldots,a_n) \mid q(a_1,\ldots,a_{i-1}, x_i, a_{i+1},\ldots,a_n) \not\equiv 0,\ q \in Q\}$*, where* $s \in \mathbb{Z}_{\ge 0}$*. An* expansion *of* $\bar{a}$ *to* $x_i$ *on* $Q$ *is a point set* $\Lambda \subseteq \mathbb{R}^n$ *satisfying*

*(i) every point in* $\Lambda$ *agrees with* $\bar{a}$ *on all coordinates except possibly the* $i$*-th, and (ii) the set of* $i$*-th coordinates of the points in* $\Lambda$ *consists of the roots* $r_1,\ldots,r_s$ *together with exactly one point from each of the intervals* $(-\infty, r_1), (r_1, r_2), \ldots, (r_s, +\infty)$*.*
*For any point set* $\{\bar{a}^{(1)},\ldots,\bar{a}^{(m)}\} \subseteq \mathbb{R}^n$*, an* expansion *of the set to* $x_i$ *on* $Q$ *is* $\bigcup_{j=1}^{m} \Lambda_j$*, where* $\Lambda_j$ *is an expansion of* $\bar{a}^{(j)}$ *to* $x_i$ *on* $Q$*.*

*Example 4.* Consider the polynomial formula F in Example 2. The set of black solid points in Fig. 3, denoted as Λ, is an expansion of point (0, 0) to x on poly(F). The set of all points (including black solid points and hollow points) is an expansion of Λ to y on poly(F).

As shown in Fig. 3, an expansion of a point to some variable is in fact the result of the point repeatedly jumping to adjacent cells along that variable's direction. Next, we describe the expansion to all variables in order, which corresponds to jumping from cell to cell along the variable directions w.r.t. a variable order.

**Definition 7 (Cylindrical Expansion).** *Let* $Q \subseteq \mathbb{R}[\bar{x}]$ *be finite and* $\bar{a} \in \mathbb{R}^n$*. Given a variable order* $x_1 \prec \cdots \prec x_n$*, a* cylindrical expansion *of* $\bar{a}$ *w.r.t. the variable order on* $Q$ *is* $\bigcup_{i=1}^{n} \Lambda_i$*, where* $\Lambda_1$ *is an expansion of* $\bar{a}$ *to* $x_1$ *on* $Q$*, and for* $2 \le i \le n$*,* $\Lambda_i$ *is an expansion of* $\Lambda_{i-1}$ *to* $x_i$ *on* $Q$*. When the context is clear, we simply call* $\bigcup_{i=1}^{n} \Lambda_i$ *a* cylindrical expansion *of* $Q$*.*

*Example 5.* Consider the formula $F$ in Example 2. It is clear that the set of all points in Fig. 3 is a cylindrical expansion of the point $(0, 0)$ w.r.t. $x \prec y$ on $\mathrm{poly}(F)$. The expansion describes the following jumping process: first, the origin $(0, 0)$ jumps along the $x$-axis to the black points, and then the black points jump along the $y$-axis direction to the white points.

A cylindrical expansion is thus analogous to flipping a Boolean vector variable by variable. Note that the points of the expansion in Fig. 3 do not cover all the cells (*e.g.* $C_7$ and $C_8$ in Fig. 2), but if we start from $(0, 2)$, all the cells can be covered. Whether all the cells are covered therefore depends on the starting point.

**Definition 8 (Cylindrically Complete).** *Let* $Q \subseteq \mathbb{R}[\bar{x}]$ *be finite. Given a variable order* $x_1 \prec \cdots \prec x_n$*,* $Q$ *is said to be* cylindrically complete *w.r.t. the variable order if, for any* $\bar{a} \in \mathbb{R}^n$ *and for any cylindrical expansion* $\Lambda$ *of* $\bar{a}$ *w.r.t. the order on* $Q$*, every cell of* $Q$ *contains at least one point in* $\Lambda$*.*

**Theorem 1.** *For any finite set* $Q \subseteq \mathbb{R}[\bar{x}]$ *and any variable order, there exists* $Q'$ *such that* $Q \subseteq Q' \subseteq \mathbb{R}[\bar{x}]$ *and* $Q'$ *is cylindrically complete w.r.t. the variable order.*

*Proof.* Let $Q'$ be the projection set of $Q$ [6,13,26] obtained from the CAD projection operator w.r.t. the variable order. According to the theory of CAD, $Q'$ is cylindrically complete.

**Corollary 2.** *For any polynomial formula* $F$ *and any variable order, there exists a finite set* $Q' \subseteq \mathbb{R}[\bar{x}]$ *such that for any cylindrical expansion* $\Lambda$ *of* $Q'$*, every cell of* $\mathrm{poly}(F)$ *contains at least one point in* $\Lambda$*. Furthermore,* $F$ *is satisfiable if and only if* $F$ *has solutions in* $\Lambda$*.*

*Example 6.* Consider the polynomial formula $F$ in Example 2. By the proof of Theorem 1, $Q' := \{x,\ -2 - 3x + x^2,\ -2 + 3x + x^2,\ 10944 + 17x^2,\ f_1,\ f_2\}$ is a cylindrically complete set w.r.t. $x \prec y$ containing $\mathrm{poly}(F)$. As shown in Fig. 4, the set of all (hollow) points is a cylindrical expansion of the point $(0, 0)$ w.r.t. $x \prec y$ on $Q'$, which covers all cells of $\mathrm{poly}(F)$.

Corollary 2 shows that for a polynomial formula $F$, there exists a finite set $Q' \subseteq \mathbb{R}[\bar{x}]$ such that we can traverse all the cells of $\mathrm{poly}(F)$ through a search path containing all points of a cylindrical expansion of $Q'$. The cost of traversing all cells is very high, however: in the worst case, the number of cells grows exponentially with the number of variables.

The key to building a local search on SMT(NRA) is to construct a heuristic search based on the operation of jumping between cells.

## **4 The Cell-Jump Operation**

In this section, we propose a novel operation, called *cell-jump*, that performs local changes in our algorithm. The operation is determined by means of real root isolation. We review the method of real root isolation and define *sample points* in Sect. 4.1. Section 4.2 and Sect. 4.3 present cell-jump operations along a line parallel to a coordinate axis and along an arbitrary fixed straight line, respectively.

#### **4.1 Sample Points**

Real root isolation is a symbolic way to compute the real roots of a polynomial, and is of fundamental importance in computational real algebraic geometry (*e.g.*, it is a routine sub-algorithm of CAD). There are many efficient algorithms and popular tools in computer algebra systems, such as Maple and Mathematica, to isolate the real roots of polynomials.

We first introduce the definition of *sequences of isolating intervals* for nonzero univariate polynomials, which can be obtained by any real root isolation tool, *e.g.* CLPoly<sup>3</sup>.

**Definition 9 (Sequence of Isolating Intervals).** *For any nonzero univariate polynomial* $p(x) \in \mathbb{Q}[x]$*, a* sequence of isolating intervals *of* $p(x)$ *is a sequence of open intervals* $(a_1, b_1),\ldots,(a_s, b_s)$*, where* $s \in \mathbb{Z}_{\ge 0}$*, such that*

*(i) for each* $i$ ($1 \le i \le s$)*,* $a_i, b_i \in \mathbb{Q}$*,* $a_i < b_i$ *and* $b_i < a_{i+1}$*,*

*(ii) each interval* $(a_i, b_i)$ ($1 \le i \le s$) *contains exactly one real root of* $p(x)$*, and*

*(iii) none of the real roots of* $p(x)$ *lie in* $\mathbb{R} \setminus \bigcup_{i=1}^{s}(a_i, b_i)$*.*

*In particular, the sequence of isolating intervals is empty, i.e.,* $s = 0$*, when* $p(x)$ *has no real roots.*

By means of sequences of isolating intervals, we define the *sample points* of univariate polynomials, which are the key concept behind the *cell-jump* operation proposed in Sect. 4.2 and Sect. 4.3.

**Definition 10 (Sample Point).** *For any nonzero univariate polynomial* $p(x) \in \mathbb{Q}[x]$*, let* $(a_1, b_1),\ldots,(a_s, b_s)$ *be a sequence of isolating intervals of* $p(x)$*, where* $s \in \mathbb{Z}_{\ge 0}$*. Every point in the set* $\{a_1, b_s\} \cup \bigcup_{i=1}^{s-1} \{b_i, \frac{b_i + a_{i+1}}{2}, a_{i+1}\}$ *is a* sample point *of* $p(x)$*. If* $x^*$ *is a sample point of* $p(x)$ *and* $p(x^*) > 0$ (*or* $p(x^*) < 0$)*, then* $x^*$ *is a* positive sample point (*or* negative sample point) *of* $p(x)$*. The zero polynomial has no* sample point*, no* positive sample point *and no* negative sample point*.*

*Remark 1.* For any nonzero univariate polynomial $p(x)$ that has real roots, let $r_1,\ldots,r_s$ ($s \in \mathbb{Z}_{\ge 1}$) be all distinct real roots of $p(x)$. Clearly, the sign of $p(x)$ is constantly positive or constantly negative on each interval $I$ in the set $\{(-\infty, r_1), (r_1, r_2), \ldots, (r_{s-1}, r_s), (r_s, +\infty)\}$. So we only need to take one point $x^*$ from $I$, and the sign of $p(x^*)$ is the constant sign of

<sup>3</sup> https://github.com/lihaokun/CLPoly.

$p(x)$ on $I$. Specifically, we take $a_1$ as the sample point for the interval $(-\infty, r_1)$; $b_i$, $\frac{b_i + a_{i+1}}{2}$ or $a_{i+1}$ as a sample point for $(r_i, r_{i+1})$, where $1 \le i \le s - 1$; and $b_s$ as the sample point for $(r_s, +\infty)$. By Definition 10, neither the zero polynomial nor a univariate polynomial with no real roots has any sample point.

*Example 7.* Consider the polynomial $p(x) = x^8 - 4x^6 + 6x^4 - 4x^2 + 1$. It has two real roots, $-1$ and $1$, and a sequence of isolating intervals of it is $(-\frac{215}{128}, -\frac{19}{32})$, $(\frac{19}{32}, \frac{215}{128})$. Every point in the set $\{-\frac{215}{128}, -\frac{19}{32}, 0, \frac{19}{32}, \frac{215}{128}\}$ is a sample point of $p(x)$. Note that $p(x) > 0$ holds on the intervals $(-\infty, -1)$ and $(1, +\infty)$, and $p(x) < 0$ holds on the interval $(-1, 1)$. Thus, $-\frac{215}{128}$ and $\frac{215}{128}$ are positive sample points of $p(x)$, while $-\frac{19}{32}$, $0$ and $\frac{19}{32}$ are negative sample points of $p(x)$.
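Definition 10 is mechanical enough to state as code. The following is a minimal sketch assuming the isolating intervals are already available as pairs of rationals (e.g. produced by a tool such as CLPoly); the helper name `sample_points` is ours:

```python
from fractions import Fraction

def sample_points(intervals):
    """Sample points of a nonzero univariate polynomial p (Definition 10),
    computed from a sequence of isolating intervals (a1,b1),...,(as,bs) with
    a_i < b_i and b_i < a_{i+1}, given as pairs of Fractions: the set
    {a1, bs} together with b_i, (b_i + a_{i+1})/2 and a_{i+1} for each
    pair of consecutive intervals."""
    if not intervals:
        return []                  # p has no real roots: no sample points
    points = {intervals[0][0], intervals[-1][1]}
    for (_, b_i), (a_next, _) in zip(intervals, intervals[1:]):
        points.update({b_i, (b_i + a_next) / 2, a_next})
    return sorted(points)
```

For the isolating intervals of Example 7 this reproduces exactly the five sample points listed there.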

#### **4.2 Cell-Jump Along a Line Parallel to a Coordinate Axis**

The *critical move* operation [8, Definition 2] is a literal-level operation. For any false LIA literal, the operation changes the value of one variable in it to make the literal true. In this subsection, we propose a similar operation which adjusts the value of one variable in a false atomic polynomial formula with '<' or '>'.

**Definition 11.** *Suppose the current assignment is* $\alpha : x_1 \mapsto a_1, \ldots, x_n \mapsto a_n$*, where* $a_i \in \mathbb{Q}$*. Let* $\ell$ *be a false atomic polynomial formula under* $\alpha$ *with relational operator '*<*' or '*>*'.*

*For* $1 \le i \le n$*, let* $p^*$ *denote the univariate polynomial* $\mathrm{poly}(\ell)(a_1,\ldots,a_{i-1}, x_i, a_{i+1},\ldots,a_n)$*. If* $\mathrm{rela}(\ell)$ *is '*<*' and* $p^*$ *has negative sample points, a* $\mathrm{cjump}(x_i, \ell)$ operation *assigns* $x_i$ *to a negative sample point of* $p^*$ *closest to* $a_i$*; if* $\mathrm{rela}(\ell)$ *is '*>*' and* $p^*$ *has positive sample points,* $\mathrm{cjump}(x_i, \ell)$ *assigns* $x_i$ *to a positive sample point of* $p^*$ *closest to* $a_i$*. Otherwise, there exists no* $\mathrm{cjump}(x_i, \ell)$ *operation.*
Every assignment in the search space can be viewed as a point in $\mathbb{R}^n$. Performing a $\mathrm{cjump}(x_i, \ell)$ operation is then equivalent to moving one step from the current point $\alpha(\bar{x})$ along the line $(a_1,\ldots,a_{i-1}, \mathbb{R}, a_{i+1},\ldots,a_n)$. Since the line is parallel to the $x_i$-axis, we call $\mathrm{cjump}(x_i, \ell)$ a *cell-jump along a line parallel to a coordinate axis*.

**Theorem 2.** *Suppose the current assignment is* $\alpha : x_1 \mapsto a_1, \ldots, x_n \mapsto a_n$*, where* $a_i \in \mathbb{Q}$*. Let* $\ell$ *be a false atomic polynomial formula under* $\alpha$ *with relational operator '*<*' or '*>*'. For every* $i$ ($1 \le i \le n$)*, there exists a solution of* $\ell$ *in* $\{\alpha' \mid \alpha'(\bar{x}) \in (a_1,\ldots,a_{i-1}, \mathbb{R}, a_{i+1},\ldots,a_n)\}$ *if and only if there exists a* $\mathrm{cjump}(x_i, \ell)$ *operation.*

*Proof.* ($\Leftarrow$) This is clear by the definition of negative (or positive) sample points.

($\Rightarrow$) Let $S := \{\alpha' \mid \alpha'(\bar{x}) \in (a_1,\ldots,a_{i-1}, \mathbb{R}, a_{i+1},\ldots,a_n)\}$. It is equivalent to prove that if there exists no $\mathrm{cjump}(x_i, \ell)$ operation, then no solution to $\ell$ exists in $S$. We only prove it for $\ell$ of the form $p(\bar{x}) < 0$. Recall Definition 10 and Remark 1. There are only three cases in which $\mathrm{cjump}(x_i, \ell)$ does not exist: (1) $p^*$ is the zero polynomial, (2) $p^*$ has no real roots, (3) $p^*$ has a finite number of real roots, say $r_1,\ldots,r_s$ ($s \in \mathbb{Z}_{\ge 1}$), and $p^*$ is positive on $\mathbb{R} \setminus \{r_1,\ldots,r_s\}$, where $p^*$ denotes the polynomial $p(a_1,\ldots,a_{i-1}, x_i, a_{i+1},\ldots,a_n)$. In the first case, $p(\alpha'(\bar{x})) = 0$, and in the third case, $p(\alpha'(\bar{x})) \ge 0$ for any assignment $\alpha' \in S$. In the second case, the sign of $p^*$ is constantly positive or constantly negative on the whole real axis. Since $\ell$ is false under $\alpha$, we have $p(\alpha(\bar{x})) \ge 0$, that is, $p^*(a_i) \ge 0$. So $p^*(x_i) > 0$ for any $x_i \in \mathbb{R}$, which means $p(\alpha'(\bar{x})) > 0$ for any $\alpha' \in S$. Therefore, no solution to $\ell$ exists in $S$ in any of the three cases, which completes the proof.

The above theorem shows that if $\mathrm{cjump}(x_i, \ell)$ does not exist, then there is no need to search for a solution to $\ell$ along the line $(a_1,\ldots,a_{i-1}, \mathbb{R}, a_{i+1},\ldots,a_n)$. Conversely, whenever a $\mathrm{cjump}(x_i, \ell)$ operation exists, executing it always yields a solution to $\ell$.

*Example 8.* Assume the current assignment is $\alpha : x_1 \mapsto 1, x_2 \mapsto 1$. Consider two false atomic polynomial formulas $\ell_1 : 2x_1^2 + 2x_2^2 - 1 < 0$ and $\ell_2 : x_1^8 x_2^3 - 4x_1^6 + 6x_1^4 x_2 - 4x_1^2 + x_2 > 0$. Let $p_1 := \mathrm{poly}(\ell_1)$ and $p_2 := \mathrm{poly}(\ell_2)$.

We first consider $\mathrm{cjump}(x_i, \ell_1)$. For the variable $x_1$, the corresponding univariate polynomial is $p_1(x_1, 1) = 2x_1^2 + 1$, and for $x_2$, the corresponding one is $p_1(1, x_2) = 2x_2^2 + 1$. Neither has real roots, so there exists no $\mathrm{cjump}(x_1, \ell_1)$ operation and no $\mathrm{cjump}(x_2, \ell_1)$ operation. Applying Theorem 2, we know that a solution of $\ell_1$ can only be located in $\mathbb{R}^2 \setminus ((1, \mathbb{R}) \cup (\mathbb{R}, 1))$ (also see Fig. 5(a)). So we cannot find a solution of $\ell_1$ through a one-step cell-jump from the assignment point $(1, 1)$ along the lines $(1, \mathbb{R})$ and $(\mathbb{R}, 1)$.

Then consider $\mathrm{cjump}(x_i, \ell_2)$. For the variable $x_1$, the corresponding univariate polynomial is $p_2(x_1, 1) = x_1^8 - 4x_1^6 + 6x_1^4 - 4x_1^2 + 1$. Recall Example 7: there are two positive sample points of $p_2(x_1, 1)$, namely $-\frac{215}{128}$ and $\frac{215}{128}$, and $\frac{215}{128}$ is the closest one to $\alpha(x_1)$. So $\mathrm{cjump}(x_1, \ell_2)$ assigns $x_1$ to $\frac{215}{128}$, after which the assignment becomes $\alpha : x_1 \mapsto \frac{215}{128}, x_2 \mapsto 1$, which is a solution of $\ell_2$. For the variable $x_2$, the corresponding polynomial is $p_2(1, x_2) = x_2^3 + 7x_2 - 8$, which has one real root, $1$. A sequence of isolating intervals of $p_2(1, x_2)$ is $(\frac{19}{32}, \frac{215}{128})$, and $\frac{215}{128}$ is the only positive sample point. So $\mathrm{cjump}(x_2, \ell_2)$ assigns $x_2$ to $\frac{215}{128}$, and the assignment becomes $\alpha : x_1 \mapsto 1, x_2 \mapsto \frac{215}{128}$, which is another solution of $\ell_2$.
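As an illustration (not the authors' implementation), the one-variable cell-jump can be sketched as follows, assuming the isolating intervals of the univariate restriction are supplied by an external real-root-isolation routine:

```python
from fractions import Fraction

def _sample_points(intervals):
    # sample points per Definition 10, from isolating intervals with Fraction endpoints
    if not intervals:
        return []
    pts = {intervals[0][0], intervals[-1][1]}
    for (_, b), (a2, _) in zip(intervals, intervals[1:]):
        pts.update({b, (b + a2) / 2, a2})
    return sorted(pts)

def cjump_axis(a_i, restriction, rela, intervals):
    """Sketch of cjump(x_i, l): `restriction` evaluates the univariate polynomial
    p(a1,...,a_{i-1}, x_i, a_{i+1},...,a_n) at a rational point, `rela` is the
    operator of the false atom l ('<' or '>'), and `intervals` isolates the real
    roots of the restriction.  Returns the new value for x_i, i.e. the negative
    (resp. positive) sample point closest to a_i, or None if no cjump(x_i, l)
    operation exists."""
    sign = -1 if rela == '<' else 1
    candidates = [t for t in _sample_points(intervals) if sign * restriction(t) > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda t: abs(t - a_i))
```

Run on the data of Example 8, this reproduces the jump of $x_2$ to $\frac{215}{128}$ for $\ell_2$ and reports that no cell-jump exists for $\ell_1$.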

#### **4.3 Cell-Jump Along a Fixed Straight Line**

Given the current assignment $\alpha$ such that $\alpha(\bar{x}) = (a_1,\ldots,a_n) \in \mathbb{Q}^n$, a false atomic polynomial formula $\ell$ of the form $p(\bar{x}) > 0$ or $p(\bar{x}) < 0$, and a vector $\mathit{dir} = (d_1,\ldots,d_n) \in \mathbb{Q}^n$, we propose Algorithm 2 to find a cell-jump operation along the straight line $L$ specified by the point $\alpha(\bar{x})$ and the direction $\mathit{dir}$, denoted as $\mathrm{cjump}(\mathit{dir}, \ell)$.

In order to analyze the values of $p(\bar{x})$ on the line $L$, we introduce a new variable $t$ and replace every $x_i$ in $p(\bar{x})$ with $a_i + d_i t$ to get $p^*(t)$. If $\mathrm{rela}(\ell)$ is '<' and $p^*(t)$ has negative sample points, there exists a $\mathrm{cjump}(\mathit{dir}, \ell)$ operation. Let $t^*$ be a negative sample point of $p^*(t)$ closest to $0$. The assignment becomes $\alpha' : x_1 \mapsto a_1 + d_1 t^*, \ldots, x_n \mapsto a_n + d_n t^*$ after executing the operation $\mathrm{cjump}(\mathit{dir}, \ell)$. It is obvious that $\alpha'$ is a solution to $\ell$. If $\mathrm{rela}(\ell)$ is '>' and $p^*(t)$ has positive sample points, the situation is similar. Otherwise, $\ell$ has no cell-jump operation along the line $L$.

Similarly, we have:

**Theorem 3.** *Suppose the current assignment is* $\alpha : x_1 \mapsto a_1, \ldots, x_n \mapsto a_n$*, where* $a_i \in \mathbb{Q}$*. Let* $\ell$ *be a false atomic polynomial formula under* $\alpha$ *with relational operator '*<*' or '*>*',* $\mathit{dir} := (d_1,\ldots,d_n)$ *a vector in* $\mathbb{Q}^n$ *and* $L := \{(a_1 + d_1 t, \ldots, a_n + d_n t) \mid t \in \mathbb{R}\}$*. There exists a solution of* $\ell$ *in* $L$ *if and only if there exists a* $\mathrm{cjump}(\mathit{dir}, \ell)$ *operation.*

Theorem 3 implies that a one-step cell-jump from the point $\alpha(\bar{x})$ along any line that intersects the solution set of $\ell$ finds a solution to $\ell$.

*Example 9.* Assume the current assignment is $\alpha : x_1 \mapsto 1, x_2 \mapsto 1$. Consider the false atomic polynomial formula $\ell_1 : 2x_1^2 + 2x_2^2 - 1 < 0$ from Example 8, and let $p := \mathrm{poly}(\ell_1)$. By Fig. 5(b), the line $L_3$, specified by the point $\alpha(\bar{x})$ and the direction vector $\mathit{dir} = (1, 1)$, intersects the solution set of $\ell_1$, so there exists a $\mathrm{cjump}(\mathit{dir}, \ell_1)$ operation by Theorem 3. The line can be described in parametric form as $\{(x_1, x_2) \mid x_1 = 1 + t,\ x_2 = 1 + t,\ t \in \mathbb{R}\}$. Analyzing the values of $p(\bar{x})$ on the line is then equivalent to analyzing those of $p^*(t)$ on the real axis, where $p^*(t) = p(1+t, 1+t) = 4t^2 + 8t + 3$. A sequence of isolating intervals of $p^*$ is $(-\frac{215}{128}, -\frac{75}{64})$, $(-\frac{19}{32}, -\frac{61}{128})$, and there are two negative sample points: $-\frac{75}{64}$ and $-\frac{19}{32}$. Since $-\frac{19}{32}$ is the closest one to $0$, the operation $\mathrm{cjump}(\mathit{dir}, \ell_1)$ changes the assignment to $\alpha' : x_1 \mapsto \frac{13}{32}, x_2 \mapsto \frac{13}{32}$, which is a solution of $\ell_1$. Again by Fig. 5, there are other lines (the dashed ones) that pass through $\alpha(\bar{x})$ and intersect the solution set, so we can also find a solution to $\ell_1$ along those lines. In fact, for any false atomic polynomial formula $\ell$ with '<' or '>' that has solutions at all, there always exists some direction $\mathit{dir} \in \mathbb{Q}^n$ such that $\mathrm{cjump}(\mathit{dir}, \ell)$ finds one of them. Therefore, the more directions we try, the greater the probability of finding a solution of $\ell$.

#### **Algorithm 2. Cell-Jump Along a Fixed Straight Line**

**Input:** $\alpha = (a_1,\ldots,a_n)$, the current assignment $x_1 \mapsto a_1, \ldots, x_n \mapsto a_n$ where $a_i \in \mathbb{Q}$; $\ell$, a false atomic polynomial formula under $\alpha$ with relational operator '<' or '>'; $\mathit{dir} = (d_1,\ldots,d_n)$, a vector in $\mathbb{Q}^n$

**Output:** $\alpha'$, the assignment after executing a $\mathrm{cjump}(\mathit{dir}, \ell)$ operation, which is a solution to $\ell$; or FAIL, if there exists no $\mathrm{cjump}(\mathit{dir}, \ell)$ operation

**1** $p \leftarrow \mathrm{poly}(\ell)$
**2** $p^* \leftarrow$ replace every $x_i$ in $p$ with $a_i + d_i t$, where $t$ is a new variable
**3** **if** $\mathrm{rela}(\ell)$ is '<' *and* $p^*$ *has negative sample points* **then**
**4** &nbsp;&nbsp;&nbsp; $t^* \leftarrow$ a negative sample point of $p^*$ closest to $0$
**5** &nbsp;&nbsp;&nbsp; $\alpha' \leftarrow (a_1 + d_1 t^*, \ldots, a_n + d_n t^*)$
**6** &nbsp;&nbsp;&nbsp; **return** $\alpha'$
**7** **if** $\mathrm{rela}(\ell)$ is '>' *and* $p^*$ *has positive sample points* **then**
**8** &nbsp;&nbsp;&nbsp; $t^* \leftarrow$ a positive sample point of $p^*$ closest to $0$
**9** &nbsp;&nbsp;&nbsp; $\alpha' \leftarrow (a_1 + d_1 t^*, \ldots, a_n + d_n t^*)$
**10** &nbsp;&nbsp; **return** $\alpha'$
**11** **return** *FAIL*

(a) Neither *L*<sup>1</sup> nor *L*<sup>2</sup> intersects the solution set.

(b) Line *L*<sup>3</sup> and the dashed lines intersect the solution set.

**Fig. 5.** The cell-jump operations along the lines $L_1$, $L_2$ and $L_3$ for the false atomic polynomial formula $\ell_1 : 2x_1^2 + 2x_2^2 - 1 < 0$ under the assignment $\alpha : x_1 \mapsto 1, x_2 \mapsto 1$. The dashed circle denotes the circle $2x_1^2 + 2x_2^2 - 1 = 0$ and the shaded part inside it represents the solution set of the atom. The coordinates of point $A$ are $(1, 1)$. Lines $L_1$, $L_2$ and $L_3$ pass through $A$ and are parallel to the $x_1$-axis, the $x_2$-axis and the vector $(1, 1)$, respectively.

*Remark 2.* For a false atomic polynomial formula $\ell$ with '<' or '>', both $\mathrm{cjump}(x_i, \ell)$ and $\mathrm{cjump}(\mathit{dir}, \ell)$ move an assignment to a new assignment, and both assignments map to elements of $\mathbb{Q}^n$. In fact, $\mathrm{cjump}(x_i, \ell)$ can be viewed as the special case of $\mathrm{cjump}(\mathit{dir}, \ell)$ where the $i$-th component of $\mathit{dir}$ is $1$ and all other components are $0$. The main difference is that $\mathrm{cjump}(x_i, \ell)$ changes the value of only one variable, while $\mathrm{cjump}(\mathit{dir}, \ell)$ may change the values of many variables. The advantage of $\mathrm{cjump}(x_i, \ell)$ is that it avoids the situation where some atoms can never become true because the values of many variables are adjusted together. However, performing $\mathrm{cjump}(\mathit{dir}, \ell)$ is more efficient in some cases, since it may happen that a solution to $\ell$ can be found through a one-step $\mathrm{cjump}(\mathit{dir}, \ell)$, but only through many steps of $\mathrm{cjump}(x_i, \ell)$.
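Algorithm 2 translates almost line for line into the following sketch; `p_star` (the substituted polynomial, as a callable) and `intervals` (its isolating intervals) are assumed to be supplied by an external real-root-isolation routine:

```python
from fractions import Fraction

def cjump_line(point, direction, p_star, rela, intervals):
    """Sketch of Algorithm 2: move from `point` along `direction` to a sample
    point t* of p*(t) closest to 0 that makes the false atom true ('<' wants
    p* negative, '>' wants p* positive).  Returns the new point, or None (FAIL).
    Lines 1-2 of Algorithm 2 (building p*) are assumed done by the caller."""
    if intervals:
        # sample points of p* per Definition 10
        pts = {intervals[0][0], intervals[-1][1]}
        for (_, b), (a2, _) in zip(intervals, intervals[1:]):
            pts.update({b, (b + a2) / 2, a2})
    else:
        pts = set()
    sign = -1 if rela == '<' else 1
    candidates = [t for t in pts if sign * p_star(t) > 0]
    if not candidates:
        return None                            # line 11: FAIL
    t_star = min(candidates, key=abs)          # sample point closest to 0
    return [a + d * t_star for a, d in zip(point, direction)]
```

On the data of Example 9 ($p^*(t) = 4t^2 + 8t + 3$, start $(1,1)$, direction $(1,1)$) this yields the point $(\frac{13}{32}, \frac{13}{32})$, as in the text.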

## **5 Scoring Functions**

Scoring functions guide local search algorithms in picking an operation at each step. In this section, we introduce a scoring function which measures the difference between the distances to satisfaction under the assignments before and after performing an operation.

First, we define the distance to truth of an atomic polynomial formula.

**Definition 12 (Distance to Truth).** *Given the current assignment* $\alpha$ *such that* $\alpha(\bar{x}) = (a_1,\ldots,a_n) \in \mathbb{Q}^n$ *and a positive parameter* $pp \in \mathbb{Q}_{>0}$*, for an atomic polynomial formula* $\ell$ *with* $p := \mathrm{poly}(\ell)$*, its* distance to truth *is*

$$\mathrm{dtt}(\ell,\alpha,pp) := \begin{cases} 0, & \text{if } \alpha \text{ is a solution to } \ell, \\ |p(a_1,\ldots,a_n)| + pp, & \text{otherwise.} \end{cases}$$

For an atomic polynomial formula $\ell$, the parameter $pp$ is introduced to guarantee that the distance to truth of $\ell$ is $0$ if and only if the current assignment $\alpha$ is a solution of $\ell$. Based on the definition of $\mathrm{dtt}$, we use the method of [8, Definitions 3 and 4] to define the distance to satisfaction of a polynomial clause and the score of an operation, respectively.

**Definition 13 (Distance to Satisfaction).** *Given the current assignment* $\alpha$ *and a parameter* $pp \in \mathbb{Q}_{>0}$*, the* distance to satisfaction *of a polynomial clause* $c$ *is* $\mathrm{dts}(c, \alpha, pp) := \min_{\ell \in c}\{\mathrm{dtt}(\ell, \alpha, pp)\}$*.*

**Definition 14 (Score).** *Given a polynomial formula* F*, the current assignment* <sup>α</sup> *and a parameter* pp <sup>∈</sup> <sup>Q</sup><sup>&</sup>gt;0*, the* score *of an operation* op *is defined as*

$$\mathrm{score}(op,\alpha,pp) := \sum\_{c \in F} (\mathrm{dts}(c,\alpha,pp) - \mathrm{dts}(c,\alpha',pp)) \cdot w(c),$$

*where* $w(c)$ *denotes the weight of clause* $c$*, and* $\alpha'$ *is the assignment after performing* $op$*.*

Note that the definition of the score depends on the weights of clauses. In our algorithm, we employ the probabilistic version of the PAWS scheme [9,32] to update clause weights. The initial weight of every clause is $1$. Given a probability $sp$, the clause weights are updated as follows: with probability $1 - sp$, the weight of every falsified clause is increased by one; otherwise (with probability $sp$), for every satisfied clause with weight greater than $1$, the weight is decreased by one.
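Definitions 12 to 14 and the PAWS update can be sketched together. The representation of atoms as (polynomial-callable, operator) pairs and the injectable `rng` are assumptions of this sketch:

```python
import random
from fractions import Fraction

def dtt(atom, point, pp):
    """Distance to truth (Definition 12); atom = (p, rela), rela in {'<', '>'}."""
    p, rela = atom
    v = p(point)
    satisfied = v < 0 if rela == '<' else v > 0
    return Fraction(0) if satisfied else abs(v) + pp

def dts(clause, point, pp):
    """Distance to satisfaction of a clause (Definition 13)."""
    return min(dtt(atom, point, pp) for atom in clause)

def score(formula, weights, before, after, pp):
    """Score of an operation moving `before` to `after` (Definition 14)."""
    return sum((dts(c, before, pp) - dts(c, after, pp)) * w
               for c, w in zip(formula, weights))

def paws_update(formula, weights, point, pp, sp, rng=random):
    """Probabilistic PAWS: with probability 1 - sp increase the weight of every
    falsified clause by one; otherwise decrease every satisfied clause's
    weight greater than 1 by one."""
    if rng.random() >= sp:
        return [w + 1 if dts(c, point, pp) > 0 else w
                for c, w in zip(formula, weights)]
    return [w - 1 if dts(c, point, pp) == 0 and w > 1 else w
            for c, w in zip(formula, weights)]
```

For the atom $\ell_1 : 2x_1^2 + 2x_2^2 - 1 < 0$ of Example 8 with $pp = 1$, the distance to truth at $(1,1)$ is $|3| + 1 = 4$ and at $(0,0)$ it is $0$.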

## **6 The Main Algorithm**

Based on the proposed cell-jump operation (Sect. 4) and scoring function (Sect. 5), we develop in this section a local search algorithm, called LS Algorithm, for solving the satisfiability of polynomial formulas. The algorithm is a refined extension of the general local search framework described in Sect. 2.2, with a two-level operation selection. This section also explains the restart mechanism and an optimization strategy used in the algorithm.

Given a polynomial formula $F$ such that every relational operator appearing in it is '<' or '>' and an initial assignment that maps to an element in $\mathbb{Q}^n$, LS Algorithm (Algorithm 3) searches for a solution of $F$ starting from the initial assignment, in the following four steps:


Section 4.2]. The heuristic distinguishes a special subset $S \subseteq D$ from the rest of $D$, where $S = \{\mathrm{cjump}(x_i, \ell) \in D \mid \ell \text{ appears in a falsified clause}\}$, and searches for an operation with the highest score from $S$. If it fails to find any operation from $S$ (*i.e.*, $S = \emptyset$), then it searches for one with the highest score from $D \setminus S$. Perform the found operation and update the assignment. Go to Step (*i*).


We propose a two-level operation selection in LS Algorithm, which prefers operations that change the values of fewer variables. Concretely, only when there exists no decreasing $\mathrm{cjump}(x_i, \ell)$ operation, which changes the value of a single variable, do we update clause weights and pick a $\mathrm{cjump}(dir, \ell)$ operation, which may change the values of more variables. The strategy proves effective in experiments: we observe that changing too many variables together at the beginning may prevent some atoms from ever becoming true.

It remains to explain the restart mechanism and an optimization strategy.

**Restart Mechanism.** Given any initial assignment, LS Algorithm takes it as the starting point of the local search. If the algorithm returns "unknown", we restart LS Algorithm with another initial assignment. A general local search framework, like Algorithm 1, searches for a solution from only one starting point. However, the restart mechanism allows us to search from more starting points. The approach of combining the restart mechanism and a local search procedure also aids global search, which finds a solution over the entire search space.

We set the initial assignments for restarts as follows: The first time, all variables are assigned 1. The second time, for a variable $x_i$, if there exists a clause $x_i < ub \lor x_i = ub$ or $x_i > lb \lor x_i = lb$, then $x_i$ is assigned $ub$ or $lb$, respectively; otherwise, $x_i$ is assigned 1. For the $i$-th time ($3 \le i \le 7$), every variable is assigned 1 or $-1$ randomly. For the $i$-th time ($i \ge 8$), every variable is assigned a random integer between $-50(i-6)$ and $50(i-6)$.
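The restart schedule above admits a direct sketch; the `bounds` dictionary encoding clauses like $x_i < ub \lor x_i = ub$ is a hypothetical representation introduced here for illustration.

```python
import random

def initial_assignment(restart, n, bounds=None):
    """Initial values for restart number `restart` (1-based), following the
    scheme in the text. `bounds` maps a variable index to ('ub', b) or
    ('lb', b) when the formula contains a clause x_i < b ∨ x_i = b
    (respectively x_i > b ∨ x_i = b)."""
    bounds = bounds or {}
    if restart == 1:
        return [1] * n                                   # all variables set to 1
    if restart == 2:
        return [bounds[i][1] if i in bounds else 1 for i in range(n)]
    if 3 <= restart <= 7:
        return [random.choice((1, -1)) for _ in range(n)]  # random signs
    bound = 50 * (restart - 6)                           # widening integer range
    return [random.randint(-bound, bound) for _ in range(n)]

a1 = initial_assignment(1, 3)
a2 = initial_assignment(2, 2, {0: ("ub", 9)})
a5 = initial_assignment(5, 4)
a8 = initial_assignment(8, 4)
```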

**Forbidding Strategies.** An inherent problem of the local search method is cycling, *i.e.*, revisiting assignments. Cycling wastes time and prevents the search from escaping local minima. So, we employ a popular forbidding strategy, called the tabu strategy [18], to deal with it. The tabu strategy forbids reversing recent changes and can be directly applied in LS Algorithm. Notice that every cell-jump operation increases or decreases the values of some variables. After executing an operation that increases (decreases) the value of a variable, the tabu strategy forbids decreasing (increasing) the value of that variable in the subsequent $tt$ iterations, where $tt \in \mathbb{Z}_{\geq 0}$ is a given parameter.
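A minimal sketch of the tabu bookkeeping, assuming an iteration counter and signed move directions; the exact boundary convention for "the subsequent tt iterations" is an assumption here.

```python
class Tabu:
    """After a move that changes variable `var` in direction +1 (increase) or
    -1 (decrease), the reverse direction on `var` is forbidden for tt iterations."""

    def __init__(self, tt):
        self.tt = tt
        self.forbidden = {}   # (var, direction) -> first iteration when allowed again

    def record(self, var, direction, step):
        # forbid the reverse move on `var` during the next tt iterations
        self.forbidden[(var, -direction)] = step + self.tt

    def allowed(self, var, direction, step):
        return step >= self.forbidden.get((var, direction), 0)

tabu = Tabu(tt=10)
tabu.record(var=0, direction=+1, step=3)   # we increased x_0 at iteration 3
```

With this bookkeeping, decreasing $x_0$ is rejected until iteration 13, while further increases remain allowed.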

#### **Algorithm 3. LS Algorithm**

```
Input : F, a polynomial formula such that the relational operator of every atom is '<' or '>'
        init_α, an initial assignment that maps to an element in Q^n
Output: a solution (in Q^n) to F or unknown
 1 α ← init_α
 2 while the terminal condition is not reached do
 3   if α satisfies F then return α
 4   fal_cl ← the set of atoms in falsified clauses
 5   sat_cl ← the set of false atoms in satisfied clauses
 6   if ∃ a decreasing cjump(x_i, ℓ) operation where ℓ ∈ fal_cl then
 7     op ← such an operation with the highest score
 8     α ← α with op performed
 9   else if ∃ a decreasing cjump(x_i, ℓ) operation where ℓ ∈ sat_cl then
10     op ← such an operation with the highest score
11     α ← α with op performed
12   else
13     update clause weights according to the PAWS scheme
14     generate a direction vector set dset
15     if ∃ a decreasing cjump(dir, ℓ) operation where dir ∈ dset and ℓ ∈ fal_cl then
16       op ← such an operation with the highest score
17       α ← α with op performed
18     else if ∃ a decreasing cjump(dir, ℓ) operation where dir ∈ dset and ℓ ∈ sat_cl then
19       op ← such an operation with the highest score
20       α ← α with op performed
21     else
22       return unknown
23 return unknown
```
*Remark 3.* If the input formula has equality constraints, then we need to define a cell-jump operation for a false atom of the form $p(\bar{x}) = 0$. Given the current assignment $\alpha : x_1 \mapsto a_1, \ldots, x_n \mapsto a_n$ ($a_i \in \mathbb{Q}$), the operation should assign some variable $x_i$ to a real root of $p(a_1, \ldots, a_{i-1}, x_i, a_{i+1}, \ldots, a_n)$, which may not be a rational number. Since it is time-consuming to isolate real roots of a polynomial with algebraic coefficients, we must guarantee that all assignments are rational during the search. Thus, we require that for every equality constraint $p(\bar{x}) = 0$ in the formula, there exists at least one variable such that the degree of $p$ w.r.t. that variable is 1. Then, LS Algorithm also works for such a polynomial formula after some minor modifications: In Line 6 (or Line 9), for every atom $\ell \in$ fal_cl (or $\ell \in$ sat_cl) and for every variable $x_i$, if $\ell$ has the form $p(\bar{x}) = 0$, $p$ is linear w.r.t. $x_i$ and $p(a_1, \ldots, a_{i-1}, x_i, a_{i+1}, \ldots, a_n)$ is not a constant polynomial, there is a candidate operation that changes the value of $x_i$ to the (rational) solution of $p(a_1, \ldots, a_{i-1}, x_i, a_{i+1}, \ldots, a_n) = 0$; if $\ell$ has the form $p(\bar{x}) > 0$ or $p(\bar{x}) < 0$, a candidate operation is $\mathrm{cjump}(x_i, \ell)$. We perform a decreasing candidate operation with the highest score if one exists, and update $\alpha$ in Line 8 (or Line 11). In Line 15 (or Line 18), we only deal with inequality constraints from fal_cl (or sat_cl), and skip equality constraints.

## **7 Experiments**

We carried out experiments to evaluate LS Algorithm on two classes of instances, where one class consists of selected instances from SMT-LIB while the other is generated randomly, and compared our tool with state-of-the-art SMT(NRA) solvers. Furthermore, we combine our tool with Z3, CVC5, Yices2 and MathSAT5 respectively to obtain four sequential portfolio solvers, which show better performance.

#### **7.1 Experiment Preparation**

**Implementation:** We implemented LS Algorithm in Maple 2022 as a tool, which is also named LS. There are 3 parameters in LS Algorithm: $pp$ for computing the score of an operation, $tt$ for the tabu strategy and $sp$ for the PAWS scheme, which are set as $pp = 1$, $tt = 10$ and $sp = 0.003$. The direction vectors in LS Algorithm are generated in the following way: Suppose the current assignment is $x_1 \mapsto a_1, \ldots, x_n \mapsto a_n$ ($a_i \in \mathbb{Q}$) and the polynomial appearing in the atom to deal with is $p$. We generate 12 vectors. The first one is the gradient vector $(\frac{\partial p}{\partial x_1}, \ldots, \frac{\partial p}{\partial x_n})|_{(a_1,\ldots,a_n)}$. The second one is the vector $(a_1, \ldots, a_n)$. The rest are random vectors where every component is a random integer between $-1000$ and $1000$.
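The direction-vector recipe can be sketched as below, assuming the gradient of $p$ at the current point is supplied by the caller (the paper computes it symbolically in Maple).

```python
import random

def direction_vectors(grad, point, k=12, low=-1000, high=1000):
    """The k candidate directions described in the text: the gradient of p at
    the current point, the current assignment itself, and k - 2 random integer
    vectors with components in [low, high]."""
    n = len(point)
    dirs = [list(grad), list(point)]
    while len(dirs) < k:
        dirs.append([random.randint(low, high) for _ in range(n)])
    return dirs

# Example: p = x^2 + y^2, whose gradient (2x, 2y) evaluated at (3, 4) is (6, 8).
dirs = direction_vectors(grad=(6, 8), point=(3, 4))
```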

**Experiment Setup:** All experiments were conducted on a 16-core Intel Core i9-12900KF with 128 GB of memory running Arch Linux. We compare our tool with 4 state-of-the-art SMT(NRA) solvers, namely Z3 (4.11.2), CVC5 (1.0.3), Yices2 (2.6.4) and MathSAT5 (5.6.5). Each solver is executed with a cutoff time of 1200 seconds (as in the SMT Competition) for each instance. We also combine LS with every competitor solver as a sequential portfolio solver, referred to as "LS+OtherSolver": we first run LS with a time limit of 10 seconds, and if LS fails to solve the instance within that time, we run OtherSolver from scratch, allotting it the remaining 1190 seconds.

#### **7.2 Instances**

We prepare two classes of instances. One class consists of 2736 unknown and satisfiable instances from SMT-LIB(NRA)<sup>4</sup>, where in every equality polynomial constraint, the degree of the polynomial w.r.t. each variable is less than or equal to 1.

The rest are random instances. Before introducing the generation approach of random instances, we first define some notation. Let **rn**(down, up) denote a

<sup>4</sup> https://clc-gitlab.cs.uiowa.edu:2443/SMT-LIB-benchmarks/QF_NRA.

random integer between two integers $down$ and $up$, and $\mathbf{rp}(\{x_1, \ldots, x_n\}, d, m)$ denote a random polynomial $\sum_{i=1}^{m} c_i M_i + c_0$, where $c_i = \mathbf{rn}(-1000, 1000)$ for $0 \le i \le m$, $M_1$ is a random monomial in $\{x_1^{a_1} \cdots x_n^{a_n} \mid a_i \in \mathbb{Z}_{\geq 0},\ a_1 + \cdots + a_n = d\}$ and $M_i$ ($2 \le i \le m$) is a random monomial in $\{x_1^{a_1} \cdots x_n^{a_n} \mid a_i \in \mathbb{Z}_{\geq 0},\ a_1 + \cdots + a_n \le d\}$.

A randomly generated polynomial formula $\mathbf{rf}(\{v_{n1}, v_{n2}\}, \{p_{n1}, p_{n2}\}, \{d_-, d_+\}, \{n_-, n_+\}, \{m_-, m_+\}, \{cl_{n1}, cl_{n2}\}, \{cl_{l1}, cl_{l2}\})$, where all parameters are in $\mathbb{Z}_{\geq 0}$, is constructed as follows: First, let $n := \mathbf{rn}(v_{n1}, v_{n2})$ and generate $n$ variables $x_1, \ldots, x_n$. Second, let $num := \mathbf{rn}(p_{n1}, p_{n2})$ and generate $num$ polynomials $p_1, \ldots, p_{num}$. Every $p_i$ is a random polynomial $\mathbf{rp}(\{x_{i_1}, \ldots, x_{i_{n_i}}\}, d, m)$, where $n_i = \mathbf{rn}(n_-, n_+)$, $d = \mathbf{rn}(d_-, d_+)$, $m = \mathbf{rn}(m_-, m_+)$, and $\{x_{i_1}, \ldots, x_{i_{n_i}}\}$ are $n_i$ variables randomly selected from $\{x_1, \ldots, x_n\}$. Finally, let $cl_n := \mathbf{rn}(cl_{n1}, cl_{n2})$ and generate $cl_n$ clauses such that the number of atoms in a generated clause is $\mathbf{rn}(cl_{l1}, cl_{l2})$. These atoms are randomly picked from $\{p_i < 0,\ p_i > 0,\ p_i = 0 \mid 1 \le i \le num\}$. If some picked atom has the form $p_i = 0$ and there exists a variable such that the degree of $p_i$ w.r.t. the variable is greater than 1, we replace the atom with $p_i < 0$ or $p_i > 0$ with equal probability. We generate in total 500 random polynomial formulas according to $\mathbf{rf}(\{30, 40\}, \{60, 80\}, \{20, 30\}, \{10, 20\}, \{20, 30\}, \{40, 60\}, \{3, 5\})$.
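The rn/rp generators might be sketched as follows, with a polynomial represented as a list of (coefficient, exponent-vector) terms; this representation and the helper `random_monomial` are illustrative assumptions.

```python
import random

def rn(down, up):
    """rn(down, up): a random integer between down and up (inclusive)."""
    return random.randint(down, up)

def random_monomial(n_vars, degree, exact=True):
    """An exponent vector (a_1,...,a_n) with a_1+...+a_n = degree
    (or <= degree when exact is False)."""
    total = degree if exact else rn(0, degree)
    exps = [0] * n_vars
    for _ in range(total):
        exps[rn(0, n_vars - 1)] += 1
    return tuple(exps)

def rp(n_vars, d, m):
    """rp({x_1,...,x_n}, d, m): m random terms plus a constant term c_0.
    The first monomial has total degree exactly d, the rest at most d;
    all coefficients are rn(-1000, 1000)."""
    terms = [(rn(-1000, 1000), random_monomial(n_vars, d, exact=True))]
    for _ in range(m - 1):
        terms.append((rn(-1000, 1000), random_monomial(n_vars, d, exact=False)))
    terms.append((rn(-1000, 1000), (0,) * n_vars))   # constant term c_0
    return terms

poly = rp(n_vars=3, d=5, m=4)
```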

The two classes of instances have different characteristics. The instances selected from SMT-LIB(NRA) usually contain lots of linear constraints, and their complexity is reflected in the propositional abstraction. For a random instance, all the polynomials in it are nonlinear and of high degrees, while its propositional abstraction is relatively simple.

#### **7.3 Experimental Results**

The experimental results are presented in Table 1. The column "#inst" records the number of instances. Consider first the columns "Z3" through "LS". On instances from SMT-LIB(NRA), LS performs worse than all competitors except MathSAT5, but it is still comparable. It is crucial to note that our approach is much faster than both CVC5 and Z3 on 90% of the Meti-Tarski benchmarks of SMT-LIB (2194 instances in total). On random instances, only LS solved all of them, while Z3, the best-performing competitor, solved 29% of them. The results show that LS is not good at solving polynomial formulas with complex propositional abstraction and many linear constraints, but it is well suited to handling those with high-degree polynomials. A possible explanation is that, as a local search solver, LS cannot exploit the propositional abstraction well to find a solution. However, for a formula with plenty of high-degree polynomials, cell-jump may 'jump' to a solution faster.

**Table 1.** Results on SMT-LIB(NRA) and random instances.

The data revealed in the last column "LS+CVC5" of Table 1 indicates that the combination of LS and CVC5 solves the majority of the instances across both classes, suggesting complementary strengths of LS and top-tier SMT(NRA) solvers. As shown in Table 2, when evaluating combinations of different solvers with LS, it becomes evident that our method significantly enhances the capabilities of existing solvers in the portfolio configurations. The most striking improvement can be witnessed in the "LS+MathSAT5" combination, which demonstrates the most significant enhancement among all the combination solvers.


**Table 2.** Performance Comparison of Different Solver Combinations with LS.

Besides, Fig. 6 shows the performance of LS and the competitors on all instances. The horizontal axis represents time, while the vertical axis represents the number of solved instances within the corresponding time. Figure 7 presents the run time comparisons between LS+CVC5 and CVC5. Every point in the figure represents an instance. The horizontal coordinate of the point is the computing time of LS+CVC5 while the vertical coordinate is the computing time of CVC5 (for every instance out of time, we record its computing time as 1200 seconds). The figure shows that LS+CVC5 improves the performance of CVC5. We also present the run time comparisons between LS and each competitor in Figs. 8–11.

**Fig. 6.** Number of solved instances within given time (sec: seconds).

**Fig. 7.** Comparing LS+CVC5 with CVC5.

**Fig. 8.** Comparing LS with Z3. **Fig. 9.** Comparing LS with CVC5.

**Fig. 10.** Comparing LS with MathSAT5. **Fig. 11.** Comparing LS with Yices2.

## **8 Conclusion**

For a given SMT(NRA) formula, although the domain of its variables is infinite, the satisfiability of the formula can be decided through tests on a finite number of samples in the domain. A complete search over such samples is inefficient. In this paper, we propose a local search algorithm for a special class of SMT(NRA) formulas, where every equality polynomial constraint is linear with respect to at least one variable. The novelty of our algorithm lies in the cell-jump operation and a two-level operation selection, which together guide the algorithm to jump from one sample to another heuristically. The algorithm has been applied to two classes of benchmarks and the experimental results show that it is competitive with state-of-the-art SMT solvers and is good at solving formulas with high-degree polynomial constraints. Tests on the solvers developed by combining this local search algorithm with Z3, CVC5, Yices2 or MathSAT5 indicate that the algorithm is complementary to these state-of-the-art SMT(NRA) solvers. In future work, we will improve our algorithm so that it can handle all polynomial formulas.

**Acknowledgement.** This work is supported by National Key R&D Program of China (No. 2022YFA1005102) and the NSFC under grant No. 61732001. The authors are grateful to the reviewers for their valuable comments and constructive suggestions.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Partial Quantifier Elimination and Property Generation**

Eugene Goldberg

Land O Lakes, USA eu.goldberg@gmail.com

**Abstract.** We study partial quantifier elimination (PQE) for propositional CNF formulas with existential quantifiers. PQE is a generalization of quantifier elimination where one can limit the set of clauses taken out of the scope of quantifiers to a small subset of clauses. The appeal of PQE is that many verification problems (e.g., equivalence checking and model checking) can be solved in terms of PQE and the latter can be dramatically simpler than full quantifier elimination. We show that PQE can be used for property generation that one can view as a generalization of testing. The objective here is to produce an *unwanted* property of a design implementation, thus exposing a bug. We introduce two PQE solvers called *EG*-*PQE* and *EG*-*PQE* <sup>+</sup>. *EG*-*PQE* is a very simple SAT-based algorithm. *EG*-*PQE* <sup>+</sup> is more sophisticated and robust than *EG*-*PQE*. We use these PQE solvers to find an unwanted property (namely, an unwanted invariant) of a buggy FIFO buffer. We also apply them to invariant generation for sequential circuits from a HWMCC benchmark set. Finally, we use these solvers to generate properties of a combinational circuit that mimic symbolic simulation.

### **1 Introduction**

In this paper, we consider the following problem. Let F(X, Y ) be a propositional formula in conjunctive normal form (CNF)<sup>1</sup> where X, Y are sets of variables. Let G be a subset of clauses of F. Given a formula ∃X[F], find a quantifier-free formula H(Y ) such that ∃X[F] ≡ H ∧ ∃X[F \ G]. In contrast to *full* quantifier elimination (QE), only the clauses of G are taken out of the scope of quantifiers here. So, we call this problem *partial* QE (PQE) [1]. (In this paper, we consider PQE only for formulas with *existential* quantifiers.) We will refer to H as a *solution* to PQE. Like SAT, PQE is a way to cope with the complexity of QE. But in contrast to SAT that is a *special* case of QE (where all variables are quantified), PQE *generalizes* QE. The latter is just a special case of PQE where G = F and the entire formula is unquantified. Interpolation [2,3] can be viewed as a special case of PQE as well [4,5].

<sup>1</sup> Every formula is a propositional CNF formula unless otherwise stated. Given a CNF formula $F$ represented as the conjunction of clauses $C_1 \wedge \cdots \wedge C_k$, we will also consider $F$ as the *set* of clauses $\{C_1, \ldots, C_k\}$.

© The Author(s) 2023

C. Enea and A. Lal (Eds.): CAV 2023, LNCS 13965, pp. 110–131, 2023. https://doi.org/10.1007/978-3-031-37703-7_6

The appeal of PQE is threefold. First, it can be much more efficient than QE if G is a *small* subset of F. Second, many verification problems like SAT, equivalence checking, model checking can be solved in terms of PQE [1,6–8]. So, PQE can be used to design new efficient methods for solving known problems. Third, one can apply PQE to solving *new* problems like property generation considered in this paper. In practice, to perform PQE, it suffices to have an algorithm that takes a single clause out of the scope of quantifiers. Namely, given a formula ∃X[F(X, Y )] and a clause C ∈ F, this algorithm finds a formula H(Y ) such that ∃X[F] ≡ H ∧ ∃X[F \ {C}]. To take out k clauses, one can apply this algorithm k times. Since H ∧ ∃X[F] ≡ H ∧ ∃X[F \ {C}], solving the PQE above reduces to finding H(Y ) that makes C *redundant* in H ∧ ∃X[F]. So, the PQE algorithms we present here employ *redundancy based reasoning*. We describe two PQE algorithms called *EG*-*PQE* and *EG*-*PQE* <sup>+</sup> where "*EG*" stands for "Enumerate and Generalize". *EG*-*PQE* is a very simple SAT-based algorithm that can sometimes solve very large problems. *EG*-*PQE* <sup>+</sup> is a modification of *EG*-*PQE* that makes the algorithm more powerful and robust.

In [7], we showed the viability of an equivalence checker based on PQE. In particular, we presented instances for which this equivalence checker outperformed ABC [9], a high quality tool. In this paper, we describe and check experimentally one more important application of PQE called property generation. Our motivation here is as follows. Suppose a design implementation *Imp* meets the set of specification properties $P_1, \ldots, P_m$. Typically, this set is incomplete. So, *Imp* can still be buggy even if every $P_i$, $i = 1, \ldots, m$, holds. Let $P^*_{m+1}, \ldots, P^*_n$ be *desired* properties adding which makes the specification complete. If *Imp* meets the properties $P_1, \ldots, P_m$ but is still buggy, a missed property $P^*_i$ above fails. That is, *Imp* has the *unwanted* property $\neg P^*_i$. So, one can detect bugs by generating unspecified properties of *Imp* and checking if there is an unwanted one.

Currently, identification of unwanted properties is mostly done by massive testing. (As we show later, the input/output behavior specified by a single test can be cast as a simple property of *Imp*.) Another technique employed in practice is *guessing* unwanted properties that may hold and formally checking if this is the case. The problem with these techniques is that they can miss an unwanted property. In this paper, we describe property generation by PQE. The benefit of PQE is that it can produce much more complex properties than those corresponding to single tests. So, using PQE one can detect bugs that testing overlooks or cannot find in principle. Importantly, PQE generates properties covering different parts of *Imp*. This makes the search for unwanted properties more systematic and facilitates discovering bugs that can be missed if one simply guesses unwanted properties that may hold.

In this paper, we experimentally study generation of invariants of a sequential circuit N. An invariant of N is unwanted if a state that is supposed to be reachable in N falsifies this invariant and hence is unreachable. Note that finding a formal proof that N has no unwanted invariants is impractical. (It is hard to efficiently prove a large set of states reachable because different states are reached by different execution traces.) So developing practical methods for finding unwanted invariants is very important. We also study generation of properties mimicking symbolic simulation for a combinational circuit obtained by unrolling a sequential circuit. An unwanted property here exposes a wrong execution trace.

This paper is structured as follows. (Some additional information can be found in the supporting technical report [5].) In Sect. 2, we give basic definitions. Section 3 presents property generation for a combinational circuit. In Sect. 4, we describe invariant generation for a sequential circuit. Sections 5 and 6 present *EG*-*PQE* and *EG*-*PQE* <sup>+</sup> respectively. In Sect. 7, invariant generation is used to find a bug in a FIFO buffer. Experiments with invariant generation for HWMCC benchmarks are described in Sect. 8. Section 9 presents an experiment with property generation for combinational circuits. In Sect. 10 we give some background. Finally, in Sect. 11, we make conclusions and discuss directions for future research.

## **2 Basic Definitions**

In this section, when we say "formula" without mentioning quantifiers, we mean "a quantifier-free formula".

**Definition 1.** *We assume that formulas have only Boolean variables. A literal of a variable* $v$ *is either* $v$ *or its negation. A clause is a disjunction of literals. A formula* $F$ *is in conjunctive normal form (CNF) if* $F = C_1 \wedge \cdots \wedge C_k$ *where* $C_1, \ldots, C_k$ *are clauses. We will also view* $F$ *as the set of clauses* $\{C_1, \ldots, C_k\}$*. We assume that every formula is in CNF.*

**Definition 2.** *Let* $F$ *be a formula. Then* $Vars(F)$ *denotes the set of variables of* $F$ *and* $Vars(\exists X[F])$ *denotes* $Vars(F) \setminus X$*.*

**Definition 3.** *Let* $V$ *be a set of variables. An assignment* $\vec{q}$ *to* $V$ *is a mapping* $V' \to \{0, 1\}$ *where* $V' \subseteq V$*. We will denote the set of variables assigned in* $\vec{q}$ *as* $Vars(\vec{q})$*. We will refer to* $\vec{q}$ *as a full assignment to* $V$ *if* $Vars(\vec{q}) = V$*. We will denote as* $\vec{q} \subseteq \vec{r}$ *the fact that a)* $Vars(\vec{q}) \subseteq Vars(\vec{r})$ *and b) every variable of* $Vars(\vec{q})$ *has the same value in* $\vec{q}$ *and* $\vec{r}$*.*

**Definition 4.** *A literal, a clause and a formula are said to be satisfied (respectively falsified) by an assignment* $\vec{q}$ *if they evaluate to 1 (respectively 0) under* $\vec{q}$*.*

**Definition 5.** *Let* $C$ *be a clause. Let* $H$ *be a formula that may have quantifiers, and* $\vec{q}$ *be an assignment to* $Vars(H)$*. If* $C$ *is satisfied by* $\vec{q}$*, then* $C_{\vec{q}} \equiv 1$*. Otherwise,* $C_{\vec{q}}$ *is the clause obtained from* $C$ *by removing all literals falsified by* $\vec{q}$*. Denote by* $H_{\vec{q}}$ *the formula obtained from* $H$ *by removing the clauses satisfied by* $\vec{q}$ *and replacing every clause* $C$ *unsatisfied by* $\vec{q}$ *with* $C_{\vec{q}}$*.*

**Definition 6.** *Given a formula* $\exists X[F(X, Y)]$*, a clause* $C$ *of* $F$ *is called a quantified clause if* $Vars(C) \cap X \neq \emptyset$*. If* $Vars(C) \cap X = \emptyset$*, the clause* $C$ *depends only on free, i.e., unquantified variables of* $F$ *and is called a free clause.*

**Definition 7.** *Let* $G$, $H$ *be formulas that may have existential quantifiers. We say that* $G$, $H$ *are equivalent, written* $G \equiv H$*, if* $G_{\vec{q}} = H_{\vec{q}}$ *for all full assignments* $\vec{q}$ *to* $Vars(G) \cup Vars(H)$*.*

**Definition 8.** *Let* $F(X, Y)$ *be a formula,* $G \subseteq F$ *and* $G \neq \emptyset$*. The clauses of* $G$ *are said to be* redundant in $\exists X[F]$ *if* $\exists X[F] \equiv \exists X[F \setminus G]$*. Note that if* $F \setminus G$ *implies* $G$*, the clauses of* $G$ *are redundant in* $\exists X[F]$*.*

**Definition 9.** *Given a formula* $\exists X[F(X, Y)]$ *and* $G$ *where* $G \subseteq F$*, the* Partial Quantifier Elimination (PQE) *problem is to find* $H(Y)$ *such that* $\exists X[F] \equiv H \wedge \exists X[F \setminus G]$*. (So, PQE takes* $G$ *out of the scope of quantifiers.) The formula* $H$ *is called a* solution *to PQE. The case of PQE where* $G = F$ *is called* Quantifier Elimination (QE)*.*

*Example 1.* Consider the formula $F = C_1 \wedge C_2 \wedge C_3 \wedge C_4$ where $C_1 = x_3 \vee x_4$, $C_2 = y_1 \vee \overline{x}_3$, $C_3 = y_1 \vee \overline{x}_4$, $C_4 = y_2 \vee \overline{x}_4$. Let $Y$ denote $\{y_1, y_2\}$ and $X$ denote $\{x_3, x_4\}$. Consider the PQE problem of taking $C_1$ out of $\exists X[F]$, i.e., finding $H(Y)$ such that $\exists X[F] \equiv H \wedge \exists X[F \setminus \{C_1\}]$. As we show later, $\exists X[F] \equiv y_1 \wedge \exists X[F \setminus \{C_1\}]$. That is, $H = y_1$ is a solution to the PQE problem above.
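The equivalence claimed in Example 1 is small enough to check by brute force over all assignments. The clauses below include the negations ($C_2 = y_1 \vee \overline{x}_3$, etc.) that the stated equivalence requires; those overlines were lost in typesetting, so this encoding is a reconstruction.

```python
from itertools import product

# Variables are indexed (y1, y2, x3, x4) = 0..3; a literal (i, s) means
# "variable i has value s". Clauses are lists of literals.
C1 = [(2, 1), (3, 1)]              # x3 ∨ x4
C2 = [(0, 1), (2, 0)]              # y1 ∨ ¬x3
C3 = [(0, 1), (3, 0)]              # y1 ∨ ¬x4
C4 = [(1, 1), (3, 0)]              # y2 ∨ ¬x4
F = [C1, C2, C3, C4]

def sat(clauses, a):
    """True iff the assignment tuple a satisfies every clause."""
    return all(any(a[i] == s for i, s in c) for c in clauses)

def exists_x(clauses, y1, y2):
    """∃ x3 x4 . clauses, under the given values of y1, y2."""
    return any(sat(clauses, (y1, y2, x3, x4)) for x3, x4 in product((0, 1), repeat=2))

# Check ∃X[F] ≡ y1 ∧ ∃X[F \ {C1}] for every assignment to Y:
ok = all(
    exists_x(F, y1, y2) == (y1 == 1 and exists_x(F[1:], y1, y2))
    for y1, y2 in product((0, 1), repeat=2)
)
```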

*Remark 1.* Let D be a clause of a solution H to the PQE problem of Definition 9. If F \ G implies D, then H \ {D} is a solution to this PQE problem too.

**Proposition 1.** *Let* H *be a solution to the PQE problem of Definition 9. That is,* ∃X[F] ≡ H ∧ ∃X[F \ G]*. Then* F ⇒ H *(i.e.,* F *implies* H*).*

The proofs of propositions can be found in [5].

**Definition 10.** *Let clauses* $C'$, $C''$ *have opposite literals of exactly one variable* $w \in Vars(C') \cap Vars(C'')$*. Then* $C'$, $C''$ *are called* resolvable *on* $w$*. Let* $C$ *be a clause of a formula* $G$ *and* $w \in Vars(C)$*. The clause* $C$ *is said to be* blocked *[10] in* $G$ *with respect to the variable* $w$ *if no clause of* $G$ *is resolvable with* $C$ *on* $w$*.*

**Proposition 2.** *Let a clause* C *be blocked in a formula* F(X, Y ) *with respect to a variable* x ∈ X*. Then* C *is redundant in* ∃X[F]*, i.e.,* ∃X[F \ {C}] ≡ ∃X[F]*.*
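Definition 10 and Proposition 2 suggest a simple blocked-clause check. Clauses are lists of signed integers (a standard SAT convention assumed here, not the paper's data structure).

```python
def resolvable_on(c1, c2, w):
    """True iff c1, c2 have opposite literals of exactly one variable, namely w."""
    clash = {v for v in c1 if -v in c2}
    return clash in ({w}, {-w})

def blocked(C, G, w):
    """True iff no clause of G is resolvable with C on w (Definition 10);
    by Proposition 2, such a C is redundant when w is quantified."""
    assert w in C or -w in C
    return not any(resolvable_on(C, D, w) for D in G if D is not C)

G = [[1, 2], [-1, 3], [2, 3]]
r = resolvable_on([1, 2], [-1, 3], 1)   # clash only on variable 1
b = blocked([2, 3], G, 2)               # no clause of G contains -2
nb = blocked([1, 2], G, 1)              # [-1, 3] is resolvable with [1, 2] on 1
```

Note that the `clash` set test also rejects pairs clashing on two variables, whose resolvent would be tautological.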

## **3 Property Generation by PQE**

Many known problems can be formulated in terms of PQE, thus facilitating the design of new efficient algorithms. In [5], we give a short summary of results on solving SAT, equivalence checking and model checking by PQE presented in [1,6–8]. In this section, we describe application of PQE to *property generation* for a combinational circuit. The objective of property generation is to expose a bug via producing an *unwanted* property.

Let $M(X, V, W)$ be a combinational circuit where $X, V, W$ specify the sets of the internal, input and output variables of $M$ respectively. Let $F(X, V, W)$ denote a formula specifying $M$. As usual, this formula is obtained by Tseitin's transformations [11]. Namely, $F$ equals $F_{G_1} \wedge \cdots \wedge F_{G_k}$ where $G_1, \ldots, G_k$ are the gates of $M$ and $F_{G_i}$ specifies the functionality of gate $G_i$.

*Example 2.* Let $G$ be a 2-input AND gate defined as $x_3 = x_1 \wedge x_2$ where $x_3$ denotes the output value and $x_1, x_2$ denote the input values of $G$. Then $G$ is specified by the formula $F_G = (\overline{x}_1 \vee \overline{x}_2 \vee x_3) \wedge (x_1 \vee \overline{x}_3) \wedge (x_2 \vee \overline{x}_3)$. Every clause of $F_G$ is falsified by an inconsistent assignment (where the output value of $G$ is not implied by its input values). For instance, $x_1 \vee \overline{x}_3$ is falsified by the inconsistent assignment $x_1 = 0$, $x_3 = 1$. So, every assignment *satisfying* $F_G$ corresponds to a *consistent* assignment to $G$ and vice versa. Similarly, every assignment satisfying the formula $F$ above is a consistent assignment to the gates of $M$ and vice versa.
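Example 2's claim — that $F_G$ is satisfied exactly by the consistent assignments to the AND gate — can be checked exhaustively (with the gate clauses' lost negations restored, as in the example):

```python
from itertools import product

def and_gate_cnf(x1, x2, x3):
    """Tseitin clauses for x3 = x1 ∧ x2:
    (¬x1 ∨ ¬x2 ∨ x3) ∧ (x1 ∨ ¬x3) ∧ (x2 ∨ ¬x3)."""
    return ((not x1) or (not x2) or x3) and (x1 or not x3) and (x2 or not x3)

# F_G holds iff the assignment is consistent with the gate's semantics.
consistent = all(
    and_gate_cnf(x1, x2, x3) == (x3 == (x1 and x2))
    for x1, x2, x3 in product((False, True), repeat=3)
)
```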

#### **3.1 High-Level View of Property Generation by PQE**

One generates properties by PQE until an unwanted property exposing a bug is produced. (Like in testing, one runs tests until a bug-exposing test is encountered.) The benefit of property generation by PQE is fourfold. First, by property generation, one can identify bugs that are hard or simply impossible to find by testing. Second, using PQE makes property generation efficient. Third, by taking out different clauses one can generate properties covering different parts of the design. This increases the probability of discovering a bug. Fourth, every property generated by PQE specifies a large set of high-quality tests.

In this paper (Sects. 7, 9), we consider cases where identifying an unwanted property is easy. However, in general, such identification is not trivial. A discussion of this topic is beyond the scope of this paper. (An outline of a procedure for deciding if a property is unwanted is given in [5].)

#### **3.2 Property Generation as Generalization of Testing**

The behavior of $M$ corresponding to a single test can be cast as a property. Let $w_i \in W$ be an output variable of $M$ and $\vec{v}$ be a test, i.e., a full assignment to the input variables $V$ of $M$. Let $B_{\vec{v}}$ denote the longest clause falsified by $\vec{v}$, i.e., $Vars(B_{\vec{v}}) = V$. Let $l(w_i)$ be the literal satisfied by the value of $w_i$ produced by $M$ under input $\vec{v}$. Then the clause $B_{\vec{v}} \vee l(w_i)$ is satisfied by every assignment satisfying $F$, i.e., $B_{\vec{v}} \vee l(w_i)$ is a property of $M$. We will refer to it as a **single-test property** (since it describes the behavior of $M$ for a single test). If the input $\vec{v}$ is supposed to produce the opposite value of $w_i$ (i.e., the one *falsifying* $l(w_i)$), then $\vec{v}$ exposes a bug in $M$. In this case, the single-test property above is an **unwanted** property of $M$ exposing the same bug as the test $\vec{v}$.
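The construction of the single-test property $B_{\vec{v}} \vee l(w_i)$ can be sketched as below; the signed-integer literal encoding and the helper name are hypothetical representations introduced for illustration.

```python
def single_test_property(test, output_var, output_value):
    """Build B_v ∨ l(w_i) as a list of signed-integer literals.
    `test` maps each input variable index to its 0/1 value; B_v contains, for
    each input, the literal falsified by the test, and l(w_i) is the literal
    satisfied by the output value M produces under the test."""
    B_v = [(-v if val == 1 else v) for v, val in sorted(test.items())]
    l_wi = output_var if output_value == 1 else -output_var
    return B_v + [l_wi]

# Inputs v1 = 1, v2 = 0 (variables 1 and 2); the circuit outputs w = 1 (variable 10).
prop = single_test_property({1: 1, 2: 0}, output_var=10, output_value=1)
```

The resulting clause is falsified on its $B_{\vec{v}}$ part only by the test itself, so it constrains $M$'s behavior at exactly that input.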

A single-test property can be viewed as a weakest property of M as opposed to the strongest property specified by ∃X[F]. The latter is the truth table of M that can be computed explicitly by performing QE on ∃X[F]. One can use PQE to generate properties of M that, in terms of strength, range from the weakest ones to the strongest property inclusively. (By combining clause splitting with PQE one can generate single-test properties, see the next subsection.) Consider the PQE problem of taking a clause C out of ∃X[F]. Let H(V,W) be a solution to this problem, i.e., ∃X[F] ≡ H ∧ ∃X[F \ {C}]. Since H is implied by F, it can be viewed as a **property** of M. If H is an **unwanted** property, M has a bug. (Here we consider the case where a property of M is obtained by taking a clause out of formula ∃X[F] where only the *internal* variables of M are quantified. Later we consider cases where some external variables of M are quantified too.)

We will assume that the property H generated by PQE has no redundant clauses (see Remark 1). That is, if D ∈ H, then F \ {C} ⇏ D. Then one can view H as a property that holds due to the presence of the clause C in F.

#### **3.3 Computing Properties Efficiently**

If a property H is obtained by taking only one clause out of ∃X[F], its computation is much easier than performing QE on ∃X[F]. If computing H still remains too time-consuming, one can use the two methods below that achieve better performance at the expense of generating weaker properties. The first method applies when a PQE solver forms a solution *incrementally*, clause by clause (like the algorithms described in Sects. 5 and 6). Then one can simply stop computing H as soon as the number of clauses in H exceeds a threshold. Such a formula H is still implied by F and hence specifies a property of M.

The second method employs *clause splitting*. Here we consider clause splitting on input variables v1,...,vp, i.e., those of V (but one can split a clause on any subset of variables from Vars(F)). Let F′ denote the formula F where a clause C is replaced with the p + 1 clauses C1 = C ∨ ¬l(v1), ..., Cp = C ∨ ¬l(vp), Cp+1 = C ∨ l(v1) ∨ ··· ∨ l(vp), where l(vi) is a literal of vi. The idea is to obtain a property H by taking the clause Cp+1 out of ∃X[F′] rather than C out of ∃X[F]. The former PQE problem is simpler than the latter since it produces a weaker property H. One can show that if {v1,...,vp} = V, then a) the complexity of PQE reduces to **linear**; b) taking out Cp+1 actually produces a **single-test property**. The latter specifies the input/output behavior of M for the test v⃗ falsifying the literals l(v1),...,l(vp). (The details can be found in [5].)
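
A quick sanity check of the splitting step: the conjunction of the p + 1 split clauses is logically equivalent to the original clause C. The sketch below (clause and split variables chosen by us; split literals l(vi) taken positive, one of the two symmetric polarity conventions) verifies this by enumeration.

```python
from itertools import product

def satisfied(clause, assign):
    # clause: set of DIMACS-style int literals (+v positive, -v negative)
    return any((lit > 0) == assign[abs(lit)] for lit in clause)

def split_clause(C, split_vars):
    # Split C on variables v1..vp (assumed not to occur in C),
    # choosing l(v) = v (the positive literal) for every split variable.
    parts = [C | {-v} for v in split_vars]      # C_i = C ∨ ¬l(v_i)
    parts.append(C | set(split_vars))           # C_{p+1} = C ∨ l(v1) ∨ ... ∨ l(vp)
    return parts

C = {1, -2}                    # example clause over variables 1, 2
split = split_clause(C, [3, 4])

# The conjunction of the split clauses must be equivalent to C.
all_vars = [1, 2, 3, 4]
equivalent = all(
    satisfied(C, a) == all(satisfied(p, a) for p in split)
    for a in ({v: b for v, b in zip(all_vars, bits)}
              for bits in product([False, True], repeat=4))
)
print(equivalent)
```

The equivalence holds because the added literals cancel: (¬l1 ∧ ... ∧ ¬lp) ∧ (l1 ∨ ... ∨ lp) is unsatisfiable.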

#### **3.4 Using Design Coverage for Generation of Unwanted Properties**

Arguably, testing is so effective in practice because one verifies a *particular design*. Namely, one probes different parts of this design using some coverage metric rather than sampling the truth table (which would mean verifying *every possible design*). The same idea works for property generation by PQE for the following two reasons. First, by taking out a clause, PQE generates a property inherent to the *specific* circuit M. (If one replaces M with an equivalent but structurally different circuit, PQE will generate different properties.) Second, by taking out different clauses of F one generates properties corresponding to different parts of M thus "covering" the design. This increases the chance to take out a clause corresponding to the buggy part of M and generate an unwanted property.

#### **3.5 High-Quality Tests Specified by a Property Generated by PQE**

In this subsection, we show that a property H generated by PQE, in general, specifies a large set of high-quality tests. Let H(V,W) be obtained by taking C out of ∃X[F(X,V,W)]. Let Q(V,W) be a clause of H. As mentioned above, we assume that F \ {C} ⇏ Q. Then there is an assignment (x⃗, v⃗, w⃗) satisfying the formula (F \ {C}) ∧ ¬Q, where x⃗, v⃗, w⃗ are assignments to X, V, W respectively. (Note that, by definition, (v⃗, w⃗) falsifies Q.) Let (x⃗*, v⃗, w⃗*) be the execution trace of M under the input v⃗. So, (x⃗*, v⃗, w⃗*) satisfies F. Note that the output assignments w⃗ and w⃗* must be different because (v⃗, w⃗*) has to satisfy Q. (Otherwise, (x⃗*, v⃗, w⃗*) satisfies F ∧ ¬Q, and so F ⇏ Q and hence F ⇏ H.) So, one can view v⃗ as a test "detecting" the disappearance of the clause C from F. Note that different assignments satisfying (F \ {C}) ∧ ¬Q correspond to different tests v⃗. So, the clause Q of H, in general, specifies a very large number of tests. One can show that these tests are similar to those detecting stuck-at faults and so have very high quality [5].
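
The test-extraction step can be illustrated directly: enumerate the assignments that satisfy F \ {C} but falsify Q, and project each onto the inputs V. The toy circuit, the removed clause C, and the property clause Q below are ours (Q is picked by hand rather than computed by PQE).

```python
from itertools import product

# Toy circuit M: inputs v1, v2, internal gate x = v1 AND v2, output w = x.
# Variables: 1 = v1, 2 = v2, 3 = x, 4 = w; clauses are sets of DIMACS literals.
F = [{-3, 1}, {-3, 2}, {3, -1, -2},   # x <-> v1 & v2
     {-4, 3}, {4, -3}]                # w <-> x

C = {4, -3}                           # the clause we "take out" of F
Q = {-1, -2, 4}                       # implied by F, but not by F \ {C}

def sat(clause, a):
    return any((l > 0) == a[abs(l)] for l in clause)

tests = set()
for bits in product([False, True], repeat=4):
    a = dict(zip([1, 2, 3, 4], bits))
    # assignments satisfying (F \ {C}) and falsifying Q ...
    if all(sat(cl, a) for cl in F if cl != C) and not sat(Q, a):
        tests.add((a[1], a[2]))       # ... projected onto the inputs V
print(tests)
```

Here the only extracted test is v1 = v2 = True: precisely the input for which removing C lets the output w deviate from its correct value.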

## **4 Invariant Generation by PQE**

In this section, we extend property generation for combinational circuits to sequential ones. Namely, we generate *invariants*. Note that generation of *desired* auxiliary invariants is routinely used in practice to facilitate verification of a predefined property. The problem we consider here is different in that our goal is to produce an *unwanted* invariant exposing a bug. We picked generation of invariants (over that of weaker properties just claiming that a state cannot be reached in k transitions or less) because identification of an unwanted invariant is, arguably, easier.

### **4.1 Bugs Making States Unreachable**

Let N be a sequential circuit and S denote the state variables of N. Let I(S) specify the initial state s⃗_ini (i.e., I(s⃗_ini) = 1). Let T(S,V,S′) denote the transition relation of N, where S, S′ are the present and next state variables and V specifies the (combinational) input variables. We will say that a state s⃗ of N is reachable if there is an execution trace leading to s⃗. That is, there is a sequence of states s⃗0,...,s⃗k where s⃗0 = s⃗_ini, s⃗k = s⃗, and there exist v⃗i, i = 0,...,k−1, for which T(s⃗i, v⃗i, s⃗i+1) = 1. Suppose N is required to satisfy a set of **invariants** P0(S),...,Pm(S). That is, Pi holds iff Pi(s⃗) = 1 for every reachable state s⃗ of N. We will denote the **aggregate invariant** P0 ∧ ··· ∧ Pm as Pagg. We will call s⃗ a **bad state** of N if Pagg(s⃗) = 0. If Pagg holds, no bad state is reachable. We will call s⃗ a **good state** of N if Pagg(s⃗) = 1.

Typically, the set of invariants P0,...,Pm is incomplete in the sense that it does not specify all states that must be *unreachable*. So, a good state can well be unreachable. We will call a good state **operative** (or **op-state** for short) if it is supposed to be used by N and so should be *reachable*. We introduce the term *operative state* just to factor out "useless" good states. We will say that N has an **op-state reachability bug** if an op-state is unreachable in N. In Sect. 7, we consider such a bug in a FIFO buffer. The fact that Pagg holds says *nothing* about the reachability of op-states. Consider, for instance, a trivial circuit Ntriv that simply stays in the initial state s⃗_ini, where Pagg(s⃗_ini) = 1. Then Pagg holds for Ntriv, but the latter has op-state reachability bugs (assuming that the correct circuit must reach states other than s⃗_ini).
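
A tiny stand-in for this situation (the circuit is ours, not the paper's): a 2-bit register that is supposed to cycle through all four states but, due to a bug, skips one of them. The aggregate invariant is trivially true, yet an operative state is unreachable, which explicit reachability analysis exposes.

```python
# Toy sequential circuit with an op-state reachability bug: the register
# should cycle 0 -> 1 -> 2 -> 3 -> 0, but the bug redirects transitions
# into state 2 back to state 0 (so states 2 and 3 become unreachable).
def next_states(s):
    t = (s + 1) % 4
    return {t if t != 2 else 0}      # bug: state 2 is skipped

P_agg = lambda s: True               # every state is "good": Pagg holds trivially

# Explicit reachability analysis from the initial state 0.
reachable, frontier = {0}, [0]
while frontier:
    s = frontier.pop()
    for t in next_states(s):
        if t not in reachable:
            reachable.add(t)
            frontier.append(t)

print(sorted(reachable))                    # op-states 2 and 3 are missing
print(all(P_agg(s) for s in reachable))     # yet the aggregate invariant holds
```

This mirrors the Ntriv discussion: Pagg holding gives no evidence that op-states are reachable.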

Let R_s⃗(S) be the predicate satisfied only by the state s⃗. In terms of CTL, identifying an op-state reachability bug means finding s⃗ for which the property EF.R_s⃗ must hold but does not. The reason for assuming s⃗ to be *unknown* is that the set of op-states is typically too large to *explicitly* require every property EF.R_s⃗ to hold. This makes finding op-state reachability bugs very hard. The problem is exacerbated by the fact that reachability of different states is established by *different traces*. So, in general, one cannot efficiently prove many properties EF.R_s⃗ (for different states) *at once*.

#### **4.2 Proving Operative State Unreachability by Invariant Generation**

In practice, there are two methods to check reachability of op-states for large circuits. The first method is testing. Of course, testing cannot prove a state unreachable; however, the examination of execution traces may point to a potential problem. (For instance, after examining execution traces of the circuit Ntriv above, one realizes that many op-states look unreachable.) The other method is to check **unwanted invariants**, i.e., those that are supposed to fail. If an unwanted invariant holds for a circuit, the latter has an op-state reachability bug. For instance, one can check whether a state variable si ∈ S of a circuit never changes its initial value. To break this unwanted invariant, one needs to find an op-state where the initial value of si is flipped. (For the circuit Ntriv above, this unwanted invariant holds for every state variable.) The potential unwanted invariants are formed manually, i.e., simply *guessed*.

The two methods above can easily overlook an op-state reachability bug. Testing cannot prove that an op-state is unreachable. To correctly guess an unwanted invariant that holds, one essentially has to know the underlying bug. Below, we describe a method for invariant generation by PQE that is based on property generation for combinational circuits. The appeal of this method is twofold. First, PQE generates invariants "inherent" to the implementation at hand, which drastically reduces the set of invariants to explore. Second, PQE is able to generate invariants related to different parts of the circuit (including the buggy one). This increases the probability of generating an unwanted invariant. We substantiate this intuition in Sect. 7.

Let formula Fk specify the combinational circuit obtained by unfolding a sequential circuit N for k time frames and adding the initial state constraint I(S0). That is, Fk = I(S0) ∧ T(S0,V0,S1) ∧ ··· ∧ T(Sk−1,Vk−1,Sk), where Sj, Vj denote the state and input variables of the j-th time frame respectively. Let H(Sk) be a solution to the PQE problem of taking a clause C out of ∃Xk[Fk], where Xk = S0 ∪ V0 ∪ ··· ∪ Sk−1 ∪ Vk−1. That is, ∃Xk[Fk] ≡ H ∧ ∃Xk[Fk \ {C}]. Note that in contrast to Sect. 3, here some external variables of the combinational circuit (namely, the input variables V0,...,Vk−1) are quantified too. So, H depends only on the state variables of the last time frame. H can be viewed as a **local invariant** asserting that no state falsifying H can be reached in k transitions.

One can use H to find global invariants (holding in *every* time frame) as follows. Even if H is only a local invariant, a clause Q of H can be a *global* invariant. The experiments of Sect. 8 show that, in general, this is true for many clauses of H. (To find out if Q is a global invariant, one can simply run a model checker to see if the property Q holds.) Note that by taking out different clauses of Fk one can produce global single-clause invariants Q relating to different parts of N. From now on, when we say "an invariant" without a qualifier, we mean a **global invariant**.
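
The global-invariant check can be sketched as follows. Explicit reachability analysis on a toy circuit (ours, not the paper's benchmark) stands in for a model checker such as IC3: a candidate single-clause invariant Q, as could be produced from a local invariant H, is verified against all reachable states.

```python
# Toy circuit: a 2-bit counter modulo 3 over state bits (s1, s0);
# state 3 = (1, 1) is never entered.
def step(s1, s0):
    v = (2 * s1 + s0 + 1) % 3
    return (v >> 1 & 1, v & 1)

# Candidate single-clause invariant Q = ¬s1 ∨ ¬s0 ("state 3 is unreachable"),
# playing the role of a clause of a PQE-generated local invariant H.
Q = lambda s1, s0: not (s1 and s0)

# Explicit reachability from the initial state (0, 0), standing in for IC3.
reachable, frontier = {(0, 0)}, [(0, 0)]
while frontier:
    s = frontier.pop()
    t = step(*s)
    if t not in reachable:
        reachable.add(t)
        frontier.append(t)

is_global_invariant = all(Q(*s) for s in reachable)
print(is_global_invariant)
```

If Q held only for states reachable within k transitions but failed later, this check would report it as local-only rather than global.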

## **5 Introducing** *EG***-***PQE*

In this section, we describe a simple SAT-based algorithm for performing PQE called *EG*-*PQE*. Here *'EG'* stands for 'Enumerate and Generalize'. *EG*-*PQE* accepts a formula ∃X[F(X, Y )] and a clause C ∈ F. It outputs a formula H(Y ) such that ∃X[*Fini*] ≡ H ∧ ∃X[*Fini* \ {C}] where *Fini* is the initial formula F. (This point needs clarification because *EG*-*PQE* changes F by adding clauses.)

#### **5.1 An Example**

Before describing the pseudocode of *EG*-*PQE*, we explain how it solves the PQE problem of Example 1. That is, we consider taking the clause C1 out of ∃X[F(X,Y)] where F = C1 ∧ ··· ∧ C4, C1 = ¬x3 ∨ x4, C2 = y1 ∨ x3, C3 = y1 ∨ ¬x4, C4 = y2 ∨ x4, Y = {y1, y2} and X = {x3, x4}.

*EG*-*PQE* iteratively generates a full assignment y⃗ to Y and checks if (C1)_y⃗ is redundant in ∃X[F_y⃗] (i.e., if C1 is redundant in ∃X[F] in the subspace y⃗). Note that if (F \ {C1})_y⃗ *implies* (C1)_y⃗, then (C1)_y⃗ is trivially redundant in ∃X[F_y⃗]. To avoid such subspaces, *EG*-*PQE* generates y⃗ by searching for an assignment (y⃗, x⃗) satisfying the formula (F \ {C1}) ∧ ¬C1. (Here y⃗ and x⃗ are full assignments to Y and X respectively.) If such (y⃗, x⃗) exists, it satisfies F \ {C1} and falsifies C1, thus proving that (F \ {C1})_y⃗ *does not* imply (C1)_y⃗.

Assume that *EG*-*PQE* found the assignment (y1 = 0, y2 = 1, x3 = 1, x4 = 0) satisfying (F \ {C1}) ∧ ¬C1. So y⃗ = (y1 = 0, y2 = 1). Then *EG*-*PQE* checks if F_y⃗ is satisfiable. F_y⃗ = (¬x3 ∨ x4) ∧ x3 ∧ ¬x4, and so it is *unsatisfiable*. This means that (C1)_y⃗ *is not* redundant in ∃X[F_y⃗]. (Indeed, (F \ {C1})_y⃗ is satisfiable. So, removing C1 makes F satisfiable in the subspace y⃗.) *EG*-*PQE makes* (C1)_y⃗ redundant in ∃X[F_y⃗] by **adding** to F a clause B falsified by y⃗. The clause B equals y1 and is obtained by identifying the assignments to individual variables of Y that made F_y⃗ unsatisfiable. (In our case, this is the assignment y1 = 0.) Note that the derivation of the clause y1 *generalizes* the proof of unsatisfiability of F in the subspace (y1 = 0, y2 = 1) so that this proof holds for the subspace (y1 = 0, y2 = 0) too.

Now *EG*-*PQE* looks for a new assignment satisfying (F \ {C1}) ∧ ¬C1. Let the assignment (y1 = 1, y2 = 1, x3 = 1, x4 = 0) be found. So, y⃗ = (y1 = 1, y2 = 1). Since (y1 = 1, y2 = 1, x3 = 0) satisfies F, the formula F_y⃗ is satisfiable. So, (C1)_y⃗ is *already redundant* in ∃X[F_y⃗]. To avoid re-visiting the subspace y⃗, *EG*-*PQE* generates the **plugging** clause D = ¬y1 ∨ ¬y2 falsified by y⃗.

*EG*-*PQE* fails to generate a new assignment y⃗ because the formula D ∧ (F \ {C1}) ∧ ¬C1 is unsatisfiable. Indeed, every full assignment y⃗ examined so far falsifies either the clause y1 added to F or the plugging clause D. The only assignment *EG*-*PQE* has not explored yet is y⃗ = (y1 = 1, y2 = 0). Since (F \ {C1})_y⃗ = x4 and (C1)_y⃗ = ¬x3 ∨ x4, the formula (F \ {C1}) ∧ ¬C1 is unsatisfiable in the subspace y⃗. In other words, (C1)_y⃗ is implied by (F \ {C1})_y⃗ and hence is redundant. Thus, C1 is redundant in ∃X[Fini ∧ y1] for every assignment to Y, where Fini is the initial formula F. That is, ∃X[Fini] ≡ y1 ∧ ∃X[Fini \ {C1}], and so the clause y1 is a solution H to our PQE problem.
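
The defining equivalence of this solution can be verified exhaustively. The sketch below encodes the four clauses of Example 1 (with the polarities used above) in DIMACS-style integers and checks ∃X[F] ≡ H ∧ ∃X[F \ {C1}] in every subspace (y1, y2).

```python
from itertools import product

# Variables: 1 = y1, 2 = y2, 3 = x3, 4 = x4; clauses as sets of int literals:
# C1 = ¬x3 ∨ x4, C2 = y1 ∨ x3, C3 = y1 ∨ ¬x4, C4 = y2 ∨ x4.
C1 = {-3, 4}
F = [C1, {1, 3}, {1, -4}, {2, 4}]
H = [{1}]                             # the solution H = y1

def sat_exists_X(clauses, y1, y2):
    # is ∃x3,x4 [clauses] true in the subspace (y1, y2)?
    return any(
        all(any((l > 0) == {1: y1, 2: y2, 3: x3, 4: x4}[abs(l)] for l in cl)
            for cl in clauses)
        for x3, x4 in product([False, True], repeat=2))

# ∃X[F] ≡ H ∧ ∃X[F \ {C1}] must hold in every subspace (y1, y2).
ok = all(
    sat_exists_X(F, y1, y2) ==
    (all(any((l > 0) == {1: y1, 2: y2}[abs(l)] for l in cl) for cl in H)
     and sat_exists_X([c for c in F if c != C1], y1, y2))
    for y1, y2 in product([False, True], repeat=2))
print(ok)
```

In the subspaces with y1 = 0, both sides are false (F is unsatisfiable there and H fails); with y1 = 1, both sides are true.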

#### **5.2 Description of** *EG***-***PQE*

```
EG-PQE(F, X, Y, C) {
1   Plg := ∅; Fini := F
2   while (true) {
3     G := F \ {C}
4     y⃗ := Sat1(Plg ∧ G ∧ ¬C)
5     if (y⃗ = nil)
6       return(F \ Fini)
7     (x⃗*, B) := Sat2(F, y⃗)
8     if (B ≠ nil) {
9       F := F ∪ {B}
10      continue }
11    D := PlugCls(y⃗, x⃗*, F)
12    Plg := Plg ∧ D } }
```

**Fig. 1.** Pseudocode of *EG*-*PQE*

The pseudo-code of *EG*-*PQE* is shown in Fig. 1. *EG*-*PQE* starts with storing the initial formula F and initializing formula *Plg* that accumulates the plugging clauses generated by *EG*-*PQE* (line 1). As we mentioned in the previous subsection, plugging clauses are used to avoid re-visiting the subspaces where the formula F is proved satisfiable.

All the work is carried out in a while loop. First, *EG*-*PQE* checks if there is a new subspace y⃗ where (F \ {C})_y⃗ does not imply C_y⃗. This is done by searching for an assignment (y⃗, x⃗) satisfying Plg ∧ (F \ {C}) ∧ ¬C (lines 3–4). If such an assignment does not exist, the clause C is redundant in ∃X[F]. (Indeed, let y⃗ be a full assignment to Y. The formula Plg ∧ (F \ {C}) ∧ ¬C is unsatisfiable in the subspace y⃗ for one of two reasons. First, y⃗ falsifies Plg. Then C_y⃗ is redundant because F_y⃗ is satisfiable. Second, (F \ {C})_y⃗ ∧ ¬C_y⃗ is unsatisfiable. In this case, (F \ {C})_y⃗ implies C_y⃗.) Then *EG*-*PQE* returns the set of clauses added to the initial formula F as a solution H to the PQE problem (lines 5–6).

If the satisfying assignment (y⃗, x⃗) above exists, *EG*-*PQE* checks if the formula F_y⃗ is satisfiable (line 7). If not, then the clause C_y⃗ *is not* redundant in ∃X[F_y⃗] (because (F \ {C})_y⃗ is satisfiable). So, *EG*-*PQE makes* C_y⃗ redundant by generating a clause B(Y) falsified by y⃗ and adding it to F (line 9). Note that adding B also prevents *EG*-*PQE* from re-visiting the subspace y⃗ again. The clause B is built by finding an *unsatisfiable* subset of F_y⃗ and collecting the literals of Y removed from the clauses of this subset when obtaining F_y⃗ from F.

If F_y⃗ is satisfiable, *EG*-*PQE* generates an assignment x⃗* to X such that (y⃗, x⃗*) satisfies F (line 7). The satisfiability of F_y⃗ means that every clause of F_y⃗, including C_y⃗, is redundant in ∃X[F_y⃗]. At this point, *EG*-*PQE* uses the longest clause D(Y) falsified by y⃗ as a plugging clause (line 11). The clause D is added to Plg to avoid re-visiting the subspace y⃗. Sometimes it is possible to remove variables from y⃗ to produce a shorter assignment y⃗* such that (y⃗*, x⃗*) still satisfies F. Then one can use a shorter plugging clause D that is falsified by y⃗* and involves only the variables assigned in y⃗*.
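
The whole loop fits in a short brute-force rendition for tiny formulas (ours, not the authors' implementation): Sat1 and Sat2 are exhaustive enumeration, and the clause B is not generalized from an unsat core, so the returned solution may be syntactically weaker than the y1 derived in Subsect. 5.1 while still satisfying the defining equivalence.

```python
from itertools import product

def sat(clause, a):
    return any((l > 0) == a[abs(l)] for l in clause)

def eg_pqe(F, X, Y, C):
    F = list(F)
    ini = len(F)                      # clauses appended after this form H
    plg = []                          # plugging clauses
    while True:
        found = None
        for bits in product([False, True], repeat=len(Y) + len(X)):
            a = dict(zip(Y + X, bits))
            if (all(sat(cl, a) for cl in plg)
                    and all(sat(cl, a) for cl in F if cl is not C)
                    and not sat(C, a)):       # Sat1: Plg ∧ (F \ {C}) ∧ ¬C
                found = a
                break
        if found is None:
            return F[ini:]                    # H = clauses added to F
        y = {v: found[v] for v in Y}
        f_sat = any(                          # Sat2: is F satisfiable under y?
            all(sat(cl, {**y, **dict(zip(X, xb))}) for cl in F)
            for xb in product([False, True], repeat=len(X)))
        clause = {v if not y[v] else -v for v in Y}   # falsified by y
        if not f_sat:
            F.append(clause)                  # B: make C redundant in subspace y
        else:
            plg.append(clause)                # D: plugging clause, F sat here

# Example 1: C1 = ¬x3 ∨ x4, C2 = y1 ∨ x3, C3 = y1 ∨ ¬x4, C4 = y2 ∨ x4.
C1 = {-3, 4}
F = [C1, {1, 3}, {1, -4}, {2, 4}]
H = eg_pqe(F, X=[3, 4], Y=[1, 2], C=C1)
print(H)   # a valid solution; core-based generalization would yield y1
```

On this input the sketch returns H = {y1 ∨ ¬y2}: a correct solution of the PQE problem, though less general than the clause y1 obtained with core-based generalization of B.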

#### **5.3 Discussion**

*EG*-*PQE* is similar to the QE algorithm presented at CAV-2002 [12]. We will refer to it as *CAV02*-*QE*. Given a formula ∃X[F(X,Y)], *CAV02*-*QE* enumerates full assignments to Y. In a subspace y⃗, if F_y⃗ is unsatisfiable, *CAV02*-*QE* adds to F a clause falsified by y⃗. Otherwise, *CAV02*-*QE* generates a plugging clause D. (In [12], D is called "a blocking clause". This term can be confused with the term "blocked clause", which specifies a completely different kind of clause. So, we use the term "plugging clause" instead.) To apply the idea of *CAV02*-*QE* to PQE, we reformulated it in terms of redundancy-based reasoning.

The main flaw of *EG*-*PQE*, inherited from *CAV02*-*QE*, is the necessity to use plugging clauses produced from a satisfying assignment. Consider the PQE problem of taking a clause C out of ∃X[F(X,Y)]. If F is proved *unsatisfiable* in a subspace y⃗, typically only a small subset of the clauses of F_y⃗ is involved in the proof. Then the clause generated by *EG*-*PQE* is short and thus proves C redundant in many subspaces different from y⃗. On the contrary, to prove F *satisfiable* in a subspace y⃗, every clause of F must be satisfied. So, the plugging clause built off a satisfying assignment includes almost every variable of Y. Despite this flaw of *EG*-*PQE*, we present it for two reasons. First, it is a very simple SAT-based algorithm that can be easily implemented. Second, *EG*-*PQE* has a powerful advantage over *CAV02*-*QE* since it solves PQE rather than QE. Namely, *EG*-*PQE* does not need to examine the subspaces y⃗ where C is implied by F \ {C}. Surprisingly, for many formulas this allows *EG*-*PQE* to *completely avoid* examining subspaces where F is satisfiable. In this case, *EG*-*PQE* is very efficient and can solve very large problems. Note that when *CAV02*-*QE* performs complete QE on ∃X[F], it *cannot* avoid subspaces y⃗ where F_y⃗ is satisfiable unless F *itself* is unsatisfiable (which is very rare in practical applications).

## **6 Introducing** *EG*-*PQE*<sup>+</sup>

In this section, we describe *EG*-*PQE* <sup>+</sup>, an improved version of *EG*-*PQE*.

#### **6.1 Main Idea**

The pseudocode of *EG*-*PQE*+ is shown in Fig. 2. It differs from that of *EG*-*PQE* only in line 11, marked with an asterisk. The motivation for this change is as follows. Line 11 describes proving redundancy of C in the case where C_y⃗ is not implied by (F \ {C})_y⃗ and F_y⃗ is satisfiable. Then *EG*-*PQE* simply uses a satisfying assignment as a proof of redundancy of C in the subspace y⃗. This proof is unnecessarily strong: it shows that *every* clause of F (including C) is redundant in ∃X[F] in the subspace y⃗. Such a strong proof is hard to generalize to other subspaces.

```
EG-PQE+(F, X, Y, C) {
1   Plg := ∅; Fini := F
2   while (true) {
      ........
11*   D := PrvClsRed(y⃗, F, C)
12    Plg := Plg ∧ D } }
```

**Fig. 2.** Pseudocode of *EG*-*PQE* <sup>+</sup>

The idea of *EG*-*PQE*+ is to generate a proof for a much weaker proposition, namely, a proof of redundancy of C (and only C). Intuitively, such a proof should be easier to generalize. So, *EG*-*PQE*+ calls a procedure *PrvClsRed* generating such a proof. *EG*-*PQE*+ is a generic algorithm in the sense that *any* suitable procedure can be employed as *PrvClsRed*. In our current implementation, the procedure *DS*-*PQE* [1] is used as *PrvClsRed*. *DS*-*PQE* generates a proof stating that C is redundant in ∃X[F] in a subspace y⃗* ⊆ y⃗. Then the plugging clause D falsified by y⃗* is generated. Importantly, y⃗* can be much shorter than y⃗. (A brief description of *DS*-*PQE* in the context of *EG*-*PQE*+ is given in [5].)

*Example 3.* Consider the example solved in Subsect. 5.1. That is, we consider taking the clause C1 out of ∃X[F(X,Y)] where F = C1 ∧ ··· ∧ C4, C1 = ¬x3 ∨ x4, C2 = y1 ∨ x3, C3 = y1 ∨ ¬x4, C4 = y2 ∨ x4, Y = {y1, y2} and X = {x3, x4}. Consider the step where *EG*-*PQE* proves redundancy of C1 in the subspace y⃗ = (y1 = 1, y2 = 1). *EG*-*PQE* shows that (y1 = 1, y2 = 1, x3 = 0) satisfies F, thus proving every clause of F (including C1) redundant in ∃X[F] in the subspace y⃗. Then *EG*-*PQE* generates the plugging clause D = ¬y1 ∨ ¬y2 falsified by y⃗.

In contrast to *EG*-*PQE*, *EG*-*PQE*+ calls *PrvClsRed* to produce a proof of redundancy for the clause C1 alone. Note that F has no clauses resolvable with C1 on x3 in the subspace y⃗* = (y1 = 1). (The clause C2 containing x3 is satisfied by y⃗*.) This means that C1 is blocked in the subspace y⃗* and hence redundant there (see Proposition 2). Since y⃗* ⊂ y⃗, *EG*-*PQE*+ produces a more general proof of redundancy than *EG*-*PQE*. To avoid re-examining the subspace y⃗*, *EG*-*PQE*+ generates the *shorter* plugging clause D = ¬y1.
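
The blocked-clause argument of Example 3 can be checked mechanically: in the subspace y1 = 1, drop the clauses of F \ {C1} satisfied by the subspace and look for one containing the literal x3 (the complement of C1's literal ¬x3). A minimal sketch, with the same DIMACS-style encoding as before:

```python
# Variables: 1 = y1, 2 = y2, 3 = x3, 4 = x4.
C1 = {-3, 4}
F = [C1, {1, 3}, {1, -4}, {2, 4}]
subspace = {1: True}                 # the partial assignment y1 = 1

def satisfied_by(clause, partial):
    return any(abs(l) in partial and (l > 0) == partial[abs(l)]
               for l in clause)

# Clauses of F \ {C1} still active in the subspace and resolvable with C1 on x3.
resolvable = [cl for cl in F
              if cl != C1
              and not satisfied_by(cl, subspace)   # drop satisfied clauses
              and 3 in cl]                         # positive literal x3
print(resolvable == [])                            # C1 is blocked at x3 in y1 = 1
```

The list comes out empty: C2, the only clause with the literal x3, is satisfied by y1 = 1, so C1 is blocked and hence redundant in that subspace.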

#### **6.2 Discussion**

Consider the PQE problem of taking a clause C out of ∃X[F(X,Y)]. There are two features of PQE that make it easier than QE. The first feature, mentioned earlier, is that one can ignore the subspaces y⃗ where F \ {C} implies C. The second feature is that when F_y⃗ is satisfiable, one only needs to prove redundancy of the clause C alone. Among the three algorithms we run in the experiments, namely *DS*-*PQE*, *EG*-*PQE* and *EG*-*PQE*+, only the latter exploits both features. (In addition to using *DS*-*PQE* inside *EG*-*PQE*+, we also run it as a stand-alone PQE solver.) *DS*-*PQE* does not use the first feature [1] and *EG*-*PQE* does not exploit the second one. As we show in Sects. 7 and 8, this affects the performance of *DS*-*PQE* and *EG*-*PQE*.

## **7 Experiment with FIFO Buffers**

In this and the next two sections we describe some experiments with *DS*-*PQE*, *EG*-*PQE* and *EG*-*PQE*+ (their sources are available at [13,14] and [15] respectively). We used Minisat2.0 [16] as an internal SAT solver. The experiments were run on a computer with a 1.6 GHz Intel Core i5-8265U CPU.

```
    if (write == 1 && currSize < n)
*     if (dataIn != Val) begin
        Data[wrPnt] = dataIn;
        wrPnt = wrPnt + 1;
      end
```

**Fig. 3.** A buggy fragment of Verilog code describing *Fifo*

In this section, we give an example of bug detection by invariant generation for a FIFO buffer. Our objective here is threefold. First, we want to give an example of a bug that can be overlooked by testing and by guessing the unwanted properties to check (see Subsect. 7.3). Second, we want to substantiate the intuition of Subsect. 3.4 that property generation by PQE (in our case, invariant generation by PQE) has the same reasons to be effective as testing. In particular, by taking out different clauses one generates invariants relating to different parts of the design. So, taking out a clause of the buggy part is likely to produce an unwanted invariant. Third, we want to give an example of an invariant that can be easily identified as unwanted<sup>2</sup>.

#### **7.1 Buffer Description**

Consider a FIFO buffer that we will refer to as *Fifo*. Let n be the number of elements of *Fifo* and *Data* denote the data buffer of *Fifo*. Let each *Data*[i], i = 1,...,n, have p bits and be an integer, where 0 ≤ *Data*[i] < 2^p. A fragment of the Verilog code describing *Fifo* is shown in Fig. 3. This fragment has a buggy line marked with an asterisk. In the correct version without the marked line, a new element *dataIn* is added to *Data* if the *write* flag is on and *Fifo* has less than n elements. Since *Data* can hold any combination of numbers, all *Data* states are supposed to be reachable. However, due to the bug, the number *Val* cannot appear in *Data*. (Here *Val* is some constant, 0 < *Val* < 2^p. We assume that the buffer elements are initialized to 0.) So, *Fifo* has an *op-state reachability bug* since it cannot reach operative states where an element of *Data* equals *Val*.
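
The effect of the buggy line is easy to reproduce in a toy Python model of the fragment (n, Val, and the write sequence are arbitrary choices of ours): writes of Val are silently dropped, so no reachable state of Data contains Val.

```python
from collections import deque

# Toy model of the buggy Fifo fragment of Fig. 3: the extra guard
# "dataIn != Val" drops every attempt to write Val into the buffer.
n, Val = 4, 3

def write(data, data_in):
    # mimics the guarded write, bug included
    if len(data) < n and data_in != Val:
        data.append(data_in)

data = deque()
for data_in in [1, 3, 2, 3, 5, 3, 7]:     # tries to write Val = 3 three times
    if len(data) == n:
        data.popleft()                     # make room, as a FIFO would
    write(data, data_in)

print(list(data))        # Val never appears in the buffer
print(Val in data)
```

This is exactly the unreachability that the unwanted invariant generated in Subsect. 7.2 asserts.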

<sup>2</sup> Let *P*(*S*ˆ) be an invariant for a circuit *N* depending only on a subset *S*ˆ of the state variables *S*. Identifying *P* as an unwanted invariant is much easier if *S*ˆ is meaningful from the high-level view of the design. Suppose, for instance, that assignments to *S*ˆ specify values of a high-level variable *v*. Then *P* is unwanted if it claims unreachability of a value of *v* that is supposed to be reachable. Another simple example is that assignments to *S*ˆ specify values of high-level variables *v* and *w* that are supposed to be *independent*. Then *P* is unwanted if it claims that some combinations of values of *v* and *w* are unreachable. (This may mean, for instance, that an assignment operator setting the value of *v* erroneously involves the variable *w*.)

#### **7.2 Bug Detection by Invariant Generation**

Let N be a circuit implementing *Fifo*. Let S be the set of state variables of N and *Sdata* ⊂ S be the subset corresponding to the data buffer *Data*. We used *DS*-*PQE*, *EG*-*PQE* and *EG*-*PQE*+ to generate invariants of N as described in Sect. 4. Note that an invariant Q depending only on *Sdata* is an **unwanted** one. If Q holds for N, some states of *Data* are unreachable. Then *Fifo* has an op-state reachability bug since every state of *Data* is supposed to be reachable. To generate invariants, we used the formula Fk = I(S0) ∧ T(S0,V0,S1) ∧ ··· ∧ T(Sk−1,Vk−1,Sk) introduced in Subsect. 4.2. Here I and T describe the initial state and the transition relation of N respectively, and Sj and Vj denote the state variables and combinational input variables of the j-th time frame respectively. First, we used a PQE solver to generate a local invariant H(Sk) obtained by taking a clause C out of ∃Xk[Fk], where Xk = S0 ∪ V0 ∪ ··· ∪ Sk−1 ∪ Vk−1. So, ∃Xk[Fk] ≡ H ∧ ∃Xk[Fk \ {C}]. (Since Fk ⇒ H, no state falsifying H can be reached in k transitions.) In the experiment, we took out only clauses of Fk containing an *unquantified variable*, i.e., a state variable of the k-th time frame. The time limit for solving the PQE problem of taking out a clause was set to 10 s.


**Table 1.** FIFO buffer with *n* elements of 32 bits. Time limit is 10 s per PQE problem

For each clause Q of every local invariant H generated by PQE, we checked if Q was a global invariant. Namely, we used a public version of *IC3* [17,18] to verify if the property Q held (by showing that no reachable state of N falsified Q). If so, and Q depended only on variables of *Sdata*, N had an *unwanted invariant*. Then we stopped invariant generation. The results of the experiment for buffers with 32-bit elements are given in Table 1. When picking a clause to take out, i.e., a clause with a state variable of the k-th time frame, one could make a good choice by pure luck. To address this issue, we picked clauses to take out *randomly*, performed 10 different runs of invariant generation, and computed the average value. So, columns four to twelve of Table 1 actually give the average value over 10 runs.

Let us use the first line of Table 1 to explain its structure. The first two columns show the number of elements in the *Fifo* implemented by N and the number of latches in N (8 and 300). The third column gives the number k of time frames (i.e., 5). The next three columns show the total number of PQE problems solved by a PQE solver before an unwanted invariant was generated. For instance, *EG-PQE*<sup>+</sup> found such an invariant after solving 8 problems. On the other hand, *DS-PQE* failed to find an unwanted invariant and had to solve *all* 1,236 PQE problems of taking out a clause of F<sub>k</sub> with an unquantified variable. The following three columns show the share of PQE problems *finished* within the time limit of 10 s. For instance, *EG-PQE* finished 36% of 311 problems. The next three columns show whether an unwanted invariant was generated by a PQE solver. (*EG-PQE* and *EG-PQE*<sup>+</sup> found one, whereas *DS-PQE* did not.) The last three columns give the total run time. Table 1 shows that only *EG-PQE*<sup>+</sup> managed to generate an unwanted invariant for all four instances of *Fifo*. This invariant asserted that *Fifo* cannot reach a state where an element of *Data* equals *Val*.

#### **7.3 Detection of the Bug by Conventional Methods**

The bug above (or its modified version) can be overlooked by conventional methods. Consider, for instance, testing. It is hard to detect this bug by *random* tests because it is exposed only if one tries to add *Val* to *Fifo*. The same applies to testing using the *line coverage* metric [19]. On the other hand, a test set with 100% *branch* coverage [19] will find this bug. (To invoke the *else* branch of the *if* statement marked with '\*' in Fig. 3, one must set *dataIn* to *Val*.) However, a slightly modified bug can be missed even by tests with 100% branch coverage [5].

Now consider manual generation of unwanted properties. It is virtually impossible to guess an unwanted *invariant* of *Fifo* exposing this bug unless one knows exactly what the bug is. However, one can detect this bug by checking a property asserting that the element *dataIn* must appear in the buffer if *Fifo* is ready to accept it. Note that this is a *non-invariant* property involving states of different time frames. The more time frames such a property spans, the more guesswork is required to pick it. Let us consider a modified bug. Suppose *Fifo* does not reject the element *Val*; then the non-invariant property above holds. However, if *dataIn* == *Val*, then *Fifo* overwrites the *previously* accepted element if that element was *Val* too. So, *Fifo* cannot have two consecutive elements *Val*. Our method will detect this bug by generating an unwanted invariant falsified by states with consecutive elements *Val*. One can also identify this bug by checking a property involving two consecutive elements of *Fifo*, but picking it requires a lot of guesswork, and so the modified bug can easily be overlooked.

### **8 Experiments with HWMCC Benchmarks**

In this section, we describe three experiments with 98 multi-property benchmarks of the HWMCC-13 set [20]. (We use this set because it has a multi-property track; see the explanation below.) The number of latches in those benchmarks ranges from 111 to 8,000. More details about the choice of benchmarks and the experiments can be found in [5]. Each benchmark consists of a sequential circuit N and invariants P<sub>0</sub>, …, P<sub>m</sub> to prove. As in Sect. 4, we call *P<sub>agg</sub>* = P<sub>0</sub> ∧ ··· ∧ P<sub>m</sub> the *aggregate invariant*. In experiments 2 and 3, we used PQE to generate new invariants of N. Since every invariant P implied by *P<sub>agg</sub>* is a desired one, a necessary condition for P to be *unwanted* is that *P<sub>agg</sub>* ⇒ P does *not* hold. The conjunction of many invariants P<sub>i</sub> produces a stronger invariant *P<sub>agg</sub>*, which makes it *harder* to generate P not implied by *P<sub>agg</sub>*. (This is the reason for using multi-property benchmarks in our experiments.) The circuits of the HWMCC-13 set are *anonymous*, so we could not know if an unreachable state is supposed to be reachable. For that reason, we just generated invariants not implied by *P<sub>agg</sub>* without deciding if some of them were unwanted.

Similarly to the experiment of Sect. 7, we used the formula F<sub>k</sub> = I(S<sub>0</sub>) ∧ T(S<sub>0</sub>, V<sub>0</sub>, S<sub>1</sub>) ∧ ··· ∧ T(S<sub>k−1</sub>, V<sub>k−1</sub>, S<sub>k</sub>) to generate invariants. The number k of time frames was in the range 2 ≤ k ≤ 10. As in the experiment of Sect. 7, we took out only clauses containing a state variable of the k-th time frame. In all experiments, the **time limit** for solving a PQE problem was set to 10 s.

#### **8.1 Experiment 1**

In the first experiment, we generated a *local invariant* H by taking a clause C out of ∃X<sub>k</sub>[F<sub>k</sub>], where X<sub>k</sub> = S<sub>0</sub> ∪ V<sub>0</sub> ∪ ··· ∪ S<sub>k−1</sub> ∪ V<sub>k−1</sub>. The formula H asserts that no state falsifying H can be reached in k transitions. Our goal was to show that PQE can find H for large formulas F<sub>k</sub> that have hundreds of thousands of clauses. We used *EG-PQE* to partition the PQE problems we tried into two groups. *The first group* consisted of 3,736 problems for which we ran *EG-PQE* with the time limit of 10 s and it never encountered a subspace **s**<sub>k</sub> where F<sub>k</sub> was satisfiable. Here **s**<sub>k</sub> is a full assignment to S<sub>k</sub>. Recall that only the variables S<sub>k</sub> are unquantified in ∃X<sub>k</sub>[F<sub>k</sub>]. So, in every subspace **s**<sub>k</sub>, the formula F<sub>k</sub> was either unsatisfiable or (F<sub>k</sub> \ {C}) ⇒ C. (The fact that so many problems meet the condition of the first group came as a big surprise.) *The second group* consisted of 3,094 problems where *EG-PQE* encountered subspaces where F<sub>k</sub> was satisfiable.

For the first group, *DS-PQE* finished only 30% of the problems within 10 s, whereas *EG-PQE* and *EG-PQE*<sup>+</sup> finished 88% and 89%, respectively. The poor performance of *DS-PQE* is due to its not checking whether (F<sub>k</sub> \ {C}) ⇒ C in the current subspace. For the second group, *DS-PQE*, *EG-PQE* and *EG-PQE*<sup>+</sup> finished 15%, 2% and 27% of the problems, respectively, within 10 s. *EG-PQE* finished far fewer problems because it used a satisfying assignment as a proof of redundancy of C (see Subsect. 6.2).

To contrast PQE and QE, we employed the high-quality tool *CADET* [21,22] to perform QE on the 98 formulas ∃X<sub>k</sub>[F<sub>k</sub>] (one formula per benchmark). That is, instead of taking a clause out of ∃X<sub>k</sub>[F<sub>k</sub>] by PQE, we applied *CADET* to perform full QE on this formula. (Performing QE on ∃X<sub>k</sub>[F<sub>k</sub>] produces a formula H(S<sub>k</sub>) specifying *all* states unreachable in k transitions.) *CADET* finished only 25% of the 98 QE problems within the time limit of 600 s. On the other hand, *EG-PQE*<sup>+</sup> finished 60% of the 6,830 problems of both groups (generated off ∃X<sub>k</sub>[F<sub>k</sub>]) within 10 s. So, PQE can be much easier than QE if only a small part of the formula gets unquantified.

#### **8.2 Experiment 2**

The second experiment was an extension of the first one. Its goal was to show that PQE can generate invariants for realistic designs. For each clause Q of a local invariant H generated by PQE, we used *IC3* to verify if Q was a global invariant. If so, we checked whether *P<sub>agg</sub>* ⇒ Q held. To make the experiment less time consuming, in addition to the time limit of 10 s per PQE problem, we imposed a few more constraints. The PQE problem of taking a clause out of ∃X<sub>k</sub>[F<sub>k</sub>] was terminated as soon as H accumulated 5 or more clauses. Besides, processing a benchmark was aborted when the total number of clauses of all formulas H generated for this benchmark reached 100 or the total run time of all PQE problems generated off ∃X<sub>k</sub>[F<sub>k</sub>] exceeded 2,000 s.

**Table 2.** Invariant generation


Table 2 shows the results of the experiment. The third column gives the number of local single-clause invariants (i.e., the total number of clauses in all H over all benchmarks). The fourth column shows how many local single-clause invariants turned out to be global. (Since global invariants were extracted from H and the total size of all H could not exceed 100, the number of global invariants per benchmark could not exceed 100.) The last column gives the number of global invariants not implied by *P<sub>agg</sub>*. So, these invariants are candidates for checking if they are unwanted. Table 2 shows that *EG-PQE* and *EG-PQE*<sup>+</sup> performed much better than *DS-PQE*.

#### **8.3 Experiment 3**

To prove an invariant P true, *IC3* conjoins it with clauses Q<sub>1</sub>, …, Q<sub>n</sub> to make P ∧ Q<sub>1</sub> ∧ ··· ∧ Q<sub>n</sub> inductive. If *IC3* succeeds, every Q<sub>i</sub> is an invariant. Moreover, Q<sub>i</sub> may be an *unwanted* invariant. The goal of the third experiment was to demonstrate that PQE and *IC3*, in general, produce different invariant clauses. The intuition here is twofold. First, *IC3* generates clauses Q<sub>i</sub> to prove a *predefined* invariant rather than to find an unwanted one. Second, the closer P is to being inductive, the fewer new invariant clauses are generated by *IC3*. Consider the circuit *N<sub>triv</sub>* that simply stays in the initial state **s**<sub>ini</sub> (Sect. 4). Any invariant satisfied by **s**<sub>ini</sub> is already *inductive* for *N<sub>triv</sub>*. So, *IC3* will not generate *a single new invariant clause*. On the other hand, if the correct circuit is supposed to leave the initial state, *N<sub>triv</sub>* has unwanted invariants that our method will find.
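The observation about *N<sub>triv</sub>* can be checked on a tiny state space. The sketch below is hypothetical (explicit state enumeration stands in for IC3's SAT queries): it verifies that any predicate holding in the initial state is inductive for a self-looping transition relation, so IC3 has no new clauses to contribute.

```python
from itertools import product

S_INI = (0, 0)
states = list(product((0, 1), repeat=2))

def T_triv(s, s2):             # N_triv: every state loops to itself
    return s2 == s

def is_inductive(P, T, states):
    """P is inductive iff P(s) and T(s, s') imply P(s')."""
    return all(P(s2) for s in states if P(s)
               for s2 in states if T(s, s2))

# "The circuit is always in its initial state" holds in s_ini, so it is
# inductive for N_triv -- an unwanted invariant if the design should move on.
P = lambda s: s == S_INI
print(is_inductive(P, T_triv, states))   # True
```

Replacing `T_triv` by a transition relation that actually leaves the initial state makes the same predicate non-inductive.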

In this experiment, we used *IC3* to generate P<sup>∗</sup><sub>agg</sub>, an *inductive* version of *P<sub>agg</sub>*. The experiment showed that in 88% of cases, an invariant clause generated by *EG-PQE*<sup>+</sup> and not implied by *P<sub>agg</sub>* was not implied by P<sup>∗</sup><sub>agg</sub> either. (More details about this experiment can be found in [5].)

### **9 Properties Mimicking Symbolic Simulation**

Let M(X, V, W) be a combinational circuit where X, V, W are its internal, input and output variables. In this section, we describe the generation of properties of M that mimic symbolic simulation [23]. Every such property Q(V) specifies a cube of tests that produce the same values for a given subset of the variables of W. We chose the generation of such properties because deciding if Q is an unwanted property is, in general, simple. The procedure for generating these properties is slightly different from the one presented in Sect. 3.

Let F(X, V, W) be a formula specifying M. Let B(W) be a clause. Let H(V) be a solution to the PQE problem of taking a clause C ∈ F out of ∃X∃W[F ∧ B]. That is, ∃X∃W[F ∧ B] ≡ H ∧ ∃X∃W[(F \ {C}) ∧ B]. Let Q(V) be a clause of H. Then M has the **property** that for every full assignment **v** to V falsifying Q, it produces an output **w** falsifying B (a proof of this fact can be found in [5]). Suppose, for instance, Q = v<sub>1</sub> ∨ ¬v<sub>10</sub> ∨ v<sub>30</sub> and B = w<sub>2</sub> ∨ ¬w<sub>40</sub>. Then for every **v** where v<sub>1</sub> = 0, v<sub>10</sub> = 1, v<sub>30</sub> = 0, the circuit M produces an output where w<sub>2</sub> = 0, w<sub>40</sub> = 1. Note that Q is implied by F ∧ B rather than by F alone. So, it is a property of M under the constraint B rather than of M alone. The property Q is **unwanted** if there is an input falsifying Q that *should not* produce an output falsifying B.
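A minimal sanity check of a property of this kind can be done on a hypothetical 2-input circuit (the clauses below are checked by enumeration, not extracted by PQE):

```python
from itertools import product

def M(v1, v2):                 # hypothetical circuit: single output w1 = v1 AND v2
    return (v1 & v2,)

def falsifies(clause, assign):
    """clause: list of (index, polarity); falsified iff every literal is 0."""
    return all(assign[i] != pol for i, pol in clause)

Q = [(0, 1)]                   # Q = v1  (falsified exactly when v1 = 0)
B = [(0, 1)]                   # B = w1  (falsified exactly when w1 = 0)

# The property: every input falsifying Q yields an output falsifying B.
ok = all(falsifies(B, M(*v))
         for v in product((0, 1), repeat=2) if falsifies(Q, v))
print(ok)                      # True: Q is a property of M w.r.t. B
```

Whether such a Q is unwanted depends on the specification: it is unwanted exactly when some input falsifying Q should *not* force the output to falsify B.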

To generate combinational circuits, we unfolded sequential circuits of the set of 98 benchmarks used in Sect. 8 for invariant generation. Let N be a sequential circuit. (We reuse the notation of Sect. 4.) Let M<sub>k</sub>(S<sub>0</sub>, V<sub>0</sub>, …, S<sub>k−1</sub>, V<sub>k−1</sub>, S<sub>k</sub>) denote the combinational circuit obtained by unfolding N for k time frames. Here S<sub>j</sub>, V<sub>j</sub> are the state and input variables of the j-th time frame, respectively. Let F<sub>k</sub> denote the formula I(S<sub>0</sub>) ∧ T(S<sub>0</sub>, V<sub>0</sub>, S<sub>1</sub>) ∧ ··· ∧ T(S<sub>k−1</sub>, V<sub>k−1</sub>, S<sub>k</sub>) describing the unfolding of N for k time frames. Note that F<sub>k</sub> specifies the circuit M<sub>k</sub> above under the input constraint I(S<sub>0</sub>). Let B(S<sub>k</sub>) be a clause. Let H(S<sub>0</sub>, V<sub>0</sub>, …, V<sub>k−1</sub>) be a solution to the PQE problem of taking a clause C ∈ F<sub>k</sub> out of the formula ∃S<sub>1,k</sub>[F<sub>k</sub> ∧ B], where S<sub>1,k</sub> = S<sub>1</sub> ∪ ··· ∪ S<sub>k</sub>. That is, ∃S<sub>1,k</sub>[F<sub>k</sub> ∧ B] ≡ H ∧ ∃S<sub>1,k</sub>[(F<sub>k</sub> \ {C}) ∧ B]. Let Q be a clause of H. Then for every assignment (**s**<sub>ini</sub>, **v**<sub>0</sub>, …, **v**<sub>k−1</sub>) falsifying Q, the circuit M<sub>k</sub> outputs **s**<sub>k</sub> falsifying B. (Here **s**<sub>ini</sub> is the initial state of N and **s**<sub>k</sub> is the state of the last time frame.)


**Table 3.** Property generation for combinational circuits

In the experiment, we used *DS-PQE*, *EG-PQE* and *EG-PQE*<sup>+</sup> to solve the 1,586 PQE problems described above. In Table 3, we give a sample of the results obtained by *EG-PQE*<sup>+</sup>. (More details about this experiment can be found in [5].) Below, we use the first line of Table 3 to explain its structure. The first column gives the benchmark name (6s326).

The next column shows that 6s326 has 3,342 latches. The third column gives the number of time frames used to produce a combinational circuit M<sub>k</sub> (here k = 20). The next column shows that the clause B introduced above consisted of 15 literals of variables from S<sub>k</sub>. (Here and below we still use the index k, assuming that k = 20.) The literals of B were generated *randomly*. When picking the length of B, we just tried to simulate the situation where one wants to set a particular *subset* of the output variables of M<sub>k</sub> to specified values. The next two columns give the size of the subcircuit M′<sub>k</sub> of M<sub>k</sub> that feeds the output variables present in B. When computing a property H, we took a clause out of the formula ∃S<sub>1,k</sub>[F′<sub>k</sub> ∧ B], where F′<sub>k</sub> specifies M′<sub>k</sub>, instead of the formula ∃S<sub>1,k</sub>[F<sub>k</sub> ∧ B], where F<sub>k</sub> specifies M<sub>k</sub>. (The logic of M<sub>k</sub> not feeding a variable of B is irrelevant for computing H.) The first column of the pair gives the number of gates in M′<sub>k</sub> (i.e., 348,479). The second column provides the number of input variables feeding M′<sub>k</sub> (i.e., 1,774). Here we count only variables of V<sub>0</sub> ∪ ··· ∪ V<sub>k−1</sub> and ignore those of S<sub>0</sub>, since the latter are already assigned values specifying the initial state **s**<sub>ini</sub> of N.

The next four columns show the results of taking a clause out of ∃S<sub>1,k</sub>[F′<sub>k</sub> ∧ B]. For each PQE problem, the time limit was set to 10 s. Besides, *EG-PQE*<sup>+</sup> terminated as soon as 5 clauses of the property H(S<sub>0</sub>, V<sub>0</sub>, …, V<sub>k−1</sub>) were generated. The first three of these four columns give the minimum and maximum sizes of the clauses in H and the run time of *EG-PQE*<sup>+</sup>. So, it took *EG-PQE*<sup>+</sup> 2.9 s to produce a formula H containing clauses of sizes from 27 to 28 variables. A clause Q of H with 27 variables, for instance, specifies 2<sup>1747</sup> tests falsifying Q that produce the same output of M′<sub>k</sub> (falsifying the clause B). Here 1747 = 1774 − 27 is the number of input variables of M′<sub>k</sub> not present in Q. The last column shows that at least one clause Q of H specifies a property that cannot be produced by 3-valued simulation (a version of symbolic simulation [23]). To prove this, one just needs to set the input variables of M′<sub>k</sub> present in Q to the values falsifying Q and run 3-valued simulation. (The remaining input variables of M′<sub>k</sub> are assigned a don't-care value.) If after 3-valued simulation some output variable of M′<sub>k</sub> is assigned a don't-care value, the property specified by Q cannot be produced by 3-valued simulation.
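The 3-valued simulation check described above can be sketched as follows (the netlist is hypothetical; the gate functions extend AND/OR/NOT with a don't-care value `'X'`):

```python
def and3(a, b):                # 3-valued AND: 0 dominates, X propagates
    if a == 0 or b == 0: return 0
    if a == 1 and b == 1: return 1
    return 'X'

def or3(a, b):                 # 3-valued OR: 1 dominates, X propagates
    if a == 1 or b == 1: return 1
    if a == 0 and b == 0: return 0
    return 'X'

def not3(a):
    return 'X' if a == 'X' else 1 - a

# Hypothetical netlist: w = (v1 AND v2) OR (NOT v1). Inputs fixed by Q get
# Boolean values; the remaining inputs are don't-cares ('X').
def simulate(v1, v2):
    return or3(and3(v1, v2), not3(v1))

print(simulate(0, 'X'))   # 1: the output is determined despite the don't-care
print(simulate(1, 'X'))   # X: 3-valued simulation cannot certify this case
```

If the simulated output is `'X'` while the property Q pins it to a Boolean value, Q cannot be produced by 3-valued simulation, which is exactly the test applied in the last column of Table 3.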

Running *DS-PQE*, *EG-PQE* and *EG-PQE*<sup>+</sup> on the 1,586 PQE problems mentioned above showed that a) *EG-PQE* performed poorly, producing properties for only 28% of the problems; b) *DS-PQE* and *EG-PQE*<sup>+</sup> showed much better results, generating properties for 62% and 66% of the problems, respectively. When *DS-PQE* and *EG-PQE*<sup>+</sup> succeeded in producing properties, the latter could not be obtained by 3-valued simulation in 74% and 78% of the cases, respectively.

### **10 Some Background**

In this section, we discuss some research relevant to PQE and property generation. Information on BDD-based QE can be found in [24,25]. SAT-based QE is described in [12,21,26–32]. Our first PQE solver, called *DS-PQE*, was introduced in [1]. It was based on the redundancy-based reasoning presented in [33] in terms of variables and in [34] in terms of clauses. The main flaw of *DS-PQE* is as follows. Consider taking a clause C out of ∃X[F]. Suppose *DS-PQE* proved C redundant in a subspace where F is *satisfiable* and some *quantified* variables are assigned. The problem is that *DS-PQE* cannot simply assume that C is redundant every time it re-enters this subspace [35]. The root of the problem is that redundancy is a *structural* rather than a semantic property. That is, redundancy of a clause in a formula ξ (quantified or not) does not imply such redundancy in every formula logically equivalent to ξ. Since our current implementation of *EG-PQE*<sup>+</sup> uses *DS-PQE* as a subroutine, it has the same learning problem. We showed in [36] that this problem can be addressed by the machinery of certificate clauses. So, the performance of PQE can be drastically improved via enhanced learning in subspaces where F is satisfiable.

We are unaware of research on property generation for combinational circuits. As for invariants, existing procedures typically generate auxiliary *desired* invariants to prove a predefined property (whereas our goal is to generate invariants that are *unwanted*). For instance, they generate loop invariants [37] or invariants relating internal points of circuits checked for equivalence [38]. Another example of auxiliary invariants is the clauses generated by *IC3* to make an invariant inductive [17]. As we showed in Subsect. 8.3, the invariants produced by PQE are, in general, different from those built by *IC3*.

### **11 Conclusions and Directions for Future Research**

We consider Partial Quantifier Elimination (PQE) on propositional CNF formulas with existential quantifiers. In contrast to *complete* quantifier elimination, PQE allows unquantifying a *part* of the formula. We show that PQE can be used to generate properties of combinational and sequential circuits. The goal of property generation is to check if a design has an *unwanted* property and is thus buggy. We used PQE to generate an unwanted invariant for a FIFO buffer, exposing a non-trivial bug. We also applied PQE to invariant generation for HWMCC benchmarks. Finally, we used PQE to generate properties of combinational circuits mimicking symbolic simulation. Our experiments show that PQE can efficiently generate properties for realistic designs.

There are at least three directions for future research. The first direction is to improve the performance of PQE solving. As we mentioned in Sect. 10, the most promising idea here is to enhance the power of learning in subspaces where the formula is satisfiable. The second direction is to use the improved PQE solvers to design new, more efficient algorithms for well-known problems like SAT, model checking and equivalence checking. The third direction is to look for new problems that can be solved by PQE.

#### **References**

1. Goldberg, E., Manolios, P.: Partial quantifier elimination. In: Yahav, E. (ed.) HVC 2014. LNCS, vol. 8855, pp. 148–164. Springer, Cham (2014). https://doi.org/10. 1007/978-3-319-13338-6 12


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Rounding Meets Approximate Model Counting

Jiong Yang(B) and Kuldeep S. Meel

National University of Singapore, Singapore, Singapore jiong@comp.nus.edu.sg

Abstract. The problem of model counting, also known as #SAT, is to compute the number of models or satisfying assignments of a given Boolean formula F. Model counting is a fundamental problem in computer science with a wide range of applications. In recent years, there has been a growing interest in using hashing-based techniques for approximate model counting that provide (ε, δ)-guarantees: i.e., the count returned is within a (1 + <sup>ε</sup>)-factor of the exact count with confidence at least 1 <sup>−</sup> <sup>δ</sup>. While hashing-based techniques attain reasonable scalability for large enough values of δ, their scalability is severely impacted for smaller values of δ, thereby preventing their adoption in application domains that require estimates with high confidence.

The primary contribution of this paper is to address the Achilles heel of hashing-based techniques: we propose a novel approach based on *rounding* that allows us to achieve a significant reduction in runtime for smaller values of δ. The resulting counter, called ApproxMC6 (available open source at https://github.com/meelgroup/approxmc), achieves a substantial runtime performance improvement over the current state-of-the-art counter, ApproxMC. In particular, our extensive evaluation over a benchmark suite of 1890 instances shows that ApproxMC6 solves 204 more instances than ApproxMC and achieves a 4<sup>×</sup> speedup over ApproxMC.

### 1 Introduction

Given a Boolean formula F, the problem of model counting is to compute the number of models of F. Model counting is a fundamental problem in computer science with a wide range of applications, such as control improvisation [13], network reliability [9,28], neural network verification [2], probabilistic reasoning [5,11,20,21], and the like. In addition to myriad applications, the problem of model counting is a fundamental problem in theoretical computer science. In his seminal paper, Valiant showed that #SAT is #P-complete, where #<sup>P</sup> is the set of counting problems whose decision versions lie in NP [28]. Subsequently, Toda demonstrated the theoretical hardness of the problem by showing that every problem in the entire polynomial hierarchy can be solved by just one call to a #<sup>P</sup> oracle; more formally, PH <sup>⊆</sup> <sup>P</sup>#<sup>P</sup> [27].

Given the computational intractability of #SAT, there has been sustained interest in the development of approximate techniques from theoreticians and practitioners alike. Stockmeyer introduced a randomized hashing-based technique that provides (ε, δ)-guarantees (formally defined in Sect. 2) given access to an NP oracle [25]. Given the lack of practical solvers that could handle problems in NP satisfactorily, there were no practical implementations of Stockmeyer's hashing-based technique until the 2000s [14]. Building on the unprecedented advancements in the development of SAT solvers, Chakraborty, Meel, and Vardi extended Stockmeyer's framework to a scalable (ε, δ)-counting algorithm, ApproxMC [7]. The subsequent years have witnessed sustained interest in further optimizations of hashing-based techniques for approximate counting [5,6,10,11,17–19,23,29,30]. The current state-of-the-art technique for approximate counting is the hashing-based framework ApproxMC, which is in its fourth version, called ApproxMC4 [22,24].

The core theoretical idea behind the hashing-based framework is to use 2-universal hash functions to partition the solution space, denoted sol(F) for a formula F, into *roughly equal small* cells, wherein a cell is considered *small* if it contains at most thresh solutions, where thresh is a pre-computed threshold. An NP oracle (in practice, a SAT solver) is employed to check if a cell is small by enumerating solutions one by one until either there are no more solutions or thresh + 1 solutions have been enumerated. Then, we randomly pick a cell, enumerate the solutions within the cell (if the cell is small), and scale the obtained count by the number of cells to obtain an estimate of |sol(F)|. To amplify the confidence, we rely on the standard *median technique*: repeat the above process, called ApproxMCCore, multiple times and return the median. Computing the median amplifies the confidence since, for the median of t repetitions to be outside the desired range [|sol(F)|/(1 + ε), (1 + ε)·|sol(F)|], at least half of the repetitions of ApproxMCCore must return a wrong estimate.
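This core procedure can be sketched on a toy scale (assumptions: brute-force enumeration replaces the SAT oracle, and random XOR constraints realize the hash family; this is an illustration, not ApproxMC's actual implementation):

```python
import random
from itertools import product

def xor_hash(n, m, rng):
    """Random h: {0,1}^n -> {0,1}^m; bit i is a_i0 XOR (a_i . x) over GF(2)."""
    rows = [[rng.randrange(2) for _ in range(n + 1)] for _ in range(m)]
    return lambda x: tuple(
        (r[0] + sum(r[j + 1] & x[j] for j in range(n))) % 2 for r in rows)

def approx_count_core(models, n, m, thresh, rng):
    h = xor_hash(n, m, rng)
    alpha = tuple(rng.randrange(2) for _ in range(m))
    cell = [x for x in models if h(x) == alpha]   # the randomly chosen cell
    if len(cell) <= thresh:                       # the cell is "small"
        return len(cell) * 2 ** m                 # scale by the number of cells
    return None                                   # thresh exceeded: no estimate

n = 6
models = [x for x in product((0, 1), repeat=n) if sum(x) % 2 == 0]  # |sol(F)| = 32
rng = random.Random(0)
estimates = [e for e in (approx_count_core(models, n, 2, 16, rng)
                         for _ in range(9)) if e is not None]
median = sorted(estimates)[len(estimates) // 2] if estimates else None
print(median)          # median of the per-invocation estimates of |sol(F)|
```

Each successful invocation returns |cell| · 2<sup>m</sup>; taking the median over repeated invocations is what amplifies the confidence.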

In practice, every subsequent repetition of ApproxMCCore takes a similar amount of time, and the overall runtime increases linearly with the number of invocations. The number of repetitions depends logarithmically on δ<sup>−1</sup>. As a particular example, for ε = 0.8, the number of repetitions of ApproxMCCore needed to attain δ = 0.1 is 21, which increases to 117 for δ = 0.001: a significant increase in the number of repetitions (and, accordingly, the time taken). It is therefore no surprise that empirical analyses of tools such as ApproxMC have been presented with a high δ (such as δ = 0.1). On the other hand, for several applications, such as network reliability and quantitative verification, end users desire estimates with high confidence. Therefore, the design of efficient counting techniques for small δ is a major challenge that needs to be addressed to enable the adoption of approximate counting in practice.
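The dependence of the repetition count on δ can be illustrated with a small computation (the per-invocation error bounds `p` below are illustrative values, not the constants used by ApproxMC or ApproxMC6):

```python
from math import comb, ceil

def eta(t, m, p):
    """Pr[Binomial(t, p) >= m]: chance that at least m of t invocations err."""
    return sum(comb(t, k) * p ** k * (1 - p) ** (t - k) for k in range(m, t + 1))

def repetitions(p, delta):
    """Smallest odd t whose median-of-t error eta(t, ceil(t/2), p) <= delta."""
    t = 1
    while eta(t, ceil(t / 2), p) > delta:
        t += 2                 # keep t odd so the median is well defined
    return t

# A smaller per-invocation error bound p (the point of the rounding
# technique) translates into far fewer repetitions for the same delta.
for p in (0.3, 0.1):
    print(p, repetitions(p, 0.1), repetitions(p, 0.001))
```

The repetition count grows only logarithmically in 1/δ, but its constant factor blows up as p approaches 0.5, which is why tightening the per-invocation bound pays off so much.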

The primary contribution of our work is to address the above challenge. We introduce a new technique called *rounding* that enables dramatic reductions in the number of repetitions required to attain a desired confidence. The core technical idea behind the design of the *rounding* technique is based on the following observation: let L (resp. U) refer to the event that a given invocation of ApproxMCCore under- (resp. over-) estimates |sol(F)|. For a median estimate to be wrong, either the event L or the event U must happen in at least half of the invocations of ApproxMCCore. The number of repetitions depends on max{Pr[L], Pr[U]}. The current algorithmic design (and ensuing analysis) of ApproxMCCore provides a weak upper bound on max{Pr[L], Pr[U]}: in particular, the bounds on max{Pr[L], Pr[U]} and Pr[L ∪ U] are almost identical. Our key technical contribution is to design a new procedure, ApproxMC6Core, based on the rounding technique that allows us to obtain significantly better bounds on max{Pr[L], Pr[U]}.

The resulting algorithm, called ApproxMC6, follows a structure similar to that of ApproxMC: it repeatedly invokes the underlying core procedure ApproxMC6Core and returns the median of the estimates. Since a single invocation of ApproxMC6Core takes as much time as ApproxMCCore, the reduction in the number of repetitions is primarily responsible for the ensuing speedup. As an example, for ε = 0.8, the number of repetitions of ApproxMC6Core needed to attain δ = 0.1 and δ = 0.001 is just 5 and 19, respectively; the corresponding numbers for ApproxMC were 21 and 117. An extensive experimental evaluation on 1890 benchmarks shows that the rounding technique provides a 4<sup>×</sup> speedup over the state-of-the-art approximate model counter, ApproxMC. Furthermore, for a given timeout of 5000 s, ApproxMC6 solves 204 more instances than ApproxMC and achieves a reduction of 1063 s in the PAR-2 score.

The rest of the paper is organized as follows. We introduce notation and preliminaries in Sect. 2. To place our contribution in context, we review related works in Sect. 3. We identify the weakness of the current technique in Sect. 4 and present the rounding technique in Sect. 5 to address this issue. Then, we present our experimental evaluation in Sect. 6. Finally, we conclude in Sect. 7.

### 2 Notation and Preliminaries

Let <sup>F</sup> be a Boolean formula in conjunctive normal form (CNF), and let Vars(F) be the set of variables appearing in <sup>F</sup>. The set Vars(F) is also called the *support* of <sup>F</sup>. An assignment <sup>σ</sup> of truth values to the variables in Vars(F) is called a *satisfying assignment* or *witness* of F if it makes F evaluate to true. We denote the set of all witnesses of <sup>F</sup> by sol(F). Throughout the paper, we will use <sup>n</sup> to denote <sup>|</sup>Vars(F)|.

The *propositional model counting problem* is to compute |sol(F)| for a given CNF formula F. A *probably approximately correct* (or PAC) counter is a probabilistic algorithm ApproxCount(·, ·, ·) that takes as inputs a formula F, a tolerance parameter ε > 0, and a confidence parameter δ ∈ (0, 1], and returns an (ε, δ)-estimate c, i.e., Pr[ |sol(F)|/(1 + ε) ≤ c ≤ (1 + ε)·|sol(F)| ] ≥ 1 − δ. PAC guarantees are also sometimes referred to as (ε, δ)-guarantees.

A closely related notion is projected model counting, where we are interested in computing the cardinality of sol(F) projected on a subset of variables P ⊆ Vars(F). While for clarity of exposition, we describe our algorithm in the context of model counting, the techniques developed in this paper are applicable to projected model counting as well. Our empirical evaluation indeed considers such benchmarks.

#### 2.1 Universal Hash Functions

Let n, m ∈ N and H(n, m) = {h : {0, 1}<sup>n</sup> → {0, 1}<sup>m</sup>} be a family of hash functions mapping {0, 1}<sup>n</sup> to {0, 1}<sup>m</sup>. We use h ←<sub>R</sub> H(n, m) to denote the probability space obtained by choosing a function h uniformly at random from H(n, m). To measure the quality of a hash function, we are interested in the set of elements of sol(F) mapped to α by h, denoted Cell<sub>F,h,α</sub>, and its cardinality |Cell<sub>F,h,α</sub>|. We write Pr[Z : Ω] to denote the probability of outcome Z when sampling from a probability space Ω. For brevity, we omit Ω when it is clear from the context. The expected value of Z is denoted E[Z] and its variance is denoted σ<sup>2</sup>[Z].

Definition 1. *A family of hash functions* H(n, m) *is strongly 2-universal if for all* x, y ∈ {0, 1}<sup>n</sup> *with* x ≠ y*, all* α ∈ {0, 1}<sup>m</sup>*, and* h ←<sub>R</sub> H(n, m)*,*

$$\Pr\left[h(x) = \alpha\right] = \frac{1}{2^m} = \Pr\left[h(x) = h(y)\right]$$

For h ←<sub>R</sub> H(n, n) and every m ∈ {1, ..., n}, the m-th prefix-slice of h, denoted h<sup>(m)</sup>, is a map from {0, 1}<sup>n</sup> to {0, 1}<sup>m</sup> such that h<sup>(m)</sup>(y)[i] = h(y)[i] for all y ∈ {0, 1}<sup>n</sup> and all i ∈ {1, ..., m}. Similarly, the m-th prefix-slice of α ∈ {0, 1}<sup>n</sup>, denoted α<sup>(m)</sup>, is the element of {0, 1}<sup>m</sup> such that α<sup>(m)</sup>[i] = α[i] for all i ∈ {1, ..., m}. To avoid cumbersome terminology, we abuse notation and write Cell<sub>F,m</sub> (resp. Cnt<sub>F,m</sub>) as shorthand for Cell<sub>F,h<sup>(m)</sup>,α<sup>(m)</sup></sub> (resp. |Cell<sub>F,h<sup>(m)</sup>,α<sup>(m)</sup></sub>|). The following proposition presents two results that are used frequently throughout this paper. The proof is deferred to Appendix A.
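A small sketch of prefix-sliced hashes (a hypothetical XOR-based realization of H(n, n); F is taken to be the trivially true formula, so sol(F) = {0, 1}<sup>n</sup>):

```python
import random
from itertools import product

def make_h(n, rng):
    """Random XOR hash h: {0,1}^n -> {0,1}^n (a stand-in for H(n, n))."""
    rows = [[rng.randrange(2) for _ in range(n + 1)] for _ in range(n)]
    return lambda x: tuple(
        (r[0] + sum(b & xi for b, xi in zip(r[1:], x))) % 2 for r in rows)

n = 4
rng = random.Random(1)
h = make_h(n, rng)
alpha = tuple(rng.randrange(2) for _ in range(n))
models = list(product((0, 1), repeat=n))   # sol(F) for the trivially true F

def cell(m):
    """Cell_<F,m>: models whose m-bit hash prefix matches alpha's prefix."""
    return {x for x in models if h(x)[:m] == alpha[:m]}

sizes = [len(cell(m)) for m in range(1, n + 1)]
print(sizes)    # non-increasing: each extra prefix bit can only shrink the cell
```

The monotonicity Cell<sub>F,m+1</sub> ⊆ Cell<sub>F,m</sub> holds by construction here, since matching an (m+1)-bit prefix implies matching the m-bit prefix.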

Proposition 1. *For every 1 ≤ m ≤ n, the following holds:*

$$\mathbb{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right] = \frac{|\mathsf{sol}(\mathsf{F})|}{2^m} \tag{1}$$

$$\sigma^2\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right] \le \mathsf{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right] \tag{2}$$

The use of prefix-slices of h ensures monotonicity of the random variable Cnt⟨F,m⟩, since from the definition of prefix-slices we have, for every 1 ≤ m < n, that h^(m+1)(y) = α^(m+1) ⇒ h^(m)(y) = α^(m). Formally,

Proposition 2. *For every 1 ≤ m < n, Cell⟨F,m+1⟩ ⊆ Cell⟨F,m⟩.*
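The definitions above can be exercised on a toy example. The sketch below (ours, not part of the paper's artifact) samples h from the standard strongly 2-universal family of random affine maps h(y) = Ay ⊕ b over GF(2), checks the prefix-slice containment of Proposition 2, and empirically confirms the expectation in Proposition 1; all function names are ours.

```python
import random

def sample_hash(n, rng):
    """Sample h: {0,1}^n -> {0,1}^n from the affine family h(y) = Ay xor b over
    GF(2), which is strongly 2-universal. Bit vectors are n-bit integers."""
    rows = [rng.getrandbits(n) for _ in range(n)]  # rows of a random matrix A
    b = rng.getrandbits(n)
    def h(y):
        # h(y) as a list of bits; the prefix-slice h^(m)(y) is h(y)[:m]
        return [(bin(rows[i] & y).count("1") + ((b >> i) & 1)) % 2 for i in range(n)]
    return h

def cell(sol, h, alpha_bits, m):
    """Cell_<F,m>: solutions whose m-th prefix-slice under h matches alpha's."""
    return {y for y in sol if h(y)[:m] == alpha_bits[:m]}

n = 6
sol = set(range(40))  # a toy solution set encoded as 6-bit integers
rng = random.Random(1)

# Proposition 2: Cell_<F,m+1> is contained in Cell_<F,m>.
h = sample_hash(n, rng)
alpha = [rng.randint(0, 1) for _ in range(n)]
cells = [cell(sol, h, alpha, m) for m in range(1, n + 1)]
assert all(small <= big for small, big in zip(cells[1:], cells[:-1]))

# Proposition 1: E[Cnt_<F,m>] = |sol(F)| / 2^m (checked empirically for m = 3).
m, trials = 3, 300
avg = sum(len(cell(sol, sample_hash(n, rng), alpha, m)) for _ in range(trials)) / trials
assert 4.0 < avg < 6.0  # expected value is 40 / 2^3 = 5
```

Because the family is pairwise independent, Proposition 1's variance bound also guarantees that the empirical average above concentrates quickly around |sol(F)|/2^m.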

#### 2.2 Helpful Combinatorial Inequality

Lemma 1. *Let $\eta(t, m, p) = \sum\_{k=m}^{t} \binom{t}{k} p^k (1-p)^{t-k}$ and $p < 0.5$; then*

$$\eta(t, \lceil t/2 \rceil, p) \in \Theta\left(t^{-\frac{1}{2}} \left(2\sqrt{p(1-p)}\right)^t\right)$$

*Proof.* We derive an upper and a matching lower bound for $\eta(t, \lceil t/2 \rceil, p)$. For the upper bound, since $p < 0.5$, the terms of the sum decay geometrically with ratio $\frac{p}{1-p} < 1$:

$$\eta(t, \lceil t/2 \rceil, p) = \sum\_{k=\lceil t/2 \rceil}^{t} \binom{t}{k} p^k (1-p)^{t-k} \le \binom{t}{\lceil t/2 \rceil} \sum\_{k=\lceil t/2 \rceil}^{t} p^k (1-p)^{t-k} \le \binom{t}{\lceil t/2 \rceil} \cdot \left(p(1-p)\right)^{\frac{t}{2}} \cdot \frac{1}{1-2p}$$

By Stirling's approximation, $\binom{t}{\lceil t/2 \rceil} \in \Theta\left(t^{-\frac{1}{2}} 2^t\right)$, and therefore $\eta(t, \lceil t/2 \rceil, p) \in O\left(t^{-\frac{1}{2}} \left(2\sqrt{p(1-p)}\right)^t\right)$. For the matching lower bound, we keep only the first term of the sum:

$$\eta(t, \lceil t/2 \rceil, p) = \sum\_{k=\lceil t/2 \rceil}^{t} \binom{t}{k} p^k (1-p)^{t-k} \ge \binom{t}{\lceil t/2 \rceil} p^{\lceil t/2 \rceil} (1-p)^{t - \lceil t/2 \rceil}$$

Applying Stirling's approximation again yields $\eta(t, \lceil t/2 \rceil, p) \in \Omega\left(t^{-\frac{1}{2}} \left(2\sqrt{p(1-p)}\right)^t\right)$. Combining the two bounds, we conclude that $\eta(t, \lceil t/2 \rceil, p) \in \Theta\left(t^{-\frac{1}{2}} \left(2\sqrt{p(1-p)}\right)^t\right)$. $\square$
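Lemma 1 can also be checked numerically. The snippet below (ours) computes η exactly and compares it against the Θ-envelope; under the lemma, the ratio of the two remains bounded between constants as t grows.

```python
from math import comb, sqrt

def eta(t, m, p):
    """eta(t, m, p) = sum_{k=m}^{t} C(t, k) p^k (1-p)^(t-k)."""
    return sum(comb(t, k) * p**k * (1 - p) ** (t - k) for k in range(m, t + 1))

def envelope(t, p):
    """The asymptotic envelope t^(-1/2) * (2*sqrt(p(1-p)))^t from Lemma 1."""
    return t ** -0.5 * (2 * sqrt(p * (1 - p))) ** t

p = 0.3
ratios = [eta(t, (t + 1) // 2, p) / envelope(t, p) for t in range(5, 200, 2)]
# The ratio stabilizes to a constant, so eta is indeed Theta of the envelope.
assert max(ratios) < 10 * min(ratios)
```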

### 3 Related Work

The seminal work of Valiant established that #SAT is #P-complete [28]. Toda later showed that every problem in the polynomial hierarchy can be solved by just a polynomial number of calls to a #P oracle [27]. Building on Carter and Wegman's seminal work on universal hash functions [4], Stockmeyer proposed a probabilistic polynomial-time procedure, with access to an NP oracle, to obtain an (ε, δ)-approximation of |sol(F)| [25].

Building on Stockmeyer's work, the core theoretical idea behind the hashing-based approximate solution counting framework, as presented in Algorithm 1 (ApproxMC [7]), is to use 2-universal hash functions to partition the solution space (denoted sol(F) for a given formula F) into *small* cells of *roughly equal* size. A cell is considered *small* if the number of solutions it contains is less than or equal to a pre-determined threshold, thresh. An NP oracle is used to determine whether a cell is small by iteratively enumerating its solutions until either there are no more solutions or thresh + 1 solutions have been found. In practice, a SAT solver implements the NP oracle. To ensure a polynomial number of calls to the oracle, the threshold thresh is set to be polynomial in the input parameter ε at Line 1. The subroutine ApproxMCCore takes the formula F and thresh as inputs and estimates the number of solutions at Line 7. To determine the appropriate number of cells, i.e., the value of m for H(n, m), ApproxMCCore uses a search procedure at Line 3 of Algorithm 2. The estimate is calculated as the number of solutions in a randomly chosen cell, scaled by the number of cells, i.e., 2^m, at Line 5. To improve confidence in the estimate, ApproxMC performs multiple runs of the ApproxMCCore subroutine at Lines 5–9 of Algorithm 1. The final count is computed as the median of the estimates at Line 10.

Algorithm 1. ApproxMC(F, ε, δ)

```
1: thresh ← 9.84 (1 + ε/(1+ε)) (1 + 1/ε)^2;
2: Y ← BoundedSAT(F, thresh);
3: if (|Y| < thresh) then return |Y|;
4: t ← ⌈17 log2(3/δ)⌉; C ← emptyList; iter ← 0;
5: repeat
6:   iter ← iter + 1;
7:   nSols ← ApproxMCCore(F, thresh);
8:   AddToList(C, nSols);
9: until (iter ≥ t);
10: finalEstimate ← FindMedian(C);
11: return finalEstimate;
```

## Algorithm 2. ApproxMCCore(F, thresh)

```
1: Choose h at random from H(n, n);
2: Choose α at random from {0, 1}^n;
3: m ← LogSATSearch(F, h, α, thresh);
4: Cnt⟨F,m⟩ ← BoundedSAT(F ∧ (h^(m))^(-1)(α^(m)), thresh);
5: return (2^m × Cnt⟨F,m⟩);
```
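To make Algorithms 1–2 concrete, here is a toy Python re-implementation (ours, not the authors' artifact). The solution set is enumerated explicitly, a brute-force scan plays the role of the BoundedSAT oracle, the hash family is the usual random XOR (affine GF(2)) construction, and a plain linear search stands in for LogSATSearch; all names are ours.

```python
import math
import random

def bounded_count(sol, pred, thresh):
    """BoundedSAT stand-in: count solutions satisfying pred, stopping at thresh + 1."""
    found = 0
    for y in sol:
        if pred(y):
            found += 1
            if found > thresh:
                break
    return found

def approxmc_core(sol, n, thresh, rng):
    """One round of ApproxMCCore with a random affine XOR hash h(y) = Ay xor b."""
    rows = [rng.getrandbits(n) for _ in range(n)]  # rows of a random GF(2) matrix A
    b, alpha = rng.getrandbits(n), rng.getrandbits(n)

    def in_cell(y, m):  # does h^(m)(y) equal alpha^(m)?
        return all(
            (bin(rows[i] & y).count("1") + ((b >> i) & 1)) % 2 == ((alpha >> i) & 1)
            for i in range(m)
        )

    cnt = thresh + 1
    for m in range(1, n + 1):  # plain linear search stands in for LogSATSearch
        cnt = bounded_count(sol, lambda y: in_cell(y, m), thresh)
        if cnt < thresh:
            return 2 ** m * cnt  # solutions in the cell, scaled by cell count
    return 2 ** n * cnt

def approxmc(sol, n, eps, delta, seed=0):
    thresh = math.ceil(9.84 * (1 + eps / (1 + eps)) * (1 + 1 / eps) ** 2)
    if len(sol) < thresh:  # base case: small solution sets are counted exactly
        return len(sol)
    t = 17 * math.ceil(math.log2(3 / delta)) | 1  # odd number of repetitions
    rng = random.Random(seed)
    estimates = sorted(approxmc_core(sol, n, thresh, rng) for _ in range(t))
    return estimates[t // 2]  # median

n, true_count = 9, 300
sol = list(range(true_count))  # solutions encoded as n-bit integers
est = approxmc(sol, n, eps=0.8, delta=0.2, seed=1)
assert true_count / 4 <= est <= 4 * true_count
```

Note that the base case mirrors Line 3 of Algorithm 1: when BoundedSAT already enumerates fewer than thresh solutions, the exact count is returned.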

In the second version of ApproxMC [8], two key algorithmic improvements were proposed to improve practical performance by reducing the number of calls to the SAT solver. The first is a galloping search to find the correct number of cells more efficiently, i.e., LogSATSearch at Line 3 of Algorithm 2. The second is a linear search over a small interval around the previous value of m before resorting to the galloping search. The third and fourth versions [22,23] further enhance the algorithm's performance by dealing effectively with CNF formulas conjoined with XOR constraints, which are commonly used in the hashing-based counting framework. Moreover, an effective preprocessor named Arjun [24] enhances ApproxMC's performance by constructing shorter XOR constraints. The combination of Arjun and ApproxMC4 solved almost all existing benchmarks [24], making it the current state of the art in this field.

In this work, we aim to address the main limitation of the ApproxMC algorithm, focusing on an aspect untouched by the developments above: we improve the core algorithm of ApproxMC itself, which has remained unchanged.

## 4 Weakness of **ApproxMC**

As noted above, the core algorithm of ApproxMC has not changed since 2016, and in this work, we aim to address its core limitation. To put our contribution in context, we first review ApproxMC and its core algorithm, ApproxMCCore. We present the pseudocode of ApproxMC and ApproxMCCore in Algorithms 1 and 2, respectively. ApproxMCCore may return an estimate that falls outside the PAC range [|sol(F)|/(1+ε), (1+ε)|sol(F)|] with a certain error probability. Therefore, ApproxMC repeatedly invokes ApproxMCCore (Lines 5–9) and returns the median of the estimates (Line 10), which reduces the error probability to the user-provided parameter δ.

Let Error_t denote the event that the median of t estimates falls outside [|sol(F)|/(1+ε), (1+ε)|sol(F)|]. Let L denote the event that an invocation of ApproxMCCore returns an estimate less than |sol(F)|/(1+ε). Similarly, let U denote the event that an individual estimate of |sol(F)| is greater than (1+ε)|sol(F)|. For simplicity of exposition, we assume t is odd; the current implementation indeed ensures that t is odd by choosing the smallest odd t for which Pr[Error_t] ≤ δ.

In the remainder of the section, we demonstrate that reducing max {Pr[L], Pr[U]} can effectively reduce the number of repetitions t, making small-δ scenarios practical. To this end, we first show that the existing analysis technique of ApproxMC leads to loose bounds on Pr[Error_t]; we then present a new analysis that leads to tighter bounds.

The existing combinatorial analysis in [7] derives the following proposition:

#### Proposition 3.

$$\Pr\left[\mathsf{Error}\_t\right] \le \eta(t, \lceil t/2 \rceil, \Pr\left[L \cup U\right])$$

*where* $\eta(t, m, p) = \sum\_{k=m}^{t} \binom{t}{k} p^k (1-p)^{t-k}$.

Proposition 3 follows from the observation that if the median falls outside the PAC range, at least ⌈t/2⌉ of the estimates must also fall outside the range. Requiring η(t, ⌈t/2⌉, Pr[L ∪ U]) ≤ δ, we can compute a valid t at Line 4 of ApproxMC.

Proposition 3 raises a question: can we derive a tight upper bound for Pr[Error<sup>t</sup>]? The following lemma provides an affirmative answer to this question.

Lemma 2. *Assuming* t *is odd, we have:*

$$\Pr\left[\mathsf{Error}\_t\right] = \eta(t, \lceil t/2 \rceil, \Pr\left[L\right]) + \eta(t, \lceil t/2 \rceil, \Pr\left[U\right])$$

*Proof.* Let I_i^L be an indicator variable that is 1 when ApproxMCCore returns an estimate less than |sol(F)|/(1+ε), indicating the occurrence of event L in the i-th repetition. Let I_i^U be an indicator variable that is 1 when ApproxMCCore returns an estimate greater than (1+ε)|sol(F)|, indicating the occurrence of event U in the i-th repetition. We first prove that

$$\mathsf{Error}\_t \Leftrightarrow \left(\sum\_{i=1}^{t} I\_i^L \ge \lceil t/2 \rceil\right) \vee \left(\sum\_{i=1}^{t} I\_i^U \ge \lceil t/2 \rceil\right)$$

For the (⇒) direction: if the median of t estimates violates the PAC guarantee, the median is either less than |sol(F)|/(1+ε) or greater than (1+ε)|sol(F)|. In the first case, since half of the estimates are at most the median, at least ⌈t/2⌉ estimates are less than |sol(F)|/(1+ε); formally, Σ_i I_i^L ≥ ⌈t/2⌉. Similarly, if the median is greater than (1+ε)|sol(F)|, since half of the estimates are at least the median, at least ⌈t/2⌉ estimates are greater than (1+ε)|sol(F)|, and thus Σ_i I_i^U ≥ ⌈t/2⌉. For the (⇐) direction: given Σ_i I_i^L ≥ ⌈t/2⌉, more than half of the estimates are less than |sol(F)|/(1+ε), and therefore the median is less than |sol(F)|/(1+ε), violating the PAC guarantee. Similarly, given Σ_i I_i^U ≥ ⌈t/2⌉, more than half of the estimates are greater than (1+ε)|sol(F)|, and therefore so is the median, again violating the PAC guarantee. This concludes the proof of the equivalence, from which we obtain:

$$\begin{aligned} \Pr\left[\mathsf{Error}\_{t}\right] &= \Pr\left[\left(\sum\_{i=1}^{t} I\_{i}^{L} \geq \lceil t/2 \rceil\right) \vee \left(\sum\_{i=1}^{t} I\_{i}^{U} \geq \lceil t/2 \rceil\right)\right] \\ &= \Pr\left[\left(\sum\_{i=1}^{t} I\_{i}^{L} \geq \lceil t/2 \rceil\right)\right] + \Pr\left[\left(\sum\_{i=1}^{t} I\_{i}^{U} \geq \lceil t/2 \rceil\right)\right] \\ &- \Pr\left[\left(\sum\_{i=1}^{t} I\_{i}^{L} \geq \lceil t/2 \rceil\right) \wedge \left(\sum\_{i=1}^{t} I\_{i}^{U} \geq \lceil t/2 \rceil\right)\right] \end{aligned}$$

Since I_i^L + I_i^U ≤ 1 for i = 1, 2, ..., t, we have Σ_i (I_i^L + I_i^U) ≤ t. If both Σ_i I_i^L ≥ ⌈t/2⌉ and Σ_i I_i^U ≥ ⌈t/2⌉ held, we would obtain Σ_i (I_i^L + I_i^U) ≥ 2⌈t/2⌉ = t + 1 (recall that t is odd), contradicting Σ_i (I_i^L + I_i^U) ≤ t. Hence Pr[(Σ_i I_i^L ≥ ⌈t/2⌉) ∧ (Σ_i I_i^U ≥ ⌈t/2⌉)] = 0. From this, we deduce:

$$\begin{aligned} \Pr\left[\mathsf{Error}\_t\right] &= \Pr\left[\left(\sum\_{i=1}^t I\_i^L \ge \lceil t/2 \rceil\right)\right] + \Pr\left[\left(\sum\_{i=1}^t I\_i^U \ge \lceil t/2 \rceil\right)\right] \\ &= \eta(t, \lceil t/2 \rceil, \Pr\left[L\right]) + \eta(t, \lceil t/2 \rceil, \Pr\left[U\right]) \end{aligned}$$
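Lemma 2 can be verified numerically: each repetition independently lands in L with probability Pr[L], in U with probability Pr[U], and inside the PAC range otherwise, so the exact error probability is a trinomial tail. The sketch below (ours) compares it against η(t, ⌈t/2⌉, Pr[L]) + η(t, ⌈t/2⌉, Pr[U]).

```python
from math import comb

def eta(t, m, p):
    return sum(comb(t, k) * p**k * (1 - p) ** (t - k) for k in range(m, t + 1))

def pr_error_exact(t, pL, pU):
    """Pr[Error_t] computed directly: the median errs iff at least ceil(t/2)
    repetitions err on the same side (L or U), with L and U disjoint per trial."""
    half = (t + 1) // 2  # ceil(t/2) for odd t
    total = 0.0
    for nL in range(t + 1):
        for nU in range(t + 1 - nL):
            if nL >= half or nU >= half:
                total += (
                    comb(t, nL) * comb(t - nL, nU)
                    * pL**nL * pU**nU * (1 - pL - pU) ** (t - nL - nU)
                )
    return total

pL, pU = 0.157, 0.169
for t in (1, 3, 5, 9, 15):
    lhs = pr_error_exact(t, pL, pU)
    rhs = eta(t, (t + 1) // 2, pL) + eta(t, (t + 1) // 2, pU)
    assert abs(lhs - rhs) < 1e-12
```

The check exploits exactly the disjointness argument of the proof: since t is odd, the two tail events can never occur together, so the union splits into a sum of two binomial tails.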

Though Lemma 2 shows that reducing Pr[L] and Pr[U] decreases the error probability, it is still unclear to what extent Pr[L] and Pr[U] affect it. To quantify this impact, the following lemma relates the error probability and t to Pr[L] and Pr[U].

Lemma 3. *Let p_max = max {Pr[L], Pr[U]} and p_max < 0.5; then*

$$Pr[\mathsf{Error}\_t] \in \Theta\left(t^{-\frac{1}{2}} \left(2\sqrt{p\_{\max}(1-p\_{\max})}\right)^t\right)$$

*Proof.* Applying Lemmas 1 and 2, we have

$$\begin{split} \Pr\left[\mathsf{Error}\_{t}\right] &\in \Theta\left(t^{-\frac{1}{2}} \left( \left(2\sqrt{\Pr\left[L\right]\left(1-\Pr\left[L\right]\right)}\right)^{t} + \left(2\sqrt{\Pr\left[U\right]\left(1-\Pr\left[U\right]\right)}\right)^{t} \right) \right) \\ &= \Theta\left(t^{-\frac{1}{2}} \left(2\sqrt{p\_{\max}\left(1-p\_{\max}\right)}\right)^{t}\right) \end{split}$$

In summary, Lemma 3 provides a way to tighten the bound on Pr[Error_t]: design an algorithm for which we can obtain a tighter bound on p_max, in contrast to previous approaches that relied on obtaining a tighter bound on Pr[L ∪ U].

#### 5 Rounding Model Counting

In this section, we present a *rounding*-based technique that allows us to obtain a tighter bound on p_max. At a high level, instead of returning the estimate of one iteration of the underlying core algorithm as the number of solutions in a randomly chosen cell multiplied by the number of cells, we *round* each estimate of the model count to a value that is more likely to be within the (1+ε)-bound. While counter-intuitive at first glance, we show that rounding the estimate reduces max {Pr[L], Pr[U]}, thereby reducing the number of repetitions of the underlying algorithm.

We present ApproxMC6, a *rounding*-based approximate model counting algorithm, in Sect. 5.1. Section 5.2 will demonstrate how ApproxMC6 decreases max {Pr[L] ,Pr[U]} and the number of estimates. Lastly, in Sect. 5.3, we will provide proof of the theoretical correctness of the algorithm.

#### 5.1 Algorithm

Algorithm 3 presents the procedure of ApproxMC6. ApproxMC6 takes as input a formula F, a tolerance parameter ε, and a confidence parameter δ, and returns an (ε, δ)-estimate c of |sol(F)| such that Pr[|sol(F)|/(1+ε) ≤ c ≤ (1+ε)|sol(F)|] ≥ 1 − δ. ApproxMC6 is identical to ApproxMC in its initialization of data structures and handling of base cases (Lines 1–4).

In Line 5, we pre-compute the rounding type and rounding value to be used in ApproxMC6Core. configRound is implemented in Algorithm 5; the precise choices arise from the technical analysis presented in Sect. 5.2. Note that, in configRound, Cnt⟨F,m⟩ is *rounded up* to roundValue for ε < 3 (roundUp = 1) but *rounded* to roundValue for ε ≥ 3 (roundUp = 0). Rounding up means we assign roundValue to Cnt⟨F,m⟩ if Cnt⟨F,m⟩ is less than roundValue and otherwise keep Cnt⟨F,m⟩ unchanged. Rounding means we assign roundValue to Cnt⟨F,m⟩ in all cases. ApproxMC6 computes the number of repetitions necessary to lower the error probability to δ at Line 6. The implementation of computeIter is presented in Algorithm 6, following Lemma 2: the iteration count keeps increasing until the tight error bound is no more than δ. As we will show in Sect. 5.2, Pr[L] and Pr[U] depend on ε. In the loop of Lines 7–11, ApproxMC6Core repeatedly estimates |sol(F)|. Each estimate nSols is stored in list C, and the median of C serves as the final estimate satisfying the (ε, δ)-guarantee.

## Algorithm 3. ApproxMC6(F, ε, δ)

```
1: thresh ← 9.84 (1 + ε/(1+ε)) (1 + 1/ε)^2;
2: Y ← BoundedSAT(F, thresh);
3: if (|Y| < thresh) then return |Y|;
4: C ← emptyList; iter ← 0;
5: (roundUp, roundValue) ← configRound(ε);
6: t ← computeIter(ε, δ);
7: repeat
8:   iter ← iter + 1;
9:   nSols ← ApproxMC6Core(F, thresh, roundUp, roundValue);
10:  AddToList(C, nSols);
11: until (iter ≥ t);
12: finalEstimate ← FindMedian(C);
13: return finalEstimate;
```

Algorithm 4 shows the pseudocode of ApproxMC6Core. A random hash function is chosen at Line 1 to partition sol(F) into *roughly equal* cells. A random hash value is chosen at Line 2 to pick a cell for estimation. In Line 3, we search for a value m such that the cell picked from the 2^m available cells is *small* enough to enumerate its solutions one by one while still providing a good estimate of |sol(F)|. In Line 4, a bounded model counter is invoked to compute the size of the picked cell, i.e., Cnt⟨F,m⟩. Finally, if roundUp equals 1, Cnt⟨F,m⟩ is rounded up to roundValue at Line 6; otherwise, roundUp equals 0, and Cnt⟨F,m⟩ is rounded to roundValue at Line 8. Note that *rounding up* returns roundValue only if Cnt⟨F,m⟩ is less than roundValue, whereas *rounding* always returns roundValue regardless of the value of Cnt⟨F,m⟩.

For large ε (ε ≥ 3), ApproxMC6Core returns a value that is independent of the value returned by BoundedSAT in Line 4 of Algorithm 4. Observe, however, that the returned value depends on the m returned by LogSATSearch [8], which in turn uses BoundedSAT to find the value of m; therefore, the algorithm's run is not independent of all the calls to BoundedSAT. The technical reason for correctness stems from the observation that for large values of ε, we can always find a value of m such that 2^m × c (where c is a constant) is a (1+ε)-approximation of |sol(F)|. As an example, consider n = 7 and c = 1; then a (1+3)-approximation of any number between 1 and 128 can be found in {1, 2, 4, 8, 16, 32, 64, 128}. Therefore, returning an answer of the form c × 2^m suffices as long as we can search for the right value of m, which is accomplished by LogSATSearch. We could skip the final call to BoundedSAT in Line 4 of ApproxMC6Core for large values of ε, but the required BoundedSAT computation is already performed within LogSATSearch.
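A quick sanity check (ours) of the example above: for n = 7 and c = 1, the powers of two 2^0, ..., 2^7 form a (1+3)-approximation net for every count between 1 and 128.

```python
# Every s in [1, 128] admits some m with s/4 <= 2^m <= 4s,
# so an answer of the form c * 2^m (with c = 1) is a (1+3)-approximation.
assert all(any(s / 4 <= 2 ** m <= 4 * s for m in range(8)) for s in range(1, 129))
```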

## Algorithm 4. ApproxMC6Core(F, thresh, roundUp, roundValue)

```
1: Choose h at random from H(n, n);
2: Choose α at random from {0, 1}^n;
3: m ← LogSATSearch(F, h, α, thresh);
4: Cnt⟨F,m⟩ ← BoundedSAT(F ∧ (h^(m))^(-1)(α^(m)), thresh);
5: if roundUp = 1 then
6:   return (2^m × max{Cnt⟨F,m⟩, roundValue});
7: else
8:   return (2^m × roundValue);
```

## Algorithm 5. configRound(ε)

```
1: if (ε < √2 − 1) then return (1, (√(1+2ε)/2) pivot);
2: else if (ε < 1) then return (1, pivot/√2);
3: else if (ε < 3) then return (1, pivot);
4: else if (ε < 4√2 − 1) then return (0, pivot);
5: else return (0, √2 pivot);
```
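Algorithm 5 can be transcribed directly. The sketch below (ours) expresses roundValue as a multiple of pivot = 9.84 (1 + 1/ε)^2; the coefficient √(1+2ε)/2 on the first branch is our reading of the typeset original.

```python
import math

def config_round(eps):
    """configRound(eps) from Algorithm 5: returns (roundUp, roundValue),
    with roundValue expressed via pivot = 9.84 * (1 + 1/eps)**2."""
    pivot = 9.84 * (1 + 1 / eps) ** 2
    if eps < math.sqrt(2) - 1:
        return 1, math.sqrt(1 + 2 * eps) / 2 * pivot
    if eps < 1:
        return 1, pivot / math.sqrt(2)
    if eps < 3:
        return 1, pivot
    if eps < 4 * math.sqrt(2) - 1:
        return 0, pivot
    return 0, math.sqrt(2) * pivot

# Rounding up (flag 1) is used for eps < 3, plain rounding (flag 0) for eps >= 3.
assert [config_round(e)[0] for e in (0.3, 0.8, 2.0, 4.0, 6.0)] == [1, 1, 1, 0, 0]
```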

#### 5.2 Repetition Reduction

We will now show that ApproxMC6Core allows us to obtain a smaller max {Pr[L] ,Pr[U]}. Furthermore, we show the large gap between the error probability of ApproxMC6 and that of ApproxMC both analytically and visually.

The following lemma presents upper bounds on Pr[L] and Pr[U] for ApproxMC6Core. For simplicity, let pivot = 9.84 (1 + 1/ε)^2.

Lemma 4. *The following bounds hold for* ApproxMC6*:*

$$\Pr\left[L\right] \le \begin{cases} 0.262 & \text{if } \varepsilon < \sqrt{2} - 1 \\ 0.157 & \text{if } \sqrt{2} - 1 \le \varepsilon < 1 \\ 0.085 & \text{if } 1 \le \varepsilon < 3 \\ 0.055 & \text{if } 3 \le \varepsilon < 4\sqrt{2} - 1 \\ 0.023 & \text{if } \varepsilon \ge 4\sqrt{2} - 1 \end{cases}$$

$$\Pr[U] \le \begin{cases} 0.169 & \text{if } \varepsilon < 3 \\ 0.044 & \text{if } \varepsilon \ge 3 \end{cases}$$

The proof of Lemma 4 is deferred to Sect. 5.3. Observe that Lemma 4 influences the choices in the design of configRound (Algorithm 5). Recall that max {Pr[L], Pr[U]} ≤ 0.36 for ApproxMC (Appendix C), but Lemma 4 ensures max {Pr[L], Pr[U]} ≤ 0.262 for ApproxMC6. For ε ≥ 4√2 − 1, Lemma 4 even delivers max {Pr[L], Pr[U]} ≤ 0.044.

## Algorithm 6. computeIter(ε, δ)

```
1: iter ← 1;
2: while (η(iter, ⌈iter/2⌉, Prε[L]) + η(iter, ⌈iter/2⌉, Prε[U]) > δ) do
3: iter ← iter + 2;
4: return iter;
```
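Algorithm 6 is directly implementable once the bounds of Lemma 4 are plugged in for Prε[L] and Prε[U]. The sketch below (ours) reproduces the repetition counts quoted later in this section.

```python
from math import comb

def eta(t, m, p):
    return sum(comb(t, k) * p**k * (1 - p) ** (t - k) for k in range(m, t + 1))

def compute_iter(pr_L, pr_U, delta):
    """computeIter specialised to known bounds on Pr[L] and Pr[U] (Lemma 4)."""
    it = 1
    while eta(it, (it + 1) // 2, pr_L) + eta(it, (it + 1) // 2, pr_U) > delta:
        it += 2  # keep t odd
    return it

# For sqrt(2)-1 <= eps < 1, Lemma 4 gives Pr[L] <= 0.157 and Pr[U] <= 0.169;
# with delta = 0.001 this yields the 19 repetitions quoted in Sect. 5.2.
assert compute_iter(0.157, 0.169, 0.001) == 19
```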
The following theorem analytically presents the gap between the error probability of ApproxMC6 and that of ApproxMC<sup>1</sup>.

Theorem 1. *For √2 − 1 ≤ ε < 1,*

$$Pr[\mathsf{Error}\_t] \in \begin{cases} \mathcal{O}\left(t^{-\frac{1}{2}} 0.75^t\right) & \text{for } \mathsf{Approx} \mathsf{MC6} \\ \mathcal{O}\left(t^{-\frac{1}{2}} 0.96^t\right) & \text{for } \mathsf{Approx} \mathsf{MC} \end{cases}$$

*Proof.* From Lemma 4, we obtain p_max ≤ 0.169 for ApproxMC6. Applying Lemma 3, we have

$$\Pr\left[\mathsf{Error}\_t\right] \in \mathcal{O}\left(t^{-\frac{1}{2}} \left(2\sqrt{0.169(1-0.169)}\right)^t\right) \subseteq \mathcal{O}\left(t^{-\frac{1}{2}}0.75^t\right),$$

For ApproxMC, combining p_max ≤ 0.36 (Appendix C) and Lemma 3, we obtain

$$\Pr\left[\mathsf{Error}\_t\right] \in \mathcal{O}\left(t^{-\frac{1}{2}} \left(2\sqrt{0.36(1-0.36)}\right)^t\right) = \mathcal{O}\left(t^{-\frac{1}{2}}0.96^t\right)$$
 
$$\square$$
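The two decay bases in Theorem 1 can be checked by direct evaluation (ours):

```python
from math import sqrt

# Base of the exponential decay t^(-1/2) * base^t in Theorem 1.
assert 2 * sqrt(0.169 * (1 - 0.169)) <= 0.75   # ApproxMC6 with p_max <= 0.169
assert abs(2 * sqrt(0.36 * (1 - 0.36)) - 0.96) < 1e-12  # ApproxMC with p_max <= 0.36
```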

Figure 1 visualizes the large gap between the error probabilities of ApproxMC6 and ApproxMC. The x-axis represents the number of repetitions (t) in ApproxMC6 or ApproxMC. The y-axis represents the upper bound on the error probability, in log scale. For example, at t = 117, ApproxMC guarantees that the median over 117 estimates violates the PAC guarantee with probability at most 10^−3. ApproxMC6, however, allows a much smaller error probability of at most 10^−15 for √2 − 1 ≤ ε < 1. The smaller error probability enables ApproxMC6 to perform fewer repetitions while providing the same level of theoretical guarantee. For example, given δ = 0.001, i.e., y = 0.001 in Fig. 1, ApproxMC requires 117 repetitions to obtain the given error probability, whereas ApproxMC6 needs only 37 repetitions for ε < √2 − 1, 19 repetitions for √2 − 1 ≤ ε < 1, 17 repetitions for 1 ≤ ε < 3, 7 repetitions for 3 ≤ ε < 4√2 − 1, and 5 repetitions for ε ≥ 4√2 − 1 to obtain the same error probability. Consequently, ApproxMC6 obtains 3×, 6×, 7×, 17×, and 23× speedups, respectively, over ApproxMC.

<sup>1</sup> We state the result for the case √2 − 1 ≤ ε < 1. A similar analysis can be applied to the other cases, leading to an even bigger gap between ApproxMC6 and ApproxMC.

Fig. 1. Comparison of error bounds for ApproxMC6 and ApproxMC.

#### 5.3 Proof of Lemma 4 for Case √2 − 1 ≤ ε < 1

We provide the full proof of Lemma 4 for the case √2 − 1 ≤ ε < 1. The proofs of the other cases are deferred to Appendix D.

Let T_m denote the event (Cnt⟨F,m⟩ < thresh), and let L_m and U_m denote the events (Cnt⟨F,m⟩ < E[Cnt⟨F,m⟩]/(1+ε)) and (Cnt⟨F,m⟩ > E[Cnt⟨F,m⟩](1+ε)), respectively. To ease the proof, let U′_m denote (Cnt⟨F,m⟩ > E[Cnt⟨F,m⟩](1 + ε/(1+ε))), so that U_m ⊆ U′_m. Let m* = ⌊log2 |sol(F)| − log2(pivot)⌋ + 1, so that m* is the smallest m satisfying (|sol(F)|/2^m)(1 + ε/(1+ε)) ≤ thresh − 1.

Let us first prove the lemmas used in the proof of Lemma 4.

Lemma 5. *For every 0 < β < 1, γ > 1, and 1 ≤ m ≤ n, the following holds:*

1. $\Pr\left[\mathsf{Cnt}\_{\langle F,m\rangle} \le \beta \mathsf{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right]\right] \le \frac{1}{1 + (1 - \beta)^2 \mathsf{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right]}$
2. $\Pr\left[\mathsf{Cnt}\_{\langle F,m\rangle} \ge \gamma \mathsf{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right]\right] \le \frac{1}{1 + (\gamma - 1)^2 \mathsf{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right]}$

*Proof.* Statement 1 can be proved following the proof of Lemma 1 in [8]. For statement 2, we rewrite the left-hand side and apply Cantelli's inequality:

$$\Pr\left[\mathsf{Cnt}\_{\langle F,m\rangle} - \mathsf{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right] \ge (\gamma - 1)\mathsf{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right]\right] \le \frac{\sigma^2\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right]}{\sigma^2\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right] + \left((\gamma - 1)\mathsf{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right]\right)^2}$$

Finally, applying Eq. 2 completes the proof.

Lemma 6. *Given √2 − 1 ≤ ε < 1, the following bounds hold:*

1. Pr[T_{m*−3}] ≤ 1/62.5
2. Pr[L_{m*−2}] ≤ 1/20.68
3. Pr[L_{m*−1}] ≤ 1/10.84
4. Pr[U′_{m*}] ≤ 1/5.92

*Proof.* Statements 1, 2, and 3 can be proved following the proof of Lemma 2 in [8]. To prove statement 4, we replace γ with (1 + ε/(1+ε)) in Lemma 5 and employ E[Cnt⟨F,m*⟩] ≥ pivot/2, obtaining Pr[U′_{m*}] ≤ 1/(1 + (ε/(1+ε))² pivot/2) ≤ 1/5.92. The last step uses the fact that (ε/(1+ε))² pivot = 9.84 for every ε, since pivot = 9.84 (1 + 1/ε)² and (ε/(1+ε))(1 + 1/ε) = 1.
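The constant 5.92 in statement 4 is independent of ε, because (ε/(1+ε))² · pivot = 9.84 for every ε; a quick numeric check (ours):

```python
# Cantelli denominator in Lemma 6, statement 4: 1 + (eps/(1+eps))^2 * pivot/2.
# Because pivot = 9.84*(1 + 1/eps)^2 and (eps/(1+eps))*(1 + 1/eps) = 1, this
# denominator is exactly 1 + 9.84/2 = 5.92 for every eps, giving the 1/5.92 bound.
for eps in (0.5, 0.9, 2.0, 5.0):
    pivot = 9.84 * (1 + 1 / eps) ** 2
    denom = 1 + (eps / (1 + eps)) ** 2 * pivot / 2
    assert abs(denom - 5.92) < 1e-9
```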

We now prove the upper bounds on Pr[L] and Pr[U] in Lemma 4 for √2 − 1 ≤ ε < 1. The proof for other ε is deferred to Appendix D due to the page limit. Lemma 4. *The following bounds hold for* ApproxMC6:

$$\Pr\left[L\right] \le \begin{cases} 0.262 & \text{if } \varepsilon < \sqrt{2} - 1 \\ 0.157 & \text{if } \sqrt{2} - 1 \le \varepsilon < 1 \\ 0.085 & \text{if } 1 \le \varepsilon < 3 \\ 0.055 & \text{if } 3 \le \varepsilon < 4\sqrt{2} - 1 \\ 0.023 & \text{if } \varepsilon \ge 4\sqrt{2} - 1 \end{cases}$$

$$\Pr\left[U\right] \le \begin{cases} 0.169 & \text{if } \varepsilon < 3 \\ 0.044 & \text{if } \varepsilon \ge 3 \end{cases}$$

*Proof.* We prove the case of √2 − 1 ≤ ε < 1; the proof for other ε is deferred to Appendix D. Let us first bound Pr[L]. Following LogSATSearch in [8], we have

$$\Pr\left[L\right] = \Pr\left[ \bigcup\_{i \in \{1, \ldots, n\}} \left( \overline{T\_{i-1}} \cap T\_i \cap L\_i \right) \right] \tag{3}$$

Equation 3 can be simplified by the three observations labeled O1, O2, and O3 below. O1: For all i ≤ m* − 3, T_i ⊆ T_{i+1}. Therefore,

$$\bigcup\_{i \in \{1, \ldots, m^\*-3\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) \subseteq \bigcup\_{i \in \{1, \ldots, m^\*-3\}} T\_i \subseteq T\_{m^\*-3}$$

O2: For i ∈ {m* − 2, m* − 1}, we have

$$\bigcup\_{i \in \{m^\star - 2, m^\star - 1\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) \subseteq L\_{m^\star - 2} \cup L\_{m^\star - 1}$$

O3: For all i ≥ m*: since Cnt⟨F,i⟩ is rounded up to pivot/√2 and m* ≥ log2 |sol(F)| − log2(pivot), we have 2^i × Cnt⟨F,i⟩ ≥ 2^{m*} × pivot/√2 ≥ |sol(F)|/√2 ≥ |sol(F)|/(1+ε), where the last inequality follows from ε ≥ √2 − 1. Hence Cnt⟨F,i⟩ ≥ E[Cnt⟨F,i⟩]/(1+ε). Therefore, L_i = ∅ for i ≥ m* and we have

$$\bigcup\_{i \in \{m^\*, \ldots, n\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) = \emptyset$$

Following observations O1, O2, and O3, we simplify Eq. 3 and obtain

$$\Pr\left[L\right] \le \Pr\left[T\_{m^\*-3}\right] + \Pr\left[L\_{m^\*-2}\right] + \Pr\left[L\_{m^\*-1}\right]$$

Employing Lemma 6 gives Pr[L] ≤ 1/62.5 + 1/20.68 + 1/10.84 ≤ 0.157.

Now let us bound Pr[U]. Similarly, following LogSATSearch in [8], we have

$$\Pr\left[U\right] = \Pr\left[ \bigcup\_{i \in \{1, \ldots, n\}} \left( \overline{T\_{i-1}} \cap T\_i \cap U\_i \right) \right] \tag{4}$$

We derive the following observations O4 and O5.


$$\begin{aligned} \bigcup\_{i \in \{m^\*, \ldots, n\}} (\overline{T\_{i-1}} \cap T\_i \cap U\_i) &\subseteq \bigcup\_{i \in \{m^\*+1, \ldots, n\}} \overline{T\_{i-1}} \cup (\overline{T\_{m^\*-1}} \cap T\_{m^\*} \cap U\_{m^\*}) \\ &\subseteq \overline{T\_{m^\*}} \cup (\overline{T\_{m^\*-1}} \cap T\_{m^\*} \cap U\_{m^\*}) \\ &\subseteq \overline{T\_{m^\*}} \cup U\_{m^\*} \\ &\subseteq U'\_{m^\*} \end{aligned} \tag{5}$$

Remark that for $\sqrt{2}-1 \le \varepsilon < 1$, we round $\mathsf{Cnt}_{\langle F,m^*\rangle}$ up to $\frac{\mathsf{pivot}}{\sqrt{2}}$, and we have $2^{m^*} \times \frac{\mathsf{pivot}}{\sqrt{2}} \le |\mathsf{sol}(F)|(1+\varepsilon)$, which means that *rounding* does not affect the event $U_{m^*}$; therefore, Inequality 5 still holds.

Following the observations $O_4$ and $O_5$, we simplify Eq. 4 and obtain

$$\Pr\left[U\right] \le \Pr\left[U\_{m^\*}'\right]$$

Employing Lemma 6 gives $\Pr[U] \le 0.169$.

The breakpoints in $\varepsilon$ of Lemma 4 arise from how we use rounding to lower the error probability for the events $L$ and $U$. Rounding counts up can lower $\Pr[L]$ but may increase $\Pr[U]$. Therefore, we want to round counts up to a value that does not affect the event $U$. Take $\sqrt{2}-1 \le \varepsilon < 1$ as an example; we round the count up to a value such that $L_{m^*}$ becomes an empty event with zero probability while $U_{m^*}$ remains unchanged. To make $L_{m^*}$ empty, we have

$$2^{m^\*} \times \text{round}\mathsf{Value} \geq 2^{m^\*} \times \frac{1}{1 + \varepsilon} \mathsf{pivot} \geq \frac{1}{1 + \varepsilon} |\mathsf{sol}(\mathsf{F})| \tag{6}$$

where the last inequality follows from $m^* \ge \log_2 |\mathsf{sol}(F)| - \log_2(\mathsf{pivot})$. To maintain $U_{m^*}$ unchanged, we obtain

$$2^{m^\*} \times \text{round}\mathsf{Value} \le 2^{m^\*} \times \frac{1+\varepsilon}{2} \mathsf{pivot} \le (1+\varepsilon)|\mathsf{sol}(\mathsf{F})|\tag{7}$$

where the last inequality follows from $m^* \le \log_2 |\mathsf{sol}(F)| - \log_2(\mathsf{pivot}) + 1$. Combining Eqs. 6 and 7, we obtain

$$2^{m^\*} \times \frac{1}{1+\varepsilon} \mathsf{pivot} \le 2^{m^\*} \times \frac{1+\varepsilon}{2} \mathsf{pivot}$$

which gives us $\varepsilon \ge \sqrt{2}-1$. Similarly, we can derive the other breakpoints.
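As a quick numerical sanity check on this derivation (not part of the proof), one can verify that a rounding value satisfying both Eqs. 6 and 7 exists exactly when $\varepsilon \ge \sqrt{2}-1$; the helper below is hypothetical, not from the paper's code.

```python
import math

# Constraints (6) and (7) admit a common roundValue iff
# pivot/(1+eps) <= (1+eps)*pivot/2, i.e. (1+eps)**2 >= 2.
# `feasible` is a hypothetical helper, not from the paper's code.
def feasible(eps: float) -> bool:
    return 1.0 / (1.0 + eps) <= (1.0 + eps) / 2.0

breakpoint_eps = math.sqrt(2) - 1  # claimed breakpoint

assert feasible(breakpoint_eps)             # feasible at the breakpoint
assert not feasible(breakpoint_eps - 1e-9)  # infeasible just below it
```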

#### 6 Experimental Evaluation

It is perhaps worth highlighting that both ApproxMCCore and ApproxMC6Core invoke the underlying SAT solver on identical queries; the only difference between ApproxMC6 and ApproxMC lies in which estimate to return and how often ApproxMCCore and ApproxMC6Core are invoked. From this viewpoint, one would expect the theoretical improvements to translate into improved runtime performance. To provide further evidence, we performed an extensive empirical evaluation and compared ApproxMC6's performance against the current state-of-the-art model counter, ApproxMC [22]. We used Arjun as a pre-processing tool and the latest version of ApproxMC, called ApproxMC4; an entry based on ApproxMC4 won the Model Counting Competition 2022.

Previous comparisons of ApproxMC have been performed on a set of 1896 instances, but the latest version of ApproxMC is able to solve almost all the instances when these instances are pre-processed by Arjun. Therefore, we sought to construct a new comprehensive set of 1890 instances derived from various sources, including Model Counting Competitions 2020–2022 [12,15,16], program synthesis [1], quantitative control improvisation [13], quantification of software properties [26], and adaptive chosen ciphertext attacks [3]. As noted earlier, our technique extends to projected model counting, and our benchmark suite indeed comprises 772 projected model counting instances.

Experiments were conducted on a high-performance computer cluster, with each node consisting of 2xE5-2690v3 CPUs featuring 2 × 12 real cores and 96GB of RAM. For each instance, a counter was run on a single core, with a time limit of 5000 s and a memory limit of 4 GB. To compare runtime performance, we use the PAR-2 score, a standard metric in the SAT community. Each instance is assigned a score that is the number of seconds it takes the corresponding tool to complete execution successfully. In the event of a timeout or memory out, the score is the doubled time limit in seconds. The PAR-2 score is then calculated as the average of all the instance scores. We also report the speedup of ApproxMC6 over ApproxMC4, calculated as the ratio of the runtime of ApproxMC4 to that of ApproxMC6 on instances solved by both counters. We set δ to 0.001 and ε to 0.8.
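The metrics above can be sketched as follows; this is an illustrative computation following the stated conventions, not the authors' evaluation scripts (the `results` mappings from instance name to runtime are hypothetical).

```python
import math

# Minimal sketch (not the authors' code) of the PAR-2 score and
# geometric-mean speedup. `results` maps an instance name to its
# runtime in seconds, or None on timeout / memory-out.
TIMEOUT = 5000  # seconds

def par2_score(results):
    # A timeout or memory-out is charged twice the time limit.
    scores = [t if t is not None else 2 * TIMEOUT for t in results.values()]
    return sum(scores) / len(scores)

def geometric_mean_speedup(results_old, results_new):
    # Speedup of the new counter over the old one, restricted to
    # instances solved by both; aggregated by geometric mean.
    ratios = [results_old[k] / results_new[k]
              for k in results_old
              if results_old[k] is not None and results_new.get(k) is not None]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))
```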

Specifically, we aim to address the following research questions:

- RQ1: How does the runtime performance of ApproxMC6 compare to that of ApproxMC4?
- RQ2: How accurate are the counts computed by ApproxMC6 compared to the exact counts?


*Summary.* In summary, ApproxMC6 consistently outperforms ApproxMC4. Specifically, it solved 204 additional instances and reduced the PAR-2 score by 1063 s in comparison to ApproxMC4. The average speedup of ApproxMC6 over ApproxMC4 was 4.68. In addition, ApproxMC6 provided a high-quality approximation with an average observed error of 0.1, much smaller than the theoretical error tolerance of 0.8.

#### 6.1 RQ1. Overall Performance

Figure 2 compares the counting time of ApproxMC6 and ApproxMC4. The x-axis represents the index of the instances, sorted in ascending order of runtime, and the y-axis represents the runtime for each instance. A point (x, y) indicates that a counter can solve x instances within y seconds. Thus, for a given time limit y, a counter whose curve lies further to the right has solved more instances than one whose curve lies further to the left. The figure shows that ApproxMC6 consistently outperforms ApproxMC4. In total, ApproxMC6 solved 204 more instances than ApproxMC4.

Table 1 provides a detailed comparison between ApproxMC6 and ApproxMC4. The first column lists the three measures of interest: the number of solved instances, the PAR-2 score, and the speedup of ApproxMC6 over ApproxMC4. The second and third columns show the results for ApproxMC4 and ApproxMC6, respectively. ApproxMC4 solved 998 of the 1890 instances and achieved a PAR-2 score of 4934, while ApproxMC6 solved 1202 instances and achieved a PAR-2 score of 3871. Thus, ApproxMC6 solved 204 more instances and reduced the PAR-2 score by 1063 s relative to ApproxMC4. The geometric mean of the speedup of ApproxMC6 over ApproxMC4 is 4.68, calculated only over instances solved by both counters.

#### 6.2 RQ2. Approximation Quality

We used the state-of-the-art probabilistic exact model counter Ganak to compute the exact model count and compare it to the results of ApproxMC6. We collected statistics on instances solved by both Ganak and ApproxMC6. Figure 3 presents results for a subset of instances. The x-axis represents the index of instances

Table 1. The number of solved instances and PAR-2 score for ApproxMC6 versus ApproxMC4 on 1890 instances. The geometric mean of the speedup of ApproxMC6 over ApproxMC4 is also reported.


Fig. 2. Comparison of counting times for ApproxMC6 and ApproxMC4.

Fig. 3. Comparison of approximate counts from ApproxMC6 to exact counts from Ganak.

sorted in ascending order by the number of solutions, and the y-axis represents the number of solutions on a log scale. Theoretically, the approximate count from ApproxMC6 should lie between $|\mathsf{sol}(F)|/1.8$ and $|\mathsf{sol}(F)| \cdot 1.8$ with probability 0.999, where $|\mathsf{sol}(F)|$ denotes the exact count returned by Ganak. This range is indicated by the upper and lower bound curves $y = |\mathsf{sol}(F)| \cdot 1.8$ and $y = |\mathsf{sol}(F)|/1.8$, respectively. Figure 3 shows that the approximate counts from ApproxMC6 fall within the expected range $[|\mathsf{sol}(F)|/1.8,\ |\mathsf{sol}(F)| \cdot 1.8]$ for all instances except four points slightly above the upper bound. These four outliers are due to a bug in the preprocessor Arjun that probably depends on the version of the C++ compiler and will be fixed in the future. We also calculated the observed error, i.e., the mean over all instances of the relative error $\max\{\mathsf{finalEstimate}/|\mathsf{sol}(F)| - 1,\ |\mathsf{sol}(F)|/\mathsf{finalEstimate} - 1\}$. The overall observed error was 0.1, significantly smaller than the theoretical error tolerance of 0.8.
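The observed-error metric described above can be sketched as follows (illustrative, not the authors' code):

```python
# Per-instance relative error between an approximate and an exact
# count, as defined above, averaged over a set of instances.
def relative_error(estimate: float, exact: float) -> float:
    return max(estimate / exact - 1.0, exact / estimate - 1.0)

def observed_error(pairs):
    # `pairs`: list of (approximate count, exact count) tuples.
    errs = [relative_error(est, ex) for est, ex in pairs]
    return sum(errs) / len(errs)
```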

## 7 Conclusion

In this paper, we addressed the scalability challenges faced by ApproxMC in the smaller $\delta$ range. To this end, we proposed a *rounding*-based algorithm, ApproxMC6, which reduces the number of estimations required by 84% while providing the same $(\varepsilon, \delta)$-guarantees. Our empirical evaluation on 1890 instances shows that ApproxMC6 solved 204 more instances and achieved a reduction in PAR-2 score of 1063 s. Furthermore, ApproxMC6 achieved a $4\times$ speedup over ApproxMC on the instances that both ApproxMC6 and ApproxMC could solve.

Acknowledgements. This work was supported in part by National Research Foundation Singapore under its NRF Fellowship Programme [NRF-NRFFAI1-2019-0004], Ministry of Education Singapore Tier 2 Grant [MOE-T2EP20121-0011], and Ministry of Education Singapore Tier 1 Grant [R-252-000-B59-114]. The computational work for this article was performed on resources of the National Supercomputing Centre, Singapore https://www.nscc.sg. We are thankful to Yash Pote for the insightful early discussions that helped shape the idea. We are grateful to Tim van Bremen for his detailed feedback on the early drafts of the paper. We sincerely appreciate the anonymous reviewers for their constructive comments to enhance this paper.

### A Proof of Proposition 1

*Proof.* For all $y \in \{0,1\}^n$ and $\alpha^{(m)} \in \{0,1\}^m$, let $\gamma_{y,\alpha^{(m)}}$ be an indicator variable that is 1 when $h^{(m)}(y) = \alpha^{(m)}$. By the definition of a strongly 2-universal hash function, for all $x, y \in \{0,1\}^n$ with $x \neq y$, we obtain $\mathbb{E}\left[\gamma_{y,\alpha^{(m)}}\right] = \frac{1}{2^m}$ and $\mathbb{E}\left[\gamma_{x,\alpha^{(m)}} \cdot \gamma_{y,\alpha^{(m)}}\right] = \frac{1}{2^{2m}}$. To prove Eq. 1, we derive

$$\mathbb{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right] = \mathbb{E}\left[\sum\_{y \in \mathsf{sol}(\mathsf{F})} \gamma\_{y,\alpha^{(m)}}\right] = \sum\_{y \in \mathsf{sol}(\mathsf{F})} \mathbb{E}\left[\gamma\_{y,\alpha^{(m)}}\right] = \frac{|\mathsf{sol}(\mathsf{F})|}{2^m}$$

To prove Eq. 2, we derive

$$\begin{split} \mathbb{E}\left[\mathsf{Cnt}^{2}\_{\langle F,m\rangle}\right] &= \mathbb{E}\left[\sum\_{y\in\mathsf{sol}(\mathsf{F})}\gamma^{2}\_{y,\alpha^{(m)}} + \sum\_{x\neq y\in\mathsf{sol}(\mathsf{F})}\gamma\_{x,\alpha^{(m)}}\cdot\gamma\_{y,\alpha^{(m)}}\right] \\ &= \mathbb{E}\left[\sum\_{y\in\mathsf{sol}(\mathsf{F})}\gamma\_{y,\alpha^{(m)}}\right] + \sum\_{x\neq y\in\mathsf{sol}(\mathsf{F})}\mathbb{E}\left[\gamma\_{x,\alpha^{(m)}}\cdot\gamma\_{y,\alpha^{(m)}}\right] \\ &= \mathbb{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right] + \frac{|\mathsf{sol}(\mathsf{F})|(|\mathsf{sol}(\mathsf{F})|-1)}{2^{2m}} \end{split}$$

Then, we obtain

$$\begin{split} \sigma^{2}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right] &= \mathbb{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}^{2}\right] - \mathbb{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right]^{2} \\ &= \mathbb{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right] + \frac{|\mathsf{sol}(\mathsf{F})|(|\mathsf{sol}(\mathsf{F})|-1)}{2^{2m}} - \left(\frac{|\mathsf{sol}(\mathsf{F})|}{2^{m}}\right)^{2} \\ &= \mathbb{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right] - \frac{|\mathsf{sol}(\mathsf{F})|}{2^{2m}} \\ &\leq \mathbb{E}\left[\mathsf{Cnt}\_{\langle F,m\rangle}\right] \end{split}$$
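As an illustration of Proposition 1 (a toy sketch, not from the paper), one can verify both moments exactly for the pairwise-independent affine hash family $h(y) = A \cdot y + b$ over GF(2), enumerating every $(A, b)$; the parameters and solution set below are assumed for the example.

```python
import itertools

# Toy exact check of Proposition 1 for the pairwise-independent affine
# family h(y) = A.y + b over GF(2), with n = 3 input bits, m = 1
# output bit, and an assumed solution set of size 5.
n, m = 3, 1
sols = [(0, 0, 1), (0, 1, 0), (1, 1, 1), (1, 0, 0), (1, 1, 0)]

def h(A, b, y):
    return (sum(a * yi for a, yi in zip(A, y)) + b) % 2

# Cnt = number of solutions hashed to the cell alpha = 0, for every
# hash function in the family.
counts = [sum(1 for y in sols if h(A, b, y) == 0)
          for A in itertools.product([0, 1], repeat=n)
          for b in (0, 1)]

mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)

assert mean == len(sols) / 2 ** m              # E[Cnt] = |sol(F)| / 2^m
assert var == mean - len(sols) / 2 ** (2 * m)  # matches sigma^2 above
```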

#### B Weakness of Proposition 3

The following proposition states that Proposition 3 provides a loose upper bound for $\Pr[\mathsf{Error}_t]$.

Proposition 4. *Assuming* t *is odd, we have:*

$$\Pr\left[\mathsf{Error}\_t\right] < \eta(t, \lceil t/2 \rceil, \Pr\left[L \cup U\right])$$

*Proof.* We construct a case that is counted by $\eta(t, \lceil t/2 \rceil, \Pr[L \cup U])$ but not contained in the event $\mathsf{Error}_t$. Let $I^L_i$ be an indicator variable that is 1 when ApproxMCCore returns an nSols less than $\frac{|\mathsf{sol}(F)|}{1+\varepsilon}$, indicating the occurrence of event $L$ in the $i$-th repetition. Let $I^U_i$ be an indicator variable that is 1 when ApproxMCCore returns an nSols greater than $(1+\varepsilon)|\mathsf{sol}(F)|$, indicating the occurrence of event $U$ in the $i$-th repetition. Consider a scenario where $I^L_i = 1$ for $i = 1, 2, \ldots, \lceil t/4 \rceil$, $I^U_j = 1$ for $j = \lceil t/4 \rceil + 1, \ldots, \lceil t/2 \rceil$, and $I^L_k = I^U_k = 0$ for $k > \lceil t/2 \rceil$. Recall that $\eta(t, \lceil t/2 \rceil, \Pr[L \cup U])$ bounds the probability of the event $\sum_{i=1}^{t}(I^L_i \vee I^U_i) \ge \lceil t/2 \rceil$. The scenario above is included in this event, and therefore counted by $\eta(t, \lceil t/2 \rceil, \Pr[L \cup U])$, since there are $\lceil t/2 \rceil$ estimates outside the PAC range. However, in this scenario $\lceil t/4 \rceil$ estimates fall below $\frac{|\mathsf{sol}(F)|}{1+\varepsilon}$ and $\lceil t/2 \rceil - \lceil t/4 \rceil$ estimates fall above $(1+\varepsilon)|\mathsf{sol}(F)|$, while the remaining $\lfloor t/2 \rfloor$ estimates correctly fall within the range $\left[\frac{|\mathsf{sol}(F)|}{1+\varepsilon}, (1+\varepsilon)|\mathsf{sol}(F)|\right]$.
Therefore, after sorting all the estimates, ApproxMC6 returns a correct estimate, since the median falls within the PAC range $\left[\frac{|\mathsf{sol}(F)|}{1+\varepsilon}, (1+\varepsilon)|\mathsf{sol}(F)|\right]$. In other words, this scenario lies outside the event $\mathsf{Error}_t$. In conclusion, there is a scenario outside the event $\mathsf{Error}_t$ that is nevertheless included in the event $\sum_{i=1}^{t}(I^L_i \vee I^U_i) \ge \lceil t/2 \rceil$ and counted by $\eta(t, \lceil t/2 \rceil, \Pr[L \cup U])$, which means that $\Pr[\mathsf{Error}_t]$ is strictly less than $\eta(t, \lceil t/2 \rceil, \Pr[L \cup U])$.
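The counting in this scenario can also be checked mechanically; the sketch below (illustrative, not from the paper) builds the scenario for odd $t$ and confirms that the median stays in range.

```python
import math
import statistics

# For odd t: ceil(t/4) estimates below the PAC range [low, high],
# ceil(t/2) - ceil(t/4) above it, and the remaining floor(t/2) inside.
# Although ceil(t/2) estimates are wrong, the median is still in range.
def median_in_range(t, low, high, exact):
    n_low = math.ceil(t / 4)
    n_high = math.ceil(t / 2) - n_low
    n_ok = t - math.ceil(t / 2)
    estimates = [low - 1] * n_low + [exact] * n_ok + [high + 1] * n_high
    return low <= statistics.median(estimates) <= high

assert all(median_in_range(t, 50, 200, 100) for t in range(3, 101, 2))
```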

## C Proof of $p_{\max} \le 0.36$ for **ApproxMC**

*Proof.* We prove the case of $\sqrt{2}-1 \le \varepsilon < 1$. Similarly to the proof in Sect. 5.3, we aim to bound $\Pr[L]$ by the following equation:

$$\Pr\left[L\right] = \Pr\left[ \bigcup\_{i \in \{1, \ldots, n\}} \left( \overline{T\_{i-1}} \cap T\_i \cap L\_i \right) \right] \tag{3 \text{ revisited}}$$

which can be simplified by three observations labeled $O_1$, $O_2$, and $O_3$ below.

$O_1$: $\forall i \le m^*-3$, $T_i \subseteq T_{i+1}$. Therefore,

$$\bigcup\_{i \in \{1, \ldots, m^\*-3\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) \subseteq \bigcup\_{i \in \{1, \ldots, m^\*-3\}} T\_i \subseteq T\_{m^\*-3}$$

$O_2$: For $i \in \{m^*-2, m^*-1\}$, we have

$$\bigcup\_{i \in \{m^\star - 2, m^\star - 1\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) \subseteq L\_{m^\star - 2} \cup L\_{m^\star - 1}$$

$O_3$: $\forall i \ge m^*$, $\overline{T_i}$ implies $\mathsf{Cnt}_{\langle F,i\rangle} > \mathsf{thresh}$, and then we have $2^i \times \mathsf{Cnt}_{\langle F,i\rangle} > 2^{m^*} \times \mathsf{thresh} \ge |\mathsf{sol}(F)|\left(1 + \frac{\varepsilon}{1+\varepsilon}\right)$. The second inequality follows from $m^* \ge \log_2 |\mathsf{sol}(F)| - \log_2(\mathsf{pivot})$. Then we obtain $\mathsf{Cnt}_{\langle F,i\rangle} > \mathbb{E}\left[\mathsf{Cnt}_{\langle F,i\rangle}\right]\left(1 + \frac{\varepsilon}{1+\varepsilon}\right)$. Therefore, $\overline{T_i} \subseteq U'_i$ for $i \ge m^*$. Since $\forall i$, $\overline{T_i} \subseteq \overline{T_{i-1}}$, we have

$$\begin{aligned} \bigcup\_{i \in \{m^\*, \ldots, n\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) &\subseteq \bigcup\_{i \in \{m^\* + 1, \ldots, n\}} \overline{T\_{i-1}} \cup (\overline{T\_{m^\*-1}} \cap T\_{m^\*} \cap L\_{m^\*})\\ &\subseteq \overline{T\_{m^\*}} \cup (\overline{T\_{m^\*-1}} \cap T\_{m^\*} \cap L\_{m^\*})\\ &\subseteq \overline{T\_{m^\*}} \cup L\_{m^\*}\\ &\subseteq U'\_{m^\*} \cup L\_{m^\*} \end{aligned}$$

Following the observations $O_1$, $O_2$, and $O_3$, we simplify Eq. 3 and obtain

$$\Pr\left[L\right] \le \Pr\left[T\_{m^\*-3}\right] + \Pr\left[L\_{m^\*-2}\right] + \Pr\left[L\_{m^\*-1}\right] + \Pr\left[U'\_{m^\*} \cup L\_{m^\*}\right]$$

Employing Lemma 2 in [8] gives $\Pr[L] \le 0.36$. Note that $U$ in [8] represents $U'$ in our notation.

Then, following the observations $O_4$ and $O_5$ in Sect. 5.3, we obtain

$$\Pr\left[U\right] \le \Pr\left[U\_{m^\*}'\right]$$

Employing Lemma 6 gives $\Pr[U] \le 0.169$. As a result, $p_{\max} \le 0.36$.

#### D Proof of Lemma 4

We restate the lemma below and prove the statements section by section. The proof for $\sqrt{2}-1 \le \varepsilon < 1$ has been shown in Sect. 5.3.

Lemma 4. *The following bounds hold for* ApproxMC6:

$$\Pr\left[L\right] \le \begin{cases} 0.262 & \text{if } \varepsilon < \sqrt{2} - 1 \\ 0.157 & \text{if } \sqrt{2} - 1 \le \varepsilon < 1 \\ 0.085 & \text{if } 1 \le \varepsilon < 3 \\ 0.055 & \text{if } 3 \le \varepsilon < 4\sqrt{2} - 1 \\ 0.023 & \text{if } \varepsilon \ge 4\sqrt{2} - 1 \end{cases}$$

$$\Pr\left[U\right] \le \begin{cases} 0.169 & \text{if } \varepsilon < 3 \\ 0.044 & \text{if } \varepsilon \ge 3 \end{cases}$$

## D.1 Proof of $\Pr[L] \le 0.262$ for $\varepsilon < \sqrt{2}-1$

We first consider two cases, $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*\rangle}\right] < \frac{1+\varepsilon}{2}\mathsf{thresh}$ and $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*\rangle}\right] \ge \frac{1+\varepsilon}{2}\mathsf{thresh}$, and then merge the results to complete the proof.

**Case 1:** $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*\rangle}\right] < \frac{1+\varepsilon}{2}\mathsf{thresh}$

Lemma 7. *Given $\varepsilon < \sqrt{2}-1$, the following bounds hold:*

1. $\Pr\left[T_{m^*-2}\right] \le \frac{1}{29.67}$
2. $\Pr\left[L_{m^*-1}\right] \le \frac{1}{10.84}$

*Proof.* Let us first prove statement 1. For $\varepsilon < \sqrt{2}-1$, we have $\mathsf{thresh} < \left(2 - \frac{\sqrt{2}}{2}\right)\mathsf{pivot}$ and $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*-2\rangle}\right] \ge 2\mathsf{pivot}$. Therefore, $\Pr[T_{m^*-2}] \le \Pr\left[\mathsf{Cnt}_{\langle F,m^*-2\rangle} \le \left(1 - \frac{\sqrt{2}}{4}\right)\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*-2\rangle}\right]\right]$. Finally, employing Lemma 5 with $\beta = 1 - \frac{\sqrt{2}}{4}$, we obtain $\Pr[T_{m^*-2}] \le \frac{1}{1+\left(\frac{\sqrt{2}}{4}\right)^2 \cdot 2\mathsf{pivot}} \le \frac{1}{1+\left(\frac{\sqrt{2}}{4}\right)^2 \cdot 2 \cdot 9.84 \cdot \left(1+\frac{1}{\sqrt{2}-1}\right)^2} \le \frac{1}{29.67}$. To prove statement 2, we employ Lemma 5 with $\beta = \frac{1}{1+\varepsilon}$ and $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*-1\rangle}\right] \ge \mathsf{pivot}$ to obtain $\Pr[L_{m^*-1}] \le \frac{1}{1+\left(1-\frac{1}{1+\varepsilon}\right)^2 \cdot \mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*-1\rangle}\right]} \le \frac{1}{1+\left(1-\frac{1}{1+\varepsilon}\right)^2 \cdot 9.84 \cdot \left(1+\frac{1}{\varepsilon}\right)^2} = \frac{1}{10.84}$.
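The constants in Lemma 7 can be reproduced numerically (a sanity check, not part of the proof), using $\mathsf{pivot} = 9.84\left(1+\frac{1}{\varepsilon}\right)^2$ at the boundary $\varepsilon = \sqrt{2}-1$:

```python
import math

# Reproduce the closed-form constants 29.67 and 10.84 from Lemma 7.
eps = math.sqrt(2) - 1
pivot = 9.84 * (1 + 1 / eps) ** 2

bound1 = 1 / (1 + (math.sqrt(2) / 4) ** 2 * 2 * pivot)  # Pr[T_{m*-2}] bound
bound2 = 1 / (1 + (1 - 1 / (1 + eps)) ** 2 * pivot)     # Pr[L_{m*-1}] bound

assert bound1 <= 1 / 29.67
assert abs(bound2 - 1 / 10.84) < 1e-9
```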

Then, we prove that $\Pr[L] \le 0.126$ for $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*\rangle}\right] < \frac{1+\varepsilon}{2}\mathsf{thresh}$.

*Proof.* We aim to bound $\Pr[L]$ by the following equation:

$$\Pr\left[L\right] = \Pr\left[ \bigcup\_{i \in \{1, \ldots, n\}} \left( \overline{T\_{i-1}} \cap T\_i \cap L\_i \right) \right] \tag{3 \text{ revisited}}$$

which can be simplified by the three observations labeled $O_1$, $O_2$, and $O_3$ below.

$O_1$: $\forall i \le m^*-2$, $T_i \subseteq T_{i+1}$. Therefore,

$$\bigcup\_{i \in \{1, \ldots, m^\star - 2\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) \subseteq \bigcup\_{i \in \{1, \ldots, m^\star - 2\}} T\_i \subseteq T\_{m^\star - 2}$$

$O_2$: For $i = m^*-1$, we have

$$\overline{T\_{m^\*-2}} \cap T\_{m^\*-1} \cap L\_{m^\*-1} \subseteq L\_{m^\*-1}$$

$O_3$: $\forall i \ge m^*$, since $\mathsf{Cnt}_{\langle F,i\rangle}$ is rounded up to $\frac{\sqrt{1+2\varepsilon}}{2}\mathsf{pivot}$, we have $\mathsf{Cnt}_{\langle F,i\rangle} \ge \frac{\sqrt{1+2\varepsilon}}{2}\mathsf{pivot} \ge \frac{\mathsf{thresh}}{2} > \frac{\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*\rangle}\right]}{1+\varepsilon} \ge \frac{\mathbb{E}\left[\mathsf{Cnt}_{\langle F,i\rangle}\right]}{1+\varepsilon}$. The second-to-last inequality follows from $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*\rangle}\right] < \frac{1+\varepsilon}{2}\mathsf{thresh}$. Therefore, $L_i = \emptyset$ for $i \ge m^*$ and we have

$$\bigcup\_{i \in \{m^\*, \ldots, n\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) = \emptyset$$

Following the observations $O_1$, $O_2$, and $O_3$, we simplify Eq. 3 and obtain

$$\Pr\left[L\right] \le \Pr\left[T\_{m^\*-2}\right] + \Pr\left[L\_{m^\*-1}\right],$$

Employing Lemma 7 gives $\Pr[L] \le 0.126$.

**Case 2:** $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*\rangle}\right] \ge \frac{1+\varepsilon}{2}\mathsf{thresh}$

Lemma 8. *Given $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*\rangle}\right] \ge \frac{1+\varepsilon}{2}\mathsf{thresh}$, the following bounds hold:*

1. $\Pr\left[T_{m^*-1}\right] \le \frac{1}{10.84}$
2. $\Pr\left[L_{m^*}\right] \le \frac{1}{5.92}$

*Proof.* Let us first prove statement 1. From $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*\rangle}\right] \ge \frac{1+\varepsilon}{2}\mathsf{thresh}$, we can derive $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*-1\rangle}\right] \ge (1+\varepsilon)\mathsf{thresh}$. Therefore, $\Pr[T_{m^*-1}] \le \Pr\left[\mathsf{Cnt}_{\langle F,m^*-1\rangle} \le \frac{1}{1+\varepsilon}\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*-1\rangle}\right]\right]$. Finally, employing Lemma 5 with $\beta = \frac{1}{1+\varepsilon}$, we obtain $\Pr[T_{m^*-1}] \le \frac{1}{1+\left(1-\frac{1}{1+\varepsilon}\right)^2 \cdot \mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*-1\rangle}\right]} \le \frac{1}{1+\left(1-\frac{1}{1+\varepsilon}\right)^2 \cdot (1+\varepsilon)\mathsf{thresh}} = \frac{1}{1+9.84(1+2\varepsilon)} \le \frac{1}{10.84}$. To prove statement 2, we employ Lemma 5 with $\beta = \frac{1}{1+\varepsilon}$ and $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*\rangle}\right] \ge \frac{1+\varepsilon}{2}\mathsf{thresh}$ to obtain $\Pr[L_{m^*}] \le \frac{1}{1+\left(1-\frac{1}{1+\varepsilon}\right)^2 \cdot \mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*\rangle}\right]} \le \frac{1}{1+\left(1-\frac{1}{1+\varepsilon}\right)^2 \cdot \frac{1+\varepsilon}{2}\mathsf{thresh}} = \frac{1}{1+4.92(1+2\varepsilon)} \le \frac{1}{5.92}$.

Then, we prove that $\Pr[L] \le 0.262$ for $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*\rangle}\right] \ge \frac{1+\varepsilon}{2}\mathsf{thresh}$.

*Proof.* We aim to bound $\Pr[L]$ by the following equation:

$$\Pr\left[L\right] = \Pr\left[ \bigcup\_{i \in \{1, \ldots, n\}} \left( \overline{T\_{i-1}} \cap T\_i \cap L\_i \right) \right] \tag{3 \text{ revisited}}$$

which can be simplified by the three observations labeled $O_1$, $O_2$, and $O_3$ below.

$O_1$: $\forall i \le m^*-1$, $T_i \subseteq T_{i+1}$. Therefore,

$$\bigcup\_{i \in \{1, \ldots, m^\star - 1\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) \subseteq \bigcup\_{i \in \{1, \ldots, m^\star - 1\}} T\_i \subseteq T\_{m^\star - 1}$$

$O_2$: For $i = m^*$, we have

$$\overline{T\_{m^\*-1}} \cap T\_{m^\*} \cap L\_{m^\*} \subseteq L\_{m^\*}$$

$O_3$: $\forall i \ge m^*+1$, since $\mathsf{Cnt}_{\langle F,i\rangle}$ is rounded up to $\frac{\sqrt{1+2\varepsilon}}{2}\mathsf{pivot}$ and $m^* \ge \log_2 |\mathsf{sol}(F)| - \log_2(\mathsf{pivot})$, we have $2^i \times \mathsf{Cnt}_{\langle F,i\rangle} \ge 2^{m^*+1} \times \frac{\sqrt{1+2\varepsilon}}{2}\mathsf{pivot} \ge \sqrt{1+2\varepsilon}\,|\mathsf{sol}(F)| \ge \frac{|\mathsf{sol}(F)|}{1+\varepsilon}$. Then we have $\mathsf{Cnt}_{\langle F,i\rangle} \ge \frac{\mathbb{E}\left[\mathsf{Cnt}_{\langle F,i\rangle}\right]}{1+\varepsilon}$. Therefore, $L_i = \emptyset$ for $i \ge m^*+1$ and we have

$$\bigcup\_{i \in \{m^\*+1,\ldots,n\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) = \emptyset$$

Following the observations $O_1$, $O_2$, and $O_3$, we simplify Eq. 3 and obtain

$$\Pr\left[L\right] \le \Pr\left[T\_{m^\*-1}\right] + \Pr\left[L\_{m^\*}\right]$$

Employing Lemma 8 gives $\Pr[L] \le 0.262$.

Combining Cases 1 and 2, we obtain $\Pr[L] \le \max\{0.126, 0.262\} = 0.262$. Therefore, we have proved the statement for ApproxMC6: $\Pr[L] \le 0.262$ for $\varepsilon < \sqrt{2}-1$.

## D.2 Proof of $\Pr[L] \le 0.085$ for $1 \le \varepsilon < 3$

Lemma 9. *Given $1 \le \varepsilon < 3$, the following bounds hold:*

1. $\Pr\left[T_{m^*-4}\right] \le \frac{1}{86.41}$
2. $\Pr\left[L_{m^*-3}\right] \le \frac{1}{40.36}$
3. $\Pr\left[L_{m^*-2}\right] \le \frac{1}{20.68}$

*Proof.* Let us first prove statement 1. For $\varepsilon < 3$, we have $\mathsf{thresh} < \frac{7}{4}\mathsf{pivot}$ and $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*-4\rangle}\right] \ge 8\mathsf{pivot}$. Therefore, $\Pr[T_{m^*-4}] \le \Pr\left[\mathsf{Cnt}_{\langle F,m^*-4\rangle} \le \frac{7}{32}\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*-4\rangle}\right]\right]$. Finally, employing Lemma 5 with $\beta = \frac{7}{32}$, we obtain $\Pr[T_{m^*-4}] \le \frac{1}{1+\left(1-\frac{7}{32}\right)^2 \cdot 8\mathsf{pivot}} \le \frac{1}{1+\left(1-\frac{7}{32}\right)^2 \cdot 8 \cdot 9.84 \cdot \left(1+\frac{1}{3}\right)^2} \le \frac{1}{86.41}$. To prove statement 2, we employ Lemma 5 with $\beta = \frac{1}{1+\varepsilon}$ and $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*-3\rangle}\right] \ge 4\mathsf{pivot}$ to obtain $\Pr[L_{m^*-3}] \le \frac{1}{1+\left(1-\frac{1}{1+\varepsilon}\right)^2 \cdot \mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*-3\rangle}\right]} \le \frac{1}{1+\left(1-\frac{1}{1+\varepsilon}\right)^2 \cdot 4 \cdot 9.84 \cdot \left(1+\frac{1}{\varepsilon}\right)^2} = \frac{1}{40.36}$. Statement 3 follows from the proof of Lemma 2 in [8].

Now let us prove the statement for ApproxMC6: $\Pr[L] \le 0.085$ for $1 \le \varepsilon < 3$.

*Proof.* We aim to bound $\Pr[L]$ by the following equation:

$$\Pr\left[L\right] = \Pr\left[ \bigcup\_{i \in \{1, \ldots, n\}} \left( \overline{T\_{i-1}} \cap T\_i \cap L\_i \right) \right] \tag{3 \text{ revisited}}$$

which can be simplified by the three observations labeled $O_1$, $O_2$, and $O_3$ below.

$O_1$: $\forall i \le m^*-4$, $T_i \subseteq T_{i+1}$. Therefore,

$$\bigcup\_{i \in \{1, \ldots, m^\star - 4\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) \subseteq \bigcup\_{i \in \{1, \ldots, m^\star - 4\}} T\_i \subseteq T\_{m^\star - 4}$$

$O_2$: For $i \in \{m^*-3, m^*-2\}$, we have

$$\bigcup\_{i \in \{m^\star - 3, m^\star - 2\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) \subseteq L\_{m^\star - 3} \cup L\_{m^\star - 2}$$

$O_3$: $\forall i \ge m^*-1$, since $\mathsf{Cnt}_{\langle F,i\rangle}$ is rounded up to $\mathsf{pivot}$ and $m^* \ge \log_2 |\mathsf{sol}(F)| - \log_2(\mathsf{pivot})$, we have $2^i \times \mathsf{Cnt}_{\langle F,i\rangle} \ge 2^{m^*-1} \times \mathsf{pivot} \ge \frac{|\mathsf{sol}(F)|}{2} \ge \frac{|\mathsf{sol}(F)|}{1+\varepsilon}$. The last inequality follows from $\varepsilon \ge 1$. Then we have $\mathsf{Cnt}_{\langle F,i\rangle} \ge \frac{\mathbb{E}\left[\mathsf{Cnt}_{\langle F,i\rangle}\right]}{1+\varepsilon}$. Therefore, $L_i = \emptyset$ for $i \ge m^*-1$ and we have

$$\bigcup\_{i \in \{m^\ast -1, \ldots, n\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) = \emptyset$$

Following the observations $O_1$, $O_2$, and $O_3$, we simplify Eq. 3 and obtain

$$\Pr\left[L\right] \le \Pr\left[T\_{m^\*-4}\right] + \Pr\left[L\_{m^\*-3}\right] + \Pr\left[L\_{m^\*-2}\right]$$

Employing Lemma 9 gives $\Pr[L] \le 0.085$.

#### D.3 Proof of $\Pr[L] \le 0.055$ for $3 \le \varepsilon < 4\sqrt{2}-1$

Lemma 10. *Given $3 \le \varepsilon < 4\sqrt{2}-1$, the following bound holds:*

$$\Pr\left[T\_{m^\*-3}\right] \le \frac{1}{18.19}$$

*Proof.* For $\varepsilon < 4\sqrt{2}-1$, we have $\mathsf{thresh} < \left(2 - \frac{\sqrt{2}}{8}\right)\mathsf{pivot}$ and $\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*-3\rangle}\right] \ge 4\mathsf{pivot}$. Therefore, $\Pr[T_{m^*-3}] \le \Pr\left[\mathsf{Cnt}_{\langle F,m^*-3\rangle} \le \left(\frac{1}{2} - \frac{\sqrt{2}}{32}\right)\mathbb{E}\left[\mathsf{Cnt}_{\langle F,m^*-3\rangle}\right]\right]$. Finally, employing Lemma 5 with $\beta = \frac{1}{2} - \frac{\sqrt{2}}{32}$, we obtain $\Pr[T_{m^*-3}] \le \frac{1}{1+\left(1-\left(\frac{1}{2}-\frac{\sqrt{2}}{32}\right)\right)^2 \cdot 4\mathsf{pivot}} \le \frac{1}{1+\left(1-\left(\frac{1}{2}-\frac{\sqrt{2}}{32}\right)\right)^2 \cdot 4 \cdot 9.84 \cdot \left(1+\frac{1}{4\sqrt{2}-1}\right)^2} \le \frac{1}{18.19}$.

Now let us prove the statement for ApproxMC6: $\Pr[L] \le 0.055$ for $3 \le \varepsilon < 4\sqrt{2}-1$.

*Proof.* We aim to bound $\Pr[L]$ by the following equation:

$$\Pr\left[L\right] = \Pr\left[ \bigcup\_{i \in \{1, \ldots, n\}} \left( \overline{T\_{i-1}} \cap T\_i \cap L\_i \right) \right] \tag{3 \text{ revisited}}$$

which can be simplified by the two observations labeled $O_1$ and $O_2$ below.

$O_1$: $\forall i \le m^*-3$, $T_i \subseteq T_{i+1}$. Therefore,

$$\bigcup\_{i \in \{1, \ldots, m^\ast - 3\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) \subseteq \bigcup\_{i \in \{1, \ldots, m^\ast - 3\}} T\_i \subseteq T\_{m^\ast - 3}$$

$O_2$: $\forall i \ge m^*-2$, since $\mathsf{Cnt}_{\langle F,i\rangle}$ is rounded up to $\mathsf{pivot}$ and $m^* \ge \log_2 |\mathsf{sol}(F)| - \log_2(\mathsf{pivot})$, we have $2^i \times \mathsf{Cnt}_{\langle F,i\rangle} \ge 2^{m^*-2} \times \mathsf{pivot} \ge \frac{|\mathsf{sol}(F)|}{4} \ge \frac{|\mathsf{sol}(F)|}{1+\varepsilon}$. The last inequality follows from $\varepsilon \ge 3$. Then we have $\mathsf{Cnt}_{\langle F,i\rangle} \ge \frac{\mathbb{E}\left[\mathsf{Cnt}_{\langle F,i\rangle}\right]}{1+\varepsilon}$. Therefore, $L_i = \emptyset$ for $i \ge m^*-2$ and we have

$$\bigcup\_{i \in \{m^\star - 2, \ldots, n\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) = \emptyset$$

Following the observations $O_1$ and $O_2$, we simplify Eq. 3 and obtain

$$\Pr\left[L\right] \le \Pr\left[T\_{m^\*-3}\right]$$

Employing Lemma 10 gives $\Pr[L] \le 0.055$.

#### D.4 Proof of $\Pr[L] \le 0.023$ for $\varepsilon \ge 4\sqrt{2}-1$

**Lemma 11.** *Given ε ≥ 4√2 − 1, the following bound holds:*

$$\Pr\left[T\_{m^\*-4}\right] \le \frac{1}{45.28}$$

*Proof.* We have thresh < 2·pivot and E[Cnt<sub>F,m∗−4</sub>] ≥ 8·pivot. Therefore, Pr[T<sub>m∗−4</sub>] ≤ Pr[Cnt<sub>F,m∗−4</sub> ≤ (1/4)·E[Cnt<sub>F,m∗−4</sub>]]. Finally, employing Lemma 5 with β = 1/4, we obtain

$$\Pr\left[T\_{m^\*-4}\right] \le \frac{1}{1+\left(1-\frac{1}{4}\right)^2\cdot 8\,\mathrm{pivot}} \le \frac{1}{1+\left(1-\frac{1}{4}\right)^2\cdot 8\cdot 9.84} \le \frac{1}{45.28}$$

Now let us prove the statement for ApproxMC6: Pr[U] ≤ 0.023 for ε ≥ 4√2 − 1.

*Proof.* We aim to bound Pr[L] by the following equation:

$$\Pr\left[L\right] = \Pr\left[ \bigcup\_{i \in \{1, \ldots, n\}} \left( \overline{T\_{i-1}} \cap T\_i \cap L\_i \right) \right] \tag{3 \text{ revisited}}$$

which can be simplified using the two observations labeled O<sub>1</sub> and O<sub>2</sub> below.

O<sub>1</sub>: ∀i ≤ m∗ − 4, T<sub>i</sub> ⊆ T<sub>i+1</sub>. Therefore,

$$\bigcup\_{i \in \{1, \ldots, m^\star - 4\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) \subseteq \bigcup\_{i \in \{1, \ldots, m^\star - 4\}} T\_i \subseteq T\_{m^\star - 4}$$

O<sub>2</sub>: ∀i ≥ m∗ − 3, since Cnt<sub>F,i</sub> is rounded up to √2·pivot and m∗ ≥ log<sub>2</sub>|sol(F)| − log<sub>2</sub>(pivot), we have 2<sup>i</sup> × Cnt<sub>F,i</sub> ≥ 2<sup>m∗−3</sup> × √2·pivot ≥ √2·|sol(F)|/8 ≥ |sol(F)|/(1+ε). The last inequality follows from ε ≥ 4√2 − 1. Then we have Cnt<sub>F,i</sub> ≥ E[Cnt<sub>F,i</sub>]/(1+ε). Therefore, L<sub>i</sub> = ∅ for i ≥ m∗ − 3 and we have

$$\bigcup\_{i \in \{m^\*-3, \dots, n\}} (\overline{T\_{i-1}} \cap T\_i \cap L\_i) = \emptyset$$

Following the observations O<sub>1</sub> and O<sub>2</sub>, we simplify Eq. 3 and obtain

$$\Pr\left[L\right] \le \Pr\left[T\_{m^\*-4}\right]$$

Employing Lemma 11 gives Pr[L] ≤ 0.023.

## D.5 Proof of Pr[U] ≤ 0.169 for ε < 3

#### Lemma 12

$$\Pr\left[U\_{m^\*}'\right] \le \frac{1}{5.92}$$

*Proof.* Employing Lemma 5 with γ = 1 + ε/(1+ε) and E[Cnt<sub>F,m∗</sub>] ≥ pivot/2, we obtain

$$\Pr\left[U\_{m^\*}'\right] \le \frac{1}{1+\left(\frac{\varepsilon}{1+\varepsilon}\right)^2\mathrm{pivot}/2} \le \frac{1}{1+9.84/2} \le \frac{1}{5.92}$$
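For reference, the first inequality relies on the parameter value pivot = 9.84(1 + 1/ε)² (assumed here from the algorithm's configuration), under which the cancellation is exact:

$$\left(\frac{\varepsilon}{1+\varepsilon}\right)^{2}\cdot\frac{\mathrm{pivot}}{2} \;=\; \left(\frac{\varepsilon}{1+\varepsilon}\right)^{2}\cdot\frac{9.84}{2}\left(\frac{1+\varepsilon}{\varepsilon}\right)^{2} \;=\; \frac{9.84}{2}$$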

Now let us prove the statement for ApproxMC6: Pr[U] ≤ 0.169 for ε < 3.

*Proof.* We aim to bound Pr[U] by the following equation:

$$\Pr\left[U\right] = \Pr\left[ \bigcup\_{i \in \{1, \ldots, n\}} \left( \overline{T\_{i-1}} \cap T\_i \cap U\_i \right) \right] \tag{4 \text{ revisited}}$$

We derive the following observations O<sub>1</sub> and O<sub>2</sub>.

O<sub>1</sub>: ∀i ≤ m∗ − 1, since m∗ ≤ log<sub>2</sub>|sol(F)| − log<sub>2</sub>(pivot) + 1, we have 2<sup>i</sup> × Cnt<sub>F,i</sub> ≤ 2<sup>m∗−1</sup> × thresh ≤ |sol(F)|·(1 + ε/(1+ε)). Then we obtain Cnt<sub>F,i</sub> ≤ E[Cnt<sub>F,i</sub>]·(1 + ε/(1+ε)). Therefore, T<sub>i</sub> ∩ U′<sub>i</sub> = ∅ for i ≤ m∗ − 1 and we have

$$\bigcup\_{i \in \{1, \ldots, m^\*-1\}} \left( \overline{T\_{i-1}} \cap T\_i \cap U\_i \right) \subseteq \bigcup\_{i \in \{1, \ldots, m^\*-1\}} \left( \overline{T\_{i-1}} \cap T\_i \cap U\_i' \right) = \emptyset$$

O<sub>2</sub>: ∀i ≥ m∗, T̄<sub>i</sub> implies Cnt<sub>F,i</sub> > thresh, and then we have 2<sup>i</sup> × Cnt<sub>F,i</sub> > 2<sup>m∗</sup> × thresh ≥ |sol(F)|·(1 + ε/(1+ε)). The second inequality follows from m∗ ≥ log<sub>2</sub>|sol(F)| − log<sub>2</sub>(pivot). Then we obtain Cnt<sub>F,i</sub> > E[Cnt<sub>F,i</sub>]·(1 + ε/(1+ε)). Therefore, T̄<sub>i</sub> ⊆ U′<sub>i</sub> for i ≥ m∗. Since ∀i, T̄<sub>i</sub> ⊆ T̄<sub>i−1</sub>, we have

$$\bigcup\_{i \in \{m^\*, \dots, n\}} (\overline{T\_{i-1}} \cap T\_i \cap U\_i) \subseteq \bigcup\_{i \in \{m^\* + 1, \dots, n\}} \overline{T\_{i-1}} \cup (\overline{T\_{m^\* - 1}} \cap T\_{m^\*} \cap U\_{m^\*})$$

$$\subseteq \overline{T\_{m^\*}} \cup (\overline{T\_{m^\* - 1}} \cap T\_{m^\*} \cap U\_{m^\*})$$

$$\subseteq \overline{T\_{m^\*}} \cup U\_{m^\*}$$

$$\subseteq U'\_{m^\*} \tag{8}$$

Remark that for ε < √2 − 1, we round Cnt<sub>F,m∗</sub> up to (√(1+2ε)/2)·pivot and we have 2<sup>m∗</sup> × (√(1+2ε)/2)·pivot ≤ |sol(F)|·(1 + ε). For √2 − 1 ≤ ε < 1, we round Cnt<sub>F,m∗</sub> up to pivot/√2 and we have 2<sup>m∗</sup> × pivot/√2 ≤ |sol(F)|·(1 + ε). For 1 ≤ ε < 3, we round Cnt<sub>F,m∗</sub> up to pivot and we have 2<sup>m∗</sup> × pivot ≤ |sol(F)|·(1 + ε). This analysis means that *rounding* does not affect the event U<sub>m∗</sub>, and therefore Inequality 8 still holds.

Following the observations O<sub>1</sub> and O<sub>2</sub>, we simplify Eq. 4 and obtain

$$\Pr\left[U\right] \le \Pr\left[U\_{m^\*}'\right]$$

Employing Lemma 12 gives Pr[U] ≤ 0.169.

## D.6 Proof of Pr[U] ≤ 0.044 for ε ≥ 3

#### Lemma 13

$$\Pr\left[\overline{T\_{m^\*+1}}\right] \le \frac{1}{23.14}$$

*Proof.* Since E[Cnt<sub>F,m∗+1</sub>] ≤ pivot/2, we have Pr[T̄<sub>m∗+1</sub>] ≤ Pr[Cnt<sub>F,m∗+1</sub> > 2·(1 + ε/(1+ε))·E[Cnt<sub>F,m∗+1</sub>]]. Employing Lemma 5 with γ = 2·(1 + ε/(1+ε)) and E[Cnt<sub>F,m∗+1</sub>] ≥ pivot/4, we obtain

$$\Pr\left[\overline{T\_{m^\*+1}}\right] \le \frac{1}{1+\left(1+\frac{2\varepsilon}{1+\varepsilon}\right)^2\mathrm{pivot}/4} = \frac{1}{1+2.46\cdot\left(3+\frac{1}{\varepsilon}\right)^2} \le \frac{1}{1+2.46\cdot 3^2} \le \frac{1}{23.14}$$

Now let us prove the statement for ApproxMC6: Pr[U] ≤ 0.044 for ε ≥ 3.

*Proof.* We aim to bound Pr[U] by the following equation:

$$\Pr\left[U\right] = \Pr\left[ \bigcup\_{i \in \{1, \ldots, n\}} \left( \overline{T\_{i-1}} \cap T\_i \cap U\_i \right) \right] \tag{4 \text{ revisited}}$$

We derive the following observations O<sub>1</sub> and O<sub>2</sub>.

O<sub>1</sub>: ∀i ≤ m∗ + 1: for 3 ≤ ε < 4√2 − 1, because we round Cnt<sub>F,i</sub> up to pivot and have m∗ ≤ log<sub>2</sub>|sol(F)| − log<sub>2</sub>(pivot) + 1, we obtain 2<sup>i</sup> × Cnt<sub>F,i</sub> ≤ 2<sup>m∗+1</sup> × pivot ≤ 4·|sol(F)| ≤ (1+ε)·|sol(F)|. For ε ≥ 4√2 − 1, we round Cnt<sub>F,i</sub> up to √2·pivot and obtain 2<sup>i</sup> × Cnt<sub>F,i</sub> ≤ 2<sup>m∗+1</sup> × √2·pivot ≤ 4√2·|sol(F)| ≤ (1+ε)·|sol(F)|. Then we obtain Cnt<sub>F,i</sub> ≤ E[Cnt<sub>F,i</sub>]·(1+ε). Therefore, U<sub>i</sub> = ∅ for i ≤ m∗ + 1 and we have

$$\bigcup\_{i \in \{1, \ldots, m^\star + 1\}} \left( \overline{T\_{i-1}} \cap T\_i \cap U\_i \right) = \emptyset$$

O<sub>2</sub>: ∀i ≥ m∗ + 2, since ∀i, T̄<sub>i</sub> ⊆ T̄<sub>i−1</sub>, we have

$$\bigcup\_{i \in \{m^\*+2,\ldots,n\}} \left( \overline{T\_{i-1}} \cap T\_i \cap U\_i \right) \subseteq \bigcup\_{i \in \{m^\*+2,\ldots,n\}} \overline{T\_{i-1}} \subseteq \overline{T\_{m^\*+1}}$$

Following the observations O<sub>1</sub> and O<sub>2</sub>, we simplify Eq. 4 and obtain

$$\Pr\left[U\right] \le \Pr\left[\overline{T\_{m^\*+1}}\right]$$

Employing Lemma 13 gives Pr[U] ≤ 0.044.




Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Satisfiability Modulo Finite Fields

Alex Ozdemir<sup>1(B)</sup>, Gereon Kremer<sup>1,2</sup>, Cesare Tinelli<sup>3</sup>, and Clark Barrett<sup>1</sup>

> <sup>1</sup> Stanford University, Stanford, USA (aozdemir@stanford.edu)
> <sup>2</sup> Certora, Tel Aviv-Yafo, Israel
> <sup>3</sup> University of Iowa, Iowa City, USA

Abstract. We study satisfiability modulo the theory of finite fields and give a decision procedure for this theory. We implement our procedure for prime fields inside the cvc5 SMT solver. Using this theory, we construct SMT queries that encode translation validation for various zero knowledge proof compilers applied to Boolean computations. We evaluate our procedure on these benchmarks. Our experiments show that our implementation is superior to previous approaches (which encode field arithmetic using integers or bit-vectors).

### 1 Introduction

Finite fields are critical to the design of recent cryptosystems. For instance, elliptic curve operations are defined in terms of operations in a finite field. Also, Zero-Knowledge Proofs (ZKPs) and Multi-Party Computations (MPCs), powerful tools for building secure and private systems, often require key properties of the system to be expressed as operations in a finite field.

Field-based cryptosystems already safeguard everything from our money to our privacy. Over 80% of our TLS connections, for example, use elliptic curves [4,66]. Private cryptocurrencies [32,59,89] built on ZKPs have billiondollar market capitalizations [44,45]. And MPC protocols have been used to operate auctions [17], facilitate sensitive cross-agency collaboration in the US federal government [5], and compute cross-company pay gaps [8]. These systems safeguard our privacy, assets, and government data. Their importance justifies spending considerable effort to ensure that the systems are free of bugs that could compromise the resources they are trying to protect; thus, they are prime targets for formal verification.

However, verifying field-based cryptosystems is challenging, in part because current automated verification tools do not reason directly about finite fields. Many tools use Satisfiability Modulo Theories (SMT) solvers as a back-end [9, 27,33,93,95]. SMT solvers [7,10,12,20,26,35,73,76,77] are automated reasoners that determine the satisfiability of formulas in first-order logic with respect to one or more *background theories*. They combine propositional search with specialized reasoning procedures for these theories, which model common data types such as Booleans, integers, reals, bit-vectors, arrays, algebraic datatypes, and more. Since SMT solvers do not currently support a theory of finite fields, SMT-based tools must encode field operations using another theory.

There are two natural ways to represent finite fields using commonly supported theories in SMT, but both are ultimately inefficient. Recall that a finite field of prime order can be represented as the integers with addition and multiplication performed modulo a prime p. Thus, field operations can be represented using integers or bit-vectors: both support addition, multiplication, and modular reduction. However, both approaches fall short. Non-linear integer reasoning is notoriously challenging for SMT solvers, and bit-vector solvers perform abysmally on fields of cryptographic size (hundreds of bits).

In this paper, we develop for the first time a direct solver for finite fields within an SMT solver. We use well-known ideas from computer algebra (specifically, Gröbner bases [21] and triangular decomposition [6,99]) to form the basis of our decision procedure. However, we improve on this baseline in two important ways. First, our decision procedure does not manipulate *field polynomials* (i.e., those of form <sup>X</sup><sup>p</sup> <sup>−</sup> <sup>X</sup>). As expected, this results in a loss of completeness at the Gröbner basis stage. However, surprisingly, this often does not matter. Furthermore, completeness is recovered during the model construction algorithm (albeit in a rather rudimentary way). This modification turns out to be crucial for obtaining reasonable performance. Second, we implement a proof-tracing mechanism in the Gröbner basis engine, thereby enabling it to compute unsatisfiable cores, which is also very beneficial in the context of SMT solving. Finally, we implement all of this as a theory solver for prime-order fields inside the cvc5 SMT solver.

To guide research in this area, we also give a first set of QF\_FF (quantifier-free, finite field) benchmarks, obtained from the domain of ZKP compiler correctness. ZKP compilers translate from high-level computations (e.g., over Booleans, bitvectors, arrays, etc.) to systems of finite field constraints that are usable by ZKPs. We instrument existing ZKP compilers to produce translation validation [86] verification conditions, i.e. conditions that represent desirable correctness properties of a specific compilation. We give these compilers concrete Boolean computations (which we sample at random), and construct SMT formulas capturing the correctness of the ZKP compilers' translations of those computations into field constraints. We represent the formulas using both our new theory of finite fields and also the alternative theory encodings mentioned above.

We evaluate our tool on these benchmarks and compare it to the approaches based on bit-vectors, integers, and pure computer algebra (without SMT). We find that our tool significantly outperforms the other solutions. Compared to the best previous solution (we list prior alternatives in Sect. 7), it is 6× faster and it solves 2× more benchmarks.

In sum, our contributions are:

1. the first SMT theory of finite fields, together with a proposed SMT-LIB syntax for it;
2. a decision procedure for this theory based on Gröbner bases and triangular decomposition that avoids field polynomials and can compute unsatisfiable cores;
3. an implementation of the procedure as a theory solver for prime-order fields inside the cvc5 SMT solver; and
4. the first set of QF\_FF benchmarks, which encode translation validation queries for ZKP compilers on Boolean computations.

In the rest of the paper, we discuss related work (§1.1), cover background and notation (§2), define the theory of finite fields (§3), give a decision procedure (§4), describe our implementation (§5), explain the benchmarks (§6), and report on experiments (§7).

#### 1.1 Related Work

There is a large body of work on computer algebra, with many algorithms implemented in various tools [1,18,31,37,49,52,58,72,100,101]. However, the focus in this work is on quickly constructing useful algebraic objects (e.g., a Gröbner basis), rather than on searching for a solution to a set of field constraints.

One line of recent work [54,55] by Hader and Kovács considers SMT-oriented field reasoning. One difference with our work is that it scales poorly with field size because it uses field polynomials to achieve completeness. Furthermore, their solver is not public.

Others consider verifying field constraints used in ZKPs. One paper surveys possible approaches [97], and another considers proof-producing ZKP compilation [24]. However, neither develops automated, general-purpose tools.

Still other works study automated reasoning for non-linear arithmetic over reals and integers [3,23,25,29,47,60–62,70,74,96,98]. A key challenge is reasoning about *comparisons*. We work over finite fields and do not consider comparisons because they are used for neither elliptic curves nor most ZKPs.

Further afield, researchers have developed techniques for verified algebraic reasoning in proof assistants [15,64,75,79], with applications to mathematics [19,28,51,65] and cryptography [39,40,85,91]. In contrast, our focus is on *fully automated* reasoning about finite fields.

### 2 Background

#### 2.1 Algebra

Here, we summarize algebraic definitions and facts that we will use; see [71, Chapters 1 through 8] or [34, Part IV] for a full presentation.

*Finite Fields.* A *finite field* is a finite set equipped with binary operations + and × that have identities (0 and 1 respectively), have inverses (save that there is no multiplicative inverse for 0), and satisfy associativity, commutativity, and distributivity. The *order* of a finite field is the size of the set. All finite fields have order q = p<sup>e</sup> for some prime p (called the *characteristic*) and positive integer e. Such an integer q is called a *prime power*.

Up to isomorphism, the field of order q is unique and is denoted F<sub>q</sub>, or F when the order is clear from context. The fields F<sub>q<sup>d</sup></sub> for d > 1 are called *extension fields* of F<sub>q</sub>. In contrast, F<sub>q</sub> may be called the *base field*. We write F ⊂ G to indicate that F is a field that is isomorphic to the result of restricting field G to some subset of its elements (but with the same operations). We note in particular that F<sub>q</sub> ⊂ F<sub>q<sup>d</sup></sub>. A field of prime order p is called a *prime field*.

*Polynomials.* For a finite field F and formal variables X<sub>1</sub>,...,X<sub>k</sub>, F[X<sub>1</sub>,...,X<sub>k</sub>] denotes the set of polynomials in X<sub>1</sub>,...,X<sub>k</sub> with coefficients in F. By taking the variables to be in F, a polynomial f ∈ F[X<sub>1</sub>,...,X<sub>k</sub>] can be viewed as a function from F<sup>k</sup> to F. However, by taking the variables to be in an extension G of F, f can also be viewed as a function from G<sup>k</sup> to G.

For a set of polynomials S = {f<sub>1</sub>,...,f<sub>m</sub>} ⊂ F<sub>q</sub>[X<sub>1</sub>,...,X<sub>k</sub>], the set I = {g<sub>1</sub>f<sub>1</sub> + ··· + g<sub>m</sub>f<sub>m</sub> : g<sub>i</sub> ∈ F<sub>q</sub>[X<sub>1</sub>,...,X<sub>k</sub>]} is called the *ideal* generated by S and is denoted ⟨f<sub>1</sub>,...,f<sub>m</sub>⟩ or ⟨S⟩. In turn, S is called a *basis* for the ideal I.

The *variety* of an ideal I in field G ⊃ F is denoted V<sub>G</sub>(I), and is the set {**x** ∈ G<sup>k</sup> : ∀f ∈ I, f(**x**) = 0}. That is, V<sub>G</sub>(I) contains the common zeros of polynomials in I, viewed as functions over G. Note that for any set of polynomials S that generates I, V<sub>G</sub>(I) contains exactly the common zeros of S in G. When the space G is just F, we denote the variety as V(I). An ideal I that contains 1 contains all polynomials and is called *trivial*.

One can show that if I is trivial, then V(I) = ∅. However, the converse does not hold. For instance, X<sup>2</sup> + 1 ∈ F<sub>3</sub>[X] has no zeros in F<sub>3</sub>, but 1 ∉ ⟨X<sup>2</sup> + 1⟩. But, one can also show that I is trivial iff for all extensions G of F, V<sub>G</sub>(I) = ∅.

The *field polynomial* for field F<sub>q</sub> in variable X is X<sup>q</sup> − X. Its zeros are all of F<sub>q</sub>, and it has no additional zeros in any extension of F<sub>q</sub>. Thus, for an ideal I of polynomials in F[X<sub>1</sub>,...,X<sub>k</sub>] that contains field polynomials for each variable X<sub>i</sub>, I is trivial iff V(I) = ∅. For this reason, field polynomials are a common tool for ensuring the completeness of ideal-based reasoning techniques [48,54,97].

*Representation.* We represent F<sub>p</sub> as the set of integers {0, 1,...,p − 1}, with the operations + and × performed modulo p. The representation of F<sub>p<sup>e</sup></sub> with e > 1 is more complex. Unfortunately, the set {0, 1,...,p<sup>e</sup> − 1} with + and × performed modulo p<sup>e</sup> is not a field, because multiples of p do not have multiplicative inverses. Instead, we represent F<sub>p<sup>e</sup></sub> as the set of polynomials in F<sub>p</sub>[X] of degree less than e. The operations + and × are performed modulo q(X), an irreducible polynomial<sup>1</sup> of degree e [71, Chapter 6]. There are p<sup>e</sup> such polynomials, and so long as q(X) is irreducible, all (save 0) have inverses. Note that this definition of F<sub>p<sup>e</sup></sub> generalizes F<sub>p</sub>, and captures the fact that F<sub>p</sub> ⊂ F<sub>p<sup>e</sup></sub>.
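To make the extension-field representation concrete, here is a small Python sketch (a hedged illustration: we pick p = 3 and the irreducible modulus q(X) = X² + 1, the polynomial from the example above, and store each element a + bα as a pair (a, b) where α is the class of X):

```python
# F_9 = F_3[X] / (X^2 + 1): elements a + b*alpha stored as pairs (a, b),
# where alpha is the class of X, so alpha^2 = -1.
p = 3

def add(u, v):
    return ((u[0] + v[0]) % p, (u[1] + v[1]) % p)

def mul(u, v):
    a, b = u
    c, d = v
    # (a + b*alpha)(c + d*alpha) = (ac - bd) + (ad + bc)*alpha, using alpha^2 = -1
    return ((a * c - b * d) % p, (a * d + b * c) % p)

# X^2 + 1 has no zero in F_3 ...
assert all((x * x + 1) % p != 0 for x in range(p))
# ... but alpha is a zero of X^2 + 1 in the extension F_9
alpha = (0, 1)
assert add(mul(alpha, alpha), (1, 0)) == (0, 0)
# every nonzero element of F_9 has a multiplicative inverse
elems = [(a, b) for a in range(p) for b in range(p)]
assert all(any(mul(u, v) == (1, 0) for v in elems)
           for u in elems if u != (0, 0))
```

Choosing a reducible modulus instead (e.g. X² − 1 = (X − 1)(X + 1)) would break the inverse property, mirroring why q(X) must be irreducible.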

#### 2.2 Ideal Membership

The ideal membership problem is to determine whether a given polynomial p is in the ideal generated by a given set of polynomials D. We summarize definitions and facts relevant to algorithms for this problem; see [30] for a full presentation.

*Monomial Ordering.* In F[X<sub>1</sub>,...,X<sub>k</sub>], a *monomial* is a polynomial of the form X<sub>1</sub><sup>e<sub>1</sub></sup> ··· X<sub>k</sub><sup>e<sub>k</sub></sup> with non-negative integers e<sub>i</sub>. A *monomial ordering* is a total ordering on monomials such that for all monomials p, q, r, if p < q, then pr < qr.

<sup>1</sup> Recall that an irreducible polynomial cannot be factored into two or more nonconstant polynomials.

The *lexicographical* ordering for monomials X<sub>1</sub><sup>e<sub>1</sub></sup> ··· X<sub>k</sub><sup>e<sub>k</sub></sup> orders them lexicographically by the tuple (e<sub>1</sub>,...,e<sub>k</sub>). The *graded-reverse lexicographical* (grevlex) ordering is lexicographical by the tuple (e<sub>1</sub> + ··· + e<sub>k</sub>, e<sub>1</sub>,...,e<sub>k</sub>). With respect to an ordering, lm(f) denotes the greatest monomial of a polynomial f.
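A quick sanity check of these tuple comparisons in Python (the exponent tuples are our illustrative examples; for the graded ordering we show only its total-degree-first behavior):

```python
# Monomials in two variables as exponent tuples (e1, e2).
lex_key = lambda m: m                  # lexicographical: compare (e1, ..., ek)
graded_key = lambda m: (sum(m), m)     # graded: compare total degree first

ms = [(2, 0), (0, 3), (1, 1)]          # x^2, y^3, x*y
assert max(ms, key=lex_key) == (2, 0)     # under lex, x^2 is the leading monomial
assert max(ms, key=graded_key) == (0, 3)  # graded orders put y^3 (degree 3) first
```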

*Reduction.* For polynomials p and d, if lm(d) divides a term t of p, then we say that p *reduces* to r *modulo* d (written p →<sub>d</sub> r) for r = p − (t/lm(d))·d. For a set of polynomials D, we write p →<sub>D</sub> r if p →<sub>d</sub> r for some d ∈ D. Let →<sup>∗</sup><sub>D</sub> be the transitive closure of →<sub>D</sub>. We define p ⇒<sub>D</sub> r to hold when p →<sup>∗</sup><sub>D</sub> r and there is no r′ such that r →<sub>D</sub> r′.

Reduction is a sound, but incomplete, algorithm for ideal membership. That is, one can show that p ⇒<sub>D</sub> 0 implies p ∈ ⟨D⟩, but the converse does not hold in general.

*Gröbner Bases.* Define the *s-polynomial* for polynomials p and q by spoly(p, q) = p · lm(q) − q · lm(p). A Gröbner basis (GB) [21] is a set of polynomials P characterized by the following equivalent conditions:

1. ∀p, p′ ∈ P, spoly(p, p′) ⇒<sub>P</sub> 0 (*closure under the reduction of s-polynomials*)
2. ∀p ∈ ⟨P⟩, p ⇒<sub>P</sub> 0 (*reduction is a complete test for ideal membership*)

Gröbner bases are useful for deciding ideal membership. From the first characterization, one can build algorithms for constructing a Gröbner basis for any ideal [21]. Then, the second characterization gives an ideal membership test. When P is a GB, the relation ⇒<sub>P</sub> is a function (i.e., →<sub>P</sub> is confluent), and it can be efficiently computed [1,21]; thus, this test is efficient.

A *Gröbner basis engine* takes a set of generators G for some ideal I and computes a Gröbner basis for I. We describe the high-level design of such engines here. An engine constructs a sequence of bases G<sub>0</sub>, G<sub>1</sub>, G<sub>2</sub>,... (with G<sub>0</sub> = G) until some G<sub>i</sub> is a Gröbner basis. Each G<sub>i</sub> is constructed from G<sub>i−1</sub> according to one of three types of steps. First, for some p, q ∈ G<sub>i−1</sub> such that spoly(p, q) ⇒<sub>G<sub>i−1</sub></sub> r ≠ 0, the engine can set G<sub>i</sub> = G<sub>i−1</sub> ∪ {r}. Second, for some p ∈ G<sub>i−1</sub> such that p ⇒<sub>G<sub>i−1</sub>\{p}</sub> r ≠ p, the engine can set G<sub>i</sub> = (G<sub>i−1</sub> \ {p}) ∪ {r}. Third, for some p ∈ G<sub>i−1</sub> such that p ⇒<sub>G<sub>i−1</sub>\{p}</sub> 0, the engine can set G<sub>i</sub> = G<sub>i−1</sub> \ {p}. Notice that all rules depend on the current basis; some add polynomials, and some remove them. In general, it is unclear which sequence of steps will construct a Gröbner basis most quickly: this is an active area of research [1,18,41,43].
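The engine loop can be sketched in plain Python. This is a naive Buchberger-style loop over F<sub>p</sub>, not an optimized engine: the dense dictionary representation, graded-lex order, prime 7, and example ideal ⟨x² − y, x²⟩ are our choices, and all selection heuristics are omitted. It also illustrates the earlier point that reduction alone is an incomplete membership test:

```python
# Polynomials over F_P as dicts mapping exponent tuples to nonzero coefficients.
P = 7

def key(m):                      # graded-lex monomial order
    return (sum(m), m)

def lm(f):                       # leading monomial
    return max(f, key=key)

def sub_scaled(f, c, q, g):
    """Return f - c * x^q * g over F_P."""
    r = dict(f)
    for m, a in g.items():
        mm = tuple(x + y for x, y in zip(q, m))
        r[mm] = (r.get(mm, 0) - c * a) % P
        if r[mm] == 0:
            del r[mm]
    return r

def normal_form(f, D):
    """Fully reduce f modulo the set D (the relation f =>_D r)."""
    out, f = {}, dict(f)
    while f:
        m = lm(f)
        for d in D:
            dm = lm(d)
            if all(a >= b for a, b in zip(m, dm)):
                c = f[m] * pow(d[dm], -1, P) % P
                f = sub_scaled(f, c, tuple(a - b for a, b in zip(m, dm)), d)
                break
        else:                    # leading term irreducible: set it aside
            out[m] = f.pop(m)
    return out

def spoly(f, g):
    """s-polynomial: cancel the leading terms of f and g."""
    mf, mg = lm(f), lm(g)
    L = tuple(max(a, b) for a, b in zip(mf, mg))
    s = sub_scaled({}, (-pow(f[mf], -1, P)) % P,
                   tuple(a - b for a, b in zip(L, mf)), f)
    return sub_scaled(s, pow(g[mg], -1, P),
                      tuple(a - b for a, b in zip(L, mg)), g)

def buchberger(F):
    """Naive engine: add reduced s-polynomials until all reduce to 0."""
    G = [dict(f) for f in F]
    pairs = [(i, j) for i in range(len(G)) for j in range(i)]
    while pairs:
        i, j = pairs.pop()
        r = normal_form(spoly(G[i], G[j]), G)
        if r:
            pairs += [(len(G), k) for k in range(len(G))]
            G.append(r)
    return G

x2_minus_y = {(2, 0): 1, (0, 1): P - 1}        # x^2 - y
x2 = {(2, 0): 1}                               # x^2
y = {(0, 1): 1}                                # y = x^2 - (x^2 - y) is in the ideal ...
assert normal_form(y, [x2_minus_y, x2]) == y   # ... yet does not reduce to 0
G = buchberger([x2_minus_y, x2])
assert normal_form(y, G) == {}                 # with a Groebner basis it does
```

Real engines differ mainly in how they pick the next pair and discard useless ones, which is exactly the open heuristic question mentioned above.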

#### 2.3 Zero Knowledge Proofs

Zero-knowledge proofs allow one to prove that some secret data satisfies a public property, without revealing the data itself. See [94] for a full presentation; we give a brief overview here. There are two parties: a *verifier* V and a *prover* P. V knows a public *instance* x and asks P to show that it has knowledge of a secret *witness* w satisfying a public *predicate* φ(x, w). To do so, P runs an efficient (i.e., polytime in a security parameter λ) proving algorithm Prove(φ, x, w) → π and sends the resulting *proof* π to V. Then, V runs an efficient verification algorithm Verify(φ, x, π) → {0, 1} that accepts or rejects the proof. A system for Zero-Knowledge Proofs of knowledge (ZKPs) is a (Prove, Verify) pair with:

- *completeness:* if φ(x, w) holds, then Verify(φ, x, Prove(φ, x, w)) accepts;
- *knowledge soundness:* an efficient prover that does not know a w satisfying φ(x, w) can produce an accepting proof with probability at most negl(λ);<sup>2</sup>
- *zero-knowledge:* π reveals nothing about w beyond the truth of φ(x, w).

ZKP applications are manifold. ZKPs are the basis of private cryptocurrencies such as Zcash and Monero, which have a combined market capitalization of \$2.80B as of 30 June 2022 [44,45]. They've also been proposed for auditing sealed court orders [46], operating private gun registries [63], designing privacypreserving middleboxes [53] and more [22,56].

This breadth of applications is possible because implemented ZKPs are very general: they support any φ checkable in polytime. However, φ must first be compiled to a cryptosystem-compatible computation language. The most common language is a *rank-1 constraint system* (R1CS). In an R1CS C, x and w are together encoded as a vector **z** ∈ F<sup>m</sup>. The system C is defined by three matrices A, B, C ∈ F<sup>n×m</sup>; it is satisfied when A**z** ◦ B**z** = C**z**, where ◦ is the elementwise product. Thus, the predicate can be viewed as n distinct *constraints*, where constraint i has the form (∑<sub>j</sub> A<sub>ij</sub>z<sub>j</sub>)(∑<sub>j</sub> B<sub>ij</sub>z<sub>j</sub>) − (∑<sub>j</sub> C<sub>ij</sub>z<sub>j</sub>) = 0. Note that each constraint is a degree ≤ 2 polynomial in m variables that **z** must be a zero of. For security reasons, F must be large: its prime must have ≈255 bits.
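The satisfaction condition A**z** ◦ B**z** = C**z** is easy to check directly; below is a small Python sketch (the prime 13 is an illustrative toy, and the single constraint shown, w · w = w with z = (1, w), is our example of enforcing that a wire is boolean):

```python
P = 13  # toy prime for illustration; ZKP fields use primes of ~255 bits

def r1cs_satisfied(A, B, C, z):
    """Check the R1CS condition Az o Bz = Cz (o = elementwise product) over F_P."""
    dot = lambda row: sum(a * zi for a, zi in zip(row, z)) % P
    return all(dot(a) * dot(b) % P == dot(c)
               for a, b, c in zip(A, B, C))

# One constraint forcing z[1] = w to be 0 or 1:  w * w = w,  with z = (1, w).
A = [[0, 1]]
B = [[0, 1]]
C = [[0, 1]]
assert r1cs_satisfied(A, B, C, [1, 0])
assert r1cs_satisfied(A, B, C, [1, 1])
assert not r1cs_satisfied(A, B, C, [1, 5])   # 5*5 = 25 = 12 (mod 13), not 5
```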

*Encoding.* The efficiency of the ZKP scales quasi-linearly with n. Thus, it is useful to encode φ as an R1CS with a minimal number of constraints. Since equisatisfiability (not logical equivalence) is needed, encodings may introduce new variables.

As an example, consider the Boolean computation a ← c<sub>1</sub> ∨ ··· ∨ c<sub>k</sub>. Assume that c′<sub>1</sub>,...,c′<sub>k</sub> ∈ F are elements in **z** that are 0 or 1 such that c<sub>i</sub> ↔ (c′<sub>i</sub> = 1). How can one ensure that a′ ∈ F (also in **z**) is 0 or 1 and a ↔ (a′ = 1)? Given that there are k − 1 ORs, natural approaches use Θ(k) constraints. One clever approach is to introduce a variable x and enforce the constraints x(∑<sub>i</sub> c′<sub>i</sub>) = a′ and (1 − a′)(∑<sub>i</sub> c′<sub>i</sub>) = 0. If any c<sub>i</sub> is true, a′ must be 1 to satisfy the second constraint; setting x to the sum's inverse satisfies the first. If all c<sub>i</sub> are false, the first constraint ensures a′ is 0. This encoding is correct when the sum does not overflow; thus, k must be smaller than F's characteristic.
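The gadget can be checked exhaustively for a small k. The Python sketch below is our illustration (the prime 97 and the helper names are hypothetical), with x set to the sum's inverse exactly as described:

```python
from itertools import product

P = 97  # illustrative prime; the encoding needs k < characteristic of F

def or_witness(cs):
    """Given bits c'_1..c'_k, produce (a', x) satisfying the two constraints."""
    s = sum(cs) % P
    a = 1 if any(cs) else 0
    x = pow(s, -1, P) if s else 0   # x = (sum)^{-1} when the sum is nonzero
    return a, x

def constraints_hold(cs, a, x):
    s = sum(cs) % P
    return (x * s) % P == a and ((1 - a) * s) % P == 0

for bits in product([0, 1], repeat=3):
    a, x = or_witness(bits)
    assert a == (1 if any(bits) else 0)      # a' encodes the OR of the bits
    assert constraints_hold(bits, a, x)
# a' = 0 cannot be faked when some c'_i = 1: the second constraint fails
assert not constraints_hold((1, 0, 0), 0, 0)
```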

Optimizations like this can be quite complex. Thus, ZKP programmers use constraint synthesis libraries [14,69] or compilers [13,24,80,81,84,92,102] to generate an R1CS from a high-level description. Such tools support objects like Booleans, fixed-width integers, arrays, and user-defined data-types. The correctness of these tools is critical to the correctness of any system built with them.

<sup>2</sup> f(λ) ≤ negl(λ) if for all c ∈ N, f(λ) = o(λ<sup>−c</sup>).

#### 2.4 SMT

We assume usual terminology for many-sorted first order logic with equality ( [38] gives a complete presentation). Let Σ be a many-sorted signature including a sort Bool and symbol family ≈<sup>σ</sup> (abbreviated ≈) with sort σ × σ → Bool for all σ in Σ. A *theory* is a pair T = (Σ, **I**), where Σ is a signature and **I** is a class of Σ-interpretations. A Σ-formula φ is *satisfiable* (resp., *unsatisfiable*) in T if it is satisfied by some (resp., no) interpretation in **I**. Given a (set of) formula(s) S, we write S |=<sup>T</sup> φ if every interpretation M ∈ **I** that satisfies S also satisfies φ.

When using the CDCL(T) framework for SMT, the reasoning engine for each theory is encapsulated inside a *theory solver*. Here, we mention the fragment of CDCL(T) that is relevant for our purposes ([78] gives a complete presentation).

The goal of CDCL(T) is to check a formula φ for satisfiability. A *core* module manages a propositional search over the propositional abstraction of φ and communicates with the theory solver. As the core constructs partial propositional assignments for the abstract formula, the theory solver is given the literals that correspond to the current propositional assignment. When the propositional assignment is completed (or, optionally, before), the theory solver must determine whether its literals are jointly satisfiable. If so, it must be able to provide an interpretation in **I** (which includes an assignment to theory variables) that satisfies them. If not, it may indicate a strict subset of the literals which are unsatisfiable: an unsatisfiable core. Smaller unsatisfiable cores usually accelerate the propositional search.

### 3 The Theory of Finite Fields

We define the theory T<sup>F</sup>*<sup>q</sup>* of the finite field Fq, for any order q. Its sort and symbols are indexed by the parameter q; we omit q when clear from context.

The signature of the theory is given in Fig. 1. It includes sort F, which intuitively denotes the sort of elements of F<sup>q</sup> and is represented in our proposed SMT-LIB format as (\_ FiniteField <sup>q</sup>). There is a constant symbol for each element of Fq, and function symbols for addition and multiplication. Other finite field operations (e.g., negation, subtraction, and inverses) naturally reduce to this signature.

An interpretation M of T<sub>F<sub>q</sub></sub> must interpret: F as F<sub>q</sub>; each n ∈ {0,...,q − 1} as the nth element of F<sub>q</sub> in lexicographical order;<sup>3</sup> + as addition in F<sub>q</sub>; × as multiplication in F<sub>q</sub>; and ≈ as equality in F<sub>q</sub>.

Note that, to avoid ambiguity, we require that the sort of any constant ff<sub>n</sub> be ascribed. For instance, the nth element of F<sub>q</sub> is written (as ff<sub>n</sub> (\_ FiniteField q)). The sorts of non-nullary function symbols need not be ascribed: they can be inferred from their arguments.

<sup>3</sup> For non-prime F*p<sup>e</sup>* , we use the lexicographical ordering of elements represented as polynomials in F*p*[X] modulo the Conway polynomial [83,90] C*p,e*(X). This representation is standard [57].
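For concreteness, here is a small query in this format. The operator names ff.add and ff.mul and the logic name QF_FF follow our SMT-LIB proposal; treat the exact spelling as an assumption of this sketch. The query asks for x, y ∈ F<sub>7</sub> with xy = 1 and x + y = 0; it is unsatisfiable, since x = −y forces y<sup>2</sup> = −1 ≡ 6 (mod 7), and 6 is not a quadratic residue modulo 7.

```smt
(set-logic QF_FF)
(declare-const x (_ FiniteField 7))
(declare-const y (_ FiniteField 7))
; x * y = 1
(assert (= (ff.mul x y) (as ff1 (_ FiniteField 7))))
; x + y = 0
(assert (= (ff.add x y) (as ff0 (_ FiniteField 7))))
(check-sat) ; expected: unsat
```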


Fig. 1. Signature of the theory of F*<sup>q</sup>*


Fig. 2. The decision procedure for F*q*.

### 4 Decision Procedure

Recall (§2.4) that a CDCL(T) theory solver for F must decide the satisfiability of a set of F-literals. At a high level, our decision procedure comprises three steps. First, we reduce to a problem concerning a single algebraic variety. Second, we use a GB-based test for unsatisfiability that is fast and sound, but incomplete. Third, we attempt model construction. Figure 2 shows pseudocode for the decision procedure; we will explain it incrementally.

#### 4.1 Algebraic Reduction

Let L = {ℓ<sub>1</sub>,...,ℓ<sub>|L|</sub>} be a set of literals. Each F-literal has the form ℓ<sub>i</sub> = s<sub>i</sub> ⋈ t<sub>i</sub>, where s<sub>i</sub> and t<sub>i</sub> are F-terms and ⋈ ∈ {≈, ≉}. Let **X** = {X<sub>1</sub>,...,X<sub>k</sub>} denote the free variables in L. Let E, D ⊆ {1,...,|L|} be the sets of indices corresponding to equalities and disequalities in L, respectively. Let [[t]] ∈ F[**X**] denote the natural interpretation of F-terms as polynomials in F[**X**] (Fig. 3). Let P<sub>E</sub> ⊂ F[**X**] be the set of interpretations of the equalities, i.e., P<sub>E</sub> = {[[s<sub>i</sub>]] − [[t<sub>i</sub>]]}<sub>i∈E</sub>, and let P<sub>D</sub> ⊂ F[**X**] be the interpretations of the *dis*equalities, i.e., P<sub>D</sub> = {[[s<sub>i</sub>]] − [[t<sub>i</sub>]]}<sub>i∈D</sub>. The satisfiability of L reduces to whether V(P<sub>E</sub>) \ ⋃<sub>p∈P<sub>D</sub></sub> V(p) is non-empty.

To simplify, we reduce disequalities to equalities using a classic technique [88]: we introduce a fresh variable W<sub>i</sub> for each i ∈ D and define P′<sub>D</sub> as

$$P'\_D = \{W\_i([[s\_i]] - [[t\_i]]) - 1\}\_{i \in D}$$

$$\text{Const}\,\frac{t \in \mathbb{F}}{[[t]] = t} \quad \text{Var}\,\frac{}{[[X\_i]] = X\_i} \quad \text{Add}\,\frac{[[s]] = s' \quad [[t]] = t'}{[[s + t]] = s' + t'} \quad \text{Mul}\,\frac{[[s]] = s' \quad [[t]] = t'}{[[s \times t]] = s' \times t'}$$

Fig. 3. Interpreting F-terms as polynomials

Note that each p ∈ P′<sub>D</sub> has zeros exactly at the values of **X** where its analog in P<sub>D</sub> is *not* zero. Also note that P′<sub>D</sub> ⊂ F<sub>q</sub>[**X**′], where **X**′ = **X** ∪ {W<sub>i</sub>}<sub>i∈D</sub>.

We define P to be P<sub>E</sub> ∪ P′<sub>D</sub> (constructed in lines 2 to 6, Fig. 2) and note three useful properties of P. First, L is satisfiable if and only if V(P) is non-empty. Second, for any P′ ⊆ P, if V(P′) = ∅, then {π(p) : p ∈ P′} is an unsatisfiable core, where π maps a polynomial to the literal it is derived from. Third, from any **x** ∈ V(P) one can immediately construct a model. Thus, our theory solver reduces to understanding properties of the variety V(P).
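The disequality reduction can be checked concretely for a single disequality over a small prime field. The sketch below is illustrative only: the constraint x ≉ 3 over F<sub>7</sub> and all names are invented for the example.

```python
# Sketch of the disequality-to-equality reduction (the classic trick cited
# above): s != t is encoded as W*(s - t) - 1 = 0 for a fresh variable W.
P = 7

def satisfies_diseq(x):
    """Original constraint: x != 3 in F_7."""
    return x % P != 3

def encoded(x, w):
    """Encoded constraint: w*(x - 3) - 1 = 0 in F_7."""
    return (w * (x - 3) - 1) % P == 0

# The encoding is exact: x satisfies x != 3 iff some w witnesses the inverse
# of (x - 3); when x == 3, w*(x - 3) - 1 = -1 != 0 for every w.
ok = all(satisfies_diseq(x) == any(encoded(x, w) for w in range(P))
         for x in range(P))
```

The fresh variable W<sub>i</sub> plays the role of the multiplicative inverse of [[s<sub>i</sub>]] − [[t<sub>i</sub>]], which exists exactly when that difference is nonzero.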

#### 4.2 Incomplete Unsatisfiability and Cores

Recall (§2.2) that if 1 ∈ ⟨P⟩, then V(P) is empty. We can answer this ideal membership query using a Gröbner basis engine (line 7, Fig. 2). Let *GB* be a subroutine that takes a list of polynomials and computes a Gröbner basis for the ideal that they generate, according to some monomial ordering. We use grevlex: the ordering for which GB engines are typically most efficient [42]. We compute *GB*(P) and check whether 1 reduces to 0 modulo *GB*(P), i.e., whether 1 →<sub>*GB*(P)</sub> 0. If so, we report that V(P) is empty. If not, recall (§2.2) that V(P) may still be empty; we proceed to attempt model construction (lines 9 to 11, Fig. 2, described in the next subsection).
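The ideal-membership view of unsatisfiability can be made concrete: if cofactors h<sub>i</sub> with Σ<sub>i</sub> h<sub>i</sub>p<sub>i</sub> = 1 exist, that combination certifies that the p<sub>i</sub> have no common zero. Below is a minimal sketch that verifies such a hand-picked certificate for P = {X, X − 1} over F<sub>7</sub> using coefficient-list arithmetic; a real solver instead extracts this information from the Gröbner basis computation.

```python
# Verifying an ideal-membership certificate for 1 in <P>: if
# sum_i h_i * p_i = 1 as polynomials, the p_i share no zero, so V(P) = {}.
# Polynomials are coefficient lists over F_7 (index = degree).
P_MOD = 7

def padd(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return [(x + y) % P_MOD for x, y in zip(a, b)]

def pmul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P_MOD
    return out

def trim(a):
    while len(a) > 1 and a[-1] == 0:
        a = a[:-1]
    return a

p1 = [0, 1]             # X
p2 = [P_MOD - 1, 1]     # X - 1
h1 = [1]                # cofactor  1 (hand-picked for this example)
h2 = [P_MOD - 1]        # cofactor -1
combo = trim(padd(pmul(h1, p1), pmul(h2, p2)))
# combo == [1] certifies that X and X - 1 have no common root.
```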

If 1 *does* reduce by the Gröbner basis, then identifying a subset of P which is sufficient to reduce 1 yields an unsatisfiable core. To construct such a subset, we formalize the inferences performed by the Gröbner basis engine as a calculus for proving ideal membership.

Figure 4 presents IdealCalc, our ideal membership calculus. IdealCalc proves facts of the form p ∈ ⟨P⟩, where p is a polynomial and P is the set of generators of an ideal. The G rule states that the generators are in the ideal. The Z rule states that 0 is in the ideal. The S rule states that for any two polynomials in the ideal, their s-polynomial is in the ideal too. The R<sub>↑</sub> and R<sub>↓</sub> rules state that if p →<sub>q</sub> r with q in the ideal, then p is in the ideal if and only if r is.

The soundness of IdealCalc follows immediately from the definition of an ideal. Completeness relies on the existence of algorithms for computing Gröbner bases using only s-polynomials and reduction [21,41,43]. We prove both properties in Appendix A.

Theorem 1 (IdealCalc Soundness). *If there exists an* IdealCalc *proof tree with conclusion* p ∈ ⟨P⟩*, then* p ∈ ⟨P⟩*.*

Theorem 2 (IdealCalc Completeness). *If* p ∈ ⟨P⟩*, then there exists an* IdealCalc *proof tree with conclusion* p ∈ ⟨P⟩*.*

$$\mathsf{Z}\,\frac{}{0 \in \langle P \rangle} \qquad \mathsf{G}\,\frac{p \in P}{p \in \langle P \rangle} \qquad \mathsf{S}\,\frac{p \in \langle P \rangle \quad q \in \langle P \rangle}{\mathrm{spoly}(p, q) \in \langle P \rangle}$$

$$\mathsf{R}\_{\uparrow}\,\frac{r \in \langle P \rangle \quad q \in \langle P \rangle \quad p \to\_q r}{p \in \langle P \rangle} \qquad \mathsf{R}\_{\downarrow}\,\frac{p \in \langle P \rangle \quad q \in \langle P \rangle \quad p \to\_q r}{r \in \langle P \rangle}$$

Fig. 4. IdealCalc: a calculus for ideal membership


Fig. 5. Finding common zeros for a Gröbner basis. After handling trivial cases, *FindZero* uses *ApplyRule* to apply the first applicable rule from Fig. 6.

By instrumenting a Gröbner basis engine and reduction engine, one can construct IdealCalc proof trees. Then, for a conclusion 1 ∈ ⟨P⟩, traversing the proof tree to its leaves gives a subset P′ ⊆ P such that 1 ∈ ⟨P′⟩. The procedure *CoreFromTree* (called in line 8, Fig. 2) performs this traversal by accessing a proof tree recorded by the *GB* procedure and the reductions. The proof of Theorem 2 explains our instrumentation in more detail (Appendix A).

#### 4.3 Completeness Through Model Construction

As discussed, we still need a *complete* decision procedure for determining if V(P) is empty. We call this procedure *FindZero*; it is a backtracking search for an element of V(P). It also serves as our model construction procedure.

Figure 5 presents *FindZero* as a recursive search. It maintains two data structures: a Gröbner basis B and a partial map M : **X**′ → F from variables to field elements. By applying a branching rule (discussed in the next paragraph), *FindZero* obtains a disjunction of single-variable assignments X′<sub>i</sub> ↦ z, which it branches on. *FindZero* branches on an assignment X′<sub>i</sub> ↦ z by adding it to M and updating B to *GB*(B ∪ {X′<sub>i</sub> − z}).

Figure 6 shows the branching rules of *FindZero*. Each rule comprises *antecedents* (conditions that must be met for the rule to apply) and a *conclusion* (a disjunction of single-variable assignments to branch on). The Univariate rule applies when B contains a polynomial p that is univariate in some variable X′<sub>i</sub> that M does not assign a value to. The rule branches on the univariate roots of p. The Triangular rule comes from work on triangular decomposition [68]. It

$$\text{Univariate}\,\frac{p \in B \quad p \in \mathbb{F}[X'\_i] \quad X'\_i \notin M \quad Z \leftarrow \mathit{UnivariateZeros}(p)}{\bigvee\_{z \in Z} (X'\_i \mapsto z)}$$

$$\text{Triangular}\,\frac{\mathit{Dim}(B) = 0 \quad X'\_i \notin M \quad p \leftarrow \mathit{MinPoly}(B, X'\_i) \quad Z \leftarrow \mathit{UnivariateZeros}(p)}{\bigvee\_{z \in Z} (X'\_i \mapsto z)}$$

$$\text{Exhaust}\,\frac{}{\bigvee\_{z \in \mathbb{F}}\ \bigvee\_{X'\_i \notin M} (X'\_i \mapsto z)}$$

Fig. 6. Branching rules for *FindZero*.

applies when B is zero-dimensional.<sup>4</sup> It computes a univariate *minimal polynomial* p(X′<sub>i</sub>) in some unassigned variable X′<sub>i</sub>, and branches on the univariate roots of p. The final rule, Exhaust, has no conditions and simply branches on all possible values for all unassigned variables.

*FindZero*'s *ApplyRule* sub-routine applies the first rule in Fig. 6 whose conditions are met. The other subroutines (*GB* [21,41,43], *Dim* [11], *MinPoly* [2], and *UnivariateZeros* [87]) are commonly implemented in computer algebra libraries. *Dim*, *MinPoly*, and *UnivariateZeros* run in (randomized) polytime.

Theorem 3 (*FindZero* Correctness). *If* V(⟨B⟩) = ∅*, then FindZero returns* ⊥*; otherwise, it returns a member of* V(⟨B⟩)*. (Proof: Appendix B)*

*Correctness and Efficiency.* The branching rules strike a careful balance between correctness and efficiency. The Exhaust rule is always applicable, but a full exhaustive search over a large field is unreasonable (recall: ZKPs operate over ≈255-bit fields). The Triangular and Univariate rules are important alternatives to exhaustion. They create a far smaller set of branches, but apply only when the variety has dimension zero or the basis has a univariate polynomial.

As an example of the importance of Univariate, consider the univariate system X<sup>2</sup> ≈ 2 in a field where 2 is not a perfect square (e.g., F<sub>5</sub>). X<sup>2</sup> − 2 is already a (reduced) Gröbner basis, and it does not contain 1, so the triviality test does not detect unsatisfiability and *FindZero* runs. With the Univariate rule, *FindZero* computes the univariate zeros of X<sup>2</sup> − 2 (there are none) and exits. Without it, the Exhaust rule creates |F| branches.
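A brute-force *UnivariateZeros* for small prime fields illustrates the rule over F<sub>5</sub>, where 2 has no square root (the nonzero squares mod 5 are {1, 4}). The function below is a sketch; real implementations use dedicated root-finding algorithms over F<sub>q</sub>.

```python
# Brute-force UnivariateZeros for a prime field F_p: return the roots of a
# polynomial given by its coefficient list (index = degree). Feasible only
# for tiny p; shown here to illustrate the Univariate branching rule.
def univariate_zeros(coeffs, p):
    def ev(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation mod p
            acc = (acc * x + c) % p
        return acc
    return [x for x in range(p) if ev(x) == 0]

# X^2 - 2 over F_5: no roots, so FindZero exits with no branches at all.
no_roots = univariate_zeros([-2, 0, 1], 5)   # []
# Over F_7 the same polynomial has roots, since 3^2 = 9 = 2 (mod 7).
roots_7 = univariate_zeros([-2, 0, 1], 7)    # [3, 4]
```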

As an example of when Triangular is critical, consider

$$\begin{aligned} X\_1 + X\_2 + X\_3 + X\_4 + X\_5 &= 0\\ X\_1 X\_2 + X\_2 X\_3 + X\_3 X\_4 + X\_4 X\_5 + X\_5 X\_1 &= 0\\ X\_1 X\_2 X\_3 + X\_2 X\_3 X\_4 + X\_3 X\_4 X\_5 + X\_4 X\_5 X\_1 + X\_5 X\_1 X\_2 &= 0\\ X\_1 X\_2 X\_3 X\_4 + X\_2 X\_3 X\_4 X\_5 + X\_3 X\_4 X\_5 X\_1 + X\_4 X\_5 X\_1 X\_2 + X\_5 X\_1 X\_2 X\_3 &= 0\\ X\_1 X\_2 X\_3 X\_4 X\_5 &= 1 \end{aligned}$$

<sup>4</sup> The *dimension* of an ideal is a natural number that can be efficiently computed from a Gröbner basis. If the dimension is zero, then one can efficiently compute a minimal polynomial in any variable X, given a Gröbner basis [2,68].

in F<sub>394357</sub> [68]. The system is unsatisfiable, it has dimension 0, and its ideal does not contain 1. Moreover, our solver computes a (reduced) Gröbner basis for it that does not contain any univariate polynomials. Thus, Univariate does not apply. However, Triangular does, and with it, *FindZero* quickly terminates. Without Triangular, Exhaust would create at least |F| branches.

In the above examples, Exhaust performs very poorly. However, that is not always the case. For example, in the system X<sub>1</sub> + X<sub>2</sub> = 0, using Exhaust to guess X<sub>1</sub> and then using the Univariate rule to determine X<sub>2</sub> is quite reasonable. In general, Exhaust is a powerful tool for solving *underconstrained* systems. Our experiments will show that despite including Exhaust, our procedure performs quite well on our benchmarks. We reflect on its performance in Sect. 8.
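The trade-off can be seen in a minimal backtracking search that uses only Exhaust-style guessing. Everything below is an illustrative sketch for tiny fields, not the GB-driven procedure of Fig. 5: for underconstrained systems it finds a model almost immediately, while for unsatisfiable ones it degenerates into a full sweep of F<sup>n</sup>.

```python
# Minimal model search in the spirit of FindZero with only the Exhaust rule:
# enumerate assignments to all variables and check the polynomial system.
from itertools import product

def find_zero(polys, nvars, p):
    """polys: callables taking a tuple of nvars values; zeros taken mod p."""
    for point in product(range(p), repeat=nvars):
        if all(f(point) % p == 0 for f in polys):
            return point
    return None  # exhausted F^nvars: the system is unsatisfiable over F_p

# Underconstrained: X1 + X2 = 0 over F_7; the very first guess works.
model = find_zero([lambda v: v[0] + v[1]], 2, 7)       # (0, 0)
# Unsatisfiable: X^2 = 2 over F_5; every element is tried before giving up.
nothing = find_zero([lambda v: v[0] ** 2 - 2], 1, 5)   # None
```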

*Field Polynomials: A Road not Taken.* By guaranteeing completeness through (potential) exhaustion, we depart from prior work. Typically, one ensures completeness by including *field polynomials* in the ideal (§2.2). Indeed, this is the approach suggested [97] and taken [55] by prior work. However, field polynomials induce enormous overhead in the Gröbner basis engine because their degree is so large. The result is a procedure that is only efficient for tiny fields [55]. In our experiments, we compare our system's performance to what it would be if it used field polynomials.<sup>5</sup> The results confirm that deferring completeness to *FindZero* is far superior for our benchmarks.

### 5 Implementation

We have implemented our decision procedure for prime fields in the cvc5 SMT solver [7] as a theory solver. It is exposed through cvc5's SMT-LIB, C++, Java, and Python interfaces. Our implementation comprises <sup>≈</sup>2k lines of C++. For the algebraic sub-routines of our decision procedure (§4), it uses CoCoALib [1]. To compute unsatisfiable cores (§4.2), we inserted hooks into CoCoALib's Gröbner basis engine (17 lines of C++).

Our theory solver makes sparse use of the interface between it and the rest of the SMT solver. It acts only once a full propositional assignment has been constructed. It then runs the decision procedure, reporting either satisfiability (with a model) or unsatisfiability (with an unsatisfiable core).

## 6 Benchmark Generation

Recall that one motivation for this work is to enable translation validation for compilers to field constraint systems (R1CSs) used in zero-knowledge proofs (ZKPs). Our benchmarks are SMT formulas that encode translation validation queries for compilers from *Boolean* computations to R1CS. At a high level, each benchmark is generated as follows.

<sup>5</sup> We add field polynomials to our procedure on line 2, Fig. 2. This renders our ideal triviality test (lines 7 and 8) complete, so we can eliminate the fallback to *FindZero*.


Through step 3, we construct SMT queries that are satisfiable, unsatisfiable, and of unknown status. Through step 5, we construct queries solvable using bit-vector reasoning, integer reasoning, or a stand-alone computer algebra system.

#### 6.1 Examples

We describe our benchmark generator in full and give the definitions of soundness and determinism in Appendix C. Here, we give three example benchmarks. Our examples are based on the Boolean formula Ψ(x<sub>1</sub>, x<sub>2</sub>, x<sub>3</sub>, x<sub>4</sub>) = x<sub>1</sub> ∨ x<sub>2</sub> ∨ x<sub>3</sub> ∨ x<sub>4</sub>. Our convention is to mark field variables with a prime, but not Boolean variables. Using the technique from Sect. 2.3, CirC compiles this formula to the two-constraint system i′s′ = r′ ∧ (1 − r′)s′ = 0, where s′ = x′<sub>1</sub> + x′<sub>2</sub> + x′<sub>3</sub> + x′<sub>4</sub>. Each Boolean input x<sub>i</sub> corresponds to field element x′<sub>i</sub>, and r′ corresponds to the result of Ψ.

*Soundness.* An R1CS is sound if it ensures the output r corresponds to the value of Ψ (when given valid inputs). Concretely, our system is sound if the following formula is valid:

$$\underbrace{\forall i.\ (x'\_i = 0 \lor x'\_i = 1) \land (x'\_i = 1 \iff x\_i)}\_{\text{inputs are correct}} \land \underbrace{i's' = r' \land (1 - r')s' = 0}\_{\text{constraints hold}}$$

$$\implies$$

$$\underbrace{(r' = 0 \lor r' = 1) \land (r' = 1 \iff \Psi)}\_{\text{output is correct}}$$

where Ψ and s′ are defined as above. This is an UNSAT benchmark, because the formula is valid.

*Determinism.* An R1CS is deterministic if the values of the inputs uniquely determine the value of the output. To represent this in a formula, we use two copies of the constraint system: one with primed variables, and one with doubleprimed variables. Our example is deterministic if the following formula is valid:

$$\underbrace{\forall i.\ (x'\_i = x''\_i)}\_{\text{inputs agree}} \land \underbrace{i's' = r' \land (1 - r')s' = 0 \land i''s'' = r'' \land (1 - r'')s'' = 0}\_{\text{constraints hold for both systems}}$$

$$\implies$$

$$\underbrace{r' = r''}\_{\text{outputs agree}}$$

*Unsoundness.* Removing constraints from the system can give a formula that is not valid (a SAT benchmark). For example, if we remove (1 − r′)s′ = 0, then the soundness formula is falsified by {x<sub>i</sub> ↦ ⊤, x′<sub>i</sub> ↦ 1, r′ ↦ 0, i′ ↦ 0}.
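The counterexample can be checked mechanically. The sketch below evaluates the premise and conclusion of the weakened soundness formula under a falsifying assignment (all x<sub>i</sub> true, x′<sub>i</sub> = 1, i′ = r′ = 0); the modulus 7 is an arbitrary stand-in for the compiler's field.

```python
# Checking the falsifying assignment for the weakened soundness formula:
# with (1 - r')s' = 0 removed, the remaining constraint i's' = r' is
# satisfied by i' = 0, r' = 0 even though Psi is true.
p = 7
xs = [True] * 4          # Boolean inputs x_1..x_4
xps = [1, 1, 1, 1]       # field encodings x'_1..x'_4
ip, rp = 0, 0            # i' and r'
sp = sum(xps) % p        # s' = x'_1 + x'_2 + x'_3 + x'_4

inputs_ok = all(xp in (0, 1) and (xp == 1) == x for xp, x in zip(xps, xs))
constraint_ok = (ip * sp) % p == rp          # the one remaining constraint
psi = any(xs)                                # Psi = x_1 v x_2 v x_3 v x_4
output_ok = rp in (0, 1) and (rp == 1) == psi

# Premise holds but the conclusion fails: the weakened system is unsound.
counterexample = inputs_ok and constraint_ok and not output_ok
```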

## 7 Experiments

Our experiments show that our approach:


Our test bed is a cluster with Intel Xeon E5-2637 v4 CPUs. Each run is limited to one physical core, 8GB memory, and 300s.

Throughout, we generate benchmarks for two correctness properties (soundness and determinism), three different ZKP compilers, and three different statuses (sat, unsat, and unknown). We vary the field size, encoding, number of inputs, and number of terms, depending on the experiment. We evaluate our cvc5 extension, Bitwuzla (commit 27f6291), and z3 (version 4.11.2).

Fig. 7. The performance of field-based and BV-based approaches (with various BV solvers) as the field size ranges from 5 to 60 bits. Plot (b) restricts to benchmarks solved at all bit-widths.

#### 7.1 Comparison with Bit-Vectors

Since bit-vector solvers scale poorly with bit-width, one would expect the effectiveness of a BV encoding of our properties to degrade as the field size grows. To validate this, we generate BV-encoded benchmarks for varying bit-widths and evaluate state-of-the-art bit-vector solvers on them. Though our applications of interest use b = 255, we will see that the BV-based approach does not scale to


Table 1. Solved small-field benchmarks by tool, property, and status.

fields this large. Thus, for this set of experiments we use b ∈ {5, 10,..., 60}, and we sample formulas with 4 inputs and 8 intermediate terms.

Figure 7a shows performance of three bit-vector solvers (cvc5 [7], Bitwuzla [76], and z3 [73]) and our F solver as a cactus plot; Table 1 splits the solved instances by property and status. We see that even for these small bit-widths, the field-based approach is already superior. The bit-vector solvers are more competitive on the soundness benchmarks, since these benchmarks include only half as many field operations as the determinism benchmarks.

For our benchmarks, Bitwuzla is the most efficient BV solver. We further examine the time that it and our solver take to solve the 9 benchmarks they can both solve at all bit-widths. Figure 7b plots the total solve time against b. While the field-based solver's runtime is nearly independent of field size, the bit-vector solvers slow down substantially as the field grows.

In sum, the BV approach scales poorly with field size and is already inferior on fields of size at least 2<sup>40</sup>.

#### 7.2 The Cost of Field Polynomials

Recall that our decision procedure does not use field polynomials (§4.3), but our implementation optionally includes them (§5). In this experiment, we measure the cost they incur. We use propositional formulas in 2 variables with 4 terms, and we take b ∈ {4,..., 12}, and include SAT and unknown benchmarks.

Figure 8a compares the performance of our tool with and without field polynomials. For many benchmarks, field polynomials cause a slowdown greater than 100×. To better show the effect of the field size, we consider the solve time for the SAT benchmarks, at varying values of b. Figure 8b shows how solve times change as b grows: using field polynomials causes exponential growth. For UNSAT benchmarks, both configurations complete within 1s. This is because (for these benchmarks) the GB is just {1} and CoCoA's GB engine is good at discovering that (and exiting) without considering the field polynomials.

This growth is predictable. GB engines can take time exponential (or worse) in the degree of their inputs. A simple example illustrates this fact: consider computing a Gröbner basis for X<sup>2<sup>b</sup></sup> − X and X<sup>2</sup> − X. The former reduces to 0 modulo the latter, but the reduction takes 2<sup>b</sup> − 1 steps.
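The step count is easy to reproduce: each rewrite replaces the leading term X<sup>k</sup> with X<sup>k−1</sup> (using X<sup>2</sup> ≡ X), so reaching X − X = 0 from degree 2<sup>b</sup> takes 2<sup>b</sup> − 1 steps. The sketch below simply counts these degree-lowering rewrites.

```python
# Counting reduction steps: reducing X^(2^b) - X by X^2 - X rewrites the
# leading term X^k to X^(k-1) at each step, so the running polynomial is
# X^k - X until k reaches 1, where it becomes X - X = 0.
def reduction_steps(b):
    deg, steps = 2 ** b, 0
    while deg > 1:       # rewrite X^deg -> X^(deg-1) using X^2 = X
        deg -= 1
        steps += 1
    return steps         # 2^b - 1: exponential in the bit-width b
```

This is exactly the degree blow-up that makes field polynomials X<sup>q</sup> − X so expensive inside a Gröbner basis engine for large q.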

(a) All benchmarks, both configurations.

(b) Each series is one property at different numbers of bits.

Fig. 8. Solve times, with and without field polynomials. The field size varies from 4 to 12 bits. The benchmarks are all SAT or unknown.

(a) Our SMT solver with and without UNSAT cores.

(b) Our SMT solver compared with a pure computer algebra system.

Fig. 9. The performance of alternative algebra-based approaches.

#### 7.3 The Benefit of UNSAT Cores

Section 4.2 describes how we compute unsatisfiable (UNSAT) cores in the F solver by instrumenting our Gröbner basis engine. In this experiment, we measure the benefit of doing so. We generate Boolean formulas with 2, 4, 6, 8, 10, and 12 variables; and 2<sup>0</sup>, 2<sup>1</sup>, 2<sup>2</sup>, 2<sup>3</sup>, 2<sup>4</sup>, 2<sup>5</sup>, 2<sup>6</sup>, and 2<sup>7</sup> intermediate terms, for a 255-bit field. We vary the number of intermediate terms widely in order to generate benchmarks of widely variable difficulty. We configure our solver with and without GB instrumentation.

Figure 9a shows the results. For many soundness benchmarks, the cores cause a speedup of more than 10×. As expected, only the soundness benchmarks benefit. Soundness benchmarks have non-trivial boolean structure, so the SMT core makes many queries to the theory solver. Returning good UNSAT cores shrinks the propositional search space, reduces the number of theory queries, and thus reduces solve time. However, determinism benchmarks are just a conjunction

Fig. 10. A comparison of all approaches.

of theory literals, so the SMT core makes only one theory query. For them, returning a good UNSAT core has no benefit—but also induces little overhead.

#### 7.4 Comparison to Pure Computer Algebra

In this experiment, we compare our SMT-based approach (which integrates computer-algebra techniques into SMT) against a stand-alone use of computer-algebra. We encode the Boolean structure of our formulas in F<sup>p</sup> (see Appendix C). When run on such an encoding, our SMT solver makes just one query to its field solver, so it cannot benefit from the search optimizations present in CDCL(T). For this experiment, we use the same benchmark set as the last.

Figure 9b compares the pure F approach with our SMT-based approach. For benchmarks that encode soundness properties, the SMT-based approach is clearly dominant. The intuition here is that computer algebra systems are not optimized for Boolean reasoning. If a problem has non-trivial Boolean structure, a cooperative approach like SMT has clear advantages. SMT's advantage is less pronounced for determinism benchmarks, as these manifest as a single query to the finite field solver; still, in this case, our encoding seems to have some benefit much of the time.

#### 7.5 Main Experiment

In our main experiment, we compare our approach against all reasonable alternatives: a pure computer-algebra approach (§7.4), a BV approach with Bitwuzla (the best BV solver on our benchmarks, §7.1), an NIA approach with cvc5 and z3, and our own tool without UNSAT cores (§7.3). We use the same benchmark set as the last experiment; this uses a 255-bit field.

Figure 10 shows the results as a cactus plot. Table 2 shows the number of solved instances for each system, split by property and status. Bitwuzla quickly runs out of memory on most of the benchmarks. A pure computer-algebra approach outperforms Bitwuzla and cvc5's NIA solver. The NIA solver of z3 does a bit better, but our field-aware SMT solver is the best by far. Moreover, its best configuration uses UNSAT cores. Comparing the total solve time of ff-cvc5 and


Table 2. Solved benchmarks by tool, property, and status.

nia-z3 on commonly solved benchmarks, we find that ff-cvc5 reduces total solve time by 6×. In sum, the techniques we describe in this paper yield a tool that substantially outperforms all alternatives on our benchmarks.

#### 8 Discussion and Future Work

We have presented a basic study of the potential of an SMT theory solver for finite fields based on computer algebra. Our experiments have focused on translation validation for ZKP compilers, as applied to Boolean input computations. The solver shows promise, but much work remains.

As discussed (Sect. 5), our implementation makes limited use of the interface exposed to a theory solver for CDCL(T). It does no work until a full propositional assignment is available. It also submits no lemmas to the core solver. Exploring which lightweight reasoning should be performed during propositional search and what kinds of lemmas are useful is a promising direction for future work.

Our model construction (Sect. 4.3) is another weakness. Without univariate polynomials or a zero-dimensional ideal, it falls back to exhaustive search. If a solution over an extension field is acceptable, then there are Θ(|F|<sup>d</sup>) solutions, so an exhaustive search seems likely to succeed quickly. Of course, we need a solution in the base field. If the base field were algebraically closed, then every solution would lie in it. Our fields are finite (and thus not algebraically closed), but for our benchmarks they seem to bear some empirical resemblance to closed fields (e.g., the GB-based test for an empty variety never fails, even though it is theoretically incomplete). For this reason, exhaustive search may not be completely unreasonable for our benchmarks. Indeed, our experiments show that our procedure is effective on our benchmarks, including for SAT instances. However, the worst-case performance of this kind of model construction is clearly abysmal. We think that a more intelligent search procedure and better use of ideas from computer algebra [6,67] would both yield improvement.

Theory combination is also a promising direction for future work. The benchmarks we present here are in the QF\_FF logic: they involve only Booleans and finite fields. Reasoning about different fields in combination with one another would have natural applications to the representation of elliptic curve operations inside ZKPs. Reasoning about datatypes, arrays, and bit-vectors in combination with fields would also have natural applications to the verification of ZKP compilers.

Acknowledgements. We appreciate the help and guidance of Andres Nötzli, Andy Reynolds, Anna Bigatti, Dan Boneh, Erika Ábrahám, Fraser Brown, Gregory Sankaran, Jacob Van Geffen, James Davenport, John Abbott, Leonardo Alt, Lucas Vella, Maya Sankar, Riad Wahby, Shankara Pailoor, and Thomas Hader.

This material is in part based upon work supported by the DARPA SIEVE program and the Simons foundation. Any opinions, findings, and conclusions or recommendations expressed in this report are those of the author(s) and do not necessarily reflect the views of DARPA. It is also funded in part by NSF grant number 2110397.

## A Proofs of **IdealCalc** Properties

This appendix is available in the full version of the paper [82].

### B Proof of Correctness for *FindZero*

We prove that *FindZero* is correct (Theorem 3).

*Proof.* It suffices to show that for each branching rule that results in ⋁<sub>j</sub> (X<sub>i<sub>j</sub></sub> ↦ r<sub>j</sub>),

$$\mathcal{V}(\langle B \rangle) \subset \bigcup\_j \mathcal{V}(\langle B \cup \{X\_{i\_j} - r\_j\} \rangle)$$

First, consider an application of Univariate with univariate polynomial p(X<sub>i</sub>). Fix **z** ∈ V(⟨B⟩). The X<sub>i</sub>-coordinate of **z** is a zero of p, so for some j, r<sub>j</sub> equals that coordinate and **z** ∈ V(⟨B ∪ {X<sub>i</sub> − r<sub>j</sub>}⟩).

Next, consider an application of Triangular to variable X<sub>i</sub> with minimal polynomial p(X<sub>i</sub>). By the definition of minimal polynomial, any zero **z** of B has a value for X<sub>i</sub> that is a root of p. Let that root be r<sub>j</sub>. Then **z** ∈ V(⟨B ∪ {X<sub>i</sub> − r<sub>j</sub>}⟩).

Finally, consider an application of Exhaust. The desired property is immediate.

### C Benchmark Generation

This appendix is available in the full version of the paper [82].

### References

1. Abbott, J., Bigatti, A.M.: CoCoALib: A C++ library for computations in commutative algebra... and beyond. In: International Congress on Mathematical Software (2010)


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Solving String Constraints Using SAT**

Kevin Lotz1(B) , Amit Goel<sup>2</sup>, Bruno Dutertre<sup>2</sup> , Benjamin Kiesl-Reiter<sup>2</sup> , Soonho Kong<sup>2</sup> , Rupak Majumdar<sup>2</sup> , and Dirk Nowotka<sup>1</sup>

<sup>1</sup> Department of Computer Science, Kiel University, Kiel, Germany {kel,dn}@informatik.uni-kiel.de <sup>2</sup> Amazon Web Services, Seattle, USA {amgoel,dutebrun,benkiesl,soonho,rumajumd}@amazon.com

**Abstract.** String solvers are automated-reasoning tools that can solve combinatorial problems over formal languages. They typically operate on restricted first-order logic formulas that include operations such as string concatenation, substring relationship, and regular expression matching. String solving thus amounts to deciding the satisfiability of such formulas. While there exists a variety of different string solvers, many string problems cannot be solved efficiently by any of them. We present a new approach to string solving that encodes input problems into propositional logic and leverages incremental SAT solving. We evaluate our approach on a broad set of benchmarks. On the logical fragment that our tool supports, it is competitive with state-of-the-art solvers. Our experiments also demonstrate that an eager SAT-based approach complements existing approaches to string solving in this specific fragment.

## **1 Introduction**

Many problems in software verification require reasoning about strings. To tackle these problems, numerous *string solvers*—automated decision procedures for quantifier-free first-order theories of strings and string operations—have been developed over the last years. These solvers form the workhorse of automated-reasoning tools in several domains, including web-application security [19,31,33], software model checking [15], and conformance checking for cloud-access-control policies [2,30].

The general theory of strings relies on deep results in combinatorics on words [5,16,23,29]; unfortunately, the related decision procedures remain intractable in practice. Practical string solvers achieve scalability through a judicious mix of heuristics and restrictions on the language of constraints.

We present a new approach to string solving that relies on an *eager* reduction to the Boolean satisfiability problem (SAT), using incremental solving and unsatisfiable-core analysis for completeness and scalability. Our approach supports a theory that contains Boolean combinations of regular membership constraints and equality constraints on string variables, and captures a large set of practical queries [6].

Our solving method iteratively searches for satisfying assignments up to a length bound on each string variable; it stops and reports unsatisfiability when the search reaches computed upper bounds without finding a solution. Similar to the solver Woorpje [12], we formulate regular membership constraints as reachability problems in nondeterministic finite automata. By bounding the number of transitions allowed by each automaton, we obtain a finite problem that we encode into propositional logic. To cut down the search space of the underlying SAT problem, we perform an *alphabet reduction* step (SMT-LIB string constraints are defined over an alphabet of 3 · <sup>2</sup><sup>16</sup> letters and a naive reduction to SAT does not scale). Inspired by bounded model checking [8], we iteratively increase bounds and utilize an incremental SAT solver to solve the resulting series of propositional formulas. We perform an unsatisfiable-core analysis after each unsatisfiable incremental call to increase only the bounds of a minimal subset of variables until a theoretical upper bound is reached.

We have evaluated our solver on a large set of benchmarks. The results show that our SAT-based approach is competitive with state-of-the-art SMT solvers in the logical fragment that we support. It is particularly effective on satisfiable instances.

Closest to our work is the Woorpje solver [12], which also employs an eager reduction to SAT. Woorpje reduces systems of word equations with linear constraints to a single Boolean formula and calls a SAT solver. An extension can also handle regular membership constraints [21]. However, Woorpje does not handle the full language of constraints considered here and does not employ the reduction and incremental solving techniques that make our tool scale in practice. More importantly, in contrast to our solver, Woorpje is not complete—it does not terminate on unsatisfiable instances.

Other solvers such as Hampi [19] and Kaluza [31] encode string problems into constraints on fixed-size bit-vectors, which can be solved by reduction to SAT. These tools support expressive constraints, but they require a user-provided bound on the length of string variables.

Further from our work are approaches based on the *lazy* SMT paradigm, which tightly integrates dedicated, heuristic, theory solvers for strings using the CDCL(T) architecture (also called DPLL(T) in early papers). Solvers that follow this paradigm include Ostrich [11], Z3 [25], Z3str4 [24], cvc5 [3], Z3str3RE [7], Trau [1], and CertiStr [17]. Our evaluation shows that our eager approach is competitive with lazy solvers overall, but it also shows that combining both types of solvers in a portfolio is most effective. Our eager approach tends to perform best on satisfiable instances while lazy approaches work better on unsatisfiable problems.

## **2 Preliminaries**

We assume a fixed alphabet Σ and a fixed set of variables Γ. Words of Σ<sup>∗</sup> are denoted by w, w′, w′′, etc. Variables are denoted by x, y, z. Our decision procedure supports the theory described in Fig. 1.

$$\begin{aligned} F &:= F \lor F \mid F \land F \mid \neg F \mid \text{Atom} \\ \text{Atom} &:= \mathbf{x} \mathrel{\dot{\in}} RE \mid \mathbf{x} \doteq \mathbf{y} \\ RE &:= RE \cup RE \mid RE \cdot RE \mid RE^* \mid RE \cap RE \mid \text{?} \mid w \end{aligned}$$

**Fig. 1.** Syntax: x and y denote string variables and w denotes a word of Σ∗. The symbol ? is the wildcard character.

Atoms in this theory include *regular membership constraints* (or *regular constraints* for short) of the form x ∈̇ RE, where RE is a regular expression, and *variable equations* of the form x ≐ y. Concatenation is not allowed in equations.

Regular expressions are defined inductively using union, concatenation, intersection, and the Kleene star. Atomic regular expressions are constant words w ∈ Σ<sup>∗</sup> and the wildcard character ?, which is a placeholder for an arbitrary symbol c ∈ Σ. All regular expressions are grounded, meaning that they do not contain variables. We use the symbols ∉̇ and ≠̇ as shorthand notation for negations of atoms built with the respective predicate symbols. The following is an example formula in our language: ¬(x ∈̇ a · ?<sup>∗</sup> ∧ y ∈̇ ?<sup>∗</sup> · b) ∨ x ≐ y ∨ x ∈̇ a · b.

Using our basic syntax, we can define additional relations, such as *constant equations* x ≐ w, and *prefix and suffix constraints*, written w ⊑̇ x and w ⊒̇ x, respectively. Even though these relations can be expressed as regular constraints (e.g., the prefix constraint ab ⊑̇ x can be expressed as x ∈̇ a · b · ?<sup>∗</sup>), we can generate more efficient reductions to SAT by encoding them explicitly.

This string theory is not as expressive as others, since it does not include string concatenation, but it still has important practical applications. It is used in the Zelkova tool described by Backes et al. [2] to support the analysis of AWS security policies. Zelkova is a major industrial application of SMT solvers [30].

Given a formula ψ, we denote by *atoms*(ψ) the set of atoms occurring in ψ, by V(ψ) the set of variables occurring in ψ, and by Σ(ψ) the set of constant symbols occurring in ψ. We call Σ(ψ) the *alphabet of* ψ. Similarly, given a regular expression R, we denote by Σ(R) the set of characters occurring in R. In particular, we have Σ(?) = ∅.

We call a formula *conjunctive* if it is a conjunction of literals and we call it a *clause* if it is a disjunction of literals. We say that a formula is in *normal form* if it is a conjunctive formula without unnegated variable equations. Every conjunctive formula can be turned into normal form by substitution, i.e., by repeatedly rewriting ψ ∧ x ≐ y to ψ[x := y]. If ψ is in negation normal form (NNF), meaning that the negation symbol occurs only directly in front of atoms, we denote by *lits*(ψ) the set of literals occurring in ψ. We say that an atom a occurs with *positive polarity* in ψ if a ∈ *lits*(ψ) and that it occurs with *negative polarity* in ψ if ¬a ∈ *lits*(ψ); we denote the respective sets of atoms of ψ by *atoms*<sup>+</sup>(ψ) and *atoms*<sup>−</sup>(ψ). The notion of polarity can be extended to arbitrary formulas (not necessarily in NNF), intuitively by considering polarity in a formula's corresponding NNF (see [26] for details).

**Fig. 2.** Overview of the solving process.

The semantics of our language is standard. A regular expression R defines a regular language L(R) over Σ in the usual way. An *interpretation* is a mapping (also called a *substitution*) h : Γ → Σ<sup>∗</sup> from string variables to words. Atoms are interpreted as usual, and a *model* (also called a *solution*) is an interpretation that makes a formula evaluate to true under the usual semantics of the Boolean connectives.

## **3 Overview**

Our solving method is illustrated in Fig. 2. It first performs three preprocessing steps that generate a Boolean abstraction of the input formula, reduce the size of the input alphabet, and initialize bounds on the lengths of all string variables. After preprocessing, we enter an encode-solve-and-refine loop that iteratively queries a SAT solver with a problem encoding based on the current bounds and refines the bounds after each unsatisfiable solver call. We repeat this loop until either the propositional encoding is satisfiable, in which case we conclude satisfiability of the input formula, or each bound has reached a theoretical upper bound, in which case we conclude unsatisfiability.

*Generating the Boolean Abstraction.* We abstract the input formula ψ by replacing each theory atom a ∈ *atoms*(ψ) with a new Boolean variable **d**(a), and keep track of the mapping between a and **d**(a). This gives us a *Boolean abstraction* ψ<sup>A</sup> of ψ and a set **D** of definitions, where each definition expresses the relationship between an atom a and its corresponding Boolean variable **d**(a). If a occurs with only one polarity in ψ, we encode the corresponding definition as an implication, i.e., as **d**(a) → a or as ¬**d**(a) → ¬a, depending on the polarity of a. Otherwise, if a occurs with both polarities, we encode it as an equivalence consisting of both implications. This encoding, which is based on ideas behind the well-known *Plaisted-Greenbaum transformation* [28], ensures that the formulas ψ and ψ<sup>A</sup> ∧ ⋀<sub>d ∈ **D**</sub> d are equisatisfiable. An example is shown in Fig. 3.
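
The polarity-based choice between implications and equivalences can be made concrete with a small Python sketch. This is our own illustration over a toy formula representation, not the authors' implementation; the definition shapes are rendered as readable strings:

```python
# Sketch (not the paper's code): compute atom polarities and pick the
# Plaisted-Greenbaum-style definition shape for each atom.

def polarities(formula, sign=True, acc=None):
    """Collect polarities of atoms in a formula tree.

    A formula is a nested tuple: ('atom', name), ('not', f),
    ('and', f, g), or ('or', f, g).
    """
    if acc is None:
        acc = {}
    kind = formula[0]
    if kind == 'atom':
        acc.setdefault(formula[1], set()).add(sign)
    elif kind == 'not':
        polarities(formula[1], not sign, acc)   # negation flips polarity
    else:  # 'and' / 'or': polarity passes through unchanged
        polarities(formula[1], sign, acc)
        polarities(formula[2], sign, acc)
    return acc

def definitions(formula):
    """Map each atom to the definition shape its polarity allows."""
    defs = {}
    for atom, signs in polarities(formula).items():
        if signs == {True}:
            defs[atom] = 'd -> a'      # occurs only positively
        elif signs == {False}:
            defs[atom] = '~d -> ~a'    # occurs only negatively
        else:
            defs[atom] = 'd <-> a'     # occurs with both polarities
    return defs

# Example in the spirit of Fig. 3: A occurs only positively, B only
# negatively, and C with both polarities.
psi = ('or', ('and', ('atom', 'A'), ('not', ('atom', 'B'))),
             ('and', ('atom', 'C'), ('not', ('atom', 'C'))))
print(definitions(psi))  # A gets an implication, C an equivalence
```

The single-polarity cases save one implication per definition, which keeps the resulting CNF smaller while preserving equisatisfiability.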

*Reducing the Alphabet.* In the SMT-LIB theory of strings [4], the alphabet Σ comprises 3 · 2<sup>16</sup> letters, but we can typically use a much smaller alphabet without affecting satisfiability. In Sect. 4, we show that using Σ(ψ) and one extra

**Fig. 3.** Example of Boolean abstraction. The formula ψ, whose expression tree is shown on the left, results in the Boolean abstraction illustrated on the right, where p, q, and r are fresh Boolean variables. We additionally get the definitions p → x ∈̇ R<sub>1</sub>, q ↔ y ∈̇ R<sub>2</sub>, and r ↔ z ≐ w. We use an implication (instead of an equivalence) for atom x ∈̇ R<sub>1</sub> since it occurs only with positive polarity within ψ.

character per string variable is sufficient. Reducing the alphabet is critical for our SAT encoding to be practical.

*Initializing Bounds.* A model for the original first-order formula ψ is a substitution h : Γ → Σ<sup>∗</sup> that maps each string variable to a word of arbitrary length such that ψ evaluates to true. As we use a SAT solver to find such substitutions, we need to bound the lengths of strings, which we do by defining a bound function b : Γ → ℕ that assigns an upper bound to each string variable. We initialize the bounds with small values, relying on simple heuristics. If the bounds are too small, we increase them in a later refinement step.

*Encoding, Solving, and Refining Bounds.* Given a bound function b, we build a propositional formula ‖ψ‖<sup>b</sup> that is satisfiable if and only if the original formula ψ has a solution h such that |h(x)| ≤ b(x) for all x ∈ Γ. We encode ‖ψ‖<sup>b</sup> as the conjunction ψ<sup>A</sup> ∧ ‖**D**‖<sup>b</sup> ∧ ‖h‖<sup>b</sup>, where ψ<sup>A</sup> is the Boolean abstraction of ψ, ‖**D**‖<sup>b</sup> is an encoding of the definitions **D**, and ‖h‖<sup>b</sup> is an encoding of the set of possible substitutions. We discuss details of the encoding in Sect. 5. A key property is that it relies on *incremental SAT solving under assumptions* [13]. Increasing bounds amounts to adding new clauses to the formula ‖ψ‖<sup>b</sup> and fixing a set of assumptions, i.e., temporarily fixing the truth values of a set of Boolean variables. If ‖ψ‖<sup>b</sup> is satisfiable, we can construct a substitution h from a Boolean model ω of ‖ψ‖<sup>b</sup>. Otherwise, we examine an unsatisfiable core (i.e., an unsatisfiable subformula) of ‖ψ‖<sup>b</sup> to determine whether increasing the bounds may give a solution and, if so, to identify the variables whose bounds must be increased. In Sect. 6, we explain in detail how we analyze unsatisfiable cores, increase bounds, and conclude unsatisfiability.
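
The overall loop can be summarized by a deliberately naive Python sketch: brute-force enumeration stands in for the SAT solver, and a uniform bound increase stands in for the unsatisfiable-core analysis. The predicate, alphabet, and bound limit below are made-up examples, not part of the paper's procedure:

```python
# Toy stand-in for the encode-solve-and-refine loop.
from itertools import product

def bounded_solve(pred, sigma, max_bound, b=1):
    """Search for a word w with pred(w) and |w| <= b; refine b on failure."""
    while b <= max_bound:
        for n in range(b + 1):                 # all lengths up to the bound
            for w in product(sigma, repeat=n):
                word = ''.join(w)
                if pred(word):
                    return word                # "SAT": return a model
        b += 1                                 # refine: increase the bound
    return None                                # bound limit reached: "UNSAT"

# Example constraint: x must lie in a·b* and have length >= 3.
model = bounded_solve(lambda w: w.startswith('a') and set(w[1:]) <= {'b'}
                      and len(w) >= 3, 'ab', max_bound=5)
print(model)  # -> abb
```

In the actual solver, the "UNSAT" outcome is sound because the bounds are only declared exhausted once they reach the theoretical upper bounds computed from the constraints.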

## **4 Reducing the Alphabet**

In many applications, the alphabet Σ is large—typically Unicode or an approximation of Unicode as defined in the SMT-LIB standard—but formulas use far fewer symbols (fewer than 100 symbols is common in our experiments). To check the satisfiability of a formula ψ, we can restrict the alphabet to the symbols that occur in ψ plus one extra character per variable. This allows us to produce compact propositional encodings that can be solved efficiently in practice.

To prove that such a reduced alphabet A is sufficient, we show that a model h : Γ → Σ<sup>∗</sup> of ψ can be transformed into a model h′ : Γ → A<sup>∗</sup> of ψ by replacing characters of Σ that do not occur in ψ with new symbols—one new symbol per variable of ψ. For example, suppose V(ψ) = {x<sub>1</sub>, x<sub>2</sub>}, Σ(ψ) = {a, c, d}, and h is a model of ψ such that h(x<sub>1</sub>) = abcdef and h(x<sub>2</sub>) = abbd. We introduce two new symbols α<sub>1</sub>, α<sub>2</sub> ∈ Σ \ Σ(ψ), define h′(x<sub>1</sub>) = aα<sub>1</sub>cdα<sub>1</sub>α<sub>1</sub> and h′(x<sub>2</sub>) = aα<sub>2</sub>α<sub>2</sub>d, and argue that h′ is a model as well.

More generally, assume B is a subset of Σ and n is a positive integer such that |B| ≤ |Σ| − n. We can then pick n distinct symbols α<sub>1</sub>,...,α<sub>n</sub> from Σ \ B. Let A be the set B ∪ {α<sub>1</sub>,...,α<sub>n</sub>}. We construct n functions f<sub>1</sub>,...,f<sub>n</sub> from Σ to A by setting f<sub>i</sub>(a) = a if a ∈ B, and f<sub>i</sub>(a) = α<sub>i</sub> otherwise. We extend f<sub>i</sub> to words of Σ<sup>∗</sup> in the natural way: f<sub>i</sub>(ε) = ε and f<sub>i</sub>(a · w) = f<sub>i</sub>(a) · f<sub>i</sub>(w). This construction satisfies the following property:

**Lemma 4.1.** *Let* f<sub>1</sub>,...,f<sub>n</sub> *be mappings as defined above, and let* i, j ∈ 1,...,n *such that* i ≠ j*. Then, the following holds:*

*1. If* a *and* b *are distinct symbols of* Σ*, then* f<sub>i</sub>(a) ≠ f<sub>j</sub>(b)*.*
*2. If* w *and* w′ *are distinct words of* Σ<sup>∗</sup>*, then* f<sub>i</sub>(w) ≠ f<sub>j</sub>(w′)*.*

*Proof.* The first part is an easy case analysis. For the second part, we have |f<sub>i</sub>(w)| = |w| and |f<sub>j</sub>(w′)| = |w′|, so the statement holds if w and w′ have different lengths. Assume now that w and w′ have the same length and let v be the longest common prefix of w and w′. Since w and w′ are distinct, we have w = v · a · u and w′ = v · b · u′, where a ≠ b are symbols of Σ and u and u′ are words of Σ<sup>∗</sup>. By the first part, we have f<sub>i</sub>(a) ≠ f<sub>j</sub>(b), so f<sub>i</sub>(w) and f<sub>j</sub>(w′) must be distinct. □
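
The mappings f<sub>i</sub> are easy to make concrete. The following Python sketch is our own illustration: the set B and the characters '1' and '2' (standing in for α<sub>1</sub> and α<sub>2</sub>) are made-up choices matching the running example:

```python
# Sketch of the alphabet-reduction mappings f_i from Sect. 4.

def make_f(B, alpha_i):
    """f_i keeps symbols of B and maps every other symbol to alpha_i."""
    return lambda word: ''.join(c if c in B else alpha_i for c in word)

B = {'a', 'c', 'd'}                 # plays the role of Sigma(psi)
f1 = make_f(B, '1')                 # '1' stands in for alpha_1
f2 = make_f(B, '2')                 # '2' stands in for alpha_2

print(f1('abcdef'))  # -> a1cd11   (h'(x1) in the running example)
print(f2('abbd'))    # -> a22d     (h'(x2) in the running example)
```

Spot-checking Lemma 4.1 on all short words over a small alphabet confirms that distinct words get distinct images under f<sub>i</sub> and f<sub>j</sub> for i ≠ j.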

The following lemma can be proved by induction on R.

**Lemma 4.2.** *Let* f<sub>1</sub>,...,f<sub>n</sub> *be mappings as defined above and let* R *be a regular expression with* Σ(R) ⊆ B*. Then, for all words* w ∈ Σ<sup>∗</sup> *and all* i ∈ 1,...,n*,* w ∈ L(R) *if and only if* f<sub>i</sub>(w) ∈ L(R)*.*

Given a subset A of Σ, we say that ψ is *satisfiable in* A if there is a model h : V(ψ) → A<sup>∗</sup> of ψ. We can now prove the main theorem of this section, which shows how to reduce the alphabet while maintaining satisfiability.

**Theorem 4.3.** *Let* ψ *be a formula with at most* n *string variables* x<sub>1</sub>,...,x<sub>n</sub> *such that* |Σ(ψ)| + n ≤ |Σ|*. Then,* ψ *is satisfiable if and only if it is satisfiable in an alphabet* A ⊆ Σ *of cardinality* |A| = |Σ(ψ)| + n*.*

*Proof.* We set B = Σ(ψ) and use the previous construction. So the alphabet A = B ∪ {α<sub>1</sub>,...,α<sub>n</sub>} has cardinality |Σ(ψ)| + n, where α<sub>1</sub>,...,α<sub>n</sub> are distinct symbols of Σ \ B. We can assume that ψ is in disjunctive normal form, meaning that it is a disjunction of the form ψ = ψ<sub>1</sub> ∨ ··· ∨ ψ<sub>m</sub>, where each ψ<sub>t</sub> is a conjunctive formula. If ψ is satisfiable, then one of the disjuncts ψ<sub>k</sub> is satisfiable and we have Σ(ψ<sub>k</sub>) ⊆ B. We can turn ψ<sub>k</sub> into normal form by eliminating all variable equalities of the form x<sub>i</sub> ≐ x<sub>j</sub> from ψ<sub>k</sub>, resulting in a conjunction ϕ<sub>k</sub> of literals of the form x<sub>i</sub> ∈̇ R, x<sub>i</sub> ∉̇ R, or x<sub>i</sub> ≠̇ x<sub>j</sub>. Clearly, for any A ⊆ Σ, ϕ<sub>k</sub> is satisfiable in A if and only if ψ<sub>k</sub> is satisfiable in A.

Let h : V(ϕ<sub>k</sub>) → Σ<sup>∗</sup> be a model of ϕ<sub>k</sub> and define the mapping h′ : V(ϕ<sub>k</sub>) → A<sup>∗</sup> as h′(x<sub>i</sub>) = f<sub>i</sub>(h(x<sub>i</sub>)). We show that h′ is a model of ϕ<sub>k</sub>. Consider a literal l of ϕ<sub>k</sub>. We have three cases: if l is x<sub>i</sub> ∈̇ R, then h(x<sub>i</sub>) ∈ L(R) and Σ(R) ⊆ B, so f<sub>i</sub>(h(x<sub>i</sub>)) ∈ L(R) by Lemma 4.2; if l is x<sub>i</sub> ∉̇ R, then h(x<sub>i</sub>) ∉ L(R), and again by Lemma 4.2, f<sub>i</sub>(h(x<sub>i</sub>)) ∉ L(R); if l is x<sub>i</sub> ≠̇ x<sub>j</sub>, then h(x<sub>i</sub>) and h(x<sub>j</sub>) are distinct words, so f<sub>i</sub>(h(x<sub>i</sub>)) ≠ f<sub>j</sub>(h(x<sub>j</sub>)) by Lemma 4.1.


All literals of ϕ<sub>k</sub> are then satisfied by h′; hence ϕ<sub>k</sub> is satisfiable in A and thus so is ψ<sub>k</sub>. It follows that ψ is satisfiable in A. □

The reduction presented here can be improved and generalized. For example, it can be worthwhile to use different alphabets for different variables or to reduce large character intervals to smaller sets.

## **5 Propositional Encodings**

Our algorithm performs a series of calls to a SAT solver. Each call determines the satisfiability of the propositional encoding ‖ψ‖<sup>b</sup> of ψ for some upper bounds b. Recall that ‖ψ‖<sup>b</sup> = ψ<sup>A</sup> ∧ ‖h‖<sup>b</sup> ∧ ‖**D**‖<sup>b</sup>, where ψ<sup>A</sup> is the Boolean abstraction of ψ, ‖h‖<sup>b</sup> is an encoding of the set of possible substitutions, and ‖**D**‖<sup>b</sup> is an encoding of the theory-literal definitions, both bounded by b. Intuitively, ‖h‖<sup>b</sup> tells the SAT solver to "guess" a substitution, ‖**D**‖<sup>b</sup> makes sure that all theory literals are assigned proper truth values according to the substitution, and ψ<sup>A</sup> forces the evaluation of the whole formula under these truth values.

Suppose the algorithm performs n calls and let b<sub>k</sub> : Γ → ℕ for k ∈ 1,...,n denote the upper bounds used in the k-th call to the SAT solver. For convenience, we additionally define b<sub>0</sub>(x) = 0 for all x ∈ Γ. In the k-th call, the SAT solver decides whether ‖ψ‖<sup>b<sub>k</sub></sup> is satisfiable. The Boolean abstraction ψ<sup>A</sup>, which we already discussed in Sect. 3, stays the same for each call. In the following, we thus discuss the encodings of the substitutions ‖h‖<sup>b<sub>k</sub></sup> and of the various theory literals ‖a‖<sup>b<sub>k</sub></sup> and ‖¬a‖<sup>b<sub>k</sub></sup> that are part of ‖**D**‖<sup>b<sub>k</sub></sup>. Even though SAT solvers expect their input in CNF, we do not present the encodings in CNF to simplify the presentation; they can be converted to CNF using simple equivalence transformations.

Most of our encodings are *incremental* in the sense that the formula for call k is constructed by only adding clauses to the formula for call k − 1. In other words, for substitution encodings we have ‖h‖<sup>b<sub>k</sub></sup> = ‖h‖<sup>b<sub>k−1</sub></sup> ∧ ‖h‖<sup>b<sub>k</sub></sup><sub>b<sub>k−1</sub></sub> and for literals we have ‖l‖<sup>b<sub>k</sub></sup> = ‖l‖<sup>b<sub>k−1</sub></sup> ∧ ‖l‖<sup>b<sub>k</sub></sup><sub>b<sub>k−1</sub></sub>, with the base case ‖h‖<sup>b<sub>0</sub></sup> = ‖l‖<sup>b<sub>0</sub></sup> = ⊤. In these cases, it is thus enough to encode the incremental additions ‖l‖<sup>b<sub>k</sub></sup><sub>b<sub>k−1</sub></sub> and ‖h‖<sup>b<sub>k</sub></sup><sub>b<sub>k−1</sub></sub> for each call to the SAT solver. Some of our encodings, however, introduce clauses that are valid only for a specific bound b<sub>k</sub> and thus become invalid for larger bounds. We handle the deactivation of these encodings with *selector variables*, as is common in incremental SAT solving.

Our encodings are correct in the following sense.<sup>1</sup>

**Theorem 5.1.** *Let* l *be a literal and let* b : Γ → ℕ *be a bound function. Then,* l *has a model that is bounded by* b *if and only if* ‖h‖<sup>b</sup> ∧ ‖l‖<sup>b</sup> *is satisfiable.*

#### **5.1 Substitutions**

We encode substitutions by defining, for each variable x ∈ Γ, the characters to which each of x's positions is mapped. Specifically, given x and its corresponding upper bound b(x), we represent the substitution h(x) by introducing new variables x[1],...,x[b(x)], one for each symbol h(x)[i] of the word h(x). We call these variables *filler variables* and denote the set of all filler variables by Γ̌. By introducing a new symbol λ ∉ Σ, which stands for an unused filler variable, we can define h based on a substitution ȟ : Γ̌ → Σ<sub>λ</sub> over the filler variables, where Σ<sub>λ</sub> = Σ ∪ {λ}:

$$h(\mathbf{x})[i] = \begin{cases} \varepsilon & \text{if } \check{h}(\mathbf{x}[i]) = \lambda \\ \check{h}(\mathbf{x}[i]) & \text{otherwise} \end{cases}$$

We use this representation of substitutions (known as "filling the positions" [18]) because it has a straightforward propositional encoding: for each variable x ∈ Γ and each position i ∈ 1,...,b(x), we create a set {h<sup>a</sup><sub>x[i]</sub> | a ∈ Σ<sub>λ</sub>} of Boolean variables, where h<sup>a</sup><sub>x[i]</sub> is true if ȟ(x[i]) = a. We then use a propositional encoding of an *exactly-one* (EO) constraint (e.g., [20]) to assert that exactly one variable in this set must be true:

$$\left\lVert \boldsymbol{h} \right\rVert\_{\mathbf{b}\_{k-1}}^{\mathbf{b}\_{k}} = \bigwedge\_{\mathbf{x} \in \Gamma} \bigwedge\_{i=\mathbf{b}\_{k-1}(\mathbf{x})+1}^{\mathbf{b}\_{k}(\mathbf{x})} \operatorname{EO}(\{\boldsymbol{h}\_{\mathbf{x}[i]}^{a} \mid \boldsymbol{a} \in \Sigma\_{\lambda}\}) \tag{1}$$

$$\wedge \bigwedge\_{\mathbf{x} \in \Gamma} \bigwedge\_{i=\mathbf{b}\_{k-1}(\mathbf{x})}^{\mathbf{b}\_k(\mathbf{x})-1} h\_{\mathbf{x}[i]}^\lambda \to h\_{\mathbf{x}[i+1]}^\lambda \tag{2}$$

<sup>1</sup> The proof is omitted due to space constraints but is made available for review purposes.

Constraint (2) prevents the SAT solver from considering filled substitutions that are equivalent modulo λ-padding: it enforces that if a position i is mapped to λ, then all following positions are mapped to λ too. For instance, abλλ, aλbλ, and λλab all correspond to the same word ab, but our encoding allows only abλλ. Thus, every Boolean assignment ω that satisfies ‖h‖<sup>b</sup> encodes exactly one substitution h<sub>ω</sub>, and for every substitution h (bounded by b) there exists a corresponding assignment ω<sub>h</sub> that satisfies ‖h‖<sup>b</sup>.
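
The invariants that the exactly-one constraints and constraint (2) enforce can be stated directly in executable form. This Python sketch is our own illustration: it operates on explicit character lists rather than on the Boolean variables h<sup>a</sup><sub>x[i]</sub>:

```python
# Sketch of the "filling the positions" representation: each position of x
# holds one character from Sigma ∪ {λ}, and λ may only appear as a suffix.

LAM = 'λ'

def well_formed(filled):
    """Check the conditions the propositional encoding enforces:
    exactly one symbol per position, and λ only as a trailing block."""
    if any(len(c) != 1 for c in filled):   # "exactly one" per position
        return False
    seen_lambda = False
    for c in filled:
        if c == LAM:
            seen_lambda = True
        elif seen_lambda:                  # a non-λ after λ: constraint (2)
            return False
    return True

def decode(filled):
    """Turn a filled substitution into the word it represents."""
    return ''.join(c for c in filled if c != LAM)

print(well_formed(['a', 'b', LAM, LAM]), decode(['a', 'b', LAM, LAM]))
print(well_formed(['a', LAM, 'b', LAM]))   # rejected: λ in the middle
```

Rejecting fillings like aλbλ removes redundant models and so shrinks the SAT search space without losing any word.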

#### **5.2 Theory Literals**

The only theory literals of our core language are regular constraints (x ∈̇ R) and variable equations (x ≐ y), together with their negations. Constant equations (x ≐ w) as well as prefix and suffix constraints (w ⊑̇ x and w ⊒̇ x) could be expressed as regular constraints, but we encode them explicitly to improve performance.

**Regular Constraints.** We encode a regular constraint x ∈̇ R by constructing a propositional formula that is true if and only if the word h(x) is accepted by a specific nondeterministic finite automaton that accepts the language L(R). Let x ∈̇ R be a regular constraint and let M = (Q, Σ, δ, q<sub>0</sub>, F) be a nondeterministic finite automaton (with states Q, alphabet Σ, transition relation δ, initial state q<sub>0</sub>, and accepting states F) that accepts L(R) and that additionally allows λ-self-transitions on every state. Given that λ is a placeholder for the empty symbol, λ-transitions do not change the language accepted by M. We allow them so that M performs exactly b(x) transitions, even for substitutions of length less than b(x). This reduces checking whether the automaton accepts a word to evaluating only the states reached after exactly b(x) transitions.

Given a model ω ⊨ ‖h‖<sup>b</sup>, we express the semantics of M in propositional logic by encoding which states are reachable after reading h<sub>ω</sub>(x). To this end, we assign b(x) + 1 Boolean variables {S<sup>0</sup><sub>q</sub>, S<sup>1</sup><sub>q</sub>,...,S<sup>b(x)</sup><sub>q</sub>} to each state q ∈ Q and assert that ω(S<sup>i</sup><sub>q</sub>) = 1 if and only if q can be reached by reading the prefix h<sub>ω</sub>(x)[1..i]. We encode this as a conjunction ‖(M; x)‖ = ‖I<sub>(M;x)</sub>‖ ∧ ‖T<sub>(M;x)</sub>‖ ∧ ‖P<sub>(M;x)</sub>‖ of three formulas, modelling the semantics of the initial state, the transition relation, and the predecessor relation of M. We assert that the initial state q<sub>0</sub> is the only state reachable after reading the prefix of length 0, i.e., ‖I<sub>(M;x)</sub>‖<sup>b<sub>1</sub></sup> = S<sup>0</sup><sub>q<sub>0</sub></sub> ∧ ⋀<sub>q ∈ Q\{q<sub>0</sub>}</sub> ¬S<sup>0</sup><sub>q</sub>. This condition is independent of the bound on x; thus we set ‖I<sub>(M;x)</sub>‖<sup>b<sub>k</sub></sup><sub>b<sub>k−1</sub></sub> = ⊤ for all k > 1.

We encode the transition relation of M by stating that if M is in some state q after reading h<sub>ω</sub>(x)[1..i], and if there exists a transition from q to q′ labelled with a, then M can reach state q′ after i + 1 transitions if h<sub>ω</sub>(x)[i + 1] = a. This is expressed in the following formula:

$$\|\mathbf{T}\_{(M;\mathbf{x})}\|\_{\mathbf{b}\_{k-1}}^{\mathbf{b}\_k} = \bigwedge\_{i=\mathbf{b}\_{k-1}(\mathbf{x})}^{\mathbf{b}\_k(\mathbf{x})-1} \bigwedge\_{(q,a)\in\text{dom}(\delta)} \bigwedge\_{q'\in\delta(q,a)} \left(S\_q^i \wedge h\_{\mathbf{x}[i+1]}^a\right) \to S\_{q'}^{i+1}$$

The formula captures all possible forward moves from each state. We must also ensure that a state is reachable only if it has a reachable predecessor, which we encode with the following formula, where pred(q′) = {(q, a) | q′ ∈ δ(q, a)}:

$$\|\mathbf{P}\_{(M;\mathbf{x})}\|\_{\mathbf{b}\_{k-1}}^{\mathbf{b}\_k} = \bigwedge\_{i=\mathbf{b}\_{k-1}(\mathbf{x})+1}^{\mathbf{b}\_k(\mathbf{x})} \bigwedge\_{q' \in Q} \left(S\_{q'}^i \to \bigvee\_{(q,a) \in \text{pred}(q')} (S\_q^{i-1} \wedge h\_{\mathbf{x}[i]}^a)\right)$$

The formula states that if state q′ is reachable after i ≥ 1 transitions, then there must be a reachable predecessor state q ∈ δ̂({q<sub>0</sub>}, h<sub>ω</sub>(x)[1..i−1]) such that q′ ∈ δ(q, h<sub>ω</sub>(x)[i]).

To decide whether the automaton accepts h<sub>ω</sub>(x), we encode that it must reach an accepting state after b<sub>k</sub>(x) transitions. Our corresponding encoding is only valid for the particular bound b<sub>k</sub>(x). To account for this, we introduce a fresh selector variable s<sub>k</sub> and define ‖accept<sub>x ∈̇ M</sub>‖<sup>b<sub>k</sub></sup><sub>b<sub>k−1</sub></sub> = s<sub>k</sub> → ⋁<sub>q<sub>f</sub> ∈ F</sub> S<sup>b<sub>k</sub>(x)</sup><sub>q<sub>f</sub></sub>. Analogously, we define ‖reject<sub>x ∈̇ M</sub>‖<sup>b<sub>k</sub></sup><sub>b<sub>k−1</sub></sub> = s<sub>k</sub> → ⋀<sub>q<sub>f</sub> ∈ F</sub> ¬S<sup>b<sub>k</sub>(x)</sup><sub>q<sub>f</sub></sub>. In the k-th call to the SAT solver and all following calls with the same bound on x, we solve under the assumption that s<sub>k</sub> is true. In the first call k′ with b<sub>k</sub>(x) < b<sub>k′</sub>(x), we re-encode the condition using a new selector variable s<sub>k′</sub> and solve under the assumption that s<sub>k</sub> is false and s<sub>k′</sub> is true. The full encoding of the regular constraint x ∈̇ R is thus given by

$$\left\lVert \mathbf{x} \mathrel{\dot{\in}} R \right\rVert\_{\mathbf{b}\_{k-1}}^{\mathbf{b}\_k} = \left\lVert (M; \mathbf{x}) \right\rVert\_{\mathbf{b}\_{k-1}}^{\mathbf{b}\_k} \land \left\lVert \text{accept}\_{\mathbf{x} \mathrel{\dot{\in}} M} \right\rVert\_{\mathbf{b}\_{k-1}}^{\mathbf{b}\_k}$$

and its negation x ∉̇ R is encoded as

$$\left\lVert \mathbf{x} \mathrel{\dot{\notin}} R \right\rVert\_{\mathbf{b}\_{k-1}}^{\mathbf{b}\_k} = \left\lVert (M; \mathbf{x}) \right\rVert\_{\mathbf{b}\_{k-1}}^{\mathbf{b}\_k} \land \left\lVert \text{reject}\_{\mathbf{x} \mathrel{\dot{\in}} M} \right\rVert\_{\mathbf{b}\_{k-1}}^{\mathbf{b}\_k}.$$
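
The semantics captured by the S<sup>i</sup><sub>q</sub> variables can be simulated directly: the sets computed below play the role of the states assigned true at step i. This Python sketch uses a made-up automaton for the language a · b<sup>∗</sup> (not taken from the paper); it pads the word with λ and tests acceptance after exactly b(x) transitions:

```python
# Sketch of bounded NFA acceptance with λ self-loops on every state.

LAM = 'λ'

def reach_sets(delta, q0, padded):
    """S[i] = set of states reachable after reading padded[0..i-1]."""
    S = [{q0}]
    for c in padded:
        nxt = set()
        for q in S[-1]:
            nxt |= delta.get((q, c), set())
            if c == LAM:          # λ self-loop: reading λ may stay put
                nxt.add(q)
        S.append(nxt)
    return S

def accepts_bounded(delta, q0, F, word, bound):
    """Accept iff an accepting state is reached after exactly `bound` steps."""
    padded = list(word) + [LAM] * (bound - len(word))
    return bool(reach_sets(delta, q0, padded)[-1] & F)

# NFA for a·b*: state 0 is initial, state 1 is accepting.
delta = {(0, 'a'): {1}, (1, 'b'): {1}}
print(accepts_bounded(delta, 0, {1}, 'abb', 5))  # True
print(accepts_bounded(delta, 0, {1}, 'ba', 5))   # False
```

The propositional encoding constrains the variables S<sup>i</sup><sub>q</sub> to describe exactly these sets, so that acceptance reduces to checking the variables at step b(x).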

**Variable Equations.** Let x, y ∈ Γ be two string variables, let l = min(b<sub>k−1</sub>(x), b<sub>k−1</sub>(y)), and let u = min(b<sub>k</sub>(x), b<sub>k</sub>(y)). We encode equality between x and y with respect to b<sub>k</sub> position-wise up to u:

$$\|\mathbf{x} \doteq \mathbf{y}\|\_{\mathbf{b}\_{k-1}}^{\mathbf{b}\_k} = \bigwedge\_{i=l+1}^{u} \bigwedge\_{a \in \Sigma\_{\lambda}} (h\_{\mathbf{x}[i]}^a \to h\_{\mathbf{y}[i]}^a).$$

The formula asserts that for each position i ∈ l + 1,...,u, if x[i] is mapped to a symbol, then y[i] is mapped to the same symbol (including λ). Since our encoding of substitutions ensures that every position in a string variable is mapped to exactly one character, ‖x ≐ y‖<sup>b<sub>k</sub></sup><sub>b<sub>k−1</sub></sub> ensures x[i] = y[i] for i ∈ l + 1,...,u. In conjunction with ‖x ≐ y‖<sup>b<sub>k−1</sub></sup>, which encodes equality up to the l-th position, we have symbol-wise equality of x and y up to bound u. Thus, if b<sub>k</sub>(x) = b<sub>k</sub>(y), the formula ensures the equality of both variables. If b<sub>k</sub>(x) > b<sub>k</sub>(y), we add h<sup>λ</sup><sub>x[u+1]</sub> as an assumption to the solver to ensure x[i] = λ for i ∈ u + 1,...,b<sub>k</sub>(x); symmetrically, we add the assumption h<sup>λ</sup><sub>y[u+1]</sub> if b<sub>k</sub>(y) > b<sub>k</sub>(x).

For the negation x ≠̇ y, we encode that h(x) and h(y) must disagree on at least one position, which can happen either because they map to different symbols or because the variable with the higher bound is mapped to a longer word. As for the regular constraints, we again use a selector variable s<sub>k</sub> to deactivate the encoding for all later bounds, for which it will be re-encoded:

$$\left\|\mathbf{x}\neq\mathbf{y}\right\|\_{\mathbf{b}\_{k-1}}^{\mathbf{b}\_{k}} = \begin{cases} s\_{k} \rightarrow \left(\bigvee\_{i=1}^{u} \bigvee\_{a\in\Sigma\_{\lambda}} (\neg h\_{\mathbf{x}[i]}^{a} \wedge h\_{\mathbf{y}[i]}^{a})\right) & \text{if } \mathbf{b}\_{k}(\mathbf{x}) = \mathbf{b}\_{k}(\mathbf{y}),\\ s\_{k} \rightarrow \left(\bigvee\_{i=1}^{u} \bigvee\_{a\in\Sigma\_{\lambda}} (\neg h\_{\mathbf{x}[i]}^{a} \wedge h\_{\mathbf{y}[i]}^{a})\right) \vee \neg h\_{\mathbf{y}[u+1]}^{\lambda} & \text{if } \mathbf{b}\_{k}(\mathbf{x}) < \mathbf{b}\_{k}(\mathbf{y}),\\ s\_{k} \rightarrow \left(\bigvee\_{i=1}^{u} \bigvee\_{a\in\Sigma\_{\lambda}} (\neg h\_{\mathbf{x}[i]}^{a} \wedge h\_{\mathbf{y}[i]}^{a})\right) \vee \neg h\_{\mathbf{x}[u+1]}^{\lambda} & \text{if } \mathbf{b}\_{k}(\mathbf{x}) > \mathbf{b}\_{k}(\mathbf{y}). \end{cases}$$

**Constant Equations.** Given a constant equation x ≐ w, if the upper bound of x is less than |w|, the atom is trivially unsatisfiable. Thus, for all i such that b<sub>i</sub>(x) < |w|, we encode x ≐ w with a single literal ¬s<sub>x,w</sub> and add s<sub>x,w</sub> to the assumptions. For b<sub>k</sub>(x) ≥ |w|, the encoding is based on the value of b<sub>k−1</sub>(x):

$$\begin{aligned} \left\lVert \mathbf{x} = w \right\rVert\_{\mathbf{b}\_{k-1}}^{\mathbf{b}\_k} = \begin{cases} \bigwedge\_{i=1}^{|w|} h\_{\mathbf{x}[i]}^{w[i]} & \text{if } \mathbf{b}\_{k-1}(\mathbf{x}) < |w| = \mathbf{b}\_k(\mathbf{x})\\ \bigwedge\_{i=1}^{|w|} h\_{\mathbf{x}[i]}^{w[i]} \wedge h\_{\mathbf{x}[|w|+1]}^{\lambda} & \text{if } \mathbf{b}\_{k-1}(\mathbf{x}) < |w| < \mathbf{b}\_k(\mathbf{x})\\ h\_{\mathbf{x}[|w|+1]}^{\lambda} & \text{if } \mathbf{b}\_{k-1}(\mathbf{x}) = |w| < \mathbf{b}\_k(\mathbf{x})\\ \top & \text{if } |w| < \mathbf{b}\_{k-1}(\mathbf{x}) \end{cases} \end{aligned}$$

If $\mathbf{b}_{k-1}(\mathbf{x}) < |w|$, then equality is encoded for all positions $1,\dots,|w|$. Additionally, if $\mathbf{b}_k(\mathbf{x}) > |w|$, we ensure that the suffix of $\mathbf{x}$ starting at position $|w|+1$ is empty. If $\mathbf{b}_{k-1}(\mathbf{x}) = |w| < \mathbf{b}_k(\mathbf{x})$, then only the empty suffix has to be ensured. Lastly, if $|w| < \mathbf{b}_{k-1}(\mathbf{x})$, then $\lVert \mathbf{x} \dot{=} w \rVert_{\mathbf{b}_{k-1}} \Leftrightarrow \lVert \mathbf{x} \dot{=} w \rVert_{\mathbf{b}_k}$.
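The case split above can be sketched as clause generation. The following is a minimal illustration with hypothetical names (not the solver's actual data structures): the tuple `("h", x, i, a)` stands for the Boolean variable asserting that position `i` of `x` holds symbol `a`, and `"lam"` for the padding symbol $\lambda$.

```python
def encode_const_eq(x, w, b_prev, b_k):
    """Unit clauses for x = w under bounds b_prev < b_k (illustrative).

    Returns a list of unit clauses; an empty list corresponds to the
    'top' case in which nothing new has to be encoded.
    """
    clauses = []
    if b_prev < len(w) <= b_k:
        # equality on positions 1..|w| (not yet encoded at earlier bounds)
        clauses += [[("h", x, i + 1, w[i])] for i in range(len(w))]
    if b_prev <= len(w) < b_k:
        # force the suffix after position |w| to be empty
        clauses += [[("h", x, len(w) + 1, "lam")]]
    return clauses
```

The two `if` conditions together reproduce all four cases of the definition: only the equality clauses when the new bound equals $|w|$, both clause sets when it exceeds $|w|$, only the empty-suffix clause when equality was already encoded, and nothing once $|w| < \mathbf{b}_{k-1}(\mathbf{x})$.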

Conversely, for an inequality $\mathbf{x} \neq w$, if $\mathbf{b}_k(\mathbf{x}) < |w|$, then every substitution is trivially a solution, which we simply encode with $\top$. Otherwise, we introduce a selector variable $s'_{\mathbf{x},w}$ and define

$$\left\lVert \mathbf{x} \neq w \right\rVert_{\mathbf{b}_{k-1}}^{\mathbf{b}_k} = \begin{cases} s'_{\mathbf{x},w} \rightarrow \bigvee_{i=1}^{|w|} \neg h_{\mathbf{x}[i]}^{w[i]} & \text{if } \mathbf{b}_{k-1}(\mathbf{x}) < |w| = \mathbf{b}_k(\mathbf{x}) \\ \bigvee_{i=1}^{|w|} \neg h_{\mathbf{x}[i]}^{w[i]} \vee \neg h_{\mathbf{x}[|w|+1]}^{\lambda} & \text{if } \mathbf{b}_{k-1}(\mathbf{x}) < |w| < \mathbf{b}_k(\mathbf{x}) \\ \top & \text{if } |w| < \mathbf{b}_{k-1}(\mathbf{x}) \le \mathbf{b}_k(\mathbf{x}) \end{cases}$$

If $\mathbf{b}_k(\mathbf{x}) = |w|$, then a substitution $h$ satisfies the constraint if and only if $h(\mathbf{x})[i] \neq w[i]$ for some $i \in \{1,\dots,|w|\}$. If $\mathbf{b}_k(\mathbf{x}) > |w|$, then $h$ additionally satisfies the constraint if $|h(\mathbf{x})| > |w|$. Thus, if $\mathbf{b}_k(\mathbf{x}) = |w|$, we perform solver call $k$ under the assumption $s'_{\mathbf{x},w}$, and if $\mathbf{b}_k(\mathbf{x}) > |w|$, we perform it under the assumption $\neg s'_{\mathbf{x},w}$. Again, if $|w| < \mathbf{b}_{k-1}(\mathbf{x})$, then $\lVert \mathbf{x} \neq w \rVert_{\mathbf{b}_{k-1}} \Leftrightarrow \lVert \mathbf{x} \neq w \rVert_{\mathbf{b}_k}$.

**Prefix and Suffix Constraints.** A prefix constraint on a variable $\mathbf{x}$ and a constant word $w$ expresses that the first $|w|$ positions of $\mathbf{x}$ must be mapped exactly onto $w$. As with equations between a variable and a constant word, we could express this as a regular constraint of the form $\mathbf{x} \dot\in w \cdot ?^*$. However, we achieve a more efficient encoding simply by dropping from the encoding of $\mathbf{x} \dot{=} w$ the assertion that the suffix of $\mathbf{x}$ starting at position $|w|+1$ be empty. Accordingly, a negated prefix constraint expresses that there is an index $i \in \{1,\dots,|w|\}$ such that the $i$-th position of $\mathbf{x}$ is mapped onto a symbol different from $w[i]$, which we encode by adapting $\mathbf{x} \neq w$ in a similar manner. Suffix constraints and their negations can be encoded by analogous modifications to the encodings of $\mathbf{x} \dot{=} w$ and $\mathbf{x} \neq w$.
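The difference between the prefix encoding and the constant equation is just the missing empty-suffix clause. A minimal sketch, under the same hypothetical naming conventions as before (`("h", x, i, a)` is the Boolean "position `i` of `x` holds `a`"; the `"not-h"` tag marks a negated variable):

```python
def encode_prefix(x, w):
    """Unit clauses forcing w to be a prefix of x (illustrative sketch).

    Unlike the encoding of x = w, no clause forces the suffix after
    position |w| to be empty, so x may continue with arbitrary symbols.
    """
    return [[("h", x, i + 1, w[i])] for i in range(len(w))]

def encode_neg_prefix(x, w):
    """A single clause: some position i <= |w| of x differs from w[i]."""
    return [[("not-h", x, i + 1, w[i]) for i in range(len(w))]]
```

The positive constraint yields $|w|$ unit clauses, one per position; the negated constraint yields one clause that is a disjunction over all $|w|$ positions.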

## **6 Refining Upper Bounds**

Our procedure solves a series of SAT problems in which the length bounds on string variables increase after each unsatisfiable solver call. The procedure terminates once the bounds are large enough that increasing them further would be futile. To determine when this is the case, we rely on upper bounds on a *shortest solution* to a formula ψ. We call a model $h$ of ψ a shortest solution of ψ if ψ has no model $h'$ such that $\sum_{\mathbf{x}\in\Gamma} |h'(\mathbf{x})| < \sum_{\mathbf{x}\in\Gamma} |h(\mathbf{x})|$. We first establish this bound for conjunctive formulas in normal form, where all literals are of the form $\mathbf{x} \neq \mathbf{y}$, $\mathbf{x} \dot\in R$, or $\mathbf{x} \mathbin{\dot{\notin}} R$. Once established, we show how the bound can be generalized to arbitrary formulas.

Let ϕ be a formula in normal form and let $\mathbf{x}_1,\dots,\mathbf{x}_n$ be the variables of ϕ. For each variable $\mathbf{x}_i$, we can collect all the regular constraints on $\mathbf{x}_i$, that is, all the literals of the form $\mathbf{x}_i \dot\in R$ or $\mathbf{x}_i \mathbin{\dot{\notin}} R$ that occur in ϕ. We can characterize the solutions to all these constraints by a single nondeterministic finite automaton (NFA) $M_i$. If the constraints on $\mathbf{x}_i$ are $\mathbf{x}_i \dot\in R_1,\dots,\mathbf{x}_i \dot\in R_k$ and $\mathbf{x}_i \mathbin{\dot{\notin}} R'_1,\dots,\mathbf{x}_i \mathbin{\dot{\notin}} R'_l$, then $M_i$ is an NFA that accepts the regular language $\bigcap_{t=1}^{k} L(R_t) \cap \bigcap_{t=1}^{l} \overline{L(R'_t)}$, where $\overline{L(R)}$ denotes the complement of $L(R)$. We say that $M_i$ accepts the regular constraints on $\mathbf{x}_i$ in ϕ. If there are no such constraints on $\mathbf{x}_i$, then $M_i$ is the one-state NFA that accepts the full language $\Sigma^*$. Let $Q_i$ denote the set of states of $M_i$. If we do not take inequalities into account and the regular constraints on $\mathbf{x}_i$ are satisfiable, then a shortest solution $h$ satisfies $|h(\mathbf{x}_i)| \le |Q_i|$.
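The $|Q_i|$ bound can be made concrete: a shortest word accepted by the product of the constraint automata never needs more letters than the product has states. The following toy sketch illustrates this; the NFA representation, the function names, and the two-letter alphabet are our own assumptions for illustration, not the paper's data structures.

```python
from collections import deque
from itertools import product

# A toy NFA is (states, delta, initial, accepting), where
# delta maps (state, symbol) -> set of successor states.
ALPHABET = "ab"

def intersect(n1, n2):
    """Product NFA accepting the intersection of the two languages."""
    (Q1, d1, q1, F1), (Q2, d2, q2, F2) = n1, n2
    Q = set(product(Q1, Q2))
    d = {((p, q), a): {(p2, q2)
                       for p2 in d1.get((p, a), set())
                       for q2 in d2.get((q, a), set())}
         for (p, q) in Q for a in ALPHABET}
    return Q, d, (q1, q2), {s for s in Q if s[0] in F1 and s[1] in F2}

def shortest_accepted(nfa):
    """BFS for a shortest accepted word; its length is at most |Q|."""
    Q, d, q0, F = nfa
    seen, todo = {q0}, deque([(q0, "")])
    while todo:
        q, w = todo.popleft()
        if q in F:
            return w
        for a in ALPHABET:
            for q2 in d.get((q, a), set()) - seen:
                seen.add(q2)
                todo.append((q2, w + a))
    return None  # empty intersection
```

For example, intersecting an NFA for "words starting with a" with one for "words ending with b" (two states each) yields the shortest word "ab", whose length 2 is within the 2 × 2 = 4 state bound of the product.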

Theorem 6.1 gives a bound for the general case with variable inequalities. Intuitively, we prove the theorem by constructing a single automaton $P$ that takes as input a vector of words $W = (w_1,\dots,w_n)^T$ and accepts $W$ iff the substitution $h_W$ with $h_W(\mathbf{x}_i) = w_i$ satisfies ϕ. To construct $P$, we introduce one two-state NFA for each inequality and then form the product of these NFAs with (slightly modified versions of) the NFAs $M_1,\dots,M_n$. The bound on a shortest solution then follows from the number of states of $P$.

**Theorem 6.1.** *Let ϕ be a conjunctive formula in normal form over variables $\mathbf{x}_1,\dots,\mathbf{x}_n$. Let $M_i = (Q_i, \Sigma, \delta_i, q_{0,i}, F_i)$ be an NFA that accepts the regular constraints on $\mathbf{x}_i$ in ϕ and let $k$ be the number of inequalities occurring in ϕ. If ϕ is satisfiable, then it has a model $h$ such that*

$$|h(\mathbb{x}\_i)| \le 2^k \times |Q\_1| \times \dots \times |Q\_n|.$$

*Proof.* Let $\lambda$ be a symbol that does not belong to $\Sigma$ and define $\Sigma_\lambda = \Sigma \cup \{\lambda\}$. As previously, we use $\lambda$ to extend words of $\Sigma^*$ by padding. Given a word $w \in \Sigma_\lambda^*$, we denote by $\hat{w}$ the word of $\Sigma^*$ obtained by removing all occurrences of $\lambda$ from $w$. We say that $w$ is well-formed if it can be written as $w = v \cdot \lambda^t$ with $v \in \Sigma^*$ and $t \ge 0$; in this case, $\hat{w} = v$. Thus a well-formed word $w$ consists of a prefix in $\Sigma^*$ followed by a sequence of $\lambda$s.

Let $\Delta$ be the alphabet $\Sigma_\lambda^n$, i.e., the letters of $\Delta$ are the $n$-letter words over $\Sigma_\lambda$. We can then represent a letter $u$ of $\Delta$ as an $n$-element vector $(u_1,\dots,u_n)$, and a word $W$ of $\Delta^t$ can be written as an $n \times t$ matrix

$$W = \begin{pmatrix} u_{11} & \dots & u_{t1} \\ \vdots & & \vdots \\ u_{1n} & \dots & u_{tn} \end{pmatrix}$$

where each entry $u_{ji} \in \Sigma_\lambda$. Each column of this matrix is a letter in $\Delta$ and each row is a word in $\Sigma_\lambda^t$. We denote by $p_i(W)$ the $i$-th row of this matrix and by $\hat{p}_i(W)$ the word $p_i(W)$ with all occurrences of $\lambda$ removed. We say that $W$ is well-formed if the words $p_1(W),\dots,p_n(W)$ are all well-formed. Given a well-formed word $W$, we can construct a mapping $h_W : \{\mathbf{x}_1,\dots,\mathbf{x}_n\} \to \Sigma^*$ by setting $h_W(\mathbf{x}_i) = \hat{p}_i(W)$, and we have $|h_W(\mathbf{x}_i)| \le |W| = t$.

To prove the theorem, we build an NFA $P$ over the alphabet $\Delta$ such that a well-formed word $W$ is accepted by $P$ iff $h_W$ satisfies ϕ. The shortest well-formed word accepted by $P$ has length no greater than the number of states of $P$, and the bound will follow.

We first extend the NFA $M_i = (Q_i, \Sigma, \delta_i, q_{0,i}, F_i)$ to an automaton $M'_i$ with alphabet $\Delta$. $M'_i$ has the same set of states, initial state, and final states as $M_i$. Its transition relation $\delta'_i$ is defined by

$$\delta\_i'(q, u) = \begin{cases} \delta\_i(q, u\_i) & \text{if } u\_i \in \Sigma\\ \{q\} & \text{if } u\_i = \lambda \end{cases}$$

One can easily check that $M'_i$ accepts a word $W$ iff $M_i$ accepts $\hat{p}_i(W)$.

For an inequality $\mathbf{x}_i \neq \mathbf{x}_j$, we construct an NFA $D_{i,j} = (\{e, d\}, \Delta, \delta, e, \{d\})$ with the transition function defined as follows:

$$\begin{aligned} \delta(e, u) &= \{e\} \quad \text{if } u\_i = u\_j\\ \delta(e, u) &= \{d\} \quad \text{if } u\_i \neq u\_j\\ \delta(d, u) &= \{d\}. \end{aligned}$$

This NFA has two states. It starts in state $e$ (for "equal") and stays in $e$ as long as the characters $u_i$ and $u_j$ are equal. It transitions to state $d$ (for "different") on the first letter $u$ with $u_i \neq u_j$ and stays in $d$ from that point on. Since $d$ is the final state, a word $W$ is accepted by $D_{i,j}$ iff $p_i(W) \neq p_j(W)$. If $W$ is well-formed, we also have that $W$ is accepted by $D_{i,j}$ iff $\hat{p}_i(W) \neq \hat{p}_j(W)$.
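The behaviour of $D_{i,j}$ is easy to simulate directly. The sketch below is our own illustrative code (using `'-'` in place of $\lambda$); it runs the two-state automaton on two padded rows of equal length:

```python
def rows_differ(row_i, row_j):
    """Simulate D_ij on two equally long padded rows.

    The automaton starts in state 'e' (equal so far), moves to the
    absorbing state 'd' on the first position where the rows disagree,
    and accepts iff it ends in state 'd'.
    """
    state = "e"
    for ui, uj in zip(row_i, row_j):
        if state == "e" and ui != uj:
            state = "d"
    return state == "d"
```

For well-formed rows such as `"ab-"` and `"a--"`, acceptance coincides with inequality of the unpadded words (`"ab"` versus `"a"`), matching the claim above.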

Let $\mathbf{x}_{i_1} \neq \mathbf{x}_{j_1},\dots,\mathbf{x}_{i_k} \neq \mathbf{x}_{j_k}$ denote the $k$ inequalities of ϕ. We define $P$ to be the product of the NFAs $M'_1,\dots,M'_n$ and $D_{i_1,j_1},\dots,D_{i_k,j_k}$. A well-formed word $W$ is accepted by $P$ iff it is accepted by all $M'_i$ and all $D_{i_t,j_t}$, which means that $P$ accepts a well-formed word $W$ iff $h_W$ satisfies ϕ.

Let $Q_P$ be the set of states of $P$; we then have $|Q_P| \le 2^k \times |Q_1| \times \dots \times |Q_n|$. Assume ϕ is satisfiable, so $P$ accepts a well-formed word $W$. The shortest well-formed word accepted by $P$ has an accepting run that does not visit the same state twice (deleting the columns along a repeated-state cycle preserves well-formedness), so the length of this word $W$ is no more than $|Q_P|$. The mapping $h_W$ satisfies ϕ and, for every $\mathbf{x}_i$, it satisfies $|h_W(\mathbf{x}_i)| = |\hat{p}_i(W)| \le |W| \le |Q_P| \le 2^k \times |Q_1| \times \dots \times |Q_n|$. $\square$

The bound given by Theorem 6.1 holds if ϕ is in normal form, but it also holds for a general conjunctive formula ψ. This follows from the observation that converting conjunctive formulas to normal form preserves the length of solutions. In particular, we convert $\psi \wedge \mathbf{x} \dot{=} \mathbf{y}$ to the formula $\psi' = \psi[\mathbf{x} := \mathbf{y}]$, so that $\mathbf{x}$ does not occur in $\psi'$; but clearly, a bound for $\mathbf{y}$ in $\psi'$ gives us the same bound for $\mathbf{x}$ in ψ.

In practice, before we apply the theorem, we decompose the conjunctive formula ϕ into subformulas that have disjoint sets of variables: we write ϕ as $\varphi_1 \wedge \dots \wedge \varphi_m$ where the conjuncts have no common variables. Then ϕ is satisfiable iff each conjunct $\varphi_t$ is satisfiable, and we derive upper bounds on the shortest solution for the variables of each $\varphi_t$, which gives more precise bounds than deriving bounds from ϕ directly. In particular, if a variable $\mathbf{x}_i$ does not occur in any inequality, then the bound on $|h(\mathbf{x}_i)|$ is $|Q_i|$.

Theorem 6.1 only holds for conjunctive formulas. For an arbitrary (non-conjunctive) formula ψ, one generalization is to convert ψ into disjunctive normal form. Alternatively, it is sufficient to enumerate the subsets of *lits*(ψ). Given a subset $A$ of *lits*(ψ), we denote by $d_A$ a mapping that bounds the length of solutions to $A$, i.e., any solution $h$ to $A$ satisfies $|h(\mathbf{x})| \le d_A(\mathbf{x})$. This mapping $d_A$ can be computed using Theorem 6.1. The following property gives a bound for ψ.

**Proposition 6.2.** *If* ψ *is satisfiable, then it has a model* h *such that for all* <sup>x</sup> <sup>∈</sup> <sup>Γ</sup>*, it holds that* <sup>|</sup>h(x)| ≤ max{dA(x) <sup>|</sup> <sup>A</sup> <sup>⊆</sup> *lits*(ψ)}*.*

*Proof.* We can assume that ψ is in negation normal form. We can then convert ψ to disjunctive normal form, $\psi \Leftrightarrow \psi_1 \vee \dots \vee \psi_n$, where *lits*$(\psi_i) \subseteq$ *lits*$(\psi)$. Since ψ is satisfiable if and only if at least one $\psi_i$ is satisfiable, the proposition follows. $\square$

Since there are $2^{|lits(\psi)|}$ subsets of *lits*(ψ), a direct application of Proposition 6.2 is rarely feasible in practice. Fortunately, we can use unsatisfiable cores to reduce the number of subsets to consider.
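The subset enumeration behind Proposition 6.2 can be sketched in a few lines. Here `d_bound` stands in for the Theorem 6.1 computation on a conjunction of literals and is an assumption of this sketch, not an interface of the solver:

```python
from itertools import chain, combinations

def bound_via_subsets(lits, d_bound):
    """Schematic Proposition 6.2: per variable, take the maximum of the
    d_A bounds over all subsets A of the literal set."""
    subsets = chain.from_iterable(combinations(lits, r)
                                  for r in range(len(lits) + 1))
    best = {}
    for A in subsets:
        for x, b in d_bound(A).items():
            best[x] = max(best.get(x, 0), b)
    return best
```

The loop visits all $2^{|lits(\psi)|}$ subsets, which is exactly why the unsatisfiable-core analysis of the next section matters: it shrinks the literal set before this enumeration runs.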

#### **6.1 Unsatisfiable-Core Analysis**

Instead of calculating the bounds upfront, we use the unsatisfiable core produced by the SAT solver after each incremental call to evaluate whether the upper bounds on the variables exceed the upper bounds of a shortest solution. If $\lVert \psi \rVert_{\mathbf{b}}$ is unsatisfiable for bounds $\mathbf{b}$, then it has an unsatisfiable core

$$C = C\_{\mathcal{A}} \land C\_h \land \bigwedge\_{a \in atoms^+(\psi)} C\_a \land \bigwedge\_{a \in atoms^-(\psi)} C\_{\bar{a}}$$

with (possibly empty) subsets of clauses $C_{\mathcal{A}} \subseteq \psi_{\mathcal{A}}$, $C_h \subseteq \lVert h \rVert_{\mathbf{b}}$, $C_a \subseteq \mathbf{d}(a) \to \lVert a \rVert_{\mathbf{b}}$, and $C_{\bar{a}} \subseteq \neg\mathbf{d}(a) \to \lVert \neg a \rVert_{\mathbf{b}}$. Here we implicitly assume $\psi_{\mathcal{A}}$, $\mathbf{d}(a) \to \lVert a \rVert_{\mathbf{b}}$, and $\neg\mathbf{d}(a) \to \lVert \neg a \rVert_{\mathbf{b}}$ to be in CNF. Let $\mathcal{C}^+ = \{a \mid C_a \neq \emptyset\}$ and $\mathcal{C}^- = \{\neg a \mid C_{\bar{a}} \neq \emptyset\}$ be the sets of literals whose encodings contribute at least one clause to the core $C$. Using these sets, we construct the formula

$$
\psi^{\mathcal{C}} = \psi\_{\mathcal{A}} \land \bigwedge\_{a \in \mathcal{C}^{+}} \mathbf{d}(a) \to a \land \bigwedge\_{\neg a \in \mathcal{C}^{-}} \neg \mathbf{d}(a) \to \neg a,
$$

which consists of the conjunction of the abstraction and the definitions of the literals contained in $\mathcal{C}^+$ and $\mathcal{C}^-$, respectively. Recall that ψ is equisatisfiable to the conjunction $\psi_{\mathcal{A}} \wedge \bigwedge_{d \in \mathbf{D}} d$ of the abstraction and all definitions in $\mathbf{D}$. Let $\psi'$ denote this formula, i.e.,

$$\psi' = \psi\_{\mathcal{A}} \land \bigwedge\_{a \in atoms^{+}(\psi)} \mathbf{d}(a) \to a \land \bigwedge\_{\neg a \in atoms^{-}(\psi)} \neg \mathbf{d}(a) \to \neg a.$$

The following proposition shows that it suffices to refine the bounds according to $\psi^{\mathcal{C}}$.

**Proposition 6.3.** *Let ψ be unsatisfiable with respect to $\mathbf{b}$ and let $C$ be an unsatisfiable core of $\lVert \psi \rVert_{\mathbf{b}}$. Then $\psi^{\mathcal{C}}$ is unsatisfiable with respect to $\mathbf{b}$ and $\psi' \models \psi^{\mathcal{C}}$.*

*Proof.* By definition, we have $\lVert \psi^{\mathcal{C}} \rVert_{\mathbf{b}} = \psi_{\mathcal{A}} \wedge \lVert h \rVert_{\mathbf{b}} \wedge \bigwedge_{a \in \mathcal{C}^+} (\mathbf{d}(a) \to \lVert a \rVert_{\mathbf{b}}) \wedge \bigwedge_{\neg a \in \mathcal{C}^-} (\neg\mathbf{d}(a) \to \lVert \neg a \rVert_{\mathbf{b}})$. This implies $C \subseteq \lVert \psi^{\mathcal{C}} \rVert_{\mathbf{b}}$ and, since $C$ is an unsatisfiable core, $\lVert \psi^{\mathcal{C}} \rVert_{\mathbf{b}}$ is unsatisfiable. That is, $\psi^{\mathcal{C}}$ is unsatisfiable with respect to $\mathbf{b}$. We also have $\psi' \models \psi^{\mathcal{C}}$ since $\mathcal{C}^+ \subseteq$ *atoms*$^+(\psi)$ and $\mathcal{C}^- \subseteq$ *atoms*$^-(\psi)$. $\square$

Applying Proposition 6.2 to $\psi^{\mathcal{C}}$ yields the upper bounds of a shortest solution $h^{\mathcal{C}}$ of $\psi^{\mathcal{C}}$. If $|h^{\mathcal{C}}(\mathbf{x})| \le \mathbf{b}(\mathbf{x})$ holds for all $\mathbf{x} \in \Gamma$, then $\psi^{\mathcal{C}}$ has no solution, and unsatisfiability of $\psi'$ follows from Proposition 6.3. Because ψ and $\psi'$ are equisatisfiable, we can conclude that ψ is unsatisfiable.

Otherwise, we increase the bounds on the variables that occur in $\psi^{\mathcal{C}}$ while keeping the bounds on the other variables unchanged: we construct $\mathbf{b}_{k+1}$ with $\mathbf{b}_k(\mathbf{x}) \le \mathbf{b}_{k+1}(\mathbf{x}) \le |h^{\mathcal{C}}(\mathbf{x})|$ for all $\mathbf{x} \in \Gamma$, such that $\mathbf{b}_k(\mathbf{y}) < \mathbf{b}_{k+1}(\mathbf{y})$ holds for at least one $\mathbf{y} \in V(\psi^{\mathcal{C}})$. By strictly increasing at least one variable's bound, we eventually either reach the upper bounds of $\psi^{\mathcal{C}}$ and return unsatisfiable, or we eliminate $\psi^{\mathcal{C}}$ as an unsatisfiable implication of ψ. As there are only finitely many possibilities for $C$, and thus for $\psi^{\mathcal{C}}$, our procedure is guaranteed to terminate.
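Schematically, the refinement loop can be written as follows. Here `solve` stands in for the incremental SAT call (returning a verdict plus the variables of $\psi^{\mathcal{C}}$) and `shortest_bounds` for the bound computation of Theorem 6.1; both are placeholders for illustration, not the solver's actual interfaces.

```python
def refine(solve, shortest_bounds, bounds):
    """Bound-refinement loop (schematic sketch, not nfa2sat's real API).

    solve(bounds)          -> (is_sat, core_vars)  # vars occurring in psi_C
    shortest_bounds(vars)  -> {var: limit}         # Theorem 6.1 bounds
    """
    while True:
        is_sat, core_vars = solve(bounds)
        if is_sat:
            return "sat"
        limits = shortest_bounds(core_vars)
        if all(bounds[x] >= limits[x] for x in core_vars):
            return "unsat"  # psi_C, and hence psi, has no solution
        for x in core_vars:  # strictly increase at least one bound
            if bounds[x] < limits[x]:
                bounds[x] = min(2 * bounds[x], limits[x])
```

The doubling step anticipates the heuristic described in Section 7: bounds grow geometrically but are capped by the shortest-solution limit, so the loop terminates either with a model or with the bounds exhausted.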

We do not explicitly construct the formula $\psi^{\mathcal{C}}$ to compute bounds on $h^{\mathcal{C}}$, as we know the set *lits*$(\psi^{\mathcal{C}}) = \mathcal{C}^+ \cup \mathcal{C}^-$. Finding upper bounds still requires enumerating all subsets of *lits*$(\psi^{\mathcal{C}})$, but we have $|$*lits*$(\psi^{\mathcal{C}})| \le |$*lits*$(\psi)|$, and usually *lits*$(\psi^{\mathcal{C}})$ is much smaller than *lits*(ψ). For example, consider the formula

$$\psi = \mathbf{z} \neq abd \land (\mathbf{x} \dot{=} a \lor \mathbf{x} \dot\in ab^\*) \land \mathbf{x} \dot{=} \mathbf{y} \land (\mathbf{y} \dot{=} bbc \lor \mathbf{z} \dot\in a (b|c)^\*d) \land \mathbf{y} \dot\in ab \cdot \mathbf{?}^\*$$

which is unsatisfiable for the bounds $\mathbf{b}(\mathbf{x}) = \mathbf{b}(\mathbf{y}) = 1$ and $\mathbf{b}(\mathbf{z}) = 4$. The unsatisfiable core $C$ returned after solving $\lVert \psi \rVert_{\mathbf{b}}$ results in the formula $\psi^{\mathcal{C}} = (\mathbf{x} \dot{=} a \vee \mathbf{x} \dot\in ab^*) \wedge \mathbf{x} \dot{=} \mathbf{y} \wedge \mathbf{y} \dot\in ab \cdot ?^*$, which contains four literals. Finding upper bounds for $\psi^{\mathcal{C}}$ thus amounts to enumerating just $2^4$ subsets, substantially fewer than the $2^7$ subsets of *lits*(ψ) we would have to consider upfront. The subset of *lits*$(\psi^{\mathcal{C}})$ whose conjunction yields the largest upper bounds is $\mathbf{x} \dot\in ab^* \wedge \mathbf{x} \dot{=} \mathbf{y} \wedge \mathbf{y} \dot\in ab \cdot ?^*$, which simplifies to $\mathbf{x} \dot\in ab^* \cap ab \cdot ?^*$ and has a shortest solution of length at most 2 for $\mathbf{x}$ and $\mathbf{y}$. With bounds $\mathbf{b}(\mathbf{x}) = \mathbf{b}(\mathbf{y}) = 2$ and $\mathbf{b}(\mathbf{z}) = 4$, the formula is satisfiable.

## **7 Implementation**

We have implemented our approach in a solver called nfa2sat. nfa2sat is written in Rust and uses CaDiCaL [9] as the backend SAT solver. We use the incremental API provided by CaDiCaL to solve problems under assumptions. Soundness of nfa2sat follows from Theorem 5.1. For completeness, we rely on CaDiCaL's *failed* function to efficiently determine *failed assumptions*, i.e., assumption literals that were used to conclude unsatisfiability.

The procedure works as follows. Given a formula ψ, we first introduce one fresh Boolean selector variable $s_l$ for each theory literal $l \in$ *lits*(ψ). Then, instead of adding the encoded definitions of the theory literals directly to the SAT solver, we guard them with their corresponding selector variables: for a positive literal $a$, we add $s_a \to (\mathbf{d}(a) \to \lVert a \rVert)$, and for a negative literal $\neg a$, we add $s_{\neg a} \to (\neg\mathbf{d}(a) \to \lVert \neg a \rVert)$ (treating assumptions introduced by $\lVert a \rVert$ as unit clauses). In the resulting CNF formula, the new selector variables are present in all clauses that encode their corresponding definition, and we pass them as assumptions in every incremental call to the SAT solver, which does not affect satisfiability. If such an assumption failed, then we know that at least one of the corresponding clauses in the propositional formula was part of an unsatisfiable core, which enables us to efficiently construct the sets $\mathcal{C}^+$ and $\mathcal{C}^-$ of positive and negative atoms present in the unsatisfiable core. As noted previously, we have *lits*$(\psi^{\mathcal{C}}) = \mathcal{C}^+ \cup \mathcal{C}^-$, and hence these sets are sufficient to find bounds on a shortest model of $\psi^{\mathcal{C}}$.
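The selector pattern itself is independent of the solver. The following self-contained sketch uses a brute-force stand-in for the incremental SAT call and made-up variable numbering; it only illustrates how guarding definition clauses with a selector lets assumptions switch them on and off:

```python
from itertools import product

def brute_sat(clauses, n_vars, assumptions=()):
    """Toy satisfiability check under assumptions (stand-in for CaDiCaL).
    Literals are nonzero ints; variable v is true iff bit v-1 is set."""
    for bits in product([False, True], repeat=n_vars):
        val = lambda lit: bits[abs(lit) - 1] == (lit > 0)
        if all(val(a) for a in assumptions) and \
           all(any(val(l) for l in c) for c in clauses):
            return True
    return False

def guard(definition, selector):
    """Prefix every clause of a definition with the negated selector,
    so the definition is active only when the selector is assumed."""
    return [[-selector] + clause for clause in definition]
```

For instance, with variable 1 as an encoding variable and selectors 2 and 3 guarding two contradictory definitions (`[[1]]` and `[[-1]]`), the formula is unsatisfiable only when both selectors are assumed; dropping either one restores satisfiability, so both assumptions are "failed" in the sense used above.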

This approach is efficient for obtaining *lits*$(\psi^{\mathcal{C}})$, but since CaDiCaL does not guarantee that the set of failed assumptions is minimal, *lits*$(\psi^{\mathcal{C}})$ is not minimal in general. Moreover, even a minimal *lits*$(\psi^{\mathcal{C}})$ can contain too many elements to process all subsets. To address this issue, we enumerate the subsets only if *lits*$(\psi^{\mathcal{C}})$ is small (by default, at most ten literals). In this case, we construct the automata $M_i$ used in Theorem 6.1 for each subset, applying the techniques described in [7] to quickly rule out unsatisfiable ones. Otherwise, instead of enumerating the subsets, we resort to sound approximations of the upper bounds, which amounts to over-approximating the number of states without explicitly constructing the automata (cf. [14]).

Once we have obtained upper bounds on the length of a solution of $\psi^{\mathcal{C}}$, we increase the bounds on all variables involved, except those that have already reached their maximum. Our default heuristic computes a new bound that is either double the current bound of a variable or its maximum, whichever is smaller.

## **8 Experimental Evaluation**

We have evaluated our solver on a large set of benchmarks from the ZaligVinder [22] repository<sup>2</sup>. The repository contains 120,287 benchmarks stemming from both academic and industrial applications; in particular, it includes all string problems from the SMT-LIB repository<sup>3</sup>. We converted the ZaligVinder problems to the SMT-LIB 2.6 syntax and removed duplicates, which resulted in 82,632 unique problems, of which 29,599 are in the logical fragment we support.

We compare nfa2sat with the state-of-the-art solvers cvc5 (version 1.0.3) and Z3 (version 4.12.0). The comparison is limited to these two solvers because they are widely adopted and because they had the best performance in our evaluation. Other string solvers either do not support our logical fragment (CertiStr, Woorpje) or gave incorrect answers on the benchmark problems considered here. Older, no-longer-maintained solvers have known soundness problems, as reported in [7] and [27].

We ran our experiments on a Linux server with a timeout of 1200 s of CPU time and a memory limit of 16 GB. Table 1 shows the results. As a single tool, nfa2sat solves more problems than cvc5 but not as many as Z3. All three tools solve more than 98% of the problems.

The table also shows the results of portfolios that combine two solvers. In a portfolio configuration, the best setting is to use both Z3 and nfa2sat. This combination solves all but 20 problems within the timeout. It also reduces the total runtime from 283,942 s for Z3 alone (about 79 h) to 28,914 s for the portfolio (about 8 h), that is, a 90% reduction in total solve time. The other two portfolios, namely Z3 with cvc5 and nfa2sat with cvc5, also perform better than a single solver, but the improvement in runtime and number of timeouts is not as large.

Figure 4a illustrates why nfa2sat and Z3 complement each other well. The figure shows three scatter plots that compare the runtimes of nfa2sat and Z3 on our problems. The plot on the left compares the two solvers on *all* problems, the one in the middle compares them on *satisfiable* problems, and the one on the right compares them on *unsatisfiable* problems. Points in the left plot are concentrated close to the axes, with a smaller number of points near the diagonal, meaning that Z3 and nfa2sat have different runtimes on most problems. The other two

<sup>2</sup> https://github.com/zaligvinder/zaligvinder.

<sup>3</sup> https://clc-gitlab.cs.uiowa.edu:2443/SMT-LIB-benchmarks/QF_S.

**Table 1.** Evaluation on ZaligVinder benchmarks. The three left columns show results of individual solvers. The other three columns show results of portfolios combining two solvers.


**Fig. 4.** Comparison of runtime (in seconds) with Z3 and cvc5. The left plots include all problems, the middle plots include only satisfiable problems, and the right plots include only unsatisfiable problems. The lines marked "failed" correspond to problems that are not solved because a solver ran out of memory. The lines marked "timeout" correspond to problems not solved because of a timeout (1200 s).

plots show this even more clearly: nfa2sat is faster on satisfiable problems, while Z3 is faster on unsatisfiable problems. Figure 4b shows analogous scatter plots comparing nfa2sat and cvc5. The two solvers show similar performance on a large set of easy benchmarks, although cvc5 is faster on problems that both solvers can solve in less than 1 s. However, cvc5 times out on 38 problems that nfa2sat solves in less than 2 s. On unsatisfiable problems, cvc5 tends to be faster than nfa2sat, but there is a class of problems for which nfa2sat takes between 10 and 100 s whereas cvc5 is slower.

Overall, the comparison shows that nfa2sat is competitive with cvc5 and Z3 on these benchmarks. We also observe that nfa2sat tends to work better on satisfiable problems. For best overall performance, our experiments show that a portfolio of Z3 and nfa2sat would solve all but 20 problems within the timeout, and reduce the total solve time by 90%.

## **9 Conclusion**

We have presented the first eager SAT-based approach to string solving that is both sound and complete for a reasonably expressive fragment of string theory. Our experimental evaluation shows that our approach is competitive with the state-of-the-art lazy SMT solvers Z3 and cvc5, outperforming them on satisfiable problems but falling behind on unsatisfiable ones. A portfolio that combines our approach with these solvers—particularly with Z3—would thus yield strong performance across both types of problems.

In future work, we plan to extend our approach to a more expressive logical fragment, including more general word equations. Other avenues of research include the adaptation of model-checking techniques such as IC3 [10] to string problems, which we hope would lead to better performance on unsatisfiable instances. A particular benefit of the eager approach is that it enables the use of mature techniques from the SAT world, especially for proof generation and parallel solving. Producing proofs of unsatisfiability is complex for traditional CDCL(T) solvers because of the complex rewriting and deduction rules they employ. In contrast, efficiently generating and checking proofs produced by SAT solvers (using the DRAT format [32]) is well-established and practicable. A challenge in this respect would be to combine unsatisfiability proofs from a SAT solver with a proof that our reduction to SAT is sound. For parallel solving, we plan to explore the use of a parallel incremental solver (such as iLingeling [9]) as well as other possible ways to solve multiple bounds in parallel.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## The GOLEM Horn Solver

Martin Blicha1,2(B) , Konstantin Britikov<sup>1</sup> , and Natasha Sharygina<sup>1</sup>

<sup>1</sup> Università della Svizzera Italiana, Lugano, Switzerland {blichm,britik,sharygin}@usi.ch <sup>2</sup> Charles University, Prague, Czech Republic

Abstract. The logical framework of Constrained Horn Clauses (CHC) models verification tasks from a variety of domains, ranging from verification of safety properties in transition systems to modular verification of programs with procedures. In this work we present Golem, a flexible and efficient solver for satisfiability of CHC over linear real and integer arithmetic. Golem provides flexibility through a modular architecture and multiple back-end model-checking algorithms, and efficiency through a tight integration with the underlying SMT solver. This paper describes the architecture of Golem and its back-end engines, which include our recently introduced model-checking algorithm TPA for deep exploration. The description is complemented by an extensive evaluation demonstrating the competitive nature of the solver.

Keywords: Constrained Horn Clauses · Model Checking

## 1 Introduction

The framework of *Constrained Horn Clauses* (CHC) has been proposed as a unified, purely logic-based, intermediate format for software verification tasks [33]. CHC provides a powerful way to model various verification problems, such as safety, termination, and loop invariant computation, across different domains like transition systems, functional programs, procedural programs, concurrent systems, and more [33–35,41]. The key advantage of CHC is the separation of modelling from solving, which aligns with the important software design principle *separation of concerns*. This makes CHCs highly reusable, allowing a specialized CHC solver to be used for different verification tasks across domains and programming languages. The main focus of the front end is then to translate the source code into the language of constraints, while the back end can focus solely on the well-defined formal problem of deciding satisfiability of a CHC system.

CHC-based *verification* is becoming increasingly popular, with several frameworks developed in recent years, including SeaHorn, Korn and TriCera for C [27,28,36], JayHorn for Java [44], RustHorn for Rust [48], HornDroid for Android [18], SolCMC and SmartACE for Solidity [2,57]. A novel CHC-based approach for *testing* also shows promising results [58]. The growing demand from verifiers drives the development of specialized *Horn* solvers. Different solvers implement different techniques based on, e.g., model-checking approaches (such

as predicate abstraction [32], CEGAR [22] and IC3/PDR [16,26]), machine learning, automata, or CHC transformations. Eldarica [40] uses predicate abstraction and CEGAR as the core solving algorithm. It leverages Craig interpolation [23] not only to guide the predicate abstraction but also for acceleration [39]. Additionally, it controls the form of the interpolants with *interpolation abstraction* [46,53]. Spacer [45] is the default algorithm for solving CHCs in Z3 [51]. It extends a PDR-style algorithm for nonlinear CHC [38] with under-approximations and leverages *model-based projection* for predecessor computation. Recently, it was enriched with *global guidance* [37]. Ultimate TreeAutomizer [25] implements automata-based approaches to CHC solving [43,56]. HoIce [20] implements a machine-learning-based technique adapted from the ICE framework developed for discovering inductive invariants of transition systems [19]. FreqHorn [29,30] combines syntax-guided synthesis [4] with data derived from unrollings of the CHC system.

According to the results of the international competition on CHC solving, CHC-COMP [24,31,54], solvers applying model-checking techniques, namely Spacer and Eldarica, regularly outperform their competitors. These are also the solvers most often used as back ends in CHC-based verification projects. However, these tools explore only specific algorithms for CHC solving, which limits their applicability to diverse verification tasks. Experience from software verification and from model checking of transition systems shows that, in contrast to the state of affairs in CHC solving, it is possible to build a flexible infrastructure with a unified environment for multiple back-end solving algorithms; CPAchecker [6–11] and Pono [47] are examples of such tools.

This work aims to bring this flexibility to the general domain-independent framework of constrained Horn clauses. We present Golem, a new solver for CHC satisfiability, that provides a unique combination of flexibility and efficiency.<sup>1</sup> Golem implements several SMT-based model-checking algorithms: our recent model-checking algorithm based on *Transition Power Abstraction* (TPA) [13,14], and state-of-the-art model-checking algorithms Bounded Model Checking (BMC) [12], k-induction [55], Interpolation-based Model Checking (IMC) [49], Lazy Abstractions with Interpolants (LAWI) [50] and Spacer [45]. Golem achieves efficiency through tight integration with the underlying interpolating SMT solver OpenSMT [17,42] and preprocessing transformations based on *predicate elimination*, *clause merging* and *redundant clause elimination*. The flexible and modular framework of OpenSMT enables customization for different algorithms; its powerful interpolation modules, particularly, offer fine control (in size and strength) with multiple interpolant generation procedures. We report experimentation that confirms the advantage of multiple diverse solving techniques and shows that Golem is competitive with state-of-the-art Horn solvers on large sets of problems.<sup>2</sup> Overall, Golem can serve as an efficient back

<sup>1</sup> Golem is available at https://github.com/usi-verification-and-security/golem.

<sup>2</sup> This is in line with results from CHC-COMP 2021 and 2022 [24,31]. In 2022, Golem beat other solvers except Z3-Spacer in the LRA-TS, LIA-Lin and LIA-Nonlin tracks.

end for domain-specific verification tools and as a research tool for prototyping and evaluating SMT- and interpolation-based verification techniques in a unified setting.

## 2 Tool Overview

In this section, we describe the main components and features of the tool together with the details of its usage. For completeness, we recall the terminology related to CHCs first.

Constrained Horn Clauses. A constrained Horn clause is a formula $\varphi \land B_1 \land B_2 \land \dots \land B_n \implies H$, where $\varphi$ is the *constraint*, a formula in the background theory, $B_1, \dots, B_n$ are uninterpreted predicates, and $H$ is an uninterpreted predicate or *false*. The antecedent of the implication is commonly denoted as the *body* and the consequent as the *head*. A clause with more than one predicate in the body is called *nonlinear*. A nonlinear system of CHCs has at least one nonlinear clause; otherwise, the system is linear.
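As a small illustration of the terminology (ours, not one of the paper's benchmarks), the first clause below is linear, while the second is nonlinear because the predicate $F$ occurs twice in its body, as happens when a procedure is called twice:

$$\begin{aligned} I(x) \land x' = x + 1 &\implies I(x') \\ F(n_1, a) \land F(n_2, b) \land n_1 = n - 1 \land n_2 = n - 2 \land c = a + b &\implies F(n, c) \end{aligned}$$

Any system containing a clause of the second kind is nonlinear.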

Fig. 1. High-level architecture of Golem

Architecture. The flow of data inside Golem is depicted in Fig. 1. The system of CHCs is read from an .smt2 file, a script in an extension of the SMT-LIB language.<sup>3</sup> The Interpreter processes the SMT-LIB script and builds the internal representation of the system of CHCs. In Golem, CHCs are first *normalized*, then the system is translated into an internal graph representation. Normalization rewrites clauses to ensure that each predicate has only variables as arguments. The graph representation of the system is then passed to the Preprocessor, which applies various transformations to simplify the input graph. The Preprocessor then hands the transformed graph to the chosen back-end engine. Engines in

<sup>3</sup> https://chc-comp.github.io/format.html.

Golem implement various SMT-based model-checking algorithms for solving the CHC satisfiability problem. There are currently six engines in Golem: TPA, BMC, KIND, IMC, LAWI, and Spacer (see details in Sect. 3). The user selects the engine to run with the command-line option --engine. Golem relies on the interpolating SMT solver OpenSMT [42] not only for answering SMT queries but also for the interpolant computation required by most of the engines. The interpolating procedures in OpenSMT can be customized on demand for the specific needs of each engine [1]. Additionally, Golem re-uses the data structures of OpenSMT for representing and manipulating terms.

Models and Proofs. Besides an answer to the CHC satisfiability problem, a *witness* for the answer is often required by the domain-specific application. A satisfiability witness is a *model*: an interpretation of the CHC predicates that makes all clauses valid. An unsatisfiability witness is a *proof*: a derivation of the empty clause from the input clauses. In software verification, these witnesses correspond to program invariants and counterexample paths, respectively. All engines in Golem produce witnesses for their answers. Witnesses from the engines are translated back through the applied preprocessing transformations. Only after this *back-translation* does the witness match the original input system; it is then reported to the user. Witnesses must be explicitly requested with the option --print-witness.

Models are internally stored as formulas in the background theory, using only the variables of the (normalized) uninterpreted predicates. They are presented to the user in the format defined by SMT-LIB [5]: a sequence of SMT-LIB's define-fun commands, one for each uninterpreted predicate.

For the proofs, Golem follows the trace format proposed by Eldarica. Internally, proofs are stored as a sequence of derivation steps. Every derivation step represents a ground instance of some clause from the system. The ground instances of predicates from the body form the *premises* of the step, and the ground instance of the head's predicate forms the *conclusion* of the step. For the derivation to be valid, the premises of each step must have been derived earlier, i.e., each premise must be a conclusion of some derivation step earlier in the sequence. To the user, the proof is presented as a sequence of derivations of ground instances of the predicates, where each step is annotated with the indices of its premises. See Example 1 below for an illustration of the proof trace.

Golem also implements an internal *validator* that checks the correctness of the witnesses. It validates a model by substituting the interpretations for the predicates and checking the validity of all the clauses with OpenSMT. Proofs are validated by checking all conditions listed above for each derivation step. Validation is enabled with an option --validate and serves primarily as a debugging tool for the developers of witness production.

*Example 1.* Consider the following CHC system and the proof of its unsatisfiability.

$$\begin{aligned} x > 0 &\implies L_1(x) & \quad & 1.\; L_1(1) \\ x' = x + 1 &\implies D(x, x') & \quad & 2.\; D(1, 2) \\ L_1(x) \land D(x, x') &\implies L_2(x') & \quad & 3.\; L_2(2) & \quad ; 1, 2 \\ L_2(x) \land x \le 2 &\implies \mathit{false} & \quad & 4.\; \mathit{false} & \quad ; 3 \end{aligned}$$

The derivation of *false* consists of four derivation steps. Step 1 instantiates the first clause for x := 1. Step 2 instantiates the second clause for x := 1 and x' := 2. Step 3 applies resolution to the instance of the third clause for x := 1 and x' := 2 and the facts derived in steps 1 and 2. Finally, step 4 applies resolution to the instance of the fourth clause for x := 2 and the fact derived in step 3.
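The ordering condition on derivation steps can be sketched in a few lines of Python. This is our illustration of the rule stated above, not Golem's actual validator, which additionally checks with OpenSMT that every step is a ground instance of a clause:

```python
def validate_order(steps):
    """steps: list of (premises, conclusion) pairs; each item is a ground
    predicate instance such as ("L1", (1,)). Returns True iff every premise
    is the conclusion of some earlier step."""
    derived = set()
    for premises, conclusion in steps:
        if any(p not in derived for p in premises):
            return False
        derived.add(conclusion)
    return True

# The proof trace of Example 1:
proof = [
    ([], ("L1", (1,))),                             # 1. L1(1)
    ([], ("D", (1, 2))),                            # 2. D(1, 2)
    ([("L1", (1,)), ("D", (1, 2))], ("L2", (2,))),  # 3. L2(2) ; 1, 2
    ([("L2", (2,))], ("false", ())),                # 4. false ; 3
]
print(validate_order(proof))  # True
```

Reordering the steps (e.g., putting step 4 first) makes the check fail, since *false* would then be derived from an underived premise.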

Preprocessing Transformations. Preprocessing can significantly improve performance by transforming the input CHC system into one more suitable for the back-end engine. The most important transformation in Golem is *predicate elimination*. A predicate that does not occur in both the body and the head of any single clause can be eliminated by exhaustive application of the resolution rule. This transformation is most beneficial when it also decreases the number of clauses. *Clause merging* merges all clauses with the same uninterpreted predicates in the body and the head into a single clause by disjoining their constraints. This effectively pushes work from the level of the model-checking algorithm to the level of the SMT solver. Additionally, Golem detects and deletes *redundant clauses*, i.e., clauses that cannot participate in any proof of unsatisfiability.
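To make the resolution step concrete, here is a deliberately simplified Python sketch of predicate elimination. It is our illustration, not Golem's implementation: constraints are kept as opaque strings, the eliminated predicate is assumed to occur at most once per body, and clause variables are assumed to be already aligned, whereas real elimination must rename variables apart:

```python
from itertools import product

def eliminate(clauses, pred):
    """Eliminate `pred` from a clause set by exhaustive resolution.
    A clause is (body_predicates, head_predicate, constraint_string)."""
    producers = [c for c in clauses if c[1] == pred]               # pred in head
    consumers = [c for c in clauses if pred in c[0]]               # pred in body
    rest = [c for c in clauses if c[1] != pred and pred not in c[0]]
    resolved = [
        (pbody + [b for b in cbody if b != pred], chead,
         f"({pconstr}) & ({cconstr})")
        for (pbody, _, pconstr), (cbody, chead, cconstr)
        in product(producers, consumers)
    ]
    return rest + resolved

# Two clauses defining P and using it; eliminating P leaves a single clause.
system = [
    ([], "P", "x > 0"),           # fact clause:   x > 0              => P(x)
    (["P"], "Q", "y = x + 1"),    # linear clause: P(x) /\ y = x + 1  => Q(y)
]
print(eliminate(system, "P"))     # [([], 'Q', '(x > 0) & (y = x + 1)')]
```

Here the two original clauses collapse into one resolvent, the situation in which the transformation pays off.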

An important feature of Golem is that all applied transformations are *reversible* in the sense that any model or proof for the transformed system can be translated back to a model or proof of the original system.

## 3 Back-End Engines of Golem

The core components of Golem that solve the problem of satisfiability of a CHC system are referred to as *back-end engines*, or just engines. Golem implements several popular state-of-the-art algorithms from model checking and software verification: BMC, k-induction, IMC, LAWI and Spacer. These algorithms treat the problem of solving a CHC system as a *reachability* problem in the graph representation.

The unique feature of Golem is the implementation of the new model-checking algorithm based on the concept of *Transition Power Abstraction* (TPA). It is capable of much deeper analysis than other algorithms when searching for counterexamples [14], and it discovers *transition* invariants [13], as opposed to the usual (state) invariants.

#### 3.1 Transition Power Abstraction

The TPA engine in Golem implements the model-checking algorithm based on the concept of Transition Power Abstraction. It can work in two modes: The first mode implements the basic TPA algorithm, which uses a single TPA sequence [14]. The second mode implements the more advanced version, split-TPA, which relies on two TPA sequences obtained by splitting the single TPA sequence of the basic version [13]. In Golem, both variants use the underapproximating *model-based projection* for propagating truly reachable states, avoiding full quantifier elimination. Moreover, they benefit from incremental solving available in OpenSMT, which speeds up the satisfiability queries.

The TPA algorithms, as described in the publications, operate on transition systems [13,14]. However, the engine in Golem is not limited to a single transition system. It can analyze a connected *chain of transition systems*. In the software domain, this model represents programs with a sequence of consecutive loops. The extension to the chain of transition systems works by maintaining a separate TPA sequence for each node on the chain, where each node has its own transition relation. The reachable states are propagated forwards on the chain, while safe states—from which final error states are unreachable—are propagated backwards. In this scenario, transition systems on the chain are queried for reachability between various initial and error states. Since the transition relations remain the same, the summarized information stored in the TPA sequences can be re-used across multiple reachability queries. The learnt information summarizing multiple steps of the transition relation is not invalidated when the initial or error states change.

Golem's TPA engine discovers counterexample paths in unsafe transition systems, which readily translate to unsatisfiability proofs for the corresponding CHC systems. For safe transition systems, it discovers safe k-inductive transition invariants. If a model for the corresponding CHC system is required, the engine first computes a quantified inductive invariant and then applies quantifier elimination to produce a quantifier-free inductive invariant, which is output as the corresponding model.<sup>4</sup>

The TPA engine's ability to discover deep counterexamples and transition invariants gives Golem a unique edge for systems requiring deep exploration. We provide an example of this capability as part of the evaluation in Sect. 4.

#### 3.2 Engines for State-of-the-Art Model-Checking Algorithms

Besides TPA, Golem implements several popular state-of-the-art model-checking algorithms. Among them are bounded model checking [12], k-induction [55] and McMillan's interpolation-based model checking [49], which operate on transition systems. Golem faithfully follows the description of the algorithms in the respective publications.
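As an illustration of the search structure behind the BMC engine (ours; Golem's BMC unrolls the transition relation symbolically and discharges each bound with an OpenSMT query rather than enumerating states):

```python
def bmc(init, bad, step, bound):
    """Return a path from an initial state to a bad state of length <= bound,
    or None. This explicit-state stand-in only mirrors the unrolling
    structure of SMT-based bounded model checking."""
    frontier = [[s] for s in init]
    for _ in range(bound + 1):
        for path in frontier:
            if bad(path[-1]):
                return path                      # counterexample found
        frontier = [path + [nxt] for path in frontier for nxt in step(path[-1])]
    return None

# Toy transition system: x starts at 0 and increases by 2; bad states: x == 6.
cex = bmc({0}, lambda x: x == 6, lambda x: {x + 2}, bound=10)
print(cex)  # [0, 2, 4, 6]
```

k-induction adds, on top of this search for counterexamples, a check that the property is preserved by every k-step unrolling, which is what lets the KIND engine also conclude satisfiability.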

<sup>4</sup> The generation of unsatisfiability proofs also works for the extension to chains of transition systems, while the generation of models for this case is still under development.

Additionally, Golem implements *Lazy Abstractions with Interpolants* (LAWI), an algorithm introduced by McMillan for verification of software [50].<sup>5</sup> In the original description, the algorithm operates on programs represented with *abstract reachability graphs*, which map straightforwardly to *linear* CHC systems. This is the input supported by our implementation of the algorithm in Golem.

The last engine in Golem implements the IC3-based algorithm Spacer [45] for solving general, even nonlinear, CHC systems. Nonlinear CHC systems can model programs with summaries, and in this setting, Spacer computes both under-approximating and over-approximating summaries of the procedures to achieve modular analysis of programs. Spacer is currently the only engine in Golem capable of solving nonlinear CHC systems.

All engines in Golem rely on OpenSMT for answering SMT queries, often leveraging the incremental capabilities of OpenSMT to implement the corresponding model-checking algorithm efficiently. Additionally, the engines IMC, LAWI, Spacer and TPA heavily use the flexible and controllable interpolation framework in OpenSMT [1,52], especially multiple interpolation procedures for linear-arithmetic conflicts [3,15].

## 4 Experiments

In this section, we evaluate the performance of Golem's individual engines on benchmarks from the latest edition of CHC-COMP. The goal of these experiments is to 1) demonstrate the usefulness of multiple back-end engines and their potential combined use for solving various problems, and 2) compare Golem against state-of-the-art Horn solvers.

The benchmark collections of CHC-COMP represent a rich source of problems from various domains.<sup>6</sup> Version 0.3.2 of Golem was used for these experiments. Z3-Spacer (Z3 4.11.2) and Eldarica 2.0.8 were run (with default options) for comparison as the best Horn solvers available. All experiments were conducted on a machine with an AMD EPYC 7452 32-core processor and 8 × 32 GiB of memory; the timeout was set to 300 s. No conflicting answers were observed in any of the experiments. The results are in line with the results of the last editions of CHC-COMP where Golem participated [24,31]. Our artifact for reproducing the experiments is available at https://doi.org/10.5281/zenodo.7973428.

#### 4.1 Category LRA-TS

We ran all engines of Golem on all 498 benchmarks from the LRA-TS (transition systems over linear real arithmetic) category of CHC-COMP.

Table 1 shows the number of benchmarks solved per engine, together with a *virtual best* (VB) engine.<sup>7</sup> On unsatisfiable problems, the differences between the

<sup>5</sup> The algorithm is also known as Impact, after the first tool that implemented it.

<sup>6</sup> https://github.com/orgs/chc-comp/repositories.

<sup>7</sup> Virtual best engine picks the best performance from all engines for each benchmark.


Table 1. Number of solved benchmarks from LRA-TS category.

engines' performance are not substantial, but the BMC engine firmly dominates the others. On satisfiable problems, we see significant differences. Figure 2 plots, for each engine, the number of solved *satisfiable* benchmarks (x-axis) within the given time limit (y-axis, log scale).

Fig. 2. Performance of Golem's engines on SAT problems of LRA-TS category.

The large lead of VB suggests that the solving abilities of the engines are largely complementary. No single engine dominates the others on satisfiable instances. The *portfolio* of techniques available in Golem is thus much stronger than any single one of them.

Moreover, the unified setting enables direct comparison of the algorithms. For example, we can conclude from these experiments that the extra check for k-inductive invariants on top of the BMC-style search for counterexamples, as implemented in the KIND engine, incurs only a small overhead on unsatisfiable problems, but makes the KIND engine very successful in solving satisfiable problems.

#### 4.2 Category LIA-Lin

Next, we considered the LIA-Lin category of CHC-COMP. These are linear systems of CHCs with linear integer arithmetic as the background theory. There

are many benchmarks in this category, and for the evaluation at the competition, a subset of benchmarks is selected (see [24,31]). We evaluated the LAWI and Spacer engines of Golem (the engines capable of solving general linear CHC systems) on the benchmarks selected at CHC-COMP 2022 and compared their performance to Z3-Spacer and Eldarica. Notably, we also examined a specific subcategory of LIA-Lin, namely extra-small-lia,<sup>8</sup> with benchmarks that fall into the fragment accepted by Golem's TPA engine.

There are 55 benchmarks in the extra-small-lia subcategory, all satisfiable but known to be highly challenging for all tools. The results, given in Table 2, show that split-TPA outperforms not only the LAWI and Spacer engines in Golem but also Z3-Spacer. Only Eldarica solves more benchmarks. We ascribe this to split-TPA's capability to perform deep analysis and discover transition invariants.


Table 2. Number of solved benchmarks from extra-small-lia subcategory.

For the whole LIA-Lin category, 499 benchmarks were selected in the 2022 edition of CHC-COMP [24]. The performance of the LAWI and Spacer engines of Golem, Z3-Spacer and Eldarica on this selection is summarized in Table 3. Here, the Spacer engine of Golem significantly outperforms the LAWI engine. Moreover, even though Golem loses to Z3-Spacer, it beats Eldarica. Given that Golem is a prototype, and Z3-Spacer and Eldarica have been developed and optimized for several years, this demonstrates the great potential of Golem.

Table 3. Number of solved benchmarks from LIA-Lin category.


#### 4.3 Category LIA-Nonlin

Finally, we considered the LIA-Nonlin category of benchmarks of CHC-COMP, which consists of *nonlinear* systems of CHCs with linear integer arithmetic as the background theory. For the experiments, we used the 456 benchmarks selected for the 2022 edition of CHC-COMP. Spacer is the only engine in Golem capable of solving nonlinear CHC systems; thus, we focused on a more detailed comparison of its performance against Z3-Spacer and Eldarica. The results of the experiments are summarized in Fig. 3 and Table 4.

Fig. 3. Comparison on LIA-Nonlin category (*×* - SAT, - - UNSAT). (Color figure online)

Table 4. Number of solved benchmarks from LIA-Nonlin category. The number of *uniquely* solved benchmarks is in parentheses.


Overall, Golem solved fewer problems than Z3-Spacer but more than Eldarica; however, *all* tools solved some instances *uniquely*. A detailed comparison is depicted in Fig. 3. For each benchmark, its data point in the plot reflects the runtime of Golem (x-axis) and the runtime of the competitor (y-axis). The plots suggest that the performance of Golem is often orthogonal to that of Eldarica, but highly correlated with that of Z3-Spacer. This is not surprising, as the Spacer engine in Golem is built on the same core algorithm. Even though Golem is often slower than Z3-Spacer, there is a non-trivial number of benchmarks on which Z3-Spacer times out but which Golem solves fairly quickly. Thus Golem, while being a newcomer, already complements existing state-of-the-art tools, and more improvements are expected in the near future.

To summarise, the experimentation with the different engines of Golem demonstrates the advantages of a multi-engine general framework and illustrates the competitiveness of its analysis. Golem provides considerable flexibility in addressing various verification problems while remaining easily customizable to the demands of the analysis.

<sup>8</sup> https://github.com/chc-comp/extra-small-lia.

## 5 Conclusion

In this work, we presented Golem, a flexible and effective Horn solver with multiple back-end engines, including the recently introduced TPA-based model-checking algorithms. Golem is a suitable research tool for prototyping new SMT-based model-checking algorithms and for comparing algorithms in a unified framework. Additionally, its efficient implementation, achieved through tight coupling with the underlying SMT solver, makes it an effective back end for domain-specific verification tools. Future directions for Golem include support for the VMT input format [21] and analysis of liveness properties, extension of TPA to nonlinear CHC systems, and support for the SMT theories of arrays, bit-vectors and algebraic datatypes.

Acknowledgement. This work was partially supported by Swiss National Science Foundation grant 200021\_185031 and by Czech Science Foundation Grant 23-06506 S.

## References


58. Zlatkin, I., Fedyukovich, G.: Maximizing branch coverage with constrained Horn clauses. In: Fisman, D., Rosu, G. (eds.) Tools and Algorithms for the Construction and Analysis of Systems, pp. 254–272. Springer International Publishing, Cham (2022)

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Model Checking**

## CoqCryptoLine: A Verified Model Checker with Certified Results

Ming-Hsien Tsai4(B), Yu-Fu Fu<sup>2</sup>, Jiaxiang Liu<sup>5</sup>, Xiaomu Shi<sup>3</sup>, Bow-Yaw Wang<sup>1</sup>, and Bo-Yin Yang<sup>1</sup>

<sup>1</sup> Academia Sinica, Taipei, Taiwan {bywang,byyang}@iis.sinica.edu.tw
<sup>2</sup> Georgia Institute of Technology, Atlanta, USA yufu@gatech.edu
<sup>3</sup> Institute of Software, Chinese Academy of Sciences, Beijing, China
<sup>4</sup> National Institute of Cyber Security, Taipei, Taiwan mhtsai208@gmail.com

<sup>5</sup> Shenzhen University, Shenzhen, China

Abstract. We present the verified model checker CoqCryptoLine for cryptographic programs, with certified verification results. The CoqCryptoLine verification algorithm consists of two reductions: the algebraic reduction transforms the verification problem into a root entailment problem, and the bit-vector reduction transforms it into an SMT QF\_BV problem. We specify and verify both reductions formally in Coq with MathComp. The CoqCryptoLine tool is built on the OCaml programs extracted from the verified reductions. CoqCryptoLine moreover employs certified techniques for solving the algebraic and logical problems. We evaluate CoqCryptoLine on cryptographic programs from industrial security libraries.

## 1 Introduction

CoqCryptoLine [1] is a verified model checker with certified verification results. It is designed for verifying complex non-linear integer computations commonly found in cryptographic programs. The verification algorithms of CoqCryptoLine consist of two reductions. The algebraic reduction transforms polynomial equality checking into a root entailment problem in commutative algebra; the bit-vector reduction reduces range properties to satisfiability of queries in the Quantifier-Free Bit-Vector (QF\_BV) logic from Satisfiability Modulo Theories (SMT) [6]. Both verification algorithms are formally specified and verified by the proof assistant Coq with MathComp [7,17]. CoqCryptoLine verification programs are extracted from the formal specification and therefore verified by the proof assistant automatically.

To minimize errors from external tools, recent developments in certified verification are employed by CoqCryptoLine. The root entailment problem is solved by the computer algebra system (CAS) Singular [19]. CoqCryptoLine asks the external algebraic tool to provide certificates and validates certificates with the formal polynomial theory in Coq. SMT QF\_BV queries on the other hand are answered by the verified SMT QF\_BV solver CoqQFBV [33]. Answers to SMT QF\_BV queries are therefore all certified as well. With formally verified algorithms and certified answers from external tools, CoqCryptoLine gives verification results with much better guarantees than average automatic verification tools.

Reliable verification tools would not be very useful if they could not check real-world programs effectively. In our experiments, CoqCryptoLine verifies 54 real-world cryptographic programs. 52 of them are from well-known security libraries such as Bitcoin [35] and OpenSSL [30]. They are implementations of field and group operations in elliptic curve cryptography. The remaining two are the Number-Theoretic Transform (NTT) programs from the post-quantum cryptosystem Kyber [10]. All field operations are implemented in a few hundred lines and verified in 6 minutes. The most complicated generic group operation in the elliptic curve Curve25519 consists of about 4000 lines and is verified by CoqCryptoLine in 1.5 h.

*Related Work.* There are numerous model checkers in the community, e.g. [8, 13,21–23]. Nevertheless, few of them are formally verified. To our knowledge, the first verification of a model checker was performed in Coq for the modal μ-calculus [34]. The LTL model checker CAVA [15,27] and the model checker Munta [38,39] for timed automata were developed and verified using Isabelle/HOL [29]; they can be considered verified counterparts of SPIN [21] and Uppaal [23], respectively. CoqCryptoLine instead checks CryptoLine models [16,31], which specify the correctness of cryptographic programs. It can be seen as a verified version of CryptoLine. A large body of work studies the correctness of cryptographic programs, e.g. [2–4,9,12,14,24,26,40], cf. [5] for a survey. These approaches either require human intervention or are unverified, while our work is fully automatic and verified. The most relevant work is bvCryptoLine [37], which is the first automated and partly verified model checker for a very limited subset of CryptoLine. We compare our work with it comprehensively in Sect. 2.3.

## 2 CoqCryptoLine

CoqCryptoLine is an automatic verification tool that takes a CryptoLine specification as input and returns certified results indicating the validity of the specification. We briefly describe the CryptoLine language [16] followed by the modules, features, and optimizations of CoqCryptoLine in this section.

### 2.1 CryptoLine Language

A CryptoLine specification contains a CryptoLine program with pre- and post-conditions, where the CryptoLine program usually models some cryptographic program [16,31]. Both the pre- and post-conditions consist of an algebraic part, formulated as a conjunction of (modular) equations, and a range part, given as an SMT QF\_BV predicate. A CryptoLine specification is valid if every program execution starting from a program state satisfying the pre-condition ends in a state satisfying the post-condition.

CryptoLine is designed for modeling cryptographic assembly programs. Besides the assignment (mov) and conditional assignment (cmov) statements, CryptoLine provides arithmetic statements such as addition (add), addition with carry (adc), subtraction (sub), subtraction with borrow (sbb), half multiplication (mul) and full multiplication (mull). Most of them have versions that model the carry/borrow flags explicitly (like adds, adcs, subs, sbbs). It also allows bitwise statements, for instance, bitwise AND (and), OR (or) and left-shift (shl). To deal with multi-word arithmetic, CryptoLine further includes multi-word constructs, for example, those that split (split) or join (join) words, as well as multi-word shifts (cshl). CryptoLine is strongly typed, admitting both signed and unsigned interpretations for bit-vector variables and constants. The cast statement converts types explicitly. Finally, CryptoLine also supports special statements (assert and assume) for verification purposes.
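The word-level carry semantics that adds/adcs model can be sketched in Python (our model of the intended 64-bit semantics; CryptoLine's formal semantics is given in [16]):

```python
MASK = (1 << 64) - 1

def adds(a, b):
    """64-bit addition producing (carry_out, result), as modelled by `adds`."""
    s = a + b
    return s >> 64, s & MASK

def adcs(a, b, carry_in):
    """64-bit addition with incoming carry, as modelled by `adcs`."""
    s = a + b + carry_in
    return s >> 64, s & MASK

def add128(a_lo, a_hi, b_lo, b_hi):
    """128-bit addition on two 64-bit limbs, the way assembly code chains the
    carry flag; the final carry is dropped (arithmetic modulo 2**128)."""
    c, r_lo = adds(a_lo, b_lo)
    c, r_hi = adcs(a_hi, b_hi, c)
    return r_lo, r_hi

a, b = (1 << 100) + 7, (1 << 64) + MASK
r_lo, r_hi = add128(a & MASK, a >> 64, b & MASK, b >> 64)
assert (r_hi << 64) | r_lo == (a + b) % (1 << 128)
```

The chained carry is exactly what multi-word cryptographic routines rely on, and what the pre/post-conditions of a CryptoLine specification constrain.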

#### 2.2 The Architecture of CoqCryptoLine

CoqCryptoLine reduces the verification problem of a CryptoLine specification to instances of root entailment problems and SMT problems over the QF\_BV logic. These instances are then solved by respective certified techniques. Moreover, the components in CoqCryptoLine are also specified and verified by the proof assistant Coq with MathComp [7,17]. Figure 1 gives an overview of CoqCryptoLine. In the figure, dashed components represent external tools. Rectangular boxes are verified components and rounded boxes are unverified. Note that all our proof efforts using Coq are transparent to users. No Coq proof is required from users during verification of cryptographic programs with CoqCryptoLine. Details can be found in [36].

Starting from a CryptoLine specification text, the CoqCryptoLine parser translates the text into an abstract syntax tree defined in the Coq module DSL. The module gives formal semantics for the typed CryptoLine language [16]. The validity of CryptoLine specifications is also formalized. Similar to most program verification tools, CoqCryptoLine transforms CryptoLine specifications to the static single assignment (SSA) form. The SSA module gives our transformation algorithm. It moreover shows that validity of CryptoLine specifications is preserved by the SSA transformation. CoqCryptoLine then reduces the verification problem via two Coq modules.

The SSA2ZSSA module contains our algebraic reduction to the root entailment problem. Concretely, a system of (modular) equations is constructed from the given program so that program executions correspond to the roots of the system of (modular) equations. To verify algebraic post-conditions, it suffices to check if the roots for executions are also roots of (modular) equations in the post-condition. However, program executions can deviate from roots of (modular) equations when over- or under-flow occurs. CoqCryptoLine will generate

Fig. 1. Overview of CoqCryptoLine

soundness conditions to ensure the executions conform to our (modular) equations. The algebraic verification problem is thus reduced to the root entailment problem provided that soundness conditions hold.
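As a small illustration (our sketch, not CoqCryptoLine's actual encoding), a 64-bit addition with an explicit carry is captured exactly by an integer equation, so every execution is a root of that equation; a carry-less add would instead need the soundness condition a + b < 2<sup>64</sup>:

```python
# Sketch of the algebraic abstraction (ours): the statement
# `adds c x a b` corresponds exactly to the integer equation
#     a + b = x + c * 2**64,
# so every execution is a root of it. A plain `add x a b` (no carry
# output) is abstracted to x = a + b only under the soundness condition
# a + b < 2**64, which is discharged separately as an SMT QF_BV query.
import random

W = 2 ** 64

def adds_exec(a, b):
    """Execute a 64-bit addition, returning (carry, truncated sum)."""
    s = a + b
    return s // W, s % W

for _ in range(1000):
    a, b = random.randrange(W), random.randrange(W)
    c, x = adds_exec(a, b)
    assert a + b == x + c * W  # the execution is a root of the equation
```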

The SSA2QFBV module gives our bit-vector reduction to the SMT QF\_BV problem. It constructs an SMT query to check the validity of the given CryptoLine range specification. Concretely, an SMT QF\_BV query is built such that all program executions correspond to satisfying assignments to the query and vice versa. To verify the range post-conditions, it suffices to check if satisfying assignments for the query also satisfy the post-conditions. The range verification problem is thus reduced to the SMT QF\_BV problem. In addition, further SMT queries are constructed to check the soundness conditions for the algebraic reduction. We formally prove the equivalence between the soundness conditions and the corresponding queries.
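The shape of this reduction can be illustrated by brute force at a tiny bit-width (our 4-bit stand-in for the SAT-backed QF_BV check): a range specification is valid iff no input satisfies the pre-condition while violating the post-condition.

```python
# Brute-force illustration (4-bit words instead of SAT/QF_BV): the range
# specification "pre implies post for all executions" is valid iff the
# query pre AND program AND NOT post is unsatisfiable.
W = 4

def valid(pre, program, post):
    for a in range(2 ** W):
        for b in range(2 ** W):
            if pre(a, b) and not post(a, b, program(a, b)):
                return False  # found a satisfying assignment of pre ∧ ¬post
    return True

# Truncated addition computes the exact sum whenever no overflow occurs.
pre = lambda a, b: a + b < 2 ** W
program = lambda a, b: (a + b) % (2 ** W)
post = lambda a, b, out: out == a + b
assert valid(pre, program, post)
assert not valid(lambda a, b: True, program, post)  # overflow breaks it
```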

With the two formally verified reduction algorithms, it remains to solve the root entailment problems and the SMT QF\_BV problems with external solvers. CoqCryptoLine invokes an external computer algebra system (CAS) to solve the root entailment problems, and improves the techniques in [20,37] to validate the (untrusted) returned answers. Currently, the CAS Singular [19] is supported. To solve the SMT QF\_BV problems, CoqCryptoLine employs the certified SMT QF\_BV solver CoqQFBV [33]. In all cases, instances of the two kinds of problems are solved with certificates, and CoqCryptoLine employs verified certificate checkers to validate the answers, further improving assurance.

Note that the algebraic reduction in SSA2ZSSA is sound but not complete due to the abstraction of bit-accurate semantics into (modular) polynomial equations over integers. Thus a failure in solving the root entailment problem by CAS does not mean that the algebraic post-conditions are violated. On the other hand, the bit-vector reduction in SSA2QFBV is both sound and complete.

The CoqCryptoLine tool is built on OCaml programs extracted from verified algorithms in Coq with MathComp. We moreover integrate the OCaml programs from the certified SMT QF\_BV solver CoqQFBV. Our trusted computing base consists of (1) the CoqCryptoLine parser, (2) the text interface with external SAT solvers (from CoqQFBV), (3) the proof assistant Isabelle [29] (from the SAT solver certificate validator Grat used by CoqQFBV) and (4) the Coq proof assistant. In particular, the sophisticated decision procedures in external CASs and the SAT solvers used in CoqQFBV need not be trusted.

#### 2.3 Features and Optimizations

CoqCryptoLine comes with the following features and optimizations implemented in its modules.

*Type System.* CoqCryptoLine fully supports the type system of the CryptoLine language. The type system is used to model bit-vectors of arbitrary bit-widths with unsigned or signed interpretation. Such a type system allows CoqCryptoLine to model more industrial examples translated from C programs via GCC [16] or LLVM [24] compared to bvCryptoLine [37], which only allows unsigned bit-vectors, all of the same bit-width.

*Mixed Theories.* With the assert and assume statements supported by CoqCryptoLine, it is possible to make an assertion on the range side (or on the algebraic side) and then make an equivalent assumption on the algebraic side (resp. on the range side). With this feature, a predicate can be asserted on the side where it is easier to prove, and then assumed on the other side to ease the verification of other predicates. The equivalence between the asserted predicate and the assumed predicate is currently not verified by CoqCryptoLine, though it is achievable. Neither assert nor assume statements are available in bvCryptoLine.

*Multi-threading.* All the OCaml code extracted from the verified algorithms in Coq runs sequentially. To speed up verification, SMT QF\_BV problems, as well as root entailment problems, are solved in parallel.

*Efficient Root Entailment Problem Solving.* CoqCryptoLine can be used as a solver for root entailment problems with certificates validated by a verified validator. A root entailment problem is reduced to an ideal membership problem, which is then solved by computing a Gröbner basis [20]. To solve a root entailment problem with a certificate, we need to find a witness of polynomials c<sub>0</sub>, ..., c<sub>n</sub> such that

$$q = \sum\_{i=0}^{n} c\_i p\_i \tag{1}$$

where q and the p<sub>i</sub>'s are given polynomials. To compute the witness, bvCryptoLine relies on gbarith [32], which introduces new variables. CoqCryptoLine instead utilizes the lift command in Singular, without adding fresh variables. We show in the evaluation section that using lift is more efficient than using gbarith. The witness found is further validated by CoqCryptoLine, which relies on the polynomial normalization procedure norm\_subst in Coq to check whether Eq. 1 holds. bvCryptoLine, on the other hand, uses the ring tactic in Coq, where extra type checking is performed. Elimination of ideal generators through variable substitution is an efficient approach to simplify an ideal membership problem [37]. The elimination procedure implemented in CoqCryptoLine identifies many more variable-substitution patterns than bvCryptoLine does.
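Witness checking itself needs only polynomial normalization, not Gröbner bases. A minimal sketch (ours; the real validator is the Coq procedure norm\_subst) keeps polynomials as dictionaries from exponent tuples to coefficients and accepts a witness iff q − Σ c<sub>i</sub>·p<sub>i</sub> normalizes to the zero polynomial:

```python
# Polynomials over variables (x, y) as {exponent tuple: coefficient};
# the empty dict is the zero polynomial (our normal form).
def padd(p, q):
    r = dict(p)
    for m, c in q.items():
        r[m] = r.get(m, 0) + c
        if r[m] == 0:
            del r[m]
    return r

def pmul(p, q):
    r = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = tuple(e1 + e2 for e1, e2 in zip(m1, m2))
            r[m] = r.get(m, 0) + c1 * c2
    return {m: c for m, c in r.items() if c != 0}

def pneg(p):
    return {m: -c for m, c in p.items()}

def check_witness(q, cs, ps):
    """Accept the witness cs iff q - sum(c_i * p_i) normalizes to 0."""
    acc = pneg(q)
    for c, p in zip(cs, ps):
        acc = padd(acc, pmul(c, p))
    return acc == {}

# Example: q = x^2 - y^2 lies in the ideal of p0 = x - y,
# with witness c0 = x + y.
q  = {(2, 0): 1, (0, 2): -1}   # x^2 - y^2
p0 = {(1, 0): 1, (0, 1): -1}   # x - y
c0 = {(1, 0): 1, (0, 1): 1}    # x + y
assert check_witness(q, [c0], [p0])
```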

*Multi-moduli.* Modular equations with multiple moduli are common in post-quantum cryptography. For example, the post-quantum cryptosystem Kyber uses the polynomial ring Z<sub>3329</sub>[X]/⟨X<sup>256</sup> + 1⟩, which involves the two moduli 3329 and X<sup>256</sup> + 1. To support multi-moduli in CoqCryptoLine, in the proof of our algebraic reduction, we have to find integers c<sub>0</sub>, ..., c<sub>n</sub> such that e<sub>1</sub> − e<sub>2</sub> = Σ<sup>n</sup><sub>i=0</sub> c<sub>i</sub>m<sub>i</sub>, given the proof of e<sub>1</sub> = e<sub>2</sub> (mod m<sub>0</sub>, ..., m<sub>n</sub>), where e<sub>1</sub>, e<sub>2</sub>, and the m<sub>i</sub>'s are integers. Instead of implementing a complicated procedure to find the exact c<sub>i</sub>'s, we simply invoke the xchoose function provided by MathComp to find the c<sub>i</sub>'s based on the proof of e<sub>1</sub> = e<sub>2</sub> (mod m<sub>0</sub>, ..., m<sub>n</sub>). Multi-moduli are not supported by bvCryptoLine.
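For integer moduli, such coefficients can also be computed constructively via the extended Euclidean algorithm; the hypothetical helper below (ours, only to illustrate the proof obligation; CoqCryptoLine instead extracts the c<sub>i</sub>'s from the Coq proof via xchoose) exhibits c<sub>0</sub>, c<sub>1</sub> with e<sub>1</sub> − e<sub>2</sub> = c<sub>0</sub>m<sub>0</sub> + c<sub>1</sub>m<sub>1</sub>:

```python
# e1 ≡ e2 (mod m0, m1) means e1 - e2 lies in the ideal (m0, m1), i.e.
# gcd(m0, m1) divides e1 - e2; the Bezout coefficients then scale up.
def ext_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) = a*x + b*y."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def moduli_coeffs(e1, e2, m0, m1):
    g, x, y = ext_gcd(m0, m1)
    d = e1 - e2
    assert d % g == 0, "e1 is not congruent to e2 modulo (m0, m1)"
    k = d // g
    return k * x, k * y

c0, c1 = moduli_coeffs(10, 3, 7, 5)  # 10 ≡ 3 (mod 7, 5) since gcd = 1
assert 10 - 3 == c0 * 7 + c1 * 5
```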

*Tight Integration with* CoqQFBV*.* CoqCryptoLine verifies every atomic range predicate separately using the certified SMT QF\_BV solver CoqQFBV. Constructing a text file as input to CoqQFBV for every atomic range predicate would be inefficient, because the bit-blasting procedure in CoqQFBV would then be repeated for the identical program. CoqCryptoLine is therefore tightly integrated with CoqQFBV and uses the cache provided by CoqQFBV to avoid re-blasting the same program. bvCryptoLine, in contrast, uses the SMT solver Boolector to prove range predicates without certificates.

*Slicing.* During the reductions from the verification problem of a CryptoLine specification to instances of root entailment problems and SMT QF\_BV problems, verified static slicing is performed in CoqCryptoLine to produce smaller problem instances. Unlike the work in [11], which sets all assume statements as additional slicing criteria, the slicing in CoqCryptoLine is capable of pruning unrelated predicates in assume statements. The slicing procedure implemented in CoqCryptoLine is much more complicated than the one in bvCryptoLine due to the presence of assume statements. This feature is provided as a command-line option because it makes the verification incomplete. With slicing, the time needed to verify industrial examples is reduced dramatically.
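The core idea of slicing can be shown on straight-line assignments: walking backwards from the post-condition variables, keep only statements whose destinations are still needed. This toy version (ours; the verified slicer additionally handles predicates in assume statements) is:

```python
# Backward slicing over straight-line statements (dest, [operands]):
# keep a statement iff its destination can influence the post-condition.
def slice_program(stmts, post_vars):
    needed = set(post_vars)
    kept = []
    for dest, ops in reversed(stmts):
        if dest in needed:
            kept.append((dest, ops))
            needed.discard(dest)  # dest is (re)defined here
            needed.update(ops)    # its operands become needed
    return list(reversed(kept))

# u := f(a) is irrelevant to the post-condition on c and is pruned.
prog = [("t", ["a", "b"]), ("u", ["a"]), ("c", ["t", "t"])]
assert slice_program(prog, ["c"]) == [("t", ["a", "b"]), ("c", ["t", "t"])]
```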

## 3 Walkthrough

This section illustrates how CoqCryptoLine is used. As an example, we verify the x86\_64 assembly subroutine ecp\_nistz256\_mul\_montx from OpenSSL [30], shown in Fig. 2.

An input for CoqCryptoLine contains a CryptoLine specification for the assembly subroutine. The original subroutine, marked between the comments PROGNAME STARTS and PROGNAME ENDS, is obtained automatically via the Python script provided by CryptoLine [31].

Prior to the "START" comment are the parameter declaration, pre-condition, and variable initialization. After the "END" comment is the post-condition of the subroutine. After the subroutine ends, the result is moved to the output variables.

The assembly subroutine ecp\_nistz256\_mul\_montx takes two 256-bit unsigned integers a and b and the modulus m as inputs. The 256-bit integer m is the prime p<sub>256</sub> = 2<sup>256</sup> − 2<sup>224</sup> + 2<sup>192</sup> + 2<sup>96</sup> − 1 from the NIST curve. The 256-bit integers a and b (less than the prime) are the multiplicands. Each 256-bit input integer d ∈ {a, b, m} is denoted by four 64-bit unsigned integer variables d<sub>i</sub> (for 0 ≤ i < 4) in little-endian representation. The expression limbs *n* [*d0*, *d1*, ..., *di*] is short for *d0* + *d1*\*2\*\**n* + ... + *di*\*2\*\*(*i*\**n*)<sup>1</sup>. The inputs and constants are then put in the variables for memory cells with the mov statements. There are two parts to a pre-condition. The first part is for the algebraic reduction; the second part is for the bit-vector reduction:

```
and [ m0=0xffffffffffffffff, m1=0x00000000ffffffff,
      m2=0x0000000000000000, m3=0xffffffff00000001 ]
&&
and [ m0=0xffffffffffffffff@64, m1=0x00000000ffffffff@64,
      m2=0x0000000000000000@64, m3=0xffffffff00000001@64,
      limbs 64 [a0,a1,a2,a3] <u limbs 64 [m0,m1,m2,m3],
      limbs 64 [b0,b1,b2,b3] <u limbs 64 [m0,m1,m2,m3] ]
```
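The limbs notation is easy to mirror in Python; as a sanity check (our helper, not part of CoqCryptoLine), the four words m0–m3 above recombine exactly to the NIST prime p<sub>256</sub>:

```python
# Our helper mirroring the limbs notation:
# limbs n [d0, ..., di] = d0 + d1*2**n + ... + di*2**(i*n).
def limbs(n, ds):
    return sum(d << (n * i) for i, d in enumerate(ds))

# The four 64-bit words of m recombine to the NIST prime p256.
p256 = 2**256 - 2**224 + 2**192 + 2**96 - 1
assert limbs(64, [0xffffffffffffffff, 0x00000000ffffffff,
                  0x0000000000000000, 0xffffffff00000001]) == p256
```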
The output 256-bit integer, represented by the four variables c<sub>i</sub> (for 0 ≤ i < 4), has two requirements. Firstly, the output integer times 2<sup>256</sup> equals the product of the input integers modulo p<sub>256</sub>. Secondly, the output integer is less than p<sub>256</sub>. Formally, we have this post-condition:

```
eqmod limbs 64 [0, 0, 0, 0, c0, c1, c2, c3]
      limbs 64 [a0, a1, a2, a3] * limbs 64 [b0, b1, b2, b3]
      limbs 64 [m0, m1, m2, m3]
&&
limbs 64 [c0, c1, c2, c3] <u limbs 64 [m0, m1, m2, m3]
```
Here, we employ the algebraic reduction to verify the non-linear modular equality, and the bit-vector reduction to verify the proper range of the output integer.
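The post-condition can be sanity-checked numerically (plain Python, outside any verified reduction): the Montgomery product c of a and b satisfies c·2<sup>256</sup> ≡ a·b (mod p<sub>256</sub>) and c < p<sub>256</sub>. The inputs below are arbitrary illustrative values:

```python
# Reference computation of the Montgomery product modulo the NIST prime;
# pow(x, -1, m) (Python 3.8+) computes the modular inverse.
p256 = 2**256 - 2**224 + 2**192 + 2**96 - 1

def mont_mul(a, b):
    """Return a * b * (2**256)^(-1) mod p256."""
    r_inv = pow(2**256, -1, p256)
    return a * b * r_inv % p256

a, b = 0x1234567890abcdef, 0xfedcba0987654321  # arbitrary test values
c = mont_mul(a, b)
assert c * 2**256 % p256 == a * b % p256  # the eqmod clause
assert c < p256                            # the range clause
```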

However, verifying ecp\_nistz256\_mul\_montx requires extra annotations that hint to CoqCryptoLine how to verify the post-condition. For example, when adding two 256-bit integers represented by 64-bit variables, a chain of four 64-bit additions is performed and carries are propagated. The last carry at the end of the chain must be zero, or the 256-bit sum is incorrect. In ecp\_nistz256\_mul\_montx, two interleaved addition chains use the carry and the overflow flags for their respective carries, so we place the following annotations at the end of the two interleaving addition chains to tell CoqCryptoLine about the final carries:

<sup>1</sup> \*\* is the exponentiation operator in CryptoLine.

```
proc main
(uint64 a0, uint64 a1, uint64 a2, uint64 a3,
 uint64 b0, uint64 b1, uint64 b2, uint64 b3,
 uint64 m0, uint64 m1, uint64 m2, uint64 m3) =
{ and [ m0 = 0xffffffffffffffff,
        m1 = 0x00000000ffffffff,
        m2 = 0x0000000000000000,
        m3 = 0xffffffff00000001 ]
&&
  and [ m0 = 0xffffffffffffffff@64,
        m1 = 0x00000000ffffffff@64,
        m2 = 0x0000000000000000@64,
        m3 = 0xffffffff00000001@64,
        limbs 64 [a0, a1, a2, a3] <u
            limbs 64 [m0, m1, m2, m3],
        limbs 64 [b0, b1, b2, b3] <u
            limbs 64 [m0, m1, m2, m3] ] }
mov L0x7fffffffd9b0 a0; mov L0x7fffffffd9b8 a1;
mov L0x7fffffffd9c0 a2; mov L0x7fffffffd9c8 a3;
mov L0x7fffffffd9d0 b0; mov L0x7fffffffd9d8 b1;
mov L0x7fffffffd9e0 b2; mov L0x7fffffffd9e8 b3;
mov L0x55555557c000 0xffffffffffffffff@uint64;
mov L0x55555557c008 0x00000000ffffffff@uint64;
mov L0x55555557c010 0x0000000000000000@uint64;
mov L0x55555557c018 0xffffffff00000001@uint64;
(* ecp_nistz256_mul_montx STARTS *)
mov rdx L0x7fffffffd9d0;
mov r9 L0x7fffffffd9b0;
mov r10 L0x7fffffffd9b8;
mov r11 L0x7fffffffd9c0;
mov r12 L0x7fffffffd9c8;
mull r9 r8 rdx r9;
mull r10 rcx rdx r10;
mov r14 0x20@uint64;
mov r13 0@uint64;
...
mov r8 0@uint64;
clear carry;
clear overflow;
mull rbp rcx rdx L0x7fffffffd9b0;
adcs carry r9 r9 rcx carry;
adcs overflow r10 r10 rbp overflow;
mull rbp rcx rdx L0x7fffffffd9b8;
adcs carry r10 r10 rcx carry;
adcs overflow r11 r11 rbp overflow;
mull rbp rcx rdx L0x7fffffffd9c0;
adcs carry r11 r11 rcx carry;
adcs overflow r12 r12 rbp overflow;
mull rbp rcx rdx L0x7fffffffd9c8;
mov rdx r9;
adcs carry r12 r12 rcx carry;
split ddc rcx r9 32;
shl rcx rcx 32;
adcs overflow r13 r13 rbp overflow;
split rbp dc r9 32;
assert true && rbp=ddc;
assume rbp=ddc && true;
adcs carry r13 r13 r8 carry;
adcs overflow r8 r8 r8 overflow;
assert true && and [carry=0@1,overflow=0@1];
assume and [carry=0,overflow=0] && true;
...
mov L0x7fffffffda00 r8;
mov L0x7fffffffda08 r9;
(* ecp_nistz256_mul_montx ENDS *)
mov c0 L0x7fffffffd9f0;
mov c1 L0x7fffffffd9f8;
mov c2 L0x7fffffffda00;
mov c3 L0x7fffffffda08;
{ eqmod limbs 64 [0, 0, 0, 0, c0, c1, c2, c3]
        limbs 64 [a0, a1, a2, a3] *
        limbs 64 [b0, b1, b2, b3]
        limbs 64 [m0, m1, m2, m3]
&&
  limbs 64 [c0, c1, c2, c3] <u
      limbs 64 [m0, m1, m2, m3] }
```
Fig. 2. CryptoLine Model for ecp\_nistz256\_mul\_montx

```
assert true && and [ carry=0@1, overflow=0@1 ];
assume and [ carry=0, overflow=0 ] && true;
```
The assert statement verifies, via the bit-vector reduction, that both the carry and overflow flags are zero. The assume statement then passes this information to the algebraic reduction. Effectively, CoqCryptoLine checks that both flags are zero for all inputs satisfying the pre-condition, and then uses these facts as lemmas to verify the post-condition with the algebraic reduction.

The full specification for ecp\_nistz256\_mul\_montx has 230 lines, including 50 lines of manual annotations. Twenty of them are straightforward annotations for variable declaration and initialization. The remaining 30 lines of annotations are hints to CoqCryptoLine, which then verifies the post-condition in 30 seconds with 24 threads.

This walkthrough illustrates the typical verification flow and shows how a user constructs a CryptoLine specification. The pre-condition for program inputs, the post-condition for outputs, and variable initialization must be specified manually. Additional annotations may be added as hints. Notice that hints only tell CoqCryptoLine *what* properties should hold, not *why*. Proofs of the annotated hints and of the post-condition are found by CoqCryptoLine automatically. Consequently, manual annotations are minimized and verification efforts are reduced significantly.

### 4 Evaluation

We evaluate CoqCryptoLine on 52 benchmarks from four industrial security libraries: Bitcoin [35], boringSSL [14,18], nss [25], and OpenSSL [30]. The C reference and optimized avx2 implementations of the Number-Theoretic Transform (NTT) from the post-quantum key encapsulation mechanism Kyber [10] are also evaluated. Among the total of 54 benchmarks, 43 contain features not supported by bvCryptoLine, such as signed variables. All experiments are performed on an Ubuntu 22.04.1 machine with a 3.20 GHz Intel Xeon Gold 6134M CPU and 1 TB RAM.

Benchmarks from the security libraries are various field and group operations from elliptic curve cryptography (ECC). In ECC, rational points on curves are represented by elements in large finite fields. In Bitcoin, the finite field is the residue system modulo the prime p<sub>256k1</sub> = 2<sup>256</sup> − 2<sup>32</sup> − 2<sup>9</sup> − 2<sup>8</sup> − 2<sup>7</sup> − 2<sup>6</sup> − 2<sup>4</sup> − 1. For the other security libraries (boringSSL, nss, and OpenSSL), we verify the operations in Curve25519 using the residue system modulo the prime p<sub>25519</sub> = 2<sup>255</sup> − 19 as the underlying field. Rational points on elliptic curves form a group. The group operation in turn is implemented by a number of field operations.

In lattice-based post-quantum cryptosystems, polynomial rings are used. Specifically, the polynomial ring Z<sub>3329</sub>[X]/⟨X<sup>256</sup> + 1⟩ is used in Kyber. To speed up multiplication in the polynomial ring, Kyber requires the multiplication to be implemented by the NTT. The NTT is a discrete Fast Fourier Transform over finite fields. Instead of complex roots of unity, the NTT uses principal roots of unity in finite fields. Mathematically, the Kyber NTT computes the following ring isomorphism

$$\mathbb{Z}\_{3329}[X]/\langle X^{256}+1\rangle \cong \mathbb{Z}\_{3329}[X]/\langle X^2-\zeta\_0\rangle \times \cdots \times \mathbb{Z}\_{3329}[X]/\langle X^2-\zeta\_{127}\rangle$$

where the ζ<sub>i</sub>'s are the principal roots of unity.
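A scaled-down analogue of this isomorphism (our toy parameters q = 17 and X<sup>4</sup> + 1, chosen so that 4 and −4 are the square roots of −1 mod 17) can be checked directly: multiplying in the quotient ring agrees with multiplying residue-wise after the NTT-style CRT split.

```python
# Z_17[X]/(X^4+1) ≅ Z_17[X]/(X^2-4) x Z_17[X]/(X^2+4), since
# (X^2-4)(X^2+4) = X^4 - 16 = X^4 + 1 (mod 17).
q = 17
ZETAS = [4, q - 4]  # the square roots of -1 mod 17

def mul_ring(a, b):
    """Multiply [a0..a3] and [b0..b3] in Z_q[X]/(X^4+1)."""
    r = [0] * 4
    for i in range(4):
        for j in range(4):
            k, sign = (i + j) % 4, -1 if i + j >= 4 else 1
            r[k] = (r[k] + sign * a[i] * b[j]) % q
    return r

def ntt(a):
    """Reduce a mod X^2 - zeta for each zeta: X^2 is replaced by zeta."""
    return [((a[0] + z * a[2]) % q, (a[1] + z * a[3]) % q) for z in ZETAS]

def mul_base(u, v, z):
    """Multiply linear polynomials mod X^2 - zeta."""
    return ((u[0] * v[0] + z * u[1] * v[1]) % q,
            (u[0] * v[1] + u[1] * v[0]) % q)

def intt(rs):
    """CRT: recombine the residues into Z_q[X]/(X^4+1)."""
    (r00, r01), (r10, r11) = rs
    inv2, inv8 = pow(2, -1, q), pow(8, -1, q)
    return [(r00 + r10) * inv2 % q, (r01 + r11) * inv2 % q,
            (r00 - r10) * inv8 % q, (r01 - r11) * inv8 % q]

a, b = [1, 2, 3, 4], [5, 6, 7, 8]
direct = mul_ring(a, b)
via_ntt = intt([mul_base(u, v, z)
                for u, v, z in zip(ntt(a), ntt(b), ZETAS)])
assert direct == via_ntt
```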

We first compare CoqCryptoLine, with all optimizations described in this paper, against the unverified model checker CryptoLine [16]. Both tools invoke the computer algebra system Singular [19], but CryptoLine neither lets Singular produce certificates nor certifies answers from Singular. CoqCryptoLine moreover uses the certified SMT QF\_BV solver CoqQFBV [33]; CryptoLine uses the uncertified but very efficient Boolector [28].

For the ECC experiments, CoqCryptoLine verifies all field operations in 6 minutes. It takes a few thousand seconds to verify group operations. The most complex implementation (x25519\_scalar\_mult\_generic) from boringSSL (4274 statements) takes about 1.5 hours.<sup>2</sup> For Kyber, CoqCryptoLine verifies in 2642 and 1048 seconds, respectively, that the reference and avx2 NTT implementations indeed compute the isomorphism. The unverified CryptoLine in comparison finishes verification in about 95 seconds. A summary of the comparison between CoqCryptoLine and CryptoLine is shown in Fig. 3a. Though CoqCryptoLine is much slower than CryptoLine, the running time (1.5 hours) for the most complex implementation is still acceptable.

<sup>2</sup> Two (out of three) modular polynomial equations in the post-condition are certified.

**Fig. 3.** Running time (in seconds) comparisons: (a) CoqCryptoLine versus CryptoLine; (b) percentages of average running time for CoqCryptoLine internal OCaml code (INT), the external SMT QF\_BV solver (SMT), and the external computer algebra system (CAS)

Figure 3b shows the percentages of average running time for CoqCryptoLine internal OCaml code (INT), the external SMT QF\_BV solver (SMT), and the external computer algebra system (CAS). The external solvers take much more time than the internal OCaml program does. Of the two external solvers, the computer algebra system takes 4.63% of the time and the SMT QF\_BV solver 93.28%.

To show the performance of the lift optimization, we run CoqCryptoLine and bvCryptoLine on root entailment problems generated from the benchmarks. Here we only consider the 12 root entailment problems that trigger gbarith in bvCryptoLine. Figure 3c shows the running time of Singular in solving root entailment problems based on gbarith in bvCryptoLine and lift in CoqCryptoLine. bvCryptoLine fails to solve 3 root entailment problems within one hour. For the other 9 root entailment problems, lift outperforms gbarith.

We also compare CoqCryptoLine with and without slicing. The version of CoqCryptoLine without slicing is denoted by CoqCryptoLine<sup>−</sup>. The running time comparison between CoqCryptoLine and CoqCryptoLine<sup>−</sup> in Fig. 3d shows that slicing clearly reduces the running time.

## 5 Conclusion

CoqCryptoLine is a verified model checker for cryptographic programs with certified results. Its modules are formally verified in Coq with MathComp. CoqCryptoLine moreover employs external tools and validates their answers with certificates. We evaluate CoqCryptoLine on benchmarks from industrial security libraries (Bitcoin, boringSSL, nss and OpenSSL) and a post-quantum cryptography standard candidate (Kyber). In our experiments, CoqCryptoLine verifies most cryptographic programs with certificates in a reasonable time (7 minutes). Benchmarks with thousands of lines are verified in 1.7 hours. To our knowledge, this is the first certified verification of operations on the elliptic curve secp256k1 used in Bitcoin, and of the avx2 and reference implementations of the Kyber number-theoretic transform.

Acknowledgments. The authors in Academia Sinica are partially funded by National Science and Technology Council grants NSTC110-2221-E-001-008-MY3, NSTC111- 2221-E-001-014-MY3, NSTC111-2634-F-002-019, the Sinica Investigator Award AS-IA-109-M01, the Data Safety and Talent Cultivation Project AS-KPQ-109-DSTCP, and the Intel Fast Verified Postquantum Software Project. The authors in Shenzhen University and ISCAS are partially funded by Shenzhen Science and Technology Innovation Commission (JCYJ20210324094202008), the National Natural Science Foundation of China (62002228, 61836005), and the Natural Science Foundation of Guangdong Province (2022A1515011458, 2022A1515010880).


The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Incremental Dead State Detection in Logarithmic Time**

Caleb Stanford<sup>1</sup> and Margus Veanes<sup>2</sup>

<sup>1</sup> University of California, Davis, USA. cdstanford@ucdavis.edu
<sup>2</sup> Microsoft Research, Redmond, USA. margus@microsoft.com

**Abstract.** Identifying live and dead states in an abstract transition system is a recurring problem in formal verification; for example, it arises in our recent work on efficiently deciding regex constraints in SMT. However, state-of-the-art graph algorithms for maintaining reachability information *incrementally* (that is, as states are visited and before the entire state space is explored) assume that new edges can be added from any state at any time, whereas in many applications, outgoing edges are added from each state as it is explored. To formalize the latter situation, we propose *guided incremental digraphs* (GIDs), incremental graphs which support labeling *closed* states (states which will not receive further outgoing edges). Our main result is that dead state detection in GIDs is solvable in O(log m) amortized time per edge for m edges, improving upon O(√m) per edge due to Bender, Fineman, Gilbert, and Tarjan (BFGT) for general incremental directed graphs.

We introduce two algorithms for GIDs: one establishing the logarithmic time bound, and a second algorithm to explore a lazy heuristics-based approach. To enable an apples-to-apples experimental comparison, we implemented both algorithms, two simpler baselines, and the state-of-the-art BFGT baseline using a common directed graph interface in Rust. Our evaluation shows 110-530x speedups over BFGT for the largest input graphs over a range of graph classes, random graphs, and graphs arising from regex benchmarks.

**Keywords:** Dead State Detection · Graph Algorithms · Online Algorithms · SMT

## **1 Introduction**

Classifying states in a transition system as live or dead is a recurring problem in formal verification. For example, given an expression, can it be simplified to the identity? Given an input to a nondeterministic program, can it reach a terminal state, or can it reach an infinitely looping state? Given a state in an automaton, can it reach an accepting state? State classification is relevant to satisfiability modulo theories (SMT) solvers [8,9,24,51], where theory-specific partial decision procedures often work by exploring the state space to find a reachable path that corresponds to a satisfying string or, more generally, a sequence of constructors. To a first approximation, the core problem in all of these cases amounts to classifying each state u in a directed graph as *live*, meaning that a feasible, accepting, or satisfiable state is reachable from u; or *dead*, meaning that all states reachable from u are infeasible, rejecting, or unsatisfiable.

**Motivating Applications.** We originally encountered the problem of incremental state classification during our prior work building Z3's regex solver [61] for the SMT theory of string and regex constraints [4,13,15]. Our solver leveraged *derivatives* (in the sense of Brzozowski [18] and Antimirov [5]) to explore the states of the finite state machine corresponding to the regex incrementally (as the graph is built), to avoid the prohibitive cost of expanding all states initially. This turns out to require solving the live and dead state detection problem in the finite state machine presented as an incremental directed graph.<sup>1</sup> Concretely, consider the regex (·\*α·<sup>100</sup>)<sup>C</sup> ∩ (·\*α), where · matches any character, ∩ is regex intersection, <sup>C</sup> is regex complement, and α matches any digit (0-9). A traditional solver would expand the left and right operands as state machines, but the left operand (·\*α·<sup>100</sup>)<sup>C</sup> is astronomically large as a DFA, causing the solver to hang. The derivative-based technique instead constructs the derivative regex (·\*α·<sup>100</sup>)<sup>C</sup> ∩ (·<sup>100</sup>)<sup>C</sup> ∩ ·\*α. At this stage we have a graph of two states and one edge, where the states are the two regexes just described, and the edge is the derivative relation. After one more derivative operation, the regex is reduced to one that is clearly nonempty, as it accepts the empty string.

It is important that a derivative-based solver identify nonempty (live) and empty (dead) regexes *incrementally* because it does not generally construct the entire state space before terminating (see the graph update rule Upd, p. 626 [61]). Moreover, the nonemptiness problem for extended regexes is non-elementary [62] — and still PSPACE-complete for more restricted fragments — which strongly favors a lazy approach over brute-force search.
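A compact sketch of derivative-based exploration (ours, not Z3's implementation) shows the mechanism: regexes are syntax trees, deriv produces the successor state for a character, and a state is known nonempty (live) as soon as a nullable regex is reached.

```python
# Brzozowski derivatives for regexes with intersection ("and") and
# complement ("not"); regexes are nested tuples, e.g. ("char", "a").
EMPTY, EPS = ("empty",), ("eps",)

def nullable(r):
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "char"):
        return False
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    if tag == "or":
        return nullable(r[1]) or nullable(r[2])
    if tag == "and":
        return nullable(r[1]) and nullable(r[2])
    if tag == "not":
        return not nullable(r[1])

def deriv(r, c):
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "char":
        return EPS if r[1] == c else EMPTY
    if tag == "cat":
        d = ("cat", deriv(r[1], c), r[2])
        return ("or", d, deriv(r[2], c)) if nullable(r[1]) else d
    if tag == "or":
        return ("or", deriv(r[1], c), deriv(r[2], c))
    if tag == "and":
        return ("and", deriv(r[1], c), deriv(r[2], c))
    if tag == "not":
        return ("not", deriv(r[1], c))
    if tag == "star":
        return ("cat", deriv(r[1], c), r)

# r = (a)^C ∩ b: the word "b" witnesses nonemptiness after one step.
r = ("and", ("not", ("char", "a")), ("char", "b"))
assert not nullable(r)
assert nullable(deriv(r, "b"))  # a nullable state is reached: r is live
```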

Regexes are just one possible application; the algorithms we will present here are broadly applicable to any context where the states have a bounded (per-node) out-degree. For example, they could be applied in LTL model checking when lazily exploring the state space of a nondeterministic Büchi automaton (NBA), where the NBA is too expensive to construct up front. The important fact is that each state of the automaton has only finitely many outgoing edges, and once all of these are added, we can hope to check for dead states incrementally.

**Prior Work.** Traditionally, while live state detection can be done incrementally, dead state detection is often done exhaustively (i.e., after the entire state space is explored). For example, bounded and finite-state model checkers based on translations to automata [20,43,58], as well as classical dead-state elimination algorithms [12,16,37], typically work on a fixed state space after it has been fully enumerated. However, we reiterate that exhaustive exploration is prohibitive for large (e.g., exponential or infinite) state spaces which arise in an SMT

<sup>1</sup> The specific setting is regexes with intersection and complement (*extended* [31,44] or *generalized* [26] regexes), which are found natively in security applications [6,61]. Other solvers have also leveraged derivatives [45] and laziness in general [36].

**Fig. 1.** GID consisting of the sequence of updates E(1, 2), E(1, 3), T(2). Terminal states are drawn as double circles. After the update T(2), states 1 and 2 are known to be live. State 3 is not dead in this GID, as a future update may cause it to be live.

**Fig. 2.** GID extending Fig. 1 with additional updates E(4, 3), E(4, 5), C(4), C(5). Closed states are drawn as solid circles. After the update C(5) (but not earlier), state 5 is dead. State 4 is not dead because it can still reach state 3.

verification context. We also have good evidence that incremental feedback can improve SMT solver performance: a representative success story is the e-graph data structure [23,67], which maintains an equivalence relation among expressions incrementally; because it applies to general expressions, it is theory-independent and reusable. Incremental state space exploration could lead to similar benefits if applied to SMT procedures which still rely on exhaustive search.

However, in order to perform incremental dead state detection, we currently lack algorithms which match offline performance. As we discuss in Sect. 2, the best-known existing solutions would require maintaining strongly connected components (SCCs) incrementally. For SCC maintenance and the related simpler problem of cycle detection, amortized algorithms are known with O(m<sup>3/2</sup>) total time for m edge additions [10,33], with some recently announced improvements [11,14]. Note that this is in sharp contrast to O(m) for the offline variants of these problems, which can be solved by breadth-first or depth-first search. More generally, research suggests there are computational barriers to solving unconstrained reachability problems in incremental and dynamic graphs [1,29].

**This Paper.** To improve on prior algorithms, our key observation is that in many applications (including our motivating applications above), edges are not added adversarially, but *from one state at a time* as the states are explored. As a result, we know when a state will have no further outgoing edges. This enables us to (i) identify dead states incrementally, rather than only after the whole state space is explored; and (ii) obtain more efficient algorithms than currently exist for general graph reachability.

We introduce *guided incremental digraphs* (GIDs), a variation on incremental graphs. Like an incremental directed graph, a guided incremental digraph may be updated by adding new edges between states, or a state may be labeled as *closed*, meaning it will receive no further outgoing edges. Some states are designated as *terminal*, and we say that a state is *live* if it can reach a terminal state and *dead* if it will never reach a terminal state in any extension – i.e. if all reachable states from it are closed (see Figs. 1 and 2). To our knowledge, the problem of detecting dead states in such a system has not been studied by existing work in graph algorithms. Our problem can be solved through solving SCC maintenance, but not necessarily the other way around (Sect. 2, Proposition 1). We provide two new algorithms for dead-state detection in GIDs.

First, we show that the dead-state detection problem for GIDs can be solved in time O(m · log m) for m edge additions, within a logarithmic factor of the O(m) cost for offline search. The worst-case performance of our algorithm thus strictly improves on the O(m<sup>3/2</sup>) upper bound for SCC maintenance in general incremental graphs. Our algorithm is technically sophisticated, and utilizes several data structures and existing results in online algorithms: in particular, Union-Find [63] and Henzinger and King's Euler Tour Trees [35]. The main idea is that, rather than explicitly computing the set of SCCs, for closed states we maintain a single path to a non-closed (open) state. This turns out to reduce the problem to quickly determining whether two states are currently assigned a path to the same open state. On the other hand, Euler Tour Trees can solve *undirected* reachability for graphs that are forests in logarithmic time.<sup>2</sup> The challenge then lies in figuring out how to reduce directed connectivity in the graph of paths to an undirected forest connectivity problem. At the same time, we must maintain this reduction under Union-Find state merges, in order to deal with cycles that are found in the graph along the way.

While as theorists we would like to believe that asymptotic complexity is enough, the truth is that the use of complex data structures (1) can be prohibitively expensive in practice due to constant-factor overheads, and (2) can make algorithms substantially more difficult to implement, leading practitioners to prefer simpler approaches. To address these needs, in addition to the logarithmic-time algorithm, we provide a second *lazy* algorithm which avoids the use of Euler Tour Trees and relies only on Union-Find. This algorithm is based on an optimization of adding shortcut *jump* edges for long paths in the graph to quickly determine reachability. This approach aims to perform well in practice on typical graphs, and is evaluated in our experiments alongside the logarithmic-time algorithm, though we do not prove its asymptotic complexity.

Finally, we implement and empirically evaluate both of our algorithms for GIDs against several baselines in 5.5k lines of code in Rust [47]. Our evaluation focuses on the performance of the GID data structure itself, rather than its end-to-end performance in applications. To ensure an apples-to-apples comparison with existing approaches, we put particular focus on providing a directed graph data structure backend shared by all algorithms, so that the cost of graph search as well as state and edge merges is identical across algorithms. We implement two naïve baselines, as well as an implementation of the state-of-the-art solution

<sup>2</sup> Reachability in dynamic forests can also be solved by Sleator-Tarjan trees [59], Frederickson's Topology Trees [30], or Top Trees [3]. Of these, we found Euler Tour Trees the easiest to work with in our implementation. See also [64].

based on maintaining SCCs, BFGT [10] in our framework. To our knowledge, the latter is the first implementation of BFGT specifically for SCC maintenance. On a collection of generated benchmark GIDs, random GIDs, and GIDs directly pulled from the regex application, we demonstrate a substantial improvement over BFGT for both of our algorithms. For example, for larger GIDs (those with over 100K updates), we observe a 110-530x speedup over BFGT.

**Contributions.** Our primary contributions are:


Following the above, we expand on the application of GIDs to regex solving in SMT (Sect. 5) and survey related work (Sect. 6).

## **2 Guided Incremental Digraphs**

#### **2.1 Problem Statement**

An incremental digraph is a sequence of edge updates E(u, v), where the algorithmic challenge is to produce some output after each edge is received (e.g., whether or not a cycle exists). If the graph also contains updates T(u) labeling a state as *terminal*, then we say that a state is *live* if it can reach a terminal state in the current graph. In a *guided* incremental digraph, we also include updates C(u) labeling a state as *closed*, meaning that it will not receive any further outgoing edges.

**Definition 1.** Define a *guided incremental digraph (GID)* to be a sequence of updates, where each update is one of the following:


The GID is *valid* if the *closed* labels are correct: there are no instances of E(u, v) or T(u) after an update C(u). The *denotation* of G is the directed graph (V, E) where V is the set of all states u which have occurred in any update in the sequence, and E is the set of all (u, v) such that E(u, v) occurs in G. An *extension* of a valid GID G is a valid GID G′ such that G is a prefix of G′.

<sup>3</sup> https://github.com/cdstanford/gid.

In a valid GID G, we say that a state u is *live* if there is a path from u to a terminal state in the denotation of G; and a state u is *dead* if it is not live in *any* extension of G. Notice that in a GID without any C(u) updates, no states are dead as an edge may be added in an extension which makes them live.

We provide an example of a valid GID in Figs. 1 and 2, consisting of the following sequence of updates: E(1, 2), E(1, 3), T(2), E(4, 3), E(4, 5), C(4), C(5). Terminal states T(u) are drawn as double circles; closed states C(u) as solid circles; and states that are not closed as dashed circles.

**Definition 2.** Given as input a valid GID, the *GID state classification problem* is to output, in an online fashion after each update, the set of new live and new dead states. That is, output Live(u) or Dead(u) on the smallest prefix of updates such that u is live or dead on that prefix, respectively.
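To make the semantics of Definitions 1 and 2 concrete, the following is a minimal reference classifier in Rust (our sketch, not the paper's data structure; the names `Update` and `NaiveGid` are ours). It naively recomputes reachability on every query — O(m) per query rather than incremental — but it captures exactly when a state is live or dead:

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy, Debug)]
enum Update {
    Edge(u32, u32),  // E(u, v)
    Terminal(u32),   // T(u)
    Closed(u32),     // C(u)
}

#[derive(Default)]
struct NaiveGid {
    edges: HashMap<u32, Vec<u32>>,
    states: HashSet<u32>,
    terminal: HashSet<u32>,
    closed: HashSet<u32>,
}

impl NaiveGid {
    fn apply(&mut self, u: Update) {
        match u {
            Update::Edge(a, b) => {
                self.states.insert(a);
                self.states.insert(b);
                self.edges.entry(a).or_default().push(b);
            }
            Update::Terminal(a) => {
                self.states.insert(a);
                self.terminal.insert(a);
            }
            Update::Closed(a) => {
                self.states.insert(a);
                self.closed.insert(a);
            }
        }
    }

    /// All states reachable from `from` (including itself), by DFS.
    fn reach(&self, from: u32) -> HashSet<u32> {
        let mut seen = HashSet::from([from]);
        let mut stack = vec![from];
        while let Some(v) = stack.pop() {
            for &w in self.edges.get(&v).into_iter().flatten() {
                if seen.insert(w) {
                    stack.push(w);
                }
            }
        }
        seen
    }

    /// Live: some reachable state is terminal.
    fn is_live(&self, u: u32) -> bool {
        self.reach(u).iter().any(|v| self.terminal.contains(v))
    }

    /// Dead: not live in any extension, i.e. every reachable state is
    /// closed and non-terminal, so no new outgoing edge can ever appear.
    fn is_dead(&self, u: u32) -> bool {
        self.reach(u)
            .iter()
            .all(|v| self.closed.contains(v) && !self.terminal.contains(v))
    }
}
```

Replaying the sequence of Figs. 1 and 2 on this sketch yields the classifications from the captions: states 1 and 2 live, state 5 dead, and states 3 and 4 neither.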

#### **2.2 Existing Approaches**

In many applications, one might choose to classify dead states offline, after the entire state space is enumerated. This leads to a linear-time algorithm via either DFS or BFS, but it does not solve our problem (Definition 2) because it is not incremental. Naïve application of this idea leads to O(m) per update for m updates (O(m<sup>2</sup>) total), as we may redo the entire search after each update.

For acyclic graphs, there exists an amortized O(1)-time per update algorithm for the problem (Definition 2): maintain the graph as a list of forward- and backward-edges at each state. When a state v is marked terminal, do a DFS along backward-edges to determine all states u that can reach v not already marked as live, and mark them live. When a state v is marked closed, visit all forward-edges from v; if all are dead, mark v as dead and recurse along all backward-edges from v. As each edge is visited only when marking a state live or dead, it is only visited a constant number of times overall (though we may use more than O(1) time on some particular update pass). Additionally, the live state detection part of this procedure still works for graphs containing cycles.
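The live-state part of this procedure — which, as noted, works even with cycles — can be sketched in Rust as follows (our sketch; the struct and method names are ours, not the paper's interface):

```rust
use std::collections::{HashMap, HashSet};

/// Incremental live-state propagation along backward edges.
/// `bck[v]` lists the states u with an edge u -> v.
#[derive(Default)]
struct LiveTracker {
    bck: HashMap<u32, Vec<u32>>,
    live: HashSet<u32>,
}

impl LiveTracker {
    fn on_edge(&mut self, u: u32, v: u32) {
        self.bck.entry(v).or_default().push(u);
        // If the target is already live, the source becomes live too.
        if self.live.contains(&v) {
            self.mark_live(u);
        }
    }

    /// On T(v): mark v live and propagate backwards.
    fn on_terminal(&mut self, v: u32) {
        self.mark_live(v);
    }

    /// DFS along backward edges, marking every not-yet-live predecessor.
    /// Each edge is traversed at most once over the whole run (only when
    /// its target is first marked live), giving amortized O(1) per update.
    fn mark_live(&mut self, v: u32) {
        let mut stack = vec![v];
        while let Some(x) = stack.pop() {
            if self.live.insert(x) {
                for &u in self.bck.get(&x).into_iter().flatten() {
                    if !self.live.contains(&u) {
                        stack.push(u);
                    }
                }
            }
        }
    }
}
```

Note that adding an edge whose target is already live immediately marks the source (and, transitively, its predecessors) live, matching the incremental output required by Definition 2.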

The challenge, therefore, lies primarily in detecting dead states in graphs which may contain cycles. For this, the breakthrough approach from [10] maintains a *condensed* graph which is acyclic, where the vertices in the condensed graph represent strongly connected components (SCCs) of states. The mapping from states to SCCs is maintained using a Union-Find [63] data structure. Maintaining the condensed graph requires O(√m) time per update. To avoid confusing closed and non-closed states, we also have to make sure that they are not merged into the same SCC; the easiest solution is to withhold all edges from each state u in the graph until u is closed, which ensures that u must be in an SCC on its own. Once we have the condensed graph with these modifications, the same algorithm as in the previous paragraph works to identify live and dead states. Since each edge is only visited when a state is marked closed or live, each edge is visited only once throughout the algorithm, so we use only amortized O(1) additional time to calculate live and dead states. While this SCC maintenance algorithm ignores the fact that edges do not occur from closed states C(u), this still proves the following result:


**Fig. 3.** *Top:* Basic classification of GID states into four disjoint categories. *Bottom:* Additional terminology used in this paper.

**Proposition 1.** *GID state classification reduces to SCC maintenance. That is, suppose we have an algorithm over incremental graphs that maintains the set of SCCs in* O(f(m, n)) *total time given* n *states and* m *edge additions.*<sup>4</sup> *Then there exists an algorithm to solve GID state classification in* O(f(m, n)) *total time.*

Despite this reduction one way, there is no obvious reduction the other way – from cycle detection or SCCs to Definition 2. This is because, while the existence of a cycle of non-live states implies bi-reachability between all states in the cycle, it does not necessarily imply that all of the bi-reachable states are dead.

## **3 Algorithms**

This section presents Algorithm 2, which solves the state classification problem in logarithmic time (Theorem 3); and Algorithm 3, an alternative lazy solution. Both algorithms are optimized versions of Algorithm 1, a first-cut algorithm which establishes the structure of our approach. We begin by establishing some basic terminology shared by all of the algorithms (see Fig. 3).

States in a GID can be usefully classified as exactly one of four *statuses*: *live*, *dead*, *unknown*, or *open*, where *unknown* means "closed but not yet live or dead", and *open* means "not closed and not live". Note that a state may be live and neither open nor unknown; this terminology keeps the classification disjoint. Pragmatically, for live states it does not matter whether they are classified as open or closed, since edges from those states no longer have any effect. However, all dead and unknown states are closed, and no states are both open and closed.
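The disjoint classification can be written down directly. In the following Rust sketch, the helper `classify` and its flag arguments are hypothetical (ours, not the paper's interface); it only makes the precedence explicit: live wins regardless of closedness, and dead and unknown states are always closed.

```rust
/// The four disjoint statuses of a GID state (cf. Fig. 3).
#[derive(Clone, Copy, PartialEq, Debug)]
enum Status {
    Live,
    Dead,
    Unknown, // closed, but not yet live or dead
    Open,    // not closed and not live
}

/// Collapse raw flags into the disjoint classification.
fn classify(live: bool, dead: bool, closed: bool) -> Status {
    match (live, dead, closed) {
        (true, _, _) => Status::Live,            // live wins, open or closed
        (false, true, _) => Status::Dead,        // dead states are closed
        (false, false, true) => Status::Unknown, // closed, fate unknown
        (false, false, false) => Status::Open,
    }
}
```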

Given this classification, the intuition is that for each unknown state u, we only need *one* path from u to an open state to prove that it is not dead; we want to maintain one such path for all unknown states. To maintain all of these paths

<sup>4</sup> To be precise, "maintains" means that (i) we can check whether two states are in the same SCC in O(1) time; and (ii) we can iterate over all the states, edges from, or edges into a SCC in O(1) time per state or edge.

simultaneously, we maintain an acyclic directed *forest* structure on unknown and open states where the roots are open states, and all non-root states have a single edge to another state, called its *successor*. Edges other than successor edges can be temporarily ignored, except for when marking live states; these are kept as *reserve* edges. Specifically, we add every edge (u, v) as a backward-edge from v (to allow propagating live states), but for edges not in the forest we keep (u, v) in a reserve list from u. We store all edges, including backward-edges, in the original order (u, v). The reserve list edge becomes relevant only when either (i) u is marked as closed, or (ii) u's successor is marked as dead.

In order to deal with cycles, we need to maintain the forest of unknown states not on the original graph, but on a union-find *condensed graph*, similar to [63]. When we find a cycle of unknown states, we *merge* all states in the cycle by calling the union method in the union-find. We refer to a state as *canonical* if it is the canonical representative of its equivalence class in the union-find; the condensed graph is a forest on canonical states. We use x, y, z to denote canonical states (states in the condensed graph), and u, v, w to denote original states (not known to be canonical). Following [63], we maintain edges as linked lists rather than sets, labeled with the original states instead of canonical states; this is important as it allows combining edge lists in O(1) time when merging states.
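The union-find component can be sketched as follows (a standard textbook structure with path compression and union by size, not the exact code from our artifact; the O(1) linked-list concatenation of edge lists is omitted here):

```rust
/// Minimal union-find over states 0..n, as used to maintain the
/// condensed graph of merged cycles.
struct UnionFind {
    parent: Vec<usize>,
    size: Vec<usize>,
}

impl UnionFind {
    fn new(n: usize) -> Self {
        UnionFind {
            parent: (0..n).collect(),
            size: vec![1; n],
        }
    }

    /// Canonical representative of v's equivalence class,
    /// with path compression.
    fn find(&mut self, v: usize) -> usize {
        if self.parent[v] != v {
            let p = self.parent[v];
            self.parent[v] = self.find(p);
        }
        self.parent[v]
    }

    /// Merge the classes of a and b (union by size) and return the
    /// surviving canonical state.
    fn union(&mut self, a: usize, b: usize) -> usize {
        let (mut ra, mut rb) = (self.find(a), self.find(b));
        if ra == rb {
            return ra;
        }
        if self.size[ra] < self.size[rb] {
            std::mem::swap(&mut ra, &mut rb);
        }
        self.parent[rb] = ra;
        self.size[ra] += self.size[rb];
        ra
    }
}
```

Because edge lists are labeled with original states, a `union` call never has to rewrite them; the canonical endpoints of an edge (u, v) are recovered on demand as (`find(u)`, `find(v)`).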

#### **3.1 First-Cut Algorithm**

Algorithm 1 is a first cut based on these ideas. The procedures OnEdge and OnTerminal contain all the logic to identify live states, using bck to look up backward-edges; OnTerminal doubles as a "mark live" function when it is called by OnEdge. The procedure OnClosed tries to assign a successor edge to a newly closed state, to prove that it is not dead. In case we run out of reserve edges, the state is marked dead and we recursively call OnClosed along backward-edges, which will either set a new successor or mark them dead.

The union-find data structure UF provides UF.union(v<sub>1</sub>, v<sub>2</sub>), UF.find(v), and UF.iter(v): UF.union merges v<sub>1</sub> and v<sub>2</sub> to refer to the same canonical state, UF.find returns the canonical state for v, and UF.iter iterates over states equivalent to v. These operations take amortized α(n) time for n updates, where α(n) ∈ o(log n) is the inverse Ackermann function. We only merge states if they are bi-reachable from each other, and both unknown; this implies that all states equivalent to a state x have the same status. Each edge (u, v) is always stored in the maps res and bck using its original states (i.e., edge labels are not updated when states are merged); but we can quickly obtain the corresponding edge on canonical states via (UF.find(u), UF.find(v)). Once a state is marked Live or Dead, its edge maps are no longer used.

**Invariants.** Altogether, we respect the following invariants. *Successor* and *no cycles* describe the forest structure, and *edge representation* ensures that all edges in the input GID are represented somehow in the current graph.

– *Merge equivalence:* For all states u and v, if UF.find(u) = UF.find(v), then u and v are bi-reachable and both closed. (This implies that u and v are both live, both dead, or both unknown.)

**Algorithm 1.** First-cut algorithm.

```
 1: V: a type for states (integers) (variables u, v, ...)
 2: E: the type of edges, equal to (V, V)
 3: UF: a union-find data structure over V
 4: X: the set of canonical states in UF (variables x, y, z, ...)
 5: status: a map from X to Live, Dead, Unknown, or Open
 6: succ: a map from X to V
 7: res and bck: maps from X to linked lists of E
 8: procedure OnEdge(E(u, v))
 9:   x ← UF.find(u); y ← UF.find(v)
10:   if status(y) = Live then
11:     OnTerminal(T(x))                      ▷ mark x and its ancestors live
12:   else if status(x) ≠ Live then           ▷ status(x) must be Open
13:     append (u, v) to res(x)
14:     append (u, v) to bck(y)
15: procedure OnTerminal(T(v))
16:   y ← UF.find(v)
17:   for all x in DFS backwards (along bck) from y not already Live do
18:     status(x) ← Live
19:     output Live(x′) for all x′ in UF.iter(x)
20: procedure OnClosed(C(v))
21:   y ← UF.find(v)
22:   if status(y) ≠ Open then return         ▷ y is already live or closed
23:   while res(y) is nonempty do
24:     pop (v, w) from res(y); z ← UF.find(w)
25:     if status(z) = Dead then continue
26:     else if CheckCycle(y, z) then
27:       for all z′ in cycle from z to y do z ← Merge(z, z′)
28:     else
29:       status(y) ← Unknown; succ(y) ← z
30:       return
31:   status(y) ← Dead; output Dead(y′) for all y′ in UF.iter(y)
32:   ToRecurse ← ∅
33:   for all (u, v) in bck(y) do
34:     x ← UF.find(u)
35:     if status(x) = Unknown and UF.find(succ(x)) = y then
36:       status(x) ← Open         ▷ temporary – marked closed on recursive call
37:       add x to ToRecurse
38:   for all x in ToRecurse do OnClosed(C(x))
39: procedure CheckCycle(y, z) returning bool
40:   while status(z) = Unknown do z ← UF.find(succ(z))   ▷ get root state from z
41:   return y = z
42: procedure Merge(x, y) returning V
43:   z ← UF.union(x, y)
44:   bck(z) ← bck(x) + bck(y)                ▷ O(1) linked list append
45:   res(z) ← res(x) + res(y)                ▷ O(1) linked list append
46:   return z
```


**Theorem 1.** *Algorithm 1 is correct.*

*Proof (Summary).* The full proof can be found in the arXiv version [60]. The *status correctness* invariant implies correct output at each step, so it suffices to argue that all of the invariants above are preserved. Upon receiving E(u, v) or T(u), some unknown or open states may become live, but this does not change the status of any other states. The main challenge of the proof is the recursive procedure OnClosed(C(u)). On recursive calls, some states are *temporarily* marked Open, meaning they are roots in the forest structure. During recursive calls, we need a slightly generalized invariant: each forest root corresponds to a pending call to OnClosed(C(u)) (i.e., an element of ToRecurse for some call on the stack) and is a state that is dead iff all of its reserve edges are dead. After we prove this (generalized) invariant, when OnClosed(C(u)) terminates, we know that there are no more temporary open states, and the forest structure implies that all closed states are correctly marked as unknown.

**Complexity.** The core inefficiency in Algorithm 1 — what we need to improve — lies in CheckCycle. The procedure repeatedly sets z ← succ(z) to find the tree root, which in general could take time linear in the number of edges. For example, this inefficiency results in O(m<sup>2</sup>) work for a linear graph read in backwards order: E(2, 1), C(2), E(3, 2), C(3), ..., E(n, n − 1), C(n).
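This worst-case input is easy to generate. The following Rust sketch (the `Upd` type and function name are ours, mirroring the E/T/C updates of Sect. 2) produces the backwards-order line graph: state 1 stays open, so each C(v) makes CheckCycle walk the entire successor chain built so far, for quadratic total work.

```rust
/// An update of a GID: E(u, v), T(u), or C(u).
#[derive(Debug, PartialEq)]
enum Upd {
    E(u32, u32),
    T(u32),
    C(u32),
}

/// The quadratic worst case for Algorithm 1 described in the text:
/// E(2,1), C(2), E(3,2), C(3), ..., E(n, n-1), C(n).
/// State 1 is never closed, so every closed state's successor chain
/// ends at it, and CheckCycle traverses the whole chain each time.
fn backwards_line(n: u32) -> Vec<Upd> {
    let mut out = Vec::new();
    for v in 2..=n {
        out.push(Upd::E(v, v - 1));
        out.push(Upd::C(v));
    }
    out
}
```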

All other procedures use amortized α(m) time per update for m updates, using array lists to represent the maps res, bck, and succ for O(1) lookups. For the amortized analysis, the cost of each call to OnClosed can be assigned *either* to the target of an edge being marked dead, *or* to an edge being merged as part of a cycle, and both of these events can only happen once per edge added to the GID. The OnTerminal calls and loop iterations run only once per edge in the graph, when the target of that edge is marked live or terminal.

#### **3.2 Logarithmic Algorithm**

At its core, CheckCycle requires solving an *undirected* reachability problem on a graph that is restricted to a forest. However, the forest is changed not just by edge additions, but by edge additions *and* deletions. While undirected reachability and reachability in directed graphs are both difficult to solve incrementally, reachability in *dynamic forests* can be solved in O(log m) time per operation. This is the main intuition for our solution, which uses the Euler Tour Trees data structure EF of Henzinger and King [35] and is shown in Algorithm 2.


Unfortunately, this idea does not work straightforwardly – once again because of the presence of cycles in the original graph. We cannot simply store the forest as a condensed graph with edges on condensed states. As we saw in Algorithm 1, it was important to store successor edges as edges into V, rather than edges into X – this is the only way that we can merge states in O(1), without actually inspecting the edge lists. If we needed to update the forest edges to be in X, this could require O(m) work to merge two O(m)-sized edge lists as each edge might need to be relabeled in the EF graph.

To solve this challenge, we instead store the EF data structure on the original states, rather than the condensed graph; but we ensure that *each canonical state is represented by a tree of original states*. When adding edges between canonical states, we need to make sure to remember the original label (u, v), so that we can later remove it using the original labels (this happens when its target becomes dead). When an edge would create a cycle, we instead simply ignore it in the EF graph, because a line of connected trees forms a tree.

**Summary and Invariants.** In summary, the algorithm reuses the data, procedures, and invariants from Algorithm 1, with the following important changes: (1) We maintain the EF data structure EF, a forest on V. (2) The successor edges are stored as their original edge labels (u, v), rather than just as a target state. (3) The procedure OnClosed is rewritten to maintain the graph EF. (4) The *successor edges* and *no cycles* invariants use the new succ representation: that is, they are constraints on the edges (x, UF.find(v)), where succ(x)=(u, v). (5) We add the following two constraints on edges in EF, depending on whether those states are equivalent in the union-find structure.


#### **Theorem 2.** *Algorithm 2 is correct.*

*Proof.* Observe that the EF inter-edges constraint implies that EF only contains edges between unknown and open states, together with isolated trees. In the modified OnTerminal procedure, when marking states as live we remove inter-edges, so we preserve this invariant.

Next we argue that given the invariants about EF, for an *open* state y the CheckCycle procedure returns true if and only if (y, z) would create a directed cycle. If there is a cycle of canonical states, then because canonical states are connected trees in EF, the cycle can be lifted to a cycle on original states, so y and z must already be connected in this cycle without the edge (y, z). Conversely, if y and z are connected in EF, then there is a path from y to z, and this can be projected to a path on canonical states. However, because y is open, it is a root in the successor forest, so any path from y along successor edges travels only on backward-edges; hence z is an ancestor of y in the *directed* graph, and thus (y, z) creates a directed cycle.

This leaves the OnClosed procedure. Other than the EF lines, the structure is the same as in Algorithm 1, so the previous invariants are still preserved, and it remains to check the EF invariants. When we delete the successor edge and temporarily mark status(x) = Open for recursive calls, we also remove it from EF, preserving the inter-edge invariant. Similarly, when we add a successor edge to x, we add it to EF, preserving the inter-edge invariant. So it remains to consider when the set of canonical states changes, which is when merging states in a cycle. Here, a line of canonical states is merged into a single state, and a line of connected trees is still a tree, so the intra-edge invariant still holds for the new canonical state, and we are done.

**Theorem 3.** *Algorithm 2 uses amortized logarithmic time per edge update.*

*Proof.* By the analysis of Algorithm 1, each line of the algorithm is executed O(m) times and there are O(m) calls to CheckCycle. Each line of code is either constant-time, α(m) = o(log m) time for the UF calls, or O(log m) time for the EF calls, so in total the algorithm takes O(m log m) time, i.e., amortized O(log m) time per edge.

**Algorithm 3.** Lazy algorithm.

```
 1: All data from Algorithm 1; jumps: a map from X to lists of V
 2: procedure OnEdge, OnTerminal, OnClosed as in Algorithm 1
 3: procedure CheckCycle(y, z) returning bool
 4:   return y = GetRoot(z)
 5: procedure GetRoot(z) returning V
 6:   if status(z) = Open then return z
 7:   if jumps(z) is empty then push succ(z) to jumps(z)    ▷ set 0th jump
 8:   repeat pop w from jumps(z); z′ ← UF.find(w)           ▷ remove dead jumps
 9:   until status(z′) ≠ Dead
10:   push z′ to jumps(z); result ← GetRoot(z′)
11:   n ← length(jumps(z)); n′ ← length(jumps(z′))
12:   if n ≤ n′ then push jumps(z′)[n − 1] to jumps(z)      ▷ set nth jump
13:   return result
14: procedure Merge(x, y) returning V
15:   z ← UF.union(x, y)
16:   bck(z) ← bck(x) + bck(y); res(z) ← res(x) + res(y)
17:   jumps(z) ← empty; return z
```

#### **3.3 Lazy Algorithm**

While the asymptotic complexity of log m could be the end of the story, in practice, we found the cost of the EF calls to be a significant overhead. The technical details of Euler Tour Trees include building an AVL-tree cycle for each tree, where the cycle contains each state of the graph once and each edge in the graph twice. While this is elegant, it turns out that adding *one edge* to EF results in no less than *seven* modifications to the AVL tree: a split at the source, then a split at the target, then an edge addition in both directions (u, v) and (v, u) to the cycle, and finally the four resulting trees need to be glued together (using three merge operations).<sup>5</sup> Each one of these operations comes with a rebalancing operation which could do Ω(log m) tree rotations and pointer dereferences to visit the nodes in the AVL tree. Some optimizations may be possible – including, e.g., combining rebalancing operations or considering variants of AVL trees with better cache locality. Nonetheless, these constant-factor overheads constitute a serious practical drawback for Algorithm 2.

To address this, in this section, we investigate a simpler, lazy algorithm which avoids EF and directly optimizes Algorithm 1. For this, one idea in the right direction is to store for each state a direct pointer to the root which results from

<sup>5</sup> Our implementation actually uses nine modifications, as the splits at the source and target also disconnect the source and target states.

repeatedly calling succ. But there are two issues with this. First, maintaining this may be difficult (when the root changes, potentially updating a linear number of root pointers). Second, the root may be marked dead, in which case we have to re-compute all pointers to that root.

Instead, we introduce a *jump list* for each state: intuitively, it will contain the states reached after calling successor once, twice, four times, eight times, and so on at powers of two; and it will be updated lazily, at most once per visit to the state. When a jump becomes obsolete (its target dead), we just pop off the largest jump, so we do not lose all of our work in building the list. We maintain the following additional information: for each unknown canonical state x, a list of *jumps* [v<sub>0</sub>, v<sub>1</sub>, v<sub>2</sub>, ..., v<sub>k</sub>], such that v<sub>0</sub> = succ(x), v<sub>1</sub> is reachable from v<sub>0</sub>, v<sub>2</sub> is reachable from v<sub>1</sub>, and so on.

The resulting algorithm is shown in Algorithm 3. The key procedure is GetRoot(z), which is called when adding a reserve edge (y, z) to the graph. In addition to all invariants from Algorithm 1, we maintain the following invariants for *every* unknown canonical state x, where jumps(x) is a list of states v<sub>0</sub>, v<sub>1</sub>, v<sub>2</sub>, ..., v<sub>k</sub>. *First jump:* if the jump list is nonempty, then v<sub>0</sub> = succ(x). *Reachability:* v<sub>i+1</sub> is reachable from v<sub>i</sub> for all i. The jump list also satisfies the following *powers of two* invariant: on the path of canonical states from v<sub>0</sub> to v<sub>i</sub>, the total number of states (including all states in each equivalence class) is at least 2<sup>i</sup>. While this invariant is not necessary for correctness, it is the key to the algorithm's practical efficiency: it follows that *if* the jump list is fully saturated for every state, querying GetRoot takes only logarithmic time. However, since jump lists are updated lazily, a jump list may not be saturated, so this does not establish a true asymptotic complexity bound for the algorithm.
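The jump-list lookup can be illustrated on a static successor forest. The Rust sketch below is our simplification of GetRoot, not the full algorithm: dead-jump popping and cycle merging are omitted, and a sentinel `OPEN` value replaces the Open status check. It shows the two essential moves — follow the furthest jump, then lazily copy one more power-of-two jump from the state just visited:

```rust
/// Sentinel marking an open state (a root of the successor forest).
const OPEN: usize = usize::MAX;

/// Find the open root above `v` in the successor forest `succ`,
/// lazily growing each state's jump list along the way.
/// jumps[v][i] is roughly the 2^i-th successor of v.
fn get_root(succ: &[usize], jumps: &mut Vec<Vec<usize>>, v: usize) -> usize {
    if succ[v] == OPEN {
        return v; // open states are forest roots
    }
    if jumps[v].is_empty() {
        jumps[v].push(succ[v]); // 0th jump is the immediate successor
    }
    let next = *jumps[v].last().unwrap(); // take the furthest known jump
    let root = get_root(succ, jumps, next);
    // Lazy saturation: if `next` has a jump of the right index, copy it,
    // so our own furthest jump roughly doubles in reach per visit.
    let (n, m) = (jumps[v].len(), jumps[next].len());
    if n <= m {
        let far = jumps[next][n - 1];
        jumps[v].push(far);
    }
    root
}
```

On repeated queries the lists saturate, after which each call skips exponentially growing distances toward the root, matching the powers-of-two intuition above.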

#### **Theorem 4.** *Algorithm 3 is correct.*

*Proof.* The *first jump* and *reachability* invariants imply that v<sub>1</sub>, v<sub>2</sub>, ... is a sublist of the states along the path from an unknown state to its root, potentially followed by some dead states. We need to argue that the subprocedure GetRoot (i) returns the same verdict as repeatedly calling succ to find a cycle in the first-cut algorithm, and (ii) preserves both invariants. For *first jump*, if the jump list is empty, then GetRoot ensures that the first jump is set to the successor state. For *reachability*, popping dead states from the jump list clearly preserves the invariant, as does appending a state along the path to the root, which is done only when the jump list of z′ is at least as long as that of z. Merging states preserves both invariants trivially because we throw the jump list away, and marking states live preserves both invariants trivially since the jump list is only maintained and used for unknown states.

## **4 Experimental Evaluation**

The primary goal of our evaluation has been to experimentally validate the performance of GIDs as a data structure in isolation, rather than their use in a particular application. Our evaluation seeks to answer the following questions:


To answer **Q1**, we put substantial implementation effort into a common framework on which a fair comparison could be made between different approaches. To this end, we implemented GIDs as a data structure in Rust which includes a graph data structure on top of which all algorithms are built. In particular, this equalizes performance across algorithms for the following baseline operations: state and edge addition and retrieval, DFS and BFS search, edge iteration, and state merging. We chose Rust for our implementation for its performance, and because there does not appear to be an existing publicly available implementation of BFGT in any other language.<sup>6</sup> The number of lines of code used to implement these various structures is summarized in Fig. 4. We implement Algorithms 2 and 3 and compare them with the following baselines:


To answer **Q2**, first, we compiled a range of basic graph classes which are designed to expose edge-case behavior in the algorithms, as well as randomly generated graphs. We focus on graphs with no live states, as live states are treated similarly by all algorithms. Most of the generated graphs come in 2 × 2 = 4 variants: (i) the states are read in either a forwards or a backwards order; and (ii) they are either *dead* graphs, where there are no open states at the end and so everything gets marked dead, or *unknown* graphs, where there is a single open state at the end, so most states are unknown. In the unknown case, it is sufficient to have one open state at the end, as many open states can be reduced to the case of a single open state to which all edges point. We include GIDs from line graphs and cycle graphs (up to 100K states in multiples of 3); complete and complete acyclic graphs (up to 1K states); and bipartite graphs (up to 1K states). These are important cases, for example, because the reverse-order line and cycle graphs are a potential worst case for Simple and BFGT.
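As an illustration of one such benchmark family, the following Rust sketch generates what we read as the *dead* cycle-graph variant (the `U` type and function name are ours): a directed n-cycle whose states are all closed at the end, so that, with no terminal states, every state becomes dead exactly when the last C(u) arrives — a stress test for cycle merging.

```rust
/// A GID update: E(u, v) or C(u). (No T(u): the graph has no live states.)
#[derive(Debug, PartialEq)]
enum U {
    E(u32, u32),
    C(u32),
}

/// "Dead" cycle graph on states 1..=n: the edges form a single
/// directed cycle, then every state is closed in order.
fn dead_cycle(n: u32) -> Vec<U> {
    let mut out: Vec<U> = (1..=n).map(|v| U::E(v, v % n + 1)).collect();
    out.extend((1..=n).map(U::C));
    out
}
```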

Second, to exhibit more dynamic behavior, we generated random graphs: sparse graphs with a fixed out-degree from each state, chosen from 1, 2, 3, or 10 (up to 100K states); and dense graphs with a fixed probability of each edge, chosen from .01, .02, or .03 (up to 10K states). Each case uses 10 different random seeds. As with the basic graphs, states are read in some order and marked closed.

<sup>6</sup> That is, BFGT for SCC maintenance. BFGT for cycle detection has been implemented before, for instance, in [28] and formally verified in [32].


**Fig. 4.** *Left:* Lines of code for each algorithm and other implementation components. *Right:* Benchmark GIDs used in our evaluation. Where present, the source column indicates the quantity prior to filtering out trivially small graphs.

To answer **Q3**, we wrote a backend to extract a GID at runtime from Z3's regex solver [61]. While the backend of the solver is precisely a GID, and so could be passed to our GID implementation dynamically, this setup includes many extraneous overheads, including rewriting expressions and computing derivatives when adding nodes to the graph. While some of these overheads could perhaps be eliminated, and we are fairly confident that GIDs would be a bottleneck for sufficiently large input examples, the overheads make it difficult to isolate the performance impact of the GID data structure itself, which is the sole focus of this paper. We therefore instrumented the Z3 solver code to export the (incremental) sequence of graph updates that would be performed during a run of Z3 on existing regex benchmarks. For each regex benchmark, the instrumented code produces a faithful representation of the sequence of graph updates that actually occur in a run of the SMT solver on that benchmark, yielding one GID benchmark for the present paper. The benchmarks focus on *extended* regexes, rather than plain classical regexes, as these are the ones for which dead state detection is relevant (see Sect. 5). We include GIDs for the RegExLib benchmarks [15] and the handcrafted Boolean benchmarks reported in [61]. We add to these 11 additional examples designed to be difficult GID cases. The collection of regex benchmarks we used (just described) is available on GitHub.<sup>7</sup>

From both the Q2 and Q3 benchmarks, we filter out any benchmark which takes under 10 milliseconds for all of the algorithms to solve (including Naïve), and we use a 60 second timeout. The evaluation was run on a 2020 MacBook Air (MacOS Monterey) with an Apple M1 processor and 8GB of memory.

<sup>7</sup> https://github.com/cdstanford/regex-smt-benchmarks.

**Fig. 5.** Evaluation results. *Left:* Cumulative plot showing the number of benchmarks solved in time t or less for basic GID classes (top), randomly generated GIDs (middle), and regex-derived GIDs (bottom). *Top right:* Scatter plot showing the size of each benchmark vs time to solve. *Bottom right:* Average time to solve benchmarks of size closest to s, where values of s are chosen in increments of 1/3 on a log scale.

**Correctness.** To ensure that all of our implementations are correct, we invested time into unit testing and checked output correctness on all of our collected benchmarks, including several cases which exposed bugs in previous versions of one or more algorithms. In total, all algorithms are vetted against 25 unit tests from handwritten edge cases that exposed prior bugs, 373 unit tests from benchmarks, and 30 module-level unit tests for specific functions.

**Results.** Figure 5 shows the results. Algorithm 3 shows significant improvements over the state-of-the-art, solving more benchmarks in a smaller amount of time across basic GIDs, random GIDs, and regex GIDs. Algorithm 2 also shows state-of-the-art performance, similar to BFGT on basic and regex GIDs and significantly better on random GIDs. On the bottom right, since looking at average time is not meaningful for benchmarks of widely varying size, we stratify the size of benchmarks into buckets, and plot time-to-solve as a function of size. Both x-axis and y-axis are on a log scale. The plot shows that Algorithm 3 exhibits up to two orders of magnitude speedup over BFGT for larger GIDs – we see speedups of 110x to 530x for GIDs in the top five size buckets (GIDs of size nearest to 100K, ∼200K, ∼500K, 1M, and ∼2M).

**New Implementations of Existing Work.** Our implementation contributes, to our knowledge, the first implementation of BFGT specifically for SCC maintenance. In addition, it is one of the first implementations of Euler Tour Trees (see [7] for another), including the AVL tree backing for tours, and likely the first implementation in Rust.

## **5 Application to Extended Regular Expressions**

In this section, we explain precisely how the GID state classification problem arises in the context of derivative-based solvers [45,61]. We first define *extended* regexes [31] (regexes extended with intersection & and complement ~) modulo a symbolic alphabet A of *predicates* that represent sets of characters. We explain the main idea behind *symbolic derivatives*, as found in [61]; these generalize Brzozowski [18] and Antimirov derivatives [5] (see also [19,42] for other proposals). Symbolic derivatives provide the foundation for incrementally creating a GID. Then we show, through an example, how a solver can incrementally expand derivatives to reduce the satisfiability problem to the GID state classification problem (Definition 2).

Define a *regex* by the following grammar, where ϕ ∈ A denotes a predicate:

$$RE ::= \varphi \quad | \quad \varepsilon \quad | \quad RE\_1 \cdot RE\_2 \quad | \quad RE^{\*} \quad | \quad RE\_1 \mathbin{|} RE\_2 \quad | \quad RE\_1 \mathbin{\&} RE\_2 \quad | \quad {\sim}RE$$

Let R<sup>k</sup> represent the concatenation of R k times. The *symbolic derivative* of a regex R, denoted δ(R), is a regex which describes the set of *suffixes* of strings in R after the first character is removed. The formal definition can be found in [61] and in the arXiv version of the present paper [60].

To apply Definition 1 to regexes: **states** are regexes; **edges** are transitions from a regex to its derivatives; and **terminal** states are the so-called *nullable* regexes, where a regex is nullable if it matches the empty string. Nullability can be computed inductively over the structure of regexes: for example, ε and R\* are nullable, and R<sub>1</sub> & R<sub>2</sub> is nullable iff both R<sub>1</sub> and R<sub>2</sub> are nullable. A **live** state here is thus a regex that reaches a nullable regex via 0 or more edges; this implies that there exists a concrete string matching it. Conversely, **dead** states match no strings, but can reach other dead states, creating strongly connected components of closed states none of which are live. For example, the *false* predicate ⊥ of A serves as the regex that matches *nothing* and is trivially a dead state. Dually, ~⊥ is equivalent to ⊤\*, where ⊤ is the *true* predicate, and is trivially a live state.
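For illustration, the inductive nullability check just described can be sketched as follows. This uses an illustrative tuple-based AST of our own, not the solver's representation:

```python
# Nullability over the extended-regex grammar: does r match the empty string?
EPS = ('eps',)

def nullable(r):
    tag = r[0]
    if tag == 'eps':
        return True
    if tag == 'pred':            # a character predicate phi matches one char
        return False
    if tag == 'concat':          # R1 . R2
        return nullable(r[1]) and nullable(r[2])
    if tag == 'star':            # R*
        return True
    if tag == 'or':              # R1 | R2
        return nullable(r[1]) or nullable(r[2])
    if tag == 'and':             # R1 & R2  (intersection)
        return nullable(r[1]) and nullable(r[2])
    if tag == 'not':             # ~R      (complement)
        return not nullable(r[1])
    raise ValueError(tag)
```

For instance, `('not', ('pred', 'd'))` is nullable, matching the fact that the complement of a non-nullable regex matches the empty string.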

#### **5.1 Reduction from Incremental Regex Emptiness to GIDs**

For simplicity, suppose we want to determine the satisfiability of a single regex constraint s ∈ R, where s is a string variable and R is a concrete regex. (This is not overly restrictive: any number of simultaneous regex constraints for a string s can be combined into a single regex constraint by using the Boolean operations of regexes.) For example, let L = ~(⊤\*α⊤<sup>100</sup>) and R = L & (⊤α), where ⊤ is the *true* predicate and α is the "is digit" predicate that is true of characters that are digits (often denoted \d). The solver manipulates regex membership constraints on strings by unfolding them [61]. The constraint s ∈ R, which essentially tests nonemptiness of R with s as a witness, becomes

$$\big(s = \varepsilon \land \mathit{Nullable}(R)\big) \ \lor \ \big(s \neq \varepsilon \land s\_{1..} \in \delta\_{s\_0}(R)\big)$$

where s ≠ ε since R is not nullable, s<sub>i..</sub> is the suffix of s from index i, and

$$\delta(R) = \delta(L) \mathbin{\&} \delta(\top\alpha) = \{ \alpha \text{ ? } L \mathbin{\&} {\sim}(\top^{100}) : L \} \mathbin{\&} \alpha = \{ \alpha \text{ ? } L \mathbin{\&} {\sim}(\top^{100}) \mathbin{\&} \alpha : L \mathbin{\&} \alpha \}$$

Let R<sub>1</sub> = L & ~(⊤<sup>100</sup>) & α and R<sub>2</sub> = L & α. So R has two outgoing transitions R →<sup>α</sup> R<sub>1</sub> and R →<sup>¬α</sup> R<sub>2</sub>, which contribute the edges (R, R<sub>1</sub>) and (R, R<sub>2</sub>) to the GID. Note that these edges depend only on R and not on s<sub>0</sub>.

We continue the search incrementally by checking the two branches of the if-then-else constraint, where R<sub>1</sub> and R<sub>2</sub> are again not nullable (so s<sub>1..</sub> ≠ ε):

$$\begin{array}{l} s\_0 \in \alpha \land s\_{2..} \in \delta\_{s\_1}(R\_1) \quad \lor \quad s\_0 \in \neg\alpha \land s\_{2..} \in \delta\_{s\_1}(R\_2)\\ \delta(R\_1) = \{\alpha \text{ ? } L \mathbin{\&} {\sim}(\top^{100}) \mathbin{\&} {\sim}(\top^{99}) : L \mathbin{\&} {\sim}(\top^{99})\} \mathbin{\&} \{\alpha \text{ ? } \varepsilon : \bot\} = \{\alpha \text{ ? } \varepsilon : \bot\}\\ \delta(R\_2) = \{\alpha \text{ ? } L \mathbin{\&} {\sim}(\top^{100}) : L\} \mathbin{\&} \{\alpha \text{ ? } \varepsilon : \bot\} = \{\alpha \text{ ? } \varepsilon : \bot\} \end{array}$$

It follows that R<sub>1</sub> →<sup>α</sup> ε and R<sub>2</sub> →<sup>α</sup> ε, so the edges (R<sub>1</sub>, ε) and (R<sub>2</sub>, ε) are added to the GID, where ε is a trivial terminal state. In fact, after R<sub>1</sub> the search already terminates, because we then have the path (R, R<sub>1</sub>)(R<sub>1</sub>, ε), which implies that R is live. The associated constraints s<sub>0</sub> ∈ α and s<sub>1</sub> ∈ α and the final constraint s<sub>2..</sub> = ε can be used to extract a concrete witness, e.g., s = "42".

*Soundness* of the algorithm follows from the fact that if R is nonempty (s ∈ R is *satisfiable*), then we eventually arrive at a nullable (terminal) regex, as in the example run above. To achieve *completeness* – and to eliminate dead states as early as possible – we incrementally construct a GID corresponding to the set of regexes seen so far (as above). After all the feasible transitions from R to its derivatives in δ(R) are added to the GID as edges (WLOG in one batch), the state R becomes closed. *Crucially, due to the symbolic form of* δ(R)*, no derivative is missing.* Therefore R is known to be empty precisely as soon as R is detected as a dead state in the GID. An additional benefit is that the algorithm is independent of the size of the universe of A, which may be very large (e.g. the Unicode character set), or even infinite. We get the following theorem, which uses finiteness of the closure of symbolic derivatives [61, Theorem 7.1]:
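As a minimal illustration of this reduction, the following sketch recomputes the live/dead classification naively by exhaustive search over the derivative closure; the paper's contribution is maintaining this classification incrementally instead. Here `derivatives` and `nullable` are assumed stand-ins for the regex layer, not Z3's API:

```python
def classify(r0, derivatives, nullable):
    """Return 'live' if r0 is nonempty, 'dead' if r0 is empty.

    Explores states depth-first, closing each state by adding all of its
    derivative targets (no derivative is missing, by the symbolic form of
    the derivative). Terminates because the derivative closure is finite.
    """
    seen, work = set(), [r0]
    while work:
        r = work.pop()
        if r in seen:
            continue
        seen.add(r)
        if nullable(r):
            return 'live'       # a terminal state is reachable from r0
        work.extend(derivatives(r))
    return 'dead'               # closure explored, no nullable regex reached
```
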

**Theorem 5.** *For any regex* R*: (1) If* R *is nonempty, then the decision procedure eventually marks* R *live. (2) If* R *is empty, then the decision procedure marks* R *dead at the earliest stage at which it is known to be dead, and terminates.*

## **6 Related Work**

**Online Graph Algorithms.** Online graph algorithms are typically divided into problems over *incremental* graphs (where edges are added), *decremental* graphs (where edges are deleted), and *dynamic* graphs (where edges are both added and deleted), with core data structures discussed in [27,49]. Important problems include *transitive closure*, *cycle detection*, *topological ordering*, and *strongly connected component (SCC) maintenance*.

For incremental topological ordering, [46] is an early work, and [33] presents two different algorithms, one for *sparse graphs* and one for *dense graphs* – the algorithms are also extended to work with SCCs. The sparse algorithm was subsequently simplified in [10] and is the basis of our implementation named BFGT in Sect. 4. A unified approach of several algorithms based on [10] is presented in [21] that uses a notion of *weak topological order* and a labeling technique that estimates transitive closure size. Further extensions of [10] are studied in [11,14] based on randomization.

For dynamic directed graphs, a topological sorting algorithm that is experimentally preferable for sparse graphs is discussed in [56], and a related article [55] discusses strongly connected components maintenance. Transitive closure for dynamic graphs is studied in [57], improving upon some algorithms presented earlier in [34]. One major application for these algorithms is in pointer analysis [54].

For *undirected* forests, fully dynamic reachability is solvable in amortized logarithmic time per edge via multiple possible approaches [3,30,35,59,64]; our implementation uses Euler Tour Trees [35].

**Data Structures for SMT.** *UnionFind* [63] is a foundational data structure used in SMT. *E-graphs* [23,67] are used to ensure *functional extensionality*, where two expressions are equivalent if their subexpressions are equivalent [25,52]. In both UnionFind and E-graphs, the maintained relation is an *equivalence* relation. In contrast, maintaining live and dead states involves tracking reachability rather than equivalence. To the best of our knowledge, the specific formulation of incremental reachability we consider here is new.

**Dead State Elimination in Automata.** A DFA or NFA may be viewed as a GID, so state classification in GIDs solves dead state elimination in DFAs and NFAs, while additionally working in an incremental fashion. Dead state elimination is also known as *trimming* [37] and plays an important role in automata *minimization* [12,38,48]. The literature on minimization is vast, and goes back to the 1950s [16,17,39–41,50,53]; see [65] for a taxonomy, [2] for an experimental comparison, and [22] for the symbolic case. Watson et al. [66] propose an *incremental* minimization algorithm, in the sense that it can be halted at any point to produce a partially minimized, equivalent DFA; unlike in our setting, the DFA's states and transitions are fixed and read in a predetermined order.

**Acknowledgments.** We thank the anonymous reviewers of CAV 2021, TACAS 2022, and CAV 2023 for feedback leading to substantial improvements to both our paper and our results. Special thanks to Nikolaj Bjørner, for his collaboration and involvement with Z3, and Yu Chen, for helpful discussions in which he proposed the idea for the first-cut algorithm.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Model Checking Race-Freedom When "Sequential Consistency for Data-Race-Free Programs" is Guaranteed

Wenhao Wu<sup>1(B)</sup>, Jan Hückelheim<sup>2</sup>, Paul D. Hovland<sup>2</sup>, Ziqing Luo<sup>1</sup>, and Stephen F. Siegel<sup>1</sup>

<sup>1</sup> University of Delaware, Newark, DE 19716, USA {wuwenhao,ziqing,siegel}@udel.edu <sup>2</sup> Argonne National Laboratory, Lemont, IL 60439, USA {jhueckelheim,hovland}@anl.gov

Abstract. Many parallel programming models guarantee that if all sequentially consistent (SC) executions of a program are free of data races, then all executions of the program will appear to be sequentially consistent. This greatly simplifies reasoning about the program, but leaves open the question of how to verify that all SC executions are race-free. In this paper, we show that with a few simple modifications, model checking can be an effective tool for verifying race-freedom. We explore this technique on a suite of C programs parallelized with OpenMP.

Keywords: data race · model checking · OpenMP

### 1 Introduction

Every multithreaded programming language requires a memory model to specify the values a thread may obtain when reading a variable. The simplest such model is *sequential consistency* [22]. In this model, an execution is an interleaved sequence of the execution steps from each thread. The value read at any point is the last value that was written to the variable in this sequence.

There is no known efficient way to implement a full sequentially consistent model. One reason for this is that many standard compiler optimizations are invalid under this model. Because of this, most multithreaded programming languages (including language extensions) impose a requirement that programs do not have *data races*. A data race occurs when two threads access the same variable without appropriate synchronization, and at least one access is a write. (The notion of appropriate synchronization depends on the specific language.) For data race-free programs, most standard compiler optimizations remain valid. The Pthreads library is a typical example, in that programs with data races have no defined behavior, but race-free programs are guaranteed to behave in a sequentially consistent manner [25].

This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply 2023

Modern languages use more complex "relaxed" memory models. In these models, an execution is not a single sequence, but a set of events together with various relations on those events. These relations—e.g., *sequenced before*, *modification order*, *synchronizes with*, *dependency-ordered before*, *happens before* [21]—must satisfy a set of complex constraints spelled out in the language specification. The complexity of these models is such that only the most sophisticated users can be expected to understand and apply them correctly. Fortunately, these models usually provide an escape, in the form of a substantial and useful language subset which is guaranteed to behave sequentially consistently, as long as the program is race-free. Examples include Java [23], C and C++ since their 2011 versions (see [8] and [21, §5.1.2.4 Note 19]), and OpenMP [26, §1.4.6].

The "guarantee" mentioned above actually consists of two parts: (1) all executions of data race-free programs in the language subset are sequentially consistent, and (2) if a program in the language subset has a data race, then it has a sequentially consistent execution with a data race [8]. Putting these together, we have, for any program P in the language subset:

(SC4DRF) *If all sequentially consistent executions of* P *are data race-free, then all executions of* P *are sequentially consistent.*

The consequence of this is that the programmer need only understand sequentially consistent semantics, both when trying to ensure P is race-free, and when reasoning about other aspects of the correctness of P. This approach provides an effective compromise between usability and efficient implementation.

Still, it is the programmer's responsibility to ensure that all sequentially consistent executions of the program are race-free. Unfortunately, this problem is undecidable [4], so no completely algorithmic solution exists. As a practical matter, detecting and eliminating races is considered one of the most challenging aspects of parallel program development. One source of difficulty is that compilers may "miscompile" racy programs, i.e., translate them in unintuitive, non-semantics-preserving ways [7]. After all, if the source program has a race, the language standard imposes no constraints, so any output from the compiler is technically correct.

Researchers have explored various techniques for race checking. Dynamic analysis tools (e.g., [18]) have experienced the most uptake. These techniques can analyze a single execution precisely, and report whether a race occurred, and sometimes can draw conclusions about closely related executions. But the behavior of many concurrent programs depends on the program input, or on specific thread interleavings, and dynamic techniques cannot explore all possible behaviors. Moreover, dynamic techniques necessarily analyze the behavior of the executable code that results from compilation. As explained above, racy programs may be miscompiled, even possibly removing the race, in which case a dynamic analysis is of limited use.

Approaches based on static analysis, in contrast, have the potential to verify race-freedom. This is extremely challenging, though some promising research prototypes have been developed (e.g., [10]). The most significant limitation is imprecision: a tool may report that race-free code has a possible race— a "false alarm". Some static approaches are also not sound, i.e., they may fail to detect a race in a racy program; like dynamic tools, these approaches are used more as bug hunters than verifiers.

Finite-state model checking [15] offers an interesting compromise. This approach requires a finite-state model of the program, which is usually achieved by placing small bounds on the number of threads, the size of inputs, or other program parameters. The reachable states of the model can be explored through explicit enumeration or other means. This can be used to implement a sound and precise race analysis of the model. If a race is found, detailed information can be produced, such as a program trace highlighting the two conflicting memory accesses. Of course, if the analysis concludes the model is race-free, it is still possible that a race exists for larger parameter values. In this case, one can increase those values and re-run the analysis until time or computational resources are exhausted. If one accepts the "small scope hypothesis"—the claim that most defects manifest in small configurations of a system—then model checking can at least provide strong evidence for the absence of data races. In any case, the results provide specific information on the scope that is guaranteed to be race-free, which can be used to guide testing or further analysis.

The main limitation of model checking is state explosion, and one of the most effective techniques for limiting state explosion is *partial order reduction* (POR) [17]. A typical POR technique is based on the following observation: if, at a state s, some thread t is at a "local" statement—i.e., one which commutes with all statements from other threads—then it is often not necessary to explore all enabled transitions from s; instead, the search can explore only the enabled transitions of t. Usually local statements are those that access only thread-local variables. But if the program is known to be race-free, shared variable accesses can also be considered "local" for POR. This is the essential observation at the heart of recent work on POR in the verification of Pthreads programs [29].

In this paper, we explore a new model checking technique that can be used to verify race-freedom, as well as other correctness properties, for programs in which threads synchronize through locks and barriers. The approach requires two simple modifications to the standard state reachability algorithm. First, each thread maintains a history of the memory locations accessed since its last synchronization operation. These sets are examined for races and emptied at specific synchronization points. Second, a novel POR is used in which only lock (release and acquire) operations are considered non-local. In Sect. 2, we present a precise mathematical formulation of the technique and a theorem that it has the claimed properties, including that it is sound and precise for verification of race-freedom of finite-state models.
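The first modification above—per-thread access histories that are checked for races and emptied at synchronization points—can be sketched as follows. This is an illustration of the bookkeeping only, in Python with names of our own choosing, not the CIVL implementation:

```python
class RaceChecker:
    """Track each thread's reads/writes since its last synchronization."""

    def __init__(self, tids):
        self.reads = {t: set() for t in tids}
        self.writes = {t: set() for t in tids}

    def on_read(self, tid, loc):
        self.reads[tid].add(loc)

    def on_write(self, tid, loc):
        self.writes[tid].add(loc)

    def at_sync(self, tids):
        """At a synchronization point, check the participating threads'
        histories pairwise for conflicts (same location, at least one
        write), then empty the histories. Returns the racy locations."""
        races = set()
        for i in tids:
            for j in tids:
                if i < j:
                    races |= self.writes[i] & (self.reads[j] | self.writes[j])
                    races |= self.writes[j] & self.reads[i]
        for t in tids:
            self.reads[t].clear()
            self.writes[t].clear()
        return races
```

For example, a write of `x` in one thread and a read of `x` in another, with no intervening synchronization, is flagged at the next synchronization point; a read-read pair is not.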

Using the CIVL symbolic execution and model checking platform [31], we have implemented a prototype tool, based on the new technique, for verifying race-freedom in C/OpenMP programs. OpenMP is an increasingly popular directive-based language for writing multithreaded programs in C, C++, or Fortran. A large sub-language of OpenMP has the SC4DRF guarantee.<sup>1</sup> While the theoretical model deals with locks and barriers, it can be applied to many OpenMP constructs that can be modeled using those primitives, such as atomic operations and critical sections. This is explained in Sect. 3, along with the results of some experiments applying our tool to a suite of C/OpenMP programs. In Sect. 4, we discuss related work and Sect. 5 concludes.

## 2 Theory

We begin with a simple mathematical model of a multithreaded program that uses locks and barriers for synchronization.

Definition 1. Let TID be a finite set of positive integers. A *multithreaded program with thread ID set* TID comprises

	- (a) a set Local*i*, the *local states of thread* i, which is the union of five disjoint subsets, Acquire*i*, Release*i*, Barrier*i*, Nsync*i*, and Term*<sup>i</sup>*
	- (b) a set Stmt*<sup>i</sup>* of *statements*, which includes the *lock statements* acquire*i*(l) and release*i*(l) (for l ∈ Lock), and the *barrier-exit* statement exit*i*; all other statements are known as *nsync (non-synchronization) statements*
	- (c) for each <sup>σ</sup> <sup>∈</sup> Acquire*<sup>i</sup>* <sup>∪</sup> Release*<sup>i</sup>* <sup>∪</sup> Barrier*i*, a local state next(σ) <sup>∈</sup> Local*<sup>i</sup>*
	- (d) for each <sup>σ</sup> <sup>∈</sup> Acquire*<sup>i</sup>* <sup>∪</sup> Release*i*, a lock lock(σ) <sup>∈</sup> Lock
	- (e) for each σ ∈ Nsync*i*, a nonempty set stmts(σ) ⊆ Stmt*<sup>i</sup>* of nsync statements and a function

$$\mathsf{update}(\sigma) \colon \mathsf{stmts}(\sigma) \times \mathsf{Shared} \to \mathsf{Local}\_i \times \mathsf{Shared}.$$

All of the sets Local*<sup>i</sup>* and Stmt*<sup>i</sup>* (i ∈ TID) are pairwise disjoint.

Each thread has a unique thread ID number, an element of TID. A local state for thread i encodes the values of all thread-local variables, including the program counter. A shared state encodes the values of all shared variables. (Locks are not considered shared variables.) A thread at an *acquire* state σ is attempting to acquire the lock lock(σ). At a *release* state, the thread is about to release a lock. At a *barrier* state, a thread is waiting inside a barrier. After executing one of the three operations, each thread moves to a unique next local state. A thread that reaches a *terminal* state has terminated. From an *nsync* state, any positive number of statements are enabled, and each of these statements may read and update the local state of the thread and/or the shared state.

<sup>1</sup> Any OpenMP program that does not use non-sequentially consistent atomic directives, omp\_test\_lock, or omp\_test\_nest\_lock [26, §1.4.6].

For i ∈ TID, the *local graph* of thread i is the directed graph with nodes Local*<sup>i</sup>* and an edge σ → σ′ if either (i) σ ∈ Acquire*<sup>i</sup>* ∪ Release*<sup>i</sup>* ∪ Barrier*<sup>i</sup>* and σ′ = next(σ), or (ii) σ ∈ Nsync*<sup>i</sup>* and there are some t ∈ stmts(σ) and ζ, ζ′ ∈ Shared such that update(σ)(t, ζ) = (σ′, ζ′).

Fix a multithreaded program P and let

$$\begin{aligned} \mathsf{LockState} &= (\mathsf{Lock} \to \{0\} \cup \mathsf{TID})\\ \mathsf{State} &= \left(\prod\_{i \in \mathsf{TID}} \mathsf{Local}\_i\right) \times \mathsf{Shared} \times \mathsf{LockState} \times 2^{\mathsf{TID}}. \end{aligned}$$

A *lock state* specifies the owner of each lock. The owner is a thread ID, or 0 if the lock is free. The elements of State are the (global) *states* of P. A state specifies a local state for each thread, a shared state, a lock state, and the set of threads that are currently blocked at a barrier.

Let i ∈ TID and L*<sup>i</sup>* = Local*<sup>i</sup>* × Shared × LockState × 2<sup>TID</sup>. Define

$$\mathsf{enabled}\_{i} \colon L\_{i} \to 2^{\mathsf{Stmt}\_{i}}$$

$$\lambda \mapsto \begin{cases} \{\mathsf{acquire}\_i(l)\} & \text{if } \sigma \in \mathsf{Acquire}\_i \land l = \mathsf{lock}(\sigma) \land \theta(l) = 0\\ \{\mathsf{release}\_i(l)\} & \text{if } \sigma \in \mathsf{Release}\_i \land l = \mathsf{lock}(\sigma) \land \theta(l) = i\\ \{\mathsf{exit}\_i\} & \text{if } \sigma \in \mathsf{Barrier}\_i \land i \notin w\\ \mathsf{stmts}(\sigma) & \text{if } \sigma \in \mathsf{Nsync}\_i\\ \emptyset & \text{otherwise,} \end{cases}$$

where λ = (σ, ζ, θ, w) ∈ L*i*. This function returns the set of statements that are enabled in thread i at a given state. This function does not depend on the local states of threads other than i, which is why those are excluded from L*i*. An acquire statement is enabled if the lock is free; a release is enabled if the calling thread owns the lock. A barrier exit is enabled if the thread is not currently in the barrier blocked set.

Execution of an enabled statement in thread i updates the state as follows:

$$\mathsf{execute}\_i \colon \{(\lambda, t) \in L\_i \times \mathsf{Stmt}\_i \mid t \in \mathsf{enabled}\_i(\lambda)\} \to L\_i$$

$$(\lambda, t) \mapsto \begin{cases} (\sigma', \zeta, \theta[l \mapsto i], w') & \text{if } \sigma \in \mathsf{Acquire}\_i \land t = \mathsf{acquire}\_i(l) \land \sigma' = \mathsf{next}(\sigma)\\ (\sigma', \zeta, \theta[l \mapsto 0], w') & \text{if } \sigma \in \mathsf{Release}\_i \land t = \mathsf{release}\_i(l) \land \sigma' = \mathsf{next}(\sigma)\\ (\sigma', \zeta, \theta, w') & \text{if } \sigma \in \mathsf{Barrier}\_i \land t = \mathsf{exit}\_i \land \sigma' = \mathsf{next}(\sigma)\\ (\sigma', \zeta', \theta, w') & \text{if } \sigma \in \mathsf{Nsync}\_i \land t \in \mathsf{stmts}(\sigma) \land \mathsf{update}(\sigma)(t, \zeta) = (\sigma', \zeta') \end{cases}$$

where λ = (σ, ζ, θ, w) and in each case above

$$w' = \begin{cases} w \cup \{i\} & \text{if } \sigma' \in \mathsf{Barrier}\_i \land w \cup \{i\} \neq \mathsf{TID}\\ \emptyset & \text{if } \sigma' \in \mathsf{Barrier}\_i \land w \cup \{i\} = \mathsf{TID}\\ w & \text{otherwise.} \end{cases}$$

Note that a thread arriving at a barrier has its ID added to the barrier blocked set, unless it is the last thread to arrive, in which case all threads are released from the barrier.
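The barrier bookkeeping in the definition of w′ above can be sketched directly. A minimal Python illustration, where `TIDS` is an assumed global set of thread IDs:

```python
TIDS = {1, 2, 3}   # assumed thread-ID set for illustration

def arrive_at_barrier(w, i):
    """Return the new barrier-blocked set w' after thread i arrives."""
    if w | {i} == TIDS:
        return set()        # i is the last to arrive: release all threads
    return w | {i}          # i blocks, waiting for the others

def exit_enabled(w, i):
    """exit_i is enabled iff thread i is not in the blocked set."""
    return i not in w
```
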

At a given state, the set of enabled statements is the union over all threads of the enabled statements in that thread. Execution of a statement updates the state as above, leaving the local states of other threads untouched:

$$\mathsf{enabled} \colon \mathsf{State} \to 2^{\mathsf{Stmt}}$$

$$s \mapsto \bigcup\_{j \in \mathsf{TID}} \mathsf{enabled}\_{j}(\xi\_{j}, \zeta, \theta, w)$$

$$\mathsf{execute} \colon \{(s, t) \in \mathsf{State} \times \mathsf{Stmt} \mid t \in \mathsf{enabled}(s)\} \to \mathsf{State}$$

$$(s, t) \mapsto \langle \xi[i \mapsto \sigma'], \zeta', \theta', w' \rangle,$$

where s = ⟨ξ, ζ, θ, w⟩ ∈ State, t ∈ enabled(s), i = tid(t), and execute*i*((ξ*i*, ζ, θ, w), t) = (σ′, ζ′, θ′, w′).

Definition 2. A *transition* is a triple s →<sup>t</sup> s′, where s ∈ State, t ∈ enabled(s), and s′ = execute(s, t). An *execution* α of P is a (finite or infinite) chain of transitions s<sub>0</sub> →<sup>t<sub>1</sub></sup> s<sub>1</sub> →<sup>t<sub>2</sub></sup> · · · . The *length* of α, denoted |α|, is the number of transitions in α.

Note that an execution is completely determined by its initial state s<sub>0</sub> and its statement sequence t<sub>1</sub>t<sub>2</sub> · · · .

Having specified the semantics of the computational model, we now turn to the concept of the *data race*. The traditional definition requires the notion of "conflicting" accesses: two accesses to the same memory location conflict when at least one is a write. The following abstracts this notion:

Definition 3. A symmetric binary relation conflict on Stmt is a *conflict relation* for P if the following hold for all t1, t<sup>2</sup> ∈ Stmt:


$$\mathsf{execute}(\mathsf{execute}(s, t\_1), t\_2) = \mathsf{execute}(\mathsf{execute}(s, t\_2), t\_1). \tag{7}$$

Fix a conflict relation for P for the remainder of this section.

The next ingredient in the definition of *data race* is the *happens-before* relation. This is a relation on the set of *events* generated by an execution. An event is an element of $\mathsf{Event} = \mathsf{Stmt} \times \mathbb{N}$.

Definition 4. Let $\alpha = (s\_0 \xrightarrow{t\_1} s\_1 \xrightarrow{t\_2} \cdots)$ be an execution. The *trace of* $\alpha$ is the sequence of events $\mathsf{tr}(\alpha) = \langle t\_1, n\_1 \rangle \langle t\_2, n\_2 \rangle \cdots$, of length $|\alpha|$, where $n\_i$ is the number of $j \in [1, i]$ for which $\mathsf{tid}(t\_j) = \mathsf{tid}(t\_i)$. We write $[\alpha]$ for the set of events occurring in $\mathsf{tr}(\alpha)$. A trace labels the statements executed by a thread with consecutive integers starting from 1. Note the cardinality of $[\alpha]$ is $|\alpha|$, as no two events in $\mathsf{tr}(\alpha)$ are equal. Also, $[\alpha]$ is invariant under transposition of two adjacent commuting transitions from different threads.

Given an execution α, the *happens-before relation of* α, denoted HB(α), is a binary relation on [α]. It is the transitive closure of the union of three relations:

1. the intra-thread order relation

$$\{ (\langle t\_1, n\_1 \rangle, \langle t\_2, n\_2 \rangle) \in [\alpha] \times [\alpha] \mid \mathsf{tid}(t\_1) = \mathsf{tid}(t\_2) \land n\_1 < n\_2 \}.$$


$$\mathsf{epoch}(e) = |\{e' \in [\alpha] \mid e' = \langle \mathsf{exit}\_i, j \rangle \text{ for some } j \in [1, n] \}|,$$

the number of barrier exit events in thread i preceding or including e. The barrier relation is

$$\{(e, e') \in [\alpha] \times [\alpha] \mid \mathsf{epoch}(e) < \mathsf{epoch}(e')\}.$$

Two events "race" when they conflict but are not ordered by happens-before:

Definition 5. Let $\alpha$ be an execution and $e, e' \in [\alpha]$. Say $e = \langle t, n \rangle$ and $e' = \langle t', n' \rangle$. We say $e$ and $e'$ *race in* $\alpha$ if $(t, t') \in \mathsf{conflict}$ and neither $(e, e')$ nor $(e', e)$ is in $\mathsf{HB}(\alpha)$. The *data race relation of* $\alpha$ is the symmetric binary relation on $[\alpha]$ given by $\mathsf{DR}(\alpha) = \{(e, e') \in [\alpha] \times [\alpha] \mid e \text{ and } e' \text{ race in } \alpha\}$.

Now we turn to the problem of detecting data races. Our approach is to explore a modified state space. The usual state space is a directed graph with node set State and transitions for edges. We make two modifications. First, we add some "history" to the state. Specifically, each thread records the nsync statements it has executed since its last lock event or barrier exit. This set is checked against those of other threads for conflicts, just before it is emptied after its next lock event or barrier exit. The second change is a reduction: any state that has an enabled statement that is not a lock statement will have outgoing edges from only one thread in the modified graph.

A well-known technical challenge with partial order reduction concerns cycles in the reduced state space. We deal with this challenge by assuming that P comes with some additional information. Specifically, for each $i$, we are given a set $R\_i$, with $\mathsf{Release}\_i \cup \mathsf{Acquire}\_i \subseteq R\_i \subseteq \mathsf{Local}\_i$, satisfying: any cycle in the local graph of thread $i$ has at least one node in $R\_i$. In general, the smaller $R\_i$, the more effective the reduction. In many application domains, there are no cycles in the local graphs, so one can take $R\_i = \mathsf{Release}\_i \cup \mathsf{Acquire}\_i$. For example, standard *for* loops in C, in which the loop variable is incremented by a fixed amount at each iteration, do not introduce cycles, because the loop variable will take on a new value at each iteration. For *while* loops, one may choose one node from the loop body to be in $R\_i$. *Goto* statements may also introduce cycles and could require additions to $R\_i$.

Definition 6. The *race-detecting state graph* for P is the pair G = (V,E), where

$$V = \mathbf{State} \times \left(\prod\_{i \in \mathsf{TID}} 2^{\mathsf{Stmt}\_i}\right)^2$$

and $E \subseteq V \times \mathsf{Stmt} \times V$ consists of all $(\langle s, \mathbf{a} \rangle, t, \langle s', \mathbf{a}' \rangle)$ such that, letting $\sigma\_i$ be the local state of thread $i$ in $s$,


The race-detecting state graph may be thought of as a directed graph in which the nodes are V and edges are labeled by statements. Note that at a state where all threads are in the barrier, $\mathsf{exit}\_0$ is the only enabled statement in the race-detecting state graph, and its execution results in emptying all the $\mathbf{a}\_i$. A lock event in thread $i$ results in emptying $\mathbf{a}\_i$ only.

Definition 7. Let P be a multithreaded program and G = (V,E) the race-detecting state graph for P.


Definition 7 suggests a method for detecting data races in a multithreaded program. The nodes and edges of the race-detecting state graph reachable from an initial node are explored. (The order in which they are explored is irrelevant.) When an edge from a thread at an $R\_i \setminus \mathsf{Acquire}\_i$ state is executed, the elements of $\mathbf{a}\_i$ are compared with those in $\mathbf{a}\_j$ for all $j \in \mathsf{TID} \setminus \{i\}$ to see if a conflict exists, and if so, a data race is reported. When an edge in thread $i$ terminates at an $\mathsf{Acquire}\_i$ state, a similar race check takes place. When an $\mathsf{exit}\_0$ occurs, or a node with no outgoing edges is reached, $\mathbf{a}\_i$ and $\mathbf{a}\_j$ are compared for all $i, j \in \mathsf{TID}$ with $i \neq j$. This approach is sound and precise in the following sense:

Theorem 1. *Let* P *be a multithreaded program, and* G = (V,E) *the race-detecting state graph for* P. *Let* $s\_0 \in \mathsf{State}$ *and let* $u\_0 = \langle s\_0, \emptyset^{\mathsf{TID}} \rangle \in V$. *Assume the set of nodes reachable from* $u\_0$ *is finite. Then*


A proof of Theorem 1 is given in https://arxiv.org/abs/2305.18198.

*Example 1.* Consider the 2-threaded program represented in pseudocode:

$$\begin{aligned} t\_1 &\colon \mathsf{acquire}(l\_1); \ \mathtt{x=1}; \ \mathsf{release}(l\_1); \\ t\_2 &\colon \mathsf{acquire}(l\_2); \ \mathtt{x=2}; \ \mathsf{release}(l\_2); \end{aligned}$$

where $l\_1$ and $l\_2$ are distinct locks. Let $R\_i = \mathsf{Release}\_i \cup \mathsf{Acquire}\_i$ ($i = 1, 2$). One path in the race-detecting state graph G executes as follows:

$$\mathsf{acquire}(l\_1);\ \mathtt{x=1};\ \mathsf{release}(l\_1);\ \mathsf{acquire}(l\_2);\ \mathtt{x=2};\ \mathsf{release}(l\_2);$$

A data race occurs on this path since the two assignments conflict but are not ordered by happens-before. The race is not detected, since at each lock operation, the statement set in the other thread is empty. However, there is another path

$$\mathsf{acquire}(l\_1);\ \mathtt{x=1};\ \mathsf{acquire}(l\_2);\ \mathtt{x=2};\ \mathsf{release}(l\_1);$$

in G, and on this path the race is detected at the release.

#### 3 Implementation and Evaluation

We implemented a verification tool for C/OpenMP programs using the CIVL symbolic execution and model checking framework. This tool can be used to verify absence of data races within bounds on certain program parameters, such as input sizes and the number of threads. (Bounds are necessary so that the number of states is finite.) The tool accepts a C/OpenMP program and transforms it into CIVL-C, the intermediate verification language of CIVL. The CIVL-C program has a state space similar to the race-detecting state graph described in Sect. 2. The standard CIVL verifier, which uses model checking and symbolic execution techniques, is applied to the transformed code and reports whether the given program has a data race, and, if so, provides precise information on the variable involved in the race and an execution leading to the race.

The approach is based on the theory of Sect. 2, but differs in some implementation details. For example, in the theoretical approach, a thread records the set of non-synchronization statements executed since the thread's last synchronization operation. This data is used only to determine whether a conflict took place between two threads. Any type of data that can answer this question would work equally well. In our implementation, each thread instead records the set of memory locations read, and the set of memory locations modified, since the last synchronization. A conflict occurs if the read or write set of one thread intersects the write set of another thread. As CIVL-C provides robust support for tracking memory accesses, this approach is relatively straightforward to implement by a program transformation.

In Sect. 3.1, we summarize the basics of OpenMP. In Sect. 3.2, we provide the necessary background on CIVL-C and the primitives used in the transformation. In Sect. 3.3, we describe the transformation itself. In Sect. 3.4, we report the results of experiments using this tool.

All software and other artifacts necessary to reproduce the experiments, as well as the full results, are included in a VirtualBox virtual machine available at https://doi.org/10.5281/zenodo.7978348.

### 3.1 Background on OpenMP

OpenMP is a pragma-based language for parallelizing programs written in C, C++ and Fortran [13]. OpenMP was originally designed and is still most commonly used for shared-memory parallelization on CPUs, although the language is evolving and supports an increasing number of parallelization styles and hardware targets. We introduce here the OpenMP features that are currently supported by our implementation in CIVL. An example that uses many of these features is shown in Fig. 1.

The parallel construct declares the following structured block as a *parallel region*, which will be executed by all threads concurrently. Within such a parallel region, programmers can use *worksharing* constructs that cause certain parts of the code to be executed only by a subset of threads. Perhaps most importantly, the *loop worksharing construct* can be used inside a parallel region to declare an omp for loop whose iterations are mapped to different threads. The mapping of iterations to threads can be controlled through the schedule clause, which can take values including static, dynamic, and guided, along with an integer that defines the *chunk size*. If no schedule is explicitly specified, the OpenMP runtime is allowed to use an arbitrary mapping. Furthermore, a structured block within a worksharing loop may be declared as ordered, which will cause this block to be executed sequentially in order of the iterations of the worksharing loop. Worksharing for non-iterative workloads is supported through the sections construct, which allows the programmer to define a number of different structured blocks of code that will be executed in parallel by different threads.

Programmers may use pragmas and clauses for barriers, atomic updates, and locks. OpenMP supports named critical sections, allowing no more than one thread at a time to enter a critical section with that name, and unnamed critical sections that are associated with the same global mutex. OpenMP also offers master and single constructs that are executed only by the *master thread* or one arbitrary thread.

```
#pragma omp parallel shared(b) private(i) shared(u,v)
{ // parallel region: all threads will execute this
  #pragma omp sections // sections worksharing construct
  {
    #pragma omp section // one thread will do this...
    { b = 0; v = 0; }
    #pragma omp section // while another thread does this...
    u = rand();
  }
  // loop worksharing construct partitions iterations by schedule. Each thread has a
  // private copy of b; these are added back to original shared b at end of loop...
  #pragma omp for reduction(+:b) schedule(dynamic,1)
  for (i=0; i<10; i++) {
    b = b + i;
    #pragma omp atomic seq_cst // atomic update to v
    v+=i;
    #pragma omp critical (collatz) // one thread at a time enters critical section
    u = (u%2==0) ? u/2 : 3*u+1;
  }
}
```
Fig. 1. OpenMP Example

Variables are shared by all threads by default. Programmers may change the default, as well as the scope of individual variables, for each parallel region using the following clauses: private causes each thread to have its own variable instance, which is uninitialized at the start of the parallel region and separate from the original variable that is visible outside the parallel region. The firstprivate scope declares a private variable that is initialized with the value of the original variable, whereas the lastprivate scope declares a private variable that is uninitialized, but whose final value is that of the logically last worksharing loop iteration or lexically last section. The reduction clause initializes each instance to the neutral element, for example 0 for reduction(+). Instances are combined into the original variable in an implementation-defined order.

CIVL can model OpenMP types and routines to query and control the number of threads (omp\_set\_num\_threads, omp\_get\_num\_threads), get the current thread ID (omp\_get\_thread\_num), interact with locks (omp\_init\_lock, omp\_destroy\_lock, omp\_set\_lock, omp\_unset\_lock), and obtain the current wall clock time (omp\_get\_wtime).

#### 3.2 Background on CIVL-C

The CIVL framework includes a front-end for preprocessing, parsing, and building an AST for a C program. It also provides an API for transforming the AST. We used this API to build a tool which consumes a C/OpenMP program and produces a CIVL-C "model" of the program. The CIVL-C language includes most of sequential C, including functions, recursion, pointers, structs, and dynamically allocated memory. It adds nested function definitions and primitives for concurrency and verification.

In CIVL-C, a thread is created by *spawning* a function: \$spawn f(...);. There is no special syntax for shared or thread-local variables; any variable that is in scope for two threads is shared. CIVL-C uses an interleaving model of concurrency similar to the formal model of Sect. 2. Simple statements, such as assignments, execute in one atomic step.

Threads can synchronize using *guarded commands*, which have the form \$when (e) S. The first atomic substatement of S is guaranteed to execute only from a state in which e evaluates to *true*. For example, assume thread IDs are numbered from 0, and a lock value of −1 indicates the lock is free. The *acquire* lock operation may be implemented as \$when (l<0) l=tid;, where l is an integer shared variable and tid is the thread ID. A *release* is simply l=-1;.

A convenient way to spawn a set of threads is \$parfor (int i: d) S. This spawns one thread for each element of the 1d-domain d; each thread executes S with i bound to one element of the domain. A 1d-domain is just a set of integers; e.g., if a and b are integer expressions, the domain expression a..b represents the set {a, a + 1,...,b}. The thread that invokes the \$parfor is blocked until all of the spawned threads terminate, at which point the spawned threads are destroyed and the original thread proceeds.

CIVL-C provides primitives to constrain the interleaving semantics of a program. The program state has a single atomic lock, initially free. At any state, if there is a thread t that owns the atomic lock, only t is enabled. When the atomic lock is free, if there is some thread at a \$local\_start statement, and the first statement following \$local\_start is enabled, then among such threads, the thread with lowest ID is the only enabled thread; that thread executes \$local\_start and obtains the lock. When t invokes \$local\_end, t relinquishes the atomic lock. Intuitively, this specifies a block of code to be executed atomically by one thread, and also declares that the block should be treated as a local statement, in the sense that it is not necessary to explore all interleavings from the state where the local is enabled.

Local blocks can also be broken up at specified points using the function \$yield. If t owns the atomic lock and calls \$yield, then t relinquishes the lock and does not immediately return from the call. When the atomic lock is free, there is no thread at a \$local\_start, a thread t is in a \$yield, and the first statement following the \$yield is enabled, then t may return from the \$yield call and re-obtain the atomic lock. This mechanism can be used to implement the race-detecting state graph: thread i begins with \$local\_start, yields at each $R\_i$ node, and ends with \$local\_end.

CIVL's standard library provides a number of additional primitives. For example, the concurrency library provides a barrier implementation through a type \$barrier, and functions to initialize, destroy, and invoke the barrier.

The *mem* library provides primitives for tracking the sets of memory locations (a variable, an element of an array, field of a struct, etc.) read or modified through a region of code. The type \$mem is an abstraction representing a set of memory locations, or *mem-set*. The state of a CIVL-C thread includes a stack of mem-sets for writes and a stack for reads. Both stacks are initially empty. The function \$write\_set\_push pushes a new empty mem-set onto the write stack. At any point when a memory location is modified, the location is

```
int nthreads = ...;
$mem reads[nthreads], writes[nthreads];
void check_conflict(int i, int j) {
  $assert($mem_disjoint(reads[i], writes[j]) && $mem_disjoint(writes[i], reads[j]) &&
          $mem_disjoint(writes[i], writes[j]));
}
void check_and_clear_all() {
  for (int i=0; i<nthreads; i++)
    for (int j=i+1; j<nthreads; j++) check_conflict(i, j);
  for (int i=0; i<nthreads; i++) reads[i] = writes[i] = $mem_empty();
}
void run(int tid) {
  void pop() { reads[tid]=$read_set_pop(); writes[tid]=$write_set_pop(); }
  void push() { $read_set_push(); $write_set_push(); }
  void check() {
    for (int i=0; i<nthreads; i++) { if (i==tid) continue; check_conflict(tid, i); }
  }
  // local variable declarations
  $local_start(); push(); S pop(); $local_end();
}
for (int i=0; i<nthreads; i++) reads[i] = writes[i] = $mem_empty();
$parfor (int tid:0..nthreads-1) run(tid);
check_and_clear_all();
```
Fig. 2. Translation of #pragma omp parallel *S*

added to the top entry on the write stack. Function \$write\_set\_pop pops the write stack, returning the top mem-set. The corresponding functions for the read stack are \$read\_set\_push and \$read\_set\_pop. The library also provides various operations on mem-sets, such as \$mem\_disjoint, which consumes two mem-sets and returns *true* if the intersection of the two mem-sets is empty.

#### 3.3 Transformation for Data Race Detection

The basic structure for the transformation of a parallel construct is shown in Fig. 2. The user specifies on the command line the default number of threads to use in a parallel region. After this, two shared arrays are allocated, one to record the read set for each thread, and the other the write set. Rather than updating these arrays immediately with each read and write event, a thread updates them only at specific points, in such a way that the shared sets are current whenever a data race check is performed.

The auxiliary function check\_conflict asserts no read-write or write-write conflict exists between threads <sup>i</sup> and <sup>j</sup>. Function check\_and\_clear\_all checks that no conflict exists between any two threads and clears the shared mem-sets.

Each thread executes function run. A local copy of each private variable is declared (and, for firstprivate variables, initialized) here. The body of this function is enclosed in a local region. The thread begins by pushing new entries onto its read and write stacks. As explained in Sect. 3.2, this turns on memory access tracking. The body S is transformed in several ways. First, references to the private variable are replaced by references to the local copy. Other OpenMP constructs are translated as follows.

*Lock operations.* Several OpenMP operations are modeled using locks. The omp\_set\_lock and omp\_unset\_lock functions are the obvious examples, but we also use locks to model the behavior of atomic and critical section constructs. In any case, a lock acquire operation is translated to

```
pop(); check(); $yield(); acquire(l); push();
```
The thread first pops its stacks, updating its shared mem-sets. At this point, the shared structures are up-to-date, and the thread uses them to check for conflicts with other threads. This conforms with Definition 7(2), that a race check occur upon arrival at an acquire location. It then yields to other threads as it attempts to acquire lock l. Once acquired, it pushes new empty entries onto its stack and resumes tracking. A release statement becomes

```
pop(); $yield(); check(); release(l); push();
```
It is similar to the acquire case, except that the check occurs upon leaving the release location, i.e., after the yield. A similar sequence is inserted in any loop (e.g., a *while* loop or a *for* loop not in standard form) that may create a cycle in the local space, only without the release statement.

*Barriers.* An explicit or implicit barrier in S becomes

```
pop(); $local_end(); $barrier_call(); if (tid==0) check_and_clear_all();
$barrier_call(); $local_start(); push();
```
The CIVL-C \$barrier\_call function must be invoked outside of a local region, as it may block. Once all threads are in the barrier, a single thread (0) checks for conflicts and clears all the shared mem-sets. A second barrier call is used to prevent other threads from racing ahead before this check and clear is complete. This protocol mimics the events that take place atomically with an exit<sup>0</sup> transition in Sect. 2.

*Atomic and Critical Sections.* An OpenMP atomic construct is modeled by introducing a global "atomic lock" which is acquired before executing the atomic statement and then released. The acquire and release actions are then transformed as described above. Similarly, a lock is introduced for each critical section name (and the anonymous critical section); this lock is acquired before entering a critical section with that name and released when departing.

*Worksharing Constructs.* Upon arriving at a for construct, a thread invokes a function that returns the set of iterations for which the thread is responsible. The partitioning of the iteration space among the threads is controlled by the construct clauses and various command line options. If the construct specifies the distribution strategy precisely, then the model uses only that distribution. If the construct does not specify the distribution, then the decisions are based on command line options. One option is to explore all possible distributions. In this case, when the first thread arrives, a series of nondeterministic choices is made to construct an arbitrary distribution. The verifier explores all possible choices, and therefore all possible distributions. This enables a complete analysis of the loop's execution space, but at the expense of a combinatorial explosion with the number of threads or iterations. A different command line option allows the user to specify a particular default distribution strategy, such as *cyclic*. These options give the user some control over the completeness-tractability tradeoff. For sections, only cyclic distribution is currently supported, and a single construct is executed by the first thread to arrive at the construct.

#### 3.4 Evaluation

We applied our verifier to a suite comprising benchmarks from DataRaceBench (DRB) version 1.3.2 [35] and some examples written by us that use different concurrency patterns. As a basis for comparison, we applied a state-of-the-art static analyzer for OpenMP race detection, LLOV v.0.3 [10], to the same suite.<sup>2</sup>

LLOV v.0.3 implements two static analyses. The first uses polyhedral analysis to identify data races due to loop-carried dependencies within OpenMP parallel loops [9]. It is unable to identify data races involving critical sections, atomic operations, master or single directives, or barriers. The second is a phase interval analysis to identify statements or basic blocks (and consequently memory accesses within those blocks) that may happen in parallel [10]. Phases are separated by explicit or implicit barriers and the minimum and maximum phase in which a statement or basic block may execute define the phase interval. The phase interval analysis errs in favor of reporting accesses as potentially happening in parallel whenever it cannot prove that they do not; consequently, it may produce false alarms.

The DRB suite exercises a wide array of OpenMP language features. Of the 172 benchmarks, 88 use only the language primitives supported by our CIVL OpenMP transformer (see Sect. 3.1). Some of the main reasons benchmarks were excluded include: use of C++, simd and task directives, and directives for GPU programming. All 88 programs also use only features supported by LLOV. Of the 88, 47 have data races and 41 are labeled race-free.

We executed CIVL on the 88 programs, with the default number of OpenMP threads for a parallel region bounded by 8 (with a few exceptions, described below). We chose cyclic distribution as the default for OpenMP *for* loops. Many of the programs consume positive integer inputs or have clear hard-coded integer parameters. We manually instrumented 68 of the 88, inserting a few lines of CIVL-C code, protected by a preprocessor macro that is defined only when the program is verified by CIVL. This code allows each parameter to be specified on the CIVL command line, either as a single value or by specifying a range. In a few cases (e.g., DRB055), "magic numbers" such as 500 appear in multiple places,

<sup>2</sup> While there are a number of effective dynamic race detectors, the goal of those tools is to detect races on a particular execution. Our goal is more aligned with that of static analyzers: to cover as many executions as possible, including for different inputs, number of threads, and thread interleavings.

```
// DRB140 (race)
int a, i;
#pragma omp parallel private(i)
{
  #pragma omp master
  a = 0;
  #pragma omp for reduction(+:a)
  for (i=0; i<10; i++)
    a = a + i;
}

// DRB014 (race)
int n=100, m=100;
double b[n][m];
#pragma omp parallel for private(j)
for (i=1;i<n;i++)
  for (j=0;j<m;j++)
    // out of bound access
    b[i][j]=b[i][j-1];

// diffusion1 (race)
double *u, *v;
// alloc + init u, v
for (t=0; t<steps; t++) {
  #pragma omp parallel for
  for (i=1; i<n-1; i++) {
    u[i]=v[i]+c*(v[i-1]+v[i]);
  }
  u=v; v=u; // incorrect swap
}
```
Fig. 3. Excerpts from three benchmarks with data races: two from DataRaceBench (left and middle) and erroneous 1d-diffusion (right).

which we replaced with an input parameter controlled by CIVL. These modifications are consistent with the "small scope" approach to verification, which requires some manual effort to properly parameterize the program so that the "scope" can be controlled.

We used the range 1..10 for inputs, again with a few exceptions. In three cases, verification did not complete within 3 min and we lowered these bounds as follows: for DRB043, thread bound 8 and input bound 4; for the Jacobi iteration kernel DRB058, thread bound 4 and bound of 5 on both the matrix size and number of iterations; for DRB062, thread bound 4 and input bound 5.

CIVL correctly identified 40 of the 41 data-race-free programs, failing only on DRB139 due to nested parallel regions. It correctly reported a data race for 45 of the 47 programs with data races, missing only DRB014 (Fig. 3, middle) and DRB015. In both cases, CIVL reports a bound issue for an access to b[i][j-1] when i > 0 and j = 0, but fails to report a data race, even when bound checking is disabled.

LLOV correctly identified 46 of the 47 programs with data races, failing to report a data race for DRB140 (Fig. 3, left). The semantics for reduction specify that the loop behaves as if each thread creates a private copy, initially 0, of the shared variable a, and updates this private copy in the loop body. At the end of the loop, the thread adds its local copy onto the original shared variable. These final additions are guaranteed to not race with each other. In CIVL, this is modeled using a lock. However, there is no guarantee that these updates do not race with other code. In this example, thread 0 could be executing the assignment a=0 while another thread is adding its local result to a—a data race. This race issue can be resolved by isolating the reduction loop with barriers.

LLOV correctly identified 38 out of 41 data-race-free programs. It reported false alarms for DRB052 (no support for indirect addressing), DRB054 (failure to propagate array dimensions and loop bounds from a variable assignment), and DRB069 (failure to properly model OpenMP lock behavior).

The DRB suite contains few examples with interesting interleaving dependencies or pointer alias issues. To complement the suite, we wrote 10 additional C/OpenMP programs based on widely-used concurrency patterns (cf. [1]):

– 3 implementations of a synchronization signal sent from one thread to another, using locks or busy-wait loops with critical sections or atomics;

```
// atomic3 (no race)
int x=0, s=0;
#pragma omp parallel sections \
    shared(x,s) num_threads(2)
{
  #pragma omp section
  {
    x=1;
    #pragma omp atomic write seq_cst
    s=1;
  }
  #pragma omp section
  {
    int done = 0;
    while (!done) {
      #pragma omp atomic read seq_cst
      done = s;
    }
    x=2;
  }
}

// bar2 (no race)
// ...create/initialize locks l0, l1;
#pragma omp parallel num_threads(2)
{
  int tid = omp_get_thread_num();
  if (tid == 0) omp_set_lock(&l0);
  else if (tid == 1) omp_set_lock(&l1);
  #pragma omp barrier
  if (tid == 0) x=0;
  if (tid == 0) {
    omp_unset_lock(&l0);
    omp_set_lock(&l1);
  } else if (tid == 1) {
    omp_set_lock(&l0);
    omp_unset_lock(&l1);
  }
  if (tid == 1) x=1;
  #pragma omp barrier
  if (tid == 0) omp_unset_lock(&l1);
  else if (tid == 1) omp_unset_lock(&l0);
}
```
Fig. 4. Code for synchronization using an atomic variable (left) and a 2-thread barrier using locks (right).


For each program, we created an erroneous version with a data race, for a total of 20 tests. These codes are included in the experimental archive, and two are excerpted in Fig. 4.

CIVL obtains the expected result in all 20. While we wrote these additional examples to verify that CIVL can reason correctly about programs with complex interleaving semantics or alias issues, for completeness we also evaluated them with LLOV. It should be noted, however, that the authors of LLOV warn that it ". . . does not provide support for the OpenMP constructs for synchronization. . . " and ". . . can produce False Positives for programs with explicit synchronizations with barriers and locks." [9] It is therefore unsurprising that the results were somewhat mixed: LLOV produced no output for 6 of our examples (the racy and race-free versions of diffusion2 and the two producer-consumer codes) and produced the correct answer on 7 of the remaining 14. On these problems, LLOV reported a race for both the racy and race-free versions, with the exception of diffusion1 (Fig. 3, right), where a failure to detect the alias between u and v leads it to report both versions as race-free.

CIVL's verification time is significantly longer than LLOV's. On the DRB benchmarks, the total CIVL time for the 88 tests was 27 min. Individual times ranged from 1 s to 150 s: 66 tests took less than 5 s, 80 took less than 30 s, and 82 took less than 1 min. (All CIVL runs used an M1 MacBook Pro with 16 GB memory.) The total CIVL runtime on the 20 extra tests was 210 s. LLOV analyzes all 88 DRB problems in less than 15 s (on a standard Linux machine).

### 4 Related Work

By Theorem 1, if barriers are the only form of synchronization used in a program, only a single interleaving will be explored, and this suffices to verify race-freedom or to find all states at the end of each barrier epoch. This is well known in other contexts, such as GPU kernel verification (cf. [5]).

Prior work involving model checking and data races for unstructured concurrency includes Schemmel et al. [29]. This work describes a technique, using symbolic execution and POR, to detect defects in Pthreads programs. The approach involves intricate algorithms for enumerating configurations of prime event structures, each representing a set of executions. The completeness results deal with the detection of defects under the assumption that the program is race-free. While the implementation does check for data races, it is not clear that the theoretical results guarantee a race will be found if one exists.

Earlier work of Elmas et al. describes a sound and precise technique for verifying race-freedom in finite-state lock-based programs [16]. It uses a bespoke POR-based model checking algorithm that associates significant and complex information with the state, including, for each shared memory location, a set of locks a thread should hold when accessing that location, and a reference to the node in the depth first search stack from which the last access to that location was performed.

Both of these model checking approaches are considerably more complex than the approach of this paper. We have defined a simple state-transition system and shown that a program has a data race if and only if a state or edge satisfying a certain condition is reachable in that system. Our approach is agnostic to the choice of algorithm used to check reachability. The earlier approaches are also path-precise for race detection, i.e., for each execution path, a race is detected if and only if one exists on that path. As we saw in the example following Theorem 1, our approach is not path-precise, nor does it have to be: to verify race-freedom, it is only necessary to find one race in one execution, if one exists. This partly explains the relative simplicity of our approach.

A common approach for verifying race-freedom is to establish *consistent correlation*: for each shared memory location, there is some lock that is held whenever that location is accessed. Locksmith [27] is a static analysis tool for multithreaded C programs that takes this approach. The approach should never report that a racy program is race-free, but can generate false alarms, since there are race-free programs that are not consistently correlated. False alarms can also arise from imprecise approximations of the set of shared variables, alias analysis, and so on. Nevertheless, the technique appears very effective in practice.
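
In OpenMP terms, a consistently correlated program can be sketched as follows. This is our own toy example (Locksmith itself targets Pthreads): every access to the shared variable `count` occurs inside the same named critical section, i.e., while holding the same implicit lock, so a lockset analysis would compute a nonempty lockset for `count` and report the program race-free.

```c
// Sketch of consistent correlation (our own toy example): every access to
// the shared variable `count` holds the lock associated with the named
// critical section, so its lockset is never empty.
int count = 0;

int work(int n) {
    #pragma omp parallel for shared(count)
    for (int i = 0; i < n; i++) {
        #pragma omp critical(count_lock)
        count++;                    // the only access point, always locked
    }
    return count;
}
```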

Static analysis-based race-detection tools for OpenMP include OMPRacer [33]. OMPRacer constructs a static graph representation of the happens-before relation of a program and analyzes this graph, together with a novel whole-program pointer analysis and a lockset analysis, to detect races. It may miss violations as a consequence of unsound decisions that aim to improve performance on real applications. The tool is not open source. The authors subsequently released OpenRace [34], designed to be extensible to other parallelism dialects; similar to OMPRacer, OpenRace may miss violations. Prior papers by the authors present details of static methods for race detection, without a tool that implements these methods [32].

PolyOMP [12] is a static tool that uses a polyhedral model adapted for a subset of OpenMP. Like most polyhedral approaches, it works best for affine loops and is precise in such cases. The tool additionally supports may-write access relations for non-affine loops, but may report false alarms in that case. DRACO [36] also uses a polyhedral model and has similar drawbacks.

Hybrid static and dynamic tools include Dynamatic [14], which is based on LLVM. It combines a static analysis that finds candidate races with a dynamic tool that subsequently attempts to confirm them. Dynamatic may report false alarms and miss violations.

ARCHER [2] is a tool that statically determines many sequential or provably non-racy code sections and excludes them from dynamic analysis, then uses TSan [30] for dynamic race detection. To avoid false alarms, ARCHER also encodes information about OpenMP barriers that are otherwise not understood by TSan. A follow-up paper discusses the use of the OMPT interface to aid dynamic race detection tools in correctly identifying issues in OpenMP programs [28], as well as SWORD [3], a dynamic tool that can stay within user-defined memory bounds when tracking races, by capturing a summary on disk for later analysis.

ROMP [18] is a dynamic/static tool that instruments executables using the DynInst library to add checks for each memory access and uses the OMPT interface at runtime. It claims to support all of OpenMP except target and simd constructs, and models "logical" races even if they are not triggered because the conflicting accesses happen to be scheduled on the same thread. Other approaches for dynamic race detection and tricks for memory and run-time efficient race bookkeeping during execution are described in [11,19,20,24].

Deductive verification approaches have also been applied to OpenMP programs. An example is [6], which introduces an intermediate parallel language and a specification language based on permission-based separation logic. C programs that use a subset of OpenMP are manually annotated with "iteration contracts" and then automatically translated into the intermediate form and verified using VerCors and Viper. Successfully verified programs are guaranteed to be race-free. While these approaches require more work from the user, they do not require bounding the number of threads or other parameters.

### 5 Conclusion

In this paper, we introduced a simple model-checking technique to verify that a program is free from data races. The essential ideas are (1) each thread "remembers" the accesses it performed since its last synchronization operation, (2) a partial order reduction scheme is used that treats all memory accesses as local, and (3) checks for conflicting accesses are performed around synchronizations. We proved our technique is sound and precise for finite-state models, using a simple mathematical model for multithreaded programs with locks and barriers. We implemented our technique in a prototype tool based on the CIVL symbolic execution and model checking platform and applied it to a suite of C/OpenMP programs from DataRaceBench. Although based on completely different techniques, our tool achieved performance comparable to that of the state-of-the-art static analysis tool, LLOV v.0.3.

Limitations of our tool include incomplete coverage of the OpenMP specification (e.g., target, simd, and task directives are not supported); the need for some manual instrumentation; the potential for state explosion necessitating small scopes; and a combinatorial explosion in the mappings of threads to loop iterations, OpenMP sections, or single constructs. In the last case, we have compromised soundness by selecting one mapping, but in future work we will explore ways to efficiently cover this space. On the other hand, in contrast to LLOV and because of the reliance on model checking and symbolic execution, we were able to verify the presence or absence of data races even for programs using unstructured synchronization with locks, critical sections, and atomics, including barrier algorithms and producer-consumer code.

Acknowledgements. This material is based upon work by the RAPIDS Institute, supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (Sci-DAC) program, under contract DE-AC02-06CH11357 and award DE-SC0021162. Support was also provided by U.S. National Science Foundation awards CCF-1955852 and CCF-2019309.

## References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Searching for i-Good Lemmas to Accelerate Safety Model Checking**

Yechuan Xia<sup>1</sup>, Anna Becchi<sup>2</sup>, Alessandro Cimatti<sup>2</sup>, Alberto Griggio<sup>2</sup>, Jianwen Li<sup>1(B)</sup>, and Geguang Pu<sup>1,3(B)</sup>

<sup>1</sup> East China Normal University, Shanghai, China {jwli,ggpu}@sei.ecnu.edu.cn
<sup>2</sup> Fondazione Bruno Kessler, Trento, Italy {abecchi,cimatti,griggio}@fbk.eu
<sup>3</sup> Shanghai Trusted Industrial Control Platform Co., Ltd., Shanghai, China

**Abstract.** IC3/PDR and its variants have been the prominent approaches to safety model checking in recent years. Compared to previous model-checking algorithms like BMC (Bounded Model Checking) and IMC (Interpolation Model Checking), IC3/PDR is attractive due to its completeness (vs. BMC) and scalability (vs. IMC). IC3/PDR maintains an over-approximate state sequence for proving correctness. Although the sequence refinement methodology is known to be crucial for performance, the literature lacks a systematic analysis of the problem. We propose an approach based on the definition of *i-good lemmas*, and the introduction of two kinds of heuristics, i.e., branching and refer-skipping, to steer the search towards the construction of *i*-good lemmas. The approach is applicable to IC3 and its variant CAR (Complementary Approximate Reachability), and it is very easy to integrate within existing systems. We implemented the heuristics into two open-source model checkers, IC3Ref and SimpleCAR, as well as into the mature nuXmv platform, and carried out an extensive experimental evaluation on HWMCC benchmarks. The results show that the proposed heuristics can effectively compute more *i*-good lemmas, and thus improve the performance of all the above checkers.

## **1 Introduction**

Safety model checking is a fundamental problem in verification. The goal is to prove that all the reachable states of the transition system ⟨I, T⟩ satisfy a property P. The field has been dominated by SAT-based techniques since the introduction of Bounded Model Checking (BMC) [9]. The first wave of SAT-based model-checking algorithms, including BMC, k-induction [31] and Interpolation-based Model Checking [25], has been superseded by the research deriving from the seminal work of Bradley [11]. The IC3 algorithm maintains an over-approximate state sequence for proving correctness; it avoids unrolling the transition relation by localizing reasoning to *frames*, used to incrementally build an inductive invariant by discovering inductive clauses.

IC3 (also known as PDR [17]) has spawned several variants, including those that attempt to combine forward and backward search [29]. Particularly relevant in this paper is CAR (Complementary Approximate Reachability), which combines the forward overapproximation with a backward underapproximation [23].

It has been noted that different ways to refine the over-approximating sequence can impact the performance of the algorithm. For example, [21] attempts to discover *good* lemmas, which can be "pushed to the top" since they are inductive. In this paper, we propose an alternative way to drive the refinement of the over-approximating sequence. We identify *i-good lemmas*, i.e., lemmas that are inductive with respect to the i-th over-approximating level. The intuition is that such i-good lemmas are useful in the search, since they are fundamental to reach a fixpoint in the safe case. In order to guide the search towards the discovery of i-good lemmas, we propose a heuristic approach based on two key insights, i.e., branching and refer-skipping. First, with branching we try to control the way the SAT solver extracts unsatisfiable cores by privileging variables occurring in i-good lemmas. Second, we control lemma generalization by avoiding dropping literals occurring in a subsuming lemma in the previous layer (refer-skipping).

The proposed approach is applicable both to IC3/PDR and CAR, and it is very simple to implement. Yet, it appears to be quite effective in practice. We implemented the i-good lemma heuristics in two open-source implementations of IC3 and CAR, and also in the mature, state-of-the-art IC3 implementation available inside the nuXmv model checker [12], and we carried out an extensive experimental evaluation on Hardware Model Checking Competition (HWMCC) benchmarks. Analysis of the results suggests that increasing the ratio of i-good lemmas leads to an increase in performance, and the heuristics appear to be quite effective in driving the search towards i-good lemmas. In terms of performance, this results in significant improvements for all the tools when equipped with the proposed approach.

This paper is structured as follows. In Sect. 2 we present the problem and the IC3/PDR and CAR algorithms. In Sect. 3 we present the intuition underlying i-good lemmas and the algorithms to find them. In Sect. 4 we overview the related work. In Sect. 5 we present the experimental evaluation. In Sect. 6 we draw some conclusions and present directions for future work.

## **2 Preliminaries**

#### **2.1 Boolean Transition System**

A Boolean transition system *Sys* is a tuple ⟨X, Y, I, T⟩, where X and X′ denote the sets of state variables in the present state and the next state, respectively, and Y denotes the set of input variables. The state space of *Sys* is the set of possible assignments to X. I(X) is a Boolean formula corresponding to the set of initial states, and T(X, Y, X′) is a Boolean formula representing the transition relation. State s<sub>2</sub> is a successor of state s<sub>1</sub> with input y iff s<sub>1</sub> ∧ y ∧ s′<sub>2</sub> ⊨ T, which is also denoted by (s<sub>1</sub>, y, s<sub>2</sub>) ∈ T. In the following, we will also write (s<sub>1</sub>, s<sub>2</sub>) ∈ T, meaning that (s<sub>1</sub>, y, s<sub>2</sub>) ∈ T for some assignment y to the input variables. A *path* of length k is a finite state sequence s<sub>1</sub>, s<sub>2</sub>, ..., s<sub>k</sub>, where (s<sub>i</sub>, s<sub>i+1</sub>) ∈ T holds for 1 ≤ i ≤ k−1. A state t is reachable from s in k steps if there is a path of length k from s to t. Let S be a set of states in *Sys*. We overload T and denote the set of successors of states in S as T(S) = {t | (s, t) ∈ T, s ∈ S}. Conversely, we define the set of predecessors of states in S as T<sup>−1</sup>(S) = {s | (s, t) ∈ T, t ∈ S}. Recursively, we define T<sup>0</sup>(S) = S and T<sup>i+1</sup>(S) = T(T<sup>i</sup>(S)) for i ≥ 0; the notation T<sup>−i</sup>(S) is defined analogously. In short, T<sup>i</sup>(S) denotes the states that are reachable from S in i steps, and T<sup>−i</sup>(S) denotes the states that can reach S in i steps.

#### **2.2 Safety Checking and Reachability Analysis**

Given a transition system *Sys* = ⟨X, Y, I, T⟩ and a safety property P, which is a Boolean formula over X, a model checker either proves that P holds for any state reachable from an initial state in I, or disproves P by producing a *counterexample*. In the former case, we say that the system is *safe*, while in the latter case, it is *unsafe*. A *counterexample* is a finite path from an initial state s to a state t violating P, i.e., t ∈ ¬P, and such a state is called a *bad* state. In symbolic model checking, safety checking is reduced to symbolic reachability analysis. Reachability analysis can be performed as a forward or backward search. Forward search starts from the initial states I and searches for bad states by computing T<sup>i</sup>(I) with increasing values of i, while backward search begins with the states in ¬P and searches for initial states by computing T<sup>−i</sup>(¬P) with increasing values of i. Table 1 gives the corresponding formal definitions.


**Table 1.** Exact reachability analysis.

For forward search, F<sub>i</sub> denotes the set of states that are reachable from I within i steps, which is computed by iteratively applying T. At each iteration, we first compute a new F<sub>i</sub>, and then perform the safe check and the unsafe check. If either check hits, the search terminates. Intuitively, the unsafe check F<sub>i</sub> ∩ ¬P ≠ ∅ indicates that some bad states are within F<sub>i</sub>, and the safe check F<sub>i+1</sub> ⊆ ⋃<sub>0≤j≤i</sub> F<sub>j</sub> indicates that all states reachable from I have been checked and none of them violates P. For backward search, B<sub>i</sub> is the set of states that can reach ¬P in i steps, and the search procedure is analogous to the forward one.
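
The forward search loop can be sketched on a tiny explicit-state example. The system below (states, transition relation, and property) is our own toy, with state sets encoded as bit sets; the two checks in the loop correspond directly to the unsafe and safe checks of Table 1.

```c
#include <stdint.h>

// Toy forward reachability (our own illustration of the forward search in
// Table 1). States are bits 0..3 of a uint32_t; the transition relation is
// hard-coded: 0 -> 1, 1 -> 2, 2 -> 1 (state 3 is unreachable).
uint32_t T_post(uint32_t S) {                 // T(S): successors of S
    uint32_t out = 0;
    if (S & (1u << 0)) out |= 1u << 1;
    if (S & (1u << 1)) out |= 1u << 2;
    if (S & (1u << 2)) out |= 1u << 1;
    return out;
}

// Forward search: returns 1 (safe) or 0 (unsafe), given initial states I
// and bad states notP, each encoded as a bit set.
int forward_safe(uint32_t I, uint32_t notP) {
    uint32_t Fi = I, reached = 0;
    for (;;) {
        if (Fi & notP) return 0;              // unsafe check: Fi intersects notP
        reached |= Fi;                        // accumulate the union of the Fj
        Fi = T_post(Fi);                      // F(i+1) := T(Fi)
        if ((Fi & ~reached) == 0) return 1;   // safe check: F(i+1) within union Fj
    }
}
```

With I = {0}, the search is safe for bad set {3} (unreachable) and unsafe for bad set {2}.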

**Notations.** A *literal* is an atomic variable or its negation. If l is a literal, we denote its corresponding variable with var(l). A *cube* (resp. *clause*) is a conjunction (resp. disjunction) of literals. The negation of a clause is a cube and vice versa. A formula in *Conjunctive Normal Form* (CNF) is a conjunction of clauses. For simplicity, we also treat a CNF formula φ as a set of clauses and make no difference between the formula and its set representation. Similarly, a cube or a clause c can be treated as a set of literals or a Boolean formula, depending on the context.
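
These notations have a standard concrete encoding, sketched below under the common DIMACS convention (our choice of representation, not mandated by the paper): a literal is a nonzero integer, negation is arithmetic negation, and a clause or cube is an integer array read as a set of literals.

```c
#include <stdlib.h>

// DIMACS-style encoding of the notations (a common convention, our choice):
// literal l is a nonzero int, var(l) strips the sign, negation flips it.
int var_of(int l) { return abs(l); }       // var(l)
int neg_lit(int l) { return -l; }          // the negation of literal l

// The negation of a clause is a cube (and vice versa): negate each literal
// and read the connective dually, e.g. neg(-1 v 2 v -3) = (1 ^ -2 ^ 3).
void negate_all(const int *in, int *out, int n) {
    for (int i = 0; i < n; i++) out[i] = -in[i];
}
```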

We say a CNF formula φ is satisfiable if there exists an assignment of its Boolean variables, called a *model*, that makes φ true; otherwise, φ is unsatisfiable. A SAT solver is a tool that can decide the satisfiability of a CNF formula φ. In addition to providing a yes/no answer, modern SAT solvers can also produce *models* for satisfiable formulas, and *unsatisfiable cores* (UC), i.e. a reason for unsatisfiability, for unsatisfiable ones. More precisely, in the following we shall assume to have a SAT solver that supports the following API (which is standard in state-of-the-art SAT solvers based on the CDCL algorithm [24]):


#### **2.3 Overview of** IC3 **and** CAR

IC3 is a SAT-based and complete safety model checking algorithm proposed in [11], which only needs to unroll the system at most once. PDR [17] is a reimplementation of IC3 which optimizes the original version in different aspects. To prove the correctness of a given system *Sys* = ⟨X, Y, I, T⟩ w.r.t. the safety property P, IC3/PDR maintains a monotone over-approximate state sequence O such that (1) O<sub>0</sub> = I and (2) O<sub>i+1</sub> ⊇ O<sub>i</sub> ∪ T(O<sub>i</sub>) for i ≥ 0. From the perspective of reachability analysis, IC3 performs as shown in the left part of Table 2. Since O is monotone, the state search can converge as soon as O<sub>i+1</sub> = O<sub>i</sub> holds for some i ≥ 0. Otherwise, a state path (counterexample) from I to some state in ¬P can be detected (T<sup>−i</sup>(¬P) ∩ I ≠ ∅).

**Table 2.** A high-level description of IC3 (left) and (Forward) CAR (right).


CAR [23] is a recently proposed algorithm, which can be considered as a general version of IC3. The main points in which CAR differs from IC3 are as follows:



An overview of IC3 and (forward) CAR is shown in Algorithm 1 and Algorithm 2, respectively. At a high level, both algorithms have a similar structure, consisting of an alternation of two phases: the unsafe check and the safe check. The unsafe check (line 14 of Algorithm 1, line 14 of Algorithm 2) tries to find a state sequence that is a path between I and ¬P; if such a sequence can be found, then it is a counterexample witnessing the violation of P; otherwise, the O<sub>i</sub> are strengthened with additional clauses until O<sub>k</sub> is strong enough to imply P.<sup>1</sup> The safe check (line 25 of Algorithm 1, line 26 of Algorithm 2) tries to propagate the clauses in O<sub>i</sub> to O<sub>i+1</sub> and check if a fixpoint is reached. If so, the algorithm terminates. Both algorithms make use of similar additional procedures, which will be detailed in the following section, when we introduce our novel heuristics.

## **3 Finding** *i***-Good Lemmas**

In this section, we introduce the concept of *i-good lemmas*, define the heuristics to steer the search towards i-good lemmas, and describe the IC3 and CAR algorithms enhanced with i-good lemmas. For convenience of description, we fix the input system *Sys* = ⟨X, Y, I, T⟩ and the property P to be verified. In describing the implementation of our heuristics, we shall necessarily assume that the reader has some familiarity with the low-level details of IC3 and CAR, for which we refer to [11,17,23]. Specifically, we shall use pseudo-code descriptions of the main components of the algorithms (Algorithms 3, 4, and 5), in which the modifications required to implement our heuristics are highlighted in blue.

<sup>1</sup> Note that in the unsafe check, the meaning of the SAT query **is_SAT**(O<sub>i</sub> ∧ T, s′) differs between CAR and IC3 (line 15 of Algorithm 2), so that when it is unsatisfiable the obtained clauses have different semantics.

## **3.1 What Are** *i***-good Lemmas**

The over-approximate state sequence O in IC3 (resp. CAR) is a finite sequence, in which every element O<sub>i</sub> (0 ≤ i < |O|), namely *frame* i, is an over-approximation of the states of the system that are reachable in up to (resp. exactly) i steps from I, and which is strong enough to imply P. Such a sequence O has the form P ∧ C, where C is a CNF formula, and each clause in C is called a *lemma*. For both algorithms, the goal is that of transforming the sequence O to construct an over-approximation of all the reachable states of the system (over an unbounded horizon) that still implies P. When this happens, such an over-approximation is an inductive invariant that proves P. The key idea, common to both IC3 and CAR, is to construct the invariant *incrementally* and by reasoning in a *localized manner*, by (i) considering increasingly-long sequences of over-approximations, and by (ii) trying to propagate forward individual lemmas from a frame O<sub>i</sub> to its successor O<sub>i+1</sub>, until a fixpoint is reached<sup>2</sup>. The forward propagation procedure is crucial for ensuring the convergence of the algorithm in practice: for IC3 (resp. CAR), it checks whether a lemma c at frame i is also an over-approximation of all the states reachable in up to (resp. exactly) i + 1 steps, and can therefore be added to frame i + 1. It is immediate to see that the successful propagation of *all* lemmas from i to i + 1, for some i, is a sufficient condition for the termination of both IC3 and CAR with a safe result. In fact, for IC3, this is also a necessary condition.

We now introduce the notion of i*-good lemma*.

**Definition 1 (**i**-Good Lemma).** *Let* c *be a lemma that was added at frame* i *by* IC3*/*CAR *(at some previous step in the execution of the algorithm), i.e.,* O<sub>i</sub> ⊨ c*. We say that* c *is* i*-good if* c *now holds also at frame* i + 1*, i.e.,* O<sub>i+1</sub> ⊨ c*.*

The following theorems are then consequences of the definition.

**Theorem 1.** IC3 *terminates with safe at frame* i *(*i > 0*), if and only if every lemma at frame* i *is* i*-good.*

**Theorem 2.** CAR *terminates with safe at frame* i *(*i > 0*), if every lemma at frame* i *is* i*-good.*

Such theorems provide the theoretical foundation on which we base our main conjecture: the computation of i-good lemmas can be helpful for both IC3 and CAR to accelerate the convergence in proving properties. Intuitively, an i-good lemma shows the promise of being independent of the reachability layer, and hence holds in general.

<sup>2</sup> The algorithms differ in the way they check reaching the fixpoint, but this difference will be ignored unless otherwise stated.

## **3.2 Searching for** *i***-good Lemmas**

Our conjecture is that there exists, on average, a positive correlation between the ratio of i-good lemmas to the total number of lemmas computed by IC3/CAR during generalization and the efficiency of the algorithm.

Ensuring that only i-good lemmas are produced is as hard as solving the verification problem itself, since this is essentially equivalent to synthesizing an inductive invariant which implies P. However, there are two situations in which it is easy to *identify* i-good lemmas, for both IC3 and CAR:


Therefore, we do not attempt to compute only i-good lemmas, but rather, our main idea is to use some (cheap) heuristics to increase the probability of producing i-good lemmas during the normal execution of IC3 and CAR.

We exploit the above observations to design two heuristics that try to bias the search for lemmas towards those that are more likely to be i-good, which we call respectively branching and refer-skipping.

**Branching.** The branching strategy [26] is an important feature of modern CDCL (Conflict-Driven Clause Learning) SAT solvers [7]. Traditional scoring schemes for branching such as VSIDS and EVSIDS have been extensively evaluated in [10]. In CDCL SAT solvers, decision variables are selected according to their priority. Whenever a conflict occurs, the priority of each variable in the clause is increased. To this end, variables that have recently been involved in conflicts are more likely to be selected as decision variables.

We adopt a similar idea in our branching heuristic for IC3/CAR to bias the unsatisfiable cores produced by the SAT solver, by ordering the assumptions in SAT queries according to their score. This is based on the fact that modern SAT solvers based on CDCL apply the assumption literals in the order given by the user, and (as a consequence of how CDCL works) the unsatisfiable core produced when the formula is unsatisfiable depends on such order, with literals occurring earlier in the assumption list being more likely to be included in the core. For example, assume the SAT query is **is SAT**(¬1 ∧ (2 ∨ ¬3), 1 ∧ ¬2 ∧ 3), which is unsatisfiable, then the returned UC from the SAT solver, e.g., Minisat [5,18], will be {1}. If the order of assumptions is changed to 3 ∧ ¬2 ∧ 1, then the UC will be {3,¬2}.

Since UCs are the source for lemmas in both IC3 and CAR, the first idea of our branching heuristic is that of sorting the assumption literals in SAT queries according to *how often they occur in recent* i- *good lemmas*. Concretely, this is implemented as follows:

– We introduce a mapping S : v → score<sub>v</sub>, v ∈ X, from each variable to its score (priority). Initially, all variables have the same score of 0.


In order to determine whether generalize produced an i-good lemma, we also use the function get parentnode(c) (line 3 of Algorithm 3), which returns a cube p in frame i − 1 such that p ⊆ c when c belongs to frame i. (If multiple such p exist, the one with the highest score is returned).

– When performing inductive generalization of a lemma c at frame i (Algorithm 3), in which c is strengthened by trying to drop literals from it as long as the result is still a valid lemma for frame i, the literals of c are sorted in increasing order of S[*var*(*l*)], with l ∈ c. This corresponds to the call to the function reverse sort(c) at line 2 of Algorithm 3 in the pseudo-code.
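
The ordering step itself can be sketched as follows. This is our own toy code, not the tools' implementation: assumption literals are sorted in descending order of S[var(l)] before a SAT query, so that literals from recent i-good lemmas come first in the assumption list and are more likely to end up in the unsatisfiable core.

```c
#include <stdlib.h>

// Sketch (our own toy) of the branching heuristic's assumption ordering:
// sort assumption literals in descending order of the score of their
// variable, so high-score literals are more likely to enter the UC.
int S[8];                                  // S[v]: score of variable v (toy size)

static int by_score_desc(const void *a, const void *b) {
    int la = *(const int *)a, lb = *(const int *)b;
    return S[abs(lb)] - S[abs(la)];        // higher score first
}

void sort_assumptions(int *lits, int n) {
    qsort(lits, n, sizeof *lits, by_score_desc);
}
```

For example, with S[3] = 9, S[1] = 5, S[2] = 1, the assumption list 1 ∧ ¬2 ∧ 3 is reordered to 3 ∧ 1 ∧ ¬2.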


**Skipping Literals by Reference.** Lemma generalization is a crucial process in IC3/CAR that affects performance significantly. Given the original lemma c to be added to frame i (i > 0), the generalize procedure tries to compute a new lemma g such that g ⊆ c and g is also valid to be added to frame i (O<sub>i</sub>). The main idea of generalization is to try to drop literals from the original lemma one by one, checking whether what remains is still a valid lemma.

There are several generalization algorithms with different trade-offs between efficiency (in terms of the number of SAT queries) and effectiveness (in terms of the potential reduction in the size of the generalized lemma), e.g. [11,17,20]. More generally, there might be multiple different ways in which a lemma c can be generalized, with results of incomparable strength (i.e., there might be both g<sub>1</sub> ⊆ c and g<sub>2</sub> ⊆ c such that g<sub>1</sub> ⊄ g<sub>2</sub> and g<sub>2</sub> ⊄ g<sub>1</sub>).

The main idea of the refer-skipping heuristic is to bias the generalization so as to increase the likelihood that the result g is an (i − 1)-good lemma. Consider the generalization of lemma c = ¬1 ∨ 2 ∨ ¬3 at frame i (i > 1). If there is already a

#### **Algorithm 5.** Auxiliary functions for CAR

```
 1: function get_predecessor(s, i)        // generalization of predecessors
 2:   assert(is_SAT(Oi ∧ T, s′))          // precondition: ∃t such that (t, s) ∈ T
 3:   µ := get_model()
 4:   in := {l ∈ µ | var(l) ∈ Y}
 5:   t := {l ∈ µ | var(l) ∈ X}
 6:   sort(t)                             // sort literals in t in descending order of priority
 7:   while not is_SAT(Oi ∧ in ∧ ¬s′, t) do
 8:     if t = get_UC() then
 9:       break
10:     t := get_UC()
11:   return t
12:
13: function down(c, i, rec_lvl)          // CTG-based dropping of literals
14:   cex_num := 0
15:   while true do
16:     if not is_SAT(Oi ∧ T, c′) then
17:       c := {l | l′ ∈ get_UC()}
18:       return true
19:     else if rec_lvl > MAX_REC_LVL then          // MAX_REC_LVL = 3
20:       return false
21:     else
22:       cex := get_predecessor(c, i)
23:       sort(cex)                       // sort literals in cex in descending order of priority
24:       if cex_num < MAX_CEX_NUM and i > 0
             and not is_SAT(Oi−1 ∧ T, cex) then     // MAX_CEX_NUM = 3
25:         ccex := generalize({l | l′ ∈ get_UC()}, i − 1, rec_lvl + 1)
26:         Oi−1 := Oi−1 ∩ ¬ccex
27:         cex_num := cex_num + 1
28:       else
29:         return false
30:
31: function propagation(k)
32:   i := 1
33:   for i < k do
34:     for ¬c ∈ Oi do
35:       if not is_SAT(Oi ∧ T, c′) then
36:         Oi+1 := Oi+1 ∩ ¬c
37:         reward(c)                     // raise the priority of the variables in c
```
lemma g = ¬1 ∨ ¬3 at frame i − 1, we say that g is a *candidate* (i − 1)*-good lemma* for the generalization of c. In order to drive the generalization of c towards g, we *blacklist* the literals of g, so that generalize never attempts to drop them from c. We therefore call g a *reference* for skipping generalization. In general, there might be multiple references for a given lemma; currently, the strategy of refer-skipping is simply to pick the first one found.

The implementation of refer-skipping builds on existing generalization algorithms and requires fewer than 10 additional lines of pseudo-code (see lines 4-10 of Algorithm 3). As shown in the algorithm, a variable set req stores the variables that failed to be dropped, so that they are not tried again later. To implement refer-skipping, we simply initialize req with the variables occurring in the candidate (i − 1)-good lemma returned by the get parentnode procedure (line 3 of Algorithm 3).
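Under the same assumptions as before (a hypothetical `is_valid_lemma` oracle, integer literals), the refer-skipping extension only seeds the `req` set with the variables of the reference lemma before the drop loop; variables that fail to drop are added to `req` as well:

```python
def generalize_refer_skipping(lemma, frame, reference, is_valid_lemma):
    """Sketch of refer-skipping on top of drop-literal generalization.

    `reference` is a candidate (i-1)-good lemma; its variables seed the
    `req` blacklist, so they are never dropped from `lemma`.  Variables
    that fail to be dropped are also added to `req` and never retried.
    """
    req = {abs(l) for l in reference}     # blacklist: variables of the reference
    g = list(lemma)
    for lit in list(lemma):
        if abs(lit) in req or lit not in g:
            continue                      # blacklisted or already removed
        cand = [l for l in g if l != lit]
        if cand and is_valid_lemma(cand, frame):
            g = cand                      # drop succeeded
        else:
            req.add(abs(lit))             # failed to drop: do not retry
    return g
```

With c = ¬1 ∨ 2 ∨ ¬3 and reference g = ¬1 ∨ ¬3, only the literal 2 is ever tried, so a single oracle call reaches ¬1 ∨ ¬3.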

Finally, note that although our pseudo-code (and our implementation) uses the CTG algorithm of [20], the idea discussed here applies just as easily to the other generalization variants.

## **4 Related Work**

In the field of safety model checking, several variants of IC3 [11] have been presented since its introduction: [20] presents the counterexample-guided generalization (CTG) of a lemma, which blocks states that interfere with it and significantly improves the performance of IC3; AVY [33] brings the ideas of IC3 into IMC (Interpolant Model Checking) [25], yielding a better model checking algorithm; its successor kAVY [32] uses k-induction to guide the interpolation and the IC3/PDR generalization; [28] proposes to combine IC3/PDR with reverse IC3/PDR; the subsequent work [29] interleaves a forward and a backward execution of IC3 and strengthens one frame sequence by leveraging the proof obligations of the other; IC3-INN [15] lets IC3 leverage the internal signal information of the system, yielding a variant that performs better on certain industrial benchmarks; [30] introduces under-approximation in PDR to improve the performance of bug-finding.

The importance of discovering inductive lemmas for improving convergence was first noted in [17]. In PDR terminology, inductive lemmas are those belonging to frame O∞, as they represent an over-approximation of all the reachable states.

The most relevant related work is [21], where a variant of IC3 named QUIP is proposed to implement the pushing of discovered lemmas to O∞. In essence, QUIP adds the negation of a discovered lemma c as a *may*-proof-obligation, hence trying to push c to the next frame. Counterexamples to may-proof-obligations represent an under-approximation of the reachable states and are stored to disprove the inductiveness of other lemmas. In QUIP terminology, such lemmas are classified as *bad lemmas*, as they have no chance of being part of the inductive invariant. Since the pushing is not limited to the current number of frames, inductive lemmas are discovered when all the clauses of a frame can be pushed (O*k* \ O*k*+1 = ∅ for some level k), and are then added to O∞. In QUIP terminology, lemmas belonging to O<sup>∞</sup> are classified as *good lemmas*, and are always kept during the algorithm. Observe that the concept of *good* lemma in [21] is a stronger version of Definition 1, which instead is *local* to a frame i and characterizes lemmas that can be propagated one frame ahead.

Both QUIP and our heuristics pursue a similar goal: prioritizing the use of already discovered lemmas during generalization. There are however several differences: QUIP proceeds by adding additional proof obligations to the queue and by progressively proving the inductiveness of a lemma relative to some frame. Our approach, on the other hand, is based on a cheap heuristic strategy that *locally* guides the generalization by prioritizing the locally good lemmas. Some of the computed i-good lemmas may not be part of the final invariant and cannot be pushed further; in QUIP, such lemmas would not be considered good. In our view, pushing them is not necessarily a waste of effort, because they still strengthen the frames and their presence might be necessary to deduce the final invariant. Finally, it is worth mentioning that our heuristics are much simpler to implement and integrate into different PDR-based engines.

The idea of ordering literals during inductive generalization was already proposed in [11] and adopted, as a default strategy, in several implementations of IC3 [3,17,19], yielding modest improvements on HWMCC benchmarks, without clear trends identified (see [17,19]). Compared to such works, our approach has two main differences. First, these heuristics favor literals occurring more frequently in all previous frames, whereas our approach is driven by the role of lemmas and prefers the variables occurring in those that are i-good. Second, our use of ordering heuristics is more pervasive: unlike previous works, where variable ordering is used only during lemma generalization, we apply ordering everywhere the SAT results affect the search direction, which makes it more effective at biasing the search.

## **5 Evaluation**

#### **5.1 Experimental Setup**

We integrated the branching and refer-skipping heuristics into three systems: the IC3Ref [3] and SimpleCAR [6] (open-source) model checkers, which implement the IC3 and (Forward and Backward<sup>3</sup>) CAR algorithms respectively, and the mature, state-of-the-art implementation of IC3 available inside the nuXmv model checker [12]. We make our implementations and data for reproducing the experiments available at https://github.com/youyusama/i-Good Lemmas MC.

Since our approach is related to QUIP [21], we include the evaluation of QUIP, and IC3 (mainly as the baseline for QUIP), as implemented<sup>4</sup> in IIMC [4]. We also consider the PDR implementation in the ABC model checker [1], which is state-of-the-art in hardware model checking.


**Table 3.** Tools and algorithms evaluated in the experiments.

Table 3 summarizes the tested tools, algorithms, and their flags. We use the flag "-br" to enable the branching heuristic and "-rs" to enable refer-skipping. Furthermore, we also evaluate another configuration (denoted "-sh"), in which the calls to the sort() functions in Algorithms 4 and 5 are replaced by random

<sup>3</sup> Although there is an implementation of Backward CAR in SimpleCAR, this methodology corresponds to reverse IC3. We therefore did not include Backward CAR in this paper and leave its evaluation to future work.

<sup>4</sup> As far as we know, this is the only publicly available QUIP implementation.

shuffles, thus simulating a strategy that orders variables randomly. When no flag is active, IC3Ref runs the instances with its own strategy of sorting variables, present in the original implementation.

We evaluate all the tools on 749 benchmarks, in *aiger* format, of the SINGLE safety property track of the 2015 and 2017 editions of HWMCC [8] <sup>5</sup>. We ran the experiments on a cluster, which consists of 2304 2.5GHz CPUs in 240 nodes running RedHat 4.8.5 with a total of 96GB RAM. For each test, we set the memory limit to 8GB and the time limit to 5 h. During the experiments, each model-checking run has exclusive access to a dedicated node.

To increase our confidence in the correctness of the results, we compare the results of the solvers to make sure they are all consistent (modulo timeouts). For the cases with unsafe results, we also check the provided counterexample with the *aigsim* tool from the Aiger package [2]. We found no discrepancies in the results, and all unsafe cases successfully pass the *aigsim* check.

#### **5.2 Experimental Results**

**Overview.** The results of the experimental evaluation are discussed below. We first consider the aggregated results, as reported in Table 4. For each tool, we group the results obtained with the various configurations; we report the total number of benchmarks solved, distinguishing between safe and unsafe benchmarks; we also report the benchmarks gained and lost by the configurations with branching and/or refer-skipping active, relative to the baseline where branching and refer-skipping are not active. We can draw the following conclusions.


<sup>5</sup> Since HWMCC 2019, the official format used in the competition has switched from Aiger to Btor2 [27], a format for word-level model checking. As a result, we did not include those instances in our experiments.



**Table 4.** Summary of overall results among different configurations.

Similar insights can be obtained from Fig. 1, which clearly shows the positive effect of the heuristics on performance.

**Detailed Statistics.** As shown in Table 4 and Fig. 1, nuXmv is highly optimized and has a much better performance than other open-source IC3 implementations, but enabling both heuristics is still useful to improve its overall performance by solving 34 more instances. For IC3Ref and SimpleCAR, the increased numbers of solved cases are 19 and 53, respectively. Moreover, from Table 4, nuXmv/IC3Ref/SimpleCAR is able to solve 24/14/43 more safe and 10/5/10 more unsafe instances with both heuristics.

A comparison of the performance of the tools with and without the heuristics is shown in Fig. 2. All three solvers reduce their time cost when equipped with branching and refer-skipping (see the last row of the figure). Specifically, 67.8% of the instances take less than or equal time with 'nuXmv -br -rs', and the corresponding portions for 'ic3 -br -rs' and 'fcar -br -rs' are 77.9% and 87.0%. Results are more variable when only a single heuristic is enabled, which needs to be explored in the future. For example, 'fcar -br' and 'nuXmv -rs' generally cost slightly more time than 'fcar' and 'nuXmv', respectively.

**Fig. 1.** Comparisons among the implementations of IC3, PDR and CAR under different configurations. (To make the figure more readable, we skip the results with a single heuristic, which are still shown in Table 4.)

According to Table 4, either branching or refer-skipping is effective for improving nuXmv, IC3Ref, and SimpleCAR. For nuXmv and SimpleCAR, branching is more useful, considering that 'nuXmv -br' (resp. 'fcar -br') solves 39 (resp. 38) more instances than 'nuXmv' (resp. 'fcar'), with 31 (resp. 32) safe and 8 (resp. 6) unsafe. For IC3Ref, the improvement with either heuristic seems relatively modest, i.e., 'ic3 -br' solves 8 more instances than 'ic3', with 3 safe and 5 unsafe, while 'ic3 -rs' solves 10 more instances than 'ic3', with 9 safe and 1 unsafe.

As listed above, 'ic3 -br -rs' loses only 6 instances that are solved by 'ic3', while 'fcar -br -rs' loses only 1 instance that is solved by 'fcar', indicating that 'fcar -br -rs' almost dominates 'fcar'. For 'nuXmv -br -rs', the number of lost cases is 15, which is still modest compared to the gain of 49. Enabling branching and refer-skipping together thus costs the checkers little. The same applies when only a single heuristic is enabled, see Table 4.

#### **5.3 Why Do** branching **and** refer-skipping **Work?**

To understand why branching and refer-skipping work, we introduce sr, the **s**uccess **r**ate in computing i-good lemmas. Formally, sr = N*g*/N, where N*<sup>g</sup>* is

**Fig. 2.** Time comparison between IC3/CAR with and without the two heuristics on safe/unsafe cases. The baseline is always on the y-axis. Points above the diagonal indicate better performance with the heuristics active. Points on the borders indicate timeouts (18000 s).

**Fig. 3.** Comparison on the success rate (*sr*) to compute i-good lemmas between IC3/CAR with and without branching and refer-skipping.

the number of generalizations that successfully return i-good lemmas, and N is the total number of generalization calls. We instrumented the two open-source checkers IC3Ref and SimpleCAR to compute sr for each run (including runs that reached the timeout without returning a result).
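The sr statistic can be collected with a counter like the following minimal instrumentation sketch (the class and method names are ours, not from the checkers):

```python
class SrCounter:
    """Success-rate instrumentation sketch: sr = N_g / N, where N_g counts
    generalizations that returned an i-good lemma and N counts all calls."""

    def __init__(self):
        self.total = 0   # N: total generalization calls
        self.good = 0    # N_g: calls that returned an i-good lemma

    def record(self, lemma_is_i_good):
        """Call once per generalization, with the outcome of the i-good check."""
        self.total += 1
        if lemma_is_i_good:
            self.good += 1

    @property
    def sr(self):
        return self.good / self.total if self.total else 0.0
```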


Clearly, the plots for both IC3 and CAR in Fig. 4 support the conjecture that finding more i-good lemmas helps achieve better model-checking performance (time cost).

**Fig. 4.** Comparison between the deviation of the success rate (*sr*) to compute i-good lemmas (Y axis) and the deviation of checking (CPU) time (X axis) for IC3/CAR with and without the heuristics. For each instance, let the checking time and success rate of 'ic3'/'fcar' be *t* and *sr*, and those of 'ic3 -br -rs'/'fcar -br -rs' be *t'* and *sr'*. Each point has *t* − *t'* as the x value and *sr'* − *sr* as the y value.

Finally, we argue that computing as many i-good lemmas as possible is the direction to take to improve the performance of IC3 and its variants. branching and refer-skipping are two heuristics that can enable IC3/CAR to compute more i-good lemmas. However, there can be more efficient ways to compute i-good lemmas, which is left for our future work.

## **6 Conclusions and Future Work**

In this paper, we proposed a heuristic-based approach to improve the performance of IC3-based safety model checking. The idea is to steer the search of the over-approximation sequence towards i-good lemmas, i.e. lemmas that can be pushed from frame i to frame i + 1. On the one side, we attempt to control the way the SAT solver extracts the unsat cores, by privileging variables occurring in i-good lemmas (branching); on the other, we control lemma generalization by avoiding dropping literals that occur in a subsuming lemma in the previous layer (refer-skipping). The approach is very simple to implement and has been integrated into two open-source model checkers and an industrial-strength, closed-source model checker. The experimental evaluation, carried out on a wide set of benchmarks, shows that the approach yields computational benefits on all the implementations. Further analysis shows a correlation between i-good lemmas and performance improvements and suggests that the proposed heuristics are effective in finding more i-good lemmas.

In the future, we plan to investigate the reasons for performance improvement/degradation at the level of the single benchmarks. We will also attempt to integrate the proposed ideas with the ideas in QUIP, explore different kinds of heuristics, and lift this approach to the safety checking of infinite-state systems [13,14].

**Acknowledgment.** We thank anonymous reviewers for their helpful comments. This work is supported by National Natural Science Foundation of China (Grant #U21B2015 and #62002118) and Shanghai Collaborative Innovation Center of Trusted Industry Internet Software. This work has been partly supported by the project "AI@TN" funded by the Autonomous Province of Trento and by the PNRR project FAIR - Future AI Research (PE00000013), under the NRRP MUR program funded by the NextGenerationEU.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Second-Order Hyperproperties**

Raven Beutner , Bernd Finkbeiner , Hadar Frenkel(B) , and Niklas Metzger

CISPA Helmholtz Center for Information Security, Saarbrücken, Germany {raven.beutner,finkbeiner,hadar.frenkel, niklas.metzger}@cispa.de

**Abstract.** We introduce Hyper<sup>2</sup>LTL, a temporal logic for the specification of hyperproperties that allows for second-order quantification over sets of traces. Unlike first-order temporal logics for hyperproperties, such as HyperLTL, Hyper<sup>2</sup>LTL can express complex epistemic properties like common knowledge, Mazurkiewicz trace theory, and asynchronous hyperproperties. The model checking problem of Hyper<sup>2</sup>LTL is, in general, undecidable. For the expressive fragment where second-order quantification is restricted to smallest and largest sets, we present an approximate model-checking algorithm that computes increasingly precise under- and overapproximations of the quantified sets, based on fixpoint iteration and automata learning. We report on encouraging experimental results with our model-checking algorithm, which we implemented in the tool HySO.

## **1 Introduction**

About a decade ago, Clarkson and Schneider coined the term *hyperproperties* [21] for the rich class of system requirements that relate multiple computations. In their definition, hyperproperties generalize trace properties, which are sets of traces, to *sets of* sets of traces. This covers a wide range of requirements, from information-flow security policies to epistemic properties describing the knowledge of agents in a distributed system. Missing from Clarkson and Schneider's original theory was, however, a concrete specification language that could express customized hyperproperties for specific applications and serve as the common semantic foundation for different verification methods.

A first milestone towards such a language was the introduction of the temporal logic HyperLTL [20]. HyperLTL extends linear-time temporal logic (LTL) with quantification over traces. Suppose, for example, that an agent i in a distributed system observes only a subset of the system variables. The agent *knows* that some LTL formula ϕ is true on some trace π iff ϕ holds on *all* traces π′ that agent i cannot distinguish from π. If we denote the indistinguishability of π and π′ by π ∼<sup>i</sup> π′, then the property that *there exists a trace* π *where agent* i *knows* ϕ can be expressed as the HyperLTL formula

$$
\exists \pi. \forall \pi'. \pi \sim\_i \pi' \to \varphi(\pi'),
$$

where we write ϕ(π′) to denote that the trace property ϕ holds on trace π′.

While HyperLTL and its variations have found many applications [28,32,44], the expressiveness of these logics is limited, leaving many widely used hyperproperties out of reach. A prominent example is *common knowledge*, which is used in distributed applications to ensure simultaneous action [30,40]. Common knowledge in a group of agents means that the agents not only know *individually* that some condition ϕ is true, but that this knowledge is "common" to the group in the sense that each agent *knows* that every agent *knows* that ϕ is true; on top of that, each agent in the group *knows* that every agent *knows* that every agent *knows* that ϕ is true; and so on, forming an infinite chain of knowledge.

The fundamental limitation of HyperLTL that makes it impossible to express properties like common knowledge is that the logic is restricted to *first-order quantification*. HyperLTL, then, cannot reason about sets of traces directly, but must always do so by referring to individual traces that are chosen existentially or universally from the full set of traces. For the specification of an agent's individual knowledge, where we are only interested in the (non-)existence of a single trace that is indistinguishable and that violates ϕ, this is sufficient; however, expressing an infinite chain, as needed for common knowledge, is impossible.

In this paper, we introduce Hyper<sup>2</sup>LTL, a temporal logic for hyperproperties with *second-order quantification* over traces. In Hyper<sup>2</sup>LTL, the existence of a trace π where the condition ϕ is common knowledge can be expressed as the following formula (using slightly simplified syntax):

$$\exists \pi. \exists X. \ \pi \in X \land \left( \forall \pi' \in X. \forall \pi''. \left(\bigvee\_{i=1}^n \pi' \sim\_i \pi''\right) \to \pi'' \in X\right) \land \forall \pi' \in X. \varphi(\pi').$$

The second-order quantifier ∃X postulates the existence of a set X of traces that (1) contains π; that (2) is closed under the observations of each agent, i.e., for every trace π already in X, all other traces π′ that some agent i cannot distinguish from π are also in X; and that (3) only contains traces that satisfy ϕ. The existence of X is a necessary and sufficient condition for ϕ being common knowledge on π. In the paper, we show that Hyper<sup>2</sup>LTL is an elegant specification language for many hyperproperties of interest that cannot be expressed in HyperLTL, including, in addition to epistemic properties like common knowledge, also Mazurkiewicz trace theory and asynchronous hyperproperties.

The model checking problem for Hyper<sup>2</sup>LTL is much more difficult than for HyperLTL. A HyperLTL formula can be checked by translating the LTL subformula into an automaton and then applying a series of automata transformations, such as self-composition to generate multiple traces, projection for existential quantification, and complementation for negation [8,32]. For Hyper<sup>2</sup>LTL, the model checking problem is, in general, undecidable. We introduce a method that nevertheless obtains sound results by over- and underapproximating the quantified sets of traces. For this purpose, we study Hyper<sup>2</sup>LTLfp, a fragment of Hyper<sup>2</sup>LTL, in which we restrict second-order quantification to the smallest or largest set satisfying some property. For example, to check common knowledge, it suffices to consider the *smallest* set X that is closed under the observations of all agents. This smallest set X is defined by the (monotone) fixpoint operation that adds, in each step, all traces that are indistinguishable to some trace already in X.
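For a finite universe of traces, the smallest closed set described above can be computed by exactly this monotone fixpoint iteration. The sketch below is illustrative only: `indistinguishable(t, u)` is a hypothetical predicate that holds iff some agent cannot distinguish the two traces, and a finite set `traces` stands in for the system's trace set.

```python
def smallest_closed_set(seed, traces, indistinguishable):
    """Monotone fixpoint sketch: the smallest set X containing `seed`
    that is closed under the agents' indistinguishability relations."""
    X = {seed}
    changed = True
    while changed:                      # iterate until no new trace is added
        changed = False
        for t in X.copy():
            for u in traces:
                if u not in X and indistinguishable(t, u):
                    X.add(u)            # u is indistinguishable from some t in X
                    changed = True
    return X
```

ϕ is then common knowledge on the seed trace iff every trace in the returned set satisfies ϕ.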

We develop an approximate model checking algorithm for Hyper<sup>2</sup>LTLfp that uses bidirectional inference to deduce lower and upper bounds on second-order variables, interposed with first-order model checking in the style of HyperLTL. Our procedure is parametric in an oracle that provides (increasingly precise) lower and upper bounds. In the paper, we realize the oracles with *fixpoint iteration* for underapproximations of the sets of traces assigned to the second-order variables, and *automata learning* for overapproximations. We report on encouraging experimental results with our model-checking algorithm, which has been implemented in a tool called HySO.

## **2 Preliminaries**

For n ∈ N we define [n] := {1,...,n}. We assume that AP is a finite set of atomic propositions and define Σ := 2<sup>AP</sup>. For t ∈ Σ<sup>ω</sup> and i ∈ N we define t(i) ∈ Σ as the ith element of t (starting with the 0th), and t[i,∞] as the infinite suffix starting at position i. For traces t1,...,tn ∈ Σ<sup>ω</sup> we write *zip*(t1,...,tn) ∈ (Σ<sup>n</sup>)<sup>ω</sup> for the pointwise zipping of the traces, i.e., *zip*(t1,...,tn)(i) := (t1(i),...,tn(i)).
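Modeling a trace as a function from positions to letters (with Σ = 2^AP represented as Python frozensets), the zip and suffix operations can be sketched directly; this is a small illustration of the definitions, not part of any tool:

```python
def zip_traces(*traces):
    """zip(t1,...,tn)(i) = (t1(i),...,tn(i)); a trace is modeled as a
    function from positions (int) to letters (frozensets of APs)."""
    return lambda i: tuple(t(i) for t in traces)

def suffix(t, i):
    """t[i,inf]: the trace shifted forward by i positions."""
    return lambda j: t(i + j)
```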

*Transition Systems.* A *transition system* is a tuple T = (S, S0, κ, L) where S is a set of states, S0 ⊆ S is a set of initial states, κ ⊆ S × S is a transition relation, and L : S → Σ is a labeling function. A path in T is an infinite state sequence s0s1s2 ··· ∈ S<sup>ω</sup> such that s0 ∈ S0 and (si, si+1) ∈ κ for all i. The associated trace is L(s0)L(s1)L(s2) ··· ∈ Σ<sup>ω</sup>, and *Traces*(T) ⊆ Σ<sup>ω</sup> denotes the set of all traces of T.
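As a sketch of these definitions, the following enumerates the length-k trace prefixes of a finite transition system (finite prefixes stand in for the infinite traces of the definition; the function is illustrative, not from the paper):

```python
def bounded_traces(S0, kappa, L, k):
    """Length-k trace prefixes of T = (S, S0, kappa, L).

    `S0` is the set of initial states, `kappa` a set of (s, s') transition
    pairs, `L` a dict mapping states to labels, and `k` the prefix length.
    """
    prefixes = [[s] for s in S0]                 # all paths start in S0
    for _ in range(k - 1):                       # extend by one transition at a time
        prefixes = [p + [s2] for p in prefixes
                    for (s1, s2) in kappa if s1 == p[-1]]
    return {tuple(L[s] for s in p) for p in prefixes}
```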

*Automata.* A *non-deterministic Büchi automaton* (NBA) [18] is a tuple A = (Σ, Q, Q0, δ, F) where Σ is a finite alphabet, Q is a finite set of states, Q0 ⊆ Q is the set of initial states, F ⊆ Q is a set of accepting states, and δ : Q × Σ → 2<sup>Q</sup> is the transition function. A run on a word u ∈ Σ<sup>ω</sup> is an infinite sequence of states q0q1q2 ··· ∈ Q<sup>ω</sup> such that q0 ∈ Q0 and, for every i ∈ N, qi+1 ∈ δ(qi, u(i)). The run is accepting if it visits states in F infinitely often, and the language of A, denoted L(A) ⊆ Σ<sup>ω</sup>, is the set of all infinite words on which A has an accepting run.

*HyperLTL.* HyperLTL [20] is one of the most studied temporal logics for the specification of hyperproperties. We assume that V is a fixed set of trace variables. For the most part, we use variations of π (e.g., π, π′, π1, ...) to denote trace variables. HyperLTL formulas are then generated by the grammar

$$\begin{aligned} \varphi &:= \mathbb{Q}\pi.\varphi \mid \psi\\ \psi &:= a\_{\pi} \mid \neg \psi \mid \psi \wedge \psi \mid \mathbf{O}\,\psi \mid \psi \mathcal{U}\,\psi \end{aligned}$$

where a ∈ AP is an atomic proposition, π ∈ V is a trace variable, Q ∈ {∀, ∃} is a quantifier, and ◯ and U are the temporal operators *next* and *until*.

The semantics of HyperLTL is given with respect to a *trace assignment* Π, which is a partial mapping Π : V ⇀ Σ<sup>ω</sup> from trace variables to traces. Given π ∈ V and t ∈ Σ<sup>ω</sup> we define Π[π ↦ t] as the updated assignment that maps π to t. For i ∈ N we define Π[i,∞] as the trace assignment defined by Π[i,∞](π) := Π(π)[i,∞], i.e., we (synchronously) progress all traces by i steps. For quantifier-free formulas ψ we follow the LTL semantics and define


The indexed atomic propositions refer to a specific path in Π, i.e., a<sup>π</sup> holds iff a holds on the trace bound to π. Quantifiers range over system traces:

Π ⊨*T* ψ iff Π ⊨ ψ and Π ⊨*T* Qπ.ϕ iff Qt ∈ *Traces*(T). Π[π ↦ t] ⊨*T* ϕ.

We write T ⊨ ϕ if ∅ ⊨*T* ϕ, where ∅ denotes the empty trace assignment.

*HyperQPTL.* HyperQPTL [45] adds – on top of the trace quantification of HyperLTL – also propositional quantification (analogous to the propositional quantification that QPTL [46] adds on top of LTL). For example, HyperQPTL can express a promptness property, which states that there must exist a bound (common among all traces) by which an event must have happened. We can express this as ∃q. ∀π. ◇q ∧ (¬q) U a*π*, which states that there exists an evaluation of proposition q such that (1) q holds at least once, and (2) for all traces π, a holds on π before the first occurrence of q. See [8] for details.

## **3 Second-Order HyperLTL**

The (first-order) trace quantification in HyperLTL ranges over the set of all system traces; we thus cannot reason about arbitrary sets of traces as required for, e.g., common knowledge. We introduce a second-order extension of HyperLTL by adding second-order variables (ranging over sets of traces) and allowing quantification over traces from any such set. We present two variants of our logic that differ in the way quantification is resolved. In Hyper<sup>2</sup>LTL, we quantify over arbitrary sets of traces. While this yields a powerful and intuitive logic, second-order quantification is inherently non-constructive. During model checking, there thus does not exist an efficient way to even approximate possible witnesses for the quantified sets of traces. To solve this quandary, we restrict Hyper<sup>2</sup>LTL to Hyper<sup>2</sup>LTLfp, where we instead quantify over sets of traces that satisfy some minimality or maximality constraint. This allows for large fragments of Hyper<sup>2</sup>LTLfp that admit algorithmic approximations of the model-checking problem (by, e.g., using known techniques from fixpoint computations [47,48]).

#### **3.1 Hyper2LTL**

Alongside the set V of trace variables, we use a set of second-order variables (which we, for the most part, denote with capital letters X, Y, ...). We assume that there is a special second-order variable S that refers to the set of traces of the system at hand, and a variable A that refers to the set of all traces. We define the Hyper<sup>2</sup>LTL syntax by the following grammar:

$$\begin{aligned} \varphi &:= \mathbb{Q}\pi \in X. \varphi \mid \mathbb{Q}X. \varphi \mid \psi\\ \psi &:= a\_{\pi} \mid \neg \psi \mid \psi \wedge \psi \mid \mathbb{Q}\psi \mid \psi \mathcal{U} \psi \end{aligned}$$

where a ∈ AP is an atomic proposition, π ∈ V is a trace variable, X is a second-order variable, and Q ∈ {∀, ∃} is a quantifier. We also consider the usual derived Boolean constants (*true*, *false*) and connectives (∨, →, ↔), as well as the temporal operators *eventually* (◇ψ := *true* U ψ) and *globally* (□ψ := ¬◇¬ψ). Given a set of atomic propositions P ⊆ AP and two trace variables π, π′, we abbreviate π =*P* π′ := ⋀*a*∈*P* (a*π* ↔ a*π*′).

**Semantics.** Apart from a trace assignment Π (as in the semantics of HyperLTL), we maintain a second-order assignment Δ : V ⇀ 2<sup>Σω</sup> mapping second-order variables to *sets of traces*. Given a second-order variable X and A ⊆ Σ<sup>ω</sup> we define the updated assignment Δ[X ↦ A] as expected. Quantifier-free formulas ψ are evaluated in a fixed trace assignment as for HyperLTL (cf. Sect. 2). For the quantifier prefix we define:

$$\begin{aligned} \Pi, \Delta &\models \psi\\ \Pi, \Delta &\models \mathbb{Q}\pi \in X. \varphi\\ \Pi, \Delta &\models \mathbb{Q}X. \varphi \end{aligned} \qquad \begin{aligned} \text{iff} \quad \Pi &\models \psi\\ \text{iff} \quad \mathbb{Q}t \in \Delta(X). \Pi[\pi \mapsto t], \Delta \models \varphi\\ \text{iff} \quad \mathbb{Q}A \subseteq \Sigma^{\omega}. \Pi, \Delta[X \mapsto A] \models \varphi \end{aligned}$$

Second-order quantification updates Δ with a set of traces, and first-order quantification updates Π by quantifying over traces within the set defined by Δ.

Initially, we evaluate a formula in the empty trace assignment and fix the valuation of the special second-order variable 𝔖 to the set of all system traces and of 𝔄 to the set of all traces. That is, given a system T and a Hyper<sup>2</sup>LTL formula ϕ, we say that T satisfies ϕ, written T ⊨ ϕ, if ∅, [𝔖 ↦ *Traces*(T), 𝔄 ↦ Σ<sup>ω</sup>] ⊨ ϕ, where we write ∅ for the empty trace assignment. The model-checking problem for Hyper<sup>2</sup>LTL is to check whether T ⊨ ϕ holds.
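Over a finite universe of (abstracted) traces, the quantifier semantics above can be prototyped by explicit enumeration. The following Python sketch is our own illustration (the formula encoding and all names are assumptions): second-order quantification QX ranges over all subsets of the finite universe, and first-order quantification Qπ ∈ X ranges over the current value Δ(X):

```python
from itertools import chain, combinations

def subsets(universe):
    """All subsets of a finite trace universe (finite stand-in for A ⊆ Σ^ω)."""
    u = sorted(universe)
    return chain.from_iterable(combinations(u, r) for r in range(len(u) + 1))

def check(prefix, body, pi, delta, universe):
    """prefix: list of (quantifier, kind, variable, source-set variable)."""
    if not prefix:
        return body(pi)
    (q, kind, var, src), rest = prefix[0], prefix[1:]
    if kind == "trace":   # Qπ ∈ X: range over Δ(X)
        results = (check(rest, body, {**pi, var: t}, delta, universe)
                   for t in delta[src])
    else:                 # QX: range over all subsets of the universe
        results = (check(rest, body, pi, {**delta, var: set(A)}, universe)
                   for A in subsets(universe))
    return any(results) if q == "exists" else all(results)

universe = {"ad", "bd", "aad"}
delta0 = {"S": {"ad", "aad"}, "A": universe}

# ∀π ∈ 𝔖. ∃π' ∈ 𝔄. π and π' differ in their first letter:
prefix = [("forall", "trace", "p", "S"), ("exists", "trace", "q", "A")]
print(check(prefix, lambda pi: pi["p"][0] != pi["q"][0], {}, delta0, universe))  # True

# ∃X. ∀π ∈ X. π starts with b — witnessed already by X = ∅ (or X = {"bd"}):
prefix2 = [("exists", "set", "X", None), ("forall", "trace", "p", "X")]
print(check(prefix2, lambda pi: pi["p"].startswith("b"), {}, delta0, universe))  # True
```

The exponential enumeration of subsets makes the non-constructive nature of second-order quantification tangible even in this toy setting.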

Hyper<sup>2</sup>LTL naturally generalizes HyperLTL by adding second-order quantification. As sets range over *arbitrary* traces, Hyper<sup>2</sup>LTL also subsumes the more powerful logic HyperQPTL. The proof of Lemma 1 is given in the full version of this paper [11].

**Lemma 1.** Hyper<sup>2</sup>LTL *subsumes* HyperQPTL *(and thus also* HyperLTL*).*

**Syntactic Sugar.** In Hyper<sup>2</sup>LTL, we can quantify over traces within a second-order variable, but we cannot state, within the body of the formula, that some trace is a member of some second-order variable. For that, we define π ▷ X (as an atom within the body) as syntactic sugar for ∃π′ ∈ X. □(π′ =<sub>AP</sub> π), i.e., π is in X if there exists some trace in X that globally agrees with π on all propositions. Note that we can only use π ▷ X *outside* of the scope of any temporal operators; this ensures that we can bring the resulting formula into a form that conforms to the Hyper<sup>2</sup>LTL syntax.

#### **3.2 Hyper<sup>2</sup>LTLfp**

The semantics of Hyper<sup>2</sup>LTL quantifies over arbitrary sets of traces, making even approximations to its semantics challenging. We propose Hyper<sup>2</sup>LTLfp as a restriction that only quantifies over sets that are subject to an additional minimality or maximality constraint. For large classes of formulas, we show that this admits effective model-checking approximations. We define Hyper<sup>2</sup>LTLfp by the following grammar:

$$\begin{aligned} \varphi &:= \mathbb{Q}\,\pi \in X.\,\varphi \mid \mathbb{Q}(X, \bowtie, \varphi).\,\varphi \mid \psi\\ \psi &:= a\_{\pi} \mid \neg \psi \mid \psi \wedge \psi \mid \bigcirc\psi \mid \psi\,\mathcal{U}\,\psi \end{aligned}$$

where a ∈ AP, π ∈ V, X ∈ V̂, Q ∈ {∀, ∃}, and ⋈ ∈ {∨, ∧} determines whether we consider smallest (∨) or largest (∧) sets. For example, the formula ∃(X, ∨, ϕ<sub>1</sub>). ϕ<sub>2</sub> holds if there exists some set of traces X that satisfies both ϕ<sub>1</sub> and ϕ<sub>2</sub>, and is *a* smallest set that satisfies ϕ<sub>1</sub>. Such minimality and maximality constraints with respect to a (hyper)property arise naturally in many properties. Examples include common knowledge (cf. Sect. 3.3), asynchronous hyperproperties (cf. Sect. 4.2), and causality in reactive systems [22,23].

**Semantics.** For path formulas, the semantics of Hyper<sup>2</sup>LTLfp is defined analogously to that of Hyper<sup>2</sup>LTL and HyperLTL. For the quantifier prefix we define:

$$\begin{aligned} &\Pi,\Delta\models\psi &&\text{iff} \quad \Pi\models\psi\\ &\Pi,\Delta\models\mathbb{Q}\pi\in X.\varphi &&\text{iff} \quad \mathbb{Q}t\in\Delta(X).\ \Pi[\pi\mapsto t],\Delta\models\varphi\\ &\Pi,\Delta\models\mathbb{Q}(X,\bowtie,\varphi\_{1}).\varphi\_{2} &&\text{iff} \quad \mathbb{Q}A\in\operatorname{sol}(\Pi,\Delta,(X,\bowtie,\varphi\_{1})).\ \Pi,\Delta[X\mapsto A]\models\varphi\_{2}\end{aligned}$$

where *sol*(Π, Δ, (X, ⋈, ϕ<sub>1</sub>)) denotes the set of all solutions to the minimality/maximality condition given by ϕ<sub>1</sub>, which we define by mutual recursion as follows:

$$\begin{aligned} sol(\Pi, \Delta, (X, \vee, \varphi)) &:= \{ A \subseteq \Sigma^{\omega} \mid \Pi, \Delta[X \mapsto A] \models \varphi \land \forall A' \subsetneq A. \Pi, \Delta[X \mapsto A'] \not\models \varphi \}, \\ sol(\Pi, \Delta, (X, \wedge, \varphi)) &:= \{ A \subseteq \Sigma^{\omega} \mid \Pi, \Delta[X \mapsto A] \models \varphi \land \forall A' \supsetneq A. \Pi, \Delta[X \mapsto A'] \not\models \varphi \}. \end{aligned}$$

A set A satisfies the minimality/maximality constraint if it satisfies ϕ and is a least (in case ⋈ = ∨) or greatest (in case ⋈ = ∧) set that satisfies ϕ.

**Fig. 1.** Left: An example of a multi-agent system with two agents, where agent 1 observes a and d, and agent 2 observes c and d. Right: The iterative construction of the traces to be considered for common knowledge, starting with a<sup>n</sup>d<sup>ω</sup>.

Note that *sol*(Π, Δ, (X, ⋈, ϕ)) can contain multiple sets or no set at all, i.e., there may not exist a unique least or greatest set that satisfies ϕ. In Hyper<sup>2</sup>LTLfp, we therefore add an additional quantification over the set of all solutions to the minimality/maximality constraint. When discussing our model-checking approximation algorithm, we present a (syntactic) restriction on ϕ which guarantees that *sol*(Π, Δ, (X, ⋈, ϕ)) contains a unique element (i.e., is a singleton set). Moreover, our restriction allows us to employ fixpoint techniques to find approximations to this unique solution. In case the solution for (X, ⋈, ϕ) is unique, we often omit the leading quantifier and simply write (X, ⋈, ϕ) instead of Q(X, ⋈, ϕ).
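Over a finite universe, the minimal case of *sol* can be computed by brute force, which also makes the three situations above (several, one, or no solution) concrete. A small Python sketch (our own illustration; `sol_min` and the toy constraints are assumptions, not from the paper):

```python
from itertools import combinations

def sol_min(universe, phi):
    """All ⊆-minimal subsets A of `universe` with phi(A) — the case ⋈ = ∨ of sol."""
    u = list(universe)
    sats = [frozenset(c) for r in range(len(u) + 1)
            for c in combinations(u, r) if phi(frozenset(c))]
    # Keep only the sets with no strictly smaller satisfying set.
    return [A for A in sats if not any(B < A for B in sats)]

print(sol_min({"t1", "t2"}, lambda A: "t1" in A or "t2" in A))  # two minimal solutions
print(sol_min({"t1", "t2"}, lambda A: "t1" in A))               # unique solution {t1}
print(sol_min({"t1", "t2"}, lambda A: False))                   # no solution at all
```

The second call mirrors the well-behaved situation our syntactic restriction enforces: a unique minimal solution.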

As we can encode the minimality/maximality constraints of Hyper<sup>2</sup>LTLfp in Hyper<sup>2</sup>LTL (see full version [11]), we have the following:

**Proposition 1.** *Any* Hyper<sup>2</sup>LTLfp *formula* ϕ *can be effectively translated into a* Hyper<sup>2</sup>LTL *formula* ϕ′ *such that for all transition systems* T *we have* T ⊨ ϕ *iff* T ⊨ ϕ′*.*

#### **3.3 Common Knowledge in Multi-agent Systems**

To explain common knowledge, we use a variation of an example from [43] and encode it in Hyper<sup>2</sup>LTLfp. Fig. 1 (left) shows a transition system of a distributed system with two agents, agent 1 and agent 2. Agent 1 observes variables a and d, whereas agent 2 observes c and d. The property of interest is: *starting from the trace* π = a<sup>n</sup>d<sup>ω</sup> *for some fixed* n > 1*, is it common knowledge for the two agents that* a *holds in the second step?* It is trivial to see that a holds in the second step of π. However, for common knowledge, we have to consider the (possibly) infinite chain of observationally equivalent traces. For example, agent 2 cannot distinguish the traces a<sup>n</sup>d<sup>ω</sup> and a<sup>n−1</sup>bd<sup>ω</sup>. Therefore, agent 2 only knows that a holds on π if it also holds on π′ = a<sup>n−1</sup>bd<sup>ω</sup>. For common knowledge, agent 1 also has to know that agent 2 knows a, which means that a has to hold on all traces that are indistinguishable from π or π′ for agent 1. This adds π″ = a<sup>n−1</sup>cd<sup>ω</sup> to the set of traces to verify a against. This chain of reasoning continues as shown in Fig. 1 (right). In the last step we add ac<sup>n−1</sup>d<sup>ω</sup> to the set of indistinguishable traces, concluding that a is not common knowledge.

The following Hyper<sup>2</sup>LTLfp formula specifies the property stated above. The abbreviation *obs*(π<sub>1</sub>, π<sub>2</sub>) := (π<sub>1</sub> =<sub>{a,d}</sub> π<sub>2</sub>) ∨ (π<sub>1</sub> =<sub>{c,d}</sub> π<sub>2</sub>) denotes that π<sub>1</sub> and π<sub>2</sub> are observationally equivalent for either agent 1 or agent 2.

$$\begin{aligned} \forall \pi \in \mathfrak{S}.\ & \left(\bigwedge\_{i=0}^{n-1} \bigcirc^i a\_{\pi} \wedge \bigcirc^n \Box\, d\_{\pi}\right) \to \\ & \left(X, \vee, \pi \rhd X \wedge \left(\forall \pi\_1 \in X.\, \forall \pi\_2 \in \mathfrak{S}.\ obs(\pi\_1, \pi\_2) \to \pi\_2 \rhd X\right)\right).\ \forall \pi' \in X.\ \bigcirc a\_{\pi'} \end{aligned}$$

For a trace π of the form π = a<sup>n</sup>d<sup>ω</sup>, the set X represents the *common knowledge set* on π. This set X is the smallest set that (1) contains π (expressed using our syntactic sugar ▷); and (2) is closed under observations by either agent, i.e., if we find some π<sub>1</sub> ∈ X and some system trace π<sub>2</sub> that are observationally equivalent, then π<sub>2</sub> should also be in X. Note that this set is unique (due to the minimality restriction), so we do not quantify it explicitly. Lastly, we require that all traces in X satisfy ◯a. Any set satisfying the closure conditions must also include the trace ac<sup>n−1</sup>d<sup>ω</sup>, which violates ◯a; thus we can conclude that, starting from trace a<sup>n</sup>d<sup>ω</sup>, it is *not* common knowledge that a holds in the second step. On the other hand, it *is* common knowledge that a holds in the *first* step (cf. Sect. 6).
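The iterative construction of Fig. 1 (right) can be replayed concretely as a least-fixpoint computation. The Python sketch below uses a hand-picked finite universe of trace prefixes as a toy stand-in for the system of Fig. 1 (our own abstraction with n = 3, not the paper's exact transition system; all names are assumptions):

```python
# Finite abstractions of traces: a^3 d^ω is cut off to the string "aaad", etc.
universe = {"aaad", "aabd", "aacd", "abcd", "accd"}

def obs(trace, visible):
    """One agent's observation: project each step onto its visible letters."""
    return tuple(c if c in visible else "?" for c in trace)

def indistinguishable(t1, t2):
    # Agent 1 observes a and d; agent 2 observes c and d.
    return obs(t1, "ad") == obs(t2, "ad") or obs(t1, "cd") == obs(t2, "cd")

def common_knowledge_set(start):
    """Least set containing `start` and closed under either agent's observations."""
    X = {start}
    while True:  # least-fixpoint iteration, mirroring the chain of Fig. 1 (right)
        new = {t2 for t1 in X for t2 in universe if indistinguishable(t1, t2)}
        if new <= X:
            return X
        X |= new

X = common_knowledge_set("aaad")
print(all(t[0] == "a" for t in X))  # True:  a in the first step is common knowledge
print(all(t[1] == "a" for t in X))  # False: a in the second step is not
```

In this toy universe the closure eventually pulls in traces whose second step is not a, so ◯a fails on the common knowledge set, matching the argument above.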

#### **3.4 Hyper<sup>2</sup>LTL Model Checking**

As Hyper<sup>2</sup>LTL and Hyper<sup>2</sup>LTLfp allow quantification over arbitrary sets of traces, we can encode the satisfiability of HyperQPTL (i.e., the question of whether some set of traces satisfies a formula) within their model-checking problem, rendering the model-checking problem highly undecidable [34], even for very simple formulas [4].

**Proposition 2.** *For any* HyperQPTL *formula* ϕ *there exists a* Hyper<sup>2</sup>LTL *formula* ϕ′ *such that* ϕ *is satisfiable iff* ϕ′ *holds in some arbitrary transition system. The model-checking problem of* Hyper<sup>2</sup>LTL *is thus highly undecidable (*Σ<sup>1</sup><sub>1</sub>*-hard).*

*Proof.* Let ϕ′ be the Hyper<sup>2</sup>LTL formula obtained from ϕ by replacing each HyperQPTL trace quantifier Qπ with the Hyper<sup>2</sup>LTL quantifier Qπ ∈ X, and each propositional quantifier Qq with Qπ<sub>q</sub> ∈ 𝔄 for some fresh trace variable π<sub>q</sub>. In the body, we replace each propositional variable q with a<sub>π<sub>q</sub></sub> for some fixed proposition a ∈ AP. Then, ϕ is satisfiable iff the Hyper<sup>2</sup>LTL formula ∃X. ϕ′ holds in some arbitrary system.

Hyper<sup>2</sup>LTLfp cannot express HyperQPTL satisfiability directly: if a HyperQPTL formula has a model, it does not necessarily have a least one. However, model checking of Hyper<sup>2</sup>LTLfp is also highly undecidable.

**Proposition 3.** *The model-checking problem of* Hyper<sup>2</sup>LTLfp *is* Σ<sup>1</sup><sub>1</sub>*-hard.*

*Proof (Sketch).* We can encode the existence of a *recurrent* computation of a Turing machine, which is known to be Σ<sup>1</sup><sub>1</sub>-hard [1].

Conversely, the *existential* fragment of Hyper<sup>2</sup>LTL can be encoded back into HyperQPTL satisfiability:

**Proposition 4.** *Let* ϕ *be a* Hyper<sup>2</sup>LTL *formula that uses only existential second-order quantification, and let* T *be any system. We can effectively construct a* HyperQPTL *formula* ϕ′ *such that* T ⊨ ϕ *iff* ϕ′ *is satisfiable.*

Lastly, we present some easy fragments of Hyper<sup>2</sup>LTL for which the model-checking problem is decidable. Here we write ∃<sup>∗</sup>X (resp. ∀<sup>∗</sup>X) for a sequence of existentially (resp. universally) quantified *second-order* variables and ∃<sup>∗</sup>π (resp. ∀<sup>∗</sup>π) for a sequence of existentially (resp. universally) quantified *first-order* variables. For example, ∃<sup>∗</sup>X∀<sup>∗</sup>π captures all formulas of the form ∃X<sub>1</sub>, ..., X<sub>n</sub>. ∀π<sub>1</sub>, ..., π<sub>m</sub>. ψ where ψ is quantifier-free.

**Proposition 5.** *The model-checking problem of* Hyper<sup>2</sup>LTL *is decidable for the fragments* ∃<sup>∗</sup>X∀<sup>∗</sup>π*,* ∀<sup>∗</sup>X∀<sup>∗</sup>π*,* ∃<sup>∗</sup>X∃<sup>∗</sup>π*,* ∀<sup>∗</sup>X∃<sup>∗</sup>π*, and* ∃X. ∃<sup>∗</sup>π ∈ X. ∀<sup>∗</sup>π′ ∈ X*.*

We refer the reader to the full version [11] for detailed proofs.

## **4 Expressiveness of Hyper<sup>2</sup>LTL**

In this section, we point to existing logics that can naturally be encoded within our second-order hyperlogics Hyper<sup>2</sup>LTL and Hyper<sup>2</sup>LTLfp.

#### **4.1 Hyper<sup>2</sup>LTL and LTL<sub>K,C</sub>**

LTL<sub>K</sub> extends LTL with the knowledge operator K. For a subset of agents A, the formula K<sub>A</sub>ψ holds in timestep i if ψ holds on all traces that are, for some agent in A, observationally equivalent up to timestep i. See the full version [11] for the detailed semantics. LTL<sub>K</sub> and HyperCTL<sup>∗</sup> have incomparable expressiveness [16], but the knowledge operator K can be encoded either by adding a linear past operator [16] or by adding propositional quantification (as in HyperQPTL) [45].

Using Hyper<sup>2</sup>LTLfp, we can encode LTL<sub>K,C</sub>, which features the knowledge operator K *and* the common knowledge operator C (requiring that ψ holds on the closure set of equivalent traces, up to the current timepoint) [41]. Note that LTL<sub>K,C</sub> cannot be encoded by only adding propositional quantification or the linear past operator.

**Proposition 6.** *For every* LTL<sub>K,C</sub> *formula* ϕ *there exists a* Hyper<sup>2</sup>LTLfp *formula* ϕ′ *such that for any system* T *we have* T ⊨<sub>LTL<sub>K,C</sub></sub> ϕ *iff* T ⊨ ϕ′*.*

*Proof (Sketch).* We follow the intuition discussed in Sect. 3.3. For each occurrence of a knowledge operator in {K, C}, we use a fresh trace variable to keep track of the points in time with respect to which we need to compare traces. We then use this trace variable to introduce a second-order set that collects all equivalent traces (by the observations of one agent, or the closure of all agents' observations). We then inductively construct a Hyper<sup>2</sup>LTLfp formula that captures all the knowledge and common-knowledge sets, over which we check the properties at hand. See the full version [11] for details.

#### **4.2 Hyper<sup>2</sup>LTL and Asynchronous Hyperproperties**

Most existing hyperlogics (including Hyper<sup>2</sup>LTL) traverse the traces of a system *synchronously*. However, in many cases such a synchronous traversal is too restrictive and we need to compare traces asynchronously. As an example, consider *observational determinism* (OD), which we can express in HyperLTL as ϕ<sub>*OD*</sub> := ∀π<sub>1</sub>. ∀π<sub>2</sub>. □(o<sub>π<sub>1</sub></sub> ↔ o<sub>π<sub>2</sub></sub>). The formula states that the output of a system is identical across all traces, and so (trivially) no information about high-security inputs is leaked. In most systems encountered in practice, this synchronous formula is violated, as the exact timing between updates to o might differ by a few steps (we provide some examples in the full version [11]). However, assuming that an attacker only has access to the memory footprint and not a timing channel, we would only like to check that all traces are *stutter* equivalent (with respect to o).
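Stutter equivalence with respect to o can be checked on finite trace prefixes by comparing destuttered projections; a minimal Python sketch (the trace encoding and function names are our own illustration):

```python
def destutter(trace):
    """Collapse consecutive repetitions: a canonical form shared by all stutterings."""
    out = []
    for step in trace:
        if not out or out[-1] != step:
            out.append(step)
    return out

def stutter_equivalent(t1, t2, proj=lambda s: s):
    return destutter([proj(s) for s in t1]) == destutter([proj(s) for s in t2])

# Two runs that update the output o one step apart.
run1 = [{"o": 0}, {"o": 1}, {"o": 1}]
run2 = [{"o": 0}, {"o": 0}, {"o": 1}]
print(run1 == run2)                                      # False: synchronous OD fails
print(stutter_equivalent(run1, run2, lambda s: s["o"]))  # True: same o up to stuttering
```

The two runs violate synchronous OD but satisfy its asynchronous relaxation, exactly the gap the paragraph above describes.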

A range of extensions to existing hyperlogics has been proposed to reason about such asynchronous hyperproperties [3,5,9,17,39]. We consider AHLTL [3]. An AHLTL formula has the form Q<sub>1</sub>π<sub>1</sub>, ..., Q<sub>n</sub>π<sub>n</sub>. **E**. ψ, where ψ is a quantifier-free HyperLTL formula. The initial trace quantifier prefix is handled as in HyperLTL. However, different from HyperLTL, a trace assignment [π<sub>1</sub> ↦ t<sub>1</sub>, ..., π<sub>n</sub> ↦ t<sub>n</sub>] satisfies **E**. ψ if there exist stutterings t′<sub>1</sub>, ..., t′<sub>n</sub> of t<sub>1</sub>, ..., t<sub>n</sub> such that [π<sub>1</sub> ↦ t′<sub>1</sub>, ..., π<sub>n</sub> ↦ t′<sub>n</sub>] ⊨ ψ. We write T ⊨<sub>*AHLTL*</sub> ϕ if a system T satisfies the AHLTL formula ϕ. Using this quantification over stutterings we can, for example, express an asynchronous version of observational determinism as ∀π<sub>1</sub>. ∀π<sub>2</sub>. **E**. □(o<sub>π<sub>1</sub></sub> ↔ o<sub>π<sub>2</sub></sub>), stating that every two traces can be aligned such that they (globally) agree on o. Despite the fact that Hyper<sup>2</sup>LTLfp is itself synchronous, we can use second-order quantification to encode asynchronous hyperproperties, as we state in the following proposition.

**Proposition 7.** *For any AHLTL formula* ϕ *there exists a* Hyper<sup>2</sup>LTLfp *formula* ϕ′ *such that for any system* T *we have* T ⊨<sub>*AHLTL*</sub> ϕ *iff* T ⊨ ϕ′*.*

*Proof.* Assume that ϕ = Q<sub>1</sub>π<sub>1</sub>, ..., Q<sub>n</sub>π<sub>n</sub>. **E**. ψ is the given AHLTL formula. For each i ∈ [n] we define a formula ϕ<sub>i</sub> as follows:

$$\begin{aligned} &\forall \pi\_1 \in X\_i.\, \forall \pi\_2 \in \mathfrak{A}.\\ &\qquad \left( \left( \pi\_1 =\_{\text{AP}} \pi\_2 \right) \mathcal{U} \left( \left( \pi\_1 =\_{\text{AP}} \pi\_2 \right) \wedge \Box \bigwedge\_{a \in \text{AP}} \left( a\_{\pi\_1} \leftrightarrow \bigcirc a\_{\pi\_2} \right) \right) \right) \to \pi\_2 \rhd X\_i \end{aligned}$$

The formula asserts that the set of traces bound to X<sub>i</sub> is closed under stuttering, i.e., if we start from any trace in X<sub>i</sub> and stutter it once (at some arbitrary position), we again end up in X<sub>i</sub>. Using the formulas ϕ<sub>i</sub>, we then construct a Hyper<sup>2</sup>LTLfp formula ϕ′ that is equivalent to ϕ as follows:

$$\begin{aligned} \varphi' &:= \mathbb{Q}\_1 \pi\_1 \in \mathfrak{S}, \dots, \mathbb{Q}\_n \pi\_n \in \mathfrak{S}. (X\_1, \vee, \pi\_1 \rhd X\_1 \wedge \varphi\_1) \cdots (X\_n, \vee, \pi\_n \rhd X\_n \wedge \varphi\_n) \\ &\quad \exists \pi'\_1 \in X\_1, \dots, \exists \pi'\_n \in X\_n. \psi[\pi'\_1/\pi\_1, \dots, \pi'\_n/\pi\_n] \end{aligned}$$

We first mimic the quantification in ϕ and, for each trace π<sub>i</sub>, construct a least set X<sub>i</sub> that contains π<sub>i</sub> and is closed under stuttering (thus describing exactly the set of all stutterings of π<sub>i</sub>). Finally, we assert that there are traces π′<sub>1</sub>, ..., π′<sub>n</sub> with π′<sub>i</sub> ∈ X<sub>i</sub> (so π′<sub>i</sub> is a stuttering of π<sub>i</sub>) such that π′<sub>1</sub>, ..., π′<sub>n</sub> satisfy ψ. It is easy to see that T ⊨<sub>*AHLTL*</sub> ϕ iff T ⊨ ϕ′ holds for all systems T.

Hyper<sup>2</sup>LTLfp captures all properties expressible in AHLTL. In particular, our approximate model-checking algorithm for Hyper<sup>2</sup>LTLfp (cf. Sect. 5) is applicable to AHLTL; even for instances where no approximate solutions were previously known. In Sect. 6, we show that our prototype model checker for Hyper<sup>2</sup>LTLfp can verify asynchronous properties in practice.

## **5 Model-Checking Hyper<sup>2</sup>LTLfp**

In general, finite-state model checking of Hyper<sup>2</sup>LTLfp is highly undecidable (cf. Proposition 3). In this section, we outline a partial algorithm that computes approximations of the concrete values of second-order variables for a fragment of Hyper<sup>2</sup>LTLfp. At a high level, our algorithm (Algorithm 1) iteratively computes under- and overapproximations for second-order variables. It then resolves first-order quantification using techniques from HyperLTL model checking [8,32], evaluating existential trace quantification on the underapproximations and universal trace quantification on the overapproximations of the second-order variables. If the verification fails, it goes back and refines the second-order approximations.

In this section, we focus on the setting where we are interested in least sets (using ∨) and use techniques to approximate the *least* fixpoint. A similar (dual) treatment is possible for Hyper<sup>2</sup>LTLfp formulas that use largest sets. Every Hyper<sup>2</sup>LTLfp formula that uses only minimal sets has the following form:

$$\varphi = \gamma\_1. (Y\_1, \vee, \varphi\_1^{con}). \gamma\_2 \dots (Y\_k, \vee, \varphi\_k^{con}). \gamma\_{k+1}. \psi \tag{1}$$

We quantify second-order variables Y<sub>1</sub>, ..., Y<sub>k</sub>, where, for each j ∈ [k], Y<sub>j</sub> is the least set that satisfies ϕ<sup>con</sup><sub>j</sub>. Finally, for each j ∈ [k + 1],

$$
\gamma\_j = \mathbb{Q}\_{l\_j+1} \pi\_{l\_j+1} \in X\_{l\_j+1} \dots \mathbb{Q}\_{l\_{j+1}} \pi\_{l\_{j+1}} \in X\_{l\_{j+1}},
$$

is the block of first-order quantifiers that sits between the quantifications of Y<sub>j−1</sub> and Y<sub>j</sub>. Here, X<sub>l<sub>j</sub>+1</sub>, ..., X<sub>l<sub>j+1</sub></sub> ∈ {𝔖, 𝔄, Y<sub>1</sub>, ..., Y<sub>j−1</sub>} are second-order variables that are quantified before γ<sub>j</sub>. In particular, π<sub>1</sub>, ..., π<sub>l<sub>j</sub></sub> are the first-order variables quantified before Y<sub>j</sub>.

#### **5.1 Fixpoints in Hyper<sup>2</sup>LTLfp**

We consider a fragment of Hyper<sup>2</sup>LTLfp which we call the *least fixpoint fragment*. Within this fragment, we restrict the formulas ϕ<sup>con</sup><sub>1</sub>, ..., ϕ<sup>con</sup><sub>k</sub> such that Y<sub>1</sub>, ..., Y<sub>k</sub> can be approximated as (least) fixpoints. Concretely, we say that ϕ is in the *least fixpoint fragment* of Hyper<sup>2</sup>LTLfp if, for all j ∈ [k], ϕ<sup>con</sup><sub>j</sub> is a conjunction of formulas of the form

$$\forall \dot{\pi}\_1 \in X\_1 \dots \forall \dot{\pi}\_n \in X\_n.\ \psi\_{step} \to \dot{\pi}\_M \rhd Y\_j \tag{2}$$

where each X<sub>i</sub> ∈ {𝔖, 𝔄, Y<sub>1</sub>, ..., Y<sub>j</sub>}, ψ<sub>*step*</sub> is a quantifier-free formula over the trace variables π̇<sub>1</sub>, ..., π̇<sub>n</sub>, π<sub>1</sub>, ..., π<sub>l<sub>j</sub></sub>, and M ∈ [n]. Intuitively, Eq. (2) states a requirement on traces that should be included in Y<sub>j</sub>: if we find traces ṫ<sub>1</sub> ∈ X<sub>1</sub>, ..., ṫ<sub>n</sub> ∈ X<sub>n</sub> that, together with the traces t<sub>1</sub>, ..., t<sub>l<sub>j</sub></sub> quantified before Y<sub>j</sub>, satisfy ψ<sub>*step*</sub>, then ṫ<sub>M</sub> should be included in Y<sub>j</sub>.

Together with the minimality constraint on Y<sup>j</sup> (stemming from the semantics of Hyper<sup>2</sup>LTLfp), this effectively defines a (monotone) least fixpoint computation, as ψ*step* defines exactly the traces to be added to the set. This will allow us to use results from fixpoint theory to compute approximations for the sets Y<sup>j</sup> .
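Rules of the shape of Eq. (2) induce a monotone operator on sets of traces, whose least fixpoint can be computed by Kleene iteration. A short Python sketch over finite universes (our own illustration; `step` stands in for one joint application of all ψ<sub>step</sub> rules and the toy rule is an assumption):

```python
def least_fixpoint(step, seed=frozenset()):
    """Kleene iteration: seed ⊆ F(seed) ⊆ F(F(seed)) ⊆ … until stable."""
    current = frozenset(seed)
    while True:
        nxt = current | step(current)
        if nxt == current:
            return current
        current = nxt

# Toy rule set: Y must contain 0 and be closed under t ∈ Y → t+2 ∈ Y (below 10).
step = lambda Y: {0} | {t + 2 for t in Y if t + 2 < 10}
print(sorted(least_fixpoint(step)))  # [0, 2, 4, 6, 8]
```

Because `step` only ever adds elements, the iterates grow monotonically and the loop terminates as soon as no new element is produced, mirroring the unique least solution guaranteed by the fragment.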

Our least fixpoint fragment captures most properties of interest, in particular common knowledge (Sect. 3.3) and asynchronous hyperproperties (Sect. 4.2). We observe that formulas of the above form ensure that the solution Y<sub>j</sub> is unique, i.e., for any trace assignment Π to π<sub>1</sub>, ..., π<sub>l<sub>j</sub></sub> and second-order assignment Δ to 𝔖, 𝔄, Y<sub>1</sub>, ..., Y<sub>j−1</sub>, there is only one element in *sol*(Π, Δ, (Y<sub>j</sub>, ∨, ϕ<sup>con</sup><sub>j</sub>)).

#### **5.2 Functions as Automata**

In our (approximate) model-checking algorithm, we represent a concrete assignment to the second-order variables Y<sub>1</sub>, ..., Y<sub>k</sub> by automata B<sub>Y<sub>1</sub></sub>, ..., B<sub>Y<sub>k</sub></sub>. The concrete assignment of Y<sub>j</sub> can depend on the traces assigned to π<sub>1</sub>, ..., π<sub>l<sub>j</sub></sub>, i.e., the first-order variables quantified before Y<sub>j</sub>. To capture these dependencies, we view each Y<sub>j</sub> not as a set of traces but as a function mapping the traces of all preceding first-order variables to a set of traces. We represent such a function f : (Σ<sup>ω</sup>)<sup>l<sub>j</sub></sup> → 2<sup>Σ<sup>ω</sup></sup> as an automaton A over Σ<sup>l<sub>j</sub>+1</sup>. For traces t<sub>1</sub>, ..., t<sub>l<sub>j</sub></sub>, the set f(t<sub>1</sub>, ..., t<sub>l<sub>j</sub></sub>) is represented in the automaton by the set {t ∈ Σ<sup>ω</sup> | *zip*(t<sub>1</sub>, ..., t<sub>l<sub>j</sub></sub>, t) ∈ L(A)}. For example, the function f(t<sub>1</sub>) := {t<sub>1</sub>} can be defined by the automaton over Σ<sup>2</sup> that accepts the zipping of a pair of traces exactly if both traces agree on all propositions. This representation of functions as automata allows us to maintain an assignment to Y<sub>j</sub> that is parametric in π<sub>1</sub>, ..., π<sub>l<sub>j</sub></sub> and still allows first-order model checking on Y<sub>1</sub>, ..., Y<sub>k</sub>.
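The zipping used above interleaves the per-step letters of several traces into one trace over the product alphabet. A Python sketch of *zip* and of the example function f(t<sub>1</sub>) = {t<sub>1</sub>} as a zipped-language membership test (our own illustration on finite prefixes; names are assumptions):

```python
def zip_traces(*traces):
    """zip(t1, …, tk): a single trace over the product alphabet Σ^k."""
    return tuple(zip(*traces))

def in_f(zipped):
    """Membership in the language representing f(t1) = {t1}: accept zip(t1, t)
    exactly when both components agree at every step."""
    return all(a == b for (a, b) in zipped)

t1 = ("a", "b", "a")
print(zip_traces(t1, t1))                     # (('a', 'a'), ('b', 'b'), ('a', 'a'))
print(in_f(zip_traces(t1, t1)))               # True
print(in_f(zip_traces(t1, ("a", "a", "a"))))  # False
```

An ω-automaton plays the role of `in_f` in the actual construction: its language over Σ<sup>l<sub>j</sub>+1</sup> encodes the whole function at once.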

#### **5.3 Model Checking for First-Order Quantification**

First, we focus on first-order quantification and assume that we are given a concrete assignment for each second-order variable as fixed automata B<sub>Y<sub>1</sub></sub>, ..., B<sub>Y<sub>k</sub></sub> (where B<sub>Y<sub>j</sub></sub> is an automaton over Σ<sup>l<sub>j</sub>+1</sup>). Our construction for resolving first-order quantification is based on HyperLTL model checking [32], but needs to work on sets of traces that, themselves, are based on traces quantified before (cf. Sect. 5.2). Recall that the first-order quantifier prefix is γ<sub>1</sub> ··· γ<sub>k+1</sub> = Q<sub>1</sub>π<sub>1</sub> ∈ X<sub>1</sub> ··· Q<sub>l<sub>k+1</sub></sub>π<sub>l<sub>k+1</sub></sub> ∈ X<sub>l<sub>k+1</sub></sub>. For each 1 ≤ i ≤ l<sub>k+1</sub> we inductively construct an automaton A<sub>i</sub> over Σ<sup>i−1</sup> that summarizes all trace assignments to π<sub>1</sub>, ..., π<sub>i−1</sub> that satisfy the subformula starting with the quantification of π<sub>i</sub>. That is, for all traces t<sub>1</sub>, ..., t<sub>i−1</sub> we have

$$\{\pi\_1 \mapsto t\_1, \dots, \pi\_{i-1} \mapsto t\_{i-1}\} \models \mathbb{Q}\_i\, \pi\_i \in X\_i \cdots \mathbb{Q}\_{l\_{k+1}} \pi\_{l\_{k+1}} \in X\_{l\_{k+1}}.\ \psi$$

(under the fixed second-order assignment for Y<sub>1</sub>, ..., Y<sub>k</sub> given by B<sub>Y<sub>1</sub></sub>, ..., B<sub>Y<sub>k</sub></sub>) if and only if *zip*(t<sub>1</sub>, ..., t<sub>i−1</sub>) ∈ L(A<sub>i</sub>). In the context of HyperLTL model checking, we say that A<sub>i</sub> is *equivalent* to Q<sub>i</sub>π<sub>i</sub> ∈ X<sub>i</sub> ··· Q<sub>l<sub>k+1</sub></sub>π<sub>l<sub>k+1</sub></sub> ∈ X<sub>l<sub>k+1</sub></sub>. ψ [8,32]. In particular, A<sub>1</sub> is an automaton over the singleton alphabet Σ<sup>0</sup>.

We construct A<sub>1</sub>, ..., A<sub>l<sub>k+1</sub>+1</sub> inductively, starting with A<sub>l<sub>k+1</sub>+1</sub>. Initially, we construct A<sub>l<sub>k+1</sub>+1</sub> (over Σ<sup>l<sub>k+1</sub></sup>) using a standard LTL-to-NBA construction on the (quantifier-free) body ψ (see [32] for details). Now assume that we are given an (inductively constructed) automaton A<sub>i+1</sub> over Σ<sup>i</sup> and want to construct A<sub>i</sub>. We first consider the case where Q<sub>i</sub> = ∃, i.e., the ith trace quantification is existential. Now X<sub>i</sub> (the set on which π<sub>i</sub> is resolved) equals either 𝔖, 𝔄, or Y<sub>j</sub> for some j ∈ [k]. In either case, we represent the current assignment to X<sub>i</sub> as an automaton C over Σ<sup>T+1</sup> for some T < i that defines the model of X<sub>i</sub> based on the traces of π<sub>1</sub>, ..., π<sub>T</sub>: in case X<sub>i</sub> = 𝔖, we set C to be the automaton over Σ<sup>0+1</sup> that accepts exactly the traces of the given system T; in case X<sub>i</sub> = 𝔄, we set C to be the automaton over Σ<sup>0+1</sup> that accepts all traces; and if X<sub>i</sub> = Y<sub>j</sub> for some j ∈ [k], we set C to be B<sub>Y<sub>j</sub></sub> (which is an automaton over Σ<sup>l<sub>j</sub>+1</sup>).<sup>1</sup> Given C, we can now modify the construction from [32] to resolve first-order quantification: the desired automaton A<sub>i</sub> should accept the zipping of traces t<sub>1</sub>, ..., t<sub>i−1</sub> if there exists a trace t such that (1) *zip*(t<sub>1</sub>, ..., t<sub>i−1</sub>, t) ∈ L(A<sub>i+1</sub>), *and* (2) the trace t is contained in the set of traces assigned to X<sub>i</sub> as given by C, i.e., *zip*(t<sub>1</sub>, ..., t<sub>T</sub>, t) ∈ L(C). The construction of this automaton is straightforward by taking a product of A<sub>i+1</sub> and C. We denote this automaton by eProduct(A<sub>i+1</sub>, C). In case Q<sub>i</sub> = ∀, we exploit the duality ∀π. ψ = ¬∃π. ¬ψ, combining the above construction with automata complementation. We denote this universal product of A<sub>i+1</sub> and C by uProduct(A<sub>i+1</sub>, C).
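Ignoring ω-words and Büchi acceptance, the essence of eProduct and uProduct can be illustrated over a finite universe of finite traces, with "automata" modeled as Boolean predicates. This is our own simplification (the actual construction works on NBAs and uses complementation for the universal case; all names are assumptions):

```python
def e_product(A_next, C, traces, T):
    """∃-product: accept (t1,…,t_{i-1}) iff some witness t is accepted by A_next
    together with the prefix and lies in the model C of X_i
    (C depends on the first T quantified traces)."""
    return lambda *prefix: any(
        A_next(*prefix, t) and C(*prefix[:T], t) for t in traces)

def u_product(A_next, C, traces, T):
    """∀-product, via the duality ∀π. ψ = ¬∃π. ¬ψ."""
    return lambda *prefix: all(
        (not C(*prefix[:T], t)) or A_next(*prefix, t) for t in traces)

traces = {"ab", "ba"}
C = lambda t: True                      # X resolved on the full universe (T = 0)
psi = lambda t1, t2: t1[0] == t2[0]     # quantifier-free body over π1, π2

exists_exists = e_product(e_product(psi, C, traces, 0), C, traces, 0)
exists_forall = e_product(u_product(psi, C, traces, 0), C, traces, 0)
print(exists_exists())  # True:  ∃π1. ∃π2. they agree at step 0
print(exists_forall())  # False: ∃π1. ∀π2. they agree at step 0
```

As in the paper's construction, the products are applied inside-out, and the outermost predicate over the empty prefix plays the role of A<sub>1</sub>.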

The final automaton A<sub>1</sub> is an automaton over the singleton alphabet Σ<sup>0</sup> that is equivalent to γ<sub>1</sub> ··· γ<sub>k+1</sub>. ψ, i.e., the entire first-order quantifier prefix. Automaton A<sub>1</sub> thus satisfies L(A<sub>1</sub>) ≠ ∅ (which we can decide) iff the empty trace assignment satisfies the first-order formula γ<sub>1</sub> ··· γ<sub>k+1</sub>. ψ, iff ϕ (of Eq. (1)) holds within the fixed model for Y<sub>1</sub>, ..., Y<sub>k</sub>. For a given fixed second-order assignment (given as automata B<sub>Y<sub>1</sub></sub>, ..., B<sub>Y<sub>k</sub></sub>), we can thus decide if the system satisfies the first-order part.

<sup>1</sup> Note that in this case l<sub>j</sub> < i: if trace π<sub>i</sub> is resolved on Y<sub>j</sub> (i.e., X<sub>i</sub> = Y<sub>j</sub>), then Y<sub>j</sub> must be quantified *before* π<sub>i</sub>, so there are at most i − 1 traces quantified before Y<sub>j</sub>.

#### **Algorithm 1**

```
verify(ϕ, T) =
  let ϕ = γ_1. (Y_1, ∨, ϕ_1^con) ⋯ γ_k. (Y_k, ∨, ϕ_k^con). γ_{k+1}. ψ
    where γ_i = Q_{l_i+1} π_{l_i+1} ∈ X_{l_i+1} ⋯ Q_{l_{i+1}} π_{l_{i+1}} ∈ X_{l_{i+1}}
  let N = 0
  let A_T = systemToNBA(T)
  repeat
    // Start outside-in traversal on second-order variables
    let Λ = [𝔖 ↦ (A_T, A_T), 𝔄 ↦ (A, A)]        // A accepts all traces
    for j from 1 to k do
      B_j^l := underApprox((Y_j, ∨, ϕ_j^con), Λ, N)
      B_j^u := overApprox((Y_j, ∨, ϕ_j^con), Λ, N)
      Λ(Y_j) := (B_j^l, B_j^u)
    // Start inside-out traversal on first-order variables
    let A_{l_{k+1}+1} = LTLtoNBA(ψ)
    for i from l_{k+1} down to 1 do
      let (C^l, C^u) = Λ(X_i)
      if Q_i = ∃ then
        A_i := eProduct(A_{i+1}, C^l)
      else
        A_i := uProduct(A_{i+1}, C^u)
    if L(A_1) ≠ ∅ then
      return SAT
    else
      N := N + 1
```
During the first-order model-checking phase, each quantifier alternation in the formula requires a complex automata complementation. For the first-order phase, we could also use cheaper approximate methods by, e.g., instantiating the existential trace using a strategy [6,7,25].

#### **5.4 Bidirectional Model Checking**

So far, we have discussed the verification of the first-order quantifier prefix assuming a fixed model for all second-order variables Y<sub>1</sub>, ..., Y<sub>k</sub>. In our actual model-checking algorithm, we instead maintain under- and overapproximations of each of Y<sub>1</sub>, ..., Y<sub>k</sub>.

In each iteration, we first traverse the second-order quantifiers in an *outside-in* direction and compute lower and upper bounds on each Y_j. Given the bounds, we then traverse the first-order prefix in an *inside-out* direction using the current approximations of Y_1,...,Y_k. If the current approximations are not precise enough to witness the satisfaction (or violation) of the property, we repeat and try to compute better bounds on Y_1,...,Y_k. Due to the different directions of traversal, we refer to our model-checking approach as *bidirectional*. Algorithm 1 provides an overview. Initially, we convert the system T to an NBA A_T accepting exactly the traces of the system. In each round, we compute under- and overapproximations for each Y_j in a mapping Δ. We initialize Δ by mapping S to (A_T, A_T) (i.e., the value assigned to the system variable is precisely A_T for both under- and overapproximation), and A to (A_all, A_all), where A_all is an automaton over Σ^1 accepting all traces. We then traverse the second-order quantifiers outside-in (from Y_1 to Y_k) and for each Y_j compute a pair (B^l_j, B^u_j) of automata over Σ^{l_j+1} that under- and overapproximate the actual (unique) model of Y_j. We compute these approximations using the functions underApprox and overApprox, which can be instantiated with any procedure that computes sound lower and upper bounds (see Sect. 5.5). During verification, we further maintain a precision bound N (initially set to 0) that tracks the current precision of the second-order approximations.

When Δ contains an under- and overapproximation for each second-order variable, we traverse the first-order variables in an inside-out direction (from π_{l_{k+1}} to π_1) and, following the construction outlined in Sect. 5.3, construct automata A_{l_{k+1}},...,A_1. Different from the simplified setting in Sect. 5.3 (where we assume a fixed automaton B_{Y_j} providing a model for each Y_j), the mapping Δ contains only approximations of the concrete solution. We choose which approximation to use according to the corresponding set quantification: in case we construct A_i and Q_i = ∃, we use the *underapproximation* (thus making sure that any witness trace we pick is indeed contained in the actual model of the second-order variable); and if Q_i = ∀, we use the *overapproximation* (making sure that we consider at least those traces that are in the actual solution). If L(A_1) is nonempty, i.e., A_1 accepts the empty trace assignment, the formula holds (assuming the approximations returned by underApprox and overApprox are sound). If not, we increase the precision bound N and repeat.
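The interplay of the two traversals can be illustrated on a deliberately toy Python sketch, where finite sets of natural numbers stand in for automata and a monotone operator F stands in for the second-order constraint. All names (`verify`, `under_approx`, `over_approx`) are illustrative and not part of HySO.

```python
def under_approx(F, n):
    """n-fold iteration of the monotone operator F from the empty set:
    a sound underapproximation of F's least fixpoint (Knaster-Tarski)."""
    s = set()
    for _ in range(n):
        s = s | F(s)
    return s

def over_approx(F, candidate):
    """Park's theorem: if F(B) is a subset of B, then B overapproximates
    the least fixpoint of F; otherwise B is unusable as an upper bound."""
    return candidate if F(candidate) <= candidate else None

def verify(F, exists_witness, forall_holds, invariant, max_rounds=50):
    """Bidirectional refinement: existential claims are checked on the
    underapproximation, universal claims on the overapproximation; the
    precision bound n grows until the property is decided."""
    n = 0
    while n < max_rounds:
        lower = under_approx(F, n)
        if any(exists_witness(x) for x in lower):
            return "SAT"
        upper = over_approx(F, invariant)
        if upper is not None and all(forall_holds(x) for x in upper):
            return "SAT"
        n += 1
    return "UNKNOWN"

# The least fixpoint of F is {0, 2, 4, 6, 8, 10}.
F = lambda s: {0} | {y + 2 for y in s if y < 10}
evens = set(range(0, 11, 2))          # an inductive invariant for F

# Universal property, certified via the inductive invariant (round 0):
print(verify(F, lambda x: False, lambda x: x % 2 == 0, evens))   # SAT
# Existential property, certified by iterating to precision n = 5:
print(verify(F, lambda x: x == 8, lambda x: False, evens))       # SAT
```

Returning "SAT" for both directions mirrors Algorithm 1, which (for succinctness) only reports satisfaction; a property that neither approximation decides within the budget stays unknown.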

In Algorithm 1, we only check for the satisfaction of a formula (to keep the notation succinct). Using the second-order approximations in Δ, we can also check the negation of a formula (by considering the negated body and dualizing all trace quantifiers). Our tool (Sect. 6) makes use of this and thus simultaneously tries to show satisfaction and violation of a formula.

#### **5.5 Computing Under- and Overapproximations**

In this section we provide concrete instantiations for underApprox and overApprox.

**Computing Underapproximations.** As we consider the fixpoint fragment, each formula ϕ_j^con (defining Y_j) is a conjunction of formulas of the form in Eq. (2), thus defining Y_j via a least fixpoint computation. For simplicity, we assume that Y_j is defined by a single conjunct, given by Eq. (2) (our construction generalizes easily to a conjunction of such formulas). Assuming fixed models for S, A, and Y_1,...,Y_{j−1}, the fixpoint operation defining Y_j is monotone, i.e., the larger the current model for Y_j is, the more traces we need to add according to Eq. (2). Monotonicity allows us to apply the Knaster-Tarski theorem [47] and compute underapproximations of the fixpoint by iteration.

In our construction of an approximation for Y_j, we are given a mapping Δ that fixes a pair of automata for S, A, and Y_1,...,Y_{j−1} (due to the outside-in traversal in Algorithm 1). As we are computing an underapproximation, we use the underapproximation for each of the second-order variables in Δ. So the automata for S and A are over Σ^1, and for each j′ ∈ [j−1], the automaton for Y_{j′} is over Σ^{l_{j′}+1}. Given this fixed mapping Δ, we iteratively construct automata Ĉ_0, Ĉ_1,... over Σ^{l_j+1} that capture (increasingly precise) underapproximations of the solution for Y_j. We set Ĉ_0 to be the automaton with the empty language. We then recursively define Ĉ_{N+1} based on Ĉ_N as follows: for each second-order variable X_i with i ∈ [n] used in Eq. (2), we can assume a concrete assignment in the form of an automaton D_i over Σ^{T_i+1} for some T_i ≤ l_j. In case X_i ≠ Y_j (so X_i ∈ {S, A, Y_1,...,Y_{j−1}}), we set D_i := Δ(X_i). In case X_i = Y_j, we set D_i := Ĉ_N, i.e., we use the current approximation of Y_j in iteration N.
After we have set D_1,...,D_n, we compute an automaton Ċ over Σ^{l_j+1} that accepts zip(t_1,...,t_{l_j}, t) iff there exist traces ṫ_1,...,ṫ_n such that (1) zip(t_1,...,t_{T_i}, ṫ_i) ∈ L(D_i) for all i ∈ [n], (2) [π_1 ↦ t_1,...,π_{l_j} ↦ t_{l_j}, π̇_1 ↦ ṫ_1,...,π̇_n ↦ ṫ_n] satisfies ϕ_step, and (3) trace t equals ṫ_M (as in Eq. (2)). The intuition is that Ċ captures all traces that should be added to Y_j: given t_1,...,t_{l_j}, we check if there are traces ṫ_1,...,ṫ_n for the trace variables π̇_1,...,π̇_n in Eq. (2) where (1) each ṫ_i is in the assignment for X_i, which is captured by the automaton D_i over Σ^{T_i+1}, and (2) the traces ṫ_1,...,ṫ_n satisfy ϕ_step. If this is the case, we want to add ṫ_M (as stated in Eq. (2)). We then define Ĉ_{N+1} as the union of Ĉ_N and Ċ, i.e., we extend the previous model with all (potentially new) traces that need to be added.
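The union step Ĉ_{N+1} = Ĉ_N ∪ Ċ can be illustrated on finite words, with a toy step condition (agreement on observed positions, loosely modeled after the knowledge examples) in place of the automaton construction. Everything below is a simplified sketch, not the actual automata-based procedure.

```python
from itertools import product

# Universe of finite "traces" over {'a', 'b'} of length 3.
universe = [''.join(w) for w in product('ab', repeat=3)]

def step(current, observed=(0,)):
    """One application of an Eq.-(2)-style operator: add every trace of
    the universe that agrees with some trace already in `current` on the
    observed positions (a toy indistinguishability relation)."""
    added = {u for u in universe
             if any(all(u[i] == t[i] for i in observed) for t in current)}
    return current | added

# Iterate C_0 = seed, C_{N+1} = C_N ∪ step(C_N) up to the fixpoint.
c = {'aaa'}
while True:
    nxt = step(c)
    if nxt == c:
        break
    c = nxt
print(sorted(c))  # → ['aaa', 'aab', 'aba', 'abb']
```

Monotonicity of `step` guarantees that every intermediate `c` is a sound underapproximation of the final fixpoint, which is exactly the property Algorithm 1 relies on.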

**Computing Overapproximations.** As noted above, conditions of the form of Eq. (2) always define fixpoint constraints. To compute upper bounds on such fixpoint constructions, we make use of Park's theorem [48], which states that if we find some set (or automaton) B that is inductive (i.e., when computing all traces that we would need to add assuming the current model of Y_j is B, we end up only with traces that are already in B), then B overapproximates the unique solution (i.e., the least fixpoint) of Y_j. To derive such an inductive invariant, we employ techniques developed in the context of regular model checking [15] (see Sect. 7). Concretely, we employ the approach from [19], which uses automata learning [2] to find suitable invariants. While the approach from [19] is limited to finite words, we extend it to an ω-setting by interpreting an automaton accepting finite words as one that accepts an ω-word u iff every prefix of u is accepted.<sup>2</sup> As soon as the learner provides a candidate for an equivalence check, we check whether it is inductive and, if not, provide some finite counterexample (see [19] for details). If the automaton is inductive, we return it as a potential overapproximation.

<sup>2</sup> This effectively poses the assumption that the step formula specifies a safety property, which is the case for almost all of our examples. For instance, common knowledge yields a safety property: in each step, we add all traces for which there exists some trace that agrees on all propositions observed by the respective agent.

Should this approximation not be precise enough, the first-order model checking (Sect. 5.3) returns some concrete counterexample, i.e., some trace contained in the invariant but violating the property, which we use to provide more counterexamples to the learner.
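The prefix-based ω-interpretation described above can be sketched for deterministic, complete automata and ultimately periodic words u·v^ω. This is a simplification for illustration (HySO works with learned automata); the function name `accepts_omega` is ours.

```python
def accepts_omega(delta, q0, accepting, u, v):
    """Interpret a deterministic, complete finite-word automaton as an
    omega-automaton: the lasso word u v^ω is accepted iff every finite
    prefix is accepted, i.e., iff the unique run never leaves the
    accepting states. The run is traced until a v-iteration revisits a
    state, which must happen since the state space is finite."""
    if q0 not in accepting:          # the empty prefix must be accepted
        return False
    q = q0
    for a in u:
        q = delta[(q, a)]
        if q not in accepting:       # some finite prefix is rejected
            return False
    seen = set()
    while q not in seen:
        seen.add(q)
        for a in v:
            q = delta[(q, a)]
            if q not in accepting:
                return False
    return True

# Safety automaton for "at most one a": state 2 is the rejecting sink.
delta = {(0, 'a'): 1, (0, 'b'): 0,
         (1, 'a'): 2, (1, 'b'): 1,
         (2, 'a'): 2, (2, 'b'): 2}
print(accepts_omega(delta, 0, {0, 1}, 'a', 'b'))   # a b^ω    → True
print(accepts_omega(delta, 0, {0, 1}, 'a', 'ab'))  # a (ab)^ω → False
```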

## **6 Implementation and Experiments**

We have implemented our model-checking algorithm in a prototype tool we call HySO (**Hy**perproperties with **S**econd **O**rder).<sup>3</sup> Our tool uses spot [29] for basic automata operations (such as LTL-to-NBA translations and complementations). To compute under- and overapproximations, we use the techniques described in Sect. 5.5. We evaluate the algorithm on the following benchmarks.

**Muddy Children.** The muddy children puzzle [30] is one of the classic examples in the common knowledge literature. The puzzle consists of n children standing such that each child can see all other children's faces. Of the n children, an unknown number k ≥ 1 have a muddy forehead, and in incremental rounds, the children should step forward once they know whether their forehead is muddy. Consider the scenario of n = 2 and k = 1, so child a sees that child b has a muddy forehead, and child b sees that a is clean. In this case, b immediately steps forward, as it knows that its forehead is muddy since k ≥ 1. In the next step, a knows that its face is clean since b stepped forward in round 1. In general, one can prove that all muddy children step forward in round k, deriving common knowledge.
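The reasoning behind the puzzle can be reproduced with a small possible-worlds simulation (a toy Kripke-style model for intuition, not the Hyper<sup>2</sup>LTLfp encoding used in our experiments; the function name is ours).

```python
from itertools import product

def muddy_children(actual):
    """Possible-worlds simulation of the muddy children puzzle.
    actual[i] is True iff child i is muddy. Returns, for each child, the
    round in which it steps forward (i.e., knows its own forehead).
    It is common knowledge that at least one child is muddy, and which
    children step forward each round is publicly observed."""
    n = len(actual)
    worlds = {w for w in product((False, True), repeat=n) if any(w)}
    rounds = [None] * n

    def knowers(w, ws):
        # Children that, in world w, can deduce their own state given ws:
        # all worlds in ws agreeing with w on the others' foreheads
        # assign them the same value.
        return frozenset(
            i for i in range(n)
            if len({v[i] for v in ws
                    if all(v[j] == w[j] for j in range(n) if j != i)}) == 1)

    for rnd in range(1, n + 1):
        stepping = knowers(tuple(actual), worlds)
        for i in stepping:
            if rounds[i] is None:
                rounds[i] = rnd
        if all(r is not None for r in rounds):
            break
        # Public announcement: keep only the worlds in which the same
        # children would have stepped forward in this round.
        worlds = {w for w in worlds if knowers(w, worlds) == stepping}
    return rounds

print(muddy_children((True, False)))       # child 0 muddy → [1, 2]
print(muddy_children((True, True, True)))  # all muddy     → [3, 3, 3]
```

The muddy children step forward in round k and the clean ones one round later, matching the informal argument above.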

For each n, we construct a transition system T_n that encodes the muddy children scenario with n children. For every m, we design a Hyper<sup>2</sup>LTLfp formula ϕ_m that adds to the common knowledge set X all traces that appear indistinguishable in the first m steps to some child. We then specify that all traces in X agree on all inputs, asserting that all inputs are common knowledge.<sup>4</sup> We used HySO to *fully automatically* check T_n against ϕ_m for varying values of n and m, i.e., we checked whether, after the first m steps, the inputs of all children are common knowledge. As expected, the property holds only if m ≥ n (in the worst case, where all children are dirty (k = n), the inputs of all children only become common knowledge after n steps). We depict the results in Table 1a.

**Asynchronous Hyperproperties.** As we have shown in Sect. 4.2, we can encode arbitrary AHLTL properties into Hyper<sup>2</sup>LTLfp. We verified synchronous and asynchronous versions of observational determinism (cf. Sect. 4.2) on programs taken from [3,5,9]. We depict the verification results in Table 1b. Recall that Hyper<sup>2</sup>LTLfp properties without any second-order variables correspond to

<sup>3</sup> Our tool is publicly available at https://doi.org/10.5281/zenodo.7877144.

<sup>4</sup> This property is not expressible in non-hyper logics such as LTL<sub>K,C</sub>, where we can only check *trace properties* on the common knowledge set X. In contrast, Hyper<sup>2</sup>LTLfp allows us to check *hyperproperties* on X. That way, we can express that some value is common knowledge (i.e., equal across all traces in the set) and not only that a property is common knowledge (i.e., holds on all traces in the set).

**Table 1.** In Table 1a, we check common knowledge in the muddy children puzzle for n children and m rounds. We give the result (✓ if common knowledge holds and ✗ if it does not) and the running time. In Table 1b, we check synchronous and asynchronous versions of observational determinism. We depict the number of iterations needed and the running time. Times are given in seconds.


HyperQPTL formulas. HySO can check such properties precisely, i.e., it constitutes a sound-and-complete model checker for HyperQPTL properties with an arbitrary quantifier prefix. The synchronous version of observational determinism is a HyperLTL property and thus needs no second-order approximation (we set the method column to "-" in these cases).

**Common Knowledge in Multi-agent Systems.** We used HySO for an automatic analysis of the system in Fig. 1. Here, we verify that on the initial trace {a}^n{d}^ω it is CK that a holds in the first step. We use a formula similar to the one in Sect. 3.3, adapted to state that a is CK. As expected, HySO requires 2^n − 1 iterations to converge. We depict the results in Table 2a.

**Mazurkiewicz Traces.** Mazurkiewicz traces are an important concept in the theory of distributed computing [27]. Let I ⊆ Σ × Σ be an independence relation that determines when two consecutive letters can be swapped (think of two actions of disjoint processes in a distributed system). Any t ∈ Σ^ω then defines the set of all traces that can be obtained from t by flipping consecutive independent letters an arbitrary number of times (the equivalence class of all these traces is called the Mazurkiewicz trace of t). See [27] for details. The verification problem for Mazurkiewicz traces asks if, given some t ∈ Σ^ω, all traces in the Mazurkiewicz trace of t satisfy some property ψ. Using Hyper<sup>2</sup>LTLfp, we can directly reason about the Mazurkiewicz trace of any given trace by requiring that all traces that are equal up to one swap of independent letters are also in a given set (which is easily expressed in Hyper<sup>2</sup>LTLfp).
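For finite words, the equivalence class generated by swapping adjacent independent letters can be computed by a simple closure. This is a finite-word toy for intuition only; the actual benchmarks reason about infinite traces via the set-based encoding above.

```python
def mazurkiewicz_class(word, independent):
    """Compute the Mazurkiewicz equivalence class of a finite word:
    the closure under swapping adjacent independent letters.
    `independent` is a symmetric set of letter pairs."""
    cls, frontier = {word}, [word]
    while frontier:
        w = frontier.pop()
        for i in range(len(w) - 1):
            if (w[i], w[i + 1]) in independent:
                swapped = w[:i] + w[i + 1] + w[i] + w[i + 2:]
                if swapped not in cls:
                    cls.add(swapped)
                    frontier.append(swapped)
    return cls

# Letters 'a' and 'b' are independent; 'c' is dependent on both.
I = {('a', 'b'), ('b', 'a')}
print(sorted(mazurkiewicz_class('abc', I)))  # → ['abc', 'bac']
```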

**Table 2.** In Table 2a, we check common knowledge in the example from Fig. 1 when starting with {a}^n{d}^ω for varying values of n. We depict the number of refinement iterations, the result, and the running time. In Table 2b, we verify various properties of Mazurkiewicz traces. We depict whether the property could be verified or refuted by iteration or automata learning, the result, and the time. Times are given in seconds.


Using HySO, we verify a selection of such trace properties, which often requires non-trivial reasoning to come up with a suitable invariant. We depict the results in Table 2b. In our preliminary experiments, we model a situation where we start with {a}^1{}^ω and can swap the letters {a} and {}. We then, e.g., ask whether on every trace in the resulting Mazurkiewicz trace, a holds at most once, which requires an inductive invariant and cannot be established by iteration.

## **7 Related Work**

In recent years, many logics for the formal specification of hyperproperties have been developed, extending temporal logics with explicit path quantification (examples include HyperLTL and HyperCTL<sup>∗</sup> [20], HyperQPTL [10,45], HyperPDL [38], and HyperATL<sup>∗</sup> [5,9]), or extending first- and second-order logics with an equal-level predicate [25,33]. Others study ω-regular [14,37] and context-free hyperproperties [35], or discuss hyperproperties over data and modulo theories [24,31]. Hyper<sup>2</sup>LTL is the first temporal logic that reasons about second-order hyperproperties, which allows it to capture many existing (epistemic, asynchronous, etc.) hyperlogics while at the same time taking advantage of model-checking solutions that have proven successful in first-order settings.

*Asynchronous Hyperproperties.* For asynchronous hyperproperties, Gutsfeld et al. [39] present an asynchronous extension of the polyadic μ-calculus. Bozzelli et al. [17] extend HyperLTL with temporal operators that are only evaluated when the truth value of some temporal formula changes. Baumeister et al. present AHLTL [3], which extends HyperLTL with an explicit quantification over trajectories and can be directly encoded within Hyper<sup>2</sup>LTLfp.

*Regular Model Checking.* Regular model checking [15] is a general verification method for (possibly infinite-state) systems, in which each state of the system is interpreted as a finite word. The transitions of the system are given as a finite-state (regular) transducer, and the model-checking problem asks if, from some initial set of states (given as a regular language), some bad state is eventually reachable. Many methods for automated regular model checking have been developed [12,13,19,26]. Hyper<sup>2</sup>LTL can be seen as a logical foundation for ω-regular model checking: assume the set of initial states is given as a QPTL formula ϕ_init, the set of bad states is given as a QPTL formula ϕ_bad, and the transition relation is given as a QPTL formula ϕ_step over trace variables π and π′. The set of bad states is reachable from a trace (state) in ϕ_init iff the following Hyper<sup>2</sup>LTLfp formula holds on the system that generates all traces:

$$\begin{aligned} \Big(X, \forall,\; &\forall \pi \in \mathfrak{S}.\ \varphi_{init}(\pi) \to \pi \rhd X\ \land \\ &\forall \pi \in X.\ \forall \pi' \in \mathfrak{S}.\ \varphi_{step}(\pi, \pi') \to \pi' \rhd X\Big).\ \forall \pi \in X.\ \neg \varphi_{bad}(\pi) \end{aligned}$$

Conversely, Hyper<sup>2</sup>LTLfp can express more complex properties, beyond the reachability checks possible in the framework of (ω-)regular model checking.
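On a finite state space, this reachability view of regular model checking can be sketched with explicit sets of states-as-words in place of regular languages and a function in place of the transducer. This is an illustration only; real regular model checkers manipulate automata symbolically.

```python
def reachable_bad(init, step, is_bad, max_iters=100):
    """Naive regular-model-checking loop on an explicit (finite) set of
    states-as-words: saturate the reachable set under `step` and check
    whether a bad state is ever reached."""
    reach = set(init)
    for _ in range(max_iters):
        if any(is_bad(w) for w in reach):
            return True
        new = {s for w in reach for s in step(w)} - reach
        if not new:                      # fixpoint: nothing new reachable
            return False
        reach |= new
    raise RuntimeError("no fixpoint within bound")

# Token passing: a word over {'t', 'n'} with a single token 't' that the
# (transducer-like) step relation moves one position to the right.
def step(w):
    i = w.index('t')
    return {w[:i] + 'nt' + w[i + 2:]} if i < len(w) - 1 else set()

print(reachable_bad({'tnnn'}, step, lambda w: w.endswith('t')))    # → True
print(reachable_bad({'tnnn'}, step, lambda w: w.startswith('tt'))) # → False
```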

*Model Checking Knowledge.* Model checking of knowledge properties in multi-agent systems was developed in the tools MCK [36] and MCMAS [42], which can express exactly LTL<sub>K</sub>. Bozzelli et al. [16] have shown that HyperCTL<sup>∗</sup> and LTL<sub>K</sub> have incomparable expressiveness, and present HyperCTL<sup>∗</sup><sub>lp</sub> – an extension of HyperCTL<sup>∗</sup> that can reason about the past – to unify HyperCTL<sup>∗</sup> and LTL<sub>K</sub>. While HyperCTL<sup>∗</sup><sub>lp</sub> can express the knowledge operator, it cannot capture common knowledge. LTL<sub>K,C</sub> [41] captures both knowledge and common knowledge, but the suggested model-checking algorithm only handles a decidable fragment that is reducible to LTL model checking.

## **8 Conclusion**

Hyperproperties play an increasingly important role in many areas of computer science. There is a strong need for specification languages and verification methods that reason about hyperproperties in a uniform and general manner, similar to what is standard for more traditional notions of safety and reliability. In this paper, we have ventured forward from the first-order reasoning of logics like HyperLTL into the realm of second-order hyperproperties, i.e., properties that not only compare individual traces but reason comprehensively about *sets* of such traces. With Hyper<sup>2</sup>LTL, we have introduced a natural specification language and a general model-checking approach for second-order hyperproperties. Hyper<sup>2</sup>LTL provides a general framework for a wide range of relevant hyperproperties, including common knowledge and asynchronous hyperproperties, which could previously only be studied with specialized logics and algorithms. Hyper<sup>2</sup>LTL also provides a starting point for future work on second-order hyperproperties in areas such as cyber-physical [44] and probabilistic systems [28].

**Acknowledgements.** We thank Jana Hofmann for the fruitful discussions. This work was supported by the European Research Council (ERC) Grant HYPER (No. 101055412), by DFG grant 389792660 as part of TRR 248 – CPEC, and by the German Israeli Foundation (GIF) Grant No. I-1513-407.2019.

## **References**






**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

**Neural Networks and Machine Learning**

## **Certifying the Fairness of KNN in the Presence of Dataset Bias**

Yannan Li, Jingbo Wang, and Chao Wang

University of Southern California, Los Angeles, CA 90089, USA {yannanli,jingbow,wang626}@usc.edu

**Abstract.** We propose a method for certifying the fairness of the classification result of a widely used supervised learning algorithm, the k-nearest neighbors (KNN), under the assumption that the training data may have historical bias caused by systematic mislabeling of samples from a protected minority group. To the best of our knowledge, this is the first certification method for KNN based on three variants of the fairness definition: individual fairness, ε-fairness, and label-flipping fairness. We first define the fairness certification problem for KNN and then propose sound approximations of the complex arithmetic computations used in the state-of-the-art KNN algorithm. This is meant to lift the computation results from the concrete domain to an abstract domain, to reduce the computational cost. We show the effectiveness of this *abstract interpretation* based technique through experimental evaluation on six datasets widely used in the fairness research literature. We also show that the method is accurate enough to obtain fairness certifications for a large number of test inputs, despite the presence of historical bias in the datasets.

## **1 Introduction**

Certifying the fairness of the classification output of a machine learning model has become an important problem. This is in part due to a growing interest in using machine learning techniques to make socially sensitive decisions in areas such as education, healthcare, finance, and criminal justice systems. One reason why the classification output may be biased against an individual from a protected minority group is that the dataset used to train the model may have historical bias; that is, there is systematic mislabeling of samples from the protected minority group. Thus, we must be extremely careful when using the classification output of a machine learning model, to avoid perpetuating or even amplifying historical bias.

One solution to this problem is to have the ability to certify, with certainty, that the classification output y = M(x) for an individual input x is fair, even though the model M is learned from a dataset T with historical bias. This is a

This work was partially funded by the U.S. National Science Foundation grants CNS-1702824, CNS-1813117 and CCF-2220345.

© The Author(s) 2023

C. Enea and A. Lal (Eds.): CAV 2023, LNCS 13965, pp. 335–357, 2023. https://doi.org/10.1007/978-3-031-37703-7\_16

**Fig. 1.** FairKNN: our method for certifying fairness of KNNs with label bias.

form of *individual fairness* that has been studied in the fairness literature [14]; it requires that the classification output remain the same for input x even if historical bias had not been present in the training dataset T. However, this is a challenging problem and, to the best of our knowledge, techniques for solving it efficiently are still severely lacking. Our work aims to fill this gap.

Specifically, we are concerned with three variants of the fairness definition. Let x = ⟨x_1,...,x_D⟩ be a D-dimensional input vector, and P be the subset of vector indices corresponding to the *protected* attributes (e.g., race, gender, etc.). The first variant of the fairness definition is *individual fairness*, which requires that similar individuals are treated similarly by the machine learning model. For example, if two individual inputs x and x′ differ only in some protected attribute x_i, where i ∈ P, but agree on all other attributes, the classification output must be the same. The second variant is *ε-fairness*, which extends the notion of individual fairness to include inputs whose unprotected attributes differ, as long as the difference is bounded by a small constant (ε). In other words, if two individual inputs are almost the same in all unprotected attributes, they should also have the same classification output. The third variant is *label-flipping fairness*, which requires the aforementioned fairness requirements to be satisfied even if a biased dataset T has been used to train the model in the first place. That is, as long as the number of mislabeled elements in T is bounded by n, the classification output must remain the same.

We want to certify the fairness of the classification output for a popular supervised learning technique called the k-nearest neighbors (KNN) algorithm. Our interest in KNN comes from the fact that, unlike many other machine learning techniques, KNN is a *model-less* technique and thus does not have the high cost associated with training a model. For this reason, KNN has been widely adopted in real-world applications [1,4,16,18,23,29,36,45,46]. However, obtaining a fairness certification for KNN is still challenging and, in practice, the most straightforward approach of *enumerating all possible scenarios* and then *checking whether the classification outputs obtained in these scenarios agree* would be prohibitively expensive.
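For reference, the concrete-domain KNN prediction amounts to a few lines, and the flip-and-compare fairness check below is exactly the kind of case-by-case enumeration that becomes infeasible once ε-perturbations and label flips are also considered. The dataset and attribute choices are made up for illustration.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Plain KNN prediction: majority label among the k nearest training
    samples (squared Euclidean distance; ties broken by sample order)."""
    nearest = sorted(train, key=lambda s: sum((a - b) ** 2
                                              for a, b in zip(s[0], x)))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Tiny dataset of (attributes, label) pairs; attribute 0 is "protected".
train = [((0, 1.0), 0), ((0, 1.2), 0), ((1, 0.9), 0),
         ((0, 3.0), 1), ((1, 3.1), 1), ((1, 2.8), 1)]

# Individual fairness for one input under k = 3: flipping the (binary)
# protected attribute must not change the prediction.
x = (0, 1.1)
y = knn_predict(train, x, k=3)
y_flipped = knn_predict(train, (1 - x[0], x[1]), k=3)
print(y, y_flipped, y == y_flipped)  # → 0 0 True
```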

To overcome the challenge, we propose an efficient method based on the idea of *abstract interpretation* [10]. Our method relies on sound approximations to analyze the arithmetic computations used by the state-of-the-art KNN algorithm both accurately and efficiently. Figure 1 shows an overview of our method in the lower half of the figure, which conducts the analysis in an abstract domain, and the default KNN algorithm in the upper half, which operates in the concrete domain. The main difference is that, by staying in the abstract domain, our method is able to analyze a large set of possible training datasets (derived from T due to n label flips) and a potentially infinite set of inputs (derived from x due to perturbation) symbolically, as opposed to analyzing a single training dataset and a single input concretely.

To the best of our knowledge, this is the first method for KNN fairness certification in the presence of dataset bias. While Meyer et al. [26,27] and Drews et al. [12] have investigated robustness certification techniques, their methods target decision trees and linear regression, which are different types of machine learning models from KNN. Our method also differs from the KNN data-poisoning robustness verification techniques developed by Jia et al. [20] and Li et al. [24], which do not focus on *fairness* at all; for example, they do not distinguish *protected* attributes from *unprotected* attributes. Furthermore, Jia et al. [20] consider the prediction step only while ignoring the learning step, and Li et al. [24] do not consider label flipping. Our method, in contrast, considers all of these cases.

We have implemented our method and demonstrated its effectiveness through experimental evaluation. We used six datasets widely used in the fairness research literature as benchmarks. Our evaluation results show that the proposed method is efficient in analyzing the complex arithmetic computations used in the state-of-the-art KNN algorithm, and is accurate enough to obtain fairness certifications for a large number of test inputs. To better understand the impact of historical bias, we also compared the fairness certification success rates across different demographic groups.

To summarize, this paper makes the following contributions:


The remainder of this paper is organized as follows. We first present the technical background in Sect. 2 and then give an overview of our method in Sect. 3. Next, we present our detailed algorithms for certifying the KNN prediction step in Sect. 4 and certifying the KNN learning step in Sect. 5. This is followed by our experimental results in Sect. 6. We review the related work in Sect. 7 and, finally, give our conclusion in Sect. 8.

## **2 Background**

Let L be a supervised learning algorithm that takes a training dataset T as input and returns a learned model M = L(T) as output. The training set T = {(x, y)} is a set of labeled samples, where each x ∈ X ⊆ R^D has D real-valued attributes, and y ∈ Y ⊆ N is a class label. The learned model M : X → Y is a function that returns the classification output y ∈ Y for any input x ∈ X.

#### **2.1 Fairness of the Learned Model**

We are concerned with the *fairness* of the classification output M(x) for an individual input x. Let P be the set of vector indices corresponding to the protected attributes of x ∈ X. We say that x_i is a protected attribute (e.g., race, gender, etc.) if and only if i ∈ P.

**Definition 1 (Individual Fairness).** *For an input x, the classification output M(x) is fair if, for any input x′ such that (1) x_j ≠ x′_j for some j ∈ P and (2) x_i = x′_i for all i ∉ P, we have M(x) = M(x′).*

It means that two individuals (x and x′) differing only in some protected attribute (e.g., gender) but agreeing on all other attributes must be treated equally. While intuitive and useful, this notion of fairness may be too narrow. For example, if two individuals differ in some unprotected attributes and yet the difference is considered *immaterial*, they should still be treated equally. This can be captured by ε-fairness.

**Definition 2 (ε-Fairness).** *For an input x, the classification output M(x) is fair if, for any input x′ such that (1) x_j ≠ x′_j for some j ∈ P and (2) |x_i − x′_i| ≤ ε for all i ∉ P, we have M(x) = M(x′).*

In this case, such inputs x′ form a set. Let Δ_ε(x) be the set of all inputs x′ considered in the ε-fairness definition. That is, Δ_ε(x) := {x′ | x_j ≠ x′_j for some j ∈ P, and |x_i − x′_i| ≤ ε for all i ∉ P}. By requiring M(x) = M(x′) for all x′ ∈ Δ_ε(x), ε-fairness guarantees that a larger set of individuals similar to x are treated equally.
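Membership in Δ_ε(x) is a simple predicate, transcribed directly from the definition (the function name and example values are made up for illustration).

```python
def in_delta_eps(x, xp, protected, eps):
    """Membership test for the set Δ_ε(x): xp differs from x in some
    protected attribute and deviates by at most eps in every
    unprotected attribute."""
    differs_protected = any(x[j] != xp[j] for j in protected)
    close_unprotected = all(abs(x[i] - xp[i]) <= eps
                            for i in range(len(x)) if i not in protected)
    return differs_protected and close_unprotected

x = (0, 1.0, 2.0)          # attribute 0 is protected
print(in_delta_eps(x, (1, 1.05, 2.0), {0}, eps=0.1))  # → True
print(in_delta_eps(x, (1, 1.5, 2.0), {0}, eps=0.1))   # → False
```

Deciding membership for one candidate is trivial; the difficulty discussed next is that Δ_ε(x) as a whole cannot be enumerated.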

Individual fairness can be viewed as a special case of ε-fairness, where ε = 0. In contrast, when ε > 0, the number of elements in Δ_ε(x) is often large and sometimes infinite. Therefore, the most straightforward approach of certifying fairness by enumerating all possible elements in Δ_ε(x) would not work. Instead, any practical solution has to rely on abstraction.
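To make the definition concrete, the membership test for Δ_ε(x) can be sketched in a few lines of Python (the function name and the uniform scalar ε are our own simplifications; a per-attribute ε works the same way):

```python
def in_delta_eps(x, x_prime, protected, eps):
    """Membership test for Delta_eps(x): x' must differ from x in some
    protected attribute, and differ by at most eps in every unprotected one."""
    differs_protected = any(x[j] != x_prime[j] for j in protected)
    close_unprotected = all(abs(x[i] - x_prime[i]) <= eps
                            for i in range(len(x)) if i not in protected)
    return differs_protected and close_unprotected
```

With ε = 0 the unprotected attributes must match exactly, which recovers Definition 1.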

#### **2.2 Fairness in the Presence of Dataset Bias**

Due to historical bias, the training dataset T may contain samples whose outputs are unfairly labeled. Let the number of such samples be bounded by n. We assume that there are no additional clues available to help identify the mislabeled samples. Without knowing which samples these are, fairness certification must consider all possible scenarios. Each scenario corresponds to a de-biased dataset, T', constructed by flipping back the incorrect labels in T. Let dBias_n(T) = {T'} be the set of these possible de-biased (clean) datasets. Ideally, we want all of them to lead to the same classification output.

**Definition 3 (Label-flipping Fairness).** *For an input* x, *the classification output* M(x) *is fair against label-flipping bias of at most* n *elements in the dataset* T *if, for all* T' ∈ dBias_n(T), *we have* M'(x) = M(x) *where* M' = L(T').

Label-flipping fairness differs from and yet complements individual and ε-fairness in the following sense. While individual and ε-fairness guarantee equal output for similar inputs, label-flipping fairness guarantees equal output for similar datasets. Both aspects of fairness are practically important. By combining them, we are able to define the entire problem of certifying fairness in the presence of historical bias.

To understand the complexity of the fairness certification problem, we need to look at the size of the set dBias_n(T), similar to how we have analyzed the size of Δ_ε(x). While the size of dBias_n(T) is always finite, it can be astronomically large in practice. Let q be the number of unique class labels and m be the actual number of flipped elements in T. Assuming that each flipped label may take any of the other q − 1 possible labels, the total number of possible *clean sets* is $\binom{|T|}{m} \cdot (q-1)^m$ for each m. Since m ≤ n, we have

$$|dBias\_n(T)| = \sum\_{m=1}^{n} \binom{|T|}{m} \cdot (q-1)^m.$$

Again, the number of elements in dBias_n(T) is too large to enumerate, which means any practical solution has to rely on abstraction.
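As a sanity check on this count, the formula can be evaluated directly (a small sketch; the function name is ours):

```python
from math import comb

def dbias_count(T_size, n, q):
    """|dBias_n(T)|: choose m samples to un-flip (m = 1..n), and for each
    chosen sample pick one of the (q - 1) alternative labels."""
    return sum(comb(T_size, m) * (q - 1) ** m for m in range(1, n + 1))
```

For instance, even a binary-classification dataset (q = 2) with |T| = 50,000 and n = 50 yields a count greater than 10^100, far beyond what enumeration can handle.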

## **3 Overview of Our Method**

Given the tuple ⟨T, P, n, ε, x⟩, where T is the training set, P represents the protected attributes, n bounds the number of biased elements in T, and ε bounds the perturbation of x, our method checks if the KNN classification output for x is fair.

#### **3.1 The KNN Algorithm**

Since our method relies on an *abstract interpretation* of the KNN algorithm, we first explain how the KNN algorithm operates in the concrete domain (this subsection), and then lift it to the abstract domain in the next subsection.

As shown in Fig. 2, KNN has a prediction step, where KNN_predict computes the output label for an input x using T and a given parameter K, and a learning step, where KNN_learn computes the K value from the training set T.

Unlike many other machine learning techniques, KNN does not have an explicit model M; instead, M can be regarded as the combination of T and K.

```
1  func KNN_predict(T, K, x) {
2    Let T_x^K = the K nearest neighbors of x in T;
3    Let Freq(T_x^K) = the most frequent label in T_x^K;
4    return Freq(T_x^K);
5  }
6
7  func KNN_learn(T) {
8    for (each candidate k value) {  // conducting p-fold cross validation
9      Let {G_i} = a partition of T into p groups of roughly equal size;
10     Let err_i^k = {(x, y) ∈ G_i | y ≠ KNN_predict(T \ G_i, k, x)} for each G_i;
11   }
12   Let K = argmin_k (1/p) * Σ_{i=1}^{p} |err_i^k| / |G_i|;
13   return K;
14 }
```
**Fig. 2.** The KNN algorithm, consisting of the prediction and learning steps.

Inside KNN_predict, the set T_x^K represents the K nearest neighbors of x in the dataset T, where distance is measured by the Euclidean (or Manhattan) distance in the input vector space. Freq(T_x^K) is the most frequent label in T_x^K.

Inside KNN_learn, a technique called p*-fold cross validation* is used to select the optimal value for K, e.g., from a set of candidate k values in the range [1, |T| × (p−1)/p], by minimizing the classification error, as shown in Line 12. This is accomplished by first partitioning T into p groups of roughly equal size (Line 9), and then computing err_i^k (the set of misclassified samples from G_i) by treating G_i as the evaluation set and T \ G_i as the training set. Here, a sample (x, y) ∈ G_i is "misclassified" if the expected output label, y, differs from the output of KNN_predict using the candidate k value.
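For reference, the concrete algorithm of Fig. 2 can be sketched in Python as follows (a simplified sketch: plain Euclidean distance, a round-robin partition instead of a random one, and names of our choosing):

```python
import math
from collections import Counter

def knn_predict(T, k, x):
    """Most frequent label among the k nearest neighbors of x in T,
    where T is a list of (attribute_tuple, label) pairs."""
    nearest = sorted(T, key=lambda s: math.dist(s[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def knn_learn(T, candidates, p=5):
    """Select K by p-fold cross validation, minimizing the average error rate."""
    groups = [T[i::p] for i in range(p)]  # a simple round-robin partition
    def avg_err(k):
        rates = []
        for G in groups:
            rest = [s for s in T if s not in G]   # T \ G_i as training set
            errs = [(x, y) for (x, y) in G if knn_predict(rest, k, x) != y]
            rates.append(len(errs) / len(G))
        return sum(rates) / p
    return min(candidates, key=avg_err)
```

The abstract versions of these two functions, developed in Sects. 4 and 5, replace the concrete dataset and input with symbolic sets.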

#### **3.2 Certifying the KNN Algorithm**

Algorithm 1 shows the top-level procedure of our fairness certification method, which first executes the KNN algorithm in the concrete domain (Lines 1–2), to obtain the default K and y, and then starts our analysis in the abstract domain.


**Algorithm 1:** Top-level procedure of our fairness certification method.

```
1 K = KNN_learn(T);
2 y = KNN_predict(T, K, x);
3 KSet = abs_KNN_learn(T, n);
4 for each K ∈ KSet do
5   if abs_KNN_predict_same(T, n, K, x, y) = False then
6     return unknown;
7   end if
8 end for
9 return certified;
```
In the *abstract* learning step (Line 3), instead of considering T, our method considers the set of all clean datasets in dBias_n(T) symbolically, to compute the set of possible optimal K values, denoted KSet.

In the *abstract* prediction step (Lines 4–8), for each K, instead of considering the input x alone, our method considers all perturbed inputs in Δ_ε(x) and all clean datasets in dBias_n(T) symbolically, to check if the classification output always stays the same. Our method returns "certified" only when the classification output always stays the same (Line 9); otherwise, it returns "unknown" (Line 6).

We only perturb numerical attributes in the input x since perturbing categorical or binary attributes often does not make sense in practice.

In the next two sections, we present our detailed algorithms for abstracting the prediction step and the learning step, respectively.

## **4 Abstracting the KNN Prediction Step**

We start with abstract KNN prediction, which is captured by the subroutine *abs_KNN_predict_same* used in Line 5 of Algorithm 1. It consists of two parts. The first part (presented in Sect. 4.1) computes a superset of T_x^K, denoted overNN, while considering the impact of perturbation of the input x. The second part (presented in Sect. 4.2) leverages overNN to decide if the classification output always stays the same, while considering the impact of label-flipping bias in the dataset T.

#### **4.1 Finding the** *K***-Nearest Neighbors**

To compute overNN, which is the set of samples in T that *may be* among the K nearest neighbors of the test input x, we must be able to compute the distance between x and each sample in T.

This is not a problem in the concrete domain, since the set of K nearest neighbors of x in T, denoted T_x^K, is fixed and determined solely by the Euclidean distance between x and each sample in T in the attribute space. However, when ε-perturbation is applied to x, the distances change and, as a result, the K nearest neighbors of x may also change.

Fortunately, the distance in the attribute space is not affected by label-flipping bias in the dataset T, since label-flipping only impacts sample labels, not sample attributes. Thus, in this subsection, we only need to consider the impact of perturbation of the input x.

**The Challenge.** Due to perturbation, a single test input x becomes a potentially-infinite set of inputs Δ_ε(x). Since our goal is to over-approximate the K nearest neighbors of Δ_ε(x), the expectation is that, as long as there exists some x' ∈ Δ_ε(x) such that a sample t in T is one of the K nearest neighbors of x', denoted t ∈ T_{x'}^K, we must include t in the set overNN. That is,

$$\bigcup\_{x' \in \Delta^\epsilon(x)} T\_{x'}^K \subseteq overNN \subseteq T.$$

However, finding an efficient way of computing overNN is a challenging task. As explained before, the naive approach of enumerating each x' ∈ Δ_ε(x), computing its K nearest neighbors T_{x'}^K, and taking the union of all of them would not work. Instead, we need an abstraction that is both efficient and accurate enough in practice.

Our solution is, for each sample t in T, to first analyze the distances between t and all inputs in Δ_ε(x) symbolically, to compute a lower bound and an upper bound of these distances. Then, we leverage these lower and upper bounds to compute the set overNN, which is a superset of the samples in T that may become the K nearest neighbors of Δ_ε(x).

**Bounding the Distance Between** Δ_ε(x) **and** t**.** Assume that x = (x_1, x_2, ..., x_D) and t = (t_1, t_2, ..., t_D) are two real-valued vectors in the D-dimensional attribute space. Let ε = (ε_1, ε_2, ..., ε_D), where ε_i ≥ 0, be the small perturbation. Thus, the perturbed input is x' = (x'_1, x'_2, ..., x'_D) = (x_1 + δ_1, x_2 + δ_2, ..., x_D + δ_D), where δ_i ∈ [−ε_i, ε_i] for all i = 1, ..., D.

The distance between x and t is a fixed value $d(x,t) = \sqrt{\sum\_{i=1}^{D}(x\_i - t\_i)^2}$, since both x and the samples t in T are fixed, but the distance between x' ∈ Δ_ε(x) and t is a function of the δ_i ∈ [−ε_i, ε_i], since $\sqrt{\sum\_{i=1}^{D}(x'\_i - t\_i)^2} = \sqrt{\sum\_{i=1}^{D}(x\_i - t\_i + \delta\_i)^2}$. For ease of presentation, we define the distance as $d^\epsilon = \sqrt{\sum\_{i=1}^{D} d\_i^\epsilon}$, where $d\_i^\epsilon = (x\_i - t\_i + \delta\_i)^2$ is the (squared) distance function in the i-th dimension. Then, our goal becomes computing the lower bound, LB(d^ε), and the upper bound, UB(d^ε), over the domain δ_i ∈ [−ε_i, ε_i] for all i = 1, ..., D.

**Distance Bounds Are Compositional.** Our first observation is that bounds on the distance d^ε as a whole can be computed using bounds in the individual dimensions. To see why this is the case, consider the (squared) distance in the i-th dimension, d_i^ε = (x_i − t_i + δ_i)^2, where δ_i ∈ [−ε_i, ε_i], and the (squared) distance in the j-th dimension, d_j^ε = (x_j − t_j + δ_j)^2, where δ_j ∈ [−ε_j, ε_j]. By definition, d_i^ε is completely independent of d_j^ε when i ≠ j.

Thus, the lower bound of d^ε, denoted LB(d^ε), can be calculated by finding the lower bound of each d_i^ε in the i-th dimension. Similarly, the upper bound of d^ε, denoted UB(d^ε), can be calculated by finding the upper bound of each d_i^ε in the i-th dimension. That is,

$$LB(d^\epsilon) = \sqrt{\sum\_{i=1}^{D} LB(d\_i^\epsilon)} \quad \text{and} \quad UB(d^\epsilon) = \sqrt{\sum\_{i=1}^{D} UB(d\_i^\epsilon)}.$$

**Four Cases in Each Dimension.** Our second observation is that, by utilizing the mathematical nature of the (squared) distance function, we can calculate the minimum and maximum values of d_i^ε, which can then be used as the lower bound LB(d_i^ε) and upper bound UB(d_i^ε), respectively.

Specifically, in the i-th dimension, the (squared) distance function d_i^ε = ((x_i − t_i) + δ_i)^2 may be rewritten as (δ_i + A)^2, where A = (x_i − t_i) is a constant and δ_i ∈ [−ε_i, +ε_i] is a variable. The function can be plotted in two-dimensional space, using δ_i as the x-axis and the output of the function as the y-axis; thus, it is a quadratic function Y = (X + A)^2.

**Fig. 3.** Four cases for computing the upper and lower bounds of the distance function d_i^ε(δ_i) = (δ_i + A)^2 for δ_i ∈ [−ε_i, ε_i]. In these figures, δ_i is the x-axis and d_i^ε is the y-axis; LB denotes LB(d_i^ε), and UB denotes UB(d_i^ε).

Figure 3 shows the plot, which reminds us of where the minimum and maximum values of a quadratic function are. There are two versions of the quadratic function, depending on whether A > 0 (the two subfigures at the top) or A < 0 (the two subfigures at the bottom). Each version also has two cases, depending on whether the perturbation interval [−ε_i, ε_i] falls inside the constant interval [−|A|, |A|] (the two subfigures on the left) or falls outside (the two subfigures on the right). Thus, there are four cases in total.

In each case, the maximal and minimal values of the quadratic function are different, as shown by the LB and UB marks in Fig. 3.

*Case (a).* This is when (x_i − t_i) > 0 and −ε_i > −(x_i − t_i), which is the same as saying A > 0 and −ε_i > −A. In this case, the function d_i^ε(δ_i) = (δ_i + A)^2 is monotonically increasing w.r.t. the variable δ_i ∈ [−ε_i, +ε_i].

Thus, LB(d_i^ε) = (−ε_i + (x_i − t_i))^2 and UB(d_i^ε) = (+ε_i + (x_i − t_i))^2.

*Case (b).* This is when (x_i − t_i) > 0 and −ε_i < −(x_i − t_i), which is the same as saying A > 0 and −ε_i < −A. In this case, the function is not monotonic. The minimal value is 0, obtained when δ_i = −A. The maximal value is obtained when δ_i = +ε_i.

Thus, LB(d_i^ε) = 0 and UB(d_i^ε) = (+ε_i + (x_i − t_i))^2.

*Case (c).* This is when (x_i − t_i) < 0 and ε_i < −(x_i − t_i), which is the same as saying A < 0 and ε_i < −A. In this case, the function is monotonically decreasing w.r.t. the variable δ_i ∈ [−ε_i, +ε_i].

Thus, LB(d_i^ε) = (ε_i + (x_i − t_i))^2 and UB(d_i^ε) = (−ε_i + (x_i − t_i))^2.

*Case (d).* This is when (x_i − t_i) < 0 and ε_i > −(x_i − t_i), which is the same as saying A < 0 and ε_i > −A. In this case, the function is not monotonic. The minimal value is 0, obtained when δ_i = −A. The maximal value is obtained when δ_i = −ε_i.

Thus, LB(d_i^ε) = 0 and UB(d_i^ε) = (−ε_i + (x_i − t_i))^2.

*Summary.* By combining the above four cases, we compute the bounds of the entire distance function d^ε as follows:

$$\left[\sqrt{\sum\_{i=1}^{D} \max(|x\_i - t\_i| - \epsilon\_i, 0)^2}, \quad \sqrt{\sum\_{i=1}^{D} (|x\_i - t\_i| + \epsilon\_i)^2}\right]$$

Here, the take-away message is that, since x_i, t_i and ε_i are all fixed values, the upper and lower bounds can be computed in constant time, even though there is a potentially-infinite number of inputs in Δ_ε(x).
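The interval above translates directly into code; a sketch (names ours) with a per-attribute perturbation ε:

```python
def distance_bounds(x, t, eps):
    """Bounds on the distance between Delta_eps(x) and a sample t, from the
    per-dimension analysis: dimension i contributes at least
    max(|x_i - t_i| - eps_i, 0)^2 and at most (|x_i - t_i| + eps_i)^2."""
    lb_sq = sum(max(abs(xi - ti) - ei, 0.0) ** 2 for xi, ti, ei in zip(x, t, eps))
    ub_sq = sum((abs(xi - ti) + ei) ** 2 for xi, ti, ei in zip(x, t, eps))
    return lb_sq ** 0.5, ub_sq ** 0.5
```

With ε = 0 in every dimension, both bounds collapse to the concrete Euclidean distance d(x, t).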

**Computing** overNN **Using Bounds.** With the upper and lower bounds of the distance between Δ_ε(x) and each sample t in the dataset T, denoted [LB(d^ε(x, t)), UB(d^ε(x, t))], we are ready to compute overNN such that every t ∈ overNN *may be* among the K nearest neighbors of Δ_ε(x).

Let UB_Kmin denote the K-th minimum value of UB(d^ε(x, t)) over all t ∈ T. Then, we define overNN as the set of samples in T whose LB(d^ε(x, t)) is *not greater than* UB_Kmin. In other words,

$$overNN = \{ t \in T \mid LB(d^\epsilon(x, t)) \le UB\_{Kmin} \}.$$

*Example.* Given a dataset T = {t_1, t_2, t_3, t_4, t_5}, a test input x, a perturbation ε, and K = 3, assume that the lower and upper bounds of the distances between Δ_ε(x) and the samples in T are [25.4, 29.4], [30.1, 34.1], [35.3, 39.3], [37.2, 41.2], and [85.5, 90.5]. Since K = 3, we find the 3rd minimum upper bound, UB_3min = 39.3. By comparing UB_3min with the lower bounds, we compute overNN = {t_1, t_2, t_3, t_4}, since t_5 is the only sample in T whose lower bound is greater than 39.3. All of the other four samples *may be* among the 3 nearest neighbors of Δ_ε(x).

Due to perturbation, the set overNN for K = 3 is expected to contain 3 or more samples. That is, since different inputs in Δ_ε(x) may have different samples as their 3 nearest neighbors, to be conservative, we have to take the union of all possible sets of 3 nearest neighbors.
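The computation of overNN from the bounds is then a single filtering pass; a sketch (names ours) that reproduces the example above:

```python
def over_nn(bounds, K):
    """Indices of samples that may be among the K nearest neighbors of
    Delta_eps(x): keep sample t whenever LB(t) <= the K-th smallest UB."""
    ub_kmin = sorted(ub for _, ub in bounds)[K - 1]  # K-th minimum upper bound
    return [i for i, (lb, _) in enumerate(bounds) if lb <= ub_kmin]
```

On the bounds of the example, `over_nn` keeps the first four samples and excludes t_5.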

**Algorithm 2:** Subroutine abs_same_label(overNN, K, y).

```
1 Let S be a subset of overNN obtained by removing all y-labeled elements;
2 Let y' = Freq(S), and #y' be the count of y'-labeled elements in S;
3 if #y' < K − |S| − 2 * n then
4   return True;
5 end if
6 return False;
```

*Soundness Proof.* Here we prove that any t ∉ overNN cannot be among the K nearest neighbors of any x' ∈ Δ_ε(x). Since UB_Kmin is the K-th minimum value of UB(d^ε(x, t)) over all t ∈ T, there must be samples t_1, t_2, ..., t_K such that UB(d^ε(x, t_i)) ≤ UB_Kmin for all i = 1, 2, ..., K. For any t' ∉ overNN, we have LB(d^ε(x, t')) > UB_Kmin.

Combining the above conditions, we have LB(d^ε(x, t')) > UB(d^ε(x, t_i)) for i = 1, 2, ..., K. It means at least K other samples are closer to x' than t'. Thus, t' cannot be among the K nearest neighbors of x'.

#### **4.2 Checking the Classification Result**

Next, we try to certify that, regardless of which K elements are selected from overNN, the prediction result obtained using them is always the same.

The prediction label is affected by both perturbation of the input x and label-flipping bias in the dataset T. Since perturbation affects which points are identified as the K nearest neighbors, and its impact has been accounted for by overNN, from now on, we focus only on label-flipping bias in T.

Our method is shown in Algorithm 2, which takes the set overNN, the parameter K, and the expected label y as input, and checks if it is possible to find a subset of overNN with size K, whose most frequent label differs from y. If such a "bad" subset cannot be found, we say that KNN prediction always returns the same label.

To try to find such a "bad" subset of overNN, we first remove all elements labeled with y from overNN, to obtain the set S (Line 1). After that, there are two cases to consider.


If |S| ≥ K, a "bad" subset exists trivially, since we can pick K elements from S alone, none of which is labeled y. Otherwise, consider the worst-case subset S_K of size K that contains all of S together with K − |S| elements labeled y. In S_K, the most frequent label must be either y (whose count is K − |S|) or y' (the most frequent label in S, with count #y'). Moreover, since we can flip up to n labels, we can flip n elements from label y to label y', which decreases the count of y by n and increases the count of y' by n.


Therefore, to check if our method should return True, meaning the prediction result is guaranteed to be the same as label y, we only need to compare K − |S| with #y' + 2·n. This is checked using the condition in Line 3 of Algorithm 2.
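Algorithm 2 amounts to one counting pass; a sketch (taking the labels of overNN and the bias bound n as explicit arguments, names ours):

```python
from collections import Counter

def abs_same_label(overNN_labels, K, y, n):
    """True iff every size-K subset of overNN, after up to n label flips,
    still has y as its most frequent label (Line 3 of Algorithm 2)."""
    rival_counts = Counter(lab for lab in overNN_labels if lab != y)
    s_size = sum(rival_counts.values())                # |S|
    top_rival = max(rival_counts.values(), default=0)  # #y'
    return top_rival < K - s_size - 2 * n
```

For example, with overNN labels {a, a, a, a, b}, K = 3, and y = a, the check succeeds for n = 0 but fails for n = 1 (flipping one a to b in the subset {a, a, b} makes b the majority).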

## **5 Abstracting the KNN Learning Step**

In this section, we present our method for abstracting the learning step, which computes the optimal K value based on T while accounting for the impact of flipping at most n labels. The output is a superset of the possible optimal K values, denoted KSet.

Algorithm 3 shows our method, which takes the training set T and the parameter n as input, and returns KSet as output. To be sound, we require KSet to include any candidate k value that may become the optimal K for some clean set T' ∈ dBias_n(T).

In Algorithm 3, our method first computes the lower and upper bounds of the classification error for each candidate k value, denoted LB_k and UB_k, as shown in Lines 5–6. Next, it computes minUB, which is the minimal upper bound over all candidate k values (Line 8). Finally, by comparing minUB with LB_k for each candidate k value, our method decides whether this candidate k value should be put into KSet (Line 9).

We will explain the steps needed to compute LB<sup>k</sup> and UB<sup>k</sup> in the remainder of this section. For now, assuming that they are available, we explain how they are used to compute KSet.

*Example.* Given the candidate k values k_1, k_2, k_3, k_4 and their error bounds [0.1, 0.2], [0.1, 0.3], [0.3, 0.4], and [0.3, 0.5], the smallest upper bound is minUB = 0.2. By comparing minUB with the lower bounds, we compute KSet = {k_1, k_2}, since only LB_{k_1} and LB_{k_2} are lower than or equal to minUB.
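The final filtering step can be sketched directly from the example (names ours):

```python
def compute_kset(error_bounds):
    """Keep every candidate k whose error lower bound does not exceed the
    minimal upper bound over all candidates (Line 9 of Algorithm 3)."""
    min_ub = min(ub for (_, ub) in error_bounds.values())
    return [k for k, (lb, _) in error_bounds.items() if lb <= min_ub]
```

On the example's bounds, only k_1 and k_2 survive the comparison against minUB = 0.2.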

*Soundness Proof.* Here we prove that any k ∉ KSet cannot result in the smallest classification error. Assume that k_s is the candidate k value that has the minimal upper bound (minUB), and err_{k_s} is its actual classification error. By definition, we have err_{k_s} ≤ minUB. Meanwhile, for any k ∉ KSet, we have LB_k > minUB and thus err_k ≥ LB_k > minUB. Combining the two cases, we have err_k > minUB ≥ err_{k_s}. Here, err_k > err_{k_s} means that k cannot result in the smallest classification error.

**Algorithm 4:** Subroutine abs_may_err(T, n, K, x, y).

```
1 Let y' be, among the non-y labels, the label with the highest count in T_x^K;
2 Let #y be the number of elements in T_x^K with the y label;
3 Let n' = min(n, #y);
4 Change the labels of n' y-labeled elements in T_x^K to y';
5 return Freq(T_x^K) ≠ y;
```

#### **5.1 Overapproximating the Classification Error**

To compute the upper bound errUB_i^k defined in Line 3 of Algorithm 3, we use the subroutine abs_may_err to check if a sample (x, y) ∈ G_i may be misclassified when using T \ G_i as the training set.

Algorithm 4 shows the implementation of the subroutine, which checks, for a sample (x, y), whether it is possible to obtain a set S by flipping at most n labels in T_x^K such that the most frequent label in S is not y. If it is possible to obtain such a set S, we conclude that the prediction label for x may be an error.

The condition Freq(T_x^K) ≠ y, computed on T_x^K after the labels of n' elements are changed from y to y', is a sufficient condition under which the prediction label for x may be an error. The rationale is as follows.

In order to make the most frequent label in the set T_x^K different from y, we need to focus on the label most likely to become the new most frequent label. It is the label y' (≠ y) with the highest count in the current T_x^K.

Therefore, Algorithm 4 checks whether y' can become the most frequent label by changing at most n' elements in T_x^K from the y label to the y' label (Lines 3–5).
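A sketch of Algorithm 4 (we pass the labels of T_x^K directly, and treat a tie after flipping as a possible error, which is a conservative assumption of ours):

```python
from collections import Counter

def abs_may_err(neighbor_labels, n, y):
    """May the most frequent label differ from y after flipping at most n
    labels from y to the strongest rival y'?"""
    counts = Counter(neighbor_labels)
    top_rival = max((c for lab, c in counts.items() if lab != y), default=0)
    n_eff = min(n, counts.get(y, 0))   # cannot flip more y's than exist
    # after n_eff flips y -> y': y keeps counts[y] - n_eff, y' gains n_eff
    return top_rival + n_eff >= counts.get(y, 0) - n_eff
```

For instance, with neighbor labels {a, a, b} and y = a, no error is possible for n = 0, but one flip a → b suffices to change the outcome.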

#### **5.2 Underapproximating the Classification Error**

To compute the lower bound errLB_i^k defined in Line 4 of Algorithm 3, we use the subroutine abs_must_err to check if a sample (x, y) ∈ G_i must be misclassified when using T \ G_i as the training set.

Algorithm 5 shows the implementation of the subroutine, which checks, for a sample (x, y), whether it is impossible to obtain a set S by flipping at most n labels in T_x^K such that the most frequent label in S is y. In other words, is it impossible to avoid the classification error? If so, we conclude that the prediction label must be an error, and thus the procedure returns True.

In this sense, all samples counted in errLB_i^k (computed in Line 4 of Algorithm 3) are guaranteed to be misclassified.

**Algorithm 5:** Subroutine abs_must_err(T, n, K, x, y).

```
1 if ∃S obtained from T_x^K by flipping up to n labels such that Freq(S) = y then
2   return False;
3 end if
4 return True;
```

The challenge in Algorithm 5 is to check whether such a set S can be constructed from T_x^K. The intuition is that, to make y the most frequent label, we should flip the labels of non-y elements to label y. Let us consider two examples first.

*Example 1.* Given the label counts of T_x^K, denoted {l_1 × 4, l_2 × 4, l_3 × 2}, meaning that 4 elements are labeled l_1, 4 elements are labeled l_2, and 2 elements are labeled l_3, assume that n = 2 and y = l_3. Since we can flip at most 2 elements, we choose to flip one l_1 → l_3 and one l_2 → l_3, to get a set S = {l_1 × 3, l_2 × 3, l_3 × 4}.

*Example 2.* Given the label counts of T_x^K, denoted {l_1 × 5, l_2 × 3, l_3 × 2}, with n = 2 and y = l_3, we can flip two l_1 → l_3 to get a set S = {l_1 × 3, l_2 × 3, l_3 × 4}.

**The LP Problem.** The question is how to decide whether the set S (defined in Line 1 of Algorithm 5) exists. We can formulate it as a linear programming (LP) problem with two constraints. The first one is defined as follows: Let y be the expected label, and let l_i ≠ y, for i = 1, ..., q − 1, denote the remaining labels, where q is the total number of class labels (e.g., in the above two examples, q = 3). Let #y be the number of elements in T_x^K that have the y label. Similarly, let #l_i be the number of elements with the l_i label. Assume that a set S as defined in Algorithm 5 exists; then every label l_i ≠ y must satisfy

$$\#l\_i - \#flip\_i < \#y + \sum\_{j=1}^{q-1} \#flip\_j \quad , \tag{1}$$

where #flip_i is a variable representing the number of l_i-to-y flips. Thus, in the above formula, the left-hand side is the count of l_i after flipping, and the right-hand side is the count of y after flipping. Since y is the most frequent label in S, y must have a higher count than any other label.

The second constraint is

$$\sum\_{i=1}^{q-1} \#flip\_i \le n \quad , \tag{2}$$

which says that the total number of label flips is bounded by the parameter n.
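For intuition, the feasibility question behind these constraints can also be decided by a simple greedy loop that repeatedly flips one element of the currently largest rival label to y; this is our own sketch (names ours), which agrees with the LP formulation on small instances such as the two examples above:

```python
from collections import Counter

def can_make_y_win(label_counts, y, n):
    """Can y become the strictly most frequent label with at most n flips,
    each turning one element of a rival label into y?"""
    counts = Counter(label_counts)
    for _ in range(n):
        rivals = {lab: c for lab, c in counts.items() if lab != y}
        if not rivals or max(rivals.values()) < counts[y]:
            return True                    # y already wins strictly
        top = max(rivals, key=rivals.get)  # flip one element of the top rival
        counts[top] -= 1
        counts[y] += 1
    rivals = [c for lab, c in counts.items() if lab != y]
    return not rivals or max(rivals) < counts[y]
```

On Example 1, the greedy loop flips one l_1 and one l_2; on Example 2, it flips two l_1 elements; both succeed within n = 2 flips.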

Since the number of class labels (q) is often small (from 2 to 10), this LP problem can be solved quickly. However, the LP problem must be solved |T| times, and |T| may be as large as 50,000. To avoid invoking the LP solver unnecessarily, we propose two easy-to-check conditions. They are *necessary* conditions in that, if either of them is violated, the set S does not exist. Thus, we invoke the LP solver only if both conditions are satisfied.

**Necessary Conditions.** The first condition is derived from Formula (1) by adding up the two sides of the inequality constraint over all labels l_i ≠ y. The resulting condition is

$$\left(\sum\_{l\_i \neq y} \#l\_i - \sum\_{i=1}^{q-1} \#flip\_i\right) < \left((q-1)\#y + (q-1)\sum\_{i=1}^{q-1} \#flip\_i\right).$$

Writing F = Σ_i #flip_i and using Constraint (2), i.e., F ≤ n, this simplifies to the variable-free check Σ_{l_i ≠ y} #l_i < (q − 1)#y + q · n.

The second condition requires that, in S, label y has a higher count (after flipping) than any other label, including the label l_p ≠ y with the highest count in the current T_x^K. The resulting condition is

$$(\#l\_p - \#y)/2 < n,$$

since only when this condition is satisfied is it possible for y to have a higher count than l_p, by flipping at most n elements from label l_p to y.

These are necessary (but not necessarily sufficient) conditions because, whenever the first condition does not hold, Eq. (1) does not hold either, and similarly for the second condition. In this sense, the two conditions are *easy-to-check* over-approximations of Eq. (1).

## **6 Experiments**

We have implemented our method as a software tool written in Python using the **scikit-learn** machine learning library. We evaluated our tool on six datasets that are widely used in the fairness research literature.

**Datasets.** Table 1 shows the statistics of each dataset, including the name, a short description, the size (|T|), the number of attributes, the protected attributes, and the parameters ε and n. The value of ε is set to 1% of each attribute's value range. The bias parameter n is set to 1 for small datasets, 10 for medium datasets, and 50 for large datasets. The protected attributes include *Gender* for all six datasets, and *Race* for two datasets, *Compas* and *Adult*, which is consistent with known biases in these datasets.

In preparation for the experimental evaluation, we employed state-of-the-art techniques from the machine learning literature to preprocess and balance the datasets for KNN, including encoding, standard scaling, k-bins discretization, downsampling, and upweighting.


**Table 1.** Statistics of all of the datasets used during our experimental evaluation.

**Table 2.** Results for certifying *label-flipping* and *individual fairness* (gender) on small datasets, for which ground truth can still be obtained by naive enumeration, and compared with our method.


**Methods.** For comparison purposes, we implemented six variants of our method, by enabling or disabling the ability to certify label-flipping fairness, the ability to certify individual fairness, and the ability to certify ε-fairness.

Except for ε-fairness, we also implemented the naive approach of enumerating all T′ ∈ dBiasₙ(T). Since the naive approach does not rely on approximation, its result can be regarded as the ground truth (i.e., whether the classification output for an input x is truly fair). Our goal is to obtain the ground truth on small datasets, and use it to evaluate the accuracy of our abstract-interpretation-based method. However, as explained before, enumeration does not work for ε-fairness, since the number of inputs in Δ_ε(x) is infinite.

Our experiments were conducted on a computer with a 2 GHz Quad-Core Intel Core i5 CPU and 16 GB of memory. The experiments were designed to answer two questions. First, is our method efficient and accurate enough to handle popular datasets in the fairness literature? Second, does our method help us gain insights? For example, it would be interesting to know whether a decision made on an individual from a protected minority group is more (or less) likely to be certified as fair.

**Results on Efficiency and Accuracy.** We first evaluate the efficiency and accuracy of our method. For the two small datasets, *Salary* and *Student*, we are able to obtain the ground truth using the naive enumeration approach, and then compare it with the result of our abstract interpretation based method. We want to know how much our results deviate from the ground truth.

Table 2 shows the results obtained by treating *Gender* as the protected attribute. Column 1 shows the name of the dataset. Columns 2–7 compare the naive approach (ground truth) and our method in certifying label-flipping fairness. Columns 8–13 compare the naive approach (ground truth) and our method in certifying label-flipping plus individual fairness.


**Table 3.** Results for certifying *label-flipping*, *individual*, and *ε-fairness* by our method.

Based on the results in Table 2, we conclude that the accuracy of our method is high (81.9% on average) despite its aggressive use of abstraction to reduce the computational cost. Our method is also 7.5X to 126X faster than the naive approach. Furthermore, the larger the dataset, the higher the speedup.

For medium and large datasets, it is infeasible for the naive enumeration approach to compute and show the ground truth in Table 2. However, the fairness scores of our method shown in Table 3 provide "lower bounds" for the ground truth, since our method is sound for certification. For example, when our method reports 95% for *Compas (race)* in Table 3, it means the ground truth must be ≥95% (and thus the gap must be ≤5%). However, there does not seem to be an obvious relationship between the gap and the dataset size – the gap may be due to unique characteristics of each dataset.

**Results on the Certification Rates.** We now present the success rates of our certification method for the three variants of fairness. Table 3 shows the results for label-flipping fairness in Columns 2–3, label-flipping plus individual fairness (denoted *+Individual fairness*) in Columns 4–5, and label-flipping plus ε-fairness (denoted *+ε-fairness*) in Columns 6–7. For each variant of fairness, we show the percentage of test inputs that are certified to be fair, together with the average certification time (per test input). In all six datasets, *Gender* was treated as the protected attribute. In addition, *Race* was treated as the protected attribute for *Compas* and *Adult*.

From the results in Table 3, we see that as a more stringent fairness standard is used, the certified percentage either stays the same (as in *Salary*) or decreases (as in *Student*). This is consistent with what we expect, since the classification output is required to stay the same for an increasingly larger number of scenarios. For *Compas (race)*, in particular, adding ε-fairness on top of label-flipping fairness causes the certified percentage to drop from 62.4% to 56.4%.

Nevertheless, our method still maintains a high certification percentage. Recall that, for *Salary*, the 33.3% certification rate (for *+Individual fairness*) is actually 100% accurate according to comparison with the ground truth in Table 2, while the 44.6% certification rate (for *+Individual fairness*) is actually 76.2% accurate. Furthermore, the efficiency of our method is high: for *Adult*, which has 50,000 samples in the training set, the average certification time of our method remains within a few seconds.

**Table 4.** Results for certifying *label-flipping + ε-fairness* with both *Race* and *Gender* as protected attributes.


**Results on Demographic Groups.** Table 4 shows the certified percentage of each demographic group, when both *label-flipping* and *ε-fairness* are considered, and both *Race* and *Gender* are treated as protected attributes. The four demographic groups are (1) *White Male*, (2) *White Female*, (3) *Other Male*, and (4) *Other Female*. For each group, we show the certified percentage obtained by our method. In addition, we show the weighted averages for *White* and *Other*, as well as the weighted averages for *Male* and *Female*.

For *Compas*, *White Female* has the highest certified percentage (100%) while *Other Female* has the lowest certified percentage (52.2%); here, the classification output represents the recidivism risk.

For *Adult*, *Other Female* has the highest certified percentage (66.7%) while the other three groups have certified percentages in the range of 33.3%-35.3%.

The differences may be attributed to two sources: one technical and the other social. The social reason is related to historical bias, which is well documented for these datasets. If the actual percentages (ground truth) are different, the percentages reported by our method will also be different. The technical reason is related to the nature of the KNN algorithm itself, which we explain as follows.

In these datasets, some demographic groups have significantly more samples than others. In KNN, the group with the fewest samples may have a limited number of close neighbors. Thus, for each test input x from this group, its K nearest neighbors tend to have a larger radius in the input vector space. As a result, the impact of perturbation on x will be smaller, resulting in fewer changes to its K nearest neighbors. That may be one of the reasons why, in Table 4, the least represented groups, *White Female* in *Compas* and *Other Female* in *Adult*, have significantly higher certified percentages than the other groups.

Results in Table 4 show that, even if a machine learning technique discriminates against certain demographic groups, for an individual, the prediction result produced by the machine learning technique may still be fair. This is closely related to differences (and sometimes conflicts) between *group fairness* and *individual fairness*: while group fairness focuses on statistical parity, individual fairness focuses on similar outcomes for similar individuals. Both are useful notions and in many cases they are complementary.

**Caveat.** Our work should be construed neither as an endorsement nor as a criticism of the use of machine learning techniques in socially sensitive applications. Instead, it should be viewed as an effort to develop new methods and tools that help improve our understanding of these techniques.

### **7 Related Work**

For fairness certification, as explained earlier in this paper, our method is the first for certifying KNN in the presence of historical (dataset) bias. While there are other KNN certification and falsification techniques, including Jia et al. [20] and Li et al. [24,25], they focus solely on robustness against data-poisoning attacks, as opposed to individual and ε-fairness against historical bias. Meyer et al. [26,27] and Drews et al. [12] propose certification techniques that handle dataset bias, but target different machine learning techniques (decision tree or linear regression); furthermore, they do not handle ε-fairness.

Throughout this paper, we have assumed that the KNN learning (parameter-tuning) step is not tampered with or subjected to fairness violation. However, since the only impact of tampering with the KNN learning step is a change in the optimal value of the parameter K, a biased KNN learning step can be modeled using a properly over-approximated KSet. With this new KSet, our method for certifying fairness of the prediction result (as presented in Sect. 4) works as-is.

Our method aims to certify fairness with certainty. In contrast, there are statistical techniques that can be used to prove that a system is fair or robust with a high probability. Such techniques have been applied to various machine learning models, for example, in *VeriFair* [6] and *FairSquare* [2]. However, they are typically applied to the prediction step while ignoring the learning step, although the learning step may be affected by dataset bias.

There are also techniques for mitigating bias in machine learning systems. Some focus on improving the learning algorithms using random smoothing [33], better embedding [7] or fair representation [34], while others rely on formal methods such as iterative constraint solving [38]. There are also techniques for repairing models to improve fairness [3]. Except for Ruoss et al. [34], most of them focus on group fairness such as demographic parity and equal opportunity; they are significantly different from our focus on certifying individual and ε-fairness of the classification results in the presence of dataset bias.

At a high level, our method that leverages a sound over-approximate analysis to certify fairness can be viewed as an instance of the abstract interpretation paradigm [10]. Abstract interpretation based techniques have been successfully used in many other settings, including verification of deep neural networks [17, 30], concurrent software [21,22,37], and cryptographic software [43,44].

Since fairness is a type of non-functional property, the verification/certification techniques are often significantly different from techniques used to verify/certify functional correctness. Instead, they are more closely related to techniques for verifying/certifying robustness [8], noninterference [5], and side-channel security [19,39,40,48], where a program is executed multiple times, each time on a different input drawn from a large (and sometimes infinite) set, to check whether all executions agree on the output. At a high level, this is closely related to differential verification [28,31,32], synthesis of relational invariants [41], and verification of hyper-properties [15,35].

## **8 Conclusions**

We have presented a method for certifying the individual and ε-fairness of the classification output of the KNN algorithm, under the assumption that the training dataset may contain historical bias. Our method relies on abstract interpretation to soundly approximate the arithmetic computations in the learning and prediction steps. Our experimental evaluation shows that the method is efficient in handling popular datasets from the fairness research literature and accurate enough to obtain certifications for a large amount of test data. While this paper focuses on KNN only, we plan to extend our method to other machine learning models in future work.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Monitoring Algorithmic Fairness

Thomas A. Henzinger, Mahyar Karimi, Konstantin Kueffner, and Kaushik Mallik (B)

Institute of Science and Technology Austria (ISTA), Klosterneuburg, Austria {tah,mahyar.karimi,konstantin.kueffner,kaushik.mallik}@ist.ac.at

Abstract. Machine-learned systems are in widespread use for making decisions about humans, and it is important that they are *fair*, i.e., not biased against individuals based on sensitive attributes. We present runtime verification of algorithmic fairness for systems whose models are unknown, but are assumed to have a Markov chain structure. We introduce a specification language that can model many common algorithmic fairness properties, such as demographic parity, equal opportunity, and social burden. We build monitors that observe a long sequence of events as generated by a given system, and output, after each observation, a quantitative estimate of how fair or biased the system was on that run until that point in time. The estimate is proven to be correct modulo a variable error bound and a given confidence level, where the error bound gets tighter as the observed sequence gets longer. Our monitors are of two types, and use, respectively, frequentist and Bayesian statistical inference techniques. While the frequentist monitors compute estimates that are objectively correct with respect to the ground truth, the Bayesian monitors compute estimates that are correct subject to a given prior belief about the system's model. Using a prototype implementation, we show how we can monitor if a bank is fair in giving loans to applicants from different social backgrounds, and if a college is fair in admitting students while maintaining a reasonable financial burden on the society. Although they exhibit different theoretical complexities in certain cases, in our experiments, both frequentist and Bayesian monitors took less than a millisecond to update their verdicts after each observation.

### 1 Introduction

Runtime verification complements traditional static verification techniques, by offering lightweight solutions for checking properties based on a single, possibly long execution trace of a given system [8]. We present new runtime verification techniques for the problem of bias detection in decision-making software. The use of software for making critical decisions about humans is a growing trend; example areas include judiciary [13,20], policing [23,49], banking [48], etc. It is important that these software systems are unbiased towards the protected attributes

This work is supported by the European Research Council under Grant No.: ERC-2020-AdG101020093.

© The Author(s) 2023

C. Enea and A. Lal (Eds.): CAV 2023, LNCS 13965, pp. 358–382, 2023. https://doi.org/10.1007/978-3-031-37703-7\_17

of humans, like gender, ethnicity, etc. However, they have often shown biases in their decisions in the past [20,47,55,57,58]. While there are many approaches for mitigating biases before deployment [20,47,55,57,58], recent runtime verification approaches [3,34] offer a new complementary tool to oversee *algorithmic fairness* in AI and machine-learned decision makers during deployment.

To verify algorithmic fairness at runtime, the given decision-maker is treated as a *generator* of events with an unknown model. The goal is to algorithmically design lightweight but rigorous *runtime monitors* against quantitative formal specifications. The monitors observe a long stream of events and, after each observation, output a quantitative, statistically sound estimate of how fair or biased the generator was until that point in time. While the existing approaches [3,34] considered only sequential decision making models and built monitors from the frequentist viewpoint in statistics, we allow the richer class of Markov chain models and present monitors from both the frequentist and the Bayesian statistical viewpoints.

Monitoring algorithmic fairness involves on-the-fly statistical estimations, a feature that has not been well-explored in the traditional runtime verification literature. As far as the algorithmic fairness literature is concerned, the existing works are mostly *model-based*, and either minimize decision biases of machine-learned systems at *design-time* (i.e., pre-processing) [11,41,65,66], or verify their absence at *inspection-time* (i.e., post-processing) [32]. In contrast, we verify algorithmic fairness at *runtime*, and do not require an explicit model of the generator. On one hand, the model-independence makes the monitors trustworthy, and on the other hand, it complements the existing model-based static analyses and design techniques, which are often insufficient due to partially unknown or imprecise models of systems in real-world environments.

We assume that the sequences of events generated by the generator can be modeled as sequences of states visited by a finite unknown Markov chain. This implies that the generator is well-behaved and the events follow each other according to some fixed probability distributions. Not only is this assumption satisfied by many machine-learned systems (see Sect. 1.1 for examples), it also provides just enough structure to lay the bare-bones foundations for runtime verification of algorithmic fairness properties. We emphasize that we do not require knowledge of the transition probabilities of the underlying Markov chain.
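Under this assumption, the transition probabilities can be estimated directly from visit counts along the observed sequence. A minimal sketch (our own illustration, with hypothetical names, not the paper's implementation):

```python
from collections import defaultdict

# Estimate the transition probabilities of an unknown Markov chain from a
# single observed state sequence, using per-state visit counts.

def estimate_transitions(sequence):
    counts = defaultdict(lambda: defaultdict(int))
    for s, t in zip(sequence, sequence[1:]):
        counts[s][t] += 1
    return {
        s: {t: c / sum(outs.values()) for t, c in outs.items()}
        for s, outs in counts.items()
    }
```

Each estimated probability is the fraction of times the transition was taken per visit to its source state, which is exactly the quantity the frequentist monitor described below tracks incrementally.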

We propose a new specification language, called Probabilistic Specification Expressions (PSEs), which can formalize a majority of the existing algorithmic fairness properties in the literature, including demographic parity [21], equal opportunity [32], disparate impact [25], etc. Let Q be the set of events. Syntactically, a PSE is a restricted arithmetic expression over the (unknown) transition probabilities of a Markov chain with the state space Q. Semantically, a PSE ϕ over Q is a function that maps every Markov chain M with the state space Q to a real number, and the value ϕ(M) represents the degree of fairness or bias (with respect to ϕ) in the generator M. Our monitors observe a long sequence of events from Q, and after each observation, compute a statistically rigorous estimate of ϕ(M) with a PAC-style error bound for a given confidence level. As the observed sequence gets longer, the error bound gets tighter.

Algorithmic fairness properties that are expressible using PSEs are quantitative refinements of the traditional qualitative fairness properties studied in formal methods. For example, a qualitative fairness property may require that if a certain event A occurs infinitely often, then another event B should follow infinitely often. In particular, a coin is qualitatively fair if infinitely many coin tosses contain both infinitely many heads and infinitely many tails. In contrast, the coin will be algorithmically fair (i.e., unbiased) if approximately half of the tosses come up heads. Technically, while qualitative weak and strong fairness properties are ω-regular, the algorithmic fairness properties are statistical and require counting. Moreover, for a qualitative fairness property, the satisfaction or violation cannot be established based on a finite prefix of the observed sequence. In contrast, for any given finite prefix of observations, the value of an algorithmic fairness property can be estimated using statistical techniques, assuming the future behaves statistically like the past (the Markov assumption).

As our main contribution, we present two different monitoring algorithms, using tools from frequentist and Bayesian statistics, respectively. The central idea of the *frequentist monitor* is that the probability of every transition of the monitored Markov chain M can be estimated using the fraction of times the transition is taken per visit to its source vertex. Building on this, we present a practical implementation of the frequentist monitor that can estimate the value of a given PSE from an observed finite sequence of states. For the coin example, after every new toss, the frequentist monitor will update its estimate of probability of seeing heads by computing the fraction of times the coin came up heads so far, and then by using concentration bounds to find a tight error bound for a given confidence level. On the other hand, the central idea of the *Bayesian monitor* is that we begin with a prior belief about the transition probabilities of M, and having seen a finite sequence of observations, we can obtain an updated posterior belief about M. For a given confidence level, the output of the monitor is computed by applying concentration inequalities to find a tight error bound around the mean of the posterior belief. For the coin example, the Bayesian monitor will begin with a prior belief about the degree of fairness, and, after observing the outcome of each new toss, will compute a new posterior belief. If the prior belief agrees with the true model with a high probability, then the Bayesian monitor's output converges to the true value of the PSE more quickly than the frequentist monitor. In general, both monitors can efficiently estimate more complicated PSEs, such as the ratio and the squared difference of the probabilities of heads of two different coins. The choice of the monitor for a particular application depends on whether an objective or a subjective evaluation, with respect to a given prior, is desired.
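For the coin example, the two monitoring styles can be sketched as follows (our own illustration using a Hoeffding bound and a Beta prior; the paper's monitors handle general PSEs over Markov chains, and all class names here are assumptions):

```python
import math

class FrequentistCoinMonitor:
    """Estimate P(heads) with a Hoeffding-style error bound."""
    def __init__(self, delta):
        self.delta = delta      # 1 - confidence level
        self.heads = 0
        self.n = 0

    def observe(self, is_head):
        self.n += 1
        self.heads += int(is_head)
        est = self.heads / self.n
        # Hoeffding: |est - p| <= eps with probability >= 1 - delta
        eps = math.sqrt(math.log(2 / self.delta) / (2 * self.n))
        return est, eps

class BayesianCoinMonitor:
    """Track a Beta(a, b) posterior over P(heads)."""
    def __init__(self, a=1.0, b=1.0):   # Beta(1,1) is the uniform prior
        self.a, self.b = a, b

    def observe(self, is_head):
        self.a += int(is_head)
        self.b += 1 - int(is_head)
        mean = self.a / (self.a + self.b)
        var = (self.a * self.b) / ((self.a + self.b) ** 2
                                   * (self.a + self.b + 1))
        return mean, var
```

Note how the frequentist error bound shrinks as O(1/√n) regardless of the true bias, whereas the Bayesian estimate is pulled toward the prior early on and converges faster when the prior is close to the truth.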

Both frequentist and Bayesian monitors use registers (and counters as a restricted class of registers) to keep counts of the relevant events and store the intermediate results. If the size of the given PSE is n, then, in theory, the frequentist monitor uses O(n⁴2ⁿ) registers and computes its output in O(n⁴2ⁿ) time after each new observation, whereas the Bayesian monitor uses O(n²2ⁿ) registers and computes its output in O(n²2ⁿ) time after each new observation. The computation time and the required number of registers drop drastically to O(n²) for the frequentist monitor on PSEs that contain at most one division operator, and for the Bayesian monitor on polynomial PSEs (possibly with negative exponents in the monomials). This shows that under given circumstances, one or the other type of monitor can be favorable computation-wise. These special, efficient cases cover many algorithmic fairness properties of interest, such as demographic parity and equal opportunity.

Our experiments confirm that our monitors are fast in practice. Using a prototype implementation in Rust, we monitored a couple of decision-making systems adapted from the literature. In particular, we monitor whether a bank is fair in lending money to applicants from different demographic groups [48], and whether a college is fair in admitting students without creating an unreasonable financial burden on the society [54]. In our experiments, both monitors took, on average, less than a millisecond to update their verdicts after each observation, and only used tens of internal registers to operate, thereby demonstrating their practical usability at runtime.

In short, we advocate that runtime verification introduces a new set of tools in the area of algorithmic fairness, using which we can monitor biases of deployed AI and machine-learned systems in real-time. While existing monitoring approaches only support sequential decision making problems and use only the frequentist statistical viewpoint, we present monitors for the more general class of Markov chain system models using both frequentist and Bayesian statistical viewpoints.

All proofs can be found in the longer version of the paper [33].

#### 1.1 Motivating Examples

We first present two real-world examples from the algorithmic fairness literature to motivate the problem; these examples will later be used to illustrate the technical developments.

The Lending Problem [48]: Suppose a bank lends money to individuals based on certain attributes, like credit score, age group, etc. The bank wants to maximize profit by lending money only to those who will repay the loan in time, called the "true individuals." There is a sensitive attribute (e.g., ethnicity) classifying the population into two groups g and ḡ. The bank is considered fair (in lending money) if its lending policy is independent of an individual's membership in g or ḡ. Several *group fairness* metrics from the literature are relevant in this context. *Disparate impact* [25] quantifies the *ratio* of the probability of an individual from g getting the loan to the probability of an individual from ḡ getting the loan, which should be close to 1 for the bank to be considered fair. *Demographic parity* [21] quantifies the *difference* between the probability of an individual from g getting the loan and the probability of an individual from ḡ getting the loan, which should be close to 0 for the bank to be considered fair. *Equal opportunity* [32] quantifies the *difference* between the probability of a *true* individual from g getting the loan and the probability of a *true* individual from ḡ getting the loan, which should be close to 0 for the bank to be considered fair. A discussion on the relative merits of the various algorithmic fairness notions is out of the scope of this paper, but can be found in the literature [15,22,43,62]. We show how we can monitor whether a given group fairness criterion is fulfilled by the bank, by observing a sequence of lending decisions.
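Given a log of lending decisions, the three group-fairness metrics named above can be computed as in this hedged sketch (the record layout and the group labels `"g"`/`"g_bar"` are our own assumptions, not the paper's notation):

```python
# Each record is (group, qualified, got_loan), with group in {"g", "g_bar"},
# qualified a bool ("true individual"), and got_loan in {0, 1}.

def group_fairness_metrics(records):
    def rate(pred):
        sel = [r for r in records if pred(r)]
        return sum(r[2] for r in sel) / len(sel)

    p_g = rate(lambda r: r[0] == "g")
    p_gbar = rate(lambda r: r[0] == "g_bar")
    p_g_true = rate(lambda r: r[0] == "g" and r[1])
    p_gbar_true = rate(lambda r: r[0] == "g_bar" and r[1])
    return {
        "disparate_impact": p_g / p_gbar,             # fair if close to 1
        "demographic_parity": p_g - p_gbar,           # fair if close to 0
        "equal_opportunity": p_g_true - p_gbar_true,  # fair if close to 0
    }
```

These are empirical point estimates over a finite log; the monitors in this paper additionally attach statistically sound error bounds to such estimates.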

The College Admission Problem [54]: Consider a college that announces a cutoff of grades for admitting students through an entrance examination. Based on merit, every truly qualified student belongs to group g, and the rest to group ḡ. Knowing the cutoff, every student can choose to invest a sum of money—proportional to the gap between the cutoff and their true merit—to be able to reach the cutoff, e.g., by taking private tuition classes. On the other hand, the college's utility lies in minimizing the admission of students from ḡ, which can be accomplished by raising the cutoff to a level that is too expensive to be achieved by the students from ḡ and yet easy to be achieved by the students from g. The *social burden* associated with the college's cutoff choice is the expected expense of every student from g, which should be close to 0 for the college to be considered fair (towards the society). We show how we can monitor the social burden, by observing a sequence of investment decisions made by the students from g.

### 1.2 Related Work

There has been a plethora of work on algorithmic fairness from the machine learning standpoint [10,12,21,32,38,42,45,46,52,59,63,66]. In general, these works improve algorithmic fairness through de-biasing the training dataset (preprocessing), or through incentivizing the learning algorithm to make fair decisions (in-processing), or through eliminating biases from the output of the machine-learned model (post-processing). All of these are interventions in the design of the system, whereas our monitors treat the system as already deployed.

Recently, formal methods-inspired techniques have been used to guarantee algorithmic fairness through the verification of a learned model [2,9,29,53,61], and enforcement of robustness [6,30,39]. All of these works verify or enforce algorithmic fairness *statically* on all runs of the system with high probability. This requires certain knowledge about the system model, which may not be always available. Our runtime monitor dynamically verifies whether the current run of an opaque system is fair.

Our frequentist monitor is closely related to the novel work of Albarghouthi et al. [3], where the authors build a programming framework that allows runtime monitoring of algorithmic fairness properties on programs. Their monitor evaluates the algorithmic fairness of repeated "single-shot" decisions made by machine-learned functions on a sequence of samples drawn from an underlying unknown but fixed distribution, which is a special case of our more general Markov chain model of the generator. They do not consider the Bayesian point of view. Moreover, we argue and empirically show in Sect. 4 that our frequentist approach produces significantly tighter statistical estimates than their approach on most PSEs. On the flip side, their specification language is more expressive, in that they allow atomic variables for expected values of events, which is useful for specifying individual fairness criteria [21]. We only consider group fairness, and leave individual fairness as part of future research. Also, they allow logical operators (like boolean connectives) in their specification language. However, we obtain tighter statistical estimates for the core arithmetic part of algorithmic fairness properties (through PSEs), and point out that we can deal with logical operators just like they do in a straightforward manner.

Shortly after the first manuscript of this paper was written, we published a separate work for monitoring long-run fairness in sequential decision making problems, where the feature distribution of the population may dynamically change due to the actions of the individuals [34]. Although this other work generalizes our current paper in some aspects (support for dynamic changes in the model), it only allows sequential decision making models (instead of Markov chains) and does not consider the Bayesian monitoring perspective.

There is a large body of research on monitoring, though the considered properties are mainly temporal [5,7,19,24,40,50,60]. Unfortunately, these techniques do not directly extend to monitoring algorithmic fairness, since checking algorithmic fairness requires statistical methods, which is beyond the limit of finite automata-based monitors used by the classical techniques. Although there are works on quantitative monitoring that use richer types of monitors (with counters/registers like us) [28,35,36,56], the considered specifications do not easily extend to statistical properties like algorithmic fairness. One exception is the work by Ferrère et al. [26], which monitors certain statistical properties, like mode and median of a given sequence of events. Firstly, they do not consider algorithmic fairness properties. Secondly, their monitors' outputs are correct only as the length of the observed sequence approaches infinity (asymptotic guarantee), whereas our monitors' outputs are *always* correct with high confidence (finite-sample guarantee), and the precision gets better for longer sequences.

Although our work uses similar tools as used in statistical verification [1, 4,14,17,64], the goals are different. In traditional statistical verification, the system's runs are chosen probabilistically, and it is verified if any run of the system satisfies a boolean property with a certain probability. For us, the run is given as input to the monitor, and it is this run that is verified against a quantitative algorithmic fairness property with statistical error bounds. To the best of our knowledge, existing works on statistical verification do not consider algorithmic fairness properties.

## 2 Preliminaries

For any alphabet $\Sigma$, the notation $\Sigma^*$ represents the set of all finite words over $\Sigma$. We write $\mathbb{R}$, $\mathbb{N}$, and $\mathbb{N}^+$ to denote the sets of real numbers, natural numbers (including zero), and positive integers, respectively. For a pair of real (natural) numbers $a, b$ with $a < b$, we write $[a, b]$ ($[a..b]$) to denote the set of all real (natural) numbers between and including $a$ and $b$. For given $c, r \in \mathbb{R}$, we write $[c \pm r]$ to denote the set $[c - r, c + r]$. For simpler notation, we will use $|\cdot|$ to denote both the cardinality of a set and the absolute value of a real number, whenever the intended use is clear.

For a given vector $v \in \mathbb{R}^n$ and a given $m \times n$ real matrix $M$, for some $m, n$, we write $v_i$ to denote the $i$-th element of $v$ and write $M_{ij}$ to denote the element at the $i$-th row and the $j$-th column of $M$. For a given $n \in \mathbb{N}^+$, a *simplex* is the set of vectors $\Delta(n) := \{x \in [0,1]^{n+1} \mid \sum_{i=1}^{n+1} x_i = 1\}$. Notice that the vectors in $\Delta(n)$ have $n+1$ coordinates (and not $n$), a convention that is standard due to the interpretation of $\Delta(n)$ as the convex hull of the $n+1$ vertices of an $n$-dimensional polytope. A *stochastic matrix* of dimension $m \times m$ is a matrix whose every row is in $\Delta(m-1)$, i.e., $M \in \Delta(m-1)^m$. Random variables will be denoted using uppercase symbols from the Latin alphabet (e.g., $X$), while the associated outcomes will be denoted using the lowercase of the same symbol ($x$ is an outcome of $X$). We will interchangeably use the expected value $\mathbb{E}(X)$ and the mean $\mu_X$ of $X$. For a given set $S$, define $\mathcal{D}(S)$ as the set of every random variable—called a *probability distribution*<sup>1</sup>—with set of outcomes being $2^S$. A Bernoulli random variable that produces "1" (the alternative is "0") with probability $p$ is written as $\mathit{Bernoulli}(p)$.

#### 2.1 Markov Chains as Randomized Generators of Events

We use finite Markov chains as sequential randomized generators of events. A (finite) Markov chain $\mathcal{M}$ is a triple $(Q, M, \pi)$, where $Q = [1..N]$ is a set of states for a finite $N$, $M \in \Delta(N-1)^N$ is a stochastic matrix called the transition probability matrix, and $\pi \in \mathcal{D}(Q)$ is the distribution over initial states. We often refer to a pair of states $(i, j) \in Q \times Q$ as an *edge*. The Markov chain $\mathcal{M}$ generates an infinite sequence of random variables $X_0 = \pi, X_1, \ldots$, with $X_i \in \mathcal{D}(Q)$ for every $i$, such that the Markov property is satisfied: $\mathbb{P}(X_{n+1} = i_{n+1} \mid X_0 = i_0, \ldots, X_n = i_n) = \mathbb{P}(X_{n+1} = i_{n+1} \mid X_n = i_n)$, which equals $M_{i_n i_{n+1}}$ in our case. A finite *path* $\vec{x} = x_0, \ldots, x_n$ of $\mathcal{M}$ is a finite word over $Q$ such that for every $t \in [0..n]$, $\mathbb{P}(X_t = x_t) > 0$. Let $\mathit{Paths}(\mathcal{M})$ be the set of every finite path of $\mathcal{M}$.

We use Markov chains to model the probabilistic interaction between a machine-learned decision maker with its environment. Intuitively, the Markov assumption on the model puts the restriction that the decision maker does not change over time, e.g., due to retraining.

In Fig. 1 we show the Markov chains for the lending and the college admission examples from Sect. 1.1. The Markov chain for the lending example captures the sequence of loan-related probabilistic events: a loan applicant is randomly sampled and the group information ($g$ or $\bar{g}$) is revealed; a probabilistic decision is made by the decision-maker and the loan is either granted ($gy$ or $\bar{g}y$, depending on the group) or refused ($\bar{y}$); and if the loan is granted, then with some probabilities it either gets repaid ($z$) or defaulted ($\bar{z}$). The Markov chain for the college admission example captures the sequence of admission events: a candidate is randomly sampled and the group is revealed ($g$, $\bar{g}$), and when the candidate is from group $g$ (truly qualified), the amount of money invested for admission is also revealed.

<sup>1</sup> An alternate commonly used definition of probability distribution is directly in terms of the probability measure induced over *S*, instead of through the random variable.

Fig. 1. Markov chains for the lending and the college-admission examples. (left) The lending example: the state *init* denotes the initiation of the sampling, and the rest represent the selected individual, namely, $g$ and $\bar{g}$ denote the two groups, $(gy)$ and $(\bar{g}y)$ denote that the individual is respectively from group $g$ and group $\bar{g}$ and the loan was granted, $\bar{y}$ denotes that the loan was refused, and $z$ and $\bar{z}$ denote whether the loan was repaid or not. (right) The college admission example: the state *init* denotes the initiation of the sampling, the states $g, \bar{g}$ represent the group identity of the selected candidate, and the states $\{0,\ldots,N\}$ represent the amount of money invested by a truly eligible candidate.

#### 2.2 Randomized Register Monitors

Randomized register monitors, or simply monitors, are adapted from the (deterministic) polynomial monitors of Ferrère et al. [27]. Let $R$ be a finite set of integer variables called registers. A function $v\colon R \to \mathbb{N}$ assigning a concrete value to every register in $R$ is called a valuation of $R$. Let $\mathbb{N}^R$ denote the set of all valuations of $R$. Registers can be read and written according to relations in the signature $S = \{0, 1, +, -, \times, \div, \leq\}$. We consider two basic operations on registers:

- a *test* is a boolean-valued expression over the registers in $R$, built from the signature $S$;
- an *update* is a mapping that assigns to each register a new value given by an expression over $R$ in the signature $S$.

We use <sup>Φ</sup>(R) and <sup>Γ</sup>(R) to respectively denote the set of tests and updates over <sup>R</sup>. *Counters* are special registers with a restricted signature <sup>S</sup> = 0, 1, +, <sup>−</sup>, ≤.

Definition 1 (Randomized register monitor). *A randomized register monitor is a tuple* $(\Sigma, \Lambda, R, \lambda, T)$ *where* $\Sigma$ *is a finite input alphabet,* $\Lambda$ *is an output alphabet,* $R$ *is a finite set of registers,* $\lambda\colon \mathbb{N}^R \to \Lambda$ *is an output function, and* $T\colon \Sigma \times \Phi(R) \to \mathcal{D}(\Gamma(R))$ *is the randomized transition function such that for every* $\sigma \in \Sigma$ *and for every valuation* $v \in \mathbb{N}^R$*, there exists a unique* $\varphi \in \Phi(R)$ *with* $v \models \varphi$ *and* $T(\sigma, \varphi) \in \mathcal{D}(\Gamma(R))$*. A deterministic register monitor is a randomized register monitor for which* $T(\sigma, \varphi)$ *is a Dirac delta distribution, whenever it is defined.*

A *state* of a monitor $\mathcal{A}$ is a valuation of its registers $v \in \mathbb{N}^R$. On input $\sigma \in \Sigma$, the monitor $\mathcal{A}$ *transitions* from state $v$ to a *distribution* over states given by the random variable $Y = T(\sigma, \varphi)$ if there exists $\varphi$ such that $v \models \varphi$. Let $\gamma$ be an outcome of $Y$ with $\mathbb{P}(Y = \gamma) > 0$, in which case the registers are updated as $v'(x) = v(\gamma(x))$ for every $x \in R$, and the respective concrete transition is written as $v \xrightarrow{\sigma} v'$. A *run* of $\mathcal{A}$ on a word $w_0 \ldots w_n \in \Sigma^*$ is a sequence of concrete transitions $v_0 \xrightarrow{w_0} v_1 \xrightarrow{w_1} \cdots \xrightarrow{w_n} v_{n+1}$. The probabilistic transitions of $\mathcal{A}$ induce a probability distribution over the sample space of finite runs of the monitor, denoted $\mathbb{P}(\cdot)$. For a given finite word $w \in \Sigma^*$, the *semantics* of the monitor $\mathcal{A}$ is given by a random variable $[\![\mathcal{A}]\!](w) := \lambda(Y)$ inducing the probability measure $\mathbb{P}_{\mathcal{A}}$, where $Y$ is the random variable representing the distribution over the final state in a run of $\mathcal{A}$ on the word $w$, i.e., $\mathbb{P}_{\mathcal{A}}(Y = v) := \mathbb{P}(\{r = r_0 \ldots r_m \mid r \text{ is a run of } \mathcal{A} \text{ on } w \text{ and } r_m = v\})$.

Example: A Monitor for Detecting the (Unknown) Bias of a Coin. We present a simple deterministic monitor that computes a PAC estimate of the bias of an unknown coin from a sequence of toss outcomes, where the outcomes are denoted as "h" for heads and "t" for tails. The input alphabet is the set of toss outcomes, i.e., $\Sigma = \{h, t\}$; the output alphabet is the set of all bias intervals, i.e., $\Lambda = \{[a, b] \mid 0 \le a < b \le 1\}$; the set of registers is $R = \{r_n, r_h\}$, where $r_n$ and $r_h$ are counters counting the total number of tosses and the number of heads, respectively; and the output function $\lambda$ maps every valuation of $r_n, r_h$ to an interval estimate of the bias of the form $\lambda \equiv [v(r_h)/v(r_n) \pm \varepsilon(r_n, \delta)]$, where $\delta \in [0, 1]$ is a given upper bound on the probability of an incorrect estimate and $\varepsilon(r_n, \delta)$ is the estimation error computed using PAC analysis. For instance, after observing a sequence of 67 tosses with 36 heads, the values of the registers will be $v(r_n) = 67$ and $v(r_h) = 36$, and the output of the monitor will be $\lambda(67, 36) = [36/67 \pm \varepsilon(67, \delta)]$ for an appropriate $\varepsilon(\cdot)$. Now, suppose the next input to the monitor is $h$, in which case the monitor's transition is given as $T(h, \cdot) = (r_n + 1, r_h + 1)$, which updates the registers to the new values $v'(r_n) = 67 + 1 = 68$ and $v'(r_h) = 36 + 1 = 37$. For this example, the tests $\Phi(R)$ over the registers are redundant, but they can be used to construct monitors for more complex properties.
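
The coin-bias monitor above can be sketched in a few lines of Python. This is our own illustrative implementation, not the paper's artifact: we instantiate the unspecified error term $\varepsilon(r_n, \delta)$ with the two-sided Hoeffding bound $\sqrt{\ln(2/\delta)/(2 r_n)}$ (one standard PAC choice), and the class name `CoinBiasMonitor` and method `next` are our own.

```python
import math

class CoinBiasMonitor:
    """Deterministic register-monitor sketch with counters r_n (tosses), r_h (heads)."""

    def __init__(self, delta: float):
        self.delta = delta   # bound on the probability of an incorrect estimate
        self.r_n = 0         # counter: total number of tosses
        self.r_h = 0         # counter: number of heads

    def next(self, outcome: str):
        """Consume one toss outcome ('h' or 't'); return the PAC interval [l, u]."""
        self.r_n += 1
        if outcome == "h":
            self.r_h += 1
        # Hoeffding-style error: eps(r_n, delta) = sqrt(ln(2/delta) / (2 r_n))
        eps = math.sqrt(math.log(2 / self.delta) / (2 * self.r_n))
        mean = self.r_h / self.r_n
        return (mean - eps, mean + eps)

# 67 tosses with 36 heads, as in the example above
monitor = CoinBiasMonitor(delta=0.05)
for toss in ["h"] * 36 + ["t"] * 31:
    lo, hi = monitor.next(toss)
print(lo, hi)  # interval centered at 36/67; it shrinks as more tosses arrive
```

Note that the interval width shrinks like $O(1/\sqrt{r_n})$, which is the finite-sample guarantee discussed in the related-work comparison above.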

## 3 Algorithmic Fairness Specifications and Problem Formulation

#### 3.1 Probabilistic Specification Expressions

To formalize algorithmic fairness properties, like the ones in Sect. 1.1, we introduce *probabilistic specification expressions* (PSE). A PSE $\varphi$ over a given finite set $Q$ is an algebraic expression over a restricted set of operations that uses variables labeled $v_{ij}$, with $i, j \in Q$, whose domains are the real interval $[0, 1]$. The syntax of $\varphi$ is:

$$\xi ::= v \in \{v\_{ij}\}\_{i,j \in Q} \mid \xi \cdot \xi \mid 1 \div \xi,\tag{1a}$$

$$\varphi ::= \kappa \in \mathbb{R} \mid \xi \mid \varphi + \varphi \mid \varphi - \varphi \mid \varphi \cdot \varphi \mid (\varphi), \tag{1b}$$

where $\{v_{ij}\}_{i,j \in Q}$ are the variables with domain $[0, 1]$ and $\kappa$ is a constant. The expression $\xi$ in (1a) is called a *monomial* and is simply a product of powers of variables with integer exponents. A *polynomial* is a weighted sum of monomials with constant weights.<sup>2</sup> Syntactically, polynomials form a strict subclass of the expressions definable using (1b), because the product of two polynomials is not itself written as a polynomial, but is a valid expression according to (1b). A PSE $\varphi$ is *division-free* if no division operator appears in $\varphi$. The *size* of an expression $\varphi$ is the total number of arithmetic operators (i.e., $+$, $-$, $\cdot$, $\div$) in $\varphi$. We use $V_\varphi$ to denote the set of variables appearing in the expression $\varphi$, and for every $V \subseteq V_\varphi$ we define $\mathit{Dom}(V) := \{i \in Q \mid \exists v_{ij} \in V \lor \exists v_{ki} \in V\}$ as the set containing every state of the Markov chain that is involved in some variable in $V$.

The semantics of a PSE $\varphi$ is interpreted *statically* on the unknown Markov chain $\mathcal{M}$: we write $\varphi(M)$ to denote the evaluation, or the value, of $\varphi$ obtained by substituting every variable $v_{ij}$ in $\varphi$ with $M_{ij}$. E.g., for a Markov chain with state space $\{1, 2\}$ and transition probabilities $M_{11} = 0.2$, $M_{12} = 0.8$, $M_{21} = 0.4$, and $M_{22} = 0.6$, the expression $\varphi = v_{11} - v_{21}$ has the evaluation $\varphi(M) = 0.2 - 0.4 = -0.2$. We will assume that for every subexpression $(1 \div \xi)$, $\xi(M) \neq 0$.

Example: Group Fairness. Using PSEs, we can express the group fairness properties for the lending example described in Sect. 1.1, with the help of the Markov chain in the left subfigure of Fig. 1:


The equal opportunity criterion requires the following probability to be close to zero: $p = \mathbb{P}(y \mid g, z) - \mathbb{P}(y \mid \bar{g}, z)$, which is tricky to monitor, as $p$ contains the counterfactual probabilities representing "the probability that an individual from a group would repay had the loan been granted." We apply Bayes' rule and turn $p$ into the following equivalent form: $p = \frac{\mathbb{P}(z \mid g, y)\cdot \mathbb{P}(y \mid g)}{\mathbb{P}(z \mid g)} - \frac{\mathbb{P}(z \mid \bar{g}, y)\cdot \mathbb{P}(y \mid \bar{g})}{\mathbb{P}(z \mid \bar{g})}$. Assuming $\mathbb{P}(z \mid g) = c_1$ and $\mathbb{P}(z \mid \bar{g}) = c_2$, where $c_1$ and $c_2$ are known constants, the property $p$ can be encoded as a PSE as below:

$$\text{Equal opportunity [32]:}\qquad (v\_{(gy)z}\cdot v\_{g(gy)})\div c\_1-(v\_{(\bar{g}y)z}\cdot v\_{\bar{g}(\bar{g}y)})\div c\_2.$$
Example: Social Burden. Using PSEs, we can express the social burden of the college admission example described in Sect. 1.1, with the help of the Markov chain depicted in the right subfigure of Fig. 1:

$$\text{Social burden [54]:}\qquad 1 \cdot v\_{g1} + \dots + N \cdot v\_{gN}.$$

#### 3.2 The Monitoring Problem

Informally, our goal is to build monitors that observe a single long path of a Markov chain and, after each observation, output a new estimate for the value of the PSE. Since the monitor's estimate is based on statistics collected from

<sup>2</sup> Although monomials and polynomials usually only have positive exponents, we take the liberty to use the terminologies even when negative exponents are present.

a finite path, the output may be incorrect with some probability, where the source of this probability is different between the frequentist and the Bayesian approaches. In the frequentist approach, the underlying Markov chain is fixed (but unknown), and the randomness stems from the sampling of the observed path. In the Bayesian approach, the observed path is fixed, and the randomness stems from the uncertainty about a prior specifying the Markov chain's parameters. The commonality is that, in both cases, we want our monitors to estimate the value of the PSE up to an error with a fixed probabilistic confidence.

We formalize the monitoring problem separately for the two approaches. A *problem instance* is a triple $(Q, \varphi, \delta)$, where $Q = [1..N]$ is a set of states, $\varphi$ is a PSE over $Q$, and $\delta \in [0, 1]$ is a constant. In the frequentist approach, we use $\mathbb{P}_s$ to denote the probability measure induced by *sampling* of paths, and in the Bayesian approach we use $\mathbb{P}_\theta$ to denote the probability measure induced by the *prior* probability density function $p_\theta\colon \Delta(N-1)^N \to \mathbb{R} \cup \{\infty\}$ over the transition matrix of the Markov chain. In both cases, the output alphabets of the monitors contain every real interval.

Problem 1 (Frequentist monitor). *Suppose* $(Q, \varphi, \delta)$ *is a problem instance given as input. Design a monitor* $\mathcal{A}$ *such that for every Markov chain* $\mathcal{M}$ *with transition probability matrix* $M$ *and for every finite path* $\vec{x} \in \mathit{Paths}(\mathcal{M})$*:*

$$\mathbb{P}\_{s,\mathcal{A}}\left(\varphi(M)\in[\![\mathcal{A}]\!](\vec{x})\right)\geq 1-\delta,\tag{2}$$

*where* $\mathbb{P}_{s,\mathcal{A}}$ *is the joint probability measure of* $\mathbb{P}_s$ *and* $\mathbb{P}_{\mathcal{A}}$*.*

Problem 2 (Bayesian monitor). *Suppose* $(Q, \varphi, \delta)$ *is a problem instance and* $p_\theta$ *is a prior density function, both given as inputs. Design a monitor* $\mathcal{A}$ *such that for every Markov chain* $\mathcal{M}$ *with transition probability matrix* $M$ *and for every finite path* $\vec{x} \in \mathit{Paths}(\mathcal{M})$*:*

$$\mathbb{P}\_{\theta,\mathcal{A}}\left(\varphi(M)\in[\![\mathcal{A}]\!](\vec{x}) \,\middle|\, \vec{x}\right)\geq 1-\delta,\tag{3}$$

*where* $\mathbb{P}_{\theta,\mathcal{A}}$ *is the joint probability measure of* $\mathbb{P}_\theta$ *and* $\mathbb{P}_{\mathcal{A}}$*.*

Notice that the state space of the Markov chain and the input alphabet of the monitor are the same, and so we often refer to observed states as (input) symbols, and vice versa. The estimate $[l, u] = [\![\mathcal{A}]\!](\vec{x})$ is called the $(1-\delta)\cdot 100\%$ *confidence interval* for $\varphi(M)$.<sup>3</sup> The radius, given by $\varepsilon = 0.5\cdot(u-l)$, is called the *estimation error*, and the quantity $1-\delta$ is called the *confidence*. The estimate gets more precise as the error gets smaller and the confidence gets higher.

In many situations, we are interested in a *qualitative* question of the form "is <sup>ϕ</sup>(M) <sup>≤</sup> <sup>c</sup>?" for some constant <sup>c</sup>. We point out that, once the quantitative problem is solved, the qualitative questions can be answered using standard procedures by setting up a hypothesis test [44, p. 380].
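
One natural way to read the interval output for such a qualitative question can be sketched as follows; the three-way decision rule below is our own illustration, whereas the paper defers to standard hypothesis-testing procedures [44]:

```python
# Sketch: answering "is phi(M) <= c?" from the monitor's (1-delta)-confidence
# interval [l, u]. The three-valued reading is our illustrative choice.

def qualitative(l: float, u: float, c: float) -> str:
    if u <= c:
        return "yes"         # the whole confidence interval lies below c
    if l > c:
        return "no"          # the whole confidence interval lies above c
    return "undecided"       # interval straddles c; more observations needed

print(qualitative(0.1, 0.2, 0.3))  # -> yes
print(qualitative(0.1, 0.5, 0.3))  # -> undecided
```

Each "yes"/"no" answer inherits the monitor's $1-\delta$ confidence; "undecided" simply reflects that the current interval is too wide.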

<sup>3</sup> While in the Bayesian setting *credible intervals* would be more appropriate, we use confidence intervals due to uniformity and the relative ease of computation. To relate the two, our confidence intervals are over-approximations of credible intervals (non-unique) that are centered around the posterior mean.

## 4 Frequentist Monitoring

Suppose the given PSE is only a single variable $\varphi = v_{ij}$, i.e., we are monitoring the probability of going from state $i$ to another state $j$. The frequentist monitor $\mathcal{A}$ for $\varphi$ can be constructed in two steps: (1) empirically compute the average number of times the edge $(i, j)$ was taken per visit to the state $i$ on the observed path of the Markov chain, and (2) compute the $(1-\delta)\cdot 100\%$ confidence interval using statistical concentration inequalities.

Now consider a slightly more complex PSE $\varphi' = v_{ij} + v_{ik}$. One approach to monitor $\varphi'$, proposed by Albarghouthi et al. [3], would be to first compute the $(1-\delta)\cdot 100\%$ confidence intervals $[l_1, u_1]$ and $[l_2, u_2]$ separately for the two constituent variables $v_{ij}$ and $v_{ik}$, respectively. Then, the $(1-2\delta)\cdot 100\%$ confidence interval for $\varphi'$ would be given by the sum of the two intervals $[l_1, u_1]$ and $[l_2, u_2]$, i.e., $[l_1+l_2, u_1+u_2]$; notice the drop in overall confidence due to the union bound. The drop in the confidence level and the additional error introduced by the interval arithmetic accumulate quickly for larger PSEs, making the estimate unusable. Furthermore, we lose all the advantages of any dependence between the terms in the PSE. For instance, by observing that $v_{ij}$ and $v_{ik}$ correspond to the mutually exclusive transitions $i$ to $j$ and $i$ to $k$, we know that $\varphi'(M)$ is always less than 1, a feature that is lost if we use plain merging of individual confidence intervals for $v_{ij}$ and $v_{ik}$. We overcome these issues by estimating the value of the PSE as a whole as much as possible. In Fig. 2, we demonstrate how the ratio between the estimation errors from the two approaches varies as the number of summands (i.e., $n$) in the PSE $\varphi = \sum_{i=1}^{n} v_{1i}$ changes; in both cases we fixed the overall $\delta$ to 0.05 (95% confidence). The ratio remains the same for different observation lengths. Our approach is always at least as accurate as the approach of [3], and is significantly better for larger PSEs.

Fig. 2. Variation of the ratio of the estimation error using the existing approach [3] to the estimation error using our approach, w.r.t. the size of the chosen PSE.
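
The union-bound part of this gap can be computed in closed form. The helper below is our own sketch: it compares Hoeffding errors when $n$ per-variable intervals at confidence $\delta/n$ each are summed, versus one Hoeffding interval for the whole sum at confidence $\delta$; the number of observations cancels in the ratio, consistent with the length-independence noted above. (Interval-arithmetic slack and lost dependence add further error on top of this penalty.)

```python
import math

def union_bound_penalty(n_terms: int, delta: float = 0.05) -> float:
    """Ratio of Hoeffding estimation errors:
    per-variable intervals combined with a union bound (delta/n_terms each)
    vs. a single interval for the whole sum at confidence delta.
    The observation count cancels, so the ratio is length-independent."""
    return math.sqrt(math.log(2 * n_terms / delta) / math.log(2 / delta))

for n in [1, 5, 10, 50]:
    print(n, round(union_bound_penalty(n), 3))  # the penalty grows with n
```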

#### 4.1 The Main Principle

We first explain the idea for division-free PSEs, i.e., PSEs that do not involve any division operator; later we extend our approach to the general case.

Division-Free PSEs: In our algorithm, for every variable $v_{ij} \in V_\varphi$, we introduce a $\mathit{Bernoulli}(M_{ij})$ random variable $Y^{ij}$ with the mean $M_{ij}$ unknown to us. We make an observation $y^{ij}_p$ for every $p$-th visit to the state $i$ on a run: if $j$ follows immediately afterwards, we record $y^{ij}_p = 1$, else we record $y^{ij}_p = 0$. This gives us a sequence of observations $\vec{y}^{ij} = y^{ij}_1, y^{ij}_2, \ldots$ corresponding to the sequence of i.i.d. random variables $\vec{Y}^{ij} = Y^{ij}_1, Y^{ij}_2, \ldots$. For instance, for the run $121123$ we obtain $\vec{y}^{12} = 1, 0, 1$ for the variable $v_{12}$.

The heart of our algorithm is a procedure that aggregates the sequences of random variables $\{\vec{Y}^{ij}\}_{v_{ij} \in V_\varphi}$ into a single i.i.d. sequence $\vec{W}$ of an auxiliary random variable $W$, such that the mean of $W$ is $\mu_W = \mathbb{E}(W) = \varphi(M)$. We can then use known concentration inequalities on the sequence $\vec{W}$ to estimate $\mu_W$. Since $\mu_W$ exactly equals $\varphi(M)$ by design, we obtain a tight concentration bound on $\varphi(M)$. We informally explain the main idea of constructing $\vec{W}$ using simple examples; the details can be found in Algorithm 2.

Sum and Difference: Let $\varphi = v_{ij} + v_{kl}$. We simply combine $\vec{Y}^{ij}$ and $\vec{Y}^{kl}$ as $W_p = Y^{ij}_p + Y^{kl}_p$, so that $w_p = y^{ij}_p + y^{kl}_p$ is the corresponding observation of $W_p$. Then $\mu_{W_p} = \varphi(M)$ holds, because $\mu_{W_p} = \mathbb{E}(W_p) = \mathbb{E}(Y^{ij}_p + Y^{kl}_p) = \mathbb{E}(Y^{ij}_p) + \mathbb{E}(Y^{kl}_p) = M_{ij} + M_{kl}$. A similar approach works for $\varphi = v_{ij} - v_{kl}$.

Multiplication: For multiplications, the same linearity principle will not always work, since for random variables $A$ and $B$, $\mathbb{E}(A \cdot B) = \mathbb{E}(A)\cdot\mathbb{E}(B)$ *only if* $A$ and $B$ are statistically independent, which is not true for specifications of the form $\varphi = v_{ij} \cdot v_{ik}$. In this case, the respective Bernoulli random variables $Y^{ij}_p$ and $Y^{ik}_p$ are dependent: $\mathbb{P}(Y^{ij}_p = 1)\cdot\mathbb{P}(Y^{ik}_p = 1) = M_{ij}\cdot M_{ik}$, but $\mathbb{P}(Y^{ij}_p = 1 \land Y^{ik}_p = 1)$ is always 0 (since *both* $j$ and $k$ cannot be visited following the $p$-th visit to $i$).

To benefit from independence once again, we temporally shift one of the random variables by defining $W_p = Y^{ij}_{2p} \cdot Y^{ik}_{2p+1}$, with $w_p = y^{ij}_{2p} \cdot y^{ik}_{2p+1}$. Since the random variables $Y^{ij}_{2p}$ and $Y^{ik}_{2p+1}$ are independent, as they use separate visits to state $i$, we obtain $\mu_{W_p} = M_{ij} \cdot M_{ik}$. For independent multiplications of the form $\varphi = v_{ij} \cdot v_{kl}$ with $i \neq k$, we can simply use $W_p = Y^{ij}_p \cdot Y^{kl}_p$.
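
The effect of the temporal shift is easy to see in a small simulation. The code below is our own illustration (not the paper's algorithm): pairing outcomes of the *same* visit to $i$ yields a product that is identically 0, while pairing visit $2p$ with visit $2p+1$ recovers an unbiased estimate of $M_{ij}\cdot M_{ik}$. The probabilities 0.3 and 0.5 are arbitrary assumed values.

```python
import random

random.seed(0)
M_ij, M_ik = 0.3, 0.5          # assumed successor probabilities out of state i

# successor of state i on each visit: 'j', 'k', or some irrelevant state
visits = random.choices(["j", "k", "other"],
                        weights=[M_ij, M_ik, 1 - M_ij - M_ik], k=200_000)
y_ij = [1 if s == "j" else 0 for s in visits]
y_ik = [1 if s == "k" else 0 for s in visits]

# naive pairing: same visit -> the product is always 0 (mutually exclusive)
same_visit = [a * b for a, b in zip(y_ij, y_ik)]
# temporal shift: visit 2p for Y^ij, visit 2p+1 for Y^ik -> independent factors
shifted = [y_ij[2 * p] * y_ik[2 * p + 1] for p in range(len(visits) // 2)]

print(sum(same_visit))              # 0: the naive pairing is useless
print(sum(shifted) / len(shifted))  # close to M_ij * M_ik = 0.15
```

The shift halves the number of usable observations, but that is the price of an unbiased mean, which is what the concentration bound needs.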

In general, we apply the ideas of aggregation and temporal shift inductively on the syntax tree of the PSE $\varphi$. With an aggregated sequence of observations for the auxiliary variable $W$ for $\varphi$, we can find an estimate for $\varphi(M)$ using Hoeffding's inequality. We present the detailed algorithm of this monitor, namely FreqMonitorDivFree, in Algorithm 1.

The General Case (PSEs with Division Operators): We observe that every arbitrary PSE $\varphi$ of size $n$ can be transformed into a semantically equivalent PSE of the form $\varphi_a + \varphi_b \div \varphi_c$ of size $O(n^2 2^n)$, where $\varphi_a$, $\varphi_b$, and $\varphi_c$ are all division-free. Once in this form, we can employ three different FreqMonitorDivFree monitors from Algorithm 1 to obtain separate interval estimates for $\varphi_a$, $\varphi_b$, and $\varphi_c$, which are then combined using standard interval arithmetic, and the resulting confidence of the estimate is obtained through the union bound. The steps for constructing the (general-case) FrequentistMonitor are shown in Algorithm 2, and the detailed analysis can be found in the proof of Theorem 1.
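
The combination step can be sketched as follows. This helper is our own illustration of the description above (the exact splitting of $\delta$ is an assumption; we use $\delta/3$ per part): given three confidence intervals for the division-free parts, interval arithmetic yields an interval for $\varphi_a + \varphi_b \div \varphi_c$ whose confidence follows from the union bound, provided the interval for $\varphi_c$ excludes 0.

```python
# Sketch: combine (1 - delta/3)-confidence intervals for phi_a, phi_b, phi_c
# into an interval for phi_a + phi_b / phi_c, correct with prob. >= 1 - delta
# by the union bound. Assumes the interval for phi_c excludes 0.

def combine(int_a, int_b, int_c):
    la, ua = int_a
    lb, ub = int_b
    lc, uc = int_c
    if lc <= 0 <= uc:
        raise ValueError("interval for phi_c must exclude 0")
    # b / c: extremes are attained at the four endpoint quotients
    quotients = [lb / lc, lb / uc, ub / lc, ub / uc]
    return (la + min(quotients), ua + max(quotients))

print(combine((0.1, 0.2), (0.3, 0.4), (0.5, 0.6)))  # -> (0.6, 1.0)
```

Note how the width of the result is driven by the widest of the three inputs, which is one reason the paper estimates each division-free part as a whole rather than variable by variable.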

Bounding Memory: Consider a PSE $\varphi = v_{ij} + v_{kl}$. The outcome $w_p$ for $\varphi$ can only be computed when both the Bernoulli outcomes $y^{ij}_p$ and $y^{kl}_p$ are available. If at any point only one of the two is available, then we need to store the available one so that it can be used later when the other one becomes available. It can be shown that the storage of "unmatched" outcomes may need unbounded memory.

To bound the memory, we use the insight that a *random reshuffling* of the i.i.d. sequence $y^{ij}_p$ is still i.i.d. with the same distribution, so that we do not need to store the exact order in which the outcomes appeared. Instead, for every $v_{ij} \in V_\varphi$, we only store the number of times we have seen the state $i$ and the edge $(i, j)$ in counters $c_i$ and $c_{ij}$, respectively. Observe that $c_i \geq \sum_{v_{ik} \in V_\varphi} c_{ik}$, where the possible difference accounts for the visits to irrelevant states, represented together by a dummy state. Given $\{c_{ik}\}_k$, whenever needed, we generate in $x_i$ a *random reshuffling* of the sequence of states, together with the dummy state, seen after the past visits to $i$. From the sequence stored in $x_i$, for every $v_{ik} \in V_\varphi$, we can consistently determine the value of $y^{ik}_p$ (consistency dictates $y^{ik}_p = 1 \Rightarrow y^{ij}_p = 0$). Moreover, we reuse space by resetting $x_i$ whenever the sequence stored in $x_i$ is no longer needed. It can be shown that the size of every $x_i$ can be at most the size of the expression [33, Proof of Thm. 2]. This random reshuffling of the observation sequences is the cause of the probabilistic transitions of the frequentist monitor.
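
The counter-based regeneration can be sketched as sampling without replacement, mirroring the role of *ExtractOutcome* in Algorithm 1 (the function name `extract_outcomes` and the `"*"` marker for the dummy state are our own):

```python
import random

random.seed(1)

def extract_outcomes(c_i: int, c_ij: dict, t: int) -> list:
    """Regenerate t successors of state i from counters alone:
    c_i counts visits to i, c_ij counts edges (i, j) taken;
    '*' stands for the dummy state (visits to irrelevant successors).
    Sampling without replacement yields a uniformly reshuffled sequence."""
    remaining = dict(c_ij)
    total = c_i
    out = []
    for _ in range(t):
        r = random.randrange(total)      # uniform over the remaining visits
        acc, picked = 0, "*"
        for j, c in remaining.items():
            acc += c
            if r < acc:
                picked = j
                break
        total -= 1                       # one fewer visit left to draw
        if picked != "*":
            remaining[picked] -= 1       # one fewer (i, picked) edge left
        out.append(picked)
    return out

# 10 visits to i, of which 4 went to j and 3 to k (assumed counter values)
seq = extract_outcomes(c_i=10, c_ij={"j": 4, "k": 3}, t=10)
print(seq)  # some permutation with four 'j', three 'k', three '*'
```

Because only the counters are kept between regenerations, the memory footprint is fixed by $|V_\varphi|$ rather than by the length of the observed path.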

#### 4.2 Implementation of the Frequentist Monitor

Fix a problem instance $(Q, \varphi, \delta)$, with the size of $\varphi$ being $n$. Let $\varphi$ be transformed into $\varphi^l$ by relabeling duplicate occurrences of $v_{ij}$ using distinct labels $v^1_{ij}, v^2_{ij}, \ldots$. The set of labeled variables in $\varphi^l$ is $V^l_\varphi$, and $|V^l_\varphi| = O(n)$. Let $\mathit{SubExpr}(\varphi)$ denote the set of every subexpression in the expression $\varphi$, and use $[l_\varphi, u_\varphi]$ to denote the range of values the expression $\varphi$ can take when every variable ranges over the domain $[0, 1]$. Let $\mathit{Dep}(\varphi) = \{i \mid \exists v_{ij} \in V_\varphi\}$; every subexpression $\varphi_1 \cdot \varphi_2$ with $\mathit{Dep}(\varphi_1) \cap \mathit{Dep}(\varphi_2) \neq \emptyset$ is called a *dependent multiplication*.

The implementation of FreqMonitorDivFree in Algorithm 1 has two main functions. *Init* initializes the registers. *Next* implements the transition function of the monitor, which attempts to compute a new observation $w$ for $\vec{W}$ (Line 4 of *Next*) after observing a new input $\sigma'$, and if successful it updates the output of the monitor by invoking the *UpdateEst* function. In addition to the registers labeled in the pseudocode of *Init* and *Next*, the following registers are used internally:


Now, we summarize the main results for the frequentist monitor.

Theorem 1 (Correctness). *Let* $(Q, \varphi, \delta)$ *be a problem instance. Algorithm 2 implements a monitor for* $(Q, \varphi, \delta)$ *that solves Problem 1.*

Theorem 2 (Computational resources). *Let* $(Q, \varphi, \delta)$ *be a problem instance and* $\mathcal{A}$ *be the monitor implemented using the* FrequentistMonitor *routine of Algorithm 2. Suppose the size of* $\varphi$ *is* $n$*. The monitor* $\mathcal{A}$ *requires* $O(n^4 2^{2n})$ *registers, and takes* $O(n^4 2^{2n})$ *time to update its output after receiving a new input.*

**Algorithm 1.** FreqMonitorDivFree

```
Parameters: Q, ϕ, δ        Output: Λ

function Init(σ):
    ϕ^l ← unique labeling of ϕ
    for all v_ij ∈ V_ϕ:
        c_ij ← 0                            ▷ # of (i, j)
        c_i  ← 0                            ▷ # of i
    n ← 0                                   ▷ length of w⃗
    σ̄ ← σ                                   ▷ prev. symbol
    μ_Λ ← ⊥                                 ▷ est. mean
    ε_Λ ← ⊥                                 ▷ est. error
    ResetX()                                ▷ reset x_i-s
    compute l_ϕ, u_ϕ                        ▷ int. arith.

function Next(σ′):
    c_σ̄ ← c_σ̄ + 1                           ▷ update counters
    c_σ̄σ′ ← c_σ̄σ′ + 1
    w ← Eval(ϕ^l)
    if w ≠ ⊥:
        n ← n + 1
        Λ ← UpdateEst(w, n)
        ResetX()
    σ̄ ← σ′
    return Λ

function Eval(ϕ^l):
    if r_{ϕ^l} = ⊥:
        if ϕ^l ≡ ϕ^l_1 + ϕ^l_2:
            r_{ϕ^l} ← Eval(ϕ^l_1) + Eval(ϕ^l_2)
        else if ϕ^l ≡ ϕ^l_1 − ϕ^l_2:
            r_{ϕ^l} ← Eval(ϕ^l_1) − Eval(ϕ^l_2)
        else if ϕ^l ≡ ϕ^l_1 · ϕ^l_2:
            if Dep(V^l_{ϕ_1}) ∩ Dep(V^l_{ϕ_2}) = ∅:
                r_{ϕ^l} ← Eval(ϕ^l_1) · Eval(ϕ^l_2)
            else:                           ▷ dep. mult.
                for v^l_ij ∈ V^l_{ϕ_2} with i ∈ Dep(V^l_{ϕ_1}):
                    t^l_ij ← max({t^m_ik | v^m_ik ∈ V^l_{ϕ_1}})
                    t^l_ij ← t^l_ij + 1     ▷ make indep.
                r_{ϕ^l} ← Eval(ϕ^l_1) · Eval(ϕ^l_2)
        else if ϕ^l ≡ v^l_ij:
            if x_i[t^l_ij + 1] = ⊥:
                ExtractOutcome(x_i, t^l_ij + 1)
            if x_i[t^l_ij + 1] = j ≠ ⊥:
                r_{ϕ^l} ← 1
            else:
                r_{ϕ^l} ← 0
        else if ϕ^l ≡ c:
            r_{ϕ^l} ← c
    return r_{ϕ^l}

function UpdateEst(w, n):
    μ_Λ ← (μ_Λ · (n − 1) + w) / n
    ε_Λ ← sqrt(−((u_ϕ − l_ϕ)² / (2n)) · ln(δ/2))
    return [μ_Λ ± ε_Λ]

function ExtractOutcome(x_i, t):            ▷ generate a shuffled sequence of
                                            ▷ symbols seen after i, so that |x_i| = t
    U ← {j ∈ Q | v_ij ∈ V_ϕ}
    for p = |x_i| + 1, …, t:
        q ← pick u ∈ U w/ prob. c_iu / c_i,
            or the dummy state w/ prob. (c_i − Σ_j c_ij) / c_i
        c_i ← c_i − 1
        if q is not the dummy state:
            c_iq ← c_iq − 1
        x_i[|x_i| + 1] ← q

function ResetX():
    for all i ∈ Dom(V_ϕ):
        x_i ← ∅
    for all v^l_ij ∈ V^l_ϕ:
        t^l_ij ← 0
```

## Algorithm 2. FrequentistMonitor

Parameters: Q, ϕ, δ Output: Λ 1: function *Init*(σ) 2: <sup>ϕ</sup>*<sup>a</sup>* <sup>+</sup> *ϕb ϕc* change form ←−−−−−−−−− <sup>ϕ</sup>*<sup>l</sup>* labeling ←−−−−−− ϕ 3: <sup>A</sup>*<sup>a</sup>* <sup>←</sup> FreqMonitorDivFree(Q, ϕ*a*, δ/3) 4: <sup>A</sup>*<sup>b</sup>* <sup>←</sup> FreqMonitorDivFree(Q, ϕ*b*, δ/3) 5: <sup>A</sup>*<sup>c</sup>* <sup>←</sup> FreqMonitorDivFree(Q, ϕ*c*, δ/3) 6: <sup>A</sup>*a*.*Init*(σ) 7: <sup>A</sup>*b*.*Init*(σ) 8: <sup>A</sup>*c*.*Init*(σ)


*symbol. For the special case of* ϕ *containing at most one division operator (division by constant does not count),* <sup>A</sup> *requires only* <sup>O</sup>(n<sup>2</sup>) *registers, and takes only* <sup>O</sup>(n<sup>2</sup>) *time to update its output after receiving a new input symbol.*

There is a tradeoff between the estimation error, the confidence, and the length of the observed sequence of input symbols. For instance, for a fixed confidence, the longer the observed sequence, the smaller the estimation error. The following theorem establishes a lower bound on the length of the sequence needed to achieve a given upper bound on the estimation error at a fixed confidence.

Theorem 3 (Convergence speed). *Let* (Q, ϕ, δ) *be a problem instance where* ϕ *does not contain any division operator, and let* A *be the monitor computed using Algorithm 2. Suppose the size of* ϕ *is* n*. For a given upper bound on estimation error* <sup>ε</sup> <sup>∈</sup> <sup>R</sup>*, the minimum number of visits to every state in Dom*(V<sup>ϕ</sup>) *for obtaining an output with error at most* <sup>ε</sup> *and confidence at least* <sup>1</sup>−<sup>δ</sup> *on any path is given by:*

$$-\frac{(u\_{\varphi}-l\_{\varphi})^2\ln\left(\frac{\delta}{2}\right)n}{2\varepsilon^2},\tag{4}$$

*where* [lϕ, uϕ] *is the range of possible values of* ϕ *over all valuations of the variables (each with domain* [0, 1]*) occurring in* ϕ*.*

The bound follows from Hoeffding's inequality, together with the fact that every dependent multiplication increases the required number of samples by 1. A similar bound for the general case with division is left open.
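As a concrete illustration, bound (4) can be evaluated directly. The sketch below is our own illustration (not part of the paper's tool); it computes the minimum number of visits for given uϕ, lϕ, δ, ε, and n.

```python
import math

def min_visits(u_phi: float, l_phi: float, delta: float, eps: float, n: int) -> int:
    """Lower bound (4) on the number of visits to every state in Dom(V_phi)
    needed for estimation error at most eps with confidence at least 1 - delta."""
    bound = -((u_phi - l_phi) ** 2) * math.log(delta / 2) * n / (2 * eps ** 2)
    return math.ceil(bound)

# An expression ranging over [0, 1] (so u - l = 1) of size n = 4, monitored
# with confidence 0.95 (delta = 0.05) and estimation error at most 0.01:
print(min_visits(1.0, 0.0, 0.05, 0.01, 4))
```

Note the quadratic dependence on 1/ε: halving the tolerated error roughly quadruples the required number of visits.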

### 5 Bayesian Monitoring

Fix a problem instance $(Q = [1\,..\,N], \varphi, \delta)$. Let $\mathcal{M} = \Delta(N-1)^N$ be the shorthand notation for the set of transition probability matrices of the Markov chains with state space $Q$. Let $p\_\theta \colon \mathcal{M} \to [0, 1]$ be the prior probability density function over $\mathcal{M}$, which is assumed to be specified using the matrix beta distribution (the definition can be found in standard textbooks on Bayesian statistics [37, p. 280]). Let $\mathbf{1}$ be a matrix, with its size dependent on the context, whose every element is 1. We make the following common assumption [31,37, p. 50]:

Assumption 1 (Prior). *We are given a parameter matrix* $\theta \ge \mathbf{1}$*, and* $p\_\theta$ *is specified using the matrix beta distribution with parameter* $\theta$*. Moreover, the initial state of the Markov chain is fixed.*

When $\theta = \mathbf{1}$, $p\_\theta$ is the uniform density function over $\mathcal{M}$. After observing a path $\vec{x}$, Bayes' rule yields the *posterior* density function $p\_\theta(\cdot \mid \vec{x})$, which is known to be efficiently computable due to the so-called conjugacy property that holds under Assumption 1. From the posterior density, we obtain the expected posterior semantic value of $\varphi$ as $\mathbb{E}\_\theta(\varphi(M) \mid \vec{x}) := \int\_{\mathcal{M}} \varphi(M) \cdot p\_\theta(M \mid \vec{x})\,\mathrm{d}M$. The heart of our Bayesian monitor is an efficient incremental computation of $\mathbb{E}\_\theta(\varphi(M) \mid \vec{x})$, free from numerical integration. Once we can compute $\mathbb{E}\_\theta(\varphi(M) \mid \vec{x})$, we can also compute the posterior variance $S^2$ of $\varphi(M)$ using the known expression $S^2 = \mathbb{E}\_\theta(\varphi^2(M) \mid \vec{x}) - \left(\mathbb{E}\_\theta(\varphi(M) \mid \vec{x})\right)^2$, which enables us to compute a confidence interval for $\varphi(M)$ using Chebyshev's inequality. In the following, we summarize our procedure for estimating $\mathbb{E}\_\theta(\varphi(M) \mid \vec{x})$.
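The conjugacy step can be made concrete with a small sketch (ours, for illustration only; the actual monitor never materializes the posterior, but updates expectations incrementally as described in Sect. 5.1). Under a matrix beta prior, each row of the transition matrix has a Dirichlet prior, so Bayes' rule reduces to adding observed transition counts to the prior parameters; Chebyshev's inequality then turns a mean and variance into a confidence interval.

```python
import math

def posterior_row(theta_row, counts_row):
    """Dirichlet conjugacy: posterior parameters = prior parameters + counts."""
    return [t + c for t, c in zip(theta_row, counts_row)]

def posterior_mean(theta_row, counts_row, j):
    """Posterior expectation of the transition probability into state j."""
    row = posterior_row(theta_row, counts_row)
    return row[j] / sum(row)

def chebyshev_interval(mean, variance, delta):
    """P(|phi(M) - mean| >= eps) <= variance / eps**2, so choosing
    eps = sqrt(variance / delta) gives confidence at least 1 - delta."""
    eps = math.sqrt(variance / delta)
    return (mean - eps, mean + eps)

# Uniform prior (theta = 1) on a 2-state chain; 7 transitions 1->1 and
# 3 transitions 1->2 observed from state 1:
print(posterior_mean([1, 1], [7, 3], 0))  # (7 + 1) / (10 + 2)
```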

### 5.1 The Main Principle

The incremental computation of $\mathbb{E}\_\theta(\varphi(M) \mid \vec{x})$ is implemented in BayesExpMonitor. We first transform the expression $\varphi$ into the polynomial form $\varphi' = \sum\_l \kappa\_l \xi\_l$, where $\{\kappa\_l\}\_l$ are the weights and $\{\xi\_l\}\_l$ are monomials. If the size of $\varphi$ is $n$, then the size of $\varphi'$ is $O(n^2 2^n)$. We can then use linearity to compute the overall expectation as the weighted sum of the expectations of the individual monomials: $\mathbb{E}\_\theta(\varphi(M) \mid \vec{x}) = \mathbb{E}\_\theta(\varphi'(M) \mid \vec{x}) = \sum\_l \kappa\_l \mathbb{E}\_\theta(\xi\_l(M) \mid \vec{x})$. In the following, we summarize the procedure for estimating $\mathbb{E}\_\theta(\xi(M) \mid \vec{x})$ for each monomial $\xi$.

Let $\xi$ be a monomial, and let $\vec{x}\,ab \in Q^*$ be a sequence of states. We use $d\_{ij}$ to store the exponent of the variable $v\_{ij}$ in the monomial $\xi$, and define $d\_a := \sum\_{j \in [1..N]} d\_{aj}$. We also record the sets of pairs $(i, j)$ and indices $i$ with positive and negative entries $d\_{ij}$ and $d\_i$: $D\_i^+ := \{j \mid d\_{ij} > 0\}$, $D\_i^- := \{j \mid d\_{ij} < 0\}$, $D^+ := \{i \mid d\_i > 0\}$, and $D^- := \{i \mid d\_i < 0\}$.

For any given word $\vec{w} \in Q^*$, let $c\_{ij}(\vec{w})$ denote the number of occurrences of $ij$ in $\vec{w}$, and let $c\_i(\vec{w}) := \sum\_{j \in Q} c\_{ij}(\vec{w})$. Define $\bar{c}\_i(\vec{w}) := c\_i(\vec{w}) + \sum\_{j \in [1..N]} \theta\_{ij}$ and $\bar{c}\_{ij}(\vec{w}) := c\_{ij}(\vec{w}) + \theta\_{ij}$. Let $\mathcal{H} \colon Q^* \to \mathbb{R}$ be defined as:

$$\mathcal{H}(\vec{w}) := \frac{\prod\_{i=1}^{N} \prod\_{j \in D\_i^+} {}^{(\bar{c}\_{ij}(\vec{w})-1)+|d\_{ij}|}P\_{|d\_{ij}|}}{\prod\_{i \in D^+} {}^{(\bar{c}\_{i}(\vec{w})-1)+|d\_{i}|}P\_{|d\_{i}|}} \cdot \frac{\prod\_{i \in D^-} {}^{(\bar{c}\_{i}(\vec{w})-1)}P\_{|d\_{i}|}}{\prod\_{i=1}^{N} \prod\_{j \in D\_i^-} {}^{(\bar{c}\_{ij}(\vec{w})-1)}P\_{|d\_{ij}|}},\tag{5}$$

where ${}^{n}P\_{k} := \frac{n!}{(n-k)!}$ is the number of permutations of $k > 0$ items from $n > 0$ objects, for $k \le n$, and we use the convention that a product over an empty index set equals $1$. Below, in Lemma 1, we establish that $\mathbb{E}\_\theta(\xi(M) \mid \vec{w}) = \mathcal{H}(\vec{w})$, and present an efficient incremental scheme to compute $\mathbb{E}\_\theta(\xi(M) \mid \vec{x}\,ab)$ from $\mathbb{E}\_\theta(\xi(M) \mid \vec{x}\,a)$.
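Under our reading of Eq. (5), H can be computed directly from permutation terms. The following sketch is our own illustration (function and parameter names are hypothetical): it evaluates H for a monomial given its exponent matrix d, the transition counts of the word, and the prior parameters θ.

```python
from math import perm  # perm(n, k) = n! / (n - k)!

def H(d, counts, theta):
    """Expected posterior value of the monomial prod_ij v_ij^{d[i][j]}
    under a matrix beta prior with parameters theta, per Eq. (5).
    counts[i][j] = number of i->j transitions observed; the consistency
    condition (6) must hold for the permutation terms to be well-defined."""
    N = len(d)
    cbar = [[counts[i][j] + theta[i][j] for j in range(N)] for i in range(N)]
    cbar_i = [sum(row) for row in cbar]
    d_i = [sum(row) for row in d]
    num, den = 1.0, 1.0
    for i in range(N):
        for j in range(N):
            if d[i][j] > 0:
                num *= perm(cbar[i][j] - 1 + d[i][j], d[i][j])
            elif d[i][j] < 0:
                den *= perm(cbar[i][j] - 1, -d[i][j])
        if d_i[i] > 0:
            den *= perm(cbar_i[i] - 1 + d_i[i], d_i[i])
        elif d_i[i] < 0:
            num *= perm(cbar_i[i] - 1, -d_i[i])
    return num / den

# Monomial xi = v_00 on a 2-state chain, uniform prior, no observations:
print(H([[1, 0], [0, 0]], [[0, 0], [0, 0]], [[1, 1], [1, 1]]))  # 0.5
```

As sanity checks, the sketch reproduces the Dirichlet moments: E[p] = 1/2 and E[p²] = 1/3 under a uniform prior on two outcomes.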

Lemma 1 (Incremental computation of $\mathbb{E}(\cdot \mid \cdot)$). *If the following consistency condition*

$$\forall i, j \in [1 \ldots N].\ \bar{c}\_{ij}(\vec{w}) + d\_{ij} > 0 \tag{6}$$

*is met, then the following holds:*

$$\mathbb{E}\_\theta(\xi(M) \mid \vec{x}\,ab) = \mathcal{H}(\vec{x}\,ab) = \mathcal{H}(\vec{x}\,a) \cdot \frac{\bar{c}\_{ab}(\vec{x}\,a) + d\_{ab}}{\bar{c}\_{ab}(\vec{x}\,a)} \cdot \frac{\bar{c}\_{a}(\vec{x}\,a)}{\bar{c}\_{a}(\vec{x}\,a) + d\_{a}}.\tag{7}$$


#### Algorithm 3. BayesExpMonitor

Condition (6) guarantees that the permutation terms in (5) are well-defined. The first equality in (7) follows from Marchal et al. [51], and the rest uses the conjugacy of the prior. Lemma 1 forms the basis of the efficient update of our Bayesian monitor. Observe that on any given path, once (6) holds, it continues to hold forever. Thus, the monitor initially keeps updating H internally without outputting anything; once (6) holds, it keeps outputting H from then on.
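The incremental update in Eq. (7) needs only constant work per monomial for each observed transition. A sketch under the same assumptions as above (helper names are ours, not the paper's):

```python
def update_H(H_prev, d, counts, theta, a, b):
    """One step of Eq. (7): E(xi(M) | x ab) from E(xi(M) | x a).
    counts are the transition counts of the path x a, i.e., taken
    *before* the new transition a -> b is appended."""
    cbar_ab = counts[a][b] + theta[a][b]
    cbar_a = sum(counts[a]) + sum(theta[a])
    d_a = sum(d[a])
    return H_prev * (cbar_ab + d[a][b]) / cbar_ab * cbar_a / (cbar_a + d_a)

# Monomial xi = v_00 on a 2-state chain with a uniform prior: on the empty
# path the expectation is 1/2; appending the transition 0 -> 0 raises it.
print(update_H(0.5, [[1, 0], [0, 0]], [[0, 0], [0, 0]], [[1, 1], [1, 1]], 0, 0))
```

In this example the update yields 2/3, which matches recomputing Eq. (5) from scratch on the extended path.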

#### 5.2 Implementation of the Bayesian Monitor

We present the Bayesian monitor implementation in BayesConfIntMonitor (Algorithm 4), which invokes BayesExpMonitor (Algorithm 3) as a subroutine. BayesExpMonitor computes the expected semantic value of an expression ϕ in polynomial form, by computing the expected value of each individual monomial using Lemma 1 and combining them using the linearity property. We drop the arguments from $\bar{c}\_i(\cdot)$ and $\bar{c}\_{ij}(\cdot)$ and simply write $\bar{c}\_i$ and $\bar{c}\_{ij}$ as constants associated with the appropriate words. The symbol $m\_{ij}$ in Line 5 of *Init* is a bookkeeping variable for quickly checking the consistency condition (Eq. 6) in Line 5 of *Next*. In BayesConfIntMonitor, we compute the expected value and the variance of ϕ by invoking BayesExpMonitor on ϕ and ϕ<sup>2</sup>, respectively, and then compute the confidence interval using Chebyshev's inequality. It can be observed from the *Next* subroutines of BayesConfIntMonitor and BayesExpMonitor that a deterministic transition function suffices for the Bayesian monitors.

Theorem 4 (Correctness). *Let* (Q, ϕ, δ) *be a problem instance, and* <sup>p</sup><sup>θ</sup> *be given as the prior distribution which satisfies Assumption 1. Algorithm 4 produces a monitor for* (Q, ϕ, δ) *that solves Problem 2.*

#### Algorithm 4. BayesConfIntMonitor

Theorem 5 (Computational resources). *Let* (Q, ϕ, δ) *be a problem instance and* A *be the monitor computed using the* BayesConfIntMonitor *routine of Algorithm 4. Suppose the size of* ϕ *is* n*. The monitor* A *requires* O(n<sup>2</sup>2<sup>n</sup>) *registers, and takes* O(n<sup>2</sup>2<sup>n</sup>) *time to update its output after receiving a new input symbol. For the special case of* ϕ *being in polynomial form,* A *requires only* O(n<sup>2</sup>) *registers, and takes only* O(n<sup>2</sup>) *time to update its output after receiving a new input symbol.*

A bound on the convergence speed of the Bayesian monitor is left open. This would require a bound on the change in variance with respect to the length of the observed path, which is not known for the general case of PSEs. Note that the efficient (quadratic) cases are different for the frequentist and Bayesian monitors, suggesting the use of different monitors for different specifications.

### 6 Experiments

We implemented our frequentist and Bayesian monitors in a tool written in Rust, and used the tool to design monitors for the lending and the college admission examples taken from the literature [48,54] (described in Sect. 1.1). The generators are modeled as Markov chains (see Fig. 1), unknown to the monitors, capturing the sequential interactions between the decision-makers (i.e., the bank or the college) and their respective environments (i.e., the loan applicants or the students), as described by D'Amour et al. [16]. The setup of the experiments is as follows: we created a multi-threaded wrapper program, where one thread simulates one long run of the Markov chain, and a different thread executes the monitor. Every time a new state is visited by the Markov chain on the first thread, the information is transmitted to the monitor on the second thread, which then updates the output. The experiments were run on a MacBook Pro (2017) equipped with a 2.3 GHz Dual-Core Intel Core i5 processor and 8 GB RAM. The tool can be downloaded from the following URL, where we have also included the scripts to reproduce our experiments: https://github.com/ista-fairness-monitoring/fmlib.
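The two-thread wrapper described above can be sketched as follows. This is a minimal Python illustration rather than the tool's Rust implementation, and the stand-in monitor merely counts transitions where the real one would update its confidence interval:

```python
import queue
import random
import threading

def simulate(chain, start, steps, out_q):
    """Thread 1: one long run of the Markov chain; every visited state is
    sent to the monitor thread. chain[i] is a list of (next_state, prob)."""
    state = start
    for _ in range(steps):
        r, acc = random.random(), 0.0
        for nxt, p in chain[state]:
            acc += p
            if r <= acc:
                state = nxt
                break
        out_q.put(state)
    out_q.put(None)  # end-of-run marker

def monitor_loop(in_q, counts):
    """Thread 2: a stand-in monitor maintaining transition counts."""
    prev = None
    while (state := in_q.get()) is not None:
        if prev is not None:
            counts[(prev, state)] = counts.get((prev, state), 0) + 1
        prev = state

chain = {0: [(0, 0.9), (1, 0.1)], 1: [(0, 0.5), (1, 0.5)]}
q, counts = queue.Queue(), {}
t1 = threading.Thread(target=simulate, args=(chain, 0, 10_000, q))
t2 = threading.Thread(target=monitor_loop, args=(q, counts))
t1.start(); t2.start(); t1.join(); t2.join()
print(sum(counts.values()))  # 9_999 observed transitions
```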

We summarize the experimental results in Fig. 3. From the table, we observe that both monitors are extremely lightweight: they take less than a millisecond per update and a small number of registers to operate. From the plots, we observe that the frequentist monitors' outputs are always centered around the ground truth values of the properties, empirically showing that they are objectively correct. On the other hand, the Bayesian monitors' outputs can vary drastically for different choices of the prior, empirically showing that the correctness of their outputs is subjective. It may appear that the outputs of the Bayesian monitors are wrong, as they often do not contain the ground truth values. We reiterate that, from the Bayesian perspective, the ground truth does not exist; instead, we only have a probability distribution over the true values that gets updated after observing the generated sequence of events. The choice of the type of monitor ultimately depends on the application requirements.

Fig. 3. The plots show the 95% confidence intervals estimated by the monitors over time, averaged over 10 different sample paths, for the lending with demographic parity (left), lending with equalized opportunity (middle), and the college admission with social burden (right) problems. The horizontal dotted lines are the ground truth values of the properties, obtained by analyzing the Markov chains used to model the systems (unknown to the monitors). The table summarizes various performance metrics.

### 7 Conclusion

We showed how to monitor algorithmic fairness properties of a Markov chain with unknown transition probabilities. We presented two separate algorithms, based on the frequentist and the Bayesian approaches to statistics, and demonstrated the performance of both approaches, both theoretically and empirically.

Several future directions exist. Firstly, more expressive classes of properties need to be investigated to cover a broader range of algorithmic fairness criteria. We believe that Boolean logical connectives, as well as the min and max operators, can be incorporated straightforwardly using ideas from the related literature [3]. This also adds support for absolute values, since |x| = max{x, −x}. On the other hand, properties that require estimating how often a state is visited would require more information about the dynamics of the Markov chain, including its mixing time. Monitoring statistical hyperproperties [18] is another important direction, which will allow us to encode individual fairness properties [21]. Secondly, more liberal assumptions on the system model will be crucial for certain practical applications. In particular, hidden Markov models, time-inhomogeneous Markov models, and Markov decision processes are examples of system models in widespread use in real-world applications. Finally, better error bounds tailored to specific algorithmic fairness properties can be developed through a deeper mathematical analysis of the underlying statistics, sharpening the conservative bounds obtained through off-the-shelf concentration inequalities.

## References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## nl2spec**: Interactively Translating Unstructured Natural Language to Temporal Logics with Large Language Models**

Matthias Cosler<sup>2</sup>, Christopher Hahn1(B) , Daniel Mendoza1(B) , Frederik Schmitt<sup>2</sup>, and Caroline Trippel<sup>1</sup>

<sup>1</sup> Stanford University, Stanford, CA, USA
hahn@cs.stanford.edu, {dmendo,trippel}@stanford.edu
<sup>2</sup> CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
{matthias.cosler,frederik.schmitt}@cispa.de

**Abstract.** A rigorous formalization of desired system requirements is indispensable when performing any verification task. This often limits the application of verification techniques, as writing formal specifications is an error-prone and time-consuming manual task. To facilitate this, we present nl2spec, a framework for applying Large Language Models (LLMs) to derive formal specifications (in temporal logics) from unstructured natural language. In particular, we introduce a new methodology to detect and resolve the inherent ambiguity of system requirements in natural language: we utilize LLMs to map subformulas of the formalization back to the corresponding natural language fragments of the input. Users iteratively add, delete, and edit these sub-translations to amend erroneous formalizations, which is easier than manually redrafting the entire formalization. The framework is agnostic to specific application domains and can be extended to similar specification languages and new neural models. We perform a user study to obtain a challenging dataset, which we use to run experiments on the quality of translations. We provide an open-source implementation, including a web-based frontend.

## **1 Introduction**

A rigorous formalization of desired system requirements is indispensable when performing any verification-related task, such as model checking [7], synthesis [6], or runtime verification [20]. Writing formal specifications, however, is an error-prone and time-consuming manual task typically reserved for experts in the field. This paper presents nl2spec, a framework, accompanied by a web-based tool, to facilitate and automate writing formal specifications (in LTL [34] and similar temporal logics). The core contribution is a new methodology to decompose the natural language input into *sub-translations* by utilizing Large Language Models (LLMs). The nl2spec framework provides an interface to interactively


**Fig. 1.** A screenshot of the web-interface for nl2spec.

add, edit, and delete these *sub-translations* instead of attempting to grapple with the entire formalization at once (a feature that is sorely missing in similar work, e.g., [13,30]).

Figure 1 shows the web-based frontend of nl2spec. As an example, we consider the following system requirement given in natural language: "Globally, grant 0 and grant 1 do not hold at the same time until it is allowed". The tool automatically translates the natural language specification correctly into the LTL formula G((!((g0 & g1)) U a)). Additionally, the tool generates sub-translations, such as the pair ("do not hold at the same time", !(g0 & g1)), which help in verifying the correctness of the translation.

Consider, however, the following ambiguous example: "a holds until b holds or always a holds". Human supervision is needed to resolve the ambiguity on the operator precedence. This can be easily achieved with nl2spec by adding or editing a sub-translation using explicit parenthesis (see Sect. 4 for more details and examples). To capture such (and other types of) ambiguity in a benchmark data set, we conducted an expert user study specifically asking for challenging translations of natural language sentences to LTL formulas.

The key insight in the design of nl2spec is that the process of translation can be decomposed into many sub-translations automatically via LLMs, and the decomposition into sub-translations allows users to easily resolve ambiguous natural language and erroneous translations through interactively modifying sub-translations. The central goal of nl2spec is to keep the human supervision minimal and efficient. To this end, all translations are accompanied by a confidence score. Alternative suggestions for sub-translations can be chosen via a drop-down menu and misleading sub-translations can be deleted before the next loop of the translation. We evaluate the end-to-end translation accuracy of our proposed methodology on the benchmark data set obtained from our expert user study. Note that nl2spec can be applied to the user's respective application domain to increase the quality of translation. As proof of concept, we provide additional examples, including an example for STL [31] in the GitHub repository<sup>1</sup>.

nl2spec is agnostic to machine learning models and specific application domains. We will discuss possible parameterizations and inputs of the tool in Sect. 3. We discuss our sub-translation methodology in more detail in Sect. 3.2 and introduce an interactive few-shot prompting scheme for LLMs to generate them. We evaluate the effectiveness of the tool to resolve erroneous formalizations in Sect. 4 on a data set obtained from conducting an expert user study. We discuss limitations of the framework and conclude in Sect. 5. For additional details, please refer to the complete version [8].

## **2 Background and Related Work**

#### **2.1 Natural Language to Linear-Time Temporal Logic**

Linear-time Temporal Logic (LTL) [34] is a temporal logic that forms the basis of many practical specification languages, such as the IEEE Property Specification Language (PSL) [22], Signal Temporal Logic (STL) [31], and SystemVerilog Assertions (SVA) [43]. By focusing on the prototype temporal logic LTL, we keep the nl2spec framework extendable to specification languages in specific application domains. LTL extends propositional logic with the temporal modalities U (until) and X (next). There are several derived operators, such as Fϕ ≡ *true* U ϕ and Gϕ ≡ ¬F¬ϕ. Fϕ states that ϕ will *eventually* hold in the future, and Gϕ states that ϕ holds *globally*. Operators can be nested: GFϕ, for example, states that ϕ has to occur infinitely often. LTL specifications describe a system's behavior and its interaction with an environment over time. For example, given a process 0 and a process 1 and a shared resource, the formula G(r0 → F g0) ∧ G(r1 → F g1) ∧ G¬(g0 ∧ g1) describes that whenever a process requests (r*i*) access to the shared resource, it will eventually be granted (g*i*). The subformula G¬(g0 ∧ g1) ensures that the grants given are mutually exclusive.
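The operator semantics can be illustrated with a tiny evaluator. The sketch below (our own illustration, not part of nl2spec) uses finite-trace (LTLf-style) semantics, which suffices to demonstrate U, X, F, and G; full LTL is interpreted over infinite words, which this sketch does not model.

```python
def holds(phi, trace, t=0):
    """Evaluate a formula (nested tuples) on a trace (list of sets of
    atomic propositions) at position t, under finite-trace semantics."""
    op = phi[0]
    if op == "ap":   return phi[1] in trace[t]
    if op == "not":  return not holds(phi[1], trace, t)
    if op == "and":  return holds(phi[1], trace, t) and holds(phi[2], trace, t)
    if op == "X":    return t + 1 < len(trace) and holds(phi[1], trace, t + 1)
    if op == "F":    return any(holds(phi[1], trace, k) for k in range(t, len(trace)))
    if op == "G":    return all(holds(phi[1], trace, k) for k in range(t, len(trace)))
    if op == "U":    return any(holds(phi[2], trace, k) and
                                all(holds(phi[1], trace, j) for j in range(t, k))
                                for k in range(t, len(trace)))
    raise ValueError(op)

# G(r0 -> F g0), encoded as G(not(r0 and not F g0)):
resp = ("G", ("not", ("and", ("ap", "r0"), ("not", ("F", ("ap", "g0"))))))
trace = [{"r0"}, set(), {"g0"}, set()]
print(holds(resp, trace))  # True: the request at step 0 is granted at step 2
```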

Early work in translating natural language to temporal logics focused on grammar-based approaches that could handle structured natural language [17, 24]. A survey of earlier research before the advent of deep learning is provided in [4]. Other approaches include an interactive method using SMT solving and semantic parsing [15], or structured temporal aspects in grounded robotics [45] and planning [32]. Neural networks have only recently been used to translate

<sup>1</sup> The tool is available at GitHub: https://github.com/realChrisHahn2/nl2spec.

into temporal logics, e.g., by training a model for STL from scratch [21], fine-tuning language models [19], or applying GPT-3 [13,30] in a one-shot fashion, where [13] outputs a restricted set of declare templates [33] that can be translated to a fragment of LTLf [10]. Translating natural language to LTL has been of particular interest to the robotics community (see [16] for an overview), where datasets and application domains are, in contrast to our setting, based on structured natural language. Regardless of whether they rely on structured data, all previous tools lack the detection and interactive resolution of the inherent ambiguity of natural language, which is the main contribution of our framework. Related to our approach is recent work [26], where generated code is iteratively refined to match desired outcomes based on human feedback.

#### **2.2 Large Language Models**

LLMs are large neural networks, typically consisting of up to 176 billion parameters. They are pre-trained on massive amounts of data, such as "The Pile" [14]. Examples of LLMs include the GPT [36] and BERT [11] model families, open-source models such as T5 [38] and Bloom [39], and commercial models such as Codex [5]. LLMs are Transformers [42], the state-of-the-art neural architecture for natural language processing. Additionally, Transformers have shown remarkable performance when applied to classical problems in verification (e.g., [9,18,25,40]) and reasoning (e.g., [28,50]), as well as the auto-formalization [35] of mathematics and formal specifications (e.g., [19,21,49]).

In language modelling, we model the probability of a sequence of tokens in a text [41]. The joint probability of tokens in a text is generally expressed as [39]:

$$p(x) = p(x\_1, \ldots, x\_T) = \prod\_{t=1}^{T} p(x\_t \mid x\_{<t}),$$

where $x$ is the sequence of tokens, $x\_t$ represents the $t$-th token, and $x\_{<t}$ is the sequence of tokens preceding $x\_t$. We refer to this as an autoregressive language model that iteratively predicts the probability of the next token. Neural network approaches to language modelling have superseded classical approaches, such as n-grams [41]. Transformers [42] in particular have been shown to be the most effective architecture at the time of writing [1,23,36].
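The factorization above means a sequence's joint probability is simply the product of the model's per-token conditionals, usually accumulated in log space for numerical stability. A minimal sketch of this bookkeeping (ours, independent of any particular model):

```python
import math

def sequence_log_prob(cond_probs):
    """log p(x) = sum_t log p(x_t | x_{<t}) for an autoregressive model;
    cond_probs[t] is the model's probability of the t-th observed token
    given all preceding tokens."""
    return sum(math.log(p) for p in cond_probs)

# Toy example: a three-token sequence whose per-token conditionals are
# 0.5, 0.25, and 0.5; the joint probability is their product, 1/16.
print(math.exp(sequence_log_prob([0.5, 0.25, 0.5])))
```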

While fine-tuning neural models on a specific translation task remains a valid approach, showing initial success in generalizing to unstructured natural language when translating to LTL [19], a common technique for obtaining high performance with a limited amount of labeled data is so-called "few-shot prompting" [3]. The language model is presented with a natural language description of the task, usually accompanied by a few examples that demonstrate the input-output behavior. The framework presented in this paper relies on this technique. We describe the proposed few-shot prompting scheme in detail in Sect. 3.2.

Currently implemented in the framework and used in the expert-user study are Codex and Bloom, which showed the best performance during testing.

*Codex and GPT-3.5-turbo.* Codex [5] is a GPT-3 variant that was initially up to 12B parameters in size and fine-tuned on code. The initial version of GPT-3 itself was trained on variations of Common Crawl,<sup>2</sup> Webtext-2 [37], two internet-based book corpora, and Wikipedia [3]. The fine-tuning dataset for the vanilla version of Codex was collected in May 2020 from 54 million public software repositories hosted on GitHub, using 159 GB of training data for fine-tuning. For our experiments, we used the commercial 2022 version of code-davinci-002, which is likely larger (in the 176B range<sup>3</sup>) than the vanilla Codex models. GPT-3.5-turbo is the currently available follow-up model of GPT-3.

*Bloom.* Bloom [39] is an open-source LLM family available in different sizes of up to 176B parameters trained on 46 natural languages and 13 programming languages. It was trained on the ROOTS corpus [27], a collection of 498 huggingface [29,48] datasets consisting of 1.61 terabytes of text. For our experiments, we used the 176B version running on the huggingface inference API<sup>4</sup>.

## **3 The** nl2spec **Framework**

#### **3.1 Overview**

The framework follows a standard frontend-backend implementation. Figure 2 shows an overview of the implementation of nl2spec. Parts of the framework that can be extended for further research or usage in practice are highlighted. The framework is implemented in Python 3 and flask [44], a lightweight WSGI web application framework. For the experiments in this paper, we use the OpenAI library and the huggingface transformers library [47]. We parse the LTL output formulas with a standard LTL parser [12]. The tool can either be run as a command line tool or with the web-based frontend.

The frontend handles the interaction with a human-in-the-loop. The interface is structured in three views: the "Prompt", "Sub-translations", and "Final Result" view (see Fig. 1). The tool takes a natural language sentence, optional sub-translations, the model temperature, and number of runs as input. It provides sub-translations, a confidence score, alternative sub-translations and the final formalization as output. The frontend then allows for interactively selecting, editing, deleting, or adding sub-translations. The backend implements the handling of the underlying neural models, the generation of the prompt, and the ambiguity resolving, i.e., computing the confidence score including alternative sub-translations and the interactive few-shot prompting algorithm (cf. Sect. 3.2). The framework is designed to have an easy interface to implement new models and write domain-specific prompts. The prompt is a .txt file that can be adjusted to specific domains to increase the quality of translations. To apply the sub-translation refinement methodology, however, the prompt needs to follow our interactive prompting scheme, which we introduce in the next section.

<sup>2</sup> https://commoncrawl.org/.

<sup>3</sup> https://blog.eleuther.ai/gpt3-model-sizes/.

<sup>4</sup> https://huggingface.co/inference-api.

**Fig. 2.** Overview of the nl2spec framework with a human-in-the-loop: highlighted areas indicate parts of the framework that are effortlessly extendable.

#### **3.2 Interactive Few-Shot Prompting**

The core of the methodology is the decomposition of the natural language input into sub-translations. We introduce an interactive prompting scheme that generates sub-translations using the underlying neural model and leverages the sub-translations to produce the final translation. Algorithm 1 depicts a high-level overview of the interactive loop. The main idea is to give a human-in-the-loop the option to add, edit, or delete sub-translations and feed them back into the language models as "Given translations" in the prompt (see Fig. 3). After querying a language model M with this prompt F, model-specific parameters P, and the interactive prompt that is computed in the loop, the model generates a natural language explanation, a dictionary of sub-translations, and the final translation. Notably, the model M can be queried multiple times, as specified by the number of runs r, thereby generating multiple possible sub-translations. The confidence score of each sub-translation is computed as votes over multiple queries, and by default the sub-translation with the highest confidence score is selected to be used as a given sub-translation in the next iteration. In the frontend, the user may view and select alternative generated sub-translations for each sub-translation via a drop-down menu (see Fig. 1).
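The confidence voting over multiple model queries described above can be sketched as follows (an illustrative simplification, not the tool's actual implementation; `query_model` is a hypothetical stand-in for the call to the underlying LLM):

```python
from collections import Counter

def interactive_loop(query_model, fixed_prompt, nl_input, given, runs=3):
    """One iteration of the interactive few-shot prompting loop: query the
    model `runs` times, then vote on sub-translations.  `query_model` is a
    hypothetical stub returning (sub_translations: dict, final_formula: str)
    per run; `given` holds the user-confirmed "Given translations"."""
    votes = {}          # NL fragment -> Counter over proposed formula fragments
    finals = Counter()  # votes over the final formalizations
    for _ in range(runs):
        subs, final = query_model(fixed_prompt, nl_input, given)
        finals[final] += 1
        for fragment, formula in subs.items():
            votes.setdefault(fragment, Counter())[formula] += 1
    # confidence score: fraction of runs agreeing on a fragment's translation
    ranked = {frag: [(f, n / runs) for f, n in c.most_common()]
              for frag, c in votes.items()}
    best_final, n = finals.most_common(1)[0]
    return ranked, best_final, n / runs
```

In the tool, the highest-ranked sub-translation is fed back as a given translation in the next iteration unless the user selects an alternative or edits it.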

Figure 3 shows a generic prompt that illustrates our methodology. The prompting scheme consists of three parts: the specification-language-specific part (lines 1–4), the few-shot examples (lines 5–19), and the interactive prompt

```
Translate the following natural language sentences into an LTL formula and explain your translation step by step. Remember that X means "next", U means "until", G means "globally", F means "finally", which means GF means "infinitely often". The formula should only contain atomic propositions or operators &, |, ->, <->, X, U, G, F.

Natural Language: Globally if a holds then c is true until b.
Given translations: {}
Explanation: "a holds" from the input translates to the atomic proposition a. "c is true until b" from the input translates to the subformula c U b. "if x then y" translates to an implication x -> y, so "if a holds then c is true until b" translates to an implication a -> c U b. "Globally" from the input translates to the temporal operator G.
Explanation dictionary: {"a holds" : "a", "c is true until b" : "c U b", "if a holds then c is true until b" : "a -> c U b", "Globally" : "G"}
So the final LTL translation is G a -> c U b.FINISH

Natural Language: Every request r is eventually followed by a grant g.
Given translations: {}
Explanation: "Request r" from the input translates to the atomic proposition r and "grant g" translates to the atomic proposition g. "every" means at every point in time, i.e., globally, "never" means at no point in time, and "eventually" translates to the temporal operator F. "followed by" is the natural language representation of an implication.
Explanation dictionary: {"Request r" : "r", "grant g" : "g", "every" : "G", "eventually": "F", "followed by" : "->"}
So the final LTL translation is G r -> F g.FINISH
```

**Fig. 3.** Prompt with minimal domain knowledge of LTL.

including the natural language and sub-translation inputs (not displayed, given as input). The specification-language-specific part leverages "chain-of-thought" prompt engineering to elicit reasoning from large language models [46]. The key to nl2spec, however, is the setup of the few-shot examples. This minimal prompt consists of two few-shot examples (lines 5–12 and 12–19). The end of an example is indicated by the "FINISH" token, which is the stop token for the machine learning models. A few-shot example in nl2spec consists of the natural language input (line 5), a dictionary of given translations, i.e., the sub-translations (line 5), an explanation of the translation in natural language (lines 6–10), an explanation dictionary summarizing the sub-translations, and finally, the final LTL formula.

This prompting scheme elicits sub-translations from the model, which serve as a fine-grained explanation of the formalization. Note that sub-translations provided in the prompt are neither unique nor exhaustive, but provide the context for the language model to generate the correct formalization.
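To make the scheme concrete, the interactive part of the prompt can be assembled by appending the input sentence and the serialized "Given translations" dictionary to the fixed few-shot prompt (a minimal sketch; the function name and exact serialization are our own assumptions, following the field names of Fig. 3):

```python
def assemble_prompt(fewshot_prompt: str, nl_input: str, given: dict) -> str:
    """Append the interactive part to the fixed few-shot prompt.
    `given` maps natural-language fragments to formula fragments and is
    serialized like the "Given translations" lines in the examples.
    Ending with "Explanation:" prompts the model to continue from there."""
    given_str = "{" + ", ".join(f'"{k}" : "{v}"' for k, v in given.items()) + "}"
    return (fewshot_prompt
            + f"\nNatural Language: {nl_input}"
            + f"\nGiven translations: {given_str}"
            + "\nExplanation:")
```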

## **4 Evaluation**

In this section, we evaluate our framework and prompting methodology on a data set obtained by conducting an expert user study. To show the general applicability of this framework, we use the minimal prompt that includes only minimal domain knowledge of the specification language (see Fig. 3). This prompt was intentionally written *before* conducting the expert user study. We limited the prompt to two few-shot examples and did not even provide a few-shot example that includes "given translations". We use the minimal prompt to focus the evaluation on the effectiveness of our interactive sub-translation refinement methodology in

#### **Algorithm 1:** Interactive Few-shot Prompting Algorithm


resolving ambiguity and fixing erroneous translations. In practice, one would like to replace this minimal prompt with domain-specific examples that capture the underlying distribution as closely as possible. As a proof of concept, we elaborate on this in the full version [8].

#### **4.1 Study Setup**

To obtain a benchmark dataset of *unstructured* natural language and their formalizations into LTL, we asked five experts in the field to provide examples that they thought would be challenging for a neural translation approach. Unlike existing datasets that follow a strict grammatical and syntactic structure, we posed no such restrictions on the study participants. Each natural language specification was restricted to one sentence and to five atomic propositions a, b, c, d, e. Note that nl2spec is not restricted to a specific set of atomic propositions (cf. Fig. 1). Which variable scheme to use can be specified as an initial sub-translation. We elaborate on this in the full version [8]. To ensure unique instances, the experts worked in a shared document, resulting in 36 benchmark instances. We provide three randomly drawn examples for the interested reader:


The poor performance of existing methods (cf. Table 1) exemplifies the difficulty of this data set.

#### **4.2 Results**

We evaluated our approach using the minimal prompt (if not otherwise stated), with the number of runs set to three and a temperature of 0.2.

*Quality of Initial Translation.* We analyze the quality of *initial* translations, i.e., translations obtained *before* any human interaction. This experiment demonstrates that the initial translations are of high quality, which is important to ensure an efficient workflow. We compared our approach to fine-tuning language models on structured data [19] and to an approach using GPT-3 or Rasa [2] to translate natural language into a restricted set of declare patterns [13] (which could not handle most of the instances in the benchmark data set, even when replacing the atomic propositions with their used entities). The results of evaluating the accuracy of the initial translations on our benchmark expert set are shown in Table 1.

At the time of writing, using Codex in the backend outperforms GPT-3.5 Turbo and Bloom on this task, correctly translating 44.4% of the instances using the minimal prompt. We only count an instance as correctly translated if it matches the intended meaning of the expert; no alternative translation of ambiguous input was accepted. In addition to the experiments using the minimal prompt, we conducted experiments on an augmented prompt with in-distribution examples after the user study was conducted, by randomly drawing four examples from the expert data set (three of these examples had not been solved before; see the GitHub repository or full version for more details). With this in-distribution prompt (ID), the tool translates 21 instances (with the four drawn examples remaining in the set), i.e., 58.3%, correctly.

This experiment shows 1) that the initial translation quality is high and can handle unstructured natural language better than previous approaches, and 2) that drawing the few-shot examples in distribution only slightly increased translation quality for this data set, making the key contributions of nl2spec, i.e., ambiguity detection and effortless debugging of erroneous formalizations, all the more valuable. Since nl2spec is agnostic to the underlying machine learning models, we expect an even better performance in the future with more fine-tuned models.

*Teacher-Student Experiment.* In this experiment, we generate an initial set of sub-translations with Codex as the underlying neural model. We then ran the tool with Bloom as a backend, taking these sub-translations as input. There were 11 instances that Codex could solve initially that Bloom was unable to solve. Of these instances, Bloom was able to solve 4 more, i.e., 36.4%, with sub-translations provided by Codex. The four instances that Bloom was able to solve



with the help of Codex were: "It is never the case that a and b hold at the same time.", "Whenever a is enabled, b is enabled three steps later.", "If it is the case that every a is eventually followed by a b, then c needs to holds infinitely often.", and "One of the following aps will hold at all instances: a,b,c". This demonstrates that our sub-translation methodology is a valid approach: improving the quality of the sub-translations indeed has a positive effect on the quality of the final formalization. This even holds true when using underperforming neural network models. Note that no supervision by a human was needed in this experiment to improve the formalization quality.

*Ambiguity Detection.* Out of the 36 instances in the benchmark set, at least 9 of the instances contain ambiguous natural language. We especially observed two classes of ambiguity: 1) ambiguity due to the limits of natural language, e.g., operator precedence, and 2) ambiguity in the semantics of natural language; nl2spec can help in resolving both types of ambiguity. Details for the following examples can be found in the full version [8].

An example for the first type of ambiguity from our dataset is the example mentioned in the introduction: "a holds until b holds or always a holds", which the expert translated into (a U b) | G a. Running the tool, however, translated this example into (a U (b | G(a))). By editing the sub-translation of "a holds until b holds" to (a U b) through adding explicit parentheses, the tool translates as intended. An example for the second type of ambiguity is the following instance from our data set: "Whenever a holds, b must hold in the next two steps." The intended meaning of the expert was G (a -> (b | X b)), whereas the tool translated this sentence into G((a -> X(X(b)))). After changing the sub-translation of "b must hold in the next two steps" to b|Xb, the tool translates the input as intended.

*Fixing Erroneous Translations.* With the inherent ambiguity of natural language and the unstructured nature of the input, the tool's translation cannot be expected to always be correct on the first try. Verifying and debugging sub-translations, however, is significantly easier than redrafting the complete formula from scratch. Twenty instances of the data set were not correctly translated in an initial attempt using Codex and the minimal prompt in the backend (see Table 1). We were able to extract correct translations for 15 of these instances by performing at most three translation loops (i.e., adding, editing, and removing sub-translations), with 1.86 translation loops needed on average. For example, consider the instance "whenever a holds, b holds as well", which the tool mistakenly translated to G(a & b). By fixing the sub-translation "b holds as well" to the formula fragment -> b, the sentence is translated as intended. Only the remaining five instances, which contain highly complex natural language requirements such as "once a happened, b won't happen again", needed to be translated by hand.

In total, we correctly translated 31 out of 36 instances, i.e., 86.11%, using the nl2spec sub-translation methodology, performing only 1.4 translation loops on average (see Table 1).

## **5 Conclusion**

We presented nl2spec, a framework for translating unstructured natural language to temporal logics. A limitation of this approach is its reliance on computational resources at inference time. This is a general limitation when applying deep learning techniques. Both commercial and open-source models, however, provide easily accessible APIs. Additionally, the quality of initial translations might be influenced by the amount of training data on logics, code, or math that the underlying neural models have seen during pre-training.

At the core of nl2spec lies a methodology to decompose the natural language input into sub-translations, which are mappings of formula fragments to relevant parts of the natural language input. We introduced an interactive prompting scheme that queries LLMs for sub-translations, and implemented an interface for users to interactively add, edit, and delete sub-translations, which spares users from manually redrafting the entire formalization to fix erroneous translations. We conducted a user study, showing that nl2spec can be efficiently used to interactively formalize unstructured and ambiguous natural language.

**Acknowledgements.** We thank OpenAI for providing academic access to Codex and Clark Barrett for helpful feedback on an earlier version of the tool.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **NNV 2.0: The Neural Network Verification Tool**

Diego Manzanas Lopez1(B) , Sung Woo Choi<sup>2</sup>, Hoang-Dung Tran<sup>2</sup>, and Taylor T. Johnson<sup>1</sup>

> <sup>1</sup> Vanderbilt University, Nashville, USA
> diego.manzanas.lopez@vanderbilt.edu
> <sup>2</sup> University of Nebraska, Lincoln, USA

**Abstract.** This manuscript presents the updated version of the Neural Network Verification (NNV) tool. NNV is a formal verification software tool for deep learning models and cyber-physical systems with neural network components. NNV was first introduced as a verification framework for feedforward and convolutional neural networks, as well as for neural network control systems. Since then, numerous works have made significant improvements in the verification of new deep learning models, as well as tackling some of the scalability issues that may arise when verifying complex models. In this new version of NNV, we introduce verification support for multiple deep learning models, including neural ordinary differential equations, semantic segmentation networks and recurrent neural networks, as well as a collection of reachability methods that aim to reduce the computation cost of reachability analysis of complex neural networks. We have also added direct support for standard input verification formats in the community such as VNNLIB (verification properties), and ONNX (neural networks) formats. We present a collection of experiments in which NNV verifies safety and robustness properties of feedforward, convolutional, semantic segmentation and recurrent neural networks, as well as neural ordinary differential equations and neural network control systems. Furthermore, we demonstrate the capabilities of NNV against a commercially available product in a collection of benchmarks from control systems, semantic segmentation, image classification, and time-series data.

**Keywords:** neural networks · cyber-physical systems · verification · tool

## **1 Introduction**

Deep Learning (DL) models have achieved impressive performance on a wide range of tasks, including image classification [13,24,44], natural language processing [15,25], and robotics [47]. Recently, the usage of these models has expanded into many other areas, including safety-critical domains, such as autonomous vehicles [9,10,85]. However, deep learning models are opaque systems, and it has been demonstrated that their behavior can be unpredictable when small changes are applied to their inputs (i.e., adversarial attacks) [67]. Therefore, for safety-critical applications, it is often necessary to comprehend and analyze the behavior of the whole system, including reasoning about the safety guarantees of the system. To address this challenge, many researchers have been developing techniques and tools to verify Deep Neural Networks (DNN) [4,6,22,39,40,48,55,64,65,77,83,84,86,87], as well as learning-enabled Cyber-Physical Systems (CPS) [3,8,12,23,26,34,35,38,50,51]. It is worth noting that despite the growing research interest, the verification of deep learning models still remains a challenging task, as the complexity and non-linearity of these models make them difficult to analyze. Moreover, some verification methods suffer from scalability issues, which limits the applicability of some existing techniques to large-scale and complex models. Another remaining challenge is the extension of existing or new methods for the verification of the extensive collection of layers and architectures existing in the DL area, such as Recurrent Neural Networks (RNN) [37], Semantic Segmentation Neural Networks (SSNN) [58] or Neural Ordinary Differential Equations (ODE) [11].

This work contributes to addressing the latter challenge by introducing version 2.0 of NNV<sup>1</sup> (Neural Network Verification)<sup>2</sup>, which is a software tool that supports the verification of multiple DL models as well as learning-enabled CPS, also known as Neural Network Control Systems (NNCS) [80]. NNV is a software verification tool with the ability to compute exact and over-approximate reachable sets of feedforward neural networks (FFNN) [75,77,80], Convolutional Neural Networks (CNN) [78], and NNCS [73,80]. In NNV 2.0, we add verification support for 3 main DL models: 1) RNNs [74], 2) SSNNs (encoder-decoder architectures) [79], and 3) neural ODEs [52], as well as several other improvements introduced in Sect. 3, including support for The Verification of Neural Networks Library (VNNLIB) [29] and reachability methods for MaxUnpool and Leaky ReLU layers. Once the reachability computation is completed, NNV is capable of verifying a variety of specifications such as safety or robustness, very commonly used in learning-enabled CPS and classification domains, respectively [50,55]. We demonstrate NNV's capabilities through a collection of safety and robustness verification properties, which involve the reachable set computation of feedforward, convolutional, semantic segmentation and recurrent neural networks, as well as neural ordinary differential equations and neural network control systems. Throughout these experiments, we showcase the range of the existing methods, executing up to 6 different star-based reachability methods that we compare against MATLAB's commercially available verification tool [69].

## **2 Related Work**

The area of DNN verification has grown considerably in recent years, leading to the development of standard input formats [29] as well as friendly competitions [50,55] that help compare and evaluate the recent methods and tools proposed in the community [4,6,19,22,31,39–41,48,55,59,64,65,77,83,84,

<sup>1</sup> Code available at: https://github.com/verivital/nnv/releases/tag/cav2023.

<sup>2</sup> Archival version: https://doi.org/10.24433/CO.0803700.v1.

86,87]. However, the majority of these methods focus on regression and classification tasks performed by FFNN and CNN. In addition to FFNN and CNN verification, Tran et al. [79] introduced a collection of star-based reachability analyses that also verify SSNNs. Fischer et al. [21] proposed a probabilistic method for the robustness verification of SSNNs based on randomized smoothing [14]. Since then, some of the other recent tools, including Verinet [31], α,β-Crown [84,87], and MN-BaB [20], are also able to verify image segmentation properties as demonstrated in [55]. A less explored area is the verification of RNNs. These models have unique "memory units" that enable them to store information for a period of time and learn complex patterns of time-series or sequential data. However, due to their memory units, verifying the robustness of RNNs is challenging. Recent notable state-of-the-art methodologies for verifying RNNs include unrolling the network into an FFNN and then verifying it [2], invariant inference [36,62,90], and star-based reachability [74]. Similar to RNNs, neural ODEs are also deep learning models with "memory", which makes them suitable for learning time-series data, but they are also applicable to other tasks such as continuous normalizing flows (CNF) and image classification [11,61]. However, existing work is limited to a stochastic reachability approach [27,28], reachability approaches using star and zonotope reachability methods for a general class of neural ODEs (GNODE) with continuous and discrete time layers [52], and GAINS [89], which leverages ODE-solver information to discretize the models using a computation graph that represents all possible trajectories from a given input, to accelerate their bound propagation method. However, one of the main challenges is to find a framework that is able to verify several of these models successfully.
For example, α,β-Crown was the top performer in last year's NN verification competition [55], able to verify FFNN, CNN and SSNNs, but it lacks support for neural ODEs or NNCS. There exist other tools that focus more on the verification of NNCS, such as Verisig [34,35], JuliaReach [63], ReachNN [17,33], Sherlock [16], RINO [26], VenMas [1], POLAR [32], and CORA [3,42]. However, their support is limited to NNCS with a linear or nonlinear ODE or a hybrid automaton as the plant model, and an FFNN as the controller.

Finally, for a more detailed comparison to state-of-the-art methods for the novel features of NNV 2.0, we refer to the comparison and discussion about neural ODEs in [52]. For SSNNs [79], there is a discussion on scalability and conservativeness of methods presented (approx and relax star) for the different layers that may be part of a SSNN [79]. For RNNs, the approach details and a state-of-the-art comparison can be found in [74]. We also refer the reader to two verification competitions, namely VNN-COMP [6,55] and AINNCS ARCH-COMP [38,50], for a comparison on state-of-the-art methods for neural network verification and neural network control system verification, respectively.

## **3 Overview and Features**

NNV is an object-oriented toolbox developed in MATLAB [53] and built on top of several open-source software packages, including CORA [3] for reachability analysis of nonlinear ordinary differential equations (ODE) [73] and hybrid automata, the MPT toolbox [45] for polytope-based operations [76], YALMIP [49] for some optimization problems in addition to MATLAB's Optimization Toolbox [53] and GLPK [56], and MatConvNet [82] for some convolution and pooling operations. NNV also makes use of MATLAB's deep learning toolbox to load the Open Neural Network Exchange (ONNX) format [57,68], and the Hybrid Systems Model Transformation and Translation tool (HyST) [5] for NNCS plant configuration.

NNV consists of two main modules: a *computation engine* and an *analyzer*, as illustrated in Fig. 1. The computation engine module consists of four components: 1) *NN constructor*, 2) *NNCS constructor*, 3) *reachability solvers*, and 4) *evaluator*. The NN constructor takes as input a neural network, either as a DAGNetwork, dlnetwork, SeriesNetwork (MATLAB built-in formats) [69], or as an ONNX file [57], and generates an NN object suitable for verification. The NNCS constructor takes as inputs the NN object and an ODE or Hybrid Automata (HA) file describing the dynamics of a system, and then creates an NNCS object. Depending on the task to solve, the NN (or NNCS) object is passed into the reachability solver to compute the reachable set of the system from a given set of initial conditions. Then, the computed set is sent to the analyzer module to verify/falsify a given property, and/or visualize the reachable sets. Given a specification, the verifier can formally reason whether the specification is met by computing the intersection of the defined property and the reachable sets. If an exact (sound and complete) method is used (e.g., exact-star), the analyzer can determine if the property is satisfied or unsatisfied. If an over-approximate (sound and incomplete) method is used, the verifier may also return "*uncertain*" (unknown), in addition to satisfied or unsatisfied.
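The analyzer's three-valued verdict can be illustrated with a deliberately simplified one-dimensional interval version (NNV operates on star sets, not intervals; the function and parameter names here are our own):

```python
def check_property(reach_lo, reach_hi, unsafe_lo, unsafe_hi, exact):
    """Decide a safety property for a 1-D reachable set [reach_lo, reach_hi]
    against an unsafe region [unsafe_lo, unsafe_hi].  An empty intersection
    proves safety regardless of the method; a non-empty intersection is a
    definite violation only if the computed set is exact (sound and
    complete), otherwise the verdict is 'uncertain'."""
    intersects = not (reach_hi < unsafe_lo or reach_lo > unsafe_hi)
    if not intersects:
        return "satisfied"
    return "unsatisfied" if exact else "uncertain"
```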

**Fig. 1.** An overview of NNV and its major modules and components.

#### **3.1 NNV 2.0 vs NNV**

Since the introduction of NNV [80], we have added support for the verification of a larger subset of deep learning models: reachability methods to verify SSNNs [79], a collection of relax-star reachability methods [79], and reachability techniques for Neural ODEs [52] and RNNs [74]. In addition, there have been changes that include the creation of a common NN class that encapsulates previously supported neural network classes (FFNN and CNN) as well as Neural ODEs, SSNNs, and RNNs, which significantly reduces the software complexity and simplifies the user experience. We have also added direct support for ONNX [57], as well as a parser for VNN-LIB [29], which describes properties to verify of any class of neural networks. We have also added the flexibility to use one of the many solvers supported by YALMIP [49], GLPK [56], or linprog [70]. Table 1 shows a summary of the major features of NNV, highlighting the novel features.

**Table 1.** Overview of major features available in NNV. Links refer to relevant files/classes in the NNV codebase. BN refers to batch normalization layers, FC to fully-connected layers, AvgPool to average pooling layers, Conv to convolutional layers, and MaxPool to max pooling layers.


\*ONNX was partially supported for feedforward neural networks through NNVMT. Support has been extended to other NN types without the need for external libraries.

**Semantic Segmentation** [79]**.** Semantic segmentation consists of classifying image pixels into one or more classes which are semantically interpretable, like the different objects in an image. This task is common in areas like perception for autonomous vehicles and medical imaging [71], and is typically accomplished by neural networks, referred to as semantic segmentation neural networks (SSNNs). These are characterized by two major portions: the encoder, a sequence of down-sampling layers that extract important features of the input, and the decoder, a sequence of up-sampling layers that scale the data back up and classify each pixel into its corresponding class. Thus, the verification of these models is rather challenging, due to the complexity of the layers and the dimensionality of the output space. We implement in NNV the collection of reachability methods introduced by Tran et al. [79], which are able to verify the robustness of SSNNs. This means that we can formally guarantee the robustness value for each pixel, and determine the percentage of pixels that are correctly classified despite the adversarial attack. This was demonstrated using several architectures on two datasets: MNIST and M2NIST [46]. To achieve this, additional support for transposed and dilated convolutional layers was added [79].
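Once the reachability analysis has assigned each pixel a verdict, the per-pixel robustness percentage mentioned above reduces to a simple aggregation (a hypothetical helper for illustration, not NNV code):

```python
def robust_pixel_rate(pixel_verdicts):
    """Given per-pixel verification verdicts ('robust', 'not robust', or
    'unknown') produced by a reachability method under a fixed adversarial
    perturbation, return the percentage of pixels proven robust — a
    simplified view of the per-pixel robustness statistic for SSNNs."""
    total = len(pixel_verdicts)
    robust = sum(1 for v in pixel_verdicts if v == "robust")
    return 100.0 * robust / total
```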

**Neural Ordinary Differential Equations** [52]**.** Continuous deep learning models, referred to as Neural ODEs, have received growing consideration over the last few years [11]. One of the main reasons for their popularity is their memory efficiency and their ability to learn from irregularly sampled data [61]. Similarly to SSNNs, despite their recent popularity, there is very limited work on the formal verification of these models [52]. For this reason, we implemented in NNV the first deterministic verification approach for a general class of neural ODEs (GNODE), which supports GNODEs constructed with multiple continuous layers (neural ODEs), linear or nonlinear, as well as any discrete-time layer already supported in NNV, such as ReLU, fully-connected or convolutional layers [52]. NNV demonstrates its capabilities in a series of time-series, control systems and image classification benchmarks, where it significantly outperforms the compared tools in the number of benchmarks and architectures supported [52].
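As a toy illustration of propagating a set through a continuous layer, consider a crude interval Euler scheme for a scalar neural ODE (purely illustrative: NNV 2.0 relies on star/zonotope set representations and CORA's ODE reachability, and plain Euler stepping is not by itself a sound reachability method):

```python
import math

def interval_euler_neural_ode(lo, hi, w=1.0, dt=0.01, steps=100):
    """Crude interval (Euler) propagation for the scalar neural ODE
    dx/dt = tanh(w*x) with w > 0.  Because tanh(w*x) is monotone
    increasing in x for w > 0, the interval image of the vector field
    is obtained exactly from the two endpoints at each step."""
    for _ in range(steps):
        d_lo, d_hi = math.tanh(w * lo), math.tanh(w * hi)
        lo, hi = lo + dt * d_lo, hi + dt * d_hi
    return lo, hi
```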

**Recurrent Neural Networks** [74]**.** We implement the star-based verification methods for RNNs introduced in [74]. These are able to verify RNNs without unrolling, reducing the accumulated over-approximation error through optimized relaxation in the case of approximate reachability. The star set is an efficient technique for the computation of RNN reachable sets due to its advantages in computing affine mappings, intersections with half-spaces, and Minkowski sums [74]. A new star set representing the reachable set of the current hidden state can be directly and efficiently constructed based on the reachable sets of the previous hidden state and the current input set. As in the verification of FFNNs [7,77,78], CNNs [72], and SSNNs [79], tight and efficient over-approximation reachability can be applied to the verification of ReLU RNNs. The triangular over-approximation of ReLU enables a tight over-approximation of the exact reachable set, preventing an exponential increase in the number of star sets due to splitting. The state bounds required for the over-approximation can be estimated without solving LPs. Furthermore, the relaxed approximate reachability estimates the triangle over-approximation areas to optimize the state ranges by solving LP optimizations. Consequently, the extended exact reachability method is 10× faster, and the over-approximation method is 100× to 5000× faster than existing state-of-the-art methods [74].

**Zonotope Pre-filtering Star Set Reachability** [78]**.** The star-based reachability methods are improved by using the zonotope pre-filtering approach [7,78]. This improvement consists of equipping the star set with an outer zonotope during the reachability analysis of a ReLU layer, in order to quickly estimate the lower and upper bounds of the star set at each specific neuron and establish whether splitting may occur at this neuron, without the need to solve any LP problems. The reduction of LP optimizations to solve is critical for the scalability of star-set reachability methods [77]. For the exact analysis, we are able to avoid the zonotope pre-filtering, since we can efficiently construct the new output set with one star, if the zero point is not within the set range, or with the union of 2 stars, if the zero point is contained [78]. In the over-approximation star, the range information is required to construct the output set at a specific neuron if and only if the range contains the zero point.
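The pre-filtering decision at a single ReLU neuron can be summarized as follows (a simplified sketch; in NNV the cheap bounds come from an outer zonotope attached to the star set):

```python
def relu_split_decision(lo, hi):
    """Zonotope-style pre-filtering at one ReLU neuron: given cheap outer
    estimates [lo, hi] of the neuron's pre-activation range, decide whether
    exact analysis must split the set, without solving an LP.  Only when
    the zero point lies strictly inside the range is a split (union of two
    stars in the exact method, or a triangular relaxation in the
    over-approximate method) required."""
    if lo >= 0:
        return "identity"   # ReLU acts linearly here; no split needed
    if hi <= 0:
        return "zero"       # output is the zero map; no split needed
    return "split"          # zero point inside the range: splitting may occur
```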

**Relax-Star Reachability** [79]**.** To tackle the scalability problems that may arise when computing the reachable set of complex neural networks such as SSNNs, a collection of four relaxed reachability methods was introduced [79]. The main goal of these methods is to reduce the number of Linear Programming (LP) problems to solve by quickly estimating the bounds of the reachable set, solving only a fraction of the LP problems and over-approximating the others. Which LPs to solve is determined by the chosen heuristic, which can be random, area-based, bound-based, or range-based. The number of LPs is also determined by the user, who chooses a relaxation percentage from 0% to 100%. The closer to 100%, the more LPs are skipped and over-approximated; the reachable set thus tends to be a larger over-approximation of the output, which significantly reduces the computation time [79].
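A sketch of how such a relaxation heuristic might operate (illustrative Python; the function and the exact area proxy are our assumptions, not NNV's implementation): given quickly estimated pre-activation bounds, only the unstable neurons with the largest triangle-relaxation areas receive exact LP bounds, and the fraction skipped is controlled by the user's relaxation percentage.

```python
import numpy as np

def select_lps_to_solve(est_lb, est_ub, relax_factor):
    """Pick which neurons get an exact LP bound under a relax-star scheme.

    est_lb/est_ub: quickly estimated pre-activation bounds per neuron.
    relax_factor in [0, 1]: fraction of LPs to SKIP (0.0 solves all
    unstable neurons, 1.0 skips all of them).  Area-based heuristic:
    neurons whose triangle relaxation area (~ |lb| * ub) is largest are
    solved exactly, since tightening them shrinks the result the most.
    """
    unstable = np.where((est_lb < 0) & (est_ub > 0))[0]   # split candidates
    n_solve = int(round((1.0 - relax_factor) * len(unstable)))
    areas = -est_lb[unstable] * est_ub[unstable]          # area proxy
    order = unstable[np.argsort(-areas)]                  # largest first
    return set(order[:n_solve].tolist())
```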

**Other Updates.** In addition to the features described above, the latest NNV version includes a set of further changes and additions:




## **4 Evaluation**

The evaluation is divided into 4 sections: 1) comparison on FFNNs and CNNs against MATLAB's commercial toolbox [53,69], 2) reachability analysis of neural ODEs [52], 3) robustness verification of RNNs [74], and 4) robustness verification of SSNNs [79]. All experiments were run on a desktop with the following configuration: AMD Ryzen 9 5900X @3.7GHz 12-Core Processor, 64 GB Memory, and 64-bit Microsoft Windows 10 Pro.

#### **4.1 Comparison to MATLAB's Deep Learning Verification Toolbox**

In this comparison, we use a subset of the benchmarks and properties evaluated in last year's Verification of Neural Networks Competition (VNN-COMP) [55], and demonstrate the capabilities of NNV with respect to the latest commercial product from MATLAB for the verification of neural networks [69].

We compared them on a subset of benchmarks from VNN-COMP'22 [55]: the *ACAS Xu*, *Tllverify*, *Oval21* (CIFAR10 [43]), and *RL* benchmarks. For ACAS Xu, this consists of verifying 90 out of the 145 properties, where we compare


**Table 2.** Verification of ACAS Xu properties 3 and 4.

**Table 3.** Verification results of the RL, tllverify and oval21 benchmarks. We selected 50 random specifications from the RL benchmarks, 10 from tllverify and all 30 from oval21. **-** means that the benchmark is not supported.


MATLAB's methods against the approx-star, exact (parallel, 8 cores), and 4 relax-star methods. From the other 3 benchmarks, we select a total of 90 properties to verify, for which we limit the comparison to the approx-star and MATLAB's method. In this section, we demonstrate that NNV is able to verify fully-connected layers, ReLU layers, flatten layers, and convolutional layers. The results of this comparison are described in Table 2. We can observe that MATLAB's computation time is faster than NNV's star methods, except for the relax-star with 100% relaxation. However, NNV's exact and approx methods significantly outperform MATLAB's framework, verifying 100% and 74% of the properties respectively, compared to MATLAB's 18%. The remainder of the comparison is described in Table 3, which shows a similar trend: MATLAB's computation is faster, while NNV is able to verify a larger fraction of the properties.

#### **4.2 Neural Ordinary Differential Equations**

We exhibit the reachability analysis of GNODEs with three tasks: dynamical system modeling of a Fixed Point Attractor (FPA) [52,54], image classification of MNIST [46], and an adaptive cruise control (ACC) system [73].

**Dynamical Systems.** For the FPA, we compute the reachable set for a time horizon of 10 s, given a perturbation of ±0.01 on all 5 input dimensions. The results of this example are illustrated in Fig. 2c; the computation time is 3.01 s. The FPA model consists of one nonlinear neural ODE; no discrete-time layers are part of this model [52].

**Classification.** For the MNIST benchmark, we evaluate the robustness of two GNODEs with convolutional, fully-connected, ReLU and neural ODE layers, corresponding to CNODE*<sup>S</sup>* and CNODE*<sup>M</sup>* models introduced in [52]. We verify the robustness of 5 random images under an L<sup>∞</sup> attack with a perturbation

**Fig. 2.** Verification of RNN and neural ODE results. Figure 2a shows the verification time of the 3 RNNs evaluated. Figure 2b depicts the safety verification of the ACC, and Fig. 2c shows the reachability results of the FPA benchmark.

value of ±0.5 on all the pixels. We are able to prove the robustness of both models on 100% of the images, with an average computation time of 16.3 s for CNODE*<sup>S</sup>* and 119.9 s for CNODE*<sup>M</sup>*.

**Control Systems.** We verify an NNCS of an adaptive cruise control (ACC) system, where the controller is an FFNN with 5 ReLU layers of 20 neurons each and one linear output layer, and the plant is a nonlinear neural ODE [52]. The verification results are illustrated in Fig. 2b, showing the current distance between the ego and lead cars and the allowed safety distance. We can observe that there is no intersection between the two, guaranteeing the safety of the system.

#### **4.3 Recurrent Neural Networks**

For the RNN evaluation, we evaluate three RNNs trained on the speaker recognition VCTK dataset [88]. Each network has an input layer of 40 neurons and two hidden layers with 2, 4, or 8 memory units, followed by 5 ReLU layers with 32 neurons and an output layer of 20 neurons. For each of the networks, we use the same 5 input points (40-dimensional time-independent vectors) for comparison. The robustness verification consists of proving that the output label after T ∈ {5, 10, 15, 20} steps in the sequence remains the same, given an adversarial perturbation of ε = ±0.01. We compute the reachable sets of all reachability instances using the approx-star method, which was able to prove the robustness of 19 out of 20 instances on the N<sub>2,0</sub> and N<sub>4,4</sub> networks, and 18 on the N<sub>8,0</sub> network. We show the average reachability time per T value in Fig. 2a.
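The robustness check itself reduces to comparing output bounds. A minimal sketch (illustrative Python, our naming): once the reachable set yields lower and upper bounds for every output neuron, the label is verified robust if the target output's lower bound dominates every other output's upper bound; overlapping bounds yield "unknown" under an over-approximate analysis.

```python
def label_robust(out_lb, out_ub, target):
    """Robustness verdict from output reachable-set bounds (illustrative).

    out_lb/out_ub: per-output lower/upper bounds of the reachable set.
    The predicted label `target` is verified robust if its lower bound
    exceeds the upper bound of every other output; otherwise an
    over-approximate analysis can only report 'unknown'.
    """
    others = [k for k in range(len(out_lb)) if k != target]
    if all(out_lb[target] > out_ub[k] for k in others):
        return "robust"
    return "unknown"
```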

#### **4.4 Semantic Segmentation**

We demonstrate the robustness verification of two SSNNs, one with dilated convolutional layers and one with transposed convolutional layers, in addition to average pooling, convolutional, and ReLU layers; these correspond to N<sub>4</sub> and N<sub>5</sub> introduced in Table 1 by Tran et al. [79]. We evaluate them on one random image of M2NIST [18], attacked using a UBAA brightening attack [79]. One of the main differences of this evaluation with respect to the robustness analysis of other classification tasks is the evaluation metrics used. For these networks, we evaluate the average robustness value (percentage of pixels correctly classified), sensitivity (number of non-robust pixels over number of attacked pixels), and IoU (intersection over union) of the SSNNs. The computation time for the dilated example, shown in Fig. 3, is 54.52 s, with a robustness value of 97.2%, a sensitivity of 3.04, and an IoU of 57.8%. For the equivalent example with the transposed network, the robustness value is 98.14%, the sensitivity is 2, the IoU is 72.8%, and the computation time is 7.15 s.
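The three metrics can be sketched as follows (illustrative Python; the exact conventions NNV uses, e.g., how unverified pixels enter the IoU, are assumptions on our part). Here `verified` holds the per-pixel class proven robust, with -1 marking pixels that could not be verified:

```python
import numpy as np

def seg_verification_metrics(verified, ground_truth, attacked_mask):
    """Evaluation metrics for SSNN robustness verification (illustrative).

    verified: per-pixel verified class, -1 where robustness is unproven;
    ground_truth: correct per-pixel labels;
    attacked_mask: boolean mask of pixels perturbed by the attack.
    """
    robust = (verified == ground_truth)
    robustness_value = 100.0 * robust.mean()         # % pixels robust
    sensitivity = (~robust).sum() / attacked_mask.sum()  # non-robust/attacked
    # Mean per-class IoU between the verified and ground-truth
    # segmentations, counting unverified (-1) pixels as mismatches.
    ious = []
    for c in np.unique(ground_truth):
        inter = ((verified == c) & (ground_truth == c)).sum()
        union = ((verified == c) | (ground_truth == c)).sum()
        ious.append(inter / union)
    return robustness_value, sensitivity, float(np.mean(ious))
```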

**Fig. 3.** Robustness verification of the dilated and transposed SSNN under a UBAA brightening attack to 150 random pixels in the input image.

## **5 Conclusions**

We presented version 2.0 of NNV, the updated version of the Neural Network Verification (NNV) tool [80], a software tool for the verification of deep learning models and learning-enabled CPS. To the best of our knowledge, NNV is the most comprehensive verification tool in terms of the number of tasks and neural network architectures supported, including the verification of feedforward, convolutional, semantic segmentation, and recurrent neural networks, neural ODEs, and NNCS. With the recent additions to NNV, we have demonstrated that NNV can be a one-stop verification tool for users with a diverse problem set, where verification of multiple neural network types is needed. In addition, NNV supports zonotope and polyhedron-based methods, and up to 6 different star-based reachability methods to handle verification tradeoffs, ranging from the exact-star method, which is sound and complete but computationally expensive, to the relax-star methods, which are significantly faster but more conservative. We have also shown that NNV outperforms a commercially available product from MATLAB, which computes the reachable sets of feedforward neural networks using the zonotope reachability method presented in [66]. In the future, we plan to ensure support for other deep learning models such as ResNets [30] and UNets [60].

**Acknowledgments.** The material presented in this paper is based upon work supported by the National Science Foundation (NSF) through grant numbers 1910017, 2028001, 2220418, 2220426 and 2220401, and the NSF Nebraska EPSCoR under grant OIA-2044049, the Defense Advanced Research Projects Agency (DARPA) under contract numbers FA8750-18-C-0089 and FA8750-23-C-0518, and the Air Force Office of Scientific Research (AFOSR) under contract numbers FA9550-22-1-0019 and FA9550-23-1-0135. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of AFOSR, DARPA, or NSF.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## QEBVerif: Quantization Error Bound Verification of Neural Networks

Yedi Zhang<sup>1</sup>, Fu Song1,2,3(B) , and Jun Sun<sup>4</sup>

<sup>1</sup> ShanghaiTech University, Shanghai 201210, China songfu@shanghaitech.edu.cn

<sup>2</sup> Institute of Software, Chinese Academy of Sciences and University of Chinese Academy of Sciences, Beijing 100190, China

<sup>3</sup> Automotive Software Innovation Center, Chongqing 400000, China

<sup>4</sup> Singapore Management University, Singapore 178902, Singapore

Abstract. To alleviate the practical constraints of deploying deep neural networks (DNNs) on edge devices, quantization is widely regarded as a promising technique. It reduces the resource requirements for computational power and storage space by quantizing the weights and/or activation tensors of a DNN into lower bit-width fixed-point numbers, resulting in quantized neural networks (QNNs). While quantization has been empirically shown to introduce only minor accuracy loss, critical verified properties of a DNN might become invalid once it is quantized. Existing verification methods focus on either individual neural networks (DNNs or QNNs) or the quantization error bound for *partial* quantization. In this work, we propose a quantization error bound verification method, named QEBVerif, where both weights and activation tensors are quantized. QEBVerif consists of two parts, i.e., a differential reachability analysis (DRA) and a mixed-integer linear programming (MILP) based verification method. DRA performs difference analysis between the DNN and its quantized counterpart layer-by-layer to compute a tight quantization error interval efficiently. If DRA fails to prove the error bound, we encode the verification problem into an equivalent MILP problem which can be solved by off-the-shelf solvers. Thus, QEBVerif is sound, complete, and reasonably efficient. We implement QEBVerif and conduct extensive experiments, showing its effectiveness and efficiency.

### 1 Introduction

In the past few years, the development of deep neural networks (DNNs) has grown at an impressive pace owing to their outstanding performance in solving various complicated tasks [23,28]. However, modern DNNs are often large in size and contain a great number of 32-bit floating-point parameters to achieve competitive performance. Thus, they often incur high computational costs and excessive storage requirements, hindering their deployment on resource-constrained embedded devices, e.g., edge devices. A promising solution is to quantize the weights and/or activation tensors as fixed-point numbers of lower bit-width [17,21,25,35]. For example, TensorFlow Lite [18] supports quantization of weights and/or activation tensors to reduce the model size and latency, and the Tesla FSD chip [61] stores all the data and weights of a network in the form of 8-bit integers.

In spite of the empirically impressive results showing only minor accuracy loss, quantization does not necessarily preserve properties such as robustness [16]. Even worse, input perturbations can be amplified by quantization [11,36], making quantized neural networks (QNNs) less robust than their DNN counterparts. Indeed, existing neural network quantization methods focus on minimizing the impact on model accuracy (e.g., by formulating it as an optimization problem that aims to maximize the accuracy [27,43]). However, they cannot guarantee that the final quantization error is always lower than a given error bound, especially when specific safety-critical input regions are concerned. This is worrying, as such errors may lead to catastrophes when the quantized networks are deployed in safety-critical applications [14,26]. Furthermore, analyzing (in particular, quantifying) such errors can also help us understand how quantization affects the network behaviors [33], and provide insights on, for instance, how to choose appropriate quantization bit sizes without introducing too much error. Therefore, a method that soundly quantifies the errors between DNNs and their quantized counterparts is highly desirable.

There is a large and growing body of work on developing verification methods for DNNs [2,12,13,15,19,24,29,30,32,37,38,51,54,55,58–60,62] and QNNs [1,3,16,22,46,66,68], aiming to establish formal guarantees on the network behaviors. However, all the above-mentioned methods focus exclusively on verifying individual neural networks. Recently, Paulsen et al. [48,49] proposed differential verification methods, which aim to establish formal guarantees on the difference between two DNNs. Specifically, given two DNNs $N_1$ and $N_2$ with the same network topology and inputs, they try to prove that $|N_1(\mathbf{x}) - N_2(\mathbf{x})| < \epsilon$ for all possible inputs $\mathbf{x} \in \mathcal{X}$, where $\mathcal{X}$ is the input region of interest. They presented fast and sound difference propagation techniques followed by a refinement of the input region until the property is either proved or falsified by providing a counterexample. This idea has been extended to handle recurrent neural networks (RNNs) [41], though the refinement is not considered therein. Although their methods [41,48,49] can be used to analyze the error bound introduced by quantizing weights (called *partially* QNNs), they are not complete and cannot handle the cases where both the weights and activation tensors of a DNN are quantized to lower bit-width fixed-point numbers (called *fully* QNNs). We remark that fully QNNs can significantly reduce energy consumption (floating-point operations consume much more energy than integer-only operations) [61].
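For intuition, the differential property $|N_1(\mathbf{x}) - N_2(\mathbf{x})| < \epsilon$ can be cheaply *falsified* by sampling, though never proved that way, which is exactly why sound difference-propagation methods are needed. A minimal sketch (illustrative Python, our naming):

```python
import numpy as np

def try_falsify_difference(n1, n2, x_lb, x_ub, eps, samples=1000, seed=0):
    """Random search for a counterexample to |N1(x) - N2(x)| < eps on a box.

    This can only FALSIFY the differential property (by returning a
    witness input); proving it for all x requires a sound method such as
    difference propagation with refinement.  n1, n2: callables R^n -> R^s.
    """
    rng = np.random.default_rng(seed)
    for _ in range(samples):
        x = rng.uniform(x_lb, x_ub)
        if np.max(np.abs(n1(x) - n2(x))) >= eps:
            return x   # counterexample found
    return None        # the property *may* hold; this is not a proof
```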

Main Contributions. We propose a sound and complete Quantization Error Bound Verification method (QEBVerif) to efficiently and effectively verify whether the quantization error of a *fully* QNN with respect to its original DNN, over a given input region, is always lower than an error bound (a.k.a. robust error bound [33]). QEBVerif first conducts a novel reachability analysis to quantify the quantization errors, referred to as *differential reachability analysis* (DRA). Such an analysis yields one of two results: (1) *Proved*, meaning that the quantization error is proved to be always less than the given error bound; or (2) *Unknown*, meaning that it fails to prove the error bound, possibly due to a conservative approximation of the quantization error. If the outcome is *Unknown*, we further encode the quantization error bound verification problem into an equivalent mixed-integer linear programming (MILP) problem, which can be solved by off-the-shelf solvers.

There are two main technical challenges that must be addressed for DRA. First, the activation tensors in a fully QNN take discrete values and contribute additional rounding errors to the final quantization errors; these are hard to propagate symbolically and make it difficult to establish relatively accurate difference intervals. Second, many more activation-pattern combinations (i.e., 3 × 6 = 18) have to be considered in a forward propagation, while 9 combinations are sufficient in [48,49], where an activation pattern indicates the status of the output range of a neuron. A neuron in a DNN under an input region has 3 patterns: always-active (i.e., output ≥ 0), always-inactive (i.e., output < 0), or both possible. A neuron in a QNN has 6 patterns due to the clamp function (cf. Definition 2). We remark that handling these different combinations efficiently and soundly is highly nontrivial. To tackle the above challenges, we propose sound transformations for the affine and activation functions to propagate the quantization errors of the two networks layer-by-layer. Moreover, for the affine transformation, we provide two alternative solutions: *interval-based* and *symbolic-based*. The former directly computes sound difference intervals via interval analysis [42], while the latter leverages abstract interpretation [10] to compute sound and symbolic difference intervals, using the polyhedra abstract domain. In comparison, the symbolic-based one is usually more accurate but less efficient than the interval-based one. Note that though existing tools can obtain quantization error intervals by independently computing the output intervals of the two networks followed by interval subtraction, such an approach is often too conservative.
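To make the conservatism of naive interval subtraction concrete, here is a minimal sketch (illustrative Python, not QEBVerif's implementation) comparing it with interval-based difference propagation through one affine layer of two networks that share the input but differ in weights:

```python
import numpy as np

def affine_interval(W, b, x_lb, x_ub):
    """Interval bounds of W*x + b for x in [x_lb, x_ub] (standard interval
    arithmetic: split W into its positive and negative parts)."""
    Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return Wp @ x_lb + Wn @ x_ub + b, Wp @ x_ub + Wn @ x_lb + b

def naive_diff(W1, W2, b, x_lb, x_ub):
    """Naive: bound each network's output separately, then subtract."""
    l1, u1 = affine_interval(W1, b, x_lb, x_ub)
    l2, u2 = affine_interval(W2, b, x_lb, x_ub)
    return l2 - u1, u2 - l1

def prop_diff(W1, W2, x_lb, x_ub):
    """Difference propagation: bound (W2 - W1)*x directly, preserving the
    correlation between the two networks' computations."""
    return affine_interval(W2 - W1, np.zeros(W1.shape[0]), x_lb, x_ub)
```

For example, with `W1 = [[1, 1]]`, `W2 = [[1.1, 0.9]]`, and inputs in $[0,1]^2$, the naive approach yields the difference interval $[-2, 2]$, while difference propagation yields $[-0.1, 0.1]$.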

To resolve those problems that cannot be proved via our DRA, we resort to the sound and complete MILP-based verification method. Inspired by the MILP encoding of DNN and QNN verification [39,40,68], we propose a novel MILP encoding for verifying quantization error bounds. QEBVerif represents both the computations of the QNN and the DNN in mixed-integer linear constraints which are further simplified using their own output intervals. Moreover, we also encode the output difference intervals of hidden neurons from our DRA as mixed-integer linear constraints to boost the verification.
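As a concrete illustration of such an encoding, the ReLU constraint $y = \max(x, 0)$ with pre-activation bounds $l \le x \le u$ (where $l < 0 < u$) admits the standard big-M mixed-integer formulation (a textbook encoding in the spirit of [39,40]; QEBVerif's exact constraint set may differ):

$$y \ge x, \quad y \ge 0, \quad y \le x - l(1 - a), \quad y \le u \cdot a, \quad a \in \{0, 1\},$$

where the binary variable $a$ selects the active phase: $a = 1$ forces $y = x$ and $a = 0$ forces $y = 0$. Tighter bounds $l, u$ (for instance from DRA) directly strengthen this encoding, which is why the difference intervals are fed back into the MILP.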

We implement our method as an end-to-end tool and use Gurobi [20] as our back-end MILP solver. We extensively evaluate it on a large set of verification tasks using neural networks for ACAS Xu [26] and MNIST [31], where the number of neurons varies from 310 to 4890, the number of bits for quantizing weights and activation tensors ranges from 4 to 10, and the number of bits for quantizing inputs is fixed to 8. For DRA, we compare QEBVerif with a naive method that first independently computes the output intervals of the DNN and the QNN using existing state-of-the-art (symbolic) interval analysis [22,55], and then conducts an interval subtraction. The experimental results show that both our interval- and symbolic-based approaches are much more accurate and can successfully verify many more tasks without the MILP-based verification. We also find that the quantization error interval returned by DRA gets tighter as the quantization bit size increases. The experimental results also confirm the effectiveness of our MILP-based verification method, which can help verify many tasks that cannot be solved by DRA alone. Finally, our results also allow us to study the potential correlation between quantization errors and robustness for QNNs using QEBVerif.

We summarize our contributions as follows:


The source code of our tool and benchmarks are available at https://github. com/S3L-official/QEBVerif. Missing proofs, more examples, and experimental results can be found in [65].

## 2 Preliminaries

We denote by $\mathbb{R}$, $\mathbb{Z}$, $\mathbb{N}$ and $\mathbb{B}$ the sets of real numbers, integers, natural numbers, and Boolean values, respectively. Let $[n]$ denote the integer set $\{1, \ldots, n\}$ for a given $n \in \mathbb{N}$. We use bold uppercase (e.g., $\mathbf{W}$) and bold lowercase (e.g., $\mathbf{x}$) letters to denote matrices and vectors, respectively. We denote by $\mathbf{W}_{i,j}$ the $j$-th entry in the $i$-th row of the matrix $\mathbf{W}$, and by $\mathbf{x}_i$ the $i$-th entry of the vector $\mathbf{x}$. Given a matrix $\mathbf{W}$ and a vector $\mathbf{x}$, we use $\widehat{\mathbf{W}}$ and $\hat{\mathbf{x}}$ (resp. $\widetilde{\mathbf{W}}$ and $\tilde{\mathbf{x}}$) to denote their quantized/integer (resp. fixed-point) counterparts.

### 2.1 Neural Networks

A deep neural network (DNN) consists of a sequence of layers, where the first layer is the *input layer*, the last layer is the *output layer* and the others are called *hidden layers*. Each layer contains one or more neurons. A DNN is *feed-forward* if all the neurons in each non-input layer only receive inputs from the neurons in the preceding layer.

Definition 1 (Feed-forward Deep Neural Network). *A feed-forward DNN* $\mathcal{N} : \mathbb{R}^n \to \mathbb{R}^s$ *with* $d$ *layers can be seen as a composition of* $d$ *functions such that* $\mathcal{N} = l_d \circ l_{d-1} \circ \cdots \circ l_1$*. Then, given an input* $\mathbf{x} \in \mathbb{R}^n$*, the output of the DNN* $\mathbf{y} = \mathcal{N}(\mathbf{x})$ *can be obtained by the following recursive computation:*


*where* $n_1 = n$*,* $\mathbf{W}^i$ *and* $\mathbf{b}^i$ *are the weight matrix and bias vector of the* $i$*-th layer, and* $\phi(\cdot)$ *is the activation function which acts element-wise on an input vector.*

In this work, we focus on feed-forward DNNs with the most commonly used activation function: the rectified linear unit (ReLU), defined as $\text{ReLU}(x) = \max(x, 0)$. We also use $n_d$ to denote the output dimension $s$.

A quantized neural network (QNN) is structurally similar to its real-valued counterpart, except that all the parameters, the inputs of the QNN, and the outputs of all the hidden layers are quantized into integers according to the given quantization scheme. Then, the computation over real-valued arithmetic in a DNN can be replaced by computation using integer arithmetic or, equivalently, fixed-point arithmetic. In this work, we consider the most common quantization scheme, i.e., symmetric uniform quantization [44]. We first give the concept of a quantization configuration, which effectively defines a quantization scheme.

A *quantization configuration* $\mathcal{C}$ is a tuple $\langle \tau, Q, F \rangle$, where $Q$ and $F$ are the total bit size and the fractional bit size allocated to a value, respectively, and $\tau \in \{+, \pm\}$ indicates whether the quantized value is unsigned or signed. Given a real number $x \in \mathbb{R}$ and a quantization configuration $\mathcal{C} = \langle \tau, Q, F \rangle$, its quantized integer counterpart $\hat{x}$ and fixed-point counterpart $\tilde{x}$ under the symmetric uniform quantization scheme are:

$$\hat{x} = \text{clamp}([2^F \cdot x], \mathcal{C}^{\text{lb}}, \mathcal{C}^{\text{ub}}) \text{ and } \ \tilde{x} = \hat{x}/2^F$$

where $\mathcal{C}^{\text{lb}} = 0$ and $\mathcal{C}^{\text{ub}} = 2^Q - 1$ if $\tau = +$, $\mathcal{C}^{\text{lb}} = -2^{Q-1}$ and $\mathcal{C}^{\text{ub}} = 2^{Q-1} - 1$ otherwise, and $[\cdot]$ is the round-to-nearest integer operator. The *clamping function* $\text{clamp}(x, a, b)$ with a lower bound $a$ and an upper bound $b$ is defined as:

$$\text{clamp}(x, a, b) = \begin{cases} a, & \text{if } x < a; \\ x, & \text{if } a \le x \le b; \\ b, & \text{if } x > b. \end{cases}$$
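The scheme above is easy to state operationally. A minimal sketch (illustrative Python; note that Python's `round` uses ties-to-even, which may differ from the round-to-nearest operator $[\cdot]$ exactly at ties):

```python
def quantize(x, tau, Q, F):
    """Symmetric uniform quantization under configuration C = (tau, Q, F).

    Returns (x_hat, x_tilde): the integer and fixed-point counterparts,
    following x_hat = clamp([2^F * x], C_lb, C_ub) and x_tilde = x_hat / 2^F.
    tau: '+' (unsigned) or '±' (signed); Q: total bits; F: fractional bits.
    Python's round() is ties-to-even, an approximation of [.] at ties.
    """
    if tau == '+':
        c_lb, c_ub = 0, 2**Q - 1
    else:  # signed
        c_lb, c_ub = -(2**(Q - 1)), 2**(Q - 1) - 1
    x_hat = min(max(round((2**F) * x), c_lb), c_ub)   # clamp([2^F x], ...)
    return x_hat, x_hat / 2**F
```

For instance, under $\mathcal{C}_{in} = \langle +, 4, 4 \rangle$ (used in Example 1 below), the real value 0.6 quantizes to the integer 10 and the fixed-point value 0.625.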

Definition 2 (Quantized Neural Network). *Given quantization configurations for the weights, biases, output of the input layer and each hidden layer as* $\mathcal{C}_w = \langle \tau_w, Q_w, F_w \rangle$*,* $\mathcal{C}_b = \langle \tau_b, Q_b, F_b \rangle$*,* $\mathcal{C}_{in} = \langle \tau_{in}, Q_{in}, F_{in} \rangle$*,* $\mathcal{C}_h = \langle \tau_h, Q_h, F_h \rangle$*, the quantized version (i.e., QNN) of a DNN* $\mathcal{N}$ *with* $d$ *layers is a function* $\widehat{\mathcal{N}} : \mathbb{Z}^n \to \mathbb{R}^s$ *such that* $\widehat{\mathcal{N}} = \hat{l}_d \circ \hat{l}_{d-1} \circ \cdots \circ \hat{l}_1$*. Then, given a quantized input* $\hat{\mathbf{x}} \in \mathbb{Z}^n$*, the output of the QNN* $\hat{\mathbf{y}} = \widehat{\mathcal{N}}(\hat{\mathbf{x}})$ *can be obtained by the following recursive computation:*


**Fig. 1.** A 3-layer DNN $\mathcal{N}_e$ and its quantized version $\widehat{\mathcal{N}}_e$.

*– Hidden layer* $\hat{l}_i$ ($2 \le i \le d-1$) *is the function such that, for each* $j \in [n_i]$*,*

$$\hat{\mathbf{x}}^i_j = \text{clamp}([2^{F_i} \widehat{\mathbf{W}}^i_{j,:} \cdot \hat{\mathbf{x}}^{i-1} + 2^{F_h - F_b} \hat{\mathbf{b}}^i_j], \ 0, \ \mathcal{C}_h^{\text{ub}}),$$

*where* $F_i$ *is* $F_h - F_w - F_{in}$ *if* $i = 2$*, and* $-F_w$ *otherwise;*

*– Output layer* $\hat{l}_d : \mathbb{Z}^{n_{d-1}} \to \mathbb{R}^s$ *is the function such that* $\hat{\mathbf{y}} = \hat{\mathbf{x}}^d = \hat{l}_d(\hat{\mathbf{x}}^{d-1}) = 2^{-F_w} \widehat{\mathbf{W}}^d \hat{\mathbf{x}}^{d-1} + 2^{F_h - F_b} \hat{\mathbf{b}}^d$*, where for every* $2 \le i \le d$*,* $j \in [n_i]$ *and* $k \in [n_{i-1}]$*,* $\widehat{\mathbf{W}}^i_{j,k} = \text{clamp}([2^{F_w} \mathbf{W}^i_{j,k}], \mathcal{C}_w^{\text{lb}}, \mathcal{C}_w^{\text{ub}})$ *is the quantized weight and* $\hat{\mathbf{b}}^i_j = \text{clamp}([2^{F_b} \mathbf{b}^i_j], \mathcal{C}_b^{\text{lb}}, \mathcal{C}_b^{\text{ub}})$ *is the quantized bias.*

We remark that the factors $2^{F_i}$ and $2^{F_h - F_b}$ in Definition 2 are used to align the precision between the inputs and outputs of hidden layers, and that $F_i$ differs between $i = 2$ and $i > 2$ because the quantization bit sizes for the outputs of the input layer and of the hidden layers can be different.
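A single hidden-layer step of Definition 2 can be sketched as follows (illustrative Python; we assume an unsigned $\mathcal{C}_h$, so the clamp lower bound is 0, and we reuse `numpy`'s ties-to-even rounding as an approximation of $[\cdot]$):

```python
import numpy as np

def qnn_hidden_layer(x_hat, W_hat, b_hat, F_i, F_h, F_b, Q_h):
    """One QNN hidden-layer step per Definition 2 (illustrative).

    x_hat: quantized input vector (ints); W_hat, b_hat: quantized
    weights/bias.  Computes
      clamp([2^F_i * W_hat @ x_hat + 2^(F_h - F_b) * b_hat], 0, C_h_ub)
    with C_h_ub = 2^Q_h - 1 (unsigned C_h assumed; clamping at 0 plays
    the role ReLU plays in the DNN).
    """
    c_ub = 2**Q_h - 1
    pre = (2.0**F_i) * (W_hat @ x_hat) + (2.0**(F_h - F_b)) * b_hat
    return np.clip(np.round(pre), 0, c_ub).astype(int)
```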

#### 2.2 Quantization Error Bound and Its Verification Problem

We now give the formal definition of the quantization error bound verification problem considered in this work as follows.

Definition 3 (Quantization Error Bound). *Given a DNN* $\mathcal{N} : \mathbb{R}^n \to \mathbb{R}^s$*, the corresponding QNN* $\widehat{\mathcal{N}} : \mathbb{Z}^n \to \mathbb{R}^s$*, a quantized input* $\hat{\mathbf{x}} \in \mathbb{Z}^n$*, a radius* $r \in \mathbb{N}$ *and an error bound* $\epsilon \in \mathbb{R}$*, the QNN* $\widehat{\mathcal{N}}$ *has a quantization error bound of* $\epsilon$ *w.r.t. the input region* $R(\hat{\mathbf{x}}, r) = \{\hat{\mathbf{x}}' \in \mathbb{Z}^n \mid ||\hat{\mathbf{x}}' - \hat{\mathbf{x}}||_\infty \le r\}$ *if for every* $\hat{\mathbf{x}}' \in R(\hat{\mathbf{x}}, r)$*, we have* $||2^{-F_h} \widehat{\mathcal{N}}(\hat{\mathbf{x}}') - \mathcal{N}(\mathbf{x}')||_\infty < \epsilon$*, where* $\mathbf{x}' = \hat{\mathbf{x}}'/(\mathcal{C}_{in}^{\text{ub}} - \mathcal{C}_{in}^{\text{lb}})$*.*

Intuitively, the quantization error bound limits the output difference between the DNN and its quantized counterpart over all inputs in the input region. In this work, we obtain the input for the DNN by dividing $\hat{\mathbf{x}}'$ by $(\mathcal{C}_{in}^{\text{ub}} - \mathcal{C}_{in}^{\text{lb}})$ to allow input normalization. Furthermore, $2^{-F_h}$ is used to align the precision between the outputs of the QNN and the DNN.
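Since the input region $R(\hat{\mathbf{x}}, r)$ is a finite set of integer points, the definition can, for tiny input dimensions, be checked by brute force; this is only for intuition, as the enumeration blows up exponentially and QEBVerif instead uses DRA and MILP. Illustrative Python with our naming; `denom` plays the role of $(\mathcal{C}_{in}^{\text{ub}} - \mathcal{C}_{in}^{\text{lb}})$:

```python
import numpy as np
from itertools import product

def max_quant_error(dnn, qnn, x_hat, r, F_h, denom):
    """Exhaustive maximum of ||2^-F_h * QNN(x') - DNN(x'/denom)||_inf over
    the finite integer region R(x_hat, r).

    dnn, qnn: callables on vectors; x_hat: center (ints); r: radius.
    Feasible only for very small input dimensions.
    """
    ranges = [range(v - r, v + r + 1) for v in x_hat]
    err = 0.0
    for xp in product(*ranges):
        xq = np.array(xp, dtype=float)
        diff = (2.0**-F_h) * qnn(xq) - dnn(xq / denom)
        err = max(err, float(np.max(np.abs(diff))))
    return err
```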

*Example 1.* Consider the DNN $\mathcal{N}_e$ with 3 layers (one input layer, one hidden layer, and one output layer) given in Fig. 1, where weights are associated with the edges and all the biases are 0. The quantization configurations for the weights, the output of the input layer and the hidden layer are $\mathcal{C}_w = \langle \pm, 4, 2 \rangle$, $\mathcal{C}_{in} = \langle +, 4, 4 \rangle$ and $\mathcal{C}_h = \langle +, 4, 2 \rangle$. Its QNN $\widehat{\mathcal{N}}_e$ is shown in Fig. 1.

Given a quantized input $\hat{\mathbf{x}} = (9, 6)$ and a radius $r = 1$, the input region for the QNN $\widehat{\mathcal{N}}_e$ is $R((9,6), 1) = \{(x, y) \in \mathbb{Z}^2 \mid 8 \le x \le 10, 5 \le y \le 7\}$. Since $\mathcal{C}_{in}^{\text{ub}} = 15$ and $\mathcal{C}_{in}^{\text{lb}} = 0$, by Definitions 1, 2, and 3, the maximum quantization error is $\max ||2^{-2} \widehat{\mathcal{N}}_e(\hat{\mathbf{x}}') - \mathcal{N}_e(\hat{\mathbf{x}}'/15)||_\infty = 0.067$ for $\hat{\mathbf{x}}' \in R((9,6), 1)$. Then, $\widehat{\mathcal{N}}_e$ has a quantization error bound of $\epsilon$ w.r.t. the input region $R((9,6), 1)$ for any $\epsilon > 0.067$.

We remark that if only weights are quantized and the activation tensors are floating-point numbers, the maximal quantization error of <sup>N</sup><sup>e</sup> for the input region R((9, 6), 1) is 0.04422, which implies that existing methods [48,49] cannot be used to analyze the error bound for a fully QNN.

In this work, we focus on the quantization error bound verification problem for classification tasks. Specifically, for a classification task, we only focus on the output difference of the predicted class instead of all the classes. Hence, given a DNN $\mathcal{N}$, a corresponding QNN $\widehat{\mathcal{N}}$, a quantized input $\hat{\mathbf{x}}$ which is classified to class $g$ by the DNN $\mathcal{N}$, a radius $r$ and an error bound $\epsilon$, the quantization error bound property $P(\mathcal{N}, \widehat{\mathcal{N}}, \hat{\mathbf{x}}, r, \epsilon)$ for a classification task can be defined as follows:

$$\bigwedge\_{\hat{\mathbf{x}}' \in R(\hat{\mathbf{x}},r)} \left( |2^{-F\_h}\widehat{\mathcal{N}}(\hat{\mathbf{x}}')\_g - \mathcal{N}(\mathbf{x}')\_g| < \epsilon \right) \wedge \left( \mathbf{x}' = \hat{\mathbf{x}}'/(\mathcal{C}\_{in}^{\mathrm{ub}} - \mathcal{C}\_{in}^{\mathrm{lb}}) \right)$$

Note that $\mathcal{N}(\cdot)_g$ denotes the $g$-th entry of the vector $\mathcal{N}(\cdot)$.

#### 2.3 DeepPoly

We briefly recap DeepPoly [55], which will be leveraged in this work for computing the output intervals of the neurons in a DNN.

The core idea of DeepPoly is to give each neuron an abstract domain in the form of a linear combination of the variables preceding the neuron. To achieve this, each hidden neuron $\mathbf{x}^i_j$ (the $j$-th neuron in the $i$-th layer) in a DNN is seen as two nodes $\mathbf{x}^i_{j,0}$ and $\mathbf{x}^i_{j,1}$, such that $\mathbf{x}^i_{j,0} = \sum_{k=1}^{n_{i-1}} \mathbf{W}^i_{j,k} \mathbf{x}^{i-1}_{k,1} + \mathbf{b}^i_j$ (affine function) and $\mathbf{x}^i_{j,1} = \text{ReLU}(\mathbf{x}^i_{j,0})$ (ReLU function). Then, the affine function is characterized as an abstract transformer using an upper polyhedral computation and a lower polyhedral computation in terms of the variables $\mathbf{x}^{i-1}_{k,1}$. Finally, it recursively substitutes the variables in the upper and lower polyhedral computations with the corresponding upper/lower polyhedral computations of those variables until they contain only the input variables, from which the concrete intervals are computed.

Formally, the abstract element $\mathcal{A}^i_{j,s}$ for the node $\mathbf{x}^i_{j,s}$ ($s \in \{0, 1\}$) is a tuple $\mathcal{A}^i_{j,s} = \langle \mathbf{a}^{i,\leq}_{j,s}, \mathbf{a}^{i,\geq}_{j,s}, l^i_{j,s}, u^i_{j,s} \rangle$, where $\mathbf{a}^{i,\leq}_{j,s}$ and $\mathbf{a}^{i,\geq}_{j,s}$ are respectively the lower and upper polyhedral computations in the form of a linear combination of the variables $\mathbf{x}^{i-1}_{k,1}$ if $s = 0$, or of the variables $\mathbf{x}^i_{k,0}$ if $s = 1$, and $l^i_{j,s} \in \mathbb{R}$ and $u^i_{j,s} \in \mathbb{R}$ are the concrete lower and upper bounds of the neuron. Then, the concretization of the abstract element $\mathcal{A}^i_{j,s}$ is $\Gamma(\mathcal{A}^i_{j,s}) = \{ x \in \mathbb{R} \mid \mathbf{a}^{i,\leq}_{j,s} \leq x \wedge x \leq \mathbf{a}^{i,\geq}_{j,s} \}$.

Concretely, $\mathbf{a}^{i,\leq}_{j,0}$ and $\mathbf{a}^{i,\geq}_{j,0}$ are defined as $\mathbf{a}^{i,\leq}_{j,0} = \mathbf{a}^{i,\geq}_{j,0} = \sum_{k=1}^{n_{i-1}} \mathbf{W}^i_{j,k} \mathbf{x}^{i-1}_{k,1} + \mathbf{b}^i_j$. Furthermore, we can repeatedly substitute every variable in $\mathbf{a}^{i,\leq}_{j,0}$ (resp. $\mathbf{a}^{i,\geq}_{j,0}$) with its lower (resp. upper) polyhedral computation, according to the signs of the coefficients, until no further substitution is possible. Then, we get a sound lower (resp. upper) bound in the form of a linear combination of the input variables, based on which $l^i_{j,0}$ (resp. $u^i_{j,0}$) can be computed immediately from the given input region.
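To make the concretization step tangible, here is a minimal Python sketch (not the authors' implementation; all numbers are hypothetical). Once a polyhedral computation has been back-substituted down to the input variables, each coefficient is paired with the input-interval endpoint that minimizes (resp. maximizes) the expression:

```python
# Sketch: concretize a linear expression sum_k coeffs[k]*x_k + const
# over an input box, yielding a sound lower or upper bound.

def concretize(coeffs, const, box, lower=True):
    """box is a list of (lo, hi) input intervals; lower=True gives a
    sound lower bound, lower=False the upper bound."""
    val = const
    for c, (lo, hi) in zip(coeffs, box):
        if lower:
            val += c * (lo if c >= 0 else hi)   # minimize each term
        else:
            val += c * (hi if c >= 0 else lo)   # maximize each term
    return val

box = [(0.4, 0.8), (0.2, 0.6)]                        # hypothetical inputs
l = concretize([0.5, -1.0], 0.1, box, lower=True)     # 0.5*0.4 - 1.0*0.6 + 0.1
u = concretize([0.5, -1.0], 0.1, box, lower=False)    # 0.5*0.8 - 1.0*0.2 + 0.1
```

Here the negative coefficient picks the opposite interval end, which is exactly why substitution must follow the signs of the coefficients.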

For the ReLU function $\mathbf{x}^i_{j,1} = \mathrm{ReLU}(\mathbf{x}^i_{j,0})$, there are three cases to consider for the abstract element $\mathcal{A}^i_{j,1}$:

– If $u^i_{j,0} \leq 0$, then $\mathbf{a}^{i,\leq}_{j,1} = \mathbf{a}^{i,\geq}_{j,1} = 0$ and $l^i_{j,1} = u^i_{j,1} = 0$;

– If $l^i_{j,0} \geq 0$, then $\mathbf{a}^{i,\leq}_{j,1} = \mathbf{a}^{i,\leq}_{j,0}$, $\mathbf{a}^{i,\geq}_{j,1} = \mathbf{a}^{i,\geq}_{j,0}$, $l^i_{j,1} = l^i_{j,0}$ and $u^i_{j,1} = u^i_{j,0}$;

– If $l^i_{j,0} < 0 \wedge u^i_{j,0} > 0$, then $\mathbf{a}^{i,\geq}_{j,1} = \frac{u^i_{j,0}(\mathbf{x}^i_{j,0} - l^i_{j,0})}{u^i_{j,0} - l^i_{j,0}}$ and $\mathbf{a}^{i,\leq}_{j,1} = \lambda \mathbf{x}^i_{j,0}$, where $\lambda \in \{0, 1\}$ is chosen such that the area of the shape enclosed by $\mathbf{a}^{i,\leq}_{j,1}$ and $\mathbf{a}^{i,\geq}_{j,1}$ is minimal, $l^i_{j,1} = \lambda l^i_{j,0}$ and $u^i_{j,1} = u^i_{j,0}$.
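The three cases can be written down directly; the sketch below is our paraphrase of the case split (not DeepPoly's code), returning the lower-bound coefficient $\lambda$, the upper bound as a (slope, intercept) pair over the pre-activation variable, and the concrete bounds. For the unstable case, the area comparison reduces to comparing $u^i_{j,0}$ with $-l^i_{j,0}$, since the upper relaxation line is shared:

```python
def relu_transformer(l0, u0):
    """DeepPoly-style ReLU case split for one neuron with
    pre-activation bounds [l0, u0]."""
    if u0 <= 0:                       # always inactive: output is 0
        return 0.0, (0.0, 0.0), 0.0, 0.0
    if l0 >= 0:                       # always active: identity
        return 1.0, (1.0, 0.0), l0, u0
    # unstable neuron: upper line u0*(x - l0)/(u0 - l0); lower line
    # lam*x with lam in {0, 1} chosen to minimize the relaxation area
    lam = 1.0 if u0 >= -l0 else 0.0
    slope = u0 / (u0 - l0)
    return lam, (slope, -slope * l0), lam * l0, u0

lam, upper, l1, u1 = relu_transformer(-0.4, 0.2)
# u0 < -l0, so lam = 0 and the output bounds become [0, 0.2]
```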

Note that DeepPoly also introduces abstract transformers for other functions, such as sigmoid, tanh, and maxpool. In this work, we consider only DNNs whose non-linear operators are all ReLU functions.

## 3 Methodology of QEBVerif

In this section, we first give an overview of our quantization error bound verification method, QEBVerif, and then give the detailed design of each component.

### 3.1 Overview of QEBVerif

An overview of QEBVerif is shown in Fig. 2. Given a DNN $\mathcal{N}$, its QNN $\widehat{\mathcal{N}}$, a quantization error bound $\epsilon$ and an input region consisting of a quantized input $\hat{\mathbf{x}}$ and a radius $r$, to verify the quantization error bound property $P(\mathcal{N}, \widehat{\mathcal{N}}, \hat{\mathbf{x}}, r, \epsilon)$, QEBVerif first performs a differential reachability analysis (DRA) to compute a sound output difference interval for the two networks. Note that the difference intervals of all the neurons are also recorded for later use. If the output difference interval of the two networks is contained in $[-\epsilon, \epsilon]$, then the property is proved and QEBVerif outputs "Proved". Otherwise, QEBVerif leverages our MILP-based quantization error bound verification method, encoding the problem as an equivalent mixed integer linear programming (MILP) problem which can be solved by off-the-shelf solvers. To reduce the number of mixed integer linear constraints and boost the verification, QEBVerif independently applies symbolic interval analysis to the two networks, based on which some activation patterns can be omitted. We further encode the difference intervals of all the neurons from DRA as mixed integer linear constraints and add them to the MILP problem. Though this increases the number of mixed integer linear constraints, it is very helpful for solving hard verification tasks. Therefore, the whole verification process is sound and complete, yet reasonably efficient. We remark that the MILP-based verification method is often more time-consuming, and thus the first step allows us to quickly verify many tasks.

Fig. 2. An overview of QEBVerif.

#### 3.2 Differential Reachability Analysis

Naively, one could use an existing verification tool in the literature to independently compute the output intervals for both the QNN and the DNN, and then compute their output difference directly by interval subtraction. However, such an approach would be ineffective due to the significant precision loss.

Recently, Paulsen et al. [48] proposed ReluDiff and showed that the accuracy of the output difference of two DNNs can be greatly improved by propagating difference intervals layer by layer. For each hidden layer, they first compute the output difference of the affine functions (before applying the ReLU), and then use a ReLU transformer to compute the output difference after applying the ReLU functions. The reason why ReluDiff outperforms the naive method is that it captures part of the difference before it accumulates. ReluDiff was later improved to tighten the approximated difference intervals [49]. However, as mentioned previously, these tools do not support *fully quantized* neural networks. Inspired by their work, we design a difference propagation algorithm for our setting. We use $S_{in}(\mathbf{x}^i_j)$ (resp. $S_{in}(\hat{\mathbf{x}}^i_j)$) to denote the interval of the $j$-th neuron in the $i$-th layer of the DNN (resp. QNN) before applying the ReLU function (resp. clamp function), and use $S(\mathbf{x}^i_j)$ (resp. $S(\hat{\mathbf{x}}^i_j)$) to denote the output interval after applying the ReLU function (resp. clamp function). We use $\delta^{in}_i$ (resp. $\delta_i$) to denote the difference interval for the $i$-th layer before (resp. after) applying the activation functions, and use $\delta^{in}_{i,j}$ (resp. $\delta_{i,j}$) to denote the interval for the $j$-th neuron of the $i$-th layer. We denote by $\mathrm{LB}(\cdot)$ and $\mathrm{UB}(\cdot)$ the concrete lower and upper bounds, respectively.

Based on the above notations, we give our difference propagation in Algorithm 1, which works as follows. Given a DNN $\mathcal{N}$, a QNN $\widehat{\mathcal{N}}$ and a quantized input region $R(\hat{\mathbf{x}}, r)$, we first compute the intervals $S_{in}(\mathbf{x}^i_j)$ and $S(\mathbf{x}^i_j)$ for the neurons in $\mathcal{N}$ using the *symbolic* interval analysis DeepPoly, and compute the intervals $S_{in}(\hat{\mathbf{x}}^i_j)$ and $S(\hat{\mathbf{x}}^i_j)$ for the neurons in $\widehat{\mathcal{N}}$ using a concrete interval analysis method [22]. Remark that no symbolic interval analysis for QNNs exists. By Definition 3, for each quantized input $\hat{\mathbf{x}}$ of the QNN, we obtain the input of the DNN as $\mathbf{x} = \hat{\mathbf{x}} / (\mathcal{C}^{ub}_{in} - \mathcal{C}^{lb}_{in})$. After precision alignment, we get the input difference $2^{-F_{in}} \hat{\mathbf{x}} - \mathbf{x} = (2^{-F_{in}} - 1/(\mathcal{C}^{ub}_{in} - \mathcal{C}^{lb}_{in})) \hat{\mathbf{x}}$. Hence, given an input region, we get the output difference of the input layer: $\delta_1 = (2^{-F_{in}} - 1/(\mathcal{C}^{ub}_{in} - \mathcal{C}^{lb}_{in})) S(\hat{\mathbf{x}}^1)$. Then, we compute the output difference $\delta_i$ of each hidden layer iteratively by applying the affine transformer and the activation transformer given in Algorithm 2 and Algorithm 3.
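As a sanity check, the initialization $\delta_1$ takes only a few lines; the sketch below uses the constants that appear in Example 2 later in the section ($F_{in} = 4$ and $\mathcal{C}^{ub}_{in} - \mathcal{C}^{lb}_{in} = 15$, so the scaling factor is $-\frac{1}{16 \times 15}$):

```python
# Sketch of Line 3 of Algorithm 1 for one input neuron:
# delta_1 = (2**-F_in - 1/(C_ub - C_lb)) * S(x_hat^1).

def init_diff(s_hat, f_in, c_range):
    factor = 2.0 ** -f_in - 1.0 / c_range
    lo, hi = s_hat
    a, b = factor * lo, factor * hi
    return (min(a, b), max(a, b))    # factor may be negative

delta_11 = init_diff((6, 12), 4, 15)   # approx (-0.05, -0.025)
delta_12 = init_diff((3, 9), 4, 15)    # approx (-0.0375, -0.0125)
```

These match the values $\delta_{1,1}$ and $\delta_{1,2}$ derived in Example 2.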

#### Algorithm 1: Forward Difference Propagation

Input: DNN $\mathcal{N}$, QNN $\widehat{\mathcal{N}}$, input region $R(\hat{\mathbf{x}}, r)$
Output: output difference interval $\delta$

1. Compute $S_{in}(\mathbf{x}^i_j)$ and $S(\mathbf{x}^i_j)$ for $i \in [d-1]$, $j \in [n_i]$ using DeepPoly;
2. Compute $S_{in}(\hat{\mathbf{x}}^i_j)$ and $S(\hat{\mathbf{x}}^i_j)$ for $i \in [d-1]$, $j \in [n_i]$ by applying interval analysis [22];
3. Initialize the difference: $\delta_1 = (2^{-F_{in}} - 1/(\mathcal{C}^{ub}_{in} - \mathcal{C}^{lb}_{in})) S(\hat{\mathbf{x}}^1)$;
4. **for** $i$ in $2, \ldots, d-1$ **do** // propagate in the hidden layers
5. **for** $j$ in $1, \ldots, n_i$ **do**
6. $\Delta\mathbf{b}^i_j = 2^{-F_b} \hat{\mathbf{b}}^i_j - \mathbf{b}^i_j$; $\xi = 2^{-F_h - 1}$;
7. $\delta^{in}_{i,j} = \mathrm{AffTrs}(\mathbf{W}^i_{j,:},\ 2^{-F_w} \widehat{\mathbf{W}}^i_{j,:},\ \Delta\mathbf{b}^i_j,\ S(\mathbf{x}^{i-1}),\ \delta_{i-1},\ \xi)$;
8. $\delta_{i,j} = \mathrm{ActTrs}(\delta^{in}_{i,j},\ S_{in}(\mathbf{x}^i_j),\ 2^{-F_h} S_{in}(\hat{\mathbf{x}}^i_j))$;
9. // propagate in the output layer
10. **for** $j$ in $1, \ldots, n_d$ **do**
11. $\Delta\mathbf{b}^d_j = 2^{-F_b} \hat{\mathbf{b}}^d_j - \mathbf{b}^d_j$;
12. $\delta_{d,j} = \delta^{in}_{d,j} = \mathrm{AffTrs}(\mathbf{W}^d_{j,:},\ 2^{-F_w} \widehat{\mathbf{W}}^d_{j,:},\ \Delta\mathbf{b}^d_j,\ S(\mathbf{x}^{d-1}),\ \delta_{d-1},\ 0)$;
13. **return** $(\delta_{i,j})_{2 \leq i \leq d,\ 1 \leq j \leq n_d}$;

#### Algorithm 2: AffTrs Function

Input: weight vector $\mathbf{W}^i_{j,:}$, weight vector $\widetilde{\mathbf{W}}^i_{j,:}$, bias difference $\Delta\mathbf{b}^i_j$, neuron interval $S(\mathbf{x}^{i-1})$, difference interval $\delta_{i-1}$, rounding error $\xi$
Output: difference interval $\delta^{in}_{i,j}$

1. $lb = \mathrm{LB}\big(\widetilde{\mathbf{W}}^i_{j,:} \delta_{i-1} + (\widetilde{\mathbf{W}}^i_{j,:} - \mathbf{W}^i_{j,:}) S(\mathbf{x}^{i-1}) + \Delta\mathbf{b}^i_j\big) - \xi$;
2. $ub = \mathrm{UB}\big(\widetilde{\mathbf{W}}^i_{j,:} \delta_{i-1} + (\widetilde{\mathbf{W}}^i_{j,:} - \mathbf{W}^i_{j,:}) S(\mathbf{x}^{i-1}) + \Delta\mathbf{b}^i_j\big) + \xi$;
3. **return** $[lb, ub]$;
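The AffTrs bound is plain interval arithmetic; a minimal sketch of the same computation (our illustrative code, with hypothetical numbers, not the paper's artifact):

```python
def imul(c, iv):
    """Multiply interval iv = (lo, hi) by a scalar c (possibly negative)."""
    a, b = c * iv[0], c * iv[1]
    return (min(a, b), max(a, b))

def iadd(p, q):
    """Add two intervals component-wise."""
    return (p[0] + q[0], p[1] + q[1])

def aff_trs(w, w_tilde, db, s_prev, delta_prev, xi):
    """Interval bound on W~ . delta + (W~ - W) . S(x) + db,
    widened by the rounding error xi on both sides."""
    acc = (db - xi, db + xi)
    for k in range(len(w)):
        acc = iadd(acc, imul(w_tilde[k], delta_prev[k]))      # W~ * delta
        acc = iadd(acc, imul(w_tilde[k] - w[k], s_prev[k]))   # dW * S(x)
    return acc

d = aff_trs([0.3, 0.7], [0.25, 0.75], 0.01,
            [(0.0, 1.0), (0.0, 0.5)],
            [(-0.1, 0.1), (-0.05, 0.05)], 0.125)
# d is approximately (-0.2275, 0.2225)
```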

Finally, we get the output difference for the output layer using only the affine transformer.

**Affine Transformer.** The difference before applying the activation function for the $j$-th neuron in the $i$-th layer is $\delta^{in}_{i,j} = 2^{-F_h} \lfloor 2^{F_i} \widehat{\mathbf{W}}^i_{j,:} S(\hat{\mathbf{x}}^{i-1}) + 2^{F_h - F_b} \hat{\mathbf{b}}^i_j \rceil - \mathbf{W}^i_{j,:} S(\mathbf{x}^{i-1}) - \mathbf{b}^i_j$, where $2^{-F_h}$ is used to align the precision between the outputs of the two networks (cf. Sect. 2). Then, we soundly remove the rounding operators and obtain the following constraints on the upper/lower bounds of $\delta^{in}_{i,j}$:

$$\mathrm{UB}(\delta^{in}_{i,j}) \leq \mathrm{UB}\big(2^{-F_h}(2^{F_i} \widehat{\mathbf{W}}^i_{j,:} S(\hat{\mathbf{x}}^{i-1}) + 2^{F_h - F_b} \hat{\mathbf{b}}^i_j + 0.5) - \mathbf{W}^i_{j,:} S(\mathbf{x}^{i-1}) - \mathbf{b}^i_j\big)$$

$$\mathrm{LB}(\delta^{in}_{i,j}) \geq \mathrm{LB}\big(2^{-F_h}(2^{F_i} \widehat{\mathbf{W}}^i_{j,:} S(\hat{\mathbf{x}}^{i-1}) + 2^{F_h - F_b} \hat{\mathbf{b}}^i_j - 0.5) - \mathbf{W}^i_{j,:} S(\mathbf{x}^{i-1}) - \mathbf{b}^i_j\big)$$

Hence, we have $\mathrm{UB}(\delta^{in}_{i,j}) \leq \mathrm{UB}\big(\widetilde{\mathbf{W}}^i_{j,:} S(\tilde{\mathbf{x}}^{i-1}) - \mathbf{W}^i_{j,:} S(\mathbf{x}^{i-1})\big) + \Delta\mathbf{b}^i_j + \xi$ and $\mathrm{LB}(\delta^{in}_{i,j}) \geq \mathrm{LB}\big(\widetilde{\mathbf{W}}^i_{j,:} S(\tilde{\mathbf{x}}^{i-1}) - \mathbf{W}^i_{j,:} S(\mathbf{x}^{i-1})\big) + \Delta\mathbf{b}^i_j - \xi$, which can be further reformulated as follows:

$$\mathrm{UB}(\delta^{in}_{i,j}) \leq \mathrm{UB}\big(\widetilde{\mathbf{W}}^i_{j,:} \delta_{i-1} + \Delta\mathbf{W}^i_{j,:} S(\mathbf{x}^{i-1})\big) + \Delta\mathbf{b}^i_j + \xi$$

$$\mathrm{LB}(\delta^{in}_{i,j}) \geq \mathrm{LB}\big(\widetilde{\mathbf{W}}^i_{j,:} \delta_{i-1} + \Delta\mathbf{W}^i_{j,:} S(\mathbf{x}^{i-1})\big) + \Delta\mathbf{b}^i_j - \xi$$

where $S(\tilde{\mathbf{x}}^{i-1}) = 2^{-F_{in}} S(\hat{\mathbf{x}}^{i-1})$ if $i = 2$ and $S(\tilde{\mathbf{x}}^{i-1}) = 2^{-F_h} S(\hat{\mathbf{x}}^{i-1})$ otherwise, $\widetilde{\mathbf{W}}^i_{j,:} = 2^{-F_w} \widehat{\mathbf{W}}^i_{j,:}$, $\Delta\mathbf{W}^i_{j,:} = \widetilde{\mathbf{W}}^i_{j,:} - \mathbf{W}^i_{j,:}$, $\Delta\mathbf{b}^i_j = 2^{-F_b} \hat{\mathbf{b}}^i_j - \mathbf{b}^i_j$ and $\xi = 2^{-F_h - 1}$.

#### Algorithm 3: ActTrs function

Input: difference interval $\delta^{in}_{i,j}$, neuron interval $S_{in}(\mathbf{x}^i_j)$, neuron interval $S_{in}(\tilde{\mathbf{x}}^i_j)$, clamp upper bound $t$
Output: difference interval $\delta_{i,j}$

1. **if** $\mathrm{UB}(S_{in}(\mathbf{x}^i_j)) \leq 0$ **then** $lb = \mathrm{clamp}(\mathrm{LB}(S_{in}(\tilde{\mathbf{x}}^i_j)), 0, t)$; $ub = \mathrm{clamp}(\mathrm{UB}(S_{in}(\tilde{\mathbf{x}}^i_j)), 0, t)$;
2. **else if** $\mathrm{LB}(S_{in}(\mathbf{x}^i_j)) \geq 0$ **then**
3. **if** $\mathrm{UB}(S_{in}(\tilde{\mathbf{x}}^i_j)) \leq t$ and $\mathrm{LB}(S_{in}(\tilde{\mathbf{x}}^i_j)) \geq 0$ **then** $lb = \mathrm{LB}(\delta^{in}_{i,j})$; $ub = \mathrm{UB}(\delta^{in}_{i,j})$;
4. **else if** $\mathrm{LB}(S_{in}(\tilde{\mathbf{x}}^i_j)) \geq t$ or $\mathrm{UB}(S_{in}(\tilde{\mathbf{x}}^i_j)) \leq 0$ **then**
5. $lb = \mathrm{clamp}(\mathrm{LB}(S_{in}(\tilde{\mathbf{x}}^i_j)), 0, t) - \mathrm{UB}(S_{in}(\mathbf{x}^i_j))$;
6. $ub = \mathrm{clamp}(\mathrm{UB}(S_{in}(\tilde{\mathbf{x}}^i_j)), 0, t) - \mathrm{LB}(S_{in}(\mathbf{x}^i_j))$;
7. **else if** $\mathrm{UB}(S_{in}(\tilde{\mathbf{x}}^i_j)) \leq t$ **then**
8. $lb = \max(-\mathrm{UB}(S_{in}(\mathbf{x}^i_j)), \mathrm{LB}(\delta^{in}_{i,j}))$; $ub = \max(-\mathrm{LB}(S_{in}(\mathbf{x}^i_j)), \mathrm{UB}(\delta^{in}_{i,j}))$;
9. **else if** $\mathrm{LB}(S_{in}(\tilde{\mathbf{x}}^i_j)) \geq 0$ **then**
10. $lb = \min(t - \mathrm{UB}(S_{in}(\mathbf{x}^i_j)), \mathrm{LB}(\delta^{in}_{i,j}))$; $ub = \min(t - \mathrm{LB}(S_{in}(\mathbf{x}^i_j)), \mathrm{UB}(\delta^{in}_{i,j}))$;
11. **else**
12. $lb = \max(-\mathrm{UB}(S_{in}(\mathbf{x}^i_j)), \min(t - \mathrm{UB}(S_{in}(\mathbf{x}^i_j)), \mathrm{LB}(\delta^{in}_{i,j})))$;
13. $ub = \max(-\mathrm{LB}(S_{in}(\mathbf{x}^i_j)), \min(t - \mathrm{LB}(S_{in}(\mathbf{x}^i_j)), \mathrm{UB}(\delta^{in}_{i,j})))$;
14. **else**
15. **if** $\mathrm{UB}(S_{in}(\tilde{\mathbf{x}}^i_j)) \leq t$ and $\mathrm{LB}(S_{in}(\tilde{\mathbf{x}}^i_j)) \geq 0$ **then**
16. $lb = \min(\mathrm{LB}(S_{in}(\tilde{\mathbf{x}}^i_j)), \mathrm{LB}(\delta^{in}_{i,j}))$; $ub = \min(\mathrm{UB}(S_{in}(\tilde{\mathbf{x}}^i_j)), \mathrm{UB}(\delta^{in}_{i,j}))$;
17. **else if** $\mathrm{LB}(S_{in}(\tilde{\mathbf{x}}^i_j)) \geq t$ or $\mathrm{UB}(S_{in}(\tilde{\mathbf{x}}^i_j)) \leq 0$ **then**
18. $lb = \mathrm{clamp}(\mathrm{LB}(S_{in}(\tilde{\mathbf{x}}^i_j)), 0, t) - \mathrm{UB}(S_{in}(\mathbf{x}^i_j))$; $ub = \mathrm{clamp}(\mathrm{UB}(S_{in}(\tilde{\mathbf{x}}^i_j)), 0, t)$;
19. **else if** $\mathrm{UB}(S_{in}(\tilde{\mathbf{x}}^i_j)) \leq t$ **then**
20. $lb = \max(\mathrm{LB}(\delta^{in}_{i,j}), -\mathrm{UB}(S_{in}(\mathbf{x}^i_j)))$; $ub = \min(\mathrm{UB}(\delta^{in}_{i,j}), \mathrm{UB}(S_{in}(\tilde{\mathbf{x}}^i_j)))$;
21. **if** $\mathrm{UB}(\delta^{in}_{i,j}) \leq 0$ **then** $ub = 0$;
22. **if** $\mathrm{LB}(\delta^{in}_{i,j}) \geq 0$ **then** $lb = 0$;
23. **else if** $\mathrm{LB}(S_{in}(\tilde{\mathbf{x}}^i_j)) \geq 0$ **then**
24. $lb = \min(\mathrm{LB}(\delta^{in}_{i,j}), \mathrm{LB}(S_{in}(\tilde{\mathbf{x}}^i_j)), t - \mathrm{UB}(S_{in}(\mathbf{x}^i_j)))$; $ub = \min(\mathrm{UB}(\delta^{in}_{i,j}), t)$;
25. **else**
26. $lb = \min(t - \mathrm{UB}(S_{in}(\mathbf{x}^i_j)), 0, \max(\mathrm{LB}(\delta^{in}_{i,j}), -\mathrm{UB}(S_{in}(\mathbf{x}^i_j))))$;
27. $ub = \mathrm{clamp}(\mathrm{UB}(\delta^{in}_{i,j}), 0, t)$;
28. **return** $[lb, ub] \cap \big((S_{in}(\tilde{\mathbf{x}}^i_j) \cap [0, t]) - (S_{in}(\mathbf{x}^i_j) \cap [0, +\infty))\big)$;

**Activation Transformer.** We now give our activation transformer in Algorithm 3, which computes the difference interval $\delta_{i,j}$ from the difference interval $\delta^{in}_{i,j}$. Note that the neuron interval $S_{in}(\hat{\mathbf{x}}^i_j)$ of the QNN has already been converted to its fixed-point counterpart $S_{in}(\tilde{\mathbf{x}}^i_j) = 2^{-F_h} S_{in}(\hat{\mathbf{x}}^i_j)$ when passed as an input parameter, as has the clamp upper bound ($t = 2^{-F_h} \mathcal{C}^{ub}_h$). Different from ReluDiff [48], which considers the subtraction of two ReLU functions, here we investigate the subtraction of a clamp function and a ReLU function.

### Theorem 1. *If $\tau_h = +\infty$, then Algorithm 1 is sound.*

*Example 2.* We exemplify Algorithm 1 using the networks $\mathcal{N}_e$ and $\widehat{\mathcal{N}}_e$ shown in Fig. 1. Given the quantized input region $R((9, 6), 3)$ and the corresponding real-valued input region $R((0.6, 0.4), 0.2)$, we have $S(\hat{\mathbf{x}}^1_1) = [6, 12]$ and $S(\hat{\mathbf{x}}^1_2) = [3, 9]$.

First, we get $S_{in}(\mathbf{x}^2_1) = S(\mathbf{x}^2_1) = [0.36, 0.92]$, $S_{in}(\mathbf{x}^2_2) = [-0.4, 0.2]$ and $S(\mathbf{x}^2_2) = [0, 0.2]$ based on DeepPoly, and $S_{in}(\hat{\mathbf{x}}^2_1) = S(\hat{\mathbf{x}}^2_1) = [1, 4]$, $S_{in}(\hat{\mathbf{x}}^2_2) = [-2, 1]$ and $S(\hat{\mathbf{x}}^2_2) = [0, 1]$ via interval analysis: $\mathrm{LB}(S_{in}(\hat{\mathbf{x}}^2_1)) = \lfloor (5\,\mathrm{LB}(\hat{\mathbf{x}}^1_1) - \mathrm{UB}(\hat{\mathbf{x}}^1_2))/2^4 \rceil = 1$, $\mathrm{UB}(S_{in}(\hat{\mathbf{x}}^2_1)) = \lfloor (5\,\mathrm{UB}(\hat{\mathbf{x}}^1_1) - \mathrm{LB}(\hat{\mathbf{x}}^1_2))/2^4 \rceil = 4$, $\mathrm{LB}(S_{in}(\hat{\mathbf{x}}^2_2)) = \lfloor (-3\,\mathrm{UB}(\hat{\mathbf{x}}^1_1) + 3\,\mathrm{LB}(\hat{\mathbf{x}}^1_2))/2^4 \rceil = -2$, and $\mathrm{UB}(S_{in}(\hat{\mathbf{x}}^2_2)) = \lfloor (-3\,\mathrm{LB}(\hat{\mathbf{x}}^1_1) + 3\,\mathrm{UB}(\hat{\mathbf{x}}^1_2))/2^4 \rceil = 1$. By Line 3 in Algorithm 1, we have $\delta_{1,1} = -\frac{1}{16 \times 15} S(\hat{\mathbf{x}}^1_1) = [-0.05, -0.025]$ and $\delta_{1,2} = -\frac{1}{16 \times 15} S(\hat{\mathbf{x}}^1_2) = [-0.0375, -0.0125]$.

Then, we compute the difference intervals before the activation functions. The rounding error is $\xi = 2^{-F_h - 1} = 0.125$. Based on Algorithm 2, we obtain the difference intervals $\delta^{in}_{2,1} = [-0.194375, 0.133125]$ and $\delta^{in}_{2,2} = [-0.204375, 0.123125]$.


By Lines 20∼22 in Algorithm 3, we get the difference intervals after the activation functions for the hidden layer: $\delta_{2,1} = \delta^{in}_{2,1} = [-0.194375, 0.133125]$ and $\delta_{2,2} = [\max(\mathrm{LB}(\delta^{in}_{2,2}), -\mathrm{UB}(S_{in}(\mathbf{x}^2_2))), \min(\mathrm{UB}(\delta^{in}_{2,2}), \mathrm{UB}(S_{in}(\tilde{\mathbf{x}}^2_2)))] = [-0.2, 0.123125]$.

Next, we compute the output difference interval of the networks using Algorithm 2 again, but with $\xi = 0$: $\mathrm{LB}(\delta^{in}_{3,1}) = \mathrm{LB}(\widetilde{\mathbf{W}}^2_{1,1} \delta_{2,1} + \widetilde{\mathbf{W}}^2_{1,2} \delta_{2,2} + \Delta\mathbf{W}^2_{1,1} S(\mathbf{x}^2_1) + \Delta\mathbf{W}^2_{1,2} S(\mathbf{x}^2_2)) = 0.25 \times \mathrm{LB}(\delta_{2,1}) + 0.75 \times \mathrm{LB}(\delta_{2,2}) + (0.25 - 0.3) \times \mathrm{UB}(S(\mathbf{x}^2_1)) + (0.75 - 0.7) \times \mathrm{LB}(S(\mathbf{x}^2_2))$ and $\mathrm{UB}(\delta^{in}_{3,1}) = \mathrm{UB}(\widetilde{\mathbf{W}}^2_{1,1} \delta_{2,1} + \widetilde{\mathbf{W}}^2_{1,2} \delta_{2,2} + \Delta\mathbf{W}^2_{1,1} S(\mathbf{x}^2_1) + \Delta\mathbf{W}^2_{1,2} S(\mathbf{x}^2_2)) = 0.25 \times \mathrm{UB}(\delta_{2,1}) + 0.75 \times \mathrm{UB}(\delta_{2,2}) + (0.25 - 0.3) \times \mathrm{LB}(S(\mathbf{x}^2_1)) + (0.75 - 0.7) \times \mathrm{UB}(S(\mathbf{x}^2_2))$. Finally, the quantization error interval is $[-0.24459375, 0.117625]$.
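The closing arithmetic can be replayed mechanically; the snippet below simply re-evaluates the two linear expressions with the interval endpoints quoted above:

```python
# Re-checking the final computation of Example 2 with the endpoints
# stated in the text.

lb_d21, ub_d21 = -0.194375, 0.133125   # delta_{2,1}
lb_d22, ub_d22 = -0.2, 0.123125        # delta_{2,2}
s21 = (0.36, 0.92)                     # S(x^2_1)
s22 = (0.0, 0.2)                       # S(x^2_2)

lb = (0.25 * lb_d21 + 0.75 * lb_d22
      + (0.25 - 0.3) * s21[1] + (0.75 - 0.7) * s22[0])
ub = (0.25 * ub_d21 + 0.75 * ub_d22
      + (0.25 - 0.3) * s21[0] + (0.75 - 0.7) * s22[1])
# lb is approximately -0.24459375, ub approximately 0.117625
```

Note that the negative $\Delta\mathbf{W}^2_{1,1}$ pairs the lower bound of the difference with the *upper* bound of $S(\mathbf{x}^2_1)$, mirroring the sign-directed concretization used throughout.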

### 3.3 MILP Encoding of the Verification Problem

If DRA fails to prove the property, we encode the problem as an equivalent MILP problem. Specifically, we encode both the QNN and the DNN as sets of (mixed integer) linear constraints, and the quantized input region as a set of integer linear constraints. We adopt the MILP encodings of DNNs [39] and QNNs [40] to transform the DNN and the QNN into sets of linear constraints. Similar to [39] (and unlike [40]), we use (symbolic) intervals to further reduce the number of linear constraints. We suppose that the sets of constraints encoding the QNN, the DNN, and the quantized input region are $\Theta_{\widehat{\mathcal{N}}}$, $\Theta_{\mathcal{N}}$, and $\Theta_R$, respectively. Next, we give the MILP encoding of the quantization error bound property.

Recall that, given a DNN $\mathcal{N}$ and an input region $R(\hat{\mathbf{x}}, r)$ such that $\mathbf{x}$ is classified to class $g$ by $\mathcal{N}$, a QNN $\widehat{\mathcal{N}}$ has a quantization error bound $\epsilon$ w.r.t. $R(\hat{\mathbf{x}}, r)$ if for every $\hat{\mathbf{x}}' \in R(\hat{\mathbf{x}}, r)$, we have $|2^{-F_h} \widehat{\mathcal{N}}(\hat{\mathbf{x}}')_g - \mathcal{N}(\mathbf{x}')_g| < \epsilon$. Thus, it suffices to check whether $|2^{-F_h} \widehat{\mathcal{N}}(\hat{\mathbf{x}}')_g - \mathcal{N}(\mathbf{x}')_g| \geq \epsilon$ for some $\hat{\mathbf{x}}' \in R(\hat{\mathbf{x}}, r)$.

Let $\hat{\mathbf{x}}^d_g$ (resp. $\mathbf{x}^d_g$) be the $g$-th output of $\widehat{\mathcal{N}}$ (resp. $\mathcal{N}$). We introduce a real-valued variable $\eta$ and a Boolean variable $v$ such that $\eta = \max(2^{-F_h} \hat{\mathbf{x}}^d_g - \mathbf{x}^d_g, 0)$ can be encoded by the set $\Theta_g$ of constraints with a sufficiently large number $M$: $\Theta_g = \{\eta \geq 0,\ \eta \geq 2^{-F_h} \hat{\mathbf{x}}^d_g - \mathbf{x}^d_g,\ \eta \leq M \cdot v,\ \eta \leq 2^{-F_h} \hat{\mathbf{x}}^d_g - \mathbf{x}^d_g + M \cdot (1 - v)\}$. As a result, $|2^{-F_h} \hat{\mathbf{x}}^d_g - \mathbf{x}^d_g| \geq \epsilon$ iff the set of linear constraints $\Theta = \Theta_g \cup \{2\eta - (2^{-F_h} \hat{\mathbf{x}}^d_g - \mathbf{x}^d_g) \geq \epsilon\}$ holds.
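The encoding rests on the identity $|a| = 2\max(a, 0) - a$, with $\eta = \max(a, 0)$ linearized by the big-M constraints in $\Theta_g$. A quick sanity check of the identity (no MILP solver involved; `abs_via_eta` is our illustrative name):

```python
def abs_via_eta(a):
    """|a| recovered from eta = max(a, 0), as in the final constraint
    2*eta - a >= epsilon of the MILP encoding."""
    eta = max(a, 0.0)        # the value the constraints in Theta_g pin down
    return 2.0 * eta - a     # equals |a| for both signs of a

# a >= 0: 2a - a = a;   a < 0: 0 - a = -a
for a in (-3.5, -0.1, 0.0, 0.25, 7.0):
    assert abs(abs_via_eta(a) - abs(a)) < 1e-12
```

The big-M pair $\eta \leq M v$ and $\eta \leq a + M(1 - v)$ forces $\eta = 0$ when $v = 0$ and $\eta \leq a$ when $v = 1$, so together with $\eta \geq 0$ and $\eta \geq a$ the variable $\eta$ equals $\max(a, 0)$ in any feasible assignment, provided $M$ exceeds the magnitude of $a$.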

Finally, the quantization error bound verification problem is equivalent to solving the set of constraints $\Theta_P = \Theta_{\widehat{\mathcal{N}}} \cup \Theta_{\mathcal{N}} \cup \Theta_R \cup \Theta$. Remark that the output difference intervals of hidden neurons obtained from Algorithm 1 can be encoded as linear constraints and added to $\Theta_P$ to boost the solving.

### 4 An Abstract Domain for Symbolic-Based DRA

While Algorithm 1 can compute difference intervals, its affine transformer explicitly adds a concrete rounding error interval to each neuron, which accumulates into significant precision loss over the subsequent layers. To alleviate this problem, we introduce an abstract domain based on DeepPoly that helps compute sound symbolic approximations of the lower and upper bounds of each difference interval, hence yielding tighter difference intervals.

#### 4.1 An Abstract Domain for QNNs

We first introduce transformers for affine transformations with rounding operators and for clamp functions in QNNs. Recall that the activation function in a QNN $\widehat{\mathcal{N}}$ is also a min-ReLU function: $\min(\mathrm{ReLU}(\cdot), \mathcal{C}^{ub}_h)$. Thus, we regard each hidden neuron $\hat{\mathbf{x}}^i_j$ in a QNN as three nodes $\hat{\mathbf{x}}^i_{j,0}$, $\hat{\mathbf{x}}^i_{j,1}$ and $\hat{\mathbf{x}}^i_{j,2}$, such that $\hat{\mathbf{x}}^i_{j,0} = \lfloor 2^{F_i} \sum_{k=1}^{n_{i-1}} \widehat{\mathbf{W}}^i_{j,k} \hat{\mathbf{x}}^{i-1}_{k,2} + 2^{F_h - F_b} \hat{\mathbf{b}}^i_j \rceil$ (affine function), $\hat{\mathbf{x}}^i_{j,1} = \max(\hat{\mathbf{x}}^i_{j,0}, 0)$ (ReLU function) and $\hat{\mathbf{x}}^i_{j,2} = \min(\hat{\mathbf{x}}^i_{j,1}, \mathcal{C}^{ub}_h)$ (min function). We now give the abstract domain $\widehat{\mathcal{A}}^i_{j,p} = \langle \hat{\mathbf{a}}^{i,\leq}_{j,p}, \hat{\mathbf{a}}^{i,\geq}_{j,p}, \hat{l}^i_{j,p}, \hat{u}^i_{j,p} \rangle$ for each neuron $\hat{\mathbf{x}}^i_{j,p}$ ($p \in \{0, 1, 2\}$) in a QNN as follows.

Following DeepPoly, $\hat{\mathbf{a}}^{i,\leq}_{j,0}$ and $\hat{\mathbf{a}}^{i,\geq}_{j,0}$ for the affine function of $\hat{\mathbf{x}}^i_{j,0}$ with rounding operators are defined as $\hat{\mathbf{a}}^{i,\leq}_{j,0} = 2^{F_i} \sum_{k=1}^{n_{i-1}} \widehat{\mathbf{W}}^i_{j,k} \hat{\mathbf{x}}^{i-1}_{k,2} + 2^{F_h - F_b} \hat{\mathbf{b}}^i_j - 0.5$ and $\hat{\mathbf{a}}^{i,\geq}_{j,0} = 2^{F_i} \sum_{k=1}^{n_{i-1}} \widehat{\mathbf{W}}^i_{j,k} \hat{\mathbf{x}}^{i-1}_{k,2} + 2^{F_h - F_b} \hat{\mathbf{b}}^i_j + 0.5$. We remark that the $+0.5$ and $-0.5$ here soundly encode the rounding operators and do not affect the preservation of the invariant, since the rounding operators add or subtract at most $0.5$ to round each floating-point number to its nearest integer. The abstract transformer for the ReLU function $\hat{\mathbf{x}}^i_{j,1} = \mathrm{ReLU}(\hat{\mathbf{x}}^i_{j,0})$ is defined the same as in DeepPoly.
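The soundness of the $\pm 0.5$ offsets is easy to check empirically; the sketch below (illustrative only) confirms that round-to-nearest never moves a value by more than one half:

```python
# Round-to-nearest satisfies x - 0.5 <= round(x) <= x + 0.5, which is
# exactly what the -0.5 / +0.5 offsets in the polyhedral bounds encode.

import random

random.seed(0)
for _ in range(1000):
    x = random.uniform(-100.0, 100.0)
    r = round(x)   # Python rounds halves to even; still within 0.5 of x
    assert x - 0.5 <= r <= x + 0.5
```

Note that Python's `round` uses round-half-to-even rather than round-half-away; either convention stays within the $\pm 0.5$ envelope, so the abstract bounds remain sound.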

For the min function $\hat{\mathbf{x}}^i_{j,2} = \min(\hat{\mathbf{x}}^i_{j,1}, \mathcal{C}^{ub}_h)$, there are three cases for $\widehat{\mathcal{A}}^i_{j,2}$:

– If $\hat{l}^i_{j,1} \geq \mathcal{C}^{ub}_h$, then $\hat{\mathbf{a}}^{i,\leq}_{j,2} = \hat{\mathbf{a}}^{i,\geq}_{j,2} = \mathcal{C}^{ub}_h$ and $\hat{l}^i_{j,2} = \hat{u}^i_{j,2} = \mathcal{C}^{ub}_h$;

– If $\hat{u}^i_{j,1} \leq \mathcal{C}^{ub}_h$, then $\hat{\mathbf{a}}^{i,\leq}_{j,2} = \hat{\mathbf{a}}^{i,\leq}_{j,1}$, $\hat{\mathbf{a}}^{i,\geq}_{j,2} = \hat{\mathbf{a}}^{i,\geq}_{j,1}$, $\hat{l}^i_{j,2} = \hat{l}^i_{j,1}$ and $\hat{u}^i_{j,2} = \hat{u}^i_{j,1}$;

Fig. 3. Convex approximation for the min function in QNNs, where Fig. 3(a) and Fig. 3(b) show the two ways, with $\alpha = \frac{\mathcal{C}^{ub}_h - \hat{l}^i_{j,1}}{\hat{u}^i_{j,1} - \hat{l}^i_{j,1}}$ and $\beta = \frac{\hat{u}^i_{j,1} - \mathcal{C}^{ub}_h}{\hat{u}^i_{j,1} - \hat{l}^i_{j,1}}$.

– If $\hat{l}^i_{j,1} < \mathcal{C}^{ub}_h \wedge \hat{u}^i_{j,1} > \mathcal{C}^{ub}_h$, then $\hat{\mathbf{a}}^{i,\geq}_{j,2} = \lambda \hat{\mathbf{x}}^i_{j,1} + \mu$ and $\hat{\mathbf{a}}^{i,\leq}_{j,2} = \frac{\mathcal{C}^{ub}_h - \hat{l}^i_{j,1}}{\hat{u}^i_{j,1} - \hat{l}^i_{j,1}} \hat{\mathbf{x}}^i_{j,1} + \frac{\hat{u}^i_{j,1} - \mathcal{C}^{ub}_h}{\hat{u}^i_{j,1} - \hat{l}^i_{j,1}} \hat{l}^i_{j,1}$, where $(\lambda, \mu) \in \{(0, \mathcal{C}^{ub}_h), (1, 0)\}$ is chosen such that the area of the shape enclosed by $\hat{\mathbf{a}}^{i,\leq}_{j,2}$ and $\hat{\mathbf{a}}^{i,\geq}_{j,2}$ is minimal, $\hat{l}^i_{j,2} = \hat{l}^i_{j,1}$ and $\hat{u}^i_{j,2} = \lambda \hat{u}^i_{j,1} + \mu$. We show the two ways of approximation in Fig. 3.
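Since the chord lower bound is shared by both candidates, the area comparison between the two upper lines reduces (under our reading, an assumption not spelled out in the text) to comparing the midpoint $(\hat{l} + \hat{u})/2$ with $\mathcal{C}^{ub}_h$. A small sketch of this choice, with `min_upper_choice` an illustrative name:

```python
def min_upper_choice(l, u, c):
    """Pick (lam, mu) for the upper bound lam*x + mu of min(x, c) on an
    unstable neuron with l < c < u.  The relaxation area is the integral
    of (upper line - shared chord) over [l, u], so the smaller area
    belongs to the upper line with the smaller mean value: the mean of
    x over [l, u] is (l + u)/2, the mean of the constant line is c."""
    assert l < c < u
    return (1.0, 0.0) if (l + u) / 2.0 <= c else (0.0, c)
```

For example, with bounds $[0, 2]$ the identity line is chosen when $\mathcal{C}^{ub}_h = 1.5$ and the constant line when $\mathcal{C}^{ub}_h = 0.5$.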

Theorem 2. *The min abstract transformer preserves the following invariant:* $\Gamma(\widehat{\mathcal{A}}^i_{j,2}) \subseteq [\hat{l}^i_{j,2}, \hat{u}^i_{j,2}]$.

From our abstract domain for QNNs, we obtain a symbolic interval analysis for QNNs, similar to the one for DNNs using DeepPoly, which replaces Line 2 in Algorithm 1.

#### 4.2 Symbolic Quantization Error Computation

Recall that to compute tight bounds of QNNs or DNNs via symbolic interval analysis, variables in upper and lower polyhedral computations are recursively substituted with the corresponding upper/lower polyhedral computations of variables until they only contain the input variables from which the concrete intervals are computed. This idea motivates us to design a symbolic difference computation approach for differential reachability analysis based on the abstract domain DeepPoly for DNNs and our abstract domain for QNNs.

Consider two hidden neurons $\mathbf{x}^{i}_{j,s}$ and $\hat{\mathbf{x}}^{i}_{j,s}$ from the DNN $\mathcal{N}$ and the QNN $\hat{\mathcal{N}}$. Let $\mathcal{A}^{i,*}_{j,s} = \langle \mathbf{a}^{i,\leq,*}_{j,s}, \mathbf{a}^{i,\geq,*}_{j,s}, l^{i,*}_{j,s}, u^{i,*}_{j,s} \rangle$ and $\hat{\mathcal{A}}^{i}_{j,p} = \langle \hat{\mathbf{a}}^{i,\leq,*}_{j,p}, \hat{\mathbf{a}}^{i,\geq,*}_{j,p}, \hat{l}^{i,*}_{j,p}, \hat{u}^{i,*}_{j,p} \rangle$ be their abstract elements, respectively, where all the polyhedral computations are linear combinations of the input variables of the DNN and QNN, respectively, i.e.,

$$\begin{array}{l} \mathbf{a}_{j,s}^{i,\leq,*} = \sum_{k=1}^{m} \mathbf{w}_{k}^{l,*} \mathbf{x}_{k}^{1} + \mathbf{b}_{j}^{l,*}, \quad \mathbf{a}_{j,s}^{i,\geq,*} = \sum_{k=1}^{m} \mathbf{w}_{k}^{u,*} \mathbf{x}_{k}^{1} + \mathbf{b}_{j}^{u,*}; \\ \hat{\mathbf{a}}_{j,p}^{i,\leq,*} = \sum_{k=1}^{m} \hat{\mathbf{w}}_{k}^{l,*} \hat{\mathbf{x}}_{k}^{1} + \hat{\mathbf{b}}_{j}^{l,*}, \quad \hat{\mathbf{a}}_{j,p}^{i,\geq,*} = \sum_{k=1}^{m} \hat{\mathbf{w}}_{k}^{u,*} \hat{\mathbf{x}}_{k}^{1} + \hat{\mathbf{b}}_{j}^{u,*}. \end{array}$$

Then, the sound lower bound $\Delta l^{i,*}_{j,s}$ and upper bound $\Delta u^{i,*}_{j,s}$ of the difference can be derived as follows, where $p = 2s$:


Table 1. Benchmarks for QNNs and DNNs on MNIST.

$$\begin{array}{l} \Delta l_{j,s}^{i,*} = \mathrm{LB}\,(2^{-F_h} \hat{\mathbf{x}}_{j,p}^{i} - \mathbf{x}_{j,s}^{i}) = 2^{-F_h} \hat{\mathbf{a}}_{j,p}^{i,\leq,*} - \mathbf{a}_{j,s}^{i,\geq,*}; \\ \Delta u_{j,s}^{i,*} = \mathrm{UB}\,(2^{-F_h} \hat{\mathbf{x}}_{j,p}^{i} - \mathbf{x}_{j,s}^{i}) = 2^{-F_h} \hat{\mathbf{a}}_{j,p}^{i,\geq,*} - \mathbf{a}_{j,s}^{i,\leq,*}. \end{array}$$
Given a quantized input $\hat{\mathbf{x}}$ of the QNN $\hat{\mathcal{N}}$, the input difference of the two networks is $2^{-F_{in}}\hat{\mathbf{x}} - \mathbf{x} = (2^{-F_{in}} C^{ub}_h - 1)\mathbf{x}$. Therefore, we have $\Delta^{1}_{k} = \tilde{\mathbf{x}}^{1}_{k} - \mathbf{x}^{1}_{k} = 2^{-F_{in}}\hat{\mathbf{x}}^{1}_{k} - \mathbf{x}^{1}_{k} = (2^{-F_{in}} C^{ub}_h - 1)\mathbf{x}$. Then, the lower bound of the difference can be reformulated as follows, so that it only contains the input variables of the DNN $\mathcal{N}$: $\Delta l^{i,*}_{j,s} = \Delta\mathbf{b}^{l,*}_{j} + \sum_{k=1}^{m}(-\mathbf{w}^{u,*}_{k} + 2^{-F_{in}} C^{ub}_h\, \tilde{\mathbf{w}}^{l,*}_{k})\mathbf{x}^{1}_{k}$, where $\Delta\mathbf{b}^{l,*}_{j} = 2^{-F_h}\hat{\mathbf{b}}^{l,*}_{j} - \mathbf{b}^{u,*}_{j}$, $F^{*} = F_{in} - F_h$, $\Delta^{1}_{k} = \tilde{\mathbf{x}}^{1}_{k} - \mathbf{x}^{1}_{k}$, and $\tilde{\mathbf{w}}^{l,*}_{k} = 2^{F^{*}}\hat{\mathbf{w}}^{l,*}_{k}$.

Similarly, we can reformulate the upper bound $\Delta u^{i,*}_{j,s}$ as follows, using the input variables of the DNN: $\Delta u^{i,*}_{j,s} = \Delta\mathbf{b}^{u,*}_{j} + \sum_{k=1}^{m}(-\mathbf{w}^{l,*}_{k} + 2^{-F_{in}} C^{ub}_h\, \tilde{\mathbf{w}}^{u,*}_{k})\mathbf{x}^{1}_{k}$, where $\Delta\mathbf{b}^{u,*}_{j} = 2^{-F_h}\hat{\mathbf{b}}^{u,*}_{j} - \mathbf{b}^{l,*}_{j}$, $F^{*} = F_{in} - F_h$, and $\tilde{\mathbf{w}}^{u,*}_{k} = 2^{F^{*}}\hat{\mathbf{w}}^{u,*}_{k}$.

Finally, we compute the concrete input difference interval $\delta^{in}_{i,j}$ based on the given input region as $\delta^{in}_{i,j} = [\mathrm{LB}(\Delta l^{i,*}_{j,0}), \mathrm{UB}(\Delta u^{i,*}_{j,0})]$, with which we can replace the AffTrs functions in Algorithm 1 directly. An illustrative example is given in [65].
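A hedged sketch of the two difference-bound formulas above, assuming the polyhedral forms have already been back-substituted to range over the DNN's input variables (`diff_bounds` is an illustrative helper, not QEBVerif's API):

```python
import numpy as np

def diff_bounds(a_low_q, b_low_q, a_up_q, b_up_q,
                a_low_d, b_low_d, a_up_d, b_up_d,
                x_lo, x_hi, F_h):
    """Sound bounds on 2^-F_h * x_hat - x from symbolic affine bounds.
    Each (a, b) pair encodes a linear form  a @ x + b  over the DNN inputs;
    hatted forms belong to the QNN, the others to the DNN."""
    s = 2.0 ** (-F_h)
    # lower bound: scale the QNN's lower form, subtract the DNN's upper form
    cl, bl = s * a_low_q - a_up_d, s * b_low_q - b_up_d
    # upper bound: scale the QNN's upper form, subtract the DNN's lower form
    cu, bu = s * a_up_q - a_low_d, s * b_up_q - b_low_d
    lo = np.where(cl >= 0, x_lo, x_hi) @ cl + bl   # minimize over the box
    hi = np.where(cu >= 0, x_hi, x_lo) @ cu + bu   # maximize over the box
    return float(lo), float(hi)
```

When both networks carry identical forms and $F_h = 0$, the computed difference interval collapses to $[0, 0]$, as expected.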

#### 5 Evaluation

We have implemented our method QEBVerif as an end-to-end tool written in Python, using Gurobi [20] as the back-end MILP solver. All floating-point numbers used in our tool are 32-bit. Experiments were conducted on a 96-core machine with an Intel(R) Xeon(R) Gold 6342 2.80 GHz CPU and 1 TB of main memory. We allow Gurobi to use up to 24 threads. The time limit for each verification task is 1 h.

Benchmarks. We first build 45 × 4 QNNs from the 45 DNNs of ACAS Xu [26], following a *post-training quantization scheme* [44] and using quantization configurations $C_{in} = \langle\pm, 8, 8\rangle$, $C_w = C_b = \langle\pm, Q, Q-2\rangle$, and $C_h = \langle +, Q, Q-2\rangle$, where $Q \in \{4, 6, 8, 10\}$. We then train 5 DNNs with different architectures on the MNIST dataset [31] and build 5 × 4 QNNs following the same quantization scheme and configurations, except that we set $C_{in} = \langle +, 8, 8\rangle$ and $C_w = \langle\pm, Q, Q-1\rangle$ for each DNN trained on MNIST. Details on the networks trained on the MNIST dataset are presented in Table 1. Column 1 gives the name and architecture of each DNN, where Ablk_B means that the network has A hidden layers, each of size B neurons; Column 2 gives the number of parameters in each DNN; and Columns 3–7 list the accuracy of these networks. Hereafter, we denote by Px-y (resp. Ax-y) the QNN using the architecture Px (resp. the x-th DNN) and quantization bit size Q = y for MNIST (resp. ACAS Xu), and by Px-Full (resp. Ax-Full) the DNN of architecture Px for MNIST (resp. the x-th DNN in ACAS Xu).
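For intuition on what a configuration $\langle \text{sign}, Q, F\rangle$ does, the following sketch shows a common fixed-point post-training quantization recipe (round to $F$ fractional bits, clamp to the $Q$-bit range); the paper's exact scheme follows [44], and the signed-range convention below is an assumption:

```python
import numpy as np

def quantize(x, signed, Q, F):
    """Round x onto a fixed-point grid with Q total bits and F fractional
    bits, then clamp to the representable range. Illustrative only; the
    exact post-training scheme used in the paper is that of [44]."""
    lo = -(2 ** (Q - 1)) if signed else 0          # smallest integer code
    hi = 2 ** (Q - 1) - 1 if signed else 2 ** Q - 1  # largest integer code
    q = np.clip(np.round(x * 2 ** F), lo, hi)      # integer code
    return q / 2 ** F                              # dequantized value

# e.g. weights under C_w = <±, Q, Q-2> with Q = 4:
print(quantize(np.array([0.3, 1.9, -2.5]), True, 4, 2))
```

Here 0.3 snaps to the nearest grid point 0.25, while -2.5 is clamped to the grid's lower limit.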

#### 5.1 Effectiveness and Efficiency of DRA

We first implement a naive method using existing state-of-the-art reachability analysis methods for QNNs and DNNs. Specifically, we use the symbolic interval analysis of DeepPoly [55] to compute the output intervals for a DNN, and the interval analysis of [22] to compute the output intervals for a QNN. Then, we compute quantization error intervals via interval subtraction. Note that no existing methods can directly verify quantization error bounds, and the methods in [48,49] are not applicable. Finally, we compare the quantization error intervals computed by the naive method against DRA in QEBVerif, using DNNs Ax-Full, Py-Full and QNNs Ax-z, Py-z for $x = 1$, $y \in \{1, 2, 3, 4, 5\}$ and $z \in \{4, 6, 8, 10\}$. We use the same adversarial input regions as in [29] for ACAS Xu (5 input points with radius $r \in \{3, 6, 13, 19, 26\}$ for each point), and set the quantization error bound $\epsilon \in \{0.05, 0.1, 0.2, 0.3, 0.4\}$, resulting in 25 tasks for each radius. For MNIST, we randomly select 30 input samples from the MNIST test set, and set radius $r = 3$ for each input sample and quantization error bound $\epsilon \in \{1, 2, 4, 6, 8\}$, resulting in a total of 150 tasks for each pair of DNN and QNN of the same architecture.

Table 2 reports the analysis results for ACAS Xu (above) and MNIST (below). Column 2 lists the different analysis methods, where QEBVerif (Int) is Algorithm 1 and QEBVerif (Sym) uses a symbolic method for the affine transformation in Algorithm 1 (cf. Sect. 4.2). Columns (H_Diff) (resp. (O_Diff)) give the average sum of the ranges of the difference intervals of all hidden neurons (resp. output neurons of the predicted class) over the 25 verification tasks for ACAS Xu and the 150 verification tasks for MNIST. Columns (#S/T) list the number of tasks (#S) successfully proved by DRA and the average computation time (T) in seconds, respectively, where the best results (i.e., solving the most tasks) are highlighted in blue. Note that Table 2 only reports the number of true propositions proved by DRA; the exact number of true propositions is unknown.

Unsurprisingly, QEBVerif (Sym) is less efficient than the others, but still within the same order of magnitude. However, QEBVerif (Sym) solves the most tasks for both ACAS Xu and MNIST, and produces the most accurate difference intervals for both hidden and output neurons on almost all MNIST tasks, except for P1-8 and P1-10, where QEBVerif (Int) yields tighter intervals for the output neurons. We also find that QEBVerif (Sym) may perform worse than the naive method when the quantization bit size is small for ACAS Xu. This is because: (1) the rounding error added into


Table 2. Differential Reachability Analysis on ACAS Xu and MNIST.

the abstract domain of the affine function in each hidden layer of the QNN is large due to the small bit size, and (2) such errors can accumulate and magnify layer by layer, in contrast to the naive approach, where we directly apply interval subtraction. We remark that symbolic reachability analysis methods for DNNs become less accurate as the network gets deeper and the input region gets larger. This means that for a large input region, the output intervals of hidden/output neurons computed by symbolic interval analysis for DNNs can be very large. However, the output intervals of their quantized counterparts are always bounded by the quantization grid limit, i.e., $[0, \frac{2^{Q}-1}{2^{Q-2}}]$. Hence, the difference intervals reported in Table 2 can be very conservative for large input regions and deeper networks.
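For intuition on this grid limit: with $Q$ total bits, $Q-2$ of them fractional, the top code $2^Q - 1$ maps to $(2^Q - 1)/2^{Q-2}$, which stays just below 4 for every bit size used in our benchmarks. A small illustrative check:

```python
# Upper end of the unsigned fixed-point grid C_h = <+, Q, Q-2>:
# the largest integer code 2^Q - 1, scaled by 2^-(Q-2), approaches 4
# from below as Q grows.
for Q in (4, 6, 8, 10):
    print(Q, (2 ** Q - 1) / 2 ** (Q - 2))
```

So no matter how large the DNN's symbolic output intervals grow, the QNN side of the difference is capped near 4.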

#### 5.2 Effectiveness and Efficiency of QEBVerif

We evaluate QEBVerif on QNNs Ax-z, Py-z for $x = 1$, $y \in \{1, 2, 3, 4\}$ and $z \in \{4, 6, 8, 10\}$, as well as the corresponding DNNs. We use the same input regions and error bounds as in Sect. 5.1, except that we consider $r \in \{3, 6, 13\}$ for each input point for ACAS Xu. Note that we omit the other two radii for ACAS Xu and use medium-sized QNNs for MNIST as the evaluation benchmarks of this experiment for the sake of time and computing resources.

Figure 4 shows the verification results of QEBVerif within the 1 h limit per task, giving the number of successfully verified tasks for three methods. Note that only the number of successfully proved tasks is given in Fig. 4 for DRA due to its incompleteness. The blue bars show the results using only the symbolic

Fig. 4. Verification Results of QEBVerif on ACAS Xu and MNIST.

differential reachability analysis, i.e., QEBVerif (Sym). The yellow bars give the results by a full verification process in QEBVerif as shown in Fig. 2, i.e., we first use DRA and then use MILP solving if DRA fails. The red bars are similar to the yellow ones except that linear constraints of the difference intervals of hidden neurons got from DRA are added into the MILP encoding.

Overall, although DRA alone successfully proved most of the tasks (60.19%), our MILP-based verification method helps to verify many tasks on which DRA fails: 85.67% of the tasks are solved with DRA+MILP and 88.59% with DRA+MILP+Diff. Interestingly, we find that the effectiveness of the added linear constraints on the difference intervals varies across tasks in terms of MILP solving efficiency. Our conjecture is that the Gurobi solving algorithm employs heuristics for which the additional constraints may not always be helpful. However, those difference linear constraints allow the MILP-based verification method to verify more tasks, i.e., 79 more tasks in total.

#### 5.3 Correlation of Quantization Errors and Robustness

We use QEBVerif to verify a set of properties $\Psi = \{P(\mathcal{N}, \hat{\mathcal{N}}, \hat{\mathbf{x}}, r, \epsilon)\}$, where $\mathcal{N}$ = P1-Full, $\hat{\mathcal{N}} \in \{$P1-4, P1-8$\}$, $\hat{\mathbf{x}} \in X$ and $X$ is the set of the 30 samples from MNIST as above, $r \in \{3, 5, 7\}$ and $\epsilon \in \Omega = \{0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0\}$. We solve all the above tasks and process the results to obtain the tightest range of quantization error bounds $[a, b]$ for each input region such that $a, b \in \Omega$. This allows us to obtain intervals that are tighter than those obtained via DRA. Finally, we implemented a robustness verifier for QNNs, similar to [40], to check the robustness of P1-4 and P1-8 w.r.t. the input regions given in $\Psi$.

Figure 5 gives the experimental results. The blue (resp. yellow) bars in Figs. 5(a) and 5(e) show the number of robust (resp. non-robust) samples among the 30 verification tasks, and blue bars in the other 6 figures demonstrate the

Fig. 5. Distribution of (non-)robust samples and quantization errors under radii r ∈ {3, 5, 7} and quantization bits Q.

quantization error interval for each input region. Comparing the results of P1-8 and P1-4, we observe that P1-8 is more robust than P1-4 w.r.t. the 90 input regions, and its quantization errors are also generally much smaller than those of P1-4. Furthermore, we find that P1-8 remains consistently robust as the radius increases, and its quantization error interval changes very little. However, P1-4 becomes increasingly less robust as the radius increases, and its quantization error also increases significantly. Thus, we speculate that there may be some correlation between network robustness and quantization error in QNNs: as the quantization bit size decreases, the quantization error increases and the QNN becomes less robust. The reason we suspect "the fewer bits, the less robust" is that with fewer bits, a perturbation may easily cause a significant change in hidden neurons (i.e., the change is magnified by the loss of precision) and consequently in the output. Furthermore, the correlation between the quantization error bound and the empirical robustness of the QNN suggests that it is indeed possible to apply our method to compute the quantization error bound and use it as a guide for identifying the best quantization scheme, balancing the size of the model against its robustness.

#### 6 Related Work


There is a large and growing body of work on quality assurance techniques for neural networks, including testing (e.g., [4–7,47,50,56,57,63,69]) and formal verification (e.g., [2,8,12,13,15,19,24,29,30,32,34,37,38,51,54,55,58–60,62,70]). Testing techniques are often effective in finding violations, but they cannot prove their absence. While formal verification can prove their absence, existing methods typically target real-valued neural networks, i.e., DNNs, and are not effective in verifying quantization error bounds [48]. In this section, we mainly discuss existing verification techniques for QNNs.

Early work on formal verification of QNNs typically focuses on 1-bit quantized neural networks (i.e., BNNs) [3,9,46,52,53,66,67]. Narodytska et al. [46] first proposed to reduce the verification problem of BNNs to a satisfiability problem of a Boolean formula or an integer linear programming problem. Baluta et al. [3] proposed a PAC-style quantitative analysis framework for BNNs via approximate SAT model-counting solvers. Shih et al. proposed a quantitative verification framework for BNNs [52,53] via a BDD learning-based method [45]. Zhang et al. [66,67] proposed a BDD-based verification framework for BNNs, which exploits the internal structure of the BNNs to construct BDD models instead of relying on BDD learning. Giacobbe et al. [16] pushed this direction further by introducing the first formal verification method for multiple-bit quantized DNNs (i.e., QNNs), encoding the robustness verification problem as an SMT formula in the quantifier-free first-order theory of bit-vectors. Later, Henzinger et al. [22] explored several heuristics to improve the efficiency and scalability of [16]. Very recently, [40,68] proposed an ILP-based and an MILP-based verification method for QNNs, respectively, both outperforming the SMT-based verification approach [22]. Though these works can directly verify QNNs or BNNs, they cannot verify quantization error bounds.

There are also some works exploring the properties of two neural networks, which are most closely related to our work. Paulsen et al. [48,49] proposed differential verification methods to verify two DNNs with the same network topology. This idea has been extended to handle recurrent neural networks [41]. The difference between [41,48,49] and our work has been discussed throughout this paper: they focus on quantized weights and cannot handle quantized activation tensors. Moreover, their methods are not complete and thus may fail to prove tighter error bounds. Semi-definite programming was used to analyze the differing behaviors of DNNs and *fully* quantized QNNs [33]. Different from our work, which focuses on verification, they aim at generating an upper bound for the worst-case error induced by quantization. Furthermore, [33] only scales to tiny QNNs, e.g., with 1 input neuron, 1 output neuron, and 10 neurons per hidden layer (up to 4 hidden layers). In comparison, our differential reachability analysis scales to much larger QNNs, e.g., QNNs with 4890 neurons.

#### 7 Conclusion

In this work, we proposed a novel quantization error bound verification method QEBVerif which is sound, complete, and arguably efficient. We implemented it as an end-to-end tool and conducted thorough experiments on various QNNs with different quantization bit sizes. Experimental results showed the effectiveness and the efficiency of QEBVerif. We also investigated the potential correlation between robustness and quantization errors for QNNs and found that as the quantization error increases the QNN might become less robust. For further work, it would be interesting to investigate the verification method for other activation functions and network architectures, towards which this work makes a significant step.

Acknowledgements. This work is supported by the National Key Research Program (2020AAA0107800), National Natural Science Foundation of China (62072309), CAS Project for Young Scientists in Basic Research (YSBR-040), ISCAS New Cultivation Project (ISCAS-PYFX-202201), and the Ministry of Education, Singapore under its Academic Research Fund Tier 3 (MOET32020-0004). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not reflect the views of the Ministry of Education, Singapore.

## References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Verifying Generalization in Deep Learning**

Guy Amir(B), Osher Maayan, Tom Zelazny, Guy Katz, and Michael Schapira

The Hebrew University of Jerusalem, Jerusalem, Israel {guyam,osherm,tomz,guykatz,schapiram}@cs.huji.ac.il

**Abstract.** Deep neural networks (DNNs) are the workhorses of deep learning, which constitutes the state of the art in numerous application domains. However, DNN-based decision rules are notoriously prone to poor *generalization*, i.e., may prove inadequate on inputs not encountered during training. This limitation poses a significant obstacle to employing deep learning for mission-critical tasks, and also in real-world environments that exhibit high variability. We propose a novel, verification-driven methodology for identifying DNN-based decision rules that generalize well to new input domains. Our approach quantifies generalization to an input domain by the extent to which decisions reached by *independently trained* DNNs are in agreement for inputs in this domain. We show how, by harnessing the power of DNN verification, our approach can be efficiently and effectively realized. We evaluate our verification-based approach on three deep reinforcement learning (DRL) benchmarks, including a system for Internet congestion control. Our results establish the usefulness of our approach. More broadly, our work puts forth a novel objective for formal verification, with the potential for mitigating the risks associated with deploying DNN-based systems in the wild.

## **1 Introduction**

Over the past decade, deep learning [35] has achieved state-of-the-art results in natural language processing, image recognition, game playing, computational biology, and many additional fields [4,18,21,45,50,84,85]. However, despite its impressive success, deep learning still suffers from severe drawbacks that limit its applicability in domains that involve mission-critical tasks or highly variable inputs.

One such crucial limitation is the notorious difficulty of deep neural networks (DNNs) to *generalize* to new input domains, i.e., their tendency to perform poorly on inputs that significantly differ from those encountered while training. During training, a DNN is presented with input data sampled from a specific distribution over some input domain ("*in-distribution*" inputs). The induced DNN-based rules may fail to generalize to inputs not encountered during training due to (1) the DNN being invoked "out-of-distribution" (OOD), i.e., when there is a mismatch between the distribution over inputs in the training data and in

© The Author(s) 2023

G. Amir and O. Maayan—Contributed equally.

C. Enea and A. Lal (Eds.): CAV 2023, LNCS 13965, pp. 438–455, 2023. https://doi.org/10.1007/978-3-031-37703-7\_21

the DNN's operational data; (2) some inputs not being sufficiently represented in the finite training data (e.g., various low-probability corner cases); and (3) "overfitting" the decision rule to the training data.

A notable example of the importance of establishing the generalizability of DNN-based decisions lies in recently proposed applications of deep reinforcement learning (DRL) [56] to real-world systems. Under DRL, an *agent*, realized as a DNN, is trained by repeatedly interacting with its environment to learn a decision-making *policy* that attains high performance with respect to a certain objective ("*reward*"). DRL has recently been applied to many real-world challenges [20,44,54,55,64–67,96,108]. In many application domains, the learned policy is expected to perform well across a daunting breadth of operational environments, whose diversity cannot possibly be captured in the training data. Further, the cost of erroneous decisions can be dire. Our discussion of DRL-based Internet congestion control (see Sect. 4.3) illustrates this point.

Here, we present a methodology for identifying DNN-based decision rules that generalize well to *all possible distributions* over an input domain of interest. Our approach hinges on the following key observation. DNN training in general, and DRL policy training in particular, incorporate multiple stochastic aspects, such as the initialization of the DNN's weights and the order in which inputs are observed during training. Consequently, even when DNNs with *the same* architecture are trained to perform an *identical* task on *the same* data, somewhat different decision rules will typically be learned. Paraphrasing Tolstoy's Anna Karenina [93], we argue that "successful decision rules are all alike; but every unsuccessful decision rule is unsuccessful in its own way". Differently put, when examining the decisions by several *independently trained* DNNs on a certain input, these are likely to agree only when their (similar) decisions yield high performance.

In light of the above, we propose the following heuristic for generating DNN-based decision rules that generalize well to *an entire* given domain of inputs: independently train multiple DNNs, and then seek a subset of these DNNs that are in strong agreement across *all* possible inputs in the considered input domain (implying, by our hypothesis, that these DNNs' learned decision rules generalize well to all probability distributions over this domain). Our evaluation demonstrates (see Sect. 4) that this methodology is extremely powerful and enables distilling from a collection of decision rules the few that indeed generalize better to inputs within this domain. Since our heuristic seeks DNNs whose decisions are in agreement for *each and every* input in a specific domain, the decision rules reached this way achieve robustly high generalization across different possible distributions over inputs in this domain.

Since our methodology involves contrasting the outputs of different DNNs over possibly *infinite* input domains, using formal verification is natural. To this end, we build on recent advances in formal verification of DNNs [2,12,14, 16,27,60,78,86,102]. DNN verification literature has focused on establishing the local adversarial robustness of DNNs, i.e., seeking small input perturbations that result in misclassification by the DNN [31,36,61]. Our approach broadens the applicability of DNN verification by demonstrating, for the first time (to the best of our knowledge), how it can also be used to identify DNN-based decision rules that generalize well. More specifically, we show how, for a given input domain, a DNN verifier can be utilized to assign a score to a DNN reflecting its level of agreement with other DNNs across the entire input domain. This enables iteratively pruning the set of candidate DNNs, eventually keeping only those in strong agreement, which tend to generalize well.

To evaluate our methodology, we focus on three popular DRL benchmarks: (i) *Cartpole*, which involves controlling a cart while balancing a pendulum; (ii) *Mountain Car*, which involves controlling a car that needs to escape a valley; and (iii) *Aurora*, an Internet congestion controller.

Aurora is a particularly compelling example for our approach. While Aurora is intended to tame network congestion across a vast diversity of real-world Internet environments, Aurora is trained only on synthetically generated data. Thus, to deploy Aurora in the real world, it is critical to ensure that its policy is sound for numerous scenarios not captured by its training inputs.

Our evaluation results show that, in all three settings, our verification-driven approach is successful at ranking DNN-based DRL policies according to their ability to generalize well to out-of-distribution inputs. Our experiments also demonstrate that formal verification is superior to gradient-based methods and predictive uncertainty methods. These results showcase the potential of our approach. Our code and benchmarks are publicly available as an artifact accompanying this work [8].

The rest of the paper is organized as follows. Section 2 contains background on DNNs, DRL, and DNN verification. In Sect. 3 we present our verification-based methodology for identifying DNNs that successfully generalize to OOD inputs. We present our evaluation in Sect. 4. Related work is covered in Sect. 5, and we conclude in Sect. 6.

## **2 Background**

**Deep Neural Networks (DNNs)** [35] are directed graphs that comprise several layers. Upon receiving an assignment of values to the nodes of its first (input) layer, the DNN propagates these values, layer by layer, until ultimately reaching the assignment of the final (output) layer. Computing the value for each node is performed according to the type of that node's layer. For example, in weighted-sum layers, the node's value is an affine combination of the values of the nodes in the preceding layer to which it is connected. In *rectified linear unit* (*ReLU*) layers, each node $y$ computes the value $y = \text{ReLU}(x) = \max(x, 0)$, where $x$ is a single node from the preceding layer. For additional details on DNNs and their training see [35]. Figure 1 depicts a toy DNN. For input $V_1 = [1, 2]^T$, the second layer computes the weighted sum $V_2 = [10, -1]^T$. The ReLU functions are subsequently applied in the third layer, and the result is $V_3 = [10, 0]^T$. Finally, the network's single output is $V_4 = [20]$.

**Fig. 1.** A toy DNN.
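Since Fig. 1 itself is not reproduced here, the following sketch uses hypothetical weight matrices (`W2`, `W4`), chosen only so that the forward pass reproduces the values quoted in the text:

```python
import numpy as np

# Hypothetical weights consistent with the worked example in the text;
# Fig. 1's actual weights are not shown in this excerpt.
W2 = np.array([[4.0, 3.0],     # V2[0] = 4*1 + 3*2 = 10
               [1.0, -1.0]])   # V2[1] = 1*1 - 1*2 = -1
W4 = np.array([[2.0, 1.0]])    # output layer

V1 = np.array([1.0, 2.0])      # input
V2 = W2 @ V1                   # weighted sum: [10, -1]
V3 = np.maximum(V2, 0.0)       # ReLU:         [10, 0]
V4 = W4 @ V3                   # output:       [20]
print(V2, V3, V4)
```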

**Deep Reinforcement Learning (DRL)** [56] is a machine learning paradigm in which a DRL *agent*, implemented as a DNN, interacts with an *environment* across discrete time-steps $t \in \{0, 1, 2, \ldots\}$. At each time-step, the agent is presented with the environment's *state* $s_t \in S$, and selects an *action* $N(s_t) = a_t \in A$. The environment then transitions to its next state $s_{t+1}$, and presents the agent with the *reward* $r_t$ for its previous action. The agent is trained through repeated interactions with its environment to maximize the *expected cumulative discounted reward* $R_t = \mathbb{E}\left[\sum_t \gamma^t \cdot r_t\right]$ (where $\gamma \in [0, 1]$ is termed the *discount factor*) [38,82,90,91,97,107].
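As a one-line illustration of the discounted-return objective (a single-rollout estimate of the expectation, with hypothetical rewards):

```python
def discounted_return(rewards, gamma):
    """Cumulative discounted reward sum_t gamma^t * r_t for one
    trajectory of rewards (a single-rollout estimate of R_t)."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

# Three unit rewards with gamma = 0.9: 1 + 0.9 + 0.81 = 2.71
print(discounted_return([1.0, 1.0, 1.0], 0.9))
```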

**DNN and DRL Verification.** A sound DNN verifier [46] receives as input (i) a *trained* DNN $N$; (ii) a precondition $P$ on the DNN's inputs, limiting the possible assignments to a domain of interest; and (iii) a postcondition $Q$ on the DNN's outputs, limiting the possible outputs of the DNN. The verifier can reply in one of two ways: (i) SAT, with a concrete input $x'$ for which $P(x') \wedge Q(N(x'))$ is satisfied; or (ii) UNSAT, indicating that no such $x'$ exists. Typically, $Q$ encodes the *negation* of $N$'s desirable behavior for inputs that satisfy $P$. Thus, a SAT result indicates that the DNN errs, and that $x'$ triggers a bug; whereas an UNSAT result indicates that the DNN performs as intended. An example of this process appears in Appendix B of our extended paper [7]. To date, a plethora of verification approaches have been proposed for general, feed-forward DNNs [3,31,41,46,61,99], as well as for DRL-based agents that operate within reactive environments [5,9,15,22,28].

## **3 Quantifying Generalizability via Verification**

Our approach for assessing how well a DNN is expected to generalize on out-of-distribution inputs relies on the "Karenina hypothesis": while there are many (possibly infinite) ways to produce *incorrect results*, correct outputs are likely to be fairly similar. Hence, to identify DNN-based decision rules that generalize well to new input domains, we advocate training multiple DNNs and scoring the learned decision models according to how well their outputs align with those of the other models for the considered input domain. These scores can be computed using a backend DNN verifier. We show how, by iteratively filtering out models that tend to disagree with the rest, DNNs that generalize well can be effectively distilled.

We begin by introducing the following definitions for reasoning about the extent to which two DNN-based decision rules are in agreement over an input domain.

**Definition 1 (Distance Function).** *Let* $\mathcal{O}$ *be the space of possible outputs for a DNN. A* distance function *for* $\mathcal{O}$ *is a function* $d : \mathcal{O} \times \mathcal{O} \to \mathbb{R}^+$*.*

Intuitively, a distance function (e.g., the $L_1$ norm) allows us to quantify the level of (dis)agreement between the decisions of two DNNs on the same input. We elaborate on some choices of distance functions that may be appropriate in various domains in Appendix B of our extended paper [7].
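For concreteness, two plausible choices of distance function might look as follows (illustrative only; the paper's actual choices are deferred to its appendix):

```python
# Two candidate distance functions over DNN outputs (illustrative).
def l1_distance(y1, y2):
    """L1 norm between two output vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(y1, y2))

def argmax_disagreement(y1, y2):
    """0/1 distance for classifiers: 1.0 iff the two DNNs pick
    different labels (different argmax of the output vector)."""
    amax = lambda y: max(range(len(y)), key=y.__getitem__)
    return 0.0 if amax(y1) == amax(y2) else 1.0
```

Both map a pair of outputs to a non-negative real, as Definition 1 requires.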

**Definition 2 (Pairwise Disagreement Threshold).** *Let* $N_1, N_2$ *be DNNs with the same output space* $\mathcal{O}$*, let* $d$ *be a distance function, and let* $\Psi$ *be an input domain. We define the* pairwise disagreement threshold *(PDT) of* $N_1$ *and* $N_2$ *as:*

$$PDT\_{d,\Psi}(N\_1, N\_2) = \min \left\{ \alpha \in \mathbb{R}^+ \mid \forall x \in \Psi : d(N\_1(x), N\_2(x)) \leq \alpha \right\}$$

The definition captures the notion that for *any* input in $\Psi$, $N_1$ and $N_2$ produce outputs that are at most $\alpha$-distance apart. A small $\alpha$ value indicates that the outputs of $N_1$ and $N_2$ are close for all inputs in $\Psi$, whereas a high value indicates that there exists an input in $\Psi$ for which the decision models diverge significantly.

To compute PDT values, our approach employs verification to conduct a binary search for the maximum distance between the outputs of two DNNs; see Algorithm 1.

**Algorithm 1.** Pairwise Disagreement Threshold

**Input:** DNNs (N*i*, N*j*), distance function d, input domain Ψ, max. disagreement M > 0
**Output:** PDT(N*i*, N*j*)

1: low ← 0, high ← M
2: **while** (low < high) **do**
3: &nbsp;&nbsp;&nbsp;&nbsp;α ← ½ · (low + high)
4: &nbsp;&nbsp;&nbsp;&nbsp;query ← SMT SOLVER(P ← Ψ, [N*i*; N*j*], Q ← d(N*i*, N*j*) ≥ α)
5: &nbsp;&nbsp;&nbsp;&nbsp;**if** query is SAT **then**: low ← α
6: &nbsp;&nbsp;&nbsp;&nbsp;**else if** query is UNSAT **then**: high ← α
7: **end while**
8: **return** α
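Algorithm 1's binary search can be sketched in Python, with the verifier call abstracted as a callable oracle. The explicit `eps` accuracy parameter is our assumption, standing in for the paper's informal "up to some accuracy":

```python
# Sketch of Algorithm 1: binary search for the PDT via a verifier oracle.
# `query(alpha)` abstracts the SMT-solver call: it returns True (SAT:
# some input x in Psi has d(N_i(x), N_j(x)) >= alpha) or False (UNSAT).
def pairwise_disagreement_threshold(query, M, eps=1e-3):
    low, high = 0.0, M
    while high - low > eps:
        alpha = (low + high) / 2
        if query(alpha):      # SAT: some input reaches distance alpha
            low = alpha
        else:                 # UNSAT: all inputs stay below alpha
            high = alpha
    return high               # upper bound on the PDT, tight up to eps
```

On real-valued distances the loop terminates only up to the `eps` tolerance, which is why returning the UNSAT-side bound `high` yields a sound over-approximation of the true threshold.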

Pairwise disagreement thresholds can be aggregated to measure the disagreement between a decision model and a *set* of other decision models, as defined next.

**Definition 3 (Disagreement Score).** *Let* $\mathcal{N} = \{N_1, N_2, \ldots, N_k\}$ *be a set of* $k$ *DNN-induced decision models, let* $d$ *be a distance function, and let* $\Psi$ *be an input domain. A model's* disagreement score *(DS) with respect to* $\mathcal{N}$ *is defined as:*

$$DS\_{\mathcal{N},d,\Psi}(N\_i) = \frac{1}{|\mathcal{N}| - 1} \sum\_{j \in [k], j \neq i} PDT\_{d,\Psi}(N\_i, N\_j)$$

Intuitively, the disagreement score measures how much a single decision model tends to disagree with the remaining models, on average.
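Definition 3 transcribes directly into code, assuming a precomputed table of PDT values keyed by unordered model pairs (the data layout is ours):

```python
# Disagreement score (Definition 3): the average PDT of model i against
# every other model in the set. `pdt` maps frozenset({i, j}) to the
# precomputed pairwise disagreement threshold of models i and j.
def disagreement_score(i, models, pdt):
    others = [j for j in models if j != i]
    return sum(pdt[frozenset((i, j))] for j in others) / len(others)
```

Keying the table by `frozenset` pairs exploits the symmetry PDT(N_i, N_j) = PDT(N_j, N_i), so each threshold is stored once.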

Using disagreement scores, our heuristic employs an iterative scheme for selecting a subset of models that generalize to OOD scenarios—as encoded by inputs in $\Psi$ (see Algorithm 2). First, a set of $k$ DNNs $\{N_1, N_2, \ldots, N_k\}$ are *independently* trained on the training data. Next, a backend verifier is invoked to calculate, for each of the $\binom{k}{2}$ DNN-based model pairs, their respective pairwise disagreement threshold (up to some accuracy). Next, our algorithm iteratively: (i) calculates the disagreement score for each model in the remaining subset of models; (ii) identifies the models with the (relative) highest DS scores; and (iii) removes them (Line 9 in Algorithm 2). The algorithm terminates after exceeding a user-defined number of iterations (Line 3 in Algorithm 2), or when the remaining models "agree" across the input domain, as indicated by nearly identical disagreement scores (Line 7 in Algorithm 2). We note that the algorithm is also given an upper bound (M) on the maximum difference, informed by the user's domain-specific knowledge.

**Algorithm 2.** Model Selection

**Input:** Set of models N = {N1, ..., N*k*}, max disagreement M, number of ITERATIONS
**Output:** N′ ⊆ N

1: PDT ← PairwiseDisagreementThresholds(N, d, Ψ, M) ▷ table with all PDTs
2: N′ ← N
3: **for** l = 1 ... ITERATIONS **do**
4: &nbsp;&nbsp;&nbsp;&nbsp;**for** N*i* ∈ N′ **do**
5: &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;currentDS[N*i*] ← DS(N*i*, PDT) ▷ based on Definition 3
6: &nbsp;&nbsp;&nbsp;&nbsp;**end for**
7: &nbsp;&nbsp;&nbsp;&nbsp;**if** modelScoresAreSimilar(currentDS) **then**: break
8: &nbsp;&nbsp;&nbsp;&nbsp;modelsToRemove ← findModelsWithHighestDS(currentDS)
9: &nbsp;&nbsp;&nbsp;&nbsp;N′ ← N′ \ modelsToRemove ▷ remove models that tend to disagree
10: **end for**
11: **return** N′
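The selection loop might be sketched as follows, using removal of the p% highest-scoring models as the concrete filtering step (parameter names and the tolerance-based agreement test are our assumptions):

```python
# Sketch of Algorithm 2. `pdt` is the table of thresholds produced by
# Algorithm 1, keyed by frozenset({i, j}); `p` is the fraction of models
# removed per iteration; `tol` decides when scores count as "similar".
def select_models(models, pdt, iterations=10, p=0.25, tol=1e-6):
    surviving = list(models)
    for _ in range(iterations):
        if len(surviving) <= 1:
            break
        # Disagreement score of each surviving model w.r.t. the rest.
        ds = {i: sum(pdt[frozenset((i, j))] for j in surviving if j != i)
                 / (len(surviving) - 1)
              for i in surviving}
        if max(ds.values()) - min(ds.values()) < tol:
            break                        # remaining models already agree
        k = max(1, int(p * len(surviving)))
        worst = sorted(surviving, key=ds.get, reverse=True)[:k]
        surviving = [i for i in surviving if i not in worst]
    return surviving
```

Note that the disagreement scores are recomputed over the *surviving* set in each iteration, so removing an outlier model changes the scores of all remaining models.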

**DS Removal Threshold.** Different criteria are possible for determining the DS threshold above which models are removed, and for how many models to remove in each iteration (Line 8 in Algorithm 2). A natural and simple approach, used in our evaluation, is to remove the p% of models with the *highest* disagreement scores, for some choice of p (25% in our evaluation). Due to space constraints, a thorough discussion of additional filtering criteria (all of which proved successful) is relegated to Appendix C of our extended paper [7].

## **4 Evaluation**

We extensively evaluated our method using three DRL benchmarks. As discussed in the introduction, verifying the generalizability of DRL-based systems is important since such systems are often expected to provide robustly high performance across a broad range of environments, whose diversity is not captured by the training data. Our evaluation spans two classic DRL settings, Cartpole [17] and Mountain Car [68], as well as the recently proposed Aurora congestion controller for Internet traffic [44]. Aurora is a particularly compelling example of a fairly complex DRL-based system that addresses a crucial real-world challenge and must generalize to real-world conditions not represented in its training data.

**Setup.** For each of the three DRL benchmarks, we first trained multiple DNNs with the same architecture, where the training process differed only in the random seed used. We then removed from this set of DNNs all but the ones that achieved high reward values in-distribution (to eliminate the possibility that a decision model generalizes poorly simply due to poor training). Next, we defined out-of-distribution input domains of interest for each specific benchmark, and used Algorithm 2 to select the models most likely to generalize well on those domains according to our framework. To establish the ground truth for how well different models actually generalize in practice, we then applied the models to OOD inputs drawn from the considered domain and ranked them based on their empirical performance (average reward). To investigate the robustness of our results, the last step was conducted for varying choices of probability distributions over the inputs in the domain. All DNNs used have a feed-forward architecture comprising two hidden layers with ReLU activations: 32–64 neurons in the first hidden layer and 16 neurons in the second.

The results indicate that models selected by our approach are likely to perform *significantly better* than the rest. Below we describe the gist of our evaluation; extensive additional information is available in [7].

#### **4.1 Cartpole**

Cartpole [33] is a well-known RL benchmark in which an agent controls the movement of a cart with an upside-down pendulum ("pole") attached to its top. The cart moves on a platform and the agent's goal is to keep the pole balanced for as long as possible (see Fig. 2).

**Fig. 2.** Cartpole: in-distribution setting (blue) and OOD setting (red). (Color figure online)

**Agent and Environment.** The agent's inputs are $s = (x, v_x, \theta, v_\theta)$, where $x$ represents the cart's location on the platform, $\theta$ represents the pole's angle (i.e., $|\theta| \approx 0$ for a balanced pole, $|\theta| \approx 90°$ for an unbalanced pole), $v_x$ represents the cart's horizontal velocity, and $v_\theta$ represents the pole's angular velocity.

**In-Distribution Inputs.** During training, the agent is incentivized to balance the pole, while staying within the platform's boundaries. In each iteration, the agent's single output indicates the cart's acceleration (sign and magnitude) for the next step. During training, we defined the platform's bounds to be [−2.4, 2.4], and the cart's initial position as near-static and close to the center of the platform (left-hand side of Fig. 2). This was achieved by drawing the cart's initial state vector values uniformly from the range [−0.05, 0.05].

**(OOD) Input Domain.** We consider an input domain with larger platforms than the ones used in training. To wit, we now allow the x coordinate of the input vectors to cover a wider range of [−10, 10]. For the other inputs, we used the same bounds as during the training. See [7] for additional details.

**Evaluation.** We trained k = 16 models, all of which achieved high rewards during training on the short platform. Next, we ran Algorithm 2 until convergence (7 iterations, in our experiments) on the aforementioned input domain, resulting in a set of 3 models. We then tested all 16 original models using (OOD) inputs drawn from the new domain, such that the generated distribution encodes a novel setting: the cart is now placed at the center of a much longer, shifted platform (see the red cart in Fig. 2).

**Fig. 3.** Cartpole: Algorithm 2's results, per iteration: the bars reflect the ratio between the good/bad models (left y-axis) in the surviving set of models, and the curve indicates the number of surviving models (right y-axis).

All other parameters in the OOD environment were identical to those used for the original training. Figure 9 (in [7]) depicts the results of evaluating the models using 20,000 OOD instances. Of the original 16 models, 11 scored a low-to-mediocre average reward, indicating their poor ability to generalize to this new distribution. Only 5 models obtained high reward values, including the 3 models identified by Algorithm 2; thus implying that our method was able to effectively remove all 11 models that would have otherwise performed poorly in this OOD setting (see Fig. 3). For additional information, see [7].

#### **4.2 Mountain Car**

For our second experiment, we evaluated our method on the Mountain Car [79] benchmark, in which an agent controls a car that needs to learn how to escape a valley and reach a target. As in the Cartpole experiment, we selected a set of models that performed well in-distribution and applied our method to identify a subset of models that make similar decisions in a predefined input domain. We again generated OOD inputs (relative to the training) from within this domain, and observed that the models selected by our algorithm indeed generalize significantly better than their peers that were iteratively removed. Detailed information about this benchmark can be found in Appendix E of our extended paper [7].

### **4.3 Aurora Congestion Controller**

In our third benchmark, we applied our method to a complex, real-world system that implements a policy for Internet congestion control. The goal of congestion control is to determine, for each traffic source in a communication network, the pace at which data packets should be sent into the network. Congestion control is a notoriously difficult and fundamental challenge in computer networking [59,69]; sending packets too fast might cause network congestion, leading to data loss and delays. Conversely, low sending rates might under-utilize available network bandwidth. *Aurora* [44] is a DRL-based congestion controller that is the subject of recent work on DRL verification [9,28]. In each time-step, an Aurora agent observes statistics regarding the network and decides the packet sending rate for the following time-step. For example, if the agent observes excellent network conditions (e.g., no packet loss), we expect it to increase the packet sending rate to better utilize the network. We note that Aurora handles a much harder task than classical RL benchmarks (e.g., Cartpole and Mountain Car): congestion controllers must react gracefully to various possible events based on nuanced signals, as reflected by Aurora's inputs. Here, unlike in the previous benchmarks, it is not straightforward to characterize the optimal policy.

**Agent and Environment.** Aurora's inputs are $t$ vectors $v_1, \ldots, v_t$, representing observations from the $t$ previous time-steps. The agent's single output value indicates the change in the packet sending rate over the next time-step. Each vector $v_i \in \mathbb{R}^3$ includes three distinct values, representing statistics that reflect the network's condition (see details in Appendix F of [7]). In line with previous work [9,28,44], we set $t = 10$ time-steps, making Aurora's inputs of size $3t = 30$. The reward function is a linear combination of the data sender's throughput, latency, and packet loss, as observed by the agent (see [44] for additional details).

**In-Distribution Inputs.** Aurora's training applies the congestion controller to simple network scenarios where a *single* sender sends traffic towards a *single* receiver across a *single* network link. Aurora is trained across varying choices of initial sending rate, link bandwidth, link packet-loss rate, link latency, and size of the link's packet buffer. During training, packets are initially sent by Aurora at a rate corresponding to 0.3–1.5 times the link's bandwidth.

**(OOD) Input Domain.** In our experiments, the input domain encoded a link with a *shallow packet buffer*, implying that only a few packets can accumulate in the network (while most excess traffic is discarded), causing the link to exhibit a volatile behavior. This is captured by the initial sending rate being up to 8 times the link's bandwidth, to model the possibility of a dramatic decrease in available bandwidth (e.g., due to competition, traffic shifts, etc.). See [7] for additional details.

**Evaluation.** We ran our algorithm and scored the models based on their disagreement upon this large domain, which includes inputs they had not encountered during training, representing the aforementioned novel link conditions.

**Experiment (1): High Packet Loss.** In this experiment, we trained over 100 Aurora agents in the original (in-distribution) environment. Out of these, we selected k = 16 agents that achieved a high average reward in-distribution (see Fig. 20a in [7]). Next, we evaluated these agents on OOD inputs that are included in the previously described domain. The main difference between the training distribution and the new (OOD) ones is the possibility of extreme packet loss rates upon initialization.

Our evaluation over the OOD inputs, within the domain, indicates that although all 16 models performed well in-distribution, only 7 agents could successfully handle such OOD inputs (see Fig. 20b in [7]). When we ran Algorithm 2 on the 16 models, it was able to filter out *all* 9 models that generalized poorly on the OOD inputs (see Fig. 4). In particular, our method returned model {16}, which is the best-performing model according to our simulations. We note that in the first iterations, the four models to be filtered out were models {1, 2, 6, 13}, which are indeed the four worst-performing models on the OOD inputs (see Appendix F of [7]).

**Fig. 4.** Aurora: Algorithm 2's results, per iteration.

**Experiment (2): Additional Distributions over OOD Inputs.** To further demonstrate that, in the specified input domain, our method is indeed likely to keep better-performing models while removing bad models, we reran the previous Aurora experiments for additional distributions (probability density functions) over the OOD inputs. Our evaluation reveals that all models removed by Algorithm 2 achieved low reward values also for these additional distributions. These results highlight an important advantage of our approach: it applies to all inputs within the considered domain, and so it applies to *all distributions over these inputs*.

**Additional Experiments.** We also generated a new set of Aurora models by altering the training process to include significantly longer interactions. We then repeated the aforementioned experiments. The results (summarized in [7]) demonstrate that our approach (again) successfully selected a subset of models that generalizes well to distributions over the OOD input domain.

#### **4.4 Comparison to Additional Methods**

*Gradient-based methods* [40,53,62,63] are optimization algorithms capable of finding DNN inputs that satisfy prescribed constraints, similarly to verification methods. These algorithms are extremely popular due to their simplicity and scalability. However, this comes at the cost of being inherently incomplete and not as precise as DNN verification [11,101]. Indeed, when modifying our algorithm to calculate PDT scores with gradient-based methods, the results (summarized in Appendix G of [7]) reveal that, in our context, the verification-based approach is superior to the gradient-based ones. Due to the incompleteness of gradient-based approaches [101], they often computed sub-optimal PDT values, resulting in models that generalize poorly being retained.

*Predictive uncertainty methods* [1,74] are *online* methods for assessing uncertainty with respect to observed inputs, to determine whether an encountered input is drawn from the training distribution. We ran an experiment comparing our approach to uncertainty-prediction-based model selection: we generated ensembles [23,30,51] of our original models, and used a variance-based metric (motivated by [58]) to identify subsets of models with low output variance on OOD-sampled inputs. Like gradient-based methods, predictive-uncertainty techniques proved fast and scalable, but lacked the precision afforded by verification-driven model selection and were unable to discard poorly generalizing models. For example, when ranking Cartpole models by their uncertainty on OOD inputs, the three models with the lowest uncertainty also included "bad" models, which had been filtered out by our approach.

## **5 Related Work**

Recently, a plethora of approaches and tools have been put forth for ensuring DNN correctness [2,6,10,15,19,24–27,29,31,32,34,36,37,41–43,46–49,52, 57,61,70,76,81,83,86,87,89,92,94,95,98,100,102,104,106], including techniques for DNN shielding [60], optimization [14,88], quantitative verification [16], abstraction [12,13,73,78,86,105], size reduction [77], and more. Non-verification techniques, including runtime-monitoring [39], ensembles [71,72,80,103] and additional methods [75] have been utilized for OOD input detection.

In contrast to the above approaches, we aim to establish *generalization guarantees* with respect to an *entire input domain* (spanning all distributions across this domain). In addition, to the best of our knowledge, ours is the first attempt to exploit variability across models for distilling a subset thereof, with improved *generalization* capabilities. In particular, it is also the first approach to apply formal verification for this purpose.

## **6 Conclusion**

This work describes a novel, verification-driven approach for identifying DNN models that generalize well to an input domain of interest. We presented an iterative scheme that employs a backend DNN verifier, allowing us to score models based on their ability to produce similar outputs on the given domain. We demonstrated extensively that this approach indeed distills models capable of good generalization. As DNN verification technology matures, our approach will become increasingly scalable, and also applicable to a wider variety of DNNs.

**Acknowledgements.** The work of Amir, Zelazny, and Katz was partially supported by the Israel Science Foundation (grant number 683/18). The work of Amir was supported by a scholarship from the Clore Israel Foundation. The work of Maayan and Schapira was partially supported by funding from Huawei.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

#### **A**

Abdulla, Parosh Aziz I-184 Akshay, S. I-266, I-367, III-86 Albert, Elvira III-176 Alistarh, Dan I-156 Alur, Rajeev I-415 Amilon, Jesper III-281 Amir, Guy II-438 An, Jie I-62 Anand, Ashwani I-436 Andriushchenko, Roman III-113 Apicelli, Andrew I-27 Arcaini, Paolo I-62 Asada, Kazuyuki III-40 Ascari, Flavio II-41 Atig, Mohamed Faouzi I-184

#### **B**

Badings, Thom III-62 Barrett, Clark II-163, III-154 Bastani, Favyen I-459 Bastani, Osbert I-415, I-459 Bayless, Sam I-27 Becchi, Anna II-288 Beutner, Raven II-309 Bisping, Benjamin I-85 Blicha, Martin II-209 Bonchi, Filippo II-41 Bork, Alexander III-113 Braught, Katherine I-351 Britikov, Konstantin II-209 Brown, Fraser III-154 Bruni, Roberto II-41 Bucev, Mario III-398

#### **C**

Calinescu, Radu I-289 Češka, Milan III-113 Chakraborty, Supratik I-367 Chatterjee, Krishnendu III-16, III-86 Chaudhuri, Swarat III-213 Chechik, Marsha III-374 Chen, Hanyue I-40 Chen, Taolue III-255 Chen, Yu-Fang III-139 Choi, Sung Woo II-397 Chung, Kai-Min III-139 Cimatti, Alessandro II-288 Cosler, Matthias II-383 Couillard, Eszter III-437 Czerner, Philipp III-437

#### **D**

Dardik, Ian I-326 Das, Ankush I-27 David, Cristina III-459 Dongol, Brijesh I-206 Dreossi, Tommaso I-253 Dutertre, Bruno II-187

#### **E**

Eberhart, Clovis III-40 Esen, Zafer III-281 Esparza, Javier III-437

#### **F**

Farzan, Azadeh I-109 Fedorov, Alexander I-156 Feng, Nick III-374 Finkbeiner, Bernd II-309 Fremont, Daniel J. I-253 Frenkel, Hadar II-309 Fu, Hongfei III-16 Fu, Yu-Fu II-227, III-329

#### **G**

Gacek, Andrew I-27 Garcia-Contreras, Isabel II-64

© The Editor(s) (if applicable) and The Author(s) 2023 C. Enea and A. Lal (Eds.): CAV 2023, LNCS 13965, pp. 457–460, 2023. https://doi.org/10.1007/978-3-031-37703-7

Gastin, Paul I-266 Genaim, Samir III-176 Getir Yaman, Sinem I-289 Ghosh, Shromona I-253 Godbole, Adwait I-184 Goel, Amit II-187 Goharshady, Amir Kafshdar III-16 Goldberg, Eugene II-110 Gopinath, Divya I-289 Gori, Roberta II-41 Govind, R. I-266 Govind, V. K. Hari II-64 Griggio, Alberto II-288, III-423 Guilloud, Simon III-398 Gurfinkel, Arie II-64 Gurov, Dilian III-281

#### **H**

Hahn, Christopher II-383 Hasuo, Ichiro I-62, II-41, III-40 Henzinger, Thomas A. II-358 Hofman, Piotr I-132 Hovland, Paul D. II-265 Hückelheim, Jan II-265

#### **I**

Imrie, Calum I-289

#### **J**

Jaganathan, Dhiva I-27 Jain, Sahil I-367 Jansen, Nils III-62 Jeż, Artur II-18 Johannsen, Chris III-483 Johnson, Taylor T. II-397 Jonáš, Martin III-423 Jones, Phillip III-483 Joshi, Aniruddha R. I-266 Jothimurugan, Kishor I-415 Junges, Sebastian III-62, III-113

#### **K**

Kang, Eunsuk I-326 Karimi, Mahyar II-358 Kashiwa, Shun I-253 Katoen, Joost-Pieter III-113 Katz, Guy II-438 Kempa, Brian III-483 Kiesl-Reiter, Benjamin II-187 Kim, Edward I-253 Kirchner, Daniel III-176 Kokologiannakis, Michalis I-230 Kong, Soonho II-187 Kori, Mayuko II-41 Koval, Nikita I-156 Kremer, Gereon II-163 Křetínský, Jan I-390 Krishna, Shankaranarayanan I-184 Kueffner, Konstantin II-358 Kunčak, Viktor III-398

#### **L**

Lafortune, Stéphane I-326 Lahav, Ori I-206 Lengál, Ondřej III-139 Lette, Danya I-109 Li, Elaine III-350 Li, Haokun II-87 Li, Jianwen II-288 Li, Yangge I-351 Li, Yannan II-335 Lidström, Christian III-281 Lin, Anthony W. II-18 Lin, Jyun-Ao III-139 Liu, Jiaxiang II-227, III-329 Liu, Mingyang III-255 Liu, Zhiming I-40 Lopez, Diego Manzanas II-397 Lotz, Kevin II-187 Luo, Ziqing II-265

#### **M**

Maayan, Osher II-438 Macák, Filip III-113 Majumdar, Rupak II-187, III-3, III-437 Mallik, Kaushik II-358, III-3 Mangal, Ravi I-289 Marandi, Ahmadreza III-62 Markgraf, Oliver II-18 Marmanis, Iason I-230 Marsso, Lina III-374 Martin-Martin, Enrique III-176 Mazowiecki, Filip I-132 Meel, Kuldeep S. II-132 Meggendorfer, Tobias I-390, III-86 Meira-Góes, Rômulo I-326 Mell, Stephen I-459 Mendoza, Daniel II-383

Metzger, Niklas II-309 Meyer, Roland I-170 Mi, Junri I-40 Milovančević, Dragana III-398 Mitra, Sayan I-351

#### **N**

Nagarakatte, Santosh III-226 Narayana, Srinivas III-226 Nayak, Satya Prakash I-436 Niemetz, Aina II-3 Nowotka, Dirk II-187

#### **O**

Offtermatt, Philip I-132 Opaterny, Anton I-170 Ozdemir, Alex II-163, III-154

#### **P**

Padhi, Saswat I-27 Păsăreanu, Corina S. I-289 Peng, Chao I-304 Perez, Mateo I-415 Preiner, Mathias II-3 Prokop, Maximilian I-390 Pu, Geguang II-288

#### **R**

Reps, Thomas III-213 Rhea, Matthew I-253 Rieder, Sabine I-390 Rodríguez, Andoni III-305 Roy, Subhajit III-190 Rozier, Kristin Yvonne III-483 Rümmer, Philipp II-18, III-281 Rychlicki, Mateusz III-3

#### **S**

Sabetzadeh, Mehrdad III-374 Sánchez, César III-305 Sangiovanni-Vincentelli, Alberto L. I-253 Schapira, Michael II-438 Schmitt, Frederik II-383 Schmuck, Anne-Kathrin I-436, III-3 Seshia, Sanjit A. I-253 Shachnai, Matan III-226 Sharma, Vaibhav I-27

Sharygina, Natasha II-209 Shen, Keyi I-351 Shi, Xiaomu II-227, III-329 Shoham, Sharon II-64 Siegel, Stephen F. II-265 Sistla, Meghana III-213 Sokolova, Maria I-156 Somenzi, Fabio I-415 Song, Fu II-413, III-255 Soudjani, Sadegh III-3 Srivathsan, B. I-266 Stanford, Caleb II-241 Stutz, Felix III-350 Su, Yu I-40 Sun, Jun II-413 Sun, Yican III-16

#### **T**

Takhar, Gourav III-190 Tang, Xiaochao I-304 Tinelli, Cesare II-163 Topcu, Ufuk III-62 Tran, Hoang-Dung II-397 Tripakis, Stavros I-326 Trippel, Caroline II-383 Trivedi, Ashutosh I-415 Tsai, Ming-Hsien II-227, III-329 Tsai, Wei-Lun III-139 Tsitelov, Dmitry I-156

#### **V**

Vafeiadis, Viktor I-230 Vahanwala, Mihir I-184 Veanes, Margus II-241 Vin, Eric I-253 Vishwanathan, Harishankar III-226

#### **W**

Waga, Masaki I-3 Wahby, Riad S. III-154 Wang, Bow-Yaw II-227, III-329 Wang, Chao II-335 Wang, Jingbo II-335 Wang, Meng III-459 Watanabe, Kazuki III-40 Wehrheim, Heike I-206 Whalen, Michael W. I-27 Wies, Thomas I-170, III-350

Wolff, Sebastian I-170 Wu, Wenhao II-265

#### **X**

Xia, Bican II-87 Xia, Yechuan II-288

#### **Y**

Yadav, Raveesh I-27 Yang, Bo-Yin II-227, III-329 Yang, Jiong II-132 Yang, Zhengfeng I-304 Yu, Huafeng I-289 Yu, Yijun III-459 Yue, Xiangyu I-253

#### **Z**

Zdancewic, Steve I-459 Zelazny, Tom II-438 Zeng, Xia I-304 Zeng, Zhenbing I-304 Zhang, Hanliang III-459 Zhang, Li I-304 Zhang, Miaomiao I-40 Zhang, Pei III-483 Zhang, Yedi II-413 Zhang, Zhenya I-62 Zhao, Tianqi II-87 Zhu, Haoqing I-351 Žikelić, Đorđe III-86 Zufferey, Damien III-350