# **SpringerBriefs in Computer Science**

**Alexander Felfernig · Andreas Falkner · David Benavides**

# **Feature Models**

AI-Driven Design, Analysis and Applications

**SpringerBriefs in Computer Science**

SpringerBriefs present concise summaries of cutting-edge research and practical applications across a wide spectrum of fields. Featuring compact volumes of 50 to 125 pages, the series covers a range of content from professional to academic.



Briefs allow authors to present their ideas and readers to absorb them with minimal time investment. Briefs will be published as part of Springer's eBook collection, with millions of users worldwide. In addition, Briefs will be available for individual print and electronic purchase. Briefs are characterized by fast, global electronic dissemination, standard publishing contracts, easy-to-use manuscript preparation and formatting guidelines, and expedited production schedules. We aim for publication 8–12 weeks after acceptance. Both solicited and unsolicited manuscripts are considered for publication in this series.

**Indexing:** This series is indexed in Scopus, Ei-Compendex, and zbMATH.

Alexander Felfernig • Andreas Falkner • David Benavides

# Feature Models

AI-Driven Design, Analysis and Applications

Alexander Felfernig
Institute of Software Technology
Graz University of Technology
Graz, Austria

Andreas Falkner
Corporate Technology
Siemens (Austria)
Wien, Austria

David Benavides
ETS de Ingeniería Informática
University of Seville
Sevilla, Spain

ISSN 2191-5768 | ISSN 2191-5776 (electronic)
SpringerBriefs in Computer Science
ISBN 978-3-031-61873-4 | ISBN 978-3-031-61874-1 (eBook)
https://doi.org/10.1007/978-3-031-61874-1

The work presented in this book was partially supported by: (1) the Austrian Research Promotion Agency (FFG) (www.ffg.at), (2) the FEDER/Ministry of Science and Innovation/Junta de Andalucia/State Research Agency (www.ciencia.gob.es), and (3) the company sponsors Siemens (www.siemens.com) and Lam Research (www.lamresearch.com).

© The Editor(s) (if applicable) and The Author(s) 2024. This book is an open access publication.

**Open Access** This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

If disposing of this product, please recycle the paper.

# **Preface**

Feature models (FMs) have become a fundamental means of representing variability knowledge related to software systems, services, and also physical products such as furniture, cars, and cyber-physical systems. They define all allowed combinations of the features representing possible variants of a product. FMs provide a language and a corresponding formal semantics that help to support reasoning operations, for example, finding correct FM configurations and analyzing FMs. The ever-increasing amount of research on the integration of Artificial Intelligence (AI) methods into feature modelling related processes motivated us to write this book on *Feature Models: AI-driven Design, Analysis, and Applications*. Its purpose is to provide a basic introduction to feature modelling and analysis as well as the integration of AI methods with feature modelling. This book is intended as an introduction for persons new to the field and also as reference material for researchers, teachers, and practitioners. More specifically, while focusing on the AI perspective, the book covers the topics of feature modelling, FM analysis, and interacting with FM configurators. These topics are discussed along the AI areas of knowledge representation and reasoning (KRR), explainable AI (XAI), and machine learning (ML). Last but not least, a personal note: we decided to order the author names by the order of our first names.

Graz, Vienna, Sevilla
April 2024

*Alexander Felfernig*
*Andreas Falkner*
*David Benavides*

# **Acknowledgements**

We want to thank the following colleagues for their valuable contributions in terms of providing feedback on the book contents that helped us to develop a broader view on the topic and also to significantly improve understandability: *Don Batory* (University of Texas), *José A. Galindo, David Romero, and José A. Zamudio* (University of Seville), *José M. Horcas* (University of Málaga), *Viet-Man Le, Sebastian Lubos, Thi Ngoc Trang Tran, Damian Garber, and Tamim Burgstaller* (Graz University of Technology), and *Mathias Uta* (Siemens Energy). Furthermore, special thanks go to *Jean-Marc Jézéquel* who provided invaluable inspiration regarding an easy-to-understand complexity characterization of feature models. The work presented in this book was partially supported by: (1) the *Austrian Research Promotion Agency* (FFG) (www.ffg.at) within the scope of the research projects ParXCel (880657) and OpenSpace (FO999891127), (2) the *FEDER/Ministry of Science and Innovation/Junta de Andalucía/State Research Agency* (www.ciencia.gob.es) with the following grants: Data-pl (PID2022-138486OB-I00), Tasova Plus research network (RED2022-134337-T), and MIDAS (IDI-20230256), and (3) the company sponsors Siemens (www.siemens.com) and Lam Research (www.lamresearch.com) who made the open-access publishing of this book possible.

# **Contents**



# **Chapter 1 Introduction**

**Abstract** *Feature models* (FMs) are an established means for representing variability and commonality properties of software product lines and beyond (e.g., financial services and configurable products such as furniture, cars, and cyber-physical systems). *Artificial Intelligence* (AI) plays an increasingly important role in supporting feature modelling tasks, FM analysis, and FM configuration. In this chapter, we explain our major motivation for writing this book. We provide a short overview of the history of feature modelling, specifically focusing on the relationship between feature modelling related tasks and AI methods. We discuss relevant benefits of applying FMs and refer to further topics with a relationship to feature modelling. We conclude this chapter with an overview of the major topics of this book.

### **1.1 Motivation for the Book**

Feature models (FMs) are a widespread means for representing variability properties of software product lines (SPL) [2, 6, 9, 10, 16, 47] as well as configurable products and services [9, 26, 45]. Major advantages of FMs are that (1) they are easy to understand and develop (only a few modelling concepts with a clear semantics are provided, which are sufficient in many application scenarios), (2) they can be directly translated into a corresponding formal representation, for example, a constraint satisfaction problem (CSP) [53] or a Boolean satisfiability (SAT) problem [15], which allows for automated and efficient reasoning processes, and (3) there exists a plethora of tools for FM-based variability management (see Chapter 5).

On an informal level, FMs represent configuration spaces with a graphical representation in terms of (1) features which can be included in a configuration or excluded (i.e., are not part of the configuration) and (2) a set of constraints which restrict the combinations of individual features in a final configuration. An FM of a configurable *smartwatch* will be used as a working example throughout this book. Example features of a smartwatch are *payment* and *screen* (the screen type of a smartwatch). A related constraint could specify that if a user is interested in a standard screen, no payment feature is available, i.e., the payment feature is incompatible with the standard screen. The basic concept is that a set of similar products can be described in terms of *features* and relationships among them. An FM represents allowed combinations of features (for details, see Chapter 2).
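The smartwatch example can be made concrete with a few lines of code. The sketch below is our own illustration; the feature names and the alternative screen group are assumptions, not the book's exact model. It enumerates the configuration space by brute force; a real configurator would hand this check to a CSP or SAT solver instead.

```python
from itertools import product

# Hypothetical Boolean features of the smartwatch example; the names and
# the screen alternative group are illustrative assumptions.
FEATURES = ["smartwatch", "payment", "standard_screen", "highres_screen"]

def is_valid(cfg):
    """Check the informal constraints of the running example."""
    if not cfg["smartwatch"]:          # the root feature is always included
        return False
    if cfg["standard_screen"] == cfg["highres_screen"]:
        return False                   # exactly one screen type (alternative)
    if cfg["standard_screen"] and cfg["payment"]:
        return False                   # standard screen excludes payment
    return True

def all_configurations():
    """Enumerate the allowed configurations by brute force."""
    for values in product([False, True], repeat=len(FEATURES)):
        cfg = dict(zip(FEATURES, values))
        if is_valid(cfg):
            yield cfg

for cfg in all_configurations():
    print({f for f, v in cfg.items() if v})
```

Brute-force enumeration only works for toy models: with *n* features there are 2^*n* candidate assignments, which is exactly why the formal CSP/SAT encodings discussed in Chapter 2 matter.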

Our major motivation for writing this book is that Artificial Intelligence (AI) methods and techniques play an increasingly important role in variability management processes where FMs are a central element [7, 8]. On the basis of an analysis of existing AI approaches in software product lines and knowledge-based configuration, we discuss these approaches in the context of the identified categories of (1) feature modelling (Chapter 2), (2) FM analysis (Chapter 3), (3) interacting with FM configurators (Chapter 4), and (4) related tools and applications (Chapter 5).

Another motivation was to move towards a more integrative view on the topics addressed in different communities: (1) the SPL community, exemplified by the Software Product Line Conference (SPLC) and the Working Conference on Variability Modelling of Software Intensive Systems (VaMoS), and (2) the knowledge-based configuration community, exemplified by the Workshop on Configuration (ConfWS). With this, we expect to foster more intensive cooperation in related fields and also to indicate relevant open research issues to these communities.

### **1.2 A Short History of Feature Models**

FMs and software product lines are key software engineering technologies for producing highly configurable software products. The concept of FMs was introduced in the early 1990s by Kang et al. in their seminal 1990 paper "Feature-Oriented Domain Analysis (FODA) Feasibility Study" [39]. Earlier work on similar topics laid the groundwork for modern software product line engineering approaches. McIlroy's 1969 paper on "Mass Produced Software Components" is probably the most important seminal paper [42]. Its key idea was that customized software should be industrialized as in other domains such as hardware.

Following basic FMs, different variants thereof were proposed in the late 1990s and 2000s [9, 29], ranging from cardinality-based FM representations [19] to FMs taking into account feature attributes [11, 39].

An increasing adoption of feature modelling and software product lines in industry could be observed starting in the early 2000s [9, 29]. From that time on, FMs became a central element of reuse-driven development processes for highly-configurable software systems [54]. This increasing industrial relevance was observed in industries such as electronic components and car manufacturing [13].

Following the adoption of FMs in industry, tool support has steadily improved (and is still improving), resulting in various tools/frameworks and applications (see, e.g., Meinicke et al. [43] and Beuche [14]). Examples of related open source and commercial tools are discussed in Chapter 5.

Nowadays, FMs in the context of software product lines (SPLs) are in widespread use in industry as well as in academia, with applications [35] ranging from operating systems [54], software systems for controlling trains on various hardware platforms and in different countries [1], automotive systems [20, 22, 64], and synthetic biology [17] to software product lines for large telescope control software [32], just to mention a few. In the late 2010s and 2020s, the ever-increasing popularity of Artificial Intelligence methods – specifically, machine learning (ML) – also had an enormous impact on feature modelling and variability management research. Examples of related research are the application of ML to personalized FM configuration [27, 50, 52] and configuration space learning [31, 48]. The integration of AI with FMs is the central topic of this book and will be discussed throughout Chapters 2–5.

Similar to SPLs, which help to customize software systems, *product configuration* is about customizing hardware (and more). Felfernig et al. [26] relate it to the mass customization paradigm [37], which is based on the idea of the customer-individual production of highly variant products under near mass production pricing conditions. Sabin and Weigel [55] define configuration as a *special case of design activity where the artifact being configured is assembled from instances of a fixed set of well-defined component types which can be composed conforming to a set of constraints*. Configuration has been one of the most successfully applied technologies of AI for several decades and in many application domains [26, 55, 58].

We want to emphasize that, specifically in the context of highly-configurable products, configuration solutions were already developed throughout the 1970s and 1980s, among others in the context of the R1/XCON computer configurator [5]. These systems focused on rule-based knowledge representation and reasoning, resulting in serious efforts in configuration model development and maintenance. At the same time as initial versions of FM languages were developed [39], configuration knowledge representation and reasoning moved away from rule-based representations to so-called model-based knowledge representations allowing a clear separation of product domain and reasoning knowledge (e.g., in terms of search heuristics). Related emerging (model-based) reasoning techniques such as constraint solving [53] and SAT solving [15] became established in both fields of research, i.e., feature modelling [6, 39] and knowledge-based configuration [26, 45, 55, 58].

Although the research communities of feature modelling and knowledge-based configuration were established in parallel and in many cases work on similar topics, we can observe an increasing degree of cooperation which appears to be fruitful for both research communities [8]. Today, *SPLC* and *VaMoS* can be regarded as the major scientific conferences for FM-related topics, whereas the *Configuration Workshop* (*ConfWS*) is the platform for research on topics of knowledge-based configuration (and beyond). In 2022, *SPLC* and *ConfWS* were co-located for the first time.<sup>1</sup>

<sup>1</sup> https://2022.splc.net/

### **1.3 The Role of AI in Feature Models**

Artificial Intelligence (AI) plays an increasingly important role in different FM-related tasks [27]. An overview is given in Table 1.1. We distinguish between the tasks of (1) *feature modelling*, (2) *FM analysis*, and (3) *FM configuration* (i.e., interacting with configurators). These topics are discussed in Chapters 2–4. We now discuss the concepts of Table 1.1 in more detail.


Table 1.1: Artificial Intelligence (AI) aspects covered in this book and relationships to *feature modelling*, *analysis*, and *configuration* (*interacting with configurators*).

**AI Aspects.** Our categorization of the different AI areas relevant in the context of FMs is based on the following scheme. First, the role of *knowledge representation & reasoning* (KRR) [63] is to develop appropriate concepts and languages for representing variability properties and to support efficient problem solving (reasoning) procedures. Examples of related AI techniques are (1) knowledge graphs [34] and answer set programs [46], and (2) SAT solving [15], constraint solving [53], and rule-based reasoning [30]. Second, following the idea of *explainable AI* (XAI) [21], solutions as well as problems (e.g., inconsistencies) need to be explained such that users understand why a specific configuration has been proposed or why no solution could be found. Examples of AI techniques supporting such tasks are argumentation [12], conflict detection [38], and model-based diagnosis [51]. Finally, different types of *machine learning* (ML) [44] approaches can help to support *prediction* (e.g., what will be the maximum price accepted by the user?) and *classification* tasks (e.g., will a specific feature be of interest to the user?). Machine learning is applied to provide a personalized user experience in feature modelling (see, e.g., [27]).

**Feature Modelling.** To be applicable in FM configuration, FMs have to be designed (typically, this is performed on a graphical level) and then translated into a corresponding formal representation (FM formalization) that is a basis for the follow-up tasks of FM analysis and FM configuration (interacting with configurators). FM formalization can be based on different AI-based approaches such as SAT solving, constraint solving, and rule-based reasoning. For demonstration purposes, we will focus our discussions on constraint-based representations; however, most of the discussed concepts can also be applied with the mentioned alternatives. FM design and FM formalization will be discussed in Chapter 2. FM design also depends on decisions regarding the inclusion of specific constraints, for example, regarding the combination of individual features, and also on decisions regarding the inclusion or exclusion of features. An important task in this context is *product line scoping*, which entails methods and techniques helping to figure out relevant features and corresponding constraints describing the envisioned (software) product line. Example AI techniques that can be used in this context are explanations (that help to understand inclusion and exclusion decisions), conflict detection (pointing out inconsistencies that need to be resolved), and diagnosis (in which way conflicts should be resolved to develop consensus regarding the final shape of an FM). Some explanation-related aspects of product line scoping will be discussed in Chapter 2. Finally, to ensure efficient solution search, AI techniques can also be used to support developers in optimizing FM configurator search heuristics and in predicting the performance of FM configurations (see Chapter 2).

**FM Analysis.** This task covers different aspects of assuring the quality of FMs with regard to aspects such as model consistency and FM complexity (e.g., in terms of the number of supported solutions). Some of the related analysis operations can be performed without solver support (e.g., counting the number of features and constraints), while other operations require solver support (e.g., checking model satisfiability, counting or approximating the number of supported solutions (configurations), and checking if some of the features are dead, i.e., cannot be included in any configuration). Different types of FM analysis operations and their relevance in modelling contexts are discussed in Chapter 3. Besides the mentioned analysis operations, FMs can also be tested with regard to conformance with the underlying application domain. In this context, FM development and maintenance can be supported with different types of diagnosis and repair functions that help to locate the sources of inconsistent FM behaviors. FM inconsistencies could also be predicted using ML approaches. Related aspects are discussed in detail in Chapter 3.
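As a minimal sketch of such analysis operations, the following code counts the supported configurations and detects dead features by enumeration. The toy model is our own and is deliberately contradictory (the names `root`, `a`, `b` and their constraints are assumptions for illustration, not from the book); solver-backed analyses of this kind are the subject of Chapter 3.

```python
from itertools import product

# A deliberately contradictory toy model (names are ours): the root is
# mandatory, 'a' requires 'b', but 'b' excludes 'a' -- which makes 'a' dead.
FEATURES = ["root", "a", "b"]

def is_valid(cfg):
    if not cfg["root"]:
        return False
    if cfg["a"] and not cfg["b"]:   # cross-tree constraint: a requires b
        return False
    if cfg["b"] and cfg["a"]:       # cross-tree constraint: b excludes a
        return False
    return True

def analyse():
    """Count supported configurations and detect dead features."""
    configs = []
    for values in product([False, True], repeat=len(FEATURES)):
        cfg = dict(zip(FEATURES, values))
        if is_valid(cfg):
            configs.append(cfg)
    # a feature is dead if it appears in no supported configuration
    dead = [f for f in FEATURES if not any(c[f] for c in configs)]
    return len(configs), dead

count, dead = analyse()
print(count, dead)
```

A model with zero supported configurations would be *void*; here the model is satisfiable but feature `a` is dead, which in practice usually signals a modelling error worth diagnosing.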

**Interacting with Configurators.** FM configuration is typically supported by tools denoted as configurators. These tools are based on a formal knowledge representation such as constraint satisfaction problems (CSP) or Boolean satisfiability problems (SAT) supported by corresponding reasoning engines. In addition to the identification of a solution, some scenarios require optimization functionalities, for example, minimizing the overall price of a configuration. Concepts supporting interactive configuration scenarios are discussed in Chapter 4. In FM configuration, diagnosis algorithms are needed, for example, to support the minimization of configurations (only relevant components should be included) or the identification of repair actions to find ways out of the *no solution could be found* dilemma. Also in this context, alternative repairs can be ranked, which may require the integration of machine learning, more specifically recommendation concepts, that help to identify the most relevant trade-offs, i.e., trade-offs with a high probability of being accepted by the user. Finally, in situations where users are not sure about the inclusion of specific features, recommendation techniques can help to support the user in terms of recommending reasonable inclusions or exclusions. Related techniques and detailed examples are provided in Chapter 4.
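The *no solution could be found* dilemma can be illustrated with a minimal diagnosis sketch: find the smallest set of user decisions whose retraction restores consistency. This is our own brute-force illustration, assuming the standard-screen/payment incompatibility of the running example; real configurators use dedicated diagnosis algorithms rather than searching all subsets.

```python
from itertools import product, combinations

def consistent(requirements):
    """True if some full configuration satisfies all user requirements."""
    features = ["standard_screen", "highres_screen", "payment"]
    for values in product([False, True], repeat=len(features)):
        cfg = dict(zip(features, values))
        if cfg["standard_screen"] == cfg["highres_screen"]:
            continue                      # exactly one screen type
        if cfg["standard_screen"] and cfg["payment"]:
            continue                      # standard screen excludes payment
        if all(cfg[f] == v for f, v in requirements.items()):
            return True
    return False

def minimal_diagnosis(requirements):
    """Smallest set of user decisions whose retraction restores consistency."""
    items = list(requirements)
    for k in range(len(items) + 1):
        for retracted in combinations(items, k):
            kept = {f: v for f, v in requirements.items()
                    if f not in retracted}
            if consistent(kept):
                return set(retracted)
    return set(items)

# An inconsistent selection: a standard screen together with payment.
print(minimal_diagnosis({"standard_screen": True, "payment": True}))
```

Retracting either decision would repair the selection; ranking such alternative repairs by their likely acceptance is where the recommendation techniques of Chapter 4 come in.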

Finally, in Chapter 5 we discuss the practical relevance of the FM-related tasks summarized in Table 1.1 by providing and discussing links to different FM-based tools and applications.

### **1.4 Topics Related to Feature Models**

Several topics have a direct relation to feature modelling related research. These topics will be touched upon in one way or another in this book.

**Software Product Line Engineering** focuses on the development of a software codebase (with an emphasis on reuse) that represents a family of related products with variabilities and commonalities [2, 3, 18, 49, 56].

**Knowledge-based Configuration** has overlaps with FMs [39] regarding research topics and research results [26, 45, 55, 58]. *Knowledge representation & reasoning* [63], *explainable AI* [21], and *machine learning* [44] are AI research fields that play an increasingly important role in configuration tasks. Our discussion focuses on the application of these research fields in various feature modelling related tasks.

**Recommender Systems** [23, 28, 52, 62] support the identification of user-relevant items from large assortments defined, for example, by product catalogs or configuration knowledge bases. These systems combine different AI techniques such as machine learning [44] and explanations [21] to provide a personalized user experience when users are confronted with complex item spaces.

**Mass Customization** is the production of highly-variant products and services under mass production pricing conditions [24, 37]. Software product lines extend the application of the mass customization paradigm to the area of software engineering with similar related tasks and research issues. Intangibility, high complexity, and a higher degree of adaptability (compared to physical products) make related management and implementation tasks even more demanding. A phenomenon related to mass customization is *mass confusion* [36], referring to cognitive overload of customers triggered by a high number of configuration choices. Different machine learning concepts, such as recommender systems, that can help to tackle the challenges induced by mass confusion are discussed in this book.

**Human Decision Making.** This aspect is highly relevant, specifically in the context of FM configuration. Knowledge about how humans decide, which basic types of shortcuts are used in human decision making, and in which way humans prefer to state their preferences must be taken into account when developing configurator user interfaces [4, 60]. Importantly, decisions are often made in groups, for example, in the context of product line scoping [41]. Group members need to decide which features and constraints should be included in a feature model, i.e., decide about very specific variability properties. In this context, group decision support is needed to help the group find a good solution [25, 40, 59].

### **1.5 Benefts of Feature Models and Confguration**

Feature models (FMs) are a key enabling technology for supporting variability management in software development (and other types of tasks such as the configuration of physical products). The challenges of variability management and the related benefits of FMs and configuration technologies can be summarized as follows.

**Efficient variability model development & configuration.** FMs are in many cases based on a graphical representation understandable for technical experts (e.g., configurator developers) as well as domain experts (e.g., product development), which is of specific relevance in early stages of a software development process. For this reason (both parties are able to "speak" the same language), the so-called knowledge acquisition bottleneck can be reduced in terms of lower communication overheads between developers and domain experts. For the same reason, domain know-how can be increased, resulting in a kind of *corporate variability knowledge memory* [33]. Due to the systematic representation of software variabilities (using FMs), corresponding configurations can be derived in an efficient fashion, helping to reduce software development efforts and the corresponding time to market [16, 61].

**Avoiding erroneous and suboptimal FM configurations.** FMs represent the variability properties of the underlying software product line (and beyond). When reusing and integrating individual software components, it is extremely important to assure the correctness of configurations, i.e., we want to avoid situations where incompatible software features result in faulty or at least low-performance behavior when being installed on the customer site. Beyond promoting correctness, FMs can also help to reduce *lead times*, since configuration processes and configuration completion can be automated, i.e., are no longer a manual and time-consuming process.

**Understanding the configuration space** (and its set of possible solutions). Knowledge about the configuration space can help to figure out weaknesses in terms of configurations leading to low system performance and to identify configurations assuring stable runtime performance [48]. Furthermore, configuration space understanding can help to better understand potential impacts of the supported configuration space on corresponding sales and production processes – this holds for physical products as well as reusable software components.

**Efficient testing.** FMs represent the software configuration space of software product lines and can also be used to support the systematic generation of test cases, for example, to achieve specific test coverage criteria [57]. Since FMs can easily be translated into a corresponding formal representation, basic FM properties can be easily analyzed, for example, *whether every feature is part of at least one configuration*.

**Avoiding mass confusion.** When configuring complex items, it cannot be guaranteed that users know in detail every offered feature. In some cases, for example, when selling software services online, a cognitive overload can lead to situations where users (customers) refrain from making a purchase decision. In such contexts, FM configuration (often combined with corresponding personalization services) can support users in identifying and also explaining the most relevant configurations. For the company itself, FMs can be regarded as a kind of *corporate memory* assuring the explicit representation of the variability properties of the offered software (as well as products and services).

**Using a common language in different domains.** Although there are many different FM dialects, most of them share a common way of expressing commonalities and variabilities. The same language can be used in different application domains, which facilitates engineering activities. There is a community effort to develop a *Universal Variability Language* (UVL) [10] (see Chapter 2) to encourage knowledge sharing while promoting open science principles.

### **1.6 Book Overview**

The remainder of this book is organized as follows:


### **References**

1. M. Abbas, R. Jongeling, C. Lindskog, E. Enoiu, M. Saadatmand, and D. Sundmark. Product Line Adoption in Industry: An Experience Report from the Railway Domain. In *24th ACM Conference on Systems and Software Product Line: Volume A*, pages 1–11. ACM, 2020.


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 2 Feature Modelling**

**Abstract** In this chapter, we describe the basics of *feature models* (FMs) using graphical as well as textual representations. We introduce a *smartwatch* FM that will be used as a working example in this and later chapters. Based on this example, we describe feature modelling extensions using cardinalities and attributes. In the following, we show how FMs can be translated into a formal representation (constraint satisfaction problems and SAT problems) and introduce corresponding definitions of an *FM configuration task* and a corresponding *FM configuration* (also known as configuration, product, or solution). Finally, we discuss example machine learning (ML) approaches that can be applied in the context of feature modelling tasks.

### **2.1 Features, Products, and Confgurations**

A natural way of describing any product is in terms of features. A *feature* is an increment in product functionality [9, 11, 10]. If I want a Chinese wok, I have to decide whether I want rice or noodles, duck or prawns or both, or whether I want a very spicy sauce. Similarly, if I want a smartwatch, I have to decide on the list of features I want. I may want to have sport tracking support or a concrete screen type, or maybe I want my watch to allow me to pay in shops. Those are all examples of features. Similarly, in software engineering, a product is not described in terms of technical details about the way it is developed (e.g., what specific object-oriented pattern was used to develop part of a package); it is described to the general audience in terms of features. In systems and software engineering, the size of features is arbitrary, i.e., a feature can be of any size depending on the level of abstraction [10]. For example, a feature can be a set of classes and methods, but a feature can also be described at the level of architectural elements, depending on the granularity of the scope of the given product line. In this book, we will consider a feature at any level of abstraction, i.e., the concepts, tools, and processes discussed in the book can be adapted to the needed abstraction level.

There are several definitions of what a *feature* is [20] – from more abstract to more technical ones (see below). For example, a very general and abstract definition is given by Kang et al. [45] in the seminal 1990 work on feature modelling. In contrast, Apel et al. [5] provide a definition focusing more on technical aspects.

"*A prominent or distinctive user–visible aspect, quality, or characteristic of a software system or systems*" – Kang et al. [45].

"*A structure that extends and modifies the structure of a given program in order to satisfy a stakeholder's requirement, to implement and encapsulate a design decision, and to offer a configuration option*" – Apel et al. [5].

In this book, we consider a feature as *an increment in (program/product) functionality*. A product could be software, hardware, or both.

A complete list of features describes a configuration of a product. Some features are *implicit* to a product, that is, they cannot be decided or selected by the user, while other features can be decided by the user. For instance, I can decide whether I want a sweet sauce in a Chinese wok, but I may not be able to decide on the concrete kind of rice flour in case I select a wok with rice noodles. Similarly, I may be able to decide on a concrete screen size but not on the specific sport tracking technology in case I decide to include that feature. This differentiation is often referred to as *internal* versus *external* variability [65]. External variability concerns domain artefacts or assets that are visible to customers or stakeholders, while internal variability is handled internally by the organization and is not visible to external stakeholders.

Some feature combinations are allowed while others are forbidden. The allowed feature combinations are determined by a model representing features and the constraints among them. For instance, I can decide to have a wok with rice, but then I cannot have noodles. The ingredient constraints are determined by the wok menu. Similarly, I can decide to have a standard screen in my smartwatch, but then I will not be able to have the payment feature. The allowed feature combinations (a.k.a. configurations) are defined by an FM.

The process of selecting and deselecting features when customizing (configuring) a product is known as the *configuration process*. The final result of a configuration process is a *configuration*, which is a *complete configuration* (a.k.a. *full configuration*) if all decisions regarding feature inclusion or exclusion were made during the configuration process, or a *partial configuration* if some decisions were not made and not all features were selected or deselected. In the former example, one can decide to get a noodle wok without being sure about fish or meat, which makes for a partial configuration of a wok product; or there can be a full description of the wok with all features selected or deselected, which makes for a complete configuration of a wok product. Similarly, I can be sure that I want a smartwatch including sports tracking but be unsure about the tracking type.

The number of allowed configurations (a.k.a. the *configuration space*) grows with the number of features. If a model has n optional features with no constraints among them, it has 2<sup>n</sup> distinct configurations. Let us stop here for a moment to understand the complexity of the problem we can face when configuring products with many options. It is estimated that the observable universe contains around 10<sup>80</sup> atoms. This is a very large number. If we take a highly complex configuration system, the Linux kernel, we find that the number of configuration options has been growing in the past, and it is easy to imagine that it will keep increasing in the future. Kernel versions 5.0, 4.0 and 3.0 have around 16,000, 14,000 and 11,000 configuration options, respectively [47]. If those configuration options were only Boolean (i.e., an option can be set or not, with no further choices), which is not always the case, the number of potential configurations of the Linux kernel would be in the order of 10<sup>4,816</sup>, i.e., we would need around 10<sup>4,736</sup> universes to store all the configurations of the Linux kernel if each configuration could be stored in a single atom – these numbers are huge. In such cases, we often talk about *colossal configuration spaces* [42, 62].
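To get a feel for these numbers, the back-of-the-envelope arithmetic above can be reproduced in a few lines of Python (the feature counts are the approximate figures cited above; the helper names `config_space_size` and `decimal_exponent` are our own, for illustration):

```python
def config_space_size(num_boolean_features: int) -> int:
    """Upper bound on the configuration space of a model with
    independent (unconstrained) Boolean features: 2^n."""
    return 2 ** num_boolean_features

def decimal_exponent(n: int) -> int:
    """Order of magnitude of n, i.e., floor(log10(n)); exact for ints."""
    return len(str(n)) - 1

# A toy model with only 10 optional features already has 1,024 configurations.
print(config_space_size(10))            # 1024

# Linux kernel 5.0: ~16,000 Boolean options -> ~10^4816 configurations.
linux = config_space_size(16_000)
print(decimal_exponent(linux))          # 4816

# Atoms in the observable universe: ~10^80, so we would need
# ~10^4736 universes to store one configuration per atom.
print(decimal_exponent(linux) - 80)     # 4736
```
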

An *FM configurator* is a tool that allows configuring an FM (during application engineering, c.f. Figure 2.2) such that a product can be produced. Such a tool takes the FM as input and permits stakeholders to define the product they want by selecting (including) or deselecting (excluding) features. A configurator verifies whether the feature selection is legal. In our *wok* example, the real-world counterpart of the configurator is the combination of wok menu, waiter, and cook, who ensure that the selected wok features are allowed before starting the "production" process. A configurator for our *smartwatch* example FM (see Figure 2.3) can be a software tool that offers a web interface with options to select and deselect features, taking into account domain and application constraints. We will see a related example in the next sections.

Figure 2.1 provides an overview of feature modelling related activities discussed in this chapter. First, the goal of *FM design* is to build an FM as a basis for follow-up configuration activities. FM design can be supported by (1) *FM (product line) scoping*, which helps to identify those features which can be regarded as relevant and should be taken into account in the product line, (2) *configuration space learning*, which helps to identify basic solver search heuristics for making search processes efficient, and (3) *knowledge extraction from data*, which helps to identify FMs or parts thereof in an automated or semi-automated fashion (e.g., the automated identification of features from requirements specifications). For the purpose of enabling FM configuration, FMs have to be translated into a corresponding logic-based representation, for example, as a constraint satisfaction problem (CSP) or SAT problem. All of these aspects will be discussed in detail in the follow-up sections.

Fig. 2.1: Feature modelling related activities and mapping to a corresponding logical representation (e.g., as a constraint satisfaction problem (CSP) or SAT problem) or textual representation for knowledge exchange purposes (ids in brackets refer to the corresponding subsection).

### **2.2 Feature Modelling in the Engineering Process**

Feature modelling is used as a pivotal part in software product line engineering [4, 63] and can be applied to different contexts. There are several proposals for engineering software product lines. In this book, we follow a simplified and practical process proposal described by Apel et al. [4]. We distinguish four main activities that are shown in Figure 2.2. Software product line engineering activities are associated with two different dimensions. The vertical dimension distinguishes between *domain engineering* and *application engineering* (upper and lower part of Figure 2.2). The horizontal dimension distinguishes between *problem space* and *solution space* (left and right part of Figure 2.2).

**Domain engineering** develops reusable assets but not final products and has two different sub-processes: *domain analysis* (in the problem space) and *domain implementation* (in the solution space). Domain analysis identifies features in the scope of the product line and produces an FM that represents the allowed feature combinations. Domain implementation is the development of reusable assets to be used in application engineering.

Imagine a product line of wok dishes: during domain analysis, the first step would be to identify the ingredients and choices that we want to offer in the menu. Also, the constraints among these elements have to be identified. Similarly, during domain implementation, some pre-cooking of ingredients and preparation can be done to be reused later during application engineering. Components, platforms, APIs, libraries, documents, test cases, and in general any artefact that can be reused later in the production process are outputs of the domain engineering process. Most of the assets are common to all products, which is why product line engineering is a good approach when there are commonalities among the products of a concrete domain. A central artefact in the domain engineering process is the FM.<sup>1</sup> There is a mapping between features in the FM and implemented artefacts in the solution space.

Fig. 2.2: Software product line engineering process based on [4]. This book focuses on *domain analysis* (feature modelling) and *requirements analysis* (configuration).

**Application engineering** produces a product based on a set of feature inclusions and exclusions (a.k.a. FM configuration) defined with an FM configurator. Reusable elements developed in domain engineering are used together with specific needs to produce a concrete product. An FM configurator is built using the FM that was designed during domain engineering and provides a user interface for interacting during the feature selection process. The output of this process is an FM configuration. During application engineering, there are two sub-processes: *requirements analysis* and *product derivation*. In this context, requirements analysis identifies the application requirements taking into account the user needs. For that, an FM configurator helps with the correct inclusion and exclusion of features in a step-wise process. New requirements can affect the domain analysis process when features have to be added, changed, or removed. Product derivation takes an FM configuration and the corresponding (implemented) reusable artefacts as input and assembles a product that conforms to the application requirements and fulfills user needs.

<sup>1</sup> Other variability modelling approaches have also been proposed in the literature, such as OVM, CVL, COVAMOF, decision modelling, and others [16, 28].

In the wok example, application engineering is the process of selecting items with the help of the menu and the waiter, as well as the process of preparing the dish and delivering it to the customer.

The *problem space* is distinguished from the *solution space* (left and right side of Figure 2.2). The problem space takes the perspective and vision of the external stakeholders, the context restrictions, and, in general, the domain knowledge. In contrast, the solution space takes the perspective of internal stakeholders such as managers, developers, and testers. The problem space is the *what*, while the solution space is the *how*: *what* features to offer and *what* products to build versus *how* features are implemented and *how* products are built.

In this book, *we concentrate on the left hand side of the process* (i.e., the problem space) where feature modelling plays a major role. There are different alternatives for implementing features and producing products from existing features, ranging from templates and #if...#elsif...#endif conditional compilation directives to modules of feature-oriented programming. For details on the solution space process, we refer to other books such as the ones by Apel et al. [4] or Meinicke et al. [57]. In the problem space, features are the key concepts to organize the domain knowledge.

In the following, the main concepts of feature modelling are defined. In Chapter 3, FM analysis is explained (mostly used in domain analysis) and in Chapter 4, the FM configuration process will be discussed (mostly used in requirements analysis).

### **2.3 Feature Model Basics**

The term "*feature model*" was coined by Kang et al. in the FODA report back in 1990 [45]. Feature modelling has been one of the main lines of research in software product lines ever since [39]. There are different FM languages [13, 76]; we review the most well-known dialects of those languages. In general, there is no FM language that fits all scenarios, and often some concrete adaptations have to be made [3].

An FM is a compact representation of all possible configurations of a product line. FMs are widely used in software product line engineering, but they can also be used in other domains such as video encoding [3], security information [56], biological information [18], or representing exam options [50], just to mention a few diverse examples. The holy grail of the SPL community is the Linux operating system FM, which has thousands of modules and configuration options.

Figure 2.3 shows our running example of a *smartwatch* product line encoded using a common FM notation. An FM is composed of:

• a set of features, hierarchically organized in a tree, where each feature (except the root) has a parent feature;
• a set of relationships and cross-tree constraints restricting the allowed feature combinations.

Fig. 2.3: Example *smartwatch* FM used in the book.

**Classical FMs**. A feature diagram of a classical FM declares four relationships:

• **Mandatory**: the child feature must be included whenever its parent feature is included.
• **Optional**: the child feature may or may not be included when its parent feature is included.
• **Or**: at least one of the child features must be included when the parent feature is included.
• **Alternative**: exactly one of the child features must be included when the parent feature is included.

The root feature is included in all configurations. A feature can only be included in a configuration if the parent feature is included. In addition to the tree–like relationships between features described above, an FM can also contain cross–tree constraints between features – basic ones are the following:

• **Requires**: if feature A requires feature B, the inclusion of A implies the inclusion of B.
• **Excludes**: if feature A excludes feature B, then A and B cannot be included in the same configuration.

More complex cross-tree relationships have later been proposed in the literature, allowing constraints in the form of generic propositional formulas, e.g., "*A and B implies not C*" [9, 34].

**Abstract and concrete features**. In some cases, there is a distinction between *concrete* and *abstract* features. Concrete features have a relationship with domain implementation artefacts in the solution space (c.f. Figure 2.2), while abstract features are only used for organization purposes and do not have any direct mapping to any artefact in the solution space. It is often recommended to define only the leaves of the tree as concrete features and leave all other intermediate features as abstract ones [10]. For simplicity but without loss of generality, in this book we will not distinguish between concrete and abstract features but will consider all features as equal.

**Smartwatch example**. In the example of Figure 2.3, all smartwatches must include a *screen* (either *touch* or *standard*) and an *energy management* system (either *basic* or *advanced–solar*). Optionally, the *payment*, *gps*, and *sports tracking* features can be included. Furthermore, any combination of at least one of the features *running*, *skiing*, and *hiking* can be included when the *sports tracking* feature is selected.

Table 2.1: Examples of *satisfiable* (conf<sub>1</sub>) and *non-satisfiable* (conf<sub>2</sub>) FM configurations (in conf<sub>2</sub>, *payment* and *standard* cannot be included in the same configuration).


Table 2.1 shows satisfiable and non-satisfiable configurations for our example FM (see Figure 2.3). Configuration conf<sub>1</sub> is satisfiable because it does not violate any of the FM constraints. In contrast, conf<sub>2</sub> is non-satisfiable because both the *payment* feature and the *standard* screen are included. These features are incompatible due to the excludes constraint. Therefore, conf<sub>2</sub> is non-satisfiable.

From FMs, configuration tools (a.k.a. *FM configurators*) are constructed (see Chapter 5). An FM configurator is a tool to select and deselect features interactively while checking consistency, i.e., either not allowing non-satisfiable configurations or alerting the user about a potential inconsistency. A possible user interface for configuring smartwatches is the one shown in Figure 2.4. There are groups of features that can be configured by selecting and deselecting features. The FM configurator has to take care to only allow satisfiable configurations and advise the user in case a misconfiguration is produced. For that, analysis of and interaction with FMs are needed – these topics will be addressed in Chapters 3 and 4.
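As a minimal sketch of the consistency check such a configurator performs, each cross-tree constraint can be represented as a predicate over the set of currently included features. The two constraints below are taken from the smartwatch example; the function name `is_consistent` is our own, for illustration:

```python
# Each constraint is a predicate over the set of included feature names.
constraints = [
    # payment requires the touch screen
    lambda inc: "touch" in inc if "payment" in inc else True,
    # touch and standard screens are alternatives (mutually exclusive)
    lambda inc: not ("touch" in inc and "standard" in inc),
]

def is_consistent(included: set) -> bool:
    """True iff the current (possibly partial) selection violates no constraint."""
    return all(c(included) for c in constraints)

print(is_consistent({"smartwatch", "payment", "touch"}))     # True
print(is_consistent({"smartwatch", "payment", "standard"}))  # False
```

A real configurator would additionally explain *why* a selection is inconsistent and suggest repairs – topics of Chapters 3 and 4.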


Fig. 2.4: Example *smartwatch* FM configurator.

### **2.4 Feature Model Extensions**

There are different proposals in the literature to extend or modify feature modelling with different FM constructs. The most well-known families of extensions are *cardinality–based* and *attribute–based* FMs. These extensions involve a long-standing discussion in the community regarding the semantics of feature cardinalities, cloning, and attributes. In this book, we will not address those problems in detail and refer the reader to related work [21, 46, 58, 59, 67, 75]. In any case, all techniques presented in this book are agnostic with respect to the way FMs are defined. These techniques can be equally used to analyse or configure classical, cardinality–based, or attribute–based FMs. In the following, we provide a short discussion of these extensions.

### **2.4.1 Cardinality–based Feature Models**

Cardinality–based FMs incorporate *cardinalities*, which resemble those found in the *Unified Modelling Language* (UML) (see [23, 69]). The relationships introduced in cardinality–based feature modelling are the following [12, 13]:

• **Feature cardinality.** A feature cardinality is a sequence of intervals [n..m] with n as lower bound and m as upper bound (n ≤ m). The intervals describe the number of instances of the feature that can be part of a configuration. This relationship may be used as a generalization of the original mandatory ([1, 1]) and optional ([0, 1]) relationships defined in FODA (Section 2.3).

• **Group cardinality.** A group cardinality is an interval ⟨n..m⟩, with n being the lower and m the upper bound (n ≤ m), limiting the number of child features that can be included in a configuration. An alternative relationship is equivalent to a ⟨1..1⟩ group cardinality. An or–relationship is equivalent to ⟨1..n⟩, n being the number of features in the relationship.
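The group cardinality semantics described above can be sketched as a small check, assuming the usual reading that an excluded parent forbids the inclusion of any child (the function and parameter names are our own, for illustration):

```python
def group_ok(parent_included: bool, selected_children: int,
             total_children: int, low: int, high: int) -> bool:
    """Check a group cardinality <low..high>: if the parent feature is
    included, between low and high of its children must be included;
    if the parent is excluded, no child may be included."""
    if not parent_included:
        return selected_children == 0
    return low <= selected_children <= min(high, total_children)

# Alternative group <1..1>: exactly one of the two children.
print(group_ok(True, 1, 2, 1, 1))   # True
print(group_ok(True, 2, 2, 1, 1))   # False

# Or group <1..n> with n = 3 children: at least one child.
print(group_ok(True, 2, 3, 1, 3))   # True
```
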

Figure 2.5 shows an example of the smartwatch FM using a cardinality–based notation. This FM represents the same *configuration space* (i.e., it represents exactly the same set of configurations) as the one in Figure 2.3.

Fig. 2.5: Cardinality–based FM example.

### **2.4.2 Attribute–based Feature Models**

To determine the cost or memory usage of a particular feature in a smartwatch configuration, *feature attributes* are needed. When FMs are expanded by including feature attributes, they are referred to as *extended*, *advanced*, or *attribute-based FMs*.

FODA [45], the seminal report on FMs, had a forward-thinking approach in considering the incorporation of more data into FMs. This involved introducing connections between features and their attributes, in addition to features and their relationships. Later, Kang et al. [44] made an explicit reference to what they call "non–functional" features related to feature attributes. There is no consensus on a graphical notation for attributes. However, most proposals agree that an attribute should consist at least of a *name*, a *domain*, and a *value*. Figure 2.6 depicts a sample FM including attributes, using a notation inspired by Benavides et al. [14]. As illustrated, attributes can be used to specify the price of a feature or the size of a concrete screen. Attribute–based FMs can also include complex constraints among attributes and features such as: "*If the attribute price of feature advanced solar is lower than a value X, then feature touch cannot be part of the configuration*". For instance, there can be a global constraint specifying that the price of a smartwatch is calculated as the sum of the prices of the selected features. Similarly, there can also be customer constraints that specify an upper bound on the price of a smartwatch.

Fig. 2.6: Attribute–based FM example.

More advanced configurators can be built when using attributes, cardinalities, and complex constraints. In this book, for the sake of simplicity, we only use classical FMs as described in Section 2.3. However, all the described techniques can also be applied to other FM types.

### **2.5 Feature Model Semantics**

To provide a semantics for FMs, the main concepts of the previous sections are now defined formally. We use propositional logic in the form of a CSP (Constraint Satisfaction Problem) [6, 72]. An FM is composed of two main elements:

• a set of features F (Definition 2.2), and
• a constraint model C (Definition 2.3).

**Definition 2.1** (Feature). A feature is the basic element of an FM, and it is assigned a value in an *FM configuration*. Boolean features that are true (⊤) are included; those that are false (⊥) are excluded. Non-Boolean features are possible (e.g., integers) but, for the sake of simplicity, we do not define them here.

**Definition 2.2** (Set of all features). The set of all features in an FM is F = {f<sub>1</sub>, f<sub>2</sub>, ..., f<sub>n</sub>}. Only the features in F can be part of the constraints of the *constraint model*.

**Definition 2.3** (Constraint model). A constraint model of an FM is a set of constraints C = R ∪ Π, where:

• R is the set of constraints representing the relationships of the feature tree (e.g., mandatory, optional, or, and alternative relationships), and
• Π is the set of cross–tree constraints, expressed as Boolean constraints over the features in F.<sup>2</sup>

**Definition 2.4** (Feature Model). A feature model (FM) is a tuple (F, C), where F is the set of all features and C a constraint model that uses only the features in F. The semantic domain of the FM is determined by the constraints in C and represents all the FM configurations (the FM configuration space).

**Definition 2.5** (Application requirement). An FM *application requirement* is a set of constraints REQ specifying specific preferences<sup>3</sup> of a stakeholder that have to be considered in an FM configuration, i.e., REQ = {r<sub>1</sub>..r<sub>q</sub>}.

**Definition 2.6** (FM Configuration). An *FM configuration* is an assignment A = {f<sub>1</sub> = v<sub>1</sub>..f<sub>n</sub> = v<sub>n</sub>} (v<sub>i</sub> ∈ dom(f<sub>i</sub>)) on the features of an FM represented as variables f<sub>i</sub> ∈ F. A is *satisfiable* if it does not violate any constraint in the FM and the application requirements (i.e., it does not violate the set C ∪ REQ – the *consistency* property). An FM configuration is *complete* (a.k.a. a full configuration) if every feature has an assignment describing an inclusion or exclusion, and it is *partial* otherwise.

**Definition 2.7** (FM configuration space). The set of all complete and satisfiable FM configurations of an FM represents the *FM configuration space*. Therefore, all configurations of the FM configuration space are satisfiable (i.e., they do not violate the set C ∪ REQ).

**Definition 2.8** (FM Configuration Task). An FM configuration task is a tuple (F, D, C) defined by a set F = {f<sub>1</sub>, f<sub>2</sub>, ..., f<sub>n</sub>} of features; corresponding domains for the features D = {dom(f<sub>1</sub>), dom(f<sub>2</sub>), ..., dom(f<sub>n</sub>)} (e.g., for classical FMs, dom(f<sub>i</sub>) = {true (⊤), false (⊥)}); and a set of constraints C = CF ∪ REQ consisting of the constraints restricting the set of possible solutions (CF) and a set of application requirements (REQ) as defined previously. In this context, CF = {c<sub>1</sub>..c<sub>k</sub>} and REQ = {c<sub>k+1</sub>..c<sub>m</sub>}.

For an FM configuration task, a constraint solver can be activated to find a corresponding solution (FM configuration). More details on the inclusion of solvers are provided in Section 2.6 and Chapters 3 and 4.
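For tiny models, the role of such a solver can be illustrated by brute-force enumeration over all Boolean assignments; real configurators delegate this to CSP or SAT solvers instead. The toy model below (a root with two mutually exclusive optional children) is our own illustration:

```python
from itertools import product

def solve(features, constraints):
    """Enumerate all complete, satisfiable FM configurations.
    `constraints` are predicates over a dict mapping feature -> bool."""
    for values in product([True, False], repeat=len(features)):
        conf = dict(zip(features, values))
        if all(c(conf) for c in constraints):
            yield conf

features = ["root", "a", "b"]
constraints = [
    lambda c: c["root"],                # the root is always included
    lambda c: not c["a"] or c["root"],  # a child implies its parent
    lambda c: not c["b"] or c["root"],
    lambda c: not (c["a"] and c["b"]),  # a excludes b
]
solutions = list(solve(features, constraints))
print(len(solutions))  # 3 configurations: {a}, {b}, and neither
```
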

The terms used in different research fields are similar but can lead to confusion. In this book, we defined an *FM configuration* in Definition 2.6 and an *FM Configuration Task* in Definition 2.8. In the literature, one can find related terms such as *product configuration*, *configuration*, *feature selection*, *feature combination*, *product*

<sup>2</sup> Let B denote the Boolean domain, B = {true (⊤), false (⊥)}, and B(F) a function denoting all possible Boolean constraints on the set of features F. Classical FMs use only Boolean features.

<sup>3</sup> Also known as user, stakeholder, or customer requirements.

*description*, *product specification*, or *staged configuration*, just to mention a few. To clarify these terms, we differentiate between *process* and *process result*.

**FM configuration process**: In this process, features are selected or deselected with the goal of finding a satisfiable FM configuration for a given FM configuration task. This process can be initiated once the FM is stable and is performed during application engineering in the problem space dimension (see Figure 2.2). The FM configuration process is also referred to as *product configuration* [57], *configuration process* [4, 57], *feature selection process* or, less commonly, *configuration setting*. During this process, the possibilities available to the user are often referred to as *configuration options*, *configuration alternatives*, or *configuration attributes* [4, 29]. If the FM configuration process is performed in several stages, it is sometimes called *staged configuration* [24].

**FM configuration**: This is the result of an FM configuration process and – following Definition 2.6 – is the result of a selection and deselection of features. Alternative terms may be used for FM configuration, such as *product*, *feature selection*, *feature combination*, *product description*, *product specification*, *product configuration*, or simply *configuration*. A product is often considered a complete FM configuration with a consistent feature selection. In this book, we will use the term *FM configuration* but also *configuration* for simplification purposes. Following Figure 2.2, we propose to use the term FM configuration because using the term *product* can be confusing, since a product is produced only after the product derivation process from a complete FM configuration.

There are other related terms that should not be confused. When an FM configuration has been created as a satisfiable set of included and excluded features (in the application engineering process), the product finally has to be produced in the solution space. This "production" is also denoted as *product derivation*, *product configuration* (in some software engineering contexts), *product generation*, or *product assembly* [4]. The techniques used for product derivation are different from the ones described in this book. Among those techniques, there is one that can cause confusion: *configuration parameters* [4]. Among the most common options, configuration parameters are passed through the command line, global variables, or values in a properties or requirements file. In this book, we will not use this term, which can be confused with *feature attributes* (see Section 2.4.2). Therefore, techniques, tools, and studies about configuration parameters are out of the scope of this book.

### **2.6 Mapping Feature Models to Logic**

Up to now, we have introduced formal definitions and indicated that an FM has a constraint model. Depending on the FM constructs, different constraint models can be built. In the following, we define the mapping from FMs to logic using constraint programming and SAT solving.

### **2.6.1 Constraint programming mapping**

A *Constraint Satisfaction Problem* (CSP) [6] consists of a set of variables, a set of finite domains for those variables, and a set of constraints restricting the values of the variables. *Constraint programming* is the set of techniques, such as algorithms and heuristics, that deal with CSPs. A CSP is solved by finding values for the variables (a.k.a. states) such that all constraints are satisfied. CSP solvers can deal not only with binary values (true or false) but also with numerical values such as integers, intervals, and symbolic domains (e.g., smartwatch price).

A CSP solver is a software package that takes a problem modelled as a CSP and determines whether a solution for the problem exists. From a modelling point of view, CSP solvers provide a richer set of modelling elements in terms of variables (e.g., sets, finite integer domains, etc.) and constraints (not only propositional connectives) than SAT solvers do.

The mapping of an FM into a CSP can vary depending on the concrete solver. In general, the following steps are performed: (1) each feature of the FM maps to a variable of the CSP with a domain of 0..1 (or *false, true*), depending on the kind of variables supported by the solver; (2) each relationship of the model is mapped into a constraint depending on the type of relationship (in this step, some auxiliary variables can appear); (3) the resulting CSP is the one defined by the variables of steps (1) and (2) with the corresponding domains and an additional constraint assigning true to the variable that represents the root, i.e., root ⇔ true or root == 1, depending on the variables' domains.
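Assuming a configuration is represented as a dictionary mapping feature names to Booleans, step (2) can be sketched by turning each relationship type into a constraint-building function. The Boolean encodings follow the usual FM semantics; the function names are our own and this is an illustration, not the mapping of any concrete solver:

```python
def mandatory(parent, child):
    # child <=> parent
    return lambda c: c[child] == c[parent]

def optional(parent, child):
    # child => parent
    return lambda c: (not c[child]) or c[parent]

def or_group(parent, children):
    # parent <=> (child1 or child2 or ...)
    return lambda c: c[parent] == any(c[ch] for ch in children)

def alternative(parent, children):
    # exactly one child is included iff the parent is included
    return lambda c: (sum(c[ch] for ch in children) == 1) == c[parent]

# Step (3): the root feature is forced to true.
root_constraint = lambda c: c["smartwatch"]

conf = {"smartwatch": True, "screen": True, "touch": True, "standard": False}
checks = [mandatory("smartwatch", "screen"),
          alternative("screen", ["touch", "standard"]),
          root_constraint]
print(all(chk(conf) for chk in checks))  # True
```
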

Concrete rules for translating an FM into a CSP using propositional logic are listed in Figure 2.7 (see also the original proposal of Benavides et al. [14]). Also, the mapping of our running example of Figure 2.3 is presented. A *propositional formula* consists of a set of primitive symbols or variables and a set of logical connectives constraining the values of the variables, e.g., ¬, ∧, ∨, ⇒, ⇔. It is important to remark that the root feature is part of any product, which is why an extra constraint is added to represent this property. Note that ⊕ is used to denote that only one feature of the set can be selected. Depending on the solver, this operator may not be available, and then a formula with other basic operators has to be built [12].

### **2.6.2 SAT based mapping**

A *SAT solver* is a software tool that works with a propositional formula in order to determine whether the formula is satisfiable, i.e., whether there is a variable assignment that makes the formula evaluate to true. Input formulas are usually specified in *Conjunctive Normal Form* (CNF) using formats such as DIMACS [17]. CNF is a standard form to represent propositional formulas that is used by most SAT solvers, where only three connectives are allowed: ¬, ∧, ∨, that is, logical negation, conjunction, and disjunction. It is well known, and was proved long ago, that every propositional formula can be encoded into an equivalent CNF formula.
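As an illustration of the DIMACS format, the following sketch encodes two cross-tree constraints of the smartwatch example as CNF clauses (with an arbitrary variable numbering of our choosing) and checks assignments against them by hand; no real SAT solver is involved:

```python
# Variable numbering (arbitrary): 1 = payment, 2 = touch, 3 = standard
#   payment requires touch   ->  clause (-payment OR touch)
#   touch excludes standard  ->  clause (-touch OR -standard)
clauses = [[-1, 2], [-2, -3]]
num_vars = 3

# DIMACS CNF: header "p cnf <#vars> <#clauses>", each clause ends with 0.
dimacs = "p cnf {} {}\n".format(num_vars, len(clauses))
dimacs += "".join(" ".join(map(str, cl)) + " 0\n" for cl in clauses)
print(dimacs)
# p cnf 3 2
# -1 2 0
# -2 -3 0

def satisfies(assignment, clauses):
    """assignment: dict var -> bool; a clause holds if any literal holds."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in cl)
               for cl in clauses)

print(satisfies({1: True, 2: True, 3: False}, clauses))   # True
print(satisfies({1: True, 2: False, 3: False}, clauses))  # False
```
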

Fig. 2.7: Mapping from an FM to a constraint satisfaction problem (CSP). In this context, the semantics of ⊕(f<sub>1</sub>, f<sub>2</sub>, ..., f<sub>n</sub>) is that exactly one of the features f<sub>1</sub>, f<sub>2</sub>, ..., f<sub>n</sub> must be included. For the sake of simplicity, we use the infix notation when only two features are involved.

Similarly, SAT is a well-known NP-complete problem. Nevertheless, due to extensive research in the SAT solving area [37], there have been big advances in the research and practice of SAT solving, which makes it possible to address many practical problems with efficient computing resource usage [17].

The mapping of an FM into a propositional formula can change depending on the used solver. In general, the mapping is performed in the following steps: (1) each feature of the FM maps to a variable of the propositional formula; (2) each relationship of the model is mapped into one or more small formulas depending on the type of relationship (in this step, some auxiliary variables can appear); (3) the resulting formula is the conjunction of all the resulting formulas of step (2) plus an additional constraint assigning true to the variable that represents the root. Rules for translating an FM into a propositional formula are listed in Figure 2.8. Further related work on specific – and potentially more efficient – SAT encodings can be found, for example, in Nguyen et al. [61] and Sinz [77].

There are other tools that also work with propositional formulas. One such tool, also used in FM analysis and configuration, is the BDD. A *Binary Decision Diagram* (BDD) solver is a software package that takes a propositional formula as input (in CNF or not) and translates it into a graph representation (the BDD itself). With this data structure, it is very easy to determine whether the formula is satisfiable, and there are efficient algorithms for counting the number of possible solutions [17]. The size of the BDD is crucial because it can grow exponentially in the worst case. Although it is possible to find a good variable ordering that reduces the size of the BDD, the problem of finding the best variable ordering remains NP-complete [42].

### **2.6.3 CSP example mapping**

The FM example of Figure 2.3 can be represented as a CSP. Table 2.2 shows a CSP representing an FM configuration task (see Definition 2.8): F as the set of features, D as the corresponding domains, and C as the union of the set of constraints of the FM and the constraints representing the application requirements of the smartwatch FM example (in this example, representing that only configurations with a specific feature included are desired, but of course other combinations could be introduced). Note that c<sub>0</sub>: smartwatch is a *root constraint*, which prevents the generation of empty configurations, i.e., configurations where no feature is selected. Following the already introduced formalizations, we apply the logical operators ⇒ denoting implication, ⇔ denoting equivalence, ∨ denoting logical *or*, ∧ denoting conjunction (logical *and*), and ⊕ denoting logical *xor*, indicating that only one feature can be selected from the given set; for example, a ⊕ b represents (¬a ∧ b) ∨ (a ∧ ¬b).
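The constraints sketched for the smartwatch FM can also be checked by exhaustive enumeration. The encoding below is our own illustrative reading of Figure 2.3 (with payment excluding the standard screen, and no application requirements), not the exact constraint set of Table 2.2:

```python
from itertools import product

features = ["smartwatch", "screen", "touch", "standard", "energy",
            "basic", "solar", "payment", "gps", "sports",
            "running", "skiing", "hiking"]

def ok(c):
    """True iff the complete assignment c satisfies all FM constraints."""
    return (
        c["smartwatch"]                                       # root
        and c["screen"] == c["smartwatch"]                    # mandatory
        and c["energy"] == c["smartwatch"]                    # mandatory
        and (c["touch"] + c["standard"] == 1) == c["screen"]  # alternative
        and (c["basic"] + c["solar"] == 1) == c["energy"]     # alternative
        and (not c["payment"] or c["smartwatch"])             # optional
        and (not c["gps"] or c["smartwatch"])                 # optional
        and (not c["sports"] or c["smartwatch"])              # optional
        and (any(c[f] for f in ("running", "skiing", "hiking"))
             == c["sports"])                                  # or-group
        and not (c["payment"] and c["standard"])              # excludes
    )

count = sum(ok(dict(zip(features, vals)))
            for vals in product([False, True], repeat=len(features)))
print(count)  # 96 complete satisfiable configurations
```

Under this encoding, the configuration space of the model is small enough (2<sup>13</sup> = 8,192 candidate assignments) for brute force; adding application requirements would simply add further conjuncts to `ok`.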

Note that throughout the book, we follow the formatting rule that (1) CF, i.e., the set of constraints and relationships of the FM, is defined without the explicit usage of the values {true, false}, for example, we write c0 : smartwatch also meaning

Fig. 2.8: Mapping from FM to CNF (SAT solving context).


Table 2.2: CSP mapping example.

c0 : smartwatch = true. In contrast, for understandability reasons, the values {true, false} are explicitly included when specifying customer requirements and FM configurations, for example, we write a requirement such as r11 : f = true also meaning r11 : f (an example of concrete FM configurations is given in Table 2.1).

### **2.7 Textual Languages for Feature Models**

Representing FMs as diagrams has always been possible. Indeed, the original FODA report provided a first graphical representation of FMs that has evolved little since. Graphical representations of FMs are usually known as *feature diagrams*. Most of the representations look similar to the ones presented in this chapter (see, e.g., Figure 2.3). There are others with different visual representations, but basically all express the same concepts.

In parallel, there has been a tendency to propose different textual representations of FMs [26]. There are different motivations for proposing a textual variability model language [15], among them: exchanging models to allow interoperability among tools as well as sharing among researchers and practitioners; teaching and learning using a common language that can be produced by programmers and displayed in any text editor; and allowing common analyses over the models with different tools.

**UVL**. The goal of the MODEVAR initiative<sup>4</sup> is to create a common language for variability modelling. A proposal towards a Universal Variability Language (UVL<sup>5</sup>) is being pushed forward [13, 79]. UVL is a textual variability language that can express

<sup>4</sup> https://modevar.github.io/

<sup>5</sup> https://github.com/Universal-Variability-Language

basic FMs and has extension mechanisms to provide enriched versions that include, for instance, cardinalities, attributes, and types.

UVL utilizes a tree-like structure to represent the hierarchical nature of FMs and tab-based indentation to separate concepts. To illustrate this, Figure 2.9 shows the UVL representation of our running example of the smartwatch product line. UVL includes several key concepts for specifying constraints, including mandatory and optional as well as the "or" and "alternative" relationships. Finally, cross-tree constraints are supported, allowing any arbitrary propositional constraint involving various features.
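As an illustration, the following fragment sketches how parts of the smartwatch example might look in UVL. The grouping and the cross-tree constraint are illustrative and do not reproduce the full model of Figure 2.9:

```
features
    smartwatch
        mandatory
            energymanagement
                alternative
                    basic
                    advancedsolar
        optional
            payment
            gps
constraints
    payment => gps
```

The indentation encodes the feature tree, while the `constraints` section holds arbitrary propositional cross-tree constraints.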

Fig. 2.9: UVL [13] FM example.

There exist other textual variability modelling languages [26]. Some of these are based on XML and others have their own syntax, such as TVL [19]. There was even an attempt to standardize a variability modelling approach at the OMG level. The approach was called the Common Variability Language (CVL) [41] but did not materialize as a real standard.

Recently, and still under the umbrella of the MODEVAR initiative, a repository of UVL models was released [13, 71].<sup>6</sup> This repository is designed following open science principles and allows the upload, search, and download of FM datasets. It is a central point for sharing FMs among practitioners and researchers using UVL.

**Other textual constraint languages**. In the constraint solving and configuration communities, there have also been efforts to propose textual languages for representing constraints or configuration problems. The motivations are similar. Among the most relevant proposals are DIMACS [38], MiniZinc [86], and XCSP [8].

The DIMACS format [38] is commonly used to represent SAT instances that can be interpreted by different SAT solvers. This format uses plain text and includes a collection of clauses that are represented as sequences of literals, which can be variables or their negations. The DIMACS format is widely adopted for benchmarking SAT solvers and sharing SAT problems in research studies. DIMACS has a compact

<sup>6</sup> https://www.uvlhub.io/

syntax and is not that easy for humans to understand since it is a plain text file comprising only numbers for the Boolean variables which represent the features. Each row forms a disjunction of possibly negated variables which represents a structural or cross-tree constraint (see Section 2.6.2). These simple syntactical rules make it easy for SAT solvers to process DIMACS files.
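To make the format concrete, here is a hypothetical three-variable DIMACS instance together with a brute-force satisfiability check in Python (the file contents are invented; a real SAT solver decides satisfiability far more cleverly):

```python
from itertools import product

# Hypothetical DIMACS file: "p cnf 3 3" is the header (3 variables,
# 3 clauses); "c" lines are comments; each clause line ends with 0.
dimacs = """c toy FM: 1=root, 2=mandatory child, 3=optional child
p cnf 3 3
1 0
-2 1 0
-3 1 0
"""

# Parse clauses: integers are literals, negative = negated variable.
clauses, nvars = [], 0
for line in dimacs.splitlines():
    if line.startswith(("c", "p")):
        if line.startswith("p"):
            nvars = int(line.split()[2])
        continue
    lits = [int(t) for t in line.split()[:-1]]  # drop trailing 0
    if lits:
        clauses.append(lits)

# Brute-force satisfiability check over all assignments.
def satisfiable():
    for vals in product([False, True], repeat=nvars):
        if all(any(vals[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

print(satisfiable())  # True
```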

MiniZinc [86] is a textual constraint modelling language that is used to specify CSPs. MiniZinc is open-source and allows one to describe a CSP in a textual fashion – CSPs formulated this way can be solved by different constraint solvers. It is designed to be solver-independent, enabling the user to switch between solvers easily. MiniZinc is used in various fields such as operations research, scheduling, and planning, and can also be used in product configuration. The syntax of MiniZinc is similar to a programming language, and its intention is that it can be produced and edited by humans. An example screenshot of the MiniZinc IDE including a constraint-based representation of our example *smartwatch* FM is shown in Chapter 5.
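A small hypothetical MiniZinc fragment in the style of an FM configuration task might look as follows (feature names come from the running example; the concrete constraints are illustrative, not the model shown in Chapter 5):

```minizinc
% Hypothetical FM-style CSP fragment in MiniZinc
var bool: smartwatch;
var bool: payment;
var bool: gps;

constraint smartwatch;            % root constraint
constraint payment -> smartwatch; % child implies parent
constraint gps -> smartwatch;
constraint payment -> gps;        % hypothetical requires constraint

solve satisfy;
```

Because MiniZinc is solver-independent, the same model can be handed to different backend solvers without modification.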

XCSP (XML Constraint Satisfaction Problems) [8] is a textual language used for specifying instances of combinatorial problems. XCSP provides a unified representation of various types of problems, including CSPs, combinatorial optimization problems, and scheduling problems. The format is based on XML, making it possible to parse and process it using standard software tools. XCSP supports a wide range of constraints, including global constraints, soft constraints, and constraints over finite domains or real numbers. The format has been adopted by some academic and industrial tools, and it is used for benchmarking, sharing, and comparing different constraint solvers. Being XML, the syntax is closer to a machine than to a human user. Nevertheless, it is readable by humans, too. In the remainder of this book, for understandability reasons, we will mostly use the graphical representation of FMs – all examples can be easily translated to UVL and processed by UVL-compatible tools.

### **2.8 Further Feature Modelling Aspects**

Up to now, we have focused on knowledge representation and formalization aspects in the context of feature modelling. Further related issues will be discussed in the following subsections. *First*, product line scoping [55] is related to the task of deciding which features should finally be included in the FM and – as a consequence – presented as options to a configurator user. In this context, we will focus on different *explanation* aspects which play a major role in such decision contexts. *Second*, *configuration space learning* [64] is directly related to the task of feature modelling: FMs can be formalized, for example, as a CSP [14]. In order to assure search efficiency of the constraint solver, machine learning (ML) techniques can be used to learn solver search heuristics in such a way that a configurator shows an acceptable runtime in most cases. *Third*, also in the context of designing FMs, machine learning techniques, for example, *knowledge extraction from data*, can help to automatically determine features or even constraints from textual requirements.

### **2.8.1 Product Line Scoping**

Deciding which features and constraints to include in an FM can be regarded as a basic task in the context of different product (line) scoping scenarios [43, 52, 55, 70]. In such scenarios, basic machine learning and decision support techniques can be used to support stakeholders such as product owners, sales managers, and domain experts in their decisions regarding the inclusion and exclusion of features in new versions of a product line. Specifically, recommendations need to be explained, which is a task related to the field of explainable AI (XAI) [25].

Table 2.3 includes a simplified example of a decision scenario regarding the inclusion and exclusion of our example smartwatch features. In this example, individual stakeholders si (i ∈ {1..3}) vote for or against the inclusion of a specific feature, where 1 indicates inclusion and 0 indicates exclusion. Such decisions about the inclusion and exclusion of features can be interpreted as a basic optimization problem with the goal to minimize the number of adaptations needed such that overall consensus can be achieved regarding each individual feature [30]. Such an optimization could also take into account aspects such as fairness and unequal weights of individual stakeholders (e.g., experts vs. non-experts regarding a specific feature [7]).



In Table 2.3, the recommendation (*rec*) shown to the group {s1, s2, s3} is based on *majority voting* [31]. When analyzing the preferences of the individual stakeholders regarding feature inclusion, we can observe that the stakeholders s1 and s3 have a basic consensus regarding the inclusion and exclusion of individual features. There is one exception, since s3 does not seem to support the inclusion of the *payment* feature. There might be different reasons for this preference, ranging from not understanding the feature to the more basic reason of not being aware of the feature's importance. In any case, the recommendation system should not just recommend to exclude the feature but also recommend, for example, a discussion between s1 and s3. Another related observation is that stakeholder s2 has preferences which differ completely from those of s1 and s3 with one exception. One reason might be that s2 has more expertise regarding the preferences of the underlying customer community and the market potential of individual features. In any case, discussions have to be triggered among the individual stakeholders in order to achieve a consensus in the end [31, 49, 82].
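The majority-voting recommendation and the adaptation-counting idea can be sketched as follows (the vote values are invented and only mirror the structure of Table 2.3):

```python
# Hypothetical stakeholder votes (1 = include, 0 = exclude) per feature.
votes = {
    "payment": {"s1": 1, "s2": 0, "s3": 0},
    "gps":     {"s1": 1, "s2": 0, "s3": 1},
    "solar":   {"s1": 0, "s2": 1, "s3": 0},
}

# Majority voting: recommend the value chosen by most stakeholders.
rec = {f: int(sum(v.values()) > len(v) / 2) for f, v in votes.items()}

# Number of adaptations: how many individual votes would have to change
# so that every stakeholder agrees with the recommendation.
adaptations = sum(v != rec[f] for f, vs in votes.items() for v in vs.values())

print(rec)          # {'payment': 0, 'gps': 1, 'solar': 0}
print(adaptations)  # 3
```

A fairness-aware variant would weight individual votes, e.g., by stakeholder expertise, before aggregating.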

### **2.8.2 Configuration Space Learning**

An important issue, specifically with large and complex variability models, is to provide a means for assuring acceptable runtime performance of the underlying constraint solver or SAT solver [66]. Just translating the FM into corresponding sets of variables and constraints is not enough. We have to take care of selecting appropriate *search heuristics* that help to improve solver performance [84]. Recent developments in the fields of constraint solving and SAT solving aim to integrate machine learning for recommending search heuristics that help to solve a problem instance (configuration task) efficiently – for an overview, see Popescu et al. [66].

Similar to other complex domains, learning search heuristics requires the availability of corresponding datasets that can be used as input for a (supervised) machine learning process. As such, the problem of learning search heuristics can be seen as a specific instance of *configuration space learning* [64] where different data synthesis approaches are used to generate relevant example problem instances which can then be used for optimizing a corresponding machine learning model [84]. In the following, we discuss a simplified example of a nearest neighbor (NN) based approach for recommending solver search heuristics. Table 2.4 depicts an example of a synthesized dataset. The underlying assumption is that *users only specify their preferences with regard to the inclusion of different* sportstracking *features*. The remaining features are directly selected by the constraint solver, where l denotes the *lowest value first* search heuristic (i.e., *false* is selected before *true*) and h denotes the *highest value first* search heuristic (i.e., the solver selects *true* before *false*).

Table 2.4: Simplified example of machine learning based search heuristics recommendation. In this context, h and l denote search heuristics: h = *highest value first* and l = *lowest value first*. Furthermore, id represents the identifier of the corresponding dataset entry and u represents the preferences defined by the current user. For simplicity, we assume that users only specify preferences regarding the *sportstracking* features – configuration completion is then performed by the constraint solver. The rightmost column denotes the *runtime in milliseconds* needed for configuration completion.


The dataset includes three satisfiable configurations with the identifiers {1, 2, 3}. Furthermore, in our scenario the current customer specifies his/her requirements regarding a smartwatch in terms of the sportstracking features. If we compare these preferences with the entries {1, 2, 3} in Table 2.4, the entries with the most similar preferences, i.e., the nearest neighbors, are {1, 2}. Since the nearest neighbor with the id 1 has the better performance (in terms of runtime), the search heuristics of entry 1 would be recommended for the efficient determination (completion) of an FM configuration for the current user u.
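The nearest-neighbor recommendation just described can be sketched as follows (the dataset entries, preference vectors, and runtimes are invented and only mirror the structure of Table 2.4):

```python
# Each entry: user preferences over sportstracking features, the
# heuristic used ('h' = highest value first, 'l' = lowest value first),
# and the measured runtime in ms for configuration completion.
dataset = [
    {"id": 1, "prefs": (1, 0, 1), "heuristic": "h", "runtime": 12},
    {"id": 2, "prefs": (1, 0, 0), "heuristic": "l", "runtime": 45},
    {"id": 3, "prefs": (0, 1, 0), "heuristic": "l", "runtime": 30},
]

def recommend(user_prefs):
    # Hamming distance between preference vectors.
    def dist(entry):
        return sum(a != b for a, b in zip(user_prefs, entry["prefs"]))
    best = min(dist(e) for e in dataset)
    neighbors = [e for e in dataset if dist(e) == best]
    # Among nearest neighbors, pick the entry with the best runtime.
    return min(neighbors, key=lambda e: e["runtime"])["heuristic"]

print(recommend((1, 0, 1)))  # 'h' (entry 1 matches exactly and is fastest)
```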

Another application of machine learning is when a set of configurations is sampled from an FM to measure a given performance function (e.g., runtime) [62].<sup>7</sup> From a sample, machine learning techniques can be applied to predict the performance of a configuration without having to run all FM configurations, which is impractical in most cases. Recent advances [62] show that uniform random sampling is better at finding *near-optimal* FM configurations than existing machine learning proposals.

### **2.8.3 Knowledge Extraction from Data**

FMs can become quite large – see, for example, the Linux operating system FM [2, 81]. In such a context, it can be helpful to automatically identify feature candidates and also related constraints (see, e.g., [40, 51, 80]).

**Basic Machine Learning Approaches**. Feature candidates can be determined, for example, on the basis of content-based machine learning techniques (e.g., clustering) that allow an intelligent grouping of software requirements – see Li et al. [51]. Terms extracted from these requirements and associated with specific clusters (requirement groups) can be regarded as representatives of features. If features have already been determined, machine learning methods can be applied to determine related constraints [80]. The idea is to randomly generate configurations out of an FM (assuming that features are already available) and then to use an oracle (e.g., a software that tests the generated configuration) to figure out whether the configuration is satisfiable (e.g., the software is operable). Following this approach, a dataset such as the one shown in Table 2.5 can be generated. With such a dataset, machine learning (e.g., decision tree learning) can be used to infer potential (to be evaluated) FM constraints (see, e.g., Temple et al. [80]). If decision trees are used, FM constraints can be derived by interpreting "faulty" paths as negated constraints.
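As a minimal stand-in for the decision-tree learning in Temple et al. [80], the following sketch proposes candidate excludes constraints for feature pairs that co-occur only in faulty configurations (the dataset is invented and only mirrors the structure of Table 2.5):

```python
from itertools import combinations

# Each row: feature selections plus an oracle verdict (1 = satisfiable).
rows = [
    ({"payment": 1, "gps": 1, "solar": 0}, 0),
    ({"payment": 1, "gps": 1, "solar": 1}, 0),
    ({"payment": 1, "gps": 0, "solar": 1}, 1),
    ({"payment": 0, "gps": 1, "solar": 1}, 1),
]

features = list(rows[0][0])
candidates = []
for f, g in combinations(features, 2):
    together = [ok for sel, ok in rows if sel[f] and sel[g]]
    # If the pair co-occurs only in faulty configurations, propose
    # "f excludes g" as a candidate constraint (to be expert-validated).
    if together and not any(together):
        candidates.append((f, g))

print(candidates)  # [('payment', 'gps')]
```

Like the decision-tree variant, the mined candidates are only hypotheses and have to be validated by domain experts.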

Table 2.5: Abstract dataset as a basis for machine learning based constraint extraction from configuration data. The oracle feedback can be 0 (faulty/non-satisfiable configuration) or 1 (satisfiable configuration).


<sup>7</sup> Predicting FM configuration performance can also be regarded as an analysis task (see Chapter 3).

**Use of Large Language Models (LLMs)**. Large language models can help to increase the efficiency of FM development, for example, by automatically extracting features from given requirements specifications [40]. Table 2.6 depicts an example that shows in a simplified fashion how requirements can be generated and related features extracted using an LLM.<sup>8</sup>

Table 2.6: An example of LLM-generated requirements and features of a group decision support software ("Company Name Decision Application") using the LLM prompts "generate 5 requirements for an application supporting group decisions about company names" and "create an FM with 5 functional features".


Large language models can also be applied for the generation of potential constraints regarding a set of identified features. Table 2.7 depicts two LLM-generated example constraints that could be of relevance for the generated set of features. The first constraint in Table 2.7 expresses the idea that rollback only makes sense if a brainstorming history is available. The idea of the second example constraint is that voting only makes sense if the corresponding (internet) domain is available.

<sup>8</sup> The requirements, features, and associated constraints have been generated with ChatGPT 3.5 – see https://openai.com/.



In any case, both generated features and constraints have to be evaluated by domain experts; however, following this LLM-based approach has the potential to reduce the time effort for FM development and related quality assurance tasks [40, 54].

### **2.9 Discussion**

In this chapter, we have presented different FM knowledge representations and corresponding formalizations in terms of constraint solving and SAT solving. We have discussed extensions of basic feature modelling concepts in terms of attribute- and cardinality-based FMs – these concepts can be regarded as sufficient for representing variability properties in various application domains. Furthermore, we have introduced definitions of an FM configuration task and a corresponding FM configuration, which will be used as a basis for the discussions in the following chapters. Finally, we have included scenarios that show how data-driven AI techniques can help in the context of feature modelling. Regarding the topic of feature modelling, we consider the following to be major open research issues.

**Further Extensions of FM Knowledge Representations.** As discussed in this chapter, there are different research streams regarding the extension of basic FMs (e.g., in terms of attributes and cardinalities) and also regarding the standardization of FM representations, specifically on the textual level. Further work on FM standardization could take into account the support of other constraint types [22]. For example, resource constraints are a widely used concept in the context of knowledge-based configuration [33, 29, 74, 78]. When configuring, for example, computer systems, a resource (producer) could be the maximum acceptable price defined by the user (customer), and the corresponding consuming resources would be the hardware components integrated into the computer configuration. A related resource constraint would specify that the overall price of the included components must be below the price limit specified by the user. Approaches to configuration knowledge representation in UML/OCL and corresponding logical representations are discussed in Felfernig et al. [29, 32] – taking into account these representations might also be a way to further extend the expressivity of FMs when applied in product configuration. Finally, answer set programming (ASP) is established as an expressive configuration knowledge representation focusing on an object-oriented modelling approach – an application in the context of FM representations is worth further investigation [27, 60, 73].
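Such a resource constraint can be sketched as a simple sum-bound check (component names and prices are invented for illustration):

```python
# Resource constraint sketch: the producing resource is the user's
# price limit, the consuming resources are the prices of the
# components included in a configuration.
prices = {"cpu_fast": 450, "cpu_basic": 180, "ssd": 120, "hdd": 60}

def within_budget(selected, max_price):
    # Total consumption must not exceed the produced resource.
    return sum(prices[c] for c in selected) <= max_price

print(within_budget(["cpu_basic", "ssd"], 400))  # True (180 + 120 <= 400)
print(within_budget(["cpu_fast", "ssd"], 400))   # False (450 + 120 > 400)
```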

**Cognitive Aspects of FM Development and Maintenance.** The development of graphical models is supported by different types of graphical user interfaces (see also Chapter 5). In this context, model understandability is a major criterion for assuring maintainability and consistency of complex models in the long run. Further research is needed to figure out in more detail which types of knowledge structures are more understandable than others. Such studies can be performed, for example, on the basis of eye tracking equipment, which can help to estimate the cognitive overload of knowledge engineers in their FM development and maintenance activities. Assuring model understandability can also be supported on the basis of basic machine learning methods; for example, different feature and constraint grouping strategies could result in different levels of model understandability [35].

**Decision Support in Product Line Scoping.** In this chapter, we have provided a simplified example of integrating decision support systems in the process of product line scoping. Related decision support needs to provide specific predefined decision goals, such as maximizing the revenue of the offered items, but also goals such as maximizing the sustainability of the offered solutions and minimizing the CO2 footprint [36]. In this context, the feasibility of the selected features also needs to be taken into account, for example, in terms of available development resources, development risk, and market-related risks [53]. A similar situation occurs in the context of knowledge-based configuration scenarios where a configuration model has to be tailored (also in a scoping process) in such a way that it supports only configurations which can be produced by the existing production infrastructure [53].

**Variability Mining.** As already mentioned, the increasing size and complexity of FMs triggers a need for automated support of variability knowledge extraction/mining. Similar to *process mining*, where processes are discovered from existing logs [68, 85], we envision techniques and tools inspired by Artificial Intelligence for variability mining. Example areas for future research are the inclusion of techniques for *user-centered knowledge acquisition* based on the ideas of *human computation* [48, 83] and the application of large language models (LLMs) for the (semi-)automated generation and maintenance of FMs (and beyond) [1, 40].

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 3 Analysis of Feature Models**

**Abstract** Developing and maintaining Feature Models (FMs) can become an error-prone activity. In this chapter, we focus on different aspects of analyzing relevant properties of FMs. Such an analysis helps to increase the maintainability and correctness of FMs and also makes them better manageable in industrial settings. Analysis operations are discussed in detail and also presented formally. In addition to analysis operations, we also show how to automatically determine erroneous elements of an FM that have to be adapted or deleted in order to restore the intended FM semantics.

### **3.1 Feature Model Analysis Process**

As explained in Chapter 2, FMs are a central technique for engineering software product lines. Developing and maintaining FMs can be an error-prone activity due to missing domain knowledge, cognitive overload of the persons in charge of FM development, and outdated knowledge parts in existing FMs [4, 26]. In order to tackle this challenge, intelligent techniques and tools are needed which help to identify anomalies (i.e., unintended properties of FMs which need to be removed) and help to preserve maintainability [2]. Such anomalies can exist in different forms and require different types of *analysis operations* capable of identifying them.

A conceptual process for the automated analysis of FMs is depicted in Figure 3.1. Depending on the specific analysis task, a reasoning engine (a.k.a. solver) is sometimes needed to provide the required feedback. For example, if we want to figure out the number of features or excludes relationships in an FM, this is just a counting task without the need to activate a solver (e.g., a SAT or constraint solver). On the other hand, if we are interested, for example, in the presence of *dead* or *false optional* features in the FM (these concepts will be defined later), solver support is needed. For example, if one or more features are dead, a *diagnosis* component can help to identify the responsible FM constraints. Furthermore, *testing & debugging* services can help to systematically test an FM with regard to a test suite specifying the intended semantics of the FM. All these aspects are discussed in this chapter.

We want to emphasize that in the context of our discussions, specifically due to the focus of this book on *domain and requirements analysis*, the test object is the FM, i.e., we want to assure that the FM represents the intended domain knowledge. In related work on analyzing software product lines, the term *testing* often refers to the testing of the software underlying an FM [37].

Fig. 3.1: Automated analysis of FMs: conceptual process.

An FM can include other artefacts that are used for analysis purposes, for instance, a table with desired or existing configurations, user ratings of features, or implicit feedback of users during product usage [36, 40] – just to mention a few. It is important to clarify that the software product line engineering process of Figure 2.2 is an iterative process and some of the additional artefacts for FM analysis can be produced in other stages of the process, for example, the implicit feedback on feature usage or the feature ratings after deploying a product.

In the following, we differentiate analysis operations with regard to whether they need a corresponding solver or not (see also [23]). Figure 3.2 gives an overview of the structure of the chapter.

Fig. 3.2: Chapter overview (ids in brackets refer to the corresponding subsection).

### **3.1.1 Analysis Operations Without Solver Support**

There are analysis operations that can be performed without the need of a solver and can be calculated directly from the FM by ad-hoc algorithms. Table 3.1 provides an overview of example operations applied in the context of FM analysis – for related details, we refer to [4] and [22].

Table 3.1: Example FM analysis operations *without the need of solver support*. In this context, F (also denoted as F(FM)) denotes the set of features and CF (also denoted as CF(FM)) the set of constraints of an FM, mandatory(f) denotes a mandatory relationship of feature f, optional(f) denotes an optional relationship of f, or(f) denotes an or relationship of f, and alternative(f) denotes an alternative relationship of f. Furthermore, requires(f1, f2) denotes a requires constraint and excludes(f1, f2) denotes an excludes constraint between the features f1 and f2.


The notations used in the formalizations included in Tables 3.1 and 3.2 follow the definitions of an FM configuration task and a corresponding FM configuration introduced in Chapter 2. In this context, F = {f1..fn} denotes the set of features in an FM. The set of constraints is defined as C = CF ∪ R, where CF = {c1..ck} is a set of constraints derived from the FM and R = {ck+1..cm} is a set of constraints representing application requirements, i.e., requirements regarding the inclusion or exclusion of specific features from the user (customer) point of view. Furthermore, an FM configuration is an assignment A = {f1 = val(f1)..fn = val(fn)} where val(fi) ∈ {true, false}.

Some analysis operations can be executed without the need of activating a solver (see Table 3.1). In the following, we explain some of those.

**Counting features and constraints.** These analysis operations basically count the number of features and related constraints (relationships) of an FM. Our example FM (Figure 2.3) has 13 features (including the root feature). There are other operations that traverse the FM tree structure [23]. Examples thereof are the determination of the (direct and transitive) ancestors of a specific feature and the set of leaf features of an FM. In our example FM, there are 9 leaf features. The number of ancestors, for example, of the *advancedsolar* feature is 2 (features *energymanagement* and *smartwatch*). Furthermore, the FM includes 2 mandatory relationships, 3 optional relationships, 2 alternative relationships, and 1 *or* relationship. Finally, the model includes 1 requires constraint and 1 excludes constraint. These descriptive numbers help to characterize the size of an FM and can also be used as a basis for FM complexity analysis [22, 44].
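A few of these solver-free operations can be sketched over a toy FM representation (the encoding as a child-to-parent mapping and the chosen feature subset are illustrative, not the full model of Figure 2.3):

```python
# Toy FM encoded as child -> (parent, relationship type).
parent = {
    "energymanagement": ("smartwatch", "mandatory"),
    "basic": ("energymanagement", "alternative"),
    "advancedsolar": ("energymanagement", "alternative"),
    "gps": ("smartwatch", "optional"),
}

# Feature set = root plus all children; leaves never occur as a parent.
features = {"smartwatch"} | set(parent)
leaves = features - {p for p, _ in parent.values()}

def ancestors(f):
    # Walk up the tree, collecting direct and transitive ancestors.
    out = []
    while f in parent:
        f = parent[f][0]
        out.append(f)
    return out

print(len(features))               # 5
print(sorted(leaves))              # ['advancedsolar', 'basic', 'gps']
print(ancestors("advancedsolar"))  # ['energymanagement', 'smartwatch']
```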

**Differences and commonalities between FMs.** Commonalities between FMs can be analyzed, for example, in terms of the number of features with the same names and, going further, in terms of the number of constraints of the same type referring to the same features. In contrast, differences between FMs can be analyzed in terms of the number of features existing in only one of the analyzed models. The analysis of differences and commonalities between FMs can play a major role in the context of *FM integration*. For example, if a car provider decides to use the same FM for representing product variabilities in *Europe* as well as in the *US*, the corresponding individual FMs have to be integrated [50]. In this regard, there is another thread of research related to semantic matching, for instance, when an FM expresses a feature with a feature name that is syntactically different from another feature name in another FM but the semantic meaning is the same. Imagine, for instance, that one FM defines a *gps* feature and another one defines a *navigation* feature. Depending on the context, the features can be equivalent. In the past, attempts were made to combine natural language processing and FMs [42] – such approaches gain momentum specifically in the context of the application of large language models (LLMs) in FM management [19].

### **3.1.2 Analysis Operations With Solver Support**

There are some analysis operations over FMs that are performed with a solver. Table 3.2 provides an overview of example operations frequently applied in the context of FM analysis in cases where solver support is needed (see also [4, 5, 18]). It is important to remark that the discipline has evolved over time and this book intends to recapitulate and provide an updated terminology and conceptual framework. In this sense, reading some of the related papers should be done with attention to the terminology used in those papers and in this book. In any case, to assure understandability, we try to explain the used terms as much as possible.

Table 3.2: Example FM analysis operations *with solver support*. In this context, F (or F(FM)) denotes the set of features and CF (or CF(FM)) the set of constraints of an FM. Furthermore, solution(C) indicates that a SAT solver or constraint solver is able to find a solution given the constraints in C. In the context of FMs, this means that at least one configuration (i.e., a set of constraints representing feature value assignments) could be identified that complies with the constraints in C. Finally, CF(FM)′ is the complement of the solution space defined by CF(FM), formalized as a disjunction of the negated constraints of CF(FM). FM′ is the generalized (specialized) FM. The number (#) of satisfiable configurations refers to *complete* configurations. Finally, the *false optional feature* analysis operation excludes *root* and *mandatory* features – this is indicated with (∗).


In the following, we give an overview of example analysis operations that can only be executed with corresponding SAT or constraint solver support.

**Satisfiable FM.** An FM is *satisfiable* if there exists at least one configuration (*conf*) which is consistent with the FM constraints (defined in CF). The FM in Figure 2.3 is satisfiable, i.e., there exists at least one configuration where all feature settings satisfy the constraints in CF. However, there are non-satisfiable (unsatisfiable) FMs – see, for example, Figure 3.3. In this (faulty) model, two mandatory features are connected with an excludes relationship, which induces a contradiction: on the one hand, the two features are required to co-occur in each configuration; on the other hand, the same features are regarded as incompatible. This operation has received different names in the past, all basically meaning the same: *void FM*, *invalid FM*, *valid FM*, *consistent FM*, and *solvable FM* [4]. In this book, we use the term *FM satisfiability* to express the meaning of the corresponding analysis operation – this term is also in line with the concept of satisfiability checking in constraint and SAT solving [3, 18].

Fig. 3.3: Example of a *non-satisfiable FM*: since both *payment* and *gps* are mandatory, these features must be part of every configuration, i.e., they cannot be incompatible at the same time.
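To make the operation concrete, the following Python sketch brute-forces satisfiability over a toy model of our own that loosely mirrors the smartwatch example (the feature names come from the running example, but this reduced constraint set is an assumption; a real tool would call a SAT or constraint solver instead of enumerating assignments):

```python
from itertools import product

# Toy FM loosely inspired by the smartwatch example (our own assumption,
# not the book's exact model).
FEATURES = ["smartwatch", "screen", "standard", "touch", "payment"]
CF = [
    lambda c: c["smartwatch"],                               # root feature
    lambda c: c["smartwatch"] == c["screen"],                # mandatory screen
    lambda c: c["screen"] == (c["standard"] != c["touch"]),  # alternative group
    lambda c: not c["payment"] or c["touch"],                # payment requires touch
]

def satisfiable(features, constraints):
    """True iff at least one complete configuration satisfies all constraints."""
    return any(
        all(con(dict(zip(features, values))) for con in constraints)
        for values in product([False, True], repeat=len(features))
    )

print(satisfiable(FEATURES, CF))  # True: the toy FM is satisfiable

# Mimicking Figure 3.3: making payment mandatory while excluding it from the
# (required) touch screen induces a contradiction.
faulty = CF + [
    lambda c: c["smartwatch"] == c["payment"],     # mandatory payment
    lambda c: not (c["payment"] and c["touch"]),   # payment excludes touch
]
print(satisfiable(FEATURES, faulty))  # False: no configuration exists
```

Enumeration is exponential in the number of features, which is exactly why the analysis operations of this chapter are delegated to SAT or constraint solvers in practice.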

**Configuration satisfiability.** A configuration is *satisfiable* if it is *consistent* with regard to the FM constraints CF (the union of the feature model relationships and cross-tree constraints), i.e., all constraints are satisfied. Table 2.1 shows one satisfiable and one non-satisfiable configuration with regard to our example FM. This operation can be useful to determine whether a given configuration is available in a software product line (supported by the FM). In some cases, a configuration or a set of configurations is defined first and then needs to be tested for satisfiability with the FM. If non-satisfiable, the configuration(s) can be changed, or the FM itself may have to be changed to support the desired configurations [52].

**Number of satisfiable configurations.** This operation returns the number of configurations represented by the FM. Determining the total number of satisfiable configurations can be relevant in the context of different product line scoping scenarios [35, 17, 20] as well as when deciding about variability properties of products and services, for example, when developing or adapting a company-wide mass customization strategy [7]. FMs can change over time, and it can be important to understand the impact of changes on the structure of the configuration (solution) space supported by different versions. The number of satisfiable configurations is also relevant when performing uniform random sampling [21]. The total number of possible satisfiable FM configurations with our example FM of Figure 2.3 is 54.
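Counting configurations corresponds to the #SAT problem. A brute-force sketch over the same kind of toy model (our own simplified constraint set, not the book's 54-configuration FM; real tools use #SAT solvers or BDDs):

```python
from itertools import product

# Toy FM (our own reduced smartwatch-like model, not the book's exact FM).
FEATURES = ["smartwatch", "screen", "standard", "touch", "payment"]
CF = [
    lambda c: c["smartwatch"],
    lambda c: c["smartwatch"] == c["screen"],
    lambda c: c["screen"] == (c["standard"] != c["touch"]),
    lambda c: not c["payment"] or c["touch"],
]

def count_configurations(features, constraints):
    """#SAT by brute force: count complete satisfying configurations."""
    return sum(
        all(con(dict(zip(features, values))) for con in constraints)
        for values in product([False, True], repeat=len(features))
    )

print(count_configurations(FEATURES, CF))  # 3
```

The three counted configurations are: a standard screen without payment, a touch screen without payment, and a touch screen with payment.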

**Dead features.** A feature f is *dead* if there does not exist an FM configuration that includes f [6, 9]. Figure 3.4 shows three examples of situations where a feature is dead. In the first setting, the inclusion of *payment* would require the inclusion of *basic energy management*, resulting in a situation where *advanced solar* could never be included in a configuration. In the second setting, the incompatibility between the features *payment* and *standard* results in a situation where *standard* cannot be part of any configuration. In the third example, since *payment* is incompatible with *gps* and *payment* is mandatory, *gps* will never be part of a configuration, i.e., it is a dead feature. Note that these are just examples and dead features could be induced by an arbitrary combination of complex constraints [9].

Fig. 3.4: Three basic examples of *dead features* (indicated with a grey background).
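A dead-feature check can be sketched by enumerating all configurations and collecting features that never occur. The toy constraints below (our own assumption) mimic the second setting of Figure 3.4, where a mandatory *payment* excludes *standard*:

```python
from itertools import product

FEATURES = ["smartwatch", "screen", "standard", "touch", "payment"]
# Toy model (our own assumption); the last two constraints make payment
# mandatory and incompatible with standard, which renders standard dead.
CF = [
    lambda c: c["smartwatch"],
    lambda c: c["smartwatch"] == c["screen"],
    lambda c: c["screen"] == (c["standard"] != c["touch"]),
    lambda c: not c["payment"] or c["touch"],
    lambda c: c["smartwatch"] == c["payment"],       # mandatory payment
    lambda c: not (c["payment"] and c["standard"]),  # payment excludes standard
]

def dead_features(features, constraints):
    """Features that are not included in any satisfiable configuration."""
    confs = []
    for values in product([False, True], repeat=len(features)):
        conf = dict(zip(features, values))
        if all(con(conf) for con in constraints):
            confs.append(conf)
    return [f for f in features if all(not conf[f] for conf in confs)]

print(dead_features(FEATURES, CF))  # ['standard']
```

Note that for a non-satisfiable FM this check would (vacuously) report every feature as dead, so FM satisfiability should be established first.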

**False optional features.** A feature is *false optional* if it is included in all possible configurations although not being modeled as mandatory. Figure 3.5 shows some examples of false optional features. A false optional feature can be a problem if a configurator is built from the FM: a visually optional feature cannot actually be deselected by the user, which can be confusing from a user perspective.

Fig. 3.5: Three basic examples of *false optional features* (grey background).

Figure 3.5 includes three examples of situations where features become false optional. In the first setting, the inclusion of *payment* would require the inclusion of a *running* type *sports tracking*, which makes both features – although specified as optional – part of every configuration. Note that once *running* becomes a false optional feature, *sports tracking* also becomes false optional. In the second setting, *advanced solar* is part of every configuration (although modeled as alternative). In the third example, *gps* is part of every configuration since *payment* is mandatory and requires the feature *gps*. Again, these are just examples – false optional features can be induced by arbitrary constraint combinations [9].
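A false-optional check compares the set of diagram-optional features against the features present in every configuration. In the toy model below (our own assumption), the cross-tree constraint "smartwatch requires payment" makes *payment* – and, transitively, *touch* – false optional:

```python
from itertools import product

FEATURES = ["smartwatch", "screen", "standard", "touch", "payment"]
OPTIONAL = {"standard", "touch", "payment"}  # not modeled as mandatory
# Toy model (our own assumption). The last constraint forces payment into
# every configuration; payment then forces touch (and makes standard dead).
CF = [
    lambda c: c["smartwatch"],
    lambda c: c["smartwatch"] == c["screen"],
    lambda c: c["screen"] == (c["standard"] != c["touch"]),
    lambda c: not c["payment"] or c["touch"],
    lambda c: not c["smartwatch"] or c["payment"],  # smartwatch requires payment
]

def false_optional(features, optional, constraints):
    """Optional features that are nevertheless part of every configuration."""
    confs = []
    for values in product([False, True], repeat=len(features)):
        conf = dict(zip(features, values))
        if all(con(conf) for con in constraints):
            confs.append(conf)
    return sorted(f for f in optional if confs and all(conf[f] for conf in confs))

print(false_optional(FEATURES, OPTIONAL, CF))  # ['payment', 'touch']
```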

**Core and variant features.** *Core features* are features that are part of every configuration (see the features *smartwatch*, *screen*, and *energy management* in the running example). *Variant features* are features that are included in some configurations but excluded from others. The set of core features of a non-satisfiable FM is empty. The union of core features, variant features, and dead features is the set of features of the FM [6].
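Core, dead, and variant features can be derived from the enumerated configuration space in one pass; a sketch over a toy model (our own assumption):

```python
from itertools import product

FEATURES = ["smartwatch", "screen", "standard", "touch", "payment"]
CF = [  # toy model, our own assumption
    lambda c: c["smartwatch"],
    lambda c: c["smartwatch"] == c["screen"],
    lambda c: c["screen"] == (c["standard"] != c["touch"]),
    lambda c: not c["payment"] or c["touch"],
]

def classify_features(features, constraints):
    """Partition the feature set into core, dead, and variant features."""
    confs = []
    for values in product([False, True], repeat=len(features)):
        conf = dict(zip(features, values))
        if all(con(conf) for con in constraints):
            confs.append(conf)
    # Guard: a non-satisfiable FM has an empty set of core features.
    core = {f for f in features if confs and all(c[f] for c in confs)}
    dead = {f for f in features if all(not c[f] for c in confs)}
    variant = set(features) - core - dead
    return core, dead, variant

core, dead, variant = classify_features(FEATURES, CF)
print(sorted(core))     # ['screen', 'smartwatch']
print(sorted(variant))  # ['payment', 'standard', 'touch']
```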

**Atomic sets.** Atomic sets of features of an FM can be used as a preprocessing technique for automated analysis and interaction [6]. Informally, an atomic set is a group of features that can be treated as a unit because they are tightly coupled and always appear together in any configuration of an FM. From a formal point of view, atomic sets are nonempty subsets of features such that for every configuration of an FM, either all of their features appear together in the configuration or none of them appears at all. For a more detailed and formal discussion of atomic sets we refer the reader to [6]. In Table 3.2, the property of an *atomic set* AS is defined in terms of the non-existence of a configuration where at least one combination fi, fj (i ≠ j) of features in AS shows different values. For example, if AS = {f1, f2, f3} is an atomic set, then {f1 ≠ f2 ∨ f1 ≠ f3 ∨ f2 ≠ f3} ∪ CF is inconsistent.
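Atomic sets can be sketched by grouping features whose value vectors coincide over all configurations – a brute-force reading of the formal definition (real implementations exploit the tree structure, e.g., merging mandatory children into their parents). Toy model as before (our own assumption):

```python
from itertools import product

FEATURES = ["smartwatch", "screen", "standard", "touch", "payment"]
CF = [  # toy model, our own assumption
    lambda c: c["smartwatch"],
    lambda c: c["smartwatch"] == c["screen"],
    lambda c: c["screen"] == (c["standard"] != c["touch"]),
    lambda c: not c["payment"] or c["touch"],
]

def atomic_sets(features, constraints):
    """Group features whose values coincide in every configuration."""
    confs = []
    for values in product([False, True], repeat=len(features)):
        conf = dict(zip(features, values))
        if all(con(conf) for con in constraints):
            confs.append(conf)
    groups = {}
    for f in features:
        signature = tuple(conf[f] for conf in confs)
        groups.setdefault(signature, set()).add(f)
    return list(groups.values())

print(atomic_sets(FEATURES, CF))
# 4 atomic sets; the root and its mandatory child form one of them:
# {'smartwatch', 'screen'}
```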

**Redundancies in FMs.** An FM can include so-called redundant constraints which – if deleted from the FM – do not change the semantics of the model, i.e., the FM configuration space remains the same (see Figure 3.6).

Fig. 3.6: Basic examples of redundancies in FMs: the *excludes* and *requires* constraints in the two models are redundant, i.e., when deleting these constraints, the semantics of the corresponding models remains the same.

On the logical level, a redundant constraint c that is part of an FM (and the corresponding SAT or constraint satisfaction problem) has the following property: inconsistent(CF − {c} ∪ {¬c}); in other words, c logically follows from CF − {c} (CF − {c} |= c). In our example FM, a redundant constraint would be payment ⇒ touch, since through the exclusion of the combination of *payment* and *standard*, the inclusion of *touch* remains the only allowed alternative for the screen option in the case that payment has been selected. In Figure 3.6, the excludes constraint between *running* and *skiing* is redundant since the excludes semantics is already expressed by the associated alternative relationship. Furthermore, the requires constraint between *sportstracking* and *gps* is also redundant since *gps* has to be part of every configuration. Automated redundancy detection is relevant, for example, in the context of FM development and maintenance. In order to keep an FM understandable, the inclusion of redundant constraints should be avoided.<sup>1</sup>
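The redundancy criterion inconsistent(CF − {c} ∪ {¬c}) translates directly into code. In the toy model below (our own assumption), "payment requires screen" is redundant because *screen* is mandatory anyway, whereas "payment requires touch" is not:

```python
from itertools import product

FEATURES = ["smartwatch", "screen", "standard", "touch", "payment"]
CF = [  # toy model, our own assumption; the last constraint is redundant
    lambda c: c["smartwatch"],
    lambda c: c["smartwatch"] == c["screen"],
    lambda c: c["screen"] == (c["standard"] != c["touch"]),
    lambda c: not c["payment"] or c["touch"],
    lambda c: not c["payment"] or c["screen"],  # payment requires screen
]

def satisfiable(features, constraints):
    return any(
        all(con(dict(zip(features, values))) for con in constraints)
        for values in product([False, True], repeat=len(features))
    )

def redundant(features, constraints, i):
    """c_i is redundant iff inconsistent(CF - {c_i} U {not c_i})."""
    rest = [c for j, c in enumerate(constraints) if j != i]
    negation = lambda conf: not constraints[i](conf)
    return not satisfiable(features, rest + [negation])

print(redundant(FEATURES, CF, 4))  # True: screen is mandatory anyway
print(redundant(FEATURES, CF, 3))  # False: this constraint adds semantics
```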

**FM edits.** An FM can evolve over time by adding, removing, or editing constraints or features. These changes to FMs are known as *FM edits* [46]. In FM evolution, comparing two FMs can help to better understand the edits that were performed on the model.

Fig. 3.7: Different types of FM edits: with *refactoring*, the FM configuration space remains the same. With *generalization*, the configuration space is extended with regard to the original model. Furthermore, *specialization* reduces the configuration space. Finally, *arbitrary* edits represent all other FM edit operations.

Comparing edits in FMs is also known as *FM differences* [1]. Figure 3.7 shows how an original FM (a) can be changed by a refactoring (b), generalization (c), specialization (d), or an arbitrary edit (e). These analysis operations are classified as *refactoring* if the original FM represents exactly the same set of configurations as the changed one, *generalization* if the set of configurations of the original FM is a subset of the configurations of the edited FM, *specialization* if the edited FM represents a subset of the configurations of the original FM, or *arbitrary* edit in any other case. These are basic comparisons of FMs, but more complex comparisons could be performed in terms of semantic feature similarity. For further discussions on more complex comparisons of FMs we refer the reader to Acher et al. [1]. Also, for a detailed discussion on reasoning about edits in FMs, we refer to Thüm et al. [46]. Note that in our formalization in Table 3.2 we assume that CF(FM) is always satisfiable. Furthermore, if CF(FM) = {c1, .., cn} then the corresponding CF(FM)′ = {¬c1 ∨ .. ∨ ¬cn}.

<sup>1</sup> An algorithm that can be used for the automated detection of redundant constraints in FMs is discussed in Section 3.3 (see also Le et al. [27]).
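The four edit categories reduce to set comparisons between the two configuration spaces. A sketch with toy configuration sets of our own (in practice the spaces are compared symbolically, e.g., with solver checks in both directions, rather than by enumeration):

```python
def classify_edit(original_confs, edited_confs):
    """Classify an FM edit by comparing the two configuration spaces."""
    s1, s2 = set(original_confs), set(edited_confs)
    if s1 == s2:
        return "refactoring"
    if s1 < s2:       # original space strictly contained in edited space
        return "generalization"
    if s2 < s1:       # edited space strictly contained in original space
        return "specialization"
    return "arbitrary"

# Configurations represented as frozensets of selected features (toy data):
a = {frozenset({"smartwatch", "screen", "touch"}),
     frozenset({"smartwatch", "screen", "standard"})}
b = a | {frozenset({"smartwatch", "screen", "touch", "payment"})}

print(classify_edit(a, a))  # refactoring
print(classify_edit(a, b))  # generalization
print(classify_edit(b, a))  # specialization
```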

**Dealing with inconsistencies in FMs.** An FM can be *non-satisfiable*, i.e., no solution can be found by a corresponding SAT or constraint solver. In such situations, it is important to figure out the sources of such inconsistencies. In other words, we are interested in minimal sets of constraints in an FM that have to be deleted or adapted in order to restore model consistency (at least one configuration should be identifiable by a SAT or constraint solver). In this context, conflict detection operations are relevant. In our example, if we include a new constraint ca1: smartwatch ⇔ payment (basically changing the relationship between *smartwatch* and *payment* from optional to mandatory) and another new constraint ca2: ¬(payment ∧ touch) (see Figure 3.8), this would induce an inconsistency, i.e., no solution could be identified since *none* of the screen options remains selectable although there is a mandatory relationship between *smartwatch* and *screen*, i.e., at least one screen type should be selectable for a user.

Fig. 3.8: Simplified example of a faulty (non-satisfiable) FM including the additional constraints ca1: smartwatch ⇔ payment and ca2: ¬(touch ∧ payment).

**Characterizing conflicts and background knowledge.** The set of constraints CS = {ca1, ca2, c0, c1, c7, c9} (see Figure 3.8) makes the FM non-satisfiable. Such a set is denoted as a *conflict* or *conflict set* (CS) [24], i.e., an inconsistency-inducing constraint set. In our example, it is impossible to find a configuration that supports constraint ca1 (the mandatory inclusion of the *payment* feature) and, at the same time, the exclusion of both *standard* and *touch* screen, since *screen* is regarded as a mandatory feature. Interestingly, deleting an arbitrary constraint from CS restores satisfiability. In this example, the FM has exactly one conflict set comprising all FM constraints. Typically, there are different conflict sets, and each of them has to be resolved individually to restore FM consistency. Conflict identification can *focus* on specific CF subsets; for example, a knowledge engineer might want to know which of the new constraints in {ca1, ca2} are responsible for a non-satisfiable FM. In such a situation, the constraints of the original FM, i.e., {c0, c1, c7, c9}, are regarded as *background knowledge* and conflict search is focused on {ca1, ca2}.

Conflicts in FMs and corresponding formalizations (e.g., constraint satisfaction problems) can occur in different situations: (1) if an FM is non-satisfiable, knowledge engineers have to be supported in identifying the responsible conflicts in CF. (2) It could also be the case that the FM is satisfiable, i.e., at least one solution can be identified, but some configurations derived from the FM do not reflect existing real-world domain constraints (e.g., the smartwatch product line allows a basic screen but the smartwatch factory currently does not provide basic screens). In order to identify such "unintended behaviors" of FMs, test suites (with individual test cases representing real-world properties/constraints) are defined that can be used for quality checking in the context of FM evolution. In Section 3.2, we will introduce basic concepts of conflict detection and conflict resolution on the basis of model-based diagnosis [39]. In Section 3.4, we show how FMs can be analyzed/tested with test suites that define the intended behavior of an FM. More precisely, test cases specifying intended semantics are used to induce conflicts in FMs which are then resolved with model-based diagnosis.

### **3.2 Diagnosing Inconsistent Constraint Sets**

The increasing size and complexity of FMs and their widespread industrial use trigger a demand for the automated identification of faulty FM constraints [34]. The foundation for such support are *conflict detection* [24, 29, 51] and *diagnosis* [39] algorithms which support the identification of faulty constraints (relationships) that represent an *explanation* for the faulty behavior of an FM (a kind of *why-not* explanation). In this context, (1) *conflict detection* is used to identify minimal subsets of constraints that are inconsistent (i.e., minimal unsatisfiable subsets – MUS [31]), and (2) *diagnosis* helps to identify minimal sets of constraints that have to be adapted or deleted from the FM such that the new version of the model is satisfiable (i.e., minimal correction subsets – MCS [45]).

A diagnosis [39, 52] is a *hitting set*, i.e., a set of constraints that have to be deleted from the conflict sets such that all conflicts are resolved. We now show how to determine conflicts and corresponding diagnoses. For demonstration purposes, we use a reduced version of the FM shown in Figure 3.8. The corresponding FM configuration task (as defined in Chapter 2) is the following.<sup>2</sup> Note that we kept the constraint identifiers that have been introduced in Chapter 2.


<sup>2</sup> For a definition of an *FM configuration task* see Chapter 2. Due to our focus on model analysis, we omit application requirements, i.e., we focus on CF.

In the FM shown in Figure 3.8, we have added two faulty constraints for demonstration purposes. (1) The constraint ca1 specifies a mandatory inclusion of the payment feature in every possible configuration. (2) Constraint ca2 specifies an incompatibility between the features *payment* and *touch* screen. In real-world settings, such a combination is obligatory in the sense that payments require interactive user interface elements; for this reason, ca2 can be regarded as faulty.

**Reasons for faulty constraints.** There are various possible reasons for the existence of faulty constraints in an FM. (1) It could be the case that, due to the increasing complexity of an FM, cognitive overload leads to misinterpreting specific feature relationships. (2) In some cases, the reason for the inclusion of faulty constraints is missing domain knowledge. (3) Another reason is outdated domain knowledge, i.e., FM elements that were included a long time ago but have not been adapted to reflect new variability properties.

When starting a constraint or SAT solver to calculate a solution for our example FM configuration task, the result will be *no solution could be identified*. The reason is that each *smartwatch* has to include a screen and a screen has to be either a *touch* screen or a *standard* screen. At the same time, the payment feature is defined as mandatory, with further constraints forbidding a combination of *payment* with a *standard* screen or a *touch* screen.

Before taking a more detailed look at mechanisms that support conflict detection and resolution, we introduce the following definition of a *conflict set*.

**Definition 3.1** (Conflict Set CS). CS = {c1..cn} is a subset of a constraint set C with inconsistent(CS). CS is *minimal* if ¬∃CS′ : CS′ ⊂ CS and CS′ is a conflict set.

**Three basic conflict detection scenarios.** This definition of a conflict set can be applied in different scenarios. (1) If the set of FM constraints CF does not allow the determination of a solution, then we need to search for a conflict set in CF, i.e., C = CF (see **Subsection 3.2.1**). (2) Assuming the consistency of CF and the inconsistency of CF ∪ R, we are interested in figuring out the sets of user preferences (requirements) that are inconsistent with CF. In this case, a conflict set will be found in R, i.e., C = R (see **Subsection 3.2.2**). (3) In FM quality assurance, we have to be able to support knowledge engineers in understanding and explaining unintended FM "behavior" (i.e., semantics) in the sense that a SAT or constraint solver proposes FM configurations which are unintended, i.e., in contradiction with related real-world domain constraints. For example, if a *payment* feature must be combined with a *touch screen* (real-world domain constraint) but the solver does not allow such configurations, adaptations of the FM are needed in order to reflect the real-world domain constraints. Also in this context, we are interested in the constraint sets (i.e., FM relationships and cross-tree constraints) which are responsible for the unintended semantics. We have to analyze the constraints in CF, i.e., C = CF, where conflicts are induced by test cases specifying the intended FM semantics (see **Section 3.4**).

**Minimality properties of conflict sets.** A minimal conflict set allows conflict resolution by deleting only one element from CS. Minimality properties of relevance in this context are *subset minimality* (see Definition 3.1) and *minimal cardinality*, where the latter is more restrictive in the sense of minimizing the number of conflict elements. Minimal cardinality conflict sets are preferred in situations where an increasing conflict set size would deteriorate solution quality. For example, in group decision making, minimal cardinality conflicts can help to reduce the overall communication overhead related to conflict resolution [8, 28, 47]. Subset minimality, on the other hand, is specifically useful in scenarios where there are preference relationships over different features. For example, in the context of our smartwatch example, users might have a strong upper price limit in mind while being more flexible with regard to the inclusion or exclusion of other specific features. In such a situation, conflict resolution should focus on conflicts induced by "unimportant" constraints. Specifically in real-time scenarios such as network load balancing or scheduling, minimality criteria have to be relaxed in order to take into account corresponding response time requirements [16].

### **3.2.1 Identifying Conflict Sets in Non-Satisfiable Feature Models**

In the following, we show how a conflict, more precisely, a minimal conflict set, can be determined on the basis of QuickXPlain [24], which is a widely used *divide-and-conquer* based approach to the identification of *subset-minimal* conflicts. The basic idea of QuickXPlain is the following: given an inconsistent set of constraints, for example, C = {c1..c20} where {c1..c10} is inconsistent, a minimal conflict set can be identified in {c1..c10}, i.e., the remaining constraints {c11..c20} can be omitted after one consistency check. If {c1..c10} is consistent, the conflict has to be searched for in {c1..c15} (i.e., {c1..c10} extended with the "first half" of {c11..c20}). In this situation, the scope of conflict search is extended until an inconsistent state is reached. QuickXPlain is activated with a consideration set C, i.e., the set of constraints with an expected conflict, and a constraint set B representing the background knowledge, which is assumed to be consistent (see Algorithm 1).

In our example, we assume the background knowledge B = {c0, c1, c7, c9}, i.e., the set of constraints which are assumed to be correct. Furthermore, C is the set of constraints inducing an inconsistency with B and – for this reason – includes one or more conflict sets. QuickXPlain is flexible: we are able to apply the algorithm in scenarios where C represents an inconsistent set of user requirements, i.e., C = R, but also in scenarios where the FM constraints themselves are inconsistent (e.g., in the context of a non-satisfiable FM), i.e., C = CF. If B ∪ C is consistent or C = ∅, QuickXPlain directly returns ('no conflict' and ∅, respectively). In any other case, the conflict detection process is started by activating QX (Algorithm 2), which is a divide-and-conquer based routine for the identification of minimal conflict sets (in C).

**Algorithm 1** QuickXPlain(C = {c1..cn}, B):

```
1: if Consistent(B ∪ C) then
2:    return('no conflict')
3: else if C = ∅ then
4:    return(∅)
5: else
6:    return(QX(∅, C, B))
7: end if
```

The search for a minimal (irreducible) conflict set in the consideration set C is performed by QX (see Algorithm 2), where the returned conflict set CS satisfies the following property: ¬∃CS′ ⊂ CS : conflict set(CS′), i.e., no proper subset of a minimal (irreducible) conflict set can be a conflict set. If B is consistent and C has more than one element, C is divided into two separate sets, where C1 is added to B in order to analyze further elements of the conflict. If C includes only one element (|C| = 1), this element can be considered part of the minimal conflict set – this is due to the invariant property *inconsistent*(B ∪ C), i.e., since B is consistent, the remaining element must be responsible for inducing the conflict. In the context of the consistency check of B (line 1), D indicates which constraints have been added to B in the previous step.

**Algorithm 2** QX(D, C = {c1..cn}, B):

```
1: if D ≠ ∅ ∧ inconsistent(B) then
2:    return(∅)
3: end if
4: if |C| = 1 then
5:    return(C)
6: else
7:    k = ⌊n/2⌋
8:    C1 ← c1..ck; C2 ← ck+1..cn
9:    D2 ← QX(C1, C2, B ∪ C1)
10:   D1 ← QX(D2, C1, B ∪ D2)
11:   return(D1 ∪ D2)
12: end if
```
Assuming the consideration set C = {ca1, ca2} consisting of the two additional constraints of our working example (i.e., we first want to focus our search for minimal conflicts on the two new constraints) and B = {c0, c1, c7, c9}, we now sketch the execution of Algorithms 1–2. Algorithm 2 is based on depth-first search where the left branch is responsible for determining D2 whereas the right branch is responsible for determining D1. In our example (see Figure 3.9), the determined minimal conflict set is CS = {ca1, ca2}, which means that there are two possible conflict resolutions: (1) delete ca1 or (2) delete ca2.
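A compact executable sketch of Algorithms 1–2, with consistency checking done by brute-force enumeration over a toy variant of the example (the constraints labeled c0, c1, c7, c9, ca1, ca2 are our own simplified stand-ins for those in Figure 3.8; a real implementation would delegate `consistent` to a SAT or constraint solver):

```python
from itertools import product

FEATURES = ["smartwatch", "screen", "standard", "touch", "payment"]

def consistent(constraints):
    """True iff some complete assignment satisfies all (name, pred) pairs."""
    return any(
        all(pred(dict(zip(FEATURES, values))) for _, pred in constraints)
        for values in product([False, True], repeat=len(FEATURES))
    )

def quickxplain(C, B):                 # Algorithm 1
    if consistent(B + C):
        return "no conflict"
    if not C:
        return []
    return qx([], C, B)

def qx(D, C, B):                       # Algorithm 2
    if D and not consistent(B):
        return []
    if len(C) == 1:
        return C
    k = len(C) // 2
    C1, C2 = C[:k], C[k:]
    D2 = qx(C1, C2, B + C1)
    D1 = qx(D2, C1, B + D2)
    return D1 + D2

# Background knowledge: toy stand-ins for the original FM constraints.
B = [
    ("c0", lambda c: c["smartwatch"]),
    ("c1", lambda c: c["smartwatch"] == c["screen"]),
    ("c7", lambda c: c["screen"] == (c["standard"] != c["touch"])),
    ("c9", lambda c: not (c["payment"] and c["standard"])),
]
# The two added (faulty) constraints of Figure 3.8:
C = [
    ("ca1", lambda c: c["smartwatch"] == c["payment"]),
    ("ca2", lambda c: not (c["payment"] and c["touch"])),
]

print([name for name, _ in quickxplain(C, B)])  # ['ca1', 'ca2']
```

As in the execution trace of Figure 3.9, both added constraints form the minimal conflict: B with either one of them alone remains consistent.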

### **3.2.2 Identifying Conflict Sets in User Requirements**

Fig. 3.9: Execution trace of QX on the basis of C = {ca1, ca2} and B = {c0, c1, c7, c9} resulting in the minimal conflict set CS = {ca1, ca2}.

In the example introduced in Chapter 2, CF = {c0..c10} represents the constraints derived from the FM (see Figure 2.3). Let us assume a set of user (customer) requirements specifying preferences regarding the inclusion and exclusion of specific features, represented as a set of constraints R = {r11: payment = true, r12: running = true, r13: standard = true, r14: gps = false}. These requirements specify a *smartwatch* with a standard screen and the payment feature. Furthermore, *sportstracking* support for *running* is required; however, the user does not want to include a *gps* feature. In this example setting, we assume that the user is free to include or exclude features and the configurator (based, e.g., on a SAT solver or constraint solver) is in charge of checking the consistency of CF ∪ R.

Since CF ∪ R is inconsistent in our case, conflict detection can help to resolve the inconsistency. In this context, we assume B = CF (the FM constraints are assumed to be correct) and the consideration set C = R is the set of application requirements that are responsible for the inconsistency of CF ∪ R. In this situation, we are interested in the conflict sets in R (C) that need to be resolved in order to restore the consistency of CF ∪ R. The steps to determine all minimal conflict sets in C using QuickXPlain (Algorithm 1) are shown in Figures 3.10 – 3.11.

The first minimal conflict set derived from C = {r11..r14} is CS1 = {r11, r13}. This conflict could be shown to the user with the additional information that at least one of the requirements has to be adapted, i.e., switched from inclusion to exclusion or vice versa. Let us assume that the user decides to change his/her preferences regarding the requirement r11, i.e., he/she accepts the exclusion of *payment*; our set of user requirements then changes to R = {r12: running = true, r13: standard = true, r14: gps = false}. Based on this changed situation, the QuickXPlain algorithm can be reactivated with B = {c0..c10} and an adapted set of user requirements C = {r12, r13, r14} (there is no need to include r11 since CS1 has been resolved by deleting/adapting r11).

Fig. 3.10: Execution trace of QX on the basis of C = {r11..r14} and B = {c0..c10} resulting in the minimal conflict set CS = {r11, r13}.

Fig. 3.11: Execution trace of QX on the basis of C = {r12..r14} and B = {c0..c10} resulting in the minimal conflict set CS = {r12, r14}.

The second minimal conflict set (derived from C) is CS2 = {r12, r14}. Again, a user can decide how to resolve this conflict. For example, if a user has a strong preference for including the *sportstracking running* feature, he/she has to accept the inclusion of the *gps* feature. Summarizing, in our example the user accepted the exclusion of the *payment* feature and also accepted the inclusion of the *gps* feature. These accepted adaptations are also denoted as *hitting sets* or *diagnoses* [39] – corresponding algorithmic approaches will be discussed in the following.

**The role of constraint orderings.** In many conflict detection scenarios, there is an exponential number of conflicts [24]. QuickXPlain is able to identify so-called *preferred conflicts* (one at a time), i.e., conflicts with a high probability of being relevant for the user. Using QuickXPlain, the returned minimal conflict set can differ depending on the original ordering of the constraints in the consideration set, where it is assumed that constraints at the beginning of the constraint list have the lowest importance and constraint importance increases with a higher ranking in the list. In our example, the first returned conflict set is {r11, r13}, with r11 having the lowest importance of all constraints in C = {r11..r14} representing the user requirements R. If we changed the order of our example constraints to {r14, r13, r12, r11} (assuming an ordered set semantics), QuickXPlain would first return the minimal conflict set CS1 = {r12, r14}.<sup>3</sup>

**Further approaches to conflict detection.** Further conflict detection approaches are based on the idea of integrating the search for conflicts into solution search. For example, the constraints of an FM can be reformulated as follows: if ca1: smartwatch ⇔ payment is the original constraint, the corresponding reformulation could be ca1: x1 = 1 ⇒ (smartwatch ⇔ payment), where x1 is a Boolean (0/1) variable used for counting the number of activated constraints. Using such a representation, we are able to formulate the conflict set identification task as a minimization problem: mincardinalityset({{c1..cn} : inconsistent({c1..cn})}), which means to minimize x1 + .. + xn. For an overview of approaches that support the identification of minimal conflict sets (minimal unsatisfiable cores) we refer to Liffiton et al. [31, 48].

**Resolution of conflicts based on diagnosis.** The overall goal in most settings is to identify a minimal set of constraints that have to be deleted (adapted) in order to restore consistency in a given constraint set – such sets are denoted as diagnoses (hitting sets) [15, 39]. In our example of inconsistent user requirements, one diagnosis (the one preferred by the user) is {r11, r14}, i.e., by adapting the requirements specified with {r11, r14}, the consistency of CF ∪ R can be restored. In our example non-satisfiable FM, a diagnosis would be {ca1}. In both cases, a diagnosis denotes a (minimal) set of constraints that have to be adapted or deleted to restore consistency.

Before taking a more detailed look at mechanisms that support diagnosis determination, we introduce a definition of a *diagnosis* (see Definition 3.2).

**Definition 3.2** (Diagnosis). A diagnosis Δ = {c1..cn} is a subset of C with consistent(C − Δ). Δ is *minimal* if ¬∃Δ′ : Δ′ ⊂ Δ and Δ′ is a diagnosis.

**Three basic diagnosis scenarios.** Definition 3.2 can be applied in different scenarios. (1) If an FM is *non-satisfiable*, we need to search for a diagnosis in the set of FM constraints CF. (2) Assuming FM consistency and – at the same time – inconsistency of CF ∪ R, a diagnosis can be identified in the set R. (3) If an FM is consistent, it can still be the case that it shows unintended semantics, for example, it allows the determination of configurations which are not allowed in the application domain (and could lead to potentially faulty software configurations).

<sup>3</sup> For details regarding the QuickXPlain constraint ordering, we refer to Junker [24].

In this context, a diagnosis can again be searched for in CF, but conflicts are induced by *test cases* specifying the intended semantics of an FM [10] (see Section 3.4). Diagnoses are often denoted as *hitting sets* [39] or *minimal correction subsets* (MCS) [31]. Furthermore, the complement of a hitting set is denoted as a *maximal satisfiable subset* (MSS) [31]. In the context of a minimal diagnosis Δ, no subset of Δ fulfills the diagnosis property. In the context of an MSS Γ, no extension of Γ is consistent, i.e., allows the calculation of a solution.

**Minimality properties of diagnoses.** An important aspect of diagnosis minimality is the guarantee that only those constraints (requirements) are adapted which need to be adapted in order to restore consistency. Similar to conflict sets, minimal diagnoses can be either *subset minimal* or of *minimal cardinality* (the first interpretation is used in Definition 3.2). Minimal cardinality diagnoses are preferred in scenarios where there are no clear preferences regarding individual adaptations of constraints. In contrast, *subset minimal* diagnoses are used in contexts where preferred constraints should be kept as-is whereas less preferred constraints should be the preferred diagnosis candidates.

**Determining minimal cardinality diagnoses in non-satisfiable FMs.** The basic approach to diagnosis determination is to delete at least one element from each individual conflict set (then, all conflicts are resolved). In our example non-satisfiable FM (see Figure 3.8), there exists exactly one conflict (set), which is CS = {ca1, ca2}. There are two possible ways of resolving this conflict, described by the two diagnoses Δ1 = {ca1} and Δ2 = {ca2}, meaning that either ca1 or ca2 has to be adapted or deleted in order to restore the consistency of the FM. This way, we are able to support engineers in the development of FMs by relieving them of the burden of manually identifying faulty constraints.

**Determining minimal diagnoses in user requirements.** Similar to the determination of diagnoses in non-satisfiable FMs, diagnosis determination in the context of inconsistent user requirements is based on the resolution of individual conflicts. Given the two conflict sets in the context of inconsistent user requirements (CS1 = {r11, r13} and CS2 = {r12, r14}), we are able to derive four corresponding diagnoses (Δ1, Δ2, Δ3, Δ4) based on the construction of a hitting set directed acyclic graph (HSDAG) [39]. After having resolved, for example, CS1 by removing the constraint r11, we still have to resolve CS2 with two remaining options, namely constraint r12 or constraint r14. In this example, each path of our example HSDAG leads to a corresponding minimal diagnosis. However, this is not always the case; some paths have to be closed (not taken into account) since other completed paths already represent a minimal diagnosis (which is a subset of the diagnosis described by the current path). Such closing of nodes is important to assure the efficiency of HSDAG determination [39]. Furthermore, conflict sets do not have to be calculated for every node in the HSDAG. As can be seen in our example (Figure 3.12), the conflict set CS2 = {r12, r14} occurs twice; however, there is no need for recalculation, for example, with QuickXPlain (see Algorithm 1).

Fig. 3.12: Hitting Set Directed Acyclic Graph (HSDAG) for the conflict sets CS1 = {c11, c13} and CS2 = {c12, c14}. The resulting minimal diagnoses are Δ1 = {c11, c12}, Δ2 = {c11, c14}, Δ3 = {c12, c13}, and Δ4 = {c13, c14}.

**Alternative diagnosis algorithms.** In contrast to the hitting set based approach to diagnosis determination [39], *direct diagnosis* supports the determination of diagnoses (hitting sets) without the need for a corresponding conflict detection. An example of such a diagnosis algorithm is FastDiag [15], which determines subset-minimal diagnoses on the basis of a divide-and-conquer approach. The underlying idea is the following: given an inconsistent set of constraints, for example, C = {c1..c20}, if {c1..c10} is consistent, a minimal diagnosis can be identified in {c11..c20}, i.e., {c1..c10} can be excluded from the diagnosis search. Further diagnosis approaches follow the idea of integrating diagnosis and solution search. As discussed in the context of conflict detection, constraints can be reformulated: a constraint ci can be replaced by c′i : pi = 1 ⇒ ci, where pi is a new Boolean selector variable. A diagnosis task for a constraint set {c1..cn} can then be interpreted as a minimization task over the selector variables: finding a solution of the reformulated constraint set that minimizes the number of deactivated constraints (those with pi = 0); the deactivated constraints constitute a minimal cardinality diagnosis.
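A minimal executable sketch of the divide-and-conquer idea behind FastDiag is shown below; the toy constraints and the brute-force `consistent` oracle are our own illustration, not the original implementation from [15]:

```python
from itertools import product

def fastdiag(C, consistent):
    """Return one subset-minimal diagnosis (constraints to remove) from an
    inconsistent constraint set C, following the divide-and-conquer idea of
    FastDiag (sketch)."""
    if not C or consistent(C):
        return []

    def fd(delta, S, AC):
        # if removing delta already restored consistency, nothing of S is needed
        if delta and consistent(AC):
            return []
        if len(S) == 1:
            return list(S)
        S1, S2 = S[:len(S) // 2], S[len(S) // 2:]
        d1 = fd(S1, S2, [c for c in AC if c not in S1])
        d2 = fd(d1, S1, [c for c in AC if c not in d1])
        return d1 + d2

    return fd([], C, C)

# Toy constraints over two Boolean variables x, y (illustrative only):
c1 = ("c1", lambda x, y: x == 1)
c2 = ("c2", lambda x, y: x == 0)   # conflicts with c1
c3 = ("c3", lambda x, y: y == 1)

def consistent(cs):
    return any(all(p(x, y) for _, p in cs) for x, y in product((0, 1), repeat=2))

print([name for name, _ in fastdiag([c1, c2, c3], consistent)])  # -> ['c1']
```

Note how the consistency check on `AC` prunes half of the constraint set in one step, which is exactly the exclusion argument made in the paragraph above.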

**Relationship between diagnoses and conflicts.** Note that there is a natural relationship between minimal conflict sets and minimal diagnoses in terms of a duality property [43]: for a given set of minimal conflicts (i.e., a set of sets) we are able to determine a corresponding set of minimal diagnoses (again, a set of sets) on the basis of a hitting set directed acyclic graph (HSDAG) [39]. Vice versa, exactly the same set of conflicts can be determined by constructing an HSDAG from the given set of diagnoses. This can be useful, for example, in a situation where one is interested in determining minimal cardinality conflict sets (in contrast to subset-minimal conflict sets), which can be determined on the basis of hitting set based breadth-first search [39].
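For completeness, the conflict-detection counterpart can be sketched in the same style. The following function follows the divide-and-conquer scheme of QuickXPlain (see Algorithm 1); the toy constraints and the `consistent` oracle are again illustrative assumptions, not the original code:

```python
def quickxplain(B, C, consistent):
    """One minimal conflict set within C w.r.t. background B, following the
    divide-and-conquer scheme of QuickXPlain (sketch)."""
    if not C or consistent(B + C):
        return []

    def qx(B, delta, C):
        # if the recently added delta already makes B inconsistent, stop here
        if delta and not consistent(B):
            return []
        if len(C) == 1:
            return list(C)
        C1, C2 = C[:len(C) // 2], C[len(C) // 2:]
        d2 = qx(B + C1, C1, C2)
        d1 = qx(B + d2, d2, C1)
        return d1 + d2

    return qx(B, [], C)

# Toy constraints over one Boolean variable x (illustrative only):
c1 = ("c1", lambda x: x == 1)
c2 = ("c2", lambda x: True)
c3 = ("c3", lambda x: x == 0)   # conflicts with c1
c4 = ("c4", lambda x: True)

def consistent(cs):
    return any(all(p(x) for _, p in cs) for x in (0, 1))

print([name for name, _ in quickxplain([], [c1, c2, c3, c4], consistent)])
# -> ['c1', 'c3']
```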

### **3.3 Redundancy Detection in Feature Models**

The development and maintenance of FMs can be a time-consuming task leading to the inclusion of unintended semantics into FMs. A somewhat orthogonal aspect compared to the topics discussed up to now is the occurrence of redundancies in FMs. Elements (constraints) which are redundant can *increase* development and maintenance efforts for FMs (due to a decreased model understandability) and *decrease* the efficiency of constraint and SAT solvers [25, 27]. In FMs, a redundancy can be interpreted as a collection of model elements that can be removed from the FM without changing its semantics in terms of the solution space defined by the FM [25, 26]. More formally, if C = {c1..cn} is a set of constraints and ca ∈ C is redundant, then C − {ca} ∪ {¬ca} is inconsistent, i.e., the solution space of C − {ca} corresponds to the original one. Consequently, if ca ∈ C is redundant, ca *logically follows* from C − {ca}, i.e., C − {ca} |= ca. An algorithm for determining the complete set of non-redundant constraints (CΔ) of an FM is FMRedundancy (see Algorithm 3). The overall idea is to iterate over all constraints defined in C, i.e., the constraints derived from the FM, and to analyze each constraint with regard to redundancy.

**Algorithm 3** FMRedundancy(C = {c1..cn}): CΔ
1: CΔ ← C
2: **for all** ci ∈ C **do**
3: **if** inconsistent(CΔ − {ci} ∪ {¬ci}) **then**
4: CΔ ← CΔ − {ci}
5: **end if**
6: **end for**
7: *return*(CΔ)
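Algorithm 3 can be turned into a short executable sketch, assuming a `consistent` oracle and constraints represented as (name, predicate) pairs; both representations are our own illustration:

```python
def fm_redundancy(C, consistent):
    """FMRedundancy as an executable sketch: a constraint c is redundant iff
    C_delta - {c} + {not c} is inconsistent, i.e., c already follows from the
    remaining constraints and can be dropped."""
    c_delta = list(C)
    for c in list(C):
        rest = [x for x in c_delta if x is not c]
        # wrap the predicate to obtain the negated constraint {not c}
        negated = ("not " + c[0], lambda *args, _p=c[1]: not _p(*args))
        if not consistent(rest + [negated]):
            c_delta = rest
    return c_delta

# Toy constraints over a single variable x in {0, 1, 2} (illustrative only):
c1 = ("c1", lambda x: x == 2)
c2 = ("c2", lambda x: x >= 1)   # redundant: implied by c1

def consistent(cs):
    return any(all(p(x) for _, p in cs) for x in range(3))

print([name for name, _ in fm_redundancy([c1, c2], consistent)])  # -> ['c1']
```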

In the FM of Figure 2.3, no redundant constraints can be identified. If we add a constraint ca requiring the context-dependent inclusion of the *energymanagement* feature to the set of FM constraints C, a corresponding redundancy can be identified: the *energymanagement* feature is mandatory; consequently, a constraint requiring the context-dependent inclusion of this feature is redundant since this feature will be included anyway. An execution trace of Algorithm 3 can be found in Table 3.3.

Table 3.3: Identification of redundant constraints in C based on FMRedundancy.


In our example, CΔ (the redundancy-free constraint set derived from C) is returned by Algorithm 3. This set guarantees that the semantics of the corresponding FM remains the same, i.e., the FM including C has the same solution (configuration or product) space as the FM including CΔ.

### **3.4 Feature Model Testing and Debugging**

In the previous sections, we have discussed the concepts of conflict detection and diagnosis on the basis of two scenarios: (1) restoring the satisfiability of an FM and (2) restoring the consistency of user requirements (within the scope of a configuration process). In this section, we focus on situations where an FM is satisfiable but still does not behave as expected, in the sense that FM configurations are supported (or even generated) that are not allowed or expected in the corresponding application domain. The reasons for such unintended semantics are manifold and range from insufficient domain knowledge of engineers and outdated knowledge still included in the FM to cognitive overheads of engineers triggered by FMs of low understandability.

Our example FM depicted in Figure 2.3 does not include any dead features, i.e., every feature is activated in at least one configuration of the complete set of possible configurations that can be derived from the FM. If we analyze our FM with the corresponding analysis operation (see *dead feature (f)* in Table 3.1), the outcome will be as expected. In situations where some features are dead, the question arises in which way the FM has to be adapted in order to exclude dead features. More generally: how can the FM be adapted in such a way that specific properties (e.g., a satisfiable FM, no dead features, and no false optional features) are fulfilled? A well-known concept in software engineering are test cases and test suites, which are used to assure a specific intended behavior of a software system. If some test cases fail, corresponding adaptations by developers are needed. In the same sense, we are able to define test cases for FMs in such a way that unintended semantics can be discovered and, at the same time, those model parts can be automatically identified that are responsible for this unintended semantics.
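The dead-feature analysis operation mentioned above can be sketched as a brute-force check (our own illustration with a hypothetical toy model; real analysis operations use SAT or constraint solvers instead of enumeration):

```python
from itertools import product

def dead_features(features, constraints):
    """Return all dead features, i.e., features that are excluded in every
    possible configuration. Brute-force sketch: enumerate all assignments
    (only feasible for tiny FMs)."""
    configs = []
    for bits in product((False, True), repeat=len(features)):
        cfg = dict(zip(features, bits))
        if all(c(cfg) for c in constraints):
            configs.append(cfg)
    # note: in a void FM (no configuration at all) every feature is reported dead
    return [f for f in features if not any(cfg[f] for cfg in configs)]

# Hypothetical toy model in which feature 'b' can never be selected:
constraints = [
    lambda c: c["root"],                   # the root feature is always included
    lambda c: (not c["b"]) or c["root"],   # b requires root
    lambda c: not (c["b"] and c["root"]),  # b excludes root (faulty constraint)
]
print(dead_features(["root", "a", "b"], constraints))  # -> ['b']
```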

A simple example of a test suite T = {t1..tn} is the following: T = {t1 : f1 = true, t2 : f2 = true, .., t13 : f13 = true}, i.e., each feature fi is represented by a corresponding test case that is used *to assure that no feature is dead*. This approach of defining intended FM semantics in terms of test cases can be applied to other types of analysis operations as well as, in a more general case, by specifying example-wise intended semantics. In the discussed approach, test cases are regarded as constraints that define intended (and also unintended) semantics of FMs.<sup>4</sup> The underlying idea is to exploit a defined set of test cases to induce inconsistencies in an FM and to resolve these inconsistencies on the basis of model-based diagnosis. In order to apply conflict detection and diagnosis in such

<sup>4</sup> In the following, we focus our discussion on the specification of intended semantics (i.e., positive test cases).

scenarios, we need to define the concepts of a conflict set and a corresponding diagnosis (see Definitions 3.3–3.4). We use the term *debugging* for actions that restore the consistency and semantic correctness of an FM, i.e., that free the FM from faulty constraint definitions. In the context of FM testing and debugging, a conflict set can be defined as follows (see Definition 3.3).

**Definition 3.3** (Conflict Set). A conflict set CS = {c1..ck} is a subset of C s.t. ∃ ti ∈ T : inconsistent(CS ∪ {ti}). CS is *minimal* if ¬∃ CS′ : CS′ ⊂ CS and CS′ is a conflict set.

In the context of FM testing and debugging, a diagnosis can be defined as follows (see Definition 3.4).

**Definition 3.4** (Diagnosis). A diagnosis Δ = {c1..ck} is a subset of C with ∀ ti ∈ T : consistent(C − Δ ∪ {ti}). Δ is *minimal* if ¬∃ Δ′ : Δ′ ⊂ Δ and Δ′ is a diagnosis.

**Determining minimal diagnoses in FM testing.** Diagnosis in the context of FM testing is based on the resolution of (minimal) conflicts induced by a set of test cases (see Definition 3.3). Let us assume the existence of a set of test cases T = {t1 : payment = true ∧ standard = true, t2 : energymanagement = false}, requiring the existence of FM configurations that include a *payment* feature combined with a *standard screen* as well as configurations that exclude the *energymanagement* feature. In the context of our example FM (see Figure 2.3), both test cases induce a corresponding conflict: (1) the test case t1 induces the (singleton) conflict set CS1 = {c9 : ¬(payment ∧ standard)} and (2) the test case t2 induces the (singleton) conflict set CS2 = {c5 : energymanagement ⇔ smartwatch}.

Based on this information, we are able to construct a hitting set directed acyclic graph which helps to resolve conflicts in a structured fashion (see Figure 3.13). Since we have to deal with two singleton conflict sets, i.e., conflict sets containing exactly one element, the resulting HSDAG includes exactly one diagnosis (Δ1).

$$\begin{array}{c} \mathsf{cs}\_{1} = \{\mathsf{c}\_{9}\} \\ \bigsqcup \mathsf{c}\_{9} \\ \mathsf{cs}\_{2} = \{\mathsf{c}\_{5}\} \\ \bigsqcup \mathsf{c}\_{5} \\ \Delta\_{1} = \{\mathsf{c}\_{5}, \mathsf{c}\_{9}\} \end{array}$$

Fig. 3.13: Hitting Set Directed Acyclic Graph (HSDAG) for the conflict sets CS1 = {c9} and CS2 = {c5}. The resulting minimal diagnosis is Δ1 = {c5, c9}.

**Alternative algorithms for FM testing and debugging.** In contrast to the discussed hitting set based approach, FM testing and debugging can also be implemented on the basis of direct diagnosis [15]. An approach to the related application of FastDiag is discussed in detail in Felfernig et al. [26].

### **3.5 Machine Learning for Conflict Detection and Diagnosis**

In an inconsistent constraint set, there can be numerous conflicts and corresponding diagnoses. In this context, it is important to figure out the diagnoses of relevance. In *interactive configuration settings*, diagnoses (repairs) proposed to users should only include preferences of low importance for the user. A user-individual (personalized) diagnosis can be determined by integrating machine learning concepts that help to infer preference importance, for example, on the basis of the preferences articulated in previous user sessions [14, 38]. We refer to Chapter 4 for concepts that help to tackle such a *no solution could be found dilemma*.

When diagnosing an *unsatisfiable FM*, we are in a similar situation, i.e., we need to identify the diagnoses of potentially highest relevance. Table 3.4 shows a simple example of diagnosis ranking where the Δi represent diagnoses determined for the constraints c1..c5 of an unsatisfiable FM.

Table 3.4: Example of a diagnosis ranking approach: diagnoses Δ1 and Δ4 have the highest accumulated constraint occurrence value (represented by the *score* value).


A simple approach to rank the diagnoses Δi is to use the constraint occurrences in Δ1..Δ5 as an indicator, i.e., the more often a constraint is part of a diagnosis, the higher its relevance. Following this idea, Δ1 and Δ4 (both have an accumulated constraint occurrence value of 6) could be considered the most relevant diagnoses. In this case, a recommendation for the designer of an FM could be to first look at the constraints contained in Δ1 and Δ4. For more details on different ways to rank diagnoses in unsatisfiable constraint sets we refer to Felfernig et al. [12, 14].
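This occurrence-based scoring can be sketched as follows; the diagnosis entries are illustrative and do not reproduce the exact values of Table 3.4:

```python
from collections import Counter

def rank_diagnoses(diagnoses):
    """Rank diagnoses by the summed occurrence counts of their constraints
    across all diagnoses (higher accumulated score = more relevant)."""
    occurrence = Counter(c for d in diagnoses for c in d)
    scored = [(sum(occurrence[c] for c in d), d) for d in diagnoses]
    return sorted(scored, key=lambda pair: -pair[0])

# Illustrative diagnoses (not the exact entries of Table 3.4):
diagnoses = [["c1", "c2"], ["c2", "c3"], ["c3", "c4"], ["c1", "c2", "c5"]]
for score, diag in rank_diagnoses(diagnoses):
    print(score, diag)
# the diagnosis containing the most frequently occurring constraints ranks first
```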

### **3.6 Discussion**

In this chapter, we have discussed different topics related to the analysis of FMs. First, analysis operations help to figure out specific properties of an FM; for example, an analysis operation related to void features helps to identify all features that cannot be part of any configuration. In this context, we have introduced a differentiation between analysis operations in need of solver support (as is the case with void features) and analysis operations without a need for solver support (e.g., counting the features of an FM).

Analysis operations help to understand basic properties of FMs. We have also introduced concepts that help developers of FMs to efficiently deal with inconsistencies. For example, when confronted with a non-satisfiable FM (no solution can be identified), conflict detection and diagnosis algorithms can be applied to identify the sources of an inconsistency (the conflicts) and to propose corresponding repairs (the diagnoses). In a working example, we showed how to determine diagnoses. Furthermore, we extended the application of diagnosis algorithms to the testing and debugging of FMs. In this context, test cases (represented as constraints) define the intended semantics of FMs. If an FM has a different semantics, the defined test cases help to induce conflicts in the set of FM constraints (C), which can then be resolved on the basis of model-based diagnosis.

As an orthogonal aspect in the context of FM analysis, we introduced an algorithmic approach to redundancy detection in FMs. A constraint can be considered redundant if the semantics of the FM does not change when the constraint is deleted. In this context, we have introduced an algorithm for redundancy detection in FMs and showed its operation on the basis of a working example.

In the context of FM analysis, we regard the following aspects as *major issues for future work*.

**Synthesis mechanisms for algorithm performance evaluation.** Designing and developing conflict detection and diagnosis algorithms requires the structured provision of test feature (configuration) models which allow algorithm performance evaluation in a structured fashion [49]. For example, it should be possible to predefine the number of diagnoses, the number of conflict sets, and the corresponding cardinalities. On the basis of the generated FMs, more structured evaluations can be performed. In this context, large language models (LLMs) can also be regarded as a promising approach to support the generation of test models [19].

**Parallelization of FM analysis.** Existing parallelization architectures make it feasible to parallelize constraint reasoning as well as related conflict detection and diagnosis processes [30, 51]. A major challenge is to efficiently exploit parallelization architectures to significantly increase the efficiency of the mentioned operations. This is important specifically due to the increasing size and complexity of variability models (e.g., FMs). A further open issue is how to parallelize the identification of redundant constraints – no related algorithms exist up to now.

**Cognitive issues in FM development and maintenance.** Being able to identify the sources of an inconsistency also requires knowledge from cognitive psychology. For example, in order to identify the set of constraints responsible for the faulty semantics of an FM, knowledge about the cognitive complexity of individual FM elements should be exploited. The reason is that constraint structures which are less understandable have a higher probability of being the source of faulty behavior. A simple example in this context is the logical implication a ⇒ b – the corresponding implication b ⇐ a with exactly the same semantics requires additional cognitive overhead [13].

**Gamification-based conflict detection and diagnosis.** Specifically in the context of teaching Artificial Intelligence (AI) topics, it is important to provide intuitive explanations of concepts and algorithms. In this context, gamification has been shown to be an appropriate way of making complex algorithms more accessible to students. This idea should be applied in different AI-related settings and could thus contribute to increasing the accessibility of different AI techniques and algorithms [11].

**Analysis Operations focusing on FM Applications.** If one wants to apply FMs in productive use, developers need to assure specific aspects of high relevance for successful configurator applications. For example, when introducing new features, it must be clear that these features are covered by the current infrastructure (e.g., is the production infrastructure capable of producing the configurations defined by customers?) [32]. Furthermore, features offered to customers should not be too restrictive [41], i.e., they should not narrow down the configuration space too much, leading to situations where no relevant configurations can be identified for a customer [33].

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Chapter 4 Interacting with Feature Model Configurators**

**Abstract** In this chapter, we discuss different AI techniques that can be applied to support interactive FM configuration scenarios. We have in mind situations where the user of an FM configurator is in need of support, for example, in terms of requiring recommendations and related explanations for feature inclusions or exclusions, or recommendations of how to get out of an inconsistent situation. We show how to support feature selection on the basis of recommendation technologies and also show how to apply the concepts of conflict detection and model-based diagnosis to support users in inconsistent situations as well as in the context of reconfiguration.

### **4.1 Feature Model Configuration**

Feature model (FM) configuration is often an interactive process in which a user specifies her/his preferences regarding the given features [3, 4, 17, 18]. FM configurators support, a.o., (1) checking the consistency of the articulated preferences (the user requirements combined with the FM constraints must be consistent), (2) recommending features, (3) explaining configurations, (4) finding ways out of situations where no solution can be identified by the configurator, and (5) reconfiguration, i.e., helping to adapt an already existing configuration in such a way that new user requirements are fulfilled (see Figure 4.1).

Fig. 4.1: Interacting with FM configurators: basic AI techniques (ids in brackets refer to the corresponding subsection).

The task of consistency checking and configuration completion is taken over by SAT solvers or constraint solvers. Especially in interactive settings, efficient response times are needed. In this context, solvers apply different types of search heuristics that help to perform solution search in an efficient fashion. More advanced configuration approaches apply machine learning to determine a set of search heuristics that further improve the performance of the solver [33]. Furthermore, knowledge compilation approaches such as binary decision diagrams (BDDs) [1] help to further improve the performance of solution search. Beyond consistency checking and constraint reasoning, FM configuration also has to be able to deal with inconsistent situations where, for example, user preferences become inconsistent with the FM constraints. In such situations, conflict detection [22, 41] and diagnosis [34, 43] can support users in identifying the sources of an inconsistency and counteracting correspondingly.<sup>1</sup>

Importantly, solution search, i.e., FM configuration, as well as inconsistency handling have to be *personalized* in the sense that, depending on the preferences of the current user, different completions of the current configuration should be proposed and also different alternatives to resolve an inconsistency have to be provided. Such personalization services are crucial to support users in finding their preferred configuration and help to make the overall FM configuration process more efficient. In order to provide the mentioned personalization capabilities, different types of recommendation services have to be integrated with constraint solving (or other reasoning approaches such as SAT solving). In this chapter, we show different ways in which such an integration can take place. In this context, we primarily focus on recommendation approaches that are based on supervised machine learning, i.e., a set of already completed *configuration sessions*<sup>2</sup> will be used to infer user-specific recommendations (see Table 4.1).

Following the definitions of an FM configuration and an FM configuration task (see Definitions 2.6 and 2.8), we introduce a set of configurations that have already been completed in previous configuration sessions (see Table 4.1). In line with Definition 2.6, each session entry in Table 4.1 is a consistent and complete FM configuration represented as a set of feature assignments. In the following, we discuss various ways in which the entries in Table 4.1 can be applied to personalize the interaction with an FM configurator.

### **4.2 Recommending Features**

There are different reasons why users are not able to specify, i.e., include or exclude, a specific feature within the scope of an FM configuration process. (1) users might not have the domain knowledge needed to decide about the inclusion of a specific feature – this might be the case with technical features or new features the user was

<sup>1</sup> For details see Chapter 3.

<sup>2</sup> Collection of valid configurations typically created by configurator users.

Table 4.1: Example: already completed configuration sessions. In the following, we will use these example session data to show the integration of different recommendation approaches into interactive configuration sessions (1 = inclusion, 0 = exclusion of a feature, ? = not specified yet).


not confronted with up to now. (2) another explanation can be limited time resources, i.e., users do not have the time to specify every feature and, for some features, prefer to just rely on the recommendations provided by the FM configurator. (3) although users know a feature, they tend to accept recommendations provided by the configurator – this can happen due to the fact that users are risk-aware and want to avoid the risk of suboptimal configurations [2, 29]. An example of such an envisioned suboptimal configuration can be found in operating systems, where specific system parameters could lead to suboptimal response times. In the following, we will discuss scenarios in which recommender systems [5, 10, 40] can be applied to support users in the completion of a configuration process.

A *recommender system* can be defined as *any system that guides a user in a personalized way to interesting or useful objects in a large space of possible options or that produces such objects as output* [9, 15]. For the purposes of our discussion, we distinguish between three types of recommender systems which are widely applied in different industrial contexts. (1) *collaborative filtering* is based on the idea of word-of-mouth promotion, where the opinions of family members and friends are the major input for a recommendation. In the context of recommender systems, the role of family members and friends is taken over by so-called nearest neighbors (NNs), i.e., users with preferences similar to those of the current user. In the context of online sales platforms, collaborative filtering is primarily applied to predict the rating of a user for an item she/he has not consumed/seen up to now [19, 36]. (2) *content-based filtering* exploits previous item consumptions stored in the profile of the current user and tries to identify items that are similar to those consumed in the past. (3) *group recommender systems* [8] determine recommendations for groups of users. In a first step, item ratings or items are specified by individual group members. Thereafter, the item preferences of the group members are aggregated on the basis of an aggregation function. For example, *majority voting* recommends those items which are preferred by the majority of group members. In the following, we show how the mentioned recommendation approaches can be applied in configuration scenarios.

**Collaborative filtering for recommending features.** A basic approach to determining feature recommendations for individual users is to apply collaborative filtering [6, 7, 31, 39]. Formula 4.1 can be used to determine the similarity between two FM configuration sessions sa and sb, where F(s) denotes the set of features already specified in session s. In the example sessions of Table 4.1, Sessions 1–3 represent already completed configuration sessions, i.e., each feature is explicitly included or excluded, whereas the *current* session shows only a subset of the features specified. Formula 4.1 is specified in such a way that similarities are only determined for those features specified in both sessions sa and sb. Such a recommendation approach based on similarities between consumed items is also denoted as *memory-based collaborative filtering* [23].

$$\text{sim}(sa, sb) = \frac{|\{f \in F(sa) \cap F(sb) : sa.f = sb.f\}|}{|\{f \in F(sa) \cap F(sb)\}|} \tag{4.1}$$

In our working example (see Table 4.1), sim(current, s1) = 2/5 = 0.4, sim(current, s2) = 3/5 = 0.6, and sim(current, s3) = 2/5 = 0.4. Consequently, s2 is the nearest neighbor of the *current* session, and the feature settings of this session could be used for recommending feature settings to the user of the *current* session. For example, for *gps* we could recommend feature inclusion since the user in s2 also decided to include the feature *gps*. Continuing this idea, we could recommend the inclusion of the *payment* feature and further features. However, recommending the *payment* feature triggers an issue since the user in the *current* session has already selected the *standard screen*, which is incompatible with the *payment* feature. As a consequence, recommendations directly determined on the basis of collaborative filtering have to be checked for consistency with the constraints in the FM [10].
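Formula 4.1 translates directly into a small helper; the feature names and settings below are illustrative and are not meant to reproduce Table 4.1 exactly:

```python
def session_similarity(sa, sb):
    """Formula 4.1: share of commonly specified features on which two
    sessions agree. Sessions map feature name -> 1 (included) / 0 (excluded);
    unspecified features are simply omitted from the dictionary."""
    shared = set(sa) & set(sb)
    if not shared:
        return 0.0
    return sum(sa[f] == sb[f] for f in shared) / len(shared)

# Illustrative feature settings (hypothetical, not the exact Table 4.1 values):
current = {"gps": 1, "standard": 1, "touch": 0, "basic": 1, "sports": 0}
s2 = {"gps": 1, "standard": 0, "touch": 0, "basic": 1, "sports": 1, "payment": 1}
print(session_similarity(current, s2))  # agreement on 3 of 5 shared features
```

The same helper also covers Formula 4.2: pass the user profile as the first argument and a candidate configuration as the second.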

**Integrating feature recommendations with variable domain orderings.** In order to avoid the mentioned (potentially repeated) checking of the consistency of feature recommendations with the constraints in the FM, we can apply feature recommendations to determine a kind of variable domain ordering (heuristic) which can then be used by a constraint or SAT solver [32]. For example, if the inclusion of *payment* is recommended, this results in the definition of a variable domain ordering [1,0] instructing the solver to try to include the feature *payment* in the configuration (if possible). In the case of an inconsistency, backtracking is triggered by the solver, resulting in an exclusion of this feature. Following this strategy also helps to avoid inconsistencies.
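The effect of such a value ordering can be illustrated with a tiny backtracking search (our own sketch; the payment/standard incompatibility constraint is hypothetical):

```python
def configure(features, constraints, ordering):
    """Tiny backtracking search over Boolean features that tries the
    recommended value for each feature first (value-ordering heuristic).
    ordering[f] = [1, 0] means: try to include feature f before excluding it."""
    def extend(cfg, rest):
        if not rest:
            # complete assignment: accept it iff all constraints hold
            return cfg if all(c(cfg) for c in constraints) else None
        f, *tail = rest
        for v in ordering.get(f, [1, 0]):
            result = extend({**cfg, f: v}, tail)
            if result is not None:
                return result
        return None
    return extend({}, list(features))

# The user fixed standard = 1; payment = 1 is recommended but incompatible
# with standard (hypothetical constraint), so the solver backtracks:
incompatible = lambda cfg: not (cfg["payment"] and cfg["standard"])
print(configure(["standard", "payment"],
                [incompatible],
                {"standard": [1], "payment": [1, 0]}))
# -> {'standard': 1, 'payment': 0}
```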

Table 4.2: Example recommendation for the user in the *current* session (see Table 4.1). Instead of directly recommending feature settings, these settings can be included in corresponding variable domain orderings exploited by a constraint or SAT solver. In real-world contexts, the root feature (e.g., *smartwatch*) is assumed to be always *true* (1 = true, 0 = false).

**Content-based filtering for recommending features.** Content-based filtering [30] can be used in scenarios where the preferences of a user in terms of feature inclusions and exclusions from the past can be directly applied in future recommendation scenarios. The major difference between content-based filtering and collaborative filtering is that the former determines recommendations based on similarities between new items and items the user liked in the past, whereas the latter focuses on determining recommendations based on similarities between the current user and related nearest neighbors. Content-based recommendation builds user profiles that collect, in compressed form, information about a user's item consumption in the past. In the context of FM recommendation, such a profile could simply include those features of configurations previously selected by the user. A simple similarity function determining the similarity between the profile (p) of the current user and a new configuration (conf) is shown in Formula 4.2.

$$\text{sim}(p, conf) = \frac{|\{f \in F(p) \cap F(conf) : p.f = conf.f\}|}{|\{f \in F(p) \cap F(conf)\}|} \tag{4.2}$$

When using content-based filtering in the context of FM configuration, the set of features from the user profile can be regarded as user requirements to be fulfilled by a new FM configuration. If an inconsistency occurs in this context, conflict detection and diagnosis can help to identify minimal sets of features to be adapted such that a solution can be identified. In the context of our working example (*smartwatch* feature recommendation), basic properties of a *smartwatch* purchased in the past can be used to recommend similar smartwatches in upcoming smartwatch configuration scenarios. For example, the current *smartwatch* of a user gets damaged and the user is in need of a new *smartwatch*, or a new model of a *smartwatch* is released and should be primarily recommended to those users interested in similar smartwatches in the past. In our working example, we know that the user associated with Session 3 prefers an *advancedsolar* energy management, and a new smartwatch with a new generation of solar management features could be recommended. Analogously to the inclusion of collaborative filtering results into solver variable domain orderings, this approach can also be used in the context of content-based filtering: if a specific feature has been selected (deselected) in the past, the same feature should be selected (deselected) per default when building a new configuration.

Other example domains where content-based recommendation can be applied in the context of FM configuration are the configuration of round trips in the travel domain (e.g., based on the preferences of a person derived from previous travel packages, the destinations and services for a new travel package (configuration) can be recommended). Similarly, on the basis of information about software packages previously installed for a user, new (similar) software packages and corresponding parametrizations can be recommended when setting up a new operating system.

**Enforcing configuration minimality.** An important issue in the context of recommending feature inclusions (or exclusions) is to assure that only features are included that are really needed, i.e., to answer the question *what is the minimum set of additional features to be included in a configuration such that all user requirements and FM constraints are taken into account?* For example, if we assume the existence of a partial configuration REQ that specifies only four feature settings (a.o., smartwatch = true), it could be relevant to find a consistent and complete configuration conf′ with a minimum set of additional features included. Achieving such a goal can be relevant since solvers do not care about solution minimality, and it can be helpful to indicate to users decision alternatives regarding a minimal and complete configuration. In the following, we show how such decision alternatives can be represented in terms of minimal conflict sets; the corresponding conflict resolutions are represented as diagnoses, i.e., minimal sets of needed extensions to an existing configuration such that the resulting configuration conf′ is consistent with REQ and the FM constraints and complete, i.e., each feature has an assigned setting indicating inclusion or exclusion.

We now introduce a set P = {touch = false, standard = false, payment = false, sportstracking = false, running = false, skiing = false, hiking = false, basic = false, advancedsolar = false} with all those features not specified in A assumed to be excluded, i.e., set to false. For the features in P, we are able to determine all minimal conflict sets (see Chapter 3) with regard to the feature settings in A. The minimal conflict sets that can be identified in P are the following: CS1 = {touch = false, standard = false}, CS2 = {basic = false, advancedsolar = false}. The diagnoses that can be derived from the identified minimal conflict sets are Δ1 = {touch = false, basic = false}, Δ2 = {touch = false, advancedsolar = false}, Δ3 = {standard = false, basic = false}, and Δ4 = {standard = false, advancedsolar = false}. These diagnoses indicate possible minimal extensions of the current partial configuration A to come up with a consistent and complete configuration A′ [42]. If we choose, for example, Δ1 as a possible minimal extension for A, this would result in a configuration A′ = {smartwatch = true, screen = true, gps = true, energymanagement = true} ∪ {touch = true, basic = true} ∪ {standard = false, payment = false, sportstracking = false, running = false, skiing = false, hiking = false, advancedsolar = false} including (1) all feature settings of A, (2) the feature settings of Δ1 in negated form, i.e., {touch = true, basic = true}, and (3) all features of P − Δ1.
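The relationship between conflict sets and diagnoses can be illustrated with a small sketch that enumerates minimal hitting sets of the conflict sets by increasing cardinality. The naive enumeration strategy is for illustration only (the book relies on HSDAG-based approaches [34]), and the feature names follow the smartwatch example.

```python
from itertools import combinations

# Conflict sets over excluded features (illustrative): each conflict must be
# resolved by negating at least one of its elements.
cs1 = frozenset([("touch", False), ("standard", False)])
cs2 = frozenset([("basic", False), ("advancedsolar", False)])
conflicts = [cs1, cs2]

def minimal_hitting_sets(conflicts):
    """Enumerate subset-minimal hitting sets (= diagnoses) by increasing size."""
    universe = sorted(set().union(*conflicts))
    result = []
    for size in range(1, len(universe) + 1):
        for cand in combinations(universe, size):
            s = set(cand)
            # keep candidates that hit every conflict and contain no
            # previously found (smaller) hitting set
            if all(s & c for c in conflicts) and not any(set(r) < s for r in result):
                result.append(cand)
    return result

diagnoses = minimal_hitting_sets(conflicts)
print(len(diagnoses))  # 4 minimal diagnoses of size two
```

Since the two conflict sets are disjoint, every minimal diagnosis picks exactly one element from each, yielding the four diagnoses Δ1..Δ4.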

### **4.3 Explaining Configurations**

There exist different ways of explaining a configuration to a user [20, 25]. Without any claim to completeness, in the following, we discuss basic explanation scenarios of potential interest for FM configuration.

**"Why" explanations.** A user might be interested in an explanation as to *why* specific features have been additionally included in a configuration [25] – this can be explained as follows: (1) Enumerating the user-specified preferences and indicating the relationship to the corresponding configuration (e.g., since you have selected *sportstracking*, the *gps* feature has been included since it is required for the support of *sportstracking*). In this context, we exploit a *requires* cross-tree constraint for explaining the inclusion of a specific feature. Kramer et al. [25] show how such explanations can be generated in a systematic fashion by proposing so-called *explanatory knowledge fragments* which specify explanation patterns for individual FM elements. For example, an explanation regarding a mandatory feature can always be concluded with the statement *in all configurations*. (2) The inclusion of features can also be explained on the basis of the used recommendation algorithm; for example, when applying a collaborative filtering algorithm, an explanation could refer to the preferences of the nearest neighbors (e.g., the *advancedsolar* energy management feature has been selected by users with similar preferences). When applying content-based filtering, an explanation can refer to past preferences of the current user (e.g., the *advancedsolar* energy management feature has been included since you included such a feature also in your previous purchases).

**"How" explanations.** A user might be interested in an explanation of *how* a specific configuration has been determined – this can be explained as follows: (1) Explaining the sequence of constraints that were active when determining the current configuration (e.g., the energy management feature has been included since it is mandatory; thereafter, a standard screen has been included since you did not want to include the payment feature, ...). (2) If different candidate configurations exist and those are ranked, for example, on the basis of a utility function [10], the approach to determine the corresponding interest dimensions could be explained to the user (e.g., configuration 1 has been ranked highest, since it has the highest utility with regard to the interest dimension *sustainability* which you have specified as the most important interest dimension). In group decision scenarios [26], such an explanation could refer to the applied aggregation strategy (e.g., *average* of the user ratings) used for ranking the configurations [35]. Consequently, such explanations are generated by using knowledge about the way solutions are determined [16].

**"Why not" explanations.** In FM configuration scenarios (and beyond), users can also end up in situations where no solution could be identified for the defined set of user preferences [13, 20, 28]. In such a setting, users could be interested in those requirement specifications that are responsible for the non-existence of a solution. A conflict (set) (see Section 4.4) indicates an individual set of requirements that induces an inconsistency – using this concept, users have to decide how to resolve each individual conflict. A diagnosis (also Section 4.4) indicates how a user can change his/her requirements in one single step. Importantly, specifically in constraint-solving and configuration-related AI research, conflicts as well as diagnoses are regarded as specific types of explanation [13, 20] (see also Chapter 3). In the following section, these concepts will be discussed in the context of identifying relevant conflict resolutions for inconsistent user requirements.

### **4.4 Predicting Relevant Conflict Resolutions**

In Chapter 3, we have introduced different approaches that help to deal with inconsistencies between user requirements (REQ) and the underlying FM constraints (CF). In this context, a conflict set is defined as CS ⊆ REQ such that *inconsistent*(CS ∪ CF). In our working example, CF = {c0..c10} can be derived from the FM in Figure 2.3. For the following example, we assume REQ = {r11 : payment = true, r12 : sportstracking = true, r13 : standard = true, r14 : gps = false}.

In this example, the user requirements are inconsistent with the FM constraints, i.e., REQ ∪ CF is inconsistent, which means that we have to activate conflict detection to figure out the corresponding conflict sets [11, 12]. As discussed in Chapter 3, the first minimal conflict set derived is CS1 = {r11, r13}, i.e., the user is interested in including the *payment* feature but has already included the *standard* screen feature, which is in contradiction to the constraints defined in the FM (represented by the constraints in CF). An overview of the complete set of minimal conflict sets that can be derived from REQ is shown in Table 4.3.
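A compact sketch of QuickXPlain-style minimal conflict detection can make this step concrete. The consistency checker below is a toy stand-in that hard-codes only the two relevant cross-tree constraints (*payment excludes standard*, *sportstracking requires gps*); a real implementation would delegate to a solver, and the requirement encoding is an assumption for illustration.

```python
def quickxplain(background, constraints, consistent):
    """Return one minimal conflict in `constraints` (cf. QuickXPlain);
    `consistent` is a callback deciding satisfiability of a constraint list."""
    if consistent(background + constraints):
        return []                       # no conflict exists
    return _qx(background, background, constraints, consistent)

def _qx(b, delta, c, consistent):
    if delta and not consistent(b):
        return []                       # conflict already contained in b
    if len(c) == 1:
        return list(c)
    k = len(c) // 2
    c1, c2 = c[:k], c[k:]
    cs2 = _qx(b + c1, c1, c2, consistent)
    cs1 = _qx(b + cs2, cs2, c1, consistent)
    return cs1 + cs2

def consistent(assignments):
    """Toy checker: contradictory assignments and two cross-tree constraints."""
    vals = {}
    for feat, val in assignments:
        if vals.setdefault(feat, val) != val:
            return False
    if vals.get("payment") and vals.get("standard"):
        return False                    # payment excludes the standard screen
    if vals.get("sportstracking") and vals.get("gps") is False:
        return False                    # sportstracking requires gps
    return True

req = [("payment", True), ("sportstracking", True),
       ("standard", True), ("gps", False)]
print(quickxplain([], req, consistent))  # [('payment', True), ('standard', True)]
```

The returned conflict corresponds to CS1; removing it from the consideration set and re-running yields the remaining minimal conflict.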

Table 4.3: Minimal conflict sets derived by QuickXPlain for REQ = {r11 : payment = true, r12 : sportstracking = true, r13 : standard = true, r14 : gps = false} and the FM constraints CF = {c0..c10}.


Conflicts such as those shown in Table 4.3 can be resolved (1) interactively, i.e., a customer explicitly specifies accepted changes in his/her current requirements, or (2) on the basis of additional knowledge about the importance weights of the given requirements. In interactive settings, users could be explicitly asked to select those preferences out of a conflict set which they would accept to be adapted. For example, on the basis of the conflict sets contained in Table 4.3, a user could be asked to select which requirement (preference) to delete/change from CS1, and thereafter the same information will be requested for conflict set CS2. After having selected at least one requirement per conflict set, all conflicts are resolved (given conflict minimality – see also Chapter 3). Alternatively, we can assume the existence of a dataset of user-specific preferences from the past (see Table 4.4).

Table 4.4: Example set of completed configuration sessions where, for each session, *CR* represents an initially inconsistent set of requirements and *A* the finally consistent configuration resulting from the configuration process. Furthermore, *REC* represents a recommendation for the change of the currently inconsistent set of user requirements. In this example, the features *smartwatch*, *screen*, and *energymanagement* are assumed to be mandatory.


In the dataset shown in Table 4.4, each session contains the set *CR* of originally defined user requirements (which are inconsistent) and the set *A* specifying the final consistent (and complete) configuration confirmed by the user. Given the requirements REQ = {r11 : payment = true, r12 : sportstracking = true, r13 : standard = true, r14 : gps = false} of the user in the current session, two minimal conflict sets are induced with regard to CF = {c0..c10}: CS1 = {r11, r13} and CS2 = {r12, r14}. The most similar session (on a scale [0 = not similar .. 1 = very similar]) determined by Formula 4.1 compared to the current session is session 1: sim(REQ, CR1) = 4/4 = 1.0, sim(REQ, CR2) = 2/3 = 0.66, and sim(REQ, CR3) = 2/3 = 0.66.

Our goal now is to identify a minimal set of changes in REQ such that consistency is restored with regard to the FM constraints CF. Following our discussions in Section 3.2, we are able to construct a hitting set directed acyclic graph (HSDAG) [34] that helps to identify minimal conflict resolutions (see Figure 4.2). In each step of the HSDAG construction, we can compare alternative conflict resolutions Δ with regard to their conformance (*con*) with the configuration chosen by the nearest neighbor (NN) (see Formula 4.3).

$$con(\Delta, A) = |\{(f = v) \in \Delta : (f = \neg v) \in A\}| \tag{4.3}$$

The conformance of a conflict resolution is specified by the number of changes proposed by Δ which are in line with the feature settings in the nearest neighbor configuration A (the more changes conform with A, the better). Figure 4.2 shows how to determine a preferred diagnosis Δ (more precisely, Δ3) by analyzing to which extent a set of conflict resolutions is in line with the configuration of the nearest neighbor (in our case, A1).3

Fig. 4.2: Hitting Set Directed Acyclic Graph (HSDAG) for the conflict sets CS1 = {r11 : payment = true, r13 : standard = true} and CS2 = {r12 : sportstracking = true, r14 : gps = false}. The search for a user-preferred diagnosis is guided by the conformance (*con*) measure (see Formula 4.3). The resulting preferred minimal diagnosis is Δ = {r12, r13}.

In the first step of the HSDAG construction (after having identified the first conflict set CS1), we have to calculate the conformance of (1) r11 and (2) r13 with the feature selections in the configuration A1 of the nearest neighbor. This results in con({r11}, A1) = 0.0 and con({r13}, A1) = 1.0, i.e., only the conflict resolution regarding r13 is in line with A1. After having resolved CS1, the conflict set CS2 is the only remaining conflict set. In terms of conformance (see Formula 4.3), the diagnosis Δ = {r12, r13} has the highest conformance value (2.0) and thus is the candidate for being presented as a repair alternative to the user.
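The conformance values above can be reproduced with a few lines of code. The nearest-neighbor configuration excerpt is an illustrative assumption standing in for A1 from Table 4.4; a diagnosis element (f, v) counts as conforming if the nearest neighbor chose the negated setting.

```python
def con(delta, nn):
    """Conformance of a conflict resolution (cf. Formula 4.3): counts the
    changes proposed by `delta` (negations of requirements) that agree
    with the nearest-neighbor configuration `nn`."""
    return float(sum(nn.get(f) == (not v) for f, v in delta))

# Illustrative excerpt of the nearest-neighbor configuration A1 (values assumed):
a1 = {"payment": True, "sportstracking": False, "standard": False, "gps": True}

print(con([("payment", True)], a1))                             # 0.0
print(con([("standard", True)], a1))                            # 1.0
print(con([("sportstracking", True), ("standard", True)], a1))  # 2.0
```

Negating r11 (payment = true) contradicts the neighbor's choice, whereas negating r12 and r13 matches it, which is why Δ = {r12, r13} is preferred.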

For the FM configurator user this means that it is recommended (by Δ) to change the requirements regarding *sportstracking* and *standard*, i.e., to exclude these two features. The recommendation *REC* in Table 4.4 is the result of applying the diagnosis Δ to the original user requirements in the session.

**Direct diagnosis for personalized conflict resolution.** Up to now, we have discussed different approaches that support the identification and resolution of conflict sets on the basis of the concepts of hitting set directed acyclic graphs [34]. Following

<sup>3</sup> In this example, there is exactly one preferred diagnosis, but there could be more than one.

such an approach means to (1) identify the relevant minimal conflicts and (2) resolve those conflicts by deleting or adapting at least one element of each conflict set. A major drawback of this approach is that the determination of conflict sets is computationally expensive [22], which can trigger performance (response time) issues, specifically in interactive settings. An alternative to the computation and resolution of minimal conflict sets is to apply the concept of *direct diagnosis*, which follows the idea of determining minimal diagnoses directly, omitting the step of identifying minimal conflict sets. A direct diagnosis algorithm for the determination of preferred minimal diagnoses is FastDiag [13, 28], which follows a divide-and-conquer approach for the determination of minimal diagnoses.

The basic divide-and-conquer approach of FastDiag has already been motivated in Chapter 3, i.e., diagnosis search in a constraint set C = C1 ∪ C2 (assuming that both subsets include a nearly equal number of constraints) can be reduced by half if the constraints in one half, for example, C1, appear to be consistent. Our motivation for providing more details on FastDiag in this chapter is that the algorithm allows the determination of preferred diagnoses, i.e., diagnoses that can be regarded as potentially relevant for a user. FastDiag can be activated with a consideration set C, i.e., the set of constraints in which a diagnosis needs to be identified, and a constraint set B (the background knowledge which is assumed to not contain any diagnosis elements).

FastDiag is flexible since it supports scenarios where C is a set of customer requirements inconsistent with the FM constraints, but also scenarios where the FM configuration knowledge base itself is inconsistent. In the latter case, C represents the FM constraints and B = ∅. If B ∪ C is inconsistent, FastMSS is activated and returns a maximal satisfiable subset Ω of C. In FastDiag, Ω is used to derive the corresponding diagnosis Δ by building the complement, i.e., Δ = C − Ω. If B ∪ C is consistent, no diagnosis process needs to be activated and ∅ is returned by FastDiag (see Algorithm 4).4


The search for a maximal satisfiable subset (MSS) Ω in C is performed by FastMSS (see Algorithm 5), where Ω satisfies the following property: ∄Ω′ ⊃ Ω : consistent(Ω′ ∪ B), i.e., no proper superset of a maximal satisfiable subset is satisfiable. If C ∪ B is consistent, the whole set C is consistent and can be regarded as part of the maximal satisfiable subset. In this context, the parameter δ is used to avoid redundant consistency checks of C ∪ B. If |C| = 1, it can be

<sup>4</sup> Algorithm 4 is a variant of FastDiag introduced in [13].

assumed to be part of a diagnosis since otherwise it would have been returned as a consistent constraint earlier. In every other case (|C| > 1), diagnosis search has to be continued in a divide-and-conquer fashion, i.e., C is divided into the two subsets C1 and C2, resulting in two further activations of FastMSS – the first one for checking C1 for further Ω elements and the second one for checking for Ω elements in C2 (Ω2 includes the MSS identified in C1). All MSS elements, i.e., Ω1 ∪ Ω2, that could be identified on a specific recursion level of FastMSS are returned to the previous activation level.


```
Algorithm 5: FastMSS(δ, C = {c1..cq}, B) : Ω
1: if δ ≠ ∅ ∧ IsConsistent(C ∪ B) then
2:   return(C)
3: end if
4: if |C| = 1 then
5:   return(∅)
6: end if
7: k = ⌊|C|/2⌋
8: C1 ← {c1..ck}; C2 ← {ck+1..cq}
9: Ω2 ← FastMSS(C1, C1, B)
10: Ω1 ← FastMSS(C1 − Ω2, C2, B ∪ Ω2)
11: return(Ω1 ∪ Ω2)
```
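A runnable rendering of the direct diagnosis idea is sketched below. The consistency checker is again a toy stand-in that only encodes the two relevant smartwatch cross-tree constraints, and the requirement encoding is an assumption; a production implementation would call a SAT/CSP solver instead.

```python
def fast_mss(delta, c, b, consistent):
    """FastMSS-style search (cf. Algorithm 5): maximal satisfiable subset of
    the constraint list `c` w.r.t. the background `b`; a non-empty `delta`
    enables the shortcut consistency check."""
    if delta and consistent(b + c):
        return c                        # whole half consistent -> all in MSS
    if len(c) == 1:
        return []                       # single constraint -> diagnosis element
    k = len(c) // 2
    c1, c2 = c[:k], c[k:]
    omega2 = fast_mss(c1, c1, b, consistent)
    omega1 = fast_mss([x for x in c1 if x not in omega2], c2,
                      b + omega2, consistent)
    return omega1 + omega2

def fast_diag(c, b, consistent):
    """FastDiag-style wrapper (cf. Algorithm 4): diagnosis = complement of an MSS."""
    if consistent(b + c):
        return []
    return [x for x in c if x not in fast_mss([], c, b, consistent)]

def consistent(assignments):
    """Toy checker: contradictory assignments and two cross-tree constraints."""
    vals = {}
    for feat, val in assignments:
        if vals.setdefault(feat, val) != val:
            return False
    if vals.get("payment") and vals.get("standard"):
        return False                    # payment excludes the standard screen
    if vals.get("sportstracking") and vals.get("gps") is False:
        return False                    # sportstracking requires gps
    return True

req = [("payment", True), ("sportstracking", True),
       ("standard", True), ("gps", False)]
print(fast_diag(req, [], consistent))   # [('standard', True), ('gps', False)]
```

Because the first two requirements in the ordering are consistent, the whole first half ends up in the MSS after a single check, and the diagnosis consists of the lower-ranked requirements.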
Assuming the customer requirements REQ = {r11 : payment = true, r12 : sportstracking = true, r13 : standard = true, r14 : gps = false}, we now want to sketch the execution of Algorithms 4–5 on the basis of our working example (see Figure 4.3). In order to apply FastDiag, we have to define the contents of C and B. In our working example, we can assume the consistency of the FM and the corresponding FM constraints, i.e., we can focus diagnosis search on C = REQ with B = CF, where we assume c0 : smartwatch ∧ screen ∧ energymanagement.

FastMSS is based on depth-first search where in every case the left branch is responsible for determining Ω2 whereas the right branch of the search tree is responsible for determining Ω1. In our example (see Figure 4.3), the determined maximal satisfiable subset (MSS) is Ω = {r11, r12}, indicating that Ω ∪ B is consistent and ∄Ω′ ⊃ Ω : consistent(Ω′ ∪ B). The complement (the diagnosis) Δ of Ω = {r11, r12} is C − Ω, which results in Δ = {r13, r14}. As mentioned, FastDiag allows for the determination of preferred diagnoses. More precisely, diagnoses can differ depending on the original ordering of the constraints in the consideration set C [13].

In our working example, we assumed the constraint ordering [r11, r12, r13, r14], assuming that constraints at the beginning have the highest importance and constraint importance decreases with a corresponding lower ranking in the list, i.e., in the given example, constraint r14 has the lowest importance. In this context, we assume that constraints with a lower importance have a higher probability of being accepted by the user as a diagnosis element. If we changed the order of our example constraints in C to [r14, r13, r12, r11], the diagnosis returned by FastDiag would

Fig. 4.3: Execution trace of FastMSS on the basis of C = {r11..r14} and B = {c0..c10} resulting in the maximal satisfiable subset (MSS) Ω = {r11, r12}. The corresponding preferred diagnosis returned by FastDiag is Δ = {r13, r14}.

be Δ = {r11, r12}. The diagnoses returned by FastDiag are subset-minimal (see Chapter 3) but not necessarily of minimal cardinality. In diagnosis scenarios where minimal cardinality is required, we recommend the standard approach of determining minimal conflict sets [22, 27] and a corresponding conflict resolution based on hitting set directed acyclic graphs (HSDAGs) [34].

### **4.5 Reconfiguration**

In situations where an FM configuration has already been completed, after-configuration tasks can become relevant. Such tasks can be summarized as *reconfiguration tasks* [14] with the goal to adapt an existing configuration in such a way that new requirements are fulfilled. In the following, we discuss two basic scenarios: (1) the estimation of *which new feature should be recommended* with regard to a specific configuration and (2) a situation where an FM configuration has to be adapted in order to take into account a *new set of user requirements*. We show how to apply the concepts of matrix factorization [24] to perform such prediction tasks.

**Matrix factorization for new feature recommendations.** Table 4.5 sketches a basic reconfiguration setting where the major task is to predict for a new feature *musicplay* whether this feature should be recommended to users who have already completed an FM configuration.5 In the following, we show how this task can be completed on the basis of *matrix factorization*, which is a widely used model-based collaborative filtering approach [24] (in contrast to the memory-based collaborative filtering used in Section 4.2). In this example, we assume that users u1 and u2 have already integrated the *musicplay* feature in their smartwatch (e.g., on the basis of a software upgrade), and user u4 was not interested in this upgrade.

Table 4.5: The task of predicting the relevance of a new feature *(mu)sicplay* for different users who have already completed a configuration. The symbol ? in the matrix (UF) indicates the task to predict whether the new feature should be recommended to the user (customer).


The entries in Table 4.5 can help to predict the relevance of individual new features for users. In our example, an additional feature *musicplay* has been integrated, which is a software update that allows the activation of music sharing via the smartwatch. In a marketing context, it is important to know which users (of a potentially large group of users) could be interested in this additional feature. A similar scenario is one where users who already purchased a smartwatch could be interested in a new version of the *smartwatch* due to the mentioned upgrade.

The prediction of the relevance of a new feature can be supported, for example, on the basis of memory-based collaborative filtering [23] where relevance estimation is implemented by simply analyzing the preferences of similar users with regard to new features. In contrast to the previously discussed scenarios, the recommendation of new features and – more generally – reconfiguration scenarios can often be handled in an *offline* fashion, which makes these scenarios more accessible to model-based collaborative filtering approaches such as *matrix factorization* (MF) [24]. The overall idea of matrix factorization is to optimize a set of so-called *interest dimensions* (in machine learning contexts often denoted as *hidden features*) in such a way that user-individual preferences can be estimated with a high prediction quality.

Using matrix factorization, the entries in Table 4.5 (the matrix UF) can be reconstructed on the basis of *dimensionality reduction*, which follows the approach

<sup>5</sup> For better readability, in this example, we apply the following abbreviations for feature names: {(sm)artwatch, (sc)reen, (to)uch, (st)andard, (pa)yment, (gp)s, (sp)ortstracking, (ru)nning, (sk)iing, (hi)king, (en)ergymanagement, (ba)sic, (ad)vancedsolar, (mu)sicplay}.

of learning two low-dimensional matrices (UA and AF, representing the machine learning model) that can be used to derive a matrix UF′ ≈ UF, i.e., UF′ can be regarded as an approximation of UF. Following this approach, we are able to generalize from determining recommendations based on individual user preferences to a machine learning model based on dimensionality reduction, which means that abstract *interest dimensions* (in machine learning terms denoted as *features*) are learned and used to predict item preferences of individual users.

For demonstration purposes, we construct the matrices UA (Table 4.6) and AF (Table 4.7) including the *hidden features* (interest dimensions) a1 and a2. These dimensions are denoted as *hidden* since the underlying machine learning (matrix factorization) algorithm is not aware of the semantics of these features. We chose to include two dimensions; however, in real-world application contexts the number of such hidden features could be much higher. The role of such hidden features can best be explained by example interest dimensions with a corresponding semantics, i.e., a1 could (as said, we do not know) represent the interest dimension *flexibility* (i.e., the more features included the better) and a2 could represent the dimension *simplicity* (i.e., the fewer features included, the better).

If we use matrix factorization for learning the user × interest dimension relationship (UA) and the interest dimension × feature relationship (AF, with features taken from the FM), the corresponding table entries are learned, i.e., they do not have to be filled out manually. In this context, the learning goal is to optimize (maximize) the *similarity* between UF and UF′ where UF′ is the result of the matrix multiplication UA • AF – Table 4.8 is the result of a corresponding matrix multiplication applied to our example Tables 4.6 and 4.7. In this context, the feature *musicplay* (*mu*) has a high estimate for the users u1, u2, and u8 and a low estimate for all other users. The corresponding entry in Table 4.8 also confirms (predicts) a low interest of user u4 in the new feature.

Table 4.6: Matrix UA representing relationships between the users u1..u8 and interest dimensions (hidden features a1 and a2).


Again, we have to emphasize that when applying matrix factorization [24], i.e., learning the interest dimension/user relationships, the corresponding machine learning features are *hidden*, i.e., they do not have a predefined meaning. In other words, we do not know exactly in which way the hidden features used by matrix factorization have a direct

Table 4.7: Matrix AF representing relationships between interest dimensions (hidden features) and selected features of our example FM.


Table 4.8: Matrix UF′ as a result of the matrix multiplication UA • AF. The feature *musicplay* (*mu*) appears to be potentially relevant for users u1, u2, and u8.


relationship to the (explicitly defined) features used in our working example. Consequently, machine learning approaches such as matrix factorization provide help in terms of automatically learning user × item preferences but come along with the disadvantage of a low degree of explainability due to a lack of semantic knowledge about user × item relationships.
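The dimensionality-reduction idea can be sketched with a small gradient-descent factorization. The 0/1 matrix below is a toy stand-in for a user × feature matrix with one unknown entry (the book's Tables 4.5–4.8 are not reproduced); the learning rate, regularization, and iteration count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy user x feature matrix (1 = feature included); None marks the entry
# to predict: is the new feature relevant for the last user?
UF = np.array([[1, 1, 1],
               [1, 1, 1],
               [1, 0, 0],
               [1, 0, None]], dtype=object)
mask = np.array([[v is not None for v in row] for row in UF])
R = np.where(mask, UF, 0).astype(float)

k, lr, reg = 2, 0.05, 0.01                         # hidden dims, step size, regularization
UA = rng.normal(scale=0.1, size=(R.shape[0], k))   # user x interest dimension
AF = rng.normal(scale=0.1, size=(k, R.shape[1]))   # interest dimension x feature

for _ in range(2000):                              # SGD over observed entries only
    for u, f in zip(*np.nonzero(mask)):
        err = R[u, f] - UA[u] @ AF[:, f]
        ua = UA[u].copy()
        UA[u] += lr * (err * AF[:, f] - reg * ua)
        AF[:, f] += lr * (err * ua - reg * AF[:, f])

UF_approx = UA @ AF                                # UF' ~ UA . AF
print(UF_approx[3, 2])                             # low value -> do not recommend
```

Because the last user's observed entries resemble those of the third user (who rejected the feature), the reconstructed entry comes out low, mirroring the prediction for user u4 in the text.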

**FM reconfiguration.** In the previous section, we already took a look at a simple reconfiguration scenario focusing on analyzing a potential need of extending the current FM configuration with the *inclusion of a new feature*. Beyond that, there also exist scenarios where feature settings of an existing configuration have to be adapted in order to be able to take into account a *new set of user requirements* (REQ). On the software level of a smartwatch, the inclusion of specific additional features could trigger a need for reconfiguration [14, 21]. In our *smartwatch* example, the inclusion of additional features, for example, additional software components supporting *sportstracking*, could also trigger a need to change other settings in the existing FM configuration A.

Let us assume the existence of a configuration A = {smartwatch = true, screen = true, touch = false, standard = true, payment = false, gps = true, sportstracking = true, running = true, skiing = false, hiking = false, energymanagement = true, basic = true, advancedsolar = false}. The user now changes his/her mind and wants to include the *payment* feature. Since *payment* excludes a *standard* screen, the (singleton) requirement payment = true induces an inconsistency in the feature settings of A. In our example, two conflict sets are induced, which are CS1 = {standard = true} and CS2 = {touch = false}. For CS1 and CS2, there exists one related diagnosis, which is Δ = {standard = true, touch = false}, indicating that standard = true has to be replaced with standard = false and touch = false has to be replaced with touch = true, resulting in a corresponding reconfiguration A′.

Due to the binary domain of individual features, a reconfiguration can be directly derived from a diagnosis Δ. A reconfiguration A′ is an adaptation of the original configuration A in such a way that the new requirements are consistent with the feature settings in A′. In this context, the setting of those features remains the same which are included in A but are not part of Δ. Vice versa, feature elements of Δ have to be deleted from A and included in negated form in the new configuration A′, which itself represents the reconfiguration. Equation 4.4 represents a construction rule for each setting of a feature f in the new configuration (reconfiguration) A′, where f(A′) denotes the feature setting of feature f in A′ (e.g., touch(A′) = true), f(A) denotes the feature setting of feature f in the original configuration A (e.g., touch(A) = false), and f(Δ) denotes the feature setting of f in Δ, which is included in A′ in negated form (e.g., for touch(Δ) = false the new feature setting is touch = true).

$$f(A') = \begin{cases} f(A), & \text{if } f \notin \Delta\\ \overline{f}(\Delta), & \text{otherwise} \end{cases} \tag{4.4}$$
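This construction rule can be sketched directly in code. The configuration excerpt and its values are assumptions based on the smartwatch example; adding the triggering new requirement (payment = true) after negating the diagnosis elements is part of the sketch, not of the equation itself.

```python
def reconfigure(a, delta, new_reqs):
    """Construction rule of Equation 4.4: keep settings outside the diagnosis
    `delta`, negate the settings contained in it, then add the new requirements."""
    a_new = {f: (not v) if f in delta else v for f, v in a.items()}
    a_new.update(new_reqs)
    return a_new

# Illustrative excerpt of configuration A (values assumed):
a = {"smartwatch": True, "screen": True, "touch": False,
     "standard": True, "payment": False}
delta = {"standard", "touch"}            # diagnosis: negate these settings
a_prime = reconfigure(a, delta, {"payment": True})
print(a_prime["standard"], a_prime["touch"], a_prime["payment"])  # False True True
```

All settings outside the diagnosis are carried over unchanged, which reflects the minimality of the adaptation.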

**Recommending reconfigurations.** Recommendations for reconfigurations can be determined in a fashion similar to the recommendation of conflict resolutions (see Section 4.4). As discussed, a set of new requirements REQ (within the scope of a reconfiguration scenario) can induce a conflict in the current configuration A. The identified conflicts can be resolved on the basis of the concepts of model-based diagnosis [34]. Given a dataset which includes original configurations A as well as the corresponding reconfigurations A′, a collaborative filtering approach could be applied by (1) identifying (in the dataset) a configuration which is similar to the configuration of the current user and (2) guiding conflict resolution in such a way that the chosen resolutions lead to a reconfiguration A′ which is as similar as possible to the feature settings in the reconfiguration of the nearest neighbor.

### **4.6 Discussion**

In this chapter, we have discussed different topics in the context of supporting FM configuration in interactive scenarios, i.e., a user interacting with a configurator with the goal to build a complete and consistent FM configuration. FM configuration can become a tedious task since users might not always have detailed domain knowledge, resulting in situations where some of the features cannot be specified or get specified in a suboptimal fashion. Furthermore, a complete specification of all features might simply not be possible due to the size of the configuration model.

In order to provide better support for users in interactive FM configuration scenarios, we have shown how different approaches from machine learning and recommender systems can be applied to predict the relevance of the inclusion or exclusion of specific features. In this context, we discussed (1) approaches to recommend feature inclusion or exclusion, (2) approaches to recommend adaptations of feature preferences in inconsistent situations, and (3) approaches to support reconfiguration scenarios, for example, in terms of determining minimal sets of adaptations needed for already existing configurations such that a new set of user requirements can be taken into account.

In the context of interacting with FM configurators, we regard the following aspects as major issues for future work.

**Search heuristics beyond variable domain orderings.** We discussed different approaches to support the recommendation of feature inclusion or exclusion. In this context, we have sketched ways to integrate such a recommendation task directly into the variable value ordering of a solver. The inclusion of variable domain orderings into solver search heuristics can be regarded as a first step towards *accuracy-aware FM configuration*; however, further approaches, for example, variable ordering and the generation of dynamic search heuristics, i.e., search heuristics defined during solver runtime, have to be analyzed in detail.

**Integration of machine learning with constraint reasoning.** In line with the topic of integrating search heuristics with recommendation, a more general issue is the integration of machine learning with constraint reasoning [33]. For example, it is important to further improve the predictive quality of recommendation services. This can be achieved by analyzing different possibilities to integrate domain knowledge into the machine learning process, for example, by explicitly encoding domain constraints in a neural network.

**Cognitive issues in interactive configuration.** There are also issues located far beyond the technical aspects of interactive FM configuration. In many cases, configuration is a highly interactive process (with the exception of batch configuration scenarios) where users interact with a configurator with the goal to build a consistent configuration entailing user-relevant features. In this context, it is important to take into account theories of human decision making to be able to better understand how to best support configurator users [38].

**Group-based configuration.** In contrast to single-user scenarios, there are also many scenarios where a configuration task has to be completed by a group of users [26]. In this context, a group of users has to achieve consensus with regard to the inclusion or exclusion of a specified set of features [37]. Furthermore, conflicts (in the case of inconsistent requirements among different group members) have to be resolved in a way acceptable for all group members. New user interfaces supporting group decision making in the context of FM configuration, as well as new recommendation and diagnosis algorithms, have to be developed to provide efficient user support in such contexts.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 5 Tools and Applications**

**Abstract** Feature Models (FMs) are not only an active scientific topic; they are also supported by many tools from industry and academia. In this chapter, we provide an overview of example feature modelling tools and corresponding FM configurator applications. In our discussion, we first focus on different tools supporting the design of FMs. Thereafter, we provide an overview of tools that also support FM analysis. Finally, we discuss different existing FM configurator applications.

### **5.1 Tool and Application Landscape**

We will now show how the concepts discussed in Chapters 2–4 are integrated into real-world systems (ranging from industrial applications to research-driven prototypes). Importantly, each of the discussed systems covers a subset of the concepts discussed in the previous chapters.

Without claiming to be complete, we provide an overview of example tools and applications. In Section 5.2, we provide an overview of example feature modelling tools which are of great importance for different kinds of variability management processes [11, 25, 99]. In this context, we discuss the functionalities provided by those tools and provide insights into the corresponding graphical user interfaces. In Section 5.3, we focus on the way different types of FM analysis operations are included in feature modelling tools. Finally, in Section 5.4, we discuss examples of FM configurator applications.

Table 5.1 provides an overview of example descriptions/presentations of tools supporting (1) the *design* of Feature Models (FMs), (2) their *analysis*, and (3) *FM configuration*. Following the major objectives of this book, we will specifically focus on discussing AI techniques related to the topics of *knowledge representation and reasoning* (KRR), *explainable AI* (XAI), and *machine learning* (ML). In this context, we provide example screenshots of tools if corresponding test versions were publicly accessible without the need to purchase a license. For an in-depth analysis of the existing tool support in software product lines, we refer to Horcas et al. [57].


Table 5.1: Tool and application landscape: example tools and applications (*feature modelling, FM analysis, and FM configuration*).

## **5.2 Feature Modelling Tools**

Clafer [7] is a knowledge representation language and environment for feature modelling and configuration, class and object modelling, and metamodelling. The system is available as a desktop application and as a set of publicly available web-based tools.1 It supports variability modelling including non-Boolean features (attributes) and constraints on their values, arbitrary multiplicity in group features (e.g., x..y, where x can be distinct from 1 and y distinct from \*), feature clones and abstract classes, and multi-objective optimization. Using Clafer, complete and consistent configurations can already be generated in the modelling phase, which helps to more easily understand the semantics of the FM. An example of applying Clafer to represent our *smartwatch* FM is shown in Figure 5.1.

<sup>1</sup> https://www.clafer.org/

Fig. 5.1: Feature modelling and generation of configurations with Clafer. The FM is shown on the left-hand side, a corresponding configuration on the right-hand side.
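Clafer's group multiplicities (x..y) generalize the classical *or* (1..\*) and *alternative* (1..1) groups. The following minimal Python sketch illustrates this semantics; the connectivity features and the 2..3 multiplicity are purely illustrative assumptions, not part of the *smartwatch* FM shown above.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Group:
    """A feature group with Clafer-style multiplicity low..high (high=None means '*')."""
    low: int
    high: Optional[int]
    children: list = field(default_factory=list)

def group_satisfied(group: Group, selected: set) -> bool:
    """Check that the number of selected children lies within low..high."""
    n = sum(1 for c in group.children if c in selected)
    high = len(group.children) if group.high is None else group.high
    return group.low <= n <= high

# A 2..3 group over three connectivity features (names are illustrative only).
conn = Group(2, 3, ["Bluetooth", "WiFi", "NFC"])
print(group_satisfied(conn, {"Bluetooth", "WiFi"}))  # True: two children selected
print(group_satisfied(conn, {"Bluetooth"}))          # False: only one selected
```

An *or* group corresponds to `Group(1, None, …)` and an *alternative* group to `Group(1, 1, …)`.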

FeatureIDE [106]2 is an open-source Eclipse framework for feature-oriented software development (FOSD) with a plug-in-based extension mechanism to integrate and test existing tools and SPL approaches. FeatureIDE also supports abstract features. In FeatureIDE, the FM and the corresponding configuration interface are closely connected [86]; for example, the configuration interface is based on the same hierarchical structure as defined in the FM (see Figure 5.2). In FeatureIDE, solver-based propagation also enforces consistency between selected features and the constraints defined in the FM. For example, the deselection of some features could also enforce the deselection of related features. FeatureIDE also supports the concept of *focused views* with the idea that only those features are visible to the user which are in the current focus; for example, if a user selects a specific feature, they might also be interested in related subfeatures, but there is no need to display the complete feature tree.
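The propagation behavior described above can be sketched in a few lines of Python: deselecting a feature transitively deselects every feature that (directly or indirectly) requires it. The feature names and *requires* relationships are illustrative assumptions, not FeatureIDE's actual data structures.

```python
# requires[a] = b means: selecting feature a requires feature b (illustrative data).
requires = {"GPS": "Sensor", "HeartRate": "Sensor", "Sensor": "Smartwatch"}

def propagate_deselection(deselected: set) -> set:
    """Fixpoint: if b is deselected and a requires b, then a must be deselected too."""
    closed = set(deselected)
    changed = True
    while changed:
        changed = False
        for a, b in requires.items():
            if b in closed and a not in closed:
                closed.add(a)
                changed = True
    return closed

print(sorted(propagate_deselection({"Sensor"})))  # ['GPS', 'HeartRate', 'Sensor']
```

Note that propagation only follows the *requires* edges backwards: deselecting `Sensor` does not affect `Smartwatch`.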

pure::variants [18]3 is an Eclipse-based solution supporting different variability modelling concepts such as features with attributes, feature clones, variant instances, hierarchical variant composition, and OCL-type constraints. Based on specifications in its family model, it supports the generation and validation of the final configuration, i.e., the code of various programming and scripting languages. An example screenshot of the pure::variants environment is provided in Figure 5.3.

S.P.L.O.T. (Software Product Lines Online Tools) [78]4 is a web-based environment for the design, analysis, and configuration of FMs. It supports logic-based reasoning tasks on the basis of reasoning approaches such as SAT solving and binary decision diagrams (BDDs). Furthermore, S.P.L.O.T. provides a large

<sup>2</sup> http://www.featureide.com/

<sup>3</sup> https://www.pure-systems.com/

<sup>4</sup> http://www.splot-research.org/

Fig. 5.2: FM configuration with FeatureIDE. The configuration user interface follows the hierarchical structure defined in the FM.

repository of FMs5 which is a kind of configuration benchmark suite used in various evaluation contexts, supporting a structured comparison of different configuration problem solving approaches. The system provides a flexible web-based user interface which supports the design of FMs as well as corresponding FM configuration tasks – see Figures 5.4 and 5.5.

The FM diagram is represented in the form of a tree view. FM-related cross-tree constraints are shown in a separate view where individual constraints can be defined in terms of logical disjunctions. In a further user view, FM statistics are displayed giving an overview of the different FM properties, for example, *#features* and *#xor groups* (i.e., alternative relationships). Finally, the environment also supports FM analysis operations including FM satisfiability, dead features, and core features. As S.P.L.O.T. is a web-based application, no installation procedures are needed. The simple graphical user interface makes it specifically applicable in the context of university courses, e.g., to give students a short but representative overview of feature modelling concepts, their semantics, and related FM configuration processes. Based on a set of input parameters, for example, *number of features*, *minimum and maximum feature branching factor*, and *consistency of generated models*, S.P.L.O.T.

<sup>5</sup> Also available via UVLHub [96, 103] – see https://www.uvlhub.io/.


Fig. 5.3: Example screenshots of the pure::variants user interface with the smartwatch FM on the left-hand side and related configurations on the right-hand side.

also supports the generation (synthesis) of FMs which is of high relevance specifically when evaluating problem solving algorithms.
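Analysis operations such as *core features* and *dead features* can be illustrated with a brute-force Python sketch that enumerates all valid configurations of a toy FM; the features and constraints below are illustrative assumptions, not the actual smartwatch model (real tools delegate this to SAT/BDD reasoning instead of enumeration).

```python
from itertools import product

# Toy FM over four features; constraints are predicates over a selection set.
features = ["Watch", "Display", "Camera", "LongBattery"]
constraints = [
    lambda s: "Watch" in s,                               # root is mandatory
    lambda s: "Display" in s,                             # mandatory child of the root
    lambda s: not ({"Camera", "LongBattery"} <= s),       # Camera excludes LongBattery
    lambda s: "LongBattery" in s if "Camera" in s else True,  # Camera requires LongBattery
]

def valid_configurations():
    """Enumerate all feature selections that satisfy every constraint."""
    for bits in product([False, True], repeat=len(features)):
        s = {f for f, b in zip(features, bits) if b}
        if all(c(s) for c in constraints):
            yield s

configs = list(valid_configurations())
core = set(features).intersection(*configs)   # selected in every valid configuration
dead = set(features) - set().union(*configs)  # selected in no valid configuration
print(sorted(core), sorted(dead))  # ['Display', 'Watch'] ['Camera']
```

Here `Camera` is dead because its *requires* and *excludes* constraints contradict each other, which is exactly the kind of anomaly these analysis operations are meant to surface.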

FM2EXCONF [70]6 is an environment supporting the definition of configuration tasks on the basis of FMs with the basic modelling concepts of *feature*, *or*, *alternative*, *mandatory*, *optional*, and the cross-tree constraints *requires* and *excludes*. FMs can be imported on the basis of the exchange formats *SXFM* (used in S.P.L.O.T.), the FeatureIDE *XML* format, and Glencoe *JSON*. As depicted in Figure 5.6, these models can also be analyzed on the basis of analysis operations such as dead and false optional features – see Benavides et al. [13]. From a given FM, the system supports the direct generation of a corresponding Microsoft Excel based configurator application. Such a configurator depicts those features derived from the FM –

<sup>6</sup> https://github.com/AIG-ist-tugraz/FM2ExConf

Fig. 5.4: *Smartwatch* FM developed in S.P.L.O.T. (Software Product Lines Online Tools).

on the basis of specifying 0 (feature exclusion) and 1 (feature inclusion), users can articulate their requirements with regard to a final configuration (see Figure 5.7).

For the purpose of increasing user interface understandability, the constraints derived from the FM are explicitly shown to the user. In the case of a constraint violation, a corresponding message is displayed to help the user to find a way out of the *no solution could be found* dilemma. In contrast to feature modelling environments such as FeatureIDE, FM2EXCONF does not provide a solver integration, i.e., functionalities such as automated FM diagnosis (see Chapter 3) and configuration completion (see Chapter 4) are not supported. However, due to the simple definition and corresponding configurator generation, this environment can easily be used in knowledge representation related courses.
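The constraint checking performed by such a spreadsheet-style configurator can be sketched in a few lines of Python: features are assigned 0/1 values and each *requires*/*excludes* constraint is checked against the selection, with violated constraints reported by name. Feature and constraint names are illustrative assumptions only.

```python
# Features receive 0/1 values as in the generated spreadsheet (illustrative names).
constraints = {
    "GPS requires Watch":   lambda s: s["Watch"] == 1 if s["GPS"] == 1 else True,
    "GPS excludes Compass": lambda s: not (s["GPS"] == 1 and s["Compass"] == 1),
}

def violated_constraints(s):
    """Return the names of all constraints violated by selection s."""
    return [name for name, check in constraints.items() if not check(s)]

print(violated_constraints({"Watch": 1, "GPS": 1, "Compass": 0}))  # []
print(violated_constraints({"Watch": 0, "GPS": 1, "Compass": 1}))
# ['GPS requires Watch', 'GPS excludes Compass']
```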

EventHelpR [35]7 is a publicly available general-purpose group decision support tool. In the context of product (line) scoping, EventHelpR can support stakeholders

<sup>7</sup> https://www.eventhelpr.com


### Fig. 5.5: *Smartwatch* FM configuration in S.P.L.O.T.

Fig. 5.6: Example screenshot of the FM2EXCONF modelling environment [70].


Fig. 5.7: Example screenshot of FM2EXCONF [70].

in finding solid arguments regarding the inclusion and exclusion of features. An example screenshot of EventHelpR is shown in Figure 5.8.


Fig. 5.8: Example screenshot of EventHelpR.

The underlying idea of EventHelpR [35] is to allow users (stakeholders) to provide arguments for or against the inclusion of specific features. These arguments are then aggregated feature-wise, indicating a "global" tendency of feature inclusion or exclusion. Such preference elicitation user interfaces can help group members to focus on the exchange of decision-relevant information, i.e., arguments, and thus to significantly improve the overall decision quality (in terms of the selected features). With this, EventHelpR provides a kind of explanation-based user interface which helps to make the reasons for feature inclusion and exclusion transparent. The aggregation of the preferences of individual users is supported in terms of a group aggregation function [33, 108, 109] which determines the share of positive and negative arguments on a graphical level (see Figure 5.8). In line with EventHelpR, the open source requirements engineering environment OpenReq8 supports group decision making in the context of prioritizing software features [31, 43, 107].

MiniZinc IDE [82]9 is a tool that allows for the specification and solving of constraint satisfaction problems (CSPs) in a graphical environment. The specification of our example FM in MiniZinc IDE is depicted in Figure 5.9.


Fig. 5.9: Example screenshot of MiniZinc IDE.

<sup>8</sup> https://openreq.eu/

<sup>9</sup> https://www.minizinc.org/ide/

On the one hand, a major disadvantage with regard to feature modelling is that IDEs such as the MiniZinc environment support the specification of variables and constraints; however, no related graphical knowledge representation of features, relationships, and cross-tree constraints is provided. On the other hand, models (represented as CSPs) can easily be extended with attributes; for example, for each relevant feature, we could introduce a corresponding price attribute indicating the price of the feature. Furthermore, an additional attribute could represent the overall price of a configuration using a resource constraint of the form *totalprice* = *price*<sub>1</sub> + .. + *price*<sub>n</sub>. This way, MiniZinc IDE could be used within the scope of different courses related to knowledge representation and reasoning. Specifically, in the context of constraint solving (and beyond), search optimization plays a major role. With highly complex FMs, a corresponding solver search optimization on the basis of the concepts of machine learning becomes increasingly relevant [90, 113]. Related topics are (1) *configuration space learning* [85] which includes intelligent synthesis methods for the generation of relevant test configurations and (2) *knowledge extraction from data* [80, 72, 104, 112] which can help to increase the efficiency of modelling processes, for example, by the automated extraction of features from natural language text.
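The price attribute idea can be illustrated with a small brute-force Python analogue of such a CSP (a real MiniZinc model would delegate search and optimization to the solver); the features, prices, and the budget of 150 are illustrative assumptions.

```python
from itertools import product

prices = {"Watch": 100, "GPS": 40, "Camera": 60}  # illustrative price attributes
MANDATORY = {"Watch"}
BUDGET = 150

def total_price(selection):
    """Resource constraint: the overall price is the sum of the selected prices."""
    return sum(prices[f] for f in selection)

# Enumerate all selections containing the mandatory features and respecting the
# budget; among those, prefer the one with the most features (cheapest on ties).
feats = list(prices)
valid = []
for bits in product([0, 1], repeat=len(feats)):
    s = {f for f, b in zip(feats, bits) if b}
    if MANDATORY <= s and total_price(s) <= BUDGET:
        valid.append(s)

best = max(valid, key=lambda s: (len(s), -total_price(s)))
print(sorted(best), total_price(best))  # ['GPS', 'Watch'] 140
```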

Finally, Gears [65]10 is a product line engineering tool and lifecycle framework which supports all phases of the SPL process. Product line engineering is interpreted as a highly automated task similar to the manufacturing of physical products in a factory. Gears provides a quasi-standard unified variant management approach that is vendor-independent but integrates with other third-party and proprietary tools, assets, and processes across each stage of the lifecycle and across engineering and operations disciplines. This helps to reduce complexity, time, effort, and errors on the one hand, and breaks down organizational and operational silos, enabling better communication, alignment, and collaboration on the other hand.

**Product Configuration Environments.** Specifically, in the context of knowledge-based product configuration scenarios [36, 98, 102], there exists a plethora of commercial environments supporting the development of configurator applications. Without any claim to completeness, related example systems are camos11, ConfigIt12, Tacton13, encoway14, and Variantum15. Note that these systems are not primarily based on FMs but in many cases on a more object- or component-oriented knowledge representation (see, e.g., [30, 34]). The corresponding reasoning (solver) support can range from constraint-based and SAT-based to rule-based reasoning [36, 98, 102].

<sup>10</sup> https://biglever.com/solution/gears/

<sup>11</sup> https://www.camos.de

<sup>12</sup> https://configit.com

<sup>13</sup> https://www.tacton.com/

<sup>14</sup> https://www.encoway.de

<sup>15</sup> https://variantum.com/

Based on the given overview of feature modelling tools (and beyond), we now discuss examples of tools that support the *analysis* of FMs.

### **5.3 Feature Model Analysis: Tool Support**

Due to changing requirements and dependencies between features, the development and maintenance of large and complex product lines can become a difficult task [63]. In the following, we discuss different tools that assist developers in the design and maintenance of FMs.

flama [50]16 is a tool suite for variability model analysis in general and FM analysis in particular. It is developed as a Python framework with a plugin-based architecture in which different plugins for FM languages as well as reasoning capabilities can be developed. flama supports UVL models (see Section 2.7) and SAT, BDD, and SMT reasoning capabilities [76]. It also supports analysis operations that do not need solver support (see Section 3.1.1). The project is maintained and promoted by four different universities, and its spirit is to serve as a common base for the development of FM analysis and configuration capabilities. Many applications use flama as a backend for analysis capabilities [15, 49, 58, 59, 71, 77, 95, 96, 103].

FaMa [14]17 is a widespread application for FM analysis written in Java. It is a framework for the automated analysis of FMs integrating several of the most commonly used logic-based representations and solvers proposed in the literature (BDD, SAT, CSP solvers). After having imported an FM from nearly any other FM tool, it provides support for validity checking and finding inconsistencies. By that, it covers the domain analysis phase well, for example, features with attributes, numeric values, and constraints. For requirements analysis, too, FaMa stands out with its automatic reasoning capabilities based on symbolic AI methods, such as model validation (e.g., non-satisfiable models), anomaly detection (e.g., dead features, false-optional features), model counting (e.g., number of configurations), and redundancy detection [69]. Currently, the project lacks support since its main contributors are now developing flama.

FactLabel [56]18 is a web-based tool that supports (in a configurable fashion) the interactive visualization of FM characteristics (as a result of executing various FM analysis operations) which can then also be exported to other tools in diverse exchange formats (e.g., the Universal Variability Language). The result of applying FactLabel to our example FM is depicted in Figure 5.10.

FeatureIDE [106] provides a set of analysis operations (a.o. dead features, false optionals, and redundant constraints) which are defined on a logical basis and used to detect anomalies for developers [63]. Such reasoning about different FM properties can help to generate explanations that provide reasons as to why specific

<sup>16</sup> https://flamapy.github.io

<sup>17</sup> http://www.fama-ts.us.es/

<sup>18</sup> https://fmfactlabel.adabyron.uma.es/


Fig. 5.10: FactLabel user interface: output for the *smartwatch* FM.

expected FM properties do not hold. For example, an explanation can indicate minimal sets of FM relationships and cross-tree constraints that are responsible for a specific unintended FM semantics [32, 63, 74].

FMTesting [21]19 is a FeatureIDE plugin focusing on the application of model-based diagnosis [32, 93] for explaining anomalies in FMs. The determined diagnoses represent minimal sets of FM relationships and constraints responsible for an unintended behavior of an FM (e.g., which are the relationships and constraints that make a specific feature a void feature). Given a specific FM, the developer can select analysis operations to be activated (e.g., void features) and the plugin determines the set of void features with the corresponding explanations (diagnoses). A screenshot of FMTesting with a corresponding diagnosis output is shown in Figure 5.11.

Similar to FMTesting, Hentze et al. [74] present a FeatureIDE service that supports the determination of diagnoses (denoted as hyper-explanations) for dead features. Furthermore, Bendík and Černá [17] present a tool that supports the determination of *minimal unsatisfiable subsets* (conflict sets) which are a basis for diagnosis determination [40, 93] (see Chapter 3). Finally, Le et al. introduce DirectDebug [67, 68] which is a Java library for the automated diagnosis of FMs.
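The notion of a diagnosis used by these tools, i.e., a cardinality-minimal set of constraints whose removal restores consistency, can be illustrated with a brute-force Python sketch (real tools use far more efficient algorithms such as direct diagnosis); the constraint set is an illustrative toy example in which the FM is void.

```python
from itertools import combinations, product

features = ["A", "B"]
constraints = {
    "c1": lambda s: "A" in s,
    "c2": lambda s: "A" not in s,  # contradicts c1: the toy FM is void
    "c3": lambda s: "B" in s,
}

def consistent(active):
    """True if some feature selection satisfies all active constraints."""
    for bits in product([False, True], repeat=len(features)):
        s = {f for f, b in zip(features, bits) if b}
        if all(constraints[c](s) for c in active):
            return True
    return False

def minimal_diagnoses(names):
    """All cardinality-minimal constraint sets whose removal restores consistency."""
    for k in range(len(names) + 1):
        found = [set(d) for d in combinations(names, k)
                 if consistent(set(names) - set(d))]
        if found:
            return found
    return []

print(minimal_diagnoses(list(constraints)))  # [{'c1'}, {'c2'}]
```

Removing either c1 or c2 makes the model satisfiable again; c3 is not involved in the anomaly.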

<sup>19</sup> https://github.com/AIG-ist-tugraz/FMTesting


Fig. 5.11: FMTesting: a diagnosis plugin for FeatureIDE.

UVLHub [96]20 is a dataset repository supporting feature models in UVL format following open science principles. Open science principles promote transparency, accessibility, and collaboration in scientific research. UVLHub provides a front-end that facilitates the search, upload, storage, and management of feature model datasets, improving on the capabilities of discontinued proposals such as S.P.L.O.T. It communicates with Zenodo, which provides a permanent location for datasets, and it is maintained by three universities active in variability modelling. Figure 5.12 shows UVLHub in action, displaying the dataset of the feature model shown in this book.

In addition to the previously discussed examples, the following tools support FM analysis operations [57]. pure::variants [18] supports a set of analysis operations comparable to those of FeatureIDE. In pure::variants, it is possible to determine the number of configurations for individual subtrees of the FM [57]. S.P.L.O.T. [78] also provides a basic set of analysis operations including dead features and FM consistency (see Figure 5.5). Analysis operations supported in FM2EXCONF (see Figure 5.7) [38, 70] resemble those provided by the FMTesting environment (see Figure 5.11). Finally, a formalization in terms of mixed integer linear programs for the analysis of Clafer models is provided in Weckesser et al. [115] – compared to most other existing analysis approaches, Clafer [7] FM analysis has to deal with a higher complexity due to the higher expressivity of the underlying FM language [115].

<sup>20</sup> https://www.uvlhub.io/


Fig. 5.12: UVLHub: a data set repository of feature models in UVL format [96].

### **5.4 Feature Model Configurator Applications**

On the basis of the previous presentations of feature modelling tools and corresponding FM analysis approaches, we now discuss different FM configurator applications. In our discussion, we specifically focus on those applications with available associated descriptions of the used AI methods.

**Software Product Line Configuration.** FMs help to represent the configuration space of a large number of different systems which can be assembled out of a set of pre-defined (implemented) software artifacts. Thus, the concept of software artifacts in SPL configuration resembles the concept of components when configuring physical systems [36]. FMs of industrial software product lines [8, 89, 84] can become large and complex – see, for example, the Linux operating system FM [97, 105]. Asikainen et al. [9] introduce the configuration environment WeCoTin and show how their configuration environment can be used to model and implement a text editor configurator application. The system includes a modelling environment which can be used to create (attribute- and cardinality-based) FMs and, in an automated fashion, derive a corresponding configurator user interface from them. In the line of this work, various software systems and tools support the task of software product line configuration. Related SPLs exist, for example, in operating systems [97], automotive systems [27, 28, 117], synthetic biology [24], and software product lines for large telescope control software [55].

**Configuring Control Systems.** Beek et al. [12] define a single FM covering the complexity (high variability) of the European Train Control System (ETCS), an automatic train protection (ATP) system which continuously supervises all trains on a railway line, ensuring that the safe speed and distance are not exceeded. This model shows the different components to be installed at the different levels established by the ETCS standards and helps engineers to understand and solve specific issues, such as aligning the interfaces between different systems (e.g., onboard and wayside equipment), development of sustainable solutions by involved manufacturers, interoperability among systems at different ETCS levels, backward compatibility of involved subsystems, and evolution towards new requirements. The FM can also act as a facilitator in cost and performance analysis for planning purposes because it is implemented in Clafer, which supports the generation of the complete set of instances (products) from the model. Furthermore, attributes associated with features and quality constraints allow the assignment of costs and their corresponding optimization.

**Runtime Configuration.** Capilla et al. [5, 23] introduce concepts supporting the specific scenario of *runtime configuration* in the context of dynamic software product lines. Such a type of configuration enables the addition and removal of variants on-the-fly, runtime dependency and constraint checking, dynamic and optimized reconfiguration, and multiple binding and re-binding. Applications range from service-oriented and cloud systems over mobile software and ecosystems for autonomous and self-adaptive systems to cyber-physical systems [66] which have to reuse, reorganize, and reconfigure their components during runtime. A benefit is that variants are bound at the latest time possible, which ensures high flexibility, for example, the (de)activation of system features or the adaptation to changing conditions of the environment. Related ideas on *anytime diagnosis* supporting efficient reconfiguration tasks are discussed in Felfernig et al. [44].

**Release Plan Configuration and Reconfiguration.** In requirements engineering, dependencies are key concerns to be taken into account in the context of prioritization processes. Raatikainen et al. [92] relate individual requirements to individual features in an FM and thus represent the task of requirements prioritization as an FM configuration task. Their feature modelling environment supports the inclusion of attributes; for example, a release can be regarded as an attribute of a feature. Once completed, an FM can be translated into the representation of a constraint satisfaction problem (CSP). A specific aspect of this environment is the inclusion of *explanations* that help stakeholders to figure out the sources of an inconsistency. For example, if the maximum allowed amount of effort assigned to a release plan cannot be respected because too many features are required to be included, explanations help to figure out minimal sets of features that – if excluded from the current release plan – allow the identification of a solution (release plan). In a similar fashion, the same explanation concepts can be applied to reconfigure an existing release plan to take into account a set of new requirements [44]. Such explanations are determined on the basis of the concepts of direct diagnosis [42] (see also Chapters 3 and 4).
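The idea of representing release planning as a constraint satisfaction task can be illustrated with a brute-force Python sketch; the features, efforts, release capacities, and the dependency are illustrative assumptions, not the model of Raatikainen et al. [92].

```python
from itertools import product

# Illustrative release planning: assign each feature to release 1 or 2, or
# postpone it (release 0), respecting per-release effort capacities.
efforts = {"login": 5, "search": 8, "export": 3, "sync": 6}
CAPACITY = {1: 10, 2: 9}

def valid(plan):
    # dependency: 'sync' must not be released before (or without) 'login'
    if plan["sync"] and (not plan["login"] or plan["login"] > plan["sync"]):
        return False
    # the effort assigned to each release must not exceed its capacity
    return all(
        sum(e for f, e in efforts.items() if plan[f] == r) <= cap
        for r, cap in CAPACITY.items()
    )

plans = [dict(zip(efforts, p)) for p in product([0, 1, 2], repeat=len(efforts))]
feasible = [p for p in plans if valid(p)]
# objective: postpone as little effort as possible
best = min(feasible, key=lambda p: sum(e for f, e in efforts.items() if p[f] == 0))
print(best)  # an optimal plan postpones only 'sync' (6 effort units)
```

If no feasible plan existed, a diagnosis over the constraints (as sketched in Chapter 3) would indicate which features to exclude from the plan.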

**Configuration in Augmented Reality.** Motivated by the trend of mobile shopping, Gottschalk et al. [51] present an FM configurator application that supports model-based configuration of furniture, for example, the configuration of kitchens. A product modeler application supports the use of basic feature modelling concepts, i.e., *feature*, *mandatory*, *optional*, *alternative*, *or*, *requires*, and *excludes*. The corresponding product configurator (derived from the feature model) supports the creation of individual furniture configurations (3D object compositions) where a collection of assets (3D objects and textures) is used for generating a 3D visualization of the solution (configuration) generated by the configurator.

**Configuring Operating Systems.** The Linux operating system kernel can be regarded as a kind of holy grail of the SPL community [105]. Sincero and Schröder-Preikschat [101] introduce the original variability management of the Linux kernel configurator. The underlying FM supports basic FM variability modelling concepts. A more up-to-date summary of the existing Linux kernel configuration support is provided in Franz et al. [46] where XConfig is mentioned as the corresponding graphical configurator. The underlying variability definitions follow basic FM variability concepts [62]. An extension of XConfig is ConfigFix, a tool that supports conflict resolution in the case of detected inconsistencies in the current configuration. Conflict detection and resolution in ConfigFix is based on SAT solving [46]. As an additional supportive service in the context of Linux kernel configuration, Acher et al. [4] propose a machine learning approach to predict the kernel size of a Linux kernel configuration – such a service can be applied to recommend features and to rank configuration candidates. Furthermore, in the context of optimizing a kernel configuration, predictions can be chosen to select specific reconfiguration options. Herzog et al. [54] introduce a machine learning based approach that helps to optimize operating system parameters on the basis of linear models and neural networks. A configuration front-end for Kconfig-based software product lines is also presented in Friesel et al. [47].

**Configuration in Automotive.** For decades, the automotive industry has been among the most extreme applications of SPLs, with highly complex products with a literally astronomical number of variants [117, 118]. Modern automobiles can comprise hundreds of separate engineering systems, such as engines, brakes, air bags, lights, windshield wipers, climate control, infotainment, etc. Some of them are extraordinarily complex, such as high beam headlights that react to oncoming traffic at night. For a discussion of related details on automotive product line engineering, we refer to Wozniak and Clements [117]. From the end-user (customer) point of view, nearly every car provider also offers configuration services to their customer communities. An example thereof is the Renault configurator as discussed in Xu et al. [118]. This configurator is based on constraint satisfaction (CSP) knowledge representations [10] and corresponding knowledge compilation (compression) approaches which can lead to significant reductions in terms of the time needed for identifying a configuration.

**Configuring Videos.** Acher et al. [2] present ViViD which is a variability-based tool for configuring video sequences. In ViViD, variability modelling is based on attribute-based FMs which can be translated into a corresponding representation of a constraint satisfaction problem (CSP). As a constraint solver, the system uses Choco, an open source Java library for constraint programming.21 In the line of [2], Lubos et al. [73] introduce an FM configuration approach with a similar objective in the sense that videos should be configured in such a way that different criteria such as maximum duration and topic coverage are fulfilled. In this context, basic FMs are used for variability modelling.

**Personalized Configuration.** The idea of personalized configuration is to support configurator users in finding a configuration that satisfies their preferences, for

<sup>21</sup> https://choco-solver.org/

example, in terms of providing user-individual recommendations of components and features of potential relevance [29, 87]. Following the idea of personalizing the interactions with configurators, Pereira et al. [88] introduce an FM configuration environment enhanced with different recommendation approaches [22, 114]. The used recommendation algorithms determine features of potential relevance for the user which are shown within the scope of a corresponding configuration process. Also following the idea of personalized configuration, Rodas-Silva et al. [94] present a recommender system that suggests implementation components based on a set of selected features (related to potential WordPress website configurations). The underlying idea is that in product lines a selected feature can often be implemented by different components – finding the optimal components to implement a given configuration is the task to be supported. In this context, basic feature modelling concepts are supported to represent variability properties in the FM.

**Further Configuration Services.** Jézéquel et al. [60] introduce an authentication library which offers a huge variety of options (features) where only a subset is needed for each concrete installation of an application on a server, for example, authentication by password or fingerprint but not retinal scan. Not all features are either selected or excluded – some stay open so that an administrator can change settings at runtime. Such runtime features are called *feature toggles*. In order to avoid unnecessary code, which might make hacker attacks easier, they propose automated source code creation, removal, and injection based on the selected, excluded, and open design time features. Fritsch et al. [48] present YAP (*Yet Another Product Configurator*) which is based on FeatureIDE combined with an underlying SAT solver. The configurator has been developed for customers of a German bank with the goal of assisting users without a technical background in their product design [48]. The underlying attribute-based FM consisting of around 940 features and 1,200 cross-tree constraints also supports a kind of standardization in terms of ensuring consistent offers and corresponding customer information documents. Niederer et al. [83] present a product configurator with the goal of providing configurator user interfaces with more flexibility regarding the specification of user preferences. In product configuration, users can be overwhelmed by the complexity of a product's variability (in terms of features and constraints). Furthermore, many configuration user interfaces do not differentiate between novice users and experts with regard to the product assortment. The context-aware chatbot introduced in Niederer et al. [83] provides more flexibility in terms of available conversation paths and the way user preferences can be specified, which leads to a lower perceived complexity when interacting with the configurator.
Similar observations regarding improvements in the quality of user interaction have also been reported in the context of constraint-based recommender systems [52].

### **5.5 Discussion**

In this chapter, we have provided an overview of existing FM tools and applications, ranging from *feature modelling* and *FM analysis* to different *FM configuration applications*. We selected applications specifically from SPLC<sup>22</sup> and VAMOS<sup>23</sup>. Despite the successful application of AI technologies to software and systems product lines, we also see open challenges and important topics for future research.

**Bridging Design and Runtime Variability.** The flexible continuum of design and runtime variability needs to be bridged and managed [60]. A formal semantics of the underlying models can help to ensure result correctness and completeness, for example, in the context of risk models [26]. For many industrial domains, it would be very helpful to build an extensible ontology which can serve as a backbone for joint research. A higher degree of automation in software verification (e.g., model checking), in change impact analysis on test cases, and in test case creation and repair would help to reduce development efforts [1]. It is often difficult to find a good trade-off between complexity and benefit [55]; strategies for doing so should be examined.

**Inclusion of Large Language Models (LLMs).** Since the end of 2022, LLMs for understanding natural language and corresponding systems for human-like interaction such as ChatGPT<sup>24</sup> have gained much attention. It is worth evaluating how such systems could be used to improve feature modelling and feature configuration [3, 49], for example, as "intelligent" natural language assistants for recommending or explaining configuration decisions. Important aspects in this regard are avoiding the notorious "hallucinations" (i.e., the tendency of such systems to introduce "false facts" when they lack true information or try to satisfy user requirements) and protecting data (i.e., ensuring non-disclosure and non-derivability of private and confidential data).

**Integration of Standard Algorithms.** Although applied and integrated in different research prototypes, FM diagnosis (for supporting explanations) is often not supported at all or only on the basis of proprietary, often incomplete algorithmic solutions. Tool providers need to emphasize the integration of standard algorithms, such as QuickXPlain [61] for conflict detection and model-based diagnosis for identifying minimal hitting sets (diagnoses) [42, 93].
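To give an impression of how QuickXPlain [61] determines a minimal conflict, the following simplified Python version follows Junker's divide-and-conquer scheme. The consistency check is abstracted into a caller-supplied function; the interval constraints used for illustration are invented and not part of the original algorithm:

```python
def quickxplain(background, constraints, is_consistent):
    """Return a minimal conflict: a minimal subset of `constraints`
    that is inconsistent together with `background` ([] if none)."""
    if is_consistent(background + constraints):
        return []
    return _qx(background, [], constraints, is_consistent)


def _qx(background, delta, constraints, is_consistent):
    # If the constraints added in the last split (delta) already make
    # the background inconsistent, nothing from this half is needed.
    if delta and not is_consistent(background):
        return []
    if len(constraints) == 1:
        return list(constraints)
    half = len(constraints) // 2
    c1, c2 = constraints[:half], constraints[half:]
    d2 = _qx(background + c1, c1, c2, is_consistent)
    d1 = _qx(background + d2, d2, c1, is_consistent)
    return d1 + d2


# Toy consistency check: each constraint restricts a variable x to an
# interval (lo, hi); a set of intervals is consistent iff they overlap.
def intervals_consistent(intervals):
    return (not intervals or
            max(lo for lo, _ in intervals) <= min(hi for _, hi in intervals))
```

For instance, for the constraints `[(0, 10), (5, 20), (15, 30)]` the algorithm returns the minimal conflict `{(0, 10), (15, 30)}`: removing either of the two restores consistency. A hitting-set-based diagnosis algorithm (e.g., HSDAG [93]) can then use such conflicts to compute minimal sets of constraints to relax.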

**Cognitive Issues in FM Development.** The understandability and maintainability of FMs depend on the used knowledge representation. For example, the way in which constraints are ordered in a knowledge base, or the way specific logical properties (e.g., an implication) are specified, can have an enormous impact on the overall understandability of a knowledge base [39]. Research is needed to better understand which knowledge structures help to optimize the overall understandability of a knowledge base (and the corresponding feature model) [41].

**Integration of FM Configuration with Machine Learning.** Although related solutions already exist in terms of research prototypes, an integration of FM configuration with corresponding machine learning approaches is still the exception rather than the rule. Such technologies help to better assist configurator users in selecting the

<sup>22</sup> https://splc.net/

<sup>23</sup> https://dblp.org/db/conf/vamos/index.html

<sup>24</sup> https://openai.com/blog/chatgpt

relevant features and also help companies to better decide which features should be recommended to which users [37, 90].
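As a minimal sketch of such learning-based support, the following similarity-weighted feature recommendation from past configurations is a naive form of collaborative filtering; the data and feature names in the usage example are invented for illustration:

```python
def jaccard(a, b):
    """Jaccard similarity of two feature sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def recommend_features(partial, past_configs, top_n=2):
    """Rank features not yet selected in `partial` by how often they
    occur in past configurations, weighted by each past configuration's
    similarity to the partial one."""
    scores = {}
    for config in past_configs:
        weight = jaccard(partial, config)
        for feature in set(config) - set(partial):
            scores[feature] = scores.get(feature, 0.0) + weight
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [feature for feature, _ in ranked][:top_n]
```

Given past configurations `{bluetooth, gps, camera}`, `{bluetooth, gps}`, and `{camera, nfc}`, a user who has selected only `bluetooth` would first be recommended `gps`, since it co-occurs with `bluetooth` in the most similar past configurations. Matrix factorization or content-based filtering can serve the same purpose on larger interaction data.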

**Configuration Space Learning.** Although highly relevant in different application contexts (e.g., optimizing solver search heuristics or determining stable operating system parameter settings), configuration space learning is still more a research topic than a technique on its way into existing FM configuration environments [85]. The integration of such techniques into feature modelling and configuration environments can help to significantly improve configurator runtime performance as well as the performance of the generated products.
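To give an impression of configuration space learning, the following sketch fits a simple additive performance model from measured sample configurations using plain stochastic gradient descent. The features, samples, and learning-rate settings are purely illustrative, and real approaches typically use richer models over much larger samples:

```python
def learn_performance_model(samples, features, epochs=500, lr=0.05):
    """Fit a per-feature weight plus a bias so that
    bias + sum(weights of selected features) approximates the measured
    performance of each sampled configuration (least squares via SGD).

    `samples` is a list of (selected_feature_set, measured_value) pairs.
    """
    weights = {f: 0.0 for f in features}
    bias = 0.0
    for _ in range(epochs):
        for selected, measured in samples:
            prediction = bias + sum(weights[f] for f in selected)
            error = prediction - measured
            bias -= lr * error
            for f in selected:
                weights[f] -= lr * error
    return bias, weights
```

For example, from measurements of four configurations of a system whose runtime is 10 units, reduced by about 3 when a `cache` feature is selected and increased by about 2 when `logging` is selected, the fitted weights recover these per-feature effects and can then predict the performance of configurations that were never measured.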

**Explaining Configurations and Beyond.** In (feature model) configuration and CSP/SAT solving, various research contributions exist regarding the provision of explanations [53, 64]. However, open issues remain, specifically with regard to taking into account explanation goals, which have a significant impact on the way explanations are formulated and presented to the user [110]. An example of such an explanation goal is *persuasiveness*, i.e., sensitizing a user with regard to a specific aspect, for example, *configuration sustainability* [45].

### **References**


*International Systems and Software Product Line Conference - Volume A*, pages 237–241, New York, NY, USA, 2017. Association for Computing Machinery.


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Index**

application engineering, 17
artificial intelligence, 4
atomic set, 52
augmented reality, 109

background knowledge, 54

car configuration, 110
collaborative filtering, 75
configuration, 1, 3, 4
  minimality, 78
  space learning, 33, 113
configurator applications, 107
conflict detection, 61
  scenarios, 56
  user requirements, 58
conflicts, 54
constraint satisfaction, 4
content-based filtering, 75, 76
control system configuration, 108
cross-tree constraint, 19
CSP, 4, 26, 103

diagnosing inconsistent constraints, 55
diagnosis
  algorithms, 62
  minimality, 62
  direct, 82
  FastDiag, 83
  Hitting Set Directed Acyclic Graph, 62
  HSDAG, 62

EventHelpR, 100
excludes, 19, 47
explanations, 79, 113

FactLabel, 105

FaMa, 105
FastDiag, 83
feature diagram, 19
  abstract, 20
  alternative, 19, 47
  concrete, 20
  core, 51
  dead, 50
  false optional, 51
  mandatory, 19, 47
  optional, 19, 47
  or, 19
feature model, 1
  benefits, 7
  configurators, 107
  CSP, 25
  edits, 53
  extensions, 20
  related topics, 6
  SAT problem, 26
  analysis, 5, 45, 105
  attribute-based, 22
  cardinality-based, 21
  void, 49, 57
feature modelling, 5, 16
FeatureIDE, 96, 105
FLAMA, 105
FM2EXCONF, 99
FMTesting, 106

Gears, 104
group recommender systems, 75

Hitting Set Directed Acyclic Graph, 62
HSDAG, 62


inconsistent feature models, 54

knowledge extraction, 35
knowledge representation, 4

large language models, 36, 112
LLMs, 36, 112

machine learning, 2, 74
  configuration space learning, 33
  conflict detection, 67
  diagnosis, 67
  knowledge extraction, 35
  LLMs, 36
  recommender systems, 75
mass customization, 3
matrix factorization, 85
maximum satisfiable subset, 83
MiniZinc, 103
model-based diagnosis, 55
modelling tools, 95

operating systems, 109
or relationship, 19, 47

problem space, 18
product configuration, 104
product line scoping, 32
pure::variants, 97

QuickXPlain, 57

R1/XCON, 3
reasoning, 4, 26, 28
recommender systems, 74
  collaborative filtering, 75
  content-based filtering, 75
  group recommender systems, 75
recommending conflict resolutions, 80
recommending features, 76
reconfiguration, 85, 88, 109
redundancy, 52
  detection, 63
redundant constraint, 52
release plan configuration, 109
requires, 19, 47
runtime configuration, 109

S.P.L.O.T., 97
SAT solving, 4, 28
satisfiability, 49
search heuristics, 76
semantics of feature models, 23
smartwatch feature model, 20
software product line configuration, 108
software product lines, 2, 97
solution space, 18

testing feature models, 65
textual languages for feature models, 30
tools, 95

user interaction, 73
UVL, 8, 30–32, 105
UVLHub, 106

variability mining, 38
variable domain orderings, 76
variant feature, 51
video configuration, 110