Karlsruher Institut für Technologie

**Schriftenreihe Kontinuumsmechanik im Maschinenbau**

Daniel Wicht

Efficient fast Fourier transform-based solvers for computing the thermomechanical behavior of applied materials

### **Schriftenreihe Kontinuumsmechanik im Maschinenbau, Volume 21**

Karlsruher Institut für Technologie (KIT) Institut für Technische Mechanik Bereich Kontinuumsmechanik

Edited by Prof. Dr.-Ing. habil. Thomas Böhlke

An overview of all volumes published in this series to date can be found at the end of the book.

# **Efficient fast Fourier transform-based solvers for computing the thermomechanical behavior of applied materials**

by Daniel Wicht

Karlsruher Institut für Technologie Institut für Technische Mechanik Bereich Kontinuumsmechanik

Dissertation approved by the KIT Department of Mechanical Engineering of the Karlsruhe Institute of Technology (KIT) for the academic degree of Doktor-Ingenieur

by Daniel Wicht, M.Sc.

Date of oral examination: June 2, 2022

Main referee: Prof. Dr.-Ing. Thomas Böhlke

Co-referee: Prof. Dr.-Ing. Martin Heilmaier

Co-referee: Jun.-Prof. Dr. rer. nat. Matti Schneider

**Imprint**

Karlsruher Institut für Technologie (KIT) KIT Scientific Publishing Straße am Forum 2 D-76131 Karlsruhe

KIT Scientific Publishing is a registered trademark of Karlsruhe Institute of Technology. Reprint using the book cover is not allowed.

www.ksp.kit.edu

*This document – excluding parts marked otherwise, the cover, pictures and graphs – is licensed under a Creative Commons Attribution-Share Alike 4.0 International License (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/deed.en*

*The cover page is licensed under a Creative Commons Attribution-No Derivatives 4.0 International License (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/deed.en*

Print on Demand 2022, printed on FSC-certified paper

ISSN 2192-693X ISBN 978-3-7315-1220-2 DOI 10.5445/KSP/1000148765

# **Abstract**

The mechanical behavior of many applied materials is decisively influenced by their microstructure. Reliable computational homogenization methods are therefore indispensable for advancing the development and industrial application of new material classes. One example is directionally solidified NiAl eutectics, which exhibit fibrous or lamellar structures and are of high research interest owing to their high melting temperature and lightweight potential. This material class poses major challenges for modern micromechanics solvers, such as microstructural features on different length scales, a high material contrast in the creep properties and demanding nonlinear material models.

The aim of this thesis is therefore to investigate and develop efficient FFT-based micromechanics solvers for computing the effective (thermo)mechanical behavior of nonlinear composites with complex microstructure. Both Lippmann-Schwinger solvers and polarization methods serve as starting points for advanced solution schemes. In particular, we use the powerful BFGS quasi-Newton method within FFT-based micromechanics to develop fast, tangent-free algorithms. Furthermore, we use Anderson's fixed-point acceleration to improve the convergence behavior of polarization methods for infinite material contrast.

In addition to universally applicable methods, several special applications of FFT-based solvers are considered. First, the stress-explicit formulation of the flow rule of small-strain crystal plasticity models is exploited to reduce the computation time of FFT solvers by about an order of magnitude. Here, the stress serves as the primary field variable of the periodic cell problem in dual form. Second, thermomechanically coupled problems are considered in the framework of asymptotic homogenization. The decoupling of mechanics and heat conduction on the microscale is used to develop an implicit solver that is compatible with all strain-based FFT methods.

Finally, we return to directionally solidified NiAl-Mo alloys and use the developed methods for a detailed study of their creep behavior. The focus lies on cellular NiAl-Mo eutectics, whose behavior has not yet been investigated by micromechanical simulations owing to their multiscale structure. For this purpose, a phenomenological surrogate model is developed that captures the creep behavior of aligned NiAl-Mo fiber structures. The cellular microstructures are generated by a level-set approach. With the help of the creep simulations, the distinction between the hard and the soft phase in the cellular structure can be clarified. Furthermore, the influence of cell volume fraction and aspect ratio on the creep behavior is analyzed.

# **Summary**

The mechanical behavior of many applied materials arises from their microstructure. Thus, to aid the design, development and industrialization of new materials, robust computational homogenization methods are indispensable. For instance, NiAl-based directionally solidified eutectics, with fibrous or lamellar microstructure, constitute a material class of high research interest, due to their high temperature resistance and lightweight potential. With structural features on different length scales, a high contrast of mechanical properties during creep and computationally demanding material models, these materials exemplify the challenges for modern micromechanics solvers.

Hence, the present thesis is devoted to investigating and developing FFT-based micromechanics solvers for efficiently computing the (thermo)mechanical response of nonlinear composite materials with complex microstructures. To this end, both Lippmann-Schwinger solvers and polarization schemes are considered as starting points for new general-purpose methods. More precisely, we investigate two novel applications of the powerful BFGS Quasi-Newton method in the context of FFT-based micromechanics, to produce fast, tangent-free algorithms. Moreover, we use Anderson acceleration to eliminate the main weakness of polarization schemes, i.e., their inability to handle materials with infinite contrast.

In addition to powerful general-purpose methods, we consider a number of specialized applications of FFT-based solvers. Firstly, noting that the flow rule of small-strain crystal-plasticity models is naturally formulated as a function of the stress, we revisit the dual variational framework for the periodic cell problem. Using modern FFT-methods in the stress-based setting, computation times for polycrystalline materials are reduced by about an order of magnitude. Secondly, we consider thermomechanically coupled materials using the framework of asymptotic homogenization. Based on the decoupling of mechanics and heat conduction on the microscale, we propose an implicit staggered approach, which is compatible with all strain-based FFT-methods.

Last but not least, we return to directionally solidified NiAl-Mo eutectics and use the developed solvers to thoroughly investigate their creep behavior. More precisely, we focus on the case of cellular NiAl-Mo, which has not been subjected to a simulation study, owing to its multiscale microstructure. To tackle this problem, we propose a phenomenological surrogate model for the creep behavior of well-aligned fibrous NiAl-Mo and generate the cellular mesostructures based on a level-set approach. By micromechanical simulations, we are able to clarify the distinction between soft and hard regions and identify the impact of cell volume fraction and aspect ratio.

# **Acknowledgments**

First of all, I would like to thank my supervisor Thomas Böhlke. Since the beginning of my undergraduate studies, he has shaped my understanding of continuum mechanics and thermodynamics. His continued support and guidance have made this thesis possible. Furthermore, I am very grateful to my co-supervisor Matti Schneider for his close cooperation and countless fruitful discussions on mathematics and micromechanics. I thank my co-supervisor Martin Heilmaier for his invaluable insight and advice in the field of material science. The added perspectives have greatly improved this thesis. I would also like to express my gratitude to Alexander Kauffmann from the IAM-WK who devoted much time (and certainly just as much patience) to sharing his knowledge.

I am very grateful to the whole staff of the ITM Chair for Continuum Mechanics. Many thanks to Ute and Helga for their invaluable administrative support during many conference trips and paper submissions and for having an open ear for non-research related topics. I also thank Tom for his support in all IT and teaching-related matters. Thanks to all my present and former colleagues for making the time at the institute so enjoyable. Our digital coffee breaks most certainly kept me sane during two long years of on-and-off lockdowns. Many of you watched out for me during my novice years as a researcher and I have a lot of fond memories of the conferences, seminars, workshops and summer schools I attended with you all. Particular highlights include long nights in Weimar and the demonstration of mass conservation by applied fluid mechanics. Special thanks to Jürgen Albiez, Hannes Erdle, Andreas Prahs, Loredana Kehrer, Johannes Ruck, Johannes Görthofer, Felix Ernesti, Sebastian Gajek, Max Krause, Alex Dyck, Mauricio Fernández and Juliane Lang.

I am most grateful to my wife Yuwon for having my back during the whole dissertation project. Without your encouragement and patience this would not have been possible. Also very special thanks to my son Leonard for being the sweetest kid I could have wished for and for making every day a little brighter. Many thanks to my parents for years of unwavering support and for putting up with a disgruntled son in times of research troubles. Once, when you asked how my thesis was coming along, I answered that its state is "bad" until it is finished, to shut off further inquiries. Now, feel free to ask again.

Last but not least, I gratefully acknowledge the partial financial support from the Helmholtz Association under the framework of the Helmholtz Research School on "Integrated Materials Development for Novel High Temperature Alloys (IMD)", Grant No. VH-KO-610.

Karlsruhe, August 2022 Daniel Wicht

# **Chapter 1 Introduction**

## **1.1 Motivation and objectives**

The effective macroscopic behavior of heterogeneous materials emerges from the interplay between microstructure and constituent behavior (McDowell, 2008). Indeed, in modern alloys, the microstructure is tailored to fit the desired application. An example of a material class in which the microstructure is designed to enhance the effective properties for structural applications at high temperatures is the class of nickel-aluminum (NiAl) based directionally solidified eutectics. In these alloys, binary B2-ordered NiAl, with low mass density and a high melting point, is combined with refractory metals, typically chromium (Cr) and/or molybdenum (Mo), providing creep resistance at high temperatures. Depending on the chemical composition (Cline and Walter, 1970; Cline et al., 1971; Gombola et al., 2020), a directional solidification process may result in either fibrous or lamellar structures, aligned in the growth direction. Experimental results show that, for a fixed stress loading, the creep rate of the reinforced alloys is up to several orders of magnitude lower than that of binary NiAl. However, for fine-tuning the processing parameters and the resulting microstructure, the impact of the morphology on the creep resistance of the material has to be determined. For instance, considering fibrous NiAl-Mo eutectics, the fiber diameter (Albiez et al., 2016a), fiber aspect ratio (Haenschke et al., 2010; Hu et al., 2013) and the presence of colonies (Misra et al., 1998; Bogner et al., 2012; Seemüller et al., 2013) all depend on the processing conditions and influence the creep behavior of the material. However, relying solely on experiments for characterizing the interplay between microstructure and mechanical behavior proves to be difficult. Firstly, considerable effort is associated with creep experiments, where a single run may take several days (Hu et al., 2013) excluding sample preparation. Secondly, deliberately manipulating the morphology is difficult, due to the sensitivity with respect to the processing parameters.

Thus, efficient computational homogenization methods are crucial for informing the material design process by robustly predicting the material behavior. For this purpose, FFT-based solvers (Moulinec and Suquet, 1994; 1998) have established themselves as powerful tools, compatible with either real or synthetic microstructure images. In this context, alloys of the NiAl-(Cr, Mo) system prove to be challenging, as microstructural features may vastly differ in their characteristic length scale. For instance, high fiber aspect ratios (Haenschke et al., 2010; Hu et al., 2013) or cellular mesostructures (Misra et al., 1998; Seemüller et al., 2013) may necessitate a fine spatial discretization, leading to representative volume elements (Kanit et al., 2003) with a large number of degrees of freedom. Moreover, the crystal plasticity models, governing the constituent behavior on the microscale (Albiez et al., 2016a), are associated with significant computational costs (Eghtesad et al., 2018a). This motivates us to develop and investigate highly efficient FFT-based solvers for enabling the micromechanical study of materials with complex geometry and nonlinear material behavior. In addition to specialized methods for crystal plasticity models, we are interested in powerful general-purpose solvers, which are applicable to a wide range of applied materials. Hence, next to the NiAl-based eutectics, serving as our primary motivation and guiding application, other material classes, such as fiber reinforced polymers, are included as computational benchmarks. In the following, we give a breakdown of our primary objectives:


simulations on the cellular mesostructure, we develop a surrogate model for NiAl-Mo with well-aligned fibers and use the level-set framework by Sonon et al. (2012; 2015) to generate high-fidelity cell structures.

## **1.2 State of the art**

### **1.2.1 NiAl-based directionally solidified eutectics**

By basic thermodynamical considerations, the maximum operating temperature is one of the main limiting factors for the efficiency of gas turbines (Desideri, 2013). Thus, high-strength structural alloys with a melting point beyond the limits of state-of-the-art nickel-based superalloys are of high interest as potential turbine blade materials. In this context, the B2-ordered intermetallic NiAl features a number of attractive properties which have led to increased research interest (Darolia, 1991; Miracle, 1993; Noebe et al., 1993):


However, binary NiAl lacks fracture toughness at low temperatures and suffers from poor creep resistance at high temperatures, preventing its industrial application (Darolia, 1991; Miracle, 1993; Noebe et al., 1993). To counteract these weaknesses, the introduction of refractory metals, such as Cr or Mo, in combination with a directional solidification process (Cline et al., 1971) has emerged as a promising approach. Under these processing conditions, the refractory metal forms reinforcing structures in the direction of solidification, where the geometry of the inclusions depends on the chemical composition. For instance, in the NiAl-(Cr, Mo) system, NiAl-Mo and NiAl-Cr eutectics form fiber structures, whereas Cr-rich NiAl-Cr(Mo) leads to a lamellar arrangement of the phases (Cline and Walter, 1970; Cline et al., 1971; Gombola et al., 2020). Early studies on the mechanical characterization of NiAl-X eutectics (Johnson et al., 1995; Misra et al., 1998; Whittenberger et al., 2001) found that the fracture toughness and creep resistance were improved compared to binary NiAl but generally not competitive with nickel-based superalloys.

For NiAl-Mo eutectics, advances in processing technology facilitated further improvements of the mechanical properties. More precisely, using an optical floating zone furnace, Bei and George (2005) were able to produce highly regular and well-aligned Mo-fiber structures. In particular, the material was virtually free of defects, such as cell and dendrite structures (Misra et al., 1998; Ferrandini et al., 2004), which deteriorate the creep resistance of the material (Seemüller et al., 2013). Dedicated studies on the influence of processing parameters on the resulting microstructures (Bogner et al., 2012; Hu et al., 2012; Zhang et al., 2013) identified a high temperature gradient at the solidification front combined with a slow growth rate as key for producing well-aligned samples. These advances sparked considerable research interest in the mechanical properties and underlying mechanisms of well-aligned NiAl-Mo eutectics. Zhang et al. (2012) described several strengthening mechanisms, such as crack bridging and crack trapping, explaining the increased fracture toughness of roughly 14 MPa$\sqrt{\mathrm{m}}$ for eutectic NiAl-Mo compared to about 8 MPa$\sqrt{\mathrm{m}}$ for binary NiAl. By increasing the Mo content beyond the eutectic composition, the fracture toughness was further improved to above 19 MPa$\sqrt{\mathrm{m}}$. Bei et al. (2008) and Sudharshan Phani et al. (2011) found that the single-crystalline Mo-fibers in well-aligned NiAl-Mo were virtually dislocation free. This resulted in a high yield strength (Bei et al., 2007) and a decrease in the creep rate by several orders of magnitude for a prescribed stress loading (Haenschke et al., 2010; Dudová et al., 2011; Hu et al., 2013). Motivated by these experimental findings, Albiez et al. (2016a) proposed suitable single-crystal material models for the NiAl-matrix and the Mo-fibers and studied the effective creep behavior of the well-aligned composite through crystal plasticity simulations. In particular, the softening behavior of the material during creep was elucidated by a dislocation-based hardening law, generalizing an earlier model by El-Awady (2015). In a subsequent study (Albiez et al., 2019), the material models were extended by a non-local gradient-plasticity approach to account for the movement and transfer of dislocations. Overall, both experimental studies and simulations significantly improved the understanding of the creep behavior of well-aligned NiAl-Mo composites.

### **1.2.2 FFT-based micromechanics**

FFT-based solvers, pioneered by Moulinec and Suquet (1994; 1998), combine a number of salient properties which have driven their widespread application in modern computational micromechanics. Firstly, they naturally operate on regular grids, i.e., voxel images. Hence, they directly profit from advances in modern three-dimensional imaging techniques (Uchic et al., 2007; Cocco et al., 2013; Epting et al., 2012), providing high-fidelity digital representations of real-world microstructures. In particular, FFT-based methods avoid the meshing step, which may prove infeasible considering the diversity and complexity of microstructures in modern applied materials (Bargmann et al., 2018). Secondly, based on their inherently matrix-free formulation, FFT-based methods permit memory-efficient implementations, enabling the study of large volume elements with many degrees of freedom. In this context, researchers also profit from readily available and highly optimized implementations of the FFT (Frigo and Johnson, 2005), boosting the computational efficiency of the derived solution schemes (Eisenlohr et al., 2013; El Shawish et al., 2020). Last but not least, FFT-based methods provide great flexibility with regard to the investigated material behavior, as inelastic problems were considered from the very beginning (Moulinec and Suquet, 1998). Notably, the original basic scheme by Moulinec and Suquet (1994; 1998) already featured all of the above advantages. However, the method was found to converge slowly for composites with high material contrast, i.e., a large ratio of the maximum to the minimum eigenvalue in the (tangent-)stiffness field, and failed to converge at all for the case of infinite material contrast, e.g., pores and voids. This motivated further research efforts on the algorithmic foundations of FFT-based methods, especially in the areas of discretizations and solvers.
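To make the structure of such a fixed-point scheme concrete, the following is a minimal sketch for a simplified scalar analog (periodic heat conduction on a 2D pixel image) rather than the mechanical problem treated in this thesis. The function name `basic_scheme_conductivity`, the reference conductivity `k0 = (k_min + k_max) / 2`, the stopping criterion and the restriction to odd grid sizes (which avoids special treatment of the Nyquist frequencies) are assumptions made for illustration only. With this choice of `k0`, the iteration contracts at least with the factor `(k_max - k_min) / (k_max + k_min)`, which approaches one as the contrast grows.

```python
import numpy as np

def basic_scheme_conductivity(k, E, tol=1e-8, max_iter=5000):
    """Fixed-point iteration in the spirit of the basic scheme (scalar analog).

    k : (n, n) array of local conductivities, n odd (assumption of this sketch)
    E : prescribed mean temperature gradient, shape (2,)
    Returns the mean flux and the number of iterations performed.
    """
    k = np.asarray(k, dtype=float)
    E = np.asarray(E, dtype=float)
    n = k.shape[0]
    assert k.shape == (n, n) and n % 2 == 1, "sketch assumes an odd, square grid"

    k0 = 0.5 * (k.min() + k.max())                 # reference conductivity (assumed choice)
    freq = np.fft.fftfreq(n, d=1.0 / n)            # integer frequencies
    xi = np.stack(np.meshgrid(freq, freq, indexing="ij"))  # shape (2, n, n)
    xi_sq = np.sum(xi**2, axis=0)
    xi_sq[0, 0] = 1.0                              # dummy value, zero mode is reset below

    e = np.tile(E[:, None, None], (1, n, n))       # start from the uniform gradient
    for it in range(1, max_iter + 1):
        q = k * e                                  # local constitutive law, pixel by pixel
        q_hat = np.fft.fft2(q, axes=(1, 2))
        # Project the flux onto gradient fields, scaled by 1/k0 (Green's operator)
        corr_hat = xi * np.sum(xi * q_hat, axis=0) / (k0 * xi_sq)
        corr_hat[:, 0, 0] = 0.0                    # keep the prescribed mean gradient
        corr = np.fft.ifft2(corr_hat, axes=(1, 2)).real
        e -= corr
        # Stop once the update is small relative to the applied gradient
        if np.sqrt(np.mean(np.sum(corr**2, axis=0))) <= tol * np.linalg.norm(E):
            break
    return np.mean(k * e, axis=(1, 2)), it

# Hypothetical test image: matrix with a centred square inclusion, contrast 10
n = 31
k = np.ones((n, n))
k[10:21, 10:21] = 10.0
q_eff, iterations = basic_scheme_conductivity(k, E=[1.0, 0.0])
```

The only operations per iteration are a pixel-wise evaluation of the constitutive law and a pair of FFTs, so no system matrix is assembled. The computed effective flux `q_eff[0]` should lie between the harmonic and the arithmetic mean of `k`, which serves as a simple plausibility check.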

Alternative discretizations, such as finite differences (Willot, 2015; Schneider et al., 2016), finite volumes (Dorn and Schneider, 2019) and finite elements (Schneider et al., 2017; Leuschner and Fritzen, 2018), were initially introduced to reduce the oscillations associated with the original discretization by trigonometric polynomials. More importantly, it was realized that the convergence behavior for problems with infinite contrast was not only a matter of the solution scheme but depended critically on the choice of discretization. Indeed, under some regularity assumptions on the underlying microstructure, convergence of the basic scheme (and related methods) could be established for finite difference and finite element discretizations (Schneider, 2020b).

Many successful FFT-algorithms build directly upon the basic scheme and the associated Lippmann-Schwinger equation. In this context, Zeman et al. (2010) introduced Krylov-subspace solvers, displaying excellent performance for linear elastic problems. Their application was extended to nonlinear problems in the form of Newton-Krylov methods (Gélébart and Mondon-Cancel, 2013; Kabel et al., 2014) or nonlinear conjugate gradients (Schneider, 2020a). By exposing the basic scheme as a gradient descent method (Kabel et al., 2014), the toolbox of modern optimization algorithms (Boyd and Vandenberghe, 2004; Nocedal and Wright, 1999) was made available to FFT-based micromechanics. Momentum-based fast gradient methods (Schneider, 2017a; Ernesti et al., 2020) were shown to considerably improve upon the performance of vanilla gradient descent. As tangent-free alternatives to Newton's method, Quasi-Newton approaches entered FFT-based micromechanics in the form of Anderson acceleration (Shanthraj et al., 2015; Chen et al., 2019a;b) and the Barzilai-Borwein step size (Schneider, 2019a), see Ch. 3 for further details.

In contrast to the Lippmann-Schwinger solvers, which operate on displacements or compatible strain-fields, Eyre and Milton (1999) proposed an accelerated scheme with a polarization as primary variable. Initially formulated for conductivity problems, the Eyre-Milton method was adapted to linear elasticity by Michel et al. (2001) and proved to converge much faster than the basic scheme. In the same study, Michel et al. (2001) proposed an augmented Lagrangian version of the cell problem and solved it with ADMM. As an algorithm for constrained nonlinear optimization, ADMM appeared to share no connection to the Eyre-Milton method, which was motivated by series acceleration techniques and restricted to linear problems. Remarkably, for the case of linear elasticity, Moulinec and Silva (2014) identified both methods as members of a general family of polarization schemes by Monchiet and Bonnet (2012) and provided convergence estimates. The results were extended to the nonlinear setting by connecting the polarization methods to the classical Douglas-Rachford splitting (Schneider et al., 2019). Overall, for strongly convex problems, polarization methods combine excellent performance with a low memory footprint. However, the unclear choice of algorithmic parameters for infinitely contrasted problems limits their flexibility compared to Lippmann-Schwinger solvers, see Ch. 6 for a detailed discussion and a proposed remedy.

Based on these advances in discretization and solver technology, FFT-based schemes have found application in a wide variety of problem settings. Examples include polycrystals at small (Lebensohn et al., 2012) and finite strains (Eisenlohr et al., 2013), stress localization (Rollett et al., 2010), slip band formation (Marano et al., 2019; Marano and Gélébart, 2020), fatigue-lifetime estimation (Lucarini and Segurado, 2019), damage (Boeff et al., 2015) and fracture mechanics (Chen et al., 2019b), electro-mechanically coupled materials (Vidyasagar et al., 2017), the mantle flow of geophysical minerals (Castelnau et al., 2008), homogenization of the elastic (Schneider, 2017b; Görthofer et al., 2020) and rate-dependent (Staub et al., 2018) behavior of fiber-reinforced composites, the anisotropic thermoelastic behavior of explosive materials (Gasnier et al., 2015) and concurrent multi-scale simulations (Kochmann et al., 2018; Göküzüm et al., 2019). For a broader overview of practical applications, we refer to (Schneider, 2021, Sec. 5). Segurado et al. (2018) and Lebensohn and Rollett (2020) provide reviews focusing on polycrystalline materials. An overview of modern multiscale approaches, where FFT-methods may enter as solver on the microscale, is given by Matouš et al. (2017).

## **1.3 Originality and outline**

**Chapter 2** This chapter briefly establishes the fundamentals of small-strain continuum mechanics, serving as the basic framework of this thesis. In particular, we review the kinematic assumptions, the underlying balance equations and thermodynamic restrictions on the material behavior. On this basis, we revisit the periodic cell problem of computational micromechanics. The equivalent reformulations of the problem in the form of the Lippmann-Schwinger equation and the Eyre-Milton equation are introduced, each serving as the starting point for distinct FFT-based solution algorithms. By embedding the problem in a variational framework, we draw the connection from FFT-based methods to classical solvers of convex optimization. Note that we do not claim any originality for the contents of this chapter. Instead, we seek to provide additional context and a basic framework for the following studies.

**Chapter 3** This chapter is devoted to investigating the power of Quasi-Newton methods in the context of FFT-based micromechanics. More precisely, we propose two novel algorithms exploiting the BFGS Hessian approximation, leading to fast tangent-free solvers. In this context, we discuss suitable line search criteria and forcing term strategies for inexact (Quasi-)Newton methods. In numerical experiments, we compare the performance and convergence behavior of the newly proposed algorithms to modern Lippmann-Schwinger solvers. The results reflect the strengths and weaknesses of the different algorithms and show which solvers excel for the special cases of computationally cheap and expensive material laws.

**Chapter 4** In contrast to the last chapter which dealt with general purpose methods, we consider the special case of small-strain single crystal elasto-viscoplasticity. Based on the observation that evaluating the inverse constitutive law is less costly for some formulations of the material model, we propose solving the associated cell problem in a stress-based framework. To this end, we revisit both the primal and dual variational setting and show their equivalence for arbitrary mixed boundary conditions. Numerical experiments demonstrate that the performance of FFT-based methods improves by about an order of magnitude with respect to computation time in the stress-based formulation.

**Chapter 5** The interplay between temperature and deformation may have a significant impact on the effective behavior of microstructured materials under thermomechanical loadings. Based on the framework of asymptotic homogenization by Chatzigeorgiou, we propose an implicit staggered scheme for thermomechanically coupled problems which is compatible with arbitrary strain or displacement-based micromechanics solvers. Exploiting the homogeneity of the temperature on the microscale, the proposed approach preserves the computational power of FFT-based schemes by introducing little overhead. As a particularly challenging example with strong temperature sensitivity and pronounced thermomechanical coupling, we consider the case of glass-fiber reinforced polypropylene to demonstrate the efficiency of our approach.

**Chapter 6** Having thoroughly investigated modern Lippmann-Schwinger solvers, we turn to polarization-based methods. In earlier studies, these algorithms have proven to be very fast and memory efficient. However, their use as general-purpose solvers is limited by their sensitivity to the choice of algorithmic parameters. To tackle this problem, we propose combining polarization-based schemes with Anderson acceleration, resulting in a fast and robust algorithm which is competitive with the fastest Lippmann-Schwinger solvers. In particular, Anderson acceleration leads to a vastly improved convergence behavior for problems with infinite material contrast, where polarization-based schemes have typically struggled.

**Chapter 7** Following the previous method-driven chapters, we consider an application-oriented problem. More precisely, we use FFT-based methods to thoroughly investigate the creep behavior of cellular NiAl-Mo alloys. To this end, we build upon the studies by Albiez et al. (2016a) and Seemüller et al. (2013) to formulate a surrogate model for the well-aligned creep resistant regions and generate suitable microstructures using a level set approach. The simulations shed light on the proper classification of soft intercellular regions, which are the root cause for the notable loss of creep resistance in the cellular material. In addition, the impact of cell volume fraction and aspect ratio on the effective creep rate is identified, improving upon coarser analytical estimates.

**Chapter 8** Last but not least, we summarize our most important findings and close with some concluding remarks.

## **1.4 Remarks on the notation**

In the present manuscript, newly introduced quantities are defined upon the first appearance in each chapter. Where appropriate, this includes the explicit expression and details such as function space, domain of definition and tensor order. Note that, in general, the latter information is *not* implicitly encoded in the notation, for instance, by specific typesets or markers. Tensor contractions are marked by dots, i.e., a single tensor contraction is denoted by $\cdot$, a double tensor contraction reads $:$ and $::$ is a quadruple tensor contraction. For instance, with scalars $a$, $b$, $c$, vectors $u$, $v$, second order tensors $A$, $B$ and fourth order tensors $\mathbb{C}$, $\mathbb{D}$, the expression $a = u \cdot v$ is equivalent to $a = u_i v_i$, $u = A \cdot v$ is equivalent to $u_i = A_{ij} v_j$, $b = A : B$ is equivalent to $b = A_{ij} B_{ij}$, $A = \mathbb{C} : B$ is equivalent to $A_{ij} = \mathbb{C}_{ijkl} B_{kl}$ and $c = \mathbb{C} :: \mathbb{D}$ is equivalent to $c = \mathbb{C}_{ijkl} \mathbb{D}_{ijkl}$, using the summation convention and index notation. The transposition of a second order tensor $A$ is denoted by $A^{\mathsf{T}}$. $I$ stands for the identity. The tensor product is defined by $(u \otimes v) \cdot w = (v \cdot w)\, u$ for vectors $u$, $v$, $w$, and its symmetrized version reads $u \otimes_{\mathrm{s}} v = 1/2\,(u \otimes v + v \otimes u)$. $\mathrm{Sym}(d)$ stands for the space of symmetric second order tensors on $\mathbb{R}^d$ and linear operators on $\mathrm{Sym}(d)$ are denoted by $L(\mathrm{Sym}(d))$. Note that elements of $L(\mathrm{Sym}(d))$, when interpreted as fourth order tensors, are endowed with the left and right minor symmetries. Throughout this manuscript, we operate in Cartesian coordinates. Thus, for ease of exposition, we do not particularly emphasize the distinction between tensors and matrices in most of the text. Note, however, that in a broader continuum mechanics context, the concept of tensors as basis-independent quantities cannot be neglected in general (Bertram, 2011).
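As an illustration of the contraction conventions, the following sketch spells out each product in index notation with `numpy.einsum` in Cartesian components. The variable names (`u`, `v`, `w` for vectors, `A`, `B` for second order tensors, `C`, `D` for fourth order tensors) and the random test data are chosen here purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
u, v, w = rng.standard_normal((3, d))        # three vectors
A, B = rng.standard_normal((2, d, d))        # two second order tensors
C, D = rng.standard_normal((2, d, d, d, d))  # two fourth order tensors

a = np.einsum("i,i->", u, v)                 # single contraction u . v   -> scalar
y = np.einsum("ij,j->i", A, v)               # single contraction A . v   -> vector
b = np.einsum("ij,ij->", A, B)               # double contraction A : B   -> scalar
S = np.einsum("ijkl,kl->ij", C, B)           # double contraction C : B   -> second order tensor
c = np.einsum("ijkl,ijkl->", C, D)           # quadruple contraction C :: D -> scalar

uv = np.einsum("i,j->ij", u, v)              # tensor product u (x) v
uv_sym = 0.5 * (uv + uv.T)                   # symmetrized tensor product
```

Writing the index string explicitly mirrors the summation convention: every index that appears on the left of `->` but not on the right is summed over.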

# **Chapter 2**

# **Fundamentals**

## **2.1 Elementary continuum mechanics**

The following sections give a brief introduction to the theory of small-strain continuum mechanics, serving as the fundamental framework throughout this manuscript. Starting with basic kinematics, we specify the assumptions for the small-strain setting. Subsequently, the balance equations, forming the basis of thermomechanical boundary value problems, are established. Last but not least, we discuss common thermodynamical restrictions on material laws and introduce generalized standard materials as a convenient framework for material modeling. For further details on the continuum mechanical background, we refer to the monographs by Šilhavý (1997), Liu (2002), Haupt (2002) and Bertram (2011).

#### **2.1.1 Kinematics**

Let Ω<sub>0</sub> ⊆ ℝ<sup>d</sup> be the space occupied by a body in an arbitrary reference placement. In this manuscript, we mostly consider three-dimensional problems, i.e., *d* = 3. The material points of the body are labeled by their reference position *X* ∈ Ω<sub>0</sub> (Šilhavý, 1997). The motion of the body is described by the bijective function

$$\chi: \Omega\_0 \times [0, T] \to \mathbb{R}^d, \quad (X, t) \mapsto \chi(X, t), \tag{2.1}$$

which maps material points to their current position

$$x = \chi(X, t). \tag{2.2}$$

The associated current placement of the body reads (Šilhavý, 1997)

$$\Omega\_t = \{x = \chi(X, t) \mid X \in \Omega\_0\}.\tag{2.3}$$

In general, any tensor field Ξ on the material body may be parameterized in terms of the reference placement, Ξ<sub>L</sub> : Ω<sub>0</sub> × [0, *T*] → ℝ<sup>d×···×d</sup> (Lagrangian description), or in terms of the current placement, Ξ<sub>E</sub> : Ω<sub>t</sub> × [0, *T*] → ℝ<sup>d×···×d</sup> (Eulerian description), with (Haupt, 2002)

$$
\Xi\_{\mathrm{L}}(X,t) = \Xi\_{\mathrm{E}}(\chi(X,t),t),
\tag{2.4}
$$

$$
\Xi\_{\mathrm{E}}(x,t) = \Xi\_{\mathrm{L}}(\chi^{-1}(x,t),t). \tag{2.5}
$$

For better readability, the subscripts are only written out where we wish to emphasize the parameterization. Otherwise, the parameterization is implied by the argument.

The so-called material time derivative of a function is defined as the partial time derivative for a fixed reference placement (Haupt, 2002)

$$\left. \dot{\mathbf{(\cdot)}} = \frac{\partial \mathbf{(\cdot)}}{\partial t} \right|\_{X}. \tag{2.6}$$

Consequently, the velocity and the acceleration of the material body are given by

$$v(X,t) = \dot{\chi}(X,t), \quad a(X,t) = \ddot{\chi}(X,t). \tag{2.7}$$

For an Eulerian field Ξ<sub>E</sub>(*x*, *t*), the material time derivative reads (Haupt, 2002)

$$
\dot{\Xi}\_{\rm E}(x,t) = \frac{\partial \Xi\_{\rm E}}{\partial t}(x,t) + \frac{\partial \Xi\_{\rm E}}{\partial x}(x,t) \cdot v\_{\rm E}(x,t). \tag{2.8}
$$

The first spatial derivative of the motion is denoted by *F* : Ω<sub>0</sub> × [0, *T*] → ℝ<sup>d×d</sup> and referred to as the deformation gradient (Haupt, 2002)

$$F(X,t) = \frac{\partial \chi}{\partial X}(X,t). \tag{2.9}$$

In particular, *F* maps infinitesimal line, area and volume elements d*X*, d*A*, d*V* in the reference configuration Ω<sub>0</sub> to the respective elements d*x*, d*a*, d*v* in the current configuration Ω<sub>t</sub> (Haupt, 2002)

$$\mathrm{d}x = F \cdot \mathrm{d}X, \quad \mathrm{d}a = \det(F)F^{-\mathrm{T}} \cdot \mathrm{d}A, \quad \mathrm{d}v = \det(F)\,\mathrm{d}V. \tag{2.10}$$

For physically meaningful deformations, it is generally assumed that det(*F*) > 0 to avoid compression to zero or even negative volume. Let Sym(*d*) stand for the space of symmetric second order tensors of dimension *d* and denote the associated subset of symmetric and positive definite tensors by Sym<sup>+</sup>(*d*). Abusing notation, any deformation gradient *F* = *F*(*X*, *t*) may be split

$$F = R \cdot U = V \cdot R \tag{2.11}$$

into a symmetric and positive definite part *U*, *V* ∈ Sym<sup>+</sup>(*d*) and a proper orthogonal part *R* ∈ SO(*d*) (Haupt, 2002). The left and right stretch tensors *V* and *U* share the same eigenvalues, which are identified with the principal stretches. The tensor *R* refers to the mean rotation.

In the undeformed state, the principal stretches are equal to one, i.e., *U* = *V* = I. In engineering, strain measures which are zero in the undeformed state are commonly used. The family of Seth-Hill strains (Seth, 1961; Hill, 1968), defined by

$$E^{\text{Seth}} = f(U) \tag{2.12}$$

and the scalar function

$$f(\lambda) = \begin{cases} \frac{1}{m}(\lambda^m - 1) & m \in \mathbb{R} \backslash \{0\}, \\ \ln(\lambda) & m = 0, \end{cases} \tag{2.13}$$

covers many common strain measures, such as the Hencky strain (*m* = 0), Biot strain (*m* = 1) and Green strain (*m* = 2).
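To illustrate (2.12) and (2.13), the following minimal Python sketch evaluates a Seth-Hill strain from a given deformation gradient via the spectral decomposition of the right Cauchy-Green tensor. The function name `seth_hill_strain` and the sample deformation gradient are assumptions made for this example and are not taken from the original work; NumPy is assumed to be available.

```python
import numpy as np

def seth_hill_strain(F, m):
    """Seth-Hill strain f(U) of (2.12)-(2.13) for a deformation gradient F."""
    C = F.T @ F                        # right Cauchy-Green tensor, equal to U @ U
    lam_sq, N = np.linalg.eigh(C)      # squared principal stretches and eigenvectors
    lam = np.sqrt(lam_sq)              # principal stretches
    f = np.log(lam) if m == 0 else (lam**m - 1.0) / m
    return (N * f) @ N.T               # sum_i f(lam_i) n_i (x) n_i

# Hypothetical deformation gradient: stretch combined with simple shear
F = np.array([[1.10, 0.20, 0.00],
              [0.00, 0.90, 0.00],
              [0.00, 0.00, 1.05]])

E_hencky = seth_hill_strain(F, 0)
E_biot = seth_hill_strain(F, 1)
E_green = seth_hill_strain(F, 2)
print(np.round(E_green, 4))
```

For *m* = 2, the result should coincide with the familiar expression 1/2 (*F*<sup>T</sup> · *F* − I), which offers a simple consistency check of the spectral evaluation.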

To consider the geometrically linear setting (Liu, 2002), we introduce the displacement *u* : Ω<sub>0</sub> × [0, *T*] → ℝ<sup>d</sup> defined by

$$u(X,t) = \chi(X,t) - X \tag{2.14}$$

and the associated displacement gradient *H* : Ω<sub>0</sub> × [0, *T*] → ℝ<sup>d×d</sup>

$$H(X,t) = \frac{\partial u}{\partial X}(X,t) = F(X,t) - \text{I.}\tag{2.15}$$

For small deformations, it is assumed that the Frobenius norm ‖·‖ of all displacement gradients *H* = *H*(*X*, *t*) is small

$$\|H\| \ll 1.\tag{2.16}$$

Linearization around *H* = 0 yields (Liu, 2002)

$$E^{\text{Seth}} = \varepsilon,\tag{2.17}$$

$$U = \mathbf{I} + \varepsilon,\tag{2.18}$$

$$R = \mathbf{I} + \omega,\tag{2.19}$$

with the infinitesimal strain

$$
\varepsilon = \frac{1}{2}(H + H^T) \tag{2.20}
$$

and the infinitesimal rotation

$$
\omega = \frac{1}{2}(H - H^T) \tag{2.21}
$$

as symmetric and skew-symmetric parts of *H*, respectively. In addition, it is typically assumed that the displacement is small as well, so that *x* ≈ *X*. Thus, the distinction between Lagrangian and Eulerian parameterization vanishes and the material time derivative reduces to the partial time derivative (Haupt, 2002).
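The quality of the linearization (2.17)-(2.19) can be checked numerically. The following sketch, which assumes NumPy and uses a hypothetical small displacement gradient, computes the polar decomposition (2.11) via a singular value decomposition and compares the stretch and rotation with I + ε and I + ω. The helper `polar_decomposition` is introduced for this example only.

```python
import numpy as np

def polar_decomposition(F):
    """Return (R, U) with F = R @ U, R orthogonal and U symmetric positive definite."""
    W, S, Vt = np.linalg.svd(F)        # F = W @ diag(S) @ Vt
    R = W @ Vt                         # proper orthogonal if det(F) > 0
    U = Vt.T @ np.diag(S) @ Vt         # right stretch tensor
    return R, U

# Hypothetical displacement gradient with Frobenius norm of about 2.6e-3
H = 1e-3 * np.array([[1.0, 2.0, 0.0],
                     [-1.0, 0.5, 0.3],
                     [0.4, 0.0, -0.7]])
F = np.eye(3) + H

eps = 0.5 * (H + H.T)                  # infinitesimal strain (2.20)
omega = 0.5 * (H - H.T)                # infinitesimal rotation (2.21)

R, U = polar_decomposition(F)
err_U = np.linalg.norm(U - (np.eye(3) + eps))
err_R = np.linalg.norm(R - (np.eye(3) + omega))
print(err_U, err_R)                    # both are of second order in the norm of H
```

Both deviations scale quadratically with ‖*H*‖, which is the sense in which the distinction between the strain measures vanishes in the small-strain setting.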

#### **2.1.2 Balance equations**

The (thermo)mechanical behavior of a body is governed by physical laws in the form of balance equations. For specified external loadings, these equations give rise to boundary value problems which may, in turn, be solved either analytically or numerically. The general integral balance of an arbitrary tensor field Ξ over any regular bounded subregion of a body *P*<sub>t</sub> ⊂ Ω<sub>t</sub> with boundary ∂*P*<sub>t</sub> reads (Liu, 2002)

$$\frac{d}{dt} \int\_{P\_t} \Xi \,\mathrm{d}v = \int\_{\partial P\_t} q\_{\Xi} \cdot n \,\mathrm{d}a + \int\_{P\_t} p\_{\Xi} + s\_{\Xi} \,\mathrm{d}v,\tag{2.22}$$

where the non-convective flux *q*<sub>Ξ</sub> of Ξ is one tensor order above Ξ, and the internal production *p*<sub>Ξ</sub> and the external supply *s*<sub>Ξ</sub> have the same tensor order as Ξ. Applying Reynolds' transport theorem and the divergence theorem yields the local form in regular points

$$\frac{\partial \Xi}{\partial t} + \text{div } (\Xi \otimes v) = \text{div } q\_{\Xi} + p\_{\Xi} + s\_{\Xi},\tag{2.23}$$

as (2.22) has to hold for arbitrary *P*<sub>t</sub> (Liu, 2002). For simplicity of exposition, we do not consider singular surfaces and the associated jump conditions.

**Mass** For mass conservation, Ξ is identified with the mass density ρ : Ω<sub>t</sub> × [0, *T*] → ℝ, and production, supply and flux are zero (Liu, 2002). Thus, the local balance reads

$$
\dot{\rho} + \rho \text{div } v = 0. \tag{2.24}
$$

Note that in continuum solid mechanics, the balance of mass is typically not explicitly considered. Indeed, for a given deformation gradient *F* and mass density ρ<sub>0</sub> in the reference configuration, the current mass density may be computed by ρ = det(*F*)<sup>−1</sup> ρ<sub>0</sub>. Similarly, in the small-strain context, ρ = (1 − tr(ε)) ρ<sub>0</sub> holds. However, owing to (2.16), the density is often approximated as constant in time, ρ ≈ ρ<sub>0</sub>.

**Linear and angular momentum** With the linear momentum density ρ*v* as balanced field, the volume force density *b* : Ω<sub>t</sub> × [0, *T*] → ℝ<sup>d</sup> as supply term and the Cauchy stress tensor σ : Ω<sub>t</sub> × [0, *T*] → ℝ<sup>d×d</sup> as flux, the balance of linear momentum reads (Liu, 2002)

$$
\rho a = \text{div } \sigma + b.\tag{2.25}
$$

Note that, in this manuscript, we restrict to the quasi-static setting where the acceleration term vanishes. Under the assumption that the balance of linear momentum (2.25) holds, the balance of angular momentum may be condensed to

$$
\sigma = \sigma^{\mathrm{T}},
\tag{2.26}
$$

i.e., the symmetry of the stress tensor σ(*x*, *t*) ∈ Sym(*d*) (Liu, 2002).

**Energy** The total energy density is comprised of the internal energy density<sup>1</sup> *e* : Ω<sub>t</sub> × [0, *T*] → ℝ and the kinetic energy density 1/2 ρ *v* · *v*. Thus, the conservation of energy, known as the first law of thermodynamics, reads

<sup>1</sup> From a physical viewpoint, modeling the mass specific internal energy *ẽ* = *e*/ρ is preferable. However, in a small-strain context, where ρ may be approximated as a constant conversion factor, using the volume density is more convenient. The same holds for the entropy *s* and the free energy ψ.

$$\dot{e} + \frac{1}{2}\rho(v \cdot v)^{\cdot} = b \cdot v + w + \text{div}\,(\sigma^{\text{T}} \cdot v) - \text{div } q,\tag{2.27}$$

where the supply term consists of internal heat sources *w* : Ω<sub>t</sub> × [0, *T*] → ℝ and the power of the volume forces *b* · *v*, and the flux is given by the negative heat flux −*q* and the mechanical power σ<sup>T</sup> · *v*. By subtracting the balance of linear momentum (2.25) multiplied by the velocity, the conservation of total energy may be condensed to the balance of internal energy

$$
\dot{e} = -\text{div}\, q + w + \sigma : \dot{\varepsilon}. \tag{2.28}
$$

**Entropy** With the entropy density *s* : Ω<sub>t</sub> × [0, *T*] → ℝ, the generic entropy balance reads

$$
\dot{s} = \text{div}\, q\_s + p\_s + s\_s. \tag{2.29}
$$

The second law of thermodynamics

$$p\_s \ge 0,\tag{2.30}$$

states that the entropy production may never be negative, thereby restricting the direction of physical processes (Lebon et al., 2008). Note that thermodynamical theories sometimes differ in their assumptions on the flux and supply as either fixed or constitutive quantities, see Lebon et al. (2008) or Cimmelli et al. (2014) for an overview. In the next Sec. 2.1.3, we follow Coleman and Noll (1963) in the context of rational thermodynamics. Note, however, that more general approaches for exploiting the entropy balance exist (Liu, 1972).

#### **2.1.3 Thermodynamic restrictions**

The balance equations (2.24)-(2.29) are assumed to hold universally. For predicting the (thermo)mechanical behavior of specific materials, their properties have to be encoded in the form of constitutive equations for the energies and fluxes (Liu, 2002, Sec. 8.4). In the framework of rational thermodynamics, the second law of thermodynamics is interpreted as a restriction on these constitutive relations, i.e., the material laws should be formulated so that (2.30) holds identically. Material models conforming to this restriction are called *thermodynamically consistent*. For evaluating the implications of the second law, Coleman and Noll (1963) proposed a systematic approach based on the Clausius-Duhem inequality, which has been widely adopted in modern continuum mechanics. In the following, we give a brief summary for the case of solids with internal variables at small strains. Coleman and Noll (1963) rely on the constitutive assumptions *s*<sub>s</sub> = *w*/θ for the entropy supply and *q*<sub>s</sub> = −*q*/θ for the entropy flux, where θ : Ω<sub>t</sub> × [0, *T*] → ℝ<sub>>0</sub> is the absolute temperature. For this specific formulation of the entropy balance, equations (2.28)-(2.30) may be combined to yield the Clausius-Duhem inequality

$$
\theta \dot{s} - \dot{e} + \sigma : \dot{\varepsilon} - \frac{1}{\theta} q \cdot \nabla \theta \ge 0 \tag{2.31}
$$

with the temperature gradient ∇θ : Ω<sub>t</sub> × [0, *T*] → ℝ<sup>d</sup>. Let *z* : Ω<sub>t</sub> × [0, *T*] → *Z* denote an array of internal variables with an associated vector space *Z* which is assumed to be sufficiently large. For specifying the material behavior, we assume that the Helmholtz free energy ψ, related to the internal energy by

$$e = \psi + \theta s,\tag{2.32}$$

has the form

$$\psi: \text{Sym}(d) \times \mathbb{R}\_{>0} \times \mathbb{R}^d \times Z \to \mathbb{R},\tag{2.33}$$

$$(\varepsilon, \theta, \nabla \theta, z) \mapsto \psi(\varepsilon, \theta, \nabla \theta, z). \tag{2.34}$$

Inserting the Helmholtz free energy (2.32) into the Clausius-Duhem inequality (2.31) yields

$$\begin{split} \left( \sigma - \frac{\partial \psi}{\partial \varepsilon} (\varepsilon, \theta, \nabla \theta, z) \right) &: \dot{\varepsilon} - \left( \frac{\partial \psi}{\partial \theta} (\varepsilon, \theta, \nabla \theta, z) + s \right) \dot{\theta} \\ - \frac{\partial \psi}{\partial \nabla \theta} (\varepsilon, \theta, \nabla \theta, z) \cdot \nabla \dot{\theta} - \frac{\partial \psi}{\partial z} (\varepsilon, \theta, \nabla \theta, z) \cdot \dot{z} - \frac{1}{\theta} q \cdot \nabla \theta \ge 0, \end{split} \tag{2.35}$$

assuming that ψ is sufficiently smooth in all arguments. Note that (2.35) has to hold for arbitrary physical processes. As, in principle, any path may be realized for ε̇, θ̇ and ∇θ̇ by choosing suitable (experimental) boundary conditions, the terms linear in these quantities must vanish. In particular, the free energy is independent of the temperature gradient

$$\frac{\partial \psi}{\partial \nabla \theta}(\varepsilon, \theta, \nabla \theta, z) = 0 \tag{2.36}$$

and, therefore, ∇θ is removed from the argument list in the following. In addition, we obtain potential relations for the stress

$$
\sigma = \frac{\partial \psi}{\partial \varepsilon}(\varepsilon, \theta, z) \tag{2.37}
$$

and entropy

$$s = -\frac{\partial \psi}{\partial \theta}(\varepsilon, \theta, z). \tag{2.38}$$

For simplicity, the terms in the residual inequality are commonly treated separately

$$-\frac{\partial\psi}{\partial z}(\varepsilon,\theta,z) \cdot \dot{z} \ge 0,\tag{2.39}$$

$$-q \cdot \nabla \theta \geq 0,\tag{2.40}$$

where, in the spirit of linear irreversible thermodynamics, the heat flux term may be covered by assuming Fourier's law

$$q = -\kappa \nabla \theta \tag{2.41}$$

with a positive definite thermal conductivity tensor κ ∈ Sym<sup>+</sup>(*d*). To conclude, suitable evolution equations, respecting the inequality (2.39), have to be supplied for the internal variables in addition to a free energy to complete a thermodynamically consistent material model.

#### **2.1.4 Generalized standard materials**

A widely adopted framework for thermodynamically consistent material models is the two-potential formulation of generalized standard materials (GSMs) (Halphen and Nguyen, 1975; Germain et al., 1983). A GSM is described by a convex free energy (2.32), and a convex and non-negative dissipation potential

$$
\phi: \mathbb{R}\_{>0} \times Z \to \mathbb{R}\_{\geq 0}, \quad (\theta, \dot{z}) \mapsto \phi(\theta, \dot{z}).\tag{2.42}
$$

with φ(θ, 0) = 0. The relation between the dissipation potential and the thermodynamical driving forces 𝒜 ≡ −∂ψ/∂*z*(ε, θ, *z*), living in the continuous dual space *Z*<sup>∗</sup> of *Z*, is expressed via Biot's equation

$$\mathcal{A} \in \partial\_{\dot{z}} \phi(\theta, \dot{z}). \tag{2.43}$$

Here, ∂<sub>ż</sub>φ stands for the subdifferential of φ with respect to ż, defined by

$$\partial\_{\dot{z}}\phi(\theta,\dot{z}) = \left\{ \mathcal{A} \in Z^\* \mid \phi(\theta,\dot{y}) - \phi(\theta,\dot{z}) \ge \mathcal{A} \cdot (\dot{y} - \dot{z}), \forall \dot{y} \in Z \right\}, \tag{2.44}$$

see Rockafellar (1970, Sec. 23). Thus, using our initial assumptions on φ and choosing ẏ = 0, the above definition (2.44) yields

$$\mathcal{A} \cdot \dot{z} \ge \phi(\theta, \dot{z}) \ge 0,\tag{2.45}$$

demonstrating that the residual inequality (2.39) holds. Equivalently, GSMs may be formulated in terms of the force potential

$$\phi^\*(\mathcal{A}, \theta) = \sup\_{\dot{z}} (\mathcal{A} \cdot \dot{z} - \phi(\theta, \dot{z})),\tag{2.46}$$

so that the evolution equations are given explicitly by

$$
\dot{z} \in \partial\_{\mathcal{A}} \phi^\*(\mathcal{A}, \theta). \tag{2.47}
$$

In addition to being thermodynamically consistent, GSMs enjoy the property that, after a backward Euler time discretization and condensation of the internal variables, they permit expressing the stress in terms of a condensed incremental potential *w* : Sym(*d*) × ℝ<sub>>0</sub> → ℝ

$$
\sigma = \frac{\partial w}{\partial \varepsilon}(\varepsilon, \theta),
\tag{2.48}
$$

which does *not* depend on *z* (Lahellec and Suquet, 2007). Thus, for a fixed time step, a GSM effectively behaves like a nonlinear hyperelastic material. Last but not least, a few synoptic remarks are in order:


• The GSM framework covers a wide range of material models, such as classical *J*<sub>2</sub>-plasticity (Simo and Hughes, 1998) or certain types of crystal plasticity models, see Sec. 4.3.2 or Fritzen and Leuschner (2013). However, it is far from universal and many widely adopted models do not adhere to the two-potential formalism. When discussing specific material models in the later chapters, we indicate cases which are not covered by the theory.
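As a small illustration of the two-potential formalism, consider a one-dimensional Maxwell-type element with free energy ψ(ε, *z*) = *E*/2 (ε − *z*)² and dissipation potential φ(ż) = η/2 ż². This example, the parameter values and the function names below are assumptions chosen for illustration and are not taken from the original work. The sketch performs backward Euler steps of Biot's equation (2.43), monitors the dissipation inequality (2.45) and checks by finite differences that the stress derives from a condensed incremental potential, here assumed to take the standard form *w*(ε) = min<sub>z</sub> [ψ(ε, *z*) + Δ*t* φ((*z* − *z*<sub>n</sub>)/Δ*t*)].

```python
def maxwell_step(eps, z_old, dt, E=1.0, eta=2.0):
    """Backward Euler step for psi = E/2 (eps - z)^2 and phi = eta/2 zdot^2."""
    # Biot's equation: E (eps - z) = eta (z - z_old) / dt, solved for z
    z = (z_old + dt * E / eta * eps) / (1.0 + dt * E / eta)
    sigma = E * (eps - z)                    # stress from the potential relation (2.37)
    dissipation = sigma * (z - z_old) / dt   # driving force times rate, cf. (2.45)
    return sigma, z, dissipation

def incremental_potential(eps, z_old, dt, E=1.0, eta=2.0):
    """Condensed incremental potential evaluated at the minimizing z."""
    _, z, _ = maxwell_step(eps, z_old, dt, E, eta)
    zdot = (z - z_old) / dt
    return 0.5 * E * (eps - z) ** 2 + dt * 0.5 * eta * zdot ** 2

# Relaxation test: hold the strain fixed and record stress and dissipation
eps, dt, z = 0.01, 0.1, 0.0
stresses, dissipations = [], []
for _ in range(50):
    sigma, z, d = maxwell_step(eps, z, dt)
    stresses.append(sigma)
    dissipations.append(d)

# Central finite difference of the condensed potential versus the stress
h, z_old = 1e-6, 0.002
sigma_fd = (incremental_potential(eps + h, z_old, dt)
            - incremental_potential(eps - h, z_old, dt)) / (2.0 * h)
sigma_ref, _, _ = maxwell_step(eps, z_old, dt)
print(sigma_fd, sigma_ref)
```

In this linear setting the internal variable can be eliminated in closed form; for nonlinear dissipation potentials, the update would typically require a local Newton iteration instead.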

## **2.2 FFT-based micromechanics**

This section introduces the basic problem setting for computational micromechanics at small strains, providing the background for the algorithms proposed in Ch. 3 - Ch. 6. In particular, we discuss two reformulations of the periodic cell problem, the Lippmann-Schwinger equation (Zeller and Dederichs, 1973) and the Eyre-Milton equation (Eyre and Milton, 1999), each giving rise to a distinct family of FFT-based solution schemes. In both cases, we interpret the respective methods in the framework of convex optimization, which serves as the natural setting for FFT-based solvers throughout this manuscript. For brevity of exposition, we restrict to the continuous setting. Please note, however, that the choice of discretization (Moulinec and Suquet, 1998; Willot, 2015; Schneider et al., 2016) constitutes an important topic in and of itself, with substantial repercussions on the convergence behavior for problems with infinite material contrast (Schneider, 2020b). For a thorough overview on state-of-the-art FFT-based micromechanics, we refer to the review by Schneider (2021).

#### **2.2.1 Cell problem**

Based on the framework of small-strain continuum mechanics, we specify the periodic cell problem for computing the effective response of heterogeneous materials. For clarity of exposition, we restrict to the purely mechanical setting, i.e., we implicitly assume that the temperature field is homogeneous and constant in time and suppress the temperature dependence of all quantities. The extended framework for thermomechanically coupled materials by Chatzigeorgiou et al. (2016) is summarized in Sec. 5.2. Let *Y* = [0, *L*]<sup>d</sup> be a periodic cell on the microscale and denote the position vector by *x* ∈ *Y*. The material distribution in the cell, i.e., the microstructure, is encoded in the heterogeneous stress operator σ : *Y* × Sym(*d*) → Sym(*d*), (*x*, ε) ↦ σ(*x*, ε). We consider the vector space of periodic and mean free displacement fields

$$\begin{aligned} H^1\_\#(Y; \mathbb{R}^d) &= \{ u \in H^1(Y; \mathbb{R}^d) \, | \, \\ &u \text{ periodic}, \, \partial\_n u \text{ anti-periodic on } \partial Y, \, \langle u \rangle\_Y = 0 \}, \end{aligned} \tag{2.49}$$

where ⟨·⟩<sub>Y</sub> = 1/|*Y*| ∫<sub>Y</sub> (·) d*x* denotes volume averaging over *Y*. For a prescribed macroscopic strain ε̄, we seek a solution *u* ∈ *H*<sup>1</sup><sub>#</sub>(*Y*; ℝ<sup>d</sup>) to the quasi-static balance of linear momentum on the microscale

$$\operatorname{div}\sigma(\cdot,\overline{\varepsilon}+\nabla^{s}u)=0,\tag{2.50}$$

where the volume forces vanish as a result of asymptotic homogenization (Bakhvalov and Panasenko, 1989). Given a solution *u* to (2.50), the macroscopic stress computed by σ̄ = ⟨σ(·, ε̄ + ∇<sup>s</sup>*u*)⟩<sub>Y</sub> constitutes the effective mechanical response of the material to the loading ε̄. For the convenience of the reader, we restrict our exposition to pure strain boundary conditions, see Ch. 4 for the case of mixed boundary conditions following Kabel et al. (2016).

#### **2.2.2 Lippmann-Schwinger equation**

In the context of FFT-based micromechanics, many successful algorithms are based on an equivalent reformulation of (2.50), the so-called Lippmann-Schwinger equation (Zeller and Dederichs, 1973). As a starting point, consider the elastic problem

$$\text{div}\,\mathbb{C}^0: \nabla^s u = -b \tag{2.51}$$

with homogeneous stiffness tensor ℂ<sup>0</sup> ∈ *L*(Sym(*d*)) and a mean-free right hand side *b* ∈ *H*<sup>−1</sup><sub>#</sub>(*Y*; ℝ<sup>d</sup>) in the space of (volume) forces. For solving (2.51), we express *u* and *b* as Fourier series

$$u(x) = \sum\_{\xi \in \mathbb{Z}^d} \hat{u}(\xi) \exp(i \, x \cdot \tilde{\xi}), \quad b(x) = \sum\_{\xi \in \mathbb{Z}^d} \hat{b}(\xi) \exp(i \, x \cdot \tilde{\xi}), \tag{2.52}$$

with ξ̃ = 2πξ/*L*. Recalling that the Fourier coefficients of the divergence of a tensor field and the symmetrized gradient of a vector field are given by

$$
\widehat{\text{div}\,A}(\xi) = i \,\hat{A}(\xi) \cdot \tilde{\xi} \quad \text{and} \quad \widehat{\nabla^s v}(\xi) = i \,\tilde{\xi} \otimes^s \hat{v}(\xi), \tag{2.53}
$$

respectively, the homogeneous problem (2.51) reads

$$\hat{b}(\xi) = \left[ \mathbb{C}^0 : (\tilde{\xi} \otimes^s \hat{u}(\xi)) \right] \cdot \tilde{\xi} \tag{2.54}$$

in Fourier space. For an isotropic stiffness tensor ℂ<sup>0</sup> with Lamé constants μ<sub>0</sub> and λ<sub>0</sub>, equation (2.54) may be rearranged to

$$\hat{u}(\xi) = \left(\frac{1}{\mu\_0 ||\tilde{\xi}||^2} \mathbf{I} - \frac{\mu\_0 + \lambda\_0}{\mu\_0 (2\mu\_0 + \lambda\_0)} \frac{\tilde{\xi} \otimes \tilde{\xi}}{||\tilde{\xi}||^4}\right) \cdot \hat{b}(\xi), \quad \xi \neq 0. \tag{2.55}$$

Hence, the solution operator *G*<sup>0</sup> associated to (2.51), i.e.,

$$\text{div}\,\mathbb{C}^0: \nabla^s u = -b \quad \text{iff} \quad u = -G^0 \cdot b,\tag{2.56}$$

admits the Fourier space representation (Mura, 1987)

$$\hat{G}^{0}(\xi) = \begin{cases} -\left(\frac{1}{\mu\_{0} \|\tilde{\xi}\|^{2}} \mathbf{I} - \frac{\mu\_{0} + \lambda\_{0}}{\mu\_{0} (2\mu\_{0} + \lambda\_{0})} \frac{\tilde{\xi} \otimes \tilde{\xi}}{\|\tilde{\xi}\|^{4}}\right) & \xi \neq 0, \\ 0 & \xi = 0. \end{cases} \tag{2.57}$$
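The closed-form expression (2.57) can be verified numerically. For an isotropic reference material, the right-hand side of (2.54) may be written as *K*(ξ̃) · û with *K*(ξ̃) = μ<sub>0</sub>‖ξ̃‖² I + (μ<sub>0</sub> + λ<sub>0</sub>) ξ̃ ⊗ ξ̃, so that −Ĝ<sup>0</sup> should be the inverse of *K* for every nonzero frequency. The sketch below assumes NumPy; the function names and the sample values for the frequency and the Lamé constants are hypothetical.

```python
import numpy as np

def acoustic_tensor(xi, mu0, lam0):
    """K(xi) such that K @ u = [C0 : (xi (x)^s u)] . xi for isotropic C0."""
    return mu0 * (xi @ xi) * np.eye(xi.size) + (mu0 + lam0) * np.outer(xi, xi)

def green_operator_hat(xi, mu0, lam0):
    """Fourier coefficient of G^0 following (2.57); zero for the mean mode."""
    n2 = xi @ xi
    if n2 == 0.0:
        return np.zeros((xi.size, xi.size))
    return -(np.eye(xi.size) / (mu0 * n2)
             - (mu0 + lam0) / (mu0 * (2.0 * mu0 + lam0)) * np.outer(xi, xi) / n2**2)

xi = np.array([1.0, -2.0, 0.5])          # hypothetical scaled frequency vector
mu0, lam0 = 2.0, 3.0                     # hypothetical Lame constants
K = acoustic_tensor(xi, mu0, lam0)
G_hat = green_operator_hat(xi, mu0, lam0)
print(np.allclose(-K @ G_hat, np.eye(3)))
```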

For simplicity, we restrict to reference materials which are a multiple of the identity, i.e., ℂ<sup>0</sup> = α<sub>0</sub> I with μ<sub>0</sub> = α<sub>0</sub>/2 and λ<sub>0</sub> = 0. Subtracting div ℂ<sup>0</sup> : ∇<sup>s</sup>*u* on both sides of equation (2.50), the original problem can be recast in the form of (2.51)

$$\operatorname{div} \mathbb{C}^0 : \nabla^s u = -\operatorname{div} \left[ \sigma(\cdot, \overline{\varepsilon} + \nabla^s u) - \mathbb{C}^0 : (\overline{\varepsilon} + \nabla^s u) \right] \tag{2.58}$$

with *b* = div [σ(·, ε̄ + ∇<sup>s</sup>*u*) − ℂ<sup>0</sup> : (ε̄ + ∇<sup>s</sup>*u*)]. Thus, using the property (2.56), the solution of (2.50) may be expressed by

$$u = -G^0 \text{div} \left[ \sigma(\cdot, \overline{\varepsilon} + \nabla^s u) - \mathbb{C}^0 : (\overline{\varepsilon} + \nabla^s u) \right]. \tag{2.59}$$

Taking the symmetrized gradient of (2.59) and adding the macroscopic strain yields the Lippmann-Schwinger equation

$$\varepsilon = \overline{\varepsilon} - \Gamma^0 : \left( \sigma(\cdot, \varepsilon) - \mathbb{C}^0 : \varepsilon \right) \tag{2.60}$$

where the total strain ε = ε̄ + ∇<sup>s</sup>*u* and the operator Γ<sup>0</sup> = ∇<sup>s</sup>*G*<sup>0</sup>div are introduced. The original FFT-based solver, the basic scheme by Moulinec and Suquet (1994; 1998), is the fixed-point iteration associated to (2.60)

$$
\varepsilon\_{k+1} = \overline{\varepsilon} - \Gamma^0 : (\sigma(\cdot, \varepsilon\_k) - \mathbb{C}^0 : \varepsilon\_k), \tag{2.61}
$$

where Γ<sup>0</sup> is applied in Fourier space.
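To make the structure of the iteration (2.61) concrete, the following sketch applies it to a one-dimensional, linear elastic, periodic laminate, where Γ<sup>0</sup> reduces to multiplication by 1/*C*<sub>0</sub> on all nonzero frequencies and to zero on the mean. This is a strongly simplified setting chosen for illustration, not the three-dimensional solver of the original work. The function name, the microstructure and the choice of the scalar reference stiffness as the mean of the extreme stiffnesses are assumptions of this example; NumPy is assumed to be available.

```python
import numpy as np

def basic_scheme_1d(C, eps_bar, n_iter=300):
    """Basic scheme (2.61) for a 1D periodic laminate with sigma(x, eps) = C(x) * eps."""
    C = np.asarray(C, dtype=float)
    C0 = 0.5 * (C.min() + C.max())          # scalar reference material (assumed choice)
    eps = np.full(C.shape, float(eps_bar))  # start from the homogeneous strain
    for _ in range(n_iter):
        tau = C * eps - C0 * eps            # sigma(., eps_k) - C0 : eps_k
        tau_hat = np.fft.fft(tau)
        # In 1D, Gamma^0 acts as 1/C0 on nonzero frequencies and as 0 on the mean
        gamma_tau_hat = tau_hat / C0
        gamma_tau_hat[0] = 0.0
        eps = eps_bar - np.fft.ifft(gamma_tau_hat).real
    return eps

# Hypothetical laminate: stiff layer (C = 10) with volume fraction 1/4 in a soft matrix (C = 1)
C = np.where(np.arange(64) < 16, 10.0, 1.0)
eps_bar = 0.01
eps = basic_scheme_1d(C, eps_bar)
sigma = C * eps
C_eff = sigma.mean() / eps_bar
print(C_eff, 1.0 / np.mean(1.0 / C))        # effective stiffness vs. harmonic mean
```

In one dimension, equilibrium forces the stress to be uniform, so the effective stiffness should coincide with the harmonic mean of the local stiffness; this serves as a reference solution for the sketch.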

#### **2.2.3 Variational framework**

Under the assumption that the stress operator is derived from a (condensed) potential

$$
\sigma = \frac{\partial w}{\partial \varepsilon},
\tag{2.62}
$$

the cell problem may be embedded in a variational framework. In the following, we briefly establish the strain-based minimization problem (Bellis and Suquet, 2019) and its relation to the basic scheme (2.61). Please note that an equivalent description in terms of displacement fluctuations is possible (Schneider, 2017a) and enables more memory efficient implementations (Kabel et al., 2014). Consider the space of compatible strain fluctuations

$$U = \left\{ \hat{\varepsilon} \in L^2(Y; \text{Sym}(d)) \, \Big| \, \hat{\varepsilon} = \nabla^s u, \quad u \in H^1\_\#(Y; \mathbb{R}^d), \quad \langle \hat{\varepsilon} \rangle\_Y = 0 \right\} \tag{2.63}$$

as a subset of all periodic and square integrable stress and strain fields *L*<sup>2</sup>(*Y*; Sym(*d*)) with the associated inner product

$$\langle S, T \rangle\_{L^2} = \langle S : T \rangle\_Y, \quad S, T \in L^2(Y; \text{Sym}(d)). \tag{2.64}$$

We seek a minimizer of the mean strain-energy

$$W(\hat{\varepsilon}) = \langle w(\cdot, \overline{\varepsilon} + \hat{\varepsilon}) \rangle\_Y \longrightarrow \min\_{\hat{\varepsilon} \in U}. \tag{2.65}$$

The differential of *W* reads

$$DW(\hat{\varepsilon})[S] = \langle \Gamma : \sigma(\cdot, \overline{\varepsilon} + \hat{\varepsilon}) : S \rangle\_Y \quad S \in U,\tag{2.66}$$

where Γ = ∇<sup>s</sup>(div ∇<sup>s</sup>)<sup>−1</sup>div is the projector onto *U* by the Helmholtz decomposition, see App. A. For the chosen (sub)space with inner product (2.64), the gradient is defined by

$$DW(\hat{\varepsilon})[S] = \langle \nabla W(\hat{\varepsilon}), S \rangle\_{L^2}, \quad \forall S \in U,\tag{2.67}$$

hence, we obtain

$$
\nabla W(\hat{\varepsilon}) = \Gamma : \sigma(\cdot, \overline{\varepsilon} + \hat{\varepsilon}). \tag{2.68}
$$

The condition for critical points of *W*

$$
\Gamma : \sigma(\cdot, \varepsilon) = 0 \tag{2.69}
$$

is equivalent to the quasi-static balance of linear momentum on the microscale (2.50), thereby recovering the cell problem. Interpreting FFT-based micromechanics as an optimization problem has several immediate advantages. Firstly, Kabel et al. (2014) noted that the gradient descent iteration with step size γ<sub>k</sub>

$$
\varepsilon\_{k+1} = \varepsilon\_k - \gamma\_k \Gamma : \sigma(\cdot, \varepsilon\_k) \tag{2.70}
$$

associated to (2.65) is precisely the basic scheme by Moulinec and Suquet (1994; 1998) with ℂ<sup>0</sup> = 1/γ<sub>k</sub> I. This elucidates the role of the reference material ℂ<sup>0</sup> as an *algorithmic* rather than a physical parameter and clarifies its optimal choice. Indeed, for a strongly convex energy with an *L*-Lipschitz gradient, i.e.,

$$\begin{split} \langle \sigma(\cdot, \varepsilon\_1) - \sigma(\cdot, \varepsilon\_2), \varepsilon\_1 - \varepsilon\_2 \rangle\_{L^2} &\geq \mu ||\varepsilon\_1 - \varepsilon\_2||\_{L^2}^2 \\ ||\sigma(\cdot, \varepsilon\_1) - \sigma(\cdot, \varepsilon\_2)||\_{L^2} &\leq L ||\varepsilon\_1 - \varepsilon\_2||\_{L^2}, \end{split} \tag{2.71}$$

for all ε<sub>1</sub>, ε<sub>2</sub> ∈ *L*<sup>2</sup>(*Y*; Sym(*d*)), the optimal reference material (Nesterov, 2004, Sec. 1.2.3) reads

$$\mathbb{C}^{0} = \frac{\mu + L}{2} \mathbf{I},\tag{2.72}$$

generalizing the choice for the linear elastic setting (Moulinec and Suquet, 1998). Secondly, in the variational framework, well-established algorithms for convex optimization (Boyd and Vandenberghe, 2004; Nocedal and Wright, 1999), improving upon the performance of simple gradient descent, become available for FFT-based micromechanics, see Ch. 3.
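The optimality of (2.72) can be made tangible with the classical worst-case estimate for gradient descent: for an energy satisfying (2.71) and a step size γ, the distance to the minimizer contracts at least by the factor max(|1 − γμ|, |1 − γ*L*|) per iteration. The short sketch below scans scalar reference stiffnesses `c0`, i.e., ℂ<sup>0</sup> = `c0` I with γ = 1/`c0`, for hypothetical values μ = 1 and *L* = 10. The function name and the scanning grid are assumptions of this example.

```python
def contraction_factor(c0, mu, L):
    """Worst-case contraction of gradient descent with step size gamma = 1 / c0."""
    gamma = 1.0 / c0
    return max(abs(1.0 - gamma * mu), abs(1.0 - gamma * L))

mu, L = 1.0, 10.0                                           # hypothetical constants in (2.71)
candidates = [mu + 0.01 * i * (L - mu) for i in range(201)]  # c0 between mu and 2 L - mu
best_c0 = min(candidates, key=lambda c0: contraction_factor(c0, mu, L))
best_rate = contraction_factor(best_c0, mu, L)
print(best_c0, best_rate)
```

The scan should single out `c0` = (μ + *L*)/2 with the rate (*L* − μ)/(*L* + μ). Choices below *L*/2 lead to a factor of at least one, for which the estimate no longer guarantees convergence.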

#### **2.2.4 Eyre-Milton equation and polarization-based schemes**

Eyre and Milton (1999) proposed an equivalent reformulation of the Lippmann-Schwinger equation in terms of a positive polarization *P* = σ(·, ε) + ℂ<sup>0</sup> : ε as primary variable, giving rise to a separate class of FFT-based methods. The Eyre-Milton equation reads

$$P - \mathbf{Y}^0 : \mathbf{Z}^0(P) = 2\mathbb{C}^0 : \overline{\varepsilon} \tag{2.73}$$

with the nonlocal operator

$$\mathbf{Y}^0 = \mathbf{I} - 2\mathbb{C}^0 : \Gamma^0,\tag{2.74}$$

which is readily applied in Fourier space, and the local operator

$$\mathbf{Z}^0 = \mathbf{I} - 2\mathbb{C}^0 : (\sigma + \mathbb{C}^0)^{-1},\tag{2.75}$$

leading to a similar structure compared to the Lippmann-Schwinger equation (2.60) with nonlocal operator Γ<sup>0</sup> and local operator σ − ℂ<sup>0</sup>. We emphasize that σ in (2.75) denotes the stress *operator* and is not to be confused with the stress field, i.e., applying

$$\varepsilon = (\sigma + \mathbb{C}^0)^{-1}(P) \tag{2.76}$$

is equivalent to solving

$$P = \sigma(\cdot, \varepsilon) + \mathbb{C}^0 : \varepsilon \tag{2.77}$$

for ε ∈ *L*<sup>2</sup>(*Y*; Sym(*d*)). By noting that Z<sup>0</sup> translates the positive polarization *P* = σ(·, ε) + ℂ<sup>0</sup> : ε to the negative polarization τ = σ(·, ε) − ℂ<sup>0</sup> : ε, the equivalence of the Eyre-Milton equation (2.73) and the Lippmann-Schwinger equation (2.60) is readily established

$$\begin{aligned} P - \mathbf{Y}^0 &: \mathbf{Z}^0(P) = 2\mathbb{C}^0 : \overline{\varepsilon}, \\ \Leftrightarrow \quad P - (\mathbf{I} - 2\mathbb{C}^0 : \Gamma^0) &: \tau = 2\mathbb{C}^0 : \overline{\varepsilon}, \\ \Leftrightarrow \quad P - \tau + 2\mathbb{C}^0 : \Gamma^0 &: \tau = 2\mathbb{C}^0 : \overline{\varepsilon}, \\ \Leftrightarrow \quad & \varepsilon + \Gamma^0 : \tau = \overline{\varepsilon}. \end{aligned} \tag{2.78}$$

For linear problems (Eyre and Milton, 1999; Michel et al., 2001), the fixed-point iteration associated to (2.73)

$$P\_{k+1} = 2\mathbb{C}^0 : \overline{\varepsilon} + \mathbf{Y}^0 : \mathbf{Z}^0(P\_k) \tag{2.79}$$

and damped versions thereof (Monchiet and Bonnet, 2012; Moulinec and Silva, 2014) were found to converge much faster than the basic scheme (2.61) for a suitable choice of ℂ<sup>0</sup>. Similar to the basic scheme, the extension to inelastic problems was facilitated by connecting the Eyre-Milton scheme (2.79) to classical operator splitting methods (Peaceman and Rachford, 1955; Douglas and Rachford, 1956), see Schneider et al. (2019).

For some Hilbert space *V* and a function *f* : *V* → ℝ, *x* ↦ *f*(*x*) which admits the representation *f*(*x*) = *g*(*x*) + *h*(*x*), the Peaceman-Rachford iteration (Peaceman and Rachford, 1955) associated to the minimization problem

$$f(x) \to \min\_{x \in V} \tag{2.80}$$

reads

$$z\_{k+1} = [2(\mathbf{I} + \gamma \nabla g)^{-1} - \mathbf{I}][2(\mathbf{I} + \gamma \nabla h)^{-1} - \mathbf{I}]z\_k \tag{2.81}$$

with the iterate *z* = (I + γ∇*h*)(*x*) and a step size γ. Note that for an indicator function of a convex set *C*

$$\iota\_C(x) = \begin{cases} 0, & x \in C, \\ \infty, & x \notin C, \end{cases} \tag{2.82}$$

the operator (I + γ∂ι<sub>C</sub>)<sup>−1</sup> is equivalent to the projector onto *C*, see, for instance, Combettes and Pesquet (2011). To establish the connection to the Eyre-Milton scheme, consider the reformulation of problem (2.65)

$$\langle w(\varepsilon) \rangle\_Y + \iota\_{U\_\varepsilon}(\varepsilon) \longrightarrow \min\_{\varepsilon \in L^2(Y; \text{Sym}(d))} \tag{2.83}$$

with the indicator function for the space of compatible strain-fields adhering to the prescribed boundary conditions

$$U\_{\varepsilon} = \left\{ \varepsilon \in L^2(Y; \text{Sym}(d)) \, \middle| \, \varepsilon = \overline{\varepsilon} + \nabla^s u, \quad u \in H^1\_{\#}(Y; \mathbb{R}^d) \right\}.\tag{2.84}$$

In analogy to the space of strain-fluctuations (2.63), the projector ε ↦ ε̄ + Γ : ε onto *U*<sub>ε</sub> is derived from the Helmholtz decomposition. Thus, the Peaceman-Rachford iteration (2.81) associated to the problem (2.83) with *h* ≡ ⟨*w*⟩<sub>Y</sub> and *g* ≡ ι<sub>U<sub>ε</sub></sub> reads

$$z\_{k+1} = 2\overline{\varepsilon} + (2\Gamma - \mathbf{I})[2(\mathbf{I} + \gamma\sigma)^{-1} - \mathbf{I}]z\_k. \tag{2.85}$$

Upon multiplication of (2.85) with the reference material ℂ<sup>0</sup> = 1/γ I, the Eyre-Milton iteration (2.79) is recovered

$$P\_{k+1} = 2\mathbb{C}^0 : \overline{\varepsilon} + \underbrace{\left(2\mathbb{C}^0 : \Gamma^0 - \mathrm{I}\right)}\_{=-\mathbf{Y}^0} \underbrace{\left(2\mathbb{C}^0 : (\mathbb{C}^0 + \sigma)^{-1} - \mathrm{I}\right)}\_{=-\mathbf{Z}^0} P\_k. \tag{2.86}$$

Hence, the convergence analysis for splitting methods by Giselsson and Boyd (2017) carries over to polarization-based schemes and the optimal choice for the reference material is

$$\mathbb{C}^{0} = \sqrt{\mu L} \,\mathrm{I},\tag{2.87}$$

for strain energies satisfying (2.71). However, note that, in contrast to the reference material of the basic scheme (2.72), this choice (2.87) becomes ill-defined for cases where μ tends to zero, such as perfect elastoplasticity or porous materials. A strategy for circumventing this disadvantage is presented in Ch. 6.
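For comparison with the basic scheme, the Eyre-Milton iteration (2.79) may be sketched for a one-dimensional, linear elastic, periodic laminate. In this simplified setting, which is an assumption of the example rather than the setting of the original work, the local operator Z<sup>0</sup> is evaluated by solving (2.77) pointwise, and Γ<sup>0</sup> acts as 1/*C*<sub>0</sub> on all nonzero frequencies. The reference stiffness follows (2.87) with μ and *L* taken as the smallest and largest local stiffness; the function name and the microstructure are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def eyre_milton_1d(C, eps_bar, n_iter=100):
    """Polarization scheme (2.79) for a 1D periodic laminate with sigma(x, eps) = C(x) * eps."""
    C = np.asarray(C, dtype=float)
    C0 = np.sqrt(C.min() * C.max())              # reference material following (2.87)
    P = np.full(C.shape, 2.0 * C0 * eps_bar)     # initial polarization
    for _ in range(n_iter):
        eps = P / (C + C0)                       # solve P = sigma(., eps) + C0 : eps, cf. (2.77)
        tau = P - 2.0 * C0 * eps                 # Z^0(P), the negative polarization
        tau_hat = np.fft.fft(tau)
        gamma_tau_hat = tau_hat / C0             # Gamma^0 in 1D on nonzero frequencies
        gamma_tau_hat[0] = 0.0                   # Gamma^0 annihilates the mean
        y_tau = tau - 2.0 * C0 * np.fft.ifft(gamma_tau_hat).real   # Y^0 : tau, cf. (2.74)
        P = 2.0 * C0 * eps_bar + y_tau
    eps = P / (C + C0)
    return eps, C * eps

# Hypothetical laminate: stiff layer (C = 10) with volume fraction 1/4 in a soft matrix (C = 1)
C = np.where(np.arange(64) < 16, 10.0, 1.0)
eps_bar = 0.01
eps, sigma = eyre_milton_1d(C, eps_bar)
C_eff = sigma.mean() / eps_bar
print(C_eff, 1.0 / np.mean(1.0 / C))             # effective stiffness vs. harmonic mean
```

At convergence, the recovered strain field has the prescribed mean and the stress is uniform, so the effective stiffness should again match the harmonic mean of the local stiffness.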

# **Chapter 3**

# **On Quasi-Newton methods in FFT-based micromechanics<sup>1</sup>**

## **3.1 Introduction**

In the context of FFT-based computational homogenization, Newton's method was combined with the conjugate-gradient (CG) solver in the small- (Gélébart and Mondon-Cancel, 2013) and finite-strain setting (Kabel et al., 2014) and exhibited excellent performance. Due to the small number of required function evaluations, these schemes proved to be particularly powerful for problems with computationally expensive material laws, such as single-crystal plasticity (Shantraj et al., 2015; Lucarini and Segurado, 2019; Ma and Truster, 2019), whose evaluation dominates the overall runtime. However, in contrast to gradient-based methods, the Newton-CG solver requires the evaluation of the material's tangent stiffness for each voxel. This procedure can be computationally expensive for some material laws. Furthermore, the analytic derivation of the tangent can be tedious, and its implementation may require considerable programming effort and is thus prone to errors. This gave rise to applying Quasi-Newton methods in FFT-based micromechanics. Quasi-Newton schemes rely upon an approximation of the Hessian by generalizing the one-dimensional secant method and are thereby tangent-free (Nocedal and Wright, 1999). Schneider (2019a) used the Barzilai-Borwein method (Barzilai and Borwein, 1988), which approximates the Hessian by a multiple of the identity, to accelerate Moulinec-Suquet's basic scheme. Shantraj et al. (2015) pioneered using Anderson acceleration (Anderson, 1965) in an FFT-based context. The algorithm is included in the software DAMASK (Roters et al., 2018) as the non-linear GMRES method. More recently, Chen et al. (2019a;b) successfully adapted the Anderson acceleration to simulate damage initiation and brittle fracture. Originally developed to accelerate general fixed-point iterations, Anderson acceleration was linked to Quasi-Newton schemes by Fang and Saad (2009). More precisely, it was identified as a generalized multisecant form of the second Broyden method (or "bad Broyden method") (Broyden, 1965), which approximates the Hessian in terms of a number (called depth) of past iterates and gradients. Recently, Evans et al. (2020) proved that Anderson acceleration improves the first-order convergence rate for fixed-point iterations. Pollock and Rebholz (2021) extended the analysis to the non-contractive setting and provided sharper residual bounds.

<sup>1</sup> This chapter is based on Wicht et al. (2020b). For the sake of a coherent structure, formatting and typography of this thesis, minor changes have been made. To avoid redundancies in the text, the introduction has been shortened.

Motivated by the mentioned work on Quasi-Newton methods, we focus our attention on the powerful and popular Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm (Broyden, 1970; Fletcher, 1970; Goldfarb, 1970; Shanno, 1970). We revisit its basics in the framework of (inexact) Newton methods in Sec. 3.2. Both Newton and Quasi-Newton methods require appropriate globalization strategies to ensure global convergence. Often, this is realized by a backtracking line search using appropriate conditions for the acceptance of the step size. However, applying the classical Wolfe conditions (Wolfe, 1969) to FFT-based micromechanics is not feasible, as function evaluations are not available in this setting in general, since the condensed potential (Lahellec and Suquet, 2007) of the material law carries no physical meaning and is therefore not computed. Hence, we propose using the line-search conditions of Dong (2010), which solely rely upon gradient evaluations, see Sec. 3.2.3. Another aspect which is of major importance for the overall performance of inexact (Quasi-)Newton methods is the choice of the forcing term, i.e. the accuracy to which the linear system is solved. To this end, we revisit the forcing-term strategies of Eisenstat and Walker (1996), see Sec. 3.2.4. In Sec. 3.3, we turn our attention to Newton and Quasi-Newton methods as applied in the context of FFT-based micromechanics. After revisiting the Newton-CG method and Anderson acceleration, two possible uses of the BFGS update formula in the FFT-based setting are proposed. First, we investigate the limited-memory version of the BFGS algorithm (L-BFGS) by Nocedal (1980), which only stores the last $m$ differences of iterates and gradients for its Hessian approximation, similar to Anderson acceleration. A second algorithm is derived, using the BFGS update formula to approximate the local material tangent for every voxel instead of the Hessian of the global system. In analogy to the Newton-CG method, the resulting linear system is solved using conjugate gradients. Hence, we refer to the method as BFGS-CG. Last but not least, we compare the performance of the investigated solution algorithms and the impact of the different forcing-term choices for non-linear problems with finite and infinite material contrast, see Sec. 3.4.

## **3.2 Newton and Quasi-Newton methods**

### **3.2.1 Newton's method**

Let $V$ be a Hilbert space with an associated inner product $V \times V \to \mathbb{R}$, $(u, v) \mapsto \langle u, v \rangle\_V$ and the induced norm $\|u\|\_V = \sqrt{\langle u, u \rangle\_V}$. Suppose a twice continuously differentiable function $f: V \to \mathbb{R}$ is given. Its gradient $\nabla f: V \to V$ is defined by

$$Df(x)[v] = \langle \nabla f(x), v \rangle\_V, \qquad v \in V,\tag{3.1}$$

where $Df: V \to V'$ denotes the differential of $f$ and $V'$ is the continuous dual of $V$. For a minimization problem

$$f(x) \longrightarrow \min\_{x \in V},\tag{3.2}$$

critical points of $f$ are characterized by

$$
\nabla f(x) = 0.\tag{3.3}
$$

Newton's method iteratively updates an initial guess $x\_0 \in V$ by the formula

$$\begin{aligned} x\_{n+1} &= x\_n + \xi\_n, \quad \text{where} \quad \xi\_n \in V\\ \text{solves} \quad D\nabla f(x\_n)[\xi\_n] &= -\nabla f(x\_n), \end{aligned} \tag{3.4}$$

and $D\nabla f: V \to L(V, V)$ denotes the Hessian of $f$ and $L(V, V)$ denotes the space of linear mappings $V \to V$. Let $x^{\ast} \in V$ be a solution to (3.3). Suppose that $D\nabla f(x^{\ast})$ is an isomorphism and $\nabla f$ is Lipschitz continuous in a neighborhood of $x^{\ast}$. Then, if $x\_0$ is sufficiently close to $x^{\ast}$, the Newton iteration (3.4) converges, and if $D\nabla f$ is locally Lipschitz, it does so with quadratic rate (Kantorovich, 1948).
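The iteration (3.4) can be illustrated in finite dimensions. The following is a minimal sketch, assuming $V = \mathbb{R}^2$ with the Euclidean inner product; the test function and the names `newton`, `grad` and `hess` are hypothetical and chosen for illustration only.

```python
import numpy as np

def newton(grad, hess, x0, tol=1e-12, max_iter=50):
    """Undamped Newton iteration: solve hess(x) xi = -grad(x), then x <- x + xi."""
    x = np.asarray(x0, dtype=float)
    history = [np.linalg.norm(grad(x))]      # gradient norms per iterate
    for _ in range(max_iter):
        if history[-1] <= tol:
            break
        xi = np.linalg.solve(hess(x), -grad(x))
        x = x + xi
        history.append(np.linalg.norm(grad(x)))
    return x, history

# Hypothetical strictly convex example: f(x) = sum(exp(x_i) - x_i) + 0.5 * |x|^2,
# whose unique minimizer is x = 0.
def grad(x):
    return np.exp(x) - 1.0 + x

def hess(x):
    return np.diag(np.exp(x) + 1.0)

x_star, history = newton(grad, hess, x0=[1.0, -0.5])
```

Inspecting `history` shows the gradient norm roughly squaring from one iterate to the next once the iterates are close to the minimizer, which is the quadratic rate mentioned above.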

To obtain global convergence, the Newton iteration (3.4) has to be modified, for instance by damping, i.e., with $a\_n \in (0, 1]$,

$$\begin{aligned} x\_{n+1} &= x\_n + a\_n \xi\_n, \quad \text{where} \quad \xi\_n \in V\\ \text{solves} \quad D\nabla f(x\_n)[\xi\_n] &= -\nabla f(x\_n). \end{aligned} \tag{3.5}$$

The damping factor $a\_n$ is chosen by a line search procedure, for instance by an approximate line search involving the Wolfe (1969) conditions

$$f(x\_n + a\_n \xi\_n) \le f(x\_n) + c\_1 a\_n \langle \nabla f(x\_n), \xi\_n \rangle\_V \tag{3.6}$$

and

$$\langle \nabla f(x\_n + a\_n \xi\_n), \xi\_n \rangle\_V \ge c\_2 \langle \nabla f(x\_n), \xi\_n \rangle\_V \tag{3.7}$$

for fixed constants $0 < c\_1 < c\_2 < 1$.
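As an illustration, the two Wolfe conditions can be checked for a trial step size as follows. This is a sketch for $V = \mathbb{R}^2$ with a hypothetical quadratic objective; the function name `wolfe_conditions` and the default constants are assumptions made for the example.

```python
import numpy as np

def wolfe_conditions(f, grad, x, xi, a, c1=1e-4, c2=0.9):
    """Return (sufficient_decrease, curvature) for the step size a along xi."""
    slope0 = np.dot(grad(x), xi)             # directional derivative at a = 0
    x_new = x + a * xi
    sufficient_decrease = f(x_new) <= f(x) + c1 * a * slope0     # cf. (3.6)
    curvature = np.dot(grad(x_new), xi) >= c2 * slope0           # cf. (3.7)
    return bool(sufficient_decrease), bool(curvature)

# Hypothetical test problem: f(x) = 0.5 * x^T A x
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

x = np.array([1.0, 1.0])
xi = -grad(x)   # steepest-descent direction, used here only for illustration

too_long = wolfe_conditions(f, grad, x, xi, a=1.0)     # violates sufficient decrease
accepted = wolfe_conditions(f, grad, x, xi, a=0.1)     # satisfies both conditions
too_short = wolfe_conditions(f, grad, x, xi, a=1e-3)   # violates the curvature condition
```

The first condition rejects overly long steps, whereas the second one rejects overly short steps.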

For large scale applications, the equation $D\nabla f(x\_n)[\xi\_n] = -\nabla f(x\_n)$ for the Newton increment $\xi\_n$ can often only be solved iteratively up to a prescribed precision, leading to an inexact, damped Newton method

$$x\_{n+1} = x\_n + a\_n \xi\_n,\quad\text{where}\quad \xi\_n \in V$$

$$\text{solves}\quad ||D\nabla f(x\_n)[\xi\_n] + \nabla f(x\_n)||\_V \le \eta\_n ||\nabla f(x\_n)||\_V.\tag{3.8}$$

The choice of $\eta\_n$ is crucial, as its order of convergence (as $n \to \infty$) is linked to the convergence of $x\_n$ to $x^{\ast}$, see Dembo et al. (1982). More precisely, if $\eta\_n$ is uniformly less than one, $x\_n$ converges to $x^{\ast}$ linearly. Furthermore, assuming Lipschitz continuity of $D\nabla f(x)$ in a neighborhood of $x^{\ast}$, $\eta\_n \le C \|x\_n - x^{\ast}\|\_V$ with a constant $C$ is necessary to obtain quadratic convergence. However, "asymptotic quadratic convergence is achievable, but only with effort on the part of the inner, linear iterative method, which is usually unwarranted when overall time to solution is the metric", see Knoll and Keyes (2004). General-purpose strategies for the choice of $\eta\_n$ were proposed by Eisenstat and Walker (1996) and are discussed in Sec. 3.2.4.
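The inexact Newton condition (3.8) can be sketched with a conjugate-gradient inner solver that stops as soon as the relative linear residual drops below the forcing term. The example below assumes $V = \mathbb{R}^n$, a constant forcing term and an undamped step ($a\_n = 1$); the test problem and all function names are hypothetical.

```python
import numpy as np

def cg(apply_H, rhs, rel_tol, max_iter=200):
    """Conjugate gradients for H xi = rhs, stopped once ||rhs - H xi|| <= rel_tol * ||rhs||."""
    xi = np.zeros_like(rhs)
    r = rhs.copy()                  # residual rhs - H xi for xi = 0
    p = r.copy()
    rr = r @ r
    target = rel_tol * np.sqrt(rr)
    for _ in range(max_iter):
        if np.sqrt(rr) <= target:
            break
        Hp = apply_H(p)
        step = rr / (p @ Hp)
        xi += step * p
        r -= step * Hp
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return xi

def inexact_newton(grad, hess, x0, eta=0.1, tol=1e-10, max_iter=100):
    """Inexact Newton iteration with constant forcing term eta and unit step size."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        H = hess(x)
        x = x + cg(lambda v: H @ v, -g, eta)   # right-hand side -grad f, cf. (3.8)
    return x

# Hypothetical convex test problem with gradient A x + 0.1 x^3 - b
n = 20
A = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
grad = lambda x: A @ x + 0.1 * x**3 - b
hess = lambda x: A + np.diag(0.3 * x**2)

x_star = inexact_newton(grad, hess, np.zeros(n))
```

With a constant forcing term, the outer iteration converges linearly, in line with the statement above; tightening `eta` near the solution trades additional inner iterations for a faster outer rate.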

Despite the computational power of Newton's method, there are several practical disadvantages.


3. For inexact Newton methods, the optimal choice of the Newton forcing term $\{\eta\_n\}$ in (3.8) is difficult. Although general-purpose strategies have been developed (Eisenstat and Walker, 1996), the following problem remains. Suppose you wish to find a $\delta$-critical point, i.e. to find a solution to the inequality

$$\|\nabla f(x)\|\_{V} \le \delta$$

and your current iterate $x\_n$ almost satisfies the inequality. How accurately do you have to solve for the increment $\xi\_n$ to ensure that $x\_{n+1}$ is $\delta$-critical?

Points 1 and 3 motivated the development of Quasi-Newton methods which we shall discuss next.

#### **3.2.2 From Newton to BFGS**

Quasi-Newton methods replace the Hessian $D\nabla f(x\_n)$ in the linear equation

$$D\nabla f(x\_n)[\xi\_n] = -\nabla f(x\_n) \tag{3.9}$$

by an approximation $B\_n$ which is required to fulfill the secant condition

$$y\_n = B\_{n+1} s\_n,$$

$$\text{where}\quad s\_n = x\_{n+1} - x\_n,\tag{3.10}$$

$$\text{and}\quad y\_n = \nabla f(x\_{n+1}) - \nabla f(x\_n).$$

Among the most powerful Quasi-Newton methods is the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm (Broyden, 1970; Fletcher, 1970; Goldfarb, 1970; Shanno, 1970), which recursively updates an approximation of the Hessian

$$B\_{n+1} = B\_n + \frac{y\_n \otimes \langle y\_n, \cdot \rangle\_V}{\langle y\_n, s\_n \rangle\_V} - \frac{B\_n s\_n \otimes \langle B\_n s\_n, \cdot \rangle\_V}{\langle s\_n, B\_n s\_n \rangle\_V} \tag{3.11}$$

for a given $B\_0 \in L(V, V)$. If the operator $B\_0$ is self-adjoint and positive definite, the subsequent $B\_n \in L(V, V)$ will inherit the symmetry and positive definiteness property. Alternatively, an update formula corresponding to (3.11) is available for the inverse of the Hessian approximation $H\_n = B\_n^{-1}$

$$\begin{split} H\_{n+1} &= \left( \mathbf{I} - \frac{s\_n \otimes \langle y\_n, \cdot \rangle\_V}{\langle y\_n, s\_n \rangle\_V} \right) H\_n \left( \mathbf{I} - \frac{y\_n \otimes \langle s\_n, \cdot \rangle\_V}{\langle y\_n, s\_n \rangle\_V} \right) \\ &+ \frac{s\_n \otimes \langle s\_n, \cdot \rangle\_V}{\langle y\_n, s\_n \rangle\_V} . \end{split} \tag{3.12}$$

With this formula at hand, $\xi\_n = -H\_n \nabla f(x\_n)$ can be computed without solving the linear system (3.9). Thus, the damped BFGS method may be rewritten

$$x\_{n+1} = x\_n - a\_n H\_n \nabla f(x\_n). \tag{3.13}$$
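The inverse update can be checked numerically against the secant condition $H\_{n+1} y\_n = s\_n$. The sketch below assumes $V = \mathbb{R}^n$ with the Euclidean inner product and uses the standard product form $(\mathrm{I} - \rho\, s y^T) H (\mathrm{I} - \rho\, y s^T) + \rho\, s s^T$ with $\rho = 1/\langle y, s \rangle$; the function name and the test data are hypothetical.

```python
import numpy as np

def bfgs_inverse_update(H, s, y):
    """Inverse BFGS update in product form for V = R^n.

    Requires the curvature condition <y, s> > 0.
    """
    rho = 1.0 / np.dot(y, s)
    V = np.eye(len(s)) - rho * np.outer(y, s)    # maps s to zero
    return V.T @ H @ V + rho * np.outer(s, s)

# Hypothetical data: y = A s for a symmetric positive definite A ensures <y, s> > 0
rng = np.random.default_rng(0)
A = np.diag([1.0, 2.0, 5.0, 10.0])
s = rng.standard_normal(4)
y = A @ s

H_new = bfgs_inverse_update(np.eye(4), s, y)
```

Starting from a symmetric positive definite `H`, the updated operator stays symmetric positive definite and maps `y` to `s`.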

Global superlinear convergence of the BFGS method (3.13) with inexact line search respecting the Wolfe conditions (3.6) and (3.7) and uniformly convex and Lipschitz-continuous objective functions in finite dimensions has been established by Powell (1976). In the general Hilbert space setting, only linear convergence (Turner and Huntley, 1976; Griewank, 1987) can be expected, see Griewank (1987) for counterexamples. If the Hessian at the critical point $x^{\ast}$ and the inverse $H\_0^{-1}$ of the initial approximation $H\_0$ of the inverse Hessian differ by a compact linear operator, superlinear convergence can be established (Griewank, 1987). More generally, superlinear convergence is characterized by Dennis and Moré (1977). However, their criterion is difficult to verify for a particular problem at hand.

The BFGS method still keeps the Hessian (or its inverse) in memory. In particular, due to the rank-two update, $H\_n$ quickly becomes fully populated, restricting the method's utility for large scale applications. Nocedal (1980) introduced a limited-memory variant of BFGS (L-BFGS) depending on a positive integer $m$, such that only the last $m$ differences of iterates and gradients are kept in storage for updating the inverse Hessian. More precisely, for any $n$ and $l = 0, \ldots, m-1$, Nocedal proposed the formula

$$\begin{split} H\_n^{m-l} &= \left( \mathbf{I} - \frac{s\_{n-l} \otimes \langle y\_{n-l}, \cdot \rangle\_V}{\langle y\_{n-l}, s\_{n-l} \rangle\_V} \right) H\_n^{m-l-1} \left( \mathbf{I} - \frac{y\_{n-l} \otimes \langle s\_{n-l}, \cdot \rangle\_V}{\langle y\_{n-l}, s\_{n-l} \rangle\_V} \right) \\ &+ \frac{s\_{n-l} \otimes \langle s\_{n-l}, \cdot \rangle\_V}{\langle y\_{n-l}, s\_{n-l} \rangle\_V} \end{split} \tag{3.14}$$

for some initial approximation $H\_n^0$, and where we formally set $s\_n$ and $y\_n$ to zero for $n < 0$. Typically, the initial approximation is chosen as a multiple of the identity $H\_n^0 = \gamma\_n \mathrm{I}$. A common choice for the scaling factor is given by $\gamma\_n = \langle s\_{n-1}, y\_{n-1} \rangle\_V / \langle y\_{n-1}, y\_{n-1} \rangle\_V$, see Shanno and Puah (1978) and Liu and Nocedal (1989), corresponding to the Barzilai-Borwein step size (Barzilai and Borwein, 1988). The damped L-BFGS iteration reads

$$x\_{n+1} = x\_n - a\_n H\_n^m \nabla f(x\_n). \tag{3.15}$$

How to implement the update (3.15) in the context of FFT-based micromechanics is discussed in Sec. 3.3.3. For strongly convex and Lipschitz-continuous objective functions, convergence of L-BFGS under the Wolfe conditions (3.6) and (3.7) in finite dimensions was established by Liu and Nocedal (1989). In contrast to BFGS, the convergence to $x^{\ast}$ is only linear.

#### **3.2.3 The line-search procedure of Dong**

Global convergence of Newton's method and (L-)BFGS depends on a flexible line-search procedure. Exact line search is typically infeasible in practice, because evaluating the gradient of the objective function involves non-linear, and often quite costly, operations. Thus, approximate line-search procedures ensuring sufficient decrease per iteration are mandatory, involving, for instance, the Wolfe conditions (3.6) and (3.7). In particular, using the Wolfe conditions as criterion for the line search is crucial for ensuring global convergence of the (L-)BFGS method. Satisfying the Wolfe conditions guarantees that the curvature condition

$$
\langle y\_n, s\_n \rangle\_V > 0 \tag{3.16}
$$

holds, which is necessary for the positive definiteness of the iterates $B\_n$, see Sec. 6.1 in Nocedal and Wright (1999).

For FFT-based micromechanics (to be discussed in Sec. 3.3), function evaluations are not available, in general. The reason is that, in contrast to the stress, the Helmholtz free energy or the dissipation potential, the condensed potential for the non-linear material law, relating strains and stresses, has no physical meaning (because it depends on the time discretization and mixes the Helmholtz free energy and the dissipation potential). In particular, the Wolfe condition (3.6) cannot be evaluated per se. As a workaround, Dong (2010) proposed to replace the first Wolfe condition (3.6) by the inequality

$$
\langle \nabla f(x\_n + a\_n d\_n), d\_n \rangle\_V \le c\_1 \langle \nabla f(x\_n), d\_n \rangle\_V,\tag{3.17}
$$

for the search direction $d\_n$ (here, $d\_n = \xi\_n$). This inequality implies (3.6) if the gradient $\nabla f: V \to V$ is monotone, i.e. satisfies

$$
\langle \nabla f(x) - \nabla f(y), x - y \rangle\_V \ge 0, \quad x, y \in V. \tag{3.18}
$$

In mechanics, the latter is equivalent to the monotonicity of the stress, considered as a function of the strain.
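A gradient-only line search built on the conditions (3.17) and (3.7) can be sketched as a simple bracketing procedure. The version below is an illustration under stated assumptions: $c\_1$ is kept fixed, the step is doubled as long as no upper bound is known and bisected otherwise, and the names `gradient_only_line_search`, `a_lo` and `a_hi` are hypothetical.

```python
import numpy as np

def gradient_only_line_search(grad, x, d, c1=1e-4, c2=0.9, max_iter=30):
    """Find a step size a with c2 * g0 <= <grad f(x + a d), d> <= c1 * g0, where g0 = <grad f(x), d>."""
    slope0 = np.dot(grad(x), d)              # negative for a descent direction
    a, a_lo, a_hi = 1.0, 0.0, np.inf
    for _ in range(max_iter):
        slope = np.dot(grad(x + a * d), d)
        if slope > c1 * slope0:              # (3.17) violated: step too long
            a_hi = a
            a = 0.5 * (a_lo + a_hi)
        elif slope < c2 * slope0:            # (3.7) violated: step too short
            a_lo = a
            a = 2.0 * a if np.isinf(a_hi) else 0.5 * (a_lo + a_hi)
        else:
            break
    return a

# Hypothetical monotone gradient: grad f(x) = A x with A symmetric positive definite
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x
x = np.array([1.0, 1.0])
d = -grad(x)

a = gradient_only_line_search(grad, x, d)
```

No evaluation of $f$ is needed, only inner products of gradients with the search direction, which is the property exploited in the FFT-based setting.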

#### **3.2.4 Strategies for choosing the forcing term**

For inexact Newton methods, the choice of the forcing term $\{\eta\_n\}$ in (3.8) is crucial for the overall efficiency of the scheme. At iterates $\{x\_n\}$ far away from the solution, $\nabla f$ and its linear approximation may disagree significantly. Thus, solving the linear system (3.9) to a high accuracy may waste computational effort without substantially improving the overall convergence behavior (Eisenstat and Walker, 1996). This is commonly called oversolving. Setting $\eta\_n$ to a moderate constant value, e.g. $\eta\_n = 0.1$ as suggested by Kelley (2018), can be reasonable but may not be optimal for all problems. Eisenstat and Walker (1996) propose more involved strategies, taking $\nabla f$ into account. Their first strategy, named choice 1, reads

$$\eta\_n = \left| \frac{||\nabla f(x\_n)||\_V - ||D\nabla f(x\_{n-1})[\xi\_{n-1}] + \nabla f(x\_{n-1})||\_V}{||\nabla f(x\_{n-1})||\_V} \right|,\tag{3.19}$$

with an initial value $\eta\_0 \in [0, 1)$. This choice directly measures the disagreement between the gradient and its linear approximation. Thus, the value of $\eta\_n$ decreases as the Newton iterates $\{x\_n\}$ approach the solution of the system. The alternative choice 2 by Eisenstat and Walker (1996) is given by

$$\eta\_n = \lambda \left( \frac{||\nabla f(x\_n)||\_V}{||\nabla f(x\_{n-1})||\_V} \right)^\beta,\tag{3.20}$$

with parameters $\lambda \in [0, 1]$ and $\beta \in (1, 2]$. The ratio of consecutive residuals provides a measure for the convergence rate between the current and last iteration. Hence, close to the solution, where a faster convergence behavior is expected, $\eta\_n$ decreases. Setting the parameter $\beta = (1+\sqrt{5})/2$ results in a comparable convergence order for choices 1 and 2. Additionally, Eisenstat and Walker suggest a safeguard for each choice to prevent a premature decrease of $\eta\_n$ far away from the solution. This is achieved by limiting the decrease of $\eta\_n$ by a factor depending on $\eta\_{n-1}$ above a certain threshold. The safeguard for choice 1 reads

$$\eta\_n^{\text{safe}} = \begin{cases} \max\left(\eta\_n, \eta\_{n-1}^{(1+\sqrt{5})/2}\right), & \text{if } \eta\_{n-1}^{(1+\sqrt{5})/2} > 0.1, \\ \eta\_n, & \text{otherwise}, \end{cases} \tag{3.21}$$

and the safeguard for choice 2 is given by

$$\eta\_n^{\text{safe}} = \begin{cases} \max\left(\eta\_n, \lambda \eta\_{n-1}^\beta\right), & \text{if } \lambda \eta\_{n-1}^\beta > 0.1, \\ \eta\_n, & \text{otherwise.} \end{cases} \tag{3.22}$$

Even with the presented forcing term choices and safeguards in place, oversolving may occur in the final Newton iteration. Indeed, suppose we want to solve (3.3) up to a certain accuracy

$$\|\nabla f(x)\|\_{V} \le \delta,\tag{3.23}$$

and the current iterate $x\_n$ almost satisfies (3.23). With a small value for $\eta\_n$, the final Newton iteration may reduce $\|\nabla f(x)\|\_V$ far below the desired accuracy $\delta$. To prevent this type of oversolving, the following safeguard

$$\eta\_n^{\text{final}} = \min(\eta\_{\text{max}}, \max(\eta\_n^{\text{safe}}, 0.5 \,\delta/||\nabla f(x)||\_V))\tag{3.24}$$

with $\eta\_{\text{max}} \in [0, 1)$ is suggested in Sec. 6.3 in Kelley's book (Kelley, 1995).
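The forcing-term formulas (3.19) to (3.24) translate into a few lines of code. The sketch below is illustrative only; the function names and the default values of `lam`, `beta` and `eta_max` are assumptions made for the example and are not prescribed by the text.

```python
GOLDEN = (1.0 + 5.0 ** 0.5) / 2.0

def ew_choice1(g_norm, g_norm_prev, lin_res_prev):
    """Choice 1, cf. (3.19): lin_res_prev is the norm of the previous linear residual."""
    return abs(g_norm - lin_res_prev) / g_norm_prev

def ew_choice2(g_norm, g_norm_prev, lam=0.9, beta=GOLDEN):
    """Choice 2, cf. (3.20)."""
    return lam * (g_norm / g_norm_prev) ** beta

def safeguard1(eta, eta_prev):
    """Safeguard for choice 1, cf. (3.21)."""
    bound = eta_prev ** GOLDEN
    return max(eta, bound) if bound > 0.1 else eta

def safeguard2(eta, eta_prev, lam=0.9, beta=GOLDEN):
    """Safeguard for choice 2, cf. (3.22)."""
    bound = lam * eta_prev ** beta
    return max(eta, bound) if bound > 0.1 else eta

def final_safeguard(eta_safe, g_norm, delta, eta_max=0.9):
    """Safeguard against oversolving in the last Newton step, cf. (3.24)."""
    return min(eta_max, max(eta_safe, 0.5 * delta / g_norm))

# Example: the residual dropped by a factor of 10, the previous forcing term was 0.9
eta = ew_choice2(0.1, 1.0, lam=1.0, beta=2.0)          # 0.01
eta = safeguard2(eta, 0.9, lam=1.0, beta=2.0)          # raised to 0.81 by the safeguard
eta_last = final_safeguard(1e-3, g_norm=2e-5, delta=1e-5)   # relaxed to 0.25
```

In the last call, the current gradient norm is only twice the target tolerance, so the safeguard relaxes the linear solve considerably instead of driving the residual far below $\delta$.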

## **3.3 Newton and Quasi-Newton methods in FFT-based micromechanics**

#### **3.3.1 Newton's method**

We consider periodic homogenization problems (Bakhvalov and Panasenko, 1989) in the context of small-strain continuum mechanics. Let $Y$ be a rectangular cell in $\mathbb{R}^d$ ($d = 2, 3$). The Hilbert space for periodic and mean-free displacement fluctuations is

$$\begin{aligned} H^1\_\#(Y; \mathbb{R}^d) &= \{ u \in H^1(Y; \mathbb{R}^d) \, | \, \\ &u \text{ periodic}, \, \partial\_n u \text{ anti-periodic on } \partial Y, \, \langle u \rangle\_Y = 0 \}, \end{aligned} \tag{3.25}$$

where the mean of any integrable scalar or vector valued function $q$ on $Y$ is defined by

$$
\langle q \rangle\_Y = \frac{1}{|Y|} \int\_Y q(x) \, dx,
$$

together with the inner product induced by the quadratic form

$$\|u\|^2\_{H^1\_\#(Y; \mathbb{R}^d)} = \frac{1}{|Y|} \int\_Y \|\nabla^s u\|^2 \, dx,$$

where $\nabla^s$ denotes the symmetrized gradient and the quadratic form in the integrand corresponds to the Frobenius inner product on square matrices, $\|A\|^2 = \operatorname{tr}(A^T A)$.

Furthermore, let a (heterogeneous) strain energy potential

$$w: Y \times \operatorname{Sym}(d) \to \mathbb{R}, \quad (x, \varepsilon) \mapsto w(x, \varepsilon),$$

be given, measurable in $x$ and $C^2$ in $\operatorname{Sym}(d)$, where $\operatorname{Sym}(d)$ denotes the linear space of symmetric $d \times d$-matrices. Denote by $\sigma = \frac{\partial w}{\partial \varepsilon}$ the associated stress function, and by $\frac{\partial^2 w}{\partial \varepsilon^2}$ its Hessian. For prescribed strain $\overline{\varepsilon}$, we seek a minimizer of the function

$$H^1\_{\#}(Y; \mathbb{R}^d) \ni u \mapsto f(u) = \langle w(\cdot, \overline{\varepsilon} + \nabla^s u) \rangle\_Y. \tag{3.26}$$

To conform to the framework of the previous section, we compute the differential of $f$

$$Df(u) = -\text{div}\,\sigma(\cdot,\overline{\varepsilon} + \nabla^s u)$$

and its gradient

$$\nabla f(u) = G \operatorname{div} \sigma(\cdot, \overline{\varepsilon} + \nabla^s u)$$

where $G$ is the Green's operator $G = (\operatorname{div} \nabla^s)^{-1}$, which corresponds to the negative of the Riesz map on $H^1\_\#(Y; \mathbb{R}^d)$. In this context, the equation for the $n$-th Newton increment $\xi\_n \in H^1\_\#(Y; \mathbb{R}^d)$, corresponding to (3.9), is given by

$$G\text{div}\left[\frac{\partial^2 w}{\partial \varepsilon^2}(\varepsilon\_n) : \nabla^s \xi\_n\right] = -G\text{div}\,\sigma(\varepsilon\_n),\tag{3.27}$$

where $\varepsilon\_n = \overline{\varepsilon} + \nabla^s u\_n$. For any $\alpha\_0 > 0$, equation (3.27) is equivalent to the Lippmann-Schwinger equation

$$\Xi\_n + \Gamma^0 : \left[\frac{\partial^2 w}{\partial \varepsilon^2}(\varepsilon\_n) - \mathbb{C}^0\right] : \Xi\_n = -\Gamma^0 : \sigma(\varepsilon\_n), \tag{3.28}$$

where $\mathbb{C}^0 = \alpha\_0 \mathrm{I}$ and $\Gamma^0 = (\alpha\_0)^{-1} \nabla^s G \operatorname{div}$, via the identification $\Xi\_n = \nabla^s \xi\_n$. Note that, if a strain-based iterative scheme is used to solve (3.28), only the converged solution $\Xi^{\ast}$ is compatible, in general, whereas this may be false for the intermediate iterates approximating $\Xi\_n$. This is the case, for instance, for polarization-based schemes such as the Eyre-Milton method used by Kabel et al. (2014). Typically, (3.28) is solved using Krylov-subspace methods, such as CG or MINRES (Zeman et al., 2010; Brisard and Dormieux, 2010; 2012), due to their excellent performance for linear problems. In addition, these schemes operate on compatible strain fields, permitting a memory-efficient implementation (Kabel et al., 2014). With these formulae at hand, we may formulate a damped Newton scheme, depending on Dong's version of the Wolfe conditions, (3.17) and (3.7). The resulting algorithm is summarized in Alg. 1.

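**Algorithm 1** Newton's method with backtracking by Dong (2010) ($\overline{\varepsilon}$, $\mathbb{C}^0$, $c\_{1,0}$, $c\_2$, maxiter)

1: $\varepsilon \leftarrow \overline{\varepsilon}$
2: $\varepsilon \leftarrow$ MSiterate ($\varepsilon$, $\overline{\varepsilon}$, $\mathbb{C}^0$)
3: **repeat**
4: $\Xi \leftarrow -\left(\mathrm{I} + \Gamma^0 : \left[\frac{\partial^2 w}{\partial \varepsilon^2}(\varepsilon) - \mathbb{C}^0\right]\right)^{-1} : \Gamma^0 : \sigma(\varepsilon)$ *◁* Solving (3.28)
5: $a\_l \leftarrow 0$
6: $a\_u \leftarrow +\infty$
7: $a \leftarrow 1$
8: $i \leftarrow 0$
9: **while** $i <$ maxiter **do**
10: $i \leftarrow i + 1$
11: $c\_1 \leftarrow c\_{1,0}$(1 − (2) ) − (2)
12: **if** $\langle \Gamma^0 : \sigma(\varepsilon + a\,\Xi), \Xi \rangle\_{L^2} > c\_1 \langle \Gamma^0 : \sigma(\varepsilon), \Xi \rangle\_{L^2}$ **then**
13: $a\_u \leftarrow a$
14: $a \leftarrow 0.5\,(a\_l + a\_u)$
15: **else if** $\langle \Gamma^0 : \sigma(\varepsilon + a\,\Xi), \Xi \rangle\_{L^2} < c\_2 \langle \Gamma^0 : \sigma(\varepsilon), \Xi \rangle\_{L^2}$ **then**
16: $a\_l \leftarrow a$
17: $a \leftarrow 2a$
18: **else**
19: **break**
20: **end if**
21: **end while**
22: $\varepsilon \leftarrow \varepsilon + a\,\Xi$
23: **until** Convergence *◁* Criterion (3.29)
24: **return** $\varepsilon$

Newton's method with backtracking by Dong (2010) (*continued*)

MSiterate ($\varepsilon$, $\overline{\varepsilon}$, $\mathbb{C}^0$)
1: $\tau \leftarrow \sigma(\varepsilon) - \mathbb{C}^0 : \varepsilon$
2: $\hat{\tau} \leftarrow \text{FFT}(\tau)$
3: $\hat{\varepsilon} \leftarrow -\hat{\Gamma}^0 : \hat{\tau}$, $\hat{\varepsilon}(0) = \overline{\varepsilon}$
4: $\varepsilon \leftarrow \text{FFT}^{-1}(\hat{\varepsilon})$
5: **return** $\varepsilon$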
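The structure of MSiterate can be illustrated in one spatial dimension, where the Fourier coefficients of $\Gamma^0$ reduce to $1/\alpha\_0$ for all non-zero frequencies. The following sketch assumes a periodic two-phase bar with linear elastic phases; the stiffness values, the grid size and the function names are hypothetical, and NumPy's FFT stands in for the FFTW library.

```python
import numpy as np

def ms_iterate(eps, eps_bar, stiffness, alpha0):
    """One iteration of the basic scheme for a 1D periodic bar with sigma = stiffness * eps."""
    tau = stiffness * eps - alpha0 * eps          # polarization sigma(eps) - C0 : eps
    tau_hat = np.fft.fft(tau)
    eps_hat = -tau_hat / alpha0                   # apply Gamma0 for the non-zero frequencies
    eps_hat[0] = eps_bar * eps.size               # prescribe the mean; NumPy stores the sum in mode 0
    return np.fft.ifft(eps_hat).real

def residual(eps, stiffness):
    """1D analogue of the criterion (3.29): ||sigma - <sigma>|| / |<sigma>|."""
    sigma = stiffness * eps
    return np.sqrt(np.mean((sigma - sigma.mean()) ** 2)) / abs(sigma.mean())

n = 64
stiffness = np.where(np.arange(n) < 48, 1.0, 10.0)        # 75% soft, 25% stiff phase
eps_bar = 0.01
alpha0 = 0.5 * (stiffness.max() + stiffness.min())        # reference material of the basic scheme

eps = np.full(n, eps_bar)
for iteration in range(1000):
    if residual(eps, stiffness) <= 1e-10:
        break
    eps = ms_iterate(eps, eps_bar, stiffness, alpha0)

effective_modulus = np.mean(stiffness * eps) / eps_bar
```

In 1D, equilibrium forces the stress to be constant, so the effective modulus should approach the harmonic mean of the phase stiffnesses, here $1/(0.75/1 + 0.25/10)$.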

The convergence criterion reads

$$\alpha\_0 \frac{\|\Gamma^0: \sigma^k\|\_{L^2}}{\|\langle \sigma^k \rangle\_Y\|} \equiv \frac{\|\text{div}\,(\sigma^k)\|\_{H^{-1}}}{\|\langle \sigma^k \rangle\_Y\|} \le \delta \tag{3.29}$$

with a prescribed tolerance $\delta$. This choice was introduced and discussed in Schneider et al. (2019). Both the convergence criterion (3.29) and the convergence behavior of the linear Krylov-subspace solver are independent of $\alpha\_0$, see Zeman et al. (2010). As we start with a single iteration of the basic scheme, we use the associated reference material $\alpha\_0 = (\alpha\_+ + \alpha\_-)/2$ with the extremal eigenvalues $\alpha\_+$ and $\alpha\_-$ of the material tangent evaluated over all voxels. For the parameters of the line-search procedure, we choose $c\_{1,0} = 10^{-4}$ and $c\_2 = 0.9$, see Dong (2010). A few remarks on the practical implementation are in order.

1. The storage requirements for Newton-CG are as follows: 1 current strain field and 4 strain fields for solving the linear system by CG. Furthermore, the symmetric material tangent needs to be stored. In 3 spatial dimensions, this corresponds to 21 scalar components for every voxel, the equivalent of 3.5 strain fields. In total, the storage requirements amount to 8.5 strain-like fields. Using the line-search procedure by Dong (2010) involves storing another strain field, as gradient and Newton step have to be kept in memory separately. If affine-linear extrapolation is needed, an additional strain field needs to be stored.


## **3.3.2 Anderson acceleration**

The BFGS method as outlined in Sec. 3.2.2 requires the Hessian (or its inverse) to be kept in memory. Therefore, the algorithm cannot be directly applied in the context of FFT-based micromechanics, as the Hessian is usually not assembled in this setting due to memory limitations. To circumvent this problem, limited-memory Quasi-Newton methods were developed, which implicitly update the Hessian by storing a limited number $m$ of recent iterates and gradients, with $m$ commonly called the depth of the scheme.

One such algorithm is Anderson acceleration (Anderson, 1965) which was recently applied by Shantraj et al. (2015) and Chen et al. (2019a;b) in the context of FFT-based micromechanics. A general discussion of the scheme and its implementation is found, e.g., in Walker and Ni (2011) or Kelley (2018). Eyert (1996) and Fang and Saad (2009) pointed out the relation of Anderson acceleration to Quasi-Newton schemes and identified it as a generalized form of Broyden's second method. Recently, Evans et al. (2020) provided a proof that Anderson acceleration improves the convergence rate of linearly converging fixed-point methods.

For an integer depth $m \ge 1$, Anderson acceleration requires the last $m+1$ iterates $\varepsilon\_k$ and gradients $g\_k = \Gamma^0 : \sigma(\varepsilon\_k)$ to be kept in memory, resulting in a memory footprint of $2m + 2$ strain-like fields. The algorithm is outlined in Alg. 2 for the convenience of the reader. Note that for the given algorithm Anderson acceleration is applied for every iteration. In contrast, Chen et al. (2019a;b) only accelerate every third iteration and apply the basic scheme (Moulinec and Suquet, 1998) otherwise.


Determining the coefficients $\alpha = (\alpha\_0, \ldots, \alpha\_{m\_k})$ by solving the minimization problem

$$\min\_{\alpha} \left\| \sum\_{j=0}^{m\_k} \alpha\_j g\_{k-m\_k+j} \right\|\_{L^2} \quad \text{s.t.} \quad \sum\_{j=0}^{m\_k} \alpha\_j = 1 \tag{3.30}$$

is the key step in one iteration of the Anderson acceleration. To solve this problem, we reformulate (3.30) in terms of the Lagrangian function

$$\sum\_{i=0}^{m\_k} \sum\_{j=0}^{m\_k} \frac{1}{2} \alpha\_i \alpha\_j \langle g\_{k-m\_k+i}, g\_{k-m\_k+j} \rangle\_{L^2} + \lambda \left( \sum\_{j=0}^{m\_k} \alpha\_j - 1 \right) \longrightarrow \min\_{\alpha} \max\_{\lambda} \tag{3.31}$$

by squaring the objective function and introducing the Lagrangian multiplier $\lambda$. The associated KKT-conditions

$$\begin{aligned} \sum\_{j=0}^{m\_k} \alpha\_j \langle g\_{k-m\_k}, g\_{k-m\_k+j} \rangle\_{L^2} + \lambda &= 0 \\ \vdots \\ \sum\_{j=0}^{m\_k} \alpha\_j \langle g\_k, g\_{k-m\_k+j} \rangle\_{L^2} + \lambda &= 0 \\ \sum\_{j=0}^{m\_k} \alpha\_j - 1 &= 0 \end{aligned} \tag{3.32}$$

constitute a system of $m\_k + 2$ linear equations, which are solved for $\alpha$ and $\lambda$.
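The KKT system can be assembled from the Gram matrix of the stored gradients and solved with a dense linear solver, since its size is tiny compared to the number of voxels. The sketch below assumes that the gradients are stored as rows of an array and are linearly independent; the function name and the example data are hypothetical.

```python
import numpy as np

def anderson_coefficients(gradients):
    """Solve the KKT system (3.32) for the coefficients alpha and the multiplier lambda.

    gradients: array of shape (m_k + 1, n), one flattened gradient field per row.
    """
    G = np.asarray(gradients, dtype=float)
    k = G.shape[0]
    kkt = np.zeros((k + 1, k + 1))
    kkt[:k, :k] = G @ G.T          # Gram matrix of L2 inner products
    kkt[:k, k] = 1.0               # derivative of the constraint w.r.t. alpha
    kkt[k, :k] = 1.0               # constraint sum(alpha) = 1
    rhs = np.zeros(k + 1)
    rhs[k] = 1.0
    solution = np.linalg.solve(kkt, rhs)
    return solution[:k], solution[k]

# Two orthogonal gradients: minimize 4 * alpha_0^2 + alpha_1^2 subject to alpha_0 + alpha_1 = 1
alpha, lam = anderson_coefficients([[2.0, 0.0], [0.0, 1.0]])
```

For this example, the stationarity conditions give $\alpha\_1 = 4 \alpha\_0$, so that $\alpha = (0.2, 0.8)$ and $\lambda = -0.8$.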

#### **3.3.3 Limited-memory BFGS**

As another limited-memory Quasi-Newton scheme, we propose to apply Nocedal's L-BFGS method, see Sec. 3.2.2, to FFT-based micromechanics. The L-BFGS method can be implemented with a memory footprint of $2m + 4$ strain-like fields. More precisely, the last $m$ differences of iterates $s\_n = \varepsilon\_{n+1} - \varepsilon\_n$, differences of gradients $y\_n = \Gamma^0 : \sigma(\varepsilon\_{n+1}) - \Gamma^0 : \sigma(\varepsilon\_n)$ and inner products $\rho\_n = 1/\langle y\_n, s\_n \rangle\_{L^2}$ have to be kept in memory. In addition, the current strain $\varepsilon\_n$ and gradient $\Gamma^0 : \sigma(\varepsilon\_n)$ and the last strain $\varepsilon\_{n-1}$ and gradient $\Gamma^0 : \sigma(\varepsilon\_{n-1})$ need to be stored.

For evaluating the L-BFGS increment $\Xi\_n = -H\_n^m \nabla f(x\_n)$, the two-loop recursion of Matthies and Strang (1979) proves useful. A pseudo code is given in Alg. 3, where we use the initial approximation of the inverse Hessian

$$H\_n^0 = \frac{\langle s\_{n-1}, y\_{n-1} \rangle\_{L^2}}{\langle y\_{n-1}, y\_{n-1} \rangle\_{L^2}} \operatorname{I},\tag{3.33}$$

as suggested by Shanno and Puah (1978) and Liu and Nocedal (1989). The algorithm takes the current gradient $\Gamma^0 : \sigma(\varepsilon\_n)$ as input and overwrites it by the increment $\Xi\_n$.

**Algorithm 3** Two-loop recursion for evaluating $H\_n^m q$ for given $q$ (Matthies and Strang, 1979; Nocedal, 1980)

```
for i = m − 1, m − 2, . . . , 0 do
    α_i ← ρ_i ⟨s_i, q⟩_{L²}
    q ← q − α_i y_i
end for
q ← (⟨s_{n−1}, y_{n−1}⟩_{L²} / ⟨y_{n−1}, y_{n−1}⟩_{L²}) q
for i = 0, 1, . . . , m − 1 do
    β ← ρ_i ⟨y_i, q⟩_{L²}
    q ← q + (α_i − β) s_i
end for
return q
```
The L-BFGS method is implemented analogously to Alg. 1, where the two-loop recursion replaces the solution of the linear system (3.28) for obtaining Ξ.
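The two-loop recursion can be sketched and verified against a densely assembled inverse Hessian approximation. The example assumes $V = \mathbb{R}^n$ with the Euclidean inner product and pairs stored oldest first; the function names `two_loop` and `dense_reference` and the test data are hypothetical.

```python
import numpy as np

def two_loop(q, s_list, y_list):
    """Return H q for the L-BFGS inverse Hessian built from the stored pairs (oldest first)."""
    q = np.array(q, dtype=float)                 # work on a copy of the input
    rho = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alpha = [0.0] * len(s_list)
    for i in reversed(range(len(s_list))):       # newest pair first
        alpha[i] = rho[i] * np.dot(s_list[i], q)
        q -= alpha[i] * y_list[i]
    s, y = s_list[-1], y_list[-1]
    q *= np.dot(s, y) / np.dot(y, y)             # initial scaling, cf. (3.33)
    for i in range(len(s_list)):                 # oldest pair first
        beta = rho[i] * np.dot(y_list[i], q)
        q += (alpha[i] - beta) * s_list[i]
    return q

def dense_reference(s_list, y_list):
    """Assemble the same operator explicitly by repeated inverse BFGS updates."""
    n = s_list[0].size
    s, y = s_list[-1], y_list[-1]
    H = (np.dot(s, y) / np.dot(y, y)) * np.eye(n)
    for s, y in zip(s_list, y_list):
        rho = 1.0 / np.dot(y, s)
        V = np.eye(n) - rho * np.outer(y, s)
        H = V.T @ H @ V + rho * np.outer(s, s)
    return H

# Hypothetical data: y = A s for a symmetric positive definite A ensures <y, s> > 0
rng = np.random.default_rng(0)
A = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])
S = [rng.standard_normal(5) for _ in range(3)]
Y = [A @ s for s in S]
q = rng.standard_normal(5)

increment = -two_loop(q, S, Y)
```

The recursion only needs inner products and vector updates on the stored fields, so the dense operator is never formed in practice; it appears here solely as a check.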

#### **3.3.4 BFGS update of the material tangent**

As an alternative to the limited-memory Quasi-Newton scheme, we propose using the BFGS update to approximate the local material tangent $\frac{\partial^2 w}{\partial \varepsilon^2}$ in (3.28) instead of the global Hessian of $f$ in (3.26). In this context, the BFGS update reads

$$\begin{split} \mathbb{C}\_{n+1}^{\text{BFGS}} &= \mathbb{C}\_{n}^{\text{BFGS}} + \frac{\Delta\sigma\_{n} \otimes \Delta\sigma\_{n}}{\Delta\sigma\_{n} : \Delta\,\varepsilon\_{n}} \\ &- \frac{(\mathbb{C}\_{n}^{\text{BFGS}} : \Delta\,\varepsilon\_{n}) \otimes (\mathbb{C}\_{n}^{\text{BFGS}} : \Delta\,\varepsilon\_{n})}{\Delta\,\varepsilon\_{n} : \mathbb{C}\_{n}^{\text{BFGS}} : \Delta\,\varepsilon\_{n}}, \end{split} \tag{3.34}$$

where

$$
\Delta \varepsilon\_n = \varepsilon\_{n+1} - \varepsilon\_n \quad \text{and} \quad \Delta \sigma\_n = \sigma(\varepsilon\_{n+1}) - \sigma(\varepsilon\_n).
$$

We found that the material's linear elastic stiffness serves as a decent initial guess for $\mathbb{C}\_0^{\text{BFGS}}$. Consequently, Alg. 1 may be applied with $\mathbb{C}\_n^{\text{BFGS}}$ replacing $\frac{\partial^2 w}{\partial \varepsilon^2}(\varepsilon\_n)$ in (3.28). Note that, in contrast to the limited-memory schemes in Sec. 3.3.2 and Sec. 3.3.3, the linear system (3.28) still needs to be solved with an iterative solver. In comparison to the Newton-CG method, two additional strain-like fields need to be kept in memory to compute the differences $\Delta\varepsilon\_n$ and $\Delta\sigma\_n$ entering (3.34).
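For a single voxel, the update (3.34) can be sketched as follows. The example assumes that strains and stresses are stored as 6-component vectors in Mandel notation, so that the double contraction becomes a dot product and the tangent a symmetric 6x6 matrix; skipping the update for non-positive curvature is an additional assumption of this sketch, and all names and data are hypothetical.

```python
import numpy as np

def bfgs_tangent_update(C, d_eps, d_sig):
    """BFGS update of a local tangent in Mandel notation, cf. (3.34)."""
    curvature = np.dot(d_sig, d_eps)
    if curvature <= 0.0:
        return C                      # assumption: keep the old tangent if <d_sig, d_eps> <= 0
    C_eps = C @ d_eps
    return (C
            + np.outer(d_sig, d_sig) / curvature
            - np.outer(C_eps, C_eps) / np.dot(d_eps, C_eps))

# Hypothetical data: isotropic initial guess and a stress increment from a softer response
rng = np.random.default_rng(1)
C0 = 2.0 * np.eye(6)
d_eps = rng.standard_normal(6)
d_sig = np.diag([2.0, 2.0, 2.0, 0.5, 0.5, 0.5]) @ d_eps

C1 = bfgs_tangent_update(C0, d_eps, d_sig)
```

By construction, the updated tangent is symmetric and reproduces the observed stress increment, i.e. it satisfies the secant condition for the pair $(\Delta\varepsilon\_n, \Delta\sigma\_n)$.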

## **3.4 Numerical demonstrations**

#### **3.4.1 General setup**

The solution schemes were implemented in Python 2.7. Computationally expensive operations such as the application of $\Gamma^0$ and the evaluation of the material law were written as Cython extensions and parallelized using OpenMP. For the fast Fourier transform, we relied on the FFTW library (Frigo and Johnson, 2005). The computations ran on 6 threads on a desktop computer with 32 GB RAM and an Intel i7-8700K CPU with 6 cores and a clock rate of 3.7 GHz. An affine-linear extrapolation (Moulinec and Suquet, 1998) was used as initial guess for the strain field in case of multiple load steps. For the convergence criterion, we use (3.29)

$$\alpha\_0 \frac{\|\Gamma^0: \sigma^k\|\_{L^2}}{\|\left<\sigma^k\right>\_Y\|} \le \delta,$$

where $\alpha\_0$ is the scaling factor of the reference material $\mathbb{C}^0 = \alpha\_0 \mathrm{I}$. As $\Gamma^0 = (\alpha\_0)^{-1} \nabla^s G \operatorname{div}$, this convergence criterion is actually independent of $\alpha\_0$. For this study, we use the reference material of the basic scheme $\alpha\_0 = (\alpha\_+ + \alpha\_-)/2$. The tolerance is set to $\delta = 10^{-5}$ in Sec. 3.4.2 and $\delta = 10^{-4}$ in Sec. 3.4.3 and 3.4.4. Throughout, we utilize the staggered grid discretization (Schneider et al., 2016).

### **3.4.2 Continuous glass-fiber reinforced polyamide**

In the following, we investigate the performance of the L-BFGS method and Anderson acceleration as discussed in Sec. 3.3.2 and Sec. 3.3.3 with respect to the chosen depth $m$. As microstructure we consider a polyamide matrix, reinforced by continuous glass fibers with a volume fraction of 15%, and a resolution of $256^2$ pixels, see Fig. 3.1. Using a 2-dimensional structure enables investigating large values of the depth $m$, without memory becoming a limiting factor. Following Doghri et al. (2011), we assume that the mechanical behavior of the polyamide matrix is governed by $J\_2$-elastoplasticity, see Sec. 3.3 in Simo and Hughes (1998). For the sake of simplicity, the rate-dependent behavior of the material is neglected in this approach. A more involved material model, accounting for viscoelastic and viscoplastic effects, was proposed, e.g., by Krairi et al. (2019).

**Figure 3.1:** Continuous glass-fiber reinforced polyamide. **(a)** Microstructure ($256^2$ pixels). **(b)** Equivalent plastic strain at 5% uniaxial extension.

The relation between the yield stress $\sigma\_Y$ and the equivalent plastic strain $p = \int\_0^t \sqrt{\tfrac{2}{3}} \, \|\dot{\varepsilon}\_{\text{p}}\| \, \mathrm{d}\hat{t}$ is modelled by a linear-exponential hardening function

$$
\sigma\_Y(p) = \sigma\_0 + k\_1 p + k\_2 (1 - \exp(-mp)),
$$

where $\sigma\_0$ denotes the initial yield strength, $k\_1$ denotes the asymptotic hardening modulus and $k\_2 = \sigma\_\infty - \sigma\_0$ denotes the difference between the saturated and the initial yield strength for $k\_1 = 0$. The prefactor in the exponential function is given by $m = \Theta/k\_2$, where $\Theta$ denotes the initial hardening modulus. The glass fibers are modelled as linear elastic. The material parameters according to Doghri et al. (2011) are given in Tab. 3.1. We apply mixed boundary conditions (Kabel et al., 2016), corresponding to a uniaxial extension of 5% perpendicular to the fiber direction, in a single load step.
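The hardening function is straightforward to evaluate. The sketch below uses placeholder parameter values chosen purely for illustration; they are assumptions and not the values of Tab. 3.1.

```python
import numpy as np

def yield_stress(p, sigma0, k1, k2, m):
    """Linear-exponential hardening: sigma_Y(p) = sigma0 + k1 * p + k2 * (1 - exp(-m * p))."""
    return sigma0 + k1 * p + k2 * (1.0 - np.exp(-m * p))

# Placeholder parameters (hypothetical, in MPa where applicable)
sigma0, k1, k2, m = 30.0, 100.0, 30.0, 300.0

initial = yield_stress(0.0, sigma0, k1, k2, m)              # equals sigma0
saturated = yield_stress(1.0, sigma0, 0.0, k2, m)           # approaches sigma0 + k2 for k1 = 0
h = 1e-8
initial_slope = (yield_stress(h, sigma0, k1, k2, m) - initial) / h   # approx. k1 + m * k2
```

The three evaluations reflect the role of the parameters: the yield stress starts at $\sigma\_0$, saturates at $\sigma\_0 + k\_2$ for $k\_1 = 0$, and has the initial slope $k\_1 + m k\_2$.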

**Table 3.1:** Glass-fiber reinforced polyamide: Material parameters of fibers and matrix

The L-BFGS scheme and Anderson acceleration are investigated for depths from 1 to 200. In addition, Moulinec-Suquet's basic scheme (Moulinec and Suquet, 1998), the basic scheme with Barzilai-Borwein (BB) step-size control (Barzilai and Borwein, 1988; Schneider, 2019a), the Newton-CG method and the BFGS-CG method are included as benchmarks. For the Newton-CG method and the BFGS-CG method, we use forcing-term choice 2 of Eisenstat-Walker (3.20), see Sec. 3.4.3. The resulting iteration counts and the computational runtimes are given, depending on the depth, in Fig. 3.2 and Tab. 3.2.

**Figure 3.2:** Continuous glass-fiber reinforced polyamide: Iteration count (left) and computation time (right) with respect to the chosen depth

For Anderson acceleration, we observe that the required number of iterations drops significantly up to a depth of 5 and stagnates for depths larger than 50. Between the minimum depth of 1 and a depth of 200, i.e., keeping all iterates in memory, the iteration count decreases by 85%. In contrast, the convergence behavior of L-BFGS is much less affected by the chosen depth. From the onset, it requires far fewer iterations than Anderson acceleration and converges faster up to depths of 20. For depths larger than 5, the iteration counts of L-BFGS remain approximately constant, with a decrease of about 20% compared to a depth of 1.

Considering the overall computational effort, depths around 2 to 5 appear to be optimal for both schemes. Taking more iterates into account increases the computational effort for each iteration, which offsets a further decrease in iteration counts. For this range of depths, L-BFGS and Anderson acceleration have memory footprints of 8 to 14 and 6 to 12 strain fields, respectively, compared to 8.5 for the Newton-CG method, 10.5 for the BFGS-CG method, 2 for the Barzilai-Borwein scheme and 1 for the basic scheme.
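The quoted footprints are consistent with a memory demand that grows by two strain fields per stored iterate. The following sketch encodes this as affine formulas. Note that these formulas are inferred from the two endpoints given above (depths 2 and 5) and are an assumption, not a statement from the source. The function names and the choice of 3 stored strain components per pixel in 2D are likewise hypothetical.

```python
def lbfgs_strain_fields(depth):
    """Assumed affine fit: 8 fields at depth 2 and 14 fields at depth 5."""
    return 2 * depth + 4


def anderson_strain_fields(depth):
    """Assumed affine fit: 6 fields at depth 2 and 12 fields at depth 5."""
    return 2 * depth + 2


def footprint_bytes(n_fields, n_points=256**2, n_components=3, bytes_per_value=8):
    """Rough memory estimate in bytes for double-precision strain fields.

    Three strain components per pixel is an assumption for the 2D example.
    """
    return n_fields * n_points * n_components * bytes_per_value
```

Under these assumptions, the footprint at a depth of 200 would be dominated by the stored iterates, which illustrates why large depths are only practical for small two-dimensional problems.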

With the optimal depth choice, L-BFGS is the faster of the two limited-memory schemes. However, it performs worse than the (Quasi-)Newton-Krylov methods and the Barzilai-Borwein scheme, which exhibit similar runtimes. Even though L-BFGS converges in fewer iterations than the Barzilai-Borwein method, it is slower overall due to the higher computational cost per iteration. In particular, the parallelization of the inner products in the two-loop recursion of Alg. 3 was not effective, introducing a significant overhead, see Chen et al. (2014). The basic scheme is the slowest of the investigated solvers, taking about an order of magnitude longer to converge. Whereas its computational cost per iteration is similar to that of the Barzilai-Borwein scheme, the required iteration count is significantly higher, due to the pronounced material contrast of the composite during plastification. In conclusion, we observe that the Barzilai-Borwein scheme outclasses the investigated limited-memory methods both in performance and memory footprint. Therefore, we do not include the latter algorithms in the remaining numerical examples. The performance comparison of the remaining algorithms is expanded

in Sec. 3.4.3 and Sec. 3.4.4 for more complex microstructures and material laws, respectively.


**Table 3.2:** Continuous glass-fiber reinforced polyamide: Iteration counts and computational runtime with respect to the depth used in the algorithm

## **3.4.3 Porous short glass-fiber reinforced polyamide**

**Figure 3.3:** Porous glass-fiber-reinforced polyamide

We consider a porous polyamide matrix with short glass-fiber reinforcements, see Fig. 3.3, which is resolved by 256<sup>3</sup> voxels. The glass fibers are unidirectionally aligned in $x$-direction with a volume fraction of 15%. The volume fraction of the pores is 1%. The material models and parameters correspond to those in Section 3.4.2, see Tab. 3.1. The given example constitutes a challenging non-linear test problem for the investigated micromechanical solvers. Due to the high stiffness of the glass fibers in comparison to the softer polymer matrix, the material contrast between the two phases is large. During plastification, the contrast increases even further, as the minimum eigenvalue of the polyamide's tangent stiffness approaches 0, owing to the exponential hardening law. In combination with the unidirectional short-fiber structure, this results in strong localization of the strain fields around the fibers, see

Fig. 3.3. Last but not least, due to the presence of pores, the material contrast of the overall microstructure is infinite.

First, we investigate the different forcing term choices from Sec. 3.2.4 in the FFT-based setting to identify a suitable general-purpose strategy for the Newton-CG and BFGS-CG method. Next, we compare the performance of the solvers with the given forcing term choice for studying the material behavior under uniaxial extension.

**Influence of the forcing term on convergence and runtime.** In their study on forcing term strategies, Eisenstat and Walker (1996) considered numerical examples with up to 10<sup>4</sup> degrees of freedom. In the context of FFT-based micromechanics, much larger problem sizes are commonly considered, as it takes high voxel counts to finely discretize complex microstructures. Thus, we are interested in whether the results of Eisenstat-Walker carry over to the FFT-based setting for our current example with 6 × 256<sup>3</sup> ≈ 10<sup>8</sup> degrees of freedom. Furthermore, we investigate how the BFGS-CG scheme is affected by the different forcing term strategies in comparison to the Newton-CG scheme. The following choices are considered:

1. Choice 1 corresponds to the first adaptive strategy of Eisenstat and Walker (1996) (3.19)

$$\begin{split} \eta\_{n} = \frac{1}{\|\Gamma^{0}:\sigma(\varepsilon\_{n-1})\|\_{L^{2}}} \Big| &\|\Gamma^{0}:\sigma(\varepsilon\_{n})\|\_{L^{2}} \\ &- \Big\| \Big( \mathbf{I} + \Gamma^{0}:\Big[\frac{\partial^{2}w}{\partial\varepsilon^{2}}(\varepsilon\_{n}) - \mathbb{C}^{0} \Big] \Big) : \Xi\_{n-1} + \Gamma^{0}:\sigma(\varepsilon\_{n-1}) \Big\|\_{L^{2}} \Big|, \end{split} \tag{3.35}$$

with the associated safeguard (3.21) and Kelley's safeguard against oversolving (3.24) in place. For this choice, the forcing term is proportional to the disagreement between the gradient and its linear approximation. Thus, $\eta\_n$ decreases in the vicinity of the solution, and the linear system is solved with increasing accuracy. We start with a high value, i.e., low accuracy, of $\eta\_0 = \eta\_{\max} = 0.75$, which also serves as the upper bound for the forcing term.

2. Choice 2 corresponds to the second forcing-term strategy (3.20) by Eisenstat and Walker (1996)

$$\eta\_n = \lambda \left( \frac{||\Gamma^0 : \sigma(\varepsilon\_n)||\_{L^2}}{||\Gamma^0 : \sigma(\varepsilon\_{n-1})||\_{L^2}} \right)^{\beta},\tag{3.36}$$

with safeguards (3.22) and (3.24) preventing oversolving. Like choice 1, this represents an adaptive strategy. In this case, the ratio of recent residuals serves as a measure of the convergence rate. The latter is expected to decrease close to the solution, leading to smaller values of $\eta\_n$. For the algorithmic parameters, we chose $\lambda = 1$ and $\beta = \frac{1+\sqrt{5}}{2}$, resulting in a convergence behavior similar to choice 1. The initial value and upper bound for the forcing term are set to $\eta\_0 = \eta\_{\max} = 0.75$.
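The two adaptive choices (3.35) and (3.36) can be sketched as follows, given precomputed $L^2$-norms. This is a simplified illustration under stated assumptions: the function and argument names are hypothetical, the safeguards (3.21), (3.22) and (3.24) are omitted, and only the upper bound $\eta\_{\max}$ is enforced.

```python
import math

ETA_MAX = 0.75                        # initial value and upper bound quoted in the text
LAMBDA = 1.0                          # prefactor in (3.36)
BETA = (1.0 + math.sqrt(5.0)) / 2.0   # exponent in (3.36)


def forcing_term_choice_1(res_norm, linear_model_norm, res_norm_prev, eta_max=ETA_MAX):
    """Choice 1: mismatch between the residual norm and the norm of its linear model.

    res_norm          -- norm of the current residual
    linear_model_norm -- norm of the linearized residual predicted in the previous step
    res_norm_prev     -- norm of the previous residual (must be positive)
    """
    eta = abs(res_norm - linear_model_norm) / res_norm_prev
    return min(eta, eta_max)


def forcing_term_choice_2(res_norm, res_norm_prev, lam=LAMBDA, beta=BETA, eta_max=ETA_MAX):
    """Choice 2: power of the ratio of two consecutive residual norms."""
    eta = lam * (res_norm / res_norm_prev) ** beta
    return min(eta, eta_max)
```

With these definitions, a residual reduction by a factor of 10 gives `forcing_term_choice_2(0.1, 1.0)`, i.e., $0.1^{\beta}$, so the inner linear solve is tightened as the outer iteration converges.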


The boundary conditions for the problem correspond to uniaxial extension up to 1% tensile strain in fiber direction, parallel to the $x$-axis. The load is applied in a single step.

Two scenarios are considered. In the first case, the polyamide matrix is assumed to behave in a purely elastic way, resulting in a linear problem. For this example, the Newton-CG scheme and the BFGS-CG scheme are equivalent. In particular, this allows us to investigate the characteristic convergence behavior of the adaptive forcing-term choices 1 and 2 and the modest accuracy choice 3. Furthermore, we are interested in how the computational runtimes of choices 1 to 3 compare to that of choice 4, which is expected to converge in a single Newton step.

In the second case, the matrix behavior is governed by $J\_2$-elastoplasticity, constituting a non-linear problem. For the Newton-CG scheme, we compare the convergence behavior of the high accuracy choice 4 to the other options and evaluate whether quadratic convergence can be reached. Furthermore, we discuss how the convergence behavior for the different strategies changes when the approximated tangent stiffness of the BFGS-CG scheme is used. We conclude the investigation by comparing the computational performance of the forcing term choices for both solvers and assessing whether a strategy of choice can be identified.

**(a)** Linear elastic matrix behavior: Newton-CG solver **(b)** J2-elastoplastic matrix behavior: Newton-CG solver (left) and BFGS-CG solver (right)

**Figure 3.4:** Porous glass-fiber reinforced polyamide: Residual vs. number of Newton iterations

To evaluate the impact of the different forcing term choices, the residual is plotted as a function of the number of Newton iterations in Fig. 3.4, and as a function of the computation time in Fig. 3.5. The final iteration counts and computation times are listed in Tab. 3.3.

First, we take a look at the linear elastic case. As expected, the Newton scheme converges in a single step for the high accuracy choice 4. Choice 3 requires 5 iterations and converges at a linear rate. For choice 1 and

**(b)** J2-elastoplastic matrix behavior: Newton-CG solver (left) and BFGS-CG solver (right)

**Figure 3.5:** Porous glass-fiber reinforced polyamide: Residual vs. computation time

2, the convergence behavior is similar. Both start with a low accuracy and a comparatively slow convergence rate. As the residual becomes smaller, the value of $\eta\_n$ decreases and the linear system is solved to higher accuracy. Consequently, the convergence rate increases for the last iterations. For the linear elastic case, we observe that the overall number of iterations, i.e., the sum of CG and Newton iterations, is similar for all forcing term strategies, see Tab. 3.3. The computational effort of solving the linear system to high accuracy is comparable to taking a larger number of Newton steps with modest accuracy. Hence, despite the differences in Newton iteration counts, the different forcing-term choices exhibit similar computation times, see Fig. 3.5. Notably, choice 4 is not the fastest, even though it led to convergence in a single step. The remaining difference in runtimes between the choices is explained by the wasted computational effort of solving to a smaller residual than required. Fortuitously, the final residual for choice 3 is the closest to the chosen tolerance, leading to the lowest computation time.

Next, we consider the non-linear case solved by the Newton-CG scheme. For choices 1 to 3, the convergence behavior is similar to the linear elastic case. Choice 4, however, requires 5 iterations and does not converge much faster than choice 3, even though a much higher accuracy is used. Note that for the current example, the Newton-CG scheme with forcing term choice 4 does not exhibit a quadratic convergence rate within the chosen tolerance. For a preliminary computation on the small microstructure of Sec. 3.4.2, we could confirm a quadratic convergence rate for the Newton-CG method using very low tolerances (10<sup>−8</sup> and 10<sup>−9</sup>) and thereby validate our implementation. However, the computational effort wasted by oversolving was even more excessive for such a setup. With respect to computation time, choices 1 and 2 are the fastest for the current example, converging after just over 300 seconds. Choice 3 takes roughly 30% longer. Taking a look at the overall runtime of choice 4 reveals the computational cost of oversolving. For this example, the advantage of Kelley's safeguard (3.24) becomes apparent. For all forcing-term strategies, we arrive at a residual slightly above the desired accuracy in the second-to-last iteration. For the adaptive choices 1 and 2, safeguard (3.24) is active and, consequently, the linear system is solved to low accuracy in a short time. In the case of the constant choices 3 and 4, where the safeguard is not used, we arrive at residuals much smaller than the desired accuracy, wasting computational effort.
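Whether a residual history exhibits quadratic convergence can be checked with the observed convergence order computed from three consecutive residuals. The following sketch shows this common diagnostic. It is not necessarily the procedure used for the validation mentioned above, and the function name is hypothetical.

```python
import math


def observed_orders(residuals):
    """Estimate the convergence order from consecutive residual norms.

    For residuals r0, r1, r2, the order is log(r2 / r1) / log(r1 / r0).
    Assumes strictly positive, strictly decreasing residuals.
    """
    orders = []
    for r0, r1, r2 in zip(residuals, residuals[1:], residuals[2:]):
        orders.append(math.log(r2 / r1) / math.log(r1 / r0))
    return orders
```

Values close to 1 indicate linear convergence, whereas values close to 2 indicate quadratic convergence. For a history that terminates after few Newton steps, only a handful of estimates are available, so the diagnostic is most informative for very low tolerances.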

To conclude the investigation, we take a look at the BFGS-CG scheme. For this solver, choices 3 and 4 lead to roughly the same linear rate of convergence. After a few initial steps with a low accuracy, an identical convergence rate is approached for choices 1 and 2 as well. Apparently, a higher accuracy than for choice 3 does not improve the convergence rate for the BFGS tangent approximation (3.34). With respect to the overall runtime, choices 1 and 2 are fastest, with choice 3 being only marginally slower. Choice 4 is the slowest option by far.

To summarize, we observe that for non-linear material behavior, the forcing term choices 1 and 2 by Eisenstat-Walker lead to the shortest runtime. However, choice 3 with a constant forcing term of $\eta = 0.1$ is not much slower and serves as an easy-to-implement alternative. Based on the performance of choice 4, we come to the same conclusion as Knoll


**Table 3.3:** Porous glass-fiber reinforced polyamide: Iteration counts and computation times for different forcing term choices

and Keyes (2004): Aiming for a high (possibly quadratic) convergence rate by solving the linear system to high accuracy is inefficient with respect to the overall runtime of the scheme. These conclusions hold both for Newton-CG and BFGS-CG. Comparing the two solution schemes, we find that for the fastest forcing-term choice 2 the BFGS-CG scheme is only about 22% slower than the Newton-CG method, even though we applied a large non-linear load step. For the material laws considered in this example, we conclude that the BFGS update leads to a decent approximation of the tangent stiffness in a limited number of iterations.

**Discussion of the effective elastoplastic material properties.** From a material-science viewpoint, the effective elastoplastic behavior of the composite material is of interest. In particular, this includes characterizing the anisotropy of the stress-strain relation in the elastic regime and the shape of the yield boundary. To this end, we simulate uniaxial tensile tests in various directions relative to the fiber direction, i.e., the $x$-axis. To be specific, the loading is applied at 0°, 15°, 45° and 90° relative to the $x$-axis in the $x$-$y$- and $x$-$z$-plane and at 0°, 45° and 90° relative to the $y$-axis in the $y$-$z$-plane. The tensile tests are performed up to 5% strain in load direction and subdivided into 50 load steps to obtain finely resolved stress-strain curves. This gives us the opportunity to evaluate the performance of the investigated solvers for a relevant practical application.

This paragraph focuses on the characterization of the material behavior, based on the results of the simulations. The convergence behavior and runtimes of the solution schemes are subsequently discussed in Sec. 3.4.3. The linear elastic behavior of the composite is characterized by the effective stiffness tensor $\bar{\mathbb{C}}$ relating the effective stress $\bar{\sigma} = \langle \sigma \rangle$ and the effective strain $\bar{\varepsilon} = \langle \varepsilon \rangle$ by Hooke's law

$$
\bar{\sigma} = \bar{\mathbb{C}} : \bar{\varepsilon}. \tag{3.37}
$$

Using the elastic parameters in Tab. 3.1, the effective stiffness of the composite material, given in Voigt's notation, reads

$$
\bar{\mathbf{C}} = \begin{bmatrix}
10.1 & 1.42 & 1.41 & 0.01 & 0.0 & 0.01 \\
1.42 & 3.49 & 1.45 & 0.03 & 0.0 & 0.0 \\
1.41 & 1.45 & 3.48 & 0.02 & 0.0 & 0.0 \\
0.01 & 0.03 & 0.02 & 1.04 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.0 & 1.11 & 0.02 \\
0.01 & 0.0 & 0.0 & 0.0 & 0.02 & 1.11
\end{bmatrix} \text{ GPa,}
$$

up to 3 significant digits, and was identified through 6 linear elastic computations. $\bar{\mathbb{C}}$ may be well approximated by a transversely isotropic stiffness tensor with engineering constants $E\_{\mathrm{L}} = 9.29$ GPa, $E\_{\mathrm{T}} = 2.81$ GPa, $\nu\_{\mathrm{TT}} = 0.38$, $\nu\_{\mathrm{LT}} = 0.29$ and $G\_{\mathrm{LT}} = 1.11$ GPa, with a relative error below 1%. As a measure of the elastic anisotropy, we consider $\mathbb{C}^{\text{aniso}}$ defined as

$$\mathbb{C}^{\text{aniso}} = \bar{\mathbb{C}} - \mathbb{C}^{\text{iso}} \quad \text{with} \quad \mathbb{C}^{\text{iso}} = (\bar{\mathbb{C}} :: \mathbb{P}\_1)\mathbb{P}\_1 + \frac{1}{5}(\bar{\mathbb{C}} :: \mathbb{P}\_2)\mathbb{P}\_2,\tag{3.38}$$

where $\mathbb{P}\_1$ and $\mathbb{P}\_2$ denote the projectors onto the spherical and deviatoric symmetric $3 \times 3$ matrices, respectively. The symbol :: denotes the quadruple tensor contraction, i.e., $\mathbb{B} :: \mathbb{C} = B\_{ijkl} C\_{ijkl}$ in index notation, using the summation convention. For the given material, $\|\mathbb{C}^{\text{aniso}}\| / \|\bar{\mathbb{C}}\| = 47\%$ in the Frobenius norm, i.e., the elastic anisotropy is strong for this case.
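The anisotropy measure (3.38) can be reproduced from the Voigt matrix given above. The following sketch is an illustration under stated assumptions: it converts the Voigt matrix to Mandel (normalized Voigt) notation, so that entrywise matrix sums coincide with the quadruple contraction of the underlying fourth-order tensors. This conversion and all function names are assumptions made for the example, not taken from the source.

```python
import math

# Effective stiffness in Voigt notation (GPa), copied from the text above.
C_VOIGT = [
    [10.1, 1.42, 1.41, 0.01, 0.0, 0.01],
    [1.42, 3.49, 1.45, 0.03, 0.0, 0.0],
    [1.41, 1.45, 3.48, 0.02, 0.0, 0.0],
    [0.01, 0.03, 0.02, 1.04, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.11, 0.02],
    [0.01, 0.0, 0.0, 0.0, 0.02, 1.11],
]


def voigt_to_mandel(c):
    """Rescale shear rows and columns by sqrt(2) (Voigt stiffness -> Mandel)."""
    s = [1.0, 1.0, 1.0] + [math.sqrt(2.0)] * 3
    return [[s[i] * s[j] * c[i][j] for j in range(6)] for i in range(6)]


def contract(a, b):
    """Quadruple contraction A :: B of two tensors given as Mandel matrices."""
    return sum(a[i][j] * b[i][j] for i in range(6) for j in range(6))


# Projectors onto spherical (P1) and deviatoric (P2) symmetric second-order tensors.
P1 = [[1.0 / 3.0 if i < 3 and j < 3 else 0.0 for j in range(6)] for i in range(6)]
P2 = [[(1.0 if i == j else 0.0) - P1[i][j] for j in range(6)] for i in range(6)]


def anisotropy_ratio(c_voigt):
    """Return ||C_aniso|| / ||C|| with C_iso defined as in (3.38)."""
    c = voigt_to_mandel(c_voigt)
    a = contract(c, P1)        # coefficient of P1, since P1 :: P1 = 1
    b = contract(c, P2) / 5.0  # coefficient of P2, since P2 :: P2 = 5
    aniso = [[c[i][j] - a * P1[i][j] - b * P2[i][j] for j in range(6)]
             for i in range(6)]
    return math.sqrt(contract(aniso, aniso) / contract(c, c))
```

Evaluating `anisotropy_ratio(C_VOIGT)` with the rounded entries above gives a value of about 0.47, in line with the 47% quoted in the text. For an isotropic stiffness, the ratio vanishes up to round-off.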

**(a)** Stress-strain curves for varying load angles in the $x$-$y$-plane

**(b)** 0.2% offset yield strength at varying load angles in the $x$-$y$-, $x$-$z$- and $y$-$z$-plane

**Figure 3.6:** Elastoplastic behavior of the porous glass-fiber reinforced polyamide. The load angles are measured relative to the $x$-axis (fiber direction) in the $x$-$y$- and $x$-$z$-plane and relative to the $y$-axis in the $y$-$z$-plane

The stress-strain curves for the simulated uniaxial tensile tests in the $x$-$y$-plane are shown in Fig. 3.6a. We observe that, up to an angle of 45°, the stiffness decreases and the onset of plastic behavior shifts to lower stresses and higher strains. Between 45° and 90° offset of fiber to load direction, the observed behavior stays roughly identical. A common measure to quantify the onset of plasticity is the 0.2% offset yield point, as the actual yield stress is difficult to determine for smooth stress-strain diagrams. It is defined as the stress at which the component of the effective plastic strain $\bar{\varepsilon}\_{\mathrm{p}} = \bar{\varepsilon} - \bar{\mathbb{C}}^{-1} : \bar{\sigma}$ in load direction reaches 0.2%. The results with respect to the load angle are shown in Fig. 3.6b. Due to the isotropic behavior in the $y$-$z$-plane perpendicular to the fiber direction, as well as the similarity of the curves in the $x$-$y$- and $x$-$z$-plane, the boundary of the effective yield surface is approximately transversely isotropic. The yield strength in fiber direction is highest and decreases in a roughly linear way up to a relative angle of 45°. Between 45° and 90°, it stays approximately constant. Even though the yield strength perpendicular to the fiber direction is a factor of 2.5 lower than in fiber direction, it is still 1.6 times higher than for the unreinforced matrix material, see Tab. 3.1.
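The evaluation of the offset yield point from a discrete stress-strain curve can be sketched as follows. This is a one-dimensional simplification under stated assumptions: instead of the full tensorial relation, a scalar elastic modulus in load direction is used to compute the plastic strain component, and the crossing of the 0.2% threshold is located by linear interpolation between load steps. The function name and arguments are hypothetical.

```python
def offset_yield_stress(strains, stresses, modulus, offset=0.002):
    """Return the stress at which the plastic strain reaches `offset`.

    strains, stresses -- sampled uniaxial curve in load direction (same length)
    modulus           -- elastic modulus in load direction (assumed known)
    Returns None if the offset is never reached.
    """
    for i in range(len(strains)):
        eps_p = strains[i] - stresses[i] / modulus  # plastic strain in load direction
        if eps_p >= offset:
            if i == 0:
                return stresses[0]
            eps_p_prev = strains[i - 1] - stresses[i - 1] / modulus
            # Linear interpolation between the two load steps bracketing the offset.
            w = (offset - eps_p_prev) / (eps_p - eps_p_prev)
            return stresses[i - 1] + w * (stresses[i] - stresses[i - 1])
    return None
```

With 50 load steps up to 5% strain, as used here, the load steps are 0.1% apart, so the interpolation error of such a procedure is expected to be small compared to the differences between the load angles.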

**Performance comparison for uniaxial extension.** Due to the transversely isotropic material behavior, we restrict the performance comparison of the solution schemes to the computations in the $x$-$y$-plane. Fig. 3.7 shows the computation time, the total number of iterations and the number of gradient evaluations for each load step. For the Newton-CG and BFGS-CG solvers, the total number of iterations denotes the sum of CG and outer iterations, whereas only the latter are counted for the number of gradient evaluations. For the basic scheme and the Barzilai-Borwein scheme, the gradient is evaluated in each iteration, leading to identical counts for both values.

Qualitatively, the resulting plots for the computations at varying load angles are roughly similar. As the affine-linear extrapolation takes effect, the iteration counts and runtimes significantly decrease from the first to the second load step. For the computations with relative load angles of 45° and 90°, the second load step is still linear elastic and the solution

**Figure 3.7:** Porous glass-fiber reinforced polyamide: Performance comparison of the solution schemes for uniaxial extension at various load angles relative to the $x$-direction in the $x$-$y$-plane


**Table 3.4:** Porous glass-fiber reinforced polyamide: Mean computation times and iteration counts for uniaxial extension at various load angles in the $x$-$y$-plane

schemes converge within a single iteration. Subsequently, the iteration counts increase at the onset of plastification and decrease again after the material is fully plastified. Taking a closer look at the BFGS-CG method, we notice that its performance closely matches that of the Newton-CG method. This observation holds both for the overall performance, see Tab. 3.4, and for the iteration count and runtime within each load step, see Fig. 3.7. The tangent stiffness tensor for $J\_2$-elastoplasticity is merely a rank-one update of the elastic stiffness tensor, see Sec. 3.3.2 in Simo and Hughes (1998). As the BFGS-CG method is initialized with the elastic stiffness, the analytic tangent is well approximated within a few BFGS updates.

Evaluating the material law of $J\_2$-elastoplasticity is comparatively cheap, see Simo and Hughes (1998). More precisely, the computation time spent on evaluating $\varepsilon \mapsto \sigma(\varepsilon)$ for all voxels is roughly of the same order of magnitude as the computation time for the application of $\Gamma^0$ and the associated FFTs for typical cell sizes and resolutions. Usually, these are the most expensive steps in an FFT-based solution algorithm. In Tab. 3.5,


**Table 3.5:** Porous glass-fiber reinforced polyamide: Computation time per application of the most expensive operations for loading in $x$-direction, solved by the Newton-CG method

the average computation time per application of these operations is given for the 0° load case solved by the Newton-CG method. For the given problem, we observe that evaluating the material law is slightly faster than applying forward and backward FFT, and about twice as expensive as applying the tangent $\Xi \mapsto \frac{\partial^2 w}{\partial \varepsilon^2}(\varepsilon) : \Xi$, i.e., a linear elastic material. The results for the other load cases and solution schemes are roughly similar. Note that the tangent operator is only applied when using the Newton-CG and BFGS-CG methods. As a consequence, the computational cost of a gradient evaluation is similar to that of a CG iteration, and the runtimes of all solvers are roughly proportional to their total iteration count, see Fig. 3.7. Thus, even though the Newton-CG and BFGS-CG methods require far fewer evaluations of the material law, the Barzilai-Borwein scheme converges faster. The basic scheme is slower than the other investigated algorithms by a factor of 5 to 8. Due to the affine-linear extrapolation, the difference in performance is not as pronounced as for our previous example in Sec. 3.4.2.

## **3.4.4 Directionally solidified NiAl-Cr(Mo) alloy**

Due to their high melting point and corrosion resistance, nickel-aluminum-chromium eutectics with minor additions of molybdenum, i.e., NiAl-Cr(Mo) alloys, are a promising class of structural high-temperature materials. The material behavior of the components in this alloy is governed by single-crystal elasto-viscoplasticity. Compared to the material laws of Sec. 3.4.3, i.e., linear elasticity and $J\_2$-elastoplasticity, evaluating the material law of a single-crystal elasto-viscoplasticity model is considerably more expensive and tends to dominate the overall computation time (Eghtesad et al., 2018a). Thus, NiAl-Cr(Mo) alloys represent a valuable benchmark for the investigated solution schemes. It is expected that the number of required gradient evaluations is more indicative of the overall performance in this case. This fact favors the use of (Quasi-)Newton-Krylov methods, as the solution of the linear system is less relevant for the runtime.

After a directional solidification process, NiAl-Cr(Mo) develops a cellular structure with NiAl and Cr(Mo) lamellae parallel to the growth direction (Cline and Walter, 1970). Similar microstructures are observed for other intermetallics, e.g., titanium-aluminides (Huang and Hall, 1991) or iron-aluminides (Scherf et al., 2016; Schmitt et al., 2017). To investigate the mechanical behavior of a lamellar NiAl-Cr(Mo) alloy, a cellular microstructure with 512 grains was generated using the Voronoi tessellation routine of the software Neper (Quey et al., 2011). Based on findings by Whittenberger et al. (2001) and Raj and Locci (2001) for moderate solidification rates, an aspect ratio of 4 along the growth direction, which is aligned with a coordinate axis, was chosen for the grains. The microstructure is shown in Fig. 3.8, resolved by 64<sup>3</sup> voxels.

Notice that we do not resolve the lamellar structure for each grain, as this would require an excessively high voxel count. Instead, we homogenize a two-phase laminate for each voxel using the algorithm presented in Kabel et al. (2017). The orientation of the grains was chosen so that the normal direction of the laminate interface is uniformly distributed in the plane perpendicular to the growth direction. Cline and Walter (1970) investigated the crystallographic relationship in the laminate and showed that all planes and directions of NiAl and Cr(Mo) are parallel.

**Figure 3.8:** Directionally solidified NiAl-Cr(Mo)


The laminate interface is parallel to the $(11\bar{2})$ plane and the growth direction is parallel to the $\langle 111 \rangle$ direction.

For the two phases of the laminate, the material behavior is governed by a single-crystal elasto-viscoplastic model. The infinitesimal strain is additively decomposed

$$
\varepsilon = \varepsilon\_{\mathrm{e}} + \varepsilon\_{\mathrm{p}} \tag{3.39}
$$

into an elastic part $\varepsilon\_{\mathrm{e}}$ and a plastic part $\varepsilon\_{\mathrm{p}}$. The stress-strain relationship follows Hooke's law

$$
\sigma = \mathbb{C} : \varepsilon\_{\mathrm{e}} = \mathbb{C} : (\varepsilon - \varepsilon\_{\mathrm{p}}) \tag{3.40}
$$

for the elastic strains. For single-crystal elasto-viscoplasticity, the plastic strain is composed of simple shear deformations of the individual crystallographic slip systems. The evolution of the plastic strain is governed by (Bishop, 1953)

$$\dot{\varepsilon}\_{\mathrm{p}} = \sum\_{\alpha=1}^{N} \dot{\gamma}\_{\alpha} d\_{\alpha} \otimes^{s} n\_{\alpha},\tag{3.41}$$

where $\dot{\gamma}\_{\alpha}$, $d\_{\alpha}$ and $n\_{\alpha}$ denote the slip rate, slip direction and slip plane normal of the $\alpha$-th of $N$ slip systems, respectively. For the flow rule of the slip rate, we chose the power-law formulation of Hutchinson (1976)

$$\dot{\gamma}\_{\alpha} = \dot{\gamma}\_{0} \text{sgn}(\tau\_{\alpha}) \left| \frac{\tau\_{\alpha}}{\tau^{\text{F}}} \right|^{m}, \quad \text{with} \quad \tau\_{\alpha} = \sigma : (d\_{\alpha} \otimes^{s} n\_{\alpha}) \tag{3.42}$$

with reference slip rate $\dot{\gamma}\_{0}$, yield stress $\tau^{\text{F}}$ and stress exponent $m$. For the reinforcing Cr(Mo) phase, the yield stress $\tau^{\text{F}}$ is modeled following Albiez et al. (2016a)

$$\tau^{\text{F}} = \frac{\tau\_{\infty}}{d\sqrt{\rho} + 1} \quad \text{with} \quad \rho = \rho\_s \left[ 1 - \exp\left( -\frac{1}{2} k\_2 \gamma \right) \left( 1 - \sqrt{\frac{\rho\_0}{\rho\_s}} \right) \right]^2 \tag{3.43}$$

with maximum yield stress $\tau\_{\infty}$, characteristic length $d$, recovery constant $k\_2$, and dislocation density $\rho$ with its initial value $\rho\_0$ and its saturation value $\rho\_s$. NiAl is assumed to behave perfectly plastically, i.e., its yield stress $\tau^{\text{F}}$ is constant. The material parameters and volume fractions for NiAl-31Cr-3Mo are taken from Albiez et al. (2016b), see Tab. 3.6.
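The constitutive relations (3.41) to (3.43) can be sketched for a single material point as follows. This is an illustration under stated assumptions, not the implementation used in this work: all function and argument names are hypothetical, the quantity $\gamma$ in (3.43) is simply passed in as a given scalar, and no time integration or laminate homogenization is included.

```python
import math


def sym_dyad(d, n):
    """Symmetrized dyadic product of slip direction d and slip plane normal n."""
    return [[0.5 * (d[i] * n[j] + n[i] * d[j]) for j in range(3)] for i in range(3)]


def resolved_shear_stress(sigma, d, n):
    """tau_alpha = sigma : (d sym-dyad n), see (3.42)."""
    m = sym_dyad(d, n)
    return sum(sigma[i][j] * m[i][j] for i in range(3) for j in range(3))


def slip_rate(tau, tau_f, gamma_dot_0, exponent):
    """Hutchinson power law (3.42) for a single slip system."""
    return gamma_dot_0 * math.copysign(1.0, tau) * abs(tau / tau_f) ** exponent


def dislocation_density(gamma, rho_0, rho_s, k_2):
    """Dislocation density rho(gamma) from (3.43)."""
    decay = math.exp(-0.5 * k_2 * gamma)
    return rho_s * (1.0 - decay * (1.0 - math.sqrt(rho_0 / rho_s))) ** 2


def yield_stress_crmo(gamma, tau_inf, d_char, rho_0, rho_s, k_2):
    """Yield stress tau^F of the Cr(Mo) phase from (3.43)."""
    rho = dislocation_density(gamma, rho_0, rho_s, k_2)
    return tau_inf / (d_char * math.sqrt(rho) + 1.0)


def plastic_strain_rate(sigma, slip_systems, tau_f, gamma_dot_0, exponent):
    """Sum the slip contributions of all systems (d, n), see (3.41)."""
    rate = [[0.0] * 3 for _ in range(3)]
    for d, n in slip_systems:
        g = slip_rate(resolved_shear_stress(sigma, d, n), tau_f, gamma_dot_0, exponent)
        m = sym_dyad(d, n)
        for i in range(3):
            for j in range(3):
                rate[i][j] += g * m[i][j]
    return rate
```

Note that `dislocation_density` returns $\rho\_0$ for $\gamma = 0$ and approaches $\rho\_s$ for large $\gamma$. For $\rho\_s > \rho\_0$, the yield stress therefore decreases with increasing $\gamma$, which corresponds to the softening behavior mentioned for this material.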

Note that the single-crystal plasticity model with Hutchinson's flow rule is not a generalized standard material (Steinmann and Stein, 1996) and has a non-symmetric tangent stiffness. As the tangent stiffness of the phases enters the homogenized tangent stiffness of the laminate, see Glüge and Kalisch (2014), this would usually prohibit using the CG method for solving (3.28). However, we found in Sec. 4.6 that using the Newton-CG method and considering only the symmetric part of the tangent stiffness yielded decent results. Hence, we use the symmetrized


**Table 3.6:** Directionally solidified NiAl-Cr(Mo): Material parameters of Cr(Mo) lamellae and NiAl matrix (Albiez et al., 2016b)

tangent stiffness of the single phases for the solution of the laminate and the computation of its tangent.

**Discussion of the effective creep behavior.** For high-temperature structural materials, the creep behavior, i.e., the deformation of the material subjected to a constant stress load, is an important mechanical characteristic. To investigate the anisotropic creep behavior of the NiAl-Cr(Mo) microstructure, we simulate creep tests in various directions relative to the growth direction of the material. More specifically, we apply boundary conditions corresponding to uniaxial compression with a magnitude of 200 MPa at 0°, 15°, 45° and 90° relative to the growth direction in the two coordinate planes containing it, and at 0°, 45° and 90° within the coordinate plane perpendicular to it. The load is applied within 1 second in a single load step and is subsequently held constant for the specified creep time, which is subdivided into 50 load steps. The creep times for each angle are listed in Tab. 3.7 and were chosen to obtain a fine resolution of the creep rate in time. Note that, due to the prescribed softening behavior (3.43), an excessively coarse resolution of the load steps over time leads to divergence of the solution schemes for this material. Simulating such a creep loading is a challenging problem for the investigated solution schemes, as a load transfer from the softer NiAl to the more creep-resistant Cr(Mo) occurs as a viscous effect after the initial loading, see Albiez et al. (2016a;b). Thus, the loading in the single phases is non-monotone, especially in the first few load steps after the initial loading.


**Table 3.7:** Directionally solidified NiAl-Cr(Mo): Creep times with respect to load angle for all simulated creep experiments

In the following, we discuss the creep behavior observed in the simulations. The performance of the solution schemes for this example is compared in Sec. 3.4.4. For the characterization of the creep behavior, the creep rate $\dot{\varepsilon}^{\mathrm{c}}$, i.e., the rate of the strain component in load direction measured after the initial loading, and its minimum value $\dot{\varepsilon}^{\mathrm{c}}\_{\min}$ are of interest.
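The creep rate and its minimum can be extracted from the simulated strain history by finite differences over the load steps. The following sketch assumes that the times and the strain component in load direction are available as plain lists, starting after the initial loading. The function names are hypothetical, and magnitudes are used because the applied load is compressive.

```python
def creep_rates(times, strains):
    """Backward-difference creep rate for each load step after the first sample."""
    return [
        (strains[i] - strains[i - 1]) / (times[i] - times[i - 1])
        for i in range(1, len(times))
    ]


def minimum_creep_rate(times, strains):
    """Smallest creep rate magnitude over the recorded history."""
    return min(abs(rate) for rate in creep_rates(times, strains))
```

With 50 load steps per creep test, this yields 50 rate samples per load angle if the state directly after the initial loading is included as the first entry.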

In Fig. 3.9a, the creep curves for the simulations in a plane containing the growth direction are shown. The curve for the load in growth direction agrees well with the computational and experimental results reported by Albiez et al. (2016b). Up to a load angle of 45°, we observe an increase in the overall creep rate and a less pronounced softening behavior, i.e., a smaller increase of the creep rate at increasing strains. This signifies that, in the case of aligned load and growth direction, a large amount of stress is carried by the creep-resistant Cr(Mo) lamellae, which in turn activates their softening behavior. Fig. 3.9b shows the minimum creep rate for all computations

**(a)** Creep rate vs. creep strain for varying load angles in a plane containing the growth direction

**(b)** Minimum creep rates for varying load angles

**Figure 3.9:** Effective creep behavior of directionally solidified NiAl-Cr(Mo) at different load angles for an applied load of 200 MPa. The load angles are given with respect to the growth direction in the two coordinate planes containing it and with respect to an in-plane coordinate axis in the plane perpendicular to it

as a function of the load angle. The good agreement of the results in the two planes containing the growth direction, as well as the approximately isotropic behavior in the plane perpendicular to it, indicate a transversely isotropic effective creep behavior for NiAl-Cr(Mo). We observe that, with increasing angle relative to the growth direction, the logarithm of the minimum creep rate increases linearly up to an angle of 45° and subsequently stagnates. The difference between the highest and lowest value of $\dot{\varepsilon}^{\mathrm{c}}\_{\min}$ is slightly over two orders of magnitude. This represents an improvement in robustness compared to the similar directionally solidified molybdenum-reinforced nickel-aluminum alloys (NiAl-Mo), which form unidirectionally aligned fiber structures instead of laminates. For NiAl-Mo, FFT-based computations predicted a decrease in creep strength by roughly 4 orders of magnitude, down to the level of pure NiAl, in the case of off-axis loading, see Sec. 4.6.3. Similarly, Seemüller et al. (2013) experimentally observed a considerable increase in creep rate for NiAl-Mo with a high content of misaligned fibers. Thus, we conclude that the cellular laminate structure of NiAl-Cr(Mo) leads to a weaker anisotropy and a larger robustness against misaligned loading compared to fibrous materials with a similar composition.

**Performance comparison for creep loading.** In analogy to Sec. 3.4.3, we take a closer look at the runtimes, total iteration counts and gradient evaluations of the solvers for each load step, see Fig. 3.10. Due to the material's transversely isotropic behavior, we restrict the discussion to the computations in the -plane. During the first few load steps of the creep computations, we observe high iteration counts and runtimes, due to the initial load application and the subsequent load transfer. This behavior is less pronounced for the case where growth direction and loading direction are parallel. As the normal directions of the laminates are distributed in the -plane, all laminate planes are parallel to the -direction. Thus, the resultant fields are less heterogeneous for a loading in this direction, leading to lower computational costs. As the fields stabilize and the affine-linear extrapolation takes effect, computation times and the required number of material evaluations decrease to a lower level, roughly between load steps 5 and 15. For the 0<sup>∘</sup> and 15<sup>∘</sup> load angle computations, the computation time per material evaluation increases with the creep time, due to the softening of the material. In the former case, the required number of iterations increases as well, as the softening is more pronounced and leads to a higher internal material contrast.

In contrast to our previous example in Sec. 3.4.3, solving the two-phase laminate and evaluating the single-crystal elasto-viscoplastic material laws dominates the overall computation time, see Tab. 3.8. This holds true for all solvers and load cases. Hence, we observe that the runtime is approximately proportional to the number of gradient evaluations.

**Figure 3.10:** Directionally solidified NiAl-Cr(Mo): Performance comparison of the solution schemes for creep loading at various load angles relative to the -direction in the -plane


**Table 3.8:** Directionally solidified NiAl-Cr(Mo): Computation time per application of the most expensive operations for the case of loading in -direction solved by the Newton-CG method

We take a closer look at the convergence behavior of the BFGS-CG method. Roughly up to the 5th load step, the BFGS-CG method requires a higher number of Newton iterations than the Newton-CG method. In comparison to the example in Sec. 3.4.3, it takes more BFGS update iterations to achieve a good approximation of the tangent stiffness. Firstly, this can be traced back to the difference in loading. Whereas the first load steps of the uniaxial extension in Sec. 3.4.3 were in the linear elastic regime, the creep loading is rapidly applied in the first load step, immediately leading to non-linear material behavior. Secondly, the tangent stiffness of the single-crystalline phases and the resulting homogenized tangent stiffness of the laminate are more complex than that of $J_2$-elastoplasticity. Thus, with the linear elastic stiffness as starting point, more BFGS updates are necessary to approximate the material's tangent stiffness. After the slower initial load steps, BFGS-CG and Newton-CG exhibit similar runtimes and Newton iteration counts. In fact, the BFGS-CG method even converges in slightly fewer Newton iterations than the Newton-CG method for some load steps. This may be due to a combination of two factors. Firstly, we use the symmetrized tangent of the single phases to compute the tangent of the laminate. Secondly, we do not achieve the highest possible convergence rate for Newton-CG because we use the forcing term choice 2, see Sec. 3.4.3. We


**Table 3.9:** Directionally solidified NiAl-Cr(Mo): Mean computation times and iteration counts for creep loading at various angles in the -plane

further note that the BFGS-CG method requires more CG iterations than Newton-CG, see Tab. 3.9. This indicates that the BFGS tangent approximation exhibits a higher internal material contrast than the analytic tangent for this example. Comparing the mean computation times per load step, we see that this does not negatively impact the method's overall performance. In conclusion, BFGS-CG and Newton-CG exhibit very similar computation times with BFGS-CG being even slightly faster for the 15<sup>∘</sup> to 90<sup>∘</sup> load angle computations.

For the Barzilai-Borwein method, we note that the total number of iterations is similar to the Newton-CG method for all computations. However, as the material law is evaluated for every iteration of the Barzilai-Borwein scheme, the resulting computation times are 1*.*5 to 2*.*5 times higher than for Newton-CG and BFGS-CG.

The basic scheme is the most time-consuming algorithm, taking about 4 to 10 times longer to converge than the inexact (Quasi-)Newton methods. Note that for all load cases except the 0<sup>∘</sup> loading, the iteration counts of the basic scheme fluctuate significantly between load steps, even

**Figure 3.11:** Directionally solidified NiAl-Cr(Mo): Performance comparison of the two reference-material choices for the basic scheme for the 15<sup>∘</sup> load case

after the strain field stabilizes and the creep rate reaches its minimum value. This unexpected effect is a result of our choice of reference material $\alpha^0 = (\alpha_+ + \alpha_-)/2$, which is only theoretically justified for materials whose tangent has a lower bound. For our given material, this cannot be assured globally, due to the prescribed softening behavior. However, convergence of the basic scheme to a critical point can be shown for materials with only an upper bound on the tangent if the reference material is chosen as $\alpha^0 = \alpha_+$, see Sec. 1.2.3 in Nesterov's book (Nesterov, 2004). We compared the two choices for $\alpha^0$ for the 15<sup>∘</sup> load case where the fluctuations were most pronounced, see Fig. 3.11. For the conservative choice $\alpha^0 = \alpha_+$, iteration counts and runtimes develop smoothly. However, the mean iteration count and computation time per load step are about 30% higher for this choice. Hence, the results for $\alpha^0 = (\alpha_+ + \alpha_-)/2$ were included in the performance comparison of the different solution schemes.

## **3.5 Conclusions**

Quasi-Newton methods, such as Anderson acceleration (Shantraj et al., 2015; Chen et al., 2019b;a) and the Barzilai-Borwein method (Schneider, 2019a), have attracted considerable attention for FFT-based micromechanics. In contrast to the classical Newton method, these schemes do not require computing the Hessian. In addition, they generally outperform gradient-descent methods, which share this property (Nocedal and Wright, 1999). In the present chapter, this motivated us to exploit the most popular Quasi-Newton algorithm, the BFGS method, in the context of FFT-based micromechanics. First, we proposed an implementation of Nocedal's L-BFGS algorithm (Nocedal, 1980). While this scheme proved to be faster than the similar Anderson acceleration, pioneered by Shantraj et al. (2015), L-BFGS performed worse than the Barzilai-Borwein method, which is non-monotonic but has a smaller memory footprint. This can be traced back to the comparatively high computational cost per iteration of L-BFGS, due to the many inner product evaluations in the classical two-loop algorithm, see Alg. 3. It may be possible to reduce this computational overhead by using the more sophisticated L-BFGS implementation proposed by Chen et al. (2014), where the computation of all inner products can be parallelized more effectively. However, for material laws which can be cheaply evaluated, the Barzilai-Borwein scheme currently represents the general-purpose method of choice.

For computationally expensive material laws, such as single-crystal plasticity, it has been shown that Newton-CG is more efficient, due to the lower number of gradient (and thus material law) evaluations, see Sec. 4.6. This led us to our second use of the BFGS update for approximating the material tangent-stiffness in the Newton-CG scheme. With the resulting BFGS-CG method, we arrived at a scheme which was competitive in performance to the classical Newton-CG method, in particular for multistep loads. Although it cannot be measured in performance benchmarks, time spent programming is as much of a resource as time spent on computations. Thus, the main advantage of the BFGS-CG scheme is that it enables the tangent-free implementation of complex and computationally demanding material laws while still being fast enough to permit their efficient computational homogenization. The results of the performance comparison between the investigated solution schemes are summarized in Tab. 3.10.

As a side product of our investigation of (Quasi-)Newton methods, we found a globalization strategy suitable for FFT-based micromechanics in the line search algorithm of Dong (2010). Another aspect of major importance for the overall performance of these schemes was the choice of the forcing term. Among the various strategies tested in our numerical experiments, consistently solving the linear system to a high accuracy was by far the slowest option. Whereas this increased the overall computation time by factors of 5 to 7 compared to the other choices, the resulting convergence rate with respect to the required Newton iterations was barely improved within the given tolerance. The best overall performance was achieved by forcing term choice 2 of Eisenstat-Walker and its associated safeguards (Eisenstat and Walker, 1996; Kelley, 1995). However, similar performance was observed for a constant moderate forcing term of 0*.*1. Thus, the choice between these two options can be seen as a matter of preference, i.e., choosing optimal performance versus ease of implementation.

As demonstrated in our numerical experiments, both Newton-CG and BFGS-CG can handle non-linear materials with infinite contrast. Consequently, they are among the most widely applicable algorithms currently available in the FFT-based context. However, the robust handling of materials with negative tangent eigenvalues, e.g., in case of damage or strain-softening, is an open topic for further research. Dai demonstrated that the BFGS method does not converge for general functions in four or higher dimensions (Dai, 2013). Damped versions of the BFGS update formula are available, see Procedure 18.2 in Nocedal and Wright (1999),


**Table 3.10:** Summary of the performance comparison between the investigated solvers

which stabilize the convergence behavior of the linear solver. Still, this may result in overall divergence if the disagreement between the tangent and its approximation becomes too large. It remains to be investigated whether a suitable approach, such as the arc-length method used for conventional finite-element computations (Wriggers, 2008), can be adapted for FFT-based micromechanics.

## **Chapter 4**

# **An efficient solution scheme for small-strain crystal elasto-viscoplasticity in a dual framework<sup>1</sup>**

## **4.1 Introduction**

For polycrystals, evaluating the viscoplastic constitutive material law of single-crystalline phases is computationally expensive. The required iteration count of the original basic scheme is proportional to the material contrast, i.e. the quotient of largest and smallest eigenvalue of the tangential stiffness, evaluated for the entire microstructure. Even for a polycrystal consisting of a single crystalline phase, the internal material contrast can become large as a result of plastification. Lebensohn et al. (2012) adapted the augmented Lagrangian scheme, introduced by Michel et al. (2001), to small-strain crystal-elasto-viscoplasticity. The algorithm belongs to a class of polarization-based schemes (Moulinec and Silva, 2014; Schneider et al., 2019) whose required iteration count is proportional to the square root of the material contrast. Another class

<sup>1</sup> This chapter is based on Wicht et al. (2020a). For the sake of a coherent structure, formatting and typography of this thesis, minor changes have been made. To avoid redundancies in the text, the introduction has been shortened.

of fast solution methods with similar convergence rate was developed based on the interpretation of the basic scheme as a gradient descent method (Kabel et al., 2014), enabling the use of accelerated gradient schemes (Schneider, 2017a; 2019a). Gélébart and Mondon-Cancel (2013) and Kabel et al. (2014) applied the Newton-Raphson method to the FFT-context and used Krylov-subspace methods (Zeman et al., 2010; Brisard and Dormieux, 2010) for solving the corresponding linear system. As the Newton-Raphson method converges quadratically in the vicinity of the solution and Krylov-subspace solvers such as conjugate gradients are optimal for their respective problem class, these algorithms exhibit excellent performance, see, e.g., Kochmann et al. (2018), albeit at the cost of high memory requirements. In the case of crystal plasticity, the low number of required Newton iterations is especially beneficial, as the evaluation of the material law is much more costly than solving the linear system. Other approaches for decreasing the overall computational effort include the use of semi-explicit time integration schemes (Nagra et al., 2017), spectral databases (Eghtesad et al., 2018b) and large-scale MPI parallelization (Eghtesad et al., 2018a).

Except for the polarization-based schemes, all listed methods are formulated in the conventional strain-based setting which we revisit in Sec. 4.2. This study is based on the observation that for certain formulations of small-strain single crystal elasto-viscoplasticity, the evaluation of the inverse material law, i.e., computing the strain as a function of the stress, is much cheaper than the conventional approach, see Sec. 4.3. Bhattacharya and Suquet (2005) formulated a dual variational setting, see Sec. 4.4, for the unit cell problem and used the basic scheme as solver. In this chapter, we exploit the cheap evaluation of the inverse law in the dual stress-based setting using modern solution schemes, see Sec. 4.5. We compare the performance and convergence behavior of the solvers in both settings for a polycrystal and a fibrous NiAl-Mo microstructure in Sec. 4.6.

## **4.2 Computational homogenization**

#### **4.2.1 The cell problem of periodic homogenization**

In this section, we review the cell problem of computational homogenization for geometrically linear continuum mechanics with simple materials, see Ch. 2 and 4 in Bertram (2011). Let $Y$ be a rectangular cell in $\mathbb{R}^d$ and let $L^2(Y; \text{Sym}(d))$ denote the space of $Y$-periodic and square-integrable stress and strain fields, where $\text{Sym}(d)$ denotes the set of symmetric $d \times d$ matrices. Let $\varepsilon \in L^2(Y; \text{Sym}(d))$ be the infinitesimal strain field and denote by $\sigma \in L^2(Y; \text{Sym}(d))$ the stress field. As we consider both strain- and stress-based formulations in this chapter, we wish to clearly distinguish between the stress *field* and the stress *operator*, i.e., the material law. Hence, we denote the heterogeneous and possibly non-linear but point-wise invertible material law by $\mathcal{F} : Y \times \text{Sym}(d) \to \text{Sym}(d)$, so that $\sigma = \mathcal{F}(\varepsilon)$. The material law may result, e.g., from the implicit time discretization and static condensation of a generalized standard material. In computational homogenization, we seek a solution to the set of equations

$$
\varepsilon = \langle \varepsilon \rangle_Y + \nabla^s u, \quad \text{and} \quad \text{div } \sigma = 0,\tag{4.1}
$$

where $\nabla^s$ denotes the symmetrized gradient operator and $u : Y \to \mathbb{R}^d$ is a periodic and mean-free displacement fluctuation field. To prescribe (possibly mixed) boundary conditions necessary for the closure of the system (4.1) of equations, we follow Kabel et al. (2016). Let $\mathbb{P}$ and $\mathbb{Q}$ be projectors on $\text{Sym}(d)$ which are idempotent and complementary

$$\mathbb{P}: \mathbb{P} = \mathbb{P}, \quad \mathbb{Q}: \mathbb{Q} = \mathbb{Q}, \quad \mathbb{P}: \mathbb{Q} = 0, \quad \mathbb{Q}: \mathbb{P} = 0, \quad \mathbb{P} + \mathbb{Q} = \mathbb{I}, \tag{4.2}$$

as well as orthogonal with respect to the Frobenius inner product $(S, T) \mapsto \text{tr}(S\,T)$, i.e.

$$\text{tr}(S[\mathbb{P}:T]) = \text{tr}(T[\mathbb{P}:S]), \quad \forall S, T \in \text{Sym}(d), \tag{4.3}$$

$$\text{tr}(S[\mathbb{Q}:T]) = \text{tr}(T[\mathbb{Q}:S]), \quad \forall S, T \in \text{Sym}(d). \tag{4.4}$$

The macroscopic loading is encoded in the prescribed strain $\overline{\varepsilon} \in \text{Sym}(d)$ and stress $\overline{\sigma} \in \text{Sym}(d)$ with

$$\mathbb{P}: \overline{\varepsilon} = \overline{\varepsilon} \quad \text{and} \quad \mathbb{Q}: \overline{\sigma} = \overline{\sigma}. \tag{4.5}$$

The boundary conditions are formulated as

$$\mathbb{P}: \langle \varepsilon \rangle_Y = \overline{\varepsilon} \quad \text{and} \quad \mathbb{Q}: \langle \sigma \rangle_Y = \overline{\sigma}. \tag{4.6}$$
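As a concrete illustration of (4.2) to (4.5), the following minimal sketch considers a uniaxial test in the first coordinate direction. It is an assumed example, not taken from the text: the projector $\mathbb{P}$ keeps only the 11-component (where the strain is prescribed), and $\mathbb{Q}$ is its complement (where the stress is prescribed, here as zero). The function names `project_P`, `project_Q` and `frobenius` are hypothetical.

```python
# Illustrative assumption: uniaxial loading in direction e1. The strain
# component eps_11 is prescribed, all remaining stress components are set to zero.

def project_P(T):
    """P : T keeps only the 11-component of a symmetric 3x3 matrix."""
    out = [[0.0] * 3 for _ in range(3)]
    out[0][0] = T[0][0]
    return out

def project_Q(T):
    """Q : T = T - P : T, the complementary projector."""
    PT = project_P(T)
    return [[T[i][j] - PT[i][j] for j in range(3)] for i in range(3)]

def frobenius(S, T):
    """Frobenius inner product tr(S T) of two symmetric 3x3 matrices."""
    return sum(S[i][j] * T[i][j] for i in range(3) for j in range(3))

# Two arbitrary symmetric test matrices
S = [[1.0, 2.0, 3.0], [2.0, 4.0, 5.0], [3.0, 5.0, 6.0]]
T = [[0.5, -1.0, 0.0], [-1.0, 2.0, 0.3], [0.0, 0.3, -4.0]]

# Prescribed macroscopic loading satisfying P : eps_bar = eps_bar and
# Q : sig_bar = sig_bar, cf. Eq. (4.5)
eps_bar = [[0.01, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
sig_bar = [[0.0] * 3 for _ in range(3)]
```

With these definitions, idempotency, complementarity and symmetry with respect to the Frobenius inner product can be checked directly on `S` and `T`.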

#### **4.2.2 Variational formulation of the cell problem**

Under additional assumptions, the set of equations (4.1) and (4.6) can be derived from a variational principle. Assume an energy density $w : Y \times \text{Sym}(d) \to \mathbb{R}$ is given. For instance, $w$ can be given as a hyperelastic energy or the statically condensed incremental potential of a generalized standard material (Lahellec and Suquet, 2007). Let $w \in C^1$ in $\varepsilon$ and assume the stress can be derived from the hyperelastic relation $\sigma = \frac{\partial w}{\partial \varepsilon}(\varepsilon)$, where we suppress the dependence on $x \in Y$. Consider the minimization problem (Kabel et al., 2016) in terms of the strain fluctuations $\hat{\varepsilon} = \varepsilon - \overline{\varepsilon}$

$$W(\hat{\varepsilon}) \longrightarrow \min \quad \text{for} \quad \hat{\varepsilon} \in U \subset L^2(Y; \text{Sym}(d)) \tag{4.7}$$

with

$$W(\hat{\varepsilon}) = \langle w(\overline{\varepsilon} + \hat{\varepsilon}) - \overline{\sigma} : \hat{\varepsilon} \rangle_Y \,. \tag{4.8}$$

The subspace under consideration is

$$\begin{aligned} U = \Big\{ \hat{\varepsilon} \in L^2(Y; \text{Sym}(d)) \, \Big|\, & \hat{\varepsilon} = \langle \hat{\varepsilon} \rangle_Y + \nabla^s u, \\ & u \in H^1_\#(Y; \mathbb{R}^d), \quad \mathbb{P}: \langle \hat{\varepsilon} \rangle_Y = 0 \Big\} \end{aligned} \tag{4.9}$$

where $H^1_\#(Y; \mathbb{R}^d)$ denotes the Sobolev space of periodic and mean-free vector fields $u : Y \to \mathbb{R}^d$. Denote by $DW(\hat{\varepsilon})$ the differential of $W$. Critical points of $W$ are characterized by

$$\begin{aligned} DW(\hat{\varepsilon})[S] &= 0, \quad \forall S \in U \quad \text{where} \\ DW(\hat{\varepsilon})[S] &= \left\langle \frac{\partial w}{\partial \varepsilon}(\overline{\varepsilon} + \hat{\varepsilon}) : S - \overline{\sigma} : S \right\rangle_Y. \end{aligned} \tag{4.10}$$

By the Helmholtz decomposition of elasticity, see App. A, the operator $\Gamma = \nabla^s (\text{div}\, \nabla^s)^{-1} \text{div}$ is a projector onto the mean-free and compatible fields<sup>2</sup>. Hence, we can write the variation in (4.10) as

$$S = \mathbb{Q}: \langle S \rangle_Y + \Gamma: S. \tag{4.11}$$

Inserting this expression into (4.10) we obtain

$$\left\langle \left( \Gamma + \mathbb{Q} : \langle \cdot \rangle_Y \right) : \left( \frac{\partial w}{\partial \varepsilon} (\varepsilon) - \overline{\sigma} \right) : S \right\rangle_Y = 0,\tag{4.12}$$

and as $S$ is arbitrary, this is equivalent to

$$\mathbb{Q}: \left\langle \frac{\partial w}{\partial \varepsilon}(\varepsilon) \right\rangle_Y = \overline{\sigma} \quad \text{and} \quad \Gamma: \frac{\partial w}{\partial \varepsilon}(\varepsilon) = 0. \tag{4.13}$$

The condition $\Gamma : \sigma = 0$ is equivalent to $\text{div}\, \sigma = 0$. Thus, with our initial choice of $U$, we have recovered (4.1) and (4.6).

<sup>2</sup> Here, we chose C<sup>0</sup> = I. Later, the reference material is reinterpreted in the context of gradient descent methods as a parameter for the step size in Sec. 4.5.

### **4.2.3 Lippmann-Schwinger equation**

The basic scheme by Moulinec and Suquet (1994; 1998) is based on the Lippmann-Schwinger equation of elasticity

$$\varepsilon = \overline{\varepsilon} + \mathbb{D}^0 : \overline{\sigma} - (\Gamma^0 + \mathbb{D}^0 : \mathbb{Q} : \langle \cdot \rangle_Y) : (\mathcal{F}(\varepsilon) - \mathbb{C}^0 : \varepsilon) \tag{4.14}$$

with the homogeneous reference stiffness $\mathbb{C}^0 : \text{Sym}(d) \to \text{Sym}(d)$, the reference compliance $\mathbb{D}^0 = (\mathbb{C}^0)^{-1}$ and the strain-based Green operator $\Gamma^0 = \nabla^s (\text{div}\, \mathbb{C}^0 \nabla^s)^{-1} \text{div}$. Throughout this chapter, we assume that the reference stiffness is a multiple of the identity, $\mathbb{C}^0 = \alpha^0\, \mathbb{I}$. The important property here is that $\mathbb{C}^0$ commutes with $\mathbb{Q}$ and $\mathbb{P}$; for a formulation with general $\mathbb{C}^0$, see Kabel et al. (2016). Solving (4.14) is equivalent to solving (4.1) with (4.6). More precisely, all $\varepsilon$ for which (4.14) holds are solutions of the system (4.1) of equations with boundary conditions (4.6) and vice versa. A derivation for $\mathbb{P} = \mathbb{I}$ can be found, for instance, in Chapter 12 of Milton's book (Milton, 2002). The fixed-point scheme associated to the Lippmann-Schwinger equation (4.14)

$$\varepsilon_{k+1} = \overline{\varepsilon} + \mathbb{D}^0 : \overline{\sigma} - (\Gamma^0 + \mathbb{D}^0 : \mathbb{Q} : \langle \cdot \rangle_Y) : (\mathcal{F}(\varepsilon_k) - \mathbb{C}^0 : \varepsilon_k), \tag{4.15}$$

is precisely Moulinec-Suquet's basic scheme. The operator $\Gamma^0$ is evaluated in Fourier space.
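To illustrate the structure of the fixed-point iteration (4.15), the following minimal sketch treats a one-dimensional scalar analog: a periodic two-phase laminate with a conductivity-type coefficient under pure strain-type loading, i.e., with $\mathbb{P}$ equal to the identity. This is an illustrative assumption and not the three-dimensional elastic setting of this chapter. In one dimension, the Green operator reduces to removing the mean and dividing by the reference coefficient, so no Fourier transform is needed here. The function name `basic_scheme_1d` and all parameter values are hypothetical.

```python
def basic_scheme_1d(a, e_bar, tol=1e-10, max_iter=10000):
    """Fixed-point iteration e <- e_bar - Gamma0 (a e - a0 e) for a 1D periodic
    scalar problem (assumed analog of the basic scheme with P = I)."""
    n = len(a)
    a0 = 0.5 * (max(a) + min(a))      # reference coefficient: mean of the bounds
    e = [e_bar] * n                    # start from the homogeneous field
    for k in range(1, max_iter + 1):
        # polarization field (F(e) - C0 e) for the linear law F(e) = a e
        tau = [(a[i] - a0) * e[i] for i in range(n)]
        tau_mean = sum(tau) / n
        # in 1D, Gamma0 removes the mean and divides by a0
        e_new = [e_bar - (tau[i] - tau_mean) / a0 for i in range(n)]
        change = max(abs(e_new[i] - e[i]) for i in range(n))
        e = e_new
        if change <= tol * abs(e_bar):
            return e, k
    raise RuntimeError("basic scheme did not converge")

# Two-phase laminate with contrast 10 and volume fractions 3/4 and 1/4
a = [1.0] * 12 + [10.0] * 4
e, iterations = basic_scheme_1d(a, e_bar=1.0)
flux = [a[i] * e[i] for i in range(len(a))]
a_eff = sum(flux) / len(flux) / 1.0    # effective coefficient <a e> / e_bar
```

For this laminate, the converged flux is constant across the cell and the effective coefficient approaches the harmonic mean of the phase coefficients, weighted by the volume fractions.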

Concerning the computational cost of a fixed-point iteration (4.15), we can distinguish between two cases. If the evaluation of $\mathcal{F}(\varepsilon)$ is cheap, e.g., for linear elastic materials, most time is spent with the application of $\Gamma^0$ and the associated Fourier transforms. However, for more complicated material models, the computation of $\mathcal{F}(\varepsilon)$ dominates the runtime. As we will discuss in Sec. 4.3, single crystal elasto-viscoplasticity falls firmly into the latter category. Based on the observation that under certain assumptions the inverse $\varepsilon = \mathcal{F}^{-1}(\sigma)$ is much easier to compute in this case, the stress-based formulation of the cell problem will be discussed in Sec. 4.4.

# **4.3 Material model for single crystal elasto-viscoplasticity**

### **4.3.1 Constitutive assumptions**

In small-strain plasticity, it is assumed that the strain can be additively decomposed

$$
\varepsilon = \varepsilon_{\text{e}} + \varepsilon_{\text{p}} \tag{4.16}
$$

into an elastic part $\varepsilon_{\text{e}}$ and a plastic part $\varepsilon_{\text{p}}$, see Ch. 2 in Simo and Hughes (1998). For linear elastic behavior, the stress is related to the elastic strain via Hooke's law

$$\sigma = \mathbb{C} : \varepsilon_{\text{e}} = \mathbb{C} : (\varepsilon - \varepsilon_{\text{p}}) \tag{4.17}$$

with the stiffness tensor $\mathbb{C} : \text{Sym}(d) \to \text{Sym}(d)$ or in strain-explicit form

$$
\varepsilon = \mathbb{D} : \sigma + \varepsilon_{\text{p}}, \tag{4.18}
$$

with the compliance tensor $\mathbb{D} = \mathbb{C}^{-1}$. In elasto-viscoplasticity, the evolution of the plastic strain is given by a constitutive flow rule which expresses $\dot{\varepsilon}_{\text{p}}$ as a function of the stress $\sigma$ and a finite number of internal variables $z$ (Simo and Hughes, 1998). For crystalline materials, we assume that the plastic deformations are realized in the form of simple shears on crystallographic slip systems, see Ch. 10 in Bertram (2011). Slip systems are characterized by their slip plane normal and their slip direction. They signify close-packed planes and directions in the crystal lattice, respectively, see Ch. 3 in Hull and Bacon (2011). Hence, in single-crystal elasto-viscoplasticity, the flow rule takes the form (Bishop, 1953)

$$\dot{\varepsilon}_{\text{p}} = \sum_{\alpha=1}^{N} \dot{\gamma}_{\alpha} M_{\alpha} \tag{4.19}$$

where $\dot{\gamma}_\alpha$ denotes the plastic slip rate and $M_\alpha$ denotes the symmetrized Schmid tensor, i.e., the symmetrized dyadic product of slip direction and slip plane normal, on the $\alpha$-th of $N$ slip systems, respectively. Plastic slip in a system is activated by the projected shear stress $\tau_\alpha = \sigma \cdot M_\alpha$ (Bishop, 1953). Thus, for the constitutive flow rule for the slip rate we assume the form

$$
\dot{\gamma}_{\alpha} = f(\tau_{\alpha}, \tau_{\alpha}^{\text{F}}) \tag{4.20}
$$

where $\tau_\alpha^{\text{F}}$ denotes the scalar critical shear stress in system $\alpha$ (Maniatty et al., 1992; Cuitiño and Ortiz, 1993). In the current work, we only consider isotropic hardening and neglect the effects of kinematic hardening. To complete the set of constitutive equations, hardening relations for $\tau_\alpha^{\text{F}}$ have to be provided. In the following, we adopt the simplification that the critical shear stress is equal in all slip systems, $\tau_\alpha^{\text{F}} = \tau^{\text{F}}$, and depends on the accumulated plastic slip $\gamma$

$$\dot{\gamma} = \sum_{\alpha=1}^{N} |\dot{\gamma}_{\alpha}| \tag{4.21}$$

in the form of a hardening law $\tau^{\text{F}} = h(\gamma)$. For instance, $h$ may arise as the integrated form of a Kocks-Mecking type dislocation storage-recovery model (Kocks and Mecking, 2003). Kubin et al. (2008) found that the reduction to a single hardening variable was a reasonably good approximation for fcc crystals. In numerical experiments, Maniatty et al. (1992) found that the impact of this simplification on the effective mechanical properties of a polycrystal was small.

#### **4.3.2 Formulation as a generalized standard material**

An isothermal generalized standard material at small strains with internal variables $z$ is described by two convex potentials $\psi$ and $\phi$ (Halphen and Nguyen, 1975; Germain et al., 1983). The volume-specific Helmholtz free energy density $\psi$ defines the stress-strain relation and the driving force $\mathcal{A}$ associated to $z$ by

$$
\sigma = \frac{\partial \psi}{\partial \varepsilon}(\varepsilon, z) \quad \text{and} \quad \mathcal{A} = -\frac{\partial \psi}{\partial z}(\varepsilon, z) \tag{4.22}
$$

and the dissipation potential $\phi$ relates the driving forces to the rates of the internal variables

$$\mathcal{A} \in \partial \phi(\dot{z})\tag{4.23}$$

where $\partial \phi$ denotes the subdifferential of $\phi$. In terms of the Legendre transform of $\phi(\dot{z})$

$$\phi^*(\mathcal{A}) = \sup_{\dot{z}} (\mathcal{A} \cdot \dot{z} - \phi(\dot{z})),\tag{4.24}$$

the evolution of $z$ can be equivalently written as

$$
\dot{z} \in \partial \phi^*(\mathcal{A}). \tag{4.25}
$$

For generalized standard materials, Lahellec and Suquet (2007) show that after a backward Euler time discretization there exists a condensed incremental potential $w(\varepsilon)$ so that the potential relation

$$
\sigma = \frac{\partial w}{\partial \varepsilon}(\varepsilon) \tag{4.26}
$$

holds. For the crystal plasticity model, we assume a free energy of the following form

$$\psi(\varepsilon, \varepsilon_{\text{p}}, \gamma) = \frac{1}{2} (\varepsilon - \varepsilon_{\text{p}}) : \mathbb{C} : (\varepsilon - \varepsilon_{\text{p}}) + \psi_{\text{h}}(\gamma) \tag{4.27}$$

with internal variables $z = \{\varepsilon_{\text{p}}, \gamma\}$ which is additively split into a quadratic elastic energy and an isotropic hardening energy $\psi_{\text{h}}$. The functional dependency of $\psi_{\text{h}}$ on $\gamma$ is phenomenological and assumed here for the sake of simplicity. This ansatz for the free energy leads to the stress-strain relation (4.17) and the driving forces

$$\sigma_{\text{p}} = -\frac{\partial \psi}{\partial \varepsilon_{\text{p}}}(\varepsilon, \varepsilon_{\text{p}}) = \frac{\partial \psi}{\partial \varepsilon}(\varepsilon, \varepsilon_{\text{p}}) \quad \text{and} \quad \tau^{\text{F}} = \frac{\partial \psi_{\text{h}}}{\partial \gamma}(\gamma), \tag{4.28}$$

hence $\sigma_{\text{p}} = \sigma$ and $\mathcal{A} = \{\sigma, -\tau^{\text{F}}\}$. In viscoplasticity, flow rules are generally formulated in terms of the stress, see Fritzen and Leuschner (2013) and Ch. 2 in Lemaitre and Chaboche (1990). Therefore, the dual dissipation potential $\phi^*(\sigma, \tau^{\text{F}})$ is usually prescribed, so that

$$(\dot{\varepsilon}_{\text{p}}, -\dot{\gamma}) \in \partial \phi^*(\sigma, \tau^{\text{F}}). \tag{4.29}$$

A common ansatz is the Chaboche-type potential, see Chapter 6 in Lemaitre and Chaboche (1990),

$$\phi^*(\sigma, \tau^{\text{F}}) = \frac{\tau_{\text{D}} \dot{\gamma}_0}{m+1} \sum_{\alpha=1}^N \left\langle \frac{|\tau_{\alpha}| - \tau^{\text{F}}}{\tau_{\text{D}}} \right\rangle_+^{m+1},\tag{4.30}$$

with reference slip rate $\dot{\gamma}_0$, drag stress $\tau_{\text{D}}$, stress exponent $m$ and the Macaulay brackets defined by $\langle \cdot \rangle_+ = \max(0, \cdot)$. Differentiating w.r.t. $\sigma$ and $\tau^{\text{F}}$ recovers the evolution equations (4.19) and (4.21) with the flow rule

$$\dot{\gamma}_{\alpha} = \dot{\gamma}_{0}\, \text{sgn}(\tau_{\alpha}) \left\langle \frac{|\tau_{\alpha}| - \tau^{\text{F}}}{\tau_{\text{D}}} \right\rangle_{+}^{m},\tag{4.31}$$

see Fritzen and Leuschner (2013). Another popular approach for the evolution of the plastic slip is

$$\dot{\gamma}_{\alpha} = \dot{\gamma}_{0}\, \text{sgn}(\tau_{\alpha}) \left| \frac{\tau_{\alpha}}{\tau^{\text{F}}} \right|^{m} \tag{4.32}$$

by Hutchinson (1976). Steinmann and Stein (1996) proposed the associated potential

$$\phi^*(\sigma, \tau^{\text{F}}) = \frac{\tau^{\text{F}} \dot{\gamma}_0}{m+1} \sum_{\alpha=1}^{N} \left| \frac{\tau_\alpha}{\tau^{\text{F}}} \right|^{m+1},\tag{4.33}$$

which recovers (4.19) with flow rule (4.32). Note that the resulting equation for the accumulated slip

$$\dot{\gamma} = \frac{m}{m+1} \sum_{\alpha=1}^{N} \left| \frac{\tau_{\alpha}}{\tau^{\text{F}}} \right| |\dot{\gamma}_{\alpha}| \tag{4.34}$$

corresponds to the standard formulation (4.21) only in the rate-independent limit as $m \to \infty$ and $|\tau_\alpha| \approx \tau^{\text{F}}$. Consequently, a crystal plasticity model as described in Sec. 4.3.1 with Hutchinson's flow rule (4.32) is not a generalized standard material.
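For completeness, the step from the potential (4.33) to the accumulated slip rate (4.34) can be sketched as follows, assuming $\tau^{\text{F}} > 0$ and the sign convention of (4.29), i.e., that $-\dot{\gamma}$ is the derivative of the dual potential with respect to $\tau^{\text{F}}$. Writing the potential as $\frac{\dot{\gamma}_0}{m+1} \sum_\alpha |\tau_\alpha|^{m+1} (\tau^{\text{F}})^{-m}$ and using $|\dot{\gamma}_\alpha| = \dot{\gamma}_0 |\tau_\alpha / \tau^{\text{F}}|^m$ from (4.32) gives

$$-\dot{\gamma} = \frac{\partial \phi^*}{\partial \tau^{\text{F}}} = \frac{\dot{\gamma}_0}{m+1} \sum_{\alpha=1}^{N} |\tau_\alpha|^{m+1} \, (-m) \, (\tau^{\text{F}})^{-m-1} = -\frac{m}{m+1} \, \dot{\gamma}_0 \sum_{\alpha=1}^{N} \left| \frac{\tau_\alpha}{\tau^{\text{F}}} \right|^{m+1} = -\frac{m}{m+1} \sum_{\alpha=1}^{N} \left| \frac{\tau_\alpha}{\tau^{\text{F}}} \right| |\dot{\gamma}_\alpha|.$$

The prefactor $m/(m+1)$ tends to one for $m \to \infty$, and the weights $|\tau_\alpha / \tau^{\text{F}}|$ are close to one on active slip systems in that limit, which is why the standard form is recovered only there.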

#### **4.3.3 Evaluation of the material law**

Applying the implicit Euler time discretization to the evolution equations (4.19) and (4.21) yields the residual equations

$$0 \stackrel{!}{=} r_1(\sigma, \gamma) = \mathbb{D} : \sigma - \varepsilon + \varepsilon_{\text{p}}^{n} + \Delta t \sum_{\alpha=1}^N f(\tau_\alpha, h(\gamma)) M_\alpha,\tag{4.35}$$

$$0 \stackrel{!}{=} r_2(\sigma, \gamma) = -\gamma + \gamma^n + \Delta t \sum_{\alpha=1}^N |f(\tau_\alpha, h(\gamma))|. \tag{4.36}$$

In the primal setting, the material law $\sigma = \mathcal{F}(\varepsilon)$ is evaluated by computing the stress $\sigma$ for a given strain $\varepsilon$, time step $\Delta t$ and internal variables $\varepsilon_{\text{p}}^{n}$, $\gamma^n$. To this end, the set of 7 equations, (4.35) and (4.36), can be solved with the Newton-Raphson method. With

$$x = \begin{pmatrix} \sigma \\ \gamma \end{pmatrix}, \quad r = \begin{pmatrix} r_1 \\ r_2 \end{pmatrix}, \quad \text{and} \quad J = \begin{pmatrix} \frac{\partial r_1}{\partial \sigma} & \frac{\partial r_1}{\partial \gamma} \\ \frac{\partial r_2}{\partial \sigma} & \frac{\partial r_2}{\partial \gamma} \end{pmatrix} \tag{4.37}$$

the Newton iteration reads

$$x^{n+1} = x^n + \Delta x \tag{4.38}$$

where $\Delta x$ is the solution of

$$J\Delta x = -r(x^n).\tag{4.39}$$

Solving $r(x) = 0$ is challenging for large stress exponents and large time increments, see Wulfinghoff and Böhlke (2013), as the system becomes ill-conditioned. To obtain fast and robust convergence behavior, we solve the residual equations with the outlined solution scheme for a reduced stress exponent $\tilde{m}$, starting with $\tilde{m} = 1$. Subsequently, $\tilde{m}$ is set to $\min(2\tilde{m}, m)$ and $r(x) = 0$ is solved again with the last converged solution as starting point. Thereby, each Newton scheme is initiated close to the solution and converges quickly. This process is repeated until the system is solved with $\tilde{m} = m$. An alternative routine which relies on piecewise linearization of the flow rule was proposed by Wulfinghoff and Böhlke (2013). Regardless of the chosen approach, evaluation of $r(x)$ and $J(x)$ as well as the solution of (4.39) are computationally expensive. Thus, evaluating the material law $\sigma = \mathcal{F}(\varepsilon)$ dominates the overall runtime.
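The continuation in the stress exponent described above can be sketched for a strongly simplified scalar analog. The following example is an assumption for illustration only: it considers a single stress component, a power-law flow rule without yield threshold or hardening, and hypothetical parameter values, rather than the full system of 7 equations. The function name `solve_with_exponent_continuation` is likewise hypothetical.

```python
def solve_with_exponent_continuation(eps, m, E=1000.0, dt_gamma0=0.005,
                                     tau_D=1.0, tol=1e-12, max_newton=50):
    """Solve r(sigma) = sigma/E - eps + dt_gamma0 * (sigma/tau_D)**m = 0
    for sigma >= 0 by Newton's method, doubling a reduced exponent m_tilde
    until the target exponent m is reached."""
    sigma = 0.0            # initial guess for the first (linear) problem
    m_tilde = 1
    schedule = []          # exponents visited, recorded for inspection
    while True:
        schedule.append(m_tilde)
        for _ in range(max_newton):
            r = sigma / E - eps + dt_gamma0 * (sigma / tau_D) ** m_tilde
            if abs(r) <= tol:
                break
            dr = (1.0 / E
                  + dt_gamma0 * m_tilde / tau_D * (sigma / tau_D) ** (m_tilde - 1))
            sigma -= r / dr                # Newton update
        else:
            raise RuntimeError("Newton iteration did not converge")
        if m_tilde == m:
            return sigma, schedule
        # reuse the converged solution as starting point for the next exponent
        m_tilde = min(2 * m_tilde, m)

sigma, schedule = solve_with_exponent_continuation(eps=0.01, m=16)
```

In this scalar setting, each sub-problem starts from the solution of the previous one, which is the mechanism the text relies on to keep every Newton solve close to its root.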

The evaluation of the inverse material law $\varepsilon = \mathcal{F}^{-1}(\sigma)$, however, is much less costly. For given $\sigma$, $\Delta t$, $\varepsilon_{\text{p}}^{n}$, and $\gamma^n$, the scalar equation (4.36) can be solved for $\gamma$ independently of $\varepsilon$. If $\gamma$ is known, $\varepsilon$ can be explicitly computed from (4.35). Thus, in the dual setting, the implicit material law only involves the solution of a single scalar equation instead of a system of 7 equations. The fact that $\varepsilon = \mathcal{F}^{-1}(\sigma)$ is cheaper to evaluate than $\sigma = \mathcal{F}(\varepsilon)$ has been taken advantage of in the context of polarization-based methods by Lebensohn et al. (2012). However, due to their chosen augmented Lagrangian scheme, see Michel et al. (2001) and Schneider et al. (2019), the solution of a non-linear system of 6 equations was still required in every material point.
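The two-step evaluation of the inverse law can be sketched as follows. The example rests on several assumptions that are not taken from the text: isotropic elasticity, a linear hardening law, the Chaboche-type flow rule (4.31), two hypothetical slip systems, invented parameter values, and plain bisection as the scalar solver (the text does not specify which scalar solver is used). All names are hypothetical.

```python
import math

# Hypothetical material parameters (assumptions, not taken from the text)
E, NU = 200e3, 0.3                        # isotropic elasticity
TAU_0, H = 50.0, 100.0                    # assumed hardening h(gamma) = TAU_0 + H * gamma
GAMMA0_DOT, TAU_D, M_EXP = 1e-3, 10.0, 5  # reference slip rate, drag stress, exponent

# Two hypothetical slip systems, given by their symmetrized Schmid tensors
SCHMID = [
    [[0.0, 0.5, 0.0], [0.5, 0.0, 0.0], [0.0, 0.0, 0.0]],
    [[0.5, 0.0, 0.0], [0.0, -0.5, 0.0], [0.0, 0.0, 0.0]],
]

def hardening(gamma):
    return TAU_0 + H * gamma

def slip_rate(tau, tau_F):
    """Chaboche-type flow rule, cf. Eq. (4.31)."""
    overstress = max(0.0, abs(tau) - tau_F) / TAU_D
    return math.copysign(GAMMA0_DOT * overstress ** M_EXP, tau)

def resolved_shear_stresses(sigma):
    return [sum(sigma[i][j] * M[i][j] for i in range(3) for j in range(3))
            for M in SCHMID]

def r2(gamma, taus, gamma_n, dt):
    """Scalar residual for the accumulated slip, cf. Eq. (4.36)."""
    return -gamma + gamma_n + dt * sum(
        abs(slip_rate(t, hardening(gamma))) for t in taus)

def inverse_material_law(sigma, eps_p_n, gamma_n, dt):
    taus = resolved_shear_stresses(sigma)
    # Step 1: solve the scalar equation for gamma. For H >= 0, r2 decreases
    # monotonically in gamma, so the root is bracketed by [lo, hi].
    lo = gamma_n
    hi = gamma_n + dt * sum(abs(slip_rate(t, hardening(gamma_n))) for t in taus)
    for _ in range(200):                  # plain bisection
        mid = 0.5 * (lo + hi)
        if r2(mid, taus, gamma_n, dt) > 0.0:
            lo = mid
        else:
            hi = mid
    gamma = 0.5 * (lo + hi)
    # Step 2: explicit strain update, cf. Eq. (4.35)
    tau_F = hardening(gamma)
    trace = sigma[0][0] + sigma[1][1] + sigma[2][2]
    eps = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            elastic = (1.0 + NU) / E * sigma[i][j] - (NU / E * trace if i == j else 0.0)
            plastic = dt * sum(slip_rate(t, tau_F) * M[i][j]
                               for t, M in zip(taus, SCHMID))
            eps[i][j] = elastic + eps_p_n[i][j] + plastic
    return eps, gamma

sigma = [[200.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
zero = [[0.0] * 3 for _ in range(3)]
eps, gamma = inverse_material_law(sigma, zero, 0.0, dt=0.01)
```

Only the scalar residual is solved iteratively. The strain then follows from a closed-form expression, which reflects the cost advantage of the dual setting discussed above.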

## **4.4 The dual variational framework**

A dual formulation of the cell problem (4.1) with the stress as primary unknown was presented by Bhattacharya and Suquet (2005) for pure stress boundary conditions, i.e. $\mathbb{Q} = \mathbb{I}$. In analogy to Sec. 4.2.2, we derive the dual case for mixed boundary conditions through a variational approach. Let the strain energy density $w$ be convex in $\varepsilon$ and let

$$w^\*(\sigma) = \sup\_{\varepsilon \in L^2(Y; \text{Sym}(d))} \left( \sigma : \varepsilon - w(\varepsilon) \right) \tag{4.40}$$

be the Legendre transform of $w$. As $w \in C^1$ implies $w^* \in C^1$, the inverse material law is given by $\varepsilon = \frac{\partial w^*}{\partial \sigma}(\sigma)$.
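As a standard illustration, which is not part of the original derivation, consider a linear elastic material with stiffness $\mathbb{C}$ and compliance $\mathbb{D} = \mathbb{C}^{-1}$ (both symbols are used here only for this example). The supremum in (4.40) is attained for $\sigma = \mathbb{C} : \varepsilon$, so that

$$w(\varepsilon) = \frac{1}{2}\, \varepsilon : \mathbb{C} : \varepsilon \quad \Longrightarrow \quad w^\*(\sigma) = \sigma : \mathbb{D} : \sigma - \frac{1}{2}\, \sigma : \mathbb{D} : \sigma = \frac{1}{2}\, \sigma : \mathbb{D} : \sigma, \qquad \frac{\partial w^\*}{\partial \sigma}(\sigma) = \mathbb{D} : \sigma.$$

In this special case, the inverse material law reduces to Hooke's law in compliance form.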

We seek a minimizer of the problem

$$W^\*(\hat{\sigma}) \longrightarrow \min \quad \text{for} \quad \hat{\sigma} \in U^\* \subset L^2(Y; \text{Sym}(d)) \tag{4.41}$$

with

$$W^\*(\hat{\sigma}) = \langle w^\*(\overline{\sigma} + \hat{\sigma}) - \overline{\varepsilon} : \hat{\sigma} \rangle\_Y \tag{4.42}$$

and

$$U^\* = \left\{ \hat{\sigma} \in L^2(Y; \text{Sym}(d)) \, \Big|\, \text{div } \hat{\sigma} = 0, \quad \mathbb{Q}: \langle \hat{\sigma} \rangle\_Y = 0 \right\} \tag{4.43}$$

where $\hat{\sigma} = \sigma - \overline{\sigma} \in U^*$, see Appendix B. A critical point is characterized by

$$\begin{aligned} DW^\*(\sigma)[T] &= 0, \quad \forall T \in U^\* \quad \text{with} \\ DW^\*(\sigma)[T] &= \left\langle \frac{\partial w^\*}{\partial \sigma}(\overline{\sigma} + \hat{\sigma}) : T - \overline{\varepsilon} : T \right\rangle\_Y. \end{aligned} \tag{4.44}$$

The restriction $T \in U^*$ can be expressed as

$$T = \mathbb{P}: \langle T \rangle\_Y + \Delta: T \tag{4.45}$$

by introducing the operator $\Delta = \mathbb{I} - \langle \cdot \rangle_Y - \Gamma$ from the Helmholtz decomposition. $\Delta$ is the orthogonal projector onto the divergence- and mean-free fields. Hence, the optimality condition can be written as

$$\left\langle \left( \Delta + \mathbb{P} : \langle \cdot \rangle\_Y \right) : \left( \frac{\partial w^\*}{\partial \sigma} (\sigma) - \overline{\varepsilon} \right) : T \right\rangle\_Y = 0. \tag{4.46}$$

This yields the Euler-Lagrange equations

$$\mathbb{P}: \left\langle \frac{\partial w^\*}{\partial \sigma}(\sigma) \right\rangle\_Y = \mathbb{E} \quad \text{and} \quad \Delta: \frac{\partial w^\*}{\partial \sigma}(\sigma) = 0,\tag{4.47}$$

which recover (4.1) and (4.6) with the initial restrictions on $\hat{\sigma}$.

# **4.5 FFT-based solution schemes for the cell problem**

#### **4.5.1 The basic scheme**

The basic scheme of Moulinec and Suquet (1994; 1998) was interpreted as a gradient descent method by Kabel et al. (2014). Thus, the convergence theory for gradient descent became available in the setting of FFT-based schemes. In addition, accelerated gradient schemes could be applied to FFT-based homogenization (Schneider, 2017a; 2019a). In the following, we review the general formulation of gradient descent methods and discuss their application to the primal and dual framework of computational homogenization. Consider a minimization problem of the type

$$f(x) \longrightarrow \min\_{x \in V},\tag{4.48}$$

for a continuously differentiable function $f$ on a Hilbert space $V$. Critical points of $f$ are characterized by

$$
\nabla f(x) = 0,\tag{4.49}
$$

where the gradient is defined by

$$Df(x)[v] = \langle \nabla f(x), v \rangle\_V, \qquad v \in V,\tag{4.50}$$

with the inner product $\langle \cdot, \cdot \rangle_V$ associated with $V$. The gradient descent iteration, see Ch. 9 in Boyd and Vandenberghe (2004), for the solution of this problem is given by

$$x\_{k+1} = x\_k - \gamma\_k \nabla f(x\_k) \tag{4.51}$$

which converges for sufficiently small step size $\gamma_k$. Suppose $f$ is strongly convex and has a Lipschitz continuous gradient, i.e.

$$
\langle \nabla f(x) - \nabla f(y), x - y \rangle\_V \ge \mu \|x - y\|\_V^2 \quad \forall x, y \in V,\tag{4.52}
$$

$$\|\nabla f(x) - \nabla f(y)\|\_{V} \le L \|x - y\|\_{V} \quad \forall x, y \in V,\tag{4.53}$$

with positive constants $\mu$ and $L$. Then the optimal choice for the step size is given by

$$
\gamma\_k = \frac{2}{\mu + L},
\tag{4.54}
$$

see Ch. 1 and 2 in Nesterov (2004). In the following, we apply the gradient descent method to the primal and dual minimization problems associated with the cell problem of computational homogenization, see Secs. 4.2.2 and 4.4. With property (4.50), we identify the gradients in the primal and dual case from (4.12) and (4.46) as

$$
\nabla W = \left(\Gamma + \mathbb{Q} : \langle \cdot \rangle\_Y\right) : \left(\frac{\partial w}{\partial \varepsilon}(\varepsilon) - \overline{\sigma}\right), \tag{4.55}
$$

$$
\nabla W^\* = (\Delta + \mathbb{P} : \langle \cdot \rangle\_Y) : \left( \frac{\partial w^\*}{\partial \sigma}(\sigma) - \overline{\varepsilon} \right). \tag{4.56}
$$

Thus, the gradient descent iterations can be written as

$$\begin{split} \varepsilon\_{k+1} &= \overline{\varepsilon} + \gamma\_k \overline{\sigma} - \gamma\_k (\Gamma + \mathbb{Q} : \langle \cdot \rangle\_Y) : \left( \frac{\partial w}{\partial \varepsilon} (\varepsilon\_k) - \frac{1}{\gamma\_k} \varepsilon\_k \right), \\ \sigma\_{k+1} &= \overline{\sigma} + \tilde{\gamma}\_k \overline{\varepsilon} - \tilde{\gamma}\_k (\Delta + \mathbb{P} : \langle \cdot \rangle\_Y) : \left( \frac{\partial w^\*}{\partial \sigma} (\sigma\_k) - \frac{1}{\tilde{\gamma}\_k} \sigma\_k \right), \end{split} \tag{4.57}$$

where the identities

$$\varepsilon\_k = \overline{\varepsilon} + (\Gamma + \mathbb{Q} : \langle \cdot \rangle\_Y) : \varepsilon\_k \quad \text{and} \quad \sigma\_k = \overline{\sigma} + (\Delta + \mathbb{P} : \langle \cdot \rangle\_Y) : \sigma\_k,\tag{4.58}$$

are used. Introducing the reference material $\mathbb{C}^0 = 1/\gamma_k\, \mathbb{I}$ in the primal formulation and $\widetilde{\mathbb{D}}^0 = 1/\tilde{\gamma}_k\, \mathbb{I}$ in the dual formulation, we recover the basic scheme by Moulinec-Suquet and the dual basic scheme by Bhattacharya-Suquet

$$\varepsilon\_{k+1} = \overline{\varepsilon} + \mathbb{D}^0 : \overline{\sigma} - \left(\Gamma^0 + \mathbb{D}^0 : \mathbb{Q} : \langle \cdot \rangle\_Y \right) : (\mathcal{F}(\varepsilon\_k) - \mathbb{C}^0 : \varepsilon\_k), \tag{4.59}$$

$$\sigma\_{k+1} = \overline{\sigma} + \widetilde{\mathbb{C}}^0 : \overline{\varepsilon} - \left(\widetilde{\mathbb{C}}^0 : \Delta^0 + \widetilde{\mathbb{C}}^0 : \mathbb{P} : \langle \cdot \rangle\_Y \right) : (\mathcal{F}^{-1}(\sigma\_k) - \widetilde{\mathbb{D}}^0 : \sigma\_k), \tag{4.60}$$

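with $\mathcal{F}(\varepsilon) = \frac{\partial w}{\partial \varepsilon}(\varepsilon)$ and $\mathcal{F}^{-1}(\sigma) = \frac{\partial w^*}{\partial \sigma}(\sigma)$. For the problems at hand (4.7) and (4.41), the inequality conditions (4.52) and (4.53) translate to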

$$\begin{aligned} \alpha\_- \mathbf{I} \le \mathbb{C}^{\tan} \le \alpha\_+ \mathbf{I} \quad \text{with} \quad \mathbb{C}^{\tan} = \frac{\partial^2 w}{\partial \varepsilon^2},\\ \beta\_- \mathbf{I} \le \mathbb{D}^{\tan} \le \beta\_+ \mathbf{I} \quad \text{with} \quad \mathbb{D}^{\tan} = \frac{\partial^2 w^\*}{\partial \sigma^2}, \end{aligned} \tag{4.61}$$

where $\alpha_-$ and $\alpha_+$ are, respectively, the smallest and the largest eigenvalue of the tangent stiffness $\mathbb{C}^{\tan}$ and $\beta_-$ and $\beta_+$ are the smallest and the largest eigenvalue of the tangent compliance $\mathbb{D}^{\tan}$, respectively. Thus, the optimal respective choice for the reference material w.r.t. the convergence rate of the scheme is

$$\mathbb{C}^{0} = \frac{\alpha\_{+} + \alpha\_{-}}{2} \text{ I} \quad \text{and} \quad \tilde{\mathbb{D}}^{0} = \frac{\beta\_{+} + \beta\_{-}}{2} \text{ I} . \tag{4.62}$$

Due to similarities in structure, see Table 4.1, the dual scheme can be easily implemented into an existing strain-based code. Moreover, all accelerated gradient schemes which have been introduced in the primal context (Schneider, 2017a; 2019a) carry over to the dual case.
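The role of the step size (4.54), which corresponds to a reference material built from the mean of the extremal eigenvalues, can be illustrated on a small model problem. The sketch below is an illustration under stated assumptions: the quadratic function with $\mu = 1$ and $L = 10$ is hypothetical and replaces the homogenization functional, and `gradient_descent` is a helper name introduced here.

```python
import numpy as np


def gradient_descent(grad, x0, step, num_iter):
    """Fixed-step gradient descent x_{k+1} = x_k - step * grad(x_k), cf. (4.51)."""
    x = np.array(x0, dtype=float)
    for _ in range(num_iter):
        x = x - step * grad(x)
    return x


# Hypothetical model problem: f(x) = 0.5 x^T A x - b^T x with mu = 1, L = 10.
A = np.diag([1.0, 10.0])
b = np.array([1.0, 1.0])
mu, L = 1.0, 10.0
x_exact = np.linalg.solve(A, b)

step = 2.0 / (mu + L)          # step size (4.54)
rate = (L - mu) / (L + mu)     # contraction factor per iteration for this step

x_20 = gradient_descent(lambda x: A @ x - b, np.zeros(2), step, 20)
error_20 = np.linalg.norm(x_20 - x_exact)
bound_20 = rate**20 * np.linalg.norm(x_exact)   # initial error is ||x_exact||

x_200 = gradient_descent(lambda x: A @ x - b, np.zeros(2), step, 200)
```

For this diagonal example both eigen-directions contract by exactly $(L - \mu)/(L + \mu) = 9/11$ per iteration, so the error after 20 steps coincides with the bound up to rounding.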

**Table 4.1:** Summary of quantities for the gradient descent algorithm (4.51) in strain- and stress-based setting


#### **4.5.2 The Barzilai-Borwein basic scheme**

Motivated by quasi-Newton methods, Barzilai and Borwein (1988) published an iterative algorithm for the selection of the step size $\gamma_k$ in (4.51) which greatly increases the rate of convergence compared to the choice in (4.54). Two recursive update formulas for the step size

$$\gamma\_k = \gamma\_{k-1} \left( 1 - \frac{\langle \nabla f(x\_k), \nabla f(x\_{k-1}) \rangle\_V}{||\nabla f(x\_{k-1})||\_V^2} \right)^{-1} \tag{4.63}$$

and

$$\gamma\_k = \gamma\_{k-1} \left( \frac{\|\nabla f(x\_{k-1})\|\_V^2 - \langle \nabla f(x\_k), \nabla f(x\_{k-1}) \rangle\_V}{\|\nabla f(x\_{k})\|\_V^2 - 2\langle \nabla f(x\_k), \nabla f(x\_{k-1}) \rangle\_V + \|\nabla f(x\_{k-1})\|\_V^2} \right) \tag{4.64}$$

were proposed. The method was applied to FFT-based homogenization by Schneider (2019a) and displayed excellent speed and robustness while using only twice the memory of the basic scheme. Throughout this paper, we only consider the second variant (4.64) as it exhibited better performance for the given material models. For the initial step size $\gamma_0$, the step size of the basic scheme was found to be a decent choice. Note that, due to the recursive nature of the step size selection, the eigenvalues of the tangent are only needed in the first gradient descent iteration. In the case of stress-based crystal plasticity, this property is especially favorable: as the evaluation of the inverse material law is comparatively cheap, see Sec. 4.6, an additional computation of the tangent and its eigenvalues in every iteration would significantly increase the overall computational effort.
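A minimal sketch of gradient descent with the second Barzilai-Borwein step size is given below. It uses the gradient-only form $\gamma_k = \gamma_{k-1} \langle g_{k-1}, g_{k-1} - g_k \rangle / \|g_k - g_{k-1}\|^2$ with $g_k = \nabla f(x_k)$, which equals $\langle s, y \rangle / \langle y, y \rangle$ for $s = x_k - x_{k-1}$ and $y = g_k - g_{k-1}$. The quadratic test problem and the function name `barzilai_borwein` are assumptions made for illustration; this is not the implementation used for the results in this chapter.

```python
import numpy as np


def barzilai_borwein(grad, x0, step0, tol=1e-10, max_iter=500):
    """Gradient descent with the second Barzilai-Borwein step size."""
    x = np.array(x0, dtype=float)
    g = grad(x)
    step = step0
    for k in range(max_iter):
        if np.linalg.norm(g) < tol:
            return x, k
        x = x - step * g
        g_new = grad(x)
        dg = g_new - g
        denom = np.dot(dg, dg)
        if denom > 0.0:
            # gamma_k = gamma_{k-1} <g_{k-1}, g_{k-1} - g_k> / ||g_k - g_{k-1}||^2
            step = step * np.dot(g, -dg) / denom
        g = g_new
    raise RuntimeError("Barzilai-Borwein iteration did not converge")


# Hypothetical model problem with eigenvalues mu = 1 and L = 10.
A = np.diag([1.0, 10.0])
b = np.array([1.0, 1.0])
x_bb, iterations = barzilai_borwein(lambda x: A @ x - b, np.zeros(2),
                                    step0=2.0 / (1.0 + 10.0))
```

Only the previous gradient and the previous step size have to be kept in addition to the current iterate, which matches the low memory demand stated above; the norm of the gradient is in general not monotonically decreasing along the iteration.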

#### **4.5.3 The Newton-CG method**

Newton-Raphson methods are ubiquitous in computational mechanics as a solution algorithm for nonlinear systems of equations. In the context of minimization, where $\nabla f(x) = 0$ is to be solved, the damped Newton-Raphson iteration reads

$$x\_{k+1} = x\_k - a\_k \mathcal{H}^{-1}(x\_k) \nabla f(x\_k) \tag{4.65}$$

with the Hessian $\mathcal{H}$ of $f$ and a damping factor $a_k \in (0, 1]$, see Ch. 9 in Boyd and Vandenberghe (2004). Instead of inverting $\mathcal{H}(x_k)$, the update can be performed by

$$x\_{k+1} = x\_k + a\_k \Delta x \tag{4.66}$$

where $\Delta x$ is an approximate solution of

$$
\mathcal{H}(x\_k)\Delta x = -\nabla f(x\_k).\tag{4.67}
$$

The damping factor $a_k$ is determined by a back-tracking procedure. In this paper, we use the stopping criteria of Dong (2010)

$$c\_2 \langle \nabla f(x\_k), \Delta x \rangle\_V \le \langle \nabla f(x\_k + a\_k \Delta x), \Delta x \rangle\_V \le c\_1 \langle \nabla f(x\_k), \Delta x \rangle\_V \tag{4.68}$$

with $0 < c_1 < c_2 < 1$. In contrast to the Wolfe conditions (Wolfe, 1969), Dong's criteria rely solely on gradient evaluations. This is beneficial, as evaluating $f$ requires either the primal or dual condensed incremental potential, see Tab. 4.1, which is generally not available in FFT-based homogenization. Both $w$ and $w^*$ carry no physical meaning as they depend on the chosen time discretization and are composed of primal or dual free energy and dissipation potential, respectively.
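A gradient-only step length search based on the two-sided condition (4.68) might look as follows. This is a sketch under assumptions: the bracketing logic (shrink the step if the directional derivative at the trial point exceeds $c_1 \langle \nabla f(x_k), \Delta x \rangle$, enlarge it if it falls below $c_2 \langle \nabla f(x_k), \Delta x \rangle$), the default values of `c1` and `c2`, and the test function $f(x) = \sum_i \sqrt{1 + x_i^2}$ are choices made here for illustration and are not taken from the text.

```python
import numpy as np


def dong_step_length(grad, x, dx, c1=1e-4, c2=0.9, max_iter=50):
    """Find a with c2*d0 <= <grad(x + a*dx), dx> <= c1*d0, d0 = <grad(x), dx> < 0."""
    d0 = np.dot(grad(x), dx)
    if d0 >= 0.0:
        raise ValueError("dx is not a descent direction")
    lo, hi = 0.0, np.inf
    a = 1.0                                   # try the full Newton step first
    for _ in range(max_iter):
        d = np.dot(grad(x + a * dx), dx)
        if d > c1 * d0:                       # step too long: shrink
            hi = a
            a = 0.5 * (lo + hi)
        elif d < c2 * d0:                     # step too short: enlarge
            lo = a
            a = 2.0 * a if np.isinf(hi) else 0.5 * (lo + hi)
        else:
            return a
    raise RuntimeError("no admissible step length found")


# Hypothetical test function f(x) = sum(sqrt(1 + x_i^2)); Newton overshoots for |x| > 1.
grad_f = lambda x: x / np.sqrt(1.0 + x**2)
x0 = np.array([2.0])
newton_dx = -x0 * (1.0 + x0**2)               # -f'(x)/f''(x) evaluated at x0
a_k = dong_step_length(grad_f, x0, newton_dx)
```

In this example the full Newton step overshoots the minimizer, and the search halves the step three times before both inequalities are met. No evaluation of the function value is needed at any point.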

In the vicinity of a stationary point, the Newton-Raphson method converges quadratically. However, it can be difficult to actually obtain such a convergence rate in practical application. For large problems, (4.67) is usually solved iteratively up to a certain tolerance. Solving for $\Delta x$ with the accuracy required for quadratic convergence is generally not feasible with respect to the overall computational effort (Knoll and Keyes, 2004). The Newton-Raphson method has been applied to FFT-based homogenization both in the small- and finite-strain setting (Gélébart and Mondon-Cancel, 2013; Kabel et al., 2014) in combination with Krylov subspace solvers (Brisard and Dormieux, 2010; Zeman et al., 2010). The linear Newton-Raphson equations corresponding to the Lippmann-Schwinger equations (4.59) and (4.60) read

$$\begin{aligned} &\left[\mathbb{I} + (\Gamma^0 + \mathbb{D}^0 : \mathbb{Q} : \langle \cdot \rangle\_Y) : (\mathbb{C}^{\tan}(\varepsilon\_k) - \mathbb{C}^0) \right] : \Delta \varepsilon \\ &\qquad = \mathbb{D}^0 : \overline{\sigma} - (\Gamma^0 + \mathbb{D}^0 : \mathbb{Q} : \langle \cdot \rangle\_Y) : \mathcal{F}(\varepsilon\_k), \end{aligned} \tag{4.69}$$

$$\begin{aligned} &\left[\mathbb{I} + (\widetilde{\mathbb{C}}^0 : \Delta^0 + \widetilde{\mathbb{C}}^0 : \mathbb{P} : \langle \cdot \rangle\_Y) : (\mathbb{D}^{\tan}(\sigma\_k) - \widetilde{\mathbb{D}}^0) \right] : \Delta \sigma \\ &\qquad = \widetilde{\mathbb{C}}^0 : \overline{\varepsilon} - (\widetilde{\mathbb{C}}^0 : \Delta^0 + \widetilde{\mathbb{C}}^0 : \mathbb{P} : \langle \cdot \rangle\_Y) : \mathcal{F}^{-1}(\sigma\_k). \end{aligned} \tag{4.70}$$

The algorithmic solution of this type of equation using the conjugate gradient (CG) method was outlined, e.g., by Kabel et al. (2014).

In the context of FFT-based homogenization, the performance of Newton's method in comparison to other solution schemes depends heavily on the material law. If the evaluation of the material law is cheap and comparable to the application of the tangent, then Newton- and CG-iterations have similar computational cost. In such a case, fast gradient methods outperform the Newton-CG method, considering the overall number of iterations (Schneider, 2017a). On the other hand, if the material law dominates the overall runtime and the cost of the CG-iterations is small in comparison, then Newton-CG is the method of choice, see Sec. 4.6. However, the memory requirements of the algorithm are steep. Whereas the basic scheme and the Barzilai-Borwein method can be implemented with one and two strain-like fields, respectively, the Newton-CG method requires the last converged solution and 4 additional fields for the CG algorithm. In 3 spatial dimensions, the additional storage of the tangent operator $\mathbb{C}^{\tan}$ or $\mathbb{D}^{\tan}$ corresponds to 21 scalars in each voxel, further increasing the required memory to 8.5 strain-like fields. We have found, however, that the last converged solution and the tangent operator can be stored in single precision without significantly affecting the convergence of the Newton scheme. Thereby, the memory footprint can be reduced to 6.25 strain-like fields.
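The memory figures above follow from simple bookkeeping, which the following sketch reproduces. It assumes that a strain-like field holds 6 double-precision scalars per voxel (a symmetric $3 \times 3$ tensor); the function names are introduced here for illustration.

```python
def newton_cg_memory(single_precision_storage=False):
    """Memory of Newton-CG in units of one double-precision strain-like field (3D)."""
    scalars_per_field = 6          # symmetric 3x3 tensor per voxel
    scalars_per_tangent = 21       # symmetric 6x6 tangent operator per voxel
    cg_fields = 4.0                # work fields of the CG algorithm
    last_solution = 1.0            # last converged solution
    tangent = scalars_per_tangent / scalars_per_field      # 3.5 fields
    # Single precision halves the storage of the last solution and the tangent.
    factor = 0.5 if single_precision_storage else 1.0
    return cg_fields + factor * (last_solution + tangent)


def field_size_mib(voxels_per_axis, bytes_per_scalar=8):
    """Size of one strain-like field on a cubic voxel grid in MiB."""
    return voxels_per_axis**3 * 6 * bytes_per_scalar / 2**20


double_fields = newton_cg_memory()                               # 8.5
mixed_fields = newton_cg_memory(single_precision_storage=True)   # 6.25
```

For a grid of $64^3$ voxels, one double-precision strain-like field occupies 12 MiB, so the two variants correspond to 102 MiB and 75 MiB, respectively.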

## **4.5.4 Eigenvalue precomputation for the stress-based formulation**

In the FFT-based homogenization of elasto-viscoplastic materials, we face certain challenges in the stress-based setting which are not present in the conventional strain-based case. In the following, we give an outline of the basic problem and present a remedy in the form of appropriate preprocessing steps.

Consider the first load step in which plastification occurs. In all discussed solution methods, we start with a single iteration of the basic scheme. This raises the question of how the reference material should be chosen, since we have no a priori information on the extremal eigenvalues of the tangent stiffness and compliance. A natural choice is to consider the eigenvalues of the materials' elastic stiffness $\mathbb{C}$. We evaluate the adequacy of this choice for the primal case. With the onset of plastification, the lower bound $\alpha_-$ of the stiffness decreases and can even approach zero, depending on hardening and viscosity. The upper bound $\alpha_+$ stays fixed. It can be shown that gradient schemes converge for the step size $\gamma = 1/L$ in case the energy has a Lipschitz continuous gradient but is not strongly convex (Nesterov, 2004). Even for an arbitrary decrease of $\alpha_-$, the reference stiffness of the basic scheme $\mathbb{C}^0 = \frac{1}{2}(\alpha_+ + \alpha_-)\, \mathbb{I}$ changes at most by a factor of two compared to the elastic case. Therefore, it is guaranteed that in the first load step $\mathbb{C}^0$ has at least the correct order of magnitude and the solvers can usually self-correct in the next couple of iterations.

In the stress-based setting, however, we face a different situation. During plastification, the upper bound of the compliance $\beta_+ = 1/\alpha_-$ usually increases by orders of magnitude and $\beta_-$ remains fixed. The reference compliance of the basic scheme $\mathbb{D}^0 = \frac{1}{2}(\beta_+ + \beta_-)\, \mathbb{I}$ is roughly proportional to $\beta_+$ in this case. Note that in case of $\alpha_- = 0$, e.g. for perfect elastoplasticity, the dual schemes cannot be used in this form as $\beta_+ = +\infty$. The error made by using the eigenvalues of the elastic compliance can be arbitrarily large and leads to an underestimation of $\mathbb{D}^0$, i.e. an overly large gradient step size. This negatively affects the convergence, as the solvers take a long time to self-correct. Hence, a reasonable estimate of $\beta_+$ has to be determined before applying a solution scheme.
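The different sensitivity of the primal and dual reference materials can be made concrete with a few numbers. The eigenvalues in the following sketch are hypothetical and only serve to illustrate the argument; they are not taken from the material models of this chapter.

```python
import numpy as np

alpha_plus = 1.0                      # largest stiffness eigenvalue (stays fixed)
alpha_minus_elastic = 0.5             # smallest eigenvalue in the elastic regime
alpha_minus = np.array([0.5, 1e-1, 1e-2, 1e-3])   # decreasing during plastification

# Primal reference stiffness C0 = (alpha_+ + alpha_-)/2 relative to the elastic value.
c0 = 0.5 * (alpha_plus + alpha_minus)
c0_elastic = 0.5 * (alpha_plus + alpha_minus_elastic)
c0_ratio = c0 / c0_elastic            # stays within [0.5, 1]

# Dual reference compliance D0 = (beta_+ + beta_-)/2 with beta = 1/alpha.
beta_plus, beta_minus = 1.0 / alpha_minus, 1.0 / alpha_plus
d0 = 0.5 * (beta_plus + beta_minus)
d0_elastic = 0.5 * (1.0 / alpha_minus_elastic + 1.0 / alpha_plus)
d0_ratio = d0 / d0_elastic            # grows roughly like 1 / alpha_minus
```

While the primal ratio never drops below one half, the dual ratio exceeds 300 once the smallest stiffness eigenvalue has decreased to $10^{-3}$ in this example.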

Given an appropriate initial guess for the stress field, a cheap and sufficiently accurate method is to evaluate the material law and eigenvalues in the voxel with the highest von Mises stress. In a setting with multiple load steps, this works well in conjunction with an affine extrapolation of the primary field (Moulinec and Suquet, 1998). However, we lack an initial stress field in the first load step. The conventional choice of $\sigma_0 = \overline{\sigma} + (\mathbb{D}^0)^{-1} : \overline{\varepsilon}$ is not useful for the eigenvalue precomputation. For common load cases such as strain-controlled uniaxial extension, $\overline{\sigma}$ vanishes and $\sigma_0$ relies on $\mathbb{D}^0$ which we want to estimate in the first place.

#### **Algorithm 4** Basic scheme for the Reuss mixing

```
σ_R ← 0
repeat
    ε_R ← Σ_i c_i ℱ_i⁻¹(σ_R)
    D_R^tan ← Σ_i c_i ∂ℱ_i⁻¹/∂σ (σ_R)
    compute β₊, β₋ from D_R^tan
    D⁰ ← (β₊ + β₋)/2 · I
    σ_R ← σ̄ + C⁰ : (ε̄ − P : (ε_R − D⁰ : σ_R))
until convergence
```
To obtain a better estimate for the initial stress, we use the Reuss model, i.e. we search the stress $\sigma_{\mathrm{R}} \in \text{Sym}(d)$ so that

$$\begin{aligned} \mathbb{Q}: \sigma\_{\mathrm{R}} &= \overline{\sigma}, \\ \mathbb{P}: \varepsilon\_{\mathrm{R}} &= \overline{\varepsilon} \quad \text{with} \quad \varepsilon\_{\mathrm{R}} = \sum\_{i=1}^{N} c\_{i} \, \mathcal{F}\_{i}^{-1}(\sigma\_{\mathrm{R}}) \end{aligned} \tag{4.71}$$

holds, where $c_i$ denote the volume fractions of our constituents and $\mathcal{F}_i^{-1}$ stand for their corresponding inverse material laws. Using the Reuss model can be seen as a minimization of $W^*$ in (4.41) under the assumption that the stress field is constant. We solve for the Reuss estimate $\sigma_{\mathrm{R}}$ using the basic scheme which is presented in Alg. 4 for convenience of the reader. Since the algorithm only operates on a single stress matrix, its computational expense is negligible regardless of iteration count. The choice $\sigma_0 = \sigma_{\mathrm{R}}$ was found to be a good starting point for the FFT-based homogenization schemes and enables the estimation of $\beta_+$.
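The basic scheme for the Reuss estimate can be sketched in a few lines of Python. The snippet below is an illustration under stated assumptions: symmetric tensors are stored as 6-vectors in Mandel notation, the constituents are hypothetical linear elastic isotropic phases, and the function names, tolerances and the uniaxial strain-controlled load case are chosen here for demonstration only.

```python
import numpy as np


def reuss_estimate(fractions, inv_laws, inv_tangents, P, eps_bar, sig_bar,
                   tol=1e-12, max_iter=200):
    """Basic scheme for the Reuss estimate; tensors as 6-vectors / 6x6 matrices."""
    sig = np.zeros(6)
    for _ in range(max_iter):
        eps = sum(c * f_inv(sig) for c, f_inv in zip(fractions, inv_laws))
        d_tan = sum(c * df_inv(sig) for c, df_inv in zip(fractions, inv_tangents))
        beta = np.linalg.eigvalsh(0.5 * (d_tan + d_tan.T))   # ascending order
        d0 = 0.5 * (beta[-1] + beta[0])                      # D0 = d0 * I, C0 = I / d0
        sig_new = sig_bar + (eps_bar - P @ (eps - d0 * sig)) / d0
        if np.linalg.norm(sig_new - sig) <= tol * max(1.0, np.linalg.norm(sig_new)):
            return sig_new
        sig = sig_new
    raise RuntimeError("Reuss iteration did not converge")


def isotropic_compliance(E, nu):
    """Isotropic compliance in Mandel notation (illustrative linear elastic phase)."""
    S = np.zeros((6, 6))
    S[:3, :3] = -nu / E
    S[np.arange(3), np.arange(3)] = 1.0 / E
    S[np.arange(3, 6), np.arange(3, 6)] = (1.0 + nu) / E
    return S


# Two hypothetical phases with equal volume fractions; the 11-strain is prescribed,
# all other stress components are prescribed to vanish.
S1, S2 = isotropic_compliance(1.0, 0.3), isotropic_compliance(10.0, 0.3)
P = np.diag([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
eps_bar = np.array([0.01, 0.0, 0.0, 0.0, 0.0, 0.0])
sig_R = reuss_estimate([0.5, 0.5],
                       [lambda s: S1 @ s, lambda s: S2 @ s],
                       [lambda s: S1, lambda s: S2],
                       P, eps_bar, np.zeros(6))
```

For linear elastic phases the result can be checked by hand: the only non-zero stress component equals the prescribed strain divided by the 11-component of the volume-averaged compliance. For nonlinear inverse material laws, the same loop applies with the corresponding callables.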

Analogously, the Voigt average can be used to estimate the initial strain field in the primal case, i.e. find $\varepsilon_{\mathrm{V}} \in \text{Sym}(d)$ so that

$$\begin{aligned} \mathbb{P}: \varepsilon\_{\mathrm{V}} &= \overline{\varepsilon}, \\ \mathbb{Q}: \sigma\_{\mathrm{V}} &= \overline{\sigma} \quad \text{with} \quad \sigma\_{\mathrm{V}} = \sum\_{i=1}^{N} c\_{i} \, \mathcal{F}\_{i}(\varepsilon\_{\mathrm{V}}) \end{aligned} \tag{4.72}$$

holds. This represents the minimization of $W$ in (4.7) with a constant strain field. However, the effect on the performance of the strain-based FFT-solvers is rather small.

# **4.6 Numerical demonstrations**

## **4.6.1 Setup**

All algorithms were implemented in Python 2.7, supplemented by Cython (Behnel et al., 2011) extensions for the computationally expensive operations, i.e. the application of $\Gamma^0$ and $\Delta^0$ and the evaluation of the material law. For the computation of the fast Fourier transforms, we relied on the FFTW library (Frigo and Johnson, 2005). The critical parts of the code were parallelized using OpenMP. The computations ran on 6 threads on a desktop computer with 32 GB RAM and an Intel i7 CPU with 6 cores and a clock rate of 3.7 GHz.

The staggered grid discretization (Schneider et al., 2016) was utilized throughout because of its superior performance for perfectly plastic material behavior. Notice that the Helmholtz decomposition, see Appendix A, is available for the staggered grid discretization (Schneider et al., 2016). In case of multiple load steps, an affine linear extrapolation (Moulinec and Suquet, 1998) was used for the primary field.

In this section, we compare primal, dual and primal-dual algorithms. Primal algorithms are those based on the primal basic scheme, i.e. the primal Barzilai-Borwein method and the primal Newton-CG method. For these methods, a strain field serves as the variable to iterate on. For every iteration, the strain field is compatible, and the iterative scheme seeks an equilibrated stress field. The latter is quantified by the convergence criterion

$$\alpha\_0 \frac{\left\| \varepsilon\_{k+1} - \varepsilon\_k \right\|\_{L^2}}{\left\| \langle \sigma\_k \rangle\_Y \right\|} \le \delta,\tag{4.73}$$

see Sec. 5 in Schneider et al. (2019) for details. The dual schemes are based on the dual basic scheme by Bhattacharya and Suquet (2005). They iterate on equilibrated stress fields until the associated strain field is compatible. We check this by the criterion

$$
\beta\_0 \frac{\left\| \sigma\_{k+1} - \sigma\_k \right\|\_{L^2}}{\left\| \langle \varepsilon\_k \rangle\_Y \right\|} \le \delta,\tag{4.74}
$$

which is simply the dual of the primal convergence criterion. Last but not least, we also briefly touch upon a primal-dual algorithm, the Eyre-Milton method (Eyre and Milton, 1999). This scheme iterates on a variable called polarization. Compatibility and equilibrium of the associated strain and stress fields, respectively, are only satisfied upon convergence. For the Eyre-Milton scheme, the consistent convergence criterion

$$\frac{1}{2} \frac{||P\_{k+1} - P\_k||\_{L^2}}{||\langle \sigma\_k \rangle\_Y||} \le \delta,\tag{4.75}$$

is used, see Sec. 5 in Schneider et al. (2019) for a derivation.

Due to the differences of these three schemes, the convergence criteria are not strictly comparable. Still, these criteria have been chosen to ensure the maximum degree of fairness in comparison, taking into account functional analytic and physical considerations. For our computations, we consistently used $\delta = 10^{-5}$.
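The primal and dual criteria share the same structure and may be evaluated by one small routine, sketched below. The following is assumed here and not stated in the text: fields are stored as arrays of shape `(N1, N2, N3, 6)`, the $L^2$ norm is taken as the root of the volume average of the squared pointwise norm, and `scale` stands for the prefactor $\alpha_0$ or $\beta_0$. The function name and the synthetic data are hypothetical.

```python
import numpy as np


def relative_residual(field_new, field_old, dual_field, scale):
    """scale * ||field_new - field_old||_{L2} / ||<dual_field>_Y||, cf. (4.73), (4.74)."""
    n_comp = field_new.shape[-1]
    diff = (field_new - field_old).reshape(-1, n_comp)
    # Volume-averaged L2 norm of the increment (assumed normalization).
    l2_norm = np.sqrt(np.mean(np.sum(diff**2, axis=1)))
    # Euclidean norm of the volume average of the dual field.
    mean = dual_field.reshape(-1, dual_field.shape[-1]).mean(axis=0)
    return scale * l2_norm / np.linalg.norm(mean)


# Synthetic data on a 2 x 2 x 2 grid, for demonstration only.
eps_old = np.zeros((2, 2, 2, 6))
eps_new = np.zeros((2, 2, 2, 6))
eps_new[..., 0] = 1e-3
sig = np.zeros((2, 2, 2, 6))
sig[..., 0] = 2.0
residual = relative_residual(eps_new, eps_old, sig, scale=4.0)
converged = residual <= 1e-5
```

Swapping the roles of strain and stress and passing the dual prefactor yields the dual criterion.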

Last but not least let us remark that, for the Eyre-Milton method, we use the complexity reduction trick described in Sec. 6 of Schneider et al. (2019), which reduces the complexity of a single Eyre-Milton iteration to the level of a primal basic step for small-strain crystal viscoplasticity.

#### **4.6.2 Polycrystalline microstructure**

**Figure 4.1:** Left: Polycrystalline microstructure ($64^3$ voxels) and accumulated plastic slip at 1% tensile strain. Right: Stress-strain diagram of the material models for a tensile strain rate of 0.001/s

**Setup and material parameters.** In the following section, we investigate a periodic polycrystalline microstructure with 81 grains and a resolution of $64^3$ voxels. The microstructure was generated using the Voronoi tessellation routine of the software Neper (Quey et al., 2011) with uniformly distributed grain orientations. We consider two single-crystal elasto-viscoplasticity models as described in Sec. 4.3.1 using the flow rules of Chaboche and Hutchinson, respectively

$$\dot{\gamma}\_{\alpha} = \dot{\gamma}\_{0} \operatorname{sgn}(\tau\_{\alpha}) \left\langle \frac{|\tau\_{\alpha}| - \tau^{\mathrm{F}}}{\tau\_{\mathrm{D}}} \right\rangle^{n}, \qquad \dot{\gamma}\_{\alpha} = \dot{\gamma}\_{0} \operatorname{sgn}(\tau\_{\alpha}) \left| \frac{\tau\_{\alpha}}{\tau^{\mathrm{F}}} \right|^{n}. \tag{4.76}$$

For the hardening law, we use a linear exponential approach based on the accumulated plastic slip $\gamma$, see (4.21)

$$\tau^{\mathcal{F}} = \tau\_0 + (\tau\_\infty - \tau\_0) \left( 1 - \exp\left( -\frac{\Theta\_0 - \Theta\_\infty}{\tau\_\infty - \tau\_0} \gamma \right) \right) + \Theta\_\infty \gamma \tag{4.77}$$

where $\tau_0$ denotes the initial yield stress, $\Theta_0$ and $\Theta_\infty$ respectively denote the initial and asymptotic hardening modulus and $\tau_\infty$ denotes the saturated yield stress for $\Theta_\infty = 0$. The material model using the Hutchinson flow rule is not a generalized standard material and its tangent stiffness is not symmetric. Hence, the convergence of the solution schemes outlined in Sec. 4.5 is not theoretically justified in this case. The performance of the solvers in combination with this flow rule is still of interest as it is widely used (Lebensohn, 2001; Lebensohn et al., 2012) and therefore included in our investigations. For the tangent operators and their eigenvalues, we consider the symmetric part of the tangent stiffness or compliance. Additionally, we include Eyre-Milton's method (Eyre and Milton, 1999) in our list of investigated solvers in the strain-based setting as similar polarization-based schemes have been widely adopted in the context of single-crystal plasticity (Lebensohn et al., 2012; Shantraj et al., 2015). For a discussion of the theoretical background, algorithmic parameters and the convergence criterion for this family of solvers, we refer to Schneider et al. (2019).
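The two flow rules (4.76) and the hardening law (4.77) translate directly into code. The following sketch is for illustration only: the function names are introduced here, and the numerical parameter values are hypothetical placeholders, not the values of Tab. 4.2.

```python
import numpy as np


def yield_stress(gamma, tau0, tau_inf, theta0, theta_inf):
    """Linear-exponential hardening law (4.77)."""
    return (tau0 + (tau_inf - tau0)
            * (1.0 - np.exp(-(theta0 - theta_inf) / (tau_inf - tau0) * gamma))
            + theta_inf * gamma)


def slip_rate_chaboche(tau, tau_f, gamma_dot0, tau_d, n):
    """Chaboche-type flow rule in (4.76); <.> is the Macaulay bracket."""
    overstress = np.maximum(np.abs(tau) - tau_f, 0.0)
    return gamma_dot0 * np.sign(tau) * (overstress / tau_d) ** n


def slip_rate_hutchinson(tau, tau_f, gamma_dot0, n):
    """Hutchinson-type power law in (4.76)."""
    return gamma_dot0 * np.sign(tau) * np.abs(tau / tau_f) ** n


# Hypothetical placeholder parameters (stress-like quantities in arbitrary units).
params = dict(tau0=10.0, tau_inf=20.0, theta0=100.0, theta_inf=1.0)
tau_f0 = yield_stress(0.0, **params)                       # equals tau0
initial_slope = (yield_stress(1e-7, **params) - tau_f0) / 1e-7   # close to theta0

rate_below_yield = slip_rate_chaboche(8.0, tau_f0, 1e-3, 5.0, 10)
rate_chaboche = slip_rate_chaboche(-15.0, tau_f0, 1e-3, 5.0, 10)
rate_hutchinson = slip_rate_hutchinson(10.0, tau_f0, 1e-3, 10)
```

The Chaboche rule produces no slip below the yield stress, whereas the Hutchinson rule yields a non-zero slip rate for any non-zero resolved shear stress. Both reach the reference slip rate when the overstress equals the drag stress or the shear stress equals the yield stress, respectively.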


**Table 4.2:** Material parameters of the single-crystal material models (Simmons and Wang, 1971; Eghtesad et al., 2018a)

All material parameters for the single-crystal plasticity models are listed in Tab. 4.2. The stiffness parameters correspond to OFHC copper at room temperature and are taken from Simmons and Wang (1971). For the model with Hutchinson's flow rule, the viscoplastic and hardening parameters of Eghtesad et al. (2018a) were used. Note that Eghtesad et al. prescribe a slightly different formulation for the linear exponential hardening law. However, the asymptotic behavior of their hardening approach is identical to the formulation in this study. The boundary conditions in the following numerical demonstrations correspond to a strain-controlled uniaxial tensile test, i.e. a uniaxial stress state, up to 1% strain with an applied strain rate of 0.001/s. We investigate two cases where the load is applied in a single step and in 50 steps of 0.02%, respectively. In case of the material model with Chaboche's flow rule, the material parameters $\tau_0$ and $\tau_{\mathrm{D}}$ were chosen so that the stress-strain curves for the given load case are roughly equivalent for both models, see Fig. 4.1.

**Figure 4.2:** Polycrystal: Residual vs. computation time (Chaboche flow rule)

First, we investigate the case of a single load step up to 1% strain in tensile direction using the material model with Chaboche's flow rule. Fig. 4.2 compares the residual of the different solvers as a function of computation time.


**Table 4.3:** Polycrystal: Computation times and iteration counts (Chaboche flow rule)

The basic scheme was omitted in this plot for the convenience of the reader, as its required computation time was an order of magnitude larger in comparison to the other schemes. All total runtimes are given in Tab. 4.3, together with the iteration counts. Note that we only counted Newton iterations and backtracking steps, i.e. evaluations of the material law, for the Newton-CG solver and omitted the CG iterations.

For both the primal and dual setting, the Newton-CG solver exhibits the best performance. Due to the large load step, we start far from the converged solution. Consequently, the Newton-CG solver takes many smaller steps and backtracking iterations in the beginning. After reaching a residual of $10^{-3}$, it converges rapidly. The Barzilai-Borwein method is the second fastest scheme and exhibits a non-monotonic convergence behavior, both for the primal and the dual setting. The convergence rate of the Eyre-Milton scheme is similar to the other solvers up to a residual of $10^{-2}$ but slows down considerably afterwards.

Comparing the computation times of the solvers in the primal and dual case, we see a considerable increase in speed for the latter due to the cheaper evaluation of the inverse material law. For each solver, the computation time decreases by a factor of 1.5 to 4 in the stress-based setting. The second fastest solver in the dual setting, the Barzilai-Borwein scheme, is still faster than the primal Newton-CG method, with the additional benefit of reduced memory consumption.

**Figure 4.3:** Polycrystal: Residual over computation time (Hutchinson flow rule)

In the following, we consider the same load as in the previous section using the material model with the flow rule by Hutchinson, see Fig. 4.3 and Tab. 4.4. Qualitatively, the convergence behavior of the solvers is similar to the case using Chaboche's flow rule. In general, we observe lower iteration counts. This can be attributed to a higher tangent stiffness, i.e. lower inner material contrast, for large load steps when using Hutchinson's flow rule. The largest effect can be seen for the Eyre-Milton scheme, where the iteration count decreases by a factor of 10. As a result, it performs only marginally slower than the Newton-CG method in the primal setting. To achieve stable convergence, the Eyre-Milton scheme tracks the extremal eigenvalues over the entire solution history, see Schneider et al. (2019). Therefore, a low tangential stiffness can lead to a subsequent slowdown of the whole solution scheme, even if it only occurs in a single iteration. This explains the much slower convergence in case of the model with Chaboche's flow rule. Furthermore, it is noteworthy that the speedup in the dual setting is larger in this case, with factors of 10 to 20 in computation times for each solver. This is a consequence of the lower iteration count and the cheaper evaluation of the inverse material law in case of Hutchinson's flow rule. For the Chaboche flow rule, all overstress terms $|\tau_\alpha| - \tau^{\mathrm{F}}$ have to be recomputed for each evaluation of the residual (4.36) while only $\tau^{\mathrm{F}}$ has to be updated in case of Hutchinson's flow rule. This leads to a lower number of computationally expensive exponentiations in the latter case. The dual Barzilai-Borwein method notably profits from this fact and is only 2.5 times slower than the dual Newton-CG method and 4 times faster than the primal Newton-CG method.

**Table 4.4:** Polycrystal: Computation times and iteration counts (Hutchinson flow rule)

**Convergence behavior and runtime for 50 load steps.** For the computations in this section, we applied the load of 1% tensile strain in 50 steps. In analogy to the last section, we first consider the model using Chaboche's flow rule. The computation time of the different solvers in each load step is plotted in Fig. 4.4. All solvers, except for Newton-CG in the primal setting, exhibit a peak in computation time in the second step with the onset of plastification. In the subsequent steps, the runtime decreases and reaches a stable value approximately at step 20. To evaluate the overall performance, the mean computation times and iterations per step are listed in Tab. 4.5. For each solver, the iteration counts are similar in the primal and dual setting while the computation times are lower by a factor of 3 to 6 in the latter. As for the single load step, Newton-CG exhibits the best performance throughout. However, in the primal case, the Eyre-Milton scheme takes less than twice as long to converge with a third of the required memory. Similarly, the Barzilai-Borwein scheme is slower than Newton-CG by a factor of 2.5 in the dual setting, with the same low memory requirements as Eyre-Milton.

**Figure 4.4:** Polycrystal: Computation time of each load step (Chaboche flow rule)

Considering the material model with Hutchinson's flow rule, the results are similar to the case using Chaboche's flow rule in the primal setting, see Fig. 4.5 and Tab. 4.6. The stress-based computations run faster than for the model with Chaboche's flow rule, due to the same reasons as in the single step case. However, the difference is less pronounced for small load steps as the inverse material law needs fewer iterations to converge. We further notice that the dual Barzilai-Borwein scheme is


**Table 4.5:** Polycrystal: Mean computation times and iteration counts (Chaboche flow rule)

**Table 4.6:** Polycrystal: Mean computation times and iteration counts (Hutchinson flow rule)


close in performance to the dual Newton-CG method, being only 25% slower. While the iteration count of Barzilai-Borwein is 4.5 times higher in comparison, the iterations are much less costly as neither the tangent computation nor the solution of a linear system is required.

**Effective material properties.** In the following, we discuss the effective elastic and plastic properties of the polycrystalline microstructure. The overall elastic behavior is characterized by the effective stiffness $\bar{\mathbb{C}} : \text{Sym}(d) \to \text{Sym}(d)$ which relates effective stress $\bar{\sigma} = \langle \sigma \rangle$ and effective

**Figure 4.5:** Polycrystal: Computation time of each load step (Hutchinson flow rule)

strain $\bar{\varepsilon} = \langle \varepsilon \rangle$ by

$$\bar{\sigma} = \bar{\mathbb{C}} : \bar{\varepsilon}, \tag{4.78}$$

assuming linear elastic behavior for the single crystalline phase. Using the elastic parameters in Tab. 4.2, the effective stiffness of the polycrystalline structure, given in Voigt notation,

$$
\bar{\mathbb{C}} = \begin{bmatrix}
194.1 & 103.0 & 102.8 & -0.1 & -0.2 & 0.7 \\
103.0 & 191.9 & 105.0 & -1.5 & 0.3 & 0.4 \\
102.8 & 105.0 & 192.2 & 1.6 & -0.2 & -1.1 \\
0.7 & 0.4 & -1.1 & 0.3 & -0.2 & 44.2
\end{bmatrix} \text{ GPa}, \tag{4.79}
$$

was identified through 6 linear elastic computations. The isotropic part of the stiffness can be computed by

$$\mathbb{C}^{\text{iso}} = (\bar{\mathbb{C}} : \mathbb{P}\_1)\mathbb{P}\_1 + \frac{1}{5}(\bar{\mathbb{C}} : \mathbb{P}\_2)\mathbb{P}\_2. \tag{4.80}$$

with the projector $\mathbb{P}\_1$ onto the spherical $d \times d$ matrices and the projector $\mathbb{P}\_2$ onto the deviatoric, i.e. trace-free, $d \times d$ matrices. For the given effective stiffness, $\mathbb{C}^{\text{iso}}$ corresponds to a material with a Young's modulus of $E = 120.8$ GPa and a Poisson ratio of $\nu = 0.35$. In this case, the anisotropic part of the stiffness $\mathbb{C}^{\text{aniso}} = \bar{\mathbb{C}} - \mathbb{C}^{\text{iso}}$ is small with $\|\mathbb{C}^{\text{aniso}}\| / \|\bar{\mathbb{C}}\| = 0.017$ where $\|\mathbb{C}\| = \sqrt{\mathbb{C} :: \mathbb{C}}$. Hence, $\mathbb{C}^{\text{iso}}$ is a reasonable approximation of $\bar{\mathbb{C}}$.
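The isotropic projection (4.80) can be sketched numerically. The snippet below is a minimal illustration, not the implementation used for the reported values: it assumes the stiffness is supplied as a 6 × 6 Voigt matrix and converts it to Mandel notation, so that the full contraction of fourth-order tensors becomes an entry-wise sum. The function names and the isotropic test stiffness are chosen for illustration only.

```python
import numpy as np

def voigt_to_mandel(C_voigt):
    """Rescale the shear rows/columns of a 6x6 Voigt stiffness by sqrt(2)."""
    w = np.array([1.0, 1.0, 1.0, np.sqrt(2.0), np.sqrt(2.0), np.sqrt(2.0)])
    return np.asarray(C_voigt, dtype=float) * np.outer(w, w)

def isotropic_part(C_mandel):
    """Return (C_iso, K, G) from the projection (4.80) in Mandel notation."""
    P1 = np.zeros((6, 6))
    P1[:3, :3] = 1.0 / 3.0                # projector onto spherical matrices
    P2 = np.eye(6) - P1                   # projector onto deviatoric matrices
    three_K = np.sum(C_mandel * P1)       # full contraction C :: P1
    two_G = np.sum(C_mandel * P2) / 5.0   # (C :: P2) / 5
    return three_K * P1 + two_G * P2, three_K / 3.0, two_G / 2.0

def engineering_constants(K, G):
    """Young's modulus and Poisson ratio from bulk and shear modulus."""
    E = 9.0 * K * G / (3.0 * K + G)
    nu = (3.0 * K - 2.0 * G) / (2.0 * (3.0 * K + G))
    return E, nu

def relative_anisotropy(C_mandel):
    """Norm of the anisotropic part relative to the norm of the stiffness."""
    C_iso, _, _ = isotropic_part(C_mandel)
    return np.linalg.norm(C_mandel - C_iso) / np.linalg.norm(C_mandel)

# Illustrative input: an exactly isotropic stiffness (values in GPa).
E_in, nu_in = 120.8, 0.35
lam = E_in * nu_in / ((1.0 + nu_in) * (1.0 - 2.0 * nu_in))
mu = E_in / (2.0 * (1.0 + nu_in))
C_voigt_example = np.zeros((6, 6))
C_voigt_example[:3, :3] = lam
C_voigt_example += mu * np.diag([2.0, 2.0, 2.0, 1.0, 1.0, 1.0])

C_iso, K, G = isotropic_part(voigt_to_mandel(C_voigt_example))
E_out, nu_out = engineering_constants(K, G)
```

For an exactly isotropic input, the projection returns the input itself and the relative anisotropy vanishes up to round-off; for a computed effective stiffness, the same calls would give the isotropic moduli and the anisotropy measure discussed above.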

**Figure 4.6:** Polycrystalline microstructure: Contraction ratio at various load angles in the -, -, and -plane

The plastic anisotropy of polycrystals can be characterized by the contraction ratio

$$q = -\frac{\dot{\bar{\varepsilon}}\_{\mathsf{p}} : (n\_{\mathcal{O}} \otimes n\_{\mathcal{O}})}{\dot{\bar{\varepsilon}}\_{\mathsf{p}} : (n\_{\ell} \otimes n\_{\ell})} \tag{4.81}$$

which can be identified in a given plane by performing tensile tests at various angles, with the load direction $n\_{\ell}$ and the orthogonal direction $n\_{\mathcal{O}}$ in the chosen plane. Here, $\bar{\varepsilon}\_{\mathsf{p}} \in \text{Sym}(d)$ denotes the effective plastic strain which is computed by

$$
\bar{\varepsilon}\_{\mathsf{p}} = \bar{\varepsilon} - \bar{\mathbb{D}} : \bar{\sigma} \quad \text{with} \quad \bar{\mathbb{D}} = \bar{\mathbb{C}}^{-1}. \tag{4.82}
$$

The contraction ratio is connected to the commonly used Lankford coefficient $r$ by $q = \frac{r}{r+1}$. To characterize the plastic anisotropy of the polycrystalline microstructure, computations corresponding to uniaxial tensile tests were performed at various angles in the -, - and plane using the elasto-viscoplastic material model with Chaboche's flow rule. The resulting contraction ratios are plotted in Fig. 4.6, where the load angle is taken with respect to the -direction for the - and -plane and with respect to the -direction for the -plane. We observe that all contraction ratios fluctuate around the value of 0.5 which signifies plastically isotropic behavior. The largest deviation is about 0.1.
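As a small numerical illustration of the contraction ratio (4.81), the following sketch evaluates it for a given effective plastic strain rate and converts between contraction ratio and Lankford coefficient. It assumes plastic incompressibility for the conversion; the function names and the sample strain rates are illustrative and not taken from the computations above.

```python
import numpy as np

def contraction_ratio(eps_p_rate, n_load, n_orth):
    """Contraction ratio (4.81) for a symmetric 3x3 plastic strain rate."""
    n_load = np.asarray(n_load, dtype=float)
    n_orth = np.asarray(n_orth, dtype=float)
    n_load = n_load / np.linalg.norm(n_load)
    n_orth = n_orth / np.linalg.norm(n_orth)
    lateral = n_orth @ eps_p_rate @ n_orth   # strain rate across the load
    axial = n_load @ eps_p_rate @ n_load     # strain rate along the load
    return -lateral / axial

def lankford_from_contraction(q):
    """Width-to-thickness strain ratio, assuming plastic incompressibility."""
    return q / (1.0 - q)

def contraction_from_lankford(r):
    """Inverse relation q = r / (r + 1)."""
    return r / (r + 1.0)

# Plastically isotropic, incompressible uniaxial flow along the first axis.
eps_iso = np.diag([1.0, -0.5, -0.5])
q_iso = contraction_ratio(eps_iso, [1, 0, 0], [0, 1, 0])

# An anisotropic, still incompressible, sample state.
eps_aniso = np.diag([1.0, -0.6, -0.4])
q_aniso = contraction_ratio(eps_aniso, [1, 0, 0], [0, 1, 0])
```

In the isotropic sample state the lateral contraction is shared equally between the two transverse directions, which is the situation the value 0.5 refers to.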

## **4.6.3 Directionally solidified NiAl-9Mo fiber structure**

**Setup and material parameters.** We investigate the high-temperature

**Figure 4.7:** Directionally solidified NiAl-9Mo: Microstructure (1200 × 160 × 160 voxels) and accumulated plastic slip after 50s

creep of a NiAl-9Mo eutectic. Using directional solidification, this material develops a characteristic microstructure where well-aligned single-crystal molybdenum fibers with square cross section are embedded in a nickel-aluminum matrix (Bei and George, 2005). We consider a unit cell of 1200 × 160 × 160 voxels with a fiber volume content of 14% and a fiber aspect ratio of 100 (Haenschke et al., 2010), see Fig. 4.7. The microstructure was generated by a random sequential addition algorithm (Widom, 1966). The spatial resolution of the fibers is 8 voxels per edge length. Due to the large voxel count, we restrict the investigation to the fastest solvers, i.e. the Newton-CG method in the primal setting and the Newton-CG as well as the Barzilai-Borwein methods in the dual setting.

Both materials are modeled according to Albiez et al. (2016a) using the Hutchinson flow rule (4.32). The nickel-aluminum matrix is assumed to be perfectly plastic, i.e. = 0 . For the molybdenum fibers, we use the approach in Albiez et al. (2016a)

$$\tau^F = \frac{\tau\_{\infty}}{d\sqrt{\rho} + 1} \quad \text{with} \quad \rho = \rho\_s \left[ 1 - \exp\left( -\frac{1}{2} k\_2 \gamma \right) \left( 1 - \sqrt{\frac{\rho\_0}{\rho\_s}} \right) \right]^2 \tag{4.83}$$

with the maximum yield stress $\tau\_{\infty}$, the characteristic length $d$, the dislocation density $\rho$, its initial value $\rho\_0$, its saturation value $\rho\_s$ and the recovery constant $k\_2$. Note that we neglect the Taylor hardening term present in Albiez et al. (2016a), as its contribution is small in this case. The material parameters for fibers and matrix at 1000 °C are listed in Tab. 4.7. Owing to the large flow resistance of the fibers, the investigated composite has a high external material contrast in addition to the internal contrast caused by plastification.
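The structure of (4.83) is easy to check numerically. The sketch below evaluates the dislocation density and the resulting yield stress as functions of the accumulated slip $\gamma$. All parameter values are placeholders chosen only to exercise the formula; they are not the values of Tab. 4.7.

```python
import math

def dislocation_density(gamma, rho_0, rho_s, k_2):
    """Dislocation density from (4.83) as a function of accumulated slip."""
    bracket = 1.0 - math.exp(-0.5 * k_2 * gamma) * (1.0 - math.sqrt(rho_0 / rho_s))
    return rho_s * bracket ** 2

def fiber_yield_stress(gamma, tau_inf, d, rho_0, rho_s, k_2):
    """Yield stress of the fibers according to (4.83)."""
    rho = dislocation_density(gamma, rho_0, rho_s, k_2)
    return tau_inf / (d * math.sqrt(rho) + 1.0)

# Placeholder parameters (hypothetical, for illustration only).
params = dict(tau_inf=1000.0,  # MPa
              d=1.0e-6,        # m
              rho_0=1.0e10,    # 1/m^2
              rho_s=1.0e14,    # 1/m^2
              k_2=10.0)

rho_kwargs = {key: params[key] for key in ("rho_0", "rho_s", "k_2")}
rho_start = dislocation_density(0.0, **rho_kwargs)
rho_late = dislocation_density(100.0, **rho_kwargs)
tau_start = fiber_yield_stress(0.0, **params)
tau_late = fiber_yield_stress(100.0, **params)
```

The density starts at its initial value for $\gamma = 0$ and approaches the saturation value for large slip. For an initial density below the saturation value, the yield stress therefore decreases with accumulated slip, i.e. the fibers soften.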

The boundary conditions correspond to a uniaxial compression load of 250 MPa, which is rapidly applied in a single load step of 0.001 s, corresponding to a strain rate of 2/s in the load direction. The load is subsequently held for 100 load steps of 0.5 s for a total time of 50 s.

**Convergence behavior and runtime.** In the first few steps after the rapid initial compression, a load transfer from matrix to fiber takes place in the form of a viscous effect. Consequently, the phase-specific loads


**Table 4.7:** Directionally solidified eutectic: Material parameters of fibers and matrix at 1000 °C (Albiez et al., 2016a)

and fields are not monotonic at the onset of creep, see Albiez et al. (2016a); Dudová et al. (2011). This leads to an initial peak in computation time, before it drops around step 10 when the fields stabilize and extrapolation takes effect, see Fig. 4.8. The peak is most pronounced for the dual Barzilai-Borwein scheme which requires nearly the same time as the primal Newton-CG method in the first few steps but speeds up to the level of the dual Newton-CG method afterwards. Tab. 4.8 allows us to compare the overall performance of the solvers. The dual Newton-CG scheme is fastest overall, beating the primal Newton-CG by a factor of 7. It is closely followed by the dual Barzilai-Borwein method, which requires 5 times as many iterations but only 50% more computation time.

**Effective material properties.** A characteristic value for the creep strength of a material under a certain stress load is the minimal creep

**Figure 4.8:** Directionally solidified NiAl-9Mo: Computation time of each load step

**Table 4.8:** Directionally solidified eutectic: Mean computation times and iteration counts


rate $\dot{\varepsilon}^{\mathrm{c}}\_{\min}$. It is defined as the minimum of the strain rate in direction $n\_{\ell}$ of the applied uniaxial stress load

$$
\dot{\varepsilon}^{\mathrm{c}} = \dot{\bar{\varepsilon}} : (n\_{\ell} \otimes n\_{\ell}). \tag{4.84}
$$
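In a post-processing step, the minimal creep rate can be extracted from a computed creep curve. The following sketch assumes that the effective strain in load direction is available at discrete times and approximates the rate (4.84) by finite differences. Taking the magnitude of the rate is a choice made here to cover compressive loads. The function names and the synthetic creep curve are illustrative.

```python
import numpy as np

def directional_strain(eps_bar, n_load):
    """Project a symmetric 3x3 effective strain onto the load direction."""
    n = np.asarray(n_load, dtype=float)
    n = n / np.linalg.norm(n)
    return float(n @ eps_bar @ n)

def minimal_creep_rate(times, creep_strain):
    """Return (minimal rate magnitude, time at which it occurs)."""
    times = np.asarray(times, dtype=float)
    strain = np.asarray(creep_strain, dtype=float)
    rates = np.abs(np.gradient(strain, times))  # finite-difference strain rate
    idx = int(np.argmin(rates))
    return float(rates[idx]), float(times[idx])

# Synthetic creep curve with decreasing, then increasing rate:
# the exact rate is (t - 5)^2 + 0.1 with its minimum 0.1 at t = 5.
t = np.linspace(0.0, 10.0, 101)
eps_c = (t - 5.0) ** 3 / 3.0 + 0.1 * t
rate_min, t_min = minimal_creep_rate(t, eps_c)
```

The finite-difference estimate slightly overestimates the exact minimum on a coarse time grid, so the time step should be small compared to the time scale of the creep curve.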

To investigate the anisotropic creep behavior of the NiAl-9Mo microstructure, we performed creep computations with a compressive uniaxial stress load of 250 MPa at different angles with respect to the fiber direction, i.e. -direction, in the -plane. The resulting creep curves and the corresponding minimal creep rate for each angle are depicted in Fig. 4.9. Computations in the -plane were conducted as well

**Figure 4.9:** Creep of directionally solidified NiAl-9Mo at different load angles with respect to the fiber direction in the -plane for an applied load of 250 MPa. Left: Creep curves. Right: Minimal creep rates

and yielded similar results, indicating approximately isotropic creep behavior in the -plane. Several comments are in order. Compared to the computations by Albiez et al. (2016a), the minimal creep rate for a load applied in fiber direction is an order of magnitude larger in the present study. This can be attributed to the fact that Albiez et al. assumed infinitely long fibers. Hence, we observe a pronounced effect of the aspect ratio on the effective creep behavior, even for a large ratio such as 100. Considering the creep curves at varying load angles, a pronounced minimum is only present in the case where load direction and fiber alignment coincide. For all other load cases, the creep rate reached a stationary value and did not increase afterwards. This is due to the fact that the plastic deformation and the accompanying softening of the fibers is only activated under high loads as the initial yield stress of molybdenum is very large, see Tab. 4.7. The fibers carry such high stresses only in the 0° load case. In the other cases where stress load and fibers are misaligned, a larger part of the load is distributed to the less creep-resistant NiAl matrix, which leads to higher effective creep rates. Even for a small misalignment of 15°, the minimum creep rate increases by more than two orders of magnitude compared to the 0° case. For the 45° and 90° cases, the creep rate is similar to that reported for binary NiAl, e.g., by Seemüller et al. (2013) or Whittenberger et al. (1991). This indicates that the reinforcing effect of the molybdenum fibers diminishes at these load angles.

# **4.7 Conclusions**

Stress-based FFT schemes were initially conceived by Bhattacharya and Suquet (2005) to tackle strain-locking materials. We found that their application can also be beneficial in the case of small-strain single crystal elasto-viscoplasticity, due to the stress-explicit formulation of the plastic flow rule. In our numerical examples, we were thereby able to reduce the computation time by a factor of 2 to 20 in comparison to the strain-based setting. Further research could assess whether other types of stress-explicit material laws can similarly profit from the stress-based formulation.

For this study, we considered geometrically linear crystal plasticity models. To investigate finite deformations, explicit incremental update schemes as presented, e.g., by Lebensohn (2001); Lebensohn et al. (2008) could be applied after each converged load step. To the best of our knowledge, in a Lagrangian finite-strain setting, explicit solutions for the update of the inelastic deformations currently exist only for the case of viscoelasticity (Shutov et al., 2013). To our knowledge, this does not change in the dual case.

Considering our investigated solution schemes, we found that a good initial approximation of the average load and the eigenvalues of the tangent was vital to achieve fast and stable convergence in the dual setting. To this end, suitable and computationally efficient precomputation routines were presented. As a result, the solution schemes exhibited robust convergence behavior and similar iteration counts for both strain- and stress-based computations. Still, it remains an open question whether materials without a lower bound on their stiffness, e.g. in case of perfect elastoplasticity, can be handled in the dual setting.

Comparing the performance of the solvers, the Newton-CG method exhibited the best results throughout. However, in the dual setting, the Barzilai-Borwein scheme was in many cases competitive, being only slightly slower. For the presented numerical examples, we outlined how the developed methods can be used to characterize the complex behavior of polycrystalline compounds. The low memory footprint and the high computational efficiency of the stress-based Barzilai-Borwein method enable the further study of more complex microstructures which require larger cells and higher voxel counts. For instance, the investigation of non-unidirectional fiber distributions for the NiAl-Mo material in Sec. 4.6.3 or the study of cellular lamellar structures formed by NiAl-Cr(Mo) (Wang et al., 2018b) becomes feasible with the presented approach.

## **Chapter 5**

# **Computing the effective response of heterogeneous materials with thermomechanically coupled constituents by an implicit FFT-based approach<sup>1</sup>**

## **5.1 Introduction**

When subjected to a wide range of thermomechanical loadings, the interplay between temperature and deformation fields has a significant impact on the effective behavior of structural materials. Clearly, variations in temperature lead to changes in the mechanical behavior, e.g., in the form of thermal softening. In return, mechanical loadings can induce temperature changes, due to internal dissipation or changes in entropy. This interplay of mechanical and thermal effects is governed by the balance equations for linear momentum and internal energy (in terms of the heat equation). For instance, in the vicinity of

<sup>1</sup> This chapter is based on Wicht et al. (2021b). For the sake of a coherent structure, formatting and typography of this thesis, minor changes have been made. To avoid redundancies in the text, the introduction has been shortened. References to the appendix of the original paper have been replaced with references to Sec. 3.3 and Sec. 4.5 which cover similar content.

their glass transition temperature, polymers are particularly sensitive to temperature variations (Ferry, 1980). Especially when subjected to cyclic loading, self-heating due to dissipation can critically affect the mechanical properties of materials and the lifetime of components, see, e.g., Rittel (2000), Mortazavian and Fatemi (2015) or Katunin (2019).

Thus, for the optimal use of materials, characterizing and predicting their thermomechanical behavior is of central importance. For composite materials, this proves to be a challenging task as their properties rest on their individual constituents and microstructure. In a small-strain framework, Chatzigeorgiou et al. (2016) used an asymptotic homogenization approach to derive the governing thermomechanical equations on the micro- and macroscale for generalized standard materials (Germain et al., 1983), taking into account both the microstructure and the thermomechanical material behavior. This generalized previous studies using asymptotic approaches, e.g., by Terada et al. (2010) for poro-thermoelasticity or Temizer (2012) for finite thermoelasticity. A recent review on the homogenization of dissipative materials was given by Charalambakis et al. (2018).

A particular result of the asymptotic homogenization is that the microscopic balance of linear momentum depends only on the *macroscopic* temperature and is independent of temperature fluctuations on the microscale (Chatzigeorgiou et al., 2016). In contrast to earlier works on the homogenization of thermomechanical material properties, e.g., by Willis (1981), the uniform temperature on the microscale is not an ad-hoc assumption but arises as a direct consequence of first-order homogenization. As a result, the thermomechanical problem on the microscale may be solved for a homogeneous temperature and is decoupled from microscopic heat conduction. Based on these results, Tikkarrouchine et al. (2019) homogenized unidirectional short-fiber structures with temperature-independent material parameters in the context of concurrent multiscale simulations, using the finite element (FE) software ABAQUS. Similar FE-based multiscale studies which still consider thermal conduction on the microscale were carried out by Özdemir et al. (2008) for elastoplasticity and Li et al. (2019) for single-crystal elasto-viscoplasticity.

Motivated by the aforementioned studies, we consider solvers based on the fast Fourier transform (FFT) for the computational homogenization of thermomechanically coupled materials on the microscale. In this context, FFT-based methods have been used to homogenize linear thermoelastic materials (Vinogradov and Milton, 2008; Anglin et al., 2014; Ambos et al., 2015) and linear thermo-magneto-electroelastic materials (Sixto-Camacho et al., 2013). Shantraj et al. (2019) proposed an FFT-based staggered algorithm for coupled multi-physics problems, taking thermal conduction on the microscale into account.

To exploit the power of FFT-based methods for computing the effective thermomechanical behavior of nonlinear dissipative materials, we rely upon the framework of asymptotic homogenization, as pioneered by Chatzigeorgiou et al. (2016). Due to the weak coupling of mechanics and thermal conduction, the cell problem on the microscale is governed only by the microscopic balance of linear momentum and the evolution of the macroscopic temperature, see Sec. 5.2. Based on these results, we propose a staggered solution algorithm, where strain-field and temperature are updated in an alternating fashion, see Sec. 5.3. The proposed solution scheme may be applied on top of any iterative strain-based solution method and can be easily integrated into existing FFT-based computational micromechanics codes. Owing to the homogeneity of the temperature on the microscale, the temperature update only involves solving a scalar equation and introduces little overhead. The usefulness of the approach is demonstrated in Sec. 5.4 for glass-fiber reinforced polypropylene composites with strong thermomechanical coupling.

# **5.2 First order homogenization of thermomechanical composites**

Chatzigeorgiou et al. (2016) introduced a framework for the asymptotic homogenization of thermomechanically coupled generalized standard materials in the quasi-static small-strain setting. As a result, they obtained governing equations for macro- and microscale. In the following, we review the equations relevant for solving the thermomechanical cell problem on the microscopic level.

Let $Y \subseteq \mathbb{R}^d$ be a rectangular cell, with microscopic point $x \in Y$ and $d \in \{1, 2, 3\}$ spatial dimensions. We denote by $\text{Sym}(d)$ the space of symmetric $d \times d$ matrices. For the following discussion, we consider the displacement fluctuation field $u : Y \times [0, T] \to \mathbb{R}^d$, the infinitesimal strain field $\varepsilon : Y \times [0, T] \to \text{Sym}(d)$, the stress field $\sigma : Y \times [0, T] \to \text{Sym}(d)$, the heat flux $q : Y \times [0, T] \to \mathbb{R}^d$, the entropy density $s : Y \times [0, T] \to \mathbb{R}$, the internal energy density $e : Y \times [0, T] \to \mathbb{R}$, internal variables $z : Y \times [0, T] \to Z$ with a sufficiently large vector space $Z$ and the macroscopic absolute temperature $\bar{\theta} \in \mathbb{R}\_{>0}$. For a heterogeneous Helmholtz free energy density

$$\psi: Y \times \text{Sym}(d) \times \mathbb{R}\_{>0} \times Z \to \mathbb{R}, \quad (x, \varepsilon, \bar{\theta}, z) \mapsto \psi(x, \varepsilon, \bar{\theta}, z), \tag{5.1}$$

which is related to the internal energy by

$$e = \psi + s\bar{\theta},\tag{5.2}$$

we express stress and entropy by the potential relations

$$\sigma = \frac{\partial \psi}{\partial \varepsilon}(\cdot, \varepsilon, \bar{\theta}, z), \quad \text{and} \quad s = -\frac{\partial \psi}{\partial \bar{\theta}}(\cdot, \varepsilon, \bar{\theta}, z), \tag{5.3}$$

under the assumption that $\psi$ is differentiable in all arguments except for the first (Coleman and Noll, 1963). As a result of the asymptotic homogenization of Chatzigeorgiou et al. (2016), only the macroscopic temperature enters $\psi$. Thus, the temperature in a microstructure cell corresponding to a macroscopic point can be interpreted as homogeneous. We assume that the Helmholtz free energy density can be additively decomposed

$$
\psi(\cdot,\varepsilon,\bar{\theta},z) = \psi\_{\text{heat}}(\cdot,\bar{\theta}) + \psi\_{\text{mech}}(\cdot,\varepsilon,\bar{\theta},z) \tag{5.4}
$$

into a component $\psi\_{\text{heat}}$ associated with heat storage and a component $\psi\_{\text{mech}}$ representing the storage of mechanical energy. This splitting does not reflect physics, but is computationally convenient, see Sec. 5.3. Many commonly used thermomechanical material models, such as viscoelasticity (Tikkarrouchine et al., 2019), elastoplasticity (Chatzigeorgiou et al., 2016) and viscoplasticity (Stainier and Ortiz, 2010), feature a free energy in the form of (5.4). The heat capacity density at constant strain

$$c\_{\varepsilon} = -\bar{\theta} \frac{\partial^2 \psi}{\partial \bar{\theta}^2},\tag{5.5}$$

is typically assumed to be independent of strain $\varepsilon$ and internal state $z$. Under this condition, the temperature dependence of the mechanical free energy $\psi\_{\text{mech}}$ may at most be linear. Consequently, we also partition the entropy

$$s = s\_{\text{heat}}(\cdot, \bar{\theta}) + s\_{\text{mech}}(\cdot, \varepsilon, \bar{\theta}, z) \tag{5.6}$$

with

$$s\_{\text{heat}} = -\frac{\partial \psi\_{\text{heat}}}{\partial \bar{\theta}} \quad \text{and} \quad s\_{\text{mech}} = -\frac{\partial \psi\_{\text{mech}}}{\partial \bar{\theta}}.\tag{5.7}$$
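The restriction to an at most linear temperature dependence of $\psi\_{\text{mech}}$ can be retraced in two lines. The following is a sketch under the assumptions that $\psi$ is twice differentiable in $\bar{\theta}$ and that the decomposition (5.4) collects all dependence on $\varepsilon$ and $z$ in $\psi\_{\text{mech}}$; the coefficient functions $a$ and $b$ are introduced here for illustration only. Inserting (5.4) into (5.5) gives

$$
c\_{\varepsilon} = -\bar{\theta}\,\frac{\partial^2 \psi\_{\text{heat}}}{\partial \bar{\theta}^2}(\cdot, \bar{\theta}) - \bar{\theta}\,\frac{\partial^2 \psi\_{\text{mech}}}{\partial \bar{\theta}^2}(\cdot, \varepsilon, \bar{\theta}, z).
$$

If $c\_{\varepsilon}$ is independent of $\varepsilon$ and $z$, the second term cannot depend on these arguments either and may be absorbed into $\psi\_{\text{heat}}$. Without loss of generality, this leaves

$$
\frac{\partial^2 \psi\_{\text{mech}}}{\partial \bar{\theta}^2} = 0 \quad \Longrightarrow \quad \psi\_{\text{mech}}(\cdot, \varepsilon, \bar{\theta}, z) = a(\cdot, \varepsilon, z) + \bar{\theta}\, b(\cdot, \varepsilon, z), \qquad s\_{\text{mech}} = -b(\cdot, \varepsilon, z).
$$

Under this condition, the heat capacity is determined by $\psi\_{\text{heat}}$ alone.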

For generalized standard materials, the evolution of internal variables is governed by Biot's equation

$$\frac{\partial \psi}{\partial z}(\cdot, \varepsilon, \bar{\theta}, z) + \frac{\partial \phi}{\partial \dot{z}}(\cdot, \bar{\theta}, \dot{z}) = 0 \tag{5.8}$$

involving a dissipation potential $\phi : Y \times \mathbb{R}\_{>0} \times Z \to \mathbb{R}\_{\geq 0}, \ (x, \bar{\theta}, \dot{z}) \mapsto \phi(x, \bar{\theta}, \dot{z})$. We assume that $\phi$ is convex in its third argument and $\phi(\cdot, \bar{\theta}, 0) = 0$ holds. For the stress and strain field, the microscopic static balance of linear momentum without volume-force densities

$$\operatorname{div}\sigma = 0,\tag{5.9}$$

and the kinematic compatibility condition

$$
\varepsilon = \overline{\varepsilon} + \nabla^s u \quad \text{with} \quad \overline{\varepsilon} = \langle \varepsilon \rangle\_Y \tag{5.10}
$$

hold, where $\langle \cdot \rangle\_Y = \frac{1}{|Y|} \int\_Y (\cdot) \, \mathrm{d}x$ denotes the volume average over $Y$ and $\nabla^s$ stands for the symmetrized gradient. The macroscopic temperature is determined by the macroscopic balance of internal energy

$$\dot{\overline{e}} = -\text{div}\_{\overline{x}}\,\langle q\rangle\_Y + \langle \sigma : \dot{\varepsilon}\rangle\_Y \quad \text{with} \quad \overline{e} = \langle e\rangle\_Y,\tag{5.11}$$

where we neglect additional source terms and $\text{div}\_{\overline{x}}$ denotes the divergence with respect to the position $\overline{x} \in \Omega$ in the macroscopic body $\Omega \subseteq \mathbb{R}^d$. It is common to reformulate the balance of internal energy as a heat equation in terms of the entropy

$$\bar{\theta}\,\dot{\overline{s}} = -\text{div}\_{\overline{x}}\,\langle q\rangle\_Y - \left\langle \frac{\partial \psi}{\partial z} \cdot \dot{z} \right\rangle\_Y \quad \text{with} \quad \overline{s} = \langle s\rangle\_Y \,\, , \tag{5.12}$$

or the temperature

$$c\_{\varepsilon}\dot{\bar{\theta}} = -\text{div}\_{\overline{x}}\left<q\right>\_{Y} + \bar{\theta}\left<\frac{\partial^{2}\psi}{\partial\varepsilon\,\partial\bar{\theta}} : \dot{\varepsilon}\right>\_{Y} + \bar{\theta}\left<\frac{\partial^{2}\psi}{\partial z\,\partial\bar{\theta}} \cdot \dot{z}\right>\_{Y} - \left<\frac{\partial\psi}{\partial z} \cdot \dot{z}\right>\_{Y}.\tag{5.13}$$

Note that, in the small-strain setting, the material time derivative $\dot{(\cdot)}$ reduces to the local time derivative $\partial (\cdot) / \partial t$. As we are only interested in solving the cell problem, i.e., we only consider a single macroscopic point, the term $-\text{div}\_{\overline{x}} \langle q \rangle\_Y$ cannot be further specified and acts as a volumetric heat supply term. Hence, we denote $\mathcal{S} = -\text{div}\_{\overline{x}} \langle q \rangle\_Y$ and treat $\bar{\varepsilon}$ and $\mathcal{S}$ as boundary conditions. For a treatment in a concurrent multiscale context, see, e.g., Chatzigeorgiou et al. (2016) or Tikkarrouchine et al. (2019).
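The step from the balance of internal energy to the entropy form (5.12) and the temperature form (5.13) can be retraced with the relations (5.2), (5.3) and (5.5). The following is a sketch for a single microscopic point, before averaging over $Y$; it assumes sufficient smoothness of $\psi$ and is meant as a reading aid rather than a rigorous derivation.

$$
\begin{aligned}
\dot{e} &= \dot{\psi} + \dot{s}\,\bar{\theta} + s\,\dot{\bar{\theta}}, \\
\dot{\psi} &= \frac{\partial \psi}{\partial \varepsilon} : \dot{\varepsilon} + \frac{\partial \psi}{\partial \bar{\theta}}\,\dot{\bar{\theta}} + \frac{\partial \psi}{\partial z} \cdot \dot{z} = \sigma : \dot{\varepsilon} - s\,\dot{\bar{\theta}} + \frac{\partial \psi}{\partial z} \cdot \dot{z}, \\
\Longrightarrow \quad \dot{e} &= \sigma : \dot{\varepsilon} + \bar{\theta}\,\dot{s} + \frac{\partial \psi}{\partial z} \cdot \dot{z}.
\end{aligned}
$$

After averaging, the stress power $\langle \sigma : \dot{\varepsilon} \rangle\_Y$ cancels against the corresponding term in the energy balance, which leaves the entropy form (5.12). Expanding the entropy rate via $s = -\partial \psi / \partial \bar{\theta}$,

$$
\bar{\theta}\,\dot{s} = c\_{\varepsilon}\,\dot{\bar{\theta}} - \bar{\theta}\,\frac{\partial^2 \psi}{\partial \varepsilon\,\partial \bar{\theta}} : \dot{\varepsilon} - \bar{\theta}\,\frac{\partial^2 \psi}{\partial z\,\partial \bar{\theta}} \cdot \dot{z},
$$

and moving the two coupling terms to the right-hand side gives the temperature form (5.13).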

## **5.3 Solution scheme for the fully-coupled thermomechanical cell problem**

Consider the Hilbert space $L^2(Y; \text{Sym}(d))$ of $Y$-periodic and square-integrable stress and strain fields with inner product

$$(S, T) \mapsto \langle S, T \rangle\_{L^2} = \langle S : T \rangle\_Y, \quad S, T \in L^2(Y; \text{Sym}(d)), \tag{5.14}$$

and the induced norm

$$\|S\|\_{L^2} = \sqrt{\langle S, S \rangle\_{L^2}}, \quad S \in L^2(Y; \text{Sym}(d)). \tag{5.15}$$

For a certain point in time, we want to find a strain field $\varepsilon$ and a macroscopic temperature $\bar{\theta}$ which solve equations (5.8)-(5.11) for prescribed $\bar{\varepsilon}$ and $\mathcal{S}$. For the convenience of the reader, we restrict to pure strain boundary conditions, see Kabel et al. (2016) for an extension to mixed boundary conditions. To solve our problem, we consider a fixed time step and apply an implicit Euler discretization in time to our system of equations. We define the operator $M : L^2(Y; \text{Sym}(d)) \times \mathbb{R}\_{>0} \to H^{-1}\_{\#}(Y; \mathbb{R}^d)$, where $H^{-1}\_{\#}(Y; \mathbb{R}^d)$ denotes the space of forces, and the function $H : L^2(Y; \text{Sym}(d)) \times \mathbb{R}\_{>0} \to \mathbb{R}$ by

$$M(\varepsilon,\bar{\theta}) = \operatorname{div} \frac{\partial \psi}{\partial \varepsilon}(\cdot,\varepsilon,\bar{\theta},z) \tag{5.16}$$

$$\begin{aligned} H(\varepsilon, \bar{\theta}) &= \bar{\theta} \left< s\_{\text{heat}}(\cdot, \bar{\theta}) + s\_{\text{mech}}(\cdot, \varepsilon, \bar{\theta}, z) \right>\_{Y} \\ &- \bar{\theta} \, \overline{s}^{n} - \Delta t \mathcal{S} + \left< \frac{\partial \psi}{\partial z}(\cdot, \varepsilon, \bar{\theta}, z) \cdot (z - z^{n}) \right>\_{Y} \end{aligned} \tag{5.17}$$

with the mean entropy $\overline{s}^n$ and internal variables $z^n$ at the last converged time step and the time increment $\Delta t$. When evaluating $M(\varepsilon, \bar{\theta})$ or $H(\varepsilon, \bar{\theta})$, the internal variables $z$ are computed by solving the discretized Biot's equation

$$\frac{\partial \psi}{\partial z}(\cdot, \varepsilon, \bar{\theta}, z) + \frac{\partial \phi}{\partial \dot{z}}(\cdot, \bar{\theta}, \frac{z - z^n}{\Delta t}) = 0,\tag{5.18}$$

for given strain field $\varepsilon$ and temperature $\bar{\theta}$. The thermomechanical cell problem is defined by the system of equations

$$M(\varepsilon,\bar{\theta}) = 0,\tag{5.19}$$

$$H(\varepsilon, \bar{\theta}) = 0,\tag{5.20}$$

where (5.19) describes the mechanical problem for the strain field $\varepsilon$ and (5.20) is the thermal problem, determining the evolution of the temperature $\bar{\theta}$.

There exist two general approaches for solving the thermomechanically coupled problem. In monolithic schemes, (5.19) and (5.20) are solved simultaneously, whereas staggered approaches treat the sub-problems (5.19) and (5.20) separately (Armero and Simo, 1992; Rothe et al., 2015). Monolithic approaches enjoy unconditional stability, but the resulting system is usually non-symmetric (Armero and Simo, 1992). Provided each sub-problem by itself is symmetric, staggered schemes circumvent this difficulty and thereby enable using more efficient solution algorithms (Simo and Miehe, 1992; Riedlbauer et al., 2014). Furthermore, they are convenient in terms of implementation, as existing solvers for the sub-problems may be used (Erbts and Düster, 2012; Martins et al., 2017; Shantraj et al., 2019). Hence, we focus on staggered algorithms in the following.

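Typically, staggered schemes are based on an isothermal split (Simo and Miehe, 1992; Armero and Simo, 1992), where the mechanical problem is solved for a fixed temperature and the thermal problem is solved for a fixed strain field. More precisely, for given iterates $\varepsilon\_k$ and $\bar{\theta}\_k$, where $\bar{\theta}\_0 = \bar{\theta}^n$ is set to the temperature in the last converged time step, the following steps are performed:

1. Solve the mechanical problem $M(\varepsilon\_{k+1}, \bar{\theta}\_k) = 0$ for the strain field $\varepsilon\_{k+1}$ at fixed temperature $\bar{\theta}\_k$.
2. Solve the thermal problem $H(\varepsilon\_{k+1}, \bar{\theta}\_{k+1}) = 0$ for the temperature $\bar{\theta}\_{k+1}$ at fixed strain field $\varepsilon\_{k+1}$.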

In this context, we distinguish between explicit and implicit staggered schemes. For explicit schemes, steps 1 and 2 are carried out only once, whereas for implicit schemes the steps are repeated until a prescribed convergence criterion is fulfilled.

Thus, explicit schemes are naturally faster. However, they suffer from lower accuracy (Vaz Jr. et al., 2011; Martins et al., 2017) and are prone to instabilities (Armero and Simo, 1992; 1993; Erbts and Düster, 2012) for problems with strong thermomechanical coupling. To address the latter difficulty, Armero and Simo (1992; 1993) proposed an unconditionally stable adiabatic split, where (5.19) is solved under the condition $\dot{s} = 0$. In the present work, we do not follow this approach (see below for a discussion) and consider an implicit staggered approach with an isothermal split. Implicit staggered schemes enjoy the same accuracy as monolithic algorithms (Rothe et al., 2015) and have been shown to be more stable than explicit schemes (Erbts and Düster, 2012). However, when repeating steps 1 and 2 until convergence, the sub-problems (5.19) and (5.20) have to be solved multiple times per time step, which is computationally expensive. Therefore, we propose two simplifications to enhance the overall efficiency of the scheme.

First, suppose we have an iterative strain-based fixed point scheme

$$
\varepsilon\_{k+1} = F(\varepsilon\_k, \bar{\theta}), \tag{5.21}
$$

which solves (5.19) for a fixed temperature $\bar{\theta}$. For better readability, we suppress the possible dependency of $F$ on additional algorithmic parameters and the boundary conditions $\bar{\varepsilon}$. Instead of solving (5.19) after each temperature update, we only perform a single iteration (5.21) of the mechanical solver.

The second simplification concerns the temperature update, i.e., solving (5.20). Evaluating $H(\varepsilon, \bar{\theta})$ is computationally expensive, as it involves solving (5.18) for all points in $Y$ to compute the mechanical entropy $s\_{\text{mech}}(\cdot, \varepsilon, \bar{\theta}, z)$ and the dissipation $\frac{\partial \psi}{\partial z}(\cdot, \varepsilon, \bar{\theta}, z) \cdot (z - z^n)$, see (5.17). To obtain an efficient algorithm, we wish to avoid this operation outside of (5.21), i.e., without improving our current guess for the strain field. Thus, we propose an additive split of $H(\varepsilon, \bar{\theta})$

$$H(\varepsilon, \bar{\theta}) = H\_{\text{impl}}(\bar{\theta}) + H\_{\text{expl}}(\varepsilon, \bar{\theta}) \tag{5.22}$$

into an implicit part $H\_{\text{impl}}(\bar{\theta})$

$$H\_{\rm impl}(\bar{\theta}) = \bar{\theta} \left< s\_{\rm heat}(\cdot, \bar{\theta}) \right>\_{Y} - \bar{\theta} \, \overline{s}^{n} - \Delta t \mathcal{S} \tag{5.23}$$

and an explicit part $H\_{\text{expl}}(\varepsilon, \bar{\theta})$

$$H\_{\rm expl}(\varepsilon,\bar{\theta}) = \bar{\theta} \left< s\_{\rm mech}(\cdot,\varepsilon,\bar{\theta},z) \right>\_{Y} + \left< \frac{\partial \psi}{\partial z}(\cdot,\varepsilon,\bar{\theta},z) \cdot (z - z^n) \right>\_{Y},\tag{5.24}$$

following our partition of the entropy. We emphasize that this splitting is not physical but computationally convenient. Instead of solving $H(\varepsilon\_{k+1}, \bar{\theta}) = 0$ for updating the temperature, we solve $H\_{\text{impl}}(\bar{\theta}) + H\_{\text{expl}}(\varepsilon\_k, \bar{\theta}\_k) = 0$. More precisely, we compute the effective mechanical entropy

$$
\overline{s}\_{\text{mech},k} = \left< s\_{\text{mech}}(\cdot, \varepsilon\_k, \bar{\theta}\_k, z\_k) \right>\_{Y},\tag{5.25}
$$

as well as the mean dissipation

$$\overline{\mathcal{D}}\_k = \left\langle \frac{\partial \psi}{\partial z}(\cdot, \varepsilon\_k, \bar{\theta}\_k, z\_k) \cdot (z\_k - z^n) \right\rangle\_Y \tag{5.26}$$

as part of our mechanical iteration (5.21), see Miehe (1995). Subsequently, we solve

$$\begin{aligned} H\_{\text{split}}(\bar{\theta}) &= 0 \\ \text{with} \quad H\_{\text{split}}(\bar{\theta}) &= \bar{\theta} \left\langle s\_{\text{heat}}(\cdot, \bar{\theta}) \right\rangle\_Y + \bar{\theta} \left( \overline{s}\_{\text{mech},k} - \overline{s}^n \right) - \Delta t \mathcal{S} + \overline{\mathcal{D}}\_k. \end{aligned} \tag{5.27}$$

This is significantly more efficient than solving $H(\varepsilon\_{k+1}, \bar{\theta}) = 0$, as it only involves the effective entropy related to heat storage, which is efficiently computed by

$$\left< s\_{\text{heat}}(\cdot,\bar{\theta}) \right>\_{Y} = \sum\_{j=1}^{N} c\_{j} s\_{\text{heat},j}(\bar{\theta}) \tag{5.28}$$

for an $N$-phase composite material with volume fractions $c\_j$ and phase-specific entropies $s\_{\text{heat},j}$.

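To summarize, our modified implicit algorithm involves the following steps, which are repeated until the convergence criterion of the mechanical solver is met:

1. Perform a single iteration (5.21) of the mechanical solver at fixed temperature and evaluate the effective mechanical entropy (5.25) and the mean dissipation (5.26) alongside.
2. Update the temperature by solving (5.27).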

The proposed algorithm is compatible with any mechanical solver in the form of (5.21), including classical FE-based methods.
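To illustrate the interplay of the two steps, the following sketch applies the staggered idea to a deliberately simple toy problem: a one-dimensional two-phase laminate loaded in series. Everything specific in it is an assumption made for illustration. The thermoelastic free energy $\psi\_{\text{mech}} = \frac{1}{2} E \varepsilon^2 - E \alpha (\bar{\theta} - \bar{\theta}^n) \varepsilon$, the heat entropy $c \ln(\bar{\theta} / \bar{\theta}^n)$ with constant heat capacity $c$, the material parameters, the one-dimensional basic scheme used as mechanical solver and the simplified equilibrium residual are not taken from the text. Without heat supply and dissipation, the temperature update has a closed-form solution in this special case, so no Newton iteration is needed.

```python
import numpy as np

def staggered_thermoelastic_laminate(eps_bar, frac, E, alpha, cap, theta_n,
                                     tol=1e-10, maxit=500):
    """One adiabatic time step for a 1D laminate in series (toy model)."""
    frac, E, alpha, cap = (np.asarray(a, dtype=float)
                           for a in (frac, E, alpha, cap))

    def mean(field):
        return float(np.sum(frac * field))           # volume average <.>_Y

    def stress(eps, theta):
        return E * (eps - alpha * (theta - theta_n))  # d(psi_mech)/d(eps)

    C0 = 0.5 * (E.min() + E.max())    # reference stiffness of the basic scheme
    eps = np.full_like(E, eps_bar)    # initial guess: uniform strain
    theta = theta_n
    s_bar_n = 0.0                     # mean entropy of the undeformed state
    for k in range(1, maxit + 1):
        sigma = stress(eps, theta)
        # simplified stand-in for the equilibrium criterion (5.29)
        residual = np.sqrt(mean((sigma - mean(sigma)) ** 2)) / abs(mean(sigma))
        if residual <= tol:
            break
        # step 1: a single mechanical iteration at frozen temperature
        eps = eps - (sigma - mean(sigma)) / C0
        s_mech_bar = mean(E * alpha * eps)            # mean mechanical entropy
        # step 2: temperature update, closed form for this toy model
        theta = theta_n * np.exp(-(s_mech_bar - s_bar_n) / mean(cap))
    return eps, float(theta), k

# Hypothetical phase data: stiff and compliant layer (MPa, 1/K, MPa/K, K).
frac = [0.25, 0.75]
E = [72000.0, 1500.0]
alpha = [5.0e-6, 1.0e-4]
cap = [2.0, 1.7]
eps, theta, iterations = staggered_thermoelastic_laminate(
    0.01, frac, E, alpha, cap, theta_n=293.15)
```

In this toy setting, both phases carry the same stress at convergence and the temperature drops slightly under tension, which is the expected thermoelastic cooling for positive expansion coefficients. The mechanical iteration could be replaced by any other scheme of the form (5.21) without changing the temperature update.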

For our concrete implementation, we rely on FFT-based solution schemes, due to their computational efficiency (Eisenlohr et al., 2013; Lucarini and Segurado, 2019; Rovinelli et al., 2020). In particular, we consider Moulinec-Suquet's basic scheme (Moulinec and Suquet, 1998), the Barzilai-Borwein method (Schneider, 2019a) and the inexact Newton-CG method (Kabel et al., 2014), see Sec. 3.3 and Sec. 4.5. Typically, the iteration scheme (5.21) involves applying the operator $\Gamma = \nabla^s (\text{div}\, \nabla^s)^{-1} \text{div}$ in Fourier space and evaluating the material law $\sigma = \frac{\partial \psi}{\partial \varepsilon}(\cdot, \varepsilon, \bar{\theta}, z)$. As convergence criterion for the static equilibrium (5.19), we use

$$\frac{\|\Gamma : \sigma\|\_{L^2}}{\|\left<\sigma\right>\_Y\|\_{L^2}} \le \delta\_{\text{mech}},\tag{5.29}$$

see Sec. 5 of Schneider et al. (2019) for further details. The mean mechanical entropy $\overline{s}\_{\text{mech},k}$ and the mean dissipation $\overline{\mathcal{D}}\_k$ are computed when evaluating the material law. For the temperature update we use Newton's method, i.e., we iterate

$$\bar{\theta}^{i+1} = \bar{\theta}^i - \frac{H\_{\text{split}}(\bar{\theta}^i)}{H\_{\text{split}}'(\bar{\theta}^i)} \quad \text{with} \quad \bar{\theta}^0 = \bar{\theta}\_k,\tag{5.30}$$

and

$$H'\_{\text{split}}(\bar{\theta}) = \bar{\theta} \left\langle s'\_{\text{heat}}(\cdot, \bar{\theta}) \right\rangle\_Y + \left\langle s\_{\text{heat}}(\cdot, \bar{\theta}) \right\rangle\_Y + \overline{s}\_{\text{mech},k} - \overline{s}^n,\tag{5.31}$$

until the criterion

$$\left| \frac{H\_{\text{split}}(\bar{\theta})}{\bar{\theta} \left< s\_{\text{heat}}(\cdot, \bar{\theta}) \right>\_{Y} + \bar{\theta} \, \overline{s}\_{\text{mech},k}} \right| < \delta\_{\text{heat}} \tag{5.32}$$

is met. We then set $\bar{\theta}\_{k+1}$ to the converged Newton iterate. If the convergence criterion (5.29) is met, we proceed to the next time step. Otherwise, we repeat updates (5.21) and (5.27). The algorithm is summarized in Alg. 5.
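The temperature update (5.27) with the Newton iteration (5.30)-(5.32) can be sketched in a few lines. This is a minimal illustration, not the implementation used for the results in this chapter: the function name, the argument names and the numerical values are hypothetical, and every phase is assumed to have a constant heat capacity, so that its heat-storage entropy is $c\_{0,j} \ln(\theta/\theta\_{\text{ref}})$.

```python
import math


def newton_temperature_update(theta_k, s_mech_k, s_n, dissipation_k, heat_supply,
                              dt, phases, theta_ref=293.15,
                              tol_heat=1e-4, maxit_heat=50):
    """Solve H_split(theta) = 0, cf. (5.27), with Newton's method (5.30).

    phases -- list of (volume fraction c_j, heat capacity c0_j) pairs; each
              phase is ASSUMED to obey s_heat,j = c0_j * ln(theta / theta_ref).
    """
    theta = theta_k
    for _ in range(maxit_heat):
        log_ratio = math.log(theta / theta_ref)
        s_heat = sum(c * c0 * log_ratio for c, c0 in phases)   # mean entropy (5.28)
        ds_heat = sum(c * c0 / theta for c, c0 in phases)      # its derivative
        h = (theta * s_heat + theta * (s_mech_k - s_n)
             - dt * heat_supply + dissipation_k)               # residual (5.27)
        dh = theta * ds_heat + s_heat + s_mech_k - s_n         # derivative (5.31)
        # Relative criterion (5.32); the denominator is assumed to be nonzero.
        if abs(h / (theta * s_heat + theta * s_mech_k)) < tol_heat:
            break
        theta -= h / dh                                        # Newton step (5.30)
    return theta


# Hypothetical two-phase data (volumetric heat capacities in J/(m^3 K)).
phases = [(0.3, 2.0e6), (0.7, 1.6e6)]
c_eff = sum(c * c0 for c, c0 in phases)
theta_n = 300.0
s_n = c_eff * math.log(theta_n / 293.15)   # entropy of the last converged step

# Adiabatic, dissipation-free compression: the mechanical entropy drops.
theta_new = newton_temperature_update(theta_n, -6.0e3, s_n, 0.0, 0.0, 1.0e-2, phases)
print(theta_new)
```

Because only the $N$ phase-wise entropies enter, each Newton step costs a handful of scalar operations, independent of the number of voxels.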

Several remarks are in order:

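**Algorithm 5** Implicit staggered solution scheme (, $\text{maxit}\_{\text{mech}}$, $\delta\_{\text{mech}}$, , $\text{maxit}\_{\text{heat}}$, $\delta\_{\text{heat}}$)

1: Set initial values for $\bar{\theta}$ and $\varepsilon$
2: $k \leftarrow 0$
3: $\text{res}\_{\text{mech}} \leftarrow 1$
4: **while** $k < \text{maxit}\_{\text{mech}}$ **and** $\text{res}\_{\text{mech}} > \delta\_{\text{mech}}$ **do**
5: $k \leftarrow k + 1$
6: Isothermal step (5.21): update $\varepsilon$ by one iteration of the mechanical solver at fixed $\bar{\theta}$ and evaluate $\overline{s}\_{\text{mech}} \leftarrow \left\langle s\_{\text{mech}}(\cdot, \varepsilon, \bar{\theta}, z) \right\rangle\_Y$, $\overline{\mathcal{D}} \leftarrow \left\langle \frac{\partial \psi}{\partial z}(\cdot, \varepsilon, \bar{\theta}, z) \cdot (z - z^n) \right\rangle\_Y$ and $\text{res}\_{\text{mech}} \leftarrow \|\Gamma : \sigma\|\_{L^2} / \|\left\langle \sigma \right\rangle\_Y\|\_{L^2}$
7: $\text{res}\_{\text{heat}} \leftarrow 1$
8: $i \leftarrow 0$
9: **while** $i < \text{maxit}\_{\text{heat}}$ **and** $\text{res}\_{\text{heat}} > \delta\_{\text{heat}}$ **do** *◁* Temp. update (5.27)
10: $i \leftarrow i + 1$
11: $H \leftarrow \bar{\theta} \left\langle s\_{\text{heat}}(\cdot, \bar{\theta}) \right\rangle\_Y + \bar{\theta} \left( \overline{s}\_{\text{mech}} - \overline{s}^n \right) - \Delta t \mathcal{S} + \overline{\mathcal{D}}$
12: $H' \leftarrow \bar{\theta} \left\langle s'\_{\text{heat}}(\cdot, \bar{\theta}) \right\rangle\_Y + \left\langle s\_{\text{heat}}(\cdot, \bar{\theta}) \right\rangle\_Y + \overline{s}\_{\text{mech}} - \overline{s}^n$
13: $\text{res}\_{\text{heat}} \leftarrow \left| H / \left( \bar{\theta} \left\langle s\_{\text{heat}}(\cdot, \bar{\theta}) \right\rangle\_Y + \bar{\theta} \, \overline{s}\_{\text{mech}} \right) \right|$
14: $\bar{\theta} \leftarrow \bar{\theta} - H / H'$
15: **end while**
16: **end while**
17: **return** $\bar{\theta}$, $\varepsilon$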

1. For the temperature update, we use the entropy-based heat equation (5.12) instead of the more common temperature-based formulation (5.13). Consider the change in $s\_{\text{mech}}$ under the assumption that the heat capacity $c\_{\varepsilon}$, as defined in (5.5), depends only on the temperature. Using the implicit Euler time discretization on $\bar{\theta}\,\dot{s}\_{\text{heat}}$ in (5.12) yields

$$\bar{\theta} \frac{s\_{\text{heat}}(\bar{\theta}) - s\_{\text{heat}}^n}{\Delta t}. \tag{5.33}$$

If, alternatively, we discretize the corresponding term $c\_{\varepsilon}(\bar{\theta})\,\dot{\bar{\theta}}$ in (5.13), we obtain

$$
\bar{\theta} \frac{\partial s\_{\text{heat}}}{\partial \bar{\theta}}(\bar{\theta}) \frac{\bar{\theta} - \bar{\theta}^n}{\Delta t}. \tag{5.34}
$$

Evidently, (5.34) amounts to a linearization of the entropy change in (5.33). Hence, to obtain higher precision for large time increments, we prefer using (5.12).


the mechanical problem is solved under the condition $\dot{s} = 0$. For the present algorithm, we rely on the isothermal split as it is more convenient from the viewpoint of implementation. Suppose we already have an existing code for a purely mechanics-based solution scheme. For the isothermal split, only an update of the temperature-dependent material parameters and the computation of $s\_{\text{mech}}$ and $\mathcal{D}$ have to be added to the already implemented material law. For the adiabatic split, on the other hand, simple reduced forms of the material law, which identically fulfill $\dot{s} = 0$, can only be derived in special cases such as linear thermoelasticity (Armero and Simo, 1992). For more complex material laws with arbitrary temperature dependencies, the implementation of an additional adiabatic formulation may be cumbersome or even require an iterative local solution scheme. Thus, for tackling the issue of instability, we prefer using an implicit staggered approach based on an isothermal split (Erbts and Düster, 2012). Indeed, we encountered no numerical instabilities in our numerical experiments in Sec. 5.4, even for a composite with strong thermomechanical coupling.
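The linearization noted in the first remark can be made explicit by a Taylor expansion about $\bar{\theta}$. The following sketch assumes that $s\_{\text{heat}}$ is twice continuously differentiable and that $s\_{\text{heat}}^n = s\_{\text{heat}}(\bar{\theta}^n)$:

$$\bar{\theta} \, \frac{s\_{\text{heat}}(\bar{\theta}) - s\_{\text{heat}}^n}{\Delta t} = \bar{\theta} \, \frac{\partial s\_{\text{heat}}}{\partial \bar{\theta}}(\bar{\theta}) \, \frac{\bar{\theta} - \bar{\theta}^n}{\Delta t} - \frac{\bar{\theta}}{2} \, \frac{\partial^2 s\_{\text{heat}}}{\partial \bar{\theta}^2}(\bar{\theta}) \, \frac{(\bar{\theta} - \bar{\theta}^n)^2}{\Delta t} + O\!\left( \frac{|\bar{\theta} - \bar{\theta}^n|^3}{\Delta t} \right).$$

The first term on the right-hand side is exactly (5.34). As an illustration, for a constant heat capacity $c\_0$, i.e., $s\_{\text{heat}}(\theta) = c\_0 \ln(\theta/\theta\_{\text{ref}})$, the second term equals $c\_0 (\bar{\theta} - \bar{\theta}^n)^2 / (2 \bar{\theta} \Delta t)$. Relative to (5.34), it has the size $(\bar{\theta} - \bar{\theta}^n)/(2\bar{\theta})$, which is roughly 1.6% for a temperature increment of 10 K at $\bar{\theta} = 303.15$ K.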

## **5.4 Numerical demonstrations**

### **5.4.1 Setup**

Alg. 5 for thermomechanically coupled problems was implemented in an in-house FFT-based computational homogenization code written in Python 3.7 with FFTW (Frigo and Johnson, 2005) bindings. Applying Γ and evaluating the material law were integrated as Cython extensions (Behnel et al., 2011) and parallelized using OpenMP. Throughout, we rely on the discretization by trigonometric polynomials introduced by Moulinec and Suquet (1998). As convergence criterion for the iterative FFT-based solver, we use (5.29)

$$\frac{\|\Gamma : \sigma\|\_{L^2}}{\|\left<\sigma\right>\_Y\|\_{L^2}} \le \delta\_{\text{mech}}\tag{5.35}$$

with a prescribed tolerance of $\delta\_{\text{mech}} = 10^{-5}$. The tolerance for the convergence criterion (5.32) of the temperature update

$$\left| \frac{H\_{\text{split}}(\bar{\theta})}{\bar{\theta} \left< s\_{\text{heat}}(\cdot, \bar{\theta}) \right>\_{Y} + \bar{\theta} \, \overline{s}\_{\text{mech},k}} \right| < \delta\_{\text{heat}} \tag{5.36}$$

is set to $\delta\_{\text{heat}} = 10^{-4}$. For the computations on the 2-dimensional microstructure in Sec. 5.4.2, a desktop computer with 32 GB RAM and a 6-core Intel i7-8700K CPU was used. The computations on the 3-dimensional microstructure in Sec. 5.4.3 were performed on a workstation with 512 GB RAM and two 12-core Intel Xeon Gold 6146 CPUs.

### **5.4.2 Continuous glass-fiber reinforced polypropylene**

**Figure 5.1:** Continuous glass-fiber reinforced polypropylene: Microstructure and schematic of the generalized Maxwell model for polypropylene

In the following example, we consider a composite consisting of a polypropylene matrix unidirectionally reinforced by continuous glass fibers with a volume fraction of 30%. The microstructure, see Fig. 5.1, is modeled as a two-dimensional periodic cell with a resolution of $512^2$, containing 200 fibers. It was generated using the adaptive shrinking cell algorithm of Torquato and Jiao (2010).

The glass fibers are modeled as an isotropic linear thermoelastic material. The free energy related to heat storage reads

$$\psi\_{\text{heat}}(\theta) = c\_0 \left[ (\theta - \theta\_{\text{ref}}) - \theta \ln \left( \frac{\theta}{\theta\_{\text{ref}}} \right) \right],\tag{5.37}$$

and corresponds to a material with a constant heat capacity $c\_{\varepsilon}(\theta) = c\_0$. Typically, for solids, states of constant strain are hard to realize under fluctuating temperatures. Hence, the heat capacity at constant strain is usually not measured experimentally. However, its value is typically close to the heat capacity at constant stress. The mechanical part of the free energy is given by

$$\psi\_{\rm mech}(\varepsilon,\theta) = \frac{1}{2}\varepsilon : \mathbb{C} : \varepsilon - \varepsilon : \mathbb{C} : (\alpha(\theta - \theta\_{\rm ref})),\tag{5.38}$$

implying the stress-strain relation

$$\sigma = \mathbb{C} : (\varepsilon - \alpha(\theta - \theta\_{\text{ref}})) \tag{5.39}$$

with a stiffness tensor $\mathbb{C}$ and a thermal expansion tensor $\alpha \in \text{Sym}(d)$. The associated entropies read

$$s\_{\text{heat}}(\theta) = c\_0 \ln \left(\frac{\theta}{\theta\_{\text{ref}}}\right) \quad \text{and} \quad s\_{\text{mech}}(\varepsilon) = \varepsilon : \mathbb{C} : \alpha. \tag{5.40}$$

As the material is elastic, no energy is dissipated, i.e., $\mathcal{D} = 0$, and the thermomechanical coupling is governed solely by $s\_{\text{mech}}$. Changes in $s\_{\text{mech}}$ cause self-heating under hydrostatic compression and self-cooling under hydrostatic extension. This phenomenon is commonly referred to as thermoelastic coupling effect, see Sec. 13.2 in Haupt (2002), or Gough-Joule effect, see Sec. 96 in Truesdell and Noll (2004). For the glass fibers, we assume that both stiffness tensor and thermal expansion are isotropic, i.e.,

$$\mathbb{C} = 3K\mathbb{P}\_1 + 2G\mathbb{P}\_2 \quad \text{and} \quad \alpha = \alpha\_0 \,\mathrm{I},\tag{5.41}$$

with bulk modulus $K$, shear modulus $G$ and isotropic coefficient of thermal expansion $\alpha\_0$. By $\mathbb{P}\_1$ and $\mathbb{P}\_2$ we denote the projectors onto the spherical and deviatoric $d \times d$ matrices, respectively. The parameters of the model are taken from Tikkarrouchine et al. (2019) and listed in Tab. 5.1.
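As a rough illustration of the thermoelastic coupling effect, consider a single homogeneous phase obeying (5.37)-(5.41) that is deformed adiabatically from the reference state $\varepsilon = 0$, $\theta = \theta\_{\text{ref}}$. The sketch assumes that, without heat supply and dissipation, the total entropy $s\_{\text{heat}} + s\_{\text{mech}}$ remains constant, and it uses $\mathbb{C} : \alpha = 3K\alpha\_0 \,\mathrm{I}$, which follows from (5.41):

$$c\_0 \ln\left(\frac{\theta}{\theta\_{\text{ref}}}\right) + 3K\alpha\_0 \,\text{tr}(\varepsilon) = 0 \quad \Longrightarrow \quad \theta = \theta\_{\text{ref}} \exp\left(-\frac{3K\alpha\_0}{c\_0} \,\text{tr}(\varepsilon)\right) \approx \theta\_{\text{ref}} \left(1 - \frac{3K\alpha\_0}{c\_0} \,\text{tr}(\varepsilon)\right).$$

Hence, a volume decrease, $\text{tr}(\varepsilon) < 0$, raises the temperature, whereas a volume increase lowers it. A purely deviatoric strain leaves the temperature unchanged in this model.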

**Table 5.1:** Material parameters of the glass fibers (Tikkarrouchine et al., 2019)


For the polypropylene matrix, we assume a linear thermoviscoelastic model based on a generalized Maxwell model, see Fig. 5.1 and Sec. 3.5.1 in Tschoegl (1989). For models accounting for effects outside of the viscoelastic domain, we refer, e.g., to Krairi et al. (2019) and Benaarbia et al. (2019) for extensions to viscoplasticity and damage, and Tscharnuter et al. (2012) for a study on polypropylene. Based on the caloric data in Table 18.10 in the Springer Handbook of Materials Data (Warlimont and Martienssen, 2018), we assume a heat-storage related free energy of the form

$$\psi\_{\rm heat}(\theta) = c\_0 \left[ (1 - k \theta\_{\rm ref}) \left( (\theta - \theta\_{\rm ref}) - \theta \ln \left( \frac{\theta}{\theta\_{\rm ref}} \right) \right) - \frac{k}{2} (\theta - \theta\_{\rm ref})^2 \right], \tag{5.42}$$

corresponding to a linear heat capacity

$$c\_{\varepsilon}(\theta) = c\_0[1 + k(\theta - \theta\_{\text{ref}})].\tag{5.43}$$
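The step from (5.42) to (5.43) can be checked directly. The sketch below assumes that the heat capacity of (5.5) takes the standard form $c\_{\varepsilon} = \theta \, \partial s\_{\text{heat}} / \partial \theta$ with $s\_{\text{heat}} = -\partial \psi\_{\text{heat}} / \partial \theta$:

$$\begin{aligned} s\_{\text{heat}}(\theta) &= -\frac{\partial \psi\_{\text{heat}}}{\partial \theta}(\theta) = c\_0 \left[ (1 - k\theta\_{\text{ref}}) \ln\left(\frac{\theta}{\theta\_{\text{ref}}}\right) + k(\theta - \theta\_{\text{ref}}) \right], \\ c\_{\varepsilon}(\theta) &= \theta \, \frac{\partial s\_{\text{heat}}}{\partial \theta}(\theta) = c\_0 \left[ (1 - k\theta\_{\text{ref}}) + k\theta \right] = c\_0 \left[ 1 + k(\theta - \theta\_{\text{ref}}) \right]. \end{aligned}$$

For $k = 0$, the free energy (5.42) reduces to the constant-heat-capacity form (5.37) used for the glass fibers.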

The energy stored in the generalized Maxwell model with $N\_{\text{MW}}$ Maxwell elements reads

$$\begin{split} \psi\_{\text{mech}}(\varepsilon, \theta, \varepsilon\_{\text{V}\alpha}) &= \frac{1}{2} \varepsilon : \mathbb{C}\_{0} : \varepsilon + \sum\_{\alpha=1}^{N\_{\text{MW}}} \frac{1}{2} (\varepsilon - \varepsilon\_{\text{V}\alpha}) : \mathbb{C}\_{\alpha} : (\varepsilon - \varepsilon\_{\text{V}\alpha}) \\ &- \varepsilon : \mathbb{C}\_{0} : (\alpha(\theta - \theta\_{\text{ref}})) - \sum\_{\alpha=1}^{N\_{\text{MW}}} (\varepsilon - \varepsilon\_{\text{V}\alpha}) : \mathbb{C}\_{\alpha} : (\alpha(\theta - \theta\_{\text{ref}})). \end{split} \tag{5.44}$$

Consequently, the stress computes as

$$\sigma = \mathbb{C}\_0 : \left( \varepsilon - \alpha(\theta - \theta\_{\rm ref}) \right) + \sum\_{\alpha=1}^{N\_{\rm MW}} \mathbb{C}\_{\alpha} : \left( \varepsilon - \varepsilon\_{\rm V\alpha} - \alpha(\theta - \theta\_{\rm ref}) \right). \tag{5.45}$$

We assume that the viscosity tensor associated to a dashpot of the generalized Maxwell model has the form

$$\mathbb{V}\_{\alpha} = a(\theta)\tau\_{\alpha}\mathbb{C}\_{\alpha},\tag{5.46}$$

where $a : \mathbb{R}\_{>0} \to \mathbb{R}$ denotes a temperature-dependent shift function. The corresponding fluidity $\mathbb{F}\_{\alpha}$ is defined by the pseudoinverse

$$\mathbb{F}\_{\alpha} = \left(\mathbb{V}\_{\alpha}\right)^{\dagger} = \frac{1}{a(\theta)\tau\_{\alpha}} (\mathbb{C}\_{\alpha})^{\dagger}. \tag{5.47}$$


In terms of the partial stresses

$$\sigma\_{\rm V\alpha} = \mathbb{C}\_{\alpha} : (\varepsilon - \varepsilon\_{\rm V\alpha} - \alpha(\theta - \theta\_{\rm ref})),\tag{5.48}$$

the evolution equation for the viscous strains reads

$$
\dot{\varepsilon}\_{\rm V\alpha} = \mathbb{F}\_{\alpha} : \sigma\_{\rm V\alpha}. \tag{5.49}
$$

For simplicity, we assume that polypropylene is isotropic and linear elastic in dilation, see Sec. 9.4 in Brinson and Brinson (2015). More precisely, the stiffness tensors and the thermal expansion have the form

$$\mathbb{C}\_0 = 3K\_0 \mathbb{P}\_1 + 2G\_0 \mathbb{P}\_2, \quad \mathbb{C}\_\alpha = 2G\_\alpha \mathbb{P}\_2 \quad \text{and} \quad \alpha = \alpha\_0 \,\mathrm{I}. \tag{5.50}$$

In this particular case, the viscous strains $\varepsilon\_{\text{V}\alpha}$ are purely deviatoric and independent of thermal expansion. The shift function $a$ describes the time-temperature dependency of the material. At room temperature $\theta\_{\text{ref}} = 293.15$ K, polypropylene is above its glass transition temperature $\theta\_{\text{glass}} \approx 273.15$ K. Hence, we use the Williams-Landel-Ferry (WLF) equation (Williams et al., 1955) as ansatz for the shift function

$$\log\_{10} a(\theta) = -\frac{C\_1(\theta - \theta\_{\text{ref}})}{C\_2 + \theta - \theta\_{\text{ref}}}.\tag{5.51}$$

For the present study, we restrict ourselves to linear viscoelastic behavior and focus on the effects induced by the thermomechanical coupling. In particular, we omit a possible pressure dependence of the shift factor as suggested by Fillers and Tschoegl (1977) based on free-volume considerations. For our implementation, we use the time-integration scheme of Taylor et al. (1970), which is based on the partial stresses $\sigma\_{\text{V}\alpha}$ instead of the viscous strains $\varepsilon\_{\text{V}\alpha}$. The update reads

$$\sigma\_{\rm V\alpha} = \exp\left(-\frac{\Delta\xi}{\tau\_{\alpha}}\right)\sigma\_{\rm V\alpha}^{n} + \frac{\left(1 - \exp\left(-\frac{\Delta\xi}{\tau\_{\alpha}}\right)\right)}{\frac{\Delta\xi}{\tau\_{\alpha}}}\mathbb{C}\_{\alpha} : \left(\varepsilon - \varepsilon^{n} - \alpha(\theta - \theta^{n})\right) \tag{5.52}$$

where $(\cdot)^n$ denotes the value of the last converged time step and $\xi$ is a reduced time defined via

$$
\xi = \int\_0^t \frac{1}{a(\theta(\tau))} d\tau. \tag{5.53}
$$

We compute the change in reduced time $\Delta\xi = \xi - \xi^n$ by a 5-point Gauss quadrature, assuming a constant temperature rate. The entropies and dissipation in terms of the partial stresses $\sigma\_{\text{V}\alpha}$ read

$$s\_{\rm heat}(\theta) = c\_0 \left[ (1 - k \theta\_{\rm ref}) \ln \left( \frac{\theta}{\theta\_{\rm ref}} \right) + k(\theta - \theta\_{\rm ref}) \right],\tag{5.54}$$

$$s\_{\text{mech}}(\varepsilon, \theta, \sigma\_{\text{V}\alpha}) = \varepsilon : \mathbb{C}\_0 : \alpha + \sum\_{\alpha=1}^{N\_{\text{MW}}} \alpha : \left[\sigma\_{\text{V}\alpha} + \mathbb{C}\_{\alpha} : (\alpha(\theta - \theta\_{\text{ref}})) \right], \tag{5.55}$$

$$\mathcal{D} = \sum\_{\alpha=1}^{N\_{\rm MW}} \sigma\_{\rm V\alpha} : \mathbb{F}\_{\alpha} : \sigma\_{\rm V\alpha}. \tag{5.56}$$
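The evaluation of the reduced-time increment and the partial-stress update (5.52) can be sketched for a single Maxwell element and one deviatoric stress component. This is a minimal illustration under stated assumptions: the function names are hypothetical, the WLF constants `c1` and `c2` are placeholder values rather than the parameters of Tab. 5.2, and the thermal-expansion term of (5.52) is dropped because $\mathbb{C}\_{\alpha} : \alpha = 0$ holds for the choice (5.50).

```python
import math

# 5-point Gauss-Legendre rule on [-1, 1].
_SQ = math.sqrt(10.0 / 7.0)
_INNER = math.sqrt(5.0 - 2.0 * _SQ) / 3.0
_OUTER = math.sqrt(5.0 + 2.0 * _SQ) / 3.0
_W_INNER = (322.0 + 13.0 * math.sqrt(70.0)) / 900.0
_W_OUTER = (322.0 - 13.0 * math.sqrt(70.0)) / 900.0
GAUSS_NODES = (-_OUTER, -_INNER, 0.0, _INNER, _OUTER)
GAUSS_WEIGHTS = (_W_OUTER, _W_INNER, 128.0 / 225.0, _W_INNER, _W_OUTER)


def wlf_shift(theta, theta_ref=293.15, c1=17.44, c2=51.6):
    """Shift factor a(theta) from the WLF equation (5.51); c1, c2 are placeholders."""
    d_theta = theta - theta_ref
    return 10.0 ** (-c1 * d_theta / (c2 + d_theta))


def reduced_time_increment(theta_n, theta, dt):
    """Integrate 1/a over the time step, cf. (5.53), for a linear temperature ramp."""
    total = 0.0
    for node, weight in zip(GAUSS_NODES, GAUSS_WEIGHTS):
        s = 0.5 * (node + 1.0)                      # map [-1, 1] to [0, 1]
        total += weight / wlf_shift(theta_n + s * (theta - theta_n))
    return 0.5 * dt * total


def update_partial_stress(sigma_n, d_eps, d_xi, tau, modulus):
    """Scalar form of (5.52); modulus plays the role of 2 G_alpha."""
    x = d_xi / tau
    decay = math.exp(-x)
    ratio = -math.expm1(-x) / x if x > 0.0 else 1.0  # (1 - exp(-x)) / x
    return decay * sigma_n + ratio * modulus * d_eps


# At the reference temperature the shift factor is one, so reduced time equals time.
d_xi_ref = reduced_time_increment(293.15, 293.15, 0.05)
# Self-cooling by 1 K slows down relaxation: the reduced-time increment shrinks.
d_xi_cold = reduced_time_increment(293.15, 292.15, 0.05)
print(d_xi_ref, d_xi_cold)
```

Working with the factor $(1 - e^{-x})/x$ keeps the update well-behaved for both very small and very large ratios $\Delta\xi/\tau\_{\alpha}$, which matters when the time constants span many decades.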

The material parameters used are listed in Tab. 5.2. The caloric parameters were chosen based on Tables 18.9 and 18.10 in the Springer Handbook of Materials Data (Warlimont and Martienssen, 2018) and the viscoelastic parameters are taken from the experimental study by Kehrer et al. (2018). Note that Kehrer et al. (2018) characterized the behavior of polypropylene over a wide range of frequencies and temperatures, using 27 Maxwell elements for their model. For the present study, we restrict ourselves to moderate temperature and frequency changes and only consider 9 elements with time constants $\tau\_{\alpha} \in [10^{-4}, 10^{4}]$ in order to reduce computation times. The shear moduli of the elements with $\tau\_{\alpha} > 10^{4}$ are added to the elastic shear modulus $G\_0$, whereas the elements with $\tau\_{\alpha} < 10^{-4}$ were omitted.

**Table 5.2:** Material parameters of polypropylene (Kehrer et al., 2018; Warlimont and Martienssen, 2018)


**Uniaxial extension.** In our first set of experiments, we take a look at the stress-strain behavior under uniaxial extension and compression. We want to assess the strength of the thermomechanical coupling for the investigated composite microstructure. Furthermore, we are interested in the performance of different FFT-based solution algorithms in conjunction with the staggered thermomechanical solution scheme in Alg. 5. To this end, we chose the Barzilai-Borwein method (Schneider, 2019a) and the Newton-CG method (Gélébart and Mondon-Cancel, 2013; Kabel et al., 2014) as fastest strain-based solvers, see Ch. 3. In addition, the basic scheme by Moulinec and Suquet (1998) is included as classical benchmark. For the loading, we apply mixed boundary conditions, see Kabel et al. (2016), corresponding to strain-controlled uniaxial extension/compression to 5% with a strain rate of 1/s at various loading angles in the -plane with respect to the -direction, i.e., the fiber direction. For the first set of computations, we consider adiabatic conditions, i.e., $\mathcal{S} = 0$, where self-heating/-cooling of the material is expected. The second set of computations is performed with a fixed temperature of $\theta\_{\text{ref}} = 293.15$ K as reference.

**Figure 5.2:** Continuous glass-fiber reinforced polypropylene: Stress vs strain at various loading angles in the -plane with respect to the -direction

The resulting stress-strain curves are plotted in Fig. 5.2. In the isothermal setting, there is no distinction between tension and compression and we observe a linear relation between stresses and strains. For the loading under adiabatic conditions, however, the thermomechanical coupling induces an effectively nonlinear behavior. To be more precise, under compression, $s\_{\text{mech}}$ decreases, which leads to a rise in temperature, resulting in the softening of the polypropylene matrix. Conversely, under tension $s\_{\text{mech}}$ increases, leading to a lower temperature and the stiffening of polypropylene. Two factors contribute to the strength of the observed thermomechanical coupling. First, due to its high thermal expansion coefficient $\alpha\_0$, the Gough-Joule effect, i.e., the strain-induced change of $s\_{\text{mech}}$, is rather pronounced for polypropylene. Secondly, the mechanical behavior of polypropylene is very sensitive to temperature changes in the vicinity of its glass transition temperature, as encapsulated by the WLF equation (5.51). Note that the computations for the $0^\circ$ load angle represent an exception to these observations. In this case, the fibers carry most of the load and we observe no difference between isothermal and adiabatic computations, due to the temperature independence of their stiffness.

**Performance comparison for a single load step.** Next, we take a closer look at the performance of the different FFT-based solution schemes. In particular, we are interested in how their convergence behavior changes in case of strong thermomechanical coupling, compared to the isothermal setting. Hence, we consider the load case of uniaxial extension at a $90^\circ$ load angle, where the coupling is most pronounced. First, the performance is evaluated for a single load step up to 5% strain.

**Figure 5.3:** Continuous glass-fiber reinforced polypropylene: Performance comparison for 5% uniaxial extension in -direction in a single load step

The residual is plotted as a function of iteration counts and computation time in Fig. 5.3. Note that the convergence behavior of the Newton-CG method in the adiabatic setting is distinctly different in comparison to the isothermal computation. For the isothermal case, the decrease of the residual gradually grows in subsequent Newton iterations. Due to the adaptive forcing-term choice of Eisenstat and Walker (1996), the linear system is thus solved to higher accuracy. In contrast, the convergence rate with respect to Newton iterations is roughly constant for the adiabatic computation. This is due to the fact that we do not consider the temperature dependence of the material behavior in the computation of the Hessian. Thus, the linear approximation of the gradient is less precise than for the isothermal computation. With respect to the overall performance, this effect is somewhat alleviated by the forcing term of Eisenstat and Walker (1996), as the linear system is solved to lower accuracy, thereby reducing the cost of each Newton iteration. Even though Newton-CG requires 75% more Newton iterations in the adiabatic setting, the runtime only increases by about 30%, see Tab. 5.3. For the basic scheme, the convergence rate in the adiabatic and isothermal setting is nearly identical. The same is true for the Barzilai-Borwein method, which displays its characteristic non-monotone behavior and converges in far fewer iterations than the basic scheme, see Tab. 5.3. Even though the iteration counts of both schemes are roughly identical for both settings, the overall computation times are slightly higher for the adiabatic computations.

A look at the computational cost of the most expensive operations, i.e., the material law, the FFTs and the Γ-operator, clarifies this phenomenon. In Tab. 5.4, the average computation times per application of these operations are listed for the $0^\circ$ load case solved by the Barzilai-Borwein scheme. Notably, for the adiabatic setting, the additional computation of $\mathcal{D}$ and $s\_{\text{mech}}$ in the material law increases its time per application by about 70%. Thus, the overall cost per iteration ends up 30% higher. The same is true for the basic scheme.

**Table 5.3:** Continuous glass-fiber reinforced polypropylene: Iteration counts and computation times for 5% uniaxial extension in -direction in a single load step

**Table 5.4:** Continuous glass-fiber reinforced polypropylene: Computation time per application of the most expensive operations for loading in -direction and solved by the Barzilai-Borwein method in a single load step


Comparing the overall performance of the schemes, we observe that the Barzilai-Borwein method is the fastest for both the isothermal and adiabatic setting. The Newton-CG method is only slightly slower but suffers from increased iteration counts for the adiabatic case. The basic scheme is by far the slowest, taking 5-6 times longer than the Barzilai-Borwein method.

**Performance comparison for 20 load steps.** Next, we investigate the performance of the solvers when subdividing the strain loading of 5% into 20 equally spaced load steps. An affine-linear extrapolation (Moulinec and Suquet, 1998) is applied at the beginning of each load step to obtain an initial guess for the strain field. The total iteration counts and computation times of the different solvers in each load step are plotted in Fig. 5.4.

**Figure 5.4:** Continuous glass-fiber reinforced polypropylene: Performance comparison for 5% uniaxial extension in -direction in 20 load steps

For all solvers, the iteration counts decrease up to step 5 as the affine-linear extrapolation takes effect. For the isothermal computations, the iteration counts further decrease after this point, due to the linear stress-strain behavior. In contrast, the iteration counts stagnate or, in case of the basic scheme, even increase for the adiabatic computations. This coincides with the onset of the effectively nonlinear material behavior for uniaxial strains larger than 1%, see Fig. 5.2d. Hence, the affine-linear extrapolation becomes less effective, which leads to higher iteration counts compared to the isothermal computations. Note that for a material which already behaves nonlinearly under isothermal conditions, this difference between adiabatic and isothermal computations is expected to be less pronounced.


**Table 5.5:** Continuous glass-fiber reinforced polypropylene: Mean iteration counts and computation times for 5% uniaxial extension in -direction and 20 load steps

As for the loading in a single step, the Barzilai-Borwein method is fastest. Its computation time for the adiabatic case increases by roughly 35%, due to higher iteration counts and the additional cost per iteration. The performance of the Newton-CG method is nearly identical to the Barzilai-Borwein method for the isothermal computation. However, it exhibits a larger decrease in performance for the adiabatic computation, with an increase in computation time by nearly 60%. For the basic scheme, the iteration counts are roughly identical for the isothermal and the adiabatic setting. In the first 9 steps, the adiabatic computation converges faster, as a consequence of the stiffening due to self-cooling and the resulting reduction in material contrast. For the subsequent steps, the isothermal computation requires fewer iterations, due to the more effective affine-linear extrapolation. Fortuitously, these effects roughly cancel each other out. Overall, the basic scheme is still the slowest, taking 3-4 times longer than the Barzilai-Borwein method to converge.

To summarize, we observe that the convergence behavior of the basic scheme and the Barzilai-Borwein method in conjunction with Alg. 5 is similar to their convergence behavior under isothermal conditions, even for a composite with strong thermomechanical coupling. The computation times for the thermomechanically coupled computations increase by roughly 30% for both schemes, which is mainly due to the additional cost of computing the dissipation $\mathcal{D}$ and mechanical entropy $s\_{\text{mech}}$ in the material law. The Newton-CG method suffered the highest decrease in performance for the coupled computations, as the temperature dependence is neglected in the Hessian computation. This leads to a significant increase in Newton and CG iterations, in addition to the higher cost per Newton iteration.

Considering the overall performance, the Barzilai-Borwein method and the Newton-CG method are the fastest solvers. Due to its lower memory requirements and its more robust convergence behavior in the thermomechanically coupled computations, we use the Barzilai-Borwein method for all following computations.

### **5.4.3 Planar short glass-fiber reinforced polypropylene**

Motivated by the numerical experiments in the last section, we investigate a more complex microstructure, see Fig. 5.5. We consider a polypropylene matrix reinforced by 1130 short glass-fibers with an aspect ratio of 20. The fiber volume fraction amounts to 13.2%, corresponding to a mass fraction of 30%. The microstructure was generated by the sequential addition and migration algorithm (Schneider, 2017b) and discretized by 512 × 512 × 64 voxels. The second-order fiber-orientation tensor reads $\text{diag}(0.45, 0.45, 0.1)$, see Advani and Tucker (1987).

**Figure 5.5:** Short glass-fiber reinforced polypropylene: Microstructure and von Mises equivalent strain after 1% uniaxial extension in -direction

For the following investigations, we use the same material models and parameters as in Sec. 5.4.2.

**Dynamic mechanical analysis.** The macroscopic behavior of viscoelastic composites is often investigated under steady-state oscillations with a fixed frequency $f \in \mathbb{R}\_{\geq 0}$, see Sec. 5.5 in Brinson and Brinson (2015). This is commonly called dynamic-mechanical analysis (DMA). Suppose a linear viscoelastic material is harmonically excited by uniaxial tension/compression where the strain component in loading direction is given by

$$
\varepsilon(t) = \varepsilon\_{\text{amp}} \sin(\omega t),
\tag{5.57}
$$

with the strain amplitude $\varepsilon\_{\text{amp}} \in \mathbb{R}\_{\geq 0}$ and the angular frequency $\omega = 2\pi f$. The stress response of the material in loading direction reads

$$
\sigma(t) = \sigma\_{\text{amp}} \sin(\omega t + \delta) \tag{5.58}
$$

with the stress amplitude $\sigma\_{\text{amp}} \in \mathbb{R}\_{\geq 0}$ and phase difference $\delta \in [0, \pi/2]$. Typical characteristics for the material are the storage modulus

$$E' = \frac{\sigma\_{\text{amp}}}{\varepsilon\_{\text{amp}}} \cos(\delta) \tag{5.59}$$

and loss modulus

$$E'' = \frac{\sigma\_{\text{amp}}}{\varepsilon\_{\text{amp}}} \sin(\delta). \tag{5.60}$$

The storage modulus is related to the average elastic energy stored in a load cycle

$$
\psi\_{\rm cycle} = \frac{1}{4} \varepsilon\_{\rm amp}^2 E'\tag{5.61}
$$

and serves as a measure of the material's elastic stiffness. The loss modulus is proportional to the energy dissipated over a load cycle

$$\mathcal{D}\_{\text{cycle}} = \pi \,\varepsilon\_{\text{amp}}^2 \, E'',\tag{5.62}$$

see Sec. 9.1 in Tschoegl (1989). Thus, $E''$ is of particular interest in cases of harmonic loadings with high cycle counts. For instance, in fatigue experiments (Handa et al., 1999; Esmaeillou et al., 2012), the dissipated energy accumulates, leading to an increase of temperature over time. For linear viscoelastic material models, such as the generalized Maxwell model for polypropylene, $E'$ and $E''$ can be computed analytically in the isothermal setting, see Sec. 11.1 in Tschoegl (1989). However, as we have seen in Sec. 5.4.2, the thermomechanical coupling induces a nonlinear behavior due to self-heating and self-cooling. Thus, we characterize the viscoelastic behavior of the composite by simulating DMA tests. More precisely, we run through the following steps:


to mitigate the effects of the initial stress relaxation on our numerical experiments.


Note that we do not use the affine-linear extrapolation for these computations, due to the nonmonotone loading. To validate our approach and to determine the necessary number of load steps per cycle, we apply steps 1-4 for a homogeneous polypropylene microstructure under isothermal conditions. The parameters for the sinusoidal loading are $\varepsilon\_{\text{static}} = 0.1\%$, $\varepsilon\_{\text{amp}} = 0.05\%$, see Kehrer et al. (2018), and $f = 10$ Hz. For this frequency, the storage and loss modulus of the viscoelastic model for polypropylene are given by $E' = 2012.22$ MPa and $E'' = 177.68$ MPa. In addition to $E'$ and $E''$, we also track the effective dissipated energy (5.26) in our computations and compare it to the analytical formula (5.62). The relative errors for $E'$, $E''$ and $\mathcal{D}\_{\text{cycle}}$ are shown in Fig. 5.6 as a function of the load steps per cycle.

For more than 30 load steps per cycle, the relative error for all tracked quantities falls below 1%. Indeed, $E''$ as determined by our DMA computation virtually coincides with its analytical value. Note that the error in dissipation does not tend to 0 for finer resolutions. This is a consequence of the stress relaxation under static strain loading, which still causes a small additional amount of energy dissipation. In preliminary computations, a higher number of cycles was considered as well. However, the results did not differ substantially. Hence, we choose 30 load steps per cycle for all subsequent computations.
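One possible way to extract $E'$ and $E''$ from the sampled stress and strain histories of a virtual DMA test is a projection onto the fundamental harmonic. The following sketch is an illustration only and not necessarily the post-processing used for the results above: the function names are hypothetical, the samples are assumed to be uniformly spaced over exactly one steady-state period, and the strain is assumed to follow (5.57) up to a static offset. The stress signal is synthetic, built from the analytical moduli quoted above and an arbitrary static stress offset.

```python
import math


def first_harmonic(samples):
    """Sine and cosine coefficients of the fundamental frequency.

    samples -- values at n >= 3 uniformly spaced instants t_j = j*T/n,
               j = 0..n-1, covering exactly one period T.
    """
    n = len(samples)
    sin_part = 2.0 / n * sum(v * math.sin(2.0 * math.pi * j / n)
                             for j, v in enumerate(samples))
    cos_part = 2.0 / n * sum(v * math.cos(2.0 * math.pi * j / n)
                             for j, v in enumerate(samples))
    return sin_part, cos_part


def dma_moduli(strain, stress):
    """Estimate E' (5.59) and E'' (5.60) for a strain of the form (5.57)."""
    eps_amp, _ = first_harmonic(strain)
    sig_sin, sig_cos = first_harmonic(stress)
    return sig_sin / eps_amp, sig_cos / eps_amp


def loop_dissipation(strain, stress):
    """Area enclosed by the stress-strain loop, via the trapezoidal rule."""
    n = len(strain)
    return sum(0.5 * (stress[j] + stress[(j + 1) % n])
               * (strain[(j + 1) % n] - strain[j]) for j in range(n))


# Synthetic steady-state cycle with 30 load steps (stresses in MPa).
n_steps, eps_static, eps_amp = 30, 1.0e-3, 5.0e-4
e_storage_ref, e_loss_ref = 2012.22, 177.68
phase = [2.0 * math.pi * j / n_steps for j in range(n_steps)]
strain = [eps_static + eps_amp * math.sin(p) for p in phase]
stress = [2.0 + eps_amp * (e_storage_ref * math.sin(p) + e_loss_ref * math.cos(p))
          for p in phase]

e_storage, e_loss = dma_moduli(strain, stress)
d_cycle_formula = math.pi * eps_amp ** 2 * e_loss   # analytical formula (5.62)
d_cycle_loop = loop_dissipation(strain, stress)
print(e_storage, e_loss, d_cycle_formula, d_cycle_loop)
```

The static offsets in strain and stress drop out of both the harmonic projection and the closed-loop integral, so only the oscillating parts of the signals enter the estimates.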

**Figure 5.6:** Polypropylene: Relative error between analytic values and the results of the virtual DMA tests for $E'$, $E''$ and $\mathcal{D}\_{\text{cycle}}$ as a function of load steps per cycle

With the established procedure, we simulate uniaxial DMA tests at various static load values for the planar short glass-fiber reinforced polypropylene microstructure, see Fig. 5.5. In particular, the effect of the thermomechanical coupling under adiabatic conditions on $E'$ and $E''$ is of interest. The loading is applied in the -plane at angles between $0^\circ$ and $90^\circ$ with respect to the -direction. Static loads $\varepsilon\_{\text{static}}$ between 0.1% and 1.0% are considered. The amplitude and frequency are fixed at $\varepsilon\_{\text{amp}} = 0.05\%$ and $f = 10$ Hz, respectively.

In Fig. 5.7, the results for $E'$ and $E''$ are plotted alongside the mean temperature during the harmonic excitation as a function of the loading angle. First, we take a look at the storage modulus. For the $0^\circ$ load case, i.e., in-plane loading, the storage modulus is at its peak value. This is due to the stiffening effect of the fibers. For increasing load angle, it drops by ca. 20% up to $45^\circ$ and subsequently stagnates. Similar to the observations in Sec. 5.4.2, the material cools down under tensile loading due to the Gough-Joule effect, see Fig. 5.7c. This causes a stiffening of the polypropylene matrix and an increase of $E'$. The effect is most pronounced for the $90^\circ$ load case, where we observe the largest temperature difference between adiabatic and isothermal conditions. For a static load of 1.0%, the relative difference between the adiabatic computation and the isothermal computation is slightly below 6%.

**Figure 5.7:** Short glass-fiber reinforced polypropylene: Complex moduli and average temperature as a function of the loading angle with respect to the -axis in the -plane

The loss modulus $E''$ displays a slightly different profile with respect to the loading angle. It attains its maximum for load angles between $0^\circ$ and $15^\circ$, where the strong strain localization around the fibers leads to strong dissipation. Subsequently, the loss modulus decreases linearly. The effect of the static loading on the loss modulus under adiabatic conditions is more pronounced than for the storage modulus. As a decrease in temperature brings the temperature of polypropylene closer to its glass transition temperature, the dissipated energy and $E''$ increase. At a load angle of $90^\circ$, where the self-cooling is most pronounced, even the lowest static loading of $\varepsilon\_{\text{static}} = 0.1\%$ leads to a 5% difference in the loss moduli. The difference increases with the static loading, reaching 13% for $\varepsilon\_{\text{static}} = 1.0\%$.

We conclude that the thermomechanical coupling can have a significant effect when characterizing thermoplastics-based composites using DMA. Due to the Gough-Joule effect, the effective behavior of the material, in particular $E''$, becomes load dependent, i.e., nonlinear. This is particularly pronounced for high loading frequencies, when there is no time for thermal conduction or radiation to take place and the conditions are approximately adiabatic. To obtain precise results for real-life experiments, a strict temperature control of the specimen and low static loadings are therefore necessary.

**Self-heating under harmonic loading.** In the previous Sec. 5.4.3, we considered an oscillatory loading with a small number of cycles. In this case, the observed temperature changes were mostly due to the Gough-Joule effect caused by the static loading. However, for a high number of cycles, the dissipated energy accumulates over time and becomes the main driver of the temperature evolution. For example, such conditions frequently occur in fatigue testing, where the self-heating of the specimen poses a major challenge (Rittel, 2000; Mortazavian and Fatemi, 2015). Typically, in the first hundreds of cycles, the temperature increases in a roughly linear fashion (Jegou et al., 2013) and subsequently levels off once dissipation and thermal conduction balance each other. This limits, for instance, the range of viable loading frequencies for testing (Jia and Kagan, 1998; De Monte et al., 2010).

Motivated by these findings, we take a look at the effect of the thermomechanical coupling on the dissipative characteristics of the short glass-fiber reinforced composite in the initial stage of a high-cycle test. More precisely, we prescribe 100 cycles of harmonic stress-controlled uniaxial tensile loading in -direction with a frequency of $f = 10$ Hz. The static stress is fixed at $\sigma_{\text{static}} = 30$ MPa with a stress amplitude of $\sigma_{\text{amp}} = 30$ MPa, corresponding to a load factor of $R = \sigma_{\min}/\sigma_{\max} = 0$. As only a short time frame of 10 seconds is considered, we assume adiabatic conditions. First, we consider the evolution of the temperature and the strain amplitude. In Fig. 5.8a, the minimum, maximum and average temperature are plotted for each cycle. Initially, the mean temperature is lower than the reference temperature of 293.15 K due to the Gough-Joule effect. Over time, the self-heating caused by the dissipated energy leads to a linear increase, and after 25 cycles the initial cool-down is compensated.

**Figure 5.8:** Short glass-fiber reinforced polypropylene: Temperature and strain amplitude for each of 100 cycles under stress-controlled uniaxial harmonic loading in -direction

Together with the temperature, the strain amplitude increases as the material softens, see Fig. 5.8b. However, the reference value of $\varepsilon_{\text{amp}}$ for the isothermal case is only reached after 42 cycles, when the mean temperature has already surpassed the reference temperature. Taking a look at the minimum and maximum temperature in Fig. 5.8a, we observe that the large stress amplitude leads to a significant fluctuation of about 1 K for each cycle. Hence, the behavior of polypropylene fluctuates within each cycle, resulting in a slight reduction of the amplitude.

Finally, we take a look at the dissipation and the loss modulus for each cycle. Consistent with our observations in Sec. 5.4.3, the magnitude of the loss modulus, see Fig. 5.9a, is initially higher than the isothermal prediction and subsequently decreases with increasing temperature. As the temperature reaches its reference value, so does $E''$, indicating that it is mostly unaffected by the large stress amplitude and the resulting temperature fluctuations within each cycle. The dissipation per cycle follows a similar trend. However, it barely exceeds the isothermal reference value in the first few cycles, as the higher loss modulus is partly compensated by the lower strain amplitude.

**Figure 5.9:** Short glass-fiber reinforced polypropylene: Dissipated energy and loss modulus for each of 100 cycles under stress-controlled uniaxial harmonic loading in -direction

Overall, we observe that the dissipative behavior of the material changes significantly in the first cycles of a long-term harmonic excitation. At the end of 100 cycles, the loss modulus and dissipation are 16% and 12% lower, respectively, than the values predicted for the isothermal setting. Thus, when predicting the temperature changes for fatigue tests based on (5.62), see Handa et al. (1999), accounting for the temperature dependence of the material is mandatory.

# **5.5 Conclusions**

The present study was devoted to enabling the efficient computational homogenization of thermomechanically coupled materials. Based on the asymptotic homogenization framework for dissipative materials (Chatzigeorgiou et al., 2016), we presented an efficient staggered algorithm compatible with strain or displacement-based micromechanical solvers. Due to their computational power, we focused on FFT-based solution schemes and found that best performance was achieved in combination with the Barzilai-Borwein method. Even for a composite with strong thermomechanical coupling, its iteration counts and convergence behavior hardly differed from the usual isothermal setting. The powerful class of polarization-based schemes (Eyre and Milton, 1999; Michel et al., 2001; Monchiet and Bonnet, 2012) was excluded from the present work, as the complexity-reduction approach by Schneider et al. (2019) may prevent the evaluation of dissipation and entropy. Further studies are necessary to make these solvers available for thermomechanically coupled problems.

In our numerical experiments, we observed that the computational overhead for the temperature-update step in the proposed algorithm was negligible. The difference in runtime between thermomechanically coupled and isothermal computations was dominated by evaluating the entropy and the dissipation, as part of the material law. In particular, computing the dissipation was costly for the chosen linear viscoelastic model, as it involves applying an inverse stiffness tensor for each Maxwell element. This led to an increase in overall computation times by 20 to 30%. However, for material laws such as $J_2$-plasticity, where the dissipation is readily computed, the difference is much smaller. Overall, we conclude that the proposed algorithm enables computing the effective mechanical behavior of thermomechanically coupled materials with nearly the same computational efficiency as traditional FFT-based methods in an isothermal setting.

For the investigated glass-fiber reinforced polypropylene composites, we observed that the thermomechanical coupling induced an effectively nonlinear material behavior, even though the underlying material model was linear viscoelastic. In particular, the dissipative characteristics of the materials changed significantly between the isothermal and adiabatic computations. Expanding the study of similar polymer-based lightweight materials, such as sheet-molding compounds (Görthofer et al., 2020), to include thermal effects seems promising. The presented thermomechanical solver is compatible with the interpolation approach by Köbler et al. (2018), enabling the development of effective (macroscopic) surrogate models for arbitrary fiber orientations. For more general structures and material models, thermomechanical FFT-based computations may enter data-driven approaches, such as deep material networks (Liu et al., 2019; Liu and Wu, 2019; Gajek et al., 2020), to facilitate the simulation of components on the macroscale.

With regard to the material model of the polymer, it would be interesting to apply a free-volume based approach for the shift factor (Fillers and Tschoegl, 1977), which takes into account the pressure dependence of the viscosity. Whereas a tensile loading mechanically increases the free volume, the accompanying adiabatic cooldown observed in this study may weaken this effect. Investigating the interaction between these phenomena seems worthwhile to enable a thorough characterization of the thermomechanical material behavior. In addition, expanding the material model to the viscoplastic domain (Krairi et al., 2019) appears attractive to investigate the influence of the plastic dissipation on the self-heating behavior of the material. As self-heating effects are particularly relevant in the context of fatigue and life-time predictions, coupling the presented thermomechanical solver with FFT-based schemes for damage (Boeff et al., 2015; Sharma et al., 2020) or fracture (Chen et al., 2019b; Ernesti et al., 2020) would be of interest.

## **Chapter 6**

# **Anderson-accelerated polarization schemes for fast Fourier transform-based computational homogenization<sup>1</sup>**

## **6.1 Introduction**

Polarization-based methods, pioneered by Eyre and Milton (1999), constitute a powerful and memory-efficient class of solvers, oftentimes outperforming the fastest strain-based methods, see Sec. 7 in Schneider et al. (2019). Unfortunately, these algorithms are highly sensitive to the choice of algorithmic parameters, limiting their capabilities as general-purpose solvers. In particular for problems with infinite contrast, e.g., porous materials, where the strong convexity constant is generally unknown (Schneider, 2020b), this has proven to be highly detrimental to the performance of polarization methods (Schneider, 2019a). In this chapter, we study the combination of polarization methods and Anderson acceleration, producing a fast, flexible and versatile general-purpose FFT-based solver. Anderson acceleration (Anderson, 1965) is a method for improving the convergence behavior of fixed-point iterations where derivatives of the fixed-point mapping are not available. Based on a limited number (the so-called depth) of previous iterates, Anderson acceleration generates the next iterate as a mixture of these iterates, where the mixing coefficients solve an associated low-dimensional optimization problem. Anderson acceleration often leads to a substantial speed-up in applications, such as convective flow (Pollock et al., 2021), well-fracture (Aksenov et al., 2021), radiation-diffusion (An et al., 2017), computer graphics (Zhang et al., 2019) or microstructure generation (Kuhn et al., 2020). Anderson acceleration may be interpreted as a multi-secant Quasi-Newton method (Fang and Saad, 2009) and is "essentially equivalent" to GMRES for linear problems, see Walker and Ni (2011). Theoretical convergence assertions were only recently provided (Toth and Kelley, 2015; Evans et al., 2020).

<sup>1</sup> This chapter is based on Wicht et al. (2021a). For the sake of a coherent structure, formatting and typography of this thesis, minor changes have been made. To avoid redundancies in the text, the introduction has been shortened. The discussion of the material behavior in Sec. 6.3.6 was expanded.

In FFT-based computational micromechanics, the Anderson-accelerated basic scheme was included as a solution algorithm in the AMITEX software package (Chen et al., 2019b), see Ch. 3 for a comparison to other (single-secant) Quasi-Newton methods. Unfortunately, when applied to the basic scheme, Anderson acceleration is unable to unleash its full potential. Indeed, when applied to gradient descent (such as the basic scheme (Kabel et al., 2014; Schneider, 2017a; Bellis and Suquet, 2019)), Li and Li (2020) proved that the convergence rate of an Anderson-accelerated gradient method does not improve upon the optimum convergence rate of plain gradient descent. This theoretical result is backed up by computational experiments in Ch. 3.

Applying Anderson acceleration to polarization schemes appears much more promising. Indeed, most of the time, an optimally tuned polarization method is competitive with or even outperforms the fastest strain-based solvers in terms of iteration count (Schneider et al., 2019; Moulinec and Silva, 2014; Monchiet and Bonnet, 2013). Thus, by relieving the user of the daunting task of identifying the optimum numerical parameters, the Anderson-accelerated polarization scheme turns into a general-purpose solver for FFT-based computational micromechanics. We wish to draw the reader's attention to recent applications (Fu et al., 2020; Zhang et al., 2019; Ouyang et al., 2020) of Anderson acceleration to operator-splitting methods, which motivated the present work.

Please note that Shantraj et al. (2015) investigated the combination of a nonlinear GMRES method (Oosterlee and Washio, 2000) (which is equivalent to Anderson acceleration) and polarization methods in the setting of finite-strain crystal viscoplasticity, and report that the Anderson-accelerated basic scheme outperforms the Anderson-accelerated polarization methods. However, polarization methods are known to be less powerful at finite strains due to the non-convexity of the problem. We refer to (Kabel et al., 2014, Sec. 3.2.5) for computational experiments. Thus, the conclusions of Shantraj et al. (2015) cannot be transferred to the small-strain setting. Furthermore, Shantraj et al. (2015) consider the deformation gradient and a rescaled polarization field as iterates of their algorithm. However, a recent study by Ouyang et al. (2020) demonstrates that it is preferable in terms of iteration counts and run-time to restrict Anderson acceleration to the lower-dimensional fixed-point iteration of the polarization. In the context of FFT-based micromechanics, this corresponds to accelerating the (damped) Eyre-Milton iteration, which is the approach we follow in this study.

This chapter is organized as follows. After recapitulating the basics of polarization methods, see Sec. 6.2.1, and Anderson acceleration, see Sec. 6.2.2, we present the resulting algorithm in Sec. 6.2.3. In Sec. 6.3, we perform numerical experiments to evaluate the performance of Anderson-accelerated polarization methods and compare them to the fastest strain-based solution algorithms.

# **6.2 Anderson-accelerated polarization schemes**

## **6.2.1 The Eyre-Milton equation and polarization schemes**

This section provides a streamlined presentation of polarization methods for FFT-based computational micromechanics at small strains, see Schneider et al. (2019) as a general reference.

In the context of small-strain continuum mechanics, let a cuboid cell $Y$ in $\mathbb{R}^d$ be given, together with a heterogeneous strain energy density

$$w: Y \times \text{Sym}(d) \to \mathbb{R}, \quad (x, \varepsilon) \mapsto w(x, \varepsilon), \tag{6.1}$$

where $d = 2, 3$ denotes the spatial dimension and $\text{Sym}(d)$ is the space of symmetric $d \times d$ matrices. In the following, we assume that $w$ is measurable in its first variable and (twice) differentiable in the strain. For a general physically nonlinear hyperelastic material, $w$ corresponds to the strain-energy density so that the stress operator computing the Cauchy stress tensor $\sigma(x, \varepsilon)$ at $x$ in response to the applied (infinitesimal) strain $\varepsilon \in \text{Sym}(d)$ is defined by the hyperelastic relation

$$\sigma: Y \times \text{Sym}(d) \to \text{Sym}(d), \quad (x, \varepsilon) \mapsto \frac{\partial w}{\partial \varepsilon}(x, \varepsilon). \tag{6.2}$$

Alternatively, $w$ may arise as the incremental potential of a generalized standard material after time discretization and static condensation of internal variables, see Miehe (2002). Assuming vanishing non-equilibrium stresses, the condensed incremental potential permits the hyperelastic definition (6.2) of the stress operator. Note that, in this case, $w$ has no intrinsic physical meaning, as it depends on the chosen time-integration scheme and mixes the Helmholtz free energy and the dissipation potential of the material. For the convenience of the reader, we suppress the $x$-dependency of $w$ and $\sigma$ in the following.

Introducing the space of periodic and mean-free displacement fluctuations

$$H^1_\#(Y; \mathbb{R}^d) = \left\{ u: \mathbb{R}^d \to \mathbb{R}^d \;\middle|\; u \text{ periodic}, \ \partial_n u \text{ anti-periodic on } \partial Y, \ \int_Y u \, dx = 0 \right\}, \tag{6.3}$$

we seek a solution $u \in H^1_\#(Y; \mathbb{R}^d)$ which satisfies the static balance of linear momentum without volume forces

$$\operatorname{div}\,\sigma(\overline{\varepsilon} + \nabla^s u) = 0 \tag{6.4}$$

for a prescribed macroscopic strain $\overline{\varepsilon}$. The corresponding space of square-integrable stress and strain fields $L^2(Y; \text{Sym}(d))$ is endowed with the inner product

$$\langle \varepsilon\_1, \varepsilon\_2 \rangle\_{L^2} \equiv \frac{1}{|Y|} \int\_Y \varepsilon\_1(x) : \varepsilon\_2(x) \, dx \quad \text{for} \quad \varepsilon\_1, \varepsilon\_2 \in L^2(Y; \text{Sym}(d)), \tag{6.5}$$

where $|Y|$ denotes the volume of the cell $Y$. Assuming that the stress for vanishing strain is square-integrable, $w$ is $\mu$-strongly convex in its second variable

$$\langle \sigma(\varepsilon_1) - \sigma(\varepsilon_2), \varepsilon_1 - \varepsilon_2 \rangle_{L^2} \geq \mu \left\| \varepsilon_1 - \varepsilon_2 \right\|_{L^2}^2 \quad \forall \varepsilon_1, \varepsilon_2 \in L^2(Y; \text{Sym}(d)), \tag{6.6}$$

and has an $L$-Lipschitz gradient

$$\|\sigma(\varepsilon_1) - \sigma(\varepsilon_2)\|_{L^2} \le L \, \|\varepsilon_1 - \varepsilon_2\|_{L^2} \quad \forall \varepsilon_1, \varepsilon_2 \in L^2(Y; \text{Sym}(d)), \tag{6.7}$$

the balance of linear momentum has a unique solution (Bellis and Suquet, 2019). This permits defining the effective stress associated to the strain loading $\overline{\varepsilon}$

$$\overline{\sigma}(\overline{\varepsilon}) = \frac{1}{|Y|} \int\_{Y} \sigma(\overline{\varepsilon} + \nabla^{s} u) \, dx,\tag{6.8}$$

where $u \in H^1_\#(Y; \mathbb{R}^d)$ solves equation (6.4). For more general existence results for monotone operators<sup>2</sup>, which are not necessarily derived from a potential, we refer to Ch. 22 in Bauschke and Combettes (2017).

It can be shown (Schneider, 2015) that for any displacement fluctuation field $u$ solving equation (6.4) and any reference stiffness $\mathbb{C}^0$, the total strain $\varepsilon = \overline{\varepsilon} + \nabla^s u \in L^2(Y; \text{Sym}(d))$ solves the Lippmann-Schwinger equation

$$
\varepsilon + \Gamma^0 : (\sigma(\varepsilon) - \mathbb{C}^0 : \varepsilon) = \overline{\varepsilon}, \tag{6.9}
$$

where $\Gamma^0$ denotes Green's operator associated to $\mathbb{C}^0$ (Mura, 1987), a bounded linear operator on $L^2(Y; \text{Sym}(d))$. Conversely, suppose that $\varepsilon \in L^2(Y; \text{Sym}(d))$ solves the Lippmann-Schwinger equation (6.9) for some reference stiffness $\mathbb{C}^0$. Then we may find $u \in H^1_\#(Y; \mathbb{R}^d)$ such that $\varepsilon = \overline{\varepsilon} + \nabla^s u$ and $u$ solves the balance of linear momentum (6.4), see, for instance, Schneider (2015).
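To make the Lippmann-Schwinger equation (6.9) more tangible, the following minimal sketch iterates it as a fixed-point scheme for a one-dimensional scalar model problem with $\sigma(\varepsilon) = k(x)\,\varepsilon$. The function name `basic_scheme_1d`, the two-phase laminate and the choice of the reference constant `c0` are illustrative assumptions and not taken from this thesis. In 1D, Green's operator reduces to division by `c0` on all non-zero frequencies.

```python
import numpy as np

def basic_scheme_1d(k, eps_bar, c0, n_iter=300):
    """Fixed-point iteration for (6.9) with sigma(eps) = k * eps in 1D (model problem)."""
    eps = np.full_like(k, eps_bar, dtype=float)
    for _ in range(n_iter):
        tau = (k - c0) * eps                 # sigma(eps) - C^0 : eps
        tau_hat = np.fft.fft(tau)
        tau_hat[0] = 0.0                     # Green's operator annihilates the mean
        # In 1D, Gamma^0 acts as 1/c0 on every non-zero frequency
        eps = eps_bar - np.fft.ifft(tau_hat).real / c0
    return eps

# Hypothetical two-phase laminate with equal volume fractions
k = np.where(np.arange(64) < 32, 1.0, 10.0)
eps = basic_scheme_1d(k, eps_bar=1.0, c0=0.5 * (k.min() + k.max()))
sigma_eff = np.mean(k * eps)                 # effective stress, cf. (6.8)
```

For this laminate, the converged stress field is constant, and the effective stress is expected to match the harmonic mean of `k` times the prescribed mean strain.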

The Lippmann-Schwinger equation serves as the basis of successful numerical algorithms for solving the balance of linear momentum (6.4), see Ch. 3 for an overview. Alternatively, we may investigate a formulation based on the polarization field $P = \sigma(\varepsilon) + \mathbb{C}^0 : \varepsilon$, i.e., the Eyre-Milton equation (Eyre and Milton, 1999)

$$P - \mathcal{Y}^0 : Z^0(P) = 2 \, \mathbb{C}^0 : \overline{\varepsilon} \tag{6.10}$$

<sup>2</sup> In the present setting, $\mu$-monotonicity of $\sigma$ is implied by (6.6).

in terms of the operator

$$\mathcal{Y}^0 = \mathcal{I} - 2\,\mathbb{C}^0 : \Gamma^0,\tag{6.11}$$

a non-local reflection operator on $L^2(Y; \text{Sym}(d))$, and the operator

$$Z^0 = \mathcal{I} - 2\,\mathbb{C}^0 : (\sigma + \mathbb{C}^0)^{-1}, \quad P \mapsto P - 2\,\mathbb{C}^0 : (\sigma + \mathbb{C}^0)^{-1}(P), \tag{6.12}$$

a nonexpansive and local operator on $L^2(Y; \text{Sym}(d))$. More precisely, the operator $\mathcal{Y}^0$ satisfies the reflection identity $\mathcal{Y}^0 \circ \mathcal{Y}^0 = \mathcal{I}$, formulated in terms of the identity operator $\mathcal{I}$ on $L^2(Y; \text{Sym}(d))$. Furthermore, $Z^0$ is well-defined, as the operator $\varepsilon \mapsto \sigma(\varepsilon) + \mathbb{C}^0 : \varepsilon$ is invertible due to the strong convexity of $w$ and the non-degeneracy of the reference stiffness $\mathbb{C}^0$.
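The reflection identity can be checked numerically in a finite-dimensional analogue. The sketch below assumes a reference stiffness proportional to the identity, so that $\mathbb{C}^0 : \Gamma^0$ is an orthogonal projector, and replaces it by a projector onto a randomly chosen subspace. The dimensions and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rank = 8, 3

# Orthogonal projector Gamma = Q Q^T onto a random 3-dimensional subspace,
# standing in for the projector onto compatible fields (assumption)
Q, _ = np.linalg.qr(rng.standard_normal((n, rank)))
Gamma = Q @ Q.T

# Reflection operator Y = I - 2 Gamma, cf. (6.11) for C^0 proportional to the identity
Y = np.eye(n) - 2.0 * Gamma

v = rng.standard_normal(n)
reflection_error = np.linalg.norm(Y @ Y - np.eye(n))            # Y o Y = I
norm_change = abs(np.linalg.norm(Y @ v) - np.linalg.norm(v))    # Y preserves norms
```

Both quantities should vanish up to round-off, which reflects that a reflection neither damps nor amplifies the iterates. Any contraction of the polarization scheme therefore has to come from the local operator $Z^0$ or from the damping.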

For any solution $\varepsilon$ of the Lippmann-Schwinger equation (6.9), the polarization field $P = \sigma(\varepsilon) + \mathbb{C}^0 : \varepsilon$ solves the Eyre-Milton equation and vice versa, as a direct implication of the Eyre-Milton identity

$$2\,\mathbb{C}^0 : (\mathcal{I} + \Gamma^0 : (\sigma - \mathbb{C}^0)) = (\mathcal{I} - \mathcal{Y}^0 : Z^0)(\sigma + \mathbb{C}^0), \tag{6.13}$$

a simple algebraic rewriting of the Lippmann-Schwinger equation (Schneider et al., 2019, Sec. 2). For any damping parameter $a \in [0, 1)$, we may consider the damped Picard iteration associated to the Eyre-Milton equation

$$P_{k+1} = a \, P_k + (1 - a) \left[ 2\, \mathbb{C}^0 : \overline{\varepsilon} + \mathcal{Y}^0 : Z^0(P_k) \right], \tag{6.14}$$

which is called the polarization scheme (Monchiet and Bonnet, 2012; Moulinec and Silva, 2014). Under the hypotheses of this section, for any initial value $P_0 \in L^2(Y; \text{Sym}(d))$, reference stiffness $\mathbb{C}^0$ and damping parameter $a \in [0, 1)$, the iterative scheme (6.14) converges to a solution of the Eyre-Milton equation (6.10). This is a direct consequence of the identification (Schneider et al., 2019, Sec. 3) of the polarization scheme (6.14) as the Douglas-Rachford method (Lions and Mercier, 1979), and the tight linear convergence bounds for the Douglas-Rachford splitting established by Giselsson and Boyd (2017), see (Schneider, 2019b, Sec. 3.1).

Restricted to the class of reference materials proportional to the identity, explicit formulae for obtaining the *optimum* convergence rate are available. More precisely, if $\mathbb{C}^0 = \frac{1}{\gamma}\,\text{I}$ holds in terms of a positive number $\gamma$, the distance of the iterates of (6.14) to the fixed point $P^*$ decreases by

$$\|P_{k+1} - P^*\|_{L^2} \le (a + (1 - a)\,\delta) \, \|P_k - P^*\|_{L^2} \tag{6.15}$$

with

$$\delta = \max\left(\frac{\gamma L-1}{\gamma L+1}, \frac{1-\gamma\mu}{1+\gamma\mu}\right), \tag{6.16}$$

see Theorem 2 in Giselsson and Boyd (2017). The best convergence rate is achieved by setting $\gamma = 1/\sqrt{\mu L}$ and $a = 0$, leading to

$$\|P_{k+1} - P^*\|_{L^2} \le \frac{\sqrt{L/\mu} - 1}{\sqrt{L/\mu} + 1} \, \|P_k - P^*\|_{L^2}. \tag{6.17}$$
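The sensitivity of the bound (6.15) to the step size can be explored with a few lines of code. The sketch below takes absolute values of both arguments of the maximum, which is an assumption on how (6.16) is to be read and keeps the factor non-negative for every step size. The function name `contraction_factor` and the constants are hypothetical.

```python
import math

def contraction_factor(step, mu, lip, a=0.0):
    """Upper bound a + (1 - a) * delta on the contraction factor, cf. (6.15)."""
    delta = max(abs(step * lip - 1.0) / (step * lip + 1.0),
                abs(step * mu - 1.0) / (step * mu + 1.0))
    return a + (1.0 - a) * delta

mu, lip = 1.0, 100.0                          # hypothetical constants, contrast 100
best_step = 1.0 / math.sqrt(mu * lip)         # step size balancing both arguments
optimal = contraction_factor(best_step, mu, lip)
detuned = contraction_factor(10.0 * best_step, mu, lip)
```

For the balanced step size, both arguments of the maximum coincide and the factor equals the one in (6.17). Moving the step size away from this value by one order of magnitude pushes the bound close to one, which illustrates why parameter selection matters for polarization schemes.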

At this point, some remarks are in order:


$\partial^2 w / \partial \varepsilon^2$, respectively. For small-strain materials, the maximum slope of the stress operator is typically not larger than the maximum slope of the algorithmic tangent at zero strain. Thus, for example for elastoplasticity, $L$ may be estimated from the maximum eigenvalue of the initial elastic stiffness, maximized over all $x \in Y$. Computing $\mu$, on the other hand, may require an eigenvalue decomposition of $\mathbb{C}^{\text{alg}}$, which is computationally expensive. Hence, an approach should be identified which minimizes how often $\mu$ and $L$ are computed while preserving the convergence rate of the scheme, see Sec. 6.3.2 for further discussion.


updates, the basic scheme suffers from a step-size restriction in order to retain stability. In contrast, due to the implicit nature of the updates, polarization schemes (6.14) are stable for any step size. In particular, much larger step sizes than for the basic scheme can be used. The latter phenomenon is responsible for the improved convergence speed of the polarization methods compared to the basic scheme. Moreover, the relaxation (6.14) of the fixed-point scheme by a factor $a$ may also be applied to the basic scheme. However, the resulting scheme will be equivalent to the basic scheme with a different step size. In contrast, for polarization methods, relaxation leads to more general methods. Overall, we may conclude that the choice of the step size and the relaxation parameter is not straightforward. This is particularly bothersome, since the convergence rate of polarization-based schemes exhibits a strong sensitivity w.r.t. these parameters (Schneider et al., 2019, Sec. 7). Thus, for problems where estimates for $\mu$ or $L$ are not available, polarization-based schemes may perform poorly, see Section 3.2 in Schneider (2019a), limiting their usefulness as general-purpose solvers. In the following Sections 6.2.2 and 6.2.3, we shall discuss how Anderson acceleration may counterbalance the slow convergence behavior of polarization-based schemes (6.14) for suboptimal parameter choices.

#### **6.2.2 Anderson acceleration for fixed-point iterations**

Suppose a (nonlinear) operator $F: X \to X$ is given, mapping a Banach space $X$ into itself, which is Lipschitz-continuous with Lipschitz constant $\rho < 1$. For any initial value $x_0 \in X$, Banach's fixed point theorem (Banach, 1922) asserts that the iterative scheme

$$x\_{k+1} = F(x\_k) \tag{6.18}$$

converges to the unique fixed point $x^*$ of $F$ with rate $\rho$, i.e.,

$$\|x_{k+1} - x^*\|_{X} \le \rho \, \|x_k - x^*\|_{X} \tag{6.19}$$

holds. Anderson acceleration (Anderson, 1965), sometimes also called Anderson mixing, is a method applicable to general fixed-point iterations (6.18). It aims at improving the convergence properties of the Picard iteration (6.18) for cases where derivatives of $F$ are either not available or expensive to compute.
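As a point of reference for the accelerated variants, the plain Picard iteration (6.18) may be sketched as follows. The scalar example $F = \cos$ is a textbook contraction on $[0, 1]$ and is chosen purely for illustration. The function name `picard` and the stopping criterion based on successive iterates are assumptions.

```python
import math

def picard(F, x0, tol=1e-12, max_iter=1000):
    """Iterate x_{k+1} = F(x_k), cf. (6.18), until successive iterates agree."""
    x = x0
    for k in range(1, max_iter + 1):
        x_new = F(x)
        if abs(x_new - x) < tol:
            return x_new, k
        x = x_new
    return x, max_iter

# cos maps [0, 1] into itself with |cos'(x)| <= sin(1) < 1, i.e., it is contractive there
x_star, iterations = picard(math.cos, 1.0)
```

The iteration only requires evaluations of `F`, which is the setting in which Anderson acceleration is designed to help.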

Anderson acceleration depends on a non-negative integer $m$ called depth. For $m = 0$, it reduces to the original Picard iteration (6.18). For general $m \ge 0$, to determine the next iterate $x_{k+1}$, Anderson acceleration "mixes" the last $m_k + 1$ iterates

$$x\_{k+1} = \sum\_{i=1}^{m\_k+1} \alpha\_k^i F(x\_{k+1-i}),\tag{6.20}$$

where $m_k = \min(m, k)$ and the coefficients $\alpha_k \in \mathbb{R}^{m_k+1}$ are chosen to minimize the function

$$\left\| \alpha_k^1 r_k + \alpha_k^2 r_{k-1} + \dots + \alpha_k^{m_k+1} r_{k-m_k} \right\|_{X}, \tag{6.21}$$

where $r_k = x_k - F(x_k)$ denote the residuals, subject to the mixing constraint

$$\sum\_{i=1}^{m\_k+1} \alpha\_k^i = 1.\tag{6.22}$$

The formulation (6.20) involves applying the nonlinear operator $F$ $(m_k + 1)$ times for each iteration step of Anderson mixing. As evaluating the operator $F$ is typically the most expensive step, practical implementations are based on the (already) computed residuals instead, using the equivalent update formula

$$x\_{k+1} = \sum\_{i=1}^{m\_k+1} \alpha\_k^i \left[ x\_{k+1-i} - r\_{k+1-i} \right]. \tag{6.23}$$

In this way, the nonlinear operator $F$ needs to be evaluated only once per Anderson iteration. Also, if $X$ is a Hilbert space, the minimization problem (6.21) simplifies to a quadratic programming problem, which may be solved by

$$\alpha\_k = \frac{1}{\underline{1}^T A\_k^\dagger \underline{1}} A\_k^\dagger \underline{1},\tag{6.24}$$

where $\underline{1}$ is the vector of all ones in $\mathbb{R}^{m_k+1}$, $A_k$ is the symmetric positive semidefinite matrix

$$A\_{k} = \begin{bmatrix} \langle r\_{k}, r\_{k} \rangle\_{\mathcal{X}} & \langle r\_{k}, r\_{k-1} \rangle\_{\mathcal{X}} & \dots & \langle r\_{k}, r\_{k-m\_{k}} \rangle\_{\mathcal{X}}\\ \langle r\_{k-1}, r\_{k} \rangle\_{\mathcal{X}} & \langle r\_{k-1}, r\_{k-1} \rangle\_{\mathcal{X}} & \dots & \langle r\_{k-1}, r\_{k-m\_{k}} \rangle\_{\mathcal{X}}\\ \vdots & \vdots & \ddots & \vdots\\ \langle r\_{k-m\_{k}}, r\_{k} \rangle\_{\mathcal{X}} & \langle r\_{k-m\_{k}}, r\_{k-1} \rangle\_{\mathcal{X}} & \dots & \langle r\_{k-m\_{k}}, r\_{k-m\_{k}} \rangle\_{\mathcal{X}} \end{bmatrix} \tag{6.25}$$

and $A_k^\dagger$ is the Moore-Penrose pseudoinverse (Moore, 1920; Penrose, 1955) of the matrix $A_k$. In case the matrix $A_k$ is ill-conditioned and $X$ is finite-dimensional, Fu et al. (2020) recommend solving the optimization problem (6.21) based on a singular value decomposition (SVD) of the matrix

$$\begin{bmatrix} x_k - x_{k-1} & \cdots & x_{k-m_k+1} - x_{k-m_k} \end{bmatrix} \in \mathbb{R}^{\dim(X) \times m_k}. \tag{6.26}$$

However, this approach requires a higher memory footprint than the procedure based on the pseudoinverse. Furthermore, we did not encounter ill-conditioning of the matrix $A_k$ during our numerical experiments, see Section 6.3. This suggests that the SVD approach may not be necessary for the problem at hand.
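The update (6.23) with coefficients from (6.24) and (6.25) can be sketched in a few lines for a finite-dimensional space with the Euclidean inner product. The function name `anderson`, the stopping criterion on the residual norm and the default parameters are illustrative assumptions and do not reproduce the implementation used in this thesis.

```python
import numpy as np

def anderson(F, x0, depth=3, tol=1e-8, max_iter=500):
    """Anderson acceleration of x = F(x) via (6.23)-(6.25), Euclidean inner product."""
    x = np.asarray(x0, dtype=float)
    xs, rs = [], []                       # stored iterates and residuals r = x - F(x)
    for k in range(max_iter):
        r = x - F(x)                      # single evaluation of F per iteration
        if np.linalg.norm(r) < tol:
            return x, k
        xs.append(x)
        rs.append(r)
        xs, rs = xs[-(depth + 1):], rs[-(depth + 1):]   # keep at most depth + 1 entries
        R = np.stack(rs, axis=1)          # columns are the stored residuals
        A = R.T @ R                       # Gram matrix of the residuals, cf. (6.25)
        ones = np.ones(A.shape[0])
        w = np.linalg.pinv(A) @ ones
        alpha = w / (ones @ w)            # mixing coefficients, cf. (6.24)
        x = (np.stack(xs, axis=1) - R) @ alpha          # update formula (6.23)
    return x, max_iter
```

A call such as `anderson(lambda v: M @ v + b, np.zeros(n))` with a contractive matrix `M` returns the approximate fixed point together with the number of iterations used. The ordering of the stored residuals differs from (6.25), which does not affect the result as long as iterates and residuals are stored consistently.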

At the end of this section, we wish to put Anderson acceleration into context and report on recent convergence assertions. Anderson acceleration may be interpreted as a Quasi-Newton method of multi-secant type (Fang and Saad, 2009). Applied to linear problems, Walker and Ni (2011) showed that Anderson acceleration is "essentially equivalent" to GMRES with depth $m$ (Saad and Schultz, 1986). Toth and Kelley (2015) showed that Anderson acceleration does not deteriorate the convergence rate of linearly converging fixed-point iterations. Furthermore, Evans et al. (2020) showed that Anderson acceleration improves upon the convergence rate of linearly convergent fixed-point iterations, but not for those converging quadratically. However, some caution is advised for these results, because they assume that the coefficients $\alpha_k$ remain uniformly bounded (and uniformly bounded away from zero) in $k$. This assumption is difficult to verify in practice, as it is an assumption on the Anderson acceleration procedure and not an assumption on the fixed-point mapping $F$. Furthermore, Anderson acceleration may also converge if the original mapping is not contractive (Both et al., 2019). For stationary Anderson acceleration with fixed coefficients $\alpha$, De Sterck and He (2020) provided convergence estimates for accelerating gradient descent, drawing on the similarity to Nesterov's scheme (Nesterov, 1983) for $m = 1$. In numerical tests, the authors found that the convergence rate of the stationary version provides a rough performance estimate for the classical Anderson acceleration. Using a similar strategy, Wang et al. (2021) investigated the speed-up for accelerating ADMM, which may be interpreted as a dual version of the Douglas-Rachford splitting (Giselsson and Boyd, 2017).

Of particular interest is the work of Li and Li (2020). They demonstrate that when Anderson acceleration is applied to gradient descent and strongly convex functions with Lipschitz gradient, the convergence rate is not improved compared to an optimally tuned gradient-descent scheme. At first, this result appears discouraging because other methods, for instance fast gradient solvers (Nesterov, 2004), lead to an improvement of the convergence rate. However, the problem with convergence assertions for Anderson acceleration is its finite depth $m$. Suppose, for instance, we consider solving a symmetric linear system. Suppose that we obtain a sufficiently accurate solution with MINRES (Paige and Saunders, 1975) in a certain number of steps. Then, choosing the depth $m$ at least as large as this number of steps, GMRES($m$) gives the same iterates as MINRES. As Anderson($m$) is essentially equivalent to GMRES, it also converges as quickly as MINRES. However, this speed is *not* reflected in convergence rates, because they always consider infinite sequences.

Also, the Li and Li (2020) result may be interpreted in a positive way by noticing that it may be extremely hard to tune the parameters of gradient-descent schemes in an optimum fashion. Thus, Anderson acceleration may indeed lead to a benefit in practice, also for gradient descent. As Moulinec-Suquet's basic scheme (Moulinec and Suquet, 1994; 1998) is essentially a gradient-descent method for stress operators with potential (Kabel et al., 2014; Schneider, 2017a; Bellis and Suquet, 2019), we may interpret the positive results of Gélébart's AMITEX solver (Chen et al., 2019a;b), who applied Anderson acceleration to the basic scheme, as a testament for this statement.

In this work, we shall follow a slightly different path by applying Anderson acceleration to the polarization scheme (6.14), and use it for avoiding tedious parameter calibration.

#### **6.2.3 Application to polarization schemes**

Polarization schemes (6.14) are fixed-point methods (6.18) for the nonlinear mapping

$$\begin{aligned} F\_{\gamma, a} &: L^2(Y; \text{Sym}(d)) \to L^2(Y; \text{Sym}(d)),\\ P &\mapsto aP + (1 - a) \left[ \frac{2}{\gamma} : \overline{\varepsilon} + \mathcal{Y}^0 : Z^0(P) \right], \end{aligned} \tag{6.27}$$

where we restricted to $\mathbb{C}^0 = \frac{1}{\gamma}\,\text{I}$ for simplicity. Then, the operators $\mathcal{Y}^0$ and $Z^0$ attain the form

$$\mathcal{Y}^0 = \mathcal{I} - 2\,\Gamma \quad \text{for} \quad \Gamma = \nabla^s (\text{div } \nabla^s)^{-1} \text{div} \tag{6.28}$$

and

$$Z^0 = \left(\sigma - \frac{1}{\gamma} \text{ I}\right) \left(\sigma + \frac{1}{\gamma} \text{ I}\right)^{-1}.\tag{6.29}$$

In the general setting of Section 6.2.1, for any $\gamma > 0$ and $a \in [0, 1]$, the operator $F_{\gamma,a}$ is non-expansive, i.e., Lipschitz continuous with Lipschitz constant 1. If, furthermore, $w$ is strongly convex, for any $\gamma > 0$ and $a \in [0, 1)$, $F_{\gamma,a}$ is even contractive in view of the estimate (6.17). Unfortunately, it is not always apparent how to choose the parameter pair $(\gamma, a)$ to ensure fast convergence. Even though explicit values for $(\gamma, a)$ were listed, their practical determination may be expensive. Indeed, suitable constants $\mu$ and $L$ may be read off from eigenvalue analyses based on the material tangents $\frac{\partial \sigma}{\partial \varepsilon}(x, \varepsilon(x))$ if $\sigma$ is continuously differentiable. However, if the voxel count is large, the sheer number of eigenvalue decompositions may be expensive per se.

Thus, we apply Anderson acceleration to the contractive operator $F\_{\gamma, a}$ (6.27), as discussed in section 6.2.2. A single step of the polarization scheme is summarized in Alg. 6. Notice that we do not compute $F\_{\gamma, a}(P)$, but its polarization residual $P - F\_{\gamma, a}(P)$, because the latter enters the Anderson matrix (6.25) and the Anderson update (6.23).

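**Algorithm 6** Polarization step $\text{DR}\_{\gamma, a}(P\_{\text{old}})$, see Schneider et al. (2019)

1. $P \leftarrow P\_{\text{old}}$
2. $P \leftarrow Z^0(P)$ ◁ Update estimates of $\alpha\_-$ and $\alpha\_+$
3. $\hat{P} \leftarrow \text{FFT}(P)$
4. $\hat{P}(\xi) \leftarrow \hat{P}(\xi) - 2\,\hat{\Gamma}(\xi) : \hat{P}(\xi), \quad \xi \neq 0$ ◁ Apply $\mathcal{Y}^0$ operator
5. $\hat{P}(0) \leftarrow \frac{2}{\gamma}\,\bar{\varepsilon}$ ◁ Fix average polarization
6. $P \leftarrow \text{FFT}^{-1}(\hat{P})$
7. residual $\leftarrow \frac{1}{2} \frac{\|P\_{\text{old}} - P\|\_{L^2}}{\|\langle \sigma \rangle\_Y\|}$ ◁ $\langle \sigma \rangle\_Y$ is a byproduct of $Z^0$
8. **return** residual, $(1 - a)(P\_{\text{old}} - P)$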

Alongside we compute the residual

$$\text{residual}(P) = \frac{1}{2} \frac{||P - F\_{\gamma,0}(P)||\_{L^2}}{|| \langle \sigma(\varepsilon) \rangle\_Y ||},\tag{6.30}$$

where $\langle \sigma(\varepsilon) \rangle\_Y$ denotes the average stress associated with the polarization $P = \sigma(\varepsilon) + \frac{1}{\gamma}\,\varepsilon$. The residual (6.30) measures the strain compatibility, the stress equilibrium and the average value of the strain in view of the identity (Schneider et al., 2019, Sec. 5)

$$\begin{split} \frac{1}{4} \| P - F\_{\gamma, 0}(P) \|^2 &= \| \Gamma : \sigma(\varepsilon) \|\_{L^2}^2 \\ &+ \frac{1}{\gamma^2} \left( \| (\mathcal{I} - \Gamma - \langle \cdot \rangle\_Y) \, \varepsilon \|\_{L^2}^2 + \| \langle \varepsilon \rangle\_Y - E \|\_{L^2}^2 \right). \end{split} \tag{6.31}$$

The residual (6.30) depends on the step size $\gamma$, so some care has to be taken when comparing different solution schemes. However, this phenomenon is intrinsic, because conditions on both the strain and the stress field have to be enforced, and the step size helps to convert between the different physical units of strain and stress.

The most expensive step for nonlinear material behavior is evaluating the operator $Z^0$ (6.12). In computational practice, it is often more convenient to use the equivalent expression

$$Z^0(P) = 2\left(\sigma^{-1} + \mathbb{D}^0\right)^{-1}(\mathbb{D}^0 : P) - P,\tag{6.32}$$

where $\mathbb{D}^0$ is the reference compliance. Indeed, for inelastic materials whose stress-strain relationship is governed by Hooke's law, the operator $Z^0$ may be computed by a standard call to the user-defined material law (with a modified stiffness) (Schneider et al., 2019, Sec. 6). The average stress

$$
\langle \sigma(\varepsilon) \rangle\_Y = \langle \left( \sigma^{-1} + \mathbb{D}^0 \right)^{-1} (\mathbb{D}^0 : P) \rangle\_Y \tag{6.33}
$$

is easily computed as a byproduct. The algorithm is summarized in Alg. 7, where a hat over a variable refers to the corresponding Fourier coefficients. The method may be implemented on $2(m + 1)$ strain-like fields, where $m$ denotes the depth of the Anderson acceleration. Implementations on the displacement field, as in Grimm-Strele and Kabel (2019), are not feasible because the iterates are not compatible.
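The equivalence of the two expressions (6.29) and (6.32) for $Z^0$ can be checked numerically in the simplest case. The following is a minimal sketch, assuming a linear elastic law $\sigma(\varepsilon) = \mathbb{C} : \varepsilon$ stored as a symmetric positive definite $6 \times 6$ matrix and the reference material $\mathbb{C}^0 = \frac{1}{\gamma}\,\mathbf{I}$. The stiffness matrix, the step size and all function names are hypothetical and chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                                    # components of a symmetric 3x3 tensor
B = rng.standard_normal((n, n))
C = B @ B.T + n * np.eye(n)              # hypothetical SPD stiffness (assumption)
gamma = 0.3                              # hypothetical step size, C0 = I / gamma
I = np.eye(n)


def z0_cayley(P):
    """Z0 in the form (6.29): (sigma - I/gamma)(sigma + I/gamma)^{-1} applied to P."""
    return (C - I / gamma) @ np.linalg.solve(C + I / gamma, P)


def z0_material_call(P):
    """Z0 in the form (6.32), returning the stress as a byproduct, cf. (6.33)."""
    D0 = gamma * I                       # reference compliance
    stress = np.linalg.solve(np.linalg.inv(C) + D0, D0 @ P)
    return 2.0 * stress - P, stress


P = rng.standard_normal(n)
z_a = z0_cayley(P)
z_b, stress = z0_material_call(P)
strain = np.linalg.solve(C + I / gamma, P)   # strain with P = C eps + eps / gamma
print("max difference between (6.29) and (6.32):", np.abs(z_a - z_b).max())
print("max difference between byproduct and C eps:", np.abs(stress - C @ strain).max())
```

In this linear setting, the byproduct of the second form is the stress belonging to the strain that reproduces the polarization, which is why the average stress in the residual comes at no extra cost.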

## **6.3 Numerical demonstrations**

### **6.3.1 General setup and organization**

The Anderson-accelerated polarization-based schemes, abbreviated as A2DR (Anderson-accelerated Douglas-Rachford) following Fu et al. (2020), were implemented in an in-house FFT-based micromechanics solver written in Python 3.7. Computationally expensive operations, such as applying $\Gamma^0$, evaluating the material law and the Anderson update (6.23), were realized as Cython extensions using OpenMP parallelization. For applying the fast Fourier transform, we use the FFTW library (Frigo and Johnson, 2005). Throughout, we rely on the staggered-grid discretization (Schneider et al., 2016).

**Algorithm 7** Anderson-accelerated polarization scheme ($\bar{\varepsilon}$, maxit, tol)

1. $P \leftarrow 0$ ◁ Alternative: extrapolation from previous time steps
2. initialize $\gamma$ ◁ Different choices possible, see Sec. 6.3
3. $k \leftarrow 0$
4. initialize empty list $\mathcal{L}$
5. **while** $k <$ maxit **do**
6. $k \leftarrow k + 1$
7. $P\_{\text{old}} \leftarrow P$
8. residual, polarization residual $\leftarrow \text{DR}\_{\gamma, a}(P\_{\text{old}})$ ◁ See Alg. 6; update $\alpha\_-$ and $\alpha\_+$
9. **if** residual $\leq$ tol **then**
10. **exit while loop**
11. **end if**
12. update step size $\gamma$ (may be omitted for performance reasons)
13. append the iterate and its polarization residual to the list $\mathcal{L}$
14. compute inner products of the new polarization residual with the older ones from $\mathcal{L}$
15. update the Anderson matrix (6.25)
16. determine the Anderson coefficients by equation (6.24)
17. compute new $P$ by equation (6.23)
18. discard superfluous iterates and polarization residuals from $\mathcal{L}$
19. **end while**
20. $\varepsilon \leftarrow (\mathbb{C}^0 + \sigma)^{-1}(P)$ ◁ Compute strain field
21. **return** $\varepsilon$, residual, $k$

To describe the action of the corresponding discrete $\Gamma\_h$ operator, we introduce the complex vectors

$$k\_j(\xi) = \frac{\exp(i2\pi\xi\_j/N\_j) - 1}{h\_j},\tag{6.34}$$

where $\xi$ denotes a frequency vector and $h\_j$ and $N\_j$ are the mesh spacing and voxel count in $j$-direction, respectively. Then, the associated symmetrized gradient operator $D$ of the staggered-grid discretization has the Fourier-space representation

$$
\widehat{Du}(\xi) = \frac{1}{2} \begin{bmatrix} 2k\_1 \hat{u}\_1 & -\left(\bar{k}\_1 \hat{u}\_2 + \bar{k}\_2 \hat{u}\_1\right) & -\left(\bar{k}\_1 \hat{u}\_3 + \bar{k}\_3 \hat{u}\_1\right) \\ -\left(\bar{k}\_2 \hat{u}\_1 + \bar{k}\_1 \hat{u}\_2\right) & 2k\_2 \hat{u}\_2 & -\left(\bar{k}\_2 \hat{u}\_3 + \bar{k}\_3 \hat{u}\_2\right) \\ -\left(\bar{k}\_3 \hat{u}\_1 + \bar{k}\_1 \hat{u}\_3\right) & -\left(\bar{k}\_3 \hat{u}\_2 + \bar{k}\_2 \hat{u}\_3\right) & 2k\_3 \hat{u}\_3 \end{bmatrix},\tag{6.35}
$$

where we suppress the $\xi$-dependency of $k$ and $\hat{u}$ for better readability. The action of $\Gamma\_h$ in Fourier space reads

$$\widehat{\Gamma\_h \tau}(\xi) = \begin{cases} -\widehat{D} \left( \frac{2}{\|k\|^2} \operatorname{I} - \frac{\bar{k} k^T}{\|k\|^4} \right) \widehat{D^\*} \widehat{\tau}, & \xi \neq 0, \\ 0, & \text{otherwise}, \end{cases} \tag{6.36}$$

where $\widehat{D^\*}(\xi)$ denotes the negative of the Hermitian adjoint of $\widehat{D}(\xi)$, i.e., the discrete divergence,

$$(\widehat{D^\*\tau})(\xi) = \begin{bmatrix} -\hat{\tau}\_{11}\bar{k}\_1 + \hat{\tau}\_{12}k\_2 + \hat{\tau}\_{13}k\_3\\ \hat{\tau}\_{21}k\_1 - \hat{\tau}\_{22}\bar{k}\_2 + \hat{\tau}\_{23}k\_3\\ \hat{\tau}\_{31}k\_1 + \hat{\tau}\_{32}k\_2 - \hat{\tau}\_{33}\bar{k}\_3 \end{bmatrix}. \tag{6.37}$$
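The Fourier-space operators (6.34)-(6.37) can be checked at a single frequency. The sketch below rests on two assumptions that should be verified against Schneider et al. (2016): the frequency vector is taken as the forward-difference symbol $(\exp(i2\pi\xi\_j/N\_j) - 1)/h\_j$, and the inner matrix of $\Gamma\_h$ is taken as the inverse of $\frac{1}{2}(\|k\|^2\,\mathrm{I} + \bar{k}k^T)$, which is what the matrices (6.35) and (6.37) imply. All function names are hypothetical.

```python
import numpy as np


def k_vec(xi, N, h):
    """Staggered-grid frequency vector (assumed forward-difference symbol)."""
    xi, N, h = (np.asarray(v, dtype=float) for v in (xi, N, h))
    return (np.exp(2j * np.pi * xi / N) - 1.0) / h


def D_hat(k, u):
    """Symmetrized gradient (6.35) at one frequency, returned as a 3x3 matrix."""
    eps = np.empty((3, 3), dtype=complex)
    for i in range(3):
        for j in range(3):
            if i == j:
                eps[i, i] = k[i] * u[i]
            else:
                eps[i, j] = -0.5 * (np.conj(k[i]) * u[j] + np.conj(k[j]) * u[i])
    return eps


def D_star_hat(k, tau):
    """Operator (6.37) at one frequency, returned as a 3-vector."""
    out = np.zeros(3, dtype=complex)
    for i in range(3):
        for j in range(3):
            out[i] += -np.conj(k[i]) * tau[i, i] if i == j else k[j] * tau[i, j]
    return out


def gamma_hat(k, tau):
    """Action of Gamma_h on a symmetric tensor tau at one frequency."""
    k2 = np.vdot(k, k).real
    if k2 == 0.0:
        return np.zeros((3, 3), dtype=complex)
    M = 2.0 / k2 * np.eye(3) - np.outer(np.conj(k), k) / k2**2
    return -D_hat(k, M @ D_star_hat(k, tau))


rng = np.random.default_rng(1)
k = k_vec((1, 2, 3), (8, 8, 8), (0.125, 0.125, 0.125))
u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
eps = D_hat(k, u)
# A compatible field (a symmetrized gradient) should be reproduced by Gamma_h.
print("projection defect:", np.abs(gamma_hat(k, eps) - eps).max())
```

With these conventions, $\Gamma\_h$ acts as a projector onto symmetrized gradients, which is the property the polarization step relies on when applying $\mathcal{Y}^0 = \mathbf{I} - 2\,\Gamma$.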

Our convergence criterion for the polarization-based schemes reads

$$\frac{1}{2} \frac{||P - F\_{\gamma,0}(P)||\_{L^2}}{||\langle \sigma(\varepsilon) \rangle\_Y||} \le \delta \tag{6.38}$$

using the residual defined in equation (6.30). For the strain-based schemes, which serve as performance benchmarks for A2DR, we use

$$\frac{\|\Gamma\_h : \sigma(\varepsilon)\|\_{L^2}}{\|\langle \sigma(\varepsilon)\rangle\_Y\|} \le \delta. \tag{6.39}$$

Note that the criterion for the polarization-based schemes (6.38) checks the compatibility of the strain field and the deviation from the prescribed macroscopic strain in addition to the equilibrium of the stress field checked by (6.39), as each of these conditions is only satisfied upon convergence. Unless explicitly stated otherwise, we solve to a tolerance of $\delta = 10^{-5}$. In case of multiple load steps, we use an affine-linear extrapolation (Moulinec and Suquet, 1998) as an initial guess for our solution field.

The computations in Sec. 6.3.2–6.3.5 were performed on a desktop computer with 32 GB RAM and a 6-core Intel i7-8700K CPU. The computations for Sec. 6.3.6 ran on a workstation with 512 GB RAM and two 12-core Intel Xeon(R) Gold 6146 CPUs.

These computational investigations are intended to demonstrate the power and versatility of A2DR, and are organized as follows. We start with a two-dimensional example in section 6.3.2, which permits us to study the dependence on the involved algorithmic parameters. In three dimensions, studies with large depth are prohibited by memory constraints. In section 6.3.3, we study a three-dimensional example with nonlinear constituents, but finite material contrast. The example serves as a standard benchmark for FFT-based solvers (Schneider, 2019a; 2020a). In section 6.3.4, we study a linear elastic material including pores. Porous microstructures are known to be difficult for polarization methods, because the optimum step size $\gamma = 1/\sqrt{\alpha\_- \alpha\_+}$ is not sensible for $\alpha\_- = 0$. In section 6.3.5, we study a metal-matrix composite (MMC) undergoing ratcheting. This example is challenging for two reasons. First, the underlying material model is *not* a generalized standard material, as the material tangent is not symmetric. In particular, the convergence theory discussed in section 6.2 does not apply. Second, the material tangent becomes increasingly ill-conditioned for increased loading. Finally, in section 6.3.6, we study a polycrystalline microstructure. The associated single-crystal constitutive laws are notoriously expensive to evaluate. Thus, Newton-Krylov methods are usually the preferred choice for this type of problem, as iterations of the linearized problem require much less computational effort than the nonlinear evaluation. Furthermore, the specific material law we utilize involves a softening behavior. In particular, the example is not covered by the available convergence theory.

### **6.3.2 Continuous glass-fiber reinforced polymer**

**Figure 6.1:** Continuous glass-fiber reinforced polymer. (a) Microstructure. (b) Accumulated plastic strain at 5% uniaxial extension.

As our first example, we consider polyamide 6.6 continuously reinforced by glass fibers with a volume fraction of 30%. The microstructure, see Fig. 6.1, is modeled as a two-dimensional periodic cell, generated by the adaptive shrinking cell algorithm of Torquato and Jiao (2010). The resulting structure contains 200 fibers and is resolved by 512 × 512 pixels. The glass fibers are modeled as isotropic linear elastic and the polyamide matrix is governed by $J\_2$-elastoplasticity with isotropic hardening, see Sec. 3.3 in Simo and Hughes (1998).

**Table 6.1:** Glass-fiber reinforced polyamide: Material parameters of fibers and matrix (Doghri et al., 2011)

Following Doghri et al. (2011), we use a linear-exponential hardening function for polyamide

$$
\sigma\_Y = \sigma\_0 + k\_1 p + k\_2 (1 - \exp(-mp)),
\tag{6.40}
$$

where $\sigma\_0$ denotes the initial yield stress, $k\_1$ is the asymptotic hardening modulus and $k\_2 = \sigma\_\infty - \sigma\_0$ specifies the difference between the saturated and the initial yield strength for $k\_1 = 0$. The material parameters are listed in Tab. 6.1. Please note that a similar microstructure was considered in Ch. 3. Compared to Ch. 3, however, twice the volume fraction and four times the resolution are investigated in the present section. In particular, we observe a more pronounced plastification caused by the higher filler fraction.
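The hardening function (6.40) is straightforward to evaluate. The snippet below is a minimal sketch with placeholder parameter values that are not the values of Tab. 6.1. It illustrates the two limiting regimes: the initial yield stress at $p = 0$ and, for $k\_1 = 0$, saturation at $\sigma\_0 + k\_2$.

```python
import math


def yield_stress(p, sigma_0, k_1, k_2, m):
    """Linear-exponential hardening (6.40); p is the accumulated plastic strain."""
    return sigma_0 + k_1 * p + k_2 * (1.0 - math.exp(-m * p))


# Placeholder parameters (hypothetical, for illustration only).
params = dict(sigma_0=30.0, k_1=100.0, k_2=20.0, m=300.0)

for p in (0.0, 0.005, 0.05):
    print(f"p = {p:5.3f}: sigma_Y = {yield_stress(p, **params):.3f}")
```

For large $p$, the exponential term has saturated and the yield stress grows linearly with slope $k\_1$, which is why $k\_1$ is referred to as the asymptotic hardening modulus.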

For this comparatively small two-dimensional example, we investigate the convergence rate of A2DR with respect to the chosen depth $m$. In particular, we are interested in the sensitivity of the results with respect to the chosen algorithmic parameters. As shown in Sec. 6.2.1, the convergence rate of the non-accelerated polarization-based schemes depends on the choice of the step size $\gamma$ and the damping parameter $a$. Indeed, it was shown that, in practice, choosing a suboptimal step size may increase the necessary iteration counts by orders of magnitude, see Sec. 7 in Schneider et al. (2019). Hence, we aim to find a suitable depth $m$ for which the dependence of the performance on $\gamma$ and $a$ is eliminated or, at least, reduced.

**Figure 6.2:** Continuous glass-fiber reinforced polymer with elastic matrix. (a) Depth vs. iteration count (left) and run-time (right) for varying damping parameter $a$ and $\gamma = 2/(\alpha\_- + \alpha\_+)$. (b) Depth vs. iteration count for varying step size $\gamma$ and fixed $a = 0.25$.

For a start, we consider a linear problem, where both glass fibers and polyamide matrix are modeled as linear elastic. Using the formulation of mixed boundary conditions by Kabel et al. (2016), the microstructure is subjected to 5% uniaxial extension. For a fixed step size $\gamma = 2/(\alpha\_- + \alpha\_+)$, the required iteration counts and total run-times of A2DR for different values of $m$ and $a$ are plotted in Fig. 6.2a.

As a general trend, we observe that the required number of iterations decreases up to a depth of $m = 4$ and stagnates afterwards. This decrease is not necessarily monotone, see the plot for $a = 0.5$. The total run-times follow a similar trend, as Anderson acceleration introduces only a small overhead. In particular, for this section, it suffices to investigate either the iteration count *or* the timing. We revisit this topic for the larger microstructure in Sec. 6.3.3, where the computational cost of the update steps (6.23)–(6.25) is more pronounced.

Taking a look at the effect of the damping parameter $a$ for $m = 0$ (i.e., when Anderson acceleration is deactivated), we note that the iteration counts range from 205 for Monchiet-Bonnet's choice $a = 0.25$ (Monchiet and Bonnet, 2012) to 310 for $a = 0.5$, corresponding to Michel-Moulinec-Suquet's accelerated scheme (Michel et al., 2001). For depths $m \geq 4$, the difference between these choices is largely eliminated, and A2DR converges in roughly 50 iterations for all damping factors considered. These results indicate that, in addition to the faster convergence, A2DR relieves the user of the task of selecting the damping factor carefully.

Next, we take a look at the influence of the step size $\gamma$ for a fixed damping factor of $a = 0.25$, see Fig. 6.2b. Starting at $m = 0$, the slowest step-size choice requires about 10 times as many iterations to converge as the theoretically optimum choice $\gamma = 1/\sqrt{\alpha\_- \alpha\_+}$. Activating Anderson acceleration reduces this performance gap significantly. Up to a depth of $m = 3$, the iteration counts decrease for all step sizes and stagnate afterwards. For the optimum step size, the effect is least pronounced, with a decrease from 46 iterations for $m = 0$ to 35 iterations for $m = 3$. However, for all other step sizes, the iteration counts are significantly reduced, leading to a factor of less than 2 between the slowest and fastest choice for $m \geq 3$. Notably, the iteration count for $\gamma = 1/\alpha\_-$ matches that of the theoretically optimum choice for $m = 3$ and is even slightly lower for higher depths.

In conclusion, we observe that the performance of A2DR with a depth of $m = 4$ is largely independent of the damping factor $a$. The influence of the step size $\gamma$ on performance does not vanish, but is significantly reduced compared to the classical polarization-based schemes without Anderson acceleration. In particular, step-size choices such as $\gamma = 1/\alpha\_+$ or $\gamma = 2/(\alpha\_- + \alpha\_+)$ become competitive using A2DR, making polarization-based schemes available for materials where $\alpha\_-$ is close or equal to 0, see Sec. 6.3.4 and 6.3.5.

To check whether the results of the linear elastic setting carry over to nonlinear problems, we consider the case where the polyamide matrix is governed by $J\_2$-elastoplasticity. The boundary conditions are imposed as for the linear elastic case, i.e., 5% uniaxial extension, applied in 50 equidistant load steps. For computing the step size, $\alpha\_-$ and $\alpha\_+$ are estimated in the first iteration of each load step based on the tangent field, see remark 2 in Sec. 6.2.1. The step size is then kept fixed for the remainder of the load step.

**Figure 6.3:** Continuous glass-fiber reinforced polymer with elastoplastic matrix. (a) Iteration count vs. depth for step size $\gamma = 2/(\alpha\_- + \alpha\_+)$. (b) Iteration count vs. depth for different step-size choices and fixed damping factor $a = 0.25$, including a zoom on the right-hand side.

Using this strategy, the effect of Anderson acceleration with respect to different damping factors and step-size choices is qualitatively similar to the linear elastic setting, see Fig. 6.3a and Fig. 6.3b. For depths $m \geq 4$, the impact of the damping factor is largely eliminated and the iteration counts stabilize. However, the difference between the investigated step-size choices is more pronounced in the nonlinear case. For the unaccelerated schemes, the slowest step-size choice requires 33 times more iterations than the optimum choice $\gamma = 1/\sqrt{\alpha\_- \alpha\_+}$. At a depth of $m = 4$, the factor between the slowest and fastest step size is reduced to 4.

A few synoptic remarks are in order. As an alternative strategy for computing the step size $\gamma$, we investigated the approach of Schneider et al. (2019), where $\alpha\_-$ and $\alpha\_+$ are computed in every iteration and $\gamma$ is subsequently updated based on the minimum value of $\alpha\_-$ and the maximum value of $\alpha\_+$ over *all past iterates*. With Anderson acceleration, no positive effect on the convergence behavior was observed in practice, except for very large nonlinear load steps. Furthermore, the overall run-time suffered due to the overhead of computing $\alpha\_-$ and $\alpha\_+$. Thus, we use the simpler strategy of updating $\alpha\_-$ and $\alpha\_+$ at the beginning of each load step for the remainder of the manuscript.

As a second remark, based on the results up to this point, choosing $\gamma = 1/\alpha\_-$ seems to be competitive when using A2DR, as the resulting performance was often similar to or better than that for the theoretically optimum value of $\gamma = 1/\sqrt{\alpha\_- \alpha\_+}$. However, whenever $\alpha\_-$ is unknown or zero, neither $\gamma = 1/\alpha\_-$ nor $\gamma = 1/\sqrt{\alpha\_- \alpha\_+}$ can be used. In addition, $\gamma = 1/\alpha\_-$ was found to result in low convergence rates for high accuracy. Hence, we prefer $\gamma = 1/\sqrt{\alpha\_- \alpha\_+}$ where applicable.

Larger values of the depth $m$, up to 200, were tested for A2DR, but led to no further decrease of the iteration count. Thus, for the sake of readability, these results were omitted from the respective plots of this section. Interestingly, this strongly differs from the behavior observed for the Anderson-accelerated basic scheme, see Sec. 3.4.2, where iteration counts were found to decrease up to depths of $m = 50$ (albeit at the cost of slower overall performance, due to computational overhead). This further demonstrates the difficulty of finding the optimum step size of the basic scheme. As exemplified by adaptive step-size selection schemes, e.g., by Barzilai and Borwein (1988) or Malitsky and Mishchenko (2020), a constant step size of $2/(\alpha\_- + \alpha\_+)$ does not yield the best possible performance for gradient descent when time to solution is the primary objective. In contrast, using the optimum constant step size for polarization-based schemes appears to leave less room for improvement.

### **6.3.3 Short glass-fiber reinforced polymer**

Motivated by the results for the two-dimensional example of the last section, we compare the performance of A2DR to other modern FFT-based solvers on a larger three-dimensional microstructure that serves as a recurring benchmark example for FFT-based solvers, see Sec. 3.3 in Schneider (2019a). More precisely, we consider a polyamide matrix reinforced by 1140 glass fibers with an aspect ratio of 30, filling 20% of the volume, see Fig. 6.4. The microstructure was generated using the sequential addition and migration algorithm (Schneider, 2017b) and resolved by $256^3$ voxels. The fiber orientation in the resulting microstructure is close to unidirectional with a second-order fiber-orientation tensor (Advani and Tucker, 1987) of $\operatorname{diag}(0.8, 0.1, 0.1)$. Throughout, we use the material parameters listed in Tab. 6.1, as in Sec. 6.3.2.

**Figure 6.4:** Short glass-fiber reinforced polymer - Microstructure and von Mises strain field for uniaxial extension

First, we restrict to the linear elastic problem with an applied loading of 5% uniaxial extension in fiber direction. We solve up to a high accuracy of $\delta = 10^{-10}$ to get a better picture of the convergence rate of the investigated FFT-based solvers. We compare the performance of A2DR with $a = 0.25$, $m = 4$ and the optimum step size $\gamma = 1/\sqrt{\alpha\_- \alpha\_+}$ to Monchiet-Bonnet's scheme with $\gamma = 1/\sqrt{\alpha\_- \alpha\_+}$ (Monchiet and Bonnet, 2012), the conjugate gradient (CG) method (Zeman et al., 2010; Brisard and Dormieux, 2010), the Barzilai-Borwein (BB) basic scheme (Barzilai and Borwein, 1988; Schneider, 2019a) and the original basic scheme by Moulinec and Suquet (1998). Throughout, the optimum algorithmic parameters are used for the Lippmann-Schwinger solvers. To be precise, we choose $\gamma = 2/(\alpha\_- + \alpha\_+)$ for the reference material of the basic scheme and the CG method (where it does not matter (Zeman et al., 2010)). The Barzilai-Borwein method is initialized with $\gamma = 2/(\alpha\_- + \alpha\_+)$ as well and adaptively selects its step size after the first iteration (Schneider, 2019a).

In Fig. 6.5a, we see the excellent convergence rate of A2DR, which reaches the prescribed tolerance with the lowest number of iterations among all investigated solvers. Most notably, the performance of A2DR and CG is nearly identical. Interestingly, for the same benchmark, see Sec. 3.3 in Schneider (2019a), the author already observed that the (non-accelerated) Eyre-Milton method mirrored the performance of CG for low accuracy. Using Anderson acceleration, this advantage is preserved up to the investigated tolerance of $\delta = 10^{-10}$. Both the Barzilai-Borwein method and the Monchiet-Bonnet scheme exhibit similar convergence rates up to an accuracy of $10^{-7}$. Subsequently, the residual of the Barzilai-Borwein method decreases rapidly, leading to a lower final iteration count. Note, however, that this is only a fortuitous byproduct of the inherently non-monotone convergence behavior of the algorithm (a rapid decrease of the residual from $10^{-2}$ to $10^{-5}$ may also be observed around iteration 50). In the aforementioned numerical experiment in Sec. 3.3 of Schneider (2019a), the final iteration count of the Barzilai-Borwein method was very close to the one we observe for Monchiet-Bonnet's method, which is roughly 50% higher than the iteration counts of A2DR and CG. The basic scheme is not competitive, being an order of magnitude slower than the other investigated schemes.

**Figure 6.5:** Short glass-fiber reinforced polymer - Performance comparison for various solution schemes

Taking a look at the overall performance in terms of computation time, the ranking of the solvers changes slightly, see Fig. 6.5b. Essentially, the Barzilai-Borwein method and A2DR switch places, with one being slightly faster and the other being slightly slower than CG. This is due to the lower computational cost per iteration of the Barzilai-Borwein method, see Tab. 6.2. Using the complexity-reduction trick in Sec. 6 of Schneider et al. (2019), the computational effort of evaluating the material law, applying the $\Gamma^0$ operator and computing the residual is very similar for all investigated algorithms. Whereas an iteration of the Barzilai-Borwein method only requires a single inner product and one addition of two fields on top of the aforementioned steps, the A2DR update involving equations (6.23)-(6.25) requires computing $m + 1$ inner products, solving a linear system of size $m + 1$ and summing $m + 1$ fields. As a consequence, the cost per iteration for A2DR with $m = 4$ is roughly 50% higher than for the Barzilai-Borwein method.
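The three ingredients of the update (inner products, a small linear system and a sum of fields) can be made concrete on a generic fixed-point problem. The following is a minimal sketch of one common variant of Anderson acceleration with a fixed depth, written for plain NumPy vectors. It is not the implementation used in this work, and it need not coincide term by term with equations (6.23)-(6.25). The function name and the test problem are hypothetical.

```python
import numpy as np


def anderson_fixed_point(F, x0, depth=4, tol=1e-10, maxit=1000):
    """Solve x = F(x) by Anderson acceleration with the given depth."""
    x = np.array(x0, dtype=float)
    xs, rs = [], []                      # stored iterates and residuals
    for it in range(1, maxit + 1):
        r = x - F(x)                     # residual of the fixed-point map
        if np.linalg.norm(r) <= tol:
            return x, it
        xs.append(x)
        rs.append(r)
        xs, rs = xs[-(depth + 1):], rs[-(depth + 1):]   # discard old entries
        x_new = x - r                    # plain fixed-point step F(x)
        if len(rs) > 1:
            dR = [r - r_old for r_old in rs[:-1]]
            dX = [x - x_old for x_old in xs[:-1]]
            # Inner products. This sketch rebuilds the whole small matrix;
            # an efficient code only adds the entries of the newest residual.
            gram = np.array([[np.vdot(a, b) for b in dR] for a in dR])
            rhs = np.array([np.vdot(a, r) for a in dR])
            # Small linear system for the mixing coefficients.
            coeff = np.linalg.lstsq(gram, rhs, rcond=None)[0]
            # Sum of stored fields.
            for c, dx, dr in zip(coeff, dX, dR):
                x_new = x_new - c * (dx - dr)
        x = x_new
    return x, maxit


# Hypothetical linear contraction with contraction factor 0.9.
A = np.diag(np.linspace(0.0, 0.9, 20))
b = np.ones(20)
x, iterations = anderson_fixed_point(lambda v: A @ v + b, np.zeros(20))
print("iterations:", iterations)
```

The cost of the small system is independent of the number of voxels, so for large fields the overhead is dominated by the inner products and the field sums, in line with the per-iteration cost discussed above.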

**Table 6.2:** Short glass-fiber reinforced polymer - Computational cost per iteration for the investigated solution schemes

Last but not least, we consider the nonlinear problem with $J\_2$-elastoplastic matrix behavior. The loading of 5% uniaxial extension is applied in 50 equidistant steps. We add the Newton-CG method (Gélébart and Mondon-Cancel, 2013; Kabel et al., 2016) to our list of investigated schemes in place of the linear CG method. To be precise, we use Dong's line-search criteria (Dong, 2010) for controlling the step size of the Newton update and prescribe Eisenstat-Walker's forcing-term choice 2 (Eisenstat and Walker, 1996) as tolerance for the linear system, see Ch. 3. Note that, for Newton-CG, we take the sum of Newton iterations and linear CG iterations for computing the iteration count. Taking a look at Fig. 6.6, we see that the polarization-based schemes outperform the (Quasi-)Newton methods and the basic scheme. A2DR is fastest, followed by Monchiet-Bonnet's method, whose run-time is 35% higher. Newton-CG and the Barzilai-Borwein method perform similarly, taking roughly twice as long as A2DR to finish.

In conclusion, we see that Anderson acceleration further improves the performance of the already powerful polarization-based methods for finitely-contrasted materials. A depth of $m = 4$ emerges as a reasonable choice for both linear and nonlinear problems. Note that the improved performance of the Anderson-accelerated polarization schemes comes at a price. Using A2DR with a depth of $m = 4$ requires the storage of 10 strain-like fields, not counting internal variables. Compared to 1 strain field for the basic scheme, 2 for the Barzilai-Borwein method and 8.5 for Newton-CG (when storing the tangent field), this represents a rather large memory footprint. This is exacerbated by the fact that polarization-based schemes do not permit a displacement-based implementation, which can roughly halve the memory requirements of the aforementioned strain-based methods.
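To put these field counts into perspective, the memory footprint can be estimated with a short calculation. The sketch below assumes that a strain-like field stores six independent components per voxel in double precision and ignores FFT padding, internal variables and the microstructure itself. These storage assumptions and the function name are illustrative and not taken from the implementation.

```python
def field_gib(voxels, components=6, bytes_per_value=8):
    """Memory of one strain-like field in GiB (assumed storage layout)."""
    return voxels * components * bytes_per_value / 2**30


voxels = 256**3                          # resolution of the benchmark
depth = 4
field_counts = {
    "basic scheme": 1,
    "Barzilai-Borwein": 2,
    "Newton-CG": 8.5,
    "A2DR": 2 * (depth + 1),             # 2(m + 1) strain-like fields
}
for name, count in field_counts.items():
    print(f"{name}: {count * field_gib(voxels):.2f} GiB")
```

Under these assumptions, a single field for $256^3$ voxels occupies 0.75 GiB, so the ten fields of A2DR with $m = 4$ amount to 7.5 GiB.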

**Figure 6.6:** Short glass-fiber reinforced polymer - Performance comparison for various solution schemes

**Table 6.3:** Sand-core - Material parameters of sand grains and binder (Daphalapurkar et al., 2011; Wichtmann and Triantafyllidis, 2010; Sanditov et al., 2009)


### **6.3.4 A sand-core microstructure**

For this example, we investigate a sand-core microstructure, discretized by $256^3$ voxels. The structure consists of 64 sand grains with a volume fraction of 58.58% held together by an inorganic binder with 1.28% volume fraction, see Fig. 6.7. For a detailed treatment of the microstructure generation and the linear elastic properties of the material, we refer to Schneider et al. (2018). The material parameters of the constituents are listed in Tab. 6.3.

The sand-core microstructure represents a porous material for which $\alpha\_-$ is usually unknown (Schneider, 2020b). Using the natural estimate $\alpha\_- = 0$ renders the step size $\gamma = 1/\sqrt{\alpha\_- \alpha\_+}$, which is optimal for the polarization-based schemes, inapplicable. Hence, in numerical experiments, their performance for solving this type of problem was found to be poor (Schneider, 2019a), exhibiting even lower convergence rates than the basic scheme. In contrast to fast gradient solvers (Schneider, 2017a; 2020a) or (Quasi-)Newton methods (Gélébart and Mondon-Cancel, 2013; Schneider, 2019a; Wicht et al., 2020b), this has prevented using polarization-based schemes as general-purpose algorithms.

**Figure 6.7:** Sand-core structure - Microstructure and von Mises strain field for uniaxial extension

In this context, we investigate whether A2DR can increase the efficiency of polarization-based schemes to competitive levels. To this end, the sand-core microstructure is subjected to 1% uniaxial extension and we solve up to a tolerance of $\delta = 10^{-10}$ to determine the convergence rate of the algorithms. We fix $a = 0.25$ and consider the step sizes available for $\alpha\_- = 0$, i.e., the conservative choice $\gamma = 1/\alpha\_+$ and the optimum step size of the basic scheme $\gamma = 2/(\alpha\_- + \alpha\_+)$, i.e., $\gamma = 2/\alpha\_+$ for $\alpha\_- = 0$.
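The step-size choices discussed in this chapter can be collected in a small helper that drops the choices which become undefined for porous materials. This is a minimal sketch. The notation $\alpha\_-$ and $\alpha\_+$ for the lower and upper bounds on the material tangent, the function name and the labels are assumptions made for illustration.

```python
import math


def available_step_sizes(alpha_minus, alpha_plus):
    """Step sizes gamma that are well defined for the given tangent bounds."""
    choices = {
        "conservative": 1.0 / alpha_plus,
        "basic scheme": 2.0 / (alpha_minus + alpha_plus),
    }
    if alpha_minus > 0.0:
        # These choices divide by alpha_minus and fail for porous materials.
        choices["optimal"] = 1.0 / math.sqrt(alpha_minus * alpha_plus)
        choices["inverse lower bound"] = 1.0 / alpha_minus
    return choices


print(available_step_sizes(0.0, 4.0))    # porous case: two choices remain
print(available_step_sizes(1.0, 4.0))    # finite contrast: all four choices
```

For $\alpha\_- = 0$, the basic-scheme step size reduces to $2/\alpha\_+$, i.e., twice the conservative choice.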

As for finitely contrasted media, Anderson acceleration substantially improves the performance for the investigated step-size choices, see Fig. 6.8a. In agreement with the results by Schneider (2019a), Monchiet-Bonnet's method with the step size of the basic scheme slows down considerably over the course of the computation and fails to converge within 1000 iterations. In contrast, A2DR with the same step size exhibits a linear convergence rate, requiring fewer than 300 iterations to reach the prescribed tolerance. Using $\gamma = 1/\alpha\_+$ roughly triples iteration counts and run-times.

**Figure 6.8:** Sand-core structure - Residual vs. iteration. (a) Different step-size choices and depths.

Fig. 6.8b reveals that, using Anderson acceleration, polarization-based schemes become competitive with the fastest available FFT-based solvers for porous media. More precisely, A2DR with $m = 4$ and $\gamma = 2/(\alpha\_- + \alpha\_+)$ ends up just between CG and the Barzilai-Borwein method in terms of iteration count, requiring 25% more iterations than the former and 25% fewer than the latter. In terms of overall performance, it matches the Barzilai-Borwein method, with both methods running about 30% longer than CG.

To summarize, Anderson acceleration makes the computational power of polarization-based schemes available for treating porous microstructures, considerably broadening their range of application. The optimum step size of the basic scheme, $\gamma = 2/(\alpha\_- + \alpha\_+)$, emerges as a decent general-purpose choice for materials with both finite and infinite contrast.

**Figure 6.9:** Metal-matrix composite - Microstructure and von Mises strain field for a cyclic uniaxial stress loading

### **6.3.5 Metal-matrix composite under cyclic loading**

For our next example, we investigate the cyclic behavior of a metal-matrix composite (MMC). The microstructure consists of 50 spherical ceramic particles with a volume fraction of 30% embedded in a metal matrix, see Fig. 6.9a. For the particle placement, we relied on the random sequential addition algorithm (Widom, 1966), and the resulting microstructure was resolved by $128^3$ voxels.

The ceramic inclusions are assumed to be linear elastic, whereas the material behavior of the matrix is governed by a $J\_2$-elastoplasticity model with kinematic hardening, see, for instance, Chaboche (1989; 2008).

For simplicity, we neglect isotropic hardening, i.e., we model the yield stress $\sigma\_Y$ to be independent of the equivalent plastic strain $p$. The governing equations for the model are given by Hooke's law

$$
\sigma = \mathbb{C} : (\varepsilon - \varepsilon\_{\mathrm{p}}) \quad \text{with} \quad \varepsilon = \varepsilon\_{\mathrm{e}} + \varepsilon\_{\mathrm{p}}, \tag{6.41}
$$

the associated flow rule

$$\dot{\varepsilon}\_{\mathrm{p}} = \gamma \frac{\partial \phi}{\partial \sigma} \quad \text{with} \quad \phi(\sigma, X) = \sqrt{\frac{3}{2}} \| \mathbb{P}\_2 : (\sigma - X) \| - \sigma\_Y,\tag{6.42}$$

the Karush-Kuhn-Tucker conditions

$$
\phi(\sigma, X) \le 0, \quad \gamma \ge 0, \quad \gamma \,\phi(\sigma, X) = 0,\tag{6.43}
$$

and a kinematic hardening law, defining the evolution of the backstress $X$, i.e., the center of the elastic domain.

Over time, a multitude of formulations has been proposed for the latter, see, for instance, the reviews by Abdel-Karim (2005) or Kang (2008) for an overview. For the present study, we choose the kinematic hardening law by Chaboche et al. (1979)

$$
X = \sum\_{i=1}^{M} X\_i, \quad \dot{X}\_i = \frac{2}{3} h\_i \dot{\varepsilon}\_{\mathrm{p}} - \zeta\_i X\_i \dot{p}, \tag{6.44}
$$

where $X$ is decomposed into multiple parts, each following the classical Frederick-Armstrong rule (Frederick and Armstrong, 2007). Using a backward Euler time discretization, we rely on the fixed-point algorithm by Kobayashi and Ohno (2002) for the implementation of the material model. Note that the tangent stiffness for this model is *not symmetric* (Kobayashi and Ohno, 2002). For using the Newton-CG method, Kobayashi and Ohno (2002) suggest using the symmetrized tangent when solving the linear system. Following their recommendation, we also estimate $\alpha\_-$ based on the symmetrized tangent, whereas $\alpha\_+$ is fixed by the elastic stiffness of the materials. The material parameters for both constituents are listed in Tab. 6.4.
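The competition between the linear hardening term and the dynamic recovery term in (6.44) can be illustrated with a one-dimensional analogue. The sketch below integrates the scalar rule $\dot{X} = h\,\dot{\varepsilon}\_{\mathrm{p}} - \zeta X |\dot{\varepsilon}\_{\mathrm{p}}|$ for a single backstress with a backward Euler step. It is a simplified stand-in for the tensorial law, not the fixed-point algorithm of Kobayashi and Ohno (2002), and the parameter values are hypothetical.

```python
def backstress_step(X, d_eps_p, h, zeta):
    """Backward Euler step of the scalar rule dX = h d_eps_p - zeta X |d_eps_p|."""
    return (X + h * d_eps_p) / (1.0 + zeta * abs(d_eps_p))


h, zeta = 5000.0, 100.0                  # hypothetical hardening parameters
X = 0.0
history = []
for _ in range(2000):                    # monotonic plastic loading
    X = backstress_step(X, 1e-3, h, zeta)
    history.append(X)
print("backstress after loading:", X)
```

Under monotonic plastic loading, the recovery term balances the hardening term and the scalar backstress saturates at $h/\zeta$. Because recovery depends on the current backstress, loading and unloading branches are not symmetric for a non-zero mean stress, which is the mechanism behind ratcheting in this class of models.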



We consider a cyclic uniaxial stress loading with a mean stress value of 100 MPa and an amplitude of 300 MPa. The loading is applied over 4 cycles, discretized by 30 equidistant steps per cycle. Each step is solved up to a tolerance of $\delta = 10^{-5}$. Note that, in contrast to Sec. 6.3.3, the prescribed hardening law leads to a stress operator which is neither derived from a potential nor strictly monotone. Indeed, in our numerical experiments, the lower bound $\alpha\_-$ of the tangent field quickly approached zero during plastification, preventing the use of the optimum step size $\gamma = 1/\sqrt{\alpha\_- \alpha\_+}$. Hence, in combination with the non-monotonic loading, the metal-matrix composite with kinematic hardening constitutes a challenging benchmark, which is not covered by the theoretical treatment of Sec. 6.2.

**Figure 6.10:** Metal-matrix composite - Performance comparison for various solution schemes

For evaluating the performance of the solvers, we fix the algorithmic parameters of A2DR to a damping of 0.25, a step size of $2/(\alpha\_- + \alpha\_+)$ and a depth of 4, based on the results of the previous sections. Comparing the iteration counts and run-times of the different FFT-based solution schemes in Fig. 6.10, we see that A2DR performs admirably. It requires the lowest number of total iterations and ties with the Barzilai-Borwein method for the fastest computation time. Monchiet-Bonnet's method, with the same algorithmic parameters as A2DR (except for a depth of 0), takes more than twice as long to converge, with an overall performance comparable to the Newton-CG method. Note that the effect of Anderson acceleration, while still significant, is less pronounced compared to the numerical experiment on a porous structure in Sec. 6.3.4. This is due to the cyclic loading, where plastic flow in the matrix material is only activated for high stress magnitudes, see Fig. 6.9b. In the elastic loading and unloading steps, all solvers converge in a single iteration owing to the affine-linear extrapolation. Hence, the superior performance of A2DR is only realized in roughly half of all load steps. For the same reason, the basic scheme performs comparatively well for this problem, with a final iteration count just over three times higher than A2DR.
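The role of the affine-linear extrapolation can be illustrated with a short Python sketch. It assumes that the initial guess for a new load step is extrapolated in time from the two previously converged strain fields. The function name and the array layout are hypothetical, and the exact variant used by the solvers is not restated here.

```python
import numpy as np

def extrapolate_initial_guess(eps_prev, eps_curr, t_prev, t_curr, t_next):
    """Affine-linear extrapolation of the last two converged fields to t_next."""
    ratio = (t_next - t_curr) / (t_curr - t_prev)
    return eps_curr + ratio * (eps_curr - eps_prev)

# Synthetic example: a field depending affinely on time, as in an elastic step
rng = np.random.default_rng(0)
eps_rate = rng.standard_normal((4, 4, 6))   # hypothetical strain field, 6 components
eps_offset = rng.standard_normal((4, 4, 6))
field = lambda t: eps_offset + t * eps_rate

guess = extrapolate_initial_guess(field(1.0), field(2.0), 1.0, 2.0, 3.0)
```

If the response is affine in the load, as in purely elastic loading or unloading, the extrapolated guess already coincides with the solution, which is consistent with the single-iteration convergence noted above.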

**Figure 6.11:** NiAl-9Mo - Microstructure and creep behavior

### **6.3.6 Directionally solidified NiAl-9Mo**

For our final example, we turn to a directionally solidified NiAl-9Mo eutectic alloy used as a benchmark problem in Sec. 4.6.3. The considered microstructure with 84 unidirectionally aligned fibers with square cross-section was generated using the random sequential addition algorithm (Widom, 1966) and resolved by 1200 × 160 × 160 voxels. The fibers have an aspect ratio of 100 (Haenschke et al., 2010) and take up 14% of the overall volume. Following Albiez et al. (2016a), the behavior of fibers and matrix is governed by a single-crystal elastoviscoplasticity model based on Hooke's law

$$
\sigma = \mathbb{C} : (\varepsilon - \varepsilon\_{\mathsf{p}}), \quad \text{with} \quad \varepsilon = \varepsilon\_{\mathsf{e}} + \varepsilon\_{\mathsf{p}} \tag{6.45}
$$

and the classical power-law flow rule by Hutchinson (1976)

$$\dot{\varepsilon}\_{\mathsf{P}} = \sum\_{\alpha=1}^{N} \dot{\gamma}\_{\alpha} \, d\_{\alpha} \otimes^{s} n\_{\alpha} \quad \text{with} \quad \dot{\gamma}\_{\alpha} = \dot{\gamma}\_{0} \, \text{sgn}(\tau\_{\alpha}) \left| \frac{\tau\_{\alpha}}{\tau^{\mathsf{F}}} \right|^{m}, \tag{6.46}$$

where $d\_{\alpha}$ and $n\_{\alpha}$ denote the slip direction and the slip-plane normal, respectively, $\dot{\gamma}\_{0}$ is the reference slip rate and $\tau^{\mathsf{F}}$ is the yield stress. The operator $\otimes^{s}$ denotes the symmetrized dyadic product, i.e., $d\_{\alpha} \otimes^{s} n\_{\alpha} = (d\_{\alpha} \otimes n\_{\alpha} + n\_{\alpha} \otimes d\_{\alpha})/2$, and the shear stress in a slip system is computed by $\tau\_{\alpha} = \sigma : (d\_{\alpha} \otimes n\_{\alpha})$. The matrix is assumed to behave perfectly plastic, i.e., the yield stress is constant, $\tau^{\mathsf{F}} = \tau^{\mathsf{F}}\_{0}$. For the molybdenum fibers, Albiez et al. (2016a) proposed the softening law

$$
\tau^{\mathsf{F}} = \frac{\tau\_\infty}{d\sqrt{\rho} + 1} \tag{6.47}
$$

in terms of the dislocation density $\rho$, with maximum yield stress $\tau\_\infty$ and a characteristic length parameter $d$. The authors used the storage-recovery model by Kocks and Mecking (2003) for the evolution of the dislocation density, which permits expressing $\rho$ as an explicit function of the accumulated plastic slip $\gamma$, defined via $\dot{\gamma} = \sum\_{\alpha=1}^{N} |\dot{\gamma}\_{\alpha}|$,

$$\rho = \rho\_s \left[ 1 - \exp\left( -\frac{1}{2} k\_2 \gamma \right) \left( 1 - \sqrt{\frac{\rho\_0}{\rho\_s}} \right) \right]^2,\tag{6.48}$$

upon integration. The parameters of the materials are listed in Tab. 6.5.

**Table 6.5:** NiAl-9Mo - Material parameters of fibers and matrix at 1000 °C (Albiez et al., 2016a)
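To make the interplay of (6.47) and (6.48) concrete, the following Python sketch evaluates the dislocation density and the resulting fiber yield stress as functions of the accumulated plastic slip. The function names and all numerical values are placeholders chosen for illustration. They are not the parameters of Tab. 6.5.

```python
import numpy as np

def dislocation_density(gamma, rho_0, rho_s, k_2):
    """Dislocation density as a function of the accumulated slip, cf. (6.48)."""
    decay = np.exp(-0.5 * k_2 * gamma)
    return rho_s * (1.0 - decay * (1.0 - np.sqrt(rho_0 / rho_s))) ** 2

def fiber_yield_stress(gamma, tau_inf, d, rho_0, rho_s, k_2):
    """Yield stress of the fibers following the softening law (6.47)."""
    rho = dislocation_density(gamma, rho_0, rho_s, k_2)
    return tau_inf / (d * np.sqrt(rho) + 1.0)

# Placeholder parameters (hypothetical, for illustration only)
params = dict(rho_0=1e10, rho_s=1e14, k_2=50.0)
gamma = np.linspace(0.0, 1.0, 101)
rho = dislocation_density(gamma, **params)
tau_F = fiber_yield_stress(gamma, tau_inf=1000.0, d=1e-6, **params)
```

For an initial density below the saturation value, the density grows monotonically from the initial to the saturation density, so the yield stress decreases with ongoing plastic slip. This is the softening referred to in the text.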

**(a)** Distribution of von Mises stresses in the matrix

**Figure 6.12:** NiAl-9Mo - Von Mises stress distribution at different stages of a uniaxial creep test in fiber direction

Evaluating the material law $\varepsilon \mapsto \sigma(\varepsilon)$ for single-crystal elastoviscoplasticity is computationally expensive compared to other operations such as applying $\Gamma^{0}$ and the associated FFTs, see Eghtesad et al. (2018a) or Sec. 3.4.4. Typically, Newton methods enjoy the best performance for this type of problem, as the material law is evaluated only once per Newton iteration, whereas applying the material tangent is substantially cheaper. This is why a performance comparison with A2DR is of interest for this computationally demanding problem.

We consider a creep test, where a uniaxial stress loading with an amplitude of 250 MPa is applied in a single load step for one second. Subsequently, the loading is held constant for 120 seconds, subdivided into 120 equidistant load steps. Throughout, we solve to an accuracy of $10^{-5}$. The number of load steps was chosen to obtain a sufficiently fine resolution of the strain rate over time and to ensure the positive definiteness of the tangent stiffness, which is not guaranteed due to the softening law (6.47).

**Figure 6.13:** NiAl-9Mo - Von Mises stress-field at different stages of a uniaxial creep test in fiber direction, showing the initial load transfer from matrix to fibers and the subsequent fiber softening

The evolution of the local fields during the creep test, see Fig. 6.12 to Fig. 6.14, illustrates the influence of the softening law (6.47) by Albiez et al. (2016a) on the effective material behavior. Owing to their high initial yield strength, the molybdenum fibers exhibit no plastic activity during the initial creep stage, behaving almost linear elastically. In contrast, the matrix plastifies almost immediately upon applying the initial stress loading. The viscous stresses in the matrix, caused by the high strain-rate during the initial loading, are subsequently transferred to the fibers. In turn, this leads to a further decrease of the overall strain rate, due to the higher creep resistance of the fibers. After roughly 30 seconds, the creep rate reaches its minimum, see Fig. 6.11a. At this point, the stress in the fibers has increased up to a level where plastic slipping is initiated. With the change from elastic to plastic behavior, the softening law takes effect, resulting in a subsequent increase of the effective creep rate and decreasing stress levels in the fibers.

**Figure 6.14:** NiAl-9Mo - Accumulated plastic slip at different stages of a uniaxial creep test in fiber direction, showing the transition from elastic to plastic behavior in the fibers

The different stages of creep behavior are reflected in the iteration counts of the solvers, see Fig. 6.15 and Fig. 6.11b. During the first few load steps, a high number of iterations is required, due to the rapid load transfer from matrix to fibers. As the creep rate stabilizes, the affine-linear extrapolation takes effect and the iteration count per load step reaches a minimum roughly between steps 30 and 60. Subsequently, the softening of the fibers causes an increase in the effective creep rate and leads to a higher material contrast. Thus, the computational effort increases in the last 60 load steps. Comparing the investigated solution schemes, we observe that A2DR with the optimum step size $1/\sqrt{\alpha\_- \alpha\_+}$ closely matches the performance of the Newton-CG method. To be precise, A2DR is slower in the first 20 load steps and enjoys a slight advantage afterwards. Roughly around load step 70, both schemes break even in terms of total computation time. In the end, A2DR is even slightly faster overall than Newton-CG. Using the more widely applicable step size $2/(\alpha\_- + \alpha\_+)$ for A2DR doubles the total run-time compared to the optimum step size. Still, the slower option is about 30% faster than the Barzilai-Borwein method.

**(a)** Acc. iterations vs load step

**(b)** Acc. run-time vs load step

**Figure 6.15:** NiAl-9Mo - Performance comparison for various solution schemes

## **6.4 Conclusions**

The present study was devoted to increasing the robustness and performance of polarization-based methods in FFT-based micromechanics, by applying Anderson acceleration, a general-purpose technique for accelerating fixed-point methods.
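For readers unfamiliar with the technique, a generic form of Anderson acceleration for a fixed-point problem x = g(x) can be sketched in a few lines of Python. This is a textbook-style variant with a least-squares mixing step and a finite depth, written under the assumption of a vector-valued iterate. It is not the A2DR implementation itself, and the function name and the linear test problem are hypothetical.

```python
import numpy as np

def anderson_fixed_point(g, x0, depth=4, tol=1e-8, max_iter=500):
    """Solve x = g(x); depth = 0 recovers the plain fixed-point iteration."""
    x = np.asarray(x0, dtype=float)
    g_hist, f_hist = [], []
    for it in range(max_iter):
        gx = g(x)
        f = gx - x  # fixed-point residual
        if np.linalg.norm(f) <= tol:
            return x, it
        # Keep the current and the last `depth` evaluations
        g_hist = (g_hist + [gx])[-(depth + 1):]
        f_hist = (f_hist + [f])[-(depth + 1):]
        if len(f_hist) == 1:
            x = gx  # plain step, no history available
        else:
            dF = np.stack([b - a for a, b in zip(f_hist[:-1], f_hist[1:])], axis=1)
            dG = np.stack([b - a for a, b in zip(g_hist[:-1], g_hist[1:])], axis=1)
            # Least-squares combination minimizing the extrapolated residual
            coeff, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x = gx - dG @ coeff
    return x, max_iter

# Hypothetical linear contraction with rate 0.9 as a test problem
M = np.diag(np.linspace(0.0, 0.9, 10))
b = np.ones(10)
g = lambda x: M @ x + b
x_exact = b / (1.0 - np.diag(M))

x_plain, it_plain = anderson_fixed_point(g, np.zeros(10), depth=0)
x_aa, it_aa = anderson_fixed_point(g, np.zeros(10), depth=4)
```

The depth controls how many previous iterates are stored, which is the source of the additional memory demand of accelerated schemes compared to the plain iteration.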

To demonstrate the usefulness of the proposed algorithm, we covered a wide spectrum of problems, including microstructures and material laws of varying complexity. To be more precise, in Sec. 6.3.2 and Sec. 6.3.3, we investigated finitely contrasted fiber-reinforced microstructures with elastic and $J\_2$-elastoplastic material behavior, which were covered by the theoretical treatment in Sec. 6.2.1. For this class of problems, the excellent performance of polarization-based methods, see Schneider et al. (2019) and Schneider (2019a), could be further improved using Anderson acceleration. With respect to the choice of algorithmic parameters, we found that, using a depth of 4, the influence of the damping parameter could be eliminated and the sensitivity with respect to the step size was drastically reduced. Whereas the theoretically optimum step size $1/\sqrt{\alpha\_- \alpha\_+}$ led to the best performance, when applicable, the step size of the basic scheme $2/(\alpha\_- + \alpha\_+)$ emerged as a viable alternative. In particular, when the strong convexity constant $\alpha\_-$ is unknown or tends to zero, the step size $2/\alpha\_+$ can be readily estimated from the elastic stiffness of the constituent materials.
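The difference between the two step-size choices in the limit of a vanishing lower bound can be checked with a small Python sketch. The names `alpha_minus` and `alpha_plus` for the lower and upper bounds of the tangent, as well as the numerical values, are chosen here for illustration only.

```python
import math

def step_size_optimal(alpha_minus, alpha_plus):
    """Theoretically optimum step size; requires alpha_minus > 0."""
    return 1.0 / math.sqrt(alpha_minus * alpha_plus)

def step_size_basic(alpha_minus, alpha_plus):
    """Step size of the basic scheme; remains finite for alpha_minus = 0."""
    return 2.0 / (alpha_minus + alpha_plus)

alpha_plus = 100.0  # hypothetical upper bound, e.g. from the elastic stiffness
for alpha_minus in (10.0, 1e-3, 1e-9):
    print(alpha_minus,
          step_size_optimal(alpha_minus, alpha_plus),
          step_size_basic(alpha_minus, alpha_plus))
```

As the lower bound approaches zero, the first expression grows without bound, whereas the second one tends to the finite value 2 divided by the upper bound.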

This enabled us to investigate examples outside the framework of strongly convex optimization, where polarization-based schemes typically struggle. For the porous sand-core structure in Sec. 6.3.4, a lower bound of the elastic stiffness was unavailable. In the problems of Sec. 6.3.5 and Sec. 6.3.6, we considered computationally demanding material models which do not permit a potential-based formulation. Both cases incorporated material laws without a strictly monotone stress operator, with the latter example even including softening behavior. For all of these problems, Anderson acceleration led to substantial speed-ups compared to the classic polarization-based schemes, with A2DR being competitive with the fastest strain-based FFT-solvers.

Indeed, to optimize performance in FFT-based micromechanics, a judicious choice of the solution scheme is often inevitable. For instance, CG is the natural choice for linear elastic problems, inexact Newton-CG is hard to beat if the cost of evaluating the material law is much larger than applying the tangent, and the Barzilai-Borwein method is excellent if the material law is cheap to compute, see Ch. 3. In this study, we demonstrated that A2DR closely matches (or even beats) the performance of these schemes in each of their "ideal" settings. Thus, A2DR represents a robust and powerful solution scheme, which is close to optimal for a wide range of problems.

However, the excellent performance of the method is still accompanied by a large memory footprint. Future work may be devoted to investigating alternative vector-sequence acceleration techniques (Ramière and Helfer, 2015; Brezinski et al., 2021), seeking methods with lower memory requirements which preserve the advantages of Anderson acceleration. Last but not least, the efficiency of polarization-based methods relies on the cheap evaluation of the nonlinear $Z^{0}$-operator. Extending the complexity-reduction technique of Schneider et al. (2019) to a wider class of materials would further increase the usefulness of A2DR as a general-purpose method.

**Chapter 7**

# **On the impact of the mesostructure on the creep response of cellular NiAl-Mo eutectics<sup>1</sup>**

## **7.1 Introduction**

Directionally solidified NiAl-Mo eutectics, consisting of well-aligned single-crystalline Mo-fibers embedded in a NiAl matrix (Bei and George, 2005), are an appealing candidate for structural high-temperature applications (Darolia, 1991). However, several studies (Misra et al., 1998; Haenschke et al., 2010; Seemüller et al., 2013) demonstrated that the microstructure of the alloy is rather sensitive to the manufacturing process. In particular, insufficient temperature gradients and/or high growth rates, which are desirable from the viewpoint of industrial application, lead to deviations from an ideal microstructure of perfectly aligned Mo-fibers in the NiAl matrix. On the mesoscale, NiAl-Mo develops cellular structures (Misra et al., 1998; Seemüller et al., 2013) in which regions of well-aligned fibers are surrounded by degenerated regions with higher NiAl fraction and coarse, misaligned Mo fibers, see Fig. 7.1. Indeed, Gombola et al. (2020) revealed that similar structures emerge for various compositions in the NiAl-(Mo,Cr) system.

<sup>1</sup> This chapter is based on Wicht et al. (2022). For the sake of a coherent structure, formatting and typography of this thesis, minor changes have been made. To avoid redundancies in the text, the introduction has been shortened.

Seemüller et al. (2013) showed that cell formation results in a lower creep resistance, which lies between that of well-aligned NiAl-Mo and binary NiAl. The ability to model and predict the creep behavior of cellular NiAl-based composites appears crucial, as: (i) Perfect laboratory conditions for producing NiAl-based eutectics may not always be available in an industrial context where high growth rates are preferred. (ii) The process conditions to achieve perfect alignment become challenging in case of advanced, complex alloying compositions with extended solidification intervals (Gombola et al., 2020). The applied temperature gradients need to cover the solidification interval in the transition zone from the liquid to the solid in order to obtain stable processing conditions during solidification. Thus, cellular microstructures become more likely under practical conditions. Determining the impact of partially interrelated morphological features, such as cell volume fraction and aspect ratio, on the mechanical behavior is necessary, not only to assess the sensitivity of the overall creep response to microstructural irregularities, but also for identifying suitable processing conditions and alloy compositions. Finally, a rather large disparity in reported experimental results, for example regarding the apparent stress exponent of the composite (Albiez et al., 2016a; Dudová et al., 2011; Hu et al., 2013; Seemüller et al., 2013), might indicate that the mesostructure of the material has already played a role in some of the previous studies, as will be highlighted in Sec. 7.2.2 and Sec. 7.4.4.

Thus, the aim of the present study is to investigate the creep behavior of cellular NiAl-Mo through creep simulations on the microscale. To this end, we use modern FFT-based methods (Moulinec and Suquet, 1998), which have established themselves as powerful algorithms for computing the effective response of microstructured materials, such as composites (Burgarella et al., 2019; Wang et al., 2018a) and polycrystals (Lebensohn et al., 2012; Eisenlohr et al., 2013). In the context of micromechanical creep simulations, the effective strain-rate is computed by volume averaging the strain-rate field on the microstructure level, which arises in response to a prescribed mean stress. The main difficulty for this task lies in the multi-scale nature of the problem, i.e., the difference in the characteristic length scales of the different geometric features of the material, see Fig. 7.1. While the cellular colonies are roughly 1 mm and 0.2 mm in length and diameter, respectively, the diameter of the Mo fibers is in the sub-micron scale. Hence, if a volume element with multiple cells is considered for simulating the creep behavior, resolving the individual fibers will be infeasible. Instead, we follow Seemüller et al. (2013) and divide the material on the mesoscale into soft regions for the boundary, behaving similarly to the NiAl-matrix, and homogeneous hard regions, mirroring the effective creep behavior of the well-aligned NiAl-Mo colonies.

**Figure 7.1:** Structure of directionally solidified cellular NiAl-Mo sketched at different length scales based on dark field optical microscopy images by Seemüller et al. (2013)

In order to bridge the different length scales involved in the simulations, we proceed with the following steps:


# **7.2 Modeling the anisotropic creep behavior of well-aligned NiAl-Mo colonies**

## **7.2.1 Single crystal plasticity model for fiber and matrix**

In the following, we briefly review the material models and parameters (Albiez et al., 2016a) used for characterizing the anisotropic creep response of well-aligned NiAl-Mo. The material behavior of the NiAl-matrix and the Mo-fibers is governed by a classical small-strain single-crystal elasto-viscoplasticity model. In the following, $\varepsilon$ denotes the infinitesimal strain tensor and $\sigma$ refers to the Cauchy stress tensor. The linear elastic material behavior is governed by Hooke's law for the elastic strains $\varepsilon\_{\mathsf{e}}$

$$
\sigma = \mathbb{C} : (\varepsilon - \varepsilon\_{\mathsf{p}}) \quad \text{with} \quad \varepsilon = \varepsilon\_{\mathsf{e}} + \varepsilon\_{\mathsf{p}} \tag{7.1}
$$

and the stiffness tensor $\mathbb{C}$. The plastic strain $\varepsilon\_{\mathsf{p}}$ due to dislocation glide is realized as a linear combination of simple shears (Bishop, 1953) in crystallographic slip systems characterized by their slip direction $d\_{\alpha}$ and slip plane normal $n\_{\alpha}$, where the subindex $(\cdot)\_{\alpha}$ refers to the $\alpha$-th of $N$ slip systems. Assuming that the slip in the glide systems follows the classical power-law flow rule of Hutchinson (1976), the plastic strain rate reads

$$\dot{\varepsilon}\_{\mathbf{p}} = \sum\_{\alpha=1}^{N} \dot{\gamma}\_0 \operatorname{sgn}(\tau\_{\alpha}) \left| \frac{\tau\_{\alpha}}{\tau^{\sf F}} \right|^m d\_{\alpha} \otimes^s n\_{\alpha}, \tag{7.2}$$

with shear stress $\tau\_{\alpha} = \sigma : (d\_{\alpha} \otimes n\_{\alpha})$, yield stress $\tau^{\mathsf{F}}$, reference slip rate $\dot{\gamma}\_0$ and stress exponent $m$. We emphasize that the chosen flow rule only covers plasticity due to conservative dislocation glide. More sophisticated models, which include the smaller strain contribution of dislocation climb by adding non-conservative modes of deformation, have been proposed, for instance, by Lebensohn et al. (2010). However, as Albiez et al. (2016a) demonstrate, the chosen approach (7.2) is able to predict the creep behavior of NiAl-Mo for temperatures between 900 °C and 1000 °C and stresses between 100 MPa and 250 MPa with good accuracy. Furthermore, the stress exponents of the monolithic phases as well as of the composite are significantly larger than 1, indicating that diffusional contributions to the overall strain are negligible. Hence, to avoid the introduction and calibration of additional unknown material parameters, we restrict ourselves to the glide-based formulation. The temperature dependence of the creep behavior is incorporated in the reference shear rate by Albiez et al. (2016a), using an Arrhenius approach. As an exemplifying study, we compare our modeling results mainly with the experiments by Seemüller et al. (2013), who carried out creep tests at 900 °C. All material parameters, experimental data and simulation results in this study are given for this fixed temperature. In addition, experimental results show that the softening of the Mo-fibers, i.e., the decrease of $\tau^{\mathsf{F}}$ during creep, is only weakly pronounced in the cellular material, see Fig. 5 in Seemüller et al. (2013). Computational investigations suggest that, even for the well-aligned material, substantial softening only occurs for direct loading in fiber direction, see Sec. 4.6.3. For a thorough investigation of this load case and a physical interpretation of the softening, we refer to the studies of Albiez et al. (2016a; 2019). As the present study focuses on cellular NiAl-Mo, we restrict ourselves to investigating the steady-state creep rate of the materials, i.e., we treat $\tau^{\mathsf{F}}$ as constant.
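The flow rule (7.2) with constant yield stress can be evaluated directly for a given stress state. The following Python sketch does so for a list of slip systems. The function names, the single slip system and the parameter values are hypothetical and do not correspond to Tab. 7.1. Slip directions and slip-plane normals are assumed to be orthogonal unit vectors.

```python
import numpy as np

def sym_dyad(d, n):
    """Symmetrized dyadic product of slip direction d and slip-plane normal n."""
    return 0.5 * (np.outer(d, n) + np.outer(n, d))

def plastic_strain_rate(sigma, slip_systems, gamma_dot_0, tau_F, m):
    """Power-law flow rule: sum of simple shears over all slip systems."""
    eps_dot = np.zeros((3, 3))
    for d, n in slip_systems:
        schmid = sym_dyad(d, n)
        # Resolved shear stress; sigma is symmetric, so sigma : (d x n) = sigma : schmid
        tau = float(np.tensordot(sigma, schmid, axes=2))
        gamma_dot = gamma_dot_0 * np.sign(tau) * abs(tau / tau_F) ** m
        eps_dot += gamma_dot * schmid
    return eps_dot

# Hypothetical data: one slip system and a pure shear stress of 50 MPa
systems = [(np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]))]
sigma = np.zeros((3, 3))
sigma[0, 1] = sigma[1, 0] = 50.0
eps_dot = plastic_strain_rate(sigma, systems, gamma_dot_0=1e-3, tau_F=100.0, m=5.0)
eps_dot_doubled = plastic_strain_rate(2.0 * sigma, systems, 1e-3, 100.0, 5.0)
```

Doubling the stress multiplies the strain rate by two to the power of the stress exponent, which is the power-law sensitivity that shows up as the slope in a Norton plot.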

**Table 7.1:** Material parameters for NiAl and Mo at 900 °C (Albiez et al., 2016a; Seemüller et al., 2013)


The material parameters for NiAl and the Mo-fibers are mostly taken from Albiez et al. (2016a), see Tab. 7.1. By comparing the yield strength $\tau^{\mathsf{F}}$ of the two materials at 900 °C, the difference in creep resistance becomes apparent. Due to the directional solidification process, the Mo-fibers are virtually free of dislocations (Bei et al., 2008; Sudharshan Phani et al., 2011), leading to a high yield strength of roughly 3% of the shear modulus of Mo. Based on an extensive literature review, Albiez et al. (2016a) were able to adopt most material parameters from existing sources. Indeed, among the relevant parameters for the present study, only the reference shear rate of the Mo-fibers was calibrated to match simulation results (Albiez et al., 2016a). However, as the study of Seemüller et al. (2013) represents our primary point of comparison, we adopt two additional changes with respect to the parameters of NiAl. More precisely, we choose $m = 5.8$ as measured by Seemüller et al. (2013), compared to 4.04 used by Albiez et al. (2016a). Indeed, a large range of values from 3 to 7 has been reported for the stress exponent of single-phase NiAl in the literature (Noebe et al., 1993). The large scatter in experimental measurements may be due to the sensitivity of the stress exponent of near-stoichiometric NiAl to composition, as noted by Whittenberger (1987). To compensate for the change in the stress exponent, the value of $\dot{\gamma}\_0$ was modified to reach a decent agreement between modeled material behavior and experimental measurements, see Fig. 7.2c.

## **7.2.2 Minimum creep rate of well-aligned NiAl-Mo under various loading angles**

**(a)** Transverse section of the microstructure

**(b)** Sketch of a longitudinal section indicating the loading angle

**(c)** Comparison of micromechanical simulations and experimental results by Seemüller et al. (2013)

For characterizing the material behavior of well-aligned NiAl-Mo, we use the two-dimensional cell shown in Fig. 7.2a, with 100 fibers occupying 14% of the total area (Bei and George, 2005). Distinct microstructural features of well-aligned NiAl-Mo include the square cross-section of the Mo-fibers and their regular arrangement in a hexagonal pattern, see Fig. 1 in Bei and George (2005) or Fig. 3 in Seemüller et al. (2013). To obtain a similar hexagonal arrangement, we use the mechanical contraction algorithm of Williams and Philipse (2003) to generate a circle packing with 70% volume fraction. Subsequently, square fibers of appropriate size are placed at the centers of the packed circles. The resulting structure is discretized by 256 × 256 pixels. For investigating the anisotropic creep behavior of the material, we apply periodic boundary conditions and prescribe the effective stress tensor $\bar{\sigma}$, i.e., the volume average of the stress field. More precisely, the prescribed effective stress tensor corresponds to a uniaxial stress state, i.e., it is given by the dyadic product of the loading direction with itself, scaled by the stress magnitude. The loading is applied in 1 s and held until a steady-state strain-rate is reached. Different loading directions are tested with respect to their angle of misalignment to the growth direction, see Fig. 7.2b for a sketch. Details on the computational setup of the FFT-based micromechanics solver are given in Sec. 7.4.1. The computed minimum creep rate of the well-aligned material for loadings in growth direction at various stress levels is compared to the creep experiments by Seemüller et al. (2013) in Fig. 7.2c. Although the data from simulation and experiment are in decent agreement in the range from 150 to 200 MPa, the slopes, i.e., the apparent stress exponents, differ notably, with a value of 10 in the simulations compared to values of 5 to 7 in Seemüller et al. (2013). This indicates that the Mo-fibers control the creep behavior of the single colony in the simulation.

A broader review of existing creep studies reveals that there is, in fact, no clear consensus on the stress exponent of well-aligned NiAl-Mo. For instance, creep experiments by Haenschke et al. (2010), Albiez et al. (2016a) and Dudová et al. (2011) displayed fiber-dominant behavior with stress exponents between 10 and 14. In contrast, Seemüller et al. (2013) and Hu et al. (2013) measure an exponent in the range of 4 to 7. Taking a closer look at the anisotropic creep behavior predicted by the microstructure computations, see Fig. 7.3, elucidates the disparity in experimental measurements. Fig. 7.3a reveals the pronounced sensitivity of the creep behavior with respect to the angle of misalignment between loading and fiber direction. For loading angles larger than 5°, the creep rate quickly increases by orders of magnitude. Indeed, between 15° and 30°, the reinforcing effect of the fibers mostly vanishes and the creep rate approaches that of the pure NiAl matrix. A more subtle change in behavior can be observed at small angles of misalignment, see Fig. 7.3b. Between 0° and 2°, we observe no change in creep behavior and the apparent stress exponent corresponds to that of the Mo-fibers. However, between 3° and 4°, there is a turning point from fiber-controlled to matrix-controlled creep, with little change in the overall magnitudes of creep rates (at least between 150 and 200 MPa). This offers a possible explanation for the wide range of determined stress exponents in the aforementioned experimental studies, as a small misalignment with respect to the loading direction has a notable impact on the measured rates. In Sec. 7.4.3, we identify the mesostructure of the material as another plausible source for the scatter in stress exponents.
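The apparent stress exponent discussed above is the slope of the minimum creep rate over the applied stress in a double-logarithmic (Norton) plot. A minimal Python sketch of this evaluation, using synthetic data rather than the simulation or experimental results, could read as follows. The function name and the data are hypothetical.

```python
import numpy as np

def apparent_stress_exponent(stress, creep_rate):
    """Slope of log(creep rate) over log(stress), fitted by least squares."""
    slope, _intercept = np.polyfit(np.log(stress), np.log(creep_rate), 1)
    return slope

# Synthetic power-law data with exponent 10 (illustration only)
stress = np.array([150.0, 175.0, 200.0])  # MPa
creep_rate = 1e-30 * stress ** 10         # 1/s
exponent = apparent_stress_exponent(stress, creep_rate)
```

Since the fit only uses the slope, two data sets can agree well in magnitude over a narrow stress range, such as 150 to 200 MPa, and still yield clearly different exponents.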

Overall, we conclude that the material model and parameters by Albiez et al. (2016a) lead to a good agreement of micromechanical simulations with experimental results, in particular when taking the sensitivity of the material behavior with respect to load angle into account. Indeed, the creep data by Seemüller et al. (2013) matches the computational results for a loading angle of 4° almost perfectly, see Fig. 7.2c and Fig. 7.3b. Having validated the model and computations on the microscale, we use the obtained results to calibrate a surrogate model, mimicking the effective behavior of the well-aligned fibrous material.

**(a)** Minimum creep rate vs loading angle

**(b)** Norton plot for small loading angles

**Figure 7.3:** Comparison of FFT-based simulations on well-aligned NiAl-Mo microstructures and the surrogate model (7.8) for uniaxial creep tests where the loading angle is given with respect to growth direction

## **7.2.3 Phenomenological model for the well-aligned fiber structure**

The objective of the section at hand is to develop a simple phenomenological elasto-viscoplastic material model which is able to capture the creep behavior observed in Sec. 7.2.2. In particular, the following properties should be reflected by the model:


The effective linear elastic behavior is governed by Hooke's law (7.1), where the components of the effective stiffness tensor are readily obtained by six linear elastic computations. The computed stiffness tensor is almost transversely isotropic, with a relative error below 0.1%. The associated engineering constants are listed in Tab. 7.2. For the flow rule, we rely on the transversely isotropic splitting of the deviatoric stress tensor by Naumenko and Altenbach (2005)

$$
\sigma' = \sigma'\_{\mathcal{L}} + \sigma'\_{\mathcal{P}} + \sigma'\_{\mathcal{S}} \tag{7.3}
$$

into a longitudinal component $\sigma'\_{\mathcal{L}}$, the plane stress $\sigma'\_{\mathcal{P}}$ and the remaining out-of-plane shear stress $\sigma'\_{\mathcal{S}}$, defined by

$$
\sigma'\_{\mathcal{L}} = \left(\frac{3}{2}\sigma : (n \otimes n) - \frac{1}{2}\text{tr}(\sigma)\right)\left(n \otimes n - \frac{1}{3}\mathbf{I}\right),
\tag{7.4}
$$

$$
\sigma\_\mathcal{P}' = (\mathbf{I} - n \otimes n) \cdot \sigma \cdot (\mathbf{I} - n \otimes n) - \frac{1}{2} (\text{tr}(\sigma) - \sigma : (n \otimes n))(\mathbf{I} - n \otimes n),
\tag{7.5}
$$

$$
\sigma\_{\mathcal{S}}' = 2(n \cdot \sigma \cdot (\mathbf{I} - n \otimes n)) \otimes^s n,\tag{7.6}
$$

respectively. Here, $n$ denotes the unit normal of the isotropic plane, i.e., the fiber direction. Naumenko and Altenbach (2005) show that the Frobenius norms $\|\sigma'\_{\mathcal{L}}\|$, $\|\sigma'\_{\mathcal{P}}\|$ and $\|\sigma'\_{\mathcal{S}}\|$ of the stress components constitute a set of independent, transversely isotropic invariants of $\sigma'$. Thus, for any flow potential of the form $\Phi(\sigma') = \hat{\Phi}(\|\sigma'\_{\mathcal{L}}\|, \|\sigma'\_{\mathcal{P}}\|, \|\sigma'\_{\mathcal{S}}\|)$, the associated flow rule $\dot{\varepsilon}\_{\mathsf{p}} = \frac{\partial \Phi}{\partial \sigma'}(\sigma')$ is transversely isotropic. For the present model, we use the simple ansatz

$$\begin{split} \Phi(\sigma') &= \dot{\varepsilon}\_0 \left( \frac{\sigma\_\mathcal{L}^\mathrm{F}}{m\_\mathcal{L} + 1} \left| \frac{\left\| \sigma\_\mathcal{L}' \right\|}{\sigma\_\mathcal{L}^\mathrm{F}} \right|^{m\_\mathcal{L} + 1} \right. \\ &\left. + \frac{\sigma\_\mathcal{P}^\mathrm{F}}{m\_\mathcal{P} + 1} \left| \frac{\left\| \sigma\_\mathcal{P}' \right\|}{\sigma\_\mathcal{P}^\mathrm{F}} \right|^{m\_\mathcal{P} + 1} + \frac{\sigma\_\mathcal{S}^\mathrm{F}}{m\_\mathcal{S} + 1} \left| \frac{\left\| \sigma\_\mathcal{S}' \right\|}{\sigma\_\mathcal{S}^\mathrm{F}} \right|^{m\_\mathcal{S} + 1} \right), \end{split} \tag{7.7}$$

which leads to the flow rule

$$\dot{\varepsilon}\_{\mathsf{p}}(\sigma') = \dot{\varepsilon}\_0 \left( \left| \frac{\left\| \sigma'\_{\mathcal{L}} \right\|}{\sigma\_{\mathcal{L}}^{\mathrm{F}}} \right|^{m\_{\mathcal{L}}} \frac{\sigma'\_{\mathcal{L}}}{\left\| \sigma'\_{\mathcal{L}} \right\|} + \left| \frac{\left\| \sigma'\_{\mathcal{P}} \right\|}{\sigma\_{\mathcal{P}}^{\mathrm{F}}} \right|^{m\_{\mathcal{P}}} \frac{\sigma'\_{\mathcal{P}}}{\left\| \sigma'\_{\mathcal{P}} \right\|} + \left| \frac{\left\| \sigma'\_{\mathcal{S}} \right\|}{\sigma\_{\mathcal{S}}^{\mathrm{F}}} \right|^{m\_{\mathcal{S}}} \frac{\sigma'\_{\mathcal{S}}}{\left\| \sigma'\_{\mathcal{S}} \right\|} \right), \tag{7.8}$$

i.e., each stress component $\sigma'\_{\mathcal{L}}$, $\sigma'\_{\mathcal{P}}$, $\sigma'\_{\mathcal{S}}$ has an associated yield stress $\sigma^{\mathrm{F}}\_{\mathcal{L}}$, $\sigma^{\mathrm{F}}\_{\mathcal{P}}$, $\sigma^{\mathrm{F}}\_{\mathcal{S}}$ and stress exponent $m\_{\mathcal{L}}$, $m\_{\mathcal{P}}$, $m\_{\mathcal{S}}$, respectively. In contrast to the equivalent-stress approach by Naumenko and Altenbach (2005), the present formulation is able to accommodate different stress exponents for longitudinal and in-plane loadings. On the downside, our flow rule does not reduce to the classical $J\_2$-plasticity model for a specific choice of parameters.
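A minimal Python sketch of the splitting (7.3) and the flow rule (7.8) is given below. The longitudinal part is written with the prefactor (3/2) σ:(n⊗n) − (1/2) tr(σ), for which the three parts sum to the stress deviator. The function names and all parameter values are placeholders and do not reproduce Tab. 7.2.

```python
import numpy as np

I3 = np.eye(3)

def split_deviator(sigma, n):
    """Longitudinal, plane and out-of-plane shear parts of the stress deviator."""
    N = np.outer(n, n)
    P = I3 - N                      # projector onto the isotropic plane
    s_nn = n @ sigma @ n            # sigma : (n x n)
    tr = np.trace(sigma)
    s_L = (1.5 * s_nn - 0.5 * tr) * (N - I3 / 3.0)
    s_P = P @ sigma @ P - 0.5 * (tr - s_nn) * P
    t = n @ sigma @ P               # in-plane part of the traction on the plane normal to n
    s_S = np.outer(t, n) + np.outer(n, t)
    return s_L, s_P, s_S

def surrogate_strain_rate(sigma, n, eps_dot_0, sigma_F, m):
    """Flow rule with one yield stress and one exponent per stress component."""
    rate = np.zeros((3, 3))
    for s, s_F, m_i in zip(split_deviator(sigma, n), sigma_F, m):
        norm = np.linalg.norm(s)    # Frobenius norm
        if norm > 0.0:
            rate += eps_dot_0 * (norm / s_F) ** m_i * s / norm
    return rate

# Random symmetric stress state and fiber direction (illustration only)
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
sigma = 0.5 * (A + A.T)
n = np.array([1.0, 0.0, 0.0])
s_L, s_P, s_S = split_deviator(sigma, n)
rate = surrogate_strain_rate(sigma, n, eps_dot_0=1e-6,
                             sigma_F=(300.0, 100.0, 80.0), m=(10.0, 6.0, 6.0))
```

Because the three parts are mutually orthogonal and trace-free, the resulting strain rate is isochoric, and a uniaxial load in fiber direction activates the longitudinal term only.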

The material parameters for the flow rule, see Tab. 7.2, were calibrated by performing a creep test in fiber direction and two shear-creep tests. The resulting creep behavior of the surrogate model is compared to the crystal plasticity computations of Sec. 7.2.2 in Fig. 7.3. Overall, the surrogate model matches the simulations exceptionally well. Both the deterioration of creep resistance for off-angle loadings, see Fig. 7.3a, and the transition from fiber-dominated to matrix-dominated creep at small angles, see Fig. 7.3b, are reproduced with high accuracy. Hence, the surrogate model is suitable for facilitating computational investigations on cellular NiAl-Mo on the mesoscale. However, some remarks on the limitations of the model are in order:

1. The largest relative error in strain rates between the surrogate model and the micromechanical crystal plasticity simulation is around 25% for loadings perpendicular to the fibers. This is acceptable for investigations of the creep behavior, where creep rates are typically visualized on a logarithmic scale and experimentally determined creep rates may scatter by up to an order of magnitude. However, in other contexts, e.g., for predicting the non-linear stress-strain behavior, the model may have to be reviewed, or, at least, carefully recalibrated.

2. Both simulations (Albiez et al., 2016a; 2019) and experiments (Dudová et al., 2011; Hu et al., 2013; Seemüller et al., 2013) on well-aligned NiAl-Mo show a transient decrease of the creep rate in the initial stages of a creep test, owing to the load transfer from matrix to fibers. Naturally, the surrogate model cannot account for this behavior as the constituent phases are not explicitly resolved.

**Table 7.2:** Material parameters for the surrogate model, mimicking the creep behavior of unidirectional NiAl-Mo with 14% fiber fraction

# **7.3 Generating synthetic cellular mesostructures**

**(b)** 3D overview

**Figure 7.4:** Different stages of the microstructure generation process with the underlying fiber structure (left), the Voronoi level set (7.9) of the center lines (middle) and the final cell structure (right)

For NiAl-10Mo alloys solidified at a rate of 80 mm/h, Seemüller et al. (2013) observed that regions of well-aligned unidirectional fibers formed cellular structures on the mesoscale, surrounded by misaligned fibers and pure matrix material. The cells, featuring roughly hexagonal cross-sections, were elongated in the direction of solidification, with lengths of around 1000 µm and an aspect ratio of five. Based on a cell distance of 6 to 10 µm, Seemüller et al. (2013) estimated a volume fraction of 82% to 85% for the hard regions.

For generating synthetic volume elements that mimic the aforementioned characteristics, we rely on the level-set-based framework of Sonon et al. (Sonon et al., 2012; Sonon, 2014; Sonon et al., 2015). In the following, the basic methodology is briefly summarized for the convenience of the reader. Suppose we have a rectangular cell $Y \subset \mathbb{R}^3$ with a set of non-overlapping particles $\Phi = \bigcup_{i=1}^{N} \Phi_i$. Sonon et al. propose an implicit description of the microstructure in terms of the nearest-neighbor level set

$$DN_1(x) = \begin{cases} \min_{y \in \partial \Phi} d(x, y), & x \notin \Phi, \\ -\min_{y \in \partial \Phi} d(x, y), & x \in \Phi, \end{cases}$$

where $d(x, y)$ denotes the periodic distance of two points $x, y \in Y$ and $\partial\Phi$ stands for the boundary of the set $\Phi$. Thus, the condition $DN_1(x) \le 0$ describes the space occupied by particles. As an extension to $DN_1(x)$, the level sets $DN_k(x)$ may be computed (Sonon et al., 2012), encoding the periodic distance at each point $x$ to the $k$-th nearest particle. The $DN_k(x)$ level sets may be used in the context of dense packing algorithms and/or for generating new microstructures by thresholding suitable level-set functions (Sonon et al., 2012; Sonon, 2014; Sonon et al., 2015)

$$f(DN_1(x), DN_2(x), \dots, DN_{k_{\text{max}}}(x)) \le 0.$$

In particular, for the present study, we exploit the Voronoi-type level set with interparticle distance $t$

$$DN_1(x) - DN_2(x) + t \le 0,\tag{7.9}$$

for generating microstructures with the complex geometrical features of cellular NiAl-Mo. For a given collection of particles, $DN_1(x) - DN_2(x) = 0$ describes the boundary of the associated Voronoi tessellation. Thus, the geometry extracted by the related level set (7.9) may be interpreted as an expansion of all particles to a shape which enforces a uniform distance of $t$ between the resulting cells, see Sec. 4.1 in Sonon et al. (2012). In two dimensions, Massart et al. (2018) used the level set (7.9) to generate irregular masonry structures featuring elongated inclusions, resembling the cells observed in NiAl-10Mo. We follow a similar approach to generate the microstructures for the present study:


3. Using the bisection method, we iteratively solve for the cell distance $t$ until a prescribed cell volume fraction $\phi$ is obtained. With the indicator function

$$i_V(x) = \begin{cases} 1, & DN_1(x) - DN_2(x) + t \le 0, \\ 0, & \text{otherwise}, \end{cases}$$

of the level set (7.9), we terminate when the convergence criterion

$$\left|\frac{\int_{Y} i_{V}(x)\, \mathrm{d}V}{\int_{Y} \mathrm{d}V} - \phi\right| < \delta$$

is satisfied. Throughout, we set the tolerance for the volume fraction to $\delta = 10^{-3}$. Unless stated otherwise, the prescribed volume fraction is set to $\phi = 85\%$, following the estimate of Seemüller et al. (2013). Note that we prefer to fix the volume fraction rather than the interparticle distance $t$, as, from the viewpoint of micromechanics, the volume fraction enters the effective (linear elastic) material behavior to first order (Milton, 2002, Ch. 14).
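The thresholding step can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the implementation used for this study: the particles are idealized as points rather than fiber center lines, the level sets $DN_1$ and $DN_2$ are evaluated by brute force on a regular periodic grid, and the function names `periodic_dn12` and `bisect_cell_distance` as well as the box dimensions are hypothetical.

```python
import numpy as np


def periodic_dn12(seeds, shape, box):
    """Return DN_1 and DN_2 (distance to the nearest / second-nearest seed)
    at the voxel centers of a regular grid on a periodic box."""
    box = np.asarray(box, dtype=float)
    axes = [(np.arange(n) + 0.5) * length / n for n, length in zip(shape, box)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    dists = []
    for seed in np.asarray(seeds, dtype=float):
        diff = np.abs(grid - seed)
        diff = np.minimum(diff, box - diff)  # periodic (minimum-image) distance
        dists.append(np.linalg.norm(diff, axis=-1))
    dists = np.sort(np.stack(dists), axis=0)
    return dists[0], dists[1]


def bisect_cell_distance(dn1, dn2, phi, delta=1e-3, max_iter=100):
    """Bisect on t until the fraction of voxels with DN_1 - DN_2 + t <= 0
    matches the target volume fraction phi up to the tolerance delta."""
    gap = dn1 - dn2                   # non-positive everywhere
    lo, hi = 0.0, float(-gap.min())   # fraction is 1 at t = lo, near 0 at t = hi
    t, frac = lo, 1.0
    for _ in range(max_iter):
        t = 0.5 * (lo + hi)
        frac = float(np.mean(gap + t <= 0.0))
        if abs(frac - phi) < delta:
            break
        if frac > phi:
            lo = t                    # cells too large: increase the cell distance
        else:
            hi = t                    # cells too small: decrease the cell distance
    return t, frac


rng = np.random.default_rng(0)
box = (4.0, 1.0)                      # hypothetical 2D box, arbitrary length unit
seeds = rng.uniform(size=(12, 2)) * np.array(box)
dn1, dn2 = periodic_dn12(seeds, shape=(256, 64), box=box)
t, frac = bisect_cell_distance(dn1, dn2, phi=0.85)
print(f"cell distance t = {t:.4f}, cell volume fraction = {frac:.4f}")
```

Since the voxel fraction is a non-increasing function of $t$, bisection is applicable. On a discrete grid the fraction changes in steps, so very coarse grids may not reach a tight tolerance.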

Note that, in practice, the discrete level set is computed on a regular background grid. Throughout the present study, we choose the same refinement for the level-set computation as for the target resolution of the microstructure used in the FFT-based computations. More precisely, for a given underlying fiber packing, steps 2 and 3 of the outlined process are repeated for each realized resolution. Compared to downsampling all realizations from a single finely resolved microstructure, this approach requires a larger number of level-set computations. However, it offers tighter control of the target volume fraction, which is preferred with respect to the minimum necessary resolution for the FFT-based computations, see Sec. 7.4.2. The processing steps for a generated microstructure with dimensions 4000 µm × 800 µm × 800 µm are visualized in Fig. 7.4. Due to the dense fiber packing, the placement and aspect ratio of the cells closely follow those of the underlying fibers. Note that smaller fragments visible in Fig. 7.4 arise as artifacts of the 2D cuts and are actually part of regularly sized cells. A transverse section and a longitudinal section of the generated structure are compared to dark field optical microscopy images of cellular samples by Seemüller et al. (2013) in Fig. 7.5. Both the roughly hexagonal cross-section of the cells and their elongated shape with an aspect ratio of about five are featured in the synthetic structure. Hence, the volume elements generated by the adapted level-set strategy closely resemble cellular NiAl-Mo, enabling subsequent micromechanical studies on the material's effective creep behavior.

# **7.4 Creep behavior of cellular multi-colony NiAl-Mo eutectics with degenerated boundary regions**

## **7.4.1 Computational setup**

For computing the effective creep response of the NiAl-Mo alloys, we rely on an in-house FFT-based micromechanics solver, written in Python 3.7 with Cython extensions and parallelized using OpenMP. More precisely, we use the BFGS-CG algorithm, see Sec. 3.3.4, in combination with the staggered grid discretization (Schneider et al., 2016). We refer to the recent review by Schneider (2021) for a general overview of current FFT-based methods and the articles by Segurado et al. (2018) and Lebensohn and Rollett (2020) for dedicated reviews on the computational homogenization of polycrystalline materials. For a detailed discussion of the specific algorithms used in the study at hand, see Ch. 3.

<sup>2</sup> Fig. 7.5b and Fig. 7.5d from Seemüller et al. (2013) are reused under the STM permissions guidelines: https://www.stm-assoc.org/intellectual-property/permissions/permissions-guidelines/.

**(a)** Transverse section of synthetic structure

**(b)** Microscopy image of transverse section

**(c)** Longitudinal section of synthetic structure

**(d)** Microscopy image of longitudinal section

**Figure 7.5:** Synthetic microstructures in comparison to dark field optical microscopy images of NiAl-Mo by Seemüller et al. (2013)<sup>2</sup>

FFT-based solvers naturally operate with periodic boundary conditions, i.e., the stress and strain fields in the volume element are periodic. For our investigations, we prescribe an effective stress $\bar{\sigma}$, which is the volume average of the stress field, of the form $\bar{\sigma} = \sigma\, n \otimes n$, corresponding to a uniaxial stress state with magnitude $\sigma$ in direction $n$, see Kabel et al. (2016). The loading is applied in 1 s and subsequently held constant until a steady-state creep rate is reached. For our investigation of the cellular material, we restrict ourselves to loadings in growth direction.
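As a small illustration of the prescribed loading, the following Python sketch assembles the uniaxial effective stress tensor and a ramp-and-hold load history. The linear shape of the ramp, the function names and the chosen direction are assumptions made for this sketch. The text only states that the load is applied in 1 s and then held.

```python
import numpy as np


def uniaxial_stress(magnitude, direction):
    """Return the symmetric 3x3 tensor magnitude * n (x) n for the unit vector n."""
    n = np.asarray(direction, dtype=float)
    n = n / np.linalg.norm(n)
    return magnitude * np.outer(n, n)


def prescribed_stress(time, magnitude, direction, ramp_time=1.0):
    """Effective stress at a given time: ramp over ramp_time, then hold.
    The linear ramp is an assumption of this sketch."""
    factor = min(max(time / ramp_time, 0.0), 1.0)
    return factor * uniaxial_stress(magnitude, direction)


# 200 MPa along the first coordinate axis (taken as growth direction here).
sigma_bar = prescribed_stress(time=10.0, magnitude=200.0, direction=(1.0, 0.0, 0.0))
print(sigma_bar)
```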

Throughout, convergence of the FFT-based solver is checked using the criterion proposed in Sec. 5 by Schneider et al. (2019) with a prescribed tolerance of $10^{-4}$. For the soft regions at the cell boundaries, we use the material model of NiAl, see Sec. 7.2.1. The behavior of the hard regions in the well-aligned cells is governed by the surrogate model proposed in Sec. 7.2.3. All computations were performed either on a workstation with two 12-core Intel Xeon(R) Gold 6146 CPUs and 512 GB RAM or on a workstation with two AMD EPYC 7642 CPUs with 48 cores each and 1024 GB RAM.

## **7.4.2 Study on the size of the volume element**

FFT-based micromechanics solvers naturally operate on a regular (voxel) grid. However, even when treating the hard regions in cellular NiAl-Mo as a homogeneous material, the difference between the largest geometric features, i.e., cell lengths of about 1000 µm, and the smallest geometric features, i.e., the soft cell boundaries with a thickness around 10 µm, is still very large. Both memory and runtime limit the size of volume elements which are feasible for computation. Thus, it is imperative to identify both a suitable volume element size and an appropriate resolution, while keeping the possible error of the material response reasonably small (Gote et al., 2022).

In this context, it is useful to recall some insights from the study on representative volume elements by Kanit et al. (2003). When computing an effective material property based on a randomly generated microstructure of finite size, Kanit et al. (2003) identify two sources of error. For an ensemble of finite microstructure realizations of the same size, there will be some scatter in the effective properties of each realization. The difference between the effective property of a single realization and the mean of an infinitely large ensemble is called dispersion or random error. The dispersion can be reduced either by increasing the size of the microstructure or by averaging over multiple microstructures. The second error source is the bias or systematic error, describing the difference between the mean effective properties for a finite volume element size and the effective properties of the infinite volume limit. For instance, choosing a small volume element may induce anomalies in the microstructure leading to incorrect effective properties, independent of the number of realizations considered. Indeed, the systematic error can only be reduced by increasing the size of the microstructure. As the size of the volume element is a limiting factor for the simulations, we aim to identify the smallest microstructure which sufficiently reduces the systematic error and keep track of the dispersion by considering multiple realizations.

In the following, we investigate microstructures with varying lengths and cross-section widths. For each size, ten volume elements are generated and the effective creep rates for a uniaxial stress loading of 200 MPa in growth direction are computed. Based on preliminary investigations, the voxel size is fixed at 8 µm, unless stated otherwise. In Fig. 7.6, we plot the resulting mean values together with the two-sided 99% confidence interval based on Student's $t$-distribution, following Schneider et al. (2022). Note that, for better readability, we use a linear scale on the creep-rate axis instead of the logarithmic scale typically used when plotting experimental creep rates.
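For reference, a confidence interval of this type can be computed as sketched below in Python, assuming SciPy is available. The creep-rate values are hypothetical placeholders and do not correspond to the data shown in Fig. 7.6. The function name `mean_confidence_interval` is likewise chosen for this sketch.

```python
import numpy as np
from scipy import stats


def mean_confidence_interval(samples, level=0.99):
    """Sample mean and half-width of the two-sided confidence interval
    of the mean, based on Student's t-distribution."""
    x = np.asarray(samples, dtype=float)
    n = x.size
    mean = x.mean()
    std = x.std(ddof=1)  # sample standard deviation
    t_quantile = stats.t.ppf(0.5 + 0.5 * level, df=n - 1)
    return mean, t_quantile * std / np.sqrt(n)


# Hypothetical creep rates (in 1/s) of ten realizations, for illustration only.
rates = np.array([1.9, 2.3, 2.1, 2.6, 1.8, 2.2, 2.4, 2.0, 2.5, 2.2]) * 1e-7
mean, half_width = mean_confidence_interval(rates)
print(f"mean = {mean:.3e} 1/s, 99% interval = +/- {half_width:.3e} 1/s")
```

The half-width shrinks with the square root of the ensemble size, which reflects the statement above that the dispersion can be reduced by averaging over multiple realizations, whereas the bias cannot.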

First, we take a look at microstructures of varying width for a fixed length of 2000 µm. We observe that, up to a width of 800 µm, the averaged creep rate increases linearly and subsequently stagnates, see Fig. 7.6a. Indeed, the creep rate for a width of 400 µm is 30% below the stationary level, indicating a large bias. Between widths of 800 µm and 1200 µm, the fluctuation of the mean creep rates is small compared to the confidence intervals, revealing that the dispersion is the primary error source. As expected, the confidence intervals narrow with increasing size. However, when considering an ensemble of ten volume elements, a width of 800 µm appears sufficient.

**Figure 7.6:** Influence of cell size and resolution on the effective creep rate with a default length of 2000 µm, a default width of 800 µm and a default voxel size of 8 µm

Qualitatively, the same trends emerge for volume elements of varying length, see Fig. 7.6b. Using microstructures with a length of 1000 µm, i.e., a single cell length, leads to a systematic underestimation of the creep rate by about 70%. Notably, owing to the imposed regularity of the structure (each cell borders itself in length direction), the dispersion is comparatively small for this case, demonstrating that bias and dispersion do not always follow the same trends. For volume elements longer than 2000 µm, there are only marginal changes in the average creep rates. Overall, we conclude that a length of 2000 µm, i.e., two cell lengths, is sufficient for our purposes, arriving at a default volume element size of 2000 µm × 800 µm × 800 µm for our subsequent investigations. We emphasize that this choice is only safe if an ensemble of (at least) 10 microstructures is considered. As the dispersion is still rather high, with a relative sample standard deviation of 11.5%, using only a single volume element may lead to significant (and undetectable) errors (Schneider et al., 2022). Further note that these results only hold for investigating the effective creep rate. When studying other physical properties, the representative volume size has to be identified anew.

Last but not least, we validate our chosen resolution for our final volume element size with a length of 2000 µm and a width of 800 µm. To this end, the full ensemble of 10 microstructures was discretized with voxel lengths ranging from 2 µm to 8 µm. In comparison to the size of the volume element, the impact of the resolution is minuscule, see Fig. 7.6c. Note that a resolution of 8 µm is rather coarse, i.e., the soft cell boundary in the discretized microstructure is only one to two voxels in thickness. Hence, the low impact of resolution on the overall accuracy may appear surprising. We found that a key factor for the consistency of the results with respect to resolution stems from the microstructure generation process, see Sec. 7.3. For each sampled resolution, the target volume fraction of $\phi = 85\%$ was reached to high accuracy by iteratively thresholding the underlying level set. Downsampling from a high-resolution microstructure, for instance, by using the median value, produces larger scatter in both volume fraction and creep rate. Overall, continuing the investigation with a default resolution of 8 µm per voxel seems reasonable.

### **7.4.3 On the definition of the soft cell boundary**

In their experimental study on cellular NiAl-Mo, Seemüller et al. (2013) observed a massive loss of creep resistance compared to the well-aligned material. More precisely, for a given nominal stress, the strain rates differed by about two to three orders of magnitude. The magnitude of this difference was unexpected, as fiber-free boundary regions only accounted for about 15% of the total volume and grain boundaries were generally found to have no effect on the creep resistance of binary NiAl (Whittenberger, 1987). Hence, we are interested in computationally investigating the loss of creep resistance in the cellular material and comparing our results to the experimental data by Seemüller et al. (2013). In this context, we note that the definition of the soft regions and the volume fraction of the remaining well-aligned material is crucial.

**(a)** Fiber-free (violet) and degenerated (pink) cell boundary regions

**Figure 7.7:** (a) Optical microscopy image by Seemüller et al. (2013) and (b) SEM image of boundary region by Haenschke et al. (2010)<sup>3</sup>

For their estimated cell fraction of 82% to 85%, Seemüller et al. (2013) only classified completely fiber-free regions as soft regions, see the violet shading in Fig. 7.7a. However, larger regions with a coarse fiber distribution and pronounced fiber misalignment can be identified around the cell boundaries (pink shading in Fig. 7.7a). In light of the results in Sec. 7.2.1, it is plausible that the degenerated regions do not significantly contribute to the creep resistance in growth direction. Indeed, scanning electron microscopy (SEM) images by Haenschke et al. (2010) reveal fiber misalignments between 20° and 30° at cell boundaries, see Fig. 7.7b. At these angles of misalignment, a single colony of well-aligned NiAl-Mo displays essentially the same creep behavior as the pure NiAl matrix. Thus, it appears reasonable to classify both the fiber-free and the degenerated regions as soft regions. To check this assertion, we consider the simulated creep behavior for varying volume fractions of the hard phase, see Fig. 7.8a to Fig. 7.8d for an example of a microstructure with varying cell distance $t$ and volume fraction $\phi$. Comparing the computed creep rates to the data by Seemüller et al. (2013) reveals that the experimentally determined creep rates lie between the simulation results for volume fractions of $\phi = 55\%$ and $\phi = 65\%$, see Fig. 7.9. The cell distance of 28 µm to 38 µm for the associated synthetic structures roughly matches the thickness of 30 µm to 40 µm for the coarse region in the microscopy image by Seemüller et al. (2013). Thus, the creep simulations strengthen the hypothesis that both coarse and fiber-free regions should be classified as soft regions.

<sup>3</sup> Fig. 7.7a from Seemüller et al. (2013) is reused under the STM permissions guidelines: https://www.stm-assoc.org/intellectual-property/permissions/permissions-guidelines/. The shading of the boundary regions and the associated annotations have been added. Fig. 7.7b from Haenschke et al. (2010) is reused under the CC BY 4.0 license: https://creativecommons.org/licenses/by/4.0/legalcode. The visualization for the angle of misalignment has been added.

**Figure 7.8:** Artificial microstructures with varying volume fraction and corresponding boundary width

Our results highlight that, in contrast to binary NiAl (Whittenberger, 1987), the boundary of the cellular colonies is essential for explaining the overall creep behavior of the cellular material. Owing to its much lower creep resistance, properly defining the soft regions and their volume fraction is key for reaching accurate predictions. In particular, identifying the coarse regions with degenerated fiber structure as part of the soft regions sheds light on the deterioration of the creep resistance in cellular NiAl-Mo samples. Compared to the completely fiber-free regions, the degenerated part of the cell boundary occupies two to three times as much volume. Hence, the fraction of the actual hard regions is much lower than the 85% estimated by Seemüller et al. (2013), leading to the pronounced loss of creep resistance. An illustration of the difference in reinforcing structure is shown in Fig. 7.10, where the difference between synthetic volume elements with $\phi = 85\%$ and $\phi = 55\%$, i.e., the impact of the coarse boundary, is visualized. Thus, it appears mandatory to pay special attention to such mesoscale deviations from the ideal fiber morphology when comparing the magnitudes of creep resistance of NiAl-based composites from different experimental datasets. As many alloys in the NiAl-(Cr,Mo) system exhibit similar colony structures with degenerated regions at the cell boundaries (Gombola et al., 2020), these findings should be taken into account when modeling and evaluating the creep resistance. In particular, more complex alloys with a larger number of constituents will be even more prone to form degenerated regions due to extended solidification intervals.

**Figure 7.9:** Norton plot for various volume fractions and comparison to experimental data by Seemüller et al. (2013)

**Figure 7.10:** Comparison of reinforcing cell structure between synthetic volume elements with $\phi = 85\%$ and $\phi = 55\%$, difference shaded in pink

## **7.4.4 Influence of the morphology on the creep response**

In terms of the mechanical properties of directionally solidified NiAl-Mo, aiming for a well-aligned microstructure appears to be optimal. However, this degree of fiber alignment is only achieved under specific processing conditions, i.e., slow growth rates and high temperature gradients, which are typically restricted to a laboratory environment (Bei and George, 2005; Bogner et al., 2012; Hu et al., 2012). In contrast, samples solidified in industrial-scale furnaces are prone to microstructural irregularities (Bogner et al., 2012). Hence, for the practical application of NiAl-Mo on a component scale, a robust prediction of the creep behavior in terms of the microstructure morphology is required to find a suitable compromise between mechanical behavior and favorable processing conditions. However, in practice, deliberate morphology modification of NiAl-Mo is limited due to strongly interrelated solidification and processing parameters. Thus, thoroughly characterizing the impact of the morphology on the creep behavior solely based on experiments is difficult.

The level-set framework outlined in Sec. 7.3 provides greater flexibility for adjusting the aspect ratio and volume fractions of the generated synthetic microstructures. Hence, we expand upon the computations of Sec. 7.4.3 and investigate the impact of these morphological quantities on the effective creep rate. In addition, we compare our results with the Kelly-Street model (Kelly and Street, 1972b), which is popular for predicting the microstructure-dependent creep behavior of cellular and fibrous composites and evaluating experimental data (Seemüller et al., 2013; Hu et al., 2013), to assess its accuracy for the case of NiAl-Mo. To this end, we generate microstructures with volume fractions from 55% to 85% and aspect ratios from 5 to 40. Aspect ratios higher than the original ratio of 5 were realized by increasing the length of the cellular inclusions and the overall volume elements as part of the microstructure generation routine. Based on the results for $l/d = 5$, see Sec. 7.4.2, we set the width of all volume elements to four times the width of the cellular inclusions and the length to twice the cell length.

**Figure 7.11:** Influence of volume fraction $\phi$ and aspect ratio $l/d$ on the creep rate of cellular NiAl-Mo for a fixed stress loading of $\sigma = 100$ MPa

The minimum creep rates obtained from simulations on the generated microstructures with a fixed stress loading of $\sigma = 100$ MPa are shown in Fig. 7.11. Recall that, according to Kanit et al. (2003), the dispersion of the effective properties is a decent measure for the representativeness of the volume element size. For the cellular microstructures considered in this study, this was confirmed in Sec. 7.4.2 for volume elements of sufficient length. As a general trend, the dispersion in the effective creep rates decreases with increasing aspect ratio, see Fig. 7.11. Hence, it is reasonable to assume that the size of the volume elements with an aspect ratio beyond the initial choice of $l/d = 5$ is sufficiently representative as well. In Fig. 7.11a, we observe that, for a fixed aspect ratio, a decrease in volume fraction by 20% leads to an increase in creep rate by roughly an order of magnitude. This trend is independent of the specific aspect ratio, as all plots in Fig. 7.11a feature a similar slope. Similarly, for a fixed volume fraction, all plots in Fig. 7.11b exhibit approximately the same general tendency. The creep rate decreases by a factor of about three from $l/d = 5$ to $l/d = 10$. For each subsequent doubling of $l/d$, the impact of the aspect ratio diminishes.

**(a)** Norton plot for variable volume fraction $\phi$ with fixed aspect ratio $l/d = 5$

**(b)** Norton plot for variable aspect ratio $l/d$ with fixed volume fraction $\phi = 65\%$

**Figure 7.12:** Influence of volume fraction $\phi$ and aspect ratio $l/d$ on the apparent stress exponent of cellular NiAl-Mo

The influence of volume fraction $\phi$ and aspect ratio $l/d$ on the apparent stress exponent is illustrated in Fig. 7.12. In particular, all plots in Fig. 7.12a feature the same slope, revealing that the apparent stress exponent is virtually independent of $\phi$. In contrast, increasing the aspect ratio leads to a marked change from matrix-controlled creep with an exponent of about 6 for $l/d = 5$ to fiber-controlled creep with an exponent of about 9 for $l/d = 40$, see Fig. 7.12b. Due to the change in the apparent stress exponent, the impact of the aspect ratio on the creep rate diminishes further at higher stresses. Note that the observed values of the stress exponent are inside the range reported in the experimental literature (Dudová et al., 2011; Hu et al., 2013; Seemüller et al., 2013; Albiez et al., 2016a). Hence, the morphology of the colonies arises as another possible source for the scatter in experimental data, again emphasizing that information on the mesostructural properties is crucial for a proper assessment of data from different sources.
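The apparent stress exponent corresponds to the slope of the creep rate over the stress in a double-logarithmic (Norton) plot. A minimal Python sketch of such an evaluation is given below. The least-squares fit, the function name and the synthetic data are assumptions made for illustration and do not reproduce the evaluation or the results of this study.

```python
import numpy as np


def apparent_stress_exponent(stresses, creep_rates):
    """Least-squares slope of log(creep rate) over log(stress)."""
    slope, _ = np.polyfit(np.log(stresses), np.log(creep_rates), deg=1)
    return slope


# Synthetic power-law data with exponent 6, used only to exercise the function.
stresses = np.array([100.0, 150.0, 200.0, 250.0])   # MPa
creep_rates = 1e-8 * (stresses / 100.0) ** 6        # 1/s
exponent = apparent_stress_exponent(stresses, creep_rates)
print(f"apparent stress exponent: {exponent:.2f}")
```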

**Figure 7.13:** Comparison of the quasi-rigid Kelly-Street model (Kelly and Street, 1972b) to simulation results for $\sigma = 100$ MPa

Lastly, we turn to the comparison of the simulations to the 1-dimensional shear-lag model by Kelly and Street (1972b), widely used in materials science to assess and interpret experimental creep data of composites (Chan, 2002; Hu et al., 2013; Seemüller et al., 2013). In particular, the Kelly-Street model for quasi-rigid inclusions admits a closed-form expression for the creep rate of the composite as a function of the applied stress. Assuming a power-law formulation

$$\dot{\varepsilon} = \dot{\varepsilon}_0^{\text{matrix}} \left( \frac{\sigma}{\sigma_0^{\text{matrix}}} \right)^m \tag{7.10}$$

for the matrix, the creep rate for the composite reads

$$\dot{\varepsilon} = \dot{\varepsilon}_0^{\text{matrix}} \left[ \frac{\sigma}{\sigma_0^{\text{matrix}} \left( \Phi\,(l/d)^{(m+1)/m} - 1 \right) \phi + \sigma_0^{\text{matrix}}} \right]^m \tag{7.11}$$

with the stress transfer function

$$\Phi = \left(\frac{2}{3}\right)^{1/m} \left(\frac{m}{2m+1}\right) \left(\frac{m}{m+1}\right) \left(\sqrt{\frac{\pi}{2\sqrt{3}\phi}} - 1\right)^{-1/m},\tag{7.12}$$

see Sec. 3.1 in Kelly and Street (1972b) and the modifications by Chan (2002). Note that, in this context, all considered quantities, such as the stress $\sigma$ and the strain rate $\dot{\varepsilon}$, are scalar-valued. In addition to the Kelly-Street model, we consider the rule of mixtures as a lower bound on the creep rate

$$
\sigma = \left(1 - \phi\right) \sigma_0^{\text{matrix}} \left(\frac{\dot{\bar{\varepsilon}}}{\dot{\varepsilon}_0^{\text{matrix}}}\right)^{1/m} + \phi \,\sigma_0^{\text{fiber}} \left(\frac{\dot{\bar{\varepsilon}}}{\dot{\varepsilon}_0^{\text{fiber}}}\right)^{1/n},\tag{7.13}
$$

where it is assumed that the fibers are governed by a power law, analogously to (7.10). Note that the rule of mixtures admits no closed-form solution for the strain rate $\dot{\bar{\varepsilon}}$ and has to be solved numerically for a given stress $\sigma$. The material parameters of matrix and well-aligned colonies for the analytical models are listed in Tab. 7.3.
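To make the evaluation of the two analytical models concrete, the following Python sketch implements (7.11) and (7.12) as written above and solves (7.13) for the strain rate by bisection in logarithmic space. The function names and the choice of bisection are assumptions of this sketch. The parameter values are placeholders chosen for illustration and are not the values of Tab. 7.3.

```python
import math


def stress_transfer(phi, m):
    """Stress transfer function (7.12); requires 0 < phi < pi / (2 sqrt(3))."""
    packing = math.sqrt(math.pi / (2.0 * math.sqrt(3.0) * phi)) - 1.0
    if packing <= 0.0:
        raise ValueError("phi must be below pi / (2 sqrt(3))")
    return ((2.0 / 3.0) ** (1.0 / m) * m / (2.0 * m + 1.0)
            * m / (m + 1.0) * packing ** (-1.0 / m))


def kelly_street_rate(sigma, phi, aspect, rate0, sigma0, m):
    """Creep rate (7.11) of the composite for quasi-rigid inclusions."""
    big_phi = stress_transfer(phi, m)
    denom = sigma0 * (big_phi * aspect ** ((m + 1.0) / m) - 1.0) * phi + sigma0
    return rate0 * (sigma / denom) ** m


def rule_of_mixtures_stress(rate, phi, matrix, fiber):
    """Right-hand side of (7.13); matrix and fiber are (rate0, sigma0, exponent)."""
    r0m, s0m, m = matrix
    r0f, s0f, n = fiber
    return ((1.0 - phi) * s0m * (rate / r0m) ** (1.0 / m)
            + phi * s0f * (rate / r0f) ** (1.0 / n))


def rule_of_mixtures_rate(sigma, phi, matrix, fiber, lo=1e-30, hi=1e10):
    """Solve (7.13) for the strain rate by bisection in log space.
    The bracket [lo, hi] is assumed to contain the solution."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if rule_of_mixtures_stress(mid, phi, matrix, fiber) < sigma:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Placeholder parameters (rate0 in 1/s, sigma0 in MPa, exponent); NOT Tab. 7.3.
matrix = (1e-6, 100.0, 6.0)
fiber = (1e-9, 100.0, 9.0)
ks_rate = kelly_street_rate(100.0, 0.65, 5.0, *matrix)
rom_rate = rule_of_mixtures_rate(100.0, 0.65, matrix, fiber)
print(f"Kelly-Street: {ks_rate:.3e} 1/s, rule of mixtures: {rom_rate:.3e} 1/s")
```

Bisection suffices here because the right-hand side of (7.13) increases monotonically with the strain rate. The guard in `stress_transfer` reflects that the last factor of (7.12) is only defined for volume fractions below $\pi/(2\sqrt{3}) \approx 0.907$.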

**Table 7.3:** Parameters for the 1-dimensional power-law model


In Fig. 7.13a, we compare the dependency of the creep rate on the cell volume fraction for the original aspect ratio of $l/d = 5$. As an additional data point, we consider the creep rate of the well-aligned material for $\phi = 100\%$. For volume fractions smaller than 85%, the plots of the analytical models and the simulations have a similar slope. However, the Kelly-Street model overestimates the effective creep rate by an order of magnitude compared to the simulation results, which lie roughly at the geometric mean between the Kelly-Street model and the rule of mixtures. In addition, the creep rate for the Kelly-Street model degenerates at $\phi = \pi/(2\sqrt{3})$, i.e., the maximum volume fraction for a hexagonal packing of continuous fibers as assumed by Kelly and Street (1972b). The results highlight that using the Kelly-Street model beyond its intended regime may lead to inaccurate predictions. Indeed, Kelly and Street (1972b) note that their theory may be inaccurate for small $l/d$ and validate their model for $l/d = 50$ and $l/d = 100$ (Kelly and Street, 1972a).

Keeping in mind that the Kelly-Street model assumes a constant strain rate in the matrix and zero strain rate in the fibers, the origins of the model inaccuracy may be traced to the heterogeneity of the local fields. In Fig. 7.14, the strain rate in growth direction is visualized for $l/d = 5$. Note that, for the purpose of portraying the fields, we choose a higher resolution of 4 µm per voxel.

**Figure 7.14:** Strain rate component in growth direction for a microstructure with aspect ratio $l/d = 5$ and volume fraction $\phi = 65\%$

**Figure 7.15:** Histograms of the strain-rate in growth direction for various aspect ratios

Evidently, the strain rate in both cellular inclusions and matrix is strongly heterogeneous for this case, see Fig. 7.15a for the corresponding histogram. Thus, it is not surprising that the Kelly-Street model struggles to arrive at accurate predictions. With increasing aspect ratio, the strain-rate field becomes more homogeneous, see Fig. 7.15a to Fig. 7.15d, and the simulated creep rates approach the results for the rule of mixtures. However, for high $l/d$, the assumption of zero strain rate in the fibers leads to a vast underestimation of the effective creep rate of the composite by the Kelly-Street model, see Fig. 7.13b. Thus, we conclude that the model should be confined to cases where the inclusions are truly rigid.

## **7.5 Conclusions**

The present work was devoted to studying the creep behavior of directionally solidified NiAl-Mo eutectics with a cellular mesostructure using FFT-based micromechanics solvers. Our conclusions are as follows:

• Combining the level-set framework for microstructure generation (Sonon et al., 2012; Sonon, 2014; Sonon et al., 2015) with FFT-based solvers (Moulinec and Suquet, 1998) proves to be a flexible approach for simulating the creep response of cellular materials. In particular, the suggested procedure enables the individual control of morphological parameters such as cell volume fraction and aspect ratio. As alloys with a larger number of constituents in the NiAl-(Cr,Mo) system may be even more prone to developing microstructural irregularities, a flexible simulation toolset is crucial for assessing their creep response.


may serve as a starting point for developing analytical models which address the aforementioned limitations.

# **Chapter 8 Summary and Conclusions**

In the present thesis, we investigated and developed high-performance FFT-based micromechanics solvers for efficiently computing the effective (thermo)mechanical response of applied materials. For one, we were interested in finding powerful general-purpose solvers, which perform well for a wide variety of problem classes, including nonlinear material behavior, infinite material contrast and computationally expensive material laws. Secondly, we developed dedicated algorithms for specialized applications such as crystal plasticity or thermomechanically coupled materials. All methods were tested for microstructures of industrial size and complexity. In particular, directionally solidified eutectics of the NiAl-(Cr, Mo) system, which are subject to active research as next-generation high-temperature materials, served as our primary material class of interest. Owing to their microstructure, with features encompassing multiple length scales, and the computationally expensive elasto-viscoplastic material behavior with strain-softening, the micromechanical characterization of these materials represented a research topic of interest in and of itself, in addition to being a challenging benchmark for the investigated solvers. In the following, we list the main insights of each chapter, before closing with some concluding remarks.

#### **Chapter 3**

• Among the Lippmann-Schwinger solvers, the Barzilai-Borwein method (Barzilai and Borwein, 1988; Schneider, 2019a) emerges as the solver of choice for computationally cheap material laws, due to its minimal computational overhead. For computationally expensive materials, inexact (Quasi-)Newton methods are the preferred choice, as they lead to the lowest number of gradient evaluations.

• The BFGS-CG method, approximating the local tangent stiffness, represents a viable alternative to the classical Newton-CG method (Gélébart and Mondon-Cancel, 2013; Kabel et al., 2014). This is mainly due to the influence of the forcing term choice, i.e., the accuracy for solving the linear system. In general, solving to high accuracy is suboptimal with respect to the total computation time. When solving the linear system to lower accuracy, having access to the exact tangent does not substantially improve the convergence rate.

#### **Chapter 4**


#### **Chapter 5**

• The asymptotic homogenization framework by Chatzigeorgiou et al. (2016) establishes that only the macroscopic temperature enters the cell problem on the microscale, effectively decoupling mechanics and heat conduction. Thus, for computing the effective response of thermomechanically coupled materials, it is sufficient to solve a scalar equation in addition to the balance of linear momentum.

• Our proposed implicit staggered approach minimizes the additional effort for the mechanical solver and preserves the power of FFT-based methods, even for strongly coupled problems. The Barzilai-Borwein method emerges as a decent choice, as its convergence behavior is virtually unaffected. For the Newton-CG method, using an adaptive forcing term choice is crucial to compensate for the higher number of Newton iterations.
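An adaptive forcing term ties the accuracy of the linear solve to the progress of the Newton iteration. As an illustration, the sketch below implements the safeguarded "Choice 2" rule of Eisenstat and Walker (1996). Whether this exact rule and these parameter values (`gamma`, `alpha`, `eta_max`) match the settings used in the thesis is an assumption; the function name is hypothetical.

```python
def forcing_term(res_new, res_old, eta_old, gamma=0.9, alpha=2.0, eta_max=0.9):
    """Relative tolerance eta_k for the linear solve in Newton step k.

    res_new, res_old: nonlinear residual norms of the current and previous iterate.
    eta_old: forcing term used in the previous Newton step.
    """
    # Choice 2: eta_k = gamma * (||F_k|| / ||F_{k-1}||)^alpha
    eta = gamma * (res_new / res_old) ** alpha
    # Safeguard: do not let the tolerance drop too abruptly
    safeguard = gamma * eta_old ** alpha
    if safeguard > 0.1:
        eta = max(eta, safeguard)
    # Cap the tolerance so that the linear solve always makes some progress
    return min(eta, eta_max)
```

The linear system of Newton step $k$ is then solved only until its relative residual falls below the returned value. Far from the solution, the residual decreases slowly and the tolerance stays loose. Close to the solution, the tolerance tightens automatically.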

#### **Chapter 6**


#### **Chapter 7**


Considering the development of FFT-based methods, both Lippmann-Schwinger and polarization-based approaches have produced competitive general-purpose solvers, each with distinctive advantages and disadvantages. In terms of raw performance, polarization-based schemes often have the upper hand, whenever applicable. However, even with the complexity-reduction approach of Schneider et al. (2019), the reliance on a polarization field and the application of the nonlinear $Z^0$ operator may seem unfamiliar to users of classical displacement-based mechanics solvers. In particular, auxiliary techniques and interfaces of practical relevance, such as thermomechanical coupling (see Ch. 5), composite voxels (Kabel et al., 2017) or UMAT support, are usually formulated in terms of strains or displacements. Hence, establishing compatibility with polarization-based methods is not straightforward and requires additional implementation effort. On the other hand, Lippmann-Schwinger solvers are usually easier to implement and maintain, although judiciously choosing the right algorithm for the problem at hand remains unavoidable for optimal performance. Thus, the "ideal" solver, combining fast performance, flexibility and memory efficiency, is yet to be developed. However, inexact (Quasi-)Newton methods, the Barzilai-Borwein method and (Anderson-accelerated) polarization schemes never stray too far from each other and are all worthy of a recommendation.

With respect to further investigations of NiAl-(Cr, Mo) eutectics, the methods of Ch. 7 may be used to characterize the various types of microstructures, i.e., fibrous or lamellar colonies, arising at different compositions of the refractory metals. In particular, investigating the different mechanical responses for varying microstructures is crucial for identifying promising material compositions. Furthermore, transferring the results of the micromechanical investigations to the macroscale is of high interest. In this context, the anisotropic creep behavior of the directionally solidified eutectics may prove to be detrimental when subjected to the multiaxial stress states encountered in real-world components. To facilitate such studies, either phenomenological surrogate models or data-driven approaches (Dvorak and Benveniste, 1992; Michel and Suquet, 2003; Gajek et al., 2020) may be informed by FFT-based computations.

#### **Appendix A**

# **The Helmholtz decomposition for elasticity**

The Helmholtz decomposition is discussed, for instance, in Ch. 12.1 of Milton (2002). To establish consistency and for the convenience of the reader, it is introduced in this appendix for the case of elasticity. Let the $\mathbb{C}^0$-weighted inner product on $V = L^2(Y; \text{Sym}(d))$ be defined as

$$\langle S, T \rangle\_{\mathbb{C}^0} = \frac{1}{|Y|} \int\_Y S : \mathbb{C}^0 : T \, dx, \quad S, T \in V. \tag{A.1}$$

Then the operators

$$\langle \cdot \rangle_Y, \quad \Gamma^0 : \mathbb{C}^0, \quad \text{and} \quad \Delta^0 = \mathbb{I} - \langle \cdot \rangle_Y - \Gamma^0 : \mathbb{C}^0, \tag{A.2}$$

with $\Gamma^0 = \nabla^s (\text{div}\, \mathbb{C}^0 \nabla^s)^{-1} \text{div}$, form a complete set of complementary orthogonal projectors. They induce an orthogonal direct sum decomposition of

$$V = \text{im } \langle \cdot \rangle\_Y \oplus \text{im } \Gamma^0 : \mathbb{C}^0 \oplus \text{im } \Delta^0 \tag{A.3}$$

with the subspaces

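$$\begin{aligned} \text{im } \langle \cdot \rangle_Y &= \{ S \in V \mid S = \langle S \rangle_Y \}, \\ \text{im } \Gamma^0 : \mathbb{C}^0 &= \{ S \in V \mid S = \nabla^s u, \ u \in H^1_\#(Y; \text{Sym}(d)), \ \langle S \rangle_Y = 0 \}, \\ \text{im } \Delta^0 &= \{ S \in V \mid \text{div} [\mathbb{C}^0 : S] = 0, \ \langle S \rangle_Y = 0 \}. \end{aligned} \tag{A.4}$$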


Hence, any $S \in V$ can be decomposed into three components

$$S = \langle S \rangle\_Y + \Gamma^0 : \mathbb{C}^0 : S + \Delta^0 : S \tag{A.5}$$

where $\langle S \rangle_Y$ is constant, $\Gamma^0 : \mathbb{C}^0 : S = \nabla^s u$ is mean-free and compatible, and $\Delta^0 : S = S - \langle S \rangle_Y - \nabla^s u$ is mean-free and divergence-free.
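As a short consistency check, sketched here under the assumptions that $\text{div}\, \mathbb{C}^0 \nabla^s$ is invertible on mean-free periodic displacement fields and that $\mathbb{C}^0$ is homogeneous, the projector property of $\Gamma^0 : \mathbb{C}^0$ follows directly from its definition:

$$(\Gamma^0 : \mathbb{C}^0)^2 = \nabla^s (\text{div}\, \mathbb{C}^0 \nabla^s)^{-1} \underbrace{\text{div}\, \mathbb{C}^0 \nabla^s (\text{div}\, \mathbb{C}^0 \nabla^s)^{-1}}_{\text{identity}} \text{div}\, \mathbb{C}^0 = \Gamma^0 : \mathbb{C}^0.$$

Likewise, with $u = (\text{div}\, \mathbb{C}^0 \nabla^s)^{-1} \text{div}\, \mathbb{C}^0 : S$, the third component satisfies

$$\text{div} [\mathbb{C}^0 : \Delta^0 : S] = \text{div} [\mathbb{C}^0 : S] - \text{div} [\mathbb{C}^0 : \langle S \rangle_Y] - \text{div} [\mathbb{C}^0 : \nabla^s u] = \text{div} [\mathbb{C}^0 : S] - 0 - \text{div} [\mathbb{C}^0 : S] = 0,$$

where the second term vanishes because $\mathbb{C}^0 : \langle S \rangle_Y$ is constant.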

#### **Appendix B**

# **The dual potential of the stress-based variational framework**

In the following, we discuss the derivation of the dual potential $W^*$, which was introduced ad hoc in Ch. 4.4 of the main text. Let $f : X \to \mathbb{R}$ be a convex function on the Banach space $X$. Let $f^* : X' \to \mathbb{R}$ be its Legendre transform

$$f^\*(y) = \sup\_{x \in X} \left( \langle x, y \rangle - f(x) \right) \tag{B.1}$$

where $\langle \cdot, \cdot \rangle$ is the natural pairing $\langle \cdot, \cdot \rangle : X \times X' \to \mathbb{R}$. If $f$ and $f^*$ are $C^1$, then

$$y = Df(x) \quad \text{iff} \quad x = Df^\*(y). \tag{B.2}$$
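The relations (B.1) and (B.2) can be checked numerically in the simplest setting $X = \mathbb{R}$. The following sketch uses the hypothetical quadratic $f(x) = \frac{1}{2} a x^2$ with $a = 4$, for which $f^*(y) = y^2 / (2a)$ in closed form. The sampling grid and the finite-difference step are illustrative assumptions.

```python
def legendre_transform(f, y, xs):
    """Approximate f*(y) = sup_x (x * y - f(x)) by a maximum over the samples xs."""
    return max(x * y - f(x) for x in xs)


def derivative(g, t, h=1e-5):
    """Central finite-difference approximation of g'(t)."""
    return (g(t + h) - g(t - h)) / (2.0 * h)


A = 4.0  # hypothetical stiffness-like constant


def f(x):
    """Convex quadratic f(x) = 1/2 * A * x^2."""
    return 0.5 * A * x * x


def f_star(y):
    """Closed-form Legendre transform of f."""
    return y * y / (2.0 * A)


# Sample points covering [-5, 5] with spacing 1e-3
XS = [i * 1e-3 for i in range(-5000, 5001)]
```

For $y = 2$, the sampled supremum `legendre_transform(f, 2.0, XS)` agrees with `f_star(2.0)`. Moreover, `derivative(f, 0.5)` is close to $2$ and `derivative(f_star, 2.0)` is close to $0.5$, i.e., $y = Df(x)$ and $x = Df^*(y)$ hold for the same pair $(x, y) = (0.5, 2)$.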

Suppose the closed subset $U \subseteq X$ is a convex cone, i.e.,

$$\theta_1 x_1 + \theta_2 x_2 \in U \quad \text{for all} \quad x_1, x_2 \in U \quad \text{and} \quad \theta_1, \theta_2 \geq 0, \tag{B.3}$$

and let $U^* \subseteq X^*$ be its dual cone

$$U^\* = \{ y \in X^\* \mid \langle y, x \rangle \ge 0 \quad \forall x \in U \}. \tag{B.4}$$

Then, according to Theorem 31.4 in Sec. 31 of Rockafellar (1970),

$$\min\_{x \in U} f(x) = -\min\_{y \in U^\*} f^\*(y) \tag{B.5}$$

holds.
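As a sanity check of (B.5), consider the trivial cone $U = X$, whose dual cone only contains the zero element. The right-hand side then reduces to the definition (B.1) evaluated at $y = 0$:

$$\min_{x \in X} f(x) = -\sup_{x \in X} \left( \langle x, 0 \rangle - f(x) \right) = -f^*(0).$$

For the illustrative scalar example $f(x) = \frac{1}{2} a x^2 - b x$ with $a > 0$, which is an assumption made only for this check, one finds $f^*(y) = (y + b)^2 / (2a)$, so that both sides equal $-b^2 / (2a)$.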

In the context of $X = L^2(Y; \text{Sym}(d))$, $X'$ can be identified with $X$ via the Riesz map and $\langle \cdot, \cdot \rangle$ can be identified with $\langle \cdot, \cdot \rangle_{L^2}$. Consider the minimization problem

$$\min\_{x \in U} f(x) \tag{B.6}$$

with the objective function

$$f(\hat{\varepsilon}) = \langle w(\overline{\varepsilon} + \hat{\varepsilon}) - \overline{\sigma} : \hat{\varepsilon} \rangle\_Y \,, \tag{B.7}$$

see (4.7), and

$$U = \left\{ \widehat{\varepsilon} \in X \, \middle| \, \widehat{\varepsilon} = \langle \widehat{\varepsilon} \rangle\_Y + \nabla^s u, \quad u \in H^1\_\#(Y; \text{Sym}(d)), \quad \mathbb{P}: \langle \widehat{\varepsilon} \rangle\_Y = 0 \right\}. \tag{\text{B.8}}$$

$U$ is a closed subspace of $X$ and therefore a convex cone, see Boyd and Vandenberghe (2004). Hence, its dual cone $U^*$ is equal to its annihilator

$$U^0 = \left\{ \hat{\sigma} \in X \, | \, \text{div } \hat{\sigma} = 0, \quad \mathbb{Q} : \langle \hat{\sigma} \rangle\_Y = 0 \right\},\tag{B.9}$$

so that

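$$\langle \hat{\sigma}, \hat{\varepsilon} \rangle_{L^2} = 0 \quad \text{for all} \quad \hat{\sigma} \in U^0 \quad \text{and} \quad \hat{\varepsilon} \in U. \tag{B.10}$$

The Legendre transform of $f$ reads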

$$\begin{split} f^\*(\sigma) &= \sup\_{\hat{\varepsilon} \in X} (\langle \sigma, \hat{\varepsilon} \rangle\_{L^2} - \langle w(\overline{\varepsilon} + \hat{\varepsilon}) \rangle\_Y) \\ &= \underbrace{\sup\_{\varepsilon \in X} (\langle \sigma, \varepsilon \rangle\_{L^2} - \langle w(\varepsilon) \rangle\_Y)}\_{\langle w^\*(\sigma) \rangle\_Y} - \langle \sigma \rangle\_Y : \overline{\varepsilon} \\ &= \langle w^\*(\sigma) \rangle\_Y - \langle \sigma \rangle\_Y : \overline{\varepsilon} \end{split} \tag{B.11}$$


with

$$w^\*(\sigma) = \sup\_{\varepsilon \in X} \left( \sigma : \varepsilon - w(\varepsilon) \right) \quad \text{and} \quad \varepsilon = \overline{\varepsilon} + \widehat{\varepsilon}. \tag{B.12}$$

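In the main text, we denote $W(\hat{\varepsilon}) = f(\hat{\varepsilon})$ and $W^*(\hat{\sigma}) = f^*(\overline{\sigma} + \hat{\sigma})$ with $\sigma = \overline{\sigma} + \hat{\sigma}$. With this choice of $W^*(\hat{\sigma})$, we obtain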

$$\min\_{\hat{\varepsilon}\in U} W(\hat{\varepsilon}) = -\min\_{\hat{\sigma}\in U^\*} W^\*(\hat{\sigma}).\tag{B.13}$$

by (B.5). Hence, the primal and dual variational problems in Section 4.2.2 and Section 4.4 are equivalent.

#### **Appendix C**

# **Remarks on computing the level sets on voxel images**

## **C.1 Using Euclidean distance transforms to compute level sets**

In the following, we briefly lay out how Euclidean distance transforms (EDTs), which are standard algorithms in image processing (Fabbri et al., 2008), may be exploited for computing the level sets on voxel images. Suppose a binary microstructure image $\mathcal{I} : \Omega \to \{0, 1\}$ is given on a discretized domain $\Omega = \{0, \ldots, N\}^3$, with a set of inclusion voxels $\Phi$ (typically with value 1) and matrix voxels $\bar{\Phi} = \Omega \setminus \Phi$ (typically with value 0). For any image with a marked set of object voxels, an EDT assigns to each voxel the distance to its nearest object voxel. Thus, the signed distance field $DN_1$ of $\mathcal{I}$ may be computed by a three-step process:


For computing the nearest neighbour level sets of higher order, the sequential updating strategy by Sonon (2014, Sec. 2.4.1) may be used. To this end, the subsets $\Phi_i$ of all particles, with $\Phi = \bigcup_{i=1}^{n} \Phi_i$, have to be identified in a pre-processing step, using a connected-component extraction algorithm (Gonzalez and Woods, 2018, Sec. 9.5). For each particle, firstly, its signed distance field $DS_{\Phi_i}$ is computed by applying the outlined three-step process to $\Phi_i$. Secondly, the fields are updated according to Sonon (2014, Sec. 2.4.1)

$$\begin{aligned} DN\_k &\leftarrow \max(DN\_{k-1}, \min(DN\_k, DS\_{\Phi\_i})) \\ DN\_1 &\leftarrow \min(DN\_1, DS\_{\Phi\_i}), \end{aligned} \tag{C.1}$$

starting with $k_{\max}$, i.e., the highest desired value of $k$. Note that the computational effort for this generic strategy is proportional to the number of particles in the image $\mathcal{I}$. However, certain EDTs may be modified to evaluate $DN_k$ in a single pass, as shown in the next section.
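The update rule (C.1) can be sketched in a few lines. In the following illustration, each field is stored as a flat list of voxel values, and the per-particle signed distance fields are assumed to be given. The function names and the initialization of all $DN_k$ with $+\infty$ are assumptions made for this sketch.

```python
import math


def update_level_sets(dn, ds):
    """Apply update (C.1) in place for one particle.

    dn: list of fields, dn[k - 1] holds DN_k as a flat list of voxel values.
    ds: signed distance field of the current particle, same length as each DN_k.
    """
    # Start with the highest level, so that DN_{k-1} still holds its old values
    for k in range(len(dn) - 1, 0, -1):
        dn[k] = [max(lower, min(current, d))
                 for lower, current, d in zip(dn[k - 1], dn[k], ds)]
    dn[0] = [min(current, d) for current, d in zip(dn[0], ds)]


def nearest_neighbour_level_sets(particle_fields, k_max):
    """Compute DN_1, ..., DN_{k_max} by looping once over all particles."""
    n_voxels = len(particle_fields[0])
    dn = [[math.inf] * n_voxels for _ in range(k_max)]
    for ds in particle_fields:
        update_level_sets(dn, ds)
    return dn
```

After all particles have been processed, `dn[k - 1]` holds, at every voxel, the $k$-th smallest signed distance among all particles. The loop over `particle_fields` makes the linear dependence of the effort on the number of particles explicit.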

## **C.2 Choice of EDT algorithm**

For an extensive performance comparison and discussion of various EDTs for 2-dimensional images, we refer to the study by Fabbri et al. (2008). Following their taxonomy, EDTs may be broadly categorized into scanning algorithms and propagating algorithms (Fabbri et al., 2008), differing in the order in which the voxels are processed. For the present discussion, we consider one representative algorithm of each family. In scanning algorithms, the image is processed in terms of its rows, columns and planes. The fastest EDT in this category (Fabbri et al., 2008) is the algorithm by Meijster et al. (2002), which exploits that the minimization problem for computing the square Euclidean distance transform may be solved for each spatial dimension separately. The algorithm lends itself well to parallelization and periodicity of the image can be incorporated at virtually no additional cost, using the scheme of Coeurjolly (2008).
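The separability exploited by scanning algorithms can be illustrated with a small 2-dimensional sketch. The code below is not the linear-time algorithm of Meijster et al. (2002). It deliberately solves each one-dimensional minimization by brute force and ignores periodicity, so that only the dimension-by-dimension structure remains visible. All function names are hypothetical, and the image is assumed to contain at least one object voxel.

```python
import math

INF = math.inf


def column_pass(image):
    """Squared distance to the nearest object voxel within the same column."""
    rows, cols = len(image), len(image[0])
    g = [[INF] * cols for _ in range(rows)]
    for j in range(cols):
        for i in range(rows):
            for i2 in range(rows):
                if image[i2][j]:
                    g[i][j] = min(g[i][j], (i - i2) ** 2)
    return g


def row_pass(g):
    """Combine the column results along each row: min over j2 of g + (j - j2)^2."""
    rows, cols = len(g), len(g[0])
    return [[min(g[i][j2] + (j - j2) ** 2 for j2 in range(cols))
             for j in range(cols)]
            for i in range(rows)]


def squared_edt(image):
    """Squared Euclidean distance transform via two one-dimensional passes."""
    return row_pass(column_pass(image))


def squared_edt_brute_force(image):
    """Reference solution: minimize over all object voxels at once."""
    objects = [(i, j) for i, row in enumerate(image)
               for j, value in enumerate(row) if value]
    return [[min((i - a) ** 2 + (j - b) ** 2 for a, b in objects)
             for j in range(len(image[0]))]
            for i in range(len(image))]
```

Both functions return the same field, which confirms that the minimization over all object voxels decouples into one pass per spatial dimension. Since every row and column is processed independently within a pass, the passes parallelize naturally.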

**Figure C.1:** Overview of the microstructure and the $DN_1$ (red scale) and $DN_2$ (blue scale) level sets for a structure with 500 fibers

Propagating algorithms update the distance field in a narrow band or wavefront, emanating from the object voxels. As Dijkstra-type algorithms, they are very similar to the fast marching method for solving the (related but more general) eikonal equation. Algorithms of this type (Ragnemalm, 1992; Cuisenaire and Macq, 1999; Lotufo et al., 2000) mostly differ in details such as the data structure for the wavefront or the propagated information, see Sec. 7.4.1 in Fabbri et al. (2008) for a generic description. For our implementation, we choose the algorithm by Lotufo et al. (2000), using a bucket queue as data structure and propagating the nearest object voxel. The bucket queue enables a partial parallelization of the algorithm. Periodicity is integrated by considering the periodic 6-neighbourhood during propagation. Note that propagation-type algorithms are generally not exact, see Cuisenaire and Macq (1999) for a detailed discussion of the 2-dimensional case. However, in our studies the maximum error for computing $DN_1$ was usually below a single voxel length, which we consider acceptable.

**(a)** Computing $DN_1$ for varying voxel counts

**Figure C.2:** Benchmark of the scanning EDT by Meijster et al. (2002) and the propagating EDT by Lotufo et al. (2000) (and its modified version Alg. 8) for computing the nearest neighbour level sets of isotropically packed fiber structures

As a first benchmark, we investigate the performance of the algorithms for computing the $DN_1$ level set of a microstructure generated with the SAM algorithm by Schneider (2017b), featuring 500 isotropically distributed fibers with an aspect ratio of 10, occupying a volume fraction of 23.5%, see Fig. C.1. All EDT benchmarks were performed on a desktop computer with an Intel i7-8700K CPU using 6 threads. The runtimes for different spatial discretizations from $64^3$ voxels up to $512^3$ voxels are shown in Fig. C.2a. For the chosen structure, both EDT algorithms exhibit linear time complexity with respect to the voxel count. However, the scanning algorithm is more than an order of magnitude faster than the propagating algorithm, confirming the trends observed by Fabbri et al. (2008) in the 3-dimensional setting.

The vast difference in performance may suggest that this is the end of the story and the scanning algorithm by Meijster et al. (2002) is clearly superior. However, the situation changes when considering level sets of higher order. As far as the authors are aware, the scanning algorithm is limited to the sequential updating strategy (C.1) outlined in the last section. Thus, the runtime for computing the level sets becomes dependent on the number of inclusions. This is illustrated in Fig. C.2b, where the runtimes for computing $DN_1$ and $DN_2$ for microstructures with a fixed voxel count of $256^3$ and a varying number of packed fibers are plotted.

On the other hand, the propagating algorithm by Lotufo et al. (2000) can be naturally modified to compute the level sets $DN_k$ up to a maximum level $k_{\max}$ in a single pass. A pseudocode for the modified algorithm is outlined in Alg. 8. Informally speaking, a unique label is assigned to each object and the associated emanating wavefront. By allowing wavefronts with different labels to pass through each other, the order of arrival at a certain coordinate determines the level $k$. At the end, all points are visited $k_{\max}$ times. Note that a similar concept for fast marching algorithms was outlined by Sonon (2014, Sec. 2.4.1). The performance of the resulting single-pass propagating algorithm is virtually independent of the number of inclusions, see Fig. C.2b. In particular, it becomes the preferable option for object counts larger than 100. At the end, some closing remarks are in order:


algorithm. However, for the most common morphing operations (Sonon et al., 2012; Sonon, 2014; Sonon et al., 2015) the necessary $k_{\max}$ does not exceed 3. Hence, our evaluation of the algorithms is not substantially affected.

3. If a sequential evaluation of the level sets is unavoidable, e.g., as part of the LS-RSA microstructure generation process by Sonon et al. (2012), then Meijster's algorithm (Meijster et al., 2002) is the method of choice. Due to its high efficiency for computing the level set of a single particle, it relieves the user of using pre-screening strategies (Sonon et al., 2012). In addition, using Coeurjolly's approach (Coeurjolly, 2008) avoids the creation and consideration of periodic neighbours.
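The idea of labelled wavefronts passing through each other can be sketched as follows. This is an illustrative simplification and not a transcription of Alg. 8: it works on a 2-dimensional, non-periodic image whose object voxels already carry positive integer labels, uses a binary heap from the Python standard library instead of a bucket queue, and computes unsigned distances only. All names are hypothetical.

```python
import heapq
import math


def propagate_level_sets(labels, k_max):
    """Single-pass propagation of labelled wavefronts on a 2D label image.

    labels: 2D list, 0 for matrix voxels, a positive object label otherwise.
    Returns dist with dist[i][j][k - 1] approximating the distance to the
    k-th nearest object (unsigned).
    """
    rows, cols = len(labels), len(labels[0])
    dist = [[[math.inf] * k_max for _ in range(cols)] for _ in range(rows)]
    # Labels that have already arrived at each voxel, in order of arrival
    arrived = [[[] for _ in range(cols)] for _ in range(rows)]
    heap = []
    for i in range(rows):
        for j in range(cols):
            if labels[i][j]:
                # Entry: (distance, position, label, nearest object voxel)
                heapq.heappush(heap, (0.0, i, j, labels[i][j], i, j))
    while heap:
        d, i, j, label, oi, oj = heapq.heappop(heap)
        if len(arrived[i][j]) >= k_max or label in arrived[i][j]:
            continue  # voxel saturated, or this wavefront has already passed
        dist[i][j][len(arrived[i][j])] = d  # order of arrival sets the level
        arrived[i][j].append(label)
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols:
                if len(arrived[ni][nj]) < k_max and label not in arrived[ni][nj]:
                    # Propagate the nearest object voxel and re-evaluate the distance
                    new_d = math.hypot(ni - oi, nj - oj)
                    heapq.heappush(heap, (new_d, ni, nj, label, oi, oj))
    return dist
```

Every voxel accepts at most `k_max` distinct labels, so the number of accepted arrivals is bounded by the voxel count times `k_max`, independently of the number of objects. As with other propagation-type EDTs, the resulting distances are generally approximate for extended objects.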

**Algorithm 8** Propagation algorithm for computing $DN_k$ in a single pass

**Auxiliaries:**



**Input:** Binary image $\mathcal{I}$, maximum depth $k_{\max}$

**Output:** Level sets $DN(x, k)$


8: $L(v.x, V(v.x)) \gets v.label$


16: **if** (*.*) *<* max **and** *. /*∈ {(*.,* 0)*, ...,* (*.,* max)} **then**

17: *.* ← *.*

18: *.* ← *.*

19: . ← ‖. − ..‖<sub>2</sub>

22: **end for**

23: **end if**

24: **end while**

# **Bibliography**

Abdel-Karim, M., 2005. Shakedown of complex structures according to various hardening rules. International Journal of Pressure Vessels and Piping 82 (6), 427–458.

Advani, S. G., Tucker, C. L., 1987. The Use of Tensors to Describe and Predict Fiber Orientation in Short Fiber Composites. Journal of Rheology 31 (8), 751–784.

Aksenov, V., Chertov, M., Sinkov, K., 2021. Application of accelerated fixed-point algorithms to hydrodynamic well-fracture coupling. Computers and Geotechnics 129, 103783.

Albiez, J., Erdle, H., Weygand, D., Böhlke, T., 2019. A gradient plasticity creep model accounting for slip transfer/activation at interfaces evaluated for the intermetallic NiAl-9Mo. International Journal of Plasticity 113, 291–311.

Albiez, J., Sprenger, I., Seemüller, C., Weygand, D., Heilmaier, M., Böhlke, T., 2016a. Physically motivated model for creep of directionally solidified eutectics evaluated for the intermetallic NiAl-9Mo. Acta Materialia 110, 377–385.

Albiez, J., Sprenger, I., Weygand, D., Heilmaier, M., Böhlke, T., 2016b. Validation of the applicability of a creep model for directionally solidified eutectics with a lamellar microstructure. PAMM 16, 297–298.

Ambos, A., Willot, F., Jeulin, D., Trumel, H., 2015. Numerical modeling of the thermal expansion of an energetic material. International Journal of Solids and Structures 60–61, 125–139.

An, H., Jia, X., Walker, H. F., 2017. Anderson acceleration and application to the three-temperature energy equations. Journal of Computational Physics 347, 1–19.

Anderson, D. G., 1965. Iterative procedures for nonlinear integral equations. Journal of the ACM 12 (4), 547–560.

Anglin, B. S., Lebensohn, R. A., Rollett, A. D., 2014. Validation of a numerical method based on Fast Fourier Transforms for heterogeneous thermoelastic materials by comparison with analytical solutions. Computational Materials Science 87, 209–217.

Armero, F., Simo, J. C., 1992. A new unconditionally stable fractional step method for non-linear coupled thermomechanical problems. International Journal for Numerical Methods in Engineering 35, 737–766.

Armero, F., Simo, J. C., 1993. A priori stability estimates and unconditionally stable product formula algorithms for nonlinear coupled thermoplasticity. International Journal of Plasticity 9 (6), 749–782.

Bakhvalov, N. S., Panasenko, G., 1989. Homogenisation: Averaging Processes in Periodic Media: Mathematical Problems in the Mechanics of Composite Materials. Mathematics and its Applications. Kluwer Academic Publishers, Dordrecht.

Banach, S., 1922. Sur les opérations dans les ensembles abstraits et leur application aux équations intégrales. Fundamenta Mathematicae 3, 133–181.

Bargmann, S., Klusemann, B., Markmann, J., Schnabel, J. E., Schneider, K., Soyarslan, C., Wilmers, J., 2018. Generation of 3D representative volume elements for heterogeneous materials: A review. Progress in Materials Science 96, 322–384.

Barzilai, J., Borwein, J. M., 1988. Two-point step size gradient methods. IMA Journal of Numerical Analysis 8, 141–148.

Bauschke, H. H., Combettes, P. L., 2017. Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd Edition. CMS Books in Mathematics. Springer, New York.

Behnel, S., Bradshaw, R., Citro, C., Dalcin, L., Seljebotn, D. S., Smith, K., 2011. Cython: The Best of Both Worlds. Computing in Science & Engineering 13 (2), 31–39.

Bei, H., George, E. P., 2005. Microstructures and mechanical properties of a directionally solidified NiAl-Mo eutectic alloy. Acta Materialia 53 (1), 69–77.

Bei, H., Shim, S., George, E. P., Miller, M. K., Herbert, E. G., Pharr, G. M., 2007. Compressive strengths of molybdenum alloy micro-pillars prepared using a new technique. Scripta Materialia 57 (5), 397–400.

Bei, H., Shim, S., Pharr, G. M., George, E. P., 2008. Effects of pre-strain on the compressive stress-strain response of Mo-alloy single-crystal micropillars. Acta Materialia 56, 4762–4770.

Bellis, C., Suquet, P., 2019. Geometric Variational Principles for Computational Homogenization. Journal of Elasticity 137, 119–149.

Benaarbia, A., Chatzigeorgiou, G., Kiefer, B., Meraghni, F., 2019. A fully coupled thermo-viscoelastic-viscoplastic-damage framework to study the cyclic variability of the Taylor-Quinney coefficient for semicrystalline polymers. International Journal of Mechanical Sciences 163, 105128.

Bertram, A., 2011. Elasticity and Plasticity of Large Deformations, 3rd Edition. Springer, Berlin.

Bhattacharya, K., Suquet, P. M., 2005. A model problem concerning recoverable strains of shape-memory polycrystals. Proceedings of the Royal Society A 461, 2797–2816.

Bishop, J. F. W., 1953. A theoretical investigation of the plastic deformation of crystals by glide. Philosophical Magazine 44, 51–64.

Boeff, M., Gutknecht, F., Engels, P. S., Ma, A., Hartmaier, A., 2015. Formulation of nonlocal damage models based on spectral methods for application to complex microstructures. Engineering Fracture Mechanics 147, 373–387.

Bogner, S., Hu, L., Hollad, S., Hu, W., Gottstein, G., Bührig-Polaczek, A., 2012. Microstructure of a eutectic NiAl-Mo alloy directionally solidified using an industrial scale and a laboratory scale Bridgman furnace. International Journal of Materials Research 103 (1), 17–23.

Both, J. W., Kumar, K., Nordbotten, J. M., Radu, F. A., 2019. Anderson accelerated fixed-stress splitting schemes for consolidation of unsaturated porous media. Computers & Mathematics with Applications 77 (6), 1479–1502.

Boyd, S., Vandenberghe, L., 2004. Convex Optimization. Cambridge University Press.

Brezinski, C., Cipolla, S., Redivo-Zaglia, M., Saad, Y., 2021. Shanks and Anderson-type acceleration techniques for systems of nonlinear equations. IMA Journal of Numerical Analysis, drab061, 1–36.

Brinson, H. F., Brinson, L. C., 2015. Polymer Engineering Science and Viscoelasticity, 2nd Edition. Springer, New York.

Brisard, S., Dormieux, L., 2010. FFT-based methods for the mechanics of composites: A general variational framework. Computational Materials Science 49 (3), 663–671.

Brisard, S., Dormieux, L., 2012. Combining Galerkin approximation techniques with the principle of Hashin and Shtrikman to derive a new FFT-based numerical method for the homogenization of composites. Computer Methods in Applied Mechanics and Engineering 217–220, 197–212.

Broyden, C. G., 1965. A Class of Methods for Solving Nonlinear Simultaneous Equations. Mathematics of Computation 19, 577–593.

Broyden, C. G., 1970. The convergence of a class of double rank minimization algorithms: 2. The new algorithm. Journal of Mathematical Analysis and Applications 6, 222–231.

Burgarella, B., Maurel-Pantel, A., Lahellec, N., Bouvard, J.-L., Billon, N., Moulinec, H., Lebon, F., 2019. Effective viscoelastic behavior of short fibers composites using virtual DMA experiments. Mechanics of Time-Dependent Materials 23, 337–360.

Castelnau, O., Blackman, D. K., Lebensohn, R. A., Ponte Castañeda, P., 2008. Micromechanical modeling of the viscoplastic behavior of olivine. Journal of Geophysical Research: Solid Earth 113 (B9).

Chaboche, J. L., 1989. Constitutive equations for cyclic plasticity and cyclic viscoplasticity. International Journal of Plasticity 5 (3), 247–302.

Chaboche, J. L., 2008. A review of some plasticity and viscoplasticity constitutive theories. International Journal of Plasticity 24 (10), 1642–1693.

Chaboche, J. L., Dong-Van, K., Cordier, G., 1979. Modelization of the strain memory effect on the cyclic hardening of 316 stainless steel. In: Proceedings of the 5th International Conference on SMiRT. Berlin.

Chan, K. S., 2002. Modeling creep behavior of niobium silicide in-situ composites. Materials Science and Engineering: A 337 (1–2), 59–66.

Charalambakis, N., Chatzigeorgiou, G., Chemisky, Y., Meraghni, F., 2018. Mathematical homogenization of inelastic dissipative materials: a survey and recent progress. Continuum Mechanics and Thermodynamics 30, 1–51.

Chatzigeorgiou, G., Charalambakis, N., Chemisky, Y., Meraghni, F., 2016. Periodic homogenization for fully coupled thermomechanical modeling of dissipative generalized standard materials. International Journal of Plasticity 81, 18–39.

Chen, W., Wang, Z., Zhou, J., 2014. Large-scale L-BFGS using MapReduce. Advances in Neural Information Processing Systems 27, 1332–1340.

Chen, Y., Gélébart, L., Chateau, C., Bornert, M., Sauder, C., King, A., 2019a. Analysis of the damage initiation in a SiC/SiC composite tube from a direct comparison between large-scale numerical simulation and synchrotron X-ray micro-computed tomography. International Journal of Solids and Structures 161, 111–126.

Chen, Y., Vasiukov, D., Gélébart, L., Park, C. H., 2019b. A FFT solver for variational phase-field modeling of brittle fracture. Computer Methods in Applied Mechanics and Engineering 349, 167–190.

Cimmelli, V. A., Jou, D., Ruggeri, T., Ván, P., 2014. Entropy Principle and Recent Results in Non-Equilibrium Theories. Entropy 16 (3), 1756–1807.

Cline, H. E., Walter, J. L., 1970. The Effect of Alloy Additions on the Rod-Plate Transition in the Eutectic NiAl-Cr. Metallurgical Transactions 1 (10), 2907–2917.

Cline, H. E., Walter, J. L., Lifshin, E., Russell, R. R., 1971. Structures, Faults, and the Rod-Plate Transition in Eutectics. Metallurgical Transactions 2 (1), 189–194.

Cocco, A. P., Nelson, G. J., Harris, W. M., Nakajo, A., Myles, T. D., Kiss, A. M., Lombardo, J. J., Chiu, W. K. S., 2013. Three-dimensional microstructural imaging methods for energy materials. Physical Chemistry Chemical Physics 15 (39), 16377–16407.

Coeurjolly, D., 2008. Distance Transformation, Reverse Distance Transformation and Discrete Medial Axis on Toric Spaces. In: 19th International Conference on Pattern Recognition.

Coleman, B. D., Noll, W., 1963. The thermodynamics of elastic materials with heat conduction and viscosity. Archive for Rational Mechanics and Analysis 13 (1), 167–178.

Combettes, P. L., Pesquet, J.-C., 2011. Proximal Splitting Methods in Signal Processing. In: Fixed-Point Algorithms for Inverse Problems in Science and Engineering. Springer, New York, pp. 185–212.

Cuisenaire, O., Macq, B., 1999. Fast Euclidean Distance Transformation by Propagation Using Multiple Neighborhoods. Computer Vision and Image Understanding 76 (2), 163–172.

Cuitiño, A. M., Ortiz, M., 1993. Computational modelling of single crystals. Modelling and Simulation in Materials Science and Engineering 1 (3), 225–263.

Dai, Y. H., 2013. A perfect example for the BFGS method. Mathematical Programming 138 (1–2), 501–530.

Daphalapurkar, N. P., Wang, F., Fu, B., Lu, H., Komanduri, R., 2011. Determination of Mechanical Properties of Sand Grains by Nanoindentation. Experimental Mechanics 51, 719–728.

Darolia, R., 1991. NiAl alloys for high-temperature structural applications. JOM 43, 44–49.

De Monte, M., Moosbrugger, E., Quaresimin, M., 2010. Influence of temperature and thickness on the off-axis behaviour of short glass fibre reinforced polyamide 6.6 - cyclic loading. Composites Part A: Applied Science and Manufacturing 41 (10), 1368–1379.

De Sterck, H., He, Y., 2020. On the Asymptotic Linear Convergence Speed of Anderson Acceleration, Nesterov Acceleration, and Nonlinear GMRES. SIAM Journal on Scientific Computing 43 (5), 21–46.

Dembo, R. S., Eisenstat, S. C., Steihaug, T., 1982. Inexact Newton Methods. SIAM Journal on Numerical Analysis 19, 400–408.

Dennis, J. E., Moré, J. J., 1977. Quasi-Newton methods, motivation and theory. SIAM Review 19, 46–89.

Desideri, U., 2013. 3 - Fundamentals of gas turbine cycles: thermodynamics, efficiency and specific power. In: Jansohn, P. (Ed.), Modern Gas Turbine Systems. Woodhead Publishing, Cambridge, pp. 44–85.

Doghri, I., Brassart, L., Adam, L., Gérard, J.-S., 2011. A second-moment incremental formulation for the mean-field homogenization of elastoplastic composites. International Journal of Plasticity 27, 352–371.

Dong, Y., 2010. Step lengths in BFGS method for monotone gradients. Computers & Mathematics with Applications 60 (3), 563–571.

Dorn, C., Schneider, M., 2019. Lippmann-Schwinger solvers for the explicit jump discretization for thermal computational homogenization problems. International Journal for Numerical Methods in Engineering 118 (11), 631–653.

Douglas, J., Rachford, H. H., 1956. On the numerical solution of heat conduction problems in two and three space variables. Transactions of the American Mathematical Society 82, 421–439.

Dudová, M., Kuchařová, K., Barták, T., Bei, H., George, E. P., Somsen, C., Dlouhý, A., 2011. Creep in directionally solidified NiAl-Mo eutectics. Scripta Materialia 65 (8), 699–702.

Dvorak, G., Benveniste, Y., 1992. On transformation strains and uniform fields in multiphase elastic media. Proceedings of the Royal Society A 437 (1900), 291–310.

Eghtesad, A., Barrett, T. J., Germaschewski, K., Lebensohn, R. A., McCabe, R. J., Knezevic, M., 2018a. OpenMP and MPI implementations of an elasto-viscoplastic fast Fourier transform-based micromechanical solver for fast crystal plasticity modeling. Advances in Engineering Software 126, 46–60.

Eghtesad, A., Zecevic, M., Lebensohn, R. A., McCabe, R. J., Knezevic, M., 2018b. Spectral database constitutive representation within a spectral micromechanical solver for computationally efficient polycrystal plasticity modelling. Computational Mechanics 61, 89–104.

Eisenlohr, P., Diehl, M., Lebensohn, R. A., Roters, F., 2013. A spectral method solution to crystal elasto-viscoplasticity at finite strains. International Journal of Plasticity 46, 37–53.

Eisenstat, S. C., Walker, H. F., 1996. Choosing the forcing terms in an inexact Newton method. SIAM Journal on Scientific Computing 17 (1), 16–32.

El-Awady, J. A., 2015. Unravelling the physics of size-dependent dislocation-mediated plasticity. Nature Communications 6 (1), 5926.

El Shawish, S., Vincent, P.-G., Moulinec, H., Cizelj, L., Gélébart, L., 2020. Full-field polycrystal plasticity simulations of neutron-irradiated austenitic stainless steel: A comparison between FE and FFT-based approaches. Journal of Nuclear Materials 529, 151927.

Epting, W. K., Gelb, J., Litster, S., 2012. Resolving the Three-Dimensional Microstructure of Polymer Electrolyte Fuel Cell Electrodes using Nanometer-Scale X-ray Computed Tomography. Advanced Functional Materials 22 (3), 555–560.

Erbts, P., Düster, A., 2012. Accelerated staggered coupling schemes for problems of thermoelasticity at finite strains. Computers & Mathematics with Applications 64 (8), 2408–2430.

Ernesti, F., Schneider, M., Böhlke, T., 2020. Fast implicit solvers for phase-field fracture problems on heterogeneous microstructures. Computer Methods in Applied Mechanics and Engineering 363, 112793.

Esmaeillou, B., Ferreira, P., Bellenger, V., Tcharkhtchi, A., 2012. Fatigue Behavior of Polyamide 66/Glass Fiber Under Various Kinds of Applied Load. Polymer Composites 33 (4), 540–547.

Evans, C., Pollock, S., Rebholz, L. G., Xiao, M., 2020. A Proof That Anderson Acceleration Improves the Convergence Rate in Linearly Converging Fixed-Point Methods (But Not in Those Converging Quadratically). SIAM Journal on Numerical Analysis 58 (1), 788–810.

Eyert, V., 1996. A comparative study on methods for convergence acceleration of iterative vector sequences. Journal of Computational Physics 124 (2), 271–285.

Eyre, D. J., Milton, G. W., 1999. A fast numerical scheme for computing the response of composites using grid refinement. The European Physical Journal - Applied Physics 6 (1), 41–47.

Fabbri, R., Costa, L. D. F., Torelli, J. C., Bruno, O. M., 2008. 2D Euclidean Distance Transform Algorithms: A Comparative Survey. ACM Computing Surveys 40 (1), 1–44.

Fang, H.-R., Saad, Y., 2009. Two classes of multisecant methods for nonlinear acceleration. Numerical Linear Algebra with Applications 16, 197–221.

Ferrandini, P., Batista, W. W., Caram, R., 2004. Influence of growth rate on the microstructure and mechanical behaviour of a NiAl-Mo eutectic alloy. Journal of Alloys and Compounds 381 (1–2), 91–98.

Ferry, J. D., 1980. Viscoelastic Properties of Polymers, 3rd Edition. Wiley, New York.

Fillers, R. W., Tschoegl, N. W., 1977. The Effect of Pressure on the Mechanical Properties of Polymers. Transactions of the Society of Rheology 21 (1), 51–100.

Fletcher, R., 1970. A new approach to variable metric algorithms. The Computer Journal 13, 317–322.

Frederick, C. O., Armstrong, P. J., 2007. A mathematical representation of the multiaxial Bauschinger effect. Materials at High Temperatures 24 (1), 1–26.

Frigo, M., Johnson, S. G., 2005. The Design and Implementation of FFTW3. In: Proceedings of the IEEE. Vol. 93.

Fritzen, F., Leuschner, M., 2013. Reduced basis hybrid computational homogenization based on a mixed incremental formulation. Computer Methods in Applied Mechanics and Engineering 260, 143–154.

Fu, A., Zhang, J., Boyd, S., 2020. Anderson Accelerated Douglas-Rachford Splitting. SIAM Journal on Scientific Computing 42 (6), A3560–A3583.

Gajek, S., Schneider, M., Böhlke, T., 2020. On the micromechanics of deep material networks. Journal of the Mechanics and Physics of Solids 142, 103984.

Gasnier, J.-B., Willot, F., Trumel, H., Figliuzzi, B., Jeulin, D., Biessy, M., 2015. A Fourier-based numerical homogenization tool for an explosive material. Matériaux & Techniques 103 (3), 1–11.

Gélébart, L., Mondon-Cancel, R., 2013. Non-linear extension of FFT-based methods accelerated by conjugate gradients to evaluate the mechanical behavior of composite materials. Computational Materials Science 77, 430–439.

Germain, P., Nguyen, Q., Suquet, P., 1983. Continuum Thermodynamics. Journal of Applied Mechanics 50, 1010–1020.

Giselsson, P., 2017. Tight global linear convergence rate bounds for Douglas-Rachford splitting. Journal of Fixed Point Theory and Applications 19, 2241–2270.

Giselsson, P., Boyd, S., 2017. Linear convergence and metric selection for Douglas-Rachford splitting and ADMM. IEEE Transactions on Automatic Control 62, 532–544.

Glüge, R., Kalisch, J., 2014. The effective stiffness and stress concentrations of a multi-layer laminate. Composite Structures 111, 580–586.

Göküzüm, F. S., Nguyen, L. T. K., Keip, M.-A., 2019. A multiscale FE-FFT framework for electro-active materials at finite strains. Computational Mechanics, 1–22.

Goldfarb, D., 1970. A family of variable metric methods derived by variational means. Mathematics of Computation 24, 23–26.

Gombola, C., Kauffmann, A., Geramifard, G., Blankenburg, M., Heilmaier, M., 2020. Microstructural investigations of novel high temperature alloys based on NiAl-(Cr,Mo). Metals 10 (7), 961.

Gonzalez, R. C., Woods, R. E., 2018. Digital Image Processing, 4th Edition. Pearson, New York.

Görthofer, J., Schneider, M., Ospald, F., Hrymak, A., Böhlke, T., 2020. Computational homogenization of sheet molding compound composites based on high fidelity representative volume elements. Computational Materials Science 174, 109456.

Gote, A., Fischer, A., Zhang, C., Eidel, B., 2022. Computational Homogenization of Concrete in the Cyber Size-Resolution-Discretization (SRD) Parameter Space. Finite Elements in Analysis and Design 198, 103656.

Griewank, A., 1987. The local convergence of Broyden-like methods on Lipschitzian problems in Hilbert spaces. SIAM Journal on Numerical Analysis 24 (3), 684–705.

Griewank, A., Walther, A., 2008. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, 2nd Edition. Society for Industrial and Applied Mathematics, Philadelphia.

Grimm-Strele, H., Kabel, M., 2019. Runtime optimization of a memory efficient CG solver for FFT-based homogenization: implementation details and scaling results for linear elasticity. Computational Mechanics 64 (5), 1339–1345.

Haenschke, T., Gali, A., Heilmaier, M., Krüger, M., Bei, H., George, E. P., 2010. Synthesis and characterization of lamellar and fibre-reinforced NiAl-Mo and NiAl-Cr. Journal of Physics: Conference Series 240, 012063.

Halphen, B., Nguyen, Q., 1975. Sur les matériaux standard généralisés. Journal de Mécanique 14, 39–63.

Handa, K., Kato, A., Narisawa, I., 1999. Fatigue Characteristics of a Glass-Fiber-Reinforced Polyamide. Journal of Applied Polymer Science 72 (13), 1783–1793.

Haupt, P., 2002. Continuum Mechanics and Theory of Materials. Springer, Berlin.

Hill, R., 1968. On constitutive inequalities for simple materials–I. Journal of the Mechanics and Physics of Solids 16 (4), 229–242.

Hu, L., Hu, W., Gottstein, G., Bogner, S., Hollad, S., Bührig-Polaczek, A., 2012. Investigation into microstructure and mechanical properties of NiAl-Mo composites produced by directional solidification. Materials Science and Engineering A 539, 211–222.

Hu, L., Zhang, G., Hu, W., Gottstein, G., Bogner, S., Bührig-Polaczek, A., 2013. Tensile creep of directionally solidified NiAl-9Mo in situ composites. Acta Materialia 61 (19), 7155–7165.

Huang, S.-C., Hall, E. L., 1991. Plastic deformation and fracture of binary TiAl-base alloys. Metallurgical Transactions A 22 (2), 427–439.

Hull, D., Bacon, D. J., 2011. Introduction to Dislocations, 5th Edition. Butterworth-Heinemann, Oxford.

Hutchinson, J. W., 1976. Bounds and self-consistent estimates for creep of polycrystalline materials. Proceedings of the Royal Society of London A 348, 101–127.

Jegou, L., Marco, Y., Le Saux, V., Calloch, S., 2013. Fast prediction of the Wöhler curve from heat build-up measurements on Short Fiber Reinforced Plastic. International Journal of Fatigue 47, 259–267.

Jia, N., Kagan, V. A., 1998. Effects of Time and Temperature on the Tension-Tension Fatigue Behavior of Short Fiber Reinforced Polyamides. Polymer Composites 19 (4), 408–414.

Johnson, D. R., Chen, X. F., Oliver, B. F., Noebe, R. D., Whittenberger, J. D., 1995. Processing and mechanical properties of in-situ composites from the NiAl-Cr and the NiAl-(Cr,Mo) eutectic systems. Intermetallics 3 (2), 99–113.

Kabel, M., Böhlke, T., Schneider, M., 2014. Efficient fixed point and Newton-Krylov solvers for FFT-based homogenization of elasticity at large deformations. Computational Mechanics 54 (6), 1497–1514.

Kabel, M., Fink, A., Schneider, M., 2017. The composite voxel technique for inelastic problems. Computer Methods in Applied Mechanics and Engineering 322, 396–418.

Kabel, M., Fliegener, S., Schneider, M., 2016. Mixed boundary conditions for FFT-based homogenization at finite strains. Computational Mechanics 57 (2), 193–210.

Kang, G., 2008. Ratchetting: Recent progresses in phenomenon observation, constitutive modeling and application. International Journal of Fatigue 30 (8), 1448–1472.

Kanit, T., Forest, S., Galliet, I., Mounoury, V., Jeulin, D., 2003. Determination of the size of the representative volume element for random composites: statistical and numerical approach. International Journal of Solids and Structures 40 (13–14), 3647–3679.

Kantorovich, L. V., 1948. On Newton's method for functional equations. Doklady Akademii Nauk SSSR 59, 1237–1240.

Katunin, A., 2019. Criticality of the self-heating effect in polymers and polymer matrix composites during fatigue, and their application in non-destructive testing. Polymers 11 (1), 1–29.

Kehrer, L., Wicht, D., Wood, J. T., Böhlke, T., 2018. Dynamic mechanical analysis of pure and fiber reinforced thermoset- and thermoplastic-based polymers and free volume-based viscoelastic modeling. GAMM-Mitteilungen 41 (1), e201800007.

Kelley, C. T., 1995. Iterative methods for linear and nonlinear equations. Society for Industrial and Applied Mathematics.

Kelley, C. T., 2018. Numerical methods for nonlinear equations. Acta Numerica 27, 207–287.

Kelly, A., Street, K. N., 1972a. Creep of discontinuous fibre composites I. Experimental behaviour of lead-phosphor bronze. Proceedings of the Royal Society of London A 328 (1573), 267–282.

Kelly, A., Street, K. N., 1972b. Creep of discontinuous fibre composites II. Theory for the steady-state. Proceedings of the Royal Society of London A 328 (1573), 283–293.

Knoll, D. A., Keyes, D. E., 2004. Jacobian-free Newton-Krylov methods: a survey of approaches and applications. Journal of Computational Physics 193, 357–397.

Kobayashi, M., Ohno, N., 2002. Implementation of cyclic plasticity models based on a general form of kinematic hardening. International Journal for Numerical Methods in Engineering 53 (9), 2217–2238.

Köbler, J., Schneider, M., Ospald, F., Andrä, H., Müller, R., 2018. Fiber orientation interpolation for the multiscale analysis of short fiber reinforced composite parts. Computational Mechanics 61 (6), 729–750.

Kochmann, J., Wulfinghoff, S., Ehle, L., Mayer, J., Svendsen, B., Reese, S., 2018. Efficient and accurate two-scale FE-FFT-based prediction of the effective material behavior of elasto-viscoplastic polycrystals. Computational Mechanics 61, 751–764.

Kocks, U. F., Mecking, H., 2003. Physics and phenomenology of strain hardening: The FCC case. Progress in Materials Science 48, 171–273.

Krairi, A., Doghri, I., Schalnat, J., Robert, G., Van Paepegem, W., 2019. Thermo-mechanical coupling of a viscoelastic-viscoplastic model for thermoplastic polymers: thermodynamical derivation and experimental assessment. International Journal of Plasticity 115, 154–177.

Kubin, L., Devincre, B., Hoc, T., 2008. Modeling dislocation storage rates and mean free paths in face-centered cubic crystals. Acta Materialia 56, 6040–6049.

Kuhn, J., Schneider, M., Sonnweber-Ribic, P., Böhlke, T., 2020. Fast methods for computing centroidal Laguerre tessellations for prescribed volume fractions with applications to microstructure generation of polycrystalline materials. Computer Methods in Applied Mechanics and Engineering 369, 113175.

Lahellec, N., Suquet, P., 2007. On the effective behavior of nonlinear inelastic composites: I. Incremental variational principles. Journal of the Mechanics and Physics of Solids 55, 1932–1963.

Lebensohn, R. A., 2001. N-site modeling of a 3D viscoplastic polycrystal using Fast Fourier Transform. Acta Materialia 49 (14), 2723–2737.

Lebensohn, R. A., Brenner, R., Castelnau, O., Rollett, A. D., 2008. Orientation image-based micromechanical modelling of subgrain texture evolution in polycrystalline copper. Acta Materialia 56 (15), 3914–3926.

Lebensohn, R. A., Hartley, C. S., Tomé, C. N., Castelnau, O., 2010. Modeling the mechanical response of polycrystals deforming by climb and glide. Philosophical Magazine 90 (5), 567–583.

Lebensohn, R. A., Kanjarla, A. K., Eisenlohr, P., 2012. An elastoviscoplastic formulation based on fast Fourier transforms for the prediction of micromechanical fields in polycrystalline materials. International Journal of Plasticity 32–33, 59–69.

Lebensohn, R. A., Rollett, A. D., 2020. Spectral methods for full-field micromechanical modelling of polycrystalline material. Computational Materials Science 173, 109336.

Lebon, G., Jou, D., Casas-Vázquez, J., 2008. Understanding Nonequilibrium Thermodynamics. Springer, Berlin.

Lemaitre, J., Chaboche, J.-L., 1990. Mechanics of solid materials. Cambridge University Press, Cambridge.

Leuschner, M., Fritzen, F., 2018. Fourier-Accelerated Nodal Solvers (FANS) for homogenization problems. Computational Mechanics 62 (3), 359–392.

Li, J., Romero, I., Segurado, J., 2019. Development of a thermomechanically coupled crystal plasticity modeling framework: Application to polycrystalline homogenization. International Journal of Plasticity 119, 313–330.

Li, Z., Li, J., 2020. A fast Anderson-Chebyshev acceleration for nonlinear optimization. In: Chiappa, S., Calandra, R. (Eds.), Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics. Vol. 108. PMLR, Online.

Lions, P.-L., Mercier, B., 1979. Splitting algorithms for the sum of two nonlinear operators. SIAM Journal on Numerical Analysis 16 (6), 964–979.

Liu, D. C., Nocedal, J., 1989. On the limited memory BFGS method for large scale optimization. Mathematical Programming 45, 503–528.

Liu, I., 1972. Method of Lagrange Multipliers for Exploitation of the Entropy Principle. Archive for Rational Mechanics and Analysis 46 (2), 131–148.

Liu, I., 2002. Continuum Mechanics. Springer, Berlin.

Liu, Z., Wu, C. T., 2019. Exploring the 3D architectures of deep material network in data-driven multiscale mechanics. Journal of the Mechanics and Physics of Solids 127, 20–46.

Liu, Z., Wu, C. T., Koishi, M., 2019. A deep material network for multiscale topology learning and accelerated nonlinear modeling of heterogeneous materials. Computer Methods in Applied Mechanics and Engineering 345, 1138–1168.

Lotufo, R. A., Falcao, A. A., Zampirolli, F. A., 2000. Fast Euclidean Distance Transform using a Graph-Search Algorithm. Proceedings 13th Brazilian Symposium on Computer Graphics and Image Processing, 269–275.

Lucarini, S., Segurado, J., 2019. On the accuracy of spectral solvers for micromechanics based fatigue modeling. Computational Mechanics 63, 365–382.

Ma, R., Truster, T. J., 2019. FFT-based homogenization of hypoelastic plasticity at finite strains. Computer Methods in Applied Mechanics and Engineering 349, 499–521.

Malitsky, Y., Mishchenko, K., 2020. Adaptive Gradient Descent without Descent. In: Proceedings of the 37th International Conference on Machine Learning. Vol. 119. PMLR.

Maniatty, A. M., Dawson, P. R., Lee, Y. S., 1992. A time integration algorithm for elasto-viscoplastic cubic crystals applied to modeling polycrystalline deformation. International Journal for Numerical Methods in Engineering 35, 1565–1588.

Marano, A., Gélébart, L., 2020. Non-linear composite voxels for FFT-based explicit modeling of slip bands: Application to basal channeling in irradiated Zr alloys. International Journal of Solids and Structures 198, 110–125.

Marano, A., Gélébart, L., Forest, S., 2019. Intragranular localization induced by softening crystal plasticity: Analysis of slip and kink bands localization modes from high resolution FFT-simulations results. Acta Materialia 175, 262–275.

Martins, J. M. P., Neto, D. M., Alves, J. L., Oliveira, M. C., Laurent, H., Andrade-Campos, A., Menezes, L. F., 2017. A new staggered algorithm for thermomechanical coupled problems. International Journal of Solids and Structures 122–123, 42–58.

Massart, T. J., Sonon, B., Kamel, K. E. M., Poh, L. H., Sun, G., 2018. Level set-based generation of representative volume elements for the damage analysis of irregular masonry. Meccanica 53 (7), 1737–1755.

Matouš, K., Geers, M. G. D., Kouznetsova, V. G., Gillman, A., 2017. A review of predictive nonlinear theories for multiscale modeling of heterogeneous materials. Journal of Computational Physics 330, 192–220.

Matthies, H., Strang, G., 1979. The solution of nonlinear finite element equations. International Journal for Numerical Methods in Engineering 14, 1613–1626.

McDowell, D., 2008. Viscoplasticity of heterogeneous metallic materials. Materials Science and Engineering: R: Reports 62 (3), 67–123.

Meijster, A., Roerdink, J. B. T. M., Hesselink, W. H., 2002. A General Algorithm for Computing Distance Transforms in Linear Time. In: Goutsias, J., Vincent, L., Bloomberg, D. S. (Eds.), Mathematical Morphology and its applications to image and signal processing. Springer, Boston.

Michel, J. C., Moulinec, H., Suquet, P., 2001. A computational scheme for linear and non-linear composites with arbitrary phase contrast. International Journal for Numerical Methods in Engineering 52, 139–160.

Michel, J. C., Suquet, P., 2003. Nonuniform transformation field analysis. International Journal of Solids and Structures 40 (25), 6937–6955.

Miehe, C., 1995. A theory of large-strain isotropic thermoplasticity based on metric transformation tensors. Archive of Applied Mechanics 66 (1–2), 45–64.

Miehe, C., 2002. Strain-driven homogenization of inelastic microstructures and composites based on an incremental variational formulation. International Journal for Numerical Methods in Engineering 55, 1285–1322.

Milton, G. W., 2002. The Theory of Composites. Cambridge University Press, Cambridge.

Miracle, D. B., 1993. Overview No. 104 The physical and mechanical properties of NiAl. Acta Metallurgica et Materialia 41 (3), 649–684.

Misra, A., Wu, Z. L., Kush, M. T., Gibala, R., 1998. Deformation and fracture behaviour of directionally solidified NiAl-Mo and NiAl-Mo(Re) eutectic composites. Philosophical Magazine A 78 (3), 533–550.

Monchiet, V., Bonnet, G., 2012. A polarization-based FFT iterative scheme for computing the effective properties of elastic composites with arbitrary contrast. International Journal for Numerical Methods in Engineering 89, 1419–1436.

Monchiet, V., Bonnet, G., 2013. Numerical homogenization of nonlinear composites with a polarization-based FFT iterative scheme. Computational Materials Science 79, 276–283.

Moore, E. H., 1920. On the reciprocal of the general algebraic matrix. Bulletin of the American Mathematical Society 26, 394–395.

Mortazavian, S., Fatemi, A., 2015. Fatigue behavior and modeling of short fiber reinforced polymer composites: A literature review. International Journal of Fatigue 70, 297–321.

Moulinec, H., Silva, F., 2014. Comparison of three accelerated FFT-based schemes for computing the mechanical response of composite materials. International Journal for Numerical Methods in Engineering 97, 960–985.

Moulinec, H., Suquet, P., 1994. A fast numerical method for computing the linear and nonlinear mechanical properties of composites. Comptes Rendus de l'Académie des Sciences. Série II 318 (11), 1417–1423.

Moulinec, H., Suquet, P., 1998. A numerical method for computing the overall response of nonlinear composites with complex microstructure. Computer Methods in Applied Mechanics and Engineering 157, 69–94.

Mura, T., 1987. Micromechanics of Defects in Solids, 2nd Edition. Kluwer Academic Publishers, Dordrecht.

Nagra, J. S., Brahme, A., Lebensohn, R. A., Inal, K., 2017. Efficient fast Fourier transform-based numerical implementation to simulate large strain behavior of polycrystalline materials. International Journal of Plasticity 98, 65–82.

Naumenko, K., Altenbach, H., 2005. A phenomenological model for anisotropic creep in a multipass weld metal. Archive of Applied Mechanics 74 (11), 808–819.

Nesterov, Y., 1983. A method for solving the convex programming problem with convergence rate O(1/k<sup>2</sup>). Doklady Akademii Nauk SSSR 269 (3).

Nesterov, Y., 2004. Introductory Lectures on Convex Optimization: A Basic Course. Mathematics and its applications. Kluwer Academic Publishers, Boston.

Nocedal, J., 1980. Updating Quasi-Newton Matrices With Limited Storage. Mathematics of Computation 35 (151), 773–782.

Nocedal, J., Wright, S. J., 1999. Numerical Optimization. Springer, New York.

Noebe, R. D., Bowman, R. R., Nathal, M. V., 1993. Physical and mechanical properties of the B2 compound NiAl. International Materials Reviews 38 (4), 193–232.

Oosterlee, C. W., Washio, T., 2000. Krylov subspace acceleration of nonlinear multigrid with application to recirculating flows. SIAM Journal on Scientific Computing 21 (5), 1670–1690.

Ouyang, W., Peng, Y., Yao, Y., Zhang, J., Deng, B., 2020. Anderson Acceleration for Nonconvex ADMM Based on Douglas-Rachford Splitting. Computer Graphics Forum 39 (5), 221–239.

Özdemir, I., Brekelmans, W. A. M., Geers, M. G. D., 2008. FE<sup>2</sup> computational homogenization for the thermo-mechanical analysis of heterogeneous solids. Computer Methods in Applied Mechanics and Engineering 198 (3–4), 602–613.

Paige, C. C., Saunders, M. A., 1975. Solution of sparse indefinite systems of linear equations. SIAM Journal on Numerical Analysis 12 (4), 617–629.

Peaceman, D. W., Rachford, H. H., 1955. The Numerical Solution of Parabolic and Elliptic Differential Equations. Journal of the Society for Industrial and Applied Mathematics 3 (1), 28–41.

Penrose, R., 1955. A generalized inverse for matrices. Proceedings of the Cambridge Philosophical Society 51, 406–413.

Pollock, S., Rebholz, L. G., 2021. Anderson acceleration for contractive and noncontractive operators. IMA Journal of Numerical Analysis 41 (4), 2841–2872.

Pollock, S., Rebholz, L. G., Xiao, M., 2021. Acceleration of nonlinear solvers for natural convection problems. Journal of Numerical Mathematics 29 (4), 323–341.

Powell, M. J. D., 1976. Some global convergence properties of a variable metric algorithm for minimization without exact line search. In: Cottle, R. W., Lemke, C. E. (Eds.), Nonlinear Programming. Vol. IX of SIAM-AMS Proceedings. SIAM Publications, Philadelphia, pp. 53–72.

Quey, R., Dawson, P. R., Barbe, F., 2011. Large-scale 3D random polycrystals for the finite element method: Generation, meshing and remeshing. Computer Methods in Applied Mechanics and Engineering 200 (17–20), 1729–1745.

Ragnemalm, I., 1992. Neighborhoods for Distance Transformations Using Ordered Propagation. CVGIP: Image Understanding 56 (3), 399–409.

Raj, S. V., Locci, I. E., 2001. Microstructural characterization of a directionally-solidified Ni-33 (at.%) Al-31Cr-3Mo eutectic alloy as a function of withdrawal rate. Intermetallics 9 (3), 217–227.

Ramière, I., Helfer, T., 2015. Iterative residual-based vector methods to accelerate fixed point iterations. Computers and Mathematics with Applications 70 (9), 2210–2226.

Riedlbauer, D., Steinmann, P., Mergheim, J., 2014. Thermomechanical finite element simulations of selective electron beam melting processes: performance considerations. Computational Mechanics 54 (1), 109–122.

Rittel, D., 2000. Investigation of the heat generated during cyclic loading of two glassy polymers. Part I: Experimental. Mechanics of Materials 32 (3), 131–147.

Rockafellar, R. T., 1970. Convex Analysis. Princeton University Press, Princeton.

Rollett, A. D., Lebensohn, R. A., Groeber, M., Choi, Y., Li, J., Rohrer, G. S., 2010. Stress hot spots in viscoplastic deformation of polycrystals. Modelling and Simulation in Materials Science and Engineering 18 (7), 074005.

Roters, F., Diehl, M., Shanthraj, P., Eisenlohr, P., Reuber, C., Wong, S. L., Maiti, T., Ebrahimi, A., Hochrainer, T., Fabritius, H.-O., Nikolov, S., Friák, M., Fujita, N., Grilli, N., Janssens, K. G. F., Jia, N., Kok, P. J. J., Ma, D., Meier, F., Werner, E., Stricker, M., Weygand, D., Raabe, D., 2018. DAMASK - The Düsseldorf Advanced Material Simulation Kit for modeling multi-physics crystal plasticity, thermal, and damage phenomena from the single crystal up to the component scale. Computational Materials Science 158, 420–478.

Rothe, S., Erbts, P., Düster, A., Hartmann, S., 2015. Monolithic and partitioned coupling schemes for thermo-viscoplasticity. Computer Methods in Applied Mechanics and Engineering 293, 375–410.

Rovinelli, A., Proudhon, H., Lebensohn, R. A., Sangid, M. D., 2020. Assessing the reliability of fast Fourier transform-based crystal plasticity simulations of a polycrystalline material near a crack tip. International Journal of Solids and Structures 184, 153–166.

Saad, Y., Schultz, M. H., 1986. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM Journal on Scientific and Statistical Computing 7, 856–869.

Sanditov, D. S., Mantatov, V. V., Sanditov, B. D., 2009. Poisson Ratio and Plasticity of Glasses. Technical Physics 54 (4), 594–596.

Scherf, A., Kauffmann, A., Kauffmann-Weiss, S., Scherer, T., Li, X., Stein, F., Heilmaier, M., 2016. Orientation relationship of eutectoid FeAl and FeAl2. Journal of Applied Crystallography 49, 442–449.

Schmitt, A., Kumar, K. S., Kauffmann, A., Li, X., Stein, F., Heilmaier, M., 2017. Creep of binary Fe-Al alloys with ultrafine lamellar microstructures. Intermetallics 90, 180–187.

Schneider, M., 2015. Convergence of FFT-based homogenization for strongly heterogeneous media. Mathematical Methods in the Applied Sciences 38 (13), 2761–2778.

Schneider, M., 2017a. An FFT-based fast gradient method for elastic and inelastic unit cell homogenization problems. Computer Methods in Applied Mechanics and Engineering 315, 846–866.

Schneider, M., 2017b. The Sequential Addition and Migration method to generate representative volume elements for the homogenization of short fiber reinforced plastics. Computational Mechanics 59, 247–263.

Schneider, M., 2019a. On the Barzilai-Borwein basic scheme in FFT-based computational homogenization. International Journal for Numerical Methods in Engineering 118, 482–494.

Schneider, M., 2019b. On the mathematical foundations of the self-consistent clustering analysis for non-linear materials at small strains. Computer Methods in Applied Mechanics and Engineering 354, 783–801.

Schneider, M., 2020a. A dynamical view of nonlinear conjugate gradient methods with applications to FFT-based computational micromechanics. Computational Mechanics 66, 239–257.

Schneider, M., 2020b. Lippmann-Schwinger solvers for the computational homogenization of materials with pores. International Journal for Numerical Methods in Engineering 121 (22), 5017–5041.

Schneider, M., 2021. A review of nonlinear FFT-based computational homogenization methods. Acta Mechanica 232 (6), 2051–2100.

Schneider, M., Hofmann, T., Andrä, H., Lechner, P., Ettemeyer, F., Volk, W., Steeb, H., 2018. Modelling the microstructure and computing effective elastic properties of sand core materials. International Journal of Solids and Structures 143, 1–17.

Schneider, M., Josien, M., Otto, F., 2022. Representative volume elements for matrix-inclusion composites - a computational study on the effects of an improper treatment of particles intersecting the boundary and the benefits of periodizing the ensemble. Journal of the Mechanics and Physics of Solids 158, 104652.

Schneider, M., Merkert, D., Kabel, M., 2017. FFT-based homogenization for microstructures discretized by linear hexahedral elements. International Journal for Numerical Methods in Engineering 109 (10), 1461–1489.

Schneider, M., Ospald, F., Kabel, M., 2016. Computational homogenization of elasticity on a staggered grid. International Journal for Numerical Methods in Engineering 105 (9), 693–720.

Schneider, M., Wicht, D., Böhlke, T., 2019. On polarization-based schemes for the FFT-based computational homogenization of inelastic materials. Computational Mechanics 64 (4), 1073–1095.

Seemüller, C., Heilmaier, M., Haenschke, T., Bei, H., Dlouhý, A., George, E. P., 2013. Influence of fiber alignment on creep in directionally solidified NiAl-10Mo in-situ composites. Intermetallics 35, 110–115.

Segurado, J., Lebensohn, R. A., LLorca, J., 2018. Chapter One - Computational Homogenization of Polycrystals. Advances in Applied Mechanics 51, 1–114.

Segurado, J., Llorca, J., González, C., 2002. On the accuracy of mean-field approaches to simulate the plastic deformation of composites. Scripta Materialia 46 (7), 525–529.

Seth, B. R., 1961. Generalized strain measure with applications to physical problems. Tech. rep., Wisconsin University - Madison Mathematics Research Center.

Shanno, D. F., 1970. Conditioning of quasi-Newton methods for function minimization. Mathematics of Computation 24, 647–650.

Shanno, D. F., Puah, K. H., 1978. Matrix conditioning and nonlinear optimization. Mathematical Programming 14, 149–160.

Shanthraj, P., Diehl, M., Eisenlohr, P., Roters, F., Raabe, D., 2019. Spectral Solvers for Crystal Plasticity and Multi-physics Simulations. In: Hsueh, C.-H. (Ed.), Handbook of Mechanics of Materials. Springer, Singapore, pp. 1347–1372.

Shanthraj, P., Eisenlohr, P., Diehl, M., Roters, F., 2015. Numerically robust spectral methods for crystal plasticity simulations of heterogeneous materials. International Journal of Plasticity 66, 31–45.

Sharma, L., Peerlings, R. H. J., Shanthraj, P., Roters, F., Geers, M. G. D., 2020. An FFT-based spectral solver for interface decohesion modelling using a gradient damage approach. Computational Mechanics 65, 925–939.

Shutov, A. V., Landgraf, R., Ihlemann, J., 2013. An explicit solution for implicit time stepping in multiplicative finite strain viscoelasticity. Computer Methods in Applied Mechanics and Engineering 265, 213–225.

Simmons, G., Wang, H., 1971. Single Crystal Elastic Constants and Calculated Aggregate Properties: A Handbook. MIT Press, Boston.

Simo, J. C., Hughes, T. J. R., 1998. Computational Inelasticity. Springer, New York.

Simo, J. C., Miehe, C., 1992. Associative coupled thermoplasticity at finite strains: Formulation, numerical analysis and implementation. Computer Methods in Applied Mechanics and Engineering 98, 41–104.

Sixto-Camacho, L. M., Bravo-Castillero, J., Brenner, R., Guinovart-Díaz, R., Mechkour, H., Rodríguez-Ramos, R., Sabina, F. J., 2013. Asymptotic homogenization of periodic thermo-magneto-electro-elastic heterogeneous media. Computers & Mathematics with Applications 66 (10), 2056–2074.

Sonon, B., 2014. On advanced techniques for generation and discretization of the microstructure of complex heterogeneous materials. Ph.D. thesis, Université Libre de Bruxelles.

Sonon, B., François, B., Massart, T. J., 2012. A unified level set based methodology for fast generation of complex microstructural multi-phase RVEs. Computer Methods in Applied Mechanics and Engineering 223–224, 103–122.

Sonon, B., François, B., Massart, T. J., 2015. An advanced approach for the generation of complex cellular material representative volume elements using distance fields and level sets. Computational Mechanics 56 (2), 221–242.

Stainier, L., Ortiz, M., 2010. Study and validation of a variational theory of thermo-mechanical coupling in finite visco-plasticity. International Journal of Solids and Structures 47 (5), 705–715.

Staub, S., Andrä, H., Kabel, M., 2018. Fast FFT based solver for rate-dependent deformations of composites and nonwovens. International Journal of Solids and Structures 154, 33–42.

Steinmann, P., Stein, E., 1996. On the numerical treatment and analysis of finite deformation ductile single crystal plasticity. Computer Methods in Applied Mechanics and Engineering 129, 235–254.

Sudharshan Phani, P., Johanns, K. E., Duscher, G., Gali, A., George, E. P., Pharr, G. M., 2011. Scanning transmission electron microscope observations of defects in as-grown and pre-strained Mo alloy fibers. Acta Materialia 59 (5), 2172–2179.

Taylor, R. L., Pister, K. S., Goudreau, G. L., 1970. Thermomechanical analysis of viscoelastic solids. International Journal for Numerical Methods in Engineering 2, 45–49.

Temizer, İ., 2012. On the asymptotic expansion treatment of two-scale finite thermoelasticity. International Journal of Engineering Science 53, 74–84.

Terada, K., Kurumatani, M., Ushida, T., Kikuchi, N., 2010. A method of two-scale thermo-mechanical analysis for porous solids with micro-scale heat transfer. Computational Mechanics 46, 269–285.

Tikkarrouchine, E., Chatzigeorgiou, G., Chemisky, Y., Meraghni, F., 2019. Fully coupled thermo-viscoplastic analysis of composite structures by means of multi-scale three-dimensional finite element computations. International Journal of Solids and Structures 164, 120–140.

Torquato, S., Jiao, Y., 2010. Robust algorithm to generate a diverse class of dense disordered and ordered sphere packings via linear programming. Physical Review E 82, 061302.

Toth, A., Kelley, C. T., 2015. Convergence Analysis for Anderson Acceleration. SIAM Journal on Numerical Analysis 53 (2), 805–819.

Truesdell, C., Noll, W., 2004. The Non-Linear Field Theories of Mechanics, 3rd Edition. Springer, Berlin, Heidelberg.

Tscharnuter, D., Jerabek, M., Major, Z., Pinter, G., 2012. Uniaxial nonlinear viscoelastic viscoplastic modeling of polypropylene. Mechanics of Time-Dependent Materials 16, 275–286.

Tschoegl, N., 1989. The Phenomenological Theory of Linear Viscoelastic Behavior: An Introduction. Springer, Berlin, Heidelberg.

Turner, P. R., Huntley, E., 1976. Variable Metric Methods in Hilbert Space with Applications to Control Problems. Journal of Optimization Theory and Applications 19 (3), 381–400.

Uchic, M. D., Holzer, L., Inkson, B. J., Principe, E. L., Munroe, P., 2007. Three-Dimensional Microstructural Characterization Using Focused Ion Beam Tomography. MRS Bulletin 32 (5), 408–416.

Vaz Jr., M., Muñoz-Rojas, P. A., Lange, M. R., 2011. Damage evolution and thermal coupled effects in inelastic solids. International Journal of Mechanical Sciences 53 (5), 387–398.

Vidyasagar, A., Tan, W. L., Kochmann, D. M., 2017. Predicting the effective response of bulk polycrystalline ferroelectric ceramics via improved spectral phase field methods. Journal of the Mechanics and Physics of Solids 106, 133–151.

Vinogradov, V., Milton, G. W., 2008. An accelerated FFT algorithm for thermoelastic and non-linear composites. International Journal for Numerical Methods in Engineering 76 (11), 1678–1685.

Šilhavý, M., 1997. The Mechanics and Thermodynamics of Continuous Media. Springer, Berlin.

Walker, H. F., Ni, P., 2011. Anderson Acceleration for Fixed-Point Iterations. SIAM Journal on Numerical Analysis 49 (4), 1715–1735.

Wang, D., He, Y., De Sterck, H., 2021. On the Asymptotic Linear Convergence Speed of Anderson Acceleration Applied to ADMM. Journal of Scientific Computing 88 (2), 1–35.

Wang, D., Shanthraj, P., Springer, H., Raabe, D., 2018a. Particle-induced damage in Fe-TiB<sub>2</sub> high stiffness metal matrix composite steels. Materials and Design 160, 557–571.

Wang, L., Shen, J., Zhang, G., Zhang, Y., Guo, L., Ge, Y., Gao, L., Fu, H., 2018b. Stability of lamellar structure of directionally solidified NiAl-28Cr-6Mo eutectic alloy at different withdrawal rates and temperatures. Intermetallics 94, 83–91.

Warlimont, H., Martienssen, W. (Eds.), 2018. Springer Handbook of Materials Data, 2nd Edition. Springer, Cham.

Whittenberger, J. D., 1987. Effect of composition and grain size on slow plastic flow properties of NiAl between 1200 and 1400 K. Journal of Materials Science 22 (2), 394–402.

Whittenberger, J. D., Noebe, R. D., Cullers, C. L., Kumar, K. S., Mannan, S. K., 1991. 1000 to 1200 K time-dependent compressive deformation of single-crystalline and polycrystalline B2 Ni-40Al. Metallurgical Transactions A 22 (7), 1595–1607.

Whittenberger, J. D., Raj, S. V., Locci, I. E., 2001. Effect of Microstructure on Creep in Directionally Solidified NiAl-31Cr-3Mo. Technical Report NASA/TM-2001-211306, NASA Glenn Research Center.

Wicht, D., Kauffmann, A., Schneider, M., Heilmaier, M., Böhlke, T., 2022. On the impact of the mesostructure on the creep response of cellular NiAl-Mo eutectics. Acta Materialia 226, 117626.

Wicht, D., Schneider, M., Böhlke, T., 2020a. An efficient solution scheme for small-strain crystal-elasto-viscoplasticity in a dual framework. Computer Methods in Applied Mechanics and Engineering 358, 112611.

Wicht, D., Schneider, M., Böhlke, T., 2020b. On Quasi-Newton methods in fast Fourier transform-based micromechanics. International Journal for Numerical Methods in Engineering 121 (8), 1665–1694.

Wicht, D., Schneider, M., Böhlke, T., 2021a. Anderson-accelerated polarization schemes for fast Fourier transform-based computational homogenization. International Journal for Numerical Methods in Engineering 122 (9), 2287–2311.

Wicht, D., Schneider, M., Böhlke, T., 2021b. Computing the effective response of heterogeneous materials with thermomechanically coupled constituents by an implicit fast Fourier transform-based approach. International Journal for Numerical Methods in Engineering 122 (5), 1307–1332.

Wichtmann, T., Triantafyllidis, T., 2010. On the influence of the grain size distribution curve on P-wave velocity, constrained elastic modulus and Poisson's ratio of quartz sands. Soil Dynamics and Earthquake Engineering 30 (8), 757–766.

Widom, B., 1966. Random Sequential Addition of Hard Spheres to a Volume. The Journal of Chemical Physics 44 (10), 3888–3894.

Williams, M. L., Landel, R. F., Ferry, J. D., 1955. The Temperature Dependence of Relaxation Mechanisms in Amorphous Polymers and Other Glass-forming Liquids. Journal of the American Chemical Society 77 (14), 3701–3707.

Williams, S., Philipse, A., 2003. Random packings of spheres and spherocylinders simulated by mechanical contraction. Physical Review E 67, 1–9.

Willis, J. R., 1981. Variational and Related Methods for the Overall Properties of Composites. Advances in Applied Mechanics 21, 1–78.

Willot, F., 2015. Fourier-based schemes for computing the mechanical response of composites with accurate local fields. Comptes Rendus Mécanique 343 (3), 232–245.

Wolfe, P., 1969. Convergence Conditions for Ascent Methods. SIAM Review 11 (2), 226–235.

Wriggers, P., 2008. Nonlinear Finite Element Methods. Springer, Berlin, Heidelberg.

Wulfinghoff, S., Böhlke, T., 2013. Equivalent plastic strain gradient crystal plasticity - Enhanced power law subroutine. GAMM-Mitteilungen 36 (2), 134–148.

Zeller, R., Dederichs, P. H., 1973. Elastic constants of polycrystals. physica status solidi 55 (2), 831–842.

Zeman, J., Vondřejc, J., Novák, J., Marek, I., 2010. Accelerating a FFT-based solver for numerical homogenization of periodic media by conjugate gradients. Journal of Computational Physics 229 (21), 8065–8071.

Zhang, J., Peng, Y., Ouyang, W., Deng, B., 2019. Accelerating ADMM for efficient simulation and optimization. ACM Transactions on Graphics 38 (6), 1–21.

Zhang, J., Shen, J., Shang, Z., Feng, Z., Wang, L., Fu, H., 2012. Microstructure and room temperature fracture toughness of directionally solidified NiAl-Mo eutectic in situ composites. Intermetallics 21 (1), 18–25.

Zhang, J., Shen, J., Shang, Z., Wang, L., Fu, H., 2013. Directional solidification and characterization of NiAl-9Mo eutectic alloy. Transactions of Nonferrous Metals Society of China 23 (12), 3499–3507.

# **Publications**


### **Schriftenreihe Kontinuumsmechanik im Maschinenbau Karlsruher Institut für Technologie (KIT) (ISSN 2192-693X)**


The volumes are freely available as PDF at www.ksp.kit.edu or can be ordered as print editions.


Daniel Wicht **Efficient fast Fourier transform-based solvers for computing the thermomechanical behavior of applied materials.** ISBN 978-3-7315-1220-2 **Band 21**

The mechanical behavior of many applied materials arises from their microstructure. Thus, to aid the design, development and industrialization of new materials, robust computational homogenization methods are indispensable. Hence, the present work is devoted to investigating and developing FFT-based micromechanics solvers for efficiently computing the (thermo)mechanical response of nonlinear composite materials with complex microstructures. To this end, both Lippmann-Schwinger solvers and polarization schemes are considered as starting points for powerful new algorithms. In addition to these general-purpose methods, we consider a number of specialized applications of FFT-based solvers such as crystal plasticity in a stress-based setting and thermomechanically coupled materials. Last but not least, we use the developed methods for investigating the mechanical behavior of directionally solidified NiAl-Mo alloys, which have attracted considerable research interest as high-temperature materials. These materials feature cell and fiber structures on the microscale, which greatly impact their overall mechanical properties. The power and flexibility of FFT-based solvers enable a systematic investigation of the impact of the microstructure morphology on the resulting creep response of NiAl-Mo alloys.

ISSN 2192-693X ISBN 978-3-7315-1220-2

Printed on FSC-certified paper