Lecture Notes of Tensor Network Contractions

Tensor network (TN), a young mathematical tool of high vitality and great potential, has been undergoing extremely rapid developments in the last two decades, gaining tremendous success in condensed matter physics, atomic physics, quantum information science, statistical physics, and so on. In these lecture notes, we focus on the contraction algorithms of TN as well as some of the applications to the simulations of quantum many-body systems. Starting from basic concepts and definitions, we first explain the relations between TN and physical problems, including the TN representations of classical partition functions, quantum many-body states (by matrix product state, tree TN, and projected entangled pair state), time evolution simulations, etc. These problems, which are challenging to solve, can be transformed into TN contraction problems. We then present several paradigm algorithms based on the ideas of the numerical renormalization group and/or boundary states, including the density matrix renormalization group, time-evolving block decimation, coarse-graining/corner tensor renormalization group, and several distinguished variational algorithms. Finally, we revisit the TN approaches from the perspectives of multi-linear algebra (also known as tensor algebra or tensor decompositions) and quantum simulation. Despite the apparent differences in the ideas and strategies of different TN algorithms, we aim at revealing the underlying relations and resemblances in order to present a systematic picture for understanding the TN contraction approaches.


Introduction
The ideas of TN can be traced back to, among others, Tsang [13], Nightingale and Blöte [11], and Derrida [14,15], as found by Nishino [16–22]. Here we start their history from Wilson's numerical renormalization group (NRG) [23]. The NRG aims at finding the ground state of a spin system. The idea of the NRG is to start from a small system whose Hamiltonian can be easily diagonalized. The system is then projected onto a few low-energy states of the Hamiltonian. A new system is then constructed by adding several spins, and a new low-energy effective Hamiltonian is obtained by working only in the subspace spanned by the low-energy states of the previous step and the full Hilbert space of the new spins. In this way, the low-energy effective Hamiltonian can be diagonalized again, and its low-energy states can be used to construct a new restricted Hilbert space. The procedure is then iterated. The original NRG has been improved, for example, by combining it with the expansion theory [24–26]. As already shown in [23], the NRG successfully tackles the Kondo problem in one dimension [27]; however, its accuracy is limited when applied to generic strongly correlated systems such as Heisenberg chains.
In the nineties, White and Noack were able to relate the poor NRG accuracy to the fact that it fails to properly consider the boundary conditions [28]. In 1992, White proposed the famous density matrix renormalization group (DMRG), which is as of today the most efficient and accurate algorithm for one-dimensional (1D) models [29,30]. White used the eigenvectors with the largest eigenvalues of the reduced density matrix of a block as the states describing the relevant part of the low-energy Hilbert space. The reduced density matrix is obtained by explicitly constructing the ground state of the system on a larger region. In other words, the space of one block is renormalized by taking the rest of the system as an environment.
The simple idea of the environment had revolutionary consequences for the RG-based algorithms. Important generalizations of DMRG were then developed, including the finite-temperature variants such as the transfer matrix renormalization group [31–34], dynamic DMRG algorithms [35–38], and the corner transfer matrix renormalization group by Nishino and Okunishi [16]. About 10 years later, TN was re-introduced in its simplest form of matrix product states (MPS) [14,15,39–41] in the context of the theory of entanglement in quantum many-body systems; see, e.g., [42–45]. In this context, the MPS encodes the coefficients of the wave-function in a product of matrices, and is thus defined as the contraction of a one-dimensional TN. Each elementary tensor has three indexes: one physical index acting on the physical Hilbert space of the constituent, and two auxiliary indexes that will be contracted. The MPS structure is chosen since it represents states whose entanglement only scales with the boundary of a region rather than its volume, something called the "area law" of entanglement. Furthermore, an MPS possesses only a finite correlation length, and is thus well suited to represent gapped states with short-range correlations.

While very elegant and extremely powerful for 1D models, the 2D version of DMRG [97–100] suffers several severe restrictions. The ground state obtained by DMRG is an MPS that is essentially a 1D state representation, satisfying the 1D area law of entanglement entropy [52,53,55,101]. However, due to the lack of alternative approaches, 2D DMRG is still one of the most important 2D algorithms, producing a large number of astonishing works, including the discovery of numeric evidence of a quantum spin liquid [102–104] on the kagomé lattice (see, e.g., [105–110]).
Besides directly using DMRG in 2D, another natural way is to extend the MPS representation, leading to the tensor product state [111], or projected entangled pair state (PEPS) [112,113]. While an MPS is made up of tensors aligned in a 1D chain, a PEPS is formed by tensors located on a 2D lattice, forming a 2D TN. Thus, PEPS can be regarded as one type of 2D tensor network state (TNS). Note that the work of Affleck et al. [114] can be considered as a prototype of PEPS.
The network structure makes PEPS so powerful that it can encode difficult computational problems, including non-deterministic polynomial (NP) hard ones [115,135,136]. What is even more important for physics is that PEPS provides an efficient representation as a variational ansatz for calculating ground states of 2D models. However, obeying the area law costs something else: the computational complexity rises [115,135,137]. For instance, after having determined the ground state (either by construction or by variation), one usually wants to extract the physical information by computing, e.g., energies, order parameters, or entanglement. For an MPS, most of these tasks are matrix manipulations and products, which can be easily done by classical computers. For PEPS, one needs to contract a TN stretching over a 2D plane, which unfortunately can, in most cases, be done neither exactly nor even efficiently. The reason for this complexity is exactly what brings the physical advantage to PEPS: the network structure. Thus, algorithms to compute the TN contractions need to be developed.
Beyond dealing with PEPS, TN provides a general framework for different problems whose cost functions are written as the contraction of a TN. A cost function is usually a scalar function whose maximal or minimal point gives the solution of the targeted optimization problem. For example, the cost function of a ground-state simulation can be the energy (e.g., [138,139]); for finite-temperature simulations, it can be the partition function or free energy (e.g., [140,141]); for dimension reduction problems, it can be the truncation error or the distance before and after the reduction (e.g., [69,142,143]); for supervised machine learning problems, it can be the accuracy (e.g., [144]). TN can then be generally considered as a specific mathematical structure of the parameters in the cost functions.
The second point concerns the fact that some TNs can indeed be contracted exactly. A tree TN is one example, since a tree graph has no loops. This might be the reason that a tree TNS can only have a finite correlation length [151], and thus cannot efficiently access criticality in two dimensions. MERA modifies the tree in a brilliant way, so that criticality can be accessed without giving up the exactly contractible structure [164]. Some other exactly contractible examples have also been found, where the exact contractibility is due not to the geometry, but to certain algebraic properties of the local tensors [184,185].

Tensor Renormalization Group and Tensor Network Algorithms
Since most TNs cannot be contracted exactly (the general contraction has #P-complete computational complexity [136]), efficient algorithms are strongly desired. In 2007, Levin and Nave generalized the NRG idea to TN and proposed the tensor renormalization group (TRG) approach [142]. TRG consists of two main steps in each RG iteration: contraction and truncation. In the contraction step, the TN is deformed by singular value decompositions (SVD) of matrices in such a way that certain adjacent tensors can be contracted without changing the geometry of the TN graph. This procedure reduces the number of tensors from N to N/ν, with ν an integer that depends on the way of contracting. After reaching the fixed point, one tensor in fact represents the contraction of an infinite number of original tensors, which can be seen as an approximation of the whole TN. After each contraction, the dimensions of the local tensors increase exponentially, so truncations are needed. To truncate in an optimized way, one should consider the "environment," a concept which appears in DMRG and is crucially important in TRG-based schemes for determining how optimal the truncations are. In the truncation step of Levin's TRG, one only keeps the basis corresponding to the χ largest singular values from the SVD in the contraction step, with χ called the dimension cut-off. In other words, the environment of the truncation here is just the tensor that is decomposed by SVD. Such a local environment only permits local optimization of the truncations, which hinders the accuracy of Levin's TRG on systems with long-range fluctuations. Nevertheless, TRG is still one of the most important and computationally cheap approaches for both classical (e.g., Ising and Potts models) and quantum (e.g., Heisenberg models) simulations in two and higher dimensions [184, 210–227]. It is worth mentioning that for 3D classical models, the accuracy of the TRG algorithms has surpassed that of other methods such as QMC [221,225].

Following the contraction-and-truncation idea, the further developments of the TN contraction algorithms concern mainly two aspects: more reasonable ways of contracting and more optimized ways of truncating. While Levin's TRG "coarse-grains" a TN in an exponential way (the number of tensors decreases exponentially with the renormalization steps), Vidal's TEBD scheme [68–71] implements the TN contraction with the help of an MPS in a linearized way [189]. Then, instead of using the singular values of local tensors, one uses the entanglement of the MPS to find the optimal truncation, meaning the environment is a (non-local) MPS, leading to better precision than Levin's TRG. In this case, the MPS at the fixed point is the dominant eigenstate of the transfer matrix of the TN. Another group of TRG algorithms, called corner transfer matrix renormalization group (CTMRG) [228], is based on the corner transfer matrix idea originally proposed by Baxter in 1978 [229] and developed by Nishino and Okunishi in 1996 [16]. In CTMRG, the contraction reduces the number of tensors in a polynomial way, and the environment can be considered as a finite MPS defined on the boundary. CTMRG has an accuracy comparable with TEBD.
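To make the contraction-and-truncation idea concrete, here is a minimal numpy sketch of one Levin–Nave TRG step applied to the 2D Ising partition function. The index convention T[u, r, d, l], the plaquette bookkeeping, and the crude closing trace at the end are our own simplifications for illustration, not the original authors' implementation:

```python
import numpy as np

def ising_tensor(beta):
    """Rank-4 tensor T[u, r, d, l] whose square-lattice contraction gives the
    2D Ising partition function (standard construction: split the Boltzmann
    matrix of each bond symmetrically between its two sites)."""
    Q = np.array([[np.exp(beta), np.exp(-beta)],
                  [np.exp(-beta), np.exp(beta)]])
    w, v = np.linalg.eigh(Q)
    W = v * np.sqrt(w)                                # Q = W @ W.T
    return np.einsum('su,sr,sd,sl->urdl', W, W, W, W)

def trg_step(T, chi_max):
    """One Levin-Nave contraction-and-truncation step (all legs assumed equal)."""
    D = T.shape[0]
    # split type A: group (u, r) vs (d, l); keep the chi largest singular values
    U, s, Vh = np.linalg.svd(T.reshape(D * D, D * D), full_matrices=False)
    chi = min(chi_max, s.size)
    FA1 = (U[:, :chi] * np.sqrt(s[:chi])).reshape(D, D, chi)         # [u, r, x]
    FA2 = (np.sqrt(s[:chi])[:, None] * Vh[:chi]).reshape(chi, D, D)  # [x, d, l]
    # split type B: group (l, u) vs (r, d)
    U, s, Vh = np.linalg.svd(T.transpose(3, 0, 1, 2).reshape(D * D, D * D),
                             full_matrices=False)
    FB1 = (U[:, :chi] * np.sqrt(s[:chi])).reshape(D, D, chi)         # [l, u, y]
    FB2 = (np.sqrt(s[:chi])[:, None] * Vh[:chi]).reshape(chi, D, D)  # [y, r, d]
    # contract the four half-tensors around one plaquette into the new tensor
    return np.einsum('eaw,abx,ybc,zce->wxyz', FA1, FB1, FA2, FB2, optimize=True)

beta, chi_max, n_steps = 0.4, 16, 20
T = ising_tensor(beta)
lnz, n_rep = 0.0, 1.0          # n_rep: original sites represented per tensor
for _ in range(n_steps):
    norm = np.abs(T).max()
    T /= norm                  # keep entries O(1); absorb the factor into ln Z
    lnz += np.log(norm) / n_rep
    T = trg_step(T, chi_max)
    n_rep *= 2                 # each step halves the number of tensors
lnz += np.log(np.einsum('abab->', T)) / n_rep   # close with a periodic trace
print("ln Z per site ~", lnz)
```

The truncation here uses only the local singular values, i.e., the local environment discussed above; the more refined schemes below improve precisely this step.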
With a certain way of contracting, there is still high flexibility in choosing the environment, i.e., the reference used to optimize the truncations. For example, in Levin's TRG and its variants [142, 210-212, 214, 221], the truncations are optimized with local environments. The second renormalization group proposed by Xie et al. [221,230] employs TRG to consider the whole TN as the environment.
Besides the contractions of TNs, the concept of environment becomes even more important for the TNS update algorithms, where the central task is to optimize the tensors to minimize the cost function. According to the environment, the TNS update algorithms are categorized into the simple [141,143,210,221,231,232], cluster [141,231,233,234], and full update [221,228,230,235–240]. The simple update uses local environments, and hence has the highest efficiency but limited accuracy. The full update considers the whole TN as the environment, and thus has high accuracy. Despite the better treatment of the environment, one drawback of the full update schemes is their expensive computational cost, which strongly limits the dimensions of the tensors one can keep. The cluster update is a compromise between the simple and full updates, where one considers a reasonable subsystem as the environment to balance efficiency and precision.
It is worth mentioning that TN encoding schemes are found to bear close relations to techniques in multi-linear algebra (MLA) (also known as tensor decompositions or tensor algebra; see the review [241]). MLA was originally targeted at developing high-order generalizations of linear algebra (e.g., the higher-order versions of singular value or eigenvalue decomposition [242–245]), and has now been successfully used in a large number of fields, including data mining (e.g., [246–250]), image processing (e.g., [251–254]), machine learning (e.g., [255]), and so on. The interesting connections between the fields of TN and MLA (for example, between the tensor-train decomposition [256] and the matrix product state representation) open new paradigms for interdisciplinary research covering a huge range of sciences.

Organization of Lecture Notes
Our lectures are organized as follows. In Chap. 2, we will introduce the basic concepts and definitions of tensors and TN states/operators, as well as their graphic representations. Several frequently used architectures of TN states will be introduced, including the matrix product state, tree TN state, and PEPS. Then the general form of TN, the gauge degrees of freedom, and the relations to quantum entanglement will be discussed. Three special types of TNs that can be exactly contracted will be exemplified at the end of this chapter.
In Chap. 3, the contraction algorithms for 2D TNs will be reviewed. We will start with several physical problems that can be transformed to the 2D TN contractions, including the statistics of classical models, observation of TN states, and the ground-state/finite-temperature simulations of 1D quantum models. Three paradigm algorithms, namely TRG, TEBD, and CTMRG, will be presented. These algorithms will be further discussed from the aspect of the exactly contractible TNs.
In Chap. 4, we will concentrate on the algorithms of PEPS for simulating the ground states of 2D quantum lattice models. Two general schemes will be explained, which are the variational approaches and the imaginary-time evolution. According to the choice of environment for updating the tensors, we will explain the simple, cluster, and full update algorithms. Particularly in the full update, the contraction algorithms of 2D TNs presented in Chap. 3 will play a key role to compute the nonlocal environments.
In Chap. 5, a special topic about the underlying relations between the TN methods and the MLA will be given. We will start from the canonicalization of MPS in one dimension, and then generalize to the super-orthogonalization of PEPS in higher dimensions. The super-orthogonalization that gives the optimal approximation of a tree PEPS in fact extends the Tucker decomposition from single tensor to tree TN. Then the relation between the contraction of tree TNs and the rank-1 decomposition will be discussed, which further leads to the "zero-loop" approximation of the PEPS on the regular lattice. Finally, we will revisit the infinite DMRG (iDMRG), infinite TEBD (iTEBD), and infinite CTMRG in a unified picture indicated by the tensor ring decomposition, which is a higher-rank extension of the rank-1 decomposition.
In Chap. 6, we will revisit the TN simulations of quantum lattice models from the ideas explained in Chap. 5. Such a perspective, dubbed quantum entanglement simulation (QES), shows a unified picture for simulating one- and higher-dimensional quantum models at both zero [234,257] and finite [258] temperatures. The QES implies an efficient way of investigating infinite-size many-body systems by simulating few-body models with classical computers or artificial quantum platforms. In Chap. 7, a brief summary is given.

Tensor Network: Basic Definitions and Properties
Abstract This chapter introduces some basic definitions and concepts of TN. We will show that TN can be used to represent quantum many-body states, where we explain the MPS in 1D and the PEPS in 2D systems, as well as the generalizations to thermal states and operators. The quantum entanglement properties of TN states, including the area law of entanglement entropy, will also be discussed. Finally, we will present several special TNs that can be exactly contracted, and demonstrate the difficulty of contracting TNs in general cases.

Scalar, Vector, Matrix, and Tensor
Generally speaking, a tensor is defined as a series of numbers labeled by N indexes, with N called the order of the tensor. In this context, a scalar, which is one number labeled by zero indexes, is a zeroth-order tensor. Many physical quantities are scalars, including energy, free energy, magnetization, and so on. Graphically, we use a dot to represent a scalar (Fig. 2.1).
A D-component vector consists of D numbers labeled by one index, and thus is a first-order tensor. For example, one can write the state vector of a spin-1/2 in a chosen basis (say, the eigenstates of the spin operator $\hat{S}^{[z]}$) as $|\psi\rangle = \sum_s C_s |s\rangle$, with the coefficients $C_s$ forming a two-component vector. A matrix is in fact a second-order tensor. Considering two spins as an example, the state vector can be written under an irreducible representation as a four-dimensional vector. Instead, under the local basis of each spin, we write it as $|\psi\rangle = \sum_{ss'} C_{ss'} |ss'\rangle$, with $C_{ss'}$ a matrix with two indexes. Here, one can see that the difference between a (D × D) matrix and a $D^2$-component vector in our context is just the way of labeling the tensor elements. Transferring among vector, matrix, and tensor like this will be frequently used later. Graphically, we use a dot with two bonds to represent a matrix and its two indexes (Fig. 2.1).
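As a minimal numpy illustration (our own example, not from the notes), this relabeling is literally a reshape of the same array of numbers:

```python
import numpy as np

C_vec = np.random.rand(4)                  # a two-spin state as a 4-component vector
C_mat = C_vec.reshape(2, 2)                # the same numbers as a (2 x 2) matrix C_ss'
print(np.allclose(C_mat.flatten(), C_vec)) # True: only the labeling changed
```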
It is then natural to define an N-th order tensor: considering, e.g., N spins, the $2^N$ coefficients can be written as an N-th order tensor C,

$|\psi\rangle = \sum_{s_1 s_2 \cdots s_N} C_{s_1 s_2 \cdots s_N} |s_1 s_2 \cdots s_N\rangle.$

Besides quantum states, operators can also be written as tensors. A spin-1/2 operator $\hat{S}^{\alpha}$ (α = x, y, z) is a (2 × 2) matrix once the basis is fixed, where we have $S^{\alpha}_{ss'} = \langle s|\hat{S}^{\alpha}|s'\rangle$. In the same way, an N-spin operator can be written as a 2N-th order tensor, with N bra and N ket indexes. We would like to stress some conventions about the "indexes" of a tensor (including a matrix) and those of an operator. A tensor is just a group of numbers, whose indexes are defined as the labels of the elements. Here, we always put all indexes as lower symbols, and the upper "indexes" of a tensor (if they exist) are just a part of the symbol to distinguish different tensors. An operator, which is defined in a Hilbert space, is represented by a hatted letter, and there will be no "true" indexes, meaning that both upper and lower "indexes" are just parts of the symbol to distinguish different operators.

Fig. 2.2
The graphic representation of the Schmidt decomposition (singular value decomposition of a matrix). The positive-definite diagonal matrix λ, which gives the entanglement spectrum (Schmidt numbers), is defined on a virtual bond (dummy index) generated by the decomposition

A Simple Example of Two Spins and Schmidt Decomposition
After introducing tensors (and their diagram representation), now we are going to talk about TN, which is defined as the contraction of many tensors. Let us start with the simplest situation, two spins, and consider studying the quantum entanglement properties, for instance. Quantum entanglement, mostly simplified as entanglement, can be defined through the Schmidt decomposition [1–3],

$C_{ss'} = \sum_{aa'} U_{sa} \lambda_{aa'} V^{*}_{s'a'}, \qquad (2.4)$

where U and V are unitary matrices, λ is a positive-definite diagonal matrix in descending order, and its dimension χ is called the Schmidt rank. The diagonal elements of λ are called the Schmidt coefficients, since in the new basis after the decomposition, the state is written as a summation of χ product states, $|\psi\rangle = \sum_a \lambda_a |u_a\rangle |v_a\rangle$, with the new bases $|u_a\rangle = \sum_s U_{sa}|s\rangle$ and $|v_a\rangle = \sum_{s'} V^{*}_{s'a}|s'\rangle$. Graphically, we have a small TN, where we use green squares to represent the unitary matrices U and V, and a red diamond to represent the diagonal matrix λ. There are two bonds in the graph shared by two objects, standing for the summations (contractions) of the two indexes in Eq. (2.4), a and a′. Unlike s (or s′), the space of the index a (or a′) does not come from any physical Hilbert space. To distinguish the two kinds, we call indexes like s physical indexes and those like a geometrical or virtual indexes. Meanwhile, since each physical index is only connected to one tensor, it is also called an open bond.
Some simple observations can be made from the Schmidt decomposition. Generally speaking, the indexes a (and a′, since λ is diagonal) contracted in a TN carry the quantum entanglement [4]. In quantum information science, entanglement is regarded as a quantum version of correlation [4], which is crucially important to understand the physical implications of TN. One usually uses the entanglement entropy to measure the strength of the entanglement, defined as $S = -2\sum_{a=1}^{\chi} \lambda_a^2 \ln \lambda_a$. Since the state should be normalized, we have $\sum_{a=1}^{\chi} \lambda_a^2 = 1$. For dim(a) = 1, obviously $|\psi\rangle = \lambda_1 |u_1\rangle|v_1\rangle$ is a product state with zero entanglement, S = 0, between the two spins. For dim(a) = χ, the entanglement entropy satisfies S ≤ ln χ, where S takes its maximum if and only if $\lambda_1 = \cdots = \lambda_\chi$. In other words, the dimension of a geometrical index determines the upper bound of the entanglement.
Instead of the Schmidt decomposition, it is more convenient to use another language to present the algorithms later: the singular value decomposition (SVD), a matrix decomposition in linear algebra. The Schmidt decomposition of a state is the SVD of the coefficient matrix C, where λ is called the singular value spectrum and its dimension χ is called the rank of the matrix. In linear algebra, SVD gives the optimal lower-rank approximation of a matrix, which is what makes it so useful in TN algorithms. Specifically speaking, given a matrix C of rank χ, the task is to find a rank-χ′ matrix C′ (χ′ ≤ χ) that minimizes the norm

$\varepsilon = |C - C'|.$

The optimal solution is given by the SVD as

$C'_{ss'} = \sum_{a=1}^{\chi'} U_{sa} \lambda_a V^{*}_{s'a},$

i.e., by keeping only the χ′ largest singular values. In other words, C′ is the optimal rank-χ′ approximation of C, and the error is given by

$\varepsilon = \sqrt{\sum_{a=\chi'+1}^{\chi} \lambda_a^2},$

which will be called the truncation error in the TN algorithms.
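These statements translate directly into a few lines of numpy (a sketch with a random state): the SVD of the coefficient matrix gives the Schmidt spectrum, the entanglement entropy, and the optimal truncation together with its error:

```python
import numpy as np

d = 8
C = np.random.rand(d, d) + 1j * np.random.rand(d, d)
C /= np.linalg.norm(C)                     # normalize the state
U, lam, Vh = np.linalg.svd(C, full_matrices=False)

S = -2 * np.sum(lam**2 * np.log(lam))      # entanglement entropy
print(S <= np.log(lam.size) + 1e-12)       # True: S is bounded by ln(chi)

chi_p = 3                                  # keep only the chi' largest values
C_trunc = U[:, :chi_p] @ np.diag(lam[:chi_p]) @ Vh[:chi_p]
err = np.sqrt(np.sum(lam[chi_p:] ** 2))    # truncation error from discarded values
print(np.isclose(np.linalg.norm(C - C_trunc), err))   # True (Frobenius norm)
```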

Matrix Product State
Now we take an N-spin state as an example to explain the MPS, a simple but powerful 1D TN state. In an MPS with open boundary condition, the coefficients are written as a TN given by the contraction

$C_{s_1 s_2 \cdots s_N} = \sum_{a_1 \cdots a_{N-1}} A^{[1]}_{s_1, a_1} A^{[2]}_{s_2, a_1 a_2} \cdots A^{[N]}_{s_N, a_{N-1}}.$

Fig. 2.3
An impractical way to obtain an MPS from a many-body wave-function is to repetitively use the SVD

Schollwöck in his review [5] provides a straightforward way to obtain such a TN by repetitively using SVD or QR decomposition (Fig. 2.3). First, we group the first N−1 indexes together as one large index, and write the coefficients as a $2^{N-1} \times 2$ matrix. Then we implement SVD or any other decomposition (for example, QR decomposition), so that the coefficients become the contraction of $C^{[N-1]}$ and $A^{[N]}$,

$C_{s_1 \cdots s_N} = \sum_{a_{N-1}} C^{[N-1]}_{s_1 \cdots s_{N-1}, a_{N-1}} A^{[N]}_{s_N, a_{N-1}}.$

Note that as a convention in this paper, we always put the physical indexes in front of the geometrical indexes and use a comma to separate them. For the tensor $C^{[N-1]}$, one can do the similar thing by grouping the first N−2 indexes and decomposing again as

$C^{[N-1]}_{s_1 \cdots s_{N-1}, a_{N-1}} = \sum_{a_{N-2}} C^{[N-2]}_{s_1 \cdots s_{N-2}, a_{N-2}} A^{[N-1]}_{s_{N-1}, a_{N-2} a_{N-1}}.$

Then the total coefficients become the contraction of three tensors as

$C_{s_1 \cdots s_N} = \sum_{a_{N-2} a_{N-1}} C^{[N-2]}_{s_1 \cdots s_{N-2}, a_{N-2}} A^{[N-1]}_{s_{N-1}, a_{N-2} a_{N-1}} A^{[N]}_{s_N, a_{N-1}}.$

Repeating the decomposition in the above way until each tensor only contains one physical index, we have the MPS representation of the state (Eq. (2.11)). One can see that an MPS is a TN formed by the contraction of N tensors. Graphically, MPS is represented by a 1D graph with N open bonds. In fact, an MPS with periodic boundary condition can be written as

$C_{s_1 s_2 \cdots s_N} = \sum_{a_1 \cdots a_N} A^{[1]}_{s_1, a_N a_1} A^{[2]}_{s_2, a_1 a_2} \cdots A^{[N]}_{s_N, a_{N-1} a_N}, \qquad (2.12)$

where all tensors are third-order. Moreover, one can introduce translational invariance to the MPS, i.e., $A^{[n]} = A$ for n = 1, 2, · · · , N. We use χ, dubbed the virtual bond dimension of the MPS, to represent the dimension of each geometrical index. MPS is an efficient representation of a many-body quantum state. For an N-spin state, the number of coefficients is $2^N$, which increases exponentially with N. For an MPS given by Eq. (2.12), it is easy to count that the total number of elements of all tensors is $Nd\chi^2$, which increases only linearly with N. The above way of obtaining an MPS with decompositions is also known as tensor-train decomposition (TTD) in MLA, and MPS is also called the tensor-train form [6]. The main aim of TTD is to investigate algorithms that obtain the optimal tensor-train form of a given tensor, so that the number of parameters can be reduced with well-controlled errors.
In physics, the above procedure shows that any state can be written as an MPS, as long as we do not limit the dimensions of the geometrical indexes. However, it is extremely impractical and inefficient, since in principle the dimensions of the geometrical indexes {a} increase exponentially with N. In the following sections, we will directly apply the mathematical form of the MPS without considering the above procedure. Now we introduce a simplified notation of MPS that has been widely used in the physics community. In fact, with fixed physical indexes, the contractions of the geometrical indexes are just products of matrices (which is where the name "matrix product state" comes from). In this sense, we write a quantum state given by Eq. (2.11) as

$|\psi\rangle = \mathrm{tTr}\, A^{[1]} A^{[2]} \cdots A^{[N]} |s_1 s_2 \cdots s_N\rangle = \mathrm{tTr} \prod_{n=1}^{N} A^{[n]} |s_n\rangle, \qquad (2.13)$

where tTr stands for summing over all shared indexes. The advantage of Eq. (2.13) is that it gives a general formula for an MPS of either finite or infinite size, with either periodic or open boundary condition.
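The repeated-SVD procedure is short to implement. The following sketch decomposes a random 6-spin state and verifies exact reconstruction; splitting from the left rather than the right, and the index order (left bond, physical, right bond), are both our own conventions:

```python
import numpy as np

def tt_decompose(C, chi_max=None):
    """Decompose an N-index tensor C into an open-boundary MPS by repeated
    SVDs (the 'impractical' exact scheme of Fig. 2.3, i.e., a tensor-train
    decomposition); without chi_max the result is exact."""
    dims = C.shape
    tensors, a = [], 1                     # a: current left bond dimension
    M = C.reshape(a * dims[0], -1)
    for n in range(len(dims) - 1):
        U, s, Vh = np.linalg.svd(M, full_matrices=False)
        chi = s.size if chi_max is None else min(chi_max, s.size)
        tensors.append(U[:, :chi].reshape(a, dims[n], chi))   # A^[n]
        M = (s[:chi, None] * Vh[:chi]).reshape(chi * dims[n + 1], -1)
        a = chi
    tensors.append(M.reshape(a, dims[-1], 1))
    return tensors

C = np.random.rand(*(2,) * 6)              # a random 6-spin state
As = tt_decompose(C)
rec = As[0]
for A in As[1:]:
    rec = np.einsum('...a,asb->...sb', rec, A)   # contract the shared bonds
print(np.allclose(rec.squeeze(), C))       # True: the MPS reproduces C exactly
```

Without truncation, the bond dimensions indeed grow exponentially toward the middle of the chain, which is precisely why the procedure is impractical.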

Affleck-Kennedy-Lieb-Tasaki State
MPS is not just a mathematical form; it can represent non-trivial physical states. One important example is found with the AKLT model proposed in 1987, a generalization of the spin-1 Heisenberg model [7]. For 1D systems, the Mermin-Wagner theorem forbids any spontaneous breaking of continuous symmetries at finite temperature with sufficiently short-range interactions. The ground state of the AKLT model, called the AKLT state, possesses a sparse anti-ferromagnetic order (Fig. 2.5) and a non-zero excitation gap, consistent with the framework of the Mermin-Wagner theorem. Moreover, the AKLT state provides us a precious exactly solvable example to understand edge states and (symmetry-protected) topological orders. The AKLT state can be exactly written as an MPS with χ = 2 (see [8] for example). Without losing generality, we assume periodic boundary condition. Let us begin with the AKLT Hamiltonian, which can be given by spin-1 operators as

$\hat H = \sum_n \left[ \frac{1}{2} \hat{\mathbf{S}}_n \cdot \hat{\mathbf{S}}_{n+1} + \frac{1}{6} (\hat{\mathbf{S}}_n \cdot \hat{\mathbf{S}}_{n+1})^2 + \frac{1}{3} \right]. \qquad (2.14)$

By introducing the positive semi-definite projector $\hat P_2(\hat{\mathbf{S}}_n + \hat{\mathbf{S}}_{n+1})$ that projects the two neighboring spins onto the subspace of total spin S = 2, Eq. (2.14) can be rewritten as a summation of projectors,

$\hat H = \sum_n \hat P_2(\hat{\mathbf{S}}_n + \hat{\mathbf{S}}_{n+1}). \qquad (2.15)$

Thus, the AKLT Hamiltonian is positive semi-definite, and its ground state lies in its kernel space, satisfying $\hat H |\psi_{\mathrm{AKLT}}\rangle = 0$ with zero energy. Now we construct a wave-function which has zero energy. As shown in Fig. 2.6, we put on each site a projector that maps two (effective) spins-1/2 to a triplet (spin-1), i.e., $|1,+1\rangle = |\uparrow\uparrow\rangle$, $|1,0\rangle = (|\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle)/\sqrt{2}$, $|1,-1\rangle = |\downarrow\downarrow\rangle$. The corresponding projector is determined by the Clebsch-Gordan coefficients [9], and is a (3 × 4) matrix. Here, we rewrite it as a (3 × 2 × 2) tensor, whose three components (regarding the first index) are, up to normalization factors, the ascending, z-component, and descending Pauli matrices of spin-1/2. In the language of MPS, we have the tensor A satisfying (up to normalization factors)

$A_{0,:,:} \propto \sigma^{+}, \quad A_{1,:,:} \propto \sigma^{z}, \quad A_{2,:,:} \propto \sigma^{-}. \qquad (2.20)$

Then we put another projector to map two spins-1/2 to a singlet, i.e., a spin-0 with $|0,0\rangle = (|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle)/\sqrt{2}$. This projector is in fact a (2 × 2) identity with the choice of Eq. (2.19). Now, the MPS of the AKLT state with periodic boundary condition (up to a normalization factor) is obtained by Eq. (2.12), with every tensor A given by Eq. (2.20). For such an MPS, every projector $\hat P_2(\hat{\mathbf{S}}_n + \hat{\mathbf{S}}_{n+1})$ in the AKLT Hamiltonian always acts on a singlet, so we have $\hat H |\psi_{\mathrm{AKLT}}\rangle = 0$.
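This zero-energy property is easy to verify numerically. The sketch below uses one standard normalization convention for the χ = 2 AKLT tensors (e.g., the one found in Schollwöck's DMRG review; the precise factors need not match Eq. (2.20)), builds the dense periodic-boundary state via Eq. (2.12), and checks that the two-site projector annihilates it:

```python
import numpy as np

# chi=2 MPS tensors of the AKLT state; physical basis ordered as m = +1, 0, -1
sp = np.array([[0., 1.], [0., 0.]])                  # sigma^+
sm = sp.T                                            # sigma^-
sz = np.diag([1., -1.])                              # sigma^z
A = np.array([np.sqrt(2 / 3) * sp,
              -np.sqrt(1 / 3) * sz,
              -np.sqrt(2 / 3) * sm])

def aklt_state(N):
    """Dense AKLT state for N spins with periodic boundary, via Eq. (2.12)."""
    psi = np.zeros((3,) * N)
    for idx in np.ndindex(*((3,) * N)):
        M = np.eye(2)
        for s in idx:
            M = M @ A[s]
        psi[idx] = np.trace(M)
    return psi.reshape(-1)

# Spin-1 operators and the projector onto total spin 2 of a pair of sites:
# T2 = (S_1 + S_2)^2 has eigenvalues 0, 2, 6, hence P2 = T2 (T2 - 2) / 24
Sz = np.diag([1., 0., -1.])
Sp = np.sqrt(2) * np.diag([1., 1.], 1)
Sx, Sy = (Sp + Sp.T) / 2, (Sp - Sp.T) / 2j
I3 = np.eye(3)
T2 = sum((np.kron(S, I3) + np.kron(I3, S)) @ (np.kron(S, I3) + np.kron(I3, S))
         for S in (Sx, Sy, Sz))
P2 = T2 @ (T2 - 2 * np.eye(9)) / 24

N = 6
psi = aklt_state(N)
psi /= np.linalg.norm(psi)
psi2 = psi.reshape(9, -1)                 # group the first two sites
print(np.vdot(psi2, P2 @ psi2).real)      # ~0: the projector annihilates |psi>
```

By translational invariance of the periodic MPS, one bond suffices; since each $\hat P_2$ is positive semi-definite, a vanishing expectation value on every bond is equivalent to $\hat H |\psi_{\mathrm{AKLT}}\rangle = 0$.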

Tree Tensor Network State (TTNS) and Projected Entangled Pair State (PEPS)
TTNS is a generalization of the MPS that can encode more general entanglement structures. Unlike an MPS, where the tensors are aligned in a 1D array, a TTNS is formed by tensors connected according to a tree graph. The physical indexes may be located on each tensor, or put only on the boundary (the leaves) of the tree. A tree is a graph that has no loops, which leads to many simple mathematical properties parallel to those of an MPS. For example, the partition function of a TTNS can be computed efficiently and exactly. A similar but more powerful TN state called MERA also has such a property (Fig. 2.7c). We will get back to this in Sect. 2.3.6. Note that an MPS can be treated as a tree with z = 2. An important generalization to TNs with loopy structures is known as the projected entangled pair state (PEPS), proposed by Verstraete and Cirac [10,11]. The tensors of a PEPS are located, instead of on a 1D chain or a tree graph, on a d-dimensional lattice, thus graphically forming a d-dimensional TN. An intuitive picture of PEPS is given in Fig. 2.8: the tensors can be understood as projectors that map the physical spins into virtual ones. The virtual spins form maximally entangled states in a way determined by the geometry of the TN. Note that such an intuitive picture was first proposed with PEPS [10], but it also applies to TTNS.
Similar to MPS, a TTNS or PEPS can be formally written as

$|\Psi\rangle = \mathrm{tTr} \prod_n P^{[n]} |s_n\rangle, \qquad (2.23)$

where tTr means to sum over all geometrical indexes. Usually, we do not write the formula of a TTNS or PEPS, but give the graph instead to clearly show the contraction relations. Such a generalization makes a lot of sense in physics. One key factor regards the area law of entanglement entropy [12–17], which we will talk about later in this chapter. In the following, as two straightforward examples, we show that PEPS can indeed represent non-trivial physical states, including the nearest-neighbor resonating valence bond (RVB) and Z₂ spin liquid states. Note that these two types of states on trees can be similarly defined by the corresponding TTNS.

PEPS Can Represent Non-trivial Many-Body States: Examples
The RVB state was first proposed by Anderson to explain the possible disordered ground state of the Heisenberg model on the triangular lattice [18,19]. An RVB state is defined as the superposition of macroscopic configurations where all spins are paired to form singlet states (dimers). The strong fluctuations are expected to restore all symmetries and lead to a spin liquid state without any local orders. The distance between two spins in a dimer can be short range or long range. For the nearest-neighbor RVB, the dimers are only between nearest neighbors (Fig. 2.9, also see [20]). The RVB state is supposed to relate to high-T_c copper-oxide-based superconductors. By doping the singlet pairs, the insulating RVB state can turn into a charged superconducting state [21–23].
For the nearest-neighbor situation, an RVB state (defined on an infinite square lattice, for example) can be exactly written as a PEPS with χ = 3. Without losing generality, we take translational invariance, i.e., the TN is formed by infinite copies of several inequivalent tensors (Fig. 2.9: the nearest-neighbor RVB state is the superposition of all possible configurations of nearest-neighbor singlets). Two different ways have been proposed to construct the nearest-neighbor RVB PEPS [24,25]. In addition, Wang et al. proposed a way to construct the PEPS with long-range dimers [26]. In the following, we explain the way proposed by Verstraete et al. [24]. There are two inequivalent tensors. The tensor P defined on each site, whose dimensions are (2 × 3 × 3 × 3 × 3), has only eight non-zero elements,

$P_{0,0222} = P_{0,2022} = P_{0,2202} = P_{0,2220} = 1, \quad P_{1,1222} = P_{1,2122} = P_{1,2212} = P_{1,2221} = 1. \qquad (2.24)$

The two-dimensional index of P is the physical index, with s = 0 representing spin up and s = 1 spin down. The extra (third) dimension of each of the other four geometrical indexes is used for carrying the vacuum state. The tensor P acts as a projector that maps an occupied geometrical index (either up or down) to a physical spin. For example, $P_{1,2122}$ maps a virtual spin occupying the second geometrical index to the physical spin s = 1. The remaining elements are all zero, which means the corresponding projections are forbidden. Then a matrix B is introduced for building spin singlets between two nearest-neighbor sites connected by a shared geometrical bond in the RVB structure. B is a (3 × 3) matrix with only three non-zero elements, which (up to normalization and sign conventions) read

$B_{01} = 1, \quad B_{10} = -1, \quad B_{22} = 1. \qquad (2.25)$

After the contraction of all geometrical indexes, the state is the superposition of all possible configurations consisting of nearest-neighbor dimers. This iPEPS looks different from the one given in Eq. (2.23), but they are essentially the same, because one can contract the B's into the P's so that the PEPS is only formed by tensors defined on the sites.
Another example is the Z₂ spin liquid state, one of the simplest string-net states [27–29], first proposed by Levin and Wen to characterize gapped topological orders [30]. In the picture of strings, the Z₂ state is the superposition of all configurations of string loops. Writing such a state as a TN, the tensor on each vertex is a (2 × 2 × 2 × 2) tensor satisfying

$P_{a_1 \cdots a_N} = \begin{cases} 1, & a_1 + \cdots + a_N = \text{even}, \\ 0, & \text{otherwise}. \end{cases} \qquad (2.28)$

The tensor P enforces the fusion rule of the strings: the number of strings connecting to a vertex must be even, so that there are no loose ends and all strings form loops. This is also called in some literature the ice rule [31,32] or Gauss' law [33]. In addition, the square TN formed solely by the tensor P gives the famous eight-vertex model, where the number "eight" corresponds to the eight non-zero elements (i.e., allowed string configurations) on a vertex [34].
The tensors B are defined on each bond to project the strings onto spins; their non-zero elements are

$B_{0,00} = 1, \quad B_{1,11} = 1. \qquad (2.29)$

The tensor B is a projector that maps the spin-up (spin-down) state to the occupied (vacuum) state of a string.
Similar to the MPS, one can define a matrix product operator (MPO), which on a chain can be written as

$\hat O = \mathrm{tTr} \prod_n W^{[n]}_{s_n s'_n, a_{n-1} a_n} |s_n\rangle\langle s'_n|. \qquad (2.30)$

Different from MPS, each tensor has two physical indexes, of which one is a bra and the other is a ket index (Fig. 2.11). An MPO may represent several non-trivial physical objects, for example, a Hamiltonian. Crosswhite and Bacon [53] proposed a general way of constructing an MPO called automata. Now we show how to construct the MPO of a Hamiltonian using the properties of a triangular block structure of the tensors W: the product of block-triangular W's generates the summation of all operator strings allowed by the transitions among the blocks, with N the total number of tensors and ⊗ the tensor product. Such a property can be easily generalized to a W formed by D × D blocks. Imposing Eq. (2.32), we can construct as an example the summation of one-site local terms, i.e., $\sum_n \hat X^{[n]}$. Another example is the Fourier transformation of local operators, with $\hat b_n$ ($\hat b^{\dagger}_n$) the annihilation (creation) operator on the n-th site; the MPO representation of such a Fourier transformation is given by a similar triangular construction, with $\hat I$ the identity operator in the corresponding Hilbert space. The MPO formulation also allows for a convenient and efficient representation of Hamiltonians with longer-range interactions [54]. The geometrical bond dimensions will in principle increase with the interaction length. Surprisingly, only a small dimension is needed to approximate a Hamiltonian with long-range interactions that decay polynomially [46].
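As a concrete (hedged) illustration of the triangular construction for $\sum_n \hat X^{[n]}$, the following numpy sketch builds the uniform MPO tensor W = [[I, X], [0, I]] with our own index convention W[a, b, s, s′] and boundary vectors, taking a Pauli-Z as an arbitrary one-site term, and checks it against the dense sum for a few sites:

```python
import numpy as np

d = 2
I = np.eye(d)
X = np.diag([1., -1.])                    # example one-site term (Pauli-Z)

# Block-triangular MPO tensor: virtual paths jump 0 -> 1 exactly once, where
# they emit X, and emit identities elsewhere, so the MPO sums X over all sites
W = np.zeros((2, 2, d, d))
W[0, 0], W[0, 1], W[1, 1] = I, X, I
l, r = np.array([1., 0.]), np.array([0., 1.])     # boundary vectors

def mpo_to_dense(W, l, r, N):
    """Contract the uniform MPO into a dense 2^N x 2^N matrix (for checking)."""
    M = np.tensordot(l, W, axes=(0, 0))           # -> (b, s, s')
    dim = d
    for _ in range(N - 1):
        M = np.einsum('bst,bcuv->csutv', M, W).reshape(2, dim * d, dim * d)
        dim *= d
    return np.tensordot(M, r, axes=(0, 0))        # close the right boundary

N = 5
H_exact = sum(np.kron(np.kron(np.eye(d ** n), X), np.eye(d ** (N - n - 1)))
              for n in range(N))
print(np.allclose(mpo_to_dense(W, l, r, N), H_exact))   # True
```

The same pattern with more blocks encodes nearest-neighbor and longer-range interaction terms, which is why the virtual dimension grows with the interaction length.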
Besides, an MPO can be used to represent the time evolution operator $\hat U(\tau) = e^{-\tau \hat H}$ through the Trotter-Suzuki decomposition, where τ is a small positive number called the Trotter-Suzuki step [55,56]. Such an MPO is very useful in calculating real, imaginary, or even complex time evolutions, which we will present later in detail. An MPO can also give a mixed state.
Similarly, PEPS can also be generalized to the projected entangled pair operator (PEPO, Fig. 2.11), which on a square lattice, for instance, can be written as

$\hat O = \mathrm{tTr} \prod_n W^{[n]}_{s_n s'_n, a^1_n a^2_n a^3_n a^4_n} |s_n\rangle\langle s'_n|. \qquad (2.41)$

Each tensor has two physical indexes (bra and ket) and four geometrical indexes. Each geometrical bond is shared by two adjacent tensors and will be contracted.

Tensor Network for Quantum Circuits
A special case of TN is the quantum circuit [57]. Quantum circuits encode computations made on qubits (or qudits in general). Figure 2.12 demonstrates the TN representation of a quantum circuit made of unitary gates that act on a product state of many constituents initialized as $|0\rangle \otimes \cdots \otimes |0\rangle$.

Fig. 2.12
The TN representation of a quantum circuit. Two-body unitaries act on a product state of a given number of constituents, $|0\rangle \otimes \cdots \otimes |0\rangle$, and transform it into a target entangled state $|\psi\rangle$

An Example of Quantum Circuits
In order to make contact with TN, we will consider the specific case of quantum circuits where all the gates act on at most two neighbors. An example of such a circuit is the Trotterized evolution of a system described by a nearest-neighbor Hamiltonian $\hat H = \sum_{i} \hat h_{i,i+1}$, where i, i+1 label the neighboring constituents of a one-dimensional system. The evolution operator for a time t is $\hat U(t) = \exp(-i \hat H t)$, and can be decomposed into a sequence of infinitesimal time evolution steps [58] (more details will be given in Sect. 3.1.3),

$\hat U(t) = [\hat U(\tau)]^N, \qquad \tau = t/N. \qquad (2.42)$

In the limit of small τ, we can decompose each step into a product of two-body evolution gates, $\hat U(\tau) \simeq \prod_i \exp(-i \hat h_{i,i+1} \tau)$. This is obviously a quantum circuit made of two-qubit gates with depth N. Conversely, any quantum circuit naturally possesses an arrow of time: it transforms a product state into an entangled state through a sequence of two-body gates.
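As a sketch of this decomposition (with an example Hamiltonian of our own choosing, not from the notes), the following compares the exact evolution with a first-order Trotter circuit of two-body gates; the error shrinks as the number of steps grows:

```python
import numpy as np

Z = np.diag([1., -1.])
X = np.array([[0., 1.], [1., 0.]])
I2 = np.eye(2)
# two-site term; the transverse fields make neighboring terms non-commuting
h = np.kron(Z, Z) + 0.5 * (np.kron(X, I2) + np.kron(I2, X))

def embed(op2, i, n):
    """Embed a two-site operator acting on sites (i, i+1) of an n-site chain."""
    return np.kron(np.kron(np.eye(2 ** i), op2), np.eye(2 ** (n - i - 2)))

def expmi(H, t):
    """exp(-i H t) for Hermitian H, via eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(-1j * w * t)) @ v.conj().T

n, t, n_steps = 4, 1.0, 100
H = sum(embed(h, i, n) for i in range(n - 1))
gate = expmi(h, t / n_steps)                     # two-body gate of one step

U = np.eye(2 ** n, dtype=complex)
for _ in range(n_steps):
    for i in range(n - 1):                       # one layer of two-body gates
        U = embed(gate, i, n) @ U
print(np.linalg.norm(U - expmi(H, t)))           # Trotter error, ~ O(1/n_steps)
```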
Causal Cone One interesting concept in a quantum circuit is that of the causal cone, illustrated in Fig. 2.13, which becomes explicit with the TN representation. Given a quantum circuit that prepares (i.e., evolves the initial state to) the state $|\psi\rangle$, we can ask a question: which subset of the gates affects the reduced density matrix of a certain subregion A of $|\psi\rangle$? This can be seen by constructing the reduced density matrix of the subregion A, $\rho_A = \mathrm{tr}_{\bar A} |\psi\rangle\langle\psi|$, with $\bar A$ the rest of the system besides A.
The TN of the reduced density matrix is formed by the set of unitaries that define the past causal cone of the region A (see the area between the green lines in Fig. 2.13). The remaining unitaries (for instance, $\hat U_5$ and its conjugate in the right subfigure of Fig. 2.13) are eliminated in the TN of the reduced density matrix. The contraction of the causal cone can thus be rephrased in terms of the multiplication of a set of transfer matrices, each performing the computation from t to t − 1. The maximal width of these transfer matrices defines the width of the causal cone, which can be used as a good measure of the complexity of computing $\rho_A$ [59]. The best computational strategy one can find to compute $\rho_A$ exactly will indeed always scale exponentially with the width of the cone [57].

Unitary Tensor Networks and Quantum Circuits
The simplest TN, the MPS, can be interpreted as a sequential quantum circuit [60]. The idea is that one can think of the MPS as a sequential interaction between each constituent (a d-level system) and an ancillary D-level system (the auxiliary qudit, red bonds). The first constituent interacts (say the bottom one shown in Fig. 2.14), and then sequentially all the constituents interact with the same D-level system. With this choice, the past causal cone of a constituent is made of all the MPS matrices below it. Interestingly, in the MPS case the causal cone can be changed using the gauge transformations (see Sect. 2.4.2), something very different from what happens in two-dimensional TNs. This amounts to finding appropriate unitary transformations acting on the auxiliary degrees of freedom that allow one to reorder the interactions between the D-level system and the constituents. In such a way, a desired constituent can be made to interact first, followed by the others. This idea allows one to minimize the number of tensors in the causal cone of a given region, thereby decreasing the computational complexity of the actual computation of a specific $\rho_A$; a convenient choice is the center gauge used in iDMRG calculations [61], whose causal cone is presented as an example in Fig. 2.15. However, the scaling of the computational cost of the contraction is not affected by such a temporal reordering of the TN, since in this case the width of the cone is bounded by one unitary in any gauge. The gauge choice just changes the number of computational steps required to construct the desired $\rho_A$. In the case that A includes non-consecutive constituents, the width of the cone increases linearly with the number of such constituents.
An example of a TN with a larger past causal cone can be obtained by using more than one layer of interactions. Now the support of the causal cone becomes larger, since it includes transfer matrices acting on two D-level systems (red bonds shown in Fig. 2.16). Notice that this TN has loops, but it is still exactly contractible since the width of the causal cone is still finite.

Definition of Exactly Contractible Tensor Network States
The notion of the past causal cone can be used to classify TNSs based on the complexity of computing their contractions. It is important to remember that the complexity strongly depends on the object that we want to compute, not just the TN. For example, the cost of contracting an MPS for an N-qubit state scales only linearly with N. However, to compute the n-site reduced density matrix, the cost scales exponentially with n, since the matrix itself is an exponentially large object. Here we consider computing scalar quantities, such as the observables of one- and two-site operators.
We define a TNS to be exactly contractible when its contractions can be computed with a cost that is polynomial in the elementary tensor dimension D. A more rigorous definition can be given in terms of the tree width; see, e.g., [57]. From the discussion of the previous section, it is clear that such a TNS corresponds to a bounded causal cone for the reduced density matrix of a local subregion. In order to show this, we now focus on the cost of computing the expectation value of local operators and their correlation functions on a few examples of TNSs.
The relevant objects are thus the reduced density matrix of a region A made of a few consecutive spins, and the reduced density matrix of two disjoint blocks A₁ and A₂, each made of a few consecutive spins. Once we have the reduced density matrices of such regions, we can compute arbitrary expectation values of operators supported on them.

MPS Wave-Functions
The simplest example of the computation of the expectation value of a local operator is obtained by considering MPS wave-functions [8,62].
The relevant TN is built from the MPS transfer matrix, $E_{ac,bd} = \sum_s A_{s,ab} A^{*}_{s,cd}$, where A and A* represent the MPS tensor and its complex conjugate. The MPS transfer matrix E only acts on two auxiliary degrees of freedom. Using the property that E is a completely positive map and thus has a fixed point [8], we can substitute the (large power of the) transfer operator by its largest eigenvector v, leading to the final TN diagram that encodes the expectation value of a local operator.
In Fig. 2.19, we show the TN representation of the two-point correlation functions. Obviously, the past causal cone width is bounded by two auxiliary sites. Note that in the second line, the directions of the arrows on the right side are changed. This in general does not happen in more complicated TNs, as we will see in the next subsection. Before going there, we would like to comment on the properties of the two-point correlation functions of an MPS. From the calculation we have just performed, we see that they are encoded in powers of the transfer matrix, which evolves the system in real space. If that matrix can be diagonalized, we immediately see that the correlation functions naturally decay exponentially, with the decay rate given by the ratio of the first to the second eigenvalue. Related details can be found in Sect. 5.4.2.
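The following numpy sketch (random tensors, our own index conventions) carries out exactly this computation: build E, find its dominant left and right eigenvectors, and sandwich a one-site operator between them:

```python
import numpy as np

d, chi = 2, 6
rng = np.random.default_rng(0)
A = rng.normal(size=(d, chi, chi))           # uniform MPS tensor A[s, a, b]

# transfer matrix E[(a c), (b d)] = sum_s A[s,a,b] A*[s,c,d]
E = np.einsum('sab,scd->acbd', A, A.conj()).reshape(chi**2, chi**2)
w, V = np.linalg.eig(E)
r = V[:, np.argmax(np.abs(w))].reshape(chi, chi)        # right fixed point
wl, Vl = np.linalg.eig(E.T)
l = Vl[:, np.argmax(np.abs(wl))].reshape(chi, chi)      # left fixed point

O = np.diag([1., -1.])                       # a one-site observable (sigma^z)
num = np.einsum('ac,st,sab,tcd,bd->', l, O, A, A.conj(), r)
den = np.einsum('ac,sab,scd,bd->', l, A, A.conj(), r)
print("<O> =", (num / den).real)
```

For a two-point function at distance n, one would insert n − 1 plain transfer matrices between the two operator insertions; diagonalizing E then makes the exponential decay discussed above explicit.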

Tree Tensor Network Wave-Functions
An alternative kind of wave-functions are the TTNSs [63–69]. In a TTNS, one can add a physical bond on each of the tensors and use it as a many-body state defined on a Cayley-tree lattice [63]. Here, we will focus on the TTNS with physical bonds only on the outer leaves of the tree.
The calculations with a TTNS normally correspond to the contraction of tree TNs. A specific case of a two-to-one TTNS, named the binary Cayley tree, is illustrated in Fig. 2.20. This TN can be interpreted as a quantum state of multiple spins with different boundary conditions. It can also be considered as a hierarchical TN, in which each layer corresponds to a different level of a coarse-graining renormalization group (RG) transformation [64]. In the figure, different layers are colored differently. In the first layer, each tensor groups two spins into one, and so on. The tree TN can thus be interpreted as a specific RG transformation. Once more, the arrows on the tensors indicate the isometric property of each individual tensor; their directions are opposite to the direction of time if we interpret the tree TN as a quantum circuit. Note again that $|\psi\rangle$ and $\langle\psi|$ have opposite arrows, by definition.
The expectation value of a one-site operator is in fact a tree TN, shown in Fig. 2.21. We see that many of the tensors are completely contracted with their Hermitian conjugates, which simply give identities. What is left is again a bounded causal cone. If we now build an infinite TTNS made of infinitely many layers and assume scale invariance, the multiplication of infinitely many powers of the scale transfer matrix can be substituted by its fixed point. Similarly, if we compute the correlation function of local operators at a given distance, as shown in Fig. 2.22, we can once more get rid of the tensors outside the causal cone. Rigorously, we see that the causal cone width now increases to four sites, since it consists of two different two-site branches. However, if we order the contraction as shown in the middle, we see that the contractions boil down again to a two-site causal cone. Interestingly, since the computation of two-point correlations at very large distance involves powers of transfer matrices that translate in scale rather than in space, one would expect that these matrices are all the same (as a consequence of scale invariance, for example). Thus, we would get polynomially decaying correlations [70].

MERA Wave-Functions
Until now, we have discussed TNs that, even if they can be embedded in a 2D space, contain no loops. In the context of network complexity theory, they are called mean-field networks [71]. However, there are also TNs with loops that are exactly contractible [57]. A particular case is that of the 1D MERA (and its generalizations) [72–76]. The MERA is again a TN that can be embedded in a 2D plane, and it is full of loops, as seen in Fig. 2.23. This TN has a very peculiar structure, again inspired by RG transformations [77]. MERA can also be interpreted as a quantum circuit where the time evolves radially along the network, once more opposite to the arrows that indicate the directions along which the tensors are unitary. The MERA is a layered TN, where each layer (in different colors in the figure) is composed of the appropriate contraction of some third-order tensors (isometries) and some fourth-order tensors (disentanglers). The concrete form of the network is not really important [76]. In this specific case, we are plotting a two-to-one MERA that was discussed in the original version of Ref. [75]. Interestingly, an operator defined on at most two sites gives a bounded past causal cone, as shown in Figs. 2.24 and 2.25.
As in the case of the TTNS, we can indeed perform the explicit calculation of the past causal cone of a single-site operator (Fig. 2.24). There we show the TN contraction of the required expectation value, and then simplify it by taking into account the contractions of the unitary and isometric tensors outside the causal cone, which has a bounded width involving at most four auxiliary constituents.
The calculation of a two-point correlation function of local operators follows a similar idea and leads to the contraction shown in Fig. 2.25. Once more, we see that the computation of the two-point correlation function can be done exactly, due to the bounded width of the corresponding causal cone.

Sequentially Generated PEPS Wave-Functions
The MERA and TTNS can be generalized to two-dimensional lattices [64,74]. The generalization of MPS to 2D, on the other hand, gives rise to PEPS. In general, PEPS belongs to the 2D TNs that cannot be exactly contracted [24,78]. However, for a subclass of PEPS, called sequentially generated PEPS, the contraction can be implemented exactly [79]. Differently from the MERA, where the computation of the expectation value of any sufficiently local operator leads to a bounded causal cone, a sequentially generated PEPS has a central site, and the local observables around the central site can be computed easily. However, the local observables in other regions of the TN give larger causal cones. For example, we represent in Fig. 2.26 a sequentially generated PEPS for a 3 × 3 lattice. The norm of the state is computed in (b), where the TN boils down to the norm of the central tensor. Some of the reduced density matrices of the system are also easy to compute, in particular those of the central site and its neighbors (Fig. 2.27a). Other reduced density matrices, such as those of spins close to the corners, are much harder to compute. As illustrated in Fig. 2.27b, the causal cone of a corner site in a 3 × 3 PEPS has a width 2. In general, for an L × L PEPS, the causal cone would have a width L/2.
Differently from MPS, the causal cone of a PEPS cannot be transformed by performing a gauge transformation. However, as first observed by F. Cucchietti (private communication), one can try to approximate a PEPS with a given causal cone by another one with a different causal cone, for example by moving the central site. This is not an exact operation, and the approximations involved in such a transformation need to be addressed numerically. The effect of these approximations has been studied systematically in [80,81]. In general, we have to say that the contraction of a PEPS wave-function can only be performed exactly with exponential resources. Therefore, efficient approximate contraction schemes are necessary to deal with PEPS.

Fig. 2.27 (a)
The reduced density matrices of a sequentially generated PEPS containing two consecutive spins (one of them being the central spin). (b) The reduced density matrix of a local region far from the central site is generally hard to compute, since it can give rise to an arbitrarily large causal cone. For the reduced density matrix of any of the corners of an L × L PEPS, which is the most consuming case, it leads to a causal cone with a width up to L/2. That means the computation is exponentially expensive in the size of the system

Fig. 2.28
If one starts with contracting an arbitrary bond, there will be a tensor with six bonds. As the contraction goes on, the number of bonds increases linearly with the boundary ∂ of the contracted area, thus the memory increases exponentially as $O(\chi^{\partial})$, with χ the bond dimension

Exactly Contractible Tensor Networks
We have considered above, from the perspective of quantum circuits, whether a TNS can be contracted exactly, judged by the width of the causal cones. Below, we reconsider this issue from the aspect of the TN itself.
Normally, a TN cannot be contracted without approximation. Let us consider a square TN, as shown in Fig. 2.28. We start from contracting an arbitrary bond in the TN (yellow shadow). Consequently, we obtain a new tensor with six bonds that contains $\chi^6$ parameters (χ is the bond dimension). To proceed, the bonds adjacent to this tensor are probably a good choice to contract next. Then we will have to store a new tensor with eight bonds. As the contraction goes on, the number of bonds of the obtained tensor grows linearly with the boundary of the contracted area, and thus the memory cost grows exponentially (Fig. 2.28).

Tensor Networks on Tree Graphs We here consider a scalar tree TN (Fig. 2.29a) with $N_L$ layers of third-order tensors; some vectors are put on the outermost boundary. An example that a tree TN may represent is an observable of a TTNS. A tree TN can be written as $Z = \mathrm{tTr} \prod_{n,m} T^{[n,m]}_{a_{n,m,1} a_{n,m,2} a_{n,m,3}}$, with n labeling the layer and m the position within the layer. Each tensor of the outermost layer can be contracted with its two boundary vectors as

$v'_{a} = \sum_{bc} T_{abc} v_{b} v_{c}.$

After the vectors are updated by the equation above, the number of layers of the tree TN becomes $N_L - 1$. The whole tree TN can be exactly contracted by repeating this procedure. We can see from the above contraction that if the graph does not contain any loops, i.e., has a tree-like structure, the dimensions of the tensors obtained during the contraction will not increase unboundedly. Therefore, the TN defined on such a graph can be exactly contracted. This is again related to the area law of entanglement entropy that a loop-free TN satisfies: to separate a tree-like TN into two disconnected parts, the number of bonds that needs to be cut is only one. Thus, the upper bound of the entanglement entropy between these two parts is a constant, determined by the dimension of the bond that is cut. This is also consistent with the analyses based on the maximal width of the causal cones.
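A minimal sketch of this layer-by-layer contraction for a uniform binary tree (random tensors, our own conventions); note that every step costs only O(χ³), so the bond dimensions indeed never grow:

```python
import numpy as np

chi, n_layers = 4, 5
rng = np.random.default_rng(0)
T = rng.normal(size=(chi, chi, chi))       # uniform third-order tensor
top = rng.normal(size=chi)                 # vector closing the root bond
vs = [rng.normal(size=chi) for _ in range(2 ** n_layers)]   # boundary vectors

while len(vs) > 1:
    # absorb the outermost layer: each tensor contracts its two child vectors,
    # replacing them by a single chi-dimensional vector (the updated boundary)
    vs = [np.einsum('abc,b,c->a', T, vs[2 * i], vs[2 * i + 1])
          for i in range(len(vs) // 2)]
print(vs[0] @ top)                         # the scalar value of the tree TN
```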
Tensor Networks on Fractals Another example that can be exactly contracted is the TN defined on the fractal called the Sierpiński gasket (Fig. 2.29b) (see, e.g., [82,83]). Such a TN can represent the partition function of a statistical model defined on the Sierpiński gasket, such as the Ising or Potts model. As will be explained in Chap. 3, the tensor is given by the probability distribution of the three spins in a triangle.
Such a TN can be exactly contracted by iteratively contracting each group of three tensors located in the same triangle into one, schematically (up to the ordering convention of the indexes)

$T'_{xyz} = \sum_{abc} T_{xab} T_{ybc} T_{zca}. \qquad (2.46)$

After each round of contractions, the dimension of the tensors and the geometry of the network remain unchanged, but the number of tensors in the TN decreases from N to N/3. It means we can exactly contract the whole TN by repeating the above process.
Algebraically Contractible Tensor Networks The third example is called algebraically contractible TNs [84,85]. The tensors that form the TN possess some special algebraic properties, so that even though the bond dimensions increase after each contraction, the rank of the bonds remains unchanged. It means one can introduce projectors to lower the bond dimension without causing any errors. The simplest algebraically contractible TN is the one formed by the super-diagonal tensor I, defined as

$I_{a_1 \cdots a_N} = \begin{cases} 1, & a_1 = a_2 = \cdots = a_N, \\ 0, & \text{otherwise}. \end{cases}$

I is also called the copy tensor, since it forces all its indexes to take the same value.
For a square TN of an arbitrary size formed by the fourth-order I's, we obviously have its contraction Z = d, with d the bond dimension. The reason is that the contraction is the summation of only d non-zero terms (each equal to 1).
To demonstrate its contraction, we will need one important property of the copy tensor (Fig. 2.30): if there are n ≥ 1 bonds contracted between two copy tensors of orders $N_1$ and $N_2$, the contraction gives a copy tensor of order $N_1 + N_2 - 2n$,

$\sum_{a_1 \cdots a_n} I_{b_1 \cdots b_{N_1-n} a_1 \cdots a_n} I_{a_1 \cdots a_n c_1 \cdots c_{N_2-n}} = I_{b_1 \cdots b_{N_1-n} c_1 \cdots c_{N_2-n}}. \qquad (2.48)$

Fig. 2.30
The fusion rule of the copy tensor: the contraction of two copy tensors of $N_1$-th and $N_2$-th order through n shared bonds gives a copy tensor of $(N_1 + N_2 - 2n)$-th order. This property is called the fusion rule, and can be understood in the opposite way: a copy tensor can be decomposed as the contraction of two copy tensors.
With the fusion rule, one readily obtains the property for dimension reduction: if there are n ≥ 1 bonds contracted between two copy tensors, the contraction is identical after replacing the n bonds with a single bond. In other words, the dimension of the contracting bonds can be exactly reduced from $\chi^n$ to χ. Applying this property to TN contraction, it means that each time the bond dimension increases after contracting several tensors into one, the dimension can be exactly reduced back to χ, so that the contraction can continue until all bonds are contracted. From the TN of copy tensors, a class of exactly contractible TNs can be defined, where the local tensor is the multiplication of the copy tensor by several unitary tensors. Taking the square TN as an example, we have (schematically) the local tensor obtained by dressing the copy tensor with a unitary U and its conjugate on the two vertical legs, and V and its conjugate on the two horizontal legs, with U and V two unitary matrices. X is an arbitrary d-dimensional vector that can be understood as the "weights" (not necessarily positive) defining the tensor.
After putting the tensors in the TN, all unitary matrices cancel to identities (each bond carries a $U$ from one tensor and a $U^{\dagger}$ from its neighbor). Then one can use the fusion rule of the copy tensor to exactly contract the TN, and the contraction gives $Z = \sum_b (X_b)^{N_T}$, with $N_T$ the total number of tensors. The unitary matrices are not trivial in physics. If we take $d = 2$ and choose the unitaries properly, the TN is in fact the inner product of the $Z_2$ topological state (see the definition of the $Z_2$ PEPS in Sect. 2.2.3). If one cuts the system into two subregions, all the unitary matrices cancel to identities inside the bulk. However, those on the boundary survive, which can lead to exotic properties such as topological orders, edge states, and so on. Note that the $Z_2$ state is only a special case; one may refer to the systematic picture of the string-net states given by Levin and X.-G. Wen [27-29].

General Form of Tensor Network
One can see that a TN (state or operator) is defined as the contraction of certain tensors $\{T^{[n]}\}$, with the general form $\Phi_{\cdots s_{n_1} s_{n_2}\cdots} = \sum_{\{a\}} \prod_n T^{[n]}_{s_{n_1} s_{n_2}\cdots, a_{n_1} a_{n_2}\cdots}$ (2.52). The indexes $\{s\}$ are open (e.g., physical) indexes, while the indexes $\{a\}$ are geometrical indexes, each of which is normally shared by two tensors and will be contracted.
When there are no open indexes, the TN contracts to a scalar, $Z = \sum_{\{a\}} \prod_n T^{[n]}_{a_{n_1} a_{n_2}\cdots}$, where $Z$ can be the cost function (e.g., energy or fidelity) to be maximized or minimized. The TN contraction algorithms mainly deal with scalar TNs.

Gauge Degrees of Freedom
For a given state, its TN representation is not unique. Let us take the translationally invariant MPS as an example. One may insert a (full-rank) matrix $U$ and its inverse $U^{-1}$ on each of the virtual bonds and then contract them, respectively, into the two neighboring tensors. The tensors of the new MPS become $\tilde A^{[n]}_{s,aa'} = \sum_{bb'} U_{ab} A^{[n]}_{s,bb'} (U^{-1})_{b'a'}$. In fact, we have only inserted an identity $I = UU^{-1}$, thus making no change to the MPS itself. However, the tensors that form the MPS change, meaning that the TN representation changes. The same holds when inserting a matrix and its inverse on any of the virtual bonds of a TN state, which changes the tensors without changing the state itself. Such degrees of freedom are known as the gauge degrees of freedom, and the transformations are called gauge transformations.
The gauge degrees of freedom may, on the one hand, cause instabilities in TN simulations. Algorithms for finite and infinite PEPS were proposed to fix the gauge to reach higher stability [86-88]. On the other hand, one may use gauge transformations to bring a TN state into a special form so that, for instance, one can implement truncations of the local basis while minimizing the error non-locally [45,89] (we will come back to this issue later). Moreover, gauge transformations are closely related to other theoretical properties such as the global symmetry of TN states, which has been used to derive more compact TN representations [90], to classify many-body phases [91,92], and to characterize non-conventional orders [93,94], just to name a few.

Tensor Network and Quantum Entanglement
The numerical methods for quantum many-body systems face the great challenge that the dimension of the Hilbert space increases exponentially with the system size. Such an "exponential wall" has been treated in different ways by many numerical algorithms, including the DFT methods [95] and QMC approaches [96].
The power of TN has been understood in the sense of quantum entanglement: the entanglement structure of low-lying energy states can be efficiently encoded in TNSs. TN takes advantage of the fact that not all quantum states in the total Hilbert space of a many-body system are equally relevant to the low-energy or low-temperature physics. It has been found that the low-lying eigenstates of a gapped Hamiltonian with local interactions obey the area law of the entanglement entropy [97].
More precisely, for a certain subregion $R$ of the system, its reduced density matrix is defined as $\hat\rho_R = \mathrm{Tr}_E(\hat\rho)$, with $E$ denoting the spatial complement of $R$. The entanglement entropy is defined as $S(\hat\rho_R) = -\mathrm{Tr}(\hat\rho_R \ln \hat\rho_R)$. Then the area law of the entanglement entropy [17,98] reads $S \sim |\partial R|$, with $|\partial R|$ the size of the boundary of $R$. In particular, for a $D$-dimensional system, one has $S \sim l^{D-1}$ (2.56), with $l$ the length scale. This means that for 1D systems, $S = \text{const}$. The area law suggests that the low-lying eigenstates stay in a "small corner" of the full Hilbert space of the many-body system, and that they can be described by a much smaller number of parameters. We shall stress that the locality of the interactions is not sufficient for the area law: Vitagliano et al. showed that simple 1D spin models can exhibit a volume law, where the entanglement entropy scales with the bulk [99,100]. The area law of entanglement entropy is intimately connected to the fact that a non-critical quantum system exhibits a finite correlation length. The correlation functions between two blocks in a gapped system decay exponentially as a function of the distance between the blocks [101], which is argued to lead to the area law. An intuitive picture can be seen in Figs. 2.31 and 2.32. Let us consider a 1D gapped quantum system whose ground state $|\psi_{ABC}\rangle$ possesses a correlation length $\xi_{\text{corr}}$. By dividing the chain into three subregions $A$, $B$, and $C$, the reduced density operator $\hat\rho_{AC}$ is obtained by tracing out the block $B$, i.e., $\hat\rho_{AC} = \mathrm{Tr}_B |\psi_{ABC}\rangle\langle\psi_{ABC}|$ (see Fig. 2.32, which illustrates that if the size of $B$, denoted by $l_{AC}$, is much larger than the correlation length $\xi_{\text{corr}}$, the reduced density matrix obtained by tracing out $B$ approximately factorizes). In the limit of a large distance between the $A$ and $C$ blocks with $l_{AC} \gg \xi_{\text{corr}}$, one has $\hat\rho_{AC} \simeq \hat\rho_A \otimes \hat\rho_C$ (2.57), up to exponentially small corrections. Then $|\psi_{ABC}\rangle$ is a purification of a mixed state of the form $|\psi_{AB_l}\rangle \otimes |\psi_{B_rC}\rangle$ that has no correlations between $A$ and $C$; here $B_l$ and $B_r$ sit at the two ends of the block $B$, which together span the original block.
It is well known that all possible purifications of a mixed state are equivalent to each other up to a local unitary transformation on the virtual Hilbert space. This naturally implies that there exists a unitary operation $\hat U_B$ on the block $B$ that completely disentangles the left from the right part as $\hat U_B|\psi_{ABC}\rangle = |\psi_{AB_l}\rangle \otimes |\psi_{B_rC}\rangle$ (2.59). This argument directly leads to the MPS description and gives a strong hint that the ground states of a gapped Hamiltonian are well represented by an MPS of finite bond dimension, where the block $B$ in Eq. (2.59) is the analog of a tensor in the MPS. Let us remark that every state of $N$ spins has an exact MPS representation if we allow $\chi$ to grow exponentially with the number of spins [102]. The whole point of MPS is that a ground state can typically be represented by an MPS where the dimension $\chi$ is small and scales at most polynomially with the number of spins: this is the reason why MPS-based methods are more efficient than exact diagonalization.
For the 2D PEPS, it is more difficult to rigorously justify the area law of entanglement entropy. However, we can make some sense of it from the following aspects. One is the fact that PEPS can exactly represent some non-trivial 2D states that satisfy the area law, such as the nearest-neighbor RVB and $Z_2$ spin liquid mentioned above. Another is to count the dimension $D$ of the geometrical bonds between two subsystems, from which the entanglement entropy satisfies the upper bound $S \leq \log D$. After dividing a PEPS into two subregions, one can see that the number of geometrical bonds $N_b$ crossing the cut increases linearly with the length scale, i.e., $N_b \sim l$. It means that the dimension satisfies $D \sim \chi^{l}$, and the upper bound of the entanglement entropy fulfills the area law given by Eq. (2.56), i.e., $S \leq \log D \sim l \log\chi$ (2.60). However, as we will see later, exactly this property of PEPS is what makes it computationally difficult.

Two-Dimensional Tensor Networks and Contraction Algorithms
Abstract In this section, we will first demonstrate in Sect. 3.1 that many important physical problems can be transformed to 2D TNs, so that the central task becomes computing the corresponding TN contractions. From Sects. 3.2 to 3.5, we will then present several paradigm contraction algorithms for 2D TNs, including TRG, TEBD, and CTMRG. Relations to other distinguished algorithms and to the exactly contractible TNs will also be discussed.

Classical Partition Functions
Partition function, which is a function of the variables of a thermodynamic state such as temperature and volume, contains the statistical information of a system in thermodynamic equilibrium. From its derivatives of different orders, we can calculate the energy, free energy, entropy, and so on. Levin and Nave pointed out in Ref. [1] that the partition functions of statistical lattice models (such as the Ising and Potts models) with local interactions can be written in the form of a TN. Without losing generality, we take the square lattice as an example. The partition function is defined as the summation of the Boltzmann weights of all configurations. In the language of tensors, it is obtained by simply summing over all indexes of the local tensors, each collecting the Boltzmann weight of the spins around one square. Let us proceed a little further by considering four squares, whose partition function can be written as a TN of four tensors (Fig. 3.1b), where two indexes satisfy $s_{n_j} = s_{m_k}$ if they refer to the same Ising spin. The graphic representation of Eq. (3.6) is shown in Fig. 3.1c. One can see that on the square lattice, the TN still has the geometry of a square lattice. In fact, this construction gives a TN with the geometry of the dual lattice of the system, and the dual of the square lattice is the square lattice itself.
For the $Q$-state Potts model on the square lattice, the partition function has the same TN representation as that of the Ising model, except that the elements of the tensor are given by the Boltzmann weights of the Potts model and the dimension of each index is $Q$. Note that the Potts model with $Q = 2$ is equivalent to the Ising model.
Another example is the eight-vertex model proposed by Baxter in 1971 [2]. It is one of the "ice-type" statistical lattice models, and can be considered as the classical correspondent of the $Z_2$ spin liquid state. The tensor that gives the TN of the partition function is also $(2 \times 2 \times 2 \times 2)$-dimensional, with the non-zero elements given by $T_{s_1 \cdots s_N} = 1$ if $s_1 + \cdots + s_N$ is even, and $T_{s_1 \cdots s_N} = 0$ otherwise (3.7). We shall remark that there is more than one way to define the TN of the partition function of a classical system. For example, when there exist only nearest-neighbor couplings, one can define a matrix $M_{ss'} = e^{-\beta H_{ss'}}$ on each bond and put on each site a super-diagonal tensor $I$ (also called the copy tensor) defined as $I_{s_1 \cdots s_N} = 1$ if $s_1 = \cdots = s_N$, and $I_{s_1 \cdots s_N} = 0$ otherwise (3.8). Then the TN of the partition function is the contraction of copies of $M$ and $I$, and possesses exactly the same geometry as the original lattice (instead of the dual one).
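As a sanity check of this second construction, the following sketch contracts the TN of $M$'s and copy tensors for the ferromagnetic Ising model on a small 2 x 3 open square lattice and compares the result with the brute-force partition function (the lattice size, temperature, and coupling are arbitrary illustrative choices).

```python
import numpy as np
from itertools import product

beta, J = 0.4, 1.0
sigma = np.array([1.0, -1.0])                   # spin values for indexes 0 and 1
M = np.exp(beta * J * np.outer(sigma, sigma))   # bond matrix M_{ss'} = e^{-beta H_{ss'}}

def copy_tensor(order, d=2):
    I = np.zeros((d,) * order)
    for s in range(d):
        I[(s,) * order] = 1.0
    return I

C2, C3 = copy_tensor(2), copy_tensor(3)         # site tensors of degree 2 and 3

# the seven bonds of the 2x3 open lattice, as pairs of sites 0..5 (row-major)
bonds = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5)]
Z_brute = sum(np.prod([M[s[i], s[j]] for i, j in bonds])
              for s in product(range(2), repeat=6))

# the same number as a single TN contraction: one copy tensor per site,
# one M per bond; each einsum letter labels one bond end
Z_tn = np.einsum('ai,bck,dm,ej,fgl,hn,ab,cd,ef,gh,ij,kl,mn->',
                 C2, C3, C2, C2, C3, C2, M, M, M, M, M, M, M)
print(Z_brute, Z_tn)                            # agree up to round-off
```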

Quantum Observables
With a TN state, the computations of quantum observables such as $\langle\psi|\hat O|\psi\rangle$ and $\langle\psi|\psi\rangle$ reduce to the contraction of scalar TNs, where $\hat O$ can be any operator. For a 1D MPS, this can be easily calculated, since one only needs to deal with a 1D TN stripe. For a 2D PEPS, such calculations become contractions of 2D TNs. Taking $\langle\psi|\psi\rangle$ as an example, the TN of the inner product is the contraction of copies of the local tensor (Fig. 3.1c) defined as $T_{a_1a_2a_3a_4} = \sum_s P^{\ast}_{s,b_1b_2b_3b_4} P_{s,b'_1b'_2b'_3b'_4}$, with each geometrical index given by the composite index $a_i = (b_i, b'_i)$. For $\langle\psi|\hat O|\psi\rangle$, the only difference is that we should substitute a small number of the $T_{a_1a_2a_3a_4}$'s in the original TN of $\langle\psi|\psi\rangle$ with "impurities" at the sites where the operators are located.
Taking a one-body operator $\hat O$ as an example, the "impurity" tensor on the corresponding site can be defined as $T^{[\hat O]}_{a_1a_2a_3a_4} = \sum_{ss'} P^{\ast}_{s,b_1b_2b_3b_4} O_{ss'} P_{s',b'_1b'_2b'_3b'_4}$ (3.10). In such a case, a single-site observable is given by the ratio of two TN contractions, $\langle\hat O\rangle = \langle\psi|\hat O|\psi\rangle / \langle\psi|\psi\rangle$ (3.11), where the numerator is the TN with one impurity. For non-local observables, e.g., the correlation function, the contraction of $\langle\psi|\hat O^{[i]}\hat O^{[j]}|\psi\rangle$ is nothing but the same TN with another "impurity" added at site $j$.

Ground-State and Finite-Temperature Simulations
Ground-state simulations of 1D quantum models with short-range interactions can also be efficiently transformed to 2D TN contractions. The energy to be minimized is $E = \langle\psi|\hat H|\psi\rangle / \langle\psi|\psi\rangle$ (3.14), and there are two general ways to proceed: direct variational minimization and imaginary-time evolution. The first way can be realized by, e.g., Monte Carlo methods, where one randomly changes or chooses the value of each tensor element to locate the energy minimum. One can also use the Newton method and solve the partial-derivative equations $\partial E/\partial x_n = 0$, with $x_n$ standing for an arbitrary variational parameter. In any case, it is for most cases inevitable to calculate $E$ (i.e., $\langle\psi|\hat H|\psi\rangle$ and $\langle\psi|\psi\rangle$), which amounts to contracting the corresponding TNs as explained above.
We shall stress that without TN, the dimension of the ground state (i.e., the number of variational parameters) increases exponentially with the system size, which makes the ground-state simulations impossible for large systems.
The second way of computing the ground state, imaginary-time evolution, is more or less like an "annealing" process. One starts from an arbitrarily chosen initial state and acts with the imaginary-time evolution operator on it. The "temperature" is lowered a little at each step, until the state reaches a fixed point. Mathematically speaking, by using the Trotter-Suzuki decomposition, such an evolution is written as a TN defined on a $(D+1)$-dimensional lattice, with $D$ the spatial dimension of the model.
Here, we take a 1D chain as an example. We assume that the Hamiltonian contains at most nearest-neighbor couplings, i.e., $\hat H = \sum_n \hat H_{n,n+1}$, and split it into two groups, $\hat H = \hat H_e + \hat H_o$, with $\hat H_e = \sum_{\text{even } n} \hat H_{n,n+1}$ and $\hat H_o = \sum_{\text{odd } n} \hat H_{n,n+1}$. By doing so, any two terms within $\hat H_e$ (or within $\hat H_o$) commute with each other. Then the evolution operator for an infinitesimal imaginary time $\tau \to 0$ can be written as $\hat U(\tau) = e^{-\tau \hat H} = e^{-\tau(\hat H_e + \hat H_o)}$. If $\tau$ is small enough, the higher-order terms are negligible, and the evolution operator becomes $\hat U(\tau) \simeq e^{-\tau \hat H_e} e^{-\tau \hat H_o}$, a product of the two-site evolution operators $\hat U(\tau)_{n,n+1} = e^{-\tau \hat H_{n,n+1}}$.
The above procedure is known as the first-order Trotter-Suzuki decomposition [3-5]. Note that higher-order decompositions can also be adopted; for example, the second-order Trotter-Suzuki decomposition is written as $e^{-\tau\hat H} \simeq e^{-\tau\hat H_e/2} e^{-\tau\hat H_o} e^{-\tau\hat H_e/2}$, with an error of $O(\tau^3)$ per step. With Eq. (3.18), the time evolution can be transformed into a TN, where the local tensor is given by the coefficients of $\hat U(\tau)_{n,n+1}$, satisfying $T_{s'_n s'_{n+1} s_n s_{n+1}} = \langle s'_n s'_{n+1}|\hat U(\tau)_{n,n+1}|s_n s_{n+1}\rangle$ (3.20). Such a TN is defined in a plane whose two dimensions correspond to space and (real or imaginary) time, respectively. The initial state is located at the bottom of the TN ($\beta = 0$), and its evolution amounts to contracting the TN, which can be efficiently done by the TN algorithms presented later.
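A small numerical illustration of the decomposition (the Heisenberg-type two-site coupling and the values of $\tau$ are assumptions chosen for illustration): build the two-site gate, reshape it into the fourth-order tensor of Eq. (3.20), and check that the first-order Trotter error on three sites scales as $O(\tau^2)$.

```python
import numpy as np
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]]) * 0.5
sy = np.array([[0, -1j], [1j, 0]]) * 0.5
sz = np.array([[1, 0], [0, -1]]) * 0.5
h = (np.kron(sx, sx) + np.kron(sy, sy) + np.kron(sz, sz)).real  # two-site term
I2 = np.eye(2)
h12, h23 = np.kron(h, I2), np.kron(I2, h)       # the two bonds of a 3-site chain

for tau in (0.02, 0.01):
    U = expm(-tau * h)                          # two-site gate U(tau)_{n,n+1}
    T = U.reshape(2, 2, 2, 2)                   # local TN tensor, cf. Eq. (3.20)
    err = np.linalg.norm(expm(-tau * (h12 + h23))
                         - expm(-tau * h12) @ expm(-tau * h23))
    print(tau, err)                             # err drops ~4x when tau halves
```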
In addition, one can readily see that the evolution of a 2D state leads to the contraction of a 3D TN. Such a TN scheme provides a straightforward picture to understand the equivalence between a $(d+1)$-dimensional classical theory and a $d$-dimensional quantum theory. Similarly, the finite-temperature simulation of a quantum system can be transformed to a TN contraction with the Trotter-Suzuki decomposition: for the density operator $\hat\rho(\beta) = e^{-\beta\hat H}$, the TN is formed by the same tensor given in Eq. (3.20).

Tensor Renormalization Group
In 2007, Levin and Nave proposed the TRG approach [1] to contract the TNs of 2D classical lattice models. In 2008, Gu et al. further developed TRG to handle 2D quantum topological phases [6]. TRG can be considered as a coarse-graining contraction algorithm. To introduce the TRG algorithm, let us consider a square TN formed by an infinite number of copies of a fourth-order tensor $T_{a_1a_2a_3a_4}$ (see the left side of Fig. 3.2).

Contraction and Truncation
The idea of TRG is to iteratively "coarse-grain" the TN without changing the bond dimensions, the geometry of the network, or the translational invariance. Such a process is realized by two local operations in each iteration. Let us denote the tensor in the $t$-th iteration as $T^{(t)}$ (we take $T^{(0)} = T$). To obtain $T^{(t+1)}$, the first step is to decompose $T^{(t)}$ by SVD in two different ways (Fig. 3.2), as $T^{(t)}_{a_1a_2a_3a_4} = \sum_b X_{a_1a_2b} Y_{a_3a_4b} = \sum_b U_{a_2a_3b} V_{a_4a_1b}$ (3.22). Note that the singular value spectrum can be handled by multiplying it into the tensor(s), and the dimension of the new index satisfies $\dim(b) = \chi^2$, with $\chi$ the dimension of each bond of $T^{(t)}$. The purpose of the first step is to deform the TN so that, in the second step, a new tensor $T^{(t+1)}$ can be obtained by contracting the four tensors that form a square (Fig. 3.2), as $T^{(t+1)}_{b_1b_2b_3b_4} \leftarrow \sum_{a_1a_2a_3a_4} V_{a_1a_2b_1} Y_{a_2a_3b_2} U_{a_3a_4b_3} X_{a_4a_1b_4}$ (3.23). We use an arrow instead of the equal sign because one may need to divide the tensor by a proper number to keep the elements from diverging; the arrows will be used in the same way below. These two steps define the contraction strategy of TRG. In the first step, the number of tensors in the TN (i.e., the size of the TN) increases from $N$ to $2N$, and in the second step, it decreases from $2N$ to $N/2$. Thus, after $t$ iterations, the number of tensors decreases to $1/2^t$ of the original number. For this reason, TRG is an exponential contraction algorithm.

Error and Environment
The dimension of the tensor at the $t$-th iteration becomes $\chi^{2^t}$ if no truncations are implemented. This means that truncations of the bond dimensions are necessary. In the original proposal, the dimension is truncated by keeping only the singular vectors of the $\chi$ largest singular values in Eq. (3.22). Then the new tensor $T^{(t+1)}$ obtained by Eq. (3.23) has exactly the same dimensions as $T^{(t)}$.
Each truncation inevitably introduces some error, called the truncation error. Consistent with Eq. (2.7), the truncation error $\varepsilon$ is quantified by the discarded singular values $\lambda_b$ ($b \geq \chi$), as in Eq. (3.24). According to linear algebra, $\varepsilon$ in fact gives the error of the SVD in Eq. (3.22), meaning that such a truncation minimizes the error of reducing the rank of $T^{(t)}$, i.e., $\min_{\mathrm{rank}(T')=\chi} |T^{(t)} - T'|$. One may repeat the contraction-and-truncation process until $T^{(t)}$ converges. It usually takes only $\sim$10 steps, after which one has in fact contracted a TN of $2^t$ tensors into a single tensor. The truncation is optimized according to the SVD of $T^{(t)}$; thus, $T^{(t)}$ is called the environment. In general, the tensor(s) that determine the truncations are called the environment, a key factor for the accuracy and efficiency of the algorithm. For algorithms that use local environments, like TRG, the efficiency is relatively high since the truncations are easy to compute, but the accuracy is bounded since the truncations are optimized only according to local information (in TRG, the local tensor $T^{(t)}$ being partitioned).
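The following is a minimal sketch of one TRG iteration (the index convention $T[a_1, a_2, a_3, a_4]$ and the plaquette orientation are illustrative choices, not fixed by the text; a production implementation must match them to Fig. 3.2). It performs the two truncated SVD splits of Eq. (3.22) and the plaquette contraction of Eq. (3.23).

```python
import numpy as np

def trg_step(T, chi_max):
    # one coarse-graining iteration of TRG on a uniform square-lattice TN
    chi = T.shape[0]

    def split(M):
        # truncated SVD of a chi^2 x chi^2 grouping of T, cf. Eq. (3.22)
        U, s, Vh = np.linalg.svd(M, full_matrices=False)
        k = min(chi_max, s.size)
        A = (U[:, :k] * np.sqrt(s[:k])).reshape(chi, chi, k)
        B = (np.sqrt(s[:k])[:, None] * Vh[:k]).reshape(k, chi, chi)
        return A, B, s

    X, Y, s1 = split(T.reshape(chi**2, chi**2))            # group (a1,a2)|(a3,a4)
    U_, V_, s2 = split(T.transpose(1, 2, 3, 0).reshape(chi**2, chi**2))  # (a2,a3)|(a4,a1)

    # contract the four third-order pieces around a plaquette, cf. Eq. (3.23)
    Tnew = np.einsum('apt,bqp,qsc,tsd->abcd', Y, V_, X, U_)
    Tnew /= np.abs(Tnew).max()          # divide by a proper number (the "arrow")
    eps = np.sqrt(np.sum(s1[chi_max:]**2) + np.sum(s2[chi_max:]**2))
    return Tnew, eps                    # eps: discarded weight, cf. Eq. (3.24)

# illustrative use: coarse-grain a random tensor; N -> N/2 tensors per step
T = np.random.rand(2, 2, 2, 2)
for _ in range(10):
    T, eps = trg_step(T, chi_max=8)
```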
One may choose other tensors, or even the whole TN, as the environment. In 2009, Xie et al. proposed the second renormalization group (SRG) algorithm [7]. The idea is that in each truncation step of TRG, one defines the global environment as the fourth-order tensor $E_{a_{\tilde n_1} a_{\tilde n_2} a_{\tilde n_3} a_{\tilde n_4}} = \sum_{\{a\}} \prod_{n \neq \tilde n} T^{(n,t)}_{a_{n_1} a_{n_2} a_{n_3} a_{n_4}}$, with $T^{(n,t)}$ the $n$-th tensor in the $t$-th step and $\tilde n$ the tensor to be truncated. $E$ is the contraction of the whole TN after removing $T^{(\tilde n,t)}$, and is computed by TRG. The truncation is then obtained not from the SVD of $T^{(\tilde n,t)}$, but from the SVD of $E$. The word "second" in the name of the algorithm comes from the fact that in each step of the original TRG, a second TRG is used to calculate the environment. SRG is obviously more costly, but bears a much higher accuracy than TRG. The balance between accuracy and efficiency, which can be controlled by the choice of environment, is one main factor to consider when developing or choosing a TN algorithm.

Corner Transfer Matrix Renormalization Group
In the 1960s, the corner transfer matrix (CTM) idea was developed originally by Baxter in Refs. [8,9] and in a book [10]. Such ideas and methods have been applied to various models, for example, the chiral Potts model [11-13], the eight-vertex model [2,14,15], and the 3D Ising model [16]. Combining CTM with DMRG, Nishino and Okunishi proposed CTMRG [17] in 1996 and applied it to several models [17-27]. In 2009, Orús and Vidal further developed CTMRG to deal with TNs [28]. What they proposed is to introduce eight variational tensors to be optimized in the algorithm, namely four corner transfer matrices $C^{[1]}, C^{[2]}, C^{[3]}, C^{[4]}$ and four row (column) tensors $R^{[1]}, R^{[2]}, R^{[3]}, R^{[4]}$ on the boundary, and then to contract the tensors of the TN into these variational tensors in the specific order shown in Fig. 3.3. The TN contraction is considered solved when the variational tensors converge in this contraction process. Compared with the boundary-state methods presented below, the tensors in CTMRG define the states on both the boundaries and the corners.
Contraction In each iteration step of CTMRG, one chooses two corner matrices on the same side and the row tensor between them, e.g., $C^{[1]}$, $C^{[2]}$, and $R^{[2]}$. The update of these tensors (Fig. 3.4) follows the absorptions $\tilde C^{[1]} \leftarrow C^{[1]} R^{[1]}$, $\tilde R^{[2]} \leftarrow R^{[2]} T$, and $\tilde C^{[2]} \leftarrow C^{[2]} R^{[3]}$, i.e., the tensors $R^{[1]}$, $T$, and $R^{[3]}$ are absorbed to renew $C^{[1]}$, $R^{[2]}$, and $C^{[2]}$; afterwards, the enlarged bonds of $\tilde C^{[1]}$, $\tilde R^{[2]}$, and $\tilde C^{[2]}$ are truncated (the acquisition of the truncation matrix is explained below). After the contraction given above, one column of the TN (together with the corresponding row tensors $R^{[1]}$ and $R^{[3]}$) has been contracted. Then one chooses other corner matrices and row tensors (such as $\tilde C^{[1]}$, $C^{[4]}$, and $R^{[1]}$) and implements similar contractions. By iteratively doing so, the TN is contracted in the way shown in Fig. 3.3.

Note that for a finite TN, the initial corner matrices and row tensors should be taken as the tensors located on the boundary of the TN. For an infinite TN, they can be initialized randomly, and the contraction should be iterated until the preset convergence criterion is reached.
CTMRG can be regarded as a polynomial contraction scheme. The number of tensors contracted at each step is determined by the length of the boundary of the TN at that iteration. Taking the contraction of a 2D TN defined on an $(L \times L)$ square lattice as an example, the length of each side is $L - 2t$ at the $t$-th step, so the boundary length of the TN (i.e., the number of tensors contracted at the $t$-th step) is $4(L - 2t) - 4$, linear in $t$. For a 3D TN such as a cubic TN, the boundary size scales as $6(L - 2t)^2 - 12(L - 2t) + 8$; thus a CTMRG for a 3D TN (if it exists) would also give a polynomial contraction.
Truncation One can see that after the contraction in each iteration step, the bond dimensions of the variational tensors increase. Truncations are then needed to prevent an excessive growth of the bond dimensions. In Ref. [28], the truncation is obtained by inserting a pair of isometries $V$ and $V^{\dagger}$ into the enlarged bonds. A reasonable (but not the only) choice of $V$ for a translationally invariant TN is given by the eigenvalue decomposition of a sum of the corner transfer matrices, e.g., $\tilde C^{[1]\dagger} \tilde C^{[1]} + \tilde C^{[2]} \tilde C^{[2]\dagger} = V \Lambda V^{\dagger}$, where only the $\chi$ largest eigenvalues (and the corresponding eigenvectors) are preserved. Therefore, $V$ is a matrix of dimension $D\chi \times \chi$, where $D$ is the bond dimension of $T$ and $\chi$ is the dimension cut-off. We then truncate $\tilde C^{[1]}$, $\tilde R^{[2]}$, and $\tilde C^{[2]}$ by contracting $V$ (and $V^{\dagger}$) into their enlarged bonds (3.32).

Error and Environment As in TRG or TEBD, the truncations are obtained from the matrix decompositions of certain tensors that define the environment. From Eq. (3.29), the environment in CTMRG is the loop formed by the corner matrices and row tensors. Note that symmetries may be exploited to accelerate the computation; for example, one may take $C^{[1]} = C^{[2]} = C^{[3]} = C^{[4]}$ and $R^{[1]} = R^{[2]} = R^{[3]} = R^{[4]}$ when the TN has rotational and reflection symmetries ($T_{a_1a_2a_3a_4}$ invariant under any permutation of its indexes).

Time-Evolving Block Decimation
In the language of TN, TEBD solves the TN contraction problem in a linearized manner, and the truncation is calculated in the context of an MPS. In the following, let us explain the infinite TEBD (iTEBD) algorithm [31] (Fig. 3.5), still taking the infinite square TN formed by copies of a fourth-order tensor $T$ as an example. In each step, a row of tensors (which can be regarded as an MPO) is contracted into an MPS $|\psi\rangle$. Inevitably, the bond dimensions of the tensors in the MPS increase exponentially as the contractions proceed; truncations are therefore necessary to prevent the bond dimensions from diverging. The truncations are determined by minimizing the distance between the MPSs before and after the truncation. After the MPS $|\psi\rangle$ converges, the TN contraction becomes $\langle\psi|\psi\rangle$, which can be computed exactly and easily.
Contraction We use a two-site translationally invariant MPS, formed by the tensors $A$ and $B$ on the sites and the spectra $\Lambda$ and $\Gamma$ on the bonds, as $\sum_{\{a\}} \cdots \Lambda_{a_{n-1}} A_{s_{n-1},a_{n-1}a_n} \Gamma_{a_n} B_{s_n,a_na_{n+1}} \Lambda_{a_{n+1}} \cdots$ (3.33). In each step of iTEBD, a row of the TN is contracted into the MPS; e.g., the tensor $A$ is updated as $\tilde A_{s,\tilde a\tilde a'} = \sum_{s'} T_{ss',bb'} A_{s',aa'}$, with the composite indexes $\tilde a = (b, a)$ and $\tilde a' = (b', a')$. Meanwhile, the spectra are also updated, e.g., $\tilde\Lambda_{(b,a)} = 1_b \Lambda_a$, where $1$ is a vector with $1_b = 1$ for any $b$.
It is easy to see that the number of tensors contracted in iTEBD grows linearly as $tN$, with $t$ the number of contraction-and-truncation steps and $N \to \infty$ the number of columns of the TN. Therefore, iTEBD (as well as finite TEBD) can be considered a linearized contraction algorithm, in contrast to exponential contraction algorithms like TRG.
Truncation Truncations are needed when the dimensions of the virtual bonds exceed the preset dimension cut-off $\chi$. In the original version of iTEBD [31], the truncations are done by local SVDs. To truncate a virtual bond, for example, one defines a matrix by contracting the tensors and spectra connected to the target bond, $M_{(sa),(s'a')} = \sum_{\tilde a} \Lambda_a A_{s,a\tilde a} \Gamma_{\tilde a} B_{s',\tilde a a'} \Lambda_{a'}$ (3.37), and takes its SVD, keeping only the $\chi$ largest singular values; the updated tensors are recovered after dividing the boundary spectra back out. At this point, the truncation of the spectrum $\Gamma$ and the corresponding virtual bond is completed. Any spectra and virtual bonds can be truncated similarly.
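A minimal sketch of this local truncation (the index convention $A[s, a, b]$, the random tensors, and all sizes are assumptions for illustration): build the matrix of Eq. (3.37) around the bond carrying $\Gamma$, SVD it keeping $\chi$ values, then divide the boundary spectra $\Lambda$ back out.

```python
import numpy as np

d, chi, chi_max = 2, 6, 4
A = np.random.rand(d, chi, chi)          # A[s, a, b]
B = np.random.rand(d, chi, chi)
Lam = np.random.rand(chi) + 0.1          # spectrum on the outer bonds
Gam = np.random.rand(chi) + 0.1          # spectrum on the target bond

# M_{(s a),(t b)} = sum_c Lam_a A[s,a,c] Gam_c B[t,c,b] Lam_b, cf. Eq. (3.37)
M = np.einsum('a,sac,c,tcb,b->satb', Lam, A, Gam, B, Lam)
U, s, Vh = np.linalg.svd(M.reshape(d * chi, d * chi), full_matrices=False)

k = min(chi_max, s.size)
Gam_new = s[:k] / np.linalg.norm(s[:k])                 # truncated spectrum
A_new = np.einsum('a,sak->sak', 1.0 / Lam,
                  U[:, :k].reshape(d, chi, k))          # divide Lam back out
B_new = np.einsum('ktb,b->tkb', Vh[:k].reshape(k, d, chi), 1.0 / Lam)
print(A_new.shape, Gam_new.shape, B_new.shape)          # (2,6,4) (4,) (2,4,6)
```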
Error and Environment Similar to TRG and SRG, the environment of the original iTEBD is M in Eq. (3.37), and the error is measured by the discarded singular values of M. Thus, iTEBD seems to only use local information to optimize the truncations. What is amazing is that when the MPO is unitary or near unitary, the MPS converges to a so-called canonical form [46,47]. The truncations are then optimal by taking the whole MPS as the environment. If the MPO is far from being unitary, Orús and Vidal proposed the canonicalization algorithm [47] to transform the MPS into the canonical form before truncating. We will talk about this issue in detail in the next section.

Boundary-State Methods: Density Matrix Renormalization Group and Variational Matrix Product State
The iTEBD can be understood as a boundary-state method. One may consider one row of tensors in the TN as an MPO (see Sect. 2.2.6 and Fig. 2.10), where the vertical bonds are the "physical" indexes and the bonds shared by two adjacent tensors are the geometrical indexes. This MPO is also called the transfer operator or transfer MPO of the TN. The converged MPS is in fact the dominant eigenstate of this MPO. When the MPO represents a physical Hamiltonian or the imaginary-time evolution operator (see Sect. 3.1), the MPS is the ground state. In more general situations, e.g., when the TN represents a 2D partition function or the inner product of two 2D PEPSs, the MPS can be understood as the boundary state of the TN (or the PEPS) [48-50]. The contraction of the 2D infinite TN then becomes computing the boundary state, i.e., the dominant eigenstate (and eigenvalue) of the transfer MPO.
The boundary-state scheme gives several non-trivial physical and algorithmic implications [48-52], including the underlying resemblance between iTEBD and the famous infinite DMRG (iDMRG) [53]. DMRG [54,55] follows the idea of Wilson's NRG [56] and solves the ground states and low-lying excitations of 1D or quasi-1D Hamiltonians (see several reviews [57-60]); originally, it had no direct relation to TN contraction problems. After the MPS and MPO became well understood, DMRG was re-interpreted in a manner much closer to TN (see the review by Schollwöck [57]). In particular, for simulating the ground states of infinite-size 1D systems, the underlying connections between iDMRG and iTEBD were discussed by McCulloch [53]. As argued above, the contraction of a TN can be computed by solving for the dominant eigenstate of its transfer MPO. The eigenstates reached by iDMRG and iTEBD are the same state up to a gauge transformation (see Sect. 2.4.2 for the gauge degrees of freedom of MPS). Considering that DMRG is mostly not used to compute TN contractions and that several comprehensive reviews already exist, we skip the technical details of the DMRG algorithms here; one may refer to the papers mentioned above if interested. Later, however, we will revisit iDMRG from the perspective of multi-linear algebra.
The variational matrix product state (VMPS) method is a variational version of DMRG for (but not limited to) calculating the ground states of 1D systems with periodic boundary conditions [61]. Compared with DMRG, VMPS is more directly related to TN contraction problems. In the following, we explain VMPS by solving the contraction of the infinite square TN. As discussed above, this is equivalent to solving for the dominant eigenvector (denoted by $|\psi\rangle$) of the infinite MPO (denoted by $\hat\rho$) formed by a row of tensors in the TN. The task is to optimize $\langle\psi|\hat\rho|\psi\rangle$ under the constraint $\langle\psi|\psi\rangle = 1$, with the eigenstate $|\psi\rangle$ written in the form of an MPS.
The tensors in $|\psi\rangle$ are optimized one by one. For instance, to optimize the $n$-th tensor, all other tensors are kept unchanged and treated as constants. The local optimization problem then becomes the generalized eigenvalue problem $\hat H_{\text{eff}}|T_n\rangle = E \hat N_{\text{eff}}|T_n\rangle$, with $E$ the eigenvalue. As illustrated in Fig. 3.6, $\hat H_{\text{eff}}$ is a sixth-order tensor defined by contracting all tensors in $\langle\psi|\hat\rho|\psi\rangle$ except the $n$-th tensor and its conjugate (Fig. 3.6a); similarly, $\hat N_{\text{eff}}$ is a sixth-order tensor defined by contracting all tensors in $\langle\psi|\psi\rangle$ except the $n$-th tensor and its conjugate (Fig. 3.6b). Again, the MPS obtained by VMPS differs from that obtained by TEBD only by a gauge transformation.
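A toy sketch of one such local update (the matrices below are random stand-ins with the symmetry and positivity that the actual network contractions of Fig. 3.6 would provide; the dimension $d\chi^2$ is an illustrative choice):

```python
import numpy as np
from scipy.linalg import eigh

dim = 18                                      # d * chi^2, e.g., d = 2, chi = 3
H = np.random.rand(dim, dim)
H_eff = (H + H.T) / 2                         # stand-in for Fig. 3.6a
N = np.random.rand(dim, dim)
N_eff = N @ N.T + dim * np.eye(dim)           # positive-definite stand-in, Fig. 3.6b

E, vecs = eigh(H_eff, N_eff)                  # generalized eigenvalue problem
T_n = vecs[:, 0]                              # minimal-E solution; for a dominant
print(E[0], T_n.shape)                        # eigenstate one takes the other end
```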
Note that the boundary-state methods are not limited to solving TN contractions. An example is the time-dependent variational principle (TDVP). The basic idea of TDVP was proposed by Dirac in 1930 [62], and it was later incorporated with the Hamiltonian formulation [63] and the action-function formulation [64]; for more details, one may refer to the review by Langhoff et al. [65]. In 2011, TDVP was developed to simulate the time evolution of many-body systems with the help of MPS [66]. Since TDVP (and some other algorithms) directly concerns a quantum Hamiltonian instead of a TN contraction, we skip the details of these methods here.

Transverse Contraction and Folding Trick
For the boundary-state methods introduced above, the boundary states are defined in real space. Taking iTEBD for real-time evolution as an example, the contraction is implemented along the time direction, i.e., the time evolution is carried out explicitly. It is quite natural to consider implementing the contraction along the other direction. In the following, we introduce the transverse contraction and the folding trick proposed and investigated in Refs. [67-69]. The motivation of the transverse contraction is to avoid the explicit simulation of the time-dependent state $|\psi(t)\rangle$, which might be difficult to capture due to the fast growth of its entanglement.
Transverse Contraction Let us consider calculating the average of a one-body operator, $o(t) = \langle\psi(t)|\hat o|\psi(t)\rangle$, with $|\psi(t)\rangle$ an infinite-size quantum state evolved to time $t$. The TN representing $o(t)$ is given in the left part of Fig. 3.7, where the green squares give the initial MPS $|\psi(0)\rangle$ and its conjugate, the yellow diamond is $\hat o$, and the TN formed by the green circles represents the evolution operator $e^{-it\hat H}$ and its conjugate (see how to define this TN in Sect. 3.1.3).

Fig. 3.7 Transverse contraction of the TN for a local expectation value O(t)
To perform the transverse contraction, we treat each column of the TN as an MPO $\hat T$. Then, as shown in the right part of Fig. 3.7, the main task in computing $o(t)$ is to solve for the dominant eigenstate $|\phi\rangle$ (normalized) of $\hat T$, which is an MPS illustrated by the purple squares. One may solve this eigenstate problem by any of the boundary-state methods (TEBD, DMRG, etc.). With $|\phi\rangle$, $o(t)$ can be exactly and efficiently calculated as $o(t) = \langle\phi|\hat T_o|\phi\rangle / \langle\phi|\hat T|\phi\rangle$, with $\hat T_o$ the column that contains the operator $\hat o$. Note that the length of $|\phi\rangle$ (i.e., the number of tensors in the MPS) is proportional to the time $t$; thus one should use the finite-size versions of the boundary-state methods. It should also be noted that $\hat T$ may not be Hermitian. In this case, one should not use $|\phi\rangle$ and its conjugate, but compute the left and right eigenstates of $\hat T$ instead. Interestingly, similar ideas of transverse contraction appeared long before the concept of TN emerged. For instance, the transfer matrix renormalization group (TMRG) [70-73] can be used to simulate the finite-temperature properties of a 1D system. The idea of TMRG is to utilize DMRG to calculate the dominant eigenstate of the transfer matrix (similar to $\hat T$); in the TN terminology, it uses DMRG to compute $|\phi\rangle$ from the TN that defines the imaginary-time evolution. We skip the details of TMRG since it is not directly related to TN; one may refer to the related references if interested.
Folding Trick The main bottleneck of a boundary-state method concerns the entanglement of the boundary state: the method becomes inefficient when the entanglement of the boundary state grows too large. One example is the real-time simulation of a 1D chain, where the entanglement entropy increases linearly with time. The transverse contraction alone does not essentially solve this problem; taking the imaginary-time evolution as an example, it has been shown that, owing to the dual symmetry of space and time, the boundary states in the space and time directions possess the same entanglement [69,74].
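To illustrate the left/right-eigenvector prescription for a non-Hermitian transfer operator mentioned above in the simplest possible setting, here is a toy matrix-level sketch (random matrices stand in for the transfer MPO $\hat T$ and the impurity column $\hat T_o$):

```python
import numpy as np

D = 6
T = np.random.rand(D, D)                 # "transfer matrix", not symmetric
T_o = np.random.rand(D, D)               # stand-in for the impurity column

wr, vr = np.linalg.eig(T)
wl, vl = np.linalg.eig(T.T)              # left eigenvectors of T
phi_R = vr[:, np.argmax(wr.real)].real   # dominant right eigenvector
phi_L = vl[:, np.argmax(wl.real)].real   # dominant left eigenvector

o = (phi_L @ T_o @ phi_R) / (phi_L @ T @ phi_R)
print(o)
```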
In Ref. [67], the folding trick was proposed. The idea is to "fold" the TN before the transverse contraction (Fig. 3.8). In the folded TN, each tensor is the tensor product of the original tensor and its conjugate. The length of the folded TN in the time direction is half that of the original TN, and so is the length of the boundary state.
The previous work [67] on the dynamic simulations of 1D spin chains showed that the entanglement of the boundary state is in fact reduced compared with that of the boundary state without folding. This suggests that the folding trick provides a more efficient representation of the entanglement structure of the boundary state. The authors of Ref. [67] suggested an intuitive picture to understand the folding trick. Consider a product state as the initial state at $t = 0$ and a single localized excitation at position $x$ that propagates freely with velocity $v$. After evolving for a time $t$, only the sites within the interval $(x - vt, x + vt)$ become entangled. With the folding trick, the (unitary) evolution gates outside this interval take no effect, since they are folded onto their conjugates and become identities. Thus the spins outside $(x \pm vt)$ remain in a product state and do not contribute entanglement to the boundary state. In short, one key factor here is the entanglement structure, i.e., the fact that the TN is formed by unitaries. The transverse contraction with the folding trick is a convincing example showing that the efficiency of contracting a TN can be improved by properly designing the contraction scheme according to the entanglement structure of the TN.

Relations to Exactly Contractible Tensor Networks and Entanglement Renormalization
The TN algorithms explained above aim at optimally contracting TNs that cannot be contracted exactly. A question then arises: how is a classical computer actually able to handle these TNs? In the following, we show that by explicitly keeping the isometries used for the truncations inside the network, the TNs contracted in these algorithms eventually become exactly contractible, dubbed exactly contractible TNs (ECTNs). Different algorithms lead to different ECTNs, which means an algorithm will perform well if the TN can be accurately approximated by the corresponding ECTN. Figure 3.9 shows the ECTN emerging in the plaquette renormalization [75] or higher-order TRG (HOTRG) [76] algorithms. Take the contraction of a TN (formed by copies of a tensor $T$) on the square lattice as an example. In each iteration step, four nearest-neighbor $T$'s in a square are contracted together, which leads to a new square TN formed by tensors $T^{(1)}$ with larger bond dimensions. Then, isometries (yellow triangles in Fig. 3.9) are inserted into the TN to truncate the bond dimensions (the truncations are in the same spirit as those in CTMRG, see Fig. 3.4). Let us not contract the isometries with the tensors but leave them inside the TN. Still, we can move on to the next iteration, where we contract four $T^{(1)}$'s (each of which is formed by four $T$'s and the isometries, see the dark-red plaquettes in Fig. 3.9) and obtain more isometries for truncating the bond dimensions of $T^{(1)}$. By repeating this process several times, one can see that tree TNs appear on the boundaries of the coarse-grained plaquettes. Inside the 4-by-4 plaquettes (light red shadow), we have two-layer tree TNs formed by three isometries; in the 8-by-8 plaquettes, the tree TNs have three layers with seven isometries. These tree TNs separate the original TN into different plaquettes, so that it can be exactly contracted, similar to the fractal TNs introduced in Sect. 2.3.6.
In the iTEBD algorithm [29-31,47] (Fig. 3.10), one starts with an initial MPS (dark-blue squares). In each iteration, one tensor (light blue circles) of the TN is contracted into the MPS tensor, and then the bonds are truncated by isometries (yellow triangles). Seen globally, the isometries separate the TN into many "tubes" (red shadow) that are connected only at the top. The length of the tubes equals the number of iteration steps in iTEBD. Obviously, this TN is exactly contractible. Such a tube-like structure also appears in the contraction algorithms based on PEPS.
For the CTMRG algorithm [28], the corresponding ECTN is a little more complicated (see one quarter of it in Fig. 3.11). The initial row (column) tensors and the corner transfer matrices are represented by the pink and green squares. In each iteration step, the outermost tensors (light blue circles) are contracted into the row (column) tensors and the corner transfer matrices, and isometries are introduced to truncate the bond dimensions. Seen globally, the isometries separate the TN into a tree-like structure (red shadow), which is exactly contractible.
For each of these three algorithms, the resulting ECTN is formed by two parts: the tensors of the original TN, and the isometries that make the TN exactly contractible. After optimizing the isometries, the original TN is approximated by the ECTN. The structure of the ECTN depends mainly on the contraction strategy, and the way of optimizing the isometries depends on the chosen environment.
The ECTN picture shows explicitly how the correlations and entanglement are approximated in different algorithms. Roughly speaking, the correlation properties can be read from the minimal distance of the path in the ECTN connecting two given sites, and the (bipartite) entanglement can be read from the number of bonds crossing the boundary of the bipartition. How well this structure suits the correlations and entanglement should be a key factor in the performance of a TN contraction algorithm. Meanwhile, this picture can assist us in developing new algorithms by designing the ECTN and taking the whole ECTN as the environment for optimizing the isometries. These issues still need further investigation.
The unification of TN contraction and the ECTN has been explicitly utilized in the TN renormalization (TNR) algorithm [77,78], where both isometries and unitaries (called disentanglers) are put into the TN to make it exactly contractible. Then, instead of tree TNs or MPSs, one has MERAs inside (see Fig. 2.7c, for example), which better capture the entanglement of critical systems.

A Short Summary
In this section, we have discussed several contraction approaches for dealing with 2D TNs. With these algorithms, many challenging problems can be efficiently solved, including the ground-state and finite-temperature simulations of 1D quantum systems and the simulations of 2D classical statistical models. Such algorithms consist of two key ingredients: contractions (local operations on tensors) and truncations. The local contractions determine the way the TN is contracted step by step, or in other words, how the entanglement information is kept, according to the ECTN structure. Different (local or global) contractions may lead to different computational costs; thus, optimizing the contraction sequence is necessary in many cases [67,79,80]. The truncation is the approximation that discards the less important basis states so that the computational cost stays properly bounded. One essential concept in the truncations is the "environment," which plays the role of the reference when determining the weights of the basis states. Thus, the choice of environment controls the balance between the accuracy and efficiency of a TN algorithm.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Tensor Network Approaches for Higher-Dimensional Quantum Lattice Models
Abstract In this section, we will show several representative TN approaches for simulating quantum lattice models in $(d > 1)$ dimensions. We will mainly use the language of TN contractions. One may refer to several existing reviews for a more exhaustive understanding of the TN simulations of quantum problems. We will focus on the algorithms based on PEPS, and show the key roles that the 2D TN contraction algorithms presented in Sect. 3 play in the higher-dimensional cases.

Variational Approaches of Projected-Entangled Pair State
Without losing generality, we consider a 2D quantum system with nearest-neighbor couplings on an infinite square lattice as an example. Note that an infinite PEPO (iPEPO) representation is not required to define $\hat H_{\text{eff}}$; in fact, it is not easy to obtain the iPEPO of an arbitrary 2D (or 3D) Hamiltonian. The usual way is to start from the summation form of the Hamiltonian, $\hat H = \sum_{\langle ij\rangle} \hat H_{ij}$, and compute the contribution to $\hat H_{\text{eff}}$ from each $\hat H_{ij}$ separately [2].
Each term is computed by contracting a 2D TN, where one can reuse intermediate results to improve the efficiency. Following the same idea (minimizing $E$), algorithms were proposed that combine TN with QMC methods [5-9]; let us still focus on those based on PEPS. One may rewrite the energy as a weighted average over spin configurations $\{|S\rangle\}$, sampled according to the weights $W(S) = \langle S|\psi\rangle$; it is easy to see that the normalization condition of the weights, $\sum_S W(S)^2 = 1$, is satisfied.
The task then becomes computing $W(S)$ and $\langle S'|\hat H|S\rangle$ for different configurations. The computation of $\langle S'|\hat H|S\rangle$ is relatively easy, since $|S\rangle$ and $|S'\rangle$ are just two product states. The computation of $W(S)$ is trickier. When $|\psi\rangle$ is a PEPS on the square lattice, $W(S)$ is the contraction of a 2D scalar TN obtained by fixing all the physical indexes of the PEPS, $W(S) = \mathrm{tTr} \prod_n P^{[n]}_{s_n}$, where $P^{[n]}_{s_n}$ is the fourth-order tensor with only geometrical indexes obtained by fixing the $n$-th physical index to $s_n$. Considering that most configurations are not translationally invariant, such QMC-TN methods are usually applied to finite-size models; one may use the finite-TN versions of the algorithms reviewed in Sect. 3.

Imaginary-Time Evolution Methods
Another way to compute the ground-state iPEPS is imaginary-time evolution, analogous to the MPS methods presented in Sect. 3.1.3. For a $d$-dimensional quantum model, the ground-state simulation can be considered as computing the contraction of a $(d+1)$-dimensional TN.
Firstly, let us show how the evolution operator for an infinitesimal imaginary-time step $\tau$ can be written as an iPEPO, which is in fact one layer of the 3D TN (Fig. 4.2). The evolution of the iPEPS is to put the iPEPS at the bottom and contract the TN into it layer by layer.
To proceed, we divide the local Hamiltonian terms on the square lattice into four groups, $\hat H_{e,e}$, $\hat H_{e,o}$, $\hat H_{o,e}$, and $\hat H_{o,o}$, according to the parities of the coordinates of the sites they act on. Let us assume translational invariance of the Hamiltonian, i.e., $\hat H_{[i,j]} = \hat H^{[\text{two}]}$. The element of the two-body evolution operator is a fourth-order tensor, $U_{s_i s_j s'_i s'_j} = \langle s'_i s'_j|\exp(-\tau \hat H^{[\text{two}]})|s_i s_j\rangle$. Implementing an SVD or QR decomposition on $U$, one has $U_{s_i s_j s'_i s'_j} = \sum_{a} L_{s_i s'_i, a} R_{s_j s'_j, a}$ (4.9). Then the two tensors $T^{[L]}$ and $T^{[R]}$ that form the iPEPO of $\hat U$ are obtained by collecting, on each site, the $L$'s and $R$'s of the bonds connected to it. While the TN for the imaginary-time evolution with the iPEPO is a cubic TN, one may directly use the tensor $U$, which also gives a 3D, though not cubic, TN. Without losing generality, we will in the following use the iPEPO to present the algorithms for contracting a cubic TN; the algorithms can be readily applied to the statistical models on the cubic lattice or to other problems that can be written as the contraction of a cubic TN.
The evolution $\hat U|\psi\rangle$ amounts to contracting the iPEPO (one layer of tensors) into the iPEPS. In accordance with the translational invariance of the iPEPO, the iPEPS is also formed by two inequivalent tensors (denoted by $P^{[L]}$ and $P^{[R]}$). Locally, the tensors of the evolved iPEPS are given as $\tilde P^{[L]}_{s,(a_1\alpha_1)(a_2\alpha_2)(a_3\alpha_3)(a_4\alpha_4)} = \sum_{s'} T^{[L]}_{ss',a_1a_2a_3a_4} P^{[L]}_{s',\alpha_1\alpha_2\alpha_3\alpha_4}$ (4.11), with the composite indexes $(a_x\alpha_x)$ for $x = 1, 2, 3, 4$ (and similarly for $P^{[R]}$). Obviously, the bond dimensions of the new tensors are multiplied by $\dim(a_x)$. It is thus necessary to preset a dimension cut-off $\chi$: when the bond dimensions become larger than $\chi$, approximations are introduced to reduce the dimensions back to $\chi$. One can then iterate the evolution of the iPEPS with a bounded computational cost. After the iPEPS converges, the ground state is considered reached. Therefore, one key step in the imaginary-time schemes (as well as in the similar contraction schemes of 3D TNs) is to find the optimal truncations of the enlarged bonds. In the following, we will concentrate on the truncation of the bond dimensions, and present three kinds of schemes, known as full, simple, and cluster updates according to the environment with which the truncations are optimized [10].

Full, Simple, and Cluster Update Schemes
For truncating the dimensions of the geometrical bonds of an iPEPS, the task is to minimize the distance between the iPEPSs before and after the truncation, i.e., $\varepsilon = \| |\psi'\rangle - |\psi\rangle \|$. With the normalization of the iPEPSs, the problem can be reduced to the maximization of the fidelity $Z = \langle\psi'|\psi\rangle$ (4.14). As discussed in Sect. 3.1.2, $Z$ is in fact a scalar TN.
Full Update Among the three kinds of update schemes, the full update seems the most natural and reasonable: the truncation is optimized with respect to the whole iPEPS [10-16]. Let us consider a translationally invariant iPEPS. For the square lattice, the iPEPS is formed by infinite copies of two tensors $P^{[L]}$ and $P^{[R]}$ located on the two sub-lattices, whose evolution is given by Eq. (4.10). We use $\tilde P^{[L]}$ and $\tilde P^{[R]}$ to denote the tensors with enlarged bond dimensions. Below, we follow Ref. [13] to explain the truncation process. To truncate the fourth bond $\alpha_4$ of the tensor, for example, one first defines the tensor $M$ by contracting the pair of $\tilde P^{[L]}$ and $\tilde P^{[R]}$ that share this bond, and decomposes it with the dimension of the shared bond truncated to $\dim(\alpha_4) = \chi$. We shall stress that Eq. (4.17) is not the SVD of $M$: the decomposition and truncation are optimized by the SVD of $M_e$ (the environment tensor), hence it is a non-local optimization.
With the formulas given above, the task is to compute the environment tensor $M_e$ by the contraction algorithms for 2D TNs. In Ref. [13], the authors developed SRG, where $M_e$ is computed by a modified version of the TRG algorithm [17]. Other options include iTEBD [15], CTMRG [12], etc. Note that how to define the environment, as well as how to truncate with it, may differ in subtle ways among different works. The spirit is the same: to maximize the fidelity in Eq. (4.14) with respect to the whole iPEPS.
Simple Update A much more efficient scheme known as the simple update was proposed by Jiang et al. [18]; it uses a local environment to determine the truncations, providing an extremely efficient algorithm to simulate 2D ground states. As shown in Fig. 2.8c, the iPEPS used in the simple update is formed by the tensors on the sites and the spectra on the bonds: two tensors $P^{[L]}$ and $P^{[R]}$ located on the two sub-lattices, and $\lambda^{[1]}$, $\lambda^{[2]}$, $\lambda^{[3]}$, and $\lambda^{[4]}$ on the four inequivalent geometrical bonds of each tensor. The evolution of the tensors in such an iPEPS is given by Eq. (4.10); the $\lambda^{[i]}$ should be simultaneously evolved as $\tilde\lambda^{[i]}_{(a_i,\alpha_i)} = 1_{a_i} \lambda^{[i]}_{\alpha_i}$, with $1_{a_i} = 1$ for any $a_i$.
To truncate the fourth geometrical bond of $P^{[L]}$ (and $P^{[R]}$), for example, we construct a new tensor by contracting $P^{[L]}$ and $P^{[R]}$ together with the adjacent spectra, and take its SVD, keeping only the $\chi$ largest singular values and the corresponding basis. $P^{[L]}$ and $P^{[R]}$ are then updated from the matrices of the SVD, after dividing the adjacent spectra back out [Eq. (4.20)]; the spectrum $\lambda^{[4]}$ is updated by the singular values of the SVD. The above procedure truncates $\dim(\alpha_4)$ to the dimension cut-off $\chi$, and can readily be applied to truncate any other bond. According to the discussion of SVD in Sect. 2.2.1, the environment here consists of the two tensors and the adjacent spectra in $M$, where the spectra play the role of an "effective" environment that approximates the true environment ($M_e$ in the full update). From this viewpoint, the simple update uses a local environment. Later, by borrowing the idea of the orthogonal form of iPEPS on Bethe lattices [19-26], it was realized that the environment of the simple update is the iPEPS on an infinite tree [27-29], not just a few tensors. We will discuss this in detail in the next chapter from the perspective of multi-linear algebra.
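A minimal sketch of one such truncation (the index order $P[s, a_1, a_2, a_3, a_4]$, all sizes, and the simplification that both tensors carry the same outer spectra are illustrative assumptions): absorb the adjacent spectra, SVD the contracted pair, truncate, and divide the outer spectra back out.

```python
import numpy as np

d, chi, chi_max = 2, 3, 3
PL = np.random.rand(d, chi, chi, chi, chi)      # P[s, a1, a2, a3, a4]
PR = np.random.rand(d, chi, chi, chi, chi)
l1, l2, l3 = (np.random.rand(chi) + 0.1 for _ in range(3))   # outer spectra
l4 = np.random.rand(chi) + 0.1                               # shared-bond spectrum

# absorb the adjacent spectra (the "effective environment") before the SVD
PLw = np.einsum('sxyzc,x,y,z,c->sxyzc', PL, l1, l2, l3, l4)
PRw = np.einsum('txyzc,x,y,z->txyzc', PR, l1, l2, l3)
M = np.einsum('sxyzc,tuvwc->sxyztuvw', PLw, PRw).reshape(d * chi**3, d * chi**3)

U, s, Vh = np.linalg.svd(M, full_matrices=False)
k = min(chi_max, s.size)
l4_new = s[:k] / np.linalg.norm(s[:k])          # updated spectrum on the bond

# restore the site tensors: divide the outer spectra back out of the isometries
PL_new = np.einsum('sxyzc,x,y,z->sxyzc',
                   U[:, :k].reshape(d, chi, chi, chi, k), 1/l1, 1/l2, 1/l3)
PR_new = np.einsum('ctuvw,u,v,w->tuvwc',
                   Vh[:k].reshape(k, d, chi, chi, chi), 1/l1, 1/l2, 1/l3)
print(PL_new.shape, l4_new.shape, PR_new.shape)
```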
Cluster Update For the same dimension cut-off, the simple update is much more efficient than the full update; on the other hand, the full update is more accurate, since it accounts for the environment better. The cluster update is between the simple and full updates, allowing a more flexible balance between efficiency and accuracy [10,27,30].
One way is to choose a finite cluster of the infinite TN and define the environment tensor by contracting the finite TN after taking a pair of $\tilde P^{[L]}$ and $\tilde P^{[R]}$ out. One may first use the simple update to obtain the spectra and put them on the boundary of the cluster [30]; this is equivalent to using a new boundary condition [27,29], different from the open or periodic boundary conditions of a finite cluster. Naturally, the bigger the cluster, the more accurate but also the more costly the computation. One may also consider an infinite-size cluster, formed by a certain number of rows of the tensors in the TN [10]. Again, both the accuracy and the computational cost will in general increase with the number of rows; with infinitely many rows, such a cluster update becomes the full update. Despite these advances, many open questions remain, for example, how to best balance efficiency and accuracy in the cluster update.

Summary of the Tensor Network Algorithms in Higher Dimensions
In this section, we mainly focused on the iPEPS algorithm that simulates the ground states of 2D lattice models. The key step is to compute the environment tensor, which is to contract the corresponding TN. For several special cases such as trees and fractal lattices, the environment tensor corresponds to an exactly contractible TN, and thus can be computed efficiently (see Sect. 2.3.6). For the regular lattices such as square lattice, the environment tensor is computed by the TN contraction algorithms, which is normally the most consuming step in the iPEPS approaches.
The key concepts and ideas, such as the environment, the (simple, cluster, and full) update schemes, and the use of SVD, can be similarly applied to finite-size cases [31,32], to finite-temperature simulations [27,28,33-39], and to real-time simulations [31,40] in two dimensions. The computational cost of the TN approaches is quite sensitive to the spatial dimension of the system. The simulations of 3D quantum systems are much more consuming than the 2D cases, as the task becomes contracting a 4D TN. Since the 4D TN contraction is extremely consuming, one may consider generalizing the simple update [29,41], or constructing finite-size effective Hamiltonians that mimic the infinite 3D quantum models [29,42]. Many technical details of the approaches can be flexibly modified according to the problem under consideration. For example, the iPEPO formulation is very useful when computing a 3D statistical model, which amounts to contracting the corresponding 3D TN; as for the imaginary-time evolution, it is usually more efficient to use the two-body evolution operators (see, e.g., [12,18]) rather than the iPEPO. The environment is not necessarily defined by a few tensors; it can be defined by contracting everything in the TN except the targeted geometrical bond [28,33]. The contraction order also significantly affects the efficiency and accuracy; one may consider using the "single-layer" picture [10,31] or an "intersected" optimized contraction scheme [43].

Tensor Network Contraction and Multi-Linear Algebra
Abstract This chapter is aimed at understanding TN algorithms from the perspective of MLA. In Sect. 5.1, we start from a simple example of a 1D TN stripe, which can be "contracted" by solving the eigenvalue decomposition of matrices. This relates to several important MPS techniques, such as canonicalization (Orús and Vidal, Phys Rev B 78:155117, 2008), which enables optimal truncations of the bond dimensions of MPSs (Sect. 5.1.1). In Sect. 5.

A Simple Example of Solving Tensor Network Contraction by Eigenvalue Decomposition
As discussed in the previous sections, the TN algorithms are mostly understood based on linear algebra, such as the eigenvalue and singular value decompositions.
Since the elementary building block of a TN is a tensor, it is very natural to think about using MLA to understand and develop TN algorithms. MLA is also known as tensor decompositions or tensor algebra [1]. It is a highly interdisciplinary subject; one of its tasks is to generalize the techniques of linear algebra to higher-order tensors. For instance, one key question is how to define the rank of a tensor and how to determine its optimal lower-rank approximation. This is exactly what we need in the TN algorithms. Let us begin with a trivial example, the trace of the product of $N$ $(\chi \times \chi)$ matrices $M$, i.e., $Z = \mathrm{Tr}(M^N)$. The dominant computational cost is around $O(\chi^3)$. In the limit $N \to \infty$, things become even easier: with the eigenvalue decomposition of $M$, one has $Z = \sum_a \Lambda_a^N \to \Lambda_0^N$, where $\Lambda_0$ is the largest eigenvalue, since $\lim_{N\to\infty} (\Lambda_a/\Lambda_0)^N = 0$ for $a > 0$. It means that all contributions except the dominant eigenvalue vanish when the TN is infinitely long; what we should do is just compute the dominant eigenvalue. The efficiency can be further improved by numerous mature techniques (such as the Lanczos algorithm).
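A quick numerical check with a random matrix (sizes are illustrative):

```python
import numpy as np

chi, N = 10, 50
M = np.random.rand(chi, chi)                    # positive entries: Perron root is real
Lam0 = np.max(np.abs(np.linalg.eigvals(M)))     # dominant eigenvalue Lambda_0
ratio = np.trace(np.linalg.matrix_power(M, N)) / Lam0**N
print(ratio)                                    # -> 1: Tr(M^N) ~ Lambda_0^N
```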

Canonicalization of Matrix Product State
Before considering 2D TNs, let us take further advantage of the eigenvalue decomposition of 1D TNs, which is closely related to the canonicalization of MPSs proposed by Orús and Vidal for the non-unitary evolution of MPSs [2]. Canonicalization is useful mainly in two respects: locating the optimal truncations of the MPS, and fixing the gauge degrees of freedom of the MPS for better stability and efficiency.

Canonical Form and Globally Optimal Truncations of MPS
As discussed in the above chapter, when using iTEBD to contract a TN, one needs to find the optimal truncations of the virtual bonds of the MPS. In other words, the problem is how to optimally reduce the dimension of an MPS. The globally optimal truncation can be done in the following (expensive) way. Let us divide the MPS into two parts by cutting the bond that is to be truncated (Fig. 5.1). Then, if we contract all the virtual bonds on the left-hand side and reshape all the physical indexes there into one index, we obtain a large matrix denoted as L_{···s_n, α_n}, which has one big physical index and one virtual index. Another matrix denoted as R*_{s_{n+1}···, α_n} can be obtained by doing the same thing on the right-hand side. The conjugate of R is taken there to obey some conventions.
Then, by contracting the virtual bond and performing the SVD as

Σ_{α_n} L_{···s_n, α_n} R*_{s_{n+1}···, α_n} = Σ_{a_n} L̃_{···s_n, a_n} λ_{a_n} R̃*_{s_{n+1}···, a_n},   (5.5)

the virtual bond dimension is optimally reduced to χ by keeping only the χ largest singular values and the corresponding singular vectors. The truncation error that is minimized is the distance between the MPS before and after the truncation. Therefore, the truncation is globally optimal, with the whole MPS taken as the environment. In practice, we do not implement the SVD above: it is the decomposition of the whole wave-function, which is exponentially expensive. Canonicalization provides an efficient way to realize this SVD through only local operations.
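The following hedged sketch (ours) illustrates this globally optimal truncation on a small random state, where the full wave-function can still be stored explicitly; the minimized truncation error equals the norm of the discarded singular values:

```python
import numpy as np

d, n, chi = 2, 10, 4           # physical dimension, sites, truncated bond dimension
psi = np.random.rand(d ** n) + 1j * np.random.rand(d ** n)
psi /= np.linalg.norm(psi)

# cut the chain in the middle: reshape into the matrix of Eq. (5.5)
mat = psi.reshape(d ** (n // 2), d ** (n // 2))

# global SVD of the whole wave-function (exponentially expensive in general)
u, lam, vh = np.linalg.svd(mat, full_matrices=False)

# keep only the chi largest singular values
mat_trunc = u[:, :chi] @ np.diag(lam[:chi]) @ vh[:chi, :]

# the minimized truncation error is the distance between the two states,
# i.e., sqrt of the sum of the squared discarded singular values
err = np.linalg.norm(mat - mat_trunc)
print(err, np.sqrt(np.sum(lam[chi:] ** 2)))
```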
Consider an infinite MPS with two-site translational invariance (Fig. 5.2); it is formed by the tensors A and B as well as the diagonal matrices Λ and Γ as

Σ_{{a}} ··· Λ_{a_{n−1}} A_{s_{n−1}, a_{n−1} a_n} Γ_{a_n} B_{s_n, a_n a_{n+1}} Λ_{a_{n+1}} ··· = tTr(··· ΛAΓBΛ ···).   (5.6)

This is the MPS used in the iTEBD algorithm (see Sect. 3.4). The MPS is in the canonical form when the tensors satisfy

Σ_{s a} Λ_a Λ*_a A_{s, a a'} A*_{s, a a''} = I_{a' a''},   (5.7)
Σ_{s a} A_{s, a' a} A*_{s, a'' a} Γ_a Γ*_a = I_{a' a''},   (5.8)
Σ_{s a} Γ_a Γ*_a B_{s, a a'} B*_{s, a a''} = I_{a' a''},   (5.9)
Σ_{s a} B_{s, a' a} B*_{s, a'' a} Λ_a Λ*_a = I_{a' a''},   (5.10)

where Λ and Γ are positive-definite (Fig. 5.3). Equations (5.7)–(5.10) are called the canonical conditions of the MPS. Note that there will be 2n equations with n-site translational invariance, meaning that each inequivalent tensor obeys two (left and right) conditions. In the canonical form, Λ or Γ directly gives the singular values obtained by cutting the MPS on the corresponding bond. To see this, let us calculate Eq. (5.5) from a canonical MPS. From the canonical conditions, the matrices L and R are unitary, satisfying L†L = I and R†R = I (the physical indexes are contracted). Meanwhile, Λ (or Γ) is positive-definite; thus L, Λ (or Γ), and R of a canonical MPS directly define the SVD, and Λ or Γ is indeed the singular value spectrum. The optimal truncation of a virtual bond is then reached by simply keeping the χ largest values of Λ and the corresponding basis of the neighboring tensors. This holds when cutting any one of the bonds of the MPS. From the uniqueness of the SVD, Eqs. (5.7)–(5.10) can be rewritten as four eigenvalue equations for the left and right transfer matrices of the MPS (denoted below by M_L, M_R and N_L, N_R for the two inequivalent tensors), and can be reinterpreted as follows: an infinite MPS formed by A, B, Λ, and Γ is canonical when the identity is the eigenvector of its transfer matrices. The canonical conditions themselves do not require the identity to be the dominant eigenvector. However, if the identity is not the dominant one, the canonical conditions become unstable under an arbitrarily small noise. Below, we will show that the canonicalization algorithm assures that the identity is the leading eigenvector, since it transforms the leading eigenvector into an identity. In addition, if the dominant eigenvector of M_L and M_R (also N_L and N_R) is degenerate, the canonical form is not unique. See Ref. [2] for more details.

Canonicalization Algorithm and Some Related Topics
Consider the iTEBD algorithm [3] (see Sect. 3.4). When the MPO represents a unitary operator, the canonical form of the MPS is preserved by the evolution (contraction). For the imaginary-time evolution, the MPO is near-unitary; in the limit of the Trotter step τ → 0, the MPO approaches the identity. It turns out that in this case, the MPS is canonicalized by the evolution in the standard iTEBD algorithm. When the MPO is non-unitary (e.g., when contracting the TN of a 2D statistical model) [2], the MPS will not be canonical, and canonicalization might be needed to better truncate the bond dimensions of the MPS.
Canonicalization Algorithm An algorithm to canonicalize an arbitrary MPS was proposed by Orús and Vidal [2]. The idea is to compute the dominant eigenvectors of the transfer matrices, and to introduce proper gauge transformations on the virtual bonds that map the leading eigenvectors to identities.
Let us take the gauge transformation on the virtual bond between A and B as an example. Firstly, compute the dominant left eigenvector v_L of the matrix N_L M_L, and similarly the dominant right eigenvector v_R of the matrix N_R M_R. Then, reshape v_L and v_R into two matrices and decompose them symmetrically as v_L = Y†Y and v_R = XX†. X and Y can be calculated using the eigenvalue decomposition, e.g., v_R = WDW† with X = W√D. Insert the identities X^{−1}X and YY^{−1} on the virtual bond as shown in Fig. 5.4; then we get a new matrix M = XΓY on this bond. Apply the SVD on M as M = UΓ̃V†, where Γ̃ is the updated spectrum on this bond. Meanwhile, we obtain the updated tensors by absorbing the remaining transformations into the neighbors, i.e., Ã = AX^{−1}U and B̃ = V†Y^{−1}B. Implement the same steps on the virtual bond between B and A; then the MPS is transformed into the canonical form. The left-canonical MPS is defined by A^L and B^L as tTr(··· A^L B^L A^L B^L ···), with A^L = ΛA and B^L = ΓB.
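As a small sketch of one ingredient of this algorithm (ours, not from Ref. [2]), the symmetric decomposition v_R = XX† follows from the eigenvalue decomposition as described; here we assume v_R is already given as a Hermitian, positive-definite matrix (in practice it is the reshaped dominant eigenvector of the transfer matrix):

```python
import numpy as np

chi = 6
# a Hermitian, positive-definite stand-in for the reshaped eigenvector v_R
a = np.random.rand(chi, chi) + 1j * np.random.rand(chi, chi)
vR = a @ a.conj().T + chi * np.eye(chi)

# symmetric decomposition v_R = X X^dagger via eigenvalue decomposition
D, W = np.linalg.eigh(vR)          # vR = W diag(D) W^dagger
X = W @ np.diag(np.sqrt(D))        # X = W sqrt(D)

print(np.allclose(X @ X.conj().T, vR))   # True
```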

Variants of the Canonical Form
Similarly, the right-canonical MPS is defined by A^R and B^R as tTr(··· A^R B^R A^R B^R ···), with A^R = AΓ and B^R = BΛ. The central-orthogonal MPS is defined as

tTr(··· A^L B^L A^M B^R A^R ···).   (5.28)

One can easily check that these MPSs and the canonical MPS can be transformed into each other by gauge transformations. From the canonical conditions, A^L, A^R, B^L, and B^R are non-square orthogonal matrices (e.g., Σ_{s a} A^L_{s, a a'} A^{L*}_{s, a a''} = I_{a' a''}), called isometries. A^M is called the central tensor of the central-orthogonal MPS. This MPS form is the state ansatz behind the DMRG algorithm [4,5], and is very useful in TN-based methods (see, for example, the works of McCulloch [6,7]). For instance, when applying DMRG to solve a 1D quantum model, the tensors A^L and B^L define a left-to-right RG flow that optimally compresses the Hilbert space of the left part of the chain, and A^R and B^R define a right-to-left RG flow similarly. The central tensor between these two RG flows is in fact the ground state of the effective Hamiltonian given by the RG flows of DMRG. Note that the canonical MPS is also called the central canonical form, where the directions of the RG flows can be switched arbitrarily by gauge transformations; thus there is no need to define the directions of the flows or a specific center.
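To make the notion of isometries concrete, here is a hedged sketch (ours, not from Refs. [4–7]) that brings a random finite MPS into left-canonical form by successive SVDs; each transformed tensor then satisfies the left orthogonality condition, realizing the left-to-right RG flow mentioned above:

```python
import numpy as np

d, chi, n = 2, 3, 6
# random finite MPS: tensors of shape (left bond, physical, right bond)
mps = [np.random.rand(1 if i == 0 else chi, d, chi if i < n - 1 else 1)
       for i in range(n)]

# left-to-right sweep: make every tensor (but the last) a left isometry
for i in range(n - 1):
    Dl, _, Dr = mps[i].shape
    u, s, vh = np.linalg.svd(mps[i].reshape(Dl * d, Dr), full_matrices=False)
    mps[i] = u.reshape(Dl, d, u.shape[1])
    # absorb the remainder into the next tensor (the "RG flow" moves right)
    mps[i + 1] = np.tensordot(np.diag(s) @ vh, mps[i + 1], axes=(1, 0))

# check the left orthogonality condition: sum_{s,a} A*_{s,ab} A_{s,ac} = I_{bc}
A = mps[0]
print(np.allclose(np.einsum('asb,asc->bc', A.conj(), A), np.eye(A.shape[2])))
```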

Relations to Tensor Train Decomposition
It is worth mentioning the TTD [8] proposed in the field of MLA. As argued in Chap. 2, one advantage of the MPS is that it lowers the number of parameters from an exponential dependence on the system size to a polynomial one. Let us consider a similar problem: for an N-th order tensor that has d^N parameters, how can one find its optimal MPS representation, where there are only [2dχ + (N − 2)dχ²] parameters? TTD was proposed for this aim: by decomposing a tensor into a tensor-train form that is similar to a finite open MPS, the number of parameters becomes linear in the order of the original tensor. The TTD algorithm shares many ideas with MPS and the related algorithms (especially DMRG, which was proposed about two decades earlier). The aim of TTD is also similar to the truncation tasks in the TN algorithms, which is to compress the number of parameters.
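A minimal TT-SVD sketch under our own naming conventions (following the general idea of [8], not its exact algorithm) decomposes an N-th order tensor into a train of third-order cores by sequential truncated SVDs:

```python
import numpy as np

def tt_decompose(T, chi):
    """Tensor-train decomposition by sequential SVDs,
    truncating every virtual bond to dimension at most chi."""
    dims = T.shape
    cores, r = [], 1
    mat = T.reshape(r * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vh = np.linalg.svd(mat, full_matrices=False)
        r_new = min(chi, s.size)
        cores.append(u[:, :r_new].reshape(r, dims[k], r_new))
        # push the remainder to the right and expose the next physical index
        mat = (np.diag(s[:r_new]) @ vh[:r_new]).reshape(r_new * dims[k + 1], -1)
        r = r_new
    cores.append(mat.reshape(r, dims[-1], 1))
    return cores

T = np.random.rand(3, 3, 3, 3, 3)
cores = tt_decompose(T, chi=4)
print([c.shape for c in cores])   # a finite open "MPS" of third-order cores
```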

Super-Orthogonalization and Tucker Decomposition
As discussed in the above section, the canonical form of an MPS brings many advantages, such as determining the entanglement and the optimal truncations of the virtual bond dimensions by local transformations. The canonical form can be readily generalized to iPEPSs on trees. Can we also define a canonical form for iPEPSs on higher-dimensional regular lattices, such as the square lattice (Fig. 5.5)? If this could be done, we would know how to find the globally optimal transformations that reduce the bond dimensions of the iPEPS, just like what we can do with an MPS. Due to the complexity of 2D TNs, unfortunately, there is no such form in general. In the following, we explain the super-orthogonal form of iPEPS proposed in 2012 [9], which applies the canonical form of tree iPEPS to iPEPS on regular lattices. The super-orthogonalization is a generalization of the Tucker decomposition (a higher-order generalization of the matrix SVD) [10], providing a zero-loop approximation scheme [11] to define the entanglement and truncate the bond dimensions.

Super-Orthogonalization
Let us start from the iPEPS on the (infinite) Bethe lattice with the coordination number z = 4. It is formed by two tensors P and Q on the sites as well as four spectra Λ (k) (k = 1, 2, 3, 4) on the bonds, as illustrated in Fig. 5.5. Here, we still take the two-site translational invariance for simplicity.
There are eight super-orthogonal conditions, four of which are associated with the tensor P and four with Q. For P, the conditions are

Σ_{s, ···a_{k−1} a_{k+1}···} P_{s, ···a_k···} P*_{s, ···a'_k···} Π_{n≠k} Λ^{(n)}_{a_n} Λ^{(n)*}_{a_n} = I_{a_k a'_k}   (∀ k),   (5.29)

where all the bonds, along with the corresponding spectra, are contracted except for a_k. It means that by putting a_k as one index and all the rest as another, the k-rectangular matrix S^{(k)} defined as

S^{(k)}_{s···a_{k−1} a_{k+1}···, a_k} = P_{s, ···a_k···} Π_{n≠k} Λ^{(n)}_{a_n}   (5.30)

is an isometry, satisfying S^{(k)†} S^{(k)} = I. The super-orthogonal conditions of the tensor Q are defined in the same way. Λ^{(k)} is dubbed the super-orthogonal spectrum when the super-orthogonal conditions are fulfilled.
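The conditions in Eqs. (5.29) and (5.30) can be checked directly. The following sketch (ours) builds the k-rectangular matrix S^(k) of a random tensor and measures the deviation from the isometry condition; the deviation vanishes only for a super-orthogonal tensor:

```python
import numpy as np

d, chi = 2, 3
# a random PEPS tensor P[s, a1, a2, a3, a4] and positive spectra on the bonds
P = np.random.rand(d, chi, chi, chi, chi)
Lam = [np.sort(np.random.rand(chi))[::-1] for _ in range(4)]

def superorthogonality_deviation(P, Lam, k):
    """Build S^(k) of Eq. (5.30): absorb the spectra on all bonds except
    a_k, then measure how far S^(k) is from satisfying S^dagger S = I."""
    Q = P.copy()
    for n in range(4):
        if n != k:
            shape = [1] * 5
            shape[n + 1] = chi        # axis 0 is the physical index s
            Q = Q * Lam[n].reshape(shape)
    S = np.moveaxis(Q, k + 1, -1).reshape(-1, chi)
    return np.linalg.norm(S.conj().T @ S - np.eye(chi))

print([superorthogonality_deviation(P, Lam, k) for k in range(4)])
```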
In the canonicalization of MPS, the vectors on the virtual bonds give the bipartite entanglement defined by Eq. (5.5), and the bond dimensions can be optimally reduced by discarding the smallest elements of the spectrum. In the super-orthogonalization, this is not always true for iPEPSs. For a translationally invariant iPEPS defined on a tree (also called the Bethe lattice, see Fig. 5.5a) [12][13][14][15][16][17][18][19], the super-orthogonal spectrum indeed gives the bipartite entanglement spectrum obtained by cutting the system at the corresponding place. However, for loopy lattices, such as an iPEPS defined on the square lattice (Fig. 5.5b), this no longer holds. Instead, the super-orthogonal spectrum provides an approximation of the entanglement of the iPEPS by optimally ignoring the loops. One can still truncate the bond dimensions according to the super-orthogonal spectrum, which in fact gives the simple update (see Sect. 4.3). We will discuss the loopless approximation in detail in Sect. 5.3 using the rank-1 decomposition.

Super-Orthogonalization Algorithm
Any PEPS can be transformed to the super-orthogonal form by iteratively implementing proper gauge transformations on the virtual bonds [9]. The algorithm consists of two steps. Firstly, compute the reduced matrix M^{(k)} of the k-rectangular matrix of the tensor P (Eq. (5.30)) as

M^{(k)} = S^{(k)†} S^{(k)},   (5.31)

and similarly the reduced matrix of Q on the same bond. Secondly, obtain the gauge transformations from these reduced matrices in the same way as in the canonicalization of MPS, insert them on the bond, and update the spectrum Λ^{(k)} there by an SVD. Compared with the canonicalization algorithm of MPS, one can see that the gauge transformations in the super-orthogonalization algorithm are quite similar. The difference is that one cannot transform a PEPS into the super-orthogonal form in a single step, since the transformation on one bond might cause deviations from the super-orthogonal conditions on other bonds. Thus, the above procedure should be iterated until all the tensors and spectra converge.

Super-Orthogonalization and Dimension Reduction by Tucker Decomposition
Such an iterative scheme is closely related to the Tucker decomposition in MLA [10]. The Tucker decomposition is considered as a generalization of the (matrix) SVD to higher-order tensors, thus it is also called the higher-order or multi-linear SVD. The aim is to find the optimal reduction of the bond dimensions of a single tensor. Let us define the k-reduced matrix of a tensor T as

M^{(k)}_{a_k a'_k} = Σ_{a_1···a_{k−1} a_{k+1}···} T_{a_1···a_k···} T*_{a_1···a'_k···},

where all except the k-th indexes are contracted. The Tucker decomposition (Fig. 5.7) of a tensor T has the form

T_{a_1 a_2 ···} = Σ_{b_1 b_2 ···} S_{b_1 b_2 ···} Π_k U^{(k)}_{a_k b_k},

where the following conditions should be satisfied:
• Unitarity. The U^{(k)} are unitary matrices satisfying U^{(k)} U^{(k)†} = I.
• All-orthogonality. For any k, the k-reduced matrix M^{(k)} of the tensor S is diagonal, satisfying M^{(k)}_{b_k b'_k} = Γ^{(k)}_{b_k} I_{b_k b'_k}.
• Ordering. For any k, the elements of Γ^{(k)} in the k-reduced matrix are positive-definite and in descending order, satisfying Γ^{(k)}_0 > Γ^{(k)}_1 > ···.
From these conditions, one can see that the tensor T is decomposed as the contraction of another tensor S with several unitary matrices. S is called the core tensor. The optimal lower-rank approximation of the tensor is then simply obtained as

T_{a_1 a_2 ···} ≈ Σ_{b_1, b_2, ··· = 0}^{χ−1} S_{b_1 b_2 ···} Π_k U^{(k)}_{a_k b_k},

where we only take the first χ terms in the summation of each index. Such an approximation can be understood in terms of the SVD of matrices. Applying the conditions to the k-reduced matrix of T, we have

M^{(k)} = U^{(k)} Γ^{(k)} U^{(k)†}.

Since U^{(k)} is unitary and Γ^{(k)} is positive-definite and in descending order, the above equation is exactly the eigenvalue decomposition of M^{(k)}. From the relation between the SVD of a matrix and the eigenvalue decomposition of its reduced matrix, we see that U^{(k)} and √Γ^{(k)} in fact give the singular vectors and singular values of the matrix T_{a_1···a_{k−1} a_{k+1}···, a_k}. The optimal truncation of the rank of each index is then reached by the corresponding SVD. The truncation error is the distance

ε = |T − T'|,   (5.40)

with T' the truncated tensor, which is minimized in this SVD. For the algorithms of the Tucker decomposition, one simple way is to do the eigenvalue decomposition of each k-reduced matrix, or the SVD of each k-rectangular matrix. Then, for a K-th order tensor, K SVDs give us the Tucker decomposition and a lower-rank approximation. This algorithm is often called the higher-order SVD (HOSVD), which has been successfully applied to implement truncations in the TRG algorithm [20]. The accuracy of HOSVD can be improved. Since the truncation on one index affects the truncations on the other indexes, there are some "interactions" among the different indexes (modes) of the tensor. The truncations in HOSVD are calculated independently, thus such "interactions" are ignored. An improved scheme is the higher-order orthogonal iteration (HOOI), where the interactions among different modes are taken into account by iterating the SVDs until convergence. See more details in Ref. [10].
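A compact HOSVD sketch (ours; see Ref. [10] for the formal algorithm) computes each U^(k) from the eigenvalue decomposition of the corresponding k-reduced matrix, forms the core tensor, and truncates every mode to dimension χ:

```python
import numpy as np

def hosvd(T, chi):
    """HOSVD: diagonalize each k-reduced matrix M^(k) to get U^(k);
    the core tensor S carries the truncated bonds."""
    Us, S = [], T.copy()
    for k in range(T.ndim):
        mat = np.moveaxis(T, k, 0).reshape(T.shape[k], -1)
        M = mat @ mat.conj().T                   # k-reduced matrix of T
        evals, evecs = np.linalg.eigh(M)
        U = evecs[:, ::-1][:, :chi]              # descending order, keep chi
        Us.append(U)
        # contract U^dagger into mode k of the core tensor
        S = np.moveaxis(np.tensordot(U.conj().T, np.moveaxis(S, k, 0),
                                     axes=(1, 0)), 0, k)
    return S, Us

T = np.random.rand(4, 4, 4, 4)
S, Us = hosvd(T, chi=2)

# lower-rank approximation T' = S contracted back with the U^(k)
Tp = S
for k in range(T.ndim):
    Tp = np.moveaxis(np.tensordot(Us[k], np.moveaxis(Tp, k, 0),
                                  axes=(1, 0)), 0, k)
print(np.linalg.norm(T - Tp) / np.linalg.norm(T))   # truncation error
```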
Compared with the conditions of the Tucker decomposition, let us restate the super-orthogonal conditions of a PEPS in the same manner: for every k, the k-reduced matrix of the tensor with the spectra absorbed on all bonds is diagonal, with its elements positive and in descending order, given by the (squared) super-orthogonal spectrum Λ^{(k)}. Note that the condition of "unitarity" (the first one in the Tucker decomposition) is hidden in the fact that we use gauge transformations to transform the PEPS into the super-orthogonal form. Therefore, the super-orthogonalization is also called the network Tucker decomposition (NTD).
In the Tucker decomposition, the "all-orthogonality" and "ordering" lead to an SVD associated with a single tensor, which explains how the optimal truncations work in terms of the decompositions of linear algebra. In the NTD, the SVD picture is generalized from a single tensor to a non-local PEPS. Thus, the truncations are optimized in a non-local way.
Let us consider a finite-size PEPS and arbitrarily choose one geometrical bond (say a). If the PEPS is defined on a tree, cutting this bond separates the TN into three disconnected parts: the spectrum (Λ) on the bond and two tree branches stretching to its two sides. Specifically, each branch contains one virtual bond and all the physical bonds on the corresponding side, formally denoted as Ψ^L_{i_1 i_2 ···, a} (and Ψ^R_{j_1 j_2 ···, a} on the other side). Then the state given by the PEPS can be written as

Σ_a Ψ^L_{i_1 i_2 ···, a} Λ_a Ψ^R_{j_1 j_2 ···, a}.   (5.42)

To get the SVD picture, we need to prove that Ψ^L and Ψ^R in the above equation are isometries, satisfying the orthogonal conditions

Σ_{i_1 i_2 ···} Ψ^{L*}_{i_1 i_2 ···, a} Ψ^L_{i_1 i_2 ···, a'} = I_{a a'},   Σ_{j_1 j_2 ···} Ψ^{R*}_{j_1 j_2 ···, a} Ψ^R_{j_1 j_2 ···, a'} = I_{a a'}.   (5.43)

Note that the spectrum Λ is already positive-definite according to the algorithm. To this end, we construct the TN of Σ_{i_1 i_2 ···} Ψ^{L(R)*}_{i_1 i_2 ···, a} Ψ^{L(R)}_{i_1 i_2 ···, a'} from its boundary. If the PEPS is super-orthogonal, the spectra must appear on the boundary of this TN, because the super-orthogonal conditions are satisfied everywhere. Then the contractions of the tensors on the boundary satisfy Eq. (5.29), which gives identities, and on the new boundary we again have the spectra to iterate the contractions. All tensors can be contracted by iteratively using the super-orthogonal conditions, which in the end gives the identities of Eq. (5.43). Thus, Ψ^L and Ψ^R are indeed isometries, and Eq. (5.42) indeed gives the SVD of the whole wave-function. The truncations of the bond dimensions are globally optimized by taking the whole tree PEPS as the environment.
For an iPEPS, it can be similarly proven that Ψ^L and Ψ^R are isometries. One way is to put arbitrary non-zero spectra on the boundary and iterate the contraction by Eq. (5.31). While the spectra on the boundary can be arbitrary, the results of the contractions by Eq. (5.31) converge quickly to identities [9]. The rest of the contractions are then exactly given by the super-orthogonal conditions (Eq. (5.29)). In other words, the identity is a stable fixed point of the above iterations. Once the fixed point is reached, the contraction can be considered as coming from infinitely far away, i.e., from the "boundary" of the iPEPS. In this way, one proves that Ψ^L and Ψ^R are isometries, i.e., Ψ^{L†} Ψ^L = I and Ψ^{R†} Ψ^R = I.

Zero-Loop Approximation on Regular Lattices and Rank-1 Decomposition

Why Super-Orthogonalization Works Well for Truncating the PEPS on Regular Lattices: Some Intuitive Discussions
From the discussions above, we can see that a "canonical" form of a TN state is strongly desired, because it is expected to give the entanglement and the optimal truncations of the bond dimensions. Recall that to contract a TN that cannot be contracted exactly, truncations are inevitable, and locating the optimal truncations is one of the main tasks in the computations. The super-orthogonal form provides a robust way to optimally truncate the bond dimensions of a PEPS defined on a tree, analogous to the canonicalization of MPS. Interestingly, the super-orthogonal form does not require the tree structure. For an iPEPS defined on a regular lattice, for example the square lattice, one can still super-orthogonalize it using the same algorithm. What is different is that the SVD picture of the wave-function (see Eq. (5.5)) no longer rigorously holds, nor does the robustness of the optimal truncations. In other words, the super-orthogonal spectrum does not exactly give the entanglement. A question arises: can we still truncate an iPEPS defined on the square lattice according to the super-orthogonal spectrum?
Surprisingly, numerical simulations show that the accuracy of truncating according to the super-orthogonal spectrum is still good in many cases. Let us take the ground-state simulation of a 2D system by imaginary-time evolution as an example. As discussed in Sect. 4.2, the simulation becomes the contraction of a 3D TN. One usual way to compute this contraction is to contract layer by layer onto an iPEPS (see, e.g., [22,23]). The contraction enlarges the virtual bond dimensions, and truncations are needed. When the ground state is gapped (see, e.g., [9,23]), the truncations produce accurate results, which means the super-orthogonal spectrum approximates the true entanglement quite well.
It has been realized that, using the simple update algorithm [23], the iPEPS converges to the super-orthogonal form for a vanishing Trotter step τ → 0. The success of the simple update suggests that the optimal truncation method on trees still works well for regular lattices. Intuitively, this can be understood in the following way. Comparing a regular lattice with a tree of the same coordination number, the two lattices look exactly the same if we only inspect locally one site and its nearest neighbors. The difference appears when one goes around the closed loops of the regular lattice, since there are no loops in the tree. Thus, the error of applying the optimal truncation schemes of a tree (such as super-orthogonalization) to a regular lattice should be characterized by some non-local features associated with the loops. This explains in a descriptive way why the simple update works well for gapped states, where the physics is dominated by short-range correlations. For systems that possess small gaps or are gapless, the simple update is not sufficiently accurate [24], particularly for non-local physical properties such as correlation functions.

Rank-1 Decomposition and Algorithm
Rank-1 decomposition in MLA [25] provides a more rigorous mathematical way to understand the approximation made by the super-orthogonalization (simple update) when truncating a PEPS on a regular lattice [11]. For a tensor T, its rank-1 decomposition (Fig. 5.8) is defined as

T_{a_1 a_2 ···} ≃ Ω v^{(1)}_{a_1} v^{(2)}_{a_2} ···,   (5.44)

where the v^{(k)} are normalized vectors and Ω is a constant that satisfies

Ω = Σ_{a_1 a_2 ···} T_{a_1 a_2 ···} v^{(1)*}_{a_1} v^{(2)*}_{a_2} ···.   (5.45)

The rank-1 decomposition provides an approximation of T in which the distance between T and its rank-1 approximation is minimized, i.e.,

min_{{v^{(k)}}} |T − Ω Π_k v^{(k)}|.   (5.46)

At the minimum, the vectors obey the self-consistent conditions: contracting T with all vectors except v^{(k)} gives back Ω v^{(k)} (Eq. (5.47)). Apart from some very special cases, such an optimization problem is concave, thus the rank-1 decomposition is unique. Furthermore, if one arbitrarily chooses a set of norm-1 vectors and iterates the self-consistent conditions, they converge to the fixed point exponentially fast. To the best of our knowledge, the exponential convergence has not been proved rigorously, but it is observed in most cases.
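The self-consistent conditions can be iterated directly, which is essentially a higher-order power method. The sketch below (ours) implements it for a 4th-order tensor and checks the rank-1 approximation error:

```python
import numpy as np

def rank1_decompose(T, n_iter=200):
    """Rank-1 decomposition of a 4th-order tensor by power iterations:
    each vector is updated by contracting T with the other three vectors
    (the self-consistent conditions, Eq. (5.47)), then normalized."""
    vs = [np.random.rand(dim) for dim in T.shape]
    vs = [v / np.linalg.norm(v) for v in vs]
    subs = ['abcd,b,c,d->a', 'abcd,a,c,d->b', 'abcd,a,b,d->c', 'abcd,a,b,c->d']
    others = [(1, 2, 3), (0, 2, 3), (0, 1, 3), (0, 1, 2)]
    for _ in range(n_iter):
        for k in range(4):
            w = np.einsum(subs[k], T, *[vs[n] for n in others[k]])
            vs[k] = w / np.linalg.norm(w)
    Omega = np.einsum('abcd,a,b,c,d->', T, *vs)   # Eq. (5.45)
    return Omega, vs

T = np.random.rand(3, 3, 3, 3)
Omega, vs = rank1_decompose(T)
T1 = Omega * np.einsum('a,b,c,d->abcd', *vs)
print(np.linalg.norm(T - T1))   # minimized distance of Eq. (5.46)
```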

Rank-1 Decomposition, Super-Orthogonalization, and Zero-Loop Approximation
Let us still consider a translationally invariant square TN formed by infinite copies of a 4th-order tensor T (Fig. 2.28). The rank-1 decomposition of T provides an approximate scheme to compute the contraction of the TN, called the theory of network contractor dynamics (NCD) [11]. The picture of NCD can be understood in a way opposite to contraction: by iteratively using the self-consistent conditions (Eq. (5.47)), one "grows" a tree TN that covers the whole square lattice (Fig. 5.10). Let us start from Eq. (5.45) for Ω. Using Eq. (5.47), we substitute each of the four vectors by the contraction of T with the other three vectors. After doing so, Eq. (5.45) becomes the contraction of more than one T with the vectors on the boundary. In other words, we "grow" the local TN contraction from one tensor plus four vectors to one with more tensors and vectors.
By repeating the substitution, the TN can be grown to cover the whole square lattice, where each site hosts maximally one T. Inevitably, some sites will host no T, but four vectors instead. These vectors (also called contractors) give the rank-1 decomposition of T as in Eq. (5.44). This is to say that some tensors in the square TN are replaced by their rank-1 approximation, so that all loops are destroyed and the TN becomes a loopless tree covering the square lattice. In this way, the square TN is approximated by such a tree TN on the square lattice, so that its contraction is simply computed by Eq. (5.45).
The growing process as well as the optimal tree TN serves only to understand the zero-loop approximation with the rank-1 decomposition; there is no need to practically implement such a process. (Fig. 5.10: using the self-consistent conditions of the rank-1 decomposition, a tree TN with no loops can grow to cover the infinite square lattice; the four vectors gathering at the same site give the rank-1 approximation of the original tensor.) Thus, it does not matter how the TN is grown or where the rank-1 tensors are put to destroy the loops. All information we need is given by the rank-1 decomposition. In other words, the zero-loop approximation of the TN is encoded in the rank-1 decomposition.
Regarding the growth of the TN, we shall remark that substituting one vector by the contraction of one T with several vectors is certainly not unique. However, the aim of "growing" is to reconstruct the TN formed by T. Thus, if T has to appear in the substitution, the vectors should be uniquely chosen as those given in the rank-1 decomposition, due to the uniqueness of the rank-1 decomposition. Secondly, there are hidden conditions when covering the lattice by "growing". A stronger version requires the vectors on opposite bonds to be identical,

v^{(1)} = v^{(3)},   v^{(2)} = v^{(4)},   (5.48)

and a weaker one only requires the vectors to be conjugate to each other,

v^{(1)} = v^{(3)†},   v^{(2)} = v^{(4)†}.   (5.49)

These conditions assure that the self-consistent equations encode the correct tree that optimally (in the rank-1 sense) approximates the square TN. Comparing with Eqs. (5.29) and (5.47), the super-orthogonal conditions are actually equivalent to the above self-consistent equations of the rank-1 decomposition, by defining the double-layer tensor T and the vectors v as

T_{ã_1 ã_2 ã_3 ã_4} = Σ_s P_{s, a_1 a_2 a_3 a_4} P*_{s, a'_1 a'_2 a'_3 a'_4},   v^{(k)}_{ã_k} = Λ^{(k)}_{a_k} Λ^{(k)*}_{a'_k},

with ã_k = (a_k, a'_k). Thus, the super-orthogonal spectrum provides an optimal approximation for the truncations of the bond dimensions at the zero-loop level. This provides a direct connection between the simple update scheme and the rank-1 decomposition.

Error of Zero-Loop Approximation and Tree-Expansion Theory Based on Rank Decomposition
The error of NCD (and of the simple update) is an important issue. At first glance, the error seems to be that of the rank-1 decomposition, ε = |T − Ω Π_k v^{(k)}|. This would be true if we replaced all tensors in the square TN by their rank-1 version; in that case, the PEPS would be approximated by a product state with zero entanglement.
In the NCD scheme, however, we only replace a part of the tensors to destroy the loops. The corresponding approximate PEPS is an entangled state with a tree structure. Therefore, the error of the rank-1 decomposition cannot properly characterize the error of the simple update.
To control the error, let us introduce the rank decomposition (also called the CANDECOMP/PARAFAC decomposition) of T in MLA (Fig. 5.11), which reads

T_{a_1 a_2 ···} = Σ_{r=0}^{R−1} Ω_r v^{(1,r)}_{a_1} v^{(2,r)}_{a_2} ···,   (5.52)

where the v^{(k,r)} are normalized vectors. The idea of the rank decomposition [26,27] is to expand T into the summation of R rank-1 tensors, with R called the tensor rank. The elements of the vector Ω can always be put in descending order according to their absolute values. Then the leading term Ω_0 Π_k v^{(k,0)} gives exactly the rank-1 decomposition of T, and the error of the rank-1 decomposition becomes ε = |Σ_{r>0} Ω_r Π_k v^{(k,r)}|. In the optimal tree TN, let us replace the rank-1 tensors back by the full-rank tensor of Eq. (5.52). Supposing the rank decomposition is exact, we recover the original TN by doing so. The TN contraction then becomes the summation of R^Ñ terms, with Ñ the number of rank-1 tensors in the zero-loop TN. Each term is the contraction of a tree TN, which is the same as the optimal tree TN except that certain vectors are changed to v^{(k,r)} instead of the rank-1 term v^{(k,0)}. Note that all terms share the same tree structure; the leading term in the summation is the zero-loop TN of the NCD scheme. It means that with the rank decomposition, we expand the contraction of the square TN as a summation of the contractions of many tree TNs.
Let us order the summation according to the contributions of the different terms. For simplicity, we assume R = 2, meaning T can be exactly decomposed as the summation of two rank-1 tensors: the leading term given by the rank-1 decomposition, and the next-leading term denoted as T_1 = Ω_1 Π_k v^{(k,1)}. We dub the next-leading term the impurity tensor. Defining ñ as the number of impurity tensors appearing in one of the tree TNs in the summation, the expansion can be written as

Z = Σ_ñ Σ_{C} Z_C,   (5.53)

where Z_C denotes the contraction of such a tree TN with a specific configuration C of ñ impurity tensors T_1. In general, the contribution of each term is determined by the order of |Ω_1/Ω_0|^ñ, since |Ω_1/Ω_0| < 1 (Fig. 5.12).
To proceed, we choose one tensor in the tree as the origin, and always contract the tree TN ending at this tensor. The distance D of a vector is then defined as the number of tensors in the path that connects this vector to the origin. Note that one impurity tensor is the tensor product of several vectors, and each vector may have a different distance to the origin; for simplicity, we take the shortest one to define the distance of an impurity tensor. Now, let us utilize the exponential convergence of the rank-1 decomposition. After contracting any vector with the tensor in the tree, the resulting vector approaches the fixed point (the vectors of the rank-1 decomposition) at an exponential speed. Define D_0 as the average number of contractions that project any vector onto the fixed point within a tolerable difference. For any impurity tensors with distance D > D_0, their contributions to the contraction are approximately the same, since after D_0 contractions the vectors have already been projected onto the fixed point.
From the above argument, we can see that the error is related not only to the error of the rank-1 decomposition, but also to the speed of convergence to the rank-1 component. The smaller D_0 is, the smaller the error (the total contribution from the non-dominant terms) will be. Calculations show that the convergence speed is related to the correlation length (or gap) of the physical system, but a rigorous relation has not been established yet. Meanwhile, the expansion theory of the TN contraction given above requires the rank decomposition, which, however, is not uniquely defined for an arbitrarily given tensor.

iDMRG, iTEBD, and CTMRG Revisited by Tensor Ring Decomposition
We have shown that the rank-1 decomposition solves the contraction of an infinite-size tree TN and provides a mathematical explanation of the approximation made in the simple update. It is then natural to ask: can we generalize this scheme beyond rank-1, in order to obtain better update schemes? In the following, we show that besides the rank decomposition, the tensor ring decomposition (TRD) [28] was suggested as another rank-N generalization for solving TN contraction problems. TRD is defined by a set of self-consistent eigenvalue equations (SEEs) with certain constraints. The original proposal of TRD requires all eigenvalue equations to be Hermitian [28]. Later, a generalized version was proposed [29] that provides a unified description of the iDMRG [4,5,7], iTEBD [3], and CTMRG [30] algorithms. We will concentrate on this version in the following.

Revisiting iDMRG, iTEBD, and CTMRG: A Unified Description with Tensor Ring Decomposition
Let us start from the iDMRG algorithm. The TN contraction can be solved using iDMRG [4,5,7] by considering an infinite row of tensors in the TN as an MPO [31][32][33][34][35] (also see the related discussions in Sect. 3.4). We introduce three third-order variational tensors, denoted by v_L and v_R (dubbed the boundary or environmental tensors) and Ψ (dubbed the central tensor). These tensors are the fixed-point solution of a set of eigenvalue equations: v_L and v_R are, respectively, the left and right dominant eigenvectors of the matrices M_L and M_R (Fig. 5.13a, b), which are given by contracting the MPO tensor T with the isometries obtained from Ψ (Eqs. (5.54)–(5.56)), and Ψ is the dominant eigenvector of the Hermitian matrix H (Fig. 5.13c) defined by contracting T with v_L and v_R (Eq. (5.57)). One can see that each of the eigenvalue problems is parametrized by the solutions of the others, thus we solve them recursively. First, we initialize the central tensor Ψ arbitrarily and obtain A and B by Eq. (5.56); note that a good initial guess can make the simulations faster and more stable. Then we update v_L and v_R by multiplying them with M_L and M_R as in Eqs. (5.54) and (5.55). Then we obtain the new Ψ by solving for the dominant eigenvector of the H in Eq. (5.57) defined by the new v_L and v_R. We iterate this process until all variational tensors converge.
Let us rephrase the iDMRG algorithm given above in the language of TN contraction/reconstruction. When the variational tensors reach the fixed point, the eigenvalue equations "encode" the infinite TN, i.e., the TN can be reconstructed from the equations. To do so, we start from a local representation of Z (Fig. 5.14), written as the contraction of v_L, T, and v_R with Ψ and Ψ*,

Z ≃ Σ Ψ* v_L T v_R Ψ,   (5.58)

where the summation goes through all indexes. Since Ψ is the leading eigenvector of Eq. (5.57), Z is maximized with fixed v_L and v_R. Here and below, we use the symbol "≃" to represent a contraction relation that holds up to a constant factor. Define the MPS formed by the isometries and the central tensor,

Φ_{···a_n···} = Σ_{{b}} ··· A_{a_{n−2}, b_{n−2} b_{n−1}} A_{a_{n−1}, b_{n−1} b_n} Ψ_{a_n, b_n b_{n+1}} B_{a_{n+1}, b_{n+1} b_{n+2}} B_{a_{n+2}, b_{n+2} b_{n+3}} ···,   (5.59)

where ρ below is an infinite-dimensional matrix that has the form of an MPO (middle of Fig. 5.14),

ρ_{···a_n···, ···a'_n···} = Σ_{{c}} ··· T_{a_n c_n a'_n c_{n+1}} T_{a_{n+1} c_{n+1} a'_{n+1} c_{n+2}} ···.   (5.60)

We then come to the conclusion that Φ is the optimal MPS that gives the dominant eigenvector of ρ, satisfying Φ ≃ ρΦ. We can thus rewrite the TN contraction as

Z ≃ lim_{K→∞} Φ† ρ^K Φ,   (5.61)

where the infinite TN appears as ρ^K (Fig. 5.14). Now we define the tensor ring decomposition (TRD): with the conditions that
• Z (Eq. (5.58)) is maximized under the constraint that v_L and v_R are normalized, and
• Φ† ρ Φ is maximized under the constraint that Φ is normalized,
the TRD (Fig. 5.15) of T is defined by the tensor T̃ reconstructed from the contraction of the converged variational tensors. It was shown that, for the same system, the ground state obtained by iDMRG is equivalent to the ground state obtained by iTEBD, up to a gauge transformation [7,37]. Beyond this connection, TRD further unifies iDMRG and iTEBD: after combining the contraction and truncation steps of iTEBD, one arrives at the same set of self-consistent eigenvalue equations. In particular, when one uses iDMRG to solve the ground state of a 1D system, the MPS formed by v_{L(R)} in the imaginary-time direction satisfies the continuous structure [29,38] that was originally proposed for continuous field theories [39]. Such an iTEBD calculation can also be considered as a transverse contraction of the TN [38,40,41].
CTMRG [30,42] is also closely related to the scheme given above, which leads to a CTMRG without the corners. The tensors Ψ, v_L, and v_R correspond to the row and column tensors, and the equations for updating these tensors are the same as the equations for updating the row and column tensors in CTMRG (see Eqs. (3.27) and (3.31)). Such a relation becomes more explicit in the rank-1 case, where the corners become simply scalars. The difference is that in the original CTMRG by Orús et al. [30], the tensors are updated with a power method, i.e., by iteratively multiplying the row and column tensors with the transfer matrices and truncating. Recently, eigen-solvers instead of the power method were suggested in CTMRG ([42] and a related review [43]), where the eigenvalue equations of the row and column tensors are the same as those given in the TRD. The efficiency was shown to be largely improved by this modification.

Extracting the Information of Tensor Networks From Eigenvalue Equations: Two Examples
In the following, we present how to extract the properties of a TN, taking the free energy and the correlation length as two examples related to the eigenvalue equations. Note that these quantities correspond to properties of the physical model and have been employed in many contexts (see, e.g., a review [44]). Here, we treat these two quantities as properties of the TN itself; when the TN represents different physical models, they are interpreted accordingly as different physical properties.
For an infinite TN, the contraction itself usually gives a divergent or vanishing value. The free energy per tensor of the TN is defined to measure the contraction as

f = −(1/N) ln Z,   (5.64)

with Z the value of the contraction and N the number of tensors. Such a definition is closely related to physical quantities, such as the free energy of classical models and the average fidelity of TN states [45]. The second issue concerns the correlations of the TN. The correlation function of a TN can be defined by substituting the tensors at two sites r_1 and r_2 with an impurity tensor T̃,

C(r_1, r_2) = Z(T̃^{[r_1]}, T̃^{[r_2]})/Z − [Z(T̃^{[r_1]})/Z][Z(T̃^{[r_2]})/Z],   (5.65)

with Z(T̃^{[r]}, ···) the contraction of the TN after the substitutions. One has the correlation length ξ from the asymptotic decay

C(r_1, r_2) ∼ e^{−|r_1 − r_2|/ξ}.   (5.66)

Then, introduce the transfer matrix M of Φ† ρ Φ, i.e., Φ† ρ Φ = Tr M^{K̃} with K̃ → ∞. With the eigenvalue decomposition M = Σ_{j=0}^{D−1} η_j v_j v_j† (D the matrix dimension and v_j the j-th eigenvector), one can further simplify the single-impurity contraction as

Z(T̃^{[r]})/Z = v_0† M(T̃^{[r]}) v_0 / η_0,

with M(T̃^{[r]}) the transfer matrix after substituting the original tensor at r with T̃^{[r]}.
Similarly, the two-impurity contraction can be expanded in the eigenvalue decomposition of M. Note that one could transform the MPS into a translationally invariant form (e.g., the canonical form) to uniquely define the transfer matrix of Φ† ρ Φ. Substituting the equations above into Eq. (5.65), the contribution of the dominant eigenvector cancels, and one has

C(r_1, r_2) = Σ_{j>0} (η_j/η_0)^{|r_1 − r_2|} c_j,

with coefficients c_j determined by the impurity transfer matrices M(T̃^{[r_1]}) and M(T̃^{[r_2]}). When the distance is sufficiently large, i.e., |r_1 − r_2| ≫ 1, only the dominant term takes effect, which decays as (η_1/η_0)^{|r_1 − r_2|}. Compared with Eq. (5.66), one has

ξ = 1/(ln η_0 − ln η_1).

The second case can be proven similarly. These two quantities are defined independently of the specific physical model that the TN might represent, and can thus be considered as mathematical properties of the TN itself. By introducing physical models, they become closely related to physical properties. For example, when the TN represents the partition function of a classical lattice model, Eq. (5.64) multiplied by the temperature is exactly the free energy, and the correlation lengths of the TN are the physical correlation lengths of the model in the two spatial directions. When the TN gives the imaginary-time evolution of an infinite 1D quantum chain, the correlation lengths of the TN are the spatial and dynamical correlation lengths of the ground state.
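As a concrete check of these two formulas (our example; the 1D classical Ising chain is used because its transfer matrix is known exactly), the free energy follows from η_0 and the correlation length from the ratio of the two leading eigenvalues:

```python
import numpy as np

# transfer matrix of the 1D classical Ising model, H = -J sum_i s_i s_{i+1},
# at inverse temperature beta: M[s, s'] = exp(beta * J * s * s')
J, beta = 1.0, 0.8
s = np.array([1, -1])
M = np.exp(beta * J * np.outer(s, s))

eta = np.linalg.eigvalsh(M)[::-1]          # eta_0 > eta_1

# free energy per tensor, Eq. (5.64), multiplied by the temperature
f = -np.log(eta[0]) / beta
# correlation length from the two leading eigenvalues
xi = 1.0 / (np.log(eta[0]) - np.log(eta[1]))

# analytic check: eta_0 = 2 cosh(beta J), eta_1 = 2 sinh(beta J),
# hence xi = 1 / ln(coth(beta J))
print(f, xi, 1.0 / np.log(1.0 / np.tanh(beta * J)))
```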
It is a huge topic to investigate the properties of TNs and TN states. Paradigm examples include injectivity and symmetries [46][47][48][49][50][51][52][53][54][55][56][57][58][59][60][61], and statistics and fusion rules [62][63][64][65]. These issues are beyond the scope of these lecture notes; one may refer to the related works if interested.
One progress achieved in the spirit of AOP is the TRD introduced in Sect. 5.4. Considering the TN on an infinite square lattice, its contraction is reduced to a set of self-consistent eigenvalue equations that can be efficiently solved on classical computers. The variational parameters are just two tensors. One advantage of TRD is that it connects TN algorithms (iDMRG, iTEBD, CTMRG) that were previously considered to be quite different, in a unified picture.
Another progress made in the AOP spirit is the QES for simulating infinite-size physical models [1,5,6]. It is less dependent on the specific models; it also provides a natural way for designing quantum simulators and for hybridized quantum-classical simulations of many-body systems. Hopefully, in the future when people are able to readily realize designed Hamiltonians on artificial quantum platforms, QES will enable us to design the Hamiltonians that realize quantum many-body phenomena.

Simulating One-Dimensional Quantum Lattice Models
Let us first take the ground-state simulation of an infinite-size 1D quantum system as an example. The Hamiltonian is the summation of two-body nearest-neighbor terms, which reads Ĥ_inf = Σ_n Ĥ_{n,n+1}. Translational invariance is imposed. The first step is to choose a supercell (e.g., a finite part of the chain with Ñ sites). The Hamiltonian of the supercell is then Ĥ_B = Σ_{n=1}^{Ñ−1} Ĥ_{n,n+1}, and the Hamiltonian connecting the supercell to the rest of the chain is Ĥ_∂ = Ĥ_{Ñ,Ñ+1} (note the interactions are nearest-neighbor).
Define the operator F̂_∂ as

F̂_∂ = Î − τĤ_∂,   (6.1)

with τ the Trotter-Suzuki step. This definition is to construct the Trotter-Suzuki decomposition [7,8]: instead of using the exponential form e^{−τĤ}, we equivalently choose to shift Ĥ_∂ for algorithmic convenience. The errors of these two choices concerning the ground state are at the same level, O(τ²). Introduce an ancillary index a and rewrite F̂_∂ as a sum of operators,

F̂_∂ = Σ_a F̂^L_{(s)a} F̂^R_{(s')a},   (6.2)

where F̂^L_{(s)a} and F̂^R_{(s')a} are two sets of one-body operators (labeled by a) acting on the left and right one of the two spins (s and s') associated with Ĥ_∂, respectively (Fig. 6.1). Equation (6.2) can easily be obtained directly from the Hamiltonian or by an eigenvalue decomposition. For example, for the Heisenberg interaction Ĥ_∂ = Σ_{α=x,y,z} J_α Ŝ^α(s) Ŝ^α(s'), with Ŝ^α(s) the spin operators, we may take

F̂^L_{(s)0} = Î,  F̂^R_{(s')0} = Î,  F̂^L_{(s)α} = −τ J_α Ŝ^α(s),  F̂^R_{(s')α} = Ŝ^α(s')   (α = x, y, z).   (6.3)

Construct the operator F̂^{(S)}_{a'a}, with S = (s_1, ···, s_Ñ) representing the physical spins inside the supercell, as

F̂^{(S)}_{a'a} = F̂^{R†}_{(s_1)a'} H̃_B F̂^L_{(s_Ñ)a},   (6.4)

with H̃_B = Î − τĤ_B. Here F̂^{R†}_{(s_1)a'} and F̂^L_{(s_Ñ)a} act on the first and last sites of the supercell, respectively. One can see that F̂^{(S)}_{a'a} represents a set of operators, labeled by the two indexes a' and a, that act on the supercell.
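The decomposition in Eq. (6.2) can be obtained numerically by an operator SVD. The sketch below (ours; variable names are assumptions) splits F̂_∂ = Î − τĤ_∂ for the Heisenberg coupling into a sum of one-body operator pairs and verifies the result:

```python
import numpy as np

# spin-1/2 operators
Sx = np.array([[0, 1], [1, 0]]) / 2
Sy = np.array([[0, -1j], [1j, 0]]) / 2
Sz = np.array([[1, 0], [0, -1]]) / 2

J, tau = 1.0, 0.01
H_boundary = sum(J * np.kron(S, S) for S in (Sx, Sy, Sz))
F = np.eye(4) - tau * H_boundary            # F_boundary = I - tau * H_boundary

# reshape F[(s t),(s' t')] -> F[(s s'),(t t')] and SVD: F = sum_a FL_a (x) FR_a
F4 = F.reshape(2, 2, 2, 2).transpose(0, 2, 1, 3).reshape(4, 4)
u, lam, vh = np.linalg.svd(F4)
keep = lam > 1e-12
FL = [np.sqrt(l) * u[:, a].reshape(2, 2) for a, l in enumerate(lam) if keep[a]]
FR = [np.sqrt(l) * vh[a].reshape(2, 2) for a, l in enumerate(lam) if keep[a]]

# verify that the sum of one-body operator pairs reproduces F
F_rec = sum(np.kron(L, R) for L, R in zip(FL, FR))
print(np.allclose(F_rec, F), len(FL))       # True, 4 terms (identity + 3 couplings)
```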
In the language of TN, the coefficients of F̂^{(S)}_{a'a} in the local basis (|S⟩ = |s_1⟩ ··· |s_Ñ⟩) form a fourth-order cell tensor (Fig. 6.1). With the cell tensor T, the ground-state properties can be solved using the TN algorithms (e.g., TRD) introduced above; the ground state is given by the MPS of Eq. (5.59). Let us consider one of the eigenvalue equations of the TRD [also see Eq. (5.57)], whose matrix has the coefficients H_{S'b'_1b'_2, Sb_1b_2}. We define a new Hamiltonian Ĥ by using H_{S'b'_1b'_2, Sb_1b_2} as its coefficients. Ĥ is the effective Hamiltonian in iDMRG [9][10][11], or in the methods that represent the RG of the Hilbert space by MPS [12,13]. The Hamiltonians Ĥ_L and Ĥ_R located on the boundaries of Ĥ have coefficients given by the boundary tensors v_L and v_R, respectively. Ĥ_L and Ĥ_R are just two-body Hamiltonians, each of which acts on a bath site and its neighboring physical site on the boundary of the bulk; they define the infinite boundary condition for simulating the time evolution of 1D quantum systems [14]. Ĥ_L and Ĥ_R can also be written in a shifted form, Ĥ_{L(R)} = Î − τĤ^{L(R)}. This is because the tensor v_L (and also v_R) takes a special form [15],

v^L_{0,bb'} = I_{bb'} − τ Q_{bb'},   (6.10)
v^L_{a,bb'} = τ^{1/2} (R_a)_{bb'}   (a > 0),   (6.11)

with Q and R two sets of Hermitian matrices independent of τ. In other words, the MPS formed by infinite copies of v_L or v_R is a continuous MPS [16], known as the temporal MPS [17]. Therefore, Ĥ^{L(R)} is independent of τ, and is called the physical-bath Hamiltonian. Then Ĥ can be written as the shift of a few-body Hamiltonian, Ĥ = Î − τĤ_FB, where Ĥ_FB has the standard summation form

Ĥ_FB = Ĥ^L + Σ_{n=1}^{Ñ−1} Ĥ_{n,n+1} + Ĥ^R.   (6.12)

For Ĥ^L and Ĥ^R with bath dimension χ, the coefficient matrix of Ĥ^{L(R)} is (2χ × 2χ). Then Ĥ^{L(R)} can be generally expanded in terms of Ŝ^{α_1} ⊗ S̃^{α_2}, with {S̃} the generators of the SU(χ) group, which defines the magnetic field and coupling constants associated with the entanglement bath (S̃ denotes the SU(χ) spin operators and Ŝ the operators of the physical spin). For the quantum Ising chain, for instance, the physical-bath interaction contains an Ŝ^x Ŝ^z coupling [18], which is the stabilizer on the open boundaries of the cluster state, a highly entangled state that has been widely used in quantum information sciences [19,20]; more relations with the cluster state are to be further explored. The physical information of the infinite-size model can be extracted from the ground state of Ĥ_FB (denoted by |Ψ(Sb_1b_2)⟩) by tracing over the entanglement-bath degrees of freedom. To this aim, we calculate the reduced density matrix of the bulk as

ρ̂(S) = Tr_{b_1 b_2} |Ψ⟩⟨Ψ|.   (6.16)

Since the MPS optimally gives the ground state of the infinite model, ρ̂(S) of the few-body ground state optimally gives the reduced density matrix of the original model. In Eq. (6.12), the summation of the physical interactions is within the supercell that we chose to construct the cell tensor. To improve the accuracy, e.g., to capture longer correlations inside the bulk, one just needs to enlarge the bulk in Ĥ_FB. In other words, Ĥ^L and Ĥ^R are obtained by TRD from a supercell of a tolerable size Ñ, and Ĥ_FB is constructed with a larger bulk as Ĥ_FB = Ĥ^L + Σ_{n=1}^{Ñ'−1} Ĥ_{n,n+1} + Ĥ^R with Ñ' > Ñ. Though Ĥ_FB becomes more expensive to solve, we have many well-established finite-size algorithms to compute its dominant eigenvector. We will show below that this strategy is extremely useful in higher dimensions.

Simulating Higher-Dimensional Quantum Systems
For (D > 1)-dimensional quantum systems on, e.g., the square lattice, one can use different update schemes to calculate the ground state. Here, we explain an alternative way by generalizing the above 1D simulation to higher dimensions [5]. The idea is to optimize the physical-bath Hamiltonians by the zero-loop approximation (simple update, see Sect. 5.3), e.g., iDMRG on tree lattices [21,22], and then to construct the few-body Hamiltonian Ĥ_FB with a larger bulk. The loops inside the bulk are fully taken into account when solving the ground state of Ĥ_FB, thus the precision is significantly improved compared with the zero-loop approximation.
The procedure is similar to that for 1D models. The first step is to construct the cell tensor, so that the ground-state simulation is transformed into a TN contraction problem. We choose the two sites connected by one bond as the supercell, and construct the cell tensor that parametrizes the eigenvalue equations. The bulk interaction is simply the coupling between these two spins, i.e., Ĥ_B = Ĥ_{i,j}, and the interaction between two neighboring supercells is the same, i.e., Ĥ_∂ = Ĥ_{i,j}. By constructing F̂^{(S)} in the same way as in one dimension, one obtains the cell tensor T (Eq. (6.19)). One can see that T has six bonds, of which two (S and S') are physical and four (a_1, a_2, a_3, and a_4) are non-physical. For comparison, the tensor in the 1D quantum case has four bonds, of which two are physical and two are non-physical [see Eq. (6.4)]. As discussed in Sect. 4.2, the ground-state simulation becomes the contraction of a cubic TN formed by infinite copies of T. Each layer of the cubic TN gives the operator ρ̂(τ) = Î − τĤ, which is a PEPO defined on the square lattice; infinite layers of the PEPO, lim_{K→∞} ρ̂(τ)^K, give the cubic TN. The next step is to solve the SEEs of the zero-loop approximation. For the same model defined on the loopless Bethe lattice, the 3D TN is formed by infinite layers of the PEPO ρ̂_Bethe(τ) defined on the Bethe lattice; the cell tensor is defined in exactly the same way as in Eq. (6.19). Within the Bethe approximation, there are five variational tensors: Ψ (the central tensor) and v^{[x]} (x = 1, 2, 3, 4, the boundary tensors). Meanwhile, we have five self-consistent equations that encode the 3D TN lim_{K→∞} ρ̂_Bethe(τ)^K; they are given by five matrices, H (Eq. (6.20)) for the central tensor and M^{[x]} (Eqs. (6.21)–(6.24)) for the boundary tensors. The central tensor is decomposed into isometries A^{[x]} as in Eq. (6.25), where each A^{[x]} is orthogonal, satisfying the condition of Eq. (6.26) (in the corresponding figure, the arrows indicate the directions of orthogonality). The self-consistent equations can be solved recursively. By solving for the leading eigenvector of the H given by Eq. (6.20), we update the central tensor Ψ. Then, according to Eq. (6.25), we decompose Ψ to obtain the A^{[x]}, update the M^{[x]} in Eqs. (6.21)–(6.24), and update each v^{[x]} by M^{[x]} v^{[x]}. This process is repeated until all five variational tensors converge. The algorithm is the DMRG generalized to the infinite tree PEPS [21,22]. Each boundary tensor can be understood as the infinite environment of a tree branch, thus at this stage the original model is actually approximated by the model defined on a Bethe lattice. Note that when looking at the tree only locally (from one site and its nearest neighbors), it looks the same as the original lattice; the loss of information is mainly long-ranged, i.e., it stems from the destruction of the loops.
We can get a deeper understanding of the Bethe approximation with the help of the rank-1 decomposition explained in Sect. 5.3. Equations (6.21)–(6.24) encode a Bethe TN, whose contraction is written as Z_Bethe = ⟨Φ|ρ̂_Bethe(τ)|Φ⟩, with ρ̂_Bethe(τ) the PEPO of the Bethe model and |Φ⟩ a tree iPEPS (Fig. 6.5). To see this, let us start with the local contraction shown in Fig. 6.5a. Each v^{[x]} can be replaced by M^{[x]} v^{[x]}, because we are at the fixed point of the eigenvalue equations. By repeating this substitution, in a similar way to the rank-1 decomposition in Sect. 5.3.3, we obtain the TN for Z_Bethe, which is maximized at the fixed point (Fig. 6.5b: by substituting with the self-consistent equations, the TN representing Z̃ = ⟨Φ|ρ̂_Bethe(τ)|Φ⟩ can be reconstructed, with ρ̂_Bethe(τ) the tree PEPO of the Bethe model and |Φ⟩ a tree PEPS). With the constraint ⟨Φ|Φ⟩ = 1 satisfied, |Φ⟩ is the ground state of ρ̂_Bethe(τ). Now, we constrain the growth so that the TN covers the infinite square lattice. Inevitably, some v^{[x]}'s will gather at the same site. The tensor product of these v^{[x]}'s in fact gives the optimal rank-1 approximation of the "correct" full-rank tensor there (Sect. 5.3.3). Suppose one replaced each such rank-1 version (the tensor product of four v^{[x]}'s) by the full-rank tensor: one would then obtain the PEPO of Î − τĤ (with Ĥ the Hamiltonian on the square lattice), and the tree iPEPS would become the iPEPS defined on the square lattice. Compared with the NCD scheme, which employs the rank-1 decomposition explicitly to solve the TN contraction, one difference here for updating the iPEPS is that the "correct" tensor to be decomposed by the rank-1 decomposition contains the variational tensors, and thus is in fact unknown before the equations are solved. For this reason, we cannot use the rank-1 decomposition directly. Another difference is that the constraint, i.e., the normalization of the tree iPEPS, should be fulfilled. By utilizing the iDMRG algorithm with the tree iPEPS, the rank-1 tensor is obtained without knowing the "correct" tensor, and meanwhile the constraints are satisfied. The zero-loop approximation of the ground state is thus given by the tree iPEPS.
The few-body Hamiltonian is constructed with a larger cluster, so that the error from the zero-loop approximation can be reduced. Similar to the 1D case, we embed a larger cluster in the middle of the entanglement bath (Fig. 6.6). The matrix H in Eq. (6.28) can then be rewritten as the shift of a few-body Hamiltonian Ĥ_FB, i.e., Ĥ = Î − τĤ_FB, where Ĥ_FB possesses the standard summation form

Ĥ_FB = Σ_{⟨i,j⟩ ∈ cluster} Ĥ(s_i, s_j) + Σ_{n ∈ cluster, α ∈ bath} Ĥ_PB(n, α),   (6.30)

with Ĥ_∂(n, α) = Î − τĤ_PB(s_n, b_α). This equation gives the general form of the few-body Hamiltonian: the first term contains all the physical interactions inside the cluster, and the second contains all the physical-bath interactions Ĥ_PB(s_n, b_α). Ĥ_FB can be solved by any finite-size algorithm, such as exact diagonalization, QMC, DMRG [9,23,24], or finite-size PEPS [25][26][27] algorithms. The error from the rank-1 decomposition is reduced, since the loops inside the cluster are fully taken into account. Similar to the 1D case, the ground-state properties can be extracted from the reduced density matrix ρ̂(S) after tracing over the entanglement-bath degrees of freedom. We have ρ̂(S) = Tr_{/(S)} |Φ⟩⟨Φ| (with |Φ⟩ the ground state of the infinite model), which is well approximated by the ρ̂(S) computed from Ψ_{Sb_1b_2···}, the coefficients of the ground state of Ĥ_FB. Figure 6.6 illustrates the ground-state ansatz behind the few-body model: the cluster in the center is entangled with the surrounding infinite tree branches through the entanglement-bath degrees of freedom. Note that solving Eq. (6.20) in stage one is equivalent to solving Eq. (6.28) with the cluster chosen as one supercell.
Some benchmark results of simulating 2D and 3D spin models can be found in Ref. [5]. For the ground state of the Heisenberg model on the honeycomb lattice, the results for the magnetization and bond energy show that the few-body model of 18 physical and 12 bath sites suffers only a small finite-size effect of O(10^{−3}). For the ground state of the 3D Heisenberg model on the cubic lattice, the discrepancy in the energy per site between the few-body model of 8 physical plus 24 bath sites and a model of 1000 sites computed by QMC is O(10^{−3}). The quantum phase transition of the quantum Ising model on the cubic lattice can also be accurately captured by such a few-body model, including the determination of the critical field and the critical exponent of the magnetization.

Quantum Entanglement Simulation by Tensor Network: Summary
Below, we summarize the QES approach for simulating quantum many-body systems with few-body models [1,5,6]. The QES contains three stages in general (Fig. 6.7). The first stage is to optimize the physical-bath interactions by classical computations; the algorithm can be iDMRG in one dimension or the zero-loop schemes in higher dimensions. The second stage is to construct the few-body model by embedding a finite-size cluster in the entanglement bath, and to simulate the ground state of this few-body model. One can employ any well-established finite-size algorithm by classical computation, or build quantum simulators according to the few-body Hamiltonian. The third stage is to extract the physical information by tracing over all bath degrees of freedom. The QES approach has been generalized to finite-temperature simulations of one-, two-, and three-dimensional quantum lattice models [6].
As to the classical computations, one has high flexibility in balancing the computational complexity against the accuracy, according to the required precision and the computational resources at hand. On the one hand, thanks to the zero-loop approximation, one can avoid the conventional finite-size effects faced by exact diagonalization, QMC, or DMRG on standard finite-size models: in the QES, the size of the few-body model is finite, but the actual size (that of the underlying defective TN, see Sect. 5.3.3) is infinite. The approximation is that the loops beyond the supercell are destroyed in the manner of the rank-1 approximation, so that the TN can be computed efficiently by classical computation. On the other hand, the error from the destruction of the loops can be reduced in the second stage by considering a cluster larger than the supercell. It is important to note that the second stage would bring no improvement if no larger loops were contained in the enlarged cluster. From this point of view, we have no "finite-size" effects but rather "finite-loop" effects. In addition, this "loop" picture explains why we can flexibly change the size of the cluster in stage two: doing so simply restores the rank-1 tensors inside the chosen cluster to the full tensors.
The relations to other algorithms, obtained by taking certain limits of the computational parameters, are illustrated in Fig. 6.8. The simplest situation is to take the dimension of the bath sites dim(b) = 1, in which case Ĥ_∂ can be written as a linear combination of spin operators (and the identity); v^{[x]} then simply plays the role of a classical mean field. If one only uses the bath calculation of the first stage to obtain the ground-state properties, the algorithm reduces to the zero-loop schemes such as tree DMRG and the simple update of iPEPS. By choosing a large cluster and dim(b) = 1, the DMRG simulation in stage two becomes equivalent to the standard DMRG solving the cluster in a mean field. By taking proper supercells, clusters, algorithms, and other computational parameters, the QES approach can outperform the others. (Fig. 6.7: the "ab initio optimization principle" for simulating quantum many-body systems; the first (optimizing the entanglement bath) and second (solving the few-body Hamiltonian) stages are indicated above and under the arrows, respectively. Reused from [5] with permission.)
In the sense of classical computation, the QES approach can be categorized as a cluster update scheme (see Sect. 4.3). Compared with the "traditional" cluster update schemes [26,[28][29][30], there exist some essential differences. The traditional cluster update schemes use the super-orthogonal spectra to approximate the environment of the iPEPS. The central idea of QES is different: it provides an effective finite-size Hamiltonian, and the environment is mimicked by the physical-bath Hamiltonians instead of by spectra.
In addition, it is possible to use full update in the first stage to optimize the interactions related to the entanglement bath. For example, one may use TRD (iDMRG, iTEBD, or CTMRG) to compute the environment tensors, instead of the zero-loop schemes. This idea has not been realized yet, but it can be foreseen that the interactions among the bath sites will appear inĤ FB . Surely the computation will become much more expensive. It is not clear yet how the performance would be.
The idea of "bath" has been utilized in many approaches and gained tremendous successes. The general idea is to mimic the target model of high complexity by a simpler model embedded in a bath. The physics of the target model can be extracted Table 6.1 The effective models under several bath-related methods: density functional theory (DFT, also known as the ab initio calculations), dynamical mean-field theory (DMFT), and QES

Methods
DFT DMFT QES Effective models Tight binding model Single impurity model Interacting few-body model by integrating over the bath degrees of freedom. The approximations are reflected by the underlying effective model. Table 6.1 shows the effective models of two recognized methods (DFT and dynamic mean-field theory (DMFT) [31]) and the QES. An essential difference is that the effective models of the former two methods are of single-particle or mean-field approximations, and the effective model of the QES is strongly correlated.
The QES allows for quantum simulations of infinite-size many-body systems by realizing the few-body models on quantum platforms. There are several unique advantages. The first concerns the size: one of the main challenges in building a quantum simulator is to access large sizes, whereas in this scheme a few-body model of only $O(10)$ sites already reaches a high accuracy, with errors $\sim O(10^{-3})$ [1, 5]. Such sizes are accessible to current platforms. Secondly, the interactions in the few-body model are simple: the bulk contains just the interactions of the original physical model, and the physical-bath interactions are only two-body and nearest-neighbor. But there also exist several challenges. Firstly, the physical-bath interaction for simulating, e.g., spin-1/2 models couples a spin-1/2 to a higher spin. This may require realizing interactions between SU(N) spins, which is difficult but possible with current experimental techniques [32-35]. The second challenge concerns the non-standard form of the physical-bath interaction, such as the $\hat{S}^x\hat{S}^z$ coupling in $\hat{H}_{\mathrm{FB}}$ for simulating the quantum Ising chain [see Eq. (6.15)] [18]. With the experimental realization of the few-body models, the numerical simulations of many-body systems would not only be useful for studying natural materials; it would also become possible to first study many-body phenomena numerically, and then realize, control, and even utilize these phenomena in the bulk of small quantum devices.
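To make the structure of such a few-body model concrete, here is a minimal numpy sketch of a toy $\hat{H}_{\mathrm{FB}}$ for the transverse-field quantum Ising chain: an open chain whose bulk carries the original interactions, terminated by one bath spin on each end coupled through an $\hat{S}^x\hat{S}^z$-type term. For simplicity we assume spin-1/2 bath sites and placeholder coupling strengths $J$, $h$, $g$; in the actual scheme the bath dimension and the physical-bath couplings come out of the stage-one optimization, with Eq. (6.15) fixing their precise form.

```python
import numpy as np

# spin-1/2 operators
sx = np.array([[0, 0.5], [0.5, 0]])
sz = np.array([[0.5, 0], [0, -0.5]])
I2 = np.eye(2)

def op_at(op_list, n):
    """Kronecker product placing the given (site, operator) pairs
    on an n-site chain, identities elsewhere."""
    ops = [I2] * n
    for site, op in op_list:
        ops[site] = op
    out = ops[0]
    for o in ops[1:]:
        out = np.kron(out, o)
    return out

n = 8                       # sites 0 and n-1 play the role of bath spins
J, h = 1.0, 0.5             # bulk Ising coupling and transverse field
g = 0.7                     # placeholder physical-bath coupling strength

H = np.zeros((2**n, 2**n))
for i in range(1, n - 2):                   # bulk: original Ising interactions
    H -= J * op_at([(i, sz), (i + 1, sz)], n)
for i in range(1, n - 1):                   # transverse field on physical sites
    H -= h * op_at([(i, sx)], n)
# two-body, nearest-neighbor physical-bath terms of S^x S^z type
H -= g * op_at([(0, sx), (1, sz)], n)
H -= g * op_at([(n - 2, sz), (n - 1, sx)], n)

E, V = np.linalg.eigh(H)
print('ground-state energy of the few-body model:', E[0])
```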
The QES Hamiltonian was shown to also mimic the thermodynamics [6]. The finite-temperature information is extracted from the reduced density matrix

$\hat{\rho}_R = \mathrm{Tr}_{\mathrm{bath}}\, \hat{\rho}$,    (6.32)

with $\hat{\rho} = e^{-\hat{H}_{\mathrm{FB}}/T}$ the density matrix of the QES at temperature $T$ and $\mathrm{Tr}_{\mathrm{bath}}$ the trace over the degrees of freedom of the bath sites. $\hat{\rho}_R$ mimics the reduced density matrix of the infinite-size system obtained by tracing over everything except the bulk. This idea has been used to simulate quantum models in one, two, and three dimensions. The QES shows good accuracy at all temperatures, with relatively large errors appearing only near the critical/crossover temperature.
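The snippet below is a brute-force numpy/scipy illustration of Eq. (6.32) for a small few-body Hamiltonian: exponentiate $\hat{H}_{\mathrm{FB}}$, normalize, and trace out the bath sites to obtain $\hat{\rho}_R$. The partial-trace helper, the stand-in Hamiltonian, and the choice of which sites are bath are our own illustrative conventions; a realistic calculation would of course use the optimized $\hat{H}_{\mathrm{FB}}$.

```python
import numpy as np
from scipy.linalg import expm

def partial_trace(rho, dims, keep):
    """Trace out every site not in `keep` from a density matrix
    acting on a chain with local dimensions `dims`."""
    n = len(dims)
    rho = rho.reshape(tuple(dims) + tuple(dims))
    # trace highest site index first so lower axis indices stay valid
    for s in sorted(set(range(n)) - set(keep), reverse=True):
        m = rho.ndim // 2                # sites currently remaining
        rho = np.trace(rho, axis1=s, axis2=s + m)
    d = int(np.prod([dims[s] for s in keep]))
    return rho.reshape(d, d)

# stand-in few-body Hamiltonian on 6 spin-1/2 sites;
# sites 0 and 5 play the role of the bath
rng = np.random.default_rng(0)
A = rng.normal(size=(2**6, 2**6))
H_FB = (A + A.T) / 2                     # any Hermitian H works for the demo

T = 0.5                                  # temperature
rho = expm(-H_FB / T)
rho /= np.trace(rho)                     # thermal state of the QES
rho_R = partial_trace(rho, [2] * 6, keep=[1, 2, 3, 4])   # Eq. (6.32)
print('Tr rho_R =', np.trace(rho_R))     # = 1; rho_R mimics the bulk state
```

As $T \to 0$ with a unique ground state, $\hat{\rho}$ in this sketch reduces to the projector onto the ground state, consistent with the ground-state discussion below.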
One can readily check the consistency with the ground-state QES. When the ground state is unique, the density matrix is defined as $\hat{\rho} = |\Psi\rangle\langle\Psi|$, with $|\Psi\rangle$ the ground state of the QES. In this case, Eqs. (6.32) and (6.16) are equivalent. With degenerate ground states, the equivalence should still hold when spontaneous symmetry breaking occurs. With the symmetry preserved, it is an open question how the ground-state degeneracy affects the QES, where at zero temperature we have $\hat{\rho} = \sum_{a=1}^{D} |\Psi_a\rangle\langle\Psi_a|/D$, with $\{|\Psi_a\rangle\}$ the degenerate ground states and $D$ the degeneracy.
These ideas lead to the quantum entanglement simulation (QES) of lattice models. The central idea of QES is to construct an effective few-body model embedded in an entanglement bath, whose bulk mimics the properties of the infinite-size model at both zero and finite temperatures. The interactions between the bulk and the bath are optimized by TN methods. The QES provides an efficient way to simulate one-, two-, and even three-dimensional infinite-size many-body models by classical computation and/or quantum simulation.
With these lecture notes, we hope that readers can use the existing TN algorithms to solve their problems. Moreover, we hope that those who are interested in TN itself can grasp the ideas and connections behind the algorithms, so as to develop novel TN schemes.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.