# **International Association of Geodesy Symposia**

Jeffrey T. Freymueller · Laura Sánchez, *Editors*

# X Hotine-Marussi Symposium on Mathematical Geodesy

Proceedings of the Symposium in Milan, Italy, June 13-17, 2022

# International Association of Geodesy Symposia

*Jeffrey T. Freymueller, Series Editor · Laura Sánchez, Series Assistant Editor*

# **Series Editor**

Jeffrey T. Freymueller, Endowed Chair for Geology of the Solid Earth, Department of Earth and Environmental Sciences, Michigan State University, East Lansing, MI, USA

# **Assistant Editor**

Laura Sánchez, Deutsches Geodätisches Forschungsinstitut, Technische Universität München, München, Germany

# International Association of Geodesy Symposia

*Jeffrey T. Freymueller, Series Editor · Laura Sánchez, Series Assistant Editor*

Symposium 113: Gravity and Geoid
Symposium 114: Geodetic Theory Today
Symposium 115: GPS Trends in Precise Terrestrial, Airborne, and Spaceborne Applications
Symposium 116: Global Gravity Field and Its Temporal Variations
Symposium 117: Gravity, Geoid and Marine Geodesy
Symposium 118: Advances in Positioning and Reference Frames
Symposium 119: Geodesy on the Move
Symposium 120: Towards an Integrated Global Geodetic Observation System (IGGOS)
Symposium 121: Geodesy Beyond 2000: The Challenges of the First Decade
Symposium 122: IV Hotine-Marussi Symposium on Mathematical Geodesy
Symposium 123: Gravity, Geoid and Geodynamics 2000
Symposium 124: Vertical Reference Systems
Symposium 125: Vistas for Geodesy in the New Millennium
Symposium 126: Satellite Altimetry for Geodesy, Geophysics and Oceanography
Symposium 127: V Hotine-Marussi Symposium on Mathematical Geodesy
Symposium 128: A Window on the Future of Geodesy
Symposium 129: Gravity, Geoid and Space Missions
Symposium 130: Dynamic Planet - Monitoring and Understanding
Symposium 131: Geodetic Deformation Monitoring: From Geophysical to Engineering Roles
Symposium 132: VI Hotine-Marussi Symposium on Theoretical and Computational Geodesy
Symposium 133: Observing our Changing Earth
Symposium 134: Geodetic Reference Frames
Symposium 135: Gravity, Geoid and Earth Observation
Symposium 136: Geodesy for Planet Earth
Symposium 137: VII Hotine-Marussi Symposium on Mathematical Geodesy
Symposium 138: Reference Frames for Applications in Geosciences
Symposium 139: Earth on the Edge: Science for a Sustainable Planet
Symposium 140: The 1st International Workshop on the Quality of Geodetic Observation and Monitoring Systems (QuGOMS'11)
Symposium 141: Gravity, Geoid and Height Systems (GGHS2012)
Symposium 142: VIII Hotine-Marussi Symposium on Mathematical Geodesy
Symposium 143: Scientific Assembly of the International Association of Geodesy, 150 Years
Symposium 144: 3rd International Gravity Field Service (IGFS)
Symposium 145: International Symposium on Geodesy for Earthquake and Natural Hazards (GENAH)
Symposium 146: Reference Frames for Applications in Geosciences (REFAG2014)
Symposium 147: Earth and Environmental Sciences for Future Generations
Symposium 148: Gravity, Geoid and Height Systems 2016 (GGHS2016)
Symposium 149: Advancing Geodesy in a Changing World
Symposium 150: Fiducial Reference Measurements for Altimetry
Symposium 151: IX Hotine-Marussi Symposium on Mathematical Geodesy
Symposium 152: Beyond 100: The Next Century in Geodesy
Symposium 153: Terrestrial Gravimetry: Static and Mobile Measurements (TG-SMM 2019)
Symposium 154: Geodesy for a Sustainable Earth
Symposium 155: X Hotine-Marussi Symposium on Mathematical Geodesy

# X Hotine-Marussi Symposium on Mathematical Geodesy

Proceedings of the Symposium in Milan, Italy, June 13-17, 2022

Edited by

Jeffrey T. Freymueller, Laura Sánchez

*Series Editor*

Jeffrey T. Freymueller, Endowed Chair for Geology of the Solid Earth, Department of Earth and Environmental Sciences, Michigan State University, East Lansing, MI, USA

#### *Assistant Editor*

Laura Sánchez, Deutsches Geodätisches Forschungsinstitut, Technische Universität München, München, Germany

*Associate Editors*

Pavel Novák, Department of Geomatics, University of West Bohemia, Pilsen, Czech Republic

Mattia Crespi, Geodesy and Geomatics Division, Department of Civil, Constructional and Environmental Engineering, Sapienza University of Rome, Rome, Italy

Nico Sneeuw, Institute of Geodesy, University of Stuttgart, Stuttgart, Germany

Fernando Sansò, Dipartimento di Ingegneria Civile e Ambientale, Politecnico di Milano, Milan, Italy

Riccardo Barzaghi, Dipartimento di Ingegneria Civile e Ambientale, Politecnico di Milano, Milan, Italy

ISSN 0939-9585 · ISSN 2197-9359 (electronic)
International Association of Geodesy Symposia
ISBN 978-3-031-55359-2 · ISBN 978-3-031-55360-8 (eBook)
https://doi.org/10.1007/978-3-031-55360-8

© The Editor(s) (if applicable) and The Author(s) 2024. This book is an open access publication.

**Open Access** This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Paper in this product is recyclable.

# **Preface**

This volume contains the proceedings of the jubilee X Hotine-Marussi Symposium on Mathematical Geodesy, which was held from 13 to 17 June 2022 at the Politecnico di Milano, Milan, Italy.

This series of symposia focused on theoretical geodesy started in 1959, when Antonio Marussi organized the first Symposium on Three-Dimensional Geodesy in Venice. The name of the symposia was changed in 1965, when the third Symposium on Mathematical Geodesy was held in Torino. The first three symposia were strongly influenced by the prominent British geodesist Martin Hotine. After his death in 1968, the series was renamed again, and the first Hotine Symposium on Mathematical Geodesy was held in Trieste in 1969. This symposium and the following four symposia were organized by Antonio Marussi. After his death in 1984, the series was renamed the Hotine-Marussi Symposia, the title used up to now. The first five Hotine-Marussi Symposia (1985, 1989, 1994, 1998 and 2003) were organized by Fernando Sansò, the driving force behind the series over more than three decades.

Since 2006, the organization of the Hotine-Marussi Symposia has been under the responsibility of the Inter-Commission Committee on Theory (ICCT) within the International Association of Geodesy (IAG). The ICCT organized the last five Hotine-Marussi Symposia held in Wuhan (2006), Rome (2009, 2013, and 2018), and Milan (2022). The overall goal of the Hotine-Marussi Symposia is aligned with the main objective of the ICCT, i.e., to advance geodetic theory in all branches of geodesy, reflecting developments in geodetic observing systems and interactions of geodesy with other Earth-related sciences. Thus, the Hotine-Marussi Symposia on Mathematical Geodesy are a primary venue for theoretically oriented geodesists.

The X Symposium in Milan attracted 60 participants from 30 countries who contributed 80 papers (62 oral and 18 poster presentations). The scientific program of the symposium was organized in 10 regular sessions thematically modelled according to the topics of the ICCT study groups and mostly convened by their chairs:

	- IX *Geodetic methods in Earth system science* Mattia Crespi, Nico Sneeuw
	- X *Theory of local gravity field modelling* Hussein Abd-Elmotaal, Jianliang Huang

*Participants of the X Hotine-Marussi Symposium, 15 June 2022, Politecnico di Milano, Italy*

Additionally, a special session was held on 15 June 2022, organized by Pavel Novák, president of ICCT (2015–2023). Its program consisted of five invited talks focused on the two basic concepts of physical geodesy – geoid and quasigeoid:


Based on the debate over the geoid and quasigeoid, a motion critical of the use of the quasigeoid as a reference surface for physical heights in scientific and engineering applications was proposed to the Assembly of the X Hotine-Marussi Symposium. The Assembly recommended further discussion of this issue within the geodetic community.

The scientific program of the symposium was complemented by a great social program including a tour of the Duomo (Cathedral) and Centro Storico (Historic Center) di Milano.

The symposium was organized as a classic meeting with on-site participation; however, due to pandemic restrictions, a limited number of presentations were delivered using online tools. Although the number of participants did not match that of previous Hotine-Marussi symposia, the meeting was attended by numerous geodesists, both young and senior.

We would like to acknowledge all who contributed to the success of the X Hotine-Marussi Symposium. The study group chairmen and the entire Scientific Committee (P. Novák, M. Crespi, N. Sneeuw, F. Sansò, R. Barzaghi, C. Kotsakis, M. Reguzzoni, J. Bogusz, A. Kealy, M. Schmidt, J. Müller, B. Li, M. Santos, M. Šprlák, K. Sośnica, R. Tenzer, J. Huang, A. Calabia, D. Tsoulis, B. Soja, Y. Tanaka, A. Khodabandeh, A. Kłos, S. Claessens, R. Čunderlík, G. Savastano, D. Carrion) put much effort into organizing and convening their sessions. The peer-review process was led by Jeffrey Freymueller and Laura Sánchez, the IAG Symposia Series editors. Although most of the reviewers remain anonymous to the authors, a complete list of reviewers is printed in this volume to express our gratitude for their dedication.

The Symposium was financially and promotionally supported by the Politecnico di Milano. The IAG provided travel support to selected young participants of the Symposium.

Most of our gratitude, however, goes to the Local Organizing Committee of the Symposium. The team, chaired by Riccardo Barzaghi, consisted of members of the Department of Civil and Environmental Engineering of the Politecnico di Milano: B. Betti, F. Migliaccio, A. Albertella, M. Reguzzoni, G. Venuti, D. Carrion, C. De Gaetani, L. Rossi, and C. Vajani. Through their effort and organizational skills, Riccardo and his team contributed significantly to the success of the Symposium.

Pavel Novák (Pilsen, Czech Republic)
Mattia Crespi (Rome, Italy)
Nico Sneeuw (Stuttgart, Germany)
Fernando Sansò (Milan, Italy)
Riccardo Barzaghi (Milan, Italy)

November 2023

# **Contents**

#### **Part I Gravity Field Modelling and Height Systems**


#### **Part III Geodetic Data Analysis**

# **Part I: Gravity Field Modelling and Height Systems**

# **Remarks on the Terrain Correction and the Geoid Bias**

# Lars E. Sjöberg and Majid Abrehdary

#### **Abstract**

The incomplete knowledge of the topographic density distribution causes a topographic bias in all gravimetric geoid determinations. This bias becomes critical when aiming for accurate geoid models in high mountainous regions. The bias can be divided into two components: the bias of the Bouguer shell (or Bouguer plate) and that of the remaining terrain. Starting from the known (disturbing) potential at the Earth's surface, we study the possible *location* of the bias caused by incomplete reduction of the terrain masses in the computational process. We show that there is no such bias for terrain masses located exterior to the Bouguer plate/shell and/or inside the Bouguer plate at a lateral distance exceeding the height *H<sub>P</sub>* of the topography at the computational point. We conclude that the only possible terrain bias could be generated by masses inside a dome of height (√2 − 1)*H<sub>P</sub>*, centered along the radius vector through the computational point, with its base of radius *H<sub>P</sub>* at sea level.

**Keywords**

Downward continuation - Terrain bias - Terrain correction - Topographic correction

# **1 Introduction**

The topographic corrections in gravimetric geoid determination can be decomposed into the corrections for the Bouguer shell/plate and the terrain (e.g., Heiskanen and Moritz 1967, Sect. 3-3). If one assumes that the mass distribution of the topography is rather random, it is suitable to define the Bouguer shell as spherically symmetric, with the density distribution given along the radius vector at the computation point. This implies that there are terrain corrections to be considered all over the Earth except along the radius at the computation point. Here we assume that the computation point is located at the Earth's surface, and we will consider the locations of masses causing a possible terrain bias in analytical downward continuation (DWC) of the (disturbing) potential to sea level. (The Bouguer shell bias can be found, e.g., in Sjöberg 2007 and in Sjöberg and Bagherbandi 2017, Sect. 5.2.5.)

*L. E. Sjöberg*: Royal Institute of Technology, Stockholm, Sweden; e-mail: lsjo@KTH.se

*M. Abrehdary*: Uppsala University, Uppsala, Sweden

In the DWC process of the surface disturbing potential to sea level, it is only the topographic potential that may cause a bias. Below we will divide the study of the possible bias caused by the terrain masses located in the exterior zone (i.e. the zone exterior to the Bouguer plate) as well as in the remote and near zones inside the Bouguer plate. In each zone we search for the answer to the question whether there could be a source of mass causing a bias.

The method we use to answer the question is to compare, for each point mass in the zone, the value *dV<sub>P</sub>*<sup>\*</sup> obtained by downward continuing its surface potential (*dV<sub>P</sub>*) to sea level with its true potential at sea level (*dV<sub>g</sub>*). The DWC is formulated by the Taylor series:

$$dV\_P^\* = \sum\_{k=0}^{\infty} \frac{\left(-H\_P\right)^k}{k!} \frac{\partial^k dV\_P}{\partial H\_P^k},\tag{1}$$

© The Author(s) 2023

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_191


**Fig. 1** The shaded area describes the dome generated by Eq. (14)

where *H<sub>P</sub>* is the orthometric height of the topography at point *P*. If *dV<sub>P</sub>*<sup>\*</sup> ≠ *dV<sub>g</sub>* (which may happen because the series diverges or converges to the wrong value), there is a bias; otherwise there is none.

Assuming that sea level is located on a flat Earth, the terrain potentials at the surface point *P* and at sea level generated by a point mass *μ* (more precisely, *μ* denoting the gravitational constant times the mass), located at lateral distance *s* from *P* and at height *h*, become (see Fig. 1)

$$dV\_P = \mu/D \quad \text{and} \quad dV\_g = \mu/D\_0,\tag{2a}$$

where

$$D = \sqrt{s^2 + \Delta^2}, \quad D\_0 = \sqrt{s^2 + h^2} \quad \text{and} \quad \Delta = H\_P - h. \tag{2b}$$

# **2 The Terrain Correction for Masses Located in the Remote Zone of the Bouguer Shell**

Let us define the remote zone as the location of all points at lateral distance *s* exceeding the height of the computation point, i.e., *s* > *s*<sub>0</sub> = *H<sub>P</sub>*.

Then the potential at sea-level of the point mass can be developed in the series

$$dV\_g = \frac{\mu}{s} \sum\_{k=0}^{\infty} \binom{-1/2}{k} d^k = \frac{\mu}{s} \left[ 1 - \frac{d}{2} + \frac{3d^2}{8} - \cdots \right],\tag{3}$$

where *d* = (*h*/*s*)<sup>2</sup>.

Also, for *H<sub>P</sub>* ≥ *h*, the surface potential at *P* becomes

$$dV\_P = \frac{\mu}{s} \sum\_{k=0}^{\infty} \binom{-1/2}{k} t^k = \frac{\mu}{s} \left[ 1 - \frac{t}{2} + \frac{3t^2}{8} - \cdots \right],\tag{4}$$

where *t* = (Δ/*s*)<sup>2</sup>.

We note that each term *t<sup>k</sup>* is a polynomial in *H<sub>P</sub>*, implying that it is downward continued simply by putting *H<sub>P</sub>* = 0.

Hence, inserting (4) into (1) with (*t<sup>k</sup>*)<sup>\*</sup> = *d<sup>k</sup>* for all *k* ≥ 0, it follows that

$$\left(dV\_P\right)^\* = dV\_g,\tag{5}$$

i.e. there is no terrain correction needed in this zone.

Note that the exterior part of the far-zone (where *HP* < *h*) is not yet included. See the next section.
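The cancellation behind Eqs. (3)–(5) can be checked numerically. The following Python sketch (the helper names and all numerical values for *μ*, *s*, *h*, *H<sub>P</sub>* are ours, chosen only for illustration) sums the binomial series (4), confirms that it reproduces *μ*/*D*, and verifies that substituting *H<sub>P</sub>* = 0 term by term yields the true sea-level potential *μ*/*D*<sub>0</sub>:

```python
import math

def binom_half(k):
    # generalized binomial coefficient binom(-1/2, k)
    c = 1.0
    for i in range(k):
        c *= (-0.5 - i) / (i + 1)
    return c

def dV_series(mu, s, x, n=80):
    # (mu/s) * sum_k binom(-1/2, k) x^k, cf. Eqs. (3) and (4)
    return mu / s * sum(binom_half(k) * x**k for k in range(n))

# illustrative values in the remote zone: s > HP, mass inside the plate
mu, s, h, HP = 1.0, 2000.0, 800.0, 1000.0

# surface potential at P, Eq. (4) with t = ((HP - h)/s)^2 ...
dVP = dV_series(mu, s, ((HP - h) / s) ** 2)
assert math.isclose(dVP, mu / math.hypot(s, HP - h), rel_tol=1e-12)

# ... downward continued by putting HP = 0 in each term, giving Eq. (3)
dVP_star = dV_series(mu, s, (h / s) ** 2)
assert math.isclose(dVP_star, mu / math.hypot(s, h), rel_tol=1e-12)  # = dV_g
```

Since *s* > *H<sub>P</sub>* bounds both expansion parameters *t* and *d* below 1, the truncated series converge rapidly and the two assertions pass.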

# **3 The Terrain Correction for Masses Located Outside the Bouguer Plate**

In the exterior zone (outside the Bouguer plate at height *H<sub>P</sub>*) it holds that *H<sub>P</sub>* < *h* (the height of the point mass) and *D*<sub>0</sub> > *D* (see notations in Fig. 1). Then the inverse distances in Eq. (2b) are related by the sequence

$$A = \frac{1}{D} = \frac{1}{\sqrt{s^2 + \Delta^2}} = \frac{1}{\sqrt{D\_0^2 + \left(\Delta^2 - h^2\right)}} = \frac{B}{\sqrt{1 + B^2 \left(\Delta^2 - h^2\right)}} = \frac{B}{\sqrt{1+q}},\tag{6a}$$

where

$$B = 1/D\_0 = 1/\sqrt{s^2 + h^2} \quad \text{and} \quad q = B^2 \left(\Delta^2 - h^2\right) = B^2 \left(H\_P^2 - 2H\_P h\right).\tag{6b}$$

Hence, the potential at *P* can be written

$$dV\_P = \mu A = \mu B / \sqrt{1+q},\tag{7}$$

and, if |*q*| < 1, the inverse square-root can be expanded as a power series in *q*, as for *t* in Eq. (4). However, outside the Bouguer shell *h* > *H<sub>P</sub>*, so that

$$|q| = \frac{H\_P \left(2h - H\_P\right)}{s^2 + h^2} < 1,\tag{8}$$

and, accordingly, one can expand *A* as a convergent power series in *q*. Applying Eq. (1) then yields

$$\left(dV\_P\right)^\* = \mu A^\* = \mu B = dV\_g,\tag{9}$$

which means that no geoid bias is generated in the DWC process by the masses exterior to the Bouguer plate.
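Inequality (8) and identity (6a) are easy to probe numerically. In the Python sketch below (variable names and the sampling ranges are ours, for illustration only), randomly drawn exterior-zone masses with *h* > *H<sub>P</sub>* always give an expansion parameter *q* of magnitude below 1, while 1/*D* always equals *B*/√(1 + *q*):

```python
import math
import random

random.seed(1)
HP = 1000.0                                # height of the computation point (m)
for _ in range(1000):
    h = random.uniform(HP + 1.0, 10 * HP)  # exterior zone: h > HP
    s = random.uniform(0.0, 10 * HP)       # lateral distance (m)
    B = 1 / math.hypot(s, h)               # B = 1/D0, Eq. (6b)
    q = B**2 * (HP**2 - 2 * HP * h)        # Eq. (6b)
    assert abs(q) < 1                      # Eq. (8): the series in q converges
    A = 1 / math.hypot(s, HP - h)          # A = 1/D
    assert math.isclose(A, B / math.sqrt(1 + q), rel_tol=1e-12)  # Eq. (6a)
```

The bound |*q*| < 1 holds because *H<sub>P</sub>*(2*h* − *H<sub>P</sub>*) < *s*<sup>2</sup> + *h*<sup>2</sup> is equivalent to 0 < *s*<sup>2</sup> + (*h* − *H<sub>P</sub>*)<sup>2</sup>.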

# **4 The Terrain Correction Due to Masses in the Near-Zone Inside the Bouguer Plate**

Next, let us replace *s*<sub>0</sub> = *H<sub>P</sub>* by *s*<sub>1</sub> < *s*<sub>0</sub> in Eq. (2b). Then the DWC effect on the inverse distance

$$C = \frac{1}{D} = \frac{1}{\sqrt{s\_1^2 + \left(H\_P - h\right)^2}} = \frac{1}{\sqrt{s\_1^2 + h^2}}\, \frac{1}{\sqrt{1+p}} = \frac{1}{\sqrt{s\_1^2 + h^2}} \sum\_{k=0}^{\infty} \binom{-1/2}{k} p^k,\tag{10}$$

where *p* = (*H<sub>P</sub>*<sup>2</sup> − 2*hH<sub>P</sub>*)/(*s*<sub>1</sub><sup>2</sup> + *h*<sup>2</sup>), will approach 1/√(*s*<sub>1</sub><sup>2</sup> + *h*<sup>2</sup>) if

$$\left| \left( H\_P^2 - 2hH\_P \right) / \left( s\_1^2 + h^2 \right) \right| < 1,\tag{11}$$

which yields

$$dV\_P^\* = \mu C^\* = \mu/\sqrt{s\_1^2 + h^2} = dV\_g.\tag{12}$$

Recall that inside the Bouguer plate *H<sub>P</sub>* > *h*, so that inequality (11) becomes

$$\left(H\_P^2 - 2hH\_P\right) / \left(s\_1^2 + h^2\right) < 1,\tag{13a}$$

or

$$\frac{s\_1^2 + (H\_P + h)^2}{2H\_P^2} > 1.\tag{13b}$$

This inequality is met for all masses located at points (*s*, *h*) outside the dome (with a hole of radius 0 < *s*<sub>1</sub> < *s*<sub>0</sub> = *H<sub>P</sub>* at its center) generated by the circle

$$\frac{s^2 + (H\_P + h)^2}{2H\_P^2} = 1\tag{14}$$

within the sector *s*<sub>1</sub> ≤ *s* ≤ *H<sub>P</sub>* and height 0 ≤ *h* ≤ (√2 − 1)*H<sub>P</sub>*, rotated around its vertical axis; see Fig. 1. Inside the dome, (*s*, *h*) violates (13b) (see the shaded area in Fig. 1, with vertex at point *Q*<sub>0</sub>).
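The dome geometry of Eq. (14) can be verified with a few lines of Python (function name and sample values are ours, chosen for illustration): the apex of the dome sits at *h* = (√2 − 1)*H<sub>P</sub>*, and individual mass points can be tested against criterion (13b):

```python
import math

HP = 1000.0                           # height of the computation point (m), illustrative
dome_top = (math.sqrt(2) - 1) * HP    # dome apex height at s = 0, from Eq. (14)
assert math.isclose(dome_top, math.sqrt(2 * HP**2) - HP)

def outside_dome(s, h, HP):
    # inequality (13b): points satisfying it cannot cause a terrain bias
    return (s**2 + (HP + h)**2) / (2 * HP**2) > 1

assert outside_dome(HP, 1.0, HP)                  # on the base rim s = HP, h > 0
assert not outside_dome(0.0, 0.5 * dome_top, HP)  # below the apex: inside the dome
assert outside_dome(0.0, dome_top + 1.0, HP)      # just above the apex: outside
```

Setting *s* = 0 in Eq. (14) gives *H<sub>P</sub>* + *h* = √2 *H<sub>P</sub>*, i.e., the apex height used above; setting *h* = 0 gives the base radius *s* = *H<sub>P</sub>*.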

# **5 Conclusions**

If the Bouguer shell is modelled with a spherically symmetric density distribution that changes radially according to the topographic density distribution along the radius vector **r** through the computation point on the Earth's surface, there will remain terrain masses all over the Earth, except along **r**. Assuming that the disturbing potential is known at the Earth's surface at height *H<sub>P</sub>*, the study shows that only terrain mass inside a dome of height (√2 − 1)*H<sub>P</sub>* along **r**, with its base of radius *H<sub>P</sub>* at sea level, is likely to cause a bias. Masses located outside the dome cannot produce a bias in the geoid determination, and if there are no terrain masses inside the dome, the only topographic bias will be that caused by the Bouguer plate.

**Acknowledgement** We appreciate the constructive remarks by three anonymous reviewers.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Why a Height Theory Must Be Rigorous and Physically Correct**

Petr Vaníček, Marcelo Santos, Robert Kingdon, Ismael Foroughi, and Michael B. Sheng

#### **Abstract**

Let us start by defining what we understand by a height system: a height system is a conglomerate of a reference surface, upon which the height H = 0, and a recipe for how heights above that surface are obtained from observations. Two such systems, which we call the classical or Gauss-Stokes system and the Molodensky system, are used in practical height measurement.

The reference surface used by the classical system is the Geoid, and its usage is based on valid physical arguments. Determination of the height above the geoid requires data at the surface of the Earth obtained by levelling, gravimetry, sea-level measurements, and topographical density from geological measurements. This system served us well when decimetre height accuracy was required and will continue doing so even now, when one or even two orders of magnitude better accuracy is needed. On the other hand, Molodensky's system uses the quasigeoid as a reference surface; this surface is ill-suited for a global height system. This paper argues the case that the standard classical reference surface, the geoid, should be used in practice everywhere.

#### **Keywords**

Geodetic heights - Geoid - Height system - Quasigeoid

# **1 Review**

Heights are the vertical coordinates in a 3D curvilinear coordinate system and as such can have a different metric in different height systems. The role of the Earth's gravity field is much more important in the vertical dimension than in the horizontal dimensions. [But look at the error in the length of a metre, where the effect of gravity overwhelms the contribution of measurement error 50 times (Vaníček and Foroughi 2019)]. It is clearly a greater intellectual challenge to work with heights than to work with horizontal positions. Also, the term "vertical" associated with heights implies that a horizontal, a.k.a. level, surface should serve as the reference surface for heights.

Department of Earth and Space Science and Engineering, York University, Toronto, ON, Canada

Horizontality brings in the notion of a gravity equipotential surface (by definition) and a natural extension of this notion, the Geopotential numbers *C*. These are defined as

$$C\left(\Omega, H\right) = W\_0 - W\left(\Omega, H\right), \tag{1}$$

where *W* stands for the real gravity potential, Ω for the couple of horizontal coordinates (φ, λ), and *H* for the height above the height reference surface. As *W*<sub>0</sub> is the potential of the reference surface, *C* can be used as a height indicator (a pseudo-height) if we were willing to use physical units of potential (Gal m) as "units of heights".

P. Vaníček · M. Santos · R. Kingdon · M. B. Sheng

Department of Geodesy and Geomatics, University of New Brunswick, Fredericton, NB, Canada

I. Foroughi

Department of Geodesy and Geomatics, University of New Brunswick, Fredericton, NB, Canada

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_193


**Table 1** Overview of height systems discussed in this paper

There are three kinds of terrestrial heights *H* used in practice: Orthometric *H*<sup>O</sup>, Dynamic *H*<sup>D</sup>, and Normal *H*<sup>N</sup>, all referred to the geoid *W*(Ω) = *W*<sub>0</sub>. They are all defined by a similar equation

$$H\left(\Omega\right) = C\left(\Omega, H\right) / \; \overline{\left|\nabla V\left(\Omega\right)\right|},\tag{2}$$

where the overlined term is the integral mean absolute value of the gradient of the potential *V* between the reference surface and the point of interest on the Earth's surface. The values of the denominator in Eq. (2) for the three varieties of terrestrial heights *H* are:

- for *H*<sup>O</sup> … mean real gravity (*V* = *W*),
- for *H*<sup>D</sup> … a constant, conventionally adopted normal gravity value,
- for *H*<sup>N</sup> … mean normal gravity (*V* = *U*).
We understand that the normal height *H*<sup>N</sup> was introduced by Vignal in France in 1957, although the term "normal" might have been used already by Molodensky a few years earlier. In any case, there is a slight difference between Vignal's and Molodensky's versions of Normal heights in the way the mean normal gravity is computed. As a reference, the various height systems, including some to be discussed later, are summarized in Table 1.

The three classical height systems are described, for example, in Vaníček and Krakiwsky (1986). Orthometric height is the only height system to use the Euclidean metric, i.e., *H*<sup>O</sup> is the real height above the geoid as measured by a constant metre. Note that we are here talking about the *rigorous Orthometric height and not the approximate Helmert variety* (Tenzer et al. 2005), and the term "metric" is taken in the mathematical sense as the way of measuring distances. The Dynamic height *H*<sup>D</sup> is the only physically meaningful height (fluid does not flow between two points of equal dynamic heights), its Riemannian metric being dictated by the shape of the gravity field. The Normal height *H*<sup>N</sup> is comparable in some ways to the Orthometric height, as it was originally meant as an approximation of the Orthometric height, but like the Dynamic height, it is measured by a "rubber metre". Its metric has no real meaning.
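The three denominators of Eq. (2) can be illustrated with a small numerical example. The geopotential number and all gravity values below are assumptions chosen only to show the metre-level divergence of the three height varieties for the same point:

```python
# Geopotential number of a hypothetical point, in m^2/s^2 (roughly a 2,000 m summit).
C = 19620.0

g_mean     = 9.81    # assumed mean real gravity along the plumbline (m/s^2)
gamma_0    = 9.806   # assumed constant normal gravity for dynamic heights (m/s^2)
gamma_mean = 9.80    # assumed mean normal gravity along the normal plumbline (m/s^2)

H_orthometric = C / g_mean      # H^O: Euclidean metric above the geoid
H_dynamic     = C / gamma_0     # H^D: equal values means no fluid flow
H_normal      = C / gamma_mean  # H^N: the "rubber metre"

# For the same point the three heights differ at the metre level.
assert abs(H_orthometric - 2000.0) < 1e-6
assert 0.0 < H_dynamic - H_orthometric < 2.0
assert 0.0 < H_normal - H_orthometric < 3.0
```

The differences scale with both the height itself and the gravity disturbance, which is why the choice of denominator matters most in mountainous terrain.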

# **2 Problems with Molodensky's Approach**

In the mid-twentieth century, the Russian physicist M.S. Molodensky observed that the lack of information on topographical density at that time caused too large an error in *H* to satisfy the practical accuracy requirements. To eliminate this problem, Molodensky devised an interesting alternative theory of heights and of the external gravity field, which does not require any knowledge of topographical density. Molodensky's practical heights use the quasigeoid as the reference surface and are numerically quite close to Vignal's Normal heights. The metric of Molodensky's (practical) heights is Euclidean. *We distinguish between Normal and Molodensky's heights because, while Molodensky was writing mostly about Normal heights, geodetic practice uses what we call here Molodensky's heights; the commonly used terminology follows the explanations of Heiskanen and Moritz* (1967)*, which are much the same as ours.*

It has been known for some time that Molodensky's theory has two flaws: (1) it calls for integration over a surface (the Telluroid) which is reflective of topography and is thus not integrable, and (2) the reference surface it uses, the *quasigeoid*, is not globally continuous. Being reflective of topography, it has folds, reflecting the folds of the topographical surface, which cause the discontinuity and other funny features reflecting topography (Vaníček and Santos 2019).

How large can the quasigeoid's folds be? The size of the fold is given as

$$\Delta \xi\_{AB} = \xi\_A - \xi\_B = \left(W\_A - U\_A\right)/\gamma - \left(W\_B - U\_B\right)/\gamma = \left(W\_A - W\_B\right)/\gamma - \left(U\_A - U\_B\right)/\gamma.\tag{3}$$

Now,

$$\left(W\_A - W\_B\right) = -\overline{g}\_{AB}\, \Delta H\_{AB},\tag{4}$$

and similarly

$$(U\_A - U\_B) = -\overline{\gamma}\_{AB} \Delta H\_{AB} \,. \tag{5}$$

Thus,

$$\Delta\xi\_{AB} = \left(-\overline{g}\_{AB} + \overline{\gamma}\_{AB}\right) \Delta H\_{AB}/\gamma = -\overline{\delta g}\_{AB}\, \Delta H\_{AB}/\gamma,\tag{6}$$

where *δg*<sub>AB</sub> stands for the mean gravity disturbance between points A and B. Let us assume *δg*<sub>AB</sub> to be 100 mGal and Δ*H*<sub>AB</sub> to be 1,000 m. For this situation, the magnitude of the fold Δ*ξ*<sub>AB</sub> would be about 10 cm, but larger folds plausibly exist in the real world. There are also other problems with the quasigeoid, which is not a mathematically well-behaved surface at all, nor has it any physical meaning even where it is defined.
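The 10 cm estimate follows directly from Eq. (6); a few lines of Python reproduce it (the unit conversion and the value of normal gravity are ours):

```python
mGal = 1e-5                # 1 mGal = 1e-5 m/s^2
delta_g_AB = 100 * mGal    # assumed mean gravity disturbance between A and B
dH_AB = 1000.0             # assumed height difference between A and B (m)
gamma = 9.81               # normal gravity (m/s^2)

fold = delta_g_AB * dH_AB / gamma   # |Delta xi_AB| from Eq. (6)
assert 0.09 < fold < 0.11           # about 10 cm, as stated in the text
```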

The basic requirement of a height system is *holonomity*, which, in brief, really means uniqueness of definition: for each horizontal position Ω there must be only one value of height *H* referred to the point of interest on the Earth's surface (Sansò and Vaníček 2006). In other words, heights must be true coordinates. This property is also demanded by the classical treatment of levelled differences, so that height differences between any two points on the Earth's surface are the same whichever levelling route between them is followed. Clearly, if the terrain has overhangs, then in the areas of overhangs we shall have to use some non-standard tool to vertically locate points that are underneath the upper topographical surface. Consequently, even if the topographical surface is not a one-valued mathematical function, uniqueness can be guaranteed if the reference surface is continuous. This is not the case with the quasigeoid, as discussed above. *Hence Molodensky's heights cannot be used in a global height system.*

# **3 Arrival of Satellites and the Problem of Height Congruency**

Since the late 1960s, satellite positioning techniques have become widespread. For the first time in history there appeared an alternative approach to height determination, but the heights are of a different kind. They are Geodetic heights, referred to the reference ellipsoid. These should not be called ellipsoidal heights, as this would imply heights of an ellipsoid surface above some other reference surface, per common English usage (e.g., "sea surface heights", "topographical heights", "geoidal heights", etc.). They are obtained through a geometrical transformation of the 3D positions derived from observations to satellites such as GNSS. Geodetic heights have the Euclidean metric but are not referred to a horizontal surface, as the reference ellipsoid is not a horizontal surface. To assure the congruency between the existing Orthometric and Geodetic systems of heights, we must introduce the height of the geoid above the reference ellipsoid *N*, a.k.a. the Geoidal height, and, forgetting about the negligible differences among the lengths of different plumblines, we get the following simple relation

$$\forall\, \Omega \in \Omega\_0:\quad h(\Omega) = H^O(\Omega) + N(\Omega). \tag{7}$$
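Rearranged, Eq. (7) is exactly how GNSS levelling is computed in practice: *H*<sup>O</sup> = *h* − *N*. A minimal sketch, with all numerical values hypothetical and purely illustrative:

```python
# Eq. (7) rearranged: Orthometric height from a GNSS geodetic height and a
# geoid model, H^O = h - N. The numbers below are made up for illustration.

def orthometric_height(h_geodetic: float, n_geoidal: float) -> float:
    """Return H^O = h - N (metres), neglecting plumbline-length differences."""
    return h_geodetic - n_geoidal

h = 245.812    # geodetic height h from GNSS (m), hypothetical
N = -31.447    # geoidal height N from a geoid model (m), hypothetical

print(f"H^O = {orthometric_height(h, N):.3f} m")   # H^O = 277.259 m
```

The sign convention matters: *N* is negative where the geoid lies below the ellipsoid, so the Orthometric height can exceed the Geodetic one.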

The congruency with the other existing terrestrial height systems, Dynamic and Normal, is achieved simply through multiplication by the appropriate ratio of potential gradients, cf. Eq. (2).

Satellite techniques have become so accurate that the accuracy of Geodetic heights *h* is now equivalent to that of standard Orthometric heights *H*<sup>O</sup>. The standard deviation of Geodetic heights and the standard deviation of individual, or point, terrestrial heights are thought to be about the same, 2–3 cm. Since the standard deviation, as a measure of error of the computed Geoidal height *N*, can be about five times smaller if the data available for geoid determination are of good quality and quantity (Foroughi et al. 2019) and the topographical heights are reasonably low, it is now more economical to determine the Orthometric height as the difference of the Geodetic height *h* and the Geoidal height *N* than to use the classical levelling technique.

As an aside, there is a very often asked question that makes sense as the ease with which we can measure and calculate Geodetic heights *h* for almost any point on the surface of the Earth, with an accuracy good enough for most applications, becomes obvious: Why not use Geodetic heights as practical heights? The problem is the datum: in the Geodetic height system the height of the sea shore varies between −100 m and +100 m, which makes it difficult to work with in a technically meaningful way. Clearly, Geodetic heights must be transformed to Orthometric, Dynamic or Normal heights, which all have the same reference surface, the geoid, selected so that it approximates the mean sea level, and are thus useful in practice.

It is impossible to use a simple equation like Eq. (7) to make the Molodensky heights congruent with Geodetic heights since the Molodensky system is non-holonomic due to using the quasigeoid as its reference surface. The difference between the geoid and the quasigeoid can be evaluated only approximately. Hence, as a *consequence of the quasigeoid being a discontinuous surface, the Molodensky height system cannot be made globally congruent with the Geodetic height system*.

# **4 Conclusions**

The maintenance of the congruency of a terrestrial height system with the Geodetic system is an ongoing process. This process includes, of course, the increasing accuracy of geoidal height determination. Thus the congruency of the terrestrial and Geodetic height systems should be assured as much as possible at any stage of the height densification process. The height systems are developed in successive iterations as the understanding of the involved problems improves and more, and increasingly denser, observations of the real world become available. To ensure that these iterations converge to a correct result, i.e., to true congruency, the individual parts have to be formulated correctly in the physical sense.

As an aside, let us take, for illustration, the downward continuation of gravity. It is a well-posed problem for reasonably low topography and reasonably large steps [more than 1 or 2 arc-minutes (Martinec 1996)] in the description of topography. When the step gets too small and/or the topography gets too high, the process becomes unstable and can be solved only by some artificial means (like regularization or, preferably, by Moore-Penrose generalized matrix inversion). But, if the combination of step size and topography yields a regular matrix, there exists a unique and *physically correct* solution. It seems that the existing regular geoid solutions can be accurate to 1 cm (Foroughi et al. 2019) and a more detailed solution will probably not give any real improvement. This would be true if the geoidal deflection of the vertical does not change by more than 1 second-of-arc over a horizontal distance of 1 km, which may plausibly be the case worldwide. Hence, if the combination of step size and topographical heights yields a regular downward continuation system of equations, then this process gives an example of a well-posed problem that satisfies the requirement: each successive iteration will bring the real congruency closer and closer to the ideal.
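The instability, and its cure by Moore-Penrose inversion, can be illustrated numerically. The sketch below is not geodetic software: it uses a synthetic Gaussian smoothing matrix as a stand-in for the downward-continuation operator, with the kernel width playing the role of the step-size/topography combination, and compares plain inversion against a truncated pseudoinverse.

```python
import numpy as np

# Hedged sketch: downward continuation as a discrete linear system A x = y,
# where A is a smoothing operator mapping the field at depth to the field
# at altitude. The matrix is synthetic (Gaussian kernel); its width stands
# in for the step-size/topography combination discussed in the text.

rng = np.random.default_rng(0)
n = 60
grid = np.arange(n)

def smoothing_matrix(width):
    A = np.exp(-((grid[:, None] - grid[None, :]) / width) ** 2)
    return A / A.sum(axis=1, keepdims=True)   # row-normalized kernel

x_true = np.sin(2 * np.pi * grid / n)         # "true" downward-continued signal
A_mild = smoothing_matrix(1.0)                # large step: well-conditioned
A_hard = smoothing_matrix(6.0)                # small step / high topography

for A in (A_mild, A_hard):
    y = A @ x_true + 1e-6 * rng.standard_normal(n)   # observations with noise
    x_naive = np.linalg.solve(A, y)                  # plain inversion
    x_pinv = np.linalg.pinv(A, rcond=1e-4) @ y       # Moore-Penrose, truncated
    print(f"cond(A)={np.linalg.cond(A):.1e}  "
          f"naive err={np.linalg.norm(x_naive - x_true):.2e}  "
          f"pinv err={np.linalg.norm(x_pinv - x_true):.2e}")
```

For the mild operator both inversions recover the signal; for the strongly smoothing one, plain inversion amplifies the tiny noise catastrophically while the truncated pseudoinverse (singular values below `rcond` are discarded) stays stable, mirroring the regular/irregular dichotomy described above.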

On the other hand, *Molodensky's system suffers from flaws that come from the theoretical formulation rather than the practical implementation*. To use this system requires the user to evaluate integrals over a surface that does not allow integration over it. Similarly, the heights are defined so that the reference surface reflects the folds that exist on the topographical surface, and no successive iterations are going to get rid of this problem. This behaviour makes the quasigeoid unacceptable as the global reference surface for heights needed for the United Nations' resolution on the "Global Geodetic Reference Frame (UN-GGRF) for Sustainable Development".

In the real world, when solving a practical problem, some people use an approach that is known to be somewhat suspect just because it is easier to use, it works to the presently required accuracy and it "satisfies the need" of the instant. We have written this paper specifically to address those geodesists who are doing a disservice to their profession by preparing the ground for the future problems, bound to appear, that arise from using Molodensky's height system. *Please abandon Molodensky's height system, i.e., forget about using the quasigeoid*, and make sure that the height system you are working with has a chance to have the necessary congruency with the Geodetic height system, that it is physically rigorous and correct, and that it has the potential to improve the height control with future iterations.

**Acknowledgements** Ismael Foroughi was supported by Mitacs Application No. IT25134.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Geodetic Heights and Holonomity**

# Fernando Sansò, Riccardo Barzaghi, and Mirko Reguzzoni

#### **Abstract**

It is sometimes stated in the geodetic literature that the normal height system, so important in both geodetic theory and practice, is non-holonomic, i.e. that the normal height of a point in reality depends also on the integration path of a certain differential. On the contrary, this paper proves that the normal height system is holonomic, also identifying the critical point on which the non-holonomity statement is based. Besides that, the general concepts related to the definition of a height system are revised and an overview of the heights currently in use is given.

Indeed, given the theoretical and practical importance of the subject, this is a key item in Geodesy that must be clearly stated by using definitions and results well known in mathematics.

#### **Keywords**

Ellipsoidal heights · Geopotential · Normal heights · Orthometric heights

# **1 Introduction**

Geodetic heights are tools used in Geodesy as coordinates to place in 3D space points that belong to the surface of the earth *S* or to a layer extending dozens of kilometres above or below it, a set that we will call Ω.

There are four principal height systems in use in Geodesy: the gravity potential *W*(*P*), or the geopotential number *C*(*P*) = *W*<sub>0</sub> − *W*(*P*), or the dynamic height *H<sub>D</sub>*(*P*) = *C*(*P*)/γ<sub>0</sub>, which are linear functions of it (for these definitions see e.g. § 4.2 of Heiskanen and Moritz 1967); the orthometric height, which is mostly preferred by surveyors as it is considered much akin to the levelling observations; the normal height, introduced as a tool for the linearization of the Geodetic Boundary Value Problem (Sansò and Sideris 2013; Yanushauskas 1989); the ellipsoidal height, which is a purely geometric concept, though nowadays accessible by direct observations from GNSS satellites.

F. Sansò · R. Barzaghi (✉) · M. Reguzzoni

DICA – Politecnico di Milano, Milano, Italy e-mail: riccardo.barzaghi@polimi.it

The first and the last systems are, so to say, naturally global and suitable for building unambiguously a world-wide system (height datum).

On the contrary, orthometric and normal heights, maybe because of their closeness to the levelling increments

$$\delta L = -dW/\mathrm{g} \quad (\mathrm{g}\ \text{the Earth gravity}) \tag{1.1}$$

have been adopted as local systems where corrections to the integral along levelling lines of the differential form (1.1) could be neglected.

Therefore, a practical idea was to assign a height 0 conventionally to some origin point and then give a height to other points connected by levelling lines by integrating δ*L* along them. In this way, many local levelling height systems have been created.

Yet (1.1) is not an exact differential form, as is proved e.g. in Sansò and Vaníček (2006). So, the integral of δ*L* over a large, closed loop is significantly different from zero. In other words, levelled height systems brought to a global (or even only large) scale clearly show their non-holonomity.
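This non-holonomity can be reproduced numerically. The sketch below uses a synthetic potential, not real gravity data: *W*(*x*, *z*) = −(g<sub>0</sub> + *a x*) *z*, with an exaggerated horizontal gradient *a*, so that g varies along equipotential surfaces and the loop integral of δ*L* = −*dW*/g does not close.

```python
import numpy as np

# Toy illustration of the non-holonomity of levelled heights. The potential
# W(x, z) = -(g0 + a*x) * z is synthetic; `a` is an exaggerated horizontal
# gravity gradient chosen only to make the misclosure visible. Since g
# varies along the equipotentials, dL = -dW/g is not an exact differential.

g0 = 9.81      # base gravity (m/s^2)
a = 1e-8       # horizontal gravity gradient (1/s^2), toy value

def W(x, z):
    return -(g0 + a * x) * z

def g_mag(x, z):
    gx, gz = a * z, g0 + a * x        # components of -grad(W)
    return np.hypot(gx, gz)

def levelling_loop_misclosure(corners, steps=20000):
    """Integrate dL = -dW/g along the closed polygon `corners` (metres)."""
    total = 0.0
    pts = corners + [corners[0]]
    for (x0, z0), (x1, z1) in zip(pts, pts[1:]):
        xs, zs = np.linspace(x0, x1, steps), np.linspace(z0, z1, steps)
        dW = np.diff(W(xs, zs))
        gm = g_mag(0.5 * (xs[:-1] + xs[1:]), 0.5 * (zs[:-1] + zs[1:]))
        total += np.sum(-dW / gm)
    return total

# 100 km wide, 1 km high rectangular "levelling loop" (toy geometry)
loop = [(0.0, 0.0), (1e5, 0.0), (1e5, 1e3), (0.0, 1e3)]
print(f"loop misclosure: {levelling_loop_misclosure(loop):.4f} m")
```

With these toy numbers the misclosure comes out at roughly a decimetre: ∮*dW* vanishes exactly (W is a potential), but ∮*dW*/g does not.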

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*,

International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_234

Is such a property inherited by orthometric or by normal heights? As for orthometric heights, there has been a long-lasting discussion in the geodetic community. Yet, the question has been completely and satisfactorily solved in Sansò and Vaníček (2006), where it is shown that the orthometric height is a continuously differentiable single-valued function and, as such, a regular holonomic coordinate in the layer around *S* described above.

Now, the question seems to arise again in relation to normal heights, if the quasi-geoid is claimed to be a reference surface and to display potentially such irregular features as to make normal heights a non-holonomic system (Sansò and Vaníček 2006).

In this paper, the authors consider once again the question of holonomity and show that the mentioned proposition descends from a misinterpretation of the quasi-geoid as the reference surface of normal heights.

# **2 What Is a Geodetic Height?**

A regular system of coordinates in a domain Ω ⊂ R<sup>3</sup> is a triple of point functions (*q*<sub>1</sub>, *q*<sub>2</sub>, *q*<sub>3</sub>) such that the correspondence

$$P \in \Omega \text{ } \leftrightarrow \text{ } (q\_1, q\_2, q\_3) \tag{2.1}$$

is biunivocal and continuously differentiable, i.e. ∇<sub>P</sub>*q<sub>i</sub>*(*P*) should be continuous vector fields in Ω (Marussi 1985; Sansò et al. 2019). The above condition implies that, when Ω is simply connected, the integral of *dq<sub>i</sub>* on any closed rectifiable curve in Ω is zero (Sansò and Vaníček 2006).

Since the idea of height is related to representing something that is "above" or "below" an observer, one of the 3 coordinates *q<sub>i</sub>* (*i* = 1, 2, 3), say *q*<sub>3</sub> for the sake of definiteness, will be considered a "height" if it has a relation to the direction of the physical vertical, namely to the unit vector

$$\underline{n}\_P = -\frac{\underline{\mathbf{g}}(P)}{\mathbf{g}(P)}\tag{2.2}$$

where **g**(*P*) is the vector of the Earth gravity field.

In Sansò et al. (2019), we have stipulated that *q*<sub>3</sub> is a geodetic height if the tangent *t*<sub>3</sub> to the *q*<sub>3</sub> line at *P* makes an acute angle with *n*(*P*), i.e.

$$\underline{n}\_P \cdot \underline{t}\_3(P) \ge 1 - \varepsilon \tag{2.3}$$

for some value ε, 0 ≤ ε < 1.

In fact, an observer moving from *P* in the direction of *t*<sub>3</sub> will see the gravity potential value *W*(*P*) decrease, which is a distinctive property of being higher, at least up to some distance from the centre of the Earth, at which the centrifugal term of *W*(*P*) is still a perturbation of the main gravitational term.

Yet, such a definition has a logical limit in that the *q*<sub>3</sub> line is, as a matter of fact, defined by the other two coordinates, which have to be consistent along it. Indeed, *q*<sub>3</sub> has to change monotonically along its coordinate line, otherwise the one-to-one correspondence (2.1) can fail, so that condition (2.3) makes sense. However, there are many directions in which both *q*<sub>3</sub> and *W*(*P*) can change, one increasing and the other decreasing. So, we prefer to make a slight generalization of (2.3) that can be verified on the basis of the function *q*<sub>3</sub>(*P*) alone.

# **2.1 A New Definition of a General Geodetic Height** *HG*

*H<sub>G</sub>* is defined in a layer Ω, a closed set as in Fig. 1, by means of the following three elements:

a) a surface *RS* ⊂ Ω (Reference Surface), defined as a Lipschitz function of the ellipsoidal coordinates of its points

$$RS = \left\{ P;\ \underline{r}\_P = \underline{r}\_{P\_e} + h\_P \underline{\nu}\_{P\_e} \right\} \tag{2.4}$$

*h<sub>P</sub>* = ellipsoidal height of *P*; *P<sub>e</sub>* = projection of *P* on the Ellipsoid; ν<sub>P<sub>e</sub></sub> = normal to the Ellipsoid passing through *P*; b) a family ℑ of lines in Ω

$$\Im = \{L\_P,\ P \in \Omega\} \tag{2.5}$$

with the following characteristics: through every *P* ∈ Ω passes one and only one line

$$L\_P = \left\{ \underline{r}\_{\mathcal{Q}} = \underline{\xi}\_P(\mathcal{Q}) \right\} \tag{2.6}$$

**Fig. 1** The ellipsoid *E* and the set Ω

**Fig. 2** Ω, ellipsoid *E*, Reference Surface *RS*, cosine of the angle between *n<sub>P</sub>* and *t<sub>P</sub>* equal to 1 − η, *P<sub>RS</sub>* projection of *P* on *RS*, *P<sub>RS<sub>e</sub></sub>* projection of *P<sub>RS</sub>* on *E*

each *LP* has a continuous tangent field, pointing upward

$$\underline{t}\_P(\underline{Q}) = \left\{ \frac{d\underline{\xi}\_P(\underline{Q})}{dl} ; \mathcal{Q} \in L\_P \right\}.\tag{2.7}$$

As such, each *L<sub>P</sub>* is then rectifiable. We also require that each *L<sub>P</sub>* pierces the *RS* at one point only, *P<sub>RS</sub>*; the correspondence *P* → *P<sub>RS</sub>* is called the projection of *P* on *RS* along ℑ. By pointing upward, we mean that at any point *P* in Ω, the direction of the vertical *n<sub>P</sub>* and the upward tangent of *L<sub>P</sub>*, *t<sub>P</sub>*, form an acute angle (see Fig. 2), namely

$$\underline{n}\_P \cdot \underline{t}\_P = 1 - \eta \quad (0 \le \eta < 1) \tag{2.8}$$

c) let us first define a linear coordinate on *L<sub>P</sub>*, namely the arclength *l*<sup>P</sup> of the arc from *P<sub>RS</sub>* to *P*, counted positively outside *RS* and negatively inside; now fix a functional of that arc, i.e. a function *F*(*l*<sup>P</sup>), which is monotonic in the sense that

$$l^{P'} > l^{P} \Rightarrow F\left(l^{P'}\right) > F\left(l^{P}\right) \tag{2.9}$$

and such that

$$l^{P} = 0\ \left(P = P\_{RS}\right) \Rightarrow F\left(l^{P}\right) = 0 \tag{2.10}$$

Then we define the general geodetic height *H<sub>G</sub>* as

$$H\_G(P) = F\left(l^P\right) \tag{2.11}$$

As is obvious from the above definitions, the *RS* surface corresponds exactly to the points *P* where *H<sub>G</sub>*(*P*) = 0.

We will verify in the next section that all four height systems mentioned in the Introduction comply with such a definition.

In doing so, we will clarify a small incongruence, which is present even in classical textbooks, in the definition of normal height.

*Remark 2.1* It might seem that the new definition of *H<sub>G</sub>* is on the one hand too complicated and on the other hand too generic, since it depends on the unspecified constant η, as it previously did on ε. Yet, we must underline that our definition is as a matter of fact tailored to that of the orthometric height and that, when we consider practical height systems, ε or η are in fact very small quantities, making such systems numerically not very different from one another.

This is typical of the Earth gravity field, which seems a "night in which all cows are black" (Hegel) in the sense that variables related to it, though defined in different ways, often appear numerically hardly distinguishable from one another (particularly in small areas, although globally they display a different behaviour).

# **3 The Four Height Systems Are Geodetic Heights**

We want to verify that: 1) the dynamic height *H<sub>D</sub>*; 2) the ellipsoidal height *h*; 3) the orthometric height *H<sub>o</sub>*; 4) the normal height *H*\*, are in fact geodetic heights according to the definition of the previous section.

1) Let us recall that

$$H\_D(P) = \frac{W\_0 - W(P)}{\gamma\_0} \tag{3.1}$$

where *W*<sub>0</sub> is the potential assigned to the geoid as an equipotential surface, satisfying also

$$W\_0 = U\_0 \tag{3.2}$$

with *U*<sub>0</sub> the value of the normal potential on the Earth Ellipsoid *E*; γ<sub>0</sub>, on the contrary, is a constant reference value for the normal gravity, typically taken as that on *E* at latitude φ = 45° (see e.g. Heiskanen and Moritz 1967). The choice of γ<sub>0</sub> is just to give *H<sub>D</sub>* the dimension of a length and a numerical value not too distant from the other height systems.

For *H<sub>D</sub>* we have:

a) *RS* is the geoid

$$P \in RS, \ W(P) = W\_0 = U\_0 \tag{3.3}$$

In fact, this corresponds also to *H<sub>D</sub>* = 0. b) The family ℑ for *H<sub>D</sub>* is that of plumb lines

$$\mathfrak{I} = \{L\_{Pb}\}\tag{3.4}$$

Therefore, the tangent field to *L<sub>Pb</sub>* is directly *n* and the condition (2.8) is satisfied with

$$
\eta = 0\tag{3.5}
$$

c) Taking an integral along the plumbline from *P<sub>RS</sub>* to *P*, we can clearly write

$$H\_D(P) = \frac{1}{\gamma\_0} \int\_{P\_{RS}}^{P} \mathrm{g}(Q)\, dl = F\left(l\_{Pb}^P\right) \tag{3.6}$$

and since g(*Q*) is positive everywhere on *L<sub>Pb</sub>* ∩ Ω, *H<sub>D</sub>* is clearly monotonic.

2) The ellipsoidal height *h* has its own geometric definition by means of the normal to the ellipsoid, ν<sub>P</sub>:

$$h(P) = \int\_{P\_e}^{P} dh = F\left(l\_{P\_e}^P\right) \tag{3.7}$$


$$\Im = \{h\underline{\nu}\_P;\ P\_e + h\underline{\nu}\_P \in \Omega\} \tag{3.8}$$

We have therefore that the tangent field to *LP* is

$$
\underline{t}\_P = \underline{\nu}\_P \tag{3.9}
$$

and

$$
\underline{t}\_P \cdot \underline{n}\_P = \underline{\nu}\_P \cdot \underline{n}\_P = \cos \delta\_P \tag{3.10}
$$

where δ<sub>P</sub> is the deflection of the vertical at *P*. Since

$$1 - \underline{\nu}\_P \cdot \underline{n}\_P \cong \frac{1}{2} \delta\_P^2 \tag{3.11}$$

Even assuming for δ<sub>P</sub> the generous upper bound

$$
\delta\_p < 10^{-2} \tag{3.12}
$$

we see that, from (2.8)

$$
\eta < 0.5 \cdot 10^{-4} \tag{3.13}
$$

c) It is clear from the definition (3.7) that

$$h\_P = l^P \tag{3.14}$$

the length of the ellipsoidal normal between *P<sub>e</sub>* and *P*. So, *F*(*l*<sup>P</sup><sub>P<sub>e</sub></sub>) is indeed tautologically monotonic.

3) For the orthometric height *H<sub>o</sub>*: a) the *RS* is the geoid

$$RS = \{P;\ W(P) = W\_0\} \tag{3.15}$$

$$P \in RS \implies H\_o(P) = 0 \tag{3.16}$$

b) The family ℑ is in this case the family of plumblines again

$$\mathfrak{I} = \{L\_{Pb}\}\tag{3.17}$$

and, once more, one has

$$
\underline{t}\_P = \underline{n}\_P \tag{3.18}
$$

and

$$
\eta = 1 - \underline{t}\_P \cdot \underline{n}\_P = 0 \tag{3.19}
$$

c) Since *H<sub>o</sub>* is directly the length of the plumbline, one has that

$$F\left(l^P\right) = H\_o(P) = l^P \tag{3.20}$$

which is indeed monotonous.

4) The normal height *H*\* is defined as the ellipsoidal height of the point *P*\* such that the geodetic coordinates σ = (λ, φ), the normal potential *U* and the actual potential *W* satisfy the following conditions

$$
\sigma\_{P^\*} = \sigma\_P, U\left(P^\*\right) = W(P) \text{ or } H\_P^\* = h\_{P^\*} \qquad (3.21)
$$

i.e. *P*\* is on the same normal to *E* as *P* and the second of (3.21) is verified.<sup>1</sup>

A little thought shows that, calling *Pe* the orthogonal projection of *P* on *E*, this definition can be written analytically as

$$U\left(P\_e + H\_P^{\*} \underline{\nu}\_P\right) = W(P) \tag{3.22}$$

a) from (3.22) it is clear that

$$H\_P^\* = 0 \Longleftrightarrow W(P) = U\left(P\_e\right) = U\_0 = W\_0 \qquad (3.23)$$

<sup>1</sup>Usually, this definition refers to the International Reference Ellipsoid.

namely *P* is on the geoid and *P<sub>e</sub>* is its projection on *E*. In other words, this means that

$$RS \equiv Geoid\tag{3.24}$$

b) the family I in this case is

$$\Im = \left\{P\_e + H\_P^{\*} \underline{\nu}\_P\right\},$$

i.e. the family of ellipsoidal normal lines. So,

$$
\underline{t}\_P = \underline{\nu}\_P \tag{3.25}
$$

and, again, we have what we have seen in (3.10) and (3.18), namely a very small η.

c) Let us write (3.22) for a generic point *Q* = *P<sub>e</sub>* + *h<sub>Q</sub>*ν<sub>P</sub> on the ellipsoidal normal, as

$$U\left(P\_e + H\_Q^{\star} \underline{\nu}\_P\right) = W\left(P\_e + h\_Q \underline{\nu}\_P\right) \tag{3.26}$$

We fix *P*, and hence *P<sub>e</sub>* and ν<sub>P</sub>, move only *Q* along the normal and differentiate, getting the relation

$$\underline{\nu}\_P \cdot \underline{\gamma}\left(H\_Q^{\star}\right) dH^{\star} = \underline{\nu}\_P \cdot \underline{g}\left(h\_Q\right) dh \tag{3.27}$$

We exploit the fact that ν<sub>P*</sub> = ν<sub>P</sub>, namely constant along *L<sub>P</sub>*, and

$$\underline{\nu}\_P \cdot \underline{\gamma}\left(H\_Q^{\star}\right) \cong -\gamma\left(H\_Q^{\star}\right) \tag{3.28}$$

$$\underline{\nu}\_P \cdot \underline{g}\left(h\_Q\right) = -\mathrm{g}\left(h\_Q\right)\, \underline{\nu}\_P \cdot \underline{n}\_Q \tag{3.29}$$

$$\mathrm{g}\left(h\_Q\right) = \gamma\left(H\_Q^{\star}\right) + \Delta \mathrm{g} \tag{3.30}$$

to write (3.27) in the form

$$dH\_Q^{\star} = \left(1 + \frac{\Delta \mathrm{g}}{\gamma}\right) \underline{\nu}\_P \cdot \underline{n}\_Q\, dh \tag{3.31}$$

or

$$H\_Q^{\star} = \int\_{P\_{RS}}^{Q} \left(1 + \frac{\Delta \mathrm{g}}{\gamma}\right) \underline{\nu}\_P \cdot \underline{n}\_Q\, dh \tag{3.32}$$

We stress once more that the integral in (3.32) is along the ellipsoidal normal and that *P<sub>RS</sub>* is just the projection of *P* along the normal on the geoid, so that

$$h^{Q} = \int\_{P\_{RS}}^{Q} dh = h\_Q - N\_{P\_e} \tag{3.33}$$

with *N<sub>P<sub>e</sub></sub>* the geoid undulation, i.e. it is the same as the orthometric height of *Q*.

Since both 1 + Δg/γ and ν<sub>P</sub> · *n<sub>Q</sub>* are quite close to 1, hence positive, we see that

$$H^{\star} = F\left(l^{P}\right) \tag{3.34}$$

is an increasing functional of *l*<sup>P</sup> according to (2.9). Therefore, *H*\* also complies with our definition of a general geodetic height.

*Remark 3.1* It might be worth here to amend an excusable imprecision, present in the definition of the normal height in classical books such as Heiskanen and Moritz (1967).

In fact, instead of (3.32), as *H*\* is defined in Heiskanen and Moritz (1967), § 8.3, one often finds the following alternative definition:

$$RS = Geoid \tag{3.35}$$

$$\Im = \left\{ L\_P = \text{force lines of } \underline{\gamma} \right\} \tag{3.36}$$

$$H^{\star\prime} = \int\_{P\_e'\left(\stackrel{\smile}{L}\_P\right)}^{P^{\star\prime}} dl \tag{3.37}$$

which corresponds to integrating along the normal plumb line, instead of the ellipsoidal normal, until

$$U\left(P^{\*'}\right) = W(P) \tag{3.38}$$

The situation is illustrated in Fig. 3.

Since the curvature of the normal plumbline is very small in Ω and the corresponding normal deflection of the vertical δ̃ is of the order of

$$O\left(\stackrel{\sim}{\delta}\right) = 5 \times 10^{-3} \,\frac{h}{R} \,\tag{3.39}$$

**Fig. 3** A comparison between *H*\* and *H*\*′

with *R* the mean radius of the Earth, one can easily see that

$$O\left(\left|H^{\ast\prime} - H^{\ast}\right|\right) = O\left(\frac{1}{2}\stackrel{\sim}{\delta}^2 h\right) \qquad\qquad(3.40)$$

which is well below the mm level in our set Ω.
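These orders of magnitude are easy to check numerically; the sketch below simply evaluates (3.39) and (3.40) for a few illustrative heights (the chosen heights are our own, for illustration).

```python
# Numerical check of the estimates (3.39)-(3.40): even at the top of the
# layer, the difference between the two normal-height definitions stays
# far below a millimetre. Heights chosen for illustration only.

R = 6.371e6                       # mean Earth radius (m)
for h in (1e3, 1e4, 2e4):         # heights within the layer Omega (m)
    delta = 5e-3 * h / R          # normal deflection of the vertical, Eq. (3.39)
    diff = 0.5 * delta ** 2 * h   # order of |H*' - H*|, Eq. (3.40)
    print(f"h = {h:7.0f} m   delta ~ {delta:.2e} rad   |H*' - H*| ~ {diff:.2e} m")
```

At *h* = 20 km the difference is still of the order of a few microns, which justifies treating the two definitions as numerically equivalent.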

So, this alternative definition which leads to the other form of widespread use (see Heiskanen and Moritz 1967, § 4.5)

$$\begin{cases} H^{\star\prime} = \dfrac{W\_0 - W(P)}{\overline{\gamma}} \\\\ \overline{\gamma} = \dfrac{1}{H^{\star\prime}} \displaystyle\int\_0^{H^{\star\prime}} \gamma(z)\, dz \end{cases} \tag{3.41}$$

has no relevant numerical difference with *H*\*, although one could comment that this change of integration path has a logical relation also to the difference between the vector and the scalar Boundary Value Problem in linearized form (Sansò 1995).
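Since γ̄ in (3.41) itself depends on *H*\*′, the pair is conveniently solved by a fixed-point iteration. A sketch under simplifying assumptions: γ(*z*) is taken linear with the standard free-air-type gradient ≈ 3.086 × 10⁻⁶ s⁻², and the surface value γ₀ and the geopotential number are illustrative, not taken from this paper.

```python
# Hedged sketch of Eq. (3.41): H*' and the mean normal gravity depend on
# each other, so H*' is obtained by fixed-point iteration. The linear
# decrease of gamma with height (3.086e-6 s^-2) is a standard free-air
# approximation; GAMMA0 and C below are illustrative values.

GAMMA0 = 9.806199        # normal gravity on the ellipsoid (m/s^2), illustrative
K = 3.086e-6             # vertical gradient of normal gravity (s^-2), approx.

def mean_normal_gravity(H):
    """Mean of gamma(z) = GAMMA0 - K*z over [0, H] (linear model)."""
    return GAMMA0 - 0.5 * K * H

def normal_height(C, tol=1e-10, max_iter=50):
    """Solve H = C / mean_normal_gravity(H) for H (metres).

    C is the geopotential number W0 - W(P) in m^2/s^2."""
    H = C / GAMMA0                        # first guess with surface gravity
    for _ in range(max_iter):
        H_new = C / mean_normal_gravity(H)
        if abs(H_new - H) < tol:
            return H_new
        H = H_new
    return H

C = 19620.0                               # geopotential number, made up
print(f"H*' = {normal_height(C):.4f} m")
```

The iteration converges in a few steps because γ̄ varies by only parts in 10⁴ over the height range: the first guess with γ₀ is already metre-level correct at 2 km.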

# **4 Holonomity of the Geodetic Heights**

We define as a regular holonomic coordinate in a simply connected set Ω a function *q*(*P*), *P* ∈ Ω, such that its differential

$$dq = \nabla q(P) \cdot d\underline{P} \tag{4.1}$$

is a continuous function of *P*. So, holonomity is a matter of regularity and of the set Ω. In this sense, if *q*(*P*) is directly defined, its holonomity is guaranteed by inspecting its regularity in Ω and, as a consequence, for any closed rectifiable curve *C* in Ω, one has

$$\oint\_{C} dq \equiv 0 \tag{4.2}$$

The conclusion might be different if *q*(*P*) were not directly defined, but we rather defined a differential 1-form ω = ω(*P*, *dP*), assigned a conventional value to *q* at some point *P<sub>o</sub>* internal to Ω, and put

$$q(P) = q\left(P\_0\right) + \int\_{P\_o}^{P} \omega \tag{4.3}$$

the integral being along some line *L* joining *P<sub>o</sub>* to *P*.

In this sense, *q* in general is not a function of *P* only but also of the path *L*. This can be a non-holonomic coordinate. Only when ω(*P*, *dP*) is an exact differential form, namely

$$\omega\left(P, dP\right) = df\left(P, dP\right) = \nabla f(P) \cdot d\underline{P}, \tag{4.4}$$

then we can say that

$$q(P) = q\left(P\_0\right) + \left[f\left(P\right) - f\left(P\_0\right)\right] \tag{4.5}$$

i.e. *q* is holonomic.

**Fig. 4** The angular coordinate θ<sub>P</sub> of the point *P* in R<sup>2</sup>

If we write ω, for instance, in 3D cartesian coordinates,

$$\begin{cases} \omega\left(P, dP\right) = \underline{v} \cdot d\underline{r} = A\,dx + B\,dy + C\,dz \\\\ \underline{v} = A\underline{e}\_x + B\underline{e}\_y + C\underline{e}\_z \end{cases} \tag{4.6}$$

and we further assume regularity of *A*, *B*, *C*, we know that ω is exact iff

$$\nabla \wedge \underline{v} = 0 \tag{4.7}$$

This is the universally known Stokes theorem. So, the question of non-holonomity is posed only if *q* is defined starting from a differential form. Alternatively, we might have a non-holonomity problem when *q* is not a proper coordinate because it is multivalued. Since holonomic coordinates are common, we give a counterexample of a coordinate in 2D that is not regular holonomic over the whole R<sup>2</sup>.

*Example 4.1* Let us take the angular coordinate θ in R<sup>2</sup> (see Fig. 4).

It is clear that θ is singular at *P* = 0, where it is not defined. Therefore, we can say that θ is holonomic in Ω<sub>1</sub> but not in Ω<sub>2</sub> (see Fig. 5), because in Ω<sub>1</sub> every closed curve can be shrunk to a point without exiting from the set, while this is not so in Ω<sub>2</sub>, because this circular crown is not simply connected. In fact, the integral of *d*θ along the curve *L* in Ω<sub>2</sub> is non-zero

$$\oint\_{L} d\theta = 2\pi \tag{4.8}$$

Clearly, the line *L* in Fig. 5 cannot be shrunk to a point remaining in Ω<sub>2</sub>.
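Eq. (4.8) can be checked numerically. The sketch below (the two loops are synthetic examples of ours, not from the paper) sums the principal-value increments of θ = atan2(*y*, *x*) along two closed polygons: one encircling the origin, like the curve *L* in Ω<sub>2</sub>, and one that avoids it.

```python
import numpy as np

# Numerical check of Eq. (4.8): summing the increments of theta = atan2(y, x)
# along a closed curve gives 2*pi when the loop encircles the origin and 0
# when it does not. Both loops below are synthetic examples.

def theta_loop_integral(xs, ys):
    """Sum the principal-value increments of theta along a closed polygon."""
    th = np.arctan2(ys, xs)
    dth = np.diff(np.concatenate([th, th[:1]]))
    # wrap each increment into (-pi, pi]: the local, single-valued d(theta)
    dth = (dth + np.pi) % (2 * np.pi) - np.pi
    return dth.sum()

t = np.linspace(0.0, 2 * np.pi, 1000, endpoint=False)

# circle of radius 2 around the origin: like L, not shrinkable in Omega_2
x1, y1 = 2 * np.cos(t), 2 * np.sin(t)
# small circle around (5, 0): avoids the origin, shrinkable to a point
x2, y2 = 5 + 0.5 * np.cos(t), 0.5 * np.sin(t)

print(f"around origin: {theta_loop_integral(x1, y1) / np.pi:.3f} * pi")
print(f"off origin:    {theta_loop_integral(x2, y2) / np.pi:.3f} * pi")
```

The wrapping step is exactly the "cut" discussed below: each individual increment is forced to be single-valued, and the multivaluedness of θ then reappears as the non-zero sum 2π around the origin.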

Another way to look at the problem is to say that θ is indeed a multivalued function of *P*

$$
\theta\_P = \overline{\theta}\_P + 2n\pi
$$

The only way to avoid that and return to a single-valued function is to cut the plane along a straight line issued from the origin. Therefore, Ω<sub>1</sub> in Fig. 5 is acceptable for θ while Ω<sub>2</sub> is not, because it includes one part of the forbidden cut.

**Fig. 5** The Ω<sub>1</sub> and the Ω<sub>2</sub> domains

So, returning to geodetic heights, we need only to verify whether our definition of regular holonomity is satisfied by them.

First of all, we underline that our set Ω in R<sup>3</sup>, which is similar to a spherical crown, is simply connected. In fact, every loop in Ω can be continuously shrunk to a point remaining in Ω. So, the only thing to be verified is that *H<sub>G</sub>* is continuously differentiable.

#### **Proposition 4.1**

*The geodetic heights H<sub>D</sub>*, *H<sub>o</sub>*, *H*\*, *h are regular holonomic variables in* Ω.

*Proof* As for *h(P)* it is enough to observe that

$$dh\left(P, dP\right) = \underline{\nu}(P) \cdot d\underline{P} \tag{4.9}$$

i.e.

$$
\nabla h = \underline{\nu}(P) \tag{4.10}
$$

which is certainly a continuous function of *P* in a layer around the Earth ellipsoid.

The proof for *H<sub>D</sub>*, *H<sub>o</sub>*, *H*\* is just a check on the regularity of *W*(*P*) and **g**(*P*) = ∇*W*(*P*), as they are all defined by means of the gravity field. The result comes if we consider that *W*(*P*) = *V*(*P*) + ½ω²(*x*² + *y*²), where the centrifugal part is clearly continuously differentiable in R<sup>3</sup>.

As for the Newtonian potential *V*(*P*), we need to remember that the mass density ρ(*P*) generating *V*(*P*) is bounded on the compact set *B*, the body of the Earth, and zero outside. Therefore, we also have ρ(*P*) ∈ *L<sup>p</sup>*(R<sup>3</sup>), ∀ *p* > 3. Then, as we know from potential theory (e.g. see Miranda 1970, § 13), the potential *V*(*P*) generated by ρ(*P*) in any bounded domain (like a sphere with a large but fixed radius) is in *C*<sup>1,α</sup> with α < 1 − 3/*p*. We recall that *C*<sup>1,α</sup> functions are not only continuous together with their first derivatives, but their derivatives are even Hölder continuous with exponent α. So, *V*(*P*) and **g**(*P*) are α-Hölder continuous for any exponent α < 1.

Therefore, that *H<sub>D</sub>* is regular holonomic is just tautological. That *H<sub>o</sub>* is regular holonomic has been proven in Sansò and Vaníček (2006) with a detailed geometric analysis of *dH<sub>o</sub>*.

That *H*\* is regular holonomic descends from its definition (3.26). In fact, differentiating such a relation in the direction of ν<sub>P</sub>, we get

$$\underline{\gamma}\left(P^{\*}\right) \cdot \underline{\nu}\_P\, dH^{\*} = \underline{g}(P) \cdot \underline{\nu}\_P\, dh \tag{4.11}$$

since *P*\* is a continuous function of *P* according to the implicit function theorem, γ(*P*\*) and **g**(*P*) are continuous functions of *P*, because γ(*P*\*) · ν<sub>P</sub> ≠ 0 in Ω (in fact, γ(*P*\*) · ν<sub>P</sub> ≅ −γ(*P*\*)); so we see that *dH*\* is a continuous function of *P* too, and the proof is complete.

*Remark 4.1* In geodetic practice there is in use a truly non-holonomic height, namely the so-called normal orthometric height *H<sub>no</sub>*, defined as (Sansò et al. 2019, § 6.5)

$$H\_{no}(P) = -\frac{1}{\overline{\gamma}} \int\_{P\_0}^{P} \frac{\gamma(Q)}{\mathrm{g}(Q)}\, dW(Q) \tag{4.12}$$

where *P*<sub>0</sub>*P* is a levelling line, −*dW*(*Q*)/g(*Q*) is the levelling increment and γ̄ is the average of γ(*Q*) on the ellipsoidal normal between the geoid and *H<sub>no</sub>*, under the point *P*. Such a height system is applied in the Australian continent (see Featherstone and Kuhn 2006).

As a matter of fact

$$\omega = -\frac{\gamma}{\mathrm{g}}\, dW \tag{4.13}$$

is not an exact differential form, as otherwise g would have to be constant on equipotential surfaces, which it is not.

# **5 Comparisons and Conclusions**

The paper has examined four fundamental geodetic height systems with the purpose of clarifying that they are proper regular coordinates and that none of them is non-holonomic. Moreover, the reference surfaces of the four height systems have been identified: the geoid for dynamic, orthometric and normal heights, and the ellipsoid for ellipsoidal heights.

The question then arises as to which height should be used in practice. The authors' answer is that it depends on the use we want to make of it.

Certainly, the ellipsoidal height is the neatest concept from the geometrical point of view, and it is natural to use it when GNSS observations play a main role; as an example, consider aerial navigation. On the contrary, the other height systems enter more naturally where the gravity field plays a role, for instance through levelling observations, as happens in many engineering applications. In this respect, it seems useful to us to evaluate the order of magnitude of the corrections to be applied to the integral ΔL_PQ of the levelling increment along the levelling line, namely

$$
\Delta L_{PQ} = \int_P^Q \delta L = \int_P^Q \underline{n} \cdot d\underline{r} \tag{5.1}
$$

In this sense it is interesting to compare such corrections that we will call

$$C(H_D) = \Delta L_{PQ} - \Delta H_{D_{PQ}} \tag{5.2}$$

$$C(h) = \Delta L_{PQ} - \Delta h_{PQ} \tag{5.3}$$

$$C(H^*) = \Delta L_{PQ} - \Delta H^*_{PQ} \tag{5.4}$$

$$C(H_o) = \Delta L_{PQ} - \Delta H_{o_{PQ}} \tag{5.5}$$

Such corrections are a metric index of the difference between the levelling increment summed along a line and the specific height difference of the two end points. The larger the correction, the larger the errors entering its computation.

Based on formulas by Heiskanen and Moritz (1967), section 4.4, further elaborated in Sansò et al. (2019), section 6, it is easy to derive the following rough estimates of the four corrections, all referring to the worst case:

$$O\left(C(H_D)\right) \sim 10^{-3} \, \Delta H_{PQ} \tag{5.6}$$

$$O\left(C(h)\right) \sim 10^{-4} \, l_{PQ} \tag{5.7}$$

$$O\left(C(H^*)\right) \sim 10^{-4} \, \Delta H_{PQ} \tag{5.8}$$

$$O\left(C(H_o)\right) \sim 10^{-4} \, \Delta H_{PQ} \tag{5.9}$$

where ΔH_PQ is the height difference between the end points of a levelling line and l_PQ its horizontal length.
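To make the orders of magnitude (5.6)–(5.9) concrete, a short computation for an invented levelling line; the height difference and horizontal length are purely illustrative assumptions.

```python
# Worst-case correction magnitudes (5.6)-(5.9) for an illustrative line
# with height difference Delta H_PQ = 1000 m over a length l_PQ = 50 km.
dH_PQ = 1000.0   # m
l_PQ = 50e3      # m

C_HD = 1e-3 * dH_PQ   # dynamic height, Eq. (5.6)
C_h  = 1e-4 * l_PQ    # ellipsoidal height, Eq. (5.7)
C_Hs = 1e-4 * dH_PQ   # normal height H*, Eq. (5.8)
C_Ho = 1e-4 * dH_PQ   # orthometric height, Eq. (5.9)

print(C_HD, C_h, round(C_Hs, 3), round(C_Ho, 3))   # ~1 m, ~5 m, ~0.1 m, ~0.1 m
```

Note how the ellipsoidal-height correction scales with the horizontal length of the line, not with the height difference, so it dominates on long, flat lines.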

Yet it is necessary to recall that the computation of C(H_o), contrary to that of C(H*), requires some knowledge of the density of the topographic masses, thus introducing one further uncertainty into its computation.

We conclude then that H* is better suited than H_o to treat levelling observations, without introducing unnecessary uncertainties due to poor knowledge of the mass density. Of course, H* is also the natural coordinate to be used when tackling the solution of the GBVP.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Physical Heights of Inland Lakes**

# Nico Sneeuw, Muriel Bergé-Nguyen, and Jean-François Crétaux

#### **Abstract**

Inland satellite altimetry has gained traction over the past decade and is now routinely used to monitor the water levels of rivers, lakes and reservoirs. The accuracy of such inland water height measurements, at least from radar altimetry, is still relatively poor from a geodetic viewpoint, namely in the range of several decimeters. Accuracies from spaceborne laser altimetry, in particular from the ICESat-2 mission, are at cm-level, however, and further progress in the radar altimetry domain is expected from swath-based altimetry by the SWOT mission, (to be) launched December 2022. With accuracies down to cm-level one needs to reconsider the height system definition of inland lake surfaces as obtained from satellite altimetry. Conventionally one subtracts a global geoid model from the altimetry-derived ellipsoidal height to obtain an orthometric height. Without wind stress, seiches and other time-variable height disturbances, lake water surfaces will conform to equipotential surfaces in the Earth's gravity field. Thus lake surfaces are surfaces of constant dynamic height, from which it follows that a lake surface cannot be a surface of constant orthometric or normal height. Because equipotential surfaces are inherently non-parallel, two points at a lake surface can and will have different orthometric heights. Although this effect is well understood in physical geodesy, we here model it and quantify it for various case studies. We demonstrate that the effect can be as large as a few dm for large lakes at high altitudes, an order of magnitude that is relevant in terms of satellite altimetry error levels.

#### **Keywords**

Gravity · Lake surface · Orthometric height · Satellite altimetry

# **1 Introduction**

Satellite altimetry fundamentally provides the range between satellite and water surface. After proper corrections and with knowledge of the spacecraft's orbital position the basic

N. Sneeuw (-)

Institute of Geodesy, University of Stuttgart, Stuttgart, Germany e-mail: sneeuw@gis.uni-stuttgart.de

M. Bergé-Nguyen · J.-F. Crétaux
Laboratoire d'Etudes en Géophysique et Océanographie Spatiale (LEGOS), CNES, Observatoire Midi-Pyrénées, Toulouse, France
e-mail: muriel.berge-nguyen@cnes.fr; jean-francois.cretaux@cnes.fr

product of satellite altimetry is the water surface height h above (and normal to) the reference ellipsoid.

By subtracting a geoid height N one obtains the orthometric height H by virtue of h = H + N. Typically a global geoid model is used, which obviously comes with its own model errors (commission, omission). As a consequence of H = h − N these model errors end up fully in the determined orthometric height. For many applications such geoid model errors, considered as a bias over the water body of interest, hardly play a role, particularly if variations over time are more interesting than absolute levels. In other applications, the altimetry-derived ellipsoidal heights over a lake are specifically used to improve the local geoid. The underlying assumption in such

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_192

cases is that the orthometric height over a lake surface be constant.

Satellite altimeters have repeat cycles of 10 nodal days (TOPEX/Poseidon and Jason family), 35 days (ERS and Envisat family) or 27 days (Sentinel-3 family). These sampling rates cannot capture the faster hydrological dynamics, so single-track overpasses are temporally insufficient to create lake height time series. Therefore, over large lakes and reservoirs it is only natural to combine the height information from different groundtracks, either from the same satellite or from different ones, in order to densify the temporal sampling or to extend the length of the time series. After geoid correction, and potentially after inter-satellite bias correction, the height information from the different tracks is combined into a single height time series of the lake, e.g. Bergé-Nguyen et al. (2021), although the tracks refer to different locations on the lake. The combination of track information therefore assumes that the lake surface is a surface of constant orthometric height.

It is known from physical geodesy that the system of orthometric heights does provide unique and physically defined heights, but that surfaces of constant orthometric height are not equipotential surfaces. That means that even with error-free satellite altimetry and with perfect geoid knowledge the orthometric heights from H = h − N will not be constant over a lake. It also means that satellite-altimetric lake surfaces cannot be used to improve geoids without further precautions.

Although these effects are conceptually known from physical geodesy, it seems that the satellite altimetric literature largely ignores them, presumably because they are small. In this contribution we will formulate the orthometric height variation analytically and quantify its effects numerically.

# **2 Orthometric Height**

The physical height of a surface point P is fundamentally defined through the geopotential number C_P, which is the gravity potential difference between the geoid and the surface (Heiskanen and Moritz 1967, §4-4):

$$C_P = W_0 - W_P \,,$$

The potential difference is obtained by the work integral (per unit of mass) in the gravity field g from geoid to point P:

$$C_P = \int_P^{P_0} \mathbf{g} \cdot \mathrm{d}\mathbf{r} \,,$$

which is a general path integral, typically evaluated over the surface of the Earth. In practice it is discretized as Σᵢ gᵢ δlᵢ, i.e. the sum over the products of leveling increments δlᵢ and surface gravity gᵢ along the leveling line. The geopotential number can conceptually also be evaluated along the plumbline between the surface point P and its footpoint P₀ on the geoid. With g being tangent to the local plumbline, the scalar product of the two vectors under the integral reduces to a product of two scalars:

$$C_P = \int_P^{P_0} \mathbf{g} \cdot \mathrm{d}\mathbf{r} = -\int_P^{P_0} g \, \mathrm{d}H = \int_{P_0}^{P} g \, \mathrm{d}H \,.$$

By simultaneously multiplying *and* dividing by the length H_P of this stretch of plumbline, one arrives at:

$$C_P = H_P \left[ \frac{1}{H_P} \int_{P_0}^{P} g \, \mathrm{d}H \right] = H_P \, \bar{g}_P \,,$$

in which ḡ_P denotes the average gravity along the plumbline between surface point P and the geoid. Finally, the above equation is recast into the definition of orthometric height:

$$H\_P = \frac{C\_P}{\bar{\mathbf{g}}\_P} \,, \tag{1}$$

i.e. the height of surface point P above the geoid, measured along the curved plumbline. Although the orthometric height constitutes a clear and physical definition of height above the geoid, two weaknesses are pointed out in the geodetic literature:


One weakness is that the mean gravity ḡ_P along the plumbline cannot be observed directly. It is conventionally approximated by the Poincaré-Prey reduction in 3 steps: (a) removing a Bouguer plate of thickness ½H_P, (b) going down to the mid-point along the plumbline in free air, and (c) restoring the Bouguer plate, which presupposes a constant mass density along the plumbline. A reduction of the approximation error may be achieved if one uses a better topographic model and sub-surface density information.

These weaknesses are physical geodesy textbook material and will not concern us in this contribution.
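The chain of Sect. 2, geopotential number by discretised leveling and then Eq. (1), can be sketched numerically; the leveling increments, surface gravity and mean plumbline gravity below are synthetic, illustrative values only.

```python
# Geopotential number from discretised levelling, C_P = sum_i g_i * dl_i,
# then orthometric height via Eq. (1), H_P = C_P / gbar_P.

def orthometric_height(g_along_line, dl, g_bar):
    C = sum(gi * dli for gi, dli in zip(g_along_line, dl))   # geopotential number
    return C / g_bar

# 2000 levelling increments of 0.5 m with surface gravity 9.806 m/s^2;
# mean gravity along the plumbline assumed to be 9.807 m/s^2.
n = 2000
H = orthometric_height([9.806] * n, [0.5] * n, g_bar=9.807)
print(round(H, 3))   # slightly below the levelled 1000 m
```

The tiny deficit with respect to the raw levelled sum comes entirely from ḡ_P exceeding the gravity sampled along the leveling line in this toy setting.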

# **3 Orthometric Height Variation at Lake Surface**

A further consequence of plumblines being curved does concern us here, though. With equipotential surfaces being perpendicular to the plumbline, plumbline curvature will necessarily lead to equipotential surfaces at altitude not being parallel to the geoid. In other words, different points on the equipotential surface at altitude have different distances to the geoid, i.e. different orthometric heights. Hence, an equipotential surface is fundamentally *not* a surface of constant orthometric height. The only exception is the geoid itself, with a constant H = 0.

This geometric description becomes mathematically obvious in definition (1): the geopotential number will be constant at the equipotential surface, but the mean gravity will spatially vary, depending on surrounding topography or on lake bathymetry or on density variation in the crust. If we identify the surface of a lake or reservoir as an equipotential surface, we thus must conclude that the orthometric height of the lake surface is spatially variable. The main assumption made here is that the lake or reservoir is at rest. Any wind stress or dynamic lake topography would violate the assumption.

Consider now the difference in orthometric height between two points, P and Q, on the lake surface:

$$
\Delta H_{PQ} = H_Q - H_P = \frac{C_Q}{\bar{g}_Q} - \frac{C_P}{\bar{g}_P} \,.
$$

With some manipulation, and setting C_Q = C_P,

$$
\Delta H_{PQ} = \frac{\bar{g}_P}{\bar{g}_P} \frac{C_Q}{\bar{g}_Q} - \frac{C_P}{\bar{g}_P} \frac{\bar{g}_Q}{\bar{g}_Q} = \frac{\bar{g}_P - \bar{g}_Q}{\bar{g}_Q} \, \frac{C_P}{\bar{g}_P} \,,
$$

we arrive at

$$
\Delta H_{PQ} = \frac{\bar{g}_P - \bar{g}_Q}{\bar{g}_Q} H_P \approx \frac{g_P - g_Q}{g_Q} H_P \,. \tag{2}
$$

In the latter approximation the mean gravity values along the plumbline are replaced by their surface values. Appendix 1 justifies this step numerically.

The orthometric height variation H_Q − H_P (2) is proportional to the gravity difference g_P − g_Q (in this order). This makes sense, as scalar gravity represents the gradient of the potential along the plumbline. Loosely speaking, it represents the density of equipotential surfaces along the plumbline. The higher the gravity, the denser the level surfaces and, hence, the lower the height. The orthometric height variation is also proportional to the height of the lake itself. This also makes sense: the longer the plumbline, the more damage curvature can do.

A numerical rule-of-thumb can be derived from (2) by setting its denominator to 10 m/s². If the gravity variations are then given in units of mGal and the height in km, the left-hand side comes out in mm. This rule-of-thumb resembles (Heiskanen and Moritz 1967, eqn. (4-34))

$$
\delta H_\mathrm{mm} \doteq \delta \bar{g}_\mathrm{mGal} \, H_\mathrm{km} \,,
$$

although that equation was meant to evaluate the effect of an error δḡ in mean gravity on H.
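The rule-of-thumb can be checked against Eq. (2) directly. The gravity ranges and altitudes below only mimic the case studies of Sect. 4 and are round, illustrative numbers.

```python
MGAL = 1e-5   # 1 mGal in m/s^2

def dH_eq2(g_P, g_Q, H_P):
    """Orthometric height variation, Eq. (2), surface-gravity approximation."""
    return (g_P - g_Q) / g_Q * H_P

g_Q = 9.80   # reference surface gravity [m/s^2]

# Lake-Michigan-like numbers: ~350 mGal gravity range at ~140 m altitude.
dH1 = dH_eq2(g_Q + 350 * MGAL, g_Q, 140.0)
print(round(dH1 * 1000, 1))   # ~50 mm; rule of thumb: 350 mGal * 0.14 km = 49 mm

# Salar-de-Uyuni-like numbers: ~80 mGal range at ~3700 m altitude.
dH2 = dH_eq2(g_Q + 80 * MGAL, g_Q, 3700.0)
print(round(dH2 * 100, 1))    # ~30 cm; rule of thumb: 80 mGal * 3.7 km = 296 mm
```

Both results agree with the rule-of-thumb to within a few percent, the residual being the difference between the actual denominator and the rounded 10 m/s².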

# **4 Quantification: Case Studies**

In order to quantify how serious the orthometric height variation can get, we will have a look at a number of case studies of lakes with either a large gravity variation at the surface or a large altitude. The number of case studies is limited, however, because gravimetry over lakes is rare. And if data exist, their availability may be restricted.

It would not make sense to synthesize lake surface gravity from global geopotential models if real lake gravimetry did not enter such models. In such cases the global models would act as interpolators, smoothing the gravity field over the lake while underestimating the gravity variation.

The "lake" with the potentially largest gravity variation would be the global ocean. Despite a considerable gravity range of about 500 mGal, the oceans, being the embodiment of the geoid, have orthometric heights very close to zero. Therefore the orthometric height variation is near-zero, too.

# **4.1 Lake Vänern, Sweden**

Lake Vänern has a surface area of 5650 km². At a surface elevation of just 44 m one cannot expect a large orthometric height variation, but the high quality of the gravity data makes for an interesting case study nonetheless. Lantmäteriet, the national mapping, cadastral and land registration authority of Sweden, performed gravimetry over the ice by hovercraft in 2011 while the lake was frozen.

With a gravity range of about 50 mGal and a surface altitude of just 44 m the range of orthometric height variation is only 2 mm (Fig. 1). In Eq. (2) the points P and Q can

**Fig. 1** Gravity (top) and corresponding orthometric height variation (bottom) over Lake Vänern

be chosen freely. Here Q represents any of the surface points, whereas P is selected such that the height variation is centered. Although the orthometric height variation is negligible in the context of satellite altimetry, the case study demonstrates that the flattening alone explains the North-South gravity variation to a large extent. Consequently, the orthometric height variations can be approximated partially by the normal gravity already. For Lake Vänern this would amount to about 60% of the full effect (not shown here).
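The flattening contribution noted above can be reproduced with the Somigliana closed formula for normal gravity on the GRS80 ellipsoid; the two latitudes and the 44 m surface height below are illustrative choices for a Vänern-sized lake, not values from the text.

```python
import math

# GRS80 normal gravity via the Somigliana closed formula (standard constants).
GAMMA_A = 9.7803267715      # equatorial normal gravity [m/s^2]
SOMIG_K = 0.001931851353    # Somigliana constant k
E2      = 0.00669438002290  # first eccentricity squared

def normal_gravity(lat_deg):
    s2 = math.sin(math.radians(lat_deg)) ** 2
    return GAMMA_A * (1 + SOMIG_K * s2) / math.sqrt(1 - E2 * s2)

# Two latitudes ~0.5 deg apart across the lake:
gam_s, gam_n = normal_gravity(58.5), normal_gravity(59.0)
dgamma_mgal = (gam_n - gam_s) / 1e-5           # ~40 mGal of purely normal variation
dH_mm = (gam_n - gam_s) / gam_s * 44.0 * 1e3   # flattening-only height variation, H = 44 m
print(round(dgamma_mgal, 1), round(dH_mm, 2))  # ~40 mGal -> ~1.8 mm
```

The flattening-only height variation of roughly 2 mm is consistent with the total range reported for Lake Vänern, in line with the statement that normal gravity explains a large share of the effect.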

# **4.2 Lake Michigan, USA**

The GRAV-D project of NOAA/NGS, the US National Geodetic Survey, provides a wealth of airborne gravity data.

**Fig. 2** Gravity (top) and corresponding orthometric height variation (bottom) over Lake Michigan

Over the wider area of the Great Lakes the data have been downward continued by Li et al. (2016). Lake Michigan was selected here as a case study because of its North-South extension of nearly 500 km (Fig. 2).

Due to its sheer size one can expect Lake Michigan to have a considerable gravity variation. Indeed, it shows a range of about 350 mGal. Even after subtracting the normal field at the surface, the gravity disturbances still show a range of 80 mGal (not shown here). Despite a moderate surface

**Fig. 3** Gravity (top) and corresponding orthometric height variation (bottom) over Issyk Kul

altitude of about 140 m the large gravity range guarantees an orthometric height variation of the lake surface of about 5 cm, a level that starts to become interesting in radar altimetry, and that certainly is relevant to laser altimetry.


**Fig. 4** Gravity (top) and corresponding orthometric height variation (bottom) over Salar de Uyuni

# **4.3 Issyk Kul, Kyrgyzstan**

With a surface elevation of about 1600 m, a length of 178 km and surrounded by mountain ranges, Issyk Kul might have been an interesting case study. As an endorheic lake Issyk Kul would not suffer from drainage-related surface slope and would suitably fulfil the condition of being an equipotential surface. However, no observed gravity data is available. The example is used here to demonstrate the use of gravity data synthesized from a global geopotential model. Since no lake gravimetry was ingested into the global geopotential model EGM2008 used here, the gravity variation over the lake will be too smooth. Moreover it shows interpolation artefacts. As a result the orthometric height variation will be underestimated and will also contain artefacts (Fig. 3).

# **4.4 Salar de Uyuni, Bolivia**

The Salar de Uyuni is not a lake; it is a salt flat high up in the Andes at an altitude of 3700 m. Occasional rainfall creates a thin layer of water that dissolves the top salt layer and restores topographical deformations back to an equipotential surface before it evaporates, cf. Borsa et al. (2008a). Gravimetry was part of geophysical prospecting in the 1970s (Cady and Wise 1992). Borsa et al. (2008b) assessed the equipotential surface properties of this salt flat.

The gravity variation range of about 80 mGal is less impressive than in the case study of Lake Michigan. However, the high altitude creates a large orthometric height variation in the range of 3 dm (Fig. 4).

# **5 Conclusions and Outlook**

Since a water surface at rest conforms to the local gravity field potential, a lake surface will be an equipotential surface. As a consequence it cannot be a surface of constant orthometric height. We here formulated the orthometric height variation across the lake surface. A rule-of-thumb says that its effect (in units of mm) is calculated as the amount of gravity variation at the surface (in mGal) times the overall orthometric height of the lake surface (in km). For several case studies we have shown that the orthometric height variation can amount to 2–3 dm for lakes at high altitudes. However, for most lakes worldwide the orthometric height variation effect will be around the mm- to cm-level, i.e. hardly relevant in the context of satellite radar altimetry.

Because the North-South gravity variation is to a large extent due to the flattening of the Earth, the orthometric height variation can be modeled to that extent accordingly. In the case studies shown here the flattening-induced orthometric height variation explains roughly 50–60% of the total effect, the remainder being due to density-induced gravity anomalies.

The database of globally available lake gravity is poor. Using gravity information derived from global geopotential models will not be an alternative as these models lack spatial resolution and will underestimate gravity variation over the lake, if lake gravimetry has not been ingested into the model. A further alternative, namely deriving gravity information from satellite altimetry itself, a well-established technique over the open ocean and big lakes, was not part of this study, and must be the object of further exploration.

The orthometric height variation at the lake surface is a geometric expression of the fact that equipotential surfaces are not parallel to each other. Therefore, as a word of warning to geoid modelers: it would be a mistake to use a geometrically derived lake surface, e.g. from laser altimetry, as a proxy for the geoid in areas where gravity information is sparse. The surface of the lake at rest is simply not parallel to the geoid.

**Acknowledgements** Lake gravity data have been generously provided by Per-Anders Olsson and his colleagues at Lantmäteriet for Lake Vänern, by Xiaopeng Li at the National Geodetic Survey of NOAA for the Great Lakes, and by Adrian Borsa at the Scripps Institution of Oceanography of UC San Diego for the Salar de Uyuni. The help of Frieder Schmid from the University of Stuttgart with the data visualizations is greatly appreciated.

# **Appendix 1: Approximation**

In Eq. (2) the mean gravity ḡ_P was replaced by surface gravity g_P. It will be shown here that this approximation is allowed when comparing with Poincaré-Prey reduced gravity, which is the next-best approximation of mean gravity. In Sect. 2 this reduction was explained by the 3 steps: (a) removing a Bouguer plate of thickness ½H_P, (b) going down to the mid-point along the plumbline in free air, and (c) restoring the Bouguer plate. This leads to the approximation:

$$\begin{aligned} \bar{g}_P &= g(\tfrac{1}{2}H_P) \\ &= g_P - \mathrm{BO}\cdot\tfrac{1}{2}H_P + \mathrm{FA}\cdot\tfrac{1}{2}H_P - \mathrm{BO}\cdot\tfrac{1}{2}H_P \\ &= g_P + \left(\tfrac{1}{2}\mathrm{FA} - \mathrm{BO}\right) H_P \\ &= g_P + \mathrm{PP}\cdot H_P \,. \end{aligned}$$

The conventional value for the free-air gradient is FA = 0.3086 mGal/m. For the crustal density one conventionally takes ρ = 2670 kg/m³, leading to a Bouguer gradient of BO = 0.1119 mGal/m. Hence the Poincaré-Prey gradient becomes

$$\mathrm{PP} = 0.0424\ \mathrm{mGal/m} = 4.24 \cdot 10^{-7}\ \mathrm{s}^{-2} = 424\ \mathrm{E} \,.$$
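The Bouguer gradient follows from the plate formula 2πGρ, so the Poincaré-Prey gradient PP = ½FA − BO can be reproduced from the constants quoted above; the small discrepancy in the last digit comes only from the rounding of G.

```python
import math

G   = 6.674e-11   # gravitational constant [m^3 kg^-1 s^-2]
RHO = 2670.0      # conventional crustal density [kg/m^3]
FA  = 0.3086      # conventional free-air gradient [mGal/m]

BO = 2 * math.pi * G * RHO / 1e-5   # Bouguer plate gradient 2*pi*G*rho, in mGal/m
PP = 0.5 * FA - BO                  # Poincare-Prey gradient [mGal/m]

print(round(BO, 4), round(PP, 4))   # ~0.112 and ~0.0423, i.e. the conventional
                                    # 0.1119 and 0.0424 up to rounding of G
```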

Let us see now how this small correction gradient affects the orthometric height variation (2).

$$
\Delta H_{PQ} = \frac{\bar{g}_P - \bar{g}_Q}{\bar{g}_Q} H_P \approx \frac{g_P - g_Q - \mathrm{PP}\cdot\Delta H_{PQ}}{g_Q + \mathrm{PP}\cdot H_Q} H_P \,. \tag{3}
$$

The orthometric height variation ΔH_PQ has been shown to be at mm- to cm-level for most lakes. Only in extreme cases like the Salar de Uyuni does it become a few dm. After multiplication with PP a very small number remains that does little to change the numerator. In the denominator PP is multiplied by the lake height itself. But even if the lake is situated at a few km altitude, the resulting correction is still small relative to the full gravity value.
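This argument can be verified numerically; the gravity difference and altitude below mimic the Salar de Uyuni case and are illustrative values, not data from the text.

```python
PP = 0.0424e-5   # Poincare-Prey gradient [s^-2], i.e. 0.0424 mGal/m

def dH_surface(g_P, g_Q, H_P):          # Eq. (2): surface gravity only
    return (g_P - g_Q) / g_Q * H_P

def dH_prey(g_P, g_Q, H_P, H_Q, dH):    # Eq. (3): with the Prey refinement
    return (g_P - g_Q - PP * dH) / (g_Q + PP * H_Q) * H_P

# ~80 mGal gravity difference at ~3700 m altitude:
g_Q, H = 9.80, 3700.0
g_P = g_Q + 80e-5
d2 = dH_surface(g_P, g_Q, H)
d3 = dH_prey(g_P, g_Q, H, H, d2)        # one iteration, seeded with Eq. (2)
print(round(d2, 4), round(d3, 4))       # the two agree to about 0.1 mm
```

Even in this extreme case the Prey refinement shifts the ~3 dm result by only about a tenth of a millimetre, which is why Eq. (2) suffices throughout.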

Of course, the Poincaré-Prey reduction is a first approximation to determining the mean gravity along the plumbline. An improved approximation would require topographic, bathymetric and density information. It is believed that such a refinement will not numerically change the presented results. Hence, the approximation of Eq. (2) seems more than sufficient.

# **Appendix 2: Normal Height Variation**

Along the same lines as above it can be derived how much the normal height Hⁿ will vary along a lake surface:

$$
\Delta H^{\mathrm{n}}_{PQ} = \frac{\bar{\gamma}_P - \bar{\gamma}_Q}{\bar{\gamma}_Q} H^{\mathrm{n}}_P \,. \tag{4}
$$

The mean normal gravity along the plumbline in the normal field is measured from the ellipsoid footpoint P0 upwards:

$$\bar{\gamma}_P = \gamma(\tfrac{1}{2}H^{\mathrm{n}}_P) = \gamma_{P_0} - \tfrac{1}{2}\mathrm{FA}\cdot H^{\mathrm{n}}_P \,,$$

Inserting this mean normal gravity for P and Q into Eq. (4) leads to a similar discussion as above. Despite the fact that the free-air gradient FA is numerically larger than PP, we conclude that such a refinement can be neglected and that the normal height variation at the lake surface can be approximated by

$$
\Delta H^{\mathrm{n}}_{PQ} \approx \frac{\gamma_{P_0} - \gamma_{Q_0}}{\gamma_{Q_0}} H^{\mathrm{n}}_P \,, \tag{5}
$$

$$
\Delta H^{\mathrm{n}}_{PQ} \approx \frac{\gamma_P - \gamma_Q}{\gamma_Q} H^{\mathrm{n}}_P \,. \tag{6}
$$

The former version makes use of normal gravity at the footpoint on the ellipsoid. The latter version, which uses normal gravity at the lake surface, is exactly the part of the orthometric height variation due to flattening. Thus, it can be predicted by the normal gravity variation with latitude. Alternatively, one can resort to formulas for γ as provided in Heiskanen and Moritz (1967, §4-5).

# **References**

Bergé-Nguyen M, Cretaux JF, Calmant S, et al (2021) Mapping mean lake surface from satellite altimetry and GPS kinematic surveys. Adv Space Res 67(3):985–1001. https://doi.org/10.1016/j.asr.2020. 11.001



# **The Uncertainties of the Topographical Density Variations in View of a Sub-Centimetre Geoid**

Ismael Foroughi, Mehdi Goli, Spiros Pagiatakis, Stephen Ferguson, Petr Vaníček, Marcelo Santos, and Michael Sheng

#### **Abstract**

We estimate the uncertainty of the modelled geoid heights based on the standard deviations of the topographic mass density variation. We model the geoid using the one-step integration method considering mass density variations along with their associated error estimates to calculate the direct and indirect topographic density effects on the geoid heights in the Helmert space. We employ the *UNB\_TopoDensT\_2v01* global lateral density model and its standard deviations and test our algorithms in the Auvergne test area, in central France. Our results show that the topographic mass density variations are currently known well enough to model the geoid with sub-centimetre internal error in topographically mild regions such as Auvergne.

#### **Keywords**

Density variation · Geoid error · Gravimetric inversion · One-step integration

# **1 Introduction**

Regional gravimetric geoid models are solutions to the Geodetic Boundary Value Problems (GBVPs). The GBVPs are solved in a fictitious harmonic space where there is no topographic mass above the boundary surface (geoid or reference ellipsoid) on which various gravity functionals (e.g., gravity anomalies or gravity disturbances) furnish the boundary values. Consequently, the solutions of the GBVP require that the topographic mass elevation and density be available to compute its gravitational attraction on the points

Department of Earth and Space Science and Engineering, York University, Toronto, ON, Canada e-mail: i.foroughi@unb.ca

M. Goli Shahrood University of Technology, Shahrud, Iran

S. Ferguson Sander Geophysics Ltd., Ottawa, ON, Canada

P. Vaníček · M. Santos · M. Sheng Department of Geodesy and Geomatics Engineering, University of New Brunswick, Fredericton, NB, Canada

where gravity is observed, prior to attempting their removal and the calculation of the boundary values (Heiskanen and Moritz 1967, Ch. 3). The Helmert second condensation method provides a mechanism to remove the mass above the boundary surface by condensing the topographic mass onto a very thin layer (Vaníček and Martinec 1994). The gravitational attraction of the topography at the observation points is used to reduce the gravity measurements down to the boundary surface through an empty (harmonic) space. This downward continuation is possible via the calculation of the Direct Topographic Effect (DTE) and the Secondary Indirect Topographic Effect (SITE). To transfer back to the real space in the presence of the topography, the Primary Indirect Topographic Effect (PITE) is used.

In the past, due to lack of knowledge of the variable topographic mass density, a constant density value of 2670 kg m⁻³ has exclusively been used to compute the topographic effects mentioned above. The departure of the actual density value from this constant value, hereafter called *anomalous density*, and its impact on the geoid heights was first investigated at the University of New Brunswick by Martinec (1993) and then implemented over the years by Fraser et al. (1998), Pagiatakis et al. (1999), and Huang et al. (2001). It was shown that this effect may reach the decimetre level, which is

© The Author(s) 2023

I. Foroughi (-) · S. Pagiatakis

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_189

far from the ever-growing need for a sub-centimetre accurate geoid (Huang et al. 2001; Foroughi et al. 2017; Janák et al. 2018; Tenzer et al. 2021). The anomalous density is a 3D function; however, it has been shown that the lateral anomalous density is dominant compared to its radial counterpart in the topographic reduction calculations (e.g., DTE, SITE, PITE) and, consequently, it is important in geoid modelling (Kingdon et al. 2009). When the topographic reductions are computed using the anomalous density, they are called the direct density effect (DDE), secondary indirect density effect (SIDE), and primary indirect density effect (PIDE), respectively (Huang et al. 2001).

The *UNB\_TopoDensT\_2v01* is a global model of the variable topographic density considered as a function of the horizontal coordinates; it provides three different resolutions, namely 30″ × 30″, 5′ × 5′, and 1° × 1°, along with their corresponding standard deviations (STDs). This model provides valuable information for the computation of the contribution of the anomalous density to the geoid heights and the estimation of their uncertainties (internal error). The contribution of the anomalous density in regional geoid modelling has been well studied before, see for example (Hunegnaw 2001; Kuhn 2002; Sjöberg 2004; Kiamehr 2006; Sjöberg and Bagherbandi 2011; Chaves and Ussami 2013; Janák et al. 2018; Albarici et al. 2019; Vajda et al. 2020; Lin and Li 2022). However, the effect of the standard deviation of the anomalous density on the internal error of the geoid heights has not (or only partially) been addressed in the literature, see e.g., Huang et al. (2001) for Canada, Foroughi et al. (2019) for the Auvergne test case, and Foroughi et al. (2023) for the Colorado 1-cm geoid experiment.

The internal error estimate of the geoid heights in the Auvergne test area was performed by Foroughi et al. (2019). In their study, the STD of the DDE on the geoid heights was disregarded and only the Bouguer shell was considered for the error propagation of the PIDE. In this contribution we aim to address two missing parts, namely, (a) investigate whether our internal error estimate of the geoid heights in Auvergne still stays below one centimetre when considering the complete terms of the PIDE for error propagation, and (b) include the errors of the DDE on the geoid heights. In this study, we use the one-step integration method for the geoid determination, which is a combination of the inverse Poisson integral and the Stokes or Hotine integral transform (Novák 2003; Goli et al. 2019b). This method provides results equivalent to the two-step geoid determination method developed by the gravity research group at the University of New Brunswick (Vaníček and Martinec 1994). We apply our formulations in the Auvergne test area, which has been used by other researchers for geoid determination using different methodologies (Duquenne 2007; Goyal et al. 2021).

#### 28 I. Foroughi et al.

# **2 Theory**

Using the one-step integration method (Novák 2003), the regional geoid heights are computed as:

$$N = \mathbf{D}\left[\delta g + \delta g^t - \left(\delta g_L + \delta g_L^t\right)\right] + \left(N_L + \delta N_L^t\right) + \delta N^t,\tag{1}$$

where $\delta g$ is the observed gravity disturbance (usually predicted at grid points of various grid sizes for numerical simplicity); $\delta g^t$ and $\delta N^t$ are the DTE and PITE, respectively; $\delta g_L$ and $N_L$ are the long-wavelength (reference) parts of the gravity disturbances and the geoid heights computed using an Earth gravity model (EGM) up to a maximum degree/order $L$; $\delta g_L^t$ and $\delta N_L^t$ are the computed long-wavelength components of the DTE and PITE, obtained using the spherical harmonic coefficients of the topography or by numerical integration over a reference topography; and $\mathbf{D}$ stands for the one-step inverse operator. Our focus in this study is on the contribution of the topographic reductions to the geoid heights (and their STDs) when the anomalous density rather than a constant mean density is used. Neglecting the density variations in the reference DTE and PITE ($\delta g_L^t$ and $\delta N_L^t$), we limit our formulations to:

$$N^t = \mathbf{D}\left[\delta g^t\right] + \delta N^t.\tag{2}$$

Note that there are other small terms in Eq. (1) that we neglect here, such as the ellipsoidal corrections and the atmospheric effects, because of their small contributions and irrelevance to the topic of this study. We also did not include the SITE here since we are using gravity disturbances as opposed to gravity anomalies, so the SITE term is not required; please see Vaníček and Martinec (1994) and Janák et al. (2018) for further details.

As mentioned above, according to the Helmert second condensation method, the topography above the boundary surface (here the spherical approximation of the reference ellipsoid) is condensed to a thin layer on the same surface. The topographic reductions in Helmert space are applied using the residual gravitational potential ($\delta V$), defined as the difference between the gravitational potential of the real ($V^T$) and of the condensed topography ($V^C$), i.e.,

$$
\delta V = V^T - V^C. \tag{3}
$$

Using the Bruns formula in Eq. (3), the PITE reads:

$$
\delta N^t = \frac{\delta V}{\gamma_0},\tag{4}
$$

where $\gamma_0$ is the normal gravity computed on the reference ellipsoid. The DTE on the gravity disturbances is computed


**Table 1** Definition of the kernels and spatial distances in the Helmert topographic corrections

by the radial derivative of $\delta V$, i.e.,

$$
\delta g^t = \frac{\partial\, \delta V}{\partial r}.\tag{5}
$$

Computation of $\delta V$ is achieved using the following equation (Martinec 1998, Eqs. (3.3) and (3.4)):

$$\begin{split} \delta V(r,\Omega) &= G\iint_{\Omega'} \rho(\Omega')\int_{r'=R}^{R+H(\Omega')} \frac{r'^2}{\mathcal{L}(r,\psi,r')}\,\mathrm{d}r'\,\mathrm{d}\Omega' \\ &\quad - G R^2\iint_{\Omega'}\frac{\sigma(\Omega')}{\mathcal{L}(r,\psi,R)}\,\mathrm{d}\Omega', \end{split}\tag{6}$$

where $R$ is the mean radius of the Earth; $\Omega$ and $\Omega'$ denote the pairs of spherical latitude ($\varphi$) and longitude ($\lambda$) at the computation and integration points; $r$ and $r'$ are the radii at the computation and integration points, respectively; $G$ is the gravitational constant; $\psi$ and $\mathcal{L}$ are the angular and spatial distances between the two points; and $\rho$ and $\sigma$ denote the topographic and condensed mass densities, respectively. As mentioned above, the constant density value of $\rho_0 = 2670\ \mathrm{kg\,m^{-3}}$ is used when computing topographic reductions in Helmert space, and the anomalous topographic density is limited to only the lateral variations ($\delta\rho(\Omega)$), i.e.,

$$
\rho(\Omega) = \rho\_0 + \delta\rho(\Omega) \,. \tag{7}
$$

Inserting Eq. (7) into Eq. (2) and keeping only the terms with density variation results in the total effect of the anomalous density on the geoid heights:

$$N^t_{\delta\rho} = \mathbf{D}\left[\delta g^t_{\delta\rho}\right] + \delta N^t_{\delta\rho}.\tag{8}$$

The DDE on gravity measurements can be written as (Martinec 1998, Eq. (6.7))

$$\begin{split} \delta g^t_{\delta\rho}(r,\Omega) &= G\iint_{\Omega'}\Bigg[\delta\rho(\Omega')\,\frac{\partial\widetilde{\mathcal{L}^{-1}}(r,\psi,r')}{\partial r}\bigg|_{r'=R}^{R+H(\Omega')} \\ &\quad -\delta\rho(\Omega)\,\frac{\partial\widetilde{\mathcal{L}^{-1}}(r,\psi,r')}{\partial r}\bigg|_{r'=R}^{R+H(\Omega)} \\ &\quad -R^2\left[\delta\sigma(\Omega')-\delta\sigma(\Omega)\right]\frac{\partial\widetilde{\mathcal{L}^{-1}}(r,\psi,R)}{\partial r}\Bigg]\,\mathrm{d}\Omega' \end{split}\tag{9}$$

and the PIDE on the geoid is (ibid, Eq. (6.10)):

$$\begin{split} \delta N^t_{\delta\rho}(r,\Omega) &= -\frac{2\pi G}{\gamma_0}\delta\rho(\Omega)\,H^2(\Omega)\left[1+\frac{2}{3}\frac{H(\Omega)}{R}\right] \\ &\quad +\frac{G}{\gamma_0}\iint_{\Omega'}\Bigg[\delta\rho(\Omega')\,\widetilde{\mathcal{L}^{-1}}(R,\psi,r')\Big|_{r'=R}^{R+H(\Omega')} \\ &\quad -\delta\rho(\Omega)\,\widetilde{\mathcal{L}^{-1}}(R,\psi,r')\Big|_{r'=R}^{R+H(\Omega)} \\ &\quad -R^2\,\frac{\delta\sigma(\Omega')-\delta\sigma(\Omega)}{\mathcal{L}(R,\psi)}\Bigg]\,\mathrm{d}\Omega', \end{split}\tag{10}$$

where $\delta\sigma(\Omega)$ is the anomalous mass density of the condensed layer (ibid, Eq. (6.4)):

$$\delta\sigma\left(\Omega\right) = \delta\rho\left(\Omega\right) \left[ H\left(\Omega\right) \left(1 + \frac{H\left(\Omega\right)}{R} + \frac{H^2\left(\Omega\right)}{3R^2}\right) \right],\tag{11}$$

where *H* is the height of the points and the kernels and spatial distances are defined in Table 1.
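To make Eq. (11) and the Bouguer-shell (first) term of Eq. (10) concrete, here is a minimal numerical sketch. This is not the authors' code; the nominal values of $G$, $R$ and $\gamma_0$, and the sample anomaly, are our own assumptions.

```python
import math

G = 6.674e-11        # gravitational constant [m^3 kg^-1 s^-2] (nominal)
R = 6371000.0        # mean Earth radius [m] (nominal)
gamma0 = 9.81        # normal gravity on the ellipsoid [m s^-2] (nominal)

def condensed_anomalous_density(drho, H):
    """Anomalous surface density of the condensed layer, Eq. (11) [kg m^-2]."""
    return drho * H * (1.0 + H / R + H**2 / (3.0 * R**2))

def pide_shell_term(drho, H):
    """Bouguer-shell (first) term of the PIDE, Eq. (10) [m]."""
    return -2.0 * math.pi * G / gamma0 * drho * H**2 * (1.0 + 2.0 * H / (3.0 * R))

# e.g. a 20 kg m^-3 density anomaly under 1500 m of topography
dsig = condensed_anomalous_density(20.0, 1500.0)   # ~3.0e4 kg m^-2
dn = pide_shell_term(20.0, 1500.0)                 # roughly -2 mm on the geoid
```

The shell term alone already shows the millimetre-level sensitivity of the geoid to tens of kg m⁻³ of lateral density anomaly under moderate topography.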

By propagating the error of the anomalous topographic density ($\delta\rho$) in Eq. (9), and by neglecting the correlation between the two terms, the uncertainties of the DDE are

$$\begin{split} s^2_{\delta g^t_{\delta\rho}}(r,\Omega) &= G^2\iint_{\Omega'}\Big[\left(\mathcal{M}(r,\psi,r')\right)^2 s^2_{\delta\rho}(\Omega') \\ &\quad +\left(\mathcal{N}(r,\psi,r')\right)^2 s^2_{\delta\rho}(\Omega)\Big]\,\mathrm{d}\Omega'^2, \end{split}\tag{12}$$

where

$$\begin{split} \mathcal{M}(r,\psi,r') &= \frac{\partial\widetilde{\mathcal{L}^{-1}}(r,\psi,r')}{\partial r}\bigg|_{r'=R}^{R+H(\Omega')} - R^2\,\delta\tau(\Omega')\,\frac{\partial\mathcal{L}^{-1}(r,\psi,R)}{\partial r}, \\ \mathcal{N}(r,\psi,r') &= -\frac{\partial\widetilde{\mathcal{L}^{-1}}(r,\psi,r')}{\partial r}\bigg|_{r'=R}^{R+H(\Omega)} + R^2\,\delta\tau(\Omega)\,\frac{\partial\mathcal{L}^{-1}(r,\psi,R)}{\partial r}, \end{split}$$

where $\delta\tau(\Omega) = H(\Omega)\left[1+\frac{H(\Omega)}{R}+\frac{H^2(\Omega)}{3R^2}\right]$ and $s$ denotes the standard deviation. Similarly to the DDE, the error of the PIDE can be obtained by applying error propagation to Eq. (10):

$$\begin{split} s^2_{\delta N^t_{\delta\rho}}(r,\Omega) &= \left(\frac{G}{\gamma_0}\right)^2\iint_{\Omega'}\Big[\left(\mathcal{P}(r,\psi,r')\right)^2 s^2_{\delta\rho}(\Omega') \\ &\quad +\left(\mathcal{R}(r,\psi,r')\right)^2 s^2_{\delta\rho}(\Omega)\Big]\,\mathrm{d}\Omega'^2, \end{split}\tag{13}$$

where

$$\begin{split} \mathcal{P}(r,\psi,r') &= \widetilde{\mathcal{L}^{-1}}(R,\psi,r')\Big|_{r'=R}^{R+H(\Omega')} - R^2\,\frac{\delta\tau(\Omega')}{\mathcal{L}(R,\psi)}, \\ \mathcal{R}(r,\psi,r') &= -2\pi H^2(\Omega)\left[1+\frac{2}{3}\frac{H(\Omega)}{R}\right] \\ &\quad -\widetilde{\mathcal{L}^{-1}}(R,\psi,r')\Big|_{r'=R}^{R+H(\Omega)} + R^2\,\frac{\delta\tau(\Omega)}{\mathcal{L}(R,\psi)}. \end{split}$$
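Because the density covariances are unavailable, Eqs. (12) and (13) are in practice evaluated by summing squared kernel values times squared density STDs over the discretized integration cells. A hedged sketch of that discretization follows; the function name and arguments are our own, not from the paper.

```python
import numpy as np

G = 6.674e-11  # gravitational constant [m^3 kg^-1 s^-2] (nominal)

def propagate_density_std(kernel_M, kernel_N, std_drho_far, std_drho_local, dOmega):
    """Discretized variance propagation in the spirit of Eq. (12):
    sum squared kernel values times squared density STDs over integration
    cells, with all correlations neglected (diagonal covariance only).

    kernel_M, kernel_N : kernel values per integration cell (arrays)
    std_drho_far       : density STD at the integration cells [kg m^-3]
    std_drho_local     : density STD at the computation point [kg m^-3]
    dOmega             : solid-angle area of one integration cell [sr]
    Returns the STD of the DDE at the computation point.
    """
    var = G**2 * np.sum((kernel_M**2 * np.asarray(std_drho_far)**2
                         + kernel_N**2 * std_drho_local**2) * dOmega**2)
    return np.sqrt(var)
```

The PIDE variance of Eq. (13) follows the same pattern with the $\mathcal{P}$, $\mathcal{R}$ kernels and the prefactor $G/\gamma_0$.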

The one-step integration method is a combination of the Hotine integral transform and the inverse of the Poisson integral equation (Novák 2003). The integration kernel of the one-step method relates the disturbing potential at the boundary level to the gravity disturbances above this surface. Given its inverse operator and the numerical instability of the downward continuation, the solution of this method is achieved iteratively (Goli et al. 2019b). Returning to our main aim, we intend to propagate the anomalous topographic density error in Eq. (8). Error propagation in iterative techniques is not a simple task; therefore, we seek an alternative regularized solution using classic Tikhonov regularization. Neglecting the correlation between the two terms of Eq. (8), the generalized formula of the geoid internal error due to the errors of the topographic density variation reads:

$$
\mathbf{C}_{N^t_{\delta\rho}} = \mathbf{B}^T\mathbf{C}_{\delta g^t_{\delta\rho}}\mathbf{B} + \mathbf{C}_{\delta N^t_{\delta\rho}},\qquad \mathbf{B} = \left(\mathbf{D}^T\mathbf{D}+\mu\,\mathbf{I}\right)^{-1}\mathbf{D}^T,\tag{14}
$$

where $\mathbf{D}$ is a coefficient matrix containing discretized values of the one-step integration kernel (Novák 2003; Goli et al. 2019b), $\mathbf{C}_{\delta g^t_{\delta\rho}}$ and $\mathbf{C}_{\delta N^t_{\delta\rho}}$ are the covariance matrices of the DDE and PIDE, respectively, and $\mu$ is a regularization parameter estimated to provide a solution equivalent to that of the iterative approach. Please see Foroughi et al. (2023) for further details on the estimation of the regularization parameter in the one-step integration method. In the digital density models, only the STDs of the laterally varying density are provided; the correlation between these values is complicated to estimate and therefore not available. Consequently, we assume that $\mathbf{C}_{\delta g^t_{\delta\rho}}$ and $\mathbf{C}_{\delta N^t_{\delta\rho}}$ are diagonal matrices with their diagonal elements computed by discretization of the integrals in Eqs. (12) and (13); the STDs of the anomalous density on the geoid heights thus read:

$$\mathbf{s}^2_{N^t_{\delta\rho}} = \operatorname{diag}\left(\mathbf{C}_{N^t_{\delta\rho}}\right).\tag{15}$$
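A minimal NumPy sketch of Eqs. (14)–(15), written as in the text above. The matrix `D` here is a small stand-in for the discretized one-step kernel, and `mu` would in practice be tuned to match the iterative solution, as described above.

```python
import numpy as np

def geoid_std_from_density(D, var_dde, var_pide, mu):
    """Tikhonov-regularized error propagation, Eqs. (14)-(15).

    D        : discretized one-step integration kernel (square matrix)
    var_dde  : diagonal variances of the DDE on gravity disturbances
    var_pide : diagonal variances of the PIDE on the geoid
    mu       : Tikhonov regularization parameter
    Returns the STDs of the anomalous-density effect on the geoid heights.
    """
    n = D.shape[1]
    # B = (D^T D + mu I)^-1 D^T  -- regularized one-step inverse
    B = np.linalg.solve(D.T @ D + mu * np.eye(n), D.T)
    C_geoid = B.T @ np.diag(var_dde) @ B + np.diag(var_pide)  # Eq. (14)
    return np.sqrt(np.diag(C_geoid))                          # Eq. (15)
```

With a diagonal input covariance, only the variances propagate; any neglected positive correlations would make these STDs conservative, as discussed in the conclusions.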

# **3 Numerical Results**

We consider the Auvergne test area to evaluate our formulations because this area has been used for international comparisons of different geoid determination methods (Valty et al. 2012; Foroughi et al. 2017; Mahbuby et al. 2017; Janák et al. 2018; Foroughi et al. 2019; Goyal et al. 2021; Abbak et al. 2022; Klees et al. 2022). Besides, we already have an internal error estimate of the geoid heights in this area in which the contribution of the anomalous topographic density was missing, so we can simply include this new contribution here. The Auvergne test case was introduced by Duquenne (2007) for comparing different geoid determination methods and for determining whether the effort should go into methodological improvement or into more gravity observations for more accurate geoid models. The Auvergne gravity data coverage is limited by 1° < λ < 7° and 43° < φ < 49°, and the geoid computation area is the 3°×2° centre block between 1.5° < λ < 4.5° and 45° < φ < 47°, with medium topography (maximum height of 2000 m). The geoid heights have been sought on a 1′×1′ grid in this area. We extract the lateral density variations and their STDs from the global model *UNB\_TopoDensT\_2v01* (Sheng et al. 2019) at two different resolutions of 30″×30″ and 5′×5′. Table 2 provides the statistics of the anomalous topographic mass density and its standard deviation in the study area.

Please note that the STDs of the *UNB\_TopoDensT\_2v01* are estimated using the range of density values suggested for the 15 rock types included in the global lithological model GLiM (Hartmann and Moosdorf 2012) and further incorporated into *UNB\_TopoDensT\_2v01*. The range of STDs of the lateral density variation in this model is large, and it will improve with a better estimate of the rock density structure. Both the DDE and PIDE are computed by integration over the inner-zone, near-zone, and far-zone areas. The inner zone covers an area of 5′×5′ containing grid values of 3″×3″ around each computation point. The finer 3″×3″ anomalous density grid was interpolated from the 30″×30″ grid to

**Table 2** Laterally anomalous density values and their standard deviations in the Auvergne test case


**Fig. 1** Lateral anomalous density variations (**a**) and their STDs in Auvergne (**b**) [kg/m3]

**Fig. 2** The effect of the anomalous density on the DTE (i.e., DDE) (**a**) and the DDE STDs (**b**) in Auvergne [mGal]

match the resolution of the existing DEMs used for computing the DTE and PITE when integrating over the inner zone. The near and far zones cover 10′×10′ and 3°×3°, comprising grid values at 30″×30″ and 5′×5′ spacing, respectively. At first, the point values of the DDE and PIDE (and their STDs) were computed on a 15″×15″ computation grid, and then the mean values on a 1′×1′ grid were estimated. The calculation of mean values is recommended (especially in rough topography) since mean gravity values are typically used for the calculation of the local gravimetric geoid (Vaníček and Martinec 1994; Janák and Vaníček 2005; Afrasteh et al. 2019; Goli et al. 2019a). We have computed the DDE for the whole data coverage and the PIDE for the geoid computation area.
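Averaging the 15″ point values into 1′ mean values amounts to simple block averaging (each 1′ cell contains a 4×4 block of 15″ cells). A generic sketch, assuming the fine grid dimensions divide evenly into the block size:

```python
import numpy as np

def block_mean(grid, factor):
    """Average point values on a fine grid into mean values on a coarser grid.

    grid   : 2-D array of point values (e.g. DDE on a 15"x15" grid)
    factor : number of fine cells per coarse cell per axis
             (factor=4 turns 15"x15" point values into 1'x1' means)
    """
    ny, nx = grid.shape
    assert ny % factor == 0 and nx % factor == 0, "grid must tile evenly"
    return grid.reshape(ny // factor, factor, nx // factor, factor).mean(axis=(1, 3))
```

The same helper applies to the STD grids once the point variances have been propagated to the coarse cells.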

The lateral anomalous density is shown in Fig. 1(a) and its STDs are shown in Fig. 1(b). Figures 2 and 3 show the DDE and PIDE along with their STDs, respectively, with a summary of their statistics provided in Table 3.

Please note that the minimum STD of zero in Table 3 is due to the Mediterranean Sea in the southeast part of the Auvergne area, where the density was set to 1027 kg m⁻³ and the uncertainty was set to zero in the *UNB\_TopoDensT\_2v01*.

**Fig. 3** The effect of anomalous density on PITE (i.e., PIDE) (**a**) and the PIDE STD (**b**) in Auvergne [cm]


**Table 3** DDE, PIDE and their STDs in Auvergne

Figure 2 shows the effect of the DDE on the gravity observations and its estimated STDs. Using the one-step integration method we also compute the effect of the DDE on the geoid heights, shown in Fig. 4(a), and estimate its STDs (i.e., the first term on the right-hand side of Eq. (14)), shown in Fig. 4(b), which are mostly below 1 mm in this area. Statistics are provided in Table 4.

The total effect of the anomalous density on the geoid heights (i.e., Eq. (8)) is displayed in Fig. 5(a), the STDs ($s_{N^t_{\delta\rho}}$) are displayed in Fig. 5(b), and statistics are provided in Table 4.

By adding $s_{N^t_{\delta\rho}}$ to the total error estimate of the geoid heights reported by Foroughi et al. (2019), the error estimate of the geoid heights including the errors of the anomalous density is shown in Fig. 6, with statistics provided in Table 5.

**Fig. 4** The effect of DDE on the geoid heights and STD


**Table 4** Statistics of the total effects of DDE and PIDE on the geoid heights in Auvergne

**Fig. 5** Total effect of the anomalous density on the geoid (**a**) and STDs (**b**)

**Fig. 6** Total error estimate of the geoid heights in Auvergne including all error sources (Foroughi et al. 2019) plus the STDs of the anomalous density

**Table 5** Statistics of the uncertainties of the geoid heights including all error sources (Foroughi et al. 2019) plus DDE and PIDE


# **4 Conclusion and Remarks**

With the advent of global models of topographic density variations, the topographic reductions for geoid determination can be computed using the actual density instead of a constant density value, which may deviate from reality by up to 20% (Kuhn 2002). The knowledge of the anomalous topographic density variation is increasing, and with a better understanding of the rock types and their density structure we will be able to compute the topographic reductions more accurately. We used the recently available global laterally varying topographic mass density model *UNB\_TopoDensT\_2v01* to compute the direct and indirect density effects (DDE and PIDE, respectively) and their uncertainties on the gravity and geoid heights using the Helmert second condensation method. Due to numerical complexity and the lack of information on the covariance of the topographic density variations, we only considered the variances and neglected the correlations between error terms. Given the structure of the *UNB\_TopoDensT\_2v01*, finding the correlation between density variations is not an easy task. All we can say is that the correlations are predominantly positive, and neglecting them gives a larger error estimate of the geoid heights. Since a negative correlation is unlikely, our results are a pessimistic (conservative) estimate of the geoid error, i.e., the internal error estimate of this study can be well trusted.

We tested our formulations in the Auvergne test area and showed that the PIDE is the dominant source of uncertainty in the geoid heights. In a medium-topography area like the Auvergne region, the maximum uncertainty of the geoid heights due to the errors in the anomalous density is less than 2 cm, with a mean value of only 1.5 mm, which is below the target sub-centimetre threshold for the internal error of geoid heights. We also added our DDE and PIDE error estimates to those computed by Foroughi et al. (2019) for the same region and confirmed that, even with a comprehensive error propagation of the STDs of the anomalous density in the geoid determination, the mean value of the internal error is still below the one-centimetre threshold. The STDs of the geoid heights including the uncertainties caused by the inclusion of the anomalous density exceed 1 cm only in higher topography, where the mean STD of the geoid heights is higher than 1 cm anyway. The results of this contribution confirm that topographic density information is now known well enough (the errors in the density variation values are small enough) to make the resulting geoid more accurate than when it is computed with an assumed constant topographic density. We showed that a geoid with an internal error estimate of better than one centimetre is achievable when considering the density variation for most of the globe where the topography is lower than 2000 m.

**Acknowledgments** Ismael Foroughi was supported by Mitacs Application No. IT25134 and No. IT25135.

# **References**


application of EGM08 and CRUST2.0. Acta Geophysica 59(3):502– 525. https://doi.org/10.2478/s11600-011-0004-6


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Estimation of Height Anomalies from Gradients of the Gravitational Potential Using a Spectral Combination Method**

Martin Pitoňák, Michal Šprlák, and Pavel Novák

#### **Abstract**

In this study, we apply a spectral combination method for estimation of height anomalies from gradients of the gravitational potential measured by satellites. The spectral combination method is used for solving over-determined problems within gravity field modelling when multiple types of gravity data are collected and used for recovery of unobservable quantities (typically the gravitational potential). The method applies solutions to geodetic boundary-value problems formulated in spherical approximation for gradients of the gravitational potential of up to the third order. Spectral forms of the solutions are combined using spectral weights defined under the condition of minimizing the global mean-square error of the estimators. Mathematical models are implemented and tested using gradients synthesized from a global geopotential model which allows for closed-loop testing of the estimators. The tests reveal among others that horizontal derivatives of the gravitational potential influence recovered values more than their vertical counterparts.

#### **Keywords**

Boundary-value problem · Earth's gravity field · Gradients of the disturbing potential · Height anomaly · Spectral combination method

# **1 Introduction**

Boundary-value problems (BVPs) and their solutions represent an important tool for describing and modelling potential fields such as the Earth's gravitational field. Solutions to spherical geodetic BVPs lead to spherical harmonic series or surface convolution integrals with Green's kernel functions. New BVPs have recently been formulated reflecting the development of new sensors. BVPs have also been developed for observables measured by kinematic sensors on moving platforms, i.e., airplanes and satellites. Solutions to BVPs for higher-order gradients of the gravitational potential as boundary conditions are represented by multiple integral transforms. For example, solutions to gravimetric BVP are represented by two integral transforms (Grafarend 2001, Eqs. (199) and (142)):

$$\begin{split} T^{(V)}(r,\Omega) &= -\frac{R}{4\pi}\int_{\Omega'}\left\{\sum_{n=3}^{N_{\max}}\frac{2n+1}{n+1}\left(\frac{R}{r}\right)^{n+2}P_{n,0}(\cos\psi)\right\} \\ &\quad\times T_z(R,\Omega')\,\mathrm{d}\Omega', \end{split}\tag{1}$$

$$\begin{split} T^{(H)}(r,\Omega) &= \frac{R}{4\pi}\int_{\Omega'}\left\{\sum_{n=3}^{N_{\max}}\frac{2n+1}{n(n+1)}\left(\frac{R}{r}\right)^{n+2}P_{n,1}(\cos\psi)\right\} \\ &\quad\times\left[-T_x(R,\Omega')\cos\alpha' + T_y(R,\Omega')\sin\alpha'\right]\mathrm{d}\Omega', \end{split}\tag{2}$$

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_194

M. Pitoňák (✉) · M. Šprlák · P. Novák

NTIS—New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Plzeň, Czech Republic
e-mail: pitonakm@ntis.zcu.cz

to the gradiometric BVP by three integral transforms (Martinec 2003, Eq. (19)):

$$\begin{split} T^{(VV)}(r,\Omega) &= \frac{R^2}{4\pi}\int_{\Omega'}\left\{\sum_{n=3}^{N_{\max}}\frac{2n+1}{(n+1)(n+2)}\left(\frac{R}{r}\right)^{n+3}P_{n,0}(\cos\psi)\right\} \\ &\quad\times T_{zz}(R,\Omega')\,\mathrm{d}\Omega', \end{split}\tag{3}$$

$$\begin{split} T^{(VH)}(r,\Omega) &= \frac{R^2}{4\pi}\int_{\Omega'}\left\{\sum_{n=3}^{N_{\max}}\frac{2n+1}{n(n+1)(n+2)}\left(\frac{R}{r}\right)^{n+3}P_{n,1}(\cos\psi)\right\} \\ &\quad\times\left[T_{xz}(R,\Omega')\cos\alpha' - T_{yz}(R,\Omega')\sin\alpha'\right]\mathrm{d}\Omega', \end{split}\tag{4}$$

$$\begin{split} T^{(HH)}(r,\Omega) &= \frac{R^2}{4\pi}\int_{\Omega'}\left\{\sum_{n=3}^{N_{\max}}\frac{2n+1}{(n-1)n(n+1)(n+2)}\left(\frac{R}{r}\right)^{n+3}P_{n,2}(\cos\psi)\right\} \\ &\quad\times\left[\left(T_{xx}(R,\Omega')-T_{yy}(R,\Omega')\right)\cos 2\alpha' - 2T_{xy}(R,\Omega')\sin 2\alpha'\right]\mathrm{d}\Omega', \end{split}\tag{5}$$

and to the gravitational curvature BVP by four integral transforms (Šprlák and Novák 2016, Eqs. (49)–(52)):

$$\begin{split} T^{(VVV)}(r,\Omega) &= -\frac{R^3}{4\pi}\int_{\Omega'}\left\{\sum_{n=3}^{N_{\max}}\frac{2n+1}{(n+1)(n+2)(n+3)}\left(\frac{R}{r}\right)^{n+4}P_{n,0}(\cos\psi)\right\} \\ &\quad\times T_{zzz}(R,\Omega')\,\mathrm{d}\Omega', \end{split}\tag{6}$$

$$\begin{split} T^{(VVH)}(r,\Omega) &= \frac{R^3}{4\pi}\int_{\Omega'}\left\{\sum_{n=3}^{N_{\max}}\frac{2n+1}{n(n+1)(n+2)(n+3)}\left(\frac{R}{r}\right)^{n+4}P_{n,1}(\cos\psi)\right\} \\ &\quad\times\left[T_{xzz}(R,\Omega')\cos\alpha' - T_{yzz}(R,\Omega')\sin\alpha'\right]\mathrm{d}\Omega', \end{split}\tag{7}$$

$$\begin{split} T^{(VHH)}(r,\Omega) &= -\frac{R^3}{4\pi}\int_{\Omega'}\left\{\sum_{n=3}^{N_{\max}}\frac{2n+1}{(n-1)n(n+1)(n+2)(n+3)}\left(\frac{R}{r}\right)^{n+4}P_{n,2}(\cos\psi)\right\} \\ &\quad\times\left[\left(T_{xxz}(R,\Omega')-T_{yyz}(R,\Omega')\right)\cos 2\alpha' - 2T_{xyz}(R,\Omega')\sin 2\alpha'\right]\mathrm{d}\Omega', \end{split}\tag{8}$$

$$\begin{split} T^{(HHH)}(r,\Omega) &= -\frac{R^3}{4\pi}\int_{\Omega'}\left\{\sum_{n=3}^{N_{\max}}\frac{2n+1}{(n-2)(n-1)n(n+1)(n+2)(n+3)}\left(\frac{R}{r}\right)^{n+4}P_{n,3}(\cos\psi)\right\} \\ &\quad\times\left[\left(T_{xxx}(R,\Omega')-3T_{xyy}(R,\Omega')\right)\cos 3\alpha' + \left(T_{yyy}(R,\Omega')-3T_{xxy}(R,\Omega')\right)\sin 3\alpha'\right]\mathrm{d}\Omega'. \end{split}\tag{9}$$

The notation in the previous equations is defined as follows. An Earth-fixed coordinate system is used with the geocentric radius $r$, spherical latitude $\varphi$ and longitude $\lambda$. Moreover, at a point positioned in 3-D space by the triplet of spherical coordinates $(r,\varphi',\lambda') = (r,\Omega')$, a right-handed local Cartesian system is defined with the $z$-axis aligned with the geocentric radius and pointing outwards, and the $x$-axis pointing to the geodetic North. The symbol $T$ represents the disturbing potential, and the components of the first-, second- and third-order gradient tensors are $T_i$, $T_{ij}$ and $T_{ijk}$ (with indexes $i, j, k$ running over the Cartesian coordinates $x, y, z$). The spherical distance $\psi$ and the backward azimuth $\alpha'$ are defined between the computation point $(r,\Omega)$ and the integration point $(R,\Omega')$ located on the mean Earth's sphere with radius $R$. $P_{n,m}$ are the associated Legendre functions of degree $n$ and order $m$. The minimum degree was homogenized to the smallest common degree $n = 3$ recoverable from all gradients. The series in Eqs. (1) to (9) theoretically extend to infinity; however, they are always truncated at the degree $N_{\max}$ for satellite data, where the gravitational signal is attenuated. Note that the expressions in curly brackets represent the integral kernel functions in their spectral forms. The superscripts in Eqs. (1) to (9) stand for the name of the corresponding solution (V—vertical, H—horizontal) and could be expressed by a more general index $i$; for example, the superscript VHH means the vertical-horizontal-horizontal solution. The solutions of the geodetic BVPs defined by Eqs. (1) to (9) represent the direct problem. The components of the first-, second- and third-order gradient tensors $T_i$, $T_{ij}$ and $T_{ijk}$ are located on the mean sphere while the unknown disturbing potential is estimated on the sphere with radius $r$ ($r>R$).
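For illustration, the truncated spectral kernel of Eq. (1) can be evaluated directly with NumPy's Legendre tools, since $P_{n,0}$ reduces to the Legendre polynomial $P_n$. The function name and the degenerate test values are our own, not from the paper.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def kernel_V(psi, R, r, n_max):
    """Curly-bracket kernel of Eq. (1):
    sum_{n=3}^{Nmax} (2n+1)/(n+1) * (R/r)^(n+2) * P_n(cos psi)."""
    n = np.arange(n_max + 1, dtype=float)
    c = (2.0 * n + 1.0) / (n + 1.0) * (R / r) ** (n + 2.0)
    c[:3] = 0.0                    # the series starts at degree n = 3
    return legval(np.cos(psi), c)  # evaluates sum_n c_n P_n(x)
```

At $\psi=0$ every $P_n(1)=1$, so the kernel collapses to the plain coefficient sum, which is a convenient sanity check; the attenuation factor $(R/r)^{n+2}$ damps high degrees for $r>R$.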

# **2 Spectral Combination**

The goal of this study is to apply Eqs. (1) to (9) for downward continuation (DWC), i.e., to estimate values of the disturbing potential on the mean Earth's sphere with radius $R$ from the components of the first-, second- and third-order gradient tensors $T_i$, $T_{ij}$ and $T_{ijk}$ located on the mean orbital sphere. A well-known example of such a problem is the estimation of spherical harmonic coefficients from satellite observables. To do so, we change the ratio $(R/r)$, called the attenuation factor, to $(r/R)$ in Eqs. (1) to (9). Further, we change the arguments on the left-hand side of Eqs. (1) to (9) from $(r,\Omega)$ to $(R,\Omega)$, and the arguments of the components of the first-, second- and third-order gradient tensors $T_i$, $T_{ij}$ and $T_{ijk}$ on the right-hand side from $(R,\Omega')$ to $(r,\Omega')$. All of Eqs. (1) to (9) are modified in this way, but their exact formulas are omitted here; we provide two examples in terms of the simplest V solution and the most complicated HHH solution modified for the DWC:

$$\begin{split} T^{(V)}(R,\Omega) &= -\frac{R}{4\pi}\int_{\Omega'}\left\{\sum_{n=3}^{N_{\max}}\frac{2n+1}{n+1}\left(\frac{r}{R}\right)^{n+2}P_{n,0}(\cos\psi)\right\} \\ &\quad\times T_z(r,\Omega')\,\mathrm{d}\Omega', \end{split}\tag{10}$$

$$\begin{split} T^{(HHH)}(R,\Omega) &= -\frac{R^3}{4\pi}\int_{\Omega'}\left\{\sum_{n=3}^{N_{\max}}\frac{2n+1}{(n-2)(n-1)n(n+1)(n+2)(n+3)}\left(\frac{r}{R}\right)^{n+4}P_{n,3}(\cos\psi)\right\} \\ &\quad\times\left[\left(T_{xxx}(r,\Omega')-3T_{xyy}(r,\Omega')\right)\cos 3\alpha' + \left(T_{yyy}(r,\Omega')-3T_{xxy}(r,\Omega')\right)\sin 3\alpha'\right]\mathrm{d}\Omega'. \end{split}\tag{11}$$

The geometry of this problem is depicted in Fig. 1; all symbols used in this figure are explained in the text above. The problem represented by the modified Eqs. (1) to (9) is DWC. To solve it, we need to control the signal-to-noise ratio of the results. Among many methods, we decided to apply the least-squares spectral combination method developed for the combination of then-available gravity data by Sjöberg (1980) and Wenzel (1982). Since then it has been used by many scholars for geoid determination. The very first publication which discussed the application of the spectral combination method for combining solutions of boundary-value problems (BVPs) of potential theory was published by Eshagh (2011). Using the spectral combination method, Eshagh (2012) combined three analytical solutions to the spherical gradiometric BVP and applied the method to the DWC of second-order gradients of the gravitational potential simulated at satellite orbit. The spectral combination of solutions to the spherical gravitational curvature BVP for estimation of the gravitational potential was investigated by Pitoňák et al. (2018). The method can be used not only for the combination of various data types, but it can also continue observables from an observation level down to the irregular Earth's surface (or elsewhere, as long as the harmonicity of the gravitational potential is guaranteed) and transform them to a corresponding gravitational field quantity, e.g., Sjöberg and Eshagh (2012), Eshagh (2012) or Pitoňák et al. (2018). Although DWC is an inverse problem, the method does not need any matrix inversion, and the signal-to-noise ratio of the results is controlled by the spectral weights. However, DWC can still be problematic, as one continues gradient data inside a geocentric sphere which completely encloses the Earth's masses (the Brillouin sphere).
In this space, the external form of a spherical harmonic series representing the gravitational potential may not converge.

In this study, we apply the spectral combination method to all gradients of the gravitational potential up to the third order. The method combines nine solutions of the respective BVPs: two for the gravimetric BVP, three for the gradiometric BVP and four for the third-order gravitational curvature BVP. The integral kernel functions in Eqs. (1) to (9) can be modified by adding spectral weights $a\_n$ which respect the signal and error degree variances of the measured gravitational gradients. Moreover, the various solutions can be combined in the spectral domain, which provides a minimum expected global mean-square error of the estimated parameters. In order to obtain height anomalies, we apply the well-known Bruns equation (e.g., Heiskanen and Moritz 1967, Eq. 2–144, p. 85) to Eqs. (1)–(9).

**Fig. 1** Geometry of the downward continuation of satellite data

The height anomaly estimator based on a single gradient group has the spectral form:

$$\zeta^{(i)}(\Omega) = \frac{1}{\gamma} \sum\_{n=3}^{N\_{\text{max}}} a\_{i,n} \, b\_{i,n} \, T\_{i,n}(\Omega) \,, \tag{12}$$

which is obtained from the spherical harmonics $T\_{i,n}$ derived by spherical harmonic analysis of the gradient group $T\_i$. The spectral weights $a\_{i,n}$ are defined as follows (Eshagh 2012; Pitoňák et al. 2020):

$$a\_{i,n} = \frac{c\_n \ t\_n}{c\_n \ t\_n^2 + \sigma\_{i,n}^2 \ b\_{i,n}^2} \ , \tag{13}$$

with the signal degree variances $c\_n$ of the height anomaly and the error degree variances $\sigma\_{i,n}^2$ of the particular gradient group. The numerical coefficients for the order of the gradient $\ell$ are:

$$b\_{i,n} = R^\ell \, \frac{(n-j)!}{(n+\ell)!} \,, \quad \ell = \{1, 2, 3\} \,,$$

and the respective attenuation factors tn are defined as follows:

$$t\_n = \left(\frac{R}{r}\right)^{n+1+\ell}, \ \ell = \{1, 2, 3\}.$$
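As a sketch (hypothetical Python, not part of the paper), the numerical coefficients $b\_{i,n}$, the attenuation factors $t\_n$ and the spectral weights of Eq. (13) can be evaluated degree by degree; the values of $R$ and $r$ below are illustrative stand-ins:

```python
from math import factorial

R = 6_378_137.0   # reference sphere radius [m]; illustrative value
r = 6_633_850.0   # mean satellite geocentric radius [m] (cf. Sect. 3)

def b_coeff(n, ell, j):
    # numerical coefficient b_{i,n} = R^ell (n - j)! / (n + ell)!
    return R**ell * factorial(n - j) / factorial(n + ell)

def t_factor(n, ell):
    # attenuation factor t_n = (R / r)^(n + 1 + ell), < 1 since r > R
    return (R / r) ** (n + 1 + ell)

def spectral_weight(n, ell, j, c_n, sigma2):
    # Eq. (13): a_{i,n} = c_n t_n / (c_n t_n^2 + sigma^2 b_{i,n}^2)
    b, t = b_coeff(n, ell, j), t_factor(n, ell)
    return c_n * t / (c_n * t**2 + sigma2 * b**2)
```

For the V group ($\ell = 1$, $j = 0$), $b\_{i,n}$ reduces to $R/(n+1)$, matching the kernel denominator of Eq. (14); in the noise-free limit $\sigma^2 = 0$ the weight reduces to $1/t\_n$, i.e. pure downward continuation.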

The index j represents the order of the gradient in the horizontal coordinates x and y (the number of repetitions of the index H, j = {0, 1, 2, 3}). Note that in the practical computations we used modified integral transforms. Two examples, in terms of the simplest (V) and the most complicated (HHH) solutions, are:

$$\begin{split} \zeta^{(V)}(R,\Omega) &= -\frac{R}{4\pi\gamma} \int\_{\Omega'} \left\{ \sum\_{n=3}^{N\_{\text{max}}} \frac{(2n+1)\,a\_n^{V}}{n+1} P\_{n,0} \left( \cos \psi \right) \right\} \\ &\quad \times T\_{z}(r,\Omega') \,\mathrm{d}\Omega' \,, \end{split} \tag{14}$$

$$\begin{split} \zeta^{(HHH)}(R,\Omega) &= -\frac{R^3}{4\pi\gamma} \int\_{\Omega'} \left\{ \sum\_{n=3}^{N\_{\max}} \frac{(2n+1)\,a\_n^{HHH}}{(n-2)(n-1)n(n+1)(n+2)(n+3)} \right. \\ &\quad \left. \times P\_{n,3}(\cos\psi) \right\} \\ &\quad \times \left[ \left( T\_{xxx}(r,\Omega') - 3T\_{xyy}(r,\Omega') \right) \cos 3\alpha' \right. \\ &\quad \left. + \left( T\_{yyy}(r,\Omega') - 3T\_{xxy}(r,\Omega') \right) \sin 3\alpha' \right] \mathrm{d}\Omega' \,. \end{split} \tag{15}$$

Two or more gradient groups can be then combined in the spectral domain. The combined solution based on two gradient groups reads:

$$\zeta^{(i,j)}(\Omega) = \frac{1}{\gamma} \sum\_{n=3}^{N\_{\max}} \left[ a\_n^{(i,j)} \, b\_{i,n} \, T\_{i,n}(r, \Omega) + a\_n^{(j,i)} \, b\_{j,n} \, T\_{j,n}(r, \Omega) \right] , \tag{16}$$

with the respective spectral weights defined as follows (Eshagh 2012):

$$a\_n^{(i,j)} = \frac{\overline{\sigma}\_{j,n}^2}{t\_n \left(\overline{\sigma}\_{i,n}^2 + \overline{\sigma}\_{j,n}^2\right)} \,, \quad a\_n^{(j,i)} = \frac{\overline{\sigma}\_{i,n}^2}{t\_n \left(\overline{\sigma}\_{i,n}^2 + \overline{\sigma}\_{j,n}^2\right)} \,, \tag{17}$$

with

$$
\overline{\sigma}\_{i,n}^2 = b\_{i,n}^2 \sigma\_{i,n}^2 \ , \ \overline{\sigma}\_{j,n}^2 = b\_{j,n}^2 \ \sigma\_{j,n}^2 \ .
$$
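The two-group weights of Eq. (17) can be checked numerically. The following sketch (hypothetical Python, not from the paper) verifies the unbiasedness condition, $t\_n \, (a\_n^{(i,j)} + a\_n^{(j,i)}) = 1$, and that the less noisy group receives the larger weight:

```python
def two_group_weights(t_n, sbar2_i, sbar2_j):
    # Eq. (17); sbar2 = b_{.,n}^2 * sigma_{.,n}^2 of the two gradient groups
    denom = t_n * (sbar2_i + sbar2_j)
    return sbar2_j / denom, sbar2_i / denom

# group i is assumed less noisy than group j (illustrative variances)
a_ij, a_ji = two_group_weights(0.5, 2.0, 3.0)
```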

In contrast to the single-group estimator in Eq. (12), which is based on the signal degree variances $c\_n$ of the estimated parameters, the two- and more-group estimators are based only on the error degree variances of the input data. We present their unbiased forms herein. Combining two gradient groups in Eq. (16) already yields 36 different combined solutions. One example, for the combination of the V and H gradient groups in integral form, is:

$$\begin{split} \zeta^{(V,H)}(R,\Omega) &= -\frac{R}{4\pi\gamma} \int\_{\Omega'} \left\{ \sum\_{n=3}^{N\_{\max}} \frac{(2n+1)\,a\_n^{V,H}}{(n+1)} P\_{n,0} \left(\cos\psi\right) \right\} \\ &\quad \times T\_{z}(r,\Omega') \,\mathrm{d}\Omega' \\ &\quad + \frac{R}{4\pi\gamma} \int\_{\Omega'} \left\{ \sum\_{n=3}^{N\_{\max}} \frac{(2n+1)\,a\_n^{H,V}}{n(n+1)} P\_{n,1} \left(\cos\psi\right) \right\} \\ &\quad \times \left[ -T\_{x}(r,\Omega') \cos\alpha' + T\_{y}(r,\Omega') \sin\alpha' \right] \mathrm{d}\Omega' \,. \end{split} \tag{18}$$

Similarly, the solution based on three gradient groups reads:

$$\begin{split} \zeta^{(i,j,k)}(\Omega) &= \frac{1}{\gamma} \sum\_{n=3}^{N\_{\max}} \left[ a\_n^{(i,j,k)} \, b\_{i,n} \, T\_{i,n}(r, \Omega) + a\_n^{(j,i,k)} \, b\_{j,n} \, T\_{j,n}(r, \Omega) \right. \\ &\quad \left. + a\_n^{(k,i,j)} \, b\_{k,n} \, T\_{k,n}(r, \Omega) \right] \,, \end{split} \tag{19}$$

with the spectral weights defined as follows (Eshagh 2012):

$$\begin{aligned} a\_n^{(i,j,k)} &= \frac{\overline{\sigma}\_{j,n}^2 \, \overline{\sigma}\_{k,n}^2}{t\_n \, D\_n} \,, \quad a\_n^{(j,i,k)} = \frac{\overline{\sigma}\_{i,n}^2 \, \overline{\sigma}\_{k,n}^2}{t\_n \, D\_n} \,, \\ a\_n^{(k,i,j)} &= \frac{\overline{\sigma}\_{i,n}^2 \, \overline{\sigma}\_{j,n}^2}{t\_n \, D\_n} \,, \end{aligned}$$

and

$$D\_n = \overline{\sigma}\_{i,n}^2 \, \overline{\sigma}\_{j,n}^2 + \overline{\sigma}\_{i,n}^2 \, \overline{\sigma}\_{k,n}^2 + \overline{\sigma}\_{j,n}^2 \, \overline{\sigma}\_{k,n}^2 \,.$$
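The three-group weights follow the same pattern: each group is weighted by the product of the other two groups' scaled error degree variances, normalized by $t\_n D\_n$. A sketch (hypothetical Python, not part of the paper) confirming that the weights scaled by $t\_n$ sum to one:

```python
def three_group_weights(t_n, s2_i, s2_j, s2_k):
    # weights of Eq. (19); s2 = bbar-squared error degree variances
    D = s2_i * s2_j + s2_i * s2_k + s2_j * s2_k
    return (s2_j * s2_k / (t_n * D),
            s2_i * s2_k / (t_n * D),
            s2_i * s2_j / (t_n * D))

# illustrative variances, ordered from least to most noisy group
w = three_group_weights(0.8, 1.0, 2.0, 4.0)
```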

**Fig. 2** Scheme of the three-group estimator

An integral form of the combination of the V, VV and VVV gradient groups is:

$$\begin{split} \zeta^{(V,VV,VVV)}(R,\Omega) &= -\frac{R}{4\pi\gamma} \int\_{\Omega'} \left\{ \sum\_{n=3}^{N\_{\max}} \frac{(2n+1)\,a\_n^{V,VV,VVV}}{(n+1)} P\_{n,0} \left( \cos \psi \right) \right\} \\ &\quad \times T\_{z}(r,\Omega') \,\mathrm{d}\Omega' \\ &\quad + \frac{R^2}{4\pi\gamma} \int\_{\Omega'} \left\{ \sum\_{n=3}^{N\_{\max}} \frac{(2n+1)\,a\_n^{VV,V,VVV}}{(n+1)(n+2)} P\_{n,0} \left( \cos \psi \right) \right\} \\ &\quad \times T\_{zz}(r,\Omega') \,\mathrm{d}\Omega' \\ &\quad - \frac{R^3}{4\pi\gamma} \int\_{\Omega'} \left\{ \sum\_{n=3}^{N\_{\max}} \frac{(2n+1)\,a\_n^{VVV,V,VV}}{(n+1)(n+2)(n+3)} P\_{n,0} \left( \cos \psi \right) \right\} \\ &\quad \times T\_{zzz}(r,\Omega') \,\mathrm{d}\Omega' \,. \end{split} \tag{20}$$

There are 84 solutions based on the three-group estimator in Eq. (19). Its scheme is presented in Fig. 2, where each integral transform is represented by one line; here we select 5 possible solutions out of the 84. The four- to nine-group estimators can be derived analogously. The four- and five-group estimators provide the maximum number of combined solutions, 126 solutions each. For more groups, the number of combined solutions decreases again, with the nine-group estimator providing a single solution. Generally, the number of combined solutions is given by the binomial coefficient.
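The counts quoted above follow directly from the binomial coefficient $\binom{9}{k}$ for $k$ groups chosen out of nine; a short sketch (hypothetical Python, not from the paper) reproduces them, including the total of 511 solutions mentioned in Sect. 3:

```python
from math import comb

GROUPS = 9  # V, H, VV, VH, HH, VVV, VVH, VHH, HHH

# number of k-group combined solutions out of the nine gradient groups
counts = {k: comb(GROUPS, k) for k in range(1, GROUPS + 1)}

# every non-empty selection of groups yields one solution
total = sum(counts.values())
```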

# **3 Numerical Experiments**

The spectral combination method was tested using synthetic disturbing gradients derived from the global geopotential model GO\_CONS\_GCF\_2\_TIM\_R6e (Zingerle et al. 2019) up to the maximum degree $N\_{max} = 250$. The Geodetic Reference System 1980 was used as the normal field (Moritz 2000). In total, 19 disturbing gradients of up to the third order

**Fig. 3** Signal and error degree variances of the first-order gradients (**a**), the second-order gradients (**b**) and the third-order gradients (**c**)

were synthesized at an equiangular coordinate grid with a resolution of 0.2 arc-deg. The global grid was located at the mean satellite orbit with the geocentric radius 6,633,850 m. The error variances were calculated from the formal errors of the applied global geopotential model.

The height anomalies were computed in the area of the Himalayas, limited by φ ∈ [24.75°, 45.25°] and λ ∈ [69.75°, 105.25°], from global grids of the gradient groups by the spectral combination method at the Brillouin sphere of radius 6,383,850 m, thus safely outside the solid Earth's masses. The Brillouin sphere was used instead of the real Earth surface since the solutions of the corresponding BVPs are based on the external spherical harmonic series of the gravitational potential and its first-, second- and third-order gradients. To combine the solutions based on groups of the disturbing gradients in the spectral domain, the respective spectral weights must be computed first. Figure 3 shows the required signal and error spectra (signal and error degree variances) corresponding to the first-, second- and third-order gradients, respectively. Note that we calculated the signal and error spectra from the spherical harmonic coefficients of GO\_CONS\_GCF\_2\_TIM\_R6e and their uncertainties.
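Per-degree signal spectra can be accumulated from the coefficients as $c\_n = \sum\_m (\bar C\_{nm}^2 + \bar S\_{nm}^2)$ (error spectra analogously from the coefficient uncertainties). A schematic sketch (hypothetical Python; the triangular list-of-lists layout is an assumption, real model files must first be parsed into it):

```python
def degree_variances(C, S, n_min=3):
    # C[n][m], S[n][m]: fully normalized coefficients for m <= n.
    # Returns {n: c_n} with c_n = sum_m (C_nm^2 + S_nm^2).
    n_max = len(C) - 1
    return {n: sum(C[n][m]**2 + S[n][m]**2 for m in range(n + 1))
            for n in range(n_min, n_max + 1)}

# toy coefficient set up to degree 4 with power only at degree 3
C = [[0.0] * (n + 1) for n in range(5)]
S = [[0.0] * (n + 1) for n in range(5)]
C[3][0], S[3][2] = 0.5, 0.5
dv = degree_variances(C, S)
```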

Based on the estimated spectral weights, the combined solutions were computed. As synthetic gradient data were used, closed-loop tests could be performed. We divided the values of the disturbing potential by the normal gravity on the mean sphere of radius 6,383,850 m in order to obtain height anomalies. Thus, the numerical results include basic statistics of the differences between values computed by the spectral combination method and their


**Table 1** Statistics of the differences between estimated values of the height anomaly by the one-group estimators and their reference counterparts synthesized from the global geopotential model (in metres).

The indices 1, …, 9 stand for the V, H, VV, VH, HH, VVV, VVH, VHH and HHH solutions, respectively

**Table 2** Statistics of the differences between values of the height anomaly estimated by the selected two-group estimators and their reference counterparts synthesized from the global geopotential model (in metres)


**Table 3** Statistics of the differences between values of the height anomaly estimated by the selected three-group estimators and their reference counterparts synthesized from the global geopotential model (in metres)


counterparts synthesized directly from the global geopotential model in the bandwidth 3–250. In total, there are 511 combined solutions; thus, only a few selected examples can be presented in this study. As is clear from the statistics for the one-group estimator, see Table 1, the worst fit with respect to the true values was obtained from the first-order vertical gradient $T\_z$, while the best fit was achieved for the second-order gradient group VH and the third-order gradient groups VVH and VHH. From the statistics obtained using the two-group estimator, see Table 2, one can conclude that the more accurate group (based on results from the one-group estimator) improves the less accurate solutions from the one-group estimator. The same pattern can be observed for the three-group estimator as well as for the rest of the gradient groups, see Tables 3 and 4, respectively.
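The closed-loop comparison underlying Tables 1 to 4 amounts to differencing the estimated and reference height anomalies and summarizing the residuals. A minimal sketch (hypothetical Python, not the authors' code):

```python
from statistics import mean, stdev

def closed_loop_stats(estimated, reference):
    # basic statistics of the differences between spectral-combination
    # estimates and values synthesized from the geopotential model
    d = [e - r for e, r in zip(estimated, reference)]
    return {"min": min(d), "max": max(d), "mean": mean(d), "std": stdev(d)}

# toy height anomalies [m], purely illustrative
s = closed_loop_stats([1.0, 2.0, 3.5], [1.0, 2.5, 3.0])
```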

# **4 Conclusion**

The spectral combination method was applied to the estimation of the height anomaly from gradients of the gravitational potential up to the third order as potentially observed by satellites. The method has been applied to synthetic gradient data synthesized at global equiangular grids from a state-of-the-art global geopotential model, which allowed for a closed-loop test. The obtained numerical results revealed some interesting properties of the spectral combination method applied to satellite gradients of the geopotential, namely: (i) the best fit was obtained from the mixed vertical and horizontal second- and third-order gradient groups and their respective combinations; (ii) horizontal gradients of the disturbing gravitational potential in the local coordinate frame influence the results more than vertical gradients; and (iii) the combination of more than six groups is not beneficial and does not improve the obtained solution.

**Acknowledgements** The authors acknowledge the financial support of the study through the project GA21-13713S of the Czech Science Foundation. We would like to thank prof. Mehdi Eshagh, Dr. Blažej Bucha and one anonymous reviewer for their thoughtful and constructive comments, and prof. Jeff Freymueller for handling our manuscript. Computational resources were provided by the e-INFRA CZ project (ID:90140), supported by the Ministry of Education, Youth and Sports of the Czech Republic.

**Authors Contributions** MP designed the study and performed numerical experiments. PN drafted the manuscript. All authors read, commented and approved the final manuscript.

**Table 4** Statistics of the differences between values of the height anomaly estimated by the selected multi-group estimators and their reference counterparts synthesized from the global geopotential model (in metres)


**Data Availability** The global geopotential model GO\_CONS\_GCF\_2 \_TIM\_R6e is freely available via ICGEM (International Centre for Global Earth Models) website.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Evaluation of the Recent African Gravity Databases V2.x**

Hussein A. Abd-Elmotaal, Norbert Kühtreiber, Kurt Seitz, Bernhard Heck, and Hansjörg Kutterer

#### **Abstract**

In the framework of the activities of the IAG Sub-Commission on the gravity and geoid in Africa, a recent set of gravity databases has been established, namely AFRGDB\_V2.0 and AFRGDB\_V2.2. The AFRGDB\_V2.0 has been created using the window remove-restore technique, employing EGM2008 as the geopotential Earth model complete to degree and order 1800. The AFRGDB\_V2.2 has been established using the Residual Terrain Model (RTM) reduction technique, employing GOCE DIR\_R5 complete to degree and order 280 and using the best RTM reference surface. The available gravity data set for Africa, used to establish the two independently derived databases mentioned above, consists of shipborne and altimetry-derived gravity anomalies and of land point gravity data. In particular, the set of point gravity values shows clear deficits with regard to a homogeneous data coverage over the entire African continent. The establishment of the gravity databases has been carried out using the weighted least-squares prediction technique, in which the point gravity data on land were assigned the highest precision, while the shipborne and altimetry gravity data were assigned a moderate precision. In this paper, a new gravity data set on land and at sea, which recently became available to the IAG Sub-Commission on the gravity and geoid in Africa and is located partly in the gap areas of the data set used for generating the gravity databases, has been employed to evaluate the accuracy of the previously created gravity databases. The results show reasonable accuracy of the established gravity databases considering the large data gaps in Africa.

#### **Keywords**

Africa - Geoid determination - Gravity field - Gravity interpolation - Window technique

H. A. Abd-Elmotaal (-)

Civil Engineering Department, Faculty of Engineering, Minia University, Minia, Egypt

N. Kühtreiber Institute of Geodesy, Graz University of Technology, Graz, Austria e-mail: norbert.kuehtreiber@tugraz.at

K. Seitz · B. Heck · H. Kutterer Geodetic Institute, Karlsruhe Institute of Technology, Karlsruhe, Germany e-mail: kurt.seitz@kit.edu; bernhard.heck@kit.edu; hansjoerg. kutterer@kit.edu

# **1 Introduction**

The International Association of Geodesy (IAG) has established, some years ago, the Sub-Commission on the gravity and geoid in Africa. The main task of that Sub-Commission is to determine a precise regional geoid for the continent. In order to achieve its main goal, the IAG Sub-Commission on the gravity and geoid in Africa has established a recent set of gravity databases. This set comprises the AFRGDB\_V2.0 (Abd-Elmotaal et al. 2018) and AFRGDB\_V2.2 (Abd-Elmotaal et al. 2020). The aim of this investigation is to perform an external validation of

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_197

the above mentioned gravity databases employing a recently available gravity data set. This data set was not used in creating the V2.x gravity databases and is located partly in the gap areas of the data.

In the following, the gravity data sets used to establish the recent gravity databases AFRGDB\_V2.x will be presented. The different methodologies applied for establishing the AFRGDB\_V2.x gravity databases will be described. The recently available gravity data set used for the validation process will be presented. The validation of the AFRGDB\_V2.x will be performed and discussed.

# **2 Data Used for Establishing the AFRGDB\_V2.x Gravity Databases**

The basis for the creation of a gravity anomaly database across the entire African continent in a homogeneous and comprehensive manner is formed by three complementary data sets.

The available land point gravity data is the most important data set for determining the geoid on the continent. Before they enter the merging scheme, the data have to pass a laborious gross-error detection process. This data screening step was developed by Abd-Elmotaal and Kühtreiber (2014) using the least-squares prediction technique (Moritz 1980). During this gross-error detection, the gravity anomaly at the computational point is predicted using the neighbouring points and then compared to the measured gravity anomaly value. A possibly erroneous measurement is removed from the data if the difference between the measurement and the predicted value exceeds a certain threshold. Afterwards, a grid-filtering scheme (Abd-Elmotaal and Kühtreiber 2014) on a 1′ × 1′ grid is applied to the screened land data to improve the behaviour of the empirical covariance function, especially near the origin (Kraiger 1988). The statistics of the land free-air gravity anomalies, after the gross-error detection and the grid-filtering, are listed in Table 1. The distribution of the available land gravity data set, with its obvious large data gaps, is shown in Fig. 1a.
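The predict-and-compare screening step can be sketched as follows (hypothetical Python; inverse-distance weighting of the nearest neighbours is used here as a simple stand-in for the least-squares prediction of Moritz 1980, and the point set, values and threshold are invented for illustration):

```python
from math import hypot, inf

def screen_gross_errors(pts, g, threshold, k=5):
    # Predict each anomaly from its k nearest neighbours and reject the
    # point when |observed - predicted| exceeds `threshold` [mgal].
    keep = []
    for i, (xi, yi) in enumerate(pts):
        d = [hypot(x - xi, y - yi) if m != i else inf
             for m, (x, y) in enumerate(pts)]          # exclude the point itself
        nb = sorted(range(len(pts)), key=d.__getitem__)[:k]
        w = [1.0 / d[j] for j in nb]                   # inverse-distance weights
        pred = sum(wj * g[j] for wj, j in zip(w, nb)) / sum(w)
        keep.append(abs(g[i] - pred) <= threshold)
    return keep

# toy 3 x 3 grid of anomalies [mgal] with a blunder at the centre point
pts = [(x, y) for x in range(3) for y in range(3)]
g = [10.0] * 9
g[4] = 100.0                                           # index 4 is (1, 1)
keep = screen_gross_errors(pts, g, threshold=30.0)
```

The real scheme additionally iterates the screening, as described for the sea data in the next section.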

The gravity data set used to generate AFRGDB\_V2.x comprises in addition data over the oceanic region. The goal of the African Geoid Project is the calculation of the geoid on the African continent. Data within the data

**Table 1** Statistics of the gravity anomalies used to generate AFRGDB\_V2.x. Units in [mgal]


window which are located on the oceans are used to stabilize the solution at the continental margins and avoid the Gibbs phenomenon. The sea data consist of shipborne point data and altimetry-derived gravity anomalies along tracks. The altimetry-derived data set was derived from the average of 44 repeated cycles of the satellite altimetry mission GEOSAT by the National Geophysical Data Center NGDC (www.ngdc.noaa.gov) (Abd-Elmotaal and Makhloof 2013, 2014). The derived gravity anomalies are given along the ground tracks and have a good spatial coverage, as can be seen in Fig. 1c. The distribution of the shipborne data is given in Fig. 1b. The shipborne and altimetry-derived free-air anomalies have passed a gross-error detection scheme developed by Abd-Elmotaal and Makhloof (2013), also based on the least-squares prediction technique. It estimates the gravity anomaly at the computational point utilizing the neighbouring points and identifies a possible blunder by comparing the estimate to the given data value currently being examined. The gross-error technique works in an iterative scheme until the discrepancy between the predicted and data values reaches 1.5 mgal or better. A stochastic weighting combination between the shipborne and altimetry data took place (Abd-Elmotaal and Makhloof 2014) in order to merge both data sets into one homogeneous data set. Then a grid-filtering process on a 3′ × 3′ grid has been applied to the shipborne and altimetry-derived gravity anomalies to decrease their dominating effect on the gravity data set. The statistics of the shipborne and altimetry-derived free-air anomalies, after the gross-error detection and grid-filtering, are listed in Table 1.

More details about the used data sets can be found in Abd-Elmotaal et al. (2018).

# **3 Methodology for Creating AFRGDB\_V2.x**

The two gravity databases for the African continent (versions V2.0 and V2.2) which are evaluated here have been created based on principally different methodologies. In the following subsections the applied methodologies will be described briefly. They differ mainly in how the high-frequency part of the gravity anomalies is reduced before a suitable interpolation or prediction technique is applied to get gridded data.

# **3.1 Methodology for Creating AFRGDB\_V2.0**

The version V2.0 of AFRGDB relies on the window remove-restore technique, which is used to smooth the signal of


the gravity attraction and avoids the double consideration of topographical masses. This leads to unbiased reduced anomalies with minimum variance. The window technique, which was introduced by Abd-Elmotaal and Kühtreiber (1999, 2003), consists of a remove and a restore step. In the remove step, the measured free-air gravity anomalies $\Delta g\_F$ are reduced by the contribution of the topographic-isostatic masses for the fixed data window ($\Delta g\_{TI\,win}$) and the long-wavelength component modelled by a global geopotential model (GPM) ($\Delta g\_{GPM}$), while the contribution of the topographic-isostatic masses of the same data window in terms of spherical harmonics up to degree and order $n\_{max}$ ($\Delta g\_{win\text{-}cof}$) is added back. The synthesis of $\Delta g\_{GPM}$ and $\Delta g\_{win\text{-}cof}$ is performed to the maximum degree $n\_{max} = 1800$. The EGM2008 geopotential model (Pavlis et al. 2012) is used as the GPM. From this spectral decomposition the window-reduced gravity anomalies can be expressed by Abd-Elmotaal and Kühtreiber (1999, 2003) (cf. Fig. 2)

$$\Delta g\_{win\text{-}red} = \Delta g\_F - \Delta g\_{TI\,win} - \Delta g\_{GPM} \Big|\_{n=2}^{n\_{max}} + \Delta g\_{win\text{-}cof} \Big|\_{n=2}^{n\_{max}} \,. \tag{1}$$

**Fig. 2** The window remove-restore technique

The reduced and smoothed gravity anomalies represented in Eq. (1) are point values. They are interpolated on the 5′ × 5′ target grid covering the geographical window (40°S ≤ φ ≤ 42°N; 20°W ≤ λ ≤ 60°E) of the African continent. The technique used to get $\Delta g^G\_{win\text{-}red}$ is an unequal-weight least-squares interpolation technique (Moritz 1980). A smart fitting technique for the empirically determined covariance function, employing a least-squares regression algorithm (Abd-Elmotaal and Kühtreiber 2016), has been implemented in the interpolation process.


The effects which were subtracted in the remove step, in order to smooth the point-wise given gravity anomalies and improve the interpolation results, are added back, but now at the nodes of the equidistant target grid. The applied technique is described in Abd-Elmotaal and Kühtreiber (1999, 2003) and can be formally expressed by

$$\Delta g\_F^G = \Delta g\_{win\text{-}red}^G + \Delta g\_{TI\,win}^G + \Delta g\_{EGM2008}^{G} \Big|\_{n=2}^{n\_{max}} - \Delta g\_{win\text{-}cof}^{G} \Big|\_{n=2}^{n\_{max}} \,. \tag{2}$$

The superscript G, which is added to the involved quantities (compare (1) and (2)), indicates gridded values. $\Delta g^G\_F$ computed by (2) represents the values of the AFRGDB\_V2.0 gravity database for Africa. More details about the establishment of AFRGDB\_V2.0 can be found in Abd-Elmotaal et al. (2018).
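The bookkeeping of Eqs. (1) and (2) can be sketched with scalar stand-ins (hypothetical Python, illustrative values only). In the idealized case where the removed effects equal the restored ones, the free-air anomaly is recovered exactly; in practice the restored effects are re-evaluated at the grid nodes:

```python
def window_remove(g_F, g_TI_win, g_GPM, g_win_cof):
    # remove step, Eq. (1): smooth the observed free-air anomaly [mgal]
    return g_F - g_TI_win - g_GPM + g_win_cof

def window_restore(g_red, g_TI_win, g_GPM, g_win_cof):
    # restore step, Eq. (2): add the removed effects back
    return g_red + g_TI_win + g_GPM - g_win_cof

# illustrative anomaly and reduction terms [mgal]
g_F = 25.3
g_red = window_remove(g_F, 12.1, 8.4, 3.0)
```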

It is worth mentioning that the harmonic analysis (Abd-Elmotaal and Kühtreiber 2021; Abd-Elmotaal et al. 2013) of the topographic-isostatic masses, needed to compute the term $\Delta g\_{win\text{-}cof}$ in Eq. (1), is the most time-consuming part of the window remove-restore process employed for the creation of the AFRGDB\_V2.0 gravity database.

# **3.2 Methodology for Creating AFRGDB\_V2.2**

The creation of version V2.2 of the gravity database for Africa is based on the RTM reduction technique, proposed first by Forsberg (1984). The remove step of the modified RTM technique used in the creation of the AFRGDB\_V2.2 gravity database, employing the best smoothed DHM as RTM surface, can mathematically be expressed by

$$\Delta g\_{RTM\text{-}red} = \Delta g\_F - \Delta g\_{RTM\text{-}win} - \Delta g\_{Dir\text{-}R5} \Big|\_{n=2}^{n\_{max}} \,, \tag{3}$$

where $\Delta g\_{RTM\text{-}red}$ refers to the RTM-reduced gravity anomalies, $\Delta g\_F$ refers to the measured free-air gravity anomalies, and $\Delta g\_{Dir\text{-}R5}$ stands for the contribution of the GOCE Dir\_R5 global reference geopotential model (Bruinsma et al. 2014). The RTM effect on gravity $\Delta g\_{RTM\text{-}win}$ of the topographic masses is computed from a fixed data window. Here $n\_{max} = 280$ is the upper maximum degree used. The reduced anomalies are interpolated on a 5′ × 5′ grid for the African result window using the same technique described in Sect. 3.1, yielding the interpolated gridded reduced anomalies $\Delta g^G\_{RTM\text{-}red}$. The restore step of the modified RTM technique used for creating the AFRGDB\_V2.2 gravity database for Africa can mathematically be expressed by

$$\Delta g\_F^G = \Delta g\_{RTM\text{-}red}^G + \Delta g\_{RTM\text{-}win}^G + \Delta g\_{Dir\text{-}R5} \Big|\_{n=2}^{n\_{max}} \,, \tag{4}$$

where the superscript G stands again for values computed at the grid points. $\Delta g^G\_F$ computed by (4) represents the values of the AFRGDB\_V2.2 gravity database for Africa. More details about the establishment of the AFRGDB\_V2.2 gravity database can be found in Abd-Elmotaal et al. (2020).

It should be mentioned that the computations required to establish the AFRGDB\_V2.2 gravity database for Africa, described in Sect. 3.2, are considerably faster than the technique used to create the AFRGDB\_V2.0 gravity database described in Sect. 3.1.

# **4 The New Data Set Used for the Validation**

A new gravity data set, covering part of the gaps appearing in the AFRGDB\_V2.x gravity data (cf. Fig. 1), became recently available for the IAG Sub-Commission on the gravity and geoid in Africa. This gridded gravity data set comprises 27,121 grid points on land and 16,659 grid points on sea. The distribution of the new gravity data is illustrated in Fig. 3. Table 2 gives the statistics of the new gravity anomaly

**Fig. 3** The distribution of the new gravity data used to evaluate AFRGDB\_V2.x (green: land data, blue: sea data)

**Table 2** Statistics of the new gravity anomalies used to evaluate AFRGDB\_V2.x. Units in [mgal]


data. In the validation of AFRGDB\_V2.x presented here, the new data are not used for an update of the database. The two recent solutions AFRGDB\_V2.0 and AFRGDB\_V2.2 are interpolated on the grid of the newly acquired data. The resulting residuals (differences) are used for the validation.

# **5 Validation of AFRGDB\_V2.0 and AFRGDB\_V2.2**

The new gravity data set has been used to evaluate the accuracy of the AFRGDB\_V2.x. As can be clearly seen in Fig. 1, the data collected so far show large gaps especially in the north-eastern region of the African continent. With the different methods used to create the AFRGDB\_V2.x databases, the influence of this shortcoming should also be reduced. With the new data, a validation can be carried out under unfavorable data conditions.

Figure 4 shows the histogram of the residuals from the difference between the AFRGDB\_V2.x and the land data contained in the new data grid. It can be concluded that the AFRGDB\_V2.0 fits better than the AFRGDB\_V2.2, because the precision index of the AFRGDB\_V2.0 is larger than that of AFRGDB\_V2.2.

Figure 5 shows the histogram of the validation of the AFRGDB\_V2.x gravity databases with respect to the new grid data at sea. Here also, Fig. 5 shows, using the precision index as decision parameter, that the accuracy of AFRGDB\_V2.0 is better than that of AFRGDB\_V2.2, at least in the region under consideration.

Figure 6 shows the histogram of the validation of the full data for the AFRGDB\_V2.x gravity databases. This figure also confirms the previous conclusion that the AFRGDB\_V2.0 fits better than the AFRGDB\_V2.2 to the new data.

While 68.03% of the new grid points have differences less than 10 mgal for the AFRGDB\_V2.0, this holds for 57.66% of the AFRGDB\_V2.2. The respective residuals are shown in Fig. 7.

**Fig. 4** Histogram of the validation on land for the (**a**) AFRGDB\_V2.0 and (**b**) AFRGDB\_V2.2 gravity database for Africa

**Fig. 5** Histogram of the validation on sea for the (**a**) AFRGDB\_V2.0 and (**b**) AFRGDB\_V2.2 gravity database for Africa

**Fig. 6** Histogram of the validation of the full data for the (**a**) AFRGDB\_V2.0 and (**b**) AFRGDB\_V2.2 gravity database for Africa

**Fig. 7** Validation of the (**a**) AFRGDB\_V2.0 and (**b**) AFRGDB\_V2.2 gravity database for Africa. Units in [mgal]

# **6 Conclusion**

A validation of the recently established AFRGDB\_V2.0 and AFRGDB\_V2.2 gravity databases for Africa has been successfully carried out. The new data used for validation cover the north-eastern region of the African continent, where large data gaps occur in the previous database, particularly in the point values on land. The performed validation shows that the AFRGDB\_V2.0 gravity database is more precise in this region than the AFRGDB\_V2.2 gravity database. This becomes obvious from the residuals between the new data used for validation and the respective model (cf. Fig. 7). While 68.03% of the data points have differences less than 10 mgal for the AFRGDB\_V2.0, for the AFRGDB\_V2.2 this holds only for 57.66% of the data points. This statement is also supported by the statistical parameters in Table 3. They show that

**Table 3** Statistics of the validation of the AFRGDB\_V2.0 and AFRGDB\_V2.2 gravity data bases. Units in [mgal]


the AFRGDB\_V2.0 fits better than the AFRGDB\_V2.2 to the new data. However, the computational effort and CPU time for the AFRGDB\_V2.2 gravity database are much lower than those for the AFRGDB\_V2.0 gravity database. The validation, as an external check of the quality of the gravity databases AFRGDB\_V2.x for Africa, shows reasonable accuracy of the established databases considering the large data gaps in Africa. The performed validation of the data used so far for establishing the AFRGDB\_V2.x databases shows a significant discrepancy with respect to the new data set for Sinai, which deserves deeper investigation.

**Acknowledgements** We thank the International Association of Geodesy (IAG) and the International Union of Geodesy and Geophysics (IUGG) for their support. The thanks extend to Dr. Sylvain Bonvalot, Director of the Bureau Gravimétrique International (BGI), who provided part of the data.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

**Part II**

**Estimation Theory**

# **PDF Evaluation of Elliptically Contoured GNSS Integer Ambiguity Residuals**

Peter J. G. Teunissen and Sandra Verhagen

#### **Abstract**

In this contribution we will present and evaluate the joint probability density function (PDF) of the multivariate integer GNSS carrier phase ambiguity residuals, thereby assuming that the GNSS data belong to the very general class of *elliptically contoured* (EC) distributions. Examples of distributions belonging to this class are the multivariate normal distribution, the t-distribution and the contaminated normal distribution. Since the residuals and their properties depend on the integer estimation principle used, we will present the PDF of the ambiguity residuals for the whole class of admissible integer estimators. This includes the estimation principles of integer rounding, integer bootstrapping, and integer least squares. The probabilistic properties of these estimators vary with the distributions from the EC class. In order to get a better understanding of the various features of the joint PDF of the ambiguity residuals we will use a step-by-step construction aided by graphical means.

#### **Keywords**

Ambiguity success-rate - GNSS - Integer ambiguity resolution - Integer least-squares (ILS) - Pull-in region - Z-transformation

# **1 Introduction**

Several studies have indicated the occurrence of GNSS instances where working with distributions that have tails

P. J. G. Teunissen (-)

Department of Geoscience and Remote Sensing, Delft University of Technology, Delft, The Netherlands

GNSS Research Centre, School of Earth and Planetary Sciences, Curtin University of Technology, Perth, WA, Australia

Department of Infrastructure Engineering, The University of Melbourne, Melbourne, VIC, Australia e-mail: p.j.g.teunissen@tudelft.nl

S. Verhagen

Department of Geoscience and Remote Sensing, Delft University of Technology, Delft, The Netherlands e-mail: sandra.verhagen@tudelft.nl

heavier than the normal would be more appropriate. In Heng et al. (2011), for instance, it is shown that GPS satellite clock errors and instantaneous UREs have heavier tails than the normal distribution for about half of the satellites. Similar findings can be found in Dins et al. (2015). Also in fusion studies of GPS and INS, Student's t-distribution has been proposed as the better-suited distribution, see e.g. Zhu et al. (2012), Zhong and Xu (2018), Wang and Zhou (2019). Similar findings can be found in studies of multi-sensor GPS fusion for personal and vehicular navigation (Dhital et al. 2013; Al Hage et al. 2019).

An appropriate class of distributions that can be used to model distributions with heavy tails is the *class of elliptically contoured (EC) distributions*. Many distributions belong to this class (Chmielewski 1981; Cabane et al. 1981), with important examples being the multivariate normal distribution, the contaminated normal distribution and the multivariate t-distribution (Kibria and Joarder 2006; Roth 2013).

If we assume our GNSS data vector $y$, with mean

$$E(y) = Aa + Bb, \qquad y \in \mathbb{R}^m,\ a \in \mathbb{Z}^n,\ b \in \mathbb{R}^p \tag{1}$$

© The Author(s) 2023

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_204

Authors Peter J.G. Teunissen and Sandra Verhagen contributed equally to this work.

and design matrix $[A, B]$, to be elliptically contoured, then by virtue of linearity the least-squares ambiguity estimator $\hat{a}$ of $a$ is elliptically contoured as well. Our starting point will therefore be to assume that the probability density function (PDF) of $\hat{a}$ is a member of the class of EC-distributions and thus given as

$$f_{\hat{a}}(x) = |\Sigma_{\hat{a}\hat{a}}|^{-1/2}\, g\!\left(\|x - a\|^2_{\Sigma_{\hat{a}\hat{a}}}\right) \tag{2}$$

where $a \in \mathbb{Z}^n$, $\Sigma_{\hat{a}\hat{a}} \in \mathbb{R}^{n \times n}$ is positive definite, and $g: \mathbb{R} \to [0, \infty)$ is a decreasing function that satisfies $\int_{\mathbb{R}^n} g(x^T x)\,\mathrm{d}x = 1$ (Cabane et al. 1981; Teunissen 2020). As the PDF is completely determined by three ingredients, the mean $E(\hat{a}) = a$, the matrix $\Sigma_{\hat{a}\hat{a}}$, and the function $g$, we write $\hat{a} \sim EC_n(a, \Sigma_{\hat{a}\hat{a}}, g)$.

As $\hat{a}$ is an unbiased estimator of $a \in \mathbb{Z}^n$, the real-valued ambiguity-float solution $\hat{a}$ is used to estimate $a$ as $\check{a} = I(\hat{a}) \in \mathbb{Z}^n$, where $I: \mathbb{R}^n \to \mathbb{Z}^n$ is an admissible integer estimator. Popular examples of $I(\cdot)$ are integer least-squares (ILS), integer bootstrapping (IB) and integer rounding (IR) (Teunissen 1998, 1999). With both $\hat{a}$ and $\check{a}$ available, the *ambiguity residual* is defined as

$$
\check{\epsilon} = \hat{a} - \check{a} \in \mathbb{R}^n \tag{3}
$$

In current GNSS practice, the ambiguity residual is used for various inference and ambiguity-validation purposes (Verhagen and Teunissen 2004; Teunissen and Montenbruck 2018). To be able to do so in a statistically meaningful way requires knowledge of the PDF of $\check{\epsilon}$.
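For illustration, with integer rounding as the map $I(\cdot)$, the residual of (3) is simply the float solution minus its nearest integer vector; a minimal sketch with synthetic float values (not real GNSS data):

```python
import numpy as np

def integer_rounding(a_float):
    """Admissible integer map I: R^n -> Z^n by component-wise rounding."""
    return np.rint(a_float)

def ambiguity_residual(a_float):
    """Ambiguity residual eps = a_hat - a_check, Eq. (3)."""
    return a_float - integer_rounding(a_float)

a_hat = np.array([3.2, -1.7, 0.49])   # synthetic float ambiguities
eps = ambiguity_residual(a_hat)       # each component lies in [-0.5, 0.5]
```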

# **2 Normal, Contaminated Normal and Student's t-Distribution**

Before we commence deriving the PDF of $\check{\epsilon}$, we first provide a comparative insight into the behavior of three EC-distributions, namely the normal, the contaminated normal and the Student t-distribution. Their g-functions are given as

$$\begin{aligned} g(x) &= (2\pi)^{-\frac{n}{2}}\, e^{-\frac{1}{2}x} && \text{(normal)}\\ g(x) &= (1-\epsilon)\, \frac{e^{-\frac{1}{2}x}}{(2\pi)^{\frac{n}{2}}} + \epsilon\, \frac{\delta^{-\frac{n}{2}}\, e^{-\frac{1}{2\delta}x}}{(2\pi)^{\frac{n}{2}}} && \text{(cont. normal)}\\ g(x) &= \frac{\Gamma\!\left(\frac{n+d}{2}\right)}{(d\pi)^{\frac{n}{2}}\, \Gamma\!\left(\frac{d}{2}\right)} \left[1 + \frac{x}{d}\right]^{-\frac{n+d}{2}} && \text{(Student)} \end{aligned}$$

in which $x \in \mathbb{R}$, $\Gamma(\cdot)$ denotes the gamma function, and $d$ the degrees of freedom of the Student distribution. Both the contaminated normal and the multivariate t-distribution have tails heavier than the normal. The contaminated normal distribution is an $\epsilon$-mixture of two normal distributions having the same mean but $\delta$-proportional variance matrices. The relevance of the contaminated normal distribution stems from the fact that it is a finite mixture distribution, particularly useful for modeling data that are thought to contain a distinct subgroup of observations; it can thus be used to model experimental error or contamination.

Note, since (2) is symmetric with respect to $a$, that $a$ in (2) is indeed the mean of $\hat{a}$, $E(\hat{a}) = a$. The positive-definite matrix $\Sigma_{\hat{a}\hat{a}}$ in (2), however, is in general *not* the variance matrix of $\hat{a}$. It can be shown that the variance matrix of $\hat{a}$, which we will denote as $Q_{\hat{a}\hat{a}}$, is a scaled version of $\Sigma_{\hat{a}\hat{a}}$. For the above three distributions, the $Q$- and $\Sigma$-matrices are related as

$$\begin{array}{ll}\text{Normal}: & Q\_{\hat{a}\hat{a}} = \Sigma\_{\hat{a}\hat{a}}\\\text{Cont.normal}: & Q\_{\hat{a}\hat{a}} = (1 - \epsilon + \epsilon\delta)\Sigma\_{\hat{a}\hat{a}}\\\text{Student distrib.}: & Q\_{\hat{a}\hat{a}} = \frac{d}{d-2}\Sigma\_{\hat{a}\hat{a}}\end{array}$$

Figure 1 shows the three univariate PDFs for the case that they have the same $\Sigma$ (left) and for the case that they have the same $Q$ (right). When the three distributions are compared with the same $\Sigma$, the contaminated normal and Student distribution indeed have heavier tails than the normal and are also less peaked than the normal distribution. This situation changes, however, when the distributions are compared having the same variance. Although the contaminated normal and Student distribution then still have heavier tails than the normal distribution, this is less pronounced (see the zoom-ins), while now the normal distribution is the least peaked of the three. This shows that in practice one has to exercise some caution when comparing these distributions, especially since one will often already have determined or know the precision of the observables, and will therefore work under the assumption that the three distributions have the same variance.
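The variance scalings listed above can be verified numerically in the univariate case: computing the second moment of each density on a wide grid (with $\Sigma = 1$, and the same $\epsilon$, $\delta$, $d$ values as in Fig. 1) recovers $Q = \Sigma$, $(1-\epsilon+\epsilon\delta)\Sigma$ and $\frac{d}{d-2}\Sigma$, respectively. A sketch:

```python
import numpy as np
from math import gamma

def pdf_normal(x, sig2):
    return np.exp(-0.5 * x**2 / sig2) / np.sqrt(2 * np.pi * sig2)

def pdf_contnormal(x, sig2, eps=0.5, delta=5.0):
    # epsilon-mixture of N(0, sig2) and N(0, delta*sig2)
    return (1 - eps) * pdf_normal(x, sig2) + eps * pdf_normal(x, delta * sig2)

def pdf_student(x, sig2, d=3.0):
    c = gamma((1 + d) / 2) / (np.sqrt(d * np.pi * sig2) * gamma(d / 2))
    return c * (1 + x**2 / (d * sig2)) ** (-(1 + d) / 2)

# numerical second moments, with Sigma (here sig2) fixed at 1
x = np.linspace(-1000.0, 1000.0, 2_000_001)
dx = x[1] - x[0]
moment2 = lambda f: np.sum(x**2 * f(x, 1.0)) * dx
# expected variances: 1, (1 - eps + eps*delta) = 3 and d/(d - 2) = 3
```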

# **3 Distribution of the Ambiguity Residual**

We will now provide the PDF of the ambiguity residual $\check{\epsilon}$, assuming that the PDF of the GNSS data is a member of the class of elliptically contoured distributions. We have the following result.

**Theorem** *Let* $\hat{a} \sim EC_n(a, \Sigma_{\hat{a}\hat{a}}, g)$ *and* $\check{a} = I(\hat{a})$*. Then the PDF of* $\check{\epsilon} = \hat{a} - \check{a}$ *is given as*

$$f_{\check{\epsilon}}(x) = \sum_{z \in \mathbb{Z}^n} \frac{g\!\left(\|x - a - z\|^2_{\Sigma_{\hat{a}\hat{a}}}\right)}{\sqrt{|\Sigma_{\hat{a}\hat{a}}|}}\, p_0(x) \tag{4}$$

*where* $p_0(x)$ *is the indicator function of the origin-centred pull-in region of the integer map* $I(\cdot)$ *(Teunissen 2002).*
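For the univariate normal case with integer rounding, (4) reduces to a sum of shifted Gaussians restricted to the pull-in interval $[-1/2, 1/2]$. A short numerical sketch (illustrative $\sigma$, not tied to any data set) confirming that this PDF integrates to one:

```python
import numpy as np

def residual_pdf_rounding(x, sigma, a=0.0, n_terms=50):
    """Eq. (4) specialized to n = 1, a normal g-function and integer
    rounding: a sum of shifted Gaussians restricted to the pull-in
    interval [-1/2, 1/2] (the indicator p0)."""
    x = np.asarray(x, dtype=float)
    pdf = np.zeros_like(x)
    for z in range(-n_terms, n_terms + 1):
        u = (x - a - z) / sigma
        pdf += np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * sigma)
    return np.where(np.abs(x) <= 0.5, pdf, 0.0)

x = np.linspace(-0.5, 0.5, 100_001)
dx = x[1] - x[0]
total = np.sum(residual_pdf_rounding(x, sigma=0.25)) * dx  # should be ~1
```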

**Fig. 1** The univariate PDFs of the normal (blue), contaminated normal (green; $\epsilon = 0.5$, $\delta = 5$) and Student (red; $d = 3$) distribution, together with their zoom-ins. Left: all PDFs have the same $\sigma = 0.1$ (thus different variances); Right: all PDFs have the same variance (thus different $\sigma$'s). The normal distribution is shown for $\sigma = 0.1$ (Left and Right)

In constructing $f_{\check{\epsilon}}(x)$ from $f_{\hat{a}}(x)$ we follow the distributional steps as graphically depicted in Fig. 2.

Note that the domain of the PDF $f_{\check{\epsilon}}(x)$ is that of the indicator function $p_0(x)$ and is thus dependent on which integer ambiguity estimator is used for computing $\check{a}$. Also note that we have not yet made the assumption in (4) that $a \in \mathbb{Z}^n$. This is the reason why $a$ is still present in the expression of (4); otherwise it would vanish because of the infinite integer sum. This therefore allows us to consider the distribution also for non-integer values of the ambiguities. We will come back to this in Sect. 6. First, however, we will consider, for the case $a \in \mathbb{Z}^n$, the shape of the ambiguity-residual PDF for some different EC-distributions and some different integer ambiguity estimators.

# **4 PDF $f_{\check{\epsilon}}(x)$: One-Dimensional Case**

For the one-dimensional case, integer rounding (IR) is the only admissible integer estimator; its pull-in region is the origin-centred interval of length 1. Figure 3 shows the univariate PDF $f_{\check{\epsilon}}(x)$ when the data are distributed as normal (blue), contaminated normal (green) and Student (red), for different values of $\sigma$ (0.1, 0.25, 0.5). The PDFs at the top all have the same $\Sigma$, while those at the bottom all have the same $Q$ (variance). The following conclusions can be drawn:

1. The difference between the two PDFs $f_{\check{\epsilon}}(x)$ and $f_{\hat{a}}(x)$ is small if $\sigma$ is sufficiently small with respect to 1 (the length of the pull-in interval). This can be understood as follows: the smaller $\sigma$ gets, the larger the probability of correct integer estimation (i.e. the ambiguity success-rate) and thus the less uncertain the outcome of the integer estimator $\check{a}$ becomes. The uncertainty of $\check{\epsilon} = \hat{a} - \check{a}$ will then resemble that of $f_{\hat{a}}(x)$.

**Fig. 2** From the float PDF $f_{\hat{a}}(x)$, via the joint PDF $f_{\hat{a},\check{a}}(x, z)$, to the ambiguity-residual PDF $f_{\check{\epsilon}}(x)$


# **5 PDF $f_{\check{\epsilon}}(x)$: Two-Dimensional Case**

In the multivariate case ($n > 1$), not only the type of EC-distribution assumed for the data, but also the choice of integer estimator has an impact on the PDF of the ambiguity residual. To show this, we consider the PDF of the two-dimensional double-differenced ambiguity residual vector of a single-epoch, GNSS dual-frequency geometry-free model, assuming that the data follow a normal distribution. Figure 4 shows by colors the function values of the PDFs of $\hat{a}$ and $\check{\epsilon} = \hat{a} - \check{a}$ for three different integer estimators (IR, IB, ILS), for the case that the ambiguities are in double-differenced (DD) form (top row) and for the case that the ambiguities are in Z-transformed, i.e. decorrelated, form (bottom row). As the two-dimensional pull-in regions of IR, IB and ILS are a unit square, a parallelogram and a hexagon, respectively, these are also the domains of the corresponding $f_{\check{\epsilon}}(x)$.

As the DD ambiguities are highly correlated, the contour lines of $f_{\hat{a}}(x)$ are very elongated (Fig. 4, top-left). The impact of this extreme elongation is reflected in the three PDFs of the ambiguity-residual vector (Fig. 4, top row). For IR and IB this results in multi-modality and ridges in their PDFs. This is not the case for ILS, as the shape of its pull-in region provides the best possible approximation to the shape of the contour lines of $f_{\hat{a}}(x)$ (Teunissen 1999).
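The pull-in geometry can be made concrete with a small membership test: a point belongs to the IR region iff each component rounds to zero, and to the IB region iff each sequentially conditioned component rounds to zero. A sketch for $n = 2$ (the correlation value in `L` is illustrative, not the actual geometry-free model):

```python
import numpy as np

def integer_bootstrap(a_float, L):
    """Sequential (bootstrapped) integer estimator: round each component
    after conditioning on the previously fixed ones. L is the unit
    lower-triangular factor of the ambiguity variance matrix."""
    a_float = np.asarray(a_float, dtype=float)
    n = a_float.size
    z = np.zeros(n)
    a_cond = np.zeros(n)
    for i in range(n):
        a_cond[i] = a_float[i] - L[i, :i] @ (a_cond[:i] - z[:i])
        z[i] = np.rint(a_cond[i])
    return z

def in_pullin_IR(x):
    # integer rounding: the origin-centred unit square
    return bool(np.all(np.abs(np.asarray(x)) <= 0.5))

def in_pullin_IB(x, L):
    # integer bootstrapping: a parallelogram in 2-D
    return bool(np.all(integer_bootstrap(x, L) == 0))

# illustrative 2-D example with strong correlation (assumed values)
L = np.array([[1.0, 0.0],
              [0.9, 1.0]])
```

The sheared IB region contains points outside the unit square (and vice versa), which is exactly the difference visible in Fig. 4.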

As integer ambiguities are usually not resolved in DD form, but rather in decorrelated form using the LAMBDA method (Teunissen 1995), the corresponding PDFs are shown in the bottom row of Fig. 4. We now see, when compared to the DD case (Fig. 4, top row), that the shapes of the three ambiguity-residual PDFs are over a larger domain

**Fig. 3** Univariate PDF of the ambiguity residual $\check{\epsilon} = \hat{a} - \check{a}$ when the data is distributed as normal (blue), contaminated normal (green, $\epsilon = 0.5$, $\delta = 5$) and Student (red, $d = 3$), for different values of $\sigma$ (0.1, 0.25, 0.5). Top: same $\Sigma$ for all three distributions. Bottom: same $Q$ for all three distributions

**Fig. 4** PDFs of $\hat{a}$ and $\check{\epsilon} = \hat{a} - \check{a}$ for three different integer estimators (IR, IB, ILS), when ambiguities are in double-differenced form (top row) and decorrelated form (bottom row)

**Fig. 5** PDFs $f_{\hat{a}}(x)$ (top row) and $f_{\check{\epsilon}}(x)$ (bottom row) for different $a \notin \mathbb{Z}$ (0.1, 0.5) and different $\sigma$ (0.1, 0.25), when the data are assumed to be distributed as normal (blue), contaminated normal (green, $\epsilon = 0.5$, $\delta = 5$) or Student (red, $d = 3$)

similar to that of $f_{\hat{a}}(x)$. The differences between $f_{\check{\epsilon}}(x)$ and $f_{\hat{a}}(x)$ are now more confined to the boundaries of the pull-in regions and also differ between the different pull-in regions. These differences will of course get smaller, the more precise the ambiguities are.

# **6 The Case $a \notin \mathbb{Z}^n$**

So far we assumed the ambiguities to be integer. As a result, the PDF of the ambiguity residual $\check{\epsilon} = \hat{a} - \check{a}$ was shown to be symmetric with respect to the origin. This situation changes drastically, however, when the ambiguities fail to be integer, $a \notin \mathbb{Z}^n$. Note that when we change the value of $a$, the EC-PDF $f_{\hat{a}}(x)$ simply translates over this change in $a$, without changing its shape. This is not the case, however, for the PDF of the ambiguity residual. This difference in behaviour of $f_{\hat{a}}(x)$ and $f_{\check{\epsilon}}(x)$ under changes of $a$ is illustrated in Fig. 5. The lack of translational invariance in $f_{\check{\epsilon}}(x)$ is due to the finite extent of its domain as dictated by the pull-in region. Due to this constraint, the shape of $f_{\check{\epsilon}}(x)$ has to change when $a$ is changed by a non-integer value. Its shape will only remain the same when the change in $a$ is an integer value.

# **7 Summary and Conclusion**

In this contribution we provided the PDF $f_{\check{\epsilon}}(x)$ of the ambiguity residuals for the case that the distribution of the GNSS data is elliptically contoured. The normal, the contaminated normal and the Student distribution were taken as examples. We then evaluated several characteristics of $f_{\check{\epsilon}}(x)$ in their dependence on both the shape of the elliptically contoured data distribution ('same $\Sigma$' vs 'same $Q$') and the chosen integer ambiguity estimator (IR, IB or ILS). Finally, we highlighted the lack of translational invariance of $f_{\check{\epsilon}}(x)$, a property that really discriminates it from the PDF $f_{\hat{a}}(x)$ of the float ambiguities.

In many empirical GNSS studies, the evaluation of the ambiguity residuals is still done by comparing their histograms with the PDF $f_{\hat{a}}(x)$. This is incorrect and should not be done, since, as the above has shown, the two PDFs $f_{\hat{a}}(x)$ and $f_{\check{\epsilon}}(x)$ can have very different characteristics. Moreover, there is no need to use $f_{\hat{a}}(x)$ for comparative purposes, since the exact analytical expression for the PDF of the ambiguity residuals is available (cf. Eq. (4)).

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Spatio-Spectral Assessment of Some Isotropic Polynomial Covariance Functions on the Sphere**

Dimitrios Piretzidis, Christopher Kotsakis, Stelios P. Mertikas, and Michael G. Sideris

#### **Abstract**

In gravity field modeling, covariance functions are mainly associated with least squares collocation. Prior to the implementation of least squares collocation, the characteristics of the selected analytical covariance function need to be well understood. In this contribution, we study four polynomial covariance functions, i.e., the spherical, Askey, C<sup>2</sup>-Wendland and C<sup>4</sup>-Wendland models. All of them are defined on the sphere and correspond to isotropic, positive definite and compactly supported functions. We examine them in the spatial and spectral domains, and assess their characteristics, such as the correlation length, the curvature parameter, the spectral maximum and the spectral decay rate. We also provide analytical expressions and numerical estimates for these parameters.

#### **Keywords**

Askey model - Polynomial covariance functions - Spherical harmonic coefficients - Wendland model

# **1 Introduction**

Covariance functions (CFs) are routinely used in geostatistical analysis to model the stochastic behavior of spatial random fields. The study of CFs in physical geodesy is usually conducted within the context of least squares collocation for the estimation of functionals related to the Earth's

D. Piretzidis (-) Space Geomatica P.C., Chania, Greece e-mail: d\_piretzidis@spacegeomatica.com

C. Kotsakis

Department of Geodesy and Surveying, Aristotle University of Thessaloniki, Thessaloniki, Greece e-mail: kotsaki@topo.auth.gr

S. P. Mertikas

Geodesy and Geomatics Engineering Laboratory, Technical University of Crete, Chania, Greece e-mail: smertikas@tuc.gr

disturbing potential. Spatial CFs are categorized in several ways, depending on their mathematical properties. With respect to the surface they are defined on, CFs are classified into planar or spherical (i.e., defined on the plane or sphere, respectively), with the former used mostly in local-scale applications and the latter in global-scale applications. Regarding their structural properties (Devaraju and Sneeuw 2018), CFs can be classified based on their invariance under translation (homogeneous/non-homogeneous) and under rotation (isotropic/non-isotropic). The examination of their support length (i.e., the maximum distance at which the CF is non-zero) gives rise to yet another distinction: CFs with an infinite support length, usually called global CFs, and CFs with a finite support length, also known as local, finite or compactly supported CFs. Lastly, depending on their validity, CFs are classified as positive definite and non-positive definite, with only the former providing physically meaningful modeling options.

The design of CFs and the study of their properties is an active topic of research in the field of applied mathematics (e.g., Emery et al. 2022), with applications to all branches of geosciences. In this contribution, we focus only

© The Author(s) 2023

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_190

M.G. Sideris Department of Geomatics Engineering, University of Calgary, Calgary, AB, Canada e-mail: sideris@ucalgary.ca

on polynomial CFs, which are still regarded as appealing models mostly due to the simplicity of their mathematical expressions. We examine four polynomial CFs, namely the spherical, Askey, C<sup>2</sup>-Wendland and C<sup>4</sup>-Wendland CFs. The spherical CF represents the normalized volume resulting from the convolution of two identical balls. The Askey covariance model was first used as a radial basis function by Askey (1973), who also proved its positive definiteness. The Wendland CFs (Wendland 1995) are constructed by the repeated integration of the Askey CF using the "Montée" integral operator $I\{f(r)\} = \int_r^{\infty} t f(t)\,\mathrm{d}t$, which produces CFs of arbitrary smoothness $k$ (termed C<sup>k</sup>-Wendland CFs). Based on this design principle, the Askey CF is also considered the C<sup>0</sup>-Wendland CF. The selection of these models is motivated by their frequent use in previous studies that primarily investigate them from a purely mathematical perspective (Gneiting 2013; Guinness and Fuentes 2016). All models correspond to isotropic, positive definite, compactly supported functions defined on the spherical surface.
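The Montée construction can be checked numerically: one application of the operator to the Askey function $(1-t)_+^{\tau}$ (unit support assumed here for simplicity) reproduces, up to a normalization constant, the C<sup>2</sup>-Wendland form $(1 + (\tau+1)r)(1-r)^{\tau+1}$, i.e. the shape parameter is raised by one. A sketch:

```python
import numpy as np

def askey(t, tau):
    # Askey function (1 - t)_+^tau with unit support
    return np.maximum(1.0 - t, 0.0) ** tau

def montee(f, r, n=100_000):
    """I{f}(r) = integral_r^inf t f(t) dt; f vanishes beyond t = 1,
    so the integral is truncated there (midpoint rule)."""
    dt = (1.0 - r) / n
    t = r + (np.arange(n) + 0.5) * dt
    return np.sum(t * f(t)) * dt

# closed form of the Montee of the Askey function:
# (1 - r)^(tau+1) * (1 + (tau+1)*r) / ((tau+1)*(tau+2)),
# a C2-Wendland-type function with shape parameter tau + 1
tau, r = 2, 0.3
num = montee(lambda t: askey(t, tau), r)
ref = (1 - r) ** (tau + 1) * (1 + (tau + 1) * r) / ((tau + 1) * (tau + 2))
```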

The paper is structured as follows. In Sect. 2 the spatial and spectral representation of the four polynomial CFs is presented. Some alternative expressions for the evaluation of the Askey and Wendland CFs in the spatial domain are also provided. In Sect. 3 two spatial characteristics (correlation length and curvature parameter) and two spectral characteristics (spectral maximum and decay rate) are discussed. For the correlation length, curvature parameter and spectral maximum, analytical expressions are derived, whereas the spectral decay rate is evaluated numerically. Finally, Sect. 4 summarizes the most important conclusions of this study.

# **2 Isotropic Polynomial Covariance Functions**

# **2.1 Spatial Representation**

Since the four CFs under study correspond to isotropic functions, they only depend on the distance between two points on the spherical surface. Their adaptation from the line (or plane) to the sphere is done by replacing the Euclidean distance with the spherical distance $\psi \in [0, \pi]$. The standard expressions used to describe them in the spatial domain are given in Table 1. All CFs depend on the variance $c_0$, which represents the CF value at $\psi = 0$, and the support length $\psi_0$, which denotes the spatial extent of the CF, i.e., the distance for which $C(\psi) = 0,\ \forall \psi > \psi_0$. The Askey and Wendland CFs also depend on the shape parameter $\tau$. The numerical ranges of $c_0$, $\psi_0$ and $\tau$ that result in a positive definite CF on the sphere are also provided in Table 1. The variable $\mathbf{1}_I$ denotes the indicator function, given by

$$\mathbf{1}\_I(\boldsymbol{\psi}) = \begin{cases} 1, & \boldsymbol{\psi} \in I \\ 0, & \boldsymbol{\psi} \notin I \end{cases} \tag{1}$$

The intervals $I_1$ and $I_2$ are defined as $I_1 = [0, \min(\psi_0, \pi)]$ and $I_2 = [0, \psi_0]$. It is also evident from the expressions in Table 1 that $C_S$, $C_A$, $C_{W2}$ and $C_{W4}$ are polynomials of order three, $\tau$, $\tau+1$ and $\tau+2$, respectively. Figure 1 presents some CF examples for different values of $\psi_0$ and $\tau$. The spherical and Askey CFs demonstrate a sharp decrease at $\psi = 0$, whereas the Wendland CFs have a smoother, Gaussian-like behavior in the same vicinity.
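For reference, the four models can be sketched directly; the expressions below follow the standard parameterizations collected by Gneiting (2013) (with $s = \psi/\psi_0$), which match the polynomial orders stated above:

```python
import numpy as np

def _s(psi, psi0):
    return np.minimum(np.abs(psi) / psi0, 1.0)   # clipped beyond the support

def cf_spherical(psi, c0, psi0):
    s = _s(psi, psi0)
    return c0 * (1.0 - 1.5 * s + 0.5 * s**3)

def cf_askey(psi, c0, psi0, tau):
    return c0 * (1.0 - _s(psi, psi0)) ** tau

def cf_wendland2(psi, c0, psi0, tau):
    s = _s(psi, psi0)
    return c0 * (1.0 + tau * s) * (1.0 - s) ** tau

def cf_wendland4(psi, c0, psi0, tau):
    s = _s(psi, psi0)
    return c0 * (1.0 + tau * s + (tau**2 - 1.0) / 3.0 * s**2) * (1.0 - s) ** tau
```

All four return $c_0$ at $\psi = 0$ and vanish for $\psi \geq \psi_0$, i.e. they are compactly supported as required.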

Several alternative formulations of the Askey and Wendland CFs can be found in the literature. Hubbert (2012) derived expressions in terms of the associated Legendre function of the first kind $P_n^m$ of degree $n$ and order $m$, with $n, m \in \mathbb{N}_0$ and $m \leq n$. These expressions are normalized in this work so that $C(0) = c_0$, resulting in the following equations:

$$C_{\mathrm{A}}(\psi, \tau) = g_{0,\tau} \left( 1 - \frac{\psi^2}{\psi_0^2} \right)^{\frac{\tau}{2}} P_0^{-\tau}\!\left( \frac{\psi_0}{\psi} \right) \mathbf{1}_{I_1}(\psi) \tag{2a}$$

$$C_{\mathrm{W2}}(\psi, \tau) = g_{1,\tau}\, \psi \left( 1 - \frac{\psi^2}{\psi_0^2} \right)^{\frac{\tau}{2}} P_1^{-\tau}\!\left( \frac{\psi_0}{\psi} \right) \mathbf{1}_{I_2}(\psi) \tag{2b}$$

$$C_{\mathrm{W4}}(\psi, \tau) = g_{2,\tau}\, \psi^2 \left( 1 - \frac{\psi^2}{\psi_0^2} \right)^{\frac{\tau}{2}} P_2^{-\tau}\!\left( \frac{\psi_0}{\psi} \right) \mathbf{1}_{I_2}(\psi), \tag{2c}$$

**Table 1** Isotropic polynomial covariance functions on the sphere (Gneiting 2013)


<sup>a</sup> $\mathbb{R}^*_+$ denotes the set of all positive real numbers, i.e., $\mathbb{R}^*_+ = \{x \in \mathbb{R} : x > 0\}$

**Fig. 1** Examples of (**a**) spherical, (**b**) Askey, (**c**) C<sup>2</sup>-Wendland and (**d**) C<sup>4</sup>-Wendland CFs in the spatial domain for different $\psi_0$ (**a**) and $\tau$ values (**b**,**c**,**d**). A variance of $c_0 = 1$ is selected in all cases

with the parameter $g_{s,\tau}$ defined as:

$$g\_{s,\tau} = \frac{2^s s! (\tau + s)!}{(2s)!} c\_0 \psi\_0^{-s} \tag{3}$$

and the associated Legendre function of negative order $P_n^{-m}$ given by the expression (Hubbert 2012):

$$P\_n^{-m}(\mathbf{x}) = \left(\frac{\mathbf{x} - 1}{\mathbf{x} + 1}\right)^{\frac{m}{2}} \sum\_{j=0}^n \frac{(j+n)!(\mathbf{x} - 1)^j}{2^j \, j!(j+m)!(n-j)!}. \quad (4)$$

We note that the associated Legendre function $P_n^m$ includes the Condon–Shortley phase $(-1)^m$; therefore the relation $P_n^m = (-1)^m P_{n,m}$ applies, with $P_{n,m}$ being the standard definition of the associated Legendre function used in physical geodesy. Hubbert (2012) developed additional closed-form expressions for Eqs. (2a)–(2c) using analytical relations for $P_n^{-m}$. These results were later used by Chernih and Hubbert (2014) to derive expressions in standard polynomial form. The expressions of Chernih and Hubbert (2014), again normalized here so that $C(0) = c_0$, read:

$$C_{\mathrm{A}}(\psi, \tau) = \sum_{m=0}^{\tau} k_{0,m,\tau}\, \psi^m\, \mathbf{1}_{I_1}(\psi) \tag{5a}$$

$$C_{\mathrm{W2}}(\psi, \tau) = \sum_{m=0}^{\tau+1} k_{1,m,\tau}\, \psi^m\, \mathbf{1}_{I_2}(\psi) \tag{5b}$$

$$C_{\mathrm{W4}}(\psi, \tau) = \sum_{m=0}^{\tau+2} k_{2,m,\tau}\, \psi^m\, \mathbf{1}_{I_2}(\psi), \tag{5c}$$

with

$$k\_{s,m,\tau} = \frac{(-1)^m \Gamma\left(\frac{m+1}{2}\right) \Gamma\left(\frac{1}{2} - s\right) (\tau + s)!}{\Gamma\left(\frac{m+1}{2} - s\right) \Gamma\left(\frac{1}{2}\right) (\tau + s - m)! m!} c\_0 \psi\_0^{-m} \qquad (6)$$

and with $\Gamma(x)$ denoting the gamma function (Gradshteyn and Ryzhik 2014, p. xxxii). Additional representations of the Askey and Wendland CFs, e.g., in terms of hypergeometric functions, can be found in Hubbert (2012).

# **2.2 Spectral Representation**

The spherical harmonic coefficients $G(n)$ of an isotropic covariance function are derived via the application of the Legendre transform $\mathcal{L}$, as follows (Jekeli 2017, p. 54):

$$G(n) = \mathcal{L}\{C(\psi)\} = \frac{1}{2} \int_0^{\pi} C(\psi)\, P_n(\cos\psi) \sin\psi \,\mathrm{d}\psi, \tag{7}$$

where $P_n$ denotes the Legendre polynomial of degree $n$. Applying the linearity property of the Legendre transform to the expressions of Sect. 2.1, the spherical harmonic coefficients of the four CFs under study are given by:

$$G_{\mathrm{S}}(n) = c_0 \left( \Psi_{0,I_1}(n) - \frac{3\,\Psi_{1,I_1}(n)}{2\psi_0} + \frac{\Psi_{3,I_1}(n)}{2\psi_0^3} \right) \tag{8a}$$

$$G_{\mathrm{A}}(n) = \sum_{m=0}^{\tau} k_{0,m,\tau}\, \Psi_{m,I_1}(n) \tag{8b}$$

$$G_{\mathrm{W2}}(n) = \sum_{m=0}^{\tau+1} k_{1,m,\tau}\, \Psi_{m,I_2}(n) \tag{8c}$$

$$G_{\mathrm{W4}}(n) = \sum_{m=0}^{\tau+2} k_{2,m,\tau}\, \Psi_{m,I_2}(n), \tag{8d}$$

where $\Psi_{m,I}$ is the Legendre transform of the monomial $\psi^m$ on $I$, i.e.,

$$\Psi\_{m,I}(n) = \frac{1}{2} \int\_I \psi^m P\_n(\cos \psi) \sin \psi \,\mathrm{d}\psi. \tag{9}$$

**Fig. 2** Examples of (**a**) spherical, (**b**) Askey, (**c**) C<sup>2</sup>-Wendland and (**d**) C<sup>4</sup>-Wendland CFs in the spherical harmonic domain for different $\psi_0$ (**a**) and $\tau$ values (**b**,**c**,**d**). A variance of $c_0 = 1$ is selected in all cases

The spherical harmonic representation of the CFs of Fig. 1 is shown in Fig. 2. Since all CFs are positive definite, the Schoenberg criteria should apply, i.e., $G(n) \geq 0$ and $\sum_{n=0}^{\infty} G(n) < \infty$ (Schoenberg 1942). The first Schoenberg criterion is easily verified in Fig. 2, where it is also evident that, in all cases, the coefficients $G(n)$ decrease at a constant rate at higher degrees. The coefficients of the spherical CF exhibit a strong oscillating pattern that also appears (to a much lesser extent) in the rest of the CFs for small $\tau$ values.
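Equation (7) and the first Schoenberg criterion can be checked numerically: substituting $u = \cos\psi$ turns (7) into an integral over $[-1, 1]$ that Gauss–Legendre quadrature handles well. A sketch (the Askey parameter values are an illustrative choice):

```python
import numpy as np

def legendre_coeffs(C, n_max, n_quad=2000):
    """Spherical harmonic coefficients G(n) of an isotropic CF, Eq. (7):
    G(n) = 1/2 * int_0^pi C(psi) P_n(cos psi) sin(psi) dpsi,
    computed by Gauss-Legendre quadrature in u = cos(psi)."""
    u, w = np.polynomial.legendre.leggauss(n_quad)   # nodes/weights on [-1, 1]
    Cu = C(np.arccos(u))
    G = np.zeros(n_max + 1)
    p_prev = np.ones_like(u)          # P_0(u)
    p = u.copy()                      # P_1(u)
    G[0] = 0.5 * np.sum(w * Cu)
    if n_max >= 1:
        G[1] = 0.5 * np.sum(w * Cu * p)
    for n in range(1, n_max):         # three-term recurrence for P_{n+1}
        p_prev, p = p, ((2 * n + 1) * u * p - n * p_prev) / (n + 1)
        G[n + 1] = 0.5 * np.sum(w * Cu * p)
    return G

# Askey CF with c0 = 1, psi0 = 0.5, tau = 3 (illustrative values)
G = legendre_coeffs(lambda psi: np.maximum(1 - psi / 0.5, 0.0) ** 3, 100)
# first Schoenberg criterion: the low-degree coefficients are non-negative
```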

# **3 Characteristics**

# **3.1 Spatial Characteristics**

The three main spatial characteristics of a CF are the variance, the correlation length and the curvature parameter (Moritz 1976). The variance is defined as the CF value at $\psi = 0$ and equals $c_0$ for all the models of Table 1. The correlation length, denoted as $\xi$, represents the spherical distance at which the CF decreases to half the variance, i.e.,

$$C(\xi) = \frac{C(0)}{2}.\tag{10}$$

Since all the CFs of Table 1 are strictly monotonic (i.e., strictly decreasing) on $[0, \psi_0]$, it can be deduced by virtue of the intermediate value theorem that there exists a unique $\xi \in [0, \psi_0]$ satisfying Eq. (10). The determination of an analytical expression for $\xi$ is performed by solving Eq. (10) explicitly, and therefore depends on the mathematical complexity of $C(\psi)$. For the correlation length of the spherical CF, with the aid of MATLAB's computer algebra system (MATLAB 2020), we find the expression:

$$\xi\_{\rm S} = \left[ \sqrt{3} \sin \left( \frac{2\pi}{9} \right) - \cos \left( \frac{2\pi}{9} \right) \right] \psi\_0 \approx 0.3473 \psi\_0, \tag{11}$$

whereas, for the Askey CF the following expression can be easily derived:

$$
\xi_{\mathrm{A}} = \left(1 - \frac{1}{\sqrt[\tau]{2}}\right) \psi_0. \tag{12}
$$

Obtaining an analytical expression for the correlation length of the Wendland CFs is not a simple task, since it requires (a) the derivation of analytical expressions for the roots of a polynomial of arbitrary order and (b) a subsequent investigation of whether these roots are real and belong to $[0, \psi_0]$. Regarding the first requirement, analytical expressions for the roots of polynomials up to order four exist, but become too complicated to be used in practice for orders greater than two. In addition, based on the Abel–Ruffini theorem, no algebraic expressions exist for the roots of general polynomial equations of order greater than four. We are also not aware of any method that addresses the second requirement.

Instead of a rigorous analytical expression for the correlation length of the C<sup>2</sup>- and C<sup>4</sup>-Wendland CFs, denoted as ξ_W2 and ξ_W4, we seek an approximate expression that can be easily generalized for any τ. In the sequel, we outline the procedure employed for deriving such an expression for ξ_W2. We first define the scaling parameter s_W2 ∈ [0, 1] as s_W2 = ξ_W2/ψ₀ and rewrite Eq. (10) for the C<sup>2</sup>-Wendland function as follows:

$$(1 + \tau s\_{\rm W2})(1 - s\_{\rm W2})^{\tau} = \frac{1}{2}.\tag{13}$$

We solve Eq. (13) numerically (e.g., using the bisection method or MATLAB's vpasolve function; The MathWorks Inc., 2022) for several τ values and only keep the real solutions in [0, 1], which are unique for each τ. These solutions are plotted in Fig. 3 up to τ = 50. It is evident that s_W2 smoothly decreases for increasing τ and can be approximated quite well by the rational model:

$$s\_{\rm W2} \approx \frac{\alpha\_{\rm W2}}{\beta\_{\rm W2} + \tau},\tag{14}$$

**Fig. 3** Scaling parameters s and their approximations (fitted rational models) for s_W2 and s_W4

where the parameter values α_W2 = 1.679 and β_W2 = 1.350 are estimated using ordinary least-squares. Substituting Eq. (14) into the defining expression for s_W2 and solving with respect to ξ_W2, we derive the following approximation:

$$
\xi\_{\rm W2} \approx \frac{\alpha\_{\rm W2}}{\beta\_{\rm W2} + \tau} \psi\_0. \tag{15}
$$
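The numerical procedure behind Eqs. (13)–(15) can be sketched as follows (a pure-Python stand-in for MATLAB's vpasolve; the bisection replaces the symbolic solver, and the α_W2, β_W2 values are the least-squares estimates quoted in the text):

```python
def s_w2(tau):
    """Unique root in (0, 1) of (1 + tau*s)*(1 - s)**tau = 1/2, cf. Eq. (13)."""
    f = lambda s: (1 + tau*s) * (1 - s)**tau - 0.5
    lo, hi = 0.0, 1.0          # f(0) = +0.5 and f(1) = -0.5: one sign change
    for _ in range(60):        # interval shrinks by 2**-60, far below double precision
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha_w2, beta_w2 = 1.679, 1.350   # fitted parameters reported in the text

for tau in (6, 10, 20, 50):
    print(f"tau={tau:2d}  s_W2={s_w2(tau):.6f}  fit={alpha_w2/(beta_w2+tau):.6f}")
```

A convenient sanity check: for τ = 2 the root is exactly 1/2, since (1 + 2·½)(1 − ½)² = ½.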

Performing the same procedure for the C<sup>4</sup>-Wendland CF yields:

$$
\xi\_{\rm W4} \approx \frac{\alpha\_{\rm W4}}{\beta\_{\rm W4} + \tau} \psi\_0,\tag{16}
$$

with α_W4 = 2.330 and β_W4 = 2.312. The corresponding values of the scaling parameter s_W4 = ξ_W4/ψ₀ are also provided in Fig. 3, along with s_S ≈ 0.3473 and s_A = 1 − 2^(−1/τ) that are directly derived from Eqs. (11) and (12), respectively. The maximum absolute error of s_W2 and s_W4 using the approximations of Eqs. (15) and (16) does not exceed 6 × 10⁻⁵ and 10⁻⁵, respectively, in the examined range. The overall behavior of the four s groups indicates that for the same ψ₀ and τ, the following inequality applies: s_S > s_W4 > s_W2 > s_A. The same inequality is also true for ξ; hence, the Askey and spherical models always produce the CF with the smallest and largest correlation length, respectively, and the C<sup>2</sup>-Wendland CF always has a smaller correlation length than the C<sup>4</sup>-Wendland CF for a given {ψ₀, τ} pair. An example of this behavior is shown in Figs. 1c and 1d for τ = 6, where the C<sup>2</sup>-Wendland CF (red line) is sharper than the corresponding C<sup>4</sup>-Wendland CF (blue line). A simple investigation of Eqs. (11), (12), (15) and (16) also shows that a decreasing ψ₀ or an increasing τ yields a smaller correlation length ξ, which corresponds to a sharper CF. This is again corroborated by the examples in Fig. 1.

The curvature parameter of a CF is defined as:

$$
\chi = \kappa(0)\frac{\xi^2}{C(0)},\tag{17}
$$

where κ(ψ) is the curvature (or reciprocal radius of curvature) of C(ψ), given by:

$$\kappa(\psi) = \frac{C''(\psi)}{\left[1 + \left(C'(\psi)\right)^2\right]^{\frac{3}{2}}}.\tag{18}$$

Evaluating κ(0) using Eq. (18) and substituting into Eq. (17) results in the following expressions:

$$\chi\_{\rm S} = 0 \tag{19a}$$

$$\chi\_{\rm A} = \frac{\tau (\tau - 1) \psi\_0}{\left(c\_0^2 \tau^2 + \psi\_0^2\right)^{\frac{3}{2}}} \xi\_{\rm A}^2 \tag{19b}$$

$$\chi\_{\rm W2} = -\frac{\tau(\tau+1)}{\psi\_0^2} \xi\_{\rm W2}^2 \tag{19c}$$

$$\chi\_{\rm W4} = -\frac{\tau^2 + 3\tau + 2}{3\psi\_0^2} \xi\_{\rm W4}^2. \tag{19d}$$

The zero curvature parameter (i.e., infinite radius of curvature) of the spherical CF indicates that C_S(ψ) is linear at ψ = 0, whereas the positive and negative curvature parameters of the Askey and Wendland CFs, respectively, show that the former is convex and the latter concave at ψ = 0. Finally, only χ_A shows a dependence on c₀.
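Eq. (19c), for instance, can be verified numerically (a sketch with illustrative c₀, ψ₀, τ values chosen here; note that C′(0) = 0 for the Wendland CFs, so κ(0) reduces to C″(0)):

```python
c0, psi0, tau = 1.0, 0.2, 6   # illustrative parameter values

def C_w2(psi):
    """C2-Wendland CF on its support [0, psi0]."""
    u = psi / psi0
    return c0 * (1 + tau*u) * (1 - u)**tau

# One-sided second-order finite difference for C''(0) (C is defined for psi >= 0 only):
h = 1e-4
d2 = (2*C_w2(0) - 5*C_w2(h) + 4*C_w2(2*h) - C_w2(3*h)) / h**2

# Closed-form second derivative behind Eq. (19c): C''(0) = -c0*tau*(tau+1)/psi0**2.
# Since C'(0) = 0 and C(0) = c0, Eq. (17) then gives the negative (concave)
# curvature parameter chi_W2 = -tau*(tau+1)/psi0**2 * xi_W2**2.
d2_closed = -c0 * tau * (tau + 1) / psi0**2
```

The finite difference reproduces the closed form (here −1050) to within the truncation error of the stencil.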

# **3.2 Spectral Characteristics**

The two spectral characteristics discussed in this section are the magnitude of the zeroth-degree spherical harmonic coefficient and the spectral decay rate. The magnitude of the zeroth-degree coefficient G(0) denotes the spectral maximum. Substituting n = 0 in Eq. (7) yields the expression:

$$G(0) = \frac{1}{2} \int\_0^\pi C(\psi) \sin \psi \,\mathrm{d}\psi,\tag{20}$$

which also corresponds to the average of C(ψ) over the sphere. The analytical solution of Eq. (20) for the spherical CF results in:

$$G\_{\rm S}(0) = \frac{c\_0[\psi\_0^3 - 3\sin(\psi\_0) + 3\psi\_0\cos(\psi\_0)]}{2\psi\_0^3}.\tag{21}$$

The integral of Eq. (20) is rewritten for the Askey CF as:

$$G\_{\rm A}(0) = \frac{c\_0}{2\psi\_0^{\tau}} \int\_0^{\psi\_0} (\psi\_0 - \psi)^{\tau} \sin\psi \,\mathrm{d}\psi \qquad (22)$$

**Fig. 4** Evaluation of G(0) for the (**a**) spherical, (**b**) Askey, (**c**) C<sup>2</sup>-Wendland and (**d**) C<sup>4</sup>-Wendland CFs. A variance of c₀ = 1 is selected in all cases. The magnitude of G(0) is provided in logarithmic scale

and has the following analytical solution (Prudnikov et al. 1986, §2.5.5, eq. 1):

$$G\_{\rm A}(0) = \frac{c\_0 \psi\_0^2}{2(\tau^2 + 3\tau + 2)} \,\_1F\_2\left(1; \frac{\tau + 3}{2}, \frac{\tau + 4}{2}; -\frac{\psi\_0^2}{4}\right), \quad (23)$$

where \_pF\_q(a₁, …, a_p; b₁, …, b_q; x) is the generalized hypergeometric series (Gradshteyn and Ryzhik 2014, §9.14, eq. 1). Proceeding in the same way and using the relation of Prudnikov et al. (1986, §2.5.7, eq. 1), we derive the expression

$$\begin{aligned} G\_{\rm W2}(0) &= G\_{\rm A}(0) - \frac{i\,\tau c\_0 \psi\_0}{4} \mathrm{B}(2, \tau + 1) \times \\ &\quad \left[ \,\_1F\_1(2; \tau + 3; i\psi\_0) - \,\_1F\_1(2; \tau + 3; -i\psi\_0) \right] \end{aligned} \tag{24}$$

for the C<sup>2</sup>-Wendland CF and

$$\begin{aligned} G\_{\rm W4}(0) &= G\_{\rm W2}(0) - \frac{i(\tau^2 - 1)c\_0 \psi\_0}{12} \mathrm{B}(3, \tau + 1) \times \\ &\quad \left[ \,\_1F\_1(3; \tau + 4; i\psi\_0) - \,\_1F\_1(3; \tau + 4; -i\psi\_0) \right] \end{aligned} \tag{25}$$

for the C<sup>4</sup>-Wendland CF, with i being the imaginary unit and B(x, y) the beta function (Gradshteyn and Ryzhik 2014, §8.380, eq. 1). The magnitude of G(0) is presented in Fig. 4 with respect to different ψ₀ and τ values, and for c₀ = 1. The corresponding magnitude for c₀ ≠ 1 is given by G(0)|_{c₀=a} = a·G(0)|_{c₀=1}. It is evident from Fig. 4 that G(0) has a larger magnitude for increasing ψ₀ and decreasing τ. The inequality G_S(0) > G_W4(0) > G_W2(0) > G_A(0) also holds true for a specific triplet of {c₀, ψ₀, τ} values.
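The behavior in Fig. 4 can be reproduced by direct quadrature of Eq. (20) (a sketch; the parameter values are illustrative, and the CF expressions used here are the standard forms consistent with Eqs. (19a)–(19d), since Table 1 is not reproduced above):

```python
import math

c0, psi0, tau = 1.0, 0.5, 6   # illustrative parameter values

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    acc = f(a) + f(b)
    for i in range(1, n):
        acc += (4 if i % 2 else 2) * f(a + i*h)
    return acc * h / 3

def G0(C):
    # The CFs vanish outside [0, psi0], so Eq. (20) reduces to this range.
    return 0.5 * simpson(lambda p: C(p) * math.sin(p), 0.0, psi0)

u = lambda p: p / psi0
C_s  = lambda p: c0 * (1 - 1.5*u(p) + 0.5*u(p)**3)
C_a  = lambda p: c0 * (1 - u(p))**tau
C_w2 = lambda p: c0 * (1 + tau*u(p)) * (1 - u(p))**tau
C_w4 = lambda p: c0 * (1 + tau*u(p) + (tau**2 - 1)/3 * u(p)**2) * (1 - u(p))**tau

G_s, G_a, G_w2, G_w4 = G0(C_s), G0(C_a), G0(C_w2), G0(C_w4)

# Closed form of Eq. (21) for the spherical CF:
G_s_closed = c0 * (psi0**3 - 3*math.sin(psi0) + 3*psi0*math.cos(psi0)) / (2*psi0**3)
```

For this {c₀, ψ₀, τ} triplet the quadrature matches Eq. (21) and reproduces the ordering G_S(0) > G_W4(0) > G_W2(0) > G_A(0).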

The spectral decay rate describes how the magnitude of G(n) changes for increasing degree n. It is defined in decibels per octave using the equation:

$$
\mu = -\frac{G^{\text{[dB]}}(n\_2) - G^{\text{[dB]}}(n\_1)}{\log\_2(n\_2) - \log\_2(n\_1)},\tag{26}
$$

where the coefficients G^[dB](n) are expressed in decibels as follows:

$$G^{\text{[dB]}}(n) = 20 \log \left( \left| \frac{G(n)}{G(0)} \right| \right). \tag{27}$$

Equation (26) represents the magnitude change every time n doubles. A positive value of μ indicates magnitude decay, whereas a negative value shows magnitude gain. Based on the results of Fig. 2, the decay rate of G(n) is relatively small at lower degrees and converges to a maximum value, which remains constant at higher degrees. This is also evident in Fig. 5, where the spectral decay rate of the C<sup>4</sup>-Wendland CF, denoted as μ_W4, is evaluated for consecutive degrees (i.e., n₁ = n and n₂ = n + 1). The strong fluctuations of μ_W4 for τ = 6 occur due to the oscillating behavior of G_W4(n) for small τ values. Additional numerical experiments show that the convergent value of μ for a specific CF is not influenced by c₀, ψ₀ or τ. For n₁ = 100 and n₂ = 200, the decay rate of

**Fig. 5** Spectral decay rate of C<sup>4</sup>-Wendland CF for consecutive degrees

the four CFs under study is estimated in decibels per octave as follows: μ_S = 18, μ_A = 18, μ_W2 = 30 and μ_W4 = 42.
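For intuition on Eqs. (26)–(27) (an illustrative sketch, not the paper's computation): coefficients with an algebraic tail G(n) ∝ n^(−p) give a constant decay rate of 20·p·log₁₀(2) ≈ 6.02·p dB/octave, suggesting that the estimated rates of 18, 30 and 42 dB/octave correspond to tails of roughly n⁻³, n⁻⁵ and n⁻⁷:

```python
import math

def decay_rate(G, n1, n2):
    """Spectral decay rate of Eqs. (26)-(27) in dB/octave. The G(0)
    normalization of Eq. (27) cancels in the difference of Eq. (26),
    so G may be supplied up to a constant factor."""
    gdb = lambda n: 20 * math.log10(abs(G(n)))
    return -(gdb(n2) - gdb(n1)) / (math.log2(n2) - math.log2(n1))

# A power-law spectrum G(n) = n**-p decays at 20*p*log10(2) dB/octave:
for p in (3, 5, 7):
    print(p, decay_rate(lambda n: n**-p, 100, 200))
```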

# **4 Summary and Conclusions**

In this contribution, we examined the spatial and spectral properties of the spherical, Askey, C<sup>2</sup>- and C<sup>4</sup>-Wendland CFs, which are all isotropic, positive-definite and compactly supported functions defined on the sphere.

The spatial assessment is performed by analyzing the CFs' shape, correlation length and curvature parameter. The shape of the spherical and Askey CFs exhibits a sharp decay of the covariance value at ψ = 0; therefore, these models are better suited for modeling the stochastic behavior of geophysical signals with a sharply decreasing empirical CF at the origin. Analytical expressions are developed that allow the rigorous calculation of the correlation length of the spherical and Askey CFs. Due to theoretical limitations, similar rigorous expressions cannot be derived for the Wendland models. At present, this issue is resolved with the development of approximate expressions. The examination of the correlation length for the four CFs using the same set of parameters resulted in the following inequality: ξ_S > ξ_W4 > ξ_W2 > ξ_A. Analytical expressions for the evaluation of the curvature parameter of the four CFs under study are also developed. Since the curvature parameter depends on the correlation length, these expressions are again exact for the spherical and Askey CFs, and approximate for the Wendland models. The spherical CF has a zero curvature parameter, which suggests a linear decrease of the covariance values at ψ = 0. The positive and negative curvature parameter of the Askey and Wendland CFs, respectively, is also reflected in their convex and concave shape at the origin.

The assessment of the four CFs in the spectral domain is performed by calculating the spherical harmonic coefficients and examining the spectral maximum and spectral decay rate. All spherical harmonic coefficients are positive, as a result of the first Schoenberg criterion for positive-definite functions on the sphere. The spectral maximum G(0) can be evaluated using analytical expressions that are given in terms of the hypergeometric function for the Askey and Wendland CFs. A smaller spectral maximum, which represents the mean value of the CF over the sphere, appears to be associated with a smaller correlation length. Although this connection is not mathematically proven, it is empirically corroborated by the inequality G_S(0) > G_W4(0) > G_W2(0) > G_A(0), which is based on numerical evidence. The visual inspection of the CF spectrum and the evaluation of the spectral decay rate for consecutive degrees shows that the decay rate increases in low degrees and converges to a maximum value in higher degrees. This maximum value is estimated numerically in the convergence region. Results showed that the spherical and Askey CFs have a similar spectral decay rate (18 dB/octave), whereas the C<sup>2</sup>- and C<sup>4</sup>-Wendland CFs have higher decay rates (30 and 42 dB/octave, respectively). Additional experiments show that the maximum spectral decay rate of each CF does not depend on any of its parameters.

Due to their spatial structure, all CFs examined in this work can be used to model positively correlated signals. In practice, empirical CFs of geophysical signals often exhibit an oscillatory behavior at large distances that can result in negative correlations. The design of compactly supported, positive-definite CFs that account for such oscillations is therefore greatly needed, since they would provide better modeling options.

The analysis presented in this work contributes to the general understanding of the behavior of some frequently used polynomial CFs on the sphere. The same investigation and comparison can be performed for various other models and the results can be further utilized in the context of stochastic modeling of spatial signals for geodetic applications.

**Acknowledgements** This work has been produced with the financial assistance of the European Union and the European Space Agency under the project FRM4S6 (Fiducial Reference Systems for Sentinel-6, No. 4000129892/20/NL/FF/ab). The views expressed herein can in no way be taken to reflect the official opinion of the European Union and/or European Space Agency.

**Author Contributions** DP conceptualized the study, performed the mathematical derivations and carried out the numerical experiments. CK, SPM and MGS provided editorial feedback and suggestions for improving the analysis, and validated the manuscript.

**Data Availability Statement** No data are used for this study.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **MDBs Versus MIBs in Case of Multiple Hypotheses: A Study in Context of Deformation Analysis**

Safoora Zaminpardaz and Peter J. G. Teunissen

#### **Abstract**

Statistical testing procedures employed in geodetic quality control often consist of two steps: detection and identification. In the detection step, the null hypothesis (working model) H_0 undergoes a validity check. If the outcome of the detection step is the rejection of H_0, identification of the potential source of model error is exercised through a search among the specified alternative hypotheses. The testing performance is thus driven not only by the ability to detect biases but also to correctly identify them. The detection capability of a testing regime is usually assessed by its Minimal Detectable Bias (MDB) given a certain correct detection probability. The information provided by the MDB only concerns correct detection and *not* correct identification. The testing identification performance should be evaluated by its Minimal Identifiable Bias (MIB) given a certain correct identification probability. In this contribution, we demonstrate the difference between MDB and MIB. It is hereby highlighted that a small MDB (or a high probability of correct detection) does not necessarily imply a small MIB (or a high probability of correct identification). The factors driving the difference between detection and identification performance are illustrated using a simple example. Our analysis is then continued in the framework of deformation monitoring.

#### **Keywords**

Deformation monitoring - Detection, identification and adaptation (DIA) - Minimal detectable bias (MDB) - Minimal identifiable bias (MIB) -Statistical testing

Authors Safoora Zaminpardaz and Peter J.G. Teunissen contributed equally to this work.

S. Zaminpardaz (-)

School of Science, RMIT University, Melbourne, VIC, Australia e-mail: safoora.zaminpardaz@rmit.edu.au

P. J. G. Teunissen

Department of Geoscience and Remote Sensing, Delft University of Technology, Delft, The Netherlands

GNSS Research Centre, School of Earth and Planetary Sciences, Curtin University of Technology, Perth, WA, Australia

Department of Infrastructure Engineering, The University of Melbourne, Melbourne, VIC, Australia e-mail: p.j.g.teunissen@tudelft.nl

# **1 Introduction**

In geodetic quality control, statistical testing procedures often consist of two steps: *detection* and *identification* (Baarda 1968; Teunissen 1985; Caspary and Borutta 1987; Kösters and Van der Marel 1990; Amiri Simkooei 2001; Perfetti 2006; Lehmann and Lösler 2017; Klein et al. 2019; Nowel 2020). In the detection step, the validity of the null hypothesis H<sup>0</sup> is checked. If H<sup>0</sup> is rejected in the detection step, an identification is carried out as to which of the alternative hypotheses to select. In case there is only one alternative hypothesis, say H1, the rejection of H<sup>0</sup> is equivalent to the selection of H1. Thus, 'correct detection' of mismodelling error would be equivalent to

© The Author(s) 2023

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_208

'correct identification' of it when working with a single alternative hypothesis. This is however not the case if one has to deal with multiple alternative hypotheses. In this contribution, for multiple-alternative testing, we study the performance of the detection and identification steps using the concepts of the minimal detectable bias (MDB) and the minimal identifiable bias (MIB), respectively, and highlight the factors driving the difference between them.

This contribution is structured as follows. In Sect. 2, we describe the null and alternative hypotheses, and highlight the role of the misclosure space partitioning in testing these hypotheses. The testing decisions and their probabilities are discussed, whereby the following events are defined: correct acceptance (CA), false alarm (FA), correct detection (CD), missed detection (MD), correct identification (CI) and wrong identification (WI). The concepts of MDB and MIB are discussed in Sect. 3 for a testing procedure comprising detection and identification steps. It is hereby highlighted that the MDB provides information about correct detection and not about correct identification. To provide insight into the difference between the MDB and the MIB, we compare them in Sect. 4, for a simple multiple-hypothesis testing example. It is demonstrated, in graphical form, that the MIB could be significantly larger than the MDB. The MDB-MIB comparison is then continued for actual deformation measurement system examples in Sect. 5. Finally a summary with conclusions is presented in Sect. 6.

We use the following notation: The n-dimensional space of real numbers is denoted as R^n, and the set of points on the circumference of the n-dimensional zero-centered unit sphere as S^n. Random vectors are indicated by the underlined symbol. Thus t̲ ∈ R^n is a random vector, while t is not. The squared weighted norm of a vector, with respect to a positive-definite matrix Q, is defined as ‖·‖²_Q = (·)^T Q^{−1} (·). H is reserved for statistical hypotheses, P for regions partitioning the misclosure space, and N(x, Q) for the normal distribution with mean x and variance matrix Q. P(·) denotes the probability of the occurrence of the event within parentheses. The symbol '∼ H' should be read as 'distributed as … under H'. The superscripts T and −1 are used to denote the transpose and the inverse of a matrix.

# **2 Statistical Hypothesis Testing**

In any quality control procedure, a set of hypotheses, including a null and several alternative hypotheses, are postulated to explain the phenomenon in question. For example, in geodetic deformation monitoring, the null hypothesis describes the 'all-stable, no movement' model, while the alternative hypotheses capture different dynamic behaviors of the structure under consideration. Let the observational model under the null hypothesis H0, a.k.a. working hypothesis, be given as

$$\mathcal{H}\_0 \colon \quad \mathsf{E}(\underline{\mathbf{y}}) = A\mathbf{x}; \quad \mathsf{D}(\underline{\mathbf{y}}) = \mathcal{Q}\_{\mathbf{y}\mathbf{y}} \tag{1}$$

with E(·) the expectation operator, D(·) the dispersion operator, y ∈ R^m the normally distributed random vector of observables linked to the estimable unknown parameters x ∈ R^n through the design matrix A ∈ R^{m×n} of rank(A) = n, and Q_yy ∈ R^{m×m} the positive-definite variance matrix of y. The redundancy of H_0 is r = m − rank(A) = m − n.

The validity of the null hypothesis can be violated if the functional model and/or the stochastic model are misspecified. Here we assume that a misspecification is restricted to an underparametrization of the mean of y, which is the most common error that occurs when formulating the model (Teunissen 2017). Thus, the alternative hypothesis H<sup>i</sup> is formulated as

$$\mathcal{H}\_i: \quad \mathsf{E}(\underline{\mathbf{y}}) = A\mathbf{x} + C\_i b\_i; \quad \mathsf{D}(\underline{\mathbf{y}}) = \mathcal{Q}\_{\mathbf{y}\mathbf{y}} \quad (2)$$

for some vector C_i b_i ∈ R^m \ {0} such that [A C_i] is a known matrix of full rank and b_i is an unknown vector.

# **2.1 Misclosure Space Partitioning**

Let us assume that there are k types of mismodelling errors in the form of C_i b_i (cf. 2) when parametrizing the mean of the observations. The information required to validate the hypotheses at hand is contained in the *misclosure* vector t ∈ R^r given as (Teunissen 2006)

$$
\underline{t} = B^T \underline{\mathbf{y}} \tag{3}
$$

where B ∈ R^{m×r} is a full-rank matrix, with rank(B) = r, such that [A B] ∈ R^{m×m} is invertible and A^T B = 0. With C₀b₀ = 0 and given that y ∼ N(Ax + C_i b_i, Q_yy) under H_i for i = 0, 1, …, k, the misclosure vector is then distributed as

$$\underline{t} \stackrel{\mathcal{H}\_i}{\sim} \mathcal{N}(C\_{t\_i} b\_i,\; Q\_{tt} = B^T Q\_{yy} B), \quad \text{for} \quad i = 0, 1, \ldots, k \tag{4}$$

with C_{t_i} = B^T C_i. As t has a known Probability Density Function (PDF) under H_0, which is the PDF of N(0, Q_tt), any statistical testing procedure is driven by the misclosure vector t and its known PDF under H_0.

An unambiguous testing procedure can be established by assigning the outcomes of t to the statistical hypotheses H_i for i = 0, 1, …, k, which can be realized through a *partitioning* of the misclosure space R^r (Teunissen 2018). Let P_i ⊂ R^r (i = 0, 1, …, k) be a partitioning of the misclosure space, i.e. ∪_{i=0}^{k} P_i = R^r and P_i ∩ P_j = ∅ for i ≠ j. The unambiguous testing procedure is then defined as

select H_i if and only if t ∈ P_i for i = 0, 1, …, k (5)

We note, although in (5) the statistical testing is formulated in the misclosure vector t, that one can equally well work with the least-squares residual vector ê₀ = y − A x̂₀, where x̂₀ = (A^T Q_yy^{−1} A)^{−1} A^T Q_yy^{−1} y. By using the relation t = B^T ê₀, there is no explicit need to compute t, as testing can be expressed directly in ê₀ (Teunissen 2006).
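As a toy numerical illustration of Eq. (3) (the numbers are invented here; A = e₄, the 4-vector of ones, anticipates the example of Sect. 4): taking successive differences as the columns of B gives A^T B = 0, so t = B^T y is invariant to the parameter x and reflects only the noise (and any bias):

```python
# A = (1,1,1,1)^T; the columns of B are successive differences, so each
# column sums to zero, i.e. A^T B = 0 and rank(B) = r = m - n = 3.
A = [1.0, 1.0, 1.0, 1.0]
B = [[ 1.0,  0.0,  0.0],
     [-1.0,  1.0,  0.0],
     [ 0.0, -1.0,  1.0],
     [ 0.0,  0.0, -1.0]]

def misclosure(y):
    """t = B^T y, cf. Eq. (3)."""
    return [sum(B[i][j] * y[i] for i in range(4)) for j in range(3)]

x = 10.0
e = [0.3, -0.1, 0.2, -0.4]                 # invented noise realization
y = [A[i] * x + e[i] for i in range(4)]    # observations under H0

t_y = misclosure(y)
t_e = misclosure(e)   # identical to t_y: t does not depend on x
```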

# **2.2 Testing Decisions**

As (5) shows, the testing decisions are driven by the outcome of the misclosure vector t. Under each hypothesis H_i (i = 0, 1, …, k), the outcome of t can lead to k + 1 different decisions, out of which only one is correct, i.e. when t ∈ P_i. With k + 1 hypotheses H_i (i = 0, 1, …, k), one can define different statistical events, including Correct Acceptance (CA), False Alarm (FA), Missed Detection (MD), Correct Detection (CD), Correct Identification (CI) and Wrong Identification (WI). The definitions of these events together with their links are illustrated in Fig. 1. In this figure, the events under alternative hypotheses are given an identifying index, as they differ from alternative to alternative. In addition, the contributions of different alternative hypotheses to the events of false alarm and wrong identification are distinguished by means of an index.

Given the translational property of the PDF of t under the null and alternative hypotheses (cf. 4), the probabilities of the events in Fig. 1 can be computed based on the misclosure PDF under H_0, denoted by f_t(τ|H_0), as

$$\begin{aligned} \mathsf{P}\_{\mathsf{FA}} &= \mathsf{P}(\underline{t} \notin \mathcal{P}\_0 \,|\, \mathcal{H}\_0) = \int\_{\mathbb{R}^r \backslash \mathcal{P}\_0} f\_{\underline{t}}(\tau \,|\, \mathcal{H}\_0)\, \mathrm{d}\tau \\ \mathsf{P}\_{\mathsf{CA}} &= 1 - \mathsf{P}\_{\mathsf{FA}} \\ \mathsf{P}\_{\mathsf{CD}\_i} &= \mathsf{P}(\underline{t} \notin \mathcal{P}\_0 \,|\, \mathcal{H}\_i) = \int\_{\mathbb{R}^r \backslash \mathcal{P}\_0} f\_{\underline{t}}(\tau - C\_{t\_i} b\_i \,|\, \mathcal{H}\_0)\, \mathrm{d}\tau \\ \mathsf{P}\_{\mathsf{MD}\_i} &= 1 - \mathsf{P}\_{\mathsf{CD}\_i} \\ \mathsf{P}\_{\mathsf{CI}\_i} &= \mathsf{P}(\underline{t} \in \mathcal{P}\_i \,|\, \mathcal{H}\_i) = \int\_{\mathcal{P}\_i} f\_{\underline{t}}(\tau - C\_{t\_i} b\_i \,|\, \mathcal{H}\_0)\, \mathrm{d}\tau \\ \mathsf{P}\_{\mathsf{WI}\_i} &= \mathsf{P}\_{\mathsf{CD}\_i} - \mathsf{P}\_{\mathsf{CI}\_i} \end{aligned} \tag{6}$$

The probability of false alarm P_FA is usually set a priori by the user. We note that the last four probabilities all depend on the *unknown* b_i, which one needs to set in order to evaluate these four probabilities.

Here, it is important to note the difference between the probabilities of correct detection and correct identification, i.e. P_CD_i ≥ P_CI_i. These two probabilities would be identical if there is only one alternative hypothesis, say H_i, since then P_i = R^r \ P₀. Similar to the CD- and CI-probability, we have the concepts of the minimal detectable bias (MDB) (Baarda 1968) and the minimal identifiable bias (MIB) (Teunissen

**Fig. 1** An overview of testing decisions, driven by the misclosure vector t, under null and alternative hypotheses

2018). In the following sections, we highlight the difference between the MDB (linked to P_CD_i) and the MIB (linked to P_CI_i).

# **3 Testing Performance**

Statistical testing procedures employed in quality control often comprises two steps (Baarda 1968; Teunissen 1985; Caspary and Borutta 1987; Kösters and Van der Marel 1990; Amiri Simkooei 2001; Perfetti 2006; Lehmann and Lösler 2017; Nowel 2020), as follows


The testing performance is thus driven not only by the ability to detect biases but also to correctly identify them. While the former is measured by means of the MDB (or alternatively the CD-probability), the latter should be measured using the MIB (or alternatively the CI-probability) (Teunissen 2018; Zaminpardaz and Teunissen 2019; Imparato et al. 2019). Note, in the single-redundancy case r = 1, that P₁ = … = P_k = R^r \ P₀, implying that the alternative hypotheses are not distinguishable from one another, and thus identification would not be possible.

# **3.1 Minimal Detectable Bias (MDB)**

The concept of the MDB was introduced in Baarda (1967, 1968) as a diagnostic tool for measuring the ability of the testing procedure to *detect* misspecifications of the model. The MDB, for each alternative hypothesis H_i, is defined as the smallest size of b_i that can be detected given a certain CD- and FA-probability. As the third equality in (6) shows, P_CD_i depends, in addition to the PDF of t under H_0 and b_i, also on P₀, which is commonly defined as (Baarda 1968; Teunissen 2006)

$$\mathcal{P}\_0 = \left\{ t \in \mathbb{R}^r \,\middle|\, \|t\|\_{Q\_{tt}}^2 \le \chi^2\_{1-\mathsf{P}\_{\mathsf{FA}}}(r,0) \right\} \tag{7}$$

where χ²_{1−P_FA}(r, 0) is the (1 − P_FA) quantile of the central Chi-square distribution with r degrees of freedom. Using (7), one in fact compares the test statistic ‖t‖²_{Q_tt} against the critical value χ²_{1−P_FA}(r, 0), with user-defined P_FA, to decide whether H_0 is valid or not. This testing process is called the *overall model test*, which would be a Uniformly Most Powerful Invariant (UMPI) detector test in case of dealing with a single alternative hypothesis (Arnold 1981; Teunissen 2006; Lehmann and Voß-Böhme 2017).

With (7), the CD-probability of H<sup>i</sup> is given by

$$\mathsf{P}\_{\mathsf{CD}\_i} = \mathsf{P}\left(\|\underline{t}\|\_{Q\_{tt}}^2 > \chi^2\_{1-\mathsf{P}\_{\mathsf{FA}}}(r,0) \,\middle|\, \mathcal{H}\_i\right) \tag{8}$$

where, according to (4), ‖t‖²_{Q_tt} under H_i has a non-central Chi-square distribution with r degrees of freedom and non-centrality parameter λ²_i = ‖C_{t_i} b_i‖²_{Q_tt}. One can compute λ²_i = λ²(P_FA, P_CD_i, r) from the Chi-square distribution for a given model redundancy r, CD-probability P_CD_i and FA-probability P_FA. If b_i ∈ R is a scalar, then C_{t_i} takes the form of a vector c_{t_i}, and the MDB is given by (Baarda 1968; Teunissen 2006)

$$b\_i \in \mathbb{R}: \quad |b\_{i, \text{MDB}}| = \frac{\lambda(\mathsf{P}\_{\text{FA}}, \mathsf{P}\_{\text{CD}\_i}, r)}{\|c\_{t\_i}\|\_{Q\_{tt}}} \tag{9}$$

which shows that for a given set of {P_FA, P_CD_i, r}, the MDB depends on ‖c_{t_i}‖_{Q_tt}. For the higher-dimensional case when b_i ∈ R^{q>1} is a vector instead of a scalar, a similar expression can be obtained. Let the bias vector be parametrized, in terms of its magnitude ‖b_i‖ and its unit direction vector d, as b_i = ‖b_i‖ d. Then the MDB along the direction d ∈ S^{q−1} is given by (Teunissen 2006)

$$b\_i \in \mathbb{R}^{q>1}: \quad \|b\_{i, \text{MDB}}(d)\| = \frac{\lambda(\mathsf{P}\_{\text{FA}}, \mathsf{P}\_{\text{CD}\_i}, r)}{\|C\_{t\_i}d\|\_{Q\_{tt}}}; \quad d \in \mathbb{S}^{q-1} \tag{10}$$

If the unit vector d sweeps the surface of the unit sphere S^{q−1}, an ellipsoidal region is obtained of which the boundary defines the MDBs in different directions. The shape and the orientation of this ellipsoidal region are governed by the variance matrix Q_{b̂_i b̂_i} = (C_{t_i}^T Q_tt^{−1} C_{t_i})^{−1}, and its size is determined by λ(P_FA, P_CD_i, r) (Zaminpardaz et al. 2015; Zaminpardaz 2016).

The MDB concept expresses the sensitivity of the *detection* step of the testing procedure. One can compare the MDBs of different alternative hypotheses for a given set of {P_FA, P_CD, r}, which provides information on how sensitive the rejection of H_0 is to H_i-biases of the size of their MDBs. The smaller the MDB is, the more sensitive is the rejection of H_0.

# **3.2 Minimal Identifiable Bias (MIB)**

As the last equality in (6) shows, a high CD-probability P_CD_i does not necessarily imply a high CI-probability P_CI_i, unless we have the special case of only a single alternative hypothesis. Therefore, in case of multiple hypotheses, the MDB does *not* provide information about correct identification. To assess the sensitivity of the identification step, one can analyse the MIBs of the alternative hypotheses. The MIB of the alternative hypothesis H_i is defined as the smallest size of b_i that can be identified given a certain CI-probability (Teunissen 2018).

The MIB corresponding with H_i can be found by inverting the fifth equality in (6). This inversion is, however, not trivial, as P_CI_i is an r-fold integral over the complex region P_i. One can resort to numerical evaluation techniques. For example, the MIBs in Sect. 4 are numerically computed as follows. The probability P_CI_i is computed, by means of Monte Carlo simulation, see e.g. Teunissen (2018), at discrete biases b_i, and then the bias at which P_CI_i gets close enough to the pre-set CI-probability is the MIB sought.
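A minimal Monte Carlo sketch of this kind of computation (a toy setup invented here, not the measurement systems of Sect. 5): r = 2, Q_tt = I₂, two alternatives with orthogonal signature vectors, P₀ the overall-model-test region of Eq. (7), and identification by the larger |w|-statistic. For r = 2 the Chi-square quantile is available in closed form, χ²_{1−P_FA}(2, 0) = −2 ln P_FA:

```python
import math, random

random.seed(1)
P_FA = 0.01
crit = -2.0 * math.log(P_FA)        # chi-square quantile for r = 2 dof
c1, c2 = (1.0, 0.0), (0.0, 1.0)     # signature vectors of H1, H2 (toy choice)
b = 4.0                             # bias size under the true hypothesis H1
N = 20000

n_cd = n_ci = 0
for _ in range(N):
    t = (b + random.gauss(0, 1), random.gauss(0, 1))   # t ~ N(c1*b, I2) under H1
    if t[0]**2 + t[1]**2 > crit:                       # t outside P0: detection
        n_cd += 1
        w1 = abs(c1[0]*t[0] + c1[1]*t[1])              # w-test statistics
        w2 = abs(c2[0]*t[0] + c2[1]*t[1])              # (Q_tt = I2 here)
        if w1 >= w2:                                   # H1 identified
            n_ci += 1

p_cd, p_ci = n_cd / N, n_ci / N
```

By construction p_ci ≤ p_cd; scanning the bias size b until p_ci reaches the pre-set CI-probability yields the MIB, exactly as described above.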

According to the fifth equality in (6), the MIB for a given $P\_{\mathrm{CI}\_i}$ depends on the probability mass of the PDF of $\underline{t}$ under $\mathcal{H}\_i$ over $\mathcal{P}\_i$. This probability mass is driven by the shape and size of $\mathcal{P}\_i$, and by the magnitude of $\mathsf{E}(\underline{t}|\mathcal{H}\_i)$ and its direction with respect to the borders of $\mathcal{P}\_i$. Note, if $b\_i \in \mathbb{R}^{q>1}$ is a vector, then a given CI-probability yields different MIBs along different directions in $\mathbb{R}^q$. In this case, a pre-set CI-probability defines a region in $\mathbb{R}^q$ the boundary of which defines the MIBs in different directions. The MIB of $\mathcal{H}\_i$ for a given CI-probability is denoted by $|b\_{i,\text{MIB}}|$ if $b\_i \in \mathbb{R}$, and by $\|b\_{i,\text{MIB}}(d)\|$ along the unit direction $d \in \mathbb{S}^{q-1}$ if $b\_i \in \mathbb{R}^{q>1}$.

# **4 MDB Versus MIB**

As for a given bias $b\_i$ the CD-probability is never smaller than the CI-probability, i.e. $P\_{\mathrm{CD}\_i} \ge P\_{\mathrm{CI}\_i}$, then for a given $P\_{\mathrm{CD}\_i} = P\_{\mathrm{CI}\_i}$ we have

$$\begin{array}{lcl} b\_i \in \mathbb{R} &: |b\_{i, \text{MIB}}| \ge |b\_{i, \text{MDB}}| \\ b\_i \in \mathbb{R}^{q > 1} &: \|b\_{i, \text{MIB}}(d)\| \ge \|b\_{i, \text{MDB}}(d)\| \text{ for any } d \in \mathbb{S}^{q - 1} \end{array} \tag{11}$$

The following example elaborates more on the above link between the MDB and the MIB.

*Example* Let $\underline{y} \in \mathbb{R}^4$ contain two pairs of observations of an unknown distance $x \in \mathbb{R}$, made using two different instruments, e.g., two different tape measures. The observations are assumed uncorrelated and equally precise, with the same standard deviation $\sigma$. Under the null hypothesis $\mathcal{H}\_0$, the observations are assumed to be bias-free, whereas under the alternative hypotheses $\mathcal{H}\_i$ ($i = 1, 2$), it is assumed that the observation pair made by one of the instruments is biased by $C\_i b\_i$ ($i = 1, 2$), with $C\_i \in \mathbb{R}^{4\times 2}$ and $b\_i \in \mathbb{R}^2$. These hypotheses are formulated as

$$\begin{array}{ll} \mathcal{H}\_{0}: \mathsf{E}(\underline{\mathbf{y}}) = e\_{4} \, x, & \mathsf{D}(\underline{\mathbf{y}}) = \sigma^{2} I\_{4} \\ \mathcal{H}\_{i}: \mathsf{E}(\underline{\mathbf{y}}) = e\_{4} \, x + \left(u\_{i}^{2} \otimes I\_{2}\right) b\_{i}, & \mathsf{D}(\underline{\mathbf{y}}) = \sigma^{2} I\_{4} \end{array} \tag{12}$$

where $\otimes$ denotes the Kronecker product (Henderson and Pukelsheim 1983), $e\_4 \in \mathbb{R}^4$ the vector of ones, $I\_4 \in \mathbb{R}^{4\times 4}$ the identity matrix, and $u\_i^2 \in \mathbb{R}^2$ the canonical unit vector having one as its $i$th element and zeros otherwise.

The redundancy of the $\mathcal{H}\_0$-model is $r = 4 - 1 = 3 > 1$, which means that, upon the rejection of $\mathcal{H}\_0$, identification of the potential source of error is possible. Under $\mathcal{H}\_1$, it is assumed that the mean-difference of the observables of the second instrument is zero, while under $\mathcal{H}\_2$, this is assumed for the first instrument. To test the three hypotheses in consideration, the following detection and identification steps are exercised:


$$\mathcal{P}\_i = \left\{ t \in \mathbb{R}^r \backslash \mathcal{P}\_0 \;\middle|\; T\_i = \max\_{j \in \{1, \ldots, k\}} T\_j \right\} \tag{13}$$

where

$$T\_i = t^T Q\_{tt}^{-1} C\_{t\_i} \left( C\_{t\_i}^T Q\_{tt}^{-1} C\_{t\_i} \right)^{-1} C\_{t\_i}^T Q\_{tt}^{-1}\, t \tag{14}$$

would be a realization of the Generalized Likelihood Ratio (GLR) test statistic in case there is only one single alternative hypothesis (Teunissen 2006).
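For the example (12), the detector and the identification rule (13) with the statistic (14) can be sketched numerically as follows (a sketch under our own illustrative choices of $\sigma$, bias and random seed, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.1

# The model of Eq. (12): four observations of one distance x.
A = np.ones((4, 1))
Q, _ = np.linalg.qr(A, mode='complete')
B = Q[:, 1:]                                    # misclosure basis, B.T @ A = 0
C = [np.vstack([np.eye(2), np.zeros((2, 2))]),  # C_1: instrument-1 bias
     np.vstack([np.zeros((2, 2)), np.eye(2)])]  # C_2: instrument-2 bias
Qtt_inv = np.linalg.inv(sigma ** 2 * B.T @ B)

def glr(t):
    """T_i of Eq. (14) for each alternative hypothesis."""
    Ts = []
    for Ci in C:
        Cti = B.T @ Ci
        v = Cti.T @ Qtt_inv @ t
        Ts.append(float(v @ np.linalg.solve(Cti.T @ Qtt_inv @ Cti, v)))
    return Ts

# Simulate data under H_1 with bias b_1 = [0.5, -0.5]:
y = A @ np.array([10.0]) + C[0] @ np.array([0.5, -0.5]) \
    + sigma * rng.standard_normal(4)
t = B.T @ y
detect = float(t @ Qtt_inv @ t)      # ||t||^2_Qtt, chi^2(3)-distributed under H_0
best = int(np.argmax(glr(t))) + 1    # index of the identified hypothesis
```

With this large a bias, the detector comfortably exceeds the 1% critical value of the chi-square distribution with 3 degrees of freedom, and the maximum-$T\_i$ rule points at $\mathcal{H}\_1$.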

We note that the vector of misclosures $\underline{t}$ is not uniquely defined. This, however, does not affect the outcome of the above testing procedure, as both the detector $\|\underline{t}\|^2\_{Q\_{tt}}$ and the test statistic $T\_i$ remain invariant under any linear one-to-one transformation of the misclosure vector. Therefore, instead of $\underline{t}$, one can for instance also work with

$$\underline{\bar{t}} \;=\; G^{-T}\underline{t}\;\begin{cases} \overset{\mathcal{H}\_0}{\sim} \mathcal{N}(0,\; I\_r) \\ \overset{\mathcal{H}\_i}{\sim} \mathcal{N}(\bar{C}\_{t\_i} b\_i,\; I\_r) \end{cases} \tag{15}$$

with $\bar{C}\_{t\_i} = G^{-T} C\_{t\_i}$ and the Cholesky factor $G^T$ of the Cholesky factorisation $Q\_{tt} = G^T G$. The advantage of using $\underline{\bar{t}}$ over $\underline{t}$ lies in the ease of visualizing certain effects due to

**Fig. 2** Partitioning of the misclosure space $\mathbb{R}^3$ corresponding with $\underline{\bar{t}}$ (15) using (7) and (13). The blue sphere shows the boundary of $\bar{\mathcal{P}}\_0$ with $P\_{\mathrm{FA}} = 0.1$, while the orthogonal green and red planes separate $\bar{\mathcal{P}}\_1$ from $\bar{\mathcal{P}}\_2$

the identity variance matrix of $\underline{\bar{t}}$ (Zaminpardaz and Teunissen 2019). The partitioning corresponding with $\underline{\bar{t}}$ is denoted by $\bar{\mathcal{P}}\_i$ for $i = 0, 1, 2$.

The misclosure space ($\mathbb{R}^3$) partitioning corresponding with (7) and (13) is shown in Fig. 2. For the sake of visualization, instead of $\underline{t}$, we work with $\underline{\bar{t}}$ defined in (15). The blue sphere shows the boundary of $\bar{\mathcal{P}}\_0$ choosing $P\_{\mathrm{FA}} = 0.1$, while the green and red planes separate $\bar{\mathcal{P}}\_1$ from $\bar{\mathcal{P}}\_2$. The two planes are orthogonal to each other, implying that $\bar{\mathcal{P}}\_1$ and $\bar{\mathcal{P}}\_2$ have the same shape and size.
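The invariance of the detector and of $T\_i$ under the one-to-one transformation (15), which justifies working with $\underline{\bar{t}}$, can be verified numerically; a sketch with a hypothetical 3-D misclosure configuration (all matrices illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical positive-definite variance matrix Q_tt = G^T G, two signatures.
Gt = np.tril(rng.uniform(0.5, 1.5, (3, 3)))   # lower-triangular factor G^T
Qtt = Gt @ Gt.T
Ct = [rng.standard_normal((3, 2)) for _ in range(2)]

def T_stat(t, W, Cti):
    """GLR statistic of Eq. (14) with weight matrix W = Q_tt^{-1}."""
    v = Cti.T @ W @ t
    return float(v @ np.linalg.solve(Cti.T @ W @ Cti, v))

t = rng.standard_normal(3)
tbar = np.linalg.solve(Gt, t)                 # whitened misclosure, Eq. (15)

W = np.linalg.inv(Qtt)
d_orig = float(t @ W @ t)                     # ||t||^2_Qtt
d_bar = float(tbar @ tbar)                    # ||tbar||^2 with identity variance
T_orig = [T_stat(t, W, Ci) for Ci in Ct]
T_bar = [T_stat(tbar, np.eye(3), np.linalg.solve(Gt, Ci)) for Ci in Ct]
```

Both the detector values and the per-hypothesis statistics agree to machine precision, so the testing decisions are unaffected by the whitening.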

As $b\_i$ in (12) is a 2-vector, i.e. $b\_i = [b\_{i,1}, b\_{i,2}]^T$, the MDBs and the MIBs of the alternative hypotheses depend not only on the pre-set CD- and CI-probability, but also on the bias direction in $\mathbb{R}^2$. Figure 3 shows the MDB and MIB curves for $\mathcal{H}\_i$ ($i = 1, 2$) given $\sigma = 0.1$, $P\_{\mathrm{FA}} = 0.1$ and different values of $P\_{\mathrm{CD}\_i} = P\_{\mathrm{CI}\_i}$. In each panel, in agreement with (11), it can be seen that the MIB curve encompasses the MDB curve.

Note, if $\mathsf{E}(\underline{\bar{t}}|\mathcal{H}\_i) = \bar{C}\_{t\_i} b\_i$ lies on the border between $\bar{\mathcal{P}}\_1$ and $\bar{\mathcal{P}}\_2$, that the CI-probability of $\mathcal{H}\_i$ cannot reach above 0.5. As shown in Fig. 2, the regions $\bar{\mathcal{P}}\_1$ and $\bar{\mathcal{P}}\_2$ are separated from each other by the following two planes

$$\bar{t}^T \left( \frac{\bar{C}\_{t\_1}^\perp}{\|\bar{C}\_{t\_1}^\perp\|} \pm \frac{\bar{C}\_{t\_2}^\perp}{\|\bar{C}\_{t\_2}^\perp\|} \right) = 0; \quad \bar{t} \in \mathbb{R}^3 \tag{16}$$

with $\bar{C}\_{t\_i}^\perp \in \mathbb{R}^3$ being a vector whose range space is the orthogonal complement of the range space of $\bar{C}\_{t\_i}$. It can easily be verified that, if $b\_i$ is parallel to $[1, 1]^T$, then $\mathsf{E}(\underline{\bar{t}}|\mathcal{H}\_i)$ lies on the intersection of the above planes. This explains the bands around the direction of $[1, 1]^T$ in Fig. 3 when $P\_{\mathrm{CI}\_i}$ is set to be larger than 0.5. On the other hand, when $b\_i$ is parallel to $[1, -1]^T$, the MDB and the MIB are very close to each other. A bias along the direction of $[1, -1]^T$ makes $\mathsf{E}(\underline{\bar{t}}|\mathcal{H}\_i)$ lie at its farthest position from the planar borders of $\bar{\mathcal{P}}\_1$ and $\bar{\mathcal{P}}\_2$. Thus, under $\mathcal{H}\_i$ ($i = 1, 2$), most of the probability mass of the PDF of $\underline{\bar{t}}$ that lies outside $\bar{\mathcal{P}}\_0$ falls into the region $\bar{\mathcal{P}}\_i$. As a result, $P\_{\mathrm{CD}\_i}$ and $P\_{\mathrm{CI}\_i}$ are very close to each other for a given bias along $[1, -1]^T$, or alternatively, the MDB and the MIB are very close to each other along $[1, -1]^T$ for a pre-set $P\_{\mathrm{CD}\_i} = P\_{\mathrm{CI}\_i}$.
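This border geometry can be checked numerically for example (12); a sketch with $\sigma = 1$, so that the misclosure basis below is already whitened (variable names are ours):

```python
import numpy as np

# Whitened misclosure basis for Eq. (12) with sigma = 1.
A = np.ones((4, 1))
Q, _ = np.linalg.qr(A, mode='complete')
B = Q[:, 1:]                                  # orthonormal columns, B.T @ A = 0
Cb = [B.T @ np.vstack([np.eye(2), np.zeros((2, 2))]),   # C-bar_t1
      B.T @ np.vstack([np.zeros((2, 2)), np.eye(2)])]   # C-bar_t2

def T(t, Cbi):
    """GLR statistic of Eq. (14) in the whitened space."""
    v = Cbi.T @ t
    return float(v @ np.linalg.solve(Cbi.T @ Cbi, v))

t_border = Cb[0] @ np.array([1.0, 1.0])       # H_1 bias along [1, 1]^T
t_inside = Cb[0] @ np.array([1.0, -1.0])      # H_1 bias along [1, -1]^T
```

The bias along $[1, 1]^T$ yields $T\_1 = T\_2$ (i.e., it sits on the border between the identification regions), while the bias along $[1, -1]^T$ yields $T\_1$ clearly larger than $T\_2$.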

The above example clearly shows that the detection performance and the identification performance of a testing procedure can be completely different from each other.

# **5 Deformation Monitoring**

In this section, we continue our MDB-MIB comparison for a dam deformation monitoring case, inspired by an example in Heunecke et al. (2013, p. 227), see also (Zaminpardaz et al. 2020). Figure 4 [top] shows a top view of a dam over a lake,

**Fig. 3** Illustration of the MDB versus the MIB curves for testing the hypotheses in (12) using (7) and (13), given $\sigma = 0.1$ and $P\_{\mathrm{FA}} = 0.1$. The panels from left to right correspond to $P\_{\mathrm{CD}\_i} = P\_{\mathrm{CI}\_i}$ of 0.4, 0.6, 0.8 and 0.99, respectively

**Fig. 4** Deformation monitoring of a dam (Zaminpardaz et al. 2020). [Top] The horizontal monitoring network consists of *four* reference points around the dam and *two* object points on the dam (points 5 and 6). The blue lines indicate the distance+direction measurements between their ending points, and the arrows point from total station

together with two different 2-D terrestrial survey networks designed to monitor the dam's horizontal displacement. For simplicity, it is assumed that the dam is vertically stable. The survey networks consist of two *object* points on the dam subject to displacement (points 5, 6), and four *reference* points in a stable area close to the dam (points 1, 2, 3, 4). To determine horizontal deformations of the dam, two sets

to target. [Bottom] The graphs of the MDB (solid lines) and MIB (dashed lines) of the different alternative hypotheses in (18) as a function of the pre-set probability. The results correspond with the testing procedure in (7) and (13), given $P\_{\mathrm{FA}} = 0.01$

of measurements are collected at two times (or epochs), $l = 1, 2$.

In the survey network shown in Fig. 4 [top-left], each measurement set contains 60 measurements: five distance measurements and five direction measurements taken from each of the six points to the rest of the points by a total station. The distance and direction measurements are assumed to be normally distributed with standard deviations of 1 cm and 10 seconds of arc, respectively. The measurements are assumed to be all uncorrelated. To make the scale, orientation and location of the 2-D survey network estimable, the coordinates of the reference points 1 and 2 (black triangles in Fig. 4 [top]) are assumed given. The 60 distance and direction observations at epoch $l$ are then used to estimate the Easting and Northing of points $i = 3, \ldots, 6$, together with the unknown instrument scale factor (one for the whole network) and six unknown orientations (one per instrument set-up).

To analyse the dam's horizontal displacement, we make use of the epoch-wise estimated coordinates of points $i = 3, \ldots, 6$ and their corresponding variance matrices. Let $x\_{i,l} \in \mathbb{R}^2$ (for $i = 3, \ldots, 6$ and $l = 1, 2$) be the coordinate vector of point $i$ at epoch $l$, and let $x\_l = [x\_{3,l}^T, x\_{4,l}^T, x\_{5,l}^T, x\_{6,l}^T]^T \in \mathbb{R}^8$ for $l = 1, 2$. Under the null hypothesis $\mathcal{H}\_0$, where deformation is absent, we assume

$$\mathcal{H}\_0: \mathbf{x}\_2 = \mathbf{x}\_1 \ \text{(all stable)} \tag{17}$$

The redundancy under $\mathcal{H}\_0$ is $r = 8$. The dam is supposed to be subject to the load of the water in the lake, and hence it is assumed that either only one or both of the dam points may be pushed back in the direction perpendicular to the dam. Thus we have three alternative hypotheses:

$$\begin{aligned} \mathcal{H}\_{i}: \mathbf{x}\_{2} &= \mathbf{x}\_{1} + \left( u\_{i+2}^{4} \otimes d \right) b\_{i} \ \text{(point } i+4 \text{ is unstable, } i=1,2 \text{)}\\ \mathcal{H}\_{3}: \mathbf{x}\_{2} &= \mathbf{x}\_{1} + \left( u \otimes d \right) b\_{3} \ \text{(points 5 and 6 are unstable)} \end{aligned} \tag{18}$$

with $u\_{i+2}^4 \in \mathbb{R}^4$ the canonical unit vector having one as its $(i+2)$th element and zeros otherwise, $u = u\_3^4 + u\_4^4$, $d \in \mathbb{S}^1$ the known unit vector in the direction perpendicular to the dam, and $b\_i \in \mathbb{R}$ the unknown scalar deformation size parameter. Note, under $\mathcal{H}\_3$, that we assume that the object points 5 and 6 deform by the same amount.
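The displacement signature vectors of (18) are simple Kronecker products; a minimal sketch (the dam-normal direction $d$ is an illustrative assumption):

```python
import numpy as np

# Hypothetical dam-normal unit direction d in S^1 (illustrative value).
theta = 0.3
d = np.array([np.cos(theta), np.sin(theta)])

def u(i):
    """Canonical unit vector u_i^4."""
    return np.eye(4)[:, i - 1]

c1 = np.kron(u(3), d)              # H_1: point 5 displaces along d
c2 = np.kron(u(4), d)              # H_2: point 6 displaces along d
c3 = np.kron(u(3) + u(4), d)       # H_3: points 5 and 6 displace together
```

Each signature is an 8-vector matching $x\_l \in \mathbb{R}^8$, and the $\mathcal{H}\_3$ signature is the sum of the single-point signatures.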

We note that since $r = 8 > 1$, our testing procedure involves both the detection step (7) and the identification step (13). Assuming $P\_{\mathrm{FA}} = 0.01$, Fig. 4 [bottom-left] shows the MDB as a function of the CD-probability in solid curves, and the MIB as a function of the CI-probability in dashed curves, for the three hypotheses in (18). For each hypothesis, its MIB graph lies above its MDB graph, corroborating the first inequality in (11). For example, for a given pre-set probability of $P\_{\mathrm{CD}\_i} = P\_{\mathrm{CI}\_i} = 0.98$, there is an offset of almost 6 mm between the MIB and the MDB in case of $\mathcal{H}\_1$ and $\mathcal{H}\_3$, while the difference between the MDB and the MIB of $\mathcal{H}\_2$ is at the sub-mm level.

The MIB-MDB difference will change if the survey network measurement set-up changes. Figure 4 [top-right] shows a survey network obtained by removing 17 pairs of distance/direction measurements from the top-left network. As a result of losing 34 measurements compared to the previous survey network, both the MDBs and the MIBs increase, as shown in Fig. 4 [bottom-right]. It is observed that the MIB and the MDB can differ significantly from each other. For example, for a given pre-set probability of $P\_{\mathrm{CD}\_i} = P\_{\mathrm{CI}\_i} = 0.98$, there is an offset of almost 16 mm between the MIB and the MDB in case of $\mathcal{H}\_1$ and $\mathcal{H}\_3$.

As shown in Fig. 4 [bottom], the MDB and the MIB, for a pre-set probability, differ from hypothesis to hypothesis. For example, for the range of probabilities shown in Fig. 4 [bottom-left], it is observed that

$$\begin{aligned} |b\_{2,\text{MDB}}| &> |b\_{3,\text{MDB}}| > |b\_{1,\text{MDB}}|\\ |b\_{2,\text{MIB}}| &> |b\_{3,\text{MIB}}| > |b\_{1,\text{MIB}}| \end{aligned} \tag{19}$$

As the MDB, for a given set of $\{P\_{\mathrm{FA}}, P\_{\mathrm{CD}\_i}, r\}$, is driven by $\|c\_{t\_i}\|\_{Q\_{tt}}$, the first expression in the above equation can be explained by comparing $\|c\_{t\_i}\|\_{Q\_{tt}}$ for $i = 1, 2, 3$. The larger the value of $\|c\_{t\_i}\|\_{Q\_{tt}}$, the smaller the MDB is expected to be. For example, for the survey network shown in Fig. 4 [top-left], we have

$$\|c\_{t\_1}\|\_{Q\_{tt}} \approx 180; \quad \|c\_{t\_2}\|\_{Q\_{tt}} \approx 105; \quad \|c\_{t\_3}\|\_{Q\_{tt}} \approx 158 \tag{20}$$

which are driven by the network geometry, the measurement precision and the direction of displacement. The above equation implies that $\mathcal{H}\_1$ and $\mathcal{H}\_2$ should, respectively, have the smallest and the largest MDBs among the three alternatives for a pre-set CD-probability. The MIB inequalities in (19) are due to a combination of (20), the shape and size of $\mathcal{P}\_i$, and the magnitude of $\mathsf{E}(\underline{t}|\mathcal{H}\_i)$ and its direction with respect to the borders of $\mathcal{P}\_i$.

# **6 Summary and Concluding Remarks**

In this contribution, a comparative analysis was provided of the detection and identification steps of statistical testing procedures. The *detection* step aims to validate the null hypothesis $\mathcal{H}\_0$, while the *identification* step, upon the rejection of $\mathcal{H}\_0$, aims to select the most likely alternative hypothesis among those in consideration. In case there is only one alternative hypothesis, say $\mathcal{H}\_1$, the rejection of $\mathcal{H}\_0$ is equivalent to the identification of $\mathcal{H}\_1$. This is, however, not the case when working with multiple alternatives. Having different functionalities, the detection and identification performance of a testing procedure should then be assessed using two different diagnostic tools. The detection capability of a testing regime is usually assessed by its Minimal Detectable Bias (MDB), whereas the identification performance should be evaluated by its Minimal Identifiable Bias (MIB).

Using the concept of misclosure space partitioning, we discussed testing decisions and their probabilities. Through this partitioning, it was shown that the distribution of the misclosure vector can be used to determine the correct detection (CD) and correct identification (CI) probabilities of each of the alternative hypotheses. One can then 'invert' these probabilities to determine their corresponding minimal biases, i.e. the MDB and the MIB. It was highlighted that a small MDB (or a high probability of correct detection) does not necessarily imply a small MIB (or a high probability of correct identification), unless one is dealing with the special case of having only one single alternative hypothesis. The factors driving the difference between detection and identification performance were illustrated using a simple multiple-alternative testing example. Our evaluations were extended to basic deformation measurement system examples with multiple alternative hypotheses, where monitoring measurements were provided by a 2-D terrestrial survey network.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **A Simple TLS-Treatment of the Partial EIV-Model as One with Singular Cofactor Matrices I: The Case of a KRONECKER Product for $Q\_A = Q\_0 \otimes Q\_x$**

# Shahram Jazaeri, Burkhard Schaffrin, and Kyle Snow

#### **Abstract**

Following the pioneering work in the PhD dissertation by Snow (PhD thesis, Rep. No. 502, Div. of Geodetic Sci, School of Earth Sciences, The Ohio State University, Columbus, OH, USA, 2012), the two articles by Schaffrin et al. (J Geodetic Sci 4(1):28–36, 2014) and by Jazaeri et al. (Z für Vermessungswesen 139(4):229–240, 2014) provided a broad overview of the Total Least-Squares (TLS) adjustment within EIV-Models with singular cofactor matrices. Around the same time, Xu et al. (J Geodesy 86:661–675, 2012) proposed a specific algorithm to find the TLS solution within a partial EIV-Model, which has been improved by various authors since, including Shi et al. (J Geodesy 89(1):13–16, 2015), Wang et al. (Cehui Xuebao/Acta Geodaet et Cartograph Sinica 46(8):978–987, 2017), Zhao (Surv Rev 49(356):346–354, 2017), and Han et al. (Surv Rev 52(371):126–133, 2020), to name a few. On the other hand, it is easy to see that the partial EIV-Model is a special case of the general EIV-Model with singular cofactor matrices and thus does not need a separate class of algorithms unless they are more efficient than the standard algorithms. This, however, does not seem to be guaranteed, as will be shown in this contribution for the straight-line adjustment under $Q\_A = Q\_0 \otimes Q\_x$. As a consequence, we shall argue that, rather than discussing the partial EIV-Model, it would be more worthwhile to make the respective developments within an EIV-Model with singular cofactor matrices directly.

#### **Keywords**

Errors-In-Variables model · Singular cofactor matrices · Total least squares

# **1 Introduction**

About a decade ago, Xu et al. (2012) introduced the partial Errors-In-Variables (PEIV) model in order to accommodate nonrandom elements within the coefficient matrix $A$,

S. Jazaeri

B. Schaffrin

K. Snow (✉)
Polaris Geospatial Services, LLC, Westerville, OH, USA

which describes the connection of the unknown parameters with the observations that were collected in order to determine them. Obviously, nonrandom elements can as well be considered as "random with zero variance" (and zero covariance, if applicable), thus leading to a *singular cofactor matrix* $\mathsf{D}\{\mathrm{vec}\, A\} = \sigma\_0^2\, Q\_A$ that is positive semidefinite.

Incidentally, at about the same time, Snow (2012) published his PhD dissertation, a large part of which is concerned with exactly the *much wider subclass* of EIV-Models in which the cofactor matrix $Q\_A$ is allowed to be *singular* (and, quite possibly, the cofactor matrix $Q\_y$, too). There, a variety of algorithms is proposed to handle all the cases in which a *unique* Total Least-Squares (TLS) solution exists.

Here, however, most of the attention will be directed to the case where $Q\_A$ shows a Kronecker-product structure:

© The Author(s) 2023

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_229

Department of Civil and Environmental Engineering, School of Engineering No. 1, Shiraz University, Shiraz, Iran e-mail: s.jazayeri@shirazu.ac.ir

Division of Geodetic Science, School of Earth Science, The Ohio State University, Columbus, OH, USA

$Q\_A := Q\_0 \otimes Q\_x$. For this special case, the original algorithm of Schaffrin and Wieser (2008) to find the Weighted TLS solution had been designed, with particular efficiency whenever $Q\_x := Q\_y$ can be assumed. More often than not, this assumption is fulfilled when it comes to straight-line adjustment in two or three dimensions. Therefore, the prominent example in this study will be taken from this class.

In Sects. 2.1 and 2.2, a review is provided of the EIV-Model with singular cofactor matrix, resp. with a Kronecker-product structure for $Q\_A$. A similar review of the PEIV-Model will be found in Sect. 3, followed by a selection of algorithms for the various options in Sect. 4. Finally, the key algorithms will be compared in terms of their efficiency when applied to a number of typical examples in Sect. 5, before certain conclusions are drawn in Sect. 6.

# **2 The EIV-Model and the Weighted TLS Solution—A Review**

# **2.1 Potentially Singular Cofactor Matrices**

The definition of the EIV-Model is given as

$$\mathbf{y} = \underbrace{(A - E\_A)}\_{n \times m} \boldsymbol{\xi} + \boldsymbol{e}\_y, \quad \boldsymbol{e}\_A := \text{vec} \, E\_A,\tag{1a}$$
 
$$\text{rk}\, A = m < n,$$

$$\begin{bmatrix} \boldsymbol{e}\_y \\ \boldsymbol{e}\_A \end{bmatrix} \sim \left( \begin{bmatrix} \boldsymbol{0} \\ \boldsymbol{0} \end{bmatrix},\; \sigma\_0^2\, Q = \sigma\_0^2 \begin{bmatrix} Q\_y & Q\_{yA} \\ Q\_{Ay} & Q\_A \end{bmatrix} \right), \tag{1b}$$

$Q$ symmetric, nonnegative-definite, with the usual notation as in Snow (2012) and Schaffrin et al. (2014), for instance; see also Fang (2011), whose derivation is reviewed in the following, while temporarily assuming that $Q$ is positive-definite with

$$Q^{-1} := \begin{bmatrix} P\_{11} & P\_{12} \\ P\_{21} & P\_{22} \end{bmatrix}, \quad P\_{11} = P\_{11}^{T}, \; P\_{21} = P\_{12}^{T}, \; P\_{22} = P\_{22}^{T}. \tag{2}$$

Thus, the target function reads (with $\boldsymbol{\lambda}$ as $n \times 1$ vector of Lagrange multipliers):

$$\begin{aligned} \Phi(\boldsymbol{e}\_{y}, \boldsymbol{e}\_{A}, \boldsymbol{\xi}, \boldsymbol{\lambda}) :=\; & \boldsymbol{e}\_{y}^{T} P\_{11} \boldsymbol{e}\_{y} + 2 \boldsymbol{e}\_{y}^{T} P\_{12} \boldsymbol{e}\_{A} + \boldsymbol{e}\_{A}^{T} P\_{22} \boldsymbol{e}\_{A} \\ & + 2 \boldsymbol{\lambda}^{T} [\boldsymbol{y} - A \boldsymbol{\xi} - \boldsymbol{e}\_{y} + \left(\boldsymbol{\xi}^{T} \otimes I\_{n}\right) \boldsymbol{e}\_{A}], \end{aligned} \tag{3}$$

which must be stationary, leading to the necessary Euler-Lagrange conditions:

$$\frac{1}{2}\frac{\partial \Phi}{\partial \mathbf{e}\_y} = P\_{11}\tilde{\mathbf{e}}\_y + P\_{12}\tilde{\mathbf{e}}\_A - \hat{\lambda} \doteq \mathbf{0},\tag{4a}$$

$$\frac{1}{2}\frac{\partial \Phi}{\partial \mathbf{e}\_A} = P\_{21}\tilde{\mathbf{e}}\_y + P\_{22}\tilde{\mathbf{e}}\_A + \left(\hat{\xi} \otimes I\_n\right)\hat{\lambda} \doteq \mathbf{0},\qquad(4b)$$

$$\frac{1}{2}\frac{\partial \Phi}{\partial \xi} = -A^T \hat{\lambda} + \tilde{E}\_A^T \hat{\lambda} \doteq \mathbf{0},\tag{4c}$$

$$\frac{1}{2}\frac{\partial \Phi}{\partial \lambda} = \mathbf{y} - A\hat{\boldsymbol{\xi}} - \tilde{\boldsymbol{e}}\_{\text{y}} + \left(\hat{\boldsymbol{\xi}}^{T} \otimes I\_{n}\right)\tilde{\boldsymbol{e}}\_{A} \doteq \mathbf{0},\qquad(4d)$$

and the sufficient condition:

$$\frac{1}{2} \frac{\partial^2 \Phi}{\partial \begin{bmatrix} \boldsymbol{e}\_y \\ \boldsymbol{e}\_A \end{bmatrix} \, \partial \begin{bmatrix} \boldsymbol{e}\_y \\ \boldsymbol{e}\_A \end{bmatrix}^T} = \begin{bmatrix} P\_{11} & P\_{12} \\ P\_{21} & P\_{22} \end{bmatrix} \tag{5}$$

#### is positive-definite.

Taking (4a) and (4b) together and solving for the combined *residual vector* gives

$$\begin{aligned} \begin{bmatrix} \tilde{\boldsymbol{e}}\_{\boldsymbol{y}} \\ \tilde{\boldsymbol{e}}\_{A} \end{bmatrix} &= \begin{bmatrix} \boldsymbol{Q}\_{\boldsymbol{y}} & \boldsymbol{Q}\_{\boldsymbol{y}A} \\ \boldsymbol{Q}\_{A\boldsymbol{y}} & \boldsymbol{Q}\_{A} \end{bmatrix} \cdot \begin{bmatrix} \boldsymbol{I}\_{n} \\ - (\hat{\boldsymbol{\xi}} \otimes \boldsymbol{I}\_{n}) \end{bmatrix} \cdot \hat{\boldsymbol{\lambda}} = \\ &= \begin{bmatrix} \boldsymbol{Q}\_{\boldsymbol{y}} - \boldsymbol{Q}\_{\boldsymbol{y}A} (\hat{\boldsymbol{\xi}} \otimes \boldsymbol{I}\_{n}) \\ \boldsymbol{Q}\_{A\boldsymbol{y}} - \boldsymbol{Q}\_{A} (\hat{\boldsymbol{\xi}} \otimes \boldsymbol{I}\_{n}) \end{bmatrix} \cdot \hat{\boldsymbol{\lambda}} \end{aligned} \tag{6a}$$

and, with (4d),

$$\mathbf{y} - A\hat{\boldsymbol{\xi}} = \left[I\_n \left| - (\hat{\boldsymbol{\xi}}^T \otimes I\_n) \right.\right] \begin{bmatrix} \tilde{\boldsymbol{e}}\_{\boldsymbol{y}} \\ \tilde{\boldsymbol{e}}\_{A} \end{bmatrix} =: \boldsymbol{Q}\_1 \cdot \hat{\boldsymbol{\lambda}};\qquad(6b)$$

thus,

$$
\hat{\lambda} = \mathcal{Q}\_1^{-1}(\mathbf{y} - A\hat{\xi}) \tag{7a}
$$

for

$$\begin{split} \underset{n \times n}{Q\_1} := \left[ I\_n \,\middle|\, - \left( \hat{\boldsymbol{\xi}}^T \otimes I\_n \right) \right] \begin{bmatrix} Q\_{y} & Q\_{yA} \\ Q\_{Ay} & Q\_A \end{bmatrix} \begin{bmatrix} I\_n \\ - (\hat{\boldsymbol{\xi}} \otimes I\_n) \end{bmatrix} \\ =: B\, Q\, B^T \end{split} \tag{7b}$$

and

$$\begin{aligned} \underset{n \times n \, (m+1)}{\mathcal{B}} &:= \left[ I\_n \Big| - (\hat{\boldsymbol{\xi}}^T \otimes I\_n) \right] = \boldsymbol{B}(\hat{\boldsymbol{\xi}}), \\ \text{rk } \boldsymbol{B} &= \boldsymbol{n} \text{ ("full row-rank")}. \end{aligned} \tag{7c}$$

Finally, the *TLS solution* can be obtained from (4c) through

$$\mathbf{0} = (A - \tilde{E}\_A)^T \hat{\lambda} = \tag{8a}$$

$$=(A - \tilde{E}\_A)^T \mathcal{Q}\_1^{-1} [(\mathbf{y} - \tilde{E}\_A \hat{\xi}) - (A - \tilde{E}\_A) \hat{\xi}]$$

as *estimated parameter vector*

$$\begin{aligned} \hat{\boldsymbol{\xi}} &= \left[ (\boldsymbol{A} - \tilde{\boldsymbol{E}}\_A)^T \boldsymbol{\mathcal{Q}}\_1^{-1} (\boldsymbol{A} - \tilde{\boldsymbol{E}}\_A) \right]^{-1} \text{.} \\ &\cdot \left[ (\boldsymbol{A} - \tilde{\boldsymbol{E}}\_A)^T \boldsymbol{\mathcal{Q}}\_1^{-1} (\boldsymbol{y} - \tilde{\boldsymbol{E}}\_A \hat{\boldsymbol{\xi}}) \right] \end{aligned} \tag{8b}$$

with the *residual vector*

$$
\begin{bmatrix} \tilde{\boldsymbol{e}}\_y \\ \tilde{\boldsymbol{e}}\_A \end{bmatrix} = \boldsymbol{Q}\boldsymbol{B}^T \cdot \boldsymbol{\mathcal{Q}}\_1^{-1} (\mathbf{y} - \boldsymbol{A}\hat{\boldsymbol{\xi}}) \tag{8c}
$$

and the *estimated variance component*

$$
\hat{\sigma}\_0^2 = (\mathbf{y} - A\hat{\xi})^T Q\_1^{-1} (\mathbf{y} - A\hat{\xi}) / (n - m). \qquad (8d)
$$

This would be a "Fang-type" algorithm after Fang (2011).
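To illustrate how such a "Fang-type" fixed-point iteration of (7a) and (8b)–(8d) proceeds, consider the unit-weight special case $Q\_y = I\_n$, $Q\_A = I\_{nm}$, $Q\_{yA} = 0$, for which the iteration should reproduce the classical SVD-based TLS solution (a sketch with our own function and data, not from the paper):

```python
import numpy as np

def tls_fang(A, y, iters=100):
    """Fixed-point iteration of (7a), (8b)-(8d) for Q_y = I, Q_A = I, Q_yA = 0."""
    n, m = A.shape
    xi = np.linalg.lstsq(A, y, rcond=None)[0]   # least-squares start value
    EA = np.zeros((n, m))
    for _ in range(iters):
        q1 = 1.0 + xi @ xi                      # Q_1 = (1 + xi'xi) * I_n, cf. (7b)
        lam = (y - A @ xi) / q1                 # Lagrange multipliers, Eq. (7a)
        EA = -np.outer(lam, xi)                 # residual matrix, from Eq. (6a)
        Ah = A - EA
        # Eq. (8b); the scalar Q_1 cancels from both sides of the normal equations:
        xi = np.linalg.solve(Ah.T @ Ah, Ah.T @ (y - EA @ xi))
    return xi

# Small line-through-origin example, compared with the SVD solution:
A = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 3.9, 6.2, 7.8])
xi_fix = tls_fang(A, y)
Vt = np.linalg.svd(np.hstack([A, y[:, None]]))[2]
xi_svd = -Vt[-1, :1] / Vt[-1, 1]
```

For this nearly consistent data set, the fixed-point iterate agrees with the SVD-based TLS estimate to high accuracy.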

Alternatively, (4c) and (7a) can be combined to

$$\begin{aligned} A^T Q\_1^{-1} (\mathbf{y} - A\hat{\boldsymbol{\xi}}) &= A^T \hat{\boldsymbol{\lambda}} = \tilde{E}\_A^T \hat{\boldsymbol{\lambda}} = \\ &= \text{vec} (\hat{\boldsymbol{\lambda}}^T \tilde{E}\_A) = (I\_m \otimes \hat{\boldsymbol{\lambda}}^T) \tilde{e}\_A \end{aligned} \quad (9a)$$

and, using (6a), to

$$\begin{aligned} A^T Q\_1^{-1} (\mathbf{y} - A \hat{\boldsymbol{\xi}}) &= (I\_m \otimes \hat{\boldsymbol{\lambda}}^T) [Q\_{Ay} \hat{\boldsymbol{\lambda}} - Q\_A (\hat{\boldsymbol{\xi}} \otimes \hat{\boldsymbol{\lambda}})] = \\ &= (I\_m \otimes \hat{\boldsymbol{\lambda}}^T) [Q\_{Ay} \hat{\boldsymbol{\lambda}} - Q\_A (I\_m \otimes \hat{\boldsymbol{\lambda}}) \hat{\boldsymbol{\xi}}] \end{aligned} \tag{9b}$$

from which the following equation can be obtained:

$$\begin{aligned} \left[A^T \mathcal{Q}\_1^{-1} A - (I\_m \otimes \hat{\lambda})^T \mathcal{Q}\_A (I\_m \otimes \hat{\lambda})\right] \hat{\xi} &= \\ &= A^T \mathcal{Q}\_1^{-1} \mathbf{y} - (I\_m \otimes \hat{\lambda})^T \mathcal{Q}\_{A\mathbf{y}} \hat{\lambda}. \end{aligned} \quad (9c)$$

Apparently, (9c) turns out to be a *generalized form* of formula (18a) in Schaffrin (2015) that allows the treatment of EIV-Models with cross-covariances $Q\_{Ay}$ that are *non-zero*. This would be part of a modified "Mahboub-type" algorithm after Mahboub (2012) and Schaffrin (2015).

It is now noticed that, for a *unique* TLS solution to be obtained, *only* Q1 *needs to be nonsingular*, not Q itself! This means that the more restrictive *rank condition*

$$\text{rk } B\underline{Q} = \text{rk } B = \underline{n} \tag{10}$$

ought to hold for the algorithms (8b)–(8d), resp. (9c) with (7a)–(7b) to work. But Neitzel and Schaffrin (2016) have already proved that the more general *"Neitzel-Schaffrin condition"*

$$\text{rk}\left[B\underline{Q}\,\middle|\,A\right] = \text{rk}\,B = n, \quad B = B\left(\xi\right), \qquad (11)$$

is *necessary and sufficient* for the uniqueness of the Weighted TLS solution. So, there must be a way to generalize the algorithm (8b)–(8d) for the case that (10) is violated (rk BQ < n) but (11) is fulfilled. The generalization of (9c) is left for a future publication; but see Snow (2012, ch. 3.2) for some preliminary results, particularly the system (3.21) shown therein.

Assuming that $\mathrm{rk}\left[ B Q \,\middle|\, A - \tilde{E}\_A \right] = n$ holds true as well, the extended matrix

$$\mathcal{Q}\_3 := \mathcal{Q}\_1 + (A - \tilde{E}\_A) S (A - \tilde{E}\_A)^T > 0 \qquad (12)$$

will be *nonsingular* for any symmetric, positive-definite matrix S (that needs to be suitably chosen as it may affect the efficiency of our algorithms), where

$$\mathcal{Q}\_3 \cdot \hat{\boldsymbol{\lambda}} = (\mathbf{y} - A\hat{\boldsymbol{\xi}}) + (A - \tilde{E}\_A)S[(A - \tilde{E}\_A)^T \hat{\boldsymbol{\lambda}}] = $$
 
$$= \mathbf{y} - A\hat{\boldsymbol{\xi}} = \mathcal{Q}\_1 \cdot \hat{\boldsymbol{\lambda}}\tag{13a}$$

due to (4c). Thus, wherever $\hat{\boldsymbol{\lambda}} = Q\_1^{-1}(\mathbf{y} - A\hat{\boldsymbol{\xi}})$ appears in algorithm (8b)–(8d), it can simply be replaced by $\hat{\boldsymbol{\lambda}} = Q\_3^{-1}(\mathbf{y} - A\hat{\boldsymbol{\xi}})$, giving us the more general algorithm (13b)–(13d) as follows:

$$\begin{aligned} \hat{\xi} &= \left[ (A - \tilde{E}\_A)^T \mathcal{Q}\_3^{-1} (A - \tilde{E}\_A) \right]^{-1} \text{.} \\ &\cdot \left[ (A - \tilde{E}\_A)^T \mathcal{Q}\_3^{-1} (\mathbf{y} - \tilde{E}\_A \hat{\xi}) \right], \end{aligned} \tag{13b}$$

$$
\begin{bmatrix} \tilde{\boldsymbol{e}}\_{\boldsymbol{y}} \\ \tilde{\boldsymbol{e}}\_{A} \end{bmatrix} = \boldsymbol{Q}\boldsymbol{B}^{\boldsymbol{T}} \cdot \boldsymbol{Q}\_{3}^{-1} (\boldsymbol{y} - \boldsymbol{A}\hat{\boldsymbol{\xi}}), \tag{13c}
$$

$$
\hat{\sigma}\_0^2 = (\mathbf{y} - A\hat{\boldsymbol{\xi}})^T \mathcal{Q}\_3^{-1} (\mathbf{y} - A\hat{\boldsymbol{\xi}}) / (n - m). \tag{13d}
$$

We note that the interpretation of the vector $\mathbf{y} - \tilde{E}\_A \hat{\boldsymbol{\xi}}$ is still not clear!

# **2.2 $Q\_A = Q\_0 \otimes Q\_x$ with Kronecker-Product Structure**

Now, a special case should be treated where a *Kronecker-product* structure can be assumed for

$$\underbrace{Q\_A}\_{nm \times nm} := \underbrace{Q\_0}\_{m \times m} \otimes \underbrace{Q\_x}\_{n \times n} \quad \text{with} \quad Q\_{Ay} = 0 = Q\_{yA}^T. \tag{14}$$

Then, $\mathcal{Q}_1 = BQB^T$ can be rewritten as

$$\mathcal{Q}_1 = \left[I_n \mid -(\hat{\boldsymbol{\xi}}^T \otimes I_n)\right] \begin{bmatrix} \mathcal{Q}_y & \mathbf{0} \\ \mathbf{0} & \mathcal{Q}_0 \otimes \mathcal{Q}_x \end{bmatrix} \begin{bmatrix} I_n \\ -(\hat{\boldsymbol{\xi}} \otimes I_n) \end{bmatrix} \tag{15a}$$

$$= \mathcal{Q}_y + \hat{\boldsymbol{\xi}}^T \mathcal{Q}_0 \hat{\boldsymbol{\xi}} \cdot \mathcal{Q}_x \tag{15b}$$

and $\hat{\boldsymbol{\lambda}}$ (if $\mathcal{Q}_1$ is nonsingular) as

$$\hat{\boldsymbol{\lambda}} = \mathcal{Q}_1^{-1}(\mathbf{y} - A\hat{\boldsymbol{\xi}}) = (\mathcal{Q}_y + \hat{\boldsymbol{\xi}}^T \mathcal{Q}_0 \hat{\boldsymbol{\xi}} \cdot \mathcal{Q}_x)^{-1}(\mathbf{y} - A\hat{\boldsymbol{\xi}}), \tag{15c}$$

which leads to the *residual vector*

$$\tilde{\mathbf{e}}\_{\mathbf{y}} = \mathcal{Q}\_{\mathbf{y}} (\mathcal{Q}\_{\mathbf{y}} + \hat{\boldsymbol{\xi}}^T \mathcal{Q}\_0 \hat{\boldsymbol{\xi}} \cdot \mathcal{Q}\_x)^{-1} (\mathbf{y} - A \hat{\boldsymbol{\xi}}) \qquad (15\text{d})$$

and to the *residual matrix*

$$\tilde{E}_A = -\mathcal{Q}_x(\mathcal{Q}_y + \hat{\boldsymbol{\xi}}^T \mathcal{Q}_0 \hat{\boldsymbol{\xi}} \cdot \mathcal{Q}_x)^{-1} (\mathbf{y} - A\hat{\boldsymbol{\xi}})\hat{\boldsymbol{\xi}}^T \mathcal{Q}_0. \tag{15e}$$

From (4c), it now follows that

$$-A^T\hat{\boldsymbol{\lambda}} = A^T(\mathcal{Q}_y + \hat{\boldsymbol{\xi}}^T \mathcal{Q}_0 \hat{\boldsymbol{\xi}} \cdot \mathcal{Q}_x)^{-1}(A\hat{\boldsymbol{\xi}} - \mathbf{y}) = -\tilde{E}_A^T\hat{\boldsymbol{\lambda}} = \mathcal{Q}_0\hat{\boldsymbol{\xi}} \cdot \hat{\nu}, \tag{16a}$$

with the scalar

$$\hat{\nu} := (\mathbf{y} - A\hat{\boldsymbol{\xi}})^T (\mathcal{Q}_y + \hat{\boldsymbol{\xi}}^T \mathcal{Q}_0 \hat{\boldsymbol{\xi}} \cdot \mathcal{Q}_x)^{-1} \mathcal{Q}_x (\mathcal{Q}_y + \hat{\boldsymbol{\xi}}^T \mathcal{Q}_0 \hat{\boldsymbol{\xi}} \cdot \mathcal{Q}_x)^{-1} (\mathbf{y} - A\hat{\boldsymbol{\xi}}) \tag{16b}$$

and, thus, ultimately the *estimated parameter vector* as

$$\hat{\boldsymbol{\xi}} = \left[A^T (\mathcal{Q}_y + \hat{\boldsymbol{\xi}}^T \mathcal{Q}_0 \hat{\boldsymbol{\xi}} \cdot \mathcal{Q}_x)^{-1} A - \hat{\nu} \cdot \mathcal{Q}_0\right]^{-1} \cdot A^T (\mathcal{Q}_y + \hat{\boldsymbol{\xi}}^T \mathcal{Q}_0 \hat{\boldsymbol{\xi}} \cdot \mathcal{Q}_x)^{-1} \mathbf{y}, \tag{16c}$$

which needs to be computed *iteratively* using (16b)–(16c) until convergence. This constitutes the original algorithm by Schaffrin and Wieser (2008) that only requires the invertibility of Q1 in (15b) and leads to the *estimated variance component*

$$\begin{aligned} \hat{\sigma}\_0^2 &= \hat{\boldsymbol{\lambda}}^T (\mathbf{y} - A\hat{\boldsymbol{\xi}}) / (n - m) = \\ &= (\mathbf{y} - A\hat{\boldsymbol{\xi}})^T (\boldsymbol{Q}\_{\boldsymbol{\mathcal{Y}}} + \hat{\boldsymbol{\xi}}^T \boldsymbol{Q}\_0 \hat{\boldsymbol{\xi}} \cdot \boldsymbol{Q}\_{\boldsymbol{\mathcal{X}}})^{-1} (\mathbf{y} - A\hat{\boldsymbol{\xi}}) \cdot \\ &\quad \cdot (n - m)^{-1} . \end{aligned} \tag{16d}$$
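To make the iteration (15b), (16b)–(16d) concrete, here is a minimal NumPy sketch; this is our own illustration, not the published implementation of Schaffrin and Wieser (2008), and the function name and the test choices (e.g. $\mathcal{Q}_0 = \operatorname{diag}(0,1)$) are ours:

```python
import numpy as np

def weighted_tls_kron(y, A, Qy, Qx, Q0, tol=1e-10, max_iter=500):
    """Iterative weighted TLS solution for Q_A = Q_0 (Kronecker) Q_x,
    following Eqs. (15b) and (16b)-(16d); an illustrative sketch."""
    n, m = A.shape
    # weighted LS start (corresponds to Q0 = 0, i.e. the classical GMM)
    xi = np.linalg.solve(A.T @ np.linalg.solve(Qy, A),
                         A.T @ np.linalg.solve(Qy, y))
    for _ in range(max_iter):
        Q1 = Qy + (xi @ Q0 @ xi) * Qx                         # Eq. (15b)
        lam = np.linalg.solve(Q1, y - A @ xi)                 # Eq. (15c)
        nu = lam @ Qx @ lam                                   # Eq. (16b)
        xi_new = np.linalg.solve(A.T @ np.linalg.solve(Q1, A) - nu * Q0,
                                 A.T @ np.linalg.solve(Q1, y))  # Eq. (16c)
        if np.linalg.norm(xi_new - xi) < tol:
            xi = xi_new
            break
        xi = xi_new
    Q1 = Qy + (xi @ Q0 @ xi) * Qx
    r = y - A @ xi
    sigma2 = r @ np.linalg.solve(Q1, r) / (n - m)             # Eq. (16d)
    return xi, sigma2
```

Starting from the weighted LS solution, this fixed-point iteration typically converges within a handful of steps for well-conditioned problems.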

Obviously, $\mathcal{Q}_1 = \mathcal{Q}_y + \hat{\boldsymbol{\xi}}^T \mathcal{Q}_0 \hat{\boldsymbol{\xi}} \cdot \mathcal{Q}_x$ will be nonsingular as long as $\mathcal{Q}_y$ is nonsingular, which is, however, *not always necessary* due to the second term. On the other hand, oftentimes the cofactor matrices $\mathcal{Q}_y$ and $\mathcal{Q}_x$ turn out to be *identical*:

$$\mathcal{Q}\_x \coloneqq \mathcal{Q}\_y,\tag{17}$$

in which case the algorithm (16b)–(16d) simplifies to

$$\hat{\boldsymbol{\lambda}} = \mathcal{Q}_y^{-1}(\mathbf{y} - A\hat{\boldsymbol{\xi}}) \cdot (1 + \hat{\boldsymbol{\xi}}^T \mathcal{Q}_0 \hat{\boldsymbol{\xi}})^{-1}, \tag{18a}$$

$$\hat{\nu} = \hat{\boldsymbol{\lambda}}^T \mathcal{Q}_y \hat{\boldsymbol{\lambda}} = \hat{\boldsymbol{\lambda}}^T (\mathbf{y} - A\hat{\boldsymbol{\xi}}) \cdot (1 + \hat{\boldsymbol{\xi}}^T \mathcal{Q}_0 \hat{\boldsymbol{\xi}})^{-1}, \tag{18b}$$

$$\hat{\boldsymbol{\xi}} = \left[A^T \mathcal{Q}_y^{-1} A - \hat{\boldsymbol{\lambda}}^T (\mathbf{y} - A\hat{\boldsymbol{\xi}}) \cdot \mathcal{Q}_0\right]^{-1} A^T \mathcal{Q}_y^{-1} \mathbf{y}, \tag{18c}$$

$$\hat{\sigma}_0^2 = \hat{\boldsymbol{\lambda}}^T (\mathbf{y} - A\hat{\boldsymbol{\xi}})/(n - m) = \hat{\nu} \cdot (1 + \hat{\boldsymbol{\xi}}^T \mathcal{Q}_0 \hat{\boldsymbol{\xi}})/(n - m), \tag{18d}$$

but this requires $\mathcal{Q}_y$ to be *invertible*; in contrast, $\mathcal{Q}_0$, and thus $\mathcal{Q}_A$, may be singular! In particular, $\mathcal{Q}_0 := 0$ refers to the classical *Gauss-Markov Model (GMM)*.
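For the identical-cofactor case (17), the simplified iteration (18a)–(18d) reduces to a few lines. The sketch below is our illustration, assuming $\mathcal{Q}_y = \mathcal{Q}_x = I_n$ and the singular $\mathcal{Q}_0 = \operatorname{diag}(0, 1)$, i.e. only the abscissa column of $A$ is random, which yields the classical orthogonal 2D line fit:

```python
import numpy as np

def tls_line_fit(x, y, tol=1e-10, max_iter=500):
    """Orthogonal 2D line fit via Eqs. (18a)-(18d), assuming Q_y = Q_x = I_n
    and the singular Q_0 = diag(0, 1); an illustrative sketch."""
    n = len(x)
    A = np.column_stack([np.ones(n), np.asarray(x, dtype=float)])
    Q0 = np.diag([0.0, 1.0])
    xi = np.linalg.solve(A.T @ A, A.T @ y)                  # LS start
    for _ in range(max_iter):
        r = y - A @ xi
        lam = r / (1.0 + xi @ Q0 @ xi)                      # Eq. (18a)
        xi_new = np.linalg.solve(A.T @ A - (lam @ r) * Q0,  # Eq. (18c)
                                 A.T @ y)
        if np.linalg.norm(xi_new - xi) < tol:
            xi = xi_new
            break
        xi = xi_new
    r = y - A @ xi
    lam = r / (1.0 + xi @ Q0 @ xi)
    sigma2 = (lam @ r) / (n - 2)                            # Eq. (18d)
    return xi, sigma2
```

For the four points (1,2), (2,1), (3,4), (4,3), the iteration converges to intercept 0 and slope 1, whereas ordinary LS gives slope 0.6, illustrating the errors-in-variables effect.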

Although the algorithm (18a)–(18d) loses its validity in case of a *singular* matrix $\mathcal{Q}_y$, and thus a *singular* $\mathcal{Q}_1 = (1 + \hat{\boldsymbol{\xi}}^T \mathcal{Q}_0 \hat{\boldsymbol{\xi}}) \cdot \mathcal{Q}_y$, it is still possible to handle this case along the lines of algorithm (13b)–(13d), but *without the "gain in efficiency"* from the Kronecker-product structure of the matrix $\mathcal{Q}_A$.

# **3 The Special Case of the Partial Errors-In-Variables (PEIV) Model**

This special subgroup covers all the EIV-Models where some of the elements within the matrix $A$ happen to be *nonrandom*. But, instead of introducing zero variances (and zero covariances) with the corresponding *singular cofactor matrix* $Q_A$, Xu et al. (2012) preferred a *dualistic* viewpoint and rewrote the *observation equations* from (1a) as

$$\begin{aligned} \mathbf{y} &= (A - E\_A)\boldsymbol{\xi} + \mathbf{e}\_\mathbf{y} = \\ &= (\boldsymbol{\xi}^T \otimes I\_n)(\text{vec}\, A - \mathbf{e}\_A) + \mathbf{e}\_\mathbf{y} =: \end{aligned} \tag{19a}$$
 
$$\begin{aligned} &= : (\boldsymbol{\xi}^T \otimes I\_n) \cdot \boldsymbol{\mu}\_A + \mathbf{e}\_\mathbf{y}, \end{aligned}$$

where $\boldsymbol{\mu}_A$ is split into

$$\boldsymbol{\mu}_A := \boldsymbol{\alpha} + G \cdot \boldsymbol{\mu}_a \quad \text{with} \quad \boldsymbol{\mu}_a := \underset{t \times 1}{\mathbf{a}} - \mathbf{e}_a. \tag{19b}$$

Here the $t \times 1$ vector $\mathbf{a}$ contains a basis for the *random* elements of $A$, whereas the $nm \times 1$ vector $\boldsymbol{\alpha}$ shows all *nonrandom* elements of $\operatorname{vec} A$ plus zeros elsewhere. As a result, the actual random elements within $A$ are generated through the product of the vector $\mathbf{a}$ of basis elements with the (given) $nm \times t$ matrix $G$, and they show up in all those places where the $nm \times 1$ vector $\boldsymbol{\alpha}$ shows zeros.

In addition, let the *random error vectors* be specified by

$$\begin{bmatrix} \mathbf{e}_y \\ \mathbf{e}_a \end{bmatrix} \sim \left( \begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \; \sigma_0^2 Q = \sigma_0^2 \begin{bmatrix} Q_y & 0 \\ 0 & Q_a \end{bmatrix} \right), \tag{19c}$$

with the symmetric, positive-definite matrix $Q$. Any cross-covariances between $\mathbf{y}$ and $\mathbf{a}$ could also be considered, but they are avoided here to keep the following development of formulas relatively simple.

In analogy to the GMM variant in Schaffrin (2015), let the target function be defined by

$$\Phi(\boldsymbol{\mu}_a, \boldsymbol{\xi}) := (\mathbf{a} - \boldsymbol{\mu}_a)^T Q_a^{-1} (\mathbf{a} - \boldsymbol{\mu}_a) + \left[\mathbf{y} - (\boldsymbol{\xi}^T \otimes I_n)(\boldsymbol{\alpha} + G\boldsymbol{\mu}_a)\right]^T Q_y^{-1} \left[\mathbf{y} - (\boldsymbol{\xi}^T \otimes I_n)(\boldsymbol{\alpha} + G\boldsymbol{\mu}_a)\right], \tag{20}$$

which must be made stationary, leading to the *necessary Euler-Lagrange conditions*

$$\frac{1}{2}\frac{\partial \Phi}{\partial \boldsymbol{\mu}_a} = -Q_a^{-1}(\mathbf{a} - \hat{\boldsymbol{\mu}}_a) - G^T(\hat{\boldsymbol{\xi}} \otimes I_n) Q_y^{-1} \left[\mathbf{y} - (\hat{\boldsymbol{\xi}}^T \otimes I_n)(\boldsymbol{\alpha} + G\hat{\boldsymbol{\mu}}_a)\right] \doteq \mathbf{0}, \tag{21a}$$

$$\frac{1}{2}\frac{\partial \Phi}{\partial \boldsymbol{\xi}} = -\left(\begin{bmatrix} \boldsymbol{\alpha}_1^T \\ \vdots \\ \boldsymbol{\alpha}_m^T \end{bmatrix} + \begin{bmatrix} \hat{\boldsymbol{\mu}}_a^T G_1^T \\ \vdots \\ \hat{\boldsymbol{\mu}}_a^T G_m^T \end{bmatrix}\right) Q_y^{-1} \left[\mathbf{y} - (\hat{\boldsymbol{\xi}}^T \otimes I_n)(\boldsymbol{\alpha} + G\hat{\boldsymbol{\mu}}_a)\right] \doteq \mathbf{0}, \tag{21b}$$

where the terms $\boldsymbol{\alpha}_i$ ($n \times 1$) and $G_i$ ($n \times t$) come from

$$\underset{1 \times nm}{\boldsymbol{\alpha}^T} := \left[\boldsymbol{\alpha}_1^T, \cdots, \boldsymbol{\alpha}_m^T\right] \quad \text{and} \quad \underset{t \times nm}{G^T} := \left[G_1^T, \cdots, G_m^T\right]. \tag{21c}$$

Reordering (21a) and (21b) yields first

$$\left[Q_a^{-1} + G^T(\hat{\boldsymbol{\xi}} \otimes I_n) Q_y^{-1} (\hat{\boldsymbol{\xi}}^T \otimes I_n) G\right] \hat{\boldsymbol{\mu}}_a =: \left[Q_a^{-1} + S_{\hat{\xi}}^T Q_y^{-1} S_{\hat{\xi}}\right] \hat{\boldsymbol{\mu}}_a = Q_a^{-1} \mathbf{a} + S_{\hat{\xi}}^T Q_y^{-1} \left[\mathbf{y} - \boldsymbol{\alpha}_1 \hat{\xi}_1 - \cdots - \boldsymbol{\alpha}_m \hat{\xi}_m\right], \tag{22a}$$

with

$$S\_{\hat{\xi}} := G\_1 \cdot \hat{\xi}\_1 + \dots + G\_m \cdot \hat{\xi}\_m,\tag{22b}$$

and then

$$\left(\left[\boldsymbol{\alpha}_i + G_i \hat{\boldsymbol{\mu}}_a\right]^T Q_y^{-1} \left[\boldsymbol{\alpha}_i + G_i \hat{\boldsymbol{\mu}}_a\right]\right) \cdot \hat{\boldsymbol{\xi}} = \left[\boldsymbol{\alpha}_i + G_i \hat{\boldsymbol{\mu}}_a\right]^T Q_y^{-1} \cdot \mathbf{y}. \tag{22c}$$

Furthermore, the *residual vectors* result from

$$\tilde{\mathbf{e}}_y = \mathbf{y} - (\hat{\boldsymbol{\xi}}^T \otimes I_n)(\boldsymbol{\alpha} + G\hat{\boldsymbol{\mu}}_a) \quad \text{and} \quad \tilde{\mathbf{e}}_a = \mathbf{a} - \hat{\boldsymbol{\mu}}_a, \tag{22d}$$

and the *estimated variance component* from

$$
\hat{\sigma}\_0^2 = (\tilde{\mathbf{e}}\_\mathbf{y}^T \mathbf{Q}\_\mathbf{y}^{-1} \tilde{\mathbf{e}}\_\mathbf{y} + \tilde{\mathbf{e}}\_a^T \mathbf{Q}\_a^{-1} \tilde{\mathbf{e}}\_a)/(n - m). \tag{22e}
$$

The above just describes the original approach by Xu et al. (2012). Various improvements in terms of computational efficiency were later achieved by Shi et al. (2015), Wang et al. (2016), and Zhao (2017). Moreover, Wang et al. (2017) and Han et al. (2020) also allowed for cross-covariances between Qy and Qa, a case that is included in the numerical experiments of Sect. 5.
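A compact sketch of the PEIV iteration (22a)–(22c) for the 2D line fit is given below; this is our own illustration with $Q_y = Q_a = I_n$, not the code of Xu et al. (2012), and all function and variable names are ours:

```python
import numpy as np

def peiv_line_fit(x, y, tol=1e-10, max_iter=500):
    """2D line fit y ~ xi1 + xi2*x in the partial EIV-Model, alternating
    Eqs. (22a) and (22c) with Q_y = Q_a = I_n; an illustrative sketch."""
    n = len(x)
    # alpha: nonrandom elements of vec A (the ones column); G injects the
    # random basis a (= observed x-values) into the second column of A
    alpha1, alpha2 = np.ones(n), np.zeros(n)
    G1, G2 = np.zeros((n, n)), np.eye(n)
    a = np.asarray(x, dtype=float)
    A0 = np.column_stack([alpha1, a])
    xi = np.linalg.solve(A0.T @ A0, A0.T @ y)            # LS start
    for _ in range(max_iter):
        S = G1 * xi[0] + G2 * xi[1]                      # Eq. (22b)
        Nmu = np.eye(n) + S.T @ S                        # Eq. (22a), Q = I
        mu_a = np.linalg.solve(Nmu, a + S.T @ (y - alpha1 * xi[0]
                                                 - alpha2 * xi[1]))
        Ahat = np.column_stack([alpha1 + G1 @ mu_a, alpha2 + G2 @ mu_a])
        xi_new = np.linalg.solve(Ahat.T @ Ahat, Ahat.T @ y)   # Eq. (22c)
        if np.linalg.norm(xi_new - xi) < tol:
            xi = xi_new
            break
        xi = xi_new
    e_y = y - np.column_stack([alpha1 + G1 @ mu_a,
                               alpha2 + G2 @ mu_a]) @ xi      # Eq. (22d)
    e_a = a - mu_a
    sigma2 = (e_y @ e_y + e_a @ e_a) / (n - 2)                # Eq. (22e)
    return xi, sigma2
```

On equal-weight line-fitting data this converges to the same orthogonal-fit solution as the standard EIV algorithms, as the numerical comparisons in Sect. 5 lead one to expect.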

# **4 The Various Algorithms**

The various algorithms for weighted TLS solutions compared in this contribution are listed below in bulleted form with brief descriptions. For further details about them, the reader is referred to the references provided.

# **Algorithms for Weighted TLS Solutions Within the EIV-Model**


# **Algorithms for Weighted TLS Solutions Within the Partial EIV-Model**


TLS solution within the partial EIV-Model that he argued should be preferred over those earlier algorithms because of a reduction he found in both the number of iterations and total time required to solve 2D affine and similarity transformation problems. He also compared these algorithms to a "Fang-type" algorithm, listing his results in his Tables 5 and 8, which show a drastic reduction in both iterations and time compared to Xu et al. (2012) and Shi et al. (2015), but a more marginal improvement in time over Fang's type without any reduction in the number of iterations. We note that Zhao's algorithm also does not allow for a cross-covariance matrix QyA.

# **Classical Algorithm Without Direct Reference to an EIV-Model as Described Herein**

9. Deming's (1931, 1934) algorithm (within a Gauss-Helmert Model): Finally, we mention the classical least-squares solution within the Gauss-Helmert Model, which might also be referred to as "Deming's algorithm." Because of its long-time usage and well-known behavior, we chose to include it in our experiments for comparison purposes. An example of a rigorous presentation of it can be found in Schaffrin and Snow (2010) as well as in chapter 4 of Snow (2012), among others.

# **4.1 Uniformity of Algorithm Coding**

Many factors that are beyond the scope of our work here could be considered when writing computer code to optimize efficiency (time) in numerical computing. However, our aim was not to try to write code that could run as fast as possible, which would have been a somewhat arduous task considering the number of algorithms we chose to compare. In fact, because we had already written some algorithms in MATLAB in the past, and because other authors cited above have published or shared their algorithms in MATLAB, we decided to stick with that language, though we might have written faster code in C++, for example.

What we were mainly concerned with was following a few simple practices for writing efficient code in MATLAB while keeping the code relatively easy to read. We strove to do this consistently for all the algorithms we tested. To summarize, we used the following guidelines to help ensure all the algorithms were coded with a similar level of efficiency:


Thus, if a weight matrix appeared multiple times in formulas, its inverse was saved and reused within the algorithm.
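As an illustration of this guideline (in NumPy rather than the study's MATLAB, with hypothetical matrices):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
M = rng.standard_normal((n, n))
Qy = M @ M.T + n * np.eye(n)          # hypothetical SPD cofactor matrix
A = rng.standard_normal((n, 3))
y = rng.standard_normal(n)

Qy_inv = np.linalg.inv(Qy)            # invert once, reuse everywhere below
N_mat = A.T @ Qy_inv @ A              # normal-equation matrix
c = A.T @ Qy_inv @ y                  # right-hand side
xi = np.linalg.solve(N_mat, c)

# identical result to re-inverting in every formula,
# but with a single O(n^3) inversion
assert np.allclose(xi, np.linalg.solve(A.T @ np.linalg.inv(Qy) @ A,
                                       A.T @ np.linalg.inv(Qy) @ y))
```

In production code one would typically go one step further and cache a Cholesky factorization instead of the explicit inverse.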


Other time-saving techniques or operations that we might have left out were, at least, done so consistently among all algorithms. We do not suspect that any inefficiencies remaining in our code would affect the number of iterations required by the algorithms.

Regarding the reporting of execution times in the following chapter, we acknowledge that what counts most here is the relative times among the algorithms, since many factors related to hardware, software, and available computing resources could influence the absolute times. To try to minimize these factors, we wrote a high-level script that called all algorithms sequentially 5000 times each for each problem. Any open programs other than MATLAB were closed before executing the script. This means that the same computer was in more-or-less the same state for all algorithms used in the comparisons. Nevertheless, the times surely will vary between repeated instances of the same test. As such, we would not distinguish between two algorithms with times (per 5000 runs) that agree within 10 percent of each other.
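A Python analogue of such a timing harness (illustrative only; the study's script was written in MATLAB, and the function name here is ours) might be:

```python
import time

def time_algorithm(fn, args, runs=5000):
    """Total wall time in seconds for `runs` independent calls of fn(*args)."""
    t0 = time.perf_counter()
    for _ in range(runs):
        fn(*args)            # fresh call each time; no state carried over
    return time.perf_counter() - t0
```

Calling each candidate algorithm through the same harness, back to back on the same machine, keeps the relative times comparable even if the absolute times vary between sessions.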

# **5 Numerical Experiments**

A motivation for the experiments that follow was the claims or suggestions in many of the cited papers on the partial EIV-Model that TLS solutions within that model should be preferred over those within the standard EIV-Model laid out in Sect. 2 above. The arguments are usually made in favor of computational efficiency, viz. fewer iterations to convergence or faster overall computational times. Or the argument is sometimes made that the partial EIV-Model and associated TLS solutions are easier to formulate. One only needs to peruse those papers to find such statements. We certainly reject the argument regarding ease of model and algorithm formulation, as we take the standard model to be more elegant and simpler in form; that does seem obvious to us, but of course others may think differently.

In any case, we would not argue with anyone who selects an algorithm based on its savings in time, especially when computations are being made in time-critical situations. Thus we conducted the following experiments to see how the various algorithms listed above compared in a variety of problems and datasets. The problem types we explored were 2D line fitting, 2D affine and similarity transformations, and a third-order auto-regressive problem. However, for the sake of space, we only report on the 2D line-fitting problem here.

Table 1 lists some results obtained from six different datasets identified in the first column, together with their number of points. The first dataset is a combination of two that appear in section 17 of Deming (1964). The datasets labeled Haneberg, Pearson, and Niemeier can be found in Schaffrin and Snow (2020). Neri's data can be found in Snow (2012) and Kelly's data in Kelly (1984). The table also lists Neri\*, which are the original Neri data with simulated cross-covariances added in Snow (2012).

The table shows the number of iterations to convergence and the time in seconds for 5000 consecutive executions. Note that each consecutive execution represents an independent call to the algorithm's function, so that no results from a previous execution are used in a subsequent one. The convergence criterion requires the norm of the incremental vector of estimated parameters to be less than a specified value; the value used was $10^{-10}$. We do not bother listing estimated parameters, residuals, or total SSR, as these values were the same for at least six digits beyond the decimal point for all algorithms.

The lowest and highest times for each dataset are highlighted in bold typeface (and those within 0.2 s, too). The algorithms from Wang et al. (2017) and Schaffrin and Wieser (2008) have the lowest times, and the "Fang-type" algorithm B from Snow (2012) or that of Xu et al. (2012) has the highest. For Snow's algorithm, we do not know whether "better choices" for the matrix $S := I_m$ appearing in $\mathcal{Q}_3$ might have led to lower times. However, we should say again that, among the algorithms featured here, only Snow's A and B (also Fang 2011), and now Schaffrin's D, as well as those of Wang et al. (2017) and Deming, can handle a cross-covariance matrix $\mathcal{Q}_{yA}$; and only algorithm B can handle a singular cofactor matrix $\mathcal{Q}_1$. Moreover, the algorithm of Schaffrin and Wieser (2008), which performs admirably here, cannot be used in transformation problems, since the cofactor matrix $\mathcal{Q}_A$ cannot easily be expressed as a Kronecker product in those problems.


**Table 1** Results of 2D line-fitting: data set, number of points, and number of iterations/times in s for 5000 executions. Low and high times shown in bold, and those within 0.2 s. Neri\* includes a non-zero cross-covariance matrix, which some of the algorithms cannot accommodate

# **6 Conclusions and Outlook**

Our study has confirmed that a variety of published algorithms for the TLS solution within (partial) EIV-Models yield equivalent numerical results for the estimated parameters and residuals across a number of tested datasets. This is to be expected. It also suggests that one can do just fine working within the standard EIV-Model rather than resorting to the partial EIV-Model, though it might be worth adopting the latter in certain cases. We suggest the following logic for choosing a TLS algorithm, beginning with more general cases and moving towards more specific ones.


Our future work will consider making further efficiency improvements to the algorithms as we have coded them and will report on their performance among a wider variety of problems and datasets.

**Acknowledgements** We would like to thank Prof. Wang and Dr. Guangyu Xu for kindly sharing their MATLAB algorithm with us. Thanks are also expressed to PhD student Xuechen Yang for helping us understand certain parts of the papers of Wang et al. in Chinese.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Bayesian Robust Multivariate Time Series Analysis in Nonlinear Regression Models with Vector Autoregressive and t-Distributed Errors**

Alexander Dorndorf, Boris Kargoll, Jens-André Paffenholz, and Hamza Alkhatib

#### **Abstract**

Geodetic measurements rely on high-resolution sensors but produce data sets with many observations, which may contain outliers and correlated deviations. This paper proposes a powerful solution using Bayesian inference. The observed data are modeled as a multivariate time series with a stationary vector-autoregressive (VAR) process and a multivariate t-distribution for the white noise. Bayes' theorem integrates prior knowledge. The parameters, including the functional parameters, VAR coefficients, scale factors, and the degree of freedom of the t-distribution, are estimated by Markov chain Monte Carlo using a Metropolis-within-Gibbs algorithm.

#### **Keywords**

Metropolis-within-Gibbs algorithm · Robust Bayesian time series analysis · t-distribution · VAR process

# **1 Introduction**

In statistical modeling and hypothesis testing, models and procedures exist for estimating parameters from observations. These models often include a functional model, correlation model, and stochastic model, with the latter usually assumed to be normally distributed. However, this assumption can lead to incorrect results if outliers are present.

Institute of Geo-Engineering, Clausthal University of Technology, Clausthal, Germany e-mail: alexander.dorndorf@tu-clausthal.de; jens-andre.paffenholz@tu-clausthal.de

Various approaches exist to address the issue of outliers in observations. One such approach is robust parameter estimation, which aims to reduce the impact of outliers on the estimation result. In Bayesian inference, robust estimation is achieved by substituting the initially assumed distribution for the observations with a distribution having heavy tails. Consequently, to obtain a robust estimator for the assumption of normally distributed observations, one option is to replace this distribution with a heavy-tailed t-distribution (Lange et al. 1989).

Modeling multivariate time series has been approached by Alkhatib et al. (2018) and Kargoll et al. (2020). Alkhatib et al. (2018) proposed a nonlinear functional model with a t-distributed error model, while Kargoll et al. (2020) introduced two different outlier models for nonlinear deterministic and vector-autoregressive (VAR) models. The VAR process models auto- and cross-correlations, but neither Alkhatib et al. (2018) nor Kargoll et al. (2020) considers prior knowledge of the parameters. Kargoll et al. (2020) derived a generalized expectation maximization (GEM) algorithm to approximate the parameters, but it does not include prior knowledge, and the variance-covariance matrix (VCM) of the parameters can only be estimated with a computationally intensive bootstrapping algorithm. However, it

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_210

Authors Alexander Dorndorf and Hamza Alkhatib contributed equally to this work.

A. Dorndorf · J.-A. Paffenholz

B. Kargoll FB AFG, Hochschule Anhalt, Dessau, Germany e-mail: boris.kargoll@hs-anhalt.de

H. Alkhatib (-) Geodetic Institute, Leibniz University Hannover, Hannover, Germany e-mail: alkhatib@gih.uni-hannover.de

is possible to integrate prior knowledge and estimate the VCM of the parameters in Bayesian inference, under the assumption that the prior knowledge is available in the form of a distribution function.

In Dorndorf et al. (2021), the model in Alkhatib et al. (2018) was extended to consider prior information using Bayesian inference. This paper focuses on the VAR model. For an overview of Bayesian time series analysis models, see Steel (2010). In Bayesian time series analysis, the VAR coefficients are treated as random variables (Box and Jenkins 2015) and require a prior density for estimation. A Bayesian AR model with a non-informative prior and normally distributed white noise is presented in Box and Jenkins (2015). Ni and Sun (2005) introduced a Bayesian VAR model structured similarly to Kargoll et al. (2020) and solved with a Gibbs sampler, but it requires the time series to be detrended.

The algorithm from Dorndorf et al. (2021) will be extended to handle a VAR process (described in detail in Sect. 2), and the posterior density function will be approximated through a Markov chain Monte Carlo (MCMC) algorithm (outlined in Sect. 3). A multivariate time series model for laser tracker observations of a circle in 3D will be proposed in Sect. 4 and evaluated through Monte Carlo simulation. The findings will be used to evaluate the performance of the implemented Metropolis-Hastings-within-Gibbs algorithm.

# **2 The Bayesian Time Series Model**

The observations $\boldsymbol{\ell}_t$ are expressed in the observation matrix $\mathbf{L} = \left[\boldsymbol{\ell}_1 \cdots \boldsymbol{\ell}_n\right]^T$. The observation model is defined to be a regression time series

$$\mathcal{L}_t = \tilde{\boldsymbol{\ell}}_t + \mathcal{E}_t = \mathbf{h}_t\left(\tilde{\boldsymbol{\beta}}\right) + \mathcal{E}_t, \quad t = 1, \dots, n, \tag{1}$$

where the random variable $\mathcal{L}_t$ consists of a deterministic part $\tilde{\boldsymbol{\ell}}_t$ and a stochastic part $\mathcal{E}_t$. Here $\tilde{\boldsymbol{\ell}}_t$ is a true value and can be described by an arbitrary, possibly nonlinear (differentiable) function $\mathbf{h}_t(\cdot)$ of the true functional parameters $\tilde{\boldsymbol{\beta}}$. The stochastic component $\mathcal{E}_t$ represents colored noise for the time series $\mathcal{L}_t$ that is obtained from a VAR model with

$$\mathcal{E}_t = \tilde{A}_1 \mathcal{E}_{t-1} + \dots + \tilde{A}_p \mathcal{E}_{t-p} + \mathcal{U}_t. \tag{2}$$

The matrix $\tilde{A}_j$ contains the true VAR coefficients of the VAR process of order $p$; the matrix is thus given by

$$
\tilde{\mathcal{A}}\_{j} = \begin{bmatrix}
\tilde{\alpha}\_{j;1,1} & \cdots & \tilde{\alpha}\_{j;1,N} \\
\vdots & \ddots & \vdots \\
\tilde{\alpha}\_{j;N,1} & \cdots & \tilde{\alpha}\_{j;N,N}
\end{bmatrix}, \quad j = 1, \ldots, p,\tag{3}
$$


where $\mathcal{U}_t$ in Eq. 2 is the white noise, assumed to follow a multivariate Student (t-)distribution, $\mathcal{U}_t \sim t(\mathbf{0}, \tilde{\Psi}, \tilde{\nu})$, where the expectation of the white noise $\mathcal{U}_t$ is $\mathbf{0}$, $\tilde{\Psi}$ denotes the true scale matrix of the white noise, and $\tilde{\nu}$ is the true degree of freedom of the multivariate t-distribution; $N$ in Eq. 3 is the dimension of the multivariate time series. The scale matrix has the structure of a VCM, resulting in

$$\tilde{\Psi} = \begin{bmatrix} \tilde{\psi}_1^2 & \cdots & \tilde{\rho}_{1,N}\tilde{\psi}_1\tilde{\psi}_N \\ \vdots & \ddots & \vdots \\ \tilde{\rho}_{N,1}\tilde{\psi}_1\tilde{\psi}_N & \cdots & \tilde{\psi}_N^2 \end{bmatrix} \tag{4}$$

with the true correlation coefficients $\tilde{\rho}_{i,k}$ and the true scaling factors $\tilde{\psi}_i^2$ on the diagonal.
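For illustration, Eqs. (2)–(4) can be simulated for an assumed bivariate VAR(1) process with t-distributed white noise; all coefficient values below are made up for the sketch and are not from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, nu = 2, 500, 4.0                       # dimension, length, d.o.f.
A1 = np.array([[0.5, 0.1],
               [0.0, 0.3]])                  # VAR(1) coefficients, cf. Eq. 3
Psi = np.array([[1.0, 0.3],
                [0.3, 1.0]])                 # scale matrix, cf. Eq. 4
C = np.linalg.cholesky(Psi)

E = np.zeros((n, N))                         # colored noise E_t
for t in range(1, n):
    # multivariate t white noise as a normal scale mixture: U_t ~ t(0, Psi, nu)
    g = rng.chisquare(nu) / nu
    U_t = (C @ rng.standard_normal(N)) / np.sqrt(g)
    E[t] = A1 @ E[t - 1] + U_t               # Eq. 2 with p = 1
```

The scale-mixture construction makes the heavy tails of the t-distributed white noise explicit: small values of the mixing variable inflate individual noise vectors, producing outlier-like deviations.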

It follows that the random variable $\mathcal{L}_t$ of Eq. 1 can be specified by the parameters $\tilde{\boldsymbol{\beta}}$, $\tilde{\Psi}$, $\tilde{A}_j$, and $\tilde{\nu}$, where the scale matrix $\tilde{\Psi}$ according to Eq. 4 consists of $\tilde{\psi}_i^2$ (with $i \in \{1, \dots, N\}$) and $\tilde{\rho}_{k,o}$ (with $k \in \{1, \dots, N-1\}$ and $o \in \{2, \dots, N\}$). These parameters are now grouped into the true parameter vector:

$$
\tilde{\boldsymbol{\theta}} = \left[ \tilde{\boldsymbol{\beta}}^T \left( \tilde{\boldsymbol{\psi}}^2 \right)^T \tilde{\boldsymbol{\rho}}^T \, \tilde{\boldsymbol{a}}^T \, \tilde{\boldsymbol{v}} \right]^T \,. \tag{5}
$$

Thus, this vector consists of $\tilde{\boldsymbol{\beta}} = [\tilde{\beta}_1, \dots, \tilde{\beta}_m]^T$, $\tilde{\boldsymbol{\psi}}^2 = [\tilde{\psi}_1^2, \dots, \tilde{\psi}_N^2]^T$, $\tilde{\boldsymbol{\rho}} = [\tilde{\rho}_{1,2}, \dots, \tilde{\rho}_{1,N}, \tilde{\rho}_{2,3}, \dots, \tilde{\rho}_{N-1,N}]^T$, and $\tilde{\mathbf{a}} = [\tilde{\alpha}_{1;1,1}, \dots, \tilde{\alpha}_{1;N,1}, \tilde{\alpha}_{1;1,2}, \dots, \tilde{\alpha}_{1;N,N}, \tilde{\alpha}_{2;1,1}, \dots, \tilde{\alpha}_{p;N,N}]^T$. Hence, the dimension of the parameter vector $\tilde{\boldsymbol{\theta}}$ is $B = m + N + \frac{N^2 - N}{2} + N^2 p + 1$, where $m$ is the total number of functional parameters $\tilde{\boldsymbol{\beta}}$, $N$ is the dimension of the multivariate time series, and $p$ is the order of the VAR process.
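The dimension $B$ can be checked with a one-line helper; for example, $m = 1$, $N = 3$, $p = 1$ gives $B = 1 + 3 + 3 + 9 + 1 = 17$ (the function name is ours):

```python
def param_dim(m, N, p):
    # B = m + N + (N^2 - N)/2 + N^2*p + 1: functional parameters, scale
    # factors, correlation coefficients, VAR coefficients, degree of freedom
    return m + N + (N * N - N) // 2 + N * N * p + 1
```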

In general, all parameters in Eq. 5 are unknown for the observed data $\mathbf{L}$, and thus the estimated values $\hat{\boldsymbol{\theta}}$ need to be calculated. In the context of Bayesian inference, the parameter vector $\hat{\boldsymbol{\theta}}$ is estimated based on the corresponding random variable $\Theta$, whose density function is also unknown. Assuming that the time series data $\mathcal{L}_t$ from the model in Eq. 1 are given, these data depend on the random variable $\Theta$. This relationship can be expressed as a likelihood function $f_{\mathcal{L}|\Theta}$. Let us assume that we possess prior knowledge about the random variable $\Theta$, which can be represented as a probability distribution $f_\Theta$ known as the prior density function. We can then update this prior knowledge with the observed data $\mathbf{L}$ using Bayes' theorem, resulting in the calculation of the posterior density function $f_{\Theta|\mathcal{L}}$ as follows:

$$f_{\Theta|\mathcal{L}}\left(\boldsymbol{\beta}, \boldsymbol{\psi}^2, \boldsymbol{\rho}, \mathbf{a}, \nu \mid \mathbf{L}\right) \propto f_{\Theta}\left(\boldsymbol{\beta}, \boldsymbol{\psi}^2, \boldsymbol{\rho}, \mathbf{a}, \nu\right) \cdot f_{\mathcal{L}|\Theta}\left(\mathbf{L} \mid \boldsymbol{\beta}, \boldsymbol{\psi}^2, \boldsymbol{\rho}, \mathbf{a}, \nu\right). \tag{6}$$

According to Kargoll et al. (2020) the joint likelihood function leads to:

$$f_{\mathcal{L}|\Theta}(\mathbf{L} \mid \boldsymbol{\theta}) = \prod_{t=1}^{n} \left( \frac{\Gamma\left(\frac{\nu + N}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{(\nu\pi)^N}} \,|\Psi|^{-1/2} \left[1 + \frac{\mathbf{u}_t^T \Psi^{-1} \mathbf{u}_t}{\nu}\right]^{-\frac{\nu + N}{2}} \right) \tag{7}$$

where $\Gamma$ is the gamma function, and $\mathbf{u}_t$ in Eq. 7 is:

$$\mathbf{u}_t = \mathbf{e}_t - \sum_{j=1}^{p} \left(A_j \cdot \mathbf{e}_{t-j}\right) = \boldsymbol{\ell}_t - \mathbf{h}_t(\boldsymbol{\beta}) - \sum_{j=1}^{p} \left(A_j \cdot \left[\boldsymbol{\ell}_{t-j} - \mathbf{h}_{t-j}(\boldsymbol{\beta})\right]\right). \tag{8}$$

The noninformative prior densities used are:

$$f_{\Theta_\beta}(\boldsymbol{\beta}) \propto 1, \quad -\infty < \boldsymbol{\beta} < \infty,$$

$$f_{\Theta_\rho}(\boldsymbol{\rho}) \propto 1, \quad -1 \le \boldsymbol{\rho} \le 1,$$

$$f_{\Theta_\nu}(\nu) \propto 1, \quad 2 < \nu \le 120.$$

The prior densities used are proportional to 1 and are therefore improper. Combining these densities with the likelihood nevertheless results in a proper posterior density. For the correlation coefficients ρ, an improper density could also have been used, but a correlation coefficient is mathematically constrained to lie between −1 and 1. Similarly, the scaling factors ψ² are restricted to positive values. For the degree of freedom ν, a proper density was defined according to Kargoll et al. (2020) to prevent an improper posterior density and to ensure a fair comparison between the Bayesian and the classical GEM model. The computation of the exact posterior density function given in Eq. 6 is not feasible, so approximation techniques must be used. For this purpose, an MCMC algorithm has been developed, which is described in the next section.

# **3 The Developed MCMC Algorithm**

The goal of the MCMC method is to generate the random numbers θ^(1) → θ^(2) → ⋯ → θ^(b) as a Markov chain; the method therefore generates a total of b random numbers as realizations of Θ. The Markov chain defined here is related to the full parameter vector θ^(y), where y is the current sample number of the chain. However, since the density function f<sub>Θ|L</sub>(θ^(y) | θ^(y−1), L) is unknown here, the Markov chain cannot be generated directly with the Gibbs sampler; instead, this density is decomposed into univariate conditional densities, as described below. The calculation of the likelihood function for the observations L is based on the product over the unknown white-noise components *u*<sub>t</sub> that result from the relation of Eq. 8. Due to the assumed stochastic independence, the joint prior density function in Eq. 6 can be written as

$$\begin{aligned} f\_{\Theta}\left(\boldsymbol{\beta},\boldsymbol{\psi}^{2},\boldsymbol{\rho},\boldsymbol{\alpha},\nu\right) = f\_{\Theta\_{\beta}}\left(\boldsymbol{\beta}\right) \cdot f\_{\Theta\_{\psi^{2}}}\left(\boldsymbol{\psi}^{2}\right) \cdot f\_{\Theta\_{\rho}}\left(\boldsymbol{\rho}\right) \cdot f\_{\Theta\_{\alpha}}\left(\boldsymbol{\alpha}\right) \cdot f\_{\Theta\_{\nu}}\left(\nu\right). \end{aligned} \tag{9}$$

In this paper, we only consider the case of non-informative prior densities, because the Bayesian model is compared to a comparable classical adjustment model for validation purposes. The non-informative prior densities used for the parameters are:

$$f\_{\Theta\_{\beta}}(\boldsymbol{\beta}) \propto 1, \quad -\infty < \boldsymbol{\beta} < \infty, \qquad f\_{\Theta\_{\psi^{2}}}(\boldsymbol{\psi}^{2}) \propto 1, \quad 0 < \boldsymbol{\psi}^{2} < \infty,$$

$$f\_{\Theta\_{\rho}}(\boldsymbol{\rho}) \propto 1, \quad -1 \le \boldsymbol{\rho} \le 1, \qquad f\_{\Theta\_{\alpha}}(\boldsymbol{\alpha}) \propto 1, \quad -\infty < \boldsymbol{\alpha} < \infty,$$

$$f\_{\Theta\_{\nu}}(\nu) \propto 1, \quad 2 < \nu \le 120. \tag{10}$$
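The box constraints of these non-informative priors translate into a log-prior that is constant inside the support and −∞ outside. A minimal sketch (the function name and argument grouping are illustrative assumptions):

```python
import numpy as np

def log_prior(beta, psi2, rho, alpha, nu):
    """Log of the non-informative prior of Eq. 10.

    Returns 0.0 (an arbitrary constant) inside the support and -inf outside;
    beta and alpha are unconstrained, psi2 must be positive, rho must lie in
    [-1, 1], and nu must lie in (2, 120].
    """
    inside = (np.all(np.asarray(psi2) > 0.0)
              and np.all(np.abs(np.asarray(rho)) <= 1.0)
              and 2.0 < nu <= 120.0)
    return 0.0 if inside else -np.inf
```

Adding this log-prior to the log-likelihood yields the (unnormalized) log-posterior needed by the sampler, and any proposal outside the support is rejected automatically because its posterior ratio is zero.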

The full posterior density function at step y of the Markov chain is decomposed into univariate conditional density functions of the form

$$f\_{\Theta|\mathcal{L}}\left(\theta\_{z}^{(y)} \mid \theta\_{1}^{(y)}, \dots, \theta\_{z-1}^{(y)}, \theta\_{z+1}^{(y-1)}, \dots, \theta\_{B}^{(y-1)}, \boldsymbol{L}\right). \tag{11}$$

Applying the decomposition in Eq. 11 to the posterior density from Eq. 6 means, for example for the parameter β₂, that the conditional posterior density is f<sub>Θ<sub>β₂</sub>|L</sub>(β₂^(y) | β₁^(y), β₃^(y−1), …, β<sub>m</sub>^(y−1), (ψ²)^(y−1), ρ^(y−1), α^(y−1), ν^(y−1), L).

To generate the random numbers θ^(y) with the MCMC method, one of the existing Monte Carlo algorithms for generating Markov chains must be selected. In this paper, the Metropolis-Hastings algorithm and the Gibbs sampler are chosen, which are two of the most important algorithms for MCMC methods (refer to Gelman et al. (2013)). Both algorithms can be combined into a so-called Metropolis-within-Gibbs algorithm, shown in Algorithm 1. The Gibbs sampler is the for loop in line 2 of the algorithm; it requires as input the parameter b, the length of the Markov chain to be generated. The Metropolis algorithm starts in line 4 and is executed once for each component z of θ by the for loop in line 3. To run the Metropolis algorithm, the proposal density f(θ*<sub>z</sub> | θ<sub>z</sub>^(y−1)) is required for generating the random realization θ*<sub>z</sub>. In this paper, a normal distribution is always used as proposal density, so that θ*<sub>z</sub> ~ N(θ<sub>z</sub>^(y−1), σ<sub>z</sub>²).

#### **Algorithm 1** Metropolis-within-Gibbs

**Require:** The number of iterations b and the start position θ^(0)

**Require:** The posterior density f<sub>Θ|L</sub>(θ | L) from Eq. 6

**Require:** For each component z of θ, a univariate symmetric proposal density f(θ*<sub>z</sub> | θ<sub>z</sub>^(y−1)) with a suitably selected jump parameter σ<sub>z</sub>

**Ensure:** Realizations θ^(1), …, θ^(b) of the generated Markov chain

1: Initialize the auxiliary vector ϑ = θ^(0) as a run variable

2: **for** y = 1, …, b **do**

3: &nbsp;&nbsp; **for** z = 1, …, B **do**

4: &nbsp;&nbsp;&nbsp;&nbsp; Draw a random realization θ*<sub>z</sub> from the proposal density f(θ*<sub>z</sub> | ϑ<sub>z</sub>)

5: &nbsp;&nbsp;&nbsp;&nbsp; Create θ* = [ϑ₁, …, ϑ<sub>z−1</sub>, θ*<sub>z</sub>, ϑ<sub>z+1</sub>, …, ϑ<sub>B</sub>]^T

6: &nbsp;&nbsp;&nbsp;&nbsp; Calculate the ratio r = f<sub>Θ|L</sub>(θ* | L) / f<sub>Θ|L</sub>(ϑ | L)

7: &nbsp;&nbsp;&nbsp;&nbsp; Calculate the acceptance probability γ<sub>z</sub>^(y) = min(r, 1) and draw a random realization u<sub>z</sub>^(y) from U(0, 1)

8: &nbsp;&nbsp;&nbsp;&nbsp; Acceptance/rejection step: set ϑ = θ* if u<sub>z</sub>^(y) ≤ γ<sub>z</sub>^(y); otherwise keep ϑ unchanged

9: &nbsp;&nbsp; **end for**

10: Save θ^(y) = ϑ

11: **end for**

The mean of the proposal distribution is set by the previous value in the Markov chain, and the variance σ<sub>z</sub>² determines the jump distance and affects convergence to the desired posterior density. Optimal convergence is achieved when the acceptance rate is around 44% with a normal distribution as the proposal density, as shown in Gelman et al. (1996).
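Algorithm 1 can be sketched compactly in Python. Working with log-densities avoids numerical underflow when forming the ratio r in line 6; the function signature and names below are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

def metropolis_within_gibbs(log_post, theta0, sigma, b, rng=None):
    """Sketch of a Metropolis-within-Gibbs sampler (cf. Algorithm 1).

    log_post : callable returning the log of the posterior density of Eq. 6
    theta0   : (B,) start position theta^(0)
    sigma    : (B,) jump parameters (std. dev. of the normal proposals)
    b        : number of iterations (chain length)
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float).copy()   # auxiliary vector (line 1)
    B = theta.size
    chain = np.empty((b, B))
    lp = log_post(theta)
    for y in range(b):                               # Gibbs loop (line 2)
        for z in range(B):                           # component loop (line 3)
            prop = theta.copy()
            prop[z] += sigma[z] * rng.standard_normal()   # symmetric proposal (lines 4-5)
            lp_prop = log_post(prop)
            # accept with probability min(r, 1), evaluated on the log scale (lines 6-8)
            if np.log(rng.uniform()) < lp_prop - lp:
                theta, lp = prop, lp_prop
        chain[y] = theta                             # save realization (line 10)
    return chain
```

As a quick check, sampling a standard normal log-posterior with jump parameter 2.4 reproduces mean 0 and unit standard deviation up to Monte Carlo error.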

Automating the selection of σ<sub>z</sub>² with adaptive MCMC algorithms, such as the one presented in Roberts and Rosenthal (2009), is recommended when B is large. In Algorithm 1, the acceptance probability decides whether θ* or ϑ is taken as the realization at step y of the chain. To optimize the jump values, the approach presented in Dorndorf et al. (2019) was used to achieve an acceptance rate between 40% and 50%. This method requires initial values, the posterior density, and the conditional density functions to be constructed.

The mean values of the different parameter groups can be estimated from the generated Markov chain θ^(y) using:

$$\hat{\theta}\_z = \frac{1}{b-o} \sum\_{y=o+1}^{b} \theta\_z^{(y)} \quad \text{for} \quad z = 1, \dots, B,\qquad(12)$$

where o is the length of the warm-up phase. Based on the mean values estimated in Eq. 12 and the realizations of the Markov chain generated by Algorithm 1, the VCM of the parameters can be estimated (refer to Gelman et al. (2013)):

$$\hat{\Sigma}\_{\hat{\theta}\hat{\theta};z,i} = \frac{1}{b-o} \sum\_{y=o+1}^{b} \left(\theta\_z^{(y)} - \hat{\theta}\_z\right) \left(\theta\_i^{(y)} - \hat{\theta}\_i\right) \quad \text{for} \quad z = 1, \dots, B \; ; i = 1, \dots, B. \tag{13}$$
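Eqs. 12 and 13 amount to the empirical mean and covariance of the chain after discarding the warm-up samples; a minimal sketch:

```python
import numpy as np

def posterior_moments(chain, o):
    """Posterior mean (Eq. 12) and VCM (Eq. 13) from a Markov chain.

    chain : (b, B) realizations theta^(1), ..., theta^(b)
    o     : number of warm-up samples to discard
    """
    kept = np.asarray(chain)[o:]          # realizations y = o+1, ..., b
    theta_hat = kept.mean(axis=0)         # Eq. 12
    dev = kept - theta_hat
    vcm = dev.T @ dev / kept.shape[0]     # Eq. 13: 1/(b-o) * sum of outer products
    return theta_hat, vcm
```

Note that Eq. 13 uses the factor 1/(b−o) rather than the unbiased 1/(b−o−1), i.e. it corresponds to the biased (maximum-likelihood) covariance estimate.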

# **4 Closed Loop Monte Carlo Simulation**

The Closed Loop Simulation (CLS) in this chapter is based on an experiment conducted at the Geodetic Institute Hanover using a multi-sensor system consisting of a laser scanner and GNSS equipment. The experiment aimed to determine the transformation parameters between the global coordinate system defined by the GNSS equipment and the laser scanner's local, sensor-defined coordinate system using a high-precision laser tracker. For further details, refer to Paffenholz (2012). The CLS was developed to estimate the expected accuracy of the parameters and was used to validate the Bayesian model presented in Algorithm 1. The advantage of a CLS is that the true functional and stochastic models in Eqs. 1 to 4 are known. Real data processing is beyond the scope of this paper.

# **4.1 The Framework of the Simulation**

The CLS involves a 3D non-linear regression model of a circle with 6 parameters: two for the orientation (φ and ω), one for the radius (r), and three for the center (c<sub>x</sub>, c<sub>y</sub>, c<sub>z</sub>). The observable 3D circle points are described by

$$\tilde{\ell}\_{x,t} := \tilde{\ell}\_{1,t} = h\_{1,t}\left(\tilde{\boldsymbol{\beta}}\right) = \tilde{r}\sin\left(\tilde{\kappa}\_t\right)\cos\left(\tilde{\varphi}\right) + \tilde{c}\_x,\tag{14}$$

$$\tilde{\ell}\_{y,t} := \tilde{\ell}\_{2,t} = h\_{2,t}\left(\tilde{\boldsymbol{\beta}}\right) = \tilde{r}\sin\left(\tilde{\kappa}\_t\right)\sin\left(\tilde{\varphi}\right)\sin\left(\tilde{\omega}\right) + \tilde{r}\cos\left(\tilde{\kappa}\_t\right)\cos\left(\tilde{\omega}\right) + \tilde{c}\_y,\tag{15}$$

$$\tilde{\ell}\_{z,t} := \tilde{\ell}\_{3,t} = h\_{3,t}\left(\tilde{\boldsymbol{\beta}}\right) = -\tilde{r}\sin\left(\tilde{\kappa}\_t\right)\sin\left(\tilde{\varphi}\right)\cos\left(\tilde{\omega}\right) + \tilde{r}\cos\left(\tilde{\kappa}\_t\right)\sin\left(\tilde{\omega}\right) + \tilde{c}\_z,\tag{16}$$

where t = 1, …, n (with n = 1000) and κ̃<sub>t</sub> = κ̃<sub>O</sub> + Δκ̃ · (t − 1). In these equations the parameter κ<sub>O</sub> is the unknown orientation and Δκ is the angle of rotation of the TLS between two observations.
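The noise-free circle trajectory of Eqs. 14-16 can be generated as follows. This is a sketch; the parameter ordering in `beta` is an assumption.

```python
import numpy as np

def circle_points(beta, n=1000):
    """Noise-free 3-D circle observations after Eqs. 14-16.

    beta = (c_x, c_y, c_z, r, omega, phi, kappa_O, d_kappa); angles in degrees.
    The parameter ordering is an assumption of this sketch.
    """
    cx, cy, cz, r, omega, phi, kappa0, dkappa = beta
    kappa = np.radians(kappa0 + dkappa * np.arange(n))   # kappa_t = kappa_O + d_kappa*(t-1)
    om, ph = np.radians(omega), np.radians(phi)
    x = r * np.sin(kappa) * np.cos(ph) + cx                                         # Eq. 14
    y = (r * np.sin(kappa) * np.sin(ph) * np.sin(om)
         + r * np.cos(kappa) * np.cos(om) + cy)                                     # Eq. 15
    z = (-r * np.sin(kappa) * np.sin(ph) * np.cos(om)
         + r * np.cos(kappa) * np.sin(om) + cz)                                     # Eq. 16
    return np.column_stack([x, y, z])
```

Because the two vectors spanning the circle plane in Eqs. 14-16 are orthonormal, every generated point lies exactly at distance r from the center, which provides a quick self-check of the implementation.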

In this simulation, the functional parameters are the 3D circle parameters β, which were assumed to take the true values c̃<sub>x</sub> = 0.12 [m], c̃<sub>y</sub> = 3.36 [m], c̃<sub>z</sub> = 0.10 [m], r̃ = 0.50 [m], ω̃ = 0.05 [deg], φ̃ = 0.01 [deg], κ̃<sub>O</sub> = 184.00 [deg] and Δκ̃ = 0.36 [deg]. The model of Eq. 2 with a VAR order of p = 1 is used in the CLS as the stochastic model for generating the realisations of the coloured noise. The VAR matrix Ã₁ then results according to Eq. 3 for the chosen true coefficients α̃₁;₁,₁ = 0.50, α̃₁;₁,₂ = 0.10, α̃₁;₁,₃ = 0.15, α̃₁;₂,₁ = 0.10, α̃₁;₂,₂ = 0.20, α̃₁;₂,₃ = 0.25, α̃₁;₃,₁ = 0.20, α̃₁;₃,₂ = 0.05 and α̃₁;₃,₃ = 0.75. For the generation of the random white noise, the stochastic model *U*<sub>t</sub> ~ t(0, Ψ̃, ν̃) is used in the CLS. The scaling matrix in Eq. 4 is initialized with the scaling factors ψ̃ = [8.8 6.1 11.9]^T [mm] and the correlation coefficients ρ̃ = [0.37 0.15 0.09]^T. The degree of freedom of the Student distribution is fixed to ν̃ = 4.14.
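The coloured noise of this stochastic model can be simulated by driving a VAR(1) recursion with multivariate Student-t white noise. The following sketch uses the true values from Sect. 4.1; the pairing of the three correlation coefficients with the off-diagonal elements of the correlation matrix is an assumption, as is the construction of Ψ from the scaling factors.

```python
import numpy as np

def simulate_colored_noise(n, A1, Psi, nu, rng):
    """VAR(1) coloured noise driven by multivariate-t white noise (Eqs. 2-4 sketch)."""
    N = Psi.shape[0]
    chol = np.linalg.cholesky(Psi)
    e = np.zeros((n, N))
    for t in range(n):
        w = rng.chisquare(nu)                               # u_t ~ t_nu(0, Psi):
        u = chol @ rng.standard_normal(N) / np.sqrt(w / nu) # normal / sqrt(chi2/nu)
        e[t] = (A1 @ e[t - 1] if t > 0 else 0.0) + u        # Eq. 2 with p = 1
    return e

# true values from Sect. 4.1
A1 = np.array([[0.50, 0.10, 0.15],
               [0.10, 0.20, 0.25],
               [0.20, 0.05, 0.75]])
psi = np.array([8.8, 6.1, 11.9])              # scaling factors (units as in Sect. 4.1)
C = np.array([[1.00, 0.37, 0.15],
              [0.37, 1.00, 0.09],
              [0.15, 0.09, 1.00]])            # assumed pairing of the rho entries
Psi = np.diag(psi) @ C @ np.diag(psi)
e = simulate_colored_noise(1000, A1, Psi, 4.14, np.random.default_rng(1))
```

Adding such a noise realization to the noise-free trajectory of Eqs. 14-16 yields one synthetic data set of the CLS.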

# **4.2 Results of the Simulation**

In the CLS, the results of the developed Bayesian MCMC Algorithm 1 from Sect. 3 are compared with the results of the GEM algorithm in Kargoll et al. (2020). The GEM model in the CLS was run twice: once with the estimation of the parameter ν (referred to as GEM), and once with a fixed degree of freedom ν = 10,000,000 (referred to as GEM ν = ∞). The latter scenario represents the case where the likelihood function corresponds to a multivariate normal distribution. The purpose of these runs is to show the effect of an incorrect noise assumption on the estimation results. The settings for the MCMC algorithm were b = 5000 and o = 2000, and the initial parameter values were set to the true parameters of the CLS to avoid any biases in the results.

In the following analysis, the parameters estimated by the MCMC Algorithm 1 and the GEM algorithm are compared with the true values θ̃ given in Sect. 4.1 by forming the differences Δθ̂<sub>z,s</sub> = θ̂<sub>z,s</sub> − θ̃<sub>z</sub> with z = 1, …, B, where s = 1, …, 10,000 is the index of the Monte Carlo run of the CLS and B is the total number of parameters. These differences are shown for selected parameters in Fig. 1 as box plots. The results show that the parameters estimated by the MCMC and GEM algorithms are almost identical, with wider confidence intervals for the functional parameters in the GEM estimator with ν = ∞. The estimated VAR coefficients are comparable among all estimators and unbiased. The estimated correlation coefficients of the GEM estimator with ν = ∞ scatter more around the true values than those of the other two estimators. The MCMC and GEM solutions are similar, but there are differences in the degree of freedom (ν) and in the parameters of the scaling matrix (Ψ). The median of the box plots for ν and ψ<sub>x</sub> deviates more for GEM than for MCMC, whereas no such deviation is noticeable for the correlation coefficients (ρ). However, the larger deviation of the median in GEM does not affect the dispersion of the estimated parameters, which is comparable for both MCMC and GEM.

The performance of both algorithms was compared by using the estimated β̂ to predict ℓ̂, and then calculating the residuals ṽ<sub>i,t,s</sub> (i = x, y, z; t = 1, …, 1000; s = 1, …, 10,000) to see how well the predictions match the true observations. The mean, standard deviation, minimum, and maximum of the residuals were determined and are shown in Table 1.

The GEM algorithm generates residuals with slightly smaller values than the MCMC algorithm. The differences between the estimated parameters ν̂ and ψ̂<sub>x</sub> presented in Fig. 1 have no influence on the results of Table 1, because the residuals ṽ<sub>x</sub>, ṽ<sub>y</sub> and ṽ<sub>z</sub> are calculated only from the estimated parameters β̂. Moreover, the differences shown in Table 1 are not significant compared to the parameters ψ̃, which were used to create the white noise for the CLS. The predictions of ℓ̃<sub>x</sub> and ℓ̃<sub>y</sub> deviate less than those of the ℓ̃<sub>z</sub> component, because the symmetrical circle trajectory in the x-y plane supports the estimation, whereas in the z-component inaccuracies in the estimated parameters have a stronger effect.

# **5 Conclusions**

In this paper, a robust Bayesian model with a VAR process was presented and compared to a classical robust model based on a GEM algorithm with a VAR process, as well as to a variant with a multivariate normal distribution assumption for the white noise. The robust Bayesian model showed almost identical results to the robust classical model, with differences arising from the use of different estimators for the parameters. The robust Bayesian model offers the advantage of being able to determine the precision of the parameters and to apply different estimators for the parameters, which is not possible with the classical model without a significant increase in computational cost. Future work will address the limitations of the robust Bayesian model and its applications, such as investigating the quality of the VCM and the convergence of the Markov chains, defining an informative prior density, and validating the model on real data.

**Fig. 1** Differences between the estimated values and the true values for the 10,000 CLS runs as box plots. For GEM ν = ∞ the degree of freedom is fixed at ν = 10,000,000

**Table 1** Descriptive statistics for estimated residuals between predicted observations and true observations


**Acknowledgements** This research was supported by the German Research Foundation (DFG, Deutsche Forschungsgemeinschaft) – project "Bayesian adaptive robust adjustment of multivariate geodetic measurement processes with data gaps and nonstationary colored noise" under number 386369985.

# **References**

Alkhatib H, Kargoll B, Paffenholz J-A (2018) Further results on robust multivariate time series analysis in nonlinear models with autoregressive and t-distributed errors. In: Valenzuela O, Rojas F, Pomares H, Rojas I (eds) Time series analysis and forecasting. ITISE 2017. Contributions to statistics. Springer, New York, pp 25–38


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

**Part III**

**Geodetic Data Analysis**

# **An Estimate of the Effect of 3D Heterogeneous Density Distribution on Coseismic Deformation Using a Spectral Finite-Element Approach**

# Yoshiyuki Tanaka, Volker Klemann, and Zdeněk Martinec

#### **Abstract**

The advancement of the Global Geodetic Observing System (GGOS) has enabled monitoring of mass transport and solid-Earth deformation processes with unprecedented accuracy. Coseismic deformation is modelled as an elastic response of the solid Earth to an internal dislocation. Self-gravitating spherical Earth models can be employed in modelling regional to global scale deformations. Recent seismic tomography and high-pressure/high-temperature experiments have revealed finer-scale lateral heterogeneities in the elasticity and density structures within the Earth, which motivates us to quantify the effects of such finer structures on coseismic deformation. To achieve this, fully numerical approaches including the Finite Element Method (FEM) have often been used. In our previous study, we presented a spectral FEM, combined with an iterative perturbation method, to consider lateral heterogeneities in the bulk and shear moduli for surface loading. The distinct feature of this approach is that the deformation of the entire sphere is modelled in the spectral domain with finite elements dependent only on the radial coordinate. By this, self-gravitation can be treated without the special treatments employed when using an ordinary FEM. In this study, we extend the formulation so that it can deal with lateral heterogeneities in density in the case of coseismic deformation. We apply this approach to a longer-wavelength vertical deformation due to a large earthquake. The result shows that the deformation for a laterally heterogeneous density distribution is suppressed mainly where the density is larger, which is consistent with the fact that self-gravitation reduces longer-wavelength deformations for 1-D models. The effect on the vertical displacement is relatively small, but the effect on the gravity change could amount to the same order of magnitude of a given heterogeneity if the horizontal scale of the heterogeneity is large enough.

#### **Keywords**

Deformation - Density - Earthquake - Finite element method - Gravity field - Lateral heterogeneity

Y. Tanaka (-)

V. Klemann Department 1 Geodesy, German Research Centre for Geosciences, Potsdam, Germany

Z. Martinec

Faculty of Mathematics and Physics, Charles University, Praha 8, Czech Republic

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_236

Department of Earth and Planetary Science, Graduate School of Science, The University of Tokyo, Tokyo, Japan e-mail: y-tanaka@eps.s.u-tokyo.ac.jp

Geophysics Section, Dublin Institute for Advanced Studies DIAS, Dublin 2, Ireland

# **1 Introduction**

Recent advancements in terrestrial and satellite gravity observations have enabled us to monitor mass transports associated with physical processes in the atmosphere, ocean, hydrosphere, and cryosphere with an unprecedented accuracy (Crossley et al. 2013; Wouters et al. 2014). These surface mass transports cause elastic and anelastic deformations of the solid Earth. The resultant deformation of the density interfaces (atmosphere-crust and crust-mantle boundaries, etc.) and compression/dilatation of the solid-Earth material lead to an additional change in the gravity field. By physically modelling this process and comparing the model results with observations, we can learn about deformation mechanisms and rheological properties of the material (e.g., crustal rigidity, mantle viscosity) (Whitehouse 2018).

In addition to surface loading, co- and post-seismic gravity changes induced by large earthquakes have been observed by satellite observations with spatial scales of 300 km and amplitudes of several μGal (1 μGal = 10<sup>−8</sup> m s<sup>−2</sup>) (e.g., Matsuo and Heki 2011). It is widely accepted that coseismic deformation is physically represented by the elastic response to an internal dislocation. To interpret gravity changes due to large earthquakes, dislocation models have been proposed, which assume a self-gravitating sphere with a 1-D (i.e., spherically symmetric) internal structure (Sun 2014; Zhou et al. 2019). However, seismic tomography and high-temperature/high-pressure experiments nowadays reveal increasingly finer internal structures, particularly in plate subduction zones (Hasegawa and Nakajima 2017). This motivates us to estimate the effects of laterally heterogeneous structures on gravity changes.

So far, several methods have been proposed to calculate coseismic deformation of a laterally heterogeneous Earth model. They can be categorized into two types. (Semi- )analytical perturbation approaches (e.g., Pollitz 2003; Fu and Sun 2008) give a physically clear image on the causes of the deformation. However, the perturbation methods employed make it difficult to deal with strong lateral heterogeneities. On the other hand, fully numerical approaches such as the finite element method (FEM) and the spectral element method can treat such heterogeneities (e.g., Cheng et al. 2019; Pollitz 2020). However, the inclusion of self-gravitation can cause modelling errors when using a commercial package of the FEM. To prevent this, special treatments of self-gravitation are necessary (e.g., Wu 2004; Nield et al. 2022; Vachon et al. 2022).

To address the above difficulties associated with strong heterogeneity and self-gravitation, Tanaka et al. (2019) employed a spectral finite-element approach (Martinec 2000) which combines the advantages of the analytical and numerical approaches. Tanaka et al. (2019) considered lateral heterogeneities in the bulk and shear moduli in modelling of the elastic response to surface loading. This model was applied to ocean tide loading (Huang et al. 2021). However, lateral heterogeneities in density have not yet been considered.

The purpose of this study is to extend the method by Tanaka et al. (2019) for the case of laterally heterogeneous density distributions when modelling coseismic deformation. In Sect. 2, we first explain the way that the spectral finite-element approach facilitates computation of global deformation. Next, we estimate the effect of a 3-D density distribution. In Sect. 3, after some checks of the method, we demonstrate the effects of 3-D density distribution on the coseismic vertical displacement and gravity change due to a megathrust earthquake. Finally, in Sect. 4, results and future work are summarized.

# **2 Method**

# **2.1 An Overview of the Spectral Finite-Element Approach**

We apply the approach to the governing equations for the elastic deformation of a self-gravitating sphere (Farrell 1972) under free-surface and internal source conditions represented by double-couple forces that are equivalent to a dislocation. No terms are ignored/approximated in the governing equations and no additional forces/boundary conditions are added. The governing equations are converted into a corresponding variational problem associated with the elastic strain and gravitational energies (E = E<sub>bulk</sub> + E<sub>shear</sub> + E<sub>grav</sub>) and the work derived from the surface and source conditions (δF) (Tanaka et al. 2014):

$$
\delta E\left(\mathfrak{u}, \delta \mathfrak{u}, \phi\_1, \delta \phi\_1\right) = \delta F\left(\delta \mathfrak{u}, \delta \phi\_1\right), \tag{1}
$$

where *u*, φ<sub>1</sub> and δ denote the displacement, the incremental gravity potential and the variation, respectively. The variation in the shear strain energy is given as

$$\delta E\_{\text{shear}}(\boldsymbol{u}, \delta \boldsymbol{u}) \equiv \int\_{V} 2\mu \left( \boldsymbol{\varepsilon} \cdot \delta \boldsymbol{\varepsilon} \right) dV,\tag{2}$$

where μ and ε denote the shear modulus and the strain tensor, respectively, and *V* indicates a volume integral over the entire sphere. The source time function included in δF is assumed to be a step function. The solution of Eq. (1) gives the static deformation which balances the double-couple forces.

Commercial FEM packages usually employ 3-D finite elements to compute Eqs. (1) and (2). In our approach, however, Eq. (2) is decomposed into the 1-D and residual 3-D parts:

$$\begin{split} \delta E\_{\text{shear}} &= \int\_{V} 2\mu\_{0}(r) \left( \epsilon \cdot \delta \epsilon \right) dV \\ &+ \int\_{V} 2\Delta \mu \left( r, \theta, \varphi \right) \left( \epsilon \cdot \delta \epsilon \right) dV, \end{split} \tag{3}$$

where (*r*, θ, φ) denote the radial distance, colatitude and longitude, respectively, and μ<sub>0</sub> and Δμ represent the shear modulus of the reference 1-D model and the difference from μ<sub>0</sub>, respectively. We apply 1-D finite elements in the radial direction and represent angular dependencies of the strain field by tensor spherical harmonics. Then, thanks to their orthogonal properties, the first term on the RHS of Eq. (3) becomes straightforward for numerical evaluation. The second term is numerically evaluated (Martinec 2000). We assume that lateral heterogeneities exist only within a small volume ΔV near the source (i.e., Δμ = 0 outside ΔV and ΔV ≪ V). Then, we can take the integration domain of the second term to be much smaller than the entire sphere. These treatments reduce costs for computing the global deformation to a large extent.

# **2.2 Inclusion of Laterally Heterogeneous Density Distributions**

The variation in the gravitational energy for the 1-D case is given by Eq. (42) of Martinec (2000) as

$$\begin{aligned} &\delta E\_{\mathrm{grav}}(\boldsymbol{u}, \phi\_{1}, \delta \boldsymbol{u}, \delta \phi\_{1}) \\ &\equiv \int\_{V} \rho\_{0} \left[ \mathrm{grad} \left( \boldsymbol{u} \cdot \mathrm{grad}\ \phi\_{0} \right) - \mathrm{div}\, \boldsymbol{u}\ \mathrm{grad}\ \phi\_{0} + \mathrm{grad}\ \phi\_{1} \right] \cdot \delta \boldsymbol{u}\, dV \\ &+ \int\_{V} \left[ \frac{1}{4 \pi G} (\mathrm{grad}\ \phi\_{1} \cdot \mathrm{grad}\ \delta \phi\_{1}) + \rho\_{0} (\boldsymbol{u} \cdot \mathrm{grad}\ \delta \phi\_{1}) \right] dV, \end{aligned} \tag{4}$$

where *G* is the gravitational constant, and ρ<sub>0</sub>(*r*) and *g*<sub>0</sub>(*r*), with *g*<sub>0</sub> = grad φ<sub>0</sub>, denote the density and gravity for the initial state before deformation takes place. In the following, we extend this energy variation to the laterally heterogeneous case and will come back to the remaining 3-D part of the energy variation which is not included in Eq. (4).

When there is a small lateral heterogeneity, the initial stress field, before an earthquake occurs, deviates only slightly from the hydrostatic state. In the following, *u* and φ<sub>1</sub> represent the coseismic deformation with respect to this laterally heterogeneous initial state. We substitute ρ<sub>0</sub> + Δρ(*r*, θ, φ) and *g*<sub>0</sub> + Δ*g*(*r*, θ, φ) for ρ<sub>0</sub> and *g*<sub>0</sub> in Eq. (4), respectively. Here, Δρ denotes the difference from the 1-D density distribution at the initial state due to a given lateral heterogeneity. Since gravity is linearly dependent on density, Poisson's equation is valid for the incremental density. Therefore, div grad Δφ<sub>0</sub> = 4πG Δρ holds, and Δ*g* (= grad Δφ<sub>0</sub>) denotes the static gravity increment due to Δρ. Subtracting the energy variation for the 1-D case from the result, neglecting the terms including the product Δρ Δ*g*, and considering the orthogonality of vector spherical harmonics, we obtain

$$
\delta E\_{\text{grav},jm}^{\Delta} = \delta E\_{\text{grav},jm}^{I} \left( \Delta \rho \right) + \delta E\_{\text{grav},jm}^{II} \left( \Delta \mathbf{g} \right), \quad (5)
$$

where

$$\begin{aligned} \delta E^{I}\_{\mathrm{grav},jm} = \sum\_{j'm'} \int\_{\Delta V} \Delta \rho \left( r, \theta, \varphi \right) & \left[ \left( -\frac{4 g\_0 U\_{jm}}{r} + \frac{J g\_0 V\_{jm}}{r} + \frac{dF\_{jm}}{dr} + 8\pi G \rho\_{0} U\_{jm} \right) \delta U^{\star}\_{j'm'} \right. \\ & + \left( \frac{g\_0 U\_{jm}}{r} + \frac{F\_{jm}}{r} \right) \delta V^{\star}\_{j'm'} + \left. \left( U\_{jm} \frac{d\,\delta F^{\star}\_{j'm'}}{dr} + \frac{V\_{jm}}{r}\, \delta F^{\star}\_{j'm'} \right) \right] dV \end{aligned} \tag{6}$$

and

$$\begin{aligned} \delta E\_{\mathrm{grav},jm}^{II} = \sum\_{j'm'} \int\_{V} \Delta g\left(r, \theta, \varphi\right) \rho\_0(r) \left[ \left( -\frac{4U\_{jm}}{r} + \frac{J V\_{jm}}{r} \right) \delta U\_{j'm'}^{\star} + \left( \frac{U\_{jm}}{r} \right) \delta V\_{j'm'}^{\star} \right] dV \end{aligned} \tag{7}$$

(cf. Eq. (65) of Martinec (2000)). Here, (*U*(*r*), *V*(*r*), *F*(*r*))<sub>jm</sub> denote the spherical harmonic coefficients for the vertical and horizontal displacements and the incremental gravity potential at degree *j*, order *m*. The asterisks represent complex conjugates, *J* = *j*(*j* + 1) is a factor originating from div *u*, and Δ*g* represents the magnitude of Δ**g** in the radial direction. The products of the vector spherical harmonics in the integrands (e.g., **S**<sup>(−1)</sup><sub>jm</sub> · **S**<sup>(−1)</sup><sub>j′m′</sub>; see Eq. (B1) of Martinec (2000)) are omitted for simplicity. Note that, in the 3-D case, summations over *j*′ and *m*′ appear, indicating a modal coupling with other degrees and orders.

The integrands in Eqs. (6) and (7) have a common term

$$\begin{aligned} & \left( -\frac{4U\_{jm}}{r} + \frac{J\,V\_{jm}}{r} \right) \delta U\_{j\prime m\prime}^{\ast} + \left( \frac{U\_{jm}}{r} \right) \delta V\_{j\prime m\prime}^{\ast} \qquad (8) \\ & \equiv K\_U(r) \delta U\_{j\prime m\prime}^{\ast} + K\_V(r) \delta V\_{j\prime m\prime}^{\ast} . \end{aligned}$$

By this notation, the corresponding energy variations for *j*′, *m*′ can be written as

$$
\delta E\_{\text{grav},jm}^{I'} \equiv \int\_{\Delta V} \Delta \rho \left( r, \theta, \varphi \right) \text{g}\_0 \left[ K\_U \delta U\_{j'm'}^\star + K\_V \delta V\_{j'm'}^\star \right] dV \tag{9}
$$

and

$$
\delta E\_{\mathrm{grav},jm}^{II\prime} \equiv \int\_{V} \rho\_0(r)\, \Delta g\left(r, \theta, \varphi\right) \left[K\_U \delta U\_{j'm'}^{\star} + K\_V \delta V\_{j'm'}^{\star}\right] dV. \tag{10}
$$

It is expected that |δE<sup>I′</sup><sub>grav,jm</sub>| ≫ |δE<sup>II′</sup><sub>grav,jm</sub>| for two reasons. First, Δρ/ρ<sub>0</sub> is ~10<sup>−2</sup> and Δ*g*/*g*<sub>0</sub> is ~10<sup>−5</sup> (10 mGal/980 Gal) in the real Earth, which means Δρ *g*<sub>0</sub> ≫ ρ<sub>0</sub> Δ*g*. Second, Δ*g* and *K*<sub>U,V</sub> are smaller outside ΔV in Eq. (10) (note that the source is located within ΔV and the deformation (i.e., *U*<sub>jm</sub> and *V*<sub>jm</sub> included in *K*<sub>U,V</sub>) decays with the square of the epicentral distance (Okada 1992)). Assuming that ΔV has the shape of a spherical cap, we roughly estimate the ratio between the magnitudes of Eqs. (9) and (10). We assume that *K*<sub>U</sub> ≈ *K*<sub>V</sub> ≈ *D*/*r*, where *D* = 1 within ΔV and *D* = *d*<sub>s</sub>/(*a* − *r*)<sup>2</sup> outside ΔV. Here, *d*<sub>s</sub> (= 40 km) and *a* (= 6,371 km) are the depth of the point source and the Earth's radius. The density of the 1-D case, ρ<sub>0</sub>, is homogeneous within the Earth, and Δρ is set to (1/100) ρ<sub>0</sub> within ΔV. The integration is performed by an elementary numerical difference method. The result is shown in Sect. 3.1.
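The first argument can be checked with a one-line order-of-magnitude computation; the numerical values below are illustrative assumptions (a hypothetical homogeneous density), not values taken from the paper's model runs.

```python
# Order-of-magnitude check of the two driving factors: with d_rho/rho0 ~ 1e-2
# and d_g/g0 ~ 1e-5 (10 mGal / 980 Gal), the factor d_rho * g0 in Eq. (9)
# exceeds rho0 * d_g in Eq. (10) by about three orders of magnitude.
rho0 = 3.4e3        # illustrative homogeneous density [kg m^-3] (assumption)
g0 = 9.8            # surface gravity [m s^-2]
d_rho = 1e-2 * rho0
d_g = 1e-5 * g0
ratio = (d_rho * g0) / (rho0 * d_g)   # ~1e3
```

The second argument (the spatial decay of Δg and K outside ΔV) further increases the dominance of δE<sup>I′</sup> over δE<sup>II′</sup> when the volume integrals are carried out.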

Now, we come back to the part which was excluded from the energy variation in Eq. (4). In the 3-D case, the last three terms in Eq. (A7) of Martinec (2000) are added to Eq. (4). The last term of Eq. (A7), associated with discontinuities within the Earth, vanishes in the present case because we do not consider lateral heterogeneities near the core-mantle boundary (CMB) and the normal vector of the fault is orthogonal to the displacement (shear slip is assumed). The third term in Eq. (A7), associated with the surface integral, vanishes when $\Delta V$ and the deformation field caused by the source are line symmetric, as employed in Sect. 2.4. The second term in Eq. (A7) is given as

$$\frac{1}{2} \int_{V} \left[ \left( \operatorname{grad} \rho_0 \cdot \delta\mathbf{u} \right) \left( \mathbf{u} \cdot \operatorname{grad} \phi_0 \right) - \left( \operatorname{grad} \rho_0 \cdot \mathbf{u} \right) \left( \delta\mathbf{u} \cdot \operatorname{grad} \phi_0 \right) \right] dV. \tag{11}$$

Substituting $\rho_0 + \Delta\rho(r, \theta, \varphi)$ and $g_0 + \Delta g(r, \theta, \varphi)$ into Eq. (11), and based on the same argument as for Eq. (10), we can approximate Eq. (11) as

$$\frac{g_0}{2} \int_{\Delta V} \left[ \left( \operatorname{grad} \Delta\rho \cdot \delta\mathbf{u} \right) U - \left( \operatorname{grad} \Delta\rho \cdot \mathbf{u} \right) \delta U \right] dV. \tag{12}$$

We consider the case where $\Delta\rho$ is constant within $\Delta V$. Then, $\operatorname{grad} \Delta\rho$ takes non-zero values only on the boundary of $\Delta V$. Furthermore, we note that the terms including the vertical gradient of the density in Eq. (12) cancel out on a horizontal surface. Therefore, we consider only the vertical surfaces that form part of the boundary of $\Delta V$. We derive a weak formulation and evaluate the magnitude of Eq. (12). The result is shown in Sect. 3.1.

# **2.3 Iteration**

The effect of the lateral heterogeneity is finally determined by solving the following equation iteratively, as described in Tanaka et al. (2019).

$$\delta E_{\mathrm{1D}} \left( \mathbf{u}^{i}, \delta\mathbf{u}, \phi_1^{i}, \delta\phi_1 \right) = \delta F \left( \delta\mathbf{u}, \delta\phi_1 \right) - \delta E_{\mathrm{grav}}^{\Delta} \left( \mathbf{u}^{i-1}, \delta\mathbf{u}, \phi_1^{i-1}, \delta\phi_1 \right), \tag{13}$$

where $\delta E_{\mathrm{1D}}$ denotes the energy variation excluding the effects of lateral heterogeneities. At the first step ($i = 1$), $\delta E_{\mathrm{grav}}^{\Delta}$ is set to zero. For $i \geq 2$, $\delta E_{\mathrm{grav}}^{\Delta}$ is computed with the solution obtained at the previous step. This iteration is repeated until $(\mathbf{u}, \phi_1)^i \cong (\mathbf{u}, \phi_1)^{i-1}$. The convergence behavior is shown in Sect. 3.1.
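The iteration of Eq. (13) can be sketched as a fixed-point loop. Here `solve_1d` and `de_grav` are hypothetical placeholders for the 1-D spectral finite-element solve and the lateral-heterogeneity energy term; only the control flow mirrors the scheme above.

```python
import numpy as np

def solve_1d(rhs):
    # placeholder for the 1-D solve δE_1D(...) = rhs (A stands in for the operator)
    A = np.array([[2.0, -1.0], [-1.0, 2.0]])
    return np.linalg.solve(A, rhs)

def de_grav(u):
    # placeholder for δE^Δ_grav evaluated at the previous solution (small term)
    return 0.05 * u

f = np.array([1.0, 0.5])            # source term δF
u = solve_1d(f)                     # step i = 1: δE^Δ_grav set to zero
for i in range(2, 50):              # i >= 2: reuse the previous solution
    u_new = solve_1d(f - de_grav(u))
    converged = np.max(np.abs(u_new - u)) < 1e-12
    u = u_new
    if converged:                   # stop once (u)^i ~= (u)^(i-1)
        break
print(i, u)
```

Because the perturbation term is small relative to the 1-D part, the loop contracts quickly; a few steps suffice, as reported in Sect. 3.1.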

# **2.4 Model Setting**

We use a synthetic rectangular fault model to simulate coseismic deformation due to a megathrust earthquake. The length and width of the fault are 550 km and 100 km, respectively, and a slip of 10 m is uniform on the fault ($M_W$ = 8.7). The strike, dip and rake angles are ($0^\circ$, $25^\circ$, $90^\circ$) and the fault dips to the west (Fig. 1). The fault is distributed within $-2.5^\circ \leq \beta \leq 2.5^\circ$ and $104.1^\circ \leq \varphi \leq 105^\circ$, where $\beta$ and $\varphi$ denote latitude and longitude, respectively, and the fault is located at depths ranging from 15 km to 57 km.

PREM (Dziewonski and Anderson 1981) is taken as the reference Earth model. In Model A, we assume that the density is larger by 5% than in the reference model within a region of $-20^\circ \leq \beta \leq 20^\circ$, $80^\circ \leq \varphi \leq 110^\circ$ and depths from 0 to 670 km, including the above fault (Fig. 1). In Model B, the heterogeneity of Model A is given in a region excluding the fault ($80^\circ \leq \varphi \leq 104^\circ$). The elastic parameters in Models A and B are the same as in PREM. The results shown below are proportional to the magnitude of the heterogeneity. If the

**Fig. 1** The fault and Earth structure models used in the computation. (**a**) A cross section of the Earth model at latitude $\beta = 0$. The rectangular reverse fault (green line) consists of 92 point sources having the same dip-slip mechanism. The density in the upper mantle is increased by 5% with respect to the PREM for the longitudinal ranges ($\varphi$) shown by the black (Model A) and red (Model B) arrows. (**b**) A top view. The green box shows a vertical projection of the fault. The horizontal ranges where the density is increased are shown by the black (Model A) and red (Model B) boxes.

lateral heterogeneity is 1% instead of 5%, then the effect on the deformation becomes 1/5 of the magnitude of the case shown here.

Assuming a future satellite gravity mission, we set the cutoff spherical harmonic degree to 100 and applied no spatial filter such as a Gaussian filter to the computational results. The radial intervals of the finite elements, $\Delta r$, depend on depth and are set as follows: $\Delta r$ = 1 km for depths 0–100 km, 5 km for 100–150 km, 10 km for 150–300 km, 15 km from 300 km to the CMB and 20 km below the CMB. The horizontal grid needed for the numerical computations of the 3-D part is set according to the method described in Martinec (2000). The numbers of grid points are 152 and 512 in latitude and longitude, respectively.
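For illustration, the depth-dependent radial node layout described above can be generated as follows (a sketch; the CMB depth of 2,891 km and the closing node at the Earth's centre are assumptions of this snippet):

```python
# depth bands (top, bottom, Δr) in km; CMB depth of 2891 km assumed (PREM)
a, cmb = 6371.0, 2891.0
bands = [(0, 100, 1), (100, 150, 5), (150, 300, 10), (300, cmb, 15), (cmb, a, 20)]

depths = []
for top, bottom, step in bands:
    d = top
    while d < bottom:          # nodes at top, top+step, ... up to the band bottom
        depths.append(d)
        d += step
depths.append(a)               # closing node at the Earth's centre (depth = a)
radii = [a - d for d in depths]  # finite-element node radii [km]
print(len(radii))
```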

The effects of lateral heterogeneities of Models A and B are evaluated as the differences with respect to the reference model. The results are shown in Sects. 3.2–3.3.

# **3 Results and Discussions**

# **3.1 Check of the Approximations Used**

Table 1 shows the ratio $|\delta E_{\mathrm{grav},jm}^{II'}| / |\delta E_{\mathrm{grav},jm}^{I'}|$ for density distributions with different radii and thicknesses. The ratios are less than 0.5% for all the cases. The reason why the result for depths 0–670 km is equal to that for depths 0–100 km is that the deformation is concentrated in the proximity of the source, which is located at a depth of 40 km, so the integrand in $\delta E_{\mathrm{grav},jm}^{II'}$ below the depth of 100 km is much smaller than within $\Delta V$. These results allow us to neglect $\delta E_{\mathrm{grav},jm}^{II}$ (Eq. 7) for practical applications: the effects of lateral heterogeneities in the density are at most a few percent of the peak coseismic deformation, as shown later, and hence neglecting $\delta E_{\mathrm{grav},jm}^{II}$ causes an error of the order of only 0.01%, which is below the detection level of geodetic observations.

Next, we compare the surface deformation for Model B obtained by including and excluding the energy variation represented by Eq. (12). The results show that, when the energy variation of Eq. (12) is included, the vertical displacement and the gravity change decrease by at most 0.4 mm and 0.04 μGal, where the density distribution changes laterally near the west side of the fault ($\varphi \approx 104^\circ$). The magnitudes of these decreases are less than 0.1% when compared with the deformation at the corresponding location in the 1-D case (peak p2 in Figs. 2a and 3a). However, for the vertical displacement, a difference of 0.4 mm is not negligible because the effect of the lateral heterogeneity is of the same order of magnitude. Figure 2b, c show that the differences between the cases including and excluding the



energy variation are visible. For the gravity change, the effect of lateral heterogeneity is of the order of μGal. Therefore, a difference of 0.04 μGal amounts to only 1% of the effect of lateral heterogeneity. In the subsequent sections, we discuss results obtained by including the energy variation of Eq. (12).

Table 2 shows the result of the iterations for Model B. We see that the difference is largest between $i = 1$ and $i = 2$, amounting to 0.03–0.3%. After $i = 2$, the differences are smaller than 0.02%, indicating that the spherical harmonic coefficients of the vertical displacement converged at the $10^{-4}$ level. A similar tendency is seen for Model A.

These results are summarized as follows. As far as a relatively large-scale heterogeneity like that of Models A and B is concerned, the energy variation arising from $\Delta g$ (Eq. 7) is negligible when estimating the effect of the lateral heterogeneity on the coseismic deformation, whereas the energy variation represented by Eq. (12) is not. A few steps of iteration are sufficient.

# **3.2 Vertical Displacement**

Figure 2a shows the vertical displacement along the latitude line passing through the center of the fault ($\beta = 0^\circ$ and $80^\circ \leq \varphi \leq 130^\circ$) computed for the reference model. We see an uplift of 1 m at $\varphi \approx 105^\circ$ above the shallower edge of the fault (p1) and a subsidence of 0.5 m at $\varphi \approx 102^\circ$ above the deeper edge of the fault (p2), which is a well-known pattern observed for thrust-type fault motion. The blue curve in Fig. 2c shows the difference of Model A relative to the reference model. We see that the pattern is roughly opposite to that in Fig. 2a (compare p1 with p3 and p2 with p4), but the effect is very small. The largest negative peak (p3) has an amplitude of 1 mm, indicating that a 5% increase in density reduces the maximum coseismic uplift seen at p1 by 0.1%.

**Fig. 2** Coseismic vertical displacements at latitude $\beta = 0$ for different Earth models. The horizontal axes denote longitude (Fig. 1). The green line denotes the fault, and the cut-off spherical harmonic degree is 100. (**a**) The vertical displacement, *U*, for the reference 1-D model (PREM). (**b**) The difference in the vertical displacements computed for

the reference model and Model A (blue)/B (red). The energy variation of Eq. (12) is excluded. The blue and red boxes denote the ranges where the density in the upper mantle is increased. (**c**) The same as in (**b**) but Eq. (12) is included. (**d**) The same as in (**c**) but the shear modulus in the upper mantle is increased instead of the density (Model C)

**Fig. 3** The same as in Fig. 2, but the coseismic gravity change is shown. (**a**) The gravity change, $g'$, for the reference 1-D model. (**b**) The difference in the gravity changes computed for the reference model and Model A (blue)/B (red). Note that the patterns are opposite to those of the vertical displacements in Fig. 2 (**b**) and that the relative magnitude amounts to a few percent (compare peaks p1 and p3 or p2 and p6)

The reduction of the vertical displacement is consistent with the fact that the inclusion of self-gravitation suppresses longer-wavelength deformations for a flat-Earth model (Barbot and Fialko 2010). This can be understood if we consider that an increase in density enhances the gravitational effect (i.e., $\rho_0 g_0$ is replaced by $(\rho_0 + \Delta\rho) g_0$ with $\Delta\rho > 0$).

For comparison, Fig. 2d shows the result when the shear modulus is increased by 10% in the same region as the one where the heterogeneity is considered in Model A. We see that the pattern of the difference is the same as that of the coseismic change and that the amplitude increases by 10%. This indicates that the difference in the vertical displacement is proportional to the difference in the shear modulus. This

**Table 2** Convergence of the solution for Model B. The first column (*i*) shows the number of iterations (Sect. 2.3). $U_{jm}$ denotes the real part of the spherical harmonic coefficient of the vertical displacement at the surface ($r = a$) for degree *j* and order *m*


is because the energy of the seismic source is proportional to the shear modulus. In contrast, the effect of the density is only 1/20 of the given heterogeneity in magnitude (namely, the 5% increase in the density caused only a 0.1% change in the vertical displacement).

Next, we compare Models A and B. The red curve in Fig. 2c shows the result for Model B. We see that, when the heterogeneity is excluded from the source region, the largest negative peak at $\varphi \approx 105^\circ$ for Model A (p3) is reduced (p5) and that Model B shows a pattern close to Model A for $\varphi < 104^\circ$. This indicates that the reduction of the vertical displacement occurs mainly in the region where the density is increased.

# **3.3 Gravity Change**

We have seen that the effect of the laterally heterogeneous density distribution on the vertical displacement is as small as 0.1% of the coseismic change. However, the effect on the gravity change is an order of magnitude larger, as will be shown below.

Figure 3a shows the coseismic gravity change computed for the reference model. The pattern resembles the vertical displacement in Fig. 2a. The blue curve in Fig. 3b shows the difference between Model A and the reference model. We see that the pattern is similar to the coseismic change in Fig. 3a and that the increase in amplitude amounts to 2% of the coseismic gravity change (compare p1 with p3).

The reason why the effect on the gravity change is larger than on the vertical displacement can be explained by a Bouguer (slab) approximation:

$$\left(g' + \Delta g'\right)\big|_{r=a} \sim 2\pi G \left(\rho_0 + \Delta\rho\right)\big|_{r\sim a} \left(U + \Delta U\right) \cong 2\pi G \left(\rho_0 U + \rho_0\, \Delta U + U \Delta\rho\right)\big|_{r\sim a}. \tag{14}$$

In this equation, $a$ denotes the Earth's radius, and $g'$ and $U$ are the surface gravity change and the vertical displacement caused by the deformation for the reference model. $\Delta$ denotes the difference from the reference model due to the inclusion of the lateral heterogeneity. The first term on the rightmost side of Eq. (14) represents the gravity change due to the deformation in the 1-D case. In the second term, $\Delta U$ is opposite in sign to $U$ and significantly smaller than $U$ (Sect. 3.2). So the last term dominates the effect of the lateral heterogeneity, indicating that both the deformation of the 1-D model and the local density distribution contribute, and that the ratio of the third term to the first term is $\Delta\rho/\rho_0$.
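The relative sizes of the three terms in Eq. (14) can be checked with illustrative numbers; the density and displacement values below are assumptions chosen to match the orders of magnitude quoted in the text.

```python
import math

G = 6.674e-11                       # gravitational constant [m^3 kg^-1 s^-2]
rho0 = 3300.0                       # near-surface density [kg/m^3] (assumed)
drho = 0.05 * rho0                  # 5% lateral increase, as in Model A
U, dU = 1.0, -1e-3                  # 1 m uplift; ΔU opposite and ~0.1% of U (Sect. 3.2)

term_1d   = 2 * math.pi * G * rho0 * U     # ρ0 U : the 1-D gravity change
term_du   = 2 * math.pi * G * rho0 * dU    # ρ0 ΔU: small, opposite-sign correction
term_drho = 2 * math.pi * G * U * drho     # U Δρ : dominant heterogeneity term
print(term_drho / term_1d, term_du / term_1d)   # ratios Δρ/ρ0 = 0.05 and ΔU/U = -0.001
```

The third term is thus larger than the second by more than an order of magnitude, which is why the gravity change responds more strongly to the heterogeneity than the vertical displacement does.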

The red curve in Fig. 3b shows the result for Model B. We see that the difference between Models A and B is small for $\varphi \leq 104^\circ$, where the heterogeneity in Model B is present, and that the amplitudes of Model B become smaller for $\varphi > 104^\circ$. This result suggests that the effect of the lateral heterogeneity on the gravity change is larger where the heterogeneity is present, as seen for the vertical displacement.

Fu and Sun (2008) estimated coseismic gravity changes for a 3-D heterogeneous spherical Earth model. The magnitude of the lateral heterogeneity in the density used in their computation was ±0.5% with respect to the PREM. In the shallow upper mantle, the main sources of heterogeneity are subducting slabs. Their result shows that the effect of lateral heterogeneity on the gravity change caused by a point dislocation placed at 100 km depth or below was 0.01–0.03%. In our result, the effect on the gravity change is of the same order of magnitude as the heterogeneity itself; that is, if the heterogeneity were 0.5%, the effect on the gravity change would be 0.5% in our model. A few reasons may explain why our result is an order of magnitude larger than theirs. In our study, (1) the horizontal scale of the given heterogeneity is much larger than the thickness of a slab, (2) the source depth is shallower than 100 km, and (3) the cut-off degree is lower, so that longer-wavelength deformations are dominant, which are more strongly affected by the gravity field (generated by the initial static density distribution). To examine the effect of a fine 3-D density structure for a shallow seismic source, higher-degree terms must be computed, which will be done in a future study.

# **4 Conclusions**

We developed a spectral finite-element approach for estimating the effects of laterally heterogeneous density distributions on coseismic deformations. Considering that deformations due to a great earthquake will be observed by a future satellite gravity mission, we computed the coseismic vertical displacement and gravity change up to $j_{max}$ = 100 for Earth models with a large-scale lateral heterogeneity present near the seismic fault. The results show that a 5% increase in the density within the upper mantle over a horizontal scale of 3,000 km could suppress the vertical displacement by an order of 0.1% and amplify the gravity change by an order of 1% with respect to the case of the reference 1-D model. The differences from the 1-D model were larger where the heterogeneity was present, and the increase in the gravity change is larger than that in the vertical displacement because the local density structure maps directly into the gravity change.

In this study, we imposed a few limitations on the heterogeneity: a large horizontal scale, presence in the vicinity of the source, and simple (symmetric) geometry. Under these conditions, we showed that the estimation of the energy variation of Eq. (A7) of Martinec (2000) can be simplified. For more complex density distributions caused by subducting slabs, plumes, surface topography and bathymetry, it might be more effective to directly compute the energy variation of Eq. (A2), which is an alternative representation of Eq. (A7). Furthermore, for surface loading, gravity increments due to lateral heterogeneities in the density enter into the boundary conditions. In this case, it should be examined whether neglecting the second term of the gravitational energy (Sect. 2.2) is valid. Extending the applicability of the spectral FEM to more general cases is a future challenge.

**Acknowledgements** We are grateful to the two anonymous reviewers who gave us valuable comments to improve the manuscript. We used the computer systems of the Earthquake and Volcano Information Center of the Earthquake Research Institute, The University of Tokyo. YT was supported by JST Grant Number JPMJMI18A1 and JSPS KAKENHI Grant Numbers JP21H01187 and JP21H05204. This study has been partially supported by the European Space Agency under contract no. 4000135530.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **On the Estimation of Time Varying AR Processes**

Johannes Korte, Till Schubert, Jan Martin Brockmann, and Wolf-Dieter Schuh

#### **Abstract**

In time series analysis, autoregressive (AR) modelling of zero-mean data is widely used for system identification, signal decorrelation, detection of outliers and forecasting. An AR process of order p is uniquely defined by p coefficients and the variance of the noise. The roots of the characteristic polynomial can be used as an alternative parametrization of the coefficients, which is used to construct a continuous covariance function of the AR process or to verify that the AR process is stationary. In this contribution we propose an approach to estimate an AR process with time varying coefficients (TVAR process). In the literature, the roots are evaluated at discrete times, rather than as continuous functions of time as required for time varying systems. By introducing the assumption that the movements of the roots are linear functions in time, stationarity for all possible epochs in the time domain is easy to accomplish. We illustrate how this assumption leads to TVAR coefficients where the k-th coefficient is a polynomial of order k with further restrictions on the parameters of the coefficients. First, we study how to estimate the TVAR process parameters using a least squares approach in general. As any AR process can be rewritten as a combination of AR processes of order two with two complex conjugated roots and AR processes of order one, we limit our investigations to these orders. Higher order TVAR processes are computed by successively estimating TVAR processes of orders one or two. Based on a simulation, we demonstrate the advantages of a time varying model and compare it to the stationary time stable model. In addition, we give a method to identify time series for which the model of TVAR processes with linear roots is suitable.

#### **Keywords**

AR process - Motions of the roots - Non-stationarity - Time varying AR coefficients

# **1 Introduction**

In time series analysis, autoregressive (AR) processes are often used, for example as decorrelation filters (see Schubert et al. (2019), Schuh et al. (2014), or Schuh and Brockmann (2019)) or to estimate discrete covariances (see Schuh (2016, p. 32, eq. (182))). The transition to time variable AR processes (TVAR processes) for non-stationary time series has proven to be a suitable extension (see Charbonnier et al. (1987), or Kargoll et al. (2018)). In this paper we concentrate on TVAR(p) processes of order p = 1 or p = 2 and their estimation. The most common way to estimate TVAR processes is to choose the functional form of the coefficients without going further into the characteristics of the processes. This has been done with trigonometric functions, modified Legendre polynomials, or even spheroidal sequences (see Grenier (1983), Hall et al. (1977) and Slepian (1978)). Kamen (1988) also includes the motion of roots for time

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_188

J. Korte (-) · T. Schubert · J. M. Brockmann · W.-D. Schuh

Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany

e-mail: korte@geod.uni-bonn.de; schubert@geod.uni-bonn.de; brockmann@geod.uni-bonn.de; schuh@uni-bonn.de

varying coefficients of a TVAR(2) process in the case that the first parameter is constant and the second is a linear function. Here we go the other way around and determine the TVAR(p) coefficients assuming a linear model for the motion of the roots. This is done in seven sections. In Sect. 2 we derive the TVAR process and the associated auxiliary equations, which represent the connection between the coefficients and the roots. Section 3 proves that in the case of linear motions of the roots, the k-th coefficient of the TVAR process is a polynomial of order k. The polynomial representation results in a change of parameters from the actual coefficients of the AR process to the coefficients of the individual polynomials. A new estimation equation is derived in Sect. 4. In order to guarantee that the motion of the roots is linear, the parameters must meet further conditions, which are derived in Sect. 5. The robustness of this estimation is tested in a simulation in Sect. 6. A short summary of the paper as well as the results are presented in Sect. 7.

# **2 Relating Time Varying AR Process Coefficients to the Time Varying Roots**

The definition of the time stable AR (TSAR) process can be found in a variety of books.<sup>1</sup> The process $\mathcal{S}_t$ is called a time stable AR process of order p (TSAR(p) process) if it is described by the recursive equation

$$\mathcal{S}_{t} = \alpha_{1}\mathcal{S}_{t-1} + \alpha_{2}\mathcal{S}_{t-2} + \dots + \alpha_{p}\mathcal{S}_{t-p} + \mathcal{E}_{t}, \tag{1}$$

where $\alpha_1, \alpha_2, \dots, \alpha_p$ are the coefficients of the AR process and $\mathcal{E}_t$ is an i.i.d. sequence with variance $\sigma_{\mathcal{E}}^2$ (see Hamilton (1994, p. 58, eq. (3.4.31))).

For an AR(p) process the auxiliary equation is defined by:

$$b(\mathbf{x}) = \mathbf{x}^p - \alpha\_1 \mathbf{x}^{p-1} - \alpha\_2 \mathbf{x}^{p-2} - \dots - \alpha\_p \tag{2}$$

$$=(\mathbf{x} - r\_1)(\mathbf{x} - r\_2)...(\mathbf{x} - r\_p).\tag{3}$$

In (3) the $r_k$ (with $k = 1, 2, \dots, p$) are the roots ($b(x) \stackrel{!}{=} 0$) of the auxiliary equation. These are either real valued or appear as complex conjugated pairs. In Hamilton (1994, p. 34) it is shown that the AR process is stationary if and only if these roots lie inside the unit circle ($\|r_k\| < 1$). For a time varying AR (TVAR) process of order p the definition is given by Kamen (1988) as:

$$\mathcal{S}_{t} = \alpha_{1}(t)\mathcal{S}_{t-1} + \alpha_{2}(t)\mathcal{S}_{t-2} + \dots + \alpha_{p}(t)\mathcal{S}_{t-p} + \mathcal{E}_{t}, \tag{4}$$

where $\alpha_1(t), \alpha_2(t), \dots, \alpha_p(t)$ are the time varying coefficients of the TVAR(p) process, which change their values with time $t$. $\mathcal{E}_t$ remains an i.i.d. sequence with variance $\sigma_{\mathcal{E}}^2$. The TVAR process should be stationary at any fixed but arbitrary time $\tau$ in a given interval $I$. This is equivalent to $\|r_k(\tau)\| \stackrel{!}{<} 1 \;\forall\, \tau \in I$. So for $t = \tau$ the auxiliary equation can be computed by

$$b\_{\tau}(\mathbf{x}) = \mathbf{x}^{p} - \alpha\_{1}(\tau)\mathbf{x}^{p-1} - \alpha\_{2}(\tau)\mathbf{x}^{p-2} - \dots - \alpha\_{p}(\tau) \quad (5)$$

$$=(\mathbf{x} - r_1(\tau))(\mathbf{x} - r_2(\tau))\dots(\mathbf{x} - r_p(\tau)).\tag{6}$$

If (6) is expanded into a polynomial again, and a coefficient comparison with (5) is made, the coefficients $\alpha_k(\tau)$ can be calculated directly:

$$\alpha_k(\tau) = (-1)^{k+1} \sum_{m_1=1}^{p-k+1}\; \sum_{m_2=m_1+1}^{p-k+2}\; \sum_{m_3=m_2+1}^{p-k+3} \dots \sum_{m_k=m_{k-1}+1}^{p} r_{m_1}(\tau)\, r_{m_2}(\tau)\, r_{m_3}(\tau) \dots r_{m_k}(\tau) \tag{7}$$

This means that $\alpha_k(\tau)$ can be written as the sum of all possible products of $k$ different roots, multiplied by $(-1)^{k+1}$.
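Relation (7) and the stationarity criterion can be verified numerically: `np.poly` expands $(x - r_1)\cdots(x - r_p)$ into the monic coefficients $[1, -\alpha_1, \dots, -\alpha_p]$ of (2)-(3), so the $\alpha_k$ are the signed elementary symmetric polynomials of the roots. The example roots below are arbitrary (one real root and a complex conjugated pair).

```python
import numpy as np

r = np.array([0.2, 0.3 + 0.6j, 0.3 - 0.6j])   # example roots at a fixed τ
alpha = -np.real(np.poly(r))[1:]              # [α1, α2, α3] from the monic polynomial
print(alpha)                                   # [0.8, -0.57, 0.09]

# stationarity (Hamilton 1994): all roots strictly inside the unit circle
print(np.all(np.abs(r) < 1))                   # True
```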

# **3 Derivation of the Time Varying AR(***p***) Process Coefficients from Linear Root Motions**

To keep it simple, we assume a linear polynomial for the motion of the roots

$$r\_k(t) = a\_k + b\_k t.\tag{8}$$

Analogous to the time stable approach, the $r_k(t)$ occur again as real roots or as pairs of complex conjugated roots, and therefore the same applies to $a_k$ and $b_k$. It follows from the linear root motions in (8) that the coefficients $\alpha_k(t)$ from (7) are polynomials of order $k$:

$$\alpha\_k(t) = \sum\_{j=0}^k \beta\_j^{(k)} t^j. \tag{9}$$

In this context $\beta_j^{(k)}$, with $k \in [1, 2, \dots, p]$ and $j \in [0, 1, \dots, k]$, is the $(j+1)$-th parameter of the function $\alpha_k(t)$. It should be mentioned that in this way the number of unknown parameters increases from $p$ to $\frac{p^2+3p}{2}$.

Unfortunately, the representation of coefficients by polynomials in (9) is not sufficient to guarantee linear root

<sup>1</sup>The following definition can be found in Box et al. (2008), Brockwell and Davis (1991), Buttkus (2000), Hamilton (1994), Priestley (1981). Here the notation of Hamilton (1994) is used.

movements. Therefore, it is shown in Sect. 4 how the TVAR coefficients $\beta_j^{(k)}$ are estimated in general, and in Sect. 5 we derive the restrictions for linear root motions for the TVAR(1) and TVAR(2) processes.

# **4 Parameter Estimation for TVAR Processes**

In this section we show how the parameters $\beta_j^{(k)}$ are estimated. First, the parameter vector of dimension $\frac{p^2+3p}{2} \times 1$ is set up in ascending order of $j$. This means that the $\beta_j^{(k)}$ belonging to one $\alpha_k(t)$ do not follow each other; instead, the $\beta_j^{(k)}$ are sorted according to the power $j$ of the monomials:

$$\boldsymbol{\beta} := \Big[\underbrace{\beta_0^{(1)}\ \beta_0^{(2)} \dots \beta_0^{(p)}}_{p}\ \underbrace{\beta_1^{(1)}\ \beta_1^{(2)} \dots \beta_1^{(p)}}_{p}\ \underbrace{\beta_2^{(2)} \dots \beta_2^{(p)}}_{p-1} \dots \underbrace{\beta_{p-1}^{(p-1)}\ \beta_{p-1}^{(p)}}_{2}\ \underbrace{\beta_p^{(p)}}_{1}\Big]^T. \tag{10}$$

With this reorganisation and using (9), the transformation between the $\alpha_k(t)$ and the $\beta_j^{(k)}$ is given by

$$\begin{aligned} \begin{bmatrix} \alpha_1(t) \\ \alpha_2(t) \\ \vdots \\ \alpha_p(t) \end{bmatrix} &= \mathbf{M}(t)\, \boldsymbol{\beta}, \qquad (11) \\ \mathbf{M}(t) &= \left[\; \mathbf{I}_p \;\middle|\; t\,\mathbf{I}_p \;\middle|\; t^2 \begin{bmatrix} \mathbf{0}^T \\ \mathbf{I}_{p-1} \end{bmatrix} \;\middle|\; \dots \;\middle|\; t^p \begin{bmatrix} \mathbf{0}^T \\ 1 \end{bmatrix} \right], \qquad (12) \end{aligned}$$

where the block belonging to the power $t^j$ contains one column for each $\beta_j^{(k)}$ with $k \geq \max(j,1)$, placed in row $k$.

We estimate the parameters directly from the observations by solving the least squares problem $\mathcal{S}_t + v_t = \mathbf{T}\,\boldsymbol{\alpha}(t)$. By equating the noise and the negative residuals ($\mathcal{E}_t = -v_t$), we obtain the linear relationship between the TVAR coefficients $\boldsymbol{\alpha}(t)$ and the observations $\mathcal{S}_t$ from (4). The design matrix $\mathbf{T}$ is then given by

$$\mathbf{T} = \begin{bmatrix} \mathcal{S}_{p-1} & \mathcal{S}_{p-2} & \mathcal{S}_{p-3} & \dots & \mathcal{S}_0 \\ \mathcal{S}_p & \mathcal{S}_{p-1} & \mathcal{S}_{p-2} & \dots & \mathcal{S}_1 \\ \vdots & \vdots & \vdots & & \vdots \\ \mathcal{S}_{n-1} & \mathcal{S}_{n-2} & \mathcal{S}_{n-3} & \dots & \mathcal{S}_{n-p} \end{bmatrix} \quad \text{with } n \geq \frac{p^2+3p}{2} > p. \tag{13}$$

If we exchange the parameters of the LS problem from the $\alpha_k(t)$ to the $\beta_j^{(k)}$, as seen in (11), the new estimation problem is given by

$$\begin{bmatrix} \mathcal{S}_p \\ \mathcal{S}_{p+1} \\ \mathcal{S}_{p+2} \\ \vdots \\ \mathcal{S}_n \end{bmatrix} \stackrel{!}{=} \mathbf{T}\,\mathbf{M}\,\boldsymbol{\beta} = \left[\; \mathbf{T} \;\middle|\; \mathbf{T} \odot \mathbf{t} \;\middle|\; \left(\mathbf{T} \odot \mathbf{t}^{.2}\right)(1{:}n, 2{:}p) \;\middle|\; \dots \;\middle|\; \left(\mathbf{T} \odot \mathbf{t}^{.p}\right)(1{:}n, p) \;\right] \boldsymbol{\beta}. \tag{14}$$

Here $\mathbf{t}$ is a vector containing the observation times. The $l$-th row of $\mathbf{T} \odot \mathbf{t}$ results from the $l$-th row of $\mathbf{T}$ multiplied by the $l$-th element of $\mathbf{t}$, and $\mathbf{t}^{.h}$ is the element-by-element exponentiation of $\mathbf{t}$ to the power $h$.
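A minimal sketch of the estimation problem (14) follows, with simulated data standing in for real observations; the white-noise signal and the random seed are assumptions of the sketch, and the column ordering follows (10).

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 2
t = np.linspace(0, 1, n)
s = rng.standard_normal(n)        # stand-in signal (white noise, for illustration)

rows = np.arange(p, n)
T = np.column_stack([s[rows - k] for k in range(1, p + 1)])   # lag matrix, Eq. (13)
tt = t[rows]
# expanded design T M: one column per β_j^(k), ordered as in (10) (j outer, k inner)
A = np.column_stack([T[:, k - 1] * tt**j
                     for j in range(0, p + 1)
                     for k in range(max(j, 1), p + 1)])
beta, *_ = np.linalg.lstsq(A, s[rows], rcond=None)
print(A.shape, beta.shape)        # (n - p) rows and p(p+3)/2 = 5 parameters for p = 2
```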

# **5 Additional Conditions for Linear Root Motions**

In this section we show which conditions must be imposed so that the TVAR(1) and TVAR(2) estimation of Sect. 4 results in linear root motions. For higher order TVAR processes, successive TVAR(1) and TVAR(2) processes are estimated in all possible combinations. The best combination is then found via the AIC for AR processes (see Buttkus (2000, p. 261, eq. (11.85))).

# **5.1 The TVAR(1) Process with Linear Motion of the Roots**

Using (5) and (6) for the TVAR(1) process shows that $r_1(\tau) = \alpha_1(\tau)$. Since it follows from (9) that $\alpha_1(t)$ is a linear function, the same is true for $r_1(t)$. So every TVAR(1) process estimated by (14) has a linear root motion.

# **5.2 The TVAR(2) Process with Linear Motions of the Roots**

The analytical conversion from the coefficients $\alpha_1(\tau)$ and $\alpha_2(\tau)$ to the roots $r_1(\tau)$ and $r_2(\tau)$ of a TVAR(2) process is given by the solution of the quadratic auxiliary equation (5), see Abramowitz and Stegun (1965, p. 17, eq. 3.8.1):

$$r_{1,2}(\tau) = \frac{\alpha_1(\tau)}{2} \pm \sqrt{\left(\frac{\alpha_1(\tau)}{2}\right)^2 + \alpha_2(\tau)}. \tag{15}$$
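The conversion (15) is straightforward to evaluate numerically; the following sketch (our naming) applies it elementwise to arrays of coefficient values and uses complex arithmetic so that conjugate root pairs are covered as well.

```python
import numpy as np

def tvar2_roots(alpha1, alpha2):
    """Roots of a TVAR(2) process from its coefficients via Eq. (15);
    an illustrative sketch. Accepts scalars or arrays of alpha_k(tau)."""
    a = np.asarray(alpha1, dtype=complex) / 2.0
    d = np.sqrt(a**2 + np.asarray(alpha2, dtype=complex))
    return a + d, a - d
```

For example, real roots 0.5 and 0.3 correspond to $\alpha_1 = 0.8$ and $\alpha_2 = -0.15$, and the function recovers them.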

Because of (9), $\alpha_1(t)$ is linear. Since we assume linear root motion, the expression under the square root must be the square of a linear function, so that the sum of both terms remains linear:

$$\left(\frac{\alpha_1(\tau)}{2}\right)^2 + \alpha_2(\tau) \stackrel{!}{=} \left(f + g\tau\right)^2. \tag{16}$$

Both sides are quadratic polynomials, which are equal if and only if all three polynomial coefficients agree. The comparison of coefficients leads to three conditions, two of which are used to determine $f$ and $g$. After $f$ and $g$ have been inserted into the third condition, and the $\alpha_k(\tau)$ have been replaced by the $\beta_j^{(k)}$ (see (9)), the non-linear condition is given by

$$\left(\beta_0^{(1)}\right)^2 \beta_2^{(2)} + \left(\beta_1^{(1)}\right)^2 \beta_0^{(2)} - \beta_0^{(1)} \beta_1^{(1)} \beta_1^{(2)} + 4\beta_0^{(2)} \beta_2^{(2)} - \left(\beta_1^{(2)}\right)^2 \stackrel{!}{=} 0. \tag{17}$$

Adding this condition to the estimation in (14) leads to a TVAR(2) process with linear root motions.
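Condition (17) can be checked numerically for a given set of $\beta$-parameters; the following minimal sketch (our naming) returns a value close to zero exactly when the TVAR(2) root motion is linear.

```python
def linear_root_condition(b1, b2):
    """Evaluate the non-linear constraint of Eq. (17) for a TVAR(2)
    process; b1 = (beta_0^(1), beta_1^(1)) are the coefficients of the
    linear alpha_1(tau), b2 = (beta_0^(2), beta_1^(2), beta_2^(2)) those
    of the quadratic alpha_2(tau). Illustrative sketch."""
    b10, b11 = b1
    b20, b21, b22 = b2
    return (b10**2 * b22 + b11**2 * b20 - b10 * b11 * b21
            + 4.0 * b20 * b22 - b21**2)
```

As a check, the linear roots $r_1 = 0.2 + 0.1\tau$ and $r_2 = 0.3 + 0.2\tau$ give $\alpha_1 = 0.5 + 0.3\tau$ and $\alpha_2 = -(0.06 + 0.07\tau + 0.02\tau^2)$, for which the condition evaluates to zero.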

# **6 Robustness of the Estimate Against Deviations**

In this section we simulate 100 TVAR(3) processes, each consisting of 1000 observations. In each simulation, the same linear roots and the same standard deviation of the noise ($\sigma_n = 10^{-3}$) are used. The true roots are chosen as:

$$\begin{aligned} r_1 &= 0.2 + 0.9 \cdot t \\ r_{2,3} &= 0.3 \pm 0.6i + (0.5 \mp 0.2i) \cdot t \quad \text{with } t \in [0, 1]. \end{aligned} \tag{18}$$

Furthermore, each TVAR process is initialized by $p$ independent and identically distributed random variables with standard deviation $\sigma_n$ and passes through a warm-up phase of 500 observations. To show the robustness of the TVAR estimate, the process is modelled 100 times with different noise. As a reference we use the estimates of time-stable AR coefficients under the assumption of a stationary process (i.e. the TSAR coefficients are computed using the Yule-Walker equations (Hamilton 1994, p. 59, eq. (3.4.36))).
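A single realization of this setup can be sketched as follows (our code, not the authors'). The time-varying coefficients are obtained from the roots via the characteristic polynomial $\prod_i (x - r_i(t)) = x^3 - \alpha_1 x^2 - \alpha_2 x - \alpha_3$; running the warm-up phase with the coefficients at $t = 0$ is our assumption, as the paper does not specify it.

```python
import numpy as np

rng = np.random.default_rng(42)
n, p, sigma_n = 1000, 3, 1e-3

def roots_at(t):
    # true linear root motions from Eq. (18)
    r1 = 0.2 + 0.9 * t
    r2 = 0.3 + 0.6j + (0.5 - 0.2j) * t
    return np.array([r1, r2, np.conj(r2)])

def coeffs_at(t):
    # characteristic polynomial prod(x - r_i) = x^3 - a1 x^2 - a2 x - a3
    c = np.poly(roots_at(t))
    return -c[1:].real                       # [alpha_1, alpha_2, alpha_3]

s = list(rng.normal(0.0, sigma_n, p))        # p i.i.d. starting values
for i in range(500 + n):
    t = max(0.0, (i - 500) / (n - 1))        # warm-up, then t in [0, 1]
    past = np.array(s[-1:-p - 1:-1])         # [s_{t-1}, s_{t-2}, s_{t-3}]
    s.append(coeffs_at(t) @ past + rng.normal(0.0, sigma_n))
series = np.array(s[-n:])                    # the 1000 observations
```

Note that $r_1$ exceeds unit modulus towards the end of the interval, so the realization grows there; this is exactly the stationarity issue discussed in the conclusion.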

One of the 100 realizations can be seen in Fig. 1. Each dot in Fig. 2 shows one of the three roots of an AR process estimated from a window of 100 observations using the Yule-Walker equations. The change in brightness (from dark to light) visualizes the shift of the window; each new point corresponds to shifting the window by one observation. The green lines represent the roots of the true TVAR process (from Eq. (18)), and it can be seen that they follow the estimated roots of the moving

**Fig. 1** A time series for one set of white noise

**Fig. 2** Roots of the windowed estimate, compared to the roots of the time-variable estimate for the time series in Fig. 1

window. For all other estimates, the whole time series was used instead of switching to a windowing.

In Fig. 4 the estimated root motions of the TVAR estimate for all 100 simulations are shown. Comparing the roots of the TSAR process (Fig. 3) with the TVAR root motions (Fig. 4), it is noticeable that the roots of the TSAR process scatter around constant values, whereas the time-varying estimate varies around the true root motions (shown in red).

Figure 5 shows the difference between each of the two estimation methods (TVAR (green) and TSAR (blue) estimation) and the true root movement. Instead of considering all 100 realizations individually, the deviations at each time are averaged over all realizations.

It is immediately noticeable that the residuals for the real root in the TVAR estimate (on the right side of Fig. 5) are consistently smaller than in the TSAR estimate. Even in the case of the complex roots, the time-variable estimate has on average smaller deviations, although the time-stable estimation performs better in the interval $t \in [130, 420]$. Due to the two-dimensional representation in Fig. 4 it seems that smaller residuals occur for the real root than for the complex roots, but this is refuted by Fig. 5, where it can be seen that the residuals for the complex roots are smaller than those for the real root.

**Fig. 3** All root motions of the TSAR estimates for the 100 simulations

**Fig. 4** All root motions of the TVAR estimates for the 100 simulations

**Fig. 5** Residuals between the roots from the TSAR and the true roots (blue), and the residuals between the roots from the TVAR estimate and the true roots (green). Furthermore a distinction is made between the complex roots, on the left side, and the real root on the right side

# **7 Conclusion and Outlook**

In this paper we have shown that the use of TVAR processes with linear motions of the roots leads to an estimation in which the $k$-th coefficient of the TVAR process is a polynomial of order $k$. But to construct linear root motions, additional conditions are necessary, which we derived here for the TVAR(1) and the TVAR(2) process. By successively calculating TVAR(1) and TVAR(2) processes, TVAR processes of higher order can also be estimated, whose roots then also move linearly.

To show the robustness of the TVAR estimate, 100 time series were simulated and the linear roots of the TVAR estimate were first compared with the roots of AR processes estimated from a moving window. Since the root movement of the TVAR estimate matches, averaged over time, the roots computed from the moving window, the windowed estimate can be used as a test of whether the TVAR model with linear roots is suitable for a time series.

Second, 100 time series were simulated, from which a TVAR or TSAR process was estimated. The results show that the roots from the TVAR estimate fit the true roots better than the roots of the TSAR estimation. This means that the introduction of TVAR processes with linear root motions provides a suitable extension for time series analysis. The results also show that the estimation with TVAR processes remains reasonably stable.

One problem that has gone unaddressed here is that the linear roots leave the unit circle over time. To solve this problem, future research should focus on root motions which guarantee stationarity for any length of time.

**Acknowledgements** This research is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)—Grant No. 435703911 (SCHU 2305/7-1 'Nonstationary stochastic processes in least squares collocation—NonStopLSC').

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Refinement of Spatio-Temporal Finite Element Spaces for Mean Sea Surface and Sea Level Anomaly Estimation**

# Moritz Borlinghaus, Christian Neyers, and Jan Martin Brockmann

#### **Abstract**

The mean sea surface (MSS) is an important reference surface for oceanographic and geodetic applications such as sea level studies or the geodetic determination of the steady-state ocean circulation. Models of the MSS are derived from averaged along-track radar altimetry, which provides instantaneous measurements of the sea surface heights (SSH). SSH observations are corrected for tides and other physical signals and can be modeled as the sum of the MSS and sea level anomalies (SLA), which describe the temporal variability of the ocean. Typical MSS products are defined as grids of heights at a specific reference epoch and result from spatial and temporal prediction and filtering of the along-track SSH observations, whereas SLA products are computed with respect to an MSS model and are also defined as, e.g., daily or monthly averaged grids.

In this contribution a one-step least-squares approach is used to estimate a continuous spatio-temporal model of the MSS and filtered SLAs from along-track altimetric SSH measurements, using C<sup>1</sup>-smooth finite element spaces for the spatial representation. The finite elements are defined on triangulations with different edge lengths and, thus, different spatial resolutions for MSS and SLA modeling. To model the temporal ocean variability, B-Spline basis functions are combined with the spatial finite elements to construct a spatio-temporal model. This contribution presents a concept to adapt the triangulations to the spatial characteristics of the MSS and SLA signal in a study region south of Africa. Least-squares residuals are studied to detect areas which show unmodeled spatial signal. These serve as input for the refinement of the triangulation. The results show that the residuals are indeed a good indicator for unmodeled signal, but as they are also significantly influenced by unmodeled temporal signals, the refinement has only a small local impact on the obtained MSS and SLA models.

#### **Keywords**

Finite elements - Mean sea surface - Mesh refinement - Sea level anomalies

# **1 Introduction**

The key idea of the proposed work is to estimate a continuous model in the spatial and temporal domain of both the mean sea surface (MSS) and the sea level anomalies (SLA) from altimetric sea surface height (SSH) measurements. The sea surface can be represented as the sum of the long-term mean (i.e. MSS) and its temporal variability (i.e. SLA).

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_205

M. Borlinghaus (-) · C. Neyers · J. M. Brockmann

Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany

e-mail: borlinghaus@geod.uni-bonn.de; neyers@geod.uni-bonn.de; brockmann@geod.uni-bonn.de

Both products have various areas of application such as the computation of mean dynamic topography (MDT) models and the derived long-term stable ocean currents (Knudsen et al. 2011; Becker et al. 2014; Mulet et al. 2021). Sea level anomalies can furthermore be used to detect mesoscale eddies (Bolmer et al. 2022).

Common MSS estimation approaches use multi-step procedures in which the temporal variability is eliminated from the SSH observations in a first step (e.g. Pujol et al. 2018; Andersen and Knudsen 2009; Jin et al. 2016). Whereas temporal averaging of all cycles is done for the exact repeat missions (ERM) to obtain a mean profile along the reference track, prior information like gridded SLA products (e.g. Taburet et al. 2019) are used to reduce the ocean variability from SSH observations obtained in Geodetic Mission phases (GM). Alternatively, binned SSH observations can be approximated by a temporal model per grid cell, to estimate a cell dependent temporal correction. Afterwards, collocation-like interpolation and gridding techniques are used to combine the corrected data and to estimate MSS (e.g. Pujol et al. 2018; Andersen and Knudsen 2009; Jin et al. 2016) models on pre-defined over-sampled grids.

In contrast to this, the approach used here is a generic one-step approach based on finite elements as basis functions to describe the spatial signal of both the MSS and the SLA (Borlinghaus et al. 2023). In this context, the SLA model is mainly used to absorb the temporal ocean variability. The finite elements are set up on triangulations which initially have a constant edge length over the entire study area, to start with a homogeneous spatial resolution. To describe the temporal domain, B-Splines with a constant node spacing are used as finite basis functions to obtain high flexibility. However, these initial models show – especially in highly variable areas – large least-squares residuals. In this study the spatial distribution of the residuals is used as an indicator of an insufficient parameterization and thus to refine the triangulations.

The manuscript is organized as follows: in Sect. 2 the theoretical background to construct the C<sup>1</sup>-smooth finite element space (FES) is summarized. Based on this space, the least-squares observation equations are set up to estimate the spatio-temporal model. The altimetric satellite data used and the analyzed estimation scenarios are explained in Sect. 3. In Sect. 4 a reference scenario configuration is introduced which serves as a basis for the refinements. The scenario with the refined FES for the static model component is analyzed in Sect. 5. In Sect. 6 the impact of the refinement of the FES for the temporal model component is evaluated. Finally, a summary, conclusions and an outlook are provided in Sect. 7.

# **2 Summary of the MSS Estimation Approach**

The basis for this study is the finite element based spatio-temporal estimation approach for the MSS and the SLA as proposed in Borlinghaus et al. (2023). Here the key idea is briefly summarized (cf. Borlinghaus et al. 2023). The geophysically corrected SSH is represented as the sum of the long-term mean (MSS) and the sea level anomalies (SLA)

$$f_{\rm SSH}(\lambda, \phi, \Delta t) = g_{\rm MSS}(\lambda, \phi) + f_{\rm SLA}(\lambda, \phi, \Delta t) \tag{1}$$

where both $g_{\rm MSS}: \mathbb{R}^2 \to \mathbb{R}$ and $f_{\rm SLA}: \mathbb{R}^3 \to \mathbb{R}$ are continuous functions and the time is represented as $\Delta t := t - t_0$ with the reference epoch $t_0$. To model the temporal behavior of the SLA, it is assumed that changes in time are continuous and separable from the spatial variability. The spatial domain is modeled with finite elements as basis functions, which have only local support. This allows complex signals to be modeled by a continuous mathematical function for which no accessible closed expression directly derived from physical laws exists. Within this study finite elements defined on triangular meshes are chosen, as they can easily be adapted to regions with complex boundaries (e.g. coastal regions). There are different finite elements which guarantee a C<sup>1</sup>-smooth surface. Here the Argyris element (Argyris et al. 1968) is selected because it is the element with the lowest number of degrees of freedom which guarantees C<sup>1</sup>-continuity while spanning a complete polynomial space. In particular, this is the local space of polynomials of degree 5 with 21 degrees of freedom, including the function value, two first and three second derivatives for each of the three nodes, as well as the three normal derivatives at the centers of the edges.

The entire domain of interest is partitioned into a finite number of triangular sub-regions, each of which has its own locally defined basis functions and corresponding parameters. To construct the triangulations utilized in this study the software package JIGSAW (Engwirda 2017) is used. It allows for an automatic generation of meshes given geometrical boundary constraints which define the local study area. The desired location specific size of the triangles can be configured via an input map which defines the target length of the edges in the region of interest. Based on that input a mesh of well-defined triangles is optimized by the software.

As described in Eq. (1), the SSH is modeled by two components, one for the static part and one for the temporal ocean variability. The static MSS signal is described by the function

$$g_{\rm MSS}(\lambda,\phi) = \sum_{i \in I_{\rm MSS}} a_{{\rm MSS},i}\, b_{{\rm MSS},i}(\lambda,\phi) \tag{2}$$

where $i \in I_{\rm MSS}$ indexes the piecewise-defined basis functions $b_{{\rm MSS},i}(\lambda,\phi)$ and $a_{{\rm MSS},i}$ are the corresponding scaling coefficients/parameters.

To model the spatio-temporal SLA signal

$$f_{\rm SLA}(\lambda, \phi, \Delta t) = \sum_{i \in I_{\rm SLA}} \sum_{j \in J} e_{{\rm SLA},i,j}\, b_{{\rm SLA},i}(\lambda, \phi)\, B_j^3(\Delta t) \tag{3}$$

is used, which is built as a tensor product of the spatial and temporal basis functions. Here $e_{{\rm SLA},i,j}$ are the unknown spatio-temporal parameters and $b_{{\rm SLA},i}(\lambda,\phi)$ again the spatial finite elements.
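The tensor-product evaluation of Eq. (3) can be sketched with a hand-rolled uniform cubic B-spline; the spatial Argyris basis functions are left as placeholder callables, and all names are illustrative rather than the authors' code.

```python
import numpy as np

def cubic_bspline(x):
    """Uniform cubic B-spline basis value at x (support [-2, 2]),
    standard Cox-de Boor result for degree 3."""
    ax = abs(x)
    if ax < 1.0:
        return (4.0 - 6.0 * ax**2 + 3.0 * ax**3) / 6.0
    if ax < 2.0:
        return (2.0 - ax) ** 3 / 6.0
    return 0.0

def f_sla(lam, phi, dt, e, spatial_basis, node_spacing=6.0):
    """Tensor-product evaluation of Eq. (3): spatial finite elements
    times temporal degree-3 B-splines with uniform node spacing.
    `spatial_basis[i]` and the coefficients `e[i, j]` stand in for the
    Argyris element machinery, which is not reproduced here."""
    val = 0.0
    for i, b in enumerate(spatial_basis):
        for j in range(e.shape[1]):
            val += e[i, j] * b(lam, phi) * cubic_bspline(dt / node_spacing - j)
    return val
```

Because the shifted cubic B-splines form a partition of unity, constant coefficients reproduce a constant function, which is a convenient sanity check.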

In detail, uniform B-Splines of degree 3 (cf. De Boor 2001; Fahrmeir et al. 2021) with a constant node spacing of $\Delta = 6\,\mathrm{d}$ are used in this study to obtain a high temporal resolution. $B_j^3(\Delta t)$ describes the $j$-th B-Spline basis function ($j \in J$). As the chosen B-Spline function corresponds to a low-pass filter with cutoff frequency $\nu_c = \frac{1}{2\Delta} = 1/12\,\mathrm{1/d}$ (Sünkel 1985), the model can represent signals down to a 12 d period. Given a single ERM of the Jason family with a temporal repeat of $\delta t = 10\,\mathrm{d}$, the Nyquist frequency $\nu_N = \frac{1}{2\delta t} = 1/20\,\mathrm{1/d}$ follows, thus 20 d periods are resolvable. Given a combination of simultaneously operating ERM missions (e.g. the Jasons and SARAL, HY-2A, and Sentinel-3), it turned out that in the joint spatio-temporal analysis the 6 d node spacing is close to the highest possible temporal resolution which can be estimated stably.
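The resolution arithmetic above can be written down in a few lines (illustrative only; the interpretation of the node spacing as a low-pass cutoff follows the text):

```python
# Temporal resolution arithmetic of Sect. 2 (illustrative):
node_spacing_d = 6.0                        # B-spline node spacing Delta
cutoff_freq = 1.0 / (2.0 * node_spacing_d)  # nu_c = 1/12 1/d -> 12 d periods
repeat_d = 10.0                             # Jason-family repeat cycle
nyquist_freq = 1.0 / (2.0 * repeat_d)       # nu_N = 1/20 1/d -> 20 d periods
# a single ERM resolves only 20 d periods, so the 12 d capability of the
# 6 d node spacing only pays off when several missions are combined
```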

In Borlinghaus et al. (2023) it is shown that for a stable estimation of both components, two different FES are required: a fine resolution for $g_{\rm MSS}$ to capture the high-frequency static (geoid) signal, and a significantly coarser space for $f_{\rm SLA}$. Consequently, the resolvable spatial resolution is limited. As the Argyris element is selected for both functions, the spatial resolution of the functions depends entirely on the mesh.

To estimate the unknown parameters $a_{{\rm MSS},i}$ and $e_{{\rm SLA},i,j}$ in a least-squares adjustment, Eq. (1) is used to set up the linear observation equations with all SSH observations as left-hand side. Here, the SSH observations are assumed to be uncorrelated with a variance of $\sigma_0^2$, i.e. the covariance matrix of all SSH observations $\mathcal{L}$ is

$$\Sigma\{\mathcal{L}\} = \sigma_0^2\, \mathbf{I}, \tag{4}$$

although it is known that the noise standard deviation of SSHs is not spatially homogeneous.

For completeness, the estimation is stabilized by applying a Tikhonov regularization (cf. Tikhonov et al. 1977) with manually adapted individual weights from variance component analysis (cf. Koch and Kusche 2002) for each parameter group (i.e. individual identity matrices for parameters of the same kind, i.e. SLA and MSS, as well as for the function values, first and second derivatives which are the local parameters of the finite elements). This compensates for spatial or temporal data gaps and an unfavorable observation distribution close to the boundary of the region of interest. Furthermore, linear zero-mean constraints are applied to the SLA model to prevent leakage of static MSS signal into $f_{\rm SLA}$. Additionally, the temporal model is stabilized at its borders by forcing the second derivatives to zero (for further details see Borlinghaus et al. 2023).
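A minimal sketch of such a group-wise Tikhonov regularization on the normal equations is given below (our naming; the actual implementation with zero-mean and boundary constraints is considerably more involved):

```python
import numpy as np

def regularized_lsq(A, y, groups, weights):
    """Tikhonov-regularized least squares with one regularization weight
    per parameter group: `groups` maps each unknown to a group index,
    `weights` holds the weight of each group. The per-group scaled
    identities form a diagonal regularization matrix R added to the
    normal matrix. Illustrative sketch, not the authors' code."""
    A = np.asarray(A, dtype=float)
    N = A.T @ A                                   # normal matrix
    n = A.T @ np.asarray(y, dtype=float)          # right-hand side
    R = np.diag(np.asarray(weights, dtype=float)[np.asarray(groups)])
    return np.linalg.solve(N + R, n)
```

For a well-conditioned design the unregularized groups are untouched while regularized groups are pulled towards zero, which mimics the stabilizing effect described above.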

# **3 Real Data Experiment**

The focus of the presented study is the improvement of the triangulations for both the spatial MSS modeling and the spatial SLA modeling. Therefore, the approach summarized in Sect. 2 is applied in a real data experiment to obtain optimized triangulations. In Borlinghaus et al. (2023) both meshes are assumed to be known a priori; they were generated as homogeneous meshes, i.e. with homogeneous edge lengths chosen based on the spatial sampling of the satellites and the available computational resources.

Figure 1 shows the investigated region south of Africa where the methodology is tested. This region is selected because of its high spatial and temporal variability and

**Fig. 1** Test region south of Africa with an estimated model of the mean sea surface


**Table 1** Description of the three estimation scenarios with the involved FES and the temporal node spacing

to keep the computational effort manageable. The high spatial variability results from high-frequency geoid signal and is especially visible in the southwest of the region. Furthermore, the Agulhas current, which is subject to large temporal variability, is the dominant temporal feature in this region.

To obtain the best possible spatial resolution for the MSS and a sufficient temporal resolution to compensate the SLA, observations of both exact repeat altimetry missions (ERM, e.g. Jason-1) and geodetic missions (GM, e.g. CryoSat-2) have to be analyzed jointly. Because of their high temporal resolution, ERMs have a poor spatial resolution. For GMs, the opposite is true.

Thus, all available ERM and GM altimetry missions for which an L2P data product is available on AVISO+ for the study region and period (2010 to 2019, inclusive) are selected.<sup>1</sup> These are in total $3.2 \times 10^6$ observations collected by nine satellite missions. The reference epoch $t_0$ is set to January 1st, 2015, which is in the middle of the study period.

This study configuration is used to estimate models of the MSS and the SLA utilizing the summarized estimation approach, while targeting a data adaptive refinement of the meshes of both FES. Table 1 summarizes the main configuration details of the three scenarios considered here.

# **4 Reference Scenario and Objectives of the Study**

The initial scenario R serves as a reference. MSS and SLA are estimated in order to have an internal baseline for comparisons with models estimated with refined triangulations. This reference scenario utilizes a FES with a homogeneous target edge length of 35 km in the entire domain for the MSS and 175 km for the SLA. Please note that it is not easy to provide a precise measure of the spatial resolution of the FES. But, given the definition of the Argyris element, the local polynomials within a triangle are of degree five. Consequently, this corresponds to one-dimensional polynomials of degree five along all slices as well. Given the six parameters of the polynomial, the polynomial has four degrees of freedom, accounting for two constraints required to guarantee the C<sup>1</sup>-smoothness at the borders of the triangles. Thus, we expect a spatial resolution in the order of the edge length divided by four, which is confirmed by numerical experiments (not shown here). For the MSS it is 9 km, and thus slightly above the along-track sampling of the 1 Hz SSH observations (7.5 km).

In Borlinghaus et al. (2023) it is shown that the FES for the temporal model component requires a coarser resolution to avoid overparameterization. As it is mainly determined by the ERMs, the reference edge length is tailored to the ground track spacing of the ERMs, which is in the order of 100 km to 315 km (e.g. Sentinel-3 and the Jason family). For the reference scenario, an edge length of 175 km was chosen. The temporal resolution given by the node spacing of the B-Splines (cf. Table 1) remains constant for all scenarios. It is chosen as 6 d; thus differences in the results obtained relate only to the refined triangulation.

The configuration of the reference scenario summarized above is used to estimate a MSS and the model for the SLAs. Based on the resulting least-squares residuals, this study addresses the research question whether it is possible to improve the spatial meshes of both – MSS and SLA – based on this internal quality measure.

Figure 2 shows the empirical standard deviations of all residuals within a single triangle for the reference scenario R. Whereas Fig. 2a uses the fine MSS mesh for the computation of the standard deviations, Fig. 2b shows them computed for the coarser SLA mesh. The standard deviations in both figures are not homogeneous; they are in a range of 4 cm to 5 cm in the northwestern and southeastern parts, but reach more than 8 cm in the central part where the Agulhas current is the dominant feature. Additionally, some small regions with higher values are visible in the southwestern part of Fig. 2a. The higher variances are an indicator of an insufficient parameterization, either in the spatial or in the temporal domain. Furthermore, larger differences at the eastern boundary become visible, which are attributed to numerical issues and boundary effects. The overall standard deviation of the residuals in the test region of R is 5.5 cm.

The goal of this study is to use the maps shown in Fig. 2 to refine both triangulations, for the MSS as well as the SLA, to identify the unmodeled higher-resolution signals. As the larger standard deviations can result from (i) unmodeled spatial high-resolution MSS signal, (ii) unmodeled spatial higher-resolution SLA signal, or (iii) unmodeled high-resolution temporal SLA signal, these maps are a good proxy for mesh refinement. They are converted to a JIGSAW input map of target edge lengths, from which optimized meshes are generated. For regions of lower standard deviation, edge

<sup>1</sup> The L2P data products as processed on behalf of the CNES (Centre National d'Etudes Spatiales) SALP project and distributed by AVISO+ are used. For a detailed product description and data access see https://www.aviso.altimetry.fr/en/data/products/sea-surface-height-products/global/along-track-sea-level-anomalies-l2p.html (last accessed 25/10/22).

**Fig. 2** Standard deviation of the residuals per triangle of R for the FESs (F<sub>35</sub> and F<sub>175</sub>) for the static (**a**) and temporal (**b**) model component

lengths at the upper limit are requested, whereas for regions of high standard deviation edge lengths at the lower limit are requested. In scenario S, this is studied for the mesh of the MSS, where the target edge length is allowed to vary between 17 km and 35 km depending on the standard deviation (F<sub>17,35</sub>). Consequently, the spatial resolution is doubled in regions of highest variance. Technically speaking, the spatial map of standard deviations (cf. Fig. 2a) is converted to a map of target edge lengths, mapping the standard deviation to a target edge length in the interval 17 km to 35 km. Based on this, a new mesh is optimized by JIGSAW, trying to obtain the regionally requested edge lengths. The mesh for the SLA is not changed (F<sub>175</sub>, cf. Table 1).
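The mapping from per-triangle residual standard deviations to target edge lengths can be sketched as follows. Note that the linear mapping is our assumption; the paper only states that standard deviations are mapped into the interval 17 km to 35 km for scenario S.

```python
import numpy as np

def target_edge_length(sigma, h_min=17.0, h_max=35.0):
    """Map per-triangle residual standard deviations to target edge
    lengths for the mesh generator: high sigma -> fine mesh (h_min),
    low sigma -> coarse mesh (h_max). Linear mapping is an assumption;
    illustrative sketch, not the authors' code."""
    s = np.asarray(sigma, dtype=float)
    w = (s - s.min()) / (s.max() - s.min())   # 0 = lowest, 1 = highest sigma
    return h_max - w * (h_max - h_min)
```

Such a map of target edge lengths would then be passed to the mesh generator as its sizing input.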

This is modified in scenario T, which uses the homogeneous F<sub>35</sub> for the MSS but refines the mesh for the SLA component to 130 km to 200 km, again depending on the regional standard deviations shown in Fig. 2b. As the mesh of the SLA component dominates the number of unknown parameters, and to avoid overparameterization, the lower bound of the interval for the target edge length is limited to 130 km. The upper limit is chosen as 200 km, which allows even larger triangles in regions of low variance.

In the following two sections (Sects. 5 and 6), the results obtained with the newly generated meshes in the two different scenarios are analyzed and compared to the reference scenario R.

# **5 Refined MSS Component**

The first refined scenario S combines the refined FES for the MSS with the original homogeneous F<sub>175</sub> for the SLA. The homogeneous reference mesh is shown in Fig. 3a and the refined mesh optimized by JIGSAW is shown in Fig. 3b, with the area of the individual triangles color coded. Although the correlation of the mesh with the map of standard deviations is visible (cf. Fig. 2a), the transition from coarse to fine is clearly blurred. As targeted, the major refinements are seen in the area of the Agulhas current with the highest standard deviation of the residuals, but some refinements of smaller extent can also be seen in the southwest.

After estimating the MSS and SLA model with the refined mesh for the MSS, new residuals are computed. Figure 3c shows the standard deviation of the residuals per triangle of F<sub>17,35</sub>. Compared to the reference scenario in Fig. 2a, there is no visible difference in the magnitude of the standard deviations, and the Agulhas current is still the dominant feature. Figure 3d shows the differences of the standard deviations evaluated on F<sub>35</sub>. Here, red colors indicate a reduced standard deviation of the refined scenario compared to the reference scenario. The

**Fig. 3** Approximate triangle sizes of the reference (F<sub>35</sub>, **a**) and refined FES (F<sub>17,35</sub>, **b**), the standard deviation of the residuals per triangle (**c**) and the change of the standard deviations per triangle compared to

the reference scenario (**d**), the RMS per triangle of the differences to CNES\_CLS15 MSS (**e**) and the differences of the RMS compared to the reference scenario (**f**)

highest improvements can be seen in the southwestern part, but additional improvements are visible in the northern part. In the center of the region, where the main refinement happens, improvements are only minor, however. Overall, the standard deviation of the residuals could not be improved significantly (sub-mm level).

Relying only on the residuals to judge the quality of the refinement is disadvantageous. As an external comparison, the MSS model component is compared to the well-established model CNES\_CLS15 MSS<sup>2</sup> (Pujol et al. 2018). To do so, the model is evaluated on the grid provided by the comparison model. Figure 3e shows the RMS of the difference between the estimated MSS and the CNES\_CLS15 computed per triangle. In general, a good agreement of 1 cm to 4 cm is achieved. The differences have a more or less random structure over the complete study area, with highest values in the region of the Agulhas current and in the southwestern region. To show again the effect of the mesh refinement, the differences of the RMS of the differences to the CNES\_CLS15 MSS are computed (see Fig. 3f). Again, red colors show a reduced RMS and thus an improvement; green colors correspond to a larger RMS and thus a degradation.

The highest improvements are again visible in the southwestern and northern parts. This confirms that the refined model captures additional MSS signal. But the central area shows higher RMS values of the difference of the refined model to the CNES\_CLS15 MSS compared to the reference model from scenario R. This can indicate either an overparameterization or a lower filtering effect caused by the smaller triangles in this area. Due to the smaller triangles, the refined model has 27,446 degrees of freedom, compared to 14,043 for the reference solution. This suggests the conclusion that the increased standard deviation of the residuals in the central area cannot be attributed to unmodeled MSS signal. It has to be attributed to unmodeled spatio-temporal SLA signal, which is studied in scenario T using the reference mesh for the MSS (cf. Table 1).

# **6 Refined SLA Component**

For the second model, the refined FES (F<sub>130,200</sub>) for the temporal model component is used together with the original homogeneous mesh for the MSS (as for the reference solution). Figures 4a and b show the reference and refined mesh. As desired, the main refinement appears in the highly variable area of the Agulhas current, but especially in the northwestern and southeastern parts regions with coarser triangle sizes are visible. Because of more than 600 temporal B-Spline nodes, small changes in the degrees of freedom of the spatial FES have a large impact on the total number of unknown parameters. Therefore the lower and upper limits for the target edge length are defined as 130 km and 200 km and optimized by JIGSAW depending on the standard deviation to obtain F<sub>130,200</sub>. This leads to only small changes in the total number of parameters of the spatio-temporal SLA model, from 414,864 to 416,673.

Figure 4c shows the standard deviation of the residuals of the model T. Again the main features in the area of the Agulhas current are visible. To highlight the difference to the reference solution, the differences of the standard deviations are computed and shown in Fig. 4d. The differences of the standard deviations computed from the residuals of scenario R and the refined scenario T show the largest improvements, as expected, in the area of the refinement. But some higher values are visible in regions with a coarser triangulation structure. The overall standard deviation of the residuals is slightly reduced by 1 mm; it is 5.4 cm.

For an external comparison, DUACS Level 4 gridded SLA DT2018 maps are used. The spatio-temporal SLA model (cf. Eq. 3) is evaluated on the grid and at the time stamps provided by the product. Then the differences between both time series are computed, and the RMS is derived for each pixel of the grid. Afterwards, the RMS values are averaged per triangle over the different epochs (see Fig. 4e). Here the course of the Agulhas current is not visible as a dominant feature, but again larger mean RMS values appear in regions with higher temporal variability.
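The per-pixel comparison described above can be sketched as follows. This is an illustrative toy with synthetic arrays (the array names, grid size and noise level are assumptions, not values from the study): the model is evaluated at the product's epochs, differenced against the gridded maps, and a temporal RMS is formed per pixel before averaging over triangles.

```python
import numpy as np

# Toy version of the external comparison: difference two co-registered
# SLA time series (model vs. gridded product) and compute a per-pixel
# RMS over time. Dimensions and noise level are invented.
rng = np.random.default_rng(0)
n_t, n_lat, n_lon = 50, 4, 5
sla_duacs = rng.normal(size=(n_t, n_lat, n_lon))   # gridded SLA maps (toy)
sla_model = sla_duacs + rng.normal(scale=0.05, size=(n_t, n_lat, n_lon))

diff = sla_model - sla_duacs
rms_per_pixel = np.sqrt(np.mean(diff**2, axis=0))  # RMS over time, per pixel
print(rms_per_pixel.shape)
```

In the study, these per-pixel RMS values would subsequently be averaged over the pixels falling inside each triangle of the mesh.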

To get an impression of the improvement obtained by the refinement, the differences of the mean RMS values of the reference model and the refined model are computed (see Fig. 4f). The differences show a more or less random characteristic with no dominant features in the area of the FES refinement. Additionally, the regional improvements visible in Fig. 4d are not visible in Fig. 4f. This shows that, due to the regional refinements, signal is modeled which has no impact on the differences to the DUACS SLA maps. Again, the overall mean RMS is slightly improved by 1 mm, from 5.7 cm to 5.6 cm.

# **7 Summary, Discussion and Outlook**

In this contribution, the refinement of the triangulation of the FES used to model the MSS and the SLA as continuous mathematical functions is studied. MSS and SLA are jointly estimated from along-track SSH observations in a least-squares adjustment. For the refinement, least-squares observation residuals from an initial reference solution with a homogeneous mesh are used to identify conspicuous spatial regions which are candidates for mesh refinement. The model functions used in this study are composed of two components: a static one which models the MSS and a spatio-temporal one which approximates the temporal ocean variability (SLA). Both parts utilize finite element basis functions for the spatial description, a high-resolution FES for the MSS and a lower-resolution FES for the SLA. The latter is extended to a spatio-temporal model using B-Spline basis functions for the temporal domain.

<sup>2</sup> In order to adjust the different reference epochs, an epoch adjustment using DUACS Level 4 gridded SLA DT2018 maps (Taburet et al. 2019) is performed.

**Fig. 4** Approximate triangle sizes of the reference (F175, **a**) and refined FES (F130,200, **b**), the standard deviation of the residuals per triangle (**c**), the change of the standard deviations per triangle compared to the reference scenario (**d**), the mean RMS per triangle of the differences to DUACS SLA maps (**e**) and the differences of the mean RMS compared to the reference scenario (**f**)

To test this strategy, a small region south of Africa is selected where both high temporal and high spatial variability are expected. The reference scenario R uses a homogeneous FES with a target edge length of 35 km for the MSS modeling and of 175 km for the SLA component. In this contribution, the refinement of both components is studied individually, to better assess the effects on the derived MSS and SLA. The first scenario S refines the mesh for the MSS to a target edge length of 17 km to 35 km, depending on the empirically derived standard deviation of the least-squares residuals within a triangle of the mesh. The second scenario T adjusts the FES for the temporal model component to target edge lengths of 130 km to 200 km, again depending on the empirically derived standard deviations of the residuals. For both refined scenarios, MSS as well as SLA models are estimated and used to assess the performance of the refinement.

The spatial pattern of the estimated standard deviations per triangle is a good internal quality measure to indicate potential regions of insufficient parameterization. It is shown that they can help to locally adapt the FES in high-residual regions, e.g. the south-western part. In regions of high temporal variability, the mesh can be refined as well, leading to smaller least-squares residuals. However, when comparing the results to reference models for the MSS, degradations are visible as well, which indicates the risk of overparameterization. Since the impact of the FES refinement in both refined scenarios is small, we conclude that the choice of the initial meshes based on the data sampling was reasonable. The regions of lowest standard deviation are on the order of 4 cm, which is in line with the typical assumption of an accuracy of a few centimeters for a single SSH measurement. The largest problem for the FES refinement based on least-squares residuals is to differentiate between larger standard deviations resulting from (i) unmodeled MSS signal, (ii) unmodeled spatial SLA signal or (iii) insufficient temporal resolution. This can lead to an iterative process to compute refined FES. In general, three different design choices related to the approximation capabilities can be adjusted: the spatial resolution of the FES for the MSS, the spatial resolution of the FES for the SLA, and the temporal resolution, i.e. the node spacing of the B-Splines.

The first two design criteria were shown to have a negligible impact on the overall results. For the FES of the MSS component, the positive impact of the refinement is only visible in some very local areas. Thus, the iterative refinement seems to be a useful technique to refine the FES locally while avoiding an overparameterization in smoother areas.

Still, the highest potential to improve the estimated models seems to be an increased resolution of the temporal domain by reducing the node spacing of the B-Splines. However, this is on the one hand limited by the available data, i.e. the repeat cycle of the ERM (10 d) and the number of available satellite missions operating in parallel. On the other hand, it significantly increases the number of unknown parameters, which results in computational challenges and might cause numerical problems. This requires more advanced and adapted regularization techniques.

**Data Availability Statement** The altimeter products were produced by Ssalto/Duacs and distributed by Aviso+, with support from CNES (https://www.aviso.altimetry.fr).

**Acknowledgements** The authors acknowledge the financial support via the DFG project "*PArametric determination of the dynamic ocean* *topography from geoid, altimetric sea surface heights and SAR derived RAdial SURface Velocities*" (PARASURV, grant BR5470/1-1) and the funding by the TRA Modelling (University of Bonn) as part of the Excellence Strategy of the federal and state governments.

The authors gratefully acknowledge the Gauss Centre for Supercomputing e.V. (www.gauss-centre.eu) for funding this project by providing computing time through the John von Neumann Institute for Computing (NIC) on the GCS Supercomputer JUWELS at Jülich Supercomputing Centre (JSC) and the granted access to the Bonna cluster hosted by the University of Bonn.

Furthermore, the preliminary coursework of Victoria Brunn and the discussions with Wolf-Dieter Schuh are gratefully acknowledged.

The colormaps used are taken from Thyng et al. (2016).

The effort of two anonymous reviewers is gratefully acknowledged; their comments helped to significantly improve the quality of the manuscript.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **On the Coestimation of Long-Term Spatio-Temporal Signals to Reduce the Aliasing Effect in Parametric Geodetic Mean Dynamic Topography Estimation**

# Jan Martin Brockmann, Moritz Borlinghaus, Christian Neyers, and Wolf-Dieter Schuh

#### **Abstract**

The geodetic estimation of the mean dynamic ocean topography (MDT) as the difference between the mean of the sea surface and the geoid remains, despite the simple relation, a difficult task. Mainly, the spectral inconsistency between the available altimetric sea surface height (SSH) observations and the geoid information causes problems in the separation of the spatially and temporally averaged SSH into geoid and MDT. This is aggravated by the accuracy characteristics of the satellite-derived geoid information, as it is only sufficiently accurate down to a resolution of about 100 km.

To enable the direct use of along-track altimetric SSH observations, we apply a parametric approach, where a C<sup>1</sup>-smooth finite element space is used to model the MDT and spherical harmonics are used to model the geoid. Combining observation equations for altimetric SSH observations with gravity field normal equations assembled from dedicated gravity field missions in a least-squares adjustment allows for a joint estimation of both, i.e. the MDT and an improved geoid.

In order to enable temporal averaging and to obtain a proper spatial resolution, satellite altimetry missions with an exact repeat cycle are combined with geodetic missions. Whereas the temporal averaging for the exact repeat missions is implicitly performed due to the regular temporal sampling, aliasing is introduced for the geodetic missions, because of the missing repeat characteristics. In this contribution, we will summarise the used approach and introduce the coestimation of long-term temporal sea level variations. It is studied how the additional spatio-temporal model component, i.e. linear trends and seasonal signals, reduces the aliasing problem and influences the estimate of the MDT and the geoid.

#### **Keywords**

Finite elements - Geoid - Mean dynamic topography - Sea surface height - Signal separation - Spatio-temporal modelling

# **1 Introduction**

The difference between the sea surface height (SSH) above a reference ellipsoid and the geoid is the ocean's dynamic topography. Accurate knowledge of its steady-state part, the mean dynamic ocean topography (MDT), is crucial both for oceanographers (e.g. Wunsch and Gaposchkin 1980; Fu 2014; Wunsch and Stammer 1998), as it gives valuable information about the ocean's circulation and geostrophic surface currents, and for geodesists (Rummel 2001), as it permits the unification of independent local vertical datums. The geodetic estimation of the MDT can be represented, under the concept of signal de-convolution, as the separation of the Mean Sea Surface (MSS) height into the MDT and the geoid height. For the separation process, additional independent information about the geoid, the MDT or both is required. When omitting any kind of oceanographic input data (ocean salinity, temperature, pressure) and instead relying only on satellite-derived geoid and SSH data, the resulting MDT estimate is called *geodetic*. A remaining scientific challenge is the spectral inconsistency of the involved data sets (e.g. Albertella et al 2008; Woodworth et al 2015).

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_224

J. M. Brockmann (✉) · M. Borlinghaus · C. Neyers · W.-D. Schuh Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany

e-mail: brockmann@geod.uni-bonn.de; borlinghaus@geod.uni-bonn.de; neyers@geod.uni-bonn.de; schuh@geod.uni-bonn.de

Several approaches have been developed and applied to determine a MDT from altimetric measurements and geoid information at the global or regional scale. The basis are the altimetric SSH measurements $h\_{\text{SSH}} = h\_{\text{orb}} - h\_{\text{alt}} - c + o + e$, which result from the difference between the altitude of the satellite ($h\_{\text{orb}}$) and the raw altimeter range ($h\_{\text{alt}}$), which has to be corrected for environmental conditions ($c$). These corrections include instrument and sea state bias, atmospheric, tidal and inverted barometer corrections (Aviso 2020). Additionally, the measurements contain random and systematic errors $e$ and a mission-specific bias $o$. Due to the highly complex signal structure of the SSH observations, containing a multitude of individual constituents, most MDT estimation approaches use preprocessed MSS products (e.g. Andersen et al 2015; Schaeffler et al 2016) instead of the original along-track SSH measurements. These result from a spatial and temporal gridding and averaging of multi-mission along-track observations, utilizing either deterministic or stochastic approaches and taking care of data homogenization, removal of the temporal ocean variability (OV) and a reduction of errors. The MSS products are provided as fine grids (e.g. $1' \times 1'$) and contain the averaged SSH of multiple missions collected over decades with a spatial resolution of a few kilometers (Andersen et al 2015).

The geoid information is typically based on a spherical harmonic model generated from satellite observations, e.g. GRACE and GOCE. Consequently, it has a limited spatial resolution of about 100 km with an accuracy level of 1 cm to 3 cm (e.g. Brockmann et al 2021).

To derive a consistent MDT ($\zeta$) from the difference of a MSS and the geoid information ($N$)

$$
\zeta = \text{MSS} - N \tag{1}
$$

filtering is required to overcome the spatial inconsistencies. The approaches mostly differ in the chosen filtering strategy, the filter characteristics or the domain in which the filtering is applied (e.g. Albertella et al 2008; Bingham et al 2008; Čunderlík et al 2013; Siegismund 2013; Gilardoni et al 2015; Knudsen et al 2021).

In this study, we utilize a parametric least-squares approach (cf. Becker et al 2014; Neyers 2017) which jointly estimates both a refined geoid and a model of the MDT. Here, we analyse the feasibility of coestimating a spatio-temporal model component which is supposed to compensate for the OV. This reduces the reliance on prior models used for reduction and/or assumptions about implicit canceling. For this purpose, Sect. 2 summarizes the parametric approach and introduces the applied estimation strategy. Section 3 introduces the data, the study region and the configuration details for the numerical experiments designed to study the extended model. The results obtained for both configurations, i.e. with and without the spatio-temporal extension, are presented and compared in Sect. 4. Finally, Sect. 5 summarizes the results and draws conclusions.

# **2 Parametric Modelling Approach**

# **2.1 Parametric Model Functions**

#### **2.1.1 Modelling the Geoid**

The Earth's gravity field is typically modelled by global spherical harmonic basis functions. Although not optimal when working regionally, this representation results from the gravity field information used (cf. Sect. 3.2). The disturbing potential at an evaluation point with spherical coordinates $(r, \lambda, \varphi)$ then reads (e.g. Hofmann-Wellenhof and Moritz 2005)

$$T(r,\lambda,\varphi) = \frac{GM}{R} \sum\_{l=0}^{l\_{\text{max}}} \left(\frac{R}{r}\right)^{l+1} \sum\_{m=0}^{l} P\_{lm}\left(\sin\varphi\right) \left(c\_{lm}\cos(m\lambda) + s\_{lm}\sin(m\lambda)\right) - U\left(r,\varphi\right)\,, \tag{2}$$

with the maximum degree $l\_{\text{max}}$ of the expansion. $GM$ and $R$ are the gravitational constant of the Earth and the equatorial radius, $P\_{lm}(\cdot)$ are the fully normalized associated Legendre basis functions and $U(\cdot)$ is the normal potential. $c\_{lm}$ and $s\_{lm}$ are the Stokes coefficients. Using this representation, the geoid can be approximated in the spherical harmonic domain and represented as a function of ellipsoidal coordinates $(h = 0, \lambda, \phi)$

$$N(\lambda, \phi) = T\left(r(0, \lambda, \phi), \lambda, \varphi(0, \lambda, \phi)\right) / \gamma(\phi), \tag{3}$$

where $\gamma(\cdot)$ is the normal gravity.

#### **2.1.2 Modelling the MDT**

Since there is no natural choice of suitable basis functions for the MDT, a finite element approach on a triangulation is chosen to approximate the unknown function. The MDT is modelled by the continuous function

$$\zeta(\lambda,\phi) = \sum\_{k \in K} a\_{\text{MDT},k} \; b\_k(\lambda,\phi) \tag{4}$$

where $a\_{\text{MDT},k}$ are the $K$ unknown scaling coefficients of the basis functions $b\_k(\lambda, \phi)$ resulting from the chosen finite element space. In this study, the Argyris element (Argyris et al 1968) is selected to achieve a C<sup>1</sup>-smooth definition of $\zeta$ over the domain of interest $\Omega$. Within each triangle, polynomials of degree 5 are spanned by 21 degrees of freedom. The spatial resolution of $\zeta$ is then controlled by the mesh resolution, i.e. the triangle size.
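The count of 21 degrees of freedom per triangle follows directly from the dimension of the full bivariate quintic polynomial space; a one-line check:

```python
# The Argyris element spans the full space of bivariate polynomials of
# degree 5 on each triangle. Counting monomials x^i * y^j with i + j <= 5
# reproduces the 21 degrees of freedom mentioned in the text.
n_dof = sum(1 for i in range(6) for j in range(6) if i + j <= 5)
print(n_dof)  # 21
```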

#### **2.1.3 Modelling the Ocean Variability**

Similar to Borlinghaus et al (2023), a spatio-temporal finite element space is constructed by a tensor product of spatial and temporal basis functions. Whereas for the spatial domain the same finite element space as for the MDT is chosen (cf. (4)), the temporal model function is a linear combination of a trend and seasonal harmonics

$$a\_{\text{OV},l}(t) = e\_{\text{OV},1,l}(t - t\_0) + e\_{\text{OV},2,l} \sin\left(\omega(t - t\_0)\right) + e\_{\text{OV},3,l} \cos\left(\omega(t - t\_0)\right) \tag{5}$$

with reference epoch $t\_0$ and the fixed annual frequency $\omega$.

Combining these functions with a tensor product as in Borlinghaus et al (2023) yields the spatio-temporal model function

$$\mathrm{OV}(\lambda, \phi, t) = \sum\_{l \in L} a\_{\text{OV},l}(t)\, b\_l(\lambda, \phi) \tag{6}$$

$$= \sum\_{l \in L} e\_{\text{OV},1,l}(t - t\_0)\, b\_l(\lambda, \phi) + e\_{\text{OV},2,l} \sin\left(\omega(t - t\_0)\right) b\_l(\lambda, \phi) + e\_{\text{OV},3,l} \cos\left(\omega(t - t\_0)\right) b\_l(\lambda, \phi) \tag{7}$$

which can be used to absorb long-term signals of the ocean variability. The scaling coefficients $e\_{\text{OV},m,l}$ of the spatio-temporal basis functions are estimated in the least-squares adjustment from the altimetric SSH observations.
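The tensor-product evaluation in (6) can be sketched as follows. This is a minimal toy, not the paper's implementation: the spatial basis values, the coefficients and the time unit (days) are invented placeholders.

```python
import numpy as np

# Sketch of (6): each spatial finite element basis value b_l(lambda, phi)
# is scaled by its temporal coefficient a_OV,l(t) from (5) and summed.
omega = 2.0 * np.pi / 365.25  # fixed annual frequency, t in days (assumption)

def ov(b_vals, e, t, t0=0.0):
    """b_vals: (L,) spatial basis values at a fixed (lambda, phi);
    e: (L, 3) coefficients e_OV,m,l of trend, sine and cosine terms."""
    dt = t - t0
    a = e[:, 0] * dt + e[:, 1] * np.sin(omega * dt) + e[:, 2] * np.cos(omega * dt)
    return float(a @ b_vals)

b = np.array([0.2, 0.5, 0.3])            # toy basis values
e = np.tile([0.0, 0.0, 0.01], (3, 1))    # toy coefficients: cosine terms only
print(round(ov(b, e, t=0.0), 6))         # at t = t0, only the cosine terms remain
```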

# **2.2 Combined Estimation Procedure**

The unknown Stokes coefficients $c\_{lm}$ and $s\_{lm}$ for the geoid, $a\_{\text{MDT},k}$ for the MDT, as well as optionally $e\_{\text{OV},m,l}$ for the ocean variability, are estimated from a joint adjustment of the satellite-based geoid information and the altimetric sea surface height measurements. As the gravity field information is already available in the form of normal equations of global satellite-only gravity field models in the spherical harmonic domain, observation equations have to be set up for the altimetric SSH measurements only.

*SSH Observation Equations for Scenario A* No ocean variability is estimated. Thus, it is assumed that the ocean variability cancels due to the implicit spatio-temporal averaging within the least-squares adjustment. Consequently, the observation equation for the $i$-th SSH measurement $l\_i$ at location $(\lambda\_i, \phi\_i)$ simply reads

$$l\_i + v\_i = N(\lambda\_i, \phi\_i) + \zeta(\lambda\_i, \phi\_i) + o\_j,\qquad(8)$$

Here, $v\_i$ are the residuals and $o\_j$ is a mission-specific bias correction parameter which is estimated in addition. The bias of one selected reference mission is fixed to zero.
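One common way to realize the bias handling of (8) in a design matrix is an indicator column per non-reference mission; fixing the reference mission's bias to zero then simply means omitting its column. The following sketch uses toy dimensions and invented mission names to illustrate this layout (it is not the paper's code):

```python
import numpy as np

# Sketch of one row of the design matrix for observation equation (8):
# geoid basis values, MDT basis values, and a mission-bias indicator.
# The reference mission ("ref") gets no bias column (bias fixed to zero).
missions = ["ref", "mission-b", "mission-c"]   # invented names
n_geoid, n_mdt = 4, 3                          # toy parameter counts

def design_row(geoid_basis, mdt_basis, mission):
    bias = np.zeros(len(missions) - 1)
    if mission != "ref":
        bias[missions.index(mission) - 1] = 1.0
    return np.concatenate([geoid_basis, mdt_basis, bias])

row = design_row(np.ones(n_geoid), np.ones(n_mdt), "mission-c")
print(row)  # [1. 1. 1. 1. 1. 1. 1. 0. 1.]
```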

*SSH Observation Equations for Scenario B* The ocean variability is coestimated using the model function from (6). Especially for the geodetic missions it is not guaranteed that the ocean variability cancels, due to the missing, or at least very long, repeat cycle. Thus, the observation equations, which now depend on the measurement epoch, read

$$l\_i + v\_i = N(\lambda\_i, \phi\_i) + \zeta(\lambda\_i, \phi\_i) + \mathrm{OV}(\lambda\_i, \phi\_i, t\_i) + o\_j. \tag{9}$$

Parts of the ocean variability can now be absorbed by the deterministic function $\mathrm{OV}(\lambda\_i, \phi\_i, t\_i)$, which can reduce aliasing signals in the MDT.

# **2.3 Smoothness Conditions**

To support the separation and to make the resulting system of equations solvable, smoothness conditions are formulated for the geoid, the MDT and optionally the ocean variability. For this purpose, regularization matrices are constructed and applied in the adjustment process.

*Regularization of the Spherical Harmonics* The Kaula rule is used to determine degree-dependent weights of a diagonal regularization matrix for all spherical harmonic coefficients above degree 200, which cannot – at least not accurately – be determined from the assembled satellite-based geoid information (indicated as *SH medium* and *SH high deg* in Fig. 1). Regularization towards zero is applied using a small weight of 10<sup>-4</sup> just to make all spherical harmonic coefficients estimable.
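A hedged sketch of such a degree-dependent weighting: Kaula's rule of thumb states that coefficient magnitudes decay roughly as $10^{-5}/l^2$, so a diagonal weight proportional to $1/\sigma\_l^2 \propto l^4$ can be applied to all coefficients above a cut-off degree. The cut-off, scaling constant and function name below are illustrative assumptions, not the paper's exact choices.

```python
import numpy as np

# Degree-dependent diagonal regularization weights from Kaula's rule:
# sigma_l ~ 1e-5 / l^2  =>  weight ~ 1 / sigma_l^2 = l^4 / 1e-10,
# applied only above a cut-off degree (here 200, as in the text).
def kaula_weights(l_max, l_cut=200):
    l = np.arange(2, l_max + 1)
    w = np.where(l > l_cut, (l**4) / 1e-10, 0.0)
    return l, w

l, w = kaula_weights(600)
print(w[l <= 200].max())  # degrees up to the cut-off are not regularized: 0.0
```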

*Regularization of the MDT* The separation of the SSH into geoid and MDT only works for the long wavelengths, for which the geoid is accurately known from the satellite-based geoid. To support the separation, the assumption that the MDT is smooth can be added. For that purpose, the least-squares objective function is extended by the minimization of the norm of the gradient of the MDT, i.e. $\|\nabla \zeta\|\_{L^2(\Omega)}$. As the Argyris element uses full polynomials of degree five, the condition can be expressed as a (non-diagonal) regularization matrix, which is derived by a numerical quadrature using the control points and weights from Taylor et al (2005). An empirical weight is used in the combination.

**Fig. 1** Schematic view on the combination of normal equations for scenario A without, and B with coestimation of OV. (**a**) Structure of normal equations without estimating the ocean variability. (**b**) Structure of normal equations when coestimating the ocean variability

*Regularization of* OV Similar to the MDT, a regularization matrix is derived for the function which represents the ocean variability. Consequently, the objective function is further extended by adding the norm of OV's spatial gradient, $\|\nabla \mathrm{OV}\|\_{L^2(\Omega)}$. This leads to a block-diagonal regularization matrix with three blocks that are all identical to the MDT's regularization block. Individual weights for the amplitudes and the trend are determined by variance component estimation, resulting in 5.7, 2.9 and 6,816,929.4, respectively.

# **2.4 Combined Solution**

With the contributors, i.e. the satellite based geoid information, the SSH normal equations as derived from the observation equations in (8) or (9) and the regularization matrices, the least-squares normal equations can be assembled and solved for the unknown parameters.

For scenario A, without coestimation of the ocean variability, a schematic overview of the assembled and solved normal equations is provided in Fig. 1a. For scenario B, which coestimates the ocean variability, the structure of the normal equations is provided in Fig. 1b. The large-dimensional normal equations are assembled and solved in a massively parallel implementation on a high performance compute cluster.
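The combination step can be illustrated with a toy system: observation normal equations are stacked with a weighted regularization block and solved. Dimensions, the weight and all values are invented; the real systems involve several hundred thousand unknowns and run in parallel.

```python
import numpy as np

# Toy illustration of the combined least-squares solution:
# N x = n with N = A'A + lambda * R (regularized normal equations).
rng = np.random.default_rng(1)
A = rng.normal(size=(100, 10))   # toy design matrix (SSH observations)
l_obs = rng.normal(size=100)     # toy observation vector
R = np.eye(10)                   # toy smoothness/regularization matrix
lam = 1e-2                       # empirical regularization weight (assumption)

N = A.T @ A + lam * R            # combined normal equation matrix
n = A.T @ l_obs                  # combined right-hand side
x = np.linalg.solve(N, n)        # estimated parameters
print(x.shape)
```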

# **3 Configuration for the Numerical Experiments**

# **3.1 Study Region and Mesh**

To study the proposed coestimation of the long-term ocean variability, a numerical real data experiment is conducted.

**Fig. 2** Overview of the Agulhas study region used in the numerical experiments: the expected signal from the CNES-CLS18 MDT and the used triangulation (gray)

The Agulhas region (10°E to 40°E and 42°S to 20°S, cf. Fig. 2) is chosen as a local study region, as it contains regions with both smooth and strong geoid signal as well as low and high ocean variability. The domain of the finite element space is limited by a polygon derived from these borders and the coastlines. Using the jigsaw-geo package (Engwirda 2017), a triangulation is generated inside the polygon (see Fig. 2). The resulting mean edge length is about 175 km in the region of interest.

# **3.2 Used Data Sets**

For the gravity field information, the unregularized normal equations from the GOCO06s satellite-only gravity field model (Kvas et al 2021) are used. They are assembled in the spherical harmonic domain for degrees 2 to 300 and can be directly included in the estimation (cf. Fig. 1).


**Table 1** Characteristics of the used altimetry missions

Ten years of along-track SSH measurements are analysed for the period 01/2010 to 12/2019. The corrected L2P data as distributed by Aviso+ are selected and used for the ERMs Jason-1, Jason-2 and Jason-3 and the GM Cryosat-2. Geodetic mission phases from the Jason missions are used in addition (see Table 1). In total, more than 5 × 10<sup>6</sup> SSH observations are used. In the experiment, they are assumed to be uncorrelated, with a mission-specific variance derived by variance component estimation.
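A heavily simplified sketch of the mission-wise variance estimation: each mission's variance factor is estimated from its own residual square sum. This toy uses the group size as a redundancy proxy and synthetic residuals; a proper variance component estimation uses the true redundancy contributions and iterates until convergence.

```python
import numpy as np

# Simplified per-mission variance factor: sigma_j^2 = v_j' v_j / n_j.
# Mission names and noise levels are invented for illustration only.
rng = np.random.default_rng(2)
residuals = {
    "mission-a": rng.normal(scale=0.04, size=1000),  # ~4 cm noise (assumption)
    "mission-b": rng.normal(scale=0.06, size=1000),  # ~6 cm noise (assumption)
}
sigma2 = {m: float(v @ v) / v.size for m, v in residuals.items()}
for mission, s2 in sigma2.items():
    print(mission, round(float(np.sqrt(s2)), 3))
```

The estimated factors would then rescale each mission's observation weights before the normal equations are re-assembled.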

# **3.3 Scenario Configuration**

To show the effect of the coestimation of the ocean variability on the target quantities, i.e. the MDT and the geoid, two scenarios are computed. In scenario A, the geoid and the MDT are estimated (cf. Fig. 1a) using (8) as least-squares observation equations for the altimetric SSH. In contrast, scenario B utilizes (9) to jointly estimate the geoid, the MDT and the spatio-temporal model for the ocean variability (Fig. 1b).

In both scenarios, the basic settings are the same to obtain comparable solutions. The geoid is estimated from spherical harmonic degree 2 to $l\_{\text{max}} = 600$, i.e. 361,197 parameters. As only local SSH data are used, medium and high degree global spherical harmonics are regularized towards zero using the Kaula rule for the degree-dependent weighting. In both scenarios, the MDT is estimated with the same finite element space, i.e. using the mesh shown in Fig. 2 and the Argyris element. This results in 1195 unknown parameters for the MDT. To support the separation, the MDT is regularized by applying the smoothness condition (cf. Sect. 2.3). Scenario B additionally coestimates the ocean variability in the form of a linear trend and annual harmonics (cf. (6)) and thus estimates 3 × 1195 additional parameters, for which additional smoothness conditions are applied.

Full altimetry normal equations are assembled, which takes about 15 h to 20 h with 576 cores on the JUWELS supercomputer using a massively parallel implementation. The combination and solution of the normal equations (cf. Fig. 1) for the unknown parameters takes an additional 1 h to 2 h. Weights are derived by variance component estimation and some empirical tuning. The derived results for the geoid, the MDT and the ocean variability are presented and discussed in Sect. 4.

# **4 Results and Evaluation**

# **4.1 Comparison of the MDT and Geoid Estimates**

For both scenarios, the parameters $a\_{\text{MDT},k}$ for the MDT and $c\_{lm}$/$s\_{lm}$ for the Earth's gravity field are estimated. From these parameters, the MDT as well as the geoid can be evaluated and compared with each other and with reference models.

Figure 3 shows both MDT solutions as a difference to the established CNES-CLS18 MDT (Mulet et al 2021), which is adapted to the reference epoch 01/01/2015, as well as the difference between both solutions. In terms of RMS, the MDT estimated in scenario A shows a consistency of 5.1 cm compared to CNES-CLS18. It is dominated by a large systematic difference close to the coast of South Africa. The RMS in regions of low ocean variability is about 1 cm to 2 cm, i.e. 0.9 cm in the north-western part (orange box in Fig. 3a) and 1.6 cm in the north-eastern part (green box in Fig. 3a). It is significantly larger close to the main Agulhas current with 3.6 cm (red box in Fig. 3a), where a higher OV is expected. Despite the large systematic difference, the solution shows a good agreement with CNES-CLS18. The RMS is in the same order of magnitude as the RMS between the CNES-CLS18 MDT and the alternative DTU22 MDT model (Knudsen et al 2022), which is about 3.1 cm in the region of interest. However, the strong coastal difference indicates a systematic error in the MDT derived in scenario A – the MDT estimate does not include the strong coastal gradient which clearly shows up in both the CNES-CLS18 (cf. Fig. 2) and the DTU22 MDT.

**Fig. 3** Differences of the two MDT solutions evaluated on the grid as provided by the CNES-CLS18 MDT. The coloured boxes indicate regions for which the statistics are provided in the discussion. (**a**) A - CNES-CLS18 MDT. (**b**) B - CNES-CLS18 MDT. (**c**) A - B MDT

Figure 3b shows the same for the MDT estimated in scenario B (including the coestimation of OV). A comparison of Fig. 3a and b as well as the direct difference in Fig. 3c shows that both solutions are very similar and thus equivalent. The RMS values with respect to CNES-CLS18 are the same, and the RMS of the difference is below 2 mm (maximal/minimal differences are below ±1 cm in the regions of strongest variations). At the level of the MDT and the comparison shown here, it cannot be concluded that the estimated MDT benefits from the coestimation of OV.

Similarly, the estimated gravity field can be compared to existing higher-resolution models. For this purpose, Fig. 4 shows both estimated spherical harmonic series evaluated to degree 600 in terms of geoid height differences to the XGM2019 model (Zingerle et al 2019) evaluated to degree 760. As a comparison, Fig. 4a shows the difference of the used GOCO06S model (at degree 250) to the XGM2019 model (degree 760). This difference is dominated by the additional higher-frequency signal of XGM2019, and the RMS is about 20 cm. The difference for the estimated geoid from scenario A is shown in Fig. 4b. Obviously, the differences are significantly reduced; the RMS is 2.0 cm for the orange, 3.5 cm for the green and 5.3 cm for the red region. In the entire region, the RMS is 10.0 cm and again dominated by a large difference close to the coast of South Africa. This allows two conclusions to be drawn. Firstly, as the RMS is significantly reduced, the geoid of GOCO06S is significantly improved for the higher frequencies in the joint estimation; thus, the estimated gravity field is successfully improved locally from the SSH measurements. Secondly, as the large coastal difference now shows up with an inverted sign compared to the MDT, it is confirmed that the separation failed in the coastal area: the strong gradient missing in the MDT entered the geoid.

Similar to the MDT results, the geoid determined in scenario B is equivalent to the geoid derived in scenario A; Fig. 4c shows hardly any difference. Figure 4d shows the differences, for which the RMS is below 1.0 cm. The same conclusions as for the MDT solutions can be drawn: it cannot be demonstrated that the solution which coestimates OV is superior to the solution without.

# **4.2 Estimates of Ocean Variability**

As neither the MDT nor the geoid improved, the estimates for the ocean variability are compared to gridded sea level anomaly products (daily DUACS Level 4 gridded SLA DT2018, Taburet et al (2019)) to validate the coestimated spatio-temporal signal. For this purpose, trends, amplitudes and phases are estimated independently for each cell of the DUACS grid (see Fig. 5, first row) using a least-squares regression.
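The per-cell regression can be sketched for a single cell: a trend plus one annual harmonic is fitted by least squares, and amplitude and phase follow from the two harmonic coefficients. The time series below is synthetic; the time unit (days) and the signal values are assumptions for illustration.

```python
import numpy as np

# Fit trend + annual harmonic to one grid cell's SLA time series:
#   sla(t) ~ e1*t + e2*sin(omega*t) + e3*cos(omega*t)
# amplitude = sqrt(e2^2 + e3^2), phase = atan2(e3, e2).
t = np.arange(0.0, 3650.0, 10.0)                  # ten years, 10 d sampling
omega = 2.0 * np.pi / 365.25
sla = 0.002 * t + 0.05 * np.sin(omega * t + 0.5)  # synthetic cell time series

A = np.column_stack([t, np.sin(omega * t), np.cos(omega * t)])
e1, e2, e3 = np.linalg.lstsq(A, sla, rcond=None)[0]
amplitude = np.hypot(e2, e3)
phase = np.arctan2(e3, e2)
print(round(float(amplitude), 3), round(float(phase), 2))  # recovers 0.05 and 0.5
```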

Using (7), the same quantities can be derived for all grid locations from the parameters estimated in scenario B. There, $\sum\_{l \in L} e\_{\text{OV},1,l}\, b\_l(\lambda, \phi)$ reflects the trend estimated for location $(\lambda, \phi)$. Similarly, amplitudes and phases of the seasonal harmonic can be derived from $\sum\_{l \in L} e\_{\text{OV},2,l}\, b\_l(\lambda, \phi)$ and $\sum\_{l \in L} e\_{\text{OV},3,l}\, b\_l(\lambda, \phi)$ in the domain of interest. These derived quantities are shown in Fig. 5, second row. A good agreement is visible; most of the dominant features in the maps derived from the gridded data are also visible in the maps derived from the coestimated quantities. This confirms that the coestimation works – under the assumption that the chosen model (5) sufficiently reflects the true OV.

To show the quality of this simple model, individual time series can be analysed. From the gridded SLA product, a time series for a single grid point at location $(\lambda\_c, \phi\_c)$ can be easily extracted. The time series for the coestimated model follows from (7) for the location $(\lambda\_c, \phi\_c)$ as a one-dimensional function in the time domain

$$f(t) := \mathrm{OV}(\lambda\_c, \phi\_c, t), \quad f: \mathbb{R} \mapsto \mathbb{R}. \tag{10}$$

**Fig. 4** Differences of the two gravity field solutions evaluated for geoid heights. The coloured boxes indicate regions for which the statistics are provided in the discussions. (**a**) GOCO06S (lmax = 250) − XGM2019. (**b**) A − XGM2019. (**c**) B − XGM2019. (**d**) A − B

**Fig. 5** Spatial maps for trends (cm/yr), seasonal amplitudes (cm) and phases (°). First row: from regression of the DUACS gridded SLA maps. Second row: derived from the parameters estimated in scenario B. (**a**) Trends (cm/yr). (**b**) Amplitudes (cm). (**c**) Phases (°). (**d**) Trends (cm/yr). (**e**) Amplitudes (cm). (**f**) Phases (°)

Figure 6 shows both the time series from the DUACS product and the function estimated in scenario B, for a region of low as well as a region of high ocean variability. The first time series is evaluated in the north of the orange region of very low ocean variability (13.375° E, 20.875° S). It is visible in Fig. 6a that the seasonal signal is dominant, but seasonal variations are below ±5 cm. The estimated model (blue curve) nicely captures this main feature and approximates the variability. In contrast, the second time series shown in Fig. 6b is close to the Agulhas current, where strong ocean variability is expected (red box, 17.325° E, 39.125° S). The DUACS SLA product shows strong high-frequency variations in the range of ±1 m, whereas the seasonal signal is hardly visible. Consequently, the estimated model is a poor approximation of the OV; the dominant signal is not captured by a model like (5).

# **5 Summary and Conclusion**

In this contribution, the parametric joint estimation of the MDT and geoid is extended by the coestimation of a spatio-temporal model component. This component is supposed to model and compensate long-term ocean variability, to avoid aliasing into the mean – i.e. static – geoid and MDT models.

For this purpose, a parametric approach is chosen in which both geoid and MDT are modelled by continuous functions: the geoid in terms of spherical harmonics and the MDT by finite element basis functions. Similar to Borlinghaus et al (2023), a spatio-temporal function is designed to model the OV by combining separable functions – finite element basis functions in the spatial domain and a trend plus a seasonal harmonic in the time domain. The parameters are jointly estimated from altimetric SSH measurements, with (9) and (6) as flexible observation equations, and a satellite-based gravity field model, applying some smoothness conditions.

A numerical real-data experiment is performed to study the coestimation of the OV. For this purpose, MDT and geoid are estimated without (scenario A) and with this extension (scenario B). Comparing the results among each other and to reference models, no obvious improvements could be shown for the MDT or the geoid.


Although it is shown that a basic model of (linear) trend and annual harmonic is not sufficient to model the OV close to strong current systems, it is demonstrated that the coestimation is working in general. Estimated trends, amplitudes and phases are similar to those derived from gridded sea level anomaly products.

**Fig. 6** Sea level anomaly time series from the DUACS product (orange) for two grid points compared to the function coestimated in scenario B (blue). (**a**) Location in region of low ocean variability. (**b**) Location in region of high ocean variability

Consequently, more advanced models in the temporal domain are required. For example, Borlinghaus et al (2023) proposed the use of B-splines when coestimating the OV while estimating the mean sea surface. However, this would add hundreds of thousands of parameters to a parameter space that is already large (more than 360,000 parameters); this is not yet operational and is subject to future work.

**Acknowledgements** The work is financially supported by the DFG project "PArametric determination of the dynamic ocean topography from geoid, altimetric sea surface heights and SAR derived RAdial SURface Velocities – PARASURV" (BR5470/1-1). Furthermore, parts of this work were funded by the TRA Modelling (University of Bonn) as part of the Excellence Strategy of the federal and state governments. The authors gratefully acknowledge the Gauss Centre for Supercomputing e.V. (gauss-centre.eu) for funding this project by providing computing time through the John von Neumann Institute for Computing on the GCS Supercomputer JUWELS at Jülich Supercomputing Centre and the University of Bonn for the granted access to the Bonna cluster. The altimeter products were produced and distributed by Aviso+ (https://www.aviso.altimetry.fr/), as part of the Ssalto ground processing segment.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **A Flexible Family of Compactly Supported Covariance Functions Based on Cutoff Polynomials**

Till Schubert and Wolf-Dieter Schuh

#### **Abstract**

In time series analysis, signal covariance modeling is an essential part of stochastic methods like prediction or filtering. In geodetic applications, covariance functions are rarely treated as truly compactly supported functions, although the large amounts of data would warrant such treatment. Covariance models for complex correlation shapes are also rare. Ideally, general families of covariance functions with a large flexibility are desirable to model complex correlation structures like negative correlations. In this paper, we derive isotropic finite covariance functions that are parametrized in a way that guarantees positive definiteness. These are based on cutoff polynomials which are derived from operations such as autoconvolution and autocorrelation. Next to the compact support, the resulting autocovariance models share the advantages of (a) positive definiteness by design, (b) extensibility to arbitrary orders and (c) extensive flexibility by employing multiple tunable shape parameters. These realize various correlation shapes such as negative correlations (the so-called hole effect) and several oscillations. The methodological concepts are derived for homogeneous and isotropic random fields in R<sup>d</sup>. The family of covariance functions is then derived for one-dimensional applications. A data example demonstrates the covariance modeling approach using stationary time-series data.

#### **Keywords**

Collocation - Covariance modeling - Finite covariance functions - Stochastic processes

# **1 Introduction and Related Work**

Signal covariance modeling is an important part of many stochastic prediction methods. Within that, finite covariance functions are of particular use. The finite support leads to many zero elements in the covariance matrices, which allows the use of sparse data structures and efficient solvers. Both yield advantages in runtime and memory allocation and also enable the handling of large tasks. In geostatistical applications, covariance functions are rarely treated as truly compactly supported functions, although the enormous amount of data would warrant such treatment, see e.g. Sansò and Schuh (1987) and Schuh (2016).

In time series analysis, simply parametrized covariance functions are helpful for the statistical analysis of data. This implies the use of functions depending on a number of tunable shape parameters. Such a parametrization is achieved by a construction using certain operations such as autoconvolution and autocorrelation.

Several covariance functions organized as a family or class exist. Sansò and Schuh (1987), Wendland (1995), Wu (1995), Gaspari and Cohn (1999) and Buhmann (2001) have introduced such classes of finite covariance functions, given by (a) polynomials, (b) trigonometric expressions, (c) rational functions, or (d) combinations of the former. All are defined on a compact support and may be given in a piecewise definition within the finite support.

T. Schubert (✉) · W.-D. Schuh

Theoretical Geodesy Group, Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany e-mail: schubert@geod.uni-bonn.de

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_200

One well known compactly supported covariance model of interest is the so-called spherical covariance function (Wackernagel 1998; Chilès and Delfiner 1999) which is very often used in geostatistics (Webster and Oliver 2007) and also appears in textbooks on stochastic processes (Priestley 1981). The derivations in this paper use the spherical covariance function

$$\gamma(\tau) = \begin{cases} \sigma^2 \left( 1 - \frac{3}{2} \frac{|\tau|}{b} + \frac{1}{2} \frac{|\tau|^3}{b^3} \right) & \text{for } |\tau| \le b, \\ 0 & \text{otherwise,} \end{cases} \quad (1)$$

(Matheron 1965, p. 57; Priestley 1981, Eq. (3.7.93); Wackernagel 1998, p. 56; Chilès and Delfiner 1999, p. 81; Webster and Oliver 2007, p. 87) as an initial example and a starting point. Notably, the spherical covariance model can be constructed by a 3D autoconvolution of a unit sphere, but also by a 1D autocorrelation of a first-order monomial cut on the interval $[0, 1]$.
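The 1D construction can be checked numerically. The following sketch (discretization settings are arbitrary) self-correlates $h(t) = t$ on $[0, 1]$ and compares the normalized result with the closed form of Eq. (1):

```python
import numpy as np

def spherical(tau, b=1.0):
    # closed-form spherical covariance, Eq. (1), with sigma^2 = 1
    x = np.abs(np.asarray(tau, dtype=float)) / b
    return np.where(x <= 1.0, 1.0 - 1.5 * x + 0.5 * x ** 3, 0.0)

def autocorr_monomial(tau, b=1.0, n=200000):
    """1D self-correlation of h(t) = t on [0, b], normalized to
    variance 1 (midpoint rule)."""
    dt = b / n
    t = (np.arange(n) + 0.5) * dt
    def g(s):
        mask = t + s <= b            # integrand vanishes beyond b
        return float(np.sum(t[mask] * (t[mask] + s)) * dt)
    g0 = g(0.0)
    return np.array([g(s) / g0 for s in np.atleast_1d(tau)])

taus = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
err = float(np.max(np.abs(autocorr_monomial(taus) - spherical(taus))))
print(err < 1e-3)  # → True
```

The normalized autocorrelation of the ramp indeed reproduces $1 - \frac{3}{2}\tau + \frac{1}{2}\tau^3$.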

The present article uses techniques different from those of the established families, primarily by using self-correlation instead of self-convolution. Furthermore, the suggested autocovariance models combine the advantages of (a) guaranteed positive definiteness in R<sup>1</sup>, similar to the models of Sansò and Schuh (1987) and Gaspari and Cohn (1999), (b) extensibility to arbitrary orders, as e.g. the families of Wendland (1995) and Wu (1995), and (c) extensive flexibility by employing multiple tunable shape parameters, as is partially done in Gaspari and Cohn (1999).

Many empirical covariance structures of real-world problems decrease to a minimum below zero, i.e. they exhibit negative correlations in a certain interval, cf. e.g. Daley et al. (2015). This phenomenon is called the hole effect (e.g. Journel and Froidevaux 1982; Webster and Oliver 2007), and several globally supported covariance models exist for that purpose, e.g. a damped cosine, see e.g. Gneiting (1999) and Schubert et al. (2020).

The family of covariance functions presented here is able to handle different correlation patterns, such as negative correlations and several oscillations, while providing finite support. Furthermore, we show a way of easily constraining the function to higher classes of continuous differentiability. All in all, this flexibility is needed when the functions are fitted to data-derived empirical autocovariances.

The paper is organized in the following way. Section 2 provides the basic methodology on autocovariance functions. Section 3 introduces the new methodological concepts and the definition of the family of covariance functions which is followed by a data example in Sect. 4. The family of covariance functions is derived only for validity in R<sup>1</sup>, but an outlook is given for deriving similar families with validity in R<sup>2</sup> and R<sup>3</sup>.

# **2 Methodology**

# **2.1 Autocovariance Functions and Positive Semi-Definiteness**

Autocovariance functions $\gamma(h)$ of a discrete stochastic process have to be positive semi-definite, i.e.,

$$\sum\_{i,j=1}^{N} z\_i \gamma(t\_i - t\_j) z\_j \ge 0 \text{ for any } z\_i, z\_j \in \mathbb{R} \tag{2}$$

with discrete lag $h = |t\_i - t\_j| = k\,\Delta t$, $k \in \mathbb{Z}$ (cf. Brockwell and Davis 1991, Prop. 1.5.1; Yaglom 1987, Eq. (1.28)).
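Condition (2) can be illustrated numerically: for a covariance matrix built from a known valid model (the exponential function below is only an example, not one of the models of this paper), the quadratic form is non-negative for arbitrary coefficient vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(50.0)
# a known valid autocovariance as an example: gamma(h) = exp(-|h| / 5)
Gamma = np.exp(-np.abs(np.subtract.outer(t, t)) / 5.0)

# Eq. (2): the quadratic form is non-negative for arbitrary z
qf = np.array([z @ Gamma @ z for z in rng.standard_normal((100, 50))])
print(bool(np.all(qf > 0.0)))  # → True
```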

For analytical covariance functions $\gamma(\tau)$ depending on a lag $\tau$, the positive definiteness requirement (Eq. 2) translates to non-negativity of the Fourier transform of the covariance function, known as Bochner's theorem (Bochner 1955; Schuh 2016).

In this paper, we will define a family of covariance functions $\gamma\_n(\tau)$, where $n$ is the order, which can for instance be the degree of an expansion as a polynomial. Furthermore, for the sake of brevity we restrict to $\tau \ge 0$ and omit the vertical bars.

In this paper, we deal with autocovariance functions $\gamma(\tau)$ of compact support. For reasons of brevity, the common notation of the plus subscript $(1 - \tau)\_+$, which corresponds to $\max(0, 1 - \tau)$, i.e. signifying a cutoff at $\tau = 1$, is extended to $(1 - \tau/b)\_+$, indicating a cutoff at the range parameter $b$ as in Eq. (1). However, due to the fact that we deal with non-monotonous functions, we use the notation $(1 - \tau/b)\_{(\tau \le b)}$.

# **2.2 Operations on Covariance Functions**

It is well known that operations such as summation, scaling, multiplication and convolution are admissible operations on positive semi-definite covariance functions, naturally with positive weights to preserve positive definiteness. For instance, covariance tapering is the multiplication of an arbitrary covariance function with a finite one to obtain a function of finite support, see e.g. Furrer et al (2006). However, even if one of the functions involved in a product is not positive definite, the result can nonetheless be positive definite. Hence, it is beneficial to define covariance models that are positive definite by design, due to the fact that general models exploit the full parameter space whilst nested models imply restrictions.
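The tapering remark can be made concrete with a small example (an illustration, not part of the paper): multiplying a globally supported exponential covariance by the compactly supported spherical correlation yields a function that is exactly zero beyond the taper range, which is what enables sparse storage:

```python
import numpy as np

def exponential_cov(tau, sigma2=1.0, L=1.0):
    # globally supported exponential covariance (illustrative choice)
    return sigma2 * np.exp(-np.abs(tau) / L)

def spherical_taper(tau, b=1.0):
    # compactly supported spherical correlation used as a taper
    x = np.abs(np.asarray(tau, dtype=float)) / b
    return np.where(x <= 1.0, 1.0 - 1.5 * x + 0.5 * x ** 3, 0.0)

lags = np.linspace(0.0, 5.0, 501)
tapered = exponential_cov(lags) * spherical_taper(lags, b=2.0)

# beyond the taper range b the product is exactly zero
print(bool(np.all(tapered[lags > 2.0] == 0.0)))  # → True
```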

This work focuses on the operations of autoconvolution and autocorrelation. The former gives rise to the use of B-splines (linear and higher orders) as autocovariance functions, for example. The operations self-convolution (autoconvolution)

$$\gamma(\tau) = (f \ast f)(\tau) = \int\_{\mathbb{R}^d} f(t)\, f(\tau - t)\, dt \qquad (3)$$

and self-correlation (autocorrelation)

$$\gamma(\tau) = (h \star h)(\tau) = \int\_{\mathbb{R}^d} h(t)\, h(\tau + t)\, dt \qquad (4)$$

of the indicator functions $f(t)$ and $h(t)$ differ only in the sign of $t$ in the second factor. Let us use the following notation to introduce, analogous to self-convolution (autoconvolution) $(f \ast f)$, the operation self-correlation (autocorrelation)

$$\gamma(\tau; \sigma, b, \mathbf{d}) = (h \star h)(\tau; \sigma, b, \mathbf{d}) \tag{5}$$

and denote the involved functions by the autoconvolution kernel $f(t; b, \mathbf{d})$ and the autocorrelation kernel $h(t; b, \mathbf{d})$. $f(t)$ and $h(t)$ are also called indicator functions. $b$ is the range parameter signifying the length of support, while the $d\_i$ in the vector $\mathbf{d}$ are shape parameters. The autocovariance function $\gamma(\tau)$ has a scale parameter given by the variance $\sigma^2$ such that $\gamma(0) = \sigma^2$.

# **2.3 Positive Definiteness in** R*<sup>d</sup>*

The application in spatial domains requires positive semidefiniteness of the covariance function in higher dimensions R<sup>d</sup> , which is derived here.

In multivariate problems, the distance $r$ is taken as the Euclidean norm of the $d$-dimensional vector $\mathbf{h}$, and the isotropic (radial) covariance function reads $\gamma(r) = \gamma(\|\mathbf{h}\|)$ with $\mathbf{h} \in \mathbb{R}^d$. For applications with data in higher dimensions, e.g. spatial data, the reduction to a one-dimensional distance-like norm (e.g. Euclidean) does not guarantee positive definiteness of the covariance function. Instead, the Bochner theorem is generalized to non-negativity $\mathcal{H}\_d\{\gamma(r)\} := S(s) \ge 0 \ \forall s$ of the Hankel transform $\mathcal{H}\_d\{\gamma(r)\}$, building a spectrum $S(s)$ (Bochner 1955; Moritz 1976; Lantuéjoul 2002, p. 25).

The Hankel transform $\mathcal{H}\_d\{\cdot\}$ of order $d$ for an isotropic covariance function $\gamma(r)$ in $\mathbb{R}^d$ is generally defined as (see e.g. Sneddon 1951, Sec. 12; Yaglom 1987, p. 353; Chilès and Delfiner 1999, p. 68; Lantuéjoul 2002, p. 241)

$$\mathcal{H}\_d\{\gamma(r)\} := S(s) = \frac{1}{(2\pi)^{d/2}\, s^{d/2-1}} \int\_0^\infty \mathrm{J}\_{d/2-1}(s\, r)\, r^{d/2}\, \gamma(r)\, dr \tag{6}$$

where $\mathrm{J}\_\nu(\cdot)$ is the Bessel function of the first kind (called J-Bessel) of order $\nu$.
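As a numerical illustration of Eq. (6) (an example, not from the paper): for the spherical model the integration runs only up to $b$, and since the model is positive definite in R<sup>3</sup>, the resulting spectrum must be non-negative. A sketch using `scipy` quadrature and Bessel functions:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import jv

def spherical_cov(r, b=1.0):
    # closed-form spherical covariance, Eq. (1), scalar r >= 0, sigma^2 = 1
    return 1.0 - 1.5 * r / b + 0.5 * (r / b) ** 3 if r <= b else 0.0

def hankel_spectrum(s, d=3, b=1.0):
    """Numerical evaluation of Eq. (6); for compact support the
    integration stops at the range b."""
    nu = d / 2 - 1
    integrand = lambda r: jv(nu, s * r) * r ** (d / 2) * spherical_cov(r, b)
    val, _ = quad(integrand, 0.0, b, limit=200)
    return val / ((2.0 * np.pi) ** (d / 2) * s ** (d / 2 - 1))

# the spherical model is positive definite in R^3, so S(s) >= 0
ss = np.linspace(0.1, 30.0, 100)
S = np.array([hankel_spectrum(s) for s in ss])
print(bool(S[0] > 0.0), bool(S.min() > -1e-6))  # → True True
```

The small negative tolerance only absorbs quadrature round-off near the zeros of the spectrum.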

Next to the transition to the spectral domain given by the Hankel transform, the convolution and correlation theorems also translate to the higher dimensions and their respective transforms. As a result, a self-convolution (Eq. 3) of a function $f(r)$ in $\mathbb{R}^d$ to generate $\gamma(r)$ corresponds to a squaring of $\mathcal{H}\_d\{f(r)\}$ (convolution theorem)

$$S(s) := \mathcal{H}\_d\{\gamma(r)\} = \mathcal{H}\_d\{f(r)\}^2 \tag{7}$$

and hence guarantees positive definiteness only if $f(r)$ is symmetric.

The crucial difference between self-convolution and self-correlation is that the autocorrelation operation (Eq. 4) translates to the operation of a true absolute value (Euclidean norm) in the (potentially complex-valued) spectral domain (correlation theorem)

$$S(s) := \mathcal{H}\_d\{\gamma(r)\} = \left| \mathcal{H}\_d\{h(r)\} \right|^2 \tag{8}$$

which ensures non-negativity (Bochner's theorem) for any parity of $h(r)$, see e.g. Chilès and Delfiner (1999, Eq. (2.30)). Thus, autocorrelation allows non-symmetric and even one-sided indicator functions $h(t)$, whereas autoconvolution is restricted to symmetric indicator functions. A reasonable assumption now is that self-correlation can in general produce more flexible functions, due to the non-symmetric nature of the indicator function and the guaranteed non-negative spectrum $S(s)$.
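The correlation theorem can be checked on a discrete grid: whatever one-sided, asymmetric kernel is chosen, the power spectrum $|H|^2$ is non-negative, and its inverse transform is a symmetric (circular) autocorrelation. A minimal sketch with NumPy's FFT (the ramp kernel below is an arbitrary choice):

```python
import numpy as np

n = 1024
# one-sided, non-symmetric indicator function h(t) on a discrete grid
h = np.zeros(n)
h[:64] = 1.0 - 0.7 * np.linspace(0.0, 1.0, 64)  # decreasing ramp

H = np.fft.fft(h)
S = np.abs(H) ** 2           # correlation theorem: spectrum of h ⋆ h
gamma = np.fft.ifft(S).real  # circular autocorrelation of h

# Bochner: the spectrum is non-negative by construction
print(bool(np.all(S >= 0.0)))                       # → True
# and the autocorrelation is symmetric: gamma[k] == gamma[n-k]
print(bool(np.allclose(gamma[1:], gamma[:0:-1])))   # → True
```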

# **3 Methodological Advances**

Established families of compactly supported covariance functions rarely exhibit negative correlations, the hole effect. Gneiting (2002, Sec. 2.3) introduces oscillatory compactly supported functions based on the so-called turning bands operator (e.g. Matheron 1973; Lantuéjoul 2002). In fact, the turning bands operator (also TBM, Turning Bands Method) can retrieve the covariance model of the same type which has maximum hole effect and which builds the boundary of the domain of validity, i.e. positive semi-definiteness in a dimension of interest. See e.g. the covariance model obtained when applying the TBM operator to the spherical model, which amounts to $\gamma(\tau) = \left(1 - 3\tau + 2\tau^3\right)\_{(\tau \le 1)}$, see Chilès and Delfiner (1999, Tab. A.2), and builds a limit case for positive definiteness in 1D.

As this only retrieves limit cases and not fully flexible models, allowing polynomial coefficients to vary arbitrarily within the bounds of validity can provide the full flexibility. This is the general idea of this paper and will be derived in the next sections.

# **3.1 Covariance Functions Given by Cutoff Polynomials**

We desire covariance models $\gamma(\tau)$ or $\gamma(r)$ of polynomial type. The general idea is an extension to allow variable polynomial coefficients $a\_i$ and define

$$\gamma(\tau) = \begin{cases} \sigma^2 \left( 1 + \sum\_{i=1}^n a\_i \left( \frac{\tau}{b} \right)^i \right) & , \tau \le b \; , \\ 0 & , \tau > b \; . \end{cases} \quad (9)$$

The vector $\mathbf{a} = [a\_0, a\_1, \ldots, a\_n]$ includes the polynomial coefficients $a\_i$, among which $a\_0$ is covered by the variance $\sigma^2$. With the purpose of defining a family of covariance functions in this section, the family is subscripted by the degree $m$ of the indicator function as $\gamma\_m(\tau)$ and thus linked to the number of defining shape parameters. This is preferable to a subscript by the polynomial degree $n$, which takes only odd values.

For functions $\gamma(r)$ with finite support, the integration in Eq. (6) is done up to a fixed upper limit, which leads to a particular form of a Riemann–Liouville integral. The Hankel transform of compactly supported monomials $\mathcal{H}\_d\{r^k\_{(r \le 1)}\}$ is given by algebraic combinations of rational, trigonometric, J-Bessel, Struve and Lommel functions (cf. Gradshteyn and Ryzhik 2000, Eqs. (6.561)) but has a compact notation given by (cf. Gradshteyn and Ryzhik 2000, Eq. (6.569); Erdélyi 1954, Eq. (13.1.56), p. 193)

$$\mathcal{H}\_d\left\{r^k\_{(r \le 1)}\right\} := S(s) = \frac{2^{1-d}}{\pi^{d/2}\, \Gamma\left(\frac{d}{2}\right) (d+k)}\ {}\_1\mathrm{F}\_2\left(\frac{d+k}{2}; \frac{d}{2}, \frac{d+k}{2} + 1; -\frac{s^2}{4}\right) \tag{10}$$

where ${}\_1\mathrm{F}\_2(\cdot\,; \cdot, \cdot\,; \cdot)$ is one particular form of the generalized hypergeometric function. Eventually, linear combinations of Eq. (10), given by the weighted sum using the polynomial coefficients $a\_i$, provide the result for the Hankel transform of a general compactly supported polynomial and thus an evaluation of the positive definiteness in various dimensions $\mathbb{R}^d$. Eq. (10) is independent of the support range $b$, and the transforms $\mathcal{H}\_d\{r^k\_{(r \le 1)}\}$ are solely weighted by the $a\_i$.

The spectra $S(s)$ for covariance models given by combinations of Eq. (10) may follow typical spectra of shaping filters in the sense that designated extrema in the spectrum are modelled, corresponding to the oscillating behavior of the autocovariance function. Beyond that, however, they are of oscillating and slowly attenuating nature. It should be noted here that the validation of $S(s) \ge 0$ can be cumbersome, as the spectrum might asymptotically reach a negative value. The use of an indicator function bypasses this problem and can guarantee positive semi-definiteness in $\mathbb{R}^d$. This will be done in the next section, however only for $\mathbb{R}^1$.

Note that Hristopulos (2015) uses compactly supported polynomials in the spectral domain and achieves a linear combination of Eq. (10) as a family of globally supported covariance functions in time domain, called the Bessel-Lommel covariance functions.

# **3.2 Idea of Parametrization**

From this point on the paper will deal only with univariate autocorrelation leading to a family of covariance functions valid in R<sup>1</sup>.

The idea is that the autocovariance function is generated from an analytical (univariate) self-correlation

$$\gamma(\tau) = (h \star h)(\tau) = \int\_{-\infty}^{\infty} h(t)\, h(t + \tau)\, dt \qquad (11)$$

(see e.g. Yaglom 1987, Eq. (2.45); Chilès and Delfiner 1999, Eq. (2.30); Eq. (7.22); Lantuéjoul 2002, p. 190; Schlather 2012, Eq. (2.12) and Iske 2018, Cor. 8.12) of an indicator function $h(t)$. In order to achieve compactly supported covariance functions, one has to use a compactly supported indicator function.

In general, it proves beneficial to regard the function as generated from an autocorrelation operation, such that several properties of covariance functions are automatically satisfied and the covariance functions are positive definite by design.

In order to achieve valid covariance models for 1D, we first restrict ourselves to covariance functions generated from one-dimensional (univariate) autocorrelation operations. The resulting functions can nonetheless serve as isotropic covariance functions in R<sup>d</sup>, but only if positive definiteness in the higher dimension is ensured additionally.<sup>1</sup> On the other hand, covariance models valid in a certain dimension can always be used in a lower dimension.

In contrast to autoconvolution, autocorrelation allows non-symmetric and even one-sided indicator functions. Although not demonstrated here, self-correlation has the advantage of enabling one more lobe to be modelled with each order, which is not possible with self-convolution. Furthermore, the maximum hole effect can only be achieved by self-correlation. As a result, we adopt the assumption that self-correlation can in general produce more flexible functions. In addition, the indicator functions are restricted, without loss of generality, to one-sided functions.

<sup>1</sup>Equation (10) serves as an evaluation of the spectrum given by $\mathcal{H}\_d\{\gamma(r)\}$, which can be checked for non-negativity.

Clearly, the number of intervals in the piecewise definition of the indicator function plays a role. A desirable property of the self-correlation is the possibility to achieve an autocovariance function consisting of just one defined interval (apart from the symmetric counterpart and the zero outside the support range $b$) by an arbitrary indicator function that shares this property. Hence, we restrict to one-sided indicator functions defined by a polynomial in a single interval, although multi-interval covariance models are also common, see Gaspari and Cohn (1999).

# **3.3 Parameterizing Polynomials by Univariate Self-Correlation**

In order to achieve valid covariance functions for 1D, the idea is that the autocovariance function is generated from a univariate self-correlation (Eq. 11). Doing so using a one-sided indicator function $h(t)$ of cutoff polynomial type of degree $m$,

$$h(t) = \begin{cases} \alpha \sum\_{i=0}^{m} d\_i \left( \frac{t}{b} \right)^i & , \ 0 \le t \le b \\ 0 & , \ t < 0, \ t > b \end{cases}, \tag{12}$$

constructs an autocovariance function by

$$\gamma(\tau) = \int\_0^{b-\tau} h(t)\, h(t + \tau)\, dt \tag{13}$$

which automatically fulfills the symmetry $\gamma(\tau) = \gamma(-\tau)$.

The operation yields covariance models $\gamma(\tau)$ of polynomial type (Eq. 9) of degree $n = 2m + 1$, with coefficients $a\_i$ depending on a set of defining parameters $d\_i$. The scalar $\alpha$ signifies a scaling by the total overlapping area in the integration and is applied to realize $\gamma(0) = \sigma^2$. It depends on all parameters $d\_i$ but is not specified further, as it will be replaced by $\sigma^2$.
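A numerical sanity check of this construction (an illustration, not from the paper): discretizing Eqs. (12) and (13) for arbitrary shape parameters and assembling a covariance matrix on a regular grid should give non-negative eigenvalues, since the function is positive definite by design. The shape parameters below are arbitrary example values:

```python
import numpy as np
from scipy.linalg import toeplitz

def indicator(t, d, b=1.0):
    """One-sided cutoff-polynomial indicator h(t), Eq. (12), alpha = 1."""
    t = np.asarray(t, dtype=float)
    poly = sum(di * (t / b) ** i for i, di in enumerate(d))
    return np.where((t >= 0) & (t <= b), poly, 0.0)

def autocov(tau, d, b=1.0, n=200000):
    """Numerical self-correlation, Eq. (13), normalized to gamma(0) = 1."""
    dt = b / n
    t = (np.arange(n) + 0.5) * dt
    h0 = indicator(t, d, b)
    def g(s):
        return float(np.sum(h0 * indicator(t + s, d, b)) * dt)
    g0 = g(0.0)
    return np.array([g(abs(s)) / g0 for s in np.atleast_1d(tau)])

# arbitrary shape parameters for m = 2 (illustrative values)
d = [1.0, -2.5, 1.3]
gamma = autocov(np.arange(30) * 0.05, d)
C = toeplitz(gamma)  # covariance matrix on a regular grid
print(bool(np.linalg.eigvalsh(C).min() > -1e-6))  # → True
```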

As an example, we look at the self-correlation of a finite straight-line indicator function, i.e. $m = 1$. For this case only, the operation requires only one<sup>2</sup> defining parameter $d\_1$, and the result is a cubic polynomial with only constant, linear and cubic terms, given by

$$\begin{split} \gamma\_{m=1}(\tau) &= \sigma^2 \left( 1 - \left( \frac{d\_1^2}{2 d\_1^2 + 6 d\_1 + 6} + 1 \right) \frac{\tau}{b} \right. \\ &\left. \quad + \left( \frac{d\_1^2}{2 d\_1^2 + 6 d\_1 + 6} \right) \left( \frac{\tau}{b} \right)^3 \right)\_{(\tau \le b)}. \end{split} \tag{14}$$

**Fig. 1** Covariance models corresponding to the flexible family $\gamma\_{m=1}(\tau)$ (Eq. 14). The colors are used for visual separability. The spherical model (dashed line) is the subcase for $d\_1 = -1$ or $c\_0 = 0.5$

For this case, the expressions for $a\_1$ and $a\_3$ fulfill the linear condition $a\_1 + a\_3 = -1$, such that

$$a = \begin{bmatrix} 1, & -c\_0 - 1, & 0, & c\_0 \end{bmatrix} \tag{15}$$

can be given as an equivalent expression for Eq. (14) using a different defining parameter $c\_0$. However, the rational expression for $a\_3$ in Eq. (14) leads to $c\_0 \in [0, 2]$. These parameter bounds need to be enforced additionally when using Eq. (15), whilst they are guaranteed by the parametrization (14). As a result, the parametrization using $d\_i$ and Eq. (12) is favored despite its non-linear nature.
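The $m = 1$ family can be verified numerically: the closed form of Eq. (14) agrees with a brute-force self-correlation of $h(t) = 1 + d\_1 t/b$, and $d\_1 = -1$ reproduces the spherical model. A sketch under arbitrary discretization settings:

```python
import numpy as np

def gamma_m1(tau, d1, b=1.0):
    """Closed form of Eq. (14) for the m = 1 family (sigma^2 = 1)."""
    c0 = d1 ** 2 / (2 * d1 ** 2 + 6 * d1 + 6)      # equals a3
    x = np.abs(np.asarray(tau, dtype=float)) / b
    return np.where(x <= 1.0, 1 - (c0 + 1) * x + c0 * x ** 3, 0.0)

def gamma_num(tau, d1, b=1.0, n=100000):
    """Self-correlation of h(t) = 1 + d1 t/b on [0, b], normalized."""
    dt = b / n
    t = (np.arange(n) + 0.5) * dt
    h = lambda u: np.where((u >= 0) & (u <= b), 1 + d1 * u / b, 0.0)
    g = lambda s: float(np.sum(h(t) * h(t + s)) * dt)
    g0 = g(0.0)
    return np.array([g(abs(s)) / g0 for s in np.atleast_1d(tau)])

taus = np.linspace(0.0, 1.0, 11)
ok = all(
    float(np.max(np.abs(gamma_m1(taus, d1) - gamma_num(taus, d1)))) < 1e-3
    for d1 in (-2.0, -1.0, 0.0, 1.0, 5.0)   # includes TBM (-2) and spherical (-1)
)
print(ok)                                     # → True
print(round(float(gamma_m1(0.5, -1.0)), 4))   # → 0.3125  (spherical at tau = 0.5)
```

Note that $d\_1 = 0$ gives the triangular model and $d\_1 = -2$ gives $c\_0 = 2$, the maximum hole effect.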

Figure 1 shows this family of functions for a selection of defining parameters $d\_1$ over its full range of values. The triangular and spherical models as well as the model with extreme hole effect (TBM operator) are part of this family. Naturally, all functions of this family are at least valid in R<sup>1</sup>. For dimensions $d = 2$ and $d = 3$, only the spherical model (Eq. 1) ($d\_1 = -1$, $c\_0 = 0.5$) is valid, which can be shown by evaluation of Eq. (10) as well as by the geometrical consideration of convolving the unit sphere in 3D.

With the generation by Eqs. (13) and (12), we introduce a family of covariance functions that is extensible to arbitrary orders $m$, which yields lengthy, non-linear expressions for the polynomial coefficients $a\_i$, similar to Gaspari et al (2006, Eqs. (33), (C.1) and (C.2)). Despite being long and non-linear, the formulas are manageable and converge when fitting the tuning parameters. The equations for $m = 2$ to $5$ are given in the Appendix, Eqs. (A.1) to (A.4). There, similar to Eq. (15), a formulation using linear constraint relations among the shape parameters can be achieved if $c\_{10}$ to $c\_{16}$ are considered as the defining tuning parameters, though their bounds are not made explicit.

Higher order models of this family naturally provide more flexibility than for $m = 1$ and realize covariance functions with a bigger hole effect compared to Fig. 1.

<sup>2</sup>Without loss of generality, $d\_0$ has been set to $d\_0 = 1$.

So in general, nonlinear expressions are obtained. The big advantage, however, is that the covariance function is well-shaped and, by virtue of the univariate self-correlation, positive semi-definiteness in R<sup>1</sup> is guaranteed.

In general, the number of defining parameters is $m + 1$, see Eq. (12), but can collapse to fewer, as in the case of Eq. (14), which does not involve $d\_0$. The covariance functions are polynomials of degree $n$, where in general $m = (n - 1)/2$ holds. The variance $\sigma^2$ takes the role of $a\_0$ and of a factor to all other $a\_i$ and is also a parameter to be estimated. In addition, there is the finite range $b$. In total, the number of parameters to estimate is $m + 3$.

Clearly, the continuity properties of the indicator function define the continuity class of the covariance function. With no specifications on the continuity of $h(t)$, $\gamma(\tau)$ from Eq. (13) will in general only be $C^0$ at $\tau = b$. The covariance function's continuous differentiability at $\tau = b$ can be easily increased to higher classes, e.g. $C^1$. This can be done by setting the parameters $d\_0$, $d\_1$, etc. to exactly zero, which results in $C^1$, $C^2$, etc., respectively.

# **4 An Example: Milan Cathedral Deformation Time Series**

This example is the well-known deformation time series of Milan Cathedral (Sansò 1985). The time series measurements are levelling heights of a pillar in the period from 1965 to 1977. For details see also Schubert et al. (2020). The time series is detrended using a linear function, and the remaining residuals define the stochastic signal. Based on the detrended time series, the biased estimator is used to determine the empirical covariances.

The fitting of the analytical covariance function is done using the MATLAB function fmincon (MathWorks 2022), with the optimization with respect to the parameters $\sigma$, $b$ and $d\_0$ to $d\_m$ formulated in the least-squares sense. If needed, the support range $b$ and the variance $\sigma^2$ can be constrained by lower and upper bounds. The other shape parameters $d\_i$ can be left unconstrained, unless higher continuity classes are to be achieved.
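In a Python setting, a comparable constrained least-squares fit can be sketched with `scipy.optimize.least_squares` in place of fmincon. This uses synthetic target values for the $m = 1$ family, not the Milan Cathedral data; all numbers are illustrative:

```python
import numpy as np
from scipy.optimize import least_squares

def gamma_m1(tau, sigma2, b, d1):
    # m = 1 family, Eq. (14)
    c0 = d1 ** 2 / (2 * d1 ** 2 + 6 * d1 + 6)
    x = np.abs(np.asarray(tau, dtype=float)) / b
    return np.where(x <= 1.0, sigma2 * (1 - (c0 + 1) * x + c0 * x ** 3), 0.0)

# synthetic "empirical covariances": spherical shape, sigma2 = 2, b = 4
lags = np.arange(0.0, 8.0, 0.25)
rng = np.random.default_rng(1)
emp = gamma_m1(lags, 2.0, 4.0, -1.0) + 1e-3 * rng.standard_normal(lags.size)

res = least_squares(
    lambda p: gamma_m1(lags, *p) - emp,
    x0=[1.0, 3.0, -0.5],                        # sigma2, b, d1
    bounds=([1e-6, 1e-6, -np.inf], [np.inf, np.inf, np.inf]),
)
sigma2_hat, b_hat, d1_hat = res.x
print(round(float(sigma2_hat), 2), round(float(b_hat), 2))
```

Lower bounds keep $\sigma^2$ and $b$ positive; further linear constraints (e.g. $d\_0 = 0$ for $C^1$ continuity) can be imposed by simply fixing those parameters.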

A first fitted covariance model of order $m = 2$ obtains a compact support of $b = 6.04$ years, where the transition is $C^0$, see Fig. 2.

During the model fitting it became apparent that a model with two lobes requires $C^1$ continuity at $\tau = b$. Hence, a linear equality constraint is applied directly to the parameter $d\_0$, i.e. $d\_0 = 0$.

The fitted covariance model is of order $m = 5$ and exhibits two lobes up to the support range of $b = 10.24$ years. The shape is comparable to a damped oscillatory behavior, as would result from a modelling using AR and

**Fig. 2** Compactly supported covariance models fitted to the empirical covariances of the Milan Cathedral deformation time series

ARMA-processes (see Schubert et al. 2020), but it has finite support.

Performing the collocation prediction of the pillar deformation, the residual sum of squares (RSS) in a leave-one-out cross-validation (LOOCV) shows a reduction from 0.5314 mm² for the first model to 0.5011 mm² for the second model. This demonstrates the better representation of the stochastic behavior by the higher order model.
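A minimal sketch of such a LOOCV, assuming a signal covariance matrix built from the fitted model is given (function and variable names are ours):

```python
import numpy as np

def loocv_rss(C, y, noise_var=0.0):
    """Residual sum of squares of leave-one-out collocation prediction.

    C : (n, n) signal covariance matrix from the fitted model,
    y : detrended observations. Each point is predicted from all
    others via least-squares collocation (illustrative sketch).
    """
    n = len(y)
    rss = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        c_oi = C[i, keep]                               # cross covariances
        C_oo = C[np.ix_(keep, keep)] + noise_var * np.eye(n - 1)
        y_hat = c_oi @ np.linalg.solve(C_oo, y[keep])   # collocation predictor
        rss += (y[i] - y_hat) ** 2
    return rss
```

With an uncorrelated model ($C$ diagonal) every left-out point is predicted as zero, so the RSS equals the signal energy; any correlation captured by the model reduces it.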

# **5 Conclusions**

We introduced a family of compactly supported autocovariance functions based on cutoff polynomials. The functions are parameterized by a set of defining parameters which build the polynomial coefficients in a non-linear fashion resulting from the construction by self-correlation. These covariance models define a general family with a large flexibility and the ability to model various oscillatory shapes.

In summary, we have introduced versatile compactly supported functions suited for geodetic applications with large amounts of data. The resulting autocovariance models are flexible due to multiple tunable shape parameters and share the advantages of being positive definite by design, extensible to arbitrary orders and easy to constrain to different continuity classes. As an extension to this paper, families of covariance functions that are guaranteed to be positive definite in 2D, 3D and on the sphere have been derived and will be published soon.

**Acknowledgements** We thank the anonymous reviewers for their valuable comments which helped improve the manuscript.

**Funding** This research is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Grant No. 435703911 (SCHU 2305/7-1 'Nonstationary stochastic processes in least squares collocation – NonStopLSC').

**Competing Interests** The authors declare no competing interests.

# **Appendix A: Higher Order Covariance Models of the 1D Family**

These families are given for $\sigma^2 = 1$ and for a support range of $b = 1$. For arbitrary $b$, replace $\tau$ by $\tau/b$. The expressions involve auxiliary variables $c_l$ and the defining parameters $d_i$.

For $m = 2$:

$$\begin{aligned}
c_0 &= d_2^2\,, \\
c_1 &= 5\, d_1^2 + d_2^2 + 20\, d_0\, d_2\,, \\
c_2 &= 5\, d_1^2 + 15\, d_1\, d_2 + 9\, d_2^2 + 10\, d_0\, d_2\,, \\
c_3 &= 30\, d_0^2 + 30\, d_0\, d_1 + 20\, d_0\, d_2 + 10\, d_1^2 + 15\, d_1\, d_2 + 6\, d_2^2\,, \\
c_{10} &= \frac{c_0}{c_3}, \quad c_{11} = \frac{c_1}{c_3}, \quad c_{12} = \frac{c_2}{c_3}\,, \\
\gamma_{m=2}(\tau) &= \left( -c_{10}\, \tau^5 + (c_{10} - c_{11})\, \tau^3 + (c_{11} + c_{12})\, \tau^2 + (-c_{12} - 1)\, \tau + 1 \right)_{(\tau \le 1)}
\end{aligned} \tag{A.1}$$
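A direct transcription of Eq. (A.1) (for $\sigma^2 = 1$, $b = 1$) can serve as a numerical check: the polynomial coefficients telescope, so $\gamma(1) = 0$ holds for any choice of the $d_i$, confirming the $C^0$ transition at the support range. The $c$-expressions below follow our reading of the printed equation.

```python
import numpy as np

def gamma_m2(tau, d0, d1, d2):
    """Order m = 2 covariance model of Eq. (A.1), for sigma^2 = 1, b = 1."""
    c0 = d2**2
    c1 = 5*d1**2 + d2**2 + 20*d0*d2
    c2 = 5*d1**2 + 15*d1*d2 + 9*d2**2 + 10*d0*d2
    c3 = 30*d0**2 + 30*d0*d1 + 20*d0*d2 + 10*d1**2 + 15*d1*d2 + 6*d2**2
    c10, c11, c12 = c0 / c3, c1 / c3, c2 / c3
    tau = np.asarray(tau, dtype=float)
    poly = (-c10*tau**5 + (c10 - c11)*tau**3
            + (c11 + c12)*tau**2 + (-c12 - 1.0)*tau + 1.0)
    return np.where(tau <= 1.0, poly, 0.0)   # compact support
```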

For $m = 3$:

$$\begin{aligned}
c_0 &= 3\, d_3^2\,, \\
c_1 &= 14\, d_2^2 + 3\, d_3^2 + 42\, d_1\, d_3\,, \\
c_2 &= 70\, d_1^2 + 168\, d_1\, d_3 + 14\, d_2^2 + 140\, d_2\, d_3 + 280\, d_0\, d_2 + 102\, d_3^2 + 420\, d_0\, d_3\,, \\
c_3 &= 70\, d_1^2 + 210\, d_1\, d_2 + 252\, d_1\, d_3 + 126\, d_2^2 + 280\, d_2\, d_3 + 140\, d_0\, d_2 + 150\, d_3^2 + 210\, d_0\, d_3\,, \\
c_4 &= 420\, d_0^2 + 420\, d_0\, d_1 + 280\, d_0\, d_2 + 210\, d_0\, d_3 + 140\, d_1^2 + 210\, d_1\, d_2 \\
&\quad + 168\, d_1\, d_3 + 84\, d_2^2 + 140\, d_2\, d_3 + 60\, d_3^2\,, \\
c_{10} &= \frac{c_0}{c_4}, \quad c_{11} = \frac{c_1}{c_4}, \quad c_{12} = \frac{c_2}{c_4}, \quad c_{13} = \frac{c_3}{c_4}\,, \\
\gamma_{m=3}(\tau) &= \left( c_{10}\, \tau^7 + (c_{11} - c_{10})\, \tau^5 + (-c_{11} - c_{12})\, \tau^3 + (c_{12} + c_{13})\, \tau^2 + (-c_{13} - 1)\, \tau + 1 \right)_{(\tau \le 1)}
\end{aligned} \tag{A.2}$$

For $m = 4$:

$$\begin{aligned}
c_0 &= 2\, d_4^2\,, \\
c_1 &= 9\, d_3^2 + 2\, d_4^2 + 24\, d_2\, d_4\,, \\
c_2 &= 42\, d_2^2 + 504\, d_0\, d_4 - 126\, d_1\, d_3\,, \\
c_3 &= 756\, d_0\, d_4 + 126\, d_1\, d_3 + 630\, d_1\, d_4 + 396\, d_2\, d_4 + 315\, d_3\, d_4 - 42\, d_2^2 + 9\, d_3^2 + 250\, d_4^2\,, \\
c_4 &= 210\, d_1^2 + 504\, d_1\, d_3 + 1050\, d_1\, d_4 + 42\, d_2^2 + 420\, d_2\, d_3 + 864\, d_2\, d_4 + 840\, d_0\, d_2 \\
&\quad + 306\, d_3^2 + 945\, d_3\, d_4 + 1260\, d_0\, d_3 + 590\, d_4^2 + 1764\, d_0\, d_4\,, \\
c_5 &= 210\, d_1^2 + 630\, d_1\, d_2 + 756\, d_1\, d_3 + 840\, d_1\, d_4 + 378\, d_2^2 + 840\, d_2\, d_3 + 900\, d_2\, d_4 \\
&\quad + 420\, d_0\, d_2 + 450\, d_3^2 + 945\, d_3\, d_4 + 630\, d_0\, d_3 + 490\, d_4^2 + 756\, d_0\, d_4\,, \\
c_6 &= 1260\, d_0^2 + 1260\, d_0\, d_1 + 840\, d_0\, d_2 + 630\, d_0\, d_3 + 504\, d_0\, d_4 + 420\, d_1^2 + 630\, d_1\, d_2 \\
&\quad + 504\, d_1\, d_3 + 420\, d_1\, d_4 + 252\, d_2^2 + 420\, d_2\, d_3 + 360\, d_2\, d_4 + 180\, d_3^2 + 315\, d_3\, d_4 + 140\, d_4^2\,, \\
c_{10} &= \frac{c_0}{c_6}, \quad c_{11} = \frac{c_1}{c_6}, \quad c_{12} = \frac{c_1 + c_2}{c_6}, \quad c_{13} = \frac{c_3}{c_6}, \quad c_{14} = \frac{c_4}{c_6}, \quad c_{15} = \frac{c_5}{c_6}\,, \\
\gamma_{m=4}(\tau) &= \Big( -c_{10}\, \tau^9 + (c_{10} - c_{11})\, \tau^7 + (c_{11} - c_{12})\, \tau^5 + (c_{12} + c_{13})\, \tau^4 \\
&\quad + (-c_{13} - c_{14})\, \tau^3 + (c_{14} + c_{15})\, \tau^2 + (-c_{15} - 1)\, \tau + 1 \Big)_{(\tau \le 1)}
\end{aligned} \tag{A.3}$$

For $m = 5$:

$$\begin{aligned}
t_0 &= 5\, d_5^2\,, \\
t_1 &= 22\, d_4^2 + 5\, d_5^2 + 55\, d_3\, d_5\,, \\
t_2 &= 99\, d_3^2 + 660\, d_1\, d_5 - 264\, d_2\, d_4\,, \\
t_3 &= 5544\, d_0\, d_4 - 1386\, d_1\, d_3 + 13860\, d_0\, d_5 + 6270\, d_1\, d_5 + 264\, d_2\, d_4 + 4620\, d_2\, d_5 \\
&\quad + 3410\, d_3\, d_5 + 2772\, d_4\, d_5 + 462\, d_2^2 - 99\, d_3^2 + 22\, d_4^2 + 2305\, d_5^2\,, \\
t_4 &= 8316\, d_0\, d_4 + 1386\, d_1\, d_3 + 20790\, d_0\, d_5 + 6930\, d_1\, d_4 + 16830\, d_1\, d_5 + 4356\, d_2\, d_4 + 12705\, d_2\, d_5 \\
&\quad + 3465\, d_3\, d_4 + 10450\, d_3\, d_5 + 11088\, d_4\, d_5 - 462\, d_2^2 + 99\, d_3^2 + 2750\, d_4^2 + 7595\, d_5^2\,, \\
t_5 &= 2310\, d_1^2 + 5544\, d_1\, d_3 + 11550\, d_1\, d_4 + 17820\, d_1\, d_5 + 462\, d_2^2 + 4620\, d_2\, d_3 + 9504\, d_2\, d_4 \\
&\quad + 15015\, d_2\, d_5 + 9240\, d_0\, d_2 + 3366\, d_3^2 + 10395\, d_3\, d_4 + 14960\, d_3\, d_5 + 13860\, d_0\, d_3 \\
&\quad + 6490\, d_4^2 + 16632\, d_4\, d_5 + 19404\, d_0\, d_4 + 9730\, d_5^2 + 25410\, d_0\, d_5\,, \\
t_6 &= 2310\, d_1^2 + 6930\, d_1\, d_2 + 8316\, d_1\, d_3 + 9240\, d_1\, d_4 + 9900\, d_1\, d_5 + 4158\, d_2^2 + 9240\, d_2\, d_3 \\
&\quad + 9900\, d_2\, d_4 + 10395\, d_2\, d_5 + 4620\, d_0\, d_2 + 4950\, d_3^2 + 10395\, d_3\, d_4 + 10780\, d_3\, d_5 \\
&\quad + 6930\, d_0\, d_3 + 5390\, d_4^2 + 11088\, d_4\, d_5 + 8316\, d_0\, d_4 + 5670\, d_5^2 + 9240\, d_0\, d_5\,, \\
t_7 &= 13860\, d_0^2 + 13860\, d_0\, d_1 + 9240\, d_0\, d_2 + 6930\, d_0\, d_3 + 5544\, d_0\, d_4 + 4620\, d_0\, d_5 + 4620\, d_1^2 \\
&\quad + 6930\, d_1\, d_2 + 5544\, d_1\, d_3 + 4620\, d_1\, d_4 + 3960\, d_1\, d_5 + 2772\, d_2^2 + 4620\, d_2\, d_3 + 3960\, d_2\, d_4 \\
&\quad + 3465\, d_2\, d_5 + 1980\, d_3^2 + 3465\, d_3\, d_4 + 3080\, d_3\, d_5 + 1540\, d_4^2 + 2772\, d_4\, d_5 + 1260\, d_5^2\,, \\
t_{10} &= \frac{t_0}{t_7}, \quad t_{11} = \frac{t_1}{t_7}, \quad t_{12} = \frac{t_1 + t_2}{t_7}, \quad t_{13} = \frac{t_3}{t_7}, \quad t_{14} = \frac{t_4}{t_7}, \quad t_{15} = \frac{t_5}{t_7}, \quad t_{16} = \frac{t_6}{t_7}\,, \\
\gamma_{m=5}(\tau) &= \Big( t_{10}\, \tau^{11} + (t_{11} - t_{10})\, \tau^9 + (t_{12} - t_{11})\, \tau^7 + (-t_{12} - t_{13})\, \tau^5 + (t_{13} + t_{14})\, \tau^4 \\
&\quad + (-t_{14} - t_{15})\, \tau^3 + (t_{15} + t_{16})\, \tau^2 + (-t_{16} - 1)\, \tau + 1 \Big)_{(\tau \le 1)}
\end{aligned} \tag{A.4}$$

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Modeling of Inhomogeneous Spatio-Temporal Signals by Least Squares Collocation**

Wolf-Dieter Schuh, Johannes Korte, Till Schubert, and Jan Martin Brockmann

#### **Abstract**

Through inverse modeling and adjustment techniques, geodesists try to derive mathematical models from their measurements to get a better understanding of various processes in the Earth system. Sophisticated deterministic and stochastic models are developed to achieve the best possible reflection of reality and of the remaining uncertainty.

The main focus of this article is on the further development of stochastic model representations, with the capability to switch from the usual assumption of homogeneous (time-stationary) to inhomogeneous (time-variable) stochastic models. To accomplish this we build up and extend a methodical framework connecting the filter approach and the covariance approach, represented by time-variable autoregressive (AR) processes and time-variable (inhomogeneous) covariance models for least squares collocation.

We apply these time-variable covariance models to describe the temporal component of a spatio-temporal point stack of surface displacements derived from a DInSAR-SBAS analysis of the ERS1 and ERS2 missions over the Lower-Rhine Embayment in North Rhine-Westphalia. The construction of a time-variable spatio-temporal covariance model allows us to use the least squares collocation approach to predict the vertical movements at any location and at any time. Furthermore, a report on the uncertainty of the prediction is provided.

#### **Keywords**

Collocation · Non-stationarity · Time-variable AR processes · Time-variable covariances

# **1 Introduction**

Concepts like stationarity and homogeneity (invariance with respect to a transformation) play a central role in the treatment of stochastic processes. According to Brockwell and Davis (1991, Def. 1.3.1), stationary processes are defined on the one hand by an unchanged (marginal) distribution function; on the other hand, following Def. 1.3.2 therein, stationarity can also be defined as covariance stationarity. This means that the covariance function is invariant with respect to a linear transformation (see also Moritz (1980, Sec. 12)). In practice, many phenomena modelled by stochastic processes do not satisfy this requirement and exhibit a time- or location-varying character.

Theoretical Geodesy Group, Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany

e-mail: schuh@geod.uni-bonn.de; korte@geod.uni-bonn.de; schubert@geod.uni-bonn.de; brockmann@geod.uni-bonn.de

The term time-variable is often used differently. Following Priestley (1989, Sec. 6.1), time-variable processes can be subdivided into models with a deterministic trend (e.g. polynomial or seasonal) or into "explosive" AR models, where the roots of the characteristic polynomial lie not only inside but also outside the unit circle. Here we want to study another type of non-stationary process, where the coefficients of a discrete AR process $\mathcal{S}_t$, $t \in \mathbb{Z}^+$, are variable in time.

However, the motion of the coefficients must be constrained to ensure a finite variance of the resulting process,

© The Author(s) 2023

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_202


$\mathrm{E}\{(\mathcal{S}_t - \mathrm{E}\{\mathcal{S}_t\})^2\} < \infty$. Korte et al. (2022) restrict the variability of the time-varying coefficients of an AR process by requiring that the roots of the characteristic polynomials lie only within the unit circle. Since only linear motions of the roots (poles) are allowed, this requirement can be guaranteed also for higher order processes. For general (infinite) time-variable AR processes of first order, TVAR(1), however, the convergence of the product sequence of the time-variable AR coefficients is a sufficient condition to guarantee finite variances/covariances (Knopp 1922, Sec. VII). For finite AR processes these restrictions simplify accordingly.

The article is organized as follows. In Sect. 2 we examine the time-variable AR process of first order and put special focus on the inhomogeneity of the first and second central moments of the density function (expectation and covariances). In Sect. 3 we use this time-variable AR process to construct a time-variable spatio-temporal covariance model for a DInSAR-SBAS point stack of surface displacements from ERS1 and ERS2 data over the Lower-Rhine Embayment in North Rhine-Westphalia. A summary and outlook conclude the work.

# **2 Time-Variable Autoregressive Process of First Order (TVAR(1))**

The focus of this section is to derive the first and second central moments of a time-variable autoregressive process of first order, TVAR(1), which is defined by

$$\mathcal{S}_t := \alpha_t\, \mathcal{S}_{t-1} + \mathcal{E}_t, \qquad t \in \mathbb{Z} \tag{1}$$

where $\{\alpha_t\}_t \subset \mathbb{R}$ forms a sequence of time-variable coefficients under the condition that the product sequence $\lim_{t \to \infty} \prod_{j=1}^{t} \alpha_j^2$ converges. $\{\mathcal{E}_t\}_t$ represents an independent and identically distributed (i.i.d.) sequence of random variables with expectation $\mathrm{E}\{\mathcal{E}_t\} = 0$ and a constant variance $\Sigma\{\mathcal{E}_t\} = \sigma_{\mathcal{E}}^2$. $\Delta t$ denotes the sampling rate.

#### **Process Definition and Moving Average Representation of a TVAR(1) Process**

To find an equivalent representation of the TVAR(1) process by a moving average process, we have to substitute the past signals

$$\mathcal{S}_t = \alpha_t\, \mathcal{S}_{t-1} + \mathcal{E}_t \tag{2}$$

$$\phantom{\mathcal{S}_t} = \alpha_t \left( \alpha_{t-1}\, \mathcal{S}_{t-2} + \mathcal{E}_{t-1} \right) + \mathcal{E}_t \tag{3}$$

and obtain the general representation

$$\mathcal{S}_t = \alpha_t \left( \alpha_{t-1} \left( \dots \alpha_2 \left( \alpha_1\, \mathcal{S}_0 + \mathcal{E}_1 \right) \dots \right) + \mathcal{E}_{t-1} \right) + \mathcal{E}_t \tag{4}$$

where $\mathcal{S}_0$ denotes the signal at the initial point $t = 0$. In compact form this results in

$$\mathcal{S}_t = \prod_{j=1}^{t} \alpha_j\; \mathcal{S}_0 + \sum_{k=1}^{t} \prod_{j=k+1}^{t} \alpha_j\; \mathcal{E}_k \tag{5}$$

(notice: $\prod_{j=t+1}^{t} \alpha_j := 1$). With this moving average representation of a TVAR(1) process it is now straightforward to compute the expectation and the covariances of the process.
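The equivalence of the recursive definition (1) and the moving average representation (5) is easy to verify numerically (a sketch with our own variable names; the empty product is taken as 1):

```python
import numpy as np

def tvar1_states(alpha, eps, s0):
    """Recursive evaluation of the TVAR(1) definition, Eq. (1)."""
    s = [s0]
    for a, e in zip(alpha, eps):
        s.append(a * s[-1] + e)
    return np.array(s[1:])

def tvar1_moving_average(alpha, eps, s0):
    """Moving average representation, Eq. (5)."""
    t = len(alpha)
    out = np.empty(t)
    for i in range(1, t + 1):
        val = np.prod(alpha[:i]) * s0                 # product alpha_1..alpha_i
        for k in range(1, i + 1):
            val += np.prod(alpha[k:i]) * eps[k - 1]   # empty product = 1
        out[i - 1] = val
    return out
```

Both functions evaluate the same linear combination of $\mathcal{S}_0$ and $\mathcal{E}_1, \dots, \mathcal{E}_t$, only in a different order.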

#### **Expectation of a TVAR(1) Process**

The expectation

$$\mathrm{E}\{\mathcal{S}_t\} = \mathrm{E}\Big\{ \prod_{j=1}^{t} \alpha_j\; \mathcal{S}_0 + \sum_{k=1}^{t} \prod_{j=k+1}^{t} \alpha_j\; \mathcal{E}_k \Big\} \tag{6}$$

$$= \prod_{j=1}^{t} \alpha_j\; \mathrm{E}\{\mathcal{S}_0\} + \sum_{k=1}^{t} \prod_{j=k+1}^{t} \alpha_j\; \mathrm{E}\{\mathcal{E}_k\} \tag{7}$$

depends on the expectation of the initial state $\mathcal{S}_0$ and the stochastic behavior of the noise $\mathcal{E}_t$, which by definition (1) of the AR process satisfies $\mathrm{E}\{\mathcal{E}_j\} = 0$ for $j = 1, \dots, t$. The expectation of the initial state $\mathcal{S}_0$ is unknown, but we can deduce from (7) the conditional expectation of $\mathcal{S}_t$ given a known initial condition $\mathrm{E}\{\mathcal{S}_0\} = s_0$,

$$\mathrm{E}\{\mathcal{S}_t \mid s_0\} = \prod_{j=1}^{t} \alpha_j\; s_0\,. \tag{8}$$

In the following we restrict this general formulation of TVAR(1) processes by assuming that $\mathcal{S}_0$ has the same stochastic properties as a long convergent AR(1) process with constant coefficient $|\alpha| < 1$,

$$\mathcal{S}_i = \alpha\, \mathcal{S}_{i-1} + \mathcal{E}_i, \quad \text{for} \quad i = \dots, -2, -1, 0\,. \tag{9}$$

Taking the properties of the i.i.d. sequence of the random variables $\mathcal{E}_t$ into account, we can state that $\mathcal{S}_{i-1}$ and $\mathcal{E}_i$ are uncorrelated. Due to the convergence behaviour $\lim_{t \to \infty} \alpha^t = 0$, the expectation and variance of $\mathcal{S}_0$ are asymptotically independent of the initial state of this process and given by

$$E\left\{\mathcal{S}\_0\right\} = 0 \qquad \text{and} \qquad \sigma\_{\mathcal{S}\_0}^2 = \frac{1}{1 - \alpha^2} \sigma\_\epsilon^2 \qquad (10)$$

cf. e.g. Box and Jenkins (1970, pp. 57-58). Applying the expectation $\mathrm{E}\{\mathcal{S}_0\} = 0$ in (7) or (8) immediately results in

$$E\left\{\mathcal{S}\_l\right\} = 0\tag{11}$$

for the TVAR(1) process under the assumption (9) for $\mathcal{S}_0$. This choice of the initial state has of course an influence on the further derivations of the variances and covariances. In contrast to Wegman (1974), where the second moments are defined as conditional moments, we integrate the stochastic properties of the initial state $\mathcal{S}_0$ under assumption (10) of a long convergent AR(1) process.

#### **Variance/Covariance of a TVAR(1) Process**

The covariance as joint second central moment is defined by

$$\Sigma\{\mathcal{S}_t, \mathcal{S}_{t+h}\} = \mathrm{E}\left\{ (\mathcal{S}_t - \mathrm{E}\{\mathcal{S}_t\})\, (\mathcal{S}_{t+h} - \mathrm{E}\{\mathcal{S}_{t+h}\}) \right\}, \tag{12}$$

where $t$ and $t+h$ denote the time points. Substituting the moving average representation (5) and noting that the expectation value vanishes due to (11), we obtain

$$\begin{aligned}
\Sigma\{\mathcal{S}_t, \mathcal{S}_{t+h}\} = \mathrm{E}\Bigg\{ &\left( \prod_{j=1}^{t} \alpha_j\; \mathcal{S}_0 + \sum_{k=1}^{t} \prod_{j=k+1}^{t} \alpha_j\; \mathcal{E}_k \right) \\
&\left( \prod_{m=1}^{t+h} \alpha_m\; \mathcal{S}_0 + \sum_{\ell=1}^{t+h} \prod_{m=\ell+1}^{t+h} \alpha_m\; \mathcal{E}_\ell \right) \Bigg\}\,.
\end{aligned} \tag{13}$$

A reordering with respect to the expectation and the products provides

$$\begin{split}
\Sigma\{\mathcal{S}_t, \mathcal{S}_{t+h}\} &= \prod_{j=1}^{t} \alpha_j \prod_{m=1}^{t+h} \alpha_m\, \mathrm{E}\{\mathcal{S}_0 \mathcal{S}_0\} + \\
&+ \prod_{j=1}^{t} \alpha_j \sum_{\ell=1}^{t+h} \prod_{m=\ell+1}^{t+h} \alpha_m\, \mathrm{E}\{\mathcal{S}_0 \mathcal{E}_\ell\} + \\
&+ \prod_{m=1}^{t+h} \alpha_m \sum_{k=1}^{t} \prod_{j=k+1}^{t} \alpha_j\, \mathrm{E}\{\mathcal{E}_k \mathcal{S}_0\} + \\
&+ \sum_{k=1}^{t} \prod_{j=k+1}^{t} \alpha_j \sum_{\ell=1}^{t+h} \prod_{m=\ell+1}^{t+h} \alpha_m\, \mathrm{E}\{\mathcal{E}_k \mathcal{E}_\ell\}\,.
\end{split} \tag{14}$$

Taking into account the properties of the i.i.d. sequence of the random variables E<sup>t</sup> , we can state

$$E\{\mathcal{E}\_k \: \mathcal{E}\_\ell\} = \sigma\_\mathcal{E}^2 \delta\_{k\ell},$$

$$E\{\mathcal{S}\_0 \: \mathcal{E}\_\ell\} = 0 \quad \text{for} \quad \ell \neq 0, \quad \text{and}$$

$$E\{\mathcal{S}\_0 \: \mathcal{S}\_0\} = \sigma\_{\mathcal{S}\_0}^2,\tag{15}$$

where $\delta_{k\ell}$ denotes the Kronecker delta.

From (14) one obtains

$$\Sigma\{\mathcal{S}_t, \mathcal{S}_{t+h}\} = \prod_{n=t+1}^{t+h} \alpha_n \prod_{j=1}^{t} \alpha_j^2\, \sigma_{\mathcal{S}_0}^2 + \sum_{k=1}^{t} \prod_{j=k+1}^{t} \alpha_j^2 \prod_{m=t+1}^{t+h} \alpha_m\, \sigma_{\mathcal{E}}^2 \qquad \text{for } h > 0\,. \tag{16}$$

This can be reformulated to

$$\Sigma\{\mathcal{S}_t, \mathcal{S}_{t+h}\} = \prod_{n=t+1}^{t+h} \alpha_n \left( \prod_{j=1}^{t} \alpha_j^2\, \sigma_{\mathcal{S}_0}^2 + \sum_{k=1}^{t} \prod_{j=k+1}^{t} \alpha_j^2\, \sigma_{\mathcal{E}}^2 \right), \qquad h > 0\,, \tag{17}$$

which gives the covariance sequence of a TVAR(1) process for the times $t$ and $t+h$, for a positive lag $h$. Note that because of the symmetry properties of covariances, $\Sigma\{\mathcal{S}_t, \mathcal{S}_{t+h}\} = \Sigma\{\mathcal{S}_{t+h}, \mathcal{S}_t\}$ holds.

The covariance sequence (17) can be split into two parts: the first part, involving only the future events, is connected with the index $n$; the second part, involving only the events up to and including time $t$, is linked to the indices $j$ and $k$. The expression in the brackets defines the variance (i.e. lag 0) at time $t$,

$$\gamma_t(0) := \Sigma\{\mathcal{S}_t, \mathcal{S}_t\} = \prod_{j=1}^{t} \alpha_j^2\, \sigma_{\mathcal{S}_0}^2 + \sum_{k=1}^{t} \prod_{j=k+1}^{t} \alpha_j^2\, \sigma_{\mathcal{E}}^2\,. \tag{18}$$

The variance $\gamma_t(0)$ is influenced by two quantities: the variance of the initial value $\mathcal{S}_0$ characterizes the warm-up behavior of the process, and the constant noise variance $\sigma_{\mathcal{E}}^2$ establishes, in connection with the time-variable coefficients $\alpha_j$, the further variance behaviour of the process. It is thus clear that the variance of the TVAR(1) process is time-variant (see also Fig. 1).

Often it is convenient to use a simple recursion formula instead of (18). If we rewrite (18) and use $t-1$ as maximal upper bound instead of $t$, we get

$$\gamma_t(0) := \alpha_t^2 \prod_{j=1}^{t-1} \alpha_j^2\, \sigma_{\mathcal{S}_0}^2 + \prod_{j=t+1}^{t} \alpha_j^2\, \sigma_{\mathcal{E}}^2 + \sum_{k=1}^{t-1} \alpha_t^2 \prod_{j=k+1}^{t-1} \alpha_j^2\, \sigma_{\mathcal{E}}^2\,, \tag{19}$$

**Fig. 1** Variance sequence of a TVAR(1) process. The time-variable coefficients follow the third-degree polynomial $\alpha_t = 0.7 + 0.016t - 0.00035t^2 + 0.000002t^3$ (**a**). The variances (**b**) are computed in three different ways: (1) by (18), (2) by the filter approach, and (3) by the filter approach supplemented by the influence of the variance of the

which can be rewritten to

$$\gamma_t(0) := \alpha_t^2 \left( \prod_{j=1}^{t-1} \alpha_j^2\, \sigma_{\mathcal{S}_0}^2 + \sum_{k=1}^{t-1} \prod_{j=k+1}^{t-1} \alpha_j^2\, \sigma_{\mathcal{E}}^2 \right) + \sigma_{\mathcal{E}}^2\,. \tag{20}$$

Here the term in the brackets represents $\gamma_{t-1}(0)$. Therefore we end up with the simple recursion equation

$$\gamma_t(0) := \alpha_t^2\, \gamma_{t-1}(0) + \sigma_{\mathcal{E}}^2 \tag{21}$$

for the variances at time $t$, with the initial state $\gamma_0(0) = \sigma_{\mathcal{S}_0}^2$. The covariances with respect to a time lag $h$ follow from (17) and (18):

$$\Sigma\{\mathcal{S}_t, \mathcal{S}_{t+h}\} = \prod_{n=t+1}^{t+h} \alpha_n\; \gamma_t(0)\,, \qquad h > 0\,. \tag{22}$$

The covariance matrix $\boldsymbol{\Sigma}_{\mathcal{S}}$ of a TVAR(1) process of finite length $t_{\max}$ can now be computed by arranging the covariances $\Sigma\{\mathcal{S}_t, \mathcal{S}_{t+h}\}$ for $t = 0, \dots, t_{\max}$ and $h = 0, \dots, t_{\max} - t$ into the upper triangle of a matrix. The lower triangular part is completed symmetrically. Figure 2 gives an example of the covariance matrix of a TVAR(1) process.
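Arranging the covariances by the recursion (21) together with (22) can be sketched as follows (variable names are ours):

```python
import numpy as np

def tvar1_covariance_matrix(alpha, sigma_e2, sigma_s0_2):
    """Covariance matrix of a finite TVAR(1) process via (21) and (22).

    alpha : time-variable coefficients alpha_1 .. alpha_tmax,
    sigma_e2 : noise variance, sigma_s0_2 : variance of the initial
    state S_0 (illustrative sketch).
    """
    tmax = len(alpha)
    # variance recursion (21) with gamma_0(0) = sigma_s0_2
    var = np.empty(tmax + 1)
    var[0] = sigma_s0_2
    for t in range(1, tmax + 1):
        var[t] = alpha[t - 1] ** 2 * var[t - 1] + sigma_e2
    # covariances (22): product of the coefficients between t and t+h
    C = np.diag(var)
    for t in range(tmax + 1):
        for h in range(1, tmax + 1 - t):
            C[t, t + h] = np.prod(alpha[t:t + h]) * var[t]
            C[t + h, t] = C[t, t + h]        # symmetric completion
    return C
```

As a sanity check, a constant coefficient $\alpha$ together with $\sigma_{\mathcal{S}_0}^2 = \sigma_{\mathcal{E}}^2 / (1 - \alpha^2)$ from (10) reproduces the stationary AR(1) covariance $\alpha^{|h|} \sigma_{\mathcal{S}_0}^2$.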

#### **Filter Representation and Covariance Matrix**

It should be mentioned that the same covariance matrix $\boldsymbol{\Sigma}_{\mathcal{S}}$ can be derived from the filter approach (cf. Schuh and

initial state $\mathcal{S}_0$. The figure shows that all three approaches deliver the same result, taking into account the warm-up phase of the filter approach without prior information on the statistics of the initial signal $\mathcal{S}_0$

**Fig. 2** Covariance matrix $\boldsymbol{\Sigma}_{\mathcal{S}}$ of a TVAR(1) process. The time-variable coefficients follow the third-degree polynomial $\alpha_t = 0.7 + 0.016t - 0.00035t^2 + 0.000002t^3$. The computation of the matrix can be performed by (17) or by the recursion (21) in connection with (22)

Brockmann (2020)). The covariance matrix $\boldsymbol{\Sigma}_{\mathcal{S}}$ consists of the filter part $\boldsymbol{\Sigma}_{\mathcal{S}}^F$ and the warm-up part $\boldsymbol{\Sigma}_{\mathcal{S}}^W$,

$$
\boldsymbol{\Sigma}\_{\mathbf{s}} = \boldsymbol{\Sigma}\_{\mathbf{s}}^{\boldsymbol{F}} + \boldsymbol{\Sigma}\_{\mathbf{s}}^{\boldsymbol{W}} \,. \tag{23}
$$

The filter matrix H is defined by

$$H = \begin{bmatrix} 1 \\ -\alpha_1 & 1 \\ & -\alpha_2 & 1 \\ & & \ddots & \ddots \\ & & & -\alpha_t & 1 \end{bmatrix} \tag{24}$$

and the covariance matrix †<sup>F</sup> *<sup>S</sup>* for the filter part follows from

$$
\boldsymbol{\Sigma}\_{\mathbf{s}}^{F} = \boldsymbol{H}^{-1} \left( \boldsymbol{H}^{-1} \right)^{T} . \tag{25}
$$

(cf. Schuh and Brockmann (2020, Sec. 5)). Because of the warm-up phase of the filter approach, this covariance matrix must be modified with respect to the influence of $\mathcal{S}_0$ by the matrix $\boldsymbol{\Sigma}_{\mathcal{S}}^W$, whose elements are computed according to (17) by

$$\sigma_{ii}^{W} = \prod_{n=1}^{i} \alpha_n^2\, \sigma_{\mathcal{S}_0}^2, \qquad i = 1, \dots, t, \tag{26}$$

$$\sigma_{ij}^{W} = \sigma_{ji}^{W} = \prod_{n=i+1}^{j} \alpha_n\, \sigma_{ii}^{W}, \qquad \begin{array}{l} i = 1, \dots, t \\ j = i, \dots, t \end{array} \tag{27}$$

The complete covariance matrix $\boldsymbol{\Sigma}_{\mathcal{S}}$ of a TVAR(1) process results from the sum of the two parts according to (23) and is identical to the covariance matrix calculated by (17), or by (21) and (22), respectively (see also Fig. 1).
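As a numerical cross-check of (23), the filter part can be built from the bidiagonal matrix $H$ of (24) and the warm-up part added per (26)-(27). Note that the paper writes (25) for unit noise variance; the sketch below carries $\sigma_{\mathcal{E}}^2$ explicitly and assigns no noise to the initial row, so that the warm-up part can be added as printed — an interpretation of the text, not the authors' code.

```python
import numpy as np

def tvar1_cov_filter(alpha, sigma_e2, sigma_s0_2):
    """Covariance matrix of a TVAR(1) process from the filter approach.

    Sketch of Eqs. (23)-(27): Sigma_S = Sigma_S^F + Sigma_S^W.
    """
    t = len(alpha)
    # bidiagonal filter matrix H of (24), rows for times 0..t
    H = np.eye(t + 1)
    for i in range(1, t + 1):
        H[i, i - 1] = -alpha[i - 1]
    Hinv = np.linalg.inv(H)
    # filter part, cf. (25); the initial row carries no noise
    D = np.diag(np.r_[0.0, np.full(t, sigma_e2)])
    cov_filter = Hinv @ D @ Hinv.T
    # warm-up part (26)-(27): influence of the initial state S_0
    w = np.array([np.prod(alpha[:i]) for i in range(t + 1)])
    cov_warmup = sigma_s0_2 * np.outer(w, w)
    return cov_filter + cov_warmup
```

The diagonal of the result reproduces the recursion (21), which is how the equivalence claimed in the text can be verified numerically.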

# **3 Time-Variable Collocation of a DInSAR Point Stack**

We apply these inhomogeneous covariances to model the temporal component of a spatio-temporal point stack derived from a DInSAR-SBAS analysis. The test region is the Lower-Rhine Embayment in North Rhine-Westphalia, Germany, with the still active open-cast mines Garzweiler, Hambach and Inden and the already closed coal mines Sophia-Jacoba in the mining region Erkelenz and Emil Mayrisch in the mining region Aachen. The Remote Sensing Software Graz (RSG) is used to analyze the data from the ERS1 and ERS2 missions. This results in a spatio-temporal point stack of surface displacements with respect to the initial frame, from May 5th, 1992 up to Dec. 12th, 2000 (cf. Esch et al. (2019)).

The construction of a time-variable spatio-temporal covariance model allows us to use the least squares collocation approach to estimate the surface displacements at any place and at any time, and to provide a report on the uncertainty of this estimation.

When evaluating the deformations, the estimation error of the prediction should be minimized according to the Wiener-Kolmogorov principle. For this purpose, we consider the measured deformations as a special realization of a random process. Since the distribution function of this random process is unknown and no assumptions are to be made about it, we choose a linear approach via the principle of the **B**est **L**inear **P**redictor (BLP) (Teunissen 2007). Due to the pre-processing of the DInSAR data it can be assumed that the expected value of the signal (the deformations) is zero over the entire area. This implicitly turns the best linear predictor into the **B**est **L**inear **U**nbiased **P**redictor (BLUP) (cf. e.g. Schuh (2016, Sec. 3.2) or Teunissen (2007, Corollary I(i))). The BLUP corresponds to the least squares collocation approach (cf. e.g. Moritz (1980, Sec. 11) or Schuh (2016)) and the predictor is defined by

$$\widetilde{\boldsymbol{s}}_p = \boldsymbol{\Sigma}_{\mathcal{S}}\{\boldsymbol{x}_p, \boldsymbol{x}_o\} \Big( \underbrace{\boldsymbol{\Sigma}_{\mathcal{S}}\{\boldsymbol{x}_o, \boldsymbol{x}_o\} + \boldsymbol{\Sigma}_{\mathcal{N}}\{\boldsymbol{x}_o, \boldsymbol{x}_o\}}_{:=\, \boldsymbol{\Sigma}_{\mathcal{S}+\mathcal{N}}\{\boldsymbol{x}_o, \boldsymbol{x}_o\}} \Big)^{-1} \Delta\boldsymbol{\ell}\,, \tag{28}$$

where $\boldsymbol{\Sigma}_{\mathcal{S}}\{\boldsymbol{x}_o, \boldsymbol{x}_o\}$ denotes the covariance matrix between the observed locations $\boldsymbol{x}_o$, whereas $\boldsymbol{\Sigma}_{\mathcal{S}}\{\boldsymbol{x}_p, \boldsymbol{x}_o\}$ represents the covariance matrix between the observed locations $\boldsymbol{x}_o$ and the locations $\boldsymbol{x}_p$ which are to be predicted. $\boldsymbol{\Sigma}_{\mathcal{N}}\{\boldsymbol{x}_o, \boldsymbol{x}_o\}$ reflects the noise characteristics. Here $\Delta\boldsymbol{\ell}$ represents the observed displacements of the point stack. In this example 144,302 scatterers are identified in 64 time frames. The data points are clustered in urban regions. To achieve a homogeneous data distribution in urban as well as in rural regions, the whole area is divided into $9 \times 7$ tiles and in each tile the same number of points is randomly selected.
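The predictor (28) itself is one line of linear algebra (a sketch with our own names; the covariance matrices are assumed to be given):

```python
import numpy as np

def blup_predict(C_po, C_oo, C_noise, dl):
    """Least squares collocation predictor of Eq. (28).

    C_po : covariances between prediction and observation points,
    C_oo : covariances among the observations,
    C_noise : noise covariance, dl : observed displacements.
    """
    # solve (C_oo + C_noise) x = dl instead of forming the inverse
    return C_po @ np.linalg.solve(C_oo + C_noise, dl)
```

Predicting at the observation points themselves with zero noise reproduces the observations exactly, which is a convenient consistency check of any implementation.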

The huge computational effort to solve

$$\left( \boldsymbol{\Sigma}_{\mathcal{S}+\mathcal{N}}\{\boldsymbol{x}_o, \boldsymbol{x}_o\} \right)^{-1} \Delta\boldsymbol{\ell}\,, \tag{29}$$

for which the dimension follows from the number of measurements, can be significantly reduced if the covariances can be separated into a spatial and a temporal domain, and by the use of finite covariance functions (Schuh 1989).

#### **Spatial Covariance Model**

To make the covariances in space independent of time, we only consider observed displacements of the same time difference when computing the spatial empirical covariance function. For each chosen time difference the empirical covariance function is computed and provides a sample of the stochastic behavior. All samples are documented in Fig. 3 (left). From the plotted confidence region of the estimates it can be stated that the spatial behavior is homogeneous with respect to time. These samples of empirical covariance

**Fig. 3** Spatial covariance functions of the DInSAR point stack. (left) Empirical covariance functions of the distortions with respect to equal time differences. (right) Analytic model of the spatial covariance function

functions are approximated by a finite covariance function which is constructed by the autocorrelation of truncated polynomial base functions (cf. Schubert and Schuh (2022)). The positive definite finite analytical covariance function can be seen in Fig. 3 (right). Due to the finite support of this positive semidefinite function the covariance matrix is sparse.

#### **Temporal Covariance Model**

The data characteristics in the time domain are determined by the epochs of the available SAR recordings. Especially for the images of the ERS1 and ERS2 satellites, the recorded data are irregularly distributed in time. From the variance plot in Fig. 4 (a) the time dependence of this signal is obvious. We approximate these variances by an equidistant TVAR(1) process with a sampling rate that is twice as high as the time difference of the ERS1 and ERS2 recordings. The time variation of the coefficients is modeled by a polynomial of degree three.

The variances of the TVAR(1) model can also be seen in Fig. 4 (a). The covariance matrix for all equidistant time points of the TVAR(1) model follows from (22) and can be downsampled to the measurement epochs. The identification of the measurement dates is done by nearest neighbors. Thus, we obtain the temporal covariance matrix at the identified measurement dates from the TVAR(1) model, which is shown in Fig. 4 (b).

It should be mentioned that the temporal covariance can be computed only for discrete times, but for arbitrarily small time intervals.

#### **Separable Spatio-Temporal Collocation Approach**

The above investigations have shown that the spatio-temporal covariance function can be separated into a time-variable temporal component $\gamma_t(t, t+h)$ and a homogeneous spatial component $\gamma_{sp}(\Delta\boldsymbol{x})$,

$$\gamma(\Delta\boldsymbol{x}, t, t+h) = \gamma_t(t, t+h) \cdot \gamma_{sp}(\Delta\boldsymbol{x})\,. \tag{30}$$

Since only permanent back scatterers, which are detected in all recordings, are included in the SBAS solution, the temporal distances are, however, the same for all scatterers. This allows for a compact representation of the covariance matrices by the Kronecker product

$$\Sigma_S\{x_k, t_k;\, x_o, t_o\} = \Sigma_S^{t}\{t_k, t_o\} \otimes \Sigma_S^{sp}\{x_k, x_o\}\,, \tag{31}$$

**Fig. 4** Temporal covariance modeling of the DInSAR point stack. (**a**) empirical variances of the distortions with respect to dates compared with the variances derived from the approximated TVAR(1) model (**b**) empirically determined temporal covariance matrix (**c**) covariance matrix derived from the TVAR(1) model thinned to the measurement dates

with $k \in \{p, o\}$, and (28) can thus be represented by

$$\widetilde{s}_p = \Sigma_S^{t}\{t_p, t_o\} \otimes \Sigma_S^{sp}\{x_p, x_o\}\,\Big(\underbrace{\Sigma_S^{t}\{t_o, t_o\} \otimes \Sigma_S^{sp}\{x_o, x_o\} + \Sigma_N}_{\Sigma_{S+N}\{x_o, t_o;\, x_o, t_o\}}\Big)^{-1} \Delta\ell\,, \tag{32}$$

where $\Sigma_N$ characterizes the noise component. If the noise is designed appropriately, the calculation of the estimator can be split into a temporal and a spatial component according to the rules of array algebra (cf. e.g. Blaha (1977); Rauhala (1974))

$$\widetilde{S}_p = \Sigma_S^{sp}\{x_p, x_o\}\,\big(\Sigma_{S+N}^{sp}\{x_o, x_o\}\big)^{-1}\,\Delta L\,\big(\Sigma_{S+N}^{t}\{t_o, t_o\}\big)^{-1}\,\Sigma_S^{t}\{t_o, t_p\}\,. \tag{33}$$

Here the observations are arranged in the matrix $\Delta L$; each column represents the displacements of all scatterers for a specific epoch, i.e.

$$\Delta L := \text{reshape}(\Delta\ell,\, n_o^{sp},\, n_o^{t})\,, \tag{34}$$

where $n_o^{sp}$ denotes the number of observed scatterers and $n_o^{t}$ the number of recordings (time frames). The same rearrangement is done for the predicted values

$$\widetilde{\mathbf{S}}\_p := \text{reshape}(\widetilde{\mathbf{s}}\_p, n\_p^{sp}, n\_p^t) \,, \tag{35}$$

with $n_p^{sp}$ the number of points to predict and $n_p^{t}$ the number of time frames to be predicted. According to the rules of array algebra, the noise can be designed in two different ways without destroying the Kronecker structure, either

$$\Sigma_N = \Sigma_S^{t}\{t_o, t_o\} \otimes \mathbb{1}_{sp}\,\sigma_{sp}^2 \quad \text{or} \quad \Sigma_N = \mathbb{1}_{t}\,\sigma_{t}^2 \otimes \Sigma_S^{sp}\{x_o, x_o\}\,. \tag{36}$$
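The array-algebra split behind (33) can be checked numerically. The sketch below assumes the first noise design in (36), under which the noise is absorbed into the spatial factor; all matrices are small, randomly generated stand-ins with hypothetical dimensions:

```python
import numpy as np

rng = np.random.default_rng(1)

def rand_spd(n):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

n_t, n_sp = 5, 7                        # tiny, illustrative dimensions
St  = rand_spd(n_t)                     # temporal covariance  {t_o, t_o}
Ssp = rand_spd(n_sp)                    # spatial covariance   {x_o, x_o}
St_po  = rng.standard_normal((3, n_t)) @ St    # cross-covariances to the
Ssp_po = rng.standard_normal((4, n_sp)) @ Ssp  # prediction points (invented)
s2 = 0.5                                # noise variance, first design of (36)

dl = rng.standard_normal(n_sp * n_t)    # observed displacements, vec-ordered
DL = dl.reshape(n_sp, n_t, order='F')   # matrix Delta L as in (34)

# full Kronecker solution as in (32); note Sigma_N = St (x) s2*I here
Sfull = np.kron(St, Ssp) + np.kron(St, s2 * np.eye(n_sp))
s_full = np.kron(St_po, Ssp_po) @ np.linalg.solve(Sfull, dl)

# separated solution as in (33): spatial solve, then temporal solve
SP = Ssp_po @ np.linalg.solve(Ssp + s2 * np.eye(n_sp), DL) \
     @ np.linalg.solve(St, St_po.T)
```

The equality of `s_full` and the column-stacked `SP` rests on the identity $(A \otimes B)\,\mathrm{vec}(X) = \mathrm{vec}(B X A^T)$, which is exactly the reshape step in (34)/(35).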

But in both cases the interpretation of the noise behaviour is not straightforward. A much more obvious choice for the noise would be

$$\Sigma_N = \mathbb{1}_{t} \otimes \mathbb{1}_{sp}\,\sigma_N^2 = \mathbb{1}\,\sigma_N^2\,. \tag{37}$$

As shown in Schuh et al. (2022), an eigenvalue decomposition of $\Sigma_S^{sp}\{x_o, x_o\}$ into $U_{sp}\Lambda_{sp}U_{sp}^T$ or of $\Sigma_S^{t}\{t_o, t_o\}$ into $U_t\Lambda_t U_t^T$ again gives a separable form for the prediction,

$$\widetilde{S}_p = \Sigma_S^{sp}\{x_p, x_o\}\Big(\sum_{k=1}^{n_o^{t}} U_{sp}\big((\Lambda_t)_k\,\Lambda_{sp} + \mathbb{1}_{sp}\,\sigma_N^2\big)^{-1} U_{sp}^T\,\Delta L\;U_t(:,k)\,U_t(:,k)^T\Big)\,\Sigma_S^{t}\{t_o, t_p\}\,. \tag{38}$$

The great advantage of the collocation approach is that, besides the predicted values, the accuracy of the prediction can also be determined by variance propagation (Moritz 1980, Sec. 17). These calculations can likewise be separated into a temporal and a spatial component (cf. Schuh et al. (2022)).

#### **Results of the Rigorous Collocation of a DInSAR-Stack**

Our test region is, as mentioned above, the Lower-Rhine Embayment in North Rhine-Westphalia, Germany. For ERS1 and ERS2, the DInSAR-SBAS analysis results in a spatio-temporal point stack with 144,302 permanent scatterers in 64 time frames. The covariances are separated into a time-variable temporal component and a homogeneous spatial component. Since the data are available strictly at the respective recording times, a Kronecker representation of the covariance matrices is possible, which allows the calculations to be split into a temporal and a spatial part. Thus, the numerical complexity of the task is reduced significantly, and it becomes possible to compute this very extensive collocation task on a workstation or notebook within about one to two hours.

With the collocation method, surface deformations can be predicted for any location at any discrete time point. The tailored collocation approach elaborated here thus provides a continuous prediction in space for previously freely defined discrete time points. Besides the predicted values, their uncertainty is also quantified. In Fig. 5, the effects caused by groundwater management at the active opencast mines Garzweiler, Hambach and Inden are clearly recognizable as subsidence, whereas uplift is taking place over the already closed coal mines. The accuracy (standard deviation) of the prediction is in the range of 5–15 [mm], and it is immediately apparent that this accuracy is very heterogeneous. The bright points correspond to the measured permanent scatterers, while for the brown areas measurements in the vicinity are missing or they correspond to an extrapolation outside the image scene.

The orange line from the northwest to the southeast shown in Fig. 5 (left) marks a profile. Figure 6 (left) displays the behaviour of the displacement in time along this profile. Figure 6 (right) shows the displacement for the time span 1992.4 to 2000.9 [yr] and the predicted accuracies. These are just a few examples to illustrate the many possibilities of the collocation approach. In Schuh et al. (2022) the patterns of movement in time are provided as an animation.

To study the benefit of the time-variable covariance model, it is now of interest to show the difference between modeling with a time-variable and with a static temporal covariance function. As static covariance function, a Gaussian function with a standard deviation of 5 [mm] and a half-value width of approx. 4 [yr] is fitted to the mean empirical covariances calculated from all 'training points'. To stabilize the temporal covariance matrix, an additional

**Fig. 5** Predicted surface displacements (left) and their uncertainties (right) in the Lower-Rhine Embayment in North Rhine-Westphalia

**Fig. 6** Time-dependent behavior of the surface displacement along the northwest-southeast profile (shown in Fig. 5). (left) predicted behavior in time along the profile (right) distortion in a fixed time span and its uncertainties

i.i.d. noise with a standard deviation of 2 [mm] has to be introduced. Using the residuals between predicted and measured deformations at 912 randomly selected test locations, the mean, standard deviation, RMS and maximum deviation for each time epoch are empirically determined and compared to the predicted formal standard deviations. Figure 7 summarizes the differences between static and time-variable modeling. While only minor differences can be observed in the predicted values, the predicted formal variances show a significantly different behavior. In the static case the variances are constant over time, whereas the time-variable modeling shows a steady increase of the variances in line with the empirical values. In contrast to static modeling, the time-variable formulation thus results in more consistent behavior between model and data and opens up further possibilities to fit the model even better to the data.

**Fig. 7** Statistics of the residuals between predicted and measured distortions at 912 randomly distributed test locations. (left) static (right) time-variable temporal covariance function

# **4 Summary and Outlook**

In addition to the many advantages of the collocation method as a data-adaptive method, it is repeatedly stated that it lacks flexibility in the modeling of the covariances and that the models cannot be implemented due to the enormous computational effort. In this work, we demonstrate that these limitations can be overcome by appropriate methodological approaches. The advantages of the collocation method can be exploited even in the case of time-varying behavior and very large numbers of measurement points. Since the collocation method can be used to estimate function values and their uncertainties for arbitrary locations and times, it is also very well suited for the fusion of SAR data with other data, e.g. epoch-wise levelling campaigns, which will be investigated in the future.

**Acknowledgements** This research is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) Grant No. 435703911 (SCHU 2305/7-1 Nonstationary stochastic processes in least squares collocation–NonStopLSC). The authors acknowledge the European Space Agency (ESA-Project ID 17055 Integrated Modelling of SAR Interferometry and Leveling) for provision of the ERS 1/2 data. The DInSAR-SBAS evaluation was done by the Remote Sensing Software Graz (RSG). The authors gratefully acknowledge the granted access to the Bonna cluster hosted by the University of Bonn.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **A Multi-Epoch Processing Strategy for PPP-RTK Users**

A. Khodabandeh, P. J. G. Teunissen, and D. Psychas

#### **Abstract**

The present contribution aims to address why the stochastic model of the PPP-RTK user-filter is misspecified, and how one can limit the precision-loss associated with user parameter solutions. By developing tools for measuring the stated precision-loss under existing formulations of the user's Kalman filter, we propose an alternative formulation that recursively delivers close-to-minimum-variance filtered solutions when certain conditions hold. Such conditions are discussed, and their impact on the user ambiguity-resolved positioning performance is illustrated by supporting numerical results.

#### **Keywords**

Global Navigation Satellite System (GNSS) - Integer ambiguity resolution enabled precise point positioning (PPP-RTK) - Kalman filter - Time-correlated corrections

# **1 Introduction**

In PPP-RTK, one employs state-space representation for positioning corrections so as to reduce their transmission rate, i.e. the frequency with which the corrections are to be provided to single-receiver GNSS users (Wubbena et al 2005; Laurichesse and Mercier 2007; Collins et al 2010; Teunissen et al 2010). However, a reduction in the transmission rate comes at the cost of delivering time-delayed corrections. The user is therefore required to time-predict the corrections so as to bridge the gap between the corrections' generation time and the user positioning time. Consequently, next to the intrinsic uncertainty brought by the randomness of GNSS measurements, 'multi-epoch' PPP-RTK corrections also inherit extra uncertainty that is associated with their time-prediction (Wang et al 2017).

As the user's Kalman filter relies on the provision of such random positioning corrections, his corrected observation equations become *correlated* in time. This violates the Kalman filter's key assumption, namely, that the input measurements must be time-uncorrelated. As a consequence, the user's Kalman filter loses its *minimum-variance* optimality property.

In this contribution we aim to identify the main factor that makes the stochastic model of the PPP-RTK user-filter misspecified, and thereby address how the user can limit the precision-loss associated with his parameter solutions. By developing tools for measuring the stated precision-loss under existing formulations of the user's Kalman filter, alternative multi-epoch formulations are developed that can recursively deliver close-to-minimum-variance filtered solutions of the user parameters. To bound the corresponding precision-loss experienced by the filtered solutions of such formulations, certain conditions must hold. These conditions are discussed, and their impact on the user ambiguity-resolved positioning performance is illustrated by supporting numerical results.

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_228

A. Khodabandeh (✉)
Department of Infrastructure Engineering, The University of Melbourne, Melbourne, VIC, Australia
e-mail: akhodabandeh@unimelb.edu.au

P. J. G. Teunissen
Department of Infrastructure Engineering, The University of Melbourne, Melbourne, VIC, Australia

Department of Geoscience and Remote Sensing, Delft University of Technology, Delft, The Netherlands

GNSS Research Centre, Curtin University, Perth, WA, Australia

D. Psychas
European Space Agency (ESA/ESTEC), Noordwijk, The Netherlands

# **2 User Model Aided by External Corrections**

Consider the (linearized) system of observation equations of a single-receiver PPP-RTK user

$$\underline{u} = B\,b + C\,c + \underline{n}\,, \tag{1}$$

where the user observation vector $\underline{u}$, together with the zero-mean random noise $\underline{n}$, is linked to the user's unknown parameter vector $b$ and the unknown correction vector $c$ through the full-rank design matrices $B$ and $C$. The augmented design matrix $[B, C]$ is rank-deficient though, meaning that the system is *not* solvable for both $b$ and $c$. The observation vector $\underline{u}$ may contain GNSS carrier-phase and pseudorange (code) measurements, with $b$ containing the position coordinates, carrier-phase ambiguities, receiver clock parameters, and instrumental biases. The correction vector $c$, on the other hand, may contain estimable forms of satellite orbit and clock parameters, atmospheric parameters, and phase/code biases (Leick et al 2015; Odijk et al 2015; Teunissen and Montenbruck 2017). The underscore indicates the 'randomness' of quantities.

Due to the rank-deficiency of $[B, C]$ in (1), the user cannot *unbiasedly* determine the unknown parameters $b$ with the *sole* use of his measurements. To obtain $b$ unbiasedly, the user has to take recourse to an external provider, e.g. a network of permanent GNSS stations (Wubbena et al 2005), to receive an unbiased solution of the correction vector $c$. Let $\underline{\hat{c}}$ denote such an external correction solution. With the provision of $\underline{\hat{c}}$, the user can extend his model (1) to

$$\begin{bmatrix} \underline{u} \\ \underline{\hat{c}} \end{bmatrix} = \begin{bmatrix} B & C \\ 0 & I \end{bmatrix} \begin{bmatrix} b \\ c \end{bmatrix} + \begin{bmatrix} \underline{n} \\ \underline{\epsilon} \end{bmatrix}, \tag{2}$$

with $\underline{\epsilon}$ being the zero-mean random noise vector that characterises the 'randomness' of the correction solution $\underline{\hat{c}}$. Since the user design matrix $B$ is of full column rank, and the correction vector $c$ can now be determined through $\underline{\hat{c}}$, the system (2) is solvable. As far as the estimation of the user parameters $b$ is concerned, the system of equations (2) can be reduced for $c$. Such a reduced model is formed by pre-multiplying (2) with the matrix $[I, -C]$. This gives

$$\underline{u} - C\,\underline{\hat{c}} = B\,b + \underline{\tilde{n}}\,, \quad \text{with} \quad \underline{\tilde{n}} := \underline{n} - C\,\underline{\epsilon} \tag{3}$$

The reduced model (3), with the user *corrected* observation vector $\underline{u} - C\,\underline{\hat{c}}$, forms the basis of existing PPP-RTK models (Wubbena et al 2005; Laurichesse and Mercier 2007; Collins et al 2010; Teunissen et al 2010). In contrast to the model (2), where both $b$ and $c$ are *jointly* estimated, (3) does not directly allow a further update on the correction solution $\underline{\hat{c}}$. From the perspective of a single-receiver user who is merely interested in his parameters $b$, the reduced model (3) is more appealing in the sense that it involves fewer unknowns. In fact, the reduced model (3) can be shown to deliver user parameter solutions that are *identical* to those of (2) *if* the (co)variance propagation law to the corrected observation vector $\underline{u} - C\,\underline{\hat{c}}$ is properly applied (Teunissen 2000). This means all the information required for the estimation of $b$ is preserved when the user weights the corrected observation vector $\underline{u} - C\,\underline{\hat{c}}$ in accordance with the inverse of the variance matrix of the noise vector $\underline{\tilde{n}} = \underline{n} - C\,\underline{\epsilon}$. In practice, however, the variance matrix of the correction solution $\underline{\hat{c}}$, i.e. the dispersion of $\underline{\epsilon}$ in $\underline{\tilde{n}}$, may only be *partially* known to the user. As a consequence, the user takes recourse to the known part of this variance matrix to weight his corrected observation vector $\underline{u} - C\,\underline{\hat{c}}$, missing part of the required information and thereby experiencing *precision-loss* in the estimation of $b$. The following theorem provides a general means for measuring such precision-loss.
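The stated equivalence between the joint model (2) and the reduced model (3) under proper covariance propagation can be checked numerically. The sketch below compares the covariance of the $b$-solution from both formulations, with small random stand-in matrices and invented dimensions:

```python
import numpy as np

rng = np.random.default_rng(5)
m, nb, nc = 8, 2, 3
B = rng.standard_normal((m, nb))   # user design matrix
C = rng.standard_normal((m, nc))   # correction design matrix
Qn = np.eye(m)                     # variance matrix of the noise n
Qc = 0.5 * np.eye(nc)              # variance matrix of the correction solution

# joint model (2): estimate (b, c) from the stacked vector [u; c_hat]
A2 = np.block([[B, C], [np.zeros((nc, nb)), np.eye(nc)]])
Q2 = np.block([[Qn, np.zeros((m, nc))], [np.zeros((nc, m)), Qc]])
N2 = A2.T @ np.linalg.solve(Q2, A2)
Qb_joint = np.linalg.inv(N2)[:nb, :nb]

# reduced model (3): u - C c_hat = B b + n - C eps, with the propagated
# covariance Qn + C Qc C^T used for the weighting
Qred = Qn + C @ Qc @ C.T
Qb_red = np.linalg.inv(B.T @ np.linalg.solve(Qred, B))
```

Both covariance matrices of $\hat{b}$ coincide, illustrating that no information is lost by the reduction as long as the full propagated covariance is used.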

**Theorem (Suboptimality)** Let the zero-mean random vector $\underline{p}$, with the full-column-rank matrix $L$, *perturb* the system of observation equations

$$\underline{y} = A\,x + \underline{e} + L\,\underline{p}\,, \tag{4}$$

in which the observation vector $\underline{y}$, with its zero-mean residual vector $\underline{e}$, is linked to the unknown parameter vector $x$ by the full-column-rank design matrix $A$. Also, let the variance matrix of $\underline{e}$ be given by the positive-definite matrix $Q_e$. In the absence of the variance matrix of $\underline{p}$, say $Q_p$, the least-squares estimator

$$\underline{\hat{x}} = A^+\,\underline{y}\,, \quad \text{with} \quad A^+ := (A^T Q_e^{-1} A)^{-1} A^T Q_e^{-1}\,, \tag{5}$$

is *not* of minimum variance, and therefore suboptimal. Its precision-loss, in estimating every function $\theta = f^T x$, can be measured by the following variance-ratio bounds

$$1 + \lambda_{\min}(M_{L_A} M_{L_{A^\perp}}) \le \frac{\mathrm{Var}(f^T \underline{\hat{x}})}{\mathrm{Var}(f^T \underline{\hat{x}}^\star)} \le 1 + \lambda_{\max}(M_{L_A} M_{L_{A^\perp}}) \tag{6}$$

with $\underline{\hat{x}}^\star$ denoting the optimal (minimum-variance) least-squares estimator. Matrices $M_{L_A}$ and $M_{L_{A^\perp}}$ are given by $M_{L_A} = Q_p L_A^T (Q_e + L Q_p L^T)^{-1} L_A$ and $M_{L_{A^\perp}} = Q_p L_{A^\perp}^T (Q_e + L_{A^\perp} Q_p L_{A^\perp}^T)^{-1} L_{A^\perp}$, where $L_A = A A^+ L$ and $L_{A^\perp} = L - L_A$. The symbols $\lambda_{\min}(\cdot)$ and $\lambda_{\max}(\cdot)$ denote the minimum and maximum eigenvalues of a matrix, respectively.

*Proof* The proof is given in the Appendix. $\square$

To better appreciate the bounds in (6), compare the suboptimal least-squares estimator (5) with its minimum-variance counterpart (Koch 1999; Teunissen 2000)

$$\underline{\hat{x}}^\star = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1}\,\underline{y}\,, \quad \text{with} \quad Q_y = Q_e + L\,Q_p\,L^T \tag{7}$$

The theorem states that if, instead of the full variance matrix $Q_y$, the weighting of the observation vector $\underline{y}$ in (4) is conducted with the known part $Q_e$, the increase in the variance of the solutions $f^T \underline{\hat{x}}$ relative to that of their optimal counterparts $f^T \underline{\hat{x}}^\star$ can always be bounded by (6). Ideally, we wish the bounds $(1 + \lambda_{\min})$ and $(1 + \lambda_{\max})$ to be close to unity. Their deviation from unity is due to the presence of the nonnegative eigenvalues $\lambda_{\min}$ and $\lambda_{\max}$, which indicate the smallest and largest precision-loss experienced by the suboptimal estimator (5), respectively.
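The variance-ratio bounds can be illustrated numerically. In the sketch below, all matrices are small random stand-ins: the suboptimal estimator weights with $Q_e$ only, and its variance ratio is checked against the eigenvalue bounds in (6):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, q = 12, 3, 2                        # obs, params, perturbation dimension
A = rng.standard_normal((m, n))
L = rng.standard_normal((m, q))
Qe = np.eye(m)                            # known noise covariance
Qp = 0.5 * np.eye(q)                      # covariance of p, unknown to the user
Qy = Qe + L @ Qp @ L.T                    # full covariance, Eq. (7)

Aplus = np.linalg.solve(A.T @ np.linalg.solve(Qe, A), A.T @ np.linalg.inv(Qe))
Qx_sub = Aplus @ Qy @ Aplus.T             # dispersion of suboptimal estimator (5)
Qx_opt = np.linalg.inv(A.T @ np.linalg.solve(Qy, A))  # optimal, Eq. (7)

LA = A @ Aplus @ L                        # projection of L onto range(A)
LAp = L - LA                              # orthogonal-complement part
M_LA = Qp @ LA.T @ np.linalg.solve(Qy, LA)
M_LAp = Qp @ LAp.T @ np.linalg.solve(Qe + LAp @ Qp @ LAp.T, LAp)
lam = np.sort(np.linalg.eigvals(M_LA @ M_LAp).real)

f = rng.standard_normal(n)                # arbitrary linear function f^T x
ratio = (f @ Qx_sub @ f) / (f @ Qx_opt @ f)
```

For any choice of `f`, the ratio stays within $[1 + \lambda_{\min},\, 1 + \lambda_{\max}]$, which is the content of the theorem.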

Such precision-loss is driven by the product of the two matrices $M_{L_A}$ and $M_{L_{A^\perp}}$, each of which is a function of the orthogonal projections $L_A$ and $L_{A^\perp}$ of the matrix $L$, respectively. Here, orthogonality is defined with respect to the inner-product metric $Q_e^{-1}$. Thus $L_A^T Q_e^{-1} L_{A^\perp} = 0$ and $L = L_A + L_{A^\perp}$. This implies, for nonzero matrices $L$, that the two matrices $M_{L_A}$ and $M_{L_{A^\perp}}$ cannot *simultaneously* be made zero. In fact, these two matrices 'compete' to limit the precision-loss experienced by the estimator $\underline{\hat{x}}$. To see this, let us consider two extreme competing cases: (1) when $L$ completely lies in the column space of the design matrix $A$ (i.e. when $L_{A^\perp} = 0$), and (2) when $L$ is orthogonal to the column space of $A$ (i.e. when $L_A = 0$). The first case is when $L$ can be expressed as $L = A\,P$ for some matrix $P$. In this case, the random vector $\underline{p}$ is completely absorbed by the parameter vector $x$, thus simplifying the model (4) to $\underline{y} = A\,(x + P\,\underline{p}) + \underline{e}$. As a result, the model cannot distinguish between $x$ and $\bar{x} = x + P\,\underline{p}$, meaning that the uncertainty due to $\underline{p}$ cannot be adjusted by *any* weighted least-squares adjustment. Both the optimal and suboptimal estimators $\underline{\hat{x}}^\star$ and $\underline{\hat{x}}$ would therefore experience the same amount of uncertainty. This is also corroborated by the bounds in (6), as the eigenvalues $\lambda_{\min}$ and $\lambda_{\max}$ become zero through the equality $L_{A^\perp} = 0$ (or $M_{L_{A^\perp}} = 0$).

The second case is when $A^T Q_e^{-1} L = 0$. In this case, both the optimal and suboptimal estimators $\underline{\hat{x}}^\star$ and $\underline{\hat{x}}$ are *uncorrelated* with the random vector $\underline{p}$, i.e. $\mathrm{Cov}(\underline{\hat{x}}^\star, \underline{p}) = \mathrm{Cov}(\underline{\hat{x}}, \underline{p}) = 0$. This follows by applying the covariance propagation law, respectively, between (7) and $\underline{p}$, and between (5) and $\underline{p}$, together with the equalities $\mathrm{Cov}(\underline{y}, \underline{p}) = L\,Q_p$, $A^T Q_e^{-1} L = 0$ and $Q_y^{-1} = Q_e^{-1} - Q_e^{-1} L\,(Q_p^{-1} + L^T Q_e^{-1} L)^{-1} L^T Q_e^{-1}$. Thus both estimators $\underline{\hat{x}}^\star$ and $\underline{\hat{x}}$ remain intact irrespective of the uncertainty level of $\underline{p}$. The bounds in (6) also support this, as the eigenvalues $\lambda_{\min}$ and $\lambda_{\max}$ become zero through the equality $L_A = 0$ (or $M_{L_A} = 0$). Apart from the two extreme cases discussed above, the maximum eigenvalue $\lambda_{\max}$ is *different* from zero, leading the estimator (5) to lose its minimum-variance property.

The result (6) can be used to quantify the suboptimality level of PPP-RTK user parameter solutions when the correctional uncertainty, i.e. the variance matrix of $\underline{\epsilon}$ in the reduced model (3), is unknown to the user. To set the stage for measuring the largest possible precision-loss that the user estimator can experience, one makes the following settings: $\underline{y} \mapsto (\underline{u} - C\,\underline{\hat{c}})$, $A \mapsto B$, $\underline{e} \mapsto \underline{n}$, $\underline{p} \mapsto -\underline{\epsilon}$, and $L \mapsto C$. In the next section we employ the result (6) to assess the precision performance of 'multi-epoch' formulations that are used to determine the user parameter vector $b$ in a *recursive* manner.

# **3 Multi-epoch Formulations of the User Model**

In the context of PPP-RTK, the user parameter solutions are to be computed in a near real-time manner, requiring the application of least-squares estimation in its 'recursive' Kalman filter forms (Kalman 1960; Simon 2006; Teunissen 2001). Accordingly, the user parameter vector $b$ may be partitioned into a time series of parameter vectors $b_j$ ($j = i, i+1, \ldots$), where the subscripts $i$ and $j$ indicate the time instance (epoch). Likewise, the time-uncorrelated observation vectors $\underline{u}_j$ ($j = i, i+1, \ldots$) replace $\underline{u}$. This gives the 'multi-epoch' version of the user observation equations (1) as follows

$$\underline{u}_j = B_j\,b_j + C_j\,c_j + \underline{n}_j\,, \quad j = i, i+1, \ldots \tag{8}$$

Given the system of equations (8), the user needs to receive solutions of the correction vectors $c_j$ from an external provider at every epoch $j$. In practice, however, the provider disseminates state-space correction solutions at $\tau$-second intervals to minimize the amount of information to be transmitted to the user (Wubbena et al 2005). The longer the sampling period $\tau$, the less bandwidth is required for data transmission. While each individual correction type (e.g. satellite orbits versus clocks) can have its own sampling period $\tau$, such a distinction is not made here for the sake of presentation. We instead assume one common sampling period $\tau$ for all correction types. Let $\underline{\hat{c}}_{k\tau|k\tau}$ denote the solution of the correction vector $c_{k\tau}$ that is obtained from all the provider observations collected up to and including the epoch $k\tau$, where $k$ is a positive integer indicating the number of the $\tau$-second intervals. The user, however, needs a correction solution at epoch $i \ge k\tau$. To this end, such a solution can be *time-predicted* using the delayed solution $\underline{\hat{c}}_{k\tau|k\tau}$, provided that information about the time behavior of the corrections is known to the user. Such information can be expressed in terms of the corrections' dynamic models (Teunissen 2001)

$$\underline{o}_t^c = c_t - \Phi^c\,c_{t-1} + \underline{w}_t^c\,, \quad t = 2, 3, \ldots \tag{9}$$

where the randomness of the zero-sampled pseudo-observations $\underline{o}_t^c$ is characterized by the time-uncorrelated process noises $\underline{w}_t^c$. The transition matrix $\Phi^c$ links the correction parameters between two successive epochs. Thus, $\Phi^c_{(j-i)} = \prod_{h=1}^{j-i} \Phi^c$ ($j > i$) links the corrections from epoch $i$ to epoch $j$. Accordingly, the sought-for correction solution can be time-predicted as $\underline{\hat{c}}_{i|k\tau} = \Phi^c_{(i-k\tau)}\,\underline{\hat{c}}_{k\tau|k\tau}$.
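A minimal sketch of this time-prediction step, with a hypothetical scalar correction, transition coefficient and process-noise variance; the predicted variance grows with the latency to be bridged:

```python
import numpy as np

# hypothetical scalar correction with transition Phi^c and process noise w^c
phi, q = 0.98, 0.1**2
c_hat, P = 5.0, 0.2**2     # provider solution c_hat_{kt|kt} and its variance
latency = 8                # number of epochs i - k*tau to bridge

for _ in range(latency):   # c_hat_{i|kt} = Phi^c_(i-kt) * c_hat_{kt|kt}
    c_hat = phi * c_hat
    P = phi**2 * P + q     # each bridged epoch adds process-noise variance
```

This accumulated prediction uncertainty is exactly the extra, time-correlated randomness that the multi-epoch corrections carry into the user's filter.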

As with the corrections, the time behavior of the user parameter vectors $b_j$ can also be incorporated into the estimation process to improve the corresponding parameter solutions. It is expressed by the following dynamic models

$$\underline{o}_j^b = b_j - \Phi^b\,b_{j-1} + \underline{w}_j^b\,, \quad j = i+1, i+2, \ldots \tag{10}$$

where the transition matrix $\Phi^b$ links the user parameters over time, with the zero-sampled pseudo-observations $\underline{o}_j^b$ and the time-uncorrelated process noises $\underline{w}_j^b$ ($j = i+1, i+2, \ldots$).

# **3.1 Representation in Batch Forms**

The user can feed the time-predicted correction solution $\underline{\hat{c}}_{i|k\tau}$ into his measurement and dynamic models (8) and (10) so as to run his recursive Kalman filter. As will be shown below, different formulations of the user-filter can be established, and the user ideally wishes to adopt the formulation that delivers parameter solutions with the *smallest* precision-loss. To measure the precision-loss under different formulations, one can employ the result of the theorem given in (6). To do so, one first needs to form the *multi-epoch* version of (2) and, consequently, identify the corresponding reduced model (3).

Consider the epochs within a $\tau$-second time interval, $j = i, \ldots, (k+1)\tau - 1$, where it is assumed that the user's initial epoch $i$ is larger than or equal to the correction transmission time $k\tau$, i.e. $i \ge k\tau$. During this time interval, the user-filter relies on the provider's filtered correction $\underline{\hat{c}}_{k\tau|k\tau}$. In the next time interval, i.e. at epoch $j = (k+1)\tau$, the user-filter can replace the outdated correction $\underline{\hat{c}}_{k\tau|k\tau}$ by its newer counterpart $\underline{\hat{c}}_{(k+1)\tau|(k+1)\tau}$. With this in mind, the multi-epoch version of (2) follows by augmenting the user measurement and dynamic models (8) and (10) with the dynamic models of the corrections (9). This reads (Teunissen 2001)

$$
\begin{bmatrix}
\underline{u}_i \\ \underline{\hat{c}}_{i|k\tau} \\
\underline{o}_{i+1} \\ \underline{u}_{i+1} \\
\vdots \\
\underline{o}_{(k+1)\tau} \\ \underline{u}_{(k+1)\tau} \\ \underline{\hat{c}}_{(k+1)\tau|(k+1)\tau}
\end{bmatrix}
=
\begin{bmatrix}
\begin{bmatrix} B_i & C_i \\ 0 & I \end{bmatrix} & & & \\[4pt]
-\begin{bmatrix} \Phi^b & 0 \\ 0 & \Phi^c \end{bmatrix} & I & & \\[4pt]
& \begin{bmatrix} B_{i+1} & C_{i+1} \end{bmatrix} & & \\[4pt]
& \ddots & \ddots & \\[4pt]
& & -\begin{bmatrix} \Phi^b & 0 \\ 0 & \Phi^c \end{bmatrix} & I \\[4pt]
& & & \begin{bmatrix} B_{(k+1)\tau} & C_{(k+1)\tau} \\ 0 & I \end{bmatrix}
\end{bmatrix}
\begin{bmatrix}
b_i \\ c_i \\ b_{i+1} \\ c_{i+1} \\ \vdots \\ b_{(k+1)\tau} \\ c_{(k+1)\tau}
\end{bmatrix}
+ \underline{\varepsilon}\,, \tag{11}
$$

where $\underline{o}_j := [(\underline{o}_j^b)^T, (\underline{o}_j^c)^T]^T$ collects the zero-sampled pseudo-observations of the dynamic models (9) and (10).

On the left-hand side of (11), the user observation vectors $\underline{u}_j$ ($j = i, i+1, \ldots, (k+1)\tau$) are accompanied by the correction solutions of the two successive time intervals, $\underline{\hat{c}}_{i|k\tau}$ and $\underline{\hat{c}}_{(k+1)\tau|(k+1)\tau}$, together with the zero-sampled pseudo-observations $\underline{o}_j^b$ and $\underline{o}_j^c$. On the right-hand side, all the involved unknowns (both the user and correction parameters $b_j$ and $c_j$) are linked to the measurements via the 'batch' structure of the design matrices $B_j$ and $C_j$, together with the transition matrices $\Phi^b$ and $\Phi^c$. As with any system of observation equations, the batch form (11) is also accompanied by a zero-mean random vector $\underline{\varepsilon}$. This vector can be expressed as a summation of four uncorrelated terms as follows

The first term I contains the user-specific measurement and process noises, which are time-uncorrelated. The second term II contains the accumulated process noise due to the correction latency $i - k\tau$, i.e. the delay between the time the corrections are filtered by the provider and the time they are provided to the user. The third term III contains the correction process noises, which are also time-uncorrelated. In contrast to the first three terms, however, the fourth (last) term IV contains the correction estimation errors $\underline{\hat{\epsilon}}_{k\tau|k\tau} = \underline{\hat{c}}_{k\tau|k\tau} - c_{k\tau}$ and $\underline{\hat{\epsilon}}_{(k+1)\tau|(k+1)\tau} = \underline{\hat{c}}_{(k+1)\tau|(k+1)\tau} - c_{(k+1)\tau}$, which are *correlated*, see e.g. Teunissen and Khodabandeh (2013). This implies that the variance matrix of $\underline{\varepsilon}$ is *not* block-diagonal, preventing the recursive computation of minimum-variance parameter solutions (Teunissen 2001). This shows that the stochastic model of the PPP-RTK user-filter is always misspecified and therefore *suboptimal* in the minimum-variance sense, no matter which formulation is adopted. However, the user can still recursively compute suboptimal parameter solutions by approximating the stated variance matrix with a block-diagonal positive-definite matrix. Each approximation adopted leads to a different formulation of the user-filter. In the following we discuss three different formulations and assess their corresponding precision-loss in estimating the user parameters $b_j$.

# **3.2 Case 1: Correctional Uncertainty Ignored**

A straightforward choice of the block-diagonal matrix approximating the variance matrix of $\underline{\varepsilon}$ is obtained by ignoring the uncertainty of the corrections. In other words, the external corrections $\underline{\hat{c}}_{j|k_\tau}$ ($j = i, i+1, \ldots, k+1$) are assumed precise enough to be treated as *non-random*, the scenario that is commonly exercised in practice (Khodabandeh 2021). According to this choice, the last three terms II, III and IV in (12) are discarded. Therefore, only the variance matrix of the first term I is used to weight the underlying observation vectors. At every epoch $j$, the user would then work with the following measurement model

$$
\begin{bmatrix} \underline{u}_j \\ \underline{\hat{c}}_{j|k_\tau} \end{bmatrix} \approx \begin{bmatrix} B_j & C_j \\ 0 & I \end{bmatrix} \begin{bmatrix} b_j \\ c_j \end{bmatrix} + \begin{bmatrix} \underline{n}_j \\ 0 \end{bmatrix} \tag{13}
$$

The reduced form of the above system, together with the user dynamic model (10), is used to set up the underlying user-filter, that is

$$
\text{Case 1}: \begin{cases} \text{measurement-model}: \ \underline{u}_j - C_j\,\underline{\hat{c}}_{j|k_\tau} \approx B_j\, b_j + \underline{n}_j \\ \text{dynamic-model}: \ \underline{o}_j^b = b_j - \Phi^b\, b_{j-1} + \underline{w}_j^b \end{cases} \tag{14}
$$

Since the measurement noises $\underline{n}_j$ are time-uncorrelated, the user can run his Kalman filter in its recursive form (Teunissen 2001).
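As an illustration, the Case-1 recursion can be sketched as a standard Kalman time/measurement update in which the received corrections enter as known constants. This is a minimal sketch with illustrative dimensions, not the authors' implementation; all function and variable names are our own:

```python
import numpy as np

def case1_update(b, P, u, c_hat, B, C, R, Phi_b, Q_wb):
    """One epoch of the Case-1 user filter: the external corrections
    c_hat are treated as non-random (their uncertainty is ignored).
    Matrix names follow the text; dimensions are illustrative."""
    # Time update with the user dynamic model b_j = Phi_b b_{j-1} + w_j^b
    b_pred = Phi_b @ b
    P_pred = Phi_b @ P @ Phi_b.T + Q_wb
    # Measurement update on the corrected observable u_j - C_j c_hat
    y = u - C @ c_hat - B @ b_pred          # innovation
    S = B @ P_pred @ B.T + R                # innovation variance (term I only)
    K = P_pred @ B.T @ np.linalg.inv(S)     # Kalman gain
    b_new = b_pred + K @ y
    P_new = (np.eye(len(b)) - K @ B) @ P_pred
    return b_new, P_new
```

Because the correction uncertainty is suppressed in `S`, the filter is computationally recursive but, as the text explains, suboptimal.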

# **3.3 Case 2: Correction Process Noise Ignored**

The second choice for approximating the variance matrix of $\underline{\varepsilon}$ is made by ignoring the uncertainty of the correction process noises $\underline{w}_j^c$ over the epochs $j = i+1, \ldots, (k+1)-1$. According to this choice, the last two terms III and IV in (12) are discarded. The user chooses the variance matrix of I+II to weight his observation vectors. At every epoch $j$, the user would then work with the following measurement model

$$
\begin{bmatrix} \underline{u}_j \\ \underline{\hat{c}}_{j|k_\tau} \end{bmatrix} \approx \begin{bmatrix} B_j & C_j \\ 0 & I \end{bmatrix} \begin{bmatrix} b_j \\ c_j \end{bmatrix} + \begin{bmatrix} \underline{n}_j \\ \sum_{h=k_\tau+1}^{j} \Phi^c_{(j-h)}\, \underline{w}^c_h \end{bmatrix} \tag{15}
$$

Similar to Case 1, the reduced form of the above system, together with (10), is used to set up the underlying user-filter, that is (compare with 14)

$$
\text{Case 2}: \begin{cases} \text{measurement-model}: \ \underline{u}_j - C_j\,\underline{\hat{c}}_{j|k_\tau} \approx B_j\, b_j + \Big(\underline{n}_j - \sum_{h=k_\tau+1}^{j} C_j\,\Phi^c_{(j-h)}\,\underline{w}^c_h\Big) \\ \text{dynamic-model}: \ \underline{o}_j^b = b_j - \Phi^b\, b_{j-1} + \underline{w}_j^b \end{cases} \tag{16}
$$

Since the uncertainty of $\underline{w}_j^c$ is ignored, the reduced measurement noise vectors $\underline{n}_j - \sum_{h=k_\tau+1}^{j} C_j \Phi^c_{(j-h)} \underline{w}^c_h$ can be treated *as if* they were time-uncorrelated, allowing the recursive computation of the user parameter solutions. As with Case 1, Case 2 also delivers suboptimal parameter solutions. In contrast to Case 1, however, Case 2 incorporates the uncertainty due to the time-prediction of the corrections $\underline{\hat{c}}_{j|k_\tau}$ into the measurement model.
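The Case-2 weighting (terms I + II) can be sketched by inflating the measurement variance with the correction process noise accumulated over the latency. The sketch below assumes a time-invariant transition matrix $\Phi^c$ and process-noise variance $Q_{w^c}$; the function name and this simplification are ours, not the paper's:

```python
import numpy as np

def case2_obs_variance(R, C, Phi_c, Q_wc, latency):
    """Variance of the reduced Case-2 measurement noise
    n_j - sum_h C Phi_c^(j-h) w_h^c  (terms I + II of the text),
    treating the accumulated correction process noise as extra,
    epoch-wise measurement noise. Illustrative sketch only."""
    V = R.copy()
    P = np.eye(Phi_c.shape[0])           # Phi_c^0
    for _ in range(latency):             # h = k_tau+1, ..., j
        V = V + C @ P @ Q_wc @ P.T @ C.T # add one accumulated-noise term
        P = Phi_c @ P                    # next power of the transition matrix
    return V
```

With zero latency the weighting reduces to that of Case 1; with growing latency the observation variance (and hence the down-weighting of the corrected observables) grows.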

# **3.4 Case 3: Correction Estimation-Error Ignored**

As stated previously, it is only the last term IV in (12) that makes the user-filter misspecified. One may therefore approximate the variance matrix of $\underline{\varepsilon}$ by neglecting the presence of IV. The rationale behind this approximation is that the provider's filtered solutions $\hat{c}_{k|k}$ can become precise enough to neglect their estimation errors $\hat{\varepsilon}_{k|k}$ when the duration of the provider-filter initialization, i.e. the time difference between the epoch $k$ and the initial epoch $t = 1$, becomes sufficiently large (e.g., 1 h), see Wang et al. (2017), Khodabandeh (2021) and Psychas et al. (2022). Upon making this approximation, the user would then work with the following measurement and dynamic models (compare with 16)

$$
\text{Case 3}: \begin{cases} \text{measurement-model}: \begin{cases} \begin{bmatrix} \underline{u}_j \\ \underline{\hat{c}}_{j|k_\tau} \end{bmatrix} \approx \begin{bmatrix} B_j & C_j \\ 0 & I \end{bmatrix} \begin{bmatrix} b_j \\ c_j \end{bmatrix} + \begin{bmatrix} \underline{n}_j \\ \sum_{h=k_\tau+1}^{j} \Phi^c_{(j-h)}\, \underline{w}^c_h \end{bmatrix} & (j = k_\tau) \\ \underline{u}_j \approx [B_j,\; C_j] \begin{bmatrix} b_j \\ c_j \end{bmatrix} + \underline{n}_j & (j \neq k_\tau) \end{cases} \\ \text{dynamic-model}: \ \begin{bmatrix} \underline{o}_j^b \\ \underline{o}_j^c \end{bmatrix} = \begin{bmatrix} b_j \\ c_j \end{bmatrix} - \begin{bmatrix} \Phi^b & 0 \\ 0 & \Phi^c \end{bmatrix} \begin{bmatrix} b_{j-1} \\ c_{j-1} \end{bmatrix} + \begin{bmatrix} \underline{w}_j^b \\ \underline{w}_j^c \end{bmatrix} \end{cases} \tag{17}
$$

Note the difference between the formulation of Case 3 and those of the two earlier cases. In Case 3, the system is *not* reduced for the correction parameters $c_j$. This is because the reduced measurement noise vectors $\underline{n}_j - \sum_{h=k_\tau+1}^{j} C_j \Phi^c_{(j-h)} \underline{w}^c_h$ are *time-correlated*. In order to run the filter in its recursive form, the user therefore has to work with the *augmented* state-vector $[b_j^T,\, c_j^T]^T$ instead.
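A minimal sketch of assembling the augmented Case-3 state model, assuming time-invariant transition and process-noise matrices (the function name and this simplification are ours):

```python
import numpy as np

def augment(Phi_b, Phi_c, Q_wb, Q_wc):
    """Block-diagonal transition and process-noise matrices for the
    augmented state [b; c] used by the Case-3 user filter (sketch)."""
    nb, nc = Phi_b.shape[0], Phi_c.shape[0]
    Z = np.zeros((nb, nc))
    Phi = np.block([[Phi_b, Z],
                    [Z.T, Phi_c]])   # joint transition matrix
    Q = np.block([[Q_wb, Z],
                  [Z.T, Q_wc]])      # joint process-noise variance
    return Phi, Q
```

The corresponding design matrix at non-correction epochs is simply the horizontal concatenation $[B_j,\,C_j]$, so the standard Kalman recursion applies to the augmented state without reducing for $c_j$.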

To numerically evaluate the maximum precision loss experienced by the user-filter under the formulations of the three cases discussed above, we employ the result (6) and compute the square-root of the upper bound, i.e. $\sqrt{1+\lambda_{\max}}$, for the case where a dual-frequency Galileo user (E1/E5a) is provided with clock, bias and ionospheric corrections every $\tau$ seconds. The eigenvalue $\lambda_{\max}$ is evaluated on the basis of the variance matrix corresponding to the multi-epoch batch model (11). The corresponding results as a function of the correction latency $i-k$ are shown in Fig. 1. As illustrated in the figure, the stated upper bounds of all three cases are close to unity in the absence of correction latency (i.e. when $i = k$), indicating that they would deliver parameter solutions almost as precise as those of minimum-variance estimation. However, the suboptimality levels of Cases 1 and 2 rapidly worsen as the latency increases (the red and blue curves). Provided that the duration of the provider-filter initialization is sufficiently long, however, the precision loss associated with Case 3 remains marginal (see the green curves in the right panel of the figure).
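Given the variance matrices of a suboptimal and of the minimum-variance solution, the upper bound $\sqrt{1+\lambda_{\max}}$ can be evaluated numerically. A sketch (names are ours; we assume, as holds for any suboptimal estimator, that the difference of the two variance matrices is positive semi-definite):

```python
import numpy as np

def max_precision_loss(Q_sub, Q_opt):
    """Worst-case suboptimal-to-optimal standard-deviation ratio,
    sqrt(1 + lambda_max) with lambda_max the largest eigenvalue of
    M = (Q_sub - Q_opt) Q_opt^{-1} (cf. the bound in Eq. (6))."""
    M = (Q_sub - Q_opt) @ np.linalg.inv(Q_opt)
    lam_max = np.max(np.real(np.linalg.eigvals(M)))
    return np.sqrt(1.0 + lam_max)
```

For example, with `Q_opt = I` and `Q_sub = diag(1, 4)` the worst-case ratio is `2.0`: the suboptimal standard deviation can be at most twice the optimal one.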

Next to the primary evaluation in Fig. 1, we also make use of a Galileo dual-frequency (E1/E5a) real-world data-set to study the positioning performance of the misspecified user-filter. The data-set was collected at a 1 Hz sampling-

**Fig. 1** The maximum increase in the standard-deviation ratio of the suboptimal-to-optimal estimation of the user parameters using network-derived corrections of a single station (thick lines) and twenty stations (dashed lines), as a function of the correction latency [seconds]. The duration of the provider-filter initialization is set to 5 min (*left*) and 1 h (*right*). The results of Cases 1, 2 and 3 are indicated in red, blue and green, respectively

**Fig. 2** Ambiguity-float results: the medians (50% percentiles) of the absolute positioning errors corresponding to 300 user-filter realizations, within the area of their 25% and 75% percentiles. The horizontal axes indicate the time elapsed (in seconds) since the user-filter started. The results of Cases 1, 2 and 3 are indicated in red, blue and green, respectively

rate on 21 January 2022 by two GNSS permanent stations, CUT0 and UWA0, both located in Western Australia. The precise orbital corrections are applied to the data a priori. To emphasize the performance of the proposed filter formulations (i.e. Cases 2 and 3) in handling time-delayed corrections, we consider correction latencies considerably higher than the typical latency of 5–10 s of current IGS real-time PPP corrections (https://igs.org/rts/), see, e.g., Leandro et al. (2011). The clock corrections are made available to the user every 10 s, the ionospheric corrections every 30 s, and the phase-bias corrections every 10 min. The corrections are generated via a single-station PPP-RTK setup (Khodabandeh 2021), where the duration of the provider-filter initialization is set to 1 h. Station CUT0 serves as the correction provider, whereas station UWA0, about 8 km away from the provider, serves as the user.

In order to infer the overall performance of the user-filter under the formulations offered by Cases 1, 2 and 3, we generate 300 different realizations of the filtered positioning solutions by shifting the user-filter starting epoch $i$ every 15 s. The time-series of the medians (i.e. 50% percentiles) of these realizations, within the area of their 25% and 75% percentiles, are presented in Figs. 2 and 3 for the user ambiguity-float and ambiguity-fixed options, respectively. The medians of the positioning errors corresponding to Cases 2 and 3 are shown to be considerably smaller than those of Case 1. The results also indicate that Case 3 outperforms Case 2 as it, on average, delivers smaller medians of the positioning errors. In particular, the difference in their performance becomes considerable when the user fixes his float ambiguities. Note also the presence of periodic jumps in the medians for all three cases. This behaviour is due to the periodic nature

**Fig. 3** Ambiguity-fixed results: the medians (50% percentiles) of the absolute positioning errors corresponding to 300 user-filter realizations, within the areas of their 25% and 75% percentiles. The horizontal axes indicate the time elapsed (in seconds) since the user-filter started. The results of Cases 1, 2 and 3 are indicated in red, blue and green, respectively

of the correction latencies, which vary from zero up to the length of each data-transmission interval. The corresponding periodic peaks become more pronounced in the east-component solutions when the float ambiguities are wrongly fixed.

# **4 Concluding Remarks**

In this contribution we presented a general means of measuring the precision loss experienced by the misspecified PPP-RTK user-filter. It was shown why the stochastic model of the user-filter is always misspecified, irrespective of the multi-epoch formulation adopted, cf. term IV in (12).

By discussing three different formulations for the user-filter, it was demonstrated that the user can potentially limit the suboptimality level of his filter, i.e. when the correction latency is not high and when the duration of the provider-filter initialization is sufficiently long. In contrast to the commonly-used multi-epoch formulation (Case 1), our proposed formulations (Cases 2 and 3) were shown to deliver user parameter solutions that are almost as precise as those of minimum-variance estimation.

**Acknowledgements** This work was conducted as part of the first author's Alexander von Humboldt Fellowship at the Institute for Communication and Navigation, the German Aerospace Center (DLR), Munich, with Professor Christoph Günther and Dr. Gabriele Giorgi as his hosts. This support is gratefully acknowledged.

# **Appendix**

*Proof of the Theorem* Let $Q_{\hat{x}}$ and $Q_{\hat{x}^*}$ be the variance matrices of the estimators $\hat{x}$ and $\hat{x}^*$, respectively. To prove (6), we employ the following *Rayleigh quotient* bounds of the matrix pair $(Q_{\hat{x}}, Q_{\hat{x}^*})$ (Magnus and Neudecker 2017)

$$
\lambda_{\min}\!\left(Q_{\hat{x}}\, Q_{\hat{x}^*}^{-1}\right) \leq \frac{f^T Q_{\hat{x}} f}{f^T Q_{\hat{x}^*} f} \leq \lambda_{\max}\!\left(Q_{\hat{x}}\, Q_{\hat{x}^*}^{-1}\right). \tag{18}
$$

Defining the matrix $M = (Q_{\hat{x}} - Q_{\hat{x}^*})\, Q_{\hat{x}^*}^{-1}$, (18) can be expressed as

$$
1 + \lambda_{\min}(M) \leq \frac{f^T Q_{\hat{x}} f}{f^T Q_{\hat{x}^*} f} \leq 1 + \lambda_{\max}(M), \tag{19}
$$

since $Q_{\hat{x}}\, Q_{\hat{x}^*}^{-1} = I + M$. What remains to be shown is $\lambda(M) = \lambda(M_{LA} M_{LA^\perp})$. Application of the variance propagation law to (5) and (7) gives the variance matrices of the estimators $\hat{x}$ and $\hat{x}^*$ as $Q_{\hat{x}} = A^{+} Q_y A^{+T}$ and $Q_{\hat{x}^*} = (A^T Q_y^{-1} A)^{-1}$, respectively. Therefore, the matrix difference $(Q_{\hat{x}} - Q_{\hat{x}^*})$ can be expressed as

$$
\begin{aligned} Q_{\hat{x}} - Q_{\hat{x}^*} &= A^{+} \underbrace{\left(Q_y - A\, Q_{\hat{x}^*} A^{T}\right)}_{=\,Q_{\hat{e}^*}} A^{+T} \\ &= A^{+}\, Q_y A^{\perp} \left(A^{\perp T} Q_y A^{\perp}\right)^{-1} A^{\perp T} Q_y\, A^{+T} \\ &= A^{+} L\, Q_p A^{\perp} \left(A^{\perp T} Q_y A^{\perp}\right)^{-1} A^{\perp T} Q_p\, L^{T} A^{+T} \end{aligned} \tag{20}
$$

The first equality follows from the identity $A^{+}A = I$ and the least-squares residuals' variance matrix $Q_{\hat{e}^*} = Q_y - A Q_{\hat{x}^*} A^T$, while the second equality follows by expressing $Q_{\hat{e}^*}$ in its conditional adjustment form (Teunissen 2000), with $A^{\perp}$ being an orthogonal-complement basis matrix of $A$. Thus $A^T A^{\perp} = 0$, and $[A, A^{\perp}]$ is a square, invertible matrix. The last (third) equality follows from $Q_y = Q_e + L Q_p L^T$ and $A^{+} Q_e A^{\perp} = 0$. Substitution of the last expression, with $Q_{\hat{x}^*}^{-1} = A^T Q_y^{-1} A$, into $M = (Q_{\hat{x}} - Q_{\hat{x}^*})\, Q_{\hat{x}^*}^{-1}$ gives

$$
M = \underbrace{A^{+} L\, Q_p A^{\perp} \left(A^{\perp T} Q_y A^{\perp}\right)^{-1} A^{\perp T}}_{U}\; \underbrace{Q_p \left[A A^{+} L\right]^{T} Q_y^{-1} A}_{V} \tag{21}
$$

As the nonzero eigenvalues of the matrix product $UV$ remain *invariant* under switching the order of the involved matrices to $VU$ (Magnus and Neudecker 2017, p. 16), the following matrix

$$
VU = \underbrace{Q_p \left[A A^{+} L\right]^{T} Q_y^{-1} \left[A A^{+} L\right]}_{M_{LA}}\; \underbrace{Q_p A^{\perp} \left(A^{\perp T} Q_y A^{\perp}\right)^{-1} A^{\perp T}}_{M_{LA^{\perp}}} \tag{22}
$$

inherits the same nonzero eigenvalues as those of (21). ∎
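The eigenvalue-invariance step used in the proof can be verified numerically for arbitrary conformable matrices; a small self-contained check (the specific matrices are, of course, only illustrative):

```python
import numpy as np

# Numerical check of the eigenvalue-invariance argument: the nonzero
# eigenvalues of UV (5x5, rank <= 3) and VU (3x3) coincide
# (Magnus and Neudecker 2017).
rng = np.random.default_rng(1)
U = rng.standard_normal((5, 3))
V = rng.standard_normal((3, 5))
ev_UV = np.sort(np.abs(np.linalg.eigvals(U @ V)))[-3:]  # drop the two zeros
ev_VU = np.sort(np.abs(np.linalg.eigvals(V @ U)))
assert np.allclose(ev_UV, ev_VU, atol=1e-8)
```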

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

**Part IV**

**Geoid and Quasi-Geoid**

# **Geoid or Quasi-Geoid? A Short Comparison**

# Lars E. Sjöberg and Majid Abrehdary

#### **Abstract**

This article is a short introduction to the debate on choosing the geoid and orthometric heights or the quasi-geoid and normal heights as the vertical coordinate system. It mainly compiles some more or less already known facts for comparing the two systems.

**Keywords**

Geoid · Normal height · Orthometric height · Quasi-geoid

# **1 Introductory Comparison Between Geoid and Quasi-Geoid**

The geoid is simple to explain to the layman, but not so the quasi-geoid. Also, only the geoid is an equipotential surface of interest in geophysics. Mathematically, to determine the geoid is an inverse problem, while to determine the quasi-geoid is a forward problem, and the geoid problem is also a free boundary value problem (bvp) in the sense that the boundary itself is unknown (over dry land). On the contrary, the quasi-geoid, or rather the height anomaly ($\zeta$), can be obtained by solving one of the following problems. (See also Heiskanen and Moritz 1967, Chap. 8 and Sjöberg and Bagherbandi 2017, Sect. 6.3.)

**Problem 1 (Molodensky's problem)** Derive $\zeta$ from the known gravity and potential at the unknown topographic surface. This is a free bvp.

**Problem 2 (a modern problem)** Derive $\zeta$ from the known gravity at the known topographic surface and the orthometric height. This is a fixed bvp.

L. E. Sjöberg (✉) · M. Abrehdary
Uppsala University, Uppsala, Sweden
e-mail: lsjo@kth.se

**Problem 3 (a second modern problem)** Derive $\zeta$ from the known potential at the topographic surface and the geodetic height (*h*). This is a fixed bvp.

In the sequel we will denote the solution of Problem 1 as Method 1, the solution of Problem 2 (by Stokes' formula) as Method 2, and the solution of Problem 3 as Method 3. In the rest of the paper we will mainly refer to Methods 2 and 3. Geoid problems are caused by the partly unknown topographic density distribution and its extension down to the (unknown) geoid, problems that do not occur in quasi-geoid determination. On the other hand, in rough topography with overhangs and vertical topography the quasi-geoid height is as ambiguous as the topography (e.g., Sjöberg 2018a).

The geoid height *N* is given by Bruns' formula:

$$
N = T_g/\gamma_0, \tag{1}
$$

where $T_g$ is the disturbing potential at the geoid and $\gamma_0$ is normal gravity at the reference ellipsoid.

On the contrary, the *height anomaly* $\zeta$ is given by the disturbing potential at point *P* on the Earth's surface and normal gravity at the point *Q* on the *telluroid* (see Fig. 1):

$$
\zeta = T_P/\gamma_Q. \tag{2}
$$

*Note 1* The telluroid is defined as the surface where the normal potential at each point *Q* equals the Earth's surface potential at point *P*, located along the normal to the reference ellipsoid (see Fig. 1).

© The Author(s) 2023

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_199

L. E. Sjöberg is also affiliated with the Royal Institute of Technology (KTH), Stockholm, Sweden

**Fig. 1** Illustration of the geodetic height (*h*), geoid height (*N*), orthometric height (*H*), height anomaly ($\zeta$) and normal height (*H<sup>N</sup>*)

*Note 2* The height anomaly is the height from the telluroid to the Earth's surface. The end user usually places the height anomaly at the reference ellipsoid under the name *quasi-geoid height*, so that the height from this surface to the Earth's surface becomes the normal height (*H<sup>N</sup>*).

*Note 3* The fact that the orthometric height (between points *P* and *P*<sub>0</sub>) is slightly curved along the plumbline is practically irrelevant.

# **2 Computational Steps for Geoid and Quasi-Geoid Determination**

The determination of the geoid by Stokes' formula requires downward continuation (DWC; e.g., by the remove-compute-restore technique) of gravimetric data to sea-level, while quasi-geoid determination in modern methods either requires DWC of gravity to sea-level or to point level in Method 2, or the direct employment of surface potentials in Method 3. These methods are illustrated below (see also Sjöberg and Bagherbandi 2017, Sects. 6.2–6.3):

# **2.1 Geoid Determination by Stokes' Formula**

$$
\tilde{N} = \frac{R}{4\pi\gamma_0} \iint_{\sigma} S(\psi) \left[\Delta g + \Delta g_{\mathrm{dir}}^{\mathrm{T}}\right]^{\star} \mathrm{d}\sigma + \mathrm{d}N_{\mathrm{I}}^{\mathrm{T}} \tag{3}
$$

where *R* is the sea-level radius, $\sigma$ is the unit sphere, $S(\psi)$ is Stokes' function with the argument $\psi$ being the geocentric angle between the computation and integration points, $\Delta g_{\mathrm{dir}}^{\mathrm{T}}$ is the direct topographic effect on the gravity anomaly $\Delta g$, $[\,\cdot\,]^{\star}$ denotes DWC to sea-level, and $\mathrm{d}N_{\mathrm{I}}^{\mathrm{T}}$ is the indirect topographic effect.

*Note 4* In Eq. (3) and below for quasi-geoid determination, $\Delta g$ is the "modern" Earth-surface gravity anomaly (introduced by M. S. Molodensky in 1945), i.e., surface gravity minus normal gravity at the *telluroid*/normal height (Fig. 1). However, geoid determination is traditionally conducted with a more approximate gravity anomaly, determined by applying the free-air reduction of surface gravity to mean sea-level and subtracting normal gravity at the reference ellipsoid.
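For numerical work, the spherical Stokes function $S(\psi)$ appearing in Eq. (3) can be evaluated directly from its closed form (Heiskanen and Moritz 1967, Eq. 2-164); a small sketch (function name is ours):

```python
import numpy as np

def stokes(psi):
    """Stokes' function S(psi), closed form of Heiskanen and Moritz
    (1967, Eq. 2-164); psi is the geocentric angle in radians between
    the computation and integration points (0 < psi <= pi)."""
    s = np.sin(psi / 2.0)
    return (1.0 / s - 6.0 * s + 1.0 - 5.0 * np.cos(psi)
            - 3.0 * np.cos(psi) * np.log(s + s * s))
```

The $1/\sin(\psi/2)$ term makes the kernel singular at $\psi = 0$, which is why the innermost zone is always treated separately in practical Stokes integration.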

# **2.2 Quasi-Geoid Determination**

#### **Method 1 (M.S. Molodensky 1945)**

Here the disturbing potential is introduced as a surface integral of an unknown surface density on the telluroid ($\Sigma$). A Fredholm integral equation of the second kind relates it to the known surface gravity anomaly. Assuming that normal heights are known all over the Earth and introducing several approximations, the latter integral equation can be solved by successive iterations. However, the series will hardly converge for terrain slopes larger than 45°, which calls for a solution of low resolution or accuracy. (For more details, see Heiskanen and Moritz 1967, Chap. 8.)

# **Method 2 (Stokes Formula)**

or

**Approach 1 (Remove-Compute-Restore)**

$$
\zeta = \frac{R}{4\pi\gamma_Q} \iint_{\sigma} S\left(\psi, r_P\right) \left[\Delta g + \Delta g_{\mathrm{dir}}^{\mathrm{T}}\right]^{\star} \mathrm{d}\sigma + \mathrm{d}\zeta_{\mathrm{I}}^{\mathrm{T}}, \tag{4}
$$

where $S(\psi, r_P)$ is the extended Stokes' function, $\mathrm{d}\zeta_{\mathrm{I}}^{\mathrm{T}}$ is the indirect topographic effect, and $r_P$ is the geocentric radius at point *P*.

**Approach 2 (Direct DWC According to Bjerhammar 1962)**

$$
\zeta = \frac{R}{4\pi\gamma_Q} \iint_{\sigma} S\left(\psi, r_P\right) \left[\Delta g\right]^{\star} \mathrm{d}\sigma. \tag{5}
$$

**Approach 3 (DWC to point level of radius** *rP***)**

$$
\zeta = \frac{r_P}{4\pi\gamma_Q} \iint_{\sigma} S\left(\psi\right) \left[\Delta g\right]^{\star\star} \mathrm{d}\sigma, \tag{6}
$$

where $[\,\cdot\,]^{\star\star}$ denotes DWC to point level.

# **Method 3 (Direct Determination from Known Surface Potential and Geodetic Height)**

Assuming that the topographic surface is known from satellite geodetic positioning and the Earth's potential at the surface from Earth Gravitational Models or in the future from direct determination by atomic clocks (e.g., Bjerhammar 1975, 1985), one can determine the surface disturbing potential and normal gravity at normal height (by iteration). Then the height anomaly/quasi-geoid follows from Eq. (2).

# **3 A Geoid Validation Problem**

Let us assume that the geodetic height (*h*) is known from GNSS-levelling, and that the geoid and orthometric heights *N* and *H* are functions of the topographic density $\mu$. Then it holds that

$$
h = N(\mu) + H(\mu) \iff N(\mu) = h - H(\mu). \tag{7}
$$

(Note that Eq. (7) is an approximation, as the orthometric height is slightly curved along the plumbline, but this will not significantly affect the following result.)

If $\mu$ is in error, it follows that the errors of *N* and *H* are related by

$$
\mathrm{d}N(\mu) = -\mathrm{d}H(\mu), \tag{8}
$$

so that an erroneous density yields equal errors with opposite signs in the geoid and orthometric heights. Hence, validating a gravimetric geoid model by GNSS-levelling ignores the error in the topographic density. (See also Sjöberg 2018b.) One can show that this problem occurs also in validating a gravimetric geoid model by astro-gravimetric levelling, e.g., by using a zenith camera (Sjöberg 2022). That is, the true topographic density distribution cannot be verified by this validation process.

One may assume that Eq. (8) does not hold except for the true density $\mu$, arguing that the estimated geoid and orthometric heights are affected in different ways by the erroneous mass density, such that the estimated geodetic height $\hat{h}(\mu)$ by Eq. (7) would disagree with its true value *h* (obtained by accurate geodetic positioning, e.g. GNSS). One could then think of adjusting the density in $\hat{h}(\mu)$ so that it matches *h*. However, this procedure would not be realistic, as one cannot solve the inverse gravimetric problem for the Earth's mass density from exterior gravity and geometric data. Hence, Eq. (8) must hold for any assumed density distribution.

On the other hand, if the gravimetric geoid and orthometric height models use different topographic density models, Eq. (8) does not hold.

Sometimes the accuracy of a gravimetric geoid model is justified by adjusting overdetermined gravimetric data in a least-squares procedure (e.g., Foroughi et al. 2019). However, the reported standard error can then only estimate the internal accuracy, while any remaining topographic DWC error is not accounted for. See Sjöberg (2022).

The above verification problems do not occur in quasigeoid determination.

# **4 Orthometric Height vs. Normal Height**

Using the geoid as the reference surface, the natural height system is based on the orthometric height. However, determining the true orthometric height is an inverse problem, just as the geoid problem. In practice one frequently introduces a topographic density model for both the geoid and the orthometric height to obtain a consistent system. Typical in this respect are Helmert orthometric heights with a constant topographic density of 2,670 kg/m³. One should also remember that orthometric heights are curved along the plumbline, but the curvature can usually be ignored.

The normal height is measured along the normal to the reference ellipsoid, between the reference ellipsoid and the telluroid. Assuming that the topographic height *h* (e.g., from GNSS positioning), normal gravity at the reference ellipsoid ($\gamma_0$), its vertical gradient ($a = 0.3086$ mGal/m) as well as the disturbing potential *T* at the point of computation are known (not necessarily at the Earth's surface), the normal height $H^N$ can be determined in an iterative procedure simultaneously with normal gravity at the normal height ($\gamma_Q$), using the start values $\gamma_Q \approx \gamma_0 - ha$ and $H^N \approx h$. One then iterates by alternating between the following solutions until numerical convergence:

$$
H^N = h - T/\gamma_Q \tag{9a}
$$

and

$$
\gamma_Q = \gamma_0 - aH^N. \tag{9b}
$$
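The alternating iteration of Eqs. (9a)–(9b) can be sketched in a few lines; SI units are assumed (m, m²/s², m/s²), the free-air gradient 0.3086 mGal/m is converted to 1/s², and the function name and usage numbers are purely illustrative:

```python
def normal_height(h, T, gamma0, a=3.086e-6, tol=1e-8, max_iter=50):
    """Iterative solution of Eqs. (9a)-(9b): alternate between
    H^N = h - T/gamma_Q and gamma_Q = gamma0 - a*H^N, starting from
    the values gamma_Q ~ gamma0 - h*a and H^N ~ h (sketch)."""
    HN = h
    gamma_Q = gamma0 - a * h
    for _ in range(max_iter):
        HN_new = h - T / gamma_Q      # Eq. (9a)
        gamma_Q = gamma0 - a * HN_new # Eq. (9b)
        if abs(HN_new - HN) < tol:
            break
        HN = HN_new
    return HN_new, gamma_Q
```

Because $a H^N \ll \gamma_0$, the fixed-point iteration converges in very few steps for realistic topographic heights.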

In this way the reference ellipsoid is the zero-level of the normal height system. On the contrary, using the quasi-geoid as the zero-level of the normal height system will cause ambiguity problems in the rough topography of a non-star-shaped Earth model.

# **5 Concluding Remarks**


**Acknowledgements** This study was financially supported by the Swedish National Space Agency grant no 187/18. We also acknowledge the constructive remarks from the EC, prof. J. Freymueller and from two unknown reviewers, which considerably improved the final version of the paper.

# **References**



# **The Quasigeoid: Why Molodensky Heights Fail**

Robert Kingdon, Petr Vaníček, Marcelo Santos, Michael Sheng, and Ismael Foroughi

#### **Abstract**

Any height system has two constituents: a reference surface upon which all heights are equal to zero, and a prescription for how observed heights and height differences will be related to that surface. That prescription is typically formulated with reference to Earth's gravity field, but in this contribution we will use the concept of metric spaces instead. In most height systems, the height of a point can be interpreted as the length of the 3-dimensional path from the point of interest to the reference surface in a particular metric space. The geometry of the path is that of the space associated with the height system. This submission explores the definition of a height system simply as a metric space and a reference surface, applies it to common height systems used in geodesy (geodetic, orthometric, dynamic, normal), and examines their characteristics through that lens.

#### **Keywords**

Height systems · Metric spaces · Normal height · Orthometric height · Reference surfaces

# **1 Introduction**

Any height system requires two constituents: A horizontal reference surface upon which all heights are equal to zero, and a prescription for how observed heights and height differences will be related to that surface. This is consistent with standard works such as Heiskanen and Moritz (1967) who define heights as path lengths from points of interest to reference surfaces and prescribe methods for using levelling observations to define heights of points. In the modern world, satellite positioning observations of heights above a superficial surface, an ellipsoid, must also be related to a

R. Kingdon (✉) · P. Vaníček · M. Santos · M. Sheng
Department of Geodesy and Geomatics Engineering, University of New Brunswick, Fredericton, NB, Canada
e-mail: robert.kingdon@unb.ca

reference surface, which is done using models of geoid-ellipsoid separation (geoidal heights) or of the height anomaly. While traditional approaches often focus on heights of points on the topographical surface, we here intentionally discuss how height systems define heights more generally, for points anywhere in a three-dimensional space near Earth's surface.

The traditional interpretation of height systems and of relationships between them applies concepts of potential fields and their corresponding gradient vector fields to describe the form of reference surfaces and the paths along which heights are measured. The mathematical apparatus of this approach has been well-developed over many years by a progression of authors (e.g. Gauss 1828; Molodensky et al. 1962; Hotine 1969; Marussi 1985; Sansò and Vaníček 2006; Sansò et al. 2019). In this paper, we apply a mathematically equivalent interpretation whereby, for almost all common height systems, a 3-manifold (a 3-dimensional space that is locally Euclidean) can be defined such that the height of a point is the length, in a Euclidean space, of the line of steepest descent following the gradient of the manifold from the point to a reference surface.

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_230

I. Foroughi

Department of Geodesy and Geomatics Engineering, University of New Brunswick, Fredericton, NB, Canada

Department of Earth and Space Science and Engineering, York University, Toronto, ON, Canada

Each 3-manifold, if it is to be used in this way, should be Riemannian. This implies a metric space, where a distance function can be defined. The level surfaces of the manifold comprise a family of 2D subspaces, called horizontal in our case. The 1-dimensional subspaces orthogonal to the family of 2D horizontal subspaces are identical to what Sansò et al. (2019) call the "lines of the vertical", and we will adopt that terminology here. Precisely one line of the vertical goes through every point in the region of interest, and the height of a point is the length measured in this 1-dimensional subspace from that point to the reference surface. While several different lines of the vertical arise in different height systems, their variations in curvature and torsion are not very significant. Their differing lengths, associated with the differing "height metrics" of the 1-dimensional subspaces, are their important characteristic and are part of the 3-dimensional metric of the manifold.

The idea of describing heights based on the characteristics of metric spaces is not new. The physical space and approximate physical spaces discussed here, for example, can be understood mathematically according to Hotine's discussion of *N* surfaces as coordinates (Hotine 1969, Chap. 12), reflected in Sansò et al.'s discussion of the Hotine-Marussi coordinate triad (2019, Sect. 5.3). Here we invoke the interpretation of heights via three-dimensional Riemannian manifolds only to explore differences between height systems. Similar comparisons of the geometry implied by different height systems have been undertaken before (e.g. Heiskanen and Moritz 1967; Vaníček and Krakiwsky 1986; Featherstone and Kuhn 2006; Sansò et al. 2019), but not with the same framework of manifolds and mappings between them. We will not in this paper present a mathematical framework for these systems' implementation; this work has already been done in the references cited above. The intent is to use the concept of metric spaces as a lens to assess existing height systems.

# **2 Common Height Systems Defined by Their Metric Space and Reference Surface**

We will next use the context of metric spaces to discuss some of the common height systems. We will exclude geopotential numbers, focusing instead on systems that deal with heights in linear units. We will also mostly exclude dynamic heights. While these systems fit within the understanding of height systems outlined above, and are important for some practical engineering applications, they have been excluded in the interest of space.

**Fig. 1** Geodetic height system, showing normal to the ellipsoid through point *A*, along which height is measured, as medium dashed line; and other lines to the ellipsoid from *A* as faint narrower dashed lines

We begin with *geodetic height*, also sometimes called the *ellipsoidal height* (wouldn't these be heights of the ellipsoid above the ellipsoid?), which is used here to mean the vertical coordinate *h* in the triplet of geodetic coordinates (φ, λ, *h*). These heights are associated with a space we call the geometric space, *G*, which is a Euclidean manifold. Since the space is flat, no single line of steepest descent can be defined, and lines minimizing Euclidean distance (straight lines) are used instead. The reference surface is a reference ellipsoid, and the geodetic height *h* of a point is simply the shortest distance from the ellipsoid to the point. The system is illustrated in Fig. 1.

The geometric space is so called because it has a role in relating any other space to geometrical measurements on Earth's surface. In particular, the geometrical space is used for describing the position of points, which is the ultimate concern in geodesy. Other quantities, such as gravity potential, may be used as coordinates to represent the curved intrinsic geometry of the other spaces involved, but they must ultimately be related back to positions. The geopotential numbers, which do not use linear units, still operate by specifying a geometric location along a line of the vertical – in particular, the point where an equipotential surface having a specific potential intersects the line. We may thus consider all other spaces to be embedded in the geometric space.
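The embedding of other spaces in the geometric space is ultimately expressed through coordinates. A minimal sketch (our own illustration, not from the paper; GRS80 parameters assumed) of mapping geodetic coordinates (φ, λ, *h*) into Cartesian coordinates of the geometric space:

```python
import math

# Illustrative mapping of geodetic coordinates into the Cartesian
# geometric space G, using assumed GRS80 ellipsoid parameters.
A = 6378137.0                 # GRS80 semi-major axis [m]
F = 1.0 / 298.257222101       # GRS80 flattening
E2 = F * (2.0 - F)            # first eccentricity squared

def geodetic_to_cartesian(phi, lam, h):
    """Map latitude phi, longitude lam [rad] and geodetic height h [m]
    to Cartesian (X, Y, Z) [m]."""
    # prime-vertical radius of curvature at latitude phi
    n = A / math.sqrt(1.0 - E2 * math.sin(phi) ** 2)
    x = (n + h) * math.cos(phi) * math.cos(lam)
    y = (n + h) * math.cos(phi) * math.sin(lam)
    z = (n * (1.0 - E2) + h) * math.sin(phi)
    return x, y, z

# On the equator at zero height, the point lies on the ellipsoid
# at a distance of one semi-major axis from the geocentre.
x, y, z = geodetic_to_cartesian(0.0, 0.0, 0.0)
```

The inverse of this mapping recovers (φ, λ, *h*), which is one way of seeing that geodetic heights live naturally in the flat geometric space.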

Next, we discuss the orthometric height system. In this system, the metric space used for defining the lines of height is the physical space, which we will call *P*, and the reference surface is the geoid. The physical space is a Riemannian manifold with a shape defined by the gravity potential function *W* = *W*(**x**), where **x** is a coordinate triad representing the position of any point in the geometrical space. In *P*, the family of equipotential surfaces are the level surfaces, and the plumblines, which are everywhere orthogonal to those surfaces, are the lines of the vertical. In the orthometric height system, the geoid, which is one of the level surfaces of *P*, serves as the reference surface. The system is illustrated in Fig. 2.

**Fig. 2** A point *A* in the physical space, with plumblines and equipotential surfaces represented by thin grey lines, the geoid by a thick grey line, topography by a medium black line, and orthometric height of point *A* by a medium dotted black line

The formula for orthometric height $H^O$ above the geoid induced by this definition is exactly the plumbline length given by the standard formula (Heiskanen and Moritz 1967, Eq. 4-21):

$$H^O = \frac{C}{\bar{g}}\,, \tag{1}$$

where *C* is the geopotential number and $\bar{g}$ is the integral mean of the vertical gradient of gravity potential along the plumbline between the geoid and the point of interest. The denominator in Eq. (1) arises from the choice of the plumbline as the line of the vertical, and uniquely transforms the geopotential numbers associated with the physical space into Euclidean lengths in the geometrical space. Any other denominator, unless it is just a scaling of the denominator in Eq. (1) by a constant value, would transform the geopotential number to a non-Euclidean space. A practical challenge exists in calculating $\bar{g}$, because this requires sufficiently accurate topographical density models. Current methods for calculating $\bar{g}$ (Santos et al. 2006) allow orthometric heights accurate to 1 cm or better except in high mountains.
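Eq. (1) is a one-line computation once $C$ and $\bar{g}$ are in hand; a minimal sketch with illustrative values (the numbers are ours, not from the paper):

```python
# Eq. (1): orthometric height as the geopotential number divided by the
# integral mean gravity along the plumbline. Values are illustrative.
def orthometric_height(c, g_bar):
    """c: geopotential number [m^2/s^2]; g_bar: mean gravity [m/s^2]."""
    return c / g_bar

h_o = orthometric_height(9810.0, 9.81)   # roughly a 1000 m height
```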

Next we turn to the normal height systems, which we discuss as being of the Vignal or Molodensky type. Vignal heights, first called *altitudes orthodynamiques*, were developed in the 1940s and 1950s (e.g. Eremeev 1965; Simonsen 1965), roughly in parallel with Molodensky's (Molodensky 1945) system. The goal was to replace dynamic heights with a system more closely matched to Earth's gravity field. In Vignal's system, the mean gravity value in the denominator of Eq. (1) is replaced with the integral mean normal gravity $\bar{\gamma}$ between the ellipsoid and a point displaced above the ellipsoid by an amount equal to the height of the point of interest (Eremeev 1965). Vignal's method takes $\bar{\gamma}$ as a better approximation of $\bar{g}$ than the arbitrary value used in dynamic heights, and thus provides an improved approximation of orthometric height in an era when real gravity was not known as well as it is now. The heights arising from this system are given by the equation for the length of the normal plumbline from the reference surface:

$$H^N = \frac{C}{\bar{\gamma}}\,. \tag{2}$$

From this formula, it is immediately clear that the geoid *must* be the zero-height surface for Vignal normal heights, because for $C = 0$ (on the geoid), $H^N = 0$. However, it is also clear that the Vignal normal heights of points will differ from their orthometric heights, because the denominator in Eq. (2) differs from that in Eq. (1). If we attempt to understand Vignal heights as lengths of a line of the vertical from the geoid to a point, this difference in denominator implies that those lengths are defined using a non-Euclidean metric. Interpreted in the context presented in this paper, this is a consequence of Vignal heights being defined using a space *N* that is not equivalent to the physical space *P*, but is only an approximation of it.
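To make the non-Euclidean character concrete, a short numerical sketch (with illustrative values of our own choosing) shows how the differing denominators of Eqs. (1) and (2) map the same geopotential number to different heights:

```python
# The same geopotential number C mapped to heights with the two
# denominators of Eqs. (1) and (2). Values are illustrative only.
C = 9810.0         # geopotential number [m^2/s^2]
g_bar = 9.797      # mean actual gravity along the plumbline [m/s^2]
gamma_bar = 9.810  # mean normal gravity along the normal plumbline [m/s^2]

H_O = C / g_bar      # orthometric height, Eq. (1)
H_N = C / gamma_bar  # Vignal normal height, Eq. (2)
diff = H_O - H_N     # the two systems disagree away from the geoid
```

On the geoid ($C = 0$) both heights vanish, consistent with the geoid being the zero-height surface of both systems; away from it, the metre-level difference reflects the differing "height metrics".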

*N* is of a category that we will call *approximate physical spaces*, which comprise simplifications of the physical space. Such spaces are common in geodesy and surveying: widespread examples include the planar and spherical approximations, where the shapes of Earth's gravity field and topography are distorted by the simplification of the field.

Like the spaces used in the planar and spherical approximations, the normal space *N* is also an approximate physical space. It is closer to the real space, and improves upon the spherical approximation by using an ellipsoidal approximation of Earth's gravity field. In *N*, the real plumblines map to the normal plumblines, which are the lines of the vertical in the Vignal system. The equipotential surfaces map to the ellipsoidal equipotential surfaces of the normal gravity field, of which the reference ellipsoid serves as the reference surface for Vignal heights. The shape of the space is described by the function $U = U(\mathbf{x})$ in the same way that the function $W = W(\mathbf{x})$ describes the physical space. The physical space *P* maps to *N* by a smooth mapping $M_N$, as shown in Fig. 3, and based on Eqs. (1) and (2) the mapping of plumbline lengths is given by $\bar{g}/\bar{\gamma}$. The approximation error of the normal space is distributed globally, and is on the order of no more than about 150 m for the region of interest (several kilometres above or below the topographical surface). The error in normal heights arising from this approximation is less than 4 m globally (Foroughi and Tenzer 2017), and much smaller than that in most areas.

Approximate physical spaces are important for simplifying the mathematical formulations involved in positioning. For example, the "reduction" of an observed vector between two points to a horizontal distance becomes simpler if a planar or spherical approximation is used. Likewise, the

**Fig. 3** The physical space mapped to the Normal approximation space *N* by mapping *MN*, with Vignal normal height indicated by the medium dotted black line in the Normal space

normal space provides a very important "reference space" for positioning and gravity field computations in geodesy, where it is often used as a first approximation of Earth's gravity field. Differences between quantities in the physical space and the normal space can then be treated as small anomalies that can be dealt with in subsequent refinements more easily than if the full quantities were used. More detailed "spheroidal" approximations are also frequently made in geoid modelling computations. However, such simplification comes at the expense of geometry and should be applied with care. In the case of Vignal heights, while the normal space is a sufficient approximation of *P* for the accuracy goals and data availability of the 1950s, nowadays far better estimates of orthometric heights are available, like the rigorous orthometric heights (Santos et al. 2006).

We now turn to what we call the Molodensky normal heights. Molodensky, in his creation of a comprehensive normal height system (Molodensky et al. 1962), applies a definition similar to Vignal's but more exactly calculated, and also applies two other definitions that were meant to be equivalent to it:

Definition 1: the normal height of a point is its geodetic height minus its height anomaly $\zeta$.

Definition 2: the normal height of a point is its height above the quasigeoid, measured along the ellipsoidal normal.

Because the height anomaly $\zeta$ is the vertical displacement between a point of potential *W* and a point of normal potential $U = W$, it represents the mapping of vertical point positions from *P* to *N*. Thus Definition 1 should theoretically produce heights identical to Vignal's, apart from the difference between the normal plumbline and the ellipsoidal normal. Furthermore, since $\zeta$ can be calculated not only at the topographical surface but anywhere in the space near it, Definition 1 can be extended from Molodensky's usage to provide a viable route for those wishing to transform geodetic heights to Vignal normal heights. However, because Molodensky's approach to calculating $\zeta$ requires a regularized Earth surface (Molodensky 1945), it does not yield heights exactly equivalent to Vignal's: An additional layer of

**Fig. 4** The Molodensky normal height system, showing normal to the ellipsoid through point *A*, along which height is measured, as a medium dashed line, and the quasigeoid as a thin black line

approximation is added, as well as some ambiguity unless the choice of regularized Earth surface is specified.

Definition 2 is more problematic. As highlighted at the 2018 Hotine-Marussi symposium and discussed in subsequent publications (e.g. Kingdon et al. 2022), the quasigeoid is a folded and creased surface not well-suited as a reference surface for heights. Thus, a regularized topography must again be used to construct a usable quasigeoid. The quasigeoid constructed in this way will still not be a physically meaningful surface, however: A marble placed on the geoid would not roll; a marble placed on the quasigeoid would navigate a path among the variations of topography and gravity anomalies. Furthermore, the quasigeoid is associated with the values of $\zeta$ at the (regularized) surface of topography only, and so Definition 2 will provide incorrect heights for points not situated at the topographical surface.

Because the use of height anomalies was uniquely associated with the Molodensky system, herein we use the term "Molodensky heights" to refer to heights defined using height anomalies. These heights may be defined according to Definition 1, in which case they are equivalent to Vignal heights. However, if defined according to Definition 2 as shown in Fig. 4, they are defined with the quasigeoid as the reference surface and the ellipsoidal normals as the lines of the vertical. Notably, the lines of the vertical are not perpendicular to the reference surface in this definition, and all of the problems with the quasigeoid listed above are inherited. Any attempt to define an approximate physical space associated with the Definition 2 Molodensky heights would be quite complex, given their association with the irregular quasigeoid.
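Under Definition 1, a height-anomaly-based normal height is simply the geodetic height minus the height anomaly; a minimal sketch with hypothetical numbers:

```python
# Definition 1 as discussed above: normal height obtained by subtracting
# the height anomaly zeta from the geodetic height h. Values are
# hypothetical, chosen only to illustrate the arithmetic.
def normal_height_from_anomaly(h, zeta):
    """h: geodetic height [m]; zeta: height anomaly [m]."""
    return h - zeta

H_N = normal_height_from_anomaly(1050.0, 48.3)   # -> about 1001.7 m
```

This is the "viable route" mentioned above for transforming geodetic heights to Vignal normal heights, provided $\zeta$ is evaluated at the point of interest rather than only at the regularized topographical surface.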

# **3 Conclusions**

Consistent interpretations of each height system are possible in the context of vertical lines in selected metric spaces from points of interest to defined reference surfaces. Such an interpretation reveals that for most systems of heights, including the Vignal normal heights, the geoid is the reference surface. These systems all have the property that height is a member of a 1-dimensional space perpendicular to the level surfaces of the manifold in which they are defined, and the height dimension extends perpendicular to the reference surface. The only system that does not cleanly fit into this framework is the Molodensky system that uses the quasigeoid. The Vignal normal heights present a more consistent system, and Molodensky's use of the height anomaly to define his normal heights presents a path to their use.

In the longer term, we have seen that any variety of normal height relies on an approximation of Earth's gravity field, and this gives rise to certain challenges. The differences in accuracy between normal and orthometric heights are small, but as measurement precision increases small differences may become significant. At the same time, as better density models and gravity observations allow improved characterization of Earth's gravity field, the arguments for using an approximate physical space for defining heights wane. Like the planar and spherical approximations, the normal approximation will always have a role in geodetic computation, in generating reference ellipsoids and gravity anomalies, and for other purposes where a simplified or approximate gravity field is called for. However, the time to set it aside in the ultimate definition of height systems is near.

# **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Molodensky's and Helmert's Theories: Two Equivalent Geodetic Approaches to the Determination of the Gravity Potential and the Earth Surface**

# Fernando Sansò, Riccardo Barzaghi, and Mirko Reguzzoni

#### **Abstract**

A fundamental problem of physical geodesy is the determination of the "Surface of the Earth" and its gravitational potential from various types of observations performed on the Earth surface S itself or in the outer space. When data are derived from gravimetry on S we speak of Molodensky's problem. Since the gravity field depends linearly on its source, i.e. the mass distribution, it follows that we can manipulate the (unknown) internal density in a known way and still return to the same external solution once the effects of the manipulation have been eliminated (restored). This is used, in the frame of Molodensky's theory, with the Residual Terrain Correction that is removed (and then restored) before approximating the solution by some regularized (collocation or other) approach. Differently, Helmert's approach shifts the masses of the topographic layer, compressing them to some internal surface and substituting their effects on gravity by that of a single layer. Data are thus lowered to some internal ellipsoid or sphere and a solution is then easily computed. The effects of the internal changes are then inverted and added back to the solution. Despite the apparently completely different approaches, one can prove that the final solutions, when data are given continuously on the boundary and the errors are made to tend to zero, converge to the true potential on the surface S and then in the outer space. So the two solutions are geodetically equivalent and do not create any scientific conflict. What happens inside S, down to the geoid level, is different. Here Helmert's approach, which introduces the density of the topographic layer as data, is certainly less erroneous in approximating the true potential. Yet, due to the imperfect knowledge of the density, and even more to the ill-posedness of the downward continuation operator, the internal potential can have large errors, unless the solution is duly regularized and an appropriate tuning is introduced between the regularization parameter and the size of the data errors.

#### **Keywords**

Boundary value problem - Earth gravity field - Helmert's approach - Molodensky's approach - Physical geodesy

F. Sansò · R. Barzaghi · M. Reguzzoni (-) Department of Civil and Environmental Engineering (DICA), Politecnico di Milano, Milano, Italy e-mail: riccardo.barzaghi@polimi.it; mirko.reguzzoni@polimi.it

# **1 The Gravimetric Surface of the Earth**

The purpose of physical geodesy is to find the potential W of the gravity field of the Earth in the space outside the masses and possibly even a little below their surface S to be used in geological and geophysical interpretation. This has to be done by exploiting all the information we have on

J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/1345\_2023\_212

W, primarily observations of potential differences between couples of points on S, of the gravity modulus at points on the surface, and many others, including satellite altimetry on the oceans, satellite gravimetry and gradiometry, satellite tracking, etc.

One fundamental feature of geodetic problems is that the surface S itself has to be considered as unknown. This holds even though satellite observations of the Earth seem to supply us with a geometric determination of S, for instance by radar altimetry on the ocean and by SAR or photogrammetry on land. Yet, apart from the fact that the accuracy of the geometry is about 1 m on land and a few centimeters for the quasi-stationary sea surface, the true issue is that such a surface is not the one we are looking for from the gravity field point of view.

For instance, a circular tower with a radius of 30 m and a height of 40 m produces an attraction of less than 0.03 mGal on top of its roof, on account of the fact that its concrete occupies less than 1% of its volume. On the other hand, the same tower is well determined by space photogrammetry; i.e., it enters into the geometric surface of the Earth in the photogrammetric sense, but not in the gravimetric sense.
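The tower figure can be checked against the standard closed-form attraction of a homogeneous cylinder at the centre of its top face; the back-of-envelope below is our own, with an assumed concrete density of 2400 kg/m³ diluted to 1% of the volume:

```python
import math

# Rough check of the tower example (our own calculation, not the authors'):
# the attraction of a homogeneous cylinder of radius r and height h at the
# centre of its top face is g = 2*pi*G*rho*(h + r - sqrt(r^2 + h^2)).
G = 6.674e-11          # Newtonian constant [m^3 kg^-1 s^-2]
rho = 2400.0 * 0.01    # assumed: concrete ~2400 kg/m^3 filling ~1% of volume

def cylinder_top_attraction(r, h, density):
    """Axial attraction [m/s^2] at the centre of the cylinder's top face."""
    return 2.0 * math.pi * G * density * (h + r - math.sqrt(r * r + h * h))

g_mgal = cylinder_top_attraction(30.0, 40.0, rho) / 1.0e-5  # 1 mGal = 1e-5 m/s^2
# g_mgal comes out around 0.02 mGal, consistent with "< 0.03 mGal" above
```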

For these reasons we have to accept that the gravimetric surface S is an unknown characterized by the fact that at its points we know the minimal information on the gravity field necessary to solve our problem. We will not repeat the long road leading from observations to the scalar Molodensky problem (or scalar GBVP), for which we refer to the literature (Heiskanen and Moritz 1967; Moritz 1980; Krarup 2006; Sacerdote and Sansò 1986; Sansò 1997; Heck 1989, 1997; Martinec 1998; Sansò and Sideris 2013). We only recall that the above definition of S within the scalar GBVP, the same given in Sansò and Sideris (2017), needs some more discussion. In fact we can observe that we have two sources of information on S: one is the coordinates of points at which we make gravimetric measurements; the other is the information coming from Digital Terrain Models (or Ocean Models) derived from space. So we can first of all smooth our gravimetric signal by applying a residual terrain modelling (Forsberg 1984; Forsberg and Tscherning 1997), namely by subtracting the gravimetric effect on W and g of a layer between some reference surface and the actual surface of the Earth, as it is known from a DTM, and then use the smoothed signal to produce a "local interpolation" to some surface that is only locally known with respect to the measurement stations, but globally a part of the unknowns of the GBVP. Of course, in doing so we will significantly smooth the actual geometric S, particularly in mountainous regions, avoiding sharp vertical discontinuities, ravines and especially overhangs, which would destroy the results of the mathematical analysis of the GBVP. In fact, it is well known that in the oblique derivative BVP for the Laplace operator (in our case the derivative is taken along the direction of the vertical), the direction of the derivative should never be tangent to the boundary if existence and uniqueness theorems are to be guaranteed (Yanushauskas 1989; Miranda 1970).
So we can imagine that S is at least a Lipschitz surface and that lines orthogonal to the Earth ellipsoid E cross S at one point only. In other words, we can assume the surface S to be in one-to-one correspondence with the ellipsoid E by orthogonal projection. This implies that S has an equation

$$h = H\left(\sigma\right),\tag{1}$$

with h indicating the ellipsoidal height, σ the couple of angular ellipsoidal coordinates, σ = (λ, φ), and H(σ) a Lipschitz function of the point P on E with coordinates σ.

So we can summarize our problem of knowing W outside the masses as the solution of a free BVP for the Laplace equation, which can be formalized as follows (scalar Molodensky problem):

$$W(P) = V(P) + \frac{1}{2}\omega^2 \left(x\_P^2 + y\_P^2\right),\tag{2}$$

$$\begin{cases} \triangle V = 0 & \text{outside } S\\ W\left(H\left(\sigma\right), \sigma\right) = W\_0\left(\sigma\right) & \text{on } S\\ \left|\nabla W\left(H\left(\sigma\right), \sigma\right)\right| = g\_0\left(\sigma\right) & \text{on } S\\ V \to 0 & \text{for } h \to \infty \text{ .} \end{cases} \tag{3}$$

where the unknowns are W(P), i.e. V(P), in {h > H(σ)}, and H(σ) itself.

**Remark 1.1** Why do we insist so much on transforming our "exterior" problem into a BVP? This is because, although nonlinear and difficult, the problem can be mathematically analyzed and proved to have a unique solution in functional Hölder spaces under "reasonable" conditions on the data {W₀(σ), g₀(σ)} and, even more important, such a solution depends continuously on the data (Sansò 1976, 1989; Hörmander 1976). Furthermore, the linearized version of (3) has a well understood behaviour and its solution is well posed in quite general Sobolev spaces.

**Remark 1.2** What about estimating W(P) inside the masses, for instance in the topographic layer, which has a maximum width of roughly 9 km, between the ellipsoid E and the surface S? The problem, crossing S from outside towards the inner body, changes its mathematical nature. First of all, in the body B = {h < H(σ)} the gravitational potential $V = W - \frac{1}{2}\omega^2\left(x^2 + y^2\right)$ is no longer harmonic; rather, it satisfies the Poisson equation

$$
\triangle V\left(P\right) = -4\pi G\rho\left(P\right) \quad \text{in } B\,,\tag{4}
$$

with G the Newton constant and ρ(P) the density of topographic masses. Once S has been determined, and assuming we know ρ(P) between S and E, the solution of (4), complemented with the boundary conditions

$$\begin{cases} \left. \left(V + \frac{1}{2} \omega^2 \left( x^2 + y^2 \right)\right) \right|\_{S} = W\_0 \left( \sigma \right) \\ \left| \nabla \left( V + \frac{1}{2} \omega^2 \left( x^2 + y^2 \right) \right) \right|\_{S} = g\_0 \left( \sigma \right) \end{cases} \quad (5)$$

has the form of an "initial" or Cauchy problem for the Laplace operator. Quoting Miranda (1970, page 60): "In this case we do not usually think of an extension of the Existence Theorem for Cauchy's problem, inasmuch as this problem is not to be considered well posed for the elliptic equation in other than the analytical field. Hadamard in fact observed that the problem does not admit in general a solution, and that when it does, this solution does not depend continuously on the data". We will return to this point later, showing that despite the above statements, the computation of W in a thin layer like the topographic one is approximately possible, although with an error that might become unacceptable the deeper we go inside S.

# **2 The Ambiguous Gravimetric Surface** *S* **and the Linearization Band**

As we have seen from the discussion of Section 1, the gravimetric surface S is not exactly fixed, but rather depends on a number of preliminary operations we apply to the measurements to derive W₀(σ) and g₀(σ). Once this step is fixed, the surface S = {h = H(σ)} comes from the solution of the scalar Molodensky problem. Then a natural question is: since W₀(σ) and g₀(σ) might be referred to any surface within a layer of width ±δH₀, can we claim that we have different solutions of the GBVP? Of course this should not be the case, because our physical gravity field is only one.

The answer comes from a sensitivity analysis with respect to the shift of data between two surfaces, S and S′, with gap

$$
\delta H\left(\sigma\right) = H'\left(\sigma\right) - H\left(\sigma\right) \,. \tag{6}
$$

This closely resembles the reasoning used to evaluate the linearization band for the nonlinear problem (3); in fact the idea is to estimate the second order terms arising in a Taylor formula and verify that they can be neglected. Such a reasoning has been developed through a very fine numerical analysis in Heck (1989, 1997), as well as through some rougher estimates in Sansò and Sideris (2017), in any case leading to the conclusion that the linearization band has a width somewhere between 100 m and 200 m. Note that the more optimistic estimate in Sansò and Sideris (2017) with respect to those in Heck (1989, 1997) is due to the fact that it is based on the mean square value of the curvature of equipotential surfaces (see Sansò and Sideris 2013); since this is a quite oscillating function, especially in mountainous areas, it can easily reach as much as 10 times the value of the r.m.s.

We aim to show that if we stay within such a band we can safely switch from one surface S to another S′, shifting our data W₀ → W₀′ and g₀ → g₀′ by a linear Taylor formula, without substantially changing the solution of (3). Let us note that our shift has to be considered in free air, since we assume that we have already cleared the masses in the space around S by the residual terrain correction.

We have, for some t(σ), τ(σ) < 1,

$$W\_0'\left(\sigma\right) - W\_0\left(\sigma\right) = W\left(H'\left(\sigma\right), \sigma\right) - W\left(H\left(\sigma\right), \sigma\right) = \frac{\partial W\left(H\left(\sigma\right), \sigma\right)}{\partial h} \delta H\left(\sigma\right) + \frac{1}{2} \frac{\partial^2 W\left(H\left(\sigma\right) + t\left(\sigma\right)\delta H\left(\sigma\right), \sigma\right)}{\partial h^2} \delta H\left(\sigma\right)^2\tag{7}$$

$$\begin{aligned} \mathbf{g}'\_0\left(\sigma\right) - \mathbf{g}\_0\left(\sigma\right) &= \mathbf{g}\left(H'\left(\sigma\right), \sigma\right) - \mathbf{g}\left(H\left(\sigma\right), \sigma\right) = \\ \frac{\partial \mathbf{g}\left(H\left(\sigma\right), \sigma\right)}{\partial h} \delta H\left(\sigma\right) + \\ \frac{1}{2} \frac{\partial^2 \mathbf{g}\left(H\left(\sigma\right) + \mathbf{\tau}\left(\sigma\right)\delta H\left(\sigma\right), \sigma\right)}{\partial h^2} \delta H\left(\sigma\right)^2 \ . \end{aligned} \tag{8}$$

For the sake of an estimate of orders of magnitude, we can substitute in (7) and (8)

$$\left(\mu = GM\right) \qquad \frac{\partial^2 W}{\partial h^2} \sim -2\frac{\mu}{r^3} \sim -\frac{2}{r}\mathbf{g}\_0 \qquad (9)$$

$$\frac{\partial^2 \mathbf{g}}{\partial h^2} \sim 6 \frac{\mu}{r^4} \sim \frac{6}{r^2} \mathbf{g}\_0 \,. \qquad (10)$$

So that we get, with O[·] denoting the orders of magnitude,

$$\mathcal{O}\left[\frac{1}{2}\frac{\partial^2 W}{\partial h^2}\delta H^2\right] = \frac{\delta H^2}{r}\mathbf{g}\_0\tag{11}$$

$$\mathcal{O}\left[\frac{1}{2}\frac{\partial^2 \mathbf{g}}{\partial h^2} \delta H^2\right] = 3\left(\frac{\delta H}{r}\right)^2 \mathbf{g}\_0 \,. \tag{12}$$

With δH = 200 m and r ≃ 6 × 10⁶ m, dividing (11) by g₀ to transform the variation of potential into a shift of equipotential surfaces, we get

$$\mathcal{O}\left[\frac{1}{\mathbf{g}\_0}\left(\frac{1}{2}\frac{\partial^2 W}{\partial h^2}\delta H^2\right)\right]\sim 0.6\text{ cm}\tag{13}$$

$$\mathcal{O}\left[\frac{1}{2}\frac{\partial^2 \mathbf{g}}{\partial h^2} \delta H^2\right] \sim \text{3 } \mu \text{Gal}\,. \tag{14}$$

We consider these figures negligible and we can conclude that we can shift from any surface to another by a simple linear Taylor formula in the linearization band, getting substantially equivalent nonlinear scalar Molodensky problems.
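The figures in (13) and (14) are easy to reproduce; a quick numerical check using the same values (δH = 200 m, r ≈ 6 × 10⁶ m):

```python
# Numerical check of the orders of magnitude in Eqs. (11)-(14),
# using the same values as the text.
delta_h = 200.0   # width of the linearization band [m]
r = 6.0e6         # approximate geocentric radius [m]
g0 = 9.81         # surface gravity [m/s^2]

# Eq. (13): potential term divided by g0, as an equipotential shift [m]
shift_m = delta_h ** 2 / r                       # ~0.007 m, i.e. ~0.6-0.7 cm

# Eq. (14): gravity term, in microGal (1 uGal = 1e-8 m/s^2)
dg_ugal = 3.0 * (delta_h / r) ** 2 * g0 / 1e-8   # ~3 uGal
```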

As for the linearization of (3), we do not need to repeat the procedure well established in the literature (see Sacerdote and Sansò 1986; Heck 1997; Sansò and Sideris 2013, 2017); rather, we underline some aspects of the near equivalence between the linearized and nonlinear BVP in the following remarks.

**Remark 2.1** Let us recall that the linearization of (3) is done, following the lesson of T. Krarup (2006), by choosing an approximate surface $\tilde{S} = \{h = \tilde{H}(\sigma)\}$, such that $\delta H(\sigma) = H(\sigma) - \tilde{H}(\sigma)$ is known to belong to the linearization band, and at the same time splitting the actual potential W into a normal plus an anomalous potential

$$W\left(h,\sigma\right) = U\left(h,\sigma\right) + T\left(h,\sigma\right) \; ; \tag{15}$$

δH and T are considered as first order infinitesimals. Then the equations linearized starting from $\tilde{S}$ turn out to be

$$DW\_0\left(\sigma\right) = W\_0\left(\sigma\right) - U\left(\tilde{H}\left(\sigma\right), \sigma\right) = T\left(\tilde{H}\left(\sigma\right), \sigma\right) - \gamma\left(\tilde{H}\left(\sigma\right), \sigma\right) \delta H\left(\sigma\right)\,, \tag{16}$$

$$D\mathbf{g}\_0\left(\sigma\right) = \mathbf{g}\_0\left(\sigma\right) - \gamma\left(\tilde{H}\left(\sigma\right), \sigma\right) = \gamma'\left(\tilde{H}\left(\sigma\right), \sigma\right) \delta H\left(\sigma\right) - T'\left(\tilde{H}\left(\sigma\right), \sigma\right)\,; \tag{17}$$

here we have denoted by a prime the derivative of a function f(h, σ) with respect to h, and, as usual,

$$\gamma(h,\sigma) = |\nabla U(h,\sigma)|\,. \tag{18}$$

Typically (16) is used to derive $\delta H(\sigma)$

$$\delta H\left(\sigma\right) = \frac{1}{\gamma\left(\tilde{H}\left(\sigma\right), \sigma\right)} \left[T\left(\tilde{H}\left(\sigma\right), \sigma\right) - DW\_0\left(\sigma\right)\right],\tag{19}$$

known as the generalized Bruns equation; then we substitute it into (17) to get a boundary condition on $\tilde{S}$ for the sole unknown $T$, namely

$$-T'\left(\tilde{H}\left(\sigma\right), \sigma\right) + \frac{\gamma'\left(\tilde{H}\left(\sigma\right), \sigma\right)}{\gamma\left(\tilde{H}\left(\sigma\right), \sigma\right)} T\left(\tilde{H}\left(\sigma\right), \sigma\right) = D\mathbf{g}\_0\left(\sigma\right) + \frac{\gamma'\left(\tilde{H}\left(\sigma\right), \sigma\right)}{\gamma\left(\tilde{H}\left(\sigma\right), \sigma\right)} D W\_0\left(\sigma\right) \,. \tag{20}$$

The new boundary conditions (16) and (17) defined on $\tilde{S}$ neglect second order terms; among them, the one of Eq. (17) is the most worrying. It reads (Sansò and Sideris 2017)

$$\begin{aligned} Q\_2(\mathbf{g}) &= -T'' \left( \tilde{H} \left( \sigma \right), \sigma \right) \delta H \left( \sigma \right) + \\ &\frac{1}{2\gamma \left( \tilde{H} \left( \sigma \right), \sigma \right)} \left[ \left| \nabla T \left( \tilde{H} \left( \sigma \right), \sigma \right) \right|^2 - T'^2 \left( \tilde{H} \left( \sigma \right), \sigma \right) \right]; \end{aligned} \tag{21}$$

it has been shown (see Heck 1997 and references therein) that locally, in mountainous areas, $Q_2(\mathbf{g})$ can reach the level of 0.3 mGal, which is too large to be neglected. Yet, by using (19) and (21) with $T$ substituted by some global model $T_M$, even with a moderate maximum degree, we can reduce the neglected part of $Q_2$ by at least an order of magnitude. In other words, computing $Q_2(\mathbf{g})$ with a model $T_M$ and subtracting it from $D\mathbf{g}_0(\sigma)$ makes the linear boundary value problem (20), complemented by the Bruns equation (19), practically equivalent to the nonlinear (3).
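To fix orders of magnitude, a rough numerical sketch of the two contributions to $Q_2(\mathbf{g})$ in (21) may be useful; all input values below are illustrative assumptions for a rugged area, not figures taken from the text.

```python
# Order-of-magnitude sketch of the second order term Q2(g) of Eq. (21):
#   Q2 ≈ -T'' δH + (|∇T|² − T'²) / (2γ).
# All numbers are assumed, illustrative values for a mountainous area.

gamma = 980.0              # normal gravity [Gal]
T_prime = 0.15             # T' ≈ -δg, a 150 mGal disturbance [Gal]
grad_T_hor = gamma * 3e-4  # horizontal |∇T| from a 60 arcsec deflection [Gal]
T_second = 1.0e-8          # T'' ≈ 10 Eötvös = 1e-8 Gal/cm
delta_H = 1.0e4            # δH ≈ 100 m, expressed in cm

q2 = -T_second * delta_H + (grad_T_hor**2 - T_prime**2) / (2.0 * gamma)
print(f"|Q2| ≈ {abs(q2) * 1e3:.3f} mGal")  # a few hundredths to tenths of a mGal
```

With these assumed values the result is of the order of 0.1 mGal, consistent with the sub-mGal level quoted from Heck (1997).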

# **3 Equivalent Linearized Molodensky Problems**

Summarizing the discussion of the previous section, we can say that the aim of physical geodesy can be accomplished by solving the linearized Molodensky problem. This boils down to: choose an approximate $\tilde{S}$ in the linearization band around the gravimetric surface $S$ and then solve the oblique derivative BVP

$$\begin{cases} \Delta T = 0 & \text{in } \tilde{\Omega} \equiv \{ h \ge \tilde{H} \, (\sigma) \} \\ \left.-T' + \frac{\gamma'}{\gamma} T\right|\_{\tilde{S}} = D\mathbf{g}\_0 + \frac{\gamma'}{\gamma} DW\_0 \end{cases}; \qquad (22)$$

the generalized Bruns relation (19) will then provide us with the height anomaly $\delta H(\sigma)$ that allows us to reconstruct $S$, i.e. $H(\sigma) = \tilde{H}(\sigma) + \delta H(\sigma)$.

As we know, in the asymptotic development of $T$ in spherical harmonics the terms of degree 0 and 1, i.e. the coefficients $T_{0,0}$ and $T_{1,k}$ $(k = -1, 0, 1)$, are typically skipped, as they are related to the total mass of the Earth, accounted for by the normal potential, and to the coordinates of the barycentre, which is chosen as origin of the coordinates (see Sansò and Sideris 2013). This implies that usually (22) is complemented by the asymptotic relation

$$T = \mathcal{O}\left(\frac{1}{r^3}\right), \quad r \to \infty. \tag{23}$$

Since this item is not central in the following discussion, we do not dwell on it.

What is important is that, since the solution of (22) on practical grounds is still a quite challenging task, many efforts have been made, according to different approaches, to reduce (22) to a problem on a Bjerhammar sphere (i.e. a sphere totally internal to the masses) in order to make it numerically tractable.

In fact:


All the mentioned methods extend the harmonicity domain of $T$ at least down to the ellipsoid or to a sphere, and, as we shall discuss in the next section, this operation is improperly posed, basically because high frequency errors tend to be amplified. It is therefore convenient to perform computations on data as smooth as possible. To this purpose we can use to our best the freedom that we have in designing the BVP (22).

The choices that we can exploit are two: one is the choice of $\tilde{S}$, always remaining in the linearization band; the other is to change in a known way the mass distribution below $S$ and modify the data accordingly. This known change can then be retrieved at the end of the computation procedure.

**Remark 3.1** Let us notice that already an early use of the Residual Terrain Correction to modify our data goes exactly in the wanted direction.

# **3.1 The Classical Molodensky Choice**

This is to introduce as $\tilde{S}$ the so-called telluroid, defined by the relation

$$\tilde{S} \equiv S^\* \equiv \left\{ h = h^\* \left( \sigma \right); \quad U \left( h^\* \left( \sigma \right), \sigma \right) = W\_0 \left( \sigma \right) \right\} . \tag{24}$$

The height $h^*$ of a point $P = (h,\sigma)$, identified by the relation

$$U\left(h^\*,\sigma\right) = W\left(h,\sigma\right) \tag{25}$$

is called the normal height of $P$; the set of points $\{Q\}$ that have an ellipsoidal height equal to the normal height of $P$, when it runs on $S$, is the telluroid.

**Remark 3.2** Let us notice that the original choice of Molodensky was to use isozenithal lines instead of the ellipsoidal normal. This leads to the formulation of a free BVP which is now called the vector Molodensky problem (Heck 1997; Sansò and Sideris 2013), the linearization of which was first rigorously carried out in Krarup (2006). On the contrary, the above exposition follows the lines of the scalar Molodensky BVP, which has been developed by the Russian School (Brovar 1964; Pellinen 1982), presented in Heiskanen and Moritz (1967) and Moritz (1980), and theoretically systematized in Sacerdote and Sansò (1986).

Returning to our object, we notice that the choice (25) implies

$$DW\_0 = W\_0 \left( \sigma \right) - U \left( h^\star \left( \sigma \right), \sigma \right) \equiv 0 \; . \tag{26}$$

Therefore, in this case the known term of the oblique derivative problem (22) becomes

$$\begin{split} D\mathbf{g}\_0 + \frac{\gamma'}{\gamma} DW\_0 &= D\mathbf{g}\_0 = \mathbf{g}\left( H\left(\sigma\right), \sigma\right) - \gamma\left( h^\star\left(\sigma\right), \sigma\right) \\ &= \Delta \mathbf{g}\left(\sigma\right)\,, \end{split} \tag{27}$$

namely the ordinary free air geodetic gravity anomaly. At the same time $\delta H(\sigma)$ becomes

$$\delta H\left(\sigma\right) = H\left(\sigma\right) - h^\*\left(\sigma\right) = \zeta\left(\sigma\right) \,, \tag{28}$$

also called height anomaly, and (19) reduces to the ordinary Bruns relation

$$\zeta\left(\sigma\right) = \frac{T\left(h^\star\left(\sigma\right), \sigma\right)}{\gamma\left(h^\star\left(\sigma\right), \sigma\right)}\,.\tag{29}$$

It turns out that physically the height anomaly is everywhere less than 120 m in absolute value, confirming that (22), possibly after applying the correction $Q_2(\mathbf{g})$ (see (21)) in mountainous areas, is fully in the linearization band, i.e. equivalent to the original nonlinear problem.
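As a minimal numeric illustration of the Bruns relation (29), a single evaluation suffices; the value of $T$ below is an assumed, illustrative figure, not one given in the text.

```python
# Ordinary Bruns relation (29): height anomaly = anomalous potential / normal
# gravity.  The anomalous potential value is an illustrative assumption.

gamma = 9.81    # normal gravity at the telluroid point [m/s^2]
T = 300.0       # anomalous potential at the telluroid point [m^2/s^2]

zeta = T / gamma
print(f"zeta = {zeta:.2f} m")  # ≈ 30.58 m, well below the 120 m global bound
```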

In conclusion the linearized Molodensky problem reads

$$\begin{cases} \Delta T = 0 & \text{in } \tilde{\Omega} \\ -T' + \frac{\gamma'}{\gamma} T = \Delta \mathbf{g} & \text{on } \tilde{S}\,. \end{cases} \tag{30}$$

# **3.2 The Helmert Approach**

The Helmert idea was to combine both possibilities: first dislocating the topographic masses from their 3D distribution to a single layer on a condensation surface $S_0$ (the condensation process), and then substituting in $W$ the attraction of each column with that of the single layer element at its basis (Vaníček and Martinec 1994). In formula, accepting a spherical approximation where the ellipsoid is substituted by a sphere of radius $R_0$, this reads

$$dM = d\sigma \int\_{R\_0 - D(\sigma)}^{R(\sigma) = R\_0 + H(\sigma)} r^2 \rho \,(r, \sigma) \, dr = \kappa \,(\sigma) \, dS\_0 \,, \tag{31}$$

where $D(\sigma)$ is the depth of the condensation surface, $dS_0$ is its area element and $dM$ is the mass contained in the column.

In practice, an approximation is applied to (31), namely $\rho$ is considered as constant along the column, implying that (31) can be written as

$$\begin{aligned} \kappa \left( \sigma \right) &= \rho \left[ H \left( \sigma \right) + D \left( \sigma \right) \right] \\ &\cdot \left[ R\_0^2 + R\_0 \left( H \left( \sigma \right) - D \left( \sigma \right) \right) + \frac{1}{3} H \left( \sigma \right)^2 + \frac{1}{3} D \left( \sigma \right)^2 \right] \frac{d\sigma}{dS\_0} \; ; \end{aligned} \tag{32}$$

no doubt the approximation $\rho(r,\sigma) = \text{constant}$ is one of the weakest points of Helmert's approach. Investigations on the impact of using density models with lateral or even 3D variations have been done in Kingdon et al. (2009). Yet, this does not invalidate the approach, as far as what we subtract at the beginning we add back at the end. Only the conclusion that all masses above the geoid are so removed has to be regarded as an approximate statement.

The term $d\sigma/dS_0$ depends on what condensation surface $S_0$ is chosen; in the literature two choices are considered (see Heck 2003). The first condensation method corresponds to $D(\sigma) = 0$; the second method, the one mostly applied in recent literature (see for instance Vaníček et al. 1999), consists in choosing the geoid itself as $S_0$, i.e. $D(\sigma) = -N(\sigma)$, with $N(\sigma)$ the geoid undulation. This has the effect of transforming the term $H(\sigma) + D(\sigma)$ into

$$H\_0\left(\sigma\right) = H\left(\sigma\right) - N\left(\sigma\right) \,, \tag{33}$$

i.e. the orthometric height of the point on $S$ with horizontal coordinates $\sigma$. $H_0(\sigma)$ is considered as known on $S$, although this statement is not so firm, because once more, to know $H_0(\sigma)$ one would also need to know $\rho(h,\sigma)$ below the point $P(h,\sigma)$ to compute the necessary orthometric corrections (see Heiskanen and Moritz 1967, Sansò et al. 2019).

Yet, by taking the geoid as S0, one has

$$dS\_0 = \frac{R\_0^2}{\cos \delta\_0} d\sigma \tag{34}$$

with $\delta_0$ the deflection of the vertical. Since we have at most $\delta_0 \sim 3\cdot 10^{-4}$, (34) can be safely written as $dS_0 = R_0^2 \, d\sigma$ and (32), after one further simplification, becomes just

$$
\kappa \left( \sigma \right) = \rho \, H\_0 \left( \sigma \right) \, . \tag{35}
$$
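The quality of the chain of simplifications leading to (35) can be checked with a few lines of code; the density and heights below are illustrative assumptions, not values from the text.

```python
# Compare the exact column mass per unit area of S0 (cf. (31) with dS0 = R0² dσ)
# with the planar approximation (35), kappa = rho * H0.
# Density and heights are assumed, illustrative values.

rho = 2670.0    # topographic density [kg/m^3]
R0 = 6.371e6    # mean Earth radius [m]
H0 = 2000.0     # height of the topographic column [m]

# exact surface density: (1/R0^2) * integral_{R0}^{R0+H0} rho * r^2 dr
kappa_exact = rho * ((R0 + H0)**3 - R0**3) / (3.0 * R0**2)
kappa_approx = rho * H0   # Eq. (35)

rel_err = (kappa_exact - kappa_approx) / kappa_exact
print(f"relative error of (35): {rel_err:.1e}")  # of order H0/R0, ~3e-4 here
```

Even for a 2 km column the planar formula (35) is accurate at the few parts in $10^4$ level, which justifies its use in the condensation step.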

The Helmert potential correction is then defined as

$$
\delta V^H \left( h, \sigma \right) = V\_T \left( h, \sigma \right) - V\_C \left( h, \sigma \right) \,, \tag{36}
$$

where VT is the potential of the topographic masses, i.e. the masses between the geoid and S, and VC is the potential of the single layer with density (35).

**Remark 3.3** The topographic potential VT is strictly speaking not computable as it is given on S by the formula (in spherical approximation)

$$V\_T\left(H\left(\sigma\right),\sigma\right) = G\rho \int d\sigma' \int\_{R\_0 + N\left(\sigma'\right)}^{R\_0 + H\_0(\sigma') + N\left(\sigma'\right)} \frac{r^2 dr}{\ell\_{\sigma\sigma'}},\tag{37}$$

where $\ell_{\sigma\sigma'}$ is the distance between $P\left(R_0 + H(\sigma), \sigma\right)$ and the running point $(r,\sigma')$. As we see, $V_T$ on $S$ depends on $N(\sigma)$, which is in fact unknown. Yet, as proved in Sansò and Sideris (2017), (37) can be substituted, with a small error, by

$$V\_T\left(H\left(\sigma\right), \sigma\right) \cong G\rho \int d\sigma' \int\_{R\_0}^{R\_0 + H\_0(\sigma')} \frac{r^2 dr}{\ell\_{\sigma\sigma'}} \qquad (38)$$

which is then computable with the available information and may then be refined by iterating on (37). The same holds therefore for $\delta V^H(h,\sigma)$ in (36).

One fundamental statement of empirical nature we will need in the next development is that

$$\mathcal{O}\left(\max\_\sigma \frac{\delta V^H(\sigma)}{\gamma(\sigma)}\right) = 2 \text{ m} \tag{39}$$

(see Vaníček et al. 1999). Since this is two orders of magnitude smaller than the width of the band where two nonlinear formulations of Molodensky's problem can be transformed linearly one into the other, we can safely assume that the use of the corrective term $\delta V^H(h,\sigma)$ will not change the problem yielding the sought potential.

By definition the Helmert potential $W^H$ is given by

$$\begin{split}W^H\left(h,\sigma\right) &= W\left(h,\sigma\right) - \delta V^H\left(h,\sigma\right) \\ &= W\left(h,\sigma\right) - V\_T\left(h,\sigma\right) + V\_C\left(h,\sigma\right) \end{split} \quad (40)$$

namely it is the actual potential where the topographic part $V_T$ is substituted by the potential of the same masses squeezed onto the geoid $S_0$. As such, $W^H$, apart from its centrifugal component, is harmonic down to the geoid, at least if the hypothesis of constant density is considered correct.

Since $\delta V^H(h,\sigma)$ is computable at the level of $S$, we come to know $W_0^H(\sigma)$ on it. With the same accuracy we now have to transform $\mathbf{g}_0(\sigma)$ into $\mathbf{g}_0^H(\sigma)$. Denoting by $\underline{\mathbf{n}}^H$ the vertical direction of the Helmert field, according to (39) one has for $P \in S$

$$\begin{aligned} \mathbf{g}\left(P\right) &= \left|\nabla W^H\left(P\right) + \nabla \delta V^H\left(P\right)\right| \\ &\cong \mathbf{g}^H\left(P\right) - \underline{\mathbf{n}}^H \cdot \nabla \delta V^H\left(P\right) \\ &\cong \mathbf{g}^H\left(P\right) - \frac{\partial}{\partial h} \delta V^H\left(P\right) \,. \end{aligned} \tag{41}$$

All approximations here are easily justified on the basis of Remark 3.3.

Now we have transformed our original nonlinear problem for $W$ into an equivalent nonlinear problem for $W^H = W - \delta V^H$ by claiming that $W^H$ minus the centrifugal term has to be harmonic outside $S$ and that on (the unknown) $S$ one must have

$$W\_0^H \left( \sigma \right) = W\_0 \left( \sigma \right) - \delta V^H \left( \sigma \right) \tag{42}$$

$$\mathbf{g}\_0^H(\sigma) = \mathbf{g}\_0(\sigma) + \frac{\partial \delta V^H}{\partial h}(\sigma) \;. \tag{43}$$

The two scalar nonlinear Molodensky problems for $W$ and $W^H$ are equivalent by construction. So if we go to the respective linearized versions for the anomalous potentials $T$ and $T^H$, i.e.

$$T = W - U\ ,\tag{44}$$

$$T^H = W^H - U = W - U - \delta V^H = T - \delta V^H,\quad(45)$$

we would not need to make any computation to claim that also the two linearized problems are equivalent since each of them is equivalent to its nonlinear version. Yet, for the sake of clarity we proceed to verify this statement.

For this purpose we notice that we can define a Helmert telluroid $S^{\star H} \equiv \left\{ h = h^{\star H}(\sigma) \right\}$ by stating that

$$W^H\left(H\left(\sigma\right),\sigma\right) = U\left(h^{\star H}\left(\sigma\right),\sigma\right) \,. \tag{46}$$

Then (46) suitably linearized gives us the Helmert-Bruns relation

$$\zeta^H \left( \sigma \right) = H \left( \sigma \right) - h^{\star H} \left( \sigma \right) = \frac{T^H \left( h^{\star H} \left( \sigma \right), \sigma \right)}{\gamma \left( h^{\star H} \left( \sigma \right), \sigma \right)} \, . \quad (47)$$

We observe that from (45) one has too

$$\zeta^H = \frac{T^H}{\gamma} = \frac{T}{\gamma} - \frac{\delta V^H}{\gamma} = \zeta - \frac{\delta V^H}{\gamma} \qquad (48)$$

and from (28) and (46)

$$h^{\star H}\left(\sigma\right) - h^{\star}\left(\sigma\right) = \zeta\left(\sigma\right) - \zeta^H\left(\sigma\right) = \frac{\delta V^H}{\mathcal{Y}}\,. \qquad (49)$$

Finally the linearized boundary condition for $T^H$ reads

$$-\frac{\partial}{\partial h}T^H\left(h^{\star H}\left(\sigma\right),\sigma\right) + \frac{\gamma'}{\gamma}T^H\left(h^{\star H}\left(\sigma\right),\sigma\right) = \mathbf{g}\_0^H\left(\sigma\right) - \gamma\left(h^{\star H}\left(\sigma\right),\sigma\right) = \Delta \mathbf{g}\_0^H\left(\sigma\right) \,. \tag{50}$$

Now let $T$ be the solution of (30), i.e. with boundary condition on $S^\star$

$$-\frac{\partial T}{\partial h}\left(h^{\star}\left(\sigma\right),\sigma\right) + \frac{\gamma'}{\gamma}T\left(h^{\star}\left(\sigma\right),\sigma\right) = \mathbf{g}\_{0}\left(\sigma\right) - \gamma\left(h^{\star}\left(\sigma\right),\sigma\right) \tag{51}$$

and $T^H$ the solution of the same problem on $S^{\star H}$ with boundary condition (50). We prove that $T - T^H = \delta V^H(\sigma)$, so that (49) holds too and then

$$H\left(\sigma\right) = h^\*\left(\sigma\right) + \zeta\left(\sigma\right) = h^{\*H}\left(\sigma\right) + \zeta^H\left(\sigma\right) \,, \quad (52)$$

i.e. the gravimetric surface reconstructed from the two problems is the same, and at the same time

$$W\_0\left(\sigma\right) = W\_0^H\left(\sigma\right) + \delta V^H\left(\sigma\right) \,, \tag{53}$$

i.e. the potential of the gravity field is the same outside $S$. All this holds in the usual linear approximation, since we are moving well inside the linearization band.

If we take (51) minus (50) and consider that $h^{\star H}(\sigma) - h^{\star}(\sigma)$ is of the order of a few meters, we can write

$$\begin{aligned} \left.-\frac{\partial}{\partial h}\left(T - T^H\right) + \frac{\gamma'}{\gamma}\left(T - T^H\right)\right|\_{S^\star} &= \mathbf{g}\_0\left(\sigma\right) - \mathbf{g}\_0^H\left(\sigma\right) - \gamma\left(h^\star\left(\sigma\right), \sigma\right) + \gamma\left(h^{\star H}\left(\sigma\right), \sigma\right) \\ &= -\frac{\partial \delta V^H}{\partial h} + \gamma'\left(h^{\star H}\left(\sigma\right) - h^\star\left(\sigma\right)\right) \,. \end{aligned} \tag{54}$$

On the other hand, by the very definition of $S^\star$ and $S^{\star H}$, we have

$$\begin{aligned} \delta V^H \left( \sigma \right) &= W\_0 \left( \sigma \right) - W\_0^H \left( \sigma \right) \\ &= U \left( h^\star \left( \sigma \right), \sigma \right) - U \left( h^{\star H} \left( \sigma \right), \sigma \right) \\ &= -\gamma \cdot \left( h^\star \left( \sigma \right) - h^{\star H} \left( \sigma \right) \right), \end{aligned} \tag{55}$$

namely

$$h^{\star H} \left( \sigma \right) - h^{\star} \left( \sigma \right) = \frac{\delta V^H \left( \sigma \right)}{\gamma} \,. \tag{56}$$

Substituting (56) into (54) we see that

$$\left.-\frac{\partial}{\partial h}\left(T - T^H\right) + \frac{\gamma'}{\gamma}\left(T - T^H\right)\right|\_{S^\star} = -\frac{\partial \,\delta V^H}{\partial h} + \frac{\gamma'}{\gamma}\delta V^H \tag{57}$$

and therefore, by the uniqueness of the solution of Molodensky's problem, also recalling that we do not consider here the first degree harmonics problem, we find

$$T - T^H = \delta V^H \tag{58}$$

as we wanted to prove.

The conclusions of this section are that:


# **4 The Problem of the Internal Potential**

The question we want to focus on is not so much the equivalence between Molodensky's and Helmert's solutions in the topographic layer (i.e. between $S$ and the geoid), but rather to clarify what are the error sources that limit the knowledge of $W$ in this layer. Such errors are then reflected on the accuracy of the determination of internal equipotential surfaces, in particular of the geoid, defined as the equipotential surface corresponding to some conventional value $\overline{W}_0$ of $W$.

In fact the problem has to be set in the following way, as already recalled in Sect. 1:

$$\text{given} \quad S = \{ r = R \: (\sigma) \}\tag{59}$$

$$\text{and} \quad \begin{cases} W\left(R\left(\sigma\right), \sigma\right) = W\_0\left(\sigma\right) \\ \left| \nabla W\left(R\left(\sigma\right), \sigma\right) \right| = \mathbf{g}\_0\left(\sigma\right) \end{cases} \qquad (60)$$

and given the mass density $\rho = \rho(r,\sigma)$ in a layer $L = (S_0, S)$ below $S$, we have to determine $W$ satisfying the conditions (60) on $S$ and the Poisson equation

$$\Delta V = \Delta \left( W - \frac{1}{2} \omega^2 \left( x^2 + y^2 \right) \right) = -4 \pi G \rho \qquad (61)$$

in the layer L.

There are two possible versions of the problem:


$$W\left(R\_0\left(\sigma\right), \sigma\right) = \overline{W}\_0\tag{62}$$

with $\overline{W}_0$ a constant such that

$$\overline{W}\_0 > \max\_{\sigma} W\left(R\left(\sigma\right), \sigma\right) \,. \tag{63}$$

It is clear that (a) includes (b) as soon as we are able to fix a surface $S_0$ such that the equipotential surface $\{W = \overline{W}_0\}$ is placed in an intermediate position between $S_0$ itself and $S$; this can be verified a posteriori if the solution $W$ found in $L$ is such that on $S_0$

$$\min\_{\sigma} W\left(R\_0\left(\sigma\right), \sigma\right) > \overline{W}\_0\,. \tag{64}$$

For the Earth, for instance, an $S_0$ inside the ellipsoid, some 150 m down, will do if the target surface has to be the geoid. The reason for stating the condition (63), or the check (64), is that $W$ is increasing going downward, so $\overline{W}_0$ on the geoid is larger than $W$ at any point with positive height, and $W$ is larger than $\overline{W}_0$ at any point on $S_0$.

**Remark 4.1** In any event an important remark is necessary here to interpret the initial conditions (60). In fact we have to recall that, as far as $V = W - \frac{1}{2}\omega^2\left(x^2 + y^2\right)$ is a Newtonian potential generated by a bounded density, we expect this function to be globally continuous together with its first derivatives, which are even Hölder continuous for any exponent $\lambda < 1$. This is an easy combination of a majorization of $W$ and $\nabla W$, derived from Newton's integral when $\rho$ is in $L^p(B)$ for all $p$ (which is true because the Earth body $B$ is bounded and $\rho$ is bounded too) (Miranda 1970), and of well known embedding theorems of Sobolev spaces in Hölder spaces (see e.g. Adams 1975). Physically it means that any sensible solution for the potential $W$ cannot have a discontinuity in the first derivatives across $S$, because otherwise we would have a single layer (i.e. an unbounded $\rho$) on this surface. Therefore, we can expect that $W_0$ and $\mathbf{g}_0$ are pointwise well defined functions, as they are traces of spacewise Hölder functions on a surface $S$ which, as a minimum requirement, is Lipschitz continuous.

At this point it is convenient to observe that it is useless to carry on (60), i.e. a nonlinear condition on $\nabla W$ on $S$. In fact from $W_0(\sigma)$ and $\mathbf{g}_0(\sigma)$ we can reconstruct any first derivative of $W$ on $S$. Assume for instance we want to have $W'|_S = \left.\frac{\partial W}{\partial r}\right|_S$; after defining

$$\underline{\partial}\_{\sigma} W \left( R \left( \sigma \right), \sigma \right) = \left. \nabla\_{\sigma} W \left( r, \sigma \right) \right|\_{r = R(\sigma)}, \qquad (65)$$

we can write the system

$$\nabla\_{\sigma} W\_0 \left( \sigma \right) = W' \left( R \left( \sigma \right), \sigma \right) \nabla\_{\sigma} R \left( \sigma \right) + \underline{\partial}\_{\sigma} W \left( R \left( \sigma \right), \sigma \right) \tag{66}$$

$$\mathrm{g}\_0^2(\sigma) = W'(R\left(\sigma\right), \sigma)^2 + \frac{1}{R\left(\sigma\right)^2} \left| \underline{\partial}\_{\sigma} W\left(R\left(\sigma\right), \sigma\right) \right|^2 \,. \tag{67}$$

Deriving $\underline{\partial}_\sigma W$ from (66) and substituting it into (67), we get an easily solvable quadratic equation in $W'\left(R(\sigma),\sigma\right)$. In a similar fashion, by using $T(r,\sigma) = W(r,\sigma) - U(r,\sigma)$ and $T'(r,\sigma) = W'(r,\sigma) - U'(r,\sigma)$, we can get hold on $S$ of the initial values

$$T\_0\left(\sigma\right) = T\left(R\left(\sigma\right), \sigma\right) \tag{68}$$

and

$$\delta \mathbf{g}\_0 \left( \sigma \right) = -T' \left( R \left( \sigma \right), \sigma \right) \tag{69}$$

or even, equivalently,

$$
\Delta \mathbf{g}\_0 \left( \sigma \right) = -T' - \frac{2}{R \left( \sigma \right)} T \, \,. \tag{70}
$$

Clearly the symbols $\delta\mathbf{g}$ and $\Delta\mathbf{g}$ refer to the gravity disturbance and the gravity anomaly respectively, here in spherical approximation.
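The reduction of (66)–(67) to a quadratic equation for $W'$ can be sketched numerically; the surface values at the chosen point $\sigma$ below are assumed, illustrative numbers, not data from the text.

```python
import numpy as np

# Recover W' = ∂W/∂r on S from W0 and g0 by eliminating ∂σW between (66) and
# (67).  Substituting ∂σW = ∇σW0 − W' ∇σR into (67) gives the quadratic
#   (1 + |∇σR|²/R²) W'² − 2 (∇σW0·∇σR)/R² W' + |∇σW0|²/R² − g0² = 0.
# All surface data below are assumed, illustrative values at one point σ.

R = 6.371e6                          # R(σ) [m]
g0 = 9.81                            # measured |∇W| on S [m/s²]
grad_W0 = np.array([2.0e2, -1.0e2])  # ∇σW0 [m²/s² per rad]
grad_R = np.array([500.0, -300.0])   # ∇σR, the surface slope [m per rad]

a = 1.0 + (grad_R @ grad_R) / R**2
b = -2.0 * (grad_W0 @ grad_R) / R**2
c = (grad_W0 @ grad_W0) / R**2 - g0**2

# take the negative root, since W decreases upward (W' < 0)
W_prime = (-b - np.sqrt(b**2 - 4.0 * a * c)) / (2.0 * a)
print(f"W' = {W_prime:.5f} m/s^2")  # ≈ -g0 for a gently sloping surface
```

For realistic slopes the correction with respect to $W' = -\mathbf{g}_0$ is tiny, but the construction shows that (60) indeed fixes all first derivatives of $W$ on $S$.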

In any case, the problem we are facing now is a Cauchy problem for the Poisson equation for $T$, which, as recalled in Sect. 1, has a solution that, when it exists, does not depend continuously on the data.

Referring to the formulation (a) of a fixed layer $L$ and to the initial data (68) and (69) on $S$, it seems natural as a first step to subtract the known "topographic" potential $V_T$ (see (31)) from the data and then solve the downward continuation of $T - V_T$, which now has to be harmonic in $L$.

As we see in the following elementary Example 4.1, however, this leads to correcting $T$ with a term $V_T$ which can easily be 10 times larger than the former, especially in mountainous areas. Since this is never a good idea, as it multiplies the various model errors by 10 as well, Helmert invented his "trick" of substituting $V_T$ by its condensed version $V_C$ on the geoid. In this way in fact the residual (Helmertized) potential is still close to $T$, but harmonic in $L$, and the downward continuation problem is put in its pure form. To be precise, since we fixed $S_0$, the above refers to the first Helmert condensation method (see Heck 2003); this however has little relevance to the present discussion.

**Example 4.1** Let us consider the rather simplistic case in which $S$ is a sphere of radius $R$, $S_0$ a sphere of radius $R_0$, with $R = R_0 + 1$ (in km), $R_0 \cong 6\cdot 10^3$ km, and the layer $L$ between $S_0$ and $S$ is filled with a mass of constant density $\rho = 2.67$ g cm$^{-3}$, so that $4\pi G\rho \cong 0.2$ Gal km$^{-1}$. Moreover let us assume that $\gamma \cong 10^3$ Gal, as is approximately true in reality. Then

$$\begin{aligned} \frac{V\_T \left(R\_0, \sigma\right)}{\gamma} &= \frac{\frac{4}{3}\pi G\rho \left[\left(R\_0 + 1\right)^3 - R\_0^3\right]}{R\_0\, \gamma} \\ &= \frac{4\pi G\rho}{\gamma} \left[R\_0 + 1 + \frac{1}{3R\_0}\right] \\ &\cong 1.2\ \text{km} + 0.2 \cdot 10^{-3}\ \text{km} + 10^{-8}\ \text{km}\,. \end{aligned}$$

As we see, in terms of shifts of the equipotential surfaces, the first term is very large (10 times the order of magnitude of the geoid), the second term amounts to only 20 cm and the third term is negligible. But the first term is exactly the effect of the condensed Helmert layer, so the second term is essentially the Helmert correction to the potential. Indeed, in high mountains and with a more complicated geometry, this term can rise by an order of magnitude, as stated in (39).
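The three terms of Example 4.1 can be reproduced directly from the quoted figures:

```python
# Numerical check of Example 4.1: the three terms of V_T(R0,σ)/γ for a 1 km
# constant-density shell on a sphere, using the figures quoted in the text:
# R0 = 6000 km, 4πGρ = 0.2 Gal/km, γ = 1000 Gal.

R0 = 6.0e3            # [km]
four_pi_G_rho = 0.2   # [Gal/km]
gamma = 1.0e3         # [Gal]

c = four_pi_G_rho / gamma        # common factor [1/km]
term1 = c * R0                   # effect of the condensed Helmert layer
term2 = c * 1.0                  # essentially the Helmert correction (20 cm)
term3 = c / (3.0 * R0)           # negligible remainder
print(term1, term2, term3)       # 1.2 km, 2e-4 km, ~1e-8 km
```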

At this point, if we already knew $\delta\mathbf{g}(r,\sigma)$ or $\Delta\mathbf{g}(r,\sigma)$ inside $L$, we could simply integrate a first order differential equation, namely

$$\frac{\partial T}{\partial r} = -\delta \mathbf{g} \quad \rightarrow \quad T\left(r, \sigma\right) = T\_0\left(\sigma\right) + \int\_r^{R(\sigma)} \delta \mathbf{g}\left(s, \sigma\right) ds \tag{71}$$

or

$$\frac{\partial T}{\partial r} + \frac{2}{r}T = -\Delta \mathbf{g} \quad \rightarrow \quad T\left(r, \sigma\right) = \frac{R\left(\sigma\right)^2}{r^2}T\_0\left(\sigma\right) + \int\_r^{R(\sigma)} \frac{s^2}{r^2} \Delta \mathbf{g}\left(s, \sigma\right) ds \,. \quad (72)$$

So the problem is to continue, say, $\delta\mathbf{g}$, for the sake of definiteness, in $L$, either in free air, if we refer to the Helmert version, or following the classical Bouguer theory (Heiskanen and Moritz 1967), if we leave the masses in $L$. In both cases we solve essentially the approximate Bruns equation

$$\frac{\partial \delta \mathbf{g}}{\partial r} = -\frac{2}{r} \delta \mathbf{g} - 4\pi G\rho \; , \tag{73}$$

the last term being absent in the Helmert approach.
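Equation (73) is a linear first order ODE in $r$ that can be integrated downward numerically and compared with its closed form, obtained from $(r^2\delta\mathbf{g})' = -4\pi G\rho\, r^2$; the radius, density and surface value below are illustrative assumptions.

```python
# Downward continuation of δg by the approximate Bruns equation (73),
# integrated with a simple RK4 scheme and checked against the closed form
#   δg(r) = [R² δg0 + (4πGρ/3)(R³ − r³)] / r².
# Radius, density and δg0 are assumed, illustrative values.

four_pi_G_rho = 2.24e-6   # 4πGρ for ρ ≈ 2.67 g/cm³ [1/s²]
R = 6.372e6               # radius of S [m]
dg0 = 1.0e-3              # δg on S: 100 mGal [m/s²]

def rhs(r, dg):
    return -2.0 * dg / r - four_pi_G_rho   # Eq. (73)

n = 1000
h = -1000.0 / n           # continue 1 km downward
r, dg = R, dg0
for _ in range(n):
    k1 = rhs(r, dg)
    k2 = rhs(r + h/2, dg + h/2 * k1)
    k3 = rhs(r + h/2, dg + h/2 * k2)
    k4 = rhs(r + h, dg + h * k3)
    dg += h/6 * (k1 + 2*k2 + 2*k3 + k4)
    r += h

dg_exact = (R**2 * dg0 + four_pi_G_rho / 3.0 * (R**3 - r**3)) / r**2
print(dg, dg_exact)       # both ≈ 3.2e-3 m/s² at 1 km depth
```

The ~2.24 mGal/m growth of $\delta\mathbf{g}$ going downward is just the Bouguer-type effect of the last term; in the Helmert version, with that term absent, the continuation is much milder.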

In Sansò and Sideris (2013, appendix A2), (73) has been justified by using, in the exact Bruns equation, the estimate of the difference between the mean curvature $\mathcal{C}$ of the true equipotentials and the corresponding normal counterpart $\mathcal{K}$

$$|\mathcal{C} - \mathcal{K}| \sim \frac{10^{-3}}{2R} \tag{74}$$

which has been derived as a mean square value for the high resolution global model EGM96 (Lemoine et al. 1998).

To better understand the error committed in this approximation, one can simply compare (73) with Poisson's equation, written in spherical coordinates, and recall that $\delta\mathbf{g} = -T'$; then one sees that in (73) the term

$$\mathcal{E}\left(\delta \mathbf{g}\right) = \frac{1}{r^2} \triangle\_\sigma T \tag{75}$$

is lost. Although this can have a relatively small r.m.s., satisfying (74), it is clear that it can become quite large in rugged areas. This suggests that a much better approximation could be obtained by substituting (73) with

$$\frac{\partial \delta \mathbf{g}}{\partial r} = -\frac{2}{r} \delta \mathbf{g} - 4\pi G \rho - \frac{1}{r^2} \Delta\_\sigma T\_M \,, \tag{76}$$

where $T_M$ is some high resolution global model of $T$. Calling

$$f = 4\pi G\rho + \frac{1}{r^2} \triangle\_\sigma T\_M \,, \tag{77}$$

(76) has the solution

$$\delta \mathbf{g}\left(r, \sigma\right) = \frac{R^2\left(\sigma\right)}{r^2} \delta \mathbf{g}\_0\left(\sigma\right) + \int\_r^{R\left(\sigma\right)} \frac{s^2}{r^2} f\left(s, \sigma\right) ds\,\,. \tag{78}$$

This, substituted in (71), gives the sought solution $T(r,\sigma)$ in $L$. After some computations the result is

$$T\left(r,\sigma\right) = T\_0\left(\sigma\right) + \delta \mathbf{g}\_0\left(\sigma\right)R^2\left(\sigma\right)\left(\frac{1}{r} - \frac{1}{R\left(\sigma\right)}\right) + \int\_r^{R\left(\sigma\right)} s^2\left(\frac{1}{r} - \frac{1}{s}\right) f\left(s,\sigma\right) ds\,\,. \tag{79}$$
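A direct numerical check of the pair (78)–(79) is easy: evaluating $T$ from (79) with a toy source term $f$ and differentiating it must reproduce $-\delta\mathbf{g}$ from (78); all input values below are illustrative assumptions.

```python
import numpy as np

# Evaluate (78) and (79) by trapezoidal quadrature for a toy, r-constant
# source term f (standing in for 4πGρ + Δσ T_M / r²) and verify numerically
# that -dT/dr reproduces δg, i.e. that (79) is consistent with (71) and (78).
# Surface values are assumed, illustrative numbers.

R = 6.372e6      # R(σ) [m]
T0 = 300.0       # T on S [m²/s²]
dg0 = 1.0e-3     # δg on S [m/s²]
f0 = 2.3e-6      # toy constant source term f [1/s²]

def trap(y, x):  # simple trapezoidal rule
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def delta_g(r):  # Eq. (78)
    s = np.linspace(r, R, 2001)
    return (R**2 / r**2) * dg0 + trap(s**2 * f0 / r**2, s)

def T(r):        # Eq. (79)
    s = np.linspace(r, R, 2001)
    return (T0 + dg0 * R**2 * (1.0/r - 1.0/R)
            + trap(s**2 * (1.0/r - 1.0/s) * f0, s))

r = R - 1000.0   # 1 km below the surface
dTdr = (T(r + 0.5) - T(r - 0.5)) / 1.0   # central difference
print(delta_g(r), -dTdr)                 # the two values agree
```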

One has to underline that this line of thought is already contained in Heiskanen and Moritz (1967).

# **5 Conclusions**

Two main conclusions can be drawn from the presented analysis:

	- the imperfect knowledge of $\rho$; this gives the same error in both approaches,
	- the downward continuation error; since this can be reduced by computing correction terms with the help of a global model $T_M$, the use of Helmert's approach, which removes discontinuities, and of a Helmertized model $T_M^H$, is likely to produce better results.

# **References**

Adams RA (1975) Sobolev spaces. Academic Press, New York


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/ licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **List of Reviewers**

Alfonso Vitti Amir Khodabandeh Baocheng Zhang Blažej Bucha Claudia Noemi Tocho Dawid Kwasniak Dimitrios Piretzidis Fernando Sansò Hermann Drewes Ismael Foroughi Jeff Freymueller Jianliang Huang Jie Han Krzysztof Nowel Krzysztof Sosnica Marcelo Santos Martin Pitonák Martina Idzanovic Mehdi Eshagh Michael Schmidt Mintourakis Ioannis Mirko Reguzzoni Mohammad Ali Sharifi Pavel Novák Petr Holota Reiner Rummel Riccardo Barzaghi Robert Odolinski Roland Klees Ruediger Lehmann Silvio R.C. de Freitas Urs Marti Wolf-Dieter Schuh Xing Fang Yan Ming Wang Zuzana Minarechova

# **Author Index**

# **A**

Abbak, R., 30 Abd-Elmotaal, H.A., 47–52 Abramowitz, M., 115 Abrehdary, M., 3–5, 171–174 Adams, R.A., 189 Afrasteh, Y., 31 Al Hage, J., 57 Albarici, F.L., 28 Albertella, A., 130 Alkhatib, H., 93–98 Amiri Simkooei, A., 73, 76 Andersen, O.B., 120, 130 Anderson, A., 106 Argyris, J.H., 120, 131 Arnold, S.F., 76 Askey, R., 66, 67, 69–71

#### **B**

Baarda, W., 73, 75, 76 Bagherbandi, M., 3, 28, 171 Barbot, S., 108 Barzaghi, R., 11–18, 181–190 Becker, S., 120, 130 Bergé-Nguyen, M., 19–25 Bingham, R.J., 130 Bjerhammar, A., 172, 173 Blaha, G., 155 Bochner, S., 140, 141 Bolmer, E., 120 Borlinghaus, M., 119–127, 129–136 Borsa, A.A., 23 Borutta, H., 73, 76 Box, G.E.P., 114, 150 Brockmann, J.M., 113–117, 119–127, 129–136, 149–157 Brockwell, P.J., 140, 149 Brovar, V.V., 185 Bruinsma, S., 50 Buhmann, M., 139 Buttkus, B., 114

#### **C**

Cabane, S., 57, 58 Cady, J.W., 23 Caspary, W., 73, 76 Charbonnier, R., 113 Chaves, C.A.M., 28 Cheng, H., 104 Chernih, A., 67 Chilès, J.P., 140–142 Chmielewski, M.A., 57 Cohn, S.E., 139, 140 Collins, P., 159, 160 Crétaux, J.-F., 19–25 Crossley, D., 104 Cunderlík, R., 130

#### **D**

Daley, D.J., 140
Davis, R.A., 114, 140, 149
De Boor, C., 121
Delfiner, P., 140, 141
Deming, W.E., 88–90
Devaraju, B., 65
Dhital, A., 57
Dins, A., 57
Dorndorf, A., 93–98
Duquenne, H., 28, 30
Dziewonski, A.M., 106

#### **E**

Emery, X., 65
Engwirda, D., 120
Eremeev, V.F., 177
Esch, C., 153
Eshagh, M., 39–41

#### **F**

Fahrmeir, L., 121
Fang, X., 84, 85, 88, 89
Farrell, W.E., 104
Featherstone, W.E., 176
Ferguson, S., 27–34
Fialko, Y., 108
Foroughi, I., 7–10, 27–34, 173, 175–179
Forsberg, R., 50, 182
Fraser, D., 27
Froidevaux, R., 140
Fu, G., 104
Fu, L.L., 129
Fu, W., 109
Fuentes, M., 66
Furrer, R., 140

#### **G**

Gaposchkin, E.M., 129
Gaspari, G., 139, 140, 143
Gauss, C.F., 175

© The Author(s) 2024 J. T. Freymueller, L. Sánchez (eds.), *X Hotine-Marussi Symposium on Mathematical Geodesy*, International Association of Geodesy Symposia 155, https://doi.org/10.1007/978-3-031-55360-8

Gelman, A., 95, 96
Gilardoni, M., 130
Gneiting, T., 66, 140, 141
Goli, M., 27–34
Goyal, R., 28, 30
Gradshteyn, I.S., 67, 70, 142
Grafarend, E.W., 37
Grenier, Y., 113
Guinness, J., 66

#### **H**

Hall, M., 113
Hamilton, J.D., 114
Han, J., 87, 88
Hartmann, J., 30
Hasegawa, A., 104
Heck, B., 47–52, 182–186, 189
Heiskanen, W., 20, 21, 25, 175–177
Heiskanen, W.A., 3, 8, 11, 13, 15, 16, 18, 27, 39, 171, 172, 182, 185, 186, 190
Heki, K., 104
Henderson, H.V., 77
Heng, L., 57
Heunecke, O., 78
Hofmann-Wellenhof, B., 130
Hörmander, L., 182
Hotine, M., 175, 176
Hristopulos, D.T., 142
Huang, J., 27, 28
Huang, P., 104
Hubbert, S., 66, 67
Hunegnaw, A., 28

#### **I**

Imparato, D., 76

#### **J**

Janák, J., 28, 31
Jazaeri, S., 83–90
Jekeli, C., 67
Jenkins, G., 150
Jin, T., 120
Joarder, A.H., 57
Journel, A.G., 140

#### **K**

Kalman, R.E., 161
Kamen, E.W., 113, 114
Kargoll, B., 93–98, 113
Kelly, G., 89, 90
Khodabandeh, A., 159–167
Kiamehr, R., 28
Kibria, B.M.G., 57
Kingdon, R., 7–10, 28, 175–179, 186
Klees, R., 30
Klein, I., 73
Klemann, V., 103–110
Knopp, K., 150
Knudsen, P., 120, 130, 133
Koch, K.-R., 121, 161
Korte, J., 113–117, 149–157
Kösters, A., 73, 76

Kotsakis, C., 65–71
Kraiger, G., 48
Krakiwsky, E.J., 8, 176
Krarup, T., 182, 184, 185
Kuhn, M., 28, 33, 176
Kühtreiber, N., 47–52
Kusche, J., 121
Kutterer, H., 47–52
Kvas, A., 132

#### **L**

Lange, K.L., 93
Lantuéjoul, C., 141, 142
Laurichesse, D., 159, 160
Leandro, R.F., 165
Lehmann, R., 73, 76
Leick, A., 160
Lemoine, F.G., 190
Li, X., 22, 28
Lin, M., 28
Lösler, M., 73, 76

#### **M**

Magnus, J.R., 166
Mahbuby, H., 30
Makhloof, A., 48
Martinec, Z., 10, 27–29, 31, 38, 103–110, 182, 186
Marussi, A., 175
Marussi, Z., 12
Matheron, G., 140, 141
Matsuo, K., 104
Mercier, F., 159
Mertikas, S.P., 65–71
Miranda, C., 17, 182, 183, 189
Molodensky, M.S., 175, 177, 178, 185, 187, 188
Montenbruck, O., 58, 160
Moosdorf, N., 30
Moritz, A., 8
Moritz, H., 3, 11, 13, 15, 16, 18, 20, 21, 25, 27, 39, 42, 48, 49, 68, 130, 141, 149, 153, 155, 171, 172, 175–177, 182, 185, 190
Mulet, S., 120, 133

#### **N**

Nakajima, J., 104
Neitzel, F., 85
Neudecker, H., 166
Neyers, C., 119–127, 129–136
Ni, S., 94
Nield, G.A., 104
Novák, P., 28, 30, 37–44
Nowel, K., 73, 76

#### **O**

Odijk, D., 160
Okada, Y., 106
Oliver, M.A., 140

#### **P**

Paffenholz, J.-A., 93–98
Pagiatakis, S., 27–34
Pavlis, N.K., 49, 185

Pellinen, L.P., 185
Perfetti, N., 73, 76
Piretzidis, D., 65–71
Pitoňák, M., 37–44
Pollitz, F.F., 104
Priestley, M.B., 140, 149
Prudnikov, A.P., 70
Psychas, D., 159–167
Pujol, M.I., 120, 125
Pukelsheim, F., 77

#### **R**

Rauhala, U., 155
Reguzzoni, M., 11–18, 181–190
Roth, M., 57
Rummel, R., 129
Ryzhik, I.M., 67, 70, 142

#### **S**

Sacerdote, F., 182, 184, 185
Sansò, F., 9, 11–18, 139, 140, 144, 175, 176, 181–190
Santos, M., 7–10, 27–34, 175–179
Schaeffler, P., 130
Schaffrin, B., 83–90
Schlather, M., 142
Schoenberg, I.J., 68, 71
Schubert, T., 113–117, 139–144, 149–157
Schuh, W.-D., 113–117, 129–136, 139–144, 149–157
Seitz, K., 47–52
Sheng, M.B., 7–10, 27–34, 175–179
Shi, Y., 87, 88
Sideris, M.G., 11, 65–71, 182–184, 190
Siegismund, F., 130
Simon, D., 161
Simonsen, O., 177
Sjöberg, L.E., 3–5, 28, 39, 171–174
Slepian, D., 113
Sneddon, I.N., 141
Sneeuw, N., 19–25, 65
Snow, K., 83–90
Šprlák, M., 37–44
Stammer, D., 129
Steel, M.F.J., 94
Stegun, I.A., 115
Sun, D., 94
Sun, W., 104, 109
Sünkel, H., 121

#### **T**

Taburet, G., 120, 125, 134
Tanaka, Y., 103–110
Taylor, M.A., 131
Tenzer, R., 8, 177
Teunissen, P.J.G., 57–62, 73–81, 153, 159–167
Tikhonov, A., 121
Tscherning, C.C., 182

#### **U**

Ussami, N., 28

#### **V**

Vachon, R., 104
Vajda, P., 28
Valty, P., 30
Van der Marel, H., 73, 76
Vaníček, P., 7–12, 27–34, 175–179, 186, 187
Verhagen, S., 57–62
Voß-Böhme, A., 76

#### **W**

Wackernagel, H., 140
Wang, K., 159, 164
Wang, L., 87–89
Wang, Z., 57
Webster, R., 140
Wegman, E.J., 151
Wendland, H., 66–71, 139, 140
Wenzel, H.G., 39
Whitehouse, P.L., 104
Wieser, A., 83, 86, 88, 89
Wise, R.A., 23
Woodworth, P.L., 130
Wouters, B., 104
Wu, P., 104
Wu, Z., 140
Wubbena, G., 159–161
Wunsch, C., 129

## **X**

Xu, P., 83, 87–89
Xu, X., 57

#### **Y**

Yaglom, A.M., 140–142
Yanushauskas, A.T., 182
Yanushauskas, A.J., 11

#### **Z**

Zaminpardaz, S., 73–81
Zhao, J., 87, 88
Zhong, M., 57
Zhou, J., 104
Zhou, W., 57
Zhu, H., 57
Zingerle, P., 42, 134

# **Subject Index**

#### **A**

Africa, 47, 49–52, 121, 126, 133, 134
Ambiguity success-rate, 59
AR process, 113–117, 149, 150
Askey model, 66, 69, 71

#### **B**

Boundary-value problems (BVPs), 16, 37–39, 43, 171, 182, 184, 185

#### **C**

Collocation, 120, 144, 149–157, 185
Covariance modeling, 139–144, 153–155

#### **D**

Deformation, 23, 73–81, 103–110, 144, 153, 155, 156
Deformation monitoring, 74, 78–80
Density, 3–5, 8, 17, 18, 20, 21, 24, 27–34, 94, 95, 97, 103–110, 150, 171–174, 177, 179, 182, 186–189
Density variation, 21, 27–34
Downward continuation (DWC), 3, 5, 10, 27, 30, 39, 40, 172, 173, 189, 190

#### **E**

Earthquake, 104–106, 109
Earth's gravity field, 7, 12, 13, 130, 133, 177, 179, 181
Ellipsoidal heights, 9, 11–14, 17, 19, 176, 182
Errors-In-Variables (EIV) model, 83–90

#### **F**

Finite covariance functions, 139, 153, 154
Finite element method (FEM), 104, 110
Finite elements, 103–110, 119–127, 130–133, 135

#### **G**

Geodetic heights, 9–18, 172, 173, 176, 178
Geoid, 3–5, 8–10, 13–15, 17, 19–21, 24, 27–34, 47, 48, 50, 121, 122, 129–135, 171–174, 176–178, 185–190
Geoid determination, 3, 9, 28, 30, 34, 39, 134, 172
Geoid error, 19, 34
Geopotential, 7, 8, 11, 20, 44, 176, 177
Global Navigation Satellite System (GNSS), 9, 11, 17, 57–62, 96, 159, 160, 165, 173
Gradients of the disturbing potential, 38

Gravimetric inversion, 173
Gravity, 7–11, 13, 20–25, 27–33, 39, 43, 47–52, 104, 105, 107–110, 130, 171–174, 176–178, 181–190
Gravity field, 8, 17, 20, 21, 104, 130–132, 134, 135, 177, 178, 181–183, 187
Gravity interpolation, 48, 50

#### **H**

Height anomaly, 37–44, 171–173, 175, 178, 179, 184, 185
Height system, 7–11, 13–17, 173, 175–179
Helmert's approach, 181–190

#### **I**

Integer ambiguity resolution, 57–62
Integer ambiguity resolution enabled precise point positioning (PPP-RTK), 159–167
Integer least-squares (ILS), 58, 60–62

#### **K**

Kalman filter, 159, 161–163

#### **L**

Lake surface, 20, 21, 23–25
Lateral heterogeneity, 104–107, 109, 110

#### **M**

Mean dynamic topography (MDT), 120, 129–136
Mean sea surface (MSS), 119–127, 129, 130, 136
Mesh refinement, 122, 125
Metric spaces, 176–178
Metropolis-within-Gibbs algorithm, 95, 96
Minimal detectable bias (MDB), 73–81
Minimal identifiable bias (MIB), 73–81
Molodensky's approach, 8–9, 178, 181–190
Motions of the roots, 114, 115, 117, 150

#### **N**

Non-stationarity
Normal height, 8, 9, 11–15, 17, 24–25, 172, 173, 177–179, 185

#### **O**

One-step integration, 28, 30
Orthometric height, 4, 8, 9, 12–15, 19–25, 172, 173, 176–179, 186

#### **P**

Physical geodesy, 20, 21, 65, 67, 181, 184
Polynomial covariance functions, 65–71
Pull-in region, 58–60, 62

#### **Q**

Quasi-geoid, 8–10, 12, 171–179

#### **R**

Reference surfaces, 7–10, 12–14, 17, 173, 175, 176, 178, 179, 182
Robust Bayesian time series analysis, 93–98

#### **S**

Satellite altimetry, 19, 20, 22, 24, 48, 182
Sea level anomalies (SLA), 119–127, 134, 135
Sea surface height (SSH), 9, 119–122, 125, 127, 129–135
Signal separation, 129
Singular cofactor matrices, 83–90
Spatio-temporal modelling, 120, 126, 130, 131, 135
Spectral combination method, 37–44
Spherical harmonic coefficients, 28, 39, 43, 67, 69, 71, 105, 107, 109, 131
Statistical testing, 73–76
Stochastic processes, 140, 149

#### **T**

t-distribution, 58, 93, 94
Terrain bias, 3–5
Terrain correction, 3–5, 183, 185
Time-correlated corrections, 164
Time-variable AR processes, 150
Time-variable covariances, 155
Time varying AR coefficients, 114, 115
Topographic correction, 3
Total least squares (TLS), 83–90

#### **V**

VAR process, 94, 97

#### **W**

Wendland model, 71
Window technique, 49

#### **Z**

Z-transformation, 60